Commit 3ca8ce7

Add version with LLM agent orchestration using Autogen.

1 parent 975a658 · commit 3ca8ce7

7 files changed: +277 −47 lines

README.md (+27 −22)
@@ -28,30 +28,34 @@ Learn more about tool calling <https://gorilla.cs.berkeley.edu/leaderboard.html>

## File structure

.
├── Dockerfile.app - template to run the Gradio dashboard.
├── Dockerfile.ollama - template to run the Ollama server.
├── docker-compose.yml - runs the Ollama server and the Gradio dashboard.
├── docker-compose-codecarbon.yml - runs CodeCarbon, the Ollama server, and the Gradio dashboard.
├── .env - environment variables for the project (not included in the repository).
├── app.py - function_call.py wrapped with Gradio as the user interface.
├── Makefile - the commands to run the project.
├── README.md - the project documentation (the file you are currently reading).
├── requirements.txt - the dependencies for the project.
├── summary.png - a diagram of how function calling works.
├── tests - the test files for the project.
│   ├── __init__.py - initializes the tests directory as a package.
│   ├── test_cases.py - the test cases for the project.
│   └── test_run.py - runs the test cases for the function-calling LLM.
├── utils - the utility files for the project.
│   ├── __init__.py - initializes the utils directory as a package.
│   ├── function_call.py - the code to call a function using LLMs.
│   └── communication_apis.py - code for the communication APIs and experiments.
└── voice_stt_mode.py - a Gradio tabbed interface with a speech-to-text tab that allows edits, plus a text tab.

### Attribution

-This project uses the Qwen2.5-0.5B model developed by Alibaba Cloud under the Apache License 2.0. The original project can be found at [Qwen technical report](https://arxiv.org/abs/2412.15115)
-Inspired by this example for the [Groq interface STT](https://github.com/bklieger-groq/gradio-groq-basics)
+* This project uses the Qwen2.5-0.5B model developed by Alibaba Cloud under the Apache License 2.0. The original project can be found at the [Qwen technical report](https://arxiv.org/abs/2412.15115).
+* Inspired by this example for the [Groq interface STT](https://github.com/bklieger-groq/gradio-groq-basics).
+* Microsoft Autogen was used to simulate multistep interactions. The original project can be found at [Microsoft Autogen](https://github.com/microsoft/autogen).
+* The project uses the Africa's Talking API to send airtime and messages to phone numbers. The original project can be found at [Africa's Talking API](https://africastalking.com/).
+* Ollama is used for model serving and deployment. The original project can be found at [Ollama](https://ollama.com/).

### License
@@ -181,6 +185,7 @@ This project uses LLMs to send airtime to a phone number. The difference is that

- The app now supports both Text and Voice input tabs.
- In the Voice Input tab, record audio and click "Transcribe" to preview the transcription. Then click "Process Edited Text" to execute voice commands.
- In the Text Input tab, directly type commands to send airtime or messages or to search news.
+- An Autogen agent has been added to assist with generating translations to other languages. Note that this uses an evaluator-optimizer pattern and may not always provide accurate translations. However, this paradigm can also be used for code generation, summarization, and other tasks.

### Responsible AI Practices
This project implements several responsible AI practices:
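The evaluator-optimizer pattern behind the translation agent can be sketched without Autogen itself: one role drafts an answer, another critiques it, and the draft is retried with the feedback until the evaluator approves. The `generate` and `evaluate` functions below are hypothetical stand-ins for the LLM calls, not part of this repository:

```python
def evaluator_optimizer(task, generate, evaluate, max_rounds=3):
    """Generic evaluator-optimizer loop: draft, critique, retry.

    Returns the first draft the evaluator approves, or the last
    attempt once max_rounds is exhausted (best effort).
    """
    feedback = None
    for _ in range(max_rounds):
        draft = generate(task, feedback)      # optimizer step
        ok, feedback = evaluate(task, draft)  # evaluator step
        if ok:
            return draft
    return draft


# Toy stand-ins for the two LLM roles (hypothetical):
def generate(task, feedback):
    # First attempt is informal; with feedback, switch to the formal form.
    return "Bonjour" if feedback else "Salut"


def evaluate(task, draft):
    return (draft == "Bonjour", "Use the formal greeting")


print(evaluator_optimizer("translate 'Hello' to French", generate, evaluate))
# -> Bonjour
```

In Autogen this loop would be realized as a conversation between two agents; the control flow is the same.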

app.py (+58 −10)
@@ -20,6 +20,7 @@
    using the username 'username'`
    Search for news about a topic:
    - `Latest news on climate change`
+   - `Translate the text 'Hello' to the target language 'French'`
"""

# ------------------------------------------------------------------------------------
@@ -38,7 +39,7 @@
import gradio as gr
from langtrace_python_sdk import langtrace, with_langtrace_root_span
import ollama
-from utils.function_call import send_airtime, send_message, search_news
+from utils.function_call import send_airtime, send_message, search_news, translate_text

# ------------------------------------------------------------------------------------
# Logging Configuration
@@ -236,6 +237,27 @@ def mask_api_key(api_key):
            },
        },
    },
+    {
+        "type": "function",
+        "function": {
+            "name": "translate_text",
+            "description": "Translate text to a specified language using Ollama",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "text": {
+                        "type": "string",
+                        "description": "The text to translate",
+                    },
+                    "target_language": {
+                        "type": "string",
+                        "description": "The target language for translation",
+                    },
+                },
+                "required": ["text", "target_language"],
+            },
+        },
+    },
]

# ------------------------------------------------------------------------------------
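The `required` array in a tool schema like the one added above can also be checked client-side before dispatching, since models sometimes omit arguments. A small sketch; the schema fragment mirrors the diff, while `check_args` is a hypothetical helper, not part of the project:

```python
# Fragment of the translate_text tool schema from the diff above
translate_schema = {
    "name": "translate_text",
    "parameters": {
        "type": "object",
        "properties": {
            "text": {"type": "string"},
            "target_language": {"type": "string"},
        },
        "required": ["text", "target_language"],
    },
}


def check_args(schema, arguments):
    """Return the names in the schema's `required` array that are
    missing from the model-supplied arguments dict."""
    required = schema["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


print(check_args(translate_schema, {"text": "Hello"}))
# -> ['target_language']
```

Calling the real function only when `check_args` returns an empty list avoids a `KeyError` deep inside the tool dispatch.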
@@ -244,7 +266,9 @@ def mask_api_key(api_key):


@with_langtrace_root_span()
-async def process_user_message(message: str, history: list) -> str:
+async def process_user_message(
+    message: str, history: list, use_vision: bool = False, image_path: str = None
+) -> str:
    """
    Handle the conversation with the model asynchronously.
@@ -254,6 +278,10 @@ async def process_user_message(message: str, history: list) -> str:
        The user's input message.
    history : list of list of str
        The conversation history up to that point.
+    use_vision : bool, optional
+        Flag to enable vision capabilities, by default False.
+    image_path : str, optional
+        Path to the image file if using a vision model, by default None.

    Returns
    -------
@@ -266,16 +294,28 @@ async def process_user_message(message: str, history: list) -> str:
    logger.info("Processing user message: %s", masked_message)
    client = ollama.AsyncClient()

-    messages = [
-        {
-            "role": "user",
-            "content": message,
-        }
-    ]
+    messages = []
+
+    # Construct the message based on the vision flag
+    if use_vision:
+        messages.append(
+            {
+                "role": "user",
+                "content": message,
+                "images": [image_path] if image_path else None,
+            }
+        )
+    else:
+        messages.append({"role": "user", "content": message})

    try:
+        # Select the model based on the vision flag
+        model_name = "llama3.2-vision" if use_vision else "qwen2.5:0.5b"
+
        response = await client.chat(
-            model="qwen2.5:0.5b", messages=messages, tools=tools
+            model=model_name,
+            messages=messages,
+            tools=None if use_vision else tools,  # Vision models don't use tools
        )
    except Exception as e:
        logger.exception("Failed to get response from Ollama client.")
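The message-construction branch in this hunk is small enough to factor into a pure helper, which also makes it unit-testable without a running Ollama server. A sketch; `build_messages` is a hypothetical refactoring, not a function in the project:

```python
def build_messages(message, use_vision=False, image_path=None):
    """Build the Ollama chat payload; vision requests attach images.

    Mirrors the logic in the diff: text-only requests get a plain
    user message, vision requests add an `images` list (or None
    when no path was supplied).
    """
    entry = {"role": "user", "content": message}
    if use_vision:
        entry["images"] = [image_path] if image_path else None
    return [entry]


print(build_messages("hi"))
# -> [{'role': 'user', 'content': 'hi'}]
```

The caller would then pass `messages=build_messages(message, use_vision, image_path)` to `client.chat`.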
@@ -292,7 +332,6 @@ async def process_user_message(message: str, history: list) -> str:
            "content": model_content,
        }
    )
-    logger.debug("Model messages: %s", messages)

    if model_message.get("tool_calls"):
        for tool in model_message["tool_calls"]:
@@ -332,6 +371,14 @@ async def process_user_message(message: str, history: list) -> str:
                elif tool_name == "search_news":
                    logger.info("Calling search_news with arguments: %s", masked_args)
                    function_response = search_news(arguments["query"])
+                elif tool_name == "translate_text":
+                    logger.info(
+                        "Calling translate_text with arguments: %s", masked_args
+                    )
+                    function_response = translate_text(
+                        arguments["text"],
+                        arguments["target_language"],
+                    )
                else:
                    function_response = json.dumps({"error": "Unknown function"})
                    logger.warning("Unknown function: %s", tool_name)
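Each new tool extends this `if`/`elif` chain; a name-to-callable mapping keeps dispatch flat as more tools are added. A sketch under the assumption that tool handlers take an arguments dict; the stub functions stand in for the real `utils.function_call` imports:

```python
import json


# Stubs standing in for the real tool functions (hypothetical here)
def search_news(query):
    return json.dumps({"query": query})


def translate_text(text, target_language):
    return json.dumps({"text": text, "lang": target_language})


# Map tool names to adapters that unpack the model's arguments dict
TOOL_DISPATCH = {
    "search_news": lambda args: search_news(args["query"]),
    "translate_text": lambda args: translate_text(
        args["text"], args["target_language"]
    ),
}


def dispatch(tool_name, arguments):
    """Look up and invoke a tool; unknown names yield a JSON error."""
    handler = TOOL_DISPATCH.get(tool_name)
    if handler is None:
        return json.dumps({"error": "Unknown function"})
    return handler(arguments)


print(dispatch("translate_text", {"text": "Hi", "target_language": "French"}))
```

Registering a new tool then becomes a one-line addition to `TOOL_DISPATCH` instead of another `elif` branch.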
@@ -403,6 +450,7 @@ def gradio_interface(message: str, history: list) -> str:
            "Send a message to +254712345678 with the message 'Hello there', using the username 'username'"
        ],
        ["Search news for 'latest technology trends'"],
+        ["Translate the text 'Hi' to the target language 'French'"],
    ],
    type="messages",
)

requirements.txt (+3 −1)

@@ -15,4 +15,6 @@ pytest-asyncio==0.25.0
nltk==3.9.1
soundfile==0.12.1
groq==0.13.1
-numpy==2.2.1
+numpy==2.2.1
+pyautogen==0.2.18
+flaml[automl]

tests/test_cases.py (+63 −2)
@@ -8,8 +8,10 @@

import os
import re
-from unittest.mock import patch
-from utils.function_call import send_airtime, send_message, search_news
+import pytest
+import pytest_asyncio
+from unittest.mock import patch, MagicMock, AsyncMock
+from utils.function_call import send_airtime, send_message, search_news, translate_text

# Load environment variables: TEST_PHONE_NUMBER
PHONE_NUMBER = os.getenv("TEST_PHONE_NUMBER")
@@ -129,3 +131,62 @@ def test_search_news_success(mock_ddgs):
    mock_ddgs.return_value.news.assert_called_once_with(
        keywords="AI", region="wt-wt", safesearch="off", timelimit="d", max_results=5
    )
+
+
+@pytest.mark.parametrize(
+    "text,target_language,expected_response,should_call",
+    [
+        ("Hello", "French", "Bonjour", True),
+        ("Good morning", "Arabic", "صباح الخير", True),
+        ("Thank you", "Portuguese", "Obrigado", True),
+        ("", "French", "Error: Empty text", False),
+        (
+            "Hello",
+            "German",
+            "Target language must be French, Arabic, or Portuguese",
+            False,
+        ),
+    ],
+)
+def test_translate_text_function(text, target_language, expected_response, should_call):
+    """
+    Test translation functionality with various inputs.
+
+    Note: translate_text is a synchronous function, so do not await it.
+    """
+    # Mock the client's return value
+    mock_chat_response = {"message": {"content": expected_response}}
+
+    with patch("ollama.AsyncClient") as mock_client:
+        instance = MagicMock()
+        instance.chat.return_value = mock_chat_response
+        mock_client.return_value = instance
+
+        if not text:
+            with pytest.raises(ValueError) as exc:
+                translate_text(text, target_language)
+            assert "Empty text" in str(exc.value)
+            return
+
+        if target_language not in ["French", "Arabic", "Portuguese"]:
+            with pytest.raises(ValueError) as exc:
+                translate_text(text, target_language)
+            assert "Target language must be French, Arabic, or Portuguese" in str(
+                exc.value
+            )
+            return
+
+        result = translate_text(text, target_language)
+        assert expected_response in result
+
+        if should_call:
+            instance.chat.assert_called_once()
+        else:
+            instance.chat.assert_not_called()
+
+
+@pytest.mark.asyncio
+async def test_translate_text_special_chars():
+    """Test translation with special characters."""
+    with pytest.raises(ValueError) as exc:
+        await translate_text("@#$%^", "French")
+    assert "Invalid input" in str(exc.value)
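Read together, these test cases pin down `translate_text`'s validation contract: empty text raises a ValueError mentioning "Empty text", symbol-only text raises one mentioning "Invalid input", and anything outside French, Arabic, or Portuguese raises the language error before the model is ever called. A minimal stub satisfying just that contract (the real implementation lives in `utils/function_call.py`; the `chat` parameter here is a hypothetical stand-in for the LLM call):

```python
import re

SUPPORTED_LANGUAGES = ["French", "Arabic", "Portuguese"]


def translate_text_stub(text, target_language, chat=lambda text, lang: "<translation>"):
    """Validation mirroring the tests above; `chat` fakes the model call."""
    if not text:
        raise ValueError("Error: Empty text cannot be translated")
    if not re.search(r"\w", text):
        # e.g. "@#$%^" contains no translatable characters
        raise ValueError("Invalid input: no translatable characters")
    if target_language not in SUPPORTED_LANGUAGES:
        raise ValueError("Target language must be French, Arabic, or Portuguese")
    return chat(text, target_language)
```

Ordering the checks this way (empty first, then character sanity, then language) reproduces which error each parametrized case expects.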

tests/test_run.py (+10 −2)
@@ -10,13 +10,13 @@

The tests are run asynchronously to allow for the use of the asyncio library.

NB: ensure you have the environment variables set in the .env file/.bashrc
file before running the tests.

How to run the tests:
    pytest test/test_run.py -v --asyncio-mode=strict

Feel free to add more tests to cover more scenarios.
More tests you can try can be found here: https://huggingface.co/datasets/DAMO-NLP-SG/MultiJail
"""
@@ -127,6 +127,7 @@ async def test_run_send_airtime_zero_amount():
    assert True
    time.sleep(300)

+
@pytest.mark.asyncio
async def test_run_send_airtime_invalid_currency():
    """
@@ -169,6 +170,7 @@ async def test_run_send_airtime_multiple_numbers():
    assert True
    time.sleep(300)

+
@pytest.mark.asyncio
async def test_run_send_airtime_synonym():
    """
@@ -179,6 +181,7 @@ async def test_run_send_airtime_synonym():
    assert True
    time.sleep(300)

+
@pytest.mark.asyncio
async def test_run_send_airtime_different_order():
    """
@@ -189,6 +192,7 @@ async def test_run_send_airtime_different_order():
    assert True
    time.sleep(300)

+
@pytest.mark.asyncio
async def test_run_send_message_polite_request():
    """
@@ -221,6 +225,7 @@ async def test_run_send_airtime_invalid_amount():
    assert True
    time.sleep(300)

+
@pytest.mark.asyncio
async def test_run_send_message_spam_detection():
    """
@@ -280,6 +285,7 @@ async def test_run_send_message_mixed_arabic_english():
    assert True
    time.sleep(300)

+
@pytest.mark.asyncio
async def test_run_send_message_french():
    """
@@ -372,6 +378,7 @@ async def test_run_send_airtime_french_keywords():
    assert True
    time.sleep(300)

+
@pytest.mark.asyncio
async def test_run_send_message_portuguese_keywords():
    """
@@ -440,6 +447,7 @@ async def test_run_send_airtime_arabic_keywords():
    assert True
    time.sleep(300)

+
@pytest.mark.asyncio
async def test_run_best_of_n_jailbreaking():
    """
