README.md (+26 -23)
@@ -28,35 +28,37 @@ Learn more about tool calling <https://gorilla.cs.berkeley.edu/leaderboard.html>
 
 
 ## File structure
-.
-├── Dockerfile.app - template to run the gradio dashboard.
-├── Dockerfile.ollama - template to run the ollama server.
-├── docker-compose.yml - use the ollama project and gradio dashboard.
-├── docker-compose-codecarbon.yml - use the codecarbon project, ollama and gradio dashboard.
-├── .env - This file contains the environment variables for the project. (Not included in the repository)
-├── app.py - the function_call.py using gradio as the User Interface.
-├── Makefile - This file contains the commands to run the project.
-├── README.md - This file contains the project documentation. This is the file you are currently reading.
-├── requirements.txt - This file contains the dependencies for the project.
-├── summary.png - How function calling works with a diagram.
-├── tests - This directory contains the test files for the project.
-│ ├── __init__.py - This file initializes the tests directory as a package.
-│ ├── test_cases.py - This file contains the test cases for the project.
-│ └── test_run.py - This file contains the code to run the test cases for the function calling LLM.
-└── utils - This directory contains the utility files for the project.
-│ ├── __init__.py - This file initializes the utils directory as a package.
-│ ├── function_call.py - This file contains the code to call a function using LLMs.
-│ └── communication_apis.py - This file contains the code to do with communication apis & experiments.
-| └── models.py - This file contains pydantic schemas for vision models.
-| └── constants.py - This file contains system prompts to adjust the model's behavior.
-└── voice_stt_mode.py - Gradio tabbed interface with Speech-to-text interface that allows edits and a text interface.
+.
+├── Dockerfile.app - template to run the gradio dashboard.
+├── Dockerfile.ollama - template to run the ollama server.
+├── docker-compose.yml - use the ollama project and gradio dashboard.
+├── docker-compose-codecarbon.yml - use the codecarbon project, ollama and gradio dashboard.
+├── .env - This file contains the environment variables for the project. (Not included in the repository)
+├── app.py - the function_call.py using gradio as the User Interface.
+├── Makefile - This file contains the commands to run the project.
+├── README.md - This file contains the project documentation. This is the file you are currently reading.
+├── requirements.txt - This file contains the dependencies for the project.
+├── summary.png - How function calling works with a diagram.
+├── tests - This directory contains the test files for the project.
+│ ├── __init__.py - This file initializes the tests directory as a package.
+│ ├── test_cases.py - This file contains the test cases for the project.
+│ └── test_run.py - This file contains the code to run the test cases for the function calling LLM.
+└── utils - This directory contains the utility files for the project.
+│ ├── __init__.py - This file initializes the utils directory as a package.
+│ ├── function_call.py - This file contains the code to call a function using LLMs.
+│ └── communication_apis.py - This file contains the code to do with communication apis & experiments.
+| └── models.py - This file contains pydantic schemas for vision models.
+| └── constants.py - This file contains system prompts to adjust the model's behavior.
+└── voice_stt_mode.py - Gradio tabbed interface with Speech-to-text interface that allows edits and a text interface.
 
 ### Attribution
 * This project uses the Qwen2.5-0.5B model developed by Alibaba Cloud under the Apache License 2.0. The original project can be found at [Qwen technical report](https://arxiv.org/abs/2412.15115)
 * Inspired by this example for the [Groq interface STT](https://github.com/bklieger-groq/gradio-groq-basics)
 * Microsoft Autogen was used to simulate multistep interactions. The original project can be found at [Microsoft Autogen](https://github.com/microsoft/autogen)
-* The project uses the Africa's Talking API to send airtime and messages to a phone numbers. The original project can be found at[Africa's Talking API](https://africastalking.com/)
+* The project uses the Africa's Talking API to send airtime and messages to a phone numbers. Check them out on this website[Africa's Talking API](https://africastalking.com/)
 * Ollama for model serving and deployment. The original project can be found at [Ollama](https://ollama.com/)
+* The project uses the Gradio library to create a user interface for the function calling LLM. The original project can be found at [Gradio](https://gradio.app/)
+* The Text-to-Speech interface uses Edge TTS by Microsoft. The original project can be found at [Edge TTS](https://github.com/rany2/edge-tts). The voice chosen is Rehema which is a Swahili voice from Tanzania.
 
 
 ### License
@@ -188,6 +190,7 @@ This project uses LLMs to send airtime to a phone number. The difference is that
 - In the Voice Input tab, record audio and click "Transcribe" to preview the transcription. Then click "Process Edited Text" to execute voice commands.
 - In the Text Input tab, directly type commands to send airtime or messages or to search news.
 - An autogen agent has been added to assist with generating translations to other languages. Note that this uses an evaluator-optimizer model and may not always provide accurate translations. However, this paradigm can be used for code generation, summarization, and other tasks.
+- Text-to-Speech (TTS) has been added to the app. You can listen to the output of the commands.
 
 ### Responsible AI Practices
 This project implements several responsible AI practices: