Add audio input to gradioUI #1201
Open
Hi,
I am experimenting with smolagents, and the first thing I wanted to do was simply give a voice command and get a result from it. Unfortunately, the default Gradio interface does not support voice input, hence this PR.
It adds the ability to input a voice command, either from the microphone or from an audio file.
You can find the demo here: https://huggingface.co/spaces/GTimothee/smolagent
Here is how it works: to accept a voice command, you need a function that processes the audio. You pass in a function that handles the audio however you like, and providing it is what makes the Gradio UI display an audio input. On submit, the UI runs your function to extract the text from the audio and passes that text to the agent. The next step will be to enable audio output for the agent.
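To make the flow above concrete, here is a minimal sketch of the submit logic. It is not the code from this PR (that lives in the app.py of the space). All names here (`make_submit_handler`, `audio_to_text`, `run_agent`, `fake_transcriber`, `echo_agent`) are hypothetical, and the Gradio wiring is left out so the example runs on its own.

```python
from typing import Callable, Optional

# Hypothetical type: takes the path of an audio file, returns the transcript.
AudioToText = Callable[[str], str]


def make_submit_handler(
    run_agent: Callable[[str], str],
    audio_to_text: Optional[AudioToText] = None,
) -> Callable[[str, Optional[str]], str]:
    """Build the function that would be called when the user presses submit."""

    def on_submit(text: str, audio_path: Optional[str] = None) -> str:
        # Audio is only considered when a processing function was supplied;
        # without one, the UI would not show an audio input at all.
        if audio_to_text is not None and audio_path:
            prompt = audio_to_text(audio_path)
        else:
            prompt = text
        return run_agent(prompt)

    return on_submit


# Stand-ins for a real speech-to-text function and a real agent.
def fake_transcriber(path: str) -> str:
    return f"transcript of {path}"


def echo_agent(prompt: str) -> str:
    return f"agent received: {prompt}"


handler = make_submit_handler(echo_agent, audio_to_text=fake_transcriber)
voice_result = handler("", "command.wav")  # audio is transcribed first
text_result = handler("hello", None)       # plain text goes straight through
```

In a real UI, `audio_path` would presumably come from an audio component configured to return a file path, and `fake_transcriber` would be replaced by whatever speech-to-text function you choose.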
You can find all the code in the app.py of my space.