Add audio input to gradioUI #1201

GTimothee · 2025-04-15T18:14:22Z

Hi,

I am experimenting with smolagents, and the first thing I wanted to do was simply give a voice command and get a result from it. Unfortunately, the default Gradio interface does not support voice input, hence this PR.

I added the option to input a voice command, either from the microphone or from an audio file.

You can find the demo here: https://huggingface.co/spaces/GTimothee/smolagent

What I implemented works as follows: to accept a voice command, you need a function that processes it. You pass a function that processes the audio however you like, and passing it makes the Gradio UI display an audio input. On submit, the UI runs your function to extract the text from the audio and passes that text to the agent. The next step will be to enable audio output for the agent.

GradioUI(agent).launch(speech2text_func=speech2text_func)

You can find all the code in the app.py of my space.
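
For readers who want a feel for the shape of such a function, here is a minimal sketch. It is not the code from app.py. It assumes the UI hands the function a path to the recorded or uploaded audio file and expects plain text back; the PR description does not state the exact signature, so treat that as an assumption. The names make_speech2text_func, transcribe, and fake_transcribe are hypothetical and only serve the illustration.

```python
from typing import Callable, Optional


def make_speech2text_func(
    transcribe: Callable[[str], str],
) -> Callable[[Optional[str]], str]:
    """Wrap any transcription backend into a speech-to-text callable.

    Assumption: the UI passes a file path (or None when nothing was
    recorded) and expects the transcribed text as a string.
    """

    def speech2text_func(audio_path: Optional[str]) -> str:
        # Nothing recorded or uploaded: return an empty command.
        if not audio_path:
            return ""
        # Delegate to the backend and tidy up surrounding whitespace.
        return transcribe(audio_path).strip()

    return speech2text_func


def fake_transcribe(path: str) -> str:
    """Hypothetical stand-in for a real speech recognition model."""
    return f"  transcript of {path}  "


speech2text_func = make_speech2text_func(fake_transcribe)
print(speech2text_func("command.wav"))

# With this PR applied, the function would then be passed as shown above:
# GradioUI(agent).launch(speech2text_func=speech2text_func)
```

Keeping the transcription backend behind a plain callable means the recognition model can be swapped without touching the UI wiring.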

GTimothee added 4 commits April 15, 2025 07:42

add audio input

771b08c

add audio input

27ab909

add audio input file

3a75637

fix

5af2e79

GTimothee changed the title ~~Add audio input~~ Add audio input to gradioUI Apr 15, 2025
