A voice-based chat application that allows users to interact with an AI assistant using speech. The application leverages OpenAI's Whisper small for accurate speech recognition and Kokoro-TTS for natural-sounding voice synthesis. It will use the LM Studio local model that needs to be served at http://localhost:1234 to generate responses. Works great/responsive with qwen2.5-7b-instruct on Mac M1 Pro. Deepseek R1 also works but is much slower.
Hacked together with Claude and Cursor.
-
Clone the repository:
git clone https://github.com/jpzk/voicemvp.git cd voicemvp
-
Install system dependencies (macOS):
brew install portaudio
-
Install dependencies:
python -m venv venv source venv/bin/activate pip install -r requirements.txt
-
Run the application:
python voice_chat_agent.py
-
The first run might take a while as it needs to download the models.