VoiceNote for Business lets businesses handle customer queries sent as WhatsApp voice notes, with a focus on rural and multilingual markets. It transcribes each voice note, translates it to English, and replies with a voice note in the customer's original language, which lowers both language and technology barriers.
- Multilingual Voice Processing
- WhatsApp Business API Integration
- Automatic Speech-to-Text Translation
- Cross-Platform Compatibility
- User-Friendly Tkinter GUI
- Advanced Language Detection
VoiceNote-for-Business-Vaseegrah-Veda/
├── VoicetoText.py       # Handles voice-to-text conversion
├── TexttoVoice.py       # Converts text responses back to voice
├── VNText.csv           # Stores processed text data
├── requirements.txt     # Project dependencies
├── Dockerfile           # Containerization setup
├── LICENSE              # Licensing information
├── README.md            # Documentation
├── Voice-Notes/
│   ├── Received/        # Incoming WhatsApp voice notes
│   └── Sent/            # Generated response voice notes
├── Images/              # UI assets for the Tkinter GUI
└── ffmpeg-7.0.1/        # Audio processing dependency
VoicetoText.py
- Primary Function: Converts voice notes to text
- Key Capabilities:
  - Speech recognition across multiple languages
  - Automatic language detection
  - Translation of voice notes to English
  - Integration with Tkinter GUI
  - Saves processed data to VNText.csv
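A minimal sketch of the transcription step, assuming the SpeechRecognition and PyDub packages from the dependency list and a working FFmpeg install. The function names `wav_path_for` and `transcribe_voice_note` and the default language tag are hypothetical; the actual VoicetoText.py may be organized differently.

```python
from pathlib import Path


def wav_path_for(voice_note: str) -> Path:
    """Return the path of the WAV file that sits next to the input file."""
    return Path(voice_note).with_suffix(".wav")


def transcribe_voice_note(voice_note: str, language: str = "en-IN") -> str:
    """Convert a voice note to WAV and transcribe it (illustrative only)."""
    # Imported lazily so the helper above works without the audio packages.
    import speech_recognition as sr
    from pydub import AudioSegment

    wav_path = wav_path_for(voice_note)
    # PyDub delegates decoding to FFmpeg, so FFmpeg must be on the PATH.
    AudioSegment.from_file(voice_note).export(str(wav_path), format="wav")

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)
    # recognize_google calls Google's web speech API and needs internet access.
    return recognizer.recognize_google(audio, language=language)
```

The conversion to WAV comes first because `sr.AudioFile` reads WAV, AIFF, and FLAC files rather than the compressed formats voice notes usually arrive in.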
TexttoVoice.py
- Primary Function: Converts text responses back to voice
- Key Capabilities:
  - Text-to-speech conversion
  - Multi-language support
  - Generates voice responses in the original customer's language
  - Saves voice responses in the Voice-Notes/Sent directory
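The response step could look roughly like this, assuming gTTS from the dependency list. The helper names and the `_reply.mp3` naming scheme are hypothetical and are not taken from TexttoVoice.py.

```python
from pathlib import Path

SENT_DIR = Path("Voice-Notes") / "Sent"


def response_path(note_id: str, sent_dir: Path = SENT_DIR) -> Path:
    """Build the output path for a generated voice response."""
    return sent_dir / f"{note_id}_reply.mp3"


def synthesize_reply(text: str, lang: str, note_id: str) -> Path:
    """Speak `text` in language `lang` and save it under Voice-Notes/Sent."""
    # Imported lazily; gTTS needs internet access when save() is called.
    from gtts import gTTS

    out_path = response_path(note_id)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    gTTS(text=text, lang=lang).save(str(out_path))
    return out_path
```

Passing the source language code recorded for the incoming note as `lang` is what would keep the reply in the customer's own language.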
VNText.csv
- Purpose: Central data storage for processed voice notes
- Columns:
  - Original file path
  - Translated text
  - Source language code
  - Timestamp
  - Processing status
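As an illustration of the five-column layout, the sketch below appends one row per processed voice note using only the standard library. The header names in `FIELDNAMES` and the function `append_record` are assumptions; the real file may use different column labels.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical header names mirroring the five columns listed above.
FIELDNAMES = ["file_path", "translated_text", "source_lang", "timestamp", "status"]


def append_record(csv_path, file_path, translated_text, source_lang, status="processed"):
    """Append one processed voice note, writing a header if the file is new or empty."""
    csv_path = Path(csv_path)
    write_header = not csv_path.exists() or csv_path.stat().st_size == 0
    row = {
        "file_path": file_path,
        "translated_text": translated_text,
        "source_lang": source_lang,
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "status": status,
    }
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

Opening the file with `newline=""` and UTF-8 encoding avoids blank rows on Windows and keeps non-Latin text intact.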
FFmpeg (ffmpeg-7.0.1/)
- Purpose: Audio processing and conversion
- Key Features:
  - Handles various audio formats
  - Supports audio encoding/decoding
  - Enables high-quality audio transformations
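For example, WhatsApp voice notes typically arrive as OGG/Opus files, while speech recognizers generally expect WAV input. The sketch below builds and runs an FFmpeg conversion command; the function names and the 16 kHz mono target are assumptions chosen for illustration, not settings taken from the project.

```python
import shutil
import subprocess


def build_convert_command(src: str, dst: str, sample_rate: int = 16000) -> list:
    """Build an FFmpeg command that converts `src` to mono audio at `sample_rate` Hz."""
    # -y overwrites dst, -ac sets the channel count, -ar sets the sample rate.
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", str(sample_rate), dst]


def convert_audio(src: str, dst: str) -> None:
    """Run the conversion, failing early if FFmpeg is not on the PATH."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on the PATH")
    subprocess.run(build_convert_command(src, dst), check=True)
```

Passing the command as a list rather than a shell string means file names with spaces need no extra quoting.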
- requirements.txt: Python dependency management
- .gitignore: Version control configuration
- LICENSE: Project licensing information
- README.md: Project documentation
- Operating Systems:
  - Windows 10/11 (64-bit)
  - macOS 10.15+
  - Ubuntu 20.04 LTS or another Linux distribution
- Hardware:
  - Processor: Intel Core i5 or equivalent
  - RAM: 8GB
  - Storage: 10GB free disk space
  - Internet Connection: Minimum 10 Mbps
- Python: 3.7 to 3.10
- Dependencies:
  - Tkinter
  - SpeechRecognition
  - Googletrans
  - PyDub
  - gTTS
  - FFmpeg
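A quick way to confirm these dependencies are present is to probe for each one before launching the application. The sketch below uses only the standard library; `REQUIRED_MODULES` and `missing_dependencies` are hypothetical names, and the mapping lists the import name that each package is normally imported under.

```python
import importlib.util
import shutil

# Package name as listed above -> module name used in import statements.
REQUIRED_MODULES = {
    "Tkinter": "tkinter",
    "SpeechRecognition": "speech_recognition",
    "Googletrans": "googletrans",
    "PyDub": "pydub",
    "gTTS": "gtts",
}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the packages that cannot be imported, plus FFmpeg if it is not on the PATH."""
    missing = [name for name, module in modules.items()
               if importlib.util.find_spec(module) is None]
    if shutil.which("ffmpeg") is None:
        missing.append("FFmpeg")
    return missing
```

Calling `missing_dependencies()` returns an empty list when everything is installed, so it can serve as a startup check.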
# Install virtualenv (optional: the next step uses Python's built-in venv module)
pip install virtualenv
# Create virtual environment
python -m venv voicenote_env
# Activate virtual environment (Windows)
voicenote_env\Scripts\activate
# Install virtualenv (optional: the next step uses Python's built-in venv module)
pip3 install virtualenv
# Create virtual environment
python3 -m venv voicenote_env
# Activate virtual environment (macOS/Linux)
source voicenote_env/bin/activate
# Clone the repository
git clone https://github.com/samnaveenkumaroff/VoiceNote-for-Business-Vaseegrah-Veda.git
# Navigate to project directory
cd VoiceNote-for-Business-Vaseegrah-Veda
# Install requirements
pip install -r requirements.txt
The project includes a pre-packaged FFmpeg version (ffmpeg-7.0.1.zip) for seamless installation. Follow the steps below to use it:
- Extract the ffmpeg-7.0.1.zip archive to your preferred directory.
- Inside the extracted folder, locate bin/ffmpeg.exe.
- Add the bin folder to the system PATH:
  - Open System Properties → Advanced → Environment Variables.
  - Under System Variables, find Path and edit it.
  - Click New and add the full path to the bin folder.
  - Click OK and restart your terminal.
- Verify installation:
ffmpeg -version
unzip ffmpeg-7.0.1.zip
mv ffmpeg-7.0.1 /usr/local/bin/ffmpeg
export PATH=$PATH:/usr/local/bin/ffmpeg/bin
Verify with:
ffmpeg -version
unzip ffmpeg-7.0.1.zip
sudo mv ffmpeg-7.0.1 /usr/local/ffmpeg
export PATH=$PATH:/usr/local/ffmpeg/bin
Verify with:
ffmpeg -version
- Download from FFmpeg Official Site
- Extract and add to system PATH
- Verify:
ffmpeg -version
brew install ffmpeg
sudo apt update
sudo apt install ffmpeg
# Dockerfile
FROM python:3.9-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
ffmpeg \
&& rm -rf /var/lib/apt/lists/*
# Copy project files
COPY . /app
# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Run the application
CMD ["python", "VoicetoText.py"]
Build and Run:
docker build -t voicenote-business .
docker run -d voicenote-business
- Voice Note Reception
  - Customer sends a WhatsApp voice note
  - VoicetoText.py receives and processes the audio
- Speech-to-Text Conversion
  - Automatic language detection
  - Translation to English
  - Data stored in VNText.csv
- Business Logic Processing
  - Match customer query with appropriate products/services
  - Generate tailored response
- Voice Response Generation
  - TexttoVoice.py converts response to voice
  - Voice note sent back in original language
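The business logic step is the least specified part of the workflow. One simple way to approach it, shown here purely as an illustration, is keyword matching against a product catalogue. The catalogue entries, function names, and reply wording below are all hypothetical.

```python
import re

# Hypothetical catalogue: product name -> keywords that suggest it.
CATALOGUE = {
    "herbal hair oil": {"hair", "oil", "dandruff"},
    "face wash": {"face", "wash", "pimple", "acne"},
}


def match_products(query: str, catalogue=CATALOGUE):
    """Return products whose keywords appear in the English query, best match first."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scored = [(len(words & keywords), name) for name, keywords in catalogue.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]


def build_response(query: str) -> str:
    """Turn the matches into a short reply that can be handed to text-to-speech."""
    matches = match_products(query)
    if not matches:
        return "Sorry, we could not find a matching product."
    return "We recommend: " + ", ".join(matches)
```

Because the query has already been translated to English by this point, a single English catalogue can serve customers in every supported language.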
- AWS Lambda
- Google Cloud Functions
- Azure Functions
- Heroku
import json

# Assumes VoicetoText.py exposes process_voice_note(); the module name is case-sensitive on Linux
from VoicetoText import process_voice_note


def lambda_handler(event, context):
    # Process incoming WhatsApp voice note
    voice_file = event['voice_file']
    processed_text = process_voice_note(voice_file)
    return {
        'statusCode': 200,
        'body': json.dumps(processed_text)
    }
- Create RESTful APIs for voice processing
- Integrate with CRM systems
- Develop webhook-based communication
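As a starting point for webhook-based communication, the sketch below pulls audio messages out of an incoming webhook payload. It assumes the payload follows the nested `entry` → `changes` → `value` → `messages` layout used by the WhatsApp Cloud API; that shape should be verified against the official documentation, and the function name is hypothetical.

```python
def extract_audio_messages(payload: dict) -> list:
    """Collect the sender and media ID of every audio message in a webhook payload."""
    found = []
    for entry in payload.get("entry", []):
        for change in entry.get("changes", []):
            # Status-only notifications carry no "messages" key, hence the defaults.
            for message in change.get("value", {}).get("messages", []):
                if message.get("type") == "audio":
                    found.append({
                        "sender": message.get("from"),
                        "media_id": message.get("audio", {}).get("id"),
                    })
    return found
```

Each returned media ID would then be used to download the audio file before handing it to the voice-to-text step.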
- Multi-tenant architecture
- Horizontal scaling
- Microservices design
- CUDA Support for faster processing
- TensorFlow GPU integration
- PyTorch acceleration
- Redis for voice note metadata
- Memcached for translation caching
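To show what translation caching would buy, here is an in-memory stand-in that a Memcached or Redis client could later replace. The class name and the `translate` callable are hypothetical; the idea is simply that a phrase already translated is never sent to the translation service twice.

```python
import hashlib


class TranslationCache:
    """In-memory stand-in for a Memcached or Redis translation cache."""

    def __init__(self, translate):
        self._translate = translate  # callable: (text, dest_lang) -> translated text
        self._store = {}
        self.hits = 0

    @staticmethod
    def _key(text: str, dest_lang: str) -> str:
        # Hashing keeps keys short and free of characters a cache server may reject.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return f"{dest_lang}:{digest}"

    def get(self, text: str, dest_lang: str = "en") -> str:
        key = self._key(text, dest_lang)
        if key in self._store:
            self.hits += 1
        else:
            self._store[key] = self._translate(text, dest_lang)
        return self._store[key]
```

Common customer phrases such as greetings and price questions repeat often, so even a small cache can cut down on calls to the translation service.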
- End-to-end encryption
- GDPR Compliance
- Data anonymization
- Secure WhatsApp Business API integration
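For the data anonymization item, one common approach is to avoid storing customer phone numbers in the clear. The sketch below shows keyed hashing, which is strictly pseudonymization rather than full anonymization, along with masking for display. Both function names are hypothetical, and the secret key is assumed to be stored outside the codebase.

```python
import hashlib
import hmac

ASCII_DIGITS = "0123456789"


def pseudonymize_phone(phone: str, secret: bytes) -> str:
    """Replace a phone number with a keyed hash so records stay linkable but unreadable."""
    digits = "".join(ch for ch in phone if ch in ASCII_DIGITS)
    return hmac.new(secret, digits.encode("ascii"), hashlib.sha256).hexdigest()[:16]


def mask_phone(phone: str, visible: int = 4) -> str:
    """Mask all but the last `visible` digits, for logs and the GUI."""
    digits = "".join(ch for ch in phone if ch in ASCII_DIGITS)
    keep = max(0, min(visible, len(digits)))
    return "*" * (len(digits) - keep) + digits[len(digits) - keep:]
```

Using HMAC with a secret key rather than a plain hash matters here because phone numbers have a small enough range that unkeyed hashes can be reversed by brute force.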
- Prometheus
- Grafana
- ELK Stack
- Sentry for error tracking
- Fork the repository
- Create feature branch
- Commit changes
- Push to branch
- Create pull request
Apache 2.0 License
Sam Naveenkumar V
- Email: samnaveenkumaroff@gmail.com
- LinkedIn: samnaveenkumaroff
Soorya K
- Email: sooryak@karunya.edu.in
- LinkedIn: SOORYA K
Dhuruv Swamy R
- Email: dhuruvswamy@karunya.edu.in
- LinkedIn: Dhuruv Swamy R
- Vaseegrah Veda
- Tech Vaseegrah Team
- Open Source Community