This repository connects coqui-ai/TTS with ROS to provide advanced, real-time Text-to-Speech generation.
The latest TTSv2 supports 16 languages and offers better overall performance.
This section describes how to set up this repository.
First, please set up the following environment before proceeding to the next installation stage.
| System | Version |
|---|---|
| Ubuntu | 20.04 (Focal Fossa) - Local Env. |
| Python | >= 3.9, < 3.12 |
| Docker Engine | Tested on 26.0.0 |
| CUDA | >= 11.8 (if a GPU is used) |
Note
Docker is required to use this TTS library.
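The requirements above can be checked before installing. The following is a minimal sketch, not part of this repository: the function names are hypothetical, the Python range is taken from the table, and the Docker check only confirms that a `docker` executable is on the PATH (it does not verify the engine version or CUDA).

```python
import shutil
import sys

# Supported interpreter range taken from the requirements table: >= 3.9 and < 3.12.
MIN_PYTHON = (3, 9)
MAX_PYTHON_EXCLUSIVE = (3, 12)


def python_version_supported(version_info):
    """Return True if (major, minor) falls inside the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


def docker_available():
    """Return True if a `docker` executable is found on PATH."""
    return shutil.which("docker") is not None


if __name__ == "__main__":
    print("Python version OK:", python_version_supported(sys.version_info))
    print("Docker on PATH:", docker_available())
```

Running the script with the interpreter you plan to use for this package reports whether that interpreter is in range.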
- Go to the src folder of your ROS workspace.
$ roscd # Or use "cd ~/catkin_ws/" to change directory.
$ cd src/
- Clone this repository.
$ git clone https://github.com/TeamSOBITS/coqui_tts_ros
- Navigate into the repository.
$ cd coqui_tts_ros/
- Install the dependent packages.
$ bash install.sh
- Compile the package.
$ roscd # Or use "cd ~/catkin_ws/" to change directory.
$ catkin_make
- Create a simple alias to launch the TTS server.
- If using CPU:
$ echo "alias tts_launch='docker run --rm -it -p 5002:5002 -v ~/{PATH_ROS_WS_LOCAL}/src/coqui_tts_ros/models/:/root/.local/share/tts/ --entrypoint \"tts-server\" ghcr.io/coqui-ai/tts-cpu'" >> ~/.bash_alias
- If using GPU:
$ echo "alias tts_launch='docker run --rm -it -p 5002:5002 --gpus all -v ~/{PATH_ROS_WS_LOCAL}/src/coqui_tts_ros/models/:/root/.local/share/tts/ --entrypoint \"tts-server\" ghcr.io/coqui-ai/tts'" >> ~/.bash_alias
Important
Replace {PATH_ROS_WS_LOCAL} with the path of your ROS workspace in the local environment.
Important
Run the alias command from step 6 in the local environment.
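To make the difference between the CPU and GPU aliases explicit, the sketch below assembles the same `docker run` arguments in Python. It is illustrative only: `build_tts_server_command` is a hypothetical helper, and `catkin_ws` stands in for {PATH_ROS_WS_LOCAL} on the assumption that the workspace lives directly under the home directory.

```python
from pathlib import Path

# Image names copied from the two alias definitions.
CPU_IMAGE = "ghcr.io/coqui-ai/tts-cpu"
GPU_IMAGE = "ghcr.io/coqui-ai/tts"


def build_tts_server_command(ros_ws_relative, use_gpu=False, server_args=()):
    """Assemble the `docker run` argument list used by the tts_launch alias.

    ros_ws_relative plays the role of {PATH_ROS_WS_LOCAL}: the workspace
    path relative to the home directory (assumed here, e.g. "catkin_ws").
    """
    models_dir = Path.home() / ros_ws_relative / "src" / "coqui_tts_ros" / "models"
    command = ["docker", "run", "--rm", "-it", "-p", "5002:5002"]
    if use_gpu:
        # The GPU alias differs only by this flag and the image name.
        command += ["--gpus", "all"]
    command += [
        "-v", f"{models_dir}:/root/.local/share/tts/",
        "--entrypoint", "tts-server",
        GPU_IMAGE if use_gpu else CPU_IMAGE,
    ]
    # Anything after the image name is passed on to tts-server.
    return command + list(server_args)


if __name__ == "__main__":
    cmd = build_tts_server_command(
        "catkin_ws",
        use_gpu=False,
        server_args=["--model_name", "tts_models/en/vctk/vits"],
    )
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run` if you prefer launching the server from a script instead of a shell alias.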
- Launch the TTS server from the local environment.
- If using CPU:
$ tts_launch --model_name tts_models/en/vctk/vits
- If using GPU:
$ tts_launch --model_name tts_models/en/vctk/vits --use_cuda true
Note
The --model_name value can be changed to any other supported model.
Please check the available models in model_list.yaml.
- Set the parameters inside tts.launch and select the functions to be used.
<!-- Set Coqui TTS server url -->
<arg name="url" default="http://localhost:5002"/>
<!-- Add period at the end of a sentence (true) -->
<arg name="addStopChar" default="true"/>
<!-- Set result sound filename -->
<arg name="filename" default="output.wav"/>
<!-- Set input style_wav if sample voice is given -->
<arg name="style_wav" default=""/>
<!-- Set Speaker ID if multi-speaker model is being used -->
<arg name="speaker_id" default="p225"/>
<!-- Set Language if multi-language model is being used -->
<arg name="language_id" default=""/>
<!-- Set sound_audio to true if you want to play the sound -->
<arg name="sound_audio" default="true"/>
- Execute the launch file tts.launch.
$ roslaunch coqui_tts_ros tts.launch
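The launch arguments map naturally onto a single HTTP request to the TTS server. The sketch below is not the node shipped in this package; it only illustrates how `url`, `addStopChar`, `filename`, `speaker_id`, `language_id` and `style_wav` could be combined. It assumes the server exposes an `/api/tts` endpoint that accepts `text`, `speaker_id`, `language_id` and `style_wav` as query parameters, as Coqui's tts-server does to the best of my knowledge; all function names are hypothetical.

```python
import urllib.parse
import urllib.request


def add_stop_char(text, enabled=True):
    """Append a period when enabled and the text lacks closing punctuation."""
    stripped = text.strip()
    if enabled and stripped and stripped[-1] not in ".!?":
        return stripped + "."
    return stripped


def build_tts_url(base_url, text, speaker_id="", language_id="", style_wav=""):
    """Build the synthesis URL, skipping parameters that are left empty."""
    params = {
        "text": text,
        "speaker_id": speaker_id,
        "language_id": language_id,
        "style_wav": style_wav,
    }
    # Assumed endpoint and parameter names; adjust if your server differs.
    query = urllib.parse.urlencode({k: v for k, v in params.items() if v})
    return f"{base_url.rstrip('/')}/api/tts?{query}"


def synthesize_to_file(url, filename="output.wav", timeout=30.0):
    """Request audio from a running server and write the bytes to filename."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        audio = response.read()
    with open(filename, "wb") as wav_file:
        wav_file.write(audio)
    return filename


if __name__ == "__main__":
    sentence = add_stop_char("Hello from ROS", enabled=True)
    request_url = build_tts_url("http://localhost:5002", sentence, speaker_id="p225")
    print(request_url)
```

With the TTS server running, calling `synthesize_to_file(request_url, "output.wav")` would save the returned audio under the name given by the `filename` argument.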
- Choose the --model_name value through a parameter.
- Make the style_wav function available.
See the open issues for a full list of proposed features (and known issues).