YourTTS and voice cloning #291
Replies: 3 comments 5 replies
-
Yes, for multi-speaker models you always need to specify a speaker to do TTS; Coqui doesn't have a concept of a default speaker for a model. The Fairseq models are single-speaker, so it's not needed there. But note that YourTTS specifically also supports voice cloning directly, and you will probably achieve better results that way than by separately doing voice conversion afterwards. And in that case there's no need to specify a speaker:

```python
from TTS.api import TTS

tts = TTS('tts_models/multilingual/multi-dataset/your_tts')
tts.tts_to_file('hello world', speaker_wav='reference.wav', language='en')
```
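If you don't have a reference wav, here is a minimal sketch of listing the built-in speakers and picking one explicitly, assuming the standard TTS Python API (the speaker name and output path are just examples):

```python
from TTS.api import TTS

# Load the multi-speaker, multilingual YourTTS checkpoint
tts = TTS('tts_models/multilingual/multi-dataset/your_tts')

# Multi-speaker models expose their built-in speaker names
print(tts.speakers)  # e.g. ['female-en-5', 'male-en-2', ...]

# Synthesize with one of the built-in speakers instead of a reference wav
tts.tts_to_file('hello world', speaker='male-en-2', language='en',
                file_path='builtin_speaker.wav')
```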
-
Ok, got it, thanks!
-
Could you confirm YourTTS only supports en, fr-fr, and pt-br? I saw somewhere that it supports more languages, but how?
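One way to check what the released checkpoint actually supports is to load it and print its language list; a small sketch assuming the standard TTS Python API:

```python
from TTS.api import TTS

tts = TTS('tts_models/multilingual/multi-dataset/your_tts')

# Languages bundled with this checkpoint
print(tts.languages)  # e.g. ['en', 'fr-fr', 'pt-br'] for the released YourTTS model
```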
-
I'm using tts_with_vc() with YourTTS like this:

```python
tts.tts_with_vc(
    text='hello world',
    speaker='male-en-2',
    speaker_wav=voice_clone_path,
    language='en',
)
```

I had to add speaker= to make it work, although with Fairseq I don't need it. Is that correct? Btw, the result isn't really the cloned voice but a voice that sounds like the built-in one with only a hint of the voice clone. Is that normal, or does it need more settings? Thanks
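For comparison, a sketch of the two approaches discussed above, assuming a TTS version whose tts_with_vc() accepts speaker= as in the snippet here; 'reference.wav' and the output file name are placeholders. tts_with_vc() first synthesizes with the built-in speaker and then voice-converts the result toward the reference, while passing speaker_wav to tts_to_file() lets YourTTS clone the reference directly, which the first reply suggests usually gets closer to the target voice:

```python
from TTS.api import TTS

tts = TTS('tts_models/multilingual/multi-dataset/your_tts')

# Option A: synthesize with a built-in speaker, then run voice conversion
# toward the reference wav (this is what tts_with_vc() does); it returns the
# waveform rather than writing a file
wav = tts.tts_with_vc(
    text='hello world',
    speaker='male-en-2',
    speaker_wav='reference.wav',
    language='en',
)

# Option B: clone the reference voice directly with YourTTS, no separate
# voice-conversion step
tts.tts_to_file(
    text='hello world',
    speaker_wav='reference.wav',
    language='en',
    file_path='cloned.wav',
)
```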