Expand Python API capabilities #197

Merged 10 commits on Dec 6, 2024

@eginhard (Member) commented on Dec 6, 2024

This PR aligns the Python API more closely with what is available via the CLI. Previously, pretrained TTS models could only be used with their default vocoder.

For example, the following uses the model's default vocoder, vocoder_models/en/ljspeech/hifigan_v2:

from TTS.api import TTS

tts = TTS("tts_models/en/ljspeech/fast_pitch")
_ = tts.tts("hello")

Now you can also pass a different pretrained vocoder name (fixes coqui-ai#3558):

tts = TTS(
    "tts_models/en/ljspeech/fast_pitch",
    vocoder_name="vocoder_models/en/ljspeech/multiband-melgan"
)
_ = tts.tts("hello")

It is now also possible to combine a pretrained TTS model with a local vocoder (or a local TTS model with a pretrained vocoder), and to pass a custom speaker encoder.
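As a rough illustration of the mixed combinations, the sketch below collects constructor arguments for both cases. Only `model_name` and `vocoder_name` appear in this PR description; `model_path`, `config_path`, `vocoder_path` and `vocoder_config_path` are assumed keyword names, the helper functions `tts_kwargs` and `load_tts` are hypothetical, and every file path is a placeholder. The speaker encoder arguments are left out because their names are not given here.

```python
from typing import Optional


def tts_kwargs(
    model_name: Optional[str] = None,
    model_path: Optional[str] = None,
    config_path: Optional[str] = None,
    vocoder_name: Optional[str] = None,
    vocoder_path: Optional[str] = None,
    vocoder_config_path: Optional[str] = None,
) -> dict:
    """Collect constructor arguments for one TTS model and at most one vocoder.

    Hypothetical helper; keyword names other than model_name and
    vocoder_name are assumptions about the TTS constructor.
    """
    # Exactly one source for the TTS model: a pretrained name or a local checkpoint.
    if (model_name is None) == (model_path is None):
        raise ValueError("pass exactly one of model_name or model_path")
    # At most one source for the vocoder.
    if vocoder_name is not None and vocoder_path is not None:
        raise ValueError("pass at most one of vocoder_name or vocoder_path")
    candidates = {
        "model_name": model_name,
        "model_path": model_path,
        "config_path": config_path,
        "vocoder_name": vocoder_name,
        "vocoder_path": vocoder_path,
        "vocoder_config_path": vocoder_config_path,
    }
    # Drop unset options so the constructor falls back to its own defaults.
    return {key: value for key, value in candidates.items() if value is not None}


def load_tts(**kwargs):
    """Build the TTS object; imported lazily so the helper above has no dependencies."""
    from TTS.api import TTS

    return TTS(**kwargs)


# Pretrained TTS model with a local vocoder (paths are placeholders).
pretrained_with_local_vocoder = tts_kwargs(
    model_name="tts_models/en/ljspeech/fast_pitch",
    vocoder_path="path/to/vocoder.pth",
    vocoder_config_path="path/to/vocoder_config.json",
)

# Local TTS model with a pretrained vocoder, the "vice-versa" case.
local_with_pretrained_vocoder = tts_kwargs(
    model_path="path/to/model.pth",
    config_path="path/to/config.json",
    vocoder_name="vocoder_models/en/ljspeech/multiband-melgan",
)
```

Either dictionary could then be passed on with `load_tts(**pretrained_with_local_vocoder)`, which requires the package and real checkpoint files.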

I then refactored the CLI to use the Python API internally, so that both entry points go through the same pipeline.
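To illustrate the idea of a CLI that delegates to the Python API, here is a minimal sketch. It is not the code from this PR: the functions `parse_args` and `main` are hypothetical, the flag set is reduced to four options, and the default output file name is chosen for the example.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse a reduced, illustrative set of command-line options."""
    parser = argparse.ArgumentParser(
        description="Sketch of a CLI that delegates to the Python API"
    )
    parser.add_argument("--text", required=True, help="Text to synthesize")
    parser.add_argument("--model_name", required=True, help="Pretrained TTS model name")
    parser.add_argument("--vocoder_name", default=None, help="Optional pretrained vocoder name")
    parser.add_argument("--out_path", default="tts_output.wav", help="Output wav file")
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> None:
    """Run synthesis through the same TTS object a Python caller would use."""
    args = parse_args(argv)
    # Imported lazily so argument parsing works without the package installed.
    from TTS.api import TTS

    # Only forward the vocoder when one was requested, keeping the model's default otherwise.
    kwargs = {"vocoder_name": args.vocoder_name} if args.vocoder_name else {}
    tts = TTS(args.model_name, **kwargs)
    tts.tts_to_file(text=args.text, file_path=args.out_path)
```

Calling `main(["--text", "hello", "--model_name", "tts_models/en/ljspeech/fast_pitch"])` would then take the same path as the Python examples above.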

@eginhard eginhard merged commit b545ab8 into dev Dec 6, 2024
35 checks passed
@eginhard eginhard deleted the api branch December 6, 2024 17:02