Commit 64c7a59

feat(manager): print download location when listing models
1 parent cd52907 commit 64c7a59

2 files changed: +40 -16 lines changed

TTS/utils/manage.py (+2)
````diff
@@ -95,6 +95,8 @@ def list_models(self):
         for model_type in self.models_dict:
             model_list = self._list_models(model_type, model_count)
             models_name_list.extend(model_list)
+        logger.info("")
+        logger.info("Path to downloaded models: %s", self.output_prefix)
         return models_name_list
 
     def log_model_details(self, model_type, lang, dataset, model):
````
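The new log lines surface `self.output_prefix`, the root folder models are downloaded to. According to the FAQ update below, that prefix defaults to a per-platform data directory and can be redirected with the `XDG_DATA_HOME` or `TTS_HOME` environment variables. A minimal sketch of that resolution logic, using a hypothetical `get_models_prefix` helper (the actual `ModelManager` implementation may differ, e.g. in how the macOS and Windows defaults are handled):

```python
import os
from pathlib import Path


def get_models_prefix() -> Path:
    """Hypothetical helper mirroring the FAQ's description: TTS_HOME or
    XDG_DATA_HOME take precedence, otherwise fall back to the platform
    default (shown here only for the Linux case, ~/.local/share)."""
    base = os.environ.get("TTS_HOME") or os.environ.get("XDG_DATA_HOME")
    if base is None:
        base = Path.home() / ".local" / "share"
    return Path(base) / "tts"


print(get_models_prefix())  # e.g. /home/user/.local/share/tts
```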

docs/source/faq.md (+38 -16)
````diff
@@ -1,21 +1,43 @@
 # FAQ
-We tried to collect common issues and questions we receive about 🐸TTS. It is worth checking before going deeper.
+We tried to collect common issues and questions we receive about 🐸TTS. It is
+worth checking before going deeper.
 
-## Errors with a pre-trained model. How can I resolve this?
-- Make sure you use the right commit version of 🐸TTS. Each pre-trained model has its corresponding version that needs to be used. It is defined on the model table.
-- If it is still problematic, post your problem on [Discussions](https://github.com/idiap/coqui-ai-TTS/discussions). Please give as many details as possible (error message, your TTS version, your TTS model and config.json etc.)
-- If you feel like it's a bug to be fixed, then prefer Github issues with the same level of scrutiny.
+## Using Coqui
 
-## What are the requirements of a good 🐸TTS dataset?
+### Where does Coqui store downloaded models?
+
+The path to downloaded models is printed when running `tts --list_models`.
+Default locations are:
+
+- **Linux:** `~/.local/share/tts`
+- **Mac:** `~/Library/Application Support/tts`
+- **Windows:** `C:\Users\<user>\AppData\Local\tts`
+
+You can change the prefix of this `tts/` folder by setting the `XDG_DATA_HOME`
+or `TTS_HOME` environment variables.
+
+### Errors with a pre-trained model. How can I resolve this?
+- Make sure you use the latest version of 🐸TTS. Each pre-trained model is only
+  supported from a certain minimum version.
+- If it is still problematic, post your problem on
+  [Discussions](https://github.com/idiap/coqui-ai-TTS/discussions). Please give
+  as many details as possible (error message, your TTS version, your TTS model
+  and config.json etc.)
+- If you feel like it's a bug to be fixed, then prefer Github issues with the
+  same level of scrutiny.
+
+## Training Coqui models
+
+### What are the requirements of a good 🐸TTS dataset?
 - [See this page](datasets/what_makes_a_good_dataset.md)
 
-## How should I choose the right model?
+### How should I choose the right model?
 - First, train Tacotron. It is smaller and faster to experiment with. If it performs poorly, try Tacotron2.
 - Tacotron models produce the most natural voice if your dataset is not too noisy.
 - If both models do not perform well and especially the attention does not align, then try AlignTTS or GlowTTS.
 - If you need faster models, consider SpeedySpeech, GlowTTS or AlignTTS. Keep in mind that SpeedySpeech requires a pre-trained Tacotron or Tacotron2 model to compute text-to-speech alignments.
 
-## How can I train my own `tts` model?
+### How can I train my own `tts` model?
 0. Check your dataset with notebooks in [dataset_analysis](https://github.com/idiap/coqui-ai-TTS/tree/main/notebooks/dataset_analysis) folder. Use [this notebook](https://github.com/idiap/coqui-ai-TTS/blob/main/notebooks/dataset_analysis/CheckSpectrograms.ipynb) to find the right audio processing parameters. A better set of parameters results in a better audio synthesis.
 
 1. Write your own dataset `formatter` in `datasets/formatters.py` or [format](datasets/formatting_your_dataset) your dataset as one of the supported datasets, like LJSpeech.
@@ -64,13 +86,13 @@ We tried to collect common issues and questions we receive about 🐸TTS. It is
 **Note:** You can also train your model using pure 🐍 python. Check the
 [tutorial](tutorial_for_nervous_beginners.md).
 
-## How can I train in a different language?
+### How can I train in a different language?
 - Check steps 2, 3, 4, 5 above.
 
-## How can I train multi-GPUs?
+### How can I train multi-GPUs?
 - Check step 5 above.
 
-## How can I check model performance?
+### How can I check model performance?
 - You can inspect model training and performance using ```tensorboard```. It will show you loss, attention alignment, model output. Go with the order below to measure the model performance.
 1. Check ground truth spectrograms. If they do not look as they are supposed to, then check audio processing parameters in ```config.json```.
 2. Check train and eval losses and make sure that they all decrease smoothly in time.
@@ -85,7 +107,7 @@ We tried to collect common issues and questions we receive about 🐸TTS. It is
 - 'bidirectional_decoder' is your ultimate savior, but it trains 2x slower and demands 1.5x more GPU memory.
 - You can also try the other models like AlignTTS or GlowTTS.
 
-## How do I know when to stop training?
+### How do I know when to stop training?
 There is no single objective metric to decide the end of a training since the voice quality is a subjective matter.
 
 In our model trainings, we follow these steps;
@@ -98,17 +120,17 @@ In our model trainings, we follow these steps;
 Keep in mind that the approach above only validates the model robustness. It is hard to estimate the voice quality without asking the actual people.
 The best approach is to pick a set of promising models and run a Mean-Opinion-Score study asking actual people to score the models.
 
-## My model does not learn. How can I debug?
+### My model does not learn. How can I debug?
 - Go over the steps under "How can I check model performance?"
 
-## Attention does not align. How can I make it work?
+### Attention does not align. How can I make it work?
 - Check the 4th step under "How can I check model performance?"
 
-## How can I test a trained model?
+### How can I test a trained model?
 - The best way is to use `tts` or `tts-server` commands. For details check [here](inference.md).
 - If you need to code your own ```TTS.utils.synthesizer.Synthesizer``` class.
 
-## My Tacotron model does not stop - I see "Decoder stopped with 'max_decoder_steps" - Stopnet does not work.
+### My Tacotron model does not stop - I see "Decoder stopped with 'max_decoder_steps" - Stopnet does not work.
 - In general, all of the above relates to the `stopnet`. It is the part of the model telling the `decoder` when to stop.
 - In general, a poor `stopnet` relates to something else that is broken in your model or dataset. Especially the attention module.
 - One common reason is the silent parts in the audio clips at the beginning and the ending. Check ```trim_db``` value in the config. You can find a better value for your dataset by using ```CheckSpectrogram``` notebook. If this value is too small, too much of the audio will be trimmed. If too big, then too much silence will remain. Both will curtail the `stopnet` performance.
````
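To see the new output in practice, run `tts --list_models` from the shell, or call the manager directly from Python. A rough usage sketch, assuming `ModelManager` can be constructed with its defaults (check the actual signature in `TTS/utils/manage.py`):

```python
from TTS.utils.manage import ModelManager

# list_models() logs every available model and, with this commit, also the
# download location, e.g. "Path to downloaded models: ~/.local/share/tts".
manager = ModelManager()
models = manager.list_models()
print(f"{len(models)} models available")
```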
