Documentation corrections for finetuning and data preparation (#931)

* arctic recipe added

* config correction

* arctic config update

* directory name fix

* ugly prints added

* config and data corrections

* training instructions added

* documentation updates for finetuning and data prep

Revert "arctic recipe added"

This reverts commit 77b4df1f43a00af642f43655abf817e0551d0147.

doc updates for finetuning and data prep
Baybars Külebi 2021-11-15 18:14:55 +01:00 committed by GitHub
parent 2ed9e3c241
commit 9a145c9b88
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 9 additions and 9 deletions

@@ -93,13 +93,13 @@ them and fine-tune it for your own dataset. This will help you in two main ways:
 ```bash
 CUDA_VISIBLE_DEVICES="0" python recipes/ljspeech/glow_tts/train_glowtts.py \
-    --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
+    --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar
 ```
 ```bash
 CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tts.py \
     --config_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/config.json \
-    --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
+    --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar
 ```
 As stated above, you can also use command-line arguments to change the model configuration.
@@ -107,7 +107,7 @@ them and fine-tune it for your own dataset. This will help you in two main ways:
 ```bash
 CUDA_VISIBLE_DEVICES="0" python recipes/ljspeech/glow_tts/train_glowtts.py \
-    --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
+    --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar
     --coqpit.run_name "glow-tts-finetune" \
     --coqpit.lr 0.00001
 ```
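
Reviewer note: the hunks above change `--restore_path` from the model directory to the checkpoint file inside it. A minimal sketch of a guard against the old mistake follows. `build_finetune_command` is a hypothetical helper written for this note, not part of the TTS code base; it only assembles an argument list and checks that the path is a file.

```python
from pathlib import Path


def build_finetune_command(script, restore_path, extra_args=()):
    """Assemble the argv list for a fine-tuning run (hypothetical helper).

    Raises ValueError when restore_path is not an existing file, which is
    the case the diff corrects: the model directory was passed instead of
    the checkpoint file inside it.
    """
    checkpoint = Path(restore_path)
    if not checkpoint.is_file():
        raise ValueError(
            f"--restore_path must point to a checkpoint file, got: {checkpoint}"
        )
    # Extra arguments such as "--coqpit.lr 0.00001" are appended unchanged.
    return ["python", script, "--restore_path", str(checkpoint), *extra_args]
```

The returned list can be handed to `subprocess.run` with `CUDA_VISIBLE_DEVICES` set in the environment; passing the directory fails before any training process starts.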

@@ -19,15 +19,15 @@ Let's assume you created the audio clips and their transcription. You can collec
 You can either create separate transcription files for each clip or create a text file that maps each audio clip to its transcription. In this file, each line must be delimitered by a special character separating the audio file name from the transcription. And make sure that the delimiter is not used in the transcription text.
-We recommend the following format delimited by `||`.
+We recommend the following format delimited by `||`. In the following example, `audio1`, `audio2` refer to files `audio1.wav`, `audio2.wav` etc.
 ```
 # metadata.txt
-audio1.wav || This is my sentence.
-audio2.wav || This is maybe my sentence.
-audio3.wav || This is certainly my sentence.
-audio4.wav || Let this be your sentence.
+audio1||This is my sentence.
+audio2||This is maybe my sentence.
+audio3||This is certainly my sentence.
+audio4||Let this be your sentence.
 ...
 ```
@@ -80,4 +80,4 @@ See `TTS.tts.datasets.TTSDataset`, a generic `Dataset` implementation for the `t
 See `TTS.vocoder.datasets.*`, for different `Dataset` implementations for the `vocoder` models.
 See `TTS.utils.audio.AudioProcessor` that includes all the audio processing and feature extraction functions used in a
 `Dataset` implementation. Feel free to add things as you need.passed
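
Reviewer note: the `||`-delimited `metadata.txt` format introduced in the data preparation hunk can be read with a few lines of Python. The sketch below is an illustration only, not the formatter that TTS ships. `parse_metadata_line`, `load_metadata`, and the `wav_dir` argument are hypothetical names, and skipping lines that start with `#` is an assumption based on the `# metadata.txt` heading in the example.

```python
from pathlib import Path

DELIMITER = "||"


def parse_metadata_line(line):
    """Split 'audio1||This is my sentence.' into (clip_id, text)."""
    clip_id, sep, text = line.partition(DELIMITER)
    if not sep:
        raise ValueError(f"missing '{DELIMITER}' delimiter: {line!r}")
    clip_id, text = clip_id.strip(), text.strip()
    # The docs require that the delimiter never appears in the transcription.
    if DELIMITER in text:
        raise ValueError(f"delimiter found inside transcription: {line!r}")
    return clip_id, text


def load_metadata(path, wav_dir):
    """Return a list of (wav_path, text) pairs from a metadata file."""
    items = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines carry no data.
        if not line or line.startswith("#"):
            continue
        clip_id, text = parse_metadata_line(line)
        # 'audio1' refers to the file 'audio1.wav', as the new docs state.
        items.append((Path(wav_dir) / f"{clip_id}.wav", text))
    return items
```

Rejecting a second `||` in the transcription turns the "delimiter is not used in the transcription text" rule into an explicit error instead of a silently truncated sentence.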