Documentation corrections for finetuning and data preparation (#931)

* arctic recipe added

* config correction

* arctic config update

* directory name fix

* ugly prints added

* config and data corrections

* training instructions added

* documentation updates for finetuning and data prep

Revert "arctic recipe added"

This reverts commit 77b4df1f43a00af642f43655abf817e0551d0147.

doc updates for finetuning and data prep
Baybars Külebi 2021-11-15 18:14:55 +01:00 committed by GitHub
parent 2ed9e3c241
commit 9a145c9b88
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 9 additions and 9 deletions

View File

@ -93,13 +93,13 @@ them and fine-tune it for your own dataset. This will help you in two main ways:
```bash
CUDA_VISIBLE_DEVICES="0" python recipes/ljspeech/glow_tts/train_glowtts.py \
--restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
--restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar
```
```bash
CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tts.py \
--config_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/config.json \
--restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
--restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar
```
As stated above, you can also use command-line arguments to change the model configuration.
@ -107,7 +107,7 @@ them and fine-tune it for your own dataset. This will help you in two main ways:
```bash
CUDA_VISIBLE_DEVICES="0" python recipes/ljspeech/glow_tts/train_glowtts.py \
--restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
--restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar \
--coqpit.run_name "glow-tts-finetune" \
--coqpit.lr 0.00001
```

View File

@ -19,15 +19,15 @@ Let's assume you created the audio clips and their transcription. You can collec
You can either create separate transcription files for each clip or create a text file that maps each audio clip to its transcription. In this file, each line must use a special delimiter that separates the audio file name from the transcription. Make sure that the delimiter is not used in the transcription text.
We recommend the following format delimited by `||`.
We recommend the following format delimited by `||`. In the following example, `audio1`, `audio2` refer to files `audio1.wav`, `audio2.wav` etc.
```
# metadata.txt
audio1.wav || This is my sentence.
audio2.wav || This is maybe my sentence.
audio3.wav || This is certainly my sentence.
audio4.wav || Let this be your sentence.
audio1||This is my sentence.
audio2||This is maybe my sentence.
audio3||This is certainly my sentence.
audio4||Let this be your sentence.
...
```
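A minimal sketch of how a metadata file in the `||`-delimited format above could be read, assuming each line holds a file name without extension followed by its transcription. The function name `parse_metadata`, its parameters, and the choice to skip lines starting with `#` are hypothetical and are not part of the TTS library; they only illustrate the rule that the delimiter must not appear in the transcription.

```python
def parse_metadata(text, delimiter="||", audio_ext=".wav"):
    """Parse '<name>||<transcription>' lines into (audio_file, transcription) pairs.

    Hypothetical helper for illustration; not part of the TTS library.
    """
    items = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines (e.g. "# metadata.txt") are skipped.
        if not line or line.startswith("#"):
            continue
        # Split on the first delimiter only, so a second one can be detected below.
        name, sep, transcription = line.partition(delimiter)
        if not sep:
            raise ValueError(f"line {line_no}: missing delimiter {delimiter!r}")
        transcription = transcription.strip()
        if delimiter in transcription:
            raise ValueError(f"line {line_no}: delimiter found inside the transcription")
        # 'audio1' refers to the file 'audio1.wav'.
        items.append((name.strip() + audio_ext, transcription))
    return items


sample = """# metadata.txt
audio1||This is my sentence.
audio2||This is maybe my sentence.
"""
pairs = parse_metadata(sample)
```

Rejecting a line whose transcription contains the delimiter surfaces malformed entries early instead of silently truncating the text.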
@ -80,4 +80,4 @@ See `TTS.tts.datasets.TTSDataset`, a generic `Dataset` implementation for the `t
See `TTS.vocoder.datasets.*`, for different `Dataset` implementations for the `vocoder` models.
See `TTS.utils.audio.AudioProcessor` that includes all the audio processing and feature extraction functions used in a
`Dataset` implementation. Feel free to add things as you need.passed
`Dataset` implementation. Feel free to add things as you need.passed