From 9a145c9b88a348cb9149aa363782bcfe86a14aa8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Baybars=20K=C3=BClebi?= <40303490+gullabi@users.noreply.github.com>
Date: Mon, 15 Nov 2021 18:14:55 +0100
Subject: [PATCH] Documentation corrections for finetuning and data
 preparation (#931)

* arctic recipe added

* config correction

* arctic config update

* directory name fix

* ugly prints added

* config and data corrections

* training instructions added

* documentation updates for finetuning and data prep

Revert "arctic recipe added"

This reverts commit 77b4df1f43a00af642f43655abf817e0551d0147.

doc updates for finetuning and data prep
---
 docs/source/finetuning.md              |  6 +++---
 docs/source/formatting_your_dataset.md | 12 ++++++------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/source/finetuning.md b/docs/source/finetuning.md
index fd9a295c..42b9e518 100644
--- a/docs/source/finetuning.md
+++ b/docs/source/finetuning.md
@@ -93,13 +93,13 @@ them and fine-tune it for your own dataset. This will help you in two main ways:
 
     ```bash
     CUDA_VISIBLE_DEVICES="0" python recipes/ljspeech/glow_tts/train_glowtts.py \
-        --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
+        --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar
     ```
 
     ```bash
     CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tts.py \
         --config_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/config.json \
-        --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
+        --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar
     ```
 
     As stated above, you can also use command-line arguments to change the model configuration.
@@ -107,7 +107,7 @@ them and fine-tune it for your own dataset. This will help you in two main ways:
 
     ```bash
     CUDA_VISIBLE_DEVICES="0" python recipes/ljspeech/glow_tts/train_glowtts.py \
-        --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts
+        --restore_path /home/ubuntu/.local/share/tts/tts_models--en--ljspeech--glow-tts/model_file.pth.tar \
         --coqpit.run_name "glow-tts-finetune" \
         --coqpit.lr 0.00001
     ```
diff --git a/docs/source/formatting_your_dataset.md b/docs/source/formatting_your_dataset.md
index cbefc61d..3db38af0 100644
--- a/docs/source/formatting_your_dataset.md
+++ b/docs/source/formatting_your_dataset.md
@@ -19,15 +19,15 @@ Let's assume you created the audio clips and their transcription. You can collec
 
 You can either create separate transcription files for each clip or create a text file that maps each audio clip to its transcription. In this file, each line must be delimited by a special character separating the audio file name from the transcription. And make sure that the delimiter is not used in the transcription text.
 
-We recommend the following format delimited by `||`.
+We recommend the following format delimited by `||`. In the following example, `audio1` and `audio2` refer to the files `audio1.wav`, `audio2.wav`, and so on.
 
 ```
 # metadata.txt
 
-audio1.wav || This is my sentence.
-audio2.wav || This is maybe my sentence.
-audio3.wav || This is certainly my sentence.
-audio4.wav || Let this be your sentence.
+audio1||This is my sentence.
+audio2||This is maybe my sentence.
+audio3||This is certainly my sentence.
+audio4||Let this be your sentence.
 ...
 ```
 
@@ -80,4 +80,4 @@ See `TTS.tts.datasets.TTSDataset`, a generic `Dataset` implementation for the `t
 See `TTS.vocoder.datasets.*`, for different `Dataset` implementations for the `vocoder` models.
 
 See `TTS.utils.audio.AudioProcessor` that includes all the audio processing and feature extraction functions used in a
-`Dataset` implementation. Feel free to add things as you need.
\ No newline at end of file
+`Dataset` implementation. Feel free to add things as you need.
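
For reference, a custom formatter that loads the `||`-delimited `metadata.txt` introduced above might look like the sketch below. This is illustrative only and not part of the patch: the function name `my_formatter`, the `wavs/` folder layout, the hard-coded `speaker_name`, and the returned dictionary keys are assumptions; check `TTS.tts.datasets` in your installed version for the exact interface it expects.

```python
import os

def my_formatter(root_path, manifest_file, **kwargs):
    """Parse a ||-delimited metadata file into a list of sample dicts (assumed layout)."""
    items = []
    speaker_name = "my_speaker"  # single-speaker dataset assumed
    with open(os.path.join(root_path, manifest_file), "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments such as "# metadata.txt", and lines without the delimiter.
            if not line or line.startswith("#") or "||" not in line:
                continue
            file_id, text = line.split("||", 1)
            # "audio1" refers to wavs/audio1.wav, as described in the updated docs.
            wav_file = os.path.join(root_path, "wavs", file_id.strip() + ".wav")
            items.append({"text": text.strip(), "audio_file": wav_file, "speaker_name": speaker_name})
    return items
```

Such a formatter is typically handed to the dataset-loading utilities together with the dataset config; the exact call signature and return shape may differ between TTS releases, so treat this as a starting point rather than the library's API.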