mirror of https://github.com/coqui-ai/TTS.git
Update XTTS finetuner docs
This commit is contained in:
parent
68964fca0d
commit
5dd217a759
|
@ -202,7 +202,7 @@ if __name__ == "__main__":
|
|||
)
|
||||
demo.load(read_logs, None, logs, every=1)
|
||||
|
||||
prompt_compute_btn = gr.Button(value="Step 1 - Create dataset.")
|
||||
prompt_compute_btn = gr.Button(value="Step 1 - Create dataset")
|
||||
|
||||
def preprocess_dataset(audio_path, language, out_path, progress=gr.Progress(track_tqdm=True)):
|
||||
clear_gpu_cache()
|
||||
|
@ -315,7 +315,7 @@ if __name__ == "__main__":
|
|||
progress_load = gr.Label(
|
||||
label="Progress:"
|
||||
)
|
||||
load_btn = gr.Button(value="Step 3 - Load Fine tuned XTTS model")
|
||||
load_btn = gr.Button(value="Step 3 - Load Fine-tuned XTTS model")
|
||||
|
||||
with gr.Column() as col2:
|
||||
speaker_reference_audio = gr.Textbox(
|
||||
|
|
|
@ -182,7 +182,7 @@ To make `XTTS_v2` GPT encoder training easier for beginner users we did a gradio
|
|||
- Train the XTTS GPT encoder with the processed data
|
||||
- Inference support using the fine-tuned model
|
||||
|
||||
The user can run this gradio demos locally or remotely using a Colab Notebook.
|
||||
The user can run this gradio demo locally or remotely using a Colab Notebook.
|
||||
|
||||
##### Run demo on Colab
|
||||
To make the `XTTS_v2` fine-tuning more accessible for users that do not have good GPUs available we did a Google Colab Notebook.
|
||||
|
@ -191,6 +191,15 @@ The Colab Notebook is available [here](https://colab.research.google.com/drive/1
|
|||
|
||||
To learn how to use this Colab Notebook please check the [XTTS fine-tuning video]().
|
||||
|
||||
If you are not able to acess the video you need to follow the steps:
|
||||
|
||||
1. Open the Colab notebook and start the demo by runining the first two cells (ignore pip install errors in the first one).
|
||||
2. Click on the link "Running on public URL:" on the second cell output.
|
||||
3. On the first Tab (1 - Data processing) you need to select the audio file or files, wait for upload, and then click on the button "Step 1 - Create dataset" and then wait until the dataset processing is done.
|
||||
4. Soon as the dataset processing is done you need to go to the second Tab (2 - Fine-tuning XTTS Encoder) and press the button "Step 2 - Run the training" and then wait until the training is finished. Note that it can take up to 40 minutes.
|
||||
5. Soon the training is done you can go to the third Tab (3 - Inference) and then click on the button "Step 3 - Load Fine-tuned XTTS model" and wait until the fine-tuned model is loaded. Then you can do the inference on the model by clicking on the button "Step 4 - Inference".
|
||||
|
||||
|
||||
##### Run demo locally
|
||||
|
||||
To run the demo locally you need to do the following steps:
|
||||
|
@ -199,6 +208,13 @@ To run the demo locally you need to do the following steps:
|
|||
3. Run the gradio demo using the command `python3 TTS/demos/xtts_ft_demo/xtts_demo.py`
|
||||
4. Follow the steps presented on the [XTTS fine-tuning video]() to be able to fine-tune and use the fine-tuned model.
|
||||
|
||||
|
||||
If you are not able to acess the video you need to follow the steps:
|
||||
|
||||
1. On the first Tab (1 - Data processing) you need to select the audio file or files, wait for upload, and then click on the button "Step 1 - Create dataset" and then wait until the dataset processing is done.
|
||||
2. Soon as the dataset processing is done you need to go to the second Tab (2 - Fine-tuning XTTS Encoder) and press the button "Step 2 - Run the training" and then wait until the training is finished. Note that it can take up to 40 minutes.
|
||||
3. Soon the training is done you can go to the third Tab (3 - Inference) and then click on the button "Step 3 - Load Fine-tuned XTTS model" and wait until the fine-tuned model is loaded. Then you can do the inference on the model by clicking on the button "Step 4 - Inference".
|
||||
|
||||
#### Advanced training
|
||||
|
||||
A recipe for `XTTS_v2` GPT encoder training using `LJSpeech` dataset is available at https://github.com/coqui-ai/TTS/tree/dev/recipes/ljspeech/xtts_v1/train_gpt_xtts.py
|
||||
|
|
Loading…
Reference in New Issue