coqui-tts/TTS/vocoder
Latest commit: ced4cfdbbf by Agrin Hilmkil, 2021-08-09 18:02:36 +00:00

Allow saving / loading checkpoints from cloud paths (#683)

Allows saving and loading checkpoints directly from cloud paths such as Amazon S3 (`s3://`) and Google Cloud Storage (`gs://`) by using fsspec.

Note: the user has to install the relevant dependency for each protocol. Otherwise fsspec will fail and report which dependency is missing.

Follow-up commits squashed into this PR:

* Append suffix `_fsspec` to save/load function names
* Add a lower bound to the fsspec dependency (skips the 0 major version)
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless `output_path` is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for `load_fsspec`
* Fix bug where `fs` was not always created
* Restore the experiment removal functionality
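A minimal sketch of what fsspec-based checkpoint I/O looks like. The names `save_fsspec` and `load_fsspec` are taken from the commit message above, but their exact signatures in TTS are an assumption here; only `fsspec.open` and the `torch` serialization calls are standard APIs.

```python
# Sketch only: save/load a checkpoint through fsspec. The real TTS helpers
# may carry extra arguments; treat the signatures below as illustrative.
import fsspec
import torch


def save_fsspec(state: dict, path: str) -> None:
    # fsspec picks the filesystem from the path protocol: local, s3://, gs://, ...
    with fsspec.open(path, "wb") as f:
        torch.save(state, f)


def load_fsspec(path: str, map_location: str = "cpu") -> dict:
    with fsspec.open(path, "rb") as f:
        return torch.load(f, map_location=map_location)


# The protocol-specific dependency (e.g. s3fs for s3://, gcsfs for gs://) must
# be installed; otherwise fsspec raises an error naming the missing package.
# save_fsspec(model.state_dict(), "s3://my-bucket/run-1/checkpoint.pth.tar")
```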
Directory contents (last commit per entry):

| Name | Last commit | Date |
| --- | --- | --- |
| `configs` | Add UnivNet vocoder 🚀 | 2021-06-23 |
| `datasets` | Fix WaveGrad `test_run` | 2021-07-16 |
| `layers` | Add UnivNet vocoder 🚀 | 2021-06-23 |
| `models` | Allow saving / loading checkpoints from cloud paths (#683) | 2021-08-09 |
| `tf` | Allow saving / loading checkpoints from cloud paths (#683) | 2021-08-09 |
| `utils` | Update glowtts docstrings and docs | 2021-06-30 |
| `README.md` | rename the project to old TTS | 2020-09-09 |
| `__init__.py` | rename the project to old TTS | 2020-09-09 |
| `pqmf_output.wav` | rename the project to old TTS | 2020-09-09 |

README.md

Mozilla TTS Vocoders (Experimental)

This folder contains vocoder model implementations that can be combined with the other TTS models.

Currently, the following models are implemented:

  • MelGAN
  • MultiBand-MelGAN
  • ParallelWaveGAN
  • GAN-TTS (discriminator only)

It is also easy to adapt other vocoder models, as we provide a flexible and modular (but not too modular) framework; the sketch below illustrates the generator interface conceptually.
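Purely as a hypothetical illustration of the mel-to-waveform contract a vocoder generator fulfills; this is not the framework's actual base-class API.

```python
# Hypothetical toy generator: maps a mel spectrogram [B, n_mels, T] to a
# waveform [B, 1, T * hop_length]. Real vocoders replace the single layer
# below with a full upsampling stack.
import torch
from torch import nn


class ToyVocoder(nn.Module):
    def __init__(self, n_mels: int = 80, hop_length: int = 256):
        super().__init__()
        self.upsample = nn.ConvTranspose1d(
            n_mels, 1,
            kernel_size=hop_length * 2,
            stride=hop_length,
            padding=hop_length // 2,
        )

    def forward(self, mel: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.upsample(mel))


# ToyVocoder()(torch.randn(1, 80, 100)).shape -> torch.Size([1, 1, 25600])
```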

Training a model

An example Colab notebook training MelGAN with the LJSpeech dataset will be linked here soon.

In order to train a new model, you need to gather all the wav files in a folder and set this folder as `data_path` in `config.json` (a hypothetical fragment is sketched below).
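A hedged sketch of such a `config.json` fragment: `data_path` is the only field taken from this README, the other keys are illustrative placeholders, and the example files under `tts/vocoder/configs/` are the authoritative reference.

```json
{
    "model": "melgan",
    "data_path": "/path/to/your/wavs/",
    "batch_size": 32,
    "epochs": 10000
}
```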

You need to define the other relevant parameters in your `config.json` and then start training with the following command.

```bash
CUDA_VISIBLE_DEVICES='0' python tts/bin/train_vocoder.py --config_path path/to/config.json
```

Example config files can be found under the `tts/vocoder/configs/` folder.

You can continue a previous training run with the following command.

```bash
CUDA_VISIBLE_DEVICES='0' python tts/bin/train_vocoder.py --continue_path path/to/your/model/folder
```

You can fine-tune a pre-trained model with the following command.

```bash
CUDA_VISIBLE_DEVICES='0' python tts/bin/train_vocoder.py --restore_path path/to/your/model.pth.tar
```

Restoring a model starts a new training run in a different folder; it only loads the model weights from the given checkpoint file. Continuing a training run, in contrast, resumes in the same directory where the previous run left off.

You can also follow your training runs on TensorBoard, as you do with our TTS models.
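Assuming the logs are written to the training output folder, you can point TensorBoard at it:

```bash
tensorboard --logdir path/to/your/model/folder
```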

Acknowledgement

Thanks to @kan-bayashi, whose repository was the starting point of our work.