coqui-tts/vocoder
configs update multiband melgan for mean-scale normalization and new audio parameters 2020-06-22 14:56:12 +02:00
datasets Fixes #450 2020-07-11 17:48:05 +01:00
layers update multiband melgan for mean-scale normalization and new audio parameters 2020-06-22 14:56:12 +02:00
models renaming for melgan generator 2020-06-19 12:25:03 +02:00
notebooks initial commit intro. to vocoder submodule 2020-06-15 19:02:18 +02:00
tests use librosa 0.7.2 and fix vocoder datatset assert 2020-07-12 16:09:03 +02:00
tf tflite inference for melgan models 2020-07-14 17:48:44 +02:00
utils Round seconds to two decimals 2020-07-13 19:02:17 +02:00
README.md Clarify GPU Id use with vocoder training 2020-07-11 17:56:49 +01:00
__init__.py initial commit intro. to vocoder submodule 2020-06-15 19:02:18 +02:00
compute_tts_features.py initial commit intro. to vocoder submodule 2020-06-15 19:02:18 +02:00
pqmf_output.wav initial commit intro. to vocoder submodule 2020-06-15 19:02:18 +02:00
train.py bug fix init AudioProcessor in train.py 2020-06-15 19:26:26 +02:00

README.md

Mozilla TTS Vocoders (Experimental)

We provide here several vocoder implementations that can be combined with our TTS models to enable a "faster than real-time" end-to-end TTS stack.

Currently, the following models are implemented:

  • MelGAN
  • Multi-Band MelGAN
  • GAN-TTS (discriminator only)

It is also easy to adapt new vocoder models, as we provide a flexible and modular (but not too modular) framework; a rough sketch of the expected model shape follows.
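As an illustration only, the sketch below shows the general shape of a GAN-vocoder generator: a torch.nn.Module that maps mel spectrograms to waveforms. The class name and layers are hypothetical assumptions, not the framework's actual API; the real implementations live under vocoder/models/.

    import torch

    class MyVocoderGenerator(torch.nn.Module):
        """Hypothetical sketch -- not the framework's actual base class.

        Assumed contract: take a mel spectrogram [batch, n_mels, frames]
        and return a waveform [batch, 1, samples].
        """

        def __init__(self, in_channels=80):
            super().__init__()
            # A single conv layer stands in for a real upsampling stack.
            self.net = torch.nn.Conv1d(in_channels, 1, kernel_size=7, padding=3)

        def forward(self, mel):
            # mel: [batch, n_mels, frames] -> waveform-like output in [-1, 1]
            return torch.tanh(self.net(mel))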

Training a model

An example Colab notebook training MelGAN on the LJSpeech dataset will be linked here (soon).

In order to train a new model, you need to collect all your wav files under a common parent folder and give this path to the data_path field in config.json.
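For example, a minimal config.json excerpt might look like the following; the path is illustrative, and every field other than data_path is omitted here:

    {
        "data_path": "/data/LJSpeech-1.1/wavs/",
        ...
    }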

You need to define the other relevant parameters in your config.json, and then you can start training with the following command from the Mozilla TTS root path, where '0' is the ID of the GPU you wish to use.

CUDA_VISIBLE_DEVICES='0' python vocoder/train.py --config_path path/to/config.json

Example config files can be found under the vocoder/configs/ folder.

You can continue a previous training run with the following command.

CUDA_VISIBLE_DEVICES='0' python vocoder/train.py --continue_path path/to/your/model/folder

You can fine-tune a pre-trained model with the following command.

CUDA_VISIBLE_DEVICES='0' python vocoder/train.py --restore_path path/to/your/model.pth.tar

Restoring a model starts a new training run in a different output folder; it only restores the model weights from the given checkpoint file. Continuing a training run, in contrast, resumes from the same state the previous run left off.

You can also follow your training runs on TensorBoard, as you do with our TTS models.
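For example, assuming TensorBoard is installed, you can point it at your training output folder:

    tensorboard --logdir path/to/your/model/folder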

Acknowledgement

Thanks to @kan-bayashi, whose repository was the starting point of our work.