From 112fe0dc4d5189dc4ea682347c097b2529ca113d Mon Sep 17 00:00:00 2001
From: Eren Golge
Date: Thu, 7 Mar 2019 02:01:41 +0100
Subject: [PATCH] readme update

---
 README.md   | 6 +++---
 config.json | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index dc0bce73..48d2774b 100644
--- a/README.md
+++ b/README.md
@@ -3,9 +3,9 @@
 This project is a part of [Mozilla Common Voice](https://voice.mozilla.org/en).
 TTS aims a deep learning based Text2Speech engine, low in cost and high in quality. To begin with, you can hear a sample generated voice from [here](https://soundcloud.com/user-565970875/commonvoice-loc-sens-attn).
 
-The model architecture is highly inspired by Tacotron: [A Fully End-to-End Text-To-Speech Synthesis Model](https://arxiv.org/abs/1703.10135). However, it has many important updates that make training faster and computationally very efficient. Feel free to experiment with new ideas and propose changes.
+TTS includes two model implementations, based on [Tacotron](https://arxiv.org/abs/1703.10135) and [Tacotron2](https://arxiv.org/abs/1712.05884). Tacotron is smaller, more efficient, and easier to train, while Tacotron2 produces better results, especially when combined with a neural vocoder. Choose the one that fits your project requirements.
 
-You can find [here](http://www.erogol.com/text-speech-deep-learning-architectures/) a brief note about TTS architectures and their comparisons.
+If you are new to TTS, you can also find a brief post about TTS architectures and how they compare [here](http://www.erogol.com/text-speech-deep-learning-architectures/).
 
 ## Requirements and Installation
 Highly recommended to use [miniconda](https://conda.io/miniconda.html) for easier installation.
@@ -90,7 +90,7 @@
 head -n 12000 metadata_shuf.csv > metadata_train.csv
 tail -n 1100 metadata_shuf.csv > metadata_val.csv
 ```
 
-To train a new model, you need to define your own ```config.json``` file (check the example) and call with the command below.
+To train a new model, define your own ```config.json``` file (check the example) and run the command below. You also set the model architecture in ```config.json```.
 
 ```train.py --config_path config.json```
diff --git a/config.json b/config.json
index b4037a1d..45178618 100644
--- a/config.json
+++ b/config.json
@@ -29,7 +29,7 @@
         "url": "tcp:\/\/localhost:54321"
     },
 
-    "model": "Tacotron", // one of the model in models/
+    "model": "Tacotron", // one of the models in models/. For now, "Tacotron" and "Tacotron2" are available.
     "grad_clip": 0.02, // upper limit for gradients for clipping.
     "epochs": 1000, // total number of epochs to train.
     "lr": 0.0001, // Initial learning rate. If Noam decay is active, maximum learning rate.
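The config.json change above makes the `"model"` field the switch between the two architectures. As an illustrative sketch (not part of the patch, and using a hypothetical pared-down config fragment rather than the project's full config.json), validating that field before training could look like:

```python
import json

# Hypothetical subset of config.json, reduced to a few fields for illustration.
# The project's real config.json contains // comments, which plain json.loads
# does not accept; this fragment omits them.
config_text = """
{
    "model": "Tacotron",
    "grad_clip": 0.02,
    "epochs": 1000,
    "lr": 0.0001
}
"""

config = json.loads(config_text)

# Per the patched comment, only these two model names are available for now.
AVAILABLE_MODELS = ("Tacotron", "Tacotron2")
if config["model"] not in AVAILABLE_MODELS:
    raise ValueError(f"Unknown model {config['model']!r}; choose one of {AVAILABLE_MODELS}")

print(config["model"])  # -> Tacotron
```

A check like this fails fast with a clear message, instead of letting a typo in `"model"` surface later as a missing-module error inside `train.py`.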