Commit Graph

1912 Commits

Author SHA1 Message Date
Edresson Casanova d5adc35fdf Add docstring to compute_embeddings script 2021-07-21 07:16:10 -03:00
Eren Gölge 05c75aa9d5 Fix linter issues 2021-07-16 13:37:38 +02:00
Eren Gölge 58cc414477 Fix WaveGrad `test_run` 2021-07-16 13:02:25 +02:00
WeberJulian 25832eb97b Changes for review 2021-07-15 11:38:45 +02:00
Edresson b1620d1f3f remove ignore generate eval flag 2021-07-15 03:34:28 -03:00
WeberJulian c79a82ed07 refix linter 2021-07-13 23:12:18 +02:00
WeberJulian 7d92b30946 Fix tests 2021-07-13 23:00:34 +02:00
WeberJulian 32974dd6a9 Fix test sentences synthesis 2021-07-13 16:07:13 +02:00
Edresson d906fea08c lint fix and eval as argparse in extract tts spectrograms 2021-07-13 02:15:31 -03:00
Edresson 2e5baffa9c Merge fix and eval split as argparse 2021-07-13 01:47:32 -03:00
Eren Gölge 93a74cbb71 Merge pull request #628 from Aloento/patch-2
Change to _get_preprocessor_by_name
2021-07-11 22:17:50 +02:00
Edresson 4eac1c4651 bug fix on train_encoder and unit tests 2021-07-11 12:00:39 -03:00
Aloento 6e3e6d5756 Change to _get_preprocessor_by_name 2021-07-08 09:53:13 +02:00
Eren Gölge 8fbadad68e Bump up to v0.1.2 2021-07-06 14:44:59 +02:00
eren golge 3c0454490f Fix #616 2021-07-06 14:44:03 +02:00
Eren Gölge 0c347624e7 Bump up version to v0.1.1 2021-07-04 11:46:36 +02:00
Eren Gölge a05b234080 Raise an error when multiple GPUs are in use
User must define the target GPU by `CUDA_VISIBLE_DEVICES` and
use `distribute.py` for multi-gpu training.
2021-07-04 11:25:49 +02:00
Eren Gölge 270c3823eb Fix #608 2021-07-04 11:19:31 +02:00
Eren Gölge c25a2184e7 Add docs for `SpeakerManager` 2021-07-03 13:55:27 +02:00
Eren Gölge f382e4c700 Fix linter warnings 2021-07-03 13:30:24 +02:00
Eren Gölge 9e7824fe35 Fix UnivNet inference code 2021-07-02 10:48:34 +02:00
Eren Gölge 168f97cbe9 Let `Synthesizer` use the speaker manager out of the model 2021-07-02 10:47:55 +02:00
Eren Gölge 196876feb1 Fix `ModelManager` model download 2021-07-02 10:47:05 +02:00
Eren Gölge 9352cb4136 Format Align TTS docstrings 2021-07-02 10:45:58 +02:00
Eren Gölge 95ad72f38f Fix glow tts initialization 2021-07-02 10:45:37 +02:00
Eren Gölge 40b0b5365e Let `get_characters` return `num_chars` 2021-07-02 10:45:00 +02:00
Eren Gölge 0fa6a8c9b8 Fix glow tts default parameters 2021-07-02 10:44:23 +02:00
Eren Gölge a4c658f5ef Fix for using the `Synthesizer` out of the model 2021-07-02 10:43:38 +02:00
Eren Gölge db47f4f105 Update `.models.json` 2021-07-02 10:43:00 +02:00
Eren Gölge 2e1a428b83 Update glowtts docstrings and docs 2021-06-30 14:30:55 +02:00
Eren Gölge 5723eb4738 Fix config init in `process_args` 2021-06-29 16:41:08 +02:00
Eren Gölge 4b5421b42f Remove FAQ link from README.md 2021-06-29 13:20:40 +02:00
Eren Gölge 47b3b10d6d Bump up to v0.1.0 🚀 2021-06-29 13:07:59 +02:00
Eren Gölge 7ec5c31898 Merge branch 'univnet' into trainer-api 2021-06-29 10:27:12 +02:00
Eren Gölge 51398cd15b Add docstrings and typing for `audio.py` 2021-06-28 17:03:47 +02:00
Eren Gölge ae6405bb76 Docstrings for `Trainer` 2021-06-28 17:03:47 +02:00
Eren Gölge 6b265ae8e3 Docstring update 2021-06-28 17:03:47 +02:00
Eren Gölge ab563ce7cd Start training by config.json using `register_config` 2021-06-28 17:03:47 +02:00
Eren Gölge b3c073c99b Allow runing full path scripts with `distribute.py` 2021-06-28 17:03:47 +02:00
Eren Gölge d42d1c02ea Use `torch.linalg.qr` for pytorch > `v1.9.0` 2021-06-28 17:03:47 +02:00
Eren Gölge fbba37e01e Fix loading the `amp` scaler from a checkpoint 🛠️ 2021-06-28 17:03:47 +02:00
Eren Gölge a7617d8ab6 Add 🐍 python 3.9 to CI 2021-06-28 17:03:47 +02:00
Eren Gölge 9790eddada Fix wrong argument name 🛠️ 2021-06-28 17:03:47 +02:00
Eren Gölge 932ab107ae Docstring edit in `TTSDataset.py` ✍️ 2021-06-28 17:03:47 +02:00
Eren Gölge cfa5041db7 Fix `eval_log` for `gan.py` 🛠️ 2021-06-28 17:03:47 +02:00
Eren Gölge d700845b10 Move `TorchSTFT` to `utils.audio` 2021-06-28 17:03:47 +02:00
Eren Gölge 5b89cb4fec Fixup `trainer.py` 🛠️ 2021-06-28 17:03:47 +02:00
Eren Gölge 8c74f054f0 Enable support for 🐍 python 3.10
Bump up versions numpy 1.19.5 and TF 2.5.0
2021-06-28 17:03:47 +02:00
Eren Gölge 9455a2b01e Apply small fixes for API compatibility 2021-06-28 17:03:47 +02:00
Eren Gölge a5d5bc9063 Print `max_decoder_steps` when model reaches the limit 2021-06-28 17:03:47 +02:00
Eren Gölge e30f245e06 Update `synthesizer` for speaker and model init 2021-06-28 17:03:47 +02:00
Eren Gölge 15fa31b595 fixup configs 2021-06-28 17:03:47 +02:00
Eren Gölge f23b228e24 Update `speaker_manager` 2021-06-28 17:03:47 +02:00
Eren Gölge e53616078a Fixup `utils` for the trainer 2021-06-28 17:03:47 +02:00
Eren Gölge 106b63d8a9 Update `vocoder` utils 2021-06-28 17:03:47 +02:00
Eren Gölge 45947acb60 Update `TTS.bin` scripts for the new API 2021-06-28 17:03:47 +02:00
Eren Gölge d7225eedb0 Update `vocoder` datasets and `setup_dataset` 2021-06-28 17:03:20 +02:00
Eren Gölge d18198dff8 Implement `setup_model` for vocoder models 2021-06-28 17:03:20 +02:00
Eren Gölge e949e7ad58 Update vocoder models 2021-06-28 17:03:19 +02:00
Eren Gölge 51005cdab4 Update `tts.models.setup_model` 2021-06-28 17:03:19 +02:00
Eren Gölge 7b8c15ac49 Create base 🐸TTS model abstraction for tts models 2021-06-28 17:03:19 +02:00
Eren Gölge a358f74a52 Update vocoder model configs 2021-06-28 17:03:19 +02:00
Eren Gölge 786170fe7d Update tts model configs 2021-06-28 17:03:19 +02:00
Eren Gölge 98298ee671 Implement unified IO utils 2021-06-28 17:03:19 +02:00
Eren Gölge c7aad884cd Implement unified trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 6d7b5fbcde `tts` model abstraction with `TTSModel` 2021-06-28 17:03:19 +02:00
Eren Gölge d4dbd89752 fix calculation of `loader_start_time` 2021-06-28 17:03:19 +02:00
Eren Gölge c754a0e17d `TrainerAbstract` and related updates for `TrainerTTS` 2021-06-28 17:03:19 +02:00
Eren Gölge 00c82c516d rename to 2021-06-28 17:03:19 +02:00
Eren Gölge 166f0aeb9a merge if branches with the same implementation 2021-06-28 17:03:19 +02:00
Eren Gölge 03494ad642 adjust `distribute.py` for the `train_tts.py` 2021-06-28 17:03:19 +02:00
Eren Gölge fdfb18d230 downsize melgan test model size 2021-06-28 17:03:19 +02:00
Eren Gölge 25238e0658 fix glow-tts `inference()` 2021-06-28 17:03:19 +02:00
Eren Gölge 419735f440 refactor and fix multi-speaker training in Trainer and Tacotron models 2021-06-28 17:03:19 +02:00
Eren Gölge 269e5a734e add max_decoder_steps argument to tacotron models 2021-06-28 17:03:19 +02:00
Eren Gölge b3324bd914 fix speaker_manager init 2021-06-28 17:03:19 +02:00
Eren Gölge 2c38ef8441 use get_speaker_manager in Trainer and save speakers.json file when
needed
2021-06-28 17:03:19 +02:00
Eren Gölge d6b2b6add6 make style and linter fixes 2021-06-28 17:03:19 +02:00
Eren Gölge 802d461389 Compute d_vectors and speaker_ids separately in TTSDataset 2021-06-28 17:03:19 +02:00
Eren Gölge db6a97d1a2 rename external speaker embedding arguments as `d_vectors` 2021-06-28 17:03:19 +02:00
Eren Gölge 9042ae9195 use `to_cuda()` for moving data in `format_batch()` 2021-06-28 17:03:19 +02:00
Eren Gölge f82f1970b8 change `to(device)` to `type_as` in models 2021-06-28 17:03:19 +02:00
Eren Gölge 9c94b0c5c0 init `durations = None` 2021-06-28 17:03:19 +02:00
Eren Gölge 1fa15c195a docstring fix 2021-06-28 17:03:19 +02:00
Eren Gölge 1c8a3d7c86 make style 2021-06-28 17:03:19 +02:00
Eren Gölge 8cdd423234 styling formatting.py 2021-06-28 17:03:19 +02:00
Eren Gölge 30211512a4 fix type annotations 2021-06-28 17:03:19 +02:00
Eren Gölge b22b7620c3 update glow-tts output shapes to match [B, T, C] 2021-06-28 17:03:19 +02:00
Eren Gölge 8381379938 formating `cond_input` with a function in Tacotron models 2021-06-28 17:03:19 +02:00
Eren Gölge ef4ea9e527 update imports for `formatters` 2021-06-28 17:03:19 +02:00
Eren Gölge 6c495c6a6e fix glow-tts inference and forward functions for handling `cond_input`
and refactor its test
2021-06-28 17:03:19 +02:00
Eren Gölge f840268181 refactor `SpeakerManager` 2021-06-28 17:03:19 +02:00
Eren Gölge 421194880d linter fixes 2021-06-28 17:03:19 +02:00
Eren Gölge 8e52a69230 delete separate tts training scripts and pre-commit configuration 2021-06-28 17:03:19 +02:00
Eren Gölge d96ebcd6d3 make style 2021-06-28 17:03:19 +02:00
Eren Gölge b643e8b37c `logging/__init__.py` 2021-06-28 17:03:19 +02:00
Eren Gölge 0cee5042a9 fix logger imports 2021-06-28 17:03:19 +02:00
Eren Gölge 72dceca52c import missings 2021-06-28 17:03:19 +02:00
Eren Gölge 0eec238429 remove redundant imports 2021-06-28 17:03:19 +02:00
Eren Gölge b500338faa make style 2021-06-28 17:03:19 +02:00
Eren Gölge 469d2e620a update extract_tts_spectrogram for `cond_input` API of the models 2021-06-28 17:03:19 +02:00
Eren Gölge 5ab28fa618 update `extract_tts_spec...` using `SpeakerManager` 2021-06-28 17:03:19 +02:00
Eren Gölge c392fa4288 update `extract_tts_spectrograms` for the new model API 2021-06-28 17:03:19 +02:00
Eren Gölge 8f47f95998 correct import of `load_meta_data`
remove redundant import
2021-06-28 17:03:19 +02:00
Eren Gölge c680a07a20 fix `Synthesized` for the new `synthesis()` 2021-06-28 17:03:19 +02:00
Eren Gölge 73bf9673ed revert logging.info to print statements for trainer 2021-06-28 17:03:19 +02:00
Eren Gölge d25f017b42 update `setup_model.py` imports 2021-06-28 17:03:19 +02:00
Eren Gölge bb355b7441 update align_tts.py model for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 9203b863d9 update align_tts_loss for trainer 2021-06-28 17:03:19 +02:00
Eren Gölge fc9a0fb8ce update aling_tts_config for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge e298b8e364 update trainer.py for better logging handling, restoring models and
rename init_ functions with get_
2021-06-28 17:03:19 +02:00
Eren Gölge b8a4af4010 update `synthesis.py` for being more generic 2021-06-28 17:03:19 +02:00
Eren Gölge c70d0c9dae update `speedy_speech.py` model for trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 06ee57d816 update `speedy_speecy_config.py` for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 4e910993f1 update tacotron model to return `model_outputs` 2021-06-28 17:03:19 +02:00
Eren Gölge bb4deee64c update glow-tts for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 9134c7dfb6 update `sequence_mask` import globally 2021-06-28 17:03:19 +02:00
Eren Gölge b2218e882a update `glow_tts_config.py` for setting the optimizer and the scheduler 2021-06-28 17:03:19 +02:00
Eren Gölge 891631ab47 typing annotation for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 5f07315722 add trainer and train_tts 2021-06-28 17:03:19 +02:00
Eren Gölge 34f8a74e4d remove `truncated` from synthesizer 2021-06-28 17:03:19 +02:00
Eren Gölge 178eccbc16 update console logger 2021-06-28 17:03:19 +02:00
Eren Gölge f4f83b6379 update `synthesis.py` for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 130781dab6 remove `tts.generic_utils` as all the functions are moved to other files 2021-06-28 17:03:19 +02:00
Eren Gölge 535a458f40 update Tacotron models for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge bdbfc95618 add `gradual_training` argument to tacotron.py 2021-06-28 17:03:19 +02:00
Eren Gölge 5a2e75f0ee import missings for tacotron.py 2021-06-28 17:03:19 +02:00
Eren Gölge da7d10e53c mode `setup_model()` to `models/__init__.py` 2021-06-28 17:03:19 +02:00
Eren Gölge ca302db7b0 add sequence_mask to `utils.data` 2021-06-28 17:03:19 +02:00
Eren Gölge 844abb3b1d `setup_loss()` in `layer/__init__.py` 2021-06-28 17:03:19 +02:00
Eren Gölge a20a1c7d06 rename preprocess.py -> formatters.py 2021-06-28 17:03:19 +02:00
Eren Gölge b9bccbb243 move load_meta_data and related functions to `datasets/__init__.py` 2021-06-28 17:03:19 +02:00
Eren Gölge d09385808a set test_sentences in config 2021-06-28 17:03:19 +02:00
Eren Gölge 8def3c87af trainer-API updates 2021-06-28 17:03:19 +02:00
Eren Gölge 42554cc711 rename MyDataset -> TTSDataset 2021-06-28 17:03:19 +02:00
Edresson 1c4e806f54 use speaker manager on compute embeddings script 2021-06-27 03:35:34 -03:00
Edresson Casanova eb84bb2bc8 Merge branch 'dev' into dev 2021-06-26 15:32:19 -03:00
Eren Gölge 987cf1178b Bump up to v0.0.16 2021-06-25 14:44:56 +02:00
Michael Hansen 3f172b84d8 Fix linting issues 2021-06-25 14:41:31 +02:00
Michael Hansen 4d8426fa0a Use eSpeak IPA lexicons by default for phoneme models 2021-06-25 14:41:05 +02:00
Michael Hansen 618b509204 Use combined characters available in TTS phonemes (like ç) 2021-06-25 14:41:05 +02:00
Michael Hansen da6f6a4a01 Update docstring for clean_gruut_phonemes 2021-06-25 14:41:05 +02:00
Michael Hansen 47191f3ecc Add tests for gruut phonemization 2021-06-25 14:41:05 +02:00
Michael Hansen 67869e77f9 Use gruut for phonemization 2021-06-25 14:41:05 +02:00
Eren Gölge 788992093d Add UnivNet vocoder 🚀 2021-06-23 13:51:04 +02:00
Eren Gölge 64fd59204c Use `torch.linalg.qr` for pytorch > `v1.9.0` 2021-06-23 13:49:42 +02:00
Eren Gölge aba840b4e6 Fix loading the `amp` scaler from a checkpoint 🛠️ 2021-06-23 13:49:42 +02:00
Eren Gölge 18e5393f16 Add 🐍 python 3.9 to CI 2021-06-23 13:49:36 +02:00
Eren Gölge 0ff2d2336a Fix wrong argument name 🛠️ 2021-06-22 16:21:11 +02:00
Eren Gölge 61c3cb871f Docstring edit in `TTSDataset.py` ✍️ 2021-06-22 16:21:11 +02:00
Eren Gölge 6f739ea07a Fix `eval_log` for `gan.py` 🛠️ 2021-06-22 16:21:11 +02:00
Eren Gölge ebb91c0fbb Move `TorchSTFT` to `utils.audio` 2021-06-22 16:21:11 +02:00
Eren Gölge 01c4b22a2f Fixup `trainer.py` 🛠️ 2021-06-22 16:21:11 +02:00
Eren Gölge 7de2756fc4 Enable support for 🐍 python 3.10
Bump up versions numpy 1.19.5 and TF 2.5.0
2021-06-22 16:21:11 +02:00
Eren Gölge 220e184f66 Apply small fixes for API compatibility 2021-06-22 16:21:11 +02:00
Eren Gölge 77d57dd301 Print `max_decoder_steps` when model reaches the limit 2021-06-22 16:21:11 +02:00
Eren Gölge 7dc2177df4 Update `synthesizer` for speaker and model init 2021-06-22 16:21:11 +02:00
Eren Gölge c3a0bc702e fixup configs 2021-06-22 16:21:11 +02:00
Eren Gölge 0e01c2594f Update `speaker_manager` 2021-06-22 16:21:11 +02:00
Eren Gölge 8182f5168f Fixup `utils` for the trainer 2021-06-22 16:21:11 +02:00
Eren Gölge b4bb567e04 Update `vocoder` utils 2021-06-22 16:21:11 +02:00
Eren Gölge f3ff5b1971 Update `TTS.bin` scripts for the new API 2021-06-22 16:21:11 +02:00
Eren Gölge aed919cf1c Update `vocoder` datasets and `setup_dataset` 2021-06-22 16:21:11 +02:00
Eren Gölge 59abf490a1 Implement `setup_model` for vocoder models 2021-06-22 16:21:11 +02:00
Eren Gölge 420820caf4 Update vocoder models 2021-06-22 16:21:11 +02:00
Eren Gölge d10f9c5676 Update `tts.models.setup_model` 2021-06-22 16:21:11 +02:00
Eren Gölge cae702980f Create base 🐸TTS model abstraction for tts models 2021-06-22 16:21:11 +02:00
Eren Gölge 70d968b169 Update vocoder model configs 2021-06-22 16:21:11 +02:00
Eren Gölge f8a3460818 Update tts model configs 2021-06-22 16:21:11 +02:00
Eren Gölge acd96a4940 Implement unified IO utils 2021-06-22 16:21:10 +02:00
Eren Gölge 6b907554f8 Implement unified trainer 2021-06-22 16:21:10 +02:00
Eren Gölge 20c4a8c8e1 `tts` model abstraction with `TTSModel` 2021-06-22 16:21:10 +02:00
Eren Gölge b934665fc0 fix calculation of `loader_start_time` 2021-06-22 16:21:10 +02:00
Eren Gölge 64f0f57757 `TrainerAbstract` and related updates for `TrainerTTS` 2021-06-22 16:21:10 +02:00
Eren Gölge f077a356e0 rename to 2021-06-22 16:21:10 +02:00
Eren Gölge 4575b70826 merge if branches with the same implementation 2021-06-22 16:21:10 +02:00
Eren Gölge 59be1b9af1 adjust `distribute.py` for the `train_tts.py` 2021-06-22 16:21:10 +02:00
Eren Gölge 614738cc85 downsize melgan test model size 2021-06-22 13:12:52 +02:00
Eren Gölge 4f29725eb6 fix glow-tts `inference()` 2021-06-22 13:12:52 +02:00
Eren Gölge a87c886497 refactor and fix multi-speaker training in Trainer and Tacotron models 2021-06-22 13:12:52 +02:00
Eren Gölge 0206bb847b add max_decoder_steps argument to tacotron models 2021-06-22 13:12:52 +02:00
Eren Gölge cbb52b3d83 fix speaker_manager init 2021-06-22 13:12:52 +02:00
Eren Gölge d2fd6a34a1 use get_speaker_manager in Trainer and save speakers.json file when
needed
2021-06-22 13:12:52 +02:00
Eren Gölge 147550c65f make style and linter fixes 2021-06-22 13:12:52 +02:00
Eren Gölge a605dd3d08 Compute d_vectors and speaker_ids separately in TTSDataset 2021-06-22 13:12:52 +02:00
Eren Gölge f00ef90ce6 rename external speaker embedding arguments as `d_vectors` 2021-06-22 13:12:52 +02:00
Eren Gölge e7b7268c43 use `to_cuda()` for moving data in `format_batch()` 2021-06-22 13:12:52 +02:00
Eren Gölge 26a3312f0d change `to(device)` to `type_as` in models 2021-06-22 13:12:52 +02:00
Eren Gölge c09622459e init `durations = None` 2021-06-22 13:12:52 +02:00
Eren Gölge 2e31659dd9 docstring fix 2021-06-22 13:12:52 +02:00
Eren Gölge 7a0750a4f5 make style 2021-06-22 13:12:52 +02:00
Eren Gölge 534401377d styling formatting.py 2021-06-22 13:12:52 +02:00
Eren Gölge e229f5c081 fix type annotations 2021-06-22 13:12:52 +02:00
Eren Gölge 506189bdee update glow-tts output shapes to match [B, T, C] 2021-06-22 13:12:52 +02:00
Eren Gölge f568833d28 formating `cond_input` with a function in Tacotron models 2021-06-22 13:12:52 +02:00
Eren Gölge 254707c610 update imports for `formatters` 2021-06-22 13:12:52 +02:00
Eren Gölge 223502d827 fix glow-tts inference and forward functions for handling `cond_input`
and refactor its test
2021-06-22 13:12:52 +02:00
Eren Gölge d4b1acfa81 refactor `SpeakerManager` 2021-06-22 13:12:52 +02:00
Eren Gölge 26e7c0960c linter fixes 2021-06-22 13:12:52 +02:00
Eren Gölge 79f7c5da1e delete separate tts training scripts and pre-commit configuration 2021-06-22 13:12:52 +02:00
Eren Gölge ca787be193 make style 2021-06-22 13:12:52 +02:00
Eren Gölge d376647ca0 `logging/__init__.py` 2021-06-22 13:12:52 +02:00
Eren Gölge bb58a0588e fix logger imports 2021-06-22 13:12:52 +02:00
Eren Gölge 9bbc924377 import missings 2021-06-22 13:12:52 +02:00
Eren Gölge b4d4ce0d7e remove redundant imports 2021-06-22 13:12:52 +02:00
Eren Gölge aefa71155c make style 2021-06-22 13:12:52 +02:00
Eren Gölge 88d8a94a10 update extract_tts_spectrogram for `cond_input` API of the models 2021-06-22 13:12:52 +02:00
Eren Gölge 667bb708b6 update `extract_tts_spec...` using `SpeakerManager` 2021-06-22 13:12:52 +02:00
Eren Gölge 830306d2fd update `extract_tts_spectrograms` for the new model API 2021-06-22 13:12:52 +02:00
Eren Gölge c673eb8ef8 correct import of `load_meta_data`
remove redundant import
2021-06-22 13:12:52 +02:00
Eren Gölge f0a419546b fix `Synthesized` for the new `synthesis()` 2021-06-22 13:12:52 +02:00
Eren Gölge c7ff175592 revert logging.info to print statements for trainer 2021-06-22 13:12:52 +02:00
Eren Gölge fd6afe5ae5 update `setup_model.py` imports 2021-06-22 13:12:52 +02:00
Eren Gölge c82d91051d update align_tts.py model for the trainer 2021-06-22 13:12:52 +02:00
Eren Gölge 4f66e816d1 update align_tts_loss for trainer 2021-06-22 13:12:52 +02:00
Eren Gölge 8213ad8b5f update aling_tts_config for the trainer 2021-06-22 13:12:52 +02:00
Eren Gölge 8dfd4c91ff update trainer.py for better logging handling, restoring models and
rename init_ functions with get_
2021-06-22 13:12:52 +02:00
Eren Gölge fb9289d365 update `synthesis.py` for being more generic 2021-06-22 13:12:52 +02:00
Eren Gölge f121b0ff5d update `speedy_speech.py` model for trainer 2021-06-22 13:12:52 +02:00
Eren Gölge 843b3ba960 update `speedy_speecy_config.py` for the trainer 2021-06-22 13:12:52 +02:00
Eren Gölge c9790bee2c update tacotron model to return `model_outputs` 2021-06-22 13:12:52 +02:00
Eren Gölge f09ec7e3a7 update glow-tts for the trainer 2021-06-22 13:12:52 +02:00
Eren Gölge 3346a6d9dc update `sequence_mask` import globally 2021-06-22 13:12:52 +02:00
Eren Gölge 9765b1aa6b update `glow_tts_config.py` for setting the optimizer and the scheduler 2021-06-22 13:12:52 +02:00
Eren Gölge 6bf6543df8 typing annotation for the trainer 2021-06-22 13:12:52 +02:00
Eren Gölge 57cdddef16 add trainer and train_tts 2021-06-22 13:12:52 +02:00
Eren Gölge d769af9e3b remove `truncated` from synthesizer 2021-06-22 13:12:52 +02:00
Eren Gölge 570633ab80 update console logger 2021-06-22 13:12:52 +02:00
Eren Gölge 2ac6b824ca update `synthesis.py` for the trainer 2021-06-22 13:12:52 +02:00
Eren Gölge c9e5527070 remove `tts.generic_utils` as all the functions are moved to other files 2021-06-22 13:12:52 +02:00
Eren Gölge 2ab723cd10 update Tacotron models for the trainer 2021-06-22 13:12:52 +02:00
Eren Gölge d6b6a15b5c add `gradual_training` argument to tacotron.py 2021-06-22 13:12:52 +02:00
Eren Gölge 118a7f2b43 import missings for tacotron.py 2021-06-22 13:12:52 +02:00
Eren Gölge c98149d488 mode `setup_model()` to `models/__init__.py` 2021-06-22 13:12:52 +02:00
Eren Gölge 86edf6ab0e add sequence_mask to `utils.data` 2021-06-22 13:12:52 +02:00
Eren Gölge c61486b1e3 `setup_loss()` in `layer/__init__.py` 2021-06-22 13:12:52 +02:00
Eren Gölge f07209d2e0 rename preprocess.py -> formatters.py 2021-06-22 13:12:52 +02:00
Eren Gölge facb782851 move load_meta_data and related functions to `datasets/__init__.py` 2021-06-22 13:12:52 +02:00
Eren Gölge b9d4355d20 set test_sentences in config 2021-06-22 13:12:52 +02:00
Eren Gölge 7bdd0eb72f trainer-API updates 2021-06-22 13:12:52 +02:00
Eren Gölge 0f284841d1 rename MyDataset -> TTSDataset 2021-06-22 13:12:52 +02:00
Edresson 99d40e98d9 fix Lint checks 2021-06-18 14:59:01 -03:00
Edresson 28bec238ca fix Lint checks 2021-06-18 14:33:50 -03:00
Edresson 83644056e3 fix Lint checks 2021-06-18 14:32:28 -03:00
Edresson Casanova e78e3cd81e Merge branch 'dev' into dev 2021-06-18 14:10:03 -03:00
Edresson b74b510d3c Compute embeddings and find characters using config file 2021-06-18 14:04:49 -03:00
Adam Froghyar b0aa189348 Forcing do_trim_silence to False in the extract TTS script 2021-06-14 10:44:00 +02:00
Eren Gölge d245b5d48f bump up v0.0.15.1 2021-06-08 09:21:01 +02:00
Edresson 14b209c7e9 Create a batch for more fast inference on LSTM Speaker Encoder 2021-06-05 03:12:17 -03:00
Eren Gölge b8b79a5e5a fix `use_cuda` bug in `server.py` 2021-06-04 14:02:53 +02:00
Eren Gölge 203ab855c3 bump up to v0.0.15 2021-06-04 13:52:54 +02:00
Eren Gölge ba9bcf7c6b auto upload to pypi on release 2021-06-04 12:20:06 +02:00
Eren Gölge e66753bd0d fixup! new japanese model placeholder in `.models.json` 2021-06-03 18:04:28 +02:00
Eren Gölge bd434636a9 new japanese model placeholder in `.models.json` 2021-06-02 15:54:37 +02:00
Eren Gölge 401fbd8978 bump up to v0.0.15 2021-06-02 11:48:17 +02:00
Eren Gölge 49c5e5d820 maket style japanese PR 2021-06-02 11:44:46 +02:00
Eren Gölge 73b4083c6c Merge pull request #502 from kaiidams/kaiidams/kokoro
Japanese Tacotron 2 model
2021-06-02 10:20:08 +02:00
Katsuya Iida 6d8310d2a9 Set the version to the same with the dev branch. 2021-06-02 07:48:28 +09:00
Alexander Korolev c1eb9bdcca fix speaker dim inference 2021-06-01 15:15:26 +02:00
Katsuya Iida 1cc18d1972 Move unittest of Japanese phonemizer. 2021-06-01 18:51:34 +09:00
Alexander Korolev 5b89ef2c6e fix speaker-embeddings dimension during inference 2021-06-01 11:06:35 +02:00
Eren Gölge d0ab0382fc linter fixes 2021-06-01 09:15:32 +02:00
Eren Gölge bec85ac58d make style 2021-05-31 16:37:15 +02:00
Eren Gölge d9f1268f99 init tb_logger None for rank > 0 processes 2021-05-31 15:47:07 +02:00
Eren Gölge 301c516abd Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2021-05-31 15:46:25 +02:00
Edresson 7448177b72 use SpeakerManager on compute embeddings script 2021-05-29 21:11:53 -03:00
Katsuya Iida c4a5a73f18 update Kokoro config 2021-05-29 19:17:27 +09:00
Katsuya Iida 3a9ac2de4a Merge remote-tracking branch 'coqui-ai/main' into kaiidams/kokoro 2021-05-29 09:39:23 +09:00
Katsuya Iida d0c9c1ca5c Move TTS/tts/utils/japanese 2021-05-29 09:21:47 +09:00
Edresson 099142d4dd bug fix 2021-05-27 21:50:56 -03:00
Edresson 208bb0f0ee add batched speaker encoder inference 2021-05-27 20:01:00 -03:00
Edresson 825734a3a9 remove unused embeddings export 2021-05-27 19:10:24 -03:00
Katsuya Iida c4987e9d4e Move import at the head of the file. 2021-05-28 00:22:57 +09:00
Eren Gölge 925c08cf95 replace unidecode with anyascii 2021-05-27 14:02:44 +02:00
Eren Gölge e08c58db3b bump up version to v0.14.1 2021-05-27 13:11:01 +02:00
Eren Gölge c6f22aaa67 fix #509 2021-05-27 13:09:15 +02:00
Edresson 1496f271dc update Compute embeddings script 2021-05-27 00:45:18 -03:00
Edresson bc5307caa0 add unit tests for SoftmaxAngleProtoLoss and ResnetSpeakerEncoder and bugfix 2021-05-26 20:35:58 -03:00
Edresson c90037c2e9 solve merge problems 2021-05-26 16:01:30 -03:00
Katsuya Iida f921a05bdb Fixed lint errors 2021-05-26 19:02:16 +09:00
Edresson Casanova f89cb6aec2 Merge branch 'dev' into dev 2021-05-25 17:30:25 -03:00
Edresson d570c2d790 pylint fix and data loader bug fix 2021-05-26 01:11:37 -03:00
Katsuya Iida 0536aa6d0f Japanese Tacotron 2 model 2021-05-22 17:12:19 +09:00
Eren Gölge 5482a0f62d type def for gradual_training 2021-05-19 14:03:26 +02:00
Eren Gölge df6a98d0c3 type def for gradual_training 2021-05-19 14:00:44 +02:00
Eren Gölge 16576d6408 bump version number 2021-05-19 12:35:10 +02:00
Eren Gölge 8a7c40736c set use_phonemes false 2021-05-19 01:27:26 +02:00
Eren Gölge ccfaa6b1d5 add `needs_phonemizer` field to models.json. If set true these models
are only compatible with v0.0.13 or below.
2021-05-18 17:57:28 +02:00
Eren Gölge a14fcf2a13 remove text_processing test 2021-05-18 17:57:28 +02:00
Eren Gölge d7fae3f515 remove all espeaker and phonemizer deps 2021-05-18 17:57:28 +02:00
Eren Gölge ced05e812a move chinese phonemizer 2021-05-18 17:57:28 +02:00
Eren Gölge 218af1d9a2 change `list` to `List` in config 2021-05-18 17:30:27 +02:00
Eren Gölge 4df31f7fbd unused_speakers argument for ignoring speaker ids in multi-speaker
training
2021-05-18 14:50:03 +02:00
Eren Gölge c2c7dff805 use relaxted coqpit parser 2021-05-18 14:49:47 +02:00
Edresson 856ea19758 bug fix in dataloader and update inference 2021-05-18 03:43:16 -03:00
Eren Gölge d1b469935d tacotron DDC LJSpeech recipe 2021-05-17 11:42:14 +02:00
Eren Gölge 34a42d379f update tacotron_config.py for checking `r` and the docstring 2021-05-17 11:35:30 +02:00
Eren Gölge 12722501bb styling 2021-05-15 23:48:31 +02:00
Eren Gölge 8b1014d188 add docstrings with default value fixes 2021-05-15 23:45:10 +02:00
Eren Gölge da49089a72 update melgan training test batch size 2021-05-12 10:12:11 +02:00
Edresson 3433c2f348 add compute embedding for the new speaker encoder 2021-05-12 03:06:46 -03:00
Eren Gölge 0213e1cbf4 update configs for tts models to match the field typed with the expected
values
2021-05-12 00:57:38 +02:00
Eren Gölge 715b0a65a0 update main.yml for python x64
fix test
2021-05-12 00:57:29 +02:00
Edresson 3fcc748b2e implement the Speaker Encoder H/ASP 2021-05-11 16:27:05 -03:00
Eren Gölge 843d1b3d98 linter fixes 2021-05-11 11:30:00 +02:00
Eren Gölge 19fb1d743d style update 2021-05-11 11:30:00 +02:00
Eren Gölge 6e980b49c4 fix synthesizer.py for Coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge db14dcd95a remove old load_config 2021-05-11 11:29:18 +02:00
Eren Gölge a21ac883dd add get_cuda() 2021-05-11 11:29:18 +02:00
Eren Gölge 21dd4d7960 fix load_config imports for Coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge c57f0b46bb reintro use_gst for backwars compat 2021-05-11 11:29:18 +02:00
Eren Gölge 18e76a2309 fix speaker encoder model initialization 2021-05-11 11:29:18 +02:00
Eren Gölge 10de40bba1 make num_workers mandatory config field 2021-05-11 11:29:18 +02:00
Eren Gölge df1ddd3539 allow read_json_with_comments for backward compat 2021-05-11 11:29:18 +02:00
Eren Gölge 9f7599e3c3 fix train_encoder for coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge f8e52965dd add speaker encoder coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge ce2bba543e remove extra from utils and move funcs to io.py 2021-05-11 11:29:18 +02:00
Eren Gölge 812dbc2b06 rm config.json 2021-05-11 11:29:18 +02:00
Eren Gölge 3fde2001b1 train_encoder refactoring for coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge 9ee70af9bb code styling 2021-05-11 11:29:18 +02:00
Eren Gölge 10db2baa06 global shared Coqpit configs 2021-05-11 11:29:18 +02:00
Eren Gölge 3dec62b183 add Coqpits for the vocoder models 2021-05-11 11:29:18 +02:00
Eren Gölge 6f4eed94f5 remove *.json vocoder configs 2021-05-11 11:29:18 +02:00
Eren Gölge 78b3825d0b update train scripts for coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge 757e90b1cc load_config function to initialize the right Coqpit for the given model 2021-05-11 11:29:18 +02:00
Eren Gölge e6f45b9eb7 update train_vocoder_gan.py for coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge bcebd69d09 remove bash tts training tests 2021-05-11 11:29:17 +02:00
Eren Gölge 7663bc63c1 add Coqpit configs for the TTS models 2021-05-11 11:29:17 +02:00
Eren Gölge 7227e8f1d2 update train_align_tts.py for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 51a7e06945 glow_tts_config.py and train test on python 2021-05-11 11:29:17 +02:00
Eren Gölge 720fe13056 update glow_tts modules and training script for coqpit use 2021-05-11 11:29:17 +02:00
Eren Gölge 816e7ee698 remove default configs.json as replacing with Coqpit configs 2021-05-11 11:29:17 +02:00
Eren Gölge 35341d5482 move bash script based tests to python with coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 647163397d coqpit refactoring 2021-05-11 11:29:17 +02:00
Eren Gölge eaa130e813 fix tacotron for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 65d7ad4250 refactor train_speedy_speech.py for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 4a58fdfd59 comment out check-arguments before copying fields to the configs 2021-05-11 11:29:17 +02:00
Eren Gölge 05d9543ed8 init GST module using gst config in Tacotron models 2021-05-11 11:29:17 +02:00
Eren Gölge 93a00373f6 move split_dataset 2021-05-11 11:29:17 +02:00
Eren Gölge 9c18e40f64 black formatting 2021-05-11 11:29:17 +02:00
Eren Gölge c34c8137d7 update compute_statistics for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 79d7215142 config refactor #5 WIP 2021-05-11 11:29:17 +02:00
Eren Gölge dc50f5f0b0 config refactor #4 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge 97bd5f9734 [ci skip] config update #3 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge a21c0b5585 config update 2 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge e092ae40dc config update WIP 2021-05-11 11:28:35 +02:00
Eren Gölge 06f80a4806 update check argument 2021-05-11 11:28:35 +02:00
Eren Gölge bf7ddfa542 Merge pull request #481 from chmodsss/main
Accessing __version__ command
2021-05-11 10:20:48 +02:00
Edresson 85ccad7e0a add Audio data augamentation Addtive and RIR 2021-05-11 00:59:57 -03:00
Edresson 77d85c6cc5 add softmaxproto loss and bug fix in data loader 2021-05-10 17:08:38 -03:00
chmodsss 607d5cf377 [#480] Adding version variable 2021-05-10 19:46:34 +02:00
Adam Froghyar 7ddc885f37 deleted a line the broke GravesAttention 2021-05-10 15:42:59 +02:00
Edresson 78bad25f2b update voxceleb download link 2021-05-07 23:45:15 -03:00
Eren Gölge f7582107da Merge pull request #453 from Edresson/dev
Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS.
2021-05-06 17:53:28 +02:00
Edresson 501c8e0302 remove unused vars on extract tts spectrograms script 2021-05-04 19:04:13 -03:00
Eren Gölge 0325c58862 Merge pull request #468 from shaun95/patch-1
Update losses.py
2021-05-03 14:45:24 +02:00
Eren Gölge 8cb27267a4 formatting 2021-05-03 14:26:35 +02:00
Eren Gölge 87d674a038 bumpup librosa version to 0.8.0 2021-05-03 14:25:09 +02:00
shaun 7d0ec62bf1 Update losses.py
The block of code for use_l1_spec_loss is repeated which doubles the amount of L1 loss when enabled.
The weight for L1 loss in hifigan_ljspeech configutation will likely need to be doubled to compensate (l1_spec_loss_weight)
2021-05-02 14:14:24 +02:00
Edresson 3ecd556bbe add unit test for extract tts spectrograms script 2021-05-01 13:41:56 -03:00
Edresson 446b1da936 create inference function 2021-04-29 18:18:37 -03:00
Eren Gölge f02f0338c2 fix .models.json and add testing to check released models availability 2021-04-29 09:32:36 +02:00
Eren Gölge fd95e9b8a4 [ci skip] Add sam models 2021-04-28 21:57:31 +02:00
Agrin Hilmkil 351d0ed6ae Remove unnecessary fsspec usage 2021-04-28 11:21:08 +02:00
Agrin Hilmkil 167f86417e Move dev, tf, notebook dependencies to extras 2021-04-28 11:20:06 +02:00
Eren Gölge 1235e54738 test for synthesize.py 2021-04-27 14:17:38 +02:00
Eren Gölge 4719414f2e remove imports 2021-04-27 11:25:17 +02:00
Eren Gölge add97cddc1 move function and remove import 2021-04-27 11:22:56 +02:00
Eren Gölge 734e6a515c bug fix 2021-04-27 10:27:45 +02:00
Eren Gölge 6bdd81667e place holders for sc-glow and hifigan models 2021-04-26 19:53:12 +02:00
Eren Gölge 2f0716073e enable multi-speaker CoquiTTS models for synthesize.py 2021-04-26 19:36:53 +02:00
Eren Gölge b531fa699c remove conflict noise 2021-04-26 15:27:52 +02:00
Eren Gölge f37b488876 Merge branch 'speaker-manager' of https://github.com/coqui-ai/TTS into speaker-manager 2021-04-26 15:25:25 +02:00
Eren Gölge b82daa5e86 style and linter fixes 2021-04-26 15:22:24 +02:00
Edresson 20e42a3381 add save audio option 2021-04-23 15:00:00 -03:00
Edresson 8228091f92 add script for extraction of tts spectrograms 2021-04-23 14:17:46 -03:00
Eren Gölge 4cf211348d styling and linting 2021-04-23 18:04:37 +02:00
Eren Gölge 7eb0c60d2e let synthesizer to pass speaker encoder file paths to speaker manager 2021-04-23 18:04:37 +02:00
Eren Gölge f69195739e let speaker manager compute mean x_vector from multiple wav files 2021-04-23 18:04:37 +02:00
Eren Gölge 179722e3a7 new arguments to synthesize.py for loading speaker encoder and speaker wavs 2021-04-23 18:04:37 +02:00
Eren Gölge dfa415a8b8 small refactor in server.py 2021-04-23 18:04:37 +02:00
Eren Gölge c80d21f311 load speaker_encoder_ap and compute x_vector directly from the input file in speaker manager 2021-04-23 18:04:37 +02:00
Eren Gölge ad047c8195 html formatting, enable multi-speaker model on the server with a dropdown menu to select the speaker 2021-04-23 18:04:37 +02:00
Eren Gölge f9f3d04d14 remove moved function 2021-04-23 18:04:37 +02:00
Eren Gölge 10c988ac8c update server.py 2021-04-23 18:04:37 +02:00
Eren Gölge 6d0f5e0459 use SpeakerManager in Synthesizer 2021-04-23 18:04:37 +02:00
Eren Gölge e97126314c add ```unique``` argument to make_symbols to fix the incompat. issue of the
SC-Glow models
2021-04-23 18:04:37 +02:00
Eren Gölge d08888e603 formatting speakers.py 2021-04-23 18:04:37 +02:00
Eren Gölge df422223a3 initial SpeakerManager implementation 2021-04-23 18:04:37 +02:00
Eren Gölge 7a7aeb35f5 fix the glow-tts in setup_model 2021-04-23 18:04:37 +02:00
Eren Gölge d42748082a update argument name external_speaker_embedding_dim -> speaker_embedding_dim
add inference_noise_scale argument to glow-tts
2021-04-23 18:04:37 +02:00
Eren Gölge 2da81f5bb6 add load_chekpoint to speaker encoder 2021-04-23 18:04:37 +02:00
Eren Gölge 1229ccbf07 update argument name in server.py 2021-04-23 18:04:37 +02:00
Eren Gölge af2d36faeb update synthesize.py for multi-speaker setting 2021-04-23 18:04:37 +02:00
Eren Gölge 99dc07a7dd add ```unique``` param to keep scglow models compatible (there are duplicate symbols in the character set) 2021-04-23 18:04:37 +02:00
Eren Gölge c955a12428 set the default layer size compatible with scglow 2021-04-23 18:04:37 +02:00
Eren Gölge 3ace2440fa fix a mistake from rebase 2021-04-23 18:04:37 +02:00
Eren Gölge aadb2106ec code styling 2021-04-23 18:04:37 +02:00
Eren Gölge af7baa3387 refactoring to allow defining the speaker file externally 2021-04-23 18:04:37 +02:00
kirianguiller 7dccbfdcd5 handle multi speaker and gst in Synthetizer class 2021-04-23 18:04:37 +02:00
Edresson d2b6326b8b change optimizer initialization for compatibility with Hifi-GAN official implementation 2021-04-23 07:54:39 -03:00
WeberJulian 4205284f92
Change name of the functions 2021-04-23 10:09:55 +02:00
WeberJulian a26498181b Change back the default value 2021-04-22 16:10:17 +02:00
Julian Weber 355e1f47ab fix dumb mistake 2021-04-22 15:50:29 +02:00
Julian Weber c125b71f36 fix windows support 2021-04-22 15:14:24 +02:00
Jörg Thalheim f5fd7f78d4 server: also listen to ipv6
The [::] address will listen to both ipv4/ipv6 addresses.
2021-04-22 12:38:55 +02:00
Eren Gölge ef37633cb3 [ci skip] use prenet_dropout by default with Tacotron models 2021-04-22 12:38:55 +02:00
Eren Gölge e1d960da9e use SpeakerManager in Synthesizer 2021-04-21 13:13:27 +02:00
Eren Gölge 04b6881b66 add ```unique``` argument to make_symbols to fix the incompat. issue of the
SC-Glow models
2021-04-21 13:12:35 +02:00
Eren Gölge 790946faec formatting speakers.py 2021-04-21 13:12:11 +02:00
Eren Gölge ab313814de initial SpeakerManager implementation 2021-04-21 13:11:46 +02:00
Eren Gölge 09890c7421 fix the glow-tts in setup_model 2021-04-21 13:10:40 +02:00
Eren Gölge 8764d02eb2 update argument name external_speaker_embedding_dim -> speaker_embedding_dim
add inference_noise_scale argument to glow-tts
2021-04-21 13:09:44 +02:00
Eren Gölge 8b40720977 add load_chekpoint to speaker encoder 2021-04-21 13:09:04 +02:00
Eren Gölge 37cad38c27 update argument name in server.py 2021-04-21 13:08:45 +02:00
Eren Gölge 9bccee9da8 update synthesize.py for multi-speaker setting 2021-04-21 13:08:25 +02:00
Eren Gölge d2fa8add1f add ```unique``` param to keep scglow models compatible (there are duplicate symbols in the character set) 2021-04-16 19:40:13 +02:00
Eren Gölge d9612a4351 set the default layer size compatible with scglow 2021-04-16 19:40:13 +02:00
Eren Gölge 1038fd420d fix a mistake from rebase 2021-04-16 19:39:47 +02:00
Eren Gölge 47e356cb48 code styling 2021-04-16 16:01:40 +02:00
Eren Gölge 25328aad00 refactoring to allow defining the speaker file externally 2021-04-16 15:59:57 +02:00
kirianguiller 48ae52a9a3 handle multi speaker and gst in Synthetizer class 2021-04-16 15:54:49 +02:00
Eren Gölge a53958ae3a fix urls for the new models 2021-04-15 17:05:00 +02:00
Eren Gölge 9cc17be53a formatting and a small bug fix in Tacotron model 2021-04-15 16:36:51 +02:00
Eren Gölge 1ad838bc83 add newly released models under .model.json 2021-04-15 16:06:10 +02:00
Eren Gölge 7cada1a949 remove noise 2021-04-15 15:30:45 +02:00
Eren Gölge d60a8d7211 show the real waveform on TB too for GAN vocoder training. 2021-04-15 15:30:06 +02:00
Eren Gölge 5fbe926429 change the default TTS model to TacotronDDC 2021-04-15 15:29:44 +02:00
Eren Gölge 3de5a89154 optionally enable prenet dropout at inference time for tacotron models 2021-04-13 13:24:56 +02:00
Eren Gölge 28a2fed8a3 update hifigan in .model.json 2021-04-12 16:48:05 +02:00
Eren Gölge abaf36861a aligntts model .model.json placeholder 2021-04-12 16:43:52 +02:00
Eren Gölge 480e2f7888 docstring update and better handling make_symbols 2021-04-12 16:40:49 +02:00
Eren Gölge b735076bb4 linter fixes 2021-04-12 13:14:11 +02:00
Eren Gölge b11d1cb845 small fixes 2021-04-12 12:40:55 +02:00
Eren Gölge a7f6045644 Merge branch 'reformat' into hifigan-reformat 2021-04-12 12:00:17 +02:00
Eren Gölge f519012dea reformatting and styling 2021-04-12 11:47:39 +02:00
Eren Gölge 9011dddf77 tacotron DDC placeholder in models.json 2021-04-12 04:06:27 +02:00
Eren Gölge d295d5de97 remove torch.no_grad from TorchSTFT 2021-04-10 19:43:57 +02:00
Eren Gölge 5b70da2e3f restore schedulers only if training is continuing a previous training
inherit nn.Module for TorchSTFT
2021-04-09 19:31:28 +02:00
Eren Gölge 2c71c6d8cd [ci skip]update gan vocoder configs to reflect the recent changes 2021-04-09 17:15:32 +02:00
Eren Gölge 2b529f60c8 update default hifigan config 2021-04-09 11:40:06 +02:00
Eren Gölge 105e0b4d62 vocoder gan training fixes 2021-04-09 11:38:04 +02:00
Eren Gölge 87ee6ceb57 style update #3 2021-04-09 01:17:15 +02:00
Eren Gölge 18d9ec8036 format with black 2021-04-09 00:54:59 +02:00
Eren Gölge e5b9607bc3 isort all imports 2021-04-09 00:45:20 +02:00
Eren Gölge 0e79fa86ad format with black and pylint 2.7.3 2021-04-09 00:38:08 +02:00
Eren Gölge cd69da4868 linter fixes #2 2021-04-08 16:57:46 +02:00
Eren Gölge 4d3e1e9d9a linter fix 2021-04-08 14:57:46 +02:00
Eren Gölge 53f54898bc small fixes 2021-04-08 14:22:47 +02:00
Eren Gölge 006b1d3aaa bug fix 2021-04-08 13:17:45 +02:00
Eren Gölge 3f0993aebe remove junk 2021-04-08 12:17:02 +02:00
Eren Gölge 0ee0458309 remove redundant imports 2021-04-08 11:29:15 +02:00
Eren Gölge 773f1db6fa refactor HifiGAN discriminator 2021-04-08 11:28:30 +02:00
Eren Gölge 15f362d5b1 formatting 2021-04-08 11:28:30 +02:00
Eren Gölge aee24b0704 set different seed in gan_dataset when it is multi-workers 2021-04-08 11:28:30 +02:00
Eren Gölge 6ee211c137 remove stft params causing warning 2021-04-08 11:28:30 +02:00
Eren Gölge 4998ece8d8 allow configuration of optimizers from the config file 2021-04-08 11:28:30 +02:00
Eren Gölge 8daf407652 cache empty 2021-04-08 11:28:30 +02:00
Eren Gölge 3fb78c004a move scheduler updates to the end of the epoch 2021-04-08 11:28:30 +02:00
Eren Gölge 2a872c98aa don't call os.exit as it leaves the process resources standing 2021-04-08 11:27:40 +02:00
Eren Gölge 7cecd2fb2e add hifigan D 2021-04-08 11:27:40 +02:00
Eren Gölge 13dca6e6b6 revert some of Hifigan generator updates 2021-04-08 11:27:40 +02:00
Eren Gölge 02bc776c35 prevent grad in TorchSTFT 2021-04-08 11:27:40 +02:00
Eren Gölge cf44624df8 more docstring 2021-04-08 11:27:40 +02:00
Eren Gölge d95b1458e8 Linter fixes and docstrings for HiFiGAN 2021-04-08 11:27:40 +02:00
Eren Gölge bd7a1c177b fix #419 2021-04-08 11:26:41 +02:00
Eren Gölge 7726dfca99 change the upper bound in sound normalization 2021-04-08 11:26:01 +02:00
Eren Gölge 57f6bd1afa make using different samples for G and D networks optional 2021-04-08 11:26:01 +02:00
Eren Gölge 67f8248492 placeholder for finetuned sam hifigan model 2021-04-08 11:25:29 +02:00
Eren Gölge 241e968df1 load_checkpoint for hifigan and no_grad for inference 2021-04-08 11:25:29 +02:00
Eren Gölge de3a04f104 some commenting for Generator loss and check if the argument is defined in the config file 2021-04-08 11:25:29 +02:00
Eren Gölge ff07c5f5e3 update TorchSTFT to enable melspec 2021-04-08 11:25:29 +02:00
Eren Gölge 4a5b1d4ac2 update hifigan config 2021-04-08 11:24:21 +02:00
Eren Gölge e0e3b12b26 pass all parameters explicitly to _istft 2021-04-08 11:23:20 +02:00
Eren Gölge f0e76ee135 initial models.json entry for universal hifigan 2021-04-08 11:23:20 +02:00
Eren Gölge d57f416957 small fixes 2021-04-08 11:22:30 +02:00
Eren Gölge 8c9e1c9e58 hifigan implementation update 2021-04-08 11:21:43 +02:00
Eren Gölge a14d7bc5db hifigan config update 2021-04-08 11:20:33 +02:00
Eren Gölge 8d4fd79cd7 update hifigan config 2021-04-08 11:20:33 +02:00
rishikksh20 e656e8b108 Remove select size bug 2021-04-08 11:20:33 +02:00
rishikksh20 b533474e3b Remove minor bugs and make code trainable 2021-04-08 11:20:33 +02:00
rishikksh20 ef6ff4e95c Add Exponential LR scheduler check 2021-04-08 11:20:33 +02:00
rishikksh20 1535777f64 1) Add ExponentialLR 2021-04-08 11:18:36 +02:00
rishikksh20 c20a6b1185 * Format the model definition
* Update code and integrate training code
2021-04-08 11:18:36 +02:00
rishikksh20 39b5845810 1) Add hifigan json files
2) Rename MPD disc
3) Re-format remove weight norm generator
2021-04-08 11:14:39 +02:00
rishikksh20 7b7c5d635f 1) Combine MSD with Multi-Period disc
2) Add remove weight norm layer on Generator
2021-04-08 11:14:39 +02:00
rishikksh20 4493feb95c Add HiFi-GAN v1 generator and discriminator classes 2021-04-08 11:14:39 +02:00
Eren Gölge c86c559349 docstring and optional padding in TorchSTFT 2021-04-07 12:36:15 +02:00
Eren Gölge f890454de3 linter fixes 2021-04-07 12:36:03 +02:00
Eren Gölge 9782d9ea5d [ci skip] implement #418 2021-04-06 16:24:50 +02:00
Eren Gölge f46a275b22 update docstring 2 2021-04-06 16:24:50 +02:00
Eren Gölge ec94ff3691 update docstring 2021-04-06 16:24:50 +02:00
Eren Gölge 2048095e9a audio.py fix 2021-04-06 16:24:50 +02:00
Eren Gölge e0b3008c31 allow choosing the log function used for amptodb conversion 2021-04-06 16:24:50 +02:00
Eren Gölge 44b4cb5ba5 DCA comment 2021-04-06 16:24:50 +02:00
Eren Gölge b86e7fb2e8 pad short samples when loading precomputed features in vocoder training 2021-04-06 16:24:50 +02:00
Eren Gölge 6ad4eba678 gan vocoder train fix in case of restoring models with no scheduler defined 2021-04-06 16:24:50 +02:00
Eren Gölge e3ccfe37ea add DE more urls 2021-04-02 14:54:41 +02:00
Eren Gölge e84f120a04 sam-accenture model preprocessor 2021-04-01 03:41:41 +02:00
Eren Gölge e3c052382b fix loading always best_model when continue 2021-04-01 03:41:15 +02:00
Eren Gölge 48ea20e69f example aligntts config 2021-03-30 14:41:00 +02:00
Eren Gölge b4c2cf80f2 fix eval iter 2021-03-30 14:39:16 +02:00
Eren Gölge a3a840fd78 linter fixes 2021-03-30 14:39:16 +02:00
Eren Gölge 6b2e13bf62 compute normalized logp using torch primitives 2021-03-30 14:39:16 +02:00
Eren Gölge 7a382a5c2b stowed aligntts commit and small refactoring with feed_forward layers 2021-03-30 14:39:16 +02:00
Eren Gölge d542a50818 fix losses for alignTTS 2021-03-30 14:39:16 +02:00
Eren Gölge 18cc7b95ec update l1 and huber to mse loss 2021-03-30 14:39:16 +02:00
Eren Gölge 896d33ed49 update losses to handle aligntts phases 2021-03-30 14:39:16 +02:00
Eren Gölge aec0b78aff duration predictor fix 2 2021-03-30 14:39:16 +02:00
Eren Gölge 07269e639b fix duration predictor in AlignTTS 2021-03-30 14:39:16 +02:00
Eren Gölge c2d29e5cd4 FFTransformer encoder for aligntts 2021-03-30 14:39:16 +02:00
Eren Gölge 460a2d3e26 FFTransformer Decoder for AlignTTS 2021-03-30 14:39:16 +02:00
Eren Gölge 844e8e0ed4 adapt align_tts and model name handling 2021-03-30 14:39:16 +02:00
Eren Gölge aa29f5b199 aligntts loss 2021-03-30 14:39:16 +02:00
Eren Gölge a831468cab align tts MDN layer 2021-03-30 14:39:16 +02:00
Eren Gölge 4396f8e2da continue refactoring 2021-03-30 14:39:16 +02:00
Eren Gölge 892c3c3623 use torch for AngleProtoLoss 2021-03-30 14:39:16 +02:00
Eren Gölge 2b3e12ea49 correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting 2021-03-30 14:39:16 +02:00
Eren Gölge ecb6b0d6ad rename GlowTtts as GlowTTS 2021-03-30 14:39:16 +02:00
Eren Gölge e8cf8cb00e restructure TF tacotron files 2021-03-30 14:39:16 +02:00
Eren Gölge 1ac99ce0d0 if git is not available set git hash to 'unknown' 2021-03-30 14:39:16 +02:00
Eren Gölge d9c405f0c3 create feedforward folder for SS layers 2021-03-30 14:39:16 +02:00
Eren Gölge a8cf1ae6b4 fix wavenet running with no input mask 2021-03-30 14:39:16 +02:00
Eren Gölge 1c1949d348 utf-8 encoding for certain preprocessors 2021-03-30 14:39:16 +02:00
Eren Gölge ca2f22cdd7 linter fix 2021-03-30 14:36:12 +02:00
Eren Gölge d0dcd7d1b8 let the user define output.wav file path, fix #393 2021-03-30 14:24:31 +02:00
Eren Gölge 25654233d5 [ci skip]initial commit for the new DE models and stale ot update 2021-03-29 03:23:57 +02:00
Guy Elsmore-Paddock 15459627cc Fix `UnicodeEncodeError` on Windows Platforms
Prevents the following error from appearing when running training on Windows platforms:
```
UnicodeEncodeError: 'charmap' codec can't encode characters in position: character maps to <undefined>
```
2021-03-20 17:30:00 -04:00
Eren Gölge 3947750dd9 Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2021-03-18 14:09:47 +01:00
WeberJulian 4a9d2e4309 fix french_cleaners 2021-03-18 13:35:29 +01:00
WeberJulian 596ea2c98a Add resample script 2021-03-18 13:33:37 +01:00
Eren Gölge 6e68637f48 bug fix 2021-03-18 13:33:23 +01:00
Eren Gölge f3e5ddfaaf bug fix in preprocessor 2021-03-18 13:33:23 +01:00
Eren Gölge aeb4f82233 bug fix 2021-03-18 13:33:23 +01:00
Eren Gölge 0514330869 fix mozilla/TTS#685 2021-03-18 13:33:23 +01:00
Eren Gölge f06603a0db force utf8 2021-03-18 13:33:23 +01:00
Eren Gölge 32e8b56c45 linter fix 2021-03-18 13:33:23 +01:00
Eren Gölge 65533f33e9 fix #374 2021-03-18 13:33:00 +01:00
Eren Gölge d790d2fccb linter fix 2021-03-18 13:33:00 +01:00
WeberJulian af96080e17 fix linter issues 2021-03-18 13:33:00 +01:00
WeberJulian bf04383e74 fix french_cleaners 2021-03-18 13:33:00 +01:00
WeberJulian f6cd8e0ecc test case 2021-03-18 13:33:00 +01:00
WeberJulian e954e45e57 linter + test 2021-03-18 13:33:00 +01:00
WeberJulian e598977f3d Using path.join instead of concat 2021-03-18 13:33:00 +01:00
WeberJulian c5ef2de73f Add resample script 2021-03-18 13:33:00 +01:00
Eren Gölge 2690ab2ee5 bug fix 2021-03-16 19:15:28 +01:00
Eren Gölge 4c1aed4a9c bug fix in preprocessor 2021-03-16 19:13:32 +01:00
Eren Gölge 01e35e06c4 bug fix 2021-03-16 19:13:32 +01:00
Eren Gölge aa8bb815a7 fix mozilla/TTS#685 2021-03-16 19:13:32 +01:00
Eren Gölge a8c348ffb2 force utf8 2021-03-16 19:13:32 +01:00
Eren Gölge bf0caba0bc linter fix 2021-03-16 19:13:32 +01:00
Eren Gölge babc94f63f fix #374 2021-03-16 19:13:32 +01:00
Eren Gölge bdfd1f8a89 linter fix 2021-03-16 19:13:32 +01:00
WeberJulian 11e25a7125 fix linter issues 2021-03-16 19:13:01 +01:00
WeberJulian 1574d8dd39 fix french_cleaners 2021-03-16 19:13:01 +01:00
WeberJulian b94373afb8 test case 2021-03-16 19:13:01 +01:00
WeberJulian 93fdc0729c linter + test 2021-03-16 19:13:01 +01:00
WeberJulian 17f197f51e Using path.join instead of concat 2021-03-16 19:13:01 +01:00
WeberJulian d6749f030f Add resample script 2021-03-16 19:13:01 +01:00
Eren Gölge 838ebd6ad5 add the missing russian model 2021-03-16 18:38:35 +01:00
Eren Gölge 5c657715f2 fix #382 2021-03-16 17:31:48 +01:00
Eren Gölge 38a29ce1c9 move all models to github rls 2021-03-10 18:19:32 +01:00
Eren Gölge e5bb317242 fix model manager 2021-03-10 17:01:19 +01:00
Eren Gölge d260fb03a2 fix handling scale_stats.npy for models downloaded from Github rls 2021-03-10 16:40:30 +01:00
Eren Gölge 4aba4e5b1e linter fix 2021-03-10 15:33:11 +01:00
Eren Gölge 6c932c8503 print the desc if required parameters are not provided 2021-03-10 15:19:00 +01:00
Eren Gölge 9e84c8a623 do not copy scale_stats if exist in the output folder 2021-03-10 15:13:55 +01:00
Eren Gölge 7782034e7e fix #369 2021-03-10 15:13:21 +01:00
Eren Gölge 4337e9ff87 pad_mode in torch_stft 2021-03-10 14:41:00 +01:00
Eren Gölge 599149a7e5 downloading models from github releases 2021-03-10 11:09:01 +01:00
Eren Gölge fc19411ac6 update some of the models to github releases 2021-03-10 11:08:15 +01:00
Eren Gölge 19bb9ba851 fix tts endpoint using list-models argument 2021-03-09 14:06:09 +01:00
Eren Gölge 43379eecef fix the nl model and add the vocoder 2021-03-09 14:05:56 +01:00
r-dh 8a4dcd152f Add Dutch model 2021-03-09 13:22:19 +01:00
Eren Gölge 94805236fb Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2021-03-08 15:21:06 +01:00
Eren Gölge 5dcc4be560 rebrand demo server 2021-03-08 14:51:04 +01:00
Eren Gölge 947e3d6a93 rename test 2021-03-08 14:50:54 +01:00
Eren Gölge a519ed52f2 deprecate embedding models to the wheel 2021-03-08 14:06:15 +01:00
Eren Gölge c16ad38930 update server README 2021-03-08 14:05:59 +01:00
Eren Gölge 594d8d8f09 linter fixes 2021-03-08 11:22:59 +01:00
Eren Gölge 00b5090974 linter fix 2021-03-08 11:05:30 +01:00
Eren Gölge e15734c3fc linter fix 2021-03-08 05:29:43 +01:00
Eren Gölge 9a48ba3821 a ton of linter updates 2021-03-08 05:06:54 +01:00
Eren Gölge e03a426378 bug fix 2021-03-08 02:59:48 +01:00
kirianguiller 628afe5cb0 remove gst handling in synthetizer.py class 2021-03-08 02:59:48 +01:00
kirianguiller 557239db7f remove re.Match typing in '_number_replace()' 2021-03-08 02:59:48 +01:00
kirianguiller 9ab07f94e2 modify according to PR reviews 2021-03-08 02:59:48 +01:00
kirianguiller 42ba30eb8f <add> Chinese mandarin implementation (tacotron2) 2021-03-08 02:59:24 +01:00
kirianguiller 49665783a6 remove gst handling in synthetizer.py class 2021-03-08 02:57:11 +01:00
kirianguiller e85658ac2b remove re.Match typing in '_number_replace()' 2021-03-08 02:57:11 +01:00
kirianguiller 0d4525322c modify according to PR reviews 2021-03-08 02:57:11 +01:00
kirianguiller e6fd118cf8 <add> Chinese mandarin implementation (tacotron2) 2021-03-08 02:57:11 +01:00
Eren Gölge e3102e753c enable backward compat for loading the best model 2021-03-08 02:57:11 +01:00
gerazov 2451a813a2 refactored keep_all_best 2021-03-08 02:57:11 +01:00
gerazov 8cefa76bae reformatted docstrings in arguments.py 2021-03-08 02:57:11 +01:00
gerazov 2db40457e8 brushed up printing model load path and best loss path 2021-03-08 02:56:36 +01:00
gerazov f2e474cd37 loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added 2021-03-08 02:56:36 +01:00
Eren Gölge 4111df6769 Docstrings for audioprocessor 2021-03-08 02:54:47 +01:00
Eren Gölge 2ca74b8ab3 add RUSLAN dataset preprocessor 2021-03-08 02:54:47 +01:00
Eren Gölge 8993120634 do not test server and modelManager until fixing #657 2021-03-08 02:54:47 +01:00
Adonis Pujols 89b7f01534 add encoding="utf-8" 2021-03-08 02:54:47 +01:00
Eren Gölge ffceccb021 fix #655 2021-03-08 02:54:47 +01:00
Eren Gölge 534c341f16 linter update 2021-03-08 02:54:47 +01:00
Eren Gölge 0e1e60bef0 remove redundancy 2021-03-08 02:54:47 +01:00
Eren Gölge 93a83c0068 Update TTS/utils/arguments.py
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-03-08 02:54:47 +01:00
Eren Gölge 39fbf2fe84 Update TTS/bin/find_unique_chars.py
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-03-08 02:54:47 +01:00
Eren Gölge ee71eb4eb7 linter fixes 2021-03-08 02:54:47 +01:00
Eren Gölge 55fc50b26d update test_text_processing for espeak-ng 2021-03-08 02:54:47 +01:00
Eren Gölge 5b8a6736a7 remove _phoneme_punctuations 2021-03-08 02:54:47 +01:00
Eren Gölge 194f82de51 save default model chars to the training config file 2021-03-08 02:54:47 +01:00
Eren Gölge 62a8eba3b2 parse_characters function 2021-03-08 02:54:47 +01:00
Eren Gölge 0b33acdcca enable saving model characters in io.py 2021-03-08 02:54:47 +01:00
Eren Gölge f9fe167537 docstring update 2021-03-08 02:54:47 +01:00
Eren Gölge 62aeacbdd1 save used model characters to the checkpoints 2021-03-08 02:54:47 +01:00
Eren Gölge e06c93fe81 model_manager tests 2021-03-08 02:54:47 +01:00
Eren Gölge fe41084eb3 author , license and contact info in .model.json 2021-03-08 02:54:47 +01:00
nmstoker ae0d54ddae Updating models list to include EK1 TTS/vocoder 2021-03-08 02:54:47 +01:00
Eren Gölge c6702b5b9f find unique characters in a dataset 2021-03-08 02:54:47 +01:00
Eren Gölge dad3565379 use default vocoders in server.py 2021-03-08 02:54:47 +01:00
Eren Gölge d30608ab17 set an output_sample_rate in synthesizer and use it for writing the wav
file
2021-03-08 02:54:47 +01:00
Eren Gölge 3ccb015cd8 return the json entry of the downloaded model 2021-03-08 02:54:47 +01:00
Eren Gölge 00e0933f43 save_wav with a custom sampling rate 2021-03-08 02:54:47 +01:00
Eren Gölge 9fefc79f0c fix make_symbols 2021-03-08 02:54:47 +01:00
Eren Gölge 8955333e9d use default vocoder in synthesize.py 2021-03-08 02:54:47 +01:00
Eren Gölge 23b282f600 define default vocoders 2021-03-08 02:54:47 +01:00
Eren Gölge 6bd8485d10 bug fix 2021-03-08 02:54:47 +01:00
Eren Gölge 5f1018abee fix spelling of a def argument and parse phonemes from config.json if
use_phonemes is True
2021-03-08 02:54:47 +01:00
Eren Gölge 1c1abb8a9b docstring update 2021-03-08 02:54:47 +01:00
Eren Gölge 6cd642c2e1 add missing phonemes to test_config.json 2021-03-08 02:54:47 +01:00
Eren Gölge 43b951018e fix the default vocoder name 2021-03-08 02:54:47 +01:00
Adonis Pujols 81b145c321 spelling error. should be multiband not mulitband 2021-03-08 02:54:47 +01:00
Adonis Pujols 59b1b13e07 spelling error. should be multiband not mulitband 2021-03-08 02:54:47 +01:00
Eren Gölge ee58ff2d38 add russian phoneme char 2021-03-08 02:54:47 +01:00
Eren Gölge 29d928d531 css10 dataset preprocessor 2021-03-08 02:54:47 +01:00
Eren Gölge 49771f2541 download github model releases by model manager 2021-03-08 02:54:21 +01:00
Eren Gölge 3c961370e7 linter fixes 2021-03-08 02:54:21 +01:00
gerazov 2b5cb24db7 final final fixes 2021-03-08 02:54:21 +01:00
gerazov b3c5cc2cdc final fixes 2021-03-08 02:54:21 +01:00
gerazov 10d5a63d49 updated to current dev 2021-03-08 02:54:21 +01:00
gerazov 6f06e31541 changed train scripts 2021-03-08 02:54:21 +01:00
gerazov 2daca15802 restructured arg parsing and processing to utils 2021-03-08 02:54:21 +01:00
Eren Gölge 2fbe4a1b8a fix gdown 2021-03-08 02:54:21 +01:00
Branislav Gerazov ed56944c4a improve robustness of defining wavernn in config file 2021-03-08 02:54:21 +01:00
Branislav Gerazov 5e2bc8c99f update wavernn test config, delete cap=True 2021-03-08 02:54:21 +01:00
Branislav Gerazov b1e3160884 waveRNN fix 2021-03-08 02:54:21 +01:00
Eren Gölge 08581deb61 linter updates 2021-03-08 02:53:02 +01:00
Thorsten Mueller 167901813d Ups. Added missing , 2021-03-08 02:53:02 +01:00
Eren Gölge 93a6bdfd6c linter fixes and version updates for deps 2021-03-08 02:51:10 +01:00
Eren Gölge a30a231566 unpin cython version and comment out pyworld in audio.py causing dep
issues
2021-03-08 02:50:15 +01:00
Thorsten Mueller 3eb00e8d93 Set out_path to be required param. 2021-03-08 02:49:15 +01:00
Alexander Korolev ace430d5e6 fix device mismatch wavegrad training
this should fix the device mismatch as seen here https://github.com/mozilla/TTS/issues/622#issue-789802916
2021-03-08 02:49:15 +01:00
Eren Gölge 83143fbe39 fix #638 2021-03-08 02:48:31 +01:00
Eren Gölge 30c3bef3f9 move hubconf 2021-03-08 02:48:31 +01:00
Eren Gölge bbea6a0884 hubconf.py and load .models.json from the default location by manage.py 2021-03-08 02:48:31 +01:00
Eren Gölge 90d4f08d6c reorder imports 2021-03-08 02:48:31 +01:00
Eren Gölge db231c83fc distill import statement, check python version in setup.py 2021-03-08 02:48:31 +01:00
Thorsten Mueller 915ec1faac Added info if model already downloaded in --list_models 2021-03-08 02:48:31 +01:00
Alexander Korolev b4bc5f6eb1 update fixed stopnet_pos_weight parameter
config parameter c.stopnet_pos_weight has currently no effect as it is not used.
2021-03-08 02:48:31 +01:00
Eren Gölge 534e3c67c6 README update, set default models for synthesize.py and server.py. Disable verbose for ap init. 2021-03-08 02:48:31 +01:00
kirianguiller 7f36d91131 update chinese model 2021-03-01 14:55:05 +01:00
Eren Gölge 547bfc4ce9 bug fix 2021-02-18 18:24:03 +00:00
Eren Gölge adaeec57ec Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2021-02-18 17:21:09 +00:00
Eren Gölge 5b70c8ba4f enable backward compat for loading the best model 2021-02-18 17:20:36 +00:00
Eren Gölge e4f81d6856
Merge pull request #654 from kirianguiller/chinese-implementation
Chinese implementation (merge into dev)
2021-02-18 17:15:32 +01:00
kirianguiller 22a6bbfa80 remove gst handling in synthetizer.py class 2021-02-17 20:53:56 +01:00
kirianguiller 3911b87e54 remove re.Match typing in '_number_replace()' 2021-02-17 20:53:56 +01:00
kirianguiller fb0655d1e7 modify according to PR reviews 2021-02-17 20:53:56 +01:00
kirianguiller c4c7bc1b88 <add> Chinese mandarin implementation (tacotron2) 2021-02-17 20:53:56 +01:00
Eren Gölge d0454461de Merge branch 'pr/gerazov/650-2' into dev 2021-02-17 13:40:45 +00:00
Eren Gölge a8ea0ea6ce Docstrings for audioprocessor 2021-02-17 13:35:41 +00:00
Eren Gölge f6e6314910 add RUSLAN dataset preprocessor 2021-02-17 13:35:23 +00:00
Eren Gölge ce0c5eccbd do not test server and modelManager until fixing #657 2021-02-17 00:35:43 +00:00
gerazov 61c88beb94 refactored keep_all_best 2021-02-15 18:40:17 +01:00
Eren Gölge eb543c027e Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2021-02-15 17:06:40 +00:00
Eren Gölge 8a106e0527 fix #655 2021-02-15 17:06:03 +00:00
Eren Gölge 216945e653
Merge pull request #647 from adonispujols/patch-1
Easy Fix for #454 (which was somehow deleted?)
2021-02-15 13:17:17 +01:00
Eren Gölge 06a3ba2fe2 linter update 2021-02-15 12:10:19 +00:00
Eren Gölge 7f58fa365b Merge branch 'save_characters' into dev 2021-02-15 12:07:28 +00:00
Eren Gölge ff218e2370 remove redundancy 2021-02-15 12:07:02 +00:00
Eren Gölge 80af8ca5e1
Update TTS/utils/arguments.py
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-02-15 13:03:59 +01:00
Eren Gölge 3b6ce04332
Update TTS/bin/find_unique_chars.py
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-02-15 13:02:29 +01:00
Eren Gölge dc3596dad4 model_manager tests 2021-02-15 11:29:22 +00:00
Eren Gölge 77e630348e author , license and contact info in .model.json 2021-02-15 11:02:21 +00:00
Eren Gölge e1bc823e44 Merge branch 'pr/nmstoker/652' into dev 2021-02-15 10:57:12 +00:00
nmstoker 33bcdc6ff8 Updating models list to include EK1 TTS/vocoder 2021-02-14 23:44:05 +00:00
Eren Gölge 420901f4c2 linter fixes 2021-02-12 14:41:17 +00:00
Eren Gölge 4244096ccb update test_text_processing for espeak-ng 2021-02-12 14:07:26 +00:00
Eren Gölge b28c724c04 remove _phoneme_punctuations 2021-02-12 12:10:57 +00:00
Eren Gölge 7ab527d17e save default model chars to the training config file 2021-02-12 12:06:46 +00:00
Eren Gölge 593cedee14 parse_characters function 2021-02-12 12:05:56 +00:00
Eren Gölge 2abfff17f9 enable saving model characters in io.py 2021-02-12 12:04:41 +00:00
Eren Gölge 918f007a11 docstring update 2021-02-12 12:04:07 +00:00
Eren Gölge e774f68aee save used model characters to the checkpoints 2021-02-12 12:03:42 +00:00
gerazov 0e78e31dbf reformatted docstrings in arguments.py 2021-02-12 11:36:01 +01:00
gerazov 310d18325e brushed up printing model load path and best loss path 2021-02-12 10:55:45 +01:00
Eren Gölge 8b6fd76ad2 find unique characters in a dataset 2021-02-12 09:46:11 +00:00
gerazov af46727517 loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added 2021-02-12 02:12:00 +01:00
Eren Gölge a1e595790d use default vocoders in server.py 2021-02-11 15:31:39 +00:00
Eren Gölge 8aa6a0decb set an output_sample_rate in synthesizer and use it for writing the wav
file
2021-02-11 15:28:07 +00:00
Eren Gölge 0c52d27d65 return the json entry of the downloaded model 2021-02-11 15:27:41 +00:00
Eren Gölge 1649ad3431 save_wav with a custom sampling rate 2021-02-11 15:27:20 +00:00
Eren Gölge 43f54d2dce fix make_symbols 2021-02-11 15:26:52 +00:00
Eren Gölge 0657b38111 use default vocoder in synthesize.py 2021-02-11 15:26:17 +00:00
Eren Gölge 2043a9b5f5 define default vocoders 2021-02-11 15:25:55 +00:00
Eren Gölge ff27690ca7 bug fix 2021-02-11 13:43:29 +00:00
Eren Gölge bc131208be fix spelling of a def argument and parse phonemes from config.json if
use_phonemes is True
2021-02-11 13:04:47 +00:00
Eren Gölge f1799dbd60 docstring update 2021-02-11 11:25:31 +00:00
Eren Gölge 3baec4ea96 add missing phonemes to test_config.json 2021-02-11 11:14:39 +00:00
Eren Gölge a3d1e65b34 Merge branch 'pr/adonispujols/646' into dev 2021-02-11 10:37:29 +00:00
Eren Gölge 3c2e13ca5c fix the default vocoder name 2021-02-11 10:36:52 +00:00
Adonis Pujols 48011a8b58
add encoding="utf-8" 2021-02-11 05:26:06 -05:00
Adonis Pujols b29a7e9645
spelling error. should be multiband not mulitband 2021-02-11 04:49:28 -05:00
Adonis Pujols 6c824a6629
spelling error. should be multiband not mulitband 2021-02-11 04:48:53 -05:00
Eren Gölge b08b8ca2a1 add russian phoneme char 2021-02-10 13:30:59 +00:00
Eren Gölge 9cad435288 css10 dataset preprocessor 2021-02-09 15:11:26 +00:00
Eren Gölge cea5e517f2 download github model releases by model manager 2021-02-09 14:24:14 +00:00
Eren Gölge c619859a3f linter fixes 2021-02-09 11:43:17 +00:00
gerazov e507373b55 final final fixes 2021-02-06 23:08:47 +01:00
gerazov ad17dc9e76 final fixes 2021-02-06 23:05:01 +01:00
gerazov 8fdd08ea15 updated to current dev 2021-02-06 22:59:52 +01:00
gerazov 2705d27b28 changed train scripts 2021-02-06 22:29:30 +01:00
gerazov 4f8f274d6e restructured arg parsing and processing to utils 2021-02-06 22:25:56 +01:00
Eren Gölge e7e880f514 fix gdown 2021-02-05 13:42:24 +00:00
Eren Gölge f4f6290eec Merge branch 'pr/gerazov/641' into dev 2021-02-05 13:14:49 +00:00
Eren Gölge d49757faaa linter updates 2021-02-05 13:10:43 +00:00
Branislav Gerazov f063545325 improve robustness of defining wavernn in config file 2021-02-05 13:26:33 +01:00
Branislav Gerazov 24ffa9e9f6 update wavernn test config, delete cap=True 2021-02-05 13:10:02 +01:00
Branislav Gerazov cb77aef36c waveRNN fix 2021-02-04 09:52:03 +01:00
Thorsten Mueller d74866cb8e Merge remote-tracking branch 'upstream/dev' into dev
Fix for circleci error mentioned in PR https://github.com/mozilla/TTS/pull/637
2021-02-02 19:40:18 +01:00
Thorsten Mueller a82152eef3 Ups. Added missing , 2021-02-02 19:29:16 +01:00
Thorsten Mueller 4cb4fcf02c Set out_path to be required param. 2021-02-02 19:29:16 +01:00
Thorsten Mueller c75ea74914 Added info if model already downloaded in --list_models 2021-02-02 19:29:16 +01:00
Eren Gölge 2edab4b3f9 disable pw in audio that causes numpy issue 2021-02-01 17:05:03 +00:00
Eren Gölge 5c46543765 linter fixes and version updates for deps 2021-02-01 13:18:56 +00:00
Eren Gölge 8774e37444 unpin cython version and comment out pyworld in audio.py causing dep
issues
2021-02-01 11:34:05 +00:00
Eren Gölge 5beed0ddcd Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2021-02-01 11:27:14 +00:00
Eren Gölge c7407571fa fix #638 2021-02-01 10:05:55 +00:00
Eren Gölge dfdac1def9
Merge pull request #636 from thorstenMueller/dev
Set out_path to be required param in compute_statistics.py.
2021-01-29 18:08:31 +01:00
Thorsten Mueller 44c4a49745 Set out_path to be required param. 2021-01-29 17:23:38 +01:00
Eren Gölge 536366dc0a
Merge pull request #635 from SanjaESC/patch-1
fix device mismatch wavegrad training
2021-01-29 16:42:25 +01:00
Eren Gölge 0354b6f35e move hubconf 2021-01-29 15:28:32 +00:00
Eren Gölge aa5f24608a hubconf.py and load .models.json from the default location by manage.py 2021-01-29 15:28:26 +00:00
Alexander Korolev e81ebec7a8
fix device mismatch wavegrad training
this should fix the device mismatch as seen here https://github.com/mozilla/TTS/issues/622#issue-789802916
2021-01-29 15:18:59 +01:00
Eren Gölge a926aa106d reorder imports 2021-01-29 01:36:21 +01:00
Eren Gölge 8a6eee7fec distill import statement, check python version in setup.py 2021-01-28 17:04:08 +01:00
Eren Gölge 131a163c95
Merge pull request #628 from thorstenMueller/dev
Added info if model already downloaded in --list_models
2021-01-28 13:10:06 +01:00
Alexander Korolev ca28e05ed7
update fixed stopnet_pos_weight parameter
config parameter c.stopnet_pos_weight has currently no effect as it is not used.
2021-01-27 16:33:25 +01:00
Thorsten Mueller ccbd542eb0 Added info if model already downloaded in --list_models 2021-01-27 16:19:02 +01:00
Eren Gölge 25c86ca715 README update, set default models for synthesize.py and server.py. Disable verbose for ap init. 2021-01-27 11:47:03 +01:00
Eren Gölge 4f32e77006 platform indep. way to fetch user data folder 2021-01-26 17:32:43 +01:00
Eren Gölge 0117c811a9 add a button to index.html to see the model details 2021-01-26 12:33:27 +01:00
Eren Gölge a3adcaccdb Merge branch 'pr/thorstenMueller/623' into dev 2021-01-26 12:19:39 +01:00
Eren Gölge b464cab9b8 setup.py update and pylint fixes 2021-01-26 02:57:50 +01:00
Eren Gölge 660d61aeeb maximum_path_numpy and CYTHON adaptable import 2021-01-26 02:57:07 +01:00
Eren Gölge 877f0bbfba manifest.in update 2021-01-26 02:56:55 +01:00
Eren Gölge 82e029529e fix manifest file 2021-01-25 13:27:54 +01:00
Eren Gölge 57b668fd86 fixing some pypi issues 2021-01-25 13:06:12 +01:00
Eren Gölge 60c1bb93d9 fixes before first PyPI release 2021-01-25 11:16:20 +01:00
Thorsten Mueller afb7db2a1d Removed unneeded check and removed specific taco2 model name. 2021-01-22 16:22:50 +01:00
Eren Gölge fae10309e4
Merge pull request #624 from SanjaESC/patch-3
Update train_tacotron.py
2021-01-22 13:29:09 +01:00
Eren Gölge 5ee73c2bae Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2021-01-22 13:26:27 +01:00
Eren Gölge 5fb611ef40 static image for server index.html 2021-01-22 03:01:53 +01:00
Eren Gölge ca647cf222 Model Manager to download released models 2021-01-22 02:35:43 +01:00
Eren Gölge ca8ad9c21e rename audio._normalize to audio.normalize 2021-01-22 02:33:19 +01:00
Eren Gölge c990b3a59c linter fixes and test fixes 2021-01-22 02:32:35 +01:00
Alexander Korolev f251dc8c0e
Update train_tacotron.py
When attempting to fine-tune a model with "prenet_type": "bn" that was originally trained with "prenet_type": "original", a RuntimeError is thrown that stops the training.

By catching the RuntimeError, the required layers can be partially restored and the training will continue without any problems.
2021-01-21 21:16:30 +01:00
Eren Gölge 0ab2eb2664 use synthesizer in both synthesize.py and server.py 2021-01-21 15:54:33 +01:00
Eren Gölge 9addfabc43 wavernn load_checkpoint function 2021-01-21 15:31:13 +01:00
Eren Gölge 50fee59a2c update synthesizer.py for better interfacing to different models 2021-01-21 15:30:49 +01:00
Eren Gölge 007a4d7139 remove 3rd party wavernn support from server.py and add ModelManager arguments 2021-01-21 15:30:16 +01:00
Eren Gölge 6b6e989fd2 update server readme 2021-01-21 15:29:46 +01:00
Thorsten Mueller e414582be6 Added option for server ui details page. 2021-01-20 21:56:40 +01:00
root 1bc8fbbd3c set eval mode when loading models 2021-01-20 02:14:18 +00:00
root 5bd7238153 interpolate spectrogram in vocoder generic utils for matching sample
rates
2021-01-20 02:13:01 +00:00
root ca3743539a load_checkpoint func for vocoder models 2021-01-20 02:12:29 +00:00
root ea39715305 read_json_with_comments 2021-01-20 02:11:55 +00:00
root 563bc921d8 optional verbose for audio.py init 2021-01-20 02:11:24 +00:00
root 1faf565e3a add load_checkpoint func to tts models 2021-01-20 02:10:56 +00:00
root 5c87753e88 glow-tts fix for saving inverse weight 2021-01-20 02:09:42 +00:00
root 3d30dae8f3 .models.json and synthesize.py update for interfacing with model manager 2021-01-20 02:08:58 +00:00
gerazov b2b4828f17 set requires_grad=False 2021-01-16 19:46:04 +01:00
gerazov c96f7a2614 TorchSTFT to device fix 2021-01-16 12:21:16 +01:00
root 7beaacc55b update compute_attention_masks.py 2021-01-13 10:03:57 +00:00
erogol 428c224b88 comment update 2021-01-12 17:31:04 +01:00
erogol bbc8d665a1 move attention layers to a separate file 2021-01-11 17:27:30 +01:00
erogol 79c841ccd3 mass refactoring and update 2021-01-11 17:26:58 +01:00
erogol 1d961d6f8a class renaming 2021-01-11 17:26:11 +01:00
erogol c0a2aa68d3 formatting 2021-01-11 17:25:39 +01:00
erogol b206162d11 more docstrings 2021-01-11 17:25:04 +01:00
erogol 6e9043c5d2 rename convbnblocks and handle none mask 2021-01-11 17:22:34 +01:00
erogol 921fa5db92 remove attentions from common layers 2021-01-11 15:06:42 +01:00
erogol cc2b1e043d docstrings for common layers 2021-01-11 15:06:12 +01:00
erogol a6f40fef2e stage missing files 2021-01-08 16:02:56 +01:00
erogol d382d759b3 small fixes and test fixes 2021-01-08 15:48:40 +01:00
erogol a6259041d3 docstring for speedyspeech 2021-01-07 14:35:22 +01:00
erogol de2a542f83 glow-tts bug fix 2021-01-07 13:40:32 +01:00
erogol 14d33662ea input shapes for tacotron models 2021-01-06 13:19:40 +01:00
erogol f288e9a260 docstrings for tacotron models 2021-01-06 13:19:40 +01:00
erogol 5a45af48f1 fix 2021-01-06 13:19:40 +01:00
erogol e7fad928e7 doc strings for the all glow-tts layers 2021-01-06 13:19:40 +01:00
erogol d3b7284be4 glow-tts comments and refactoring 2021-01-06 13:19:40 +01:00
erogol 7586fbc4de SS refactoring 2021-01-06 13:19:40 +01:00
erogol e82d31b6ac glow-tts refactoring 2021-01-06 13:19:40 +01:00
erogol 29f4329d7f update glow-tts layers and add some comments 2021-01-06 13:19:40 +01:00
erogol 29cf933831 update SS config 2021-01-06 13:19:40 +01:00
erogol 228ada04b5 update glow-tts ljspeech config 2021-01-06 13:19:40 +01:00
erogol f352b3534c make noise augmentation optional 2021-01-06 13:19:40 +01:00
erogol 71c382be14 copy model scale stats file with config.json to the training folder, fixed for model inits 2021-01-06 13:19:40 +01:00
erogol aa40fe1aa0 SS model refactoring for multi speaker 2021-01-06 13:19:40 +01:00
erogol eb555855e4 small fixes 2021-01-06 13:19:40 +01:00
erogol 5901a00576 argument rename 2021-01-06 13:19:40 +01:00
erogol 4ef083f0f1 select decoder type for SS 2021-01-06 13:19:40 +01:00
erogol d5a0190c4b update copy_config_file to copy_model_files 2021-01-06 13:19:40 +01:00
erogol 8971c59b2d plot eval alignment score right 2021-01-06 13:19:40 +01:00
erogol 3fa408a5ea change order BN + ReLU to ReLU + BN for SS 2021-01-06 13:19:40 +01:00
erogol ac5c9217d1 positional encoding masking for SS 2021-01-06 13:19:40 +01:00
erogol fede46e96e pylint and test fixes 2021-01-06 13:19:40 +01:00
erogol 2abe3df153 compute_attention_masks.py 2021-01-06 13:19:40 +01:00
erogol cf869e8922 add SS files 2021-01-06 13:19:40 +01:00
erogol e4680e1b99 plot float16 alignments 2021-01-06 13:19:40 +01:00
erogol 13c6665c92 inference for SS 2021-01-06 13:19:40 +01:00
erogol 30788960a8 check SS model parameters 2021-01-06 13:19:40 +01:00
erogol 5cae2c5742 make optional position encoding for speedyspeech 2021-01-06 13:19:40 +01:00
erogol dc4a16d62e speedy speech losses 2021-01-06 13:19:40 +01:00
erogol d62cac7252 fix glow-tts prenet bug fix 2021-01-06 13:19:40 +01:00
erogol a1d5a9ddda config update to use noise for augmentation 2021-01-06 13:19:40 +01:00
erogol 022af74d74 update prompt msg 2021-01-06 13:19:40 +01:00
erogol 57ef53bef3 update argument check for non tacotron models 2021-01-06 13:19:40 +01:00
erogol 27a75de15f update processors for loading attention maps 2021-01-06 13:19:40 +01:00
erogol fa6907fa0e update glow-tts parameters and fix rel-attn-win size 2021-01-06 13:19:40 +01:00
erogol 7b20d8cbd3 implement residual BN convolution and add it as an alternative encoder for glow-tts. also generic layers to layers/generic 2021-01-06 13:19:40 +01:00
erogol 973754d893 fix for init glow-tts 2021-01-06 13:19:40 +01:00
erogol f81af4eb0d config update disable guided attention for dynamic conv attention 2021-01-06 13:19:40 +01:00
erogol 29b17c0808 bug fix for gradual training 2021-01-06 13:19:40 +01:00
erogol 5c50e104d6 config update 2021-01-06 13:19:40 +01:00
erogol 6478d552dc tacotron training bug fix 2021-01-06 13:19:40 +01:00
erogol 1dd086577a tacotron training bug fix 2021-01-06 13:18:41 +01:00
erogol fa20638083 config for ljspeech dynamic conv attention 2021-01-06 13:18:41 +01:00
erogol 070146e143 add monotonic dynamic convolution attention 2021-01-06 13:18:41 +01:00
erogol 18392bc13a Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2021-01-06 13:18:08 +01:00
Thorsten Mueller f673f8f74d Added support for npy output from tune-wavegrad 2020-12-19 22:51:22 +01:00
Thorsten Mueller 2aa0354b44 Fix for 'NoneType' object has no attribute 'to' 2020-12-19 22:37:03 +01:00
Thorsten Mueller 28a64221ea Improve robustness on cpu / gpu model mix 2020-12-19 22:23:28 +01:00
erogol 8293751a38 remove mozilla from server page 2020-12-17 12:28:28 +01:00
erogol 639fa29261 update speaker id casting for glow-tts 2020-12-14 16:58:47 +01:00
erogol 999120ecdf Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-12 18:50:14 +01:00
erogol f611e6ac01 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-12 18:47:59 +01:00
Jörg Thalheim 62fd4ca70d
inflect negative numbers correctly 2020-12-10 16:47:51 +01:00
Jörg Thalheim 6646682650
cleaners: expand english time 2020-12-10 14:53:20 +01:00
Jörg Thalheim 76138687d3
expand more currencies 2020-12-10 14:53:20 +01:00
erogol a2859b7ddc update config args checks 2020-12-10 13:52:57 +01:00
erogol 788cd6f902 fix multi-speaker glow-tts inference 2020-12-10 02:05:48 +01:00
erogol 3d5066e2b8 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-10 00:31:03 +01:00
erogol 92cc9630d7 fix glow-tts synthesis for DPP 2020-12-10 00:30:34 +01:00
Eren Gölge 2473b2dc62
Merge pull request #559 from krzim/patch-1
Fix import to grab the encoder model save function
2020-12-10 00:19:32 +01:00
erogol 53679b706d glow-tts distributed fix 2020-12-09 23:39:09 +01:00
erogol 62bc171db5 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-09 15:46:57 +01:00
erogol df180148e9 use noise augmentation in TTSDataset 2020-12-09 15:46:25 +01:00
Thorsten Mueller e39628ce2f Limit filenames to 10 chars 2020-12-08 18:44:19 +01:00
erogol 06612ce305 test fixes 2020-12-07 15:57:34 +01:00
erogol 0252a07fa6 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-07 11:31:55 +01:00
erogol 482e725752 sync torch calls before logging training results 2020-12-07 11:30:19 +01:00
erogol 7505c0ba27 multiprocess phoneme computation 2020-12-07 11:29:41 +01:00
erogol 20c86489d7 make static methods for faster multiprocess call 2020-12-07 11:29:10 +01:00
erogol affe1c1138 setup training scripts for computing phonemes before training optionally. And define data_loaders before starting training and re-use them instead of re-define for every train and eval calls. This is to enable better instance filtering based on input length. 2020-12-07 11:26:57 +01:00
Alexander Korolev f42ca2b73f
Update wavegrad.py
This should fix the issue https://github.com/mozilla/TTS/issues/581
2020-12-04 16:43:39 +01:00
erogol 7c3cdced1a make speaker_mapping a global variable to prevent reload. Fix glow-tts training 2020-12-01 03:23:25 +01:00
Thorsten Mueller 06a389bc08 Added option for saving raw spectrograms 2020-11-27 15:49:55 +01:00
erogol a757b203bc fix longer phoneme seqs 2020-11-26 15:05:03 +01:00
erogol 7b0a93d2f8 fix 2020-11-26 11:44:52 +01:00
erogol 0c6f7e4c77 resample audio if flag set true 2020-11-26 11:30:48 +01:00
erogol f6c96b0ac2 Merge branch 'dev' 2020-11-25 15:29:06 +01:00
erogol e3b7157146 remove contextlib 2020-11-25 15:22:01 +01:00
erogol e3eda159d1 wavegrad_dataset update 2020-11-25 14:50:50 +01:00
erogol a1e4ee18f9 convert float16 to float32 for plotting spectrograms 2020-11-25 14:50:28 +01:00
erogol 7541d2ecaa return eval split optional 2020-11-25 14:50:09 +01:00
erogol 4b92ac0f92 tune_wavegrad update 2020-11-25 14:49:48 +01:00
erogol d8c1b5b73d print max lengths in tacotron training 2020-11-25 14:49:07 +01:00
erogol 1229554c42 use native amp 2020-11-25 14:48:54 +01:00
erogol 8a820930c6 compute_embedding update 2020-11-25 14:46:08 +01:00
erogol aa2b31a1b0 use 'enabled' argument to control autocast 2020-11-17 14:22:01 +01:00
erogol d9d04d892b Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-17 14:17:24 +01:00
erogol 8b0e0846a3 temporary travis check 2020-11-17 14:17:03 +01:00
Qingping Hou b0b97d636f speed up metafile build for voxceleb 2020-11-14 23:45:17 -08:00
erogol a2a142dc39 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-14 13:02:19 +01:00
erogol c65712426a change noise scheduling for wavegrad. Compute beta values externally to enable better flexibility 2020-11-14 13:01:10 +01:00
erogol 5a59467f34 scaler fix for wavegrad and wavernn. Save and load scaler 2020-11-14 13:00:35 +01:00
erogol d8511efa8f use native amp for tacotron training 2020-11-14 12:59:28 +01:00
Qingping Hou 0cc3650ef6 support loading config in yaml 2020-11-14 00:13:53 -08:00
erogol 6cc464ead6 fix a ton of testing bugs 2020-11-12 16:33:29 +01:00
erogol 25551c4634 change wavernn generate to inference 2020-11-12 12:52:52 +01:00
erogol 9b0f441945 argument for returning no eval split 2020-11-12 12:52:27 +01:00
erogol a7aefd5c50 use pytorch amp for mixed precision training for Tacotron 2020-11-12 12:51:56 +01:00
erogol 67e2b664e5 compute embeddings and create speakers.json 2020-11-12 12:51:17 +01:00
erogol f8fd300b3e bug fix 2020-11-10 12:53:39 +01:00
erogol 016d3503da compute embeddings with speaker encoder 2020-11-10 12:51:02 +01:00
erogol 21364331d2 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-09 13:31:12 +01:00
erogol c76a617072 linter updates 2020-11-09 13:18:35 +01:00
erogol ea976b0543 python compat update for contextlib 2020-11-06 13:34:11 +01:00
erogol c80225544e tune wavegrad to find the best noise schedule for inference 2020-11-06 13:04:46 +01:00
erogol d94782a076 reset the way ga_loss is stored in return_dict 2020-11-02 13:18:56 +01:00
erogol a108d0ee81 check nan loss in glow-tts loss 2020-11-02 13:12:19 +01:00
erogol b8ac9aba9d check against NaN loss in tacotron_loss 2020-11-02 12:44:41 +01:00
erogol ef04d7fae7 bug fix for wavernn training 2020-10-30 14:08:41 +01:00
erogol a44ef58aea wavegrad weight norm refactoring 2020-10-30 13:23:24 +01:00
erogol 183fe56d95 Merge branch 'ssim_loss' into dev 2020-10-29 23:49:09 +01:00
krzim 2202e171c5
Fix import to grab the encoder model save function
I saw that this was recently changed but I'm not sure if it should have been. This is the correct function given the arguments provided to it in the train loop.
2020-10-29 18:03:11 -04:00
erogol 73581cd94c renaming train scripts and updating tests 2020-10-29 16:50:07 +01:00
erogol 39c71ee8a9 wavegrad refactoring, fixing tests for glow-tts and wavegrad 2020-10-29 15:47:15 +01:00
erogol 946a0c0fb9 bug fixes for single speaker glow-tts, enable torch based amp. Make amp optional for wavegrad. Bug fixes for synthesis setup for glow-tts 2020-10-29 15:45:50 +01:00
erogol 14c2381207 weight norm and torch based amp training for wavegrad 2020-10-29 12:31:43 +01:00
erogol b76a0be97a wavegrad model and layers refactoring 2020-10-29 12:31:43 +01:00
erogol dc2825dfb2 wavegrad dataset update 2020-10-29 12:31:43 +01:00
erogol 5b5b9fcfdd wavegrad config updates 2020-10-29 12:31:43 +01:00
erogol c8a4c771a8 train wavegrad updates 2020-10-29 12:31:43 +01:00
erogol 670f44aa18 enable compute stats by vocoder config 2020-10-29 12:31:43 +01:00
erogol f79bbbbd00 use Adam for wavegrad instead of RAdam 2020-10-29 12:31:43 +01:00
erogol 7bcdb7ac35 wavegrad updates 2020-10-29 12:31:43 +01:00
erogol a1582a0e12 fix distributed training for train_* scripts 2020-10-29 12:31:43 +01:00
erogol 193b81b273 add universal_fullband_melgan config 2020-10-29 12:30:37 +01:00
erogol e02cd6a220 initial wavegrad layers model and training script 2020-10-29 12:30:37 +01:00
erogol ac57eea928 add wavegrad to vocoder generators 2020-10-29 12:30:37 +01:00
erogol e723b99888 handle distributed model as saving 2020-10-29 12:30:37 +01:00
Eren Gölge 26c18b61c9
Merge pull request #553 from Edresson/dev
bug fix in the inference with GlowTTS
2020-10-28 18:49:31 +01:00
erogol fdaed45f58 optional loss masking for stoptoken predictor 2020-10-28 18:40:54 +01:00
erogol e49cc3bbcd bug fix 2020-10-28 18:34:34 +01:00
erogol 59e1cf99d0 config update and ssim implementation 2020-10-28 18:30:00 +01:00
erogol 9cef923d99 ssim loss for tacotron models 2020-10-28 15:24:18 +01:00
erogol 9d0ae2bfb4 wavernn dataloader handling for short samples and mixed precision training 2020-10-28 12:31:01 +01:00
Edresson f01502a9db bug fix in glowTTS synthesize 2020-10-27 16:30:16 -03:00
Eren Gölge f4b8170bd1
Merge pull request #545 from Edresson/dev
GlowTTS zeroshot TTS support
2020-10-27 15:23:41 +01:00
erogol a6f564c8c8 pylint fixes 2020-10-27 12:35:10 +01:00
erogol 0becef4b58 small updates 2020-10-27 12:17:38 +01:00
sanjaesc 2ee47e9568 fix pylint once again 2020-10-27 12:17:38 +01:00
sanjaesc 1e646135ca add model params to config 2020-10-27 12:17:38 +01:00
sanjaesc bef3f2020b compute audio feat on dataload 2020-10-27 12:17:38 +01:00
sanjaesc 7c72562fe7 fix travis + pylint tests 2020-10-27 12:17:38 +01:00
sanjaesc 91e5f8b63d added to device cpu/gpu + formatting 2020-10-27 12:17:38 +01:00
sanjaesc 016a77fcf2 fix formatting + pylint 2020-10-27 12:17:38 +01:00
erogol 8de7c13708 fix no loss masking loss computation 2020-10-27 12:17:38 +01:00
sanjaesc e8294cb9db fixing pylint errors 2020-10-27 12:17:38 +01:00
sanjaesc 878b7c373e added feature preprocessing if not set in config 2020-10-27 12:17:38 +01:00
sanjaesc e495e03ea1 some minor changes to wavernn 2020-10-27 12:17:38 +01:00
Alex K 9c3c7ce2f8 wavernn stuff... 2020-10-27 12:17:38 +01:00
Alex K 6378fa2b07 add initial wavernn support 2020-10-27 12:17:38 +01:00
Edresson 89e9bfe3a2 add text processing blank token test 2020-10-26 17:41:23 -03:00
Edresson d9540a5857 add blank token in sequence to improve glowtts results 2020-10-25 15:08:28 -03:00
Edresson fbea058c59 add parse speakers function 2020-10-24 16:10:05 -03:00
Edresson 07345099ee GlowTTS zero-shot TTS Support 2020-10-24 15:58:39 -03:00
Alexander Korolev 47d74ced1c
Update losses.py
Seems like in the latest dev merge, this change was reverted. Any specific reason for this?
Without it the problem as stated here https://github.com/mozilla/TTS/issues/473 occurs.
2020-10-23 14:15:01 +02:00