Commit Graph

394 Commits

Author SHA1 Message Date
Edresson 856ea19758 bug fix in dataloader and update inference 2021-05-18 03:43:16 -03:00
Eren Gölge 12722501bb styling 2021-05-15 23:48:31 +02:00
Edresson 3433c2f348 add compute embedding for the new speaker encoder 2021-05-12 03:06:46 -03:00
Eren Gölge 715b0a65a0 update main.yml for python x64
fix test
2021-05-12 00:57:29 +02:00
Edresson 3fcc748b2e implement the Speaker Encoder H/ASP 2021-05-11 16:27:05 -03:00
Eren Gölge 843d1b3d98 linter fixes 2021-05-11 11:30:00 +02:00
Eren Gölge 19fb1d743d style update 2021-05-11 11:30:00 +02:00
Eren Gölge 9f7599e3c3 fix train_encoder for coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge 3fde2001b1 train_encoder refactoring for coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge 9ee70af9bb code styling 2021-05-11 11:29:18 +02:00
Eren Gölge 78b3825d0b update train scripts for coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge e6f45b9eb7 update train_vocoder_gan.py for coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge bcebd69d09 remove bash tts training tests 2021-05-11 11:29:17 +02:00
Eren Gölge 7227e8f1d2 update train_align_tts.py for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 720fe13056 update glow_tts modules and training script for coqpit use 2021-05-11 11:29:17 +02:00
Eren Gölge 35341d5482 move bash script based tests to python with coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge eaa130e813 fix tacotron for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 65d7ad4250 refactor train_speedy_speech.py for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 9c18e40f64 black formatting 2021-05-11 11:29:17 +02:00
Eren Gölge c34c8137d7 update compute_statistics for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 79d7215142 config refactor #5 WIP 2021-05-11 11:29:17 +02:00
Eren Gölge dc50f5f0b0 config refactor #4 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge 97bd5f9734 [ci skip] config update #3 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge a21c0b5585 config update 2 WIP 2021-05-11 11:28:35 +02:00
Edresson 85ccad7e0a add Audio data augmentation Additive and RIR 2021-05-11 00:59:57 -03:00
Edresson 77d85c6cc5 add softmaxproto loss and bug fix in data loader 2021-05-10 17:08:38 -03:00
Eren Gölge f7582107da Merge pull request #453 from Edresson/dev
Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS.
2021-05-06 17:53:28 +02:00
Edresson 501c8e0302 remove unused vars on extract tts spectrograms script 2021-05-04 19:04:13 -03:00
Eren Gölge 87d674a038 bumpup librosa version to 0.8.0 2021-05-03 14:25:09 +02:00
Edresson 3ecd556bbe add unit test for extract tts spectrograms script 2021-05-01 13:41:56 -03:00
Edresson 446b1da936 create inference function 2021-04-29 18:18:37 -03:00
Eren Gölge 1235e54738 test for synthesize.py 2021-04-27 14:17:38 +02:00
Eren Gölge 2f0716073e enable multi-speaker CoquiTTS models for synthesize.py 2021-04-26 19:36:53 +02:00
Edresson 20e42a3381 add save audio option 2021-04-23 15:00:00 -03:00
Edresson 8228091f92 add script for extraction of tts spectrograms 2021-04-23 14:17:46 -03:00
Eren Gölge 4cf211348d styling and linting 2021-04-23 18:04:37 +02:00
Eren Gölge 179722e3a7 new arguments to synthesize.py for loading speaker encoder and speaker wavs 2021-04-23 18:04:37 +02:00
Eren Gölge af2d36faeb update synthesize.py for multi-speaker setting 2021-04-23 18:04:37 +02:00
Edresson d2b6326b8b change optimizer initialization for compatibility with Hifi-GAN official implementation 2021-04-23 07:54:39 -03:00
Eren Gölge 9cc17be53a formatting and a small bug fix in Tacotron model 2021-04-15 16:36:51 +02:00
Eren Gölge d60a8d7211 show the real waveform on TB too for GAN vocoder training. 2021-04-15 15:30:06 +02:00
Eren Gölge 5fbe926429 change the default TTS model to TacotronDDC 2021-04-15 15:29:44 +02:00
Eren Gölge b11d1cb845 small fixes 2021-04-12 12:40:55 +02:00
Eren Gölge a7f6045644 Merge branch 'reformat' into hifigan-reformat 2021-04-12 12:00:17 +02:00
Eren Gölge f519012dea reformatting and styling 2021-04-12 11:47:39 +02:00
Eren Gölge 5b70da2e3f restore schedulers only if training is continuing a previous training
inherit nn.Module for TorchSTFT
2021-04-09 19:31:28 +02:00
Eren Gölge 105e0b4d62 vocoder gan training fixes 2021-04-09 11:38:04 +02:00
Eren Gölge 18d9ec8036 format with black 2021-04-09 00:54:59 +02:00
Eren Gölge e5b9607bc3 isort all imports 2021-04-09 00:45:20 +02:00
Eren Gölge 0e79fa86ad format with black and pylint 2.7.3 2021-04-09 00:38:08 +02:00
Eren Gölge cd69da4868 linter fixes #2 2021-04-08 16:57:46 +02:00
Eren Gölge 0ee0458309 remove redundant imports 2021-04-08 11:29:15 +02:00
Eren Gölge 4998ece8d8 allow configuration of optimizers from the config file 2021-04-08 11:28:30 +02:00
Eren Gölge 8daf407652 cache empty 2021-04-08 11:28:30 +02:00
Eren Gölge 3fb78c004a move scheduler updates to the end of the epoch 2021-04-08 11:28:30 +02:00
Eren Gölge 2a872c98aa don't call os.exit as it leaves the process resources standing 2021-04-08 11:27:40 +02:00
Eren Gölge 57f6bd1afa make using different samples for G and D networks optional 2021-04-08 11:26:01 +02:00
rishikksh20 e656e8b108 Remove select size bug 2021-04-08 11:20:33 +02:00
rishikksh20 ef6ff4e95c Add Exponential LR scheduler check 2021-04-08 11:20:33 +02:00
Eren Gölge 6ad4eba678 gan vocoder train fix in case of restoring models with no scheduler defined 2021-04-06 16:24:50 +02:00
Eren Gölge b4c2cf80f2 fix eval iter 2021-03-30 14:39:16 +02:00
Eren Gölge a3a840fd78 linter fixes 2021-03-30 14:39:16 +02:00
Eren Gölge 7a382a5c2b stowed aligntts commit and small refactoring with feed_forward layers 2021-03-30 14:39:16 +02:00
Eren Gölge 2b3e12ea49 correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting 2021-03-30 14:39:16 +02:00
Eren Gölge d9c405f0c3 create feedforward folder for SS layers 2021-03-30 14:39:16 +02:00
Eren Gölge ca2f22cdd7 linter fix 2021-03-30 14:36:12 +02:00
Eren Gölge d0dcd7d1b8 let the user define output.wav file path fix #393 2021-03-30 14:24:31 +02:00
Eren Gölge 3947750dd9 Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2021-03-18 14:09:47 +01:00
WeberJulian 596ea2c98a Add resample script 2021-03-18 13:33:37 +01:00
Eren Gölge 65533f33e9 fix #374 2021-03-18 13:33:00 +01:00
WeberJulian af96080e17 fix linter issues 2021-03-18 13:33:00 +01:00
WeberJulian f6cd8e0ecc test case 2021-03-18 13:33:00 +01:00
WeberJulian e954e45e57 linter + test 2021-03-18 13:33:00 +01:00
WeberJulian e598977f3d Using path.join instead of concat 2021-03-18 13:33:00 +01:00
WeberJulian c5ef2de73f Add resample script 2021-03-18 13:33:00 +01:00
Eren Gölge babc94f63f fix #374 2021-03-16 19:13:32 +01:00
WeberJulian 11e25a7125 fix linter issues 2021-03-16 19:13:01 +01:00
WeberJulian b94373afb8 test case 2021-03-16 19:13:01 +01:00
WeberJulian 93fdc0729c linter + test 2021-03-16 19:13:01 +01:00
WeberJulian 17f197f51e Using path.join instead of concat 2021-03-16 19:13:01 +01:00
WeberJulian d6749f030f Add resample script 2021-03-16 19:13:01 +01:00
Eren Gölge 6c932c8503 print the desc if required parameters are not provided 2021-03-10 15:19:00 +01:00
Eren Gölge 19bb9ba851 fix tts endpoint using list-models argument 2021-03-09 14:06:09 +01:00
Eren Gölge 94805236fb Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2021-03-08 15:21:06 +01:00
Eren Gölge 9a48ba3821 a ton of linter updates 2021-03-08 05:06:54 +01:00
gerazov 2451a813a2 refactored keep_all_best 2021-03-08 02:57:11 +01:00
gerazov 2db40457e8 brushed up printing model load path and best loss path 2021-03-08 02:56:36 +01:00
gerazov f2e474cd37 loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added 2021-03-08 02:56:36 +01:00
Eren Gölge 8993120634 do not test server and modelManager until fixing #657 2021-03-08 02:54:47 +01:00
Eren Gölge 39fbf2fe84 Update TTS/bin/find_unique_chars.py
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-03-08 02:54:47 +01:00
Eren Gölge ee71eb4eb7 linter fixes 2021-03-08 02:54:47 +01:00
Eren Gölge 62aeacbdd1 save used model characters to the checkpoints 2021-03-08 02:54:47 +01:00
Eren Gölge c6702b5b9f find unique characters in a dataset 2021-03-08 02:54:47 +01:00
Eren Gölge 00e0933f43 save_wav with a custom sampling rate 2021-03-08 02:54:47 +01:00
Eren Gölge 8955333e9d use default vocoder in synthesize.py 2021-03-08 02:54:47 +01:00
Eren Gölge 1c1abb8a9b docstring update 2021-03-08 02:54:47 +01:00
Eren Gölge 43b951018e fix the default vocoder name 2021-03-08 02:54:47 +01:00
Eren Gölge 3c961370e7 linter fixes 2021-03-08 02:54:21 +01:00
gerazov b3c5cc2cdc final fixes 2021-03-08 02:54:21 +01:00
gerazov 10d5a63d49 updated to current dev 2021-03-08 02:54:21 +01:00
gerazov 6f06e31541 changed train scripts 2021-03-08 02:54:21 +01:00
Branislav Gerazov b1e3160884 waveRNN fix 2021-03-08 02:54:21 +01:00
Eren Gölge 08581deb61 linter updates 2021-03-08 02:53:02 +01:00
Thorsten Mueller 167901813d Ups. Added missing , 2021-03-08 02:53:02 +01:00
Eren Gölge 93a6bdfd6c linter fixes and version updates for deps 2021-03-08 02:51:10 +01:00
Thorsten Mueller 3eb00e8d93 Set out_path to be required param. 2021-03-08 02:49:15 +01:00
Alexander Korolev ace430d5e6 fix device mismatch wavegrad training
this should fix the device mismatch as seen here https://github.com/mozilla/TTS/issues/622#issue-789802916
2021-03-08 02:49:15 +01:00
Eren Gölge 83143fbe39 fix #638 2021-03-08 02:48:31 +01:00
Alexander Korolev b4bc5f6eb1 update fixed stopnet_pos_weight parameter
config parameter c.stopnet_pos_weight has currently no effect as it is not used.
2021-03-08 02:48:31 +01:00
Eren Gölge 534e3c67c6 README update, set default models for synthesize.py and server.py. Disable verbose for ap init. 2021-03-08 02:48:31 +01:00
Eren Gölge d0454461de Merge branch 'pr/gerazov/650-2' into dev 2021-02-17 13:40:45 +00:00
Eren Gölge ce0c5eccbd do not test server and modelManager until fixing #657 2021-02-17 00:35:43 +00:00
gerazov 61c88beb94 refactored keep_all_best 2021-02-15 18:40:17 +01:00
Eren Gölge 3b6ce04332 Update TTS/bin/find_unique_chars.py
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-02-15 13:02:29 +01:00
Eren Gölge 420901f4c2 linter fixes 2021-02-12 14:41:17 +00:00
Eren Gölge e774f68aee save used model characters to the checkpoints 2021-02-12 12:03:42 +00:00
gerazov 310d18325e brushed up printing model load path and best loss path 2021-02-12 10:55:45 +01:00
Eren Gölge 8b6fd76ad2 find unique characters in a dataset 2021-02-12 09:46:11 +00:00
gerazov af46727517 loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added 2021-02-12 02:12:00 +01:00
Eren Gölge 1649ad3431 save_wav with a custom sampling rate 2021-02-11 15:27:20 +00:00
Eren Gölge 0657b38111 use default vocoder in synthesize.py 2021-02-11 15:26:17 +00:00
Eren Gölge f1799dbd60 docstring update 2021-02-11 11:25:31 +00:00
Eren Gölge 3c2e13ca5c fix the default vocoder name 2021-02-11 10:36:52 +00:00
Eren Gölge c619859a3f linter fixes 2021-02-09 11:43:17 +00:00
gerazov ad17dc9e76 final fixes 2021-02-06 23:05:01 +01:00
gerazov 8fdd08ea15 updated to current dev 2021-02-06 22:59:52 +01:00
gerazov 2705d27b28 changed train scripts 2021-02-06 22:29:30 +01:00
Eren Gölge f4f6290eec Merge branch 'pr/gerazov/641' into dev 2021-02-05 13:14:49 +00:00
Eren Gölge d49757faaa linter updates 2021-02-05 13:10:43 +00:00
Branislav Gerazov cb77aef36c waveRNN fix 2021-02-04 09:52:03 +01:00
Thorsten Mueller d74866cb8e Merge remote-tracking branch 'upstream/dev' into dev
Fix for circleci error mentioned in PR https://github.com/mozilla/TTS/pull/637
2021-02-02 19:40:18 +01:00
Thorsten Mueller a82152eef3 Ups. Added missing , 2021-02-02 19:29:16 +01:00
Thorsten Mueller 4cb4fcf02c Set out_path to be required param. 2021-02-02 19:29:16 +01:00
Eren Gölge 5c46543765 linter fixes and version updates for deps 2021-02-01 13:18:56 +00:00
Eren Gölge 5beed0ddcd Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2021-02-01 11:27:14 +00:00
Eren Gölge c7407571fa fix #638 2021-02-01 10:05:55 +00:00
Eren Gölge dfdac1def9 Merge pull request #636 from thorstenMueller/dev
Set out_path to be required param in compute_statistics.py.
2021-01-29 18:08:31 +01:00
Thorsten Mueller 44c4a49745 Set out_path to be required param. 2021-01-29 17:23:38 +01:00
Alexander Korolev e81ebec7a8 fix device mismatch wavegrad training
this should fix the device mismatch as seen here https://github.com/mozilla/TTS/issues/622#issue-789802916
2021-01-29 15:18:59 +01:00
Alexander Korolev ca28e05ed7 update fixed stopnet_pos_weight parameter
config parameter c.stopnet_pos_weight has currently no effect as it is not used.
2021-01-27 16:33:25 +01:00
Eren Gölge 25c86ca715 README update, set default models for synthesize.py and server.py. Disable verbose for ap init. 2021-01-27 11:47:03 +01:00
Eren Gölge 877f0bbfba manifest.in update 2021-01-26 02:56:55 +01:00
Eren Gölge 82e029529e fix manifest file 2021-01-25 13:27:54 +01:00
Eren Gölge 57b668fd86 fixing some pypi issues 2021-01-25 13:06:12 +01:00
Eren Gölge 60c1bb93d9 fixes before first PyPI release 2021-01-25 11:16:20 +01:00
Eren Gölge fae10309e4 Merge pull request #624 from SanjaESC/patch-3
Update train_tacotron.py
2021-01-22 13:29:09 +01:00
Eren Gölge c990b3a59c linter fixes and test fixes 2021-01-22 02:32:35 +01:00
Alexander Korolev f251dc8c0e Update train_tacotron.py
When attempting to fine-tune a model with "prenet_type": "bn" that was originally trained with "prenet_type": "original", a RuntimeError is thrown that stops the training.

By catching the RuntimeError, the required layers can be partially restored and the training will continue without any problems.
2021-01-21 21:16:30 +01:00
Eren Gölge 0ab2eb2664 use synthesizer in both synthesize.py and server.py 2021-01-21 15:54:33 +01:00
Eren Gölge 6b6e989fd2 update server readme 2021-01-21 15:29:46 +01:00
root 3d30dae8f3 .models.json and synthesize.py update for interfacing with model manager 2021-01-20 02:08:58 +00:00
root 7beaacc55b update compute_attention_masks.py 2021-01-13 10:03:57 +00:00
erogol cc2b1e043d docstrings for common layers 2021-01-11 15:06:12 +01:00
erogol d382d759b3 small fixes and test fixes 2021-01-08 15:48:40 +01:00
erogol f352b3534c make noise augmentation optional 2021-01-06 13:19:40 +01:00
erogol d5a0190c4b update copy_config_file to copy_model_files 2021-01-06 13:19:40 +01:00
erogol 8971c59b2d plot eval alignment score right 2021-01-06 13:19:40 +01:00
erogol fede46e96e pylint and test fixes 2021-01-06 13:19:40 +01:00
erogol 2abe3df153 compute_attention_masks.py 2021-01-06 13:19:40 +01:00
erogol cf869e8922 add SS files 2021-01-06 13:19:40 +01:00
erogol 29b17c0808 bug fix for gradual training 2021-01-06 13:19:40 +01:00
erogol 6478d552dc tacotron training bug fix 2021-01-06 13:19:40 +01:00
erogol 1dd086577a tacotron training bug fix 2021-01-06 13:18:41 +01:00
Thorsten Mueller f673f8f74d Added support for npy output from tune-wavegrad 2020-12-19 22:51:22 +01:00
Thorsten Mueller 2aa0354b44 Fix for 'NoneType' object has no attribute 'to' 2020-12-19 22:37:03 +01:00
Thorsten Mueller 28a64221ea Improve robustness on cpu / gpu model mix 2020-12-19 22:23:28 +01:00
Eren Gölge 2473b2dc62 Merge pull request #559 from krzim/patch-1
Fix import to grab the encoder model save function
2020-12-10 00:19:32 +01:00
erogol 53679b706d glow-tts distributed fix 2020-12-09 23:39:09 +01:00
erogol 62bc171db5 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-09 15:46:57 +01:00
erogol df180148e9 use noise augmentation in TTSDataset 2020-12-09 15:46:25 +01:00
Thorsten Mueller e39628ce2f Limit filenames to 10 chars 2020-12-08 18:44:19 +01:00
erogol 06612ce305 test fixes 2020-12-07 15:57:34 +01:00
erogol 0252a07fa6 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-07 11:31:55 +01:00
erogol 482e725752 sync torch calls before logging training results 2020-12-07 11:30:19 +01:00
erogol affe1c1138 setup training scripts for computing phonemes before training optionally. And define data_loaders before starting training and re-use them instead of re-define for every train and eval calls. This is to enable better instance filtering based on input length. 2020-12-07 11:26:57 +01:00
erogol 7c3cdced1a make speaker_mapping a global variable to prevent reload. Fix glow-tts training 2020-12-01 03:23:25 +01:00
Thorsten Mueller 06a389bc08 Added option for saving raw spectrograms 2020-11-27 15:49:55 +01:00
erogol 4b92ac0f92 tune_wavegrad update 2020-11-25 14:49:48 +01:00
erogol d8c1b5b73d print max lengths in tacotron training 2020-11-25 14:49:07 +01:00
erogol 1229554c42 use native amp 2020-11-25 14:48:54 +01:00
erogol 8a820930c6 compute_embedding update 2020-11-25 14:46:08 +01:00
erogol aa2b31a1b0 use 'enabled' argument to control autocast 2020-11-17 14:22:01 +01:00
Qingping Hou b0b97d636f speed up metafile build for voxceleb 2020-11-14 23:45:17 -08:00
erogol a2a142dc39 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-14 13:02:19 +01:00
erogol c65712426a change noise scheduling for wavegrad. Compute beta values externally to enable better flexibility 2020-11-14 13:01:10 +01:00
erogol 5a59467f34 scaler fix for wavegrad and wavernn. Save and load scaler 2020-11-14 13:00:35 +01:00
erogol d8511efa8f use native amp for tacotron training 2020-11-14 12:59:28 +01:00
Qingping Hou 0cc3650ef6 support loading config in yaml 2020-11-14 00:13:53 -08:00
erogol a7aefd5c50 use pytorch amp for mixed precision training for Tacotron 2020-11-12 12:51:56 +01:00
erogol 67e2b664e5 compute embeddings and create speakers.json 2020-11-12 12:51:17 +01:00
erogol f8fd300b3e bug fix 2020-11-10 12:53:39 +01:00
erogol 016d3503da compute embeddings with speaker encoder 2020-11-10 12:51:02 +01:00
erogol 21364331d2 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-09 13:31:12 +01:00
erogol c76a617072 linter updates 2020-11-09 13:18:35 +01:00
erogol c80225544e tune wavegrad to find the best noise schedule for inference 2020-11-06 13:04:46 +01:00
erogol ef04d7fae7 bug fix for wavernn training 2020-10-30 14:08:41 +01:00
erogol 183fe56d95 Merge branch 'ssim_loss' into dev 2020-10-29 23:49:09 +01:00
krzim 2202e171c5 Fix import to grab the encoder model save function
I saw that this was recently changed but I'm not sure if it should have been. This is the correct function given the arguments provided to it in the train loop.
2020-10-29 18:03:11 -04:00
erogol 73581cd94c renaming train scripts and updating tests 2020-10-29 16:50:07 +01:00
erogol 946a0c0fb9 bug fixes for single speaker glow-tts, enable torch based amp. Make amp optional for wavegrad. Bug fixes for synthesis setup for glow-tts 2020-10-29 15:45:50 +01:00