Commit Graph

407 Commits

Author SHA1 Message Date
Eren Gölge af2d36faeb update synthesize.py for multi-speaker setting 2021-04-23 18:04:37 +02:00
Edresson d2b6326b8b change optimizer initialization for compatibility with Hifi-GAN official implementation 2021-04-23 07:54:39 -03:00
Eren Gölge 9cc17be53a formatting and a small bug fix in Tacotron model 2021-04-15 16:36:51 +02:00
Eren Gölge d60a8d7211 show the real waveform on TB too for GAN vocoder training. 2021-04-15 15:30:06 +02:00
Eren Gölge 5fbe926429 change the default TTS model to TacotronDDC 2021-04-15 15:29:44 +02:00
Eren Gölge b11d1cb845 small fixes 2021-04-12 12:40:55 +02:00
Eren Gölge a7f6045644 Merge branch 'reformat' into hifigan-reformat 2021-04-12 12:00:17 +02:00
Eren Gölge f519012dea reformatting and styling 2021-04-12 11:47:39 +02:00
Eren Gölge 5b70da2e3f restore schedulers only if training is continuing a previous training
inherit nn.Module for TorchSTFT
2021-04-09 19:31:28 +02:00
Eren Gölge 105e0b4d62 vocoder gan training fixes 2021-04-09 11:38:04 +02:00
Eren Gölge 18d9ec8036 format with black 2021-04-09 00:54:59 +02:00
Eren Gölge e5b9607bc3 isort all imports 2021-04-09 00:45:20 +02:00
Eren Gölge 0e79fa86ad format with black and pylint 2.7.3 2021-04-09 00:38:08 +02:00
Eren Gölge cd69da4868 linter fixes #2 2021-04-08 16:57:46 +02:00
Eren Gölge 0ee0458309 remove redundant imports 2021-04-08 11:29:15 +02:00
Eren Gölge 4998ece8d8 allow configuration of optimizers from the config file 2021-04-08 11:28:30 +02:00
Eren Gölge 8daf407652 cache empty 2021-04-08 11:28:30 +02:00
Eren Gölge 3fb78c004a move scheduler updates to the end of the epoch 2021-04-08 11:28:30 +02:00
Eren Gölge 2a872c98aa don't call os.exit as it leaves the process resources standing 2021-04-08 11:27:40 +02:00
Eren Gölge 57f6bd1afa make using different samples for G and D networks optional 2021-04-08 11:26:01 +02:00
rishikksh20 e656e8b108 Remove select size bug 2021-04-08 11:20:33 +02:00
rishikksh20 ef6ff4e95c Add Exponential LR scheduler check 2021-04-08 11:20:33 +02:00
Eren Gölge 6ad4eba678 gan vocoder train fix in case of restoring models with no scheduler defined 2021-04-06 16:24:50 +02:00
Eren Gölge b4c2cf80f2 fix eval iter 2021-03-30 14:39:16 +02:00
Eren Gölge a3a840fd78 linter fixes 2021-03-30 14:39:16 +02:00
Eren Gölge 7a382a5c2b stowed aligntts commit and small refactoring with feed_forward layers 2021-03-30 14:39:16 +02:00
Eren Gölge 2b3e12ea49 correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting 2021-03-30 14:39:16 +02:00
Eren Gölge d9c405f0c3 create feedforward folder for SS layers 2021-03-30 14:39:16 +02:00
Eren Gölge ca2f22cdd7 linter fix 2021-03-30 14:36:12 +02:00
Eren Gölge d0dcd7d1b8 let the user define the output.wav file path, fix #393 2021-03-30 14:24:31 +02:00
Eren Gölge 3947750dd9 Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2021-03-18 14:09:47 +01:00
WeberJulian 596ea2c98a Add resample script 2021-03-18 13:33:37 +01:00
Eren Gölge 65533f33e9 fix #374 2021-03-18 13:33:00 +01:00
WeberJulian af96080e17 fix linter issues 2021-03-18 13:33:00 +01:00
WeberJulian f6cd8e0ecc test case 2021-03-18 13:33:00 +01:00
WeberJulian e954e45e57 linter + test 2021-03-18 13:33:00 +01:00
WeberJulian e598977f3d Using path.join instead of concat 2021-03-18 13:33:00 +01:00
WeberJulian c5ef2de73f Add resample script 2021-03-18 13:33:00 +01:00
Eren Gölge babc94f63f fix #374 2021-03-16 19:13:32 +01:00
WeberJulian 11e25a7125 fix linter issues 2021-03-16 19:13:01 +01:00
WeberJulian b94373afb8 test case 2021-03-16 19:13:01 +01:00
WeberJulian 93fdc0729c linter + test 2021-03-16 19:13:01 +01:00
WeberJulian 17f197f51e Using path.join instead of concat 2021-03-16 19:13:01 +01:00
WeberJulian d6749f030f Add resample script 2021-03-16 19:13:01 +01:00
Eren Gölge 6c932c8503 print the desc if required parameters are not provided 2021-03-10 15:19:00 +01:00
Eren Gölge 19bb9ba851 fix tts endpoint using list-models argument 2021-03-09 14:06:09 +01:00
Eren Gölge 94805236fb Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2021-03-08 15:21:06 +01:00
Eren Gölge 9a48ba3821 a ton of linter updates 2021-03-08 05:06:54 +01:00
gerazov 2451a813a2 refactored keep_all_best 2021-03-08 02:57:11 +01:00
gerazov 2db40457e8 brushed up printing model load path and best loss path 2021-03-08 02:56:36 +01:00
gerazov f2e474cd37 loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added 2021-03-08 02:56:36 +01:00
Eren Gölge 8993120634 do not test server and modelManager until fixing #657 2021-03-08 02:54:47 +01:00
Eren Gölge 39fbf2fe84 Update TTS/bin/find_unique_chars.py
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-03-08 02:54:47 +01:00
Eren Gölge ee71eb4eb7 linter fixes 2021-03-08 02:54:47 +01:00
Eren Gölge 62aeacbdd1 save used model characters to the checkpoints 2021-03-08 02:54:47 +01:00
Eren Gölge c6702b5b9f find unique characters in a dataset 2021-03-08 02:54:47 +01:00
Eren Gölge 00e0933f43 save_wav with a custom sampling rate 2021-03-08 02:54:47 +01:00
Eren Gölge 8955333e9d use default vocoder in synthesize.py 2021-03-08 02:54:47 +01:00
Eren Gölge 1c1abb8a9b docstring update 2021-03-08 02:54:47 +01:00
Eren Gölge 43b951018e fix the default vocoder name 2021-03-08 02:54:47 +01:00
Eren Gölge 3c961370e7 linter fixes 2021-03-08 02:54:21 +01:00
gerazov b3c5cc2cdc final fixes 2021-03-08 02:54:21 +01:00
gerazov 10d5a63d49 updated to current dev 2021-03-08 02:54:21 +01:00
gerazov 6f06e31541 changed train scripts 2021-03-08 02:54:21 +01:00
Branislav Gerazov b1e3160884 waveRNN fix 2021-03-08 02:54:21 +01:00
Eren Gölge 08581deb61 linter updates 2021-03-08 02:53:02 +01:00
Thorsten Mueller 167901813d Oops. Added missing , 2021-03-08 02:53:02 +01:00
Eren Gölge 93a6bdfd6c linter fixes and version updates for deps 2021-03-08 02:51:10 +01:00
Thorsten Mueller 3eb00e8d93 Set out_path to be required param. 2021-03-08 02:49:15 +01:00
Alexander Korolev ace430d5e6 fix device mismatch wavegrad training
this should fix the device mismatch as seen here https://github.com/mozilla/TTS/issues/622#issue-789802916
2021-03-08 02:49:15 +01:00
Eren Gölge 83143fbe39 fix #638 2021-03-08 02:48:31 +01:00
Alexander Korolev b4bc5f6eb1 update fixed stopnet_pos_weight parameter
config parameter c.stopnet_pos_weight has currently no effect as it is not used.
2021-03-08 02:48:31 +01:00
Eren Gölge 534e3c67c6 README update, set default models for synthesize.py and server.py. Disable verbose for ap init. 2021-03-08 02:48:31 +01:00
Eren Gölge d0454461de Merge branch 'pr/gerazov/650-2' into dev 2021-02-17 13:40:45 +00:00
Eren Gölge ce0c5eccbd do not test server and modelManager until fixing #657 2021-02-17 00:35:43 +00:00
gerazov 61c88beb94 refactored keep_all_best 2021-02-15 18:40:17 +01:00
Eren Gölge 3b6ce04332 Update TTS/bin/find_unique_chars.py
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-02-15 13:02:29 +01:00
Eren Gölge 420901f4c2 linter fixes 2021-02-12 14:41:17 +00:00
Eren Gölge e774f68aee save used model characters to the checkpoints 2021-02-12 12:03:42 +00:00
gerazov 310d18325e brushed up printing model load path and best loss path 2021-02-12 10:55:45 +01:00
Eren Gölge 8b6fd76ad2 find unique characters in a dataset 2021-02-12 09:46:11 +00:00
gerazov af46727517 loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added 2021-02-12 02:12:00 +01:00
Eren Gölge 1649ad3431 save_wav with a custom sampling rate 2021-02-11 15:27:20 +00:00
Eren Gölge 0657b38111 use default vocoder in synthesize.py 2021-02-11 15:26:17 +00:00
Eren Gölge f1799dbd60 docstring update 2021-02-11 11:25:31 +00:00
Eren Gölge 3c2e13ca5c fix the default vocoder name 2021-02-11 10:36:52 +00:00
Eren Gölge c619859a3f linter fixes 2021-02-09 11:43:17 +00:00
gerazov ad17dc9e76 final fixes 2021-02-06 23:05:01 +01:00
gerazov 8fdd08ea15 updated to current dev 2021-02-06 22:59:52 +01:00
gerazov 2705d27b28 changed train scripts 2021-02-06 22:29:30 +01:00
Eren Gölge f4f6290eec Merge branch 'pr/gerazov/641' into dev 2021-02-05 13:14:49 +00:00
Eren Gölge d49757faaa linter updates 2021-02-05 13:10:43 +00:00
Branislav Gerazov cb77aef36c waveRNN fix 2021-02-04 09:52:03 +01:00
Thorsten Mueller d74866cb8e Merge remote-tracking branch 'upstream/dev' into dev
Fix for circleci error mentioned in PR https://github.com/mozilla/TTS/pull/637
2021-02-02 19:40:18 +01:00
Thorsten Mueller a82152eef3 Oops. Added missing , 2021-02-02 19:29:16 +01:00
Thorsten Mueller 4cb4fcf02c Set out_path to be required param. 2021-02-02 19:29:16 +01:00
Eren Gölge 5c46543765 linter fixes and version updates for deps 2021-02-01 13:18:56 +00:00
Eren Gölge 5beed0ddcd Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2021-02-01 11:27:14 +00:00
Eren Gölge c7407571fa fix #638 2021-02-01 10:05:55 +00:00
Eren Gölge dfdac1def9 Merge pull request #636 from thorstenMueller/dev
Set out_path to be required param in compute_statistics.py.
2021-01-29 18:08:31 +01:00
Thorsten Mueller 44c4a49745 Set out_path to be required param. 2021-01-29 17:23:38 +01:00
Alexander Korolev e81ebec7a8 fix device mismatch wavegrad training
this should fix the device mismatch as seen here https://github.com/mozilla/TTS/issues/622#issue-789802916
2021-01-29 15:18:59 +01:00
Alexander Korolev ca28e05ed7 update fixed stopnet_pos_weight parameter
config parameter c.stopnet_pos_weight has currently no effect as it is not used.
2021-01-27 16:33:25 +01:00
Eren Gölge 25c86ca715 README update, set default models for synthesize.py and server.py. Disable verbose for ap init. 2021-01-27 11:47:03 +01:00
Eren Gölge 877f0bbfba manifest.in update 2021-01-26 02:56:55 +01:00
Eren Gölge 82e029529e fix manifest file 2021-01-25 13:27:54 +01:00
Eren Gölge 57b668fd86 fixing some PyPI issues 2021-01-25 13:06:12 +01:00
Eren Gölge 60c1bb93d9 fixes before first PyPI release 2021-01-25 11:16:20 +01:00
Eren Gölge fae10309e4 Merge pull request #624 from SanjaESC/patch-3
Update train_tacotron.py
2021-01-22 13:29:09 +01:00
Eren Gölge c990b3a59c linter fixes and test fixes 2021-01-22 02:32:35 +01:00
Alexander Korolev f251dc8c0e Update train_tacotron.py
When attempting to fine-tune a model with "prenet_type": "bn" that was originally trained with "prenet_type": "original", a RuntimeError is thrown that stops the training.

By catching the RuntimeError, the required layers can be partially restored and the training will continue without any problems.
2021-01-21 21:16:30 +01:00
Eren Gölge 0ab2eb2664 use synthesizer in both synthesize.py and server.py 2021-01-21 15:54:33 +01:00
Eren Gölge 6b6e989fd2 update server readme 2021-01-21 15:29:46 +01:00
root 3d30dae8f3 .models.json and synthesize.py update for interfacing with model manager 2021-01-20 02:08:58 +00:00
root 7beaacc55b update compute_attention_masks.py 2021-01-13 10:03:57 +00:00
erogol cc2b1e043d docstrings for common layers 2021-01-11 15:06:12 +01:00
erogol d382d759b3 small fixes and test fixes 2021-01-08 15:48:40 +01:00
erogol f352b3534c make noise augmentation optional 2021-01-06 13:19:40 +01:00
erogol d5a0190c4b update copy_config_file to copy_model_files 2021-01-06 13:19:40 +01:00
erogol 8971c59b2d plot eval alignment score right 2021-01-06 13:19:40 +01:00
erogol fede46e96e pylint and test fixes 2021-01-06 13:19:40 +01:00
erogol 2abe3df153 compute_attention_masks.py 2021-01-06 13:19:40 +01:00
erogol cf869e8922 add SS files 2021-01-06 13:19:40 +01:00
erogol 29b17c0808 bug fix for gradual training 2021-01-06 13:19:40 +01:00
erogol 6478d552dc tacotron training bug fix 2021-01-06 13:19:40 +01:00
erogol 1dd086577a tacotron training bug fix 2021-01-06 13:18:41 +01:00
Thorsten Mueller f673f8f74d Added support for npy output from tune-wavegrad 2020-12-19 22:51:22 +01:00
Thorsten Mueller 2aa0354b44 Fix for 'NoneType' object has no attribute 'to' 2020-12-19 22:37:03 +01:00
Thorsten Mueller 28a64221ea Improve robustness on cpu / gpu model mix 2020-12-19 22:23:28 +01:00
Eren Gölge 2473b2dc62 Merge pull request #559 from krzim/patch-1
Fix import to grab the encoder model save function
2020-12-10 00:19:32 +01:00
erogol 53679b706d glow-tts distributed fix 2020-12-09 23:39:09 +01:00
erogol 62bc171db5 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-09 15:46:57 +01:00
erogol df180148e9 use noise augmentation in TTSDataset 2020-12-09 15:46:25 +01:00
Thorsten Mueller e39628ce2f Limit filenames to 10 chars 2020-12-08 18:44:19 +01:00
erogol 06612ce305 test fixes 2020-12-07 15:57:34 +01:00
erogol 0252a07fa6 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-07 11:31:55 +01:00
erogol 482e725752 sync torch calls before logging training results 2020-12-07 11:30:19 +01:00
erogol affe1c1138 setup training scripts for computing phonemes before training optionally. And define data_loaders before starting training and re-use them instead of re-define for every train and eval calls. This is to enable better instance filtering based on input length. 2020-12-07 11:26:57 +01:00
erogol 7c3cdced1a make speaker_mapping a global variable to prevent reload. Fix glow-tts training 2020-12-01 03:23:25 +01:00
Thorsten Mueller 06a389bc08 Added option for saving raw spectrograms 2020-11-27 15:49:55 +01:00
erogol 4b92ac0f92 tune_wavegrad update 2020-11-25 14:49:48 +01:00
erogol d8c1b5b73d print max lengths in tacotron training 2020-11-25 14:49:07 +01:00
erogol 1229554c42 use native amp 2020-11-25 14:48:54 +01:00
erogol 8a820930c6 compute_embedding update 2020-11-25 14:46:08 +01:00
erogol aa2b31a1b0 use 'enabled' argument to control autocast 2020-11-17 14:22:01 +01:00
Qingping Hou b0b97d636f speed up metafile build for voxceleb 2020-11-14 23:45:17 -08:00
erogol a2a142dc39 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-14 13:02:19 +01:00
erogol c65712426a change noise scheduling for wavegrad. Compute beta values externally to enable better flexibility 2020-11-14 13:01:10 +01:00
erogol 5a59467f34 scaler fix for wavegrad and wavernn. Save and load scaler 2020-11-14 13:00:35 +01:00
erogol d8511efa8f use native amp for tacotron training 2020-11-14 12:59:28 +01:00
Qingping Hou 0cc3650ef6 support loading config in yaml 2020-11-14 00:13:53 -08:00
erogol a7aefd5c50 use pytorch amp for mixed precision training for Tacotron 2020-11-12 12:51:56 +01:00
erogol 67e2b664e5 compute embeddings and create speakers.json 2020-11-12 12:51:17 +01:00
erogol f8fd300b3e bug fix 2020-11-10 12:53:39 +01:00
erogol 016d3503da compute embeddings with speaker encoder 2020-11-10 12:51:02 +01:00
erogol 21364331d2 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-09 13:31:12 +01:00
erogol c76a617072 linter updates 2020-11-09 13:18:35 +01:00
erogol c80225544e tune wavegrad to find the best noise schedule for inference 2020-11-06 13:04:46 +01:00
erogol ef04d7fae7 bug fix for wavernn training 2020-10-30 14:08:41 +01:00
erogol 183fe56d95 Merge branch 'ssim_loss' into dev 2020-10-29 23:49:09 +01:00
krzim 2202e171c5 Fix import to grab the encoder model save function
I saw that this was recently changed but I'm not sure if it should have been. This is the correct function given the arguments provided to it in the train loop.
2020-10-29 18:03:11 -04:00
erogol 73581cd94c renaming train scripts and updating tests 2020-10-29 16:50:07 +01:00
erogol 946a0c0fb9 bug fixes for single speaker glow-tts, enable torch based amp. Make amp optional for wavegrad. Bug fixes for synthesis setup for glow-tts 2020-10-29 15:45:50 +01:00
erogol 14c2381207 weight norm and torch based amp training for wavegrad 2020-10-29 12:31:43 +01:00
erogol c8a4c771a8 train wavegrad updates 2020-10-29 12:31:43 +01:00
erogol 670f44aa18 enable compute stats by vocoder config 2020-10-29 12:31:43 +01:00
erogol f79bbbbd00 use Adam for wavegrad instead of RAdam 2020-10-29 12:31:43 +01:00
erogol 7bcdb7ac35 wavegrad updates 2020-10-29 12:31:43 +01:00
erogol a1582a0e12 fix distributed training for train_* scripts 2020-10-29 12:31:43 +01:00
erogol e02cd6a220 initial wavegrad layers model and training script 2020-10-29 12:30:37 +01:00
erogol e723b99888 handle distributed model when saving 2020-10-29 12:30:37 +01:00
Eren Gölge 26c18b61c9 Merge pull request #553 from Edresson/dev
bug fix in the inference with GlowTTS
2020-10-28 18:49:31 +01:00
erogol 9d0ae2bfb4 wavernn dataloader handling for short samples and mixed precision training 2020-10-28 12:31:01 +01:00
Edresson f01502a9db bug fix in glowTTS synthesize 2020-10-27 16:30:16 -03:00
Eren Gölge f4b8170bd1 Merge pull request #545 from Edresson/dev
GlowTTS zeroshot TTS support
2020-10-27 15:23:41 +01:00
erogol 0becef4b58 small updates 2020-10-27 12:17:38 +01:00
sanjaesc 2ee47e9568 fix pylint once again 2020-10-27 12:17:38 +01:00
sanjaesc bef3f2020b compute audio feat on dataload 2020-10-27 12:17:38 +01:00
sanjaesc 91e5f8b63d added to device cpu/gpu + formatting 2020-10-27 12:17:38 +01:00
sanjaesc 016a77fcf2 fix formatting + pylint 2020-10-27 12:17:38 +01:00
sanjaesc e8294cb9db fixing pylint errors 2020-10-27 12:17:38 +01:00
sanjaesc 878b7c373e added feature preprocessing if not set in config 2020-10-27 12:17:38 +01:00
sanjaesc e495e03ea1 some minor changes to wavernn 2020-10-27 12:17:38 +01:00
Alex K 6378fa2b07 add initial wavernn support 2020-10-27 12:17:38 +01:00
Edresson d9540a5857 add blank token in sequence to improve glowtts results 2020-10-25 15:08:28 -03:00
Edresson fbea058c59 add parse speakers function 2020-10-24 16:10:05 -03:00
Edresson 07345099ee GlowTTS zero-shot TTS Support 2020-10-24 15:58:39 -03:00
Edresson b7f9ebd32b add check arguments for GlowTTS and multispeaker training bug fix 2020-10-19 17:17:58 -03:00
erogol c5074cfd8e general purpose distribute.py 2020-10-08 01:30:42 +02:00
Edresson 99d5a0ac07 add Speaker Conditional GST support 2020-09-29 16:09:27 -03:00
erogol 154f90bc44 format speaker encoder imports 2020-09-28 11:19:19 +02:00
mueller91 cfeeef7a7f fix: broken imports and missing files after merging in latest commits from mozilla/dev into mueller91/dev.
speaker_encoder's config.json and visuals.py are missing in the current dev branch of MozillaTTS, and some imports are broken.
2020-09-22 20:10:41 +02:00
mueller91 1fe5eb054f Merge branch 'dev' of https://github.com/mozilla/TTS into dev
 Conflicts:
	TTS/bin/train_encoder.py
	requirements.txt
2020-09-22 19:58:53 +02:00
mueller91 df4caec4b7 add: check_config for speaker_encoder 2020-09-22 19:52:09 +02:00
erogol 10258724d1 linter fixes 2020-09-22 03:54:16 +02:00
erogol a6df617eb1 Merge branch 'glow-tts-amp-time_depth_conv' into dev 2020-09-21 14:23:45 +02:00
erogol 8150d5727e Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-09-21 14:21:55 +02:00
erogol e0b9fa887f glow-tts modules added 2020-09-21 14:15:40 +02:00
mueller 6b0621c794 cleanup 2020-09-17 16:46:43 +02:00
mueller a273b1a210 add: add random noise to dataset 2020-09-17 14:23:40 +02:00
mueller e36a3067e4 add: save wavs instead of feats to storage.
This is done in order to mitigate staleness when caching and loading from data storage
2020-09-17 14:14:30 +02:00
mueller 1511076fde add: Configurable encoder dataset storage to reduce disk I/O
add: Averaged time for data loader to console and Tensorboard output
2020-09-17 12:29:38 +02:00
maxbachmann 60ce862113 use difflib for string matching 2020-09-14 23:55:34 +02:00
erogol 498a3ea36f fix condition check 2020-09-12 03:39:01 +02:00
erogol 15e6ab3912 glow-tts module renaming updates 2020-09-12 03:33:36 +02:00
erogol f9001a4bdd refactor and fix compat issues for speaker encoder 2020-09-11 17:17:07 +02:00
erogol df19428ec6 rename the project to old TTS 2020-09-09 12:27:23 +02:00