Commit Graph

1871 Commits

Author SHA1 Message Date
Thorsten Mueller 2aa0354b44 Fix for 'NoneType' object has no attribute 'to' 2020-12-19 22:37:03 +01:00
Thorsten Mueller 28a64221ea Improve robostness on cpu / gpu model mix 2020-12-19 22:23:28 +01:00
erogol 8293751a38 remove mozilla from server page 2020-12-17 12:28:28 +01:00
erogol 639fa29261 update speaker id casting for glow-tts 2020-12-14 16:58:47 +01:00
erogol 999120ecdf Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-12 18:50:14 +01:00
erogol f611e6ac01 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-12 18:47:59 +01:00
Jörg Thalheim 62fd4ca70d inflect negative numbers correctly 2020-12-10 16:47:51 +01:00
Jörg Thalheim 6646682650 cleaners: expand english time 2020-12-10 14:53:20 +01:00
Jörg Thalheim 76138687d3 expand more currencies 2020-12-10 14:53:20 +01:00
erogol a2859b7ddc update config args checks 2020-12-10 13:52:57 +01:00
erogol 788cd6f902 fix multi-speaker glow-tts inference 2020-12-10 02:05:48 +01:00
erogol 3d5066e2b8 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-10 00:31:03 +01:00
erogol 92cc9630d7 fix glow-tts synthesis for DPP 2020-12-10 00:30:34 +01:00
Eren Gölge 2473b2dc62 Merge pull request #559 from krzim/patch-1
Fix import to grab the encoder model save function
2020-12-10 00:19:32 +01:00
erogol 53679b706d glow-tts distributed fix 2020-12-09 23:39:09 +01:00
erogol 62bc171db5 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-09 15:46:57 +01:00
erogol df180148e9 use noise augmentation in TTSDataset 2020-12-09 15:46:25 +01:00
Thorsten Mueller e39628ce2f Limit filenames to 10 chars 2020-12-08 18:44:19 +01:00
erogol 06612ce305 test fixes 2020-12-07 15:57:34 +01:00
erogol 0252a07fa6 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-07 11:31:55 +01:00
erogol 482e725752 sync torch calls before logging training results 2020-12-07 11:30:19 +01:00
erogol 7505c0ba27 muliprocess phoneme computation 2020-12-07 11:29:41 +01:00
erogol 20c86489d7 make static methods for faster multiprocess call 2020-12-07 11:29:10 +01:00
erogol affe1c1138 setup training scripts for computing phonemes before training optionally. And define data_loaders before starting training and re-use them instead of re-define for every train and eval calls. This is to enable better instance filtering based on input length. 2020-12-07 11:26:57 +01:00
Alexander Korolev f42ca2b73f Update wavegrad.py
This should fix the issue https://github.com/mozilla/TTS/issues/581
2020-12-04 16:43:39 +01:00
erogol 7c3cdced1a make speaker_mapping a global variable to prevent reload. Fix glow-tts training 2020-12-01 03:23:25 +01:00
Thorsten Mueller 06a389bc08 Added option for saving raw spectograms 2020-11-27 15:49:55 +01:00
erogol a757b203bc fix longer phoneme seqs 2020-11-26 15:05:03 +01:00
erogol 7b0a93d2f8 fix 2020-11-26 11:44:52 +01:00
erogol 0c6f7e4c77 resample audio if flag set true 2020-11-26 11:30:48 +01:00
erogol f6c96b0ac2 Merge branch 'dev' 2020-11-25 15:29:06 +01:00
erogol e3b7157146 remove contextlib 2020-11-25 15:22:01 +01:00
erogol e3eda159d1 wavegrad_dataset update 2020-11-25 14:50:50 +01:00
erogol a1e4ee18f9 convert float16 to float32 for plotting spectrograms 2020-11-25 14:50:28 +01:00
erogol 7541d2ecaa return eval split optional 2020-11-25 14:50:09 +01:00
erogol 4b92ac0f92 tune_wavegrad update 2020-11-25 14:49:48 +01:00
erogol d8c1b5b73d print max lengths in tacotron training 2020-11-25 14:49:07 +01:00
erogol 1229554c42 use native amp 2020-11-25 14:48:54 +01:00
erogol 8a820930c6 compute_embedding update 2020-11-25 14:46:08 +01:00
erogol aa2b31a1b0 use 'enabled' argument to control autocast 2020-11-17 14:22:01 +01:00
erogol d9d04d892b Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-17 14:17:24 +01:00
erogol 8b0e0846a3 temporary travis check 2020-11-17 14:17:03 +01:00
Qingping Hou b0b97d636f speed up metafile build for voxceleb 2020-11-14 23:45:17 -08:00
erogol a2a142dc39 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-14 13:02:19 +01:00
erogol c65712426a change noise scheduling for wavegrad. Compute beta values externally to enable better flexibility 2020-11-14 13:01:10 +01:00
erogol 5a59467f34 scaler fix for wavegrad and wavernn. Save and load scaler 2020-11-14 13:00:35 +01:00
erogol d8511efa8f use native amp for tacotron training 2020-11-14 12:59:28 +01:00
Qingping Hou 0cc3650ef6 support loading config in yaml 2020-11-14 00:13:53 -08:00
erogol 6cc464ead6 fix ton of tesnting bugs 2020-11-12 16:33:29 +01:00
erogol 25551c4634 change wavernn generate to inference 2020-11-12 12:52:52 +01:00
erogol 9b0f441945 argument for returning no eval split 2020-11-12 12:52:27 +01:00
erogol a7aefd5c50 use pytorch amp for mixed precision training for Tacotron 2020-11-12 12:51:56 +01:00
erogol 67e2b664e5 compute embeddings and create speakers.json 2020-11-12 12:51:17 +01:00
erogol f8fd300b3e bug fix 2020-11-10 12:53:39 +01:00
erogol 016d3503da compute embeddings with speaker encoder 2020-11-10 12:51:02 +01:00
erogol 21364331d2 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-09 13:31:12 +01:00
erogol c76a617072 linter updates 2020-11-09 13:18:35 +01:00
erogol ea976b0543 python compat update for contextlib 2020-11-06 13:34:11 +01:00
erogol c80225544e tune wavegrad to fine the best noise schedule for inferece 2020-11-06 13:04:46 +01:00
erogol d94782a076 reset the way ga_loss is stored in return_dict 2020-11-02 13:18:56 +01:00
erogol a108d0ee81 check nan loss in glow-tts loss 2020-11-02 13:12:19 +01:00
erogol b8ac9aba9d check against NaN loss in tacotron_loss 2020-11-02 12:44:41 +01:00
erogol ef04d7fae7 bug fix for wavernn training 2020-10-30 14:08:41 +01:00
erogol a44ef58aea wavegrad weight norm refactoring 2020-10-30 13:23:24 +01:00
erogol 183fe56d95 Merge branch 'ssim_loss' into dev 2020-10-29 23:49:09 +01:00
krzim 2202e171c5 Fix import to grab the encoder model save function
I saw that this was recently changed but I'm not sure if it should have been. This is the correct function given the arguments provided to it in the train loop.
2020-10-29 18:03:11 -04:00
erogol 73581cd94c renaming train scripts and updating tests 2020-10-29 16:50:07 +01:00
erogol 39c71ee8a9 wavegrad refactoring, fixing tests for glow-tts and wavegrad 2020-10-29 15:47:15 +01:00
erogol 946a0c0fb9 bug fixes for single speaker glow-tts, enable torch based amp. Make amp optional for wavegrad. Bug fixes for synthesis setup for glow-tts 2020-10-29 15:45:50 +01:00
erogol 14c2381207 weight norm and torch based amp training for wavegrad 2020-10-29 12:31:43 +01:00
erogol b76a0be97a wavegrad model and layers refactoring 2020-10-29 12:31:43 +01:00
erogol dc2825dfb2 wavegrad dataset update 2020-10-29 12:31:43 +01:00
erogol 5b5b9fcfdd wavegrad config updates 2020-10-29 12:31:43 +01:00
erogol c8a4c771a8 train wavegrad updates 2020-10-29 12:31:43 +01:00
erogol 670f44aa18 enable compute stats by vocoder config 2020-10-29 12:31:43 +01:00
erogol f79bbbbd00 use Adam for wavegras instead of RAdam 2020-10-29 12:31:43 +01:00
erogol 7bcdb7ac35 wavegrad updates 2020-10-29 12:31:43 +01:00
erogol a1582a0e12 fix distributed training for train_* scripts 2020-10-29 12:31:43 +01:00
erogol 193b81b273 add universal_fullband_melgan config 2020-10-29 12:30:37 +01:00
erogol e02cd6a220 initial wavegrad layers model and trainig script 2020-10-29 12:30:37 +01:00
erogol ac57eea928 add wavegrad to vocoder generators 2020-10-29 12:30:37 +01:00
erogol e723b99888 handle distributed model as saving 2020-10-29 12:30:37 +01:00
Eren Gölge 26c18b61c9 Merge pull request #553 from Edresson/dev
bug fix in the inference with GlowTTS
2020-10-28 18:49:31 +01:00
erogol fdaed45f58 optional loss masking for stoptoken predictor 2020-10-28 18:40:54 +01:00
erogol e49cc3bbcd bug fix 2020-10-28 18:34:34 +01:00
erogol 59e1cf99d0 config update and ssim implementation 2020-10-28 18:30:00 +01:00
erogol 9cef923d99 ssim loss for tacotron models 2020-10-28 15:24:18 +01:00
erogol 9d0ae2bfb4 wavernn dataloader handling for short samples and mixed precision training 2020-10-28 12:31:01 +01:00
Edresson f01502a9db bug fix in glowTTS sythesize 2020-10-27 16:30:16 -03:00
Eren Gölge f4b8170bd1 Merge pull request #545 from Edresson/dev
GlowTTS zeroshot TTS support
2020-10-27 15:23:41 +01:00
erogol a6f564c8c8 pylint fixes 2020-10-27 12:35:10 +01:00
erogol 0becef4b58 small updates 2020-10-27 12:17:38 +01:00
sanjaesc 2ee47e9568 fix pylint once again 2020-10-27 12:17:38 +01:00
sanjaesc 1e646135ca add model params to config 2020-10-27 12:17:38 +01:00
sanjaesc bef3f2020b compute audio feat on dataload 2020-10-27 12:17:38 +01:00
sanjaesc 7c72562fe7 fix travis + pylint tests 2020-10-27 12:17:38 +01:00
sanjaesc 91e5f8b63d added to device cpu/gpu + formatting 2020-10-27 12:17:38 +01:00
sanjaesc 016a77fcf2 fix formatting + pylint 2020-10-27 12:17:38 +01:00
erogol 8de7c13708 fix no loss masking loss computation 2020-10-27 12:17:38 +01:00
sanjaesc e8294cb9db fixing pylint errors 2020-10-27 12:17:38 +01:00
sanjaesc 878b7c373e added feature preprocessing if not set in config 2020-10-27 12:17:38 +01:00
sanjaesc e495e03ea1 some minor changes to wavernn 2020-10-27 12:17:38 +01:00
Alex K 9c3c7ce2f8 wavernn stuff... 2020-10-27 12:17:38 +01:00
Alex K 6378fa2b07 add initial wavernn support 2020-10-27 12:17:38 +01:00
Edresson 89e9bfe3a2 add text processing blank token test 2020-10-26 17:41:23 -03:00
Edresson d9540a5857 add blank token in sequence for encrease glowtts results 2020-10-25 15:08:28 -03:00
Edresson fbea058c59 add parse speakers function 2020-10-24 16:10:05 -03:00
Edresson 07345099ee GlowTTS zero-shot TTS Support 2020-10-24 15:58:39 -03:00
Alexander Korolev 47d74ced1c Update losses.py
Seems like in the latest dev merge, this change was reverted. Any specific reason for this?
Without it the problem as stated here https://github.com/mozilla/TTS/issues/473 occurs.
2020-10-23 14:15:01 +02:00
ayush-1506 2a3559f02b Fix readme and config file 2020-10-21 13:43:49 +05:30
Edresson b7f9ebd32b add check arguments for GlowTTS and multispeaker training bug fix 2020-10-19 17:17:58 -03:00
erogol c2c4126a18 remove merge conflicts 2020-10-08 01:35:27 +02:00
erogol c5074cfd8e general purpose distribute.py 2020-10-08 01:30:42 +02:00
erogol 6f0654f9a8 differential spectral loss 2020-10-08 01:30:42 +02:00
erogol e0d4b88877 config update 2020-10-08 01:29:30 +02:00
erogol 4e93f90108 bug fix 2020-10-08 01:29:30 +02:00
erogol bb9b70ee27 differential spectral loss and loss weight settings 2020-10-08 01:29:30 +02:00
erogol e1eab1ce4b print model r value as loading it 2020-10-07 13:34:21 +02:00
erogol 48a40c4730 remove unused import 2020-10-06 11:32:24 +02:00
erogol a2606fbc22 format utils 2020-10-06 11:02:54 +02:00
Eren Gölge 4873601694 Merge pull request #531 from WeberJulian/french-cleaners
Adding support for french cleaners
2020-09-30 15:30:50 +02:00
Edresson 99d5a0ac07 add Speaker Conditional GST support 2020-09-29 16:09:27 -03:00
Julian WEBER ea7c2e15c0 Adding french abbreviations 2020-09-29 15:43:39 +02:00
Julian WEBER 54b4031391 Merge remote-tracking branch 'origin/dev' into french-cleaners 2020-09-29 14:24:51 +02:00
Julian WEBER da134eeee4 Subjective improvements 2020-09-29 14:20:52 +02:00
Julian WEBER b2817e9e93 Adding french cleaners 2020-09-29 14:20:24 +02:00
Eren Gölge cf02ace5b7 Merge pull request #530 from mueller91/fix_split_dataset
fix: split_dataset
2020-09-28 12:42:40 +02:00
erogol 154f90bc44 format speaker encoder imports 2020-09-28 11:19:19 +02:00
erogol e097bc6c5d Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-09-28 11:15:32 +02:00
Eren Gölge 8e2dc79c3a Merge pull request #526 from mueller91/dev
Fix: Check storage params only for speaker encoder
2020-09-28 11:15:23 +02:00
erogol 6a70c63f24 correct glow-tts loss 2020-09-27 03:28:42 +02:00
erogol 665f7ca714 linter fix 2020-09-24 12:57:54 +02:00
mueller91 227b9c8864 fix: split_dataset() runtime reduced from O(N * |items|) to O(N) where N is the size of the eval split (max 500)
I notice a significant speedup on the initial loading of large datasets such as common voice (from minutes to seconds)
2020-09-23 23:27:51 +02:00
mueller91 cfeeef7a7f fix: broken imports and missing files after merging in latest commits from mozilla/dev into mueller91/dev.
speaker_encoder's config.json and visuals.py are missing in the current dev branch of MozillaTTS, and some imports are broken.
2020-09-22 20:10:41 +02:00
mueller91 1fe5eb054f Merge branch 'dev' of https://github.com/mozilla/TTS into dev
 Conflicts:
	TTS/bin/train_encoder.py
	requirements.txt
2020-09-22 19:58:53 +02:00
mueller91 df4caec4b7 add: check_config for speaker_encoder 2020-09-22 19:52:09 +02:00
WeberJulian 3c212be5a8 fix: fixing the RenamingUnpickler fix 2020-09-22 17:36:05 +02:00
mueller91 0ea7f4e2bd fix: make speaker encoder's storage parameters non-restriced 2020-09-22 10:39:40 +02:00
mueller91 7029452228 fix: make speaker encoder's storage parameters non-restriced 2020-09-22 10:31:42 +02:00
erogol 10258724d1 linter fixes 2020-09-22 03:54:16 +02:00
erogol a6df617eb1 Merge branch 'glow-tts-amp-time_depth_conv' into dev 2020-09-21 14:23:45 +02:00
erogol 8150d5727e Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-09-21 14:21:55 +02:00
erogol e0b9fa887f glow-tts modules added 2020-09-21 14:15:40 +02:00
erogol e4c6386603 change import for normalization layer 2020-09-21 13:09:52 +02:00
mueller91 9b4aac94a8 fix: linter issues 2020-09-21 12:13:02 +02:00
erogol c008003506 do not check sample rate as loading stats file for normalization to enable interpolation for different sample rate vocoder 2020-09-18 12:52:19 +02:00
mueller 6b0621c794 cleanup 2020-09-17 16:46:43 +02:00
mueller a273b1a210 add: add random noise to dataset 2020-09-17 14:23:40 +02:00
mueller e36a3067e4 add: save wavs instead feats to storage.
This is done in order to mitigate staleness when caching and loading from data storage
2020-09-17 14:14:30 +02:00
mueller 1511076fde add: Configurable encoder dataset storage to reduce disk I/O
add: Averaged time for data loader to console and Tensorboard output
2020-09-17 12:29:38 +02:00
erogol 3660c57f1e time seperable convolution encoder, huber loss for duration predictor 2020-09-17 03:10:58 +02:00
mueller 95d2906307 add: Mozilla Commonvoice, VoxCeleb1+2, LibriTTS to Speaker Encoder Training 2020-09-16 16:49:53 +02:00
mueller c909ca3855 Improve runtime of __parse_items() from O(|speakers|*|items|) to O(|items|) 2020-09-16 15:55:55 +02:00
mueller d733b90255 Improve runtime of __parse_items() from O(|speakers|*|items|) to O(|items|) 2020-09-16 15:09:02 +02:00
maxbachmann 60ce862113 use difflib for string matching 2020-09-14 23:55:34 +02:00
erogol f1a75468c2 fix arguments 2020-09-12 04:00:25 +02:00
erogol 7c2c4d6f27 pass x_mask to layer norm 2020-09-12 03:41:37 +02:00
erogol 45fbc0d003 convolution encoder with GLU and res connections 2020-09-12 03:40:21 +02:00
erogol 498a3ea36f fix condition check 2020-09-12 03:39:01 +02:00
erogol 72b8ac0ff6 remove redundant arguments 2020-09-12 03:37:47 +02:00
erogol 15e6ab3912 glow-tts module renaming updates 2020-09-12 03:33:36 +02:00
erogol 1b238f04b2 add gated conv encoder to glow-tts 2020-09-11 19:01:38 +02:00
erogol 14356d3250 glow-tts with relative pos encoding 2020-09-11 19:01:38 +02:00
erogol 43771a3a5c remove redundant arguments 2020-09-11 19:01:38 +02:00
erogol 1dea2c9034 faster sequence masking 2020-09-11 19:01:38 +02:00
erogol 673ba74a80 glow tts training and inference fixes 2020-09-11 19:01:38 +02:00
erogol d5c6d60884 synthesis update for glow tts 2020-09-11 19:01:37 +02:00
erogol 89d15bf118 merge glow-tts after rebranding 2020-09-11 19:01:37 +02:00
erogol f9001a4bdd refactor and fix compat issues for speaker encoder 2020-09-11 17:17:07 +02:00
erogol 540d811dd5 solve pickling models after module name change 2020-09-11 12:03:39 +02:00
erogol df19428ec6 rename the project to old TTS 2020-09-09 12:27:23 +02:00