912 Commits

Author SHA1 Message Date
Eren Gölge b28c724c04 remove _phoneme_punctuations 2021-02-12 12:10:57 +00:00
Eren Gölge 593cedee14 parse_characters function 2021-02-12 12:05:56 +00:00
Eren Gölge 2abfff17f9 enable saving model characters in io.py 2021-02-12 12:04:41 +00:00
Eren Gölge 918f007a11 docstring update 2021-02-12 12:04:07 +00:00
gerazov af46727517 loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added 2021-02-12 02:12:00 +01:00
Eren Gölge 43f54d2dce fix make_symbols 2021-02-11 15:26:52 +00:00
Eren Gölge bc131208be fix spelling of a def argument and parse phonemes from config.json if use_phonemes is True 2021-02-11 13:04:47 +00:00
Eren Gölge 3baec4ea96 add missing phonemes to test_config.json 2021-02-11 11:14:39 +00:00
Eren Gölge b08b8ca2a1 add russian phoneme char 2021-02-10 13:30:59 +00:00
Eren Gölge 9cad435288 css10 dataset preprocessor 2021-02-09 15:11:26 +00:00
Eren Gölge d49757faaa linter updates 2021-02-05 13:10:43 +00:00
Eren Gölge a926aa106d reorder imports 2021-01-29 01:36:21 +01:00
Eren Gölge b464cab9b8 setup.py update and pylint fixes 2021-01-26 02:57:50 +01:00
Eren Gölge 660d61aeeb maximum_path_numpy and CYTHON adabtable import 2021-01-26 02:57:07 +01:00
Eren Gölge c990b3a59c linter fixes and test fixes 2021-01-22 02:32:35 +01:00
root 1bc8fbbd3c set eval mode whe nloading models 2021-01-20 02:14:18 +00:00
root 1faf565e3a add load_checkpoint func to tts models 2021-01-20 02:10:56 +00:00
root 5c87753e88 glow-tts fix for saving inverse weight 2021-01-20 02:09:42 +00:00
erogol 428c224b88 commet update 2021-01-12 17:31:04 +01:00
erogol bbc8d665a1 move attention layers to a sperate file 2021-01-11 17:27:30 +01:00
erogol 79c841ccd3 mass refactoring and update 2021-01-11 17:26:58 +01:00
erogol 1d961d6f8a cladd renaming 2021-01-11 17:26:11 +01:00
erogol c0a2aa68d3 formatting 2021-01-11 17:25:39 +01:00
erogol b206162d11 more docstrings 2021-01-11 17:25:04 +01:00
erogol 6e9043c5d2 rename convbnblocks and handle none mask 2021-01-11 17:22:34 +01:00
erogol 921fa5db92 remove attentions from common layers 2021-01-11 15:06:42 +01:00
erogol cc2b1e043d docstrings for common layers 2021-01-11 15:06:12 +01:00
erogol a6f40fef2e stage missing files 2021-01-08 16:02:56 +01:00
erogol d382d759b3 small fixes and test fixes 2021-01-08 15:48:40 +01:00
erogol a6259041d3 docstring for speedyspeech 2021-01-07 14:35:22 +01:00
erogol de2a542f83 glow-tts bug fix 2021-01-07 13:40:32 +01:00
erogol 14d33662ea input shapes for tacotron models 2021-01-06 13:19:40 +01:00
erogol f288e9a260 docstrings for taoctron models 2021-01-06 13:19:40 +01:00
erogol 5a45af48f1 fix 2021-01-06 13:19:40 +01:00
erogol e7fad928e7 doc strings for the all glow-tts layers 2021-01-06 13:19:40 +01:00
erogol d3b7284be4 glow-tts comments and refactoring 2021-01-06 13:19:40 +01:00
erogol 7586fbc4de SS refactoring 2021-01-06 13:19:40 +01:00
erogol e82d31b6ac glow ttss refactoring 2021-01-06 13:19:40 +01:00
erogol 29f4329d7f update glow-tts layers and add some comments 2021-01-06 13:19:40 +01:00
erogol 29cf933831 update SS condif 2021-01-06 13:19:40 +01:00
erogol 228ada04b5 update glow-tts ljspeech config 2021-01-06 13:19:40 +01:00
erogol 71c382be14 copy model scale stats file with config.json to the trianing folder, fixed for model inits 2021-01-06 13:19:40 +01:00
erogol aa40fe1aa0 SS model refacotring for multi speaker 2021-01-06 13:19:40 +01:00
erogol eb555855e4 small fixes 2021-01-06 13:19:40 +01:00
erogol 5901a00576 argument rename 2021-01-06 13:19:40 +01:00
erogol 4ef083f0f1 select decoder type for SS 2021-01-06 13:19:40 +01:00
erogol 3fa408a5ea change order BN + ReLU to ReLU + BN for SS 2021-01-06 13:19:40 +01:00
erogol ac5c9217d1 positional encoding masking for SS 2021-01-06 13:19:40 +01:00
erogol fede46e96e pylint and test fixes 2021-01-06 13:19:40 +01:00
erogol cf869e8922 add SS files 2021-01-06 13:19:40 +01:00
erogol e4680e1b99 plot float16 alignments 2021-01-06 13:19:40 +01:00
erogol 13c6665c92 inference for SS 2021-01-06 13:19:40 +01:00
erogol 30788960a8 check SS model parameters 2021-01-06 13:19:40 +01:00
erogol 5cae2c5742 make optional position encoding for speedyspeech 2021-01-06 13:19:40 +01:00
erogol dc4a16d62e speedy speehc losses 2021-01-06 13:19:40 +01:00
erogol d62cac7252 fix glow-tts prenet bug fix 2021-01-06 13:19:40 +01:00
erogol a1d5a9ddda config update tyo use noise for augmentation 2021-01-06 13:19:40 +01:00
erogol 022af74d74 update prompt msg 2021-01-06 13:19:40 +01:00
erogol 57ef53bef3 update argumnet check for non tacotron models 2021-01-06 13:19:40 +01:00
erogol 27a75de15f update processors for loading attention maps 2021-01-06 13:19:40 +01:00
erogol fa6907fa0e update glow-tts parameters and fix rel-attn-win size 2021-01-06 13:19:40 +01:00
erogol 7b20d8cbd3 implement residual BN convolution and add it as an alternative encoder for glow-tts. also generic layers to layers/generic 2021-01-06 13:19:40 +01:00
erogol 973754d893 fix for init glow-tts 2021-01-06 13:19:40 +01:00
erogol f81af4eb0d config update disable guided attention for dynamic conv attention 2021-01-06 13:19:40 +01:00
erogol 5c50e104d6 config update 2021-01-06 13:19:40 +01:00
erogol fa20638083 config for ljspeech dynamic conv attention 2021-01-06 13:18:41 +01:00
erogol 070146e143 add monotonic dynamic convolution attention 2021-01-06 13:18:41 +01:00
erogol 639fa29261 update speaker id casting for glow-tts 2020-12-14 16:58:47 +01:00
erogol 999120ecdf Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-12 18:50:14 +01:00
erogol f611e6ac01 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-12-12 18:47:59 +01:00
Jörg Thalheim 62fd4ca70d inflect negative numbers correctly 2020-12-10 16:47:51 +01:00
Jörg Thalheim 6646682650 cleaners: expand english time 2020-12-10 14:53:20 +01:00
Jörg Thalheim 76138687d3 expand more currencies 2020-12-10 14:53:20 +01:00
erogol a2859b7ddc update config args checks 2020-12-10 13:52:57 +01:00
erogol 788cd6f902 fix multi-speaker glow-tts inference 2020-12-10 02:05:48 +01:00
erogol 92cc9630d7 fix glow-tts synthesis for DPP 2020-12-10 00:30:34 +01:00
erogol df180148e9 use noise augmentation in TTSDataset 2020-12-09 15:46:25 +01:00
erogol 06612ce305 test fixes 2020-12-07 15:57:34 +01:00
erogol 7505c0ba27 muliprocess phoneme computation 2020-12-07 11:29:41 +01:00
erogol 20c86489d7 make static methods for faster multiprocess call 2020-12-07 11:29:10 +01:00
erogol affe1c1138 setup training scripts for computing phonemes before training optionally. And define data_loaders before starting training and re-use them instead of re-define for every train and eval calls. This is to enable better instance filtering based on input length. 2020-12-07 11:26:57 +01:00
erogol 7c3cdced1a make speaker_mapping a global variable to prevent reload. Fix glow-tts training 2020-12-01 03:23:25 +01:00
erogol a757b203bc fix longer phoneme seqs 2020-11-26 15:05:03 +01:00
erogol f6c96b0ac2 Merge branch 'dev' 2020-11-25 15:29:06 +01:00
erogol a1e4ee18f9 convert float16 to float32 for plotting spectrograms 2020-11-25 14:50:28 +01:00
erogol 7541d2ecaa return eval split optional 2020-11-25 14:50:09 +01:00
erogol 1229554c42 use native amp 2020-11-25 14:48:54 +01:00
Qingping Hou b0b97d636f speed up metafile build for voxceleb 2020-11-14 23:45:17 -08:00
erogol 6cc464ead6 fix ton of tesnting bugs 2020-11-12 16:33:29 +01:00
erogol 9b0f441945 argument for returning no eval split 2020-11-12 12:52:27 +01:00
erogol 21364331d2 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-11-09 13:31:12 +01:00
erogol d94782a076 reset the way ga_loss is stored in return_dict 2020-11-02 13:18:56 +01:00
erogol a108d0ee81 check nan loss in glow-tts loss 2020-11-02 13:12:19 +01:00
erogol b8ac9aba9d check against NaN loss in tacotron_loss 2020-11-02 12:44:41 +01:00
erogol 183fe56d95 Merge branch 'ssim_loss' into dev 2020-10-29 23:49:09 +01:00
erogol 73581cd94c renaming train scripts and updating tests 2020-10-29 16:50:07 +01:00
erogol 946a0c0fb9 bug fixes for single speaker glow-tts, enable torch based amp. Make amp optional for wavegrad. Bug fixes for synthesis setup for glow-tts 2020-10-29 15:45:50 +01:00
erogol a1582a0e12 fix distributed training for train_* scripts 2020-10-29 12:31:43 +01:00
erogol e723b99888 handle distributed model as saving 2020-10-29 12:30:37 +01:00
Eren Gölge 26c18b61c9 Merge pull request #553 from Edresson/dev
bug fix in the inference with GlowTTS
2020-10-28 18:49:31 +01:00
erogol fdaed45f58 optional loss masking for stoptoken predictor 2020-10-28 18:40:54 +01:00
erogol e49cc3bbcd bug fix 2020-10-28 18:34:34 +01:00
erogol 59e1cf99d0 config update and ssim implementation 2020-10-28 18:30:00 +01:00
erogol 9cef923d99 ssim loss for tacotron models 2020-10-28 15:24:18 +01:00
Edresson f01502a9db bug fix in glowTTS sythesize 2020-10-27 16:30:16 -03:00
Eren Gölge f4b8170bd1 Merge pull request #545 from Edresson/dev
GlowTTS zeroshot TTS support
2020-10-27 15:23:41 +01:00
erogol a6f564c8c8 pylint fixes 2020-10-27 12:35:10 +01:00
erogol 8de7c13708 fix no loss masking loss computation 2020-10-27 12:17:38 +01:00
Edresson 89e9bfe3a2 add text processing blank token test 2020-10-26 17:41:23 -03:00
Edresson d9540a5857 add blank token in sequence for encrease glowtts results 2020-10-25 15:08:28 -03:00
Edresson fbea058c59 add parse speakers function 2020-10-24 16:10:05 -03:00
Edresson 07345099ee GlowTTS zero-shot TTS Support 2020-10-24 15:58:39 -03:00
Alexander Korolev 47d74ced1c Update losses.py
Seems like in the latest dev merge, this change was reverted. Any specific reason for this?
Without it the problem as stated here https://github.com/mozilla/TTS/issues/473 occurs.
2020-10-23 14:15:01 +02:00
ayush-1506 2a3559f02b Fix readme and config file 2020-10-21 13:43:49 +05:30
Edresson b7f9ebd32b add check arguments for GlowTTS and multispeaker training bug fix 2020-10-19 17:17:58 -03:00
erogol c2c4126a18 remove merge conflicts 2020-10-08 01:35:27 +02:00
erogol 6f0654f9a8 differential spectral loss 2020-10-08 01:30:42 +02:00
erogol e0d4b88877 config update 2020-10-08 01:29:30 +02:00
erogol 4e93f90108 bug fix 2020-10-08 01:29:30 +02:00
erogol bb9b70ee27 differential spectral loss and loss weight settings 2020-10-08 01:29:30 +02:00
erogol e1eab1ce4b print model r value as loading it 2020-10-07 13:34:21 +02:00
Eren Gölge 4873601694 Merge pull request #531 from WeberJulian/french-cleaners
Adding support for french cleaners
2020-09-30 15:30:50 +02:00
Edresson 99d5a0ac07 add Speaker Conditional GST support 2020-09-29 16:09:27 -03:00
Julian WEBER ea7c2e15c0 Adding french abbreviations 2020-09-29 15:43:39 +02:00
Julian WEBER 54b4031391 Merge remote-tracking branch 'origin/dev' into french-cleaners 2020-09-29 14:24:51 +02:00
Julian WEBER da134eeee4 Subjective improvements 2020-09-29 14:20:52 +02:00
Julian WEBER b2817e9e93 Adding french cleaners 2020-09-29 14:20:24 +02:00
Eren Gölge cf02ace5b7 Merge pull request #530 from mueller91/fix_split_dataset
fix: split_dataset
2020-09-28 12:42:40 +02:00
erogol e097bc6c5d Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-09-28 11:15:32 +02:00
Eren Gölge 8e2dc79c3a Merge pull request #526 from mueller91/dev
Fix: Check storage params only for speaker encoder
2020-09-28 11:15:23 +02:00
erogol 6a70c63f24 correct glow-tts loss 2020-09-27 03:28:42 +02:00
erogol 665f7ca714 linter fix 2020-09-24 12:57:54 +02:00
mueller91 227b9c8864 fix: split_dataset() runtime reduced from O(N * |items|) to O(N) where N is the size of the eval split (max 500)
I notice a significant speedup on the initial loading of large datasets such as common voice (from minutes to seconds)
2020-09-23 23:27:51 +02:00
mueller91 1fe5eb054f Merge branch 'dev' of https://github.com/mozilla/TTS into dev
 Conflicts:
	TTS/bin/train_encoder.py
	requirements.txt
2020-09-22 19:58:53 +02:00
mueller91 df4caec4b7 add: check_config for speaker_encoder 2020-09-22 19:52:09 +02:00
mueller91 0ea7f4e2bd fix: make speaker encoder's storage parameters non-restriced 2020-09-22 10:39:40 +02:00
mueller91 7029452228 fix: make speaker encoder's storage parameters non-restriced 2020-09-22 10:31:42 +02:00
erogol 10258724d1 linter fixes 2020-09-22 03:54:16 +02:00
erogol a6df617eb1 Merge branch 'glow-tts-amp-time_depth_conv' into dev 2020-09-21 14:23:45 +02:00
erogol e0b9fa887f glow-tts modules added 2020-09-21 14:15:40 +02:00
erogol e4c6386603 change import for normalization layer 2020-09-21 13:09:52 +02:00
mueller91 9b4aac94a8 fix: linter issues 2020-09-21 12:13:02 +02:00
erogol c008003506 do not check sample rate as loading stats file for normalization to enable interpolation for different sample rate vocoder 2020-09-18 12:52:19 +02:00
mueller 6b0621c794 cleanup 2020-09-17 16:46:43 +02:00
mueller e36a3067e4 add: save wavs instead feats to storage.
This is done in order to mitigate staleness when caching and loading from data storage
2020-09-17 14:14:30 +02:00
mueller 1511076fde add: Configurable encoder dataset storage to reduce disk I/O
add: Averaged time for data loader to console and Tensorboard output
2020-09-17 12:29:38 +02:00
erogol 3660c57f1e time seperable convolution encoder, huber loss for duration predictor 2020-09-17 03:10:58 +02:00
mueller 95d2906307 add: Mozilla Commonvoice, VoxCeleb1+2, LibriTTS to Speaker Encoder Training 2020-09-16 16:49:53 +02:00
mueller c909ca3855 Improve runtime of __parse_items() from O(|speakers|*|items|) to O(|items|) 2020-09-16 15:55:55 +02:00
erogol f1a75468c2 fix arguments 2020-09-12 04:00:25 +02:00
erogol 45fbc0d003 convolution encoder with GLU and res connections 2020-09-12 03:40:21 +02:00
erogol 72b8ac0ff6 remove redundant arguments 2020-09-12 03:37:47 +02:00
erogol 15e6ab3912 glow-tts module renaming updates 2020-09-12 03:33:36 +02:00
erogol 1b238f04b2 add gated conv encoder to glow-tts 2020-09-11 19:01:38 +02:00
erogol 14356d3250 glow-tts with relative pos encoding 2020-09-11 19:01:38 +02:00
erogol 43771a3a5c remove redundant arguments 2020-09-11 19:01:38 +02:00
erogol 1dea2c9034 faster sequence masking 2020-09-11 19:01:38 +02:00
erogol 673ba74a80 glow tts training and inference fixes 2020-09-11 19:01:38 +02:00
erogol d5c6d60884 synthesis update for glow tts 2020-09-11 19:01:37 +02:00
erogol 89d15bf118 merge glow-tts after rebranding 2020-09-11 19:01:37 +02:00
erogol 540d811dd5 solve pickling models after module name change 2020-09-11 12:03:39 +02:00
erogol df19428ec6 rename the project to old TTS 2020-09-09 12:27:23 +02:00