Commit Graph

1785 Commits

Author SHA1 Message Date
Eren Gölge 1a43e05460 Fix VITS loss bug
Fake and real features were given in the wrong args order to
the loss function
2022-02-25 11:26:59 +01:00
Eren Gölge 4b96bfe925 Fix train logging 2022-02-25 11:26:59 +01:00
Eren Gölge ab8a4ca2c3 Revert random segment 2022-02-25 11:26:59 +01:00
Eren Gölge 8622226f3f Make style 2022-02-25 11:26:59 +01:00
Eren Gölge 27db089d6c Change TrainingArgs -> TrainerArgs 2022-02-25 11:26:59 +01:00
Eren Gölge aa81454721 Update BaseTrainingConfig 2022-02-25 11:26:59 +01:00
Eren Gölge d3a58ed07a Fix default values 2022-02-25 11:26:59 +01:00
Eren Gölge 54c6bb2a8c Fix add speaker VITS 2022-02-25 11:26:59 +01:00
Eren Gölge 590b04fb89 Fix espeak_wrapper 2022-02-25 11:26:59 +01:00
Eren Gölge a013566d15 Delete trainer related code 2022-02-25 11:26:59 +01:00
Eren Gölge 38314194e7 Set `drop_last` 2022-02-25 11:26:59 +01:00
Eren Gölge f70e4bb8c6 Add new speakers to the vits model 2022-02-25 11:26:59 +01:00
Eren Gölge d5c0e17548 Load right char class dynamically 2022-02-25 11:26:59 +01:00
Eren Gölge 1f0c8179da Make style 2022-02-25 11:26:59 +01:00
Eren Gölge b3ed6ff6b7 Update FastPitchConfig 2022-02-25 11:26:59 +01:00
Eren Gölge 1932401e8d Fix dataset preprocessing 2022-02-25 11:26:59 +01:00
Eren Gölge 34c4be5e49 Update forwardtts 2022-02-25 11:26:59 +01:00
Eren Gölge bb37462794 Update language manager 2022-02-25 11:26:59 +01:00
Eren Gölge 5169d4eb32 Plot pitch over input characters 2022-02-25 11:26:59 +01:00
Eren Gölge cd5d1497cf Add pitch_fmin pitch_fmax args to the audio 2022-02-25 11:26:59 +01:00
Eren Gölge 1445a46e9e Update synthesizer to use init_from_config 2022-02-25 11:26:59 +01:00
Eren Gölge 7058fcc3ff Take file extension as an argument 2022-02-25 11:26:59 +01:00
Eren Gölge 13482dde1f Update GAN model 2022-02-25 11:26:59 +01:00
Eren Gölge 2829027d8b Refactor VITS model 2022-02-25 11:26:59 +01:00
Eren Gölge ef63c99524 Implement `start_by_longest` option for TTSDataset 2022-02-25 11:26:18 +01:00
Eren Gölge c4c471d61d Allow padding for shorter segments 2022-02-25 11:25:48 +01:00
Eren Gölge 47fbddc8d4 Fix docstring 2022-02-25 11:25:48 +01:00
Eren Gölge bc2243bac4 Fix tests 2022-02-25 11:25:00 +01:00
Eren Gölge 146fbfd7c9 Extend unittests 2022-02-25 11:25:00 +01:00
Eren Gölge 2fe16de8e3 Make lint 2022-02-25 11:25:00 +01:00
Eren Gölge 7b49a4aa2b Fix glow_tts_config missing field 2022-02-25 11:24:13 +01:00
Eren Gölge 07b0a80d57 Fix tokenizer init_from_config 2022-02-25 11:24:13 +01:00
Eren Gölge 50e17097a7 Add verbose option to AudioProcessor 2022-02-25 11:24:13 +01:00
Eren Gölge 235f7d9b02 Extend glow_tts model tests 2022-02-25 11:24:13 +01:00
Eren Gölge 8e248913d6 Update train_tts for the new API 2022-02-25 11:24:13 +01:00
Eren Gölge 001da8afc8 Update Vits for the new model API 2022-02-25 11:21:19 +01:00
Eren Gölge 5176ae9e53 Fixes small compat. issues 2022-02-25 11:21:19 +01:00
Eren Gölge 131bc0cfc0 Fix synthesis.py 🔧 2022-02-25 11:18:00 +01:00
Eren Gölge c0746f23df Fix `too many open files` 2022-02-25 11:16:30 +01:00
Eren Gölge df0d58bf09 Update VCTK recipes 2022-02-25 11:16:30 +01:00
Eren Gölge 730f7c0df4 Add file_ext args to resample.py 2022-02-25 11:15:46 +01:00
Eren Gölge 28d98da422 Update VCTK formatter 2022-02-25 11:15:46 +01:00
Eren Gölge 4d99fee3e2 Update spec extractor 2022-02-25 11:12:44 +01:00
Eren Gölge 38a0b3b6c7 Update train_tts.py 2022-02-25 11:11:35 +01:00
Eren Gölge cfaa51fddc Update BaseTTS config 2022-02-25 11:11:35 +01:00
Eren Gölge 4c5cb44eeb Update setup_model 2022-02-25 11:11:35 +01:00
Eren Gölge 7c4243fba7 Update GlowTTS 2022-02-25 11:11:35 +01:00
Eren Gölge bacf79f4fb Update AlignTTS 2022-02-25 11:11:35 +01:00
Eren Gölge 18f726af65 Update ForwardTTS 2022-02-25 11:11:35 +01:00
Eren Gölge d0ec4b91e5 Update Tacotron models 2022-02-25 11:11:35 +01:00
Eren Gölge ea965a5683 Update VITS for the new API 2022-02-25 11:11:35 +01:00
Eren Gölge f802a931a3 Pass samples to init_from_config in SpeakerManager 2022-02-25 11:07:34 +01:00
Eren Gölge bde68d9f25 Use the same phonemizer for `en` to `en-us` 2022-02-25 11:07:34 +01:00
Eren Gölge 8649d4fd36 Allow None pad and blank tokens 2022-02-25 11:07:34 +01:00
Eren Gölge c9972e6f14 Make lint 2022-02-25 11:07:34 +01:00
Eren Gölge 30cfafce56 Add init_from_config 2022-02-25 11:05:54 +01:00
Eren Gölge 90cc45dd4e Update data loader tests 2022-02-25 11:05:54 +01:00
Eren Gölge 93957d58a1 Refactoring VITS for the tokenizer API 2022-02-25 11:05:06 +01:00
Eren Gölge 04df0a3d9f Refactor TTSDataset 2022-02-25 11:05:06 +01:00
Eren Gölge 9bb347a52b Update for tokenizer API 2022-02-25 11:05:06 +01:00
Eren Gölge 452dbc43d8 Update imports for symbols -> characters 2022-02-25 11:05:06 +01:00
Eren Gölge 8071fa0020 Refactor GlowTTS model and recipe for TTSTokenizer 2022-02-25 11:05:06 +01:00
Eren Gölge b6c2bfdf08 Refactor synthesis.py for TTSTokenizer 2022-02-25 11:05:06 +01:00
Eren Gölge b2bb954a51 Refactor TTSDataset to use TTSTokenizer 2022-02-25 11:05:06 +01:00
Eren Gölge 84091096a6 Refactor Synthesizer class for TTSTokenizer 2022-02-25 11:05:06 +01:00
Eren Gölge 196ae74273 Update data loader tests 2022-02-25 11:05:06 +01:00
Eren Gölge 98057a00ae Make style 2022-02-25 10:57:35 +01:00
Eren Gölge 7575367b9f Refactoring VITS for the tokenizer API 2022-02-25 10:57:35 +01:00
Eren Gölge 4cd690e4c1 Updates BaseTTS and configs 2022-02-25 10:57:35 +01:00
Eren Gölge 176b712c1a Refactor TTSDataset 2022-02-25 10:57:35 +01:00
Eren Gölge 4597d4e5b6 Remove get_characters from BaseTTS 2022-02-25 10:48:03 +01:00
Eren Gölge 1df1d6c4a9 Update for tokenizer API 2022-02-25 10:48:03 +01:00
Eren Gölge 2d8ce98d2a Update imports for symbols -> characters 2022-02-25 10:48:03 +01:00
Eren Gölge 9a95e15483 Refactor GlowTTS model and recipe for TTSTokenizer 2022-02-25 10:48:03 +01:00
Eren Gölge d0eb642d88 Refactor synthesis.py for TTSTokenizer 2022-02-25 10:48:03 +01:00
Eren Gölge 3476be30d7 Refactor Synthesizer class for TTSTokenizer 2022-02-25 10:48:03 +01:00
Eren Gölge 9397a56b13 Allow init_from_config from model or audio config 2022-02-25 10:48:03 +01:00
Eren Gölge a71a013276 Fix the wrong default loss name for GAN models 2022-02-25 10:48:03 +01:00
Eren Gölge 04202da1ac Make style 2022-02-25 10:48:03 +01:00
Eren Gölge 3b63d713b9 Fix espeak wrapper cmd call 2022-02-25 10:48:03 +01:00
Eren Gölge 4894998e6b Fix print_logs 2022-02-25 10:48:03 +01:00
Eren Gölge 4e8f9d6f10 Fix IPAPhonemes init_from_config 2022-02-25 10:48:03 +01:00
Eren Gölge 0fe39166fe Discard OOV chars in tokenizer
Discard but store OOV chars, with a warning message
when an OOV char is first encountered
2022-02-25 10:48:03 +01:00
Eren Gölge c39aaafbfc Update EspeakWrapper for espeak-ng 2022-02-25 10:48:03 +01:00
Eren Gölge bb389479a4 Update setup_model for TTS.tts models 2022-02-25 10:48:03 +01:00
Eren Gölge 9b83e665fc Add init_from_config as an abstract class 2022-02-25 10:48:03 +01:00
Eren Gölge 3eca5ad060 Update config fields for phonemizer 2022-02-25 10:48:03 +01:00
Eren Gölge d2525abe8c Remove get_characters from BaseTTS 2022-02-25 10:48:03 +01:00
Eren Gölge 73d27ebd45 Fix GlowTTS 2022-02-25 10:48:03 +01:00
Eren Gölge 87bf940676 Print duplicate characters 2022-02-25 10:48:03 +01:00
Eren Gölge 3de9f38d16 Add init_from_config to SpeakerManager 2022-02-25 10:48:03 +01:00
Eren Gölge d8ec7086b6 Update `synthesis` for the new API 2022-02-25 10:48:03 +01:00
Eren Gölge 4e83bf3968 Allow choosing phonemizer 2022-02-25 10:48:02 +01:00
Eren Gölge 22f0c58fe1 Print language codes 2022-02-25 10:48:02 +01:00
Eren Gölge 693fb4dd39 Modify init_from_config for IPAPhonemes 2022-02-25 10:48:02 +01:00
Eren Gölge acc6eef625 Update for tokenizer API 2022-02-25 10:48:02 +01:00
Eren Gölge e1b4c4ca43 Add init_from_config to GAN 2022-02-25 10:48:02 +01:00
Eren Gölge 353f913efc Fix #985 2022-02-25 10:48:02 +01:00
Eren Gölge ba3b60c90f Test TTSTokenizer 2022-02-25 10:48:02 +01:00
Eren Gölge 79a84410f2 Test punctuations 2022-02-25 10:48:02 +01:00
Eren Gölge d8bdeb8b8f Fix Punctuation 2022-02-25 10:48:02 +01:00
Eren Gölge ff7c385838 Fix BasePhonemizer 2022-02-25 10:48:02 +01:00
Eren Gölge 10d435ce77 Fixup 2022-02-25 10:48:02 +01:00
Eren Gölge f0655bfffc Fix ja_jp_phonemizer 2022-02-25 10:48:02 +01:00
Eren Gölge 20e5dd3678 Add doc examples 2022-02-25 10:48:02 +01:00
Eren Gölge fbad17e084 Update imports for symbols -> characters 2022-02-25 10:48:02 +01:00
Eren Gölge a1df4f9887 Test character classes 2022-02-25 10:45:24 +01:00
Eren Gölge bd461ace33 Refactor GlowTTS model and recipe for TTSTokenizer 2022-02-25 10:45:24 +01:00
Eren Gölge 5a9653978a Refactor synthesis.py for TTSTokenizer 2022-02-25 10:45:24 +01:00
Eren Gölge e5785b34b0 Style fix 2022-02-25 10:27:46 +01:00
Eren Gölge e4049aa31a Refactor TTSDataset to use TTSTokenizer 2022-02-25 10:27:46 +01:00
Eren Gölge 2480bbe937 Remove OLD TOKENIZATION ROUTINES 2022-02-25 09:32:54 +01:00
Eren Gölge 53f696615b Add init_from_config to AudioProcessor 2022-02-25 09:32:54 +01:00
Eren Gölge 3d86edfc81 Refactor Synthesizer class for TTSTokenizer 2022-02-25 09:32:54 +01:00
Eren Gölge 8d85af84cd Implement Punctuation class 2022-02-25 09:32:54 +01:00
Eren Gölge 1aca58afaf Fix imports in cleaners.py 2022-02-25 09:32:54 +01:00
Eren Gölge 0344645e90 Implement TTSTokenizer 2022-02-25 09:32:54 +01:00
Eren Gölge 2fb1f70503 Implement BaseCharacters, IPAPhonemes, Graphemes 2022-02-25 09:32:54 +01:00
Eren Gölge 1bee40af40 Create language folders under `TTS.tts.utils.text` 2022-02-25 09:32:54 +01:00
Eren Gölge c1119bc291 Implement BasePhonemizer 2022-02-25 09:32:54 +01:00
Eren Gölge dcd01356e0 Create `text/english` folder 2022-02-25 09:32:54 +01:00
Eren Gölge 80867c8e8c Implement multi-phonemizer 2022-02-25 09:32:54 +01:00
Eren Gölge 5e4f78add3 Implement espeak wrapper 2022-02-25 09:32:54 +01:00
Eren Gölge e03a05c816 Implement gruut wrapper 2022-02-25 09:32:54 +01:00
Eren Gölge 172ba0c5e7 Implement JA_JP phonemizer 2022-02-25 09:32:54 +01:00
Eren Gölge ca02b82218 Implement ZH_CH phonemizer 2022-02-25 09:32:54 +01:00
Eren Gölge a51b031bff Merge branch 'dev' into dev-fix-glowtts-infer 2022-02-21 12:01:40 +03:00
Edresson Casanova 28a7464975 Fix the bug in split dataset function (#1251)
* Fix the bug in split_dataset

* Make eval_split_size configurable

* Change test_loader to use load_tts_samples function

* Change eval_split_portion to eval_split_size and allow setting the absolute number of samples in eval

* Fix samplers unit test

* Add data unit test on GitHub workflow
2022-02-21 11:59:36 +03:00
Edresson Casanova bc5db13d06 Fix the bug in extract tts spectrogram script 2022-02-19 19:24:00 +00:00
Edresson Casanova ba6e56e01c Fix Glow-TTS multi-speaker inference 2022-02-18 19:25:29 +00:00
Eren Gölge 127118c637 Update TTS.tts formatters (#1228)
* Return Dict from tts formatters

* Make style
2022-02-11 23:03:43 +01:00
Eren Gölge 5e3f499a69 Fix #1187 (#1227) 2022-02-11 13:27:59 +01:00
Edresson Casanova 0860d73cf8 Remove Tensorflow requirement (#1225)
* Remove TF modules

* Remove TF unit tests

* Remove TF vocoder modules

* Remove TF convert scripts

* Remove TF requirement

* Remove the Docs TF instructions

* Remove TF inference support
2022-02-10 16:14:54 +01:00
Eren Gölge 44c7d1a826 Merge pull request #1054 from WeberJulian/partial_embedding_compute
Partial embedding compute
2022-02-06 20:13:55 +01:00
WeberJulian c7f5e005e1 Compute embedding for new audios only 2022-01-06 15:41:38 +01:00
WeberJulian e778bad626 Add argument to enable dp speaker conditioning 2022-01-06 15:07:27 +01:00
WeberJulian e1accb6e28 Fix train_tts.py and uncomment code (#1051)
* Fix SE loading and language embedding logic

* remove trailing white space

* Uncomment resampling code for SCL
2022-01-03 17:44:57 +01:00
Eren Gölge 58c38de58d Bump up to v0.5.0 2022-01-03 15:04:03 +00:00
Eren Gölge 5840d89802 Keep proj_dim in speaker encoder models 2022-01-03 15:03:34 +00:00
Eren Gölge 03bcae1ba5 Merge pull request #1050 from coqui-ai/fix_synthesizer_init
Fix if else statement
2022-01-03 15:59:29 +01:00
Eren Gölge fc09e319d4 Prioritize the given encoder path over config 2022-01-03 14:24:19 +00:00
Eren Gölge 7fad969a1f Fix if else statement 2022-01-03 14:16:11 +00:00
Eren Gölge d724984be1 Fix language assignment 2022-01-02 11:11:24 +00:00
WeberJulian a63998c048 Fix phoneme language 2022-01-01 21:08:13 +01:00
Eren Gölge 7ef458a59c Update default vocoder for uk model 2022-01-01 16:09:42 +00:00
Eren Gölge e55f5ee59e Make linter 2022-01-01 15:50:04 +00:00
Eren Gölge 38f5a11125 Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2022-01-01 15:38:46 +00:00
Eren Gölge c5512af82b Update uk vocoder url 2022-01-01 15:38:21 +00:00
Eren Gölge d37cfe474a Merge branch 'pr/Edresson/731-rebased' into dev 2022-01-01 15:37:35 +00:00
Eren Gölge 33711afa01 Update yourTTS url 2022-01-01 15:37:08 +00:00
Eren Gölge 8fd1ee1926 Print urls when BadZipError 2022-01-01 15:26:35 +00:00
Eren Gölge 61874bc0a0 Fix your_tts inference from the listed models 2021-12-31 13:45:05 +00:00
Eren Gölge 8100135a7e Add the YourTTS entry to the models 2021-12-31 12:22:08 +00:00
Eren Gölge 36cef5966b Fix resnet speaker encoder 2021-12-30 15:36:35 +00:00
Eren Gölge 348b5c96a2 Fix speaker encoder test 2021-12-30 15:36:35 +00:00
Eren Gölge 7129b04d46 Update VITS model 2021-12-30 14:08:17 +00:00
Eren Gölge 638091f41d Update Speaker Encoder models 2021-12-30 12:02:06 +00:00
Eren Gölge 6189fdfaea Fix Training HiFiGan -- avg loss not decreasing #1003 2021-12-30 10:48:55 +00:00
Eren Gölge 275c759993 Fix #1037 2021-12-23 15:57:10 +00:00
Eren Gölge 5c5ddd2ba7 Init speaker manager for speaker encoder 2021-12-22 15:51:53 +00:00
Eren Gölge 633dcc9c56 Implement RMS volume normalization 2021-12-22 15:51:14 +00:00
Eren Gölge 8d2bb284ac Add UK vocoder models 2021-12-21 13:13:35 +00:00
Eren Gölge 56378b12f7 Fix speaker encoder init 2021-12-21 12:26:25 +00:00
Eren Gölge c9c1fa0548 Fix multi-speaker init in Synthesizer 2021-12-21 09:44:07 +00:00
Eren Gölge f769595112 Add more listing options to ModelManager 2021-12-20 11:54:10 +00:00
Eren Gölge a25269d897 Remove commented code 2021-12-20 11:54:10 +00:00
Eren Gölge 473414d4af Implement init_speaker_encoder and change arg names 2021-12-20 11:54:10 +00:00
Eren Gölge d29c3780d1 Use speaker_encoder from speaker manager in Vits 2021-12-20 11:54:10 +00:00
Eren Gölge 4d13b887f5 Change speaker_idx to speaker_name 2021-12-20 11:54:10 +00:00
Eren Gölge 4c50f6f4df Add functions to get and check an argument in config and config.model_args 2021-12-20 11:54:10 +00:00
Eren Gölge 3c6d7f495c Fixup 2021-12-20 11:54:10 +00:00
Eren Gölge 3818bd0c23 Fixup 2021-12-20 11:54:10 +00:00
Eren Gölge 79de38ca76 Rename setup_model to setup_speaker_encoder_model 2021-12-20 11:54:10 +00:00
Eren Gölge 35a781fb90 Fix synthesizer reading `use_language_embedding` 2021-12-20 11:54:10 +00:00
Eren Gölge 7a987db62b Use torchaudio for ResNet speaker encoder 2021-12-20 11:54:10 +00:00
Eren Gölge 649dc9e9da Remove redundant code 2021-12-20 11:54:10 +00:00
Eren Gölge 704dddcffa Make style 2021-12-20 11:54:10 +00:00
WeberJulian 54b7fb4e4a Fix zoo tests 2021-12-20 11:54:10 +00:00
WeberJulian a564eb9f54 Add support for multi-lingual models in CLI 2021-12-20 11:54:10 +00:00
WeberJulian 2bbcb558dc Prevent weighted sampler use when num_gpus > 1 2021-12-20 11:54:10 +00:00
WeberJulian 74cedfac38 Revert init multispeaker change 2021-12-20 11:54:10 +00:00
WeberJulian 9cfbacc622 Fix trailing space 2021-12-20 11:54:10 +00:00
WeberJulian 6b03943526 Move multilingual logic out of the trainer 2021-12-20 11:54:10 +00:00
Edresson 818dc4ccd8 Add Docstring for TorchSTFT 2021-12-20 11:54:10 +00:00
Edresson 67dda0abe1 Add the SCL resample TODO 2021-12-20 11:54:10 +00:00
WeberJulian 8b52fb89d1 Fix merge bug 2021-12-20 11:54:10 +00:00
WeberJulian 09eda31a3f Fix tests 2021-12-20 11:54:10 +00:00
Edresson 78a23e19df Fix pylint checks 2021-12-20 11:54:10 +00:00
WeberJulian 4cd0e4eb0d Remove self.audio_config from VITS 2021-12-20 11:54:10 +00:00
Edresson d39200e69b Remove torchaudio requirement 2021-12-20 11:54:10 +00:00
WeberJulian 2e516869a1 Fix trailing whitespace 2021-12-20 11:54:10 +00:00
WeberJulian ffc269eaf4 Update docstring 2021-12-20 11:54:10 +00:00
Edresson 12968532fe Add the language embedding dim in the duration predictor class 2021-12-20 11:54:10 +00:00
Edresson 4196a42de7 Get the number speaker from the Speaker Manager property 2021-12-20 11:54:10 +00:00
Edresson f394d60695 Fix the bug in multispeaker vits 2021-12-20 11:54:10 +00:00
Edresson 90eac13bb2 Rename ununsed_speakers to ignored_speakers 2021-12-20 11:54:10 +00:00
Edresson f34596d957 Fix function name 2021-12-20 11:54:10 +00:00
Edresson 45d0b04179 Lint fixes 2021-12-20 11:54:10 +00:00
Edresson 85418ffeaa Fix the bug in extract tts spectrograms 2021-12-20 11:54:10 +00:00
Edresson 2b2cecaea2 Set the new_fields in copy_model_files as None by default 2021-12-20 11:54:10 +00:00
Edresson 34749f8727 Remove the call to get_speaker_manager 2021-12-20 11:54:10 +00:00
Edresson b769b49e34 Remove the data from the set_d_vectors_from_file function 2021-12-20 11:54:10 +00:00
Edresson 9daa33d1fd Remove unusable speaker manager function 2021-12-20 11:54:10 +00:00
Edresson 8c22d5ac49 Make the VITS loss function clearer 2021-12-20 11:54:10 +00:00
Edresson 6fc3b9e679 Remove the unusable fine-tuning model 2021-12-20 11:54:10 +00:00
Edresson 352aa69eca Create a module for the VAD script 2021-12-20 11:54:10 +00:00
WeberJulian 631addf33b fix d-vector 2021-12-20 11:54:10 +00:00
WeberJulian da6c1e858c Fix small issues 2021-12-20 11:54:10 +00:00
WeberJulian e8af6a9f08 Fix use_speaker_embedding logic 2021-12-20 11:54:10 +00:00
WeberJulian 23d789c072 Fix continue path 2021-12-20 11:54:10 +00:00
WeberJulian 120332d53f Fix phonemes 2021-12-20 11:54:10 +00:00
WeberJulian 846bf16f02 fix imports for load_meta_data 2021-12-20 11:54:10 +00:00
WeberJulian 1340938159 fix phonemes per language 2021-12-20 11:54:10 +00:00
WeberJulian e995a63bd6 fix linter 2021-12-20 11:54:10 +00:00
WeberJulian 1472b6df49 make style 2021-12-20 11:54:10 +00:00
WeberJulian 4d721bcabd fix test sentence synthesis 2021-12-20 11:54:10 +00:00
WeberJulian 0804806727 fix f0_cache_path in dataset 2021-12-20 11:54:10 +00:00
WeberJulian 3b5592abcf fix test vits 2021-12-20 11:54:10 +00:00
WeberJulian 2a2b5767c2 fix collate_fn 2021-12-20 11:54:10 +00:00
Julian WEBER 78c2d12a91 PitchExtractor 2021-12-20 11:54:10 +00:00
Julian WEBER 9a2f91327c get_aux_input 2021-12-20 11:54:10 +00:00
Julian WEBER b3abd01793 Merge dataset 2021-12-20 11:54:10 +00:00
Edresson 10ff90d6d2 Add remove silence VAD script 2021-12-20 11:54:10 +00:00
Edresson 1bd1a0546b Add audio resample in the speaker consistency loss 2021-12-20 11:54:10 +00:00
Edresson 1c6bcda950 Add freeze vocoder generator and flow-based decoder option 2021-12-20 11:54:10 +00:00
WeberJulian 2b952d8b97 freeze vits parts 2021-12-20 11:54:10 +00:00
WeberJulian 005bba60b0 get_speaker_weighted_sampler 2021-12-20 11:54:10 +00:00
Edresson 9de4539422 Update the VITS model docs 2021-12-20 11:54:10 +00:00
Edresson eeb8ac07d9 Add voice conversion fine tuning mode 2021-12-20 11:54:10 +00:00
Edresson 690b37d0ab Add support to use the speaker encoder as loss function in VITS model 2021-12-20 11:54:09 +00:00
Edresson 9b011b1cb3 Add H/ASP original checkpoint support 2021-12-20 11:54:09 +00:00
Edresson 0bdfd3cb50 Add the ValueError in the restore checkpoint exception to avoid problems with the optimizer restoration when new keys are added 2021-12-20 11:54:09 +00:00
Edresson de78556655 Fix the optimizer parameters bug in multilingual and multispeaker training 2021-12-20 11:54:09 +00:00
Edresson 9be5b75da3 Fix bug after merge 2021-12-20 11:54:09 +00:00
Edresson 76251b619a Fix d-vector multispeaker training bug 2021-12-20 11:54:09 +00:00
Edresson 7ef3ddc6ff Fix unit tests 2021-12-20 11:54:09 +00:00
Edresson 36dcd11453 Fix pylint issues 2021-12-20 11:54:09 +00:00
Edresson c53693c155 Implement vocoder Fine Tuning like SC-GlowTTS paper 2021-12-20 11:54:09 +00:00
Edresson f1f016314e Fix the bug in M-AILABS formatter 2021-12-20 11:54:09 +00:00
Edresson c334d39acc Add voice conversion support for the model VITS trained with external speaker embedding 2021-12-20 11:54:09 +00:00
Edresson e997889ba8 Fix bug in VITS multilingual inference 2021-12-20 11:54:09 +00:00
Edresson 7c0b8ec572 Fix bugs in the non-multilingual VITS inference 2021-12-20 11:54:09 +00:00
Edresson 3fbbebd74d Fix pylint issues 2021-12-20 11:54:09 +00:00
Edresson ac9416fb86 Add multilingual inference support 2021-12-20 11:54:09 +00:00
Edresson dcb2374bc9 Add multilingual training support to the VITS model 2021-12-20 11:54:09 +00:00
Edresson f996afedb0 Implement multilingual dataloader support 2021-12-20 11:54:09 +00:00
Edresson 5f1c18187f Fix pylint issues 2021-12-20 11:54:09 +00:00
Edresson d91c595c5a Implement training support with d_vecs in the VITS model 2021-12-20 11:54:09 +00:00
Edresson 6a7db67a91 Allow ignore speakers for all multispeaker datasets 2021-12-20 11:54:09 +00:00
Edresson e0ad838066 Randomly select a speaker from the speaker manager for the test sentences 2021-12-20 11:54:09 +00:00
Edresson eb3e8affe1 Save speakers embeddings/ids before starting training 2021-12-20 11:54:09 +00:00
Eren Gölge 37803467aa Merge pull request #1021 from loganhart420/dataset_downloaders
Add additional datasets
2021-12-20 10:42:20 +01:00
Reuben Morais 859ac1a54c Include usage instructions in README 2021-12-17 11:37:19 +01:00
loganhart420 103c010eca Add additional datasets 2021-12-16 07:21:27 -05:00
Jörg Thalheim bce143c738 server: fix compatibility with tts_models/en/ljspeech/fast_pitch (#893) 2021-12-07 14:36:29 +01:00
Eren Gölge babdd84f91 Fix GST inference
commit d3e477875a7e46a101fcf95a1794442823750fe2
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date:   Wed Nov 3 10:16:12 2021 +0000

    Read .wav for GST conditioning from CL

commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date:   Fri Oct 29 14:43:47 2021 +0100

    Fix GST during inference in Tacotron2

commit fdece14585ab5a36eed1061a9a838d8e48aa6882
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date:   Wed Nov 3 10:16:12 2021 +0000

    Read .wav for GST conditioning from CL

commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date:   Fri Oct 29 14:43:47 2021 +0100

    Fix GST during inference in Tacotron2

commit 908ce39370eadcc9fa8510cdb26c9ead87305427
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date:   Fri Oct 29 12:49:37 2021 +0100

    Make trim_db value negative

commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date:   Fri Oct 29 12:22:24 2021 +0100

    Set find_endpoint db threshold in config.json
2021-12-07 13:28:49 +00:00
Eren Gölge ce45d9e1af Make style and lint 2021-12-01 10:42:52 +00:00
Eren Gölge 40cb8ac966 Fix #958 2021-12-01 10:33:34 +00:00
Eren Gölge 512ada7548 Fix callbacks against multi-gpu training 2021-12-01 10:32:14 +00:00
Eren Gölge 2ed9e3c241 Fix constant use of noise augment 2021-11-08 09:20:34 +01:00
Eren Gölge b6b14a76af Fix VITS stochastic duration predictor 2021-11-08 09:20:11 +01:00
Eren Gölge dc3dd55dd9 Add collect_env_info.py 2021-11-08 08:59:08 +01:00
Eren Gölge faafea4cf2 Fix style 2021-11-04 17:04:40 +01:00
Eren Gölge d227aaebcc Print when using Griffin-Lim in Synthesizer 2021-11-01 16:52:26 +01:00
Eren Gölge c5077c6c3f Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2021-11-01 16:42:27 +01:00
Eren Gölge 20cebde1c9 Add docstring to MAI labs formatter 2021-11-01 16:41:55 +01:00
Eren Gölge 608f437545 Add a function to find unique chars 2021-11-01 16:41:33 +01:00
Eren Gölge d6d780e758 Fix FastSpeech config 2021-11-01 16:41:15 +01:00
Eren Gölge 5ba47081ee Use GL for VCTK FastPitch models 2021-11-01 16:39:03 +01:00
Michael Hansen 3bc043faeb Upgrade to gruut 2.0 (#882) 2021-10-31 11:41:55 +01:00
George 37eaefc085 Optional silence trimming during inference and find_endpoint() fix (#898)
* Set find_endpoint db threshold in config.json

* Optional silence trimming during inference

* Make trim_db value negative
2021-10-29 18:28:55 +02:00
Eren Gölge 7293abada2 Bump up to v0.4.2 2021-10-29 17:57:30 +02:00
Eren Gölge 2df0752e73 Model zoo tests (#900)
* Fix VITS model multi-speaker init

* Remove gdrive support in model manager

* Add model zoo tests
2021-10-29 17:54:16 +02:00
Eren Gölge aaaa591485 Bump up version to v0.4.1 2021-10-26 19:24:17 +02:00
Eren Gölge 3ea1c2037b Fix model entry in .models.json 2021-10-26 19:14:29 +02:00
Eren Gölge fa4ec83c6e Bump up version to v0.4.0 2021-10-26 18:27:39 +02:00
Eren Gölge 035ed432bc Doc update (#889)
* Link source files from the docs

* Update glowTTS recipes for docs

* Add dataset downloaders
2021-10-26 17:41:33 +02:00
Eren Gölge 0cac3f330a Enable custom formatter in load_tts_samples 2021-10-26 13:07:11 +02:00
Eren Gölge 7c10574931 Gateway for TTS models 2021-10-26 13:04:51 +02:00
Eren Gölge 00becf2671 Fix import statements 2021-10-25 19:29:16 +02:00
Eren Gölge 027424dda8 Add VCTK fast_pitch and UK glow-tts 2021-10-25 19:29:16 +02:00
Eren Gölge 70e4d0e524 Fix grad_norm handling 2021-10-21 16:29:06 +00:00
Eren Gölge a409e0f8f8 Update train_tts for multi-speaker 2021-10-21 16:29:06 +00:00
Eren Gölge 2b7d159383 Update BaseTTS for multi-speaker training 2021-10-21 16:29:06 +00:00
Eren Gölge e62d3c5cf7 Use absolute imports for tts configs and models 2021-10-21 16:29:06 +00:00
Eren Gölge 82fed4add2 Make style 2021-10-21 16:05:51 +00:00
Eren Gölge 3cb07fb6b5 Fix SpeakerManager init with data items 2021-10-21 13:54:39 +00:00
Eren Gölge aea90e2501 Comment synthesis.py 2021-10-21 13:53:45 +00:00
Eren Gölge 1987aaaaed Update d-vector reshape in synthesizer 2021-10-21 13:53:25 +00:00
Eren Gölge 3ab009ca8d Edit model configs for multi-speaker 2021-10-21 13:51:37 +00:00
Eren Gölge cea8e1739b Update AlignTTS to use SpeakerManager 2021-10-20 18:22:41 +00:00
Eren Gölge 0e768dd4c5 Update comments 2021-10-20 18:21:26 +00:00
Eren Gölge 7c2cb7cc30 Update BaseTTS 2021-10-20 18:18:22 +00:00
Eren Gölge 330ee7d208 Comment BaseTacotron and remove unused funcs 2021-10-20 18:17:25 +00:00
Eren Gölge aa25f70b95 Update ForwardTTS for multi-speaker 2021-10-20 18:16:41 +00:00
Eren Gölge 0ebc2a400e Implement `_set_speaker_embedding` in GlowTTS 2021-10-20 18:15:20 +00:00
Eren Gölge 3da79a4de4 Comment Tacotron2 model 2021-10-20 18:14:04 +00:00
Eren Gölge 92b6d98443 Set pitch frame alignment wrt spec computation 2021-10-20 18:12:38 +00:00
Eren Gölge 0a3d1cc7ee Pass speaker manager to the model in synthesizer 2021-10-20 18:11:36 +00:00
Eren Gölge 588da1a24e Simplify grad_norm handling in trainer 2021-10-19 16:33:04 +00:00
Eren Gölge 3c7848e9b1 Don't OOR values in train console log 2021-10-19 16:32:16 +00:00
Eren Gölge c514351c0e Refactor multi-speaker init in BaseTTS-Tacotron1-2 2021-10-18 08:55:45 +00:00
Eren Gölge 127571423c Update multi-speaker init in BaseTTS 2021-10-18 08:54:41 +00:00
Eren Gölge a0a5d580e9 Approximate audio length from file size 2021-10-18 08:54:02 +00:00
Eren Gölge b4b890df03 Update trainer's initialization 2021-10-18 08:53:19 +00:00
Eren Gölge fcbfc53cb7 Fix linter 2021-10-15 10:24:19 +00:00
Eren Gölge 700b056117 Update Synthesizer multi-speaker handling 2021-10-15 10:21:12 +00:00
Eren Gölge 073a2d2eb0 Refactor VITS multi-speaker initialization 2021-10-15 10:20:00 +00:00
Eren Gölge 0565457faa Fix #846 2021-10-14 14:46:14 +00:00
Eren Gölge e15bc157d8 Fix #873 2021-10-14 14:39:45 +00:00
Eren Gölge 21cc0517a3 Fix WaveRNN test 2021-10-01 10:21:37 +00:00
Eren Gölge 4dbe7ed0de Fix all-zero duration case for GlowTTS 2021-10-01 09:24:26 +00:00
Eren Gölge 37959ad0c7 Make linter 2021-09-30 23:02:16 +00:00
Eren Gölge 0b1986384f Make style 2021-09-30 16:21:18 +00:00
Eren Gölge 7edbe04fe0 Fix WaveRNN config and test 2021-09-30 16:20:12 +00:00
Eren Gölge 55d9209221 Remote STT tokenizer 2021-09-30 14:58:26 +00:00
Eren Gölge ba2b8c827f Update `train_tts.py` and `train_vocoder.py` 2021-09-30 14:47:56 +00:00
Eren Gölge 2e9b6b4f90 Refactor Speaker Encoder training 2021-09-30 14:47:56 +00:00
Eren Gölge 043dca61b4 Rename `load_meta_data` as `load_tts_data` 2021-09-30 14:47:56 +00:00
Eren Gölge 9f23ad6a0f Fix imports 2021-09-30 14:47:56 +00:00
Eren Gölge 16b70be0dd Add `_set_model_args` to BaseModel 2021-09-30 14:47:56 +00:00
Eren Gölge 9a0d8fa027 Update `copy_model_files()` 2021-09-30 14:47:56 +00:00
Eren Gölge 4163b4f2e4 Update Tacotron models 2021-09-30 14:47:56 +00:00
Eren Gölge e27feade38 Fixup wavernn 2021-09-30 14:47:56 +00:00
Eren Gölge 45889804c2 Update VITS 2021-09-30 14:47:56 +00:00
Eren Gölge 4f94f91305 Update WaveRNN 2021-09-30 14:47:56 +00:00
Eren Gölge 3d5205d66f Update WaveGrad 2021-09-30 14:47:56 +00:00
Eren Gölge fd95926009 Update GlowTTS 2021-09-30 14:47:56 +00:00
Eren Gölge 4baecdf92a Update GAN for Trainer_v2 2021-09-30 14:47:56 +00:00
Eren Gölge a156a40b47 Update ForwardTTS for Trainer_v2 2021-09-30 14:19:19 +00:00
Eren Gölge d9df33f837 Update `align_tts` for trainer_v2 2021-09-30 14:18:10 +00:00
Eren Gölge 8ada870a57 Refactor `trainer.py` for v2 2021-09-30 14:16:34 +00:00
Eren Gölge 7f388f26e3 Bump up to v0.3.1 2021-09-17 23:53:22 +00:00
Eren Gölge 2766dd1d6e Fix #813 - GlowTTS training (#814)
* Fix #813

* Update glow_tts recipe

* Fix glow-tts test

* Linter fix

* Run data dep init only in training
2021-09-17 20:06:55 +02:00
Eren Gölge f563415052 Bump up to v0.3.0 2021-09-13 09:40:38 +00:00
Eren Gölge a97dc8d09f Fix trainer malformatted print 2021-09-13 08:32:02 +00:00
Eren Gölge 91bebebe18 Add new models to `.models.json`
SpeedySpeech model using `ForwardTTS`
UnivNet model fine-tuned on TacotronDDC_ph spectrograms
2021-09-13 08:22:14 +00:00
Eren Gölge 1ea011571a Update SpeedySpeech config 2021-09-12 15:33:27 +00:00
Eren Gölge cbbc9e0172 Add FastSpeechConfig 2021-09-11 10:20:37 +00:00
Eren Gölge 26f76fce22 Remove SpeedySpeech from .models.json 2021-09-10 17:47:27 +00:00
Eren Gölge d97952611d Remove unused import 2021-09-10 17:31:41 +00:00
Eren Gölge 7d8f77385a Use `glow-tts` in synthesis tests 2021-09-10 17:27:33 +00:00
Eren Gölge d5f256b34c Update tacotron `r` init 2021-09-10 17:26:23 +00:00
Eren Gölge ab37fa9c39 Edit AlignTTS 2021-09-10 17:25:00 +00:00
Eren Gölge 66732025e1 Add `base_model` field to `forward_tts` configs 2021-09-10 17:23:48 +00:00
Eren Gölge d6e29ef98a Style update 2021-09-10 08:30:33 +00:00
Eren Gölge a89eb12aca Fix glow_tts imports 2021-09-10 08:29:51 +00:00
Eren Gölge 570d5971be Implement `ForwardTTSLoss` 2021-09-10 08:29:12 +00:00
Eren Gölge 0541a25e90 Remove `fastpitch.py` and `speedy_speech.py` 2021-09-10 08:27:48 +00:00
Eren Gölge 3c16013199 Fix Vits imports 2021-09-10 08:26:34 +00:00