Commit Graph

1598 Commits

Author SHA1 Message Date
Eren Gölge c11944022d Revert back again rand_segment 2022-02-25 11:30:24 +01:00
Eren Gölge 00c7600103 Update Vits model API 2022-02-25 11:30:24 +01:00
Eren Gölge 935a604046 Delete trainer_utils 2022-02-25 11:29:41 +01:00
Eren Gölge d0c27a9661 Update synthesis.py 2022-02-25 11:29:41 +01:00
Eren Gölge 35fc7270ff Implement BaseTTS 2022-02-25 11:28:47 +01:00
Eren Gölge 2bad098625 Implement BaseVocabulary 2022-02-25 11:28:47 +01:00
Eren Gölge 833de62e30 Update base_vocoder 2022-02-25 11:28:14 +01:00
Eren Gölge fc3b6d2861 Update gan 2022-02-25 11:28:14 +01:00
Eren Gölge 20a677c623 Update test_run in wavernn and wavegrad 2022-02-25 11:28:14 +01:00
Eren Gölge be3a03126a Update imports for trainer 2022-02-25 11:28:14 +01:00
Eren Gölge c911729896 Update BaseTrainerModel 2022-02-25 11:28:14 +01:00
Eren Gölge 1e219fef0a Revert drop_last 2022-02-25 11:26:59 +01:00
Eren Gölge 7dfd753d91 Add a cheap trick to avoid short audio clips 2022-02-25 11:26:59 +01:00
Eren Gölge 1a43e05460 Fix VITS loss bug
Fake and real features were given in the wrong args order to
the loss function
2022-02-25 11:26:59 +01:00
Eren Gölge 4b96bfe925 Fix train logging 2022-02-25 11:26:59 +01:00
Eren Gölge ab8a4ca2c3 Revert random segment 2022-02-25 11:26:59 +01:00
Eren Gölge 8622226f3f Make style 2022-02-25 11:26:59 +01:00
Eren Gölge 27db089d6c Change TrainingArgs -> TrainerArgs 2022-02-25 11:26:59 +01:00
Eren Gölge aa81454721 Update BaseTrainingConfig 2022-02-25 11:26:59 +01:00
Eren Gölge d3a58ed07a Fix default values 2022-02-25 11:26:59 +01:00
Eren Gölge 54c6bb2a8c Fix add speaker VITS 2022-02-25 11:26:59 +01:00
Eren Gölge 590b04fb89 Fix espeak_wrapper 2022-02-25 11:26:59 +01:00
Eren Gölge a013566d15 Delete trainer related code 2022-02-25 11:26:59 +01:00
Eren Gölge 38314194e7 Set `drop_last` 2022-02-25 11:26:59 +01:00
Eren Gölge f70e4bb8c6 Add new speakers to the vits model 2022-02-25 11:26:59 +01:00
Eren Gölge d5c0e17548 Load right char class dynamically 2022-02-25 11:26:59 +01:00
Eren Gölge 1f0c8179da Make style 2022-02-25 11:26:59 +01:00
Eren Gölge b3ed6ff6b7 Update FastPitchConfig 2022-02-25 11:26:59 +01:00
Eren Gölge 1932401e8d Fix dataset preprocessing 2022-02-25 11:26:59 +01:00
Eren Gölge 34c4be5e49 Update forwardtts 2022-02-25 11:26:59 +01:00
Eren Gölge bb37462794 Update language manager 2022-02-25 11:26:59 +01:00
Eren Gölge 5169d4eb32 Plot pitch over input characters 2022-02-25 11:26:59 +01:00
Eren Gölge cd5d1497cf Add pitch_fmin pitch_fmax args to the audio 2022-02-25 11:26:59 +01:00
Eren Gölge 1445a46e9e Update synthesizer to use iinit_from_config 2022-02-25 11:26:59 +01:00
Eren Gölge 7058fcc3ff Take file extension as an argument 2022-02-25 11:26:59 +01:00
Eren Gölge 13482dde1f Update GAN model 2022-02-25 11:26:59 +01:00
Eren Gölge 2829027d8b Refactor VITS model 2022-02-25 11:26:59 +01:00
Eren Gölge ef63c99524 Implement `start_by_longest` option for TTSDatase 2022-02-25 11:26:18 +01:00
Eren Gölge c4c471d61d Allow padding for shorter segments 2022-02-25 11:25:48 +01:00
Eren Gölge 47fbddc8d4 Fix docstring 2022-02-25 11:25:48 +01:00
Eren Gölge bc2243bac4 Fix tests 2022-02-25 11:25:00 +01:00
Eren Gölge 146fbfd7c9 Extend unittests 2022-02-25 11:25:00 +01:00
Eren Gölge 2fe16de8e3 Make lint 2022-02-25 11:25:00 +01:00
Eren Gölge 7b49a4aa2b Fix glow_tts_config missing field 2022-02-25 11:24:13 +01:00
Eren Gölge 07b0a80d57 Fix tokenizer init_from_config 2022-02-25 11:24:13 +01:00
Eren Gölge 50e17097a7 Add verbose option to AudioProcessor 2022-02-25 11:24:13 +01:00
Eren Gölge 235f7d9b02 Extend glow_tts model tests 2022-02-25 11:24:13 +01:00
Eren Gölge 8e248913d6 Update train_tts for the new API 2022-02-25 11:24:13 +01:00
Eren Gölge 001da8afc8 Update Vits for the new model API 2022-02-25 11:21:19 +01:00
Eren Gölge 5176ae9e53 Fixes small compat. issues 2022-02-25 11:21:19 +01:00
Eren Gölge 131bc0cfc0 Fix synthesis.py 🔧 2022-02-25 11:18:00 +01:00
Eren Gölge c0746f23df Fix `too many open files` 2022-02-25 11:16:30 +01:00
Eren Gölge df0d58bf09 Update VCTK recipes 2022-02-25 11:16:30 +01:00
Eren Gölge 730f7c0df4 Add file_ext args to resample.py 2022-02-25 11:15:46 +01:00
Eren Gölge 28d98da422 Update VCTK formatter 2022-02-25 11:15:46 +01:00
Eren Gölge 4d99fee3e2 Update spec extractor 2022-02-25 11:12:44 +01:00
Eren Gölge 38a0b3b6c7 Update train_tts.py 2022-02-25 11:11:35 +01:00
Eren Gölge cfaa51fddc Update BaseTTS config 2022-02-25 11:11:35 +01:00
Eren Gölge 4c5cb44eeb Update setup_model 2022-02-25 11:11:35 +01:00
Eren Gölge 7c4243fba7 Update GlowTTS 2022-02-25 11:11:35 +01:00
Eren Gölge bacf79f4fb Update AlignTTS 2022-02-25 11:11:35 +01:00
Eren Gölge 18f726af65 Update ForwardTTS 2022-02-25 11:11:35 +01:00
Eren Gölge d0ec4b91e5 Update Tacotron models 2022-02-25 11:11:35 +01:00
Eren Gölge ea965a5683 Update VITS for the new API 2022-02-25 11:11:35 +01:00
Eren Gölge f802a931a3 Pass samples to init_from_config in SpeakerManager 2022-02-25 11:07:34 +01:00
Eren Gölge bde68d9f25 Use the same phonemizer for `en` to `en-us` 2022-02-25 11:07:34 +01:00
Eren Gölge 8649d4fd36 Allow None pad and blank tokens 2022-02-25 11:07:34 +01:00
Eren Gölge c9972e6f14 Make lint 2022-02-25 11:07:34 +01:00
Eren Gölge 30cfafce56 Add init_from_config 2022-02-25 11:05:54 +01:00
Eren Gölge 90cc45dd4e Update data loader tests 2022-02-25 11:05:54 +01:00
Eren Gölge 93957d58a1 Refactorin VITS for the tokenizer API 2022-02-25 11:05:06 +01:00
Eren Gölge 04df0a3d9f Refactor TTSDataset 2022-02-25 11:05:06 +01:00
Eren Gölge 9bb347a52b Update for tokenizer API 2022-02-25 11:05:06 +01:00
Eren Gölge 452dbc43d8 Update imports for symbols -> characters 2022-02-25 11:05:06 +01:00
Eren Gölge 8071fa0020 Refactor GlowTTS model and recipe for TTSTokenizer 2022-02-25 11:05:06 +01:00
Eren Gölge b6c2bfdf08 Refactor synthesis.py for TTSTokenizer 2022-02-25 11:05:06 +01:00
Eren Gölge b2bb954a51 Refactor TTSDataset to use TTSTokenizer 2022-02-25 11:05:06 +01:00
Eren Gölge 84091096a6 Refactor Synthesizer class for TTSTokenizer 2022-02-25 11:05:06 +01:00
Eren Gölge 196ae74273 Update data loader tests 2022-02-25 11:05:06 +01:00
Eren Gölge 98057a00ae Make style 2022-02-25 10:57:35 +01:00
Eren Gölge 7575367b9f Refactorin VITS for the tokenizer API 2022-02-25 10:57:35 +01:00
Eren Gölge 4cd690e4c1 Updates BaseTTS and configs 2022-02-25 10:57:35 +01:00
Eren Gölge 176b712c1a Refactor TTSDataset 2022-02-25 10:57:35 +01:00
Eren Gölge 4597d4e5b6 Remove get_characters from BaseTTS 2022-02-25 10:48:03 +01:00
Eren Gölge 1df1d6c4a9 Update for tokenizer API 2022-02-25 10:48:03 +01:00
Eren Gölge 2d8ce98d2a Update imports for symbols -> characters 2022-02-25 10:48:03 +01:00
Eren Gölge 9a95e15483 Refactor GlowTTS model and recipe for TTSTokenizer 2022-02-25 10:48:03 +01:00
Eren Gölge d0eb642d88 Refactor synthesis.py for TTSTokenizer 2022-02-25 10:48:03 +01:00
Eren Gölge 3476be30d7 Refactor Synthesizer class for TTSTokenizer 2022-02-25 10:48:03 +01:00
Eren Gölge 9397a56b13 Allow init_from_config from model or audio config 2022-02-25 10:48:03 +01:00
Eren Gölge a71a013276 Fix the wrong default loss name for GAN models 2022-02-25 10:48:03 +01:00
Eren Gölge 04202da1ac Make style 2022-02-25 10:48:03 +01:00
Eren Gölge 3b63d713b9 Fix espeak wrapper cmd call 2022-02-25 10:48:03 +01:00
Eren Gölge 4894998e6b Fix print_logs 2022-02-25 10:48:03 +01:00
Eren Gölge 4e8f9d6f10 Fix IPAPhonemes init_from_config 2022-02-25 10:48:03 +01:00
Eren Gölge 0fe39166fe Discard OOV chars in tokenizer
Discard but store OOV chars with a warninig message
when the OOV char first recognized
2022-02-25 10:48:03 +01:00
Eren Gölge c39aaafbfc Update EspeakWrapper for espeak-ng 2022-02-25 10:48:03 +01:00
Eren Gölge bb389479a4 Update setup_model for TTS.tts models 2022-02-25 10:48:03 +01:00
Eren Gölge 9b83e665fc Add init_from_config as an abstract class 2022-02-25 10:48:03 +01:00
Eren Gölge 3eca5ad060 Update config fields for phonemizer 2022-02-25 10:48:03 +01:00
Eren Gölge d2525abe8c Remove get_characters from BaseTTS 2022-02-25 10:48:03 +01:00
Eren Gölge 73d27ebd45 Fix GlowTTS 2022-02-25 10:48:03 +01:00
Eren Gölge 87bf940676 Print duplicate characters 2022-02-25 10:48:03 +01:00
Eren Gölge 3de9f38d16 Add init_from_config to SpeakerManager 2022-02-25 10:48:03 +01:00
Eren Gölge d8ec7086b6 Update `synthesis` for the new API 2022-02-25 10:48:03 +01:00
Eren Gölge 4e83bf3968 Allow choosing phonemizer 2022-02-25 10:48:02 +01:00
Eren Gölge 22f0c58fe1 Print language codes 2022-02-25 10:48:02 +01:00
Eren Gölge 693fb4dd39 Modify init_from_config for IPAPhonemes 2022-02-25 10:48:02 +01:00
Eren Gölge acc6eef625 Update for tokenizer API 2022-02-25 10:48:02 +01:00
Eren Gölge e1b4c4ca43 Add init_from_config to GAN 2022-02-25 10:48:02 +01:00
Eren Gölge 353f913efc Fix #985 2022-02-25 10:48:02 +01:00
Eren Gölge ba3b60c90f Test TTSTokenizer 2022-02-25 10:48:02 +01:00
Eren Gölge 79a84410f2 Test punctuations 2022-02-25 10:48:02 +01:00
Eren Gölge d8bdeb8b8f Fix Punctuation 2022-02-25 10:48:02 +01:00
Eren Gölge ff7c385838 Fix BasePhonemizer 2022-02-25 10:48:02 +01:00
Eren Gölge 10d435ce77 Fixup 2022-02-25 10:48:02 +01:00
Eren Gölge f0655bfffc Fix ja_jp_phonemizer 2022-02-25 10:48:02 +01:00
Eren Gölge 20e5dd3678 Add doc examples 2022-02-25 10:48:02 +01:00
Eren Gölge fbad17e084 Update imports for symbols -> characters 2022-02-25 10:48:02 +01:00
Eren Gölge a1df4f9887 Test character classes 2022-02-25 10:45:24 +01:00
Eren Gölge bd461ace33 Refactor GlowTTS model and recipe for TTSTokenizer 2022-02-25 10:45:24 +01:00
Eren Gölge 5a9653978a Refactor synthesis.py for TTSTokenizer 2022-02-25 10:45:24 +01:00
Eren Gölge e5785b34b0 Style fix 2022-02-25 10:27:46 +01:00
Eren Gölge e4049aa31a Refactor TTSDataset to use TTSTokenizer 2022-02-25 10:27:46 +01:00
Eren Gölge 2480bbe937 Remove OLD TOKENIZATION ROUTINES 2022-02-25 09:32:54 +01:00
Eren Gölge 53f696615b Add init_from_config to AudioProcessor 2022-02-25 09:32:54 +01:00
Eren Gölge 3d86edfc81 Refactor Synthesizer class for TTSTokenizer 2022-02-25 09:32:54 +01:00
Eren Gölge 8d85af84cd Implement Punctuation class 2022-02-25 09:32:54 +01:00
Eren Gölge 1aca58afaf Fix imports in cleaners.py 2022-02-25 09:32:54 +01:00
Eren Gölge 0344645e90 Implement TTSTokenizer 2022-02-25 09:32:54 +01:00
Eren Gölge 2fb1f70503 Implement BaseCharacters, IPAPhonemes, Graphemes 2022-02-25 09:32:54 +01:00
Eren Gölge 1bee40af40 Create language folders under `TTS.tts.utils.text` 2022-02-25 09:32:54 +01:00
Eren Gölge c1119bc291 Implement BasePhonemizer 2022-02-25 09:32:54 +01:00
Eren Gölge dcd01356e0 Create `text/english` folder 2022-02-25 09:32:54 +01:00
Eren Gölge 80867c8e8c Implement multi-phonemizer 2022-02-25 09:32:54 +01:00
Eren Gölge 5e4f78add3 Implement espeak wrapper 2022-02-25 09:32:54 +01:00
Eren Gölge e03a05c816 Implement gruut wrapper 2022-02-25 09:32:54 +01:00
Eren Gölge 172ba0c5e7 Implement JA_JP phonemizer 2022-02-25 09:32:54 +01:00
Eren Gölge ca02b82218 Implement ZH_CH phonemizer 2022-02-25 09:32:54 +01:00
Eren Gölge a51b031bff
Merge branch 'dev' into dev-fix-glowtts-infer 2022-02-21 12:01:40 +03:00
Edresson Casanova 28a7464975
Fix the bug in split dataset function (#1251)
* Fix the bug in split_dataset

* Make eval_split_size configurable

* Change test_loader to use load_tts_samples function

* Change eval_split_portion to eval_split_size and permits to set the absolute number of samples in eval

* Fix samplers unit test

* Add data unit test on GitHub workflow
2022-02-21 11:59:36 +03:00
Edresson Casanova bc5db13d06 Fix the bug in extract tts spectrogram script 2022-02-19 19:24:00 +00:00
Edresson Casanova ba6e56e01c Fix Glow-TTS multi-speaker inference 2022-02-18 19:25:29 +00:00
Eren Gölge 127118c637
Update TTS.tts formatters (#1228)
* Return Dict from tts formatters

* Make style
2022-02-11 23:03:43 +01:00
Eren Gölge 5e3f499a69
Fix #1187 (#1227) 2022-02-11 13:27:59 +01:00
Edresson Casanova 0860d73cf8
Remove Tensorflow requeriment (#1225)
* Remove TF modules

* Remove TF unit tests

* Remove TF vocoder modules

* Remove TF convert scripts

* Remove TF requirement

* Remove the Docs TF instructions

* Remove TF inference support
2022-02-10 16:14:54 +01:00
Eren Gölge 44c7d1a826
Merge pull request #1054 from WeberJulian/partial_embedding_compute
Partial embedding compute
2022-02-06 20:13:55 +01:00
WeberJulian c7f5e005e1 Compute embedding for new audios only 2022-01-06 15:41:38 +01:00
WeberJulian e778bad626 Add argument to enable dp speaker conditioning 2022-01-06 15:07:27 +01:00
WeberJulian e1accb6e28
Fix train_tts.py and uncomment code (#1051)
* Fix SE loading and language embedding logic

* remove trailing white space

* Uncomment resmapling code for SCL
2022-01-03 17:44:57 +01:00