Eren Gölge
8071fa0020
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
b6c2bfdf08
Refactor synthesis.py for TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
b2bb954a51
Refactor TTSDataset to use TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
196ae74273
Update data loader tests
2022-02-25 11:05:06 +01:00
Eren Gölge
98057a00ae
Make style
2022-02-25 10:57:35 +01:00
Eren Gölge
7575367b9f
Refactor VITS for the tokenizer API
2022-02-25 10:57:35 +01:00
Eren Gölge
4cd690e4c1
Updates BaseTTS and configs
2022-02-25 10:57:35 +01:00
Eren Gölge
176b712c1a
Refactor TTSDataset ⚡️
2022-02-25 10:57:35 +01:00
Eren Gölge
4597d4e5b6
Remove get_characters from BaseTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
2d8ce98d2a
Update imports for symbols -> characters
2022-02-25 10:48:03 +01:00
Eren Gölge
9a95e15483
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:48:03 +01:00
Eren Gölge
d0eb642d88
Refactor synthesis.py for TTSTokenizer
2022-02-25 10:48:03 +01:00
Eren Gölge
04202da1ac
Make style
2022-02-25 10:48:03 +01:00
Eren Gölge
3b63d713b9
Fix espeak wrapper cmd call
2022-02-25 10:48:03 +01:00
Eren Gölge
4894998e6b
Fix print_logs
2022-02-25 10:48:03 +01:00
Eren Gölge
4e8f9d6f10
Fix IPAPhonemes init_from_config
2022-02-25 10:48:03 +01:00
Eren Gölge
0fe39166fe
Discard OOV chars in tokenizer
...
Discard but store OOV chars with a warning message
when the OOV char is first recognized
2022-02-25 10:48:03 +01:00
Eren Gölge
c39aaafbfc
Update EspeakWrapper for espeak-ng
2022-02-25 10:48:03 +01:00
Eren Gölge
bb389479a4
Update setup_model for TTS.tts models
2022-02-25 10:48:03 +01:00
Eren Gölge
3eca5ad060
Update config fields for phonemizer
2022-02-25 10:48:03 +01:00
Eren Gölge
d2525abe8c
Remove get_characters from BaseTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
73d27ebd45
Fix GlowTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
87bf940676
Print duplicate characters
2022-02-25 10:48:03 +01:00
Eren Gölge
3de9f38d16
Add init_from_config to SpeakerManager
2022-02-25 10:48:03 +01:00
Eren Gölge
d8ec7086b6
Update `synthesis` for the new API
2022-02-25 10:48:03 +01:00
Eren Gölge
4e83bf3968
Allow choosing phonemizer
2022-02-25 10:48:02 +01:00
Eren Gölge
22f0c58fe1
Print language codes
2022-02-25 10:48:02 +01:00
Eren Gölge
693fb4dd39
Modify init_from_config for IPAPhonemes
2022-02-25 10:48:02 +01:00
Eren Gölge
ba3b60c90f
Test TTSTokenizer
2022-02-25 10:48:02 +01:00
Eren Gölge
79a84410f2
Test punctuations
2022-02-25 10:48:02 +01:00
Eren Gölge
d8bdeb8b8f
Fix Punctuation
2022-02-25 10:48:02 +01:00
Eren Gölge
ff7c385838
Fix BasePhonemizer
2022-02-25 10:48:02 +01:00
Eren Gölge
10d435ce77
Fixup
2022-02-25 10:48:02 +01:00
Eren Gölge
f0655bfffc
Fix ja_jp_phonemizer
2022-02-25 10:48:02 +01:00
Eren Gölge
20e5dd3678
Add doc examples
2022-02-25 10:48:02 +01:00
Eren Gölge
fbad17e084
Update imports for symbols -> characters
2022-02-25 10:48:02 +01:00
Eren Gölge
a1df4f9887
Test character classes
2022-02-25 10:45:24 +01:00
Eren Gölge
bd461ace33
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:45:24 +01:00
Eren Gölge
5a9653978a
Refactor synthesis.py for TTSTokenizer
2022-02-25 10:45:24 +01:00
Eren Gölge
e5785b34b0
Style fix
2022-02-25 10:27:46 +01:00
Eren Gölge
e4049aa31a
Refactor TTSDataset to use TTSTokenizer
2022-02-25 10:27:46 +01:00
Eren Gölge
2480bbe937
Remove OLD TOKENIZATION ROUTINES
2022-02-25 09:32:54 +01:00
Eren Gölge
8d85af84cd
Implement Punctuation class
2022-02-25 09:32:54 +01:00
Eren Gölge
1aca58afaf
Fix imports in cleaners.py
2022-02-25 09:32:54 +01:00
Eren Gölge
0344645e90
Implement TTSTokenizer
2022-02-25 09:32:54 +01:00
Eren Gölge
2fb1f70503
Implement BaseCharacters, IPAPhonemes, Graphemes
2022-02-25 09:32:54 +01:00
Eren Gölge
1bee40af40
Create language folders under `TTS.tts.utils.text`
2022-02-25 09:32:54 +01:00
Eren Gölge
c1119bc291
Implement BasePhonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
dcd01356e0
Create `text/english` folder
2022-02-25 09:32:54 +01:00
Eren Gölge
80867c8e8c
Implement multi-phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
5e4f78add3
Implement espeak wrapper
2022-02-25 09:32:54 +01:00
Eren Gölge
e03a05c816
Implement gruut wrapper
2022-02-25 09:32:54 +01:00
Eren Gölge
172ba0c5e7
Implement JA_JP phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
ca02b82218
Implement ZH_CH phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
a51b031bff
Merge branch 'dev' into dev-fix-glowtts-infer
2022-02-21 12:01:40 +03:00
Edresson Casanova
28a7464975
Fix the bug in split dataset function ( #1251 )
...
* Fix the bug in split_dataset
* Make eval_split_size configurable
* Change test_loader to use load_tts_samples function
* Change eval_split_portion to eval_split_size and allow setting the absolute number of samples in eval
* Fix samplers unit test
* Add data unit test on GitHub workflow
2022-02-21 11:59:36 +03:00
Edresson Casanova
ba6e56e01c
Fix Glow-TTS multi-speaker inference
2022-02-18 19:25:29 +00:00
Eren Gölge
127118c637
Update TTS.tts formatters ( #1228 )
...
* Return Dict from tts formatters
* Make style
2022-02-11 23:03:43 +01:00
Edresson Casanova
0860d73cf8
Remove TensorFlow requirement ( #1225 )
...
* Remove TF modules
* Remove TF unit tests
* Remove TF vocoder modules
* Remove TF convert scripts
* Remove TF requirement
* Remove the Docs TF instructions
* Remove TF inference support
2022-02-10 16:14:54 +01:00
WeberJulian
e778bad626
Add argument to enable dp speaker conditioning
2022-01-06 15:07:27 +01:00
WeberJulian
e1accb6e28
Fix train_tts.py and uncomment code ( #1051 )
...
* Fix SE loading and language embedding logic
* remove trailing white space
* Uncomment resampling code for SCL
2022-01-03 17:44:57 +01:00
Eren Gölge
d724984be1
Fix language assignment
2022-01-02 11:11:24 +00:00
WeberJulian
a63998c048
Fix phoneme language
2022-01-01 21:08:13 +01:00
Eren Gölge
36cef5966b
Fix resnet speaker encoder
2021-12-30 15:36:35 +00:00
Eren Gölge
348b5c96a2
Fix speaker encoder test
2021-12-30 15:36:35 +00:00
Eren Gölge
7129b04d46
Update VITS model
2021-12-30 14:08:17 +00:00
Eren Gölge
5c5ddd2ba7
Init speaker manager for speaker encoder
2021-12-22 15:51:53 +00:00
Eren Gölge
a25269d897
Remove commented code
2021-12-20 11:54:10 +00:00
Eren Gölge
d29c3780d1
Use speaker_encoder from speaker manager in Vits
2021-12-20 11:54:10 +00:00
Eren Gölge
79de38ca76
Rename setup_model to setup_speaker_encoder_model
2021-12-20 11:54:10 +00:00
Eren Gölge
649dc9e9da
Remove redundant code
2021-12-20 11:54:10 +00:00
Eren Gölge
704dddcffa
Make style
2021-12-20 11:54:10 +00:00
WeberJulian
a564eb9f54
Add support for multi-lingual models in CLI
2021-12-20 11:54:10 +00:00
WeberJulian
2bbcb558dc
Prevent weighted sampler use when num_gpus > 1
2021-12-20 11:54:10 +00:00
WeberJulian
74cedfac38
Revert init multispeaker change
2021-12-20 11:54:10 +00:00
WeberJulian
9cfbacc622
Fix trailing space
2021-12-20 11:54:10 +00:00
WeberJulian
6b03943526
Move multilingual logic out of the trainer
2021-12-20 11:54:10 +00:00
Edresson
67dda0abe1
Add the SCL resample TODO
2021-12-20 11:54:10 +00:00
WeberJulian
8b52fb89d1
Fix merge bug
2021-12-20 11:54:10 +00:00
WeberJulian
09eda31a3f
Fix tests
2021-12-20 11:54:10 +00:00
Edresson
78a23e19df
Fix pylint checks
2021-12-20 11:54:10 +00:00
WeberJulian
4cd0e4eb0d
Remove self.audio_config from VITS
2021-12-20 11:54:10 +00:00
Edresson
d39200e69b
Remove torchaudio requirement
2021-12-20 11:54:10 +00:00
WeberJulian
2e516869a1
Fix trailing whitespace
2021-12-20 11:54:10 +00:00
WeberJulian
ffc269eaf4
Update docstring
2021-12-20 11:54:10 +00:00
Edresson
12968532fe
Add the language embedding dim in the duration predictor class
2021-12-20 11:54:10 +00:00
Edresson
90eac13bb2
Rename unused_speakers to ignored_speakers
2021-12-20 11:54:10 +00:00
Edresson
f34596d957
Fix function name
2021-12-20 11:54:10 +00:00
Edresson
45d0b04179
Lint fixes
2021-12-20 11:54:10 +00:00
Edresson
b769b49e34
Remove the data from the set_d_vectors_from_file function
2021-12-20 11:54:10 +00:00
Edresson
9daa33d1fd
Remove unusable speaker manager function
2021-12-20 11:54:10 +00:00
Edresson
8c22d5ac49
Make the VITS loss function clearer
2021-12-20 11:54:10 +00:00
Edresson
6fc3b9e679
Remove the unusable fine-tuning model
2021-12-20 11:54:10 +00:00
WeberJulian
631addf33b
fix d-vector
2021-12-20 11:54:10 +00:00
WeberJulian
da6c1e858c
Fix small issues
2021-12-20 11:54:10 +00:00
WeberJulian
e8af6a9f08
Fix use_speaker_embedding logic
2021-12-20 11:54:10 +00:00
WeberJulian
120332d53f
Fix phonemes
2021-12-20 11:54:10 +00:00
WeberJulian
1340938159
fix phonemes per language
2021-12-20 11:54:10 +00:00
WeberJulian
e995a63bd6
fix linter
2021-12-20 11:54:10 +00:00
WeberJulian
1472b6df49
make style
2021-12-20 11:54:10 +00:00
WeberJulian
4d721bcabd
fix test sentence synthesis
2021-12-20 11:54:10 +00:00
WeberJulian
0804806727
fix f0_cache_path in dataset
2021-12-20 11:54:10 +00:00
WeberJulian
3b5592abcf
fix test vits
2021-12-20 11:54:10 +00:00
WeberJulian
2a2b5767c2
fix collate_fn
2021-12-20 11:54:10 +00:00
Julian WEBER
78c2d12a91
PitchExtractor
2021-12-20 11:54:10 +00:00
Julian WEBER
9a2f91327c
get_aux_input
2021-12-20 11:54:10 +00:00
Julian WEBER
b3abd01793
Merge dataset
2021-12-20 11:54:10 +00:00
Edresson
1bd1a0546b
Add audio resample in the speaker consistency loss
2021-12-20 11:54:10 +00:00
Edresson
1c6bcda950
Add option to freeze the vocoder generator and flow-based decoder
2021-12-20 11:54:10 +00:00
WeberJulian
2b952d8b97
freeze vits parts
2021-12-20 11:54:10 +00:00
WeberJulian
005bba60b0
get_speaker_weighted_sampler
2021-12-20 11:54:10 +00:00
Edresson
9de4539422
Update the VITS model docs
2021-12-20 11:54:10 +00:00
Edresson
eeb8ac07d9
Add voice conversion fine tuning mode
2021-12-20 11:54:10 +00:00
Edresson
690b37d0ab
Add support for using the speaker encoder as a loss function in the VITS model
2021-12-20 11:54:09 +00:00
Edresson
9b011b1cb3
Add H/ASP original checkpoint support
2021-12-20 11:54:09 +00:00
Edresson
de78556655
Fix the optimizer parameters bug in multilingual and multispeaker training
2021-12-20 11:54:09 +00:00
Edresson
9be5b75da3
Fix bug after merge
2021-12-20 11:54:09 +00:00
Edresson
76251b619a
Fix d-vector multispeaker training bug
2021-12-20 11:54:09 +00:00
Edresson
7ef3ddc6ff
Fix unit tests
2021-12-20 11:54:09 +00:00
Edresson
36dcd11453
Fix pylint issues
2021-12-20 11:54:09 +00:00
Edresson
c53693c155
Implement vocoder fine-tuning as in the SC-GlowTTS paper
2021-12-20 11:54:09 +00:00
Edresson
f1f016314e
Fix the bug in M-AILABS formatter
2021-12-20 11:54:09 +00:00
Edresson
c334d39acc
Add voice conversion support for the VITS model trained with external speaker embeddings
2021-12-20 11:54:09 +00:00
Edresson
e997889ba8
Fix bug in VITS multilingual inference
2021-12-20 11:54:09 +00:00
Edresson
7c0b8ec572
Fix bugs in the non-multilingual VITS inference
2021-12-20 11:54:09 +00:00
Edresson
3fbbebd74d
Fix pylint issues
2021-12-20 11:54:09 +00:00
Edresson
ac9416fb86
Add multilingual inference support
2021-12-20 11:54:09 +00:00
Edresson
dcb2374bc9
Add multilingual training support to the VITS model
2021-12-20 11:54:09 +00:00
Edresson
f996afedb0
Implement multilingual dataloader support
2021-12-20 11:54:09 +00:00
Edresson
5f1c18187f
Fix pylint issues
2021-12-20 11:54:09 +00:00
Edresson
d91c595c5a
Implement training support with d_vecs in the VITS model
2021-12-20 11:54:09 +00:00
Edresson
6a7db67a91
Allow ignoring speakers for all multispeaker datasets
2021-12-20 11:54:09 +00:00
Edresson
e0ad838066
Randomly select a speaker from the speaker manager for the test sentences
2021-12-20 11:54:09 +00:00
Edresson
eb3e8affe1
Save speakers embeddings/ids before starting training
2021-12-20 11:54:09 +00:00
Eren Gölge
babdd84f91
Fix GST inference
...
commit d3e477875a7e46a101fcf95a1794442823750fe2
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Wed Nov 3 10:16:12 2021 +0000
Read .wav for GST conditioning from CL
commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 14:43:47 2021 +0100
Fix GST during inference in Tacotron2
commit fdece14585ab5a36eed1061a9a838d8e48aa6882
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Wed Nov 3 10:16:12 2021 +0000
Read .wav for GST conditioning from CL
commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 14:43:47 2021 +0100
Fix GST during inference in Tacotron2
commit 908ce39370eadcc9fa8510cdb26c9ead87305427
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 12:49:37 2021 +0100
Make trim_db value negative
commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 12:22:24 2021 +0100
Set find_endpoint db threshold in config.json
2021-12-07 13:28:49 +00:00
Eren Gölge
2ed9e3c241
Fix constant use of noise augment
2021-11-08 09:20:34 +01:00
Eren Gölge
b6b14a76af
Fix VITS stochastic duration predictor
2021-11-08 09:20:11 +01:00
Eren Gölge
faafea4cf2
Fix style
2021-11-04 17:04:40 +01:00
Eren Gölge
c5077c6c3f
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2021-11-01 16:42:27 +01:00
Eren Gölge
20cebde1c9
Add docstring to M-AILABS formatter
2021-11-01 16:41:55 +01:00
Eren Gölge
608f437545
Add a function to find unique chars
2021-11-01 16:41:33 +01:00
Eren Gölge
d6d780e758
Fix FastSpeech config
2021-11-01 16:41:15 +01:00
Michael Hansen
3bc043faeb
Upgrade to gruut 2.0 ( #882 )
2021-10-31 11:41:55 +01:00
Eren Gölge
2df0752e73
Model zoo tests ( #900 )
...
* Fix VITS model multi-speaker init
* Remove gdrive support in model manager
* Add model zoo tests
2021-10-29 17:54:16 +02:00
Eren Gölge
035ed432bc
Doc update ( #889 )
...
* Link source files from the docs
* Update glowTTS recipes for docs
* Add dataset downloaders
2021-10-26 17:41:33 +02:00
Eren Gölge
0cac3f330a
Enable custom formatter in load_tts_samples
2021-10-26 13:07:11 +02:00
Eren Gölge
00becf2671
Fix import statements
2021-10-25 19:29:16 +02:00
Eren Gölge
2b7d159383
Update BaseTTS for multi-speaker training
2021-10-21 16:29:06 +00:00
Eren Gölge
e62d3c5cf7
Use absolute imports for tts configs and models
2021-10-21 16:29:06 +00:00
Eren Gölge
82fed4add2
Make style
2021-10-21 16:05:51 +00:00