Eren Gölge
fbad17e084
Update imports for symbols -> characters
2022-02-25 10:48:02 +01:00
Eren Gölge
a1df4f9887
Test character classes
2022-02-25 10:45:24 +01:00
Eren Gölge
bd461ace33
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:45:24 +01:00
Eren Gölge
5a9653978a
Refactor synthesis.py for TTSTokenizer
2022-02-25 10:45:24 +01:00
Eren Gölge
e5785b34b0
Style fix
2022-02-25 10:27:46 +01:00
Eren Gölge
e4049aa31a
Refactor TTSDataset to use TTSTokenizer
2022-02-25 10:27:46 +01:00
Eren Gölge
2480bbe937
Remove OLD TOKENIZATION ROUTINES
2022-02-25 09:32:54 +01:00
Eren Gölge
53f696615b
Add init_from_config to AudioProcessor
2022-02-25 09:32:54 +01:00
Eren Gölge
3d86edfc81
Refactor Synthesizer class for TTSTokenizer
2022-02-25 09:32:54 +01:00
Eren Gölge
8d85af84cd
Implement Punctuation class
2022-02-25 09:32:54 +01:00
Eren Gölge
1aca58afaf
Fix imports in cleaners.py
2022-02-25 09:32:54 +01:00
Eren Gölge
0344645e90
Implement TTSTokenizer
2022-02-25 09:32:54 +01:00
Eren Gölge
2fb1f70503
Implement BaseCharacters, IPAPhonemes, Graphemes
2022-02-25 09:32:54 +01:00
Eren Gölge
1bee40af40
Create language folders under `TTS.tts.utils.text`
2022-02-25 09:32:54 +01:00
Eren Gölge
c1119bc291
Implement BasePhonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
dcd01356e0
Create `text/english` folder
2022-02-25 09:32:54 +01:00
Eren Gölge
80867c8e8c
Implement multi-phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
5e4f78add3
Implement espeak wrapper
2022-02-25 09:32:54 +01:00
Eren Gölge
e03a05c816
Implement gruut wrapper
2022-02-25 09:32:54 +01:00
Eren Gölge
172ba0c5e7
Implement JA_JP phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
ca02b82218
Implement ZH_CH phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
a51b031bff
Merge branch 'dev' into dev-fix-glowtts-infer
2022-02-21 12:01:40 +03:00
Edresson Casanova
28a7464975
Fix the bug in split dataset function ( #1251 )
...
* Fix the bug in split_dataset
* Make eval_split_size configurable
* Change test_loader to use load_tts_samples function
* Change eval_split_portion to eval_split_size and permit setting the absolute number of samples in eval
* Fix samplers unit test
* Add data unit test on GitHub workflow
2022-02-21 11:59:36 +03:00
Edresson Casanova
bc5db13d06
Fix the bug in extract tts spectrogram script
2022-02-19 19:24:00 +00:00
Edresson Casanova
ba6e56e01c
Fix Glow-TTS multi-speaker inference
2022-02-18 19:25:29 +00:00
Eren Gölge
127118c637
Update TTS.tts formatters ( #1228 )
...
* Return Dict from tts formatters
* Make style
2022-02-11 23:03:43 +01:00
Eren Gölge
5e3f499a69
Fix #1187 ( #1227 )
2022-02-11 13:27:59 +01:00
Edresson Casanova
0860d73cf8
Remove TensorFlow requirement ( #1225 )
...
* Remove TF modules
* Remove TF unit tests
* Remove TF vocoder modules
* Remove TF convert scripts
* Remove TF requirement
* Remove the Docs TF instructions
* Remove TF inference support
2022-02-10 16:14:54 +01:00
Eren Gölge
44c7d1a826
Merge pull request #1054 from WeberJulian/partial_embedding_compute
...
Partial embedding compute
2022-02-06 20:13:55 +01:00
WeberJulian
c7f5e005e1
Compute embedding for new audios only
2022-01-06 15:41:38 +01:00
WeberJulian
e778bad626
Add argument to enable dp speaker conditioning
2022-01-06 15:07:27 +01:00
WeberJulian
e1accb6e28
Fix train_tts.py and uncomment code ( #1051 )
...
* Fix SE loading and language embedding logic
* remove trailing white space
* Uncomment resampling code for SCL
2022-01-03 17:44:57 +01:00
Eren Gölge
58c38de58d
Bump up to v0.5.0
2022-01-03 15:04:03 +00:00
Eren Gölge
5840d89802
Keep proj_dim in speaker encoder models
2022-01-03 15:03:34 +00:00
Eren Gölge
03bcae1ba5
Merge pull request #1050 from coqui-ai/fix_synthesizer_init
...
Fix if else statement
2022-01-03 15:59:29 +01:00
Eren Gölge
fc09e319d4
Prioritize the given encoder path over config
2022-01-03 14:24:19 +00:00
Eren Gölge
7fad969a1f
Fix if else statement
2022-01-03 14:16:11 +00:00
Eren Gölge
d724984be1
Fix language assignment
2022-01-02 11:11:24 +00:00
WeberJulian
a63998c048
Fix phoneme language
2022-01-01 21:08:13 +01:00
Eren Gölge
7ef458a59c
Update default vocoder for uk model
2022-01-01 16:09:42 +00:00
Eren Gölge
e55f5ee59e
Make linter
2022-01-01 15:50:04 +00:00
Eren Gölge
38f5a11125
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2022-01-01 15:38:46 +00:00
Eren Gölge
c5512af82b
Update uk vocoder url
2022-01-01 15:38:21 +00:00
Eren Gölge
d37cfe474a
Merge branch 'pr/Edresson/731-rebased' into dev
2022-01-01 15:37:35 +00:00
Eren Gölge
33711afa01
Update yourTTS url
2022-01-01 15:37:08 +00:00
Eren Gölge
8fd1ee1926
Print urls when BadZipError
2022-01-01 15:26:35 +00:00
Eren Gölge
61874bc0a0
Fix your_tts inference from the listed models
2021-12-31 13:45:05 +00:00
Eren Gölge
8100135a7e
Add the YourTTS entry to the models
2021-12-31 12:22:08 +00:00
Eren Gölge
36cef5966b
Fix resnet speaker encoder
2021-12-30 15:36:35 +00:00
Eren Gölge
348b5c96a2
Fix speaker encoder test
2021-12-30 15:36:35 +00:00
Eren Gölge
7129b04d46
Update VITS model
2021-12-30 14:08:17 +00:00
Eren Gölge
638091f41d
Update Speaker Encoder models
2021-12-30 12:02:06 +00:00
Eren Gölge
6189fdfaea
Fix Training HiFiGan -- avg loss not decreasing #1003
2021-12-30 10:48:55 +00:00
Eren Gölge
275c759993
Fix #1037
2021-12-23 15:57:10 +00:00
Eren Gölge
5c5ddd2ba7
Init speaker manager for speaker encoder
2021-12-22 15:51:53 +00:00
Eren Gölge
633dcc9c56
Implement RMS volume normalization
2021-12-22 15:51:14 +00:00
Eren Gölge
8d2bb284ac
Add UK vocoder models
2021-12-21 13:13:35 +00:00
Eren Gölge
56378b12f7
Fix speaker encoder init
2021-12-21 12:26:25 +00:00
Eren Gölge
c9c1fa0548
Fix multi-speaker init in Synthesizer
2021-12-21 09:44:07 +00:00
Eren Gölge
f769595112
Add more listing options to ModelManager
2021-12-20 11:54:10 +00:00
Eren Gölge
a25269d897
Remove commented code
2021-12-20 11:54:10 +00:00
Eren Gölge
473414d4af
Implement init_speaker_encoder and change arg names
2021-12-20 11:54:10 +00:00
Eren Gölge
d29c3780d1
Use speaker_encoder from speaker manager in Vits
2021-12-20 11:54:10 +00:00
Eren Gölge
4d13b887f5
Change speaker_idx to speaker_name
2021-12-20 11:54:10 +00:00
Eren Gölge
4c50f6f4df
Add functions to get and check an argument in config and config.model_args
2021-12-20 11:54:10 +00:00
Eren Gölge
3c6d7f495c
Fixup
2021-12-20 11:54:10 +00:00
Eren Gölge
3818bd0c23
Fixup
2021-12-20 11:54:10 +00:00
Eren Gölge
79de38ca76
Rename setup_model to setup_speaker_encoder_model
2021-12-20 11:54:10 +00:00
Eren Gölge
35a781fb90
Fix synthesizer reading `use_language_embedding`
2021-12-20 11:54:10 +00:00
Eren Gölge
7a987db62b
Use torchaudio for ResNet speaker encoder
2021-12-20 11:54:10 +00:00
Eren Gölge
649dc9e9da
Remove redundant code
2021-12-20 11:54:10 +00:00
Eren Gölge
704dddcffa
Make style
2021-12-20 11:54:10 +00:00
WeberJulian
54b7fb4e4a
Fix zoo tests
2021-12-20 11:54:10 +00:00
WeberJulian
a564eb9f54
Add support for multi-lingual models in CLI
2021-12-20 11:54:10 +00:00
WeberJulian
2bbcb558dc
Prevent weighted sampler use when num_gpus > 1
2021-12-20 11:54:10 +00:00
WeberJulian
74cedfac38
Revert init multispeaker change
2021-12-20 11:54:10 +00:00
WeberJulian
9cfbacc622
Fix trailing space
2021-12-20 11:54:10 +00:00
WeberJulian
6b03943526
Move multilingual logic out of the trainer
2021-12-20 11:54:10 +00:00
Edresson
818dc4ccd8
Add Docstring for TorchSTFT
2021-12-20 11:54:10 +00:00
Edresson
67dda0abe1
Add the SCL resample TODO
2021-12-20 11:54:10 +00:00
WeberJulian
8b52fb89d1
Fix merge bug
2021-12-20 11:54:10 +00:00
WeberJulian
09eda31a3f
Fix tests
2021-12-20 11:54:10 +00:00
Edresson
78a23e19df
Fix pylint checks
2021-12-20 11:54:10 +00:00
WeberJulian
4cd0e4eb0d
Remove self.audio_config from VITS
2021-12-20 11:54:10 +00:00
Edresson
d39200e69b
Remove torchaudio requirement
2021-12-20 11:54:10 +00:00
WeberJulian
2e516869a1
Fix trailing whitespace
2021-12-20 11:54:10 +00:00
WeberJulian
ffc269eaf4
Update docstring
2021-12-20 11:54:10 +00:00
Edresson
12968532fe
Add the language embedding dim in the duration predictor class
2021-12-20 11:54:10 +00:00
Edresson
4196a42de7
Get the number of speakers from the Speaker Manager property
2021-12-20 11:54:10 +00:00
Edresson
f394d60695
Fix the bug in multispeaker vits
2021-12-20 11:54:10 +00:00
Edresson
90eac13bb2
Rename ununsed_speakers to ignored_speakers
2021-12-20 11:54:10 +00:00
Edresson
f34596d957
Fix function name
2021-12-20 11:54:10 +00:00
Edresson
45d0b04179
Lint fixes
2021-12-20 11:54:10 +00:00
Edresson
85418ffeaa
Fix the bug in extract tts spectrograms
2021-12-20 11:54:10 +00:00
Edresson
2b2cecaea2
Set the new_fields in copy_model_files as None by default
2021-12-20 11:54:10 +00:00
Edresson
34749f8727
Remove the call to get_speaker_manager
2021-12-20 11:54:10 +00:00
Edresson
b769b49e34
Remove the data from the set_d_vectors_from_file function
2021-12-20 11:54:10 +00:00
Edresson
9daa33d1fd
Remove unusable speaker manager function
2021-12-20 11:54:10 +00:00
Edresson
8c22d5ac49
Make the VITS loss function clearer
2021-12-20 11:54:10 +00:00
Edresson
6fc3b9e679
Remove the unusable fine-tuning model
2021-12-20 11:54:10 +00:00
Edresson
352aa69eca
Create a module for the VAD script
2021-12-20 11:54:10 +00:00
WeberJulian
631addf33b
fix d-vector
2021-12-20 11:54:10 +00:00
WeberJulian
da6c1e858c
Fix small issues
2021-12-20 11:54:10 +00:00
WeberJulian
e8af6a9f08
Fix use_speaker_embedding logic
2021-12-20 11:54:10 +00:00
WeberJulian
23d789c072
Fix continue path
2021-12-20 11:54:10 +00:00
WeberJulian
120332d53f
Fix phonemes
2021-12-20 11:54:10 +00:00
WeberJulian
846bf16f02
fix imports for load_meta_data
2021-12-20 11:54:10 +00:00
WeberJulian
1340938159
fix phonemes per language
2021-12-20 11:54:10 +00:00
WeberJulian
e995a63bd6
fix linter
2021-12-20 11:54:10 +00:00
WeberJulian
1472b6df49
make style
2021-12-20 11:54:10 +00:00
WeberJulian
4d721bcabd
fix test sentence synthesis
2021-12-20 11:54:10 +00:00
WeberJulian
0804806727
fix f0_cache_path in dataset
2021-12-20 11:54:10 +00:00
WeberJulian
3b5592abcf
fix test vits
2021-12-20 11:54:10 +00:00
WeberJulian
2a2b5767c2
fix collate_fn
2021-12-20 11:54:10 +00:00
Julian WEBER
78c2d12a91
PitchExtractor
2021-12-20 11:54:10 +00:00
Julian WEBER
9a2f91327c
get_aux_input
2021-12-20 11:54:10 +00:00
Julian WEBER
b3abd01793
Merge dataset
2021-12-20 11:54:10 +00:00
Edresson
10ff90d6d2
Add remove silence VAD script
2021-12-20 11:54:10 +00:00
Edresson
1bd1a0546b
Add audio resample in the speaker consistency loss
2021-12-20 11:54:10 +00:00
Edresson
1c6bcda950
Add freeze vocoder generator and flow-based decoder option
2021-12-20 11:54:10 +00:00
WeberJulian
2b952d8b97
freeze vits parts
2021-12-20 11:54:10 +00:00
WeberJulian
005bba60b0
get_speaker_weighted_sampler
2021-12-20 11:54:10 +00:00
Edresson
9de4539422
Update the VITS model docs
2021-12-20 11:54:10 +00:00
Edresson
eeb8ac07d9
Add voice conversion fine tuning mode
2021-12-20 11:54:10 +00:00
Edresson
690b37d0ab
Add support to use the speaker encoder as loss function in VITS model
2021-12-20 11:54:09 +00:00
Edresson
9b011b1cb3
Add H/ASP original checkpoint support
2021-12-20 11:54:09 +00:00
Edresson
0bdfd3cb50
Add ValueError to the restore-checkpoint exception handling to avoid problems with optimizer restoration when new keys are added
2021-12-20 11:54:09 +00:00
Edresson
de78556655
Fix the optimizer parameters bug in multilingual and multispeaker training
2021-12-20 11:54:09 +00:00
Edresson
9be5b75da3
Fix bug after merge
2021-12-20 11:54:09 +00:00
Edresson
76251b619a
Fix d-vector multispeaker training bug
2021-12-20 11:54:09 +00:00
Edresson
7ef3ddc6ff
Fix unit tests
2021-12-20 11:54:09 +00:00
Edresson
36dcd11453
Fix pylint issues
2021-12-20 11:54:09 +00:00
Edresson
c53693c155
Implement vocoder Fine Tuning like SC-GlowTTS paper
2021-12-20 11:54:09 +00:00
Edresson
f1f016314e
Fix the bug in M-AILABS formatter
2021-12-20 11:54:09 +00:00
Edresson
c334d39acc
Add voice conversion support for the model VITS trained with external speaker embedding
2021-12-20 11:54:09 +00:00
Edresson
e997889ba8
Fix bug in VITS multilingual inference
2021-12-20 11:54:09 +00:00
Edresson
7c0b8ec572
Fix bugs in the non-multilingual VITS inference
2021-12-20 11:54:09 +00:00
Edresson
3fbbebd74d
Fix pylint issues
2021-12-20 11:54:09 +00:00
Edresson
ac9416fb86
Add multilingual inference support
2021-12-20 11:54:09 +00:00
Edresson
dcb2374bc9
Add multilingual training support to the VITS model
2021-12-20 11:54:09 +00:00
Edresson
f996afedb0
Implement multilingual dataloader support
2021-12-20 11:54:09 +00:00
Edresson
5f1c18187f
Fix pylint issues
2021-12-20 11:54:09 +00:00
Edresson
d91c595c5a
Implement training support with d_vecs in the VITS model
2021-12-20 11:54:09 +00:00
Edresson
6a7db67a91
Allow ignore speakers for all multispeaker datasets
2021-12-20 11:54:09 +00:00
Edresson
e0ad838066
Randomly select a speaker from the speaker manager for the test sentences
2021-12-20 11:54:09 +00:00
Edresson
eb3e8affe1
Save speakers embeddings/ids before starting training
2021-12-20 11:54:09 +00:00
Eren Gölge
37803467aa
Merge pull request #1021 from loganhart420/dataset_downloaders
...
Add additional datasets
2021-12-20 10:42:20 +01:00
Reuben Morais
859ac1a54c
Include usage instructions in README
2021-12-17 11:37:19 +01:00
loganhart420
103c010eca
Add additional datasets
2021-12-16 07:21:27 -05:00
Jörg Thalheim
bce143c738
server: fix compatibility with tts_models/en/ljspeech/fast_pitch ( #893 )
2021-12-07 14:36:29 +01:00
Eren Gölge
babdd84f91
Fix GST inference
...
commit d3e477875a7e46a101fcf95a1794442823750fe2
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Wed Nov 3 10:16:12 2021 +0000
Read .wav for GST conditioning from CL
commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 14:43:47 2021 +0100
Fix GST during inference in Tacotron2
commit fdece14585ab5a36eed1061a9a838d8e48aa6882
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Wed Nov 3 10:16:12 2021 +0000
Read .wav for GST conditioning from CL
commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 14:43:47 2021 +0100
Fix GST during inference in Tacotron2
commit 908ce39370eadcc9fa8510cdb26c9ead87305427
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 12:49:37 2021 +0100
Make trim_db value negative
commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 12:22:24 2021 +0100
Set find_endpoint db threshold in config.json
2021-12-07 13:28:49 +00:00
Eren Gölge
ce45d9e1af
Make style and lint
2021-12-01 10:42:52 +00:00
Eren Gölge
40cb8ac966
Fix #958
2021-12-01 10:33:34 +00:00
Eren Gölge
512ada7548
Fix callbacks against multi-gpu training
2021-12-01 10:32:14 +00:00
Eren Gölge
2ed9e3c241
Fix constant use of noise augment
2021-11-08 09:20:34 +01:00
Eren Gölge
b6b14a76af
Fix VITS stochastic duration predictor
2021-11-08 09:20:11 +01:00
Eren Gölge
dc3dd55dd9
Add collect_env_info.py
2021-11-08 08:59:08 +01:00
Eren Gölge
faafea4cf2
Fix style
2021-11-04 17:04:40 +01:00
Eren Gölge
d227aaebcc
Print when using Griffin-Lim in Synthesizer
2021-11-01 16:52:26 +01:00
Eren Gölge
c5077c6c3f
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2021-11-01 16:42:27 +01:00
Eren Gölge
20cebde1c9
Add docstring to M-AILABS formatter
2021-11-01 16:41:55 +01:00
Eren Gölge
608f437545
Add a function to find unique chars
2021-11-01 16:41:33 +01:00
Eren Gölge
d6d780e758
Fix FastSpeech config
2021-11-01 16:41:15 +01:00
Eren Gölge
5ba47081ee
Use GL for VCTK FastPitch models
2021-11-01 16:39:03 +01:00
Michael Hansen
3bc043faeb
Upgrade to gruut 2.0 ( #882 )
2021-10-31 11:41:55 +01:00
George
37eaefc085
Optional silence trimming during inference and find_endpoint() fix ( #898 )
...
* Set find_endpoint db threshold in config.json
* Optional silence trimming during inference
* Make trim_db value negative
2021-10-29 18:28:55 +02:00
Eren Gölge
7293abada2
Bump up to v0.4.2
2021-10-29 17:57:30 +02:00
Eren Gölge
2df0752e73
Model zoo tests ( #900 )
...
* Fix VITS model multi-speaker init
* Remove gdrive support in model manager
* Add model zoo tests
2021-10-29 17:54:16 +02:00
Eren Gölge
aaaa591485
Bump up version to v0.4.1
2021-10-26 19:24:17 +02:00
Eren Gölge
3ea1c2037b
Fix model entry in .models.json
2021-10-26 19:14:29 +02:00
Eren Gölge
fa4ec83c6e
Bump up version to v0.4.0
2021-10-26 18:27:39 +02:00
Eren Gölge
035ed432bc
Doc update ( #889 )
...
* Link source files from the docs
* Update glowTTS recipes for docs
* Add dataset downloaders
2021-10-26 17:41:33 +02:00
Eren Gölge
0cac3f330a
Enable custom formatter in load_tts_samples
2021-10-26 13:07:11 +02:00
Eren Gölge
7c10574931
Gateway for TTS models
2021-10-26 13:04:51 +02:00
Eren Gölge
00becf2671
Fix import statements
2021-10-25 19:29:16 +02:00
Eren Gölge
027424dda8
Add VCTK fast_pitch and UK glow-tts
2021-10-25 19:29:16 +02:00
Eren Gölge
70e4d0e524
Fix grad_norm handling
2021-10-21 16:29:06 +00:00
Eren Gölge
a409e0f8f8
Update train_tts for multi-speaker
2021-10-21 16:29:06 +00:00
Eren Gölge
2b7d159383
Update BaseTTS for multi-speaker training
2021-10-21 16:29:06 +00:00
Eren Gölge
e62d3c5cf7
Use absolute imports for tts configs and models
2021-10-21 16:29:06 +00:00
Eren Gölge
82fed4add2
Make style
2021-10-21 16:05:51 +00:00
Eren Gölge
3cb07fb6b5
Fix SpeakerManager init with data items
2021-10-21 13:54:39 +00:00
Eren Gölge
aea90e2501
Comment synthesis.py
2021-10-21 13:53:45 +00:00
Eren Gölge
1987aaaaed
Update d-vector reshape in synthesizer
2021-10-21 13:53:25 +00:00
Eren Gölge
3ab009ca8d
Edit model configs for multi-speaker
2021-10-21 13:51:37 +00:00
Eren Gölge
cea8e1739b
Update AlignTTS to use SpeakerManager
2021-10-20 18:22:41 +00:00
Eren Gölge
0e768dd4c5
Update comments
2021-10-20 18:21:26 +00:00
Eren Gölge
7c2cb7cc30
Update BaseTTS
2021-10-20 18:18:22 +00:00
Eren Gölge
330ee7d208
Comment BaseTacotron and remove unused funcs
2021-10-20 18:17:25 +00:00
Eren Gölge
aa25f70b95
Update ForwardTTS for multi-speaker
2021-10-20 18:16:41 +00:00
Eren Gölge
0ebc2a400e
Implement `_set_speaker_embedding` in GlowTTS
2021-10-20 18:15:20 +00:00
Eren Gölge
3da79a4de4
Comment Tacotron2 model
2021-10-20 18:14:04 +00:00
Eren Gölge
92b6d98443
Set pitch frame alignment wrt spec computation
2021-10-20 18:12:38 +00:00
Eren Gölge
0a3d1cc7ee
Pass speaker manager to the model in synthesizer
2021-10-20 18:11:36 +00:00
Eren Gölge
588da1a24e
Simplify grad_norm handling in trainer
2021-10-19 16:33:04 +00:00
Eren Gölge
3c7848e9b1
Don't OOR values in train console log
2021-10-19 16:32:16 +00:00
Eren Gölge
c514351c0e
Refactor multi-speaker init in BaseTTS-Tacotron1-2
2021-10-18 08:55:45 +00:00
Eren Gölge
127571423c
Update multi-speaker init in BaseTTS
2021-10-18 08:54:41 +00:00
Eren Gölge
a0a5d580e9
Approximate audio length from file size
2021-10-18 08:54:02 +00:00
Eren Gölge
b4b890df03
Update trainer's initialization
2021-10-18 08:53:19 +00:00
Eren Gölge
fcbfc53cb7
Fix linter
2021-10-15 10:24:19 +00:00
Eren Gölge
700b056117
Update Synthesizer multi-speaker handling
2021-10-15 10:21:12 +00:00
Eren Gölge
073a2d2eb0
Refactor VITS multi-speaker initialization
2021-10-15 10:20:00 +00:00
Eren Gölge
0565457faa
Fix #846
2021-10-14 14:46:14 +00:00
Eren Gölge
e15bc157d8
Fix #873
2021-10-14 14:39:45 +00:00
Eren Gölge
21cc0517a3
Fix WaveRNN test
2021-10-01 10:21:37 +00:00
Eren Gölge
4dbe7ed0de
Fix all-zero duration case for GlowTTS
2021-10-01 09:24:26 +00:00
Eren Gölge
37959ad0c7
Make linter
2021-09-30 23:02:16 +00:00
Eren Gölge
0b1986384f
Make style
2021-09-30 16:21:18 +00:00
Eren Gölge
7edbe04fe0
Fix WaveRNN config and test
2021-09-30 16:20:12 +00:00
Eren Gölge
55d9209221
Remove STT tokenizer
2021-09-30 14:58:26 +00:00
Eren Gölge
ba2b8c827f
Update `train_tts.py` and `train_vocoder.py`
2021-09-30 14:47:56 +00:00
Eren Gölge
2e9b6b4f90
Refactor Speaker Encoder training
2021-09-30 14:47:56 +00:00
Eren Gölge
043dca61b4
Rename `load_meta_data` as `load_tts_data`
2021-09-30 14:47:56 +00:00
Eren Gölge
9f23ad6a0f
Fix imports
2021-09-30 14:47:56 +00:00
Eren Gölge
16b70be0dd
Add `_set_model_args` to BaseModel
2021-09-30 14:47:56 +00:00
Eren Gölge
9a0d8fa027
Update `copy_model_files()`
2021-09-30 14:47:56 +00:00
Eren Gölge
4163b4f2e4
Update Tacotron models
2021-09-30 14:47:56 +00:00
Eren Gölge
e27feade38
Fixup wavernn
2021-09-30 14:47:56 +00:00
Eren Gölge
45889804c2
Update VITS
2021-09-30 14:47:56 +00:00
Eren Gölge
4f94f91305
Update WaveRNN
2021-09-30 14:47:56 +00:00
Eren Gölge
3d5205d66f
Update WaveGrad
2021-09-30 14:47:56 +00:00
Eren Gölge
fd95926009
Update GlowTTS
2021-09-30 14:47:56 +00:00
Eren Gölge
4baecdf92a
Update GAN for Trainer_v2
2021-09-30 14:47:56 +00:00
Eren Gölge
a156a40b47
Update ForwardTTS for Trainer_v2
2021-09-30 14:19:19 +00:00
Eren Gölge
d9df33f837
Update `align_tts` for trainer_v2
2021-09-30 14:18:10 +00:00
Eren Gölge
8ada870a57
Refactor `trainer.py` for v2
2021-09-30 14:16:34 +00:00
Eren Gölge
7f388f26e3
Bump up to v0.3.1
2021-09-17 23:53:22 +00:00
Eren Gölge
2766dd1d6e
Fix #813 - GlowTTS training ( #814 )
...
* Fix #813
* Update glow_tts recipe
* Fix glow-tts test
* Linter fix
* Run data dep init only in training
2021-09-17 20:06:55 +02:00
Eren Gölge
f563415052
Bump up to v0.3.0
2021-09-13 09:40:38 +00:00
Eren Gölge
a97dc8d09f
Fix trainer malformatted print
2021-09-13 08:32:02 +00:00
Eren Gölge
91bebebe18
Add new models to `.models.json`
...
SpeedySpeech model using `ForwardTTS`
UnivNet model fine-tuned on TacotronDDC_ph spectrograms
2021-09-13 08:22:14 +00:00
Eren Gölge
1ea011571a
Update SpeedySpeech config
2021-09-12 15:33:27 +00:00
Eren Gölge
cbbc9e0172
Add FastSpeechConfig
2021-09-11 10:20:37 +00:00
Eren Gölge
26f76fce22
Remove SpeedySpeech from .models.json
2021-09-10 17:47:27 +00:00
Eren Gölge
d97952611d
Remove unused import
2021-09-10 17:31:41 +00:00
Eren Gölge
7d8f77385a
Use `glow-tts` in synthesis tests
2021-09-10 17:27:33 +00:00
Eren Gölge
d5f256b34c
Update tacotron `r` init
2021-09-10 17:26:23 +00:00
Eren Gölge
ab37fa9c39
Edit AlignTTS
2021-09-10 17:25:00 +00:00
Eren Gölge
66732025e1
Add `base_model` field to `forward_tts` configs
2021-09-10 17:23:48 +00:00
Eren Gölge
d6e29ef98a
Style update
2021-09-10 08:30:33 +00:00
Eren Gölge
a89eb12aca
Fix glow_tts imports
2021-09-10 08:29:51 +00:00
Eren Gölge
570d5971be
Implement `ForwardTTSLoss`
2021-09-10 08:29:12 +00:00
Eren Gölge
0541a25e90
Remove `fastpitch.py` and `speedy_speech.py`
2021-09-10 08:27:48 +00:00
Eren Gölge
3c16013199
Fix Vits imports
2021-09-10 08:26:34 +00:00
Eren Gölge
742f9c54da
Warn user if nan in GL
2021-09-10 08:26:05 +00:00
Eren Gölge
ed4b1d8514
Test `TTS.tts.utils.helpers`
2021-09-10 08:25:21 +00:00
Eren Gölge
8b7e094bde
Implement `forward_tts`
...
- Generic API for feed-forward TTS models (FastPitch, SpeedySpeech)
- Tests for `forward-tts`
- Edit FastPitchConfig and SpeedySpeechConfig to use `forward_tts`
2021-09-10 08:24:33 +00:00
Eren Gölge
3c740d4893
Style extract_tts_spectrogram.py
2021-09-10 08:21:21 +00:00
Eren Gölge
bfc6ceac29
Move MAS to `TTS.tts.utils.helpers`
2021-09-09 10:57:19 +00:00
Eren Gölge
2dfc5bdd11
Fix best_model_path init if no best_model
2021-09-09 09:01:52 +00:00
Eren Gölge
abf5e48177
Fix logging current learning rate in trainer
2021-09-09 09:01:04 +00:00
Eren Gölge
6c4c1065b0
Fix trainer's scheduler restoring
2021-09-09 09:00:27 +00:00
Eren Gölge
807f1d3817
Fix `extract_tts_spectrograms.py` model init
2021-09-09 08:59:55 +00:00
Eren Gölge
537c8576ec
Stage `TTS.tts.utils.helpers`
2021-09-08 13:35:18 +00:00
Eren Gölge
4761853c5c
Fix imports
2021-09-08 13:34:40 +00:00
Eren Gölge
e20ea57c87
Update comment and add a warning
2021-09-07 12:23:32 +00:00
Eren Gölge
82598f3fdb
Bump up to v0.2.2
2021-09-06 16:59:41 +00:00
Eren Gölge
4cc544bc46
Add FastPitch model to `.models.json`
2021-09-06 16:59:22 +00:00
Eren Gölge
2c4bbbf9b9
Use pyworld for pitch
2021-09-06 15:16:58 +00:00
Eren Gölge
c1513ec4cd
Plot pitch over spectrogram
2021-09-06 15:16:58 +00:00
Eren Gölge
d847a68e42
Reformat multi-speaker handling in GlowTTS
2021-09-06 15:16:58 +00:00
Eren Gölge
8d41060d36
Plot unnormalized pitch by `FastPitch`
2021-09-06 15:16:58 +00:00
Eren Gölge
2b59da802c
Fix loader setup in `base_tts`
2021-09-06 15:16:58 +00:00
Eren Gölge
76c4929ab2
Fix attn mask reading bug
2021-09-06 15:16:58 +00:00
Eren Gölge
91a70e80b2
Refactor TTSDataset
...
Return a dict by `collate`
Refactor batch handling in `collate`
A couple of bug fixes
2021-09-06 15:16:58 +00:00
Eren Gölge
29248536c9
Update `PositionalEncoding`
2021-09-06 15:16:58 +00:00
Eren Gölge
4672889549
Update `generic.FFTransformer`
2021-09-06 15:16:58 +00:00
Eren Gölge
2bf9e83c49
FastPitch refactor and commenting
2021-09-06 15:16:58 +00:00
Eren Gölge
59b24e66cf
Add `AlignerNetwork`
2021-09-06 15:16:58 +00:00
Eren Gölge
648655fa03
Add `PitchExtractor` and return dict by `collate`
2021-09-06 15:16:58 +00:00
Eren Gölge
debf772ec5
Implement binary alignment loss
2021-09-06 15:16:58 +00:00
Eren Gölge
6e9d4062f2
Add `sort_by_audio_len` option
2021-09-06 15:16:58 +00:00
Eren Gölge
59d52a4cd8
Disable autocast for criterions
2021-09-06 15:16:58 +00:00
Eren Gölge
98a7271ce8
Refactor FastPitchv2
2021-09-06 15:16:58 +00:00
Eren Gölge
e429afbce4
Enable aligner for FastPitch
2021-09-06 15:16:58 +00:00
Eren Gölge
81c228a2d8
Update FastPitch: don't detach duration network inputs
2021-09-06 15:16:58 +00:00
Eren Gölge
ca29033ef4
Refactor FastPitch model
2021-09-06 15:16:58 +00:00
Eren Gölge
42862f7fdb
Format style of the recipes
2021-09-06 15:16:58 +00:00
Eren Gölge
5d59100a88
Don't use align_score for models with duration predictor
2021-09-06 15:16:58 +00:00
Eren Gölge
fac9dbe661
Update FastPitchLoss
2021-09-06 15:16:58 +00:00
Eren Gölge
b81560607b
Update docstrings
2021-09-06 15:16:58 +00:00
Eren Gölge
57b3aec1b9
Update docstring format
2021-09-06 15:16:58 +00:00
Eren Gölge
7692bfe7f8
Update FastPitch config
2021-09-06 15:16:58 +00:00
Eren Gölge
8584f2b82d
Update docstring format
2021-09-06 15:16:58 +00:00
Eren Gölge
b7caad39e0
Make detaching the duration predictor input optional
2021-09-06 15:16:58 +00:00
Eren Gölge
9af42f7886
Restore `last_epoch` of the scheduler
2021-09-06 15:16:58 +00:00
Eren Gölge
aacbb3ed77
Fix SpeakerManager usage in `synthesize.py`
2021-09-06 15:16:58 +00:00
Eren Gölge
545a00fc04
Use absolute paths of the attention masks
2021-09-06 15:16:58 +00:00
Eren Gölge
bc396c393f
Add FastPitch model and FastPitchConfig
2021-09-06 15:16:58 +00:00
Eren Gölge
5a6ffaee08
Add yin based pitch computation
2021-09-06 15:16:58 +00:00
Eren Gölge
e802b24ad0
Compute mean and std pitch
2021-09-06 15:16:58 +00:00
Eren Gölge
8fffd4e813
Don't print computed phonemes
...
It causes noise in logs
2021-09-06 15:16:58 +00:00
Eren Gölge
d085642ac1
Cache pitch features
...
Cache the features at the beginning of `BaseTTS` training.
2021-09-06 15:16:58 +00:00
Eren Gölge
7590c7db7a
Fix `base_tacotron` `aux_input` handling
2021-09-06 15:16:58 +00:00
Eren Gölge
db32162eae
Fix `FastPitchLoss`
2021-09-06 15:16:58 +00:00
Eren Gölge
94e8e0d416
Fix configs
2021-09-06 15:16:58 +00:00
Eren Gölge
0f19f8c911
Fix `compute_attention_masks.py`
2021-09-06 15:16:58 +00:00
Eren Gölge
994f2be2c1
Add compute_f0 field
2021-09-06 15:16:58 +00:00
Eren Gölge
c8d999b010
Add FastPitchLoss
2021-09-06 15:16:58 +00:00
Eren Gölge
fba257104d
Compute F0 using librosa
2021-09-06 15:16:58 +00:00
Katsuya Iida
165e5814af
Update Japanese phonemizer ( #758 )
...
* Update default ja vocoder
* update
* Japanese phonemizer test
* Run make style
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2021-09-01 09:33:15 +02:00
Eren Gölge
2b7e55f01f
Fix vits args types
2021-08-30 23:24:20 +00:00
Eren Gölge
b910a6ddce
Bump up to v0.2.1
2021-08-30 16:31:24 +00:00
Eren Gölge
d16da949a5
Merge branch 'fix_distribute' into dev
2021-08-30 16:31:07 +00:00
Eren Gölge
6782d3eab7
Fix linter issues for p3.6
2021-08-30 16:18:33 +00:00
Eren Gölge
738eee0cf9
Fix style
2021-08-30 13:12:13 +00:00
Eren Gölge
5255e089e6
Fix #767
2021-08-30 13:10:08 +00:00
Eren Gölge
c560114324
Fix #750
2021-08-30 13:06:50 +00:00
Eren Gölge
18b2e41e5a
Use `coqui_tts` as the default run name
2021-08-30 12:56:47 +00:00
Eren Gölge
9c86f1ac68
Fix usage of abstract class in vocoders
2021-08-30 08:10:35 +00:00
Eren Gölge
18da8f5dbd
Update pylint 2.10.2 and fix lint issues
2021-08-30 08:10:35 +00:00
Eren Gölge
f186856e5d
Add option to sort input sequence by audio len
2021-08-30 08:10:35 +00:00
Eren Gölge
2620f62ea8
Move duration_loss inside VitsGeneratorLoss
2021-08-27 07:07:07 +00:00
Eren Gölge
1692b8e4d9
Merge pull request #726 from fijipants/patch-1
...
Fix bug with log_func
2021-08-26 22:11:29 +02:00
Eren Gölge
49e1181ea4
Fixes for the vits model
2021-08-26 17:15:09 +00:00
Eren Gölge
5911eec3b1
Small trainer refactoring
...
1. Use a single GradScaler for all the optimizers
2. Save terminal logs to a file. In DDP mode, each worker creates `trainer_N_log.txt`.
3. Fixes to allow only the main worker (rank==0) to write to TensorBoard
4. Pass parameters owned by the target optimizer to the grad_clip_norm
2021-08-26 17:08:58 +00:00
fijipants
e9e01b09b0
Fix bug with log_func
2021-08-18 19:59:51 -04:00
fijipants
8f57f8adfd
Update synthesizer.py
2021-08-18 19:56:52 -04:00
Eren Gölge
3ab8cef99e
Fix VITS model SPD
2021-08-18 14:55:46 +00:00
Eren Gölge
c5d1dd9d1b
Fix restoring best_loss
...
Keep the default value if model checkpoint has no `model_loss`
2021-08-17 12:12:36 +00:00
Eren Gölge
c8bbcdfd07
Fix `test_run` for DDP
2021-08-13 19:39:02 +00:00
Eren Gölge
7c0d564965
Synchronize DDP processes
2021-08-13 10:40:50 +00:00
Eren Gölge
ecf5f17dca
Fix distribute.py and ddp training
2021-08-12 22:22:32 +00:00
Eren Gölge
b02c4fe347
Bump up to v0.2.0
2021-08-11 08:15:39 +00:00
Eren Gölge
537bc8487a
Print model count when listing models
2021-08-10 16:25:11 +00:00
Eren Gölge
09ed8426e8
Add the models released with v0.2.0
2021-08-10 15:46:31 +00:00
Eren Gölge
39004484b9
Fix 🐛
...
Fix synthesizer multi-speaker init
Fix #712
2021-08-10 12:56:32 +00:00
Eren Gölge
c8b9ca3d71
Fix Tacotron num_char init
2021-08-10 08:56:34 +00:00
Eren Gölge
7eb94f760b
Remove Ruslan model
2021-08-09 21:48:36 +00:00
Eren Gölge
6af03ac476
Fix `num_char` init in Tacotron models
2021-08-09 21:46:15 +00:00
Ayush Chaurasia
e685ddfca7
Update trainer.py
2021-08-09 18:37:46 +00:00
Ayush Chaurasia
28870f8df4
update docstring
2021-08-09 18:35:35 +00:00
Ayush Chaurasia
8a246cbb66
Update trainer.py
2021-08-09 18:35:08 +00:00
Ayush Chaurasia
f3e9d61330
Refactor logging initialization
2021-08-09 18:35:08 +00:00
Ayush Chaurasia
79b74a989d
Update: add_text
2021-08-09 18:34:38 +00:00
Ayush Chaurasia
9fcf48b760
Delete logger_base.py
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
290972fd35
reformat
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
936a47504d
Update Logger API, recipes
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
f63cf46c55
Unified logger API
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
f4434da5a3
Update disabled structure
2021-08-09 18:31:16 +00:00
Ayush Chaurasia
f606741dc4
Add artifacts logging , wandb args
2021-08-09 18:31:16 +00:00
Ayush Chaurasia
f5e50ad502
WandbLogger
2021-08-09 18:27:06 +00:00
Eren Gölge
06018251e6
Add VITS and GlowTTS class docs 🗒️
2021-08-09 18:02:36 +00:00
Eren Gölge
6a7275881d
Add VitsConfig docstring
2021-08-09 18:02:36 +00:00
Eren Gölge
f7a72552f1
Make duration predictor dropout configurable
2021-08-09 18:02:36 +00:00
Eren Gölge
c312acac7d
Implement VITS model 🚀
...
VITS model implementation built on Glow TTS and HiFiGAN
layers.
2021-08-09 18:02:36 +00:00
Eren Gölge
060e746e21
Add `do_amp_to_db` option
2021-08-09 18:02:36 +00:00
Eren Gölge
e94c1f894d
Simplify `console_logger`
2021-08-09 18:02:36 +00:00
Eren Gölge
dd55960732
Update `synthesizer.py`
...
Fixes and changes for multi-speaker model init and custom symbols made
by model.make_symbols()
2021-08-09 18:02:36 +00:00
Eren Gölge
232a5abb6a
Update `tts.setup_model`
...
Run `model.make_symbols()` if available to set the symbol list
2021-08-09 18:02:36 +00:00
Eren Gölge
f5a6aa974f
Modify `symbols.py` not to add _arpabet
2021-08-09 18:02:36 +00:00
Eren Gölge
d4deb2716f
Modify `get_optimizer` to accept a model argument
2021-08-09 18:02:36 +00:00
Eren Gölge
003e5579e8
Enable `custom_symbols` in text processing
...
Models can define their own custom symbols lists with custom
`make_symbols()`
2021-08-09 18:02:36 +00:00
Eren Gölge
bd4e29b4dd
Add `compute_linear_spec=False` to `BaseTTSConfig`
2021-08-09 18:02:36 +00:00
Eren Gölge
960a35a121
Add `scheduler_after_epoch` to `BaseTrainingConfig`
2021-08-09 18:02:36 +00:00
Eren Gölge
e4648ffef1
Fix multi-speaker init of Tacotron models & tests
2021-08-09 18:02:36 +00:00
Eren Gölge
01324c8e70
Update `base_tts.py`
...
Enable calling `make_symbols()` from the model if defined.
Compatibility changes for end2end `tts` models in batch formatting.
Changes in multi-speaker initialization.
Modify `test_run()` to work with dict output of `synthesis`
2021-08-09 18:02:36 +00:00
Eren Gölge
bf562cf437
Update `trainer.py`
...
Fix multi-speaker initialization of models. Add changes for end2end `tts`
models.
2021-08-09 18:02:36 +00:00
Agrin Hilmkil
ced4cfdbbf
Allow saving / loading checkpoints from cloud paths ( #683 )
...
* Allow saving / loading checkpoints from cloud paths
Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.
Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.
* Append suffix _fsspec to save/load function names
* Add a lower bound to the fsspec dependency
Skips the 0 major version.
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless output_path is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for load_fsspec
* Fix bug where fs not always created
* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Eren Gölge
d9e18e009b
Skip phoneme cache pre-compute if the path exists
2021-08-09 18:02:36 +00:00
Eren Gölge
6c131d168e
Bump the version to 0.1.3
2021-07-26 21:32:27 +02:00
Eren Gölge
febd6105b5
Update default vocoder for de-thorsten
2021-07-26 16:08:52 +02:00
Eren Gölge
4b7b88dd3d
Add fullband-melgan DE vocoder
2021-07-26 15:38:30 +02:00
Eren Gölge
764f684e1b
Fix `server.py` for multi-speaker models
2021-07-26 15:38:30 +02:00
Eren Gölge
75b201c6c1
Merge pull request #673 from coqui-ai/fix_stopnet
...
Fix stopnet training for Tacotron models
2021-07-24 12:25:38 +02:00
Eren Gölge
fc0c4600bd
Fix stopnet training
2021-07-24 11:39:54 +02:00
Eren Gölge
30eed347b6
Merge pull request #581 from Edresson/dev
...
Compute speaker embeddings in batch for the LSTM Speaker Encoder and Compute embeddings/ finding chars using config file.
2021-07-23 17:22:51 +02:00
Edresson Casanova
d5adc35fdf
Add docstring to compute_embeddings script
2021-07-21 07:16:10 -03:00
Eren Gölge
05c75aa9d5
Fix linter issues
2021-07-16 13:37:38 +02:00
Eren Gölge
58cc414477
Fix WaveGrad `test_run`
2021-07-16 13:02:25 +02:00
WeberJulian
25832eb97b
Changes for review
2021-07-15 11:38:45 +02:00
Edresson
b1620d1f3f
remove ignore generate eval flag
2021-07-15 03:34:28 -03:00
WeberJulian
c79a82ed07
refix linter
2021-07-13 23:12:18 +02:00
WeberJulian
7d92b30946
Fix tests
2021-07-13 23:00:34 +02:00
WeberJulian
32974dd6a9
Fix test sentences synthesis
2021-07-13 16:07:13 +02:00
Edresson
d906fea08c
lint fix and eval as argparse in extract tts spectrograms
2021-07-13 02:15:31 -03:00
Edresson
2e5baffa9c
Merge fix and eval split as argparse
2021-07-13 01:47:32 -03:00
Eren Gölge
93a74cbb71
Merge pull request #628 from Aloento/patch-2
...
Change to _get_preprocessor_by_name
2021-07-11 22:17:50 +02:00
Edresson
4eac1c4651
bug fix on train_encoder and unit tests
2021-07-11 12:00:39 -03:00
Aloento
6e3e6d5756
Change to _get_preprocessor_by_name
2021-07-08 09:53:13 +02:00
Eren Gölge
8fbadad68e
Bump up to v0.1.2
2021-07-06 14:44:59 +02:00
eren golge
3c0454490f
Fix #616
2021-07-06 14:44:03 +02:00
Eren Gölge
0c347624e7
Bump up version to v0.1.1
2021-07-04 11:46:36 +02:00
Eren Gölge
a05b234080
Raise an error when multiple GPUs are in use
...
User must define the target GPU by `CUDA_VISIBLE_DEVICES` and
use `distribute.py` for multi-gpu training.
2021-07-04 11:25:49 +02:00
Eren Gölge
270c3823eb
Fix #608
2021-07-04 11:19:31 +02:00
Eren Gölge
c25a2184e7
Add docs for `SpeakerManager`
2021-07-03 13:55:27 +02:00
Eren Gölge
f382e4c700
Fix linter warnings
2021-07-03 13:30:24 +02:00
Eren Gölge
9e7824fe35
Fix UnivNet inference code
2021-07-02 10:48:34 +02:00
Eren Gölge
168f97cbe9
Let `Synthesizer` use the speaker manager out of the model
2021-07-02 10:47:55 +02:00
Eren Gölge
196876feb1
Fix `ModelManager` model download
2021-07-02 10:47:05 +02:00
Eren Gölge
9352cb4136
Format Align TTS docstrings
2021-07-02 10:45:58 +02:00
Eren Gölge
95ad72f38f
Fix glow tts initialization
2021-07-02 10:45:37 +02:00
Eren Gölge
40b0b5365e
Let `get_characters` return `num_chars`
2021-07-02 10:45:00 +02:00
Eren Gölge
0fa6a8c9b8
Fix glow tts default parameters
2021-07-02 10:44:23 +02:00
Eren Gölge
a4c658f5ef
Fix for using the `Synthesizer` out of the model
2021-07-02 10:43:38 +02:00
Eren Gölge
db47f4f105
Update `.models.json`
2021-07-02 10:43:00 +02:00
Eren Gölge
2e1a428b83
Update glowtts docstrings and docs
2021-06-30 14:30:55 +02:00
Eren Gölge
5723eb4738
Fix config init in `process_args`
2021-06-29 16:41:08 +02:00
Eren Gölge
4b5421b42f
Remove FAQ link from README.md
2021-06-29 13:20:40 +02:00
Eren Gölge
47b3b10d6d
Bump up to v0.1.0 🚀
2021-06-29 13:07:59 +02:00
Eren Gölge
7ec5c31898
Merge branch 'univnet' into trainer-api
2021-06-29 10:27:12 +02:00
Eren Gölge
51398cd15b
Add docstrings and typing for `audio.py`
2021-06-28 17:03:47 +02:00
Eren Gölge
ae6405bb76
Docstrings for `Trainer`
2021-06-28 17:03:47 +02:00
Eren Gölge
6b265ae8e3
Docstring update
2021-06-28 17:03:47 +02:00
Eren Gölge
ab563ce7cd
Start training by config.json using `register_config`
2021-06-28 17:03:47 +02:00
Eren Gölge
b3c073c99b
Allow running full path scripts with `distribute.py`
2021-06-28 17:03:47 +02:00
Eren Gölge
d42d1c02ea
Use `torch.linalg.qr` for pytorch > `v1.9.0`
2021-06-28 17:03:47 +02:00
Eren Gölge
fbba37e01e
Fix loading the `amp` scaler from a checkpoint 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
a7617d8ab6
Add 🐍 python 3.9 to CI
2021-06-28 17:03:47 +02:00
Eren Gölge
9790eddada
Fix wrong argument name 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
932ab107ae
Docstring edit in `TTSDataset.py` ✍️
2021-06-28 17:03:47 +02:00
Eren Gölge
cfa5041db7
Fix `eval_log` for `gan.py` 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
d700845b10
Move `TorchSTFT` to `utils.audio`
2021-06-28 17:03:47 +02:00
Eren Gölge
5b89cb4fec
Fixup `trainer.py` 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
8c74f054f0
Enable support for 🐍 python 3.10
...
Bump up versions numpy 1.19.5 and TF 2.5.0
2021-06-28 17:03:47 +02:00
Eren Gölge
9455a2b01e
Apply small fixes for API compatibility
2021-06-28 17:03:47 +02:00
Eren Gölge
a5d5bc9063
Print `max_decoder_steps` when model reaches the limit
2021-06-28 17:03:47 +02:00
Eren Gölge
e30f245e06
Update `synthesizer` for speaker and model init
2021-06-28 17:03:47 +02:00
Eren Gölge
15fa31b595
fixup configs
2021-06-28 17:03:47 +02:00
Eren Gölge
f23b228e24
Update `speaker_manager`
2021-06-28 17:03:47 +02:00
Eren Gölge
e53616078a
Fixup `utils` for the trainer
2021-06-28 17:03:47 +02:00
Eren Gölge
106b63d8a9
Update `vocoder` utils
2021-06-28 17:03:47 +02:00
Eren Gölge
45947acb60
Update `TTS.bin` scripts for the new API
2021-06-28 17:03:47 +02:00
Eren Gölge
d7225eedb0
Update `vocoder` datasets and `setup_dataset`
2021-06-28 17:03:20 +02:00
Eren Gölge
d18198dff8
Implement `setup_model` for vocoder models
2021-06-28 17:03:20 +02:00
Eren Gölge
e949e7ad58
Update vocoder models
2021-06-28 17:03:19 +02:00
Eren Gölge
51005cdab4
Update `tts.models.setup_model`
2021-06-28 17:03:19 +02:00
Eren Gölge
7b8c15ac49
Create base 🐸 TTS model abstraction for tts models
2021-06-28 17:03:19 +02:00
Eren Gölge
a358f74a52
Update vocoder model configs
2021-06-28 17:03:19 +02:00
Eren Gölge
786170fe7d
Update tts model configs
2021-06-28 17:03:19 +02:00
Eren Gölge
98298ee671
Implement unified IO utils
2021-06-28 17:03:19 +02:00
Eren Gölge
c7aad884cd
Implement unified trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
6d7b5fbcde
`tts` model abstraction with `TTSModel`
2021-06-28 17:03:19 +02:00
Eren Gölge
d4dbd89752
fix calculation of `loader_start_time`
2021-06-28 17:03:19 +02:00
Eren Gölge
c754a0e17d
`TrainerAbstract` and related updates for `TrainerTTS`
2021-06-28 17:03:19 +02:00
Eren Gölge
00c82c516d
rename to
2021-06-28 17:03:19 +02:00
Eren Gölge
166f0aeb9a
merge if branches with the same implementation
2021-06-28 17:03:19 +02:00
Eren Gölge
03494ad642
adjust `distribute.py` for the `train_tts.py`
2021-06-28 17:03:19 +02:00
Eren Gölge
fdfb18d230
downsize melgan test model size
2021-06-28 17:03:19 +02:00
Eren Gölge
25238e0658
fix glow-tts `inference()`
2021-06-28 17:03:19 +02:00
Eren Gölge
419735f440
refactor and fix multi-speaker training in Trainer and Tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
269e5a734e
add max_decoder_steps argument to tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
b3324bd914
fix speaker_manager init
2021-06-28 17:03:19 +02:00
Eren Gölge
2c38ef8441
use get_speaker_manager in Trainer and save speakers.json file when needed
2021-06-28 17:03:19 +02:00
Eren Gölge
d6b2b6add6
make style and linter fixes
2021-06-28 17:03:19 +02:00
Eren Gölge
802d461389
Compute d_vectors and speaker_ids separately in TTSDataset
2021-06-28 17:03:19 +02:00
Eren Gölge
db6a97d1a2
rename external speaker embedding arguments as `d_vectors`
2021-06-28 17:03:19 +02:00
Eren Gölge
9042ae9195
use `to_cuda()` for moving data in `format_batch()`
2021-06-28 17:03:19 +02:00
Eren Gölge
f82f1970b8
change `to(device)` to `type_as` in models
2021-06-28 17:03:19 +02:00