Edresson
eb3e8affe1
Save speaker embeddings/ids before starting training
2021-12-20 11:54:09 +00:00
Eren Gölge
37803467aa
Merge pull request #1021 from loganhart420/dataset_downloaders
...
Add additional datasets
2021-12-20 10:42:20 +01:00
Reuben Morais
859ac1a54c
Include usage instructions in README
2021-12-17 11:37:19 +01:00
loganhart420
103c010eca
Add additional datasets
2021-12-16 07:21:27 -05:00
Jörg Thalheim
bce143c738
server: fix compatibility with tts_models/en/ljspeech/fast_pitch (#893)
2021-12-07 14:36:29 +01:00
Eren Gölge
babdd84f91
Fix GST inference
...
commit d3e477875a7e46a101fcf95a1794442823750fe2
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Wed Nov 3 10:16:12 2021 +0000
Read .wav for GST conditioning from CL
commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 14:43:47 2021 +0100
Fix GST during inference in Tacotron2
commit fdece14585ab5a36eed1061a9a838d8e48aa6882
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Wed Nov 3 10:16:12 2021 +0000
Read .wav for GST conditioning from CL
commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 14:43:47 2021 +0100
Fix GST during inference in Tacotron2
commit 908ce39370eadcc9fa8510cdb26c9ead87305427
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 12:49:37 2021 +0100
Make trim_db value negative
commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42
Author: George Rousssos <25833833+george-roussos@users.noreply.github.com>
Date: Fri Oct 29 12:22:24 2021 +0100
Set find_endpoint db threshold in config.json
2021-12-07 13:28:49 +00:00
Eren Gölge
ce45d9e1af
Make style and lint
2021-12-01 10:42:52 +00:00
Eren Gölge
40cb8ac966
Fix #958
2021-12-01 10:33:34 +00:00
Eren Gölge
512ada7548
Fix callbacks against multi-gpu training
2021-12-01 10:32:14 +00:00
Eren Gölge
2ed9e3c241
Fix constant use of noise augment
2021-11-08 09:20:34 +01:00
Eren Gölge
b6b14a76af
Fix VITS stochastic duration predictor
2021-11-08 09:20:11 +01:00
Eren Gölge
dc3dd55dd9
Add collect_env_info.py
2021-11-08 08:59:08 +01:00
Eren Gölge
faafea4cf2
Fix style
2021-11-04 17:04:40 +01:00
Eren Gölge
d227aaebcc
Print when using Griffin-Lim in Synthesizer
2021-11-01 16:52:26 +01:00
Eren Gölge
c5077c6c3f
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2021-11-01 16:42:27 +01:00
Eren Gölge
20cebde1c9
Add docstring to M-AILABS formatter
2021-11-01 16:41:55 +01:00
Eren Gölge
608f437545
Add a function to find unique chars
2021-11-01 16:41:33 +01:00
Eren Gölge
d6d780e758
Fix FastSpeech config
2021-11-01 16:41:15 +01:00
Eren Gölge
5ba47081ee
Use GL for VCTK FastPitch models
2021-11-01 16:39:03 +01:00
Michael Hansen
3bc043faeb
Upgrade to gruut 2.0 (#882)
2021-10-31 11:41:55 +01:00
George
37eaefc085
Optional silence trimming during inference and find_endpoint() fix (#898)
...
* Set find_endpoint db threshold in config.json
* Optional silence trimming during inference
* Make trim_db value negative
2021-10-29 18:28:55 +02:00
Eren Gölge
7293abada2
Bump up to v0.4.2
2021-10-29 17:57:30 +02:00
Eren Gölge
2df0752e73
Model zoo tests (#900)
...
* Fix VITS model multi-speaker init
* Remove gdrive support in model manager
* Add model zoo tests
2021-10-29 17:54:16 +02:00
Eren Gölge
aaaa591485
Bump up version to v0.4.1
2021-10-26 19:24:17 +02:00
Eren Gölge
3ea1c2037b
Fix model entry in .models.json
2021-10-26 19:14:29 +02:00
Eren Gölge
fa4ec83c6e
Bump up version to v0.4.0
2021-10-26 18:27:39 +02:00
Eren Gölge
035ed432bc
Doc update (#889)
...
* Link source files from the docs
* Update glowTTS recipes for docs
* Add dataset downloaders
2021-10-26 17:41:33 +02:00
Eren Gölge
0cac3f330a
Enable custom formatter in load_tts_samples
2021-10-26 13:07:11 +02:00
Eren Gölge
7c10574931
Gateway for TTS models
2021-10-26 13:04:51 +02:00
Eren Gölge
00becf2671
Fix import statements
2021-10-25 19:29:16 +02:00
Eren Gölge
027424dda8
Add VCTK fast_pitch and UK glow-tts
2021-10-25 19:29:16 +02:00
Eren Gölge
70e4d0e524
Fix grad_norm handling
2021-10-21 16:29:06 +00:00
Eren Gölge
a409e0f8f8
Update train_tts for multi-speaker
2021-10-21 16:29:06 +00:00
Eren Gölge
2b7d159383
Update BaseTTS for multi-speaker training
2021-10-21 16:29:06 +00:00
Eren Gölge
e62d3c5cf7
Use absolute imports for tts configs and models
2021-10-21 16:29:06 +00:00
Eren Gölge
82fed4add2
Make style
2021-10-21 16:05:51 +00:00
Eren Gölge
3cb07fb6b5
Fix SpeakerManager init with data items
2021-10-21 13:54:39 +00:00
Eren Gölge
aea90e2501
Comment synthesis.py
2021-10-21 13:53:45 +00:00
Eren Gölge
1987aaaaed
Update d-vector reshape in synthesizer
2021-10-21 13:53:25 +00:00
Eren Gölge
3ab009ca8d
Edit model configs for multi-speaker
2021-10-21 13:51:37 +00:00
Eren Gölge
cea8e1739b
Update AlignTTS to use SpeakerManager
2021-10-20 18:22:41 +00:00
Eren Gölge
0e768dd4c5
Update comments
2021-10-20 18:21:26 +00:00
Eren Gölge
7c2cb7cc30
Update BaseTTS
2021-10-20 18:18:22 +00:00
Eren Gölge
330ee7d208
Comment BaseTacotron and remove unused funcs
2021-10-20 18:17:25 +00:00
Eren Gölge
aa25f70b95
Update ForwardTTS for multi-speaker
2021-10-20 18:16:41 +00:00
Eren Gölge
0ebc2a400e
Implement `_set_speaker_embedding` in GlowTTS
2021-10-20 18:15:20 +00:00
Eren Gölge
3da79a4de4
Comment Tacotron2 model
2021-10-20 18:14:04 +00:00
Eren Gölge
92b6d98443
Set pitch frame alignment wrt spec computation
2021-10-20 18:12:38 +00:00
Eren Gölge
0a3d1cc7ee
Pass speaker manager to the model in synthesizer
2021-10-20 18:11:36 +00:00
Eren Gölge
588da1a24e
Simplify grad_norm handling in trainer
2021-10-19 16:33:04 +00:00
Eren Gölge
3c7848e9b1
Don't OOR values in train console log
2021-10-19 16:32:16 +00:00
Eren Gölge
c514351c0e
Refactor multi-speaker init in BaseTTS-Tacotron1-2
2021-10-18 08:55:45 +00:00
Eren Gölge
127571423c
Update multi-speaker init in BaseTTS
2021-10-18 08:54:41 +00:00
Eren Gölge
a0a5d580e9
Approximate audio length from file size
2021-10-18 08:54:02 +00:00
Eren Gölge
b4b890df03
Update trainer's initialization
2021-10-18 08:53:19 +00:00
Eren Gölge
fcbfc53cb7
Fix linter
2021-10-15 10:24:19 +00:00
Eren Gölge
700b056117
Update Synthesizer multi-speaker handling
2021-10-15 10:21:12 +00:00
Eren Gölge
073a2d2eb0
Refactor VITS multi-speaker initialization
2021-10-15 10:20:00 +00:00
Eren Gölge
0565457faa
Fix #846
2021-10-14 14:46:14 +00:00
Eren Gölge
e15bc157d8
Fix #873
2021-10-14 14:39:45 +00:00
Eren Gölge
21cc0517a3
Fix WaveRNN test
2021-10-01 10:21:37 +00:00
Eren Gölge
4dbe7ed0de
Fix all-zero duration case for GlowTTS
2021-10-01 09:24:26 +00:00
Eren Gölge
37959ad0c7
Make linter
2021-09-30 23:02:16 +00:00
Eren Gölge
0b1986384f
Make style
2021-09-30 16:21:18 +00:00
Eren Gölge
7edbe04fe0
Fix WaveRNN config and test
2021-09-30 16:20:12 +00:00
Eren Gölge
55d9209221
Remove STT tokenizer
2021-09-30 14:58:26 +00:00
Eren Gölge
ba2b8c827f
Update `train_tts.py` and `train_vocoder.py`
2021-09-30 14:47:56 +00:00
Eren Gölge
2e9b6b4f90
Refactor Speaker Encoder training
2021-09-30 14:47:56 +00:00
Eren Gölge
043dca61b4
Rename `load_meta_data` as `load_tts_data`
2021-09-30 14:47:56 +00:00
Eren Gölge
9f23ad6a0f
Fix imports
2021-09-30 14:47:56 +00:00
Eren Gölge
16b70be0dd
Add `_set_model_args` to BaseModel
2021-09-30 14:47:56 +00:00
Eren Gölge
9a0d8fa027
Update `copy_model_files()`
2021-09-30 14:47:56 +00:00
Eren Gölge
4163b4f2e4
Update Tacotron models
2021-09-30 14:47:56 +00:00
Eren Gölge
e27feade38
Fixup wavernn
2021-09-30 14:47:56 +00:00
Eren Gölge
45889804c2
Update VITS
2021-09-30 14:47:56 +00:00
Eren Gölge
4f94f91305
Update WaveRNN
2021-09-30 14:47:56 +00:00
Eren Gölge
3d5205d66f
Update WaveGrad
2021-09-30 14:47:56 +00:00
Eren Gölge
fd95926009
Update GlowTTS
2021-09-30 14:47:56 +00:00
Eren Gölge
4baecdf92a
Update GAN for Trainer_v2
2021-09-30 14:47:56 +00:00
Eren Gölge
a156a40b47
Update ForwardTTS for Trainer_v2
2021-09-30 14:19:19 +00:00
Eren Gölge
d9df33f837
Update `align_tts` for trainer_v2
2021-09-30 14:18:10 +00:00
Eren Gölge
8ada870a57
Refactor `trainer.py` for v2
2021-09-30 14:16:34 +00:00
Eren Gölge
7f388f26e3
Bump up to v0.3.1
2021-09-17 23:53:22 +00:00
Eren Gölge
2766dd1d6e
Fix #813 - GlowTTS training (#814)
...
* Fix #813
* Update glow_tts recipe
* Fix glow-tts test
* Linter fix
* Run data dep init only in training
2021-09-17 20:06:55 +02:00
Eren Gölge
f563415052
Bump up to v0.3.0
2021-09-13 09:40:38 +00:00
Eren Gölge
a97dc8d09f
Fix malformed print in trainer
2021-09-13 08:32:02 +00:00
Eren Gölge
91bebebe18
Add new models to `.models.json`
...
SpeedySpeech model using `ForwardTTS`
UnivNet model fine-tuned on TacotronDDC_ph spectrograms
2021-09-13 08:22:14 +00:00
Eren Gölge
1ea011571a
Update SpeedySpeech config
2021-09-12 15:33:27 +00:00
Eren Gölge
cbbc9e0172
Add FastSpeechConfig
2021-09-11 10:20:37 +00:00
Eren Gölge
26f76fce22
Remove SpeedySpeech from .models.json
2021-09-10 17:47:27 +00:00
Eren Gölge
d97952611d
Remove unused import
2021-09-10 17:31:41 +00:00
Eren Gölge
7d8f77385a
Use `glow-tts` in synthesis tests
2021-09-10 17:27:33 +00:00
Eren Gölge
d5f256b34c
Update tacotron `r` init
2021-09-10 17:26:23 +00:00
Eren Gölge
ab37fa9c39
Edit AlignTTS
2021-09-10 17:25:00 +00:00
Eren Gölge
66732025e1
Add `base_model` field to `forward_tts` configs
2021-09-10 17:23:48 +00:00
Eren Gölge
d6e29ef98a
Style update
2021-09-10 08:30:33 +00:00
Eren Gölge
a89eb12aca
Fix glow_tts imports
2021-09-10 08:29:51 +00:00
Eren Gölge
570d5971be
Implement `ForwardTTSLoss`
2021-09-10 08:29:12 +00:00
Eren Gölge
0541a25e90
Remove `fastpitch.py` and `speedy_speech.py`
2021-09-10 08:27:48 +00:00
Eren Gölge
3c16013199
Fix Vits imports
2021-09-10 08:26:34 +00:00
Eren Gölge
742f9c54da
Warn user if nan in GL
2021-09-10 08:26:05 +00:00
Eren Gölge
ed4b1d8514
Test `TTS.tts.utils.helpers`
2021-09-10 08:25:21 +00:00
Eren Gölge
8b7e094bde
Implement `forward_tts`
...
- Generic API for feed-forward TTS models (FastPitch, SpeedySpeech)
- Tests for `forward-tts`
- Edit FastPitchConfig and SpeedySpeechConfig to use `forward_tts`
2021-09-10 08:24:33 +00:00
Eren Gölge
3c740d4893
Style extract_tts_spectrogram.py
2021-09-10 08:21:21 +00:00
Eren Gölge
bfc6ceac29
Move MAS to `TTS.tts.utils.helpers`
2021-09-09 10:57:19 +00:00
Eren Gölge
2dfc5bdd11
Fix best_model_path init if no best_mode
2021-09-09 09:01:52 +00:00
Eren Gölge
abf5e48177
Fix logging current learning rate in trainer
2021-09-09 09:01:04 +00:00
Eren Gölge
6c4c1065b0
Fix trainer's scheduler restoring
2021-09-09 09:00:27 +00:00
Eren Gölge
807f1d3817
Fix `extract_tts_spectrograms.py` model init
2021-09-09 08:59:55 +00:00
Eren Gölge
537c8576ec
Stage `TTS.tts.utils.helpers`
2021-09-08 13:35:18 +00:00
Eren Gölge
4761853c5c
Fix imports
2021-09-08 13:34:40 +00:00
Eren Gölge
e20ea57c87
Update comment and add a warning
2021-09-07 12:23:32 +00:00
Eren Gölge
82598f3fdb
Bump up to v0.2.2
2021-09-06 16:59:41 +00:00
Eren Gölge
4cc544bc46
Add FastPitch model to `.models.json`
2021-09-06 16:59:22 +00:00
Eren Gölge
2c4bbbf9b9
Use pyworld for pitch
2021-09-06 15:16:58 +00:00
Eren Gölge
c1513ec4cd
Plot pitch over spectrogram
2021-09-06 15:16:58 +00:00
Eren Gölge
d847a68e42
Reformat multi-speaker handling in GlowTTS
2021-09-06 15:16:58 +00:00
Eren Gölge
8d41060d36
Plot unnormalized pitch by `FastPitch`
2021-09-06 15:16:58 +00:00
Eren Gölge
2b59da802c
Fix loader setup in `base_tts`
2021-09-06 15:16:58 +00:00
Eren Gölge
76c4929ab2
Fix attn mask reading bug
2021-09-06 15:16:58 +00:00
Eren Gölge
91a70e80b2
Refactor TTSDataset
...
Return a dict by `collate`
Refactor batch handling in `collate`
A couple of bug fixes
2021-09-06 15:16:58 +00:00
Eren Gölge
29248536c9
Update `PositionalEncoding`
2021-09-06 15:16:58 +00:00
Eren Gölge
4672889549
Update `generic.FFTransformer`
2021-09-06 15:16:58 +00:00
Eren Gölge
2bf9e83c49
FastPitch refactor and commenting
2021-09-06 15:16:58 +00:00
Eren Gölge
59b24e66cf
Add `AlignerNetwork`
2021-09-06 15:16:58 +00:00
Eren Gölge
648655fa03
Add `PitchExtractor` and return dict by `collate`
2021-09-06 15:16:58 +00:00
Eren Gölge
debf772ec5
Implement binary alignment loss
2021-09-06 15:16:58 +00:00
Eren Gölge
6e9d4062f2
Add `sort_by_audio_len` option
2021-09-06 15:16:58 +00:00
Eren Gölge
59d52a4cd8
Disable autocast for criterions
2021-09-06 15:16:58 +00:00
Eren Gölge
98a7271ce8
Refactor FastPitchv2
2021-09-06 15:16:58 +00:00
Eren Gölge
e429afbce4
Enable aligner for FastPitch
2021-09-06 15:16:58 +00:00
Eren Gölge
81c228a2d8
Update FastPitch: don't detach duration network inputs
2021-09-06 15:16:58 +00:00
Eren Gölge
ca29033ef4
Refactor FastPitch model
2021-09-06 15:16:58 +00:00
Eren Gölge
42862f7fdb
Format style of the recipes
2021-09-06 15:16:58 +00:00
Eren Gölge
5d59100a88
Don't use align_score for models with duration predictor
2021-09-06 15:16:58 +00:00
Eren Gölge
fac9dbe661
Update FastPitchLoss
2021-09-06 15:16:58 +00:00
Eren Gölge
b81560607b
Update docstrings
2021-09-06 15:16:58 +00:00
Eren Gölge
57b3aec1b9
Update docstring format
2021-09-06 15:16:58 +00:00
Eren Gölge
7692bfe7f8
Update FastPitch config
2021-09-06 15:16:58 +00:00
Eren Gölge
8584f2b82d
Update docstring format
2021-09-06 15:16:58 +00:00
Eren Gölge
b7caad39e0
Make detaching the duration predictor input optional
2021-09-06 15:16:58 +00:00
Eren Gölge
9af42f7886
Restore `last_epoch` of the scheduler
2021-09-06 15:16:58 +00:00
Eren Gölge
aacbb3ed77
Fix SpeakerManager usage in `synthesize.py`
2021-09-06 15:16:58 +00:00
Eren Gölge
545a00fc04
Use absolute paths of the attention masks
2021-09-06 15:16:58 +00:00
Eren Gölge
bc396c393f
Add FastPitch model and FastPitchConfig
2021-09-06 15:16:58 +00:00
Eren Gölge
5a6ffaee08
Add YIN-based pitch computation
2021-09-06 15:16:58 +00:00
Eren Gölge
e802b24ad0
Compute mean and std pitch
2021-09-06 15:16:58 +00:00
Eren Gölge
8fffd4e813
Don't print computed phonemes
...
It causes noise in logs
2021-09-06 15:16:58 +00:00
Eren Gölge
d085642ac1
Cache pitch features
...
Cache the features at the beginning of `BaseTTS` training.
2021-09-06 15:16:58 +00:00
Eren Gölge
7590c7db7a
Fix `base_tacotron` `aux_input` handling
2021-09-06 15:16:58 +00:00