Eren Gölge
c312acac7d
Implement VITS model 🚀
VITS model implementation built on Glow TTS and HiFiGAN
layers.
2021-08-09 18:02:36 +00:00
Eren Gölge
232a5abb6a
Update `tts.setup_model`
Run `model.make_symbols()` if available to set the symbol list
2021-08-09 18:02:36 +00:00
Eren Gölge
e4648ffef1
Fix multi-speaker init of Tacotron models & tests
2021-08-09 18:02:36 +00:00
Eren Gölge
01324c8e70
Update `base_tts.py`
Enable calling `make_symbols()` from the model if defined.
Compatibility changes for end2end `tts` models in batch formatting.
Changes in multi-speaker initialization.
Modify `test_run()` to work with dict output of `synthesis`
2021-08-09 18:02:36 +00:00
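The "call `make_symbols()` if the model defines it" behaviour described in the two commits above can be sketched as follows. This is a minimal illustration, not the repository's code: `CharModel`, `PlainModel`, `DEFAULT_SYMBOLS`, `resolve_symbols`, and the argument passed to `make_symbols` are all hypothetical names and signatures chosen for the example.

```python
# Hypothetical stand-ins; the real model classes and make_symbols()
# signature in the repository may differ.
class CharModel:
    """A model that defines its own symbol list."""

    def make_symbols(self, config):
        # Assumption: symbols are derived from a config entry.
        return list(config["characters"])


class PlainModel:
    """A model without make_symbols(); the default list is used."""


DEFAULT_SYMBOLS = list("abc")  # placeholder default symbol list


def resolve_symbols(model, config):
    """Use model.make_symbols() when it exists, else fall back."""
    make_symbols = getattr(model, "make_symbols", None)
    if callable(make_symbols):
        return make_symbols(config)
    return DEFAULT_SYMBOLS


custom = resolve_symbols(CharModel(), {"characters": "xyz"})
fallback = resolve_symbols(PlainModel(), {"characters": "xyz"})
```

Looking the method up with `getattr` keeps the setup code independent of any particular model class, which matches the "if defined" wording of the commit message.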
Agrin Hilmkil
ced4cfdbbf
Allow saving / loading checkpoints from cloud paths (#683)
* Allow saving / loading checkpoints from cloud paths
Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.
Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.
* Append suffix _fsspec to save/load function names
* Add a lower bound to the fsspec dependency
Skips the 0 major version.
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless output_path is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for load_fsspec
* Fix bug where fs not always created
* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
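As a rough illustration of the fsspec-based save/load idea in the commit above, the sketch below writes and reads a checkpoint-like dict through `fsspec.open`. It is an assumption-laden example rather than the project's implementation: `save_checkpoint_sketch` and `load_checkpoint_sketch` are hypothetical helpers, `pickle` stands in for the real serializer, and the in-memory `memory://` protocol is used so the example runs without cloud credentials. A path such as `s3://...` or `gs://...` would additionally need the matching fsspec backend installed, as the commit notes.

```python
import pickle

import fsspec


def save_checkpoint_sketch(state: dict, path: str) -> None:
    """Serialize `state` to any path fsspec understands (hypothetical helper)."""
    with fsspec.open(path, "wb") as f:
        pickle.dump(state, f)


def load_checkpoint_sketch(path: str) -> dict:
    """Read back a checkpoint written by save_checkpoint_sketch."""
    with fsspec.open(path, "rb") as f:
        return pickle.load(f)


# "memory://" is fsspec's built-in in-memory filesystem; swapping the
# prefix for another protocol changes only the path string.
checkpoint_path = "memory://checkpoint.pkl"
save_checkpoint_sketch({"step": 100, "loss": 0.25}, checkpoint_path)
restored = load_checkpoint_sketch(checkpoint_path)
```

The calling code only ever sees a path string, so local and remote storage share one code path.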
Eren Gölge
d9e18e009b
Skip phoneme cache pre-compute if the path exists
2021-08-09 18:02:36 +00:00
Eren Gölge
fc0c4600bd
Fix stopnet training
2021-07-24 11:39:54 +02:00
WeberJulian
25832eb97b
Changes for review
2021-07-15 11:38:45 +02:00
WeberJulian
c79a82ed07
refix linter
2021-07-13 23:12:18 +02:00
WeberJulian
7d92b30946
Fix tests
2021-07-13 23:00:34 +02:00
WeberJulian
32974dd6a9
Fix test sentences synthesis
2021-07-13 16:07:13 +02:00
eren golge
3c0454490f
Fix #616
2021-07-06 14:44:03 +02:00
Eren Gölge
f382e4c700
Fix linter warnings
2021-07-03 13:30:24 +02:00
Eren Gölge
196876feb1
Fix `ModelManager` model download
2021-07-02 10:47:05 +02:00
Eren Gölge
9352cb4136
Format Align TTS docstrings
2021-07-02 10:45:58 +02:00
Eren Gölge
95ad72f38f
Fix glow tts initialization
2021-07-02 10:45:37 +02:00
Eren Gölge
40b0b5365e
Let `get_characters` return `num_chars`
2021-07-02 10:45:00 +02:00
Eren Gölge
2e1a428b83
Update glowtts docstrings and docs
2021-06-30 14:30:55 +02:00
Eren Gölge
9790eddada
Fix wrong argument name 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
51005cdab4
Update `tts.models.setup_model`
2021-06-28 17:03:19 +02:00
Eren Gölge
7b8c15ac49
Create base 🐸 TTS model abstraction for tts models
2021-06-28 17:03:19 +02:00
Eren Gölge
c7aad884cd
Implement unified trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
6d7b5fbcde
`tts` model abstraction with `TTSModel`
2021-06-28 17:03:19 +02:00
Eren Gölge
00c82c516d
rename to
2021-06-28 17:03:19 +02:00
Eren Gölge
25238e0658
fix glow-tts `inference()`
2021-06-28 17:03:19 +02:00
Eren Gölge
419735f440
refactor and fix multi-speaker training in Trainer and Tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
269e5a734e
add max_decoder_steps argument to tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
db6a97d1a2
rename external speaker embedding arguments as `d_vectors`
2021-06-28 17:03:19 +02:00
Eren Gölge
f82f1970b8
change `to(device)` to `type_as` in models
2021-06-28 17:03:19 +02:00
Eren Gölge
1fa15c195a
docstring fix
2021-06-28 17:03:19 +02:00
Eren Gölge
1c8a3d7c86
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
b22b7620c3
update glow-tts output shapes to match [B, T, C]
2021-06-28 17:03:19 +02:00
Eren Gölge
8381379938
format `cond_input` with a function in Tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
6c495c6a6e
fix glow-tts inference and forward functions for handling `cond_input` and refactor its test
2021-06-28 17:03:19 +02:00
Eren Gölge
421194880d
linter fixes
2021-06-28 17:03:19 +02:00
Eren Gölge
d96ebcd6d3
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
b500338faa
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
bb355b7441
update align_tts.py model for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
c70d0c9dae
update `speedy_speech.py` model for trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
4e910993f1
update tacotron model to return `model_outputs`
2021-06-28 17:03:19 +02:00
Eren Gölge
bb4deee64c
update glow-tts for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
9134c7dfb6
update `sequence_mask` import globally
2021-06-28 17:03:19 +02:00
Eren Gölge
535a458f40
update Tacotron models for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
bdbfc95618
add `gradual_training` argument to tacotron.py
2021-06-28 17:03:19 +02:00
Eren Gölge
5a2e75f0ee
add missing imports for tacotron.py
2021-06-28 17:03:19 +02:00
Eren Gölge
da7d10e53c
move `setup_model()` to `models/__init__.py`
2021-06-28 17:03:19 +02:00
Alexander Korolev
c1eb9bdcca
fix speaker dim inference
2021-06-01 15:15:26 +02:00
Alexander Korolev
5b89ef2c6e
fix speaker-embeddings dimension during inference
2021-06-01 11:06:35 +02:00
Eren Gölge
c57f0b46bb
reintroduce use_gst for backwards compat
2021-05-11 11:29:18 +02:00
Eren Gölge
05d9543ed8
init GST module using gst config in Tacotron models
2021-05-11 11:29:17 +02:00
Eren Gölge
a21c0b5585
config update 2 WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
f7582107da
Merge pull request #453 from Edresson/dev
Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS.
2021-05-06 17:53:28 +02:00
Eren Gölge
8cb27267a4
formatting
2021-05-03 14:26:35 +02:00
Edresson
8228091f92
add script for extraction of tts spectrograms
2021-04-23 14:17:46 -03:00
Eren Gölge
d42748082a
update argument name external_speaker_embedding_dim -> speaker_embedding_dim
add inference_noise_scale argument to glow-tts
2021-04-23 18:04:37 +02:00
Eren Gölge
c955a12428
set the default layer size compatible with scglow
2021-04-23 18:04:37 +02:00
Eren Gölge
9cc17be53a
formatting and a small bug fix in Tacotron model
2021-04-15 16:36:51 +02:00
Eren Gölge
3de5a89154
optionally enable prenet dropout at inference time for tacotron models
2021-04-13 13:24:56 +02:00
Eren Gölge
b735076bb4
linter fixes
2021-04-12 13:14:11 +02:00
Eren Gölge
a7f6045644
Merge branch 'reformat' into hifigan-reformat
2021-04-12 12:00:17 +02:00
Eren Gölge
f519012dea
reformatting and styling
2021-04-12 11:47:39 +02:00
Eren Gölge
e5b9607bc3
isort all imports
2021-04-09 00:45:20 +02:00
Eren Gölge
0e79fa86ad
format with black and pylint 2.7.3
2021-04-09 00:38:08 +02:00
Eren Gölge
a3a840fd78
linter fixes
2021-03-30 14:39:16 +02:00
Eren Gölge
6b2e13bf62
compute normalized logp using torch primitives
2021-03-30 14:39:16 +02:00
Eren Gölge
7a382a5c2b
stowed aligntts commit and small refactoring with feed_forward layers
2021-03-30 14:39:16 +02:00
Eren Gölge
aec0b78aff
duration predictor fix 2
2021-03-30 14:39:16 +02:00
Eren Gölge
07269e639b
fix duration predictor in AlignTTS
2021-03-30 14:39:16 +02:00
Eren Gölge
2b3e12ea49
correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting
2021-03-30 14:39:16 +02:00
Eren Gölge
ecb6b0d6ad
rename GlowTtts as GlowTTS
2021-03-30 14:39:16 +02:00
Eren Gölge
0514330869
fix mozilla/TTS#685
2021-03-18 13:33:23 +01:00
Eren Gölge
5c657715f2
fix #382
2021-03-16 17:31:48 +01:00
Eren Gölge
9a48ba3821
a ton of linter updates
2021-03-08 05:06:54 +01:00
Eren Gölge
c990b3a59c
linter fixes and test fixes
2021-01-22 02:32:35 +01:00
root
1faf565e3a
add load_checkpoint func to tts models
2021-01-20 02:10:56 +00:00
erogol
428c224b88
comment update
2021-01-12 17:31:04 +01:00
erogol
79c841ccd3
mass refactoring and update
2021-01-11 17:26:58 +01:00
erogol
d382d759b3
small fixes and test fixes
2021-01-08 15:48:40 +01:00
erogol
a6259041d3
docstring for speedyspeech
2021-01-07 14:35:22 +01:00
erogol
14d33662ea
input shapes for tacotron models
2021-01-06 13:19:40 +01:00
erogol
f288e9a260
docstrings for tacotron models
2021-01-06 13:19:40 +01:00
erogol
7586fbc4de
SS refactoring
2021-01-06 13:19:40 +01:00
erogol
e82d31b6ac
glow tts refactoring
2021-01-06 13:19:40 +01:00
erogol
aa40fe1aa0
SS model refactoring for multi speaker
2021-01-06 13:19:40 +01:00
erogol
eb555855e4
small fixes
2021-01-06 13:19:40 +01:00
erogol
ac5c9217d1
positional encoding masking for SS
2021-01-06 13:19:40 +01:00
erogol
fede46e96e
pylint and test fixes
2021-01-06 13:19:40 +01:00
erogol
cf869e8922
add SS files
2021-01-06 13:19:40 +01:00
erogol
fa6907fa0e
update glow-tts parameters and fix rel-attn-win size
2021-01-06 13:19:40 +01:00
erogol
7b20d8cbd3
implement residual BN convolution and add it as an alternative encoder for glow-tts. also generic layers to layers/generic
2021-01-06 13:19:40 +01:00
erogol
7c3cdced1a
make speaker_mapping a global variable to prevent reload. Fix glow-tts training
2020-12-01 03:23:25 +01:00
Edresson
07345099ee
GlowTTS zero-shot TTS Support
2020-10-24 15:58:39 -03:00
Edresson
b7f9ebd32b
add check arguments for GlowTTS and multispeaker training bug fix
2020-10-19 17:17:58 -03:00
Edresson
99d5a0ac07
add Speaker Conditional GST support
2020-09-29 16:09:27 -03:00
erogol
10258724d1
linter fixes
2020-09-22 03:54:16 +02:00
erogol
e0b9fa887f
glow-tts modules added
2020-09-21 14:15:40 +02:00
erogol
c008003506
do not check sample rate when loading stats file for normalization to enable interpolation for different sample rate vocoder
2020-09-18 12:52:19 +02:00
erogol
3660c57f1e
time separable convolution encoder, huber loss for duration predictor
2020-09-17 03:10:58 +02:00
erogol
f1a75468c2
fix arguments
2020-09-12 04:00:25 +02:00
erogol
45fbc0d003
convolution encoder with GLU and res connections
2020-09-12 03:40:21 +02:00