Eren Gölge
960a35a121
Add `scheduler_after_epoch` to `BaseTrainingConfig`
2021-08-09 18:02:36 +00:00
Eren Gölge
e4648ffef1
Fix multi-speaker init of Tacotron models & tests
2021-08-09 18:02:36 +00:00
Eren Gölge
01324c8e70
Update `base_tts.py`
...
Enable calling `make_symbols()` from the model if defined.
Compatibility changes for end2end `tts` models in batch formatting.
Changes in multi-speaker initialization.
Modify `test_run()` to work with dict output of `synthesis`
2021-08-09 18:02:36 +00:00
Eren Gölge
bf562cf437
Update `trainer.py`
...
Fix multi-speaker initialization of models. Add changes for end2end `tts`
models.
2021-08-09 18:02:36 +00:00
Agrin Hilmkil
ced4cfdbbf
Allow saving / loading checkpoints from cloud paths ( #683 )
...
* Allow saving / loading checkpoints from cloud paths
Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.
Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.
* Append suffix _fsspec to save/load function names
* Add a lower bound to the fsspec dependency
Skips the 0 major version.
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless output_path is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for load_fsspec
* Fix bug where fs not always created
* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Eren Gölge
d9e18e009b
Skip phoneme cache pre-compute if the path exists
2021-08-09 18:02:36 +00:00
Eren Gölge
6c131d168e
Bump the version to 0.1.3
2021-07-26 21:32:27 +02:00
Eren Gölge
febd6105b5
Update default vocoder for de-thorsten
2021-07-26 16:08:52 +02:00
Eren Gölge
4b7b88dd3d
Add fullband-melgan DE vocoder
2021-07-26 15:38:30 +02:00
Eren Gölge
764f684e1b
Fix `server.py` for multi-speaker models
2021-07-26 15:38:30 +02:00
Eren Gölge
75b201c6c1
Merge pull request #673 from coqui-ai/fix_stopnet
...
Fix stopnet training for Tacotron models
2021-07-24 12:25:38 +02:00
Eren Gölge
fc0c4600bd
Fix stopnet training
2021-07-24 11:39:54 +02:00
Eren Gölge
30eed347b6
Merge pull request #581 from Edresson/dev
...
Compute speaker embeddings in batch for the LSTM Speaker Encoder, and compute embeddings / find characters using the config file.
2021-07-23 17:22:51 +02:00
Edresson Casanova
d5adc35fdf
Add docstring to compute_embeddings script
2021-07-21 07:16:10 -03:00
Eren Gölge
05c75aa9d5
Fix linter issues
2021-07-16 13:37:38 +02:00
Eren Gölge
58cc414477
Fix WaveGrad `test_run`
2021-07-16 13:02:25 +02:00
WeberJulian
25832eb97b
Changes for review
2021-07-15 11:38:45 +02:00
Edresson
b1620d1f3f
remove ignore generate eval flag
2021-07-15 03:34:28 -03:00
WeberJulian
c79a82ed07
refix linter
2021-07-13 23:12:18 +02:00
WeberJulian
7d92b30946
Fix tests
2021-07-13 23:00:34 +02:00
WeberJulian
32974dd6a9
Fix test sentences synthesis
2021-07-13 16:07:13 +02:00
Edresson
d906fea08c
lint fix and eval as argparse in extract tts spectrograms
2021-07-13 02:15:31 -03:00
Edresson
2e5baffa9c
Merge fix and eval split as argparse
2021-07-13 01:47:32 -03:00
Eren Gölge
93a74cbb71
Merge pull request #628 from Aloento/patch-2
...
Change to _get_preprocessor_by_name
2021-07-11 22:17:50 +02:00
Edresson
4eac1c4651
bug fix on train_encoder and unit tests
2021-07-11 12:00:39 -03:00
Aloento
6e3e6d5756
Change to _get_preprocessor_by_name
2021-07-08 09:53:13 +02:00
Eren Gölge
8fbadad68e
Bump up to v0.1.2
2021-07-06 14:44:59 +02:00
eren golge
3c0454490f
Fix #616
2021-07-06 14:44:03 +02:00
Eren Gölge
0c347624e7
Bump up version to v0.1.1
2021-07-04 11:46:36 +02:00
Eren Gölge
a05b234080
Raise an error when multiple GPUs are in use
...
User must define the target GPU by `CUDA_VISIBLE_DEVICES` and
use `distribute.py` for multi-gpu training.
2021-07-04 11:25:49 +02:00
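The single-GPU requirement described in the commit above could be sketched roughly as follows. This is only an illustration, not the project's actual check: `check_single_gpu` is a hypothetical helper that inspects just the `CUDA_VISIBLE_DEVICES` string, whereas a real trainer would more likely ask the deep learning framework how many devices it can see.

```python
import os


def check_single_gpu(env=None):
    """Return the single GPU id listed in CUDA_VISIBLE_DEVICES, or None if unset.

    Hypothetical helper: raises RuntimeError when more than one device is listed.
    """
    env = os.environ if env is None else env
    visible = env.get("CUDA_VISIBLE_DEVICES", "")
    # Split the comma-separated device list and drop empty entries.
    devices = [d.strip() for d in visible.split(",") if d.strip()]
    if len(devices) > 1:
        raise RuntimeError(
            "Multiple GPUs are visible; set CUDA_VISIBLE_DEVICES to one device "
            "or launch multi-GPU training with distribute.py."
        )
    return devices[0] if devices else None
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to try without touching the real environment, for example `check_single_gpu({"CUDA_VISIBLE_DEVICES": "0"})`.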
Eren Gölge
270c3823eb
Fix #608
2021-07-04 11:19:31 +02:00
Eren Gölge
c25a2184e7
Add docs for `SpeakerManager`
2021-07-03 13:55:27 +02:00
Eren Gölge
f382e4c700
Fix linter warnings
2021-07-03 13:30:24 +02:00
Eren Gölge
9e7824fe35
Fix UnivNet inference code
2021-07-02 10:48:34 +02:00
Eren Gölge
168f97cbe9
Let `Synthesizer` use the speaker manager out of the model
2021-07-02 10:47:55 +02:00
Eren Gölge
196876feb1
Fix `ModelManager` model download
2021-07-02 10:47:05 +02:00
Eren Gölge
9352cb4136
Format Align TTS docstrings
2021-07-02 10:45:58 +02:00
Eren Gölge
95ad72f38f
Fix glow tts initialization
2021-07-02 10:45:37 +02:00
Eren Gölge
40b0b5365e
Let `get_characters` return `num_chars`
2021-07-02 10:45:00 +02:00
Eren Gölge
0fa6a8c9b8
Fix glow tts default parameters
2021-07-02 10:44:23 +02:00
Eren Gölge
a4c658f5ef
Fix for using the `Synthesizer` out of the model
2021-07-02 10:43:38 +02:00
Eren Gölge
db47f4f105
Update `.models.json`
2021-07-02 10:43:00 +02:00
Eren Gölge
2e1a428b83
Update glowtts docstrings and docs
2021-06-30 14:30:55 +02:00
Eren Gölge
5723eb4738
Fix config init in `process_args`
2021-06-29 16:41:08 +02:00
Eren Gölge
4b5421b42f
Remove FAQ link from README.md
2021-06-29 13:20:40 +02:00
Eren Gölge
47b3b10d6d
Bump up to v0.1.0 🚀
2021-06-29 13:07:59 +02:00
Eren Gölge
7ec5c31898
Merge branch 'univnet' into trainer-api
2021-06-29 10:27:12 +02:00
Eren Gölge
51398cd15b
Add docstrings and typing for `audio.py`
2021-06-28 17:03:47 +02:00
Eren Gölge
ae6405bb76
Docstrings for `Trainer`
2021-06-28 17:03:47 +02:00
Eren Gölge
6b265ae8e3
Docstring update
2021-06-28 17:03:47 +02:00
Eren Gölge
ab563ce7cd
Start training by config.json using `register_config`
2021-06-28 17:03:47 +02:00
Eren Gölge
b3c073c99b
Allow running full path scripts with `distribute.py`
2021-06-28 17:03:47 +02:00
Eren Gölge
d42d1c02ea
Use `torch.linalg.qr` for pytorch > `v1.9.0`
2021-06-28 17:03:47 +02:00
Eren Gölge
fbba37e01e
Fix loading the `amp` scaler from a checkpoint 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
a7617d8ab6
Add 🐍 python 3.9 to CI
2021-06-28 17:03:47 +02:00
Eren Gölge
9790eddada
Fix wrong argument name 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
932ab107ae
Docstring edit in `TTSDataset.py` ✍️
2021-06-28 17:03:47 +02:00
Eren Gölge
cfa5041db7
Fix `eval_log` for `gan.py` 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
d700845b10
Move `TorchSTFT` to `utils.audio`
2021-06-28 17:03:47 +02:00
Eren Gölge
5b89cb4fec
Fixup `trainer.py` 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
8c74f054f0
Enable support for 🐍 python 3.10
...
Bump up versions numpy 1.19.5 and TF 2.5.0
2021-06-28 17:03:47 +02:00
Eren Gölge
9455a2b01e
Apply small fixes for API compatibility
2021-06-28 17:03:47 +02:00
Eren Gölge
a5d5bc9063
Print `max_decoder_steps` when model reaches the limit
2021-06-28 17:03:47 +02:00
Eren Gölge
e30f245e06
Update `synthesizer` for speaker and model init
2021-06-28 17:03:47 +02:00
Eren Gölge
15fa31b595
fixup configs
2021-06-28 17:03:47 +02:00
Eren Gölge
f23b228e24
Update `speaker_manager`
2021-06-28 17:03:47 +02:00
Eren Gölge
e53616078a
Fixup `utils` for the trainer
2021-06-28 17:03:47 +02:00
Eren Gölge
106b63d8a9
Update `vocoder` utils
2021-06-28 17:03:47 +02:00
Eren Gölge
45947acb60
Update `TTS.bin` scripts for the new API
2021-06-28 17:03:47 +02:00
Eren Gölge
d7225eedb0
Update `vocoder` datasets and `setup_dataset`
2021-06-28 17:03:20 +02:00
Eren Gölge
d18198dff8
Implement `setup_model` for vocoder models
2021-06-28 17:03:20 +02:00
Eren Gölge
e949e7ad58
Update vocoder models
2021-06-28 17:03:19 +02:00
Eren Gölge
51005cdab4
Update `tts.models.setup_model`
2021-06-28 17:03:19 +02:00
Eren Gölge
7b8c15ac49
Create base 🐸 TTS model abstraction for tts models
2021-06-28 17:03:19 +02:00
Eren Gölge
a358f74a52
Update vocoder model configs
2021-06-28 17:03:19 +02:00
Eren Gölge
786170fe7d
Update tts model configs
2021-06-28 17:03:19 +02:00
Eren Gölge
98298ee671
Implement unified IO utils
2021-06-28 17:03:19 +02:00
Eren Gölge
c7aad884cd
Implement unified trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
6d7b5fbcde
`tts` model abstraction with `TTSModel`
2021-06-28 17:03:19 +02:00
Eren Gölge
d4dbd89752
fix calculation of `loader_start_time`
2021-06-28 17:03:19 +02:00
Eren Gölge
c754a0e17d
`TrainerAbstract` and related updates for `TrainerTTS`
2021-06-28 17:03:19 +02:00
Eren Gölge
00c82c516d
rename to
2021-06-28 17:03:19 +02:00
Eren Gölge
166f0aeb9a
merge if branches with the same implementation
2021-06-28 17:03:19 +02:00
Eren Gölge
03494ad642
adjust `distribute.py` for the `train_tts.py`
2021-06-28 17:03:19 +02:00
Eren Gölge
fdfb18d230
downsize melgan test model size
2021-06-28 17:03:19 +02:00
Eren Gölge
25238e0658
fix glow-tts `inference()`
2021-06-28 17:03:19 +02:00
Eren Gölge
419735f440
refactor and fix multi-speaker training in Trainer and Tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
269e5a734e
add max_decoder_steps argument to tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
b3324bd914
fix speaker_manager init
2021-06-28 17:03:19 +02:00
Eren Gölge
2c38ef8441
use get_speaker_manager in Trainer and save speakers.json file when needed
2021-06-28 17:03:19 +02:00
Eren Gölge
d6b2b6add6
make style and linter fixes
2021-06-28 17:03:19 +02:00
Eren Gölge
802d461389
Compute d_vectors and speaker_ids separately in TTSDataset
2021-06-28 17:03:19 +02:00
Eren Gölge
db6a97d1a2
rename external speaker embedding arguments as `d_vectors`
2021-06-28 17:03:19 +02:00
Eren Gölge
9042ae9195
use `to_cuda()` for moving data in `format_batch()`
2021-06-28 17:03:19 +02:00
Eren Gölge
f82f1970b8
change `to(device)` to `type_as` in models
2021-06-28 17:03:19 +02:00
Eren Gölge
9c94b0c5c0
init `durations = None`
2021-06-28 17:03:19 +02:00
Eren Gölge
1fa15c195a
docstring fix
2021-06-28 17:03:19 +02:00
Eren Gölge
1c8a3d7c86
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
8cdd423234
styling formatting.py
2021-06-28 17:03:19 +02:00
Eren Gölge
30211512a4
fix type annotations
2021-06-28 17:03:19 +02:00
Eren Gölge
b22b7620c3
update glow-tts output shapes to match [B, T, C]
2021-06-28 17:03:19 +02:00
Eren Gölge
8381379938
formatting `cond_input` with a function in Tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
ef4ea9e527
update imports for `formatters`
2021-06-28 17:03:19 +02:00
Eren Gölge
6c495c6a6e
fix glow-tts inference and forward functions for handling `cond_input` and refactor its test
2021-06-28 17:03:19 +02:00
Eren Gölge
f840268181
refactor `SpeakerManager`
2021-06-28 17:03:19 +02:00
Eren Gölge
421194880d
linter fixes
2021-06-28 17:03:19 +02:00
Eren Gölge
8e52a69230
delete separate tts training scripts and pre-commit configuration
2021-06-28 17:03:19 +02:00
Eren Gölge
d96ebcd6d3
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
b643e8b37c
`logging/__init__.py`
2021-06-28 17:03:19 +02:00
Eren Gölge
0cee5042a9
fix logger imports
2021-06-28 17:03:19 +02:00
Eren Gölge
72dceca52c
add missing imports
2021-06-28 17:03:19 +02:00
Eren Gölge
0eec238429
remove redundant imports
2021-06-28 17:03:19 +02:00
Eren Gölge
b500338faa
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
469d2e620a
update extract_tts_spectrogram for `cond_input` API of the models
2021-06-28 17:03:19 +02:00
Eren Gölge
5ab28fa618
update `extract_tts_spec...` using `SpeakerManager`
2021-06-28 17:03:19 +02:00
Eren Gölge
c392fa4288
update `extract_tts_spectrograms` for the new model API
2021-06-28 17:03:19 +02:00
Eren Gölge
8f47f95998
correct import of `load_meta_data`
...
remove redundant import
2021-06-28 17:03:19 +02:00
Eren Gölge
c680a07a20
fix `Synthesizer` for the new `synthesis()`
2021-06-28 17:03:19 +02:00
Eren Gölge
73bf9673ed
revert logging.info to print statements for trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
d25f017b42
update `setup_model.py` imports
2021-06-28 17:03:19 +02:00
Eren Gölge
bb355b7441
update align_tts.py model for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
9203b863d9
update align_tts_loss for trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
fc9a0fb8ce
update align_tts_config for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
e298b8e364
update trainer.py for better logging handling, restoring models, and renaming init_ functions to get_
2021-06-28 17:03:19 +02:00
Eren Gölge
b8a4af4010
update `synthesis.py` for being more generic
2021-06-28 17:03:19 +02:00
Eren Gölge
c70d0c9dae
update `speedy_speech.py` model for trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
06ee57d816
update `speedy_speech_config.py` for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
4e910993f1
update tacotron model to return `model_outputs`
2021-06-28 17:03:19 +02:00
Eren Gölge
bb4deee64c
update glow-tts for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
9134c7dfb6
update `sequence_mask` import globally
2021-06-28 17:03:19 +02:00
Eren Gölge
b2218e882a
update `glow_tts_config.py` for setting the optimizer and the scheduler
2021-06-28 17:03:19 +02:00
Eren Gölge
891631ab47
typing annotation for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
5f07315722
add trainer and train_tts
2021-06-28 17:03:19 +02:00
Eren Gölge
34f8a74e4d
remove `truncated` from synthesizer
2021-06-28 17:03:19 +02:00
Eren Gölge
178eccbc16
update console logger
2021-06-28 17:03:19 +02:00
Eren Gölge
f4f83b6379
update `synthesis.py` for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
130781dab6
remove `tts.generic_utils` as all the functions are moved to other files
2021-06-28 17:03:19 +02:00
Eren Gölge
535a458f40
update Tacotron models for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
bdbfc95618
add `gradual_training` argument to tacotron.py
2021-06-28 17:03:19 +02:00
Eren Gölge
5a2e75f0ee
add missing imports for tacotron.py
2021-06-28 17:03:19 +02:00
Eren Gölge
da7d10e53c
move `setup_model()` to `models/__init__.py`
2021-06-28 17:03:19 +02:00
Eren Gölge
ca302db7b0
add sequence_mask to `utils.data`
2021-06-28 17:03:19 +02:00
Eren Gölge
844abb3b1d
`setup_loss()` in `layer/__init__.py`
2021-06-28 17:03:19 +02:00
Eren Gölge
a20a1c7d06
rename preprocess.py -> formatters.py
2021-06-28 17:03:19 +02:00
Eren Gölge
b9bccbb243
move load_meta_data and related functions to `datasets/__init__.py`
2021-06-28 17:03:19 +02:00
Eren Gölge
d09385808a
set test_sentences in config
2021-06-28 17:03:19 +02:00
Eren Gölge
8def3c87af
trainer-API updates
2021-06-28 17:03:19 +02:00
Eren Gölge
42554cc711
rename MyDataset -> TTSDataset
2021-06-28 17:03:19 +02:00
Edresson
1c4e806f54
use speaker manager on compute embeddings script
2021-06-27 03:35:34 -03:00
Edresson Casanova
eb84bb2bc8
Merge branch 'dev' into dev
2021-06-26 15:32:19 -03:00
Eren Gölge
987cf1178b
Bump up to v0.0.16
2021-06-25 14:44:56 +02:00
Michael Hansen
3f172b84d8
Fix linting issues
2021-06-25 14:41:31 +02:00
Michael Hansen
4d8426fa0a
Use eSpeak IPA lexicons by default for phoneme models
2021-06-25 14:41:05 +02:00
Michael Hansen
618b509204
Use combined characters available in TTS phonemes (like ç)
2021-06-25 14:41:05 +02:00
Michael Hansen
da6f6a4a01
Update docstring for clean_gruut_phonemes
2021-06-25 14:41:05 +02:00
Michael Hansen
47191f3ecc
Add tests for gruut phonemization
2021-06-25 14:41:05 +02:00
Michael Hansen
67869e77f9
Use gruut for phonemization
2021-06-25 14:41:05 +02:00
Eren Gölge
788992093d
Add UnivNet vocoder 🚀
2021-06-23 13:51:04 +02:00
Eren Gölge
64fd59204c
Use `torch.linalg.qr` for pytorch > `v1.9.0`
2021-06-23 13:49:42 +02:00
Eren Gölge
aba840b4e6
Fix loading the `amp` scaler from a checkpoint 🛠️
2021-06-23 13:49:42 +02:00
Eren Gölge
18e5393f16
Add 🐍 python 3.9 to CI
2021-06-23 13:49:36 +02:00
Eren Gölge
0ff2d2336a
Fix wrong argument name 🛠️
2021-06-22 16:21:11 +02:00
Eren Gölge
61c3cb871f
Docstring edit in `TTSDataset.py` ✍️
2021-06-22 16:21:11 +02:00
Eren Gölge
6f739ea07a
Fix `eval_log` for `gan.py` 🛠️
2021-06-22 16:21:11 +02:00
Eren Gölge
ebb91c0fbb
Move `TorchSTFT` to `utils.audio`
2021-06-22 16:21:11 +02:00
Eren Gölge
01c4b22a2f
Fixup `trainer.py` 🛠️
2021-06-22 16:21:11 +02:00
Eren Gölge
7de2756fc4
Enable support for 🐍 python 3.10
...
Bump up versions numpy 1.19.5 and TF 2.5.0
2021-06-22 16:21:11 +02:00
Eren Gölge
220e184f66
Apply small fixes for API compatibility
2021-06-22 16:21:11 +02:00
Eren Gölge
77d57dd301
Print `max_decoder_steps` when model reaches the limit
2021-06-22 16:21:11 +02:00
Eren Gölge
7dc2177df4
Update `synthesizer` for speaker and model init
2021-06-22 16:21:11 +02:00
Eren Gölge
c3a0bc702e
fixup configs
2021-06-22 16:21:11 +02:00
Eren Gölge
0e01c2594f
Update `speaker_manager`
2021-06-22 16:21:11 +02:00
Eren Gölge
8182f5168f
Fixup `utils` for the trainer
2021-06-22 16:21:11 +02:00
Eren Gölge
b4bb567e04
Update `vocoder` utils
2021-06-22 16:21:11 +02:00
Eren Gölge
f3ff5b1971
Update `TTS.bin` scripts for the new API
2021-06-22 16:21:11 +02:00
Eren Gölge
aed919cf1c
Update `vocoder` datasets and `setup_dataset`
2021-06-22 16:21:11 +02:00
Eren Gölge
59abf490a1
Implement `setup_model` for vocoder models
2021-06-22 16:21:11 +02:00
Eren Gölge
420820caf4
Update vocoder models
2021-06-22 16:21:11 +02:00
Eren Gölge
d10f9c5676
Update `tts.models.setup_model`
2021-06-22 16:21:11 +02:00
Eren Gölge
cae702980f
Create base 🐸 TTS model abstraction for tts models
2021-06-22 16:21:11 +02:00
Eren Gölge
70d968b169
Update vocoder model configs
2021-06-22 16:21:11 +02:00
Eren Gölge
f8a3460818
Update tts model configs
2021-06-22 16:21:11 +02:00
Eren Gölge
acd96a4940
Implement unified IO utils
2021-06-22 16:21:10 +02:00
Eren Gölge
6b907554f8
Implement unified trainer
2021-06-22 16:21:10 +02:00
Eren Gölge
20c4a8c8e1
`tts` model abstraction with `TTSModel`
2021-06-22 16:21:10 +02:00
Eren Gölge
b934665fc0
fix calculation of `loader_start_time`
2021-06-22 16:21:10 +02:00
Eren Gölge
64f0f57757
`TrainerAbstract` and related updates for `TrainerTTS`
2021-06-22 16:21:10 +02:00
Eren Gölge
f077a356e0
rename to
2021-06-22 16:21:10 +02:00
Eren Gölge
4575b70826
merge if branches with the same implementation
2021-06-22 16:21:10 +02:00
Eren Gölge
59be1b9af1
adjust `distribute.py` for the `train_tts.py`
2021-06-22 16:21:10 +02:00
Eren Gölge
614738cc85
downsize melgan test model size
2021-06-22 13:12:52 +02:00
Eren Gölge
4f29725eb6
fix glow-tts `inference()`
2021-06-22 13:12:52 +02:00
Eren Gölge
a87c886497
refactor and fix multi-speaker training in Trainer and Tacotron models
2021-06-22 13:12:52 +02:00
Eren Gölge
0206bb847b
add max_decoder_steps argument to tacotron models
2021-06-22 13:12:52 +02:00
Eren Gölge
cbb52b3d83
fix speaker_manager init
2021-06-22 13:12:52 +02:00
Eren Gölge
d2fd6a34a1
use get_speaker_manager in Trainer and save speakers.json file when needed
2021-06-22 13:12:52 +02:00
Eren Gölge
147550c65f
make style and linter fixes
2021-06-22 13:12:52 +02:00
Eren Gölge
a605dd3d08
Compute d_vectors and speaker_ids separately in TTSDataset
2021-06-22 13:12:52 +02:00
Eren Gölge
f00ef90ce6
rename external speaker embedding arguments as `d_vectors`
2021-06-22 13:12:52 +02:00
Eren Gölge
e7b7268c43
use `to_cuda()` for moving data in `format_batch()`
2021-06-22 13:12:52 +02:00
Eren Gölge
26a3312f0d
change `to(device)` to `type_as` in models
2021-06-22 13:12:52 +02:00
Eren Gölge
c09622459e
init `durations = None`
2021-06-22 13:12:52 +02:00
Eren Gölge
2e31659dd9
docstring fix
2021-06-22 13:12:52 +02:00
Eren Gölge
7a0750a4f5
make style
2021-06-22 13:12:52 +02:00
Eren Gölge
534401377d
styling formatting.py
2021-06-22 13:12:52 +02:00
Eren Gölge
e229f5c081
fix type annotations
2021-06-22 13:12:52 +02:00
Eren Gölge
506189bdee
update glow-tts output shapes to match [B, T, C]
2021-06-22 13:12:52 +02:00
Eren Gölge
f568833d28
formatting `cond_input` with a function in Tacotron models
2021-06-22 13:12:52 +02:00
Eren Gölge
254707c610
update imports for `formatters`
2021-06-22 13:12:52 +02:00
Eren Gölge
223502d827
fix glow-tts inference and forward functions for handling `cond_input` and refactor its test
2021-06-22 13:12:52 +02:00
Eren Gölge
d4b1acfa81
refactor `SpeakerManager`
2021-06-22 13:12:52 +02:00
Eren Gölge
26e7c0960c
linter fixes
2021-06-22 13:12:52 +02:00
Eren Gölge
79f7c5da1e
delete separate tts training scripts and pre-commit configuration
2021-06-22 13:12:52 +02:00
Eren Gölge
ca787be193
make style
2021-06-22 13:12:52 +02:00
Eren Gölge
d376647ca0
`logging/__init__.py`
2021-06-22 13:12:52 +02:00
Eren Gölge
bb58a0588e
fix logger imports
2021-06-22 13:12:52 +02:00
Eren Gölge
9bbc924377
add missing imports
2021-06-22 13:12:52 +02:00
Eren Gölge
b4d4ce0d7e
remove redundant imports
2021-06-22 13:12:52 +02:00
Eren Gölge
aefa71155c
make style
2021-06-22 13:12:52 +02:00
Eren Gölge
88d8a94a10
update extract_tts_spectrogram for `cond_input` API of the models
2021-06-22 13:12:52 +02:00
Eren Gölge
667bb708b6
update `extract_tts_spec...` using `SpeakerManager`
2021-06-22 13:12:52 +02:00
Eren Gölge
830306d2fd
update `extract_tts_spectrograms` for the new model API
2021-06-22 13:12:52 +02:00
Eren Gölge
c673eb8ef8
correct import of `load_meta_data`
...
remove redundant import
2021-06-22 13:12:52 +02:00
Eren Gölge
f0a419546b
fix `Synthesizer` for the new `synthesis()`
2021-06-22 13:12:52 +02:00
Eren Gölge
c7ff175592
revert logging.info to print statements for trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
fd6afe5ae5
update `setup_model.py` imports
2021-06-22 13:12:52 +02:00
Eren Gölge
c82d91051d
update align_tts.py model for the trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
4f66e816d1
update align_tts_loss for trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
8213ad8b5f
update align_tts_config for the trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
8dfd4c91ff
update trainer.py for better logging handling, restoring models, and renaming init_ functions to get_
2021-06-22 13:12:52 +02:00
Eren Gölge
fb9289d365
update `synthesis.py` for being more generic
2021-06-22 13:12:52 +02:00
Eren Gölge
f121b0ff5d
update `speedy_speech.py` model for trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
843b3ba960
update `speedy_speech_config.py` for the trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
c9790bee2c
update tacotron model to return `model_outputs`
2021-06-22 13:12:52 +02:00
Eren Gölge
f09ec7e3a7
update glow-tts for the trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
3346a6d9dc
update `sequence_mask` import globally
2021-06-22 13:12:52 +02:00
Eren Gölge
9765b1aa6b
update `glow_tts_config.py` for setting the optimizer and the scheduler
2021-06-22 13:12:52 +02:00
Eren Gölge
6bf6543df8
typing annotation for the trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
57cdddef16
add trainer and train_tts
2021-06-22 13:12:52 +02:00
Eren Gölge
d769af9e3b
remove `truncated` from synthesizer
2021-06-22 13:12:52 +02:00
Eren Gölge
570633ab80
update console logger
2021-06-22 13:12:52 +02:00
Eren Gölge
2ac6b824ca
update `synthesis.py` for the trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
c9e5527070
remove `tts.generic_utils` as all the functions are moved to other files
2021-06-22 13:12:52 +02:00
Eren Gölge
2ab723cd10
update Tacotron models for the trainer
2021-06-22 13:12:52 +02:00
Eren Gölge
d6b6a15b5c
add `gradual_training` argument to tacotron.py
2021-06-22 13:12:52 +02:00
Eren Gölge
118a7f2b43
add missing imports for tacotron.py
2021-06-22 13:12:52 +02:00
Eren Gölge
c98149d488
move `setup_model()` to `models/__init__.py`
2021-06-22 13:12:52 +02:00
Eren Gölge
86edf6ab0e
add sequence_mask to `utils.data`
2021-06-22 13:12:52 +02:00
Eren Gölge
c61486b1e3
`setup_loss()` in `layer/__init__.py`
2021-06-22 13:12:52 +02:00
Eren Gölge
f07209d2e0
rename preprocess.py -> formatters.py
2021-06-22 13:12:52 +02:00
Eren Gölge
facb782851
move load_meta_data and related functions to `datasets/__init__.py`
2021-06-22 13:12:52 +02:00
Eren Gölge
b9d4355d20
set test_sentences in config
2021-06-22 13:12:52 +02:00
Eren Gölge
7bdd0eb72f
trainer-API updates
2021-06-22 13:12:52 +02:00
Eren Gölge
0f284841d1
rename MyDataset -> TTSDataset
2021-06-22 13:12:52 +02:00
Edresson
99d40e98d9
fix Lint checks
2021-06-18 14:59:01 -03:00
Edresson
28bec238ca
fix Lint checks
2021-06-18 14:33:50 -03:00
Edresson
83644056e3
fix Lint checks
2021-06-18 14:32:28 -03:00
Edresson Casanova
e78e3cd81e
Merge branch 'dev' into dev
2021-06-18 14:10:03 -03:00
Edresson
b74b510d3c
Compute embeddings and find characters using config file
2021-06-18 14:04:49 -03:00
Adam Froghyar
b0aa189348
Forcing do_trim_silence to False in the extract TTS script
2021-06-14 10:44:00 +02:00
Eren Gölge
d245b5d48f
bump up v0.0.15.1
2021-06-08 09:21:01 +02:00
Edresson
14b209c7e9
Create a batch for faster inference on the LSTM Speaker Encoder
2021-06-05 03:12:17 -03:00
Eren Gölge
b8b79a5e5a
fix `use_cuda` bug in `server.py`
2021-06-04 14:02:53 +02:00
Eren Gölge
203ab855c3
bump up to v0.0.15
2021-06-04 13:52:54 +02:00
Eren Gölge
ba9bcf7c6b
auto upload to pypi on release
2021-06-04 12:20:06 +02:00
Eren Gölge
e66753bd0d
fixup! new japanese model placeholder in `.models.json`
2021-06-03 18:04:28 +02:00
Eren Gölge
bd434636a9
new japanese model placeholder in `.models.json`
2021-06-02 15:54:37 +02:00
Eren Gölge
401fbd8978
bump up to v0.0.15
2021-06-02 11:48:17 +02:00
Eren Gölge
49c5e5d820
make style japanese PR
2021-06-02 11:44:46 +02:00
Eren Gölge
73b4083c6c
Merge pull request #502 from kaiidams/kaiidams/kokoro
...
Japanese Tacotron 2 model
2021-06-02 10:20:08 +02:00
Katsuya Iida
6d8310d2a9
Set the version to match the dev branch.
2021-06-02 07:48:28 +09:00
Alexander Korolev
c1eb9bdcca
fix speaker dim inference
2021-06-01 15:15:26 +02:00
Katsuya Iida
1cc18d1972
Move unittest of Japanese phonemizer.
2021-06-01 18:51:34 +09:00
Alexander Korolev
5b89ef2c6e
fix speaker-embeddings dimension during inference
2021-06-01 11:06:35 +02:00
Eren Gölge
d0ab0382fc
linter fixes
2021-06-01 09:15:32 +02:00
Eren Gölge
bec85ac58d
make style
2021-05-31 16:37:15 +02:00
Eren Gölge
d9f1268f99
init tb_logger None for rank > 0 processes
2021-05-31 15:47:07 +02:00
Eren Gölge
301c516abd
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2021-05-31 15:46:25 +02:00
Edresson
7448177b72
use SpeakerManager on compute embeddings script
2021-05-29 21:11:53 -03:00
Katsuya Iida
c4a5a73f18
update Kokoro config
2021-05-29 19:17:27 +09:00
Katsuya Iida
3a9ac2de4a
Merge remote-tracking branch 'coqui-ai/main' into kaiidams/kokoro
2021-05-29 09:39:23 +09:00
Katsuya Iida
d0c9c1ca5c
Move TTS/tts/utils/japanese
2021-05-29 09:21:47 +09:00
Edresson
099142d4dd
bug fix
2021-05-27 21:50:56 -03:00
Edresson
208bb0f0ee
add batched speaker encoder inference
2021-05-27 20:01:00 -03:00
Edresson
825734a3a9
remove unused embeddings export
2021-05-27 19:10:24 -03:00
Katsuya Iida
c4987e9d4e
Move import to the head of the file.
2021-05-28 00:22:57 +09:00
Eren Gölge
925c08cf95
replace unidecode with anyascii
2021-05-27 14:02:44 +02:00
Eren Gölge
e08c58db3b
bump up version to v0.14.1
2021-05-27 13:11:01 +02:00
Eren Gölge
c6f22aaa67
fix #509
2021-05-27 13:09:15 +02:00
Edresson
1496f271dc
update Compute embeddings script
2021-05-27 00:45:18 -03:00
Edresson
bc5307caa0
add unit tests for SoftmaxAngleProtoLoss and ResnetSpeakerEncoder and bugfix
2021-05-26 20:35:58 -03:00
Edresson
c90037c2e9
solve merge problems
2021-05-26 16:01:30 -03:00
Katsuya Iida
f921a05bdb
Fixed lint errors
2021-05-26 19:02:16 +09:00
Edresson Casanova
f89cb6aec2
Merge branch 'dev' into dev
2021-05-25 17:30:25 -03:00
Edresson
d570c2d790
pylint fix and data loader bug fix
2021-05-26 01:11:37 -03:00
Katsuya Iida
0536aa6d0f
Japanese Tacotron 2 model
2021-05-22 17:12:19 +09:00
Eren Gölge
5482a0f62d
type def for gradual_training
2021-05-19 14:03:26 +02:00
Eren Gölge
df6a98d0c3
type def for gradual_training
2021-05-19 14:00:44 +02:00
Eren Gölge
16576d6408
bump version number
2021-05-19 12:35:10 +02:00
Eren Gölge
8a7c40736c
set use_phonemes false
2021-05-19 01:27:26 +02:00
Eren Gölge
ccfaa6b1d5
add `needs_phonemizer` field to models.json. If set true, these models are only compatible with v0.0.13 or below.
2021-05-18 17:57:28 +02:00
Eren Gölge
a14fcf2a13
remove text_processing test
2021-05-18 17:57:28 +02:00
Eren Gölge
d7fae3f515
remove all espeak and phonemizer deps
2021-05-18 17:57:28 +02:00
Eren Gölge
ced05e812a
move chinese phonemizer
2021-05-18 17:57:28 +02:00
Eren Gölge
218af1d9a2
change `list` to `List` in config
2021-05-18 17:30:27 +02:00
Eren Gölge
4df31f7fbd
unused_speakers argument for ignoring speaker ids in multi-speaker training
2021-05-18 14:50:03 +02:00
Eren Gölge
c2c7dff805
use relaxed coqpit parser
2021-05-18 14:49:47 +02:00
Edresson
856ea19758
bug fix in dataloader and update inference
2021-05-18 03:43:16 -03:00
Eren Gölge
d1b469935d
tacotron DDC LJSpeech recipe
2021-05-17 11:42:14 +02:00
Eren Gölge
34a42d379f
update tacotron_config.py for checking `r` and the docstring
2021-05-17 11:35:30 +02:00
Eren Gölge
12722501bb
styling
2021-05-15 23:48:31 +02:00
Eren Gölge
8b1014d188
add docstrings with default value fixes
2021-05-15 23:45:10 +02:00
Eren Gölge
da49089a72
update melgan training test batch size
2021-05-12 10:12:11 +02:00
Edresson
3433c2f348
add compute embedding for the new speaker encoder
2021-05-12 03:06:46 -03:00
Eren Gölge
0213e1cbf4
update configs for tts models to match the field types with the expected values
2021-05-12 00:57:38 +02:00
Eren Gölge
715b0a65a0
update main.yml for python x64
...
fix test
2021-05-12 00:57:29 +02:00
Edresson
3fcc748b2e
implement the Speaker Encoder H/ASP
2021-05-11 16:27:05 -03:00
Eren Gölge
843d1b3d98
linter fixes
2021-05-11 11:30:00 +02:00
Eren Gölge
19fb1d743d
style update
2021-05-11 11:30:00 +02:00
Eren Gölge
6e980b49c4
fix synthesizer.py for Coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
db14dcd95a
remove old load_config
2021-05-11 11:29:18 +02:00
Eren Gölge
a21ac883dd
add get_cuda()
2021-05-11 11:29:18 +02:00
Eren Gölge
21dd4d7960
fix load_config imports for Coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
c57f0b46bb
reintroduce use_gst for backwards compat
2021-05-11 11:29:18 +02:00
Eren Gölge
18e76a2309
fix speaker encoder model initialization
2021-05-11 11:29:18 +02:00
Eren Gölge
10de40bba1
make num_workers mandatory config field
2021-05-11 11:29:18 +02:00
Eren Gölge
df1ddd3539
allow read_json_with_comments for backward compat
2021-05-11 11:29:18 +02:00
Eren Gölge
9f7599e3c3
fix train_encoder for coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
f8e52965dd
add speaker encoder coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
ce2bba543e
remove extra from utils and move funcs to io.py
2021-05-11 11:29:18 +02:00
Eren Gölge
812dbc2b06
rm config.json
2021-05-11 11:29:18 +02:00
Eren Gölge
3fde2001b1
train_encoder refactoring for coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
9ee70af9bb
code styling
2021-05-11 11:29:18 +02:00
Eren Gölge
10db2baa06
global shared Coqpit configs
2021-05-11 11:29:18 +02:00
Eren Gölge
3dec62b183
add Coqpits for the vocoder models
2021-05-11 11:29:18 +02:00
Eren Gölge
6f4eed94f5
remove *.json vocoder configs
2021-05-11 11:29:18 +02:00
Eren Gölge
78b3825d0b
update train scripts for coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
757e90b1cc
load_config function to initialize the right Coqpit for the given model
2021-05-11 11:29:18 +02:00
Eren Gölge
e6f45b9eb7
update train_vocoder_gan.py for coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
bcebd69d09
remove bash tts training tests
2021-05-11 11:29:17 +02:00
Eren Gölge
7663bc63c1
add Coqpit configs for the TTS models
2021-05-11 11:29:17 +02:00
Eren Gölge
7227e8f1d2
update train_align_tts.py for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
51a7e06945
glow_tts_config.py and train test on python
2021-05-11 11:29:17 +02:00
Eren Gölge
720fe13056
update glow_tts modules and training script for coqpit use
2021-05-11 11:29:17 +02:00
Eren Gölge
816e7ee698
remove default configs.json, replaced by Coqpit configs
2021-05-11 11:29:17 +02:00
Eren Gölge
35341d5482
move bash script based tests to python with coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
647163397d
coqpit refactoring
2021-05-11 11:29:17 +02:00
Eren Gölge
eaa130e813
fix tacotron for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
65d7ad4250
refactor train_speedy_speech.py for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
4a58fdfd59
comment out check-arguments before copying fields to the configs
2021-05-11 11:29:17 +02:00
Eren Gölge
05d9543ed8
init GST module using gst config in Tacotron models
2021-05-11 11:29:17 +02:00
Eren Gölge
93a00373f6
move split_dataset
2021-05-11 11:29:17 +02:00
Eren Gölge
9c18e40f64
black formatting
2021-05-11 11:29:17 +02:00
Eren Gölge
c34c8137d7
update compute_statistics for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
79d7215142
config refactor #5 WIP
2021-05-11 11:29:17 +02:00
Eren Gölge
dc50f5f0b0
config refactor #4 WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
97bd5f9734
[ci skip] config update #3 WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
a21c0b5585
config update 2 WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
e092ae40dc
config update WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
06f80a4806
update check argument
2021-05-11 11:28:35 +02:00
Eren Gölge
bf7ddfa542
Merge pull request #481 from chmodsss/main
...
Accessing __version__ command
2021-05-11 10:20:48 +02:00
Edresson
85ccad7e0a
add Audio data augmentation Additive and RIR
2021-05-11 00:59:57 -03:00
Edresson
77d85c6cc5
add softmaxproto loss and bug fix in data loader
2021-05-10 17:08:38 -03:00
chmodsss
607d5cf377
[ #480 ] Adding version variable
2021-05-10 19:46:34 +02:00
Adam Froghyar
7ddc885f37
deleted a line that broke GravesAttention
2021-05-10 15:42:59 +02:00
Edresson
78bad25f2b
update voxceleb download link
2021-05-07 23:45:15 -03:00
Eren Gölge
f7582107da
Merge pull request #453 from Edresson/dev
...
Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS.
2021-05-06 17:53:28 +02:00
Edresson
501c8e0302
remove unused vars on extract tts spectrograms script
2021-05-04 19:04:13 -03:00
Eren Gölge
0325c58862
Merge pull request #468 from shaun95/patch-1
...
Update losses.py
2021-05-03 14:45:24 +02:00
Eren Gölge
8cb27267a4
formatting
2021-05-03 14:26:35 +02:00
Eren Gölge
87d674a038
bump up librosa version to 0.8.0
2021-05-03 14:25:09 +02:00
shaun
7d0ec62bf1
Update losses.py
...
The block of code for use_l1_spec_loss is repeated, which doubles the amount of L1 loss when enabled.
The weight for L1 loss in the hifigan_ljspeech configuration will likely need to be doubled to compensate (l1_spec_loss_weight)
2021-05-02 14:14:24 +02:00
Edresson
3ecd556bbe
add unit test for extract tts spectrograms script
2021-05-01 13:41:56 -03:00
Edresson
446b1da936
create inference function
2021-04-29 18:18:37 -03:00
Eren Gölge
f02f0338c2
fix .models.json and add testing to check released models availability
2021-04-29 09:32:36 +02:00
Eren Gölge
fd95e9b8a4
[ci skip] Add sam models
2021-04-28 21:57:31 +02:00
Agrin Hilmkil
351d0ed6ae
Remove unnecessary fsspec usage
2021-04-28 11:21:08 +02:00
Agrin Hilmkil
167f86417e
Move dev, tf, notebook dependencies to extras
2021-04-28 11:20:06 +02:00
Eren Gölge
1235e54738
test for synthesize.py
2021-04-27 14:17:38 +02:00
Eren Gölge
4719414f2e
remove imports
2021-04-27 11:25:17 +02:00
Eren Gölge
add97cddc1
move function and remove import
2021-04-27 11:22:56 +02:00
Eren Gölge
734e6a515c
bug fix
2021-04-27 10:27:45 +02:00
Eren Gölge
6bdd81667e
place holders for sc-glow and hifigan models
2021-04-26 19:53:12 +02:00
Eren Gölge
2f0716073e
enable multi-speaker CoquiTTS models for synthesize.py
2021-04-26 19:36:53 +02:00
Eren Gölge
b531fa699c
remove conflict noise
2021-04-26 15:27:52 +02:00
Eren Gölge
f37b488876
Merge branch 'speaker-manager' of https://github.com/coqui-ai/TTS into speaker-manager
2021-04-26 15:25:25 +02:00
Eren Gölge
b82daa5e86
style and linter fixes
2021-04-26 15:22:24 +02:00
Edresson
20e42a3381
add save audio option
2021-04-23 15:00:00 -03:00
Edresson
8228091f92
add script for extraction of tts spectrograms
2021-04-23 14:17:46 -03:00
Eren Gölge
4cf211348d
styling and linting
2021-04-23 18:04:37 +02:00
Eren Gölge
7eb0c60d2e
let synthesizer pass speaker encoder file paths to speaker manager
2021-04-23 18:04:37 +02:00
Eren Gölge
f69195739e
let speaker manager compute mean x_vector from multiple wav files
2021-04-23 18:04:37 +02:00
Eren Gölge
179722e3a7
new arguments to synthesize.py for loading speaker encoder and speaker wavs
2021-04-23 18:04:37 +02:00
Eren Gölge
dfa415a8b8
small refactor in server.py
2021-04-23 18:04:37 +02:00
Eren Gölge
c80d21f311
load speaker_encoder_ap and compute x_vector directly from the input file in speaker manager
2021-04-23 18:04:37 +02:00
Eren Gölge
ad047c8195
html formatting, enable multi-speaker model on the server with a dropdown menu to select the speaker
2021-04-23 18:04:37 +02:00
Eren Gölge
f9f3d04d14
remove moved function
2021-04-23 18:04:37 +02:00
Eren Gölge
10c988ac8c
update server.py
2021-04-23 18:04:37 +02:00
Eren Gölge
6d0f5e0459
use SpeakerManager in Synthesizer
2021-04-23 18:04:37 +02:00
Eren Gölge
e97126314c
add ```unique``` argument to make_symbols to fix the incompat. issue of the
...
SC-Glow models
2021-04-23 18:04:37 +02:00
Eren Gölge
d08888e603
formatting speakers.py
2021-04-23 18:04:37 +02:00
Eren Gölge
df422223a3
initial SpeakerManager implementation
2021-04-23 18:04:37 +02:00
Eren Gölge
7a7aeb35f5
fix the glow-tts in setup_model
2021-04-23 18:04:37 +02:00
Eren Gölge
d42748082a
update argument name external_speaker_embedding_dim -> speaker_embedding_dim
...
add inference_noise_scale argument to glow-tts
2021-04-23 18:04:37 +02:00
Eren Gölge
2da81f5bb6
add load_checkpoint to speaker encoder
2021-04-23 18:04:37 +02:00
Eren Gölge
1229ccbf07
update argument name in server.py
2021-04-23 18:04:37 +02:00
Eren Gölge
af2d36faeb
update synthesize.py for multi-speaker setting
2021-04-23 18:04:37 +02:00
Eren Gölge
99dc07a7dd
add ```unique``` param to keep scglow models compatible (there are duplicate symbols in the character set)
2021-04-23 18:04:37 +02:00
Eren Gölge
c955a12428
set the default layer size compatible with scglow
2021-04-23 18:04:37 +02:00
Eren Gölge
3ace2440fa
fix a mistake from rebase
2021-04-23 18:04:37 +02:00
Eren Gölge
aadb2106ec
code styling
2021-04-23 18:04:37 +02:00
Eren Gölge
af7baa3387
refactoring to allow defining the speaker file externally
2021-04-23 18:04:37 +02:00
kirianguiller
7dccbfdcd5
handle multi speaker and gst in Synthesizer class
2021-04-23 18:04:37 +02:00
Edresson
d2b6326b8b
change optimizer initialization for compatibility with Hifi-GAN official implementation
2021-04-23 07:54:39 -03:00
WeberJulian
4205284f92
Change name of the functions
2021-04-23 10:09:55 +02:00
WeberJulian
a26498181b
Change back the default value
2021-04-22 16:10:17 +02:00
Julian Weber
355e1f47ab
fix dumb mistake
2021-04-22 15:50:29 +02:00
Julian Weber
c125b71f36
fix windows support
2021-04-22 15:14:24 +02:00
Jörg Thalheim
f5fd7f78d4
server: also listen to ipv6
...
The [::] address will listen to both ipv4/ipv6 addresses.
2021-04-22 12:38:55 +02:00
Eren Gölge
ef37633cb3
[ci skip] use prenet_dropout by default with Tacotron models
2021-04-22 12:38:55 +02:00
Eren Gölge
e1d960da9e
use SpeakerManager in Synthesizer
2021-04-21 13:13:27 +02:00
Eren Gölge
04b6881b66
add ```unique``` argument to make_symbols to fix the incompat. issue of the
...
SC-Glow models
2021-04-21 13:12:35 +02:00
Eren Gölge
790946faec
formatting speakers.py
2021-04-21 13:12:11 +02:00
Eren Gölge
ab313814de
initial SpeakerManager implementation
2021-04-21 13:11:46 +02:00
Eren Gölge
09890c7421
fix the glow-tts in setup_model
2021-04-21 13:10:40 +02:00
Eren Gölge
8764d02eb2
update argument name external_speaker_embedding_dim -> speaker_embedding_dim
...
add inference_noise_scale argument to glow-tts
2021-04-21 13:09:44 +02:00
Eren Gölge
8b40720977
add load_checkpoint to speaker encoder
2021-04-21 13:09:04 +02:00
Eren Gölge
37cad38c27
update argument name in server.py
2021-04-21 13:08:45 +02:00
Eren Gölge
9bccee9da8
update synthesize.py for multi-speaker setting
2021-04-21 13:08:25 +02:00
Eren Gölge
d2fa8add1f
add ```unique``` param to keep scglow models compatible (there are duplicate symbols in the character set)
2021-04-16 19:40:13 +02:00
Eren Gölge
d9612a4351
set the default layer size compatible with scglow
2021-04-16 19:40:13 +02:00
Eren Gölge
1038fd420d
fix a mistake from rebase
2021-04-16 19:39:47 +02:00
Eren Gölge
47e356cb48
code styling
2021-04-16 16:01:40 +02:00
Eren Gölge
25328aad00
refactoring to allow defining the speaker file externally
2021-04-16 15:59:57 +02:00
kirianguiller
48ae52a9a3
handle multi speaker and gst in Synthesizer class
2021-04-16 15:54:49 +02:00
Eren Gölge
a53958ae3a
fix urls for the new models
2021-04-15 17:05:00 +02:00
Eren Gölge
9cc17be53a
formatting and a small bug fix in Tacotron model
2021-04-15 16:36:51 +02:00
Eren Gölge
1ad838bc83
add newly released models under .model.json
2021-04-15 16:06:10 +02:00
Eren Gölge
7cada1a949
remove noise
2021-04-15 15:30:45 +02:00
Eren Gölge
d60a8d7211
show the real waveform on TB too for GAN vocoder training.
2021-04-15 15:30:06 +02:00
Eren Gölge
5fbe926429
change the default TTS model to TacotronDDC
2021-04-15 15:29:44 +02:00
Eren Gölge
3de5a89154
optionally enable prenet dropout at inference time for tacotron models
2021-04-13 13:24:56 +02:00
Eren Gölge
28a2fed8a3
update hifigan in .model.json
2021-04-12 16:48:05 +02:00
Eren Gölge
abaf36861a
aligntts model .model.json placeholder
2021-04-12 16:43:52 +02:00
Eren Gölge
480e2f7888
docstring update and better handling make_symbols
2021-04-12 16:40:49 +02:00
Eren Gölge
b735076bb4
linter fixes
2021-04-12 13:14:11 +02:00
Eren Gölge
b11d1cb845
small fixes
2021-04-12 12:40:55 +02:00
Eren Gölge
a7f6045644
Merge branch 'reformat' into hifigan-reformat
2021-04-12 12:00:17 +02:00
Eren Gölge
f519012dea
reformatting and styling
2021-04-12 11:47:39 +02:00
Eren Gölge
9011dddf77
tacotron DDC placeholder in models.json
2021-04-12 04:06:27 +02:00
Eren Gölge
d295d5de97
remove torch.no_grad from TorchSTFT
2021-04-10 19:43:57 +02:00
Eren Gölge
5b70da2e3f
restore schedulers only if training is continuing a previous training
...
inherit nn.Module for TorchSTFT
2021-04-09 19:31:28 +02:00
Eren Gölge
2c71c6d8cd
[ci skip]update gan vocoder configs to reflect the recent changes
2021-04-09 17:15:32 +02:00
Eren Gölge
2b529f60c8
update default hifigan config
2021-04-09 11:40:06 +02:00
Eren Gölge
105e0b4d62
vocoder gan training fixes
2021-04-09 11:38:04 +02:00
Eren Gölge
87ee6ceb57
style update #3
2021-04-09 01:17:15 +02:00
Eren Gölge
18d9ec8036
format with black
2021-04-09 00:54:59 +02:00
Eren Gölge
e5b9607bc3
isort all imports
2021-04-09 00:45:20 +02:00
Eren Gölge
0e79fa86ad
format with black and pylint 2.7.3
2021-04-09 00:38:08 +02:00
Eren Gölge
cd69da4868
linter fixes #2
2021-04-08 16:57:46 +02:00
Eren Gölge
4d3e1e9d9a
linter fix
2021-04-08 14:57:46 +02:00
Eren Gölge
53f54898bc
small fixes
2021-04-08 14:22:47 +02:00
Eren Gölge
006b1d3aaa
bug fix
2021-04-08 13:17:45 +02:00
Eren Gölge
3f0993aebe
remove junk
2021-04-08 12:17:02 +02:00
Eren Gölge
0ee0458309
remove redundant imports
2021-04-08 11:29:15 +02:00
Eren Gölge
773f1db6fa
refactor HifiGAN discriminator
2021-04-08 11:28:30 +02:00
Eren Gölge
15f362d5b1
formatting
2021-04-08 11:28:30 +02:00
Eren Gölge
aee24b0704
set different seed in gan_dataset when it is multi-workers
2021-04-08 11:28:30 +02:00
Eren Gölge
6ee211c137
remove stft params causing warning
2021-04-08 11:28:30 +02:00
Eren Gölge
4998ece8d8
allow configuration of optimizers from the config file
2021-04-08 11:28:30 +02:00
Eren Gölge
8daf407652
cache empty
2021-04-08 11:28:30 +02:00
Eren Gölge
3fb78c004a
move scheduler updates to the end of the epoch
2021-04-08 11:28:30 +02:00
Eren Gölge
2a872c98aa
don't call os.exit as it leaves the process resources standing
2021-04-08 11:27:40 +02:00
Eren Gölge
7cecd2fb2e
add hifigan D
2021-04-08 11:27:40 +02:00
Eren Gölge
13dca6e6b6
revert some of Hifigan generator updates
2021-04-08 11:27:40 +02:00
Eren Gölge
02bc776c35
prevent grad in TorchSTFT
2021-04-08 11:27:40 +02:00
Eren Gölge
cf44624df8
more docstring
2021-04-08 11:27:40 +02:00
Eren Gölge
d95b1458e8
Linter fixes and docstrings for HiFiGAN
2021-04-08 11:27:40 +02:00
Eren Gölge
bd7a1c177b
fix #419
2021-04-08 11:26:41 +02:00
Eren Gölge
7726dfca99
change the upper bound in sound normalization
2021-04-08 11:26:01 +02:00
Eren Gölge
57f6bd1afa
make using different samples for G and D networks optional
2021-04-08 11:26:01 +02:00
Eren Gölge
67f8248492
placeholder for finetuned sam hifigan model
2021-04-08 11:25:29 +02:00
Eren Gölge
241e968df1
load_checkpoint for hifigan and no_grad for inference
2021-04-08 11:25:29 +02:00
Eren Gölge
de3a04f104
some commenting for Generator loss and check if the argument is defined in the config file
2021-04-08 11:25:29 +02:00
Eren Gölge
ff07c5f5e3
update TorchSTFT to enable melspec
2021-04-08 11:25:29 +02:00
Eren Gölge
4a5b1d4ac2
update hifigan config
2021-04-08 11:24:21 +02:00
Eren Gölge
e0e3b12b26
pass all parameters explicitly to _istft
2021-04-08 11:23:20 +02:00
Eren Gölge
f0e76ee135
initial models.json entry for universal hifigan
2021-04-08 11:23:20 +02:00
Eren Gölge
d57f416957
small fixes
2021-04-08 11:22:30 +02:00
Eren Gölge
8c9e1c9e58
hifigan implementation update
2021-04-08 11:21:43 +02:00
Eren Gölge
a14d7bc5db
hifigan config update
2021-04-08 11:20:33 +02:00
Eren Gölge
8d4fd79cd7
update hifigan config
2021-04-08 11:20:33 +02:00
rishikksh20
e656e8b108
Remove select size bug
2021-04-08 11:20:33 +02:00
rishikksh20
b533474e3b
Remove minor bugs and make code trainable
2021-04-08 11:20:33 +02:00
rishikksh20
ef6ff4e95c
Add Exponential LR scheduler check
2021-04-08 11:20:33 +02:00
rishikksh20
1535777f64
1) Add ExponentialLR
2021-04-08 11:18:36 +02:00
rishikksh20
c20a6b1185
* Format the model definition
...
* Update code and integrate training code
2021-04-08 11:18:36 +02:00
rishikksh20
39b5845810
1) Add hifigan json files
...
2) Rename MPD disc
3) Re-format remove weight norm generator
2021-04-08 11:14:39 +02:00
rishikksh20
7b7c5d635f
1) Combine MSD with Multi-Period disc
...
2) Add remove weight norm layer on Generator
2021-04-08 11:14:39 +02:00
rishikksh20
4493feb95c
Add HiFi-GAN v1 generator and discriminator classes
2021-04-08 11:14:39 +02:00
Eren Gölge
c86c559349
docstring and optional padding in TorchSTFT
2021-04-07 12:36:15 +02:00
Eren Gölge
f890454de3
linter fixes
2021-04-07 12:36:03 +02:00
Eren Gölge
9782d9ea5d
[ci skip] implement #418
2021-04-06 16:24:50 +02:00
Eren Gölge
f46a275b22
update docstring 2
2021-04-06 16:24:50 +02:00
Eren Gölge
ec94ff3691
update docstring
2021-04-06 16:24:50 +02:00
Eren Gölge
2048095e9a
audio.py fix
2021-04-06 16:24:50 +02:00
Eren Gölge
e0b3008c31
allow choosing the log function used for amptodb conversion
2021-04-06 16:24:50 +02:00
Eren Gölge
44b4cb5ba5
DCA comment
2021-04-06 16:24:50 +02:00
Eren Gölge
b86e7fb2e8
pad short samples when loading precomputed features in vocoder training
2021-04-06 16:24:50 +02:00
Eren Gölge
6ad4eba678
gan vocoder train fix for restoring models when no scheduler is defined
2021-04-06 16:24:50 +02:00
Eren Gölge
e3ccfe37ea
add DE more urls
2021-04-02 14:54:41 +02:00
Eren Gölge
e84f120a04
sam-accenture model preprocessor
2021-04-01 03:41:41 +02:00
Eren Gölge
e3c052382b
fix loading always best_model when continue
2021-04-01 03:41:15 +02:00
Eren Gölge
48ea20e69f
example aligntts config
2021-03-30 14:41:00 +02:00
Eren Gölge
b4c2cf80f2
fix eval iter
2021-03-30 14:39:16 +02:00
Eren Gölge
a3a840fd78
linter fixes
2021-03-30 14:39:16 +02:00
Eren Gölge
6b2e13bf62
compute normalized logp using torch primitives
2021-03-30 14:39:16 +02:00
Eren Gölge
7a382a5c2b
stowed aligntts commit and small refactoring with feed_forward layers
2021-03-30 14:39:16 +02:00
Eren Gölge
d542a50818
fix losses for alignTTS
2021-03-30 14:39:16 +02:00
Eren Gölge
18cc7b95ec
update l1 and huber to mse loss
2021-03-30 14:39:16 +02:00
Eren Gölge
896d33ed49
update losses to handle aligntts phases
2021-03-30 14:39:16 +02:00
Eren Gölge
aec0b78aff
duration predictor fix 2
2021-03-30 14:39:16 +02:00
Eren Gölge
07269e639b
fix duration predictor in AlignTTS
2021-03-30 14:39:16 +02:00
Eren Gölge
c2d29e5cd4
FFTransformer encoder for aligntts
2021-03-30 14:39:16 +02:00
Eren Gölge
460a2d3e26
FFTransformer Decoder for AlignTTS
2021-03-30 14:39:16 +02:00
Eren Gölge
844e8e0ed4
adapt align_tts and model name handling
2021-03-30 14:39:16 +02:00
Eren Gölge
aa29f5b199
aligntts loss
2021-03-30 14:39:16 +02:00
Eren Gölge
a831468cab
align tts MDN layer
2021-03-30 14:39:16 +02:00
Eren Gölge
4396f8e2da
continue refactoring
2021-03-30 14:39:16 +02:00
Eren Gölge
892c3c3623
use torch for AngleProtoLoss
2021-03-30 14:39:16 +02:00
Eren Gölge
2b3e12ea49
correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting
2021-03-30 14:39:16 +02:00
Eren Gölge
ecb6b0d6ad
rename GlowTtts as GlowTTS
2021-03-30 14:39:16 +02:00
Eren Gölge
e8cf8cb00e
restructure TF tacotron files
2021-03-30 14:39:16 +02:00
Eren Gölge
1ac99ce0d0
if git is not available set git hash to 'unknown'
2021-03-30 14:39:16 +02:00
Eren Gölge
d9c405f0c3
create feedforward folder for SS layers
2021-03-30 14:39:16 +02:00
Eren Gölge
a8cf1ae6b4
fix wavenet running with no input mask
2021-03-30 14:39:16 +02:00
Eren Gölge
1c1949d348
utf-8 encoding for certain preprocessors
2021-03-30 14:39:16 +02:00
Eren Gölge
ca2f22cdd7
linter fix
2021-03-30 14:36:12 +02:00
Eren Gölge
d0dcd7d1b8
let the user define output.wav file path, fix #393
2021-03-30 14:24:31 +02:00
Eren Gölge
25654233d5
[ci skip]initial commit for the new DE models and stale ot update
2021-03-29 03:23:57 +02:00
Guy Elsmore-Paddock
15459627cc
Fix `UnicodeEncodeError` on Windows Platforms
...
Prevents the following error from appearing when running training on Windows platforms:
```
UnicodeEncodeError: 'charmap' codec can't encode characters in position: character maps to <undefined>
```
2021-03-20 17:30:00 -04:00
Eren Gölge
3947750dd9
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2021-03-18 14:09:47 +01:00
WeberJulian
4a9d2e4309
fix french_cleaners
2021-03-18 13:35:29 +01:00
WeberJulian
596ea2c98a
Add resample script
2021-03-18 13:33:37 +01:00
Eren Gölge
6e68637f48
bug fix
2021-03-18 13:33:23 +01:00
Eren Gölge
f3e5ddfaaf
bug fix in preprocessor
2021-03-18 13:33:23 +01:00
Eren Gölge
aeb4f82233
bug fix
2021-03-18 13:33:23 +01:00
Eren Gölge
0514330869
fix mozilla/TTS#685
2021-03-18 13:33:23 +01:00
Eren Gölge
f06603a0db
force utf8
2021-03-18 13:33:23 +01:00
Eren Gölge
32e8b56c45
linter fix
2021-03-18 13:33:23 +01:00
Eren Gölge
65533f33e9
fix #374
2021-03-18 13:33:00 +01:00
Eren Gölge
d790d2fccb
linter fix
2021-03-18 13:33:00 +01:00
WeberJulian
af96080e17
fix linter issues
2021-03-18 13:33:00 +01:00
WeberJulian
bf04383e74
fix french_cleaners
2021-03-18 13:33:00 +01:00
WeberJulian
f6cd8e0ecc
test case
2021-03-18 13:33:00 +01:00
WeberJulian
e954e45e57
linter + test
2021-03-18 13:33:00 +01:00
WeberJulian
e598977f3d
Using path.join instead of concat
2021-03-18 13:33:00 +01:00
WeberJulian
c5ef2de73f
Add resample script
2021-03-18 13:33:00 +01:00
Eren Gölge
2690ab2ee5
bug fix
2021-03-16 19:15:28 +01:00
Eren Gölge
4c1aed4a9c
bug fix in preprocessor
2021-03-16 19:13:32 +01:00
Eren Gölge
01e35e06c4
bug fix
2021-03-16 19:13:32 +01:00
Eren Gölge
aa8bb815a7
fix mozilla/TTS#685
2021-03-16 19:13:32 +01:00
Eren Gölge
a8c348ffb2
force utf8
2021-03-16 19:13:32 +01:00
Eren Gölge
bf0caba0bc
linter fix
2021-03-16 19:13:32 +01:00
Eren Gölge
babc94f63f
fix #374
2021-03-16 19:13:32 +01:00
Eren Gölge
bdfd1f8a89
linter fix
2021-03-16 19:13:32 +01:00
WeberJulian
11e25a7125
fix linter issues
2021-03-16 19:13:01 +01:00
WeberJulian
1574d8dd39
fix french_cleaners
2021-03-16 19:13:01 +01:00
WeberJulian
b94373afb8
test case
2021-03-16 19:13:01 +01:00
WeberJulian
93fdc0729c
linter + test
2021-03-16 19:13:01 +01:00
WeberJulian
17f197f51e
Using path.join instead of concat
2021-03-16 19:13:01 +01:00
WeberJulian
d6749f030f
Add resample script
2021-03-16 19:13:01 +01:00
Eren Gölge
838ebd6ad5
add the missing russian model
2021-03-16 18:38:35 +01:00
Eren Gölge
5c657715f2
fix #382
2021-03-16 17:31:48 +01:00
Eren Gölge
38a29ce1c9
move all models to github rls
2021-03-10 18:19:32 +01:00
Eren Gölge
e5bb317242
fix model manager
2021-03-10 17:01:19 +01:00
Eren Gölge
d260fb03a2
fix handling scale_stats.npy for models downloaded from Github rls
2021-03-10 16:40:30 +01:00
Eren Gölge
4aba4e5b1e
linter fix
2021-03-10 15:33:11 +01:00
Eren Gölge
6c932c8503
print the desc if required parameters are not provided
2021-03-10 15:19:00 +01:00
Eren Gölge
9e84c8a623
do not copy scale_stats if exist in the output folder
2021-03-10 15:13:55 +01:00
Eren Gölge
7782034e7e
fix #369
2021-03-10 15:13:21 +01:00
Eren Gölge
4337e9ff87
pad_mode in torch_stft
2021-03-10 14:41:00 +01:00
Eren Gölge
599149a7e5
downloading models from github releases
2021-03-10 11:09:01 +01:00
Eren Gölge
fc19411ac6
update some of the models to github releases
2021-03-10 11:08:15 +01:00
Eren Gölge
19bb9ba851
fix tts endpoint using list-models argument
2021-03-09 14:06:09 +01:00
Eren Gölge
43379eecef
fix the nl model and add the vocoder
2021-03-09 14:05:56 +01:00
r-dh
8a4dcd152f
Add Dutch model
2021-03-09 13:22:19 +01:00
Eren Gölge
94805236fb
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2021-03-08 15:21:06 +01:00
Eren Gölge
5dcc4be560
rebrand demo server
2021-03-08 14:51:04 +01:00
Eren Gölge
947e3d6a93
rename test
2021-03-08 14:50:54 +01:00
Eren Gölge
a519ed52f2
deprecate embedding models to the wheel
2021-03-08 14:06:15 +01:00
Eren Gölge
c16ad38930
update server README
2021-03-08 14:05:59 +01:00
Eren Gölge
594d8d8f09
linter fixes
2021-03-08 11:22:59 +01:00
Eren Gölge
00b5090974
linter fix
2021-03-08 11:05:30 +01:00
Eren Gölge
e15734c3fc
linter fix
2021-03-08 05:29:43 +01:00
Eren Gölge
9a48ba3821
a ton of linter updates
2021-03-08 05:06:54 +01:00
Eren Gölge
e03a426378
bug fix
2021-03-08 02:59:48 +01:00
kirianguiller
628afe5cb0
remove gst handling in synthesizer.py class
2021-03-08 02:59:48 +01:00
kirianguiller
557239db7f
remove re.Match typing in '_number_replace()'
2021-03-08 02:59:48 +01:00
kirianguiller
9ab07f94e2
modify according to PR reviews
2021-03-08 02:59:48 +01:00
kirianguiller
42ba30eb8f
<add> Chinese mandarin implementation (tacotron2)
2021-03-08 02:59:24 +01:00
kirianguiller
49665783a6
remove gst handling in synthesizer.py class
2021-03-08 02:57:11 +01:00
kirianguiller
e85658ac2b
remove re.Match typing in '_number_replace()'
2021-03-08 02:57:11 +01:00
kirianguiller
0d4525322c
modify according to PR reviews
2021-03-08 02:57:11 +01:00
kirianguiller
e6fd118cf8
<add> Chinese mandarin implementation (tacotron2)
2021-03-08 02:57:11 +01:00
Eren Gölge
e3102e753c
enable backward compat for loading the best model
2021-03-08 02:57:11 +01:00
gerazov
2451a813a2
refactored keep_all_best
2021-03-08 02:57:11 +01:00
gerazov
8cefa76bae
reformatted docstrings in arguments.py
2021-03-08 02:57:11 +01:00
gerazov
2db40457e8
brushed up printing model load path and best loss path
2021-03-08 02:56:36 +01:00
gerazov
f2e474cd37
loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added
2021-03-08 02:56:36 +01:00
Eren Gölge
4111df6769
Docstrings for audioprocessor
2021-03-08 02:54:47 +01:00
Eren Gölge
2ca74b8ab3
add RUSLAN dataset preprocessor
2021-03-08 02:54:47 +01:00
Eren Gölge
8993120634
do not test server and modelManager until fixing #657
2021-03-08 02:54:47 +01:00
Adonis Pujols
89b7f01534
add encoding="utf-8"
2021-03-08 02:54:47 +01:00
Eren Gölge
ffceccb021
fix #655
2021-03-08 02:54:47 +01:00
Eren Gölge
534c341f16
linter update
2021-03-08 02:54:47 +01:00
Eren Gölge
0e1e60bef0
remove redundancy
2021-03-08 02:54:47 +01:00
Eren Gölge
93a83c0068
Update TTS/utils/arguments.py
...
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-03-08 02:54:47 +01:00
Eren Gölge
39fbf2fe84
Update TTS/bin/find_unique_chars.py
...
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-03-08 02:54:47 +01:00
Eren Gölge
ee71eb4eb7
linter fixes
2021-03-08 02:54:47 +01:00
Eren Gölge
55fc50b26d
update test_text_processing for espeak-ng
2021-03-08 02:54:47 +01:00
Eren Gölge
5b8a6736a7
remove _phoneme_punctuations
2021-03-08 02:54:47 +01:00
Eren Gölge
194f82de51
save default model chars to the training config file
2021-03-08 02:54:47 +01:00
Eren Gölge
62a8eba3b2
parse_characters function
2021-03-08 02:54:47 +01:00
Eren Gölge
0b33acdcca
enable saving model characters in io.py
2021-03-08 02:54:47 +01:00
Eren Gölge
f9fe167537
docstring update
2021-03-08 02:54:47 +01:00
Eren Gölge
62aeacbdd1
save used model characters to the checkpoints
2021-03-08 02:54:47 +01:00
Eren Gölge
e06c93fe81
model_manager tests
2021-03-08 02:54:47 +01:00
Eren Gölge
fe41084eb3
author , license and contact info in .model.json
2021-03-08 02:54:47 +01:00
nmstoker
ae0d54ddae
Updating models list to include EK1 TTS/vocoder
2021-03-08 02:54:47 +01:00
Eren Gölge
c6702b5b9f
find unique characters in a dataset
2021-03-08 02:54:47 +01:00
Eren Gölge
dad3565379
use default vocoders in server.py
2021-03-08 02:54:47 +01:00
Eren Gölge
d30608ab17
set an output_sample_rate in synthesizer and use it for writing the wav
...
file
2021-03-08 02:54:47 +01:00
Eren Gölge
3ccb015cd8
return the json entry of the downloaded model
2021-03-08 02:54:47 +01:00
Eren Gölge
00e0933f43
save_wav with a custom sampling rate
2021-03-08 02:54:47 +01:00
Eren Gölge
9fefc79f0c
fix make_symbols
2021-03-08 02:54:47 +01:00
Eren Gölge
8955333e9d
use default vocoder in synthesize.py
2021-03-08 02:54:47 +01:00
Eren Gölge
23b282f600
define default vocoders
2021-03-08 02:54:47 +01:00
Eren Gölge
6bd8485d10
bug fix
2021-03-08 02:54:47 +01:00
Eren Gölge
5f1018abee
fix spelling of a def argument and parse phonemes from config.json if
...
use_phonemes is True
2021-03-08 02:54:47 +01:00
Eren Gölge
1c1abb8a9b
docstring update
2021-03-08 02:54:47 +01:00
Eren Gölge
6cd642c2e1
add missing phonemes to test_config.json
2021-03-08 02:54:47 +01:00
Eren Gölge
43b951018e
fix the default vocoder name
2021-03-08 02:54:47 +01:00
Adonis Pujols
81b145c321
spelling error. should be multiband not mulitband
2021-03-08 02:54:47 +01:00
Adonis Pujols
59b1b13e07
spelling error. should be multiband not mulitband
2021-03-08 02:54:47 +01:00
Eren Gölge
ee58ff2d38
add russian phoneme char
2021-03-08 02:54:47 +01:00
Eren Gölge
29d928d531
css10 dataset preprocessor
2021-03-08 02:54:47 +01:00
Eren Gölge
49771f2541
download github model releases by model manager
2021-03-08 02:54:21 +01:00
Eren Gölge
3c961370e7
linter fixes
2021-03-08 02:54:21 +01:00
gerazov
2b5cb24db7
final final fixes
2021-03-08 02:54:21 +01:00
gerazov
b3c5cc2cdc
final fixes
2021-03-08 02:54:21 +01:00
gerazov
10d5a63d49
updated to current dev
2021-03-08 02:54:21 +01:00
gerazov
6f06e31541
changed train scripts
2021-03-08 02:54:21 +01:00
gerazov
2daca15802
restructured arg parsing and processing to utils
2021-03-08 02:54:21 +01:00
Eren Gölge
2fbe4a1b8a
fix gdown
2021-03-08 02:54:21 +01:00
Branislav Gerazov
ed56944c4a
improve robustness of defining wavernn in config file
2021-03-08 02:54:21 +01:00
Branislav Gerazov
5e2bc8c99f
update wavernn test config, delete cap=True
2021-03-08 02:54:21 +01:00
Branislav Gerazov
b1e3160884
waveRNN fix
2021-03-08 02:54:21 +01:00
Eren Gölge
08581deb61
linter updates
2021-03-08 02:53:02 +01:00
Thorsten Mueller
167901813d
Ups. Added missing ,
2021-03-08 02:53:02 +01:00
Eren Gölge
93a6bdfd6c
linter fixes and version updates for deps
2021-03-08 02:51:10 +01:00
Eren Gölge
a30a231566
unpin cython version and comment out pyworld in audio.py causing dep
...
issues
2021-03-08 02:50:15 +01:00
Thorsten Mueller
3eb00e8d93
Set out_path to be required param.
2021-03-08 02:49:15 +01:00
Alexander Korolev
ace430d5e6
fix device mismatch wavegrad training
...
this should fix the device mismatch as seen here https://github.com/mozilla/TTS/issues/622#issue-789802916
2021-03-08 02:49:15 +01:00
Eren Gölge
83143fbe39
fix #638
2021-03-08 02:48:31 +01:00
Eren Gölge
30c3bef3f9
move hubconf
2021-03-08 02:48:31 +01:00
Eren Gölge
bbea6a0884
hubconf.py and load .models.json from the default location by manage.py
2021-03-08 02:48:31 +01:00
Eren Gölge
90d4f08d6c
reorder imports
2021-03-08 02:48:31 +01:00
Eren Gölge
db231c83fc
distill import statement, check python version in setup.py
2021-03-08 02:48:31 +01:00
Thorsten Mueller
915ec1faac
Added info if model already downloaded in --list_models
2021-03-08 02:48:31 +01:00
Alexander Korolev
b4bc5f6eb1
update fixed stopnet_pos_weight parameter
...
config parameter c.stopnet_pos_weight has currently no effect as it is not used.
2021-03-08 02:48:31 +01:00
Eren Gölge
534e3c67c6
README update, set default models for synthesize.py and server.py. Disable verbose for ap init.
2021-03-08 02:48:31 +01:00
kirianguiller
7f36d91131
update chinese model
2021-03-01 14:55:05 +01:00
Eren Gölge
547bfc4ce9
bug fix
2021-02-18 18:24:03 +00:00
Eren Gölge
adaeec57ec
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2021-02-18 17:21:09 +00:00
Eren Gölge
5b70c8ba4f
enable backward compat for loading the best model
2021-02-18 17:20:36 +00:00
Eren Gölge
e4f81d6856
Merge pull request #654 from kirianguiller/chinese-implementation
...
Chinese implementation (merge into dev)
2021-02-18 17:15:32 +01:00
kirianguiller
22a6bbfa80
remove gst handling in synthesizer.py class
2021-02-17 20:53:56 +01:00
kirianguiller
3911b87e54
remove re.Match typing in '_number_replace()'
2021-02-17 20:53:56 +01:00
kirianguiller
fb0655d1e7
modify according to PR reviews
2021-02-17 20:53:56 +01:00
kirianguiller
c4c7bc1b88
<add> Chinese mandarin implementation (tacotron2)
2021-02-17 20:53:56 +01:00
Eren Gölge
d0454461de
Merge branch 'pr/gerazov/650-2' into dev
2021-02-17 13:40:45 +00:00
Eren Gölge
a8ea0ea6ce
Docstrings for audioprocessor
2021-02-17 13:35:41 +00:00
Eren Gölge
f6e6314910
add RUSLAN dataset preprocessor
2021-02-17 13:35:23 +00:00
Eren Gölge
ce0c5eccbd
do not test server and modelManager until fixing #657
2021-02-17 00:35:43 +00:00
gerazov
61c88beb94
refactored keep_all_best
2021-02-15 18:40:17 +01:00
Eren Gölge
eb543c027e
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2021-02-15 17:06:40 +00:00
Eren Gölge
8a106e0527
fix #655
2021-02-15 17:06:03 +00:00
Eren Gölge
216945e653
Merge pull request #647 from adonispujols/patch-1
...
Easy Fix for #454 (which was somehow deleted?)
2021-02-15 13:17:17 +01:00
Eren Gölge
06a3ba2fe2
linter update
2021-02-15 12:10:19 +00:00
Eren Gölge
7f58fa365b
Merge branch 'save_characters' into dev
2021-02-15 12:07:28 +00:00
Eren Gölge
ff218e2370
remove redundancy
2021-02-15 12:07:02 +00:00
Eren Gölge
80af8ca5e1
Update TTS/utils/arguments.py
...
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-02-15 13:03:59 +01:00
Eren Gölge
3b6ce04332
Update TTS/bin/find_unique_chars.py
...
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
2021-02-15 13:02:29 +01:00
Eren Gölge
dc3596dad4
model_manager tests
2021-02-15 11:29:22 +00:00
Eren Gölge
77e630348e
author , license and contact info in .model.json
2021-02-15 11:02:21 +00:00
Eren Gölge
e1bc823e44
Merge branch 'pr/nmstoker/652' into dev
2021-02-15 10:57:12 +00:00
nmstoker
33bcdc6ff8
Updating models list to include EK1 TTS/vocoder
2021-02-14 23:44:05 +00:00
Eren Gölge
420901f4c2
linter fixes
2021-02-12 14:41:17 +00:00
Eren Gölge
4244096ccb
update test_text_processing for espeak-ng
2021-02-12 14:07:26 +00:00
Eren Gölge
b28c724c04
remove _phoneme_punctuations
2021-02-12 12:10:57 +00:00
Eren Gölge
7ab527d17e
save default model chars to the training config file
2021-02-12 12:06:46 +00:00
Eren Gölge
593cedee14
parse_characters function
2021-02-12 12:05:56 +00:00
Eren Gölge
2abfff17f9
enable saving model characters in io.py
2021-02-12 12:04:41 +00:00
Eren Gölge
918f007a11
docstring update
2021-02-12 12:04:07 +00:00
Eren Gölge
e774f68aee
save used model characters to the checkpoints
2021-02-12 12:03:42 +00:00
gerazov
0e78e31dbf
reformatted docstrings in arguments.py
2021-02-12 11:36:01 +01:00
gerazov
310d18325e
brushed up printing model load path and best loss path
2021-02-12 10:55:45 +01:00
Eren Gölge
8b6fd76ad2
find unique characters in a dataset
2021-02-12 09:46:11 +00:00
gerazov
af46727517
loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added
2021-02-12 02:12:00 +01:00
Eren Gölge
a1e595790d
use default vocoders in server.py
2021-02-11 15:31:39 +00:00
Eren Gölge
8aa6a0decb
set an output_sample_rate in synthesizer and use it for writing the wav file
2021-02-11 15:28:07 +00:00
Eren Gölge
0c52d27d65
return the json entry of the downloaded model
2021-02-11 15:27:41 +00:00
Eren Gölge
1649ad3431
save_wav with a custom sampling rate
2021-02-11 15:27:20 +00:00
Eren Gölge
43f54d2dce
fix make_symbols
2021-02-11 15:26:52 +00:00
Eren Gölge
0657b38111
use default vocoder in synthesize.py
2021-02-11 15:26:17 +00:00
Eren Gölge
2043a9b5f5
define default vocoders
2021-02-11 15:25:55 +00:00
Eren Gölge
ff27690ca7
bug fix
2021-02-11 13:43:29 +00:00
Eren Gölge
bc131208be
fix spelling of a def argument and parse phonemes from config.json if use_phonemes is True
2021-02-11 13:04:47 +00:00
Eren Gölge
f1799dbd60
docstring update
2021-02-11 11:25:31 +00:00
Eren Gölge
3baec4ea96
add missing phonemes to test_config.json
2021-02-11 11:14:39 +00:00
Eren Gölge
a3d1e65b34
Merge branch 'pr/adonispujols/646' into dev
2021-02-11 10:37:29 +00:00
Eren Gölge
3c2e13ca5c
fix the default vocoder name
2021-02-11 10:36:52 +00:00
Adonis Pujols
48011a8b58
add encoding="utf-8"
2021-02-11 05:26:06 -05:00
Adonis Pujols
b29a7e9645
spelling error. should be multiband not mulitband
2021-02-11 04:49:28 -05:00
Adonis Pujols
6c824a6629
spelling error. should be multiband not mulitband
2021-02-11 04:48:53 -05:00
Eren Gölge
b08b8ca2a1
add russian phoneme char
2021-02-10 13:30:59 +00:00
Eren Gölge
9cad435288
css10 dataset preprocessor
2021-02-09 15:11:26 +00:00
Eren Gölge
cea5e517f2
download github model releases by model manager
2021-02-09 14:24:14 +00:00
Eren Gölge
c619859a3f
linter fixes
2021-02-09 11:43:17 +00:00
gerazov
e507373b55
final final fixes
2021-02-06 23:08:47 +01:00
gerazov
ad17dc9e76
final fixes
2021-02-06 23:05:01 +01:00
gerazov
8fdd08ea15
updated to current dev
2021-02-06 22:59:52 +01:00
gerazov
2705d27b28
changed train scripts
2021-02-06 22:29:30 +01:00
gerazov
4f8f274d6e
restructured arg parsing and processing to utils
2021-02-06 22:25:56 +01:00
Eren Gölge
e7e880f514
fix gdown
2021-02-05 13:42:24 +00:00
Eren Gölge
f4f6290eec
Merge branch 'pr/gerazov/641' into dev
2021-02-05 13:14:49 +00:00
Eren Gölge
d49757faaa
linter updates
2021-02-05 13:10:43 +00:00
Branislav Gerazov
f063545325
improve robustness of defining wavernn in config file
2021-02-05 13:26:33 +01:00
Branislav Gerazov
24ffa9e9f6
update wavernn test config, delete cap=True
2021-02-05 13:10:02 +01:00
Branislav Gerazov
cb77aef36c
waveRNN fix
2021-02-04 09:52:03 +01:00
Thorsten Mueller
d74866cb8e
Merge remote-tracking branch 'upstream/dev' into dev
...
Fix for circleci error mentioned in PR https://github.com/mozilla/TTS/pull/637
2021-02-02 19:40:18 +01:00
Thorsten Mueller
a82152eef3
Ups. Added missing ,
2021-02-02 19:29:16 +01:00
Thorsten Mueller
4cb4fcf02c
Set out_path to be required param.
2021-02-02 19:29:16 +01:00
Thorsten Mueller
c75ea74914
Added info if model already downloaded in --list_models
2021-02-02 19:29:16 +01:00
Eren Gölge
2edab4b3f9
disable pw in audio that causes numpy issue
2021-02-01 17:05:03 +00:00
Eren Gölge
5c46543765
linter fixes and version updates for deps
2021-02-01 13:18:56 +00:00
Eren Gölge
8774e37444
unpin cython version and comment out pyworld in audio.py causing dep issues
2021-02-01 11:34:05 +00:00
Eren Gölge
5beed0ddcd
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2021-02-01 11:27:14 +00:00
Eren Gölge
c7407571fa
fix #638
2021-02-01 10:05:55 +00:00
Eren Gölge
dfdac1def9
Merge pull request #636 from thorstenMueller/dev
...
Set out_path to be required param in compute_statistics.py.
2021-01-29 18:08:31 +01:00
Thorsten Mueller
44c4a49745
Set out_path to be required param.
2021-01-29 17:23:38 +01:00
Eren Gölge
536366dc0a
Merge pull request #635 from SanjaESC/patch-1
...
fix device mismatch wavegrad training
2021-01-29 16:42:25 +01:00
Eren Gölge
0354b6f35e
move hubconf
2021-01-29 15:28:32 +00:00
Eren Gölge
aa5f24608a
hubconf.py and load .models.json from the default location by manage.py
2021-01-29 15:28:26 +00:00
Alexander Korolev
e81ebec7a8
fix device mismatch wavegrad training
...
this should fix the device mismatch as seen here https://github.com/mozilla/TTS/issues/622#issue-789802916
2021-01-29 15:18:59 +01:00
Eren Gölge
a926aa106d
reorder imports
2021-01-29 01:36:21 +01:00
Eren Gölge
8a6eee7fec
distill import statement, check python version in setup.py
2021-01-28 17:04:08 +01:00
Eren Gölge
131a163c95
Merge pull request #628 from thorstenMueller/dev
...
Added info if model already downloaded in --list_models
2021-01-28 13:10:06 +01:00
Alexander Korolev
ca28e05ed7
update fixed stopnet_pos_weight parameter
...
config parameter c.stopnet_pos_weight has currently no effect as it is not used.
2021-01-27 16:33:25 +01:00
Thorsten Mueller
ccbd542eb0
Added info if model already downloaded in --list_models
2021-01-27 16:19:02 +01:00
Eren Gölge
25c86ca715
README update, set default models for synthesize.py and server.py. Disable verbose for ap init.
2021-01-27 11:47:03 +01:00
Eren Gölge
4f32e77006
platform indep. way to fetch user data folder
2021-01-26 17:32:43 +01:00
Eren Gölge
0117c811a9
add a button to index.html to see the model details
2021-01-26 12:33:27 +01:00
Eren Gölge
a3adcaccdb
Merge branch 'pr/thorstenMueller/623' into dev
2021-01-26 12:19:39 +01:00
Eren Gölge
b464cab9b8
setup.py update and pylint fixes
2021-01-26 02:57:50 +01:00
Eren Gölge
660d61aeeb
maximum_path_numpy and CYTHON adaptable import
2021-01-26 02:57:07 +01:00
Eren Gölge
877f0bbfba
manifest.in update
2021-01-26 02:56:55 +01:00
Eren Gölge
82e029529e
fix manifest file
2021-01-25 13:27:54 +01:00
Eren Gölge
57b668fd86
fixing some pypi issues
2021-01-25 13:06:12 +01:00
Eren Gölge
60c1bb93d9
fixes before first PyPI release
2021-01-25 11:16:20 +01:00
Thorsten Mueller
afb7db2a1d
Removed unneeded check and removed specific taco2 model name.
2021-01-22 16:22:50 +01:00
Eren Gölge
fae10309e4
Merge pull request #624 from SanjaESC/patch-3
...
Update train_tacotron.py
2021-01-22 13:29:09 +01:00
Eren Gölge
5ee73c2bae
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2021-01-22 13:26:27 +01:00
Eren Gölge
5fb611ef40
static image for server index.html
2021-01-22 03:01:53 +01:00
Eren Gölge
ca647cf222
Model Manager to download released models
2021-01-22 02:35:43 +01:00
Eren Gölge
ca8ad9c21e
rename audio._normalize to audio.normalize
2021-01-22 02:33:19 +01:00
Eren Gölge
c990b3a59c
linter fixes and test fixes
2021-01-22 02:32:35 +01:00
Alexander Korolev
f251dc8c0e
Update train_tacotron.py
...
When attempting to fine-tune a model with "prenet_type": "bn" that was originally trained with "prenet_type": "original", a RuntimeError is thrown that stops the training.
By catching the RuntimeError, the required layers can be partially restored and the training will continue without any problems.
2021-01-21 21:16:30 +01:00
Eren Gölge
0ab2eb2664
use synthesizer in both synthesize.py and server.py
2021-01-21 15:54:33 +01:00
Eren Gölge
9addfabc43
wavernn load_checkpoint function
2021-01-21 15:31:13 +01:00
Eren Gölge
50fee59a2c
update synthesizer.py for better interfacing to different models
2021-01-21 15:30:49 +01:00
Eren Gölge
007a4d7139
remove 3rd party wavernn support from server.py and add ModelManager arguments
2021-01-21 15:30:16 +01:00
Eren Gölge
6b6e989fd2
update server readme
2021-01-21 15:29:46 +01:00
Thorsten Mueller
e414582be6
Added option for server ui details page.
2021-01-20 21:56:40 +01:00
root
1bc8fbbd3c
set eval mode when loading models
2021-01-20 02:14:18 +00:00
root
5bd7238153
interpolate spectrogram in vocoder generic utils for matching sample rates
2021-01-20 02:13:01 +00:00
root
ca3743539a
load_checkpoint func for vocoder models
2021-01-20 02:12:29 +00:00
root
ea39715305
read_json_with_comments
2021-01-20 02:11:55 +00:00
root
563bc921d8
optional verbose for audio.py init
2021-01-20 02:11:24 +00:00
root
1faf565e3a
add load_checkpoint func to tts models
2021-01-20 02:10:56 +00:00
root
5c87753e88
glow-tts fix for saving inverse weight
2021-01-20 02:09:42 +00:00
root
3d30dae8f3
.models.json and synthesize.py update for interfacing with model manager
2021-01-20 02:08:58 +00:00
gerazov
b2b4828f17
set requires_grad=False
2021-01-16 19:46:04 +01:00
gerazov
c96f7a2614
TorchSTFT to device fix
2021-01-16 12:21:16 +01:00
root
7beaacc55b
update compute_attention_masks.py
2021-01-13 10:03:57 +00:00
erogol
428c224b88
comment update
2021-01-12 17:31:04 +01:00
erogol
bbc8d665a1
move attention layers to a separate file
2021-01-11 17:27:30 +01:00
erogol
79c841ccd3
mass refactoring and update
2021-01-11 17:26:58 +01:00
erogol
1d961d6f8a
class renaming
2021-01-11 17:26:11 +01:00
erogol
c0a2aa68d3
formatting
2021-01-11 17:25:39 +01:00
erogol
b206162d11
more docstrings
2021-01-11 17:25:04 +01:00
erogol
6e9043c5d2
rename convbnblocks and handle none mask
2021-01-11 17:22:34 +01:00
erogol
921fa5db92
remove attentions from common layers
2021-01-11 15:06:42 +01:00
erogol
cc2b1e043d
docstrings for common layers
2021-01-11 15:06:12 +01:00
erogol
a6f40fef2e
stage missing files
2021-01-08 16:02:56 +01:00
erogol
d382d759b3
small fixes and test fixes
2021-01-08 15:48:40 +01:00
erogol
a6259041d3
docstring for speedyspeech
2021-01-07 14:35:22 +01:00
erogol
de2a542f83
glow-tts bug fix
2021-01-07 13:40:32 +01:00
erogol
14d33662ea
input shapes for tacotron models
2021-01-06 13:19:40 +01:00
erogol
f288e9a260
docstrings for tacotron models
2021-01-06 13:19:40 +01:00
erogol
5a45af48f1
fix
2021-01-06 13:19:40 +01:00
erogol
e7fad928e7
doc strings for the all glow-tts layers
2021-01-06 13:19:40 +01:00
erogol
d3b7284be4
glow-tts comments and refactoring
2021-01-06 13:19:40 +01:00
erogol
7586fbc4de
SS refactoring
2021-01-06 13:19:40 +01:00
erogol
e82d31b6ac
glow-tts refactoring
2021-01-06 13:19:40 +01:00
erogol
29f4329d7f
update glow-tts layers and add some comments
2021-01-06 13:19:40 +01:00
erogol
29cf933831
update SS config
2021-01-06 13:19:40 +01:00
erogol
228ada04b5
update glow-tts ljspeech config
2021-01-06 13:19:40 +01:00
erogol
f352b3534c
make noise augmentation optional
2021-01-06 13:19:40 +01:00
erogol
71c382be14
copy model scale stats file with config.json to the training folder, fixed for model inits
2021-01-06 13:19:40 +01:00
erogol
aa40fe1aa0
SS model refactoring for multi speaker
2021-01-06 13:19:40 +01:00
erogol
eb555855e4
small fixes
2021-01-06 13:19:40 +01:00
erogol
5901a00576
argument rename
2021-01-06 13:19:40 +01:00
erogol
4ef083f0f1
select decoder type for SS
2021-01-06 13:19:40 +01:00
erogol
d5a0190c4b
update copy_config_file to copy_model_files
2021-01-06 13:19:40 +01:00
erogol
8971c59b2d
plot eval alignment score right
2021-01-06 13:19:40 +01:00
erogol
3fa408a5ea
change order BN + ReLU to ReLU + BN for SS
2021-01-06 13:19:40 +01:00
erogol
ac5c9217d1
positional encoding masking for SS
2021-01-06 13:19:40 +01:00
erogol
fede46e96e
pylint and test fixes
2021-01-06 13:19:40 +01:00
erogol
2abe3df153
compute_attention_masks.py
2021-01-06 13:19:40 +01:00
erogol
cf869e8922
add SS files
2021-01-06 13:19:40 +01:00
erogol
e4680e1b99
plot float16 alignments
2021-01-06 13:19:40 +01:00
erogol
13c6665c92
inference for SS
2021-01-06 13:19:40 +01:00
erogol
30788960a8
check SS model parameters
2021-01-06 13:19:40 +01:00
erogol
5cae2c5742
make optional position encoding for speedyspeech
2021-01-06 13:19:40 +01:00
erogol
dc4a16d62e
speedy speech losses
2021-01-06 13:19:40 +01:00
erogol
d62cac7252
glow-tts prenet bug fix
2021-01-06 13:19:40 +01:00
erogol
a1d5a9ddda
config update to use noise for augmentation
2021-01-06 13:19:40 +01:00
erogol
022af74d74
update prompt msg
2021-01-06 13:19:40 +01:00
erogol
57ef53bef3
update argument check for non tacotron models
2021-01-06 13:19:40 +01:00
erogol
27a75de15f
update processors for loading attention maps
2021-01-06 13:19:40 +01:00
erogol
fa6907fa0e
update glow-tts parameters and fix rel-attn-win size
2021-01-06 13:19:40 +01:00
erogol
7b20d8cbd3
implement residual BN convolution and add it as an alternative encoder for glow-tts. also generic layers to layers/generic
2021-01-06 13:19:40 +01:00
erogol
973754d893
fix for init glow-tts
2021-01-06 13:19:40 +01:00
erogol
f81af4eb0d
config update disable guided attention for dynamic conv attention
2021-01-06 13:19:40 +01:00
erogol
29b17c0808
bug fix for gradual training
2021-01-06 13:19:40 +01:00
erogol
5c50e104d6
config update
2021-01-06 13:19:40 +01:00
erogol
6478d552dc
tacotron training bug fix
2021-01-06 13:19:40 +01:00
erogol
1dd086577a
tacotron training bug fix
2021-01-06 13:18:41 +01:00
erogol
fa20638083
config for ljspeech dynamic conv attention
2021-01-06 13:18:41 +01:00
erogol
070146e143
add monotonic dynamic convolution attention
2021-01-06 13:18:41 +01:00
erogol
18392bc13a
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2021-01-06 13:18:08 +01:00
Thorsten Mueller
f673f8f74d
Added support for npy output from tune-wavegrad
2020-12-19 22:51:22 +01:00
Thorsten Mueller
2aa0354b44
Fix for 'NoneType' object has no attribute 'to'
2020-12-19 22:37:03 +01:00
Thorsten Mueller
28a64221ea
Improve robustness on cpu / gpu model mix
2020-12-19 22:23:28 +01:00
erogol
8293751a38
remove mozilla from server page
2020-12-17 12:28:28 +01:00
erogol
639fa29261
update speaker id casting for glow-tts
2020-12-14 16:58:47 +01:00
erogol
999120ecdf
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-12 18:50:14 +01:00
erogol
f611e6ac01
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-12 18:47:59 +01:00
Jörg Thalheim
62fd4ca70d
inflect negative numbers correctly
2020-12-10 16:47:51 +01:00
Jörg Thalheim
6646682650
cleaners: expand english time
2020-12-10 14:53:20 +01:00
Jörg Thalheim
76138687d3
expand more currencies
2020-12-10 14:53:20 +01:00
erogol
a2859b7ddc
update config args checks
2020-12-10 13:52:57 +01:00
erogol
788cd6f902
fix multi-speaker glow-tts inference
2020-12-10 02:05:48 +01:00
erogol
3d5066e2b8
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-10 00:31:03 +01:00
erogol
92cc9630d7
fix glow-tts synthesis for DPP
2020-12-10 00:30:34 +01:00
Eren Gölge
2473b2dc62
Merge pull request #559 from krzim/patch-1
...
Fix import to grab the encoder model save function
2020-12-10 00:19:32 +01:00
erogol
53679b706d
glow-tts distributed fix
2020-12-09 23:39:09 +01:00
erogol
62bc171db5
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-09 15:46:57 +01:00
erogol
df180148e9
use noise augmentation in TTSDataset
2020-12-09 15:46:25 +01:00
Thorsten Mueller
e39628ce2f
Limit filenames to 10 chars
2020-12-08 18:44:19 +01:00
erogol
06612ce305
test fixes
2020-12-07 15:57:34 +01:00
erogol
0252a07fa6
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-07 11:31:55 +01:00
erogol
482e725752
sync torch calls before logging training results
2020-12-07 11:30:19 +01:00
erogol
7505c0ba27
multiprocess phoneme computation
2020-12-07 11:29:41 +01:00
erogol
20c86489d7
make static methods for faster multiprocess call
2020-12-07 11:29:10 +01:00
erogol
affe1c1138
set up training scripts to optionally compute phonemes before training. Define data_loaders before training starts and re-use them instead of re-defining them for every train and eval call. This enables better instance filtering based on input length.
2020-12-07 11:26:57 +01:00
Alexander Korolev
f42ca2b73f
Update wavegrad.py
...
This should fix the issue https://github.com/mozilla/TTS/issues/581
2020-12-04 16:43:39 +01:00
erogol
7c3cdced1a
make speaker_mapping a global variable to prevent reload. Fix glow-tts training
2020-12-01 03:23:25 +01:00
Thorsten Mueller
06a389bc08
Added option for saving raw spectrograms
2020-11-27 15:49:55 +01:00
erogol
a757b203bc
fix longer phoneme seqs
2020-11-26 15:05:03 +01:00
erogol
7b0a93d2f8
fix
2020-11-26 11:44:52 +01:00
erogol
0c6f7e4c77
resample audio if flag set true
2020-11-26 11:30:48 +01:00
erogol
f6c96b0ac2
Merge branch 'dev'
2020-11-25 15:29:06 +01:00
erogol
e3b7157146
remove contextlib
2020-11-25 15:22:01 +01:00
erogol
e3eda159d1
wavegrad_dataset update
2020-11-25 14:50:50 +01:00
erogol
a1e4ee18f9
convert float16 to float32 for plotting spectrograms
2020-11-25 14:50:28 +01:00
erogol
7541d2ecaa
return eval split optional
2020-11-25 14:50:09 +01:00
erogol
4b92ac0f92
tune_wavegrad update
2020-11-25 14:49:48 +01:00
erogol
d8c1b5b73d
print max lengths in tacotron training
2020-11-25 14:49:07 +01:00
erogol
1229554c42
use native amp
2020-11-25 14:48:54 +01:00
erogol
8a820930c6
compute_embedding update
2020-11-25 14:46:08 +01:00
erogol
aa2b31a1b0
use 'enabled' argument to control autocast
2020-11-17 14:22:01 +01:00
erogol
d9d04d892b
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-11-17 14:17:24 +01:00
erogol
8b0e0846a3
temporary travis check
2020-11-17 14:17:03 +01:00
Qingping Hou
b0b97d636f
speed up metafile build for voxceleb
2020-11-14 23:45:17 -08:00
erogol
a2a142dc39
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-11-14 13:02:19 +01:00
erogol
c65712426a
change noise scheduling for wavegrad. Compute beta values externally to enable better flexibility
2020-11-14 13:01:10 +01:00
erogol
5a59467f34
scaler fix for wavegrad and wavernn. Save and load scaler
2020-11-14 13:00:35 +01:00
erogol
d8511efa8f
use native amp for tacotron training
2020-11-14 12:59:28 +01:00
Qingping Hou
0cc3650ef6
support loading config in yaml
2020-11-14 00:13:53 -08:00
erogol
6cc464ead6
fix a ton of testing bugs
2020-11-12 16:33:29 +01:00
erogol
25551c4634
change wavernn generate to inference
2020-11-12 12:52:52 +01:00
erogol
9b0f441945
argument for returning no eval split
2020-11-12 12:52:27 +01:00
erogol
a7aefd5c50
use pytorch amp for mixed precision training for Tacotron
2020-11-12 12:51:56 +01:00
erogol
67e2b664e5
compute embeddings and create speakers.json
2020-11-12 12:51:17 +01:00
erogol
f8fd300b3e
bug fix
2020-11-10 12:53:39 +01:00
erogol
016d3503da
compute embeddings with speaker encoder
2020-11-10 12:51:02 +01:00
erogol
21364331d2
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-11-09 13:31:12 +01:00
erogol
c76a617072
linter updates
2020-11-09 13:18:35 +01:00
erogol
ea976b0543
python compat update for contextlib
2020-11-06 13:34:11 +01:00
erogol
c80225544e
tune wavegrad to find the best noise schedule for inference
2020-11-06 13:04:46 +01:00
erogol
d94782a076
reset the way ga_loss is stored in return_dict
2020-11-02 13:18:56 +01:00
erogol
a108d0ee81
check nan loss in glow-tts loss
2020-11-02 13:12:19 +01:00
erogol
b8ac9aba9d
check against NaN loss in tacotron_loss
2020-11-02 12:44:41 +01:00
erogol
ef04d7fae7
bug fix for wavernn training
2020-10-30 14:08:41 +01:00
erogol
a44ef58aea
wavegrad weight norm refactoring
2020-10-30 13:23:24 +01:00
erogol
183fe56d95
Merge branch 'ssim_loss' into dev
2020-10-29 23:49:09 +01:00
krzim
2202e171c5
Fix import to grab the encoder model save function
...
I saw that this was recently changed but I'm not sure if it should have been. This is the correct function given the arguments provided to it in the train loop.
2020-10-29 18:03:11 -04:00
erogol
73581cd94c
renaming train scripts and updating tests
2020-10-29 16:50:07 +01:00
erogol
39c71ee8a9
wavegrad refactoring, fixing tests for glow-tts and wavegrad
2020-10-29 15:47:15 +01:00
erogol
946a0c0fb9
bug fixes for single speaker glow-tts, enable torch based amp. Make amp optional for wavegrad. Bug fixes for synthesis setup for glow-tts
2020-10-29 15:45:50 +01:00
erogol
14c2381207
weight norm and torch based amp training for wavegrad
2020-10-29 12:31:43 +01:00
erogol
b76a0be97a
wavegrad model and layers refactoring
2020-10-29 12:31:43 +01:00
erogol
dc2825dfb2
wavegrad dataset update
2020-10-29 12:31:43 +01:00
erogol
5b5b9fcfdd
wavegrad config updates
2020-10-29 12:31:43 +01:00
erogol
c8a4c771a8
train wavegrad updates
2020-10-29 12:31:43 +01:00
erogol
670f44aa18
enable compute stats by vocoder config
2020-10-29 12:31:43 +01:00
erogol
f79bbbbd00
use Adam for wavegrad instead of RAdam
2020-10-29 12:31:43 +01:00
erogol
7bcdb7ac35
wavegrad updates
2020-10-29 12:31:43 +01:00
erogol
a1582a0e12
fix distributed training for train_* scripts
2020-10-29 12:31:43 +01:00
erogol
193b81b273
add universal_fullband_melgan config
2020-10-29 12:30:37 +01:00
erogol
e02cd6a220
initial wavegrad layers model and training script
2020-10-29 12:30:37 +01:00
erogol
ac57eea928
add wavegrad to vocoder generators
2020-10-29 12:30:37 +01:00
erogol
e723b99888
handle distributed model as saving
2020-10-29 12:30:37 +01:00
Eren Gölge
26c18b61c9
Merge pull request #553 from Edresson/dev
...
bug fix in the inference with GlowTTS
2020-10-28 18:49:31 +01:00
erogol
fdaed45f58
optional loss masking for stoptoken predictor
2020-10-28 18:40:54 +01:00
erogol
e49cc3bbcd
bug fix
2020-10-28 18:34:34 +01:00
erogol
59e1cf99d0
config update and ssim implementation
2020-10-28 18:30:00 +01:00
erogol
9cef923d99
ssim loss for tacotron models
2020-10-28 15:24:18 +01:00
erogol
9d0ae2bfb4
wavernn dataloader handling for short samples and mixed precision training
2020-10-28 12:31:01 +01:00
Edresson
f01502a9db
bug fix in glowTTS synthesize
2020-10-27 16:30:16 -03:00
Eren Gölge
f4b8170bd1
Merge pull request #545 from Edresson/dev
...
GlowTTS zeroshot TTS support
2020-10-27 15:23:41 +01:00
erogol
a6f564c8c8
pylint fixes
2020-10-27 12:35:10 +01:00
erogol
0becef4b58
small updates
2020-10-27 12:17:38 +01:00
sanjaesc
2ee47e9568
fix pylint once again
2020-10-27 12:17:38 +01:00
sanjaesc
1e646135ca
add model params to config
2020-10-27 12:17:38 +01:00
sanjaesc
bef3f2020b
compute audio feat on dataload
2020-10-27 12:17:38 +01:00
sanjaesc
7c72562fe7
fix travis + pylint tests
2020-10-27 12:17:38 +01:00
sanjaesc
91e5f8b63d
added to device cpu/gpu + formatting
2020-10-27 12:17:38 +01:00
sanjaesc
016a77fcf2
fix formatting + pylint
2020-10-27 12:17:38 +01:00
erogol
8de7c13708
fix no loss masking loss computation
2020-10-27 12:17:38 +01:00
sanjaesc
e8294cb9db
fixing pylint errors
2020-10-27 12:17:38 +01:00
sanjaesc
878b7c373e
added feature preprocessing if not set in config
2020-10-27 12:17:38 +01:00
sanjaesc
e495e03ea1
some minor changes to wavernn
2020-10-27 12:17:38 +01:00
Alex K
9c3c7ce2f8
wavernn stuff...
2020-10-27 12:17:38 +01:00
Alex K
6378fa2b07
add initial wavernn support
2020-10-27 12:17:38 +01:00
Edresson
89e9bfe3a2
add text processing blank token test
2020-10-26 17:41:23 -03:00
Edresson
d9540a5857
add blank token in sequence to improve glowtts results
2020-10-25 15:08:28 -03:00
Edresson
fbea058c59
add parse speakers function
2020-10-24 16:10:05 -03:00
Edresson
07345099ee
GlowTTS zero-shot TTS Support
2020-10-24 15:58:39 -03:00
Alexander Korolev
47d74ced1c
Update losses.py
...
Seems like in the latest dev merge, this change was reverted. Any specific reason for this?
Without it the problem as stated here https://github.com/mozilla/TTS/issues/473 occurs.
2020-10-23 14:15:01 +02:00
ayush-1506
2a3559f02b
Fix readme and config file
2020-10-21 13:43:49 +05:30
Edresson
b7f9ebd32b
add check arguments for GlowTTS and multispeaker training bug fix
2020-10-19 17:17:58 -03:00
erogol
c2c4126a18
remove merge conflicts
2020-10-08 01:35:27 +02:00
erogol
c5074cfd8e
general purpose distribute.py
2020-10-08 01:30:42 +02:00
erogol
6f0654f9a8
differential spectral loss
2020-10-08 01:30:42 +02:00
erogol
e0d4b88877
config update
2020-10-08 01:29:30 +02:00
erogol
4e93f90108
bug fix
2020-10-08 01:29:30 +02:00
erogol
bb9b70ee27
differential spectral loss and loss weight settings
2020-10-08 01:29:30 +02:00
erogol
e1eab1ce4b
print model r value as loading it
2020-10-07 13:34:21 +02:00
erogol
48a40c4730
remove unused import
2020-10-06 11:32:24 +02:00
erogol
a2606fbc22
format utils
2020-10-06 11:02:54 +02:00
Eren Gölge
4873601694
Merge pull request #531 from WeberJulian/french-cleaners
...
Adding support for french cleaners
2020-09-30 15:30:50 +02:00
Edresson
99d5a0ac07
add Speaker Conditional GST support
2020-09-29 16:09:27 -03:00
Julian WEBER
ea7c2e15c0
Adding french abbreviations
2020-09-29 15:43:39 +02:00
Julian WEBER
54b4031391
Merge remote-tracking branch 'origin/dev' into french-cleaners
2020-09-29 14:24:51 +02:00
Julian WEBER
da134eeee4
Subjective improvements
2020-09-29 14:20:52 +02:00
Julian WEBER
b2817e9e93
Adding french cleaners
2020-09-29 14:20:24 +02:00
Eren Gölge
cf02ace5b7
Merge pull request #530 from mueller91/fix_split_dataset
...
fix: split_dataset
2020-09-28 12:42:40 +02:00
erogol
154f90bc44
format speaker encoder imports
2020-09-28 11:19:19 +02:00
erogol
e097bc6c5d
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-09-28 11:15:32 +02:00
Eren Gölge
8e2dc79c3a
Merge pull request #526 from mueller91/dev
...
Fix: Check storage params only for speaker encoder
2020-09-28 11:15:23 +02:00
erogol
6a70c63f24
correct glow-tts loss
2020-09-27 03:28:42 +02:00
erogol
665f7ca714
linter fix
2020-09-24 12:57:54 +02:00
mueller91
227b9c8864
fix: split_dataset() runtime reduced from O(N * |items|) to O(N) where N is the size of the eval split (max 500)
...
I notice a significant speedup on the initial loading of large datasets such as common voice (from minutes to seconds)
2020-09-23 23:27:51 +02:00
mueller91
cfeeef7a7f
fix: broken imports and missing files after merging in latest commits from mozilla/dev into mueller91/dev.
...
speaker_encoder's config.json and visuals.py are missing in the current dev branch of MozillaTTS, and some imports are broken.
2020-09-22 20:10:41 +02:00
mueller91
1fe5eb054f
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
...
Conflicts:
TTS/bin/train_encoder.py
requirements.txt
2020-09-22 19:58:53 +02:00
mueller91
df4caec4b7
add: check_config for speaker_encoder
2020-09-22 19:52:09 +02:00
WeberJulian
3c212be5a8
fix: fixing the RenamingUnpickler fix
2020-09-22 17:36:05 +02:00
mueller91
0ea7f4e2bd
fix: make speaker encoder's storage parameters non-restricted
2020-09-22 10:39:40 +02:00
mueller91
7029452228
fix: make speaker encoder's storage parameters non-restricted
2020-09-22 10:31:42 +02:00
erogol
10258724d1
linter fixes
2020-09-22 03:54:16 +02:00
erogol
a6df617eb1
Merge branch 'glow-tts-amp-time_depth_conv' into dev
2020-09-21 14:23:45 +02:00
erogol
8150d5727e
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-09-21 14:21:55 +02:00
erogol
e0b9fa887f
glow-tts modules added
2020-09-21 14:15:40 +02:00
erogol
e4c6386603
change import for normalization layer
2020-09-21 13:09:52 +02:00
mueller91
9b4aac94a8
fix: linter issues
2020-09-21 12:13:02 +02:00
erogol
c008003506
do not check sample rate when loading the stats file for normalization, to enable interpolation for a vocoder with a different sample rate
2020-09-18 12:52:19 +02:00