Eren Gölge
2bf9e83c49
FastPitch refactor and commenting
2021-09-06 15:16:58 +00:00
Eren Gölge
59b24e66cf
Add `AlignerNetwork`
2021-09-06 15:16:58 +00:00
Eren Gölge
648655fa03
Add `PitchExtractor` and return dict by `collate`
2021-09-06 15:16:58 +00:00
Eren Gölge
debf772ec5
Implement binary alignment loss
2021-09-06 15:16:58 +00:00
Eren Gölge
6e9d4062f2
Add `sort_by_audio_len` option
2021-09-06 15:16:58 +00:00
Eren Gölge
59d52a4cd8
Disable autocast for criterions
2021-09-06 15:16:58 +00:00
Eren Gölge
98a7271ce8
Refactor FastPitchv2
2021-09-06 15:16:58 +00:00
Eren Gölge
e429afbce4
Enable aligner for FastPitch
2021-09-06 15:16:58 +00:00
Eren Gölge
81c228a2d8
Update FastPitch: don't detach duration network inputs
2021-09-06 15:16:58 +00:00
Eren Gölge
ca29033ef4
Refactor FastPitch model
2021-09-06 15:16:58 +00:00
Eren Gölge
42862f7fdb
Format style of the recipes
2021-09-06 15:16:58 +00:00
Eren Gölge
5d59100a88
Don't use align_score for models with duration predictor
2021-09-06 15:16:58 +00:00
Eren Gölge
fac9dbe661
Update FastPitchLoss
2021-09-06 15:16:58 +00:00
Eren Gölge
b81560607b
Update docstrings
2021-09-06 15:16:58 +00:00
Eren Gölge
57b3aec1b9
Update docstring format
2021-09-06 15:16:58 +00:00
Eren Gölge
7692bfe7f8
Update FastPitch config
2021-09-06 15:16:58 +00:00
Eren Gölge
8584f2b82d
Update docstring format
2021-09-06 15:16:58 +00:00
Eren Gölge
b7caad39e0
Make optional to detach duration predictor input
2021-09-06 15:16:58 +00:00
Eren Gölge
9af42f7886
Restore `last_epoch` of the scheduler
2021-09-06 15:16:58 +00:00
Eren Gölge
aacbb3ed77
Fix SpeakerManager usage in `synthesize.py`
2021-09-06 15:16:58 +00:00
Eren Gölge
545a00fc04
Use absolute paths of the attention masks
2021-09-06 15:16:58 +00:00
Eren Gölge
bc396c393f
Add FastPitch model and FastPitchConfig
2021-09-06 15:16:58 +00:00
Eren Gölge
5a6ffaee08
Add yin based pitch computation
2021-09-06 15:16:58 +00:00
Eren Gölge
e802b24ad0
Compute mean and std pitch
2021-09-06 15:16:58 +00:00
Eren Gölge
8fffd4e813
Don't print computed phonemes
...
It causes noise in logs
2021-09-06 15:16:58 +00:00
Eren Gölge
d085642ac1
Cache pitch features
...
Cache the features at the beginning of `BaseTTS` training.
2021-09-06 15:16:58 +00:00
Eren Gölge
7590c7db7a
Fix `base_tacotron` `aux_input` handling
2021-09-06 15:16:58 +00:00
Eren Gölge
db32162eae
Fix `FastPitchLoss`
2021-09-06 15:16:58 +00:00
Eren Gölge
94e8e0d416
Fix configs
2021-09-06 15:16:58 +00:00
Eren Gölge
0f19f8c911
Fix `compute_attention_masks.py`
2021-09-06 15:16:58 +00:00
Eren Gölge
994f2be2c1
Add `compute_f0` field
2021-09-06 15:16:58 +00:00
Eren Gölge
c8d999b010
Add FastPitchLoss
2021-09-06 15:16:58 +00:00
Eren Gölge
fba257104d
Compute F0 using librosa
2021-09-06 15:16:58 +00:00
Katsuya Iida
165e5814af
Update Japanese phonemizer ( #758 )
...
* Update default ja vocoder
* update
* Japanese phonemizer test
* Run make style
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2021-09-01 09:33:15 +02:00
Eren Gölge
2b7e55f01f
Fix vits args types
2021-08-30 23:24:20 +00:00
Eren Gölge
b910a6ddce
Bump up to v0.2.1
2021-08-30 16:31:24 +00:00
Eren Gölge
d16da949a5
Merge branch 'fix_distribute' into dev
2021-08-30 16:31:07 +00:00
Eren Gölge
6782d3eab7
Fix linter issues for Python 3.6
2021-08-30 16:18:33 +00:00
Eren Gölge
738eee0cf9
Fix style
2021-08-30 13:12:13 +00:00
Eren Gölge
5255e089e6
Fix #767
2021-08-30 13:10:08 +00:00
Eren Gölge
c560114324
Fix #750
2021-08-30 13:06:50 +00:00
Eren Gölge
18b2e41e5a
Use `coqui_tts` as the default run name
2021-08-30 12:56:47 +00:00
Eren Gölge
9c86f1ac68
Fix usage of abstract class in vocoders
2021-08-30 08:10:35 +00:00
Eren Gölge
18da8f5dbd
Update pylint 2.10.2 and fix lint issues
2021-08-30 08:10:35 +00:00
Eren Gölge
f186856e5d
Add option to sort input sequence by audio len
2021-08-30 08:10:35 +00:00
Eren Gölge
2620f62ea8
Move duration_loss inside VitsGeneratorLoss
2021-08-27 07:07:07 +00:00
Eren Gölge
1692b8e4d9
Merge pull request #726 from fijipants/patch-1
...
Fix bug with log_func
2021-08-26 22:11:29 +02:00
Eren Gölge
49e1181ea4
Fixes for the vits model
2021-08-26 17:15:09 +00:00
Eren Gölge
5911eec3b1
Small trainer refactoring
...
1. Use a single Gradscaler for all the optimizers
2. Save terminal logs to a file. In DDP mode, each worker creates `trainer_N_log.txt`.
3. Fixes to allow only the main worker (rank==0) writing to Tensorboard
4. Pass parameters owned by the target optimizer to the grad_clip_norm
2021-08-26 17:08:58 +00:00
fijipants
e9e01b09b0
Fix bug with log_func
2021-08-18 19:59:51 -04:00
fijipants
8f57f8adfd
Update synthesizer.py
2021-08-18 19:56:52 -04:00
Eren Gölge
3ab8cef99e
Fix VITS model SPD
2021-08-18 14:55:46 +00:00
Eren Gölge
c5d1dd9d1b
Fix restoring best_loss
...
Keep the default value if model checkpoint has no `model_loss`
2021-08-17 12:12:36 +00:00
Eren Gölge
c8bbcdfd07
Fix `test_run` for DDP
2021-08-13 19:39:02 +00:00
Eren Gölge
7c0d564965
Synchronize DDP processes
2021-08-13 10:40:50 +00:00
Eren Gölge
ecf5f17dca
Fix distribute.py and ddp training
2021-08-12 22:22:32 +00:00
Eren Gölge
b02c4fe347
Bump up to v0.2.0
2021-08-11 08:15:39 +00:00
Eren Gölge
537bc8487a
Print model count when listing models
2021-08-10 16:25:11 +00:00
Eren Gölge
09ed8426e8
Add the models released with v0.2.0
2021-08-10 15:46:31 +00:00
Eren Gölge
39004484b9
Fix 🐛
...
Fix synthesizer multi-speaker init
Fix #712
2021-08-10 12:56:32 +00:00
Eren Gölge
c8b9ca3d71
Fix Tacotron num_char init
2021-08-10 08:56:34 +00:00
Eren Gölge
7eb94f760b
Remove Ruslan model
2021-08-09 21:48:36 +00:00
Eren Gölge
6af03ac476
Fix `num_char` init in Tacotron models
2021-08-09 21:46:15 +00:00
Ayush Chaurasia
e685ddfca7
Update trainer.py
2021-08-09 18:37:46 +00:00
Ayush Chaurasia
28870f8df4
update docstring
2021-08-09 18:35:35 +00:00
Ayush Chaurasia
8a246cbb66
Update trainer.py
2021-08-09 18:35:08 +00:00
Ayush Chaurasia
f3e9d61330
Refactor logging initialization
2021-08-09 18:35:08 +00:00
Ayush Chaurasia
79b74a989d
Update: add_text
2021-08-09 18:34:38 +00:00
Ayush Chaurasia
9fcf48b760
Delete logger_base.py
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
290972fd35
reformat
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
936a47504d
Update Logger API, recipes
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
f63cf46c55
Unified logger API
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
f4434da5a3
Update disabled structure
2021-08-09 18:31:16 +00:00
Ayush Chaurasia
f606741dc4
Add artifacts logging, wandb args
2021-08-09 18:31:16 +00:00
Ayush Chaurasia
f5e50ad502
WandbLogger
2021-08-09 18:27:06 +00:00
Eren Gölge
06018251e6
Add VITS and GlowTTS class docs 🗒️
2021-08-09 18:02:36 +00:00
Eren Gölge
6a7275881d
Add VitsConfig docstring
2021-08-09 18:02:36 +00:00
Eren Gölge
f7a72552f1
Make duration predictor dropout configurable
2021-08-09 18:02:36 +00:00
Eren Gölge
c312acac7d
Implement VITS model 🚀
...
VITS model implementation built on Glow TTS and HiFiGAN
layers.
2021-08-09 18:02:36 +00:00
Eren Gölge
060e746e21
Add `do_amp_to_db` option
2021-08-09 18:02:36 +00:00
Eren Gölge
e94c1f894d
Simplify `console_logger`
2021-08-09 18:02:36 +00:00
Eren Gölge
dd55960732
Update `synthesizer.py`
...
Fixes and changes for multi-speaker model init and custom symbols made
by `model.make_symbols()`
2021-08-09 18:02:36 +00:00
Eren Gölge
232a5abb6a
Update `tts.setup_model`
...
Run `model.make_symbols()` if available to set the symbol list
2021-08-09 18:02:36 +00:00
Eren Gölge
f5a6aa974f
Modify `symbols.py` not to add _arpanet
2021-08-09 18:02:36 +00:00
Eren Gölge
d4deb2716f
Modify `get_optimizer` to accept a model argument
2021-08-09 18:02:36 +00:00
Eren Gölge
003e5579e8
Enable `custom_symbols` in text processing
...
Models can define their own custom symbols lists with custom
`make_symbols()`
2021-08-09 18:02:36 +00:00
Eren Gölge
bd4e29b4dd
Add `compute_linear_spec=False` to `BaseTTSConfig`
2021-08-09 18:02:36 +00:00
Eren Gölge
960a35a121
Add `scheduler_after_epoch` to `BaseTrainingConfig`
2021-08-09 18:02:36 +00:00
Eren Gölge
e4648ffef1
Fix multi-speaker init of Tacotron models & tests
2021-08-09 18:02:36 +00:00
Eren Gölge
01324c8e70
Update `base_tts.py`
...
Enable calling `make_symbols()` from the model if defined.
Compatibility changes for end2end `tts` models in batch formatting.
Changes in multi-speaker initialization.
Modify `test_run()` to work with dict output of `synthesis`
2021-08-09 18:02:36 +00:00
Eren Gölge
bf562cf437
Update `trainer.py`
...
Fix multi-speaker initialization of models. Add changes for end2end `tts`
models.
2021-08-09 18:02:36 +00:00
Agrin Hilmkil
ced4cfdbbf
Allow saving / loading checkpoints from cloud paths ( #683 )
...
* Allow saving / loading checkpoints from cloud paths
Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.
Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.
* Append suffix _fsspec to save/load function names
* Add a lower bound to the fsspec dependency
Skips the 0 major version.
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless output_path is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for load_fsspec
* Fix bug where fs not always created
* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Eren Gölge
d9e18e009b
Skip phoneme cache pre-compute if the path exists
2021-08-09 18:02:36 +00:00
Eren Gölge
6c131d168e
Bump the version to 0.1.3
2021-07-26 21:32:27 +02:00
Eren Gölge
febd6105b5
Update default vocoder for de-thorsten
2021-07-26 16:08:52 +02:00
Eren Gölge
4b7b88dd3d
Add fullband-melgan DE vocoder
2021-07-26 15:38:30 +02:00
Eren Gölge
764f684e1b
Fix `server.py` for multi-speaker models
2021-07-26 15:38:30 +02:00
Eren Gölge
75b201c6c1
Merge pull request #673 from coqui-ai/fix_stopnet
...
Fix stopnet training for Tacotron models
2021-07-24 12:25:38 +02:00
Eren Gölge
fc0c4600bd
Fix stopnet training
2021-07-24 11:39:54 +02:00
Eren Gölge
30eed347b6
Merge pull request #581 from Edresson/dev
...
Compute speaker embeddings in batch for the LSTM Speaker Encoder, and compute embeddings / find chars using the config file.
2021-07-23 17:22:51 +02:00
Edresson Casanova
d5adc35fdf
Add docstring to compute_embeddings script
2021-07-21 07:16:10 -03:00
Eren Gölge
05c75aa9d5
Fix linter issues
2021-07-16 13:37:38 +02:00
Eren Gölge
58cc414477
Fix WaveGrad `test_run`
2021-07-16 13:02:25 +02:00
WeberJulian
25832eb97b
Changes for review
2021-07-15 11:38:45 +02:00
Edresson
b1620d1f3f
remove ignore generate eval flag
2021-07-15 03:34:28 -03:00
WeberJulian
c79a82ed07
refix linter
2021-07-13 23:12:18 +02:00
WeberJulian
7d92b30946
Fix tests
2021-07-13 23:00:34 +02:00
WeberJulian
32974dd6a9
Fix test sentences synthesis
2021-07-13 16:07:13 +02:00
Edresson
d906fea08c
lint fix and eval as argparse in extract tts spectrograms
2021-07-13 02:15:31 -03:00
Edresson
2e5baffa9c
Merge fix and eval split as argparse
2021-07-13 01:47:32 -03:00
Eren Gölge
93a74cbb71
Merge pull request #628 from Aloento/patch-2
...
Change to _get_preprocessor_by_name
2021-07-11 22:17:50 +02:00
Edresson
4eac1c4651
bug fix on train_encoder and unit tests
2021-07-11 12:00:39 -03:00
Aloento
6e3e6d5756
Change to _get_preprocessor_by_name
2021-07-08 09:53:13 +02:00
Eren Gölge
8fbadad68e
Bump up to v0.1.2
2021-07-06 14:44:59 +02:00
eren golge
3c0454490f
Fix #616
2021-07-06 14:44:03 +02:00
Eren Gölge
0c347624e7
Bump up version to v0.1.1
2021-07-04 11:46:36 +02:00
Eren Gölge
a05b234080
Raise an error when multiple GPUs are in use
...
User must define the target GPU by `CUDA_VISIBLE_DEVICES` and
use `distribute.py` for multi-gpu training.
2021-07-04 11:25:49 +02:00
Eren Gölge
270c3823eb
Fix #608
2021-07-04 11:19:31 +02:00
Eren Gölge
c25a2184e7
Add docs for `SpeakerManager`
2021-07-03 13:55:27 +02:00
Eren Gölge
f382e4c700
Fix linter warnings
2021-07-03 13:30:24 +02:00
Eren Gölge
9e7824fe35
Fix UnivNet inference code
2021-07-02 10:48:34 +02:00
Eren Gölge
168f97cbe9
Let `Synthesizer` use the speaker manager out of the model
2021-07-02 10:47:55 +02:00
Eren Gölge
196876feb1
Fix `ModelManager` model download
2021-07-02 10:47:05 +02:00
Eren Gölge
9352cb4136
Format Align TTS docstrings
2021-07-02 10:45:58 +02:00
Eren Gölge
95ad72f38f
Fix glow tts initialization
2021-07-02 10:45:37 +02:00
Eren Gölge
40b0b5365e
Let `get_characters` return `num_chars`
2021-07-02 10:45:00 +02:00
Eren Gölge
0fa6a8c9b8
Fix glow tts default parameters
2021-07-02 10:44:23 +02:00
Eren Gölge
a4c658f5ef
Fix for using the `Synthesizer` out of the model
2021-07-02 10:43:38 +02:00
Eren Gölge
db47f4f105
Update `.models.json`
2021-07-02 10:43:00 +02:00
Eren Gölge
2e1a428b83
Update glowtts docstrings and docs
2021-06-30 14:30:55 +02:00
Eren Gölge
5723eb4738
Fix config init in `process_args`
2021-06-29 16:41:08 +02:00
Eren Gölge
4b5421b42f
Remove FAQ link from README.md
2021-06-29 13:20:40 +02:00
Eren Gölge
47b3b10d6d
Bump up to v0.1.0 🚀
2021-06-29 13:07:59 +02:00
Eren Gölge
7ec5c31898
Merge branch 'univnet' into trainer-api
2021-06-29 10:27:12 +02:00
Eren Gölge
51398cd15b
Add docstrings and typing for `audio.py`
2021-06-28 17:03:47 +02:00
Eren Gölge
ae6405bb76
Docstrings for `Trainer`
2021-06-28 17:03:47 +02:00
Eren Gölge
6b265ae8e3
Docstring update
2021-06-28 17:03:47 +02:00
Eren Gölge
ab563ce7cd
Start training by config.json using `register_config`
2021-06-28 17:03:47 +02:00
Eren Gölge
b3c073c99b
Allow running full path scripts with `distribute.py`
2021-06-28 17:03:47 +02:00
Eren Gölge
d42d1c02ea
Use `torch.linalg.qr` for pytorch > `v1.9.0`
2021-06-28 17:03:47 +02:00
Eren Gölge
fbba37e01e
Fix loading the `amp` scaler from a checkpoint 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
a7617d8ab6
Add 🐍 python 3.9 to CI
2021-06-28 17:03:47 +02:00
Eren Gölge
9790eddada
Fix wrong argument name 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
932ab107ae
Docstring edit in `TTSDataset.py` ✍️
2021-06-28 17:03:47 +02:00
Eren Gölge
cfa5041db7
Fix `eval_log` for `gan.py` 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
d700845b10
Move `TorchSTFT` to `utils.audio`
2021-06-28 17:03:47 +02:00
Eren Gölge
5b89cb4fec
Fixup `trainer.py` 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
8c74f054f0
Enable support for 🐍 python 3.10
...
Bump up versions numpy 1.19.5 and TF 2.5.0
2021-06-28 17:03:47 +02:00
Eren Gölge
9455a2b01e
Apply small fixes for API compatibility
2021-06-28 17:03:47 +02:00
Eren Gölge
a5d5bc9063
Print `max_decoder_steps` when model reaches the limit
2021-06-28 17:03:47 +02:00