Commit Graph

430 Commits

Author SHA1 Message Date
Katsuya Iida 165e5814af
Update Japanese phonemizer (#758)
* Update default ja vocoder

* update

* Japanese phonemizer test

* Run make style

Co-authored-by: Eren Gölge <egolge@coqui.ai>
2021-09-01 09:33:15 +02:00
Eren Gölge 2b7e55f01f Fix vits args types 2021-08-30 23:24:20 +00:00
Eren Gölge 18da8f5dbd Update pylint 2.10.2 and fix lint issues 2021-08-30 08:10:35 +00:00
Eren Gölge f186856e5d Add option to sort input sequence by audio len 2021-08-30 08:10:35 +00:00
Eren Gölge 2620f62ea8 Move duration_loss inside VitsGeneratorLoss 2021-08-27 07:07:07 +00:00
Eren Gölge 49e1181ea4 Fixes for the vits model 2021-08-26 17:15:09 +00:00
Eren Gölge 3ab8cef99e Fix VITS model SPD 2021-08-18 14:55:46 +00:00
Eren Gölge 7c0d564965 Synchronize DDP processes 2021-08-13 10:40:50 +00:00
Eren Gölge ecf5f17dca Fix distribute.py and ddp training 2021-08-12 22:22:32 +00:00
Eren Gölge c8b9ca3d71 Fix Tacotron num_char init 2021-08-10 08:56:34 +00:00
Eren Gölge 6af03ac476 Fix `num_char` init in Tacotron models 2021-08-09 21:46:15 +00:00
Eren Gölge 06018251e6 Add VITS and GlowTTS class docs 🗒️ 2021-08-09 18:02:36 +00:00
Eren Gölge 6a7275881d Add VitsConfig docstring 2021-08-09 18:02:36 +00:00
Eren Gölge f7a72552f1 Make duration predictor dropout configurable 2021-08-09 18:02:36 +00:00
Eren Gölge c312acac7d Implement VITS model 🚀
VITS model implementation built on Glow TTS and HiFiGAN
layers.
2021-08-09 18:02:36 +00:00
Eren Gölge 232a5abb6a Update `tts.setup_model`
Run `model.make_symbols()` if available to set the symbol list
2021-08-09 18:02:36 +00:00
Eren Gölge f5a6aa974f Modify `symbols.py` not to add _arpanet 2021-08-09 18:02:36 +00:00
Eren Gölge 003e5579e8 Enable `custom_symbols` in text processing
Models can define their own custom symbols lists with custom
`make_symbols()`
2021-08-09 18:02:36 +00:00
Eren Gölge bd4e29b4dd Add `compute_linear_spec=False` to `BaseTTSConfig` 2021-08-09 18:02:36 +00:00
Eren Gölge e4648ffef1 Fix multi-speaker init of Tacotron models & tests 2021-08-09 18:02:36 +00:00
Eren Gölge 01324c8e70 Update `base_tts.py`
Enable calling `make_symbols()` from the model if defined.
Compatibility changes for end2end `tts` models in batch formatting.
Changes in multi-speaker initialization.
Modify `test_run()` to work with dict output of `synthesis`
2021-08-09 18:02:36 +00:00
Agrin Hilmkil ced4cfdbbf Allow saving / loading checkpoints from cloud paths (#683)
* Allow saving / loading checkpoints from cloud paths

Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.

Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.

* Append suffix _fsspec to save/load function names

* Add a lower bound to the fsspec dependency

Skips the 0 major version.

* Add missing changes from refactor

* Use fsspec for remaining artifacts

* Add test case with path requiring fsspec

* Avoid writing logs to file unless output_path is local

* Document the possibility of using paths supported by fsspec

* Fix style and lint

* Add missing lint fixes

* Add type annotations to new functions

* Use Coqpit method for converting config to dict

* Fix type annotation in semi-new function

* Add return type for load_fsspec

* Fix bug where fs not always created

* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Eren Gölge d9e18e009b Skip phoneme cache pre-compute if the path exists 2021-08-09 18:02:36 +00:00
Eren Gölge 4b7b88dd3d Add fullband-melgan DE vocoder 2021-07-26 15:38:30 +02:00
Eren Gölge 75b201c6c1
Merge pull request #673 from coqui-ai/fix_stopnet
Fix stopnet training for Tacotron models
2021-07-24 12:25:38 +02:00
Eren Gölge fc0c4600bd Fix stopnet training 2021-07-24 11:39:54 +02:00
Eren Gölge 30eed347b6
Merge pull request #581 from Edresson/dev
Compute speaker embeddings in batch for the LSTM Speaker Encoder and compute embeddings / find chars using config file.
2021-07-23 17:22:51 +02:00
WeberJulian 25832eb97b Changes for review 2021-07-15 11:38:45 +02:00
Edresson b1620d1f3f remove ignore generate eval flag 2021-07-15 03:34:28 -03:00
WeberJulian c79a82ed07 refix linter 2021-07-13 23:12:18 +02:00
WeberJulian 7d92b30946 Fix tests 2021-07-13 23:00:34 +02:00
WeberJulian 32974dd6a9 Fix test sentences synthesis 2021-07-13 16:07:13 +02:00
Edresson 2e5baffa9c Merge fix and eval split as argparse 2021-07-13 01:47:32 -03:00
eren golge 3c0454490f Fix #616 2021-07-06 14:44:03 +02:00
Eren Gölge c25a2184e7 Add docs for `SpeakerManager` 2021-07-03 13:55:27 +02:00
Eren Gölge f382e4c700 Fix linter warnings 2021-07-03 13:30:24 +02:00
Eren Gölge 196876feb1 Fix `ModelManager` model download 2021-07-02 10:47:05 +02:00
Eren Gölge 9352cb4136 Format Align TTS docstrings 2021-07-02 10:45:58 +02:00
Eren Gölge 95ad72f38f Fix glow tts initialization 2021-07-02 10:45:37 +02:00
Eren Gölge 40b0b5365e Let `get_characters` return `num_chars` 2021-07-02 10:45:00 +02:00
Eren Gölge 0fa6a8c9b8 Fix glow tts default parameters 2021-07-02 10:44:23 +02:00
Eren Gölge 2e1a428b83 Update glowtts docstrings and docs 2021-06-30 14:30:55 +02:00
Eren Gölge ae6405bb76 Docstrings for `Trainer` 2021-06-28 17:03:47 +02:00
Eren Gölge d42d1c02ea Use `torch.linalg.qr` for pytorch > `v1.9.0` 2021-06-28 17:03:47 +02:00
Eren Gölge 9790eddada Fix wrong argument name 🛠️ 2021-06-28 17:03:47 +02:00
Eren Gölge 932ab107ae Docstring edit in `TTSDataset.py` ✍️ 2021-06-28 17:03:47 +02:00
Eren Gölge 8c74f054f0 Enable support for 🐍 python 3.10
Bump up versions numpy 1.19.5 and TF 2.5.0
2021-06-28 17:03:47 +02:00
Eren Gölge 9455a2b01e Apply small fixes for API compatibility 2021-06-28 17:03:47 +02:00
Eren Gölge a5d5bc9063 Print `max_decoder_steps` when model reaches the limit 2021-06-28 17:03:47 +02:00
Eren Gölge f23b228e24 Update `speaker_manager` 2021-06-28 17:03:47 +02:00