Commit Graph

332 Commits

Author SHA1 Message Date
Eren Gölge f769595112 Add more listing options to ModelManager 2021-12-20 11:54:10 +00:00
Eren Gölge 473414d4af Implement init_speaker_encoder and change arg names 2021-12-20 11:54:10 +00:00
Eren Gölge 35a781fb90 Fix synthesizer reading `use_language_embedding` 2021-12-20 11:54:10 +00:00
Eren Gölge 704dddcffa Make style 2021-12-20 11:54:10 +00:00
WeberJulian 54b7fb4e4a Fix zoo tests 2021-12-20 11:54:10 +00:00
WeberJulian a564eb9f54 Add support for multi-lingual models in CLI 2021-12-20 11:54:10 +00:00
Edresson 818dc4ccd8 Add Docstring for TorchSTFT 2021-12-20 11:54:10 +00:00
Edresson d39200e69b Remove torchaudio requirement 2021-12-20 11:54:10 +00:00
Edresson 45d0b04179 Lint fixes 2021-12-20 11:54:10 +00:00
Edresson 2b2cecaea2 Set the new_fields in copy_model_files as None by default 2021-12-20 11:54:10 +00:00
Edresson 352aa69eca Create a module for the VAD script 2021-12-20 11:54:10 +00:00
loganhart420 103c010eca Add additional datasets 2021-12-16 07:21:27 -05:00
Eren Gölge ce45d9e1af Make style and lint 2021-12-01 10:42:52 +00:00
Eren Gölge 512ada7548 Fix callbacks against multi-gpu training 2021-12-01 10:32:14 +00:00
Eren Gölge d227aaebcc Print when using Griffin-Lim in Synthesizer 2021-11-01 16:52:26 +01:00
George 37eaefc085 Optional silence trimming during inference and find_endpoint() fix (#898)
* Set find_endpoint db threshold in config.json

* Optional silence trimming during inference

* Make trim_db value negative
2021-10-29 18:28:55 +02:00
Eren Gölge 2df0752e73 Model zoo tests (#900)
* Fix VITS model multi-speaker init

* Remove gdrive support in model manager

* Add model zoo tests
2021-10-29 17:54:16 +02:00
Eren Gölge 035ed432bc Doc update (#889)
* Link source files from the docs

* Update glowTTS recipes for docs

* Add dataset downloaders
2021-10-26 17:41:33 +02:00
Eren Gölge 1987aaaaed Update d-vector reshape in synthesizer 2021-10-21 13:53:25 +00:00
Eren Gölge 92b6d98443 Set pitch frame alignment wrt spec computation 2021-10-20 18:12:38 +00:00
Eren Gölge 0a3d1cc7ee Pass speaker manager to the model in synthesizer 2021-10-20 18:11:36 +00:00
Eren Gölge 3c7848e9b1 Don't OOR values in train console log 2021-10-19 16:32:16 +00:00
Eren Gölge c514351c0e Refactor multi-speaker init in BaseTTS-Tacotron1-2 2021-10-18 08:55:45 +00:00
Eren Gölge 700b056117 Update Synthesizer multi-speaker handling 2021-10-15 10:21:12 +00:00
Eren Gölge 9a0d8fa027 Update `copy_model_files()` 2021-09-30 14:47:56 +00:00
Eren Gölge 8ada870a57 Refactor `trainer.py` for v2 2021-09-30 14:16:34 +00:00
Eren Gölge 7d8f77385a Use `glow-tts` in synthesis tests 2021-09-10 17:27:33 +00:00
Eren Gölge 742f9c54da Warn user if nan in GL 2021-09-10 08:26:05 +00:00
Eren Gölge 4761853c5c Fix imports 2021-09-08 13:34:40 +00:00
Eren Gölge 2c4bbbf9b9 Use pyworld for pitch 2021-09-06 15:16:58 +00:00
Eren Gölge 98a7271ce8 Refactor FastPitchv2 2021-09-06 15:16:58 +00:00
Eren Gölge 42862f7fdb Format style of the recipes 2021-09-06 15:16:58 +00:00
Eren Gölge aacbb3ed77 Fix SpeakerManager usage in `synthesize.py` 2021-09-06 15:16:58 +00:00
Eren Gölge 5a6ffaee08 Add yin based pitch computation 2021-09-06 15:16:58 +00:00
Eren Gölge d085642ac1 Cache pitch features
Cache the features at the beginning of `BaseTTS` training.
2021-09-06 15:16:58 +00:00
Eren Gölge fba257104d Compute F0 using librosa 2021-09-06 15:16:58 +00:00
Eren Gölge d16da949a5 Merge branch 'fix_distribute' into dev 2021-08-30 16:31:07 +00:00
Eren Gölge 5255e089e6 Fix #767 2021-08-30 13:10:08 +00:00
Eren Gölge c560114324 Fix #750 2021-08-30 13:06:50 +00:00
Eren Gölge 18da8f5dbd Update pylint 2.10.2 and fix lint issues 2021-08-30 08:10:35 +00:00
Eren Gölge 2620f62ea8 Move duration_loss inside VitsGeneratorLoss 2021-08-27 07:07:07 +00:00
Eren Gölge 1692b8e4d9 Merge pull request #726 from fijipants/patch-1
Fix bug with log_func
2021-08-26 22:11:29 +02:00
Eren Gölge 49e1181ea4 Fixes for the vits model 2021-08-26 17:15:09 +00:00
fijipants e9e01b09b0 Fix bug with log_func 2021-08-18 19:59:51 -04:00
fijipants 8f57f8adfd Update synthesizer.py 2021-08-18 19:56:52 -04:00
Eren Gölge 7c0d564965 Synchronize DDP processes 2021-08-13 10:40:50 +00:00
Eren Gölge ecf5f17dca Fix distribute.py and ddp training 2021-08-12 22:22:32 +00:00
Eren Gölge 537bc8487a Print model count when listing models 2021-08-10 16:25:11 +00:00
Ayush Chaurasia f3e9d61330 Refactor logging initialization 2021-08-09 18:35:08 +00:00
Ayush Chaurasia 79b74a989d Update: add_text 2021-08-09 18:34:38 +00:00
Ayush Chaurasia 9fcf48b760 Delete logger_base.py 2021-08-09 18:34:00 +00:00
Ayush Chaurasia 290972fd35 reformat 2021-08-09 18:34:00 +00:00
Ayush Chaurasia 936a47504d Update Logger API, recipes 2021-08-09 18:34:00 +00:00
Ayush Chaurasia f63cf46c55 Unified logger API 2021-08-09 18:34:00 +00:00
Ayush Chaurasia f4434da5a3 Update disabled structure 2021-08-09 18:31:16 +00:00
Ayush Chaurasia f606741dc4 Add artifacts logging , wandb args 2021-08-09 18:31:16 +00:00
Ayush Chaurasia f5e50ad502 WandbLogger 2021-08-09 18:27:06 +00:00
Eren Gölge c312acac7d Implement VITS model 🚀
VITS model implementation built on Glow TTS and HiFiGAN
layers.
2021-08-09 18:02:36 +00:00
Eren Gölge 060e746e21 Add `do_amp_to_db` option 2021-08-09 18:02:36 +00:00
Eren Gölge e94c1f894d Simplify `console_logger` 2021-08-09 18:02:36 +00:00
Eren Gölge dd55960732 Update `synthesizer.py`
Fixes and changes for multi-speaker model init and custom symbols made
by mode.make_symbols()
2021-08-09 18:02:36 +00:00
Eren Gölge d4deb2716f Modify `get_optimizer` to accept a model argument 2021-08-09 18:02:36 +00:00
Agrin Hilmkil ced4cfdbbf Allow saving / loading checkpoints from cloud paths (#683)
* Allow saving / loading checkpoints from cloud paths

Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.

Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.

* Append suffix _fsspec to save/load function names

* Add a lower bound to the fsspec dependency

Skips the 0 major version.

* Add missing changes from refactor

* Use fsspec for remaining artifacts

* Add test case with path requiring fsspec

* Avoid writing logs to file unless output_path is local

* Document the possibility of using paths supported by fsspec

* Fix style and lint

* Add missing lint fixes

* Add type annotations to new functions

* Use Coqpit method for converting config to dict

* Fix type annotation in semi-new function

* Add return type for load_fsspec

* Fix bug where fs not always created

* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Eren Gölge a05b234080 Raise an error when multiple GPUs are in use
User must define the target GPU by `CUDA_VISIBLE_DEVICES` and
use `distribute.py` for multi-gpu training.
2021-07-04 11:25:49 +02:00
Eren Gölge 168f97cbe9 Let `Synthesizer` use the speaker manager out of the model 2021-07-02 10:47:55 +02:00
Eren Gölge 196876feb1 Fix `ModelManager` model download 2021-07-02 10:47:05 +02:00
Eren Gölge 2e1a428b83 Update glowtts docstrings and docs 2021-06-30 14:30:55 +02:00
Eren Gölge 51398cd15b Add docstrings and typing for `audio.py` 2021-06-28 17:03:47 +02:00
Eren Gölge d700845b10 Move `TorchSTFT` to `utils.audio` 2021-06-28 17:03:47 +02:00
Eren Gölge e30f245e06 Update `synthesizer` for speaker and model init 2021-06-28 17:03:47 +02:00
Eren Gölge e53616078a Fixup `utils` for the trainer 2021-06-28 17:03:47 +02:00
Eren Gölge 98298ee671 Implement unified IO utils 2021-06-28 17:03:19 +02:00
Eren Gölge c7aad884cd Implement unified trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 00c82c516d rename to 2021-06-28 17:03:19 +02:00
Eren Gölge db6a97d1a2 rename external speaker embedding arguments as `d_vectors` 2021-06-28 17:03:19 +02:00
Eren Gölge 9042ae9195 use `to_cuda()` for moving data in `format_batch()` 2021-06-28 17:03:19 +02:00
Eren Gölge 1c8a3d7c86 make style 2021-06-28 17:03:19 +02:00
Eren Gölge 8cdd423234 styling formatting.py 2021-06-28 17:03:19 +02:00
Eren Gölge 8381379938 formatting `cond_input` with a function in Tacotron models 2021-06-28 17:03:19 +02:00
Eren Gölge d96ebcd6d3 make style 2021-06-28 17:03:19 +02:00
Eren Gölge b643e8b37c `logging/__init__.py` 2021-06-28 17:03:19 +02:00
Eren Gölge 0cee5042a9 fix logger imports 2021-06-28 17:03:19 +02:00
Eren Gölge 0eec238429 remove redundant imports 2021-06-28 17:03:19 +02:00
Eren Gölge b500338faa make style 2021-06-28 17:03:19 +02:00
Eren Gölge c680a07a20 fix `Synthesized` for the new `synthesis()` 2021-06-28 17:03:19 +02:00
Eren Gölge d25f017b42 update `setup_model.py` imports 2021-06-28 17:03:19 +02:00
Eren Gölge 34f8a74e4d remove `truncated` from synthesizer 2021-06-28 17:03:19 +02:00
Eren Gölge 178eccbc16 update console logger 2021-06-28 17:03:19 +02:00
Eren Gölge a20a1c7d06 rename preprocess.py -> formatters.py 2021-06-28 17:03:19 +02:00
Eren Gölge 8def3c87af trainer-API updates 2021-06-28 17:03:19 +02:00
Michael Hansen 67869e77f9 Use gruut for phonemization 2021-06-25 14:41:05 +02:00
Eren Gölge d0ab0382fc linter fixes 2021-06-01 09:15:32 +02:00
Eren Gölge d9f1268f99 init tb_logger None for rank > 0 processes 2021-05-31 15:47:07 +02:00
Eren Gölge 8a7c40736c set use_phonemes false 2021-05-19 01:27:26 +02:00
Eren Gölge ccfaa6b1d5 add `needs_phonemizer` field to models.json. If set true these models
are only compatible with v0.0.13 or below.
2021-05-18 17:57:28 +02:00
Eren Gölge c2c7dff805 use relaxed coqpit parser 2021-05-18 14:49:47 +02:00
Eren Gölge 715b0a65a0 update main.yml for python x64
fix test
2021-05-12 00:57:29 +02:00
Eren Gölge 843d1b3d98 linter fixes 2021-05-11 11:30:00 +02:00
Eren Gölge 19fb1d743d style update 2021-05-11 11:30:00 +02:00
Eren Gölge 6e980b49c4 fix synthesizer.py for Coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge db14dcd95a remove old load_config 2021-05-11 11:29:18 +02:00
Eren Gölge a21ac883dd add get_cuda() 2021-05-11 11:29:18 +02:00
Eren Gölge 21dd4d7960 fix load_config imports for Coqpit 2021-05-11 11:29:18 +02:00
Eren Gölge 9ee70af9bb code styling 2021-05-11 11:29:18 +02:00
Eren Gölge 757e90b1cc load_config function to initialize the right Coqpit for the given model 2021-05-11 11:29:18 +02:00
Eren Gölge 35341d5482 move bash script based tests to python with coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 647163397d coqpit refactoring 2021-05-11 11:29:17 +02:00
Eren Gölge 9c18e40f64 black formatting 2021-05-11 11:29:17 +02:00
Eren Gölge 79d7215142 config refactor #5 WIP 2021-05-11 11:29:17 +02:00
Eren Gölge dc50f5f0b0 config refactor #4 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge 97bd5f9734 [ci skip] config update #3 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge e092ae40dc config update WIP 2021-05-11 11:28:35 +02:00
Eren Gölge 06f80a4806 update check argument 2021-05-11 11:28:35 +02:00
Eren Gölge 8cb27267a4 formatting 2021-05-03 14:26:35 +02:00
Eren Gölge 87d674a038 bump up librosa version to 0.8.0 2021-05-03 14:25:09 +02:00
Eren Gölge 4719414f2e remove imports 2021-04-27 11:25:17 +02:00
Eren Gölge add97cddc1 move function and remove import 2021-04-27 11:22:56 +02:00
Eren Gölge 734e6a515c bug fix 2021-04-27 10:27:45 +02:00
Eren Gölge 2f0716073e enable multi-speaker CoquiTTS models for synthesize.py 2021-04-26 19:36:53 +02:00
Eren Gölge f37b488876 Merge branch 'speaker-manager' of https://github.com/coqui-ai/TTS into speaker-manager 2021-04-26 15:25:25 +02:00
Eren Gölge b82daa5e86 style and linter fixes 2021-04-26 15:22:24 +02:00
Eren Gölge 4cf211348d styling and linting 2021-04-23 18:04:37 +02:00
Eren Gölge 7eb0c60d2e let synthesizer pass speaker encoder file paths to speaker manager 2021-04-23 18:04:37 +02:00
Eren Gölge f9f3d04d14 remove moved function 2021-04-23 18:04:37 +02:00
Eren Gölge 6d0f5e0459 use SpeakerManager in Synthesizer 2021-04-23 18:04:37 +02:00
Eren Gölge 3ace2440fa fix a mistake from rebase 2021-04-23 18:04:37 +02:00
Eren Gölge aadb2106ec code styling 2021-04-23 18:04:37 +02:00
Eren Gölge af7baa3387 refactoring to allow defining the speaker file externally 2021-04-23 18:04:37 +02:00
kirianguiller 7dccbfdcd5 handle multi speaker and gst in Synthesizer class 2021-04-23 18:04:37 +02:00
WeberJulian 4205284f92 Change name of the functions 2021-04-23 10:09:55 +02:00
WeberJulian a26498181b Change back the default value 2021-04-22 16:10:17 +02:00
Julian Weber 355e1f47ab fix dumb mistake 2021-04-22 15:50:29 +02:00
Julian Weber c125b71f36 fix windows support 2021-04-22 15:14:24 +02:00
Eren Gölge e1d960da9e use SpeakerManager in Synthesizer 2021-04-21 13:13:27 +02:00
Eren Gölge 1038fd420d fix a mistake from rebase 2021-04-16 19:39:47 +02:00
Eren Gölge 47e356cb48 code styling 2021-04-16 16:01:40 +02:00
Eren Gölge 25328aad00 refactoring to allow defining the speaker file externally 2021-04-16 15:59:57 +02:00
kirianguiller 48ae52a9a3 handle multi speaker and gst in Synthesizer class 2021-04-16 15:54:49 +02:00
Eren Gölge 7cada1a949 remove noise 2021-04-15 15:30:45 +02:00
Eren Gölge a7f6045644 Merge branch 'reformat' into hifigan-reformat 2021-04-12 12:00:17 +02:00
Eren Gölge f519012dea reformatting and styling 2021-04-12 11:47:39 +02:00
Eren Gölge 18d9ec8036 format with black 2021-04-09 00:54:59 +02:00
Eren Gölge e5b9607bc3 isort all imports 2021-04-09 00:45:20 +02:00
Eren Gölge 0e79fa86ad format with black and pylint 2.7.3 2021-04-09 00:38:08 +02:00
Eren Gölge 6ee211c137 remove stft params causing warning 2021-04-08 11:28:30 +02:00
Eren Gölge 7726dfca99 change the upper bound in sound normalization 2021-04-08 11:26:01 +02:00
Eren Gölge e0e3b12b26 pass all parameters explicitly to _istft 2021-04-08 11:23:20 +02:00
Eren Gölge d57f416957 small fixes 2021-04-08 11:22:30 +02:00
Eren Gölge f890454de3 linter fixes 2021-04-07 12:36:03 +02:00
Eren Gölge 9782d9ea5d [ci skip] implement #418 2021-04-06 16:24:50 +02:00