Commit Graph

60 Commits

Author SHA1 Message Date
Edresson Casanova 8d228ab22a
Trick to Upsampling to High sampling rates using VITS model (#1456)
* Add upsample VITS support

* Fix the bug in inference

* Fix lint checks

* Add RMS based norm in save_wav method

* Style fix

* Add the period for VITS multi-period discriminator in model_args

* Bug fix in speaker encoder load in inference time

* Add unit tests

* Remove useless detach_z_vocoder parameter

* Add docs for VITS upsampling

* Fix the docs

* Rename TTS_part_sample_rate to encoder_sample_rate

* Add upsampling_init and upsampling_z methods

* Add asserts for encoder_sample_rate part

* Move upsampling tests to test_vits.py
2022-04-26 11:47:46 +02:00
Eren Gölge 1c3623af33
Fix model manager (#1436)
* Fix manager

* Make style
2022-03-23 12:57:14 +01:00
Eren Gölge 72d85e53c9
Update model file extension (#1422)
* Update model file ext to ```.pth```

* Update docs

* Rename more

* Find model files
2022-03-22 17:55:00 +01:00
Eren Gölge cd5d1497cf Add pitch_fmin pitch_fmax args to the audio 2022-02-25 11:26:59 +01:00
Eren Gölge 50e17097a7 Add verbose option to AudioProcessor 2022-02-25 11:24:13 +01:00
Eren Gölge c9972e6f14 Make lint 2022-02-25 11:07:34 +01:00
Eren Gölge 9397a56b13 Allow init_from_config from model or audio config 2022-02-25 10:48:03 +01:00
Eren Gölge 53f696615b Add init_from_config to AudioProcessor 2022-02-25 09:32:54 +01:00
Eren Gölge 127118c637
Update TTS.tts formatters (#1228)
* Return Dict from tts formatters

* Make style
2022-02-11 23:03:43 +01:00
Eren Gölge 633dcc9c56 Implement RMS volume normalization 2021-12-22 15:51:14 +00:00
Eren Gölge 704dddcffa Make style 2021-12-20 11:54:10 +00:00
Edresson 818dc4ccd8 Add Docstring for TorchSTFT 2021-12-20 11:54:10 +00:00
Edresson d39200e69b Remove torchaudio requeriment 2021-12-20 11:54:10 +00:00
George 37eaefc085
Optional silence trimming during inference and find_endpoint() fix (#898)
* Set find_endpoint db threshold in config.json

* Optional silence trimming during inference

* Make trim_db value negative
2021-10-29 18:28:55 +02:00
Eren Gölge 92b6d98443 Set pitch frame alignment wrt spec computation 2021-10-20 18:12:38 +00:00
Eren Gölge c514351c0e Refactor multi-speaker init in BaseTTS-Tacotron1-2 2021-10-18 08:55:45 +00:00
Eren Gölge 7d8f77385a Use `glow-tts` in synthesis tests 2021-09-10 17:27:33 +00:00
Eren Gölge 742f9c54da Warn user if nan in GL 2021-09-10 08:26:05 +00:00
Eren Gölge 4761853c5c Fix imports 2021-09-08 13:34:40 +00:00
Eren Gölge 2c4bbbf9b9 Use pyworld for pitch 2021-09-06 15:16:58 +00:00
Eren Gölge 98a7271ce8 Refactor FastPitchv2 2021-09-06 15:16:58 +00:00
Eren Gölge 42862f7fdb Format style of the recipes 2021-09-06 15:16:58 +00:00
Eren Gölge d085642ac1 Cache pitch features
Cache the features at the beginning of `BaseTTS` training.
2021-09-06 15:16:58 +00:00
Eren Gölge fba257104d Compute F0 using librosa 2021-09-06 15:16:58 +00:00
fijipants e9e01b09b0 Fix bug with log_func 2021-08-18 19:59:51 -04:00
Eren Gölge c312acac7d Implement VITS model 🚀
VITS model implementation built on Glow TTS and HiFiGAN
layers.
2021-08-09 18:02:36 +00:00
Eren Gölge 060e746e21 Add `do_amp_to_db` option 2021-08-09 18:02:36 +00:00
Eren Gölge 2e1a428b83 Update glowtts docstrings and docs 2021-06-30 14:30:55 +02:00
Eren Gölge 51398cd15b Add docstrings and typing for `audio.py` 2021-06-28 17:03:47 +02:00
Eren Gölge d700845b10 Move `TorchSTFT` to `utils.audio` 2021-06-28 17:03:47 +02:00
Eren Gölge 647163397d coqpit refactoring 2021-05-11 11:29:17 +02:00
Eren Gölge 8cb27267a4 formatting 2021-05-03 14:26:35 +02:00
Eren Gölge 87d674a038 bumpup librosa version to 0.8.0 2021-05-03 14:25:09 +02:00
Eren Gölge b82daa5e86 style and linter fixes 2021-04-26 15:22:24 +02:00
WeberJulian 4205284f92
Change name of the functions 2021-04-23 10:09:55 +02:00
WeberJulian a26498181b Change back the default value 2021-04-22 16:10:17 +02:00
Julian Weber 355e1f47ab fix dumb mistake 2021-04-22 15:50:29 +02:00
Julian Weber c125b71f36 fix windows support 2021-04-22 15:14:24 +02:00
Eren Gölge 7cada1a949 remove noise 2021-04-15 15:30:45 +02:00
Eren Gölge a7f6045644 Merge branch 'reformat' into hifigan-reformat 2021-04-12 12:00:17 +02:00
Eren Gölge f519012dea reformatting and styling 2021-04-12 11:47:39 +02:00
Eren Gölge e5b9607bc3 isort all imports 2021-04-09 00:45:20 +02:00
Eren Gölge 0e79fa86ad format with black and pylint 2.7.3 2021-04-09 00:38:08 +02:00
Eren Gölge 6ee211c137 remove stft params causing warning 2021-04-08 11:28:30 +02:00
Eren Gölge 7726dfca99 change the upper bound in sound normalization 2021-04-08 11:26:01 +02:00
Eren Gölge e0e3b12b26 pass all parameters explicity to _istft 2021-04-08 11:23:20 +02:00
Eren Gölge d57f416957 small fixes 2021-04-08 11:22:30 +02:00
Eren Gölge 2048095e9a audio.py fix 2021-04-06 16:24:50 +02:00
Eren Gölge e0b3008c31 allow choosing the log function used for amptodb conversion 2021-04-06 16:24:50 +02:00
Eren Gölge 4111df6769 Docstrings for audioprocessor 2021-03-08 02:54:47 +01:00