Commit Graph

722 Commits

Author SHA1 Message Date
Eren Gölge 05d9543ed8 init GST module using gst config in Tacotron models 2021-05-11 11:29:17 +02:00
Eren Gölge 93a00373f6 move split_dataset 2021-05-11 11:29:17 +02:00
Eren Gölge 9c18e40f64 black formatting 2021-05-11 11:29:17 +02:00
Eren Gölge c34c8137d7 update compute_statistics for coqpit 2021-05-11 11:29:17 +02:00
Eren Gölge 79d7215142 config refactor #5 WIP 2021-05-11 11:29:17 +02:00
Eren Gölge dc50f5f0b0 config refactor #4 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge 97bd5f9734 [ci skip] config update #3 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge a21c0b5585 config update 2 WIP 2021-05-11 11:28:35 +02:00
Eren Gölge e092ae40dc config update WIP 2021-05-11 11:28:35 +02:00
Eren Gölge 06f80a4806 update check argument 2021-05-11 11:28:35 +02:00
Eren Gölge bf7ddfa542 Merge pull request #481 from chmodsss/main: Accessing __version__ command 2021-05-11 10:20:48 +02:00
chmodsss 607d5cf377 [#480] Adding version variable 2021-05-10 19:46:34 +02:00
Adam Froghyar 7ddc885f37 deleted a line that broke GravesAttention 2021-05-10 15:42:59 +02:00
Eren Gölge f7582107da Merge pull request #453 from Edresson/dev: Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS. 2021-05-06 17:53:28 +02:00
Edresson 501c8e0302 remove unused vars on extract tts spectrograms script 2021-05-04 19:04:13 -03:00
Eren Gölge 0325c58862 Merge pull request #468 from shaun95/patch-1: Update losses.py 2021-05-03 14:45:24 +02:00
Eren Gölge 8cb27267a4 formatting 2021-05-03 14:26:35 +02:00
Eren Gölge 87d674a038 bumpup librosa version to 0.8.0 2021-05-03 14:25:09 +02:00
shaun 7d0ec62bf1 Update losses.py: The block of code for use_l1_spec_loss is repeated, which doubles the amount of L1 loss when enabled. The weight for L1 loss (l1_spec_loss_weight) in the hifigan_ljspeech configuration will likely need to be doubled to compensate. 2021-05-02 14:14:24 +02:00
Edresson 3ecd556bbe add unit test for extract tts spectrograms script 2021-05-01 13:41:56 -03:00
Edresson 446b1da936 create inference function 2021-04-29 18:18:37 -03:00
Eren Gölge f02f0338c2 fix .models.json and add testing to check released models availability 2021-04-29 09:32:36 +02:00
Eren Gölge fd95e9b8a4 [ci skip] Add sam models 2021-04-28 21:57:31 +02:00
Agrin Hilmkil 351d0ed6ae Remove unnecessary fsspec usage 2021-04-28 11:21:08 +02:00
Agrin Hilmkil 167f86417e Move dev, tf, notebook dependencies to extras 2021-04-28 11:20:06 +02:00
Eren Gölge 1235e54738 test for synthesize.py 2021-04-27 14:17:38 +02:00
Eren Gölge 4719414f2e remove imports 2021-04-27 11:25:17 +02:00
Eren Gölge add97cddc1 move function and remove import 2021-04-27 11:22:56 +02:00
Eren Gölge 734e6a515c bug fix 2021-04-27 10:27:45 +02:00
Eren Gölge 6bdd81667e place holders for sc-glow and hifigan models 2021-04-26 19:53:12 +02:00
Eren Gölge 2f0716073e enable multi-speaker CoquiTTS models for synthesize.py 2021-04-26 19:36:53 +02:00
Eren Gölge b531fa699c remove conflict noise 2021-04-26 15:27:52 +02:00
Eren Gölge f37b488876 Merge branch 'speaker-manager' of https://github.com/coqui-ai/TTS into speaker-manager 2021-04-26 15:25:25 +02:00
Eren Gölge b82daa5e86 style and linter fixes 2021-04-26 15:22:24 +02:00
Edresson 20e42a3381 add save audio option 2021-04-23 15:00:00 -03:00
Edresson 8228091f92 add script for extraction of tts spectrograms 2021-04-23 14:17:46 -03:00
Eren Gölge 4cf211348d styling and linting 2021-04-23 18:04:37 +02:00
Eren Gölge 7eb0c60d2e let synthesizer pass speaker encoder file paths to speaker manager 2021-04-23 18:04:37 +02:00
Eren Gölge f69195739e let speaker manager compute mean x_vector from multiple wav files 2021-04-23 18:04:37 +02:00
Eren Gölge 179722e3a7 new arguments to synthesize.py for loading speaker encoder and speaker wavs 2021-04-23 18:04:37 +02:00
Eren Gölge dfa415a8b8 small refactor in server.py 2021-04-23 18:04:37 +02:00
Eren Gölge c80d21f311 load speaker_encoder_ap and compute x_vector directly from the input file in speaker manager 2021-04-23 18:04:37 +02:00
Eren Gölge ad047c8195 html formatting, enable multi-speaker model on the server with a dropdown menu to select the speaker 2021-04-23 18:04:37 +02:00
Eren Gölge f9f3d04d14 remove moved function 2021-04-23 18:04:37 +02:00
Eren Gölge 10c988ac8c update server.py 2021-04-23 18:04:37 +02:00
Eren Gölge 6d0f5e0459 use SpeakerManager in Synthesizer 2021-04-23 18:04:37 +02:00
Eren Gölge e97126314c add ```unique``` argument to make_symbols to fix the incompat. issue of the SC-Glow models 2021-04-23 18:04:37 +02:00
Eren Gölge d08888e603 formatting speakers.py 2021-04-23 18:04:37 +02:00
Eren Gölge df422223a3 initial SpeakerManager implementation 2021-04-23 18:04:37 +02:00
Eren Gölge 7a7aeb35f5 fix the glow-tts in setup_model 2021-04-23 18:04:37 +02:00
Eren Gölge d42748082a update argument name external_speaker_embedding_dim -> speaker_embedding_dim; add inference_noise_scale argument to glow-tts 2021-04-23 18:04:37 +02:00
Eren Gölge 2da81f5bb6 add load_checkpoint to speaker encoder 2021-04-23 18:04:37 +02:00
Eren Gölge 1229ccbf07 update argument name in server.py 2021-04-23 18:04:37 +02:00
Eren Gölge af2d36faeb update synthesize.py for multi-speaker setting 2021-04-23 18:04:37 +02:00
Eren Gölge 99dc07a7dd add ```unique``` param to keep scglow models compatible (there are duplicate symbols in the character set) 2021-04-23 18:04:37 +02:00
Eren Gölge c955a12428 set the default layer size compatible with scglow 2021-04-23 18:04:37 +02:00
Eren Gölge 3ace2440fa fix a mistake from rebase 2021-04-23 18:04:37 +02:00
Eren Gölge aadb2106ec code styling 2021-04-23 18:04:37 +02:00
Eren Gölge af7baa3387 refactoring to allow defining the speaker file externally 2021-04-23 18:04:37 +02:00
kirianguiller 7dccbfdcd5 handle multi speaker and gst in Synthesizer class 2021-04-23 18:04:37 +02:00
Edresson d2b6326b8b change optimizer initialization for compatibility with Hifi-GAN official implementation 2021-04-23 07:54:39 -03:00
WeberJulian 4205284f92 Change name of the functions 2021-04-23 10:09:55 +02:00
WeberJulian a26498181b Change back the default value 2021-04-22 16:10:17 +02:00
Julian Weber 355e1f47ab fix dumb mistake 2021-04-22 15:50:29 +02:00
Julian Weber c125b71f36 fix windows support 2021-04-22 15:14:24 +02:00
Jörg Thalheim f5fd7f78d4 server: also listen to ipv6. The [::] address will listen to both ipv4/ipv6 addresses. 2021-04-22 12:38:55 +02:00
Eren Gölge ef37633cb3 [ci skip] use prenet_dropout by default with Tacotron models 2021-04-22 12:38:55 +02:00
Eren Gölge e1d960da9e use SpeakerManager in Synthesizer 2021-04-21 13:13:27 +02:00
Eren Gölge 04b6881b66 add ```unique``` argument to make_symbols to fix the incompat. issue of the SC-Glow models 2021-04-21 13:12:35 +02:00
Eren Gölge 790946faec formatting speakers.py 2021-04-21 13:12:11 +02:00
Eren Gölge ab313814de initial SpeakerManager implementation 2021-04-21 13:11:46 +02:00
Eren Gölge 09890c7421 fix the glow-tts in setup_model 2021-04-21 13:10:40 +02:00
Eren Gölge 8764d02eb2 update argument name external_speaker_embedding_dim -> speaker_embedding_dim; add inference_noise_scale argument to glow-tts 2021-04-21 13:09:44 +02:00
Eren Gölge 8b40720977 add load_checkpoint to speaker encoder 2021-04-21 13:09:04 +02:00
Eren Gölge 37cad38c27 update argument name in server.py 2021-04-21 13:08:45 +02:00
Eren Gölge 9bccee9da8 update synthesize.py for multi-speaker setting 2021-04-21 13:08:25 +02:00
Eren Gölge d2fa8add1f add ```unique``` param to keep scglow models compatible (there are duplicate symbols in the character set) 2021-04-16 19:40:13 +02:00
Eren Gölge d9612a4351 set the default layer size compatible with scglow 2021-04-16 19:40:13 +02:00
Eren Gölge 1038fd420d fix a mistake from rebase 2021-04-16 19:39:47 +02:00
Eren Gölge 47e356cb48 code styling 2021-04-16 16:01:40 +02:00
Eren Gölge 25328aad00 refactoring to allow defining the speaker file externally 2021-04-16 15:59:57 +02:00
kirianguiller 48ae52a9a3 handle multi speaker and gst in Synthesizer class 2021-04-16 15:54:49 +02:00
Eren Gölge a53958ae3a fix urls for the new models 2021-04-15 17:05:00 +02:00
Eren Gölge 9cc17be53a formatting and a small bug fix in Tacotron model 2021-04-15 16:36:51 +02:00
Eren Gölge 1ad838bc83 add newly released models under .model.json 2021-04-15 16:06:10 +02:00
Eren Gölge 7cada1a949 remove noise 2021-04-15 15:30:45 +02:00
Eren Gölge d60a8d7211 show the real waveform on TB too for GAN vocoder training. 2021-04-15 15:30:06 +02:00
Eren Gölge 5fbe926429 change the default TTS model to TacotronDDC 2021-04-15 15:29:44 +02:00
Eren Gölge 3de5a89154 optionally enable prenet dropout at inference time for tacotron models 2021-04-13 13:24:56 +02:00
Eren Gölge 28a2fed8a3 update hifigan in .model.json 2021-04-12 16:48:05 +02:00
Eren Gölge abaf36861a aligntts model .model.json placeholder 2021-04-12 16:43:52 +02:00
Eren Gölge 480e2f7888 docstring update and better handling make_symbols 2021-04-12 16:40:49 +02:00
Eren Gölge b735076bb4 linter fixes 2021-04-12 13:14:11 +02:00
Eren Gölge b11d1cb845 small fixes 2021-04-12 12:40:55 +02:00
Eren Gölge a7f6045644 Merge branch 'reformat' into hifigan-reformat 2021-04-12 12:00:17 +02:00
Eren Gölge f519012dea reformatting and styling 2021-04-12 11:47:39 +02:00
Eren Gölge 9011dddf77 tacotron DDC placeholder in models.json 2021-04-12 04:06:27 +02:00
Eren Gölge d295d5de97 remove torch.no_grad from TorchSTFT 2021-04-10 19:43:57 +02:00
Eren Gölge 5b70da2e3f restore schedulers only if training is continuing a previous training; inherit nn.Module for TorchSTFT 2021-04-09 19:31:28 +02:00
Eren Gölge 2c71c6d8cd [ci skip] update gan vocoder configs to reflect the recent changes 2021-04-09 17:15:32 +02:00