Commit Graph

1550 Commits

Author SHA1 Message Date
Edresson Casanova 570edb7e93 Add compute encoder accuracy in a function 2022-03-11 09:39:22 -03:00
Edresson Casanova b0bad56ba9 Add Perfect Batch Sampler unit test 2022-03-10 17:02:37 -03:00
Edresson Casanova 9c8b8201c3 Fix the unitests 2022-03-10 16:22:33 -03:00
Edresson Casanova 50305215b3 Add BaseEncoder Class 2022-03-10 15:10:26 -03:00
Edresson Casanova a9208e9edd Use get_optimizer in Encoder 2022-03-10 13:58:17 -03:00
Edresson Casanova 247da8ef12 Unit tests fixs 2022-03-10 11:50:18 -03:00
Edresson Casanova 3a7feadba4 Remove useless config parameter 2022-03-10 11:50:18 -03:00
Edresson Casanova 711a46506f Fix lint checks 2022-03-10 11:50:18 -03:00
Edresson Casanova 33fd07a209 Add evaluation during encoder training 2022-03-10 11:50:18 -03:00
Edresson Casanova 0e372e0b9b Add Perfect Sampler and remove storage 2022-03-10 11:50:18 -03:00
Edresson Casanova 8ba3385747 Add argumnet to disable storage 2022-03-10 11:50:18 -03:00
Edresson Casanova 984b6d9fd1 Enable Weight decay for encoder training 2022-03-10 11:50:18 -03:00
Edresson Casanova 1c1684bdc5 Fix the bug in plot_embeddings 2022-03-10 11:50:18 -03:00
Edresson Casanova 0a06d1e67b Add evaluation encoder script 2022-03-10 11:50:18 -03:00
Edresson Casanova f811af7651 Add Base encoder config 2022-03-10 11:50:18 -03:00
Edresson Casanova 33ac13e44e Add class map in emotion config 2022-03-10 11:50:18 -03:00
Edresson Casanova 854c887764 Transform the Speaker Encoder dataset to a generic dataset and create emotion encoder config 2022-03-10 11:50:18 -03:00
Edresson Casanova 1c6d16cffc Add a generic emotion dataset formatter 2022-03-10 11:50:18 -03:00
Edresson Casanova 71a1907f4c Rename Speaker encoder module to encoder 2022-03-10 11:50:18 -03:00
Edresson Casanova dbe9da7f15 Add Voice conversion inference support (#1337) 2022-03-10 14:57:12 +01:00
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
Edresson Casanova 917f417ac4 Add alphas to control language and speaker balancer (#1216) 2022-03-10 14:56:09 +01:00
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the Samplers weights to float for save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
Edresson Casanova f381e29b91 REBASED: Add support for the speaker encoder training using torch spectrograms (#1348) 2022-03-10 14:54:51 +01:00
* Add support for the speaker encoder training using torch spectrograms
* Remove useless function in speaker encoder dataset class
Eren Gölge c670365507 Fix VCTK recipe and formatter 2022-03-08 14:20:34 +01:00
Eren Gölge 8feb41d361 Bump up to v0.6.1 2022-03-07 15:57:44 +01:00
Eren Gölge ee02bc3823 Bump up to v0.6.0 2022-03-07 12:08:22 +01:00
Eren Gölge dc280819be Add new models 2022-03-07 12:08:09 +01:00
Eren Gölge e9d9028b4d Revert cleaner name 2022-03-06 12:57:06 +01:00
Eren Gölge 764c7fa4a4 Rename phoneme_cleaners 2022-03-06 12:09:54 +01:00
Eren Gölge dd4287de1f Update models 2022-03-03 20:23:00 +01:00
Eren Gölge 6cb00be795 Update your_tts model URL 2022-03-02 18:04:49 +01:00
Eren Gölge 1425a023fe Make style and lint 2022-03-02 13:25:35 +01:00
Eren Gölge c68885b3fd Update Vits speaker encoder init 2022-03-02 13:20:23 +01:00
Eren Gölge 27b67b7945 Fix import 2022-03-02 09:15:20 +01:00
Eren Gölge 942df0fb05 Update vits dataset 2022-03-02 09:14:32 +01:00
Eren Gölge 6a9f8074f0 Fix TTSDataset 2022-03-01 07:57:48 +01:00
Eren Gölge 690de1ab06 Update Characters and add more tests 2022-02-25 11:32:44 +01:00
Eren Gölge 9063397892 Fix FastSpeech config 2022-02-25 11:31:56 +01:00
Eren Gölge 1e414b3a09 Make stlye 2022-02-25 11:31:56 +01:00
Eren Gölge acc83cd3e6 Update Vits model API 2022-02-25 11:31:56 +01:00
Eren Gölge fe656659be Implement BaseTTS 2022-02-25 11:31:56 +01:00
Eren Gölge bed4afd4ee Implement BaseVocabulary 2022-02-25 11:31:56 +01:00
Eren Gölge e0f9be76c0 Update test_run in wavernn and wavegrad 2022-02-25 11:31:56 +01:00
Eren Gölge bf540f4323 Update imports for trainer 2022-02-25 11:31:56 +01:00
Eren Gölge 4c43eda414 Update BaseTrainerModel 2022-02-25 11:31:56 +01:00
Eren Gölge 83c5ddc5b7 Update imports 2022-02-25 11:31:56 +01:00
Eren Gölge 14c117978d Fix return outputs 2022-02-25 11:31:56 +01:00
Eren Gölge 424d04e4f6 Make stlye 2022-02-25 11:31:56 +01:00
Eren Gölge 8b3ba02c95 Add vocab_dict to model config 2022-02-25 11:31:20 +01:00
Eren Gölge ff23dce081 Update TTSDataset 2022-02-25 11:31:20 +01:00
Eren Gölge 750903d2ba Add VCTK formatter docstring 2022-02-25 11:30:24 +01:00