Commit Graph

116 Commits

Author SHA1 Message Date
Edresson Casanova dcd0d1f6a1 Clean up old code 2022-05-16 13:09:12 +00:00
Edresson Casanova 3a524b0597 Add prosody encoder params on config 2022-05-16 09:45:28 -03:00
Edresson Casanova 093bd07528 Add reversal classifier loss 2022-04-18 21:09:59 -03:00
Edresson Casanova 8a3396d9c1 Add prosody encoder training support 2022-04-18 17:01:44 -03:00
Edresson Casanova f31ba25233 Add emotion embedding in the encoder 2022-03-31 19:14:41 -03:00
Edresson Casanova 314f95f974 Add formatter for the Emotional Speech Dataset 2022-03-31 17:27:30 +00:00
Edresson Casanova 7be9056b3d Remove useless encoder weights reload 2022-03-31 11:05:58 -03:00
Edresson Casanova 047cebd7b8 Fix Style tests 2022-03-30 16:51:39 -03:00
Edresson Casanova aebbdfc62b Merge branch 'dev-managers' into dev-emotion 2022-03-30 16:25:47 -03:00
Edresson Casanova 397b3e9baf Fix style tests 2022-03-23 15:31:33 -03:00
Edresson Casanova c7af7c6474 Implement LanguageManager inherit BaseIDManager 2022-03-23 15:26:59 -03:00
Edresson Casanova 40df2cfdd1 Change the speaker manager to a generic manager 2022-03-23 15:26:06 -03:00
Edresson Casanova 10dee54ac3 Bug fix in single speaker emotion embedding training 2022-03-16 20:57:14 +00:00
Eren Gölge 0870a4faa2 Make style (#1405) 2022-03-16 12:13:55 +01:00
Edresson Casanova 4f03784b1f Add emotion external embeddings training unit test 2022-03-15 13:09:58 +00:00
Edresson Casanova 5090034fd1 Add emotion consistency loss 2022-03-15 12:35:00 +00:00
Edresson Casanova e3520e9e9f Add Emotion Support for the VITS model 2022-03-15 01:16:48 +00:00
Edresson Casanova e33819b7de Implement LanguageManager inherit BaseIDManager 2022-03-11 19:25:18 -03:00
Edresson Casanova 12e0b6f39e Change the speaker manager to a generic manager 2022-03-11 17:09:58 -03:00
Edresson Casanova dbe9da7f15 Add Voice conversion inference support (#1337) 2022-03-10 14:57:12 +01:00
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
Edresson Casanova 917f417ac4 Add alphas to control language and speaker balancer (#1216) 2022-03-10 14:56:09 +01:00
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the Samplers weights to float for save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
Eren Gölge dd4287de1f Update models 2022-03-03 20:23:00 +01:00
Eren Gölge 1425a023fe Make style and lint 2022-03-02 13:25:35 +01:00
Eren Gölge c68885b3fd Update Vits speaker encoder init 2022-03-02 13:20:23 +01:00
Eren Gölge 942df0fb05 Update vits dataset 2022-03-02 09:14:32 +01:00
Eren Gölge 1e414b3a09 Make stlye 2022-02-25 11:31:56 +01:00
Eren Gölge acc83cd3e6 Update Vits model API 2022-02-25 11:31:56 +01:00
Eren Gölge 83c5ddc5b7 Update imports 2022-02-25 11:31:56 +01:00
Eren Gölge 14c117978d Fix return outputs 2022-02-25 11:31:56 +01:00
Eren Gölge 424d04e4f6 Make stlye 2022-02-25 11:31:56 +01:00
Eren Gölge 00c7600103 Update Vits model API 2022-02-25 11:30:24 +01:00
Eren Gölge 4b96bfe925 Fix train logging 2022-02-25 11:26:59 +01:00
Eren Gölge ab8a4ca2c3 Revert random segment 2022-02-25 11:26:59 +01:00
Eren Gölge 8622226f3f Make style 2022-02-25 11:26:59 +01:00
Eren Gölge 54c6bb2a8c Fix add speaker VITS 2022-02-25 11:26:59 +01:00
Eren Gölge f70e4bb8c6 Add new speakers to the vits model 2022-02-25 11:26:59 +01:00
Eren Gölge 1f0c8179da Make style 2022-02-25 11:26:59 +01:00
Eren Gölge 2829027d8b Refactor VITS model 2022-02-25 11:26:59 +01:00
Eren Gölge 146fbfd7c9 Extend unittests 2022-02-25 11:25:00 +01:00
Eren Gölge 2fe16de8e3 Make lint 2022-02-25 11:25:00 +01:00
Eren Gölge 001da8afc8 Update Vits for the new model API 2022-02-25 11:21:19 +01:00
Eren Gölge ea965a5683 Update VITS for the new API 2022-02-25 11:11:35 +01:00
Eren Gölge 93957d58a1 Refactorin VITS for the tokenizer API 2022-02-25 11:05:06 +01:00
Eren Gölge 7575367b9f Refactorin VITS for the tokenizer API 2022-02-25 10:57:35 +01:00
Eren Gölge 127118c637 Update TTS.tts formatters (#1228) 2022-02-11 23:03:43 +01:00
* Return Dict from tts formatters
* Make style
WeberJulian e778bad626 Add argument to enable dp speaker conditioning 2022-01-06 15:07:27 +01:00
WeberJulian e1accb6e28 Fix train_tts.py and uncomment code (#1051) 2022-01-03 17:44:57 +01:00
* Fix SE loading and language embedding logic
* remove trailing white space
* Uncomment resmapling code for SCL
Eren Gölge 36cef5966b Fix resnet speaker encoder 2021-12-30 15:36:35 +00:00
Eren Gölge 348b5c96a2 Fix speaker encoder test 2021-12-30 15:36:35 +00:00
Eren Gölge 7129b04d46 Update VITS model 2021-12-30 14:08:17 +00:00