Commit Graph

298 Commits

Author SHA1 Message Date
Edresson Casanova dcd0d1f6a1 Clean up old code 2022-05-16 13:09:12 +00:00
Edresson Casanova 3a524b0597 Add prosody encoder params on config 2022-05-16 09:45:28 -03:00
Edresson Casanova 5271846d9c Add Speech style balancer 2022-04-19 15:51:15 -03:00
Edresson Casanova 093bd07528 Add reversal classifier loss 2022-04-18 21:09:59 -03:00
Edresson Casanova 8a3396d9c1 Add prosody encoder training support 2022-04-18 17:01:44 -03:00
Edresson Casanova f31ba25233 Add emotion embedding in the encoder 2022-03-31 19:14:41 -03:00
Edresson Casanova 314f95f974 Add formatter for the Emotional Speech Dataset 2022-03-31 17:27:30 +00:00
Edresson Casanova 7be9056b3d Remove useless encoder weights reload 2022-03-31 11:05:58 -03:00
Edresson Casanova 047cebd7b8 Fix Style tests 2022-03-30 16:51:39 -03:00
Edresson Casanova aebbdfc62b Merge branch 'dev-managers' into dev-emotion 2022-03-30 16:25:47 -03:00
Edresson Casanova 397b3e9baf Fix style tests 2022-03-23 15:31:33 -03:00
Edresson Casanova c7af7c6474 Implement LanguageManager inherit BaseIDManager 2022-03-23 15:26:59 -03:00
Edresson Casanova 40df2cfdd1 Change the speaker manager to a generic manager 2022-03-23 15:26:06 -03:00
Edresson Casanova 10dee54ac3 Bug fix in single speaker emotion embedding training 2022-03-16 20:57:14 +00:00
Eren Gölge 0870a4faa2 Make style (#1405) 2022-03-16 12:13:55 +01:00
Edresson Casanova 4f03784b1f Add emotion external embeddings training unit test 2022-03-15 13:09:58 +00:00
Edresson Casanova 5090034fd1 Add emotion consistency loss 2022-03-15 12:35:00 +00:00
Edresson Casanova e3520e9e9f Add Emotion Support for the VITS model 2022-03-15 01:16:48 +00:00
Edresson Casanova e33819b7de Implement LanguageManager inherit BaseIDManager 2022-03-11 19:25:18 -03:00
Edresson Casanova 12e0b6f39e Change the speaker manager to a generic manager 2022-03-11 17:09:58 -03:00
Edresson Casanova dbe9da7f15 Add Voice conversion inference support (#1337) 2022-03-10 14:57:12 +01:00
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
Edresson Casanova 917f417ac4 Add alphas to control language and speaker balancer (#1216) 2022-03-10 14:56:09 +01:00
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the Samplers weights to float for save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
Eren Gölge dd4287de1f Update models 2022-03-03 20:23:00 +01:00
Eren Gölge 1425a023fe Make style and lint 2022-03-02 13:25:35 +01:00
Eren Gölge c68885b3fd Update Vits speaker encoder init 2022-03-02 13:20:23 +01:00
Eren Gölge 27b67b7945 Fix import 2022-03-02 09:15:20 +01:00
Eren Gölge 942df0fb05 Update vits dataset 2022-03-02 09:14:32 +01:00
Eren Gölge 1e414b3a09 Make stlye 2022-02-25 11:31:56 +01:00
Eren Gölge acc83cd3e6 Update Vits model API 2022-02-25 11:31:56 +01:00
Eren Gölge fe656659be Implement BaseTTS 2022-02-25 11:31:56 +01:00
Eren Gölge 83c5ddc5b7 Update imports 2022-02-25 11:31:56 +01:00
Eren Gölge 14c117978d Fix return outputs 2022-02-25 11:31:56 +01:00
Eren Gölge 424d04e4f6 Make stlye 2022-02-25 11:31:56 +01:00
Eren Gölge c68962c574 Update forward tts binary loss 2022-02-25 11:30:24 +01:00
Eren Gölge 00c7600103 Update Vits model API 2022-02-25 11:30:24 +01:00
Eren Gölge 35fc7270ff Implement BaseTTS 2022-02-25 11:28:47 +01:00
Eren Gölge 1e219fef0a Revert drop_last 2022-02-25 11:26:59 +01:00
Eren Gölge 4b96bfe925 Fix train logging 2022-02-25 11:26:59 +01:00
Eren Gölge ab8a4ca2c3 Revert random segment 2022-02-25 11:26:59 +01:00
Eren Gölge 8622226f3f Make style 2022-02-25 11:26:59 +01:00
Eren Gölge 54c6bb2a8c Fix add speaker VITS 2022-02-25 11:26:59 +01:00
Eren Gölge 38314194e7 Set `drop_last` 2022-02-25 11:26:59 +01:00
Eren Gölge f70e4bb8c6 Add new speakers to the vits model 2022-02-25 11:26:59 +01:00
Eren Gölge 1f0c8179da Make style 2022-02-25 11:26:59 +01:00
Eren Gölge 34c4be5e49 Update forwardtts 2022-02-25 11:26:59 +01:00
Eren Gölge 2829027d8b Refactor VITS model 2022-02-25 11:26:59 +01:00
Eren Gölge ef63c99524 Implement `start_by_longest` option for TTSDatase 2022-02-25 11:26:18 +01:00
Eren Gölge 146fbfd7c9 Extend unittests 2022-02-25 11:25:00 +01:00
Eren Gölge 2fe16de8e3 Make lint 2022-02-25 11:25:00 +01:00
Eren Gölge 235f7d9b02 Extend glow_tts model tests 2022-02-25 11:24:13 +01:00