Edresson Casanova
dcd0d1f6a1
Clean up old code
2022-05-16 13:09:12 +00:00
Edresson Casanova
3a524b0597
Add prosody encoder params on config
2022-05-16 09:45:28 -03:00
Edresson Casanova
093bd07528
Add reversal classifier loss
2022-04-18 21:09:59 -03:00
Edresson Casanova
8a3396d9c1
Add prosody encoder training support
2022-04-18 17:01:44 -03:00
Edresson Casanova
f31ba25233
Add emotion embedding in the encoder
2022-03-31 19:14:41 -03:00
Edresson Casanova
314f95f974
Add formatter for the Emotional Speech Dataset
2022-03-31 17:27:30 +00:00
Edresson Casanova
7be9056b3d
Remove useless encoder weights reload
2022-03-31 11:05:58 -03:00
Edresson Casanova
047cebd7b8
Fix Style tests
2022-03-30 16:51:39 -03:00
Edresson Casanova
aebbdfc62b
Merge branch 'dev-managers' into dev-emotion
2022-03-30 16:25:47 -03:00
Edresson Casanova
397b3e9baf
Fix style tests
2022-03-23 15:31:33 -03:00
Edresson Casanova
c7af7c6474
Make LanguageManager inherit from BaseIDManager
2022-03-23 15:26:59 -03:00
Edresson Casanova
40df2cfdd1
Change the speaker manager to a generic manager
2022-03-23 15:26:06 -03:00
Edresson Casanova
10dee54ac3
Bug fix in single speaker emotion embedding training
2022-03-16 20:57:14 +00:00
Eren Gölge
0870a4faa2
Make style (#1405)
2022-03-16 12:13:55 +01:00
Edresson Casanova
4f03784b1f
Add emotion external embeddings training unit test
2022-03-15 13:09:58 +00:00
Edresson Casanova
5090034fd1
Add emotion consistency loss
2022-03-15 12:35:00 +00:00
Edresson Casanova
e3520e9e9f
Add Emotion Support for the VITS model
2022-03-15 01:16:48 +00:00
Edresson Casanova
e33819b7de
Make LanguageManager inherit from BaseIDManager
2022-03-11 19:25:18 -03:00
Edresson Casanova
12e0b6f39e
Change the speaker manager to a generic manager
2022-03-11 17:09:58 -03:00
Edresson Casanova
dbe9da7f15
Add voice conversion inference support (#1337)
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Edresson Casanova
917f417ac4
Add alphas to control language and speaker balancer (#1216)
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the sampler weights to float to save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Eren Gölge
dd4287de1f
Update models
2022-03-03 20:23:00 +01:00
Eren Gölge
1425a023fe
Make style and lint
2022-03-02 13:25:35 +01:00
Eren Gölge
c68885b3fd
Update Vits speaker encoder init
2022-03-02 13:20:23 +01:00
Eren Gölge
942df0fb05
Update vits dataset
2022-03-02 09:14:32 +01:00
Eren Gölge
1e414b3a09
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
acc83cd3e6
Update Vits model API
2022-02-25 11:31:56 +01:00
Eren Gölge
83c5ddc5b7
Update imports
2022-02-25 11:31:56 +01:00
Eren Gölge
14c117978d
Fix return outputs
2022-02-25 11:31:56 +01:00
Eren Gölge
424d04e4f6
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
00c7600103
Update Vits model API
2022-02-25 11:30:24 +01:00
Eren Gölge
4b96bfe925
Fix train logging
2022-02-25 11:26:59 +01:00
Eren Gölge
ab8a4ca2c3
Revert random segment
2022-02-25 11:26:59 +01:00
Eren Gölge
8622226f3f
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
54c6bb2a8c
Fix add speaker VITS
2022-02-25 11:26:59 +01:00
Eren Gölge
f70e4bb8c6
Add new speakers to the vits model
2022-02-25 11:26:59 +01:00
Eren Gölge
1f0c8179da
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
2829027d8b
Refactor VITS model
2022-02-25 11:26:59 +01:00
Eren Gölge
146fbfd7c9
Extend unittests
2022-02-25 11:25:00 +01:00
Eren Gölge
2fe16de8e3
Make lint
2022-02-25 11:25:00 +01:00
Eren Gölge
001da8afc8
Update Vits for the new model API
2022-02-25 11:21:19 +01:00
Eren Gölge
ea965a5683
Update VITS for the new API
2022-02-25 11:11:35 +01:00
Eren Gölge
93957d58a1
Refactoring VITS for the tokenizer API
2022-02-25 11:05:06 +01:00
Eren Gölge
7575367b9f
Refactoring VITS for the tokenizer API
2022-02-25 10:57:35 +01:00
Eren Gölge
127118c637
Update TTS.tts formatters (#1228)
* Return Dict from tts formatters
* Make style
2022-02-11 23:03:43 +01:00
WeberJulian
e778bad626
Add argument to enable dp speaker conditioning
2022-01-06 15:07:27 +01:00
WeberJulian
e1accb6e28
Fix train_tts.py and uncomment code (#1051)
* Fix SE loading and language embedding logic
* Remove trailing whitespace
* Uncomment resampling code for SCL
2022-01-03 17:44:57 +01:00
Eren Gölge
36cef5966b
Fix resnet speaker encoder
2021-12-30 15:36:35 +00:00
Eren Gölge
348b5c96a2
Fix speaker encoder test
2021-12-30 15:36:35 +00:00
Eren Gölge
7129b04d46
Update VITS model
2021-12-30 14:08:17 +00:00