Edresson Casanova
5859e6474c
Add script to extract VITS MAS alignments
2022-06-16 19:07:10 +00:00
Edresson Casanova
92e7391a5d
Add speaker embedding on prosody encoder
2022-06-16 19:06:48 +00:00
Edresson Casanova
251e1c289d
Add support for inference using a specific reference file instead of the averaged embeddings
2022-06-13 13:47:31 +00:00
Edresson Casanova
856e185641
Add Resnet prosody encoder support
2022-06-13 13:47:22 +00:00
Edresson Casanova
0844d9225d
Fix unit tests
2022-06-08 10:18:19 -03:00
Edresson Casanova
4b59f07946
Support the use of speaker embedding as emotion embedding
2022-06-08 09:52:39 -03:00
Edresson Casanova
360b969c23
Fix rebase issues
2022-06-08 09:52:39 -03:00
Edresson Casanova
e069985f17
Add speaker and emotion squeezer layers
2022-06-08 09:52:39 -03:00
Edresson Casanova
a309edacb4
Remove VITS conditional flow module
2022-06-08 09:52:39 -03:00
Edresson Casanova
a1d0088087
Remove VITS End2End loss
2022-06-08 09:52:38 -03:00
Edresson Casanova
ae55bdae6c
Fix Lint checks
2022-06-08 09:52:38 -03:00
Edresson Casanova
fd1036f4ba
Add Noise scale predictor
2022-06-08 09:52:38 -03:00
Edresson Casanova
d6d8d0e3e1
Fix the VITS GAN loss
2022-06-08 09:52:38 -03:00
Edresson Casanova
e07fcc7a8c
Add text encoder adversarial loss on the VITS
2022-06-08 09:52:38 -03:00
Edresson Casanova
4e94b46d5e
Add end2end VITS loss
2022-06-08 09:52:38 -03:00
Edresson Casanova
ec8c8dc5a2
Recreate the prior distribution of Capacitron VAE on the right device
2022-06-08 09:52:38 -03:00
Edresson Casanova
a822f21b78
Add prosody encoder inference support
2022-06-08 09:52:38 -03:00
Edresson Casanova
010f847929
Add an option to detach the prosody encoder input
2022-06-08 09:52:38 -03:00
Edresson Casanova
2cac18c7b7
Add VAE prosody encoder
2022-06-08 09:52:37 -03:00
Edresson Casanova
f774cf0648
Condition the prosody encoder on z_p
2022-06-08 09:52:37 -03:00
Edresson Casanova
512525cc39
Support prosody conditional model on decoder input
2022-06-08 09:52:37 -03:00
Edresson Casanova
02194367d7
Add emotion classifier loss
2022-06-08 09:52:37 -03:00
Edresson Casanova
f50819a5f6
Fix compute embeddings issue
2022-06-08 09:52:37 -03:00
Edresson Casanova
a6c8fea192
Add conditional module
2022-06-08 09:52:37 -03:00
Edresson Casanova
bce4a41b9c
Fix unit tests
2022-06-08 09:52:37 -03:00
Edresson Casanova
0fb1b200c6
Fix rebase issues
2022-06-08 09:52:37 -03:00
Edresson Casanova
98c2834b17
Disable the reversal prosody encoder speaker loss
2022-06-08 09:52:37 -03:00
Edresson Casanova
ac3f98cefb
Add text encoder reversal speaker classifier loss
2022-06-08 09:52:37 -03:00
Edresson Casanova
a543d71352
Clean up old code
2022-06-08 09:52:36 -03:00
Edresson Casanova
66e3f5388e
Add prosody encoder params on config
2022-06-08 09:52:36 -03:00
Edresson Casanova
95409be0bc
Add Speech style balancer
2022-06-08 09:52:36 -03:00
Edresson Casanova
050f7707e2
Add reversal classifier loss
2022-06-08 09:52:36 -03:00
Edresson Casanova
44ec2ab387
Add prosody encoder training support
2022-06-08 09:52:36 -03:00
Edresson Casanova
6126e5e588
Add emotion embedding in the encoder
2022-06-08 09:52:36 -03:00
Edresson Casanova
1fdef1c4c9
Add formatter for the Emotional Speech Dataset
2022-06-08 09:52:36 -03:00
Edresson Casanova
61a04a7855
Remove useless encoder weights reload
2022-06-08 09:52:36 -03:00
Edresson Casanova
836c4c6801
Fix emotion unit test
2022-06-08 09:52:36 -03:00
Edresson Casanova
e8c4417f07
Fix Style tests
2022-06-08 09:52:36 -03:00
Edresson Casanova
730befebcc
Fix style tests
2022-06-08 09:52:36 -03:00
Edresson Casanova
a8292c7c03
Fix the Bug in Synthesizer
2022-06-08 09:52:36 -03:00
Edresson Casanova
e409f3588b
Bug fix in single speaker emotion embedding training
2022-06-08 09:52:36 -03:00
Edresson Casanova
6f33506d89
Fix unit tests
2022-06-08 09:52:35 -03:00
Edresson Casanova
7a0eba517f
Add emotion external embeddings training unit test
2022-06-08 09:52:35 -03:00
Edresson Casanova
5a10ef27b3
Add emotion consistency loss
2022-06-08 09:52:35 -03:00
Edresson Casanova
c54e6ae1e4
Fix the bug in synthesizer
2022-06-08 09:52:35 -03:00
Edresson Casanova
bd99548016
Add Emotion Support for the VITS model
2022-06-08 09:52:35 -03:00
Edresson Casanova
ad7ce05ac9
Add emotion manager
2022-06-08 09:52:35 -03:00
WeberJulian
f09ea11c71
Internal formatter (#1629)
* Add coqui formatter
* Make style
2022-06-08 14:31:03 +02:00
Aya-AlJafari
68cef28a88
Adding TTS Tutorials (#1584)
* Adding inferencing notebook
* added multispeaker explanation and use case and renamed the file
* Adding training tutorial
* fixed dummy paths
* fixed review comments
* fixed metadata extension
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-06-02 12:23:00 +02:00
Eren Gölge
f70e82cd19
Use fsspec and torch for embedding file IO (#1581)
* Use fsspec and torch for embedding file
* Fixup
* Fix load and save files
* Fix compute embedding script
* Set use_cuda to true if available
* Add dummy speakers.pth file
* Make style
* Change default speakers file extension
Co-authored-by: WeberJulian <julian.weber@hotmail.fr>
2022-06-01 13:49:42 +02:00