Edresson Casanova
360b969c23
Fix rebase issues
2022-06-08 09:52:39 -03:00
Edresson Casanova
e069985f17
Add speaker and emotion squeezer layers
2022-06-08 09:52:39 -03:00
Edresson Casanova
a309edacb4
Remove VITS conditional flow module
2022-06-08 09:52:39 -03:00
Edresson Casanova
a1d0088087
Remove VITS End2End loss
2022-06-08 09:52:38 -03:00
Edresson Casanova
ae55bdae6c
Fix Lint checks
2022-06-08 09:52:38 -03:00
Edresson Casanova
fd1036f4ba
Add Noise scale predictor
2022-06-08 09:52:38 -03:00
Edresson Casanova
d6d8d0e3e1
Fix the VITS GAN loss
2022-06-08 09:52:38 -03:00
Edresson Casanova
e07fcc7a8c
Add text encoder adversarial loss to VITS
2022-06-08 09:52:38 -03:00
Edresson Casanova
4e94b46d5e
Add end2end VITS loss
2022-06-08 09:52:38 -03:00
Edresson Casanova
a822f21b78
Add prosody encoder inference support
2022-06-08 09:52:38 -03:00
Edresson Casanova
010f847929
Add an option to detach the prosody encoder input
2022-06-08 09:52:38 -03:00
Edresson Casanova
2cac18c7b7
Add VAE prosody encoder
2022-06-08 09:52:37 -03:00
Edresson Casanova
f774cf0648
Condition the prosody encoder on z_p
2022-06-08 09:52:37 -03:00
Edresson Casanova
512525cc39
Support prosody conditional model on decoder input
2022-06-08 09:52:37 -03:00
Edresson Casanova
02194367d7
Add emotion classifier loss
2022-06-08 09:52:37 -03:00
Edresson Casanova
a6c8fea192
Add conditional module
2022-06-08 09:52:37 -03:00
Edresson Casanova
bce4a41b9c
Fix unit tests
2022-06-08 09:52:37 -03:00
Edresson Casanova
0fb1b200c6
Fix rebase issues
2022-06-08 09:52:37 -03:00
Edresson Casanova
98c2834b17
Disable the reversal prosody encoder speaker loss
2022-06-08 09:52:37 -03:00
Edresson Casanova
ac3f98cefb
Add text encoder reversal speaker classifier loss
2022-06-08 09:52:37 -03:00
Edresson Casanova
a543d71352
Clean up old code
2022-06-08 09:52:36 -03:00
Edresson Casanova
66e3f5388e
Add prosody encoder params on config
2022-06-08 09:52:36 -03:00
Edresson Casanova
050f7707e2
Add reversal classifier loss
2022-06-08 09:52:36 -03:00
Edresson Casanova
44ec2ab387
Add prosody encoder training support
2022-06-08 09:52:36 -03:00
Edresson Casanova
6126e5e588
Add emotion embedding in the encoder
2022-06-08 09:52:36 -03:00
Edresson Casanova
1fdef1c4c9
Add formatter for the Emotional Speech Dataset
2022-06-08 09:52:36 -03:00
Edresson Casanova
61a04a7855
Remove useless encoder weights reload
2022-06-08 09:52:36 -03:00
Edresson Casanova
e8c4417f07
Fix Style tests
2022-06-08 09:52:36 -03:00
Edresson Casanova
730befebcc
Fix style tests
2022-06-08 09:52:36 -03:00
Edresson Casanova
e409f3588b
Bug fix in single speaker emotion embedding training
2022-06-08 09:52:36 -03:00
Edresson Casanova
7a0eba517f
Add emotion external embeddings training unit test
2022-06-08 09:52:35 -03:00
Edresson Casanova
5a10ef27b3
Add emotion consistency loss
2022-06-08 09:52:35 -03:00
Edresson Casanova
bd99548016
Add Emotion Support for the VITS model
2022-06-08 09:52:35 -03:00
Edresson Casanova
ee99a6c1e2
Fix voice conversion inference (#1583)
* Add voice conversion zoo test
* Fix style
* Fix unit test
2022-05-20 15:50:25 +02:00
Edresson Casanova
e5d8ec2402
Change the VITS upsampling interpolation trick to linear (#1564)
2022-05-13 10:52:39 +02:00
Eren Gölge
6e460b7e42
Add an assert for the upsampling trick (#1538)
2022-05-12 19:55:24 +02:00
Eren Gölge
e45ae57aef
Merge pull request #1550 from coqui-ai/fix-upsampling-asserts
Fix VITS upsampling asserts
2022-05-12 14:51:41 +02:00
Edresson Casanova
175ca06388
Add reinit text encoder and duration predictor parameter (#1562)
* Add reinit encoder and duration predictor option
* Add .data to prevent any overlooked autograd hook
2022-05-12 09:08:36 -03:00
Edresson Casanova
182711043c
Fix the VITS upsampling asserts
Fix style
2022-05-12 09:08:29 -03:00
Eren Gölge
c18bd21b3f
Return durations at VITS inference
2022-05-11 11:30:05 +02:00
Eren Gölge
5021a03de0
Use torch.no_grad for VITS inference
2022-05-11 11:29:36 +02:00
Eren Gölge
3f03e3012c
Fix batch_group_size in VITS
2022-05-07 13:44:44 +02:00
WeberJulian
fbdf76b2fc
Return y_mask in VITS inference (#1540)
* Return y_mask
* Make style
2022-05-03 13:49:24 +02:00
Edresson Casanova
8d228ab22a
Trick for upsampling to high sampling rates using the VITS model (#1456)
* Add upsample VITS support
* Fix the bug in inference
* Fix lint checks
* Add RMS based norm in save_wav method
* Style fix
* Add the period for VITS multi-period discriminator in model_args
* Fix bug in speaker encoder loading at inference time
* Add unit tests
* Remove useless detach_z_vocoder parameter
* Add docs for VITS upsampling
* Fix the docs
* Rename TTS_part_sample_rate to encoder_sample_rate
* Add upsampling_init and upsampling_z methods
* Add asserts for encoder_sample_rate part
* Move upsampling tests to test_vits.py
2022-04-26 11:47:46 +02:00
Edresson Casanova
060e0f9368
Add EmbeddingManager and BaseIDManager (#1374)
2022-03-31 13:41:16 +02:00
Edresson Casanova
37896e1743
Bug fix in freeze encoder (#1391)
* Fix the bug in freeze encoder
* Remove emb_l definition for non-multilingual training
* Fix unit tests
2022-03-24 18:16:04 +01:00
Eren Gölge
0870a4faa2
Make style (#1405)
2022-03-16 12:13:55 +01:00
Edresson Casanova
dbe9da7f15
Add Voice conversion inference support (#1337)
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Edresson Casanova
917f417ac4
Add alphas to control language and speaker balancer (#1216)
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the sampler weights to float to save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Eren Gölge
dd4287de1f
Update models
2022-03-03 20:23:00 +01:00