coqui-tts

Commit Graph

Author	SHA1	Message	Date
Edresson Casanova	6186da855f	Bug fix on pre-compute F0	2022-06-16 19:38:11 +00:00
Edresson Casanova	6a573065f4	Add pitch predictor	2022-06-16 19:34:54 +00:00
Edresson Casanova	92e7391a5d	Add speaker embedding on prosody encoder	2022-06-16 19:06:48 +00:00
Edresson Casanova	856e185641	Add Resnet prosody encoder support	2022-06-13 13:47:22 +00:00
Edresson Casanova	0844d9225d	Fix unit tests	2022-06-08 10:18:19 -03:00
Edresson Casanova	4b59f07946	Support the use of speaker embedding as emotion embedding	2022-06-08 09:52:39 -03:00
Edresson Casanova	360b969c23	Fix rebase issues	2022-06-08 09:52:39 -03:00
Edresson Casanova	e069985f17	Add speaker and emotion squeezer layers	2022-06-08 09:52:39 -03:00
Edresson Casanova	a309edacb4	Remove VITS conditional flow module	2022-06-08 09:52:39 -03:00
Edresson Casanova	a1d0088087	Remove VITS End2End loss	2022-06-08 09:52:38 -03:00
Edresson Casanova	ae55bdae6c	Fix Lint checks	2022-06-08 09:52:38 -03:00
Edresson Casanova	fd1036f4ba	Add Noise scale predictor	2022-06-08 09:52:38 -03:00
Edresson Casanova	d6d8d0e3e1	Fix the VITS GAN loss	2022-06-08 09:52:38 -03:00
Edresson Casanova	e07fcc7a8c	Add text encoder adversarial loss on the VITS	2022-06-08 09:52:38 -03:00
Edresson Casanova	4e94b46d5e	Add end2end VITS loss	2022-06-08 09:52:38 -03:00
Edresson Casanova	a822f21b78	Add prosody encoder inference support	2022-06-08 09:52:38 -03:00
Edresson Casanova	010f847929	Add an option to detach the prosody encoder input	2022-06-08 09:52:38 -03:00
Edresson Casanova	2cac18c7b7	Add VAE prosody encoder	2022-06-08 09:52:37 -03:00
Edresson Casanova	f774cf0648	Condition the prosody encoder on z_p	2022-06-08 09:52:37 -03:00
Edresson Casanova	512525cc39	Support prosody conditional model on decoder input	2022-06-08 09:52:37 -03:00
Edresson Casanova	02194367d7	Add emotion classifier loss	2022-06-08 09:52:37 -03:00
Edresson Casanova	a6c8fea192	Add conditional module	2022-06-08 09:52:37 -03:00
Edresson Casanova	bce4a41b9c	Fix unit tests	2022-06-08 09:52:37 -03:00
Edresson Casanova	0fb1b200c6	Fix rebase issues	2022-06-08 09:52:37 -03:00
Edresson Casanova	98c2834b17	Disable the reversal prosody encoder speaker loss	2022-06-08 09:52:37 -03:00
Edresson Casanova	ac3f98cefb	Add text encoder reversal speaker classifier loss	2022-06-08 09:52:37 -03:00
Edresson Casanova	a543d71352	Clean up old code	2022-06-08 09:52:36 -03:00
Edresson Casanova	66e3f5388e	Add prosody encoder params on config	2022-06-08 09:52:36 -03:00
Edresson Casanova	050f7707e2	Add reversal classifier loss	2022-06-08 09:52:36 -03:00
Edresson Casanova	44ec2ab387	Add prosody encoder training support	2022-06-08 09:52:36 -03:00
Edresson Casanova	6126e5e588	Add emotion embedding in the encoder	2022-06-08 09:52:36 -03:00
Edresson Casanova	1fdef1c4c9	Add formatter for the Emotional Speech Dataset	2022-06-08 09:52:36 -03:00
Edresson Casanova	61a04a7855	Remove useless encoder weights reload	2022-06-08 09:52:36 -03:00
Edresson Casanova	e8c4417f07	Fix Style tests	2022-06-08 09:52:36 -03:00
Edresson Casanova	730befebcc	Fix style tests	2022-06-08 09:52:36 -03:00
Edresson Casanova	e409f3588b	Bug fix in single speaker emotion embedding training	2022-06-08 09:52:36 -03:00
Edresson Casanova	7a0eba517f	Add emotion external embeddings training unit test	2022-06-08 09:52:35 -03:00
Edresson Casanova	5a10ef27b3	Add emotion consistency loss	2022-06-08 09:52:35 -03:00
Edresson Casanova	bd99548016	Add Emotion Support for the VITS model	2022-06-08 09:52:35 -03:00
Edresson Casanova	ee99a6c1e2	Fix voice conversion inference (#1583) * Add voice conversion zoo test * Fix style * Fix unit test	2022-05-20 15:50:25 +02:00
Edresson Casanova	e5d8ec2402	Change the VITS upsampling interpolation trick to linear (#1564)	2022-05-13 10:52:39 +02:00
Eren Gölge	6e460b7e42	Add an assert for the upsampling trick (#1538)	2022-05-12 19:55:24 +02:00
Eren Gölge	e45ae57aef	Merge pull request #1550 from coqui-ai/fix-upsampling-asserts: Fix VITS upsampling asserts	2022-05-12 14:51:41 +02:00
Edresson Casanova	175ca06388	Add reinit text encoder and duration predictor parameter (#1562) * Add reinit encoder and duration predictor option * Add .data to prevent any overlooked autograd hook	2022-05-12 09:08:36 -03:00
Edresson Casanova	182711043c	Fix the VITS upsampling asserts; fix style	2022-05-12 09:08:29 -03:00
Eren Gölge	c18bd21b3f	Return durations at VITS inference	2022-05-11 11:30:05 +02:00
Eren Gölge	5021a03de0	Use torch.no_grad for VITS inference	2022-05-11 11:29:36 +02:00
Eren Gölge	3f03e3012c	Fix batch_group_size in VITS	2022-05-07 13:44:44 +02:00
WeberJulian	fbdf76b2fc	Return y_mask in VITS inference (#1540) * returns y_mask * make style	2022-05-03 13:49:24 +02:00
Edresson Casanova	8d228ab22a	Trick to Upsampling to High sampling rates using VITS model (#1456) * Add upsample VITS support * Fix the bug in inference * Fix lint checks * Add RMS based norm in save_wav method * Style fix * Add the period for VITS multi-period discriminator in model_args * Bug fix in speaker encoder load in inference time * Add unit tests * Remove useless detach_z_vocoder parameter * Add docs for VITS upsampling * Fix the docs * Rename TTS_part_sample_rate to encoder_sample_rate * Add upsampling_init and upsampling_z methods * Add asserts for encoder_sample_rate part * Move upsampling tests to test_vits.py	2022-04-26 11:47:46 +02:00

150 Commits