coqui-tts

Commit Graph

Author	SHA1	Message	Date
Eren Gölge	6e460b7e42	Add an assert for the upsampling trick (#1538)	2022-05-12 19:55:24 +02:00
Edresson Casanova	a97eed696a	Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 (#1560)	2022-05-12 15:15:18 +02:00
Eren Gölge	e45ae57aef	Merge pull request #1550 from coqui-ai/fix-upsampling-asserts Fix VITS upsampling asserts	2022-05-12 14:51:41 +02:00
Edresson Casanova	175ca06388	Add reinit text encoder and duration predictor parameter (#1562) * Add reinit encoder and duration predictor option * Add .data to prevent any overlooked autograd hook	2022-05-12 09:08:36 -03:00
Edresson Casanova	182711043c	Fix the VITS upsampling asserts Fix style	2022-05-12 09:08:29 -03:00
Eren Gölge	c3f8c4d5eb	Return default SpeakerManager if no d_vector_file	2022-05-11 11:31:45 +02:00
Eren Gölge	121e9ed685	Pass use_cuda to init_encoder	2022-05-11 11:31:17 +02:00
Eren Gölge	c18bd21b3f	Return durations at VITS inference	2022-05-11 11:30:05 +02:00
Eren Gölge	5021a03de0	Use torch.no_grad for VITS inference	2022-05-11 11:29:36 +02:00
Eren Gölge	3f03e3012c	Fix batch_group_size in VITS	2022-05-07 13:44:44 +02:00
code-review-doctor	fa887ef5f9	Fix issue probably-meant-fstring found at https://codereview.doctor (#1532)	2022-05-07 13:33:40 +02:00
WeberJulian	fbdf76b2fc	returns y_mask in VITS inference (#1540) * returns y_mask * make style	2022-05-03 13:49:24 +02:00
Edresson Casanova	8d228ab22a	Trick to Upsampling to High sampling rates using VITS model (#1456) * Add upsample VITS support * Fix the bug in inference * Fix lint checks * Add RMS based norm in save_wav method * Style fix * Add the period for VITS multi-period discriminator in model_args * Bug fix in speaker encoder load in inference time * Add unit tests * Remove useless detach_z_vocoder parameter * Add docs for VITS upsampling * Fix the docs * Rename TTS_part_sample_rate to encoder_sample_rate * Add upsampling_init and upsampling_z methods * Add asserts for encoder_sample_rate part * Move upsampling tests to test_vits.py	2022-04-26 11:47:46 +02:00
Edresson Casanova	060e0f9368	Add EmbeddingManager and BaseIDManager (#1374)	2022-03-31 13:41:16 +02:00
WeberJulian	c66a6241fd	Enforce phonemizer definition for synthesis (#1441) * Enforce phonemizer definition for synthesis * Fix train_tts, tokenizer init can now edit config * Add small change to trigger CI pipeline * fix wrong output path for one tts_test * Fix style * Test config overides by args and tokenizer * Fix style	2022-03-25 23:15:33 +01:00
Edresson Casanova	37896e1743	Bug fix in freeze encoder (#1391) * Fix the bug in freeze encoder * Remove emb_l definition for non-multilingual training * Fix unit tests	2022-03-24 18:16:04 +01:00
Eren Gölge	1c3623af33	Fix model manager (#1436) * Fix manager * Make style	2022-03-23 12:57:14 +01:00
Eren Gölge	fd56fabb21	Fix #1380 (#1409)	2022-03-16 12:38:27 +01:00
Eren Gölge	0870a4faa2	Make style (#1405)	2022-03-16 12:13:55 +01:00
WeberJulian	690c96ed28	Fix default phonemizer for ja and zh (#1399)	2022-03-16 12:13:22 +01:00
Edresson Casanova	f81892483d	REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support (#1349) * Rename Speaker encoder module to encoder * Add a generic emotion dataset formatter * Transform the Speaker Encoder dataset to a generic dataset and create emotion encoder config * Add class map in emotion config * Add Base encoder config * Add evaluation encoder script * Fix the bug in plot_embeddings * Enable Weight decay for encoder training * Add argument to disable storage * Add Perfect Sampler and remove storage * Add evaluation during encoder training * Fix lint checks * Remove useless config parameter * Active evaluation in speaker encoder test and use multispeaker dataset for this test * Unit tests fixes * Remove useless tests for speedup the aux_tests * Use get_optimizer in Encoder * Add BaseEncoder Class * Fix the unit tests * Add Perfect Batch Sampler unit test * Add compute encoder accuracy in a function	2022-03-11 14:43:40 +01:00
Edresson Casanova	36e9ea2f97	Open bible dataset formatter (#1365) * Add support for voice conversion inference * Cache d_vectors_by_speaker for fast inference using a bigger speakers.json * Rebase bug fix * Use the average d-vector for inference * Fix the bug in find unique chars script * Add OpenBible formatter Co-authored-by: Eren Gölge <erogol@hotmail.com>	2022-03-11 10:43:31 +01:00
Edresson Casanova	dbe9da7f15	Add Voice conversion inference support (#1337) * Add support for voice conversion inference * Cache d_vectors_by_speaker for fast inference using a bigger speakers.json * Rebase bug fix * Use the average d-vector for inference	2022-03-10 14:57:12 +01:00
Edresson Casanova	917f417ac4	Add alphas to control language and speaker balancer (#1216) * Add alphas to control language and speaker balancer * Add docs for speaker and language samplers * Change the Samplers weights to float for save memory * Change the test_samplers to unittest format * Add get_sampler method in BaseTTS * Fix rebase issues * Add language and speaker samplers support for DDP training * Rename distributed sampler wrapper * Remove the DistributedSamplerWrapper and use the one from Trainer * Bugfix after rebase * Move the samplers config to tts config	2022-03-10 14:56:09 +01:00
Edresson Casanova	f381e29b91	REBASED: Add support for the speaker encoder training using torch spectrograms (#1348) * Add support for the speaker encoder training using torch spectrograms * Remove useless function in speaker encoder dataset class	2022-03-10 14:54:51 +01:00
Eren Gölge	c670365507	Fix VCTK recipe and formatter	2022-03-08 14:20:34 +01:00
Eren Gölge	e9d9028b4d	Revert cleaner name	2022-03-06 12:57:06 +01:00
Eren Gölge	764c7fa4a4	Rename phoneme_cleaners	2022-03-06 12:09:54 +01:00
Eren Gölge	dd4287de1f	Update models	2022-03-03 20:23:00 +01:00
Eren Gölge	1425a023fe	Make style and lint	2022-03-02 13:25:35 +01:00
Eren Gölge	c68885b3fd	Update Vits speaker encoder init	2022-03-02 13:20:23 +01:00
Eren Gölge	27b67b7945	Fix import	2022-03-02 09:15:20 +01:00
Eren Gölge	942df0fb05	Update vits dataset	2022-03-02 09:14:32 +01:00
Eren Gölge	6a9f8074f0	Fix TTSDataset	2022-03-01 07:57:48 +01:00
Eren Gölge	690de1ab06	Update Characters and add more tests	2022-02-25 11:32:44 +01:00
Eren Gölge	9063397892	Fix FastSpeech config	2022-02-25 11:31:56 +01:00
Eren Gölge	1e414b3a09	Make style	2022-02-25 11:31:56 +01:00
Eren Gölge	acc83cd3e6	Update Vits model API	2022-02-25 11:31:56 +01:00
Eren Gölge	fe656659be	Implement BaseTTS	2022-02-25 11:31:56 +01:00
Eren Gölge	bed4afd4ee	Implement BaseVocabulary	2022-02-25 11:31:56 +01:00
Eren Gölge	83c5ddc5b7	Update imports	2022-02-25 11:31:56 +01:00
Eren Gölge	14c117978d	Fix return outputs	2022-02-25 11:31:56 +01:00
Eren Gölge	424d04e4f6	Make style	2022-02-25 11:31:56 +01:00
Eren Gölge	8b3ba02c95	Add vocab_dict to model config	2022-02-25 11:31:20 +01:00
Eren Gölge	ff23dce081	Update TTSDataset	2022-02-25 11:31:20 +01:00
Eren Gölge	750903d2ba	Add VCTK formatter docstring	2022-02-25 11:30:24 +01:00
Eren Gölge	52a7896668	Update VITS loss	2022-02-25 11:30:24 +01:00
Eren Gölge	c68962c574	Update forward tts binary loss	2022-02-25 11:30:24 +01:00
Eren Gölge	c11944022d	Revert back again rand_segment	2022-02-25 11:30:24 +01:00
Eren Gölge	00c7600103	Update Vits model API	2022-02-25 11:30:24 +01:00
Eren Gölge	d0c27a9661	Update synthesis.py	2022-02-25 11:29:41 +01:00
Eren Gölge	35fc7270ff	Implement BaseTTS	2022-02-25 11:28:47 +01:00
Eren Gölge	2bad098625	Implement BaseVocabulary	2022-02-25 11:28:47 +01:00
Eren Gölge	1e219fef0a	Revert drop_last	2022-02-25 11:26:59 +01:00
Eren Gölge	7dfd753d91	Add a cheap trick to avoid short audio clips	2022-02-25 11:26:59 +01:00
Eren Gölge	1a43e05460	Fix VITS loss bug Fake and real features were given in the wrong args order to the loss function	2022-02-25 11:26:59 +01:00
Eren Gölge	4b96bfe925	Fix train logging	2022-02-25 11:26:59 +01:00
Eren Gölge	ab8a4ca2c3	Revert random segment	2022-02-25 11:26:59 +01:00
Eren Gölge	8622226f3f	Make style	2022-02-25 11:26:59 +01:00
Eren Gölge	d3a58ed07a	Fix default values	2022-02-25 11:26:59 +01:00
Eren Gölge	54c6bb2a8c	Fix add speaker VITS	2022-02-25 11:26:59 +01:00
Eren Gölge	590b04fb89	Fix espeak_wrapper	2022-02-25 11:26:59 +01:00
Eren Gölge	38314194e7	Set `drop_last`	2022-02-25 11:26:59 +01:00
Eren Gölge	f70e4bb8c6	Add new speakers to the vits model	2022-02-25 11:26:59 +01:00
Eren Gölge	d5c0e17548	Load right char class dynamically	2022-02-25 11:26:59 +01:00
Eren Gölge	1f0c8179da	Make style	2022-02-25 11:26:59 +01:00
Eren Gölge	b3ed6ff6b7	Update FastPitchConfig	2022-02-25 11:26:59 +01:00
Eren Gölge	1932401e8d	Fix dataset preprocessing	2022-02-25 11:26:59 +01:00
Eren Gölge	34c4be5e49	Update forwardtts	2022-02-25 11:26:59 +01:00
Eren Gölge	bb37462794	Update language manager	2022-02-25 11:26:59 +01:00
Eren Gölge	5169d4eb32	Plot pitch over input characters	2022-02-25 11:26:59 +01:00
Eren Gölge	2829027d8b	Refactor VITS model	2022-02-25 11:26:59 +01:00
Eren Gölge	ef63c99524	Implement `start_by_longest` option for TTSDataset	2022-02-25 11:26:18 +01:00
Eren Gölge	c4c471d61d	Allow padding for shorter segments	2022-02-25 11:25:48 +01:00
Eren Gölge	47fbddc8d4	Fix docstring	2022-02-25 11:25:48 +01:00
Eren Gölge	146fbfd7c9	Extend unittests	2022-02-25 11:25:00 +01:00
Eren Gölge	2fe16de8e3	Make lint	2022-02-25 11:25:00 +01:00
Eren Gölge	7b49a4aa2b	Fix glow_tts_config missing field	2022-02-25 11:24:13 +01:00
Eren Gölge	07b0a80d57	Fix tokenizer init_from_config	2022-02-25 11:24:13 +01:00
Eren Gölge	235f7d9b02	Extend glow_tts model tests	2022-02-25 11:24:13 +01:00
Eren Gölge	001da8afc8	Update Vits for the new model API	2022-02-25 11:21:19 +01:00
Eren Gölge	5176ae9e53	Fixes small compat. issues	2022-02-25 11:21:19 +01:00
Eren Gölge	131bc0cfc0	Fix synthesis.py 🔧	2022-02-25 11:18:00 +01:00
Eren Gölge	c0746f23df	Fix `too many open files`	2022-02-25 11:16:30 +01:00
Eren Gölge	df0d58bf09	Update VCTK recipes	2022-02-25 11:16:30 +01:00
Eren Gölge	28d98da422	Update VCTK formatter	2022-02-25 11:15:46 +01:00
Eren Gölge	cfaa51fddc	Update BaseTTS config	2022-02-25 11:11:35 +01:00
Eren Gölge	4c5cb44eeb	Update setup_model	2022-02-25 11:11:35 +01:00
Eren Gölge	7c4243fba7	Update GlowTTS	2022-02-25 11:11:35 +01:00
Eren Gölge	bacf79f4fb	Update AlignTTS	2022-02-25 11:11:35 +01:00
Eren Gölge	18f726af65	Update ForwardTTS	2022-02-25 11:11:35 +01:00
Eren Gölge	d0ec4b91e5	Update Tacotron models	2022-02-25 11:11:35 +01:00
Eren Gölge	ea965a5683	Update VITS for the new API	2022-02-25 11:11:35 +01:00
Eren Gölge	f802a931a3	Pass samples to init_from_config in SpeakerManager	2022-02-25 11:07:34 +01:00
Eren Gölge	bde68d9f25	Use the same phonemizer for `en` to `en-us`	2022-02-25 11:07:34 +01:00
Eren Gölge	8649d4fd36	Allow None pad and blank tokens	2022-02-25 11:07:34 +01:00
Eren Gölge	c9972e6f14	Make lint	2022-02-25 11:07:34 +01:00
Eren Gölge	90cc45dd4e	Update data loader tests	2022-02-25 11:05:54 +01:00
Eren Gölge	93957d58a1	Refactoring VITS for the tokenizer API	2022-02-25 11:05:06 +01:00
Eren Gölge	04df0a3d9f	Refactor TTSDataset ⚡️	2022-02-25 11:05:06 +01:00
Eren Gölge	452dbc43d8	Update imports for symbols -> characters	2022-02-25 11:05:06 +01:00
Eren Gölge	8071fa0020	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	b6c2bfdf08	Refactor synthesis.py for TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	b2bb954a51	Refactor TTSDataset to use TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	196ae74273	Update data loader tests	2022-02-25 11:05:06 +01:00
Eren Gölge	98057a00ae	Make style	2022-02-25 10:57:35 +01:00
Eren Gölge	7575367b9f	Refactoring VITS for the tokenizer API	2022-02-25 10:57:35 +01:00
Eren Gölge	4cd690e4c1	Updates BaseTTS and configs	2022-02-25 10:57:35 +01:00
Eren Gölge	176b712c1a	Refactor TTSDataset ⚡️	2022-02-25 10:57:35 +01:00
Eren Gölge	4597d4e5b6	Remove get_characters from BaseTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	2d8ce98d2a	Update imports for symbols -> characters	2022-02-25 10:48:03 +01:00
Eren Gölge	9a95e15483	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 10:48:03 +01:00
Eren Gölge	d0eb642d88	Refactor synthesis.py for TTSTokenizer	2022-02-25 10:48:03 +01:00
Eren Gölge	04202da1ac	Make style	2022-02-25 10:48:03 +01:00
Eren Gölge	3b63d713b9	Fix espeak wrapper cmd call	2022-02-25 10:48:03 +01:00
Eren Gölge	4894998e6b	Fix print_logs	2022-02-25 10:48:03 +01:00
Eren Gölge	4e8f9d6f10	Fix IPAPhonemes init_from_config	2022-02-25 10:48:03 +01:00
Eren Gölge	0fe39166fe	Discard OOV chars in tokenizer Discard but store OOV chars with a warning message when the OOV char first recognized	2022-02-25 10:48:03 +01:00
Eren Gölge	c39aaafbfc	Update EspeakWrapper for espeak-ng	2022-02-25 10:48:03 +01:00
Eren Gölge	bb389479a4	Update setup_model for TTS.tts models	2022-02-25 10:48:03 +01:00
Eren Gölge	3eca5ad060	Update config fields for phonemizer	2022-02-25 10:48:03 +01:00
Eren Gölge	d2525abe8c	Remove get_characters from BaseTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	73d27ebd45	Fix GlowTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	87bf940676	Print duplicate characters	2022-02-25 10:48:03 +01:00
Eren Gölge	3de9f38d16	Add init_from_config to SpeakerManager	2022-02-25 10:48:03 +01:00
Eren Gölge	d8ec7086b6	Update `synthesis` for the new API	2022-02-25 10:48:03 +01:00
Eren Gölge	4e83bf3968	Allow choosing phonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	22f0c58fe1	Print language codes	2022-02-25 10:48:02 +01:00
Eren Gölge	693fb4dd39	Modify init_from_config for IPAPhonemes	2022-02-25 10:48:02 +01:00
Eren Gölge	ba3b60c90f	Test TTSTokenizer	2022-02-25 10:48:02 +01:00
Eren Gölge	79a84410f2	Test punctuations	2022-02-25 10:48:02 +01:00
Eren Gölge	d8bdeb8b8f	Fix Punctuation	2022-02-25 10:48:02 +01:00
Eren Gölge	ff7c385838	Fix BasePhonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	10d435ce77	Fixup	2022-02-25 10:48:02 +01:00
Eren Gölge	f0655bfffc	Fix ja_jp_phonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	20e5dd3678	Add doc examples	2022-02-25 10:48:02 +01:00
Eren Gölge	fbad17e084	Update imports for symbols -> characters	2022-02-25 10:48:02 +01:00
Eren Gölge	a1df4f9887	Test character classes	2022-02-25 10:45:24 +01:00
Eren Gölge	bd461ace33	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 10:45:24 +01:00
Eren Gölge	5a9653978a	Refactor synthesis.py for TTSTokenizer	2022-02-25 10:45:24 +01:00
Eren Gölge	e5785b34b0	Style fix	2022-02-25 10:27:46 +01:00
Eren Gölge	e4049aa31a	Refactor TTSDataset to use TTSTokenizer	2022-02-25 10:27:46 +01:00
Eren Gölge	2480bbe937	Remove OLD TOKENIZATION ROUTINES	2022-02-25 09:32:54 +01:00
Eren Gölge	8d85af84cd	Implement Punctuation class	2022-02-25 09:32:54 +01:00
Eren Gölge	1aca58afaf	Fix imports in cleaners.py	2022-02-25 09:32:54 +01:00
Eren Gölge	0344645e90	Implement TTSTokenizer	2022-02-25 09:32:54 +01:00
Eren Gölge	2fb1f70503	Implement BaseCharacters, IPAPhonemes, Graphemes	2022-02-25 09:32:54 +01:00
Eren Gölge	1bee40af40	Create language folders under `TTS.tts.utils.text`	2022-02-25 09:32:54 +01:00
Eren Gölge	c1119bc291	Implement BasePhonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	dcd01356e0	Create `text/english` folder	2022-02-25 09:32:54 +01:00
Eren Gölge	80867c8e8c	Implement multi-phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	5e4f78add3	Implement espeak wrapper	2022-02-25 09:32:54 +01:00
Eren Gölge	e03a05c816	Implement gruut wrapper	2022-02-25 09:32:54 +01:00
Eren Gölge	172ba0c5e7	Implement JA_JP phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	ca02b82218	Implement ZH_CH phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	a51b031bff	Merge branch 'dev' into dev-fix-glowtts-infer	2022-02-21 12:01:40 +03:00
Edresson Casanova	28a7464975	Fix the bug in split dataset function (#1251) * Fix the bug in split_dataset * Make eval_split_size configurable * Change test_loader to use load_tts_samples function * Change eval_split_portion to eval_split_size and permits to set the absolute number of samples in eval * Fix samplers unit test * Add data unit test on GitHub workflow	2022-02-21 11:59:36 +03:00
Edresson Casanova	ba6e56e01c	Fix Glow-TTS multi-speaker inference	2022-02-18 19:25:29 +00:00
Eren Gölge	127118c637	Update TTS.tts formatters (#1228) * Return Dict from tts formatters * Make style	2022-02-11 23:03:43 +01:00
Edresson Casanova	0860d73cf8	Remove Tensorflow requirement (#1225) * Remove TF modules * Remove TF unit tests * Remove TF vocoder modules * Remove TF convert scripts * Remove TF requirement * Remove the Docs TF instructions * Remove TF inference support	2022-02-10 16:14:54 +01:00
WeberJulian	e778bad626	Add argument to enable dp speaker conditioning	2022-01-06 15:07:27 +01:00
WeberJulian	e1accb6e28	Fix train_tts.py and uncomment code (#1051) * Fix SE loading and language embedding logic * remove trailing white space * Uncomment resampling code for SCL	2022-01-03 17:44:57 +01:00
Eren Gölge	d724984be1	Fix language assignment	2022-01-02 11:11:24 +00:00
WeberJulian	a63998c048	Fix phoneme language	2022-01-01 21:08:13 +01:00
Eren Gölge	36cef5966b	Fix resnet speaker encoder	2021-12-30 15:36:35 +00:00
Eren Gölge	348b5c96a2	Fix speaker encoder test	2021-12-30 15:36:35 +00:00
Eren Gölge	7129b04d46	Update VITS model	2021-12-30 14:08:17 +00:00
Eren Gölge	5c5ddd2ba7	Init speaker manager for speaker encoder	2021-12-22 15:51:53 +00:00
Eren Gölge	a25269d897	Remove commented code	2021-12-20 11:54:10 +00:00
Eren Gölge	d29c3780d1	Use speaker_encoder from speaker manager in Vits	2021-12-20 11:54:10 +00:00
Eren Gölge	79de38ca76	Rename setup_model to setup_speaker_encoder_model	2021-12-20 11:54:10 +00:00
Eren Gölge	649dc9e9da	Remove redundant code	2021-12-20 11:54:10 +00:00
Eren Gölge	704dddcffa	Make style	2021-12-20 11:54:10 +00:00
WeberJulian	a564eb9f54	Add support for multi-lingual models in CLI	2021-12-20 11:54:10 +00:00
WeberJulian	2bbcb558dc	Prevent weighted sampler use when num_gpus > 1	2021-12-20 11:54:10 +00:00
WeberJulian	74cedfac38	Revert init multispeaker change	2021-12-20 11:54:10 +00:00
WeberJulian	9cfbacc622	Fix trailing space	2021-12-20 11:54:10 +00:00
WeberJulian	6b03943526	Move multilingual logic out of the trainer	2021-12-20 11:54:10 +00:00
Edresson	67dda0abe1	Add the SCL resample TODO	2021-12-20 11:54:10 +00:00
WeberJulian	8b52fb89d1	Fix merge bug	2021-12-20 11:54:10 +00:00
WeberJulian	09eda31a3f	Fix tests	2021-12-20 11:54:10 +00:00
Edresson	78a23e19df	Fix pylint checks	2021-12-20 11:54:10 +00:00
WeberJulian	4cd0e4eb0d	Remove self.audio_config from VITS	2021-12-20 11:54:10 +00:00
Edresson	d39200e69b	Remove torchaudio requirement	2021-12-20 11:54:10 +00:00
WeberJulian	2e516869a1	Fix trailing whitespace	2021-12-20 11:54:10 +00:00
WeberJulian	ffc269eaf4	Update docstring	2021-12-20 11:54:10 +00:00
Edresson	12968532fe	Add the language embedding dim in the duration predictor class	2021-12-20 11:54:10 +00:00
Edresson	90eac13bb2	Rename ununsed_speakers to ignored_speakers	2021-12-20 11:54:10 +00:00
Edresson	f34596d957	Fix function name	2021-12-20 11:54:10 +00:00
Edresson	45d0b04179	Lint fixes	2021-12-20 11:54:10 +00:00
Edresson	b769b49e34	Remove the data from the set_d_vectors_from_file function	2021-12-20 11:54:10 +00:00
Edresson	9daa33d1fd	Remove unusable speaker manager function	2021-12-20 11:54:10 +00:00
Edresson	8c22d5ac49	Turn more clear the VITS loss function	2021-12-20 11:54:10 +00:00
Edresson	6fc3b9e679	Remove the unusable fine-tuning model	2021-12-20 11:54:10 +00:00
WeberJulian	631addf33b	fix d-vector	2021-12-20 11:54:10 +00:00
WeberJulian	da6c1e858c	Fix small issues	2021-12-20 11:54:10 +00:00
WeberJulian	e8af6a9f08	Fix use_speaker_embedding logic	2021-12-20 11:54:10 +00:00
WeberJulian	120332d53f	Fix phonemes	2021-12-20 11:54:10 +00:00
WeberJulian	1340938159	fix phonemes per language	2021-12-20 11:54:10 +00:00
WeberJulian	e995a63bd6	fix linter	2021-12-20 11:54:10 +00:00
WeberJulian	1472b6df49	make style	2021-12-20 11:54:10 +00:00
WeberJulian	4d721bcabd	fix test sentence synthesis	2021-12-20 11:54:10 +00:00
WeberJulian	0804806727	fix f0_cache_path in dataset	2021-12-20 11:54:10 +00:00
WeberJulian	3b5592abcf	fix test vits	2021-12-20 11:54:10 +00:00
WeberJulian	2a2b5767c2	fix collate_fn	2021-12-20 11:54:10 +00:00
Julian WEBER	78c2d12a91	PitchExtractor	2021-12-20 11:54:10 +00:00
Julian WEBER	9a2f91327c	get_aux_input	2021-12-20 11:54:10 +00:00
Julian WEBER	b3abd01793	Merge dataset	2021-12-20 11:54:10 +00:00
Edresson	1bd1a0546b	Add audio resample in the speaker consistency loss	2021-12-20 11:54:10 +00:00
Edresson	1c6bcda950	Add freeze vocoder generator and flow-based decoder option	2021-12-20 11:54:10 +00:00
WeberJulian	2b952d8b97	freeze vits parts	2021-12-20 11:54:10 +00:00
WeberJulian	005bba60b0	get_speaker_weighted_sampler	2021-12-20 11:54:10 +00:00
Edresson	9de4539422	Update the VITS model docs	2021-12-20 11:54:10 +00:00
Edresson	eeb8ac07d9	Add voice conversion fine tuning mode	2021-12-20 11:54:10 +00:00
Edresson	690b37d0ab	Add support to use the speaker encoder as loss function in VITS model	2021-12-20 11:54:09 +00:00
Edresson	9b011b1cb3	Add H/ASP original checkpoint support	2021-12-20 11:54:09 +00:00
Edresson	de78556655	Fix the optimizer parameters bug in multilingual and multispeaker training	2021-12-20 11:54:09 +00:00
Edresson	9be5b75da3	Fix bug after merge	2021-12-20 11:54:09 +00:00
Edresson	76251b619a	Fix d-vector multispeaker training bug	2021-12-20 11:54:09 +00:00
Edresson	7ef3ddc6ff	Fix unit tests	2021-12-20 11:54:09 +00:00
Edresson	36dcd11453	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	c53693c155	Implement vocoder Fine Tuning like SC-GlowTTS paper	2021-12-20 11:54:09 +00:00
Edresson	f1f016314e	Fix the bug in M-AILABS formatter	2021-12-20 11:54:09 +00:00
Edresson	c334d39acc	Add voice conversion support for the model VITS trained with external speaker embedding	2021-12-20 11:54:09 +00:00
Edresson	e997889ba8	Fix bug in VITS multilingual inference	2021-12-20 11:54:09 +00:00
Edresson	7c0b8ec572	Fix bugs in the non-multilingual VITS inference	2021-12-20 11:54:09 +00:00
Edresson	3fbbebd74d	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	ac9416fb86	Add multilingual inference support	2021-12-20 11:54:09 +00:00
Edresson	dcb2374bc9	Add multilingual training support to the VITS model	2021-12-20 11:54:09 +00:00
Edresson	f996afedb0	Implement multilingual dataloader support	2021-12-20 11:54:09 +00:00
Edresson	5f1c18187f	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	d91c595c5a	Implement training support with d_vecs in the VITS model	2021-12-20 11:54:09 +00:00
Edresson	6a7db67a91	Allow ignore speakers for all multispeaker datasets	2021-12-20 11:54:09 +00:00
Edresson	e0ad838066	Select randomly a speaker from the speaker manager for the test sentences	2021-12-20 11:54:09 +00:00
Edresson	eb3e8affe1	Save speakers embeddings/ids before starting training	2021-12-20 11:54:09 +00:00
Eren Gölge	babdd84f91	Fix GST inference commit d3e477875a7e46a101fcf95a1794442823750fe2 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit fdece14585ab5a36eed1061a9a838d8e48aa6882 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit 908ce39370eadcc9fa8510cdb26c9ead87305427 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:49:37 2021 +0100 Make trim_db value negative commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:22:24 2021 +0100 Set find_endpoint db threshold in config.json	2021-12-07 13:28:49 +00:00
Eren Gölge	2ed9e3c241	Fix constant use of noise augment	2021-11-08 09:20:34 +01:00
Eren Gölge	b6b14a76af	Fix VITS stochastic duration predictor	2021-11-08 09:20:11 +01:00
Eren Gölge	faafea4cf2	Fix style	2021-11-04 17:04:40 +01:00
Eren Gölge	c5077c6c3f	Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev	2021-11-01 16:42:27 +01:00
Eren Gölge	20cebde1c9	Add docstring to MAI labs formatter	2021-11-01 16:41:55 +01:00
Eren Gölge	608f437545	Add a function to find unique chars	2021-11-01 16:41:33 +01:00
Eren Gölge	d6d780e758	Fix FastSpeech config	2021-11-01 16:41:15 +01:00
Michael Hansen	3bc043faeb	Upgrade to gruut 2.0 (#882)	2021-10-31 11:41:55 +01:00
Eren Gölge	2df0752e73	Model zoo tests (#900) * Fix VITS model multi-speaker init * Remove gdrive support in model manager * Add model zoo tests	2021-10-29 17:54:16 +02:00
Eren Gölge	035ed432bc	Doc update (#889) * Link source files from the docs * Update glowTTS recipes for docs * Add dataset downloaders	2021-10-26 17:41:33 +02:00
Eren Gölge	0cac3f330a	Enable custom formatter in load_tts_samples	2021-10-26 13:07:11 +02:00
Eren Gölge	00becf2671	Fix import statements	2021-10-25 19:29:16 +02:00
Eren Gölge	2b7d159383	Update BaseTTS for multi-speaker training	2021-10-21 16:29:06 +00:00
Eren Gölge	e62d3c5cf7	Use absolute imports for tts configs and models	2021-10-21 16:29:06 +00:00

960 Commits