coqui-tts

Commit Graph

Author	SHA1	Message	Date
Eren Gölge	8071fa0020	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	b6c2bfdf08	Refactor synthesis.py for TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	b2bb954a51	Refactor TTSDataset to use TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	196ae74273	Update data loader tests	2022-02-25 11:05:06 +01:00
Eren Gölge	98057a00ae	Make style	2022-02-25 10:57:35 +01:00
Eren Gölge	7575367b9f	Refactorin VITS for the tokenizer API	2022-02-25 10:57:35 +01:00
Eren Gölge	4cd690e4c1	Updates BaseTTS and configs	2022-02-25 10:57:35 +01:00
Eren Gölge	176b712c1a	Refactor TTSDataset ⚡️	2022-02-25 10:57:35 +01:00
Eren Gölge	4597d4e5b6	Remove get_characters from BaseTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	2d8ce98d2a	Update imports for symbols -> characters	2022-02-25 10:48:03 +01:00
Eren Gölge	9a95e15483	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 10:48:03 +01:00
Eren Gölge	d0eb642d88	Refactor synthesis.py for TTSTokenizer	2022-02-25 10:48:03 +01:00
Eren Gölge	04202da1ac	Make style	2022-02-25 10:48:03 +01:00
Eren Gölge	3b63d713b9	Fix espeak wrapper cmd call	2022-02-25 10:48:03 +01:00
Eren Gölge	4894998e6b	Fix print_logs	2022-02-25 10:48:03 +01:00
Eren Gölge	4e8f9d6f10	Fix IPAPhonemes init_from_config	2022-02-25 10:48:03 +01:00
Eren Gölge	0fe39166fe	Discard OOV chars in tokenizer Discard but store OOV chars with a warninig message when the OOV char first recognized	2022-02-25 10:48:03 +01:00
Eren Gölge	c39aaafbfc	Update EspeakWrapper for espeak-ng	2022-02-25 10:48:03 +01:00
Eren Gölge	bb389479a4	Update setup_model for TTS.tts models	2022-02-25 10:48:03 +01:00
Eren Gölge	3eca5ad060	Update config fields for phonemizer	2022-02-25 10:48:03 +01:00
Eren Gölge	d2525abe8c	Remove get_characters from BaseTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	73d27ebd45	Fix GlowTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	87bf940676	Print duplicate characters	2022-02-25 10:48:03 +01:00
Eren Gölge	3de9f38d16	Add init_from_config to SpeakerManager	2022-02-25 10:48:03 +01:00
Eren Gölge	d8ec7086b6	Update `synthesis` for the new API	2022-02-25 10:48:03 +01:00
Eren Gölge	4e83bf3968	Allow choosing phonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	22f0c58fe1	Print language codes	2022-02-25 10:48:02 +01:00
Eren Gölge	693fb4dd39	Modify init_from_config for IPAPhonemes	2022-02-25 10:48:02 +01:00
Eren Gölge	ba3b60c90f	Test TTSTokenizer	2022-02-25 10:48:02 +01:00
Eren Gölge	79a84410f2	Test punctuations	2022-02-25 10:48:02 +01:00
Eren Gölge	d8bdeb8b8f	Fix Punctuation	2022-02-25 10:48:02 +01:00
Eren Gölge	ff7c385838	Fix BasePhonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	10d435ce77	Fixup	2022-02-25 10:48:02 +01:00
Eren Gölge	f0655bfffc	Fix ja_jp_phonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	20e5dd3678	Add doc examples	2022-02-25 10:48:02 +01:00
Eren Gölge	fbad17e084	Update imports for symbols -> characters	2022-02-25 10:48:02 +01:00
Eren Gölge	a1df4f9887	Test character classes	2022-02-25 10:45:24 +01:00
Eren Gölge	bd461ace33	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 10:45:24 +01:00
Eren Gölge	5a9653978a	Refactor synthesis.py for TTSTokenizer	2022-02-25 10:45:24 +01:00
Eren Gölge	e5785b34b0	Style fix	2022-02-25 10:27:46 +01:00
Eren Gölge	e4049aa31a	Refactor TTSDataset to use TTSTokenizer	2022-02-25 10:27:46 +01:00
Eren Gölge	2480bbe937	Remove OLD TOKENIZATION ROUTINES	2022-02-25 09:32:54 +01:00
Eren Gölge	8d85af84cd	Implement Punctuation class	2022-02-25 09:32:54 +01:00
Eren Gölge	1aca58afaf	Fix imports in cleaners.py	2022-02-25 09:32:54 +01:00
Eren Gölge	0344645e90	Implement TTSTokenizer	2022-02-25 09:32:54 +01:00
Eren Gölge	2fb1f70503	Implement BaseCharacters, IPAPhonemes, Graphemes	2022-02-25 09:32:54 +01:00
Eren Gölge	1bee40af40	Create language folders under `TTS.tts.utils.text`	2022-02-25 09:32:54 +01:00
Eren Gölge	c1119bc291	Implement BasePhonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	dcd01356e0	Create `text/english` folder	2022-02-25 09:32:54 +01:00
Eren Gölge	80867c8e8c	Implement multi-phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	5e4f78add3	Implement espeak wrapper	2022-02-25 09:32:54 +01:00
Eren Gölge	e03a05c816	Implement gruut wrapper	2022-02-25 09:32:54 +01:00
Eren Gölge	172ba0c5e7	Implement JA_JP phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	ca02b82218	Implement ZH_CH phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	a51b031bff	Merge branch 'dev' into dev-fix-glowtts-infer	2022-02-21 12:01:40 +03:00
Edresson Casanova	28a7464975	Fix the bug in split dataset function (#1251) * Fix the bug in split_dataset * Make eval_split_size configurable * Change test_loader to use load_tts_samples function * Change eval_split_portion to eval_split_size and permits to set the absolute number of samples in eval * Fix samplers unit test * Add data unit test on GitHub workflow	2022-02-21 11:59:36 +03:00
Edresson Casanova	ba6e56e01c	Fix Glow-TTS multi-speaker inference	2022-02-18 19:25:29 +00:00
Eren Gölge	127118c637	Update TTS.tts formatters (#1228) * Return Dict from tts formatters * Make style	2022-02-11 23:03:43 +01:00
Edresson Casanova	0860d73cf8	Remove Tensorflow requeriment (#1225) * Remove TF modules * Remove TF unit tests * Remove TF vocoder modules * Remove TF convert scripts * Remove TF requirement * Remove the Docs TF instructions * Remove TF inference support	2022-02-10 16:14:54 +01:00
WeberJulian	e778bad626	Add argument to enable dp speaker conditioning	2022-01-06 15:07:27 +01:00
WeberJulian	e1accb6e28	Fix train_tts.py and uncomment code (#1051) * Fix SE loading and language embedding logic * remove trailing white space * Uncomment resmapling code for SCL	2022-01-03 17:44:57 +01:00
Eren Gölge	d724984be1	Fix language assignment	2022-01-02 11:11:24 +00:00
WeberJulian	a63998c048	Fix phoneme language	2022-01-01 21:08:13 +01:00
Eren Gölge	36cef5966b	Fix resnet speaker encoder	2021-12-30 15:36:35 +00:00
Eren Gölge	348b5c96a2	Fix speaker encoder test	2021-12-30 15:36:35 +00:00
Eren Gölge	7129b04d46	Update VITS model	2021-12-30 14:08:17 +00:00
Eren Gölge	5c5ddd2ba7	Init speaker manager for speaker encoder	2021-12-22 15:51:53 +00:00
Eren Gölge	a25269d897	Remove commented code	2021-12-20 11:54:10 +00:00
Eren Gölge	d29c3780d1	Use speaker_encoder from speaker manager in Vits	2021-12-20 11:54:10 +00:00
Eren Gölge	79de38ca76	Rename setup_model to setup_speaker_encoder_model	2021-12-20 11:54:10 +00:00
Eren Gölge	649dc9e9da	Remove redundant code	2021-12-20 11:54:10 +00:00
Eren Gölge	704dddcffa	Make style	2021-12-20 11:54:10 +00:00
WeberJulian	a564eb9f54	Add support for multi-lingual models in CLI	2021-12-20 11:54:10 +00:00
WeberJulian	2bbcb558dc	Prevent weighted sampler use when num_gpus > 1	2021-12-20 11:54:10 +00:00
WeberJulian	74cedfac38	Revert init multispeaker change	2021-12-20 11:54:10 +00:00
WeberJulian	9cfbacc622	Fix trailing space	2021-12-20 11:54:10 +00:00
WeberJulian	6b03943526	Move multilingual logic out of the trainer	2021-12-20 11:54:10 +00:00
Edresson	67dda0abe1	Add the SCL resample TODO	2021-12-20 11:54:10 +00:00
WeberJulian	8b52fb89d1	Fix merge bug	2021-12-20 11:54:10 +00:00
WeberJulian	09eda31a3f	Fix tests	2021-12-20 11:54:10 +00:00
Edresson	78a23e19df	Fix pylint checks	2021-12-20 11:54:10 +00:00
WeberJulian	4cd0e4eb0d	Remove self.audio_config from VITS	2021-12-20 11:54:10 +00:00
Edresson	d39200e69b	Remove torchaudio requeriment	2021-12-20 11:54:10 +00:00
WeberJulian	2e516869a1	Fix trailing whitespace	2021-12-20 11:54:10 +00:00
WeberJulian	ffc269eaf4	Update docstring	2021-12-20 11:54:10 +00:00
Edresson	12968532fe	Add the language embedding dim in the duration predictor class	2021-12-20 11:54:10 +00:00
Edresson	90eac13bb2	Rename ununsed_speakers to ignored_speakers	2021-12-20 11:54:10 +00:00
Edresson	f34596d957	Fix function name	2021-12-20 11:54:10 +00:00
Edresson	45d0b04179	Lint fixs	2021-12-20 11:54:10 +00:00
Edresson	b769b49e34	Remove the data from the set_d_vectors_from_file function	2021-12-20 11:54:10 +00:00
Edresson	9daa33d1fd	Remove unusable speaker manager function	2021-12-20 11:54:10 +00:00
Edresson	8c22d5ac49	Turn more clear the VITS loss function	2021-12-20 11:54:10 +00:00
Edresson	6fc3b9e679	Remove the unusable fine-tuning model	2021-12-20 11:54:10 +00:00
WeberJulian	631addf33b	fix d-vector	2021-12-20 11:54:10 +00:00
WeberJulian	da6c1e858c	Fix small issues	2021-12-20 11:54:10 +00:00
WeberJulian	e8af6a9f08	Fix use_speaker_embedding logic	2021-12-20 11:54:10 +00:00
WeberJulian	120332d53f	Fix phonemes	2021-12-20 11:54:10 +00:00
WeberJulian	1340938159	fix phonemes per language	2021-12-20 11:54:10 +00:00
WeberJulian	e995a63bd6	fix linter	2021-12-20 11:54:10 +00:00
WeberJulian	1472b6df49	make style	2021-12-20 11:54:10 +00:00
WeberJulian	4d721bcabd	fix test sentence synthesis	2021-12-20 11:54:10 +00:00
WeberJulian	0804806727	fix f0_cache_path in dataset	2021-12-20 11:54:10 +00:00
WeberJulian	3b5592abcf	fix test vits	2021-12-20 11:54:10 +00:00
WeberJulian	2a2b5767c2	fix collate_fn	2021-12-20 11:54:10 +00:00
Julian WEBER	78c2d12a91	PitchExtractor	2021-12-20 11:54:10 +00:00
Julian WEBER	9a2f91327c	get_aux_input	2021-12-20 11:54:10 +00:00
Julian WEBER	b3abd01793	Merge dataset	2021-12-20 11:54:10 +00:00
Edresson	1bd1a0546b	Add audio resample in the speaker consistency loss	2021-12-20 11:54:10 +00:00
Edresson	1c6bcda950	Add freeze vocoder generator and flow-based decoder option	2021-12-20 11:54:10 +00:00
WeberJulian	2b952d8b97	freeze vits parts	2021-12-20 11:54:10 +00:00
WeberJulian	005bba60b0	get_speaker_weighted_sampler	2021-12-20 11:54:10 +00:00
Edresson	9de4539422	Update the VITS model docs	2021-12-20 11:54:10 +00:00
Edresson	eeb8ac07d9	Add voice conversion fine tuning mode	2021-12-20 11:54:10 +00:00
Edresson	690b37d0ab	Add support to use the speaker encoder as loss function in VITS model	2021-12-20 11:54:09 +00:00
Edresson	9b011b1cb3	Add H/ASP original checkpoint support	2021-12-20 11:54:09 +00:00
Edresson	de78556655	Fix the optimizer parameters bug in multilingual and multispeaker training	2021-12-20 11:54:09 +00:00
Edresson	9be5b75da3	Fix bug after merge	2021-12-20 11:54:09 +00:00
Edresson	76251b619a	Fix d-vector multispeaker training bug	2021-12-20 11:54:09 +00:00
Edresson	7ef3ddc6ff	Fix unit tests	2021-12-20 11:54:09 +00:00
Edresson	36dcd11453	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	c53693c155	Implement vocoder Fine Tuning like SC-GlowTTS paper	2021-12-20 11:54:09 +00:00
Edresson	f1f016314e	Fix the bug in M-AILABS formatter	2021-12-20 11:54:09 +00:00
Edresson	c334d39acc	Add voice conversion support for the model VITS trained with external speaker embedding	2021-12-20 11:54:09 +00:00
Edresson	e997889ba8	Fix bug in VITS multilingual inference	2021-12-20 11:54:09 +00:00
Edresson	7c0b8ec572	Fix bugs in the non-multilingual VITS inference	2021-12-20 11:54:09 +00:00
Edresson	3fbbebd74d	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	ac9416fb86	Add multilingual inference support	2021-12-20 11:54:09 +00:00
Edresson	dcb2374bc9	Add multilingual training support to the VITS model	2021-12-20 11:54:09 +00:00
Edresson	f996afedb0	Implement multilingual dataloader support	2021-12-20 11:54:09 +00:00
Edresson	5f1c18187f	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	d91c595c5a	Implement training support with d_vecs in the VITS model	2021-12-20 11:54:09 +00:00
Edresson	6a7db67a91	Allow ignore speakers for all multispeaker datasets	2021-12-20 11:54:09 +00:00
Edresson	e0ad838066	Select randomly a speaker from the speaker manager for the test setences	2021-12-20 11:54:09 +00:00
Edresson	eb3e8affe1	Save speakers embeddings/ids before starting training	2021-12-20 11:54:09 +00:00
Eren Gölge	babdd84f91	Fix GST inference commit d3e477875a7e46a101fcf95a1794442823750fe2 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit fdece14585ab5a36eed1061a9a838d8e48aa6882 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit 908ce39370eadcc9fa8510cdb26c9ead87305427 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:49:37 2021 +0100 Make trim_db value negative commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:22:24 2021 +0100 Set find_endpoint db threshold in config.json	2021-12-07 13:28:49 +00:00
Eren Gölge	2ed9e3c241	Fix constant use of noise augment	2021-11-08 09:20:34 +01:00
Eren Gölge	b6b14a76af	Fix VITS stochastic duration predictor	2021-11-08 09:20:11 +01:00
Eren Gölge	faafea4cf2	Fix style	2021-11-04 17:04:40 +01:00
Eren Gölge	c5077c6c3f	Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev	2021-11-01 16:42:27 +01:00
Eren Gölge	20cebde1c9	Add docstring to MAI labs formatter	2021-11-01 16:41:55 +01:00
Eren Gölge	608f437545	Add a function to find unique chars	2021-11-01 16:41:33 +01:00
Eren Gölge	d6d780e758	Fix FastSpeech config	2021-11-01 16:41:15 +01:00
Michael Hansen	3bc043faeb	Upgrade to gruut 2.0 (#882)	2021-10-31 11:41:55 +01:00
Eren Gölge	2df0752e73	Model zoo tests (#900) * Fix VITS model multi-speaker init * Remove gdrive support in model manager * Add model zoo tests	2021-10-29 17:54:16 +02:00
Eren Gölge	035ed432bc	Doc update (#889) * Link source files from the docs * Update glowTTS recipes for docs * Add dataset downloaders	2021-10-26 17:41:33 +02:00
Eren Gölge	0cac3f330a	Enable custom formatter in load_tts_samples	2021-10-26 13:07:11 +02:00
Eren Gölge	00becf2671	Fix import statements	2021-10-25 19:29:16 +02:00
Eren Gölge	2b7d159383	Update BaseTTS for multi-speaker training	2021-10-21 16:29:06 +00:00
Eren Gölge	e62d3c5cf7	Use absolute imports for tts configs and models	2021-10-21 16:29:06 +00:00
Eren Gölge	82fed4add2	Make style	2021-10-21 16:05:51 +00:00

759 Commits