coqui-tts

Commit Graph

Author	SHA1	Message	Date
Aarni Koskela	90991e89b4	Ruff autofix unused imports and import order	2023-12-13 14:56:41 +02:00
Aarni Koskela	72ac2bfa09	Get rid of some star imports	2023-12-13 14:56:41 +02:00
Eren Gölge	fa28f99f15	Update to v0.22.0	2023-12-12 16:10:46 +01:00
Eren Gölge	8c1a8b522b	Merge pull request #3405 from coqui-ai/studio_speakers Add studio speakers to open source XTTS!	2023-12-12 16:10:09 +01:00
Enno Hermann	9f325b1f6c	fixup! Fix aux unit tests	2023-12-12 16:07:16 +01:00
Edresson Casanova	fc099218df	Fix aux unit tests	2023-12-12 16:07:16 +01:00
Eren Gölge	934b87bbd1	Merge pull request #3391 from aaron-lii/multi-gpu support multiple GPU training for XTTS	2023-12-12 13:51:26 +01:00
Eren Gölge	8e6a7cbfbf	Update .models.json	2023-12-12 13:50:01 +01:00
Eren Gölge	4dc0722bbc	Update .models.json	2023-12-12 13:28:16 +01:00
WeberJulian	61b67ef16f	Fix read_json_with_comments	2023-12-11 23:58:52 +01:00
WeberJulian	d47b6df4e5	Make comments in .model.json valid	2023-12-11 23:35:27 +01:00
WeberJulian	b40750d1f5	Remove models that require app.coqui.ai	2023-12-11 23:17:54 +01:00
WeberJulian	5ab228dff2	Fix CI	2023-12-11 22:31:53 +01:00
WeberJulian	8c20a599d8	Remove coqui studio integration from TTS	2023-12-11 22:11:46 +01:00
WeberJulian	5cd750ac7e	Fix API and CI	2023-12-11 20:21:53 +01:00
WeberJulian	e3c9dab7a3	Make CLI work	2023-12-11 18:49:18 +01:00
WeberJulian	0a90359a42	rename speaker file	2023-12-11 18:48:49 +01:00
WeberJulian	a5c0d9780f	rename manager	2023-12-11 18:48:31 +01:00
WeberJulian	36143fee26	Add basic speaker manager	2023-12-11 15:25:46 +01:00
Frederico S. Oliveira	f9117918fe	Update .models.json	2023-12-11 10:47:31 -03:00
Frederico S. Oliveira	163f9a3fdf	Merge branch 'coqui-ai:dev' into dev	2023-12-11 10:04:07 -03:00
WeberJulian	0a136a8535	Download speaker file	2023-12-11 11:29:36 +01:00
Aaron-Li	b6e929696a	support multiple GPU training	2023-12-08 16:55:32 +08:00
Josh Meyer	759d9ab3ae	Print message for either commercial license or CPML	2023-12-07 13:54:48 +01:00
Eren Gölge	e49c512d99	Merge pull request #3351 from aaron-lii/chinese-puncs fix pause problem of Chinese speech	2023-12-04 15:57:42 +01:00
Eren Gölge	2d02015978	Update to v0.21.3	2023-12-01 23:52:57 +01:00
Edresson Casanova	5f900f156a	Add XTTS Fine tuning gradio demo (#3296 ) * Add XTTS FT demo data processing pipeline * Add training and inference columns * Uses tabs instead of columns * Fix demo freezing issue * Update demo * Convert stereo to mono * Bug fix on XTTS inference * Update gradio demo * Update gradio demo * Update gradio demo * Update gradio demo * Add parameters to be able to set then on colab demo * Add erros messages * Add intuitive error messages * Update * Add max_audio_length parameter * Add XTTS fine-tuner docs * Update XTTS finetuner docs * Delete trainer to freeze memory * Delete unused variables * Add gc.collect() * Update xtts.md --------- Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-12-01 23:52:23 +01:00
Aaron-Li	7b8808186a	fix pause problem of Chinese speech	2023-12-01 23:30:03 +08:00
Frederico S. Oliveira	bcd500fa7b	Fixing bug Correction in training the Fastspeech/Fastspeech2/FastPitch/SpeedySpeech model using external speaker embedding.	2023-11-30 17:27:05 -03:00
Frederico S. Oliveira	a26e51b0b4	Merge branch 'coqui-ai:dev' into dev	2023-11-30 14:19:05 -03:00
Eren Gölge	6d1905c2b7	Update to v0.21.2	2023-11-30 13:05:10 +01:00
Enno Hermann	39321d02be	fix: correctly strip/restore initial punctuation (#3336 ) * refactor(punctuation): remove orphan code for handling lone punctuation The case of lone punctuation is already handled at the top of restore(). The removed if statement would never be called and would in fact raise an AttributeError because the _punc_index named tuple doesn't have the attribute `mark`. * refactor(punctuation): remove unused argument * fix(punctuation): correctly handle initial punctuation Stripping and restoring initial punctuation didn't work correctly because the string-splitting caused an additional empty string to be inserted in the text list (because `".A".split(".")` => `["", "A"]`). Now, an initial empty string is skipped and relevant test cases are added. Fixes #3333	2023-11-30 13:03:16 +01:00
Frederico S. Oliveira	77c2155609	Merge pull request #1 from coqui-ai/dev Update	2023-11-29 17:24:02 -03:00
Eren Gölge	bfbaffc84a	Fixup	2023-11-28 13:47:45 +01:00
Eren Gölge	b75e90ba85	Make text splitting optional	2023-11-27 14:53:11 +01:00
Eren Gölge	3b8894a3dd	Make style	2023-11-27 14:15:50 +01:00
Eren Gölge	2fd8cf3d94	Make xtts runnable by version names	2023-11-27 14:15:16 +01:00
Eren Gölge	11ec9f7471	Add hi in config defaults	2023-11-24 15:38:36 +01:00
Eren Gölge	00a870c26a	Update to v0.21.1	2023-11-24 15:15:44 +01:00
Eren Gölge	7e575068c9	Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev	2023-11-24 15:15:19 +01:00
Eren Gölge	32065139e7	Simple text cleaner for "hi"	2023-11-24 15:14:34 +01:00
Eren Gölge	1542a50c3a	Update to v0.21.0	2023-11-24 14:37:05 +01:00
Eren Gölge	6dd43b0ce2	Update to XTTS v2.0.3	2023-11-24 14:36:04 +01:00
TITC	4d0f53d2ee	Misjudgment of `is_multi_lingual` When Loading Multilingual Model via `model_path` (#3273 ) * load multilingual model by path * use config to assert multi lingual or not	2023-11-24 12:28:31 +01:00
Enno Hermann	8c5227ed84	Fix tts_with_vc (#3275 ) * Revert "fix for issue 3067" This reverts commit `041b4b6723`. Fixes #3143. The original issue (#3067) was people trying to use tts.tts_with_vc_to_file() with XTTS and was "fixed" in #3109. But XTTS has integrated VC and you can just do tts.tts_to_file(..., speaker_wav="..."), there is no point in passing it through FreeVC afterwards. So, reverting this commit because it breaks tts.tts_with_vc_to_file() for any model that doesn't have integrated VC, i.e. all models this method is meant for. * fix: support multi-speaker models in tts_with_vc/tts_with_vc_to_file * fix: only compute spk embeddings for models that support it Fixes #1440. Passing a `speaker_wav` argument to regular Vits models failed because they don't support voice cloning. Now that argument is simply ignored.	2023-11-24 12:26:37 +01:00
Enno Hermann	2af0220996	fix: don't pass quotes to espeak (#3286 ) Previously, the text was wrapped in an additional set of quotes that was passed to Espeak. This could result in different phonemization in certain edges and caused the insertion of an initial separator "_" that had to be removed. Compare: $ espeak-ng -q -b 1 -v en-us --ipa=1 '"A"' _ˈɐ $ espeak-ng -q -b 1 -v en-us --ipa=1 'A' ˈeɪ Fixes #2619	2023-11-24 12:25:37 +01:00
Enno Hermann	4a2684be34	fix(bin.synthesize): more informative error for wrong --language argument (#3294 ) In multilingual models, the target language is specified via the `--language_idx` argument. However, the `tts` CLI also accepts a `--language` argument for use with Coqui Studio, so it is easy to choose the wrong one, resulting in the following confusing error at synthesis time: ``` AssertionError: ❗ Language None is not supported. Supported languages are ['en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar', 'zh-cn', 'hu', 'ko', 'ja'] ``` This commit adds a better error message when `--language` is passed for a non-studio model. Fixes #3270, fixes #3291	2023-11-24 12:24:42 +01:00
Tessa Painter	64f391b583	Made the tqdm `progress_bar` objects of static download methods a static class variable (#3297 )	2023-11-24 12:23:59 +01:00
Eren Gölge	b47d9c6e36	Merge pull request #3243 from idiap/checkpoints Remove duplicate/unused code	2023-11-22 23:52:06 +01:00
Eren Gölge	c011ab7455	Update to v0.20.6	2023-11-17 15:16:32 +01:00
Eren Gölge	52cb1e2f68	Update model hash for v2.0.2	2023-11-17 15:16:32 +01:00
Edresson Casanova	6075fa208c	Ensures that only GPT model is in training mode during XTTS GPT training (#3241 ) * Ensures that only GPT model is in training mode during training * Fix parallel wavegan unit test	2023-11-17 15:15:22 +01:00
Eren Gölge	a3279f9294	Make style	2023-11-17 15:15:22 +01:00
Eren Gölge	f21067a84a	Make k_diffusion optional	2023-11-17 15:15:21 +01:00
Enno Hermann	0fb0d67de7	refactor: use save_checkpoint()/save_best_model() from Trainer	2023-11-17 01:18:23 +01:00
Enno Hermann	96678c7ba2	refactor: use copy_model_files() from Trainer	2023-11-17 01:18:23 +01:00
Enno Hermann	5119e651a1	chore(utils.io): remove unused code These are all available in Trainer.	2023-11-17 01:18:23 +01:00
Enno Hermann	39fe38bda4	refactor: use save_fsspec() from Trainer	2023-11-17 01:18:23 +01:00
Enno Hermann	fdf0c8b10a	chore(encoder): remove unused code	2023-11-17 01:18:23 +01:00
Eren Gölge	7e4375da2b	Update to v0.20.6	2023-11-16 17:52:13 +01:00
Julian Weber	fbc18b8c34	Fix zh bug (#3238 )	2023-11-16 17:51:37 +01:00
Julian Weber	675f983550	Add sentence splitting (#3227 ) * Add sentence spliting * update requirements * update default args v2 * Add spanish * Fix return gpt_latents * Update requirements * Fix requirements	2023-11-16 11:01:11 +01:00
Enno Hermann	3c2d5a9e03	Remove duplicate AudioProcessor code and fix ExtractTTSpectrogram.ipynb (#3230 ) * chore: remove unused argument * refactor(audio.processor): remove duplicate stft+griffin_lim * chore(audio.processor): remove unused compute_stft_paddings Same function available in numpy_transforms * refactor(audio.processor): remove duplicate db_to_amp * refactor(audio.processor): remove duplicate amp_to_db * refactor(audio.processor): remove duplicate linear_to_mel * refactor(audio.processor): remove duplicate mel_to_linear * refactor(audio.processor): remove duplicate build_mel_basis * refactor(audio.processor): remove duplicate stft_parameters * refactor(audio.processor): use pre-/deemphasis from numpy_transforms * refactor(audio.processor): use rms_volume_norm from numpy_transforms * chore(audio.processor): remove duplicate assert Already checked in numpy_transforms.compute_f0 * refactor(audio.processor): use find_endpoint from numpy_transforms * refactor(audio.processor): use trim_silence from numpy_transforms * refactor(audio.processor): use volume_norm from numpy_transforms * refactor(audio.processor): use load_wav from numpy_transforms * fix(bin.extract_tts_spectrograms): set quantization bits * fix(ExtractTTSpectrogram.ipynb): adapt to current TTS code Fixes #2447, #2574 * refactor(audio.processor): remove duplicate quantization methods	2023-11-16 10:57:06 +01:00
Eren Gölge	88630c60e5	Update to v0.20.5	2023-11-15 14:02:51 +01:00
Edresson Casanova	73a5bd08c0	Fix XTTS GPT padding and inference issues (#3216 ) * Fix end artifact for fine tuning models * Bug fix on zh-cn inference * Remove ununsed code	2023-11-15 14:02:05 +01:00
Julian Weber	04901fb2e4	Add speed control for inference (#3214 ) * Add speed control for inference * Fix XTTS tests * Add speed control tests	2023-11-14 16:07:17 +01:00
Eren Gölge	d96f3885d5	Update to v0.20.4	2023-11-13 17:07:25 +01:00
Eren Gölge	ac3df409a6	Merge pull request #3208 from coqui-ai/fix_max_mel_len fix max generation length for XTTS	2023-11-13 14:32:56 +01:00
Eren Gölge	92fa988aec	Fixup	2023-11-13 13:44:06 +01:00
WeberJulian	b85536b23f	fix max generation length	2023-11-13 13:18:45 +01:00
Eren Gölge	b2682d39c5	Make style	2023-11-13 13:01:01 +01:00
Eren Gölge	a16360af85	Implement chunking gpt_cond	2023-11-13 13:00:08 +01:00
Eren Gölge	6f1cba2f81	Update to v0.20.3	2023-11-09 17:41:37 +01:00
Enno Hermann	3b1e7038bc	fix(formatters): set missing root_path attribute (#3182 ) Fixes #2778	2023-11-09 16:49:52 +01:00
Aarni Koskela	a8e9163fb3	xtts/tokenizer: merge duplicate implementations of preprocess_text (#3170 ) This was found via ruff: > F811 Redefinition of unused `preprocess_text` from line 570	2023-11-09 16:32:12 +01:00
Matthew Boakes	1b9c400bca	PyTorch 2.1 Updates (Weight Norm and TorchAudio I/O) (#3176 ) * Replaced PyTorch weight_norm With parametrizations.weight_norm * TorchAudio: Migrating The I/O Functions To Use The Dispatcher Mechanism * Corrected Code Style --------- Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-11-09 16:31:03 +01:00
Gorkem	66a1e248d0	torchaudio should use proper backend to load audio (#3179 )	2023-11-09 16:28:39 +01:00
Eren Gölge	46d9c27212	Update to v0.20.2	2023-11-08 16:07:56 +01:00
Julian Weber	03ad90135b	Add lang code in XTTS doc (#3158 ) * Add lang code in XTTS doc * Remove ununsed config and args * update docs * woops	2023-11-08 13:47:33 +01:00
Gorkem	78a596618a	Fix for exception on streaming if last chunk empty (#3160 )	2023-11-08 11:32:02 +01:00
Enno Hermann	99edd6daa3	Fix ModelManager.list_models() (#3128 ) * fix(utils.manage): remove hard-coded model_type variable * refactor(utils.manage): address lint issues, fix typos Addressed the following: TTS/utils/manage.py:307:12: R1705: Unnecessary "else" after "return" (no-else-return) TTS/utils/manage.py:308:21: W1514: Using open without explicitly specifying an encoding (unspecified-encoding) TTS/utils/manage.py:299:4: R1710: Either all return statements in a function should return an expression, or none of them should. (inconsistent-return-statements) TTS/utils/manage.py:299:4: R0201: Method could be a function (no-self-use) TTS/utils/manage.py:314:4: R0201: Method could be a function (no-self-use)	2023-11-08 11:29:01 +01:00
Eren Gölge	77b18126c7	Merge pull request #3126 from akx/freevc-config-module Move FreeVCConfig to TTS.vc.configs (like all other config classes)	2023-11-08 11:24:47 +01:00
Eren Gölge	cc6e9fcaa7	Fix #3153 (#3169 )	2023-11-08 11:13:58 +01:00
Eren Gölge	a24ebcd8a6	Fix coqui api (#3168 )	2023-11-08 10:51:23 +01:00
Julian Weber	ce1a39a9a4	Add char limit warn (#3130 ) * Add char limit warning * Adding v2 langs * cached_property for cutlet * Fix import	2023-11-08 10:24:23 +01:00
Eren Gölge	f846a9f300	Update to v0.20.1	2023-11-07 14:17:36 +01:00
Edresson Casanova	cbdbc44e0f	Fix XTTS v2.0 training recipe (#3154 ) * Fix XTTS v2.0 training recipe * Update XTTS v2 model hash	2023-11-07 14:16:44 +01:00
Edresson Casanova	5f9ab6cfaa	Fix style Co-authored-by: Aarni Koskela <akx@iki.fi>	2023-11-06 19:22:34 -03:00
Edresson Casanova	2470599d18	Drop XTTS v1	2023-11-06 19:12:04 -03:00
Edresson Casanova	13243df526	Update XTTS v1.1 files	2023-11-06 19:10:21 -03:00
Edresson Casanova	09fb317e6d	Remove unused code	2023-11-06 17:36:32 -03:00
Edresson Casanova	b146de4ce8	Bug fix on XTTS v2.0 Trainer	2023-11-06 20:26:01 +01:00
Edresson Casanova	1b6f8d0e46	Update unit tests and recipes	2023-11-06 20:25:06 +01:00
Edresson Casanova	72b2bac0f8	Load reference in 24khz to avoid issued with multiple sr references	2023-11-06 20:25:06 +01:00
Edresson Casanova	00294ffdf6	Update XTTS docs	2023-11-06 20:24:06 +01:00
Edresson Casanova	459ad70dc8	Add support for multiples speaker references on XTTS inference	2023-11-06 20:22:35 +01:00
Eren Gölge	f0cb19ecca	Drop diffusion from XTTS (#3150 ) * Drop diffusion for XTTS * Make style * Drop diffusion deps in code * Restore thrashed	2023-11-06 20:15:49 +01:00
Eren Gölge	5d418bb84a	Update docs	2023-11-06 18:48:41 +01:00
Eren Gölge	9bbf6eb8dd	Drop use_ne_hifigan	2023-11-06 18:43:38 +01:00
Eren Gölge	9d54bd7655	Fixup XTTS	2023-11-06 18:13:58 +01:00


2024 Commits