coqui-tts

Commit Graph

Author	SHA1	Message	Date
Eren Gölge	32065139e7	Simple text cleaner for "hi"	2023-11-24 15:14:34 +01:00
Eren Gölge	6dd43b0ce2	Update to XTTS v2.0.3	2023-11-24 14:36:04 +01:00
Julian Weber	a55755c8df	update deepspeed version (#3281)	2023-11-24 12:35:49 +01:00
Kaszanas	1bf5926196	Introducing Development Dockerfile (#3263 ) * Moved Dockerfile, COPY at the end This change should prevent re-installation of the dependencies upon every change of the repository's contents. Typically if Docker detects that something changed in a layer, all downstream layers are invalidated and rebuilt. * Moved Dockerfile back to main directory Main dockerfile in a separate directory can cause issues with the current CI/CD setup. This can be a good change for later. * Introduced Dockerfile.dev, updated CONTRIBUTING Dockerfile.dev can be used as a separate development environment for anyone that does not wish to install the dependencies locally.	2023-11-24 12:30:15 +01:00
TITC	4d0f53d2ee	Misjudgment of `is_multi_lingual` When Loading Multilingual Model via `model_path` (#3273 ) * load multilingual model by path * use config to assert multi lingual or not	2023-11-24 12:28:31 +01:00
Enno Hermann	8c5227ed84	Fix tts_with_vc (#3275 ) * Revert "fix for issue 3067" This reverts commit `041b4b6723`. Fixes #3143. The original issue (#3067) was people trying to use tts.tts_with_vc_to_file() with XTTS and was "fixed" in #3109. But XTTS has integrated VC and you can just do tts.tts_to_file(..., speaker_wav="..."), there is no point in passing it through FreeVC afterwards. So, reverting this commit because it breaks tts.tts_with_vc_to_file() for any model that doesn't have integrated VC, i.e. all models this method is meant for. * fix: support multi-speaker models in tts_with_vc/tts_with_vc_to_file * fix: only compute spk embeddings for models that support it Fixes #1440. Passing a `speaker_wav` argument to regular Vits models failed because they don't support voice cloning. Now that argument is simply ignored.	2023-11-24 12:26:37 +01:00
Enno Hermann	2af0220996	fix: don't pass quotes to espeak (#3286 ) Previously, the text was wrapped in an additional set of quotes that was passed to Espeak. This could result in different phonemization in certain edges and caused the insertion of an initial separator "_" that had to be removed. Compare: $ espeak-ng -q -b 1 -v en-us --ipa=1 '"A"' _ˈɐ $ espeak-ng -q -b 1 -v en-us --ipa=1 'A' ˈeɪ Fixes #2619	2023-11-24 12:25:37 +01:00
Enno Hermann	4a2684be34	fix(bin.synthesize): more informative error for wrong --language argument (#3294 ) In multilingual models, the target language is specified via the `--language_idx` argument. However, the `tts` CLI also accepts a `--language` argument for use with Coqui Studio, so it is easy to choose the wrong one, resulting in the following confusing error at synthesis time: ``` AssertionError: ❗ Language None is not supported. Supported languages are ['en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar', 'zh-cn', 'hu', 'ko', 'ja'] ``` This commit adds a better error message when `--language` is passed for a non-studio model. Fixes #3270, fixes #3291	2023-11-24 12:24:42 +01:00
Tessa Painter	64f391b583	Made the tqdm `progress_bar` objects of static download methods a static class variable (#3297)	2023-11-24 12:23:59 +01:00
Eren Gölge	b47d9c6e36	Merge pull request #3243 from idiap/checkpoints Remove duplicate/unused code	2023-11-22 23:52:06 +01:00
Eren Gölge	29dede20d3	Merge pull request #3249 from coqui-ai/run_ci_for_v0.20.6 Run CI for v0.20.6	2023-11-17 15:45:26 +01:00
Eren Gölge	c011ab7455	Update to v0.20.6	2023-11-17 15:16:32 +01:00
Eren Gölge	52cb1e2f68	Update model hash for v2.0.2	2023-11-17 15:16:32 +01:00
Edresson Casanova	6075fa208c	Ensures that only GPT model is in training mode during XTTS GPT training (#3241 ) * Ensures that only GPT model is in training mode during training * Fix parallel wavegan unit test	2023-11-17 15:15:22 +01:00
Eren Gölge	a3279f9294	Make style	2023-11-17 15:15:22 +01:00
Eren Gölge	f21067a84a	Make k_diffusion optional	2023-11-17 15:15:21 +01:00
Eren Gölge	44494daa27	Update CI version	2023-11-17 15:15:21 +01:00
Eren Gölge	c864acf2b7	Update versions	2023-11-17 15:15:21 +01:00
Edresson Casanova	11283fce07	Ensures that only GPT model is in training mode during XTTS GPT training (#3241 ) * Ensures that only GPT model is in training mode during training * Fix parallel wavegan unit test	2023-11-17 15:13:46 +01:00
Eren Gölge	14579a4607	Merge pull request #3248 from coqui-ai/slacker_deps Update versions	2023-11-17 15:13:19 +01:00
Eren Gölge	44880f09ed	Make style	2023-11-17 13:43:34 +01:00
Eren Gölge	26efdf6ee7	Make k_diffusion optional	2023-11-17 13:42:33 +01:00
Eren Gölge	08d11e9198	Update CI version	2023-11-17 13:01:32 +01:00
Eren Gölge	63d7145647	Update versions	2023-11-17 12:10:46 +01:00
Enno Hermann	0fb0d67de7	refactor: use save_checkpoint()/save_best_model() from Trainer	2023-11-17 01:18:23 +01:00
Enno Hermann	96678c7ba2	refactor: use copy_model_files() from Trainer	2023-11-17 01:18:23 +01:00
Enno Hermann	5119e651a1	chore(utils.io): remove unused code These are all available in Trainer.	2023-11-17 01:18:23 +01:00
Enno Hermann	39fe38bda4	refactor: use save_fsspec() from Trainer	2023-11-17 01:18:23 +01:00
Enno Hermann	fdf0c8b10a	chore(encoder): remove unused code	2023-11-17 01:18:23 +01:00
Eren Gölge	7e4375da2b	Update to v0.20.6	2023-11-16 17:52:13 +01:00
Julian Weber	fbc18b8c34	Fix zh bug (#3238)	2023-11-16 17:51:37 +01:00
Julian Weber	675f983550	Add sentence splitting (#3227 ) * Add sentence spliting * update requirements * update default args v2 * Add spanish * Fix return gpt_latents * Update requirements * Fix requirements	2023-11-16 11:01:11 +01:00
Enno Hermann	3c2d5a9e03	Remove duplicate AudioProcessor code and fix ExtractTTSpectrogram.ipynb (#3230 ) * chore: remove unused argument * refactor(audio.processor): remove duplicate stft+griffin_lim * chore(audio.processor): remove unused compute_stft_paddings Same function available in numpy_transforms * refactor(audio.processor): remove duplicate db_to_amp * refactor(audio.processor): remove duplicate amp_to_db * refactor(audio.processor): remove duplicate linear_to_mel * refactor(audio.processor): remove duplicate mel_to_linear * refactor(audio.processor): remove duplicate build_mel_basis * refactor(audio.processor): remove duplicate stft_parameters * refactor(audio.processor): use pre-/deemphasis from numpy_transforms * refactor(audio.processor): use rms_volume_norm from numpy_transforms * chore(audio.processor): remove duplicate assert Already checked in numpy_transforms.compute_f0 * refactor(audio.processor): use find_endpoint from numpy_transforms * refactor(audio.processor): use trim_silence from numpy_transforms * refactor(audio.processor): use volume_norm from numpy_transforms * refactor(audio.processor): use load_wav from numpy_transforms * fix(bin.extract_tts_spectrograms): set quantization bits * fix(ExtractTTSpectrogram.ipynb): adapt to current TTS code Fixes #2447, #2574 * refactor(audio.processor): remove duplicate quantization methods	2023-11-16 10:57:06 +01:00
Eren Gölge	88630c60e5	Update to v0.20.5	2023-11-15 14:02:51 +01:00
Edresson Casanova	73a5bd08c0	Fix XTTS GPT padding and inference issues (#3216 ) * Fix end artifact for fine tuning models * Bug fix on zh-cn inference * Remove ununsed code	2023-11-15 14:02:05 +01:00
Ikko Eltociear Ashimine	15f0ac57d6	Update README.md (#3215 ) Dicord -> Discord	2023-11-15 13:59:56 +01:00
Julian Weber	04901fb2e4	Add speed control for inference (#3214 ) * Add speed control for inference * Fix XTTS tests * Add speed control tests	2023-11-14 16:07:17 +01:00
Eren Gölge	d96f3885d5	Update to v0.20.4	2023-11-13 17:07:25 +01:00
Eren Gölge	ac3df409a6	Merge pull request #3208 from coqui-ai/fix_max_mel_len fix max generation length for XTTS	2023-11-13 14:32:56 +01:00
Eren Gölge	f32a465711	Merge pull request #3207 from coqui-ai/update_xtts_cloning Update XTTS cloning	2023-11-13 14:32:43 +01:00
Eren Gölge	92fa988aec	Fixup	2023-11-13 13:44:06 +01:00
WeberJulian	b85536b23f	fix max generation length	2023-11-13 13:18:45 +01:00
Eren Gölge	b2682d39c5	Make style	2023-11-13 13:01:01 +01:00
Eren Gölge	a16360af85	Implement chunking gpt_cond	2023-11-13 13:00:08 +01:00
Eren Gölge	6f1cba2f81	Update to v0.20.3	2023-11-09 17:41:37 +01:00
Enno Hermann	3b1e7038bc	fix(formatters): set missing root_path attribute (#3182) Fixes #2778	2023-11-09 16:49:52 +01:00
Aarni Koskela	a8e9163fb3	xtts/tokenizer: merge duplicate implementations of preprocess_text (#3170 ) This was found via ruff: > F811 Redefinition of unused `preprocess_text` from line 570	2023-11-09 16:32:12 +01:00
Matthew Boakes	1b9c400bca	PyTorch 2.1 Updates (Weight Norm and TorchAudio I/O) (#3176 ) * Replaced PyTorch weight_norm With parametrizations.weight_norm * TorchAudio: Migrating The I/O Functions To Use The Dispatcher Mechanism * Corrected Code Style --------- Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-11-09 16:31:03 +01:00
Gorkem	66a1e248d0	torchaudio should use proper backend to load audio (#3179)	2023-11-09 16:28:39 +01:00
Eren Gölge	46d9c27212	Update to v0.20.2	2023-11-08 16:07:56 +01:00

4598 Commits