coqui-tts

Commit Graph

Author	SHA1	Message	Date
Enno Hermann	6a52c8a855	fix(bin): log to stdout in cli tools, unless pipe_out is set This way the outputs are available for further downstream processing, e.g. with grep. For TTS/bin/synthesize.py, if --pipe_out is set, log to stderr because then only the output audio stream should be on stdout, e.g. to pipe it to aplay.	2024-12-17 11:38:39 +01:00
Enno Hermann	0df04cc259	docs: add notes about xtts fine-tuning	2024-12-14 16:19:38 +01:00
Enno Hermann	e38dcbea7a	docs: streamline readme and reuse content in other docs pages [ci skip]	2024-12-12 18:29:23 +01:00
Enno Hermann	e0f621180f	refactor(bin.synthesize): use Python API for CLI	2024-12-06 17:07:54 +01:00
Enno Hermann	5daed879e0	chore(bin.synthesize): remove unused argument	2024-12-05 21:19:07 +01:00
Enno Hermann	546f43cb25	refactor: only use keyword args in Synthesizer	2024-12-02 23:26:27 +01:00
Enno Hermann	63625e79af	refactor: import get_last_checkpoint from trainer.io	2024-11-29 13:59:43 +01:00
Shavit	540e8d6cf2	fix(bin.synthesize): return speakers names only (#147)	2024-11-09 18:35:54 +01:00
Enno Hermann	de35920317	Merge pull request #50 from idiap/umap build: move umap-learn into optional notebook dependencies	2024-07-25 13:26:09 +01:00
Enno Hermann	e869b9b658	refactor: use load_checkpoint from trainer	2024-06-29 15:07:10 +02:00
Enno Hermann	59ef28d708	build: move umap-learn into optional notebook dependencies Except for notebooks, it's only used to show embedding plots during speaker encoder training, in which case a warning is now shown to install it.	2024-06-26 23:53:17 +02:00
Enno Hermann	c5241d71ab	chore: address pytorch deprecations torch.range(a, b) == torch.arange(a, b+1) meshgrid indexing: https://github.com/pytorch/pytorch/issues/50276 checkpoint use_reentrant: https://dev-discuss.pytorch.org/t/bc-breaking-update-to-torch-utils-checkpoint-not-passing-in-use-reentrant-flag-will-raise-an-error/1745 optimizer.step() before scheduler.step(): https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate	2024-06-26 11:38:25 +02:00
Enno Hermann	77722cb0dd	fix(bin.synthesize): correctly handle boolean arguments Previously, e.g. `--use_cuda false` would actually set use_cuda=True: https://github.com/coqui-ai/TTS/discussions/3762	2024-05-31 08:39:32 +02:00
Enno Hermann	7dc5d1eb3d	fix: logging in executables	2024-04-03 15:19:45 +02:00
Enno Hermann	b711e19cb6	refactor: remove verbose arguments Can be handled by adjusting logging levels instead.	2024-04-03 15:19:45 +02:00
Enno Hermann	7630abb43f	refactor(bin.find_unique_chars): use existing function	2024-03-30 22:22:40 +01:00
Enno Hermann	a7753708fb	refactor: remove duplicate methods available in Trainer	2024-03-12 15:06:42 +01:00
Enno Hermann	efdafd5a7f	style: run black	2024-03-07 11:46:51 +01:00
Enno Hermann	24298da5fc	Merge pull request #1 from eginhard/lint-overhaul Lint overhaul (pylint to ruff)	2024-03-06 16:10:26 +01:00
Eren Gölge	55c7063724	Merge pull request #3423 from idiap/fix-aux-tests Fix CI (save best model after 0 steps in tests)	2023-12-14 18:00:30 +01:00
Aarni Koskela	d6ea806469	Run `make style`	2023-12-13 14:56:41 +02:00
Aarni Koskela	4584ef6580	Simplify branch in TTS/bin/synthesize.py	2023-12-13 14:56:41 +02:00
Aarni Koskela	33b69c6c09	Add some noqa directives (for now)	2023-12-13 14:56:41 +02:00
Aarni Koskela	64bb41f4fa	Ruff autofix C41	2023-12-13 14:56:41 +02:00
Aarni Koskela	90991e89b4	Ruff autofix unused imports and import order	2023-12-13 14:56:41 +02:00
Enno Hermann	9f325b1f6c	fixup! Fix aux unit tests	2023-12-12 16:07:16 +01:00
Edresson Casanova	fc099218df	Fix aux unit tests	2023-12-12 16:07:16 +01:00
WeberJulian	5ab228dff2	Fix CI	2023-12-11 22:31:53 +01:00
WeberJulian	8c20a599d8	Remove coqui studio integration from TTS	2023-12-11 22:11:46 +01:00
Enno Hermann	4a2684be34	fix(bin.synthesize): more informative error for wrong --language argument (#3294) In multilingual models, the target language is specified via the `--language_idx` argument. However, the `tts` CLI also accepts a `--language` argument for use with Coqui Studio, so it is easy to choose the wrong one, resulting in the following confusing error at synthesis time: ``` AssertionError: ❗ Language None is not supported. Supported languages are ['en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar', 'zh-cn', 'hu', 'ko', 'ja'] ``` This commit adds a better error message when `--language` is passed for a non-studio model. Fixes #3270, fixes #3291	2023-11-24 12:24:42 +01:00
Enno Hermann	0fb0d67de7	refactor: use save_checkpoint()/save_best_model() from Trainer	2023-11-17 01:18:23 +01:00
Enno Hermann	96678c7ba2	refactor: use copy_model_files() from Trainer	2023-11-17 01:18:23 +01:00
Enno Hermann	3c2d5a9e03	Remove duplicate AudioProcessor code and fix ExtractTTSpectrogram.ipynb (#3230) * chore: remove unused argument * refactor(audio.processor): remove duplicate stft+griffin_lim * chore(audio.processor): remove unused compute_stft_paddings Same function available in numpy_transforms * refactor(audio.processor): remove duplicate db_to_amp * refactor(audio.processor): remove duplicate amp_to_db * refactor(audio.processor): remove duplicate linear_to_mel * refactor(audio.processor): remove duplicate mel_to_linear * refactor(audio.processor): remove duplicate build_mel_basis * refactor(audio.processor): remove duplicate stft_parameters * refactor(audio.processor): use pre-/deemphasis from numpy_transforms * refactor(audio.processor): use rms_volume_norm from numpy_transforms * chore(audio.processor): remove duplicate assert Already checked in numpy_transforms.compute_f0 * refactor(audio.processor): use find_endpoint from numpy_transforms * refactor(audio.processor): use trim_silence from numpy_transforms * refactor(audio.processor): use volume_norm from numpy_transforms * refactor(audio.processor): use load_wav from numpy_transforms * fix(bin.extract_tts_spectrograms): set quantization bits * fix(ExtractTTSpectrogram.ipynb): adapt to current TTS code Fixes #2447, #2574 * refactor(audio.processor): remove duplicate quantization methods	2023-11-16 10:57:06 +01:00
Eren Gölge	a24ebcd8a6	Fix coqui api (#3168)	2023-11-08 10:51:23 +01:00
Aarni Koskela	38f6f8f0bb	Run `make style` & re-enable it in CI (#3127)	2023-11-06 11:36:37 +01:00
David Garvey	a151d70242	Add stdout option (#3027) * add add cli options for play and speed --play argument uses simpleaudio to play the tts wav --speed <float 0.0-2.0> passes speed argument to Coqui Studio models * remove simpleaudio not referenced in file * fix simpleaudio dependency version * add ALSA headers for simpleaudio compilation * Dockerfile ALSA headers for simpleaudio * base changes to use stdout instead of play audio Considering conversion to pipe wav data for audio playback with ohter program like aplay. This is incomplete code. Using to get feedback before proceeding with implementation. * remove play for pipe_out arg that suppresses stdout removed play and simpleaudio dependency in place of pipe fuctionality to allow passing wav file data to a program dedicated to playing audio. * scipy.io.wavfile.write fails with /dev/null target * Streaming inference for XTTS 🚀 (#3035) * v0.17.7 * Redownload XTTS with the local and remote config do not match * Remove unused method * Print a message when it is already donwloaded * Try-except to present error when the user dont have connection * Fix style * 0.17.8 * v0.17.8 --------- Co-authored-by: Julian Weber <julian.weber@hotmail.fr> Co-authored-by: Eren Gölge <erogol@hotmail.com> Co-authored-by: Edresson Casanova <edresson1@gmail.com> Co-authored-by: ggoknar <ggoknar@coqui.ai>	2023-10-16 12:07:21 +02:00
Aarni Koskela	0a82f063cc	Late-import main TTS libraries in `tts` CLI	2023-09-26 15:38:56 +03:00
Aarni Koskela	5c047cf304	Ensure `tts` CLI tool readme and usage help is in sync	2023-09-26 15:38:56 +03:00
Eren Gölge	4033db5f4b	🔥 XTTS implementation	2023-09-13 17:51:24 +02:00
Jake Tae	b79b6f0762	feature: add device flag to tts cli (#2875)	2023-08-28 11:20:12 +02:00
Eren Gölge	3a104d5c49	Update Studio API for XTTS (#2861) * Update Studio API for XTTS * Update the docs * Update README.md * Update README.md Update README	2023-08-13 12:04:12 +02:00
logan hart	6fdb88f8e2	Add Delightful-TTS implementation (#2095) * add configs * Update config file * Add model configs * Add model layers * Add layer files * Add layer modules * change config names * Add emotion manager * fIX missing ap bug * Fix missing ap bug * Add base TTS e2e class * Fix wrong variable name in load_tts_samples * Add training script * Remove range predictor and gaussian upsampling * Add helper function * Add vctk recipe * Add conformer docs * Fix linting in conformer.py * Add Docs * remove duplicate import * refactor args * Fix bugs * Removew emotion embedding * remove unused arg * Remove emotion embedding arg * Remove emotion embedding arg * fix style issues * Fix bugs * Fix bugs * Add unittests * make style * fix formatter bug * fix test * Add pyworld compute pitch func * Update requirments.txt * Fix dataset Bug * Chnge layer norm to instance norm * Add missing import * Remove emotions.py * remove ssim loss * Add init layers func to aligner * refactor model layers * remove audio_config arg * Rename loss func * Rename to delightful-tts * Rename loss func * Remove unused modules * refactor imports * replace audio config with audio processor * Add change sample rate option * remove broken resample func * update recipe * fix style, add config docs * fix tests and multispeaker embd dim * remove pyworld * Make style and fix inference * Split tts tests * Fixup * Fixup * Fixup * Add argument names * Set "random" speaker in the model Tortoise/Bark * Use a diff f0_cache path for delightfull tts * Fix delightful speaker handling * Fix lint * Make style --------- Co-authored-by: loganhart420 <loganartpersonal@gmail.com> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-07-24 13:41:26 +02:00
PiaoYang	630327c4e6	Update compute_embeddings.py (#2668) * [Typo] Fix variable name. More readable description. Update train_yourtts.py Reformat. Reformat using black again. * Add `old_append`. Fix bool argparse. * Reformat.	2023-07-04 11:37:47 +02:00
Eren Gölge	c844b6570a	Inference API for 🐶Bark (#2685) * Add bark requirements * Draft Bark implementation * Download HF models * Update synthesizer * Add bark model * Make style * Update pylintrc * Update model URLs * Update Bark Config * Fix here and ther * Make style * Make lint * Update requirements * Update requirements	2023-06-28 11:55:27 +02:00
Eren Gölge	8e415732dd	Fixup	2023-06-06 09:41:46 +02:00
Eren Gölge	547a72c97d	Fixup	2023-06-05 22:38:56 +02:00
Eren Gölge	50b1074779	Make `tts` ready	2023-06-05 11:29:10 +02:00
manmay nakhashi	a3d5801c44	Tortoise TTS inference (#2547) * initial commit * Tortoise inference * revert path change * style fix * remove accidental remove * style fixes * style fixes * removed unwanted assests and deps * remove changes * remove cvvp * style fix black * added tortoise config and updated config and args, refactoring the code * added tortoise to api * Pull mel_norm from url * Use TTS cleaners * Let download model files * add ability to pass tortoise presets through coqui api * fix tests * fix style and tests * fix tts commandline for tortoise * Add config.json to tortoise * Use kwargs * Use regular model api for loading tortoise * Add load from dir to synthesizer * Fix Tortoise floats * Use model_dir when there are multiple urls * Use `synthesize` when exists * lint fixes and resolve preset bug * resolve a download bug and update model link * fix json * do tortoise inference from voice dir * fix * fix test * fix speaker id and remove assests * update inference_tests.yml * replace inference_test.yml * fix extra dir as None * fix tests * remove space * Reformat docstring * Add docs * Update docs * lint fixes --------- Co-authored-by: Eren Gölge <egolge@coqui.ai> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-05-16 00:58:21 +02:00
Eren Gölge	9b5822d625	Update VAD for silence trimming. (#2604) * Update vad for mp3 and fault tolerance * Make style * Remove importt * Remove stupid defaults	2023-05-11 11:09:23 +02:00
Eren Gölge	dba5cec497	Merge pull request #2509 from coqui-ai/update_vad Update VAD	2023-04-13 19:35:17 +02:00
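
Commit 77722cb0dd above notes that `--use_cuda false` used to set use_cuda=True. A common cause of this in argparse-based CLIs is `type=bool`: any non-empty string, including "false", is truthy. The following is a minimal sketch of that pitfall and one conventional fix. The `str2bool` helper and both parsers are hypothetical illustrations, not code taken from TTS/bin/synthesize.py, and the actual fix in the commit may differ.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical helper: interpret common textual booleans explicitly."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parsers():
    # Naive parser: bool("false") is True because the string is non-empty.
    naive = argparse.ArgumentParser()
    naive.add_argument("--use_cuda", type=bool, default=False)

    # Fixed parser: the string is parsed explicitly instead of truth-tested.
    fixed = argparse.ArgumentParser()
    fixed.add_argument("--use_cuda", type=str2bool, default=False)
    return naive, fixed


naive, fixed = build_parsers()
naive_args = naive.parse_args(["--use_cuda", "false"])
fixed_args = fixed.parse_args(["--use_cuda", "false"])
print(naive_args.use_cuda, fixed_args.use_cuda)
```

With the naive parser the flag ends up enabled even though the user asked for it to be off; the explicit converter rejects unknown strings instead of silently treating them as true.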

443 Commits
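
Commit 6a52c8a855 describes logging to stdout in the CLI tools, except when --pipe_out is set, in which case logs go to stderr so that stdout carries only the audio stream. The stream selection can be sketched with the standard `logging` module as below. The function name `setup_cli_logger`, the logger name, and the message format are assumptions made for illustration; the repository's own logging setup is not shown in this listing.

```python
import logging
import sys


def setup_cli_logger(pipe_out: bool, name: str = "tts_cli_sketch") -> logging.Logger:
    """Hypothetical helper: log to stderr only when stdout carries audio."""
    # When audio is piped to stdout (e.g. into aplay), keep log text off it.
    stream = sys.stderr if pipe_out else sys.stdout

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # do not also emit through the root logger
    return logger


logger = setup_cli_logger(pipe_out=False)
logger.info("Saved output to a file; this line can be piped to grep.")
```

Logging to stdout by default keeps the messages available to downstream tools such as grep, while the stderr branch protects a binary audio stream from being corrupted by log text.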