coqui-tts

Commit Graph

Author	SHA1	Message	Date
Matthew Boakes	1b9c400bca	PyTorch 2.1 Updates (Weight Norm and TorchAudio I/O) (#3176 ) * Replaced PyTorch weight_norm With parametrizations.weight_norm * TorchAudio: Migrating The I/O Functions To Use The Dispatcher Mechanism * Corrected Code Style --------- Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-11-09 16:31:03 +01:00
Julian Weber	ce1a39a9a4	Add char limit warn (#3130 ) * Add char limit warning * Adding v2 langs * cached_property for cutlet * Fix import	2023-11-08 10:24:23 +01:00
Edresson Casanova	5f9ab6cfaa	Fix style Co-authored-by: Aarni Koskela <akx@iki.fi>	2023-11-06 19:22:34 -03:00
Edresson Casanova	b146de4ce8	Bug fix on XTTS v2.0 Trainer	2023-11-06 20:26:01 +01:00
Edresson Casanova	72b2bac0f8	Load reference in 24khz to avoid issued with multiple sr references	2023-11-06 20:25:06 +01:00
Eren Gölge	f0cb19ecca	Drop diffusion from XTTS (#3150 ) * Drop diffusion for XTTS * Make style * Drop diffusion deps in code * Restore thrashed	2023-11-06 20:15:49 +01:00
Edresson Casanova	e45227d9ff	XTTS v2.0 (#3137 ) * Implement most similar ref training approach * Use non-enhanced hifigan for test samples * Add Perceiver * Update GPT Trainer for perceiver support * Update XTTS docs * Bug fix masking with XTTS perceiver * Bug fix on gpt forward * Bug Fix on XTTS v2.0 training * Add XTTS v2.0 unit tests * Add XTTS v2.0 inference unit tests * Bug Fix on diffusion inference * Add XTTS v2.0 training recipe * Placeholder model entry * Add cloning params to config * Make prompt embedding configurable * Make cloning configurable * Cheap fix for a cheaper fix * Prevent resampling * Update model entry * Update docs * Update requirements * Code linting * Add xtts v2 to sep tests * Bug fix on XTTS get_gpt_cond_latents * Bug fix on rebase * Make style * Bug fix in Japenese tokenizer * Add num2words to deps * Remove unused kwarg and added num_beams=1 as default --------- Co-authored-by: Eren Gölge <egolge@coqui.ai>	2023-11-06 14:58:18 +01:00
Aarni Koskela	38f6f8f0bb	Run `make style` & re-enable it in CI (#3127 )	2023-11-06 11:36:37 +01:00
WeberJulian	c1133724a1	Move lang token add to tokenizer	2023-10-26 14:52:13 +02:00
Edresson Casanova	01839af926	Bug fix on XTTS masking training	2023-10-24 18:30:14 -03:00
Edresson Casanova	ec7f54768a	Rebase bug fix and update recipe	2023-10-21 17:37:51 -03:00
Edresson Casanova	affaf11148	Add XTTS training unit test	2023-10-21 13:41:12 -03:00
Edresson Casanova	1f92741d6a	Fix issue #2971	2023-10-21 13:37:21 -03:00
Edresson Casanova	9e3598c3b7	Bug Fix on inference using XTTS trainer checkpoint	2023-10-21 13:37:21 -03:00
Edresson Casanova	c4ceaabe2c	Add test sentences during the training	2023-10-21 13:33:56 -03:00
Edresson Casanova	2f868dd5c2	Bug fix on reproducible evaluation	2023-10-21 13:33:56 -03:00
Edresson Casanova	bafab049c2	Add prompting masking	2023-10-21 13:33:56 -03:00
Edresson Casanova	47d613df3a	Add reproducible evaluation	2023-10-21 13:33:56 -03:00
Edresson Casanova	40a4e631ea	Update mel spectrogram for the style encoder	2023-10-21 13:33:56 -03:00
Edresson Casanova	a32961bcb4	Add XTTS base training code	2023-10-21 13:33:56 -03:00
Julian Weber	dad6a7b0b6	Preserve [ja] token of the text processing	2023-10-21 11:26:03 +02:00
Julian Weber	c7a16042e3	Remove global cutlet import	2023-10-21 11:18:58 +02:00
Julian Weber	cf97116185	XTTS v1.1 (#3089 ) * Add support for ne_hifigan * Update model.json * Update hash * Fix model loading * Enhance text_normalization * Add xtts to zoo test exception * Add model hash check * Add get_number_tokens	2023-10-20 16:02:08 +02:00
Julian Weber	e5e0cbffc9	Streaming inference for XTTS 🚀 (#3035 )	2023-10-06 18:34:06 +02:00
Edresson Casanova	4c3c11c958	Tortoise inference fix and fix zoo unit tests (#3010 )	2023-09-29 13:40:57 +02:00
Aarni Koskela	09e14e68db	Remove duplicate get_named_beta_schedules	2023-09-27 01:09:59 +03:00
Aarni Koskela	59f85a7122	Remove duplicate code from xtts.tokenizer	2023-09-27 01:09:59 +03:00
Eren Gölge	4033db5f4b	🔥 XTTS implementation	2023-09-13 17:51:24 +02:00
Eren Gölge	37b558ccb9	Make style	2023-08-11 12:55:23 +02:00
Eren Gölge	9a8352b8da	Fix import error with Bark	2023-08-11 03:33:59 +02:00
Eren Gölge	4186f42b21	Handle missing JA phonemizer (#2843 ) * Handle missing JA phonemizer * Make style	2023-08-07 13:19:38 +02:00
Eren Gölge	483888b9d8	Add kwargs to ignore extra arguments w/o error (#2822 )	2023-07-31 11:37:35 +02:00
logan hart	6fdb88f8e2	Add Delightful-TTS implementation (#2095 ) * add configs * Update config file * Add model configs * Add model layers * Add layer files * Add layer modules * change config names * Add emotion manager * fIX missing ap bug * Fix missing ap bug * Add base TTS e2e class * Fix wrong variable name in load_tts_samples * Add training script * Remove range predictor and gaussian upsampling * Add helper function * Add vctk recipe * Add conformer docs * Fix linting in conformer.py * Add Docs * remove duplicate import * refactor args * Fix bugs * Removew emotion embedding * remove unused arg * Remove emotion embedding arg * Remove emotion embedding arg * fix style issues * Fix bugs * Fix bugs * Add unittests * make style * fix formatter bug * fix test * Add pyworld compute pitch func * Update requirments.txt * Fix dataset Bug * Chnge layer norm to instance norm * Add missing import * Remove emotions.py * remove ssim loss * Add init layers func to aligner * refactor model layers * remove audio_config arg * Rename loss func * Rename to delightful-tts * Rename loss func * Remove unused modules * refactor imports * replace audio config with audio processor * Add change sample rate option * remove broken resample func * update recipe * fix style, add config docs * fix tests and multispeaker embd dim * remove pyworld * Make style and fix inference * Split tts tests * Fixup * Fixup * Fixup * Add argument names * Set "random" speaker in the model Tortoise/Bark * Use a diff f0_cache path for delightfull tts * Fix delightful speaker handling * Fix lint * Make style --------- Co-authored-by: loganhart420 <loganartpersonal@gmail.com> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-07-24 13:41:26 +02:00
Eren Gölge	0de12ec5aa	API tests (#2790 ) * Separate API tests and only run when uplifted * Make style	2023-07-24 12:14:21 +02:00
Eren Gölge	672ec3b35e	Fix #2749 (#2750 )	2023-07-08 11:40:44 +02:00
Eren Gölge	7b5c8422c8	Export multispeaker onnx (#2743 )	2023-07-06 13:36:50 +02:00
Eren Gölge	91cc11d636	Remove commented codes	2023-06-28 12:14:37 +02:00
Eren Gölge	6b9ebf5aab	Merge branch 'p3_11' into dev	2023-06-28 12:13:04 +02:00
Eren Gölge	c844b6570a	Inference API for 🐶Bark (#2685 ) * Add bark requirements * Draft Bark implementation * Download HF models * Update synthesizer * Add bark model * Make style * Update pylintrc * Update model URLs * Update Bark Config * Fix here and ther * Make style * Make lint * Update requirements * Update requirements	2023-06-28 11:55:27 +02:00
Eren Gölge	17ac188958	Drop fairseq for Hubert	2023-06-26 19:27:48 +02:00
Eren Gölge	c03768bb53	Make style	2023-06-26 17:16:26 +02:00
Eren Gölge	fff8b762bc	Merge branch 'dev' into bark	2023-06-21 15:49:05 +02:00
Eren Gölge	cf98ae04df	Make lint	2023-06-21 12:05:08 +02:00
Eren Gölge	3b9fca2398	Make style	2023-06-21 12:02:06 +02:00
Eren Gölge	0f8932a6a9	Fix here and ther	2023-06-21 11:59:27 +02:00
Eren Gölge	f4c88ed677	Make style	2023-06-19 14:22:32 +02:00
Eren Gölge	37b708dac7	Add bark model	2023-06-19 14:16:06 +02:00
Eren Gölge	f59da4dba5	Draft Bark implementation	2023-06-12 14:32:39 +02:00
Tsai Meng-Ting	d65819422b	Update stochastic_duration_predictor.py (#2663 ) fix a typo	2023-06-12 11:10:54 +02:00
manmay nakhashi	a3d5801c44	Tortoise TTS inference (#2547 ) * initial commit * Tortoise inference * revert path change * style fix * remove accidental remove * style fixes * style fixes * removed unwanted assests and deps * remove changes * remove cvvp * style fix black * added tortoise config and updated config and args, refactoring the code * added tortoise to api * Pull mel_norm from url * Use TTS cleaners * Let download model files * add ability to pass tortoise presets through coqui api * fix tests * fix style and tests * fix tts commandline for tortoise * Add config.json to tortoise * Use kwargs * Use regular model api for loading tortoise * Add load from dir to synthesizer * Fix Tortoise floats * Use model_dir when there are multiple urls * Use `synthesize` when exists * lint fixes and resolve preset bug * resolve a download bug and update model link * fix json * do tortoise inference from voice dir * fix * fix test * fix speaker id and remove assests * update inference_tests.yml * replace inference_test.yml * fix extra dir as None * fix tests * remove space * Reformat docstring * Add docs * Update docs * lint fixes --------- Co-authored-by: Eren Gölge <egolge@coqui.ai> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-05-16 00:58:21 +02:00

216 Commits