coqui-tts

Commit Graph

Author	SHA1	Message	Date
Jake Tae	409db505d2	Add device support in TTS and Synthesizer (#2855 ) * fix: resolve merge conflicts * fix: retain backwards compatability in functions * feature: utilize device for voice transfer * feature: use device for vocoder * chore: cleanup vocoder cpu logic * fix: add necessary vocoder output device check * fix: add necessary vocoder output device check * fix: indentation * fix: check if waveform is pt tensor before cpu conversion --------- Co-authored-by: Jake Tae <jaketae@Jakes-MacBook-Pro-2.local>	2023-08-14 21:04:44 +02:00
Julian Weber	febcaf710a	Add customizable data home path (#2871 ) * Add customizable data home path * Add TTS_HOME as an option	2023-08-14 21:02:48 +02:00
Eren Gölge	c4e5effab9	Bump up to v0.16.3	2023-08-13 12:22:04 +02:00
Eren Gölge	3a104d5c49	Update Studio API for XTTS (#2861 ) * Update Studio API for XTTS * Update the docs * Update README.md * Update README.md Update README	2023-08-13 12:04:12 +02:00
Eren Gölge	37b558ccb9	Make style	2023-08-11 12:55:23 +02:00
Eren Gölge	9a8352b8da	Fix import error with Bark	2023-08-11 03:33:59 +02:00
Eren Gölge	c87377b713	Bump up to v0.16.2	2023-08-07 13:21:14 +02:00
Eren Gölge	4186f42b21	Handle missing JA phonemizer (#2843 ) * Handle missing JA phonemizer * Make style	2023-08-07 13:19:38 +02:00
Javier	4e7f8cd021	Add fairseq onnx support and strict configuration, fixes some onnx errors (#2831 )	2023-08-04 11:02:59 +02:00
ChaseC	52a528cfcf	add post functionality to /api/tts (#2836 )	2023-08-04 10:54:20 +02:00
Eren Gölge	dc04baa1ee	Bump up to v0.16.1	2023-07-31 15:54:45 +02:00
Eren Gölge	17ddd65741	Please p3.11	2023-07-31 15:53:19 +02:00
Eren Gölge	69f080eb47	Fix DelightfulTTS (#2823 ) * Fix tests * Make style	2023-07-31 13:52:45 +02:00
Eren Gölge	483888b9d8	Add kwargs to ignore extra arguments w/o error (#2822 )	2023-07-31 11:37:35 +02:00
Aleś Bułojčyk	d124f78430	Recipe for Belarusian TTS (#2756 ) * Changes from jhlfrfufyfn <jhlfrfufyfn@gmail.com> * Recipe for Belarusian TTS --------- Co-authored-by: jhlfrfufyfn <jhlfrfufyfn@gmail.com>	2023-07-31 10:26:21 +02:00
Javier	c140df5a58	Adds multi-language support for VITS onnx, fixes onnx inference error when speaker_id is None or not passed, fixes onnx exporting for models with init_discriminator=false (#2816 )	2023-07-31 10:19:49 +02:00
Eren Gölge	b739326503	Bump up to v0.16.0	2023-07-24 16:04:10 +02:00
Eren Gölge	8aacb81849	Fix Tortoise load (#2791 ) * Remove key prunning in tortoise * Make lint	2023-07-24 13:42:47 +02:00
logan hart	6fdb88f8e2	Add Delightful-TTS implementation (#2095 ) * add configs * Update config file * Add model configs * Add model layers * Add layer files * Add layer modules * change config names * Add emotion manager * fIX missing ap bug * Fix missing ap bug * Add base TTS e2e class * Fix wrong variable name in load_tts_samples * Add training script * Remove range predictor and gaussian upsampling * Add helper function * Add vctk recipe * Add conformer docs * Fix linting in conformer.py * Add Docs * remove duplicate import * refactor args * Fix bugs * Removew emotion embedding * remove unused arg * Remove emotion embedding arg * Remove emotion embedding arg * fix style issues * Fix bugs * Fix bugs * Add unittests * make style * fix formatter bug * fix test * Add pyworld compute pitch func * Update requirments.txt * Fix dataset Bug * Chnge layer norm to instance norm * Add missing import * Remove emotions.py * remove ssim loss * Add init layers func to aligner * refactor model layers * remove audio_config arg * Rename loss func * Rename to delightful-tts * Rename loss func * Remove unused modules * refactor imports * replace audio config with audio processor * Add change sample rate option * remove broken resample func * update recipe * fix style, add config docs * fix tests and multispeaker embd dim * remove pyworld * Make style and fix inference * Split tts tests * Fixup * Fixup * Fixup * Add argument names * Set "random" speaker in the model Tortoise/Bark * Use a diff f0_cache path for delightfull tts * Fix delightful speaker handling * Fix lint * Make style --------- Co-authored-by: loganhart420 <loganartpersonal@gmail.com> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-07-24 13:41:26 +02:00
Eren Gölge	0de12ec5aa	API tests (#2790 ) * Separate API tests and only run when uplifted * Make style	2023-07-24 12:14:21 +02:00
Paul O'Leary McCann	c0aabb8596	Make Japanese-specific dependencies optional (#2776 ) * Don't install MeCab by default * Add optional [ja] deps, like [dev] etc * Add JA requirements file * Add JA requirements to requirements_all This should help the tests run.	2023-07-24 11:28:27 +02:00
Eren Gölge	672ec3b35e	Fix #2749 (#2750 )	2023-07-08 11:40:44 +02:00
Eren Gölge	b5cd644132	Bump up to v0.15.6	2023-07-08 10:33:09 +02:00
Eren Gölge	a2984fb435	Fix #2745 (#2748 )	2023-07-07 20:23:27 +02:00
Eren Gölge	7b5c8422c8	Export multispeaker onnx (#2743 )	2023-07-06 13:36:50 +02:00
JiangCheng	53938e2d32	Squashed commit of the following: commit `dd612fd72e` Author: JiangCheng <jiangcheng@kezaihui.com> Date: Mon Jun 5 16:04:54 2023 +0800 Failed to download the file and need to delete the created file path	2023-07-05 12:08:05 +02:00
ZhouGongZaiShi	d5f16d77c2	delete meaningless print() (#2662 )	2023-07-04 11:38:17 +02:00
PiaoYang	630327c4e6	Update compute_embeddings.py (#2668 ) * [Typo] Fix variable name. More readable description. Update train_yourtts.py Reformat. Reformat using black again. * Add `old_append`. Fix bool argparse. * Reformat.	2023-07-04 11:37:47 +02:00
ChaseC	8957799e45	fix loading of model and vocoder configs (#2698 )	2023-07-04 11:32:00 +02:00
Eren Gölge	505ac1aa8f	Bump up to v0.15.5	2023-07-03 11:18:06 +02:00
Eren Gölge	21a3f280de	Bump up to v0.15.4	2023-06-30 15:05:00 +02:00
Eren Gölge	f9cde7bb1b	Bump up to v0.15.3	2023-06-30 14:30:18 +02:00
Eren Gölge	413a345d66	Bump up to v0.15.2	2023-06-30 14:16:47 +02:00
Eren Gölge	cb9c320691	Fixup	2023-06-30 14:13:11 +02:00
Eren Gölge	dfd8d313a2	Bump up to v1.5.1	2023-06-29 17:53:09 +02:00
Eren Gölge	a035b25340	Bump up to v0.15.0	2023-06-28 15:24:20 +02:00
Eren Gölge	34b9a18c47	Fixup	2023-06-28 12:26:04 +02:00
Eren Gölge	91cc11d636	Remove commented codes	2023-06-28 12:14:37 +02:00
Eren Gölge	6b9ebf5aab	Merge branch 'p3_11' into dev	2023-06-28 12:13:04 +02:00
Eren Gölge	c844b6570a	Inference API for 🐶Bark (#2685 ) * Add bark requirements * Draft Bark implementation * Download HF models * Update synthesizer * Add bark model * Make style * Update pylintrc * Update model URLs * Update Bark Config * Fix here and ther * Make style * Make lint * Update requirements * Update requirements	2023-06-28 11:55:27 +02:00
Eren Gölge	a13b1352a4	Fixup	2023-06-26 19:30:26 +02:00
Eren Gölge	17ac188958	Drop fairseq for Hubert	2023-06-26 19:27:48 +02:00
Eren Gölge	c03768bb53	Make style	2023-06-26 17:16:26 +02:00
Eren Gölge	a1c431e6a9	Fixups	2023-06-26 12:55:18 +02:00
Eren Gölge	a58fb6c01b	Update requirements	2023-06-22 13:53:19 +02:00
Eren Gölge	e888e8a56d	Fix manage	2023-06-22 10:13:20 +02:00
Eren Gölge	fff8b762bc	Merge branch 'dev' into bark	2023-06-21 15:49:05 +02:00
Eren Gölge	4cf8652392	Fix Tortoise load (#2697 ) * Handle missing gpt weights * Make style * Fix lint	2023-06-21 15:42:01 +02:00
Eren Gölge	cf98ae04df	Make lint	2023-06-21 12:05:08 +02:00
Eren Gölge	3b9fca2398	Make style	2023-06-21 12:02:06 +02:00
Eren Gölge	0f8932a6a9	Fix here and ther	2023-06-21 11:59:27 +02:00
Eren Gölge	03c347b7f3	Update Bark Config	2023-06-21 11:58:18 +02:00
Eren Gölge	695e862aad	Update model URLs	2023-06-21 11:57:46 +02:00
Eren Gölge	f4c88ed677	Make style	2023-06-19 14:22:32 +02:00
Eren Gölge	37b708dac7	Add bark model	2023-06-19 14:16:06 +02:00
Eren Gölge	2364c38d16	Update synthesizer	2023-06-19 14:15:21 +02:00
Eren Gölge	5a31fad502	Download HF models	2023-06-19 14:14:04 +02:00
Eren Gölge	f59da4dba5	Draft Bark implementation	2023-06-12 14:32:39 +02:00
Tsai Meng-Ting	d65819422b	Update stochastic_duration_predictor.py (#2663 ) fix a typo	2023-06-12 11:10:54 +02:00
Eren Gölge	49cf6a5d62	Bump up to v0.14.3	2023-06-06 09:41:59 +02:00
Eren Gölge	8e415732dd	Fixup	2023-06-06 09:41:46 +02:00
Eren Gölge	547a72c97d	Fixup	2023-06-05 22:38:56 +02:00
Eren Gölge	a494f0c92a	Bump up to v0.14.1	2023-06-05 11:29:10 +02:00
Eren Gölge	50b1074779	Make `tts` ready	2023-06-05 11:29:10 +02:00
Eren Gölge	e785d101a1	Port Fairseq TTS models (#2628 ) * Load fairseq models * Add docs and missing files * Managing fairseq models and docs for API * Make style * Use scarf URL * Add tests * Fix URL * Pass cpu * Make lint * Fixup * Make lint * fixup * Fixup * Change tokenization order * Update README * Fixup * Fixup	2023-06-05 11:15:13 +02:00
Shukrullo Turgunov	0d5e68a09f	fix typo (#2647 ) * fix typo * typo fix	2023-06-05 09:58:16 +02:00
Reuben Morais	23a7a9a363	Fetch all built-in speakers (#2626 )	2023-05-22 17:28:08 +02:00
Eren Gölge	aef7f6d980	Bump up to v0.14.1	2023-05-18 11:13:09 +02:00
Eren Gölge	9e99e0f42d	Disable reduction	2023-05-18 11:12:51 +02:00
Eren Gölge	bc0a532c7a	Bump up to v0.14.0	2023-05-16 10:08:41 +02:00
Eren Gölge	4de797bb11	Draft ONNX export for VITS (#2563 ) * Draft ONNX export for VITS Could not get it work to output variable length sequence * Fixup for onnx constant output * Make style * Remove commented code	2023-05-16 01:07:56 +02:00
manmay nakhashi	a3d5801c44	Tortoise TTS inference (#2547 ) * initial commit * Tortoise inference * revert path change * style fix * remove accidental remove * style fixes * style fixes * removed unwanted assests and deps * remove changes * remove cvvp * style fix black * added tortoise config and updated config and args, refactoring the code * added tortoise to api * Pull mel_norm from url * Use TTS cleaners * Let download model files * add ability to pass tortoise presets through coqui api * fix tests * fix style and tests * fix tts commandline for tortoise * Add config.json to tortoise * Use kwargs * Use regular model api for loading tortoise * Add load from dir to synthesizer * Fix Tortoise floats * Use model_dir when there are multiple urls * Use `synthesize` when exists * lint fixes and resolve preset bug * resolve a download bug and update model link * fix json * do tortoise inference from voice dir * fix * fix test * fix speaker id and remove assests * update inference_tests.yml * replace inference_test.yml * fix extra dir as None * fix tests * remove space * Reformat docstring * Add docs * Update docs * lint fixes --------- Co-authored-by: Eren Gölge <egolge@coqui.ai> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-05-16 00:58:21 +02:00
Eren Gölge	9b5822d625	Update VAD for silence trimming. (#2604 ) * Update vad for mp3 and fault tolerance * Make style * Remove importt * Remove stupid defaults	2023-05-11 11:09:23 +02:00
Eren Gölge	dfb51e06b2	Add jenny model (#2603 )	2023-05-08 12:05:40 +02:00
Michael Görner	27e237ed08	use default_factory for audio parameter (#2576 ) Python 3.11 complains about the mutable default and other members were already adapted to use the factory, so I expect this line just went unnoticed until now.	2023-05-08 11:17:36 +02:00
prakharpbuf	c1875f68df	typos and minor fixes (#2508 ) * Update tacotron1-2.md * Update README.md * Update Tutorial_2_train_your_first_TTS_model.ipynb * Update synthesizer.py There is no arg called --speaker_name * Update formatting_your_dataset.md * Update AnalyzeDataset.ipynb * Update AnalyzeDataset.ipynb * Update AnalyzeDataset.ipynb * Update finetuning.md * Update train_yourtts.py * Update train_yourtts.py * Update train_yourtts.py * Update finetuning.md	2023-04-26 15:22:57 +02:00
Eren Gölge	2071088bab	Bump up to v0.13.3	2023-04-17 16:13:35 +02:00
Eren Gölge	1a6a5710fd	Make lint	2023-04-17 15:02:56 +02:00
Eren Gölge	a44a0e1fd2	Update model urls	2023-04-17 14:53:27 +02:00
Eren Gölge	2533a18d62	Add BN tests	2023-04-17 13:37:10 +02:00
Eren Gölge	2d49c05259	Remove import	2023-04-17 13:05:29 +02:00
Eren Gölge	5e5768d784	Fix API	2023-04-17 13:05:19 +02:00
Eren Gölge	cd83991067	Add BN phonemizer	2023-04-17 12:54:00 +02:00
Eren Gölge	36be05290d	Add models	2023-04-17 12:52:32 +02:00
Eren Gölge	e4c5c27854	Bump up to v0.13.2	2023-04-14 10:23:39 +02:00
Eren Gölge	dba5cec497	Merge pull request #2509 from coqui-ai/update_vad Update VAD	2023-04-13 19:35:17 +02:00
Eren Gölge	5a9bda13f3	Make style	2023-04-13 14:19:06 +02:00
Eren Gölge	c9375e4b8b	Make style	2023-04-13 14:17:06 +02:00
Eren Gölge	758ef84cc2	Using 🐸Studio models with `tts` command	2023-04-13 14:14:41 +02:00
Eren Gölge	537dc0e933	Update VAD	2023-04-13 00:39:46 +02:00
Eren Gölge	e33e7170ed	Bump up to v0.13.1	2023-04-12 16:20:53 +02:00
Eren Gölge	8da3342676	Ping API	2023-04-12 16:20:53 +02:00
Eren Gölge	cbb592b295	Fixup	2023-04-10 14:50:11 +02:00
Eren Gölge	b8b9f09de5	Fixup	2023-04-10 14:06:31 +02:00
Eren Gölge	a49c1931d9	Fixup	2023-04-10 13:33:42 +02:00
Eren Gölge	5bd1fb6b2c	Fix API for voice conversion	2023-04-10 13:32:16 +02:00
Eren Gölge	30109af2a0	Merge pull request #2480 from MattyB95/librosa_v0.10.0 Update Librosa Version To V0.10.0	2023-04-07 12:32:33 +02:00
Eren Gölge	1233365cf4	Bump up to v0.13.0	2023-04-05 15:09:31 +02:00
Eren Gölge	ad8b9bf2be	🐸 Coqui Studio API integration (#2484 ) * Warn when lang is not avail * Make style * Implement Coqui Studio API * Test * Update docs * Set action * Make style * Make lint * Update README * Make style * Fix action * Run actions	2023-04-05 15:06:50 +02:00
Matthew Boakes	4c829e74a1	Update Librosa Version To V0.10.0	2023-04-05 00:59:20 +01:00
Yingzhi WANG	95fa2c9fd6	fix typo (#2475 )	2023-04-03 23:31:09 +02:00
p0p	91cf1b2da9	[minor] batch["speaker_ids"] getting set two times (#2470 ) * [minor] batch["speaker_ids"] getting set two times just to make it consistent with language_ids * Update vits.py style.	2023-04-03 11:35:21 +02:00
Rajiv P	c2d15cd413	[minor] hifigan_generator.py typo (#2462 ) resblock2 description updated.	2023-03-28 12:43:36 +02:00
Eren Gölge	d309f50e53	Implement FreeVC (#2451 ) * Update .gitignore * Draft FreeVC implementation * Tests and relevant updates * Update API tests * Add missings * Update requirements * :( * Lazy handle for vc * Update docs for voice conversion * Make style	2023-03-25 18:33:23 +01:00
Khalid Bashir	14c80dd1fd	vits.py training fixed due to return_complex (#2418 ) Torch set default value for `return_complex=True` for `torch.stft` method This turned warning into error:- ``` Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1591, in fit self._fit() File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1544, in _fit self.train_epoch() File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1309, in train_epoch _, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time) File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1162, in train_step outputs, loss_dict_new, step_time = self._optimize( File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1023, in _optimize outputs, loss_dict = self._model_train_step(batch, model, criterion, optimizer_idx=optimizer_idx) File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 970, in _model_train_step return model.train_step(*input_args) File "/workspace/coqui-tts/TTS/tts/models/vits.py", line 1293, in train_step mel_slice_hat = wav_to_mel( File "/workspace/coqui-tts/TTS/tts/models/vits.py", line 191, in wav_to_mel spec = torch.stft( File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 641, in stft return _VF.stft(input, n_fft, hop_length, win_length, window, # type: ignore[attr-defined] RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release. ```	2023-03-19 00:22:04 +01:00
Eren Gölge	2db262747e	Bump up to v0.12.0	2023-03-17 13:21:03 +01:00
Roee Shenberg	3c15f0619a	Bug fixes in OverFlow audio generation (#2380 )	2023-03-15 12:02:11 +01:00
Daniel Vera Nieto	dfb48737fb	Style fixed	2023-03-13 16:11:15 +01:00
Dani Vera	0d12229b64	Update vits.py This should fix the issue https://github.com/coqui-ai/TTS/issues/1986 without breaking batch data sampling.	2023-03-10 18:35:16 +01:00
manmay nakhashi	624513018d	add energy by default to Fastspeech2 config (#2326 ) * add energy by default * added energy to base tts * fix energy dataset * fix styles * fix test	2023-03-06 10:20:25 +01:00
Florian Quirin	478c8178b8	Basic Mary-TTS API compatibility (#2352 ) * added basic Mary-TTS API endpoints to server - imported `parse_qs` from `urllib.parse` to parse HTTP POST parameters - imported `render_template_string` from `flask` to return text as endpoint result - added new routes: - `/locales` - returns list of locales (currently locale of active model) - `/voices` - returns list of voices (currently locale and name of active model) - `/process` - accepts synth. request (GET and POST) with parameter `INPUT_TEXT` (other parameters ignored since we have only one active model) * better log messages for Mary-TTS API - smaller tweaks to log output * use f-string in log print to please linter * updated server.py to match 'make style' result	2023-03-06 10:08:21 +01:00
thennal10	d39bc74f57	OverFlow with test sentences (#2253 ) * Fix typo in function definiton * Swap hasattr out hasattr(self, "speaker_manager") and hasattr(self, "language_manager") seems to be redundant since BaseTTS defines both.	2023-03-01 09:11:30 +01:00
Edresson Casanova	16b9862252	Fix Speaker Consistency Loss (SCL) (#2364 )	2023-02-27 09:14:00 +03:00
Eren Gölge	661725b95e	Bump up to v0.11.1	2023-02-10 15:59:05 +01:00
Eren Gölge	0196b4dfbf	Merge branch 'add_neural_hmm_model' into dev	2023-02-10 15:23:56 +01:00
Eren Gölge	914280a556	Bump up to v0.11.0 (#2329 ) * Make style * Bump up to v0.11.0	2023-02-08 13:58:49 +01:00
Eren Gölge	85b3a04b37	Merge branch 'api_model_path' into dev	2023-02-06 11:18:00 +01:00
marius851000	1f4d8bf0f1	Fix tts-server for multi-lingual models (#2257 )	2023-02-06 10:54:34 +01:00
Eren Gölge	6ee94f8bad	Fixup	2023-01-30 14:02:25 +01:00
Eren Gölge	713e8c8d04	Add pretrained model	2023-01-30 13:55:17 +01:00
Eren Gölge	7fddabc8ac	Implement cloning in API	2023-01-30 13:35:48 +01:00
Eren Gölge	335b8ed44e	Add vocoder path	2023-01-30 12:59:29 +01:00
Martin Weinelt	994be163e1	Use packaging.version for version comparisons (#2310 ) * Use packaging.version for version comparisons The distutils package is deprecated¹ and relies on PEP 386² version comparisons, which have been superseded by PEP 440³ which is implemented through the packaging module. With more recent distutils versions, provided through setuptools vendoring, we are seeing the following exception during version comparisons: > TypeError: '<' not supported between instances of 'str' and 'int' This is fixed by this migration. [1] https://docs.python.org/3/library/distutils.html [2] https://peps.python.org/pep-0386/ [3] https://peps.python.org/pep-0440/ * Improve espeak version detection robustness On many modern systems espeak is just a symlink to espeak-ng. In that case looking for the 3rd word in the version output will break the version comparison, when it finds `text-to-speech:`, instead of a proper version. This will not break during runtime, where espeak-ng would be prioritized, but the phonemizer and tokenizer tests force the backend to `espeak`, which exhibits this breakage. This improves the version detection by simply looking for the version after the "text-to-speech:" token. * Replace distuils.copy_tree with shutil.copytree The distutils module is deprecated and slated for removal in Python 3.12. Its usage should be replaced, in this case by a compatible method from shutil.	2023-01-29 23:47:00 +01:00
Eren Gölge	cf076345e7	Make style	2023-01-23 13:49:51 +01:00
Eren Gölge	13334d507c	Load model from path	2023-01-23 13:45:45 +01:00
Gerard Sant Muniesa	c59b3f75b8	Add Catalan text cleaners for Catalan support (#2295 )	2023-01-23 11:56:30 +01:00
Shivam Mehta	d83ee8fe45	Adding neural HMM TTS Model (#2272 ) * Adding neural HMM TTS * Adding tests * Adding neural hmm on readme * renaming training recipe * Removing overflow\s decoder parameters from the config * Update the Trainer requirement version for a compatible one (#2276) * Bump up to v0.10.2 * Adding neural HMM TTS * Adding tests * Adding neural hmm on readme * renaming training recipe * Removing overflow\s decoder parameters from the config * fixing documentation Co-authored-by: Edresson Casanova <edresson1@gmail.com> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2023-01-23 11:53:04 +01:00
Eren Gölge	497f22b20b	Cache speaker encoder model (#2284 )	2023-01-23 11:49:51 +01:00
Eren Gölge	6e3f74fc29	Fix #2191	2023-01-15 23:11:57 +01:00
manmay nakhashi	bc422f2f3c	Fastspeech2 (#2073 ) * added EnergyDataset * add energy to Dataset * add comupte_energy * added energy params * added energy to forward_tts * added plot_avg_energy for visualisation * Update forward_tts.py * create file * added fastspeech2 recipe * add fastspeech2 config * removed energy from fast pitch * add energy loss to forward tts * Update fastspeech2_config.py * change run_name * Update numpy_transforms.py * fix typo * fix typo * fix typo * linting issues * use_energy default value --> False * Update numpy_transforms.py * linting fixes * fix typo * liniting_fix * liniting_fix * fix * fixes * fixes * lint fix * lint fixws * added training test * wrong import * wrong import * trailing whitespace * style fix * changed class name because of error * class name change * class name change * change class name * fixed styles	2023-01-15 22:39:22 +01:00
Eren Gölge	14d45b5347	Bump up to v0.10.2	2023-01-11 01:06:02 +01:00
Khalid Bashir	42afad5e79	Fixed bug related to yourtts speaker embeddings issue (#2234 ) * Fixed bug related to yourtts speaker embeddings issue * Reverted code for base_tts * Bug fix on VITS d_vector_file type * Ignore the test speakers on YourTTS recipe * Add speaker encoder model and config on YourTTS recipe to easily do zero-shot inference * Update YourTTS config file * Update ModelManager._update_path to deal with list attributes * Fix lint checks * Remove unused code * Fix unit tests * Reset name_to_id to get the right speaker ids on load_embeddings_from_list_of_files * Set weighted_sampler_multipliers as an empty dict to prevent users' mistakes Co-authored-by: Edresson Casanova <edresson1@gmail.com>	2023-01-02 14:20:02 +01:00
Julian Weber	a07397733b	Multilingual tokenizer (#2229 ) * Implement multilingual tokenizer * Add multi_phonemizer receipe * Fix lint * Add TestMultiPhonemizer * Fix lint * make style	2023-01-02 10:03:19 +01:00
Eren Gölge	f814d52394	Bump up to v0.10.1	2022-12-26 14:29:46 +01:00
Eren Gölge	8c32a6998a	Add pth files to manager	2022-12-26 14:29:25 +01:00
Eren Gölge	cf765cb3f2	Add ca and fa models	2022-12-26 14:29:10 +01:00
Eren Gölge	46b0ad37e7	Bump up to v0.10.0	2022-12-15 11:19:23 +01:00
Eren Gölge	a9167cf239	Fixup overflow (#2218 ) * Update overflow config * Pulling shuffle and drop_last from config * Print training stats for overflow	2022-12-15 00:56:48 +01:00
Eren Gölge	ecea43ec81	Adding pre-trained Overflow model (#2211 ) * Adding pretrained Overflow model * Stabilize HMM * Fixup model manager * Return `audio_unique_name` by default * Distribute max split size over datasets * Fixup eval_split_size * Make style	2022-12-14 16:55:48 +01:00
Edresson Casanova	3b1a28fa95	Add YourTTS VCTK recipe (#2198 ) * Add YourTTS VCTK recipe * Fix lint * Add compute_embeddings and resample_files functions to be able to reuse it * Add automatic download and speaker embedding computation for YourTTS VCTK recipe * Add parameter for eval metadata file on compute embeddings function	2022-12-12 16:14:25 +01:00
Shivam Mehta	3b8b105b0d	Adding OverFlow (#2183 ) * Adding encoder * currently modifying hmm * Adding hmm * Adding overflow * Adding overflow setting up flat start * Removing runs * adding normalization parameters * Fixing models on same device * Training overflow and plotting evaluations * Adding inference * At the end of epoch the test sentences are coming on cpu instead of gpu * Adding figures from model during training to monitor * reverting tacotron2 training recipe * fixing inference on gpu for test sentences on config * moving helpers and texts within overflows source code * renaming to overflow * moving loss to the model file * Fixing the rename * Model training but not plotting the test config sentences's audios * Formatting logs * Changing model name to camelcase * Fixing test log * Fixing plotting bug * Adding some tests * Adding more tests to overflow * Adding all tests for overflow * making changes to camel case in config * Adding information about parameters and docstring * removing compute_mel_statistics moved statistic computation to the model instead * Added overflow in readme * Adding more test cases, now it doesn't saves transition_p like tensor and can be dumped as json	2022-12-12 12:44:15 +01:00
p0p4k	2e153d54a8	Adding missing key to formatter (#2194 ) quick fix for #2156. added 'root_path' key.	2022-12-12 12:25:37 +01:00
Eren Gölge	1ddc484b49	Python API implementation (#2195 ) * Draft implementation * Fix style * Add api tests * Fix lint * Update docs * Update tests * Set env * Fixup * Fixup * Fix lint * Revert	2022-12-12 12:04:20 +01:00
Eren Gölge	fdeefcc612	Handle espeak 1.48.15 (#2203 )	2022-12-12 11:23:45 +01:00
Edresson Casanova	ee20e30958	Fix VITS multi-speaker voice conversion inference	2022-12-05 09:15:01 -03:00
Eren Gölge	9321b22203	Fix scheduler order	2022-12-05 12:26:15 +01:00
Eren Gölge	bc6120c330	[ci skip]Bump up to v0.9.0	2022-11-16 16:45:02 +01:00
logan hart	ff9b63d02a	Add neon models (#2140 ) * Add neon ljspeech vits model * Add neon german model * Update .models.json * Add neon spanish model * Add french model * Add Dutch model * Add Hungarian model * Add Greek model * Remove uneeded description * Update .models.json * Update .models.json * Handling neon models * Add all neon models * Update .models.json * Split zoo_tests * Update test names * Update model testing Co-authored-by: Eren Gölge <erogol@hotmail.com>	2022-11-16 16:12:39 +01:00
Eren Gölge	8cb1433e6e	Cache fsspec downloads (#2132 ) * Cache fsspec downloaded files * Use diff paths for test * Make fsspec caching optional * Decom GPU docker tests * Make progress bar optional for better CI log * Check path local	2022-11-09 22:12:48 +01:00
Eren Gölge	b686c09704	Fix #2062	2022-11-07 09:22:43 +01:00

1882 Commits