coqui-tts

Commit Graph

Author	SHA1	Message	Date
Eren Gölge	a7a96d08dd	Fix loading Bark (#2893 ) * Fixup hubert path * Make style	2023-08-26 11:59:00 +02:00
Jake Tae	409db505d2	Add device support in TTS and Synthesizer (#2855 ) * fix: resolve merge conflicts * fix: retain backwards compatability in functions * feature: utilize device for voice transfer * feature: use device for vocoder * chore: cleanup vocoder cpu logic * fix: add necessary vocoder output device check * fix: add necessary vocoder output device check * fix: indentation * fix: check if waveform is pt tensor before cpu conversion --------- Co-authored-by: Jake Tae <jaketae@Jakes-MacBook-Pro-2.local>	2023-08-14 21:04:44 +02:00
Julian Weber	a07397733b	Multilingual tokenizer (#2229 ) * Implement multilingual tokenizer * Add multi_phonemizer receipe * Fix lint * Add TestMultiPhonemizer * Fix lint * make style	2023-01-02 10:03:19 +01:00
Julian Weber	896e46d0e5	Fix vc (#1971 )	2022-09-16 12:01:26 +02:00
a-froghyar	8be21ec387	Capacitron (#977 ) * new CI config * initial Capacitron implementation * delete old unused file * fix empty formatting changes * update losses and training script * fix previous commit * fix commit * Add Capacitron test and first round of test fixes * revert formatter change * add changes to the synthesizer * add stepwise gradual lr scheduler and changes to the recipe * add inference script for dev use * feat: add posterior inference arguments to synth methods - added reference wav and text args for posterior inference - some formatting * fix: add espeak flag to base_tts and dataset APIs - use_espeak_phonemes flag was not implemented in those APIs - espeak is now able to be utilised for phoneme generation - necessary phonemizer for the Capacitron model * chore: update training script and style - training script includes the espeak flag and other hyperparams - made style * chore: fix linting * feat: add Tacotron 2 support * leftover from dev * chore:rename parser args * feat: extract optimizers - created a separate optimizer class to merge the two optimizers * chore: revert arbitrary trainer changes * fmt: revert formatting bug * formatting again * formatting fixed * fix: log func * fix: update optimizer - Implemented load_state_dict for continuing training * fix: clean optimizer init for standard models * improvement: purge espeak flags and add training scripts * Delete capacitronT2.py delete old training script, new one is pushed * feat: capacitron trainer methods - extracted capacitron specific training operations from the trainer into custom methods in taco1 and taco2 models * chore: renaming and merging capacitron and gst style args * fix: bug fixes from the previous commit * fix: implement state_dict method on CapacitronOptimizer * fix: call method * fix: inference naming * Delete train_capacitron.py * fix: synthesize * feat: update tests * chore: fix style * Delete capacitron_inference.py * fix: fix train tts t2 capacitron tests * fix: double forward in T2 train step * fix: double forward in T1 train step * fix: run make style * fix: remove unused import * fix: test for T1 capacitron * fix: make lint * feat: add blizzard2013 recipes * make style * fix: update recipes * chore: make style * Plot test sentences in Tacotron * chore: make style and fix import * fix: call forward first before problematic floordiv op * fix: update recipes * feat: add min_audio_len to recipes * aux_input["style_mel"] * chore: make style * Make capacitron T2 recipe more stable * Remove T1 capacitron Ljspeech * feat: implement new grad clipping routine and update configs * make style * Add pretrained checkpoints * Add default vocoder * Change trainer package * Fix grad clip issue for tacotron * Fix scheduler issue with tacotron Co-authored-by: Eren Gölge <egolge@coqui.ai> Co-authored-by: WeberJulian <julian.weber@hotmail.fr> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2022-05-20 16:17:11 +02:00
Eren Gölge	0870a4faa2	Make style (#1405 )	2022-03-16 12:13:55 +01:00
Edresson Casanova	dbe9da7f15	Add Voice conversion inference support (#1337 ) * Add support for voice conversion inference * Cache d_vectors_by_speaker for fast inference using a bigger speakers.json * Rebase bug fix * Use the average d-vector for inference	2022-03-10 14:57:12 +01:00
Eren Gölge	764c7fa4a4	Rename phoneme_cleaners	2022-03-06 12:09:54 +01:00
Eren Gölge	dd4287de1f	Update models	2022-03-03 20:23:00 +01:00
Eren Gölge	d0c27a9661	Update synthesis.py	2022-02-25 11:29:41 +01:00
Eren Gölge	131bc0cfc0	Fix synthesis.py 🔧	2022-02-25 11:18:00 +01:00
Eren Gölge	c9972e6f14	Make lint	2022-02-25 11:07:34 +01:00
Eren Gölge	b6c2bfdf08	Refactor synthesis.py for TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	d0eb642d88	Refactor synthesis.py for TTSTokenizer	2022-02-25 10:48:03 +01:00
Eren Gölge	d8ec7086b6	Update `synthesis` for the new API	2022-02-25 10:48:03 +01:00
Eren Gölge	fbad17e084	Update imports for symbols -> characters	2022-02-25 10:48:02 +01:00
Eren Gölge	5a9653978a	Refactor synthesis.py for TTSTokenizer	2022-02-25 10:45:24 +01:00
Edresson Casanova	0860d73cf8	Remove Tensorflow requeriment (#1225 ) * Remove TF modules * Remove TF unit tests * Remove TF vocoder modules * Remove TF convert scripts * Remove TF requirement * Remove the Docs TF instructions * Remove TF inference support	2022-02-10 16:14:54 +01:00
WeberJulian	ffc269eaf4	Update docstring	2021-12-20 11:54:10 +00:00
WeberJulian	120332d53f	Fix phonemes	2021-12-20 11:54:10 +00:00
WeberJulian	4d721bcabd	fix test sentence synthesis	2021-12-20 11:54:10 +00:00
Edresson	ac9416fb86	Add multilingual inference support	2021-12-20 11:54:09 +00:00
Eren Gölge	babdd84f91	Fix GST inference commit d3e477875a7e46a101fcf95a1794442823750fe2 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit fdece14585ab5a36eed1061a9a838d8e48aa6882 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit 908ce39370eadcc9fa8510cdb26c9ead87305427 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:49:37 2021 +0100 Make trim_db value negative commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:22:24 2021 +0100 Set find_endpoint db threshold in config.json	2021-12-07 13:28:49 +00:00
Eren Gölge	aea90e2501	Comment synthesis.py	2021-10-21 13:53:45 +00:00
Eren Gölge	003e5579e8	Enable `custom_symbols` in text processing Models can define their own custom symbols lists with custom `make_symbols()`	2021-08-09 18:02:36 +00:00
Eren Gölge	00c82c516d	rename to	2021-06-28 17:03:19 +02:00
Eren Gölge	166f0aeb9a	merge if branches with the same implementation	2021-06-28 17:03:19 +02:00
Eren Gölge	03494ad642	adjust `distribute.py` for the `train_tts.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	25238e0658	fix glow-tts `inference()`	2021-06-28 17:03:19 +02:00
Eren Gölge	db6a97d1a2	rename external speaker embedding arguments as `d_vectors`	2021-06-28 17:03:19 +02:00
Eren Gölge	421194880d	linter fixes	2021-06-28 17:03:19 +02:00
Eren Gölge	b500338faa	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	c680a07a20	fix `Synthesized` for the new `synthesis()`	2021-06-28 17:03:19 +02:00
Eren Gölge	b8a4af4010	update `synthesis.py` for being more generic	2021-06-28 17:03:19 +02:00
Eren Gölge	f4f83b6379	update `synthesis.py` for the trainer	2021-06-28 17:03:19 +02:00
Michael Hansen	3f172b84d8	Fix linting issues	2021-06-25 14:41:31 +02:00
Michael Hansen	4d8426fa0a	Use eSpeak IPA lexicons by default for phoneme models	2021-06-25 14:41:05 +02:00
Eren Gölge	9ee70af9bb	code styling	2021-05-11 11:29:18 +02:00
Eren Gölge	647163397d	coqpit refactoring	2021-05-11 11:29:17 +02:00
Eren Gölge	eaa130e813	fix tacotron for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	aadb2106ec	code styling	2021-04-23 18:04:37 +02:00
kirianguiller	7dccbfdcd5	handle multi speaker and gst in Synthetizer class	2021-04-23 18:04:37 +02:00
Eren Gölge	e5b9607bc3	isort all imports	2021-04-09 00:45:20 +02:00
Eren Gölge	0e79fa86ad	format with black and pylint 2.7.3	2021-04-09 00:38:08 +02:00
Eren Gölge	2b3e12ea49	correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting	2021-03-30 14:39:16 +02:00
Eren Gölge	9a48ba3821	a ton of linter updates	2021-03-08 05:06:54 +01:00
kirianguiller	9ab07f94e2	modify according to PR reviews	2021-03-08 02:59:48 +01:00
kirianguiller	42ba30eb8f	<add> Chinese mandarin implementation (tacotron2)	2021-03-08 02:59:24 +01:00
kirianguiller	0d4525322c	modify according to PR reviews	2021-03-08 02:57:11 +01:00
kirianguiller	e6fd118cf8	<add> Chinese mandarin implementation (tacotron2)	2021-03-08 02:57:11 +01:00

1 2

62 Commits