coqui-tts

Commit Graph

Author	SHA1	Message	Date
Eren Gölge	72d85e53c9	Update model file extension (#1422 ) * Update model file ext to ```.pth``` * Update docs * Rename more * Find model files	2022-03-22 17:55:00 +01:00
Eren Gölge	0870a4faa2	Make style (#1405 )	2022-03-16 12:13:55 +01:00
Edresson Casanova	f81892483d	REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support (#1349 ) * Rename Speaker encoder module to encoder * Add a generic emotion dataset formatter * Transform the Speaker Encoder dataset to a generic dataset and create emotion encoder config * Add class map in emotion config * Add Base encoder config * Add evaluation encoder script * Fix the bug in plot_embeddings * Enable Weight decay for encoder training * Add argumnet to disable storage * Add Perfect Sampler and remove storage * Add evaluation during encoder training * Fix lint checks * Remove useless config parameter * Active evaluation in speaker encoder test and use multispeaker dataset for this test * Unit tests fixs * Remove useless tests for speedup the aux_tests * Use get_optimizer in Encoder * Add BaseEncoder Class * Fix the unitests * Add Perfect Batch Sampler unit test * Add compute encoder accuracy in a function	2022-03-11 14:43:40 +01:00
Edresson Casanova	36e9ea2f97	Open bible dataset formatter (#1365 ) * Add support for voice conversion inference * Cache d_vectors_by_speaker for fast inference using a bigger speakers.json * Rebase bug fix * Use the average d-vector for inference * Fix the bug in find unique chars script * Add OpenBible formatter Co-authored-by: Eren Gölge <erogol@hotmail.com>	2022-03-11 10:43:31 +01:00
Edresson Casanova	dbe9da7f15	Add Voice conversion inference support (#1337 ) * Add support for voice conversion inference * Cache d_vectors_by_speaker for fast inference using a bigger speakers.json * Rebase bug fix * Use the average d-vector for inference	2022-03-10 14:57:12 +01:00
Edresson Casanova	f381e29b91	REBASED: Add support for the speaker encoder training using torch spectrograms (#1348 ) * Add support for the speaker encoder training using torch spectrograms * Remove useless function in speaker encoder dataset class	2022-03-10 14:54:51 +01:00
Eren Gölge	1425a023fe	Make style and lint	2022-03-02 13:25:35 +01:00
Eren Gölge	1e414b3a09	Make stlye	2022-02-25 11:31:56 +01:00
Eren Gölge	bf540f4323	Update imports for trainer	2022-02-25 11:31:56 +01:00
Eren Gölge	424d04e4f6	Make stlye	2022-02-25 11:31:56 +01:00
Eren Gölge	be3a03126a	Update imports for trainer	2022-02-25 11:28:14 +01:00
Eren Gölge	8622226f3f	Make style	2022-02-25 11:26:59 +01:00
Eren Gölge	27db089d6c	Change TrainingArgs -> TrainerArgs	2022-02-25 11:26:59 +01:00
Eren Gölge	1f0c8179da	Make style	2022-02-25 11:26:59 +01:00
Eren Gölge	bc2243bac4	Fix tests	2022-02-25 11:25:00 +01:00
Eren Gölge	2fe16de8e3	Make lint	2022-02-25 11:25:00 +01:00
Eren Gölge	8e248913d6	Update train_tts for the new API	2022-02-25 11:24:13 +01:00
Eren Gölge	730f7c0df4	Add file_ext args to resample.py	2022-02-25 11:15:46 +01:00
Eren Gölge	4d99fee3e2	Update spec extractor	2022-02-25 11:12:44 +01:00
Eren Gölge	38a0b3b6c7	Update train_tts.py	2022-02-25 11:11:35 +01:00
Eren Gölge	fbad17e084	Update imports for symbols -> characters	2022-02-25 10:48:02 +01:00
Eren Gölge	a51b031bff	Merge branch 'dev' into dev-fix-glowtts-infer	2022-02-21 12:01:40 +03:00
Edresson Casanova	28a7464975	Fix the bug in split dataset function (#1251 ) * Fix the bug in split_dataset * Make eval_split_size configurable * Change test_loader to use load_tts_samples function * Change eval_split_portion to eval_split_size and permits to set the absolute number of samples in eval * Fix samplers unit test * Add data unit test on GitHub workflow	2022-02-21 11:59:36 +03:00
Edresson Casanova	bc5db13d06	Fix the bug in extract tts spectrogram script	2022-02-19 19:24:00 +00:00
Eren Gölge	127118c637	Update TTS.tts formatters (#1228 ) * Return Dict from tts formatters * Make style	2022-02-11 23:03:43 +01:00
Edresson Casanova	0860d73cf8	Remove Tensorflow requeriment (#1225 ) * Remove TF modules * Remove TF unit tests * Remove TF vocoder modules * Remove TF convert scripts * Remove TF requirement * Remove the Docs TF instructions * Remove TF inference support	2022-02-10 16:14:54 +01:00
WeberJulian	c7f5e005e1	Compute embedding for new audios only	2022-01-06 15:41:38 +01:00
WeberJulian	e1accb6e28	Fix train_tts.py and uncomment code (#1051 ) * Fix SE loading and language embedding logic * remove trailing white space * Uncomment resmapling code for SCL	2022-01-03 17:44:57 +01:00
Eren Gölge	56378b12f7	Fix speaker encoder init	2021-12-21 12:26:25 +00:00
Eren Gölge	4c50f6f4df	Add functions to get and check and argument in config and config.model_args	2021-12-20 11:54:10 +00:00
Eren Gölge	704dddcffa	Make style	2021-12-20 11:54:10 +00:00
WeberJulian	a564eb9f54	Add support for multi-lingual models in CLI	2021-12-20 11:54:10 +00:00
WeberJulian	6b03943526	Move multilingual logic out of the trainer	2021-12-20 11:54:10 +00:00
Edresson	4196a42de7	Get the number speaker from the Speaker Manager property	2021-12-20 11:54:10 +00:00
Edresson	f394d60695	Fix the bug in multispeaker vits	2021-12-20 11:54:10 +00:00
Edresson	45d0b04179	Lint fixs	2021-12-20 11:54:10 +00:00
Edresson	85418ffeaa	Fix the bug in extract tts spectrograms	2021-12-20 11:54:10 +00:00
Edresson	34749f8727	Remove the call to get_speaker_manager	2021-12-20 11:54:10 +00:00
Edresson	352aa69eca	Create a module for the VAD script	2021-12-20 11:54:10 +00:00
WeberJulian	da6c1e858c	Fix small issues	2021-12-20 11:54:10 +00:00
WeberJulian	23d789c072	Fix continue path	2021-12-20 11:54:10 +00:00
WeberJulian	120332d53f	Fix phonemes	2021-12-20 11:54:10 +00:00
WeberJulian	846bf16f02	fix imports for load_meta_data	2021-12-20 11:54:10 +00:00
WeberJulian	e995a63bd6	fix linter	2021-12-20 11:54:10 +00:00
WeberJulian	1472b6df49	make style	2021-12-20 11:54:10 +00:00
Edresson	10ff90d6d2	Add remove silence VAD script	2021-12-20 11:54:10 +00:00
Edresson	eeb8ac07d9	Add voice conversion fine tuning mode	2021-12-20 11:54:10 +00:00
Reuben Morais	859ac1a54c	Include usage instructions in README	2021-12-17 11:37:19 +01:00
Eren Gölge	babdd84f91	Fix GST inference commit d3e477875a7e46a101fcf95a1794442823750fe2 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit fdece14585ab5a36eed1061a9a838d8e48aa6882 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit 908ce39370eadcc9fa8510cdb26c9ead87305427 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:49:37 2021 +0100 Make trim_db value negative commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:22:24 2021 +0100 Set find_endpoint db threshold in config.json	2021-12-07 13:28:49 +00:00
Eren Gölge	ce45d9e1af	Make style and lint	2021-12-01 10:42:52 +00:00
Eren Gölge	dc3dd55dd9	Add collect_env_info.py	2021-11-08 08:59:08 +01:00
Eren Gölge	a409e0f8f8	Update train_tts for multi-speaker	2021-10-21 16:29:06 +00:00
Eren Gölge	ba2b8c827f	Update `train_tts.py` and `train_vocoder.py`	2021-09-30 14:47:56 +00:00
Eren Gölge	2e9b6b4f90	Refactor Speaker Encoder training	2021-09-30 14:47:56 +00:00
Eren Gölge	043dca61b4	Rename `load_meta_data` as `load_tts_data`	2021-09-30 14:47:56 +00:00
Eren Gölge	3c740d4893	Style extract_tts_spectrogram.py	2021-09-10 08:21:21 +00:00
Eren Gölge	807f1d3817	Fix `extract_tts_spectrograms.py` model init	2021-09-09 08:59:55 +00:00
Eren Gölge	91a70e80b2	Refactor TTSDataset Return a dict by `collate` Refactor batch handling in `collate` A couple of bug fixes	2021-09-06 15:16:58 +00:00
Eren Gölge	545a00fc04	Use absolute paths of the attention masks	2021-09-06 15:16:58 +00:00
Eren Gölge	0f19f8c911	Fix `compute_attention_masks.py`	2021-09-06 15:16:58 +00:00
Eren Gölge	18da8f5dbd	Update pylint 2.10.2 and fix lint issues	2021-08-30 08:10:35 +00:00
Eren Gölge	f186856e5d	Add option to sort input sequnce by audio len	2021-08-30 08:10:35 +00:00
Eren Gölge	5911eec3b1	Small trainer refactoring 1. Use a single Gradscaler for all the optimizers 2. Save terminal logs to a file. In DDP mode, each worker creates `trainer_N_log.txt`. 3. Fixes to allow only the main worker (rank==0) writing to Tensorboard 4. Pass parameters owned by the target optimizer to the grad_clip_norm	2021-08-26 17:08:58 +00:00
Eren Gölge	ecf5f17dca	Fix distribute.py and ddp training	2021-08-12 22:22:32 +00:00
Eren Gölge	6af03ac476	Fix `num_char` init in Tacotron models	2021-08-09 21:46:15 +00:00
Ayush Chaurasia	936a47504d	Update Logger API, recipes	2021-08-09 18:34:00 +00:00
Ayush Chaurasia	f63cf46c55	Unified logger API	2021-08-09 18:34:00 +00:00
Ayush Chaurasia	f606741dc4	Add artifacts logging , wandb args	2021-08-09 18:31:16 +00:00
Agrin Hilmkil	ced4cfdbbf	Allow saving / loading checkpoints from cloud paths (#683 ) * Allow saving / loading checkpoints from cloud paths Allows saving and loading checkpoints directly from cloud paths like Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec. Note: The user will have to install the relevant dependency for each protocol. Otherwise fsspec will fail and specify which dependency is missing. * Append suffix _fsspec to save/load function names * Add a lower bound to the fsspec dependency Skips the 0 major version. * Add missing changes from refactor * Use fsspec for remaining artifacts * Add test case with path requiring fsspec * Avoid writing logs to file unless output_path is local * Document the possibility of using paths supported by fsspec * Fix style and lint * Add missing lint fixes * Add type annotations to new functions * Use Coqpit method for converting config to dict * Fix type annotation in semi-new function * Add return type for load_fsspec * Fix bug where fs not always created * Restore the experiment removal functionality	2021-08-09 18:02:36 +00:00
Eren Gölge	4b7b88dd3d	Add fullband-melgan DE vocoder	2021-07-26 15:38:30 +02:00
Edresson Casanova	d5adc35fdf	Add docstring to compute_embeddings script	2021-07-21 07:16:10 -03:00
Edresson	b1620d1f3f	remove ignore generate eval flag	2021-07-15 03:34:28 -03:00
Edresson	d906fea08c	lint fix and eval as argparse in extract tts spectrograms	2021-07-13 02:15:31 -03:00
Edresson	2e5baffa9c	Merge fix and eval split as argparse	2021-07-13 01:47:32 -03:00
Eren Gölge	93a74cbb71	Merge pull request #628 from Aloento/patch-2 Change to _get_preprocessor_by_name	2021-07-11 22:17:50 +02:00
Edresson	4eac1c4651	bug fix on train_encoder and unit tests	2021-07-11 12:00:39 -03:00
Aloento	6e3e6d5756	Change to _get_preprocessor_by_name	2021-07-08 09:53:13 +02:00
Eren Gölge	a4c658f5ef	Fix for using the `Synthesizer` out of the model	2021-07-02 10:43:38 +02:00
Eren Gölge	b3c073c99b	Allow runing full path scripts with `distribute.py`	2021-06-28 17:03:47 +02:00
Eren Gölge	a7617d8ab6	Add 🐍 python 3.9 to CI	2021-06-28 17:03:47 +02:00
Eren Gölge	9790eddada	Fix wrong argument name 🛠️	2021-06-28 17:03:47 +02:00
Eren Gölge	45947acb60	Update `TTS.bin` scripts for the new API	2021-06-28 17:03:47 +02:00
Eren Gölge	c7aad884cd	Implement unified trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	c754a0e17d	`TrainerAbstract` and related updates for `TrainerTTS`	2021-06-28 17:03:19 +02:00
Eren Gölge	00c82c516d	rename to	2021-06-28 17:03:19 +02:00
Eren Gölge	03494ad642	adjust `distribute.py` for the `train_tts.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	d6b2b6add6	make style and linter fixes	2021-06-28 17:03:19 +02:00
Eren Gölge	802d461389	Compute d_vectors and speaker_ids separately in TTSDataset	2021-06-28 17:03:19 +02:00
Eren Gölge	db6a97d1a2	rename external speaker embedding arguments as `d_vectors`	2021-06-28 17:03:19 +02:00
Eren Gölge	ef4ea9e527	update imports for `formatters`	2021-06-28 17:03:19 +02:00
Eren Gölge	421194880d	linter fixes	2021-06-28 17:03:19 +02:00
Eren Gölge	8e52a69230	delete separate tts training scripts and pre-commit configuration	2021-06-28 17:03:19 +02:00
Eren Gölge	d96ebcd6d3	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	b500338faa	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	469d2e620a	update extract_tts_spectrogram for `cond_input` API of the models	2021-06-28 17:03:19 +02:00
Eren Gölge	5ab28fa618	update `extract_tts_spec...` using `SpeakerManager`	2021-06-28 17:03:19 +02:00
Eren Gölge	c392fa4288	update `extract_tts_spectrograms` for the new model API	2021-06-28 17:03:19 +02:00
Eren Gölge	8f47f95998	correct import of `load_meta_data` remove redundant import	2021-06-28 17:03:19 +02:00
Eren Gölge	d25f017b42	update `setup_model.py` imports	2021-06-28 17:03:19 +02:00
Eren Gölge	e298b8e364	update trainer.py for better logging handling, restoring models and rename init_ functions with get_	2021-06-28 17:03:19 +02:00

1 2 3 4 5 ...

416 Commits