coqui-tts

Commit Graph

Author	SHA1	Message	Date
Edresson Casanova	28a7464975	Fix the bug in split dataset function (#1251) * Fix the bug in split_dataset * Make eval_split_size configurable * Change test_loader to use load_tts_samples function * Change eval_split_portion to eval_split_size and permits to set the absolute number of samples in eval * Fix samplers unit test * Add data unit test on GitHub workflow	2022-02-21 11:59:36 +03:00
Eren Gölge	127118c637	Update TTS.tts formatters (#1228) * Return Dict from tts formatters * Make style	2022-02-11 23:03:43 +01:00
Eren Gölge	d724984be1	Fix language assignment	2022-01-02 11:11:24 +00:00
WeberJulian	a63998c048	Fix phoneme language	2022-01-01 21:08:13 +01:00
Eren Gölge	704dddcffa	Make style	2021-12-20 11:54:10 +00:00
Edresson	90eac13bb2	Rename ununsed_speakers to ignored_speakers	2021-12-20 11:54:10 +00:00
WeberJulian	631addf33b	fix d-vector	2021-12-20 11:54:10 +00:00
WeberJulian	120332d53f	Fix phonemes	2021-12-20 11:54:10 +00:00
WeberJulian	1340938159	fix phonemes per language	2021-12-20 11:54:10 +00:00
WeberJulian	e995a63bd6	fix linter	2021-12-20 11:54:10 +00:00
WeberJulian	1472b6df49	make style	2021-12-20 11:54:10 +00:00
WeberJulian	0804806727	fix f0_cache_path in dataset	2021-12-20 11:54:10 +00:00
WeberJulian	3b5592abcf	fix test vits	2021-12-20 11:54:10 +00:00
WeberJulian	2a2b5767c2	fix collate_fn	2021-12-20 11:54:10 +00:00
Julian WEBER	78c2d12a91	PitchExtractor	2021-12-20 11:54:10 +00:00
Julian WEBER	b3abd01793	Merge dataset	2021-12-20 11:54:10 +00:00
Edresson	f1f016314e	Fix the bug in M-AILABS formatter	2021-12-20 11:54:09 +00:00
Edresson	f996afedb0	Implement multilingual dataloader support	2021-12-20 11:54:09 +00:00
Edresson	5f1c18187f	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	6a7db67a91	Allow ignore speakers for all multispeaker datasets	2021-12-20 11:54:09 +00:00
Eren Gölge	faafea4cf2	Fix style	2021-11-04 17:04:40 +01:00
Eren Gölge	20cebde1c9	Add docstring to MAI labs formatter	2021-11-01 16:41:55 +01:00
Eren Gölge	608f437545	Add a function to find unique chars	2021-11-01 16:41:33 +01:00
Eren Gölge	035ed432bc	Doc update (#889) * Link source files from the docs * Update glowTTS recipes for docs * Add dataset downloaders	2021-10-26 17:41:33 +02:00
Eren Gölge	0cac3f330a	Enable custom formatter in load_tts_samples	2021-10-26 13:07:11 +02:00
Eren Gölge	82fed4add2	Make style	2021-10-21 16:05:51 +00:00
Eren Gölge	a0a5d580e9	Approximate audio length from file size	2021-10-18 08:54:02 +00:00
Eren Gölge	043dca61b4	Rename `load_meta_data` as `load_tts_data`	2021-09-30 14:47:56 +00:00
Eren Gölge	9f23ad6a0f	Fix imports	2021-09-30 14:47:56 +00:00
Eren Gölge	8ada870a57	Refactor `trainer.py` for v2	2021-09-30 14:16:34 +00:00
Eren Gölge	76c4929ab2	Fix attn mask reading bug	2021-09-06 15:16:58 +00:00
Eren Gölge	91a70e80b2	Refactor TTSDataset Return a dict by `collate` Refactor batch handling in `collate` A couple of bug fixes	2021-09-06 15:16:58 +00:00
Eren Gölge	648655fa03	Add `PitchExtractor` and return dict by `collate`	2021-09-06 15:16:58 +00:00
Eren Gölge	545a00fc04	Use absolute paths of the attention masks	2021-09-06 15:16:58 +00:00
Eren Gölge	e802b24ad0	Compute mean and std pitch	2021-09-06 15:16:58 +00:00
Eren Gölge	8fffd4e813	Don't print computed phonemes It causes noise in logs	2021-09-06 15:16:58 +00:00
Eren Gölge	d085642ac1	Cache pitch features Cache the features at the beginning of `BaseTTS` training.	2021-09-06 15:16:58 +00:00
Eren Gölge	fba257104d	Compute F0 using librosa	2021-09-06 15:16:58 +00:00
Eren Gölge	18da8f5dbd	Update pylint 2.10.2 and fix lint issues	2021-08-30 08:10:35 +00:00
Eren Gölge	f186856e5d	Add option to sort input sequnce by audio len	2021-08-30 08:10:35 +00:00
Eren Gölge	c312acac7d	Implement VITS model 🚀 VITS model implementation built on Glow TTS and HiFiGAN layers.	2021-08-09 18:02:36 +00:00
Eren Gölge	003e5579e8	Enable `custom_symbols` in text processing Models can define their own custom symbols lists with custom `make_symbols()`	2021-08-09 18:02:36 +00:00
Eren Gölge	4b7b88dd3d	Add fullband-melgan DE vocoder	2021-07-26 15:38:30 +02:00
Edresson	b1620d1f3f	remove ignore generate eval flag	2021-07-15 03:34:28 -03:00
Edresson	2e5baffa9c	Merge fix and eval split as argparse	2021-07-13 01:47:32 -03:00
Eren Gölge	932ab107ae	Docstring edit in `TTSDataset.py` ✍️	2021-06-28 17:03:47 +02:00
Eren Gölge	8c74f054f0	Enable support for 🐍 python 3.10 Bump up versions numpy 1.19.5 and TF 2.5.0	2021-06-28 17:03:47 +02:00
Eren Gölge	fdfb18d230	downsize melgan test model size	2021-06-28 17:03:19 +02:00
Eren Gölge	419735f440	refactor and fix multi-speaker training in Trainer and Tacotron models	2021-06-28 17:03:19 +02:00
Eren Gölge	802d461389	Compute d_vectors and speaker_ids separately in TTSDataset	2021-06-28 17:03:19 +02:00
Eren Gölge	9042ae9195	use `to_cuda()` for moving data in `format_batch()`	2021-06-28 17:03:19 +02:00
Eren Gölge	d96ebcd6d3	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	b500338faa	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	a20a1c7d06	rename preprocess.py -> formatters.py	2021-06-28 17:03:19 +02:00
Eren Gölge	b9bccbb243	move load_meta_data and related functions to `datasets/__init__.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	42554cc711	rename MyDataset -> TTSDataset	2021-06-28 17:03:19 +02:00
Edresson	28bec238ca	fix Lint checks	2021-06-18 14:33:50 -03:00
Edresson	83644056e3	fix Lint checks	2021-06-18 14:32:28 -03:00
Edresson Casanova	e78e3cd81e	Merge branch 'dev' into dev	2021-06-18 14:10:03 -03:00
Edresson	b74b510d3c	Compute embeddings and find characters using config file	2021-06-18 14:04:49 -03:00
Eren Gölge	49c5e5d820	maket style japanese PR	2021-06-02 11:44:46 +02:00
Katsuya Iida	0536aa6d0f	Japanese Tacotron 2 model	2021-05-22 17:12:19 +09:00
Eren Gölge	8a7c40736c	set use_phonemes false	2021-05-19 01:27:26 +02:00
Eren Gölge	8b1014d188	add docstrings with default value fixes	2021-05-15 23:45:10 +02:00
Eren Gölge	93a00373f6	move split_dataset	2021-05-11 11:29:17 +02:00
Eren Gölge	79d7215142	config refactor #5 WIP	2021-05-11 11:29:17 +02:00
Eren Gölge	e5b9607bc3	isort all imports	2021-04-09 00:45:20 +02:00
Eren Gölge	0e79fa86ad	format with black and pylint 2.7.3	2021-04-09 00:38:08 +02:00
Eren Gölge	e84f120a04	sam-accenture model preprocessor	2021-04-01 03:41:41 +02:00
Eren Gölge	1c1949d348	utf-8 encoding for certain preprocessors	2021-03-30 14:39:16 +02:00
Eren Gölge	f3e5ddfaaf	bug fix in preprocessor	2021-03-18 13:33:23 +01:00
Eren Gölge	e15734c3fc	linter fix	2021-03-08 05:29:43 +01:00
Eren Gölge	9a48ba3821	a ton of linter updates	2021-03-08 05:06:54 +01:00
kirianguiller	9ab07f94e2	modify according to PR reviews	2021-03-08 02:59:48 +01:00
kirianguiller	42ba30eb8f	<add> Chinese mandarin implementation (tacotron2)	2021-03-08 02:59:24 +01:00
kirianguiller	0d4525322c	modify according to PR reviews	2021-03-08 02:57:11 +01:00
kirianguiller	e6fd118cf8	<add> Chinese mandarin implementation (tacotron2)	2021-03-08 02:57:11 +01:00
Eren Gölge	2ca74b8ab3	add RUSLAN dataset preprocessor	2021-03-08 02:54:47 +01:00
Eren Gölge	f9fe167537	docstring update	2021-03-08 02:54:47 +01:00
Eren Gölge	29d928d531	css10 dataset preprocessor	2021-03-08 02:54:47 +01:00
Eren Gölge	08581deb61	linter updates	2021-03-08 02:53:02 +01:00
erogol	27a75de15f	update processors for loading attention maps	2021-01-06 13:19:40 +01:00
erogol	df180148e9	use noise augmentation in TTSDataset	2020-12-09 15:46:25 +01:00
erogol	7505c0ba27	muliprocess phoneme computation	2020-12-07 11:29:41 +01:00
erogol	20c86489d7	make static methods for faster multiprocess call	2020-12-07 11:29:10 +01:00
erogol	affe1c1138	setup training scripts for computing phonemes before training optionally. And define data_loaders before starting training and re-use them instead of re-define for every train and eval calls. This is to enable better instance filtering based on input length.	2020-12-07 11:26:57 +01:00
erogol	a757b203bc	fix longer phoneme seqs	2020-11-26 15:05:03 +01:00
erogol	7541d2ecaa	return eval split optional	2020-11-25 14:50:09 +01:00
Qingping Hou	b0b97d636f	speed up metafile build for voxceleb	2020-11-14 23:45:17 -08:00
erogol	9b0f441945	argument for returning no eval split	2020-11-12 12:52:27 +01:00
Edresson	d9540a5857	add blank token in sequence for encrease glowtts results	2020-10-25 15:08:28 -03:00
erogol	10258724d1	linter fixes	2020-09-22 03:54:16 +02:00
erogol	a6df617eb1	Merge branch 'glow-tts-amp-time_depth_conv' into dev	2020-09-21 14:23:45 +02:00
mueller91	9b4aac94a8	fix: linter issues	2020-09-21 12:13:02 +02:00
mueller	e36a3067e4	add: save wavs instead feats to storage. This is done in order to mitigate staleness when caching and loading from data storage	2020-09-17 14:14:30 +02:00
mueller	1511076fde	add: Configurable encoder dataset storage to reduce disk I/O add: Averaged time for data loader to console and Tensorboard output	2020-09-17 12:29:38 +02:00
mueller	95d2906307	add: Mozilla Commonvoice, VoxCeleb1+2, LibriTTS to Speaker Encoder Training	2020-09-16 16:49:53 +02:00
mueller	c909ca3855	Improve runtime of __parse_items() from O(|speakers|*|items|) to O(|items|)	2020-09-16 15:55:55 +02:00
erogol	89d15bf118	merge glow-tts after rebranding	2020-09-11 19:01:37 +02:00
erogol	df19428ec6	rename the project to old TTS	2020-09-09 12:27:23 +02:00

150 Commits