coqui-tts

Commit Graph

Author	SHA1	Message	Date
Edresson	ac9416fb86	Add multilingual inference support	2021-12-20 11:54:09 +00:00
Eren Gölge	faafea4cf2	Fix style	2021-11-04 17:04:40 +01:00
Eren Gölge	d6d780e758	Fix FastSpeech config	2021-11-01 16:41:15 +01:00
Eren Gölge	00becf2671	Fix import statements	2021-10-25 19:29:16 +02:00
Eren Gölge	e62d3c5cf7	Use absolute imports for tts configs and models	2021-10-21 16:29:06 +00:00
Eren Gölge	82fed4add2	Make style	2021-10-21 16:05:51 +00:00
Eren Gölge	3ab009ca8d	Edit model configs for multi-speaker	2021-10-21 13:51:37 +00:00
Eren Gölge	a0a5d580e9	Approximate audio length from file size	2021-10-18 08:54:02 +00:00
Eren Gölge	073a2d2eb0	Refactor VITS multi-speaker initialization	2021-10-15 10:20:00 +00:00
Eren Gölge	2766dd1d6e	Fix #813 - GlowTTS training (#814): Fix #813; Update glow_tts recipe; Fix glow-tts test; Linter fix; Run data dep init only in training	2021-09-17 20:06:55 +02:00
Eren Gölge	1ea011571a	Update SpeedySpeech config	2021-09-12 15:33:27 +00:00
Eren Gölge	cbbc9e0172	Add FastSpeechConfig	2021-09-11 10:20:37 +00:00
Eren Gölge	66732025e1	Add `base_model` field to `forward_tts` configs	2021-09-10 17:23:48 +00:00
Eren Gölge	8b7e094bde	Implement `forward_tts` - Generic API for feed-forward TTS models (FastPitch, SpeedySpeech) - Tests for `forward-tts` - Edit FastPitchConfig and SpeedySpeechConfig to use `forward_tts`	2021-09-10 08:24:33 +00:00
Eren Gölge	91a70e80b2	Refactor TTSDataset: return a dict by `collate`; refactor batch handling in `collate`; a couple of bug fixes	2021-09-06 15:16:58 +00:00
Eren Gölge	debf772ec5	Implement binary alignment loss	2021-09-06 15:16:58 +00:00
Eren Gölge	6e9d4062f2	Add `sort_by_audio_len` option	2021-09-06 15:16:58 +00:00
Eren Gölge	e429afbce4	Enable aligner for FastPitch	2021-09-06 15:16:58 +00:00
Eren Gölge	81c228a2d8	Update FastPitch don't detach duration network inputs	2021-09-06 15:16:58 +00:00
Eren Gölge	57b3aec1b9	Update docstring format	2021-09-06 15:16:58 +00:00
Eren Gölge	7692bfe7f8	Update FastPitch config	2021-09-06 15:16:58 +00:00
Eren Gölge	bc396c393f	Add FastPitch model and FastPitchConfig	2021-09-06 15:16:58 +00:00
Eren Gölge	f186856e5d	Add option to sort input sequence by audio len	2021-08-30 08:10:35 +00:00
Eren Gölge	2620f62ea8	Move duration_loss inside VitsGeneratorLoss	2021-08-27 07:07:07 +00:00
Eren Gölge	49e1181ea4	Fixes for the vits model	2021-08-26 17:15:09 +00:00
Eren Gölge	3ab8cef99e	Fix VITS model SPD	2021-08-18 14:55:46 +00:00
Eren Gölge	6a7275881d	Add VitsConfig docstring	2021-08-09 18:02:36 +00:00
Eren Gölge	c312acac7d	Implement VITS model 🚀 VITS model implementation built on Glow TTS and HiFiGAN layers.	2021-08-09 18:02:36 +00:00
Eren Gölge	bd4e29b4dd	Add `compute_linear_spec=False` to `BaseTTSConfig`	2021-08-09 18:02:36 +00:00
Eren Gölge	0fa6a8c9b8	Fix glow tts default parameters	2021-07-02 10:44:23 +02:00
Eren Gölge	2e1a428b83	Update glowtts docstrings and docs	2021-06-30 14:30:55 +02:00
Eren Gölge	786170fe7d	Update tts model configs	2021-06-28 17:03:19 +02:00
Eren Gölge	269e5a734e	add max_decoder_steps argument to tacotron models	2021-06-28 17:03:19 +02:00
Eren Gölge	b500338faa	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	fc9a0fb8ce	update align_tts_config for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	06ee57d816	update `speedy_speech_config.py` for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	b2218e882a	update `glow_tts_config.py` for setting the optimizer and the scheduler	2021-06-28 17:03:19 +02:00
Eren Gölge	535a458f40	update Tacotron models for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	d09385808a	set test_sentences in config	2021-06-28 17:03:19 +02:00
Eren Gölge	8def3c87af	trainer-API updates	2021-06-28 17:03:19 +02:00
Michael Hansen	4d8426fa0a	Use eSpeak IPA lexicons by default for phoneme models	2021-06-25 14:41:05 +02:00
Eren Gölge	c6f22aaa67	fix #509	2021-05-27 13:09:15 +02:00
Eren Gölge	5482a0f62d	type def for gradual_training	2021-05-19 14:03:26 +02:00
Eren Gölge	218af1d9a2	change `list` to `List` in config	2021-05-18 17:30:27 +02:00
Eren Gölge	d1b469935d	tacotron DDC LJSpeech recipe	2021-05-17 11:42:14 +02:00
Eren Gölge	34a42d379f	update tacotron_config.py for checking `r` and the docstring	2021-05-17 11:35:30 +02:00
Eren Gölge	12722501bb	styling	2021-05-15 23:48:31 +02:00
Eren Gölge	8b1014d188	add docstrings with default value fixes	2021-05-15 23:45:10 +02:00
Eren Gölge	0213e1cbf4	update configs for tts models to match the field typed with the expected values	2021-05-12 00:57:38 +02:00
Eren Gölge	843d1b3d98	linter fixes	2021-05-11 11:30:00 +02:00
Eren Gölge	19fb1d743d	style update	2021-05-11 11:30:00 +02:00
Eren Gölge	c57f0b46bb	reintro use_gst for backwards compat	2021-05-11 11:29:18 +02:00
Eren Gölge	9ee70af9bb	code styling	2021-05-11 11:29:18 +02:00
Eren Gölge	7663bc63c1	add Coqpit configs for the TTS models	2021-05-11 11:29:17 +02:00
Eren Gölge	7227e8f1d2	update train_align_tts.py for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	51a7e06945	glow_tts_config.py and train test on python	2021-05-11 11:29:17 +02:00
Eren Gölge	816e7ee698	remove default configs.json as replacing with Coqpit configs	2021-05-11 11:29:17 +02:00
Eren Gölge	97bd5f9734	[ci skip] config update #3 WIP	2021-05-11 11:28:35 +02:00
Eren Gölge	a21c0b5585	config update 2 WIP	2021-05-11 11:28:35 +02:00
Eren Gölge	ef37633cb3	[ci skip] use prenet_dropout by default with Tacotron models	2021-04-22 12:38:55 +02:00
Eren Gölge	48ea20e69f	example aligntts config	2021-03-30 14:41:00 +02:00
gerazov	2451a813a2	refactored keep_all_best	2021-03-08 02:57:11 +01:00
gerazov	f2e474cd37	loading last checkpoint/best_model works, deleting last best models options added, loading last best_loss added	2021-03-08 02:56:36 +01:00
erogol	29cf933831	update SS config	2021-01-06 13:19:40 +01:00
erogol	228ada04b5	update glow-tts ljspeech config	2021-01-06 13:19:40 +01:00
erogol	ac5c9217d1	positional encoding masking for SS	2021-01-06 13:19:40 +01:00
erogol	cf869e8922	add SS files	2021-01-06 13:19:40 +01:00
erogol	a1d5a9ddda	config update to use noise for augmentation	2021-01-06 13:19:40 +01:00
erogol	7b20d8cbd3	implement residual BN convolution and add it as an alternative encoder for glow-tts. also generic layers to layers/generic	2021-01-06 13:19:40 +01:00
erogol	f81af4eb0d	config update disable guided attention for dynamic conv attention	2021-01-06 13:19:40 +01:00
erogol	5c50e104d6	config update	2021-01-06 13:19:40 +01:00
erogol	fa20638083	config for ljspeech dynamic conv attention	2021-01-06 13:18:41 +01:00
erogol	070146e143	add monotonic dynamic convolution attention	2021-01-06 13:18:41 +01:00
erogol	affe1c1138	setup training scripts for computing phonemes before training optionally. And define data_loaders before starting training and re-use them instead of re-define for every train and eval calls. This is to enable better instance filtering based on input length.	2020-12-07 11:26:57 +01:00
erogol	1229554c42	use native amp	2020-11-25 14:48:54 +01:00
erogol	183fe56d95	Merge branch 'ssim_loss' into dev	2020-10-29 23:49:09 +01:00
erogol	73581cd94c	renaming train scripts and updating tests	2020-10-29 16:50:07 +01:00
erogol	a1582a0e12	fix distributed training for train_* scripts	2020-10-29 12:31:43 +01:00
erogol	59e1cf99d0	config update and ssim implementation	2020-10-28 18:30:00 +01:00
erogol	9cef923d99	ssim loss for tacotron models	2020-10-28 15:24:18 +01:00
Eren Gölge	f4b8170bd1	Merge pull request #545 from Edresson/dev GlowTTS zeroshot TTS support	2020-10-27 15:23:41 +01:00
Edresson	d9540a5857	add blank token in sequence to improve glowtts results	2020-10-25 15:08:28 -03:00
ayush-1506	2a3559f02b	Fix readme and config file	2020-10-21 13:43:49 +05:30
erogol	e0d4b88877	config update	2020-10-08 01:29:30 +02:00
erogol	bb9b70ee27	differential spectral loss and loss weight settings	2020-10-08 01:29:30 +02:00
erogol	e0b9fa887f	glow-tts modules added	2020-09-21 14:15:40 +02:00
erogol	df19428ec6	rename the project to old TTS	2020-09-09 12:27:23 +02:00

137 Commits