coqui-tts

Commit Graph

Author	SHA1	Message	Date
Edresson	5f1c18187f	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	d91c595c5a	Implement training support with d_vecs in the VITS model	2021-12-20 11:54:09 +00:00
Edresson	6a7db67a91	Allow ignore speakers for all multispeaker datasets	2021-12-20 11:54:09 +00:00
Edresson	e0ad838066	Select randomly a speaker from the speaker manager for the test setences	2021-12-20 11:54:09 +00:00
Edresson	eb3e8affe1	Save speakers embeddings/ids before starting training	2021-12-20 11:54:09 +00:00
Eren Gölge	37803467aa	Merge pull request #1021 from loganhart420/dataset_downloaders Add addtional datasets	2021-12-20 10:42:20 +01:00
Reuben Morais	859ac1a54c	Include usage instructions in README	2021-12-17 11:37:19 +01:00
loganhart420	103c010eca	Add addtional datasets	2021-12-16 07:21:27 -05:00
Jörg Thalheim	bce143c738	server: fix compatibility with tts_models/en/ljspeech/fast_pitch (#893)	2021-12-07 14:36:29 +01:00
Eren Gölge	babdd84f91	Fix GST inference commit d3e477875a7e46a101fcf95a1794442823750fe2 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit fdece14585ab5a36eed1061a9a838d8e48aa6882 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit 908ce39370eadcc9fa8510cdb26c9ead87305427 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:49:37 2021 +0100 Make trim_db value negative commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:22:24 2021 +0100 Set find_endpoint db threshold in config.json	2021-12-07 13:28:49 +00:00
Eren Gölge	ce45d9e1af	Make style and lint	2021-12-01 10:42:52 +00:00
Eren Gölge	40cb8ac966	Fix #958	2021-12-01 10:33:34 +00:00
Eren Gölge	512ada7548	Fix callbacks against multi-gpu training	2021-12-01 10:32:14 +00:00
Eren Gölge	2ed9e3c241	Fix constant use of noise augment	2021-11-08 09:20:34 +01:00
Eren Gölge	b6b14a76af	Fix VITS stochastic duration predictor	2021-11-08 09:20:11 +01:00
Eren Gölge	dc3dd55dd9	Add collect_env_info.py	2021-11-08 08:59:08 +01:00
Eren Gölge	faafea4cf2	Fix style	2021-11-04 17:04:40 +01:00
Eren Gölge	d227aaebcc	Print when using Griffin-Lim in Synthesizer	2021-11-01 16:52:26 +01:00
Eren Gölge	c5077c6c3f	Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev	2021-11-01 16:42:27 +01:00
Eren Gölge	20cebde1c9	Add docstring to MAI labs formatter	2021-11-01 16:41:55 +01:00
Eren Gölge	608f437545	Add a function to find unique chars	2021-11-01 16:41:33 +01:00
Eren Gölge	d6d780e758	Fix FastSpeech config	2021-11-01 16:41:15 +01:00
Eren Gölge	5ba47081ee	Use GL for VCTK FastPitch models	2021-11-01 16:39:03 +01:00
Michael Hansen	3bc043faeb	Upgrade to gruut 2.0 (#882)	2021-10-31 11:41:55 +01:00
George	37eaefc085	Optional silence trimming during inference and find_endpoint() fix (#898) * Set find_endpoint db threshold in config.json * Optional silence trimming during inference * Make trim_db value negative	2021-10-29 18:28:55 +02:00
Eren Gölge	7293abada2	Bump up to v0.4.2	2021-10-29 17:57:30 +02:00
Eren Gölge	2df0752e73	Model zoo tests (#900) * Fix VITS model multi-speaker init * Remove gdrive support in model manager * Add model zoo tests	2021-10-29 17:54:16 +02:00
Eren Gölge	aaaa591485	Bump up version to v0.4.1	2021-10-26 19:24:17 +02:00
Eren Gölge	3ea1c2037b	Fix model entry in .models.json	2021-10-26 19:14:29 +02:00
Eren Gölge	fa4ec83c6e	Bump up version to v0.4.0	2021-10-26 18:27:39 +02:00
Eren Gölge	035ed432bc	Doc update (#889) * Link source files from the docs * Update glowTTS recipes for docs * Add dataset downloaders	2021-10-26 17:41:33 +02:00
Eren Gölge	0cac3f330a	Enable custom formatter in load_tts_samples	2021-10-26 13:07:11 +02:00
Eren Gölge	7c10574931	Gateway for TTS models	2021-10-26 13:04:51 +02:00
Eren Gölge	00becf2671	Fix import statements	2021-10-25 19:29:16 +02:00
Eren Gölge	027424dda8	Add VCTK fast_pitch and UK glow-tts	2021-10-25 19:29:16 +02:00
Eren Gölge	70e4d0e524	Fix grad_norm handling	2021-10-21 16:29:06 +00:00
Eren Gölge	a409e0f8f8	Update train_tts for multi-speaker	2021-10-21 16:29:06 +00:00
Eren Gölge	2b7d159383	Update BaseTTS for multi-speaker training	2021-10-21 16:29:06 +00:00
Eren Gölge	e62d3c5cf7	Use absolute imports for tts configs and models	2021-10-21 16:29:06 +00:00
Eren Gölge	82fed4add2	Make style	2021-10-21 16:05:51 +00:00
Eren Gölge	3cb07fb6b5	Fix SpeakerManager init with data items	2021-10-21 13:54:39 +00:00
Eren Gölge	aea90e2501	Comment synthesis.py	2021-10-21 13:53:45 +00:00
Eren Gölge	1987aaaaed	Update d-vector reshape in synthesizer	2021-10-21 13:53:25 +00:00
Eren Gölge	3ab009ca8d	Edit model configs for multi-speaker	2021-10-21 13:51:37 +00:00
Eren Gölge	cea8e1739b	Update AlignTTS to use SpeakerManager	2021-10-20 18:22:41 +00:00
Eren Gölge	0e768dd4c5	Update comments	2021-10-20 18:21:26 +00:00
Eren Gölge	7c2cb7cc30	Update BaseTTS	2021-10-20 18:18:22 +00:00
Eren Gölge	330ee7d208	Comment BaseTacotron and remove unused funcs	2021-10-20 18:17:25 +00:00
Eren Gölge	aa25f70b95	Update ForwardTTS for multi-speaker	2021-10-20 18:16:41 +00:00
Eren Gölge	0ebc2a400e	Implement `_set_speaker_embedding` in GlowTTS	2021-10-20 18:15:20 +00:00
Eren Gölge	3da79a4de4	Comment Tacotron2 model	2021-10-20 18:14:04 +00:00
Eren Gölge	92b6d98443	Set pitch frame alignment wrt spec computation	2021-10-20 18:12:38 +00:00
Eren Gölge	0a3d1cc7ee	Pass speaker manager to the model in synthesizer	2021-10-20 18:11:36 +00:00
Eren Gölge	588da1a24e	Simplify grad_norm handling in trainer	2021-10-19 16:33:04 +00:00
Eren Gölge	3c7848e9b1	Don't OOR values in train console log	2021-10-19 16:32:16 +00:00
Eren Gölge	c514351c0e	Refactor multi-speaker init in BaseTTS-Tacotron1-2	2021-10-18 08:55:45 +00:00
Eren Gölge	127571423c	Update multi-speaker init in BaseTTS	2021-10-18 08:54:41 +00:00
Eren Gölge	a0a5d580e9	Approximate audio length from file size	2021-10-18 08:54:02 +00:00
Eren Gölge	b4b890df03	Update trainer's initialization	2021-10-18 08:53:19 +00:00
Eren Gölge	fcbfc53cb7	Fix linter	2021-10-15 10:24:19 +00:00
Eren Gölge	700b056117	Update Synthesizer multi-speaker handling	2021-10-15 10:21:12 +00:00
Eren Gölge	073a2d2eb0	Refactor VITS multi-speaker initialization	2021-10-15 10:20:00 +00:00
Eren Gölge	0565457faa	Fix #846	2021-10-14 14:46:14 +00:00
Eren Gölge	e15bc157d8	Fix #873	2021-10-14 14:39:45 +00:00
Eren Gölge	21cc0517a3	Fix WaveRNN test	2021-10-01 10:21:37 +00:00
Eren Gölge	4dbe7ed0de	Fix all-zero duration case for GlowTTS	2021-10-01 09:24:26 +00:00
Eren Gölge	37959ad0c7	Make linter	2021-09-30 23:02:16 +00:00
Eren Gölge	0b1986384f	Make style	2021-09-30 16:21:18 +00:00
Eren Gölge	7edbe04fe0	Fix WaveRNN config and test	2021-09-30 16:20:12 +00:00
Eren Gölge	55d9209221	Remote STT tokenizer	2021-09-30 14:58:26 +00:00
Eren Gölge	ba2b8c827f	Update `train_tts.py` and `train_vocoder.py`	2021-09-30 14:47:56 +00:00
Eren Gölge	2e9b6b4f90	Refactor Speaker Encoder training	2021-09-30 14:47:56 +00:00
Eren Gölge	043dca61b4	Rename `load_meta_data` as `load_tts_data`	2021-09-30 14:47:56 +00:00
Eren Gölge	9f23ad6a0f	Fix imports	2021-09-30 14:47:56 +00:00
Eren Gölge	16b70be0dd	Add `_set_model_args` to BaseModel	2021-09-30 14:47:56 +00:00
Eren Gölge	9a0d8fa027	Update `copy_model_files()`	2021-09-30 14:47:56 +00:00
Eren Gölge	4163b4f2e4	Update Tacotron models	2021-09-30 14:47:56 +00:00
Eren Gölge	e27feade38	Fixup wavernn	2021-09-30 14:47:56 +00:00
Eren Gölge	45889804c2	Update VITS	2021-09-30 14:47:56 +00:00
Eren Gölge	4f94f91305	Update WaveRNN	2021-09-30 14:47:56 +00:00
Eren Gölge	3d5205d66f	Update WaveGrad	2021-09-30 14:47:56 +00:00
Eren Gölge	fd95926009	Update GlowTTS	2021-09-30 14:47:56 +00:00
Eren Gölge	4baecdf92a	Update GAN for Trainer_v2	2021-09-30 14:47:56 +00:00
Eren Gölge	a156a40b47	Update ForwardTTS for Trainer_v2	2021-09-30 14:19:19 +00:00
Eren Gölge	d9df33f837	Update `align_tts` for trainer_v2	2021-09-30 14:18:10 +00:00
Eren Gölge	8ada870a57	Refactor `trainer.py` for v2	2021-09-30 14:16:34 +00:00
Eren Gölge	7f388f26e3	Bump up to v0.3.1	2021-09-17 23:53:22 +00:00
Eren Gölge	2766dd1d6e	Fix #813 - GlowTTS training (#814) * Fix #813 * Update glow_tts recipe * Fix glow-tts test * Linter fix * Run data dep init only in training	2021-09-17 20:06:55 +02:00
Eren Gölge	f563415052	Bump up to v0.3.0	2021-09-13 09:40:38 +00:00
Eren Gölge	a97dc8d09f	Fix trainer malformatted print	2021-09-13 08:32:02 +00:00
Eren Gölge	91bebebe18	Add new models to `.models.json` SpeedySpeech model using `ForwardTTS` UnivNet model fine-tuned on TacotronDDC_ph spectrograms	2021-09-13 08:22:14 +00:00
Eren Gölge	1ea011571a	Update SpeedySpeech config	2021-09-12 15:33:27 +00:00
Eren Gölge	cbbc9e0172	Add FastSpeechConfig	2021-09-11 10:20:37 +00:00
Eren Gölge	26f76fce22	Remove SpeedySpeech from .models.json	2021-09-10 17:47:27 +00:00
Eren Gölge	d97952611d	Remove unused import	2021-09-10 17:31:41 +00:00
Eren Gölge	7d8f77385a	Use `glow-tts` in synthesis tests	2021-09-10 17:27:33 +00:00
Eren Gölge	d5f256b34c	Update tacotron `r` init	2021-09-10 17:26:23 +00:00
Eren Gölge	ab37fa9c39	Edit AlignTTS	2021-09-10 17:25:00 +00:00
Eren Gölge	66732025e1	Add `base_model` field to `forward_tts` configs	2021-09-10 17:23:48 +00:00
Eren Gölge	d6e29ef98a	Style update	2021-09-10 08:30:33 +00:00
Eren Gölge	a89eb12aca	Fix glow_tts imports	2021-09-10 08:29:51 +00:00
Eren Gölge	570d5971be	Implement `ForwardTTSLoss`	2021-09-10 08:29:12 +00:00
Eren Gölge	0541a25e90	Remove `fastpitch.py` and `speedy_speech.py`	2021-09-10 08:27:48 +00:00
Eren Gölge	3c16013199	Fix Vits imports	2021-09-10 08:26:34 +00:00
Eren Gölge	742f9c54da	Warn user if nan in GL	2021-09-10 08:26:05 +00:00
Eren Gölge	ed4b1d8514	Test `TTS.tts.utils.helpers`	2021-09-10 08:25:21 +00:00
Eren Gölge	8b7e094bde	Implement `forward_tts` - Generic API for feed-forward TTS models (FastPitch, SpeedySpeech) - Tests for `forward-tts` - Edit FastPitchConfig and SpeedySpeechConfig to use `forward_tts`	2021-09-10 08:24:33 +00:00
Eren Gölge	3c740d4893	Style extract_tts_spectrogram.py	2021-09-10 08:21:21 +00:00
Eren Gölge	bfc6ceac29	Move MAS to `TTS.tts.utils.helpers`	2021-09-09 10:57:19 +00:00
Eren Gölge	2dfc5bdd11	Fix best_model_path init if no best_mode	2021-09-09 09:01:52 +00:00
Eren Gölge	abf5e48177	Fix logging current learning rate in trainer	2021-09-09 09:01:04 +00:00
Eren Gölge	6c4c1065b0	Fix trainer's scheduler restoring	2021-09-09 09:00:27 +00:00
Eren Gölge	807f1d3817	Fix `extract_tts_spectrograms.py` model init	2021-09-09 08:59:55 +00:00
Eren Gölge	537c8576ec	Stage `TTS.tts.utils.helpers`	2021-09-08 13:35:18 +00:00
Eren Gölge	4761853c5c	Fix imports	2021-09-08 13:34:40 +00:00
Eren Gölge	e20ea57c87	Update comment and add a warning	2021-09-07 12:23:32 +00:00
Eren Gölge	82598f3fdb	Bump up to v0.2.2	2021-09-06 16:59:41 +00:00
Eren Gölge	4cc544bc46	Add FastPitch model to `.models.json`	2021-09-06 16:59:22 +00:00
Eren Gölge	2c4bbbf9b9	Use pyworld for pitch	2021-09-06 15:16:58 +00:00
Eren Gölge	c1513ec4cd	Plot pitch over spectrogram	2021-09-06 15:16:58 +00:00
Eren Gölge	d847a68e42	Reformat multi-speaker handling in GlowTTS	2021-09-06 15:16:58 +00:00
Eren Gölge	8d41060d36	Plot unnormalized pitch by `FastPitch`	2021-09-06 15:16:58 +00:00
Eren Gölge	2b59da802c	Fix loader setup in `base_tts`	2021-09-06 15:16:58 +00:00
Eren Gölge	76c4929ab2	Fix attn mask reading bug	2021-09-06 15:16:58 +00:00
Eren Gölge	91a70e80b2	Refactor TTSDataset Return a dict by `collate` Refactor batch handling in `collate` A couple of bug fixes	2021-09-06 15:16:58 +00:00
Eren Gölge	29248536c9	Update `PositionalEncoding`	2021-09-06 15:16:58 +00:00
Eren Gölge	4672889549	Update `generic.FFTransformer`	2021-09-06 15:16:58 +00:00
Eren Gölge	2bf9e83c49	FastPitch refactor and commenting	2021-09-06 15:16:58 +00:00
Eren Gölge	59b24e66cf	Add `AlignerNetwork`	2021-09-06 15:16:58 +00:00
Eren Gölge	648655fa03	Add `PitchExtractor` and return dict by `collate`	2021-09-06 15:16:58 +00:00
Eren Gölge	debf772ec5	Implement binary alignment loss	2021-09-06 15:16:58 +00:00
Eren Gölge	6e9d4062f2	Add `sort_by_audio_len` option	2021-09-06 15:16:58 +00:00
Eren Gölge	59d52a4cd8	Disable autcast for criterions	2021-09-06 15:16:58 +00:00
Eren Gölge	98a7271ce8	Refactor FastPitchv2	2021-09-06 15:16:58 +00:00
Eren Gölge	e429afbce4	Enable aligner for FastPitch	2021-09-06 15:16:58 +00:00
Eren Gölge	81c228a2d8	Update FastPitch don't detach duration network inputs	2021-09-06 15:16:58 +00:00
Eren Gölge	ca29033ef4	Refactor FastPitch model	2021-09-06 15:16:58 +00:00
Eren Gölge	42862f7fdb	Format style of the recipes	2021-09-06 15:16:58 +00:00
Eren Gölge	5d59100a88	Don't use align_score for models with duration predictor	2021-09-06 15:16:58 +00:00
Eren Gölge	fac9dbe661	Update FastPitchLoss	2021-09-06 15:16:58 +00:00
Eren Gölge	b81560607b	Update docstrings	2021-09-06 15:16:58 +00:00
Eren Gölge	57b3aec1b9	Update docstring format	2021-09-06 15:16:58 +00:00
Eren Gölge	7692bfe7f8	Update FastPitch config	2021-09-06 15:16:58 +00:00
Eren Gölge	8584f2b82d	Update docstring format	2021-09-06 15:16:58 +00:00
Eren Gölge	b7caad39e0	Make optional to detach duration predictor input	2021-09-06 15:16:58 +00:00
Eren Gölge	9af42f7886	Restore `last_epoch` of the scheduler	2021-09-06 15:16:58 +00:00
Eren Gölge	aacbb3ed77	Fix SpeakerManager usage in `synthesize.py`	2021-09-06 15:16:58 +00:00
Eren Gölge	545a00fc04	Use absolute paths of the attention masks	2021-09-06 15:16:58 +00:00
Eren Gölge	bc396c393f	Add FastPitch model and FastPitchconfig	2021-09-06 15:16:58 +00:00
Eren Gölge	5a6ffaee08	Add yin based pitch computation	2021-09-06 15:16:58 +00:00

1339 Commits