coqui-tts

Commit Graph

Author	SHA1	Message	Date
Edresson	1bd1a0546b	Add audio resample in the speaker consistency loss	2021-12-20 11:54:10 +00:00
Edresson	1c6bcda950	Add freeze vocoder generator and flow-based decoder option	2021-12-20 11:54:10 +00:00
WeberJulian	2b952d8b97	freeze vits parts	2021-12-20 11:54:10 +00:00
WeberJulian	005bba60b0	get_speaker_weighted_sampler	2021-12-20 11:54:10 +00:00
Edresson	9de4539422	Update the VITS model docs	2021-12-20 11:54:10 +00:00
Edresson	eeb8ac07d9	Add voice conversion fine tuning mode	2021-12-20 11:54:10 +00:00
Edresson	690b37d0ab	Add support to use the speaker encoder as loss function in VITS model	2021-12-20 11:54:09 +00:00
Edresson	9b011b1cb3	Add H/ASP original checkpoint support	2021-12-20 11:54:09 +00:00
Edresson	0bdfd3cb50	Add the ValueError in the restore checkpoint exception to avoid problems with the optimizer restoration when new keys are added	2021-12-20 11:54:09 +00:00
Edresson	de78556655	Fix the optimizer parameters bug in multilingual and multispeaker training	2021-12-20 11:54:09 +00:00
Edresson	9be5b75da3	Fix bug after merge	2021-12-20 11:54:09 +00:00
Edresson	76251b619a	Fix d-vector multispeaker training bug	2021-12-20 11:54:09 +00:00
Edresson	7ef3ddc6ff	Fix unit tests	2021-12-20 11:54:09 +00:00
Edresson	36dcd11453	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	c53693c155	Implement vocoder Fine Tuning like SC-GlowTTS paper	2021-12-20 11:54:09 +00:00
Edresson	f1f016314e	Fix the bug in M-AILABS formatter	2021-12-20 11:54:09 +00:00
Edresson	c334d39acc	Add voice conversion support for the model VITS trained with external speaker embedding	2021-12-20 11:54:09 +00:00
Edresson	e997889ba8	Fix bug in VITS multilingual inference	2021-12-20 11:54:09 +00:00
Edresson	7c0b8ec572	Fix bugs in the non-multilingual VITS inference	2021-12-20 11:54:09 +00:00
Edresson	3fbbebd74d	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	ac9416fb86	Add multilingual inference support	2021-12-20 11:54:09 +00:00
Edresson	dcb2374bc9	Add multilingual training support to the VITS model	2021-12-20 11:54:09 +00:00
Edresson	f996afedb0	Implement multilingual dataloader support	2021-12-20 11:54:09 +00:00
Edresson	5f1c18187f	Fix pylint issues	2021-12-20 11:54:09 +00:00
Edresson	d91c595c5a	Implement training support with d_vecs in the VITS model	2021-12-20 11:54:09 +00:00
Edresson	6a7db67a91	Allow ignore speakers for all multispeaker datasets	2021-12-20 11:54:09 +00:00
Edresson	e0ad838066	Select randomly a speaker from the speaker manager for the test sentences	2021-12-20 11:54:09 +00:00
Edresson	eb3e8affe1	Save speakers embeddings/ids before starting training	2021-12-20 11:54:09 +00:00
Eren Gölge	37803467aa	Merge pull request #1021 from loganhart420/dataset_downloaders Add additional datasets	2021-12-20 10:42:20 +01:00
Reuben Morais	859ac1a54c	Include usage instructions in README	2021-12-17 11:37:19 +01:00
loganhart420	103c010eca	Add additional datasets	2021-12-16 07:21:27 -05:00
Jörg Thalheim	bce143c738	server: fix compatibility with tts_models/en/ljspeech/fast_pitch (#893)	2021-12-07 14:36:29 +01:00
Eren Gölge	babdd84f91	Fix GST inference commit d3e477875a7e46a101fcf95a1794442823750fe2 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit fdece14585ab5a36eed1061a9a838d8e48aa6882 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit 908ce39370eadcc9fa8510cdb26c9ead87305427 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:49:37 2021 +0100 Make trim_db value negative commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:22:24 2021 +0100 Set find_endpoint db threshold in config.json	2021-12-07 13:28:49 +00:00
Eren Gölge	ce45d9e1af	Make style and lint	2021-12-01 10:42:52 +00:00
Eren Gölge	40cb8ac966	Fix #958	2021-12-01 10:33:34 +00:00
Eren Gölge	512ada7548	Fix callbacks against multi-gpu training	2021-12-01 10:32:14 +00:00
Eren Gölge	2ed9e3c241	Fix constant use of noise augment	2021-11-08 09:20:34 +01:00
Eren Gölge	b6b14a76af	Fix VITS stochastic duration predictor	2021-11-08 09:20:11 +01:00
Eren Gölge	dc3dd55dd9	Add collect_env_info.py	2021-11-08 08:59:08 +01:00
Eren Gölge	faafea4cf2	Fix style	2021-11-04 17:04:40 +01:00
Eren Gölge	d227aaebcc	Print when using Griffin-Lim in Synthesizer	2021-11-01 16:52:26 +01:00
Eren Gölge	c5077c6c3f	Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev	2021-11-01 16:42:27 +01:00
Eren Gölge	20cebde1c9	Add docstring to MAI labs formatter	2021-11-01 16:41:55 +01:00
Eren Gölge	608f437545	Add a function to find unique chars	2021-11-01 16:41:33 +01:00
Eren Gölge	d6d780e758	Fix FastSpeech config	2021-11-01 16:41:15 +01:00
Eren Gölge	5ba47081ee	Use GL for VCTK FastPitch models	2021-11-01 16:39:03 +01:00
Michael Hansen	3bc043faeb	Upgrade to gruut 2.0 (#882)	2021-10-31 11:41:55 +01:00
George	37eaefc085	Optional silence trimming during inference and find_endpoint() fix (#898) * Set find_endpoint db threshold in config.json * Optional silence trimming during inference * Make trim_db value negative	2021-10-29 18:28:55 +02:00
Eren Gölge	7293abada2	Bump up to v0.4.2	2021-10-29 17:57:30 +02:00
Eren Gölge	2df0752e73	Model zoo tests (#900) * Fix VITS model multi-speaker init * Remove gdrive support in model manager * Add model zoo tests	2021-10-29 17:54:16 +02:00
Eren Gölge	aaaa591485	Bump up version to v0.4.1	2021-10-26 19:24:17 +02:00
Eren Gölge	3ea1c2037b	Fix model entry in .models.json	2021-10-26 19:14:29 +02:00
Eren Gölge	fa4ec83c6e	Bump up version to v0.4.0	2021-10-26 18:27:39 +02:00
Eren Gölge	035ed432bc	Doc update (#889) * Link source files from the docs * Update glowTTS recipes for docs * Add dataset downloaders	2021-10-26 17:41:33 +02:00
Eren Gölge	0cac3f330a	Enable custom formatter in load_tts_samples	2021-10-26 13:07:11 +02:00
Eren Gölge	7c10574931	Gateway for TTS models	2021-10-26 13:04:51 +02:00
Eren Gölge	00becf2671	Fix import statements	2021-10-25 19:29:16 +02:00
Eren Gölge	027424dda8	Add VCTK fast_pitch and UK glow-tts	2021-10-25 19:29:16 +02:00
Eren Gölge	70e4d0e524	Fix grad_norm handling	2021-10-21 16:29:06 +00:00
Eren Gölge	a409e0f8f8	Update train_tts for multi-speaker	2021-10-21 16:29:06 +00:00
Eren Gölge	2b7d159383	Update BaseTTS for multi-speaker training	2021-10-21 16:29:06 +00:00
Eren Gölge	e62d3c5cf7	Use absolute imports for tts configs and models	2021-10-21 16:29:06 +00:00
Eren Gölge	82fed4add2	Make style	2021-10-21 16:05:51 +00:00
Eren Gölge	3cb07fb6b5	Fix SpeakerManager init with data items	2021-10-21 13:54:39 +00:00
Eren Gölge	aea90e2501	Comment synthesis.py	2021-10-21 13:53:45 +00:00
Eren Gölge	1987aaaaed	Update d-vector reshape in synthesizer	2021-10-21 13:53:25 +00:00
Eren Gölge	3ab009ca8d	Edit model configs for multi-speaker	2021-10-21 13:51:37 +00:00
Eren Gölge	cea8e1739b	Update AlignTTS to use SpeakerManager	2021-10-20 18:22:41 +00:00
Eren Gölge	0e768dd4c5	Update comments	2021-10-20 18:21:26 +00:00
Eren Gölge	7c2cb7cc30	Update BaseTTS	2021-10-20 18:18:22 +00:00
Eren Gölge	330ee7d208	Comment BaseTacotron and remove unused funcs	2021-10-20 18:17:25 +00:00
Eren Gölge	aa25f70b95	Update ForwardTTS for multi-speaker	2021-10-20 18:16:41 +00:00
Eren Gölge	0ebc2a400e	Implement `_set_speaker_embedding` in GlowTTS	2021-10-20 18:15:20 +00:00
Eren Gölge	3da79a4de4	Comment Tacotron2 model	2021-10-20 18:14:04 +00:00
Eren Gölge	92b6d98443	Set pitch frame alignment wrt spec computation	2021-10-20 18:12:38 +00:00
Eren Gölge	0a3d1cc7ee	Pass speaker manager to the model in synthesizer	2021-10-20 18:11:36 +00:00
Eren Gölge	588da1a24e	Simplify grad_norm handling in trainer	2021-10-19 16:33:04 +00:00
Eren Gölge	3c7848e9b1	Don't OOR values in train console log	2021-10-19 16:32:16 +00:00
Eren Gölge	c514351c0e	Refactor multi-speaker init in BaseTTS-Tacotron1-2	2021-10-18 08:55:45 +00:00
Eren Gölge	127571423c	Update multi-speaker init in BaseTTS	2021-10-18 08:54:41 +00:00
Eren Gölge	a0a5d580e9	Approximate audio length from file size	2021-10-18 08:54:02 +00:00
Eren Gölge	b4b890df03	Update trainer's initialization	2021-10-18 08:53:19 +00:00
Eren Gölge	fcbfc53cb7	Fix linter	2021-10-15 10:24:19 +00:00
Eren Gölge	700b056117	Update Synthesizer multi-speaker handling	2021-10-15 10:21:12 +00:00
Eren Gölge	073a2d2eb0	Refactor VITS multi-speaker initialization	2021-10-15 10:20:00 +00:00
Eren Gölge	0565457faa	Fix #846	2021-10-14 14:46:14 +00:00
Eren Gölge	e15bc157d8	Fix #873	2021-10-14 14:39:45 +00:00
Eren Gölge	21cc0517a3	Fix WaveRNN test	2021-10-01 10:21:37 +00:00
Eren Gölge	4dbe7ed0de	Fix all-zero duration case for GlowTTS	2021-10-01 09:24:26 +00:00
Eren Gölge	37959ad0c7	Make linter	2021-09-30 23:02:16 +00:00
Eren Gölge	0b1986384f	Make style	2021-09-30 16:21:18 +00:00
Eren Gölge	7edbe04fe0	Fix WaveRNN config and test	2021-09-30 16:20:12 +00:00
Eren Gölge	55d9209221	Remote STT tokenizer	2021-09-30 14:58:26 +00:00
Eren Gölge	ba2b8c827f	Update `train_tts.py` and `train_vocoder.py`	2021-09-30 14:47:56 +00:00
Eren Gölge	2e9b6b4f90	Refactor Speaker Encoder training	2021-09-30 14:47:56 +00:00
Eren Gölge	043dca61b4	Rename `load_meta_data` as `load_tts_data`	2021-09-30 14:47:56 +00:00
Eren Gölge	9f23ad6a0f	Fix imports	2021-09-30 14:47:56 +00:00
Eren Gölge	16b70be0dd	Add `_set_model_args` to BaseModel	2021-09-30 14:47:56 +00:00
Eren Gölge	9a0d8fa027	Update `copy_model_files()`	2021-09-30 14:47:56 +00:00
Eren Gölge	4163b4f2e4	Update Tacotron models	2021-09-30 14:47:56 +00:00
Eren Gölge	e27feade38	Fixup wavernn	2021-09-30 14:47:56 +00:00
Eren Gölge	45889804c2	Update VITS	2021-09-30 14:47:56 +00:00
Eren Gölge	4f94f91305	Update WaveRNN	2021-09-30 14:47:56 +00:00
Eren Gölge	3d5205d66f	Update WaveGrad	2021-09-30 14:47:56 +00:00
Eren Gölge	fd95926009	Update GlowTTS	2021-09-30 14:47:56 +00:00
Eren Gölge	4baecdf92a	Update GAN for Trainer_v2	2021-09-30 14:47:56 +00:00
Eren Gölge	a156a40b47	Update ForwardTTS for Trainer_v2	2021-09-30 14:19:19 +00:00
Eren Gölge	d9df33f837	Update `align_tts` for trainer_v2	2021-09-30 14:18:10 +00:00
Eren Gölge	8ada870a57	Refactor `trainer.py` for v2	2021-09-30 14:16:34 +00:00
Eren Gölge	7f388f26e3	Bump up to v0.3.1	2021-09-17 23:53:22 +00:00
Eren Gölge	2766dd1d6e	Fix #813 - GlowTTS training (#814) * Fix #813 * Update glow_tts recipe * Fix glow-tts test * Linter fix * Run data dep init only in training	2021-09-17 20:06:55 +02:00
Eren Gölge	f563415052	Bump up to v0.3.0	2021-09-13 09:40:38 +00:00
Eren Gölge	a97dc8d09f	Fix trainer malformatted print	2021-09-13 08:32:02 +00:00
Eren Gölge	91bebebe18	Add new models to `.models.json` SpeedySpeech model using `ForwardTTS` UnivNet model fine-tuned on TacotronDDC_ph spectrograms	2021-09-13 08:22:14 +00:00
Eren Gölge	1ea011571a	Update SpeedySpeech config	2021-09-12 15:33:27 +00:00
Eren Gölge	cbbc9e0172	Add FastSpeechConfig	2021-09-11 10:20:37 +00:00
Eren Gölge	26f76fce22	Remove SpeedySpeech from .models.json	2021-09-10 17:47:27 +00:00
Eren Gölge	d97952611d	Remove unused import	2021-09-10 17:31:41 +00:00
Eren Gölge	7d8f77385a	Use `glow-tts` in synthesis tests	2021-09-10 17:27:33 +00:00
Eren Gölge	d5f256b34c	Update tacotron `r` init	2021-09-10 17:26:23 +00:00
Eren Gölge	ab37fa9c39	Edit AlignTTS	2021-09-10 17:25:00 +00:00
Eren Gölge	66732025e1	Add `base_model` field to `forward_tts` configs	2021-09-10 17:23:48 +00:00
Eren Gölge	d6e29ef98a	Style update	2021-09-10 08:30:33 +00:00
Eren Gölge	a89eb12aca	Fix glow_tts imports	2021-09-10 08:29:51 +00:00
Eren Gölge	570d5971be	Implement `ForwardTTSLoss`	2021-09-10 08:29:12 +00:00
Eren Gölge	0541a25e90	Remove `fastpitch.py` and `speedy_speech.py`	2021-09-10 08:27:48 +00:00
Eren Gölge	3c16013199	Fix Vits imports	2021-09-10 08:26:34 +00:00
Eren Gölge	742f9c54da	Warn user if nan in GL	2021-09-10 08:26:05 +00:00
Eren Gölge	ed4b1d8514	Test `TTS.tts.utils.helpers`	2021-09-10 08:25:21 +00:00
Eren Gölge	8b7e094bde	Implement `forward_tts` - Generic API for feed-forward TTS models (FastPitch, SpeedySpeech) - Tests for `forward-tts` - Edit FastPitchConfig and SpeedySpeechConfig to use `forward_tts`	2021-09-10 08:24:33 +00:00
Eren Gölge	3c740d4893	Style extract_tts_spectrogram.py	2021-09-10 08:21:21 +00:00
Eren Gölge	bfc6ceac29	Move MAS to `TTS.tts.utils.helpers`	2021-09-09 10:57:19 +00:00
Eren Gölge	2dfc5bdd11	Fix best_model_path init if no best_mode	2021-09-09 09:01:52 +00:00
Eren Gölge	abf5e48177	Fix logging current learning rate in trainer	2021-09-09 09:01:04 +00:00
Eren Gölge	6c4c1065b0	Fix trainer's scheduler restoring	2021-09-09 09:00:27 +00:00
Eren Gölge	807f1d3817	Fix `extract_tts_spectrograms.py` model init	2021-09-09 08:59:55 +00:00
Eren Gölge	537c8576ec	Stage `TTS.tts.utils.helpers`	2021-09-08 13:35:18 +00:00
Eren Gölge	4761853c5c	Fix imports	2021-09-08 13:34:40 +00:00
Eren Gölge	e20ea57c87	Update comment and add a warning	2021-09-07 12:23:32 +00:00
Eren Gölge	82598f3fdb	Bump up to v0.2.2	2021-09-06 16:59:41 +00:00
Eren Gölge	4cc544bc46	Add FastPitch model to `.models.json`	2021-09-06 16:59:22 +00:00
Eren Gölge	2c4bbbf9b9	Use pyworld for pitch	2021-09-06 15:16:58 +00:00
Eren Gölge	c1513ec4cd	Plot pitch over spectrogram	2021-09-06 15:16:58 +00:00
Eren Gölge	d847a68e42	Reformat multi-speaker handling in GlowTTS	2021-09-06 15:16:58 +00:00
Eren Gölge	8d41060d36	Plot unnormalized pitch by `FastPitch`	2021-09-06 15:16:58 +00:00
Eren Gölge	2b59da802c	Fix loader setup in `base_tts`	2021-09-06 15:16:58 +00:00
Eren Gölge	76c4929ab2	Fix attn mask reading bug	2021-09-06 15:16:58 +00:00
Eren Gölge	91a70e80b2	Refactor TTSDataset Return a dict by `collate` Refactor batch handling in `collate` A couple of bug fixes	2021-09-06 15:16:58 +00:00
Eren Gölge	29248536c9	Update `PositionalEncoding`	2021-09-06 15:16:58 +00:00
Eren Gölge	4672889549	Update `generic.FFTransformer`	2021-09-06 15:16:58 +00:00
Eren Gölge	2bf9e83c49	FastPitch refactor and commenting	2021-09-06 15:16:58 +00:00
Eren Gölge	59b24e66cf	Add `AlignerNetwork`	2021-09-06 15:16:58 +00:00
Eren Gölge	648655fa03	Add `PitchExtractor` and return dict by `collate`	2021-09-06 15:16:58 +00:00
Eren Gölge	debf772ec5	Implement binary alignment loss	2021-09-06 15:16:58 +00:00
Eren Gölge	6e9d4062f2	Add `sort_by_audio_len` option	2021-09-06 15:16:58 +00:00
Eren Gölge	59d52a4cd8	Disable autocast for criterions	2021-09-06 15:16:58 +00:00
Eren Gölge	98a7271ce8	Refactor FastPitchv2	2021-09-06 15:16:58 +00:00
Eren Gölge	e429afbce4	Enable aligner for FastPitch	2021-09-06 15:16:58 +00:00
Eren Gölge	81c228a2d8	Update FastPitch don't detach duration network inputs	2021-09-06 15:16:58 +00:00
Eren Gölge	ca29033ef4	Refactor FastPitch model	2021-09-06 15:16:58 +00:00
Eren Gölge	42862f7fdb	Format style of the recipes	2021-09-06 15:16:58 +00:00
Eren Gölge	5d59100a88	Don't use align_score for models with duration predictor	2021-09-06 15:16:58 +00:00
Eren Gölge	fac9dbe661	Update FastPitchLoss	2021-09-06 15:16:58 +00:00
Eren Gölge	b81560607b	Update docstrings	2021-09-06 15:16:58 +00:00
Eren Gölge	57b3aec1b9	Update docstring format	2021-09-06 15:16:58 +00:00
Eren Gölge	7692bfe7f8	Update FastPitch config	2021-09-06 15:16:58 +00:00
Eren Gölge	8584f2b82d	Update docstring format	2021-09-06 15:16:58 +00:00
Eren Gölge	b7caad39e0	Make optional to detach duration predictor input	2021-09-06 15:16:58 +00:00
Eren Gölge	9af42f7886	Restore `last_epoch` of the scheduler	2021-09-06 15:16:58 +00:00
Eren Gölge	aacbb3ed77	Fix SpeakerManager usage in `synthesize.py`	2021-09-06 15:16:58 +00:00
Eren Gölge	545a00fc04	Use absolute paths of the attention masks	2021-09-06 15:16:58 +00:00
Eren Gölge	bc396c393f	Add FastPitch model and FastPitchconfig	2021-09-06 15:16:58 +00:00
Eren Gölge	5a6ffaee08	Add yin based pitch computation	2021-09-06 15:16:58 +00:00
Eren Gölge	e802b24ad0	Compute mean and std pitch	2021-09-06 15:16:58 +00:00
Eren Gölge	8fffd4e813	Don't print computed phonemes It causes noise in logs	2021-09-06 15:16:58 +00:00
Eren Gölge	d085642ac1	Cache pitch features Cache the features at the beginning of `BaseTTS` training.	2021-09-06 15:16:58 +00:00
Eren Gölge	7590c7db7a	Fix `base_tacotron` `aux_input` handling	2021-09-06 15:16:58 +00:00
Eren Gölge	db32162eae	Fix `FastPitchLoss`	2021-09-06 15:16:58 +00:00
Eren Gölge	94e8e0d416	Fix configs	2021-09-06 15:16:58 +00:00
Eren Gölge	0f19f8c911	Fix `compute_attention_masks.py`	2021-09-06 15:16:58 +00:00
Eren Gölge	994f2be2c1	Add comput_f0 field	2021-09-06 15:16:58 +00:00
Eren Gölge	c8d999b010	Add FastPitchLoss	2021-09-06 15:16:58 +00:00
Eren Gölge	fba257104d	Compute F0 using librosa	2021-09-06 15:16:58 +00:00
Katsuya Iida	165e5814af	Update Japanese phonemizer (#758) * Update default ja vocoder * update * Japanese phonemizer test * Run make style Co-authored-by: Eren Gölge <egolge@coqui.ai>	2021-09-01 09:33:15 +02:00
Eren Gölge	2b7e55f01f	Fix vits args types	2021-08-30 23:24:20 +00:00
Eren Gölge	b910a6ddce	Bump up to v0.2.1	2021-08-30 16:31:24 +00:00
Eren Gölge	d16da949a5	Merge branch 'fix_distribute' into dev	2021-08-30 16:31:07 +00:00
Eren Gölge	6782d3eab7	Fix linter issues for p3.6	2021-08-30 16:18:33 +00:00
Eren Gölge	738eee0cf9	Fix style	2021-08-30 13:12:13 +00:00
Eren Gölge	5255e089e6	Fix #767	2021-08-30 13:10:08 +00:00
Eren Gölge	c560114324	Fix #750	2021-08-30 13:06:50 +00:00
Eren Gölge	18b2e41e5a	Use `coqui_tts` as the default run name	2021-08-30 12:56:47 +00:00
Eren Gölge	9c86f1ac68	Fix usage of abstract class in vocoders	2021-08-30 08:10:35 +00:00
Eren Gölge	18da8f5dbd	Update pylint 2.10.2 and fix lint issues	2021-08-30 08:10:35 +00:00
Eren Gölge	f186856e5d	Add option to sort input sequence by audio len	2021-08-30 08:10:35 +00:00
Eren Gölge	2620f62ea8	Move duration_loss inside VitsGeneratorLoss	2021-08-27 07:07:07 +00:00
Eren Gölge	1692b8e4d9	Merge pull request #726 from fijipants/patch-1 Fix bug with log_func	2021-08-26 22:11:29 +02:00
Eren Gölge	49e1181ea4	Fixes for the vits model	2021-08-26 17:15:09 +00:00
Eren Gölge	5911eec3b1	Small trainer refactoring 1. Use a single Gradscaler for all the optimizers 2. Save terminal logs to a file. In DDP mode, each worker creates `trainer_N_log.txt`. 3. Fixes to allow only the main worker (rank==0) writing to Tensorboard 4. Pass parameters owned by the target optimizer to the grad_clip_norm	2021-08-26 17:08:58 +00:00
fijipants	e9e01b09b0	Fix bug with log_func	2021-08-18 19:59:51 -04:00
fijipants	8f57f8adfd	Update synthesizer.py	2021-08-18 19:56:52 -04:00
Eren Gölge	3ab8cef99e	Fix VITS model SPD	2021-08-18 14:55:46 +00:00
Eren Gölge	c5d1dd9d1b	Fix restoring best_loss Keep the default value if model checkpoint has no `model_loss`	2021-08-17 12:12:36 +00:00
Eren Gölge	c8bbcdfd07	Fix `test_run` for DDP	2021-08-13 19:39:02 +00:00
Eren Gölge	7c0d564965	Synchronize DDP processes	2021-08-13 10:40:50 +00:00
Eren Gölge	ecf5f17dca	Fix distribute.py and ddp training	2021-08-12 22:22:32 +00:00
Eren Gölge	b02c4fe347	Bump up to v0.2.0	2021-08-11 08:15:39 +00:00
Eren Gölge	537bc8487a	Print model count when listing models	2021-08-10 16:25:11 +00:00
Eren Gölge	09ed8426e8	Add the models released with v0.2.0	2021-08-10 15:46:31 +00:00
Eren Gölge	39004484b9	Fix 🐛 Fix synthesizer multi-speaker init Fix #712	2021-08-10 12:56:32 +00:00
Eren Gölge	c8b9ca3d71	Fix Tacotron num_char init	2021-08-10 08:56:34 +00:00
Eren Gölge	7eb94f760b	Remove Ruslan model	2021-08-09 21:48:36 +00:00
Eren Gölge	6af03ac476	Fix `num_char` init in Tacotron models	2021-08-09 21:46:15 +00:00
Ayush Chaurasia	e685ddfca7	Update trainer.py	2021-08-09 18:37:46 +00:00
Ayush Chaurasia	28870f8df4	update docstring	2021-08-09 18:35:35 +00:00
Ayush Chaurasia	8a246cbb66	Update trainer.py	2021-08-09 18:35:08 +00:00
Ayush Chaurasia	f3e9d61330	Refactor logging initialization	2021-08-09 18:35:08 +00:00
Ayush Chaurasia	79b74a989d	Update: add_text	2021-08-09 18:34:38 +00:00
Ayush Chaurasia	9fcf48b760	Delete logger_base.py	2021-08-09 18:34:00 +00:00
Ayush Chaurasia	290972fd35	reformat	2021-08-09 18:34:00 +00:00
Ayush Chaurasia	936a47504d	Update Logger API, recipes	2021-08-09 18:34:00 +00:00
Ayush Chaurasia	f63cf46c55	Unified logger API	2021-08-09 18:34:00 +00:00
Ayush Chaurasia	f4434da5a3	Update disabled structure	2021-08-09 18:31:16 +00:00
Ayush Chaurasia	f606741dc4	Add artifacts logging , wandb args	2021-08-09 18:31:16 +00:00
Ayush Chaurasia	f5e50ad502	WandbLogger	2021-08-09 18:27:06 +00:00
Eren Gölge	06018251e6	Add VITS and GlowTTS class docs 🗒️	2021-08-09 18:02:36 +00:00
Eren Gölge	6a7275881d	Add VitsConfig docstring	2021-08-09 18:02:36 +00:00
Eren Gölge	f7a72552f1	Make duration predictor dropout configurable	2021-08-09 18:02:36 +00:00
Eren Gölge	c312acac7d	Implement VITS model 🚀 VITS model implementation built on Glow TTS and HiFiGAN layers.	2021-08-09 18:02:36 +00:00
Eren Gölge	060e746e21	Add `do_amp_to_db` option	2021-08-09 18:02:36 +00:00
Eren Gölge	e94c1f894d	Simplify `console_logger`	2021-08-09 18:02:36 +00:00
Eren Gölge	dd55960732	Update `synthesizer.py` Fixes and changes for multi-speaker model init and custom symbols made by model.make_symbols()	2021-08-09 18:02:36 +00:00
Eren Gölge	232a5abb6a	Update `tts.setup_model` Run `model.make_symbols()` if available to set the symbol list	2021-08-09 18:02:36 +00:00
Eren Gölge	f5a6aa974f	Modify `symbols.py` not to add _arpanet	2021-08-09 18:02:36 +00:00
Eren Gölge	d4deb2716f	Modify `get_optimizer` to accept a model argument	2021-08-09 18:02:36 +00:00
Eren Gölge	003e5579e8	Enable `custom_symbols` in text processing Models can define their own custom symbols lists with custom `make_symbols()`	2021-08-09 18:02:36 +00:00
Eren Gölge	bd4e29b4dd	Add `compute_linear_spec=False` to `BaseTTSConfig`	2021-08-09 18:02:36 +00:00
Eren Gölge	960a35a121	Add `scheduler_after_epoch` to `BaseTrainingConfig`	2021-08-09 18:02:36 +00:00
Eren Gölge	e4648ffef1	Fix multi-speaker init of Tacotron models & tests	2021-08-09 18:02:36 +00:00
Eren Gölge	01324c8e70	Update `base_tts.py` Enable calling `make_symbols()` from the model if defined. Compatibility changes for end2end `tts` models in batch formatting. Changes in multi-speaker initialization. Modify `test_run()` to work with dict output of `synthesis`	2021-08-09 18:02:36 +00:00
Eren Gölge	bf562cf437	Update `trainer.py` Fix multi-speaker initialization of models. Add changes for end2end`tts` models.	2021-08-09 18:02:36 +00:00
Agrin Hilmkil	ced4cfdbbf	Allow saving / loading checkpoints from cloud paths (#683) * Allow saving / loading checkpoints from cloud paths Allows saving and loading checkpoints directly from cloud paths like Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec. Note: The user will have to install the relevant dependency for each protocol. Otherwise fsspec will fail and specify which dependency is missing. * Append suffix _fsspec to save/load function names * Add a lower bound to the fsspec dependency Skips the 0 major version. * Add missing changes from refactor * Use fsspec for remaining artifacts * Add test case with path requiring fsspec * Avoid writing logs to file unless output_path is local * Document the possibility of using paths supported by fsspec * Fix style and lint * Add missing lint fixes * Add type annotations to new functions * Use Coqpit method for converting config to dict * Fix type annotation in semi-new function * Add return type for load_fsspec * Fix bug where fs not always created * Restore the experiment removal functionality	2021-08-09 18:02:36 +00:00
Eren Gölge	d9e18e009b	Skip phoneme cache pre-compute if the path exists	2021-08-09 18:02:36 +00:00
Eren Gölge	6c131d168e	Bump the version to 0.1.3	2021-07-26 21:32:27 +02:00
Eren Gölge	febd6105b5	Update default vocoder for de-thorsten	2021-07-26 16:08:52 +02:00
Eren Gölge	4b7b88dd3d	Add fullband-melgan DE vocoder	2021-07-26 15:38:30 +02:00
Eren Gölge	764f684e1b	Fix `server.py` for multi-speaker models	2021-07-26 15:38:30 +02:00
Eren Gölge	75b201c6c1	Merge pull request #673 from coqui-ai/fix_stopnet Fix stopnet training for Tacotron models	2021-07-24 12:25:38 +02:00
Eren Gölge	fc0c4600bd	Fix stopnet training	2021-07-24 11:39:54 +02:00
Eren Gölge	30eed347b6	Merge pull request #581 from Edresson/dev Compute speaker embeddings in batch for the LSTM Speaker Encoder and Compute embeddings/ finding chars using config file.	2021-07-23 17:22:51 +02:00
Edresson Casanova	d5adc35fdf	Add docstring to compute_embeddings script	2021-07-21 07:16:10 -03:00
Eren Gölge	05c75aa9d5	Fix linter issues	2021-07-16 13:37:38 +02:00
Eren Gölge	58cc414477	Fix WaveGrad `test_run`	2021-07-16 13:02:25 +02:00
WeberJulian	25832eb97b	Changes for review	2021-07-15 11:38:45 +02:00
Edresson	b1620d1f3f	remove ignore generate eval flag	2021-07-15 03:34:28 -03:00
WeberJulian	c79a82ed07	refix linter	2021-07-13 23:12:18 +02:00
WeberJulian	7d92b30946	Fix tests	2021-07-13 23:00:34 +02:00
WeberJulian	32974dd6a9	Fix test sentences synthesis	2021-07-13 16:07:13 +02:00
Edresson	d906fea08c	lint fix and eval as argparse in extract tts spectrograms	2021-07-13 02:15:31 -03:00
Edresson	2e5baffa9c	Merge fix and eval split as argparse	2021-07-13 01:47:32 -03:00
Eren Gölge	93a74cbb71	Merge pull request #628 from Aloento/patch-2 Change to _get_preprocessor_by_name	2021-07-11 22:17:50 +02:00
Edresson	4eac1c4651	bug fix on train_encoder and unit tests	2021-07-11 12:00:39 -03:00
Aloento	6e3e6d5756	Change to _get_preprocessor_by_name	2021-07-08 09:53:13 +02:00
Eren Gölge	8fbadad68e	Bump up to v0.1.2	2021-07-06 14:44:59 +02:00
eren golge	3c0454490f	Fix #616	2021-07-06 14:44:03 +02:00
Eren Gölge	0c347624e7	Bump up version to v0.1.1	2021-07-04 11:46:36 +02:00
Eren Gölge	a05b234080	Raise an error when multiple GPUs are in use User must define the target GPU by `CUDA_VISIBLE_DEVICES` and use `distribute.py` for multi-gpu training.	2021-07-04 11:25:49 +02:00
Eren Gölge	270c3823eb	Fix #608	2021-07-04 11:19:31 +02:00
Eren Gölge	c25a2184e7	Add docs for `SpeakerManager`	2021-07-03 13:55:27 +02:00
Eren Gölge	f382e4c700	Fix linter warnings	2021-07-03 13:30:24 +02:00
Eren Gölge	9e7824fe35	Fix UnivNet inference code	2021-07-02 10:48:34 +02:00
Eren Gölge	168f97cbe9	Let `Synthesizer` use the speaker manager out of the model	2021-07-02 10:47:55 +02:00
Eren Gölge	196876feb1	Fix `ModelManager` model download	2021-07-02 10:47:05 +02:00
Eren Gölge	9352cb4136	Format Align TTS docstrings	2021-07-02 10:45:58 +02:00
Eren Gölge	95ad72f38f	Fix glow tts initialization	2021-07-02 10:45:37 +02:00
Eren Gölge	40b0b5365e	Let `get_characters` return `num_chars`	2021-07-02 10:45:00 +02:00
Eren Gölge	0fa6a8c9b8	Fix glow tts default parameters	2021-07-02 10:44:23 +02:00
Eren Gölge	a4c658f5ef	Fix for using the `Synthesizer` out of the model	2021-07-02 10:43:38 +02:00
Eren Gölge	db47f4f105	Update `.models.json`	2021-07-02 10:43:00 +02:00
Eren Gölge	2e1a428b83	Update glowtts docstrings and docs	2021-06-30 14:30:55 +02:00
Eren Gölge	5723eb4738	Fix config init in `process_args`	2021-06-29 16:41:08 +02:00
Eren Gölge	4b5421b42f	Remove FAQ link from README.md	2021-06-29 13:20:40 +02:00
Eren Gölge	47b3b10d6d	Bump up to v0.1.0 🚀	2021-06-29 13:07:59 +02:00
Eren Gölge	7ec5c31898	Merge branch 'univnet' into trainer-api	2021-06-29 10:27:12 +02:00
Eren Gölge	51398cd15b	Add docstrings and typing for `audio.py`	2021-06-28 17:03:47 +02:00
Eren Gölge	ae6405bb76	Docstrings for `Trainer`	2021-06-28 17:03:47 +02:00
Eren Gölge	6b265ae8e3	Docstring update	2021-06-28 17:03:47 +02:00
Eren Gölge	ab563ce7cd	Start training by config.json using `register_config`	2021-06-28 17:03:47 +02:00
Eren Gölge	b3c073c99b	Allow running full path scripts with `distribute.py`	2021-06-28 17:03:47 +02:00
Eren Gölge	d42d1c02ea	Use `torch.linalg.qr` for pytorch > `v1.9.0`	2021-06-28 17:03:47 +02:00
Eren Gölge	fbba37e01e	Fix loading the `amp` scaler from a checkpoint 🛠️	2021-06-28 17:03:47 +02:00
Eren Gölge	a7617d8ab6	Add 🐍 python 3.9 to CI	2021-06-28 17:03:47 +02:00
Eren Gölge	9790eddada	Fix wrong argument name 🛠️	2021-06-28 17:03:47 +02:00
Eren Gölge	932ab107ae	Docstring edit in `TTSDataset.py` ✍️	2021-06-28 17:03:47 +02:00
Eren Gölge	cfa5041db7	Fix `eval_log` for `gan.py` 🛠️	2021-06-28 17:03:47 +02:00
Eren Gölge	d700845b10	Move `TorchSTFT` to `utils.audio`	2021-06-28 17:03:47 +02:00
Eren Gölge	5b89cb4fec	Fixup `trainer.py` 🛠️	2021-06-28 17:03:47 +02:00
Eren Gölge	8c74f054f0	Enable support for 🐍 python 3.10 Bump up versions numpy 1.19.5 and TF 2.5.0	2021-06-28 17:03:47 +02:00
Eren Gölge	9455a2b01e	Apply small fixes for API compatibility	2021-06-28 17:03:47 +02:00
Eren Gölge	a5d5bc9063	Print `max_decoder_steps` when model reaches the limit	2021-06-28 17:03:47 +02:00
Eren Gölge	e30f245e06	Update `synthesizer` for speaker and model init	2021-06-28 17:03:47 +02:00
Eren Gölge	15fa31b595	fixup configs	2021-06-28 17:03:47 +02:00
Eren Gölge	f23b228e24	Update `speaker_manager`	2021-06-28 17:03:47 +02:00
Eren Gölge	e53616078a	Fixup `utils` for the trainer	2021-06-28 17:03:47 +02:00
Eren Gölge	106b63d8a9	Update `vocoder` utils	2021-06-28 17:03:47 +02:00
Eren Gölge	45947acb60	Update `TTS.bin` scripts for the new API	2021-06-28 17:03:47 +02:00
Eren Gölge	d7225eedb0	Update `vocoder` datasets and `setup_dataset`	2021-06-28 17:03:20 +02:00
Eren Gölge	d18198dff8	Implement `setup_model` for vocoder models	2021-06-28 17:03:20 +02:00
Eren Gölge	e949e7ad58	Update vocoder models	2021-06-28 17:03:19 +02:00
Eren Gölge	51005cdab4	Update `tts.models.setup_model`	2021-06-28 17:03:19 +02:00
Eren Gölge	7b8c15ac49	Create base 🐸TTS model abstraction for tts models	2021-06-28 17:03:19 +02:00
Eren Gölge	a358f74a52	Update vocoder model configs	2021-06-28 17:03:19 +02:00
Eren Gölge	786170fe7d	Update tts model configs	2021-06-28 17:03:19 +02:00
Eren Gölge	98298ee671	Implement unified IO utils	2021-06-28 17:03:19 +02:00
Eren Gölge	c7aad884cd	Implement unified trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	6d7b5fbcde	`tts` model abstraction with `TTSModel`	2021-06-28 17:03:19 +02:00
Eren Gölge	d4dbd89752	fix calculation of `loader_start_time`	2021-06-28 17:03:19 +02:00
Eren Gölge	c754a0e17d	`TrainerAbstract` and related updates for `TrainerTTS`	2021-06-28 17:03:19 +02:00
Eren Gölge	00c82c516d	rename to	2021-06-28 17:03:19 +02:00
Eren Gölge	166f0aeb9a	merge if branches with the same implementation	2021-06-28 17:03:19 +02:00
Eren Gölge	03494ad642	adjust `distribute.py` for the `train_tts.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	fdfb18d230	downsize melgan test model size	2021-06-28 17:03:19 +02:00
Eren Gölge	25238e0658	fix glow-tts `inference()`	2021-06-28 17:03:19 +02:00
Eren Gölge	419735f440	refactor and fix multi-speaker training in Trainer and Tacotron models	2021-06-28 17:03:19 +02:00
Eren Gölge	269e5a734e	add max_decoder_steps argument to tacotron models	2021-06-28 17:03:19 +02:00
Eren Gölge	b3324bd914	fix speaker_manager init	2021-06-28 17:03:19 +02:00
Eren Gölge	2c38ef8441	use get_speaker_manager in Trainer and save speakers.json file when needed	2021-06-28 17:03:19 +02:00
Eren Gölge	d6b2b6add6	make style and linter fixes	2021-06-28 17:03:19 +02:00
Eren Gölge	802d461389	Compute d_vectors and speaker_ids separately in TTSDataset	2021-06-28 17:03:19 +02:00
Eren Gölge	db6a97d1a2	rename external speaker embedding arguments as `d_vectors`	2021-06-28 17:03:19 +02:00
Eren Gölge	9042ae9195	use `to_cuda()` for moving data in `format_batch()`	2021-06-28 17:03:19 +02:00
Eren Gölge	f82f1970b8	change `to(device)` to `type_as` in models	2021-06-28 17:03:19 +02:00
Eren Gölge	9c94b0c5c0	init `durations = None`	2021-06-28 17:03:19 +02:00
Eren Gölge	1fa15c195a	docstring fix	2021-06-28 17:03:19 +02:00
Eren Gölge	1c8a3d7c86	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	8cdd423234	styling formatting.py	2021-06-28 17:03:19 +02:00
Eren Gölge	30211512a4	fix type annotations	2021-06-28 17:03:19 +02:00
Eren Gölge	b22b7620c3	update glow-tts output shapes to match [B, T, C]	2021-06-28 17:03:19 +02:00
Eren Gölge	8381379938	formating `cond_input` with a function in Tacotron models	2021-06-28 17:03:19 +02:00
Eren Gölge	ef4ea9e527	update imports for `formatters`	2021-06-28 17:03:19 +02:00
Eren Gölge	6c495c6a6e	fix glow-tts inference and forward functions for handling `cond_input` and refactor its test	2021-06-28 17:03:19 +02:00
Eren Gölge	f840268181	refactor `SpeakerManager`	2021-06-28 17:03:19 +02:00
Eren Gölge	421194880d	linter fixes	2021-06-28 17:03:19 +02:00
Eren Gölge	8e52a69230	delete separate tts training scripts and pre-commit configuration	2021-06-28 17:03:19 +02:00
Eren Gölge	d96ebcd6d3	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	b643e8b37c	`logging/__init__.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	0cee5042a9	fix logger imports	2021-06-28 17:03:19 +02:00
Eren Gölge	72dceca52c	import missings	2021-06-28 17:03:19 +02:00
Eren Gölge	0eec238429	remove redundant imports	2021-06-28 17:03:19 +02:00
Eren Gölge	b500338faa	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	469d2e620a	update extract_tts_spectrogram for `cond_input` API of the models	2021-06-28 17:03:19 +02:00
Eren Gölge	5ab28fa618	update `extract_tts_spec...` using `SpeakerManager`	2021-06-28 17:03:19 +02:00
Eren Gölge	c392fa4288	update `extract_tts_spectrograms` for the new model API	2021-06-28 17:03:19 +02:00
Eren Gölge	8f47f95998	correct import of `load_meta_data` remove redundant import	2021-06-28 17:03:19 +02:00
Eren Gölge	c680a07a20	fix `Synthesized` for the new `synthesis()`	2021-06-28 17:03:19 +02:00
Eren Gölge	73bf9673ed	revert logging.info to print statements for trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	d25f017b42	update `setup_model.py` imports	2021-06-28 17:03:19 +02:00
Eren Gölge	bb355b7441	update align_tts.py model for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	9203b863d9	update align_tts_loss for trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	fc9a0fb8ce	update aling_tts_config for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	e298b8e364	update trainer.py for better logging handling, restoring models and rename init_ functions with get_	2021-06-28 17:03:19 +02:00
Eren Gölge	b8a4af4010	update `synthesis.py` for being more generic	2021-06-28 17:03:19 +02:00
Eren Gölge	c70d0c9dae	update `speedy_speech.py` model for trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	06ee57d816	update `speedy_speecy_config.py` for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	4e910993f1	update tacotron model to return `model_outputs`	2021-06-28 17:03:19 +02:00
Eren Gölge	bb4deee64c	update glow-tts for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	9134c7dfb6	update `sequence_mask` import globally	2021-06-28 17:03:19 +02:00
Eren Gölge	b2218e882a	update `glow_tts_config.py` for setting the optimizer and the scheduler	2021-06-28 17:03:19 +02:00
Eren Gölge	891631ab47	typing annotation for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	5f07315722	add trainer and train_tts	2021-06-28 17:03:19 +02:00
Eren Gölge	34f8a74e4d	remove `truncated` from synthesizer	2021-06-28 17:03:19 +02:00
Eren Gölge	178eccbc16	update console logger	2021-06-28 17:03:19 +02:00
Eren Gölge	f4f83b6379	update `synthesis.py` for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	130781dab6	remove `tts.generic_utils` as all the functions are moved to other files	2021-06-28 17:03:19 +02:00
Eren Gölge	535a458f40	update Tacotron models for the trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	bdbfc95618	add `gradual_training` argument to tacotron.py	2021-06-28 17:03:19 +02:00
Eren Gölge	5a2e75f0ee	import missings for tacotron.py	2021-06-28 17:03:19 +02:00
Eren Gölge	da7d10e53c	mode `setup_model()` to `models/__init__.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	ca302db7b0	add sequence_mask to `utils.data`	2021-06-28 17:03:19 +02:00
Eren Gölge	844abb3b1d	`setup_loss()` in `layer/__init__.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	a20a1c7d06	rename preprocess.py -> formatters.py	2021-06-28 17:03:19 +02:00
Eren Gölge	b9bccbb243	move load_meta_data and related functions to `datasets/__init__.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	d09385808a	set test_sentences in config	2021-06-28 17:03:19 +02:00
Eren Gölge	8def3c87af	trainer-API updates	2021-06-28 17:03:19 +02:00
Eren Gölge	42554cc711	rename MyDataset -> TTSDataset	2021-06-28 17:03:19 +02:00
Edresson	1c4e806f54	use speaker manager on compute embeddings script	2021-06-27 03:35:34 -03:00
Edresson Casanova	eb84bb2bc8	Merge branch 'dev' into dev	2021-06-26 15:32:19 -03:00
Eren Gölge	987cf1178b	Bump up to v0.0.16	2021-06-25 14:44:56 +02:00
Michael Hansen	3f172b84d8	Fix linting issues	2021-06-25 14:41:31 +02:00
Michael Hansen	4d8426fa0a	Use eSpeak IPA lexicons by default for phoneme models	2021-06-25 14:41:05 +02:00
Michael Hansen	618b509204	Use combined characters available in TTS phonemes (like ç)	2021-06-25 14:41:05 +02:00
Michael Hansen	da6f6a4a01	Update docstring for clean_gruut_phonemes	2021-06-25 14:41:05 +02:00
Michael Hansen	47191f3ecc	Add tests for gruut phonemization	2021-06-25 14:41:05 +02:00
Michael Hansen	67869e77f9	Use gruut for phonemization	2021-06-25 14:41:05 +02:00
Eren Gölge	788992093d	Add UnivNet vocoder 🚀	2021-06-23 13:51:04 +02:00
Eren Gölge	64fd59204c	Use `torch.linalg.qr` for pytorch > `v1.9.0`	2021-06-23 13:49:42 +02:00
Eren Gölge	aba840b4e6	Fix loading the `amp` scaler from a checkpoint 🛠️	2021-06-23 13:49:42 +02:00
Eren Gölge	18e5393f16	Add 🐍 python 3.9 to CI	2021-06-23 13:49:36 +02:00
Eren Gölge	0ff2d2336a	Fix wrong argument name 🛠️	2021-06-22 16:21:11 +02:00
Eren Gölge	61c3cb871f	Docstring edit in `TTSDataset.py` ✍️	2021-06-22 16:21:11 +02:00
Eren Gölge	6f739ea07a	Fix `eval_log` for `gan.py` 🛠️	2021-06-22 16:21:11 +02:00
Eren Gölge	ebb91c0fbb	Move `TorchSTFT` to `utils.audio`	2021-06-22 16:21:11 +02:00
Eren Gölge	01c4b22a2f	Fixup `trainer.py` 🛠️	2021-06-22 16:21:11 +02:00
Eren Gölge	7de2756fc4	Enable support for 🐍 python 3.10 Bump up versions numpy 1.19.5 and TF 2.5.0	2021-06-22 16:21:11 +02:00
Eren Gölge	220e184f66	Apply small fixes for API compatibility	2021-06-22 16:21:11 +02:00
Eren Gölge	77d57dd301	Print `max_decoder_steps` when model reaches the limit	2021-06-22 16:21:11 +02:00
Eren Gölge	7dc2177df4	Update `synthesizer` for speaker and model init	2021-06-22 16:21:11 +02:00
Eren Gölge	c3a0bc702e	fixup configs	2021-06-22 16:21:11 +02:00
Eren Gölge	0e01c2594f	Update `speaker_manager`	2021-06-22 16:21:11 +02:00
Eren Gölge	8182f5168f	Fixup `utils` for the trainer	2021-06-22 16:21:11 +02:00
Eren Gölge	b4bb567e04	Update `vocoder` utils	2021-06-22 16:21:11 +02:00
Eren Gölge	f3ff5b1971	Update `TTS.bin` scripts for the new API	2021-06-22 16:21:11 +02:00
Eren Gölge	aed919cf1c	Update `vocoder` datasets and `setup_dataset`	2021-06-22 16:21:11 +02:00
Eren Gölge	59abf490a1	Implement `setup_model` for vocoder models	2021-06-22 16:21:11 +02:00
Eren Gölge	420820caf4	Update vocoder models	2021-06-22 16:21:11 +02:00
Eren Gölge	d10f9c5676	Update `tts.models.setup_model`	2021-06-22 16:21:11 +02:00
Eren Gölge	cae702980f	Create base 🐸TTS model abstraction for tts models	2021-06-22 16:21:11 +02:00
Eren Gölge	70d968b169	Update vocoder model configs	2021-06-22 16:21:11 +02:00
Eren Gölge	f8a3460818	Update tts model configs	2021-06-22 16:21:11 +02:00
Eren Gölge	acd96a4940	Implement unified IO utils	2021-06-22 16:21:10 +02:00
Eren Gölge	6b907554f8	Implement unified trainer	2021-06-22 16:21:10 +02:00
Eren Gölge	20c4a8c8e1	`tts` model abstraction with `TTSModel`	2021-06-22 16:21:10 +02:00
Eren Gölge	b934665fc0	fix calculation of `loader_start_time`	2021-06-22 16:21:10 +02:00
Eren Gölge	64f0f57757	`TrainerAbstract` and related updates for `TrainerTTS`	2021-06-22 16:21:10 +02:00
Eren Gölge	f077a356e0	rename to	2021-06-22 16:21:10 +02:00
Eren Gölge	4575b70826	merge if branches with the same implementation	2021-06-22 16:21:10 +02:00
Eren Gölge	59be1b9af1	adjust `distribute.py` for the `train_tts.py`	2021-06-22 16:21:10 +02:00
Eren Gölge	614738cc85	downsize melgan test model size	2021-06-22 13:12:52 +02:00
Eren Gölge	4f29725eb6	fix glow-tts `inference()`	2021-06-22 13:12:52 +02:00
Eren Gölge	a87c886497	refactor and fix multi-speaker training in Trainer and Tacotron models	2021-06-22 13:12:52 +02:00
Eren Gölge	0206bb847b	add max_decoder_steps argument to tacotron models	2021-06-22 13:12:52 +02:00
Eren Gölge	cbb52b3d83	fix speaker_manager init	2021-06-22 13:12:52 +02:00
Eren Gölge	d2fd6a34a1	use get_speaker_manager in Trainer and save speakers.json file when needed	2021-06-22 13:12:52 +02:00
Eren Gölge	147550c65f	make style and linter fixes	2021-06-22 13:12:52 +02:00
Eren Gölge	a605dd3d08	Compute d_vectors and speaker_ids separately in TTSDataset	2021-06-22 13:12:52 +02:00
Eren Gölge	f00ef90ce6	rename external speaker embedding arguments as `d_vectors`	2021-06-22 13:12:52 +02:00
Eren Gölge	e7b7268c43	use `to_cuda()` for moving data in `format_batch()`	2021-06-22 13:12:52 +02:00
Eren Gölge	26a3312f0d	change `to(device)` to `type_as` in models	2021-06-22 13:12:52 +02:00
Eren Gölge	c09622459e	init `durations = None`	2021-06-22 13:12:52 +02:00
Eren Gölge	2e31659dd9	docstring fix	2021-06-22 13:12:52 +02:00
Eren Gölge	7a0750a4f5	make style	2021-06-22 13:12:52 +02:00
Eren Gölge	534401377d	styling formatting.py	2021-06-22 13:12:52 +02:00
Eren Gölge	e229f5c081	fix type annotations	2021-06-22 13:12:52 +02:00
Eren Gölge	506189bdee	update glow-tts output shapes to match [B, T, C]	2021-06-22 13:12:52 +02:00
Eren Gölge	f568833d28	formating `cond_input` with a function in Tacotron models	2021-06-22 13:12:52 +02:00
Eren Gölge	254707c610	update imports for `formatters`	2021-06-22 13:12:52 +02:00
Eren Gölge	223502d827	fix glow-tts inference and forward functions for handling `cond_input` and refactor its test	2021-06-22 13:12:52 +02:00
Eren Gölge	d4b1acfa81	refactor `SpeakerManager`	2021-06-22 13:12:52 +02:00
Eren Gölge	26e7c0960c	linter fixes	2021-06-22 13:12:52 +02:00
Eren Gölge	79f7c5da1e	delete separate tts training scripts and pre-commit configuration	2021-06-22 13:12:52 +02:00
Eren Gölge	ca787be193	make style	2021-06-22 13:12:52 +02:00
Eren Gölge	d376647ca0	`logging/__init__.py`	2021-06-22 13:12:52 +02:00
Eren Gölge	bb58a0588e	fix logger imports	2021-06-22 13:12:52 +02:00
Eren Gölge	9bbc924377	import missings	2021-06-22 13:12:52 +02:00
Eren Gölge	b4d4ce0d7e	remove redundant imports	2021-06-22 13:12:52 +02:00
Eren Gölge	aefa71155c	make style	2021-06-22 13:12:52 +02:00
Eren Gölge	88d8a94a10	update extract_tts_spectrogram for `cond_input` API of the models	2021-06-22 13:12:52 +02:00
Eren Gölge	667bb708b6	update `extract_tts_spec...` using `SpeakerManager`	2021-06-22 13:12:52 +02:00
Eren Gölge	830306d2fd	update `extract_tts_spectrograms` for the new model API	2021-06-22 13:12:52 +02:00
Eren Gölge	c673eb8ef8	correct import of `load_meta_data` remove redundant import	2021-06-22 13:12:52 +02:00
Eren Gölge	f0a419546b	fix `Synthesized` for the new `synthesis()`	2021-06-22 13:12:52 +02:00
Eren Gölge	c7ff175592	revert logging.info to print statements for trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	fd6afe5ae5	update `setup_model.py` imports	2021-06-22 13:12:52 +02:00
Eren Gölge	c82d91051d	update align_tts.py model for the trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	4f66e816d1	update align_tts_loss for trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	8213ad8b5f	update aling_tts_config for the trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	8dfd4c91ff	update trainer.py for better logging handling, restoring models and rename init_ functions with get_	2021-06-22 13:12:52 +02:00
Eren Gölge	fb9289d365	update `synthesis.py` for being more generic	2021-06-22 13:12:52 +02:00
Eren Gölge	f121b0ff5d	update `speedy_speech.py` model for trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	843b3ba960	update `speedy_speecy_config.py` for the trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	c9790bee2c	update tacotron model to return `model_outputs`	2021-06-22 13:12:52 +02:00
Eren Gölge	f09ec7e3a7	update glow-tts for the trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	3346a6d9dc	update `sequence_mask` import globally	2021-06-22 13:12:52 +02:00
Eren Gölge	9765b1aa6b	update `glow_tts_config.py` for setting the optimizer and the scheduler	2021-06-22 13:12:52 +02:00
Eren Gölge	6bf6543df8	typing annotation for the trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	57cdddef16	add trainer and train_tts	2021-06-22 13:12:52 +02:00
Eren Gölge	d769af9e3b	remove `truncated` from synthesizer	2021-06-22 13:12:52 +02:00
Eren Gölge	570633ab80	update console logger	2021-06-22 13:12:52 +02:00
Eren Gölge	2ac6b824ca	update `synthesis.py` for the trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	c9e5527070	remove `tts.generic_utils` as all the functions are moved to other files	2021-06-22 13:12:52 +02:00
Eren Gölge	2ab723cd10	update Tacotron models for the trainer	2021-06-22 13:12:52 +02:00
Eren Gölge	d6b6a15b5c	add `gradual_training` argument to tacotron.py	2021-06-22 13:12:52 +02:00
Eren Gölge	118a7f2b43	import missings for tacotron.py	2021-06-22 13:12:52 +02:00
Eren Gölge	c98149d488	mode `setup_model()` to `models/__init__.py`	2021-06-22 13:12:52 +02:00
Eren Gölge	86edf6ab0e	add sequence_mask to `utils.data`	2021-06-22 13:12:52 +02:00
Eren Gölge	c61486b1e3	`setup_loss()` in `layer/__init__.py`	2021-06-22 13:12:52 +02:00
Eren Gölge	f07209d2e0	rename preprocess.py -> formatters.py	2021-06-22 13:12:52 +02:00
Eren Gölge	facb782851	move load_meta_data and related functions to `datasets/__init__.py`	2021-06-22 13:12:52 +02:00
Eren Gölge	b9d4355d20	set test_sentences in config	2021-06-22 13:12:52 +02:00
Eren Gölge	7bdd0eb72f	trainer-API updates	2021-06-22 13:12:52 +02:00
Eren Gölge	0f284841d1	rename MyDataset -> TTSDataset	2021-06-22 13:12:52 +02:00
Edresson	99d40e98d9	fix Lint checks	2021-06-18 14:59:01 -03:00
Edresson	28bec238ca	fix Lint checks	2021-06-18 14:33:50 -03:00
Edresson	83644056e3	fix Lint checks	2021-06-18 14:32:28 -03:00
Edresson Casanova	e78e3cd81e	Merge branch 'dev' into dev	2021-06-18 14:10:03 -03:00
Edresson	b74b510d3c	Compute embeddings and find characters using config file	2021-06-18 14:04:49 -03:00
Adam Froghyar	b0aa189348	Forcing do_trim_silence to False in the extract TTS script	2021-06-14 10:44:00 +02:00
Eren Gölge	d245b5d48f	bump up v0.0.15.1	2021-06-08 09:21:01 +02:00
Edresson	14b209c7e9	Create a batch for more fast inference on LSTM Speaker Encoder	2021-06-05 03:12:17 -03:00
Eren Gölge	b8b79a5e5a	fix `use_cuda` bug in `server.py`	2021-06-04 14:02:53 +02:00
Eren Gölge	203ab855c3	bump up to v0.0.15	2021-06-04 13:52:54 +02:00
Eren Gölge	ba9bcf7c6b	auto upload to pypi on release	2021-06-04 12:20:06 +02:00
Eren Gölge	e66753bd0d	fixup! new japanese model placeholder in `.models.json`	2021-06-03 18:04:28 +02:00
Eren Gölge	bd434636a9	new japanese model placeholder in `.models.json`	2021-06-02 15:54:37 +02:00
Eren Gölge	401fbd8978	bump up to v0.0.15	2021-06-02 11:48:17 +02:00
Eren Gölge	49c5e5d820	maket style japanese PR	2021-06-02 11:44:46 +02:00
Eren Gölge	73b4083c6c	Merge pull request #502 from kaiidams/kaiidams/kokoro Japanese Tacotron 2 model	2021-06-02 10:20:08 +02:00
Katsuya Iida	6d8310d2a9	Set the version to the same with the dev branch.	2021-06-02 07:48:28 +09:00
Alexander Korolev	c1eb9bdcca	fix speaker dim inference	2021-06-01 15:15:26 +02:00
Katsuya Iida	1cc18d1972	Move unittest of Japanese phonemizer.	2021-06-01 18:51:34 +09:00
Alexander Korolev	5b89ef2c6e	fix speaker-embeddings dimension during inference	2021-06-01 11:06:35 +02:00
Eren Gölge	d0ab0382fc	linter fixes	2021-06-01 09:15:32 +02:00
Eren Gölge	bec85ac58d	make style	2021-05-31 16:37:15 +02:00
Eren Gölge	d9f1268f99	init tb_logger None for rank > 0 processes	2021-05-31 15:47:07 +02:00
Eren Gölge	301c516abd	Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev	2021-05-31 15:46:25 +02:00
Edresson	7448177b72	use SpeakerManager on compute embeddings script	2021-05-29 21:11:53 -03:00
Katsuya Iida	c4a5a73f18	update Kokoro config	2021-05-29 19:17:27 +09:00
Katsuya Iida	3a9ac2de4a	Merge remote-tracking branch 'coqui-ai/main' into kaiidams/kokoro	2021-05-29 09:39:23 +09:00
Katsuya Iida	d0c9c1ca5c	Move TTS/tts/utils/japanese	2021-05-29 09:21:47 +09:00
Edresson	099142d4dd	bug fix	2021-05-27 21:50:56 -03:00
Edresson	208bb0f0ee	add batched speaker encoder inference	2021-05-27 20:01:00 -03:00
Edresson	825734a3a9	remove unused embeddings export	2021-05-27 19:10:24 -03:00
Katsuya Iida	c4987e9d4e	Move import at the head of the file.	2021-05-28 00:22:57 +09:00
Eren Gölge	925c08cf95	replace unidecode with anyascii	2021-05-27 14:02:44 +02:00
Eren Gölge	e08c58db3b	bump up version to v0.14.1	2021-05-27 13:11:01 +02:00
Eren Gölge	c6f22aaa67	fix #509	2021-05-27 13:09:15 +02:00
Edresson	1496f271dc	update Compute embeddings script	2021-05-27 00:45:18 -03:00
Edresson	bc5307caa0	add unit tests for SoftmaxAngleProtoLoss and ResnetSpeakerEncoder and bugfix	2021-05-26 20:35:58 -03:00
Edresson	c90037c2e9	solve merge problems	2021-05-26 16:01:30 -03:00
Katsuya Iida	f921a05bdb	Fixed lint errors	2021-05-26 19:02:16 +09:00
Edresson Casanova	f89cb6aec2	Merge branch 'dev' into dev	2021-05-25 17:30:25 -03:00
Edresson	d570c2d790	pylint fix and data loader bug fix	2021-05-26 01:11:37 -03:00
Katsuya Iida	0536aa6d0f	Japanese Tacotron 2 model	2021-05-22 17:12:19 +09:00
Eren Gölge	5482a0f62d	type def for gradual_training	2021-05-19 14:03:26 +02:00
Eren Gölge	df6a98d0c3	type def for gradual_training	2021-05-19 14:00:44 +02:00
Eren Gölge	16576d6408	bump version number	2021-05-19 12:35:10 +02:00
Eren Gölge	8a7c40736c	set use_phonemes false	2021-05-19 01:27:26 +02:00
Eren Gölge	ccfaa6b1d5	add `needs_phonemizer` field to models.json. If set true these models are only compatible with v0.0.13 or below.	2021-05-18 17:57:28 +02:00
Eren Gölge	a14fcf2a13	remove text_processing test	2021-05-18 17:57:28 +02:00
Eren Gölge	d7fae3f515	remove all espeaker and phonemizer deps	2021-05-18 17:57:28 +02:00
Eren Gölge	ced05e812a	move chinese phonemizer	2021-05-18 17:57:28 +02:00
Eren Gölge	218af1d9a2	change `list` to `List` in config	2021-05-18 17:30:27 +02:00
Eren Gölge	4df31f7fbd	unused_speakers argument for ignoring speaker ids in multi-speaker training	2021-05-18 14:50:03 +02:00
Eren Gölge	c2c7dff805	use relaxted coqpit parser	2021-05-18 14:49:47 +02:00
Edresson	856ea19758	bug fix in dataloader and update inference	2021-05-18 03:43:16 -03:00
Eren Gölge	d1b469935d	tacotron DDC LJSpeech recipe	2021-05-17 11:42:14 +02:00
Eren Gölge	34a42d379f	update tacotron_config.py for checking `r` and the docstring	2021-05-17 11:35:30 +02:00
Eren Gölge	12722501bb	styling	2021-05-15 23:48:31 +02:00
Eren Gölge	8b1014d188	add docstrings with default value fixes	2021-05-15 23:45:10 +02:00
Eren Gölge	da49089a72	update melgan training test batch size	2021-05-12 10:12:11 +02:00
Edresson	3433c2f348	add compute embedding for the new speaker encoder	2021-05-12 03:06:46 -03:00
Eren Gölge	0213e1cbf4	update configs for tts models to match the field typed with the expected values	2021-05-12 00:57:38 +02:00
Eren Gölge	715b0a65a0	update main.yml for python x64 fix test	2021-05-12 00:57:29 +02:00
Edresson	3fcc748b2e	implement the Speaker Encoder H/ASP	2021-05-11 16:27:05 -03:00
Eren Gölge	843d1b3d98	linter fixes	2021-05-11 11:30:00 +02:00
Eren Gölge	19fb1d743d	style update	2021-05-11 11:30:00 +02:00
Eren Gölge	6e980b49c4	fix synthesizer.py for Coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	db14dcd95a	remove old load_config	2021-05-11 11:29:18 +02:00
Eren Gölge	a21ac883dd	add get_cuda()	2021-05-11 11:29:18 +02:00
Eren Gölge	21dd4d7960	fix load_config imports for Coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	c57f0b46bb	reintro use_gst for backwars compat	2021-05-11 11:29:18 +02:00
Eren Gölge	18e76a2309	fix speaker encoder model initialization	2021-05-11 11:29:18 +02:00
Eren Gölge	10de40bba1	make num_workers mandatory config field	2021-05-11 11:29:18 +02:00
Eren Gölge	df1ddd3539	allow read_json_with_comments for backward compat	2021-05-11 11:29:18 +02:00
Eren Gölge	9f7599e3c3	fix train_encoder for coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	f8e52965dd	add speaker encoder coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	ce2bba543e	remove extra from utils and move funcs to io.py	2021-05-11 11:29:18 +02:00
Eren Gölge	812dbc2b06	rm config.json	2021-05-11 11:29:18 +02:00
Eren Gölge	3fde2001b1	train_encoder refactoring for coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	9ee70af9bb	code styling	2021-05-11 11:29:18 +02:00
Eren Gölge	10db2baa06	global shared Coqpit configs	2021-05-11 11:29:18 +02:00
Eren Gölge	3dec62b183	add Coqpits for the vocoder models	2021-05-11 11:29:18 +02:00
Eren Gölge	6f4eed94f5	remove *.json vocoder configs	2021-05-11 11:29:18 +02:00
Eren Gölge	78b3825d0b	update train scripts for coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	757e90b1cc	load_config function to initialize the right Coqpit for the given model	2021-05-11 11:29:18 +02:00
Eren Gölge	e6f45b9eb7	update train_vocoder_gan.py for coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	bcebd69d09	remove bash tts training tests	2021-05-11 11:29:17 +02:00
Eren Gölge	7663bc63c1	add Coqpit configs for the TTS models	2021-05-11 11:29:17 +02:00
Eren Gölge	7227e8f1d2	update train_align_tts.py for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	51a7e06945	glow_tts_config.py and train test on python	2021-05-11 11:29:17 +02:00
Eren Gölge	720fe13056	update glow_tts modules and training script for coqpit use	2021-05-11 11:29:17 +02:00
Eren Gölge	816e7ee698	remove default configs.json as replacing with Coqpit configs	2021-05-11 11:29:17 +02:00
Eren Gölge	35341d5482	move bash script based tests to python with coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	647163397d	coqpit refactoring	2021-05-11 11:29:17 +02:00
Eren Gölge	eaa130e813	fix tacotron for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	65d7ad4250	refactor train_speedy_speech.py for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	4a58fdfd59	comment out check-arguments before copying fields to the configs	2021-05-11 11:29:17 +02:00
Eren Gölge	05d9543ed8	init GST module using gst config in Tacotron models	2021-05-11 11:29:17 +02:00
Eren Gölge	93a00373f6	move split_dataset	2021-05-11 11:29:17 +02:00
Eren Gölge	9c18e40f64	black formatting	2021-05-11 11:29:17 +02:00
Eren Gölge	c34c8137d7	update compute_statistics for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	79d7215142	config refactor #5 WIP	2021-05-11 11:29:17 +02:00
Eren Gölge	dc50f5f0b0	config refactor #4 WIP	2021-05-11 11:28:35 +02:00
Eren Gölge	97bd5f9734	[ci skip] config update #3 WIP	2021-05-11 11:28:35 +02:00
Eren Gölge	a21c0b5585	config update 2 WIP	2021-05-11 11:28:35 +02:00
Eren Gölge	e092ae40dc	config update WIP	2021-05-11 11:28:35 +02:00
Eren Gölge	06f80a4806	update check argument	2021-05-11 11:28:35 +02:00
Eren Gölge	bf7ddfa542	Merge pull request #481 from chmodsss/main Accessing __version__ command	2021-05-11 10:20:48 +02:00
Edresson	85ccad7e0a	add Audio data augamentation Addtive and RIR	2021-05-11 00:59:57 -03:00
Edresson	77d85c6cc5	add softmaxproto loss and bug fix in data loader	2021-05-10 17:08:38 -03:00
chmodsss	607d5cf377	[#480 ] Adding version variable	2021-05-10 19:46:34 +02:00
Adam Froghyar	7ddc885f37	deleted a line the broke GravesAttention	2021-05-10 15:42:59 +02:00
Edresson	78bad25f2b	update voxceleb download link	2021-05-07 23:45:15 -03:00
Eren Gölge	f7582107da	Merge pull request #453 from Edresson/dev Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS.	2021-05-06 17:53:28 +02:00
Edresson	501c8e0302	remove unused vars on extract tts spectrograms script	2021-05-04 19:04:13 -03:00
Eren Gölge	0325c58862	Merge pull request #468 from shaun95/patch-1 Update losses.py	2021-05-03 14:45:24 +02:00
Eren Gölge	8cb27267a4	formatting	2021-05-03 14:26:35 +02:00
Eren Gölge	87d674a038	bumpup librosa version to 0.8.0	2021-05-03 14:25:09 +02:00
shaun	7d0ec62bf1	Update losses.py The block of code for use_l1_spec_loss is repeated which doubles the amount of L1 loss when enabled. The weight for L1 loss in hifigan_ljspeech configutation will likely need to be doubled to compensate (l1_spec_loss_weight)	2021-05-02 14:14:24 +02:00
Edresson	3ecd556bbe	add unit test for extract tts spectrograms script	2021-05-01 13:41:56 -03:00
Edresson	446b1da936	create inference function	2021-04-29 18:18:37 -03:00
Eren Gölge	f02f0338c2	fix .models.json and add testing to check released models availability	2021-04-29 09:32:36 +02:00
Eren Gölge	fd95e9b8a4	[ci skip] Add sam models	2021-04-28 21:57:31 +02:00
Agrin Hilmkil	351d0ed6ae	Remove unnecessary fsspec usage	2021-04-28 11:21:08 +02:00
Agrin Hilmkil	167f86417e	Move dev, tf, notebook dependencies to extras	2021-04-28 11:20:06 +02:00
Eren Gölge	1235e54738	test for synthesize.py	2021-04-27 14:17:38 +02:00
Eren Gölge	4719414f2e	remove imports	2021-04-27 11:25:17 +02:00
Eren Gölge	add97cddc1	move function and remove import	2021-04-27 11:22:56 +02:00
Eren Gölge	734e6a515c	bug fix	2021-04-27 10:27:45 +02:00
Eren Gölge	6bdd81667e	place holders for sc-glow and hifigan models	2021-04-26 19:53:12 +02:00
Eren Gölge	2f0716073e	enable multi-speaker CoquiTTS models for synthesize.py	2021-04-26 19:36:53 +02:00
Eren Gölge	b531fa699c	remove conflicy noise	2021-04-26 15:27:52 +02:00
Eren Gölge	f37b488876	Merge branch 'speaker-manager' of https://github.com/coqui-ai/TTS into speaker-manager	2021-04-26 15:25:25 +02:00
Eren Gölge	b82daa5e86	style and linter fixes	2021-04-26 15:22:24 +02:00
Edresson	20e42a3381	add save audio option	2021-04-23 15:00:00 -03:00
Edresson	8228091f92	add script for extraction of tts spectrograms	2021-04-23 14:17:46 -03:00
Eren Gölge	4cf211348d	styling and linting	2021-04-23 18:04:37 +02:00
Eren Gölge	7eb0c60d2e	let synthesizer to pass speaker encoder file paths to speaker manager	2021-04-23 18:04:37 +02:00
Eren Gölge	f69195739e	let speaker manager compute mean x_vector from multiple wav files	2021-04-23 18:04:37 +02:00
Eren Gölge	179722e3a7	new arguments to synthesize.py for loading speaker encoder and speaker wavs	2021-04-23 18:04:37 +02:00
Eren Gölge	dfa415a8b8	small refactor in server.py	2021-04-23 18:04:37 +02:00
Eren Gölge	c80d21f311	load speaker_encoder_ap and compute x_vector directly from the input file in speaker manager	2021-04-23 18:04:37 +02:00
Eren Gölge	ad047c8195	html formatting, enable multi-speaker model on the server with a dropdown menu to select the speaker	2021-04-23 18:04:37 +02:00
Eren Gölge	f9f3d04d14	remove moved function	2021-04-23 18:04:37 +02:00
Eren Gölge	10c988ac8c	update server.py	2021-04-23 18:04:37 +02:00
Eren Gölge	6d0f5e0459	use SpeakerManager in Synthesizer	2021-04-23 18:04:37 +02:00
Eren Gölge	e97126314c	add ```unique``` argument to make_symbols to fix the incompat. issue of the SC-Glow models	2021-04-23 18:04:37 +02:00
Eren Gölge	d08888e603	formating speakers.py	2021-04-23 18:04:37 +02:00
Eren Gölge	df422223a3	initial SpeakerManager implementation	2021-04-23 18:04:37 +02:00
Eren Gölge	7a7aeb35f5	fix the glow-tts in setup_model	2021-04-23 18:04:37 +02:00
Eren Gölge	d42748082a	update argument name external_speaker_embedding_dim -> speaker_embedding_dim add inference_noise_scale argument to glow-tts	2021-04-23 18:04:37 +02:00
Eren Gölge	2da81f5bb6	add load_chekpoint to speaker encoder	2021-04-23 18:04:37 +02:00
Eren Gölge	1229ccbf07	update argument name in server.py	2021-04-23 18:04:37 +02:00
Eren Gölge	af2d36faeb	update synthesize.py for multi-speaker setting	2021-04-23 18:04:37 +02:00
Eren Gölge	99dc07a7dd	add ```unique``` param to keep scglow models compatible (they are duplicate symbols ins the character set)	2021-04-23 18:04:37 +02:00
Eren Gölge	c955a12428	set the default layer size compatible with scglow	2021-04-23 18:04:37 +02:00
Eren Gölge	3ace2440fa	fix a mistake from rebase	2021-04-23 18:04:37 +02:00
Eren Gölge	aadb2106ec	code styling	2021-04-23 18:04:37 +02:00
Eren Gölge	af7baa3387	refactoring to allow defining the speaker file externally	2021-04-23 18:04:37 +02:00
kirianguiller	7dccbfdcd5	handle multi speaker and gst in Synthetizer class	2021-04-23 18:04:37 +02:00
Edresson	d2b6326b8b	change optimizer initialization for compatibility with Hifi-GAN official implementation	2021-04-23 07:54:39 -03:00
WeberJulian	4205284f92	Change name of the functions	2021-04-23 10:09:55 +02:00
WeberJulian	a26498181b	Change back the default value	2021-04-22 16:10:17 +02:00
Julian Weber	355e1f47ab	fix dumb mistake	2021-04-22 15:50:29 +02:00
Julian Weber	c125b71f36	fix windows support	2021-04-22 15:14:24 +02:00
Jörg Thalheim	f5fd7f78d4	server: also listen to ipv6 The [::] address will listen to both ipv4/ipv6 addresses.	2021-04-22 12:38:55 +02:00
Eren Gölge	ef37633cb3	[ci skip] use prenet_dropout by default with Tacotron models	2021-04-22 12:38:55 +02:00
Eren Gölge	e1d960da9e	use SpeakerManager in Synthesizer	2021-04-21 13:13:27 +02:00
Eren Gölge	04b6881b66	add ```unique``` argument to make_symbols to fix the incompat. issue of the SC-Glow models	2021-04-21 13:12:35 +02:00
Eren Gölge	790946faec	formating speakers.py	2021-04-21 13:12:11 +02:00
Eren Gölge	ab313814de	initial SpeakerManager implementation	2021-04-21 13:11:46 +02:00
Eren Gölge	09890c7421	fix the glow-tts in setup_model	2021-04-21 13:10:40 +02:00
Eren Gölge	8764d02eb2	update argument name external_speaker_embedding_dim -> speaker_embedding_dim add inference_noise_scale argument to glow-tts	2021-04-21 13:09:44 +02:00
Eren Gölge	8b40720977	add load_chekpoint to speaker encoder	2021-04-21 13:09:04 +02:00
Eren Gölge	37cad38c27	update argument name in server.py	2021-04-21 13:08:45 +02:00
Eren Gölge	9bccee9da8	update synthesize.py for multi-speaker setting	2021-04-21 13:08:25 +02:00
Eren Gölge	d2fa8add1f	add ```unique``` param to keep scglow models compatible (they are duplicate symbols ins the character set)	2021-04-16 19:40:13 +02:00
Eren Gölge	d9612a4351	set the default layer size compatible with scglow	2021-04-16 19:40:13 +02:00
Eren Gölge	1038fd420d	fix a mistake from rebase	2021-04-16 19:39:47 +02:00
Eren Gölge	47e356cb48	code styling	2021-04-16 16:01:40 +02:00
Eren Gölge	25328aad00	refactoring to allow defining the speaker file externally	2021-04-16 15:59:57 +02:00
kirianguiller	48ae52a9a3	handle multi speaker and gst in Synthetizer class	2021-04-16 15:54:49 +02:00
Eren Gölge	a53958ae3a	fix urls for the new models	2021-04-15 17:05:00 +02:00
Eren Gölge	9cc17be53a	formatting and a small bug fix in Tacotron model	2021-04-15 16:36:51 +02:00
Eren Gölge	1ad838bc83	add newly released models under .model.json	2021-04-15 16:06:10 +02:00
Eren Gölge	7cada1a949	remove noise	2021-04-15 15:30:45 +02:00
Eren Gölge	d60a8d7211	show the real waveform on TB too for GAN vocoder training.	2021-04-15 15:30:06 +02:00
Eren Gölge	5fbe926429	change the default TTS model to TacotronDDC	2021-04-15 15:29:44 +02:00
Eren Gölge	3de5a89154	optionally enable prenet dropout at inference time for tacotron models	2021-04-13 13:24:56 +02:00
Eren Gölge	28a2fed8a3	update hifigan in .model.json	2021-04-12 16:48:05 +02:00
Eren Gölge	abaf36861a	aligntts model .model.json placeholder	2021-04-12 16:43:52 +02:00
Eren Gölge	480e2f7888	docstring update and better handling make_symbols	2021-04-12 16:40:49 +02:00
Eren Gölge	b735076bb4	linter fixes	2021-04-12 13:14:11 +02:00
Eren Gölge	b11d1cb845	small fixes	2021-04-12 12:40:55 +02:00
Eren Gölge	a7f6045644	Merge branch 'reformat' into hifigan-reformat	2021-04-12 12:00:17 +02:00
Eren Gölge	f519012dea	reformatting and styling	2021-04-12 11:47:39 +02:00
Eren Gölge	9011dddf77	tacotron DDC placeholder in models.json	2021-04-12 04:06:27 +02:00
Eren Gölge	d295d5de97	remove torch.no_grad from TorchSTFT	2021-04-10 19:43:57 +02:00
Eren Gölge	5b70da2e3f	restore schedulers only if training is continuing a previous training inherit nn.Module for TorchSTFT	2021-04-09 19:31:28 +02:00
Eren Gölge	2c71c6d8cd	[ci skip]update gan vocoder configs to reflect the recent changes	2021-04-09 17:15:32 +02:00
Eren Gölge	2b529f60c8	update default hifigan config	2021-04-09 11:40:06 +02:00
Eren Gölge	105e0b4d62	vocoder gan training fixes	2021-04-09 11:38:04 +02:00
Eren Gölge	87ee6ceb57	style update #3	2021-04-09 01:17:15 +02:00
Eren Gölge	18d9ec8036	format with black	2021-04-09 00:54:59 +02:00
Eren Gölge	e5b9607bc3	isort all imports	2021-04-09 00:45:20 +02:00
Eren Gölge	0e79fa86ad	format with black and pylint 2.7.3	2021-04-09 00:38:08 +02:00
Eren Gölge	cd69da4868	linter fixes #2	2021-04-08 16:57:46 +02:00
Eren Gölge	4d3e1e9d9a	linter fix	2021-04-08 14:57:46 +02:00
Eren Gölge	53f54898bc	small fixes	2021-04-08 14:22:47 +02:00
Eren Gölge	006b1d3aaa	bug fix	2021-04-08 13:17:45 +02:00

1912 Commits