coqui-tts

Commit Graph

Author	SHA1	Message	Date
Reuben Morais	859ac1a54c	Include usage instructions in README	2021-12-17 11:37:19 +01:00
Eren Gölge	babdd84f91	Fix GST inference commit d3e477875a7e46a101fcf95a1794442823750fe2 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit 074e6d0874d3b34fb6a4991fc17d66dccd413fbb Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit fdece14585ab5a36eed1061a9a838d8e48aa6882 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Wed Nov 3 10:16:12 2021 +0000 Read .wav for GST conditioning from CL commit cd29e21b8d0a541ee298d2bf5f67223ad60be38f Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 14:43:47 2021 +0100 Fix GST during inference in Tacotron2 commit 908ce39370eadcc9fa8510cdb26c9ead87305427 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:49:37 2021 +0100 Make trim_db value negative commit 1008a2e0f72fa7ca7f0307424f570386f2f16d42 Author: George Rousssos <25833833+george-roussos@users.noreply.github.com> Date: Fri Oct 29 12:22:24 2021 +0100 Set find_endpoint db threshold in config.json	2021-12-07 13:28:49 +00:00
Eren Gölge	ce45d9e1af	Make style and lint	2021-12-01 10:42:52 +00:00
Eren Gölge	dc3dd55dd9	Add collect_env_info.py	2021-11-08 08:59:08 +01:00
Eren Gölge	a409e0f8f8	Update train_tts for multi-speaker	2021-10-21 16:29:06 +00:00
Eren Gölge	ba2b8c827f	Update `train_tts.py` and `train_vocoder.py`	2021-09-30 14:47:56 +00:00
Eren Gölge	2e9b6b4f90	Refactor Speaker Encoder training	2021-09-30 14:47:56 +00:00
Eren Gölge	043dca61b4	Rename `load_meta_data` as `load_tts_data`	2021-09-30 14:47:56 +00:00
Eren Gölge	3c740d4893	Style extract_tts_spectrogram.py	2021-09-10 08:21:21 +00:00
Eren Gölge	807f1d3817	Fix `extract_tts_spectrograms.py` model init	2021-09-09 08:59:55 +00:00
Eren Gölge	91a70e80b2	Refactor TTSDataset Return a dict by `collate` Refactor batch handling in `collate` A couple of bug fixes	2021-09-06 15:16:58 +00:00
Eren Gölge	545a00fc04	Use absolute paths of the attention masks	2021-09-06 15:16:58 +00:00
Eren Gölge	0f19f8c911	Fix `compute_attention_masks.py`	2021-09-06 15:16:58 +00:00
Eren Gölge	18da8f5dbd	Update pylint 2.10.2 and fix lint issues	2021-08-30 08:10:35 +00:00
Eren Gölge	f186856e5d	Add option to sort input sequnce by audio len	2021-08-30 08:10:35 +00:00
Eren Gölge	5911eec3b1	Small trainer refactoring 1. Use a single Gradscaler for all the optimizers 2. Save terminal logs to a file. In DDP mode, each worker creates `trainer_N_log.txt`. 3. Fixes to allow only the main worker (rank==0) writing to Tensorboard 4. Pass parameters owned by the target optimizer to the grad_clip_norm	2021-08-26 17:08:58 +00:00
Eren Gölge	ecf5f17dca	Fix distribute.py and ddp training	2021-08-12 22:22:32 +00:00
Eren Gölge	6af03ac476	Fix `num_char` init in Tacotron models	2021-08-09 21:46:15 +00:00
Ayush Chaurasia	936a47504d	Update Logger API, recipes	2021-08-09 18:34:00 +00:00
Ayush Chaurasia	f63cf46c55	Unified logger API	2021-08-09 18:34:00 +00:00
Ayush Chaurasia	f606741dc4	Add artifacts logging , wandb args	2021-08-09 18:31:16 +00:00
Agrin Hilmkil	ced4cfdbbf	Allow saving / loading checkpoints from cloud paths (#683) * Allow saving / loading checkpoints from cloud paths Allows saving and loading checkpoints directly from cloud paths like Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec. Note: The user will have to install the relevant dependency for each protocol. Otherwise fsspec will fail and specify which dependency is missing. * Append suffix _fsspec to save/load function names * Add a lower bound to the fsspec dependency Skips the 0 major version. * Add missing changes from refactor * Use fsspec for remaining artifacts * Add test case with path requiring fsspec * Avoid writing logs to file unless output_path is local * Document the possibility of using paths supported by fsspec * Fix style and lint * Add missing lint fixes * Add type annotations to new functions * Use Coqpit method for converting config to dict * Fix type annotation in semi-new function * Add return type for load_fsspec * Fix bug where fs not always created * Restore the experiment removal functionality	2021-08-09 18:02:36 +00:00
Eren Gölge	4b7b88dd3d	Add fullband-melgan DE vocoder	2021-07-26 15:38:30 +02:00
Edresson Casanova	d5adc35fdf	Add docstring to compute_embeddings script	2021-07-21 07:16:10 -03:00
Edresson	b1620d1f3f	remove ignore generate eval flag	2021-07-15 03:34:28 -03:00
Edresson	d906fea08c	lint fix and eval as argparse in extract tts spectrograms	2021-07-13 02:15:31 -03:00
Edresson	2e5baffa9c	Merge fix and eval split as argparse	2021-07-13 01:47:32 -03:00
Eren Gölge	93a74cbb71	Merge pull request #628 from Aloento/patch-2 Change to _get_preprocessor_by_name	2021-07-11 22:17:50 +02:00
Edresson	4eac1c4651	bug fix on train_encoder and unit tests	2021-07-11 12:00:39 -03:00
Aloento	6e3e6d5756	Change to _get_preprocessor_by_name	2021-07-08 09:53:13 +02:00
Eren Gölge	a4c658f5ef	Fix for using the `Synthesizer` out of the model	2021-07-02 10:43:38 +02:00
Eren Gölge	b3c073c99b	Allow runing full path scripts with `distribute.py`	2021-06-28 17:03:47 +02:00
Eren Gölge	a7617d8ab6	Add 🐍 python 3.9 to CI	2021-06-28 17:03:47 +02:00
Eren Gölge	9790eddada	Fix wrong argument name 🛠️	2021-06-28 17:03:47 +02:00
Eren Gölge	45947acb60	Update `TTS.bin` scripts for the new API	2021-06-28 17:03:47 +02:00
Eren Gölge	c7aad884cd	Implement unified trainer	2021-06-28 17:03:19 +02:00
Eren Gölge	c754a0e17d	`TrainerAbstract` and related updates for `TrainerTTS`	2021-06-28 17:03:19 +02:00
Eren Gölge	00c82c516d	rename to	2021-06-28 17:03:19 +02:00
Eren Gölge	03494ad642	adjust `distribute.py` for the `train_tts.py`	2021-06-28 17:03:19 +02:00
Eren Gölge	d6b2b6add6	make style and linter fixes	2021-06-28 17:03:19 +02:00
Eren Gölge	802d461389	Compute d_vectors and speaker_ids separately in TTSDataset	2021-06-28 17:03:19 +02:00
Eren Gölge	db6a97d1a2	rename external speaker embedding arguments as `d_vectors`	2021-06-28 17:03:19 +02:00
Eren Gölge	ef4ea9e527	update imports for `formatters`	2021-06-28 17:03:19 +02:00
Eren Gölge	421194880d	linter fixes	2021-06-28 17:03:19 +02:00
Eren Gölge	8e52a69230	delete separate tts training scripts and pre-commit configuration	2021-06-28 17:03:19 +02:00
Eren Gölge	d96ebcd6d3	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	b500338faa	make style	2021-06-28 17:03:19 +02:00
Eren Gölge	469d2e620a	update extract_tts_spectrogram for `cond_input` API of the models	2021-06-28 17:03:19 +02:00
Eren Gölge	5ab28fa618	update `extract_tts_spec...` using `SpeakerManager`	2021-06-28 17:03:19 +02:00
Eren Gölge	c392fa4288	update `extract_tts_spectrograms` for the new model API	2021-06-28 17:03:19 +02:00
Eren Gölge	8f47f95998	correct import of `load_meta_data` remove redundant import	2021-06-28 17:03:19 +02:00
Eren Gölge	d25f017b42	update `setup_model.py` imports	2021-06-28 17:03:19 +02:00
Eren Gölge	e298b8e364	update trainer.py for better logging handling, restoring models and rename init_ functions with get_	2021-06-28 17:03:19 +02:00
Eren Gölge	5f07315722	add trainer and train_tts	2021-06-28 17:03:19 +02:00
Eren Gölge	8def3c87af	trainer-API updates	2021-06-28 17:03:19 +02:00
Eren Gölge	42554cc711	rename MyDataset -> TTSDataset	2021-06-28 17:03:19 +02:00
Edresson	1c4e806f54	use speaker manager on compute embeddings script	2021-06-27 03:35:34 -03:00
Edresson Casanova	eb84bb2bc8	Merge branch 'dev' into dev	2021-06-26 15:32:19 -03:00
Michael Hansen	3f172b84d8	Fix linting issues	2021-06-25 14:41:31 +02:00
Edresson	99d40e98d9	fix Lint checks	2021-06-18 14:59:01 -03:00
Edresson	28bec238ca	fix Lint checks	2021-06-18 14:33:50 -03:00
Edresson	83644056e3	fix Lint checks	2021-06-18 14:32:28 -03:00
Edresson Casanova	e78e3cd81e	Merge branch 'dev' into dev	2021-06-18 14:10:03 -03:00
Edresson	b74b510d3c	Compute embeddings and find characters using config file	2021-06-18 14:04:49 -03:00
Adam Froghyar	b0aa189348	Forcing do_trim_silence to False in the extract TTS script	2021-06-14 10:44:00 +02:00
Eren Gölge	d0ab0382fc	linter fixes	2021-06-01 09:15:32 +02:00
Eren Gölge	bec85ac58d	make style	2021-05-31 16:37:15 +02:00
Edresson	7448177b72	use SpeakerManager on compute embeddings script	2021-05-29 21:11:53 -03:00
Edresson	208bb0f0ee	add batched speaker encoder inference	2021-05-27 20:01:00 -03:00
Edresson	825734a3a9	remove unused embeddings export	2021-05-27 19:10:24 -03:00
Edresson	1496f271dc	update Compute embeddings script	2021-05-27 00:45:18 -03:00
Edresson	c90037c2e9	solve merge problems	2021-05-26 16:01:30 -03:00
Edresson Casanova	f89cb6aec2	Merge branch 'dev' into dev	2021-05-25 17:30:25 -03:00
Edresson	d570c2d790	pylint fix and data loader bug fix	2021-05-26 01:11:37 -03:00
Eren Gölge	c2c7dff805	use relaxted coqpit parser	2021-05-18 14:49:47 +02:00
Edresson	856ea19758	bug fix in dataloader and update inference	2021-05-18 03:43:16 -03:00
Eren Gölge	12722501bb	styling	2021-05-15 23:48:31 +02:00
Edresson	3433c2f348	add compute embedding for the new speaker encoder	2021-05-12 03:06:46 -03:00
Eren Gölge	715b0a65a0	update main.yml for python x64 fix test	2021-05-12 00:57:29 +02:00
Edresson	3fcc748b2e	implement the Speaker Encoder H/ASP	2021-05-11 16:27:05 -03:00
Eren Gölge	843d1b3d98	linter fixes	2021-05-11 11:30:00 +02:00
Eren Gölge	19fb1d743d	style update	2021-05-11 11:30:00 +02:00
Eren Gölge	9f7599e3c3	fix train_encoder for coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	3fde2001b1	train_encoder refactoring for coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	9ee70af9bb	code styling	2021-05-11 11:29:18 +02:00
Eren Gölge	78b3825d0b	update train scripts for coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	e6f45b9eb7	update train_vocoder_gan.py for coqpit	2021-05-11 11:29:18 +02:00
Eren Gölge	bcebd69d09	remove bash tts training tests	2021-05-11 11:29:17 +02:00
Eren Gölge	7227e8f1d2	update train_align_tts.py for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	720fe13056	update glow_tts modules and training script for coqpit use	2021-05-11 11:29:17 +02:00
Eren Gölge	35341d5482	move bash script based tests to python with coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	eaa130e813	fix tacotron for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	65d7ad4250	refactor train_speedy_speech.py for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	9c18e40f64	black formatting	2021-05-11 11:29:17 +02:00
Eren Gölge	c34c8137d7	update compute_statistics for coqpit	2021-05-11 11:29:17 +02:00
Eren Gölge	79d7215142	config refactor #5 WIP	2021-05-11 11:29:17 +02:00
Eren Gölge	dc50f5f0b0	config refactor #4 WIP	2021-05-11 11:28:35 +02:00
Eren Gölge	97bd5f9734	[ci skip] config update #3 WIP	2021-05-11 11:28:35 +02:00
Eren Gölge	a21c0b5585	config update 2 WIP	2021-05-11 11:28:35 +02:00
Edresson	85ccad7e0a	add Audio data augamentation Addtive and RIR	2021-05-11 00:59:57 -03:00
Edresson	77d85c6cc5	add softmaxproto loss and bug fix in data loader	2021-05-10 17:08:38 -03:00
Eren Gölge	f7582107da	Merge pull request #453 from Edresson/dev Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS.	2021-05-06 17:53:28 +02:00
Edresson	501c8e0302	remove unused vars on extract tts spectrograms script	2021-05-04 19:04:13 -03:00
Eren Gölge	87d674a038	bumpup librosa version to 0.8.0	2021-05-03 14:25:09 +02:00
Edresson	3ecd556bbe	add unit test for extract tts spectrograms script	2021-05-01 13:41:56 -03:00
Edresson	446b1da936	create inference function	2021-04-29 18:18:37 -03:00
Eren Gölge	1235e54738	test for synthesize.py	2021-04-27 14:17:38 +02:00
Eren Gölge	2f0716073e	enable multi-speaker CoquiTTS models for synthesize.py	2021-04-26 19:36:53 +02:00
Edresson	20e42a3381	add save audio option	2021-04-23 15:00:00 -03:00
Edresson	8228091f92	add script for extraction of tts spectrograms	2021-04-23 14:17:46 -03:00
Eren Gölge	4cf211348d	styling and linting	2021-04-23 18:04:37 +02:00
Eren Gölge	179722e3a7	new arguments to synthesize.py for loading speaker encoder and speaker wavs	2021-04-23 18:04:37 +02:00
Eren Gölge	af2d36faeb	update synthesize.py for multi-speaker setting	2021-04-23 18:04:37 +02:00
Edresson	d2b6326b8b	change optimizer initialization for compatibility with Hifi-GAN official implementation	2021-04-23 07:54:39 -03:00
Eren Gölge	9cc17be53a	formatting and a small bug fix in Tacotron model	2021-04-15 16:36:51 +02:00
Eren Gölge	d60a8d7211	show the real waveform on TB too for GAN vocoder training.	2021-04-15 15:30:06 +02:00
Eren Gölge	5fbe926429	change the default TTS model to TacotronDDC	2021-04-15 15:29:44 +02:00
Eren Gölge	b11d1cb845	small fixes	2021-04-12 12:40:55 +02:00
Eren Gölge	a7f6045644	Merge branch 'reformat' into hifigan-reformat	2021-04-12 12:00:17 +02:00
Eren Gölge	f519012dea	reformatting and styling	2021-04-12 11:47:39 +02:00
Eren Gölge	5b70da2e3f	restore schedulers only if training is continuing a previous training inherit nn.Module for TorchSTFT	2021-04-09 19:31:28 +02:00
Eren Gölge	105e0b4d62	vocoder gan training fixes	2021-04-09 11:38:04 +02:00
Eren Gölge	18d9ec8036	format with black	2021-04-09 00:54:59 +02:00
Eren Gölge	e5b9607bc3	isort all imports	2021-04-09 00:45:20 +02:00
Eren Gölge	0e79fa86ad	format with black and pylint 2.7.3	2021-04-09 00:38:08 +02:00
Eren Gölge	cd69da4868	linter fixes #2	2021-04-08 16:57:46 +02:00
Eren Gölge	0ee0458309	remove redundant imports	2021-04-08 11:29:15 +02:00
Eren Gölge	4998ece8d8	allow configuration of optimziers from the config file	2021-04-08 11:28:30 +02:00
Eren Gölge	8daf407652	cache empty	2021-04-08 11:28:30 +02:00
Eren Gölge	3fb78c004a	move scheduler updates to the end of the epoch	2021-04-08 11:28:30 +02:00
Eren Gölge	2a872c98aa	don't call os.exit as it leaves the process resources standing	2021-04-08 11:27:40 +02:00
Eren Gölge	57f6bd1afa	make using different samples for G and D networks optional	2021-04-08 11:26:01 +02:00
rishikksh20	e656e8b108	Remove select size bug	2021-04-08 11:20:33 +02:00
rishikksh20	ef6ff4e95c	Add Exponential LR scheduler check	2021-04-08 11:20:33 +02:00
Eren Gölge	6ad4eba678	gan vocoder train fix in case of restoring models wiht no scheduler is defined	2021-04-06 16:24:50 +02:00
Eren Gölge	b4c2cf80f2	fix eval iter	2021-03-30 14:39:16 +02:00
Eren Gölge	a3a840fd78	linter fixes	2021-03-30 14:39:16 +02:00
Eren Gölge	7a382a5c2b	stowed aligntts commit and small refactoring with feed_forward layers	2021-03-30 14:39:16 +02:00
Eren Gölge	2b3e12ea49	correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting	2021-03-30 14:39:16 +02:00
Eren Gölge	d9c405f0c3	create feedforward folder for SS layers	2021-03-30 14:39:16 +02:00
Eren Gölge	ca2f22cdd7	linter fix	2021-03-30 14:36:12 +02:00
Eren Gölge	d0dcd7d1b8	let the user define outpu.wav file path fix #393	2021-03-30 14:24:31 +02:00
Eren Gölge	3947750dd9	Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev	2021-03-18 14:09:47 +01:00
WeberJulian	596ea2c98a	Add resample script	2021-03-18 13:33:37 +01:00
Eren Gölge	65533f33e9	fix #374	2021-03-18 13:33:00 +01:00
WeberJulian	af96080e17	fix linter issues	2021-03-18 13:33:00 +01:00
WeberJulian	f6cd8e0ecc	test case	2021-03-18 13:33:00 +01:00
WeberJulian	e954e45e57	linter + test	2021-03-18 13:33:00 +01:00
WeberJulian	e598977f3d	Using path.join instead of concat	2021-03-18 13:33:00 +01:00
WeberJulian	c5ef2de73f	Add resample script	2021-03-18 13:33:00 +01:00

419 Commits