coqui-tts

Commit Graph

Author	SHA1	Message	Date
p0p4k	903a77c197	Update wavenet.py (#1796 ) * Update wavenet.py Current version does not use "in_channels" argument. In glowTTS, we use normalizing flows and so "input dim" == "ouput dim" (channels and length). So, the existing code just uses hidden_channel sized tensor as input to first layer as well as outputs hidden_channel sized tensor. However, since it is a generic implementation, I believe it is better to update it for a more general use. * "in_channels -> hidden_channels"	2022-08-01 12:20:37 +02:00
p0p4k	4fe50801b5	Update README.md; download progress bar in CLI. (#1797 ) * Update README.md - minor PR - added model_info usage guide based on #1623 in README.md . * "added tqdm bar for model download" * Update manage.py * fixed style * fixed style * sort imports	2022-08-01 12:17:47 +02:00
Eren Gölge	7d8b1665c8	Fix rand_segment edge case (input_len == seg_len - 1)	2022-08-01 11:37:45 +02:00
vanIvan	5094499eba	Fix & update WaveRNN vocoder model (#1749 ) * Fixes KeyError bug. Adding logging to dashboard. * Make pep8 compliant * Make style compliant * Still fixing style	2022-07-26 15:05:11 +02:00
p0p4k	10195c4eba	Update decoder.py (#1792 ) Minor comment correction.	2022-07-26 13:06:06 +02:00
ivan provalov	903d9c791a	Fix for FloorDiv Function Warning (#1760 ) * Fix for Floor Function Warning Fix for Floor Function Warning * Adding double quotes to fix formatting Adding double quotes to fix formatting * Update glow_tts.py * Update glow_tts.py	2022-07-20 11:31:22 +02:00
Eren Gölge	f7587fc134	Fix SSIM loss correction	2022-07-13 10:47:12 +02:00
Eren Gölge	bc1f93c299	Fix device allocation	2022-07-12 19:05:25 +02:00
Eren Gölge	49bac724c0	Implement VitsAudioConfig (#1556 ) * Implement VitsAudioConfig * Update VITS LJSpeech recipe * Update VITS VCTK recipe * Make style * Add missing decorator * Add missing param * Make style * Update recipes * Fix test * Bug fix * Exclude tests folder * Make linter * Make style	2022-07-12 18:49:58 +02:00
a-froghyar	34b80e0280	feat: updated recipes and lr fix (#1718 ) - updated the recipes activating more losses for more stable training - re-enabling guided attention loss - fixed a bug about not the correct lr fetched for logging	2022-07-12 15:00:53 +02:00
Eren Gölge	48a4f3647f	Make lint	2022-07-12 14:58:26 +02:00
WeberJulian	c614f21982	Add durations as aux input for VITS (#1694 ) * Add durations as aux input for VITS * Make style * Fix tts_tests * Fix test_get_aux_input	2022-07-12 14:25:21 +02:00
Eren Gölge	2cf89b88c9	Make style	2022-07-12 14:12:57 +02:00
Eren Gölge	a6f73a18cb	Fix BCELoss adressing #1192	2022-07-12 14:11:34 +02:00
Eren Gölge	c17ff17a18	Fix SSIM loss	2022-07-12 12:35:24 +02:00
Eren Gölge	f1e35596e8	Remove redundant config field	2022-07-11 13:39:41 +02:00
WeberJulian	5cef6facb0	Fix tokenizer for punc only (#1717 )	2022-07-06 22:59:41 +02:00
camillem	5c821d9fa1	Fix the --model_name and --vocoder_name arguments need a <model_type> element (#1469 ) Co-authored-by: Eren Gölge <erogol@hotmail.com>	2022-06-27 10:32:43 +02:00
manmay nakhashi	577ec406f4	Fix checkpointing GAN models (#1641 ) * checkpoint sae step crash fix * checkpoint save step crash fix * Update gan.py updated requested changes * crash fix	2022-06-22 12:07:46 +02:00
Eren Gölge	00e67092d8	Bump up to v0.7.1	2022-06-21 14:12:55 +02:00
Eren Gölge	3328be7a8e	Remove GL message	2022-06-21 12:39:31 +02:00
WeberJulian	30c72e0d05	Add Thorsten VITS model (#1675 ) Co-authored-by: Eren Gölge <egolge@coqui.ai>	2022-06-21 11:39:49 +02:00
p0p4k	71281ff1e4	Add support for model_info in CLI (#1623 ) * model_info * model_info * model_info_by_idx and name * model_info_by_idx and name * model_info * Update manage.py * fixed linter * fixed linter * fixed linter * fixed linter * fixed return style checks * fixed linter * fixed linter * fixed idx always positive * added comments * fix parser.args check * fix parser.args check * Make style Co-authored-by: Eren Gölge <egolge@coqui.ai>	2022-06-20 23:28:17 +02:00
Eren Gölge	8b75e8be9c	Bump up to v0.7.0	2022-06-20 13:50:09 +02:00
WeberJulian	6126c23498	Add synpaflex formatter (#1616 ) * Add synpaflex formatter * Fix formatter * Make style	2022-06-20 13:36:26 +02:00
WeberJulian	f09ea11c71	Internal formatter (#1629 ) * Add coqui formatter * Make style	2022-06-08 14:31:03 +02:00
Eren Gölge	f70e82cd19	Use fsspec and torch for embedding file IO (#1581 ) * Use fsspec and torch for embedding file * Fixup * Fix load and save files * Fix compute embedding script * Set use_cuda to true if available * Add dummy speakers.pth file * Make style * Change default speakers file extension Co-authored-by: WeberJulian <julian.weber@hotmail.fr>	2022-06-01 13:49:42 +02:00
Noran Raskin	a790df4e94	Training recipes for thorsten dataset (#1020 ) * Fix style * Fix isort * Remove tensorboardX from requirements Co-authored-by: logan hart <72301874+loganhart420@users.noreply.github.com> Co-authored-by: Eren Gölge <egolge@coqui.ai>	2022-05-30 12:07:31 +02:00
André R. de Miranda	3b84ef9524	Fixed use_cuda issue in compute_embeddings.py Added use_cuda argument in self.init_encoder method	2022-05-20 12:46:46 -03:00
a-froghyar	8be21ec387	Capacitron (#977 ) * new CI config * initial Capacitron implementation * delete old unused file * fix empty formatting changes * update losses and training script * fix previous commit * fix commit * Add Capacitron test and first round of test fixes * revert formatter change * add changes to the synthesizer * add stepwise gradual lr scheduler and changes to the recipe * add inference script for dev use * feat: add posterior inference arguments to synth methods - added reference wav and text args for posterior inference - some formatting * fix: add espeak flag to base_tts and dataset APIs - use_espeak_phonemes flag was not implemented in those APIs - espeak is now able to be utilised for phoneme generation - necessary phonemizer for the Capacitron model * chore: update training script and style - training script includes the espeak flag and other hyperparams - made style * chore: fix linting * feat: add Tacotron 2 support * leftover from dev * chore:rename parser args * feat: extract optimizers - created a separate optimizer class to merge the two optimizers * chore: revert arbitrary trainer changes * fmt: revert formatting bug * formatting again * formatting fixed * fix: log func * fix: update optimizer - Implemented load_state_dict for continuing training * fix: clean optimizer init for standard models * improvement: purge espeak flags and add training scripts * Delete capacitronT2.py delete old training script, new one is pushed * feat: capacitron trainer methods - extracted capacitron specific training operations from the trainer into custom methods in taco1 and taco2 models * chore: renaming and merging capacitron and gst style args * fix: bug fixes from the previous commit * fix: implement state_dict method on CapacitronOptimizer * fix: call method * fix: inference naming * Delete train_capacitron.py * fix: synthesize * feat: update tests * chore: fix style * Delete capacitron_inference.py * fix: fix train tts t2 capacitron tests * fix: double forward in T2 train step * fix: double forward in T1 train step * fix: run make style * fix: remove unused import * fix: test for T1 capacitron * fix: make lint * feat: add blizzard2013 recipes * make style * fix: update recipes * chore: make style * Plot test sentences in Tacotron * chore: make style and fix import * fix: call forward first before problematic floordiv op * fix: update recipes * feat: add min_audio_len to recipes * aux_input["style_mel"] * chore: make style * Make capacitron T2 recipe more stable * Remove T1 capacitron Ljspeech * feat: implement new grad clipping routine and update configs * make style * Add pretrained checkpoints * Add default vocoder * Change trainer package * Fix grad clip issue for tacotron * Fix scheduler issue with tacotron Co-authored-by: Eren Gölge <egolge@coqui.ai> Co-authored-by: WeberJulian <julian.weber@hotmail.fr> Co-authored-by: Eren Gölge <erogol@hotmail.com>	2022-05-20 16:17:11 +02:00
Edresson Casanova	ee99a6c1e2	Fix voice conversion inference (#1583 ) * Add voice conversion zoo test * Fix style * Fix unit test	2022-05-20 15:50:25 +02:00
Edresson Casanova	e5d8ec2402	Change the VITS upsampling interpolation trick to linear (#1564 )	2022-05-13 10:52:39 +02:00
Edresson Casanova	c6008e5235	Add audio length sampler balancer (#1561 ) * Add audio length sampler balancer * Add unit tests	2022-05-12 19:59:19 +02:00
Eren Gölge	6e460b7e42	Add an assert for the upsampling trick (#1538 )	2022-05-12 19:55:24 +02:00
Eren Gölge	4857967063	🐍 Python 3.10.x support and drop Python 3.6 support (#1565 ) * Update requirements * Update CI for p3.10 * Update numpy requirement * Drop 🐍p3.6 support Numpy also dropped support for p3.6 * Bind cython v0.29.28 * Bind pyworld to v0.2.10 > 0.2.10 is not p3.10.x compatible * Update Dockerfile	2022-05-12 15:50:25 +02:00
Edresson Casanova	a97eed696a	Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 (#1560 )	2022-05-12 15:15:18 +02:00
Eren Gölge	e45ae57aef	Merge pull request #1550 from coqui-ai/fix-upsampling-asserts Fix VITS upsampling asserts	2022-05-12 14:51:41 +02:00
Edresson Casanova	175ca06388	Add reinit text encoder and duration predictor parameter (#1562 ) * Add reinit encoder and duration predictor option * Add .data to prevent any overlooked autograd hook	2022-05-12 09:08:36 -03:00
Edresson Casanova	182711043c	Fix the VITS upsampling asserts Fix style	2022-05-12 09:08:29 -03:00
Eren Gölge	2fc38f67d2	Update SpeakerManager init in Synthesizer	2022-05-11 11:32:27 +02:00
Eren Gölge	c3f8c4d5eb	Return default SpeakerManager if no d_vector_file	2022-05-11 11:31:45 +02:00
Eren Gölge	121e9ed685	Pass use_cuda to init_encoder	2022-05-11 11:31:17 +02:00
Eren Gölge	c18bd21b3f	Return durations at VITS inference	2022-05-11 11:30:05 +02:00
Eren Gölge	5021a03de0	Use torch.no_grad for VITS inference	2022-05-11 11:29:36 +02:00
Eren Gölge	3f03e3012c	Fix batch_group_size in VITS	2022-05-07 13:44:44 +02:00
code-review-doctor	fa887ef5f9	Fix issue probably-meant-fstring found at https://codereview.doctor (#1532 )	2022-05-07 13:33:40 +02:00
Eren Gölge	a0a9279e4b	Fix GAN optimizer order commit `212d330929` Author: Edresson Casanova <edresson1@gmail.com> Date: Fri Apr 29 16:29:44 2022 -0300 Fix unit test commit `44456b0483` Author: Edresson Casanova <edresson1@gmail.com> Date: Fri Apr 29 07:28:39 2022 -0300 Fix style commit `d545beadb9` Author: Edresson Casanova <edresson1@gmail.com> Date: Thu Apr 28 17:08:04 2022 -0300 Change order of HIFI-GAN optimizers to be equal than the original repository commit `657c5442e5` Author: Edresson Casanova <edresson1@gmail.com> Date: Thu Apr 28 15:40:16 2022 -0300 Remove audio padding before mel spec extraction commit `76b274e690` Merge: `379ccd7b` `6233f4fc` Author: Edresson Casanova <edresson1@gmail.com> Date: Wed Apr 27 07:28:48 2022 -0300 Merge pull request #1541 from coqui-ai/comp_emb_fix Bug fix in compute embedding without eval partition commit `379ccd7ba6` Author: WeberJulian <julian.weber@hotmail.fr> Date: Wed Apr 27 10:42:26 2022 +0200 returns y_mask in VITS inference (#1540) * returns y_mask * make style	2022-05-07 13:29:11 +02:00
Edresson Casanova	60034674f9	Remove audio padding before mel spec extraction	2022-05-07 13:12:09 +02:00
WeberJulian	fbdf76b2fc	returns y_mask in VITS inference (#1540 ) * returns y_mask * make style	2022-05-03 13:49:24 +02:00
Edresson Casanova	6233f4fcd7	Bug fix in compute embedding without eval partition	2022-04-26 13:58:03 -03:00
Edresson Casanova	8d228ab22a	Trick to Upsampling to High sampling rates using VITS model (#1456 ) * Add upsample VITS support * Fix the bug in inference * Fix lint checks * Add RMS based norm in save_wav method * Style fix * Add the period for VITS multi-period discriminator in model_args * Bug fix in speaker encoder load in inference time * Add unit tests * Remove useless detach_z_vocoder parameter * Add docs for VITS upsampling * Fix the docs * Rename TTS_part_sample_rate to encoder_sample_rate * Add upsampling_init and upsampling_z methods * Add asserts for encoder_sample_rate part * Move upsampling tests to test_vits.py	2022-04-26 11:47:46 +02:00
Eren Gölge	c410bc58ef	Bump to v0.6.2	2022-04-20 11:46:26 +02:00
WeberJulian	30bea7d53c	Update manage.py (#1514 )	2022-04-19 14:27:32 +02:00
Eren Gölge	7133f8f47d	Print Model's license when downloading (#1512 ) * Print model license while downloading * Make style * Add a new license link * Make style	2022-04-19 14:18:49 +02:00
WeberJulian	4953636b14	Add African models (#1511 ) * Add african models * Set default license for all models	2022-04-19 14:18:30 +02:00
Edresson Casanova	060e0f9368	Add EmbeddingManager and BaseIDManager (#1374 )	2022-03-31 13:41:16 +02:00
WeberJulian	1b22f03e98	Fix G2P backend of the released models (#1461 ) * Fix enforce phonemizer * Add new models * Fix .model.json	2022-03-30 12:47:11 +02:00
WeberJulian	c66a6241fd	Enforce phonemizer definition for synthesis (#1441 ) * Enforce phonemizer definition for synthesis * Fix train_tts, tokenizer init can now edit config * Add small change to trigger CI pipeline * fix wrong output path for one tts_test * Fix style * Test config overides by args and tokenizer * Fix style	2022-03-25 23:15:33 +01:00
Edresson Casanova	37896e1743	Bug fix in freeze encoder (#1391 ) * Fix the bug in freeze encoder * Remove emb_l definition for non-multilingual training * Fix unit tests	2022-03-24 18:16:04 +01:00
Edresson Casanova	3435bc8fca	Fix style tests	2022-03-23 15:05:32 -03:00
Edresson Casanova	0ae1e0248c	Fix the bug for emptly audio files	2022-03-23 14:39:31 -03:00
Edresson Casanova	ea53d6feb3	Replace webrtcvad by silero-vad	2022-03-23 14:39:31 -03:00
Eren Gölge	3af01cfe3b	Update base model wrt 👟 (#1406 )	2022-03-23 17:24:20 +01:00
Eren Gölge	1c3623af33	Fix model manager (#1436 ) * Fix manager * Make style	2022-03-23 12:57:14 +01:00
Eren Gölge	72d85e53c9	Update model file extension (#1422 ) * Update model file ext to ```.pth``` * Update docs * Rename more * Find model files	2022-03-22 17:55:00 +01:00
Eren Gölge	fd56fabb21	Fix #1380 (#1409 )	2022-03-16 12:38:27 +01:00
Eren Gölge	0870a4faa2	Make style (#1405 )	2022-03-16 12:13:55 +01:00
WeberJulian	690c96ed28	Fix default phonemizer for ja and zh (#1399 )	2022-03-16 12:13:22 +01:00
Edresson Casanova	f81892483d	REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support (#1349 ) * Rename Speaker encoder module to encoder * Add a generic emotion dataset formatter * Transform the Speaker Encoder dataset to a generic dataset and create emotion encoder config * Add class map in emotion config * Add Base encoder config * Add evaluation encoder script * Fix the bug in plot_embeddings * Enable Weight decay for encoder training * Add argumnet to disable storage * Add Perfect Sampler and remove storage * Add evaluation during encoder training * Fix lint checks * Remove useless config parameter * Active evaluation in speaker encoder test and use multispeaker dataset for this test * Unit tests fixs * Remove useless tests for speedup the aux_tests * Use get_optimizer in Encoder * Add BaseEncoder Class * Fix the unitests * Add Perfect Batch Sampler unit test * Add compute encoder accuracy in a function	2022-03-11 14:43:40 +01:00
Edresson Casanova	36e9ea2f97	Open bible dataset formatter (#1365 ) * Add support for voice conversion inference * Cache d_vectors_by_speaker for fast inference using a bigger speakers.json * Rebase bug fix * Use the average d-vector for inference * Fix the bug in find unique chars script * Add OpenBible formatter Co-authored-by: Eren Gölge <erogol@hotmail.com>	2022-03-11 10:43:31 +01:00
Edresson Casanova	dbe9da7f15	Add Voice conversion inference support (#1337 ) * Add support for voice conversion inference * Cache d_vectors_by_speaker for fast inference using a bigger speakers.json * Rebase bug fix * Use the average d-vector for inference	2022-03-10 14:57:12 +01:00
Edresson Casanova	917f417ac4	Add alphas to control language and speaker balancer (#1216 ) * Add alphas to control language and speaker balancer * Add docs for speaker and language samplers * Change the Samplers weights to float for save memory * Change the test_samplers to unittest format * Add get_sampler method in BaseTTS * Fix rebase issues * Add language and speaker samplers support for DDP training * Rename distributed sampler wrapper * Remove the DistributedSamplerWrapper and use the one from Trainer * Bugfix after rebase * Move the samplers config to tts config	2022-03-10 14:56:09 +01:00
Edresson Casanova	f381e29b91	REBASED: Add support for the speaker encoder training using torch spectrograms (#1348 ) * Add support for the speaker encoder training using torch spectrograms * Remove useless function in speaker encoder dataset class	2022-03-10 14:54:51 +01:00
Eren Gölge	c670365507	Fix VCTK recipe and formatter	2022-03-08 14:20:34 +01:00
Eren Gölge	8feb41d361	Bump up to v0.6.1	2022-03-07 15:57:44 +01:00
Eren Gölge	ee02bc3823	Bump up to v0.6.0	2022-03-07 12:08:22 +01:00
Eren Gölge	dc280819be	Add new models	2022-03-07 12:08:09 +01:00
Eren Gölge	e9d9028b4d	Revert cleaner name	2022-03-06 12:57:06 +01:00
Eren Gölge	764c7fa4a4	Rename phoneme_cleaners	2022-03-06 12:09:54 +01:00
Eren Gölge	dd4287de1f	Update models	2022-03-03 20:23:00 +01:00
Eren Gölge	6cb00be795	Update your_tts model URL	2022-03-02 18:04:49 +01:00
Eren Gölge	1425a023fe	Make style and lint	2022-03-02 13:25:35 +01:00
Eren Gölge	c68885b3fd	Update Vits speaker encoder init	2022-03-02 13:20:23 +01:00
Eren Gölge	27b67b7945	Fix import	2022-03-02 09:15:20 +01:00
Eren Gölge	942df0fb05	Update vits dataset	2022-03-02 09:14:32 +01:00
Eren Gölge	6a9f8074f0	Fix TTSDataset	2022-03-01 07:57:48 +01:00
Eren Gölge	690de1ab06	Update Characters and add more tests	2022-02-25 11:32:44 +01:00
Eren Gölge	9063397892	Fix FastSpeech config	2022-02-25 11:31:56 +01:00
Eren Gölge	1e414b3a09	Make stlye	2022-02-25 11:31:56 +01:00
Eren Gölge	acc83cd3e6	Update Vits model API	2022-02-25 11:31:56 +01:00
Eren Gölge	fe656659be	Implement BaseTTS	2022-02-25 11:31:56 +01:00
Eren Gölge	bed4afd4ee	Implement BaseVocabulary	2022-02-25 11:31:56 +01:00
Eren Gölge	e0f9be76c0	Update test_run in wavernn and wavegrad	2022-02-25 11:31:56 +01:00
Eren Gölge	bf540f4323	Update imports for trainer	2022-02-25 11:31:56 +01:00
Eren Gölge	4c43eda414	Update BaseTrainerModel	2022-02-25 11:31:56 +01:00
Eren Gölge	83c5ddc5b7	Update imports	2022-02-25 11:31:56 +01:00
Eren Gölge	14c117978d	Fix return outputs	2022-02-25 11:31:56 +01:00
Eren Gölge	424d04e4f6	Make stlye	2022-02-25 11:31:56 +01:00
Eren Gölge	8b3ba02c95	Add vocab_dict to model config	2022-02-25 11:31:20 +01:00
Eren Gölge	ff23dce081	Update TTSDataset	2022-02-25 11:31:20 +01:00
Eren Gölge	750903d2ba	Add VCTK formatter docstring	2022-02-25 11:30:24 +01:00
Eren Gölge	52a7896668	Update VITS loss	2022-02-25 11:30:24 +01:00
Eren Gölge	c68962c574	Update forward tts binary loss	2022-02-25 11:30:24 +01:00
Eren Gölge	c11944022d	Revert back again rand_segment	2022-02-25 11:30:24 +01:00
Eren Gölge	00c7600103	Update Vits model API	2022-02-25 11:30:24 +01:00
Eren Gölge	935a604046	Delete trainer_utils	2022-02-25 11:29:41 +01:00
Eren Gölge	d0c27a9661	Update synthesis.py	2022-02-25 11:29:41 +01:00
Eren Gölge	35fc7270ff	Implement BaseTTS	2022-02-25 11:28:47 +01:00
Eren Gölge	2bad098625	Implement BaseVocabulary	2022-02-25 11:28:47 +01:00
Eren Gölge	833de62e30	Update base_vocoder	2022-02-25 11:28:14 +01:00
Eren Gölge	fc3b6d2861	Update gan	2022-02-25 11:28:14 +01:00
Eren Gölge	20a677c623	Update test_run in wavernn and wavegrad	2022-02-25 11:28:14 +01:00
Eren Gölge	be3a03126a	Update imports for trainer	2022-02-25 11:28:14 +01:00
Eren Gölge	c911729896	Update BaseTrainerModel	2022-02-25 11:28:14 +01:00
Eren Gölge	1e219fef0a	Revert drop_last	2022-02-25 11:26:59 +01:00
Eren Gölge	7dfd753d91	Add a cheap trick to avoid short audio clips	2022-02-25 11:26:59 +01:00
Eren Gölge	1a43e05460	Fix VITS loss bug Fake and real features were given in the wrong args order to the loss function	2022-02-25 11:26:59 +01:00
Eren Gölge	4b96bfe925	Fix train logging	2022-02-25 11:26:59 +01:00
Eren Gölge	ab8a4ca2c3	Revert random segment	2022-02-25 11:26:59 +01:00
Eren Gölge	8622226f3f	Make style	2022-02-25 11:26:59 +01:00
Eren Gölge	27db089d6c	Change TrainingArgs -> TrainerArgs	2022-02-25 11:26:59 +01:00
Eren Gölge	aa81454721	Update BaseTrainingConfig	2022-02-25 11:26:59 +01:00
Eren Gölge	d3a58ed07a	Fix default values	2022-02-25 11:26:59 +01:00
Eren Gölge	54c6bb2a8c	Fix add speaker VITS	2022-02-25 11:26:59 +01:00
Eren Gölge	590b04fb89	Fix espeak_wrapper	2022-02-25 11:26:59 +01:00
Eren Gölge	a013566d15	Delete trainer related code	2022-02-25 11:26:59 +01:00
Eren Gölge	38314194e7	Set `drop_last`	2022-02-25 11:26:59 +01:00
Eren Gölge	f70e4bb8c6	Add new speakers to the vits model	2022-02-25 11:26:59 +01:00
Eren Gölge	d5c0e17548	Load right char class dynamically	2022-02-25 11:26:59 +01:00
Eren Gölge	1f0c8179da	Make style	2022-02-25 11:26:59 +01:00
Eren Gölge	b3ed6ff6b7	Update FastPitchConfig	2022-02-25 11:26:59 +01:00
Eren Gölge	1932401e8d	Fix dataset preprocessing	2022-02-25 11:26:59 +01:00
Eren Gölge	34c4be5e49	Update forwardtts	2022-02-25 11:26:59 +01:00
Eren Gölge	bb37462794	Update language manager	2022-02-25 11:26:59 +01:00
Eren Gölge	5169d4eb32	Plot pitch over input characters	2022-02-25 11:26:59 +01:00
Eren Gölge	cd5d1497cf	Add pitch_fmin pitch_fmax args to the audio	2022-02-25 11:26:59 +01:00
Eren Gölge	1445a46e9e	Update synthesizer to use iinit_from_config	2022-02-25 11:26:59 +01:00
Eren Gölge	7058fcc3ff	Take file extension as an argument	2022-02-25 11:26:59 +01:00
Eren Gölge	13482dde1f	Update GAN model	2022-02-25 11:26:59 +01:00
Eren Gölge	2829027d8b	Refactor VITS model	2022-02-25 11:26:59 +01:00
Eren Gölge	ef63c99524	Implement `start_by_longest` option for TTSDatase	2022-02-25 11:26:18 +01:00
Eren Gölge	c4c471d61d	Allow padding for shorter segments	2022-02-25 11:25:48 +01:00
Eren Gölge	47fbddc8d4	Fix docstring	2022-02-25 11:25:48 +01:00
Eren Gölge	bc2243bac4	Fix tests	2022-02-25 11:25:00 +01:00
Eren Gölge	146fbfd7c9	Extend unittests	2022-02-25 11:25:00 +01:00
Eren Gölge	2fe16de8e3	Make lint	2022-02-25 11:25:00 +01:00
Eren Gölge	7b49a4aa2b	Fix glow_tts_config missing field	2022-02-25 11:24:13 +01:00
Eren Gölge	07b0a80d57	Fix tokenizer init_from_config	2022-02-25 11:24:13 +01:00
Eren Gölge	50e17097a7	Add verbose option to AudioProcessor	2022-02-25 11:24:13 +01:00
Eren Gölge	235f7d9b02	Extend glow_tts model tests	2022-02-25 11:24:13 +01:00
Eren Gölge	8e248913d6	Update train_tts for the new API	2022-02-25 11:24:13 +01:00
Eren Gölge	001da8afc8	Update Vits for the new model API	2022-02-25 11:21:19 +01:00
Eren Gölge	5176ae9e53	Fixes small compat. issues	2022-02-25 11:21:19 +01:00
Eren Gölge	131bc0cfc0	Fix synthesis.py 🔧	2022-02-25 11:18:00 +01:00
Eren Gölge	c0746f23df	Fix `too many open files`	2022-02-25 11:16:30 +01:00
Eren Gölge	df0d58bf09	Update VCTK recipes	2022-02-25 11:16:30 +01:00
Eren Gölge	730f7c0df4	Add file_ext args to resample.py	2022-02-25 11:15:46 +01:00
Eren Gölge	28d98da422	Update VCTK formatter	2022-02-25 11:15:46 +01:00
Eren Gölge	4d99fee3e2	Update spec extractor	2022-02-25 11:12:44 +01:00
Eren Gölge	38a0b3b6c7	Update train_tts.py	2022-02-25 11:11:35 +01:00
Eren Gölge	cfaa51fddc	Update BaseTTS config	2022-02-25 11:11:35 +01:00
Eren Gölge	4c5cb44eeb	Update setup_model	2022-02-25 11:11:35 +01:00
Eren Gölge	7c4243fba7	Update GlowTTS	2022-02-25 11:11:35 +01:00
Eren Gölge	bacf79f4fb	Update AlignTTS	2022-02-25 11:11:35 +01:00
Eren Gölge	18f726af65	Update ForwardTTS	2022-02-25 11:11:35 +01:00
Eren Gölge	d0ec4b91e5	Update Tacotron models	2022-02-25 11:11:35 +01:00
Eren Gölge	ea965a5683	Update VITS for the new API	2022-02-25 11:11:35 +01:00
Eren Gölge	f802a931a3	Pass samples to init_from_config in SpeakerManager	2022-02-25 11:07:34 +01:00
Eren Gölge	bde68d9f25	Use the same phonemizer for `en` to `en-us`	2022-02-25 11:07:34 +01:00
Eren Gölge	8649d4fd36	Allow None pad and blank tokens	2022-02-25 11:07:34 +01:00
Eren Gölge	c9972e6f14	Make lint	2022-02-25 11:07:34 +01:00
Eren Gölge	30cfafce56	Add init_from_config	2022-02-25 11:05:54 +01:00
Eren Gölge	90cc45dd4e	Update data loader tests	2022-02-25 11:05:54 +01:00
Eren Gölge	93957d58a1	Refactorin VITS for the tokenizer API	2022-02-25 11:05:06 +01:00
Eren Gölge	04df0a3d9f	Refactor TTSDataset ⚡️	2022-02-25 11:05:06 +01:00
Eren Gölge	9bb347a52b	Update for tokenizer API	2022-02-25 11:05:06 +01:00
Eren Gölge	452dbc43d8	Update imports for symbols -> characters	2022-02-25 11:05:06 +01:00
Eren Gölge	8071fa0020	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	b6c2bfdf08	Refactor synthesis.py for TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	b2bb954a51	Refactor TTSDataset to use TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	84091096a6	Refactor Synthesizer class for TTSTokenizer	2022-02-25 11:05:06 +01:00
Eren Gölge	196ae74273	Update data loader tests	2022-02-25 11:05:06 +01:00
Eren Gölge	98057a00ae	Make style	2022-02-25 10:57:35 +01:00
Eren Gölge	7575367b9f	Refactorin VITS for the tokenizer API	2022-02-25 10:57:35 +01:00
Eren Gölge	4cd690e4c1	Updates BaseTTS and configs	2022-02-25 10:57:35 +01:00
Eren Gölge	176b712c1a	Refactor TTSDataset ⚡️	2022-02-25 10:57:35 +01:00
Eren Gölge	4597d4e5b6	Remove get_characters from BaseTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	1df1d6c4a9	Update for tokenizer API	2022-02-25 10:48:03 +01:00
Eren Gölge	2d8ce98d2a	Update imports for symbols -> characters	2022-02-25 10:48:03 +01:00
Eren Gölge	9a95e15483	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 10:48:03 +01:00
Eren Gölge	d0eb642d88	Refactor synthesis.py for TTSTokenizer	2022-02-25 10:48:03 +01:00
Eren Gölge	3476be30d7	Refactor Synthesizer class for TTSTokenizer	2022-02-25 10:48:03 +01:00
Eren Gölge	9397a56b13	Allow init_from_config from model or audio config	2022-02-25 10:48:03 +01:00
Eren Gölge	a71a013276	Fix the wrong default loss name for GAN models	2022-02-25 10:48:03 +01:00
Eren Gölge	04202da1ac	Make style	2022-02-25 10:48:03 +01:00
Eren Gölge	3b63d713b9	Fix espeak wrapper cmd call	2022-02-25 10:48:03 +01:00
Eren Gölge	4894998e6b	Fix print_logs	2022-02-25 10:48:03 +01:00
Eren Gölge	4e8f9d6f10	Fix IPAPhonemes init_from_config	2022-02-25 10:48:03 +01:00
Eren Gölge	0fe39166fe	Discard OOV chars in tokenizer Discard but store OOV chars with a warninig message when the OOV char first recognized	2022-02-25 10:48:03 +01:00
Eren Gölge	c39aaafbfc	Update EspeakWrapper for espeak-ng	2022-02-25 10:48:03 +01:00
Eren Gölge	bb389479a4	Update setup_model for TTS.tts models	2022-02-25 10:48:03 +01:00
Eren Gölge	9b83e665fc	Add init_from_config as an abstract class	2022-02-25 10:48:03 +01:00
Eren Gölge	3eca5ad060	Update config fields for phonemizer	2022-02-25 10:48:03 +01:00
Eren Gölge	d2525abe8c	Remove get_characters from BaseTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	73d27ebd45	Fix GlowTTS	2022-02-25 10:48:03 +01:00
Eren Gölge	87bf940676	Print duplicate characters	2022-02-25 10:48:03 +01:00
Eren Gölge	3de9f38d16	Add init_from_config to SpeakerManager	2022-02-25 10:48:03 +01:00
Eren Gölge	d8ec7086b6	Update `synthesis` for the new API	2022-02-25 10:48:03 +01:00
Eren Gölge	4e83bf3968	Allow choosing phonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	22f0c58fe1	Print language codes	2022-02-25 10:48:02 +01:00
Eren Gölge	693fb4dd39	Modify init_from_config for IPAPhonemes	2022-02-25 10:48:02 +01:00
Eren Gölge	acc6eef625	Update for tokenizer API	2022-02-25 10:48:02 +01:00
Eren Gölge	e1b4c4ca43	Add init_from_config to GAN	2022-02-25 10:48:02 +01:00
Eren Gölge	353f913efc	Fix #985	2022-02-25 10:48:02 +01:00
Eren Gölge	ba3b60c90f	Test TTSTokenizer	2022-02-25 10:48:02 +01:00
Eren Gölge	79a84410f2	Test punctuations	2022-02-25 10:48:02 +01:00
Eren Gölge	d8bdeb8b8f	Fix Punctuation	2022-02-25 10:48:02 +01:00
Eren Gölge	ff7c385838	Fix BasePhonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	10d435ce77	Fixup	2022-02-25 10:48:02 +01:00
Eren Gölge	f0655bfffc	Fix ja_jp_phonemizer	2022-02-25 10:48:02 +01:00
Eren Gölge	20e5dd3678	Add doc examples	2022-02-25 10:48:02 +01:00
Eren Gölge	fbad17e084	Update imports for symbols -> characters	2022-02-25 10:48:02 +01:00
Eren Gölge	a1df4f9887	Test character classes	2022-02-25 10:45:24 +01:00
Eren Gölge	bd461ace33	Refactor GlowTTS model and recipe for TTSTokenizer	2022-02-25 10:45:24 +01:00
Eren Gölge	5a9653978a	Refactor synthesis.py for TTSTokenizer	2022-02-25 10:45:24 +01:00
Eren Gölge	e5785b34b0	Style fix	2022-02-25 10:27:46 +01:00
Eren Gölge	e4049aa31a	Refactor TTSDataset to use TTSTokenizer	2022-02-25 10:27:46 +01:00
Eren Gölge	2480bbe937	Remove OLD TOKENIZATION ROUTINES	2022-02-25 09:32:54 +01:00
Eren Gölge	53f696615b	Add init_from_config to AudioProcessor	2022-02-25 09:32:54 +01:00
Eren Gölge	3d86edfc81	Refactor Synthesizer class for TTSTokenizer	2022-02-25 09:32:54 +01:00
Eren Gölge	8d85af84cd	Implement Punctuation class	2022-02-25 09:32:54 +01:00
Eren Gölge	1aca58afaf	Fix imports in cleaners.py	2022-02-25 09:32:54 +01:00
Eren Gölge	0344645e90	Implement TTSTokenizer	2022-02-25 09:32:54 +01:00
Eren Gölge	2fb1f70503	Implement BaseCharacters, IPAPhonemes, Graphemes	2022-02-25 09:32:54 +01:00
Eren Gölge	1bee40af40	Create language folders under `TTS.tts.utils.text`	2022-02-25 09:32:54 +01:00
Eren Gölge	c1119bc291	Implement BasePhonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	dcd01356e0	Create `text/english` folder	2022-02-25 09:32:54 +01:00
Eren Gölge	80867c8e8c	Implement multi-phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	5e4f78add3	Implement espeak wrapper	2022-02-25 09:32:54 +01:00
Eren Gölge	e03a05c816	Implement gruut wrapper	2022-02-25 09:32:54 +01:00
Eren Gölge	172ba0c5e7	Implement JA_JP phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	ca02b82218	Implement ZH_CH phonemizer	2022-02-25 09:32:54 +01:00
Eren Gölge	a51b031bff	Merge branch 'dev' into dev-fix-glowtts-infer	2022-02-21 12:01:40 +03:00
Edresson Casanova	28a7464975	Fix the bug in split dataset function (#1251 ) * Fix the bug in split_dataset * Make eval_split_size configurable * Change test_loader to use load_tts_samples function * Change eval_split_portion to eval_split_size and permits to set the absolute number of samples in eval * Fix samplers unit test * Add data unit test on GitHub workflow	2022-02-21 11:59:36 +03:00
Edresson Casanova	bc5db13d06	Fix the bug in extract tts spectrogram script	2022-02-19 19:24:00 +00:00
Edresson Casanova	ba6e56e01c	Fix Glow-TTS multi-speaker inference	2022-02-18 19:25:29 +00:00
Eren Gölge	127118c637	Update TTS.tts formatters (#1228 ) * Return Dict from tts formatters * Make style	2022-02-11 23:03:43 +01:00
Eren Gölge	5e3f499a69	Fix #1187 (#1227 )	2022-02-11 13:27:59 +01:00
Edresson Casanova	0860d73cf8	Remove Tensorflow requeriment (#1225 ) * Remove TF modules * Remove TF unit tests * Remove TF vocoder modules * Remove TF convert scripts * Remove TF requirement * Remove the Docs TF instructions * Remove TF inference support	2022-02-10 16:14:54 +01:00
Eren Gölge	44c7d1a826	Merge pull request #1054 from WeberJulian/partial_embedding_compute Partial embedding compute	2022-02-06 20:13:55 +01:00

1801 Commits