coqui-tts

Commit Graph

Author	SHA1	Message	Date
Eren Gölge	14d45b5347	Bump up to v0.10.2	2023-01-11 01:06:02 +01:00
Khalid Bashir	42afad5e79	Fixed bug related to yourtts speaker embeddings issue (#2234 ) * Fixed bug related to yourtts speaker embeddings issue * Reverted code for base_tts * Bug fix on VITS d_vector_file type * Ignore the test speakers on YourTTS recipe * Add speaker encoder model and config on YourTTS recipe to easily do zero-shot inference * Update YourTTS config file * Update ModelManager._update_path to deal with list attributes * Fix lint checks * Remove unused code * Fix unit tests * Reset name_to_id to get the right speaker ids on load_embeddings_from_list_of_files * Set weighted_sampler_multipliers as an empty dict to prevent users' mistakes Co-authored-by: Edresson Casanova <edresson1@gmail.com>	2023-01-02 14:20:02 +01:00
Julian Weber	a07397733b	Multilingual tokenizer (#2229 ) * Implement multilingual tokenizer * Add multi_phonemizer receipe * Fix lint * Add TestMultiPhonemizer * Fix lint * make style	2023-01-02 10:03:19 +01:00
Eren Gölge	f814d52394	Bump up to v0.10.1	2022-12-26 14:29:46 +01:00
Eren Gölge	8c32a6998a	Add pth files to manager	2022-12-26 14:29:25 +01:00
Eren Gölge	cf765cb3f2	Add ca and fa models	2022-12-26 14:29:10 +01:00
Eren Gölge	46b0ad37e7	Bump up to v0.10.0	2022-12-15 11:19:23 +01:00
Eren Gölge	a9167cf239	Fixup overflow (#2218 ) * Update overflow config * Pulling shuffle and drop_last from config * Print training stats for overflow	2022-12-15 00:56:48 +01:00
Eren Gölge	ecea43ec81	Adding pre-trained Overflow model (#2211 ) * Adding pretrained Overflow model * Stabilize HMM * Fixup model manager * Return `audio_unique_name` by default * Distribute max split size over datasets * Fixup eval_split_size * Make style	2022-12-14 16:55:48 +01:00
Edresson Casanova	3b1a28fa95	Add YourTTS VCTK recipe (#2198 ) * Add YourTTS VCTK recipe * Fix lint * Add compute_embeddings and resample_files functions to be able to reuse it * Add automatic download and speaker embedding computation for YourTTS VCTK recipe * Add parameter for eval metadata file on compute embeddings function	2022-12-12 16:14:25 +01:00
Shivam Mehta	3b8b105b0d	Adding OverFlow (#2183 ) * Adding encoder * currently modifying hmm * Adding hmm * Adding overflow * Adding overflow setting up flat start * Removing runs * adding normalization parameters * Fixing models on same device * Training overflow and plotting evaluations * Adding inference * At the end of epoch the test sentences are coming on cpu instead of gpu * Adding figures from model during training to monitor * reverting tacotron2 training recipe * fixing inference on gpu for test sentences on config * moving helpers and texts within overflows source code * renaming to overflow * moving loss to the model file * Fixing the rename * Model training but not plotting the test config sentences's audios * Formatting logs * Changing model name to camelcase * Fixing test log * Fixing plotting bug * Adding some tests * Adding more tests to overflow * Adding all tests for overflow * making changes to camel case in config * Adding information about parameters and docstring * removing compute_mel_statistics moved statistic computation to the model instead * Added overflow in readme * Adding more test cases, now it doesn't saves transition_p like tensor and can be dumped as json	2022-12-12 12:44:15 +01:00
p0p4k	2e153d54a8	Adding missing key to formatter (#2194 ) quick fix for #2156. added 'root_path' key.	2022-12-12 12:25:37 +01:00
Eren Gölge	1ddc484b49	Python API implementation (#2195 ) * Draft implementation * Fix style * Add api tests * Fix lint * Update docs * Update tests * Set env * Fixup * Fixup * Fix lint * Revert	2022-12-12 12:04:20 +01:00
Eren Gölge	fdeefcc612	Handle espeak 1.48.15 (#2203 )	2022-12-12 11:23:45 +01:00
Edresson Casanova	ee20e30958	Fix VITS multi-speaker voice conversion inference	2022-12-05 09:15:01 -03:00
Eren Gölge	9321b22203	Fix scheduler order	2022-12-05 12:26:15 +01:00
Eren Gölge	bc6120c330	[ci skip]Bump up to v0.9.0	2022-11-16 16:45:02 +01:00
logan hart	ff9b63d02a	Add neon models (#2140 ) * Add neon ljspeech vits model * Add neon german model * Update .models.json * Add neon spanish model * Add french model * Add Dutch model * Add Hungarian model * Add Greek model * Remove uneeded description * Update .models.json * Update .models.json * Handling neon models * Add all neon models * Update .models.json * Split zoo_tests * Update test names * Update model testing Co-authored-by: Eren Gölge <erogol@hotmail.com>	2022-11-16 16:12:39 +01:00
Eren Gölge	8cb1433e6e	Cache fsspec downloads (#2132 ) * Cache fsspec downloaded files * Use diff paths for test * Make fsspec caching optional * Decom GPU docker tests * Make progress bar optional for better CI log * Check path local	2022-11-09 22:12:48 +01:00
Eren Gölge	b686c09704	Fix #2062	2022-11-07 09:22:43 +01:00
freezerain	fcbfca869f	Fix back/forward slash in file path in mailabs formatter (#1938 ) * mailabs formatter: back/forward slash in file path fix * formatters.mailabs() path rework for Windows os * new formatter added "mailabs_win" * lint test fix commit * mailabs_win: removed, mailabs: "/" replaced with os.sep for windows compatibility * Black small style fix	2022-11-01 12:54:40 +01:00
Victor Shepardson	5307a2229b	Fix Capacitron training (#2086 )	2022-11-01 12:52:06 +01:00
Eren Gölge	dae79b0acd	Remove `/` prefix from the relative path (#2065 )	2022-10-10 13:32:27 +02:00
Eren Gölge	843fa6f3fa	Check num of columns in coqui format (#2066 ) * Check 4 colums in coqui format * Fix encoding * Fixup	2022-10-10 12:13:32 +02:00
Edresson Casanova	f3b947e706	Minors bug fixes on VITS/YourTTS and inference (#2054 ) * Set the right device to the speaker encoder * Bug fix on inference list_language_idxs parameter * Bug fix on speaker encoder resample audio transform	2022-10-06 22:23:54 +02:00
Eren Gölge	5f5d441ee5	Write non-speech files in a TXT (#2048 ) * Write non-speech files in a txt * Save 16-bit wav out of vad	2022-10-06 13:25:54 +02:00
Edresson Casanova	d6ad9a05b4	Fix colliding dataset cache file names (#1994 ) * Fix colliding dataset cache file names * Remove unused code	2022-09-21 12:54:07 +02:00
Edresson Casanova	3faccbda97	Fix dataset handling with the new embedding file keys (#1991 )	2022-09-19 23:44:14 +02:00
Eren Gölge	0a112f7841	Add metafile arg (#1977 )	2022-09-16 14:41:49 +02:00
Julian Weber	896e46d0e5	Fix vc (#1971 )	2022-09-16 12:01:26 +02:00
Eren Gölge	b95cf3363c	Prevent installing mecab-ko (#1967 )	2022-09-14 10:28:07 +02:00
Eren Gölge	9e5a469c64	d-vector handling (#1945 ) * Update BaseDatasetConfig - Add dataset_name - Chane name to formatter_name * Update compute_embedding - Allow entering dataset by args - Use released model by default - Use the new key format * Update loading * Update recipes * Update other dep code * Update tests * Fixup * Load multiple embedding files * Fix argument names in dep code * Update docs * Fix argument name * Fix linter	2022-09-13 14:10:33 +02:00
Edresson Casanova	371772c355	Replace pyworld by pyin (#1946 ) * Replace pyworld by pyin * Fix unit tests	2022-09-09 10:43:14 +02:00
happylittlecat	4546b4cbd8	Add espeak support for Chinese (#1905 ) * fix description * add espeak support for chinese * add espeak support for chinese	2022-09-08 12:32:41 +02:00
harmlessman	5abbe56642	Korean Phonemizer (#1822 ) * Update requirements.txt install jamo for korean * Update formatters.py add KSS formatter KSS is a korean single speech dataset (12hours) * Add files via upload add phonemizer for korean * Add files via upload add korean phonemizer * Update requirements.txt * change code style with `black` and `pylint` * reflecting pylint's Evaluation * reflecting pylint's Evaluation * reflecting pylint's Evaluation-2 * isort * edit about separator write test case and add 'nltk' for requirements.txt * add korean g2p (g2pkk) * isort * TTS/tts/utils/text/phonemizers/ko_kr_phonemizer.py:43:24: W0621: Redefining name 'text' from outer scope (line 58) (redefined-outer-name) TTS/tts/utils/text/korean/korean.py:28:8: R1705: Unnecessary "else" after "return" (no-else-return) * black	2022-09-08 12:06:07 +02:00
Edresson Casanova	159eeeef64	Fix find unique phonemes script (#1928 ) * Fix find unique phonemes script * Fix unit tests	2022-09-08 10:17:35 +02:00
KyuubiYoru	3b7dff568a	Fixes a race condition with multiple simultaneous get requests. (#1807 ) * Fixes a race condition with multiple simultaneous get requests. * Removed unused import * Removed unused threading import * Changed lock style to notation * make style Co-authored-by: WeberJulian <julian.weber@hotmail.fr>	2022-09-08 10:16:16 +02:00
Julian Weber	bb59718c03	Add capacitron v2 model (#1768 ) * Add capacitron v2 in .models.json * Put right commit hash	2022-09-08 09:43:56 +02:00
Edresson Casanova	096b35f639	Add VCTK speaker encoder recipe (#1912 )	2022-08-26 16:19:03 +02:00
Eren Gölge	e5430a6519	Add new DE Thorsten models (#1898 ) - Tacotron2-DDC - HifiGAN vocoder	2022-08-22 11:27:39 +02:00
Eren Gölge	8845f06fd9	Bump up to v0.8.0	2022-08-22 11:26:47 +02:00
Stanislav Kachnov	2c9f00a808	Fix tune wavegrad (#1844 ) * fix imports in tune_wavegrad * load_config returns Coqpit object instead None * set action (store true) for flag "--use_cuda"; start to tune if module is running as the main program * fix var order in the result of batch collating * make style * make style with black and isort	2022-08-22 09:55:32 +02:00
Eren Gölge	fcb0bb58ae	Handle when no batch sampler (#1882 )	2022-08-18 11:26:04 +02:00
Eren Gölge	7442bcefa5	Remove deprecated files (#1873 ) - samplers.py is moved - distribute.py is replaces by the 👟Trainer	2022-08-15 12:16:37 +02:00
Eren Gölge	4333492341	Fix BCE loss issue (#1872 ) * Fix BCE loss issue * Remove import	2022-08-15 11:27:21 +02:00
manmay nakhashi	e4db7c51b5	Update capacitron_layers.py (#1664 ) crashing because of dimension miss match at line no. 57 [batch, 256] vs [batch , 1, 512] enc_out = torch.cat([enc_out, speaker_embedding], dim=-1)	2022-08-15 11:08:50 +02:00
Eren Gölge	bfc63829ac	Implement bucketed weighted sampling for VITS (#1871 )	2022-08-15 11:08:11 +02:00
Eren Gölge	d46fbc240c	Introduce numpy and torch transforms (#1705 ) * Refactor audio processing functions * Add tests for numpy transforms * Fix imports * Fix imports2	2022-08-08 11:57:50 +02:00
manmay nakhashi	7fd9b89ebf	fix get_random_embeddings --> get_random_embedding (#1726 ) * fix get_random_embeddings --> get_random_embedding function typo leads to training crash, no such function * fix typo get_random_embedding	2022-08-07 14:06:03 +02:00
rbaraglia	75ac9e3f0c	Fix language flags generated by espeak-ng phonemizer (#1801 ) * fix language flags generated by espeak-ng phonemizer * Style * Updated language flag regex to consider all language codes alike	2022-08-07 13:57:40 +02:00

1652 Commits