Aarni Koskela
90991e89b4
Ruff autofix unused imports and import order
2023-12-13 14:56:41 +02:00
Aarni Koskela
72ac2bfa09
Get rid of some star imports
2023-12-13 14:56:41 +02:00
Eren Gölge
fa28f99f15
Update to v0.22.0
2023-12-12 16:10:46 +01:00
Eren Gölge
8c1a8b522b
Merge pull request #3405 from coqui-ai/studio_speakers
...
Add studio speakers to open source XTTS!
2023-12-12 16:10:09 +01:00
Enno Hermann
9f325b1f6c
fixup! Fix aux unit tests
2023-12-12 16:07:16 +01:00
Edresson Casanova
fc099218df
Fix aux unit tests
2023-12-12 16:07:16 +01:00
Eren Gölge
934b87bbd1
Merge pull request #3391 from aaron-lii/multi-gpu
...
support multiple GPU training for XTTS
2023-12-12 13:51:26 +01:00
Eren Gölge
8e6a7cbfbf
Update .models.json
2023-12-12 13:50:01 +01:00
Eren Gölge
4dc0722bbc
Update .models.json
2023-12-12 13:28:16 +01:00
WeberJulian
61b67ef16f
Fix read_json_with_comments
2023-12-11 23:58:52 +01:00
WeberJulian
d47b6df4e5
Make comments in .model.json valid
2023-12-11 23:35:27 +01:00
WeberJulian
b40750d1f5
Remove models that require app.coqui.ai
2023-12-11 23:17:54 +01:00
WeberJulian
5ab228dff2
Fix CI
2023-12-11 22:31:53 +01:00
WeberJulian
8c20a599d8
Remove coqui studio integration from TTS
2023-12-11 22:11:46 +01:00
WeberJulian
5cd750ac7e
Fix API and CI
2023-12-11 20:21:53 +01:00
WeberJulian
e3c9dab7a3
Make CLI work
2023-12-11 18:49:18 +01:00
WeberJulian
0a90359a42
rename speaker file
2023-12-11 18:48:49 +01:00
WeberJulian
a5c0d9780f
rename manager
2023-12-11 18:48:31 +01:00
WeberJulian
36143fee26
Add basic speaker manager
2023-12-11 15:25:46 +01:00
Frederico S. Oliveira
f9117918fe
Update .models.json
2023-12-11 10:47:31 -03:00
Frederico S. Oliveira
163f9a3fdf
Merge branch 'coqui-ai:dev' into dev
2023-12-11 10:04:07 -03:00
WeberJulian
0a136a8535
Download speaker file
2023-12-11 11:29:36 +01:00
Aaron-Li
b6e929696a
support multiple GPU training
2023-12-08 16:55:32 +08:00
Josh Meyer
759d9ab3ae
Print message for either commercial license or CPML
2023-12-07 13:54:48 +01:00
Eren Gölge
e49c512d99
Merge pull request #3351 from aaron-lii/chinese-puncs
...
fix pause problem of Chinese speech
2023-12-04 15:57:42 +01:00
Eren Gölge
2d02015978
Update to v0.21.3
2023-12-01 23:52:57 +01:00
Edresson Casanova
5f900f156a
Add XTTS Fine tuning gradio demo (#3296)
...
* Add XTTS FT demo data processing pipeline
* Add training and inference columns
* Uses tabs instead of columns
* Fix demo freezing issue
* Update demo
* Convert stereo to mono
* Bug fix on XTTS inference
* Update gradio demo
* Update gradio demo
* Update gradio demo
* Update gradio demo
* Add parameters to be able to set them on colab demo
* Add error messages
* Add intuitive error messages
* Update
* Add max_audio_length parameter
* Add XTTS fine-tuner docs
* Update XTTS finetuner docs
* Delete trainer to free memory
* Delete unused variables
* Add gc.collect()
* Update xtts.md
---------
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-12-01 23:52:23 +01:00
Aaron-Li
7b8808186a
fix pause problem of Chinese speech
2023-12-01 23:30:03 +08:00
Frederico S. Oliveira
bcd500fa7b
Fixing bug
...
Correction in training the Fastspeech/Fastspeech2/FastPitch/SpeedySpeech model using external speaker embedding.
2023-11-30 17:27:05 -03:00
Frederico S. Oliveira
a26e51b0b4
Merge branch 'coqui-ai:dev' into dev
2023-11-30 14:19:05 -03:00
Eren Gölge
6d1905c2b7
Update to v0.21.2
2023-11-30 13:05:10 +01:00
Enno Hermann
39321d02be
fix: correctly strip/restore initial punctuation (#3336)
...
* refactor(punctuation): remove orphan code for handling lone punctuation
The case of lone punctuation is already handled at the top of restore(). The
removed if statement would never be called and would in fact raise an
AttributeError because the _punc_index named tuple doesn't have the attribute
`mark`.
* refactor(punctuation): remove unused argument
* fix(punctuation): correctly handle initial punctuation
Stripping and restoring initial punctuation didn't work correctly because the
string-splitting caused an additional empty string to be inserted in the text
list (because `".A".split(".")` => `["", "A"]`). Now, an initial empty string is
skipped and relevant test cases are added.
Fixes #3333
2023-11-30 13:03:16 +01:00
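The string-splitting behaviour described in 39321d02be can be reproduced in isolation. The snippet below is only a sketch: `split_on_mark` is a hypothetical helper written for illustration, not the function used in the TTS punctuation module. It shows the extra empty string that a leading punctuation mark produces and one way of skipping it.

```python
from __future__ import annotations


def split_on_mark(text: str, mark: str = ".") -> list[str]:
    """Split text on mark and drop the empty string caused by a leading mark.

    Hypothetical helper for illustration; not part of the TTS code base.
    """
    parts = text.split(mark)
    # A leading mark yields an empty first element, e.g. ".A" -> ["", "A"].
    if parts and parts[0] == "":
        parts = parts[1:]
    return parts


print(".A".split("."))      # ['', 'A']
print(split_on_mark(".A"))  # ['A']
print(split_on_mark("A.B")) # ['A', 'B']
```

Only the initial empty string is skipped here, mirroring the wording of the commit message; empty strings elsewhere in the list are left alone.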
Frederico S. Oliveira
77c2155609
Merge pull request #1 from coqui-ai/dev
...
Update
2023-11-29 17:24:02 -03:00
Eren Gölge
bfbaffc84a
Fixup
2023-11-28 13:47:45 +01:00
Eren Gölge
b75e90ba85
Make text splitting optional
2023-11-27 14:53:11 +01:00
Eren Gölge
3b8894a3dd
Make style
2023-11-27 14:15:50 +01:00
Eren Gölge
2fd8cf3d94
Make xtts runnable by version names
2023-11-27 14:15:16 +01:00
Eren Gölge
11ec9f7471
Add hi in config defaults
2023-11-24 15:38:36 +01:00
Eren Gölge
00a870c26a
Update to v0.21.1
2023-11-24 15:15:44 +01:00
Eren Gölge
7e575068c9
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2023-11-24 15:15:19 +01:00
Eren Gölge
32065139e7
Simple text cleaner for "hi"
2023-11-24 15:14:34 +01:00
Eren Gölge
1542a50c3a
Update to v0.21.0
2023-11-24 14:37:05 +01:00
Eren Gölge
6dd43b0ce2
Update to XTTS v2.0.3
2023-11-24 14:36:04 +01:00
TITC
4d0f53d2ee
Misjudgment of `is_multi_lingual` When Loading Multilingual Model via `model_path` (#3273)
...
* load multilingual model by path
* use config to assert multi lingual or not
2023-11-24 12:28:31 +01:00
Enno Hermann
8c5227ed84
Fix tts_with_vc (#3275)
...
* Revert "fix for issue 3067"
This reverts commit 041b4b6723.
Fixes #3143. The original issue (#3067) was people trying to use
tts.tts_with_vc_to_file() with XTTS and was "fixed" in #3109. But XTTS has
integrated VC and you can just do tts.tts_to_file(..., speaker_wav="..."), there
is no point in passing it through FreeVC afterwards. So, reverting this commit
because it breaks tts.tts_with_vc_to_file() for any model that doesn't have
integrated VC, i.e. all models this method is meant for.
* fix: support multi-speaker models in tts_with_vc/tts_with_vc_to_file
* fix: only compute spk embeddings for models that support it
Fixes #1440. Passing a `speaker_wav` argument to regular Vits models failed
because they don't support voice cloning. Now that argument is simply ignored.
2023-11-24 12:26:37 +01:00
Enno Hermann
2af0220996
fix: don't pass quotes to espeak (#3286)
...
Previously, the text was wrapped in an additional set of quotes that was passed
to Espeak. This could result in different phonemization in certain edge cases and
caused the insertion of an initial separator "_" that had to be removed.
Compare:
$ espeak-ng -q -b 1 -v en-us --ipa=1 '"A"'
_ˈɐ
$ espeak-ng -q -b 1 -v en-us --ipa=1 'A'
ˈeɪ
Fixes #2619
2023-11-24 12:25:37 +01:00
Enno Hermann
4a2684be34
fix(bin.synthesize): more informative error for wrong --language argument (#3294)
...
In multilingual models, the target language is specified via the
`--language_idx` argument. However, the `tts` CLI also accepts a `--language`
argument for use with Coqui Studio, so it is easy to choose the wrong one,
resulting in the following confusing error at synthesis time:
```
AssertionError: ❗ Language None is not supported. Supported languages are
['en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar',
'zh-cn', 'hu', 'ko', 'ja']
```
This commit adds a better error message when `--language` is passed for a
non-studio model.
Fixes #3270, fixes #3291
2023-11-24 12:24:42 +01:00
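The check described in 4a2684be34 can be sketched with plain argparse. This is an illustration under assumptions, not the code from the `tts` CLI: the parser below only declares the two flags named in the commit message, and both `check_language_args` and its `is_studio_model` parameter are hypothetical names introduced here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Minimal stand-in parser with only the two flags from the commit message."""
    parser = argparse.ArgumentParser(prog="tts-sketch")
    parser.add_argument("--language", default=None,
                        help="Target language (Coqui Studio models only).")
    parser.add_argument("--language_idx", default=None,
                        help="Target language for multilingual models.")
    return parser


def check_language_args(args: argparse.Namespace, is_studio_model: bool) -> None:
    """Fail early with a clear message instead of an assertion at synthesis time.

    Hypothetical helper; how the real CLI detects a studio model is not shown here.
    """
    if args.language is not None and not is_studio_model:
        raise ValueError(
            "--language is only used with Coqui Studio models; "
            "use --language_idx to select the language of a multilingual model."
        )


args = build_parser().parse_args(["--language", "en"])
try:
    check_language_args(args, is_studio_model=False)
except ValueError as err:
    print(err)
```

Validating right after argument parsing keeps the failure close to the user's mistake, which is the point the commit message makes about the confusing `AssertionError`.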
Tessa Painter
64f391b583
Made the tqdm `progress_bar` objects of static download methods a static class variable (#3297)
2023-11-24 12:23:59 +01:00
Eren Gölge
b47d9c6e36
Merge pull request #3243 from idiap/checkpoints
...
Remove duplicate/unused code
2023-11-22 23:52:06 +01:00
Eren Gölge
c011ab7455
Update to v0.20.6
2023-11-17 15:16:32 +01:00
Eren Gölge
52cb1e2f68
Update model hash for v2.0.2
2023-11-17 15:16:32 +01:00
Edresson Casanova
6075fa208c
Ensures that only GPT model is in training mode during XTTS GPT training (#3241)
...
* Ensures that only GPT model is in training mode during training
* Fix parallel wavegan unit test
2023-11-17 15:15:22 +01:00
Eren Gölge
a3279f9294
Make style
2023-11-17 15:15:22 +01:00
Eren Gölge
f21067a84a
Make k_diffusion optional
2023-11-17 15:15:21 +01:00
Enno Hermann
0fb0d67de7
refactor: use save_checkpoint()/save_best_model() from Trainer
2023-11-17 01:18:23 +01:00
Enno Hermann
96678c7ba2
refactor: use copy_model_files() from Trainer
2023-11-17 01:18:23 +01:00
Enno Hermann
5119e651a1
chore(utils.io): remove unused code
...
These are all available in Trainer.
2023-11-17 01:18:23 +01:00
Enno Hermann
39fe38bda4
refactor: use save_fsspec() from Trainer
2023-11-17 01:18:23 +01:00
Enno Hermann
fdf0c8b10a
chore(encoder): remove unused code
2023-11-17 01:18:23 +01:00
Eren Gölge
7e4375da2b
Update to v0.20.6
2023-11-16 17:52:13 +01:00
Julian Weber
fbc18b8c34
Fix zh bug (#3238)
2023-11-16 17:51:37 +01:00
Julian Weber
675f983550
Add sentence splitting (#3227)
...
* Add sentence splitting
* update requirements
* update default args v2
* Add spanish
* Fix return gpt_latents
* Update requirements
* Fix requirements
2023-11-16 11:01:11 +01:00
Enno Hermann
3c2d5a9e03
Remove duplicate AudioProcessor code and fix ExtractTTSpectrogram.ipynb (#3230)
...
* chore: remove unused argument
* refactor(audio.processor): remove duplicate stft+griffin_lim
* chore(audio.processor): remove unused compute_stft_paddings
Same function available in numpy_transforms
* refactor(audio.processor): remove duplicate db_to_amp
* refactor(audio.processor): remove duplicate amp_to_db
* refactor(audio.processor): remove duplicate linear_to_mel
* refactor(audio.processor): remove duplicate mel_to_linear
* refactor(audio.processor): remove duplicate build_mel_basis
* refactor(audio.processor): remove duplicate stft_parameters
* refactor(audio.processor): use pre-/deemphasis from numpy_transforms
* refactor(audio.processor): use rms_volume_norm from numpy_transforms
* chore(audio.processor): remove duplicate assert
Already checked in numpy_transforms.compute_f0
* refactor(audio.processor): use find_endpoint from numpy_transforms
* refactor(audio.processor): use trim_silence from numpy_transforms
* refactor(audio.processor): use volume_norm from numpy_transforms
* refactor(audio.processor): use load_wav from numpy_transforms
* fix(bin.extract_tts_spectrograms): set quantization bits
* fix(ExtractTTSpectrogram.ipynb): adapt to current TTS code
Fixes #2447, #2574
* refactor(audio.processor): remove duplicate quantization methods
2023-11-16 10:57:06 +01:00
Eren Gölge
88630c60e5
Update to v0.20.5
2023-11-15 14:02:51 +01:00
Edresson Casanova
73a5bd08c0
Fix XTTS GPT padding and inference issues (#3216)
...
* Fix end artifact for fine tuning models
* Bug fix on zh-cn inference
* Remove unused code
2023-11-15 14:02:05 +01:00
Julian Weber
04901fb2e4
Add speed control for inference (#3214)
...
* Add speed control for inference
* Fix XTTS tests
* Add speed control tests
2023-11-14 16:07:17 +01:00
Eren Gölge
d96f3885d5
Update to v0.20.4
2023-11-13 17:07:25 +01:00
Eren Gölge
ac3df409a6
Merge pull request #3208 from coqui-ai/fix_max_mel_len
...
fix max generation length for XTTS
2023-11-13 14:32:56 +01:00
Eren Gölge
92fa988aec
Fixup
2023-11-13 13:44:06 +01:00
WeberJulian
b85536b23f
fix max generation length
2023-11-13 13:18:45 +01:00
Eren Gölge
b2682d39c5
Make style
2023-11-13 13:01:01 +01:00
Eren Gölge
a16360af85
Implement chunking gpt_cond
2023-11-13 13:00:08 +01:00
Eren Gölge
6f1cba2f81
Update to v0.20.3
2023-11-09 17:41:37 +01:00
Enno Hermann
3b1e7038bc
fix(formatters): set missing root_path attribute (#3182)
...
Fixes #2778
2023-11-09 16:49:52 +01:00
Aarni Koskela
a8e9163fb3
xtts/tokenizer: merge duplicate implementations of preprocess_text (#3170)
...
This was found via ruff:
> F811 Redefinition of unused `preprocess_text` from line 570
2023-11-09 16:32:12 +01:00
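For context on the F811 warning quoted in a8e9163fb3: when a module defines the same function name twice, the later definition silently replaces the earlier one. The sketch below uses made-up function bodies purely to show that effect; it is not the tokenizer code.

```python
def preprocess_text(txt: str) -> str:
    """First definition: lowercases the text (illustrative body only)."""
    return txt.lower()


def preprocess_text(txt: str) -> str:
    """Second definition: rebinds the name, so the first one is never used."""
    return txt.strip()


# Only the second definition is reachable through the module-level name.
print(preprocess_text("  Hello "))  # Hello
```

Merging the two implementations, as the commit does, removes the ambiguity about which behaviour callers actually get.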
Matthew Boakes
1b9c400bca
PyTorch 2.1 Updates (Weight Norm and TorchAudio I/O) (#3176)
...
* Replaced PyTorch weight_norm With parametrizations.weight_norm
* TorchAudio: Migrating The I/O Functions To Use The Dispatcher Mechanism
* Corrected Code Style
---------
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-11-09 16:31:03 +01:00
Gorkem
66a1e248d0
torchaudio should use proper backend to load audio (#3179)
2023-11-09 16:28:39 +01:00
Eren Gölge
46d9c27212
Update to v0.20.2
2023-11-08 16:07:56 +01:00
Julian Weber
03ad90135b
Add lang code in XTTS doc (#3158)
...
* Add lang code in XTTS doc
* Remove unused config and args
* update docs
* woops
2023-11-08 13:47:33 +01:00
Gorkem
78a596618a
Fix for exception on streaming if last chunk empty (#3160)
2023-11-08 11:32:02 +01:00
Enno Hermann
99edd6daa3
Fix ModelManager.list_models() (#3128)
...
* fix(utils.manage): remove hard-coded model_type variable
* refactor(utils.manage): address lint issues, fix typos
Addressed the following:
TTS/utils/manage.py:307:12: R1705: Unnecessary "else" after "return" (no-else-return)
TTS/utils/manage.py:308:21: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
TTS/utils/manage.py:299:4: R1710: Either all return statements in a function should return an expression, or none of them should. (inconsistent-return-statements)
TTS/utils/manage.py:299:4: R0201: Method could be a function (no-self-use)
TTS/utils/manage.py:314:4: R0201: Method could be a function (no-self-use)
2023-11-08 11:29:01 +01:00
Eren Gölge
77b18126c7
Merge pull request #3126 from akx/freevc-config-module
...
Move FreeVCConfig to TTS.vc.configs (like all other config classes)
2023-11-08 11:24:47 +01:00
Eren Gölge
cc6e9fcaa7
Fix #3153 (#3169)
2023-11-08 11:13:58 +01:00
Eren Gölge
a24ebcd8a6
Fix coqui api (#3168)
2023-11-08 10:51:23 +01:00
Julian Weber
ce1a39a9a4
Add char limit warn (#3130)
...
* Add char limit warning
* Adding v2 langs
* cached_property for cutlet
* Fix import
2023-11-08 10:24:23 +01:00
Eren Gölge
f846a9f300
Update to v0.20.1
2023-11-07 14:17:36 +01:00
Edresson Casanova
cbdbc44e0f
Fix XTTS v2.0 training recipe (#3154)
...
* Fix XTTS v2.0 training recipe
* Update XTTS v2 model hash
2023-11-07 14:16:44 +01:00
Edresson Casanova
5f9ab6cfaa
Fix style
...
Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-11-06 19:22:34 -03:00
Edresson Casanova
2470599d18
Drop XTTS v1
2023-11-06 19:12:04 -03:00
Edresson Casanova
13243df526
Update XTTS v1.1 files
2023-11-06 19:10:21 -03:00
Edresson Casanova
09fb317e6d
Remove unused code
2023-11-06 17:36:32 -03:00
Edresson Casanova
b146de4ce8
Bug fix on XTTS v2.0 Trainer
2023-11-06 20:26:01 +01:00
Edresson Casanova
1b6f8d0e46
Update unit tests and recipes
2023-11-06 20:25:06 +01:00
Edresson Casanova
72b2bac0f8
Load reference in 24khz to avoid issues with multiple sr references
2023-11-06 20:25:06 +01:00
Edresson Casanova
00294ffdf6
Update XTTS docs
2023-11-06 20:24:06 +01:00
Edresson Casanova
459ad70dc8
Add support for multiple speaker references on XTTS inference
2023-11-06 20:22:35 +01:00
Eren Gölge
f0cb19ecca
Drop diffusion from XTTS (#3150)
...
* Drop diffusion for XTTS
* Make style
* Drop diffusion deps in code
* Restore thrashed
2023-11-06 20:15:49 +01:00
Eren Gölge
5d418bb84a
Update docs
2023-11-06 18:48:41 +01:00
Eren Gölge
9bbf6eb8dd
Drop use_ne_hifigan
2023-11-06 18:43:38 +01:00
Eren Gölge
9d54bd7655
Fixup XTTS
2023-11-06 18:13:58 +01:00