David Martin Rius
f6a23c1d8a
Merge remote-tracking branch 'subuday/matcha_tts' into dev
2024-03-05 22:13:35 +01:00
David Martin Rius
3db0dec08a
Add 2 functions to verify that any spaCy language can be instantiated. For now, the only one that needs special packages is Korean, so all languages work except Korean
2024-02-28 20:23:53 +01:00
David Martin Rius
8aeced16fc
Import the spaCy language class dynamically, with an English fallback on import error
2024-02-28 19:58:25 +01:00
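A minimal sketch of the dynamic-import-with-fallback pattern named in the commit above. The helper `load_class` and its arguments are hypothetical and are not taken from the repository; the real change presumably resolves a spaCy language class and falls back to the English one, but that code is not shown in this log.

```python
import importlib


def load_class(module_name, class_name, fallback):
    """Return `class_name` from `module_name`, or `fallback` if the import fails.

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its dependencies) is not installed.
        return fallback
    return getattr(module, class_name)


# Standard-library stand-ins keep the example runnable without spaCy.
found = load_class("collections", "OrderedDict", dict)
missing = load_class("no_such_module_for_demo", "Anything", dict)
```

Here `found` is the class that was requested, while `missing` is the fallback because the module does not exist.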
Subuday
f15230bb67
Add transformer block to UNet
2024-02-15 18:52:42 +00:00
Subuday
5fd7ea93ea
Add upsampling and downsampling to UNet
2024-02-15 13:24:30 +00:00
Subuday
8676ab30d9
Fix appending a new block to input_blocks
2024-02-15 08:55:52 +00:00
Subuday
fd6c0afbbf
Add ResNetBlock1D to UNet
2024-02-15 08:40:04 +00:00
Subuday
0f7a7edb9b
Add conv block to UNet
2024-02-14 21:21:07 +00:00
Subuday
b5467b8051
Add UNet backbone
2024-02-12 21:44:29 +00:00
Subuday
7314b1cbec
Implement model forward
2024-02-12 19:39:22 +00:00
Subuday
8c4d0142b7
Add MatchaTTS backbone
2024-02-11 21:02:20 +00:00
Edresson Casanova
5dcc16d193
Bug fix in MP3 and FLAC compute length on TTSDataset ( #3092 )
...
* Bug Fix on XTTS load
* Bug fix in MP3 length on TTSDataset
* Update TTS/tts/datasets/dataset.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
* Uses mutagen for all audio formats
* Add dataloader test with all supported audio formats
* Use mutagen.File
* Update
* Fix aux unit tests
* Bug fix on unit tests
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-12-27 13:23:43 -03:00
Eren Gölge
8c1a8b522b
Merge pull request #3405 from coqui-ai/studio_speakers
...
Add studio speakers to open source XTTS!
2023-12-12 16:10:09 +01:00
Eren Gölge
934b87bbd1
Merge pull request #3391 from aaron-lii/multi-gpu
...
support multiple GPU training for XTTS
2023-12-12 13:51:26 +01:00
WeberJulian
5cd750ac7e
Fix API and CI
2023-12-11 20:21:53 +01:00
WeberJulian
e3c9dab7a3
Make CLI work
2023-12-11 18:49:18 +01:00
WeberJulian
a5c0d9780f
rename manager
2023-12-11 18:48:31 +01:00
WeberJulian
36143fee26
Add basic speaker manager
2023-12-11 15:25:46 +01:00
Frederico S. Oliveira
163f9a3fdf
Merge branch 'coqui-ai:dev' into dev
2023-12-11 10:04:07 -03:00
Aaron-Li
b6e929696a
support multiple GPU training
2023-12-08 16:55:32 +08:00
Eren Gölge
e49c512d99
Merge pull request #3351 from aaron-lii/chinese-puncs
...
fix pause problem of Chinese speech
2023-12-04 15:57:42 +01:00
Edresson Casanova
5f900f156a
Add XTTS Fine tuning gradio demo ( #3296 )
...
* Add XTTS FT demo data processing pipeline
* Add training and inference columns
* Uses tabs instead of columns
* Fix demo freezing issue
* Update demo
* Convert stereo to mono
* Bug fix on XTTS inference
* Update gradio demo
* Update gradio demo
* Update gradio demo
* Update gradio demo
* Add parameters so they can be set in the Colab demo
* Add error messages
* Add intuitive error messages
* Update
* Add max_audio_length parameter
* Add XTTS fine-tuner docs
* Update XTTS finetuner docs
* Delete trainer to free memory
* Delete unused variables
* Add gc.collect()
* Update xtts.md
---------
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-12-01 23:52:23 +01:00
Aaron-Li
7b8808186a
fix pause problem of Chinese speech
2023-12-01 23:30:03 +08:00
Frederico S. Oliveira
bcd500fa7b
Fixing bug
...
Fix training of the Fastspeech/Fastspeech2/FastPitch/SpeedySpeech models when using external speaker embeddings.
2023-11-30 17:27:05 -03:00
Enno Hermann
39321d02be
fix: correctly strip/restore initial punctuation ( #3336 )
...
* refactor(punctuation): remove orphan code for handling lone punctuation
The case of lone punctuation is already handled at the top of restore(). The
removed if statement would never be called and would in fact raise an
AttributeError because the _punc_index named tuple doesn't have the attribute
`mark`.
* refactor(punctuation): remove unused argument
* fix(punctuation): correctly handle initial punctuation
Stripping and restoring initial punctuation didn't work correctly because the
string-splitting caused an additional empty string to be inserted in the text
list (because `".A".split(".")` => `["", "A"]`). Now, an initial empty string is
skipped and relevant test cases are added.
Fixes #3333
2023-11-30 13:03:16 +01:00
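The string-splitting behaviour described in the commit above can be reproduced in a few lines. This is only a sketch of the idea: `split_on_mark` is a hypothetical helper, not the function changed in the repository.

```python
from typing import List


def split_on_mark(text: str, mark: str = ".") -> List[str]:
    """Split `text` on `mark`, skipping the empty string that a leading mark produces."""
    pieces = text.split(mark)
    # ".A".split(".") gives ["", "A"]: the leading "" comes from the split itself
    # and is not real text, so it is dropped here.
    if pieces and pieces[0] == "":
        pieces = pieces[1:]
    return pieces


raw = ".A".split(".")
cleaned = split_on_mark(".A")
```

With this input, `raw` still contains the extra empty string, while `cleaned` holds only the text segment.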
Eren Gölge
3b8894a3dd
Make style
2023-11-27 14:15:50 +01:00
Eren Gölge
11ec9f7471
Add hi in config defaults
2023-11-24 15:38:36 +01:00
Eren Gölge
32065139e7
Simple text cleaner for "hi"
2023-11-24 15:14:34 +01:00
Enno Hermann
2af0220996
fix: don't pass quotes to espeak ( #3286 )
...
Previously, the text was wrapped in an additional set of quotes that was passed
to Espeak. This could result in different phonemization in certain edge cases and
caused the insertion of an initial separator "_" that had to be removed.
Compare:
$ espeak-ng -q -b 1 -v en-us --ipa=1 '"A"'
_ˈɐ
$ espeak-ng -q -b 1 -v en-us --ipa=1 'A'
ˈeɪ
Fixes #2619
2023-11-24 12:25:37 +01:00
Edresson Casanova
11283fce07
Ensures that only GPT model is in training mode during XTTS GPT training ( #3241 )
...
* Ensures that only GPT model is in training mode during training
* Fix parallel wavegan unit test
2023-11-17 15:13:46 +01:00
Eren Gölge
44880f09ed
Make style
2023-11-17 13:43:34 +01:00
Eren Gölge
26efdf6ee7
Make k_diffusion optional
2023-11-17 13:42:33 +01:00
Julian Weber
fbc18b8c34
Fix zh bug ( #3238 )
2023-11-16 17:51:37 +01:00
Julian Weber
675f983550
Add sentence splitting ( #3227 )
...
* Add sentence splitting
* update requirements
* update default args v2
* Add Spanish
* Fix return gpt_latents
* Update requirements
* Fix requirements
2023-11-16 11:01:11 +01:00
Edresson Casanova
73a5bd08c0
Fix XTTS GPT padding and inference issues ( #3216 )
...
* Fix end artifact for fine tuning models
* Bug fix on zh-cn inference
* Remove unused code
2023-11-15 14:02:05 +01:00
Julian Weber
04901fb2e4
Add speed control for inference ( #3214 )
...
* Add speed control for inference
* Fix XTTS tests
* Add speed control tests
2023-11-14 16:07:17 +01:00
Eren Gölge
ac3df409a6
Merge pull request #3208 from coqui-ai/fix_max_mel_len
...
fix max generation length for XTTS
2023-11-13 14:32:56 +01:00
Eren Gölge
92fa988aec
Fixup
2023-11-13 13:44:06 +01:00
WeberJulian
b85536b23f
fix max generation length
2023-11-13 13:18:45 +01:00
Eren Gölge
b2682d39c5
Make style
2023-11-13 13:01:01 +01:00
Eren Gölge
a16360af85
Implement chunking gpt_cond
2023-11-13 13:00:08 +01:00
Enno Hermann
3b1e7038bc
fix(formatters): set missing root_path attribute ( #3182 )
...
Fixes #2778
2023-11-09 16:49:52 +01:00
Aarni Koskela
a8e9163fb3
xtts/tokenizer: merge duplicate implementations of preprocess_text ( #3170 )
...
This was found via ruff:
> F811 Redefinition of unused `preprocess_text` from line 570
2023-11-09 16:32:12 +01:00
Matthew Boakes
1b9c400bca
PyTorch 2.1 Updates (Weight Norm and TorchAudio I/O) ( #3176 )
...
* Replaced PyTorch weight_norm With parametrizations.weight_norm
* TorchAudio: Migrating The I/O Functions To Use The Dispatcher Mechanism
* Corrected Code Style
---------
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-11-09 16:31:03 +01:00
Gorkem
66a1e248d0
torchaudio should use proper backend to load audio ( #3179 )
2023-11-09 16:28:39 +01:00
Julian Weber
03ad90135b
Add lang code in XTTS doc ( #3158 )
...
* Add lang code in XTTS doc
* Remove unused config and args
* update docs
* woops
2023-11-08 13:47:33 +01:00
Gorkem
78a596618a
Fix for exception on streaming if last chunk empty ( #3160 )
2023-11-08 11:32:02 +01:00
Julian Weber
ce1a39a9a4
Add char limit warn ( #3130 )
...
* Add char limit warning
* Adding v2 langs
* cached_property for cutlet
* Fix import
2023-11-08 10:24:23 +01:00
Edresson Casanova
5f9ab6cfaa
Fix style
...
Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-11-06 19:22:34 -03:00
Edresson Casanova
09fb317e6d
Remove unused code
2023-11-06 17:36:32 -03:00