Commit Graph

2144 Commits

Author SHA1 Message Date
Enno Hermann 4e183c61df fix(api): handle missing attribute in is_multilingual 2024-03-06 22:41:32 +01:00
Enno Hermann e05243c4c8 refactor: read/write csv files with standard library 2024-03-06 16:18:09 +01:00
Enno Hermann 24298da5fc
Merge pull request #1 from eginhard/lint-overhaul
Lint overhaul (pylint to ruff)
2024-03-06 16:10:26 +01:00
Enno Hermann 04d8d4b09a chore: remove unused imports 2024-03-06 13:27:43 +01:00
Nick Potafiy dbf1a08a0d
Update generic_utils.py (#3561)
Handles cases when git branch produces no output or invalid output. Right now, it just crashes with `StopIteration`
2024-02-10 11:20:58 -03:00
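The `StopIteration` crash mentioned in the commit above is what an unguarded `next()` raises when no line of the `git branch` output matches. The sketch below illustrates one way to guard against it; it is not the code from `generic_utils.py`, and `current_branch` and the `"unknown"` fallback are hypothetical names chosen for the example.

```python
def current_branch(git_branch_output):
    """Hypothetical helper: return the branch marked with '*' in `git branch` output."""
    lines = git_branch_output.splitlines()
    marked = (line for line in lines if line.startswith("*"))
    # next() with a default returns None instead of raising StopIteration
    # when the output is empty or contains no '*' line.
    current = next(marked, None)
    if current is None:
        return "unknown"  # assumed fallback value, for illustration only
    return current.replace("* ", "", 1).strip()


print(current_branch("* main\n  dev"))
print(current_branch(""))
```

Passing a default to `next()` keeps the happy path unchanged while turning the empty or invalid case into an ordinary return value.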
wangjie b184e9f0fe fix chinese pinyin phonemes 2024-01-12 09:11:56 +08:00
Ivan Peevski 08e00e4b49
Fix bark model 2024-01-08 14:45:04 +10:30
Edresson Casanova 5dcc16d193
Bug fix in MP3 and FLAC compute length on TTSDataset (#3092)
* Bug Fix on XTTS load

* Bug fix in MP3 length on TTSDataset

* Update TTS/tts/datasets/dataset.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Uses mutagen for all audio formats

* Add dataloader test with all supported audio formats

* Use mutagen.File

* Update

* Fix aux unit tests

* Bug fix on unit tests

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-12-27 13:23:43 -03:00
Eren Gölge 55c7063724
Merge pull request #3423 from idiap/fix-aux-tests
Fix CI (save best model after 0 steps in tests)
2023-12-14 18:00:30 +01:00
Aarni Koskela d6ea806469 Run `make style` 2023-12-13 14:56:41 +02:00
Aarni Koskela bd172dabbf xtts/stream_generator: remove duplicate import + code 2023-12-13 14:56:41 +02:00
Aarni Koskela 32abb1a7c4 xtts/perceiver_encoder: Delete duplicate exists() 2023-12-13 14:56:41 +02:00
Aarni Koskela aa549e9028 Fix trailing whitespace 2023-12-13 14:56:41 +02:00
Aarni Koskela 4584ef6580 Simplify branch in TTS/bin/synthesize.py 2023-12-13 14:56:41 +02:00
Aarni Koskela 08fa5d4098 Fix implicitly concatenated docstring 2023-12-13 14:56:41 +02:00
Aarni Koskela 33b69c6c09 Add some noqa directives (for now) 2023-12-13 14:56:41 +02:00
Aarni Koskela 00f8f4892a Ruff autofix unnecessary passes 2023-12-13 14:56:41 +02:00
Aarni Koskela bc2cf296a3 Ruff autofix PLW3301 2023-12-13 14:56:41 +02:00
Aarni Koskela 64bb41f4fa Ruff autofix C41 2023-12-13 14:56:41 +02:00
Aarni Koskela 449820ec7d Ruff autofix E71* 2023-12-13 14:56:41 +02:00
Aarni Koskela 90991e89b4 Ruff autofix unused imports and import order 2023-12-13 14:56:41 +02:00
Aarni Koskela 72ac2bfa09 Get rid of some star imports 2023-12-13 14:56:41 +02:00
Eren Gölge fa28f99f15
Update to v0.22.0 2023-12-12 16:10:46 +01:00
Eren Gölge 8c1a8b522b
Merge pull request #3405 from coqui-ai/studio_speakers
Add studio speakers to open source XTTS!
2023-12-12 16:10:09 +01:00
Enno Hermann 9f325b1f6c fixup! Fix aux unit tests 2023-12-12 16:07:16 +01:00
Edresson Casanova fc099218df Fix aux unit tests 2023-12-12 16:07:16 +01:00
Eren Gölge 934b87bbd1
Merge pull request #3391 from aaron-lii/multi-gpu
support multiple GPU training for XTTS
2023-12-12 13:51:26 +01:00
Eren Gölge 8e6a7cbfbf
Update .models.json 2023-12-12 13:50:01 +01:00
Eren Gölge 4dc0722bbc
Update .models.json 2023-12-12 13:28:16 +01:00
WeberJulian 61b67ef16f Fix read_json_with_comments 2023-12-11 23:58:52 +01:00
WeberJulian d47b6df4e5 Make comments in .model.json valid 2023-12-11 23:35:27 +01:00
WeberJulian b40750d1f5 Remove models that require app.coqui.ai 2023-12-11 23:17:54 +01:00
WeberJulian 5ab228dff2 Fix CI 2023-12-11 22:31:53 +01:00
WeberJulian 8c20a599d8 Remove coqui studio integration from TTS 2023-12-11 22:11:46 +01:00
WeberJulian 5cd750ac7e Fix API and CI 2023-12-11 20:21:53 +01:00
WeberJulian e3c9dab7a3 Make CLI work 2023-12-11 18:49:18 +01:00
WeberJulian 0a90359a42 rename speaker file 2023-12-11 18:48:49 +01:00
WeberJulian a5c0d9780f rename manager 2023-12-11 18:48:31 +01:00
WeberJulian 36143fee26 Add basic speaker manager 2023-12-11 15:25:46 +01:00
Frederico S. Oliveira f9117918fe
Update .models.json 2023-12-11 10:47:31 -03:00
Frederico S. Oliveira 163f9a3fdf
Merge branch 'coqui-ai:dev' into dev 2023-12-11 10:04:07 -03:00
WeberJulian 0a136a8535 Download speaker file 2023-12-11 11:29:36 +01:00
Aaron-Li b6e929696a support multiple GPU training 2023-12-08 16:55:32 +08:00
Josh Meyer 759d9ab3ae
Print message for either commercial license or CPML 2023-12-07 13:54:48 +01:00
Eren Gölge e49c512d99
Merge pull request #3351 from aaron-lii/chinese-puncs
fix pause problem of Chinese speech
2023-12-04 15:57:42 +01:00
Eren Gölge 2d02015978
Update to v0.21.3 2023-12-01 23:52:57 +01:00
Edresson Casanova 5f900f156a
Add XTTS Fine tuning gradio demo (#3296)
* Add XTTS FT demo data processing pipeline

* Add training and inference columns

* Uses tabs instead of columns

* Fix demo freezing issue

* Update demo

* Convert stereo to mono

* Bug fix on XTTS inference

* Update gradio demo

* Update gradio demo

* Update gradio demo

* Update gradio demo

* Add parameters to be able to set them on colab demo

* Add error messages

* Add intuitive error messages

* Update

* Add max_audio_length parameter

* Add XTTS fine-tuner docs

* Update XTTS finetuner docs

* Delete trainer to free memory

* Delete unused variables

* Add gc.collect()

* Update xtts.md

---------

Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-12-01 23:52:23 +01:00
Aaron-Li 7b8808186a fix pause problem of Chinese speech 2023-12-01 23:30:03 +08:00
Frederico S. Oliveira bcd500fa7b Fixing bug
Correction in training the Fastspeech/Fastspeech2/FastPitch/SpeedySpeech model using external speaker embedding.
2023-11-30 17:27:05 -03:00
Frederico S. Oliveira a26e51b0b4
Merge branch 'coqui-ai:dev' into dev 2023-11-30 14:19:05 -03:00
Eren Gölge 6d1905c2b7
Update to v0.21.2 2023-11-30 13:05:10 +01:00
Enno Hermann 39321d02be
fix: correctly strip/restore initial punctuation (#3336)
* refactor(punctuation): remove orphan code for handling lone punctuation

The case of lone punctuation is already handled at the top of restore(). The
removed if statement would never be called and would in fact raise an
AttributeError because the _punc_index named tuple doesn't have the attribute
`mark`.

* refactor(punctuation): remove unused argument

* fix(punctuation): correctly handle initial punctuation

Stripping and restoring initial punctuation didn't work correctly because the
string-splitting caused an additional empty string to be inserted in the text
list (because `".A".split(".")` => `["", "A"]`). Now, an initial empty string is
skipped and relevant test cases are added.

Fixes #3333
2023-11-30 13:03:16 +01:00
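The empty-string behaviour described in the commit above can be reproduced in a few lines. This is a minimal sketch, not the project's punctuation code: `split_on_punctuation` is a hypothetical helper that only shows why a leading mark produces an empty first element and how skipping that element avoids the problem.

```python
import re


def split_on_punctuation(text, marks=".,!?"):
    """Hypothetical helper: split text on punctuation, dropping a leading empty piece."""
    pieces = re.split(f"[{re.escape(marks)}]", text)
    # A leading punctuation mark yields an empty first element; skip it.
    if pieces and pieces[0] == "":
        pieces = pieces[1:]
    return pieces


print(".A".split("."))             # ['', 'A']
print(split_on_punctuation(".A"))  # ['A']
```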
Frederico S. Oliveira 77c2155609
Merge pull request #1 from coqui-ai/dev
Update
2023-11-29 17:24:02 -03:00
Eren Gölge bfbaffc84a Fixup 2023-11-28 13:47:45 +01:00
Eren Gölge b75e90ba85 Make text splitting optional 2023-11-27 14:53:11 +01:00
Eren Gölge 3b8894a3dd Make style 2023-11-27 14:15:50 +01:00
Eren Gölge 2fd8cf3d94 Make xtts runnable by version names 2023-11-27 14:15:16 +01:00
Eren Gölge 11ec9f7471 Add hi in config defaults 2023-11-24 15:38:36 +01:00
Eren Gölge 00a870c26a Update to v0.21.1 2023-11-24 15:15:44 +01:00
Eren Gölge 7e575068c9 Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev 2023-11-24 15:15:19 +01:00
Eren Gölge 32065139e7 Simple text cleaner for "hi" 2023-11-24 15:14:34 +01:00
Eren Gölge 1542a50c3a
Update to v0.21.0 2023-11-24 14:37:05 +01:00
Eren Gölge 6dd43b0ce2 Update to XTTS v2.0.3 2023-11-24 14:36:04 +01:00
TITC 4d0f53d2ee
Misjudgment of `is_multi_lingual` When Loading Multilingual Model via `model_path` (#3273)
* load multilingual model by path

* use config to assert multi lingual or not
2023-11-24 12:28:31 +01:00
Enno Hermann 8c5227ed84
Fix tts_with_vc (#3275)
* Revert "fix for issue 3067"

This reverts commit 041b4b6723.

Fixes #3143. The original issue (#3067) was people trying to use
tts.tts_with_vc_to_file() with XTTS and was "fixed" in #3109. But XTTS has
integrated VC and you can just do tts.tts_to_file(..., speaker_wav="..."), there
is no point in passing it through FreeVC afterwards. So, reverting this commit
because it breaks tts.tts_with_vc_to_file() for any model that doesn't have
integrated VC, i.e. all models this method is meant for.

* fix: support multi-speaker models in tts_with_vc/tts_with_vc_to_file

* fix: only compute spk embeddings for models that support it

Fixes #1440. Passing a `speaker_wav` argument to regular Vits models failed
because they don't support voice cloning. Now that argument is simply ignored.
2023-11-24 12:26:37 +01:00
Enno Hermann 2af0220996
fix: don't pass quotes to espeak (#3286)
Previously, the text was wrapped in an additional set of quotes that was passed
to Espeak. This could result in different phonemization in certain edge cases and
caused the insertion of an initial separator "_" that had to be removed.
Compare:
$ espeak-ng -q -b 1 -v en-us --ipa=1 '"A"'
_ˈɐ
$ espeak-ng -q -b 1 -v en-us --ipa=1 'A'
ˈeɪ

Fixes #2619
2023-11-24 12:25:37 +01:00
Enno Hermann 4a2684be34
fix(bin.synthesize): more informative error for wrong --language argument (#3294)
In multilingual models, the target language is specified via the
`--language_idx` argument. However, the `tts` CLI also accepts a `--language`
argument for use with Coqui Studio, so it is easy to choose the wrong one,
resulting in the following confusing error at synthesis time:

```
AssertionError:   Language None is not supported. Supported languages are
['en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar',
'zh-cn', 'hu', 'ko', 'ja']
```

This commit adds a better error message when `--language` is passed for a
non-studio model.

Fixes #3270, fixes #3291
2023-11-24 12:24:42 +01:00
Tessa Painter 64f391b583
Made the tqdm `progress_bar` objects of static download methods a static class variable (#3297) 2023-11-24 12:23:59 +01:00
Eren Gölge b47d9c6e36
Merge pull request #3243 from idiap/checkpoints
Remove duplicate/unused code
2023-11-22 23:52:06 +01:00
Eren Gölge c011ab7455 Update to v0.20.6 2023-11-17 15:16:32 +01:00
Eren Gölge 52cb1e2f68 Update model hash for v2.0.2 2023-11-17 15:16:32 +01:00
Edresson Casanova 6075fa208c Ensures that only GPT model is in training mode during XTTS GPT training (#3241)
* Ensures that only GPT model is in training mode during training

* Fix parallel wavegan unit test
2023-11-17 15:15:22 +01:00
Eren Gölge a3279f9294 Make style 2023-11-17 15:15:22 +01:00
Eren Gölge f21067a84a Make k_diffusion optional 2023-11-17 15:15:21 +01:00
Enno Hermann 0fb0d67de7 refactor: use save_checkpoint()/save_best_model() from Trainer 2023-11-17 01:18:23 +01:00
Enno Hermann 96678c7ba2 refactor: use copy_model_files() from Trainer 2023-11-17 01:18:23 +01:00
Enno Hermann 5119e651a1 chore(utils.io): remove unused code
These are all available in Trainer.
2023-11-17 01:18:23 +01:00
Enno Hermann 39fe38bda4 refactor: use save_fsspec() from Trainer 2023-11-17 01:18:23 +01:00
Enno Hermann fdf0c8b10a chore(encoder): remove unused code 2023-11-17 01:18:23 +01:00
Eren Gölge 7e4375da2b
Update to v0.20.6 2023-11-16 17:52:13 +01:00
Julian Weber fbc18b8c34
Fix zh bug (#3238) 2023-11-16 17:51:37 +01:00
Julian Weber 675f983550
Add sentence splitting (#3227)
* Add sentence splitting

* update requirements

* update default args v2

* Add spanish

* Fix return gpt_latents

* Update requirements

* Fix requirements
2023-11-16 11:01:11 +01:00
Enno Hermann 3c2d5a9e03
Remove duplicate AudioProcessor code and fix ExtractTTSpectrogram.ipynb (#3230)
* chore: remove unused argument

* refactor(audio.processor): remove duplicate stft+griffin_lim

* chore(audio.processor): remove unused compute_stft_paddings

Same function available in numpy_transforms

* refactor(audio.processor): remove duplicate db_to_amp

* refactor(audio.processor): remove duplicate amp_to_db

* refactor(audio.processor): remove duplicate linear_to_mel

* refactor(audio.processor): remove duplicate mel_to_linear

* refactor(audio.processor): remove duplicate build_mel_basis

* refactor(audio.processor): remove duplicate stft_parameters

* refactor(audio.processor): use pre-/deemphasis from numpy_transforms

* refactor(audio.processor): use rms_volume_norm from numpy_transforms

* chore(audio.processor): remove duplicate assert

Already checked in numpy_transforms.compute_f0

* refactor(audio.processor): use find_endpoint from numpy_transforms

* refactor(audio.processor): use trim_silence from numpy_transforms

* refactor(audio.processor): use volume_norm from numpy_transforms

* refactor(audio.processor): use load_wav from numpy_transforms

* fix(bin.extract_tts_spectrograms): set quantization bits

* fix(ExtractTTSpectrogram.ipynb): adapt to current TTS code

Fixes #2447, #2574

* refactor(audio.processor): remove duplicate quantization methods
2023-11-16 10:57:06 +01:00
Eren Gölge 88630c60e5
Update to v0.20.5 2023-11-15 14:02:51 +01:00
Edresson Casanova 73a5bd08c0
Fix XTTS GPT padding and inference issues (#3216)
* Fix end artifact for fine tuning models

* Bug fix on zh-cn inference

* Remove unused code
2023-11-15 14:02:05 +01:00
Julian Weber 04901fb2e4
Add speed control for inference (#3214)
* Add speed control for inference

* Fix XTTS tests

* Add speed control tests
2023-11-14 16:07:17 +01:00
Eren Gölge d96f3885d5
Update to v0.20.4 2023-11-13 17:07:25 +01:00
Eren Gölge ac3df409a6
Merge pull request #3208 from coqui-ai/fix_max_mel_len
fix max generation length for XTTS
2023-11-13 14:32:56 +01:00
Eren Gölge 92fa988aec Fixup 2023-11-13 13:44:06 +01:00
WeberJulian b85536b23f fix max generation length 2023-11-13 13:18:45 +01:00
Eren Gölge b2682d39c5 Make style 2023-11-13 13:01:01 +01:00
Eren Gölge a16360af85 Implement chunking gpt_cond 2023-11-13 13:00:08 +01:00
Eren Gölge 6f1cba2f81
Update to v0.20.3 2023-11-09 17:41:37 +01:00
Enno Hermann 3b1e7038bc
fix(formatters): set missing root_path attribute (#3182)
Fixes #2778
2023-11-09 16:49:52 +01:00
Aarni Koskela a8e9163fb3
xtts/tokenizer: merge duplicate implementations of preprocess_text (#3170)
This was found via ruff:

> F811 Redefinition of unused `preprocess_text` from line 570
2023-11-09 16:32:12 +01:00
Matthew Boakes 1b9c400bca
PyTorch 2.1 Updates (Weight Norm and TorchAudio I/O) (#3176)
* Replaced PyTorch weight_norm With parametrizations.weight_norm

* TorchAudio: Migrating The I/O Functions To Use The Dispatcher Mechanism

* Corrected Code Style

---------

Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-11-09 16:31:03 +01:00
Gorkem 66a1e248d0
torchaudio should use proper backend to load audio (#3179) 2023-11-09 16:28:39 +01:00
Eren Gölge 46d9c27212
Update to v0.20.2 2023-11-08 16:07:56 +01:00
Julian Weber 03ad90135b
Add lang code in XTTS doc (#3158)
* Add lang code in XTTS doc

* Remove unused config and args

* update docs

* woops
2023-11-08 13:47:33 +01:00
Gorkem 78a596618a
Fix for exception on streaming if last chunk empty (#3160) 2023-11-08 11:32:02 +01:00
Enno Hermann 99edd6daa3
Fix ModelManager.list_models() (#3128)
* fix(utils.manage): remove hard-coded model_type variable

* refactor(utils.manage): address lint issues, fix typos

Addressed the following:
TTS/utils/manage.py:307:12: R1705: Unnecessary "else" after "return" (no-else-return)
TTS/utils/manage.py:308:21: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
TTS/utils/manage.py:299:4: R1710: Either all return statements in a function should return an expression, or none of them should. (inconsistent-return-statements)
TTS/utils/manage.py:299:4: R0201: Method could be a function (no-self-use)
TTS/utils/manage.py:314:4: R0201: Method could be a function (no-self-use)
2023-11-08 11:29:01 +01:00
Eren Gölge 77b18126c7
Merge pull request #3126 from akx/freevc-config-module
Move FreeVCConfig to TTS.vc.configs (like all other config classes)
2023-11-08 11:24:47 +01:00
Eren Gölge cc6e9fcaa7
Fix #3153 (#3169) 2023-11-08 11:13:58 +01:00
Eren Gölge a24ebcd8a6
Fix coqui api (#3168) 2023-11-08 10:51:23 +01:00
Julian Weber ce1a39a9a4
Add char limit warn (#3130)
* Add char limit warning

* Adding v2 langs

* cached_property for cutlet

* Fix import
2023-11-08 10:24:23 +01:00
Eren Gölge f846a9f300
Update to v0.20.1 2023-11-07 14:17:36 +01:00
Edresson Casanova cbdbc44e0f
Fix XTTS v2.0 training recipe (#3154)
* Fix XTTS v2.0 training recipe

* Update XTTS v2 model hash
2023-11-07 14:16:44 +01:00
Edresson Casanova 5f9ab6cfaa
Fix style
Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-11-06 19:22:34 -03:00
Edresson Casanova 2470599d18 Drop XTTS v1 2023-11-06 19:12:04 -03:00
Edresson Casanova 13243df526 Update XTTS v1.1 files 2023-11-06 19:10:21 -03:00
Edresson Casanova 09fb317e6d Remove unused code 2023-11-06 17:36:32 -03:00
Edresson Casanova b146de4ce8 Bug fix on XTTS v2.0 Trainer 2023-11-06 20:26:01 +01:00
Edresson Casanova 1b6f8d0e46 Update unit tests and recipes 2023-11-06 20:25:06 +01:00
Edresson Casanova 72b2bac0f8 Load reference in 24khz to avoid issues with multiple sr references 2023-11-06 20:25:06 +01:00
Edresson Casanova 00294ffdf6 Update XTTS docs 2023-11-06 20:24:06 +01:00
Edresson Casanova 459ad70dc8 Add support for multiple speaker references on XTTS inference 2023-11-06 20:22:35 +01:00
Eren Gölge f0cb19ecca
Drop diffusion from XTTS (#3150)
* Drop diffusion for XTTS

* Make style

* Drop diffusion deps in code

* Restore thrashed
2023-11-06 20:15:49 +01:00
Eren Gölge 5d418bb84a Update docs 2023-11-06 18:48:41 +01:00
Eren Gölge 9bbf6eb8dd Drop use_ne_hifigan 2023-11-06 18:43:38 +01:00
Eren Gölge 9d54bd7655 Fixup XTTS 2023-11-06 18:13:58 +01:00
Eren Gölge c713a839da
Update VERSION 2023-11-06 15:51:56 +01:00
Edresson Casanova e45227d9ff
XTTS v2.0 (#3137)
* Implement most similar ref training approach

* Use non-enhanced hifigan for test samples

* Add Perceiver

* Update GPT Trainer for perceiver support

* Update XTTS docs

* Bug fix masking with XTTS perceiver

* Bug fix on gpt forward

* Bug Fix on XTTS v2.0 training

* Add XTTS v2.0 unit tests

* Add XTTS v2.0 inference unit tests

* Bug Fix on diffusion inference

* Add XTTS v2.0 training recipe

* Placeholder model entry

* Add cloning params to config

* Make prompt embedding configurable

* Make cloning configurable

* Cheap fix for a cheaper fix

* Prevent resampling

* Update model entry

* Update docs

* Update requirements

* Code linting

* Add xtts v2 to sep tests

* Bug fix on XTTS get_gpt_cond_latents

* Bug fix on rebase

* Make style

* Bug fix in Japanese tokenizer

* Add num2words to deps

* Remove unused kwarg and added num_beams=1 as default

---------

Co-authored-by: Eren Gölge <egolge@coqui.ai>
2023-11-06 14:58:18 +01:00
Aarni Koskela 38f6f8f0bb
Run `make style` & re-enable it in CI (#3127) 2023-11-06 11:36:37 +01:00
Aarni Koskela 5ae369d629 Move FreeVCConfig to TTS.vc.configs (like all other config classes) 2023-10-31 16:56:25 +02:00
Eren Gölge 6fef4f9067
Bump up to v0.19.1 2023-10-30 10:37:28 +01:00
Eren Gölge eccc94be9b
Merge pull request #2983 from vltmedia/dev
Bug: self.model_name needed to be initialized.
2023-10-28 10:39:25 +02:00
Eren Gölge 2d6bd716ef
Merge pull request #3109 from coqui-ai/tts_3067
fix for issue 3067
2023-10-28 10:37:52 +02:00
WeberJulian 1c98821359 Remove unused load_audio function 2023-10-27 22:27:18 +02:00
Aya Jafari 041b4b6723 fix for issue 3067 2023-10-26 13:06:01 -03:00
WeberJulian d4e08c8d6c Add features to get_conditioning_latents 2023-10-26 14:57:33 +02:00
WeberJulian c1133724a1 Move lang token add to tokenizer 2023-10-26 14:52:13 +02:00
WeberJulian 6fa46d197d Fix get_conditioning_latents when using only ne 2023-10-26 14:51:35 +02:00
Eren Gölge edd3a28723
Bump up to v0.19.0 2023-10-25 13:29:38 +02:00
Edresson Casanova 01839af926 Bug fix on XTTS masking training 2023-10-24 18:30:14 -03:00
VLT Media 818aa0eb7e
Merge branch 'coqui-ai:dev' into dev 2023-10-23 23:36:33 -04:00
Edresson Casanova 0f96abb5ec Add FT inference example on XTTS docs 2023-10-23 13:23:30 -03:00
Edresson Casanova 37b7945474 Update XTTS train not implemented error to point to the XTTS docs 2023-10-23 11:39:17 -03:00
Edresson Casanova ec7f54768a Rebase bug fix and update recipe 2023-10-21 17:37:51 -03:00
Edresson Casanova affaf11148 Add XTTS training unit test 2023-10-21 13:41:12 -03:00
Edresson Casanova 1f92741d6a Fix issue #2971 2023-10-21 13:37:21 -03:00
Edresson Casanova 5f98dbeec9 Update Ljspeech XTTS recipe 2023-10-21 13:37:21 -03:00
Edresson Casanova 9e3598c3b7 Bug Fix on inference using XTTS trainer checkpoint 2023-10-21 13:37:21 -03:00
Edresson Casanova c4ceaabe2c Add test sentences during the training 2023-10-21 13:33:56 -03:00
Edresson Casanova 2f868dd5c2 Bug fix on reproducible evaluation 2023-10-21 13:33:56 -03:00
Edresson Casanova bafab049c2 Add prompting masking 2023-10-21 13:33:56 -03:00
Edresson Casanova 47d613df3a Add reproducible evaluation 2023-10-21 13:33:56 -03:00
Edresson Casanova 40a4e631ea Update mel spectrogram for the style encoder 2023-10-21 13:33:56 -03:00
Edresson Casanova a32961bcb4 Add XTTS base training code 2023-10-21 13:33:56 -03:00
Eren Gölge 1e152692ed
Bump up to v0.18.2 2023-10-21 17:29:53 +02:00
Julian Weber dad6a7b0b6
Preserve [ja] token of the text processing 2023-10-21 11:26:03 +02:00
Julian Weber c7a16042e3
Remove global cutlet import 2023-10-21 11:18:58 +02:00
Edresson Casanova 414f0de0a1
Bump up to v0.18.1 2023-10-20 17:30:58 -03:00
Edresson Casanova 59576fc0ec
Bug fix on XTTS v1.1 inference (#3093)
* Bug fix on XTTS v1.1 inference

* Update .models.json

---------

Co-authored-by: Julian Weber <julian.weber@hotmail.fr>
2023-10-20 17:29:43 -03:00
Eren Gölge 85e7323739
Bump up to v0.18.0 2023-10-20 16:03:24 +02:00
Julian Weber cf97116185
XTTS v1.1 (#3089)
* Add support for ne_hifigan

* Update model.json

* Update hash

* Fix model loading

* Enhance text_normalization

* Add xtts to zoo test exception

* Add model hash check

* Add get_number_tokens
2023-10-20 16:02:08 +02:00
Eren Gölge 747f688dc3
Bump up to v0.17.10 2023-10-19 12:00:15 +02:00
Eren Gölge 93e6961bb5
Update .models.json 2023-10-19 11:59:49 +02:00
Eren Gölge bf68848f38
Bump up to v0.17.9 2023-10-19 11:22:42 +02:00
Eren Gölge c3b011217d
Update .models.json 2023-10-19 11:21:21 +02:00
David Garvey a151d70242
Add stdout option (#3027)
* add cli options for play and speed
--play argument uses simpleaudio to play the tts wav
--speed <float 0.0-2.0> passes speed argument to Coqui Studio models

* remove simpleaudio not referenced in file

* fix simpleaudio dependency version

* add ALSA headers for simpleaudio compilation

* Dockerfile ALSA headers for simpleaudio

* base changes to use stdout instead of play audio
Considering conversion to pipe wav data for audio playback with other programs
like aplay.

This is incomplete code. Using to get feedback before proceeding with
implementation.

* remove play for pipe_out arg that suppresses stdout
removed play and simpleaudio dependency in place of pipe
functionality to allow passing wav file data to a program
dedicated to playing audio.

* scipy.io.wavfile.write fails with /dev/null target

* Streaming inference for XTTS 🚀 (#3035)

* v0.17.7

* Redownload XTTS when the local and remote configs do not match

* Remove unused method

* Print a message when it is already downloaded

* Try-except to present error when the user doesn't have a connection

* Fix style

* 0.17.8

* v0.17.8

---------

Co-authored-by: Julian Weber <julian.weber@hotmail.fr>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
Co-authored-by: Edresson Casanova <edresson1@gmail.com>
Co-authored-by: ggoknar <ggoknar@coqui.ai>
2023-10-16 12:07:21 +02:00
Dusty Hagstrom 13cd076a7f
Synthesizer skips over embeddings file if model only has one speaker (#2587)
* It looks like the Neon model is special in that it does not have a speaker_name and it wants to get the only item available. This was blocking a valid model with one speaker and a d_vector_file from being executed to get the embedding.

* Update synthesizer.py

oh my how embarrassing
2023-10-16 11:55:45 +02:00
Aya Jafari ffddf10458 unit test fix 2023-10-13 10:56:47 -03:00
Aya Jafari 6eaecab0ca fixed bugs in fastpitch tts synthesis 2023-10-10 23:02:31 -03:00
ggoknar 99635193f5 v0.17.8 2023-10-07 01:14:05 +03:00
ggoknar 3bb51b1276 0.17.8 2023-10-07 01:13:02 +03:00
Edresson Casanova 2852404bdf Fix style 2023-10-06 17:42:46 -03:00
Edresson Casanova 99650044a4 Try-except to present error when the user doesn't have a connection 2023-10-06 17:37:05 -03:00
Edresson Casanova 529ea3f67f Print a message when it is already downloaded 2023-10-06 17:26:40 -03:00
Edresson Casanova ee1ef1c51e Remove unused method 2023-10-06 17:21:22 -03:00
Edresson Casanova 4a6103fec9 Redownload XTTS when the local and remote configs do not match 2023-10-06 17:16:30 -03:00
Eren Gölge 0520697b5f
v0.17.7 2023-10-06 18:35:26 +02:00
Julian Weber e5e0cbffc9
Streaming inference for XTTS 🚀 (#3035) 2023-10-06 18:34:06 +02:00
OPERATOR 2150136210
None is not able to be read for "XTTS", fixes crash if its set to None. (#3009) 2023-10-02 12:53:36 +02:00
Eren Gölge 155c5fc0bd
v0.17.6 2023-09-29 23:44:09 +02:00
Edresson Casanova 4c3c11c958
Tortoise inference fix and fix zoo unit tests (#3010) 2023-09-29 13:40:57 +02:00
Eren Gölge bb05dcb9b4
Merge pull request #2922 from coqui-ai/be_tts
Adding Belarusian TTS model
2023-09-27 09:48:28 +02:00
Eren Gölge 8cba47191f
Merge pull request #2993 from akx/tts-readme
Ensure `tts` CLI tool readme and usage is in sync
2023-09-27 09:46:54 +02:00
Eren Gölge ea51a7ffcc
Merge pull request #3003 from akx/duplicate-code-removal
Duplicate code removal
2023-09-27 09:41:35 +02:00
Aarni Koskela 0dbe7cbcc4 Remove duplicate convert_pad_shape 2023-09-27 01:10:48 +03:00
Aarni Koskela 33a7c722f6 Merge duplicate on_train_step_start functions in delightful_tts 2023-09-27 01:10:44 +03:00
Aarni Koskela 861c68b0b8 Rename misnamed setter 2023-09-27 01:09:59 +03:00
Aarni Koskela 09e14e68db Remove duplicate get_named_beta_schedules 2023-09-27 01:09:59 +03:00
Aarni Koskela 59f85a7122 Remove duplicate code from xtts.tokenizer 2023-09-27 01:09:59 +03:00
Aarni Koskela 0a82f063cc Late-import main TTS libraries in `tts` CLI 2023-09-26 15:38:56 +03:00
Aarni Koskela 5c047cf304 Ensure `tts` CLI tool readme and usage help is in sync 2023-09-26 15:38:56 +03:00
Eren Gölge 0b95b88f13
Bump up to v0.17.5 2023-09-25 18:16:45 +02:00
VLT Media dd73910651
Bug: self.model_name needed to be initialized.
Bug: self.model_name needed to be initialized to get around a bug that automatically crashes when the user provides the model paths but no model_name when initializing the TTS object.
2023-09-23 01:41:35 -04:00
loupzeur da8b6bbce1
fix: xtts not taking into account device flag (#2951)
* fix: xtts not taking into account device flag

* Style changes

---------

Co-authored-by: Julian Weber <julian.weber@hotmail.fr>
2023-09-20 09:57:02 +02:00
Reuben Morais f829bf50f8
Bump version to v0.17.4 (really) 2023-09-15 16:40:34 +02:00
Eren Gölge aa8fa4756e Bump up to v0.17.4 2023-09-14 17:52:44 +02:00
Eren Gölge 9d0b76ce23 Check env var for COQUI_TOS_AGREED 2023-09-14 17:51:40 +02:00
Eren Gölge 13dd7c4c9e Bump up to v0.17.2 2023-09-14 15:24:05 +02:00
Eren Gölge ded7fd4fb2 Make style 2023-09-14 15:23:37 +02:00
Eren Gölge 44b61d2b92 Fixup 2023-09-14 15:22:54 +02:00
Eren Gölge 623ea41634
Fix model tests (#2943) 2023-09-14 15:21:48 +02:00
Eren Gölge af62613c86 Bump up to v0.17.1 2023-09-13 18:23:39 +02:00
Eren Gölge ee7cee0e35 Fixup 2023-09-13 18:21:44 +02:00
Eren Gölge 5dcf9ae311 Bump up v0.17.0 2023-09-13 18:04:26 +02:00
Eren Gölge 4033db5f4b 🔥 XTTS implementation 2023-09-13 17:51:24 +02:00
Edresson Casanova 4d3f23b5d3
Add CML-TTS dataset YourTTS training recipe (#2934) 2023-09-12 11:49:14 +02:00