Commit Graph

546 Commits

Author SHA1 Message Date
Enno Hermann e0f621180f refactor(bin.synthesize): use Python API for CLI 2024-12-06 17:07:54 +01:00
Enno Hermann 5f8ad4c64b test(openvoice): add sanity check 2024-12-02 23:26:28 +01:00
Enno Hermann 9ef2c7ed62 test(freevc): fix output length check 2024-12-02 23:26:28 +01:00
Enno Hermann 546f43cb25 refactor: only use keyword args in Synthesizer 2024-12-02 23:26:27 +01:00
Enno Hermann d488441b75 test(freevc): remove unused code 2024-12-02 23:26:27 +01:00
Enno Hermann 63625e79af refactor: import get_last_checkpoint from trainer.io 2024-11-29 13:59:43 +01:00
Enno Hermann 7330ad8854 refactor: move duplicate alignment functions into helpers 2024-11-24 19:57:14 +01:00
Enno Hermann 8bf288eeab test: move test_helpers.py to fast unit tests 2024-11-24 19:57:14 +01:00
Enno Hermann 76df6421de refactor: move more audio processing into torch_transforms 2024-11-24 19:57:14 +01:00
Enno Hermann 7cdfde226b refactor: move amp_to_db/db_to_amp into torch_transforms 2024-11-23 01:04:17 +01:00
Shavit 36611a7192
feat: normalize unicode characters in text cleaners (#85)
* Add normalizer type C to text cleaners

* Linter recommendations

* Add unicode normalize to every cleaner

* Format test_text_cleaners.py
2024-10-02 17:01:19 +02:00
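Note on the commit above: "normalizer type C" presumably refers to Unicode Normalization Form C (NFC). A minimal sketch of what such a normalization step does, using only the standard-library `unicodedata` module; the helper name `normalize_unicode` is hypothetical and not taken from the repository:

```python
import unicodedata


def normalize_unicode(text: str) -> str:
    """Compose characters into Unicode Normalization Form C (NFC)."""
    # Hypothetical helper name; the cleaner in the repository may differ.
    return unicodedata.normalize("NFC", text)


# "e" followed by a combining acute accent: two code points for one glyph.
decomposed = "cafe\u0301"
# After NFC the pair is composed into the single precomposed code point U+00E9.
composed = normalize_unicode(decomposed)

print(len(decomposed), len(composed))  # 5 4
```

Normalizing before any character-level lookup means visually identical inputs map to the same symbols regardless of how they were typed or encoded.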
Enno Hermann 0fb26f97df refactor: use get_user_data_dir from trainer 2024-06-29 15:07:10 +02:00
Enno Hermann c5241d71ab chore: address pytorch deprecations
torch.range(a, b) == torch.arange(a, b+1)

meshgrid indexing: https://github.com/pytorch/pytorch/issues/50276

checkpoint use_reentrant:
https://dev-discuss.pytorch.org/t/bc-breaking-update-to-torch-utils-checkpoint-not-passing-in-use-reentrant-flag-will-raise-an-error/1745

optimizer.step() before scheduler.step():
https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
2024-06-26 11:38:25 +02:00
Enno Hermann 857cd55ce5 test(helpers): fix test_rand_segment, test_generate_path 2024-06-26 10:16:46 +02:00
Enno Hermann c9f7197862 test(helpers): add test_ prefix so tests actually run 2024-06-25 23:03:40 +02:00
Enno Hermann e5c208d254 feat(cleaners): add multilingual phoneme cleaner
This doesn't convert numbers into English words.
2024-06-14 15:06:03 +02:00
Enno Hermann 77722cb0dd fix(bin.synthesize): correctly handle boolean arguments
Previously, e.g. `--use_cuda false` would actually set use_cuda=True:
https://github.com/coqui-ai/TTS/discussions/3762
2024-05-31 08:39:32 +02:00
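Note on the commit above: one common way this kind of bug arises is passing `type=bool` to `argparse`, since any non-empty string is truthy. The sketch below reproduces that pitfall and shows one possible remedy; it assumes this was the cause and does not claim to be the fix applied in the commit. The `str2bool` helper here is written for illustration only.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse common textual spellings of a boolean (illustrative helper)."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


# Pitfall: bool("false") is True because any non-empty string is truthy.
buggy = argparse.ArgumentParser()
buggy.add_argument("--use_cuda", type=bool, default=False)
buggy_args = buggy.parse_args(["--use_cuda", "false"])

# One possible remedy: convert the string explicitly.
fixed = argparse.ArgumentParser()
fixed.add_argument("--use_cuda", type=str2bool, default=False)
fixed_args = fixed.parse_args(["--use_cuda", "false"])

print(buggy_args.use_cuda, fixed_args.use_cuda)  # True False
```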
Enno Hermann 07cbcf825c fix(espeak_wrapper): read phonemize() input from file
Avoids utf8 encoding issues on Windows when passing the text directly.
Fixes https://github.com/coqui-ai/TTS/discussions/3761
2024-05-29 10:10:05 +02:00
Enno Hermann ec50006855 style: run pre-commit
Automatic changes from: pre-commit run --all-files
2024-05-08 12:17:47 +02:00
Enno Hermann 98e21d0f02 test(losses): change assertEqual to assertAlmostEqual
Failed in CI with:
AssertionError: 1.401298464324817e-45 != 0.0
2024-05-01 14:28:55 +02:00
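Note on the commit above: the reported value matches the smallest positive subnormal float32 (2^-149), so the loss was effectively zero but not exactly zero. A minimal sketch of why `assertAlmostEqual` tolerates it while `assertEqual` does not; the test class is hypothetical and only the numeric value comes from the commit message:

```python
import unittest


class TinyLossTest(unittest.TestCase):
    """Hypothetical test; the value is the one reported in the CI failure."""

    def test_near_zero(self):
        loss = 1.401298464324817e-45  # tiny positive value, not exactly 0.0
        # An exact comparison distinguishes the two values.
        self.assertNotEqual(loss, 0.0)
        # assertAlmostEqual rounds the difference to 7 decimal places by default.
        self.assertAlmostEqual(loss, 0.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TinyLossTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```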
Enno Hermann b711e19cb6 refactor: remove verbose arguments
Can be handled by adjusting logging levels instead.
2024-04-03 15:19:45 +02:00
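Note on the commit above: a minimal sketch of how a logging level can stand in for a `verbose` flag, using only the standard-library `logging` module. The logger name and the `load_model` function are hypothetical and not taken from the repository.

```python
import logging

# Hypothetical logger name; the repository's actual logger names may differ.
logger = logging.getLogger("example_tts_module")


def load_model(name: str) -> str:
    # Emitted only when the logger's effective level is INFO or lower.
    logger.info("Loading model %s", name)
    return name


logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")

logger.setLevel(logging.WARNING)  # quiet: comparable to verbose=False
load_model("a")

logger.setLevel(logging.INFO)  # chatty: comparable to verbose=True
load_model("b")
```

With this pattern the caller controls verbosity once, at the logger, instead of threading a `verbose` argument through every function.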
Enno Hermann a7753708fb refactor: remove duplicate methods available in Trainer 2024-03-12 15:06:42 +01:00
Enno Hermann dca564a705 test(vocoder): disable wavegrad training test in CI 2024-03-08 17:27:23 +01:00
Enno Hermann efdafd5a7f style: run black 2024-03-07 11:46:51 +01:00
Enno Hermann 24298da5fc
Merge pull request #1 from eginhard/lint-overhaul
Lint overhaul (pylint to ruff)
2024-03-06 16:10:26 +01:00
Enno Hermann 1961687a18 build: update to ruff 0.3.0 2024-03-06 13:40:56 +01:00
Edresson Casanova 5dcc16d193
Bug fix in MP3 and FLAC compute length on TTSDataset (#3092)
* Bug Fix on XTTS load

* Bug fix in MP3 length on TTSDataset

* Update TTS/tts/datasets/dataset.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Uses mutagen for all audio formats

* Add dataloader test with all supported audio formats

* Use mutagen.File

* Update

* Fix aux unit tests

* Bug fix on unit tests

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-12-27 13:23:43 -03:00
Aarni Koskela 64bb41f4fa Ruff autofix C41 2023-12-13 14:56:41 +02:00
Aarni Koskela 90991e89b4 Ruff autofix unused imports and import order 2023-12-13 14:56:41 +02:00
Aarni Koskela 72ac2bfa09 Get rid of some star imports 2023-12-13 14:56:41 +02:00
Eren Gölge 8999780aff
Update test_models.py 2023-12-12 13:30:21 +01:00
WeberJulian 605a857add Remove tortoise 2023-12-11 23:35:07 +01:00
WeberJulian 8c20a599d8 Remove coqui studio integration from TTS 2023-12-11 22:11:46 +01:00
Enno Hermann 39321d02be
fix: correctly strip/restore initial punctuation (#3336)
* refactor(punctuation): remove orphan code for handling lone punctuation

The case of lone punctuation is already handled at the top of restore(). The
removed if statement would never be called and would in fact raise an
AttributeError because the _punc_index named tuple doesn't have the attribute
`mark`.

* refactor(punctuation): remove unused argument

* fix(punctuation): correctly handle initial punctuation

Stripping and restoring initial punctuation didn't work correctly because the
string-splitting caused an additional empty string to be inserted in the text
list (because `".A".split(".")` => `["", "A"]`). Now, an initial empty string is
skipped and relevant test cases are added.

Fixes #3333
2023-11-30 13:03:16 +01:00
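Note on the commit above: a minimal sketch of the string-splitting behaviour described in the message and of skipping the initial empty string. The helper `split_skipping_initial_empty` is hypothetical and is not the repository's implementation.

```python
from typing import List

# A leading separator yields an empty first element.
parts = ".A".split(".")
print(parts)  # ['', 'A']


def split_skipping_initial_empty(text: str, mark: str) -> List[str]:
    """Split on mark and drop the empty string produced by a leading mark."""
    pieces = text.split(mark)
    if pieces and pieces[0] == "":
        pieces = pieces[1:]
    return pieces


print(split_skipping_initial_empty(".A", "."))   # ['A']
print(split_skipping_initial_empty("A.B", "."))  # ['A', 'B']
```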
Eren Gölge b47d9c6e36
Merge pull request #3243 from idiap/checkpoints
Remove duplicate/unused code
2023-11-22 23:52:06 +01:00
Eren Gölge 44880f09ed Make style 2023-11-17 13:43:34 +01:00
Enno Hermann 0fb0d67de7 refactor: use save_checkpoint()/save_best_model() from Trainer 2023-11-17 01:18:23 +01:00
Enno Hermann 3c2d5a9e03
Remove duplicate AudioProcessor code and fix ExtractTTSpectrogram.ipynb (#3230)
* chore: remove unused argument

* refactor(audio.processor): remove duplicate stft+griffin_lim

* chore(audio.processor): remove unused compute_stft_paddings

Same function available in numpy_transforms

* refactor(audio.processor): remove duplicate db_to_amp

* refactor(audio.processor): remove duplicate amp_to_db

* refactor(audio.processor): remove duplicate linear_to_mel

* refactor(audio.processor): remove duplicate mel_to_linear

* refactor(audio.processor): remove duplicate build_mel_basis

* refactor(audio.processor): remove duplicate stft_parameters

* refactor(audio.processor): use pre-/deemphasis from numpy_transforms

* refactor(audio.processor): use rms_volume_norm from numpy_transforms

* chore(audio.processor): remove duplicate assert

Already checked in numpy_transforms.compute_f0

* refactor(audio.processor): use find_endpoint from numpy_transforms

* refactor(audio.processor): use trim_silence from numpy_transforms

* refactor(audio.processor): use volume_norm from numpy_transforms

* refactor(audio.processor): use load_wav from numpy_transforms

* fix(bin.extract_tts_spectrograms): set quantization bits

* fix(ExtractTTSpectrogram.ipynb): adapt to current TTS code

Fixes #2447, #2574

* refactor(audio.processor): remove duplicate quantization methods
2023-11-16 10:57:06 +01:00
Julian Weber 04901fb2e4
Add speed control for inference (#3214)
* Add speed control for inference

* Fix XTTS tests

* Add speed control tests
2023-11-14 16:07:17 +01:00
Eren Gölge b2682d39c5 Make style 2023-11-13 13:01:01 +01:00
Julian Weber 58cb0d8dd0
Remove v1 doc and tests (#3172)
* remove v1 in inference.md

* remove v1 in README.md

* Update test_models.py
2023-11-08 14:51:42 +01:00
Edresson Casanova b146de4ce8 Bug fix on XTTS v2.0 Trainer 2023-11-06 20:26:01 +01:00
Edresson Casanova f444f296f2 Add multiple references on xtts inference tests 2023-11-06 20:25:06 +01:00
Edresson Casanova 1b6f8d0e46 Update unit tests and recipes 2023-11-06 20:25:06 +01:00
Eren Gölge f0cb19ecca
Drop diffusion from XTTS (#3150)
* Drop diffusion for XTTS

* Make style

* Drop diffusion deps in code

* Restore thrashed
2023-11-06 20:15:49 +01:00
Edresson Casanova e45227d9ff
XTTS v2.0 (#3137)
* Implement most similar ref training approach

* Use non-enhanced hifigan for test samples

* Add Perceiver

* Update GPT Trainer for perceiver support

* Update XTTS docs

* Bug fix masking with XTTS perceiver

* Bug fix on gpt forward

* Bug Fix on XTTS v2.0 training

* Add XTTS v2.0 unit tests

* Add XTTS v2.0 inference unit tests

* Bug Fix on diffusion inference

* Add XTTS v2.0 training recipe

* Placeholder model entry

* Add cloning params to config

* Make prompt embedding configurable

* Make cloning configurable

* Cheap fix for a cheaper fix

* Prevent resampling

* Update model entry

* Update docs

* Update requirements

* Code linting

* Add xtts v2 to sep tests

* Bug fix on XTTS get_gpt_cond_latents

* Bug fix on rebase

* Make style

* Bug fix in Japanese tokenizer

* Add num2words to deps

* Remove unused kwarg and added num_beams=1 as default

---------

Co-authored-by: Eren Gölge <egolge@coqui.ai>
2023-11-06 14:58:18 +01:00
Aarni Koskela 38f6f8f0bb
Run `make style` & re-enable it in CI (#3127) 2023-11-06 11:36:37 +01:00
Edresson Casanova 8af3d2dbcd Add a dedicated workflow for XTTS tests 2023-10-24 09:52:44 -03:00
Edresson Casanova 67ca70aff4 Fix Delightful TTS layers unit test 2023-10-23 11:47:10 -03:00
Edresson Casanova e8a1a50273 Remove unused vars in Delightful TTS layers tests 2023-10-23 09:26:36 -03:00