2135 Commits

Author SHA1 Message Date
Enno Hermann 659b4852ba chore(bark): remove manual download of hubert model
Bark was previously adapted to download Hubert from HuggingFace, so the manual
download is superfluous.
2024-09-12 23:37:19 +02:00
Enno Hermann 86b58fb6d9 fix: define torch safe globals for torch.load
Required for loading some models using torch.load(..., weights_only=True). This
is only available from PyTorch 2.4 onwards.
2024-09-12 23:37:19 +02:00
shavit 17ca24c3d6 fix: load weights only in torch.load 2024-09-12 23:37:19 +02:00
Enno Hermann 1920328822
feat(xtts): support hindi in tokenizer (#64)
Added proper tokenizer support for Hindi, which prevents a crash when fine-tuning on Hindi.

Co-authored-by: Akshat Bhardwaj <157223825+akshatrocky@users.noreply.github.com>
2024-09-12 21:29:21 +02:00
Enno Hermann 9c604c1de0 chore(dataset): address lint issues 2024-07-31 15:47:27 +02:00
Enno Hermann 8c460d0cd0 fix(dataset): skip files where audio length can't be computed
Avoids hard failures when the audio can't be decoded.
2024-07-31 15:20:56 +02:00
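The skip-on-failure behaviour described in this commit can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the project's code: the `wave` module stands in for the real audio backend, and `audio_num_frames` and `usable_files` are hypothetical names.

```python
import logging
import tempfile
import wave
from pathlib import Path

logger = logging.getLogger(__name__)


def audio_num_frames(path: Path) -> int:
    """Frame count of a WAV file; raises if the file can't be decoded."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes()


def usable_files(paths: list[Path]) -> list[tuple[Path, int]]:
    """Keep files whose length can be computed; log and skip the rest."""
    kept = []
    for path in paths:
        try:
            kept.append((path, audio_num_frames(path)))
        except (wave.Error, EOFError, OSError) as exc:
            # One undecodable file no longer aborts the whole run.
            logger.warning("skipping %s: %s", path, exc)
    return kept


with tempfile.TemporaryDirectory() as tmp:
    good, bad = Path(tmp) / "good.wav", Path(tmp) / "bad.wav"
    with wave.open(str(good), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * 160)  # 160 silent frames
    bad.write_bytes(b"not audio")
    result = [(p.name, n) for p, n in usable_files([good, bad])]
```

Here `result` contains only the decodable file; the broken one is reported through the logger rather than raising.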
Daniel Walmsley 20bbb411c2
fix(xtts): update streaming for transformers>=4.42.0 (#59)
* Fix Stream Generator on MacOS

* Make it work on mps

* Implement custom tensor.isin

* Fix for latest TF

* Comment out hack for now

* Remove unused code

* build: increase minimum transformers version

* style: fix

---------

Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com>
2024-07-25 16:24:10 +02:00
Enno Hermann 20583a496e
Merge pull request #57 from idiap/xtts-vocab
fix(xtts): load tokenizer file based on config as last resort
2024-07-25 13:26:28 +01:00
Enno Hermann de35920317
Merge pull request #50 from idiap/umap
build: move umap-learn into optional notebook dependencies
2024-07-25 13:26:09 +01:00
Enno Hermann 9192ef1aa6 fix(xtts): load tokenizer file based on config as last resort 2024-07-05 13:52:01 +02:00
Abraham Mathews 6ea3b75b84
Update xtts.py (#53)
docs(xtts): fix typo in example
2024-07-02 13:43:52 +02:00
Enno Hermann 2d06aeb79b chore: remove unused TTS.utils.io module
All uses of these methods were replaced with the equivalents from coqui-tts-trainer.
2024-06-29 15:07:10 +02:00
Enno Hermann e869b9b658 refactor: use load_checkpoint from trainer 2024-06-29 15:07:10 +02:00
Enno Hermann da82d55329 refactor: use load_fsspec from trainer
Made automatically with:
rg "from TTS.utils.io import load_fsspec" --files-with-matches | xargs sed -i 's/from TTS.utils.io import load_fsspec/from trainer.io import load_fsspec/g'
2024-06-29 15:07:10 +02:00
Enno Hermann 0fb26f97df refactor: use get_user_data_dir from trainer 2024-06-29 15:07:10 +02:00
Enno Hermann 28296c6458 refactor: use get_git_branch from trainer 2024-06-29 15:07:10 +02:00
Enno Hermann 59ef28d708 build: move umap-learn into optional notebook dependencies
Except for notebooks, it's only used to show embedding plots during speaker
encoder training, in which case a warning is now shown to install it.
2024-06-26 23:53:17 +02:00
Enno Hermann 4bd3df2607 refactor: remove duplicate get_padding 2024-06-26 11:54:36 +02:00
Enno Hermann c30fb0f56b chore: remove duplicate init_weights 2024-06-26 11:46:37 +02:00
Enno Hermann c5241d71ab chore: address pytorch deprecations
torch.range(a, b) == torch.arange(a, b+1)

meshgrid indexing: https://github.com/pytorch/pytorch/issues/50276

checkpoint use_reentrant:
https://dev-discuss.pytorch.org/t/bc-breaking-update-to-torch-utils-checkpoint-not-passing-in-use-reentrant-flag-will-raise-an-error/1745

optimizer.step() before scheduler.step():
https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
2024-06-26 11:38:25 +02:00
Enno Hermann a755328e49 refactor(freevc): remove duplicate sequence_mask 2024-06-26 10:17:04 +02:00
Enno Hermann f8df19a10c refactor: remove duplicate convert_pad_shape 2024-06-26 10:17:04 +02:00
Enno Hermann cd7b6daf46 fix: clarify types, fix missing functions 2024-06-26 10:17:04 +02:00
Enno Hermann d65bcf65bb chore(freevc): remove duplicate DDSConv and ElementwiseAffine
Already exist as:
TTS.tts.layers.vits.stochastic_duration_predictor.DilatedDepthSeparableConv
TTS.tts.layers.vits.stochastic_duration_predictor.ElementwiseAffine
2024-06-26 10:17:04 +02:00
Enno Hermann 9f80e043e4 refactor(freevc): use existing layernorm 2024-06-26 10:17:04 +02:00
Enno Hermann 4d9e18ea7d chore(stream_generator): address lint issues 2024-06-17 09:52:35 +02:00
Enno Hermann 2a281237d7 refactor(stream_generator): update code for transformers>=4.41.1
In line with
eed9ed6798/src/transformers/generation/utils.py
2024-06-17 09:52:35 +02:00
Enno Hermann 4b6da4e7ba refactor(stream_generator): update special tokens for transformers>=4.41.1
Fixes #31. The handling of special tokens in `transformers` was changed in
https://github.com/huggingface/transformers/pull/30624 and
https://github.com/huggingface/transformers/pull/30746. This updates the XTTS
streaming code accordingly.
2024-06-17 09:52:35 +02:00
Enno Hermann 81ac7abd58
Merge pull request #47 from idiap/numpy2
build: add numpy2 support
2024-06-17 08:48:18 +01:00
Enno Hermann bd9b21d946
Merge pull request #44 from idiap/phoneme-cleaners
Add multilingual phoneme cleaner
2024-06-17 08:47:15 +01:00
Enno Hermann 4bc0e75a08 build: add numpy2 support
Identified necessary code changes with the NPY201 ruff rule. Gruut is the only
dependency that doesn't support numpy2 yet.

NB: At build time numpy>=2.0.0 should be required to be able to build wheels
compatible with both numpy1+2:
https://numpy.org/devdocs/dev/depending_on_numpy.html#numpy-2-abi-handling
2024-06-16 22:10:33 +02:00
ChristianRomberg 3a20f4725f
fix(freevc): use the specified device for pretrained speaker encoder (#45)
Fixes coqui-ai#3787
2024-06-16 21:24:03 +02:00
Enno Hermann 9cfcc0a0f5 chore(cleaners): add type hints 2024-06-14 15:20:04 +02:00
Enno Hermann e5c208d254 feat(cleaners): add multilingual phoneme cleaner
This doesn't convert numbers into English words.
2024-06-14 15:06:03 +02:00
Enno Hermann 03de4b889e docs: fix readthedocs links
[ci skip]
2024-06-13 22:48:34 +02:00
Enno Hermann 29e91f2e77 fix(utils.generic_utils): correctly call now() 2024-05-31 08:39:32 +02:00
Enno Hermann 77722cb0dd fix(bin.synthesize): correctly handle boolean arguments
Previously, e.g. `--use_cuda false` would actually set use_cuda=True:
https://github.com/coqui-ai/TTS/discussions/3762
2024-05-31 08:39:32 +02:00
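The pitfall behind this fix can be reproduced with `argparse` alone. The sketch below assumes the flag was originally declared with `type=bool` (the commit message does not show the declaration); `str2bool` is a hypothetical helper name, not necessarily what the project uses.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical converter: map common spellings to a real boolean."""
    lowered = value.lower()
    if lowered in {"true", "t", "yes", "y", "1"}:
        return True
    if lowered in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


# Naive declaration: bool("false") is True because the string is non-empty.
naive = argparse.ArgumentParser()
naive.add_argument("--use_cuda", type=bool, default=False)

# Safer declaration: parse the text explicitly.
fixed = argparse.ArgumentParser()
fixed.add_argument("--use_cuda", type=str2bool, default=False)

naive_args = naive.parse_args(["--use_cuda", "false"])
fixed_args = fixed.parse_args(["--use_cuda", "false"])
print(naive_args.use_cuda, fixed_args.use_cuda)  # prints: True False
```

An explicit converter also rejects typos such as `--use_cuda flase` with a usage error instead of silently enabling the option.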
Enno Hermann a682fa8d56
Merge pull request #33 from idiap/versions
Fix XTTS streaming
2024-05-29 14:16:36 +01:00
Enno Hermann 07cbcf825c fix(espeak_wrapper): read phonemize() input from file
Avoids utf8 encoding issues on Windows when passing the text directly.
Fixes https://github.com/coqui-ai/TTS/discussions/3761
2024-05-29 10:10:05 +02:00
Enno Hermann 49fcbd908b fix(espeak_wrapper): avoid stuck process on windows
Fixes #24
2024-05-29 07:39:03 +02:00
Enno Hermann 203f60f1e1 refactor(espeak_wrapper): remove sync argument
_espeak_exe is always called with sync=True, so remove code for sync==False
2024-05-28 21:30:55 +02:00
Enno Hermann 7df4c2fa47 fix: restore TTS.__version__ attribute
This is used by the TTS/bin/collect_env_info.py script with which users print
version information for bug reports. We restore the TTS.__version__ attribute so
that old versions of the script still work.
2024-05-28 09:35:55 +02:00
Enno Hermann df088e99df
Merge pull request #19 from idiap/toml
Move from setup.py to pyproject.toml, simplify requirements
2024-05-27 08:59:09 +01:00
Enno Hermann 642cbd472f
Merge pull request #26 from idiap/server-output
fix(server): ensure logging output gets actually shown
2024-05-26 09:08:27 +01:00
Enno Hermann ab7d84bf05 refactor(server): address linter issues 2024-05-23 08:42:21 +02:00
Enno Hermann 8503500d9d chore(server): remove duplicate code 2024-05-20 12:45:47 +02:00
Enno Hermann 70bd84894d fix(server): ensure logging output gets actually shown 2024-05-20 12:45:41 +02:00
Enno Hermann 018f1e6453 docs(bark): update docstrings and type hints 2024-05-15 22:56:55 +02:00
Enno Hermann 59a6c9fdf2 fix(bark): add missing argument for load_voice()
Fixes https://github.com/coqui-ai/TTS/issues/2795
2024-05-15 22:56:28 +02:00
Enno Hermann 6d563af623 chore: remove obsolete code for torch<2
Minimum torch version is 2.1 now.
2024-05-08 18:08:40 +02:00
Enno Hermann 865a48156d fix: make korean g2p deps optional 2024-05-08 18:08:40 +02:00
Enno Hermann 55ed162f2a fix: make chinese g2p deps optional 2024-05-08 18:08:40 +02:00
Enno Hermann ea893c3795 fix: make bangla g2p deps optional 2024-05-08 18:08:40 +02:00
Enno Hermann ec50006855 style: run pre-commit
Automatic changes from: pre-commit run --all-files
2024-05-08 12:17:47 +02:00
Enno Hermann fb92e13ebb build: remove unused/obsolete code 2024-05-08 12:13:41 +02:00
Enno Hermann 259d8fc40b build: store version in pyproject.toml 2024-05-07 18:27:55 +02:00
Enno Hermann 962f9bbbcf refactor(espeak_wrapper): fix ruff lint suggestions 2024-05-01 13:31:39 +02:00
Enno Hermann 7b2289a454 fix(espeak_wrapper): capture stderr separately
Fixes https://github.com/coqui-ai/TTS/issues/2728

Previously, error messages from espeak were treated as normal output and also
converted to phonemes. This captures and logs them separately.
2024-05-01 12:31:49 +02:00
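Keeping the two streams apart can be sketched with `subprocess.run`. This is an illustration only: a child Python process stands in for espeak, and `run_tool` is a hypothetical helper, not the wrapper's actual function.

```python
import logging
import subprocess
import sys

logger = logging.getLogger(__name__)

# Stand-in for an external tool (an assumption; espeak itself is not invoked):
# a child Python process that writes to both streams.
CHILD = "import sys; print('phonemes'); print('warning: bad voice', file=sys.stderr)"


def run_tool(cmd: list[str]) -> tuple[str, str]:
    """Run a command and return (stdout, stderr) as separate strings."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if result.stderr:
        # Diagnostics go to the log instead of being mixed into the output.
        logger.warning("tool reported: %s", result.stderr.strip())
    return result.stdout, result.stderr


out, err = run_tool([sys.executable, "-c", CHILD])
```

Only `out` would be passed on for further processing, so an error message can no longer be mistaken for text to phonemize.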
Enno Hermann 06304504d2
Merge pull request #11 from idiap/py312
build: add python 3.12 support
2024-04-23 13:52:00 +02:00
Enno Hermann 2675e743b0 chore: update version to 0.23.1
[ci skip]
2024-04-23 09:57:43 +02:00
Enno Hermann 52a52b5e21 fix(LanguageManager): allow initialisation from config with language ids file
Previously, running `LanguageManager.init_from_config(config)` would never use
the `language_ids_file` if that field is present because it was overwritten in
the next line with a new manager that manually parses languages from the
datasets in the config. Now that is only used as a fallback.
2024-04-19 11:57:27 +02:00
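The corrected precedence (ids file first, datasets only as a fallback) can be sketched as follows. `DemoConfig` and `load_language_ids` are hypothetical stand-ins; the real `LanguageManager` and config classes are not reproduced here.

```python
import json
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class DemoConfig:  # hypothetical stand-in for the real config object
    language_ids_file: Optional[str] = None
    datasets: list = field(default_factory=list)  # e.g. [{"language": "en"}]


def load_language_ids(config: DemoConfig) -> dict:
    """Prefer the language ids file; fall back to parsing the datasets."""
    if config.language_ids_file:
        text = Path(config.language_ids_file).read_text(encoding="utf-8")
        return json.loads(text)
    # Fallback: derive ids from the languages named in the datasets.
    languages = sorted({d["language"] for d in config.datasets})
    return {lang: idx for idx, lang in enumerate(languages)}


with tempfile.TemporaryDirectory() as tmp:
    ids_path = Path(tmp) / "language_ids.json"
    ids_path.write_text(json.dumps({"hi": 0, "en": 1}), encoding="utf-8")
    from_file = load_language_ids(
        DemoConfig(language_ids_file=str(ids_path), datasets=[{"language": "de"}])
    )

from_datasets = load_language_ids(
    DemoConfig(datasets=[{"language": "fr"}, {"language": "en"}])
)
```

Returning early once the file has been read is what prevents the dataset-derived mapping from overwriting it.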
Enno Hermann f7d69cc1d7 chore: update version to 0.23.0 2024-04-11 17:01:09 +02:00
Enno Hermann b3c9685aee fix(tokenizer): add debug logging 2024-04-11 16:58:12 +02:00
Enno Hermann 2ad790d169
Merge pull request #4 from idiap/hindi
feat(xtts): support Hindi for sentence-splitting and fine-tuning
2024-04-11 16:49:44 +02:00
Enno Hermann dfbe0168e9
Merge pull request #3 from idiap/logging
Use Python logging instead of print()
2024-04-11 08:34:44 +02:00
Enno Hermann d41686502e feat(xtts): support hindi for sentence-splitting and fine-tuning
The XTTS model itself already supports Hindi; it was just missing in these components.
2024-04-08 15:57:56 +02:00
Enno Hermann aa40fd277b docs: update links 2024-04-04 18:21:57 +02:00
Enno Hermann e689fd1d4a fix(utils.manage): remove bare except, improve messages 2024-04-03 15:19:45 +02:00
Enno Hermann 7dc5d1eb3d fix: logging in executables 2024-04-03 15:19:45 +02:00
Enno Hermann ab64844aba feat(utils.generic_utils): add custom formatter for logging to console 2024-04-03 15:19:45 +02:00
Enno Hermann 9b2d48f8a6 feat(utils.generic_utils): improve setup_logger() arguments and output 2024-04-03 15:19:45 +02:00
Enno Hermann b711e19cb6 refactor: remove verbose arguments
Can be handled by adjusting logging levels instead.
2024-04-03 15:19:45 +02:00
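Replacing a `verbose` argument with logging levels might look like the sketch below. The logger name and the `synthesize` function are hypothetical and only illustrate the pattern.

```python
import logging

logger = logging.getLogger("demo.synth")  # hypothetical logger name


def synthesize(text: str) -> str:
    """Toy function: reports progress through logging, takes no verbose flag."""
    logger.debug("input has %d characters", len(text))
    logger.info("synthesizing")
    return text.upper()


logging.basicConfig(level=logging.WARNING, format="%(levelname)s %(name)s: %(message)s")

synthesize("quiet")  # nothing is shown: DEBUG and INFO are below WARNING

logger.setLevel(logging.DEBUG)  # the caller opts in to more detail
synthesize("loud")  # now the DEBUG and INFO records are emitted
```

The function signature stays the same regardless of how much output the caller wants; verbosity becomes a configuration concern.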
Enno Hermann b6ab85a050 fix: use logging instead of print statements
Fixes #1691
2024-04-03 15:19:45 +02:00
Enno Hermann dd3768d4b1 chore: update version to v0.22.1 2024-04-03 12:31:39 +02:00
Enno Hermann d772724125 fix: update repository links, package names, metadata 2024-04-03 12:02:44 +02:00
Enno Hermann 7630abb43f refactor(bin.find_unique_chars): use existing function 2024-03-30 22:22:40 +01:00
Enno Hermann adbcba06da refactor(dataset): get audio length with torchaudio
Removes a (GPL) dependency
2024-03-14 20:48:29 +01:00
Enno Hermann e5c6da1c98
Merge pull request #20 from eginhard/return-complex
fix: torch.stft will soon require return_complex=True
2024-03-13 13:50:21 +01:00
Enno Hermann e95f8950eb fix: torch.stft will soon require return_complex=True
Refactor that removes the deprecation warning:
torch.view_as_real(torch.stft(*, return_complex=True)) is equal to
torch.stft(*, return_complex=False)

https://pytorch.org/docs/stable/generated/torch.stft.html
2024-03-13 12:06:27 +01:00
Enno Hermann 89a061f1d1 docs(tts.models.vits): clarify use of discriminator/generator
[ci skip]
2024-03-12 18:59:05 +01:00
Enno Hermann a7753708fb refactor: remove duplicate methods available in Trainer 2024-03-12 15:06:42 +01:00
Enno Hermann 7673f282be build: make dependencies for server optional 2024-03-10 20:16:00 +01:00
Enno Hermann d80f7f4eba
Fix fairseq (#11)
* fix fairseq mode

* Added line to fix fairseq model issue and made code cleaner.

---------

Co-authored-by: akgupta1337 <akgupta1337@gmail.com>
2024-03-09 16:43:42 +01:00
Enno Hermann 2e8f47a33d
Merge pull request #10 from eginhard/fix-pinyin
fix chinese pinyin phonemes
2024-03-09 16:23:28 +01:00
Enno Hermann 309f39a45f fix(xtts_manager): name_to_id() should return dict
This is how the other embedding managers work
2024-03-08 14:47:00 +01:00
Enno Hermann 1aef5ff091
Merge pull request #7 from eginhard/pin-black
Pin black for consistent outputs
2024-03-07 17:32:02 +01:00
Enno Hermann ed8740a39b
Merge pull request #6 from eginhard/fix-bark-url
Fix bark model url
2024-03-07 11:50:46 +01:00
Enno Hermann efdafd5a7f style: run black 2024-03-07 11:46:51 +01:00
Enno Hermann f6464d7682
Merge pull request #5 from eginhard/fix-list-models
Fix TTS().list_models()
2024-03-07 08:01:29 +01:00
Greer 02d88b5dec Fix TTS().list_models() 2024-03-06 23:24:02 +01:00
Enno Hermann 017c84d005 style: make style && make lint 2024-03-06 22:45:35 +01:00
Enno Hermann 4e183c61df fix(api): handle missing attribute in is_multilingual 2024-03-06 22:41:32 +01:00
Enno Hermann e05243c4c8 refactor: read/write csv files with standard library 2024-03-06 16:18:09 +01:00
Enno Hermann 24298da5fc
Merge pull request #1 from eginhard/lint-overhaul
Lint overhaul (pylint to ruff)
2024-03-06 16:10:26 +01:00
Enno Hermann 04d8d4b09a chore: remove unused imports 2024-03-06 13:27:43 +01:00
Nick Potafiy dbf1a08a0d
Update generic_utils.py (#3561)
Handles cases where `git branch` produces no output or invalid output. Right now, it just crashes with `StopIteration`.
2024-02-10 11:20:58 -03:00
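A `StopIteration` of this kind typically comes from calling `next()` on an iterator that yields nothing. The sketch below assumes the branch is picked from the line starting with `*`; `current_branch` is a hypothetical helper that parses a string rather than invoking git.

```python
def current_branch(git_branch_output: str, default: str = "unknown") -> str:
    """Return the branch marked with '*' in `git branch` output, or a default."""
    marked = (line for line in git_branch_output.splitlines() if line.startswith("*"))
    # next() with a default returns it instead of raising StopIteration.
    line = next(marked, None)
    if line is None:
        return default
    return line[1:].strip()


print(current_branch("  dev\n* main\n"))  # prints: main
print(current_branch(""))  # prints: unknown
```

Catching `StopIteration` explicitly would work as well; passing a default to `next()` just keeps the no-match case on the normal code path.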
wangjie b184e9f0fe fix chinese pinyin phonemes 2024-01-12 09:11:56 +08:00
Ivan Peevski 08e00e4b49
Fix bark model 2024-01-08 14:45:04 +10:30
Edresson Casanova 5dcc16d193
Bug fix in MP3 and FLAC compute length on TTSDataset (#3092)
* Bug Fix on XTTS load

* Bug fix in MP3 length on TTSDataset

* Update TTS/tts/datasets/dataset.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Uses mutagen for all audio formats

* Add dataloader test with all supported audio formats

* Use mutagen.File

* Update

* Fix aux unit tests

* Bug fix on unit tests

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-12-27 13:23:43 -03:00
Eren Gölge 55c7063724
Merge pull request #3423 from idiap/fix-aux-tests
Fix CI (save best model after 0 steps in tests)
2023-12-14 18:00:30 +01:00