4914 Commits

Author SHA1 Message Date
Enno Hermann f5e21489e5 ci: explicitly upload hidden files for coverage
Due to breaking change in upload-artifact action:
actions/upload-artifact#602
2024-09-12 23:37:19 +02:00
Enno Hermann 659b4852ba chore(bark): remove manual download of hubert model
Bark was previously adapted to download Hubert from HuggingFace, so the manual
download is superfluous.
2024-09-12 23:37:19 +02:00
Enno Hermann 86b58fb6d9 fix: define torch safe globals for torch.load
Required for loading some models using torch.load(..., weights_only=True). The
safe globals API is only available from PyTorch 2.4.
2024-09-12 23:37:19 +02:00
shavit 17ca24c3d6 fix: load weights only in torch.load 2024-09-12 23:37:19 +02:00
Enno Hermann 1920328822
feat(xtts): support hindi in tokenizer (#64)
Added proper tokenizer support for Hindi, which prevents a crash when fine-tuning on Hindi.

Co-authored-by: Akshat Bhardwaj <157223825+akshatrocky@users.noreply.github.com>
2024-09-12 21:29:21 +02:00
Azalea 233dfb54ae
docs(tacotron): fix wrong paper links (#74) 2024-08-25 12:27:27 +02:00
Enno Hermann 204588f7c5
Merge pull request #56 from idiap/update-gruut
Preparations for Numpy 2 support (gruut, soxr, spacy)
2024-08-05 13:31:26 +01:00
Enno Hermann 7014782ad4 build: add upper bound for transformers
4.43.* broke XTTS streaming again
2024-08-05 10:28:03 +02:00
Enno Hermann b1558b06d7 build: require numpy<2 because spacy/thinc lack support 2024-08-05 10:27:14 +02:00
Enno Hermann d304ab2769 build: update gruut version for numpy2 support 2024-08-05 10:27:14 +02:00
Enno Hermann 19fce2c87c
Merge pull request #66 from idiap/skip-broken-audio
Skip audio files that can't be decoded
2024-07-31 15:40:21 +01:00
Enno Hermann 9c604c1de0 chore(dataset): address lint issues 2024-07-31 15:47:27 +02:00
Enno Hermann 8c460d0cd0 fix(dataset): skip files where audio length can't be computed
Avoids hard failures when the audio can't be decoded.
2024-07-31 15:20:56 +02:00
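
The idea behind the fix above (skip a file instead of failing hard when its length can't be computed) might look roughly like the following. This is a minimal sketch, not the repository's code: it assumes WAV input read with the standard-library `wave` module, and the names `audio_length_or_none` and `filter_decodable` are hypothetical.

```python
import wave
from pathlib import Path
from typing import List, Optional


def audio_length_or_none(path: Path) -> Optional[int]:
    """Return the number of frames in a WAV file, or None if it can't be read.

    Hypothetical helper: the real dataset code may use a different decoder.
    """
    try:
        with wave.open(str(path), "rb") as wav:
            return wav.getnframes()
    except (wave.Error, EOFError, OSError):
        # Undecodable, truncated, or unreadable file: signal it instead of raising.
        return None


def filter_decodable(paths: List[Path]) -> List[Path]:
    """Keep only the files whose length could be computed."""
    kept = []
    for path in paths:
        if audio_length_or_none(path) is None:
            print(f"Skipping {path}: audio length can't be computed")
            continue
        kept.append(path)
    return kept
```

Returning `None` rather than raising keeps the decision to skip in one place, so a single broken file does not abort loading of the whole dataset.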
Daniel Walmsley 20bbb411c2
fix(xtts): update streaming for transformers>=4.42.0 (#59)
* Fix Stream Generator on MacOS

* Make it work on mps

* Implement custom tensor.isin

* Fix for latest TF

* Comment out hack for now

* Remove unused code

* build: increase minimum transformers version

* style: fix

---------

Co-authored-by: Enno Hermann <Eginhard@users.noreply.github.com>
2024-07-25 16:24:10 +02:00
Enno Hermann 20583a496e
Merge pull request #57 from idiap/xtts-vocab
fix(xtts): load tokenizer file based on config as last resort
2024-07-25 13:26:28 +01:00
Enno Hermann de35920317
Merge pull request #50 from idiap/umap
build: move umap-learn into optional notebook dependencies
2024-07-25 13:26:09 +01:00
Enno Hermann 9192ef1aa6 fix(xtts): load tokenizer file based on config as last resort 2024-07-05 13:52:01 +02:00
Abraham Mathews 6ea3b75b84
Update xtts.py (#53)
docs(xtts): fix typo in example
2024-07-02 13:43:52 +02:00
Enno Hermann c1a929b720
Merge pull request #51 from idiap/update-trainer
Update to coqui-tts-trainer 0.1.4
2024-07-02 09:49:23 +01:00
Enno Hermann 8cab2e3b4e ci: test lowest and highest compatible versions of dependencies 2024-06-29 17:33:33 +02:00
Enno Hermann 808a938171 build: specify minimum versions for dependencies 2024-06-29 17:33:33 +02:00
Enno Hermann 2d06aeb79b chore: remove unused TTS.utils.io module
All uses of these methods were replaced with the equivalents from coqui-tts-trainer.
2024-06-29 15:07:10 +02:00
Enno Hermann e869b9b658 refactor: use load_checkpoint from trainer 2024-06-29 15:07:10 +02:00
Enno Hermann da82d55329 refactor: use load_fsspec from trainer
Made automatically with:
rg "from TTS.utils.io import load_fsspec" --files-with-matches | xargs sed -i 's/from TTS.utils.io import load_fsspec/from trainer.io import load_fsspec/g'
2024-06-29 15:07:10 +02:00
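
For readers without `rg` and `sed`, the same mechanical rewrite could be approximated in Python. This is a sketch under assumptions: it only visits `*.py` files (the original command searches every file `rg` considers), and the function name `rewrite_imports` is hypothetical.

```python
from pathlib import Path
from typing import List

OLD = "from TTS.utils.io import load_fsspec"
NEW = "from trainer.io import load_fsspec"


def rewrite_imports(root: Path) -> List[Path]:
    """Replace OLD with NEW in every .py file under root; return the changed files."""
    changed = []
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        if OLD not in text:
            # Untouched files are not rewritten, mirroring --files-with-matches.
            continue
        path.write_text(text.replace(OLD, NEW), encoding="utf-8")
        changed.append(path)
    return changed
```

Calling `rewrite_imports(Path("."))` from the repository root would edit files in place, so it is best run on a clean working tree where the result can be reviewed with `git diff`.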
Enno Hermann 0fb26f97df refactor: use get_user_data_dir from trainer 2024-06-29 15:07:10 +02:00
Enno Hermann 28296c6458 refactor: use get_git_branch from trainer 2024-06-29 15:07:10 +02:00
Enno Hermann c693b08830 build: update trainer to 0.1.4 2024-06-29 15:07:08 +02:00
Enno Hermann 59ef28d708 build: move umap-learn into optional notebook dependencies
Except for notebooks, it's only used to show embedding plots during speaker
encoder training, in which case a warning is now shown to install it.
2024-06-26 23:53:17 +02:00
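
The optional-dependency pattern described in the commit above (warn instead of failing when the package is absent) could be sketched as follows. The function name `optional_dependency_available` and the warning text are assumptions for illustration, not the project's actual code.

```python
import importlib.util
import warnings


def optional_dependency_available(module_name: str, pip_name: str) -> bool:
    """Return True if module_name can be imported; otherwise warn and return False."""
    if importlib.util.find_spec(module_name) is None:
        warnings.warn(
            f"'{module_name}' is not installed, so this feature is skipped. "
            f"Install it with: pip install {pip_name}"
        )
        return False
    return True


# Usage: only produce embedding plots when the optional package is present.
if optional_dependency_available("umap", "umap-learn"):
    pass  # plotting code would go here
```

Checking with `find_spec` avoids importing the package just to test for it, and callers that never plot embeddings pay no cost.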
Enno Hermann ff2cd5c97d
Merge pull request #49 from idiap/vc-refactors
VC-related refactors and fixes
2024-06-26 14:01:21 +01:00
Enno Hermann 4bd3df2607 refactor: remove duplicate get_padding 2024-06-26 11:54:36 +02:00
Enno Hermann c30fb0f56b chore: remove duplicate init_weights 2024-06-26 11:46:37 +02:00
Enno Hermann c5241d71ab chore: address pytorch deprecations
torch.range(a, b) == torch.arange(a, b+1)

meshgrid indexing: https://github.com/pytorch/pytorch/issues/50276

checkpoint use_reentrant:
https://dev-discuss.pytorch.org/t/bc-breaking-update-to-torch-utils-checkpoint-not-passing-in-use-reentrant-flag-will-raise-an-error/1745

optimizer.step() before scheduler.step():
https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
2024-06-26 11:38:25 +02:00
Enno Hermann a755328e49 refactor(freevc): remove duplicate sequence_mask 2024-06-26 10:17:04 +02:00
Enno Hermann f8df19a10c refactor: remove duplicate convert_pad_shape 2024-06-26 10:17:04 +02:00
Enno Hermann cd7b6daf46 fix: clarify types, fix missing functions 2024-06-26 10:17:04 +02:00
Enno Hermann d65bcf65bb chore(freevc): remove duplicate DDSConv and ElementwiseAffine
Already exist as:
TTS.tts.layers.vits.stochastic_duration_predictor.DilatedDepthSeparableConv
TTS.tts.layers.vits.stochastic_duration_predictor.ElementwiseAffine
2024-06-26 10:17:04 +02:00
Enno Hermann 9f80e043e4 refactor(freevc): use existing layernorm 2024-06-26 10:17:04 +02:00
Enno Hermann 857cd55ce5 test(helpers): fix test_rand_segment, test_generate_path 2024-06-26 10:16:46 +02:00
Enno Hermann c9f7197862 test(helpers): add test_ prefix so tests actually run 2024-06-25 23:03:40 +02:00
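
Why the prefix matters: test runners collect tests by name, so a function without it is silently ignored. The sketch below shows this with the standard-library `unittest` loader, which collects methods starting with `test`; pytest's default collection similarly looks for the `test` prefix. The class and method names are hypothetical.

```python
import unittest
from typing import List


class HelperTests(unittest.TestCase):
    def rand_segment(self):
        # Never collected: the name lacks the "test" prefix, so this failure is invisible.
        self.fail("this would fail if it ever ran")

    def test_rand_segment(self):
        # Collected and run because of the prefix.
        self.assertEqual(1 + 1, 2)


def collected_test_names(case: type) -> List[str]:
    """Return the method names the default unittest loader would run."""
    return list(unittest.TestLoader().getTestCaseNames(case))
```

Here `collected_test_names(HelperTests)` lists only `test_rand_segment`, which is how a broken test can go unnoticed until it is renamed.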
Enno Hermann 98c0f86cb3
Merge pull request #46 from idiap/fix-xtts-streaming
Fix XTTS streaming for transformers update
2024-06-18 14:54:15 +01:00
Enno Hermann 4d9e18ea7d chore(stream_generator): address lint issues 2024-06-17 09:52:35 +02:00
Enno Hermann 2a281237d7 refactor(stream_generator): update code for transformers>=4.41.1
In line with
eed9ed6798/src/transformers/generation/utils.py
2024-06-17 09:52:35 +02:00
Enno Hermann 4b6da4e7ba refactor(stream_generator): update special tokens for transformers>=4.41.1
Fixes #31. The handling of special tokens in `transformers` was changed in
https://github.com/huggingface/transformers/pull/30624 and
https://github.com/huggingface/transformers/pull/30746. This updates the XTTS
streaming code accordingly.
2024-06-17 09:52:35 +02:00
Enno Hermann 81ac7abd58
Merge pull request #47 from idiap/numpy2
build: add numpy2 support
2024-06-17 08:48:18 +01:00
Enno Hermann bd9b21d946
Merge pull request #44 from idiap/phoneme-cleaners
Add multilingual phoneme cleaner
2024-06-17 08:47:15 +01:00
Enno Hermann 4bc0e75a08 build: add numpy2 support
Identified necessary code changes with the NPY201 ruff rule. Gruut is the only
dependency that doesn't support numpy2 yet.

NB: At build time numpy>=2.0.0 should be required to be able to build wheels
compatible with both numpy1+2:
https://numpy.org/devdocs/dev/depending_on_numpy.html#numpy-2-abi-handling
2024-06-16 22:10:33 +02:00
ChristianRomberg 3a20f4725f
fix(freevc): use the specified device for pretrained speaker encoder (#45)
Fixes coqui-ai#3787
2024-06-16 21:24:03 +02:00
Enno Hermann 9cfcc0a0f5 chore(cleaners): add type hints 2024-06-14 15:20:04 +02:00
Enno Hermann a1495d4bc1 fix(recipes): use multilingual phoneme cleaner in non-english recipes 2024-06-14 15:09:01 +02:00
Enno Hermann e5c208d254 feat(cleaners): add multilingual phoneme cleaner
This doesn't convert numbers into English words.
2024-06-14 15:06:03 +02:00
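
The difference noted in the commit above (no conversion of numbers into English words) can be illustrated with a toy pair of cleaners. Everything here is an assumption for illustration: the function names, the lowercasing and whitespace steps, and the tiny digit table are hypothetical and do not reproduce the project's cleaners.

```python
import re

# Hypothetical, deliberately tiny digit table, for illustration only.
_DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def collapse_whitespace(text: str) -> str:
    """Replace runs of whitespace with a single space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def english_style_cleaner(text: str) -> str:
    """Spell out ASCII digits as English words, then normalize."""
    text = re.sub(r"[0-9]", lambda m: f" {_DIGIT_WORDS[m.group()]} ", text)
    return collapse_whitespace(text.lower())


def multilingual_style_cleaner(text: str) -> str:
    """Normalize only; digits are left for a language-aware phonemizer."""
    return collapse_whitespace(text.lower())
```

For non-English text, spelling digits out in English would inject English words into the phoneme sequence, which is the kind of error a language-neutral cleaner avoids.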