Işık
ed1563b132
Merge branch 'dev' into fix-improvements/adjust-speech-rate-or-speed
2024-12-30 00:36:41 +03:00
isikhi
26128be422
feat: add adjust_speech_rate function to modify speech speed with more durable latents; also add missing TTS speed implementations
2024-12-28 23:08:08 +03:00
Jan Zípek
98080e282c
fix(xtts): use correct language code for Czech num2words call (#237)
* Fix num2words call using non-standard lang code
* build: update minimum num2words version
---------
Co-authored-by: Enno Hermann <enno.hermann@idiap.ch>
2024-12-28 13:25:46 +01:00
Enno Hermann
f89ce41924
fix(xtts): voice_dir should remain None if not specified (#224)
2024-12-19 17:22:23 +01:00
Enno Hermann
6a52c8a855
fix(bin): log to stdout in cli tools, unless pipe_out is set
This way the outputs are available for further downstream processing, e.g. with
grep. For TTS/bin/synthesize.py, if --pipe_out is set, log to stderr because
then only the output audio stream should be on stdout, e.g. to pipe it to aplay.
2024-12-17 11:38:39 +01:00
Enno Hermann
9d5fc60a5d
feat(manager): print download location when listing models (#213)
2024-12-16 10:28:25 +01:00
Enno Hermann
0df04cc259
docs: add notes about xtts fine-tuning
2024-12-14 16:19:38 +01:00
Enno Hermann
a425ba599d
feat: allow both Path and strings where possible and add type hints
2024-12-14 16:19:38 +01:00
Enno Hermann
e38dcbea7a
docs: streamline readme and reuse content in other docs pages
[ci skip]
2024-12-12 18:29:23 +01:00
Enno Hermann
849e75e967
docs: improve documentation
2024-12-12 18:23:17 +01:00
Enno Hermann
c0d9ed3d18
fix: handle difference in xtts/tortoise attention (#199)
2024-12-09 16:13:13 +01:00
Enno Hermann
b545ab8b80
Merge pull request #197 from idiap/api
Expand Python API capabilities
2024-12-06 18:02:54 +01:00
Enno Hermann
e0f621180f
refactor(bin.synthesize): use Python API for CLI
2024-12-06 17:07:54 +01:00
Enno Hermann
806af96e4c
refactor(api): use save_wav() from Synthesizer instance
2024-12-06 15:26:06 +01:00
Enno Hermann
89abd98620
feat(api): support passing speaker/language id file paths
2024-12-06 15:26:06 +01:00
Enno Hermann
a05177ce71
chore(api): add type hints
2024-12-06 15:26:06 +01:00
Enno Hermann
85dbb3b8b3
feat(api): allow mixing TTS and vocoder model name and path
2024-12-06 15:26:06 +01:00
Enno Hermann
e8d99aaf2b
Merge pull request #184 from idiap/xtts-error
fix(xtts): clearer error message when file given to checkpoint_dir
2024-12-06 06:46:48 +01:00
Enno Hermann
1a4e58d0ce
feat(api): support passing a custom speaker encoder by path
2024-12-05 21:19:07 +01:00
Enno Hermann
5daed879e0
chore(bin.synthesize): remove unused argument
2024-12-05 21:19:07 +01:00
Enno Hermann
42ad9b00c6
feat(api): support specifying vocoders by name
2024-12-05 21:19:07 +01:00
Enno Hermann
5cfb4ecccd
refactor(api): require keyword arguments except for model_name
2024-12-05 21:19:07 +01:00
Enno Hermann
8c381e3e48
docs: use .to("cuda") instead of deprecated gpu=True
2024-12-05 21:19:07 +01:00
Enno Hermann
fe14ca6b68
refactor(xtts): remove duplicate xtts audio config
2024-12-05 15:46:28 +01:00
Enno Hermann
3539e65d8e
refactor(synthesizer): set sample rate in loading methods
2024-12-02 23:26:28 +01:00
Enno Hermann
7d0416f99b
refactor(vc): rename TTS.vc.modules to TTS.vc.layers for consistency
Same as in TTS.tts and TTS.vocoder
2024-12-02 23:26:28 +01:00
Enno Hermann
546f43cb25
refactor: only use keyword args in Synthesizer
2024-12-02 23:26:27 +01:00
Enno Hermann
6927e0bb89
fix(api): clearer error message when model doesn't support VC
2024-12-02 23:26:27 +01:00
Enno Hermann
fce3137e0d
feat: add openvoice vc model
2024-12-02 23:26:27 +01:00
Enno Hermann
ca02d0352b
feat(openvoice): add to .models.json
2024-12-02 22:34:56 +01:00
Enno Hermann
95998374bf
feat(openvoice): add config classes
2024-12-02 22:34:56 +01:00
Enno Hermann
b97d5378a5
refactor(openvoice): remove duplicate and unused code
2024-12-02 22:34:56 +01:00
Enno Hermann
4124b9d663
feat(vits): add tau parameter to posterior encoder
2024-12-02 22:34:56 +01:00
akulkarni
6de98ff480
feat(openvoice): initial integration
2024-12-02 22:34:56 +01:00
Enno Hermann
ce202532cf
fix(xtts): clearer error message when file given to checkpoint_dir
2024-12-02 16:54:11 +01:00
Enno Hermann
63625e79af
refactor: import get_last_checkpoint from trainer.io
2024-11-29 13:59:43 +01:00
Enno Hermann
170d3dae92
refactor: remove duplicate to_camel
2024-11-24 19:57:14 +01:00
Enno Hermann
7330ad8854
refactor: move duplicate alignment functions into helpers
2024-11-24 19:57:14 +01:00
Enno Hermann
76df6421de
refactor: move more audio processing into torch_transforms
2024-11-24 19:57:14 +01:00
Enno Hermann
b1ac884e07
refactor: move shared function into dataset.py
2024-11-24 19:57:14 +01:00
Enno Hermann
54f4228a46
refactor(xtts): use existing cleaners
2024-11-24 19:57:14 +01:00
Enno Hermann
b45a7a4220
refactor: move exists() and default() into generic_utils
2024-11-24 19:57:14 +01:00
Enno Hermann
fa844e0fb7
refactor(tacotron): remove duplicate function
2024-11-24 19:57:14 +01:00
Enno Hermann
0f69d31f70
refactor(vocoder): remove duplicate function
2024-11-24 19:57:14 +01:00
Enno Hermann
6ecf47312c
refactor(xtts): use tortoise conditioning encoder
2024-11-24 19:57:14 +01:00
Enno Hermann
69a599d403
refactor(freevc): remove duplicate code
2024-11-24 19:57:14 +01:00
Enno Hermann
2e5f68df6a
refactor(wavernn): remove duplicate Stretch2d
I checked that the implementations are the same
2024-11-23 01:04:17 +01:00
Enno Hermann
e63962c226
refactor(losses): move shared losses into losses.py
2024-11-23 01:04:17 +01:00
Enno Hermann
6f25c2b904
refactor(delightful_tts): remove unused classes
2024-11-23 01:04:17 +01:00
Enno Hermann
7cdfde226b
refactor: move amp_to_db/db_to_amp into torch_transforms
2024-11-23 01:04:17 +01:00