Commit Graph

1942 Commits

Author SHA1 Message Date
Eren Gölge 77b18126c7
Merge pull request #3126 from akx/freevc-config-module
Move FreeVCConfig to TTS.vc.configs (like all other config classes)
2023-11-08 11:24:47 +01:00
Eren Gölge cc6e9fcaa7
Fix #3153 (#3169) 2023-11-08 11:13:58 +01:00
Eren Gölge a24ebcd8a6
Fix coqui api (#3168) 2023-11-08 10:51:23 +01:00
Julian Weber ce1a39a9a4
Add char limit warn (#3130)
* Add char limit warning

* Adding v2 langs

* cached_property for cutlet

* Fix import
2023-11-08 10:24:23 +01:00
Eren Gölge f846a9f300
Update to v0.20.1 2023-11-07 14:17:36 +01:00
Edresson Casanova cbdbc44e0f
Fix XTTS v2.0 training recipe (#3154)
* Fix XTTS v2.0 training recipe

* Update XTTS v2 model hash
2023-11-07 14:16:44 +01:00
Edresson Casanova 5f9ab6cfaa
Fix style
Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-11-06 19:22:34 -03:00
Edresson Casanova 2470599d18 Drop XTTS v1 2023-11-06 19:12:04 -03:00
Edresson Casanova 13243df526 Update XTTS v1.1 files 2023-11-06 19:10:21 -03:00
Edresson Casanova 09fb317e6d Remove unused code 2023-11-06 17:36:32 -03:00
Edresson Casanova b146de4ce8 Bug fix on XTTS v2.0 Trainer 2023-11-06 20:26:01 +01:00
Edresson Casanova 1b6f8d0e46 Update unit tests and recipes 2023-11-06 20:25:06 +01:00
Edresson Casanova 72b2bac0f8 Load reference in 24khz to avoid issued with multiple sr references 2023-11-06 20:25:06 +01:00
Edresson Casanova 00294ffdf6 Update XTTS docs 2023-11-06 20:24:06 +01:00
Edresson Casanova 459ad70dc8 Add support for multiples speaker references on XTTS inference 2023-11-06 20:22:35 +01:00
Eren Gölge f0cb19ecca
Drop diffusion from XTTS (#3150)
* Drop diffusion for XTTS

* Make style

* Drop diffusion deps in code

* Restore thrashed
2023-11-06 20:15:49 +01:00
Eren G??lge 5d418bb84a Update docs 2023-11-06 18:48:41 +01:00
Eren G??lge 9bbf6eb8dd Drop use_ne_hifigan 2023-11-06 18:43:38 +01:00
Eren G??lge 9d54bd7655 Fixup XTTS 2023-11-06 18:13:58 +01:00
Eren Gölge c713a839da
Update VERSION 2023-11-06 15:51:56 +01:00
Edresson Casanova e45227d9ff
XTTS v2.0 (#3137)
* Implement most similar ref training approach

* Use non-enhanced hifigan for test samples

* Add Perceiver

* Update GPT Trainer for perceiver support

* Update XTTS docs

* Bug fix masking with XTTS perceiver

* Bug fix on gpt forward

* Bug Fix on XTTS v2.0 training

* Add XTTS v2.0 unit tests

* Add XTTS v2.0 inference unit tests

* Bug Fix on diffusion inference

* Add XTTS v2.0 training recipe

* Placeholder model entry

* Add cloning params to config

* Make prompt embedding configurable

* Make cloning configurable

* Cheap fix for a cheaper fix

* Prevent resampling

* Update model entry

* Update docs

* Update requirements

* Code linting

* Add xtts v2 to sep tests

* Bug fix on XTTS get_gpt_cond_latents

* Bug fix on rebase

* Make style

* Bug fix in Japenese tokenizer

* Add num2words to deps

* Remove unused kwarg and added num_beams=1 as default

---------

Co-authored-by: Eren G??lge <egolge@coqui.ai>
2023-11-06 14:58:18 +01:00
Aarni Koskela 38f6f8f0bb
Run `make style` & re-enable it in CI (#3127) 2023-11-06 11:36:37 +01:00
Aarni Koskela 5ae369d629 Move FreeVCConfig to TTS.vc.configs (like all other config classes) 2023-10-31 16:56:25 +02:00
Eren Gölge 6fef4f9067
Bump up to v0.19.1 2023-10-30 10:37:28 +01:00
Eren Gölge eccc94be9b
Merge pull request #2983 from vltmedia/dev
Bug: self.model_name needed to be initialized.
2023-10-28 10:39:25 +02:00
Eren Gölge 2d6bd716ef
Merge pull request #3109 from coqui-ai/tts_3067
fix for issue 3067
2023-10-28 10:37:52 +02:00
WeberJulian 1c98821359 Remove unused load_audio function 2023-10-27 22:27:18 +02:00
Aya Jafari 041b4b6723 fix for issue 3067 2023-10-26 13:06:01 -03:00
WeberJulian d4e08c8d6c Add features to get_conditioning_latents 2023-10-26 14:57:33 +02:00
WeberJulian c1133724a1 Move lang token add to tokenizer 2023-10-26 14:52:13 +02:00
WeberJulian 6fa46d197d Fix get_conditioning_latents when using only ne 2023-10-26 14:51:35 +02:00
Eren Gölge edd3a28723
Bump up to v0.19.0 2023-10-25 13:29:38 +02:00
Edresson Casanova 01839af926 Bug fix on XTTS masking training 2023-10-24 18:30:14 -03:00
VLT Media 818aa0eb7e
Merge branch 'coqui-ai:dev' into dev 2023-10-23 23:36:33 -04:00
Edresson Casanova 0f96abb5ec Add FT inference example on XTTS docs 2023-10-23 13:23:30 -03:00
Edresson Casanova 37b7945474 Update XTTS train not implemented error to point to the XTTS docs 2023-10-23 11:39:17 -03:00
Edresson Casanova ec7f54768a Rebase bug fix and update recipe 2023-10-21 17:37:51 -03:00
Edresson Casanova affaf11148 Add XTTS training unit test 2023-10-21 13:41:12 -03:00
Edresson Casanova 1f92741d6a Fix issue #2971 2023-10-21 13:37:21 -03:00
Edresson Casanova 5f98dbeec9 Update Ljspeech XTTS recipe 2023-10-21 13:37:21 -03:00
Edresson Casanova 9e3598c3b7 Bug Fix on inference using XTTS trainer checkpoint 2023-10-21 13:37:21 -03:00
Edresson Casanova c4ceaabe2c Add test sentences during the training 2023-10-21 13:33:56 -03:00
Edresson Casanova 2f868dd5c2 Bug fix on reproducible evaluation 2023-10-21 13:33:56 -03:00
Edresson Casanova bafab049c2 Add prompting masking 2023-10-21 13:33:56 -03:00
Edresson Casanova 47d613df3a Add reproducible evaluation 2023-10-21 13:33:56 -03:00
Edresson Casanova 40a4e631ea Update mel spectrogram for the style encoder 2023-10-21 13:33:56 -03:00
Edresson Casanova a32961bcb4 Add XTTS base training code 2023-10-21 13:33:56 -03:00
Eren Gölge 1e152692ed
Bump up to v0.18.2 2023-10-21 17:29:53 +02:00
Julian Weber dad6a7b0b6
Preserve [ja] token of the text processing 2023-10-21 11:26:03 +02:00
Julian Weber c7a16042e3
Remove global cutlet import 2023-10-21 11:18:58 +02:00
Edresson Casanova 414f0de0a1
Bump up to v0.18.1 2023-10-20 17:30:58 -03:00
Edresson Casanova 59576fc0ec
Bug fix on XTTS v1.1 inference (#3093)
* Bug fix on XTTS v1.1 inference

* Update .models.json

---------

Co-authored-by: Julian Weber <julian.weber@hotmail.fr>
2023-10-20 17:29:43 -03:00
Eren Gölge 85e7323739
Bump up to v0.18.0 2023-10-20 16:03:24 +02:00
Julian Weber cf97116185
XTTS v1.1 (#3089)
* Add support for ne_hifigan

* Update model.json

* Update hash

* Fix model loading

* Enhance text_normalization

* Add xtts to zoo test exception

* Add model hash check

* Add get_number_tokens
2023-10-20 16:02:08 +02:00
Eren Gölge 747f688dc3
Bump up to v0.17.10 2023-10-19 12:00:15 +02:00
Eren Gölge 93e6961bb5
Update .models.json 2023-10-19 11:59:49 +02:00
Eren Gölge bf68848f38
Bump up to v0.17.9 2023-10-19 11:22:42 +02:00
Eren Gölge c3b011217d
Update .models.json 2023-10-19 11:21:21 +02:00
David Garvey a151d70242
Add stdout option (#3027)
* add add cli options for play and speed
--play argument uses simpleaudio to play the tts wav
--speed <float 0.0-2.0> passes speed argument to Coqui Studio models

* remove simpleaudio not referenced in file

* fix simpleaudio dependency version

* add ALSA headers for simpleaudio compilation

* Dockerfile ALSA headers for simpleaudio

* base changes to use stdout instead of play audio
Considering conversion to pipe wav data for audio playback with ohter program
like aplay.

This is incomplete code. Using to get feedback before proceeding with
implementation.

* remove play for pipe_out arg that suppresses stdout
removed play and simpleaudio dependency in place of pipe
fuctionality to allow passing wav file data to a program
dedicated to playing audio.

* scipy.io.wavfile.write fails with /dev/null target

* Streaming inference for XTTS 🚀 (#3035)

* v0.17.7

* Redownload XTTS with the local and remote config do not match

* Remove unused method

* Print a message when it is already donwloaded

* Try-except to present error when the user dont have connection

* Fix style

* 0.17.8

* v0.17.8

---------

Co-authored-by: Julian Weber <julian.weber@hotmail.fr>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
Co-authored-by: Edresson Casanova <edresson1@gmail.com>
Co-authored-by: ggoknar <ggoknar@coqui.ai>
2023-10-16 12:07:21 +02:00
Dusty Hagstrom 13cd076a7f
Synthesizer skips over embeddings file if model only has one speaker (#2587)
* It looks like the Neon model is special in that t does not have a speaker_name and it wants to get the only item available. This was blocking a valid model with one speaker and a d_vector_file from being executed to get the embedding.

* Update synthesizer.py

oh my how embarrassing
2023-10-16 11:55:45 +02:00
Aya Jafari ffddf10458 unit test fix 2023-10-13 10:56:47 -03:00
Aya Jafari 6eaecab0ca fixed bugs in fastpitch tts synthesis 2023-10-10 23:02:31 -03:00
ggoknar 99635193f5 v0.17.8 2023-10-07 01:14:05 +03:00
ggoknar 3bb51b1276 0.17.8 2023-10-07 01:13:02 +03:00
Edresson Casanova 2852404bdf Fix style 2023-10-06 17:42:46 -03:00
Edresson Casanova 99650044a4 Try-except to present error when the user dont have connection 2023-10-06 17:37:05 -03:00
Edresson Casanova 529ea3f67f Print a message when it is already donwloaded 2023-10-06 17:26:40 -03:00
Edresson Casanova ee1ef1c51e Remove unused method 2023-10-06 17:21:22 -03:00
Edresson Casanova 4a6103fec9 Redownload XTTS with the local and remote config do not match 2023-10-06 17:16:30 -03:00
Eren Gölge 0520697b5f
v0.17.7 2023-10-06 18:35:26 +02:00
Julian Weber e5e0cbffc9
Streaming inference for XTTS 🚀 (#3035) 2023-10-06 18:34:06 +02:00
OPERATOR 2150136210
None is not able to be read for "XTTS", fixes crash if its set to None. (#3009) 2023-10-02 12:53:36 +02:00
Eren Gölge 155c5fc0bd
v0.17.6 2023-09-29 23:44:09 +02:00
Edresson Casanova 4c3c11c958
Tortoise inference fix and fix zoo unit tests (#3010) 2023-09-29 13:40:57 +02:00
Eren Gölge bb05dcb9b4
Merge pull request #2922 from coqui-ai/be_tts
Adding Belarusian TTS model
2023-09-27 09:48:28 +02:00
Eren Gölge 8cba47191f
Merge pull request #2993 from akx/tts-readme
Ensure `tts` CLI tool readme and usage is in sync
2023-09-27 09:46:54 +02:00
Eren Gölge ea51a7ffcc
Merge pull request #3003 from akx/duplicate-code-removal
Duplicate code removal
2023-09-27 09:41:35 +02:00
Aarni Koskela 0dbe7cbcc4 Remove duplicate convert_pad_shape 2023-09-27 01:10:48 +03:00
Aarni Koskela 33a7c722f6 Merge duplicate on_train_step_start functions in delightful_tts 2023-09-27 01:10:44 +03:00
Aarni Koskela 861c68b0b8 Rename misnamed setter 2023-09-27 01:09:59 +03:00
Aarni Koskela 09e14e68db Remove duplicate get_named_beta_schedules 2023-09-27 01:09:59 +03:00
Aarni Koskela 59f85a7122 Remove duplicate code from xtts.tokenizer 2023-09-27 01:09:59 +03:00
Aarni Koskela 0a82f063cc Late-import main TTS libraries in `tts` CLI 2023-09-26 15:38:56 +03:00
Aarni Koskela 5c047cf304 Ensure `tts` CLI tool readme and usage help is in sync 2023-09-26 15:38:56 +03:00
Eren Gölge 0b95b88f13
Bum up to v0.17.5 2023-09-25 18:16:45 +02:00
VLT Media dd73910651
Bug: self.model_name needed to be initialized.
Bug: self.model_name needed to be initialize to get around a bug that automatically crashes when the user provides the model paths but no model_name when initializing the TTS object.
2023-09-23 01:41:35 -04:00
loupzeur da8b6bbce1
fix: xtts not taking into account device flag (#2951)
* fix: xtts not taking into account device flag

* Style changes

---------

Co-authored-by: Julian Weber <julian.weber@hotmail.fr>
2023-09-20 09:57:02 +02:00
Reuben Morais f829bf50f8
Bump version to v0.17.4 (really) 2023-09-15 16:40:34 +02:00
Eren G??lge aa8fa4756e Bump up to v0.17.4 2023-09-14 17:52:44 +02:00
Eren G??lge 9d0b76ce23 Check env var for COQUI_TOS_AGREED 2023-09-14 17:51:40 +02:00
Eren G??lge 13dd7c4c9e Bump up to v0.17.2 2023-09-14 15:24:05 +02:00
Eren G??lge ded7fd4fb2 Make style 2023-09-14 15:23:37 +02:00
Eren G??lge 44b61d2b92 Fixup 2023-09-14 15:22:54 +02:00
Eren Gölge 623ea41634
Fix model tests (#2943) 2023-09-14 15:21:48 +02:00
Eren G??lge af62613c86 Bump up to v0.17.1 2023-09-13 18:23:39 +02:00
Eren G??lge ee7cee0e35 Fixup 2023-09-13 18:21:44 +02:00
Eren G??lge 5dcf9ae311 Bump up v0.17.0 2023-09-13 18:04:26 +02:00
Eren Gölge 4033db5f4b 🔥 XTTS implementation 2023-09-13 17:51:24 +02:00
Edresson Casanova 4d3f23b5d3
Add CML-TTS dataset YourTTS training recipe (#2934) 2023-09-12 11:49:14 +02:00
Eren Gölge 9533f8656c Make style 2023-09-04 13:58:37 +02:00