Commit Graph

398 Commits

Author SHA1 Message Date
Edresson Casanova e45227d9ff
XTTS v2.0 (#3137)
* Implement most similar ref training approach

* Use non-enhanced hifigan for test samples

* Add Perceiver

* Update GPT Trainer for perceiver support

* Update XTTS docs

* Bug fix masking with XTTS perceiver

* Bug fix on gpt forward

* Bug Fix on XTTS v2.0 training

* Add XTTS v2.0 unit tests

* Add XTTS v2.0 inference unit tests

* Bug Fix on diffusion inference

* Add XTTS v2.0 training recipe

* Placeholder model entry

* Add cloning params to config

* Make prompt embedding configurable

* Make cloning configurable

* Cheap fix for a cheaper fix

* Prevent resampling

* Update model entry

* Update docs

* Update requirements

* Code linting

* Add xtts v2 to sep tests

* Bug fix on XTTS get_gpt_cond_latents

* Bug fix on rebase

* Make style

* Bug fix in Japenese tokenizer

* Add num2words to deps

* Remove unused kwarg and added num_beams=1 as default

---------

Co-authored-by: Eren Gölge <egolge@coqui.ai>
2023-11-06 14:58:18 +01:00
Aarni Koskela 38f6f8f0bb
Run `make style` & re-enable it in CI (#3127) 2023-11-06 11:36:37 +01:00
Julian Weber cf97116185
XTTS v1.1 (#3089)
* Add support for ne_hifigan

* Update model.json

* Update hash

* Fix model loading

* Enhance text_normalization

* Add xtts to zoo test exception

* Add model hash check

* Add get_number_tokens
2023-10-20 16:02:08 +02:00
David Garvey a151d70242
Add stdout option (#3027)
* add add cli options for play and speed
--play argument uses simpleaudio to play the tts wav
--speed <float 0.0-2.0> passes speed argument to Coqui Studio models

* remove simpleaudio not referenced in file

* fix simpleaudio dependency version

* add ALSA headers for simpleaudio compilation

* Dockerfile ALSA headers for simpleaudio

* base changes to use stdout instead of play audio
Considering conversion to pipe wav data for audio playback with other program
like aplay.

This is incomplete code. Using to get feedback before proceeding with
implementation.

* remove play for pipe_out arg that suppresses stdout
removed play and simpleaudio dependency in place of pipe
functionality to allow passing wav file data to a program
dedicated to playing audio.

* scipy.io.wavfile.write fails with /dev/null target

* Streaming inference for XTTS 🚀 (#3035)

* v0.17.7

* Redownload XTTS with the local and remote config do not match

* Remove unused method

* Print a message when it is already donwloaded

* Try-except to present error when the user dont have connection

* Fix style

* 0.17.8

* v0.17.8

---------

Co-authored-by: Julian Weber <julian.weber@hotmail.fr>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
Co-authored-by: Edresson Casanova <edresson1@gmail.com>
Co-authored-by: ggoknar <ggoknar@coqui.ai>
2023-10-16 12:07:21 +02:00
Dusty Hagstrom 13cd076a7f
Synthesizer skips over embeddings file if model only has one speaker (#2587)
* It looks like the Neon model is special in that it does not have a speaker_name and it wants to get the only item available. This was blocking a valid model with one speaker and a d_vector_file from being executed to get the embedding.

* Update synthesizer.py

oh my how embarrassing
2023-10-16 11:55:45 +02:00
Edresson Casanova 2852404bdf Fix style 2023-10-06 17:42:46 -03:00
Edresson Casanova 99650044a4 Try-except to present error when the user dont have connection 2023-10-06 17:37:05 -03:00
Edresson Casanova 529ea3f67f Print a message when it is already donwloaded 2023-10-06 17:26:40 -03:00
Edresson Casanova ee1ef1c51e Remove unused method 2023-10-06 17:21:22 -03:00
Edresson Casanova 4a6103fec9 Redownload XTTS with the local and remote config do not match 2023-10-06 17:16:30 -03:00
Eren Gölge bb05dcb9b4
Merge pull request #2922 from coqui-ai/be_tts
Adding Belarusian TTS model
2023-09-27 09:48:28 +02:00
Eren Gölge 9d0b76ce23 Check env var for COQUI_TOS_AGREED 2023-09-14 17:51:40 +02:00
Eren Gölge ded7fd4fb2 Make style 2023-09-14 15:23:37 +02:00
Eren Gölge 44b61d2b92 Fixup 2023-09-14 15:22:54 +02:00
Eren Gölge 623ea41634
Fix model tests (#2943) 2023-09-14 15:21:48 +02:00
Eren Gölge 4033db5f4b 🔥 XTTS implementation 2023-09-13 17:51:24 +02:00
Eren Gölge 562a9509f2 Add BE model 2023-09-04 13:57:03 +02:00
Cohee b3b1555d82
Fix exception handling in manage.py (#2912) 2023-09-04 12:54:30 +02:00
Jake Tae 409db505d2
Add device support in TTS and Synthesizer (#2855)
* fix: resolve merge conflicts

* fix: retain backwards compatability in functions

* feature: utilize device for voice transfer

* feature: use device for vocoder

* chore: cleanup vocoder cpu logic

* fix: add necessary vocoder output device check

* fix: add necessary vocoder output device check

* fix: indentation

* fix: check if waveform is pt tensor before cpu conversion

---------

Co-authored-by: Jake Tae <jaketae@Jakes-MacBook-Pro-2.local>
2023-08-14 21:04:44 +02:00
Julian Weber febcaf710a
Add customizable data home path (#2871)
* Add customizable data home path

* Add TTS_HOME as an option
2023-08-14 21:02:48 +02:00
Eren Gölge 3a104d5c49
Update Studio API for XTTS (#2861)
* Update Studio API for XTTS

* Update the docs

* Update README.md

* Update README.md

Update README
2023-08-13 12:04:12 +02:00
Eren Gölge 17ddd65741 Please p3.11 2023-07-31 15:53:19 +02:00
Aleś Bułojčyk d124f78430
Recipe for Belarusian TTS (#2756)
* Changes from jhlfrfufyfn <jhlfrfufyfn@gmail.com>

* Recipe for Belarusian TTS

---------

Co-authored-by: jhlfrfufyfn <jhlfrfufyfn@gmail.com>
2023-07-31 10:26:21 +02:00
logan hart 6fdb88f8e2
Add Delightful-TTS implementation (#2095)
* add configs

* Update config file

* Add model configs

* Add model layers

* Add layer files

* Add layer modules

* change config names

* Add emotion manager

* fIX missing ap bug

* Fix missing ap bug

* Add base TTS e2e class

* Fix wrong variable name in load_tts_samples

* Add training script

* Remove range predictor and gaussian upsampling

* Add helper function

* Add vctk recipe

* Add conformer docs

* Fix linting in conformer.py

* Add Docs

* remove duplicate import

* refactor args

* Fix bugs

* Removew emotion embedding

* remove unused arg

* Remove emotion embedding arg

* Remove emotion embedding arg

* fix style issues

* Fix bugs

* Fix bugs

* Add unittests

* make style

* fix formatter bug

* fix test

* Add pyworld compute pitch func

* Update requirments.txt

* Fix dataset Bug

* Chnge layer norm to instance norm

* Add missing import

* Remove emotions.py

* remove ssim loss

* Add init layers func to aligner

* refactor model layers

* remove audio_config arg

* Rename loss func

* Rename to delightful-tts

* Rename loss func

* Remove unused modules

* refactor imports

* replace audio config with audio processor

* Add change sample rate option

* remove broken resample func

* update recipe

* fix style, add config docs

* fix tests and multispeaker embd dim

* remove pyworld

* Make style and fix inference

* Split tts tests

* Fixup

* Fixup

* Fixup

* Add argument names

* Set "random" speaker in the model Tortoise/Bark

* Use a diff f0_cache path for delightfull tts

* Fix delightful speaker handling

* Fix lint

* Make style

---------

Co-authored-by: loganhart420 <loganartpersonal@gmail.com>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-07-24 13:41:26 +02:00
JiangCheng 53938e2d32 Squashed commit of the following:
commit dd612fd72e
Author: JiangCheng <jiangcheng@kezaihui.com>
Date:   Mon Jun 5 16:04:54 2023 +0800

    Failed to download the file and need to delete the created file path
2023-07-05 12:08:05 +02:00
Eren Gölge 34b9a18c47 Fixup 2023-06-28 12:26:04 +02:00
Eren Gölge 6b9ebf5aab Merge branch 'p3_11' into dev 2023-06-28 12:13:04 +02:00
Eren Gölge c844b6570a
Inference API for 🐶Bark (#2685)
* Add bark requirements

* Draft Bark implementation

* Download HF models

* Update synthesizer

* Add bark model

* Make style

* Update pylintrc

* Update model URLs

* Update Bark Config

* Fix here and ther

* Make style

* Make lint

* Update requirements

* Update requirements
2023-06-28 11:55:27 +02:00
Eren Gölge a1c431e6a9 Fixups 2023-06-26 12:55:18 +02:00
Eren Gölge a58fb6c01b Update requirements 2023-06-22 13:53:19 +02:00
Eren Gölge e888e8a56d Fix manage 2023-06-22 10:13:20 +02:00
Eren Gölge fff8b762bc
Merge branch 'dev' into bark 2023-06-21 15:49:05 +02:00
Eren Gölge 0f8932a6a9 Fix here and ther 2023-06-21 11:59:27 +02:00
Eren Gölge f4c88ed677 Make style 2023-06-19 14:22:32 +02:00
Eren Gölge 2364c38d16 Update synthesizer 2023-06-19 14:15:21 +02:00
Eren Gölge 5a31fad502 Download HF models 2023-06-19 14:14:04 +02:00
Eren Gölge e785d101a1
Port Fairseq TTS models (#2628)
* Load fairseq models

* Add docs and missing files

* Managing fairseq models and docs for API

* Make style

* Use scarf URL

* Add tests

* Fix URL

* Pass cpu

* Make lint

* Fixup

* Make lint

* fixup

* Fixup

* Change tokenization order

* Update README

* Fixup

* Fixup
2023-06-05 11:15:13 +02:00
Shukrullo Turgunov 0d5e68a09f
fix typo (#2647)
* fix typo

* typo fix
2023-06-05 09:58:16 +02:00
manmay nakhashi a3d5801c44
Tortoise TTS inference (#2547)
* initial commit

* Tortoise inference

* revert path change

* style fix

* remove accidental remove

* style fixes

* style fixes

* removed unwanted assests and deps

* remove changes

* remove cvvp

* style fix black

* added tortoise config and updated config and args, refactoring the code

* added tortoise to api

* Pull mel_norm from url

* Use TTS cleaners

* Let download model files

* add ability to pass tortoise presets through coqui api

* fix tests

* fix style and tests

* fix tts commandline for tortoise

* Add config.json to tortoise

* Use kwargs

* Use regular model api for loading tortoise

* Add load from dir to synthesizer

* Fix Tortoise floats

* Use model_dir when there are multiple urls

* Use `synthesize` when exists

* lint fixes and resolve preset bug

* resolve a download bug and update model link

* fix json

* do tortoise inference from voice dir

* fix

* fix test

* fix speaker id and remove assests

* update inference_tests.yml

* replace inference_test.yml

* fix extra dir as None

* fix tests

* remove space

* Reformat docstring

* Add docs

* Update docs

* lint fixes

---------

Co-authored-by: Eren Gölge <egolge@coqui.ai>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-05-16 00:58:21 +02:00
Eren Gölge 9b5822d625
Update VAD for silence trimming. (#2604)
* Update vad for mp3 and fault tolerance

* Make style

* Remove importt

* Remove stupid defaults
2023-05-11 11:09:23 +02:00
prakharpbuf c1875f68df
typos and minor fixes (#2508)
* Update tacotron1-2.md

* Update README.md

* Update Tutorial_2_train_your_first_TTS_model.ipynb

* Update synthesizer.py

There is no arg called --speaker_name

* Update formatting_your_dataset.md

* Update AnalyzeDataset.ipynb

* Update AnalyzeDataset.ipynb

* Update AnalyzeDataset.ipynb

* Update finetuning.md

* Update train_yourtts.py

* Update train_yourtts.py

* Update train_yourtts.py

* Update finetuning.md
2023-04-26 15:22:57 +02:00
Eren Gölge 758ef84cc2 Using 🐸Studio models with `tts` command 2023-04-13 14:14:41 +02:00
Eren Gölge a49c1931d9 Fixup 2023-04-10 13:33:42 +02:00
Eren Gölge 30109af2a0
Merge pull request #2480 from MattyB95/librosa_v0.10.0
Update Librosa Version To V0.10.0
2023-04-07 12:32:33 +02:00
Eren Gölge ad8b9bf2be
🐸 Coqui Studio API integration (#2484)
* Warn when lang is not avail

* Make style

* Implement Coqui Studio API

* Test

* Update docs

* Set action

* Make style

* Make lint

* Update README

* Make style

* Fix action

* Run actions
2023-04-05 15:06:50 +02:00
Matthew Boakes 4c829e74a1 Update Librosa Version To V0.10.0 2023-04-05 00:59:20 +01:00
Eren Gölge d309f50e53
Implement FreeVC (#2451)
* Update .gitignore

* Draft FreeVC implementation

* Tests and relevant updates

* Update API tests

* Add missings

* Update requirements

* :(

* Lazy handle for vc

* Update docs for voice conversion

* Make style
2023-03-25 18:33:23 +01:00
Roee Shenberg 3c15f0619a
Bug fixes in OverFlow audio generation (#2380) 2023-03-15 12:02:11 +01:00
Eren Gölge 914280a556
Bump up to v0.11.0 (#2329)
* Make style

* Bump up to v0.11.0
2023-02-08 13:58:49 +01:00
Eren Gölge 85b3a04b37 Merge branch 'api_model_path' into dev 2023-02-06 11:18:00 +01:00
marius851000 1f4d8bf0f1
Fix tts-server for multi-lingual models (#2257) 2023-02-06 10:54:34 +01:00
Eren Gölge 7fddabc8ac Implement cloning in API 2023-01-30 13:35:48 +01:00
manmay nakhashi bc422f2f3c
Fastspeech2 (#2073)
* added EnergyDataset

* add energy to Dataset

* add comupte_energy

* added energy params

* added energy to forward_tts

* added plot_avg_energy for visualisation

* Update forward_tts.py

* create file

* added fastspeech2 recipe

* add fastspeech2 config

* removed energy from fast pitch

* add energy loss to forward tts

* Update fastspeech2_config.py

* change run_name

* Update numpy_transforms.py

* fix typo

* fix typo

* fix typo

* linting issues

* use_energy default value --> False

* Update numpy_transforms.py

* linting fixes

* fix typo

* liniting_fix

* liniting_fix

* fix

* fixes

* fixes

* lint fix

* lint fixws

* added training test

* wrong import

* wrong import

* trailing whitespace

* style fix

* changed class name because of error

* class name change

* class name change

* change class name

* fixed styles
2023-01-15 22:39:22 +01:00
Khalid Bashir 42afad5e79
Fixed bug related to yourtts speaker embeddings issue (#2234)
* Fixed bug related to yourtts speaker embeddings issue

* Reverted code for base_tts

* Bug fix on VITS d_vector_file type

* Ignore the test speakers on YourTTS recipe

* Add speaker encoder model and config on YourTTS recipe to easily do zero-shot inference

* Update YourTTS config file

* Update ModelManager._update_path to deal with list attributes

* Fix lint checks

* Remove unused code

* Fix unit tests

* Reset name_to_id to get the right speaker ids on load_embeddings_from_list_of_files

* Set weighted_sampler_multipliers as an empty dict to prevent users' mistakes

Co-authored-by: Edresson Casanova <edresson1@gmail.com>
2023-01-02 14:20:02 +01:00
Eren Gölge 8c32a6998a Add pth files to manager 2022-12-26 14:29:25 +01:00
Eren Gölge ecea43ec81
Adding pre-trained Overflow model (#2211)
* Adding pretrained Overflow model

* Stabilize HMM

* Fixup model manager

* Return `audio_unique_name` by default

* Distribute max split size over datasets

* Fixup eval_split_size

* Make style
2022-12-14 16:55:48 +01:00
Eren Gölge 1ddc484b49
Python API implementation (#2195)
* Draft implementation

* Fix style

* Add api tests

* Fix lint

* Update docs

* Update tests

* Set env

* Fixup

* Fixup

* Fix lint

* Revert
2022-12-12 12:04:20 +01:00
logan hart ff9b63d02a
Add neon models (#2140)
* Add neon ljspeech vits model

* Add neon german model

* Update .models.json

* Add neon spanish model

* Add french model

* Add Dutch model

* Add Hungarian model

* Add Greek model

* Remove uneeded description

* Update .models.json

* Update .models.json

* Handling neon models

* Add all neon models

* Update .models.json

* Split zoo_tests

* Update test names

* Update model testing

Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-11-16 16:12:39 +01:00
Eren Gölge 8cb1433e6e
Cache fsspec downloads (#2132)
* Cache fsspec downloaded files

* Use diff paths for test

* Make fsspec caching optional

* Decom GPU docker tests

* Make progress bar optional for better CI log

* Check path local
2022-11-09 22:12:48 +01:00
Victor Shepardson 5307a2229b
Fix Capacitron training (#2086) 2022-11-01 12:52:06 +01:00
Eren Gölge 5f5d441ee5
Write non-speech files in a TXT (#2048)
* Write non-speech files in a txt

* Save 16-bit wav out of vad
2022-10-06 13:25:54 +02:00
Eren Gölge 9e5a469c64
d-vector handling (#1945)
* Update BaseDatasetConfig

- Add dataset_name
- Chane name to formatter_name

* Update compute_embedding

- Allow entering dataset by args
- Use released model by default
- Use the new key format

* Update loading

* Update recipes

* Update other dep code

* Update tests

* Fixup

* Load multiple embedding files

* Fix argument names in dep code

* Update docs

* Fix argument name

* Fix linter
2022-09-13 14:10:33 +02:00
Edresson Casanova 371772c355
Replace pyworld by pyin (#1946)
* Replace pyworld by pyin

* Fix unit tests
2022-09-09 10:43:14 +02:00
Eren Gölge bfc63829ac
Implement bucketed weighted sampling for VITS (#1871) 2022-08-15 11:08:11 +02:00
Eren Gölge d46fbc240c
Introduce numpy and torch transforms (#1705)
* Refactor audio processing functions

* Add tests for numpy transforms

* Fix imports

* Fix imports2
2022-08-08 11:57:50 +02:00
p0p4k 4fe50801b5
Update README.md; download progress bar in CLI. (#1797)
* Update README.md

- minor PR
- added model_info usage guide based on #1623 in README.md .

* "added tqdm bar for model download"

* Update manage.py

* fixed style

* fixed style

* sort imports
2022-08-01 12:17:47 +02:00
Eren Gölge 49bac724c0
Implement VitsAudioConfig (#1556)
* Implement VitsAudioConfig

* Update VITS LJSpeech recipe

* Update VITS VCTK recipe

* Make style

* Add missing decorator

* Add missing param

* Make style

* Update recipes

* Fix test

* Bug fix

* Exclude tests folder

* Make linter

* Make style
2022-07-12 18:49:58 +02:00
a-froghyar 34b80e0280
feat: updated recipes and lr fix (#1718)
- updated the recipes activating more losses for more stable training
- re-enabling guided attention loss
- fixed a bug about not the correct lr fetched for logging
2022-07-12 15:00:53 +02:00
Eren Gölge 3328be7a8e Remove GL message 2022-06-21 12:39:31 +02:00
p0p4k 71281ff1e4
Add support for model_info in CLI (#1623)
* model_info

* model_info

* model_info_by_idx and name

* model_info_by_idx and name

* model_info

* Update manage.py

* fixed linter

* fixed linter

* fixed linter

* fixed linter

* fixed return style checks

* fixed linter

* fixed linter

* fixed idx always positive

* added comments

* fix parser.args check

* fix parser.args check

* Make style

Co-authored-by: Eren Gölge <egolge@coqui.ai>
2022-06-20 23:28:17 +02:00
a-froghyar 8be21ec387
Capacitron (#977)
* new CI config

* initial Capacitron implementation

* delete old unused file

* fix empty formatting changes

* update losses and training script

* fix previous commit

* fix commit

* Add Capacitron test and first round of test fixes

* revert formatter change

* add changes to the synthesizer

* add stepwise gradual lr scheduler and changes to the recipe

* add inference script for dev use

* feat: add posterior inference arguments to synth methods
- added reference wav and text args for posterior inference
- some formatting

* fix: add espeak flag to base_tts and dataset APIs
- use_espeak_phonemes flag was not implemented in those APIs
- espeak is now able to be utilised for phoneme generation
- necessary phonemizer for the Capacitron model

* chore: update training script and style
- training script includes the espeak flag and other hyperparams
- made style

* chore: fix linting

* feat: add Tacotron 2 support

* leftover from dev

* chore:rename parser args

* feat: extract optimizers
- created a separate optimizer class to merge the two optimizers

* chore: revert arbitrary trainer changes

* fmt: revert formatting bug

* formatting again

* formatting fixed

* fix: log func

* fix: update optimizer
- Implemented load_state_dict for continuing training

* fix: clean optimizer init for standard models

* improvement: purge espeak flags and add training scripts

* Delete capacitronT2.py

delete old training script, new one is pushed

* feat: capacitron trainer methods
- extracted capacitron specific training  operations from the trainer into custom
methods in taco1 and taco2 models

* chore: renaming and merging capacitron and gst style args

* fix: bug fixes from the previous commit

* fix: implement state_dict method on CapacitronOptimizer

* fix: call method

* fix: inference naming

* Delete train_capacitron.py

* fix: synthesize

* feat: update tests

* chore: fix style

* Delete capacitron_inference.py

* fix: fix train tts t2 capacitron tests

* fix: double forward in T2 train step

* fix: double forward in T1 train step

* fix: run make style

* fix: remove unused import

* fix: test for T1 capacitron

* fix: make lint

* feat: add blizzard2013 recipes

* make style

* fix: update recipes

* chore: make style

* Plot test sentences in Tacotron

* chore: make style and fix import

* fix: call forward first before problematic floordiv op

* fix: update recipes

* feat: add min_audio_len to recipes

* aux_input["style_mel"]

* chore: make style

* Make capacitron T2 recipe more stable

* Remove T1 capacitron Ljspeech

* feat: implement new grad clipping routine and update configs

* make style

* Add pretrained checkpoints

* Add default vocoder

* Change trainer package

* Fix grad clip issue for tacotron

* Fix scheduler issue with tacotron

Co-authored-by: Eren Gölge <egolge@coqui.ai>
Co-authored-by: WeberJulian <julian.weber@hotmail.fr>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-05-20 16:17:11 +02:00
Edresson Casanova ee99a6c1e2 Fix voice conversion inference (#1583)
* Add voice conversion zoo test

* Fix style

* Fix unit test
2022-05-20 15:50:25 +02:00
Eren Gölge 2fc38f67d2 Update SpeakerManager init in Synthesizer 2022-05-11 11:32:27 +02:00
Edresson Casanova 8d228ab22a
Trick to Upsampling to High sampling rates using VITS model (#1456)
* Add upsample VITS support

* Fix the bug in inference

* Fix lint checks

* Add RMS based norm in save_wav method

* Style fix

* Add the period for VITS multi-period discriminator in model_args

* Bug fix in speaker encoder load in inference time

* Add unit tests

* Remove useless detach_z_vocoder parameter

* Add docs for VITS upsampling

* Fix the docs

* Rename TTS_part_sample_rate to encoder_sample_rate

* Add upsampling_init and upsampling_z methods

* Add asserts for encoder_sample_rate part

* Move upsampling tests to test_vits.py
2022-04-26 11:47:46 +02:00
WeberJulian 30bea7d53c
Update manage.py (#1514) 2022-04-19 14:27:32 +02:00
Eren Gölge 7133f8f47d
Print Model's license when downloading (#1512)
* Print model license while downloading

* Make style

* Add a new license link

* Make style
2022-04-19 14:18:49 +02:00
Edresson Casanova 060e0f9368
Add EmbeddingManager and BaseIDManager (#1374) 2022-03-31 13:41:16 +02:00
WeberJulian 1b22f03e98
Fix G2P backend of the released models (#1461)
* Fix enforce phonemizer

* Add new models

* Fix .model.json
2022-03-30 12:47:11 +02:00
WeberJulian c66a6241fd
Enforce phonemizer definition for synthesis (#1441)
* Enforce phonemizer definition for synthesis

* Fix train_tts, tokenizer init can now edit config

* Add small change to trigger CI pipeline

* fix wrong output path for one tts_test

* Fix style

* Test config overides by args and tokenizer

* Fix style
2022-03-25 23:15:33 +01:00
Edresson Casanova 3435bc8fca Fix style tests 2022-03-23 15:05:32 -03:00
Edresson Casanova 0ae1e0248c Fix the bug for emptly audio files 2022-03-23 14:39:31 -03:00
Edresson Casanova ea53d6feb3 Replace webrtcvad by silero-vad 2022-03-23 14:39:31 -03:00
Eren Gölge 1c3623af33
Fix model manager (#1436)
* Fix manager

* Make style
2022-03-23 12:57:14 +01:00
Eren Gölge 72d85e53c9
Update model file extension (#1422)
* Update model file ext to ```.pth```

* Update docs

* Rename more

* Find model files
2022-03-22 17:55:00 +01:00
Eren Gölge 0870a4faa2
Make style (#1405) 2022-03-16 12:13:55 +01:00
Edresson Casanova dbe9da7f15
Add Voice conversion inference support (#1337)
* Add support for voice conversion inference

* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json

* Rebase bug fix

* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Eren Gölge 942df0fb05 Update vits dataset 2022-03-02 09:14:32 +01:00
Eren Gölge 935a604046 Delete trainer_utils 2022-02-25 11:29:41 +01:00
Eren Gölge d0c27a9661 Update synthesis.py 2022-02-25 11:29:41 +01:00
Eren Gölge 2bad098625 Implement BaseVocabulary 2022-02-25 11:28:47 +01:00
Eren Gölge a013566d15 Delete trainer related code 2022-02-25 11:26:59 +01:00
Eren Gölge d5c0e17548 Load right char class dynamically 2022-02-25 11:26:59 +01:00
Eren Gölge 1f0c8179da Make style 2022-02-25 11:26:59 +01:00
Eren Gölge cd5d1497cf Add pitch_fmin pitch_fmax args to the audio 2022-02-25 11:26:59 +01:00
Eren Gölge 1445a46e9e Update synthesizer to use iinit_from_config 2022-02-25 11:26:59 +01:00
Eren Gölge 2fe16de8e3 Make lint 2022-02-25 11:25:00 +01:00
Eren Gölge 50e17097a7 Add verbose option to AudioProcessor 2022-02-25 11:24:13 +01:00
Eren Gölge c9972e6f14 Make lint 2022-02-25 11:07:34 +01:00
Eren Gölge 9bb347a52b Update for tokenizer API 2022-02-25 11:05:06 +01:00
Eren Gölge 84091096a6 Refactor Synthesizer class for TTSTokenizer 2022-02-25 11:05:06 +01:00