4196 Commits

Author SHA1 Message Date
Edresson Casanova 60034674f9 Remove audio padding before mel spec extraction 2022-05-07 13:12:09 +02:00
WeberJulian fbdf76b2fc returns y_mask in VITS inference (#1540)
* returns y_mask

* make style
2022-05-03 13:49:24 +02:00
Edresson Casanova 6233f4fcd7 Bug fix in compute embedding without eval partition 2022-04-26 13:58:03 -03:00
Edresson Casanova a41e860a66
Update Coqpit requirement (#1539) 2022-04-26 17:39:36 +02:00
Edresson Casanova 8d228ab22a
Trick to Upsampling to High sampling rates using VITS model (#1456)
* Add upsample VITS support

* Fix the bug in inference

* Fix lint checks

* Add RMS based norm in save_wav method

* Style fix

* Add the period for VITS multi-period discriminator in model_args

* Bug fix in speaker encoder load in inference time

* Add unit tests

* Remove useless detach_z_vocoder parameter

* Add docs for VITS upsampling

* Fix the docs

* Rename TTS_part_sample_rate to encoder_sample_rate

* Add upsampling_init and upsampling_z methods

* Add asserts for encoder_sample_rate part

* Move upsampling tests to test_vits.py
2022-04-26 11:47:46 +02:00
Eren Gölge c410bc58ef Bump to v0.6.2 2022-04-20 11:46:26 +02:00
WeberJulian 30bea7d53c
Update manage.py (#1514) 2022-04-19 14:27:32 +02:00
Yanlong Wang b45d5c5c60
Improve docsQA default questions (#1411) 2022-04-19 14:24:34 +02:00
Eren Gölge 7133f8f47d
Print Model's license when downloading (#1512)
* Print model license while downloading

* Make style

* Add a new license link

* Make style
2022-04-19 14:18:49 +02:00
WeberJulian 4953636b14
Add African models (#1511)
* Add african models

* Set default license for all models
2022-04-19 14:18:30 +02:00
jackiexiao e8573bfe3e
Update CONTRIBUTING.md (#1463)
fix header
```
## Call for sharing language models
```
2022-04-15 14:43:46 +02:00
Reuben Morais c18100d112 Merge branch 'docker-ci' into dev (Fixes #1498) 2022-04-15 02:32:51 +02:00
Reuben Morais 27fcb5dabf Add Dockerfile and build/push CI 2022-04-15 02:17:10 +02:00
Eren Gölge 164c7dd676
Update requirements coqui_trainer -> trainer (#1478) 2022-04-08 14:47:09 +02:00
Edresson Casanova 060e0f9368
Add EmbeddingManager and BaseIDManager (#1374) 2022-03-31 13:41:16 +02:00
WeberJulian 1b22f03e98
Fix G2P backend of the released models (#1461)
* Fix enforce phonemizer

* Add new models

* Fix .model.json
2022-03-30 12:47:11 +02:00
WeberJulian c66a6241fd
Enforce phonemizer definition for synthesis (#1441)
* Enforce phonemizer definition for synthesis

* Fix train_tts, tokenizer init can now edit config

* Add small change to trigger CI pipeline

* fix wrong output path for one tts_test

* Fix style

* Test config overrides by args and tokenizer

* Fix style
2022-03-25 23:15:33 +01:00
Edresson Casanova 37896e1743
Bug fix in freeze encoder (#1391)
* Fix the bug in freeze encoder

* Remove emb_l definition for non-multilingual training

* Fix unit tests
2022-03-24 18:16:04 +01:00
Edresson Casanova 464dc658ff
Merge pull request #1431 from coqui-ai/silero-vad
Replace webrtcvad by silero-vad
2022-03-24 08:29:32 -03:00
Edresson Casanova 3435bc8fca Fix style tests 2022-03-23 15:05:32 -03:00
Edresson Casanova 0ae1e0248c Fix the bug for empty audio files 2022-03-23 14:39:31 -03:00
Edresson Casanova ea53d6feb3 Replace webrtcvad by silero-vad 2022-03-23 14:39:31 -03:00
Eren Gölge 3af01cfe3b
Update base model wrt 👟 (#1406) 2022-03-23 17:24:20 +01:00
WeberJulian 3c7c14607b
Add formatting tests (#1437)
* Add style checks to `make lint`

* Bump target-version in black config
2022-03-23 17:23:36 +01:00
Eren Gölge 1c3623af33
Fix model manager (#1436)
* Fix manager

* Make style
2022-03-23 12:57:14 +01:00
Eren Gölge 72d85e53c9
Update model file extension (#1422)
* Update model file ext to ```.pth```

* Update docs

* Rename more

* Find model files
2022-03-22 17:55:00 +01:00
Edresson Casanova ccdc2300dc
Add eval_split and eval_split_size in the call of load_tts_samples for all recipes (#1424) 2022-03-22 12:54:41 +01:00
Eren Gölge 2e6e8f651d
Update CheckSpectrograms notebook (#1418) 2022-03-18 16:48:24 +01:00
Eren Gölge c7f9ec07c8
Hinge Gruut version to 2.2.3 (#1419) 2022-03-18 16:47:50 +01:00
Eren Gölge fd56fabb21
Fix #1380 (#1409) 2022-03-16 12:38:27 +01:00
Eren Gölge 0870a4faa2
Make style (#1405) 2022-03-16 12:13:55 +01:00
WeberJulian 690c96ed28
Fix default phonemizer for ja and zh (#1399) 2022-03-16 12:13:22 +01:00
Eren Gölge f40b833659
Add CITATION.cff (#1404) 2022-03-16 12:05:17 +01:00
WeberJulian 24b57f6a0e
Fix typo workflow text (#1403) 2022-03-16 11:51:37 +01:00
Edresson Casanova f81892483d
REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support (#1349)
* Rename Speaker encoder module to encoder

* Add a generic emotion dataset formatter

* Transform the Speaker Encoder dataset to a generic dataset and create emotion encoder config

* Add class map in emotion config

* Add Base encoder config

* Add evaluation encoder script

* Fix the bug in plot_embeddings

* Enable Weight decay for encoder training

* Add argument to disable storage

* Add Perfect Sampler and remove storage

* Add evaluation during encoder training

* Fix lint checks

* Remove useless config parameter

* Activate evaluation in speaker encoder test and use multispeaker dataset for this test

* Unit test fixes

* Remove useless tests to speed up the aux_tests

* Use get_optimizer in Encoder

* Add BaseEncoder Class

* Fix the unit tests

* Add Perfect Batch Sampler unit test

* Add compute encoder accuracy in a function
2022-03-11 14:43:40 +01:00
Edresson Casanova 36e9ea2f97
Open bible dataset formatter (#1365)
* Add support for voice conversion inference

* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json

* Rebase bug fix

* Use the average d-vector for inference

* Fix the bug in find unique chars script

* Add OpenBible formatter

Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-03-11 10:43:31 +01:00
Eren Gölge b0be825d92
Update issue template (#1370)
* Add bug_report template

* Fix typos
2022-03-11 10:40:20 +01:00
Edresson Casanova dbe9da7f15
Add Voice conversion inference support (#1337)
* Add support for voice conversion inference

* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json

* Rebase bug fix

* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Edresson Casanova 917f417ac4
Add alphas to control language and speaker balancer (#1216)
* Add alphas to control language and speaker balancer

* Add docs for speaker and language samplers

* Change the Samplers weights to float to save memory

* Change the test_samplers to unittest format

* Add get_sampler method in BaseTTS

* Fix rebase issues

* Add language and speaker samplers support for DDP training

* Rename distributed sampler wrapper

* Remove the DistributedSamplerWrapper and use the one from Trainer

* Bugfix after rebase

* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Edresson Casanova f381e29b91
REBASED: Add support for the speaker encoder training using torch spectrograms (#1348)
* Add support for the speaker encoder training using torch spectrograms

* Remove useless function in speaker encoder dataset class
2022-03-10 14:54:51 +01:00
Eren Gölge 07d96f7991 Fix DocQA title 2022-03-10 12:17:06 +01:00
Yanlong Wang 8a007c8834
feat: add docsqa to docs website (#1363) 2022-03-10 11:40:06 +01:00
Eren Gölge 48f6bb405a
Fix recipes as to the recent API changes. (#1367)
* Fix recipes -> #1366

* Fix trainer docs
2022-03-10 11:36:38 +01:00
Edresson Casanova d792b78703
Fix multilingual recipe (#1354) 2022-03-09 16:18:17 +01:00
Eren Gölge c670365507 Fix VCTK recipe and formatter 2022-03-08 14:20:34 +01:00
Eren Gölge 0cf3265a46
Merge pull request #1347 from coqui-ai/dev
v0.6.1
2022-03-07 16:02:19 +01:00
Eren Gölge 8feb41d361 Bump up to v0.6.1 2022-03-07 15:57:44 +01:00
Eren Gölge 6df69f79ea Revert DocQA as it fails on readthedocs 2022-03-07 15:54:43 +01:00
Eren Gölge 95e551dd0a Update requirements.txt for coqui-trainer 2022-03-07 14:31:25 +01:00
Eren Gölge 209ee40c88
Merge pull request #1288 from coqui-ai/dev
v0.6.0
2022-03-07 14:05:30 +01:00