Commit Graph

4047 Commits

Author SHA1 Message Date
Eren Gölge b6073d16fb Return duration by ForwardTTS inference 2022-04-19 11:00:15 +02:00
Eren Gölge 7ce4444056 Make style 2022-04-19 10:59:59 +02:00
Eren Gölge df30f9d885 Update ForwardTTSe2e tests 2022-04-19 10:58:52 +02:00
Eren Gölge 8f3552fbaa Remove redundant abstract function 2022-04-19 09:23:35 +02:00
Eren Gölge 5cd7fa6228 Refactor TTSDataset to use numpy transforms 2022-04-19 09:23:18 +02:00
Eren Gölge 3824838e5d Update ForwardTTSE2eLoss 2022-04-19 09:22:50 +02:00
Eren Gölge 85c03c75ca Make AP optional in BaseTTS 2022-04-19 09:22:08 +02:00
Eren Gölge 2457739b5e Add numpy and torch transforms 2022-04-19 09:21:46 +02:00
Eren Gölge 7742c0b64e Refactor ForwardTTS to skip decoder 2022-04-19 09:21:31 +02:00
Eren Gölge 518b216631 Make plot results more general 2022-04-19 09:20:31 +02:00
Eren Gölge 82c2ca505d Add missing kernel size attr to transformer layer 2022-04-19 09:19:57 +02:00
Eren Gölge 622ff07c45 Remove AP from FastPitchE2e 2022-04-19 09:19:07 +02:00
Eren Gölge 52e86d8866 Update fastpitche2e recipe 2022-04-19 09:18:49 +02:00
Eren Gölge 519ee7c776 Update import statements 2022-04-19 09:16:03 +02:00
Eren Gölge ad24598797 Remove redundancy 2022-04-04 09:46:30 +02:00
Eren Gölge 9e456e8053 Fix Vocoder logging 2022-04-04 09:46:10 +02:00
Eren Gölge e5a9902e85 Rename vars in VITS 2022-04-04 09:45:46 +02:00
Eren Gölge 8f21991a84 Add cond layer in decoder 2022-04-04 09:44:20 +02:00
Eren Gölge 8408b983b2 Refactor multi-speaker init in ForwardTTS 2022-04-04 09:43:46 +02:00
Eren Gölge f1b034c8b0 Implement BaseTTSE2E 2022-04-04 09:43:15 +02:00
Eren Gölge 29216ff907 Implement ForwardTTSE2E Loss 2022-04-04 09:42:50 +02:00
Eren Gölge 95b52a65af Implement FastPitchE2E LJSpeech recipe 2022-04-04 09:41:46 +02:00
Eren Gölge 2c0cd0ddd5 Implement ForwardTTSE2E tests 2022-04-04 09:41:25 +02:00
Eren Gölge ade84aa124 Implement FastPitchE2EConfig 2022-04-04 09:41:05 +02:00
Eren Gölge c369f087ab Implement ForwardTTSE2Eg 2022-04-04 09:40:36 +02:00
Eren Gölge 1c3623af33
Fix model manager (#1436)
* Fix manager

* Make style
2022-03-23 12:57:14 +01:00
Eren Gölge 72d85e53c9
Update model file extension (#1422)
* Update model file ext to `.pth`

* Update docs

* Rename more

* Find model files
2022-03-22 17:55:00 +01:00
Edresson Casanova ccdc2300dc
Add eval_split and eval_split_size in the call of load_tts_samples for all recipes (#1424) 2022-03-22 12:54:41 +01:00
Eren Gölge 2e6e8f651d
Update CheckSpectrograms notebook (#1418) 2022-03-18 16:48:24 +01:00
Eren Gölge c7f9ec07c8
Hinge Gruut version to 2.2.3 (#1419) 2022-03-18 16:47:50 +01:00
Eren Gölge fd56fabb21
Fix #1380 (#1409) 2022-03-16 12:38:27 +01:00
Eren Gölge 0870a4faa2
Make style (#1405) 2022-03-16 12:13:55 +01:00
WeberJulian 690c96ed28
Fix default phonemizer for ja and zh (#1399) 2022-03-16 12:13:22 +01:00
Eren Gölge f40b833659
Add CITATION.cff (#1404) 2022-03-16 12:05:17 +01:00
WeberJulian 24b57f6a0e
Fix typo workflow text (#1403) 2022-03-16 11:51:37 +01:00
Edresson Casanova f81892483d
REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support (#1349)
* Rename Speaker encoder module to encoder

* Add a generic emotion dataset formatter

* Transform the Speaker Encoder dataset to a generic dataset and create emotion encoder config

* Add class map in emotion config

* Add Base encoder config

* Add evaluation encoder script

* Fix the bug in plot_embeddings

* Enable Weight decay for encoder training

* Add argument to disable storage

* Add Perfect Sampler and remove storage

* Add evaluation during encoder training

* Fix lint checks

* Remove useless config parameter

* Activate evaluation in speaker encoder test and use multispeaker dataset for this test

* Unit test fixes

* Remove useless tests to speed up the aux_tests

* Use get_optimizer in Encoder

* Add BaseEncoder Class

* Fix the unit tests

* Add Perfect Batch Sampler unit test

* Add compute encoder accuracy in a function
2022-03-11 14:43:40 +01:00
Edresson Casanova 36e9ea2f97
Open bible dataset formatter (#1365)
* Add support for voice conversion inference

* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json

* Rebase bug fix

* Use the average d-vector for inference

* Fix the bug in find unique chars script

* Add OpenBible formatter

Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-03-11 10:43:31 +01:00
Eren Gölge b0be825d92
Update issue template (#1370)
* Add bug_report template

* Fix typos
2022-03-11 10:40:20 +01:00
Edresson Casanova dbe9da7f15
Add Voice conversion inference support (#1337)
* Add support for voice conversion inference

* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json

* Rebase bug fix

* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Edresson Casanova 917f417ac4
Add alphas to control language and speaker balancer (#1216)
* Add alphas to control language and speaker balancer

* Add docs for speaker and language samplers

* Change the sampler weights to float to save memory

* Change the test_samplers to unittest format

* Add get_sampler method in BaseTTS

* Fix rebase issues

* Add language and speaker samplers support for DDP training

* Rename distributed sampler wrapper

* Remove the DistributedSamplerWrapper and use the one from Trainer

* Bugfix after rebase

* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Edresson Casanova f381e29b91
REBASED: Add support for the speaker encoder training using torch spectrograms (#1348)
* Add support for the speaker encoder training using torch spectrograms

* Remove useless function in speaker encoder dataset class
2022-03-10 14:54:51 +01:00
Eren Gölge 07d96f7991 Fix DocQA title 2022-03-10 12:17:06 +01:00
Yanlong Wang 8a007c8834
feat: add docsqa to docs website (#1363) 2022-03-10 11:40:06 +01:00
Eren Gölge 48f6bb405a
Fix recipes to match the recent API changes. (#1367)
* Fix recipes -> #1366

* Fix trainer docs
2022-03-10 11:36:38 +01:00
Edresson Casanova d792b78703
Fix multilingual recipe (#1354) 2022-03-09 16:18:17 +01:00
Eren Gölge c670365507 Fix VCTK recipe and formatter 2022-03-08 14:20:34 +01:00
Eren Gölge 0cf3265a46
Merge pull request #1347 from coqui-ai/dev
v0.6.1
2022-03-07 16:02:19 +01:00
Eren Gölge 8feb41d361 Bump up to v0.6.1 2022-03-07 15:57:44 +01:00
Eren Gölge 6df69f79ea Revert DocQA as it fails on readthedocs 2022-03-07 15:54:43 +01:00
Eren Gölge 95e551dd0a Update requirements.txt for coqui-trainer 2022-03-07 14:31:25 +01:00