Eren Gölge
7133f8f47d
Print Model's license when downloading ( #1512 )
...
* Print model license while downloading
* Make style
* Add a new license link
* Make style
2022-04-19 14:18:49 +02:00
WeberJulian
4953636b14
Add African models ( #1511 )
...
* Add African models
* Set default license for all models
2022-04-19 14:18:30 +02:00
Edresson Casanova
060e0f9368
Add EmbeddingManager and BaseIDManager ( #1374 )
2022-03-31 13:41:16 +02:00
WeberJulian
1b22f03e98
Fix G2P backend of the released models ( #1461 )
...
* Fix enforce phonemizer
* Add new models
* Fix .model.json
2022-03-30 12:47:11 +02:00
WeberJulian
c66a6241fd
Enforce phonemizer definition for synthesis ( #1441 )
...
* Enforce phonemizer definition for synthesis
* Fix train_tts, tokenizer init can now edit config
* Add small change to trigger CI pipeline
* fix wrong output path for one tts_test
* Fix style
* Test config overrides by args and tokenizer
* Fix style
2022-03-25 23:15:33 +01:00
Edresson Casanova
37896e1743
Bug fix in freeze encoder ( #1391 )
...
* Fix the bug in freeze encoder
* Remove emb_l definition for non-multilingual training
* Fix unit tests
2022-03-24 18:16:04 +01:00
Edresson Casanova
3435bc8fca
Fix style tests
2022-03-23 15:05:32 -03:00
Edresson Casanova
0ae1e0248c
Fix the bug for empty audio files
2022-03-23 14:39:31 -03:00
Edresson Casanova
ea53d6feb3
Replace webrtcvad by silero-vad
2022-03-23 14:39:31 -03:00
Eren Gölge
3af01cfe3b
Update base model wrt 👟 ( #1406 )
2022-03-23 17:24:20 +01:00
Eren Gölge
1c3623af33
Fix model manager ( #1436 )
...
* Fix manager
* Make style
2022-03-23 12:57:14 +01:00
Eren Gölge
72d85e53c9
Update model file extension ( #1422 )
...
* Update model file ext to `.pth`
* Update docs
* Rename more
* Find model files
2022-03-22 17:55:00 +01:00
Eren Gölge
fd56fabb21
Fix #1380 ( #1409 )
2022-03-16 12:38:27 +01:00
Eren Gölge
0870a4faa2
Make style ( #1405 )
2022-03-16 12:13:55 +01:00
WeberJulian
690c96ed28
Fix default phonemizer for ja and zh ( #1399 )
2022-03-16 12:13:22 +01:00
Edresson Casanova
f81892483d
REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support ( #1349 )
...
* Rename Speaker encoder module to encoder
* Add a generic emotion dataset formatter
* Transform the Speaker Encoder dataset to a generic dataset and create emotion encoder config
* Add class map in emotion config
* Add Base encoder config
* Add evaluation encoder script
* Fix the bug in plot_embeddings
* Enable Weight decay for encoder training
* Add argument to disable storage
* Add Perfect Sampler and remove storage
* Add evaluation during encoder training
* Fix lint checks
* Remove useless config parameter
* Activate evaluation in speaker encoder test and use multispeaker dataset for this test
* Unit test fixes
* Remove useless tests to speed up the aux_tests
* Use get_optimizer in Encoder
* Add BaseEncoder Class
* Fix the unit tests
* Add Perfect Batch Sampler unit test
* Add compute encoder accuracy in a function
2022-03-11 14:43:40 +01:00
Edresson Casanova
36e9ea2f97
Open bible dataset formatter ( #1365 )
...
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
* Fix the bug in find unique chars script
* Add OpenBible formatter
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-03-11 10:43:31 +01:00
Edresson Casanova
dbe9da7f15
Add Voice conversion inference support ( #1337 )
...
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Edresson Casanova
917f417ac4
Add alphas to control language and speaker balancer ( #1216 )
...
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the sampler weights to float to save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Edresson Casanova
f381e29b91
REBASED: Add support for the speaker encoder training using torch spectrograms ( #1348 )
...
* Add support for the speaker encoder training using torch spectrograms
* Remove useless function in speaker encoder dataset class
2022-03-10 14:54:51 +01:00
Eren Gölge
c670365507
Fix VCTK recipe and formatter
2022-03-08 14:20:34 +01:00
Eren Gölge
8feb41d361
Bump up to v0.6.1
2022-03-07 15:57:44 +01:00
Eren Gölge
ee02bc3823
Bump up to v0.6.0
2022-03-07 12:08:22 +01:00
Eren Gölge
dc280819be
Add new models
2022-03-07 12:08:09 +01:00
Eren Gölge
e9d9028b4d
Revert cleaner name
2022-03-06 12:57:06 +01:00
Eren Gölge
764c7fa4a4
Rename phoneme_cleaners
2022-03-06 12:09:54 +01:00
Eren Gölge
dd4287de1f
Update models
2022-03-03 20:23:00 +01:00
Eren Gölge
6cb00be795
Update your_tts model URL
2022-03-02 18:04:49 +01:00
Eren Gölge
1425a023fe
Make style and lint
2022-03-02 13:25:35 +01:00
Eren Gölge
c68885b3fd
Update Vits speaker encoder init
2022-03-02 13:20:23 +01:00
Eren Gölge
27b67b7945
Fix import
2022-03-02 09:15:20 +01:00
Eren Gölge
942df0fb05
Update vits dataset
2022-03-02 09:14:32 +01:00
Eren Gölge
6a9f8074f0
Fix TTSDataset
2022-03-01 07:57:48 +01:00
Eren Gölge
690de1ab06
Update Characters and add more tests
2022-02-25 11:32:44 +01:00
Eren Gölge
9063397892
Fix FastSpeech config
2022-02-25 11:31:56 +01:00
Eren Gölge
1e414b3a09
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
acc83cd3e6
Update Vits model API
2022-02-25 11:31:56 +01:00
Eren Gölge
fe656659be
Implement BaseTTS
2022-02-25 11:31:56 +01:00
Eren Gölge
bed4afd4ee
Implement BaseVocabulary
2022-02-25 11:31:56 +01:00
Eren Gölge
e0f9be76c0
Update test_run in wavernn and wavegrad
2022-02-25 11:31:56 +01:00
Eren Gölge
bf540f4323
Update imports for trainer
2022-02-25 11:31:56 +01:00
Eren Gölge
4c43eda414
Update BaseTrainerModel
2022-02-25 11:31:56 +01:00
Eren Gölge
83c5ddc5b7
Update imports
2022-02-25 11:31:56 +01:00
Eren Gölge
14c117978d
Fix return outputs
2022-02-25 11:31:56 +01:00
Eren Gölge
424d04e4f6
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
8b3ba02c95
Add vocab_dict to model config
2022-02-25 11:31:20 +01:00
Eren Gölge
ff23dce081
Update TTSDataset
2022-02-25 11:31:20 +01:00
Eren Gölge
750903d2ba
Add VCTK formatter docstring
2022-02-25 11:30:24 +01:00
Eren Gölge
52a7896668
Update VITS loss
2022-02-25 11:30:24 +01:00
Eren Gölge
c68962c574
Update forward tts binary loss
2022-02-25 11:30:24 +01:00
Eren Gölge
c11944022d
Revert back again rand_segment
2022-02-25 11:30:24 +01:00
Eren Gölge
00c7600103
Update Vits model API
2022-02-25 11:30:24 +01:00
Eren Gölge
935a604046
Delete trainer_utils
2022-02-25 11:29:41 +01:00
Eren Gölge
d0c27a9661
Update synthesis.py
2022-02-25 11:29:41 +01:00
Eren Gölge
35fc7270ff
Implement BaseTTS
2022-02-25 11:28:47 +01:00
Eren Gölge
2bad098625
Implement BaseVocabulary
2022-02-25 11:28:47 +01:00
Eren Gölge
833de62e30
Update base_vocoder
2022-02-25 11:28:14 +01:00
Eren Gölge
fc3b6d2861
Update gan
2022-02-25 11:28:14 +01:00
Eren Gölge
20a677c623
Update test_run in wavernn and wavegrad
2022-02-25 11:28:14 +01:00
Eren Gölge
be3a03126a
Update imports for trainer
2022-02-25 11:28:14 +01:00
Eren Gölge
c911729896
Update BaseTrainerModel
2022-02-25 11:28:14 +01:00
Eren Gölge
1e219fef0a
Revert drop_last
2022-02-25 11:26:59 +01:00
Eren Gölge
7dfd753d91
Add a cheap trick to avoid short audio clips
2022-02-25 11:26:59 +01:00
Eren Gölge
1a43e05460
Fix VITS loss bug
...
Fake and real features were passed to the loss function
in the wrong argument order
2022-02-25 11:26:59 +01:00
Eren Gölge
4b96bfe925
Fix train logging
2022-02-25 11:26:59 +01:00
Eren Gölge
ab8a4ca2c3
Revert random segment
2022-02-25 11:26:59 +01:00
Eren Gölge
8622226f3f
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
27db089d6c
Change TrainingArgs -> TrainerArgs
2022-02-25 11:26:59 +01:00
Eren Gölge
aa81454721
Update BaseTrainingConfig
2022-02-25 11:26:59 +01:00
Eren Gölge
d3a58ed07a
Fix default values
2022-02-25 11:26:59 +01:00
Eren Gölge
54c6bb2a8c
Fix add speaker VITS
2022-02-25 11:26:59 +01:00
Eren Gölge
590b04fb89
Fix espeak_wrapper
2022-02-25 11:26:59 +01:00
Eren Gölge
a013566d15
Delete trainer related code
2022-02-25 11:26:59 +01:00
Eren Gölge
38314194e7
Set `drop_last`
2022-02-25 11:26:59 +01:00
Eren Gölge
f70e4bb8c6
Add new speakers to the vits model
2022-02-25 11:26:59 +01:00
Eren Gölge
d5c0e17548
Load right char class dynamically
2022-02-25 11:26:59 +01:00
Eren Gölge
1f0c8179da
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
b3ed6ff6b7
Update FastPitchConfig
2022-02-25 11:26:59 +01:00
Eren Gölge
1932401e8d
Fix dataset preprocessing
2022-02-25 11:26:59 +01:00
Eren Gölge
34c4be5e49
Update forwardtts
2022-02-25 11:26:59 +01:00
Eren Gölge
bb37462794
Update language manager
2022-02-25 11:26:59 +01:00
Eren Gölge
5169d4eb32
Plot pitch over input characters
2022-02-25 11:26:59 +01:00
Eren Gölge
cd5d1497cf
Add pitch_fmin pitch_fmax args to the audio
2022-02-25 11:26:59 +01:00
Eren Gölge
1445a46e9e
Update synthesizer to use init_from_config
2022-02-25 11:26:59 +01:00
Eren Gölge
7058fcc3ff
Take file extension as an argument
2022-02-25 11:26:59 +01:00
Eren Gölge
13482dde1f
Update GAN model
2022-02-25 11:26:59 +01:00
Eren Gölge
2829027d8b
Refactor VITS model
2022-02-25 11:26:59 +01:00
Eren Gölge
ef63c99524
Implement `start_by_longest` option for TTSDataset
2022-02-25 11:26:18 +01:00
Eren Gölge
c4c471d61d
Allow padding for shorter segments
2022-02-25 11:25:48 +01:00
Eren Gölge
47fbddc8d4
Fix docstring
2022-02-25 11:25:48 +01:00
Eren Gölge
bc2243bac4
Fix tests
2022-02-25 11:25:00 +01:00
Eren Gölge
146fbfd7c9
Extend unittests
2022-02-25 11:25:00 +01:00
Eren Gölge
2fe16de8e3
Make lint
2022-02-25 11:25:00 +01:00
Eren Gölge
7b49a4aa2b
Fix glow_tts_config missing field
2022-02-25 11:24:13 +01:00
Eren Gölge
07b0a80d57
Fix tokenizer init_from_config
2022-02-25 11:24:13 +01:00
Eren Gölge
50e17097a7
Add verbose option to AudioProcessor
2022-02-25 11:24:13 +01:00
Eren Gölge
235f7d9b02
Extend glow_tts model tests
2022-02-25 11:24:13 +01:00
Eren Gölge
8e248913d6
Update train_tts for the new API
2022-02-25 11:24:13 +01:00
Eren Gölge
001da8afc8
Update Vits for the new model API
2022-02-25 11:21:19 +01:00
Eren Gölge
5176ae9e53
Fixes small compat. issues
2022-02-25 11:21:19 +01:00