Edresson Casanova
36e9ea2f97
Open bible dataset formatter (#1365)
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
* Fix the bug in find unique chars script
* Add OpenBible formatter
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-03-11 10:43:31 +01:00
Edresson Casanova
dbe9da7f15
Add Voice conversion inference support (#1337)
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Edresson Casanova
917f417ac4
Add alphas to control language and speaker balancer (#1216)
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the Samplers weights to float to save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Edresson Casanova
f381e29b91
REBASED: Add support for the speaker encoder training using torch spectrograms (#1348)
* Add support for the speaker encoder training using torch spectrograms
* Remove useless function in speaker encoder dataset class
2022-03-10 14:54:51 +01:00
Eren Gölge
c670365507
Fix VCTK recipe and formatter
2022-03-08 14:20:34 +01:00
Eren Gölge
e9d9028b4d
Revert cleaner name
2022-03-06 12:57:06 +01:00
Eren Gölge
764c7fa4a4
Rename phoneme_cleaners
2022-03-06 12:09:54 +01:00
Eren Gölge
dd4287de1f
Update models
2022-03-03 20:23:00 +01:00
Eren Gölge
1425a023fe
Make style and lint
2022-03-02 13:25:35 +01:00
Eren Gölge
c68885b3fd
Update Vits speaker encoder init
2022-03-02 13:20:23 +01:00
Eren Gölge
27b67b7945
Fix import
2022-03-02 09:15:20 +01:00
Eren Gölge
942df0fb05
Update vits dataset
2022-03-02 09:14:32 +01:00
Eren Gölge
6a9f8074f0
Fix TTSDataset
2022-03-01 07:57:48 +01:00
Eren Gölge
690de1ab06
Update Characters and add more tests
2022-02-25 11:32:44 +01:00
Eren Gölge
9063397892
Fix FastSpeech config
2022-02-25 11:31:56 +01:00
Eren Gölge
1e414b3a09
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
acc83cd3e6
Update Vits model API
2022-02-25 11:31:56 +01:00
Eren Gölge
fe656659be
Implement BaseTTS
2022-02-25 11:31:56 +01:00
Eren Gölge
bed4afd4ee
Implement BaseVocabulary
2022-02-25 11:31:56 +01:00
Eren Gölge
83c5ddc5b7
Update imports
2022-02-25 11:31:56 +01:00
Eren Gölge
14c117978d
Fix return outputs
2022-02-25 11:31:56 +01:00
Eren Gölge
424d04e4f6
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
8b3ba02c95
Add vocab_dict to model config
2022-02-25 11:31:20 +01:00
Eren Gölge
ff23dce081
Update TTSDataset
2022-02-25 11:31:20 +01:00
Eren Gölge
750903d2ba
Add VCTK formatter docstring
2022-02-25 11:30:24 +01:00
Eren Gölge
52a7896668
Update VITS loss
2022-02-25 11:30:24 +01:00
Eren Gölge
c68962c574
Update forward tts binary loss
2022-02-25 11:30:24 +01:00
Eren Gölge
c11944022d
Revert rand_segment again
2022-02-25 11:30:24 +01:00
Eren Gölge
00c7600103
Update Vits model API
2022-02-25 11:30:24 +01:00
Eren Gölge
d0c27a9661
Update synthesis.py
2022-02-25 11:29:41 +01:00
Eren Gölge
35fc7270ff
Implement BaseTTS
2022-02-25 11:28:47 +01:00
Eren Gölge
2bad098625
Implement BaseVocabulary
2022-02-25 11:28:47 +01:00
Eren Gölge
1e219fef0a
Revert drop_last
2022-02-25 11:26:59 +01:00
Eren Gölge
7dfd753d91
Add a cheap trick to avoid short audio clips
2022-02-25 11:26:59 +01:00
Eren Gölge
1a43e05460
Fix VITS loss bug
Fake and real features were passed to the loss function in the
wrong argument order
2022-02-25 11:26:59 +01:00
Eren Gölge
4b96bfe925
Fix train logging
2022-02-25 11:26:59 +01:00
Eren Gölge
ab8a4ca2c3
Revert random segment
2022-02-25 11:26:59 +01:00
Eren Gölge
8622226f3f
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
d3a58ed07a
Fix default values
2022-02-25 11:26:59 +01:00
Eren Gölge
54c6bb2a8c
Fix add speaker VITS
2022-02-25 11:26:59 +01:00
Eren Gölge
590b04fb89
Fix espeak_wrapper
2022-02-25 11:26:59 +01:00
Eren Gölge
38314194e7
Set `drop_last`
2022-02-25 11:26:59 +01:00
Eren Gölge
f70e4bb8c6
Add new speakers to the vits model
2022-02-25 11:26:59 +01:00
Eren Gölge
d5c0e17548
Load right char class dynamically
2022-02-25 11:26:59 +01:00
Eren Gölge
1f0c8179da
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
b3ed6ff6b7
Update FastPitchConfig
2022-02-25 11:26:59 +01:00
Eren Gölge
1932401e8d
Fix dataset preprocessing
2022-02-25 11:26:59 +01:00
Eren Gölge
34c4be5e49
Update forwardtts
2022-02-25 11:26:59 +01:00
Eren Gölge
bb37462794
Update language manager
2022-02-25 11:26:59 +01:00
Eren Gölge
5169d4eb32
Plot pitch over input characters
2022-02-25 11:26:59 +01:00