Eren Gölge
f70e82cd19
Use fsspec and torch for embedding file IO (#1581)
* Use fsspec and torch for embedding file
* Fixup
* Fix load and save files
* Fix compute embedding script
* Set use_cuda to true if available
* Add dummy speakers.pth file
* Make style
* Change default speakers file extension
Co-authored-by: WeberJulian <julian.weber@hotmail.fr>
2022-06-01 13:49:42 +02:00
Edresson Casanova
c6008e5235
Add audio length sampler balancer (#1561)
* Add audio length sampler balancer
* Add unit tests
2022-05-12 19:59:19 +02:00
Edresson Casanova
060e0f9368
Add EmbeddingManager and BaseIDManager (#1374)
2022-03-31 13:41:16 +02:00
Eren Gölge
0870a4faa2
Make style (#1405)
2022-03-16 12:13:55 +01:00
Edresson Casanova
917f417ac4
Add alphas to control language and speaker balancer (#1216)
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the sampler weights to float to save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Eren Gölge
fe656659be
Implement BaseTTS
2022-02-25 11:31:56 +01:00
Eren Gölge
424d04e4f6
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
35fc7270ff
Implement BaseTTS
2022-02-25 11:28:47 +01:00
Eren Gölge
1e219fef0a
Revert drop_last
2022-02-25 11:26:59 +01:00
Eren Gölge
8622226f3f
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
38314194e7
Set `drop_last`
2022-02-25 11:26:59 +01:00
Eren Gölge
ef63c99524
Implement `start_by_longest` option for TTSDataset
2022-02-25 11:26:18 +01:00
Eren Gölge
5176ae9e53
Fixes small compat. issues
2022-02-25 11:21:19 +01:00
Eren Gölge
18f726af65
Update ForwardTTS
2022-02-25 11:11:35 +01:00
Eren Gölge
452dbc43d8
Update imports for symbols -> characters
2022-02-25 11:05:06 +01:00
Eren Gölge
8071fa0020
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
4cd690e4c1
Updates BaseTTS and configs
2022-02-25 10:57:35 +01:00
Eren Gölge
4597d4e5b6
Remove get_characters from BaseTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
2d8ce98d2a
Update imports for symbols -> characters
2022-02-25 10:48:03 +01:00
Eren Gölge
9a95e15483
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:48:03 +01:00
Eren Gölge
d2525abe8c
Remove get_characters from BaseTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
fbad17e084
Update imports for symbols -> characters
2022-02-25 10:48:02 +01:00
Eren Gölge
bd461ace33
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:45:24 +01:00
Eren Gölge
704dddcffa
Make style
2021-12-20 11:54:10 +00:00
WeberJulian
2bbcb558dc
Prevent weighted sampler use when num_gpus > 1
2021-12-20 11:54:10 +00:00
WeberJulian
74cedfac38
Revert init multispeaker change
2021-12-20 11:54:10 +00:00
WeberJulian
6b03943526
Move multilingual logic out of the trainer
2021-12-20 11:54:10 +00:00
WeberJulian
e8af6a9f08
Fix use_speaker_embedding logic
2021-12-20 11:54:10 +00:00
WeberJulian
1472b6df49
make style
2021-12-20 11:54:10 +00:00
WeberJulian
3b5592abcf
fix test vits
2021-12-20 11:54:10 +00:00
WeberJulian
005bba60b0
get_speaker_weighted_sampler
2021-12-20 11:54:10 +00:00
Edresson
76251b619a
Fix d-vector multispeaker training bug
2021-12-20 11:54:09 +00:00
Edresson
ac9416fb86
Add multilingual inference support
2021-12-20 11:54:09 +00:00
Edresson
dcb2374bc9
Add multilingual training support to the VITS model
2021-12-20 11:54:09 +00:00
Edresson
f996afedb0
Implement multilingual dataloader support
2021-12-20 11:54:09 +00:00
Eren Gölge
2ed9e3c241
Fix constant use of noise augment
2021-11-08 09:20:34 +01:00
Eren Gölge
2b7d159383
Update BaseTTS for multi-speaker training
2021-10-21 16:29:06 +00:00
Eren Gölge
7c2cb7cc30
Update BaseTTS
2021-10-20 18:18:22 +00:00
Eren Gölge
127571423c
Update multi-speaker init in BaseTTS
2021-10-18 08:54:41 +00:00
Eren Gölge
a0a5d580e9
Approximate audio length from file size
2021-10-18 08:54:02 +00:00
Eren Gölge
8ada870a57
Refactor `trainer.py` for v2
2021-09-30 14:16:34 +00:00
Eren Gölge
2b59da802c
Fix loader setup in `base_tts`
2021-09-06 15:16:58 +00:00
Eren Gölge
648655fa03
Add `PitchExtractor` and return dict by `collate`
2021-09-06 15:16:58 +00:00
Eren Gölge
e802b24ad0
Compute mean and std pitch
2021-09-06 15:16:58 +00:00
Eren Gölge
d085642ac1
Cache pitch features
Cache the features at the beginning of `BaseTTS` training.
2021-09-06 15:16:58 +00:00
Eren Gölge
994f2be2c1
Add compute_f0 field
2021-09-06 15:16:58 +00:00
Eren Gölge
f186856e5d
Add option to sort input sequence by audio len
2021-08-30 08:10:35 +00:00
Eren Gölge
3ab8cef99e
Fix VITS model SPD
2021-08-18 14:55:46 +00:00
Eren Gölge
7c0d564965
Synchronize DDP processes
2021-08-13 10:40:50 +00:00
Eren Gölge
ecf5f17dca
Fix distribute.py and ddp training
2021-08-12 22:22:32 +00:00