Eren Gölge
6e3f74fc29
Fix #2191
2023-01-15 23:11:57 +01:00
Eren Gölge
a9167cf239
Fixup overflow ( #2218 )
...
* Update overflow config
* Pulling shuffle and drop_last from config
* Print training stats for overflow
2022-12-15 00:56:48 +01:00
Eren Gölge
ecea43ec81
Adding pre-trained Overflow model ( #2211 )
...
* Adding pretrained Overflow model
* Stabilize HMM
* Fixup model manager
* Return `audio_unique_name` by default
* Distribute max split size over datasets
* Fixup eval_split_size
* Make style
2022-12-14 16:55:48 +01:00
Victor Shepardson
5307a2229b
Fix Capacitron training ( #2086 )
2022-11-01 12:52:06 +01:00
Eren Gölge
9e5a469c64
d-vector handling ( #1945 )
...
* Update BaseDatasetConfig
- Add dataset_name
- Change name to formatter_name
* Update compute_embedding
- Allow entering dataset by args
- Use released model by default
- Use the new key format
* Update loading
* Update recipes
* Update other dep code
* Update tests
* Fixup
* Load multiple embedding files
* Fix argument names in dep code
* Update docs
* Fix argument name
* Fix linter
2022-09-13 14:10:33 +02:00
manmay nakhashi
7fd9b89ebf
fix get_random_embeddings --> get_random_embedding ( #1726 )
...
* fix get_random_embeddings --> get_random_embedding
function typo leads to training crash, no such function
* fix typo
get_random_embedding
2022-08-07 14:06:03 +02:00
Eren Gölge
f70e82cd19
Use fsspec and torch for embedding file IO ( #1581 )
...
* Use fsspec and torch for embedding file
* Fixup
* Fix load and save files
* Fix compute embedding script
* Set use_cuda to true if available
* Add dummy speakers.pth file
* Make style
* Change default speakers file extension
Co-authored-by: WeberJulian <julian.weber@hotmail.fr>
2022-06-01 13:49:42 +02:00
Edresson Casanova
c6008e5235
Add audio length sampler balancer ( #1561 )
...
* Add audio length sampler balancer
* Add unit tests
2022-05-12 19:59:19 +02:00
Edresson Casanova
060e0f9368
Add EmbeddingManager and BaseIDManager ( #1374 )
2022-03-31 13:41:16 +02:00
Eren Gölge
0870a4faa2
Make style ( #1405 )
2022-03-16 12:13:55 +01:00
Edresson Casanova
917f417ac4
Add alphas to control language and speaker balancer ( #1216 )
...
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the sampler weights to float to save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Eren Gölge
fe656659be
Implement BaseTTS
2022-02-25 11:31:56 +01:00
Eren Gölge
424d04e4f6
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
35fc7270ff
Implement BaseTTS
2022-02-25 11:28:47 +01:00
Eren Gölge
1e219fef0a
Revert drop_last
2022-02-25 11:26:59 +01:00
Eren Gölge
8622226f3f
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
38314194e7
Set `drop_last`
2022-02-25 11:26:59 +01:00
Eren Gölge
ef63c99524
Implement `start_by_longest` option for TTSDataset
2022-02-25 11:26:18 +01:00
Eren Gölge
5176ae9e53
Fixes small compat. issues
2022-02-25 11:21:19 +01:00
Eren Gölge
18f726af65
Update ForwardTTS
2022-02-25 11:11:35 +01:00
Eren Gölge
452dbc43d8
Update imports for symbols -> characters
2022-02-25 11:05:06 +01:00
Eren Gölge
8071fa0020
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
4cd690e4c1
Updates BaseTTS and configs
2022-02-25 10:57:35 +01:00
Eren Gölge
4597d4e5b6
Remove get_characters from BaseTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
2d8ce98d2a
Update imports for symbols -> characters
2022-02-25 10:48:03 +01:00
Eren Gölge
9a95e15483
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:48:03 +01:00
Eren Gölge
d2525abe8c
Remove get_characters from BaseTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
fbad17e084
Update imports for symbols -> characters
2022-02-25 10:48:02 +01:00
Eren Gölge
bd461ace33
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:45:24 +01:00
Eren Gölge
704dddcffa
Make style
2021-12-20 11:54:10 +00:00
WeberJulian
2bbcb558dc
Prevent weighted sampler use when num_gpus > 1
2021-12-20 11:54:10 +00:00
WeberJulian
74cedfac38
Revert init multispeaker change
2021-12-20 11:54:10 +00:00
WeberJulian
6b03943526
Move multilingual logic out of the trainer
2021-12-20 11:54:10 +00:00
WeberJulian
e8af6a9f08
Fix use_speaker_embedding logic
2021-12-20 11:54:10 +00:00
WeberJulian
1472b6df49
make style
2021-12-20 11:54:10 +00:00
WeberJulian
3b5592abcf
fix test vits
2021-12-20 11:54:10 +00:00
WeberJulian
005bba60b0
get_speaker_weighted_sampler
2021-12-20 11:54:10 +00:00
Edresson
76251b619a
Fix d-vector multispeaker training bug
2021-12-20 11:54:09 +00:00
Edresson
ac9416fb86
Add multilingual inference support
2021-12-20 11:54:09 +00:00
Edresson
dcb2374bc9
Add multilingual training support to the VITS model
2021-12-20 11:54:09 +00:00
Edresson
f996afedb0
Implement multilingual dataloader support
2021-12-20 11:54:09 +00:00
Eren Gölge
2ed9e3c241
Fix constant use of noise augment
2021-11-08 09:20:34 +01:00
Eren Gölge
2b7d159383
Update BaseTTS for multi-speaker training
2021-10-21 16:29:06 +00:00
Eren Gölge
7c2cb7cc30
Update BaseTTS
2021-10-20 18:18:22 +00:00
Eren Gölge
127571423c
Update multi-speaker init in BaseTTS
2021-10-18 08:54:41 +00:00
Eren Gölge
a0a5d580e9
Approximate audio length from file size
2021-10-18 08:54:02 +00:00
Eren Gölge
8ada870a57
Refactor `trainer.py` for v2
2021-09-30 14:16:34 +00:00
Eren Gölge
2b59da802c
Fix loader setup in `base_tts`
2021-09-06 15:16:58 +00:00
Eren Gölge
648655fa03
Add `PitchExtractor` and return dict by `collate`
2021-09-06 15:16:58 +00:00
Eren Gölge
e802b24ad0
Compute mean and std pitch
2021-09-06 15:16:58 +00:00