Edresson Casanova
8d228ab22a
Trick for upsampling to high sampling rates using the VITS model ( #1456 )
...
* Add upsample VITS support
* Fix the bug in inference
* Fix lint checks
* Add RMS based norm in save_wav method
* Style fix
* Add the period for VITS multi-period discriminator in model_args
* Bug fix in speaker encoder load in inference time
* Add unit tests
* Remove useless detach_z_vocoder parameter
* Add docs for VITS upsampling
* Fix the docs
* Rename TTS_part_sample_rate to encoder_sample_rate
* Add upsampling_init and upsampling_z methods
* Add asserts for encoder_sample_rate part
* Move upsampling tests to test_vits.py
2022-04-26 11:47:46 +02:00
WeberJulian
30bea7d53c
Update manage.py ( #1514 )
2022-04-19 14:27:32 +02:00
Eren Gölge
7133f8f47d
Print Model's license when downloading ( #1512 )
...
* Print model license while downloading
* Make style
* Add a new license link
* Make style
2022-04-19 14:18:49 +02:00
Edresson Casanova
060e0f9368
Add EmbeddingManager and BaseIDManager ( #1374 )
2022-03-31 13:41:16 +02:00
WeberJulian
1b22f03e98
Fix G2P backend of the released models ( #1461 )
...
* Fix enforce phonemizer
* Add new models
* Fix .model.json
2022-03-30 12:47:11 +02:00
WeberJulian
c66a6241fd
Enforce phonemizer definition for synthesis ( #1441 )
...
* Enforce phonemizer definition for synthesis
* Fix train_tts, tokenizer init can now edit config
* Add small change to trigger CI pipeline
* Fix wrong output path for one tts_test
* Fix style
* Test config overrides by args and tokenizer
* Fix style
2022-03-25 23:15:33 +01:00
Edresson Casanova
3435bc8fca
Fix style tests
2022-03-23 15:05:32 -03:00
Edresson Casanova
0ae1e0248c
Fix the bug for empty audio files
2022-03-23 14:39:31 -03:00
Edresson Casanova
ea53d6feb3
Replace webrtcvad by silero-vad
2022-03-23 14:39:31 -03:00
Eren Gölge
1c3623af33
Fix model manager ( #1436 )
...
* Fix manager
* Make style
2022-03-23 12:57:14 +01:00
Eren Gölge
72d85e53c9
Update model file extension ( #1422 )
...
* Update model file ext to `.pth`
* Update docs
* Rename more
* Find model files
2022-03-22 17:55:00 +01:00
Eren Gölge
0870a4faa2
Make style ( #1405 )
2022-03-16 12:13:55 +01:00
Edresson Casanova
dbe9da7f15
Add Voice conversion inference support ( #1337 )
...
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Eren Gölge
942df0fb05
Update vits dataset
2022-03-02 09:14:32 +01:00
Eren Gölge
935a604046
Delete trainer_utils
2022-02-25 11:29:41 +01:00
Eren Gölge
d0c27a9661
Update synthesis.py
2022-02-25 11:29:41 +01:00
Eren Gölge
2bad098625
Implement BaseVocabulary
2022-02-25 11:28:47 +01:00
Eren Gölge
a013566d15
Delete trainer related code
2022-02-25 11:26:59 +01:00
Eren Gölge
d5c0e17548
Load right char class dynamically
2022-02-25 11:26:59 +01:00
Eren Gölge
1f0c8179da
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
cd5d1497cf
Add pitch_fmin pitch_fmax args to the audio
2022-02-25 11:26:59 +01:00
Eren Gölge
1445a46e9e
Update synthesizer to use init_from_config
2022-02-25 11:26:59 +01:00
Eren Gölge
2fe16de8e3
Make lint
2022-02-25 11:25:00 +01:00
Eren Gölge
50e17097a7
Add verbose option to AudioProcessor
2022-02-25 11:24:13 +01:00
Eren Gölge
c9972e6f14
Make lint
2022-02-25 11:07:34 +01:00
Eren Gölge
9bb347a52b
Update for tokenizer API
2022-02-25 11:05:06 +01:00
Eren Gölge
84091096a6
Refactor Synthesizer class for TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
1df1d6c4a9
Update for tokenizer API
2022-02-25 10:48:03 +01:00
Eren Gölge
3476be30d7
Refactor Synthesizer class for TTSTokenizer
2022-02-25 10:48:03 +01:00
Eren Gölge
9397a56b13
Allow init_from_config from model or audio config
2022-02-25 10:48:03 +01:00
Eren Gölge
acc6eef625
Update for tokenizer API
2022-02-25 10:48:02 +01:00
Eren Gölge
53f696615b
Add init_from_config to AudioProcessor
2022-02-25 09:32:54 +01:00
Eren Gölge
3d86edfc81
Refactor Synthesizer class for TTSTokenizer
2022-02-25 09:32:54 +01:00
Eren Gölge
127118c637
Update TTS.tts formatters ( #1228 )
...
* Return Dict from tts formatters
* Make style
2022-02-11 23:03:43 +01:00
Eren Gölge
fc09e319d4
Prioritize the given encoder path over config
2022-01-03 14:24:19 +00:00
Eren Gölge
7fad969a1f
Fix if else statement
2022-01-03 14:16:11 +00:00
Eren Gölge
e55f5ee59e
Make linter
2022-01-01 15:50:04 +00:00
Eren Gölge
8fd1ee1926
Print urls when BadZipError
2022-01-01 15:26:35 +00:00
Eren Gölge
61874bc0a0
Fix your_tts inference from the listed models
2021-12-31 13:45:05 +00:00
Eren Gölge
5c5ddd2ba7
Init speaker manager for speaker encoder
2021-12-22 15:51:53 +00:00
Eren Gölge
633dcc9c56
Implement RMS volume normalization
2021-12-22 15:51:14 +00:00
Eren Gölge
56378b12f7
Fix speaker encoder init
2021-12-21 12:26:25 +00:00
Eren Gölge
c9c1fa0548
Fix multi-speaker init in Synthesizer
2021-12-21 09:44:07 +00:00
Eren Gölge
f769595112
Add more listing options to ModelManager
2021-12-20 11:54:10 +00:00
Eren Gölge
473414d4af
Implement init_speaker_encoder and change arg names
2021-12-20 11:54:10 +00:00
Eren Gölge
35a781fb90
Fix synthesizer reading `use_language_embedding`
2021-12-20 11:54:10 +00:00
Eren Gölge
704dddcffa
Make style
2021-12-20 11:54:10 +00:00
WeberJulian
54b7fb4e4a
Fix zoo tests
2021-12-20 11:54:10 +00:00
WeberJulian
a564eb9f54
Add support for multi-lingual models in CLI
2021-12-20 11:54:10 +00:00
Edresson
818dc4ccd8
Add Docstring for TorchSTFT
2021-12-20 11:54:10 +00:00
Edresson
d39200e69b
Remove torchaudio requirement
2021-12-20 11:54:10 +00:00
Edresson
45d0b04179
Lint fixes
2021-12-20 11:54:10 +00:00
Edresson
2b2cecaea2
Set the new_fields in copy_model_files as None by default
2021-12-20 11:54:10 +00:00
Edresson
352aa69eca
Create a module for the VAD script
2021-12-20 11:54:10 +00:00
loganhart420
103c010eca
Add additional datasets
2021-12-16 07:21:27 -05:00
Eren Gölge
ce45d9e1af
Make style and lint
2021-12-01 10:42:52 +00:00
Eren Gölge
512ada7548
Fix callbacks against multi-gpu training
2021-12-01 10:32:14 +00:00
Eren Gölge
d227aaebcc
Print when using Griffin-Lim in Synthesizer
2021-11-01 16:52:26 +01:00
George
37eaefc085
Optional silence trimming during inference and find_endpoint() fix ( #898 )
...
* Set find_endpoint db threshold in config.json
* Optional silence trimming during inference
* Make trim_db value negative
2021-10-29 18:28:55 +02:00
Eren Gölge
2df0752e73
Model zoo tests ( #900 )
...
* Fix VITS model multi-speaker init
* Remove gdrive support in model manager
* Add model zoo tests
2021-10-29 17:54:16 +02:00
Eren Gölge
035ed432bc
Doc update ( #889 )
...
* Link source files from the docs
* Update glowTTS recipes for docs
* Add dataset downloaders
2021-10-26 17:41:33 +02:00
Eren Gölge
1987aaaaed
Update d-vector reshape in synthesizer
2021-10-21 13:53:25 +00:00
Eren Gölge
92b6d98443
Set pitch frame alignment wrt spec computation
2021-10-20 18:12:38 +00:00
Eren Gölge
0a3d1cc7ee
Pass speaker manager to the model in synthesizer
2021-10-20 18:11:36 +00:00
Eren Gölge
3c7848e9b1
Don't OOR values in train console log
2021-10-19 16:32:16 +00:00
Eren Gölge
c514351c0e
Refactor multi-speaker init in BaseTTS-Tacotron1-2
2021-10-18 08:55:45 +00:00
Eren Gölge
700b056117
Update Synthesizer multi-speaker handling
2021-10-15 10:21:12 +00:00
Eren Gölge
9a0d8fa027
Update `copy_model_files()`
2021-09-30 14:47:56 +00:00
Eren Gölge
8ada870a57
Refactor `trainer.py` for v2
2021-09-30 14:16:34 +00:00
Eren Gölge
7d8f77385a
Use `glow-tts` in synthesis tests
2021-09-10 17:27:33 +00:00
Eren Gölge
742f9c54da
Warn user if nan in GL
2021-09-10 08:26:05 +00:00
Eren Gölge
4761853c5c
Fix imports
2021-09-08 13:34:40 +00:00
Eren Gölge
2c4bbbf9b9
Use pyworld for pitch
2021-09-06 15:16:58 +00:00
Eren Gölge
98a7271ce8
Refactor FastPitchv2
2021-09-06 15:16:58 +00:00
Eren Gölge
42862f7fdb
Format style of the recipes
2021-09-06 15:16:58 +00:00
Eren Gölge
aacbb3ed77
Fix SpeakerManager usage in `synthesize.py`
2021-09-06 15:16:58 +00:00
Eren Gölge
5a6ffaee08
Add yin based pitch computation
2021-09-06 15:16:58 +00:00
Eren Gölge
d085642ac1
Cache pitch features
...
Cache the features at the beginning of `BaseTTS` training.
2021-09-06 15:16:58 +00:00
Eren Gölge
fba257104d
Compute F0 using librosa
2021-09-06 15:16:58 +00:00
Eren Gölge
d16da949a5
Merge branch 'fix_distribute' into dev
2021-08-30 16:31:07 +00:00
Eren Gölge
5255e089e6
Fix #767
2021-08-30 13:10:08 +00:00
Eren Gölge
c560114324
Fix #750
2021-08-30 13:06:50 +00:00
Eren Gölge
18da8f5dbd
Update pylint 2.10.2 and fix lint issues
2021-08-30 08:10:35 +00:00
Eren Gölge
2620f62ea8
Move duration_loss inside VitsGeneratorLoss
2021-08-27 07:07:07 +00:00
Eren Gölge
1692b8e4d9
Merge pull request #726 from fijipants/patch-1
...
Fix bug with log_func
2021-08-26 22:11:29 +02:00
Eren Gölge
49e1181ea4
Fixes for the vits model
2021-08-26 17:15:09 +00:00
fijipants
e9e01b09b0
Fix bug with log_func
2021-08-18 19:59:51 -04:00
fijipants
8f57f8adfd
Update synthesizer.py
2021-08-18 19:56:52 -04:00
Eren Gölge
7c0d564965
Synchronize DDP processes
2021-08-13 10:40:50 +00:00
Eren Gölge
ecf5f17dca
Fix distribute.py and ddp training
2021-08-12 22:22:32 +00:00
Eren Gölge
537bc8487a
Print model count when listing models
2021-08-10 16:25:11 +00:00
Ayush Chaurasia
f3e9d61330
Refactor logging initialization
2021-08-09 18:35:08 +00:00
Ayush Chaurasia
79b74a989d
Update: add_text
2021-08-09 18:34:38 +00:00
Ayush Chaurasia
9fcf48b760
Delete logger_base.py
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
290972fd35
reformat
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
936a47504d
Update Logger API, recipes
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
f63cf46c55
Unified logger API
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
f4434da5a3
Update disabled structure
2021-08-09 18:31:16 +00:00
Ayush Chaurasia
f606741dc4
Add artifacts logging, wandb args
2021-08-09 18:31:16 +00:00
Ayush Chaurasia
f5e50ad502
WandbLogger
2021-08-09 18:27:06 +00:00