p0p4k
903a77c197
Update wavenet.py ( #1796 )
...
* Update wavenet.py
The current version does not use the "in_channels" argument.
In GlowTTS we use normalizing flows, so "input dim" == "output dim" (channels and length). The existing code therefore takes a hidden_channels-sized tensor as input to the first layer and also outputs a hidden_channels-sized tensor.
However, since this is a generic implementation, I believe it is better to update it for more general use.
* "in_channels -> hidden_channels"
2022-08-01 12:20:37 +02:00
p0p4k
4fe50801b5
Update README.md; download progress bar in CLI. ( #1797 )
...
* Update README.md
- minor PR
- added model_info usage guide based on #1623 in README.md.
* "added tqdm bar for model download"
* Update manage.py
* fixed style
* fixed style
* sort imports
2022-08-01 12:17:47 +02:00
Eren Gölge
7d8b1665c8
Fix rand_segment edge case (input_len == seg_len - 1)
2022-08-01 11:37:45 +02:00
vanIvan
5094499eba
Fix & update WaveRNN vocoder model ( #1749 )
...
* Fixes KeyError bug. Adding logging to dashboard.
* Make pep8 compliant
* Make style compliant
* Still fixing style
2022-07-26 15:05:11 +02:00
p0p4k
10195c4eba
Update decoder.py ( #1792 )
...
Minor comment correction.
2022-07-26 13:06:06 +02:00
ivan provalov
903d9c791a
Fix for FloorDiv Function Warning ( #1760 )
...
* Fix for Floor Function Warning
Fix for Floor Function Warning
* Adding double quotes to fix formatting
Adding double quotes to fix formatting
* Update glow_tts.py
* Update glow_tts.py
2022-07-20 11:31:22 +02:00
Eren Gölge
f7587fc134
Fix SSIM loss correction
2022-07-13 10:47:12 +02:00
Eren Gölge
bc1f93c299
Fix device allocation
2022-07-12 19:05:25 +02:00
Eren Gölge
49bac724c0
Implement VitsAudioConfig ( #1556 )
...
* Implement VitsAudioConfig
* Update VITS LJSpeech recipe
* Update VITS VCTK recipe
* Make style
* Add missing decorator
* Add missing param
* Make style
* Update recipes
* Fix test
* Bug fix
* Exclude tests folder
* Make linter
* Make style
2022-07-12 18:49:58 +02:00
a-froghyar
34b80e0280
feat: updated recipes and lr fix ( #1718 )
...
- updated the recipes activating more losses for more stable training
- re-enabling guided attention loss
- fixed a bug where the wrong lr was fetched for logging
2022-07-12 15:00:53 +02:00
Eren Gölge
48a4f3647f
Make lint
2022-07-12 14:58:26 +02:00
WeberJulian
c614f21982
Add durations as aux input for VITS ( #1694 )
...
* Add durations as aux input for VITS
* Make style
* Fix tts_tests
* Fix test_get_aux_input
2022-07-12 14:25:21 +02:00
Eren Gölge
2cf89b88c9
Make style
2022-07-12 14:12:57 +02:00
Eren Gölge
a6f73a18cb
Fix BCELoss addressing #1192
2022-07-12 14:11:34 +02:00
Eren Gölge
c17ff17a18
Fix SSIM loss
2022-07-12 12:35:24 +02:00
Eren Gölge
f1e35596e8
Remove redundant config field
2022-07-11 13:39:41 +02:00
WeberJulian
5cef6facb0
Fix tokenizer for punc only ( #1717 )
2022-07-06 22:59:41 +02:00
camillem
5c821d9fa1
Fix: the --model_name and --vocoder_name arguments need a <model_type> element ( #1469 )
...
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-06-27 10:32:43 +02:00
manmay nakhashi
577ec406f4
Fix checkpointing GAN models ( #1641 )
...
* checkpoint save step crash fix
* checkpoint save step crash fix
* Update gan.py
updated requested changes
* crash fix
2022-06-22 12:07:46 +02:00
Eren Gölge
00e67092d8
Bump up to v0.7.1
2022-06-21 14:12:55 +02:00
Eren Gölge
3328be7a8e
Remove GL message
2022-06-21 12:39:31 +02:00
WeberJulian
30c72e0d05
Add Thorsten VITS model ( #1675 )
...
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2022-06-21 11:39:49 +02:00
p0p4k
71281ff1e4
Add support for model_info in CLI ( #1623 )
...
* model_info
* model_info
* model_info_by_idx and name
* model_info_by_idx and name
* model_info
* Update manage.py
* fixed linter
* fixed linter
* fixed linter
* fixed linter
* fixed return style checks
* fixed linter
* fixed linter
* fixed idx always positive
* added comments
* fix parser.args check
* fix parser.args check
* Make style
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2022-06-20 23:28:17 +02:00
Eren Gölge
8b75e8be9c
Bump up to v0.7.0
2022-06-20 13:50:09 +02:00
WeberJulian
6126c23498
Add synpaflex formatter ( #1616 )
...
* Add synpaflex formatter
* Fix formatter
* Make style
2022-06-20 13:36:26 +02:00
WeberJulian
f09ea11c71
Internal formatter ( #1629 )
...
* Add coqui formatter
* Make style
2022-06-08 14:31:03 +02:00
Eren Gölge
f70e82cd19
Use fsspec and torch for embedding file IO ( #1581 )
...
* Use fsspec and torch for embedding file
* Fixup
* Fix load and save files
* Fix compute embedding script
* Set use_cuda to true if available
* Add dummy speakers.pth file
* Make style
* Change default speakers file extension
Co-authored-by: WeberJulian <julian.weber@hotmail.fr>
2022-06-01 13:49:42 +02:00
Noran Raskin
a790df4e94
Training recipes for thorsten dataset ( #1020 )
...
* Fix style
* Fix isort
* Remove tensorboardX from requirements
Co-authored-by: logan hart <72301874+loganhart420@users.noreply.github.com>
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2022-05-30 12:07:31 +02:00
André R. de Miranda
3b84ef9524
Fixed use_cuda issue in compute_embeddings.py
...
Added use_cuda argument in self.init_encoder method
2022-05-20 12:46:46 -03:00
a-froghyar
8be21ec387
Capacitron ( #977 )
...
* new CI config
* initial Capacitron implementation
* delete old unused file
* fix empty formatting changes
* update losses and training script
* fix previous commit
* fix commit
* Add Capacitron test and first round of test fixes
* revert formatter change
* add changes to the synthesizer
* add stepwise gradual lr scheduler and changes to the recipe
* add inference script for dev use
* feat: add posterior inference arguments to synth methods
- added reference wav and text args for posterior inference
- some formatting
* fix: add espeak flag to base_tts and dataset APIs
- use_espeak_phonemes flag was not implemented in those APIs
- espeak is now able to be utilised for phoneme generation
- necessary phonemizer for the Capacitron model
* chore: update training script and style
- training script includes the espeak flag and other hyperparams
- made style
* chore: fix linting
* feat: add Tacotron 2 support
* leftover from dev
* chore:rename parser args
* feat: extract optimizers
- created a separate optimizer class to merge the two optimizers
* chore: revert arbitrary trainer changes
* fmt: revert formatting bug
* formatting again
* formatting fixed
* fix: log func
* fix: update optimizer
- Implemented load_state_dict for continuing training
* fix: clean optimizer init for standard models
* improvement: purge espeak flags and add training scripts
* Delete capacitronT2.py
delete old training script, new one is pushed
* feat: capacitron trainer methods
- extracted capacitron specific training operations from the trainer into custom
methods in taco1 and taco2 models
* chore: renaming and merging capacitron and gst style args
* fix: bug fixes from the previous commit
* fix: implement state_dict method on CapacitronOptimizer
* fix: call method
* fix: inference naming
* Delete train_capacitron.py
* fix: synthesize
* feat: update tests
* chore: fix style
* Delete capacitron_inference.py
* fix: fix train tts t2 capacitron tests
* fix: double forward in T2 train step
* fix: double forward in T1 train step
* fix: run make style
* fix: remove unused import
* fix: test for T1 capacitron
* fix: make lint
* feat: add blizzard2013 recipes
* make style
* fix: update recipes
* chore: make style
* Plot test sentences in Tacotron
* chore: make style and fix import
* fix: call forward first before problematic floordiv op
* fix: update recipes
* feat: add min_audio_len to recipes
* aux_input["style_mel"]
* chore: make style
* Make capacitron T2 recipe more stable
* Remove T1 capacitron Ljspeech
* feat: implement new grad clipping routine and update configs
* make style
* Add pretrained checkpoints
* Add default vocoder
* Change trainer package
* Fix grad clip issue for tacotron
* Fix scheduler issue with tacotron
Co-authored-by: Eren Gölge <egolge@coqui.ai>
Co-authored-by: WeberJulian <julian.weber@hotmail.fr>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-05-20 16:17:11 +02:00
Edresson Casanova
ee99a6c1e2
Fix voice conversion inference ( #1583 )
...
* Add voice conversion zoo test
* Fix style
* Fix unit test
2022-05-20 15:50:25 +02:00
Edresson Casanova
e5d8ec2402
Change the VITS upsampling interpolation trick to linear ( #1564 )
2022-05-13 10:52:39 +02:00
Edresson Casanova
c6008e5235
Add audio length sampler balancer ( #1561 )
...
* Add audio length sampler balancer
* Add unit tests
2022-05-12 19:59:19 +02:00
Eren Gölge
6e460b7e42
Add an assert for the upsampling trick ( #1538 )
2022-05-12 19:55:24 +02:00
Eren Gölge
4857967063
🐍 Python 3.10.x support and drop Python 3.6 support ( #1565 )
...
* Update requirements
* Update CI for p3.10
* Update numpy requirement
* Drop 🐍 p3.6 support
Numpy also dropped support for p3.6
* Bind cython v0.29.28
* Bind pyworld to v0.2.10
> 0.2.10 is not p3.10.x compatible
* Update Dockerfile
2022-05-12 15:50:25 +02:00
Edresson Casanova
a97eed696a
Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 ( #1560 )
2022-05-12 15:15:18 +02:00
Eren Gölge
e45ae57aef
Merge pull request #1550 from coqui-ai/fix-upsampling-asserts
...
Fix VITS upsampling asserts
2022-05-12 14:51:41 +02:00
Edresson Casanova
175ca06388
Add reinit text encoder and duration predictor parameter ( #1562 )
...
* Add reinit encoder and duration predictor option
* Add .data to prevent any overlooked autograd hook
2022-05-12 09:08:36 -03:00
Edresson Casanova
182711043c
Fix the VITS upsampling asserts
...
Fix style
2022-05-12 09:08:29 -03:00
Eren Gölge
2fc38f67d2
Update SpeakerManager init in Synthesizer
2022-05-11 11:32:27 +02:00
Eren Gölge
c3f8c4d5eb
Return default SpeakerManager if no d_vector_file
2022-05-11 11:31:45 +02:00
Eren Gölge
121e9ed685
Pass use_cuda to init_encoder
2022-05-11 11:31:17 +02:00
Eren Gölge
c18bd21b3f
Return durations at VITS inference
2022-05-11 11:30:05 +02:00
Eren Gölge
5021a03de0
Use torch.no_grad for VITS inference
2022-05-11 11:29:36 +02:00
Eren Gölge
3f03e3012c
Fix batch_group_size in VITS
2022-05-07 13:44:44 +02:00
code-review-doctor
fa887ef5f9
Fix issue probably-meant-fstring found at https://codereview.doctor ( #1532 )
2022-05-07 13:33:40 +02:00
Eren Gölge
a0a9279e4b
Fix GAN optimizer order
...
commit 212d330929
Author: Edresson Casanova <edresson1@gmail.com>
Date: Fri Apr 29 16:29:44 2022 -0300
Fix unit test
commit 44456b0483
Author: Edresson Casanova <edresson1@gmail.com>
Date: Fri Apr 29 07:28:39 2022 -0300
Fix style
commit d545beadb9
Author: Edresson Casanova <edresson1@gmail.com>
Date: Thu Apr 28 17:08:04 2022 -0300
Change order of HIFI-GAN optimizers to match the original repository
commit 657c5442e5
Author: Edresson Casanova <edresson1@gmail.com>
Date: Thu Apr 28 15:40:16 2022 -0300
Remove audio padding before mel spec extraction
commit 76b274e690
Merge: 379ccd7b 6233f4fc
Author: Edresson Casanova <edresson1@gmail.com>
Date: Wed Apr 27 07:28:48 2022 -0300
Merge pull request #1541 from coqui-ai/comp_emb_fix
Bug fix in compute embedding without eval partition
commit 379ccd7ba6
Author: WeberJulian <julian.weber@hotmail.fr>
Date: Wed Apr 27 10:42:26 2022 +0200
returns y_mask in VITS inference (#1540 )
* returns y_mask
* make style
2022-05-07 13:29:11 +02:00
Edresson Casanova
60034674f9
Remove audio padding before mel spec extraction
2022-05-07 13:12:09 +02:00
WeberJulian
fbdf76b2fc
returns y_mask in VITS inference ( #1540 )
...
* returns y_mask
* make style
2022-05-03 13:49:24 +02:00
Edresson Casanova
6233f4fcd7
Bug fix in compute embedding without eval partition
2022-04-26 13:58:03 -03:00
Edresson Casanova
8d228ab22a
Trick for upsampling to high sampling rates using the VITS model ( #1456 )
...
* Add upsample VITS support
* Fix the bug in inference
* Fix lint checks
* Add RMS based norm in save_wav method
* Style fix
* Add the period for VITS multi-period discriminator in model_args
* Bug fix in speaker encoder load in inference time
* Add unit tests
* Remove useless detach_z_vocoder parameter
* Add docs for VITS upsampling
* Fix the docs
* Rename TTS_part_sample_rate to encoder_sample_rate
* Add upsampling_init and upsampling_z methods
* Add asserts for encoder_sample_rate part
* Move upsampling tests to test_vits.py
2022-04-26 11:47:46 +02:00
Eren Gölge
c410bc58ef
Bump to v0.6.2
2022-04-20 11:46:26 +02:00
WeberJulian
30bea7d53c
Update manage.py ( #1514 )
2022-04-19 14:27:32 +02:00
Eren Gölge
7133f8f47d
Print Model's license when downloading ( #1512 )
...
* Print model license while downloading
* Make style
* Add a new license link
* Make style
2022-04-19 14:18:49 +02:00
WeberJulian
4953636b14
Add African models ( #1511 )
...
* Add african models
* Set default license for all models
2022-04-19 14:18:30 +02:00
Edresson Casanova
060e0f9368
Add EmbeddingManager and BaseIDManager ( #1374 )
2022-03-31 13:41:16 +02:00
WeberJulian
1b22f03e98
Fix G2P backend of the released models ( #1461 )
...
* Fix enforce phonemizer
* Add new models
* Fix .model.json
2022-03-30 12:47:11 +02:00
WeberJulian
c66a6241fd
Enforce phonemizer definition for synthesis ( #1441 )
...
* Enforce phonemizer definition for synthesis
* Fix train_tts, tokenizer init can now edit config
* Add small change to trigger CI pipeline
* fix wrong output path for one tts_test
* Fix style
* Test config overrides by args and tokenizer
* Fix style
2022-03-25 23:15:33 +01:00
Edresson Casanova
37896e1743
Bug fix in freeze encoder ( #1391 )
...
* Fix the bug in freeze encoder
* Remove emb_l definition for non-multilingual training
* Fix unit tests
2022-03-24 18:16:04 +01:00
Edresson Casanova
3435bc8fca
Fix style tests
2022-03-23 15:05:32 -03:00
Edresson Casanova
0ae1e0248c
Fix the bug for empty audio files
2022-03-23 14:39:31 -03:00
Edresson Casanova
ea53d6feb3
Replace webrtcvad by silero-vad
2022-03-23 14:39:31 -03:00
Eren Gölge
3af01cfe3b
Update base model wrt 👟 ( #1406 )
2022-03-23 17:24:20 +01:00
Eren Gölge
1c3623af33
Fix model manager ( #1436 )
...
* Fix manager
* Make style
2022-03-23 12:57:14 +01:00
Eren Gölge
72d85e53c9
Update model file extension ( #1422 )
...
* Update model file ext to `.pth`
* Update docs
* Rename more
* Find model files
2022-03-22 17:55:00 +01:00
Eren Gölge
fd56fabb21
Fix #1380 ( #1409 )
2022-03-16 12:38:27 +01:00
Eren Gölge
0870a4faa2
Make style ( #1405 )
2022-03-16 12:13:55 +01:00
WeberJulian
690c96ed28
Fix default phonemizer for ja and zh ( #1399 )
2022-03-16 12:13:22 +01:00
Edresson Casanova
f81892483d
REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support ( #1349 )
...
* Rename Speaker encoder module to encoder
* Add a generic emotion dataset formatter
* Transform the Speaker Encoder dataset to a generic dataset and create emotion encoder config
* Add class map in emotion config
* Add Base encoder config
* Add evaluation encoder script
* Fix the bug in plot_embeddings
* Enable Weight decay for encoder training
* Add argument to disable storage
* Add Perfect Sampler and remove storage
* Add evaluation during encoder training
* Fix lint checks
* Remove useless config parameter
* Activate evaluation in speaker encoder test and use multispeaker dataset for this test
* Unit test fixes
* Remove useless tests to speed up the aux_tests
* Use get_optimizer in Encoder
* Add BaseEncoder Class
* Fix the unit tests
* Add Perfect Batch Sampler unit test
* Add compute encoder accuracy in a function
2022-03-11 14:43:40 +01:00
Edresson Casanova
36e9ea2f97
Open bible dataset formatter ( #1365 )
...
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
* Fix the bug in find unique chars script
* Add OpenBible formatter
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-03-11 10:43:31 +01:00
Edresson Casanova
dbe9da7f15
Add Voice conversion inference support ( #1337 )
...
* Add support for voice conversion inference
* Cache d_vectors_by_speaker for fast inference using a bigger speakers.json
* Rebase bug fix
* Use the average d-vector for inference
2022-03-10 14:57:12 +01:00
Edresson Casanova
917f417ac4
Add alphas to control language and speaker balancer ( #1216 )
...
* Add alphas to control language and speaker balancer
* Add docs for speaker and language samplers
* Change the sampler weights to float to save memory
* Change the test_samplers to unittest format
* Add get_sampler method in BaseTTS
* Fix rebase issues
* Add language and speaker samplers support for DDP training
* Rename distributed sampler wrapper
* Remove the DistributedSamplerWrapper and use the one from Trainer
* Bugfix after rebase
* Move the samplers config to tts config
2022-03-10 14:56:09 +01:00
Edresson Casanova
f381e29b91
REBASED: Add support for the speaker encoder training using torch spectrograms ( #1348 )
...
* Add support for the speaker encoder training using torch spectrograms
* Remove useless function in speaker encoder dataset class
2022-03-10 14:54:51 +01:00
Eren Gölge
c670365507
Fix VCTK recipe and formatter
2022-03-08 14:20:34 +01:00
Eren Gölge
8feb41d361
Bump up to v0.6.1
2022-03-07 15:57:44 +01:00
Eren Gölge
ee02bc3823
Bump up to v0.6.0
2022-03-07 12:08:22 +01:00
Eren Gölge
dc280819be
Add new models
2022-03-07 12:08:09 +01:00
Eren Gölge
e9d9028b4d
Revert cleaner name
2022-03-06 12:57:06 +01:00
Eren Gölge
764c7fa4a4
Rename phoneme_cleaners
2022-03-06 12:09:54 +01:00
Eren Gölge
dd4287de1f
Update models
2022-03-03 20:23:00 +01:00
Eren Gölge
6cb00be795
Update your_tts model URL
2022-03-02 18:04:49 +01:00
Eren Gölge
1425a023fe
Make style and lint
2022-03-02 13:25:35 +01:00
Eren Gölge
c68885b3fd
Update Vits speaker encoder init
2022-03-02 13:20:23 +01:00
Eren Gölge
27b67b7945
Fix import
2022-03-02 09:15:20 +01:00
Eren Gölge
942df0fb05
Update vits dataset
2022-03-02 09:14:32 +01:00
Eren Gölge
6a9f8074f0
Fix TTSDataset
2022-03-01 07:57:48 +01:00
Eren Gölge
690de1ab06
Update Characters and add more tests
2022-02-25 11:32:44 +01:00
Eren Gölge
9063397892
Fix FastSpeech config
2022-02-25 11:31:56 +01:00
Eren Gölge
1e414b3a09
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
acc83cd3e6
Update Vits model API
2022-02-25 11:31:56 +01:00
Eren Gölge
fe656659be
Implement BaseTTS
2022-02-25 11:31:56 +01:00
Eren Gölge
bed4afd4ee
Implement BaseVocabulary
2022-02-25 11:31:56 +01:00
Eren Gölge
e0f9be76c0
Update test_run in wavernn and wavegrad
2022-02-25 11:31:56 +01:00
Eren Gölge
bf540f4323
Update imports for trainer
2022-02-25 11:31:56 +01:00
Eren Gölge
4c43eda414
Update BaseTrainerModel
2022-02-25 11:31:56 +01:00
Eren Gölge
83c5ddc5b7
Update imports
2022-02-25 11:31:56 +01:00
Eren Gölge
14c117978d
Fix return outputs
2022-02-25 11:31:56 +01:00
Eren Gölge
424d04e4f6
Make style
2022-02-25 11:31:56 +01:00
Eren Gölge
8b3ba02c95
Add vocab_dict to model config
2022-02-25 11:31:20 +01:00
Eren Gölge
ff23dce081
Update TTSDataset
2022-02-25 11:31:20 +01:00
Eren Gölge
750903d2ba
Add VCTK formatter docstring
2022-02-25 11:30:24 +01:00
Eren Gölge
52a7896668
Update VITS loss
2022-02-25 11:30:24 +01:00
Eren Gölge
c68962c574
Update forward tts binary loss
2022-02-25 11:30:24 +01:00
Eren Gölge
c11944022d
Revert back again rand_segment
2022-02-25 11:30:24 +01:00
Eren Gölge
00c7600103
Update Vits model API
2022-02-25 11:30:24 +01:00
Eren Gölge
935a604046
Delete trainer_utils
2022-02-25 11:29:41 +01:00
Eren Gölge
d0c27a9661
Update synthesis.py
2022-02-25 11:29:41 +01:00
Eren Gölge
35fc7270ff
Implement BaseTTS
2022-02-25 11:28:47 +01:00
Eren Gölge
2bad098625
Implement BaseVocabulary
2022-02-25 11:28:47 +01:00
Eren Gölge
833de62e30
Update base_vocoder
2022-02-25 11:28:14 +01:00
Eren Gölge
fc3b6d2861
Update gan
2022-02-25 11:28:14 +01:00
Eren Gölge
20a677c623
Update test_run in wavernn and wavegrad
2022-02-25 11:28:14 +01:00
Eren Gölge
be3a03126a
Update imports for trainer
2022-02-25 11:28:14 +01:00
Eren Gölge
c911729896
Update BaseTrainerModel
2022-02-25 11:28:14 +01:00
Eren Gölge
1e219fef0a
Revert drop_last
2022-02-25 11:26:59 +01:00
Eren Gölge
7dfd753d91
Add a cheap trick to avoid short audio clips
2022-02-25 11:26:59 +01:00
Eren Gölge
1a43e05460
Fix VITS loss bug
...
Fake and real features were given in the wrong args order to
the loss function
2022-02-25 11:26:59 +01:00
Eren Gölge
4b96bfe925
Fix train logging
2022-02-25 11:26:59 +01:00
Eren Gölge
ab8a4ca2c3
Revert random segment
2022-02-25 11:26:59 +01:00
Eren Gölge
8622226f3f
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
27db089d6c
Change TrainingArgs -> TrainerArgs
2022-02-25 11:26:59 +01:00
Eren Gölge
aa81454721
Update BaseTrainingConfig
2022-02-25 11:26:59 +01:00
Eren Gölge
d3a58ed07a
Fix default values
2022-02-25 11:26:59 +01:00
Eren Gölge
54c6bb2a8c
Fix add speaker VITS
2022-02-25 11:26:59 +01:00
Eren Gölge
590b04fb89
Fix espeak_wrapper
2022-02-25 11:26:59 +01:00
Eren Gölge
a013566d15
Delete trainer related code
2022-02-25 11:26:59 +01:00
Eren Gölge
38314194e7
Set `drop_last`
2022-02-25 11:26:59 +01:00
Eren Gölge
f70e4bb8c6
Add new speakers to the vits model
2022-02-25 11:26:59 +01:00
Eren Gölge
d5c0e17548
Load right char class dynamically
2022-02-25 11:26:59 +01:00
Eren Gölge
1f0c8179da
Make style
2022-02-25 11:26:59 +01:00
Eren Gölge
b3ed6ff6b7
Update FastPitchConfig
2022-02-25 11:26:59 +01:00
Eren Gölge
1932401e8d
Fix dataset preprocessing
2022-02-25 11:26:59 +01:00
Eren Gölge
34c4be5e49
Update forwardtts
2022-02-25 11:26:59 +01:00
Eren Gölge
bb37462794
Update language manager
2022-02-25 11:26:59 +01:00
Eren Gölge
5169d4eb32
Plot pitch over input characters
2022-02-25 11:26:59 +01:00
Eren Gölge
cd5d1497cf
Add pitch_fmin pitch_fmax args to the audio
2022-02-25 11:26:59 +01:00
Eren Gölge
1445a46e9e
Update synthesizer to use init_from_config
2022-02-25 11:26:59 +01:00
Eren Gölge
7058fcc3ff
Take file extension as an argument
2022-02-25 11:26:59 +01:00
Eren Gölge
13482dde1f
Update GAN model
2022-02-25 11:26:59 +01:00
Eren Gölge
2829027d8b
Refactor VITS model
2022-02-25 11:26:59 +01:00
Eren Gölge
ef63c99524
Implement `start_by_longest` option for TTSDataset
2022-02-25 11:26:18 +01:00
Eren Gölge
c4c471d61d
Allow padding for shorter segments
2022-02-25 11:25:48 +01:00
Eren Gölge
47fbddc8d4
Fix docstring
2022-02-25 11:25:48 +01:00
Eren Gölge
bc2243bac4
Fix tests
2022-02-25 11:25:00 +01:00
Eren Gölge
146fbfd7c9
Extend unittests
2022-02-25 11:25:00 +01:00
Eren Gölge
2fe16de8e3
Make lint
2022-02-25 11:25:00 +01:00
Eren Gölge
7b49a4aa2b
Fix glow_tts_config missing field
2022-02-25 11:24:13 +01:00
Eren Gölge
07b0a80d57
Fix tokenizer init_from_config
2022-02-25 11:24:13 +01:00
Eren Gölge
50e17097a7
Add verbose option to AudioProcessor
2022-02-25 11:24:13 +01:00
Eren Gölge
235f7d9b02
Extend glow_tts model tests
2022-02-25 11:24:13 +01:00
Eren Gölge
8e248913d6
Update train_tts for the new API
2022-02-25 11:24:13 +01:00
Eren Gölge
001da8afc8
Update Vits for the new model API
2022-02-25 11:21:19 +01:00
Eren Gölge
5176ae9e53
Fixes small compat. issues
2022-02-25 11:21:19 +01:00
Eren Gölge
131bc0cfc0
Fix synthesis.py 🔧
2022-02-25 11:18:00 +01:00
Eren Gölge
c0746f23df
Fix `too many open files`
2022-02-25 11:16:30 +01:00
Eren Gölge
df0d58bf09
Update VCTK recipes
2022-02-25 11:16:30 +01:00
Eren Gölge
730f7c0df4
Add file_ext args to resample.py
2022-02-25 11:15:46 +01:00
Eren Gölge
28d98da422
Update VCTK formatter
2022-02-25 11:15:46 +01:00
Eren Gölge
4d99fee3e2
Update spec extractor
2022-02-25 11:12:44 +01:00
Eren Gölge
38a0b3b6c7
Update train_tts.py
2022-02-25 11:11:35 +01:00
Eren Gölge
cfaa51fddc
Update BaseTTS config
2022-02-25 11:11:35 +01:00
Eren Gölge
4c5cb44eeb
Update setup_model
2022-02-25 11:11:35 +01:00
Eren Gölge
7c4243fba7
Update GlowTTS
2022-02-25 11:11:35 +01:00
Eren Gölge
bacf79f4fb
Update AlignTTS
2022-02-25 11:11:35 +01:00
Eren Gölge
18f726af65
Update ForwardTTS
2022-02-25 11:11:35 +01:00
Eren Gölge
d0ec4b91e5
Update Tacotron models
2022-02-25 11:11:35 +01:00
Eren Gölge
ea965a5683
Update VITS for the new API
2022-02-25 11:11:35 +01:00
Eren Gölge
f802a931a3
Pass samples to init_from_config in SpeakerManager
2022-02-25 11:07:34 +01:00
Eren Gölge
bde68d9f25
Use the same phonemizer for `en` to `en-us`
2022-02-25 11:07:34 +01:00
Eren Gölge
8649d4fd36
Allow None pad and blank tokens
2022-02-25 11:07:34 +01:00
Eren Gölge
c9972e6f14
Make lint
2022-02-25 11:07:34 +01:00
Eren Gölge
30cfafce56
Add init_from_config
2022-02-25 11:05:54 +01:00
Eren Gölge
90cc45dd4e
Update data loader tests
2022-02-25 11:05:54 +01:00
Eren Gölge
93957d58a1
Refactoring VITS for the tokenizer API
2022-02-25 11:05:06 +01:00
Eren Gölge
04df0a3d9f
Refactor TTSDataset ⚡️
2022-02-25 11:05:06 +01:00
Eren Gölge
9bb347a52b
Update for tokenizer API
2022-02-25 11:05:06 +01:00
Eren Gölge
452dbc43d8
Update imports for symbols -> characters
2022-02-25 11:05:06 +01:00
Eren Gölge
8071fa0020
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
b6c2bfdf08
Refactor synthesis.py for TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
b2bb954a51
Refactor TTSDataset to use TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
84091096a6
Refactor Synthesizer class for TTSTokenizer
2022-02-25 11:05:06 +01:00
Eren Gölge
196ae74273
Update data loader tests
2022-02-25 11:05:06 +01:00
Eren Gölge
98057a00ae
Make style
2022-02-25 10:57:35 +01:00
Eren Gölge
7575367b9f
Refactoring VITS for the tokenizer API
2022-02-25 10:57:35 +01:00
Eren Gölge
4cd690e4c1
Updates BaseTTS and configs
2022-02-25 10:57:35 +01:00
Eren Gölge
176b712c1a
Refactor TTSDataset ⚡️
2022-02-25 10:57:35 +01:00
Eren Gölge
4597d4e5b6
Remove get_characters from BaseTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
1df1d6c4a9
Update for tokenizer API
2022-02-25 10:48:03 +01:00
Eren Gölge
2d8ce98d2a
Update imports for symbols -> characters
2022-02-25 10:48:03 +01:00
Eren Gölge
9a95e15483
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:48:03 +01:00
Eren Gölge
d0eb642d88
Refactor synthesis.py for TTSTokenizer
2022-02-25 10:48:03 +01:00
Eren Gölge
3476be30d7
Refactor Synthesizer class for TTSTokenizer
2022-02-25 10:48:03 +01:00
Eren Gölge
9397a56b13
Allow init_from_config from model or audio config
2022-02-25 10:48:03 +01:00
Eren Gölge
a71a013276
Fix the wrong default loss name for GAN models
2022-02-25 10:48:03 +01:00
Eren Gölge
04202da1ac
Make style
2022-02-25 10:48:03 +01:00
Eren Gölge
3b63d713b9
Fix espeak wrapper cmd call
2022-02-25 10:48:03 +01:00
Eren Gölge
4894998e6b
Fix print_logs
2022-02-25 10:48:03 +01:00
Eren Gölge
4e8f9d6f10
Fix IPAPhonemes init_from_config
2022-02-25 10:48:03 +01:00
Eren Gölge
0fe39166fe
Discard OOV chars in tokenizer
...
Discard but store OOV chars, with a warning message
when an OOV char is first encountered
2022-02-25 10:48:03 +01:00
Eren Gölge
c39aaafbfc
Update EspeakWrapper for espeak-ng
2022-02-25 10:48:03 +01:00
Eren Gölge
bb389479a4
Update setup_model for TTS.tts models
2022-02-25 10:48:03 +01:00
Eren Gölge
9b83e665fc
Add init_from_config as an abstract class
2022-02-25 10:48:03 +01:00
Eren Gölge
3eca5ad060
Update config fields for phonemizer
2022-02-25 10:48:03 +01:00
Eren Gölge
d2525abe8c
Remove get_characters from BaseTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
73d27ebd45
Fix GlowTTS
2022-02-25 10:48:03 +01:00
Eren Gölge
87bf940676
Print duplicate characters
2022-02-25 10:48:03 +01:00
Eren Gölge
3de9f38d16
Add init_from_config to SpeakerManager
2022-02-25 10:48:03 +01:00
Eren Gölge
d8ec7086b6
Update `synthesis` for the new API
2022-02-25 10:48:03 +01:00
Eren Gölge
4e83bf3968
Allow choosing phonemizer
2022-02-25 10:48:02 +01:00
Eren Gölge
22f0c58fe1
Print language codes
2022-02-25 10:48:02 +01:00
Eren Gölge
693fb4dd39
Modify init_from_config for IPAPhonemes
2022-02-25 10:48:02 +01:00
Eren Gölge
acc6eef625
Update for tokenizer API
2022-02-25 10:48:02 +01:00
Eren Gölge
e1b4c4ca43
Add init_from_config to GAN
2022-02-25 10:48:02 +01:00
Eren Gölge
353f913efc
Fix #985
2022-02-25 10:48:02 +01:00
Eren Gölge
ba3b60c90f
Test TTSTokenizer
2022-02-25 10:48:02 +01:00
Eren Gölge
79a84410f2
Test punctuations
2022-02-25 10:48:02 +01:00
Eren Gölge
d8bdeb8b8f
Fix Punctuation
2022-02-25 10:48:02 +01:00
Eren Gölge
ff7c385838
Fix BasePhonemizer
2022-02-25 10:48:02 +01:00
Eren Gölge
10d435ce77
Fixup
2022-02-25 10:48:02 +01:00
Eren Gölge
f0655bfffc
Fix ja_jp_phonemizer
2022-02-25 10:48:02 +01:00
Eren Gölge
20e5dd3678
Add doc examples
2022-02-25 10:48:02 +01:00
Eren Gölge
fbad17e084
Update imports for symbols -> characters
2022-02-25 10:48:02 +01:00
Eren Gölge
a1df4f9887
Test character classes
2022-02-25 10:45:24 +01:00
Eren Gölge
bd461ace33
Refactor GlowTTS model and recipe for TTSTokenizer
2022-02-25 10:45:24 +01:00
Eren Gölge
5a9653978a
Refactor synthesis.py for TTSTokenizer
2022-02-25 10:45:24 +01:00
Eren Gölge
e5785b34b0
Style fix
2022-02-25 10:27:46 +01:00
Eren Gölge
e4049aa31a
Refactor TTSDataset to use TTSTokenizer
2022-02-25 10:27:46 +01:00
Eren Gölge
2480bbe937
Remove OLD TOKENIZATION ROUTINES
2022-02-25 09:32:54 +01:00
Eren Gölge
53f696615b
Add init_from_config to AudioProcessor
2022-02-25 09:32:54 +01:00
Eren Gölge
3d86edfc81
Refactor Synthesizer class for TTSTokenizer
2022-02-25 09:32:54 +01:00
Eren Gölge
8d85af84cd
Implement Punctuation class
2022-02-25 09:32:54 +01:00
Eren Gölge
1aca58afaf
Fix imports in cleaners.py
2022-02-25 09:32:54 +01:00
Eren Gölge
0344645e90
Implement TTSTokenizer
2022-02-25 09:32:54 +01:00
Eren Gölge
2fb1f70503
Implement BaseCharacters, IPAPhonemes, Graphemes
2022-02-25 09:32:54 +01:00
Eren Gölge
1bee40af40
Create language folders under `TTS.tts.utils.text`
2022-02-25 09:32:54 +01:00
Eren Gölge
c1119bc291
Implement BasePhonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
dcd01356e0
Create `text/english` folder
2022-02-25 09:32:54 +01:00
Eren Gölge
80867c8e8c
Implement multi-phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
5e4f78add3
Implement espeak wrapper
2022-02-25 09:32:54 +01:00
Eren Gölge
e03a05c816
Implement gruut wrapper
2022-02-25 09:32:54 +01:00
Eren Gölge
172ba0c5e7
Implement JA_JP phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
ca02b82218
Implement ZH_CH phonemizer
2022-02-25 09:32:54 +01:00
Eren Gölge
a51b031bff
Merge branch 'dev' into dev-fix-glowtts-infer
2022-02-21 12:01:40 +03:00
Edresson Casanova
28a7464975
Fix the bug in split dataset function ( #1251 )
...
* Fix the bug in split_dataset
* Make eval_split_size configurable
* Change test_loader to use load_tts_samples function
* Change eval_split_portion to eval_split_size and allow setting the absolute number of samples in eval
* Fix samplers unit test
* Add data unit test on GitHub workflow
2022-02-21 11:59:36 +03:00
Edresson Casanova
bc5db13d06
Fix the bug in extract tts spectrogram script
2022-02-19 19:24:00 +00:00
Edresson Casanova
ba6e56e01c
Fix Glow-TTS multi-speaker inference
2022-02-18 19:25:29 +00:00
Eren Gölge
127118c637
Update TTS.tts formatters ( #1228 )
...
* Return Dict from tts formatters
* Make style
2022-02-11 23:03:43 +01:00
Eren Gölge
5e3f499a69
Fix #1187 ( #1227 )
2022-02-11 13:27:59 +01:00
Edresson Casanova
0860d73cf8
Remove TensorFlow requirement ( #1225 )
...
* Remove TF modules
* Remove TF unit tests
* Remove TF vocoder modules
* Remove TF convert scripts
* Remove TF requirement
* Remove the Docs TF instructions
* Remove TF inference support
2022-02-10 16:14:54 +01:00
Eren Gölge
44c7d1a826
Merge pull request #1054 from WeberJulian/partial_embedding_compute
...
Partial embedding compute
2022-02-06 20:13:55 +01:00