Commit Graph

4424 Commits

Author SHA1 Message Date
Julian Weber bb59718c03
Add capacitron v2 model (#1768)
* Add capacitron v2 in .models.json

* Put right commit hash
2022-09-08 09:43:56 +02:00
Edresson Casanova 096b35f639
Add VCTK speaker encoder recipe (#1912) 2022-08-26 16:19:03 +02:00
Eren Gölge e5430a6519
Add new DE Thorsten models (#1898)
- Tacotron2-DDC
- HifiGAN vocoder
2022-08-22 11:27:39 +02:00
Eren G??lge 8845f06fd9 Bump up to v0.8.0 2022-08-22 11:26:47 +02:00
Stanislav Kachnov 2c9f00a808
Fix tune wavegrad (#1844)
* fix imports in tune_wavegrad

* load_config returns Coqpit object instead None

* set action (store true) for flag "--use_cuda"; start to tune if module is running as the main program

* fix var order in the result of batch collating

* make style

* make style with black and isort
2022-08-22 09:55:32 +02:00
Eren Gölge fcb0bb58ae
Handle when no batch sampler (#1882) 2022-08-18 11:26:04 +02:00
Eren Gölge 7442bcefa5
Remove deprecated files (#1873)
- samplers.py is moved
- distribute.py is replaces by the 👟Trainer
2022-08-15 12:16:37 +02:00
Eren Gölge 4333492341
Fix BCE loss issue (#1872)
* Fix BCE loss issue

* Remove import
2022-08-15 11:27:21 +02:00
jchai.me c30b6485ea
updates to dataset analysis notebooks for compatibility with latest version of TTS (#1853) 2022-08-15 11:11:07 +02:00
manmay nakhashi e4db7c51b5
Update capacitron_layers.py (#1664)
crashing because of dimension miss match   at line no. 57
[batch, 256] vs [batch , 1, 512]
enc_out = torch.cat([enc_out, speaker_embedding], dim=-1)
2022-08-15 11:08:50 +02:00
Eren Gölge bfc63829ac
Implement bucketed weighted sampling for VITS (#1871) 2022-08-15 11:08:11 +02:00
Eren Gölge d46fbc240c
Introduce numpy and torch transforms (#1705)
* Refactor audio processing functions

* Add tests for numpy transforms

* Fix imports

* Fix imports2
2022-08-08 11:57:50 +02:00
manmay nakhashi 7fd9b89ebf
fix get_random_embeddings --> get_random_embedding (#1726)
* fix get_random_embeddings --> get_random_embedding

function typo leads to training crash, no such function

* fix typo

get_random_embedding
2022-08-07 14:06:03 +02:00
rbaraglia 75ac9e3f0c
Fix language flags generated by espeak-ng phonemizer (#1801)
* fix language flags generated by espeak-ng phonemizer

* Style

* Updated language flag regex to consider all language codes alike
2022-08-07 13:57:40 +02:00
Lars Kiesow 8c645080ac
Adjust default to be able to process longer sentences (#1835)
Running `tts --text "$text" --out_path …` with a somewhat longer
sentences in the text will lead to warnings like “Decoder stopped with
max_decoder_steps 500” and the sentences just being cut off in the
resulting WAV file.

This happens quite frequently when feeding longer texts (e.g. a blog
post) to `tts`. It's particular frustrating since the error is not
always obvious in the output. You have to notice that there are missing
parts. This is something other users seem to have run into as well [1].

This patch simply increases the maximum number of steps allowed for the
tacotron decoder to fix this issue, resulting in a smoother default
behavior.

[1] https://github.com/mozilla/TTS/issues/734
2022-08-07 13:51:29 +02:00
p0p4k 903a77c197
Update wavenet.py (#1796)
* Update wavenet.py

Current version does not use "in_channels" argument. 
In glowTTS, we use normalizing flows and so "input dim" == "ouput dim" (channels and length). So, the existing code just uses hidden_channel sized tensor as input to first layer as well as outputs hidden_channel sized tensor. 
However, since it is a generic implementation, I believe it is better to update it for a more general use.

* "in_channels -> hidden_channels"
2022-08-01 12:20:37 +02:00
p0p4k 4fe50801b5
Update README.md; download progress bar in CLI. (#1797)
* Update README.md

- minor PR
- added model_info usage guide based on #1623 in README.md .

* "added tqdm bar for model download"

* Update manage.py

* fixed style

* fixed style

* sort imports
2022-08-01 12:17:47 +02:00
p0p4k d9bad91a66
Update requirements.txt; inflect==5.6 (#1809)
New inflect version (6.0) depends on pydantic which has some issues irrelevant to 🐸 TTS. #1808 
Force inflect==5.6 (pydantic free) install to solve dependency issue.
2022-08-01 11:48:02 +02:00
Eren G??lge 7d8b1665c8 Fix rand_segment edge case (input_len == seg_len - 1) 2022-08-01 11:37:45 +02:00
vanIvan 5094499eba
Fix & update WaveRNN vocoder model (#1749)
* Fixes KeyError bug. Adding logging to dashboard.

* Make pep8 compliant

* Make style compliant

* Still fixing style
2022-07-26 15:05:11 +02:00
Yuri Pourre 1a065fa6ed
Update README.md (#1776)
Fix typo in different and code sample
2022-07-26 13:28:21 +02:00
p0p4k 669966d963
Update requirements.txt (#1791)
Support for #1775
2022-07-26 13:06:40 +02:00
p0p4k 10195c4eba
Update decoder.py (#1792)
Minor comment correction.
2022-07-26 13:06:06 +02:00
Tsai Meng-Ting 9d32cbc3db
Fix type in download_vctk.sh (#1739)
typo in comment
2022-07-20 12:27:42 +02:00
ivan provalov 903d9c791a
Fix for FloorDiv Function Warning (#1760)
* Fix for Floor Function Warning

Fix for Floor Function Warning

* Adding double quotes to fix formatting

Adding double quotes to fix formatting

* Update glow_tts.py

* Update glow_tts.py
2022-07-20 11:31:22 +02:00
WeberJulian 4f31402227
Fix aux tests (#1753)
* Set n_jobs to 1 for resample script

* Delete resample test

* Set n_jobs 1 in vad test

* delete vad test

* Revert "Delete resample test"

This reverts commit bb7c8466af.

* Remove tests with resample
2022-07-19 10:06:31 +02:00
Eren Gölge f7587fc134
Fix SSIM loss correction 2022-07-13 10:47:12 +02:00
Eren Gölge bc1f93c299
Fix device allocation 2022-07-12 19:05:25 +02:00
Eren Gölge 49bac724c0
Implement VitsAudioConfig (#1556)
* Implement VitsAudioConfig

* Update VITS LJSpeech recipe

* Update VITS VCTK recipe

* Make style

* Add missing decorator

* Add missing param

* Make style

* Update recipes

* Fix test

* Bug fix

* Exclude tests folder

* Make linter

* Make style
2022-07-12 18:49:58 +02:00
a-froghyar 34b80e0280
feat: updated recipes and lr fix (#1718)
- updated the recipes activating more losses for more stable training
- re-enabling guided attention loss
- fixed a bug about not the correct lr fetched for logging
2022-07-12 15:00:53 +02:00
Eren G??lge 48a4f3647f Make lint 2022-07-12 14:58:26 +02:00
WeberJulian c614f21982
Add durations as aux input for VITS (#1694)
* Add durations as aux input for VITS

* Make style

* Fix tts_tests

* Fix test_get_aux_input
2022-07-12 14:25:21 +02:00
Eren G??lge 2cf89b88c9 Make style 2022-07-12 14:12:57 +02:00
Eren G??lge a6f73a18cb Fix BCELoss adressing #1192 2022-07-12 14:11:34 +02:00
Eren G??lge eefd482f51 Separate loss tests 2022-07-12 12:35:46 +02:00
Eren G??lge c17ff17a18 Fix SSIM loss 2022-07-12 12:35:24 +02:00
Eren G??lge f1e35596e8 Remove redundant config field 2022-07-11 13:39:41 +02:00
WeberJulian 5cef6facb0
Fix tokenizer for punc only (#1717) 2022-07-06 22:59:41 +02:00
WeberJulian 9e00e31e37
Fix Publish CI (#1597)
* Try out manylinux

* temporary removal of useless pipeline

* remove check and use only manylinux

* Try --plat-name

* Add install requirements

* Add back other actions

* Add PR trigger

* Remove conditions

* Fix sythax

* Roll back some changes

* Add other python versions

* Add test pypi upload

* Add username

* Add back __token__ as username

* Modify name of entry to testpypi

* Set it to release only

* Fix version checking
2022-07-05 11:07:33 +02:00
camillem 5c821d9fa1
Fix the --model_name and --vocoder_name arguments need a <model_type> element (#1469)
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-06-27 10:32:43 +02:00
manmay nakhashi 577ec406f4
Fix checkpointing GAN models (#1641)
* checkpoint sae step crash fix

* checkpoint save step crash fix

* Update gan.py

updated requested changes

* crash fix
2022-06-22 12:07:46 +02:00
Eren G??lge 00e67092d8 Bump up to v0.7.1 2022-06-21 14:12:55 +02:00
Eren G??lge 3328be7a8e Remove GL message 2022-06-21 12:39:31 +02:00
WeberJulian 30c72e0d05
Add Thorsten VITS model (#1675)
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2022-06-21 11:39:49 +02:00
p0p4k 71281ff1e4
Add support for model_info in CLI (#1623)
* model_info

* model_info

* model_info_by_idx and name

* model_info_by_idx and name

* model_info

* Update manage.py

* fixed linter

* fixed linter

* fixed linter

* fixed linter

* fixed return style checks

* fixed linter

* fixed linter

* fixed idx always positive

* added comments

* fix parser.args check

* fix parser.args check

* Make style

Co-authored-by: Eren G??lge <egolge@coqui.ai>
2022-06-20 23:28:17 +02:00
Eren G??lge 8b75e8be9c Bump up to v0.7.0 2022-06-20 13:50:09 +02:00
WeberJulian 6126c23498
Add synpaflex formatter (#1616)
* Add synpaflex formatter

* Fix formatter

* Make style
2022-06-20 13:36:26 +02:00
klotlabs c44e39d9d6
Update training_a_model.md (#1620)
edited `...servers our needs.` to `...serves your needs.`
2022-06-08 23:40:05 +02:00
WeberJulian f09ea11c71
Internal formatter (#1629)
* Add coqui formatter

* Make style
2022-06-08 14:31:03 +02:00
Aya-AlJafari 68cef28a88
Adding TTS Tutorials (#1584)
* Adding inferencing notebook

* added multispeaker explanation and usecase and renamed the file

* Adding training tutorial

* fixed dummy paths

* fixed review comments

* fixed metadata extension

Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-06-02 12:23:00 +02:00