Eren Gölge
|
608f437545
|
Add a function to find unique chars
|
2021-11-01 16:41:33 +01:00 |
Eren Gölge
|
035ed432bc
|
Doc update (#889)
* Link source files from the docs
* Update glowTTS recipes for docs
* Add dataset downloaders
|
2021-10-26 17:41:33 +02:00 |
Eren Gölge
|
0cac3f330a
|
Enable custom formatter in load_tts_samples
|
2021-10-26 13:07:11 +02:00 |
Eren Gölge
|
82fed4add2
|
Make style
|
2021-10-21 16:05:51 +00:00 |
Eren Gölge
|
a0a5d580e9
|
Approximate audio length from file size
|
2021-10-18 08:54:02 +00:00 |
Eren Gölge
|
043dca61b4
|
Rename `load_meta_data` as `load_tts_data`
|
2021-09-30 14:47:56 +00:00 |
Eren Gölge
|
9f23ad6a0f
|
Fix imports
|
2021-09-30 14:47:56 +00:00 |
Eren Gölge
|
8ada870a57
|
Refactor `trainer.py` for v2
|
2021-09-30 14:16:34 +00:00 |
Eren Gölge
|
76c4929ab2
|
Fix attn mask reading bug
|
2021-09-06 15:16:58 +00:00 |
Eren Gölge
|
91a70e80b2
|
Refactor TTSDataset
Return a dict by `collate`
Refactor batch handling in `collate`
A couple of bug fixes
|
2021-09-06 15:16:58 +00:00 |
Eren Gölge
|
648655fa03
|
Add `PitchExtractor` and return dict by `collate`
|
2021-09-06 15:16:58 +00:00 |
Eren Gölge
|
545a00fc04
|
Use absolute paths of the attention masks
|
2021-09-06 15:16:58 +00:00 |
Eren Gölge
|
e802b24ad0
|
Compute mean and std pitch
|
2021-09-06 15:16:58 +00:00 |
Eren Gölge
|
8fffd4e813
|
Don't print computed phonemes
It causes noise in logs
|
2021-09-06 15:16:58 +00:00 |
Eren Gölge
|
d085642ac1
|
Cache pitch features
Cache the features at the beginning of `BaseTTS` training.
|
2021-09-06 15:16:58 +00:00 |
Eren Gölge
|
fba257104d
|
Compute F0 using librosa
|
2021-09-06 15:16:58 +00:00 |
Eren Gölge
|
18da8f5dbd
|
Update pylint 2.10.2 and fix lint issues
|
2021-08-30 08:10:35 +00:00 |
Eren Gölge
|
f186856e5d
|
Add option to sort input sequence by audio len
|
2021-08-30 08:10:35 +00:00 |
Eren Gölge
|
c312acac7d
|
Implement VITS model 🚀
VITS model implementation built on Glow TTS and HiFiGAN
layers.
|
2021-08-09 18:02:36 +00:00 |
Eren Gölge
|
003e5579e8
|
Enable `custom_symbols` in text processing
Models can define their own custom symbols lists with custom
`make_symbols()`
|
2021-08-09 18:02:36 +00:00 |
Eren Gölge
|
4b7b88dd3d
|
Add fullband-melgan DE vocoder
|
2021-07-26 15:38:30 +02:00 |
Edresson
|
b1620d1f3f
|
remove ignore generate eval flag
|
2021-07-15 03:34:28 -03:00 |
Edresson
|
2e5baffa9c
|
Merge fix and eval split as argparse
|
2021-07-13 01:47:32 -03:00 |
Eren Gölge
|
932ab107ae
|
Docstring edit in `TTSDataset.py` ✍️
|
2021-06-28 17:03:47 +02:00 |
Eren Gölge
|
8c74f054f0
|
Enable support for 🐍 python 3.10
Bump up versions numpy 1.19.5 and TF 2.5.0
|
2021-06-28 17:03:47 +02:00 |
Eren Gölge
|
fdfb18d230
|
downsize melgan test model size
|
2021-06-28 17:03:19 +02:00 |
Eren Gölge
|
419735f440
|
refactor and fix multi-speaker training in Trainer and Tacotron models
|
2021-06-28 17:03:19 +02:00 |
Eren Gölge
|
802d461389
|
Compute d_vectors and speaker_ids separately in TTSDataset
|
2021-06-28 17:03:19 +02:00 |
Eren Gölge
|
9042ae9195
|
use `to_cuda()` for moving data in `format_batch()`
|
2021-06-28 17:03:19 +02:00 |
Eren Gölge
|
d96ebcd6d3
|
make style
|
2021-06-28 17:03:19 +02:00 |
Eren Gölge
|
b500338faa
|
make style
|
2021-06-28 17:03:19 +02:00 |
Eren Gölge
|
a20a1c7d06
|
rename preprocess.py -> formatters.py
|
2021-06-28 17:03:19 +02:00 |
Eren Gölge
|
b9bccbb243
|
move load_meta_data and related functions to `datasets/__init__.py`
|
2021-06-28 17:03:19 +02:00 |
Eren Gölge
|
42554cc711
|
rename MyDataset -> TTSDataset
|
2021-06-28 17:03:19 +02:00 |
Edresson
|
28bec238ca
|
fix Lint checks
|
2021-06-18 14:33:50 -03:00 |
Edresson
|
83644056e3
|
fix Lint checks
|
2021-06-18 14:32:28 -03:00 |
Edresson Casanova
|
e78e3cd81e
|
Merge branch 'dev' into dev
|
2021-06-18 14:10:03 -03:00 |
Edresson
|
b74b510d3c
|
Compute embeddings and find characters using config file
|
2021-06-18 14:04:49 -03:00 |
Eren Gölge
|
49c5e5d820
|
make style japanese PR
|
2021-06-02 11:44:46 +02:00 |
Katsuya Iida
|
0536aa6d0f
|
Japanese Tacotron 2 model
|
2021-05-22 17:12:19 +09:00 |
Eren Gölge
|
8a7c40736c
|
set use_phonemes false
|
2021-05-19 01:27:26 +02:00 |
Eren Gölge
|
8b1014d188
|
add docstrings with default value fixes
|
2021-05-15 23:45:10 +02:00 |
Eren Gölge
|
93a00373f6
|
move split_dataset
|
2021-05-11 11:29:17 +02:00 |
Eren Gölge
|
79d7215142
|
config refactor #5 WIP
|
2021-05-11 11:29:17 +02:00 |
Eren Gölge
|
e5b9607bc3
|
isort all imports
|
2021-04-09 00:45:20 +02:00 |
Eren Gölge
|
0e79fa86ad
|
format with black and pylint 2.7.3
|
2021-04-09 00:38:08 +02:00 |
Eren Gölge
|
e84f120a04
|
sam-accenture model preprocessor
|
2021-04-01 03:41:41 +02:00 |
Eren Gölge
|
1c1949d348
|
utf-8 encoding for certain preprocessors
|
2021-03-30 14:39:16 +02:00 |
Eren Gölge
|
f3e5ddfaaf
|
bug fix in preprocessor
|
2021-03-18 13:33:23 +01:00 |
Eren Gölge
|
e15734c3fc
|
linter fix
|
2021-03-08 05:29:43 +01:00 |
Eren Gölge
|
9a48ba3821
|
a ton of linter updates
|
2021-03-08 05:06:54 +01:00 |
kirianguiller
|
9ab07f94e2
|
modify according to PR reviews
|
2021-03-08 02:59:48 +01:00 |
kirianguiller
|
42ba30eb8f
|
<add> Chinese mandarin implementation (tacotron2)
|
2021-03-08 02:59:24 +01:00 |
kirianguiller
|
0d4525322c
|
modify according to PR reviews
|
2021-03-08 02:57:11 +01:00 |
kirianguiller
|
e6fd118cf8
|
<add> Chinese mandarin implementation (tacotron2)
|
2021-03-08 02:57:11 +01:00 |
Eren Gölge
|
2ca74b8ab3
|
add RUSLAN dataset preprocessor
|
2021-03-08 02:54:47 +01:00 |
Eren Gölge
|
f9fe167537
|
docstring update
|
2021-03-08 02:54:47 +01:00 |
Eren Gölge
|
29d928d531
|
css10 dataset preprocessor
|
2021-03-08 02:54:47 +01:00 |
Eren Gölge
|
08581deb61
|
linter updates
|
2021-03-08 02:53:02 +01:00 |
erogol
|
27a75de15f
|
update processors for loading attention maps
|
2021-01-06 13:19:40 +01:00 |
erogol
|
df180148e9
|
use noise augmentation in TTSDataset
|
2020-12-09 15:46:25 +01:00 |
erogol
|
7505c0ba27
|
multiprocess phoneme computation
|
2020-12-07 11:29:41 +01:00 |
erogol
|
20c86489d7
|
make static methods for faster multiprocess call
|
2020-12-07 11:29:10 +01:00 |
erogol
|
affe1c1138
|
set up training scripts to optionally compute phonemes before training. Define data_loaders before training starts and reuse them instead of redefining them for every train and eval call. This enables better instance filtering based on input length.
|
2020-12-07 11:26:57 +01:00 |
erogol
|
a757b203bc
|
fix longer phoneme seqs
|
2020-11-26 15:05:03 +01:00 |
erogol
|
7541d2ecaa
|
return eval split optional
|
2020-11-25 14:50:09 +01:00 |
Qingping Hou
|
b0b97d636f
|
speed up metafile build for voxceleb
|
2020-11-14 23:45:17 -08:00 |
erogol
|
9b0f441945
|
argument for returning no eval split
|
2020-11-12 12:52:27 +01:00 |
Edresson
|
d9540a5857
|
add blank token in sequence to improve glowtts results
|
2020-10-25 15:08:28 -03:00 |
erogol
|
10258724d1
|
linter fixes
|
2020-09-22 03:54:16 +02:00 |
erogol
|
a6df617eb1
|
Merge branch 'glow-tts-amp-time_depth_conv' into dev
|
2020-09-21 14:23:45 +02:00 |
mueller91
|
9b4aac94a8
|
fix: linter issues
|
2020-09-21 12:13:02 +02:00 |
mueller
|
e36a3067e4
|
add: save wavs instead of feats to storage.
This mitigates staleness when caching and loading from data storage
|
2020-09-17 14:14:30 +02:00 |
mueller
|
1511076fde
|
add: Configurable encoder dataset storage to reduce disk I/O
add: Averaged time for data loader to console and Tensorboard output
|
2020-09-17 12:29:38 +02:00 |
mueller
|
95d2906307
|
add: Mozilla Commonvoice, VoxCeleb1+2, LibriTTS to Speaker Encoder Training
|
2020-09-16 16:49:53 +02:00 |
mueller
|
c909ca3855
|
Improve runtime of __parse_items() from O(|speakers|*|items|) to O(|items|)
|
2020-09-16 15:55:55 +02:00 |
erogol
|
89d15bf118
|
merge glow-tts after rebranding
|
2020-09-11 19:01:37 +02:00 |
erogol
|
df19428ec6
|
rename the project to old TTS
|
2020-09-09 12:27:23 +02:00 |