Jörg Thalheim
6646682650
cleaners: expand english time
2020-12-10 14:53:20 +01:00
Jörg Thalheim
76138687d3
expand more currencies
2020-12-10 14:53:20 +01:00
erogol
a2859b7ddc
update config args checks
2020-12-10 13:52:57 +01:00
erogol
788cd6f902
fix multi-speaker glow-tts inference
2020-12-10 02:05:48 +01:00
erogol
3d5066e2b8
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-10 00:31:03 +01:00
erogol
92cc9630d7
fix glow-tts synthesis for DPP
2020-12-10 00:30:34 +01:00
Eren Gölge
2473b2dc62
Merge pull request #559 from krzim/patch-1
...
Fix import to grab the encoder model save function
2020-12-10 00:19:32 +01:00
erogol
53679b706d
glow-tts distributed fix
2020-12-09 23:39:09 +01:00
erogol
62bc171db5
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-09 15:46:57 +01:00
erogol
df180148e9
use noise augmentation in TTSDataset
2020-12-09 15:46:25 +01:00
Thorsten Mueller
e39628ce2f
Limit filenames to 10 chars
2020-12-08 18:44:19 +01:00
erogol
06612ce305
test fixes
2020-12-07 15:57:34 +01:00
erogol
0252a07fa6
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-07 11:31:55 +01:00
erogol
482e725752
sync torch calls before logging training results
2020-12-07 11:30:19 +01:00
erogol
7505c0ba27
muliprocess phoneme computation
2020-12-07 11:29:41 +01:00
erogol
20c86489d7
make static methods for faster multiprocess call
2020-12-07 11:29:10 +01:00
erogol
affe1c1138
setup training scripts for computing phonemes before training optionally. And define data_loaders before starting training and re-use them instead of re-define for every train and eval calls. This is to enable better instance filtering based on input length.
2020-12-07 11:26:57 +01:00
Alexander Korolev
f42ca2b73f
Update wavegrad.py
...
This should fix the issue https://github.com/mozilla/TTS/issues/581
2020-12-04 16:43:39 +01:00
erogol
7c3cdced1a
make speaker_mapping a global variable to prevent reload. Fix glow-tts training
2020-12-01 03:23:25 +01:00
Thorsten Mueller
06a389bc08
Added option for saving raw spectograms
2020-11-27 15:49:55 +01:00
erogol
a757b203bc
fix longer phoneme seqs
2020-11-26 15:05:03 +01:00
erogol
7b0a93d2f8
fix
2020-11-26 11:44:52 +01:00
erogol
0c6f7e4c77
resample audio if flag set true
2020-11-26 11:30:48 +01:00
erogol
f6c96b0ac2
Merge branch 'dev'
2020-11-25 15:29:06 +01:00
erogol
e3b7157146
remove contextlib
2020-11-25 15:22:01 +01:00
erogol
e3eda159d1
wavegrad_dataset update
2020-11-25 14:50:50 +01:00
erogol
a1e4ee18f9
convert float16 to float32 for plotting spectrograms
2020-11-25 14:50:28 +01:00
erogol
7541d2ecaa
return eval split optional
2020-11-25 14:50:09 +01:00
erogol
4b92ac0f92
tune_wavegrad update
2020-11-25 14:49:48 +01:00
erogol
d8c1b5b73d
print max lengths in tacotron training
2020-11-25 14:49:07 +01:00
erogol
1229554c42
use native amp
2020-11-25 14:48:54 +01:00
erogol
8a820930c6
compute_embedding update
2020-11-25 14:46:08 +01:00
erogol
aa2b31a1b0
use 'enabled' argument to control autocast
2020-11-17 14:22:01 +01:00
erogol
d9d04d892b
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-11-17 14:17:24 +01:00
erogol
8b0e0846a3
temporary travis check
2020-11-17 14:17:03 +01:00
Qingping Hou
b0b97d636f
speed up metafile build for voxceleb
2020-11-14 23:45:17 -08:00
erogol
a2a142dc39
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-11-14 13:02:19 +01:00
erogol
c65712426a
change noise scheduling for wavegrad. Compute beta values externally to enable better flexibility
2020-11-14 13:01:10 +01:00
erogol
5a59467f34
scaler fix for wavegrad and wavernn. Save and load scaler
2020-11-14 13:00:35 +01:00
erogol
d8511efa8f
use native amp for tacotron training
2020-11-14 12:59:28 +01:00
Qingping Hou
0cc3650ef6
support loading config in yaml
2020-11-14 00:13:53 -08:00
erogol
6cc464ead6
fix ton of tesnting bugs
2020-11-12 16:33:29 +01:00
erogol
25551c4634
change wavernn generate to inference
2020-11-12 12:52:52 +01:00
erogol
9b0f441945
argument for returning no eval split
2020-11-12 12:52:27 +01:00
erogol
a7aefd5c50
use pytorch amp for mixed precision training for Tacotron
2020-11-12 12:51:56 +01:00
erogol
67e2b664e5
compute embeddings and create speakers.json
2020-11-12 12:51:17 +01:00
erogol
f8fd300b3e
bug fix
2020-11-10 12:53:39 +01:00
erogol
016d3503da
compute embeddings with speaker encoder
2020-11-10 12:51:02 +01:00
erogol
21364331d2
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-11-09 13:31:12 +01:00
erogol
c76a617072
linter updates
2020-11-09 13:18:35 +01:00
erogol
ea976b0543
python compat update for contextlib
2020-11-06 13:34:11 +01:00
erogol
c80225544e
tune wavegrad to fine the best noise schedule for inferece
2020-11-06 13:04:46 +01:00
erogol
d94782a076
reset the way ga_loss is stored in return_dict
2020-11-02 13:18:56 +01:00
erogol
a108d0ee81
check nan loss in glow-tts loss
2020-11-02 13:12:19 +01:00
erogol
b8ac9aba9d
check against NaN loss in tacotron_loss
2020-11-02 12:44:41 +01:00
erogol
ef04d7fae7
bug fix for wavernn training
2020-10-30 14:08:41 +01:00
erogol
a44ef58aea
wavegrad weight norm refactoring
2020-10-30 13:23:24 +01:00
erogol
183fe56d95
Merge branch 'ssim_loss' into dev
2020-10-29 23:49:09 +01:00
krzim
2202e171c5
Fix import to grab the encoder model save function
...
I saw that this was recently changed but I'm not sure if it should have been. This is the correct function given the arguments provided to it in the train loop.
2020-10-29 18:03:11 -04:00
erogol
73581cd94c
renaming train scripts and updating tests
2020-10-29 16:50:07 +01:00
erogol
39c71ee8a9
wavegrad refactoring, fixing tests for glow-tts and wavegrad
2020-10-29 15:47:15 +01:00
erogol
946a0c0fb9
bug fixes for single speaker glow-tts, enable torch based amp. Make amp optional for wavegrad. Bug fixes for synthesis setup for glow-tts
2020-10-29 15:45:50 +01:00
erogol
14c2381207
weight norm and torch based amp training for wavegrad
2020-10-29 12:31:43 +01:00
erogol
b76a0be97a
wavegrad model and layers refactoring
2020-10-29 12:31:43 +01:00
erogol
dc2825dfb2
wavegrad dataset update
2020-10-29 12:31:43 +01:00
erogol
5b5b9fcfdd
wavegrad config updates
2020-10-29 12:31:43 +01:00
erogol
c8a4c771a8
train wavegrad updates
2020-10-29 12:31:43 +01:00
erogol
670f44aa18
enable compute stats by vocoder config
2020-10-29 12:31:43 +01:00
erogol
f79bbbbd00
use Adam for wavegras instead of RAdam
2020-10-29 12:31:43 +01:00
erogol
7bcdb7ac35
wavegrad updates
2020-10-29 12:31:43 +01:00
erogol
a1582a0e12
fix distributed training for train_* scripts
2020-10-29 12:31:43 +01:00
erogol
193b81b273
add universal_fullband_melgan config
2020-10-29 12:30:37 +01:00
erogol
e02cd6a220
initial wavegrad layers model and trainig script
2020-10-29 12:30:37 +01:00
erogol
ac57eea928
add wavegrad to vocoder generators
2020-10-29 12:30:37 +01:00
erogol
e723b99888
handle distributed model as saving
2020-10-29 12:30:37 +01:00
Eren Gölge
26c18b61c9
Merge pull request #553 from Edresson/dev
...
bug fix in the inference with GlowTTS
2020-10-28 18:49:31 +01:00
erogol
fdaed45f58
optional loss masking for stoptoken predictor
2020-10-28 18:40:54 +01:00
erogol
e49cc3bbcd
bug fix
2020-10-28 18:34:34 +01:00
erogol
59e1cf99d0
config update and ssim implementation
2020-10-28 18:30:00 +01:00
erogol
9cef923d99
ssim loss for tacotron models
2020-10-28 15:24:18 +01:00
erogol
9d0ae2bfb4
wavernn dataloader handling for short samples and mixed precision training
2020-10-28 12:31:01 +01:00
Edresson
f01502a9db
bug fix in glowTTS sythesize
2020-10-27 16:30:16 -03:00
Eren Gölge
f4b8170bd1
Merge pull request #545 from Edresson/dev
...
GlowTTS zeroshot TTS support
2020-10-27 15:23:41 +01:00
erogol
a6f564c8c8
pylint fixes
2020-10-27 12:35:10 +01:00
erogol
0becef4b58
small updates
2020-10-27 12:17:38 +01:00
sanjaesc
2ee47e9568
fix pylint once again
2020-10-27 12:17:38 +01:00
sanjaesc
1e646135ca
add model params to config
2020-10-27 12:17:38 +01:00
sanjaesc
bef3f2020b
compute audio feat on dataload
2020-10-27 12:17:38 +01:00
sanjaesc
7c72562fe7
fix travis + pylint tests
2020-10-27 12:17:38 +01:00
sanjaesc
91e5f8b63d
added to device cpu/gpu + formatting
2020-10-27 12:17:38 +01:00
sanjaesc
016a77fcf2
fix formatting + pylint
2020-10-27 12:17:38 +01:00
erogol
8de7c13708
fix no loss masking loss computation
2020-10-27 12:17:38 +01:00
sanjaesc
e8294cb9db
fixing pylint errors
2020-10-27 12:17:38 +01:00
sanjaesc
878b7c373e
added feature preprocessing if not set in config
2020-10-27 12:17:38 +01:00
sanjaesc
e495e03ea1
some minor changes to wavernn
2020-10-27 12:17:38 +01:00
Alex K
9c3c7ce2f8
wavernn stuff...
2020-10-27 12:17:38 +01:00
Alex K
6378fa2b07
add initial wavernn support
2020-10-27 12:17:38 +01:00
Edresson
89e9bfe3a2
add text processing blank token test
2020-10-26 17:41:23 -03:00
Edresson
d9540a5857
add blank token in sequence for encrease glowtts results
2020-10-25 15:08:28 -03:00
Edresson
fbea058c59
add parse speakers function
2020-10-24 16:10:05 -03:00
Edresson
07345099ee
GlowTTS zero-shot TTS Support
2020-10-24 15:58:39 -03:00
Alexander Korolev
47d74ced1c
Update losses.py
...
Seems like in the latest dev merge, this change was reverted. Any specific reason for this?
Without it the problem as stated here https://github.com/mozilla/TTS/issues/473 occurs.
2020-10-23 14:15:01 +02:00
ayush-1506
2a3559f02b
Fix readme and config file
2020-10-21 13:43:49 +05:30
Edresson
b7f9ebd32b
add check arguments for GlowTTS and multispeaker training bug fix
2020-10-19 17:17:58 -03:00
erogol
c2c4126a18
remove merge conflicts
2020-10-08 01:35:27 +02:00
erogol
c5074cfd8e
general purpose distribute.py
2020-10-08 01:30:42 +02:00
erogol
6f0654f9a8
differential spectral loss
2020-10-08 01:30:42 +02:00
erogol
e0d4b88877
config update
2020-10-08 01:29:30 +02:00
erogol
4e93f90108
bug fix
2020-10-08 01:29:30 +02:00
erogol
bb9b70ee27
differential spectral loss and loss weight settings
2020-10-08 01:29:30 +02:00
erogol
e1eab1ce4b
print model r value as loading it
2020-10-07 13:34:21 +02:00
erogol
48a40c4730
remove unused import
2020-10-06 11:32:24 +02:00
erogol
a2606fbc22
format utils
2020-10-06 11:02:54 +02:00
Eren Gölge
4873601694
Merge pull request #531 from WeberJulian/french-cleaners
...
Adding support for french cleaners
2020-09-30 15:30:50 +02:00
Edresson
99d5a0ac07
add Speaker Conditional GST support
2020-09-29 16:09:27 -03:00
Julian WEBER
ea7c2e15c0
Adding french abbreviations
2020-09-29 15:43:39 +02:00
Julian WEBER
54b4031391
Merge remote-tracking branch 'origin/dev' into french-cleaners
2020-09-29 14:24:51 +02:00
Julian WEBER
da134eeee4
Subjective improvements
2020-09-29 14:20:52 +02:00
Julian WEBER
b2817e9e93
Adding french cleaners
2020-09-29 14:20:24 +02:00
Eren Gölge
cf02ace5b7
Merge pull request #530 from mueller91/fix_split_dataset
...
fix: split_dataset
2020-09-28 12:42:40 +02:00
erogol
154f90bc44
format speaker encoder imports
2020-09-28 11:19:19 +02:00
erogol
e097bc6c5d
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-09-28 11:15:32 +02:00
Eren Gölge
8e2dc79c3a
Merge pull request #526 from mueller91/dev
...
Fix: Check storage params only for speaker encoder
2020-09-28 11:15:23 +02:00
erogol
6a70c63f24
correct glow-tts loss
2020-09-27 03:28:42 +02:00
erogol
665f7ca714
linter fix
2020-09-24 12:57:54 +02:00
mueller91
227b9c8864
fix: split_dataset() runtime reduced from O(N * |items|) to O(N) where N is the size of the eval split (max 500)
...
I notice a significant speedup on the initial loading of large datasets such as common voice (from minutes to seconds)
2020-09-23 23:27:51 +02:00
mueller91
cfeeef7a7f
fix: broken imports and missing files after merging in latest commits from mozilla/dev into mueller91/dev.
...
speaker_encoder's config.json and visuals.py are missing in the current dev branch of MozillaTTS, and some imports are broken.
2020-09-22 20:10:41 +02:00
mueller91
1fe5eb054f
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
...
Conflicts:
TTS/bin/train_encoder.py
requirements.txt
2020-09-22 19:58:53 +02:00
mueller91
df4caec4b7
add: check_config for speaker_encoder
2020-09-22 19:52:09 +02:00
WeberJulian
3c212be5a8
fix: fixing the RenamingUnpickler fix
2020-09-22 17:36:05 +02:00
mueller91
0ea7f4e2bd
fix: make speaker encoder's storage parameters non-restriced
2020-09-22 10:39:40 +02:00
mueller91
7029452228
fix: make speaker encoder's storage parameters non-restriced
2020-09-22 10:31:42 +02:00
erogol
10258724d1
linter fixes
2020-09-22 03:54:16 +02:00
erogol
a6df617eb1
Merge branch 'glow-tts-amp-time_depth_conv' into dev
2020-09-21 14:23:45 +02:00
erogol
8150d5727e
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-09-21 14:21:55 +02:00
erogol
e0b9fa887f
glow-tts modules added
2020-09-21 14:15:40 +02:00
erogol
e4c6386603
change import for normalization layer
2020-09-21 13:09:52 +02:00
mueller91
9b4aac94a8
fix: linter issues
2020-09-21 12:13:02 +02:00
erogol
c008003506
do not check sample rate as loading stats file for normalization to enable interpolation for different sample rate vocoder
2020-09-18 12:52:19 +02:00
mueller
6b0621c794
cleanup
2020-09-17 16:46:43 +02:00
mueller
a273b1a210
add: add random noise to dataset
2020-09-17 14:23:40 +02:00
mueller
e36a3067e4
add: save wavs instead feats to storage.
...
This is done in order to mitigate staleness when caching and loading from data storage
2020-09-17 14:14:30 +02:00
mueller
1511076fde
add: Configurable encoder dataset storage to reduce disk I/O
...
add: Averaged time for data loader to console and Tensorboard output
2020-09-17 12:29:38 +02:00
erogol
3660c57f1e
time seperable convolution encoder, huber loss for duration predictor
2020-09-17 03:10:58 +02:00
mueller
95d2906307
add: Mozilla Commonvoice, VoxCeleb1+2, LibriTTS to Speaker Encoder Training
2020-09-16 16:49:53 +02:00
mueller
c909ca3855
Improve runtime of __parse_items() from O(|speakers|*|items|) to O(|items|)
2020-09-16 15:55:55 +02:00
mueller
d733b90255
Improve runtime of __parse_items() from O(|speakers|*|items|) to O(|items|)
2020-09-16 15:09:02 +02:00
maxbachmann
60ce862113
use difflib for string matching
2020-09-14 23:55:34 +02:00
erogol
f1a75468c2
fix arguments
2020-09-12 04:00:25 +02:00
erogol
7c2c4d6f27
pass x_mask to layer norm
2020-09-12 03:41:37 +02:00