Commit Graph

320 Commits

Author SHA1 Message Date
Eren Golge 015f7780f4 Decoder shape comments for Tacotron2, decoupled grad clip for stopnet and the rest of the network. Some variable renaming and bug fix for alignment score logging 2019-11-12 11:20:53 +01:00
Eren Golge adf9ebd629 Graves attention and setting attn type by config.json 2019-11-12 11:18:57 +01:00
Eren Golge 0e0d0345cd call truncated inference 2019-10-29 17:41:07 +01:00
Eren Golge 60b6ec18fe bug fix for synthesis.py 2019-10-29 17:38:59 +01:00
Eren Golge e83a4b07d2 commention model outputs for tacotron, align outputs shapes of tacotron and tracotron2, merge bidirectional decoder 2019-10-28 14:51:19 +01:00
Eren Golge 2dcdc14ea6 UPDATE TRIM SILENCE 2019-10-12 18:34:28 +02:00
Eren Golge 0849e3c42f sound normalization while reading, adapting get_Speaker for multiple datasets 2019-10-04 18:20:30 +02:00
Eren Golge 8dec2a9e95 fix memory leak duee to diagonal alingmnet score 2019-10-02 00:30:25 +02:00
Eren Golge acbafb456b Weighting positive values for stopnet loss, change adam_weight_decay name 2019-09-28 15:44:17 +02:00
Eren Golge 99d7f2a666 update set_weight_decay 2019-09-28 15:31:18 +02:00
Eren Golge 8565c508e4 remove debug line 2019-09-28 01:11:04 +02:00
Eren Golge b76aaf8ad4 skip weight decay for BN and biases, some formatting 2019-09-28 01:09:28 +02:00
Eren Golge 5b6b1f354d add use_gst to enable global style token 2019-09-24 16:24:58 +02:00
Eren Golge d45d963dc1 linter fix 2019-09-11 10:39:59 +02:00
Eren Golge 609d8efa69 compute alignment diagonality score and encapsulate stats averaging with a class in traning 2019-09-11 10:32:07 +02:00
Eren Golge d1828c9573 fix server tests and pylint 2019-09-10 12:09:58 +02:00
Eren Golge 0bb8d780e8 visual.py update 2019-09-05 16:48:36 +02:00
Eren Golge dc69074a56 add RADAM reference 2019-08-30 10:33:46 +02:00
Eren Golge 529348d6dc lint fixes 2019-08-30 10:29:22 +02:00
Eren Golge afdc4bad10 Merge branch 'dev-radam' into dev 2019-08-30 10:15:54 +02:00
Reuben Morais 28644a717e Fix tests 2019-08-29 12:18:33 +02:00
Reuben Morais 3c5aeb5e22 Fix installation by using an explicit symlink 2019-08-29 11:49:53 +02:00
Eren Golge e02fc51fde server update for changing r value 2019-08-23 12:28:05 +02:00
Eren Golge 1a1db23df1 radam 2019-08-22 00:34:46 +02:00
Eren Golge 5ff8544d6a force frame_length to be a multiple hop_length 2019-08-20 13:22:04 +02:00
Eren Golge d99623e285 bug fixes for logging 2019-08-19 16:27:53 +02:00
Eren Golge 5629292bde bug fixes 2019-08-16 15:08:04 +02:00
Eren Golge b22c7d4a29 Merge branch 'dev-gradual-queue' into dev 2019-08-16 13:20:17 +02:00
Eren Golge 5acd9e82bd save model r value for checkpoints 2019-08-16 13:11:51 +02:00
Thomas Werkmeister 215eb014ca enforce list append semantic; prevents numpy add 2019-07-26 13:40:58 +02:00
Eren Golge 85adb2496c Merge branch 'master' of github.com:mozilla/TTS 2019-07-22 20:59:42 +02:00
Eren Golge 91795cc0f1 config update 2019-07-22 15:44:09 +02:00
Eren Golge ee706b50f6 enalbe graudal training by config.json 2019-07-22 02:11:20 +02:00
Thomas Werkmeister f59543d127 fixed usage of bos&eos char with caching 2019-07-19 15:17:35 +02:00
Reuben Morais 9a61dfa155 Address additional lint problems 2019-07-19 11:35:06 +02:00
Reuben Morais 11e7895329 Fix Pylint issues 2019-07-19 09:08:51 +02:00
Eren Gölge 63c0085256 Merge pull request #229 from twerkmeister/patch-2 check for speaker id is None before put on cuda 2019-07-17 16:20:24 +02:00
Thomas Werkmeister ee4d55549d check for speaker id is None before put on cuda 2019-07-17 14:08:53 +02:00
Eren Golge fd081c49b7 split dataset outside preprocessor 2019-07-16 21:15:04 +02:00
Eren Golge aec7f02817 libri tts config, and bug fix 2019-07-16 15:17:38 +02:00
Eren Golge 1468db0d07 bug fix for multispeaker test run 2019-07-12 10:50:20 +02:00
Eren Golge 5851c5d29b Merge branch 'tacotron-gst' into dev 2019-07-11 15:32:32 +02:00
Eren Golge 89969b0f38 LibriTTS processor and a small notification for silence trimming 2019-07-11 15:25:29 +02:00
Thomas Werkmeister 2f2482f9b4 reading all speakers upfront 2019-07-10 18:38:55 +02:00
Thomas Werkmeister d23e29ea1f extracted id to torch code 2019-07-02 14:40:01 +02:00
Thomas Werkmeister ba8cc8054b disabling multispeaker with num_speakers=0 2019-07-01 14:01:34 +02:00
Thomas Werkmeister 04e452d8cb Merge branch 'tacotron-gst' of github.com:mozilla/TTS into multispeaker 2019-07-01 14:00:22 +02:00
Eren Golge 464cc29756 Make optional reampling of the read wav 2019-06-26 14:11:30 +02:00
Thomas Werkmeister 05ff8801d1 config, benchmark notebook, synthesis fixed 2019-06-26 13:31:16 +02:00
Thomas Werkmeister d172a3d3d5 multispeaker 2019-06-26 12:59:14 +02:00
Eren Golge 51f1cd67e3 bug fix 2019-06-15 01:22:27 +02:00
Eren Golge 037ec13453 config update, audio.py update and modularize synthesize.py 2019-06-14 16:18:49 +02:00
Eren Golge e061ed091a modularize synthesis 2019-06-12 12:12:22 +02:00
Eren Golge 0f8936d744 GST inference 2019-06-12 12:12:01 +02:00
Eren Golge 31fe02412c forward_attn_mask and config update 2019-06-06 11:14:20 +02:00
Eren Golge 127a6b68e0 update mulaw decoder 2019-06-06 11:13:26 +02:00
Eren Golge 63eea4a364 bug fix 2019-06-06 10:24:34 +02:00
Eren Golge 7410daceb2 Adapt TTS for TacotronGST and some changes for Audio.py , better config.json naming 2019-06-05 18:33:57 +02:00
Eren Golge 4678c66599 forward_attn_mask and config update 2019-06-04 00:39:29 +02:00
Eren Golge f096f1052f config updates, update audio.py, update mailabs preprocessor 2019-06-03 15:34:36 +02:00
Eren Golge 70929387c0 Merge branch 'dev-tacotron2' 2019-05-28 14:59:24 +02:00
Eren Golge 0dbed8fef7 New method to convert Tacotron output to mel psectrograms 2019-05-27 14:41:59 +02:00
Eren Golge ba492f43be Set tacotron model parameters to adap to common_layers.py - Prenet and Attention 2019-05-27 14:40:28 +02:00
Eren Golge d4b900f6c9 use soundfile for faster read 2019-05-23 02:02:22 +02:00
Eren Golge e62659da94 update separate stopnet flow to make it faster. 2019-05-17 16:15:43 +02:00
Eren Golge 832dc3eafa bug fix 2019-05-15 12:37:31 +02:00
Eren Golge bb2b705e01 small bug fixes 2019-05-14 13:53:26 +02:00
Eren Golge 5e679f746d save figures in visualize of set 2019-05-12 17:35:44 +02:00
Eren Golge 6331bccefc make dropout oprional #2 2019-05-12 17:35:31 +02:00
Eren Golge e2439fde9a make location attention optional and keep all attention weights in attention class 2019-04-29 11:37:01 +02:00
Eren Golge 01dbfb3a0f Server update s 2019-04-18 17:35:20 +02:00
Eren Golge 3c2d500f53 Changesat windowing and some comments 2019-04-12 16:13:40 +02:00
Eren Golge 9466505f27 Make eos bos chars optional 2019-04-12 16:12:15 +02:00
Eren Golge e2cf35bb10 Make loss masking optional 2019-04-10 16:41:08 +02:00
Eren Golge 8a47b46195 print warning if a layer in ehckpoint is not defined in model definition 2019-04-08 19:32:07 +02:00
Eren Golge 961af0f5cd setup_model externally based on model selection. Make forward attention and prenet type configurable in config.json 2019-04-05 17:49:18 +02:00
Eren Golge 7baaf140f9 Remove start character for phonme sequenceing 2019-04-04 10:49:09 +02:00
Eren Golge 2e361e2306 strip sting after phonemizer 2019-03-29 17:05:44 +01:00
Eren Golge 103971c893 text processing updates with tests 2019-03-29 17:04:10 +01:00
Eren Golge 6edd8bc6dd add git branch and restore_path to copied config file for each run 2019-03-29 17:01:57 +01:00
Eren Golge 1ed4978e69 text processing update 2019-03-27 14:57:36 +01:00
Eren Golge 76d5e065db phoneme_to_sequence bug fix 2019-03-27 14:57:26 +01:00
Eren Golge fdca8402c7 config updates 2019-03-26 15:46:26 +01:00
Eren Golge d8908692c5 refactor partial reinit script as a function. Allow user to select layers to reinit in finutunning 2019-03-23 17:19:40 +01:00
Eren Golge 06a7aeb26d git commit bug fix for phonimizer 2019-03-23 16:44:38 +01:00
Eren Golge f96945443e add start char but remove end char 2019-03-22 23:48:44 +01:00
Eren Golge d6307fbb7f config update 2019-03-22 19:12:58 +01:00
Eren Golge ff7258062c skip the alst empty char in phonemes to sequence. It breaks the alingment 2019-03-20 12:24:04 +01:00
Eren Golge 5acc9db4ac Add empty character to phonemes 2019-03-12 10:16:42 +01:00
gnosly 95de2cd559 added missing phonemes, synthesizer.py now setup the correct input layer 2019-03-11 21:56:40 +01:00
Eren Golge b9b79fcf0f inference truncated NEED TO BE TESTED 2019-03-11 17:40:09 +01:00
Eren Golge 5754116c19 bos char addded 2019-03-06 22:06:01 +01:00
Eren Golge a2a22d253f synthesis update compatible with multiplt architecture 2019-03-06 13:11:46 +01:00
Eren Golge 08162157ee generic train.py for multiple architectures set on config.json 2019-03-06 13:11:22 +01:00
Eren Golge 1e8fdec084 Modularize functions in Tacotron 2019-03-05 13:25:50 +01:00
Eren Golge bf5f18d11e Formatting changes and distributed training 2019-02-27 09:50:52 +01:00
Eren Golge caae1af4f6 visual updates for phoenemes 2019-02-25 17:20:36 +01:00
Eren Golge 97a16cedbf phoneme punctuation bug fix 2019-02-16 03:20:04 +01:00
Eren Golge eb839a7acd small buggy fix for phoeneme sequencer 2019-02-05 11:57:12 +01:00
Eren Golge 328db7757d one more phoneme char for en-uk 2019-01-18 13:35:51 +01:00
Eren Golge 4749bc211e Add new char to phoneme symbols for en-gb 2019-01-17 15:48:37 +01:00
Eren Golge 7e020d4084 Bug fixes 2019-01-16 16:23:04 +01:00
Eren Golge 915783e10e enable phoneme based synthesizing 2019-01-16 15:53:07 +01:00
Eren Golge b241104778 Make phoneme training configurable through config.json 2019-01-16 13:07:03 +01:00
Eren Golge 9927664f27 Phonemize statements are updated 2019-01-16 12:30:33 +01:00
Eren Golge 524743507c remove debug prints 2019-01-16 12:29:48 +01:00
Eren Golge b9629135db phonemizer updates for utils.text 2019-01-16 12:29:48 +01:00
Eren Golge c754ca89de Move phoneme compuataion to __init__ and put char list to symbols.py 2019-01-16 12:28:28 +01:00
Eren Golge 28d45a8d80 bug fixes 2019-01-16 12:27:38 +01:00
Eren Golge 004dd0f208 useing epitran and new phoneme list 2019-01-16 12:26:39 +01:00
Eren Golge 0e73b6ba45 Debug prints for phoneme extraction 2019-01-16 12:26:21 +01:00
Eren Golge 85a1990cc6 Convesntional update s 2019-01-16 12:26:21 +01:00
Eren Golge 1722b1659a phonem updates 2019-01-16 12:24:40 +01:00
Eren Golge 9c9aea276c phonem extraction for training 2019-01-16 12:23:04 +01:00
Eren Golge 94387c905e remove debug prints 2019-01-16 12:08:12 +01:00
Eren Golge e1cb7c1501 phonemizer updates for utils.text 2019-01-16 12:08:12 +01:00
Eren Golge df49e93684 Move phoneme compuataion to __init__ and put char list to symbols.py 2019-01-16 12:07:33 +01:00
Eren Golge da2f064bc5 bug fixes 2019-01-16 12:07:33 +01:00
Eren Golge 444451dc8e useing epitran and new phoneme list 2019-01-16 12:07:00 +01:00
Eren Golge 7edb53ce63 Debug prints for phoneme extraction 2019-01-16 12:06:59 +01:00
Eren Golge e6750ca652 Convesntional update s 2019-01-16 12:05:29 +01:00
Eren Golge 5f22e2a83a use phoneme to sequence for synthesis 2019-01-16 12:05:29 +01:00
Eren Golge 421787277f phonem updates 2019-01-16 12:00:41 +01:00
Eren Golge da30c3c9b3 change numbers.py to number_norm.py 2019-01-16 11:59:48 +01:00
Eren Golge 8e22147a19 phonem extraction for training 2019-01-16 11:59:48 +01:00
Eren Golge c8d7a6a84e explicit slience removal after voice synthesis in case of wrong stop token 2019-01-06 18:10:54 +01:00
Eren Golge 4abc9ad1bc Logger field naming update for layer stats 2018-12-28 14:22:41 +01:00
Eren Golge 806643300c Place model name to the beginning of the generated output folder name 2018-12-28 14:22:41 +01:00
Eren Golge 481105ccfa logger for tensorboard plotting 2018-12-28 14:18:19 +01:00
Eren Golge 6488d5e305 nug fix 2018-11-28 16:37:59 +01:00
Eren Golge 7730ef6bff Merge branch 'dev' of github.com:mozilla/TTS into dev 2018-11-28 16:34:03 +01:00
Eren Golge bb2a88a984 Rename LR scheduler 2018-11-26 14:09:42 +01:00
Eren Golge f6bf5b3d74 trim silence if enabled 2018-11-23 17:06:22 +01:00
Eren Golge 0f0bde935c trim silence if enabled 2018-11-23 16:58:26 +01:00
Eren Golge 22dcc4f7d0 small print formatting 2018-11-22 17:03:53 +01:00
Eren Golge 161a26c9dd Plot mel spectrogram if required 2018-11-13 12:10:40 +01:00
Eren Golge 6550db5251 Formatting, fixing import statements, logging learning rate, remove optimizer restore cuda call 2018-11-05 14:05:04 +01:00
Eren Golge 440f51b61d correct import statements 2018-11-03 23:19:23 +01:00
Eren Golge 0b6a9995fc change import statements 2018-11-03 19:15:06 +01:00
Eren Golge d96690f83f Config updates and add sigmoid to mel network again 2018-11-02 17:27:31 +01:00
Eren Golge c8a552e627 Batch update after data-loss 2018-11-02 16:13:51 +01:00
Eren 41bfa95736 bug fix 2018-09-21 21:51:38 +02:00
Eren 34eeaee58b Make audio folder and save audio with scipy 2018-09-21 17:38:55 +02:00
Eren a165cd7bda Bug fix audio saving 2018-09-19 15:45:08 +02:00
Eren c52d3f16f9 Bug fix, prevent save_wav to modify given variable 2018-09-19 14:05:10 +02:00
Eren 56c6d0cac8 Remove min max mel freq 2018-09-06 15:26:20 +02:00
Eren bb526c296f Change scheduler AnnealLR and catch audio synthesis error in eval time 2018-08-13 13:13:45 +02:00
Eren 6818e11185 Make lr scheduler configurable 2018-08-12 15:02:06 +02:00
Eren f7add3c8e5 tensorboardx plotting figures 2018-08-11 16:53:09 +02:00
Eren 3b2654203d fixing size mismatch 2018-08-10 18:48:43 +02:00