Eren Gölge
3da79a4de4
Comment Tacotron2 model
2021-10-20 18:14:04 +00:00
Eren Gölge
9f23ad6a0f
Fix imports
2021-09-30 14:47:56 +00:00
Eren Gölge
26f76fce22
Remove SpeedySpeech from .models.json
2021-09-10 17:47:27 +00:00
Eren Gölge
d6e29ef98a
Style update
2021-09-10 08:30:33 +00:00
Eren Gölge
ed4b1d8514
Test `TTS.tts.utils.helpers`
2021-09-10 08:25:21 +00:00
Eren Gölge
bfc6ceac29
Move MAS to `TTS.tts.utils.helpers`
2021-09-09 10:57:19 +00:00
Eren Gölge
537c8576ec
Stage `TTS.tts.utils.helpers`
2021-09-08 13:35:18 +00:00
Eren Gölge
4761853c5c
Fix imports
2021-09-08 13:34:40 +00:00
Eren Gölge
c1513ec4cd
Plot pitch over spectrogram
2021-09-06 15:16:58 +00:00
Eren Gölge
42862f7fdb
Format style of the recipes
2021-09-06 15:16:58 +00:00
Eren Gölge
8fffd4e813
Don't print computed phonemes
...
It causes noise in logs
2021-09-06 15:16:58 +00:00
Katsuya Iida
165e5814af
Update Japanese phonemizer (#758)
...
* Update default ja vocoder
* update
* Japanese phonemizer test
* Run make style
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2021-09-01 09:33:15 +02:00
Eren Gölge
49e1181ea4
Fixes for the vits model
2021-08-26 17:15:09 +00:00
Eren Gölge
c312acac7d
Implement VITS model 🚀
...
VITS model implementation built on Glow TTS and HiFiGAN
layers.
2021-08-09 18:02:36 +00:00
Eren Gölge
f5a6aa974f
Modify `symbols.py` not to add _arpanet
2021-08-09 18:02:36 +00:00
Eren Gölge
003e5579e8
Enable `custom_symbols` in text processing
...
Models can define their own custom symbols lists with custom
`make_symbols()`
2021-08-09 18:02:36 +00:00
Eren Gölge
e4648ffef1
Fix multi-speaker init of Tacotron models & tests
2021-08-09 18:02:36 +00:00
Agrin Hilmkil
ced4cfdbbf
Allow saving / loading checkpoints from cloud paths (#683)
...
* Allow saving / loading checkpoints from cloud paths
Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.
Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.
* Append suffix _fsspec to save/load function names
* Add a lower bound to the fsspec dependency
Skips the 0 major version.
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless output_path is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for load_fsspec
* Fix bug where fs not always created
* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Eren Gölge
75b201c6c1
Merge pull request #673 from coqui-ai/fix_stopnet
...
Fix stopnet training for Tacotron models
2021-07-24 12:25:38 +02:00
Eren Gölge
fc0c4600bd
Fix stopnet training
2021-07-24 11:39:54 +02:00
Edresson
2e5baffa9c
Merge fix and eval split as argparse
2021-07-13 01:47:32 -03:00
Eren Gölge
c25a2184e7
Add docs for `SpeakerManager`
2021-07-03 13:55:27 +02:00
Eren Gölge
ae6405bb76
Docstrings for `Trainer`
2021-06-28 17:03:47 +02:00
Eren Gölge
f23b228e24
Update `speaker_manager`
2021-06-28 17:03:47 +02:00
Eren Gölge
98298ee671
Implement unified IO utils
2021-06-28 17:03:19 +02:00
Eren Gölge
00c82c516d
rename to
2021-06-28 17:03:19 +02:00
Eren Gölge
166f0aeb9a
merge if branches with the same implementation
2021-06-28 17:03:19 +02:00
Eren Gölge
03494ad642
adjust `distribute.py` for the `train_tts.py`
2021-06-28 17:03:19 +02:00
Eren Gölge
25238e0658
fix glow-tts `inference()`
2021-06-28 17:03:19 +02:00
Eren Gölge
419735f440
refactor and fix multi-speaker training in Trainer and Tacotron models
2021-06-28 17:03:19 +02:00
Eren Gölge
2c38ef8441
use get_speaker_manager in Trainer and save speakers.json file when needed
2021-06-28 17:03:19 +02:00
Eren Gölge
db6a97d1a2
rename external speaker embedding arguments as `d_vectors`
2021-06-28 17:03:19 +02:00
Eren Gölge
f82f1970b8
change `to(device)` to `type_as` in models
2021-06-28 17:03:19 +02:00
Eren Gölge
30211512a4
fix type annotations
2021-06-28 17:03:19 +02:00
Eren Gölge
f840268181
refactor `SpeakerManager`
2021-06-28 17:03:19 +02:00
Eren Gölge
421194880d
linter fixes
2021-06-28 17:03:19 +02:00
Eren Gölge
d96ebcd6d3
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
b500338faa
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
c680a07a20
fix `Synthesized` for the new `synthesis()`
2021-06-28 17:03:19 +02:00
Eren Gölge
b8a4af4010
update `synthesis.py` for being more generic
2021-06-28 17:03:19 +02:00
Eren Gölge
f4f83b6379
update `synthesis.py` for the trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
130781dab6
remove `tts.generic_utils` as all the functions are moved to other files
2021-06-28 17:03:19 +02:00
Eren Gölge
ca302db7b0
add sequence_mask to `utils.data`
2021-06-28 17:03:19 +02:00
Eren Gölge
8def3c87af
trainer-API updates
2021-06-28 17:03:19 +02:00
Edresson
1c4e806f54
use speaker manager on compute embeddings script
2021-06-27 03:35:34 -03:00
Michael Hansen
3f172b84d8
Fix linting issues
2021-06-25 14:41:31 +02:00
Michael Hansen
4d8426fa0a
Use eSpeak IPA lexicons by default for phoneme models
2021-06-25 14:41:05 +02:00
Michael Hansen
618b509204
Use combined characters available in TTS phonemes (like ç)
2021-06-25 14:41:05 +02:00
Michael Hansen
da6f6a4a01
Update docstring for clean_gruut_phonemes
2021-06-25 14:41:05 +02:00
Michael Hansen
47191f3ecc
Add tests for gruut phonemization
2021-06-25 14:41:05 +02:00
Michael Hansen
67869e77f9
Use gruut for phonemization
2021-06-25 14:41:05 +02:00
Eren Gölge
49c5e5d820
make style japanese PR
2021-06-02 11:44:46 +02:00
Eren Gölge
73b4083c6c
Merge pull request #502 from kaiidams/kaiidams/kokoro
...
Japanese Tacotron 2 model
2021-06-02 10:20:08 +02:00
Katsuya Iida
1cc18d1972
Move unittest of Japanese phonemizer.
2021-06-01 18:51:34 +09:00
Katsuya Iida
d0c9c1ca5c
Move TTS/tts/utils/japanese
2021-05-29 09:21:47 +09:00
Katsuya Iida
c4987e9d4e
Move import at the head of the file.
2021-05-28 00:22:57 +09:00
Eren Gölge
925c08cf95
replace unidecode with anyascii
2021-05-27 14:02:44 +02:00
Katsuya Iida
f921a05bdb
Fixed lint errors
2021-05-26 19:02:16 +09:00
Katsuya Iida
0536aa6d0f
Japanese Tacotron 2 model
2021-05-22 17:12:19 +09:00
Eren Gölge
8a7c40736c
set use_phonemes false
2021-05-19 01:27:26 +02:00
Eren Gölge
ccfaa6b1d5
add `needs_phonemizer` field to models.json. If set true these models are only compatible with v0.0.13 or below.
2021-05-18 17:57:28 +02:00
Eren Gölge
a14fcf2a13
remove text_processing test
2021-05-18 17:57:28 +02:00
Eren Gölge
d7fae3f515
remove all espeaker and phonemizer deps
2021-05-18 17:57:28 +02:00
Eren Gölge
ced05e812a
move chinese phonemizer
2021-05-18 17:57:28 +02:00
Eren Gölge
19fb1d743d
style update
2021-05-11 11:30:00 +02:00
Eren Gölge
21dd4d7960
fix load_config imports for Coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
c57f0b46bb
reintro use_gst for backwards compat
2021-05-11 11:29:18 +02:00
Eren Gölge
9ee70af9bb
code styling
2021-05-11 11:29:18 +02:00
Eren Gölge
720fe13056
update glow_tts modules and training script for coqpit use
2021-05-11 11:29:17 +02:00
Eren Gölge
647163397d
coqpit refactoring
2021-05-11 11:29:17 +02:00
Eren Gölge
eaa130e813
fix tacotron for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
05d9543ed8
init GST module using gst config in Tacotron models
2021-05-11 11:29:17 +02:00
Eren Gölge
93a00373f6
move split_dataset
2021-05-11 11:29:17 +02:00
Eren Gölge
79d7215142
config refactor #5 WIP
2021-05-11 11:29:17 +02:00
Eren Gölge
dc50f5f0b0
config refactor #4 WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
a21c0b5585
config update 2 WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
e092ae40dc
config update WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
f7582107da
Merge pull request #453 from Edresson/dev
...
Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS.
2021-05-06 17:53:28 +02:00
Eren Gölge
8cb27267a4
formatting
2021-05-03 14:26:35 +02:00
Eren Gölge
2f0716073e
enable multi-speaker CoquiTTS models for synthesize.py
2021-04-26 19:36:53 +02:00
Eren Gölge
b531fa699c
remove conflict noise
2021-04-26 15:27:52 +02:00
Eren Gölge
f37b488876
Merge branch 'speaker-manager' of https://github.com/coqui-ai/TTS into speaker-manager
2021-04-26 15:25:25 +02:00
Edresson
8228091f92
add script for extraction of tts spectrograms
2021-04-23 14:17:46 -03:00
Eren Gölge
4cf211348d
styling and linting
2021-04-23 18:04:37 +02:00
Eren Gölge
f69195739e
let speaker manager compute mean x_vector from multiple wav files
2021-04-23 18:04:37 +02:00
Eren Gölge
c80d21f311
load speaker_encoder_ap and compute x_vector directly from the input file in speaker manager
2021-04-23 18:04:37 +02:00
Eren Gölge
e97126314c
add `unique` argument to make_symbols to fix the incompat. issue of the SC-Glow models
2021-04-23 18:04:37 +02:00
Eren Gölge
d08888e603
formatting speakers.py
2021-04-23 18:04:37 +02:00
Eren Gölge
df422223a3
initial SpeakerManager implementation
2021-04-23 18:04:37 +02:00
Eren Gölge
7a7aeb35f5
fix the glow-tts in setup_model
2021-04-23 18:04:37 +02:00
Eren Gölge
99dc07a7dd
add `unique` param to keep scglow models compatible (there are duplicate symbols in the character set)
2021-04-23 18:04:37 +02:00
Eren Gölge
aadb2106ec
code styling
2021-04-23 18:04:37 +02:00
kirianguiller
7dccbfdcd5
handle multi speaker and gst in Synthetizer class
2021-04-23 18:04:37 +02:00
Eren Gölge
04b6881b66
add `unique` argument to make_symbols to fix the incompat. issue of the SC-Glow models
2021-04-21 13:12:35 +02:00
Eren Gölge
790946faec
formatting speakers.py
2021-04-21 13:12:11 +02:00
Eren Gölge
ab313814de
initial SpeakerManager implementation
2021-04-21 13:11:46 +02:00
Eren Gölge
09890c7421
fix the glow-tts in setup_model
2021-04-21 13:10:40 +02:00
Eren Gölge
d2fa8add1f
add `unique` param to keep scglow models compatible (there are duplicate symbols in the character set)
2021-04-16 19:40:13 +02:00
Eren Gölge
47e356cb48
code styling
2021-04-16 16:01:40 +02:00
kirianguiller
48ae52a9a3
handle multi speaker and gst in Synthetizer class
2021-04-16 15:54:49 +02:00
Eren Gölge
9cc17be53a
formatting and a small bug fix in Tacotron model
2021-04-15 16:36:51 +02:00
Eren Gölge
3de5a89154
optionally enable prenet dropout at inference time for tacotron models
2021-04-13 13:24:56 +02:00
Eren Gölge
480e2f7888
docstring update and better handling make_symbols
2021-04-12 16:40:49 +02:00
Eren Gölge
e5b9607bc3
isort all imports
2021-04-09 00:45:20 +02:00
Eren Gölge
0e79fa86ad
format with black and pylint 2.7.3
2021-04-09 00:38:08 +02:00
Eren Gölge
7a382a5c2b
stowed aligntts commit and small refactoring with feed_forward layers
2021-03-30 14:39:16 +02:00
Eren Gölge
844e8e0ed4
adapt align_tts and model name handling
2021-03-30 14:39:16 +02:00
Eren Gölge
2b3e12ea49
correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting
2021-03-30 14:39:16 +02:00
Eren Gölge
e8cf8cb00e
restructure TF tacotron files
2021-03-30 14:39:16 +02:00
Eren Gölge
bdfd1f8a89
linter fix
2021-03-16 19:13:32 +01:00
WeberJulian
11e25a7125
fix linter issues
2021-03-16 19:13:01 +01:00
WeberJulian
1574d8dd39
fix french_cleaners
2021-03-16 19:13:01 +01:00
Eren Gölge
94805236fb
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2021-03-08 15:21:06 +01:00
Eren Gölge
9a48ba3821
a ton of linter updates
2021-03-08 05:06:54 +01:00
kirianguiller
557239db7f
remove re.Match typing in '_number_replace()'
2021-03-08 02:59:48 +01:00
kirianguiller
9ab07f94e2
modify according to PR reviews
2021-03-08 02:59:48 +01:00
kirianguiller
42ba30eb8f
<add> Chinese mandarin implementation (tacotron2)
2021-03-08 02:59:24 +01:00
kirianguiller
e85658ac2b
remove re.Match typing in '_number_replace()'
2021-03-08 02:57:11 +01:00
kirianguiller
0d4525322c
modify according to PR reviews
2021-03-08 02:57:11 +01:00
kirianguiller
e6fd118cf8
<add> Chinese mandarin implementation (tacotron2)
2021-03-08 02:57:11 +01:00
Eren Gölge
0e1e60bef0
remove redundancy
2021-03-08 02:54:47 +01:00
Eren Gölge
55fc50b26d
update test_text_processing for espeak-ng
2021-03-08 02:54:47 +01:00
Eren Gölge
5b8a6736a7
remove _phoneme_punctuations
2021-03-08 02:54:47 +01:00
Eren Gölge
62a8eba3b2
parse_characters function
2021-03-08 02:54:47 +01:00
Eren Gölge
0b33acdcca
enable saving model characters in io.py
2021-03-08 02:54:47 +01:00
Eren Gölge
9fefc79f0c
fix make_symbols
2021-03-08 02:54:47 +01:00
Eren Gölge
5f1018abee
fix spelling of a def argument and parse phonemes from config.json if use_phonemes is True
2021-03-08 02:54:47 +01:00
Eren Gölge
6cd642c2e1
add missing phonemes to test_config.json
2021-03-08 02:54:47 +01:00
Eren Gölge
ee58ff2d38
add russian phoneme char
2021-03-08 02:54:47 +01:00
Eren Gölge
90d4f08d6c
reorder imports
2021-03-08 02:48:31 +01:00
kirianguiller
7f36d91131
update chinese model
2021-03-01 14:55:05 +01:00
kirianguiller
3911b87e54
remove re.Match typing in '_number_replace()'
2021-02-17 20:53:56 +01:00
kirianguiller
fb0655d1e7
modify according to PR reviews
2021-02-17 20:53:56 +01:00
kirianguiller
c4c7bc1b88
<add> Chinese mandarin implementation (tacotron2)
2021-02-17 20:53:56 +01:00
Eren Gölge
ff218e2370
remove redundancy
2021-02-15 12:07:02 +00:00
Eren Gölge
4244096ccb
update test_text_processing for espeak-ng
2021-02-12 14:07:26 +00:00
Eren Gölge
b28c724c04
remove _phoneme_punctuations
2021-02-12 12:10:57 +00:00
Eren Gölge
593cedee14
parse_characters function
2021-02-12 12:05:56 +00:00
Eren Gölge
2abfff17f9
enable saving model characters in io.py
2021-02-12 12:04:41 +00:00
Eren Gölge
43f54d2dce
fix make_symbols
2021-02-11 15:26:52 +00:00
Eren Gölge
bc131208be
fix spelling of a def argument and parse phonemes from config.json if use_phonemes is True
2021-02-11 13:04:47 +00:00
Eren Gölge
3baec4ea96
add missing phonemes to test_config.json
2021-02-11 11:14:39 +00:00
Eren Gölge
b08b8ca2a1
add russian phoneme char
2021-02-10 13:30:59 +00:00
Eren Gölge
a926aa106d
reorder imports
2021-01-29 01:36:21 +01:00
Eren Gölge
c990b3a59c
linter fixes and test fixes
2021-01-22 02:32:35 +01:00
root
1bc8fbbd3c
set eval mode when loading models
2021-01-20 02:14:18 +00:00
erogol
de2a542f83
glow-tts bug fix
2021-01-07 13:40:32 +01:00
erogol
e7fad928e7
doc strings for the all glow-tts layers
2021-01-06 13:19:40 +01:00
erogol
7586fbc4de
SS refactoring
2021-01-06 13:19:40 +01:00
erogol
e82d31b6ac
glow tts refactoring
2021-01-06 13:19:40 +01:00
erogol
71c382be14
copy model scale stats file with config.json to the training folder, fixed for model inits
2021-01-06 13:19:40 +01:00
erogol
fede46e96e
pylint and test fixes
2021-01-06 13:19:40 +01:00
erogol
e4680e1b99
plot float16 alignments
2021-01-06 13:19:40 +01:00
erogol
13c6665c92
inference for SS
2021-01-06 13:19:40 +01:00
erogol
30788960a8
check SS model parameters
2021-01-06 13:19:40 +01:00
erogol
5cae2c5742
make optional position encoding for speedyspeech
2021-01-06 13:19:40 +01:00
erogol
022af74d74
update prompt msg
2021-01-06 13:19:40 +01:00
erogol
57ef53bef3
update argument check for non tacotron models
2021-01-06 13:19:40 +01:00
erogol
fa6907fa0e
update glow-tts parameters and fix rel-attn-win size
2021-01-06 13:19:40 +01:00
erogol
973754d893
fix for init glow-tts
2021-01-06 13:19:40 +01:00
erogol
070146e143
add monotonic dynamic convolution attention
2021-01-06 13:18:41 +01:00
erogol
639fa29261
update speaker id casting for glow-tts
2020-12-14 16:58:47 +01:00
erogol
999120ecdf
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-12 18:50:14 +01:00
erogol
f611e6ac01
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-12-12 18:47:59 +01:00
Jörg Thalheim
62fd4ca70d
inflect negative numbers correctly
2020-12-10 16:47:51 +01:00
Jörg Thalheim
6646682650
cleaners: expand english time
2020-12-10 14:53:20 +01:00
Jörg Thalheim
76138687d3
expand more currencies
2020-12-10 14:53:20 +01:00
erogol
a2859b7ddc
update config args checks
2020-12-10 13:52:57 +01:00
erogol
788cd6f902
fix multi-speaker glow-tts inference
2020-12-10 02:05:48 +01:00
erogol
92cc9630d7
fix glow-tts synthesis for DPP
2020-12-10 00:30:34 +01:00
erogol
06612ce305
test fixes
2020-12-07 15:57:34 +01:00
erogol
a1e4ee18f9
convert float16 to float32 for plotting spectrograms
2020-11-25 14:50:28 +01:00
erogol
6cc464ead6
fix a ton of testing bugs
2020-11-12 16:33:29 +01:00
erogol
21364331d2
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
2020-11-09 13:31:12 +01:00
erogol
183fe56d95
Merge branch 'ssim_loss' into dev
2020-10-29 23:49:09 +01:00
erogol
946a0c0fb9
bug fixes for single speaker glow-tts, enable torch based amp. Make amp optional for wavegrad. Bug fixes for synthesis setup for glow-tts
2020-10-29 15:45:50 +01:00
erogol
e723b99888
handle distributed model as saving
2020-10-29 12:30:37 +01:00
erogol
59e1cf99d0
config update and ssim implementation
2020-10-28 18:30:00 +01:00
erogol
9cef923d99
ssim loss for tacotron models
2020-10-28 15:24:18 +01:00
Edresson
f01502a9db
bug fix in glowTTS synthesize
2020-10-27 16:30:16 -03:00
Edresson
89e9bfe3a2
add text processing blank token test
2020-10-26 17:41:23 -03:00
Edresson
d9540a5857
add blank token in sequence to improve glowtts results
2020-10-25 15:08:28 -03:00
Edresson
fbea058c59
add parse speakers function
2020-10-24 16:10:05 -03:00
Edresson
07345099ee
GlowTTS zero-shot TTS Support
2020-10-24 15:58:39 -03:00
Edresson
b7f9ebd32b
add check arguments for GlowTTS and multispeaker training bug fix
2020-10-19 17:17:58 -03:00
erogol
e1eab1ce4b
print model r value as loading it
2020-10-07 13:34:21 +02:00
Eren Gölge
4873601694
Merge pull request #531 from WeberJulian/french-cleaners
...
Adding support for french cleaners
2020-09-30 15:30:50 +02:00
Edresson
99d5a0ac07
add Speaker Conditional GST support
2020-09-29 16:09:27 -03:00
Julian WEBER
ea7c2e15c0
Adding french abbreviations
2020-09-29 15:43:39 +02:00
Julian WEBER
54b4031391
Merge remote-tracking branch 'origin/dev' into french-cleaners
2020-09-29 14:24:51 +02:00
Julian WEBER
da134eeee4
Subjective improvements
2020-09-29 14:20:52 +02:00
Julian WEBER
b2817e9e93
Adding french cleaners
2020-09-29 14:20:24 +02:00
mueller91
227b9c8864
fix: split_dataset() runtime reduced from O(N * |items|) to O(N) where N is the size of the eval split (max 500)
...
I notice a significant speedup on the initial loading of large datasets such as common voice (from minutes to seconds)
2020-09-23 23:27:51 +02:00
mueller91
1fe5eb054f
Merge branch 'dev' of https://github.com/mozilla/TTS into dev
...
Conflicts:
TTS/bin/train_encoder.py
requirements.txt
2020-09-22 19:58:53 +02:00
mueller91
df4caec4b7
add: check_config for speaker_encoder
2020-09-22 19:52:09 +02:00
mueller91
0ea7f4e2bd
fix: make speaker encoder's storage parameters non-restricted
2020-09-22 10:39:40 +02:00
mueller91
7029452228
fix: make speaker encoder's storage parameters non-restricted
2020-09-22 10:31:42 +02:00
erogol
10258724d1
linter fixes
2020-09-22 03:54:16 +02:00
erogol
a6df617eb1
Merge branch 'glow-tts-amp-time_depth_conv' into dev
2020-09-21 14:23:45 +02:00
mueller
6b0621c794
cleanup
2020-09-17 16:46:43 +02:00