Eren Gölge
a409e0f8f8
Update train_tts for multi-speaker
2021-10-21 16:29:06 +00:00
Eren Gölge
ba2b8c827f
Update `train_tts.py` and `train_vocoder.py`
2021-09-30 14:47:56 +00:00
Eren Gölge
2e9b6b4f90
Refactor Speaker Encoder training
2021-09-30 14:47:56 +00:00
Eren Gölge
043dca61b4
Rename `load_meta_data` as `load_tts_data`
2021-09-30 14:47:56 +00:00
Eren Gölge
3c740d4893
Style extract_tts_spectrogram.py
2021-09-10 08:21:21 +00:00
Eren Gölge
807f1d3817
Fix `extract_tts_spectrograms.py` model init
2021-09-09 08:59:55 +00:00
Eren Gölge
91a70e80b2
Refactor TTSDataset
...
Return a dict by `collate`
Refactor batch handling in `collate`
A couple of bug fixes
2021-09-06 15:16:58 +00:00
Eren Gölge
545a00fc04
Use absolute paths of the attention masks
2021-09-06 15:16:58 +00:00
Eren Gölge
0f19f8c911
Fix `compute_attention_masks.py`
2021-09-06 15:16:58 +00:00
Eren Gölge
18da8f5dbd
Update pylint 2.10.2 and fix lint issues
2021-08-30 08:10:35 +00:00
Eren Gölge
f186856e5d
Add option to sort input sequence by audio len
2021-08-30 08:10:35 +00:00
Eren Gölge
5911eec3b1
Small trainer refactoring
...
1. Use a single Gradscaler for all the optimizers
2. Save terminal logs to a file. In DDP mode, each worker creates `trainer_N_log.txt`.
3. Fixes to allow only the main worker (rank==0) writing to Tensorboard
4. Pass parameters owned by the target optimizer to the grad_clip_norm
2021-08-26 17:08:58 +00:00
Eren Gölge
ecf5f17dca
Fix distribute.py and ddp training
2021-08-12 22:22:32 +00:00
Eren Gölge
6af03ac476
Fix `num_char` init in Tacotron models
2021-08-09 21:46:15 +00:00
Ayush Chaurasia
936a47504d
Update Logger API, recipes
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
f63cf46c55
Unified logger API
2021-08-09 18:34:00 +00:00
Ayush Chaurasia
f606741dc4
Add artifacts logging, wandb args
2021-08-09 18:31:16 +00:00
Agrin Hilmkil
ced4cfdbbf
Allow saving / loading checkpoints from cloud paths (#683)
...
* Allow saving / loading checkpoints from cloud paths
Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.
Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.
* Append suffix _fsspec to save/load function names
* Add a lower bound to the fsspec dependency
Skips the 0 major version.
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless output_path is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for load_fsspec
* Fix bug where fs not always created
* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Eren Gölge
4b7b88dd3d
Add fullband-melgan DE vocoder
2021-07-26 15:38:30 +02:00
Edresson Casanova
d5adc35fdf
Add docstring to compute_embeddings script
2021-07-21 07:16:10 -03:00
Edresson
b1620d1f3f
remove ignore generate eval flag
2021-07-15 03:34:28 -03:00
Edresson
d906fea08c
lint fix and eval as argparse in extract tts spectrograms
2021-07-13 02:15:31 -03:00
Edresson
2e5baffa9c
Merge fix and eval split as argparse
2021-07-13 01:47:32 -03:00
Eren Gölge
93a74cbb71
Merge pull request #628 from Aloento/patch-2
...
Change to _get_preprocessor_by_name
2021-07-11 22:17:50 +02:00
Edresson
4eac1c4651
bug fix on train_encoder and unit tests
2021-07-11 12:00:39 -03:00
Aloento
6e3e6d5756
Change to _get_preprocessor_by_name
2021-07-08 09:53:13 +02:00
Eren Gölge
a4c658f5ef
Fix for using the `Synthesizer` out of the model
2021-07-02 10:43:38 +02:00
Eren Gölge
b3c073c99b
Allow running full path scripts with `distribute.py`
2021-06-28 17:03:47 +02:00
Eren Gölge
a7617d8ab6
Add 🐍 python 3.9 to CI
2021-06-28 17:03:47 +02:00
Eren Gölge
9790eddada
Fix wrong argument name 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
45947acb60
Update `TTS.bin` scripts for the new API
2021-06-28 17:03:47 +02:00
Eren Gölge
c7aad884cd
Implement unified trainer
2021-06-28 17:03:19 +02:00
Eren Gölge
c754a0e17d
`TrainerAbstract` and related updates for `TrainerTTS`
2021-06-28 17:03:19 +02:00
Eren Gölge
00c82c516d
rename to
2021-06-28 17:03:19 +02:00
Eren Gölge
03494ad642
adjust `distribute.py` for the `train_tts.py`
2021-06-28 17:03:19 +02:00
Eren Gölge
d6b2b6add6
make style and linter fixes
2021-06-28 17:03:19 +02:00
Eren Gölge
802d461389
Compute d_vectors and speaker_ids separately in TTSDataset
2021-06-28 17:03:19 +02:00
Eren Gölge
db6a97d1a2
rename external speaker embedding arguments as `d_vectors`
2021-06-28 17:03:19 +02:00
Eren Gölge
ef4ea9e527
update imports for `formatters`
2021-06-28 17:03:19 +02:00
Eren Gölge
421194880d
linter fixes
2021-06-28 17:03:19 +02:00
Eren Gölge
8e52a69230
delete separate tts training scripts and pre-commit configuration
2021-06-28 17:03:19 +02:00
Eren Gölge
d96ebcd6d3
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
b500338faa
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
469d2e620a
update extract_tts_spectrogram for `cond_input` API of the models
2021-06-28 17:03:19 +02:00
Eren Gölge
5ab28fa618
update `extract_tts_spec...` using `SpeakerManager`
2021-06-28 17:03:19 +02:00
Eren Gölge
c392fa4288
update `extract_tts_spectrograms` for the new model API
2021-06-28 17:03:19 +02:00
Eren Gölge
8f47f95998
correct import of `load_meta_data`
...
remove redundant import
2021-06-28 17:03:19 +02:00
Eren Gölge
d25f017b42
update `setup_model.py` imports
2021-06-28 17:03:19 +02:00
Eren Gölge
e298b8e364
update trainer.py for better logging handling, restoring models and
...
rename init_ functions with get_
2021-06-28 17:03:19 +02:00
Eren Gölge
5f07315722
add trainer and train_tts
2021-06-28 17:03:19 +02:00
Eren Gölge
8def3c87af
trainer-API updates
2021-06-28 17:03:19 +02:00
Eren Gölge
42554cc711
rename MyDataset -> TTSDataset
2021-06-28 17:03:19 +02:00
Edresson
1c4e806f54
use speaker manager on compute embeddings script
2021-06-27 03:35:34 -03:00
Edresson Casanova
eb84bb2bc8
Merge branch 'dev' into dev
2021-06-26 15:32:19 -03:00
Michael Hansen
3f172b84d8
Fix linting issues
2021-06-25 14:41:31 +02:00
Edresson
99d40e98d9
fix Lint checks
2021-06-18 14:59:01 -03:00
Edresson
28bec238ca
fix Lint checks
2021-06-18 14:33:50 -03:00
Edresson
83644056e3
fix Lint checks
2021-06-18 14:32:28 -03:00
Edresson Casanova
e78e3cd81e
Merge branch 'dev' into dev
2021-06-18 14:10:03 -03:00
Edresson
b74b510d3c
Compute embeddings and find characters using config file
2021-06-18 14:04:49 -03:00
Adam Froghyar
b0aa189348
Forcing do_trim_silence to False in the extract TTS script
2021-06-14 10:44:00 +02:00
Eren Gölge
d0ab0382fc
linter fixes
2021-06-01 09:15:32 +02:00
Eren Gölge
bec85ac58d
make style
2021-05-31 16:37:15 +02:00
Edresson
7448177b72
use SpeakerManager on compute embeddings script
2021-05-29 21:11:53 -03:00
Edresson
208bb0f0ee
add batched speaker encoder inference
2021-05-27 20:01:00 -03:00
Edresson
825734a3a9
remove unused embeddings export
2021-05-27 19:10:24 -03:00
Edresson
1496f271dc
update Compute embeddings script
2021-05-27 00:45:18 -03:00
Edresson
c90037c2e9
solve merge problems
2021-05-26 16:01:30 -03:00
Edresson Casanova
f89cb6aec2
Merge branch 'dev' into dev
2021-05-25 17:30:25 -03:00
Edresson
d570c2d790
pylint fix and data loader bug fix
2021-05-26 01:11:37 -03:00
Eren Gölge
c2c7dff805
use relaxed coqpit parser
2021-05-18 14:49:47 +02:00
Edresson
856ea19758
bug fix in dataloader and update inference
2021-05-18 03:43:16 -03:00
Eren Gölge
12722501bb
styling
2021-05-15 23:48:31 +02:00
Edresson
3433c2f348
add compute embedding for the new speaker encoder
2021-05-12 03:06:46 -03:00
Eren Gölge
715b0a65a0
update main.yml for python x64
...
fix test
2021-05-12 00:57:29 +02:00
Edresson
3fcc748b2e
implement the Speaker Encoder H/ASP
2021-05-11 16:27:05 -03:00
Eren Gölge
843d1b3d98
linter fixes
2021-05-11 11:30:00 +02:00
Eren Gölge
19fb1d743d
style update
2021-05-11 11:30:00 +02:00
Eren Gölge
9f7599e3c3
fix train_encoder for coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
3fde2001b1
train_encoder refactoring for coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
9ee70af9bb
code styling
2021-05-11 11:29:18 +02:00
Eren Gölge
78b3825d0b
update train scripts for coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
e6f45b9eb7
update train_vocoder_gan.py for coqpit
2021-05-11 11:29:18 +02:00
Eren Gölge
bcebd69d09
remove bash tts training tests
2021-05-11 11:29:17 +02:00
Eren Gölge
7227e8f1d2
update train_align_tts.py for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
720fe13056
update glow_tts modules and training script for coqpit use
2021-05-11 11:29:17 +02:00
Eren Gölge
35341d5482
move bash script based tests to python with coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
eaa130e813
fix tacotron for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
65d7ad4250
refactor train_speedy_speech.py for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
9c18e40f64
black formatting
2021-05-11 11:29:17 +02:00
Eren Gölge
c34c8137d7
update compute_statistics for coqpit
2021-05-11 11:29:17 +02:00
Eren Gölge
79d7215142
config refactor #5 WIP
2021-05-11 11:29:17 +02:00
Eren Gölge
dc50f5f0b0
config refactor #4 WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
97bd5f9734
[ci skip] config update #3 WIP
2021-05-11 11:28:35 +02:00
Eren Gölge
a21c0b5585
config update 2 WIP
2021-05-11 11:28:35 +02:00
Edresson
85ccad7e0a
add Audio data augmentation Additive and RIR
2021-05-11 00:59:57 -03:00
Edresson
77d85c6cc5
add softmaxproto loss and bug fix in data loader
2021-05-10 17:08:38 -03:00
Eren Gölge
f7582107da
Merge pull request #453 from Edresson/dev
...
Script for spectrogram extraction using teacher forcing and Glow-TTS inference with MAS.
2021-05-06 17:53:28 +02:00
Edresson
501c8e0302
remove unused vars on extract tts spectrograms script
2021-05-04 19:04:13 -03:00
Eren Gölge
87d674a038
bumpup librosa version to 0.8.0
2021-05-03 14:25:09 +02:00
Edresson
3ecd556bbe
add unit test for extract tts spectrograms script
2021-05-01 13:41:56 -03:00
Edresson
446b1da936
create inference function
2021-04-29 18:18:37 -03:00
Eren Gölge
1235e54738
test for synthesize.py
2021-04-27 14:17:38 +02:00
Eren Gölge
2f0716073e
enable multi-speaker CoquiTTS models for synthesize.py
2021-04-26 19:36:53 +02:00
Edresson
20e42a3381
add save audio option
2021-04-23 15:00:00 -03:00
Edresson
8228091f92
add script for extraction of tts spectrograms
2021-04-23 14:17:46 -03:00
Eren Gölge
4cf211348d
styling and linting
2021-04-23 18:04:37 +02:00
Eren Gölge
179722e3a7
new arguments to synthesize.py for loading speaker encoder and speaker wavs
2021-04-23 18:04:37 +02:00
Eren Gölge
af2d36faeb
update synthesize.py for multi-speaker setting
2021-04-23 18:04:37 +02:00
Edresson
d2b6326b8b
change optimizer initialization for compatibility with Hifi-GAN official implementation
2021-04-23 07:54:39 -03:00
Eren Gölge
9cc17be53a
formatting and a small bug fix in Tacotron model
2021-04-15 16:36:51 +02:00
Eren Gölge
d60a8d7211
show the real waveform on TB too for GAN vocoder training.
2021-04-15 15:30:06 +02:00
Eren Gölge
5fbe926429
change the default TTS model to TacotronDDC
2021-04-15 15:29:44 +02:00
Eren Gölge
b11d1cb845
small fixes
2021-04-12 12:40:55 +02:00
Eren Gölge
a7f6045644
Merge branch 'reformat' into hifigan-reformat
2021-04-12 12:00:17 +02:00
Eren Gölge
f519012dea
reformatting and styling
2021-04-12 11:47:39 +02:00
Eren Gölge
5b70da2e3f
restore schedulers only if training is continuing a previous training
...
inherit nn.Module for TorchSTFT
2021-04-09 19:31:28 +02:00
Eren Gölge
105e0b4d62
vocoder gan training fixes
2021-04-09 11:38:04 +02:00
Eren Gölge
18d9ec8036
format with black
2021-04-09 00:54:59 +02:00
Eren Gölge
e5b9607bc3
isort all imports
2021-04-09 00:45:20 +02:00
Eren Gölge
0e79fa86ad
format with black and pylint 2.7.3
2021-04-09 00:38:08 +02:00
Eren Gölge
cd69da4868
linter fixes #2
2021-04-08 16:57:46 +02:00
Eren Gölge
0ee0458309
remove redundant imports
2021-04-08 11:29:15 +02:00
Eren Gölge
4998ece8d8
allow configuration of optimizers from the config file
2021-04-08 11:28:30 +02:00
Eren Gölge
8daf407652
cache empty
2021-04-08 11:28:30 +02:00
Eren Gölge
3fb78c004a
move scheduler updates to the end of the epoch
2021-04-08 11:28:30 +02:00
Eren Gölge
2a872c98aa
don't call os.exit as it leaves the process resources standing
2021-04-08 11:27:40 +02:00
Eren Gölge
57f6bd1afa
make using different samples for G and D networks optional
2021-04-08 11:26:01 +02:00
rishikksh20
e656e8b108
Remove select size bug
2021-04-08 11:20:33 +02:00
rishikksh20
ef6ff4e95c
Add Exponential LR scheduler check
2021-04-08 11:20:33 +02:00
Eren Gölge
6ad4eba678
gan vocoder train fix for restoring models with no scheduler defined
2021-04-06 16:24:50 +02:00
Eren Gölge
b4c2cf80f2
fix eval iter
2021-03-30 14:39:16 +02:00
Eren Gölge
a3a840fd78
linter fixes
2021-03-30 14:39:16 +02:00
Eren Gölge
7a382a5c2b
stowed aligntts commit and small refactoring with feed_forward layers
2021-03-30 14:39:16 +02:00
Eren Gölge
2b3e12ea49
correct imports after refactoring, add AlignTTS (old SSMAS) and some formatting
2021-03-30 14:39:16 +02:00
Eren Gölge
d9c405f0c3
create feedforward folder for SS layers
2021-03-30 14:39:16 +02:00
Eren Gölge
ca2f22cdd7
linter fix
2021-03-30 14:36:12 +02:00
Eren Gölge
d0dcd7d1b8
let the user define output.wav file path fix #393
2021-03-30 14:24:31 +02:00
Eren Gölge
3947750dd9
Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev
2021-03-18 14:09:47 +01:00
WeberJulian
596ea2c98a
Add resample script
2021-03-18 13:33:37 +01:00
Eren Gölge
65533f33e9
fix #374
2021-03-18 13:33:00 +01:00
WeberJulian
af96080e17
fix linter issues
2021-03-18 13:33:00 +01:00
WeberJulian
f6cd8e0ecc
test case
2021-03-18 13:33:00 +01:00
WeberJulian
e954e45e57
linter + test
2021-03-18 13:33:00 +01:00
WeberJulian
e598977f3d
Using path.join instead of concat
2021-03-18 13:33:00 +01:00
WeberJulian
c5ef2de73f
Add resample script
2021-03-18 13:33:00 +01:00
Eren Gölge
babc94f63f
fix #374
2021-03-16 19:13:32 +01:00
WeberJulian
11e25a7125
fix linter issues
2021-03-16 19:13:01 +01:00
WeberJulian
b94373afb8
test case
2021-03-16 19:13:01 +01:00
WeberJulian
93fdc0729c
linter + test
2021-03-16 19:13:01 +01:00