Eren Gölge
6e9d4062f2
Add `sort_by_audio_len` option
2021-09-06 15:16:58 +00:00
Eren Gölge
59d52a4cd8
Disable autocast for criterions
2021-09-06 15:16:58 +00:00
Eren Gölge
98a7271ce8
Refactor FastPitchv2
2021-09-06 15:16:58 +00:00
Eren Gölge
e429afbce4
Enable aligner for FastPitch
2021-09-06 15:16:58 +00:00
Eren Gölge
81c228a2d8
Update FastPitch: don't detach duration network inputs
2021-09-06 15:16:58 +00:00
Eren Gölge
ca29033ef4
Refactor FastPitch model
2021-09-06 15:16:58 +00:00
Eren Gölge
42862f7fdb
Format style of the recipes
2021-09-06 15:16:58 +00:00
Eren Gölge
5d59100a88
Don't use align_score for models with duration predictor
2021-09-06 15:16:58 +00:00
Eren Gölge
fac9dbe661
Update FastPitchLoss
2021-09-06 15:16:58 +00:00
Eren Gölge
b81560607b
Update docstrings
2021-09-06 15:16:58 +00:00
Eren Gölge
57b3aec1b9
Update docstring format
2021-09-06 15:16:58 +00:00
Eren Gölge
7692bfe7f8
Update FastPitch config
2021-09-06 15:16:58 +00:00
Eren Gölge
b7caad39e0
Make detaching the duration predictor input optional
2021-09-06 15:16:58 +00:00
Eren Gölge
545a00fc04
Use absolute paths of the attention masks
2021-09-06 15:16:58 +00:00
Eren Gölge
bc396c393f
Add FastPitch model and FastPitchConfig
2021-09-06 15:16:58 +00:00
Eren Gölge
e802b24ad0
Compute mean and std pitch
2021-09-06 15:16:58 +00:00
Eren Gölge
8fffd4e813
Don't print computed phonemes
It causes noise in logs
2021-09-06 15:16:58 +00:00
Eren Gölge
d085642ac1
Cache pitch features
Cache the features at the beginning of `BaseTTS` training.
2021-09-06 15:16:58 +00:00
Eren Gölge
7590c7db7a
Fix `base_tacotron` `aux_input` handling
2021-09-06 15:16:58 +00:00
Eren Gölge
db32162eae
Fix `FastPitchLoss`
2021-09-06 15:16:58 +00:00
Eren Gölge
994f2be2c1
Add compute_f0 field
2021-09-06 15:16:58 +00:00
Eren Gölge
c8d999b010
Add FastPitchLoss
2021-09-06 15:16:58 +00:00
Eren Gölge
fba257104d
Compute F0 using librosa
2021-09-06 15:16:58 +00:00
Katsuya Iida
165e5814af
Update Japanese phonemizer (#758)
* Update default ja vocoder
* update
* Japanese phonemizer test
* Run make style
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2021-09-01 09:33:15 +02:00
Eren Gölge
2b7e55f01f
Fix vits args types
2021-08-30 23:24:20 +00:00
Eren Gölge
18da8f5dbd
Update pylint 2.10.2 and fix lint issues
2021-08-30 08:10:35 +00:00
Eren Gölge
f186856e5d
Add option to sort input sequence by audio len
2021-08-30 08:10:35 +00:00
Eren Gölge
2620f62ea8
Move duration_loss inside VitsGeneratorLoss
2021-08-27 07:07:07 +00:00
Eren Gölge
49e1181ea4
Fixes for the vits model
2021-08-26 17:15:09 +00:00
Eren Gölge
3ab8cef99e
Fix VITS model SDP
2021-08-18 14:55:46 +00:00
Eren Gölge
7c0d564965
Synchronize DDP processes
2021-08-13 10:40:50 +00:00
Eren Gölge
ecf5f17dca
Fix distribute.py and ddp training
2021-08-12 22:22:32 +00:00
Eren Gölge
c8b9ca3d71
Fix Tacotron num_char init
2021-08-10 08:56:34 +00:00
Eren Gölge
6af03ac476
Fix `num_char` init in Tacotron models
2021-08-09 21:46:15 +00:00
Eren Gölge
06018251e6
Add VITS and GlowTTS class docs 🗒️
2021-08-09 18:02:36 +00:00
Eren Gölge
6a7275881d
Add VitsConfig docstring
2021-08-09 18:02:36 +00:00
Eren Gölge
f7a72552f1
Make duration predictor dropout configurable
2021-08-09 18:02:36 +00:00
Eren Gölge
c312acac7d
Implement VITS model 🚀
VITS model implementation built on Glow TTS and HiFiGAN
layers.
2021-08-09 18:02:36 +00:00
Eren Gölge
232a5abb6a
Update `tts.setup_model`
Run `model.make_symbols()` if available to set the symbol list
2021-08-09 18:02:36 +00:00
Eren Gölge
f5a6aa974f
Modify `symbols.py` not to add _arpabet
2021-08-09 18:02:36 +00:00
Eren Gölge
003e5579e8
Enable `custom_symbols` in text processing
Models can define their own custom symbols lists with custom
`make_symbols()`
2021-08-09 18:02:36 +00:00
Eren Gölge
bd4e29b4dd
Add `compute_linear_spec=False` to `BaseTTSConfig`
2021-08-09 18:02:36 +00:00
Eren Gölge
e4648ffef1
Fix multi-speaker init of Tacotron models & tests
2021-08-09 18:02:36 +00:00
Eren Gölge
01324c8e70
Update `base_tts.py`
Enable calling `make_symbols()` from the model if defined.
Compatibility changes for end2end `tts` models in batch formatting.
Changes in multi-speaker initialization.
Modify `test_run()` to work with dict output of `synthesis`
2021-08-09 18:02:36 +00:00
Agrin Hilmkil
ced4cfdbbf
Allow saving / loading checkpoints from cloud paths (#683)
* Allow saving / loading checkpoints from cloud paths
Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.
Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.
* Append suffix _fsspec to save/load function names
* Add a lower bound to the fsspec dependency
Skips the 0 major version.
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless output_path is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for load_fsspec
* Fix bug where fs not always created
* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Eren Gölge
d9e18e009b
Skip phoneme cache pre-compute if the path exists
2021-08-09 18:02:36 +00:00
Eren Gölge
4b7b88dd3d
Add fullband-melgan DE vocoder
2021-07-26 15:38:30 +02:00
Eren Gölge
75b201c6c1
Merge pull request #673 from coqui-ai/fix_stopnet
Fix stopnet training for Tacotron models
2021-07-24 12:25:38 +02:00
Eren Gölge
fc0c4600bd
Fix stopnet training
2021-07-24 11:39:54 +02:00
Eren Gölge
30eed347b6
Merge pull request #581 from Edresson/dev
Compute speaker embeddings in batch for the LSTM Speaker Encoder, and compute embeddings / find chars using the config file.
2021-07-23 17:22:51 +02:00