Enno Hermann
3c2d5a9e03
Remove duplicate AudioProcessor code and fix ExtractTTSpectrogram.ipynb (#3230)
* chore: remove unused argument
* refactor(audio.processor): remove duplicate stft+griffin_lim
* chore(audio.processor): remove unused compute_stft_paddings
Same function available in numpy_transforms
* refactor(audio.processor): remove duplicate db_to_amp
* refactor(audio.processor): remove duplicate amp_to_db
* refactor(audio.processor): remove duplicate linear_to_mel
* refactor(audio.processor): remove duplicate mel_to_linear
* refactor(audio.processor): remove duplicate build_mel_basis
* refactor(audio.processor): remove duplicate stft_parameters
* refactor(audio.processor): use pre-/deemphasis from numpy_transforms
* refactor(audio.processor): use rms_volume_norm from numpy_transforms
* chore(audio.processor): remove duplicate assert
Already checked in numpy_transforms.compute_f0
* refactor(audio.processor): use find_endpoint from numpy_transforms
* refactor(audio.processor): use trim_silence from numpy_transforms
* refactor(audio.processor): use volume_norm from numpy_transforms
* refactor(audio.processor): use load_wav from numpy_transforms
* fix(bin.extract_tts_spectrograms): set quantization bits
* fix(ExtractTTSpectrogram.ipynb): adapt to current TTS code
Fixes #2447, #2574
* refactor(audio.processor): remove duplicate quantization methods
2023-11-16 10:57:06 +01:00
Eren Gölge
914280a556
Bump up to v0.11.0 (#2329)
* Make style
* Bump up to v0.11.0
2023-02-08 13:58:49 +01:00
Eren Gölge
9e5a469c64
d-vector handling (#1945)
* Update BaseDatasetConfig
- Add dataset_name
- Change name to formatter_name
* Update compute_embedding
- Allow entering dataset by args
- Use released model by default
- Use the new key format
* Update loading
* Update recipes
* Update other dep code
* Update tests
* Fixup
* Load multiple embedding files
* Fix argument names in dep code
* Update docs
* Fix argument name
* Fix linter
2022-09-13 14:10:33 +02:00
Edresson Casanova
060e0f9368
Add EmbeddingManager and BaseIDManager (#1374)
2022-03-31 13:41:16 +02:00
Eren Gölge
1425a023fe
Make style and lint
2022-03-02 13:25:35 +01:00
Eren Gölge
4d99fee3e2
Update spec extractor
2022-02-25 11:12:44 +01:00
Eren Gölge
a51b031bff
Merge branch 'dev' into dev-fix-glowtts-infer
2022-02-21 12:01:40 +03:00
Edresson Casanova
28a7464975
Fix the bug in split dataset function (#1251)
* Fix the bug in split_dataset
* Make eval_split_size configurable
* Change test_loader to use load_tts_samples function
* Change eval_split_portion to eval_split_size and allow setting the absolute number of samples in eval
* Fix samplers unit test
* Add data unit test on GitHub workflow
2022-02-21 11:59:36 +03:00
Edresson Casanova
bc5db13d06
Fix the bug in extract tts spectrogram script
2022-02-19 19:24:00 +00:00
Eren Gölge
704dddcffa
Make style
2021-12-20 11:54:10 +00:00
Edresson
85418ffeaa
Fix the bug in extract tts spectrograms
2021-12-20 11:54:10 +00:00
Edresson
34749f8727
Remove the call to get_speaker_manager
2021-12-20 11:54:10 +00:00
Eren Gölge
043dca61b4
Rename `load_meta_data` as `load_tts_data`
2021-09-30 14:47:56 +00:00
Eren Gölge
3c740d4893
Style extract_tts_spectrogram.py
2021-09-10 08:21:21 +00:00
Eren Gölge
807f1d3817
Fix `extract_tts_spectrograms.py` model init
2021-09-09 08:59:55 +00:00
Eren Gölge
91a70e80b2
Refactor TTSDataset
Return a dict by `collate`
Refactor batch handling in `collate`
A couple of bug fixes
2021-09-06 15:16:58 +00:00
Eren Gölge
18da8f5dbd
Update pylint 2.10.2 and fix lint issues
2021-08-30 08:10:35 +00:00
Eren Gölge
f186856e5d
Add option to sort input sequence by audio len
2021-08-30 08:10:35 +00:00
Agrin Hilmkil
ced4cfdbbf
Allow saving / loading checkpoints from cloud paths (#683)
* Allow saving / loading checkpoints from cloud paths
Allows saving and loading checkpoints directly from cloud paths like
Amazon S3 (s3://) and Google Cloud Storage (gs://) by using fsspec.
Note: The user will have to install the relevant dependency for each
protocol. Otherwise fsspec will fail and specify which dependency is
missing.
* Append suffix _fsspec to save/load function names
* Add a lower bound to the fsspec dependency
Skips the 0 major version.
* Add missing changes from refactor
* Use fsspec for remaining artifacts
* Add test case with path requiring fsspec
* Avoid writing logs to file unless output_path is local
* Document the possibility of using paths supported by fsspec
* Fix style and lint
* Add missing lint fixes
* Add type annotations to new functions
* Use Coqpit method for converting config to dict
* Fix type annotation in semi-new function
* Add return type for load_fsspec
* Fix bug where fs not always created
* Restore the experiment removal functionality
2021-08-09 18:02:36 +00:00
Edresson
b1620d1f3f
remove ignore generate eval flag
2021-07-15 03:34:28 -03:00
Edresson
d906fea08c
lint fix and eval as argparse in extract tts spectrograms
2021-07-13 02:15:31 -03:00
Edresson
2e5baffa9c
Merge fix and eval split as argparse
2021-07-13 01:47:32 -03:00
Eren Gölge
9790eddada
Fix wrong argument name 🛠️
2021-06-28 17:03:47 +02:00
Eren Gölge
45947acb60
Update `TTS.bin` scripts for the new API
2021-06-28 17:03:47 +02:00
Eren Gölge
00c82c516d
rename to
2021-06-28 17:03:19 +02:00
Eren Gölge
d6b2b6add6
make style and linter fixes
2021-06-28 17:03:19 +02:00
Eren Gölge
802d461389
Compute d_vectors and speaker_ids separately in TTSDataset
2021-06-28 17:03:19 +02:00
Eren Gölge
db6a97d1a2
rename external speaker embedding arguments as `d_vectors`
2021-06-28 17:03:19 +02:00
Eren Gölge
b500338faa
make style
2021-06-28 17:03:19 +02:00
Eren Gölge
469d2e620a
update extract_tts_spectrogram for `cond_input` API of the models
2021-06-28 17:03:19 +02:00
Eren Gölge
5ab28fa618
update `extract_tts_spec...` using `SpeakerManager`
2021-06-28 17:03:19 +02:00
Eren Gölge
c392fa4288
update `extract_tts_spectrograms` for the new model API
2021-06-28 17:03:19 +02:00
Eren Gölge
8f47f95998
correct import of `load_meta_data`
remove redundant import
2021-06-28 17:03:19 +02:00
Eren Gölge
d25f017b42
update `setup_model.py` imports
2021-06-28 17:03:19 +02:00
Eren Gölge
42554cc711
rename MyDataset -> TTSDataset
2021-06-28 17:03:19 +02:00
Edresson Casanova
eb84bb2bc8
Merge branch 'dev' into dev
2021-06-26 15:32:19 -03:00
Michael Hansen
3f172b84d8
Fix linting issues
2021-06-25 14:41:31 +02:00
Adam Froghyar
b0aa189348
Forcing do_trim_silence to False in the extract TTS script
2021-06-14 10:44:00 +02:00
Eren Gölge
12722501bb
styling
2021-05-15 23:48:31 +02:00
Eren Gölge
715b0a65a0
update main.yml for python x64
fix test
2021-05-12 00:57:29 +02:00
Edresson
501c8e0302
remove unused vars on extract tts spectrograms script
2021-05-04 19:04:13 -03:00
Edresson
3ecd556bbe
add unit test for extract tts spectrograms script
2021-05-01 13:41:56 -03:00
Edresson
446b1da936
create inference function
2021-04-29 18:18:37 -03:00
Edresson
20e42a3381
add save audio option
2021-04-23 15:00:00 -03:00
Edresson
8228091f92
add script for extraction of tts spectrograms
2021-04-23 14:17:46 -03:00