Commit Graph

438 Commits

Author SHA1 Message Date
Eren Gölge ae6405bb76 Docstrings for `Trainer` 2021-06-28 17:03:47 +02:00
Eren Gölge d42d1c02ea Use `torch.linalg.qr` for pytorch > `v1.9.0` 2021-06-28 17:03:47 +02:00
Eren Gölge 9790eddada Fix wrong argument name 🛠️ 2021-06-28 17:03:47 +02:00
Eren Gölge 932ab107ae Docstring edit in `TTSDataset.py` ✍️ 2021-06-28 17:03:47 +02:00
Eren Gölge 8c74f054f0 Enable support for 🐍 python 3.10; Bump up versions numpy 1.19.5 and TF 2.5.0 2021-06-28 17:03:47 +02:00
Eren Gölge 9455a2b01e Apply small fixes for API compatibility 2021-06-28 17:03:47 +02:00
Eren Gölge a5d5bc9063 Print `max_decoder_steps` when model reaches the limit 2021-06-28 17:03:47 +02:00
Eren Gölge f23b228e24 Update `speaker_manager` 2021-06-28 17:03:47 +02:00
Eren Gölge 51005cdab4 Update `tts.models.setup_model` 2021-06-28 17:03:19 +02:00
Eren Gölge 7b8c15ac49 Create base 🐸TTS model abstraction for tts models 2021-06-28 17:03:19 +02:00
Eren Gölge 786170fe7d Update tts model configs 2021-06-28 17:03:19 +02:00
Eren Gölge 98298ee671 Implement unified IO utils 2021-06-28 17:03:19 +02:00
Eren Gölge c7aad884cd Implement unified trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 6d7b5fbcde `tts` model abstraction with `TTSModel` 2021-06-28 17:03:19 +02:00
Eren Gölge d4dbd89752 fix calculation of `loader_start_time` 2021-06-28 17:03:19 +02:00
Eren Gölge c754a0e17d `TrainerAbstract` and related updates for `TrainerTTS` 2021-06-28 17:03:19 +02:00
Eren Gölge 00c82c516d rename to 2021-06-28 17:03:19 +02:00
Eren Gölge 166f0aeb9a merge if branches with the same implementation 2021-06-28 17:03:19 +02:00
Eren Gölge 03494ad642 adjust `distribute.py` for the `train_tts.py` 2021-06-28 17:03:19 +02:00
Eren Gölge fdfb18d230 downsize melgan test model size 2021-06-28 17:03:19 +02:00
Eren Gölge 25238e0658 fix glow-tts `inference()` 2021-06-28 17:03:19 +02:00
Eren Gölge 419735f440 refactor and fix multi-speaker training in Trainer and Tacotron models 2021-06-28 17:03:19 +02:00
Eren Gölge 269e5a734e add max_decoder_steps argument to tacotron models 2021-06-28 17:03:19 +02:00
Eren Gölge 2c38ef8441 use get_speaker_manager in Trainer and save speakers.json file when needed 2021-06-28 17:03:19 +02:00
Eren Gölge 802d461389 Compute d_vectors and speaker_ids separately in TTSDataset 2021-06-28 17:03:19 +02:00
Eren Gölge db6a97d1a2 rename external speaker embedding arguments as `d_vectors` 2021-06-28 17:03:19 +02:00
Eren Gölge 9042ae9195 use `to_cuda()` for moving data in `format_batch()` 2021-06-28 17:03:19 +02:00
Eren Gölge f82f1970b8 change `to(device)` to `type_as` in models 2021-06-28 17:03:19 +02:00
Eren Gölge 1fa15c195a docstring fix 2021-06-28 17:03:19 +02:00
Eren Gölge 1c8a3d7c86 make style 2021-06-28 17:03:19 +02:00
Eren Gölge 30211512a4 fix type annotations 2021-06-28 17:03:19 +02:00
Eren Gölge b22b7620c3 update glow-tts output shapes to match [B, T, C] 2021-06-28 17:03:19 +02:00
Eren Gölge 8381379938 formating `cond_input` with a function in Tacotron models 2021-06-28 17:03:19 +02:00
Eren Gölge 6c495c6a6e fix glow-tts inference and forward functions for handling `cond_input` and refactor its test 2021-06-28 17:03:19 +02:00
Eren Gölge f840268181 refactor `SpeakerManager` 2021-06-28 17:03:19 +02:00
Eren Gölge 421194880d linter fixes 2021-06-28 17:03:19 +02:00
Eren Gölge d96ebcd6d3 make style 2021-06-28 17:03:19 +02:00
Eren Gölge b500338faa make style 2021-06-28 17:03:19 +02:00
Eren Gölge c680a07a20 fix `Synthesized` for the new `synthesis()` 2021-06-28 17:03:19 +02:00
Eren Gölge bb355b7441 update align_tts.py model for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 9203b863d9 update align_tts_loss for trainer 2021-06-28 17:03:19 +02:00
Eren Gölge fc9a0fb8ce update aling_tts_config for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge b8a4af4010 update `synthesis.py` for being more generic 2021-06-28 17:03:19 +02:00
Eren Gölge c70d0c9dae update `speedy_speech.py` model for trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 06ee57d816 update `speedy_speecy_config.py` for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 4e910993f1 update tacotron model to return `model_outputs` 2021-06-28 17:03:19 +02:00
Eren Gölge bb4deee64c update glow-tts for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 9134c7dfb6 update `sequence_mask` import globally 2021-06-28 17:03:19 +02:00
Eren Gölge b2218e882a update `glow_tts_config.py` for setting the optimizer and the scheduler 2021-06-28 17:03:19 +02:00
Eren Gölge f4f83b6379 update `synthesis.py` for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge 130781dab6 remove `tts.generic_utils` as all the functions are moved to other files 2021-06-28 17:03:19 +02:00
Eren Gölge 535a458f40 update Tacotron models for the trainer 2021-06-28 17:03:19 +02:00
Eren Gölge bdbfc95618 add `gradual_training` argument to tacotron.py 2021-06-28 17:03:19 +02:00
Eren Gölge 5a2e75f0ee import missings for tacotron.py 2021-06-28 17:03:19 +02:00
Eren Gölge da7d10e53c mode `setup_model()` to `models/__init__.py` 2021-06-28 17:03:19 +02:00
Eren Gölge ca302db7b0 add sequence_mask to `utils.data` 2021-06-28 17:03:19 +02:00
Eren Gölge 844abb3b1d `setup_loss()` in `layer/__init__.py` 2021-06-28 17:03:19 +02:00
Eren Gölge a20a1c7d06 rename preprocess.py -> formatters.py 2021-06-28 17:03:19 +02:00
Eren Gölge b9bccbb243 move load_meta_data and related functions to `datasets/__init__.py` 2021-06-28 17:03:19 +02:00
Eren Gölge d09385808a set test_sentences in config 2021-06-28 17:03:19 +02:00
Eren Gölge 8def3c87af trainer-API updates 2021-06-28 17:03:19 +02:00
Eren Gölge 42554cc711 rename MyDataset -> TTSDataset 2021-06-28 17:03:19 +02:00
Edresson 1c4e806f54 use speaker manager on compute embeddings script 2021-06-27 03:35:34 -03:00
Edresson Casanova eb84bb2bc8 Merge branch 'dev' into dev 2021-06-26 15:32:19 -03:00
Michael Hansen 3f172b84d8 Fix linting issues 2021-06-25 14:41:31 +02:00
Michael Hansen 4d8426fa0a Use eSpeak IPA lexicons by default for phoneme models 2021-06-25 14:41:05 +02:00
Michael Hansen 618b509204 Use combined characters available in TTS phonemes (like ç) 2021-06-25 14:41:05 +02:00
Michael Hansen da6f6a4a01 Update docstring for clean_gruut_phonemes 2021-06-25 14:41:05 +02:00
Michael Hansen 47191f3ecc Add tests for gruut phonemization 2021-06-25 14:41:05 +02:00
Michael Hansen 67869e77f9 Use gruut for phonemization 2021-06-25 14:41:05 +02:00
Edresson 28bec238ca fix Lint checks 2021-06-18 14:33:50 -03:00
Edresson 83644056e3 fix Lint checks 2021-06-18 14:32:28 -03:00
Edresson Casanova e78e3cd81e Merge branch 'dev' into dev 2021-06-18 14:10:03 -03:00
Edresson b74b510d3c Compute embeddings and find characters using config file 2021-06-18 14:04:49 -03:00
Eren Gölge 49c5e5d820 maket style japanese PR 2021-06-02 11:44:46 +02:00
Eren Gölge 73b4083c6c Merge pull request #502 from kaiidams/kaiidams/kokoro: Japanese Tacotron 2 model 2021-06-02 10:20:08 +02:00
Alexander Korolev c1eb9bdcca fix speaker dim inference 2021-06-01 15:15:26 +02:00
Katsuya Iida 1cc18d1972 Move unittest of Japanese phonemizer. 2021-06-01 18:51:34 +09:00
Alexander Korolev 5b89ef2c6e fix speaker-embeddings dimension during inference 2021-06-01 11:06:35 +02:00
Katsuya Iida c4a5a73f18 update Kokoro config 2021-05-29 19:17:27 +09:00
Katsuya Iida 3a9ac2de4a Merge remote-tracking branch 'coqui-ai/main' into kaiidams/kokoro 2021-05-29 09:39:23 +09:00
Katsuya Iida d0c9c1ca5c Move TTS/tts/utils/japanese 2021-05-29 09:21:47 +09:00
Edresson 099142d4dd bug fix 2021-05-27 21:50:56 -03:00
Katsuya Iida c4987e9d4e Move import at the head of the file. 2021-05-28 00:22:57 +09:00
Eren Gölge 925c08cf95 replace unidecode with anyascii 2021-05-27 14:02:44 +02:00
Eren Gölge c6f22aaa67 fix #509 2021-05-27 13:09:15 +02:00
Katsuya Iida f921a05bdb Fixed lint errors 2021-05-26 19:02:16 +09:00
Katsuya Iida 0536aa6d0f Japanese Tacotron 2 model 2021-05-22 17:12:19 +09:00
Eren Gölge 5482a0f62d type def for gradual_training 2021-05-19 14:03:26 +02:00
Eren Gölge df6a98d0c3 type def for gradual_training 2021-05-19 14:00:44 +02:00
Eren Gölge 8a7c40736c set use_phonemes false 2021-05-19 01:27:26 +02:00
Eren Gölge ccfaa6b1d5 add `needs_phonemizer` field to models.json. If set true these models are only compatible with v0.0.13 or below. 2021-05-18 17:57:28 +02:00
Eren Gölge a14fcf2a13 remove text_processing test 2021-05-18 17:57:28 +02:00
Eren Gölge d7fae3f515 remove all espeaker and phonemizer deps 2021-05-18 17:57:28 +02:00
Eren Gölge ced05e812a move chinese phonemizer 2021-05-18 17:57:28 +02:00
Eren Gölge 218af1d9a2 change `list` to `List` in config 2021-05-18 17:30:27 +02:00
Eren Gölge d1b469935d tacotron DDC LJSpeech recipe 2021-05-17 11:42:14 +02:00
Eren Gölge 34a42d379f update tacotron_config.py for checking `r` and the docstring 2021-05-17 11:35:30 +02:00
Eren Gölge 12722501bb styling 2021-05-15 23:48:31 +02:00
Eren Gölge 8b1014d188 add docstrings with default value fixes 2021-05-15 23:45:10 +02:00