1723 Commits

Author SHA1 Message Date
Eren Gölge 49cf6a5d62 Bump up to v0.14.3 2023-06-06 09:41:59 +02:00
Eren Gölge 8e415732dd Fixup 2023-06-06 09:41:46 +02:00
Eren Gölge 547a72c97d Fixup 2023-06-05 22:38:56 +02:00
Eren Gölge a494f0c92a Bump up to v0.14.1 2023-06-05 11:29:10 +02:00
Eren Gölge 50b1074779 Make `tts` ready 2023-06-05 11:29:10 +02:00
Eren Gölge e785d101a1
Port Fairseq TTS models (#2628)
* Load fairseq models

* Add docs and missing files

* Managing fairseq models and docs for API

* Make style

* Use scarf URL

* Add tests

* Fix URL

* Pass cpu

* Make lint

* Fixup

* Make lint

* fixup

* Fixup

* Change tokenization order

* Update README

* Fixup

* Fixup
2023-06-05 11:15:13 +02:00
Shukrullo Turgunov 0d5e68a09f
fix typo (#2647)
* fix typo

* typo fix
2023-06-05 09:58:16 +02:00
Reuben Morais 23a7a9a363
Fetch all built-in speakers (#2626) 2023-05-22 17:28:08 +02:00
Eren Gölge aef7f6d980 Bump up to v0.14.1 2023-05-18 11:13:09 +02:00
Eren Gölge 9e99e0f42d Disable reduction 2023-05-18 11:12:51 +02:00
Eren Gölge bc0a532c7a
Bump up to v0.14.0 2023-05-16 10:08:41 +02:00
Eren Gölge 4de797bb11
Draft ONNX export for VITS (#2563)
* Draft ONNX export for VITS

Could not get it work to output variable length sequence

* Fixup for onnx constant output

* Make style

* Remove commented code
2023-05-16 01:07:56 +02:00
manmay nakhashi a3d5801c44
Tortoise TTS inference (#2547)
* initial commit

* Tortoise inference

* revert path change

* style fix

* remove accidental remove

* style fixes

* style fixes

* removed unwanted assests and deps

* remove changes

* remove cvvp

* style fix black

* added tortoise config and updated config and args, refactoring the code

* added tortoise to api

* Pull mel_norm from url

* Use TTS cleaners

* Let download model files

* add ability to pass tortoise presets through coqui api

* fix tests

* fix style and tests

* fix tts commandline for tortoise

* Add config.json to tortoise

* Use kwargs

* Use regular model api for loading tortoise

* Add load from dir to synthesizer

* Fix Tortoise floats

* Use model_dir when there are multiple urls

* Use `synthesize` when exists

* lint fixes and resolve preset bug

* resolve a download bug and update model link

* fix json

* do tortoise inference from voice dir

* fix

* fix test

* fix speaker id and remove assests

* update inference_tests.yml

* replace inference_test.yml

* fix extra dir as None

* fix tests

* remove space

* Reformat docstring

* Add docs

* Update docs

* lint fixes

---------

Co-authored-by: Eren Gölge <egolge@coqui.ai>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-05-16 00:58:21 +02:00
Eren Gölge 9b5822d625
Update VAD for silence trimming. (#2604)
* Update vad for mp3 and fault tolerance

* Make style

* Remove importt

* Remove stupid defaults
2023-05-11 11:09:23 +02:00
Eren Gölge dfb51e06b2
Add jenny model (#2603) 2023-05-08 12:05:40 +02:00
Michael Görner 27e237ed08
use default_factory for audio parameter (#2576)
Python 3.11 complains about the mutable default and other members
were already adapted to use the factory, so I expect this line just
went unnoticed until now.
2023-05-08 11:17:36 +02:00
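Note on the commit above: the message refers to the dataclass rule that a mutable object may not be used directly as a field default, since every instance would share it. The sketch below illustrates the `default_factory` pattern under stated assumptions. `AudioConfig`, `ModelConfig`, and `Broken` are hypothetical stand-ins rather than the project's actual classes, and a plain list default is used to trigger the error because that case is rejected on every Python version that ships `dataclasses`.

```python
from dataclasses import dataclass, field


@dataclass
class AudioConfig:
    """Hypothetical stand-in for an audio settings object."""
    sample_rate: int = 22050


@dataclass
class ModelConfig:
    """Hypothetical config that owns its own AudioConfig."""
    # default_factory builds a fresh AudioConfig for every ModelConfig,
    # so instances never share one mutable default object.
    audio: AudioConfig = field(default_factory=AudioConfig)


def mutable_default_rejected() -> bool:
    """Return True if dataclasses refuses a mutable default value."""
    try:
        @dataclass
        class Broken:
            items: list = []  # rejected when the class is created
    except ValueError:
        return True
    return False


first = ModelConfig()
second = ModelConfig()
first.audio.sample_rate = 16000  # changes only the first instance
```

With the factory in place, `second.audio.sample_rate` keeps its default after `first` is modified, because each instance received its own `AudioConfig`.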
prakharpbuf c1875f68df
typos and minor fixes (#2508)
* Update tacotron1-2.md

* Update README.md

* Update Tutorial_2_train_your_first_TTS_model.ipynb

* Update synthesizer.py

There is no arg called --speaker_name

* Update formatting_your_dataset.md

* Update AnalyzeDataset.ipynb

* Update AnalyzeDataset.ipynb

* Update AnalyzeDataset.ipynb

* Update finetuning.md

* Update train_yourtts.py

* Update train_yourtts.py

* Update train_yourtts.py

* Update finetuning.md
2023-04-26 15:22:57 +02:00
Eren Gölge 2071088bab
Bump up to v0.13.3 2023-04-17 16:13:35 +02:00
Eren Gölge 1a6a5710fd Make lint 2023-04-17 15:02:56 +02:00
Eren Gölge a44a0e1fd2 Update model urls 2023-04-17 14:53:27 +02:00
Eren Gölge 2533a18d62 Add BN tests 2023-04-17 13:37:10 +02:00
Eren Gölge 2d49c05259 Remove import 2023-04-17 13:05:29 +02:00
Eren Gölge 5e5768d784 Fix API 2023-04-17 13:05:19 +02:00
Eren Gölge cd83991067 Add BN phonemizer 2023-04-17 12:54:00 +02:00
Eren Gölge 36be05290d Add models 2023-04-17 12:52:32 +02:00
Eren Gölge e4c5c27854
Bump up to v0.13.2 2023-04-14 10:23:39 +02:00
Eren Gölge dba5cec497
Merge pull request #2509 from coqui-ai/update_vad
Update VAD
2023-04-13 19:35:17 +02:00
Eren Gölge 5a9bda13f3 Make style 2023-04-13 14:19:06 +02:00
Eren Gölge c9375e4b8b Make style 2023-04-13 14:17:06 +02:00
Eren Gölge 758ef84cc2 Using 🐸Studio models with `tts` command 2023-04-13 14:14:41 +02:00
Eren Gölge 537dc0e933 Update VAD 2023-04-13 00:39:46 +02:00
Eren Gölge e33e7170ed Bump up to v0.13.1 2023-04-12 16:20:53 +02:00
Eren Gölge 8da3342676 Ping API 2023-04-12 16:20:53 +02:00
Eren Gölge cbb592b295 Fixup 2023-04-10 14:50:11 +02:00
Eren Gölge b8b9f09de5 Fixup 2023-04-10 14:06:31 +02:00
Eren Gölge a49c1931d9 Fixup 2023-04-10 13:33:42 +02:00
Eren Gölge 5bd1fb6b2c Fix API for voice conversion 2023-04-10 13:32:16 +02:00
Eren Gölge 30109af2a0
Merge pull request #2480 from MattyB95/librosa_v0.10.0
Update Librosa Version To V0.10.0
2023-04-07 12:32:33 +02:00
Eren Gölge 1233365cf4 Bump up to v0.13.0 2023-04-05 15:09:31 +02:00
Eren Gölge ad8b9bf2be
🐸 Coqui Studio API integration (#2484)
* Warn when lang is not avail

* Make style

* Implement Coqui Studio API

* Test

* Update docs

* Set action

* Make style

* Make lint

* Update README

* Make style

* Fix action

* Run actions
2023-04-05 15:06:50 +02:00
Matthew Boakes 4c829e74a1 Update Librosa Version To V0.10.0 2023-04-05 00:59:20 +01:00
Yingzhi WANG 95fa2c9fd6
fix typo (#2475) 2023-04-03 23:31:09 +02:00
p0p 91cf1b2da9
[minor] batch["speaker_ids"] getting set two times (#2470)
* [minor] batch["speaker_ids"] getting set two times

just to make it consistent with language_ids

* Update vits.py

style.
2023-04-03 11:35:21 +02:00
Rajiv P c2d15cd413
[minor] hifigan_generator.py typo (#2462)
resblock2 description updated.
2023-03-28 12:43:36 +02:00
Eren Gölge d309f50e53
Implement FreeVC (#2451)
* Update .gitignore

* Draft FreeVC implementation

* Tests and relevant updates

* Update API tests

* Add missings

* Update requirements

* :(

* Lazy handle for vc

* Update docs for voice conversion

* Make style
2023-03-25 18:33:23 +01:00
Khalid Bashir 14c80dd1fd
vits.py training fixed due to return_complex (#2418)
Torch set default value for `return_complex=True` for `torch.stft` method
This turned warning into error:-
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1591, in fit
    self._fit()
  File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1544, in _fit
    self.train_epoch()
  File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1309, in train_epoch
    _, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time)
  File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1162, in train_step
    outputs, loss_dict_new, step_time = self._optimize(
  File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1023, in _optimize
    outputs, loss_dict = self._model_train_step(batch, model, criterion, optimizer_idx=optimizer_idx)
  File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 970, in _model_train_step
    return model.train_step(*input_args)
  File "/workspace/coqui-tts/TTS/tts/models/vits.py", line 1293, in train_step
    mel_slice_hat = wav_to_mel(
  File "/workspace/coqui-tts/TTS/tts/models/vits.py", line 191, in wav_to_mel
    spec = torch.stft(
  File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 641, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.
```
2023-03-19 00:22:04 +01:00
Eren Gölge 2db262747e
Bump up to v0.12.0 2023-03-17 13:21:03 +01:00
Roee Shenberg 3c15f0619a
Bug fixes in OverFlow audio generation (#2380) 2023-03-15 12:02:11 +01:00
Daniel Vera Nieto dfb48737fb Style fixed 2023-03-13 16:11:15 +01:00
Dani Vera 0d12229b64
Update vits.py
This should fix the issue https://github.com/coqui-ai/TTS/issues/1986 without breaking batch data sampling.
2023-03-10 18:35:16 +01:00