a deep learning toolkit for Text-to-Speech, battle-tested in research and production

deep-learning glow-tts hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis tacotron text-to-speech tts tts-model vocoder voice-cloning voice-conversion voice-synthesis

rish106-hub f3c81e0b88 docs: improve README with better organization and comprehensive documentation		2025-03-15 15:03:14 +05:30
.github	Fix read_json_with_comments	2023-12-11 23:58:52 +01:00
TTS	Update generic_utils.py (#3561 )	2024-02-10 11:20:58 -03:00
dockerfiles	Introducing Development Dockerfile (#3263 )	2023-11-24 12:30:15 +01:00
docs	Update docs	2023-12-12 09:22:07 -03:00
images	update the readme	2021-04-28 21:49:44 +02:00
notebooks	Remove duplicate AudioProcessor code and fix ExtractTTSpectrogram.ipynb (#3230 )	2023-11-16 10:57:06 +01:00
recipes	Fix XTTS v2.0 training recipe (#3154 )	2023-11-07 14:16:44 +01:00
scripts	Ensure `tts` CLI tool readme and usage help is in sync	2023-09-26 15:38:56 +03:00
tests	Bug fix in MP3 and FLAC compute length on TTSDataset (#3092 )	2023-12-27 13:23:43 -03:00
.cardboardlint.yml	cardboard config update	2020-06-15 19:04:05 +02:00
.dockerignore	Doc update docker (#2153 )	2022-11-16 00:21:56 +01:00
.gitignore	XTTS v2.0 (#3137 )	2023-11-06 14:58:18 +01:00
.pre-commit-config.yaml	Adding OverFlow (#2183 )	2022-12-12 12:44:15 +01:00
.pylintrc	Update pylintrc	2023-06-21 11:57:33 +02:00
.readthedocs.yml	Update docs	2023-06-30 14:40:54 +02:00
CITATION.cff	Add CITATION.cff (#1404 )	2022-03-16 12:05:17 +01:00
CODE_OF_CONDUCT.md	Update CODE_OF_CONDUCT.md	2021-03-07 11:22:34 +01:00
CODE_OWNERS.rst	add CODE_OWNERS.rst	2020-12-17 16:44:50 +01:00
CONTRIBUTING.md	fixes a typo	2023-12-08 14:19:57 +00:00
Dockerfile	Introducing Development Dockerfile (#3263 )	2023-11-24 12:30:15 +01:00
LICENSE.txt	Create LICENSE.txt	2018-02-13 22:37:59 +01:00
MANIFEST.in	Implement VitsAudioConfig (#1556 )	2022-07-12 18:49:58 +02:00
Makefile	Remove coqui studio integration from TTS	2023-12-11 22:11:46 +01:00
README.md	docs: improve README with better organization and comprehensive documentation	2025-03-15 15:03:14 +05:30
hubconf.py	Fix #618	2021-07-24 11:23:55 +02:00
pyproject.toml	Merge pull request #2999 from akx/remove-unnecessary-black-config	2023-09-27 09:43:06 +02:00
requirements.dev.txt	🐍 Python 3.10.x support and drop Python 3.6 support (#1565 )	2022-05-12 15:50:25 +02:00
requirements.ja.txt	XTTS v1.1 (#3089 )	2023-10-20 16:02:08 +02:00
requirements.notebooks.txt	bumpup librosa version to 0.8.0	2021-05-03 14:25:09 +02:00
requirements.txt	Bug fix in MP3 and FLAC compute length on TTSDataset (#3092 )	2023-12-27 13:23:43 -03:00
run_bash_tests.sh	Fix aux tests (#1753 )	2022-07-19 10:06:31 +02:00
setup.cfg	Update requirements	2023-06-22 13:53:19 +02:00
setup.py	Make Japanese-specific dependencies optional (#2776 )	2023-07-24 11:28:27 +02:00

README.md

🐸 Coqui TTS - Advanced Text-to-Speech Toolkit

A comprehensive library for advanced Text-to-Speech generation

🔥 Latest Updates

📣 ⓍTTSv2 released with 16 languages and improved performance
📣 ⓍTTS fine-tuning code available
📣 ⓍTTS now supports streaming with <200ms latency
📣 Support for ~1100 Fairseq models
📣 Integration with 🐶Bark and 🐢Tortoise

View all updates

🚀 Quick Start

# Install TTS
pip install TTS

# Quick text-to-speech generation
python -c "from TTS.api import TTS; tts = TTS('tts_models/en/ljspeech/tacotron2-DDC'); tts.tts_to_file(text='Hello, this is a test.', file_path='output.wav')"

✨ Features

🌟 High-performance Deep Learning models
🌍 Support for 1100+ languages (via Fairseq models)
🎯 Production-ready performance
🔧 Easy-to-use API
📚 Comprehensive documentation
🛠️ Flexible training pipeline

💻 Installation

Requirements

Python >= 3.9, < 3.12
Operating Systems: Ubuntu 18.04+ (Primary), Windows, macOS
GPU (Optional but recommended for training)

Basic Installation

pip install TTS

Development Installation

git clone https://github.com/coqui-ai/TTS
cd TTS
# Quote the extras so shells such as zsh do not expand the brackets
pip install -e ".[all,dev,notebooks]"

Docker Installation

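docker run --rm -it -p 5002:5002 --entrypoint /bin/bash ghcr.io/coqui-ai/tts-cpu
python3 TTS/server/server.py --model_name tts_models/en/vctk/vits  # run inside the container to start the server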

Detailed Installation Guide

📖 Basic Usage

Simple Text-to-Speech

from TTS.api import TTS

# Initialize TTS
tts = TTS("tts_models/en/ljspeech/tacotron2-DDC")

# Generate speech
tts.tts_to_file("Hello world!", file_path="output.wav")

Multi-lingual Voice Cloning

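from TTS.api import TTS

# Load the multilingual XTTS v2 model
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# Clone the voice in speaker_wav and speak the text in English
tts.tts_to_file(
    text="Hello world!",
    speaker_wav="path/to/speaker.wav",
    language="en",
    file_path="output.wav",
)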

🎯 Available Models

Text-to-Speech Models

Model	Languages	Speed	Quality	GPU Memory
ⓍTTS v2	16	⭐⭐⭐	⭐⭐⭐⭐⭐	4GB+
YourTTS	13	⭐⭐⭐⭐	⭐⭐⭐⭐	2GB+
Tacotron 2	Any	⭐⭐	⭐⭐⭐	1GB+
FastSpeech 2	Any	⭐⭐⭐⭐⭐	⭐⭐⭐	1GB+

Complete Model List
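
The model names used in the usage examples appear to follow a `type/language/dataset/name` layout, for example `tts_models/en/ljspeech/tacotron2-DDC`. The sketch below splits a name into those four parts so a script can filter or validate names before loading a model. It assumes the four-part layout holds for every name. `ModelId` and `parse_model_id` are hypothetical helpers and are not part of the library.

```python
from typing import NamedTuple


class ModelId(NamedTuple):
    """Hypothetical container for the four parts of a model name."""
    model_type: str
    language: str
    dataset: str
    name: str


def parse_model_id(model_id: str) -> ModelId:
    """Split 'type/language/dataset/name' into its parts.

    Assumes the four-part layout seen in the usage examples.
    """
    parts = model_id.split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected 'type/language/dataset/name', got {model_id!r}")
    return ModelId(*parts)


if __name__ == "__main__":
    for model_id in (
        "tts_models/en/ljspeech/tacotron2-DDC",
        "tts_models/multilingual/multi-dataset/xtts_v2",
    ):
        parsed = parse_model_id(model_id)
        print(parsed.language, parsed.name)
```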

🚄 Performance Optimization

Hardware Requirements

Training: NVIDIA GPU with 8GB+ VRAM recommended
Inference: CPU or GPU (2GB+ VRAM)
RAM: 8GB minimum, 16GB recommended

Optimization Tips

Use batch processing for multiple inputs
Enable GPU acceleration when available
Implement caching for repeated phrases
Use quantized models for faster inference
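
The caching tip can be sketched as a small on-disk cache. This is one possible approach and not something the library provides: `PhraseCache` is a hypothetical helper, and the synthesis function is passed in so the cache itself does not depend on any particular model. The cache key includes the model name, so switching models does not return stale audio.

```python
import hashlib
from pathlib import Path
from typing import Callable


class PhraseCache:
    """Cache synthesized audio files on disk, keyed by model name and text."""

    def __init__(
        self,
        cache_dir: str,
        model_name: str,
        synthesize: Callable[[str, str], None],
    ) -> None:
        # synthesize(text, file_path) must write audio for `text` to `file_path`.
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.model_name = model_name
        self.synthesize = synthesize

    def _path_for(self, text: str) -> Path:
        # Hash model name and text together to get a stable file name.
        key = hashlib.sha256(f"{self.model_name}\n{text}".encode("utf-8")).hexdigest()
        return self.cache_dir / f"{key}.wav"

    def get(self, text: str) -> Path:
        """Return the audio path for `text`, synthesizing only on a cache miss."""
        path = self._path_for(text)
        if not path.exists():
            self.synthesize(text, str(path))
        return path
```

With a loaded `tts` object, the synthesis function could be `lambda text, path: tts.tts_to_file(text=text, file_path=path)`. A production cache would also need an eviction policy, which this sketch leaves out.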

🌐 Deployment

Production Setup

Load models during initialization
Implement proper error handling
Set up monitoring and logging
Use appropriate scaling strategies
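
The first three points can be illustrated with a thin wrapper that loads the model once at start-up, logs what it does, and turns synthesis failures into a logged error instead of a crash. This is a minimal sketch under stated assumptions: `SpeechService` is a hypothetical class, and the model is assumed to expose `tts_to_file(text=..., file_path=...)` as in the usage examples.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("tts_service")


class SpeechService:
    """Hypothetical wrapper: load the model once, then reuse it per request."""

    def __init__(self, load_model: Callable[[], Any]) -> None:
        # Loading happens here, at start-up, not on the first request.
        logger.info("loading model")
        self._model = load_model()
        logger.info("model ready")

    def synthesize(self, text: str, file_path: str) -> Optional[str]:
        """Write speech for `text` to `file_path`; return the path, or None on failure."""
        if not text.strip():
            logger.warning("rejected empty text")
            return None
        try:
            self._model.tts_to_file(text=text, file_path=file_path)
        except Exception:
            # Log the traceback and keep the service alive for the next request.
            logger.exception("synthesis failed")
            return None
        return file_path
```

A service could be created with `SpeechService(lambda: TTS("tts_models/en/ljspeech/tacotron2-DDC"))`, where `TTS` comes from `TTS.api`. Passing the loader as a callable keeps the wrapper testable without downloading a model.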

Docker Deployment

docker run -d --gpus all -p 5002:5002 --entrypoint python3 ghcr.io/coqui-ai/tts-gpu TTS/server/server.py --model_name tts_models/en/vctk/vits --use_cuda true
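
Once a server is listening on port 5002, a client can request audio over HTTP. The sketch below uses only the standard library. It assumes the server exposes a `/api/tts` endpoint that accepts a `text` query parameter and returns audio bytes; check the server documentation for the version you run. `build_tts_url` and `fetch_speech` are hypothetical helper names.

```python
import urllib.parse
import urllib.request


def build_tts_url(text: str, host: str = "localhost", port: int = 5002) -> str:
    """Build a request URL for the assumed /api/tts endpoint."""
    query = urllib.parse.urlencode({"text": text})
    return f"http://{host}:{port}/api/tts?{query}"


def fetch_speech(text: str, file_path: str, timeout: float = 60.0) -> None:
    """Request speech for `text` and save the response body to `file_path`."""
    with urllib.request.urlopen(build_tts_url(text), timeout=timeout) as resp:
        audio = resp.read()
    with open(file_path, "wb") as f:
        f.write(audio)
```

Calling `fetch_speech("Hello world!", "output.wav")` would then write the returned audio to `output.wav`, provided the server is running and the endpoint assumption holds.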

🛠 Contributing

Development Setup

Fork the repository
Set up development environment
Run tests: pytest tests/
Submit PR with detailed description

Contributing Guidelines

🤝 Community & Support

Get Help

Commercial Support

Contact Coqui

🔒 Security

Best Practices

Keep models and dependencies updated
Use environment variables for sensitive data
Implement proper API authentication
Monitor for unusual usage patterns
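
The points on environment variables and API authentication can be combined in a small token check. This is a minimal sketch and not a complete authentication scheme: the variable name `TTS_API_TOKEN` and both helper functions are hypothetical. The comparison uses `hmac.compare_digest` so that the check takes the same time whether or not the tokens match.

```python
import hmac
import os


def load_api_token(var_name: str = "TTS_API_TOKEN") -> str:
    """Read the expected token from the environment; fail fast if it is missing."""
    token = os.environ.get(var_name, "")
    if not token:
        raise RuntimeError(f"environment variable {var_name} is not set")
    return token


def is_authorized(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking timing information."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A request handler would call `load_api_token()` once at start-up and `is_authorized(...)` on each request, rejecting the request when it returns `False`.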

Security Policy

📚 Citation

@misc{coqui-ai-tts,
  author = {Eren Gölge and others},
  title = {🐸TTS - a deep learning toolkit for Text-to-Speech},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/coqui-ai/TTS}},
}

📊 Performance Benchmarks

🌍 Language Support

16 primary languages with ⓍTTS v2
1100+ languages via Fairseq models
Support for custom language training

Language Documentation

📁 Directory Structure

|- notebooks/       # Jupyter Notebooks for examples
|- TTS/
    |- bin/        # Training scripts
    |- tts/        # Core TTS models
    |- vocoder/    # Vocoder models
    |- utils/      # Utilities

For more detailed information, visit our Documentation.