a deep learning toolkit for Text-to-Speech, battle-tested in research and production

🐸 Coqui TTS - Advanced Text-to-Speech Toolkit

A comprehensive library for advanced Text-to-Speech generation

🔥 Latest Updates

  • 📣 ⓍTTSv2 released with 16 languages and improved performance
  • 📣 ⓍTTS fine-tuning code available
  • 📣 ⓍTTS now supports streaming with <200ms latency
  • 📣 Support for ~1100 Fairseq models
  • 📣 Integration with 🐶Bark and 🐢Tortoise

View all updates

🚀 Quick Start

# Install TTS
pip install TTS

# Quick text-to-speech generation
python -c "from TTS.api import TTS; tts = TTS('tts_models/multilingual/multi-dataset/xtts_v2'); tts.tts_to_file(text='Hello, this is a test!', file_path='output.wav')"

Features

  • 🌟 High-performance Deep Learning models
  • 🌍 Support for 1100+ languages via Fairseq models
  • 🎯 Production-ready performance
  • 🔧 Easy-to-use API
  • 📚 Comprehensive documentation
  • 🛠️ Flexible training pipeline

💻 Installation

Requirements

  • Python >= 3.9, < 3.12
  • Operating Systems: Ubuntu 18.04+ (Primary), Windows, macOS
  • GPU (Optional but recommended for training)
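The Python version range above can be checked before installing. The following is a minimal sketch using only the standard library; the function name `is_supported_python` is hypothetical and not part of TTS.

```python
import sys

# Supported range copied from the requirements list: >= 3.9 and < 3.12.
MIN_VERSION = (3, 9)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is inside the supported range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE
```

Calling `is_supported_python()` with no arguments checks the interpreter that is currently running.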

Basic Installation

pip install TTS

Development Installation

git clone https://github.com/coqui-ai/TTS
cd TTS
pip install -e ".[all,dev,notebooks]"

Docker Installation

docker run --rm -it -p 5002:5002 ghcr.io/coqui-ai/tts-cpu

Detailed Installation Guide

📖 Basic Usage

Simple Text-to-Speech

from TTS.api import TTS

# Initialize TTS
tts = TTS("tts_models/en/ljspeech/tacotron2-DDC")

# Generate speech
tts.tts_to_file("Hello world!", file_path="output.wav")

Multi-lingual Voice Cloning

from TTS.api import TTS

# Load the multilingual XTTS v2 model
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
tts.tts_to_file(
    text="Hello world!",
    speaker_wav="path/to/speaker.wav",
    language="en",
    file_path="output.wav"
)
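To reuse one reference voice across several texts and languages, the call above can be wrapped in a loop. This is a sketch only: `clone_voice_batch` and its file naming scheme are hypothetical, and it assumes `tts` is an object exposing `tts_to_file` with the keyword arguments shown in the example above.

```python
from pathlib import Path


def clone_voice_batch(tts, speaker_wav, items, out_dir):
    """Synthesize each (text, language) pair with the same reference voice.

    `tts` is assumed to expose tts_to_file(text=, speaker_wav=, language=,
    file_path=) as in the example above. Returns the list of output paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for index, (text, language) in enumerate(items):
        # Hypothetical naming scheme: zero-padded index plus language code.
        file_path = out_dir / f"{index:03d}_{language}.wav"
        tts.tts_to_file(
            text=text,
            speaker_wav=speaker_wav,
            language=language,
            file_path=str(file_path),
        )
        outputs.append(file_path)
    return outputs
```

For example, `clone_voice_batch(tts, "path/to/speaker.wav", [("Hello world!", "en"), ("Bonjour le monde !", "fr")], "out")` would write one file per pair into `out/`.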

🎯 Available Models

Text-to-Speech Models

| Model | Languages | Speed | Quality | GPU Memory |
|---|---|---|---|---|
| ⓍTTS v2 | 16 | | | 4GB+ |
| YourTTS | 13 | | | 2GB+ |
| Tacotron 2 | Any | | | 1GB+ |
| FastSpeech 2 | Any | | | 1GB+ |

Complete Model List

🚄 Performance Optimization

Hardware Requirements

  • Training: NVIDIA GPU with 8GB+ VRAM recommended
  • Inference: CPU or GPU (2GB+ VRAM)
  • RAM: 8GB minimum, 16GB recommended

Optimization Tips

  • Use batch processing for multiple inputs
  • Enable GPU acceleration when available
  • Implement caching for repeated phrases
  • Use quantized models for faster inference
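The caching tip can be sketched as a small on-disk cache keyed by a hash of the text. `PhraseCache` is a hypothetical helper, not part of TTS; it assumes the caller supplies a `synthesize(text, file_path)` function that writes the audio file.

```python
import hashlib
from pathlib import Path


class PhraseCache:
    """Reuse synthesized audio for phrases that have been requested before."""

    def __init__(self, cache_dir, synthesize):
        # `synthesize(text, file_path)` is supplied by the caller (assumption)
        # and must write the audio for `text` to `file_path`.
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.synthesize = synthesize

    def _path_for(self, text, voice=""):
        # Hash the voice label and text so different voices never collide.
        key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
        return self.cache_dir / f"{key}.wav"

    def get(self, text, voice=""):
        """Return the cached file for `text`, synthesizing it on a cache miss."""
        path = self._path_for(text, voice)
        if not path.exists():
            self.synthesize(text, str(path))
        return path
```

With a loaded model, the cache could be wired up as `PhraseCache("tts_cache", lambda text, path: tts.tts_to_file(text=text, file_path=path))`. The sketch does not evict old entries, so a long-running service would need its own cleanup policy.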

🌐 Deployment

Production Setup

  1. Load models during initialization
  2. Implement proper error handling
  3. Set up monitoring and logging
  4. Use appropriate scaling strategies
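Steps 1 to 3 above can be sketched as a thin service wrapper. The names `TTSService` and `SynthesisError` are hypothetical, and `load_model` is assumed to return an object with a `tts_to_file(text=, file_path=)` method such as the `TTS` instance from the usage examples.

```python
import logging
import time

logger = logging.getLogger("tts_service")


class SynthesisError(RuntimeError):
    """Raised when a synthesis request cannot be completed."""


class TTSService:
    def __init__(self, load_model):
        # Step 1: load the model once at startup, not on every request.
        self.model = load_model()
        logger.info("model loaded")

    def synthesize(self, text, file_path):
        # Step 2: reject bad input early and wrap model failures.
        if not text or not text.strip():
            raise ValueError("text must not be empty")
        start = time.perf_counter()
        try:
            self.model.tts_to_file(text=text, file_path=file_path)
        except Exception as exc:
            logger.exception("synthesis failed")
            raise SynthesisError(str(exc)) from exc
        # Step 3: log request size and latency for monitoring.
        elapsed = time.perf_counter() - start
        logger.info("synthesized %d chars in %.2fs", len(text), elapsed)
        return file_path
```

Passing the loader as a function, for example `TTSService(lambda: TTS("tts_models/en/ljspeech/tacotron2-DDC"))`, keeps the wrapper testable without downloading a model.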

Docker Deployment

docker run -d --gpus all -p 5002:5002 ghcr.io/coqui-ai/tts-gpu
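Once a server is listening on port 5002, a client needs a request URL. The sketch below only builds that URL. The `/api/tts` path and the `text` query parameter are assumptions about the server's HTTP interface and should be checked against the server documentation; `build_tts_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_tts_url(text, host="localhost", port=5002, path="/api/tts"):
    """Build a GET URL for a synthesis request.

    `path` and the `text` parameter name are assumptions, not confirmed
    by this README; adjust them to match the deployed server.
    """
    query = urlencode({"text": text})
    return f"http://{host}:{port}{path}?{query}"
```

The resulting URL can be fetched with any HTTP client; the port matches the `-p 5002:5002` mapping in the command above.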

🛠 Contributing

Development Setup

  1. Fork the repository
  2. Set up development environment
  3. Run tests: pytest tests/
  4. Submit PR with detailed description

Contributing Guidelines

🤝 Community & Support

Get Help

Commercial Support

🔒 Security

Best Practices

  • Keep models and dependencies updated
  • Use environment variables for sensitive data
  • Implement proper API authentication
  • Monitor for unusual usage patterns
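Two of these practices, environment variables for secrets and API authentication, can be sketched with the standard library. The variable name `TTS_API_KEY` and both function names are hypothetical; a real deployment would plug this check into its web framework.

```python
import hmac
import os


def load_api_key(env_var="TTS_API_KEY"):
    """Read the expected API key from the environment instead of source code."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def is_authorized(presented_key, expected_key):
    """Compare keys in constant time to avoid leaking information via timing."""
    if not presented_key:
        return False
    return hmac.compare_digest(
        presented_key.encode("utf-8"), expected_key.encode("utf-8")
    )
```

A request handler would call `is_authorized(request_key, load_api_key())` and reject the request when it returns `False`.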

Security Policy

📚 Citation

@misc{coqui-ai-tts,
  author = {Eren Gölge and others},
  title = {🐸TTS - a deep learning toolkit for Text-to-Speech},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/coqui-ai/TTS}},
}

📊 Performance Benchmarks

🌍 Language Support

  • 16 primary languages with ⓍTTS v2
  • 1100+ languages via Fairseq models
  • Support for custom language training

Language Documentation

📁 Directory Structure

|- notebooks/       # Jupyter Notebooks for examples
|- TTS/
    |- bin/        # Training scripts
    |- tts/        # Core TTS models
    |- vocoder/    # Vocoder models
    |- utils/      # Utilities

For more detailed information, visit our Documentation.