5.5 KiB
TTS Example Scripts
This directory contains example scripts demonstrating how to use the TTS (Text-to-Speech) system.
Available Scripts
simple_tts.py
- The simplest way to use TTS with minimal setupquick_tts.py
- Command-line interface for quick text-to-speech conversioninteractive_tts.py
- Interactive script with speaker selection and multi-line text inputexample_tts.py
- Basic example showing TTS functionality
Usage
Simple TTS
python simple_tts.py
This will convert the default text to speech using speaker p335.
Quick TTS
python quick_tts.py "Your text goes here"
Converts command-line text to speech immediately.
Interactive TTS
python interactive_tts.py
Provides an interactive interface where you can:
- Choose from available speakers
- Enter multi-line text
- Generate speech with custom output filenames
Output
All scripts generate WAV files that can be played with any media player.
simple_tts.py
generatesspeech_[speaker_id].wav
quick_tts.py
generatesspeech_output.wav
interactive_tts.py
generatesspeech_output_[number].wav
Requirements
- Python 3.9+
- TTS library
- espeak-ng (required for phonemization)
Installation
1. Python Environment Setup
# Create a virtual environment
python -m venv venv
# Activate virtual environment
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
# Upgrade pip
python -m pip install --upgrade pip
2. Install Dependencies
# Install TTS library
pip install TTS
# Install system dependencies
# For macOS
brew install espeak-ng
# For Ubuntu/Debian
sudo apt-get install espeak-ng
Examples
Basic Text-to-Speech
from TTS.api import TTS
# Initialize TTS
tts = TTS(model_name="tts_models/en/vctk/vits")
# Simple conversion
tts.tts_to_file(text="Hello, world!", file_path="output.wav", speaker="p335")
Multi-line Text
text = """
This is a multi-line text example.
It will be converted to speech with proper pauses.
You can use it for longer content like articles or books.
"""
Different Speakers
# List available speakers
tts = TTS(model_name="tts_models/en/vctk/vits")
print("Available speakers:", tts.speakers)
# Try different speakers
tts.tts_to_file(text="Same text, different voice.", file_path="speaker1.wav", speaker="p227")
tts.tts_to_file(text="Same text, different voice.", file_path="speaker2.wav", speaker="p228")
Configuration
Model Options
- Model Name:
tts_models/en/vctk/vits
- Sample Rate: 22050 Hz
- Speaker IDs: p225-p376 available
- Language: English
Audio Settings
# Available audio settings
settings = {
"sample_rate": 22050, # Audio sample rate
"output_format": "wav", # Output format (wav, mp3)
"speed": 1.0, # Speech speed (0.5-2.0)
}
Performance Tips
-
Memory Usage
- Batch processing for multiple files
- Clear cache between large generations
- Monitor system resources
-
Speed Optimization
- Use CPU for small tasks
- Enable GPU for batch processing
- Cache model for repeated use
Development
Setting up Development Environment
# Clone the repository
git clone <repository-url>
cd TTS
# Create development environment
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
# Install development dependencies
pip install -r requirements.dev.txt
Running Tests
# Run all tests
python -m pytest tests/
# Run specific test file
python -m pytest tests/test_specific.py
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests
- Submit pull request
Supported Languages
The current model (tts_models/en/vctk/vits
) supports English with multiple speakers.
Each speaker has a unique voice characteristic. The available speakers can be viewed
when running the interactive script.
Troubleshooting
Common Issues
-
No audio output file generated
- Check if you have write permissions in the current directory
- Ensure enough disk space is available
- Verify that the text input is not empty
-
espeak-ng not found
- Make sure espeak-ng is installed correctly
- For macOS:
brew install espeak-ng
- For Ubuntu/Debian:
sudo apt-get install espeak-ng
- Add espeak-ng to your system PATH if needed
-
Speaker not found error
- Use the interactive script to see available speaker IDs
- Default speaker is "p335"
- Make sure to use exact speaker ID (case sensitive)
-
Model download issues
- Check your internet connection
- Ensure you have enough disk space
- Try removing the downloaded model and let it re-download
-
Memory errors
- Try with shorter text inputs
- Close other memory-intensive applications
- Check if your system meets minimum requirements
Advanced Usage
-
Custom Output Location
- All scripts support custom output paths
- Use absolute paths for reliable file saving
- Ensure write permissions in target directory
-
Voice Customization
- Try different speakers for variety
- Use interactive mode to preview voices
- Experiment with different text formats
Getting Help
If you encounter issues not covered here:
- Check the error message for specific details
- Verify your Python environment and dependencies
- Try running the example scripts with simple inputs first
- Check the TTS library documentation for advanced issues