docs: enhance TTS example scripts documentation with installation guide, examples, and troubleshooting

2025-03-15 14:42:37 +05:30 · 2025-03-15 14:42:37 +05:30 · b3e9f93dce
parent eef419b373
commit b3e9f93dce
1 changed files with 222 additions and 0 deletions
--- a/examples/tts_scripts/README.md
+++ b/examples/tts_scripts/README.md
@@ -0,0 +1,222 @@
 # TTS Example Scripts
 This directory contains example scripts demonstrating how to use the TTS (Text-to-Speech) system.
 ## Available Scripts
 1. `simple_tts.py` - The simplest way to use TTS with minimal setup
 2. `quick_tts.py` - Command-line interface for quick text-to-speech conversion
 3. `interactive_tts.py` - Interactive script with speaker selection and multi-line text input
 4. `example_tts.py` - Basic example showing TTS functionality
 ## Usage
 ### Simple TTS
 ```bash
 python simple_tts.py
 ```
 Converts the script's built-in default text to speech using speaker `p335`.
 ### Quick TTS
 ```bash
 python quick_tts.py "Your text goes here"
 ```
 Converts command-line text to speech immediately.
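For orientation, a command-line wrapper of this kind could look roughly like the sketch below. This is not the actual content of `quick_tts.py`: the constants are taken from this README, and the `--speaker` flag is a hypothetical addition for illustration.

```python
import argparse

# Assumed values, taken from this README rather than from quick_tts.py itself
MODEL_NAME = "tts_models/en/vctk/vits"
DEFAULT_SPEAKER = "p335"
OUTPUT_FILE = "speech_output.wav"


def build_parser() -> argparse.ArgumentParser:
    """Build the argument parser: one positional text argument."""
    parser = argparse.ArgumentParser(description="Convert text to speech")
    parser.add_argument("text", help="text to convert")
    # Hypothetical flag, not confirmed to exist in the real script
    parser.add_argument("--speaker", default=DEFAULT_SPEAKER, help="speaker ID")
    return parser


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)
    # Imported lazily so that --help works without loading the TTS library
    from TTS.api import TTS

    tts = TTS(model_name=MODEL_NAME)
    tts.tts_to_file(text=args.text, file_path=OUTPUT_FILE, speaker=args.speaker)
    print(f"Saved {OUTPUT_FILE}")
```

To use it as a script, call `main()` from an `if __name__ == "__main__":` guard at the bottom of the file.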
 ### Interactive TTS
 ```bash
 python interactive_tts.py
 ```
 Provides an interactive interface where you can:
 1. Choose from available speakers
 2. Enter multi-line text
 3. Generate speech with custom output filenames
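The multi-line input step can be sketched as follows. How `interactive_tts.py` actually ends input is not documented here, so terminating on an empty line is an assumption, and `read_multiline` is a hypothetical helper name.

```python
from typing import Callable, List


def read_multiline(read_line: Callable[[str], str] = input) -> str:
    """Collect lines until an empty line (or end of input), then join them."""
    lines: List[str] = []
    while True:
        try:
            line = read_line("> " if not lines else "... ")
        except EOFError:
            # Ctrl-D / end of piped input also finishes the text
            break
        if not line.strip():
            break
        lines.append(line)
    return "\n".join(lines)
```

Calling `read_multiline()` with no arguments reads from the terminal; passing a different `read_line` callable makes the function easy to test.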
 ## Output
 All scripts generate WAV files that can be played with any media player.
 - `simple_tts.py` generates `speech_[speaker_id].wav`
 - `quick_tts.py` generates `speech_output.wav`
 - `interactive_tts.py` generates `speech_output_[number].wav`
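To confirm that a generated file is valid without opening a media player, the standard-library `wave` module can report its basic properties. This is a minimal sketch that assumes the output is uncompressed PCM WAV, which `wave` can read; `describe_wav` is a hypothetical helper name.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return channel count, sample rate and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }
```

For example, `describe_wav("speech_output.wav")` returns a dictionary whose `sample_rate` entry can be compared with the rate listed under Model Options.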
 ## Requirements
 - Python 3.9+
 - The `TTS` Python package (Coqui TTS), imported as `from TTS.api import TTS`
 - espeak-ng (required for phonemization)
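The three requirements above can be checked with a short standard-library script. This is a sketch under the assumption that the espeak-ng executable is named `espeak-ng` on your `PATH`; `check_environment` is a hypothetical helper name.

```python
import importlib.util
import shutil
import sys


def check_environment() -> dict:
    """Map each requirement to True (satisfied) or False (missing)."""
    return {
        "python>=3.9": sys.version_info >= (3, 9),
        # find_spec locates the package without importing it
        "TTS package": importlib.util.find_spec("TTS") is not None,
        "espeak-ng": shutil.which("espeak-ng") is not None,
    }


for name, ok in check_environment().items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```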
 ## Installation
 ### 1. Python Environment Setup
 ```bash
 # Create a virtual environment
 python -m venv venv
 # Activate the virtual environment
 source venv/bin/activate  # or `venv\Scripts\activate` on Windows
 # Upgrade pip
 python -m pip install --upgrade pip
 ```
 ### 2. Install Dependencies
 ```bash
 # Install TTS library
 pip install TTS
 # Install espeak-ng (run only the command for your platform)
 # For macOS
 brew install espeak-ng
 # For Ubuntu/Debian
 sudo apt-get install espeak-ng
 ```
 ## Examples
 ### Basic Text-to-Speech
 ```python
 from TTS.api import TTS
 # Initialize TTS
 tts = TTS(model_name="tts_models/en/vctk/vits")
 # Simple conversion
 tts.tts_to_file(text="Hello, world!", file_path="output.wav", speaker="p335")
 ```
 ### Multi-line Text
 ```python
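 from TTS.api import TTS
 tts = TTS(model_name="tts_models/en/vctk/vits")
 text = """
 This is a multi-line text example.
 It will be converted to speech with proper pauses.
 You can use it for longer content like articles or books.
 """
 # Pass the whole string in one call; the output file name is only an example
 tts.tts_to_file(text=text.strip(), file_path="multiline.wav", speaker="p335")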
 ```
 ### Different Speakers
 ```python
 from TTS.api import TTS
 # List available speakers
 tts = TTS(model_name="tts_models/en/vctk/vits")
 print("Available speakers:", tts.speakers)
 # Try different speakers
 tts.tts_to_file(text="Same text, different voice.", file_path="speaker1.wav", speaker="p227")
 tts.tts_to_file(text="Same text, different voice.", file_path="speaker2.wav", speaker="p228")
 ```
 ## Configuration
 ### Model Options
 - Model Name: `tts_models/en/vctk/vits`
 - Sample Rate: 22050 Hz
 - Speaker IDs: VCTK-style IDs between `p225` and `p376` (the range has gaps; print `tts.speakers` for the exact list)
 - Language: English
 ### Audio Settings
 ```python
 # Available audio settings
 settings = {
     "sample_rate": 22050,     # Audio sample rate
     "output_format": "wav",   # Output format (wav, mp3)
     "speed": 1.0,             # Speech speed (0.5-2.0)
 }
 ```
 ### Performance Tips
 1. **Memory Usage**
    - Process multiple files in batches rather than all at once
    - Clear caches between large generations
    - Monitor system resources (RAM and, if used, GPU memory)
 2. **Speed Optimization**
    - Run small, one-off jobs on the CPU
    - Use a GPU, if available, for batch processing
    - Load the model once and reuse it across calls
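The speed tips above can be sketched as follows. The batch threshold of 8 is an arbitrary assumption, and the helper names are hypothetical. The sketch also assumes a recent TTS release in which the model object supports `.to(device)`; older releases selected the GPU through a constructor argument instead.

```python
from functools import lru_cache


def pick_device(num_texts: int, cuda_available: bool, batch_threshold: int = 8) -> str:
    """Return "cuda" only when a GPU exists and the batch is large enough."""
    if cuda_available and num_texts >= batch_threshold:
        return "cuda"
    return "cpu"


@lru_cache(maxsize=2)
def load_model(model_name: str, device: str):
    """Load a model once per (model_name, device) pair and reuse it."""
    # Lazy import: the library is only needed when a model is requested
    from TTS.api import TTS

    return TTS(model_name=model_name).to(device)


def synthesize_batch(texts, speaker="p335", model_name="tts_models/en/vctk/vits"):
    """Write one WAV file per text, reusing a single cached model."""
    import torch  # installed as a dependency of the TTS package

    device = pick_device(len(texts), torch.cuda.is_available())
    tts = load_model(model_name, device)
    for i, text in enumerate(texts, start=1):
        tts.tts_to_file(text=text, file_path=f"speech_output_{i}.wav", speaker=speaker)
```

Because `load_model` is wrapped in `lru_cache`, repeated calls to `synthesize_batch` with the same model and device skip the model load.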
 ## Development
 ### Setting up Development Environment
 ```bash
 # Clone the repository
 git clone <repository-url>
 cd TTS
 # Create development environment
 python -m venv venv
 source venv/bin/activate  # or `venv\Scripts\activate` on Windows
 # Install development dependencies
 pip install -r requirements.dev.txt
 ```
 ### Running Tests
 ```bash
 # Run all tests
 python -m pytest tests/
 # Run specific test file
 python -m pytest tests/test_specific.py
 ```
 ### Contributing
 1. Fork the repository
 2. Create a feature branch
 3. Make your changes
 4. Run tests
 5. Submit a pull request
 ## Supported Languages
 The current model (`tts_models/en/vctk/vits`) supports English with multiple speakers.
 Each speaker has distinct voice characteristics. Run the interactive script to list
 the available speakers.
 ## Troubleshooting
 ### Common Issues
 1. **No audio output file generated**
   - Check if you have write permissions in the current directory
   - Ensure enough disk space is available
   - Verify that the text input is not empty
 2. **espeak-ng not found**
   - Make sure espeak-ng is installed correctly
   - For macOS: `brew install espeak-ng`
   - For Ubuntu/Debian: `sudo apt-get install espeak-ng`
   - Add espeak-ng to your system PATH if needed
 3. **Speaker not found error**
   - Use the interactive script to see available speaker IDs
   - Default speaker is "p335"
   - Use the exact speaker ID; IDs are case-sensitive
 4. **Model download issues**
   - Check your internet connection
   - Ensure you have enough disk space
   - Try removing the downloaded model and letting it re-download
 5. **Memory errors**
   - Try with shorter text inputs
   - Close other memory-intensive applications
   - Check if your system meets minimum requirements
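Several of the checks above (empty text, unknown speaker, missing write permission, low disk space) can be run before synthesis starts. The sketch below uses only the standard library; `preflight` is a hypothetical helper, and the 50 MB free-space threshold is an arbitrary assumption.

```python
import os
import shutil


def preflight(text, speaker, speakers, out_path, min_free_bytes=50 * 1024 * 1024):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not text.strip():
        problems.append("text is empty")
    if speaker not in speakers:  # exact, case-sensitive match
        problems.append(f"unknown speaker {speaker!r}")
    out_dir = os.path.dirname(os.path.abspath(out_path))
    if not os.path.isdir(out_dir):
        problems.append(f"output directory does not exist: {out_dir}")
    else:
        if not os.access(out_dir, os.W_OK):
            problems.append(f"no write permission for: {out_dir}")
        if shutil.disk_usage(out_dir).free < min_free_bytes:
            problems.append(f"low disk space in: {out_dir}")
    return problems
```

With a loaded model, a call such as `preflight(text, "p335", tts.speakers, "output.wav")` returns an empty list when none of these problems is detected.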
 ### Advanced Usage
 1. **Custom Output Location**
   - All scripts support custom output paths
   - Use absolute paths for reliable file saving
   - Ensure write permissions in target directory
 2. **Voice Customization**
   - Try different speakers for variety
   - Use interactive mode to preview voices
   - Experiment with different text formats
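The advice on absolute output paths can be applied with `pathlib` before passing a path to `tts_to_file`. This is a minimal sketch; `resolve_output` is a hypothetical helper and is not part of the example scripts.

```python
from pathlib import Path


def resolve_output(path: str) -> Path:
    """Turn a user-supplied path into an absolute one and create its folder."""
    out = Path(path).expanduser().resolve()
    # Create missing parent directories so the later write does not fail
    out.parent.mkdir(parents=True, exist_ok=True)
    return out
```

The result can be passed on as `file_path=str(resolve_output("~/audio/hello.wav"))`.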
 ### Getting Help
 If you encounter issues not covered here:
 1. Check the error message for specific details
 2. Verify your Python environment and dependencies
 3. Try running the example scripts with simple inputs first
 4. Check the TTS library documentation for advanced issues