## CHANGES - Update documentation examples to use gemini-2.5-flash-preview-tts - Replace gemini-2.0-flash-tts references throughout Gemini-TTS.md - Update voice selection example commands - Modify CLI help text example command - Update changelog database binary file
4.3 KiB
Gemini Text-to-Speech (TTS) Guide
Fabric supports Google Gemini's text-to-speech (TTS) capabilities, allowing you to convert text into high-quality audio using various AI-generated voices.
Overview
The Gemini TTS feature in Fabric allows you to:
- Convert text input into audio using Google's Gemini TTS models
- Choose from 30+ different AI voices with varying characteristics
- Generate high-quality WAV audio files
- Integrate TTS generation into your existing Fabric workflows
Usage
Basic TTS Generation
To generate audio from text using TTS:
# Basic TTS with default voice (Kore)
echo "Hello, this is a test of Gemini TTS" | fabric -m gemini-2.5-flash-preview-tts -o output.wav
# Using a specific voice
echo "Hello, this is a test with the Charon voice" | fabric -m gemini-2.5-flash-preview-tts --voice Charon -o output.wav
# Using TTS with a pattern
fabric -p summarize --voice Puck -m gemini-2.5-flash-preview-tts -o summary.wav < document.txt
Voice Selection
Use the --voice flag to specify which voice to use for TTS generation:
fabric -m gemini-2.5-flash-preview-tts --voice Zephyr -o output.wav "Your text here"
If no voice is specified, the default voice "Kore" will be used.
Available Voices
Gemini TTS supports 30+ different voices, each with unique characteristics:
Popular Voices
- Kore - Firm and confident (default)
- Charon - Informative and clear
- Puck - Upbeat and energetic
- Zephyr - Bright and cheerful
- Leda - Youthful and energetic
- Aoede - Breezy and natural
Complete Voice List
- Kore, Charon, Puck, Fenrir, Aoede, Leda, Orus, Zephyr
- Autonoe, Callirhoe, Despina, Erinome, Gacrux, Laomedeia
- Pulcherrima, Sulafat, Vindemiatrix, Achernar, Achird
- Algenib, Algieba, Alnilam, Enceladus, Iapetus, Rasalgethi
- Sadachbia, Zubenelgenubi, Vega, Capella, Lyra
Listing Available Voices
To see all available voices with descriptions:
# List all voices with characteristics
fabric --list-gemini-voices
# List voice names only (for shell completion)
fabric --list-gemini-voices --shell-complete-list
Rate Limits
Google Gemini TTS has usage quotas that vary by plan:
Free Tier
- 15 requests per day per project per TTS model
- Quota resets daily
- Applies to all TTS models (e.g.,
gemini-2.5-flash-preview-tts)
Rate Limit Errors
If you exceed your quota, you'll see an error like:
Error 429: You exceeded your current quota, please check your plan and billing details
Solutions:
- Wait for daily quota reset (typically at midnight UTC)
- Upgrade to a paid plan for higher limits
- Use TTS generation strategically for important content
For current rate limits and pricing, visit: https://ai.google.dev/gemini-api/docs/rate-limits
Configuration
Command Line Options
--voice <voice_name>- Specify the TTS voice to use-o <filename.wav>- Output audio file (required for TTS models)-m <tts_model>- Specify a TTS-capable model (e.g.,gemini-2.5-flash-preview-tts)
YAML Configuration
You can also set a default voice in your Fabric configuration file (~/.config/fabric/config.yaml):
voice: "Charon" # Set your preferred default voice
Requirements
- Valid Google Gemini API key configured in Fabric
- TTS-capable Gemini model (models containing "tts" in the name)
- Audio output must be specified with
-o filename.wav
Troubleshooting
Common Issues
Error: "TTS model requires audio output"
- Solution: Always specify an output file with
-o filename.wavwhen using TTS models
Error: "Invalid voice 'X'"
- Solution: Check that the voice name is spelled correctly and matches one of the supported voices listed above
Error: "TTS generation failed"
- Solution: Verify your Gemini API key is valid and you have sufficient quota
Getting Help
For additional help with TTS features:
fabric --help
Technical Details
- Audio Format: WAV files with 24kHz sample rate, 16-bit depth, mono channel
- Language Support: Automatic language detection for 24+ languages
- Model Requirements: Models must contain "tts", "preview-tts", or "text-to-speech" in the name
- Voice Selection: Uses Google's PrebuiltVoiceConfig system for consistent voice quality
For more information about Fabric, visit the main documentation.