mirror of https://github.com/danielmiessler/Fabric.git synced 2026-01-07 21:44:02 -05:00


Kayvan Sylvan 27f9134912 docs: update Gemini TTS model references to gemini-2.5-flash-preview-tts

## CHANGES

- Update documentation examples to use gemini-2.5-flash-preview-tts
- Replace gemini-2.0-flash-tts references throughout Gemini-TTS.md
- Update voice selection example commands
- Modify CLI help text example command
- Update changelog database binary file

2025-07-26 16:23:56 -07:00


Gemini Text-to-Speech (TTS) Guide

Fabric supports Google Gemini's text-to-speech (TTS) capabilities, allowing you to convert text into high-quality audio using various AI-generated voices.

Overview

The Gemini TTS feature in Fabric allows you to:

- Convert text input into audio using Google's Gemini TTS models
- Choose from 30+ different AI voices with varying characteristics
- Generate high-quality WAV audio files
- Integrate TTS generation into your existing Fabric workflows

Usage

Basic TTS Generation

To generate audio from text, pipe the text into fabric, select a TTS model with -m, and name the output file with -o:

# Basic TTS with default voice (Kore)
echo "Hello, this is a test of Gemini TTS" | fabric -m gemini-2.5-flash-preview-tts -o output.wav

# Using a specific voice
echo "Hello, this is a test with the Charon voice" | fabric -m gemini-2.5-flash-preview-tts --voice Charon -o output.wav

# Using TTS with a pattern
fabric -p summarize --voice Puck -m gemini-2.5-flash-preview-tts -o summary.wav < document.txt
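
To fold the commands above into a batch job, a minimal sketch along the following lines may help. It assumes one request per .txt file is acceptable under your quota. FABRIC_BIN, TTS_MODEL, TTS_VOICE, tts_out_name and tts_batch are names made up for this example and are not part of Fabric.

```shell
#!/bin/sh
# Sketch only: these variable and function names are hypothetical,
# chosen for this example. Only the fabric flags come from this guide.
FABRIC_BIN=${FABRIC_BIN:-fabric}
TTS_MODEL=${TTS_MODEL:-gemini-2.5-flash-preview-tts}
TTS_VOICE=${TTS_VOICE:-Kore}

# Map "dir/name.txt" to "dir/name.wav".
tts_out_name() {
  printf '%s.wav\n' "${1%.txt}"
}

# Send each .txt file found directly in directory $1 through the TTS model,
# stopping at the first failed request.
tts_batch() {
  for src in "$1"/*.txt; do
    [ -e "$src" ] || continue   # no matches: the glob is left unexpanded
    "$FABRIC_BIN" -m "$TTS_MODEL" --voice "$TTS_VOICE" \
      -o "$(tts_out_name "$src")" < "$src" || return 1
  done
}
```

Calling tts_batch ./notes would then request ./notes/a.wav for ./notes/a.txt, and so on for each text file in that directory.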

Voice Selection

Use the --voice flag to specify which voice to use for TTS generation:

fabric -m gemini-2.5-flash-preview-tts --voice Zephyr -o output.wav "Your text here"

If no voice is specified, Fabric uses the default voice, "Kore".

Available Voices

Gemini TTS supports 30+ different voices, each with unique characteristics:

Popular Voices

- Kore: Firm and confident (default)
- Charon: Informative and clear
- Puck: Upbeat and energetic
- Zephyr: Bright and cheerful
- Leda: Youthful and energetic
- Aoede: Breezy and natural

Complete Voice List

Kore, Charon, Puck, Fenrir, Aoede, Leda, Orus, Zephyr
Autonoe, Callirhoe, Despina, Erinome, Gacrux, Laomedeia
Pulcherrima, Sulafat, Vindemiatrix, Achernar, Achird
Algenib, Algieba, Alnilam, Enceladus, Iapetus, Rasalgethi
Sadachbia, Zubenelgenubi, Vega, Capella, Lyra

Listing Available Voices

To see all available voices with descriptions:

# List all voices with characteristics
fabric --list-gemini-voices

# List voice names only (for shell completion)
fabric --list-gemini-voices --shell-complete-list
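
The names-only listing lends itself to a quick check for a misspelled voice before a request is spent on it. The sketch below assumes that listing prints one voice name per line, which this guide does not state, and that names are matched case-sensitively. voice_in_list is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if voice name $1 appears as a whole line on stdin.
# -F: fixed string, -x: match the entire line, -q: no output.
voice_in_list() {
  grep -Fxq -- "$1"
}
```

Under those assumptions, fabric --list-gemini-voices --shell-complete-list | voice_in_list Charon would succeed only when Charon is among the listed voices.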

Rate Limits

Google Gemini TTS has usage quotas that vary by plan:

Free Tier

- 15 requests per day per project per TTS model
- Quota resets daily
- Applies to all TTS models (e.g., gemini-2.5-flash-preview-tts)

Rate Limit Errors

If you exceed your quota, you'll see an error like:

Error 429: You exceeded your current quota, please check your plan and billing details

Solutions:

- Wait for the daily quota to reset (Google's rate-limit documentation gives midnight Pacific time)
- Upgrade to a paid plan for higher limits
- Reserve TTS requests for the content that matters most

For current rate limits and pricing, visit: https://ai.google.dev/gemini-api/docs/rate-limits
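
One way to avoid running into the daily cap is to count requests locally before making them. The sketch below is purely local bookkeeping: neither Fabric nor the Gemini API reads this file, and tts_quota_take, its arguments, and the file layout are invented for this example. The day boundary follows the shell's TZ setting, so set TZ to whichever zone your quota resets in.

```shell
#!/bin/sh
# Hypothetical local counter. Usage: tts_quota_take FILE LIMIT [DAY]
# Succeeds and records one request if fewer than LIMIT requests have been
# recorded for DAY; fails without recording anything otherwise.
tts_quota_take() {
  file=$1
  limit=$2
  day=${3:-$(date +%Y-%m-%d)}   # defaults to today's date in the current TZ
  used=0
  if [ -f "$file" ]; then
    # The file holds a single line: "<day> <count>".
    read -r saved_day saved_used < "$file" || true
    if [ "$saved_day" = "$day" ]; then
      used=${saved_used:-0}
    fi
  fi
  [ "$used" -lt "$limit" ] || return 1
  printf '%s %s\n' "$day" "$((used + 1))" > "$file"
}
```

A call such as tts_quota_take "$HOME/.tts_quota" 15 && fabric ... would then skip the request once 15 have been recorded for the current day. The variables are global because POSIX sh has no local keyword.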

Configuration

Command Line Options

--voice <voice_name> - Specify the TTS voice to use
-o <filename.wav> - Output audio file (required for TTS models)
-m <tts_model> - Specify a TTS-capable model (e.g., gemini-2.5-flash-preview-tts)

YAML Configuration

You can also set a default voice in your Fabric configuration file (~/.config/fabric/config.yaml):

voice: "Charon"  # Set your preferred default voice

Requirements

- A valid Google Gemini API key configured in Fabric
- A TTS-capable Gemini model (a model with "tts" in its name)
- An output file specified with -o filename.wav
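
Two of these requirements can be checked offline before a request is sent. The following is a rough sketch, not Fabric's own validation: tts_preflight is a hypothetical name, the model test is a plain substring match on "tts", and insisting on a .wav extension is an assumption drawn from the examples in this guide.

```shell
#!/bin/sh
# Hypothetical pre-flight check. Usage: tts_preflight MODEL OUTPUT_FILE
# Fails with a message on stderr if either argument looks wrong.
tts_preflight() {
  case $1 in
    *tts*) ;;   # model name contains "tts"
    *) echo "model '$1' does not look TTS-capable" >&2; return 1 ;;
  esac
  case $2 in
    *.wav) ;;   # output file ends in .wav
    *) echo "output '$2' should end in .wav" >&2; return 1 ;;
  esac
}
```

It could guard a call like tts_preflight "$model" "$out" && fabric -m "$model" -o "$out", leaving the API key check to Fabric itself.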

Troubleshooting

Common Issues

Error: "TTS model requires audio output"

Solution: Always specify an output file with -o filename.wav when using TTS models

Error: "Invalid voice 'X'"

Solution: Check that the voice name is spelled correctly and matches one of the supported voices listed above

Error: "TTS generation failed"

Solution: Verify that your Gemini API key is valid and that you have quota remaining

Getting Help

For additional help with TTS features:

fabric --help

Technical Details

- Audio Format: WAV files with a 24 kHz sample rate, 16-bit depth, mono channel
- Language Support: Automatic language detection for 24+ languages
- Model Requirements: Models must contain "tts", "preview-tts", or "text-to-speech" in the name
- Voice Selection: Uses Google's PrebuiltVoiceConfig system for consistent voice quality
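
The audio format above fixes the data rate, which makes it possible to estimate playback length from file size alone. The sketch below works through that arithmetic. The 44-byte header is an assumption (the canonical PCM WAV header size) and the names are invented for this example, so treat the result as approximate.

```shell
#!/bin/sh
# 24000 samples/s * 16 bits / 8 bits per byte * 1 channel = 48000 bytes/s.
WAV_BYTES_PER_SEC=$((24000 * 16 / 8 * 1))
WAV_HEADER_BYTES=44   # assumption: canonical 44-byte PCM WAV header

# Print the approximate duration, in whole seconds, of a WAV file of $1 bytes.
wav_seconds() {
  echo $(( ($1 - WAV_HEADER_BYTES) / WAV_BYTES_PER_SEC ))
}

wav_seconds 480044   # prints 10
```

For a real file, wav_seconds "$(wc -c < output.wav)" gives the estimate. At 48000 bytes per second, one minute of speech comes to roughly 2.9 MB.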

For more information about Fabric, visit the main documentation.
