mirror of https://github.com/danielmiessler/Fabric.git (synced 2026-01-07 21:44:02 -05:00)
feat: add GitHub Models provider and refactor model fetching with direct API fallback
- Add GitHub Models to supported OpenAI-compatible providers list
- Implement direct HTTP fallback for non-standard model responses
- Centralize model fetching logic in openai package
- Upgrade openai-go SDK dependency from v1.8.2 to v1.12.0
- Remove redundant model fetching code from openai_compatible package
- Add comprehensive GitHub Models setup documentation (700+ lines)
- Support custom models URL endpoint per provider configuration
- Add unit tests for direct model fetching functionality
- Update internationalization strings for model fetching errors
- Add VSCode dictionary entries for "azureml" and "Jamba"
This commit is contained in:

.vscode/settings.json (vendored, 2 additions)
@@ -10,6 +10,7 @@
     "aplicar",
     "atotto",
     "Autonoe",
+    "azureml",
     "badfile",
     "Behrens",
     "blindspots",
@@ -87,6 +88,7 @@
     "horts",
     "HTMLURL",
     "imagetools",
+    "Jamba",
     "jaredmontoya",
     "jessevdk",
     "Jina",
@@ -73,6 +73,7 @@ Below are the **new features and capabilities** we've added (newest first):
 
 ### Recent Major Features
 
+- [v1.4.331](https://github.com/danielmiessler/fabric/releases/tag/v1.4.331) (Nov 23, 2025) — **Support for GitHub Models**: Adds support for using GitHub Models.
 - [v1.4.322](https://github.com/danielmiessler/fabric/releases/tag/v1.4.322) (Nov 5, 2025) — **Interactive HTML Concept Maps and Claude Sonnet 4.5**: Adds `create_conceptmap` pattern for visual knowledge representation using Vis.js, introduces WELLNESS category with psychological analysis patterns, and upgrades to Claude Sonnet 4.5
 - [v1.4.317](https://github.com/danielmiessler/fabric/releases/tag/v1.4.317) (Sep 21, 2025) — **Portuguese Language Variants**: Adds BCP 47 locale normalization with support for Brazilian Portuguese (pt-BR) and European Portuguese (pt-PT) with intelligent fallback chains
 - [v1.4.314](https://github.com/danielmiessler/fabric/releases/tag/v1.4.314) (Sep 17, 2025) — **Azure OpenAI Migration**: Migrates to official `openai-go/azure` SDK with improved authentication and default API version support
cmd/generate_changelog/incoming/1839.txt (new file, 7 additions)
@@ -0,0 +1,7 @@
### PR [#1839](https://github.com/danielmiessler/Fabric/pull/1839) by [ksylvan](https://github.com/ksylvan): Add GitHub Models Provider and Refactor Fetching Fallback Logic

- Add GitHub Models provider and refactor model fetching with direct API fallback
- Add GitHub Models to supported OpenAI-compatible providers list
- Implement direct HTTP fallback for non-standard model responses
- Centralize model fetching logic in openai package
- Upgrade openai-go SDK dependency from v1.8.2 to v1.12.0
docs/GitHub-Models-Setup.md (new file, 700 additions)
@@ -0,0 +1,700 @@

# GitHub Models Setup Guide for Fabric

This guide walks you through setting up and using GitHub Models with the Fabric CLI. GitHub Models provides free access to multiple AI models from OpenAI, Meta, Microsoft, DeepSeek, xAI, and other providers using only your GitHub credentials.

## Table of Contents

- [What are GitHub Models?](#what-are-github-models)
- [Getting Your GitHub Models API Key](#getting-your-github-models-api-key)
- [Configuring Fabric for GitHub Models](#configuring-fabric-for-github-models)
- [Testing Your Setup](#testing-your-setup)
- [Available Models](#available-models)
- [Rate Limits & Free Tier](#rate-limits--free-tier)
- [Troubleshooting](#troubleshooting)
- [Advanced Usage](#advanced-usage)

---
## What are GitHub Models?

**GitHub Models** is a free AI inference API platform that lets you access multiple AI models using only your GitHub account. It's powered by Azure AI infrastructure and provides:

- **Unified Access**: Single API endpoint for models from multiple providers
- **No Extra API Keys**: Uses GitHub Personal Access Tokens (no separate OpenAI, Anthropic, etc. keys needed)
- **Free Tier**: Rate-limited free access, well suited to prototyping and personal projects
- **Web Playground**: Test models directly at [github.com/marketplace/models](https://github.com/marketplace/models)
- **Compatible Format**: Works with OpenAI SDK standards (see the curl sketch below)
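Because the inference endpoint follows the OpenAI chat-completions request shape, you can exercise it with plain `curl` before involving Fabric at all. A minimal sketch, assuming `GITHUB_TOKEN` holds a PAT with the `models:read` permission:

```bash
# One-off chat completion against the GitHub Models inference endpoint.
curl -s https://models.github.ai/inference/chat/completions \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": "Say hello in one short sentence."}]
      }' | jq -r '.choices[0].message.content'
```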
### Why Use GitHub Models with Fabric?

- **No Cost for Testing**: Free tier allows 50-150 requests/day depending on model
- **Multiple Providers**: Access OpenAI, Meta Llama, Microsoft Phi, DeepSeek, and more
- **Easy Setup**: One GitHub token instead of managing multiple API keys
- **Great for Learning**: Experiment with different models without financial commitment

---
## Getting Your GitHub Models API Key

GitHub Models uses **Personal Access Tokens (PATs)** instead of separate API keys.

### Step-by-Step Instructions

1. **Sign in to GitHub** at [github.com](https://github.com)

2. **Navigate to Token Settings:**
   - Click your profile picture (upper-right corner)
   - Click **Settings**
   - Scroll down the left sidebar to **Developer settings** (at the bottom)
   - Click **Personal access tokens** → **Fine-grained tokens** (recommended)

3. **Generate New Token:**
   - Click **Generate new token**
   - Give it a descriptive name: `Fabric CLI - GitHub Models`
   - Set expiration (recommended: 90 days or custom)
   - **Repository access**: Select "Public Repositories (read-only)" or "All repositories" (your choice)
   - **Permissions**:
     - Scroll down to **Account permissions**
     - Find **AI Models** and set to **Read-only** ✓
     - This grants the `models:read` scope
   - Click **Generate token** at the bottom

4. **Save Your Token:**
   - **IMPORTANT**: Copy the token immediately (it starts with `github_pat_` or `ghp_`)
   - You won't be able to see it again!
   - Store it securely; you'll enter it as Fabric's GitHub API key (`GITHUB_API_KEY`), and the curl examples in this guide reference it as `$GITHUB_TOKEN`
### Security Best Practices

- ✅ Use fine-grained tokens with minimal permissions
- ✅ Set an expiration date (rotate tokens regularly)
- ✅ Never commit tokens to Git repositories
- ✅ Store tokens in environment variables or a secure credential manager (sketch below)
- ❌ Don't share tokens in chat, email, or screenshots
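For example, rather than keeping the token in a plaintext file, you can load it from a credential store at shell startup. A sketch assuming the `pass` password manager (any keychain CLI works the same way):

```bash
# Load the token from a credential store instead of a plaintext file.
# Assumes the token was saved earlier with: pass insert github/models-token
export GITHUB_TOKEN="$(pass show github/models-token)"
```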
---
## Configuring Fabric for GitHub Models

### Method 1: Using Fabric Setup (Recommended)

This is the easiest and safest method:

1. **Run Fabric Setup:**

   ```bash
   fabric --setup
   ```

2. **Select GitHub from the Menu:**
   - You'll see a numbered list of AI vendors
   - Find `[8] GitHub (configured)` or similar
   - Enter the number (e.g., `8`) and press Enter

3. **Enter Your GitHub Token:**
   - When prompted for "API Key", paste your GitHub Personal Access Token
   - This is the token you created earlier (starts with `github_pat_` or `ghp_`)
   - Press Enter

4. **Verify Base URL (Optional):**
   - You'll be asked for "API Base URL"
   - Press Enter to use the default: `https://models.github.ai/inference`
   - Or customize it if needed (advanced use only)

5. **Save and Exit:**
   - The setup wizard saves your configuration
   - You should see "GitHub (configured)" next time
### Method 2: Manual Configuration (Advanced)

If you prefer to edit the configuration file by hand:

1. **Edit Environment File:**

   ```bash
   nano ~/.config/fabric/.env
   ```

2. **Add GitHub Configuration:**

   ```bash
   # GitHub Models API Key (your Personal Access Token)
   GITHUB_API_KEY=github_pat_YOUR_TOKEN_HERE

   # GitHub Models API Base URL (default; usually no need to change)
   GITHUB_API_BASE_URL=https://models.github.ai/inference
   ```

   Save and exit (Ctrl+X, then Y, then Enter)

**Note**: The environment variable is `GITHUB_API_KEY`, not `GITHUB_TOKEN`.
### Verify Configuration

Check that your configuration is properly set:

```bash
grep GITHUB_API_KEY ~/.config/fabric/.env
```

You should see:

```text
GITHUB_API_KEY=github_pat_...
```

Or run setup again to verify:

```bash
fabric --setup
```

Look for `[8] GitHub (configured)` in the list.

---
## Testing Your Setup

### 1. List Available Models

Verify that Fabric can connect to GitHub Models and fetch the model list (`-L` is the short form of `--listmodels`):

```bash
fabric --listmodels | grep GitHub
```

**Expected Output:**

```text
[65] GitHub|ai21-labs/ai21-jamba-1.5-large
[66] GitHub|cohere/cohere-command-a
[67] GitHub|cohere/cohere-command-r-08-2024
[68] GitHub|cohere/cohere-command-r-plus-08-2024
[69] GitHub|deepseek/deepseek-r1
[70] GitHub|deepseek/deepseek-r1-0528
[71] GitHub|deepseek/deepseek-v3-0324
[72] GitHub|meta/llama-3.2-11b-vision-instruct
[73] GitHub|meta/llama-3.2-90b-vision-instruct
... (and more)
```
### 2. Simple Chat Test

Test a basic chat completion with a small, fast model:

```bash
# Use gpt-4o-mini (fast, with generous rate limits)
fabric --vendor GitHub -m openai/gpt-4o-mini 'Why is the sky blue?'
```

**Expected**: You should see a response explaining Rayleigh scattering.

**Tip**: Model names from `--listmodels` can be used directly (e.g., `openai/gpt-4o-mini`, `openai/gpt-4o`, `meta/llama-4-maverick-17b-128e-instruct-fp8`).
### 3. Test with a Pattern

Use one of Fabric's built-in patterns:

```bash
echo "Artificial intelligence is transforming how we work and live." | \
  fabric --pattern summarize --vendor GitHub --model "openai/gpt-4o-mini"
```

### 4. Test Streaming

Verify that streaming responses work:

```bash
echo "Count from 1 to 100" | \
  fabric --vendor GitHub --model "openai/gpt-4o-mini" --stream
```

You should see the response appear progressively, word by word.
### 5. Test with Different Models

Try a Meta Llama model:

```bash
# Use a Llama model
echo "Explain quantum computing" | \
  fabric --vendor GitHub --model "meta/Meta-Llama-3.1-8B-Instruct"
```
### Quick Validation Checklist

- [ ] `--listmodels` shows GitHub models
- [ ] Basic chat completion works
- [ ] Patterns work with the GitHub vendor
- [ ] Streaming responses work
- [ ] Can switch between different models

---
## Available Models

GitHub Models provides access to models from multiple providers. Models use the format `{publisher}/{model-name}`.

### OpenAI Models

| Model ID | Description | Tier | Best For |
|----------|-------------|------|----------|
| `openai/gpt-4.1` | Latest flagship GPT-4 | High | Complex tasks, reasoning |
| `openai/gpt-4o` | Optimized GPT-4 | High | General purpose, fast |
| `openai/gpt-4o-mini` | Compact, cost-effective | Low | Quick tasks, high volume |
| `openai/o1` | Advanced reasoning | High | Complex problem solving |
| `openai/o3` | Next-gen reasoning | High | Cutting-edge reasoning |

### Meta Llama Models

| Model ID | Description | Tier | Best For |
|----------|-------------|------|----------|
| `meta/llama-3.1-405b` | Largest Llama model | High | Complex tasks, accuracy |
| `meta/llama-3.1-70b` | Mid-size Llama | Low | Balanced performance |
| `meta/llama-3.1-8b` | Compact Llama | Low | Fast, efficient tasks |

### Microsoft Phi Models

| Model ID | Description | Tier | Best For |
|----------|-------------|------|----------|
| `microsoft/phi-4` | Latest Phi generation | Low | Efficient reasoning |
| `microsoft/phi-3-medium` | Mid-size variant | Low | General tasks |
| `microsoft/phi-3-mini` | Smallest Phi | Low | Quick, simple tasks |

### DeepSeek Models

| Model ID | Description | Tier | Special |
|----------|-------------|------|---------|
| `deepseek/deepseek-r1` | Reasoning model | Very Limited | 8 requests/day |
| `deepseek/deepseek-r1-0528` | Updated version | Very Limited | 8 requests/day |

### xAI Models

| Model ID | Description | Tier | Special |
|----------|-------------|------|---------|
| `xai/grok-3` | Latest Grok | Very Limited | 15 requests/day |
| `xai/grok-3-mini` | Smaller Grok | Very Limited | 15 requests/day |
### Getting the Full List

To see all currently available models:

```bash
fabric --listmodels | grep GitHub
```

Or, for a formatted list with details, query the GitHub Models API directly:

```bash
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  https://models.github.ai/catalog/models | jq '.[] | {id, publisher, tier: .rate_limit_tier}'
```

---
## Rate Limits & Free Tier

GitHub Models has tiered rate limits based on model complexity. Understanding them helps you use the free tier effectively.

### Low Tier Models (Recommended for High Volume)

**Models**: `gpt-4o-mini`, `llama-3.1-*`, `phi-*`

- **Requests per minute**: 15
- **Requests per day**: 150
- **Tokens per request**: 8,000 input / 4,000 output
- **Concurrent requests**: 5

**Best practices**: Use these for most Fabric patterns and daily tasks.

### High Tier Models (Use Sparingly)

**Models**: `gpt-4.1`, `gpt-4o`, `o1`, `o3`, `llama-3.1-405b`

- **Requests per minute**: 10
- **Requests per day**: 50
- **Tokens per request**: 8,000 input / 4,000 output
- **Concurrent requests**: 2

**Best practices**: Save these for complex tasks, important queries, or when you need maximum quality.

### Very Limited Models

**Models**: `deepseek-r1`, `grok-3`

- **Requests per minute**: 1
- **Requests per day**: 8-15 (varies by model)
- **Tokens per request**: 4,000 input / 4,000 output
- **Concurrent requests**: 1

**Best practices**: Use these only for special experiments or when you specifically need them.
### Rate Limit Reset Times

- **Per-minute limits**: Reset every 60 seconds
- **Daily limits**: Reset at midnight UTC (see the snippet below to compute the time remaining)
- **Per-user**: Limits are tied to your GitHub account, not the token
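If you want to know how long until the daily quota comes back, a quick shell calculation against midnight UTC (uses GNU `date`; BSD/macOS flags differ, as noted in the comment):

```bash
# Seconds remaining until daily limits reset at midnight UTC (GNU date).
now=$(date -u +%s)
reset=$(date -u -d 'tomorrow 00:00:00' +%s)   # macOS: date -u -v+1d -v0H -v0M -v0S +%s
echo "Daily limits reset in $(( reset - now )) seconds"
```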
### Enhanced Limits with GitHub Copilot

If you have a GitHub Copilot subscription, you get higher limits:

- **Copilot Business**: 2× daily request limits
- **Copilot Enterprise**: 3× daily limits plus higher token limits
### What Happens When You Hit Limits?

You'll receive an HTTP 429 error with a message like:

```text
Rate limit exceeded. Try again in X seconds.
```

Fabric will display this error. Wait for the reset time and try again.
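If you script around Fabric, a simple retry loop can absorb transient 429s. This is a sketch, not a Fabric feature; it just re-runs the command with a growing pause on any failure:

```bash
# Retry a fabric invocation with exponential backoff on failure (e.g., HTTP 429).
retry_fabric() {
  local delay=15
  for attempt in 1 2 3 4; do
    if echo "$1" | fabric --vendor GitHub --model openai/gpt-4o-mini; then
      return 0
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    delay=$(( delay * 2 ))
  done
  return 1
}

retry_fabric "Why is the sky blue?"
```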
### Tips for Staying Within Limits

1. **Use low-tier models** for most tasks (`gpt-4o-mini`, `llama-3.1-8b`)
2. **Batch your requests** - process multiple items together when possible
3. **Cache results** - save responses for repeated queries (see the sketch after this list)
4. **Monitor usage** - keep track of daily request counts
5. **Set per-pattern models** - configure specific models for specific patterns (see Advanced Usage)
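As a concrete illustration of tip 3, here is a hypothetical caching wrapper that keys responses by a hash of the pattern and input, so repeated queries never spend a second API request:

```bash
# Cache fabric responses on disk, keyed by pattern + input hash (illustrative sketch).
cached_fabric() {
  local pattern="$1" input cache_dir key
  input="$(cat)"                         # read stdin once
  cache_dir="$HOME/.cache/fabric-responses"
  mkdir -p "$cache_dir"
  key="$(printf '%s|%s' "$pattern" "$input" | sha256sum | cut -d' ' -f1)"
  if [ -f "$cache_dir/$key" ]; then
    cat "$cache_dir/$key"                # cache hit: no API request spent
  else
    echo "$input" | fabric --pattern "$pattern" --vendor GitHub \
      --model openai/gpt-4o-mini | tee "$cache_dir/$key"
  fi
}

echo "Artificial intelligence is transforming work." | cached_fabric summarize
```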
---
## Troubleshooting

### Error: "Authentication failed" or "Unauthorized"

**Cause**: Invalid or missing GitHub token

**Solutions**:

1. Verify the token is in your `.env` file:

   ```bash
   grep GITHUB_API_KEY ~/.config/fabric/.env
   ```

2. Check that the token has the `models:read` permission:
   - Go to GitHub Settings → Developer settings → Personal access tokens
   - Click on your token
   - Verify "AI Models: Read-only" is checked

3. Re-run setup to reconfigure:

   ```bash
   fabric --setup
   # Select GitHub (number 8 or similar)
   # Enter your token again
   ```

4. Generate a new token if needed (tokens expire)
### Error: "Rate limit exceeded"

**Cause**: Too many requests in a short period

**Solutions**:

1. Check which tier your model is in (see [Rate Limits](#rate-limits--free-tier))
2. Wait for the reset (the error message includes the wait time)
3. Switch to a lower-tier model:

   ```bash
   # Instead of gpt-4.1 (high tier)
   fabric --vendor GitHub --model openai/gpt-4.1 ...

   # Use gpt-4o-mini (low tier)
   fabric --vendor GitHub --model openai/gpt-4o-mini ...
   ```
### Error: "Model not found" or "Invalid model"

**Cause**: Incorrect model name format, or the model is not available

**Solutions**:

1. Use the correct format, `{publisher}/{model-name}`, e.g., `openai/gpt-4o-mini`:

   ```bash
   # ❌ Wrong
   fabric --vendor GitHub --model gpt-4o-mini

   # ✅ Correct
   fabric --vendor GitHub --model openai/gpt-4o-mini
   ```

2. List available models to verify the name:

   ```bash
   fabric --listmodels --vendor GitHub | grep -i "gpt-4"
   ```
### Error: "Cannot list models" or empty model list

**Cause**: API endpoint issue or authentication problem

**Solutions**:

1. Test direct API access:

   ```bash
   curl -H "Authorization: Bearer $GITHUB_TOKEN" \
     -H "X-GitHub-Api-Version: 2022-11-28" \
     https://models.github.ai/catalog/models
   ```

2. If curl works but Fabric doesn't, rebuild Fabric:

   ```bash
   cd /path/to/fabric
   go build ./cmd/fabric
   ```

3. Check for network/firewall issues blocking `models.github.ai`
### Error: "Response format not supported"

**Cause**: This should be fixed in the latest version, which falls back to a direct fetch when the SDK cannot parse the response

**Solutions**:

1. Update to the latest Fabric version with PR #1839 merged
2. Verify you're on a version that includes the `FetchModelsDirectly` fallback
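You can see why the fallback exists by inspecting the top-level JSON type the catalog endpoint returns: OpenAI's standard `/models` responses are objects with a `data` field, while this endpoint returns a bare array, which is the non-standard shape the direct-fetch fallback handles.

```bash
# Check the top-level JSON type of the models catalog response.
curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \
  https://models.github.ai/catalog/models | jq 'type'
# Expect "array" rather than the usual "object" with a data field.
```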
### Models are slow to respond

**Cause**: High-tier models have limited concurrency, or the GitHub Models API is congested

**Solutions**:

1. Switch to faster models:
   - `openai/gpt-4o-mini` instead of `gpt-4.1`
   - `meta/llama-3.1-8b` instead of `llama-3.1-405b`

2. Check your internet connection

3. Try again later (the API may be experiencing high traffic)
### Token expires or becomes invalid

**Cause**: Tokens have expiration dates and can be revoked

**Solutions**:

1. Generate a new token (see [Getting Your GitHub Models API Key](#getting-your-github-models-api-key))
2. Update the `.env` file with the new token
3. Set a longer expiration when creating tokens (e.g., 90 days)
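To check quickly whether the token itself is the problem, probe the catalog endpoint and look only at the status code. A sketch: 200 means the token is accepted, while 401 typically means it is expired or missing `models:read`.

```bash
# Probe token validity: 200 = OK, 401 = expired or insufficient permissions.
curl -s -o /dev/null -w '%{http_code}\n' \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  https://models.github.ai/catalog/models
```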
---
## Advanced Usage

### Using Specific Models with Patterns

You can specify which model to use with any pattern:

```bash
# Use GPT-4.1 with the analyze_claims pattern
cat article.txt | fabric --pattern analyze_claims \
  --vendor GitHub --model openai/gpt-4.1

# Use Llama for summarization
cat document.txt | fabric --pattern summarize \
  --vendor GitHub --model meta/llama-3.1-70b
```
### Per-Pattern Model Mapping

Set default models for specific patterns using environment variables.

Edit `~/.config/fabric/.env`:

```bash
# Use GPT-4.1 for complex analysis
FABRIC_MODEL_analyze_claims=GitHub|openai/gpt-4.1
FABRIC_MODEL_extract_wisdom=GitHub|openai/gpt-4.1

# Use GPT-4o-mini for simple tasks
FABRIC_MODEL_summarize=GitHub|openai/gpt-4o-mini
FABRIC_MODEL_extract_article_wisdom=GitHub|openai/gpt-4o-mini

# Use Llama for code tasks
FABRIC_MODEL_explain_code=GitHub|meta/llama-3.1-70b
```

Now when you run:

```bash
cat article.txt | fabric --pattern analyze_claims
```

it will automatically use `GitHub|openai/gpt-4.1` without you needing to specify the vendor and model.
### Comparing Responses Across Providers

Compare how different models respond to the same input:

```bash
# OpenAI GPT-4o-mini
echo "Explain quantum computing" | \
  fabric --vendor GitHub --model openai/gpt-4o-mini > response_openai.txt

# Meta Llama
echo "Explain quantum computing" | \
  fabric --vendor GitHub --model meta/llama-3.1-70b > response_llama.txt

# Microsoft Phi
echo "Explain quantum computing" | \
  fabric --vendor GitHub --model microsoft/phi-4 > response_phi.txt

# Compare
diff response_openai.txt response_llama.txt
```
### Testing Different Models for a Pattern

Find the best model for your use case:

```bash
# Create a test script
cat > test_models.sh << 'EOF'
#!/bin/bash

INPUT="Explain the concept of recursion in programming"
PATTERN="explain_code"

for MODEL in "openai/gpt-4o-mini" "meta/llama-3.1-8b" "microsoft/phi-4"; do
  echo "=== Testing $MODEL ==="
  echo "$INPUT" | fabric --pattern "$PATTERN" --vendor GitHub --model "$MODEL"
  echo ""
done
EOF

chmod +x test_models.sh
./test_models.sh
```
### Quick Test Without Setup

If you want to test quickly without running the full setup, set the environment variable directly:

```bash
# Temporary test (this session only)
export GITHUB_API_KEY=github_pat_YOUR_TOKEN_HERE

# Test immediately
fabric --listmodels --vendor GitHub
```

This is useful for quick tests, but we recommend `fabric --setup` for permanent configuration.
### Streaming for Long Responses

For long-form content, use streaming to see results as they generate:

```bash
cat long_article.txt | \
  fabric --pattern summarize \
  --vendor GitHub --model openai/gpt-4o-mini \
  --stream
```
### Saving Token Usage

Monitor your usage to stay within rate limits:

```bash
# Create a simple usage tracker
echo "$(date): Used gpt-4.1 for analyze_claims" >> ~/.config/fabric/usage.log

# Check daily usage
grep "$(date +%Y-%m-%d)" ~/.config/fabric/usage.log | wc -l
```
### Environment-Based Configuration

Create different profiles for different use cases:

```bash
# Development profile (uses free GitHub Models)
cat > ~/.config/fabric/.env.dev << EOF
GITHUB_API_KEY=github_pat_dev_token_here
DEFAULT_VENDOR=GitHub
DEFAULT_MODEL=openai/gpt-4o-mini
EOF

# Production profile (uses paid OpenAI)
cat > ~/.config/fabric/.env.prod << EOF
OPENAI_API_KEY=sk-prod-key-here
DEFAULT_VENDOR=OpenAI
DEFAULT_MODEL=gpt-4
EOF

# Switch profiles
ln -sf ~/.config/fabric/.env.dev ~/.config/fabric/.env
```

---
## Additional Resources

### Official Documentation

- [GitHub Models Quickstart](https://docs.github.com/en/github-models/quickstart)
- [GitHub Models API Reference](https://docs.github.com/en/rest/models)
- [GitHub Models Marketplace](https://github.com/marketplace/models)

### Fabric Documentation

- [Fabric README](../README.md)
- [Contexts and Sessions Tutorial](./contexts-and-sessions-tutorial.md)
- [Using Speech-to-Text](./Using-Speech-To-Text.md)

### Community

- [Fabric GitHub Repository](https://github.com/danielmiessler/fabric)
- [Fabric Issues](https://github.com/danielmiessler/fabric/issues)
- [Fabric Discussions](https://github.com/danielmiessler/fabric/discussions)

---
## Summary

GitHub Models is an excellent way to experiment with AI models through Fabric without managing multiple API keys or incurring costs. Key points:

✅ **Free to start**: No credit card required, 50-150 requests/day
✅ **Multiple providers**: OpenAI, Meta, Microsoft, DeepSeek, xAI
✅ **Simple setup**: Just one GitHub token via `fabric --setup`
✅ **Great for learning**: Try different models and patterns
✅ **Production path**: Upgrade to a paid tier when ready

### Quick Start Commands

```bash
# 1. Get a GitHub token with the models:read scope from:
#    https://github.com/settings/tokens

# 2. Configure Fabric
fabric --setup
# Select [8] GitHub
# Paste your token when prompted

# 3. List available models
fabric --listmodels --vendor GitHub | grep gpt-4o

# 4. Try it out with gpt-4o-mini
echo "What is AI?" | fabric --vendor GitHub --model "openai/gpt-4o-mini"
```

**Recommended starting point**: Use `openai/gpt-4o-mini` for most patterns - it's fast, capable, and has generous rate limits (150 requests/day).

**Available Models**: `gpt-4o`, `gpt-4o-mini`, `Meta-Llama-3.1-8B-Instruct`, `Meta-Llama-3.1-70B-Instruct`, `Mistral-large-2407`, and more. Use `--listmodels` to see the complete list, and pass each name with its publisher prefix (e.g., `openai/gpt-4o`) exactly as shown there.

Happy prompting! 🚀
go.mod (2 changes)
@@ -21,7 +21,7 @@ require (
 	github.com/mattn/go-sqlite3 v1.14.28
 	github.com/nicksnyder/go-i18n/v2 v2.6.0
 	github.com/ollama/ollama v0.11.7
-	github.com/openai/openai-go v1.8.2
+	github.com/openai/openai-go v1.12.0
 	github.com/otiai10/copy v1.14.1
 	github.com/pkg/errors v0.9.1
 	github.com/samber/lo v1.50.0
go.sum (4 changes)
@@ -201,8 +201,8 @@ github.com/ollama/ollama v0.11.7 h1:CuYjaJ/YEnvLDpJocJbbVdpdVFyGA/OP6lKFyzZD4dI=
 github.com/ollama/ollama v0.11.7/go.mod h1:9+1//yWPsDE2u+l1a5mpaKrYw4VdnSsRU3ioq5BvMms=
 github.com/onsi/gomega v1.34.1 h1:EUMJIKUjM8sKjYbtxQI9A4z2o+rruxnzNvpknOXie6k=
 github.com/onsi/gomega v1.34.1/go.mod h1:kU1QgUvBDLXBJq618Xvm2LUX6rSAfRaFRTcdOeDLwwY=
-github.com/openai/openai-go v1.8.2 h1:UqSkJ1vCOPUpz9Ka5tS0324EJFEuOvMc+lA/EarJWP8=
-github.com/openai/openai-go v1.8.2/go.mod h1:g461MYGXEXBVdV5SaR/5tNzNbSfwTBBefwc+LlDCK0Y=
+github.com/openai/openai-go v1.12.0 h1:NBQCnXzqOTv5wsgNC36PrFEiskGfO5wccfCWDo9S1U0=
+github.com/openai/openai-go v1.12.0/go.mod h1:g461MYGXEXBVdV5SaR/5tNzNbSfwTBBefwc+LlDCK0Y=
 github.com/otiai10/copy v1.14.1 h1:5/7E6qsUMBaH5AnQ0sSLzzTg1oTECmcCmT6lvF45Na8=
 github.com/otiai10/copy v1.14.1/go.mod h1:oQwrEDDOci3IM8dJF0d8+jnbfPDllW6vUjNc3DoZm9I=
 github.com/otiai10/mint v1.6.3 h1:87qsV/aw1F5as1eH1zS/yqHY85ANKVMgkDrf9rcxbQs=
@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "leere Sekunden-Zeichenfolge",
   "youtube_invalid_seconds_format": "ungültiges Sekundenformat %q: %w",
   "error_fetching_playlist_videos": "Fehler beim Abrufen der Playlist-Videos: %w",
+  "openai_api_base_url_not_configured": "API-Basis-URL für Anbieter %s nicht konfiguriert",
+  "openai_failed_to_create_models_url": "Modell-URL konnte nicht erstellt werden: %w",
+  "openai_unexpected_status_code_with_body": "unerwarteter Statuscode: %d von Anbieter %s, Antwort: %s",
+  "openai_unexpected_status_code_read_error_partial": "unerwarteter Statuscode: %d von Anbieter %s (Fehler beim Lesen: %v), teilweise Antwort: %s",
+  "openai_unexpected_status_code_read_error": "unerwarteter Statuscode: %d von Anbieter %s (Fehler beim Lesen der Antwort: %v)",
+  "openai_unable_to_parse_models_response": "Modell-Antwort konnte nicht geparst werden; rohe Antwort: %s",
   "scraping_not_configured": "Scraping-Funktionalität ist nicht konfiguriert. Bitte richte Jina ein, um Scraping zu aktivieren",
   "could_not_determine_home_dir": "konnte Benutzer-Home-Verzeichnis nicht bestimmen: %w",
   "could_not_stat_env_file": "konnte .env-Datei nicht überprüfen: %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "empty seconds string",
   "youtube_invalid_seconds_format": "invalid seconds format %q: %w",
   "error_fetching_playlist_videos": "error fetching playlist videos: %w",
+  "openai_api_base_url_not_configured": "API base URL not configured for provider %s",
+  "openai_failed_to_create_models_url": "failed to create models URL: %w",
+  "openai_unexpected_status_code_with_body": "unexpected status code: %d from provider %s, response body: %s",
+  "openai_unexpected_status_code_read_error_partial": "unexpected status code: %d from provider %s (error reading body: %v), partial response: %s",
+  "openai_unexpected_status_code_read_error": "unexpected status code: %d from provider %s (failed to read response body: %v)",
+  "openai_unable_to_parse_models_response": "unable to parse models response; raw response: %s",
   "scraping_not_configured": "scraping functionality is not configured. Please set up Jina to enable scraping",
   "could_not_determine_home_dir": "could not determine user home directory: %w",
   "could_not_stat_env_file": "could not stat .env file: %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "cadena de segundos vacía",
   "youtube_invalid_seconds_format": "formato de segundos inválido %q: %w",
   "error_fetching_playlist_videos": "error al obtener videos de la lista de reproducción: %w",
+  "openai_api_base_url_not_configured": "URL base de API no configurada para el proveedor %s",
+  "openai_failed_to_create_models_url": "error al crear URL de modelos: %w",
+  "openai_unexpected_status_code_with_body": "código de estado inesperado: %d del proveedor %s, cuerpo de respuesta: %s",
+  "openai_unexpected_status_code_read_error_partial": "código de estado inesperado: %d del proveedor %s (error al leer cuerpo: %v), respuesta parcial: %s",
+  "openai_unexpected_status_code_read_error": "código de estado inesperado: %d del proveedor %s (error al leer cuerpo de respuesta: %v)",
+  "openai_unable_to_parse_models_response": "no se pudo analizar la respuesta de modelos; respuesta cruda: %s",
   "scraping_not_configured": "la funcionalidad de extracción no está configurada. Por favor configura Jina para habilitar la extracción",
   "could_not_determine_home_dir": "no se pudo determinar el directorio home del usuario: %w",
   "could_not_stat_env_file": "no se pudo verificar el archivo .env: %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "رشته ثانیه خالی",
   "youtube_invalid_seconds_format": "فرمت ثانیه نامعتبر %q: %w",
   "error_fetching_playlist_videos": "خطا در دریافت ویدیوهای فهرست پخش: %w",
+  "openai_api_base_url_not_configured": "URL پایه API برای ارائهدهنده %s پیکربندی نشده است",
+  "openai_failed_to_create_models_url": "ایجاد URL مدلها ناموفق بود: %w",
+  "openai_unexpected_status_code_with_body": "کد وضعیت غیرمنتظره: %d از ارائهدهنده %s، پاسخ: %s",
+  "openai_unexpected_status_code_read_error_partial": "کد وضعیت غیرمنتظره: %d از ارائهدهنده %s (خطا در خواندن: %v)، پاسخ جزئی: %s",
+  "openai_unexpected_status_code_read_error": "کد وضعیت غیرمنتظره: %d از ارائهدهنده %s (خطا در خواندن پاسخ: %v)",
+  "openai_unable_to_parse_models_response": "تجزیه پاسخ مدلها ناموفق بود; پاسخ خام: %s",
   "scraping_not_configured": "قابلیت استخراج داده پیکربندی نشده است. لطفاً Jina را برای فعالسازی استخراج تنظیم کنید",
   "could_not_determine_home_dir": "نتوانست دایرکتوری خانه کاربر را تعیین کند: %w",
   "could_not_stat_env_file": "نتوانست وضعیت فایل .env را بررسی کند: %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "chaîne de secondes vide",
   "youtube_invalid_seconds_format": "format de secondes invalide %q : %w",
   "error_fetching_playlist_videos": "erreur lors de la récupération des vidéos de la liste de lecture : %w",
+  "openai_api_base_url_not_configured": "URL de base de l'API non configurée pour le fournisseur %s",
+  "openai_failed_to_create_models_url": "échec de création de l'URL des modèles : %w",
+  "openai_unexpected_status_code_with_body": "code d'état inattendu : %d du fournisseur %s, corps de réponse : %s",
+  "openai_unexpected_status_code_read_error_partial": "code d'état inattendu : %d du fournisseur %s (erreur de lecture : %v), réponse partielle : %s",
+  "openai_unexpected_status_code_read_error": "code d'état inattendu : %d du fournisseur %s (échec de lecture du corps de réponse : %v)",
+  "openai_unable_to_parse_models_response": "impossible d'analyser la réponse des modèles ; réponse brute : %s",
   "scraping_not_configured": "la fonctionnalité de scraping n'est pas configurée. Veuillez configurer Jina pour activer le scraping",
   "could_not_determine_home_dir": "impossible de déterminer le répertoire home de l'utilisateur : %w",
   "could_not_stat_env_file": "impossible de vérifier le fichier .env : %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "stringa di secondi vuota",
   "youtube_invalid_seconds_format": "formato secondi non valido %q: %w",
   "error_fetching_playlist_videos": "errore nel recupero dei video della playlist: %w",
+  "openai_api_base_url_not_configured": "URL base API non configurato per il provider %s",
+  "openai_failed_to_create_models_url": "impossibile creare URL modelli: %w",
+  "openai_unexpected_status_code_with_body": "codice di stato imprevisto: %d dal provider %s, corpo risposta: %s",
+  "openai_unexpected_status_code_read_error_partial": "codice di stato imprevisto: %d dal provider %s (errore lettura corpo: %v), risposta parziale: %s",
+  "openai_unexpected_status_code_read_error": "codice di stato imprevisto: %d dal provider %s (errore lettura corpo risposta: %v)",
+  "openai_unable_to_parse_models_response": "impossibile analizzare risposta modelli; risposta grezza: %s",
   "scraping_not_configured": "la funzionalità di scraping non è configurata. Per favore configura Jina per abilitare lo scraping",
   "could_not_determine_home_dir": "impossibile determinare la directory home dell'utente: %w",
   "could_not_stat_env_file": "impossibile verificare il file .env: %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "空の秒文字列",
   "youtube_invalid_seconds_format": "無効な秒形式 %q: %w",
   "error_fetching_playlist_videos": "プレイリスト動画の取得エラー: %w",
+  "openai_api_base_url_not_configured": "プロバイダー %s のAPIベースURLが設定されていません",
+  "openai_failed_to_create_models_url": "モデルURLの作成に失敗しました: %w",
+  "openai_unexpected_status_code_with_body": "予期しないステータスコード: プロバイダー %s から %d、レスポンス本文: %s",
+  "openai_unexpected_status_code_read_error_partial": "予期しないステータスコード: プロバイダー %s から %d (本文読み取りエラー: %v)、部分的なレスポンス: %s",
+  "openai_unexpected_status_code_read_error": "予期しないステータスコード: プロバイダー %s から %d (レスポンス本文の読み取りに失敗: %v)",
+  "openai_unable_to_parse_models_response": "モデルレスポンスの解析に失敗しました; 生のレスポンス: %s",
   "scraping_not_configured": "スクレイピング機能が設定されていません。スクレイピングを有効にするためにJinaを設定してください",
   "could_not_determine_home_dir": "ユーザーのホームディレクトリを特定できませんでした: %w",
   "could_not_stat_env_file": ".envファイルの状態を確認できませんでした: %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "string de segundos vazia",
   "youtube_invalid_seconds_format": "formato de segundos inválido %q: %w",
   "error_fetching_playlist_videos": "erro ao buscar vídeos da playlist: %w",
+  "openai_api_base_url_not_configured": "URL base da API não configurada para o provedor %s",
+  "openai_failed_to_create_models_url": "falha ao criar URL de modelos: %w",
+  "openai_unexpected_status_code_with_body": "código de status inesperado: %d do provedor %s, corpo da resposta: %s",
+  "openai_unexpected_status_code_read_error_partial": "código de status inesperado: %d do provedor %s (erro ao ler corpo: %v), resposta parcial: %s",
+  "openai_unexpected_status_code_read_error": "código de status inesperado: %d do provedor %s (falha ao ler corpo da resposta: %v)",
+  "openai_unable_to_parse_models_response": "não foi possível analisar a resposta de modelos; resposta bruta: %s",
   "scraping_not_configured": "funcionalidade de scraping não está configurada. Por favor configure o Jina para ativar o scraping",
   "could_not_determine_home_dir": "não foi possível determinar o diretório home do usuário: %w",
   "could_not_stat_env_file": "não foi possível verificar o arquivo .env: %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "cadeia de segundos vazia",
   "youtube_invalid_seconds_format": "formato de segundos inválido %q: %w",
   "error_fetching_playlist_videos": "erro ao obter vídeos da playlist: %w",
+  "openai_api_base_url_not_configured": "URL base da API não configurado para o fornecedor %s",
+  "openai_failed_to_create_models_url": "falha ao criar URL de modelos: %w",
+  "openai_unexpected_status_code_with_body": "código de estado inesperado: %d do fornecedor %s, corpo da resposta: %s",
+  "openai_unexpected_status_code_read_error_partial": "código de estado inesperado: %d do fornecedor %s (erro ao ler corpo: %v), resposta parcial: %s",
+  "openai_unexpected_status_code_read_error": "código de estado inesperado: %d do fornecedor %s (falha ao ler corpo da resposta: %v)",
+  "openai_unable_to_parse_models_response": "não foi possível analisar a resposta de modelos; resposta bruta: %s",
   "scraping_not_configured": "funcionalidade de scraping não está configurada. Por favor configure o Jina para ativar o scraping",
   "could_not_determine_home_dir": "não foi possível determinar o diretório home do utilizador: %w",
   "could_not_stat_env_file": "não foi possível verificar o ficheiro .env: %w",

@@ -28,6 +28,12 @@
   "youtube_empty_seconds_string": "秒数字符串为空",
   "youtube_invalid_seconds_format": "无效的秒数格式 %q:%w",
   "error_fetching_playlist_videos": "获取播放列表视频时出错: %w",
+  "openai_api_base_url_not_configured": "未为提供商 %s 配置 API 基础 URL",
+  "openai_failed_to_create_models_url": "创建模型 URL 失败:%w",
+  "openai_unexpected_status_code_with_body": "意外的状态码:来自提供商 %s 的 %d,响应主体:%s",
+  "openai_unexpected_status_code_read_error_partial": "意外的状态码:来自提供商 %s 的 %d(读取主体错误:%v),部分响应:%s",
+  "openai_unexpected_status_code_read_error": "意外的状态码:来自提供商 %s 的 %d(读取响应主体失败:%v)",
+  "openai_unable_to_parse_models_response": "无法解析模型响应;原始响应:%s",
   "scraping_not_configured": "抓取功能未配置。请设置 Jina 以启用抓取功能",
   "could_not_determine_home_dir": "无法确定用户主目录: %w",
   "could_not_stat_env_file": "无法获取 .env 文件状态: %w",
internal/plugins/ai/openai/direct_models.go (new file, 120 additions)
@@ -0,0 +1,120 @@
package openai

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"time"

	"github.com/danielmiessler/fabric/internal/i18n"
	debuglog "github.com/danielmiessler/fabric/internal/log"
)

// modelResponse represents a minimal model returned by the API.
// This mirrors the shape used by OpenAI-compatible providers that return
// either an array of models or an object with a `data` field.
type modelResponse struct {
	ID string `json:"id"`
}

// errorResponseLimit defines the maximum length of error response bodies for truncation.
const errorResponseLimit = 1024

// maxResponseSize defines the maximum size of response bodies to prevent memory exhaustion.
const maxResponseSize = 10 * 1024 * 1024 // 10MB

// FetchModelsDirectly is used to fetch models directly from the API when the
// standard OpenAI SDK method fails due to a nonstandard format. This is useful
// for providers that return a direct array of models (e.g., GitHub Models) or
// other OpenAI-compatible implementations.
func FetchModelsDirectly(ctx context.Context, baseURL, apiKey, providerName string) ([]string, error) {
	if ctx == nil {
		ctx = context.Background()
	}
	if baseURL == "" {
		return nil, fmt.Errorf(i18n.T("openai_api_base_url_not_configured"), providerName)
	}

	// Build the /models endpoint URL
	fullURL, err := url.JoinPath(baseURL, "models")
	if err != nil {
		return nil, fmt.Errorf(i18n.T("openai_failed_to_create_models_url"), err)
	}

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, fullURL, nil)
	if err != nil {
		return nil, err
	}

	req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", apiKey))
	req.Header.Set("Accept", "application/json")

	// TODO: Consider reusing a single http.Client instance (e.g., as a field on
	// Client) instead of allocating a new one for each request.
	client := &http.Client{
		Timeout: 10 * time.Second,
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		// Read the response body for debugging, but limit the number of bytes read
		bodyBytes, readErr := io.ReadAll(io.LimitReader(resp.Body, errorResponseLimit))
		if readErr != nil {
			return nil, fmt.Errorf(i18n.T("openai_unexpected_status_code_read_error"),
				resp.StatusCode, providerName, readErr)
		}
		bodyString := string(bodyBytes)
		return nil, fmt.Errorf(i18n.T("openai_unexpected_status_code_with_body"),
			resp.StatusCode, providerName, bodyString)
	}

	// Read the response body once, with a size limit to prevent memory exhaustion.
	// Read up to maxResponseSize + 1 bytes to detect truncation.
	bodyBytes, err := io.ReadAll(io.LimitReader(resp.Body, maxResponseSize+1))
	if err != nil {
		return nil, err
	}
	if len(bodyBytes) > maxResponseSize {
		return nil, fmt.Errorf(i18n.T("openai_models_response_too_large"), providerName, maxResponseSize)
	}

	// Try to parse as an object with a data field (OpenAI format)
	var openAIFormat struct {
		Data []modelResponse `json:"data"`
	}
	// Try to parse as a direct array
	var directArray []modelResponse

	if err := json.Unmarshal(bodyBytes, &openAIFormat); err == nil {
		debuglog.Debug(debuglog.Detailed, "Successfully parsed models response from %s using OpenAI format (found %d models)\n", providerName, len(openAIFormat.Data))
		return extractModelIDs(openAIFormat.Data), nil
	}

	if err := json.Unmarshal(bodyBytes, &directArray); err == nil {
		debuglog.Debug(debuglog.Detailed, "Successfully parsed models response from %s using direct array format (found %d models)\n", providerName, len(directArray))
		return extractModelIDs(directArray), nil
	}

	var truncatedBody string
	if len(bodyBytes) > errorResponseLimit {
		truncatedBody = string(bodyBytes[:errorResponseLimit]) + "..."
	} else {
		truncatedBody = string(bodyBytes)
	}
	return nil, fmt.Errorf(i18n.T("openai_unable_to_parse_models_response"), truncatedBody)
}

func extractModelIDs(models []modelResponse) []string {
	modelIDs := make([]string, 0, len(models))
	for _, model := range models {
		modelIDs = append(modelIDs, model.ID)
	}
	return modelIDs
}
@@ -8,6 +8,7 @@ import (
 
 	"github.com/danielmiessler/fabric/internal/chat"
 	"github.com/danielmiessler/fabric/internal/domain"
+	debuglog "github.com/danielmiessler/fabric/internal/log"
 	"github.com/danielmiessler/fabric/internal/plugins"
 	openai "github.com/openai/openai-go"
 	"github.com/openai/openai-go/option"
@@ -83,13 +84,19 @@ func (o *Client) configure() (ret error) {
 
 func (o *Client) ListModels() (ret []string, err error) {
 	var page *pagination.Page[openai.Model]
-	if page, err = o.ApiClient.Models.List(context.Background()); err != nil {
-		return
+	if page, err = o.ApiClient.Models.List(context.Background()); err == nil {
+		for _, mod := range page.Data {
+			ret = append(ret, mod.ID)
+		}
+		// SDK succeeded - return the result even if empty
+		return ret, nil
 	}
-	for _, mod := range page.Data {
-		ret = append(ret, mod.ID)
-	}
-	return
+
+	// SDK returned an error - fall back to direct API fetch.
+	// Some providers (e.g., GitHub Models) return non-standard response formats
+	// that the SDK fails to parse.
+	debuglog.Debug(debuglog.Basic, "SDK Models.List failed for %s: %v, falling back to direct API fetch\n", o.GetName(), err)
+	return FetchModelsDirectly(context.Background(), o.ApiBaseURL.Value, o.ApiKey.Value, o.GetName())
 }
 
 func (o *Client) SendStream(
internal/plugins/ai/openai/openai_models_test.go (new file, 58 additions)
@@ -0,0 +1,58 @@
package openai

import (
	"context"
	"net/http"
	"net/http/httptest"
	"testing"

	"github.com/stretchr/testify/assert"
)

// Ensures we can fetch models directly when a provider returns a direct array of models
// instead of the standard OpenAI list response structure.
func TestFetchModelsDirectly_DirectArray(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		assert.Equal(t, "/models", r.URL.Path)
		w.Header().Set("Content-Type", "application/json")
		_, err := w.Write([]byte(`[{"id":"github-model"}]`))
		assert.NoError(t, err)
	}))
	defer srv.Close()

	models, err := FetchModelsDirectly(context.Background(), srv.URL, "test-key", "TestProvider")
	assert.NoError(t, err)
	assert.Equal(t, 1, len(models))
	assert.Equal(t, "github-model", models[0])
}

// Ensures we can fetch models when a provider returns the standard OpenAI format
func TestFetchModelsDirectly_OpenAIFormat(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		assert.Equal(t, "/models", r.URL.Path)
		w.Header().Set("Content-Type", "application/json")
		_, err := w.Write([]byte(`{"data":[{"id":"openai-model"}]}`))
		assert.NoError(t, err)
	}))
	defer srv.Close()

	models, err := FetchModelsDirectly(context.Background(), srv.URL, "test-key", "TestProvider")
	assert.NoError(t, err)
	assert.Equal(t, 1, len(models))
	assert.Equal(t, "openai-model", models[0])
}

// Ensures we handle empty model lists correctly
func TestFetchModelsDirectly_EmptyArray(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		assert.Equal(t, "/models", r.URL.Path)
		w.Header().Set("Content-Type", "application/json")
		_, err := w.Write([]byte(`[]`))
		assert.NoError(t, err)
	}))
	defer srv.Close()

	models, err := FetchModelsDirectly(context.Background(), srv.URL, "test-key", "TestProvider")
	assert.NoError(t, err)
	assert.Equal(t, 0, len(models))
}
@@ -2,104 +2,12 @@ package openai_compatible
 
 import (
 	"context"
-	"encoding/json"
-	"fmt"
-	"io"
-	"net/http"
-	"net/url"
-	"time"
+
+	"github.com/danielmiessler/fabric/internal/plugins/ai/openai"
 )
 
-// Model represents a model returned by the API
-type Model struct {
-	ID string `json:"id"`
-}
-
-// ErrorResponseLimit defines the maximum length of error response bodies for truncation.
-const errorResponseLimit = 1024 // Limit for error response body size
-
-// DirectlyGetModels is used to fetch models directly from the API
-// when the standard OpenAI SDK method fails due to a nonstandard format.
-// This is useful for providers like Together that return a direct array of models.
+// DirectlyGetModels is used to fetch models directly from the API when the
+// standard OpenAI SDK method fails due to a nonstandard format.
 func (c *Client) DirectlyGetModels(ctx context.Context) ([]string, error) {
-	if ctx == nil {
-		ctx = context.Background()
-	}
-	baseURL := c.ApiBaseURL.Value
-	if baseURL == "" {
-		return nil, fmt.Errorf("API base URL not configured for provider %s", c.GetName())
-	}
-
-	// Build the /models endpoint URL
-	fullURL, err := url.JoinPath(baseURL, "models")
-	if err != nil {
-		return nil, fmt.Errorf("failed to create models URL: %w", err)
-	}
-
-	req, err := http.NewRequestWithContext(ctx, "GET", fullURL, nil)
-	if err != nil {
-		return nil, err
-	}
-
-	req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", c.ApiKey.Value))
-	req.Header.Set("Accept", "application/json")
-
-	// TODO: Consider reusing a single http.Client instance (e.g., as a field on Client) instead of allocating a new one for each request.
-
-	client := &http.Client{
-		Timeout: 10 * time.Second,
-	}
-	resp, err := client.Do(req)
-	if err != nil {
-		return nil, err
-	}
-	defer resp.Body.Close()
-
-	if resp.StatusCode != http.StatusOK {
-		// Read the response body for debugging
-		bodyBytes, _ := io.ReadAll(resp.Body)
-		bodyString := string(bodyBytes)
-		if len(bodyString) > errorResponseLimit { // Truncate if too large
-			bodyString = bodyString[:errorResponseLimit] + "..."
-		}
-		return nil, fmt.Errorf("unexpected status code: %d from provider %s, response body: %s",
-			resp.StatusCode, c.GetName(), bodyString)
-	}
-
-	// Read the response body once
-	bodyBytes, err := io.ReadAll(resp.Body)
-	if err != nil {
-		return nil, err
-	}
-
-	// Try to parse as an object with data field (OpenAI format)
-	var openAIFormat struct {
-		Data []Model `json:"data"`
-	}
-	// Try to parse as a direct array (Together format)
-	var directArray []Model
-
-	if err := json.Unmarshal(bodyBytes, &openAIFormat); err == nil && len(openAIFormat.Data) > 0 {
-		return extractModelIDs(openAIFormat.Data), nil
-	}
-
-	if err := json.Unmarshal(bodyBytes, &directArray); err == nil && len(directArray) > 0 {
-		return extractModelIDs(directArray), nil
-	}
-
-	var truncatedBody string
-	if len(bodyBytes) > errorResponseLimit {
-		truncatedBody = string(bodyBytes[:errorResponseLimit]) + "..."
-	} else {
-		truncatedBody = string(bodyBytes)
-	}
-	return nil, fmt.Errorf("unable to parse models response; raw response: %s", truncatedBody)
-}
-
-func extractModelIDs(models []Model) []string {
-	modelIDs := make([]string, 0, len(models))
-	for _, model := range models {
-		modelIDs = append(modelIDs, model.ID)
-	}
-	return modelIDs
+	return openai.FetchModelsDirectly(ctx, c.ApiBaseURL.Value, c.ApiKey.Value, c.GetName())
 }
@@ -12,17 +12,21 @@ import (
 type ProviderConfig struct {
 	Name                string
 	BaseURL             string
-	ImplementsResponses bool // Whether the provider supports OpenAI's new Responses API
+	ModelsURL           string // Optional: Custom endpoint for listing models (if different from BaseURL/models)
+	ImplementsResponses bool   // Whether the provider supports OpenAI's new Responses API
 }
 
 // Client is the common structure for all OpenAI-compatible providers
 type Client struct {
 	*openai.Client
+	modelsURL string // Custom URL for listing models (if different from BaseURL/models)
 }
 
 // NewClient creates a new OpenAI-compatible client for the specified provider
 func NewClient(providerConfig ProviderConfig) *Client {
-	client := &Client{}
+	client := &Client{
+		modelsURL: providerConfig.ModelsURL,
+	}
 	client.Client = openai.NewClientCompatibleWithResponses(
 		providerConfig.Name,
 		providerConfig.BaseURL,
@@ -34,14 +38,20 @@ func NewClient(providerConfig ProviderConfig) *Client {
 
 // ListModels overrides the default ListModels to handle different response formats
 func (c *Client) ListModels() ([]string, error) {
+	// If a custom models URL is provided, use direct fetch with that URL
+	if c.modelsURL != "" {
+		// TODO: Handle context properly in Fabric by accepting and propagating a context.Context
+		// instead of creating a new one here.
+		return openai.FetchModelsDirectly(context.Background(), c.modelsURL, c.Client.ApiKey.Value, c.GetName())
+	}
+
 	// First try the standard OpenAI SDK approach
 	models, err := c.Client.ListModels()
 	if err == nil && len(models) > 0 { // only return if OpenAI SDK returns models
 		return models, nil
 	}
 
+	// TODO: Handle context properly in Fabric by accepting and propagating a context.Context
+	// instead of creating a new one here.
 	// Fall back to direct API fetch
 	return c.DirectlyGetModels(context.Background())
 }
@@ -62,6 +72,12 @@ var ProviderMap = map[string]ProviderConfig{
 		BaseURL:             "https://api.deepseek.com",
 		ImplementsResponses: false,
 	},
+	"GitHub": {
+		Name:                "GitHub",
+		BaseURL:             "https://models.github.ai/inference",
+		ModelsURL:           "https://models.github.ai/catalog", // FetchModelsDirectly will append /models
+		ImplementsResponses: false,
+	},
 	"GrokAI": {
 		Name:    "GrokAI",
 		BaseURL: "https://api.x.ai/v1",