docs: update README for GraphQL optimization and AI summary features

### CHANGES

- Detail GraphQL API usage for faster PR fetching
- Introduce AI-powered summaries via Fabric integration
- Explain content-based caching for AI summaries
- Document support for loading secrets from .env files
- Add usage examples for new AI summary feature
- Clarify project license is The MIT License
This commit is contained in:
Kayvan Sylvan
2025-07-13 21:57:11 -07:00
parent fad0a065d4
commit 47f75237ff

View File

@@ -11,8 +11,12 @@ A high-performance changelog generator for Git repositories that automatically c
- **Unreleased changes**: Tracks all commits since the last release
- **Concurrent processing**: Parallel GitHub API calls for improved performance
- **Flexible output**: Generate complete changelogs or target specific versions
- **Optimized PR fetching**: Batch fetches all merged PRs using GitHub Search API (drastically reduces API calls)
- **GraphQL optimization**: Ultra-fast PR fetching using GitHub GraphQL API (~5-10 calls vs 1000s)
- **Intelligent sync**: Automatically syncs new PRs every 24 hours or when missing PRs are detected
- **AI-powered summaries**: Optional Fabric integration for enhanced changelog summaries
- **Advanced caching**: Content-based change detection for AI summaries with hash comparison
- **Author type detection**: Distinguishes between users, bots, and organizations
- **Lightning-fast incremental updates**: SHA→PR mapping for instant git operations
## Installation
@@ -23,26 +27,31 @@ go install github.com/danielmiessler/fabric/cmd/generate_changelog@latest
## Usage
### Basic usage (generate complete changelog)
```bash
generate_changelog
```
### Save to file
```bash
generate_changelog -o CHANGELOG.md
```
### Generate for specific version
```bash
generate_changelog -v v1.4.244
```
### Limit to recent versions
```bash
generate_changelog -l 10
```
### Using GitHub token for private repos or higher rate limits
```bash
export GITHUB_TOKEN=your_token_here
generate_changelog
@@ -51,7 +60,18 @@ generate_changelog
generate_changelog --token your_token_here
```
### AI-enhanced summaries
```bash
# Enable AI summaries using Fabric
generate_changelog --ai-summarize
# Use custom model for AI summaries
FABRIC_CHANGELOG_SUMMARIZE_MODEL=claude-opus-4 generate_changelog --ai-summarize
```
### Cache management
```bash
# Rebuild cache from scratch
generate_changelog --rebuild-cache
@@ -80,6 +100,7 @@ generate_changelog --cache /path/to/cache.db
| `--rebuild-cache` | | Rebuild cache from scratch | false |
| `--force-pr-sync` | | Force a full PR sync from GitHub | false |
| `--token` | | GitHub API token | `$GITHUB_TOKEN` |
| `--ai-summarize` | | Generate AI-enhanced summaries using Fabric | false |
## Output Format
@@ -120,54 +141,118 @@ The generated changelog follows this structure:
- **Concurrent API calls**: Processes up to 10 GitHub API requests in parallel
- **Smart caching**: SQLite cache eliminates redundant API calls
- **Incremental updates**: Only processes new commits on subsequent runs
- **Batch PR fetching**: Uses GitHub Search API to fetch all merged PRs in minimal API calls
- **GraphQL optimization**: Uses GitHub GraphQL API to fetch all PR data in ~5-10 calls
- **AI-powered summaries**: Optional Fabric integration with intelligent caching
- **Content-based change detection**: AI summaries only regenerated when content changes
- **Lightning-fast git operations**: SHA→PR mapping stored in database for instant lookups
### Major Optimization: Batch PR Fetching
### Major Optimization: GraphQL + Advanced Caching
The tool has been optimized to drastically reduce GitHub API calls:
The tool has been optimized to drastically reduce GitHub API calls and improve performance:
**Previous approach**: Individual API calls for each PR (2 API calls per PR)
**Before**: Individual API calls for each PR (2 API calls per PR - one for PR details, one for commits)
- For a repo with 500 PRs: 1,000 API calls
**After**: Batch fetching using GitHub Search API
- For a repo with 500 PRs: ~10 API calls (search) + 500 API calls (details) = ~510 API calls
- **50% reduction in API calls!**
**Current approach**: GraphQL batch fetching with intelligent caching
- For a repo with 500 PRs: ~5-10 GraphQL calls (initial fetch) + 0 calls (subsequent runs with cache)
- **99%+ reduction in API calls after initial run!**
The optimization includes:
1. **Batch Search**: Uses GitHub's Search API to find all merged PRs in paginated batches
2. **Smart Caching**: Stores complete PR data and tracks last sync timestamp
3. **Incremental Sync**: Only fetches PRs merged after the last sync
1. **GraphQL Batch Fetch**: Uses GitHub's GraphQL API to fetch all merged PRs with commits in minimal calls
2. **Smart Caching**: Stores complete PR data, commits, and SHA mappings in SQLite
3. **Incremental Sync**: Only fetches PRs merged after the last sync timestamp
4. **Automatic Refresh**: PRs are synced every 24 hours or when missing PRs are detected
5. **Fallback Support**: If batch fetch fails, falls back to individual PR fetching
5. **AI Summary Caching**: Content-based change detection prevents unnecessary AI regeneration
6. **Fallback Support**: If GraphQL fails, falls back to REST API batch fetching
7. **Lightning Git Operations**: Pre-computed SHA→PR mappings for instant commit association
## Requirements
- Go 1.24+ (for installation from source)
- Git repository
- GitHub token (optional, for private repos or higher rate limits)
- Fabric CLI (optional, for AI-enhanced summaries)
## Authentication
The tool supports GitHub authentication via:
1. Environment variable: `export GITHUB_TOKEN=your_token`
2. Command line flag: `--token your_token`
3. `.env` file in the same directory as the binary
### Environment File Support
Create a `.env` file next to the `generate_changelog` binary:
```bash
GITHUB_TOKEN=your_github_token_here
FABRIC_CHANGELOG_SUMMARIZE_MODEL=claude-sonnet-4-20250514
```
The tool automatically loads `.env` files for convenient configuration management.
Without authentication, the tool is limited to 60 GitHub API requests per hour.
## Caching
The SQLite cache stores:
- Version information and commit associations
- Pull request details (title, body, commits, authors)
- Last processed commit SHA for incremental updates
- Last PR sync timestamp for intelligent refresh
- AI summaries with content-based change detection
- SHA→PR mappings for lightning-fast git operations
Cache benefits:
- Instant changelog regeneration
- Drastically reduced GitHub API usage (50%+ reduction)
- Drastically reduced GitHub API usage (99%+ reduction after initial run)
- Offline changelog generation (after initial cache build)
- Automatic PR data refresh every 24 hours
- Batch database transactions for better performance
- Content-aware AI summary regeneration
## AI-Enhanced Summaries
The tool can generate AI-powered summaries using Fabric for more polished, professional changelogs:
```bash
# Enable AI summarization
generate_changelog --ai-summarize
# Custom model (default: claude-sonnet-4-20250514)
FABRIC_CHANGELOG_SUMMARIZE_MODEL=claude-opus-4 generate_changelog --ai-summarize
```
### AI Summary Features
- **Content-based change detection**: AI summaries are only regenerated when version content changes
- **Intelligent caching**: Preserves existing summaries and only processes changed versions
- **Content hash comparison**: Uses SHA256 hashing to detect when "Unreleased" content changes
- **Automatic fallback**: Falls back to raw content if AI processing fails
- **Error detection**: Identifies and handles AI processing errors gracefully
- **Minimum content filtering**: Skips AI processing for very brief content (< 256 characters)
### AI Model Configuration
Set the model via environment variable:
```bash
export FABRIC_CHANGELOG_SUMMARIZE_MODEL=claude-opus-4
# or
export FABRIC_CHANGELOG_SUMMARIZE_MODEL=gpt-4
```
AI summaries are cached and only regenerated when:
- Version content changes (detected via hash comparison)
- No existing AI summary exists for the version
- Force rebuild is requested
## Contributing
@@ -175,4 +260,4 @@ This tool is part of the Fabric project. Contributions are welcome!
## License
Same as the Fabric project.
The MIT License. Same as the Fabric project.