- Add comprehensive unit tests for `parseOllamaNumCtx` function
- Remove redundant negative value checks in float parsing
- Simplify error messages to avoid exposing internal type info
- Streamline error response in `ollamaChat` handler
- Add helper functions for string containment in tests
- Cover edge cases including overflow, invalid types, and boundaries
This commit fixes the Ollama server /api/chat endpoint, which ignored the
client-provided num_ctx parameter and the global DEFAULT_MODEL_CONTEXT_LENGTH,
always using a hardcoded value of 2048 tokens.
- Add parseOllamaNumCtx() function in ollama.go with type-safe extraction
supporting 6 numeric types (float64, float32, int, int64, json.Number, string);
see the sketch after this commit's notes
- Extract num_ctx from client request options in ollamaChat()
- Add ModelContextLength field to ChatRequest struct in chat.go
- Replace hardcoded 2048 with request.ModelContextLength in GetChatter() call
- Platform-aware integer overflow protection for 32-bit systems
- DoS protection via 1,000,000 token maximum limit
- Long string truncation in error messages (50 char limit)
- Sanitized error messages (no internal stdlib details exposed)
- Missing/null num_ctx returns (0, nil) to trigger existing default fallback
- Zero API contract changes
- Invalid values return 400 Bad Request with clear error messages
- All existing tests pass
- Compilation successful with no errors or warnings
Fixes #1942
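A minimal sketch of what parseOllamaNumCtx could look like, assuming the behaviors listed above; the helper name, exact limits, and error wording here are illustrative, not the repository's actual code:

```go
package ollama

import (
	"encoding/json"
	"fmt"
	"math"
	"strconv"
)

// maxNumCtx caps num_ctx at one million tokens (DoS protection).
const maxNumCtx = 1_000_000

// checkNumCtxRange centralizes the negative and maximum checks so the
// per-type branches stay free of redundant validation.
func checkNumCtxRange(n int) (int, error) {
	if n < 0 {
		return 0, fmt.Errorf("num_ctx must be non-negative")
	}
	if n > maxNumCtx {
		return 0, fmt.Errorf("num_ctx exceeds maximum of %d", maxNumCtx)
	}
	return n, nil
}

// parseOllamaNumCtx extracts num_ctx from a decoded JSON value.
// Missing/null returns (0, nil) so the caller falls back to the default
// context length. Error messages avoid internal stdlib details.
func parseOllamaNumCtx(raw any) (int, error) {
	switch v := raw.(type) {
	case nil:
		return 0, nil
	case float64: // encoding/json decodes JSON numbers as float64
		if v != math.Trunc(v) {
			return 0, fmt.Errorf("num_ctx must be a whole number")
		}
		if v > math.MaxInt32 || v < math.MinInt32 { // safe even on 32-bit builds
			return 0, fmt.Errorf("num_ctx out of range")
		}
		return checkNumCtxRange(int(v))
	case float32:
		return parseOllamaNumCtx(float64(v))
	case int:
		return checkNumCtxRange(v)
	case int64:
		if v > math.MaxInt32 || v < math.MinInt32 {
			return 0, fmt.Errorf("num_ctx out of range")
		}
		return checkNumCtxRange(int(v))
	case json.Number:
		n, err := v.Int64()
		if err != nil {
			return 0, fmt.Errorf("num_ctx is not a valid integer")
		}
		return parseOllamaNumCtx(n)
	case string:
		if len(v) > 50 {
			v = v[:50] // truncate long inputs before echoing them back
		}
		n, err := strconv.Atoi(v)
		if err != nil {
			return 0, fmt.Errorf("num_ctx %q is not a valid integer", v)
		}
		return checkNumCtxRange(n)
	default:
		return 0, fmt.Errorf("num_ctx has an unsupported type")
	}
}
```

Routing every branch through checkNumCtxRange keeps the limits in one place, and truncating strings before formatting keeps error responses bounded.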
## CHANGES
- Add `Variables` field to `OllamaRequestBody` struct for direct variable passing
- Change `Options` field from empty struct to flexible `map[string]any` type
- Extract variables from top-level `Variables` field or nested `Options.variables`
- Support parsing variables as JSON string or map format
- Pass extracted variables to `PromptRequest` for single message chats
- Pass extracted variables to `PromptRequest` for multi-message chats
- Add `omitempty` JSON tags to optional fields
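A hedged sketch of the two extraction paths described above; the struct is trimmed to the fields these notes mention, and extractVariables is an illustrative name:

```go
package ollama

import (
	"encoding/json"
	"fmt"
)

// OllamaRequestBody, trimmed to the fields relevant here.
type OllamaRequestBody struct {
	Model     string            `json:"model,omitempty"`
	Variables map[string]string `json:"variables,omitempty"` // top-level variables
	Options   map[string]any    `json:"options,omitempty"`   // flexible options bag
}

// extractVariables prefers the top-level field, then falls back to
// Options["variables"], which may arrive as a map or a JSON string.
func extractVariables(req OllamaRequestBody) (map[string]string, error) {
	if len(req.Variables) > 0 {
		return req.Variables, nil
	}
	raw, ok := req.Options["variables"]
	if !ok {
		return nil, nil
	}
	switch v := raw.(type) {
	case map[string]any:
		out := make(map[string]string, len(v))
		for k, val := range v {
			out[k] = fmt.Sprint(val)
		}
		return out, nil
	case string: // variables sent as a JSON-encoded string
		var out map[string]string
		if err := json.Unmarshal([]byte(v), &out); err != nil {
			return nil, fmt.Errorf("options.variables is not valid JSON")
		}
		return out, nil
	default:
		return nil, fmt.Errorf("options.variables has an unsupported type")
	}
}
```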
## CHANGES
- Use int64 for prompt and eval count fields
- Skip sending secondary error message on stream write failure
- Allow non-HTTP schemes for the address, validating only the host
- Flush response only when writer implements http.Flusher
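The Flusher bullet reflects that not every http.ResponseWriter can flush; a minimal sketch of the guard:

```go
package server

import "net/http"

// flushIfPossible flushes buffered output only when the writer supports
// it; the type assertion avoids a panic on writers without Flush.
func flushIfPossible(w http.ResponseWriter) {
	if f, ok := w.(http.Flusher); ok {
		f.Flush()
	}
}
```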
## CHANGES
- Use Gin request context for outbound HTTP calls
- Send final stream chunk when response write fails
- Capture timing duration once for consistent metrics
- Build final Ollama response via shared helper function
- Validate Fabric base URL scheme is http/https only
- Add clarifying documentation comments for URL and writers
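Using the Gin request context for outbound calls ties the upstream request's lifetime to the client connection, so a client disconnect cancels the Fabric call; a sketch (function and parameter names are assumptions):

```go
package server

import (
	"io"
	"net/http"

	"github.com/gin-gonic/gin"
)

// callFabric issues the outbound request with the inbound request's
// context: if the client disconnects, the upstream call is canceled.
func callFabric(c *gin.Context, client *http.Client, fabricURL string, body io.Reader) (*http.Response, error) {
	req, err := http.NewRequestWithContext(c.Request.Context(), http.MethodPost, fabricURL, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return client.Do(req)
}
```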
## CHANGES
- Use int64 for `load_duration` JSON field values
- Use int64 for `prompt_eval_duration` JSON field values
- Remove lossy int casts when assigning nanosecond durations
- Keep duration payloads consistent with `total_duration` precision
- Prevent potential overflow on long-running request timing
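The fix reads naturally as: keep every duration field int64 nanoseconds, matching time.Duration's underlying type, so no narrowing cast is needed. A sketch with an illustrative field set:

```go
package server

import "time"

// All durations are nanoseconds; int64 matches time.Duration's
// underlying type, so long-running requests cannot overflow an int cast.
type ollamaTimings struct {
	TotalDuration      int64 `json:"total_duration"`
	LoadDuration       int64 `json:"load_duration"`
	PromptEvalDuration int64 `json:"prompt_eval_duration"`
}

func timingsFor(total, load, promptEval time.Duration) ollamaTimings {
	return ollamaTimings{
		TotalDuration:      total.Nanoseconds(),
		LoadDuration:       load.Nanoseconds(),
		PromptEvalDuration: promptEval.Nanoseconds(),
	}
}
```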
## CHANGES
- Validate that the parsed URL host does not start with a colon
- Return explicit error for missing hostname in URL
- Update unit test to expect error on port-only host
- Prevent accidental acceptance of malformed `https://:port` addresses
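url.Parse happily accepts addresses like `https://:8080`, so the hostname has to be checked explicitly; a hedged sketch of the validation:

```go
package server

import (
	"fmt"
	"net/url"
)

// validateAddress rejects port-only addresses such as "https://:8080":
// url.Parse accepts them, but Hostname() comes back empty.
func validateAddress(address string) error {
	u, err := url.Parse(address)
	if err != nil {
		return fmt.Errorf("invalid address %q", address)
	}
	if u.Hostname() == "" {
		return fmt.Errorf("address %q is missing a hostname", address)
	}
	return nil
}
```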
## CHANGES
- Set NDJSON content type before checking upstream errors
- Reject parsed URLs that omit a hostname
- Remove hardcoded eval count placeholders from responses
- Add unit tests for Fabric chat URL builder
- Cover colon-port, host:port, and IP address inputs
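The listed inputs fit Go's table-driven test style; a sketch assuming a buildFabricChatURL(address string) (string, error) signature, which the repository may define differently:

```go
package server

import "testing"

func TestBuildFabricChatURL(t *testing.T) {
	cases := []struct {
		name    string
		address string
		wantErr bool
	}{
		{"host and port", "http://localhost:8080", false},
		{"ip address", "http://127.0.0.1:8080", false},
		{"colon-port only", "https://:8080", true}, // missing hostname
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			_, err := buildFabricChatURL(tc.address)
			if (err != nil) != tc.wantErr {
				t.Fatalf("buildFabricChatURL(%q) error = %v, wantErr %v",
					tc.address, err, tc.wantErr)
			}
		})
	}
}
```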
## CHANGES
- Convert JSON error responses to use err.Error()
- Detect upstream Fabric non-2xx status before scanning
- Read and log upstream error body when possible
- Return upstream status error message for non-stream mode
- Stream error message via NDJSON when streaming enabled
- Set NDJSON Content-Type header before first streaming write
- Remove per-chunk header setting during streaming output
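Checking the upstream status before scanning, and keeping the NDJSON contract even for errors, might look like the following; the helper name and status-code choice are illustrative:

```go
package server

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

// handleUpstreamError reports a non-2xx Fabric response and returns true
// if it did; the caller only scans the body when this returns false.
func handleUpstreamError(w http.ResponseWriter, resp *http.Response, stream bool) bool {
	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
		return false
	}
	body, _ := io.ReadAll(io.LimitReader(resp.Body, 4096))
	if len(body) > 0 {
		log.Printf("fabric upstream error (%d): %s", resp.StatusCode, body)
	}
	msg := fmt.Sprintf("upstream returned status %d", resp.StatusCode)
	if stream {
		// keep streaming clients on NDJSON rather than switching to plain JSON
		w.Header().Set("Content-Type", "application/x-ndjson")
		_ = json.NewEncoder(w).Encode(map[string]string{"error": msg})
	} else {
		http.Error(w, msg, http.StatusBadGateway)
	}
	return true
}
```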
Addresses all 8 Copilot review comments on PR #1940:
Critical fixes:
- Replace log.Fatal with proper HTTP error response to prevent
server crashes on request failures
- Add streaming-aware error handling to maintain consistent
response format (prevents mixing JSON with NDJSON)
Error messaging improvements:
- Replace "testing endpoint" placeholders with descriptive
error messages
- Add clear context for unmarshal and scanning failures
Protocol compliance:
- Set Content-Type: application/x-ndjson for streaming responses
- Ensure all error paths respect stream vs non-stream mode
Code cleanup:
- Remove commented-out dead code
Tested both streaming and non-streaming modes successfully.
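The log.Fatal fix in one line of intent: fail the request, not the process. A minimal sketch:

```go
package server

import (
	"log"
	"net/http"
)

// respondError replaces the old log.Fatal(err), which exited the whole
// server; a 500 keeps it serving other clients.
func respondError(w http.ResponseWriter, err error) {
	log.Printf("chat request failed: %v", err)
	http.Error(w, "chat request failed: "+err.Error(), http.StatusInternalServerError)
}
```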
- Add `json:"-"` tag to exclude UpdateChan from JSON serialization
- Extract URL building logic into dedicated `buildFabricChatURL` helper function
- Replace single-read body parsing with streaming `bufio.Scanner` approach
- Add proper SSE data prefix parsing for fabric response format
- Implement real-time streaming with `writeOllamaResponse` helper function
- Add `writeOllamaResponseStruct` for consistent JSON response writing
- Handle both streaming and non-streaming response modes separately
- Add proper error handling for fabric error response types
- Ensure response body is properly closed with defer statement
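The bufio.Scanner rewrite processes SSE frames as they arrive instead of buffering the whole body; a hedged sketch of the loop (helper name assumed):

```go
package server

import (
	"bufio"
	"io"
	"net/http"
	"strings"
)

// streamFabricResponse scans the upstream SSE stream line by line,
// strips the "data: " prefix, and forwards each chunk immediately.
func streamFabricResponse(w http.ResponseWriter, body io.ReadCloser) error {
	defer body.Close() // always release the upstream connection

	scanner := bufio.NewScanner(body)
	for scanner.Scan() {
		data, ok := strings.CutPrefix(scanner.Text(), "data: ")
		if !ok || data == "" {
			continue // skip blank keep-alives and non-data fields
		}
		if _, err := w.Write([]byte(data + "\n")); err != nil {
			return err
		}
		if f, ok := w.(http.Flusher); ok {
			f.Flush() // real-time delivery: push each chunk as it arrives
		}
	}
	return scanner.Err()
}
```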
Update the SendStream interface to match the current Vendor interface,
which now uses chan domain.StreamUpdate instead of chan string.
Changes:
- Update SendStream signature to use chan domain.StreamUpdate
- Update sendChatMessageStream signature accordingly
- Update parseSSEStream signature accordingly
- Wrap all channel sends with domain.StreamUpdate{Type: StreamTypeContent}
This fixes the build error introduced when the streaming interface was
updated to support metadata like token usage alongside content.
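In practice the migration is a mechanical wrap at every send site; a sketch with the domain types assumed from the notes above:

```go
package vendor

// Assumed shape of domain.StreamUpdate; the real type also carries
// metadata such as token usage.
type StreamUpdate struct {
	Type    string
	Content string
}

const StreamTypeContent = "content"

// Before: ch <- text            (chan string)
// After:  wrap each send so metadata updates can share the channel.
func sendContent(ch chan<- StreamUpdate, text string) {
	ch <- StreamUpdate{Type: StreamTypeContent, Content: text}
}
```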
## CHANGES
- Add DigitalOcean as a new AI provider in plugin registry
- Implement DigitalOcean client with OpenAI-compatible inference endpoint
- Support model access key authentication for inference requests
- Add optional control plane token for model discovery
- Create DigitalOcean setup documentation with environment variables
- Update README to list DigitalOcean in supported providers
- Handle model listing via control plane API with fallback
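The fallback behavior for model listing could be sketched like this; the type, field, and method names are assumptions, not the actual client:

```go
package digitalocean

// Client, trimmed to the optional control plane credential.
type Client struct {
	controlPlaneToken string
}

var defaultModels = []string{"llama3.3-70b-instruct"} // illustrative only

// ListModels uses the control plane API when a token is configured,
// otherwise (or on failure) falls back to a curated default list.
func (c *Client) ListModels() []string {
	if c.controlPlaneToken == "" {
		return defaultModels
	}
	models, err := c.listViaControlPlane()
	if err != nil {
		return defaultModels // degrade gracefully on API failure
	}
	return models
}

func (c *Client) listViaControlPlane() ([]string, error) {
	// the real implementation calls the DigitalOcean control plane API
	return nil, nil
}
```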
## CHANGES
- Add Mammouth provider configuration with API base URL
- Configure Mammouth to use standard OpenAI-compatible interface
- Disable Responses API implementation for Mammouth provider
- Add "Mammouth" to VSCode spell check dictionary
Add centralized factory function for AI vendor plugin initialization:
- Add NewVendorPluginBase(name, configure) to internal/plugins/plugin.go
- Update 8 vendor files (anthropic, bedrock, gemini, lmstudio, ollama,
openai, perplexity, vertexai) to use the factory function
- Add 3 test cases for the new factory function
This removes ~40 lines of duplicated boilerplate code and ensures
consistent plugin initialization across all vendors.
MAESTRO: Loop 00001 refactoring implementation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
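A sketch of what such a factory might look like; the real PluginBase in internal/plugins/plugin.go certainly has more fields than shown:

```go
package plugins

// PluginBase, trimmed to the two pieces the factory touches.
type PluginBase struct {
	Name            string
	ConfigureCustom func() error
}

// NewVendorPluginBase centralizes the boilerplate each vendor used to
// repeat: set the plugin name and wire up its configure callback.
func NewVendorPluginBase(name string, configure func() error) *PluginBase {
	return &PluginBase{
		Name:            name,
		ConfigureCustom: configure,
	}
}
```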
### CHANGES
- Extract `getGeminiRegion` method for region determination
- Use `getGeminiRegion` in `sendGemini` for location setting
- Apply `getGeminiRegion` in `sendStreamGemini` for consistency
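Centralizing region selection keeps the two send paths from drifting; a hedged sketch with assumed receiver and field names:

```go
package vertexai

type Client struct {
	region string
}

// getGeminiRegion resolves the region once so sendGemini and
// sendStreamGemini always agree on the location setting.
func (c *Client) getGeminiRegion() string {
	if c.region != "" {
		return c.region
	}
	return "global" // assumed default, per the configuration notes below
}
```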
- Extract model fetching logic into a dedicated helper function.
- Improve response body cleanup during Vertex AI pagination loops.
- Remove unused time import and timeout constant from models.
- Streamline listPublisherModels function by delegating API requests to helper.
- Dynamic model listing from Vertex AI Model Garden API
- Support for both Gemini (genai SDK) and Claude (Anthropic SDK) models
- Curated Gemini model list (no API available to list them)
- Web search support for Gemini models
- Thinking/extended thinking support for Gemini
- TopP parameter support for Claude models
- Model filtering (excludes imagen, embeddings, legacy models)
- Model sorting (Gemini > Claude > DeepSeek > Llama > Mistral > Others)
- Extract message conversion into dedicated `toMessages` helper method
- Add proper role handling for system, user, and assistant messages
- Prepend system content to first user message per Anthropic format
- Enforce user/assistant message alternation with placeholder messages
- Skip empty messages during conversion processing
- Concatenate multiple text blocks in response output
- Add validation for empty message arrays before sending
- Handle edge case when only system content is provided
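Anthropic-style APIs require strictly alternating user/assistant turns, which is what the placeholder rule above enforces; a sketch of toMessages under assumed message types:

```go
package vertexai

type chatMessage struct {
	Role    string // "system", "user", or "assistant"
	Content string
}

// toMessages folds system content into the first user turn and inserts
// placeholders so user/assistant roles strictly alternate.
func toMessages(in []chatMessage) []chatMessage {
	var system string
	var out []chatMessage
	for _, m := range in {
		if m.Content == "" {
			continue // skip empty messages
		}
		if m.Role == "system" {
			system += m.Content + "\n"
			continue
		}
		if len(out) > 0 && out[len(out)-1].Role == m.Role {
			// same role twice in a row: insert an opposite-role placeholder
			placeholder := "user"
			if m.Role == "user" {
				placeholder = "assistant"
			}
			out = append(out, chatMessage{Role: placeholder, Content: "..."})
		}
		out = append(out, m)
	}
	if system != "" {
		if len(out) == 0 || out[0].Role != "user" {
			// only system content (or an assistant-first list): prepend a user turn
			out = append([]chatMessage{{Role: "user", Content: system}}, out...)
		} else {
			out[0].Content = system + out[0].Content
		}
	}
	return out
}
```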
Add support for Google Cloud Vertex AI as a provider to access Claude models
using Application Default Credentials (ADC). This allows users to route their
Fabric requests through Google Cloud Platform instead of directly to Anthropic,
enabling billing through GCP.
Features:
- Support for Claude models (Sonnet 4.5, Opus 4.5, Haiku 4.5, etc.) via Vertex AI
- Uses Google ADC for authentication (no API keys required)
- Configurable project ID and region (defaults to 'global' for cost optimization)
- Full support for streaming and non-streaming requests
- Implements complete ai.Vendor interface
Configuration:
- VERTEXAI_PROJECT_ID: GCP project ID (required)
- VERTEXAI_REGION: Vertex AI region (optional, defaults to 'global')
Closes #1570
Add session name parameter to GUI chat interface, enabling persistent
multi-turn conversations similar to CLI's --session flag.
Changes:
- Add SessionName field to PromptRequest in chat.go
- Add sessionName to ChatPrompt interface
- Include currentSession in ChatService requests
- Add Session ID input with existing sessions dropdown in DropdownGroup
Closes #680
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
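The wiring amounts to one new field carried end to end; a sketch of the server-side struct (other fields elided, JSON tag names assumed):

```go
package restapi

// PromptRequest gains a SessionName so the GUI can resume a named
// session, mirroring the CLI's --session flag.
type PromptRequest struct {
	UserInput   string `json:"userInput"`
	SessionName string `json:"sessionName,omitempty"` // empty: stateless chat
}
```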
### CHANGES
- Simplify error wrapping by removing redundant Sprintf calls in CLI
- Pass HTTP client to FetchModelsDirectly to enable connection reuse
- Store persistent HTTP client instance inside the OpenAI provider struct
- Update compatible AI providers to match the new function signature
- Add error handling for pattern loading from absolute file paths
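A sketch of the connection-reuse change: one http.Client stored on the provider and threaded into FetchModelsDirectly. The timeout and signature here are assumptions:

```go
package openai

import (
	"net/http"
	"time"
)

// Client keeps one persistent http.Client so TLS connections are pooled
// across model-list calls instead of re-dialed per request.
type Client struct {
	httpClient *http.Client
}

func NewClient() *Client {
	return &Client{httpClient: &http.Client{Timeout: 30 * time.Second}}
}

// FetchModelsDirectly now receives the shared client as a parameter.
func FetchModelsDirectly(client *http.Client, baseURL string) (*http.Response, error) {
	return client.Get(baseURL + "/models")
}
```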
### CHANGES
- Remove redundant fmt.Sprintf calls from error formatting logic
- Simplify model assignment to always use normalized model names
- Remove unused variadic parameter from the VendorsManager Clear method
- Add MiniMax provider configuration to ProviderMap
- Set MiniMax base URL to api.minimaxi.com/v1
- Configure MiniMax with ImplementsResponses as false
- Add test case for MiniMax provider validation
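The entry presumably mirrors the other OpenAI-compatible providers (including Mammouth above); a hedged sketch of the map shape:

```go
package openai

// ProviderConfig, trimmed to the fields these notes mention.
type ProviderConfig struct {
	BaseURL             string
	ImplementsResponses bool // false: use Chat Completions, not the Responses API
}

var ProviderMap = map[string]ProviderConfig{
	"MiniMax": {
		BaseURL:             "https://api.minimaxi.com/v1",
		ImplementsResponses: false,
	},
}
```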
- Add GitHub and Buy Me a Coffee funding configuration.
- Localize setup prompts and error messages across multiple languages.
- Implement helper for localized questions with static environment keys.
- Update environment variable builder to handle hyphenated plugin names.
- Replace hardcoded console output with localized i18n translation strings.
- Expand locale files with comprehensive pattern and strategy translations.
- Add new i18n keys for optional and required markers
- Remove hardcoded `[required]` markers from description strings
- Add custom patterns, Jina AI, YouTube, and language labels
- Switch plugin descriptions to use i18n translation keys
- Append markers dynamically to setup descriptions in Go code
- Remove trailing newlines from plugin question prompt strings
- Standardize all locale files with consistent formatting changes
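Appending markers dynamically could look like the following; the lookup function and key names are stand-ins for the project's actual i18n API:

```go
package setup

// t is a stand-in for the project's i18n lookup; key names are illustrative.
func t(key string) string { return key }

// describeSetting builds a localized description and appends the
// [required]/[optional] marker at runtime instead of hardcoding it.
func describeSetting(descKey string, required bool) string {
	marker := t("setup_marker_optional")
	if required {
		marker = t("setup_marker_required")
	}
	return t(descKey) + " " + marker
}
```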
- Add `/youtube/transcript` POST endpoint to Swagger docs
- Define `YouTubeRequest` schema with URL, language, timestamps fields
- Define `YouTubeResponse` schema with transcript and metadata fields
- Add API security requirement using ApiKeyAuth
- Document 200, 400, and 500 response codes
- Add godoc comments to YouTubeHandler struct methods
- Include example values for all request/response properties
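Swagger docs for Go handlers are commonly generated from swaggo-style comments; a hedged sketch of what the endpoint's annotations could look like (handler and method names assumed, schema names taken from the notes above):

```go
package restapi

import "github.com/gin-gonic/gin"

type YouTubeHandler struct{}

// Transcript godoc
// @Summary      Fetch a YouTube transcript
// @Description  Returns the transcript, optionally with timestamps, for a video URL
// @Security     ApiKeyAuth
// @Accept       json
// @Produce      json
// @Param        request  body      YouTubeRequest  true  "URL, language, timestamps"
// @Success      200      {object}  YouTubeResponse
// @Failure      400      {object}  map[string]string
// @Failure      500      {object}  map[string]string
// @Router       /youtube/transcript [post]
func (h *YouTubeHandler) Transcript(c *gin.Context) { /* ... */ }
```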