Commit Graph

178 Commits

Author SHA1 Message Date
Kayvan Sylvan
2cb2a76200 feat: add support for pattern variables in Ollama API requests
## CHANGES

- Add `Variables` field to `OllamaRequestBody` struct for direct variable passing
- Change `Options` field from empty struct to flexible `map[string]any` type
- Extract variables from top-level `Variables` field or nested `Options.variables`
- Support parsing variables as JSON string or map format
- Pass extracted variables to `PromptRequest` for single message chats
- Pass extracted variables to `PromptRequest` for multi-message chats
- Add `omitempty` JSON tags to optional fields
2026-01-17 06:35:41 -08:00
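The variable lookup described in this commit could be sketched roughly as follows. This is a minimal illustration, not the project's code: the trimmed `OllamaRequestBody` fields and the `extractVariables` helper are hypothetical, and converting non-string map values with `fmt.Sprint` is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OllamaRequestBody is a trimmed, hypothetical version of the request struct;
// only the fields relevant to variable extraction are shown.
type OllamaRequestBody struct {
	Model     string            `json:"model"`
	Variables map[string]string `json:"variables,omitempty"`
	Options   map[string]any    `json:"options,omitempty"`
}

// extractVariables (hypothetical helper) prefers the top-level Variables field
// and falls back to Options["variables"], which may be a JSON string or a map.
func extractVariables(req OllamaRequestBody) (map[string]string, error) {
	if len(req.Variables) > 0 {
		return req.Variables, nil
	}
	raw, ok := req.Options["variables"]
	if !ok {
		return nil, nil
	}
	switch v := raw.(type) {
	case string:
		out := map[string]string{}
		if err := json.Unmarshal([]byte(v), &out); err != nil {
			return nil, fmt.Errorf("parsing options.variables JSON string: %w", err)
		}
		return out, nil
	case map[string]any:
		out := make(map[string]string, len(v))
		for k, val := range v {
			out[k] = fmt.Sprint(val) // assumption: stringify non-string values
		}
		return out, nil
	default:
		return nil, fmt.Errorf("options.variables has unsupported type %T", raw)
	}
}

func main() {
	body := []byte(`{"model":"summarize","options":{"variables":"{\"lang\":\"en\"}"}}`)
	var req OllamaRequestBody
	if err := json.Unmarshal(body, &req); err != nil {
		panic(err)
	}
	vars, err := extractVariables(req)
	fmt.Println(vars, err)
}
```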
Kayvan Sylvan
c7c9d73c01 fix: return string error payloads and fail non-stream empty upstream
## CHANGES
- Serialize JSON error field as `err.Error()` string
- Treat non-stream upstream empty content as 502 error
- Keep streaming mode tolerant when upstream returns no content
2026-01-17 05:34:38 -08:00
Kayvan Sylvan
61e8871396 fix: set NDJSON header only after successful upstream response
## CHANGES
- Move NDJSON `Content-Type` header after status validation
- Avoid setting stream headers on upstream error responses
- Log warning when upstream returns no streamed content
- Keep duration timing consistent across response paths
- Preserve existing streaming and non-streaming response behavior
2026-01-17 05:03:03 -08:00
Kayvan Sylvan
04fef11e17 fix: harden Ollama streaming flush and align metric counters with int64
## CHANGES
- Use int64 for prompt and eval count fields
- Skip sending secondary error message on stream write failure
- Allow non-http schemes and validate host only for address
- Flush response only when writer implements http.Flusher
2026-01-17 04:55:49 -08:00
Kayvan Sylvan
c50b9a61de fix: propagate request context and close Ollama stream on errors
## CHANGES
- Use Gin request context for outbound HTTP calls
- Send final stream chunk when response write fails
- Capture timing duration once for consistent metrics
- Build final Ollama response via shared helper function
- Validate Fabric base URL scheme is http/https only
- Add clarifying documentation comments for URL and writers
2026-01-17 04:21:41 -08:00
Kayvan Sylvan
665267842f fix: align Ollama duration fields to int64 nanosecond precision
## CHANGES
- Use int64 for `load_duration` JSON field values
- Use int64 for `prompt_eval_duration` JSON field values
- Remove lossy int casts when assigning nanosecond durations
- Keep duration payloads consistent with `total_duration` precision
- Prevent potential overflow on long-running request timing
2026-01-17 04:01:26 -08:00
Kayvan Sylvan
e2b63ddc2f fix: improve SSE scan errors and validate bare Fabric address inputs
## CHANGES
- Send detailed SSE stream scan errors in responses
- Detect token-too-long and return clear buffer-limit message
- Unify streaming and JSON error messaging for scan failures
- Validate bare Fabric address using URL parsing
- Reject bare addresses missing host or hostname
- Disallow path components in bare Fabric addresses
- Trim trailing slash from validated Fabric chat URL
- Add tests covering invalid bare addresses with paths
2026-01-17 03:32:07 -08:00
Kayvan Sylvan
97b6b76dd2 fix: reject hostless Fabric chat URLs like https://:8080
## CHANGES
- Validate that the parsed URL host does not start with a colon
- Return explicit error for missing hostname in URL
- Update unit test to expect error on port-only host
- Prevent accidental acceptance of malformed `https://:port` addresses
2026-01-17 03:06:17 -08:00
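A port-only address such as `https://:8080` parses without error in Go, so the hostname has to be checked explicitly. A minimal sketch, assuming a hypothetical `validateChatURL` helper (the project's actual builder is `buildFabricChatURL`, whose signature is not shown here):

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validateChatURL (hypothetical helper) rejects URLs that parse cleanly
// but carry no hostname, such as "https://:8080".
func validateChatURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid URL %q: %w", raw, err)
	}
	if u.Hostname() == "" {
		return errors.New("URL is missing a hostname: " + raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{"https://:8080", "https://localhost:8080"} {
		fmt.Printf("%s -> %v\n", raw, validateChatURL(raw))
	}
}
```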
Kayvan Sylvan
29a32a8439 fix: validate Fabric chat URL host and tidy Ollama responses
## CHANGES
- Set NDJSON content type before checking upstream errors
- Reject parsed URLs that omit a hostname
- Remove hardcoded eval count placeholders from responses
- Add unit tests for Fabric chat URL builder
- Cover colon-port, host:port, and IP address inputs
2026-01-17 02:31:46 -08:00
Kayvan Sylvan
ae6d4d1fb3 fix: handle upstream non-2xx and return stringified error payloads
## CHANGES
- Convert JSON error responses to use err.Error()
- Detect upstream Fabric non-2xx status before scanning
- Read and log upstream error body when possible
- Return upstream status error message for non-stream mode
- Stream error message via NDJSON when streaming enabled
- Set NDJSON Content-Type header before first streaming write
- Remove per-chunk header setting during streaming output
2026-01-17 01:37:20 -08:00
Kayvan Sylvan
8310695e1a fix(ollama): address Copilot review feedback for error handling
Addresses all 8 Copilot review comments on PR #1940:

Critical fixes:
- Replace log.Fatal with proper HTTP error response to prevent
  server crashes on request failures
- Add streaming-aware error handling to maintain consistent
  response format (prevents mixing JSON with NDJSON)

Error messaging improvements:
- Replace "testing endpoint" placeholders with descriptive
  error messages
- Add clear context for unmarshal and scanning failures

Protocol compliance:
- Set Content-Type: application/x-ndjson for streaming responses
- Ensure all error paths respect stream vs non-stream mode

Code cleanup:
- Remove commented-out dead code

Tested both streaming and non-streaming modes successfully.
2026-01-17 01:19:37 -08:00
Kayvan Sylvan
e318a939aa refactor: rewrite Ollama chat handler to support proper streaming responses
- Add `json:"-"` tag to exclude UpdateChan from JSON serialization
- Extract URL building logic into dedicated `buildFabricChatURL` helper function
- Replace single-read body parsing with streaming `bufio.Scanner` approach
- Add proper SSE data prefix parsing for fabric response format
- Implement real-time streaming with `writeOllamaResponse` helper function
- Add `writeOllamaResponseStruct` for consistent JSON response writing
- Handle both streaming and non-streaming response modes separately
- Add proper error handling for fabric error response types
- Ensure response body is properly closed with defer statement
2026-01-17 00:52:29 -08:00
Kayvan Sylvan
a2370a0e3b chore: Note in the guide about restricted env + modernize fixes 2026-01-15 15:16:40 -08:00
Kayvan Sylvan
f50a7568d1 Merge branch 'main' into kayvan/msft_copilot_vendor_by_claude_opus_4_5 2026-01-15 15:00:42 -08:00
Tom Stetson
d98ad5290c fix: update Copilot SendStream to use domain.StreamUpdate
Update the SendStream interface to match the current Vendor interface
which now uses chan domain.StreamUpdate instead of chan string.

Changes:
- Update SendStream signature to use chan domain.StreamUpdate
- Update sendChatMessageStream signature accordingly
- Update parseSSEStream signature accordingly
- Wrap all channel sends with domain.StreamUpdate{Type: StreamTypeContent}

This fixes the build error introduced when the streaming interface was
updated to support metadata like token usage alongside content.
2026-01-15 14:50:59 -05:00
Kayvan Sylvan
c26a56a368 feat: add DigitalOcean Gradient AI Agents as a new vendor
## CHANGES

- Add DigitalOcean as a new AI provider in plugin registry
- Implement DigitalOcean client with OpenAI-compatible inference endpoint
- Support model access key authentication for inference requests
- Add optional control plane token for model discovery
- Create DigitalOcean setup documentation with environment variables
- Update README to list DigitalOcean in supported providers
- Handle model listing via control plane API with fallback
2026-01-13 22:52:13 -08:00
Kayvan Sylvan
a2058ae26e Merge branch 'main' into kayvan/msft_copilot_vendor_by_claude_opus_4_5 2026-01-13 10:35:24 -08:00
Kayvan Sylvan
7e7ab9e5f2 feat: add Mammouth as new OpenAI-compatible AI provider
## CHANGES

- Add Mammouth provider configuration with API base URL
- Configure Mammouth to use standard OpenAI-compatible interface
- Disable Responses API implementation for Mammouth provider
- Add "Mammouth" to VSCode spell check dictionary
2026-01-12 09:27:28 -08:00
Kayvan Sylvan
cf55be784f refactor: add NewVendorPluginBase factory function to reduce duplication
Add centralized factory function for AI vendor plugin initialization:
- Add NewVendorPluginBase(name, configure) to internal/plugins/plugin.go
- Update 8 vendor files (anthropic, bedrock, gemini, lmstudio, ollama,
  openai, perplexity, vertexai) to use the factory function
- Add 3 test cases for the new factory function

This removes ~40 lines of duplicated boilerplate code and ensures
consistent plugin initialization across all vendors.

MAESTRO: Loop 00001 refactoring implementation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 20:12:58 -08:00
Henri Cook
8017f376b1 fix: use MaxTokens not ModelContextLength for output limit 2026-01-08 19:23:21 +00:00
Kayvan Sylvan
6f103b2db2 feat: refactor Gemini region logic into getGeminiRegion method
### CHANGES

- Extract `getGeminiRegion` method for region determination
- Use `getGeminiRegion` in `sendGemini` for location setting
- Apply `getGeminiRegion` in `sendStreamGemini` for consistency
2026-01-08 11:19:31 -08:00
Kayvan Sylvan
19aeebe6f5 refactor: extract fetchModelsPage in Vertex AI to improve pagination
- Extract model fetching logic into a dedicated helper function.
- Improve response body cleanup during Vertex AI pagination loops.
- Remove unused time import and timeout constant from models.
- Streamline listPublisherModels function by delegating API requests to helper.
2026-01-08 11:16:25 -08:00
Kayvan Sylvan
2d79d3b706 chore: format fixes 2026-01-08 10:56:56 -08:00
Henri Cook
2501cbf47e feat(vertexai): add dynamic model listing and multi-model support
- Dynamic model listing from Vertex AI Model Garden API
- Support for both Gemini (genai SDK) and Claude (Anthropic SDK) models
- Curated Gemini model list (no API available to list them)
- Web search support for Gemini models
- Thinking/extended thinking support for Gemini
- TopP parameter support for Claude models
- Model filtering (excludes imagen, embeddings, legacy models)
- Model sorting (Gemini > Claude > DeepSeek > Llama > Mistral > Others)
2026-01-08 17:24:19 +00:00
Kayvan Sylvan
b381bae24a Merge pull request #1915 from majiayu000/fix-1842-feature-request-parallelize-au-0101-2335
feat: parallelize audio chunk transcription for improved performance
2026-01-04 13:04:56 -08:00
Kayvan Sylvan
0776e77872 Merge branch 'main' into fix-1910-bug-rest-api-chat-endpoint-doe-0101-2307 2026-01-03 17:09:28 -08:00
Kayvan Sylvan
cb2759a5a1 Merge branch 'main' into fix-1842-feature-request-parallelize-au-0101-2335 2026-01-03 17:05:14 -08:00
majiayu000
8a28ca7b1e feat: parallelize audio chunk transcription using goroutines
Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-01 23:38:32 +08:00
majiayu000
6ea5551f06 fix: pass Search and SearchLocation parameters to ChatOptions in /chat endpoint
Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-01 23:09:30 +08:00
berniegreen
b04346008b fix: add missing newline to end of chatter_test.go 2025-12-31 16:59:30 -06:00
berniegreen
c7ecac3262 test: add test for metadata stream propagation 2025-12-31 15:56:20 -06:00
berniegreen
8166ee7a18 docs: update swagger documentation and fix dryrun tests 2025-12-31 15:13:20 -06:00
berniegreen
c539b1edfc feat: implement REST API support for metadata streaming (Phase 5) 2025-12-31 12:43:48 -06:00
berniegreen
66d3bf786e feat: implement CLI support for metadata display (Phase 4) 2025-12-31 12:41:06 -06:00
berniegreen
569f50179d refactor: implement structured streaming in all AI vendors (Phase 3) 2025-12-31 12:38:38 -06:00
berniegreen
477ca045b0 refactor: update Vendor interface and Chatter for structured streaming (Phase 2) 2025-12-31 12:26:13 -06:00
berniegreen
e40d51cc71 feat: add domain types for structured streaming (Phase 1) 2025-12-31 12:19:27 -06:00
Kayvan Sylvan
31a52f7191 refactor: extract message conversion logic to toMessages method in VertexAI client
- Extract message conversion into dedicated `toMessages` helper method
- Add proper role handling for system, user, and assistant messages
- Prepend system content to first user message per Anthropic format
- Enforce user/assistant message alternation with placeholder messages
- Skip empty messages during conversion processing
- Concatenate multiple text blocks in response output
- Add validation for empty message arrays before sending
- Handle edge case when only system content is provided
2025-12-30 09:43:22 -08:00
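The conversion rules listed in this commit can be illustrated with a simplified sketch. The `Message` type, the placeholder text, and the exact merge behavior are assumptions; the real `toMessages` method works on SDK types not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical, simplified chat message.
type Message struct {
	Role    string
	Content string
}

const placeholder = "(continued)" // assumed placeholder text

func opposite(role string) string {
	if role == "user" {
		return "assistant"
	}
	return "user"
}

// toMessages skips empty messages, collects system content, enforces
// user/assistant alternation, and prepends system text to the first user turn.
func toMessages(in []Message) []Message {
	var system []string
	var out []Message
	for _, m := range in {
		if strings.TrimSpace(m.Content) == "" {
			continue
		}
		if m.Role == "system" {
			system = append(system, m.Content)
			continue
		}
		if len(out) == 0 && m.Role == "assistant" {
			out = append(out, Message{Role: "user", Content: placeholder})
		} else if len(out) > 0 && out[len(out)-1].Role == m.Role {
			out = append(out, Message{Role: opposite(m.Role), Content: placeholder})
		}
		out = append(out, m)
	}
	sys := strings.Join(system, "\n\n")
	if sys == "" {
		return out
	}
	if len(out) == 0 { // only system content was provided
		return []Message{{Role: "user", Content: sys}}
	}
	out[0].Content = sys + "\n\n" + out[0].Content
	return out
}

func main() {
	msgs := toMessages([]Message{
		{Role: "system", Content: "Be brief."},
		{Role: "user", Content: "Hi"},
		{Role: "user", Content: "Still there?"},
	})
	for _, m := range msgs {
		fmt.Printf("%s: %q\n", m.Role, m.Content)
	}
}
```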
Rodaddy
3cb0be03c7 feat(ai): add VertexAI provider for Claude models
Add support for Google Cloud Vertex AI as a provider to access Claude models
using Application Default Credentials (ADC). This allows users to route their
Fabric requests through Google Cloud Platform instead of directly to Anthropic,
enabling billing through GCP.

Features:
- Support for Claude models (Sonnet 4.5, Opus 4.5, Haiku 4.5, etc.) via Vertex AI
- Uses Google ADC for authentication (no API keys required)
- Configurable project ID and region (defaults to 'global' for cost optimization)
- Full support for streaming and non-streaming requests
- Implements complete ai.Vendor interface

Configuration:
- VERTEXAI_PROJECT_ID: GCP project ID (required)
- VERTEXAI_REGION: Vertex AI region (optional, defaults to 'global')

Closes #1570
2025-12-29 14:33:25 -05:00
lif
6c5487609e feat(gui): add Session ID support for multi-turn conversations
Add session name parameter to GUI chat interface, enabling persistent
multi-turn conversations similar to CLI's --session flag.

Changes:
- Add SessionName field to PromptRequest in chat.go
- Add sessionName to ChatPrompt interface
- Include currentSession in ChatService requests
- Add Session ID input with existing sessions dropdown in DropdownGroup

Closes #680

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 08:11:30 +08:00
Kayvan Sylvan
33130f2087 refactor: optimize HTTP client reuse and simplify error formatting
### CHANGES

- Simplify error wrapping by removing redundant Sprintf calls in CLI
- Pass HTTP client to FetchModelsDirectly to enable connection reuse
- Store persistent HTTP client instance inside the OpenAI provider struct
- Update compatible AI providers to match the new function signature
- Add error handling for pattern loading from absolute file paths
2025-12-25 07:58:49 -08:00
Kayvan Sylvan
58e8ac1012 chore: simplify error formatting and clean up model assignment logic
### CHANGES
- Remove redundant fmt.Sprintf calls from error formatting logic
- Simplify model assignment to always use normalized model names
- Remove unused variadic parameter from the VendorsManager Clear method
2025-12-23 07:51:33 -08:00
Kayvan Sylvan
e2c28c8f19 feat: add MiniMax provider support to OpenAI compatible plugin
- Add MiniMax provider configuration to ProviderMap
- Set MiniMax base URL to api.minimaxi.com/v1
- Configure MiniMax with ImplementsResponses as false
- Add test case for MiniMax provider validation
2025-12-22 14:52:08 -08:00
Kayvan Sylvan
7570e7930b feat: localize setup process and add funding configuration
- Add GitHub and Buy Me a Coffee funding configuration.
- Localize setup prompts and error messages across multiple languages.
- Implement helper for localized questions with static environment keys.
- Update environment variable builder to handle hyphenated plugin names.
- Replace hardcoded console output with localized i18n translation strings.
- Expand locale files with comprehensive pattern and strategy translations.
- Add new i18n keys for optional and required markers
- Remove hardcoded `[required]` markers from description strings
- Add custom patterns, Jina AI, YouTube, and language labels
- Switch plugin descriptions to use i18n translation keys
- Append markers dynamically to setup descriptions in Go code
- Remove trailing newlines from plugin question prompt strings
- Standardize all locale files with consistent formatting changes
2025-12-22 09:39:02 -08:00
Kayvan Sylvan
5e4e4f4bf1 docs: Add YouTube transcript endpoint to Swagger UI.
- Add `/youtube/transcript` POST endpoint to Swagger docs
- Define `YouTubeRequest` schema with URL, language, timestamps fields
- Define `YouTubeResponse` schema with transcript and metadata fields
- Add API security requirement using ApiKeyAuth
- Document 200, 400, and 500 response codes
- Add godoc comments to YouTubeHandler struct methods
- Include example values for all request/response properties
2025-12-19 10:41:55 -08:00
Bob Vandevliet
8a3fa9337c feat: return correct video title (instead of ID) and add description to YouTube transcript API response 2025-12-19 13:14:12 +01:00
Kayvan Sylvan
9f79877524 User Experience: implement automated first-time setup and improved configuration validation
### CHANGES

- Add automated first-time setup for patterns and strategies.
- Implement configuration validation to warn about missing required components.
- Update setup menu to group plugins into required and optional.
- Provide helpful guidance when no patterns are found in listing.
- Expand localization support for setup and error messaging across languages.
- Enhance strategy manager to reload and count installed strategies.
- Improve pattern error handling with specific guidance for empty directories.
2025-12-18 14:48:50 -08:00
Kayvan Sylvan
c06c94f8b8 # CHANGES
- Add Swagger UI at `/swagger/index.html` endpoint
- Generate OpenAPI spec files (JSON and YAML)
- Document chat, patterns, and models endpoints
- Update contributing guide with Swagger annotation instructions
- Add swaggo dependencies to project
- Configure authentication bypass for Swagger documentation
- Add custom YAML handler for OpenAPI spec
- Update REST API documentation with Swagger links
- Add dictionary entries for new tools
2025-12-18 07:12:08 -08:00
Kayvan Sylvan
fdadeae1e7 modernize: update GitHub Actions and modernize Go code with latest stdlib features
## CHANGES

- Upgrade GitHub Actions to latest versions (v6, v21)
- Add modernization check step in CI workflow
- Replace strings manipulation with `strings.CutPrefix` and `strings.CutSuffix`
- Replace manual loops with `slices.Contains` for validation
- Use `strings.SplitSeq` for iterator-based string splitting
- Replace `bytes.TrimPrefix` with `bytes.CutPrefix` for clarity
- Use `strings.Builder` instead of string concatenation
- Replace `fmt.Sprintf` with `fmt.Appendf` for efficiency
- Simplify padding calculation with `max` builtin
2025-12-15 23:55:37 -08:00
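A few of the replacements named in this commit, shown in isolation. The helper names are hypothetical, and the snippet assumes Go 1.21 or newer (for the `slices` package and the `max` builtin).

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// trimData replaces the HasPrefix + TrimPrefix pair with one CutPrefix call.
func trimData(line string) (string, bool) {
	return strings.CutPrefix(line, "data: ")
}

// isKnownRole replaces a manual loop with slices.Contains.
func isKnownRole(role string) bool {
	return slices.Contains([]string{"system", "user", "assistant"}, role)
}

// padRight uses the max builtin so the repeat count is never negative.
func padRight(s string, width int) string {
	return s + strings.Repeat(" ", max(0, width-len(s)))
}

// joinQuoted uses strings.Builder instead of repeated string concatenation.
func joinQuoted(items []string) string {
	var b strings.Builder
	for i, item := range items {
		if i > 0 {
			b.WriteString(", ")
		}
		fmt.Fprintf(&b, "%q", item)
	}
	return b.String()
}

func main() {
	payload, ok := trimData("data: hello")
	fmt.Println(payload, ok)
	fmt.Println(isKnownRole("tool"))
	fmt.Printf("[%s]\n", padRight("ab", 4))
	fmt.Println(joinQuoted([]string{"a", "b"}))
}
```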
Kayvan Sylvan
a4484d4e01 refactor: modernize Go code with TypeFor and range loops
- Replace reflect.TypeOf with TypeFor generic syntax
- Convert traditional for loops to range-based iterations
- Simplify reflection usage in CLI flag handling
- Update test loops to use range over integers
- Refactor string processing loops in template plugin
2025-12-15 23:29:41 -08:00