mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-04-30 03:00:41 -04:00
383e60fba579a0ed6b8dbb35b4795de5cef987fa
994 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
383e60fba5 |
fix(library): merge settings when updating library agent
- Updated the `update_library_agent` function to merge existing and new settings when provided, ensuring that only unset fields are updated. - Removed the `get_folder_depth` function and related depth validation logic to streamline folder operations, as it was deemed unnecessary for current functionality. - These changes enhance the integrity of agent settings management and simplify folder validation processes. |
||
|
|
91f7c7b0db | Merge branch 'dev' into abhi/folder-inside-library | ||
|
|
01f18acba8 |
fix(library): validate folder existence when updating agent folder ID
- Enhanced the `update_library_agent` function to verify that the specified folder ID belongs to the user and is not deleted before updating the agent's folder ID. - This change improves error handling by raising a `NotFoundError` if the folder is not found, ensuring better data integrity and user feedback during agent updates. |
||
|
|
09f74594ab |
fix(tests): update test_get_library_agents_success to include folder_id and include_root_only parameters
- Added `folder_id` and `include_root_only` parameters to the `test_get_library_agents_success` function to ensure comprehensive testing of library agent retrieval functionality. - This change enhances the test coverage for scenarios involving folder-specific agent queries. |
||
|
|
f4848a43af |
feat(library): improve folder and agent management with code enhancements
- Refactored `delete_folder` function to include an asynchronous cleanup process for affected agents, ensuring proper resource management during folder deletions. - Added `FolderValidationError` exception to enhance error handling in folder operations. - Improved logging for database errors in the `update_library_agent` function, providing clearer feedback during agent updates. - Enhanced UI components for better readability and structure, including adjustments to the `LibraryAgentCard` and `FavoritesSection`. These changes enhance the functionality and reliability of folder and agent management, improving user experience and error handling in the library interface. |
||
|
|
647c8ed8d4 |
feat(backend/blocks): enhance list concatenation with advanced operations (#12105)
## Summary Enhances the existing `ConcatenateListsBlock` and adds five new companion blocks for comprehensive list manipulation, addressing issue #11139 ("Implement block to concatenate lists"). ### Changes - **Enhanced `ConcatenateListsBlock`** with optional deduplication (`deduplicate`) and None-value filtering (`remove_none`), plus an output `length` field - **New `FlattenListBlock`**: Recursively flattens nested list structures with configurable `max_depth` - **New `InterleaveListsBlock`**: Round-robin interleaving of elements from multiple lists - **New `ZipListsBlock`**: Zips corresponding elements from multiple lists with support for padding to longest or truncating to shortest - **New `ListDifferenceBlock`**: Computes set difference between two lists (regular or symmetric) - **New `ListIntersectionBlock`**: Finds common elements between two lists, preserving order ### Helper Utilities Extracted reusable helper functions for validation, flattening, deduplication, interleaving, chunking, and statistics computation to support the blocks and enable future reuse. ### Test Coverage Comprehensive test suite with 188 test functions across 29 test classes covering: - Built-in block test harness validation for all 6 blocks - Manual edge-case tests for each block (empty inputs, large lists, mixed types, nested structures) - Internal method tests for all block classes - Unit tests for all helper utility functions Closes #11139 ## Test plan - [x] All files pass Python syntax validation (`ast.parse`) - [x] Built-in `test_input`/`test_output` tests defined for all blocks - [x] Manual tests cover edge cases: empty lists, large lists, mixed types, nested structures, deduplication, None removal - [x] Helper function tests validate all utility functions independently - [x] All block IDs are valid UUID4 - [x] Block categories set to `BlockCategory.BASIC` for consistency with existing list blocks <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Enhanced `ConcatenateListsBlock` with deduplication and None-filtering options, and added five new list manipulation blocks (`FlattenListBlock`, `InterleaveListsBlock`, `ZipListsBlock`, `ListDifferenceBlock`, `ListIntersectionBlock`) with comprehensive helper functions and test coverage. **Key Changes:** - Enhanced `ConcatenateListsBlock` with `deduplicate` and `remove_none` options, plus `length` output field - Added `FlattenListBlock` for recursively flattening nested lists with configurable `max_depth` - Added `InterleaveListsBlock` for round-robin element interleaving - Added `ZipListsBlock` with support for padding/truncation - Added `ListDifferenceBlock` and `ListIntersectionBlock` for set operations - Extracted 12 reusable helper functions for validation, flattening, deduplication, etc. - Comprehensive test suite with 188 test functions covering edge cases **Minor Issues:** - Helper function `_deduplicate_list` has redundant logic in the `else` branch that duplicates the `if` branch - Three helper functions (`_filter_empty_collections`, `_compute_list_statistics`, `_chunk_list`) are defined but unused - consider removing unless planned for future use - The `_make_hashable` function uses `hash(repr(item))` for unhashable types, which correctly treats structurally identical dicts/lists as duplicates </details> <details><summary><h3>Confidence Score: 4/5</h3></summary> - Safe to merge with minor style improvements recommended - The implementation is well-structured with comprehensive test coverage (188 tests), proper error handling, and follows existing block patterns. All blocks use valid UUID4 IDs and correct categories. The helper functions provide good code reuse. The minor issues are purely stylistic (redundant code, unused helpers) and don't affect functionality or safety. - No files require special attention - both files are well-tested and follow project conventions </details> <details><summary><h3>Sequence Diagram</h3></summary> ```mermaid sequenceDiagram participant User participant Block as List Block participant Helper as Helper Functions participant Output User->>Block: Input (lists/parameters) Block->>Helper: _validate_all_lists() Helper-->>Block: validation result alt validation fails Block->>Output: error message else validation succeeds Block->>Helper: _concatenate_lists_simple() / _flatten_nested_list() / etc. Helper-->>Block: processed result opt deduplicate enabled Block->>Helper: _deduplicate_list() Helper-->>Block: deduplicated result end opt remove_none enabled Block->>Helper: _filter_none_values() Helper-->>Block: filtered result end Block->>Output: result + length end Output-->>User: Block outputs ``` </details> <sub>Last reviewed commit: a6d5445</sub> <!-- greptile_other_comments_section --> <sub>(2/5) Greptile learns from your feedback when you react with thumbs up/down!</sub> <!-- /greptile_comment --> --------- Co-authored-by: Otto <otto@agpt.co> |
||
|
|
27d94e395c |
feat(backend/sdk): enable WebSearch, block WebFetch, consolidate tool constants (#12108)
## Summary
- Enable Claude Agent SDK built-in **WebSearch** tool (Brave Search via
Anthropic API) for the CoPilot SDK agent
- Explicitly **block WebFetch** via `SDK_DISALLOWED_TOOLS`. The agent
uses the SSRF-protected `mcp__copilot__web_fetch` MCP tool instead
- **Consolidate** all tool security constants (`BLOCKED_TOOLS`,
`WORKSPACE_SCOPED_TOOLS`, `DANGEROUS_PATTERNS`, `SDK_DISALLOWED_TOOLS`)
into `tool_adapter.py` as a single source of truth — previously
scattered across `tool_adapter.py`, `security_hooks.py`, and inline in
`service.py`
## Changes
- `tool_adapter.py`: Add `WebSearch` to `_SDK_BUILTIN_TOOLS`, add
`SDK_DISALLOWED_TOOLS`, move security constants here
- `security_hooks.py`: Import constants from `tool_adapter.py` instead
of defining locally
- `service.py`: Use `SDK_DISALLOWED_TOOLS` instead of inline `["Bash"]`
## Test plan
- [x] All 21 security hooks tests pass
- [x] Ruff lint clean
- [x] All pre-commit hooks pass
- [ ] Verify WebSearch works in CoPilot chat (manual test)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Consolidates tool security constants into `tool_adapter.py` as single
source of truth, enables WebSearch (Brave via Anthropic API), and
explicitly blocks WebFetch to prevent SSRF attacks. The change improves
security by ensuring the agent uses the SSRF-protected
`mcp__copilot__web_fetch` tool instead of the built-in WebFetch which
can access internal networks like `localhost:8006`.
</details>
<details><summary><h3>Confidence Score: 5/5</h3></summary>
- This PR is safe to merge with minimal risk
- The changes improve security by blocking WebFetch (SSRF risk) while
enabling safe WebSearch. The consolidation of constants into a single
source of truth improves maintainability. All existing tests pass (21
security hooks tests), and the refactoring is straightforward with no
behavioral changes to existing security logic. The only suggestions are
minor improvements: adding a test for WebFetch blocking and considering
a lowercase alias for consistency.
- No files require special attention
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant Agent as SDK Agent
participant Hooks as Security Hooks
participant TA as tool_adapter.py
participant MCP as MCP Tools
Note over TA: SDK_DISALLOWED_TOOLS = ["Bash", "WebFetch"]
Note over TA: _SDK_BUILTIN_TOOLS includes WebSearch
Agent->>Hooks: Request WebSearch (Brave API)
Hooks->>TA: Check BLOCKED_TOOLS
TA-->>Hooks: Not blocked
Hooks-->>Agent: Allowed ✓
Agent->>Agent: Execute via Anthropic API
Agent->>Hooks: Request WebFetch (SSRF risk)
Hooks->>TA: Check BLOCKED_TOOLS
Note over TA: WebFetch in SDK_DISALLOWED_TOOLS
TA-->>Hooks: Blocked
Hooks-->>Agent: Denied ✗
Note over Agent: Use mcp__copilot__web_fetch instead
Agent->>Hooks: Request mcp__copilot__web_fetch
Hooks->>MCP: Validate (MCP tool, not SDK builtin)
MCP-->>Hooks: Has SSRF protection
Hooks-->>Agent: Allowed ✓
Agent->>MCP: Execute with SSRF checks
```
</details>
<sub>Last reviewed commit: 2d9975f</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
|
||
|
|
b8f5c208d0 |
Handle errors in Jina ExtractWebsiteContentBlock (#12048)
## Summary - catch Jina reader client/server errors in ExtractWebsiteContentBlock and surface a clear error output keyed to the user URL - guard empty responses to return an explicit error instead of yielding blank content - add regression tests covering the happy path and HTTP client failures via a monkeypatched fetch ## Testing - not run (pytest unavailable in this environment) --------- Co-authored-by: Nicholas Tindle <nicktindle@outlook.com> Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> |
||
|
|
f9f358c526 |
feat(mcp): Add MCP tool block with OAuth, tool discovery, and standard credential integration (#12011)
## Summary <img width="1000" alt="image" src="https://github.com/user-attachments/assets/18e8ef34-d222-453c-8b0a-1b25ef8cf806" /> <img width="250" alt="image" src="https://github.com/user-attachments/assets/ba97556c-09c5-4f76-9f4e-49a2e8e57468" /> <img width="250" alt="image" src="https://github.com/user-attachments/assets/68f7804a-fe74-442d-9849-39a229c052cf" /> <img width="250" alt="image" src="https://github.com/user-attachments/assets/700690ba-f9fe-4726-8871-3bfbab586001" /> Full-stack MCP (Model Context Protocol) tool block integration that allows users to connect to any MCP server, discover available tools, authenticate via OAuth, and execute tools — all through the standard AutoGPT credential system. ### Backend - **MCPToolBlock** (`blocks/mcp/block.py`): New block using `CredentialsMetaInput` pattern with optional credentials (`default={}`), supporting both authenticated (OAuth) and public MCP servers. Includes auto-lookup fallback for backward compatibility. - **MCP Client** (`blocks/mcp/client.py`): HTTP transport with JSON-RPC 2.0, tool discovery, tool execution with robust error handling (type-checked error fields, non-JSON response handling) - **MCP OAuth Handler** (`blocks/mcp/oauth.py`): RFC 8414 discovery, dynamic per-server OAuth with PKCE, token storage and refresh via `raise_for_status=True` - **MCP API Routes** (`api/features/mcp/routes.py`): `discover-tools`, `oauth/login`, `oauth/callback` endpoints with credential cleanup, defensive OAuth metadata validation - **Credential system integration**: - `CredentialsMetaInput` model_validator normalizes legacy `"ProviderName.MCP"` format from Python 3.13's `str(StrEnum)` change - `CredentialsFieldInfo.combine()` supports URL-based credential discrimination (each MCP server gets its own credential entry) - `aggregate_credentials_inputs` checks block schema defaults for credential optionality - Executor normalizes credential data for both Pydantic and JSON schema validation paths - Chat credential matching handles MCP server URL filtering - `provider_matches()` helper used consistently for Python 3.13 StrEnum compatibility - **Pre-run validation**: `_validate_graph_get_errors` now calls `get_missing_input()` for custom block-level validation (MCP tool arguments) - **Security**: HTML tag stripping loop to prevent XSS bypass, SSRF protection (removed trusted_origins) ### Frontend - **MCPToolDialog** (`MCPToolDialog.tsx`): Full tool discovery UI — enter server URL, authenticate if needed, browse tools, select tool and configure - **OAuth popup** (`oauth-popup.ts`): Shared utility supporting cross-origin MCP OAuth flows with BroadcastChannel + localStorage fallback - **Credential integration**: MCP-specific OAuth flow in `useCredentialsInput`, server URL filtering in `useCredentials`, MCP callback page - **CredentialsSelect**: Auto-selects first available credential instead of defaulting to "None", credentials listed before "None" in dropdown - **Node rendering**: Dynamic tool input schema rendering on MCP nodes, proper handling in both legacy and new flow editors - **Block title persistence**: `customized_name` set at block creation for both MCP and Agent blocks — no fallback logic needed, titles survive save/load reliably - **Stable credential ordering**: Removed `sortByUnsetFirst` that caused credential inputs to jump when selected ### Tests (~2060 lines) - Unit tests: block, client, tool execution - Integration tests: mock MCP server with auth - OAuth flow tests - API endpoint tests - Credential combining/optionality tests - E2e tests (skipped in CI, run manually) ## Key Design Decisions 1. **Optional credentials via `default={}`**: MCP servers can be public (no auth) or private (OAuth). The `credentials` field has `default={}` making it optional at the schema level, so public servers work without prompting for credentials. 2. **URL-based credential discrimination**: Each MCP server URL gets its own credential entry in the "Run agent" form (via `discriminator="server_url"`), so agents using multiple MCP servers prompt for each independently. 3. **Model-level normalization**: Python 3.13 changed `str(StrEnum)` to return `"ClassName.MEMBER"`. Rather than scattering fixes across the codebase, a Pydantic `model_validator(mode="before")` on `CredentialsMetaInput` handles normalization centrally, and `provider_matches()` handles lookups. 4. **Credential auto-select**: `CredentialsSelect` component defaults to the first available credential and notifies the parent state, ensuring credentials are pre-filled in the "Run agent" dialog without requiring manual selection. 5. **customized_name for block titles**: Both MCP and Agent blocks set `customized_name` in metadata at creation time. This eliminates convoluted runtime fallback logic (`agent_name`, hostname extraction) — the title is persisted once and read directly. ## Test plan - [x] Unit/integration tests pass (68 MCP + 11 graph = 79 tests) - [x] Manual: MCP block with public server (DeepWiki) — no credentials needed, tools discovered and executable - [x] Manual: MCP block with OAuth server (Linear, Sentry) — OAuth flow prompts correctly - [x] Manual: "Run agent" form shows correct credential requirements per MCP server - [x] Manual: Credential auto-selects when exactly one matches, pre-selects first when multiple exist - [x] Manual: Credential ordering stays stable when selecting/deselecting - [x] Manual: MCP block title persists after save and refresh - [x] Manual: Agent block title persists after save and refresh (via customized_name) - [ ] Manual: Shared agent with MCP block prompts new user for credentials --------- Co-authored-by: Otto <otto@agpt.co> Co-authored-by: Ubbe <hi@ubbe.dev> |
||
|
|
94bd91388f |
feat(library): enhance folder and agent management with error handling improvements
- Added support for updating the folder ID in the `update_library_agent` function, allowing agents to be moved to different folders. - Implemented cleanup of schedules and webhooks for affected agents during folder deletion, ensuring proper resource management. - Improved error handling in various folder-related API endpoints, standardizing error messages to provide clearer feedback to users. - Updated the `LibraryFolderEditDialog` to handle API errors more effectively, enhancing user experience during folder operations. These changes improve the functionality and reliability of folder and agent management within the library, providing users with a smoother experience when organizing their agents. |
||
|
|
52b3aebf71 |
feat(backend/sdk): Claude Agent SDK integration for CoPilot (#12103)
## Summary
Full integration of the **Claude Agent SDK** to replace the existing
one-turn OpenAI-compatible CoPilot implementation with a multi-turn,
tool-using AI agent.
### What changed
**Core SDK Integration** (`chat/sdk/` — new module)
- **`service.py`**: Main orchestrator — spawns Claude Code CLI as a
subprocess per user message, streams responses back via SSE. Handles
conversation history compression, session lifecycle, and error recovery.
- **`response_adapter.py`**: Translates Claude Agent SDK events (text
deltas, tool use, errors, result messages) into the existing CoPilot
`StreamEvent` protocol so the frontend works unchanged.
- **`tool_adapter.py`**: Bridges CoPilot's MCP tools (find_block,
run_block, create_agent, etc.) into the SDK's tool format. Handles
schema conversion and result serialization.
- **`security_hooks.py`**: Pre/Post tool-use hooks that enforce a strict
allowlist of tools, block path traversal, sandbox file operations to
per-session workspace directories, cap sub-agent spawning, and prevent
the model from accessing unauthorized system resources.
- **`transcript.py`**: JSONL transcript I/O utilities for the stateless
`--resume` feature (see below).
**Stateless Multi-Turn Resume** (new)
- Instead of compressing conversation history via LLM on every turn
(lossy and expensive), we capture Claude Code's native JSONL session
transcript via a **Stop hook** callback, persist it in the DB
(`ChatSession.sdkTranscript`), and restore it on the next turn via
`--resume <file>`.
- This preserves full tool call/result context across turns with zero
token overhead for history.
- Feature-flagged via `CLAUDE_AGENT_USE_RESUME` (default: off).
- DB migration: `ALTER TABLE "ChatSession" ADD COLUMN "sdkTranscript"
TEXT`.
**Sandboxed Tool Execution** (`chat/tools/`)
- **`bash_exec.py`**: Sandboxed bash execution using bubblewrap
(`bwrap`) with read-only root filesystem, per-session writable
workspace, resource limits (CPU, memory, file size), and network
isolation.
- **`sandbox.py`**: Shared bubblewrap sandbox infrastructure — generates
`bwrap` command lines with configurable mounts, environment, and
resource constraints.
- **`web_fetch.py`**: URL fetching tool with domain allowlist, size
limits, and content-type filtering.
- **`check_operation_status.py`**: Polling tool for long-running
operations (agent creation, block execution) so the SDK doesn't block
waiting.
- **`find_block.py`** / **`run_block.py`**: Enhanced with category
filtering, optimized response size (removed raw JSON schemas), and
better error handling.
**Security**
- Path traversal prevention: session IDs sanitized, all file ops
confined to workspace dirs, symlink resolution.
- Tool allowlist enforcement via SDK hooks — model cannot call arbitrary
tools.
- Built-in `Bash` tool blocked via `disallowed_tools` to prevent
bypassing sandboxed `bash_exec`.
- Sub-agent (`Task`) spawning capped at configurable limit (default:
10).
- CodeQL-clean path sanitization patterns.
**Streaming & Reconnection**
- SSE stream registry backed by Redis Streams for crash-resilient
reconnection.
- Long-running operation tracking with TTL-based cleanup.
- Atomic message append to prevent race conditions on concurrent writes.
**Configuration** (`config.py`)
- `use_claude_agent_sdk` — master toggle (default: on)
- `claude_agent_model` — model override for SDK path
- `claude_agent_max_buffer_size` — JSON parsing buffer (10MB)
- `claude_agent_max_subtasks` — sub-agent cap (10)
- `claude_agent_use_resume` — transcript-based resume (default: off)
- `thinking_enabled` — extended thinking for Claude models
**Tests**
- `sdk/response_adapter_test.py` — 366 lines covering all event
translation paths
- `sdk/security_hooks_test.py` — 165 lines covering tool blocking, path
traversal, subtask limits
- `chat/model_test.py` — 214 lines covering session model serialization
- `chat/service_test.py` — Integration tests including multi-turn resume
keyword recall
- `tools/find_block_test.py` / `run_block_test.py` — Extended with new
tool behavior tests
## Test plan
- [x] Unit tests pass (`sdk/response_adapter_test.py`,
`security_hooks_test.py`, `model_test.py`)
- [x] Integration test: multi-turn keyword recall via `--resume`
(`service_test.py::test_sdk_resume_multi_turn`)
- [x] Manual E2E: CoPilot chat sessions with tool calls, bash execution,
and multi-turn context
- [x] Pre-commit hooks pass (ruff, isort, black, pyright, flake8)
- [ ] Staging deployment with `claude_agent_use_resume=false` initially
- [ ] Enable resume in staging, verify transcript capture and recall
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
This PR replaces the existing OpenAI-compatible CoPilot with a full
Claude Agent SDK integration, introducing multi-turn conversations,
stateless resume via JSONL transcripts, and sandboxed tool execution.
**Key changes:**
- **SDK integration** (`chat/sdk/`): spawns Claude Code CLI subprocess
per message, translates events to frontend protocol, bridges MCP tools
- **Stateless resume**: captures JSONL transcripts via Stop hook,
persists in `ChatSession.sdkTranscript`, restores with `--resume`
(feature-flagged, default off)
- **Sandboxed execution**: bubblewrap sandbox for bash commands with
filesystem whitelist, network isolation, resource limits
- **Security hooks**: tool allowlist enforcement, path traversal
prevention, workspace-scoped file operations, sub-agent spawn limits
- **Long-running operations**: delegates `create_agent`/`edit_agent` to
existing stream_registry infrastructure for SSE reconnection
- **Feature flag**: `CHAT_USE_CLAUDE_AGENT_SDK` with LaunchDarkly
support, defaults to enabled
**Security issues found:**
- Path traversal validation has logic errors in `security_hooks.py:82`
(tilde expansion order) and `service.py:266` (redundant `..` check)
- Config validator always prefers env var over explicit `False` value
(`config.py:162`)
- Race condition in `routes.py:323` — message persisted before task
registration, could duplicate on retry
- Resource limits in sandbox may fail silently (`sandbox.py:109`)
**Test coverage is strong** with 366 lines for response adapter, 165 for
security hooks, and integration tests for multi-turn resume.
</details>
<details><summary><h3>Confidence Score: 3/5</h3></summary>
- This PR is generally safe but has critical security issues in path
validation that must be fixed before merge
- Score reflects strong architecture and test coverage offset by real
security vulnerabilities: the tilde expansion bug in `security_hooks.py`
could allow sandbox escape, the race condition could cause message
duplication, and the silent ulimit failures could bypass resource
limits. The bubblewrap sandbox and allowlist enforcement are
well-designed, but the path validation bugs need fixing. The transcript
resume feature is properly feature-flagged. Overall the implementation
is solid but the security issues prevent a higher score.
- Pay close attention to
`backend/api/features/chat/sdk/security_hooks.py` (path traversal
vulnerability), `backend/api/features/chat/routes.py` (race condition),
`backend/api/features/chat/tools/sandbox.py` (silent resource limit
failures), and `backend/api/features/chat/sdk/service.py` (redundant
security check)
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant Frontend
participant Routes as routes.py
participant SDKService as sdk/service.py
participant ClaudeSDK as Claude Agent SDK CLI
participant SecurityHooks as security_hooks.py
participant ToolAdapter as tool_adapter.py
participant CoPilotTools as tools/*
participant Sandbox as sandbox.py (bwrap)
participant DB as Database
participant Redis as stream_registry
Frontend->>Routes: POST /chat (user message)
Routes->>SDKService: stream_chat_completion_sdk()
SDKService->>DB: get_chat_session()
DB-->>SDKService: session + messages
alt Resume enabled AND transcript exists
SDKService->>SDKService: validate_transcript()
SDKService->>SDKService: write_transcript_to_tempfile()
Note over SDKService: Pass --resume to SDK
else No resume
SDKService->>SDKService: _compress_conversation_history()
Note over SDKService: Inject history into user message
end
SDKService->>SecurityHooks: create_security_hooks()
SDKService->>ToolAdapter: create_copilot_mcp_server()
SDKService->>ClaudeSDK: spawn subprocess with MCP server
loop Streaming Conversation
ClaudeSDK->>SDKService: AssistantMessage (text/tool_use)
SDKService->>Frontend: StreamTextDelta / StreamToolInputAvailable
alt Tool Call
ClaudeSDK->>SecurityHooks: PreToolUse hook
SecurityHooks->>SecurityHooks: validate path, check allowlist
alt Tool blocked
SecurityHooks-->>ClaudeSDK: deny
else Tool allowed
SecurityHooks-->>ClaudeSDK: allow
ClaudeSDK->>ToolAdapter: call MCP tool
alt Long-running tool (create_agent, edit_agent)
ToolAdapter->>Redis: register task
ToolAdapter->>DB: save OperationPendingResponse
ToolAdapter->>ToolAdapter: spawn background task
ToolAdapter-->>ClaudeSDK: OperationStartedResponse
else Regular tool (find_block, bash_exec)
ToolAdapter->>CoPilotTools: execute()
alt bash_exec
CoPilotTools->>Sandbox: run_sandboxed()
Sandbox->>Sandbox: build bwrap command
Note over Sandbox: Network isolation,<br/>filesystem whitelist,<br/>resource limits
Sandbox-->>CoPilotTools: stdout, stderr, exit_code
end
CoPilotTools-->>ToolAdapter: result
ToolAdapter->>ToolAdapter: stash full output
ToolAdapter-->>ClaudeSDK: MCP response
end
SecurityHooks->>SecurityHooks: PostToolUse hook (log)
end
end
ClaudeSDK->>SDKService: UserMessage (ToolResultBlock)
SDKService->>ToolAdapter: pop_pending_tool_output()
SDKService->>Frontend: StreamToolOutputAvailable
end
ClaudeSDK->>SecurityHooks: Stop hook
SecurityHooks->>SDKService: transcript_path callback
SDKService->>SDKService: read_transcript_file()
SDKService->>DB: save transcript to session.sdkTranscript
ClaudeSDK->>SDKService: ResultMessage (success)
SDKService->>Frontend: StreamFinish
SDKService->>DB: upsert_chat_session()
```
</details>
<sub>Last reviewed commit: 28c1121</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
---------
Co-authored-by: Swifty <craigswift13@gmail.com>
|
||
|
|
784c025938 |
feat(library): refine folder filtering and enhance animation handling in LibraryAgentList
- Updated `list_library_agents` function to improve folder filtering logic, ensuring it only applies when not searching. - Enhanced animation handling in `LibraryAgentList` by implementing explicit initial and animate states for items, improving visibility during dynamic updates. - Adjusted transition timings for smoother animations, particularly when items are added or removed. These changes enhance the user experience by providing clearer folder management and more responsive animations in the library interface. |
||
|
|
c2368f15ff |
fix(blocks): disable PrintToConsoleBlock (#12100)
## Summary Disables the Print to Console block as requested by Nick Tindle. ## Changes - Added `disabled=True` to PrintToConsoleBlock in `basic.py` ## Testing - Block will no longer appear in the platform UI - Existing graphs using this block should be checked (block ID: `f3b1c1b2-4c4f-4f0d-8d2f-4c4f0d8d2f4c`) Closes OPEN-3000 <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Added `disabled=True` parameter to `PrintToConsoleBlock` in `basic.py` per Nick Tindle's request (OPEN-3000). - Block follows the same disabling pattern used by other blocks in the codebase (e.g., `BlockInstallationBlock`, video blocks, Ayrshare blocks) - Block will no longer appear in the platform UI for new graph creation - Existing graphs using this block (ID: `f3b1c1b2-4c4f-4f0d-8d2f-4c4f0d8d2f4c`) will need to be checked for compatibility - Comment properly documents the reason for disabling </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge with minimal risk - Single-line change that adds a well-documented flag following existing patterns used throughout the codebase. The change is non-destructive and only affects UI visibility of the block for new graphs. - No files require special attention </details> <sub>Last reviewed commit: 759003b</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> |
||
|
|
5035b69c79 |
feat(platform): add feature request tools for CoPilot chat (#12102)
Users can now search for existing feature requests and submit new ones directly through the CoPilot chat interface. Requests are tracked in Linear with customer need attribution. ### Changes 🏗️ **Backend:** - Added `SearchFeatureRequestsTool` and `CreateFeatureRequestTool` to the CoPilot chat tools registry - Integrated with Linear GraphQL API for searching issues in the feature requests project, creating new issues, upserting customers, and attaching customer needs - Added `linear_api_key` secret to settings for system-level Linear API access - Added response models (`FeatureRequestSearchResponse`, `FeatureRequestCreatedResponse`, `FeatureRequestInfo`) to the tools models **Frontend:** - Added `SearchFeatureRequestsTool` and `CreateFeatureRequestTool` UI components with full streaming state handling (input-streaming, input-available, output-available, output-error) - Added helper utilities for output parsing, type guards, animation text, and icon rendering - Wired tools into `ChatMessagesContainer` for rendering in the chat - Added styleguide examples covering all tool states ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified search returns matching feature requests from Linear - [x] Verified creating a new feature request creates an issue and customer need in Linear - [x] Verified adding a need to an existing issue works via `existing_issue_id` - [x] Verified error states render correctly in the UI - [x] Verified styleguide page renders all tool states #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under **Changes**) New secret: `LINEAR_API_KEY` — required for system-level Linear API operations (defaults to empty string). <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Adds feature request search and creation tools to CoPilot chat, integrating with Linear's GraphQL API to track user feedback. Users can now search existing feature requests and submit new ones (or add their need to existing issues) directly through conversation. **Key changes:** - Backend: `SearchFeatureRequestsTool` and `CreateFeatureRequestTool` with Linear API integration via system-level `LINEAR_API_KEY` - Frontend: React components with streaming state handling and accordion UI for search results and creation confirmations - Models: Added `FeatureRequestSearchResponse` and `FeatureRequestCreatedResponse` to response types - Customer need tracking: Upserts customers in Linear and attaches needs to issues for better feedback attribution **Issues found:** - Missing `LINEAR_API_KEY` entry in `.env.default` (required per PR description checklist) - Hardcoded project/team IDs reduce maintainability - Global singleton pattern could cause issues in async contexts - Using `user_id` as customer name reduces readability in Linear </details> <details><summary><h3>Confidence Score: 4/5</h3></summary> - Safe to merge with minor configuration fix required - The implementation is well-structured with proper error handling, type safety, and follows existing patterns in the codebase. The missing `.env.default` entry is a straightforward configuration issue that must be fixed before deployment but doesn't affect code quality. The other findings are style improvements that don't impact functionality. - Verify that `LINEAR_API_KEY` is added to `.env.default` before merging </details> <details><summary><h3>Sequence Diagram</h3></summary> ```mermaid sequenceDiagram participant User participant CoPilot UI participant LLM participant FeatureRequestTool participant LinearClient participant Linear API User->>CoPilot UI: Request feature via chat CoPilot UI->>LLM: Send user message LLM->>FeatureRequestTool: search_feature_requests(query) FeatureRequestTool->>LinearClient: query(SEARCH_ISSUES_QUERY) LinearClient->>Linear API: POST /graphql (search) Linear API-->>LinearClient: searchIssues.nodes[] LinearClient-->>FeatureRequestTool: Feature request data FeatureRequestTool-->>LLM: FeatureRequestSearchResponse alt No existing requests found LLM->>FeatureRequestTool: create_feature_request(title, description) FeatureRequestTool->>LinearClient: mutate(CUSTOMER_UPSERT_MUTATION) LinearClient->>Linear API: POST /graphql (upsert customer) Linear API-->>LinearClient: customer {id, name} LinearClient-->>FeatureRequestTool: Customer data FeatureRequestTool->>LinearClient: mutate(ISSUE_CREATE_MUTATION) LinearClient->>Linear API: POST /graphql (create issue) Linear API-->>LinearClient: issue {id, identifier, url} LinearClient-->>FeatureRequestTool: Issue data FeatureRequestTool->>LinearClient: mutate(CUSTOMER_NEED_CREATE_MUTATION) LinearClient->>Linear API: POST /graphql (attach need) Linear API-->>LinearClient: need {id, issue} LinearClient-->>FeatureRequestTool: Need data FeatureRequestTool-->>LLM: FeatureRequestCreatedResponse else Existing request found LLM->>FeatureRequestTool: create_feature_request(title, description, existing_issue_id) FeatureRequestTool->>LinearClient: mutate(CUSTOMER_UPSERT_MUTATION) LinearClient->>Linear API: POST /graphql (upsert customer) Linear API-->>LinearClient: customer {id} LinearClient-->>FeatureRequestTool: Customer data FeatureRequestTool->>LinearClient: mutate(CUSTOMER_NEED_CREATE_MUTATION) LinearClient->>Linear API: POST /graphql (attach need to existing) Linear API-->>LinearClient: need {id, issue} LinearClient-->>FeatureRequestTool: Need data FeatureRequestTool-->>LLM: FeatureRequestCreatedResponse end LLM-->>CoPilot UI: Tool response + continuation CoPilot UI-->>User: Display result with accordion UI ``` </details> <sub>Last reviewed commit: af2e093</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> |
||
|
|
dfa517300b |
debug(copilot): Add detailed API error logging (#11942)
## Summary Adds comprehensive error logging for OpenRouter/OpenAI API errors to help diagnose issues like provider routing failures, context length exceeded, rate limits, etc. ## Background While investigating [SECRT-1859](https://linear.app/autogpt/issue/SECRT-1859), we found that when OpenRouter returns errors, the actual error details weren't being captured or logged. Langfuse traces showed `provider_name: 'unknown'` and `completion: null` without any insight into WHY all providers rejected the request. ## Changes - Add `_extract_api_error_details()` to extract rich information from API errors including: - Status code and request ID - Response body (contains OpenRouter's actual error message) - OpenRouter-specific headers (provider, model) - Rate limit headers - Add `_log_api_error()` helper that logs errors with context: - Session ID for correlation - Message count (helps identify context length issues) - Model being used - Retry count - Update error handling in `_stream_chat_chunks()` and `_generate_llm_continuation()` to use new logging - Extract provider's error message from response body for better user feedback ## Example log output ``` API error: { 'error_type': 'APIStatusError', 'error_message': 'Provider returned error', 'status_code': 400, 'request_id': 'req_xxx', 'response_body': {'error': {'message': 'context_length_exceeded', 'type': 'invalid_request_error'}}, 'openrouter_provider': 'unknown', 'session_id': '44fbb803-...', 'message_count': 52, 'model': 'anthropic/claude-opus-4.5', 'retry_count': 0 } ``` ## Testing - [ ] Verified code passes linting (black, isort, ruff) - [ ] Error details are properly extracted from different error types ## Refs - Linear: SECRT-1859 - Thread: https://discord.com/channels/1126875755960336515/1467066151002571034 --------- Co-authored-by: Reinier van der Leer <pwuts@agpt.co> |
||
|
|
ab0b537cc7 |
refactor(backend): optimize find_block response size by removing raw JSON schemas (#12020)
### Changes 🏗️ The `find_block` AutoPilot tool was returning ~90K characters per response (10 blocks). The bloat came from including full JSON Schema objects (`input_schema`, `output_schema`) with all nested `$defs`, `anyOf`, and type definitions for every block. **What changed:** - **`BlockInfoSummary` model**: Removed `input_schema` (raw JSON Schema), `output_schema` (raw JSON Schema), and `categories`. Added `output_fields` (compact field-level summaries matching the existing `required_inputs` format). - **`BlockListResponse` model**: Removed `usage_hint` (info now in `message`). - **`FindBlockTool._execute()`**: Now extracts compact `output_fields` from output schema properties instead of including the entire raw schema. Credentials handling is unchanged. - **Test**: Added `test_response_size_average_chars_per_block` with realistic block schemas (HTTP, Email, Claude Code) to measure and assert response size stays under 2K chars/block. - **`CLAUDE.md`**: Clarified `dev` vs `master` branching strategy. **Result:** Average response size reduced from ~9,000 to ~1,300 chars per block (~85% reduction). This directly reduces LLM token consumption, latency, and API costs for AutoPilot interactions. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified models import and serialize correctly - [x] Verified response size: 3,970 chars for 3 realistic blocks (avg 1,323/block) - [x] Lint (`ruff check`) and type check (`pyright`) pass on changed files - [x] Frontend compatibility preserved: `blocks[].name` and `count` fields retained for `block_list` handler --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com> |
||
|
|
62bc325d79 |
feat(library): implement folder management for organizing library agents
- Added `LibraryFolder` model to represent folders in the library. - Updated `LibraryAgent` model to include a reference to `folderId`. - Introduced new API endpoints for folder operations: create, update, move, delete, and list folders. - Enhanced existing agent endpoints to support filtering by folder. - Implemented validation for folder operations to prevent circular references and depth violations. - Created a migration script to add folder-related database schema changes. This feature allows users to organize their library agents into folders, improving the overall user experience and management of agents. |
||
|
|
d95aef7665 |
fix(copilot): stream timeout, long-running tool polling, and CreateAgent UI refresh (#12070)
Agent generation completes on the backend but the UI does not update/refresh to show the result. ### Changes 🏗️ ![Uploading Screenshot 2026-02-13 at 00.44.54.png…]() - **Stream start timeout (12s):** If the backend doesn't begin streaming within 12 seconds of submitting a message, the stream is aborted and a destructive toast is shown to the user. - **Long-running tool polling:** Added `useLongRunningToolPolling` hook that polls the session endpoint every 1.5s while a tool output is in an operating state (`operation_started` / `operation_pending` / `operation_in_progress`). When the backend completes, messages are refreshed so the UI reflects the final result. - **CreateAgent UI improvements:** Replaced the orbit loader / progress bar with a mini-game, added expanded accordion for saved agents, and improved the saved-agent card with image, icons, and links that open in new tabs. - **Backend tweaks:** Added `image_url` to `CreateAgentToolOutput`, minor model/service updates for the dummy agent generator. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Send a message and verify the stream starts within 12s or a toast appears - [x] Trigger agent creation and verify the UI updates when the backend completes - [x] Verify the saved-agent card renders correctly with image, links, and icons --------- Co-authored-by: Otto <otto@agpt.co> Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
cb166dd6fb |
feat(blocks): Store sandbox files to workspace (#12073)
Store files created by sandbox blocks (Claude Code, Code Executor) to
the user's workspace for persistence across runs.
### Changes 🏗️
- **New `sandbox_files.py` utility** (`backend/util/sandbox_files.py`)
- Shared module for extracting files from E2B sandboxes
- Stores files to workspace via `store_media_file()` (includes virus
scanning, size limits)
- Returns `SandboxFileOutput` with path, content, and `workspace_ref`
- **Claude Code block** (`backend/blocks/claude_code.py`)
- Added `workspace_ref` field to `FileOutput` schema
- Replaced inline `_extract_files()` with shared utility
- Files from working directory now stored to workspace automatically
- **Code Executor block** (`backend/blocks/code_executor.py`)
- Added `files` output field to `ExecuteCodeBlock.Output`
- Creates `/output` directory in sandbox before execution
- Extracts all files (text + binary) from `/output` after execution
- Updated `execute_code()` to support file extraction with
`extract_files` param
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create agent with Claude Code block, have it create a file, verify
`workspace_ref` in output
- [x] Create agent with Code Executor block, write file to `/output`,
verify `workspace_ref` in output
- [x] Verify files persist in workspace after sandbox disposal
- [x] Verify binary files (images, etc.) work correctly in Code Executor
- [x] Verify existing graphs using `content` field still work (backward
compat)
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
No configuration changes required - this is purely additive backend
code.
---
**Related:** Closes SECRT-1931
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Adds automatic extraction and workspace storage of sandbox-written
files (including binaries for code execution), which can affect output
payload size, performance, and file-handling edge cases.
>
> **Overview**
> **Sandbox blocks now persist generated files to workspace.** A new
shared utility (`backend/util/sandbox_files.py`) extracts files from an
E2B sandbox (scoped by a start timestamp) and stores them via
`store_media_file`, returning `SandboxFileOutput` with `workspace_ref`.
>
> `ClaudeCodeBlock` replaces its inline file-scraping logic with this
utility and updates the `files` output schema to include
`workspace_ref`.
>
> `ExecuteCodeBlock` adds a `files` output and extends the executor
mixin to optionally extract/store files (text + binary) when an
`execution_context` is provided; related mocks/tests and docs are
updated accordingly.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
|
||
|
|
3d31f62bf1 |
Revert "added feature request tooling"
This reverts commit
|
||
|
|
b8b6c9de23 | added feature request tooling | ||
|
|
113e87a23c |
refactor(backend): Reduce circular imports (#12068)
I'm getting circular import issues because there is a lot of cross-importing between `backend.data`, `backend.blocks`, and other modules. This change reduces block-related cross-imports and thus risk of breaking circular imports. ### Changes 🏗️ - Strip down `backend.data.block` - Move `Block` base class and related class/enum defs to `backend.blocks._base` - Move `is_block_auth_configured` to `backend.blocks._utils` - Move `get_blocks()`, `get_io_block_ids()` etc. to `backend.blocks` (`__init__.py`) - Update imports everywhere - Remove unused and poorly typed `Block.create()` - Change usages from `block_cls.create()` to `block_cls()` - Improve typing of `load_all_blocks` and `get_blocks` - Move cross-import of `backend.api.features.library.model` from `backend/data/__init__.py` to `backend/data/integrations.py` - Remove deprecated attribute `NodeModel.webhook` - Re-generate OpenAPI spec and fix frontend usage - Eliminate module-level `backend.blocks` import from `blocks/agent.py` - Eliminate module-level `backend.data.execution` and `backend.executor.manager` imports from `blocks/helpers/review.py` - Replace `BlockInput` with `GraphInput` for graph inputs ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - CI static type-checking + tests should be sufficient for this |
||
|
|
a78145505b |
fix(copilot): merge split assistant messages to prevent Anthropic API errors (#12062)
## Summary
- When the copilot model responds with both text content AND a
long-running tool call (e.g., `create_agent`), the streaming code
created two separate consecutive assistant messages — one with text, one
with `tool_calls`. This caused Anthropic's API to reject with
`"unexpected tool_use_id found in tool_result blocks"` because the
`tool_result` couldn't find a matching `tool_use` in the immediately
preceding assistant message.
- Added a defensive merge of consecutive assistant messages in
`to_openai_messages()` (fixes existing corrupt sessions too)
- Fixed `_yield_tool_call` to add tool_calls to the existing
current-turn assistant message instead of creating a new one
- Changed `accumulated_tool_calls` assignment to use `extend` to prevent
overwriting tool_calls added by long-running tool flow
## Test plan
- [x] All 23 chat feature tests pass (`backend/api/features/chat/`)
- [x] All 44 prompt utility tests pass (`backend/util/prompt_test.py`)
- [x] All pre-commit hooks pass (ruff, isort, black, pyright)
- [ ] Manual test: create an agent via copilot, then ask a follow-up
question — should no longer get 400 error
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Fixes a critical bug where long-running tool calls (like `create_agent`)
caused Anthropic API 400 errors due to split assistant messages. The fix
ensures tool calls are added to the existing assistant message instead
of creating new ones, and adds a defensive merge function to repair any
existing corrupt sessions.
**Key changes:**
- Added `_merge_consecutive_assistant_messages()` to defensively merge
split assistant messages in `to_openai_messages()`
- Modified `_yield_tool_call()` to append tool calls to the current-turn
assistant message instead of creating a new one
- Changed `accumulated_tool_calls` from assignment to `extend` to
preserve tool calls already added by long-running tool flow
**Impact:** Resolves the issue where users received 400 errors after
creating agents via copilot and asking follow-up questions.
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- Safe to merge with minor verification recommended
- The changes are well-targeted and solve a real API compatibility
issue. The logic is sound: searching backwards for the current assistant
message is correct, and using `extend` instead of assignment prevents
overwriting. The defensive merge in `to_openai_messages()` also fixes
existing corrupt sessions. All existing tests pass according to the PR
description.
- No files require special attention - changes are localized and
defensive
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant User
participant StreamAPI as stream_chat_completion
participant Chunks as _stream_chat_chunks
participant ToolCall as _yield_tool_call
participant Session as ChatSession
User->>StreamAPI: Send message
StreamAPI->>Chunks: Stream chat chunks
alt Text + Long-running tool call
Chunks->>StreamAPI: Text delta (content)
StreamAPI->>Session: Append assistant message with content
Chunks->>ToolCall: Tool call detected
Note over ToolCall: OLD: Created new assistant message<br/>NEW: Appends to existing assistant
ToolCall->>Session: Search backwards for current assistant
ToolCall->>Session: Append tool_call to existing message
ToolCall->>Session: Add pending tool result
end
StreamAPI->>StreamAPI: Merge accumulated_tool_calls
Note over StreamAPI: Use extend (not assign)<br/>to preserve existing tool_calls
StreamAPI->>Session: to_openai_messages()
Session->>Session: _merge_consecutive_assistant_messages()
Note over Session: Defensive: Merges any split<br/>assistant messages
Session-->>StreamAPI: Merged messages
StreamAPI->>User: Return response
```
</details>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
|
||
|
|
36aeb0b2b3 |
docs(blocks): clarify HumanInTheLoop output descriptions for agent builder (#12069)
## Problem The agent builder (LLM) misinterprets the HumanInTheLoop block outputs. It thinks `approved_data` and `rejected_data` will yield status strings like "APPROVED" or "REJECTED" instead of understanding that the actual input data passes through. This leads to unnecessary complexity - the agent builder adds comparison blocks to check for status strings that don't exist. ## Solution Enriched the block docstring and all input/output field descriptions to make it explicit that: 1. The output is the actual data itself, not a status string 2. The routing is determined by which output pin fires 3. How to use the block correctly (connect downstream blocks to appropriate output pins) ## Changes - Updated block docstring with clear "How it works" and "Example usage" sections - Enhanced `data` input description to explain data flow - Enhanced `name` input description for reviewer context - Enhanced `approved_data` output to explicitly state it's NOT a status string - Enhanced `rejected_data` output to explicitly state it's NOT a status string - Enhanced `review_message` output for clarity ## Testing Documentation-only change to schema descriptions. No functional changes. Fixes SECRT-1930 <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Enhanced documentation for the `HumanInTheLoopBlock` to clarify how output pins work. The key improvement explicitly states that output pins (`approved_data` and `rejected_data`) yield the actual input data, not status strings like "APPROVED" or "REJECTED". This prevents the agent builder (LLM) from misinterpreting the block's behavior and adding unnecessary comparison blocks. **Key changes:** - Added "How it works" and "Example usage" sections to the block docstring - Clarified that routing is determined by which output pin fires, not by comparing output values - Enhanced all input/output field descriptions with explicit data flow explanations - Emphasized that downstream blocks should be connected to the appropriate output pin based on desired workflow path This is a documentation-only change with no functional modifications to the code logic. </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge with no risk - Documentation-only change that accurately reflects the existing code behavior. No functional changes, no runtime impact, and the enhanced descriptions correctly explain how the block outputs work based on verification of the implementation code. - No files require special attention </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> |
||
|
|
2a189c44c4 |
fix(frontend): API stream issues leaking into prompt (#12063)
## Changes 🏗️ <img width="800" height="621" alt="Screenshot 2026-02-11 at 19 32 39" src="https://github.com/user-attachments/assets/e97be1a7-972e-4ae0-8dfa-6ade63cf287b" /> When the BE API has an error, prevent it from leaking into the stream and instead handle it gracefully via toast. ## Checklist 📋 ### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run the app locally and trust the changes <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> This PR fixes an issue where backend API stream errors were leaking into the chat prompt instead of being handled gracefully. The fix involves both backend and frontend changes to ensure error events conform to the AI SDK's strict schema. **Key Changes:** - **Backend (`response_model.py`)**: Added custom `to_sse()` method for `StreamError` that only emits `type` and `errorText` fields, stripping extra fields like `code` and `details` that cause AI SDK validation failures - **Backend (`prompt.py`)**: Added validation step after context compression to remove orphaned tool responses without matching tool calls, preventing "unexpected tool_use_id" API errors - **Frontend (`route.ts`)**: Implemented SSE stream normalization with `normalizeSSEStream()` and `normalizeSSEEvent()` functions to strip non-conforming fields from error events before they reach the AI SDK - **Frontend (`ChatMessagesContainer.tsx`)**: Added toast notifications for errors and improved error display UI with deduplication logic The changes ensure a clean separation between internal error metadata (useful for logging/debugging) and the strict schema required by the AI SDK on the frontend. </details> <details><summary><h3>Confidence Score: 4/5</h3></summary> - This PR is safe to merge with low risk - The changes are well-structured and address a specific bug with proper error handling. The dual-layer approach (backend filtering in `to_sse()` + frontend normalization) provides defense-in-depth. However, the lack of automated tests for the new error normalization logic and the potential for edge cases in SSE parsing prevent a perfect score. - Pay close attention to `autogpt_platform/frontend/src/app/api/chat/sessions/[sessionId]/stream/route.ts` - the SSE normalization logic should be tested with various error scenarios </details> <details><summary><h3>Sequence Diagram</h3></summary> ```mermaid sequenceDiagram participant User participant Frontend as ChatMessagesContainer participant Proxy as /api/chat/.../stream participant Backend as Backend API participant AISDK as AI SDK User->>Frontend: Send message Frontend->>Proxy: POST with message Proxy->>Backend: Forward request with auth Backend->>Backend: Process message alt Success Path Backend->>Proxy: SSE stream (text-delta, etc.) Proxy->>Proxy: normalizeSSEStream (pass through) Proxy->>AISDK: Forward SSE events AISDK->>Frontend: Update messages Frontend->>User: Display response else Error Path Backend->>Backend: StreamError.to_sse() Note over Backend: Only emit {type, errorText} Backend->>Proxy: SSE error event Proxy->>Proxy: normalizeSSEEvent() Note over Proxy: Strip extra fields (code, details) Proxy->>AISDK: {type: "error", errorText: "..."} AISDK->>Frontend: error state updated Frontend->>Frontend: Toast notification (deduplicated) Frontend->>User: Show error UI + toast end ``` </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Otto-AGPT <otto@agpt.co> |
||
|
|
062fe1aa70 |
fix(security): enforce disabled flag on blocks in graph validation (#12059)
## Summary Blocks marked `disabled=True` (like BlockInstallationBlock) were not being checked during graph validation, allowing them to be used via direct API calls despite being hidden from the UI. This adds a security check in `_validate_graph_get_errors()` to reject any graph containing disabled blocks. ## Security Advisory GHSA-4crw-9p35-9x54 ## Linear SECRT-1927 ## Changes - Added `block.disabled` check in graph validation (6 lines) ## Testing - Graphs with disabled blocks → rejected with clear error message - Graphs with valid blocks → unchanged behavior <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Adds critical security validation to prevent execution of disabled blocks (like `BlockInstallationBlock`) via direct API calls. The fix validates that `block.disabled` is `False` during graph validation in `_validate_graph_get_errors()` on line 747-750, ensuring disabled blocks are rejected before graph creation or execution. This closes a vulnerability where blocks marked disabled in the UI could still be used through API endpoints. </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge and addresses a critical security vulnerability - The fix is minimal (6 lines), correctly placed in the validation flow, includes clear security context (GHSA reference), and follows existing validation patterns. The check is positioned after block existence validation and before input validation, ensuring disabled blocks are caught early in both graph creation and execution paths. - No files require special attention </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> --------- Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
017a00af46 |
feat(copilot): Enable extended thinking for Claude models (#12052)
## Summary
Enables Anthropic's extended thinking feature for Claude models in
CoPilot via OpenRouter. This keeps the model's chain-of-thought
reasoning internal rather than outputting it to users.
## Problem
The CoPilot prompt was designed for a thinking agent (with
`<internal_reasoning>` tags), but extended thinking wasn't enabled on
the API side. This caused the model to output its reasoning as regular
text, leaking internal analysis to users.
## Solution
Added thinking configuration to the OpenRouter `extra_body` for
Anthropic models:
```python
extra_body["provider"] = {
"anthropic": {
"thinking": {
"type": "enabled",
"budget_tokens": config.thinking_budget_tokens,
}
}
}
```
## Configuration
New settings in `ChatConfig`:
| Setting | Default | Description |
|---------|---------|-------------|
| `thinking_enabled` | `True` | Enable extended thinking for Claude
models |
| `thinking_budget_tokens` | `10000` | Token budget for thinking
(1000-100000) |
## Changes
- `config.py`: Added `thinking_enabled` and `thinking_budget_tokens`
settings
- `service.py`: Added thinking config to all 3 places where `extra_body`
is built for LLM calls
## Testing
- Verify CoPilot responses no longer include internal reasoning text
- Check that Claude's extended thinking is working (should see thinking
tokens in usage)
- Confirm non-Anthropic models are unaffected
## Related
Discussion:
https://discord.com/channels/1126875755960336515/1126875756925046928/1470779843552612607
---------
Co-authored-by: Swifty <craigswift13@gmail.com>
|
||
|
|
7d4c020a9b |
feat(chat): implement AI SDK integration with custom streaming response handling (#11901)
### Changes 🏗️ - Added AI SDK integration for chat streaming with proper message handling - Implemented custom to_sse method in StreamToolOutputAvailable to exclude non-spec fields - Modified stream_chat_completion to reuse message IDs for tool call continuations - Created new Copilot 2.0 UI with AI SDK React components - Added streamdown and related packages for markdown rendering - Built reusable conversation and message components for the chat interface - Added support for tool output display in the chat UI ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Start a new chat session and verify streaming works correctly - [x] Test tool calls and verify they display properly in the UI - [x] Verify message continuations don't create duplicate messages - [x] Test markdown rendering with code blocks and other formatting - [x] Verify the UI is responsive and scrolls correctly #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under **Changes**) --------- Co-authored-by: Lluis Agusti <hi@llu.lu> Co-authored-by: Ubbe <hi@ubbe.dev> |
||
|
|
81f8290f01 |
debug(backend/db): Add diagnostic logging for vector type errors (#12024)
Adds diagnostic logging when the `type vector does not exist` error occurs in raw SQL queries. ## Problem We're seeing intermittent "type vector does not exist" errors on dev-behave ([Sentry issue](https://significant-gravitas.sentry.io/issues/7205929979/)). The pgvector extension should be in the search_path, but occasionally queries fail to resolve the vector type. ## Solution When a query fails with this specific error, we now log: - `SHOW search_path` - what schemas are being searched - `SELECT current_schema()` - the active schema - `SELECT current_user, session_user, current_database()` - connection context This diagnostic info will help identify why the vector extension isn't visible in certain cases. ## Changes - Added `_log_vector_error_diagnostics()` helper function in `backend/data/db.py` - Wrapped SQL execution in try/except to catch and diagnose vector type errors - Original exception is re-raised after logging (no behavior change) ## Testing This is observational/diagnostic code. It will be validated by waiting for the error to occur naturally on dev and checking the logs. ## Rollout Once we've captured diagnostic logs and identified the root cause, this logging can be removed or reduced in verbosity. |
||
|
|
6467f6734f |
debug(backend/chat): Add timing logging to chat stream generation mechanism (#12019)
[SECRT-1912: Investigate & eliminate chat session start latency](https://linear.app/autogpt/issue/SECRT-1912) ### Changes 🏗️ - Add timing logs to `backend.api.features.chat` in `routes.py`, `service.py`, and `stream_registry.py` - Remove unneeded DB join in `create_chat_session` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - CI checks |
||
|
|
5a30d11416 |
refactor(copilot): Code cleanup and deduplication (#11950)
## Summary Code cleanup of the AI Copilot codebase - rebased onto latest dev. ## Changes ### New Files - `backend/util/validation.py` - UUID validation helpers - `backend/api/features/chat/tools/helpers.py` - Shared tool utilities ### Credential Matching Consolidation - Added shared utilities to `utils.py` - Refactored `run_block._check_block_credentials()` with discriminator support - Extracted `_resolve_discriminated_credentials()` for multi-provider handling ### Routes Cleanup - Extracted `_create_stream_generator()` and `SSE_RESPONSE_HEADERS` ### Tool Files Cleanup - Updated `run_agent.py` and `run_block.py` to use shared helpers **WIP** - This PR will be updated incrementally. |
||
|
|
caf9ff34e6 |
fix(backend): Handle stale RabbitMQ channels on connection drop (#11929)
### Changes 🏗️ Fixes [**AUTOGPT-SERVER-1TN**](https://autoagpt.sentry.io/issues/?query=AUTOGPT-SERVER-1TN) (~39K events since Feb 2025) and related connection issues **6JC/6JD/6JE/6JF** (~6K combined). #### Problem When the RabbitMQ TCP connection drops (network blip, server restart, etc.): 1. `connect_robust` (aio_pika) automatically reconnects the underlying AMQP connection 2. But `AsyncRabbitMQ._channel` still references the **old dead channel** 3. `is_ready` checks `not self._channel.is_closed` — but the channel object doesn't know the transport is gone 4. `publish_message` tries to use the stale channel → `ChannelInvalidStateError: No active transport in channel` 5. `@func_retry` retries 5 times, but each retry hits the same stale channel (it passes `is_ready`) This means every connection drop generates errors until the process is restarted. #### Fix **New `_ensure_channel()` helper** that resets stale channels before reconnecting, so `connect()` creates a fresh one instead of short-circuiting on `is_connected`. **Explicit `ChannelInvalidStateError` handling in `publish_message`:** 1. First attempt uses `_ensure_channel()` (handles normal staleness) 2. If publish throws `ChannelInvalidStateError`, does a full reconnect (resets both `_channel` and `_connection`) and retries once 3. `@func_retry` provides additional retry resilience on top **Simplified `get_channel()`** to use the same resilient helper. **1 file changed, 62 insertions, 24 deletions.** #### Impact - Eliminates ~39K `ChannelInvalidStateError` Sentry events - RabbitMQ operations self-heal after connection drops without process restart - Related transport EOF errors (6JC/6JD/6JE/6JF) should also reduce |
||
|
|
e8fc8ee623 |
fix(backend): filter graph-only blocks from CoPilot's find_block results (#11892)
Filters out blocks that are unsuitable for standalone execution from
CoPilot's block search and execution. These blocks serve graph-specific
purposes and will either fail, hang, or confuse users when run outside
of a graph context.
**Important:** This does NOT affect the Builder UI which uses
`load_all_blocks()` directly.
### Changes 🏗️
- **find_block.py**: Added `EXCLUDED_BLOCK_TYPES` and
`EXCLUDED_BLOCK_IDS` constants, skip excluded blocks in search results
- **run_block.py**: Added execution guard that returns clear error
message for excluded blocks
- **content_handlers.py**: Added filtering to
`BlockHandler.get_missing_items()` and `get_stats()` to prevent indexing
excluded blocks
**Excluded by BlockType:**
| BlockType | Reason |
|-----------|--------|
| `INPUT` | Graph interface definition - data enters via chat, not graph
inputs |
| `OUTPUT` | Graph interface definition - data exits via chat, not graph
outputs |
| `WEBHOOK` | Wait for external events - would hang forever in CoPilot |
| `WEBHOOK_MANUAL` | Same as WEBHOOK |
| `NOTE` | Visual annotation only - no runtime behavior |
| `HUMAN_IN_THE_LOOP` | Pauses for human approval - CoPilot IS
human-in-the-loop |
| `AGENT` | AgentExecutorBlock requires graph context - use `run_agent`
tool instead |
**Excluded by ID:**
| Block | Reason |
|-------|--------|
| `SmartDecisionMakerBlock` | Dynamically discovers downstream blocks
via graph topology |
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [ ] Search for "input" in CoPilot - should NOT return AgentInputBlock
variants
- [ ] Search for "output" in CoPilot - should NOT return
AgentOutputBlock
- [ ] Search for "webhook" in CoPilot - should NOT return trigger blocks
- [ ] Search for "human" in CoPilot - should NOT return
HumanInTheLoopBlock
- [ ] Search for "decision" in CoPilot - should NOT return
SmartDecisionMakerBlock
- [ ] Verify functional blocks still appear (e.g., "email", "http",
"text")
- [ ] Verify Builder UI still shows ALL blocks (no regression)
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
No configuration changes required.
---
Resolves: [SECRT-1831](https://linear.app/autogpt/issue/SECRT-1831)
🤖 Generated with [Claude Code](https://claude.ai/code)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Low Risk**
> Behavior change is limited to CoPilot’s block discovery/execution
guards and is covered by new tests; main risk is inadvertently excluding
a block that should be runnable.
>
> **Overview**
> CoPilot now **filters out graph-only blocks** from `find_block`
results and prevents them from being executed via `run_block`, returning
a clear error when a user attempts to run an excluded block.
>
> `find_block` introduces explicit exclusion lists (by `BlockType` and a
specific block ID), over-fetches search results to maintain up to 10
usable matches after filtering, and adds debug logging when results are
reduced. New unit tests cover both the search filtering and the
`run_block` execution guard; a minor cleanup removes an unused `pytest`
import in `execution_queue_test.py`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
|
||
|
|
9e38bd5b78 |
chore(backend/deps): bump the production-dependencies group across 1 directory with 8 updates (#12014)
Bumps the production-dependencies group with 8 updates in the /autogpt_platform/backend directory: | Package | From | To | | --- | --- | --- | | [anthropic](https://github.com/anthropics/anthropic-sdk-python) | `0.59.0` | `0.79.0` | | [fastapi](https://github.com/fastapi/fastapi) | `0.128.3` | `0.128.5` | | [ollama](https://github.com/ollama/ollama-python) | `0.5.4` | `0.6.1` | | [prometheus-client](https://github.com/prometheus/client_python) | `0.22.1` | `0.24.1` | | [python-multipart](https://github.com/Kludex/python-multipart) | `0.0.20` | `0.0.22` | | [supabase](https://github.com/supabase/supabase-py) | `2.27.2` | `2.27.3` | | [tenacity](https://github.com/jd/tenacity) | `9.1.3` | `9.1.4` | | [tiktoken](https://github.com/openai/tiktoken) | `0.9.0` | `0.12.0` | Updates `anthropic` from 0.59.0 to 0.79.0 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/anthropics/anthropic-sdk-python/releases">anthropic's releases</a>.</em></p> <blockquote> <h2>v0.79.0</h2> <h2>0.79.0 (2026-02-07)</h2> <p>Full Changelog: <a href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.78.0...v0.79.0">v0.78.0...v0.79.0</a></p> <h3>Features</h3> <ul> <li><strong>api:</strong> enabling fast-mode in claude-opus-4-6 (<a href=" |
||
|
|
a329831b0b |
feat(backend): Add ClamAV scanning for local file paths (#11988)
## Context From PR #11796 review discussion. Files processed by the video blocks (downloads, uploads, generated videos) should be scanned through ClamAV for malware detection. ## Problem `store_media_file()` in `backend/util/file.py` already scans: - `workspace://` references - Cloud storage paths - Data URIs (`data:...`) - HTTP/HTTPS URLs **But local file paths were NOT scanned.** The `else` branch only verified the file exists. This gap affected video processing blocks (e.g., `LoopVideoBlock`, `AddAudioToVideoBlock`) that: 1. Download/receive input media 2. Process it locally (loop, add audio, etc.) 3. Write output to temp directory 4. Call `store_media_file(output_filename, ...)` with a local path → **skipped virus scanning** ## Solution Added virus scanning to the local file path branch: ```python # Virus scan the local file before any further processing local_content = target_path.read_bytes() if len(local_content) > MAX_FILE_SIZE_BYTES: raise ValueError(...) await scan_content_safe(local_content, filename=sanitized_file) ``` ## Changes - `backend/util/file.py` - Added ~7 lines to scan local files (consistent with other input types) - `backend/util/file_test.py` - Added 2 test cases for local file scanning ## Risk Assessment - **Low risk:** Single point of change, follows existing pattern - **Backwards compatible:** No API changes - **Fail-safe:** If scanning fails, file is rejected (existing behavior) Closes SECRT-1904 Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> |
||
|
|
728c40def5 |
fix(backend): replace multiprocessing queue with thread safe queue in ExecutionQueue (#11618)
<!-- Clearly explain the need for these changes: --> The `ExecutionQueue` class was using `multiprocessing.Manager().Queue()` which spawns a subprocess for inter-process communication. However, analysis showed that `ExecutionQueue` is only accessed from threads within the same process, not across processes. This caused: - Unnecessary subprocess spawning per graph execution - IPC overhead for every queue operation - Potential resource leaks if Manager processes weren't properly cleaned up - Limited scalability when many graphs execute concurrently ### Changes <!-- Concisely describe all of the changes made in this pull request: --> - Replaced `multiprocessing.Manager().Queue()` with `queue.Queue()` in `ExecutionQueue` class - Updated imports: removed `from multiprocessing import Manager` and `from queue import Empty`, added `import queue` - Updated exception handling from `except Empty:` to `except queue.Empty:` - Added comprehensive docstring explaining the bug and fix **File changed:** `autogpt_platform/backend/backend/data/execution.py` ### Checklist #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: <!-- Put your test plan here: --> - [x] Verified `ExecutionQueue` uses `queue.Queue` (not `multiprocessing.Manager().Queue()`) - [x] Tested all queue operations: `add()`, `get()`, `empty()`, `get_or_none()` - [x] Verified thread-safety with concurrent producer/consumer threads (100 items) - [x] Verified multi-producer/consumer scenario (3 producers, 2 consumers, 150 items) - [x] Confirmed no subprocess spawning when creating multiple queues - [x] Code passes Black formatting check #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under **Changes**) > No configuration changes required - this is a code-only fix with no external API changes. --------- Co-authored-by: Otto <otto@agpt.co> Co-authored-by: Zamil Majdy <majdyz@users.noreply.github.com> Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> |
||
|
|
cd64562e1b |
chore(libs/deps): bump the production-dependencies group across 1 directory with 8 updates (#11934)
Bumps the production-dependencies group with 8 updates in the /autogpt_platform/autogpt_libs directory: | Package | From | To | | --- | --- | --- | | [fastapi](https://github.com/fastapi/fastapi) | `0.116.1` | `0.128.0` | | [google-cloud-logging](https://github.com/googleapis/python-logging) | `3.12.1` | `3.13.0` | | [launchdarkly-server-sdk](https://github.com/launchdarkly/python-server-sdk) | `9.12.0` | `9.14.1` | | [pydantic](https://github.com/pydantic/pydantic) | `2.11.7` | `2.12.5` | | [pydantic-settings](https://github.com/pydantic/pydantic-settings) | `2.10.1` | `2.12.0` | | [pyjwt](https://github.com/jpadilla/pyjwt) | `2.10.1` | `2.11.0` | | [supabase](https://github.com/supabase/supabase-py) | `2.16.0` | `2.27.2` | | [uvicorn](https://github.com/Kludex/uvicorn) | `0.35.0` | `0.40.0` | Updates `fastapi` from 0.116.1 to 0.128.0 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/fastapi/fastapi/releases">fastapi's releases</a>.</em></p> <blockquote> <h2>0.128.0</h2> <h3>Breaking Changes</h3> <ul> <li>➖ Drop support for <code>pydantic.v1</code>. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14609">#14609</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <h3>Internal</h3> <ul> <li>✅ Run performance tests only on Pydantic v2. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14608">#14608</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <h2>0.127.1</h2> <h3>Refactors</h3> <ul> <li>🔊 Add a custom <code>FastAPIDeprecationWarning</code>. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14605">#14605</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <h3>Docs</h3> <ul> <li>📝 Add documentary to website. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14600">#14600</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <h3>Translations</h3> <ul> <li>🌐 Update translations for de (update-outdated). PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14602">#14602</a> by <a href="https://github.com/nilslindemann"><code>@nilslindemann</code></a>.</li> <li>🌐 Update translations for de (update-outdated). PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14581">#14581</a> by <a href="https://github.com/nilslindemann"><code>@nilslindemann</code></a>.</li> </ul> <h3>Internal</h3> <ul> <li>🔧 Update pre-commit to use local Ruff instead of hook. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14604">#14604</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> <li>✅ Add missing tests for code examples. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14569">#14569</a> by <a href="https://github.com/YuriiMotov"><code>@YuriiMotov</code></a>.</li> <li>👷 Remove <code>lint</code> job from <code>test</code> CI workflow. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14593">#14593</a> by <a href="https://github.com/YuriiMotov"><code>@YuriiMotov</code></a>.</li> <li>👷 Update secrets check. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14592">#14592</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> <li>👷 Run CodSpeed tests in parallel to other tests to speed up CI. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14586">#14586</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> <li>🔨 Update scripts and pre-commit to autofix files. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14585">#14585</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <h2>0.127.0</h2> <h3>Breaking Changes</h3> <ul> <li>🔊 Add deprecation warnings when using <code>pydantic.v1</code>. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14583">#14583</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <h3>Translations</h3> <ul> <li>🔧 Add LLM prompt file for Korean, generated from the existing translations. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14546">#14546</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> <li>🔧 Add LLM prompt file for Japanese, generated from the existing translations. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14545">#14545</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <h3>Internal</h3> <ul> <li>⬆️ Upgrade OpenAI model for translations to gpt-5.2. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14579">#14579</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <h2>0.126.0</h2> <h3>Upgrades</h3> <ul> <li>➖ Drop support for Pydantic v1, keeping short temporary support for Pydantic v2's <code>pydantic.v1</code>. PR <a href="https://redirect.github.com/fastapi/fastapi/pull/14575">#14575</a> by <a href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
|
8fddc9d71f |
fix(backend): Reduce GET /api/graphs expense + latency (#11986)
[SECRT-1896: Fix crazy `GET /api/graphs` latency (P95 = 107s)](https://linear.app/autogpt/issue/SECRT-1896) These changes should decrease latency of this endpoint by ~~60-65%~~ a lot. ### Changes 🏗️ - Make `Graph.credentials_input_schema` cheaper by avoiding constructing a new `BlockSchema` subclass - Strip down `GraphMeta` - drop all computed fields - Replace with either `GraphModel` or `GraphModelWithoutNodes` wherever those computed fields are used - Simplify usage in `list_graphs_paginated` and `fetch_graph_from_store_slug` - Refactor and clarify relationships between the different graph models - Split `BaseGraph` into `GraphBaseMeta` + `BaseGraph` - Strip down `Graph` - move `credentials_input_schema` and `aggregate_credentials_inputs` to `GraphModel` - Refactor to eliminate double `aggregate_credentials_inputs()` call in `credentials_input_schema` call tree - Add `GraphModelWithoutNodes` (similar to current `GraphMeta`) ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] `GET /api/graphs` works as it should - [x] Running a graph succeeds - [x] Adding a sub-agent in the Builder works as it should |
||
|
|
29ee85c86f |
fix: add virus scanning to WorkspaceManager.write_file() (#11990)
## Summary
Adds virus scanning at the `WorkspaceManager.write_file()` layer for
defense in depth.
## Problem
Previously, virus scanning was only performed at entry points:
- `store_media_file()` in `backend/util/file.py`
- `WriteWorkspaceFileTool` in
`backend/api/features/chat/tools/workspace_files.py`
This created a trust boundary where any new caller of
`WorkspaceManager.write_file()` would need to remember to scan first.
## Solution
Add `scan_content_safe()` call directly in
`WorkspaceManager.write_file()` before persisting to storage. This
ensures all content is scanned regardless of the caller.
## Changes
- Added import for `scan_content_safe` from `backend.util.virus_scanner`
- Added virus scan call after file size validation, before storage
## Testing
Existing tests should pass. The scan is a no-op in test environments
where ClamAV isn't running.
Closes https://linear.app/autogpt/issue/OPEN-2993
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Introduces a new required async scan step in the workspace write path,
which can add latency or cause new failures if the scanner/ClamAV is
misconfigured or unavailable.
>
> **Overview**
> Adds a **defense-in-depth** virus scan to
`WorkspaceManager.write_file()` by invoking `scan_content_safe()` after
file-size validation and before any storage/database persistence.
>
> This centralizes scanning so any caller writing workspace files gets
the same malware check without relying on upstream entry points to
remember to scan.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
|
||
|
|
85b6520710 |
feat(blocks): Add video editing blocks (#11796)
<!-- Clearly explain the need for these changes: -->
This PR adds general-purpose video editing blocks for the AutoGPT
Platform, enabling automated video production workflows like documentary
creation, marketing videos, tutorial assembly, and content repurposing.
### Changes 🏗️
<!-- Concisely describe all of the changes made in this pull request:
-->
**New blocks added in `backend/blocks/video/`:**
- `VideoDownloadBlock` - Download videos from URLs (YouTube, Vimeo, news
sites, direct links) using yt-dlp
- `VideoClipBlock` - Extract time segments from videos with start/end
time validation
- `VideoConcatBlock` - Merge multiple video clips with optional
transitions (none, crossfade, fade_black)
- `VideoTextOverlayBlock` - Add text overlays/captions with positioning
and timing options
- `VideoNarrationBlock` - Generate AI narration via ElevenLabs and mix
with video audio (replace, mix, or ducking modes)
**Dependencies required:**
- `yt-dlp` - For video downloading
- `moviepy` - For video editing operations
**Implementation details:**
- All blocks follow the SDK pattern with proper error handling and
exception chaining
- Proper resource cleanup in `finally` blocks to prevent memory leaks
- Input validation (e.g., end_time > start_time)
- Test mocks included for CI
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Blocks follow the SDK pattern with
`BlockSchemaInput`/`BlockSchemaOutput`
- [x] Resource cleanup is implemented in `finally` blocks
- [x] Exception chaining is properly implemented
- [x] Input validation is in place
- [x] Test mocks are provided for CI environments
#### For configuration changes:
- [ ] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [ ] I have included a list of my configuration changes in the PR
description (under **Changes**)
N/A - No configuration changes required.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Adds new multimedia blocks that invoke ffmpeg/MoviePy and introduces
new external dependencies (plus container packages), which can impact
runtime stability and resource usage; download/overlay blocks are
present but disabled due to sandbox/policy concerns.
>
> **Overview**
> Adds a new `backend.blocks.video` module with general-purpose video
workflow blocks (download, clip, concat w/ transitions, loop, add-audio,
text overlay, and ElevenLabs-powered narration), including shared
utilities for codec selection, filename cleanup, and an ffmpeg-based
chapter-strip workaround for MoviePy.
>
> Extends credentials/config to support ElevenLabs
(`ELEVENLABS_API_KEY`, provider enum, system credentials, and cost
config) and adds new dependencies (`elevenlabs`, `yt-dlp`) plus Docker
runtime packages (`ffmpeg`, `imagemagick`).
>
> Improves file/reference handling end-to-end by embedding MIME types in
`workspace://...#mime` outputs and updating frontend rendering to detect
video vs image from MIME fragments (and broaden supported audio/video
extensions), with optional enhanced output rendering behind a feature
flag in the legacy builder UI.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
|
||
|
|
bfa942e032 |
feat(platform): Add Claude Opus 4.6 model support (#11983)
## Summary Adds support for Anthropic's newly released Claude Opus 4.6 model. ## Changes - Added `claude-opus-4-6` to the `LlmModel` enum - Added model metadata: 200K context window (1M beta), **128K max output tokens** - Added block cost config (same pricing tier as Opus 4.5: $5/MTok input, $25/MTok output) - Updated chat config default model to Claude Opus 4.6 ## Model Details From [Anthropic's docs](https://docs.anthropic.com/en/docs/about-claude/models): - **API ID:** `claude-opus-4-6` - **Context window:** 200K tokens (1M beta) - **Max output:** 128K tokens (up from 64K on Opus 4.5) - **Extended thinking:** Yes - **Adaptive thinking:** Yes (new, Opus 4.6 exclusive) - **Knowledge cutoff:** May 2025 (reliable), Aug 2025 (training) - **Pricing:** $5/MTok input, $25/MTok output (same as Opus 4.5) --------- Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com> |
||
|
|
3ca2387631 |
feat(blocks): Implement Text Encode block (#11857)
## Summary
Implements a `TextEncoderBlock` that encodes plain text into escape
sequences (the reverse of `TextDecoderBlock`).
## Changes
### Block Implementation
- Added `encoder_block.py` with `TextEncoderBlock` in
`autogpt_platform/backend/backend/blocks/`
- Uses `codecs.encode(text, "unicode_escape").decode("utf-8")` for
encoding
- Mirrors the structure and patterns of the existing `TextDecoderBlock`
- Categorised as `BlockCategory.TEXT`
### Documentation
- Added Text Encoder section to
`docs/integrations/block-integrations/text.md` (the auto-generated docs
file for TEXT category blocks)
- Expanded "How it works" with technical details on the encoding method,
validation, and edge cases
- Added 3 structured use cases per docs guidelines: JSON payload
preparation, Config/ENV generation, Snapshot fixtures
- Added Text Encoder to the overview table in
`docs/integrations/README.md`
- Removed standalone `encoder_block.md` (TEXT category blocks belong in
`text.md` per `CATEGORY_FILE_MAP` in `generate_block_docs.py`)
### Documentation Formatting (CodeRabbit feedback)
- Added blank lines around markdown tables (MD058)
- Added `text` language tags to fenced code blocks (MD040)
- Restructured use case section with bold headings per coding guidelines
## How Docs Were Synced
The `check-docs-sync` CI job runs `poetry run python
scripts/generate_block_docs.py --check` which expects blocks to be
documented in category-grouped files. Since `TextEncoderBlock` uses
`BlockCategory.TEXT`, the `CATEGORY_FILE_MAP` maps it to `text.md` — not
a standalone file. The block entry was added to `text.md` following the
exact format used by the generator (with `<!-- MANUAL -->` markers for
hand-written sections).
## Related Issue
Fixes #11111
---------
Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: lif <19658300+majiayu000@users.noreply.github.com>
Co-authored-by: Aryan Kaul <134673289+aryancodes1@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Nick Tindle <nick@ntindle.com>
|
||
|
|
ed07f02738 |
fix(copilot): edit_agent updates existing agent instead of creating duplicate (#11981)
## Summary When editing an agent via CoPilot's `edit_agent` tool, the code was always creating a new `LibraryAgent` entry instead of updating the existing one to point to the new graph version. This caused duplicate agents to appear in the user's library. ## Changes In `save_agent_to_library()`: - When `is_update=True`, now checks if there's an existing library agent for the graph using `get_library_agent_by_graph_id()` - If found, uses `update_agent_version_in_library()` to update the existing library agent to point to the new version - Falls back to creating a new library agent if no existing one is found (e.g., if editing a graph that wasn't added to library yet) ## Testing - Verified lint/format checks pass - Plan reviewed and approved by Staff Engineer Plan Reviewer agent ## Related Fixes [SECRT-1857](https://linear.app/autogpt/issue/SECRT-1857) --------- Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> |
||
|
|
e40233a3ac |
fix(backend/chat): Guide find_agent users toward action with CTAs (#11976)
When users search for agents, guide them toward creating custom agents if no results are found or after showing results. This improves user engagement by offering a clear next step. ### Changes 🏗️ - Updated `agent_search.py` to add CTAs in search responses - Added messaging to inform users they can create custom agents based on their needs - Applied to both "no results found" and "agents found" scenarios ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Search for agents in marketplace with matching results - [x] Search for agents in marketplace with no results - [x] Search for agents in library with matching results - [x] Search for agents in library with no results - [x] Verify CTA message appears in all cases --------- Co-authored-by: Otto <otto@agpt.co> |
||
|
|
3ae5eabf9d |
fix(backend/chat): Use latest prompt label in non-production environments (#11977)
In non-production environments, the chat service now fetches prompts with the `latest` label instead of the default production-labeled prompt. This makes it easier to test and iterate on prompt changes in dev/staging without needing to promote them to production first. ### Changes 🏗️ - Updated `_get_system_prompt_template()` in chat service to pass `label="latest"` when `app_env` is not `PRODUCTION` - Production environments continue using the default behavior (production-labeled prompts) ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified that in non-production environments, prompts with `latest` label are fetched - [x] Verified that production environments still use the default (production) labeled prompts Co-authored-by: Otto <otto@agpt.co> |
||
|
|
a077ba9f03 |
fix(platform): YouTube block yields only error on failure (#11980)
## Summary Fixes [SECRT-1889](https://linear.app/autogpt/issue/SECRT-1889): The YouTube transcription block was yielding both `video_id` and `error` when the transcript fetch failed. ## Problem The block yielded `video_id` immediately upon extracting it from the URL, before attempting to fetch the transcript. If the transcript fetch failed, both outputs were present. ```python # Before video_id = self.extract_video_id(input_data.youtube_url) yield "video_id", video_id # ← Yielded before transcript attempt transcript = self.get_transcript(video_id, credentials) # ← Could fail here ``` ## Solution Wrap the entire operation in try/except and only yield outputs after all operations succeed: ```python # After try: video_id = self.extract_video_id(input_data.youtube_url) transcript = self.get_transcript(video_id, credentials) transcript_text = self.format_transcript(transcript=transcript) # Only yield after all operations succeed yield "video_id", video_id yield "transcript", transcript_text except Exception as e: yield "error", str(e) ``` This follows the established pattern in other blocks (e.g., `ai_image_generator_block.py`). ## Testing - All 10 unit tests pass (`test/blocks/test_youtube.py`) - Lint/format checks pass Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com> |
||
|
|
5401d54eaa |
fix(backend): Handle StreamHeartbeat in CoPilot stream handler (#11928)
### Changes 🏗️ Fixes **AUTOGPT-SERVER-7JA** (123 events since Jan 27, 2026). #### Problem `StreamHeartbeat` was added to keep SSE connections alive during long-running tool executions (yielded every 15s while waiting). However, the main `stream_chat_completion` handler's `elif` chain didn't have a case for it: ``` StreamTextStart → ✅ handled StreamTextDelta → ✅ handled StreamTextEnd → ✅ handled StreamToolInputStart → ✅ handled StreamToolInputAvailable → ✅ handled StreamToolOutputAvailable → ✅ handled StreamFinish → ✅ handled StreamError → ✅ handled StreamUsage → ✅ handled StreamHeartbeat → ❌ fell through to 'Unknown chunk type' error ``` This meant every heartbeat during tool execution generated a Sentry error instead of keeping the connection alive. #### Fix Add `StreamHeartbeat` to the `elif` chain and yield it through. The route handler already calls `to_sse()` on all yielded chunks, and `StreamHeartbeat.to_sse()` correctly returns `: heartbeat\n\n` (SSE comment format, ignored by clients but keeps proxies/load balancers happy). **1 file changed, 3 insertions.** |
||
|
|
5ac89d7c0b |
fix(test): fix timing bug in test_block_credit_reset (#11978)
## Summary Fixes the flaky `test_block_credit_reset` test that was failing on multiple PRs with `assert 0 == 1000`. ## Root Cause The test calls `disable_test_user_transactions()` which sets `updatedAt` to 35 days ago from the **actual current time**. It then mocks `time_now` to January 1st. **The bug**: If the test runs in early February, 35 days ago is January — the **same month** as the mocked `time_now`. The credit refill logic only triggers when the balance snapshot is from a *different* month, so no refill happens and the balance stays at 0. ## Fix After calling `disable_test_user_transactions()`, explicitly set `updatedAt` to December of the previous year. This ensures it's always in a different month than the mocked `month1` (January), regardless of when the test runs. ## Testing CI will verify the fix. |
||
|
|
4f908d5cb3 |
fix(platform): Improve Linear Search Block [SECRT-1880] (#11967)
## Summary Implements [SECRT-1880](https://linear.app/autogpt/issue/SECRT-1880) - Improve Linear Search Block ## Changes ### Models (`models.py`) - Added `State` model with `id`, `name`, and `type` fields for workflow state information - Added `state: State | None` field to `Issue` model ### API Client (`_api.py`) - Updated `try_search_issues()` to: - Add `max_results` parameter (default 10, was ~50) to reduce token usage - Add `team_id` parameter for team filtering - Return `createdAt`, `state`, `project`, and `assignee` fields in results - Fixed `try_get_team_by_name()` to return descriptive error message when team not found instead of crashing with `IndexError` ### Block (`issues.py`) - Added `max_results` input parameter (1-100, default 10) - Added `team_name` input parameter for optional team filtering - Added `error` output field for graceful error handling - Added categories (`PRODUCTIVITY`, `ISSUE_TRACKING`) - Updated test fixtures to include new fields ## Breaking Changes | Change | Before | After | Mitigation | |--------|--------|-------|------------| | Default result count | ~50 | 10 | Users can set `max_results` up to 100 if needed | ## Non-Breaking Changes - `state` field added to `Issue` (optional, defaults to `None`) - `max_results` param added (has default value) - `team_name` param added (optional, defaults to `None`) - `error` output added (follows established pattern from GitHub blocks) ## Testing - [x] Format/lint checks pass - [x] Unit test fixtures updated Resolves SECRT-1880 --------- Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Toran Bruce Richards <Torantulino@users.noreply.github.com> |
||
|
|
c1aa684743 |
fix(platform/chat): Filter host-scoped credentials for run_agent tool (#11905)
- Fixes [SECRT-1851: \[Copilot\] `run_agent` tool doesn't filter host-scoped credentials](https://linear.app/autogpt/issue/SECRT-1851) - Follow-up to #11881 ### Changes 🏗️ - Filter host-scoped credentials for `run_agent` tool - Tighten validation on host input field in `HostScopedCredentialsModal` - Use netloc (w/ port) rather than just hostname (w/o port) as host scope ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - Create graph that requires host-scoped credentials to work - Create host-scoped credentials with a *different* host - Try to have Copilot run the graph - [x] -> no matching credentials available - Create new credentials - [x] -> works --------- Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> |