## Summary
- Routes Claude Agent SDK API calls through OpenRouter via
`ANTHROPIC_BASE_URL` / `ANTHROPIC_AUTH_TOKEN` env vars, enabling
per-call token and cost tracking on the OpenRouter dashboard
- Adds `sdk_model` and `sdk_max_budget_usd` config fields for
SDK-specific model selection and budget control
- Emits `StreamUsage` from SDK `ResultMessage` so the frontend receives
token counts, and persists usage to `session.usage`
- Fixes Langfuse tracing to use the configured model name instead of a
hardcoded default
- Updates Anthropic fallback to use `config.api_key` / `config.base_url`
(OpenRouter routing) instead of raw `ANTHROPIC_API_KEY` env var
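A minimal sketch of the wiring (the `config` fields mirror the chat
configuration named above; the exact `ClaudeAgentOptions` fields used are
assumptions, not the final implementation):

```python
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient


def build_sdk_client(config) -> ClaudeSDKClient:
    # Point the SDK's Anthropic client at OpenRouter so every call shows up
    # on the OpenRouter dashboard with per-call token/cost tracking.
    env = {
        "ANTHROPIC_BASE_URL": config.base_url,   # OpenRouter endpoint from chat config
        "ANTHROPIC_AUTH_TOKEN": config.api_key,  # OpenRouter key, not a raw Anthropic key
    }
    options = ClaudeAgentOptions(
        model=config.sdk_model,  # new SDK-specific model field
        env=env,
    )
    return ClaudeSDKClient(options=options)
```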
## Test plan
- [ ] Deploy and send a CoPilot message — verify the API call appears on
the OpenRouter dashboard
- [ ] Check Langfuse trace shows correct model name (e.g.
`claude-opus-4.6` not hardcoded `claude-sonnet-4-20250514`)
- [ ] Verify frontend receives `StreamUsage` with `promptTokens` /
`completionTokens` values
- [ ] Set `CHAT_SDK_MAX_BUDGET_USD` and verify budget is respected
- [ ] Test fallback path (without `claude-agent-sdk` installed) still
works via OpenRouter
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Routes Claude Agent SDK API calls through OpenRouter for enhanced
observability and cost tracking. The PR enables per-call token tracking
on the OpenRouter dashboard by configuring the SDK to use
`ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` environment variables
derived from the chat configuration.
Key changes:
- Added `sdk_model` and `sdk_max_budget_usd` configuration fields for
SDK-specific control
- Implemented automatic model name resolution that strips OpenRouter
provider prefixes
- Updated SDK client initialization to route through OpenRouter with
proper environment variables
- Emits `StreamUsage` events from SDK `ResultMessage` for frontend token
visibility
- Persists usage data to `session.usage` for historical tracking
- Fixed Langfuse tracing to use the configured model name instead of
hardcoded defaults
- Updated fallback path to use OpenRouter routing instead of direct
Anthropic API
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- Safe to merge with minor observations - the implementation is solid
and the changes are well-structured
- The code quality is high with proper error handling, clear separation
of concerns, and good defensive coding practices. The changes integrate
cleanly with existing patterns. Minor observations include missing
validation for sdk_max_budget_usd and a potential edge case in model
name resolution, but these don't block merging
- No files require special attention - all changes follow existing
patterns and maintain consistency
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant Frontend
participant Backend
participant SDK as Claude Agent SDK
participant OpenRouter
participant Anthropic
participant Langfuse
Frontend->>Backend: POST /chat/completions
Backend->>Backend: Load config (api_key, base_url)
Backend->>Backend: Resolve SDK model (strip OpenRouter prefix)
Backend->>Backend: Build SDK env vars (ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN)
Backend->>Langfuse: Initialize TracedSession with model name
Backend->>SDK: ClaudeSDKClient(model, env, max_budget_usd)
SDK->>SDK: Use ANTHROPIC_BASE_URL from env
SDK->>OpenRouter: POST /messages (via configured base_url)
OpenRouter->>Anthropic: Forward request with routing
Anthropic-->>OpenRouter: Stream response chunks
OpenRouter-->>SDK: Stream response with usage data
loop For each SDK message
SDK-->>Backend: AssistantMessage/UserMessage/ResultMessage
Backend->>Langfuse: log_sdk_message()
Backend->>Backend: SDKResponseAdapter.convert_message()
Backend->>Backend: Extract usage from ResultMessage
Backend->>Backend: Persist Usage to session.usage
Backend-->>Frontend: StreamUsage(promptTokens, completionTokens)
Backend-->>Frontend: StreamTextDelta/StreamToolInput/etc
end
Backend->>Langfuse: Log final generation with model name
Backend->>Backend: Save session with usage data
Backend-->>Frontend: StreamFinish
```
</details>
I'm getting circular import issues because of the heavy cross-importing
between `backend.data`, `backend.blocks`, and other modules. This change
reduces block-related cross-imports and thus the risk of introducing
circular imports.
### Changes 🏗️
- Strip down `backend.data.block`
- Move `Block` base class and related class/enum defs to
`backend.blocks._base`
- Move `is_block_auth_configured` to `backend.blocks._utils`
- Move `get_blocks()`, `get_io_block_ids()` etc. to `backend.blocks`
(`__init__.py`)
- Update imports everywhere
- Remove unused and poorly typed `Block.create()`
- Change usages from `block_cls.create()` to `block_cls()`
- Improve typing of `load_all_blocks` and `get_blocks`
- Move cross-import of `backend.api.features.library.model` from
`backend/data/__init__.py` to `backend/data/integrations.py`
- Remove deprecated attribute `NodeModel.webhook`
- Re-generate OpenAPI spec and fix frontend usage
- Eliminate module-level `backend.blocks` import from `blocks/agent.py`
- Eliminate module-level `backend.data.execution` and
`backend.executor.manager` imports from `blocks/helpers/review.py`
- Replace `BlockInput` with `GraphInput` for graph inputs
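For illustration, a call site's imports before and after the move (new
module paths as listed above; the old single-module import is assumed):

```python
# Before: everything was reached through backend.data.block
# from backend.data.block import Block, get_blocks, is_block_auth_configured

# After: the base class, utils, and registry live under backend.blocks
from backend.blocks import get_blocks, get_io_block_ids
from backend.blocks._base import Block
from backend.blocks._utils import is_block_auth_configured
```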
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- CI static type-checking + tests should be sufficient for this
Replace the read-modify-write pattern in stream_chat_post with an
atomic append_and_save_message helper that acquires the session lock
before re-fetching and appending. This prevents message loss when
concurrent requests modify the same session.
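A sketch of the helper, with `acquire_session_lock`, `get_session`, and
`save_session` as hypothetical stand-ins for the real lock and store
primitives:

```python
async def append_and_save_message(session_id: str, message: dict) -> None:
    # Acquire the lock *before* reading so two concurrent requests can't
    # both read the same stale message list and overwrite each other.
    async with acquire_session_lock(session_id):
        session = await get_session(session_id)  # re-fetch inside the lock
        session.messages.append(message)
        await save_session(session)              # persist before releasing
```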
- Create tracing.py with TracedSession context manager
- Automatically trace user messages, SDK messages, and results
- Capture tool calls with input/output and timing
- Log usage and cost from SDK ResultMessage
- No-op when Langfuse not configured (zero overhead)
- Clean integration into service.py via context manager
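A sketch of the no-op behavior (hypothetical shape; the real
`TracedSession` lives in `tracing.py`):

```python
class TracedSession:
    def __init__(self, langfuse_client=None):
        self._client = langfuse_client  # None when Langfuse isn't configured

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        if self._client is not None:
            self._client.flush()  # ship buffered traces on exit
        return False

    def log_sdk_message(self, message) -> None:
        if self._client is None:
            return  # zero overhead when tracing is off
        ...  # record message, tool call timing, usage, and cost
```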
- Add empty check after session_id sanitization
- Add assertion for defense-in-depth
- Add explicit '..' traversal check in cleanup
- Replace glob with os.listdir to avoid glob injection
- Add validation that project_dir stays under ~/.claude/projects
- Add warning logs for rejected paths
Addresses CodeQL alert about uncontrolled data in path expression
- Use json.dumps instead of str() for more predictable pattern matching
- Log warning when SDK not available and security hooks are disabled
Addresses CodeRabbit review feedback
- Extract command name (jq, grep, etc.) from Bash tool input
- Display 'jq completed' instead of 'Bash completed'
- Add ripgrep and tree to Dockerfile (match ALLOWED_BASH_COMMANDS)
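The command-name extraction might look like this (a sketch; `shlex`
handles quoting in the Bash tool input):

```python
import shlex


def display_tool_name(tool_name: str, tool_input: dict) -> str:
    """Return 'jq' for a Bash call running jq, so the UI says 'jq completed'."""
    if tool_name == "Bash":
        try:
            tokens = shlex.split(tool_input.get("command", ""))
        except ValueError:
            tokens = []  # unbalanced quotes -> fall back to the generic name
        if tokens:
            return tokens[0]  # first token is the command (jq, grep, tree, ...)
    return tool_name
```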
- Add sdk_max_buffer_size config option (default 10MB, was 1MB)
- Pass max_buffer_size to ClaudeAgentOptions to prevent crashes on large tool outputs
- Install jq in Dockerfile for JSON processing capabilities
Fixes AUTOGPT-SERVER-7V2
## Summary
- When the copilot model responds with both text content AND a
long-running tool call (e.g., `create_agent`), the streaming code
created two separate consecutive assistant messages — one with text, one
with `tool_calls`. This caused Anthropic's API to reject with
`"unexpected tool_use_id found in tool_result blocks"` because the
`tool_result` couldn't find a matching `tool_use` in the immediately
preceding assistant message.
- Added a defensive merge of consecutive assistant messages in
`to_openai_messages()` (fixes existing corrupt sessions too; see the
sketch below)
- Fixed `_yield_tool_call` to add tool_calls to the existing
current-turn assistant message instead of creating a new one
- Changed `accumulated_tool_calls` assignment to use `extend` to prevent
overwriting tool_calls added by long-running tool flow
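A sketch of the defensive merge, assuming OpenAI-style message dicts
(keys and shapes are assumptions):

```python
def _merge_consecutive_assistant_messages(messages: list[dict]) -> list[dict]:
    # Collapse adjacent assistant messages so a tool_result can always find
    # its tool_use in the immediately preceding assistant message.
    merged: list[dict] = []
    for msg in messages:
        if merged and msg["role"] == "assistant" and merged[-1]["role"] == "assistant":
            prev = merged[-1]
            prev["content"] = (prev.get("content") or "") + (msg.get("content") or "")
            if msg.get("tool_calls"):
                prev.setdefault("tool_calls", []).extend(msg["tool_calls"])
        else:
            merged.append(dict(msg))
    return merged
```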
## Test plan
- [x] All 23 chat feature tests pass (`backend/api/features/chat/`)
- [x] All 44 prompt utility tests pass (`backend/util/prompt_test.py`)
- [x] All pre-commit hooks pass (ruff, isort, black, pyright)
- [ ] Manual test: create an agent via copilot, then ask a follow-up
question — should no longer get 400 error
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Fixes a critical bug where long-running tool calls (like `create_agent`)
caused Anthropic API 400 errors due to split assistant messages. The fix
ensures tool calls are added to the existing assistant message instead
of creating new ones, and adds a defensive merge function to repair any
existing corrupt sessions.
**Key changes:**
- Added `_merge_consecutive_assistant_messages()` to defensively merge
split assistant messages in `to_openai_messages()`
- Modified `_yield_tool_call()` to append tool calls to the current-turn
assistant message instead of creating a new one
- Changed `accumulated_tool_calls` from assignment to `extend` to
preserve tool calls already added by long-running tool flow
**Impact:** Resolves the issue where users received 400 errors after
creating agents via copilot and asking follow-up questions.
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- Safe to merge with minor verification recommended
- The changes are well-targeted and solve a real API compatibility
issue. The logic is sound: searching backwards for the current assistant
message is correct, and using `extend` instead of assignment prevents
overwriting. The defensive merge in `to_openai_messages()` also fixes
existing corrupt sessions. All existing tests pass according to the PR
description.
- No files require special attention - changes are localized and
defensive
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant User
participant StreamAPI as stream_chat_completion
participant Chunks as _stream_chat_chunks
participant ToolCall as _yield_tool_call
participant Session as ChatSession
User->>StreamAPI: Send message
StreamAPI->>Chunks: Stream chat chunks
alt Text + Long-running tool call
Chunks->>StreamAPI: Text delta (content)
StreamAPI->>Session: Append assistant message with content
Chunks->>ToolCall: Tool call detected
Note over ToolCall: OLD: Created new assistant message<br/>NEW: Appends to existing assistant
ToolCall->>Session: Search backwards for current assistant
ToolCall->>Session: Append tool_call to existing message
ToolCall->>Session: Add pending tool result
end
StreamAPI->>StreamAPI: Merge accumulated_tool_calls
Note over StreamAPI: Use extend (not assign)<br/>to preserve existing tool_calls
StreamAPI->>Session: to_openai_messages()
Session->>Session: _merge_consecutive_assistant_messages()
Note over Session: Defensive: Merges any split<br/>assistant messages
Session-->>StreamAPI: Merged messages
StreamAPI->>User: Return response
```
</details>
The SDK CLI truncates large tool results (writing them to disk),
which breaks frontend widget rendering (e.g., find_block's block
list cards). Stash the full MCP tool output before the SDK sees it,
then use the stash in the response adapter so the frontend always
receives the complete JSON for proper widget parsing.
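Conceptually, the stash is a map from tool-call ID to the untruncated
output (hypothetical names; the real adapter plumbing differs):

```python
_full_outputs: dict[str, str] = {}  # tool_use_id -> complete JSON output


def stash_tool_output(tool_use_id: str, output: str) -> None:
    _full_outputs[tool_use_id] = output  # saved before the SDK can truncate it


def resolve_tool_output(tool_use_id: str, sdk_output: str) -> str:
    # Prefer the stashed full JSON so frontend widgets always get parseable data.
    return _full_outputs.pop(tool_use_id, sdk_output)
```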
The Vercel AI SDK frontend renders tool widgets based on tool name
(e.g. "tool-find_block", "tool-run_agent"). The SDK sends tool names
with the MCP prefix (mcp__copilot__find_block) which didn't match
any frontend switch case, causing tool execution to be invisible.
Strip the mcp__copilot__ prefix in the response adapter so tool events
reach the correct frontend widget handlers.
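The rename is a simple prefix strip in the response adapter, along these
lines:

```python
MCP_PREFIX = "mcp__copilot__"


def frontend_tool_name(sdk_tool_name: str) -> str:
    # "mcp__copilot__find_block" -> "find_block", so the frontend's
    # "tool-find_block" switch case matches again.
    return sdk_tool_name.removeprefix(MCP_PREFIX)
```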
Move these tools from fully-blocked to workspace-scoped: they are now
allowed when the file path stays within the SDK working directory
(/tmp/copilot-<session>/) or the tool-results directory
(~/.claude/projects/…/tool-results/). This enables the SDK's built-in
oversized tool result handling and workspace file operations.
- Add _validate_workspace_path() with normpath-based path validation
(sketched below)
- Pass sdk_cwd from service.py into create_security_hooks()
- Add 20 unit tests covering allowed/denied paths, traversal attacks
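A sketch of the validation (directory arguments are assumptions; the
real helper receives `sdk_cwd` from `service.py`):

```python
import os


def _validate_workspace_path(path: str, sdk_cwd: str, tool_results_dir: str) -> bool:
    # normpath collapses ".." segments; the prefix check then confines the
    # result to the session workspace or the tool-results directory.
    resolved = os.path.normpath(os.path.join(sdk_cwd, path))
    roots = (os.path.normpath(sdk_cwd), os.path.normpath(tool_results_dir))
    return any(resolved == r or resolved.startswith(r + os.sep) for r in roots)
```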
The non-SDK path emits step boundaries (StartStep/FinishStep) around
each LLM turn and tool cycle. The SDK adapter was missing these,
causing the frontend to lack visual step framing for tool calls.
Now the SDK adapter emits:
- StreamStartStep after init and before each new LLM turn
- StreamFinishStep after tool results and before final finish
## Problem
The agent builder (LLM) misinterprets the HumanInTheLoop block outputs.
It thinks `approved_data` and `rejected_data` will yield status strings
like "APPROVED" or "REJECTED" instead of understanding that the actual
input data passes through.
This leads to unnecessary complexity - the agent builder adds comparison
blocks to check for status strings that don't exist.
## Solution
Enriched the block docstring and all input/output field descriptions to
make it explicit that:
1. The output is the actual data itself, not a status string
2. The routing is determined by which output pin fires
3. How to use the block correctly (connect downstream blocks to
appropriate output pins)
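For instance, the output field description might read along these lines
(illustrative wording and import path, not the exact text):

```python
from typing import Any

from backend.data.model import SchemaField  # assumed import path

approved_data: Any = SchemaField(
    description=(
        "The original input data, passed through unchanged when the reviewer "
        "approves. This is NOT a status string like 'APPROVED'; the approval "
        "outcome is signaled by which output pin fires."
    )
)
```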
## Changes
- Updated block docstring with clear "How it works" and "Example usage"
sections
- Enhanced `data` input description to explain data flow
- Enhanced `name` input description for reviewer context
- Enhanced `approved_data` output to explicitly state it's NOT a status
string
- Enhanced `rejected_data` output to explicitly state it's NOT a status
string
- Enhanced `review_message` output for clarity
## Testing
Documentation-only change to schema descriptions. No functional changes.
Fixes SECRT-1930
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Enhanced documentation for the `HumanInTheLoopBlock` to clarify how
output pins work. The key improvement explicitly states that output pins
(`approved_data` and `rejected_data`) yield the actual input data, not
status strings like "APPROVED" or "REJECTED". This prevents the agent
builder (LLM) from misinterpreting the block's behavior and adding
unnecessary comparison blocks.
**Key changes:**
- Added "How it works" and "Example usage" sections to the block
docstring
- Clarified that routing is determined by which output pin fires, not by
comparing output values
- Enhanced all input/output field descriptions with explicit data flow
explanations
- Emphasized that downstream blocks should be connected to the
appropriate output pin based on desired workflow path
This is a documentation-only change with no functional modifications to
the code logic.
</details>
<details><summary><h3>Confidence Score: 5/5</h3></summary>
- This PR is safe to merge with no risk
- Documentation-only change that accurately reflects the existing code
behavior. No functional changes, no runtime impact, and the enhanced
descriptions correctly explain how the block outputs work based on
verification of the implementation code.
- No files require special attention
</details>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
## Changes 🏗️
<img width="800" height="621" alt="Screenshot 2026-02-11 at 19 32 39"
src="https://github.com/user-attachments/assets/e97be1a7-972e-4ae0-8dfa-6ade63cf287b"
/>
When the backend API errors, prevent the error from leaking into the
chat stream and handle it gracefully via a toast instead.
## Checklist 📋
### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run the app locally and verify the changes
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
This PR fixes an issue where backend API stream errors were leaking into
the chat prompt instead of being handled gracefully. The fix involves
both backend and frontend changes to ensure error events conform to the
AI SDK's strict schema.
**Key Changes:**
- **Backend (`response_model.py`)**: Added custom `to_sse()` method for
`StreamError` that only emits `type` and `errorText` fields, stripping
extra fields like `code` and `details` that cause AI SDK validation
failures
- **Backend (`prompt.py`)**: Added validation step after context
compression to remove orphaned tool responses without matching tool
calls, preventing "unexpected tool_use_id" API errors
- **Frontend (`route.ts`)**: Implemented SSE stream normalization with
`normalizeSSEStream()` and `normalizeSSEEvent()` functions to strip
non-conforming fields from error events before they reach the AI SDK
- **Frontend (`ChatMessagesContainer.tsx`)**: Added toast notifications
for errors and improved error display UI with deduplication logic
The changes ensure a clean separation between internal error metadata
(useful for logging/debugging) and the strict schema required by the AI
SDK on the frontend.
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- This PR is safe to merge with low risk
- The changes are well-structured and address a specific bug with proper
error handling. The dual-layer approach (backend filtering in `to_sse()`
+ frontend normalization) provides defense-in-depth. However, the lack
of automated tests for the new error normalization logic and the
potential for edge cases in SSE parsing prevent a perfect score.
- Pay close attention to
`autogpt_platform/frontend/src/app/api/chat/sessions/[sessionId]/stream/route.ts`
- the SSE normalization logic should be tested with various error
scenarios
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant User
participant Frontend as ChatMessagesContainer
participant Proxy as /api/chat/.../stream
participant Backend as Backend API
participant AISDK as AI SDK
User->>Frontend: Send message
Frontend->>Proxy: POST with message
Proxy->>Backend: Forward request with auth
Backend->>Backend: Process message
alt Success Path
Backend->>Proxy: SSE stream (text-delta, etc.)
Proxy->>Proxy: normalizeSSEStream (pass through)
Proxy->>AISDK: Forward SSE events
AISDK->>Frontend: Update messages
Frontend->>User: Display response
else Error Path
Backend->>Backend: StreamError.to_sse()
Note over Backend: Only emit {type, errorText}
Backend->>Proxy: SSE error event
Proxy->>Proxy: normalizeSSEEvent()
Note over Proxy: Strip extra fields (code, details)
Proxy->>AISDK: {type: "error", errorText: "..."}
AISDK->>Frontend: error state updated
Frontend->>Frontend: Toast notification (deduplicated)
Frontend->>User: Show error UI + toast
end
```
</details>
---------
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Otto-AGPT <otto@agpt.co>
## Summary
Blocks marked `disabled=True` (like BlockInstallationBlock) were not
being checked during graph validation, allowing them to be used via
direct API calls despite being hidden from the UI.
This adds a security check in `_validate_graph_get_errors()` to reject
any graph containing disabled blocks.
## Security Advisory
GHSA-4crw-9p35-9x54
## Linear
SECRT-1927
## Changes
- Added `block.disabled` check in graph validation (6 lines)
## Testing
- Graphs with disabled blocks → rejected with clear error message
- Graphs with valid blocks → unchanged behavior
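A sketch of where the check slots into validation (surrounding helper
names are assumptions):

```python
def _validate_graph_get_errors(graph) -> list[str]:
    errors: list[str] = []
    for node in graph.nodes:
        block = get_block(node.block_id)
        if block is None:
            errors.append(f"Unknown block: {node.block_id}")
            continue
        if block.disabled:
            # Disabled blocks are hidden in the UI; reject them here so
            # direct API calls can't execute them either.
            errors.append(f"Block '{block.name}' is disabled and cannot be used")
    return errors
```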
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Adds critical security validation to prevent execution of disabled
blocks (like `BlockInstallationBlock`) via direct API calls. The fix
validates that `block.disabled` is `False` during graph validation in
`_validate_graph_get_errors()` on line 747-750, ensuring disabled blocks
are rejected before graph creation or execution. This closes a
vulnerability where blocks marked disabled in the UI could still be used
through API endpoints.
</details>
<details><summary><h3>Confidence Score: 5/5</h3></summary>
- This PR is safe to merge and addresses a critical security
vulnerability
- The fix is minimal (6 lines), correctly placed in the validation flow,
includes clear security context (GHSA reference), and follows existing
validation patterns. The check is positioned after block existence
validation and before input validation, ensuring disabled blocks are
caught early in both graph creation and execution paths.
- No files require special attention
</details>
---------
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Routes.py already publishes a StreamStart before calling the service.
The SDK path filters the duplicate internally, but the non-SDK path
did not, causing two StreamStart events to reach the frontend.
CodeQL doesn't recognize re.sub as a path sanitizer. Switch to the
os.path.normpath + startswith prefix check pattern that CodeQL's
taint model explicitly recognizes as breaking the taint chain.
- Use session-specific temp dir (/tmp/copilot-{session_id}) as SDK cwd
to prevent concurrent sessions from deleting each other's tool-result
files during cleanup
- Move _cleanup_sdk_tool_results() to outer finally block so it runs
even when the outer except Exception fires
- Clean up the temp cwd directory after each session
- Remove unnecessary inner try/finally nesting
- Fix message accumulation bug: reset has_appended_assistant when
creating new post-tool assistant message to prevent lost text deltas
- Fix hardcoded model in anthropic_fallback.py: use config.model instead
of hardcoded "claude-sonnet-4-20250514"
- Fix _SDK_TOOL_RESULTS_DIR using hardcoded /root/ path: use expanduser
- Remove unused create_strict_security_hooks (~75 lines)
- Remove unused create_heartbeat/create_usage from response adapter
- Remove unused RAW_TOOL_NAMES from tool_adapter
- Extract _MAX_TOOL_ITERATIONS constant from magic number
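A sketch of the session-scoped workspace lifecycle
(`_cleanup_sdk_tool_results` is the helper named above; the rest is
assumed plumbing):

```python
import os
import shutil


async def run_copilot_session(session_id: str) -> None:
    # Session-specific cwd so concurrent sessions can't delete each
    # other's tool-result files during cleanup.
    cwd = os.path.join("/tmp", f"copilot-{session_id}")
    os.makedirs(cwd, exist_ok=True)
    try:
        ...  # drive the SDK client with cwd=cwd
    finally:
        # Outer finally: runs even when the outer `except Exception` fires.
        _cleanup_sdk_tool_results(session_id)
        shutil.rmtree(cwd, ignore_errors=True)  # drop the per-session temp cwd
```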
- Remove broken --resume/session file approach (CLI v2.1.38 can't load
>2 message session files) and delete session_file.py + tests
- Embed prior conversation turns as <conversation_history> context in
the user message for multi-turn memory
- Add context compaction using shared compress_context() from prompt.py
with LLM summarization + truncation fallback for long conversations
- Reuse _build_system_prompt and _generate_session_title from parent
service.py instead of duplicating (gains Langfuse prompt support)
- Add has_conversation_history param to _build_system_prompt to avoid
greeting on multi-turn conversations
- Fix _SDK_TOOL_RESULTS_GLOB from hardcoded /root/ to expanduser ~/
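The history embedding might look like this (a sketch over assumed
message dicts):

```python
def build_user_message(history: list[dict], new_message: str) -> str:
    # The CLI can't reload >2-message session files, so prior turns ride
    # along inside the new user message instead.
    turns = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    return (
        f"<conversation_history>\n{turns}\n</conversation_history>\n\n"
        f"{new_message}"
    )
```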
The CLI resolves symlinks when computing its project directory (e.g.
/tmp -> /private/tmp on macOS), so our session file writes must use
the resolved path to match. Also adds cwd to ClaudeAgentOptions and
debug logging for SDK messages.
## Summary
Enables Anthropic's extended thinking feature for Claude models in
CoPilot via OpenRouter. This keeps the model's chain-of-thought
reasoning internal rather than outputting it to users.
## Problem
The CoPilot prompt was designed for a thinking agent (with
`<internal_reasoning>` tags), but extended thinking wasn't enabled on
the API side. This caused the model to output its reasoning as regular
text, leaking internal analysis to users.
## Solution
Added thinking configuration to the OpenRouter `extra_body` for
Anthropic models:
```python
extra_body["provider"] = {
    "anthropic": {
        "thinking": {
            "type": "enabled",
            "budget_tokens": config.thinking_budget_tokens,
        }
    }
}
```
## Configuration
New settings in `ChatConfig`:
| Setting | Default | Description |
|---------|---------|-------------|
| `thinking_enabled` | `True` | Enable extended thinking for Claude models |
| `thinking_budget_tokens` | `10000` | Token budget for thinking (1000-100000) |
## Changes
- `config.py`: Added `thinking_enabled` and `thinking_budget_tokens`
settings
- `service.py`: Added thinking config to all 3 places where `extra_body`
is built for LLM calls
## Testing
- Verify CoPilot responses no longer include internal reasoning text
- Check that Claude's extended thinking is working (should see thinking
tokens in usage)
- Confirm non-Anthropic models are unaffected
## Related
Discussion:
https://discord.com/channels/1126875755960336515/1126875756925046928/1470779843552612607
---------
Co-authored-by: Swifty <craigswift13@gmail.com>
Replace broken AsyncIterable approach (CLI rejects assistant-type stdin
messages) with JSONL session files written to the CLI's storage directory.
This enables --resume to load full user+assistant context with turn-level
compaction support for long conversations.
Replace duck typing (class name checks, getattr) with isinstance() using
SDK-exported dataclasses. Replace file-based --resume with AsyncIterable
message injection for conversation history, eliminating disk I/O. Add 15
unit tests for the response adapter.
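A sketch of the isinstance()-based dispatch using types the SDK exports
(the yielded event shapes are assumptions):

```python
from claude_agent_sdk import AssistantMessage, ResultMessage, TextBlock


def convert_message(msg):
    # Typed dispatch instead of class-name string checks and getattr().
    if isinstance(msg, AssistantMessage):
        for block in msg.content:
            if isinstance(block, TextBlock):
                yield {"type": "text-delta", "delta": block.text}
    elif isinstance(msg, ResultMessage):
        yield {"type": "usage", "usage": msg.usage}
```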
Bumps the production-dependencies group in /autogpt_platform/backend
with 2 updates: [fastapi](https://github.com/fastapi/fastapi) and
[langfuse](https://github.com/langfuse/langfuse).
Updates `fastapi` from 0.128.5 to 0.128.6
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/fastapi/fastapi/releases">fastapi's
releases</a>.</em></p>
<blockquote>
<h2>0.128.6</h2>
<h3>Fixes</h3>
<ul>
<li>🐛 Fix <code>on_startup</code> and <code>on_shutdown</code>
parameters of <code>APIRouter</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14873">#14873</a>
by <a
href="https://github.com/YuriiMotov"><code>@YuriiMotov</code></a>.</li>
</ul>
<h3>Translations</h3>
<ul>
<li>🌐 Update translations for zh (update-outdated). PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14843">#14843</a>
by <a
href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li>✅ Fix parameterized tests with snapshots. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14875">#14875</a>
by <a
href="https://github.com/YuriiMotov"><code>@YuriiMotov</code></a>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="fbca586c1d"><code>fbca586</code></a>
📝 Update release notes</li>
<li><a
href="4e879799dd"><code>4e87979</code></a>
📝 Update release notes</li>
<li><a
href="0a4033aeee"><code>0a4033a</code></a>
🔖 Release version 0.128.6</li>
<li><a
href="ed2512a5ec"><code>ed2512a</code></a>
🐛 Fix <code>on_startup</code> and <code>on_shutdown</code> parameters of
<code>APIRouter</code> (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14873">#14873</a>)</li>
<li><a
href="0c0f6332e2"><code>0c0f633</code></a>
📝 Update release notes</li>
<li><a
href="227cb85a03"><code>227cb85</code></a>
✅ Fix parameterized tests with snapshots (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14875">#14875</a>)</li>
<li><a
href="cd31576d57"><code>cd31576</code></a>
📝 Update release notes</li>
<li><a
href="376e108580"><code>376e108</code></a>
🌐 Update translations for zh (update-outdated) (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14843">#14843</a>)</li>
<li>See full diff in <a
href="https://github.com/fastapi/fastapi/compare/0.128.5...0.128.6">compare
view</a></li>
</ul>
</details>
<br />
Updates `langfuse` from 3.13.0 to 3.14.1
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/langfuse/langfuse/commits">compare
view</a></li>
</ul>
</details>
<br />
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Otto <otto@agpt.co>
- Skip BLOCKED_TOOLS check for tools with mcp__copilot__ prefix since they
are already sandboxed by tool_adapter (fixes Read tool being blocked)
- Fall back to session.messages for title generation when message=None
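The exemption is a prefix check ahead of the denylist lookup, roughly
(the denylist contents shown are illustrative):

```python
BLOCKED_TOOLS = {"Read", "Write", "Bash"}  # illustrative denylist


def is_blocked(tool_name: str) -> bool:
    if tool_name.startswith("mcp__copilot__"):
        # mcp__copilot__ tools are already sandboxed by tool_adapter, so the
        # denylist no longer applies (this is what un-blocks the Read tool).
        return False
    return tool_name in BLOCKED_TOOLS
```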
### Changes 🏗️
- Added AI SDK integration for chat streaming with proper message
handling
- Implemented custom to_sse method in StreamToolOutputAvailable to
exclude non-spec fields
- Modified stream_chat_completion to reuse message IDs for tool call
continuations
- Created new Copilot 2.0 UI with AI SDK React components
- Added streamdown and related packages for markdown rendering
- Built reusable conversation and message components for the chat
interface
- Added support for tool output display in the chat UI
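The custom serializer might look like this (pydantic v2 sketch; field
names are assumptions):

```python
import json

from pydantic import BaseModel


class StreamToolOutputAvailable(BaseModel):
    type: str = "tool-output-available"
    toolCallId: str
    output: str

    def to_sse(self) -> str:
        # Emit only spec fields; extra fields fail the AI SDK's strict
        # event schema validation on the frontend.
        payload = self.model_dump(include={"type", "toolCallId", "output"})
        return f"data: {json.dumps(payload)}\n\n"
```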
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Start a new chat session and verify streaming works correctly
- [x] Test tool calls and verify they display properly in the UI
- [x] Verify message continuations don't create duplicate messages
- [x] Test markdown rendering with code blocks and other formatting
- [x] Verify the UI is responsive and scrolls correctly
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
---------
Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Ubbe <hi@ubbe.dev>
The Claude Agent SDK saves tool results exceeding its token limit to
files and instructs the agent to read them back with a Read tool. Our
MCP server didn't have this tool, breaking the agent on large results
like run_block output (117K+ chars).
Changes:
- Add a Read tool to the MCP server (restricted to /root/.claude/)
- Register it in COPILOT_TOOL_NAMES so the SDK can use it
- Add safety-net truncation at 500K chars for extreme cases
- Clean up SDK tool-result files after each client session
When run_ai_generation() or event_generator() encountered errors, they
published only StreamFinish without a preceding StreamError. The
frontend treats finish-without-error as normal completion, leaving the
user with an apparently stuck/empty response that requires a page
refresh.
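The implied fix, sketched with generic stand-ins for the real publisher
and event models:

```python
async def event_generator(run, publish):
    try:
        async for event in run():
            await publish(event)
    except Exception as exc:
        # Surface the failure first so the frontend shows an error state
        # instead of treating the finish as a normal completion.
        await publish({"type": "error", "errorText": str(exc)})
    finally:
        await publish({"type": "finish"})  # StreamFinish always closes the stream
```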