Compare commits

..

81 Commits

Author SHA1 Message Date
Zamil Majdy
32e9dda30d fix(chat/sdk): resolve relative paths in security hooks and unify workspace access
The security hook's path validation blocked SDK Read/Write tools because
it didn't resolve relative paths against sdk_cwd. Since the SDK sets cwd,
Claude naturally uses relative paths like "test.txt" which failed the
absolute path prefix check. Now relative paths are joined with sdk_cwd
before validation, and denial messages include the allowed workspace path.

Also clarifies the workspace model: SDK Read/Write + bash_exec share the
same ephemeral session directory, while workspace_file tools provide
persistent cloud storage across sessions.
2026-02-13 10:40:41 +04:00
Zamil Majdy
cb45e7957b feat: fix openapi.json 2026-02-12 23:39:47 +04:00
Zamil Majdy
f1d02fb8f3 fix(chat/sdk): move cwd setup inside try block to ensure cleanup
Move _make_sdk_cwd() and os.makedirs() inside the try block so
the finally cleanup always runs, preventing /tmp dir leaks if
setup fails.
2026-02-12 23:32:26 +04:00
Zamil Majdy
47de6b6420 feat(chat): add check_operation_status tool for long-running ops
Lets the CoPilot agent query whether a create_agent/edit_agent
operation is still running, completed, or failed. Accepts operation_id
or task_id from a previous operation_started response and looks up
the task status in Redis via stream_registry.
2026-02-12 23:30:51 +04:00
Zamil Majdy
62cd2eea89 fix(chat/sandbox): use --symlink for compat paths on Debian 13
On Debian 13 (bookworm+), /bin, /lib, /sbin, /lib64 are symlinks to
/usr/*.  bwrap --ro-bind cannot create a symlink as a mount target
inside the sandbox, causing "execvp: No such file or directory" because
the ELF dynamic linker at /lib64/ld-linux-x86-64.so.2 is unreachable.

Detect symlinks at runtime with os.path.islink() and use bwrap
--symlink instead of --ro-bind.  Falls back to --ro-bind on older
distros where these are real directories.
2026-02-12 22:55:29 +04:00
Zamil Majdy
ae61ec692e Merge branch 'dev' into feat/copitlot-claude-code 2026-02-12 22:27:50 +04:00
Zamil Majdy
9296bd8736 fix(chat/sandbox): fix bwrap inside Docker containers
Three fixes for bubblewrap sandbox:
- Fix --tmpdir (invalid) to --tmpfs (correct bwrap option)
- Add --unshare-user so bwrap can create namespaces inside
  unprivileged Docker containers (no CAP_SYS_ADMIN needed)
- Reorder mounts: --tmpfs /tmp first, then --bind workspace on top,
  so the workspace directory is visible through the fresh tmpfs
2026-02-12 22:22:39 +04:00
Zamil Majdy
308113c03d fix(chat/sdk): remove obsolete Bash allowlist tests
The SDK built-in Bash tool is now unconditionally blocked (bash_exec
MCP tool with bubblewrap is used instead). Remove tests that expected
safe Bash commands to be allowed and replace with a single test that
verifies Bash is always denied.
2026-02-12 22:19:30 +04:00
Zamil Majdy
51abf13254 feat(chat): use LaunchDarkly flag for copilot SDK rollout
Replace static CHAT_USE_CLAUDE_AGENT_SDK env var with a LaunchDarkly
feature flag (copilot-sdk) for per-user rollout control. The env var
value serves as the default when LD is not configured or the flag
doesn't exist yet.
2026-02-12 22:02:28 +04:00
Zamil Majdy
54b03d3a29 fix(frontend): remove python_exec from openapi.json ResponseType enum
The python_exec tool was removed from the backend but the generated
openapi.json still referenced the enum value.
2026-02-12 21:55:25 +04:00
Zamil Majdy
239dff5ebd feat(chat/sandbox): add resource limits to bubblewrap sandbox
Add ulimit-based resource caps inside the bwrap sandbox to prevent
fork bombs and resource exhaustion:
- max 64 processes (stops fork bombs)
- 512 MB virtual memory
- 50 MB max file size
- 256 open file descriptors

Limits are applied via `sh -c 'ulimit ...; exec "$@"'` wrapper inside
the sandbox, so they're inherited by all child processes.
2026-02-12 21:47:49 +04:00
Zamil Majdy
1dd53db21c feat(chat/sandbox): bubblewrap sandbox for bash_exec, remove python_exec
- Replace `--ro-bind / /` with whitelist-only filesystem: only /usr, /etc,
  /bin, /lib, /sbin mounted read-only. /app, /root, /home, /opt, /var are
  completely invisible inside the sandbox.
- Add `--clearenv` to wipe all inherited env vars (API keys, DB passwords).
  Only safe vars (PATH, HOME=workspace, LANG) are explicitly set.
- Remove python_exec tool — bash_exec can run `python3 -c` or heredocs with
  identical bubblewrap protection, reducing attack surface.
- Remove all fallback security code (import hooks, blocked modules, network
  command lists). Tools now hard-require bubblewrap — disabled on platforms
  without bwrap.
- Clean up security_hooks.py: remove ~200 lines of dead bash validation code,
  add Bash to BLOCKED_TOOLS as defence-in-depth.
- Wire up long-running tool callback in SDK service for create_agent/edit_agent
  delegation to Redis Streams background infrastructure.
2026-02-12 21:44:40 +04:00
Zamil Majdy
06c16ee2fe fix(chat/sdk): non-blocking long-running tools, tighten security
- Long-running tools (create_agent) now run in background and return
  immediately with an operation_id. Add check_operation MCP tool for
  polling results. Prevents 3+ min blocking and survives page refresh.
- Fix CodeQL path traversal alert: use normpath+startswith sanitizer
  in _make_sdk_cwd() instead of assert.
- Tighten _read_file_handler: restrict from ~/.claude/ to only
  ~/.claude/projects/**/tool-results/ (sentry review feedback).
- Fix bash redirect bypass: strip quoted strings before checking for
  unquoted > operator, catches `echo hello>file` (sentry review).
2026-02-12 20:39:33 +04:00
Zamil Majdy
8d2a649ee5 refactor(chat/sdk): remove Langfuse tracing — OpenRouter handles observability
Delete tracing.py (~408 lines) and all TracedSession/hook references from the
SDK path. OpenRouter already provides token usage, cost tracking, and request
logging, making manual Langfuse integration redundant. This also fixes the
broken 'Langfuse' object has no attribute 'trace' warning on every request.
2026-02-12 20:24:27 +04:00
Zamil Majdy
9589474709 Merge branch 'dev' into feat/copitlot-claude-code 2026-02-12 19:40:32 +04:00
Zamil Majdy
749a78723a refactor(chat/sdk): deduplicate code and remove anthropic fallback
- Extract shared `make_session_path()` into sandbox.py (single source of
  truth for workspace path sanitization), replace duplicate in service.py
- Delete anthropic_fallback.py (~360 lines) — redundant third code path;
  routes.py already falls back to non-SDK service
- Remove dead `traced_session()`, `get_tool_definitions()`,
  `get_tool_handlers()`, `_current_tool_call_id` ContextVar
- Fix hardcoded model in tracing — pass actual resolved model
- Fix inconsistent model name splitting in anthropic fallback
2026-02-12 19:26:29 +04:00
Zamil Majdy
bec2e1ddee fix(chat/tools): sanitize session_id in sandbox workspace path
Align with SDK's _make_sdk_cwd() to prevent path traversal and ensure
python_exec/bash_exec share the same workspace as SDK file tools.
2026-02-12 19:08:47 +04:00
Zamil Majdy
ec1ab06e0d chore(chat): bump default max_subtasks from 3 to 10 2026-02-12 19:07:42 +04:00
Zamil Majdy
f31cb49557 feat(chat/tools): add sandboxed python_exec, bash_exec, web_fetch tools and enable Task
- Add sandbox.py with network-isolated execution via unshare --net (Linux)
  and import/command blocklist fallback (macOS dev)
- Add python_exec tool: runs Python in subprocess with no network, workspace-scoped
- Add bash_exec tool: full Bash scripting with no network, workspace-scoped
- Add web_fetch tool: SSRF-protected URL fetching via backend Requests utility
- Remove SDK built-in Bash from allowlist (replaced by sandboxed bash_exec)
- Enable SDK built-in Task (sub-agents) with per-session rate limit (default 3)
- Add claude_agent_max_subtasks config field
2026-02-12 19:07:19 +04:00
Zamil Majdy
fd28c386f4 Merge branch 'dev' into feat/copitlot-claude-code 2026-02-12 18:50:11 +04:00
Zamil Majdy
3bea584659 feat(chat/sdk): route SDK through OpenRouter with observability (#12084)
## Summary
- Routes Claude Agent SDK API calls through OpenRouter via
`ANTHROPIC_BASE_URL` / `ANTHROPIC_AUTH_TOKEN` env vars, enabling
per-call token and cost tracking on the OpenRouter dashboard
- Adds `sdk_model` and `sdk_max_budget_usd` config fields for
SDK-specific model selection and budget control
- Emits `StreamUsage` from SDK `ResultMessage` so the frontend receives
token counts, and persists usage to `session.usage`
- Fixes Langfuse tracing to use the configured model name instead of a
hardcoded default
- Updates Anthropic fallback to use `config.api_key` / `config.base_url`
(OpenRouter routing) instead of raw `ANTHROPIC_API_KEY` env var

## Test plan
- [ ] Deploy and send a CoPilot message — verify the API call appears on
the OpenRouter dashboard
- [ ] Check Langfuse trace shows correct model name (e.g.
`claude-opus-4.6` not hardcoded `claude-sonnet-4-20250514`)
- [ ] Verify frontend receives `StreamUsage` with `promptTokens` /
`completionTokens` values
- [ ] Set `CHAT_SDK_MAX_BUDGET_USD` and verify budget is respected
- [ ] Test fallback path (without `claude-agent-sdk` installed) still
works via OpenRouter

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Routes Claude Agent SDK API calls through OpenRouter for enhanced
observability and cost tracking. The PR enables per-call token tracking
on the OpenRouter dashboard by configuring the SDK to use
`ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` environment variables
derived from the chat configuration.

Key changes:
- Added `sdk_model` and `sdk_max_budget_usd` configuration fields for
SDK-specific control
- Implemented automatic model name resolution that strips OpenRouter
provider prefixes
- Updated SDK client initialization to route through OpenRouter with
proper environment variables
- Emits `StreamUsage` events from SDK `ResultMessage` for frontend token
visibility
- Persists usage data to `session.usage` for historical tracking
- Fixed Langfuse tracing to use the configured model name instead of
hardcoded defaults
- Updated fallback path to use OpenRouter routing instead of direct
Anthropic API
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- Safe to merge with minor observations - the implementation is solid
and the changes are well-structured
- The code quality is high with proper error handling, clear separation
of concerns, and good defensive coding practices. The changes integrate
cleanly with existing patterns. Minor observations include missing
validation for sdk_max_budget_usd and a potential edge case in model
name resolution, but these don't block merging
- No files require special attention - all changes follow existing
patterns and maintain consistency
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant Frontend
    participant Backend
    participant SDK as Claude Agent SDK
    participant OpenRouter
    participant Anthropic
    participant Langfuse

    Frontend->>Backend: POST /chat/completions
    Backend->>Backend: Load config (api_key, base_url)
    Backend->>Backend: Resolve SDK model (strip OpenRouter prefix)
    Backend->>Backend: Build SDK env vars (ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN)
    
    Backend->>Langfuse: Initialize TracedSession with model name
    Backend->>SDK: ClaudeSDKClient(model, env, max_budget_usd)
    
    SDK->>SDK: Use ANTHROPIC_BASE_URL from env
    SDK->>OpenRouter: POST /messages (via configured base_url)
    OpenRouter->>Anthropic: Forward request with routing
    Anthropic-->>OpenRouter: Stream response chunks
    OpenRouter-->>SDK: Stream response with usage data
    
    loop For each SDK message
        SDK-->>Backend: AssistantMessage/UserMessage/ResultMessage
        Backend->>Langfuse: log_sdk_message()
        Backend->>Backend: SDKResponseAdapter.convert_message()
        Backend->>Backend: Extract usage from ResultMessage
        Backend->>Backend: Persist Usage to session.usage
        Backend-->>Frontend: StreamUsage(promptTokens, completionTokens)
        Backend-->>Frontend: StreamTextDelta/StreamToolInput/etc
    end
    
    Backend->>Langfuse: Log final generation with model name
    Backend->>Backend: Save session with usage data
    Backend-->>Frontend: StreamFinish
```
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-12 21:47:39 +07:00
Zamil Majdy
d7f7a2747f fix(backend/chat): Atomic message append to prevent race condition
Replace the read-modify-write pattern in stream_chat_post with an
atomic append_and_save_message helper that acquires the session lock
before re-fetching and appending. This prevents message loss when
concurrent requests modify the same session.
2026-02-12 09:10:43 +04:00
Zamil Majdy
68849e197c format 2026-02-12 08:26:26 +04:00
Zamil Majdy
211478bb29 Revert "style: run ruff format and isort"
This reverts commit 40b58807ab.
2026-02-12 08:25:22 +04:00
Zamil Majdy
0e88dd15b2 feat(chat): add hook-based tracing integration for Claude Agent SDK
- Add create_tracing_hooks() for fine-grained tool timing
- Add merge_hooks() utility to combine security + tracing hooks
- Captures precise pre/post timing for tool executions
- Tracks tool failures via PostToolUseFailure hook
- Integrates seamlessly with existing security hooks
2026-02-12 03:35:16 +00:00
Zamil Majdy
7f3c227f0a feat(chat): add modular Langfuse tracing for Claude Agent SDK
- Create tracing.py with TracedSession context manager
- Automatically trace user messages, SDK messages, and results
- Capture tool calls with input/output and timing
- Log usage and cost from SDK ResultMessage
- No-op when Langfuse not configured (zero overhead)
- Clean integration into service.py via context manager
2026-02-12 03:33:37 +00:00
Zamil Majdy
40b58807ab style: run ruff format and isort 2026-02-12 03:25:19 +00:00
Zamil Majdy
d0e2e6f013 security(service): strengthen path validation for SDK cleanup
- Add empty check after session_id sanitization
- Add assertion for defense-in-depth
- Add explicit '..' traversal check in cleanup
- Replace glob with os.listdir to avoid glob injection
- Add validation that project_dir stays under ~/.claude/projects
- Add warning logs for rejected paths

Addresses CodeQL alert about uncontrolled data in path expression
2026-02-12 03:07:08 +00:00
Zamil Majdy
efdc8d73cc fix(security_hooks): use json.dumps for pattern matching and log warning
- Use json.dumps instead of str() for more predictable pattern matching
- Log warning when SDK not available and security hooks are disabled

Addresses CodeRabbit review feedback
2026-02-12 02:55:04 +00:00
Zamil Majdy
a34810d8a2 revert: remove Bash command extraction from GenericTool
Keep it simple - just show 'Bash completed' instead of special
handling to extract command names like 'jq completed'
2026-02-12 02:53:37 +00:00
Zamil Majdy
038b7d5841 feat(copilot): show specific command name for Bash tool
- Extract command name (jq, grep, etc.) from Bash tool input
- Display 'jq completed' instead of 'Bash completed'
- Add ripgrep and tree to Dockerfile (match ALLOWED_BASH_COMMANDS)
2026-02-12 02:48:19 +00:00
Zamil Majdy
cac93b0cc9 fix(chat): increase SDK buffer limit and add jq
- Add sdk_max_buffer_size config option (default 10MB, was 1MB)
- Pass max_buffer_size to ClaudeAgentOptions to prevent crashes on large tool outputs
- Install jq in Dockerfile for JSON processing capabilities

Fixes AUTOGPT-SERVER-7V2
2026-02-12 02:41:12 +00:00
Zamil Majdy
2025aaf5f2 fix(backend/chat): Preserve full MCP tool output for frontend widgets
The SDK CLI truncates large tool results (writing them to disk),
which breaks frontend widget rendering (e.g., find_block's block
list cards). Stash the full MCP tool output before the SDK sees it,
then use the stash in the response adapter so the frontend always
receives the complete JSON for proper widget parsing.
2026-02-11 23:13:42 +04:00
Zamil Majdy
ae9bce3bae feat(backend/chat): Add sandboxed Bash and notify SDK of restrictions
- Allow Bash tool with command allowlist (jq, grep, head, tail, etc.)
  validated via shlex.split for proper quote handling
- Add workspace path validation for Bash absolute paths
- Add SDK built-in tools (Read/Write/Edit/Glob/Grep/Bash) to allowed_tools
- Append Bash restrictions to system prompt (SDK doesn't know our allowlist)
- Add default_factory to BlockInfoSummary schema fields
- Add 12 Bash sandbox tests covering safe/dangerous commands, substitution,
  redirection, /dev/ access, path escaping
2026-02-11 22:35:39 +04:00
Zamil Majdy
3107d889fc feat(frontend/copilot): Add generic tool widget for unrecognized tools
SDK built-in tools (Read, Glob, Grep, etc.) have no dedicated frontend
widget, so tool calls silently disappeared. Add a GenericTool component
that shows a spinning gear + "Running {tool}…" for any tool-* part
type that doesn't match a known case.
2026-02-11 22:08:03 +04:00
Zamil Majdy
f174fb6303 fix(backend/chat): Strip MCP prefix from SDK tool names for frontend rendering
The Vercel AI SDK frontend renders tool widgets based on tool name
(e.g. "tool-find_block", "tool-run_agent"). The SDK sends tool names
with the MCP prefix (mcp__copilot__find_block) which didn't match
any frontend switch case, causing tool execution to be invisible.

Strip the mcp__copilot__ prefix in the response adapter so tool events
reach the correct frontend widget handlers.
2026-02-11 22:01:59 +04:00
Zamil Majdy
920a4c5f15 feat(backend/chat): Allow Read/Write/Edit/Glob/Grep in SDK within workspace
Move these tools from fully-blocked to workspace-scoped: they are now
allowed when the file path stays within the SDK working directory
(/tmp/copilot-<session>/) or the tool-results directory
(~/.claude/projects/…/tool-results/). This enables the SDK's built-in
oversized tool result handling and workspace file operations.

- Add _validate_workspace_path() with normpath-based path validation
- Pass sdk_cwd from service.py into create_security_hooks()
- Add 20 unit tests covering allowed/denied paths, traversal attacks
2026-02-11 20:39:33 +04:00
Zamil Majdy
e95fadbb86 Merge branch 'dev' into feat/copitlot-claude-code 2026-02-11 20:23:56 +04:00
Zamil Majdy
b14b3803ad feat(backend/chat): Add StreamStartStep/StreamFinishStep to SDK adapter
The non-SDK path emits step boundaries (StartStep/FinishStep) around
each LLM turn and tool cycle. The SDK adapter was missing these,
causing the frontend to lack visual step framing for tool calls.

Now the SDK adapter emits:
- StreamStartStep after init and before each new LLM turn
- StreamFinishStep after tool results and before final finish
2026-02-11 20:18:27 +04:00
Zamil Majdy
82c483d6c8 Merge branch 'dev' into feat/copitlot-claude-code 2026-02-11 07:17:38 +04:00
Zamil Majdy
7cffa1895f fix(backend/chat): Filter duplicate StreamStart from non-SDK path
Routes.py already publishes a StreamStart before calling the service.
The SDK path filters the duplicate internally, but the non-SDK path
did not, causing two StreamStart events to reach the frontend.
2026-02-11 06:52:47 +04:00
Zamil Majdy
9791bdd724 fix(backend/chat): Use normpath+startswith pattern for CodeQL path sanitization
CodeQL doesn't recognize re.sub as a path sanitizer. Switch to the
os.path.normpath + startswith prefix check pattern that CodeQL's
taint model explicitly recognizes as breaking the taint chain.
2026-02-11 06:45:12 +04:00
Zamil Majdy
750a674c78 fix lock 2026-02-11 06:39:03 +04:00
Zamil Majdy
960c7980a3 fix(backend/chat): Use named helper for session_id sanitization to satisfy CodeQL
Replace inline comprehension with _sanitize_session_id() using re.sub
so CodeQL recognizes the path-traversal sanitization barrier.
2026-02-11 06:32:16 +04:00
Zamil Majdy
e85d437bb2 fix(backend/chat): Sanitize session_id in SDK cwd path to prevent path traversal 2026-02-11 06:26:48 +04:00
Zamil Majdy
44f9536bd6 fix lock 2026-02-11 06:24:41 +04:00
Zamil Majdy
1c1085a227 Merge remote-tracking branch 'origin/dev' into feat/copitlot-claude-code
# Conflicts:
#	autogpt_platform/backend/backend/api/features/chat/config.py
#	autogpt_platform/backend/poetry.lock
2026-02-11 05:30:46 +04:00
Zamil Majdy
d7ef70469e fix(backend/chat): Fix cleanup race condition and move to outer finally
- Use session-specific temp dir (/tmp/copilot-{session_id}) as SDK cwd
  to prevent concurrent sessions from deleting each other's tool-result
  files during cleanup
- Move _cleanup_sdk_tool_results() to outer finally block so it runs
  even when the outer except Exception fires
- Clean up the temp cwd directory after each session
- Remove unnecessary inner try/finally nesting
2026-02-11 05:13:02 +04:00
Zamil Majdy
1926127ddd fix(backend/chat): Fix bugs and remove dead code in SDK integration
- Fix message accumulation bug: reset has_appended_assistant when
  creating new post-tool assistant message to prevent lost text deltas
- Fix hardcoded model in anthropic_fallback.py: use config.model instead
  of hardcoded "claude-sonnet-4-20250514"
- Fix _SDK_TOOL_RESULTS_DIR using hardcoded /root/ path: use expanduser
- Remove unused create_strict_security_hooks (~75 lines)
- Remove unused create_heartbeat/create_usage from response adapter
- Remove unused RAW_TOOL_NAMES from tool_adapter
- Extract _MAX_TOOL_ITERATIONS constant from magic number
2026-02-11 04:42:05 +04:00
Zamil Majdy
8b509e56de refactor(backend/chat): Replace --resume with conversation context, add compaction and dedup
- Remove broken --resume/session file approach (CLI v2.1.38 can't load
  >2 message session files) and delete session_file.py + tests
- Embed prior conversation turns as <conversation_history> context in
  the user message for multi-turn memory
- Add context compaction using shared compress_context() from prompt.py
  with LLM summarization + truncation fallback for long conversations
- Reuse _build_system_prompt and _generate_session_title from parent
  service.py instead of duplicating (gains Langfuse prompt support)
- Add has_conversation_history param to _build_system_prompt to avoid
  greeting on multi-turn conversations
- Fix _SDK_TOOL_RESULTS_GLOB from hardcoded /root/ to expanduser ~/
2026-02-11 04:22:11 +04:00
Zamil Majdy
acb2d0bd1b fix(backend/chat): Resolve symlinks in session file path for --resume
The CLI resolves symlinks when computing its project directory (e.g.
/tmp -> /private/tmp on macOS), so our session file writes must use
the resolved path to match. Also adds cwd to ClaudeAgentOptions and
debug logging for SDK messages.
2026-02-10 20:11:16 +04:00
Zamil Majdy
51aa369c80 fix(backend): Restore PyYAML cp38 wheel entries in poetry.lock
Re-add Python 3.8 wheel entries for PyYAML that were dropped by
poetry lock resolution, keeping the lockfile consistent with dev.
2026-02-10 20:06:45 +04:00
Zamil Majdy
6403ffe353 fix(backend/chat): Use --resume with session files for multi-turn conversations
Replace broken AsyncIterable approach (CLI rejects assistant-type stdin
messages) with JSONL session files written to the CLI's storage directory.
This enables --resume to load full user+assistant context with turn-level
compaction support for long conversations.
2026-02-10 18:46:33 +04:00
Zamil Majdy
c40a98ba3c Merge branches 'feat/copitlot-claude-code' and 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/copitlot-claude-code 2026-02-10 18:19:23 +04:00
Zamil Majdy
a31fc8b162 refactor(backend/chat): Use proper SDK types and in-memory conversation history
Replace duck typing (class name checks, getattr) with isinstance() using
SDK-exported dataclasses. Replace file-based --resume with AsyncIterable
message injection for conversation history, eliminating disk I/O. Add 15
unit tests for the response adapter.
2026-02-10 18:17:00 +04:00
Zamil Majdy
0f2d1a6553 Merge branch 'dev' into feat/copitlot-claude-code 2026-02-10 17:23:06 +04:00
Zamil Majdy
87d817b83b fix(backend/chat): Allow MCP-registered tools through security hook and fix title generation
- Skip BLOCKED_TOOLS check for tools with mcp__copilot__ prefix since they
  are already sandboxed by tool_adapter (fixes Read tool being blocked)
- Fall back to session.messages for title generation when message=None
2026-02-10 17:15:42 +04:00
Zamil Majdy
acf932bf4f refactor(backend/chat): Move glob/os imports to top-level in SDK service 2026-02-10 16:57:11 +04:00
Zamil Majdy
f562d9a277 fix(backend/chat): Add Read tool for SDK oversized tool results
The Claude Agent SDK saves tool results exceeding its token limit to
files and instructs the agent to read them back with a Read tool. Our
MCP server didn't have this tool, breaking the agent on large results
like run_block output (117K+ chars).

Changes:
- Add a Read tool to the MCP server (restricted to /root/.claude/)
- Register it in COPILOT_TOOL_NAMES so the SDK can use it
- Add safety-net truncation at 500K chars for extreme cases
- Clean up SDK tool-result files after each client session
2026-02-10 16:53:04 +04:00
Zamil Majdy
3c92a96504 fix(backend/chat): Publish StreamError before StreamFinish on error paths
When run_ai_generation() or event_generator() encounter errors, they
were only publishing StreamFinish without a preceding StreamError. The
frontend treats finish-without-error as normal completion, leaving the
user with an apparently stuck/empty response requiring a page refresh.
2026-02-10 15:49:23 +04:00
Zamil Majdy
8b8e1df739 fix(backend/chat): Auto-expire stale running tasks to unblock sessions
Tasks stuck in "running" status beyond stream_timeout (300s) are now
auto-marked as failed when looked up, preventing zombie tasks from
blocking the session indefinitely.
2026-02-10 15:35:43 +04:00
Zamil Majdy
602a0a4fb1 fix(backend/chat): Strip tool call noise from conversation history context 2026-02-10 14:11:27 +04:00
Zamil Majdy
8d7d531ae0 refactor(backend/chat): Remove unused max_context_messages config 2026-02-10 13:57:33 +04:00
Zamil Majdy
43153a12e0 fix(backend/chat): Remove manual context truncation from SDK path, let SDK handle compaction 2026-02-10 13:52:49 +04:00
Zamil Majdy
587e11c60a refactor(backend/chat): Extract MCP server name constants to avoid hardcoded strings 2026-02-10 12:12:08 +04:00
Zamil Majdy
57da545e02 Merge branch 'dev' into feat/copitlot-claude-code 2026-02-10 12:10:35 +04:00
Zamil Majdy
626980bf27 Merge branch 'dev' into feat/copitlot-claude-code 2026-02-09 19:26:52 +04:00
Swifty
e42b27af3c Merge branch 'dev' into feat/copitlot-claude-code 2026-02-09 09:12:23 +01:00
Zamil Majdy
34face15d2 fix lock 2026-02-09 11:45:59 +04:00
Zamil Majdy
7d32c83f95 fix(backend/chat): Handle non-serializable SDK objects in tool result output 2026-02-09 10:59:50 +04:00
Zamil Majdy
6e2a45b84e style(backend): Remove unused pytest import in execution_queue_test 2026-02-09 10:14:20 +04:00
Zamil Majdy
32f6532e9c Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/copitlot-claude-code 2026-02-09 10:10:32 +04:00
Zamil Majdy
0bbe8a184d Merge dev and resolve poetry.lock conflict 2026-02-08 19:40:17 +04:00
Zamil Majdy
7592deed63 fix(backend/chat): Address remaining PR review comments
- Fix tool_call_id always being "sdk-call" by generating unique IDs per invocation
- Fix validation using original tool_name instead of clean_name in security hooks
- Fix duplicate StreamFinish in Anthropic fallback path
- Fix ImportError fallback returning plain dict instead of re-raising
- Extract _build_input_schema helper to deduplicate schema construction
- Add else branch for unhandled SDK message types for observability
- Truncate large tool results in conversation history to prevent context overflow
2026-02-08 19:39:10 +04:00
Zamil Majdy
b9c759ce4f fix(backend/chat): Address additional PR review comments
- Add terminal StreamFinish in adapt_sdk_stream if SDK ends without one
- Sanitize error message in adapt_sdk_stream exception handler
- Pass full JSON schema (type, properties, required) to tool decorator
2026-02-08 07:14:45 +04:00
Zamil Majdy
5efb80d47b fix(backend/chat): Address PR review comments for Claude SDK integration
- Add StreamFinish after ErrorMessage in response adapter
- Fix str.replace to removeprefix in security hooks
- Apply max_context_messages limit as safety guard in history formatting
- Add empty prompt guard before sending to SDK
- Sanitize error messages to avoid exposing internal details
- Fix fire-and-forget asyncio.create_task by storing task reference
- Fix tool_calls population on assistant messages
- Rewrite Anthropic fallback to persist messages and merge consecutive roles
- Only use ANTHROPIC_API_KEY for fallback (not OpenRouter keys)
- Fix IndexError when tool result content list is empty
2026-02-06 13:25:10 +04:00
Zamil Majdy
b49d8e2cba fix lock 2026-02-06 13:19:53 +04:00
Zamil Majdy
452544530d feat(chat/sdk): Enable native SDK context compaction
- Remove manual truncation in conversation history formatting
- SDK's automatic compaction handles context limits intelligently
- Add observability hooks:
  - PreCompact: Log when SDK triggers context compaction
  - PostToolUse: Log successful tool executions
  - PostToolUseFailure: Log and debug failed tool executions
- Update config: increase max_context_messages (SDK handles compaction)
2026-02-06 12:44:48 +04:00
Zamil Majdy
32ee7e6cf8 fix(chat): Remove aggressive stale task detection
The 60-second timeout was too aggressive and could incorrectly mark
legitimate long-running tool calls as stale. Relying on Redis TTL
(1 hour) for cleanup is sufficient and more reliable.
2026-02-06 11:45:54 +04:00
Zamil Majdy
670663c406 Merge dev and resolve poetry.lock conflict 2026-02-06 11:40:41 +04:00
Zamil Majdy
0dbe4cf51e feat(backend/chat): Add Claude Agent SDK integration for CoPilot
This PR adds Claude Agent SDK as the default backend for CoPilot chat completions,
replacing the direct OpenAI API integration.

Key changes:
- Add Claude Agent SDK service layer with MCP tool adapter
- Fix message persistence after tool calls (messages no longer disappear on refresh)
- Add OpenRouter tracing for session title generation
- Add security hooks for user context validation
- Add Anthropic fallback when SDK is not available
- Clean up excessive debug logging
2026-02-06 11:38:17 +04:00
72 changed files with 3815 additions and 5068 deletions

View File

@@ -5,13 +5,42 @@
!docs/
# Platform - Libs
!autogpt_platform/autogpt_libs/
!autogpt_platform/autogpt_libs/autogpt_libs/
!autogpt_platform/autogpt_libs/pyproject.toml
!autogpt_platform/autogpt_libs/poetry.lock
!autogpt_platform/autogpt_libs/README.md
# Platform - Backend
!autogpt_platform/backend/
!autogpt_platform/backend/backend/
!autogpt_platform/backend/test/e2e_test_data.py
!autogpt_platform/backend/migrations/
!autogpt_platform/backend/schema.prisma
!autogpt_platform/backend/pyproject.toml
!autogpt_platform/backend/poetry.lock
!autogpt_platform/backend/README.md
!autogpt_platform/backend/.env
!autogpt_platform/backend/gen_prisma_types_stub.py
# Platform - Market
!autogpt_platform/market/market/
!autogpt_platform/market/scripts.py
!autogpt_platform/market/schema.prisma
!autogpt_platform/market/pyproject.toml
!autogpt_platform/market/poetry.lock
!autogpt_platform/market/README.md
# Platform - Frontend
!autogpt_platform/frontend/
!autogpt_platform/frontend/src/
!autogpt_platform/frontend/public/
!autogpt_platform/frontend/scripts/
!autogpt_platform/frontend/package.json
!autogpt_platform/frontend/pnpm-lock.yaml
!autogpt_platform/frontend/tsconfig.json
!autogpt_platform/frontend/README.md
## config
!autogpt_platform/frontend/*.config.*
!autogpt_platform/frontend/.env.*
!autogpt_platform/frontend/.env
# Classic - AutoGPT
!classic/original_autogpt/autogpt/
@@ -35,38 +64,6 @@
# Classic - Frontend
!classic/frontend/build/web/
# Explicitly re-ignore unwanted files from whitelisted directories
# Note: These patterns MUST come after the whitelist rules to take effect
# Hidden files and directories (but keep frontend .env files needed for build)
**/.*
!autogpt_platform/frontend/.env
!autogpt_platform/frontend/.env.default
!autogpt_platform/frontend/.env.production
# Python artifacts
**/__pycache__/
**/*.pyc
**/*.pyo
**/.venv/
**/.ruff_cache/
**/.pytest_cache/
**/.coverage
**/htmlcov/
# Node artifacts
**/node_modules/
**/.next/
**/storybook-static/
**/playwright-report/
**/test-results/
# Build artifacts
**/dist/
**/build/
!autogpt_platform/frontend/src/**/build/
**/target/
# Logs and temp files
**/*.log
**/*.tmp
# Explicitly re-ignore some folders
.*
**/__pycache__

View File

@@ -40,48 +40,6 @@ jobs:
git checkout -b "$BRANCH_NAME"
echo "branch_name=$BRANCH_NAME" >> $GITHUB_OUTPUT
# Backend Python/Poetry setup (so Claude can run linting/tests)
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Set up Python dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
- name: Install Poetry
run: |
cd autogpt_platform/backend
HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Install Python dependencies
working-directory: autogpt_platform/backend
run: poetry install
- name: Generate Prisma Client
working-directory: autogpt_platform/backend
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (so Claude can run linting/tests)
- name: Enable corepack
run: corepack enable
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install JavaScript dependencies
working-directory: autogpt_platform/frontend
run: pnpm install --frozen-lockfile
- name: Get CI failure details
id: failure_details
uses: actions/github-script@v8

View File

@@ -77,15 +77,27 @@ jobs:
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
- name: Enable corepack
run: corepack enable
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Enable corepack
run: corepack enable
- name: Set pnpm store directory
run: |
pnpm config set store-dir ~/.pnpm-store
echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
- name: Cache frontend dependencies
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
- name: Install JavaScript dependencies
working-directory: autogpt_platform/frontend

View File

@@ -93,15 +93,27 @@ jobs:
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
- name: Enable corepack
run: corepack enable
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Enable corepack
run: corepack enable
- name: Set pnpm store directory
run: |
pnpm config set store-dir ~/.pnpm-store
echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
- name: Cache frontend dependencies
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
- name: Install JavaScript dependencies
working-directory: autogpt_platform/frontend

View File

@@ -62,7 +62,7 @@ jobs:
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v4
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
@@ -93,6 +93,6 @@ jobs:
exit 1
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v4
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{matrix.language}}"

View File

@@ -26,6 +26,7 @@ jobs:
setup:
runs-on: ubuntu-latest
outputs:
cache-key: ${{ steps.cache-key.outputs.key }}
components-changed: ${{ steps.filter.outputs.components }}
steps:
@@ -40,17 +41,28 @@ jobs:
components:
- 'autogpt_platform/frontend/src/components/**'
- name: Enable corepack
run: corepack enable
- name: Set up Node
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install dependencies to populate cache
- name: Enable corepack
run: corepack enable
- name: Generate cache key
id: cache-key
run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
- name: Cache dependencies
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ steps.cache-key.outputs.key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
- name: Install dependencies
run: pnpm install --frozen-lockfile
lint:
@@ -61,15 +73,22 @@ jobs:
- name: Checkout repository
uses: actions/checkout@v6
- name: Enable corepack
run: corepack enable
- name: Set up Node
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Enable corepack
run: corepack enable
- name: Restore dependencies cache
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
- name: Install dependencies
run: pnpm install --frozen-lockfile
@@ -92,15 +111,22 @@ jobs:
with:
fetch-depth: 0
- name: Enable corepack
run: corepack enable
- name: Set up Node
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Enable corepack
run: corepack enable
- name: Restore dependencies cache
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
- name: Install dependencies
run: pnpm install --frozen-lockfile
@@ -115,8 +141,10 @@ jobs:
exitOnceUploaded: true
e2e_test:
name: end-to-end tests
runs-on: big-boi
needs: setup
strategy:
fail-fast: false
steps:
- name: Checkout repository
@@ -124,11 +152,19 @@ jobs:
with:
submodules: recursive
- name: Set up Platform - Copy default supabase .env
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
- name: Enable corepack
run: corepack enable
- name: Copy default supabase .env
run: |
cp ../.env.default ../.env
- name: Set up Platform - Copy backend .env and set OpenAI API key
- name: Copy backend .env and set OpenAI API key
run: |
cp ../backend/.env.default ../backend/.env
echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
@@ -136,125 +172,77 @@ jobs:
# Used by E2E test data script to generate embeddings for approved store agents
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
- name: Set up Platform - Set up Docker Buildx
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
driver: docker-container
driver-opts: network=host
- name: Set up Platform - Expose GHA cache to docker buildx CLI
uses: crazy-max/ghaction-github-runtime@v3
- name: Set up Platform - Build Docker images (with cache)
working-directory: autogpt_platform
run: |
pip install pyyaml
# Resolve extends and generate a flat compose file that bake can understand
docker compose -f docker-compose.yml config > docker-compose.resolved.yml
# Add cache configuration to the resolved compose file
python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
--source docker-compose.resolved.yml \
--cache-from "type=gha" \
--cache-to "type=gha,mode=max" \
--backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend') }}" \
--frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src') }}" \
--git-ref "${{ github.ref }}"
# Build with bake using the resolved compose file (now includes cache config)
docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Cache E2E test data
id: e2e-data-cache
- name: Cache Docker layers
uses: actions/cache@v5
with:
path: /tmp/e2e_test_data.sql
key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-frontend-ci.yml') }}
path: /tmp/.buildx-cache
key: ${{ runner.os }}-buildx-frontend-test-${{ hashFiles('autogpt_platform/docker-compose.yml', 'autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/pyproject.toml', 'autogpt_platform/backend/poetry.lock') }}
restore-keys: |
${{ runner.os }}-buildx-frontend-test-
- name: Set up Platform - Start Supabase DB + Auth
- name: Run docker compose
run: |
docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
echo "Waiting for database to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
echo "Waiting for auth service to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
- name: Set up Platform - Run migrations
run: |
echo "Running migrations..."
docker compose -f ../docker-compose.resolved.yml run --rm migrate
echo "✅ Migrations completed"
NEXT_PUBLIC_PW_TEST=true docker compose -f ../docker-compose.yml up -d
env:
NEXT_PUBLIC_PW_TEST: true
DOCKER_BUILDKIT: 1
BUILDX_CACHE_FROM: type=local,src=/tmp/.buildx-cache
BUILDX_CACHE_TO: type=local,dest=/tmp/.buildx-cache-new,mode=max
- name: Set up tests - Load cached E2E test data
if: steps.e2e-data-cache.outputs.cache-hit == 'true'
- name: Move cache
run: |
echo "✅ Found cached E2E test data, restoring..."
{
echo "SET session_replication_role = 'replica';"
cat /tmp/e2e_test_data.sql
echo "SET session_replication_role = 'origin';"
} | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
# Refresh materialized views after restore
docker compose -f ../docker-compose.resolved.yml exec -T db \
psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
rm -rf /tmp/.buildx-cache
if [ -d "/tmp/.buildx-cache-new" ]; then
mv /tmp/.buildx-cache-new /tmp/.buildx-cache
fi
echo "✅ E2E test data restored from cache"
- name: Set up Platform - Start (all other services)
- name: Wait for services to be ready
run: |
docker compose -f ../docker-compose.resolved.yml up -d --no-build
echo "Waiting for rest_server to be ready..."
timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
env:
NEXT_PUBLIC_PW_TEST: true
echo "Waiting for database to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
- name: Set up tests - Create E2E test data
if: steps.e2e-data-cache.outputs.cache-hit != 'true'
- name: Create E2E test data
run: |
echo "Creating E2E test data..."
docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
echo "❌ E2E test data creation failed!"
docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
exit 1
}
# First try to run the script from inside the container
if docker compose -f ../docker-compose.yml exec -T rest_server test -f /app/autogpt_platform/backend/test/e2e_test_data.py; then
echo "✅ Found e2e_test_data.py in container, running it..."
docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python backend/test/e2e_test_data.py" || {
echo "❌ E2E test data creation failed!"
docker compose -f ../docker-compose.yml logs --tail=50 rest_server
exit 1
}
else
echo "⚠️ e2e_test_data.py not found in container, copying and running..."
# Copy the script into the container and run it
docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.yml ps -q rest_server):/tmp/e2e_test_data.py || {
echo "❌ Failed to copy script to container"
exit 1
}
docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
echo "❌ E2E test data creation failed!"
docker compose -f ../docker-compose.yml logs --tail=50 rest_server
exit 1
}
fi
# Dump auth.users + platform schema for cache (two separate dumps)
echo "Dumping database for cache..."
{
docker compose -f ../docker-compose.resolved.yml exec -T db \
pg_dump -U postgres --data-only --column-inserts \
--table='auth.users' postgres
docker compose -f ../docker-compose.resolved.yml exec -T db \
pg_dump -U postgres --data-only --column-inserts \
--schema=platform \
--exclude-table='platform._prisma_migrations' \
--exclude-table='platform.apscheduler_jobs' \
--exclude-table='platform.apscheduler_jobs_batched_notifications' \
postgres
} > /tmp/e2e_test_data.sql
echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
- name: Set up tests - Enable corepack
run: corepack enable
- name: Set up tests - Set up Node
uses: actions/setup-node@v6
- name: Restore dependencies cache
uses: actions/cache@v5
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
- name: Set up tests - Install dependencies
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Set up tests - Install browser 'chromium'
- name: Install Browser 'chromium'
run: pnpm playwright install --with-deps chromium
- name: Run Playwright tests
@@ -281,7 +269,7 @@ jobs:
- name: Print Final Docker Compose logs
if: always()
run: docker compose -f ../docker-compose.resolved.yml logs
run: docker compose -f ../docker-compose.yml logs
integration_test:
runs-on: ubuntu-latest
@@ -293,15 +281,22 @@ jobs:
with:
submodules: recursive
- name: Enable corepack
run: corepack enable
- name: Set up Node
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Enable corepack
run: corepack enable
- name: Restore dependencies cache
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
- name: Install dependencies
run: pnpm install --frozen-lockfile

View File

@@ -1,195 +0,0 @@
#!/usr/bin/env python3
"""
Add cache configuration to a resolved docker-compose file for all services
that have a build key, and ensure image names match what docker compose expects.
"""
import argparse
import yaml
DEFAULT_BRANCH = "dev"
CACHE_BUILDS_FOR_COMPONENTS = ["backend", "frontend"]
def main():
parser = argparse.ArgumentParser(
description="Add cache config to a resolved compose file"
)
parser.add_argument(
"--source",
required=True,
help="Source compose file to read (should be output of `docker compose config`)",
)
parser.add_argument(
"--cache-from",
default="type=gha",
help="Cache source configuration",
)
parser.add_argument(
"--cache-to",
default="type=gha,mode=max",
help="Cache destination configuration",
)
for component in CACHE_BUILDS_FOR_COMPONENTS:
parser.add_argument(
f"--{component}-hash",
default="",
help=f"Hash for {component} cache scope (e.g., from hashFiles())",
)
parser.add_argument(
"--git-ref",
default="",
help="Git ref for branch-based cache scope (e.g., refs/heads/master)",
)
args = parser.parse_args()
# Normalize git ref to a safe scope name (e.g., refs/heads/master -> master)
git_ref_scope = ""
if args.git_ref:
git_ref_scope = args.git_ref.replace("refs/heads/", "").replace("/", "-")
with open(args.source, "r") as f:
compose = yaml.safe_load(f)
# Get project name from compose file or default
project_name = compose.get("name", "autogpt_platform")
def get_image_name(dockerfile: str, target: str) -> str:
"""Generate image name based on Dockerfile folder and build target."""
dockerfile_parts = dockerfile.replace("\\", "/").split("/")
if len(dockerfile_parts) >= 2:
folder_name = dockerfile_parts[-2] # e.g., "backend" or "frontend"
else:
folder_name = "app"
return f"{project_name}-{folder_name}:{target}"
def get_build_key(dockerfile: str, target: str) -> str:
"""Generate a unique key for a Dockerfile+target combination."""
return f"{dockerfile}:{target}"
def get_component(dockerfile: str) -> str | None:
"""Get component name (frontend/backend) from dockerfile path."""
for component in CACHE_BUILDS_FOR_COMPONENTS:
if component in dockerfile:
return component
return None
# First pass: collect all services with build configs and identify duplicates
# Track which (dockerfile, target) combinations we've seen
build_key_to_first_service: dict[str, str] = {}
services_to_build: list[str] = []
services_to_dedupe: list[str] = []
for service_name, service_config in compose.get("services", {}).items():
if "build" not in service_config:
continue
build_config = service_config["build"]
dockerfile = build_config.get("dockerfile", "Dockerfile")
target = build_config.get("target", "default")
build_key = get_build_key(dockerfile, target)
if build_key not in build_key_to_first_service:
# First service with this build config - it will do the actual build
build_key_to_first_service[build_key] = service_name
services_to_build.append(service_name)
else:
# Duplicate - will just use the image from the first service
services_to_dedupe.append(service_name)
# Second pass: configure builds and deduplicate
modified_services = []
for service_name, service_config in compose.get("services", {}).items():
if "build" not in service_config:
continue
build_config = service_config["build"]
dockerfile = build_config.get("dockerfile", "Dockerfile")
target = build_config.get("target", "latest")
image_name = get_image_name(dockerfile, target)
# Set image name for all services (needed for both builders and deduped)
service_config["image"] = image_name
if service_name in services_to_dedupe:
# Remove build config - this service will use the pre-built image
del service_config["build"]
continue
# This service will do the actual build - add cache config
cache_from_list = []
cache_to_list = []
component = get_component(dockerfile)
if not component:
# Skip services that don't clearly match frontend/backend
continue
# Get the hash for this component
component_hash = getattr(args, f"{component}_hash")
# Scope format: platform-{component}-{target}-{hash|ref}
# Example: platform-backend-server-abc123
if "type=gha" in args.cache_from:
# 1. Primary: exact hash match (most specific)
if component_hash:
hash_scope = f"platform-{component}-{target}-{component_hash}"
cache_from_list.append(f"{args.cache_from},scope={hash_scope}")
# 2. Fallback: branch-based cache
if git_ref_scope:
ref_scope = f"platform-{component}-{target}-{git_ref_scope}"
cache_from_list.append(f"{args.cache_from},scope={ref_scope}")
# 3. Fallback: dev branch cache (for PRs/feature branches)
if git_ref_scope and git_ref_scope != DEFAULT_BRANCH:
master_scope = f"platform-{component}-{target}-{DEFAULT_BRANCH}"
cache_from_list.append(f"{args.cache_from},scope={master_scope}")
if "type=gha" in args.cache_to:
# Write to both hash-based and branch-based scopes
if component_hash:
hash_scope = f"platform-{component}-{target}-{component_hash}"
cache_to_list.append(f"{args.cache_to},scope={hash_scope}")
if git_ref_scope:
ref_scope = f"platform-{component}-{target}-{git_ref_scope}"
cache_to_list.append(f"{args.cache_to},scope={ref_scope}")
# Ensure we have at least one cache source/target
if not cache_from_list:
cache_from_list.append(args.cache_from)
if not cache_to_list:
cache_to_list.append(args.cache_to)
build_config["cache_from"] = cache_from_list
build_config["cache_to"] = cache_to_list
modified_services.append(service_name)
# Write back to the same file
with open(args.source, "w") as f:
yaml.dump(compose, f, default_flow_style=False, sort_keys=False)
print(f"Added cache config to {len(modified_services)} services in {args.source}:")
for svc in modified_services:
svc_config = compose["services"][svc]
build_cfg = svc_config.get("build", {})
cache_from_list = build_cfg.get("cache_from", ["none"])
cache_to_list = build_cfg.get("cache_to", ["none"])
print(f" - {svc}")
print(f" image: {svc_config.get('image', 'N/A')}")
print(f" cache_from: {cache_from_list}")
print(f" cache_to: {cache_to_list}")
if services_to_dedupe:
print(
f"Deduplicated {len(services_to_dedupe)} services (will use pre-built images):"
)
for svc in services_to_dedupe:
print(f" - {svc} -> {compose['services'][svc].get('image', 'N/A')}")
if __name__ == "__main__":
main()

View File

@@ -45,11 +45,6 @@ AutoGPT Platform is a monorepo containing:
- Backend/Frontend services use YAML anchors for consistent configuration
- Supabase services (`db/docker/docker-compose.yml`) follow the same pattern
### Branching Strategy
- **`dev`** is the main development branch. All PRs should target `dev`.
- **`master`** is the production branch. Only used for production releases.
### Creating Pull Requests
- Create the PR against the `dev` branch of the repository.

View File

@@ -448,61 +448,61 @@ toml = ["tomli ; python_full_version <= \"3.11.0a6\""]
[[package]]
name = "cryptography"
version = "46.0.5"
version = "46.0.4"
description = "cryptography is a package which provides cryptographic recipes and primitives to Python developers."
optional = false
python-versions = "!=3.9.0,!=3.9.1,>=3.8"
groups = ["main"]
files = [
{file = "cryptography-46.0.5-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:351695ada9ea9618b3500b490ad54c739860883df6c1f555e088eaf25b1bbaad"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c18ff11e86df2e28854939acde2d003f7984f721eba450b56a200ad90eeb0e6b"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d7e3d356b8cd4ea5aff04f129d5f66ebdc7b6f8eae802b93739ed520c47c79b"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:50bfb6925eff619c9c023b967d5b77a54e04256c4281b0e21336a130cd7fc263"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:803812e111e75d1aa73690d2facc295eaefd4439be1023fefc4995eaea2af90d"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ee190460e2fbe447175cda91b88b84ae8322a104fc27766ad09428754a618ed"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:f145bba11b878005c496e93e257c1e88f154d278d2638e6450d17e0f31e558d2"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:e9251e3be159d1020c4030bd2e5f84d6a43fe54b6c19c12f51cde9542a2817b2"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:47fb8a66058b80e509c47118ef8a75d14c455e81ac369050f20ba0d23e77fee0"},
{file = "cryptography-46.0.5-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:4c3341037c136030cb46e4b1e17b7418ea4cbd9dd207e4a6f3b2b24e0d4ac731"},
{file = "cryptography-46.0.5-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:890bcb4abd5a2d3f852196437129eb3667d62630333aacc13dfd470fad3aaa82"},
{file = "cryptography-46.0.5-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:80a8d7bfdf38f87ca30a5391c0c9ce4ed2926918e017c29ddf643d0ed2778ea1"},
{file = "cryptography-46.0.5-cp311-abi3-win32.whl", hash = "sha256:60ee7e19e95104d4c03871d7d7dfb3d22ef8a9b9c6778c94e1c8fcc8365afd48"},
{file = "cryptography-46.0.5-cp311-abi3-win_amd64.whl", hash = "sha256:38946c54b16c885c72c4f59846be9743d699eee2b69b6988e0a00a01f46a61a4"},
{file = "cryptography-46.0.5-cp314-cp314t-macosx_10_9_universal2.whl", hash = "sha256:94a76daa32eb78d61339aff7952ea819b1734b46f73646a07decb40e5b3448e2"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5be7bf2fb40769e05739dd0046e7b26f9d4670badc7b032d6ce4db64dddc0678"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fe346b143ff9685e40192a4960938545c699054ba11d4f9029f94751e3f71d87"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:c69fd885df7d089548a42d5ec05be26050ebcd2283d89b3d30676eb32ff87dee"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_28_ppc64le.whl", hash = "sha256:8293f3dea7fc929ef7240796ba231413afa7b68ce38fd21da2995549f5961981"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:1abfdb89b41c3be0365328a410baa9df3ff8a9110fb75e7b52e66803ddabc9a9"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:d66e421495fdb797610a08f43b05269e0a5ea7f5e652a89bfd5a7d3c1dee3648"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_34_aarch64.whl", hash = "sha256:4e817a8920bfbcff8940ecfd60f23d01836408242b30f1a708d93198393a80b4"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_34_ppc64le.whl", hash = "sha256:68f68d13f2e1cb95163fa3b4db4bf9a159a418f5f6e7242564fc75fcae667fd0"},
{file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_34_x86_64.whl", hash = "sha256:a3d1fae9863299076f05cb8a778c467578262fae09f9dc0ee9b12eb4268ce663"},
{file = "cryptography-46.0.5-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:c4143987a42a2397f2fc3b4d7e3a7d313fbe684f67ff443999e803dd75a76826"},
{file = "cryptography-46.0.5-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:7d731d4b107030987fd61a7f8ab512b25b53cef8f233a97379ede116f30eb67d"},
{file = "cryptography-46.0.5-cp314-cp314t-win32.whl", hash = "sha256:c3bcce8521d785d510b2aad26ae2c966092b7daa8f45dd8f44734a104dc0bc1a"},
{file = "cryptography-46.0.5-cp314-cp314t-win_amd64.whl", hash = "sha256:4d8ae8659ab18c65ced284993c2265910f6c9e650189d4e3f68445ef82a810e4"},
{file = "cryptography-46.0.5-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:4108d4c09fbbf2789d0c926eb4152ae1760d5a2d97612b92d508d96c861e4d31"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7d1f30a86d2757199cb2d56e48cce14deddf1f9c95f1ef1b64ee91ea43fe2e18"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:039917b0dc418bb9f6edce8a906572d69e74bd330b0b3fea4f79dab7f8ddd235"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:ba2a27ff02f48193fc4daeadf8ad2590516fa3d0adeeb34336b96f7fa64c1e3a"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:61aa400dce22cb001a98014f647dc21cda08f7915ceb95df0c9eaf84b4b6af76"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ce58ba46e1bc2aac4f7d9290223cead56743fa6ab94a5d53292ffaac6a91614"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:420d0e909050490d04359e7fdb5ed7e667ca5c3c402b809ae2563d7e66a92229"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:582f5fcd2afa31622f317f80426a027f30dc792e9c80ffee87b993200ea115f1"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:bfd56bb4b37ed4f330b82402f6f435845a5f5648edf1ad497da51a8452d5d62d"},
{file = "cryptography-46.0.5-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:a3d507bb6a513ca96ba84443226af944b0f7f47dcc9a399d110cd6146481d24c"},
{file = "cryptography-46.0.5-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:9f16fbdf4da055efb21c22d81b89f155f02ba420558db21288b3d0035bafd5f4"},
{file = "cryptography-46.0.5-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:ced80795227d70549a411a4ab66e8ce307899fad2220ce5ab2f296e687eacde9"},
{file = "cryptography-46.0.5-cp38-abi3-win32.whl", hash = "sha256:02f547fce831f5096c9a567fd41bc12ca8f11df260959ecc7c3202555cc47a72"},
{file = "cryptography-46.0.5-cp38-abi3-win_amd64.whl", hash = "sha256:556e106ee01aa13484ce9b0239bca667be5004efb0aabbed28d353df86445595"},
{file = "cryptography-46.0.5-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:3b4995dc971c9fb83c25aa44cf45f02ba86f71ee600d81091c2f0cbae116b06c"},
{file = "cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:bc84e875994c3b445871ea7181d424588171efec3e185dced958dad9e001950a"},
{file = "cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:2ae6971afd6246710480e3f15824ed3029a60fc16991db250034efd0b9fb4356"},
{file = "cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:d861ee9e76ace6cf36a6a89b959ec08e7bc2493ee39d07ffe5acb23ef46d27da"},
{file = "cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:2b7a67c9cd56372f3249b39699f2ad479f6991e62ea15800973b956f4b73e257"},
{file = "cryptography-46.0.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:8456928655f856c6e1533ff59d5be76578a7157224dbd9ce6872f25055ab9ab7"},
{file = "cryptography-46.0.5.tar.gz", hash = "sha256:abace499247268e3757271b2f1e244b36b06f8515cf27c4d49468fc9eb16e93d"},
{file = "cryptography-46.0.4-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:281526e865ed4166009e235afadf3a4c4cba6056f99336a99efba65336fd5485"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5f14fba5bf6f4390d7ff8f086c566454bff0411f6d8aa7af79c88b6f9267aecc"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:47bcd19517e6389132f76e2d5303ded6cf3f78903da2158a671be8de024f4cd0"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:01df4f50f314fbe7009f54046e908d1754f19d0c6d3070df1e6268c5a4af09fa"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:5aa3e463596b0087b3da0dbe2b2487e9fc261d25da85754e30e3b40637d61f81"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:0a9ad24359fee86f131836a9ac3bffc9329e956624a2d379b613f8f8abaf5255"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:dc1272e25ef673efe72f2096e92ae39dea1a1a450dd44918b15351f72c5a168e"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:de0f5f4ec8711ebc555f54735d4c673fc34b65c44283895f1a08c2b49d2fd99c"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:eeeb2e33d8dbcccc34d64651f00a98cb41b2dc69cef866771a5717e6734dfa32"},
{file = "cryptography-46.0.4-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:3d425eacbc9aceafd2cb429e42f4e5d5633c6f873f5e567077043ef1b9bbf616"},
{file = "cryptography-46.0.4-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:91627ebf691d1ea3976a031b61fb7bac1ccd745afa03602275dda443e11c8de0"},
{file = "cryptography-46.0.4-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:2d08bc22efd73e8854b0b7caff402d735b354862f1145d7be3b9c0f740fef6a0"},
{file = "cryptography-46.0.4-cp311-abi3-win32.whl", hash = "sha256:82a62483daf20b8134f6e92898da70d04d0ef9a75829d732ea1018678185f4f5"},
{file = "cryptography-46.0.4-cp311-abi3-win_amd64.whl", hash = "sha256:6225d3ebe26a55dbc8ead5ad1265c0403552a63336499564675b29eb3184c09b"},
{file = "cryptography-46.0.4-cp314-cp314t-macosx_10_9_universal2.whl", hash = "sha256:485e2b65d25ec0d901bca7bcae0f53b00133bf3173916d8e421f6fddde103908"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:078e5f06bd2fa5aea5a324f2a09f914b1484f1d0c2a4d6a8a28c74e72f65f2da"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:dce1e4f068f03008da7fa51cc7abc6ddc5e5de3e3d1550334eaf8393982a5829"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:2067461c80271f422ee7bdbe79b9b4be54a5162e90345f86a23445a0cf3fd8a2"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_28_ppc64le.whl", hash = "sha256:c92010b58a51196a5f41c3795190203ac52edfd5dc3ff99149b4659eba9d2085"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:829c2b12bbc5428ab02d6b7f7e9bbfd53e33efd6672d21341f2177470171ad8b"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:62217ba44bf81b30abaeda1488686a04a702a261e26f87db51ff61d9d3510abd"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_34_aarch64.whl", hash = "sha256:9c2da296c8d3415b93e6053f5a728649a87a48ce084a9aaf51d6e46c87c7f2d2"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_34_ppc64le.whl", hash = "sha256:9b34d8ba84454641a6bf4d6762d15847ecbd85c1316c0a7984e6e4e9f748ec2e"},
{file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_34_x86_64.whl", hash = "sha256:df4a817fa7138dd0c96c8c8c20f04b8aaa1fac3bbf610913dcad8ea82e1bfd3f"},
{file = "cryptography-46.0.4-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:b1de0ebf7587f28f9190b9cb526e901bf448c9e6a99655d2b07fff60e8212a82"},
{file = "cryptography-46.0.4-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:9b4d17bc7bd7cdd98e3af40b441feaea4c68225e2eb2341026c84511ad246c0c"},
{file = "cryptography-46.0.4-cp314-cp314t-win32.whl", hash = "sha256:c411f16275b0dea722d76544a61d6421e2cc829ad76eec79280dbdc9ddf50061"},
{file = "cryptography-46.0.4-cp314-cp314t-win_amd64.whl", hash = "sha256:728fedc529efc1439eb6107b677f7f7558adab4553ef8669f0d02d42d7b959a7"},
{file = "cryptography-46.0.4-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:a9556ba711f7c23f77b151d5798f3ac44a13455cc68db7697a1096e6d0563cab"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:8bf75b0259e87fa70bddc0b8b4078b76e7fd512fd9afae6c1193bcf440a4dbef"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3c268a3490df22270955966ba236d6bc4a8f9b6e4ffddb78aac535f1a5ea471d"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:812815182f6a0c1d49a37893a303b44eaac827d7f0d582cecfc81b6427f22973"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:a90e43e3ef65e6dcf969dfe3bb40cbf5aef0d523dff95bfa24256be172a845f4"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:a05177ff6296644ef2876fce50518dffb5bcdf903c85250974fc8bc85d54c0af"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:daa392191f626d50f1b136c9b4cf08af69ca8279d110ea24f5c2700054d2e263"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:e07ea39c5b048e085f15923511d8121e4a9dc45cee4e3b970ca4f0d338f23095"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:d5a45ddc256f492ce42a4e35879c5e5528c09cd9ad12420828c972951d8e016b"},
{file = "cryptography-46.0.4-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:6bb5157bf6a350e5b28aee23beb2d84ae6f5be390b2f8ee7ea179cda077e1019"},
{file = "cryptography-46.0.4-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:dd5aba870a2c40f87a3af043e0dee7d9eb02d4aff88a797b48f2b43eff8c3ab4"},
{file = "cryptography-46.0.4-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:93d8291da8d71024379ab2cb0b5c57915300155ad42e07f76bea6ad838d7e59b"},
{file = "cryptography-46.0.4-cp38-abi3-win32.whl", hash = "sha256:0563655cb3c6d05fb2afe693340bc050c30f9f34e15763361cf08e94749401fc"},
{file = "cryptography-46.0.4-cp38-abi3-win_amd64.whl", hash = "sha256:fa0900b9ef9c49728887d1576fd8d9e7e3ea872fa9b25ef9b64888adc434e976"},
{file = "cryptography-46.0.4-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:766330cce7416c92b5e90c3bb71b1b79521760cdcfc3a6a1a182d4c9fab23d2b"},
{file = "cryptography-46.0.4-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:c236a44acfb610e70f6b3e1c3ca20ff24459659231ef2f8c48e879e2d32b73da"},
{file = "cryptography-46.0.4-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:8a15fb869670efa8f83cbffbc8753c1abf236883225aed74cd179b720ac9ec80"},
{file = "cryptography-46.0.4-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:fdc3daab53b212472f1524d070735b2f0c214239df131903bae1d598016fa822"},
{file = "cryptography-46.0.4-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:44cc0675b27cadb71bdbb96099cca1fa051cd11d2ade09e5cd3a2edb929ed947"},
{file = "cryptography-46.0.4-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:be8c01a7d5a55f9a47d1888162b76c8f49d62b234d88f0ff91a9fbebe32ffbc3"},
{file = "cryptography-46.0.4.tar.gz", hash = "sha256:bfd019f60f8abc2ed1b9be4ddc21cfef059c841d86d710bb69909a688cbb8f59"},
]
[package.dependencies]
@@ -516,7 +516,7 @@ nox = ["nox[uv] (>=2024.4.15)"]
pep8test = ["check-sdist", "click (>=8.0.1)", "mypy (>=1.14)", "ruff (>=0.11.11)"]
sdist = ["build (>=1.0.0)"]
ssh = ["bcrypt (>=3.1.5)"]
test = ["certifi (>=2024)", "cryptography-vectors (==46.0.5)", "pretend (>=0.7)", "pytest (>=7.4.0)", "pytest-benchmark (>=4.0)", "pytest-cov (>=2.10.1)", "pytest-xdist (>=3.5.0)"]
test = ["certifi (>=2024)", "cryptography-vectors (==46.0.4)", "pretend (>=0.7)", "pytest (>=7.4.0)", "pytest-benchmark (>=4.0)", "pytest-cov (>=2.10.1)", "pytest-xdist (>=3.5.0)"]
test-randomorder = ["pytest-randomly"]
[[package]]
@@ -570,25 +570,24 @@ tests = ["coverage", "coveralls", "dill", "mock", "nose"]
[[package]]
name = "fastapi"
version = "0.128.7"
version = "0.128.0"
description = "FastAPI framework, high performance, easy to learn, fast to code, ready for production"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "fastapi-0.128.7-py3-none-any.whl", hash = "sha256:6bd9bd31cb7047465f2d3fa3ba3f33b0870b17d4eaf7cdb36d1576ab060ad662"},
{file = "fastapi-0.128.7.tar.gz", hash = "sha256:783c273416995486c155ad2c0e2b45905dedfaf20b9ef8d9f6a9124670639a24"},
{file = "fastapi-0.128.0-py3-none-any.whl", hash = "sha256:aebd93f9716ee3b4f4fcfe13ffb7cf308d99c9f3ab5622d8877441072561582d"},
{file = "fastapi-0.128.0.tar.gz", hash = "sha256:1cc179e1cef10a6be60ffe429f79b829dce99d8de32d7acb7e6c8dfdf7f2645a"},
]
[package.dependencies]
annotated-doc = ">=0.0.2"
pydantic = ">=2.7.0"
starlette = ">=0.40.0,<1.0.0"
starlette = ">=0.40.0,<0.51.0"
typing-extensions = ">=4.8.0"
typing-inspection = ">=0.4.2"
[package.extras]
all = ["email-validator (>=2.0.0)", "fastapi-cli[standard] (>=0.0.8)", "httpx (>=0.23.0,<1.0.0)", "itsdangerous (>=1.1.0)", "jinja2 (>=3.1.5)", "orjson (>=3.9.3)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.18)", "pyyaml (>=5.3.1)", "ujson (>=5.8.0)", "uvicorn[standard] (>=0.12.0)"]
all = ["email-validator (>=2.0.0)", "fastapi-cli[standard] (>=0.0.8)", "httpx (>=0.23.0,<1.0.0)", "itsdangerous (>=1.1.0)", "jinja2 (>=3.1.5)", "orjson (>=3.2.1)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.18)", "pyyaml (>=5.3.1)", "ujson (>=4.0.1,!=4.0.2,!=4.1.0,!=4.2.0,!=4.3.0,!=5.0.0,!=5.1.0)", "uvicorn[standard] (>=0.12.0)"]
standard = ["email-validator (>=2.0.0)", "fastapi-cli[standard] (>=0.0.8)", "httpx (>=0.23.0,<1.0.0)", "jinja2 (>=3.1.5)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.18)", "uvicorn[standard] (>=0.12.0)"]
standard-no-fastapi-cloud-cli = ["email-validator (>=2.0.0)", "fastapi-cli[standard-no-fastapi-cloud-cli] (>=0.0.8)", "httpx (>=0.23.0,<1.0.0)", "jinja2 (>=3.1.5)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.18)", "uvicorn[standard] (>=0.12.0)"]
@@ -1063,14 +1062,14 @@ urllib3 = ">=1.26.0,<3"
[[package]]
name = "launchdarkly-server-sdk"
version = "9.15.0"
version = "9.14.1"
description = "LaunchDarkly SDK for Python"
optional = false
python-versions = ">=3.10"
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "launchdarkly_server_sdk-9.15.0-py3-none-any.whl", hash = "sha256:c267e29bfa3fb5e2a06a208448ada6ed5557a2924979b8d79c970b45d227c668"},
{file = "launchdarkly_server_sdk-9.15.0.tar.gz", hash = "sha256:f31441b74bc1a69c381db57c33116509e407a2612628ad6dff0a7dbb39d5020b"},
{file = "launchdarkly_server_sdk-9.14.1-py3-none-any.whl", hash = "sha256:a9e2bd9ecdef845cd631ae0d4334a1115e5b44257c42eb2349492be4bac7815c"},
{file = "launchdarkly_server_sdk-9.14.1.tar.gz", hash = "sha256:1df44baf0a0efa74d8c1dad7a00592b98bce7d19edded7f770da8dbc49922213"},
]
[package.dependencies]
@@ -1479,14 +1478,14 @@ testing = ["coverage", "pytest", "pytest-benchmark"]
[[package]]
name = "postgrest"
version = "2.28.0"
version = "2.27.2"
description = "PostgREST client for Python. This library provides an ORM interface to PostgREST."
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "postgrest-2.28.0-py3-none-any.whl", hash = "sha256:7bca2f24dd1a1bf8a3d586c7482aba6cd41662da6733045fad585b63b7f7df75"},
{file = "postgrest-2.28.0.tar.gz", hash = "sha256:c36b38646d25ea4255321d3d924ce70f8d20ec7799cb42c1221d6a818d4f6515"},
{file = "postgrest-2.27.2-py3-none-any.whl", hash = "sha256:1666fef3de05ca097a314433dd5ae2f2d71c613cb7b233d0f468c4ffe37277da"},
{file = "postgrest-2.27.2.tar.gz", hash = "sha256:55407d530b5af3d64e883a71fec1f345d369958f723ce4a8ab0b7d169e313242"},
]
[package.dependencies]
@@ -2249,14 +2248,14 @@ cli = ["click (>=5.0)"]
[[package]]
name = "realtime"
version = "2.28.0"
version = "2.27.2"
description = ""
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "realtime-2.28.0-py3-none-any.whl", hash = "sha256:db1bd59bab9b1fcc9f9d3b1a073bed35bf4994d720e6751f10031a58d57a3836"},
{file = "realtime-2.28.0.tar.gz", hash = "sha256:d18cedcebd6a8f22fcd509bc767f639761eb218b7b2b6f14fc4205b6259b50fc"},
{file = "realtime-2.27.2-py3-none-any.whl", hash = "sha256:34a9cbb26a274e707e8fc9e3ee0a66de944beac0fe604dc336d1e985db2c830f"},
{file = "realtime-2.27.2.tar.gz", hash = "sha256:b960a90294d2cea1b3f1275ecb89204304728e08fff1c393cc1b3150739556b3"},
]
[package.dependencies]
@@ -2437,14 +2436,14 @@ full = ["httpx (>=0.27.0,<0.29.0)", "itsdangerous", "jinja2", "python-multipart
[[package]]
name = "storage3"
version = "2.28.0"
version = "2.27.2"
description = "Supabase Storage client for Python."
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "storage3-2.28.0-py3-none-any.whl", hash = "sha256:ecb50efd2ac71dabbdf97e99ad346eafa630c4c627a8e5a138ceb5fbbadae716"},
{file = "storage3-2.28.0.tar.gz", hash = "sha256:bc1d008aff67de7a0f2bd867baee7aadbcdb6f78f5a310b4f7a38e8c13c19865"},
{file = "storage3-2.27.2-py3-none-any.whl", hash = "sha256:e6f16e7a260729e7b1f46e9bf61746805a02e30f5e419ee1291007c432e3ec63"},
{file = "storage3-2.27.2.tar.gz", hash = "sha256:cb4807b7f86b4bb1272ac6fdd2f3cfd8ba577297046fa5f88557425200275af5"},
]
[package.dependencies]
@@ -2488,35 +2487,35 @@ python-dateutil = ">=2.6.0"
[[package]]
name = "supabase"
version = "2.28.0"
version = "2.27.2"
description = "Supabase client for Python."
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "supabase-2.28.0-py3-none-any.whl", hash = "sha256:42776971c7d0ccca16034df1ab96a31c50228eb1eb19da4249ad2f756fc20272"},
{file = "supabase-2.28.0.tar.gz", hash = "sha256:aea299aaab2a2eed3c57e0be7fc035c6807214194cce795a3575add20268ece1"},
{file = "supabase-2.27.2-py3-none-any.whl", hash = "sha256:d4dce00b3a418ee578017ec577c0e5be47a9a636355009c76f20ed2faa15bc54"},
{file = "supabase-2.27.2.tar.gz", hash = "sha256:2aed40e4f3454438822442a1e94a47be6694c2c70392e7ae99b51a226d4293f7"},
]
[package.dependencies]
httpx = ">=0.26,<0.29"
postgrest = "2.28.0"
realtime = "2.28.0"
storage3 = "2.28.0"
supabase-auth = "2.28.0"
supabase-functions = "2.28.0"
postgrest = "2.27.2"
realtime = "2.27.2"
storage3 = "2.27.2"
supabase-auth = "2.27.2"
supabase-functions = "2.27.2"
yarl = ">=1.22.0"
[[package]]
name = "supabase-auth"
version = "2.28.0"
version = "2.27.2"
description = "Python Client Library for Supabase Auth"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "supabase_auth-2.28.0-py3-none-any.whl", hash = "sha256:2ac85026cc285054c7fa6d41924f3a333e9ec298c013e5b5e1754039ba7caec9"},
{file = "supabase_auth-2.28.0.tar.gz", hash = "sha256:2bb8f18ff39934e44b28f10918db965659f3735cd6fbfcc022fe0b82dbf8233e"},
{file = "supabase_auth-2.27.2-py3-none-any.whl", hash = "sha256:78ec25b11314d0a9527a7205f3b1c72560dccdc11b38392f80297ef98664ee91"},
{file = "supabase_auth-2.27.2.tar.gz", hash = "sha256:0f5bcc79b3677cb42e9d321f3c559070cfa40d6a29a67672cc8382fb7dc2fe97"},
]
[package.dependencies]
@@ -2526,14 +2525,14 @@ pyjwt = {version = ">=2.10.1", extras = ["crypto"]}
[[package]]
name = "supabase-functions"
version = "2.28.0"
version = "2.27.2"
description = "Library for Supabase Functions"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "supabase_functions-2.28.0-py3-none-any.whl", hash = "sha256:30bf2d586f8df285faf0621bb5d5bb3ec3157234fc820553ca156f009475e4ae"},
{file = "supabase_functions-2.28.0.tar.gz", hash = "sha256:db3dddfc37aca5858819eb461130968473bd8c75bd284581013958526dac718b"},
{file = "supabase_functions-2.27.2-py3-none-any.whl", hash = "sha256:db480efc669d0bca07605b9b6f167312af43121adcc842a111f79bea416ef754"},
{file = "supabase_functions-2.27.2.tar.gz", hash = "sha256:d0c8266207a94371cb3fd35ad3c7f025b78a97cf026861e04ccd35ac1775f80b"},
]
[package.dependencies]
@@ -2912,4 +2911,4 @@ type = ["pytest-mypy"]
[metadata]
lock-version = "2.1"
python-versions = ">=3.10,<4.0"
content-hash = "9619cae908ad38fa2c48016a58bcf4241f6f5793aa0e6cc140276e91c433cbbb"
content-hash = "40eae94995dc0a388fa832ed4af9b6137f28d5b5ced3aaea70d5f91d4d9a179d"

View File

@@ -11,14 +11,14 @@ python = ">=3.10,<4.0"
colorama = "^0.4.6"
cryptography = "^46.0"
expiringdict = "^1.2.2"
fastapi = "^0.128.7"
fastapi = "^0.128.0"
google-cloud-logging = "^3.13.0"
launchdarkly-server-sdk = "^9.15.0"
launchdarkly-server-sdk = "^9.14.1"
pydantic = "^2.12.5"
pydantic-settings = "^2.12.0"
pyjwt = { version = "^2.11.0", extras = ["crypto"] }
redis = "^6.2.0"
supabase = "^2.28.0"
supabase = "^2.27.2"
uvicorn = "^0.40.0"
[tool.poetry.group.dev.dependencies]

View File

@@ -104,12 +104,6 @@ TWITTER_CLIENT_SECRET=
# Make a new workspace for your OAuth APP -- trust me
# https://linear.app/settings/api/applications/new
# Callback URL: http://localhost:3000/auth/integrations/oauth_callback
LINEAR_API_KEY=
# Linear project and team IDs for the feature request tracker.
# Find these in your Linear workspace URL: linear.app/<workspace>/project/<project-id>
# and in team settings. Used by the chat copilot to file and search feature requests.
LINEAR_FEATURE_REQUEST_PROJECT_ID=
LINEAR_FEATURE_REQUEST_TEAM_ID=
LINEAR_CLIENT_ID=
LINEAR_CLIENT_SECRET=

View File

@@ -1,5 +1,3 @@
# ============================ DEPENDENCY BUILDER ============================ #
FROM debian:13-slim AS builder
# Set environment variables
@@ -53,9 +51,7 @@ COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/parti
COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
RUN poetry run prisma generate && poetry run gen-prisma-stub
# ============================== BACKEND SERVER ============================== #
FROM debian:13-slim AS server
FROM debian:13-slim AS server_dependencies
WORKDIR /app
@@ -66,15 +62,22 @@ ENV POETRY_HOME=/opt/poetry \
DEBIAN_FRONTEND=noninteractive
ENV PATH=/opt/poetry/bin:$PATH
# Install Python, FFmpeg, and ImageMagick (required for video processing blocks)
# Using --no-install-recommends saves ~650MB by skipping unnecessary deps like llvm, mesa, etc.
RUN apt-get update && apt-get install -y --no-install-recommends \
# Install Python, FFmpeg, ImageMagick, and CLI tools for agent use.
# bubblewrap provides OS-level sandbox (whitelist-only FS + no network)
# for the bash_exec MCP tool.
RUN apt-get update && apt-get install -y \
python3.13 \
python3-pip \
ffmpeg \
imagemagick \
jq \
ripgrep \
tree \
bubblewrap \
&& rm -rf /var/lib/apt/lists/*
# Copy only necessary files from builder
COPY --from=builder /app /app
COPY --from=builder /usr/local/lib/python3* /usr/local/lib/python3*
COPY --from=builder /usr/local/bin/poetry /usr/local/bin/poetry
# Copy Node.js installation for Prisma
@@ -84,54 +87,30 @@ COPY --from=builder /usr/bin/npm /usr/bin/npm
COPY --from=builder /usr/bin/npx /usr/bin/npx
COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries
WORKDIR /app/autogpt_platform/backend
# Copy only the .venv from builder (not the entire /app directory)
# The .venv includes the generated Prisma client
COPY --from=builder /app/autogpt_platform/backend/.venv ./.venv
ENV PATH="/app/autogpt_platform/backend/.venv/bin:$PATH"
# Copy dependency files + autogpt_libs (path dependency)
COPY autogpt_platform/autogpt_libs /app/autogpt_platform/autogpt_libs
COPY autogpt_platform/backend/poetry.lock autogpt_platform/backend/pyproject.toml ./
RUN mkdir -p /app/autogpt_platform/autogpt_libs
RUN mkdir -p /app/autogpt_platform/backend
# Copy backend code + docs (for Copilot docs search)
COPY autogpt_platform/backend ./
COPY autogpt_platform/autogpt_libs /app/autogpt_platform/autogpt_libs
COPY autogpt_platform/backend/poetry.lock autogpt_platform/backend/pyproject.toml /app/autogpt_platform/backend/
WORKDIR /app/autogpt_platform/backend
FROM server_dependencies AS migrate
# Migration stage only needs schema and migrations - much lighter than full backend
COPY autogpt_platform/backend/schema.prisma /app/autogpt_platform/backend/
COPY autogpt_platform/backend/backend/data/partial_types.py /app/autogpt_platform/backend/backend/data/partial_types.py
COPY autogpt_platform/backend/migrations /app/autogpt_platform/backend/migrations
FROM server_dependencies AS server
COPY autogpt_platform/backend /app/autogpt_platform/backend
COPY docs /app/docs
RUN poetry install --no-ansi --only-root
ENV PORT=8000
CMD ["poetry", "run", "rest"]
# =============================== DB MIGRATOR =============================== #
# Lightweight migrate stage - only needs Prisma CLI, not full Python environment
FROM debian:13-slim AS migrate
WORKDIR /app/autogpt_platform/backend
ENV DEBIAN_FRONTEND=noninteractive
# Install only what's needed for prisma migrate: Node.js and minimal Python for prisma-python
RUN apt-get update && apt-get install -y --no-install-recommends \
python3.13 \
python3-pip \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
# Copy Node.js from builder (needed for Prisma CLI)
COPY --from=builder /usr/bin/node /usr/bin/node
COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
COPY --from=builder /usr/bin/npm /usr/bin/npm
# Copy Prisma binaries
COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries
# Install prisma-client-py directly (much smaller than copying full venv)
RUN pip3 install prisma>=0.15.0 --break-system-packages
COPY autogpt_platform/backend/schema.prisma ./
COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/partial_types.py
COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
COPY autogpt_platform/backend/migrations ./migrations

View File

@@ -27,12 +27,11 @@ class ChatConfig(BaseSettings):
session_ttl: int = Field(default=43200, description="Session TTL in seconds")
# Streaming Configuration
max_context_messages: int = Field(
default=50, ge=1, le=200, description="Maximum context messages"
)
stream_timeout: int = Field(default=300, description="Stream timeout in seconds")
max_retries: int = Field(default=3, description="Maximum number of retries")
max_retries: int = Field(
default=3,
description="Max retries for fallback path (SDK handles retries internally)",
)
max_agent_runs: int = Field(default=30, description="Maximum number of agent runs")
max_agent_schedules: int = Field(
default=30, description="Maximum number of agent schedules"
@@ -93,6 +92,26 @@ class ChatConfig(BaseSettings):
description="Name of the prompt in Langfuse to fetch",
)
# Claude Agent SDK Configuration
use_claude_agent_sdk: bool = Field(
default=True,
description="Use Claude Agent SDK for chat completions",
)
claude_agent_model: str | None = Field(
default=None,
description="Model for the Claude Agent SDK path. If None, derives from "
"the `model` field by stripping the OpenRouter provider prefix.",
)
claude_agent_max_buffer_size: int = Field(
default=10 * 1024 * 1024, # 10MB (default SDK is 1MB)
description="Max buffer size in bytes for Claude Agent SDK JSON message parsing. "
"Increase if tool outputs exceed the limit.",
)
claude_agent_max_subtasks: int = Field(
default=10,
description="Max number of sub-agent Tasks the SDK can spawn per session.",
)
# Extended thinking configuration for Claude models
thinking_enabled: bool = Field(
default=True,
@@ -138,6 +157,17 @@ class ChatConfig(BaseSettings):
v = os.getenv("CHAT_INTERNAL_API_KEY")
return v
@field_validator("use_claude_agent_sdk", mode="before")
@classmethod
def get_use_claude_agent_sdk(cls, v):
"""Get use_claude_agent_sdk from environment if not provided."""
# Check environment variable - default to True if not set
env_val = os.getenv("CHAT_USE_CLAUDE_AGENT_SDK", "").lower()
if env_val:
return env_val in ("true", "1", "yes", "on")
# Default to True (SDK enabled by default)
return True if v is None else v
# Prompt paths for different contexts
PROMPT_PATHS: dict[str, str] = {
"default": "prompts/chat_system.md",

View File

@@ -334,9 +334,8 @@ async def _get_session_from_cache(session_id: str) -> ChatSession | None:
try:
session = ChatSession.model_validate_json(raw_session)
logger.info(
f"Loading session {session_id} from cache: "
f"message_count={len(session.messages)}, "
f"roles={[m.role for m in session.messages]}"
f"[CACHE] Loaded session {session_id}: {len(session.messages)} messages, "
f"last_roles={[m.role for m in session.messages[-3:]]}" # Last 3 roles
)
return session
except Exception as e:
@@ -378,11 +377,9 @@ async def _get_session_from_db(session_id: str) -> ChatSession | None:
return None
messages = prisma_session.Messages
logger.info(
f"Loading session {session_id} from DB: "
f"has_messages={messages is not None}, "
f"message_count={len(messages) if messages else 0}, "
f"roles={[m.role for m in messages] if messages else []}"
logger.debug(
f"[DB] Loaded session {session_id}: {len(messages) if messages else 0} messages, "
f"roles={[m.role for m in messages[-3:]] if messages else []}" # Last 3 roles
)
return ChatSession.from_db(prisma_session, messages)
@@ -433,10 +430,9 @@ async def _save_session_to_db(
"function_call": msg.function_call,
}
)
logger.info(
f"Saving {len(new_messages)} new messages to DB for session {session.session_id}: "
f"roles={[m['role'] for m in messages_data]}, "
f"start_sequence={existing_message_count}"
logger.debug(
f"[DB] Saving {len(new_messages)} messages to session {session.session_id}, "
f"roles={[m['role'] for m in messages_data]}"
)
await chat_db.add_chat_messages_batch(
session_id=session.session_id,
@@ -476,7 +472,7 @@ async def get_chat_session(
logger.warning(f"Unexpected cache error for session {session_id}: {e}")
# Fall back to database
logger.info(f"Session {session_id} not in cache, checking database")
logger.debug(f"Session {session_id} not in cache, checking database")
session = await _get_session_from_db(session_id)
if session is None:
@@ -493,7 +489,6 @@ async def get_chat_session(
# Cache the session from DB
try:
await _cache_session(session)
logger.info(f"Cached session {session_id} from database")
except Exception as e:
logger.warning(f"Failed to cache session {session_id}: {e}")
@@ -558,6 +553,40 @@ async def upsert_chat_session(
return session
async def append_and_save_message(session_id: str, message: ChatMessage) -> ChatSession:
"""Atomically append a message to a session and persist it.
Acquires the session lock, re-fetches the latest session state,
appends the message, and saves — preventing message loss when
concurrent requests modify the same session.
"""
lock = await _get_session_lock(session_id)
async with lock:
session = await get_chat_session(session_id)
if session is None:
raise ValueError(f"Session {session_id} not found")
session.messages.append(message)
existing_message_count = await chat_db.get_chat_session_message_count(
session_id
)
try:
await _save_session_to_db(session, existing_message_count)
except Exception as e:
raise DatabaseError(
f"Failed to persist message to session {session_id}"
) from e
try:
await _cache_session(session)
except Exception as e:
logger.warning(f"Cache write failed for session {session_id}: {e}")
return session
async def create_chat_session(user_id: str) -> ChatSession:
"""Create a new chat session and persist it.
@@ -664,13 +693,19 @@ async def update_session_title(session_id: str, title: str) -> bool:
logger.warning(f"Session {session_id} not found for title update")
return False
# Invalidate cache so next fetch gets updated title
# Update title in cache if it exists (instead of invalidating).
# This prevents race conditions where cache invalidation causes
# the frontend to see stale DB data while streaming is still in progress.
try:
redis_key = _get_session_cache_key(session_id)
async_redis = await get_redis_async()
await async_redis.delete(redis_key)
cached = await _get_session_from_cache(session_id)
if cached:
cached.title = title
await _cache_session(cached)
except Exception as e:
logger.warning(f"Failed to invalidate cache for session {session_id}: {e}")
# Not critical - title will be correct on next full cache refresh
logger.warning(
f"Failed to update title in cache for session {session_id}: {e}"
)
return True
except Exception as e:

View File

@@ -1,5 +1,6 @@
"""Chat API routes for chat session management and streaming via SSE."""
import asyncio
import logging
import uuid as uuid_module
from collections.abc import AsyncGenerator
@@ -11,20 +12,28 @@ from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from backend.util.exceptions import NotFoundError
from backend.util.feature_flag import Flag, is_feature_enabled
from . import service as chat_service
from . import stream_registry
from .completion_handler import process_operation_failure, process_operation_success
from .config import ChatConfig
from .model import ChatSession, create_chat_session, get_chat_session, get_user_sessions
from .response_model import StreamFinish, StreamHeartbeat
from .model import (
ChatMessage,
ChatSession,
append_and_save_message,
create_chat_session,
get_chat_session,
get_user_sessions,
)
from .response_model import StreamError, StreamFinish, StreamHeartbeat, StreamStart
from .sdk import service as sdk_service
from .tools.models import (
AgentDetailsResponse,
AgentOutputResponse,
AgentPreviewResponse,
AgentSavedResponse,
AgentsFoundResponse,
BlockDetailsResponse,
BlockListResponse,
BlockOutputResponse,
ClarificationNeededResponse,
@@ -41,6 +50,7 @@ from .tools.models import (
SetupRequirementsResponse,
UnderstandingUpdatedResponse,
)
from .tracking import track_user_message
config = ChatConfig()
@@ -232,6 +242,10 @@ async def get_session(
active_task, last_message_id = await stream_registry.get_active_task_for_session(
session_id, user_id
)
logger.info(
f"[GET_SESSION] session={session_id}, active_task={active_task is not None}, "
f"msg_count={len(messages)}, last_role={messages[-1].get('role') if messages else 'none'}"
)
if active_task:
# Filter out the in-progress assistant message from the session response.
# The client will receive the complete assistant response through the SSE
@@ -301,10 +315,9 @@ async def stream_chat_post(
f"user={user_id}, message_len={len(request.message)}",
extra={"json_fields": log_meta},
)
session = await _validate_and_get_session(session_id, user_id)
logger.info(
f"[TIMING] session validated in {(time.perf_counter() - stream_start_time)*1000:.1f}ms",
f"[TIMING] session validated in {(time.perf_counter() - stream_start_time) * 1000:.1f}ms",
extra={
"json_fields": {
**log_meta,
@@ -313,6 +326,25 @@ async def stream_chat_post(
},
)
# Atomically append user message to session BEFORE creating task to avoid
# race condition where GET_SESSION sees task as "running" but message isn't
# saved yet. append_and_save_message re-fetches inside a lock to prevent
# message loss from concurrent requests.
if request.message:
message = ChatMessage(
role="user" if request.is_user_message else "assistant",
content=request.message,
)
if request.is_user_message:
track_user_message(
user_id=user_id,
session_id=session_id,
message_length=len(request.message),
)
logger.info(f"[STREAM] Saving user message to session {session_id}")
session = await append_and_save_message(session_id, message)
logger.info(f"[STREAM] User message saved for session {session_id}")
# Create a task in the stream registry for reconnection support
task_id = str(uuid_module.uuid4())
operation_id = str(uuid_module.uuid4())
@@ -328,7 +360,7 @@ async def stream_chat_post(
operation_id=operation_id,
)
logger.info(
f"[TIMING] create_task completed in {(time.perf_counter() - task_create_start)*1000:.1f}ms",
f"[TIMING] create_task completed in {(time.perf_counter() - task_create_start) * 1000:.1f}ms",
extra={
"json_fields": {
**log_meta,
@@ -349,15 +381,47 @@ async def stream_chat_post(
first_chunk_time, ttfc = None, None
chunk_count = 0
try:
async for chunk in chat_service.stream_chat_completion(
# Emit a start event with task_id for reconnection
start_chunk = StreamStart(messageId=task_id, taskId=task_id)
await stream_registry.publish_chunk(task_id, start_chunk)
logger.info(
f"[TIMING] StreamStart published at {(time_module.perf_counter() - gen_start_time) * 1000:.1f}ms",
extra={
"json_fields": {
**log_meta,
"elapsed_ms": (time_module.perf_counter() - gen_start_time)
* 1000,
}
},
)
# Choose service based on LaunchDarkly flag (falls back to config default)
use_sdk = await is_feature_enabled(
Flag.COPILOT_SDK,
user_id or "anonymous",
default=config.use_claude_agent_sdk,
)
stream_fn = (
sdk_service.stream_chat_completion_sdk
if use_sdk
else chat_service.stream_chat_completion
)
logger.info(
f"[TIMING] Calling {'sdk' if use_sdk else 'standard'} stream_chat_completion",
extra={"json_fields": log_meta},
)
# Pass message=None since we already added it to the session above
async for chunk in stream_fn(
session_id,
request.message,
None, # Message already in session
is_user_message=request.is_user_message,
user_id=user_id,
session=session, # Pass pre-fetched session to avoid double-fetch
session=session, # Pass session with message already added
context=request.context,
_task_id=task_id, # Pass task_id so service emits start with taskId for reconnection
):
# Skip duplicate StreamStart — we already published one above
if isinstance(chunk, StreamStart):
continue
chunk_count += 1
if first_chunk_time is None:
first_chunk_time = time_module.perf_counter()
@@ -378,7 +442,7 @@ async def stream_chat_post(
gen_end_time = time_module.perf_counter()
total_time = (gen_end_time - gen_start_time) * 1000
logger.info(
f"[TIMING] run_ai_generation FINISHED in {total_time/1000:.1f}s; "
f"[TIMING] run_ai_generation FINISHED in {total_time / 1000:.1f}s; "
f"task={task_id}, session={session_id}, "
f"ttfc={ttfc or -1:.2f}s, n_chunks={chunk_count}",
extra={
@@ -405,6 +469,17 @@ async def stream_chat_post(
}
},
)
# Publish a StreamError so the frontend can display an error message
try:
await stream_registry.publish_chunk(
task_id,
StreamError(
errorText="An error occurred. Please try again.",
code="stream_error",
),
)
except Exception:
pass # Best-effort; mark_task_completed will publish StreamFinish
await stream_registry.mark_task_completed(task_id, "failed")
# Start the AI generation in a background task
@@ -507,8 +582,14 @@ async def stream_chat_post(
"json_fields": {**log_meta, "elapsed_ms": elapsed, "error": str(e)}
},
)
# Surface error to frontend so it doesn't appear stuck
yield StreamError(
errorText="An error occurred. Please try again.",
code="stream_error",
).to_sse()
yield StreamFinish().to_sse()
finally:
# Unsubscribe when client disconnects or stream ends to prevent resource leak
# Unsubscribe when client disconnects or stream ends
if subscriber_queue is not None:
try:
await stream_registry.unsubscribe_from_task(
@@ -752,8 +833,6 @@ async def stream_task(
)
async def event_generator() -> AsyncGenerator[str, None]:
import asyncio
heartbeat_interval = 15.0 # Send heartbeat every 15 seconds
try:
while True:
@@ -972,7 +1051,6 @@ ToolResponseUnion = (
| AgentSavedResponse
| ClarificationNeededResponse
| BlockListResponse
| BlockDetailsResponse
| BlockOutputResponse
| DocSearchResultsResponse
| DocPageResponse

View File

@@ -0,0 +1,14 @@
"""Claude Agent SDK integration for CoPilot.
This module provides the integration layer between the Claude Agent SDK
and the existing CoPilot tool system, enabling drop-in replacement of
the current LLM orchestration with the battle-tested Claude Agent SDK.
"""
from .service import stream_chat_completion_sdk
from .tool_adapter import create_copilot_mcp_server
__all__ = [
"stream_chat_completion_sdk",
"create_copilot_mcp_server",
]

View File

@@ -0,0 +1,198 @@
"""Response adapter for converting Claude Agent SDK messages to Vercel AI SDK format.
This module provides the adapter layer that converts streaming messages from
the Claude Agent SDK into the Vercel AI SDK UI Stream Protocol format that
the frontend expects.
"""
import json
import logging
import uuid
from claude_agent_sdk import (
AssistantMessage,
Message,
ResultMessage,
SystemMessage,
TextBlock,
ToolResultBlock,
ToolUseBlock,
UserMessage,
)
from backend.api.features.chat.response_model import (
StreamBaseResponse,
StreamError,
StreamFinish,
StreamFinishStep,
StreamStart,
StreamStartStep,
StreamTextDelta,
StreamTextEnd,
StreamTextStart,
StreamToolInputAvailable,
StreamToolInputStart,
StreamToolOutputAvailable,
)
from backend.api.features.chat.sdk.tool_adapter import (
MCP_TOOL_PREFIX,
pop_pending_tool_output,
)
logger = logging.getLogger(__name__)
class SDKResponseAdapter:
"""Adapter for converting Claude Agent SDK messages to Vercel AI SDK format.
This class maintains state during a streaming session to properly track
text blocks, tool calls, and message lifecycle.
"""
def __init__(self, message_id: str | None = None):
self.message_id = message_id or str(uuid.uuid4())
self.text_block_id = str(uuid.uuid4())
self.has_started_text = False
self.has_ended_text = False
self.current_tool_calls: dict[str, dict[str, str]] = {}
self.task_id: str | None = None
self.step_open = False
def set_task_id(self, task_id: str) -> None:
"""Set the task ID for reconnection support."""
self.task_id = task_id
def convert_message(self, sdk_message: Message) -> list[StreamBaseResponse]:
"""Convert a single SDK message to Vercel AI SDK format."""
responses: list[StreamBaseResponse] = []
if isinstance(sdk_message, SystemMessage):
if sdk_message.subtype == "init":
responses.append(
StreamStart(messageId=self.message_id, taskId=self.task_id)
)
# Open the first step (matches non-SDK: StreamStart then StreamStartStep)
responses.append(StreamStartStep())
self.step_open = True
elif isinstance(sdk_message, AssistantMessage):
# After tool results, the SDK sends a new AssistantMessage for the
# next LLM turn. Open a new step if the previous one was closed.
if not self.step_open:
responses.append(StreamStartStep())
self.step_open = True
for block in sdk_message.content:
if isinstance(block, TextBlock):
if block.text:
self._ensure_text_started(responses)
responses.append(
StreamTextDelta(id=self.text_block_id, delta=block.text)
)
elif isinstance(block, ToolUseBlock):
self._end_text_if_open(responses)
# Strip MCP prefix so frontend sees "find_block"
# instead of "mcp__copilot__find_block".
tool_name = block.name.removeprefix(MCP_TOOL_PREFIX)
responses.append(
StreamToolInputStart(toolCallId=block.id, toolName=tool_name)
)
responses.append(
StreamToolInputAvailable(
toolCallId=block.id,
toolName=tool_name,
input=block.input,
)
)
self.current_tool_calls[block.id] = {"name": tool_name}
elif isinstance(sdk_message, UserMessage):
# UserMessage carries tool results back from tool execution.
content = sdk_message.content
blocks = content if isinstance(content, list) else []
for block in blocks:
if isinstance(block, ToolResultBlock) and block.tool_use_id:
tool_info = self.current_tool_calls.get(block.tool_use_id, {})
tool_name = tool_info.get("name", "unknown")
# Prefer the stashed full output over the SDK's
# (potentially truncated) ToolResultBlock content.
# The SDK truncates large results, writing them to disk,
# which breaks frontend widget parsing.
output = pop_pending_tool_output(tool_name) or (
_extract_tool_output(block.content)
)
responses.append(
StreamToolOutputAvailable(
toolCallId=block.tool_use_id,
toolName=tool_name,
output=output,
success=not (block.is_error or False),
)
)
# Close the current step after tool results — the next
# AssistantMessage will open a new step for the continuation.
if self.step_open:
responses.append(StreamFinishStep())
self.step_open = False
elif isinstance(sdk_message, ResultMessage):
self._end_text_if_open(responses)
# Close the step before finishing.
if self.step_open:
responses.append(StreamFinishStep())
self.step_open = False
if sdk_message.subtype == "success":
responses.append(StreamFinish())
elif sdk_message.subtype in ("error", "error_during_execution"):
error_msg = getattr(sdk_message, "result", None) or "Unknown error"
responses.append(
StreamError(errorText=str(error_msg), code="sdk_error")
)
responses.append(StreamFinish())
else:
logger.debug(f"Unhandled SDK message type: {type(sdk_message).__name__}")
return responses
def _ensure_text_started(self, responses: list[StreamBaseResponse]) -> None:
"""Start (or restart) a text block if needed."""
if not self.has_started_text or self.has_ended_text:
if self.has_ended_text:
self.text_block_id = str(uuid.uuid4())
self.has_ended_text = False
responses.append(StreamTextStart(id=self.text_block_id))
self.has_started_text = True
def _end_text_if_open(self, responses: list[StreamBaseResponse]) -> None:
"""End the current text block if one is open."""
if self.has_started_text and not self.has_ended_text:
responses.append(StreamTextEnd(id=self.text_block_id))
self.has_ended_text = True
def _extract_tool_output(content: str | list[dict[str, str]] | None) -> str:
"""Extract a string output from a ToolResultBlock's content field."""
if isinstance(content, str):
return content
if isinstance(content, list):
parts = [item.get("text", "") for item in content if item.get("type") == "text"]
if parts:
return "".join(parts)
try:
return json.dumps(content)
except (TypeError, ValueError):
return str(content)
if content is None:
return ""
try:
return json.dumps(content)
except (TypeError, ValueError):
return str(content)

View File

@@ -0,0 +1,366 @@
"""Unit tests for the SDK response adapter."""
from claude_agent_sdk import (
AssistantMessage,
ResultMessage,
SystemMessage,
TextBlock,
ToolResultBlock,
ToolUseBlock,
UserMessage,
)
from backend.api.features.chat.response_model import (
StreamBaseResponse,
StreamError,
StreamFinish,
StreamFinishStep,
StreamStart,
StreamStartStep,
StreamTextDelta,
StreamTextEnd,
StreamTextStart,
StreamToolInputAvailable,
StreamToolInputStart,
StreamToolOutputAvailable,
)
from .response_adapter import SDKResponseAdapter
from .tool_adapter import MCP_TOOL_PREFIX
def _adapter() -> SDKResponseAdapter:
a = SDKResponseAdapter(message_id="msg-1")
a.set_task_id("task-1")
return a
# -- SystemMessage -----------------------------------------------------------
def test_system_init_emits_start_and_step():
adapter = _adapter()
results = adapter.convert_message(SystemMessage(subtype="init", data={}))
assert len(results) == 2
assert isinstance(results[0], StreamStart)
assert results[0].messageId == "msg-1"
assert results[0].taskId == "task-1"
assert isinstance(results[1], StreamStartStep)
def test_system_non_init_emits_nothing():
adapter = _adapter()
results = adapter.convert_message(SystemMessage(subtype="other", data={}))
assert results == []
# -- AssistantMessage with TextBlock -----------------------------------------
def test_text_block_emits_step_start_and_delta():
adapter = _adapter()
msg = AssistantMessage(content=[TextBlock(text="hello")], model="test")
results = adapter.convert_message(msg)
assert len(results) == 3
assert isinstance(results[0], StreamStartStep)
assert isinstance(results[1], StreamTextStart)
assert isinstance(results[2], StreamTextDelta)
assert results[2].delta == "hello"
def test_empty_text_block_emits_only_step():
adapter = _adapter()
msg = AssistantMessage(content=[TextBlock(text="")], model="test")
results = adapter.convert_message(msg)
# Empty text skipped, but step still opens
assert len(results) == 1
assert isinstance(results[0], StreamStartStep)
def test_multiple_text_deltas_reuse_block_id():
adapter = _adapter()
msg1 = AssistantMessage(content=[TextBlock(text="a")], model="test")
msg2 = AssistantMessage(content=[TextBlock(text="b")], model="test")
r1 = adapter.convert_message(msg1)
r2 = adapter.convert_message(msg2)
# First gets step+start+delta, second only delta (block & step already started)
assert len(r1) == 3
assert isinstance(r1[0], StreamStartStep)
assert isinstance(r1[1], StreamTextStart)
assert len(r2) == 1
assert isinstance(r2[0], StreamTextDelta)
assert r1[1].id == r2[0].id # same block ID
# -- AssistantMessage with ToolUseBlock --------------------------------------
def test_tool_use_emits_input_start_and_available():
"""Tool names arrive with MCP prefix and should be stripped for the frontend."""
adapter = _adapter()
msg = AssistantMessage(
content=[
ToolUseBlock(
id="tool-1",
name=f"{MCP_TOOL_PREFIX}find_agent",
input={"q": "x"},
)
],
model="test",
)
results = adapter.convert_message(msg)
assert len(results) == 3
assert isinstance(results[0], StreamStartStep)
assert isinstance(results[1], StreamToolInputStart)
assert results[1].toolCallId == "tool-1"
assert results[1].toolName == "find_agent" # prefix stripped
assert isinstance(results[2], StreamToolInputAvailable)
assert results[2].toolName == "find_agent" # prefix stripped
assert results[2].input == {"q": "x"}
def test_text_then_tool_ends_text_block():
adapter = _adapter()
text_msg = AssistantMessage(content=[TextBlock(text="thinking...")], model="test")
tool_msg = AssistantMessage(
content=[ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}tool", input={})],
model="test",
)
adapter.convert_message(text_msg) # opens step + text
results = adapter.convert_message(tool_msg)
# Step already open, so: TextEnd, ToolInputStart, ToolInputAvailable
assert len(results) == 3
assert isinstance(results[0], StreamTextEnd)
assert isinstance(results[1], StreamToolInputStart)
# -- UserMessage with ToolResultBlock ----------------------------------------
def test_tool_result_emits_output_and_finish_step():
adapter = _adapter()
# First register the tool call (opens step) — SDK sends prefixed name
tool_msg = AssistantMessage(
content=[ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}find_agent", input={})],
model="test",
)
adapter.convert_message(tool_msg)
# Now send tool result
result_msg = UserMessage(
content=[ToolResultBlock(tool_use_id="t1", content="found 3 agents")]
)
results = adapter.convert_message(result_msg)
assert len(results) == 2
assert isinstance(results[0], StreamToolOutputAvailable)
assert results[0].toolCallId == "t1"
assert results[0].toolName == "find_agent" # prefix stripped
assert results[0].output == "found 3 agents"
assert results[0].success is True
assert isinstance(results[1], StreamFinishStep)
def test_tool_result_error():
adapter = _adapter()
adapter.convert_message(
AssistantMessage(
content=[
ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}run_agent", input={})
],
model="test",
)
)
result_msg = UserMessage(
content=[ToolResultBlock(tool_use_id="t1", content="timeout", is_error=True)]
)
results = adapter.convert_message(result_msg)
assert isinstance(results[0], StreamToolOutputAvailable)
assert results[0].success is False
assert isinstance(results[1], StreamFinishStep)
def test_tool_result_list_content():
adapter = _adapter()
adapter.convert_message(
AssistantMessage(
content=[ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}tool", input={})],
model="test",
)
)
result_msg = UserMessage(
content=[
ToolResultBlock(
tool_use_id="t1",
content=[
{"type": "text", "text": "line1"},
{"type": "text", "text": "line2"},
],
)
]
)
results = adapter.convert_message(result_msg)
assert isinstance(results[0], StreamToolOutputAvailable)
assert results[0].output == "line1line2"
assert isinstance(results[1], StreamFinishStep)
def test_string_user_message_ignored():
"""A plain string UserMessage (not tool results) produces no output."""
adapter = _adapter()
results = adapter.convert_message(UserMessage(content="hello"))
assert results == []
# -- ResultMessage -----------------------------------------------------------
def test_result_success_emits_finish_step_and_finish():
adapter = _adapter()
# Start some text first (opens step)
adapter.convert_message(
AssistantMessage(content=[TextBlock(text="done")], model="test")
)
msg = ResultMessage(
subtype="success",
duration_ms=100,
duration_api_ms=50,
is_error=False,
num_turns=1,
session_id="s1",
)
results = adapter.convert_message(msg)
# TextEnd + FinishStep + StreamFinish
assert len(results) == 3
assert isinstance(results[0], StreamTextEnd)
assert isinstance(results[1], StreamFinishStep)
assert isinstance(results[2], StreamFinish)
def test_result_error_emits_error_and_finish():
adapter = _adapter()
msg = ResultMessage(
subtype="error",
duration_ms=100,
duration_api_ms=50,
is_error=True,
num_turns=0,
session_id="s1",
result="API rate limited",
)
results = adapter.convert_message(msg)
# No step was open, so no FinishStep — just Error + Finish
assert len(results) == 2
assert isinstance(results[0], StreamError)
assert "API rate limited" in results[0].errorText
assert isinstance(results[1], StreamFinish)
# -- Text after tools (new block ID) ----------------------------------------
def test_text_after_tool_gets_new_block_id():
adapter = _adapter()
# Text -> Tool -> ToolResult -> Text should get a new text block ID and step
adapter.convert_message(
AssistantMessage(content=[TextBlock(text="before")], model="test")
)
adapter.convert_message(
AssistantMessage(
content=[ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}tool", input={})],
model="test",
)
)
# Send tool result (closes step)
adapter.convert_message(
UserMessage(content=[ToolResultBlock(tool_use_id="t1", content="ok")])
)
results = adapter.convert_message(
AssistantMessage(content=[TextBlock(text="after")], model="test")
)
# Should get StreamStartStep (new step) + StreamTextStart (new block) + StreamTextDelta
assert len(results) == 3
assert isinstance(results[0], StreamStartStep)
assert isinstance(results[1], StreamTextStart)
assert isinstance(results[2], StreamTextDelta)
assert results[2].delta == "after"
# -- Full conversation flow --------------------------------------------------
def test_full_conversation_flow():
"""Simulate a complete conversation: init -> text -> tool -> result -> text -> finish."""
adapter = _adapter()
all_responses: list[StreamBaseResponse] = []
# 1. Init
all_responses.extend(
adapter.convert_message(SystemMessage(subtype="init", data={}))
)
# 2. Assistant text
all_responses.extend(
adapter.convert_message(
AssistantMessage(content=[TextBlock(text="Let me search")], model="test")
)
)
# 3. Tool use
all_responses.extend(
adapter.convert_message(
AssistantMessage(
content=[
ToolUseBlock(
id="t1",
name=f"{MCP_TOOL_PREFIX}find_agent",
input={"query": "email"},
)
],
model="test",
)
)
)
# 4. Tool result
all_responses.extend(
adapter.convert_message(
UserMessage(
content=[ToolResultBlock(tool_use_id="t1", content="Found 2 agents")]
)
)
)
# 5. More text
all_responses.extend(
adapter.convert_message(
AssistantMessage(content=[TextBlock(text="I found 2")], model="test")
)
)
# 6. Result
all_responses.extend(
adapter.convert_message(
ResultMessage(
subtype="success",
duration_ms=500,
duration_api_ms=400,
is_error=False,
num_turns=2,
session_id="s1",
)
)
)
types = [type(r).__name__ for r in all_responses]
assert types == [
"StreamStart",
"StreamStartStep", # step 1: text + tool call
"StreamTextStart",
"StreamTextDelta", # "Let me search"
"StreamTextEnd", # closed before tool
"StreamToolInputStart",
"StreamToolInputAvailable",
"StreamToolOutputAvailable", # tool result
"StreamFinishStep", # step 1 closed after tool result
"StreamStartStep", # step 2: continuation text
"StreamTextStart", # new block after tool
"StreamTextDelta", # "I found 2"
"StreamTextEnd", # closed by result
"StreamFinishStep", # step 2 closed
"StreamFinish",
]

View File

@@ -0,0 +1,299 @@
"""Security hooks for Claude Agent SDK integration.
This module provides security hooks that validate tool calls before execution,
ensuring multi-user isolation and preventing unauthorized operations.
"""
import json
import logging
import os
import re
from typing import Any, cast
from backend.api.features.chat.sdk.tool_adapter import MCP_TOOL_PREFIX
logger = logging.getLogger(__name__)
# Tools that are blocked entirely (CLI/system access).
# "Bash" (capital) is the SDK built-in — it's NOT in allowed_tools but blocked
# here as defence-in-depth. The agent uses mcp__copilot__bash_exec instead,
# which has kernel-level network isolation (unshare --net).
BLOCKED_TOOLS = {
"Bash",
"bash",
"shell",
"exec",
"terminal",
"command",
}
# Tools allowed only when their path argument stays within the SDK workspace.
# The SDK uses these to handle oversized tool results (writes to tool-results/
# files, then reads them back) and for workspace file operations.
WORKSPACE_SCOPED_TOOLS = {"Read", "Write", "Edit", "Glob", "Grep"}
# Dangerous patterns in tool inputs
DANGEROUS_PATTERNS = [
r"sudo",
r"rm\s+-rf",
r"dd\s+if=",
r"/etc/passwd",
r"/etc/shadow",
r"chmod\s+777",
r"curl\s+.*\|.*sh",
r"wget\s+.*\|.*sh",
r"eval\s*\(",
r"exec\s*\(",
r"__import__",
r"os\.system",
r"subprocess",
]
def _deny(reason: str) -> dict[str, Any]:
"""Return a hook denial response."""
return {
"hookSpecificOutput": {
"hookEventName": "PreToolUse",
"permissionDecision": "deny",
"permissionDecisionReason": reason,
}
}
def _validate_workspace_path(
tool_name: str, tool_input: dict[str, Any], sdk_cwd: str | None
) -> dict[str, Any]:
"""Validate that a workspace-scoped tool only accesses allowed paths.
Allowed directories:
- The SDK working directory (``/tmp/copilot-<session>/``)
- The SDK tool-results directory (``~/.claude/projects/…/tool-results/``)
"""
path = tool_input.get("file_path") or tool_input.get("path") or ""
if not path:
# Glob/Grep without a path default to cwd which is already sandboxed
return {}
# Resolve relative paths against sdk_cwd (the SDK sets cwd so the LLM
# naturally uses relative paths like "test.txt" instead of absolute ones).
# Tilde paths (~/) are home-dir references, not relative — expand first.
if path.startswith("~"):
resolved = os.path.normpath(os.path.expanduser(path))
elif not os.path.isabs(path) and sdk_cwd:
resolved = os.path.normpath(os.path.join(sdk_cwd, path))
else:
resolved = os.path.normpath(path)
# Allow access within the SDK working directory
if sdk_cwd:
norm_cwd = os.path.normpath(sdk_cwd)
if resolved.startswith(norm_cwd + os.sep) or resolved == norm_cwd:
return {}
# Allow access to ~/.claude/projects/*/tool-results/ (big tool results)
claude_dir = os.path.normpath(os.path.expanduser("~/.claude/projects"))
if resolved.startswith(claude_dir + os.sep) and "tool-results" in resolved:
return {}
logger.warning(
f"Blocked {tool_name} outside workspace: {path} (resolved={resolved})"
)
workspace_hint = f" Allowed workspace: {sdk_cwd}" if sdk_cwd else ""
return _deny(
f"[SECURITY] Tool '{tool_name}' can only access files within the workspace "
f"directory.{workspace_hint} "
"This is enforced by the platform and cannot be bypassed."
)
def _validate_tool_access(
tool_name: str, tool_input: dict[str, Any], sdk_cwd: str | None = None
) -> dict[str, Any]:
"""Validate that a tool call is allowed.
Returns:
Empty dict to allow, or dict with hookSpecificOutput to deny
"""
# Block forbidden tools
if tool_name in BLOCKED_TOOLS:
logger.warning(f"Blocked tool access attempt: {tool_name}")
return _deny(
f"[SECURITY] Tool '{tool_name}' is blocked for security. "
"This is enforced by the platform and cannot be bypassed. "
"Use the CoPilot-specific MCP tools instead."
)
# Workspace-scoped tools: allowed only within the SDK workspace directory
if tool_name in WORKSPACE_SCOPED_TOOLS:
return _validate_workspace_path(tool_name, tool_input, sdk_cwd)
# Check for dangerous patterns in tool input
# Use json.dumps for predictable format (str() produces Python repr)
input_str = json.dumps(tool_input) if tool_input else ""
for pattern in DANGEROUS_PATTERNS:
if re.search(pattern, input_str, re.IGNORECASE):
logger.warning(
f"Blocked dangerous pattern in tool input: {pattern} in {tool_name}"
)
return _deny(
"[SECURITY] Input contains a blocked pattern. "
"This is enforced by the platform and cannot be bypassed."
)
return {}
def _validate_user_isolation(
tool_name: str, tool_input: dict[str, Any], user_id: str | None
) -> dict[str, Any]:
"""Validate that tool calls respect user isolation."""
# For workspace file tools, ensure path doesn't escape
if "workspace" in tool_name.lower():
path = tool_input.get("path", "") or tool_input.get("file_path", "")
if path:
# Check for path traversal
if ".." in path or path.startswith("/"):
logger.warning(
f"Blocked path traversal attempt: {path} by user {user_id}"
)
return {
"hookSpecificOutput": {
"hookEventName": "PreToolUse",
"permissionDecision": "deny",
"permissionDecisionReason": "Path traversal not allowed",
}
}
return {}
def create_security_hooks(
user_id: str | None,
sdk_cwd: str | None = None,
max_subtasks: int = 3,
) -> dict[str, Any]:
"""Create the security hooks configuration for Claude Agent SDK.
Includes security validation and observability hooks:
- PreToolUse: Security validation before tool execution
- PostToolUse: Log successful tool executions
- PostToolUseFailure: Log and handle failed tool executions
- PreCompact: Log context compaction events (SDK handles compaction automatically)
Args:
user_id: Current user ID for isolation validation
sdk_cwd: SDK working directory for workspace-scoped tool validation
max_subtasks: Maximum Task (sub-agent) spawns allowed per session
Returns:
Hooks configuration dict for ClaudeAgentOptions
"""
try:
from claude_agent_sdk import HookMatcher
from claude_agent_sdk.types import HookContext, HookInput, SyncHookJSONOutput
# Per-session counter for Task sub-agent spawns
task_spawn_count = 0
async def pre_tool_use_hook(
input_data: HookInput,
tool_use_id: str | None,
context: HookContext,
) -> SyncHookJSONOutput:
"""Combined pre-tool-use validation hook."""
nonlocal task_spawn_count
_ = context # unused but required by signature
tool_name = cast(str, input_data.get("tool_name", ""))
tool_input = cast(dict[str, Any], input_data.get("tool_input", {}))
# Rate-limit Task (sub-agent) spawns per session
if tool_name == "Task":
task_spawn_count += 1
if task_spawn_count > max_subtasks:
logger.warning(
f"[SDK] Task limit reached ({max_subtasks}), user={user_id}"
)
return cast(
SyncHookJSONOutput,
_deny(
f"Maximum {max_subtasks} sub-tasks per session. "
"Please continue in the main conversation."
),
)
# Strip MCP prefix for consistent validation
is_copilot_tool = tool_name.startswith(MCP_TOOL_PREFIX)
clean_name = tool_name.removeprefix(MCP_TOOL_PREFIX)
# Only block non-CoPilot tools; our MCP-registered tools
# (including Read for oversized results) are already sandboxed.
if not is_copilot_tool:
result = _validate_tool_access(clean_name, tool_input, sdk_cwd)
if result:
return cast(SyncHookJSONOutput, result)
# Validate user isolation
result = _validate_user_isolation(clean_name, tool_input, user_id)
if result:
return cast(SyncHookJSONOutput, result)
logger.debug(f"[SDK] Tool start: {tool_name}, user={user_id}")
return cast(SyncHookJSONOutput, {})
async def post_tool_use_hook(
input_data: HookInput,
tool_use_id: str | None,
context: HookContext,
) -> SyncHookJSONOutput:
"""Log successful tool executions for observability."""
_ = context
tool_name = cast(str, input_data.get("tool_name", ""))
logger.debug(f"[SDK] Tool success: {tool_name}, tool_use_id={tool_use_id}")
return cast(SyncHookJSONOutput, {})
async def post_tool_failure_hook(
input_data: HookInput,
tool_use_id: str | None,
context: HookContext,
) -> SyncHookJSONOutput:
"""Log failed tool executions for debugging."""
_ = context
tool_name = cast(str, input_data.get("tool_name", ""))
error = input_data.get("error", "Unknown error")
logger.warning(
f"[SDK] Tool failed: {tool_name}, error={error}, "
f"user={user_id}, tool_use_id={tool_use_id}"
)
return cast(SyncHookJSONOutput, {})
async def pre_compact_hook(
input_data: HookInput,
tool_use_id: str | None,
context: HookContext,
) -> SyncHookJSONOutput:
"""Log when SDK triggers context compaction.
The SDK automatically compacts conversation history when it grows too large.
This hook provides visibility into when compaction happens.
"""
_ = context, tool_use_id
trigger = input_data.get("trigger", "auto")
logger.info(
f"[SDK] Context compaction triggered: {trigger}, user={user_id}"
)
return cast(SyncHookJSONOutput, {})
return {
"PreToolUse": [HookMatcher(matcher="*", hooks=[pre_tool_use_hook])],
"PostToolUse": [HookMatcher(matcher="*", hooks=[post_tool_use_hook])],
"PostToolUseFailure": [
HookMatcher(matcher="*", hooks=[post_tool_failure_hook])
],
"PreCompact": [HookMatcher(matcher="*", hooks=[pre_compact_hook])],
}
except ImportError:
# Fallback for when SDK isn't available - return empty hooks
logger.warning("claude-agent-sdk not available, security hooks disabled")
return {}

View File

@@ -0,0 +1,165 @@
"""Unit tests for SDK security hooks."""
import os
from .security_hooks import _validate_tool_access, _validate_user_isolation
SDK_CWD = "/tmp/copilot-abc123"
def _is_denied(result: dict) -> bool:
hook = result.get("hookSpecificOutput", {})
return hook.get("permissionDecision") == "deny"
# -- Blocked tools -----------------------------------------------------------
def test_blocked_tools_denied():
for tool in ("bash", "shell", "exec", "terminal", "command"):
result = _validate_tool_access(tool, {})
assert _is_denied(result), f"{tool} should be blocked"
def test_unknown_tool_allowed():
result = _validate_tool_access("SomeCustomTool", {})
assert result == {}
# -- Workspace-scoped tools --------------------------------------------------
def test_read_within_workspace_allowed():
result = _validate_tool_access(
"Read", {"file_path": f"{SDK_CWD}/file.txt"}, sdk_cwd=SDK_CWD
)
assert result == {}
def test_write_within_workspace_allowed():
result = _validate_tool_access(
"Write", {"file_path": f"{SDK_CWD}/output.json"}, sdk_cwd=SDK_CWD
)
assert result == {}
def test_edit_within_workspace_allowed():
result = _validate_tool_access(
"Edit", {"file_path": f"{SDK_CWD}/src/main.py"}, sdk_cwd=SDK_CWD
)
assert result == {}
def test_glob_within_workspace_allowed():
result = _validate_tool_access("Glob", {"path": f"{SDK_CWD}/src"}, sdk_cwd=SDK_CWD)
assert result == {}
def test_grep_within_workspace_allowed():
result = _validate_tool_access("Grep", {"path": f"{SDK_CWD}/src"}, sdk_cwd=SDK_CWD)
assert result == {}
def test_read_outside_workspace_denied():
result = _validate_tool_access(
"Read", {"file_path": "/etc/passwd"}, sdk_cwd=SDK_CWD
)
assert _is_denied(result)
def test_write_outside_workspace_denied():
result = _validate_tool_access(
"Write", {"file_path": "/home/user/secrets.txt"}, sdk_cwd=SDK_CWD
)
assert _is_denied(result)
def test_traversal_attack_denied():
result = _validate_tool_access(
"Read",
{"file_path": f"{SDK_CWD}/../../etc/passwd"},
sdk_cwd=SDK_CWD,
)
assert _is_denied(result)
def test_no_path_allowed():
"""Glob/Grep without a path argument defaults to cwd — should pass."""
result = _validate_tool_access("Glob", {}, sdk_cwd=SDK_CWD)
assert result == {}
def test_read_no_cwd_denies_absolute():
"""If no sdk_cwd is set, absolute paths are denied."""
result = _validate_tool_access("Read", {"file_path": "/tmp/anything"})
assert _is_denied(result)
# -- Tool-results directory --------------------------------------------------
def test_read_tool_results_allowed():
home = os.path.expanduser("~")
path = f"{home}/.claude/projects/-tmp-copilot-abc123/tool-results/12345.txt"
result = _validate_tool_access("Read", {"file_path": path}, sdk_cwd=SDK_CWD)
assert result == {}
def test_read_claude_projects_without_tool_results_denied():
home = os.path.expanduser("~")
path = f"{home}/.claude/projects/-tmp-copilot-abc123/settings.json"
result = _validate_tool_access("Read", {"file_path": path}, sdk_cwd=SDK_CWD)
assert _is_denied(result)
# -- Built-in Bash is blocked (use bash_exec MCP tool instead) ---------------
def test_bash_builtin_always_blocked():
"""SDK built-in Bash is blocked — bash_exec MCP tool with bubblewrap is used instead."""
result = _validate_tool_access("Bash", {"command": "echo hello"}, sdk_cwd=SDK_CWD)
assert _is_denied(result)
# -- Dangerous patterns ------------------------------------------------------
def test_dangerous_pattern_blocked():
result = _validate_tool_access("SomeTool", {"cmd": "sudo rm -rf /"})
assert _is_denied(result)
def test_subprocess_pattern_blocked():
result = _validate_tool_access("SomeTool", {"code": "subprocess.run(...)"})
assert _is_denied(result)
# -- User isolation ----------------------------------------------------------
def test_workspace_path_traversal_blocked():
result = _validate_user_isolation(
"workspace_read", {"path": "../../../etc/shadow"}, user_id="user-1"
)
assert _is_denied(result)
def test_workspace_absolute_path_blocked():
result = _validate_user_isolation(
"workspace_read", {"path": "/etc/passwd"}, user_id="user-1"
)
assert _is_denied(result)
def test_workspace_normal_path_allowed():
result = _validate_user_isolation(
"workspace_read", {"path": "src/main.py"}, user_id="user-1"
)
assert result == {}
def test_non_workspace_tool_passes_isolation():
result = _validate_user_isolation(
"find_agent", {"query": "email"}, user_id="user-1"
)
assert result == {}

View File

@@ -0,0 +1,668 @@
"""Claude Agent SDK service layer for CoPilot chat completions."""
import asyncio
import json
import logging
import os
import uuid
from collections.abc import AsyncGenerator
from typing import Any
from backend.util.exceptions import NotFoundError
from .. import stream_registry
from ..config import ChatConfig
from ..model import (
ChatMessage,
ChatSession,
get_chat_session,
update_session_title,
upsert_chat_session,
)
from ..response_model import (
StreamBaseResponse,
StreamError,
StreamFinish,
StreamStart,
StreamTextDelta,
StreamToolInputAvailable,
StreamToolOutputAvailable,
)
from ..service import (
_build_system_prompt,
_execute_long_running_tool_with_streaming,
_generate_session_title,
)
from ..tools.models import OperationPendingResponse, OperationStartedResponse
from ..tools.sandbox import WORKSPACE_PREFIX, make_session_path
from ..tracking import track_user_message
from .response_adapter import SDKResponseAdapter
from .security_hooks import create_security_hooks
from .tool_adapter import (
COPILOT_TOOL_NAMES,
LongRunningCallback,
create_copilot_mcp_server,
set_execution_context,
)
logger = logging.getLogger(__name__)
config = ChatConfig()
# Set to hold background tasks to prevent garbage collection
_background_tasks: set[asyncio.Task[Any]] = set()
_SDK_CWD_PREFIX = WORKSPACE_PREFIX
# Appended to the system prompt to inform the agent about available tools.
# The SDK built-in Bash is NOT available — use mcp__copilot__bash_exec instead,
# which has kernel-level network isolation (unshare --net).
_SDK_TOOL_SUPPLEMENT = """
## Tool notes
- The SDK built-in Bash tool is NOT available. Use the `bash_exec` MCP tool
for shell commands — it runs in a network-isolated sandbox.
- **Shared workspace**: The SDK Read/Write tools and `bash_exec` share the
same working directory. Files created by one are readable by the other.
These files are **ephemeral** — they exist only for the current session.
- **Persistent storage**: Use `write_workspace_file` / `read_workspace_file`
for files that should persist across sessions (stored in cloud storage).
- Long-running tools (create_agent, edit_agent, etc.) are handled
asynchronously. You will receive an immediate response; the actual result
is delivered to the user via a background stream.
"""
def _build_long_running_callback(user_id: str | None) -> LongRunningCallback:
"""Build a callback that delegates long-running tools to the non-SDK infrastructure.
Long-running tools (create_agent, edit_agent, etc.) are delegated to the
existing background infrastructure: stream_registry (Redis Streams),
database persistence, and SSE reconnection. This means results survive
page refreshes / pod restarts, and the frontend shows the proper loading
widget with progress updates.
The returned callback matches the ``LongRunningCallback`` signature:
``(tool_name, args, session) -> MCP response dict``.
"""
async def _callback(
tool_name: str, args: dict[str, Any], session: ChatSession
) -> dict[str, Any]:
operation_id = str(uuid.uuid4())
task_id = str(uuid.uuid4())
tool_call_id = f"sdk-{uuid.uuid4().hex[:12]}"
session_id = session.session_id
# --- Build user-friendly messages (matches non-SDK service) ---
if tool_name == "create_agent":
desc = args.get("description", "")
desc_preview = (desc[:100] + "...") if len(desc) > 100 else desc
pending_msg = (
f"Creating your agent: {desc_preview}"
if desc_preview
else "Creating agent... This may take a few minutes."
)
started_msg = (
"Agent creation started. You can close this tab - "
"check your library in a few minutes."
)
elif tool_name == "edit_agent":
changes = args.get("changes", "")
changes_preview = (changes[:100] + "...") if len(changes) > 100 else changes
pending_msg = (
f"Editing agent: {changes_preview}"
if changes_preview
else "Editing agent... This may take a few minutes."
)
started_msg = (
"Agent edit started. You can close this tab - "
"check your library in a few minutes."
)
else:
pending_msg = f"Running {tool_name}... This may take a few minutes."
started_msg = (
f"{tool_name} started. You can close this tab - "
"check back in a few minutes."
)
# --- Register task in Redis for SSE reconnection ---
await stream_registry.create_task(
task_id=task_id,
session_id=session_id,
user_id=user_id,
tool_call_id=tool_call_id,
tool_name=tool_name,
operation_id=operation_id,
)
# --- Save OperationPendingResponse to chat history ---
pending_message = ChatMessage(
role="tool",
content=OperationPendingResponse(
message=pending_msg,
operation_id=operation_id,
tool_name=tool_name,
).model_dump_json(),
tool_call_id=tool_call_id,
)
session.messages.append(pending_message)
await upsert_chat_session(session)
# --- Spawn background task (reuses non-SDK infrastructure) ---
bg_task = asyncio.create_task(
_execute_long_running_tool_with_streaming(
tool_name=tool_name,
parameters=args,
tool_call_id=tool_call_id,
operation_id=operation_id,
task_id=task_id,
session_id=session_id,
user_id=user_id,
)
)
_background_tasks.add(bg_task)
bg_task.add_done_callback(_background_tasks.discard)
await stream_registry.set_task_asyncio_task(task_id, bg_task)
logger.info(
f"[SDK] Long-running tool {tool_name} delegated to background "
f"(operation_id={operation_id}, task_id={task_id})"
)
# --- Return OperationStartedResponse as MCP tool result ---
# This flows through SDK → response adapter → frontend, triggering
# the loading widget with SSE reconnection support.
started_json = OperationStartedResponse(
message=started_msg,
operation_id=operation_id,
tool_name=tool_name,
task_id=task_id,
).model_dump_json()
return {
"content": [{"type": "text", "text": started_json}],
"isError": False,
}
return _callback
def _resolve_sdk_model() -> str | None:
"""Resolve the model name for the Claude Agent SDK CLI.
Uses ``config.claude_agent_model`` if set, otherwise derives from
``config.model`` by stripping the OpenRouter provider prefix (e.g.,
``"anthropic/claude-opus-4.6"`` → ``"claude-opus-4.6"``).
"""
if config.claude_agent_model:
return config.claude_agent_model
model = config.model
if "/" in model:
return model.split("/", 1)[1]
return model
def _build_sdk_env() -> dict[str, str]:
"""Build env vars for the SDK CLI process.
Routes API calls through OpenRouter (or a custom base_url) using
the same ``config.api_key`` / ``config.base_url`` as the non-SDK path.
This gives per-call token and cost tracking on the OpenRouter dashboard.
Only overrides ``ANTHROPIC_API_KEY`` when a valid proxy URL and auth
token are both present — otherwise returns an empty dict so the SDK
falls back to its default credentials.
"""
env: dict[str, str] = {}
if config.api_key and config.base_url:
# Strip /v1 suffix — SDK expects the base URL without a version path
base = config.base_url.rstrip("/")
if base.endswith("/v1"):
base = base[:-3]
if not base or not base.startswith("http"):
# Invalid base_url — don't override SDK defaults
return env
env["ANTHROPIC_BASE_URL"] = base
env["ANTHROPIC_AUTH_TOKEN"] = config.api_key
# Must be explicitly empty so the CLI uses AUTH_TOKEN instead
env["ANTHROPIC_API_KEY"] = ""
return env
def _make_sdk_cwd(session_id: str) -> str:
"""Create a safe, session-specific working directory path.
Delegates to :func:`~backend.api.features.chat.tools.sandbox.make_session_path`
(single source of truth for path sanitization) and adds a defence-in-depth
assertion.
"""
cwd = make_session_path(session_id)
# Defence-in-depth: normpath + startswith is a CodeQL-recognised sanitizer
cwd = os.path.normpath(cwd)
if not cwd.startswith(_SDK_CWD_PREFIX):
raise ValueError(f"SDK cwd escaped prefix: {cwd}")
return cwd
def _cleanup_sdk_tool_results(cwd: str) -> None:
"""Remove SDK tool-result files for a specific session working directory.
The SDK creates tool-result files under ~/.claude/projects/<encoded-cwd>/tool-results/.
We clean only the specific cwd's results to avoid race conditions between
concurrent sessions.
Security: cwd MUST be created by _make_sdk_cwd() which sanitizes session_id.
"""
import shutil
# Security check 1: Validate cwd is under the expected prefix
normalized = os.path.normpath(cwd)
if not normalized.startswith(_SDK_CWD_PREFIX):
logger.warning(f"[SDK] Rejecting cleanup for invalid path: {cwd}")
return
# Security check 2: Ensure no path traversal in the normalized path
if ".." in normalized:
logger.warning(f"[SDK] Rejecting cleanup for traversal attempt: {cwd}")
return
# SDK encodes the cwd path by replacing '/' with '-'
encoded_cwd = normalized.replace("/", "-")
# Construct the project directory path (known-safe home expansion)
claude_projects = os.path.expanduser("~/.claude/projects")
project_dir = os.path.join(claude_projects, encoded_cwd)
# Security check 3: Validate project_dir is under ~/.claude/projects
project_dir = os.path.normpath(project_dir)
if not project_dir.startswith(claude_projects):
logger.warning(
f"[SDK] Rejecting cleanup for escaped project path: {project_dir}"
)
return
results_dir = os.path.join(project_dir, "tool-results")
if os.path.isdir(results_dir):
for filename in os.listdir(results_dir):
file_path = os.path.join(results_dir, filename)
try:
if os.path.isfile(file_path):
os.remove(file_path)
except OSError:
pass
# Also clean up the temp cwd directory itself
try:
shutil.rmtree(normalized, ignore_errors=True)
except OSError:
pass
async def _compress_conversation_history(
session: ChatSession,
) -> list[ChatMessage]:
"""Compress prior conversation messages if they exceed the token threshold.
Uses the shared compress_context() from prompt.py which supports:
- LLM summarization of old messages (keeps recent ones intact)
- Progressive content truncation as fallback
- Middle-out deletion as last resort
Returns the compressed prior messages (everything except the current message).
"""
prior = session.messages[:-1]
if len(prior) < 2:
return prior
from backend.util.prompt import compress_context
# Convert ChatMessages to dicts for compress_context
messages_dict = []
for msg in prior:
msg_dict: dict[str, Any] = {"role": msg.role}
if msg.content:
msg_dict["content"] = msg.content
if msg.tool_calls:
msg_dict["tool_calls"] = msg.tool_calls
if msg.tool_call_id:
msg_dict["tool_call_id"] = msg.tool_call_id
messages_dict.append(msg_dict)
try:
import openai
async with openai.AsyncOpenAI(
api_key=config.api_key, base_url=config.base_url, timeout=30.0
) as client:
result = await compress_context(
messages=messages_dict,
model=config.model,
client=client,
)
except Exception as e:
logger.warning(f"[SDK] Context compression with LLM failed: {e}")
# Fall back to truncation-only (no LLM summarization)
result = await compress_context(
messages=messages_dict,
model=config.model,
client=None,
)
if result.was_compacted:
logger.info(
f"[SDK] Context compacted: {result.original_token_count} -> "
f"{result.token_count} tokens "
f"({result.messages_summarized} summarized, "
f"{result.messages_dropped} dropped)"
)
# Convert compressed dicts back to ChatMessages
return [
ChatMessage(
role=m["role"],
content=m.get("content"),
tool_calls=m.get("tool_calls"),
tool_call_id=m.get("tool_call_id"),
)
for m in result.messages
]
return prior
def _format_conversation_context(messages: list[ChatMessage]) -> str | None:
"""Format conversation messages into a context prefix for the user message.
Returns a string like:
<conversation_history>
User: hello
You responded: Hi! How can I help?
</conversation_history>
Returns None if there are no messages to format.
"""
if not messages:
return None
lines: list[str] = []
for msg in messages:
if not msg.content:
continue
if msg.role == "user":
lines.append(f"User: {msg.content}")
elif msg.role == "assistant":
lines.append(f"You responded: {msg.content}")
# Skip tool messages — they're internal details
if not lines:
return None
return "<conversation_history>\n" + "\n".join(lines) + "\n</conversation_history>"
async def stream_chat_completion_sdk(
session_id: str,
message: str | None = None,
tool_call_response: str | None = None, # noqa: ARG001
is_user_message: bool = True,
user_id: str | None = None,
retry_count: int = 0, # noqa: ARG001
session: ChatSession | None = None,
context: dict[str, str] | None = None, # noqa: ARG001
) -> AsyncGenerator[StreamBaseResponse, None]:
"""Stream chat completion using Claude Agent SDK.
Drop-in replacement for stream_chat_completion with improved reliability.
"""
if session is None:
session = await get_chat_session(session_id, user_id)
if not session:
raise NotFoundError(
f"Session {session_id} not found. Please create a new session first."
)
if message:
session.messages.append(
ChatMessage(
role="user" if is_user_message else "assistant", content=message
)
)
if is_user_message:
track_user_message(
user_id=user_id, session_id=session_id, message_length=len(message)
)
session = await upsert_chat_session(session)
# Generate title for new sessions (first user message)
if is_user_message and not session.title:
user_messages = [m for m in session.messages if m.role == "user"]
if len(user_messages) == 1:
first_message = user_messages[0].content or message or ""
if first_message:
task = asyncio.create_task(
_update_title_async(session_id, first_message, user_id)
)
_background_tasks.add(task)
task.add_done_callback(_background_tasks.discard)
# Build system prompt (reuses non-SDK path with Langfuse support)
has_history = len(session.messages) > 1
system_prompt, _ = await _build_system_prompt(
user_id, has_conversation_history=has_history
)
system_prompt += _SDK_TOOL_SUPPLEMENT
message_id = str(uuid.uuid4())
task_id = str(uuid.uuid4())
yield StreamStart(messageId=message_id, taskId=task_id)
stream_completed = False
# Initialise sdk_cwd before the try so the finally can reference it
# even if _make_sdk_cwd raises (in that case it stays as "").
sdk_cwd = ""
try:
# Use a session-specific temp dir to avoid cleanup race conditions
# between concurrent sessions.
sdk_cwd = _make_sdk_cwd(session_id)
os.makedirs(sdk_cwd, exist_ok=True)
set_execution_context(
user_id,
session,
long_running_callback=_build_long_running_callback(user_id),
)
try:
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
# Fail fast when no API credentials are available at all
sdk_env = _build_sdk_env()
if not sdk_env and not os.environ.get("ANTHROPIC_API_KEY"):
raise RuntimeError(
"No API key configured. Set OPEN_ROUTER_API_KEY "
"(or CHAT_API_KEY) for OpenRouter routing, "
"or ANTHROPIC_API_KEY for direct Anthropic access."
)
mcp_server = create_copilot_mcp_server()
sdk_model = _resolve_sdk_model()
security_hooks = create_security_hooks(
user_id,
sdk_cwd=sdk_cwd,
max_subtasks=config.claude_agent_max_subtasks,
)
options = ClaudeAgentOptions(
system_prompt=system_prompt,
mcp_servers={"copilot": mcp_server}, # type: ignore[arg-type]
allowed_tools=COPILOT_TOOL_NAMES,
hooks=security_hooks, # type: ignore[arg-type]
cwd=sdk_cwd,
max_buffer_size=config.claude_agent_max_buffer_size,
# Only pass model/env when OpenRouter is configured
**({"model": sdk_model, "env": sdk_env} if sdk_env else {}),
)
adapter = SDKResponseAdapter(message_id=message_id)
adapter.set_task_id(task_id)
async with ClaudeSDKClient(options=options) as client:
current_message = message or ""
if not current_message and session.messages:
last_user = [m for m in session.messages if m.role == "user"]
if last_user:
current_message = last_user[-1].content or ""
if not current_message.strip():
yield StreamError(
errorText="Message cannot be empty.",
code="empty_prompt",
)
yield StreamFinish()
return
# Build query with conversation history context.
# Compress history first to handle long conversations.
query_message = current_message
if len(session.messages) > 1:
compressed = await _compress_conversation_history(session)
history_context = _format_conversation_context(compressed)
if history_context:
query_message = (
f"{history_context}\n\n"
f"Now, the user says:\n{current_message}"
)
logger.info(
f"[SDK] Sending query: {current_message[:80]!r}"
f" ({len(session.messages)} msgs in session)"
)
await client.query(query_message, session_id=session_id)
assistant_response = ChatMessage(role="assistant", content="")
accumulated_tool_calls: list[dict[str, Any]] = []
has_appended_assistant = False
has_tool_results = False
async for sdk_msg in client.receive_messages():
logger.debug(
f"[SDK] Received: {type(sdk_msg).__name__} "
f"{getattr(sdk_msg, 'subtype', '')}"
)
for response in adapter.convert_message(sdk_msg):
if isinstance(response, StreamStart):
continue
yield response
if isinstance(response, StreamTextDelta):
delta = response.delta or ""
# After tool results, start a new assistant
# message for the post-tool text.
if has_tool_results and has_appended_assistant:
assistant_response = ChatMessage(
role="assistant", content=delta
)
accumulated_tool_calls = []
has_appended_assistant = False
has_tool_results = False
session.messages.append(assistant_response)
has_appended_assistant = True
else:
assistant_response.content = (
assistant_response.content or ""
) + delta
if not has_appended_assistant:
session.messages.append(assistant_response)
has_appended_assistant = True
elif isinstance(response, StreamToolInputAvailable):
accumulated_tool_calls.append(
{
"id": response.toolCallId,
"type": "function",
"function": {
"name": response.toolName,
"arguments": json.dumps(response.input or {}),
},
}
)
assistant_response.tool_calls = accumulated_tool_calls
if not has_appended_assistant:
session.messages.append(assistant_response)
has_appended_assistant = True
elif isinstance(response, StreamToolOutputAvailable):
session.messages.append(
ChatMessage(
role="tool",
content=(
response.output
if isinstance(response.output, str)
else str(response.output)
),
tool_call_id=response.toolCallId,
)
)
has_tool_results = True
elif isinstance(response, StreamFinish):
stream_completed = True
if stream_completed:
break
if (
assistant_response.content or assistant_response.tool_calls
) and not has_appended_assistant:
session.messages.append(assistant_response)
except ImportError:
raise RuntimeError(
"claude-agent-sdk is not installed. "
"Disable SDK mode (CHAT_USE_CLAUDE_AGENT_SDK=false) "
"to use the OpenAI-compatible fallback."
)
await upsert_chat_session(session)
logger.debug(
f"[SDK] Session {session_id} saved with {len(session.messages)} messages"
)
if not stream_completed:
yield StreamFinish()
except Exception as e:
logger.error(f"[SDK] Error: {e}", exc_info=True)
try:
await upsert_chat_session(session)
except Exception as save_err:
logger.error(f"[SDK] Failed to save session on error: {save_err}")
yield StreamError(
errorText="An error occurred. Please try again.",
code="sdk_error",
)
yield StreamFinish()
finally:
if sdk_cwd:
_cleanup_sdk_tool_results(sdk_cwd)
async def _update_title_async(
session_id: str, message: str, user_id: str | None = None
) -> None:
"""Background task to update session title."""
try:
title = await _generate_session_title(
message, user_id=user_id, session_id=session_id
)
if title:
await update_session_title(session_id, title)
logger.debug(f"[SDK] Generated title for {session_id}: {title}")
except Exception as e:
logger.warning(f"[SDK] Failed to update session title: {e}")

View File

@@ -0,0 +1,320 @@
"""Tool adapter for wrapping existing CoPilot tools as Claude Agent SDK MCP tools.
This module provides the adapter layer that converts existing BaseTool implementations
into in-process MCP tools that can be used with the Claude Agent SDK.
Long-running tools (``is_long_running=True``) are delegated to the non-SDK
background infrastructure (stream_registry, Redis persistence, SSE reconnection)
via a callback provided by the service layer. This avoids wasteful SDK polling
and makes results survive page refreshes.
"""
import json
import logging
import os
import uuid
from collections.abc import Awaitable, Callable
from contextvars import ContextVar
from typing import Any
from backend.api.features.chat.model import ChatSession
from backend.api.features.chat.tools import TOOL_REGISTRY
from backend.api.features.chat.tools.base import BaseTool
logger = logging.getLogger(__name__)
# Allowed base directory for the Read tool (SDK saves oversized tool results here).
# Restricted to ~/.claude/projects/ and further validated to require "tool-results"
# in the path — prevents reading settings, credentials, or other sensitive files.
_SDK_PROJECTS_DIR = os.path.expanduser("~/.claude/projects/")
# MCP server naming - the SDK prefixes tool names as "mcp__{server_name}__{tool}"
MCP_SERVER_NAME = "copilot"
MCP_TOOL_PREFIX = f"mcp__{MCP_SERVER_NAME}__"
# Context variables to pass user/session info to tool execution
_current_user_id: ContextVar[str | None] = ContextVar("current_user_id", default=None)
_current_session: ContextVar[ChatSession | None] = ContextVar(
"current_session", default=None
)
# Stash for MCP tool outputs before the SDK potentially truncates them.
# Keyed by tool_name → full output string. Consumed (popped) by the
# response adapter when it builds StreamToolOutputAvailable.
_pending_tool_outputs: ContextVar[dict[str, str]] = ContextVar(
"pending_tool_outputs", default=None # type: ignore[arg-type]
)
# Callback type for delegating long-running tools to the non-SDK infrastructure.
# Args: (tool_name, arguments, session) → MCP-formatted response dict.
LongRunningCallback = Callable[
[str, dict[str, Any], ChatSession], Awaitable[dict[str, Any]]
]
# ContextVar so the service layer can inject the callback per-request.
_long_running_callback: ContextVar[LongRunningCallback | None] = ContextVar(
"long_running_callback", default=None
)
def set_execution_context(
user_id: str | None,
session: ChatSession,
long_running_callback: LongRunningCallback | None = None,
) -> None:
"""Set the execution context for tool calls.
This must be called before streaming begins to ensure tools have access
to user_id and session information.
Args:
user_id: Current user's ID.
session: Current chat session.
long_running_callback: Optional callback to delegate long-running tools
to the non-SDK background infrastructure (stream_registry + Redis).
"""
_current_user_id.set(user_id)
_current_session.set(session)
_pending_tool_outputs.set({})
_long_running_callback.set(long_running_callback)
def get_execution_context() -> tuple[str | None, ChatSession | None]:
"""Get the current execution context."""
return (
_current_user_id.get(),
_current_session.get(),
)
def pop_pending_tool_output(tool_name: str) -> str | None:
"""Pop and return the stashed full output for *tool_name*.
The SDK CLI may truncate large tool results (writing them to disk and
replacing the content with a file reference). This stash keeps the
original MCP output so the response adapter can forward it to the
frontend for proper widget rendering.
Returns ``None`` if nothing was stashed for *tool_name*.
"""
pending = _pending_tool_outputs.get(None)
if pending is None:
return None
return pending.pop(tool_name, None)
async def _execute_tool_sync(
base_tool: BaseTool,
user_id: str | None,
session: ChatSession,
args: dict[str, Any],
) -> dict[str, Any]:
"""Execute a tool synchronously and return MCP-formatted response."""
effective_id = f"sdk-{uuid.uuid4().hex[:12]}"
result = await base_tool.execute(
user_id=user_id,
session=session,
tool_call_id=effective_id,
**args,
)
text = (
result.output if isinstance(result.output, str) else json.dumps(result.output)
)
# Stash the full output before the SDK potentially truncates it.
pending = _pending_tool_outputs.get(None)
if pending is not None:
pending[base_tool.name] = text
return {
"content": [{"type": "text", "text": text}],
"isError": not result.success,
}
def _mcp_error(message: str) -> dict[str, Any]:
return {
"content": [
{"type": "text", "text": json.dumps({"error": message, "type": "error"})}
],
"isError": True,
}
def create_tool_handler(base_tool: BaseTool):
"""Create an async handler function for a BaseTool.
This wraps the existing BaseTool._execute method to be compatible
with the Claude Agent SDK MCP tool format.
Long-running tools (``is_long_running=True``) are delegated to the
non-SDK background infrastructure via a callback set in the execution
context. The callback persists the operation in Redis (stream_registry)
so results survive page refreshes and pod restarts.
"""
async def tool_handler(args: dict[str, Any]) -> dict[str, Any]:
"""Execute the wrapped tool and return MCP-formatted response."""
user_id, session = get_execution_context()
if session is None:
return _mcp_error("No session context available")
# --- Long-running: delegate to non-SDK background infrastructure ---
if base_tool.is_long_running:
callback = _long_running_callback.get(None)
if callback:
try:
return await callback(base_tool.name, args, session)
except Exception as e:
logger.error(
f"Long-running callback failed for {base_tool.name}: {e}",
exc_info=True,
)
return _mcp_error(f"Failed to start {base_tool.name}: {e}")
# No callback — fall through to synchronous execution
logger.warning(
f"[SDK] No long-running callback for {base_tool.name}, "
f"executing synchronously (may block)"
)
# --- Normal (fast) tool: execute synchronously ---
try:
return await _execute_tool_sync(base_tool, user_id, session, args)
except Exception as e:
logger.error(f"Error executing tool {base_tool.name}: {e}", exc_info=True)
return _mcp_error(f"Failed to execute {base_tool.name}: {e}")
return tool_handler
def _build_input_schema(base_tool: BaseTool) -> dict[str, Any]:
"""Build a JSON Schema input schema for a tool."""
return {
"type": "object",
"properties": base_tool.parameters.get("properties", {}),
"required": base_tool.parameters.get("required", []),
}
async def _read_file_handler(args: dict[str, Any]) -> dict[str, Any]:
"""Read a file with optional offset/limit. Restricted to SDK working directory.
After reading, the file is deleted to prevent accumulation in long-running pods.
"""
file_path = args.get("file_path", "")
offset = args.get("offset", 0)
limit = args.get("limit", 2000)
# Security: only allow reads under ~/.claude/projects/**/tool-results/
real_path = os.path.realpath(file_path)
if not real_path.startswith(_SDK_PROJECTS_DIR) or "tool-results" not in real_path:
return {
"content": [{"type": "text", "text": f"Access denied: {file_path}"}],
"isError": True,
}
try:
with open(real_path) as f:
lines = f.readlines()
selected = lines[offset : offset + limit]
content = "".join(selected)
return {"content": [{"type": "text", "text": content}], "isError": False}
except FileNotFoundError:
return {
"content": [{"type": "text", "text": f"File not found: {file_path}"}],
"isError": True,
}
except Exception as e:
return {
"content": [{"type": "text", "text": f"Error reading file: {e}"}],
"isError": True,
}
_READ_TOOL_NAME = "Read"
_READ_TOOL_DESCRIPTION = (
"Read a file from the local filesystem. "
"Use offset and limit to read specific line ranges for large files."
)
_READ_TOOL_SCHEMA = {
"type": "object",
"properties": {
"file_path": {
"type": "string",
"description": "The absolute path to the file to read",
},
"offset": {
"type": "integer",
"description": "Line number to start reading from (0-indexed). Default: 0",
},
"limit": {
"type": "integer",
"description": "Number of lines to read. Default: 2000",
},
},
"required": ["file_path"],
}
# Create the MCP server configuration
def create_copilot_mcp_server():
"""Create an in-process MCP server configuration for CoPilot tools.
This can be passed to ClaudeAgentOptions.mcp_servers.
Note: The actual SDK MCP server creation depends on the claude-agent-sdk
package being available. This function returns the configuration that
can be used with the SDK.
"""
try:
from claude_agent_sdk import create_sdk_mcp_server, tool
# Create decorated tool functions
sdk_tools = []
for tool_name, base_tool in TOOL_REGISTRY.items():
handler = create_tool_handler(base_tool)
decorated = tool(
tool_name,
base_tool.description,
_build_input_schema(base_tool),
)(handler)
sdk_tools.append(decorated)
# Add the Read tool so the SDK can read back oversized tool results
read_tool = tool(
_READ_TOOL_NAME,
_READ_TOOL_DESCRIPTION,
_READ_TOOL_SCHEMA,
)(_read_file_handler)
sdk_tools.append(read_tool)
server = create_sdk_mcp_server(
name=MCP_SERVER_NAME,
version="1.0.0",
tools=sdk_tools,
)
return server
except ImportError:
# Let ImportError propagate so service.py handles the fallback
raise
# SDK built-in tools allowed within the workspace directory.
# Security hooks validate that file paths stay within sdk_cwd.
# Bash is NOT included — use the sandboxed MCP bash_exec tool instead,
# which provides kernel-level network isolation via unshare --net.
# Task allows spawning sub-agents (rate-limited by security hooks).
_SDK_BUILTIN_TOOLS = ["Read", "Write", "Edit", "Glob", "Grep", "Task"]
# List of tool names for allowed_tools configuration
# Include MCP tools, the MCP Read tool for oversized results,
# and SDK built-in file tools for workspace operations.
COPILOT_TOOL_NAMES = [
*[f"{MCP_TOOL_PREFIX}{name}" for name in TOOL_REGISTRY.keys()],
f"{MCP_TOOL_PREFIX}{_READ_TOOL_NAME}",
*_SDK_BUILTIN_TOOLS,
]

View File

@@ -245,12 +245,16 @@ async def _get_system_prompt_template(context: str) -> str:
return DEFAULT_SYSTEM_PROMPT.format(users_information=context)
async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
async def _build_system_prompt(
user_id: str | None, has_conversation_history: bool = False
) -> tuple[str, Any]:
"""Build the full system prompt including business understanding if available.
Args:
user_id: The user ID for fetching business understanding
If "default" and this is the user's first session, will use "onboarding" instead.
user_id: The user ID for fetching business understanding.
has_conversation_history: Whether there's existing conversation history.
If True, we don't tell the model to greet/introduce (since they're
already in a conversation).
Returns:
Tuple of (compiled prompt string, business understanding object)
@@ -266,6 +270,8 @@ async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
if understanding:
context = format_understanding_for_prompt(understanding)
elif has_conversation_history:
context = "No prior understanding saved yet. Continue the existing conversation naturally."
else:
context = "This is the first time you are meeting the user. Greet them and introduce them to the platform"
@@ -374,7 +380,6 @@ async def stream_chat_completion(
Raises:
NotFoundError: If session_id is invalid
ValueError: If max_context_messages is exceeded
"""
completion_start = time.monotonic()
@@ -459,8 +464,9 @@ async def stream_chat_completion(
# Generate title for new sessions on first user message (non-blocking)
# Check: is_user_message, no title yet, and this is the first user message
if is_user_message and message and not session.title:
user_messages = [m for m in session.messages if m.role == "user"]
user_messages = [m for m in session.messages if m.role == "user"]
first_user_msg = message or (user_messages[0].content if user_messages else None)
if is_user_message and first_user_msg and not session.title:
if len(user_messages) == 1:
# First user message - generate title in background
import asyncio
@@ -468,7 +474,7 @@ async def stream_chat_completion(
# Capture only the values we need (not the session object) to avoid
# stale data issues when the main flow modifies the session
captured_session_id = session_id
captured_message = message
captured_message = first_user_msg
captured_user_id = user_id
async def _update_title():
@@ -1237,7 +1243,7 @@ async def _stream_chat_chunks(
total_time = (time_module.perf_counter() - stream_chunks_start) * 1000
logger.info(
f"[TIMING] _stream_chat_chunks COMPLETED in {total_time/1000:.1f}s; "
f"[TIMING] _stream_chat_chunks COMPLETED in {total_time / 1000:.1f}s; "
f"session={session.session_id}, user={session.user_id}",
extra={"json_fields": {**log_meta, "total_time_ms": total_time}},
)
@@ -1245,7 +1251,6 @@ async def _stream_chat_chunks(
return
except Exception as e:
last_error = e
if _is_retryable_error(e) and retry_count < MAX_RETRIES:
retry_count += 1
# Calculate delay with exponential backoff
@@ -1261,27 +1266,12 @@ async def _stream_chat_chunks(
continue # Retry the stream
else:
# Non-retryable error or max retries exceeded
_log_api_error(
error=e,
context="stream (not retrying)",
session_id=session.session_id if session else None,
message_count=len(messages) if messages else None,
model=model,
retry_count=retry_count,
logger.error(
f"Error in stream (not retrying): {e!s}",
exc_info=True,
)
error_code = None
error_text = str(e)
error_details = _extract_api_error_details(e)
if error_details.get("response_body"):
body = error_details["response_body"]
if isinstance(body, dict):
err = body.get("error")
if isinstance(err, dict) and err.get("message"):
error_text = err["message"]
elif body.get("message"):
error_text = body["message"]
if _is_region_blocked_error(e):
error_code = "MODEL_NOT_AVAILABLE_REGION"
error_text = (
@@ -1298,13 +1288,9 @@ async def _stream_chat_chunks(
# If we exit the retry loop without returning, it means we exhausted retries
if last_error:
_log_api_error(
error=last_error,
context=f"stream (max retries {MAX_RETRIES} exceeded)",
session_id=session.session_id if session else None,
message_count=len(messages) if messages else None,
model=model,
retry_count=MAX_RETRIES,
logger.error(
f"Max retries ({MAX_RETRIES}) exceeded. Last error: {last_error!s}",
exc_info=True,
)
yield StreamError(errorText=f"Max retries exceeded: {last_error!s}")
yield StreamFinish()
@@ -1877,7 +1863,6 @@ async def _generate_llm_continuation(
break # Success, exit retry loop
except Exception as e:
last_error = e
if _is_retryable_error(e) and retry_count < MAX_RETRIES:
retry_count += 1
delay = min(
@@ -1891,25 +1876,17 @@ async def _generate_llm_continuation(
await asyncio.sleep(delay)
continue
else:
# Non-retryable error - log details and exit gracefully
_log_api_error(
error=e,
context="LLM continuation (not retrying)",
session_id=session_id,
message_count=len(messages) if messages else None,
model=config.model,
retry_count=retry_count,
# Non-retryable error - log and exit gracefully
logger.error(
f"Non-retryable error in LLM continuation: {e!s}",
exc_info=True,
)
return
if last_error:
_log_api_error(
error=last_error,
context=f"LLM continuation (max retries {MAX_RETRIES} exceeded)",
session_id=session_id,
message_count=len(messages) if messages else None,
model=config.model,
retry_count=MAX_RETRIES,
logger.error(
f"Max retries ({MAX_RETRIES}) exceeded for LLM continuation. "
f"Last error: {last_error!s}"
)
return
@@ -1949,91 +1926,6 @@ async def _generate_llm_continuation(
logger.error(f"Failed to generate LLM continuation: {e}", exc_info=True)
def _log_api_error(
error: Exception,
context: str,
session_id: str | None = None,
message_count: int | None = None,
model: str | None = None,
retry_count: int = 0,
) -> None:
"""Log detailed API error information for debugging."""
details = _extract_api_error_details(error)
details["context"] = context
details["session_id"] = session_id
details["message_count"] = message_count
details["model"] = model
details["retry_count"] = retry_count
if isinstance(error, RateLimitError):
logger.warning(f"Rate limit error in {context}: {details}", exc_info=error)
elif isinstance(error, APIConnectionError):
logger.warning(f"API connection error in {context}: {details}", exc_info=error)
elif isinstance(error, APIStatusError) and error.status_code >= 500:
logger.error(f"API server error (5xx) in {context}: {details}", exc_info=error)
else:
logger.error(f"API error in {context}: {details}", exc_info=error)
def _extract_api_error_details(error: Exception) -> dict[str, Any]:
"""Extract detailed information from OpenAI/OpenRouter API errors."""
error_msg = str(error)
details: dict[str, Any] = {
"error_type": type(error).__name__,
"error_message": error_msg[:500] + "..." if len(error_msg) > 500 else error_msg,
}
if hasattr(error, "code"):
details["code"] = getattr(error, "code", None)
if hasattr(error, "param"):
details["param"] = getattr(error, "param", None)
if isinstance(error, APIStatusError):
details["status_code"] = error.status_code
details["request_id"] = getattr(error, "request_id", None)
if hasattr(error, "body") and error.body:
details["response_body"] = _sanitize_error_body(error.body)
if hasattr(error, "response") and error.response:
headers = error.response.headers
details["openrouter_provider"] = headers.get("x-openrouter-provider")
details["openrouter_model"] = headers.get("x-openrouter-model")
details["retry_after"] = headers.get("retry-after")
details["rate_limit_remaining"] = headers.get("x-ratelimit-remaining")
return details
def _sanitize_error_body(
body: Any, max_length: int = 2000
) -> dict[str, Any] | str | None:
"""Extract only safe fields from error response body to avoid logging sensitive data."""
if not isinstance(body, dict):
# Non-dict bodies (e.g., HTML error pages) - return truncated string
if body is not None:
body_str = str(body)
if len(body_str) > max_length:
return body_str[:max_length] + "...[truncated]"
return body_str
return None
safe_fields = ("message", "type", "code", "param", "error")
sanitized: dict[str, Any] = {}
for field in safe_fields:
if field in body:
value = body[field]
if field == "error" and isinstance(value, dict):
sanitized[field] = _sanitize_error_body(value, max_length)
elif isinstance(value, str) and len(value) > max_length:
sanitized[field] = value[:max_length] + "...[truncated]"
else:
sanitized[field] = value
return sanitized if sanitized else None
async def _generate_llm_continuation_with_streaming(
session_id: str,
user_id: str | None,

View File

@@ -814,6 +814,28 @@ async def get_active_task_for_session(
if task_user_id and user_id != task_user_id:
continue
# Auto-expire stale tasks that exceeded stream_timeout
created_at_str = meta.get("created_at", "")
if created_at_str:
try:
created_at = datetime.fromisoformat(created_at_str)
age_seconds = (
datetime.now(timezone.utc) - created_at
).total_seconds()
if age_seconds > config.stream_timeout:
logger.warning(
f"[TASK_LOOKUP] Auto-expiring stale task {task_id[:8]}... "
f"(age={age_seconds:.0f}s > timeout={config.stream_timeout}s)"
)
await mark_task_completed(task_id, "failed")
continue
except (ValueError, TypeError):
pass
logger.info(
f"[TASK_LOOKUP] Found running task {task_id[:8]}... for session {session_id[:8]}..."
)
# Get the last message ID from Redis Stream
stream_key = _get_task_stream_key(task_id)
last_id = "0-0"

View File

@@ -9,6 +9,8 @@ from backend.api.features.chat.tracking import track_tool_called
from .add_understanding import AddUnderstandingTool
from .agent_output import AgentOutputTool
from .base import BaseTool
from .bash_exec import BashExecTool
from .check_operation_status import CheckOperationStatusTool
from .create_agent import CreateAgentTool
from .customize_agent import CustomizeAgentTool
from .edit_agent import EditAgentTool
@@ -19,6 +21,7 @@ from .get_doc_page import GetDocPageTool
from .run_agent import RunAgentTool
from .run_block import RunBlockTool
from .search_docs import SearchDocsTool
from .web_fetch import WebFetchTool
from .workspace_files import (
DeleteWorkspaceFileTool,
ListWorkspaceFilesTool,
@@ -43,47 +46,20 @@ TOOL_REGISTRY: dict[str, BaseTool] = {
"run_agent": RunAgentTool(),
"run_block": RunBlockTool(),
"view_agent_output": AgentOutputTool(),
"check_operation_status": CheckOperationStatusTool(),
"search_docs": SearchDocsTool(),
"get_doc_page": GetDocPageTool(),
# Workspace tools for CoPilot file operations
# Web fetch for safe URL retrieval
"web_fetch": WebFetchTool(),
# Sandboxed code execution (bubblewrap)
"bash_exec": BashExecTool(),
# Persistent workspace tools (cloud storage, survives across sessions)
"list_workspace_files": ListWorkspaceFilesTool(),
"read_workspace_file": ReadWorkspaceFileTool(),
"write_workspace_file": WriteWorkspaceFileTool(),
"delete_workspace_file": DeleteWorkspaceFileTool(),
}
def _register_feature_request_tools() -> None:
"""Register feature request tools only if Linear configuration is available."""
from backend.util.settings import Settings
try:
secrets = Settings().secrets
except Exception:
logger.warning("Feature request tools disabled: failed to load settings")
return
if not (
secrets.linear_api_key
and secrets.linear_feature_request_project_id
and secrets.linear_feature_request_team_id
):
logger.info(
"Feature request tools disabled: LINEAR_API_KEY, "
"LINEAR_FEATURE_REQUEST_PROJECT_ID, or "
"LINEAR_FEATURE_REQUEST_TEAM_ID is not configured"
)
return
from .feature_requests import CreateFeatureRequestTool, SearchFeatureRequestsTool
TOOL_REGISTRY["search_feature_requests"] = SearchFeatureRequestsTool()
TOOL_REGISTRY["create_feature_request"] = CreateFeatureRequestTool()
logger.info("Feature request tools enabled")
_register_feature_request_tools()
# Export individual tool instances for backwards compatibility
find_agent_tool = TOOL_REGISTRY["find_agent"]
run_agent_tool = TOOL_REGISTRY["run_agent"]

View File

@@ -1,154 +0,0 @@
"""Dummy Agent Generator for testing.
Returns mock responses matching the format expected from the external service.
Enable via AGENTGENERATOR_USE_DUMMY=true in settings.
WARNING: This is for testing only. Do not use in production.
"""
import asyncio
import logging
import uuid
from typing import Any
logger = logging.getLogger(__name__)
# Dummy decomposition result (instructions type)
DUMMY_DECOMPOSITION_RESULT: dict[str, Any] = {
"type": "instructions",
"steps": [
{
"description": "Get input from user",
"action": "input",
"block_name": "AgentInputBlock",
},
{
"description": "Process the input",
"action": "process",
"block_name": "TextFormatterBlock",
},
{
"description": "Return output to user",
"action": "output",
"block_name": "AgentOutputBlock",
},
],
}
# Block IDs from backend/blocks/io.py
AGENT_INPUT_BLOCK_ID = "c0a8e994-ebf1-4a9c-a4d8-89d09c86741b"
AGENT_OUTPUT_BLOCK_ID = "363ae599-353e-4804-937e-b2ee3cef3da4"
def _generate_dummy_agent_json() -> dict[str, Any]:
"""Generate a minimal valid agent JSON for testing."""
input_node_id = str(uuid.uuid4())
output_node_id = str(uuid.uuid4())
return {
"id": str(uuid.uuid4()),
"version": 1,
"is_active": True,
"name": "Dummy Test Agent",
"description": "A dummy agent generated for testing purposes",
"nodes": [
{
"id": input_node_id,
"block_id": AGENT_INPUT_BLOCK_ID,
"input_default": {
"name": "input",
"title": "Input",
"description": "Enter your input",
"placeholder_values": [],
},
"metadata": {"position": {"x": 0, "y": 0}},
},
{
"id": output_node_id,
"block_id": AGENT_OUTPUT_BLOCK_ID,
"input_default": {
"name": "output",
"title": "Output",
"description": "Agent output",
"format": "{output}",
},
"metadata": {"position": {"x": 400, "y": 0}},
},
],
"links": [
{
"id": str(uuid.uuid4()),
"source_id": input_node_id,
"sink_id": output_node_id,
"source_name": "result",
"sink_name": "value",
"is_static": False,
},
],
}
async def decompose_goal_dummy(
description: str,
context: str = "",
library_agents: list[dict[str, Any]] | None = None,
) -> dict[str, Any]:
"""Return dummy decomposition result."""
logger.info("Using dummy agent generator for decompose_goal")
return DUMMY_DECOMPOSITION_RESULT.copy()
async def generate_agent_dummy(
instructions: dict[str, Any],
library_agents: list[dict[str, Any]] | None = None,
operation_id: str | None = None,
task_id: str | None = None,
) -> dict[str, Any]:
"""Return dummy agent JSON after a simulated delay."""
logger.info("Using dummy agent generator for generate_agent (30s delay)")
await asyncio.sleep(30)
return _generate_dummy_agent_json()
async def generate_agent_patch_dummy(
update_request: str,
current_agent: dict[str, Any],
library_agents: list[dict[str, Any]] | None = None,
operation_id: str | None = None,
task_id: str | None = None,
) -> dict[str, Any]:
"""Return dummy patched agent (returns the current agent with updated description)."""
logger.info("Using dummy agent generator for generate_agent_patch")
patched = current_agent.copy()
patched["description"] = (
f"{current_agent.get('description', '')} (updated: {update_request})"
)
return patched
async def customize_template_dummy(
template_agent: dict[str, Any],
modification_request: str,
context: str = "",
) -> dict[str, Any]:
"""Return dummy customized template (returns template with updated description)."""
logger.info("Using dummy agent generator for customize_template")
customized = template_agent.copy()
customized["description"] = (
f"{template_agent.get('description', '')} (customized: {modification_request})"
)
return customized
async def get_blocks_dummy() -> list[dict[str, Any]]:
"""Return dummy blocks list."""
logger.info("Using dummy agent generator for get_blocks")
return [
{"id": AGENT_INPUT_BLOCK_ID, "name": "AgentInputBlock"},
{"id": AGENT_OUTPUT_BLOCK_ID, "name": "AgentOutputBlock"},
]
async def health_check_dummy() -> bool:
"""Always returns healthy for dummy service."""
return True

View File

@@ -12,19 +12,8 @@ import httpx
from backend.util.settings import Settings
from .dummy import (
customize_template_dummy,
decompose_goal_dummy,
generate_agent_dummy,
generate_agent_patch_dummy,
get_blocks_dummy,
health_check_dummy,
)
logger = logging.getLogger(__name__)
_dummy_mode_warned = False
def _create_error_response(
error_message: str,
@@ -101,26 +90,10 @@ def _get_settings() -> Settings:
return _settings
def _is_dummy_mode() -> bool:
"""Check if dummy mode is enabled for testing."""
global _dummy_mode_warned
settings = _get_settings()
is_dummy = bool(settings.config.agentgenerator_use_dummy)
if is_dummy and not _dummy_mode_warned:
logger.warning(
"Agent Generator running in DUMMY MODE - returning mock responses. "
"Do not use in production!"
)
_dummy_mode_warned = True
return is_dummy
def is_external_service_configured() -> bool:
"""Check if external Agent Generator service is configured (or dummy mode)."""
"""Check if external Agent Generator service is configured."""
settings = _get_settings()
return bool(settings.config.agentgenerator_host) or bool(
settings.config.agentgenerator_use_dummy
)
return bool(settings.config.agentgenerator_host)
def _get_base_url() -> str:
@@ -164,9 +137,6 @@ async def decompose_goal_external(
- {"type": "error", "error": "...", "error_type": "..."} on error
Or None on unexpected error
"""
if _is_dummy_mode():
return await decompose_goal_dummy(description, context, library_agents)
client = _get_client()
if context:
@@ -256,11 +226,6 @@ async def generate_agent_external(
Returns:
Agent JSON dict, {"status": "accepted"} for async, or error dict {"type": "error", ...} on error
"""
if _is_dummy_mode():
return await generate_agent_dummy(
instructions, library_agents, operation_id, task_id
)
client = _get_client()
# Build request payload
@@ -332,11 +297,6 @@ async def generate_agent_patch_external(
Returns:
Updated agent JSON, clarifying questions dict, {"status": "accepted"} for async, or error dict on error
"""
if _is_dummy_mode():
return await generate_agent_patch_dummy(
update_request, current_agent, library_agents, operation_id, task_id
)
client = _get_client()
# Build request payload
@@ -423,11 +383,6 @@ async def customize_template_external(
Returns:
Customized agent JSON, clarifying questions dict, or error dict on error
"""
if _is_dummy_mode():
return await customize_template_dummy(
template_agent, modification_request, context
)
client = _get_client()
request = modification_request
@@ -490,9 +445,6 @@ async def get_blocks_external() -> list[dict[str, Any]] | None:
Returns:
List of block info dicts or None on error
"""
if _is_dummy_mode():
return await get_blocks_dummy()
client = _get_client()
try:
@@ -526,9 +478,6 @@ async def health_check() -> bool:
if not is_external_service_configured():
return False
if _is_dummy_mode():
return await health_check_dummy()
client = _get_client()
try:

View File

@@ -0,0 +1,131 @@
"""Bash execution tool — run shell commands in a bubblewrap sandbox.
Full Bash scripting is allowed (loops, conditionals, pipes, functions, etc.).
Safety comes from OS-level isolation (bubblewrap): only system dirs visible
read-only, writable workspace only, clean env, no network.
Requires bubblewrap (``bwrap``) — the tool is disabled when bwrap is not
available (e.g. macOS development).
"""
import logging
from typing import Any
from backend.api.features.chat.model import ChatSession
from backend.api.features.chat.tools.base import BaseTool
from backend.api.features.chat.tools.models import (
BashExecResponse,
ErrorResponse,
ToolResponseBase,
)
from backend.api.features.chat.tools.sandbox import (
get_workspace_dir,
has_full_sandbox,
run_sandboxed,
)
logger = logging.getLogger(__name__)
class BashExecTool(BaseTool):
"""Execute Bash commands in a bubblewrap sandbox."""
@property
def name(self) -> str:
return "bash_exec"
@property
def description(self) -> str:
if not has_full_sandbox():
return (
"Bash execution is DISABLED — bubblewrap sandbox is not "
"available on this platform. Do not call this tool."
)
return (
"Execute a Bash command or script in a bubblewrap sandbox. "
"Full Bash scripting is supported (loops, conditionals, pipes, "
"functions, etc.). "
"The sandbox shares the same working directory as the SDK Read/Write "
"tools — files created by either are accessible to both. "
"SECURITY: Only system directories (/usr, /bin, /lib, /etc) are "
"visible read-only, the per-session workspace is the only writable "
"path, environment variables are wiped (no secrets), all network "
"access is blocked at the kernel level, and resource limits are "
"enforced (max 64 processes, 512MB memory, 50MB file size). "
"Application code, configs, and other directories are NOT accessible. "
"To fetch web content, use the web_fetch tool instead. "
"Execution is killed after the timeout (default 30s, max 120s). "
"Returns stdout and stderr. "
"Useful for file manipulation, data processing with Unix tools "
"(grep, awk, sed, jq, etc.), and running shell scripts."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"command": {
"type": "string",
"description": "Bash command or script to execute.",
},
"timeout": {
"type": "integer",
"description": (
"Max execution time in seconds (default 30, max 120)."
),
"default": 30,
},
},
"required": ["command"],
}
@property
def requires_auth(self) -> bool:
return False
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs: Any,
) -> ToolResponseBase:
session_id = session.session_id if session else None
if not has_full_sandbox():
return ErrorResponse(
message="bash_exec requires bubblewrap sandbox (Linux only).",
error="sandbox_unavailable",
session_id=session_id,
)
command: str = (kwargs.get("command") or "").strip()
timeout: int = kwargs.get("timeout", 30)
if not command:
return ErrorResponse(
message="No command provided.",
error="empty_command",
session_id=session_id,
)
workspace = get_workspace_dir(session_id or "default")
stdout, stderr, exit_code, timed_out = await run_sandboxed(
command=["bash", "-c", command],
cwd=workspace,
timeout=timeout,
)
return BashExecResponse(
message=(
"Execution timed out"
if timed_out
else f"Command executed (exit {exit_code})"
),
stdout=stdout,
stderr=stderr,
exit_code=exit_code,
timed_out=timed_out,
session_id=session_id,
)

View File

@@ -0,0 +1,127 @@
"""CheckOperationStatusTool — query the status of a long-running operation."""
import logging
from typing import Any
from backend.api.features.chat.model import ChatSession
from backend.api.features.chat.tools.base import BaseTool
from backend.api.features.chat.tools.models import (
ErrorResponse,
ResponseType,
ToolResponseBase,
)
logger = logging.getLogger(__name__)
class OperationStatusResponse(ToolResponseBase):
"""Response for check_operation_status tool."""
type: ResponseType = ResponseType.OPERATION_STATUS
task_id: str
operation_id: str
status: str # "running", "completed", "failed"
tool_name: str | None = None
message: str = ""
class CheckOperationStatusTool(BaseTool):
"""Check the status of a long-running operation (create_agent, edit_agent, etc.).
The CoPilot uses this tool to report back to the user whether an
operation that was started earlier has completed, failed, or is still
running.
"""
@property
def name(self) -> str:
return "check_operation_status"
@property
def description(self) -> str:
return (
"Check the current status of a long-running operation such as "
"create_agent or edit_agent. Accepts either an operation_id or "
"task_id from a previous operation_started response. "
"Returns the current status: running, completed, or failed."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"operation_id": {
"type": "string",
"description": (
"The operation_id from an operation_started response."
),
},
"task_id": {
"type": "string",
"description": (
"The task_id from an operation_started response. "
"Used as fallback if operation_id is not provided."
),
},
},
"required": [],
}
@property
def requires_auth(self) -> bool:
return False
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
from backend.api.features.chat import stream_registry
operation_id: str = kwargs.get("operation_id", "").strip()
task_id: str = kwargs.get("task_id", "").strip()
if not operation_id and not task_id:
return ErrorResponse(
message="Please provide an operation_id or task_id.",
error="missing_parameter",
)
task = None
if operation_id:
task = await stream_registry.find_task_by_operation_id(operation_id)
if task is None and task_id:
task = await stream_registry.get_task(task_id)
if task is None:
# Task not in Redis — it may have already expired (TTL).
# Check conversation history for the result instead.
return ErrorResponse(
message=(
"Operation not found — it may have already completed and "
"expired from the status tracker. Check the conversation "
"history for the result."
),
error="not_found",
)
status_messages = {
"running": (
f"The {task.tool_name or 'operation'} is still running. "
"Please wait for it to complete."
),
"completed": (
f"The {task.tool_name or 'operation'} has completed successfully."
),
"failed": f"The {task.tool_name or 'operation'} has failed.",
}
return OperationStatusResponse(
task_id=task.task_id,
operation_id=task.operation_id,
status=task.status,
tool_name=task.tool_name,
message=status_messages.get(task.status, f"Status: {task.status}"),
)

View File

@@ -1,448 +0,0 @@
"""Feature request tools - search and create feature requests via Linear."""
import logging
from typing import Any
from pydantic import SecretStr
from backend.api.features.chat.model import ChatSession
from backend.api.features.chat.tools.base import BaseTool
from backend.api.features.chat.tools.models import (
ErrorResponse,
FeatureRequestCreatedResponse,
FeatureRequestInfo,
FeatureRequestSearchResponse,
NoResultsResponse,
ToolResponseBase,
)
from backend.blocks.linear._api import LinearClient
from backend.data.model import APIKeyCredentials
from backend.data.user import get_user_email_by_id
from backend.util.settings import Settings
logger = logging.getLogger(__name__)
MAX_SEARCH_RESULTS = 10
# GraphQL queries/mutations
SEARCH_ISSUES_QUERY = """
query SearchFeatureRequests($term: String!, $filter: IssueFilter, $first: Int) {
searchIssues(term: $term, filter: $filter, first: $first) {
nodes {
id
identifier
title
description
}
}
}
"""
CUSTOMER_UPSERT_MUTATION = """
mutation CustomerUpsert($input: CustomerUpsertInput!) {
customerUpsert(input: $input) {
success
customer {
id
name
externalIds
}
}
}
"""
ISSUE_CREATE_MUTATION = """
mutation IssueCreate($input: IssueCreateInput!) {
issueCreate(input: $input) {
success
issue {
id
identifier
title
url
}
}
}
"""
CUSTOMER_NEED_CREATE_MUTATION = """
mutation CustomerNeedCreate($input: CustomerNeedCreateInput!) {
customerNeedCreate(input: $input) {
success
need {
id
body
customer {
id
name
}
issue {
id
identifier
title
url
}
}
}
}
"""
_settings: Settings | None = None
def _get_settings() -> Settings:
global _settings
if _settings is None:
_settings = Settings()
return _settings
def _get_linear_config() -> tuple[LinearClient, str, str]:
"""Return a configured Linear client, project ID, and team ID.
Raises RuntimeError if any required setting is missing.
"""
secrets = _get_settings().secrets
if not secrets.linear_api_key:
raise RuntimeError("LINEAR_API_KEY is not configured")
if not secrets.linear_feature_request_project_id:
raise RuntimeError("LINEAR_FEATURE_REQUEST_PROJECT_ID is not configured")
if not secrets.linear_feature_request_team_id:
raise RuntimeError("LINEAR_FEATURE_REQUEST_TEAM_ID is not configured")
credentials = APIKeyCredentials(
id="system-linear",
provider="linear",
api_key=SecretStr(secrets.linear_api_key),
title="System Linear API Key",
)
client = LinearClient(credentials=credentials)
return (
client,
secrets.linear_feature_request_project_id,
secrets.linear_feature_request_team_id,
)
class SearchFeatureRequestsTool(BaseTool):
"""Tool for searching existing feature requests in Linear."""
@property
def name(self) -> str:
return "search_feature_requests"
@property
def description(self) -> str:
return (
"Search existing feature requests to check if a similar request "
"already exists before creating a new one. Returns matching feature "
"requests with their ID, title, and description."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search term to find matching feature requests.",
},
},
"required": ["query"],
}
@property
def requires_auth(self) -> bool:
return True
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
query = kwargs.get("query", "").strip()
session_id = session.session_id if session else None
if not query:
return ErrorResponse(
message="Please provide a search query.",
error="Missing query parameter",
session_id=session_id,
)
try:
client, project_id, _team_id = _get_linear_config()
data = await client.query(
SEARCH_ISSUES_QUERY,
{
"term": query,
"filter": {
"project": {"id": {"eq": project_id}},
},
"first": MAX_SEARCH_RESULTS,
},
)
nodes = data.get("searchIssues", {}).get("nodes", [])
if not nodes:
return NoResultsResponse(
message=f"No feature requests found matching '{query}'.",
suggestions=[
"Try different keywords",
"Use broader search terms",
"You can create a new feature request if none exists",
],
session_id=session_id,
)
results = [
FeatureRequestInfo(
id=node["id"],
identifier=node["identifier"],
title=node["title"],
description=node.get("description"),
)
for node in nodes
]
return FeatureRequestSearchResponse(
message=f"Found {len(results)} feature request(s) matching '{query}'.",
results=results,
count=len(results),
query=query,
session_id=session_id,
)
except Exception as e:
logger.exception("Failed to search feature requests")
return ErrorResponse(
message="Failed to search feature requests.",
error=str(e),
session_id=session_id,
)
class CreateFeatureRequestTool(BaseTool):
"""Tool for creating feature requests (or adding needs to existing ones)."""
@property
def name(self) -> str:
return "create_feature_request"
@property
def description(self) -> str:
return (
"Create a new feature request or add a customer need to an existing one. "
"Always search first with search_feature_requests to avoid duplicates. "
"If a matching request exists, pass its ID as existing_issue_id to add "
"the user's need to it instead of creating a duplicate."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"title": {
"type": "string",
"description": "Title for the feature request.",
},
"description": {
"type": "string",
"description": "Detailed description of what the user wants and why.",
},
"existing_issue_id": {
"type": "string",
"description": (
"If adding a need to an existing feature request, "
"provide its Linear issue ID (from search results). "
"Omit to create a new feature request."
),
},
},
"required": ["title", "description"],
}
@property
def requires_auth(self) -> bool:
return True
async def _find_or_create_customer(
self, client: LinearClient, user_id: str, name: str
) -> dict:
"""Find existing customer by user_id or create a new one via upsert.
Args:
client: Linear API client.
user_id: Stable external ID used to deduplicate customers.
name: Human-readable display name (e.g. the user's email).
"""
data = await client.mutate(
CUSTOMER_UPSERT_MUTATION,
{
"input": {
"name": name,
"externalId": user_id,
},
},
)
result = data.get("customerUpsert", {})
if not result.get("success"):
raise RuntimeError(f"Failed to upsert customer: {data}")
return result["customer"]
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
title = kwargs.get("title", "").strip()
description = kwargs.get("description", "").strip()
existing_issue_id = kwargs.get("existing_issue_id")
session_id = session.session_id if session else None
if not title or not description:
return ErrorResponse(
message="Both title and description are required.",
error="Missing required parameters",
session_id=session_id,
)
if not user_id:
return ErrorResponse(
message="Authentication required to create feature requests.",
error="Missing user_id",
session_id=session_id,
)
try:
client, project_id, team_id = _get_linear_config()
except Exception as e:
logger.exception("Failed to initialize Linear client")
return ErrorResponse(
message="Failed to create feature request.",
error=str(e),
session_id=session_id,
)
# Resolve a human-readable name (email) for the Linear customer record.
# Fall back to user_id if the lookup fails or returns None.
try:
customer_display_name = await get_user_email_by_id(user_id) or user_id
except Exception:
customer_display_name = user_id
# Step 1: Find or create customer for this user
try:
customer = await self._find_or_create_customer(
client, user_id, customer_display_name
)
customer_id = customer["id"]
customer_name = customer["name"]
except Exception as e:
logger.exception("Failed to upsert customer in Linear")
return ErrorResponse(
message="Failed to create feature request.",
error=str(e),
session_id=session_id,
)
# Step 2: Create or reuse issue
issue_id: str | None = None
issue_identifier: str | None = None
if existing_issue_id:
# Add need to existing issue - we still need the issue details for response
is_new_issue = False
issue_id = existing_issue_id
else:
# Create new issue in the feature requests project
try:
data = await client.mutate(
ISSUE_CREATE_MUTATION,
{
"input": {
"title": title,
"description": description,
"teamId": team_id,
"projectId": project_id,
},
},
)
result = data.get("issueCreate", {})
if not result.get("success"):
return ErrorResponse(
message="Failed to create feature request issue.",
error=str(data),
session_id=session_id,
)
issue = result["issue"]
issue_id = issue["id"]
issue_identifier = issue.get("identifier")
except Exception as e:
logger.exception("Failed to create feature request issue")
return ErrorResponse(
message="Failed to create feature request.",
error=str(e),
session_id=session_id,
)
is_new_issue = True
# Step 3: Create customer need on the issue
try:
data = await client.mutate(
CUSTOMER_NEED_CREATE_MUTATION,
{
"input": {
"customerId": customer_id,
"issueId": issue_id,
"body": description,
"priority": 0,
},
},
)
need_result = data.get("customerNeedCreate", {})
if not need_result.get("success"):
orphaned = (
{"issue_id": issue_id, "issue_identifier": issue_identifier}
if is_new_issue
else None
)
return ErrorResponse(
message="Failed to attach customer need to the feature request.",
error=str(data),
details=orphaned,
session_id=session_id,
)
need = need_result["need"]
issue_info = need["issue"]
except Exception as e:
logger.exception("Failed to create customer need")
orphaned = (
{"issue_id": issue_id, "issue_identifier": issue_identifier}
if is_new_issue
else None
)
return ErrorResponse(
message="Failed to attach customer need to the feature request.",
error=str(e),
details=orphaned,
session_id=session_id,
)
return FeatureRequestCreatedResponse(
message=(
f"{'Created new feature request' if is_new_issue else 'Added your request to existing feature request'}: "
f"{issue_info['title']}."
),
issue_id=issue_info["id"],
issue_identifier=issue_info["identifier"],
issue_title=issue_info["title"],
issue_url=issue_info.get("url", ""),
is_new_issue=is_new_issue,
customer_name=customer_name,
session_id=session_id,
)

View File

@@ -1,615 +0,0 @@
"""Tests for SearchFeatureRequestsTool and CreateFeatureRequestTool."""
from unittest.mock import AsyncMock, patch
import pytest
from backend.api.features.chat.tools.feature_requests import (
CreateFeatureRequestTool,
SearchFeatureRequestsTool,
)
from backend.api.features.chat.tools.models import (
ErrorResponse,
FeatureRequestCreatedResponse,
FeatureRequestSearchResponse,
NoResultsResponse,
)
from ._test_data import make_session
_TEST_USER_ID = "test-user-feature-requests"
_TEST_USER_EMAIL = "testuser@example.com"
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
_FAKE_PROJECT_ID = "test-project-id"
_FAKE_TEAM_ID = "test-team-id"
def _mock_linear_config(*, query_return=None, mutate_return=None):
"""Return a patched _get_linear_config that yields a mock LinearClient."""
client = AsyncMock()
if query_return is not None:
client.query.return_value = query_return
if mutate_return is not None:
client.mutate.return_value = mutate_return
return (
patch(
"backend.api.features.chat.tools.feature_requests._get_linear_config",
return_value=(client, _FAKE_PROJECT_ID, _FAKE_TEAM_ID),
),
client,
)
def _search_response(nodes: list[dict]) -> dict:
return {"searchIssues": {"nodes": nodes}}
def _customer_upsert_response(
customer_id: str = "cust-1", name: str = _TEST_USER_EMAIL, success: bool = True
) -> dict:
return {
"customerUpsert": {
"success": success,
"customer": {"id": customer_id, "name": name, "externalIds": [name]},
}
}
def _issue_create_response(
issue_id: str = "issue-1",
identifier: str = "FR-1",
title: str = "New Feature",
success: bool = True,
) -> dict:
return {
"issueCreate": {
"success": success,
"issue": {
"id": issue_id,
"identifier": identifier,
"title": title,
"url": f"https://linear.app/issue/{identifier}",
},
}
}
def _need_create_response(
need_id: str = "need-1",
issue_id: str = "issue-1",
identifier: str = "FR-1",
title: str = "New Feature",
success: bool = True,
) -> dict:
return {
"customerNeedCreate": {
"success": success,
"need": {
"id": need_id,
"body": "description",
"customer": {"id": "cust-1", "name": _TEST_USER_EMAIL},
"issue": {
"id": issue_id,
"identifier": identifier,
"title": title,
"url": f"https://linear.app/issue/{identifier}",
},
},
}
}
# ===========================================================================
# SearchFeatureRequestsTool
# ===========================================================================
class TestSearchFeatureRequestsTool:
"""Tests for SearchFeatureRequestsTool._execute."""
@pytest.mark.asyncio(loop_scope="session")
async def test_successful_search(self):
session = make_session(user_id=_TEST_USER_ID)
nodes = [
{
"id": "id-1",
"identifier": "FR-1",
"title": "Dark mode",
"description": "Add dark mode support",
},
{
"id": "id-2",
"identifier": "FR-2",
"title": "Dark theme",
"description": None,
},
]
patcher, _ = _mock_linear_config(query_return=_search_response(nodes))
with patcher:
tool = SearchFeatureRequestsTool()
resp = await tool._execute(
user_id=_TEST_USER_ID, session=session, query="dark mode"
)
assert isinstance(resp, FeatureRequestSearchResponse)
assert resp.count == 2
assert resp.results[0].id == "id-1"
assert resp.results[1].identifier == "FR-2"
assert resp.query == "dark mode"
@pytest.mark.asyncio(loop_scope="session")
async def test_no_results(self):
session = make_session(user_id=_TEST_USER_ID)
patcher, _ = _mock_linear_config(query_return=_search_response([]))
with patcher:
tool = SearchFeatureRequestsTool()
resp = await tool._execute(
user_id=_TEST_USER_ID, session=session, query="nonexistent"
)
assert isinstance(resp, NoResultsResponse)
assert "nonexistent" in resp.message
@pytest.mark.asyncio(loop_scope="session")
async def test_empty_query_returns_error(self):
session = make_session(user_id=_TEST_USER_ID)
tool = SearchFeatureRequestsTool()
resp = await tool._execute(user_id=_TEST_USER_ID, session=session, query=" ")
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "query" in resp.error.lower()
@pytest.mark.asyncio(loop_scope="session")
async def test_missing_query_returns_error(self):
session = make_session(user_id=_TEST_USER_ID)
tool = SearchFeatureRequestsTool()
resp = await tool._execute(user_id=_TEST_USER_ID, session=session)
assert isinstance(resp, ErrorResponse)
@pytest.mark.asyncio(loop_scope="session")
async def test_api_failure(self):
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.query.side_effect = RuntimeError("Linear API down")
with patcher:
tool = SearchFeatureRequestsTool()
resp = await tool._execute(
user_id=_TEST_USER_ID, session=session, query="test"
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "Linear API down" in resp.error
@pytest.mark.asyncio(loop_scope="session")
async def test_malformed_node_returns_error(self):
"""A node missing required keys should be caught by the try/except."""
session = make_session(user_id=_TEST_USER_ID)
# Node missing 'identifier' key
bad_nodes = [{"id": "id-1", "title": "Missing identifier"}]
patcher, _ = _mock_linear_config(query_return=_search_response(bad_nodes))
with patcher:
tool = SearchFeatureRequestsTool()
resp = await tool._execute(
user_id=_TEST_USER_ID, session=session, query="test"
)
assert isinstance(resp, ErrorResponse)
@pytest.mark.asyncio(loop_scope="session")
async def test_linear_client_init_failure(self):
session = make_session(user_id=_TEST_USER_ID)
with patch(
"backend.api.features.chat.tools.feature_requests._get_linear_config",
side_effect=RuntimeError("No API key"),
):
tool = SearchFeatureRequestsTool()
resp = await tool._execute(
user_id=_TEST_USER_ID, session=session, query="test"
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "No API key" in resp.error
# ===========================================================================
# CreateFeatureRequestTool
# ===========================================================================
class TestCreateFeatureRequestTool:
"""Tests for CreateFeatureRequestTool._execute."""
@pytest.fixture(autouse=True)
def _patch_email_lookup(self):
with patch(
"backend.api.features.chat.tools.feature_requests.get_user_email_by_id",
new_callable=AsyncMock,
return_value=_TEST_USER_EMAIL,
):
yield
# ---- Happy paths -------------------------------------------------------
@pytest.mark.asyncio(loop_scope="session")
async def test_create_new_issue(self):
"""Full happy path: upsert customer -> create issue -> attach need."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
_issue_create_response(),
_need_create_response(),
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="New Feature",
description="Please add this",
)
assert isinstance(resp, FeatureRequestCreatedResponse)
assert resp.is_new_issue is True
assert resp.issue_identifier == "FR-1"
assert resp.customer_name == _TEST_USER_EMAIL
assert client.mutate.call_count == 3
@pytest.mark.asyncio(loop_scope="session")
async def test_add_need_to_existing_issue(self):
"""When existing_issue_id is provided, skip issue creation."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
_need_create_response(issue_id="existing-1", identifier="FR-99"),
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Existing Feature",
description="Me too",
existing_issue_id="existing-1",
)
assert isinstance(resp, FeatureRequestCreatedResponse)
assert resp.is_new_issue is False
assert resp.issue_id == "existing-1"
# Only 2 mutations: customer upsert + need create (no issue create)
assert client.mutate.call_count == 2
# ---- Validation errors -------------------------------------------------
@pytest.mark.asyncio(loop_scope="session")
async def test_missing_title(self):
session = make_session(user_id=_TEST_USER_ID)
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="",
description="some desc",
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "required" in resp.error.lower()
@pytest.mark.asyncio(loop_scope="session")
async def test_missing_description(self):
session = make_session(user_id=_TEST_USER_ID)
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Some title",
description="",
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "required" in resp.error.lower()
@pytest.mark.asyncio(loop_scope="session")
async def test_missing_user_id(self):
session = make_session(user_id=_TEST_USER_ID)
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=None,
session=session,
title="Some title",
description="Some desc",
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "user_id" in resp.error.lower()
# ---- Linear client init failure ----------------------------------------
@pytest.mark.asyncio(loop_scope="session")
async def test_linear_client_init_failure(self):
session = make_session(user_id=_TEST_USER_ID)
with patch(
"backend.api.features.chat.tools.feature_requests._get_linear_config",
side_effect=RuntimeError("No API key"),
):
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "No API key" in resp.error
# ---- Customer upsert failures ------------------------------------------
@pytest.mark.asyncio(loop_scope="session")
async def test_customer_upsert_api_error(self):
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = RuntimeError("Customer API error")
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "Customer API error" in resp.error
@pytest.mark.asyncio(loop_scope="session")
async def test_customer_upsert_not_success(self):
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.return_value = _customer_upsert_response(success=False)
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
@pytest.mark.asyncio(loop_scope="session")
async def test_customer_malformed_response(self):
"""Customer dict missing 'id' key should be caught."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
# success=True but customer has no 'id'
client.mutate.return_value = {
"customerUpsert": {
"success": True,
"customer": {"name": _TEST_USER_ID},
}
}
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
# ---- Issue creation failures -------------------------------------------
@pytest.mark.asyncio(loop_scope="session")
async def test_issue_create_api_error(self):
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
RuntimeError("Issue create failed"),
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "Issue create failed" in resp.error
@pytest.mark.asyncio(loop_scope="session")
async def test_issue_create_not_success(self):
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
_issue_create_response(success=False),
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
assert "Failed to create feature request issue" in resp.message
@pytest.mark.asyncio(loop_scope="session")
async def test_issue_create_malformed_response(self):
"""issueCreate success=True but missing 'issue' key."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
{"issueCreate": {"success": True}}, # no 'issue' key
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
# ---- Customer need attachment failures ---------------------------------
@pytest.mark.asyncio(loop_scope="session")
async def test_need_create_api_error_new_issue(self):
"""Need creation fails after new issue was created -> orphaned issue info."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
_issue_create_response(issue_id="orphan-1", identifier="FR-10"),
RuntimeError("Need attach failed"),
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
assert resp.error is not None
assert "Need attach failed" in resp.error
assert resp.details is not None
assert resp.details["issue_id"] == "orphan-1"
assert resp.details["issue_identifier"] == "FR-10"
@pytest.mark.asyncio(loop_scope="session")
async def test_need_create_api_error_existing_issue(self):
"""Need creation fails on existing issue -> no orphaned info."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
RuntimeError("Need attach failed"),
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
existing_issue_id="existing-1",
)
assert isinstance(resp, ErrorResponse)
assert resp.details is None
@pytest.mark.asyncio(loop_scope="session")
async def test_need_create_not_success_includes_orphaned_info(self):
"""customerNeedCreate returns success=False -> includes orphaned issue."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
_issue_create_response(issue_id="orphan-2", identifier="FR-20"),
_need_create_response(success=False),
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
assert resp.details is not None
assert resp.details["issue_id"] == "orphan-2"
assert resp.details["issue_identifier"] == "FR-20"
@pytest.mark.asyncio(loop_scope="session")
async def test_need_create_not_success_existing_issue_no_details(self):
"""customerNeedCreate fails on existing issue -> no orphaned info."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
_need_create_response(success=False),
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
existing_issue_id="existing-1",
)
assert isinstance(resp, ErrorResponse)
assert resp.details is None
@pytest.mark.asyncio(loop_scope="session")
async def test_need_create_malformed_response(self):
"""need_result missing 'need' key after success=True."""
session = make_session(user_id=_TEST_USER_ID)
patcher, client = _mock_linear_config()
client.mutate.side_effect = [
_customer_upsert_response(),
_issue_create_response(),
{"customerNeedCreate": {"success": True}}, # no 'need' key
]
with patcher:
tool = CreateFeatureRequestTool()
resp = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
title="Title",
description="Desc",
)
assert isinstance(resp, ErrorResponse)
assert resp.details is not None
assert resp.details["issue_id"] == "issue-1"

View File

@@ -7,6 +7,7 @@ from backend.api.features.chat.model import ChatSession
from backend.api.features.chat.tools.base import BaseTool, ToolResponseBase
from backend.api.features.chat.tools.models import (
BlockInfoSummary,
BlockInputFieldInfo,
BlockListResponse,
ErrorResponse,
NoResultsResponse,
@@ -54,8 +55,7 @@ class FindBlockTool(BaseTool):
"Blocks are reusable components that perform specific tasks like "
"sending emails, making API calls, processing text, etc. "
"IMPORTANT: Use this tool FIRST to get the block's 'id' before calling run_block. "
"The response includes each block's id, name, and description. "
"Call run_block with the block's id **with no inputs** to see detailed inputs/outputs and execute it."
"The response includes each block's id, required_inputs, and input_schema."
)
@property
@@ -124,7 +124,7 @@ class FindBlockTool(BaseTool):
session_id=session_id,
)
# Enrich results with block information
# Enrich results with full block information
blocks: list[BlockInfoSummary] = []
for result in results:
block_id = result["content_id"]
@@ -141,11 +141,65 @@ class FindBlockTool(BaseTool):
):
continue
# Get input/output schemas
input_schema = {}
output_schema = {}
try:
input_schema = block.input_schema.jsonschema()
except Exception as e:
logger.debug(
"Failed to generate input schema for block %s: %s",
block_id,
e,
)
try:
output_schema = block.output_schema.jsonschema()
except Exception as e:
logger.debug(
"Failed to generate output schema for block %s: %s",
block_id,
e,
)
# Get categories from block instance
categories = []
if hasattr(block, "categories") and block.categories:
categories = [cat.value for cat in block.categories]
# Extract required inputs for easier use
required_inputs: list[BlockInputFieldInfo] = []
if input_schema:
properties = input_schema.get("properties", {})
required_fields = set(input_schema.get("required", []))
# Get credential field names to exclude from required inputs
credentials_fields = set(
block.input_schema.get_credentials_fields().keys()
)
for field_name, field_schema in properties.items():
# Skip credential fields - they're handled separately
if field_name in credentials_fields:
continue
required_inputs.append(
BlockInputFieldInfo(
name=field_name,
type=field_schema.get("type", "string"),
description=field_schema.get("description", ""),
required=field_name in required_fields,
default=field_schema.get("default"),
)
)
blocks.append(
BlockInfoSummary(
id=block_id,
name=block.name,
description=block.description or "",
categories=categories,
input_schema=input_schema,
output_schema=output_schema,
required_inputs=required_inputs,
)
)
@@ -174,7 +228,8 @@ class FindBlockTool(BaseTool):
return BlockListResponse(
message=(
f"Found {len(blocks)} block(s) matching '{query}'. "
"To see a block's inputs/outputs and execute it, use run_block with the block's 'id' - providing no inputs."
"To execute a block, use run_block with the block's 'id' field "
"and provide 'input_data' matching the block's input_schema."
),
blocks=blocks,
count=len(blocks),

View File

@@ -18,13 +18,7 @@ _TEST_USER_ID = "test-user-find-block"
def make_mock_block(
block_id: str,
name: str,
block_type: BlockType,
disabled: bool = False,
input_schema: dict | None = None,
output_schema: dict | None = None,
credentials_fields: dict | None = None,
block_id: str, name: str, block_type: BlockType, disabled: bool = False
):
"""Create a mock block for testing."""
mock = MagicMock()
@@ -34,13 +28,10 @@ def make_mock_block(
mock.block_type = block_type
mock.disabled = disabled
mock.input_schema = MagicMock()
mock.input_schema.jsonschema.return_value = input_schema or {
"properties": {},
"required": [],
}
mock.input_schema.get_credentials_fields.return_value = credentials_fields or {}
mock.input_schema.jsonschema.return_value = {"properties": {}, "required": []}
mock.input_schema.get_credentials_fields.return_value = {}
mock.output_schema = MagicMock()
mock.output_schema.jsonschema.return_value = output_schema or {}
mock.output_schema.jsonschema.return_value = {}
mock.categories = []
return mock
@@ -146,241 +137,3 @@ class TestFindBlockFiltering:
assert isinstance(response, BlockListResponse)
assert len(response.blocks) == 1
assert response.blocks[0].id == "normal-block-id"
@pytest.mark.asyncio(loop_scope="session")
async def test_response_size_average_chars_per_block(self):
"""Measure average chars per block in the serialized response."""
session = make_session(user_id=_TEST_USER_ID)
# Realistic block definitions modeled after real blocks
block_defs = [
{
"id": "http-block-id",
"name": "Send Web Request",
"input_schema": {
"properties": {
"url": {
"type": "string",
"description": "The URL to send the request to",
},
"method": {
"type": "string",
"description": "The HTTP method to use",
},
"headers": {
"type": "object",
"description": "Headers to include in the request",
},
"json_format": {
"type": "boolean",
"description": "If true, send the body as JSON",
},
"body": {
"type": "object",
"description": "Form/JSON body payload",
},
"credentials": {
"type": "object",
"description": "HTTP credentials",
},
},
"required": ["url", "method"],
},
"output_schema": {
"properties": {
"response": {
"type": "object",
"description": "The response from the server",
},
"client_error": {
"type": "object",
"description": "Errors on 4xx status codes",
},
"server_error": {
"type": "object",
"description": "Errors on 5xx status codes",
},
"error": {
"type": "string",
"description": "Errors for all other exceptions",
},
},
},
"credentials_fields": {"credentials": True},
},
{
"id": "email-block-id",
"name": "Send Email",
"input_schema": {
"properties": {
"to_email": {
"type": "string",
"description": "Recipient email address",
},
"subject": {
"type": "string",
"description": "Subject of the email",
},
"body": {
"type": "string",
"description": "Body of the email",
},
"config": {
"type": "object",
"description": "SMTP Config",
},
"credentials": {
"type": "object",
"description": "SMTP credentials",
},
},
"required": ["to_email", "subject", "body", "credentials"],
},
"output_schema": {
"properties": {
"status": {
"type": "string",
"description": "Status of the email sending operation",
},
"error": {
"type": "string",
"description": "Error message if sending failed",
},
},
},
"credentials_fields": {"credentials": True},
},
{
"id": "claude-code-block-id",
"name": "Claude Code",
"input_schema": {
"properties": {
"e2b_credentials": {
"type": "object",
"description": "API key for E2B platform",
},
"anthropic_credentials": {
"type": "object",
"description": "API key for Anthropic",
},
"prompt": {
"type": "string",
"description": "Task or instruction for Claude Code",
},
"timeout": {
"type": "integer",
"description": "Sandbox timeout in seconds",
},
"setup_commands": {
"type": "array",
"description": "Shell commands to run before execution",
},
"working_directory": {
"type": "string",
"description": "Working directory for Claude Code",
},
"session_id": {
"type": "string",
"description": "Session ID to resume a conversation",
},
"sandbox_id": {
"type": "string",
"description": "Sandbox ID to reconnect to",
},
"conversation_history": {
"type": "string",
"description": "Previous conversation history",
},
"dispose_sandbox": {
"type": "boolean",
"description": "Whether to dispose sandbox after execution",
},
},
"required": [
"e2b_credentials",
"anthropic_credentials",
"prompt",
],
},
"output_schema": {
"properties": {
"response": {
"type": "string",
"description": "Output from Claude Code execution",
},
"files": {
"type": "array",
"description": "Files created/modified by Claude Code",
},
"conversation_history": {
"type": "string",
"description": "Full conversation history",
},
"session_id": {
"type": "string",
"description": "Session ID for this conversation",
},
"sandbox_id": {
"type": "string",
"description": "ID of the sandbox instance",
},
"error": {
"type": "string",
"description": "Error message if execution failed",
},
},
},
"credentials_fields": {
"e2b_credentials": True,
"anthropic_credentials": True,
},
},
]
search_results = [
{"content_id": d["id"], "score": 0.9 - i * 0.1}
for i, d in enumerate(block_defs)
]
mock_blocks = {
d["id"]: make_mock_block(
block_id=d["id"],
name=d["name"],
block_type=BlockType.STANDARD,
input_schema=d["input_schema"],
output_schema=d["output_schema"],
credentials_fields=d["credentials_fields"],
)
for d in block_defs
}
with patch(
"backend.api.features.chat.tools.find_block.unified_hybrid_search",
new_callable=AsyncMock,
return_value=(search_results, len(search_results)),
), patch(
"backend.api.features.chat.tools.find_block.get_block",
side_effect=lambda bid: mock_blocks.get(bid),
):
tool = FindBlockTool()
response = await tool._execute(
user_id=_TEST_USER_ID, session=session, query="test"
)
assert isinstance(response, BlockListResponse)
assert response.count == len(block_defs)
total_chars = len(response.model_dump_json())
avg_chars = total_chars // response.count
# Print for visibility in test output
print(f"\nTotal response size: {total_chars} chars")
print(f"Number of blocks: {response.count}")
print(f"Average chars per block: {avg_chars}")
# The old response was ~90K for 10 blocks (~9K per block).
# Previous optimization reduced it to ~1.5K per block (no raw JSON schemas).
# Now with only id/name/description, we expect ~300 chars per block.
assert avg_chars < 500, (
f"Average chars per block ({avg_chars}) exceeds 500. "
f"Total response: {total_chars} chars for {response.count} blocks."
)

View File

@@ -25,7 +25,6 @@ class ResponseType(str, Enum):
AGENT_SAVED = "agent_saved"
CLARIFICATION_NEEDED = "clarification_needed"
BLOCK_LIST = "block_list"
BLOCK_DETAILS = "block_details"
BLOCK_OUTPUT = "block_output"
DOC_SEARCH_RESULTS = "doc_search_results"
DOC_PAGE = "doc_page"
@@ -41,9 +40,12 @@ class ResponseType(str, Enum):
OPERATION_IN_PROGRESS = "operation_in_progress"
# Input validation
INPUT_VALIDATION_ERROR = "input_validation_error"
# Feature request types
FEATURE_REQUEST_SEARCH = "feature_request_search"
FEATURE_REQUEST_CREATED = "feature_request_created"
# Web fetch
WEB_FETCH = "web_fetch"
# Code execution
BASH_EXEC = "bash_exec"
# Operation status check
OPERATION_STATUS = "operation_status"
# Base response model
@@ -338,6 +340,19 @@ class BlockInfoSummary(BaseModel):
id: str
name: str
description: str
categories: list[str]
input_schema: dict[str, Any] = Field(
default_factory=dict,
description="Full JSON schema for block inputs",
)
output_schema: dict[str, Any] = Field(
default_factory=dict,
description="Full JSON schema for block outputs",
)
required_inputs: list[BlockInputFieldInfo] = Field(
default_factory=list,
description="List of input fields for this block",
)
class BlockListResponse(ToolResponseBase):
@@ -347,25 +362,10 @@ class BlockListResponse(ToolResponseBase):
blocks: list[BlockInfoSummary]
count: int
query: str
class BlockDetails(BaseModel):
"""Detailed block information."""
id: str
name: str
description: str
inputs: dict[str, Any] = {}
outputs: dict[str, Any] = {}
credentials: list[CredentialsMetaInput] = []
class BlockDetailsResponse(ToolResponseBase):
"""Response for block details (first run_block attempt)."""
type: ResponseType = ResponseType.BLOCK_DETAILS
block: BlockDetails
user_authenticated: bool = False
usage_hint: str = Field(
default="To execute a block, call run_block with block_id set to the block's "
"'id' field and input_data containing the fields listed in required_inputs."
)
class BlockOutputResponse(ToolResponseBase):
@@ -435,32 +435,22 @@ class AsyncProcessingResponse(ToolResponseBase):
task_id: str | None = None
# Feature request models
class FeatureRequestInfo(BaseModel):
"""Information about a feature request issue."""
class WebFetchResponse(ToolResponseBase):
"""Response for web_fetch tool."""
id: str
identifier: str
title: str
description: str | None = None
type: ResponseType = ResponseType.WEB_FETCH
url: str
status_code: int
content_type: str
content: str
truncated: bool = False
class FeatureRequestSearchResponse(ToolResponseBase):
"""Response for search_feature_requests tool."""
class BashExecResponse(ToolResponseBase):
"""Response for bash_exec tool."""
type: ResponseType = ResponseType.FEATURE_REQUEST_SEARCH
results: list[FeatureRequestInfo]
count: int
query: str
class FeatureRequestCreatedResponse(ToolResponseBase):
"""Response for create_feature_request tool."""
type: ResponseType = ResponseType.FEATURE_REQUEST_CREATED
issue_id: str
issue_identifier: str
issue_title: str
issue_url: str
is_new_issue: bool # False if added to existing
customer_name: str
type: ResponseType = ResponseType.BASH_EXEC
stdout: str
stderr: str
exit_code: int
timed_out: bool = False

View File

@@ -23,11 +23,8 @@ from backend.util.exceptions import BlockError
from .base import BaseTool
from .helpers import get_inputs_from_schema
from .models import (
BlockDetails,
BlockDetailsResponse,
BlockOutputResponse,
ErrorResponse,
InputValidationErrorResponse,
SetupInfo,
SetupRequirementsResponse,
ToolResponseBase,
@@ -54,8 +51,8 @@ class RunBlockTool(BaseTool):
"Execute a specific block with the provided input data. "
"IMPORTANT: You MUST call find_block first to get the block's 'id' - "
"do NOT guess or make up block IDs. "
"On first attempt (without input_data), returns detailed schema showing "
"required inputs and outputs. Then call again with proper input_data to execute."
"Use the 'id' from find_block results and provide input_data "
"matching the block's required_inputs."
)
@property
@@ -70,19 +67,11 @@ class RunBlockTool(BaseTool):
"NEVER guess this - always get it from find_block first."
),
},
"block_name": {
"type": "string",
"description": (
"The block's human-readable name from find_block results. "
"Used for display purposes in the UI."
),
},
"input_data": {
"type": "object",
"description": (
"Input values for the block. "
"First call with empty {} to see the block's schema, "
"then call again with proper values to execute."
"Input values for the block. Use the 'required_inputs' field "
"from find_block to see what fields are needed."
),
},
},
@@ -167,34 +156,6 @@ class RunBlockTool(BaseTool):
await self._resolve_block_credentials(user_id, block, input_data)
)
# Get block schemas for details/validation
try:
input_schema: dict[str, Any] = block.input_schema.jsonschema()
except Exception as e:
logger.warning(
"Failed to generate input schema for block %s: %s",
block_id,
e,
)
return ErrorResponse(
message=f"Block '{block.name}' has an invalid input schema",
error=str(e),
session_id=session_id,
)
try:
output_schema: dict[str, Any] = block.output_schema.jsonschema()
except Exception as e:
logger.warning(
"Failed to generate output schema for block %s: %s",
block_id,
e,
)
return ErrorResponse(
message=f"Block '{block.name}' has an invalid output schema",
error=str(e),
session_id=session_id,
)
if missing_credentials:
# Return setup requirements response with missing credentials
credentials_fields_info = block.input_schema.get_credentials_fields_info()
@@ -227,53 +188,6 @@ class RunBlockTool(BaseTool):
graph_version=None,
)
# Check if this is a first attempt (required inputs missing)
# Return block details so user can see what inputs are needed
credentials_fields = set(block.input_schema.get_credentials_fields().keys())
required_keys = set(input_schema.get("required", []))
required_non_credential_keys = required_keys - credentials_fields
provided_input_keys = set(input_data.keys()) - credentials_fields
# Check for unknown input fields
valid_fields = (
set(input_schema.get("properties", {}).keys()) - credentials_fields
)
unrecognized_fields = provided_input_keys - valid_fields
if unrecognized_fields:
return InputValidationErrorResponse(
message=(
f"Unknown input field(s) provided: {', '.join(sorted(unrecognized_fields))}. "
f"Block was not executed. Please use the correct field names from the schema."
),
session_id=session_id,
unrecognized_fields=sorted(unrecognized_fields),
inputs=input_schema,
)
# Show details when not all required non-credential inputs are provided
if not (required_non_credential_keys <= provided_input_keys):
# Get credentials info for the response
credentials_meta = []
for field_name, cred_meta in matched_credentials.items():
credentials_meta.append(cred_meta)
return BlockDetailsResponse(
message=(
f"Block '{block.name}' details. "
"Provide input_data matching the inputs schema to execute the block."
),
session_id=session_id,
block=BlockDetails(
id=block_id,
name=block.name,
description=block.description or "",
inputs=input_schema,
outputs=output_schema,
credentials=credentials_meta,
),
user_authenticated=True,
)
try:
# Get or create user's workspace for CoPilot file operations
workspace = await get_or_create_workspace(user_id)

View File

@@ -1,15 +1,10 @@
"""Tests for block execution guards and input validation in RunBlockTool."""
"""Tests for block execution guards in RunBlockTool."""
from unittest.mock import AsyncMock, MagicMock, patch
from unittest.mock import MagicMock, patch
import pytest
from backend.api.features.chat.tools.models import (
BlockDetailsResponse,
BlockOutputResponse,
ErrorResponse,
InputValidationErrorResponse,
)
from backend.api.features.chat.tools.models import ErrorResponse
from backend.api.features.chat.tools.run_block import RunBlockTool
from backend.blocks._base import BlockType
@@ -33,39 +28,6 @@ def make_mock_block(
return mock
def make_mock_block_with_schema(
block_id: str,
name: str,
input_properties: dict,
required_fields: list[str],
output_properties: dict | None = None,
):
"""Create a mock block with a defined input/output schema for validation tests."""
mock = MagicMock()
mock.id = block_id
mock.name = name
mock.block_type = BlockType.STANDARD
mock.disabled = False
mock.description = f"Test block: {name}"
input_schema = {
"properties": input_properties,
"required": required_fields,
}
mock.input_schema = MagicMock()
mock.input_schema.jsonschema.return_value = input_schema
mock.input_schema.get_credentials_fields_info.return_value = {}
mock.input_schema.get_credentials_fields.return_value = {}
output_schema = {
"properties": output_properties or {"result": {"type": "string"}},
}
mock.output_schema = MagicMock()
mock.output_schema.jsonschema.return_value = output_schema
return mock
class TestRunBlockFiltering:
"""Tests for block execution guards in RunBlockTool."""
@@ -142,221 +104,3 @@ class TestRunBlockFiltering:
# (may be other errors like missing credentials, but not the exclusion guard)
if isinstance(response, ErrorResponse):
assert "cannot be run directly in CoPilot" not in response.message
class TestRunBlockInputValidation:
"""Tests for input field validation in RunBlockTool.
run_block rejects unknown input field names with InputValidationErrorResponse,
preventing silent failures where incorrect keys would be ignored and the block
would execute with default values instead of the caller's intended values.
"""
@pytest.mark.asyncio(loop_scope="session")
async def test_unknown_input_fields_are_rejected(self):
"""run_block rejects unknown input fields instead of silently ignoring them.
Scenario: The AI Text Generator block has a field called 'model' (for LLM model
selection), but the LLM calling the tool guesses wrong and sends 'LLM_Model'
instead. The block should reject the request and return the valid schema.
"""
session = make_session(user_id=_TEST_USER_ID)
mock_block = make_mock_block_with_schema(
block_id="ai-text-gen-id",
name="AI Text Generator",
input_properties={
"prompt": {"type": "string", "description": "The prompt to send"},
"model": {
"type": "string",
"description": "The LLM model to use",
"default": "gpt-4o-mini",
},
"sys_prompt": {
"type": "string",
"description": "System prompt",
"default": "",
},
},
required_fields=["prompt"],
output_properties={"response": {"type": "string"}},
)
with patch(
"backend.api.features.chat.tools.run_block.get_block",
return_value=mock_block,
):
tool = RunBlockTool()
# Provide 'prompt' (correct) but 'LLM_Model' instead of 'model' (wrong key)
response = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
block_id="ai-text-gen-id",
input_data={
"prompt": "Write a haiku about coding",
"LLM_Model": "claude-opus-4-6", # WRONG KEY - should be 'model'
},
)
assert isinstance(response, InputValidationErrorResponse)
assert "LLM_Model" in response.unrecognized_fields
assert "Block was not executed" in response.message
assert "inputs" in response.model_dump() # valid schema included
@pytest.mark.asyncio(loop_scope="session")
async def test_multiple_wrong_keys_are_all_reported(self):
"""All unrecognized field names are reported in a single error response."""
session = make_session(user_id=_TEST_USER_ID)
mock_block = make_mock_block_with_schema(
block_id="ai-text-gen-id",
name="AI Text Generator",
input_properties={
"prompt": {"type": "string"},
"model": {"type": "string", "default": "gpt-4o-mini"},
"sys_prompt": {"type": "string", "default": ""},
"retry": {"type": "integer", "default": 3},
},
required_fields=["prompt"],
)
with patch(
"backend.api.features.chat.tools.run_block.get_block",
return_value=mock_block,
):
tool = RunBlockTool()
response = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
block_id="ai-text-gen-id",
input_data={
"prompt": "Hello", # correct
"llm_model": "claude-opus-4-6", # WRONG - should be 'model'
"system_prompt": "Be helpful", # WRONG - should be 'sys_prompt'
"retries": 5, # WRONG - should be 'retry'
},
)
assert isinstance(response, InputValidationErrorResponse)
assert set(response.unrecognized_fields) == {
"llm_model",
"system_prompt",
"retries",
}
assert "Block was not executed" in response.message
@pytest.mark.asyncio(loop_scope="session")
async def test_unknown_fields_rejected_even_with_missing_required(self):
"""Unknown fields are caught before the missing-required-fields check."""
session = make_session(user_id=_TEST_USER_ID)
mock_block = make_mock_block_with_schema(
block_id="ai-text-gen-id",
name="AI Text Generator",
input_properties={
"prompt": {"type": "string"},
"model": {"type": "string", "default": "gpt-4o-mini"},
},
required_fields=["prompt"],
)
with patch(
"backend.api.features.chat.tools.run_block.get_block",
return_value=mock_block,
):
tool = RunBlockTool()
# 'prompt' is missing AND 'LLM_Model' is an unknown field
response = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
block_id="ai-text-gen-id",
input_data={
"LLM_Model": "claude-opus-4-6", # wrong key, and 'prompt' is missing
},
)
# Unknown fields are caught first
assert isinstance(response, InputValidationErrorResponse)
assert "LLM_Model" in response.unrecognized_fields
@pytest.mark.asyncio(loop_scope="session")
async def test_correct_inputs_still_execute(self):
"""Correct input field names pass validation and the block executes."""
session = make_session(user_id=_TEST_USER_ID)
mock_block = make_mock_block_with_schema(
block_id="ai-text-gen-id",
name="AI Text Generator",
input_properties={
"prompt": {"type": "string"},
"model": {"type": "string", "default": "gpt-4o-mini"},
},
required_fields=["prompt"],
)
async def mock_execute(input_data, **kwargs):
yield "response", "Generated text"
mock_block.execute = mock_execute
with (
patch(
"backend.api.features.chat.tools.run_block.get_block",
return_value=mock_block,
),
patch(
"backend.api.features.chat.tools.run_block.get_or_create_workspace",
new_callable=AsyncMock,
return_value=MagicMock(id="test-workspace-id"),
),
):
tool = RunBlockTool()
response = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
block_id="ai-text-gen-id",
input_data={
"prompt": "Write a haiku",
"model": "gpt-4o-mini", # correct field name
},
)
assert isinstance(response, BlockOutputResponse)
assert response.success is True
@pytest.mark.asyncio(loop_scope="session")
async def test_missing_required_fields_returns_details(self):
"""Missing required fields returns BlockDetailsResponse with schema."""
session = make_session(user_id=_TEST_USER_ID)
mock_block = make_mock_block_with_schema(
block_id="ai-text-gen-id",
name="AI Text Generator",
input_properties={
"prompt": {"type": "string"},
"model": {"type": "string", "default": "gpt-4o-mini"},
},
required_fields=["prompt"],
)
with patch(
"backend.api.features.chat.tools.run_block.get_block",
return_value=mock_block,
):
tool = RunBlockTool()
# Only provide valid optional field, missing required 'prompt'
response = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
block_id="ai-text-gen-id",
input_data={
"model": "gpt-4o-mini", # valid but optional
},
)
assert isinstance(response, BlockDetailsResponse)

View File

@@ -0,0 +1,267 @@
"""Sandbox execution utilities for code execution tools.
Provides filesystem + network isolated command execution using **bubblewrap**
(``bwrap``): whitelist-only filesystem (only system dirs visible read-only),
writable workspace only, clean environment, network blocked.
Tools that call :func:`run_sandboxed` must first check :func:`has_full_sandbox`
and refuse to run if bubblewrap is not available.
"""
import asyncio
import logging
import os
import platform
import shutil
logger = logging.getLogger(__name__)
# Output limits — prevent blowing up LLM context
_MAX_OUTPUT_CHARS = 50_000
_DEFAULT_TIMEOUT = 30
_MAX_TIMEOUT = 120
# ---------------------------------------------------------------------------
# Sandbox capability detection (cached at first call)
# ---------------------------------------------------------------------------
_BWRAP_AVAILABLE: bool | None = None
def has_full_sandbox() -> bool:
"""Return True if bubblewrap is available (filesystem + network isolation).
On non-Linux platforms (macOS), always returns False.
"""
global _BWRAP_AVAILABLE
if _BWRAP_AVAILABLE is None:
_BWRAP_AVAILABLE = (
platform.system() == "Linux" and shutil.which("bwrap") is not None
)
return _BWRAP_AVAILABLE
WORKSPACE_PREFIX = "/tmp/copilot-"
def make_session_path(session_id: str) -> str:
"""Build a sanitized, session-specific path under :data:`WORKSPACE_PREFIX`.
Shared by both the SDK working-directory setup and the sandbox tools so
they always resolve to the same directory for a given session.
Steps:
1. Strip all characters except ``[A-Za-z0-9-]``.
2. Construct ``/tmp/copilot-<safe_id>``.
3. Validate via ``os.path.normpath`` + ``startswith`` (CodeQL-recognised
sanitizer) to prevent path traversal.
Raises:
ValueError: If the resulting path escapes the prefix.
"""
import re
safe_id = re.sub(r"[^A-Za-z0-9-]", "", session_id)
if not safe_id:
safe_id = "default"
path = os.path.normpath(f"{WORKSPACE_PREFIX}{safe_id}")
if not path.startswith(WORKSPACE_PREFIX):
raise ValueError(f"Session path escaped prefix: {path}")
return path
def get_workspace_dir(session_id: str) -> str:
"""Get or create the workspace directory for a session.
Uses :func:`make_session_path` — the same path the SDK uses — so that
bash_exec shares the workspace with the SDK file tools.
"""
workspace = make_session_path(session_id)
os.makedirs(workspace, exist_ok=True)
return workspace
# ---------------------------------------------------------------------------
# Bubblewrap command builder
# ---------------------------------------------------------------------------
# System directories mounted read-only inside the sandbox.
# ONLY these are visible — /app, /root, /home, /opt, /var etc. are NOT accessible.
_SYSTEM_RO_BINDS = [
"/usr", # binaries, libraries, Python interpreter
"/etc", # system config: ld.so, locale, passwd, alternatives
]
# Compat paths: symlinks to /usr/* on modern Debian, real dirs on older systems.
# On Debian 13 these are symlinks (e.g. /bin -> usr/bin). bwrap --ro-bind
# can't create a symlink target, so we detect and use --symlink instead.
# /lib64 is critical: the ELF dynamic linker lives at /lib64/ld-linux-x86-64.so.2.
_COMPAT_PATHS = [
("/bin", "usr/bin"), # -> /usr/bin on Debian 13
("/sbin", "usr/sbin"), # -> /usr/sbin on Debian 13
("/lib", "usr/lib"), # -> /usr/lib on Debian 13
("/lib64", "usr/lib64"), # 64-bit libraries / ELF interpreter
]
# Resource limits to prevent fork bombs, memory exhaustion, and disk abuse.
# Applied via ulimit inside the sandbox before exec'ing the user command.
_RESOURCE_LIMITS = (
"ulimit -u 64" # max 64 processes (prevents fork bombs)
" -v 524288" # 512 MB virtual memory
" -f 51200" # 50 MB max file size (1024-byte blocks)
" -n 256" # 256 open file descriptors
" 2>/dev/null"
)
def _build_bwrap_command(
command: list[str], cwd: str, env: dict[str, str]
) -> list[str]:
"""Build a bubblewrap command with strict filesystem + network isolation.
Security model:
- **Whitelist-only filesystem**: only system directories (``/usr``, ``/etc``,
``/bin``, ``/lib``) are mounted read-only. Application code (``/app``),
home directories, ``/var``, ``/opt``, etc. are NOT accessible at all.
- **Writable workspace only**: the per-session workspace is the sole
writable path.
- **Clean environment**: ``--clearenv`` wipes all inherited env vars.
Only the explicitly-passed safe env vars are set inside the sandbox.
- **Network isolation**: ``--unshare-net`` blocks all network access.
- **Resource limits**: ulimit caps on processes (64), memory (512MB),
file size (50MB), and open FDs (256) to prevent fork bombs and abuse.
- **New session**: prevents terminal control escape.
- **Die with parent**: prevents orphaned sandbox processes.
"""
cmd = [
"bwrap",
# Create a new user namespace so bwrap can set up sandboxing
# inside unprivileged Docker containers (no CAP_SYS_ADMIN needed).
"--unshare-user",
# Wipe all inherited environment variables (API keys, secrets, etc.)
"--clearenv",
]
# Set only the safe env vars inside the sandbox
for key, value in env.items():
cmd.extend(["--setenv", key, value])
# System directories: read-only
for path in _SYSTEM_RO_BINDS:
cmd.extend(["--ro-bind", path, path])
# Compat paths: use --symlink when host path is a symlink (Debian 13),
# --ro-bind when it's a real directory (older distros).
for path, symlink_target in _COMPAT_PATHS:
if os.path.islink(path):
cmd.extend(["--symlink", symlink_target, path])
elif os.path.exists(path):
cmd.extend(["--ro-bind", path, path])
# Wrap the user command with resource limits:
# sh -c 'ulimit ...; exec "$@"' -- <original command>
# `exec "$@"` replaces the shell so there's no extra process overhead,
# and properly handles arguments with spaces.
limited_command = [
"sh",
"-c",
f'{_RESOURCE_LIMITS}; exec "$@"',
"--",
*command,
]
cmd.extend(
[
# Fresh virtual filesystems
"--dev",
"/dev",
"--proc",
"/proc",
"--tmpfs",
"/tmp",
# Workspace bind AFTER --tmpfs /tmp so it's visible through the tmpfs.
# (workspace lives under /tmp/copilot-<session>)
"--bind",
cwd,
cwd,
# Isolation
"--unshare-net",
"--die-with-parent",
"--new-session",
"--chdir",
cwd,
"--",
*limited_command,
]
)
return cmd
# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------
async def run_sandboxed(
command: list[str],
cwd: str,
timeout: int = _DEFAULT_TIMEOUT,
env: dict[str, str] | None = None,
) -> tuple[str, str, int, bool]:
"""Run a command inside a bubblewrap sandbox.
Callers **must** check :func:`has_full_sandbox` before calling this
function. If bubblewrap is not available, this function raises
:class:`RuntimeError` rather than running unsandboxed.
Returns:
(stdout, stderr, exit_code, timed_out)
"""
if not has_full_sandbox():
raise RuntimeError(
"run_sandboxed() requires bubblewrap but bwrap is not available. "
"Callers must check has_full_sandbox() before calling this function."
)
timeout = min(max(timeout, 1), _MAX_TIMEOUT)
safe_env = {
"PATH": "/usr/local/bin:/usr/bin:/bin",
"HOME": cwd,
"TMPDIR": cwd,
"LANG": "en_US.UTF-8",
"PYTHONDONTWRITEBYTECODE": "1",
"PYTHONIOENCODING": "utf-8",
}
if env:
safe_env.update(env)
full_command = _build_bwrap_command(command, cwd, safe_env)
try:
proc = await asyncio.create_subprocess_exec(
*full_command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=cwd,
env=safe_env,
)
try:
stdout_bytes, stderr_bytes = await asyncio.wait_for(
proc.communicate(), timeout=timeout
)
stdout = stdout_bytes.decode("utf-8", errors="replace")[:_MAX_OUTPUT_CHARS]
stderr = stderr_bytes.decode("utf-8", errors="replace")[:_MAX_OUTPUT_CHARS]
return stdout, stderr, proc.returncode or 0, False
except asyncio.TimeoutError:
proc.kill()
await proc.communicate()
return "", f"Execution timed out after {timeout}s", -1, True
except RuntimeError:
raise
except Exception as e:
return "", f"Sandbox error: {e}", -1, False

View File

@@ -1,153 +0,0 @@
"""Tests for BlockDetailsResponse in RunBlockTool."""
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from backend.api.features.chat.tools.models import BlockDetailsResponse
from backend.api.features.chat.tools.run_block import RunBlockTool
from backend.blocks._base import BlockType
from backend.data.model import CredentialsMetaInput
from backend.integrations.providers import ProviderName
from ._test_data import make_session
_TEST_USER_ID = "test-user-run-block-details"
def make_mock_block_with_inputs(
block_id: str, name: str, description: str = "Test description"
):
"""Create a mock block with input/output schemas for testing."""
mock = MagicMock()
mock.id = block_id
mock.name = name
mock.description = description
mock.block_type = BlockType.STANDARD
mock.disabled = False
# Input schema with non-credential fields
mock.input_schema = MagicMock()
mock.input_schema.jsonschema.return_value = {
"properties": {
"url": {"type": "string", "description": "URL to fetch"},
"method": {"type": "string", "description": "HTTP method"},
},
"required": ["url"],
}
mock.input_schema.get_credentials_fields.return_value = {}
mock.input_schema.get_credentials_fields_info.return_value = {}
# Output schema
mock.output_schema = MagicMock()
mock.output_schema.jsonschema.return_value = {
"properties": {
"response": {"type": "object", "description": "HTTP response"},
"error": {"type": "string", "description": "Error message"},
}
}
return mock
@pytest.mark.asyncio(loop_scope="session")
async def test_run_block_returns_details_when_no_input_provided():
"""When run_block is called without input_data, it should return BlockDetailsResponse."""
session = make_session(user_id=_TEST_USER_ID)
# Create a block with inputs
http_block = make_mock_block_with_inputs(
"http-block-id", "HTTP Request", "Send HTTP requests"
)
with patch(
"backend.api.features.chat.tools.run_block.get_block",
return_value=http_block,
):
# Mock credentials check to return no missing credentials
with patch.object(
RunBlockTool,
"_resolve_block_credentials",
new_callable=AsyncMock,
return_value=({}, []), # (matched_credentials, missing_credentials)
):
tool = RunBlockTool()
response = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
block_id="http-block-id",
input_data={}, # Empty input data
)
# Should return BlockDetailsResponse showing the schema
assert isinstance(response, BlockDetailsResponse)
assert response.block.id == "http-block-id"
assert response.block.name == "HTTP Request"
assert response.block.description == "Send HTTP requests"
assert "url" in response.block.inputs["properties"]
assert "method" in response.block.inputs["properties"]
assert "response" in response.block.outputs["properties"]
assert response.user_authenticated is True
@pytest.mark.asyncio(loop_scope="session")
async def test_run_block_returns_details_when_only_credentials_provided():
"""When only credentials are provided (no actual input), should return details."""
session = make_session(user_id=_TEST_USER_ID)
# Create a block with both credential and non-credential inputs
mock = MagicMock()
mock.id = "api-block-id"
mock.name = "API Call"
mock.description = "Make API calls"
mock.block_type = BlockType.STANDARD
mock.disabled = False
mock.input_schema = MagicMock()
mock.input_schema.jsonschema.return_value = {
"properties": {
"credentials": {"type": "object", "description": "API credentials"},
"endpoint": {"type": "string", "description": "API endpoint"},
},
"required": ["credentials", "endpoint"],
}
mock.input_schema.get_credentials_fields.return_value = {"credentials": True}
mock.input_schema.get_credentials_fields_info.return_value = {}
mock.output_schema = MagicMock()
mock.output_schema.jsonschema.return_value = {
"properties": {"result": {"type": "object"}}
}
with patch(
"backend.api.features.chat.tools.run_block.get_block",
return_value=mock,
):
with patch.object(
RunBlockTool,
"_resolve_block_credentials",
new_callable=AsyncMock,
return_value=(
{
"credentials": CredentialsMetaInput(
id="cred-id",
provider=ProviderName("test_provider"),
type="api_key",
title="Test Credential",
)
},
[],
),
):
tool = RunBlockTool()
response = await tool._execute(
user_id=_TEST_USER_ID,
session=session,
block_id="api-block-id",
input_data={"credentials": {"some": "cred"}}, # Only credential
)
# Should return details because no non-credential inputs provided
assert isinstance(response, BlockDetailsResponse)
assert response.block.id == "api-block-id"
assert response.block.name == "API Call"

View File

@@ -0,0 +1,156 @@
"""Web fetch tool — safely retrieve public web page content."""
import logging
from typing import Any
import aiohttp
import html2text
from backend.api.features.chat.model import ChatSession
from backend.api.features.chat.tools.base import BaseTool
from backend.api.features.chat.tools.models import (
ErrorResponse,
ToolResponseBase,
WebFetchResponse,
)
from backend.util.request import Requests
logger = logging.getLogger(__name__)
# Limits
_MAX_CONTENT_BYTES = 102_400 # 100 KB download cap
_MAX_OUTPUT_CHARS = 50_000 # 50K char truncation for LLM context
_REQUEST_TIMEOUT = aiohttp.ClientTimeout(total=15)
# Content types we'll read as text
_TEXT_CONTENT_TYPES = {
"text/html",
"text/plain",
"text/xml",
"text/csv",
"text/markdown",
"application/json",
"application/xml",
"application/xhtml+xml",
"application/rss+xml",
"application/atom+xml",
}
def _is_text_content(content_type: str) -> bool:
base = content_type.split(";")[0].strip().lower()
return base in _TEXT_CONTENT_TYPES or base.startswith("text/")
def _html_to_text(html: str) -> str:
h = html2text.HTML2Text()
h.ignore_links = False
h.ignore_images = True
h.body_width = 0
return h.handle(html)
class WebFetchTool(BaseTool):
"""Safely fetch content from a public URL using SSRF-protected HTTP."""
@property
def name(self) -> str:
return "web_fetch"
@property
def description(self) -> str:
return (
"Fetch the content of a public web page by URL. "
"Returns readable text extracted from HTML by default. "
"Useful for reading documentation, articles, and API responses. "
"Only supports HTTP/HTTPS GET requests to public URLs "
"(private/internal network addresses are blocked)."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "The public HTTP/HTTPS URL to fetch.",
},
"extract_text": {
"type": "boolean",
"description": (
"If true (default), extract readable text from HTML. "
"If false, return raw content."
),
"default": True,
},
},
"required": ["url"],
}
@property
def requires_auth(self) -> bool:
return False
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs: Any,
) -> ToolResponseBase:
url: str = (kwargs.get("url") or "").strip()
extract_text: bool = kwargs.get("extract_text", True)
session_id = session.session_id if session else None
if not url:
return ErrorResponse(
message="Please provide a URL to fetch.",
error="missing_url",
session_id=session_id,
)
try:
client = Requests(raise_for_status=False, retry_max_attempts=1)
response = await client.get(url, timeout=_REQUEST_TIMEOUT)
except ValueError as e:
# validate_url raises ValueError for SSRF / blocked IPs
return ErrorResponse(
message=f"URL blocked: {e}",
error="url_blocked",
session_id=session_id,
)
except Exception as e:
logger.warning(f"[web_fetch] Request failed for {url}: {e}")
return ErrorResponse(
message=f"Failed to fetch URL: {e}",
error="fetch_failed",
session_id=session_id,
)
content_type = response.headers.get("content-type", "")
if not _is_text_content(content_type):
return ErrorResponse(
message=f"Non-text content type: {content_type.split(';')[0]}",
error="unsupported_content_type",
session_id=session_id,
)
raw = response.content[:_MAX_CONTENT_BYTES]
text = raw.decode("utf-8", errors="replace")
if extract_text and "html" in content_type.lower():
text = _html_to_text(text)
truncated = len(text) > _MAX_OUTPUT_CHARS
if truncated:
text = text[:_MAX_OUTPUT_CHARS]
return WebFetchResponse(
message=f"Fetched {url}" + (" (truncated)" if truncated else ""),
url=response.url,
status_code=response.status,
content_type=content_type.split(";")[0].strip(),
content=text,
truncated=truncated,
session_id=session_id,
)

View File

@@ -88,7 +88,9 @@ class ListWorkspaceFilesTool(BaseTool):
@property
def description(self) -> str:
return (
"List files in the user's workspace. "
"List files in the user's persistent workspace (cloud storage). "
"These files survive across sessions. "
"For ephemeral session files, use the SDK Read/Glob tools instead. "
"Returns file names, paths, sizes, and metadata. "
"Optionally filter by path prefix."
)
@@ -204,7 +206,9 @@ class ReadWorkspaceFileTool(BaseTool):
@property
def description(self) -> str:
return (
"Read a file from the user's workspace. "
"Read a file from the user's persistent workspace (cloud storage). "
"These files survive across sessions. "
"For ephemeral session files, use the SDK Read tool instead. "
"Specify either file_id or path to identify the file. "
"For small text files, returns content directly. "
"For large or binary files, returns metadata and a download URL. "
@@ -378,7 +382,9 @@ class WriteWorkspaceFileTool(BaseTool):
@property
def description(self) -> str:
return (
"Write or create a file in the user's workspace. "
"Write or create a file in the user's persistent workspace (cloud storage). "
"These files survive across sessions. "
"For ephemeral session files, use the SDK Write tool instead. "
"Provide the content as a base64-encoded string. "
f"Maximum file size is {Config().max_file_size_mb}MB. "
"Files are saved to the current session's folder by default. "
@@ -523,7 +529,7 @@ class DeleteWorkspaceFileTool(BaseTool):
@property
def description(self) -> str:
return (
"Delete a file from the user's workspace. "
"Delete a file from the user's persistent workspace (cloud storage). "
"Specify either file_id or path to identify the file. "
"Paths are scoped to the current session by default. "
"Use /sessions/<session_id>/... for cross-session access."

View File

@@ -126,7 +126,6 @@ class PrintToConsoleBlock(Block):
output_schema=PrintToConsoleBlock.Output,
test_input={"text": "Hello, World!"},
is_sensitive_action=True,
disabled=True, # Disabled per Nick Tindle's request (OPEN-3000)
test_output=[
("output", "Hello, World!"),
("status", "printed"),

View File

@@ -38,6 +38,7 @@ class Flag(str, Enum):
AGENT_ACTIVITY = "agent-activity"
ENABLE_PLATFORM_PAYMENT = "enable-platform-payment"
CHAT = "chat"
COPILOT_SDK = "copilot-sdk"
def is_configured() -> bool:

View File

@@ -368,10 +368,6 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
default=600,
description="The timeout in seconds for Agent Generator service requests (includes retries for rate limits)",
)
agentgenerator_use_dummy: bool = Field(
default=False,
description="Use dummy agent generator responses for testing (bypasses external service)",
)
enable_example_blocks: bool = Field(
default=False,
@@ -662,17 +658,6 @@ class Secrets(UpdateTrackingModel["Secrets"], BaseSettings):
mem0_api_key: str = Field(default="", description="Mem0 API key")
elevenlabs_api_key: str = Field(default="", description="ElevenLabs API key")
linear_api_key: str = Field(
default="", description="Linear API key for system-level operations"
)
linear_feature_request_project_id: str = Field(
default="",
description="Linear project ID where feature requests are tracked",
)
linear_feature_request_team_id: str = Field(
default="",
description="Linear team ID used when creating feature request issues",
)
linear_client_id: str = Field(default="", description="Linear client ID")
linear_client_secret: str = Field(default="", description="Linear client secret")

View File

@@ -441,14 +441,14 @@ develop = true
colorama = "^0.4.6"
cryptography = "^46.0"
expiringdict = "^1.2.2"
fastapi = "^0.128.7"
fastapi = "^0.128.0"
google-cloud-logging = "^3.13.0"
launchdarkly-server-sdk = "^9.15.0"
launchdarkly-server-sdk = "^9.14.1"
pydantic = "^2.12.5"
pydantic-settings = "^2.12.0"
pyjwt = {version = "^2.11.0", extras = ["crypto"]}
redis = "^6.2.0"
supabase = "^2.28.0"
supabase = "^2.27.2"
uvicorn = "^0.40.0"
[package.source]
@@ -897,6 +897,29 @@ files = [
{file = "charset_normalizer-3.4.4.tar.gz", hash = "sha256:94537985111c35f28720e43603b8e7b43a6ecfb2ce1d3058bbe955b73404e21a"},
]
[[package]]
name = "claude-agent-sdk"
version = "0.1.35"
description = "Python SDK for Claude Code"
optional = false
python-versions = ">=3.10"
groups = ["main"]
files = [
{file = "claude_agent_sdk-0.1.35-py3-none-macosx_11_0_arm64.whl", hash = "sha256:df67f4deade77b16a9678b3a626c176498e40417f33b04beda9628287f375591"},
{file = "claude_agent_sdk-0.1.35-py3-none-manylinux_2_17_aarch64.whl", hash = "sha256:14963944f55ded7c8ed518feebfa5b4284aa6dd8d81aeff2e5b21a962ce65097"},
{file = "claude_agent_sdk-0.1.35-py3-none-manylinux_2_17_x86_64.whl", hash = "sha256:84344dcc535d179c1fc8a11c6f34c37c3b583447bdf09d869effb26514fd7a65"},
{file = "claude_agent_sdk-0.1.35-py3-none-win_amd64.whl", hash = "sha256:1b3d54b47448c93f6f372acd4d1757f047c3c1e8ef5804be7a1e3e53e2c79a5f"},
{file = "claude_agent_sdk-0.1.35.tar.gz", hash = "sha256:0f98e2b3c71ca85abfc042e7a35c648df88e87fda41c52e6779ef7b038dcbb52"},
]
[package.dependencies]
anyio = ">=4.0.0"
mcp = ">=0.1.0"
typing-extensions = {version = ">=4.0.0", markers = "python_version < \"3.11\""}
[package.extras]
dev = ["anyio[trio] (>=4.0.0)", "mypy (>=1.0.0)", "pytest (>=7.0.0)", "pytest-asyncio (>=0.20.0)", "pytest-cov (>=4.0.0)", "ruff (>=0.1.0)"]
[[package]]
name = "cleo"
version = "2.1.0"
@@ -1382,14 +1405,14 @@ tzdata = "*"
[[package]]
name = "fastapi"
version = "0.128.7"
version = "0.128.6"
description = "FastAPI framework, high performance, easy to learn, fast to code, ready for production"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "fastapi-0.128.7-py3-none-any.whl", hash = "sha256:6bd9bd31cb7047465f2d3fa3ba3f33b0870b17d4eaf7cdb36d1576ab060ad662"},
{file = "fastapi-0.128.7.tar.gz", hash = "sha256:783c273416995486c155ad2c0e2b45905dedfaf20b9ef8d9f6a9124670639a24"},
{file = "fastapi-0.128.6-py3-none-any.whl", hash = "sha256:bb1c1ef87d6086a7132d0ab60869d6f1ee67283b20fbf84ec0003bd335099509"},
{file = "fastapi-0.128.6.tar.gz", hash = "sha256:0cb3946557e792d731b26a42b04912f16367e3c3135ea8290f620e234f2b604f"},
]
[package.dependencies]
@@ -2593,6 +2616,18 @@ http2 = ["h2 (>=3,<5)"]
socks = ["socksio (==1.*)"]
zstd = ["zstandard (>=0.18.0)"]
[[package]]
name = "httpx-sse"
version = "0.4.3"
description = "Consume Server-Sent Event (SSE) messages with HTTPX."
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "httpx_sse-0.4.3-py3-none-any.whl", hash = "sha256:0ac1c9fe3c0afad2e0ebb25a934a59f4c7823b60792691f779fad2c5568830fc"},
{file = "httpx_sse-0.4.3.tar.gz", hash = "sha256:9b1ed0127459a66014aec3c56bebd93da3c1bc8bb6618c8082039a44889a755d"},
]
[[package]]
name = "huggingface-hub"
version = "1.4.1"
@@ -3117,14 +3152,14 @@ urllib3 = ">=1.26.0,<3"
[[package]]
name = "launchdarkly-server-sdk"
version = "9.15.0"
version = "9.14.1"
description = "LaunchDarkly SDK for Python"
optional = false
python-versions = ">=3.10"
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "launchdarkly_server_sdk-9.15.0-py3-none-any.whl", hash = "sha256:c267e29bfa3fb5e2a06a208448ada6ed5557a2924979b8d79c970b45d227c668"},
{file = "launchdarkly_server_sdk-9.15.0.tar.gz", hash = "sha256:f31441b74bc1a69c381db57c33116509e407a2612628ad6dff0a7dbb39d5020b"},
{file = "launchdarkly_server_sdk-9.14.1-py3-none-any.whl", hash = "sha256:a9e2bd9ecdef845cd631ae0d4334a1115e5b44257c42eb2349492be4bac7815c"},
{file = "launchdarkly_server_sdk-9.14.1.tar.gz", hash = "sha256:1df44baf0a0efa74d8c1dad7a00592b98bce7d19edded7f770da8dbc49922213"},
]
[package.dependencies]
@@ -3310,6 +3345,39 @@ files = [
{file = "mccabe-0.7.0.tar.gz", hash = "sha256:348e0240c33b60bbdf4e523192ef919f28cb2c3d7d5c7794f74009290f236325"},
]
[[package]]
name = "mcp"
version = "1.26.0"
description = "Model Context Protocol SDK"
optional = false
python-versions = ">=3.10"
groups = ["main"]
files = [
{file = "mcp-1.26.0-py3-none-any.whl", hash = "sha256:904a21c33c25aa98ddbeb47273033c435e595bbacfdb177f4bd87f6dceebe1ca"},
{file = "mcp-1.26.0.tar.gz", hash = "sha256:db6e2ef491eecc1a0d93711a76f28dec2e05999f93afd48795da1c1137142c66"},
]
[package.dependencies]
anyio = ">=4.5"
httpx = ">=0.27.1"
httpx-sse = ">=0.4"
jsonschema = ">=4.20.0"
pydantic = ">=2.11.0,<3.0.0"
pydantic-settings = ">=2.5.2"
pyjwt = {version = ">=2.10.1", extras = ["crypto"]}
python-multipart = ">=0.0.9"
pywin32 = {version = ">=310", markers = "sys_platform == \"win32\""}
sse-starlette = ">=1.6.1"
starlette = ">=0.27"
typing-extensions = ">=4.9.0"
typing-inspection = ">=0.4.1"
uvicorn = {version = ">=0.31.1", markers = "sys_platform != \"emscripten\""}
[package.extras]
cli = ["python-dotenv (>=1.0.0)", "typer (>=0.16.0)"]
rich = ["rich (>=13.9.4)"]
ws = ["websockets (>=15.0.1)"]
[[package]]
name = "mdurl"
version = "0.1.2"
@@ -4728,14 +4796,14 @@ tests = ["coverage-conditional-plugin (>=0.9.0)", "portalocker[redis]", "pytest
[[package]]
name = "postgrest"
version = "2.28.0"
version = "2.27.3"
description = "PostgREST client for Python. This library provides an ORM interface to PostgREST."
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "postgrest-2.28.0-py3-none-any.whl", hash = "sha256:7bca2f24dd1a1bf8a3d586c7482aba6cd41662da6733045fad585b63b7f7df75"},
{file = "postgrest-2.28.0.tar.gz", hash = "sha256:c36b38646d25ea4255321d3d924ce70f8d20ec7799cb42c1221d6a818d4f6515"},
{file = "postgrest-2.27.3-py3-none-any.whl", hash = "sha256:ed79123af7127edd78d538bfe8351d277e45b1a36994a4dbf57ae27dde87a7b7"},
{file = "postgrest-2.27.3.tar.gz", hash = "sha256:c2e2679addfc8eaab23197bad7ddaee6cbb4cbe8c483ebd2d2e5219543037cc3"},
]
[package.dependencies]
@@ -5994,7 +6062,7 @@ description = "Python for Window Extensions"
optional = false
python-versions = "*"
groups = ["main"]
markers = "platform_system == \"Windows\""
markers = "sys_platform == \"win32\" or platform_system == \"Windows\""
files = [
{file = "pywin32-311-cp310-cp310-win32.whl", hash = "sha256:d03ff496d2a0cd4a5893504789d4a15399133fe82517455e78bad62efbb7f0a3"},
{file = "pywin32-311-cp310-cp310-win_amd64.whl", hash = "sha256:797c2772017851984b97180b0bebe4b620bb86328e8a884bb626156295a63b3b"},
@@ -6260,14 +6328,14 @@ all = ["numpy"]
[[package]]
name = "realtime"
version = "2.28.0"
version = "2.27.3"
description = ""
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "realtime-2.28.0-py3-none-any.whl", hash = "sha256:db1bd59bab9b1fcc9f9d3b1a073bed35bf4994d720e6751f10031a58d57a3836"},
{file = "realtime-2.28.0.tar.gz", hash = "sha256:d18cedcebd6a8f22fcd509bc767f639761eb218b7b2b6f14fc4205b6259b50fc"},
{file = "realtime-2.27.3-py3-none-any.whl", hash = "sha256:f571115f86988e33c41c895cb3fba2eaa1b693aeaede3617288f44274ca90f43"},
{file = "realtime-2.27.3.tar.gz", hash = "sha256:02b082243107656a5ef3fb63e8e2ab4c40bc199abb45adb8a42ed63f089a1041"},
]
[package.dependencies]
@@ -6974,6 +7042,28 @@ postgresql-psycopgbinary = ["psycopg[binary] (>=3.0.7)"]
pymysql = ["pymysql"]
sqlcipher = ["sqlcipher3_binary"]
[[package]]
name = "sse-starlette"
version = "3.2.0"
description = "SSE plugin for Starlette"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "sse_starlette-3.2.0-py3-none-any.whl", hash = "sha256:5876954bd51920fc2cd51baee47a080eb88a37b5b784e615abb0b283f801cdbf"},
{file = "sse_starlette-3.2.0.tar.gz", hash = "sha256:8127594edfb51abe44eac9c49e59b0b01f1039d0c7461c6fd91d4e03b70da422"},
]
[package.dependencies]
anyio = ">=4.7.0"
starlette = ">=0.49.1"
[package.extras]
daphne = ["daphne (>=4.2.0)"]
examples = ["aiosqlite (>=0.21.0)", "fastapi (>=0.115.12)", "sqlalchemy[asyncio] (>=2.0.41)", "uvicorn (>=0.34.0)"]
granian = ["granian (>=2.3.1)"]
uvicorn = ["uvicorn (>=0.34.0)"]
[[package]]
name = "stagehand"
version = "0.5.9"
@@ -7024,14 +7114,14 @@ full = ["httpx (>=0.27.0,<0.29.0)", "itsdangerous", "jinja2", "python-multipart
[[package]]
name = "storage3"
version = "2.28.0"
version = "2.27.3"
description = "Supabase Storage client for Python."
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "storage3-2.28.0-py3-none-any.whl", hash = "sha256:ecb50efd2ac71dabbdf97e99ad346eafa630c4c627a8e5a138ceb5fbbadae716"},
{file = "storage3-2.28.0.tar.gz", hash = "sha256:bc1d008aff67de7a0f2bd867baee7aadbcdb6f78f5a310b4f7a38e8c13c19865"},
{file = "storage3-2.27.3-py3-none-any.whl", hash = "sha256:11a05b7da84bccabeeea12d940bca3760cf63fe6ca441868677335cfe4fdfbe0"},
{file = "storage3-2.27.3.tar.gz", hash = "sha256:dc1a4a010cf36d5482c5cb6c1c28fc5f00e23284342b89e4ae43b5eae8501ddb"},
]
[package.dependencies]
@@ -7091,35 +7181,35 @@ typing-extensions = {version = ">=4.5.0", markers = "python_version >= \"3.7\""}
[[package]]
name = "supabase"
version = "2.28.0"
version = "2.27.3"
description = "Supabase client for Python."
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "supabase-2.28.0-py3-none-any.whl", hash = "sha256:42776971c7d0ccca16034df1ab96a31c50228eb1eb19da4249ad2f756fc20272"},
{file = "supabase-2.28.0.tar.gz", hash = "sha256:aea299aaab2a2eed3c57e0be7fc035c6807214194cce795a3575add20268ece1"},
{file = "supabase-2.27.3-py3-none-any.whl", hash = "sha256:082a74642fcf9954693f1ce8c251baf23e4bda26ffdbc8dcd4c99c82e60d69ff"},
{file = "supabase-2.27.3.tar.gz", hash = "sha256:5e5a348232ac4315c1032ddd687278f0b982465471f0cbb52bca7e6a66495ff3"},
]
[package.dependencies]
httpx = ">=0.26,<0.29"
postgrest = "2.28.0"
realtime = "2.28.0"
storage3 = "2.28.0"
supabase-auth = "2.28.0"
supabase-functions = "2.28.0"
postgrest = "2.27.3"
realtime = "2.27.3"
storage3 = "2.27.3"
supabase-auth = "2.27.3"
supabase-functions = "2.27.3"
yarl = ">=1.22.0"
[[package]]
name = "supabase-auth"
version = "2.28.0"
version = "2.27.3"
description = "Python Client Library for Supabase Auth"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "supabase_auth-2.28.0-py3-none-any.whl", hash = "sha256:2ac85026cc285054c7fa6d41924f3a333e9ec298c013e5b5e1754039ba7caec9"},
{file = "supabase_auth-2.28.0.tar.gz", hash = "sha256:2bb8f18ff39934e44b28f10918db965659f3735cd6fbfcc022fe0b82dbf8233e"},
{file = "supabase_auth-2.27.3-py3-none-any.whl", hash = "sha256:82a4262eaad85383319d394dab0eea11fcf3ebd774062aef8ea3874ae2f02579"},
{file = "supabase_auth-2.27.3.tar.gz", hash = "sha256:39894d4bc60b6f23b5cff4d0d7d4c1659e5d69563cadf014d4896f780ca8ca78"},
]
[package.dependencies]
@@ -7129,14 +7219,14 @@ pyjwt = {version = ">=2.10.1", extras = ["crypto"]}
[[package]]
name = "supabase-functions"
version = "2.28.0"
version = "2.27.3"
description = "Library for Supabase Functions"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
{file = "supabase_functions-2.28.0-py3-none-any.whl", hash = "sha256:30bf2d586f8df285faf0621bb5d5bb3ec3157234fc820553ca156f009475e4ae"},
{file = "supabase_functions-2.28.0.tar.gz", hash = "sha256:db3dddfc37aca5858819eb461130968473bd8c75bd284581013958526dac718b"},
{file = "supabase_functions-2.27.3-py3-none-any.whl", hash = "sha256:9d14a931d49ede1c6cf5fbfceb11c44061535ba1c3f310f15384964d86a83d9e"},
{file = "supabase_functions-2.27.3.tar.gz", hash = "sha256:e954f1646da8ca6e7e16accef58d0884a5f97b25956ee98e7d4927a210ed92f9"},
]
[package.dependencies]
@@ -8440,4 +8530,4 @@ cffi = ["cffi (>=1.17,<2.0) ; platform_python_implementation != \"PyPy\" and pyt
[metadata]
lock-version = "2.1"
python-versions = ">=3.10,<3.14"
content-hash = "fa9c5deadf593e815dd2190f58e22152373900603f5f244b9616cd721de84d2f"
content-hash = "942dea6daf671c3be65a22f3445feda26c1af9409d7173765e9a0742f0aa05dc"

View File

@@ -16,6 +16,7 @@ anthropic = "^0.79.0"
apscheduler = "^3.11.1"
autogpt-libs = { path = "../autogpt_libs", develop = true }
bleach = { extras = ["css"], version = "^6.2.0" }
claude-agent-sdk = "^0.1.0"
click = "^8.2.0"
cryptography = "^46.0"
discord-py = "^2.5.2"
@@ -65,7 +66,7 @@ sentry-sdk = {extras = ["anthropic", "fastapi", "launchdarkly", "openai", "sqlal
sqlalchemy = "^2.0.40"
strenum = "^0.4.9"
stripe = "^11.5.0"
supabase = "2.28.0"
supabase = "2.27.3"
tenacity = "^9.1.4"
todoist-api-python = "^2.1.7"
tweepy = "^4.16.0"

View File

@@ -25,7 +25,6 @@ class TestServiceConfiguration:
"""Test that external service is not configured when host is empty."""
mock_settings = MagicMock()
mock_settings.config.agentgenerator_host = ""
mock_settings.config.agentgenerator_use_dummy = False
with patch.object(service, "_get_settings", return_value=mock_settings):
assert service.is_external_service_configured() is False

View File

@@ -0,0 +1,133 @@
"""Tests for SDK security hooks — workspace paths, tool access, and deny messages.
These are pure unit tests with no external dependencies (no SDK, no DB, no server).
They validate that the security hooks correctly block unauthorized paths,
tool access, and dangerous input patterns.
Note: Bash command validation was removed — the SDK built-in Bash tool is not in
allowed_tools, and the bash_exec MCP tool has kernel-level network isolation
(unshare --net) making command-level parsing unnecessary.
"""
from backend.api.features.chat.sdk.security_hooks import (
_validate_tool_access,
_validate_workspace_path,
)
SDK_CWD = "/tmp/copilot-test-session"
def _is_denied(result: dict) -> bool:
hook = result.get("hookSpecificOutput", {})
return hook.get("permissionDecision") == "deny"
def _reason(result: dict) -> str:
return result.get("hookSpecificOutput", {}).get("permissionDecisionReason", "")
# ============================================================
# Workspace path validation (Read, Write, Edit, etc.)
# ============================================================
class TestWorkspacePathValidation:
def test_path_in_workspace(self):
result = _validate_workspace_path(
"Read", {"file_path": f"{SDK_CWD}/file.txt"}, SDK_CWD
)
assert not _is_denied(result)
def test_path_outside_workspace(self):
result = _validate_workspace_path("Read", {"file_path": "/etc/passwd"}, SDK_CWD)
assert _is_denied(result)
def test_tool_results_allowed(self):
result = _validate_workspace_path(
"Read",
{"file_path": "~/.claude/projects/abc/tool-results/out.txt"},
SDK_CWD,
)
assert not _is_denied(result)
def test_claude_settings_blocked(self):
result = _validate_workspace_path(
"Read", {"file_path": "~/.claude/settings.json"}, SDK_CWD
)
assert _is_denied(result)
def test_claude_projects_without_tool_results(self):
result = _validate_workspace_path(
"Read", {"file_path": "~/.claude/projects/abc/credentials.json"}, SDK_CWD
)
assert _is_denied(result)
def test_no_path_allowed(self):
"""Glob/Grep without path defaults to cwd — should be allowed."""
result = _validate_workspace_path("Grep", {"pattern": "foo"}, SDK_CWD)
assert not _is_denied(result)
def test_path_traversal_with_dotdot(self):
result = _validate_workspace_path(
"Read", {"file_path": f"{SDK_CWD}/../../../etc/passwd"}, SDK_CWD
)
assert _is_denied(result)
# ============================================================
# Tool access validation
# ============================================================
class TestToolAccessValidation:
def test_blocked_tools(self):
for tool in ("bash", "shell", "exec", "terminal", "command"):
result = _validate_tool_access(tool, {})
assert _is_denied(result), f"Tool '{tool}' should be blocked"
def test_bash_builtin_blocked(self):
"""SDK built-in Bash (capital) is blocked as defence-in-depth."""
result = _validate_tool_access("Bash", {"command": "echo hello"}, SDK_CWD)
assert _is_denied(result)
assert "Bash" in _reason(result)
def test_workspace_tools_delegate(self):
result = _validate_tool_access(
"Read", {"file_path": f"{SDK_CWD}/file.txt"}, SDK_CWD
)
assert not _is_denied(result)
def test_dangerous_pattern_blocked(self):
result = _validate_tool_access("SomeUnknownTool", {"data": "sudo rm -rf /"})
assert _is_denied(result)
def test_safe_unknown_tool_allowed(self):
result = _validate_tool_access("SomeSafeTool", {"data": "hello world"})
assert not _is_denied(result)
# ============================================================
# Deny message quality (ntindle feedback)
# ============================================================
class TestDenyMessageClarity:
"""Deny messages must include [SECURITY] and 'cannot be bypassed'
so the model knows the restriction is enforced, not a suggestion."""
def test_blocked_tool_message(self):
reason = _reason(_validate_tool_access("bash", {}))
assert "[SECURITY]" in reason
assert "cannot be bypassed" in reason
def test_bash_builtin_blocked_message(self):
reason = _reason(_validate_tool_access("Bash", {"command": "echo hello"}))
assert "[SECURITY]" in reason
assert "cannot be bypassed" in reason
def test_workspace_path_message(self):
reason = _reason(
_validate_workspace_path("Read", {"file_path": "/etc/passwd"}, SDK_CWD)
)
assert "[SECURITY]" in reason
assert "cannot be bypassed" in reason

View File

@@ -37,7 +37,7 @@ services:
context: ../
dockerfile: autogpt_platform/backend/Dockerfile
target: migrate
command: ["sh", "-c", "prisma generate && python3 gen_prisma_types_stub.py && prisma migrate deploy"]
command: ["sh", "-c", "poetry run prisma generate && poetry run gen-prisma-stub && poetry run prisma migrate deploy"]
develop:
watch:
- path: ./
@@ -56,7 +56,7 @@ services:
test:
[
"CMD-SHELL",
"prisma migrate status | grep -q 'No pending migrations' || exit 1",
"poetry run prisma migrate status | grep -q 'No pending migrations' || exit 1",
]
interval: 30s
timeout: 10s

View File

@@ -22,11 +22,6 @@ Sentry.init({
enabled: shouldEnable,
// Suppress cross-origin stylesheet errors from Sentry Replay (rrweb)
// serializing DOM snapshots with cross-origin stylesheets
// (e.g., from browser extensions or CDN-loaded CSS)
ignoreErrors: [/Not allowed to access cross-origin stylesheet/],
// Add optional integrations for additional features
integrations: [
Sentry.captureConsoleIntegration(),

View File

@@ -15,15 +15,12 @@ import { ToolUIPart, UIDataTypes, UIMessage, UITools } from "ai";
import { useEffect, useRef, useState } from "react";
import { CreateAgentTool } from "../../tools/CreateAgent/CreateAgent";
import { EditAgentTool } from "../../tools/EditAgent/EditAgent";
import {
CreateFeatureRequestTool,
SearchFeatureRequestsTool,
} from "../../tools/FeatureRequests/FeatureRequests";
import { FindAgentsTool } from "../../tools/FindAgents/FindAgents";
import { FindBlocksTool } from "../../tools/FindBlocks/FindBlocks";
import { RunAgentTool } from "../../tools/RunAgent/RunAgent";
import { RunBlockTool } from "../../tools/RunBlock/RunBlock";
import { SearchDocsTool } from "../../tools/SearchDocs/SearchDocs";
import { GenericTool } from "../../tools/GenericTool/GenericTool";
import { ViewAgentOutputTool } from "../../tools/ViewAgentOutput/ViewAgentOutput";
// ---------------------------------------------------------------------------
@@ -258,21 +255,17 @@ export const ChatMessagesContainer = ({
part={part as ToolUIPart}
/>
);
case "tool-search_feature_requests":
return (
<SearchFeatureRequestsTool
key={`${message.id}-${i}`}
part={part as ToolUIPart}
/>
);
case "tool-create_feature_request":
return (
<CreateFeatureRequestTool
key={`${message.id}-${i}`}
part={part as ToolUIPart}
/>
);
default:
// Render a generic tool indicator for SDK built-in
// tools (Read, Glob, Grep, etc.) or any unrecognized tool
if (part.type.startsWith("tool-")) {
return (
<GenericTool
key={`${message.id}-${i}`}
part={part as ToolUIPart}
/>
);
}
return null;
}
})}

View File

@@ -0,0 +1,10 @@
import { parseAsString, useQueryState } from "nuqs";
export function useCopilotSessionId() {
const [urlSessionId, setUrlSessionId] = useQueryState(
"sessionId",
parseAsString,
);
return { urlSessionId, setUrlSessionId };
}

View File

@@ -1,126 +0,0 @@
import { getGetV2GetSessionQueryKey } from "@/app/api/__generated__/endpoints/chat/chat";
import { useQueryClient } from "@tanstack/react-query";
import type { UIDataTypes, UIMessage, UITools } from "ai";
import { useCallback, useEffect, useRef } from "react";
import { convertChatSessionMessagesToUiMessages } from "../helpers/convertChatSessionToUiMessages";
const OPERATING_TYPES = new Set([
"operation_started",
"operation_pending",
"operation_in_progress",
]);
const POLL_INTERVAL_MS = 1_500;
/**
* Detects whether any message contains a tool part whose output indicates
* a long-running operation is still in progress.
*/
function hasOperatingTool(
messages: UIMessage<unknown, UIDataTypes, UITools>[],
) {
for (const msg of messages) {
for (const part of msg.parts) {
if (!part.type.startsWith("tool-")) continue;
const toolPart = part as { output?: unknown };
if (!toolPart.output) continue;
const output =
typeof toolPart.output === "string"
? safeParse(toolPart.output)
: toolPart.output;
if (
output &&
typeof output === "object" &&
"type" in output &&
OPERATING_TYPES.has((output as { type: string }).type)
) {
return true;
}
}
}
return false;
}
function safeParse(value: string): unknown {
try {
return JSON.parse(value);
} catch {
return null;
}
}
/**
* Polls the session endpoint while any tool is in an "operating" state
* (operation_started / operation_pending / operation_in_progress).
*
* When the session data shows the tool output has changed (e.g. to
* agent_saved), it calls `setMessages` with the updated messages.
*/
export function useLongRunningToolPolling(
sessionId: string | null,
messages: UIMessage<unknown, UIDataTypes, UITools>[],
setMessages: (
updater: (
prev: UIMessage<unknown, UIDataTypes, UITools>[],
) => UIMessage<unknown, UIDataTypes, UITools>[],
) => void,
) {
const queryClient = useQueryClient();
const intervalRef = useRef<ReturnType<typeof setInterval> | null>(null);
const stopPolling = useCallback(() => {
if (intervalRef.current) {
clearInterval(intervalRef.current);
intervalRef.current = null;
}
}, []);
const poll = useCallback(async () => {
if (!sessionId) return;
// Invalidate the query cache so the next fetch gets fresh data
await queryClient.invalidateQueries({
queryKey: getGetV2GetSessionQueryKey(sessionId),
});
// Fetch fresh session data
const data = queryClient.getQueryData<{
status: number;
data: { messages?: unknown[] };
}>(getGetV2GetSessionQueryKey(sessionId));
if (data?.status !== 200 || !data.data.messages) return;
const freshMessages = convertChatSessionMessagesToUiMessages(
sessionId,
data.data.messages,
);
if (!freshMessages || freshMessages.length === 0) return;
// Update when the long-running tool completed
if (!hasOperatingTool(freshMessages)) {
setMessages(() => freshMessages);
stopPolling();
}
}, [sessionId, queryClient, setMessages, stopPolling]);
useEffect(() => {
const shouldPoll = hasOperatingTool(messages);
// Always clear any previous interval first so we never leak timers
// when the effect re-runs due to dependency changes (e.g. messages
// updating as the LLM streams text after the tool call).
stopPolling();
if (shouldPoll && sessionId) {
intervalRef.current = setInterval(() => {
poll();
}, POLL_INTERVAL_MS);
}
return () => {
stopPolling();
};
}, [messages, sessionId, poll, stopPolling]);
}

View File

@@ -14,10 +14,6 @@ import { Text } from "@/components/atoms/Text/Text";
import { CopilotChatActionsProvider } from "../components/CopilotChatActionsProvider/CopilotChatActionsProvider";
import { CreateAgentTool } from "../tools/CreateAgent/CreateAgent";
import { EditAgentTool } from "../tools/EditAgent/EditAgent";
import {
CreateFeatureRequestTool,
SearchFeatureRequestsTool,
} from "../tools/FeatureRequests/FeatureRequests";
import { FindAgentsTool } from "../tools/FindAgents/FindAgents";
import { FindBlocksTool } from "../tools/FindBlocks/FindBlocks";
import { RunAgentTool } from "../tools/RunAgent/RunAgent";
@@ -49,8 +45,6 @@ const SECTIONS = [
"Tool: Create Agent",
"Tool: Edit Agent",
"Tool: View Agent Output",
"Tool: Search Feature Requests",
"Tool: Create Feature Request",
"Full Conversation Example",
] as const;
@@ -1427,235 +1421,6 @@ export default function StyleguidePage() {
</SubSection>
</Section>
{/* ============================================================= */}
{/* SEARCH FEATURE REQUESTS */}
{/* ============================================================= */}
<Section title="Tool: Search Feature Requests">
<SubSection label="Input streaming">
<SearchFeatureRequestsTool
part={{
type: "tool-search_feature_requests",
toolCallId: uid(),
state: "input-streaming",
input: { query: "dark mode" },
}}
/>
</SubSection>
<SubSection label="Input available">
<SearchFeatureRequestsTool
part={{
type: "tool-search_feature_requests",
toolCallId: uid(),
state: "input-available",
input: { query: "dark mode" },
}}
/>
</SubSection>
<SubSection label="Output available (with results)">
<SearchFeatureRequestsTool
part={{
type: "tool-search_feature_requests",
toolCallId: uid(),
state: "output-available",
input: { query: "dark mode" },
output: {
type: "feature_request_search",
message:
'Found 2 feature request(s) matching "dark mode".',
query: "dark mode",
count: 2,
results: [
{
id: "fr-001",
identifier: "INT-42",
title: "Add dark mode to the platform",
description:
"Users have requested a dark mode option for the builder and copilot interfaces to reduce eye strain during long sessions.",
},
{
id: "fr-002",
identifier: "INT-87",
title: "Dark theme for agent output viewer",
description:
"Specifically requesting dark theme support for the agent output/execution viewer panel.",
},
],
},
}}
/>
</SubSection>
<SubSection label="Output available (no results)">
<SearchFeatureRequestsTool
part={{
type: "tool-search_feature_requests",
toolCallId: uid(),
state: "output-available",
input: { query: "teleportation" },
output: {
type: "no_results",
message:
"No feature requests found matching 'teleportation'.",
suggestions: [
"Try different keywords",
"Use broader search terms",
"You can create a new feature request if none exists",
],
},
}}
/>
</SubSection>
<SubSection label="Output available (error)">
<SearchFeatureRequestsTool
part={{
type: "tool-search_feature_requests",
toolCallId: uid(),
state: "output-available",
input: { query: "dark mode" },
output: {
type: "error",
message: "Failed to search feature requests.",
error: "LINEAR_API_KEY environment variable is not set",
},
}}
/>
</SubSection>
<SubSection label="Output error">
<SearchFeatureRequestsTool
part={{
type: "tool-search_feature_requests",
toolCallId: uid(),
state: "output-error",
input: { query: "dark mode" },
}}
/>
</SubSection>
</Section>
{/* ============================================================= */}
{/* CREATE FEATURE REQUEST */}
{/* ============================================================= */}
<Section title="Tool: Create Feature Request">
<SubSection label="Input streaming">
<CreateFeatureRequestTool
part={{
type: "tool-create_feature_request",
toolCallId: uid(),
state: "input-streaming",
input: {
title: "Add dark mode",
description: "I would love dark mode for the platform.",
},
}}
/>
</SubSection>
<SubSection label="Input available">
<CreateFeatureRequestTool
part={{
type: "tool-create_feature_request",
toolCallId: uid(),
state: "input-available",
input: {
title: "Add dark mode",
description: "I would love dark mode for the platform.",
},
}}
/>
</SubSection>
<SubSection label="Output available (new issue created)">
<CreateFeatureRequestTool
part={{
type: "tool-create_feature_request",
toolCallId: uid(),
state: "output-available",
input: {
title: "Add dark mode",
description: "I would love dark mode for the platform.",
},
output: {
type: "feature_request_created",
message:
"Created new feature request [INT-105] Add dark mode.",
issue_id: "issue-new-123",
issue_identifier: "INT-105",
issue_title: "Add dark mode",
issue_url:
"https://linear.app/autogpt/issue/INT-105/add-dark-mode",
is_new_issue: true,
customer_name: "user-abc-123",
},
}}
/>
</SubSection>
<SubSection label="Output available (added to existing issue)">
<CreateFeatureRequestTool
part={{
type: "tool-create_feature_request",
toolCallId: uid(),
state: "output-available",
input: {
title: "Dark mode support",
description:
"Please add dark mode, it would help with long sessions.",
existing_issue_id: "fr-001",
},
output: {
type: "feature_request_created",
message:
"Added your request to existing feature request [INT-42] Add dark mode to the platform.",
issue_id: "fr-001",
issue_identifier: "INT-42",
issue_title: "Add dark mode to the platform",
issue_url:
"https://linear.app/autogpt/issue/INT-42/add-dark-mode-to-the-platform",
is_new_issue: false,
customer_name: "user-xyz-789",
},
}}
/>
</SubSection>
<SubSection label="Output available (error)">
<CreateFeatureRequestTool
part={{
type: "tool-create_feature_request",
toolCallId: uid(),
state: "output-available",
input: {
title: "Add dark mode",
description: "I would love dark mode.",
},
output: {
type: "error",
message:
"Failed to attach customer need to the feature request.",
error: "Linear API request failed (500): Internal error",
},
}}
/>
</SubSection>
<SubSection label="Output error">
<CreateFeatureRequestTool
part={{
type: "tool-create_feature_request",
toolCallId: uid(),
state: "output-error",
input: { title: "Add dark mode" },
}}
/>
</SubSection>
</Section>
{/* ============================================================= */}
{/* FULL CONVERSATION EXAMPLE */}
{/* ============================================================= */}

View File

@@ -1,30 +1,24 @@
"use client";
import { Button } from "@/components/atoms/Button/Button";
import { Text } from "@/components/atoms/Text/Text";
import {
BookOpenIcon,
CheckFatIcon,
PencilSimpleIcon,
WarningDiamondIcon,
} from "@phosphor-icons/react";
import { WarningDiamondIcon } from "@phosphor-icons/react";
import type { ToolUIPart } from "ai";
import NextLink from "next/link";
import { useCopilotChatActions } from "../../components/CopilotChatActionsProvider/useCopilotChatActions";
import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
import { ProgressBar } from "../../components/ProgressBar/ProgressBar";
import {
ContentCardDescription,
ContentCodeBlock,
ContentGrid,
ContentHint,
ContentLink,
ContentMessage,
} from "../../components/ToolAccordion/AccordionContent";
import { ToolAccordion } from "../../components/ToolAccordion/ToolAccordion";
import { useAsymptoticProgress } from "../../hooks/useAsymptoticProgress";
import {
ClarificationQuestionsCard,
ClarifyingQuestion,
} from "./components/ClarificationQuestionsCard";
import { MiniGame } from "./components/MiniGame/MiniGame";
import {
AccordionIcon,
formatMaybeJson,
@@ -58,7 +52,7 @@ function getAccordionMeta(output: CreateAgentToolOutput) {
const icon = <AccordionIcon />;
if (isAgentSavedOutput(output)) {
return { icon, title: output.agent_name, expanded: true };
return { icon, title: output.agent_name };
}
if (isAgentPreviewOutput(output)) {
return {
@@ -84,7 +78,6 @@ function getAccordionMeta(output: CreateAgentToolOutput) {
return {
icon,
title: "Creating agent, this may take a few minutes. Sit back and relax.",
expanded: true,
};
}
return {
@@ -114,6 +107,8 @@ export function CreateAgentTool({ part }: Props) {
isOperationPendingOutput(output) ||
isOperationInProgressOutput(output));
const progress = useAsymptoticProgress(isOperating);
const hasExpandableContent =
part.state === "output-available" &&
!!output &&
@@ -157,53 +152,31 @@ export function CreateAgentTool({ part }: Props) {
<ToolAccordion {...getAccordionMeta(output)}>
{isOperating && (
<ContentGrid>
<MiniGame />
<ProgressBar value={progress} className="max-w-[280px]" />
<ContentHint>
This could take a few minutes play while you wait!
This could take a few minutes, grab a coffee
</ContentHint>
</ContentGrid>
)}
{isAgentSavedOutput(output) && (
<div className="rounded-xl border border-border/60 bg-card p-4 shadow-sm">
<div className="flex items-baseline gap-2">
<CheckFatIcon
size={18}
weight="regular"
className="relative top-1 text-green-500"
/>
<Text
variant="body-medium"
className="text-blacks mb-2 text-[16px]"
>
{output.message}
</Text>
<ContentGrid>
<ContentMessage>{output.message}</ContentMessage>
<div className="flex flex-wrap gap-2">
<ContentLink href={output.library_agent_link}>
Open in library
</ContentLink>
<ContentLink href={output.agent_page_link}>
Open in builder
</ContentLink>
</div>
<div className="mt-3 flex flex-wrap gap-4">
<Button variant="outline" size="small">
<NextLink
href={output.library_agent_link}
className="inline-flex items-center gap-1.5"
target="_blank"
rel="noopener noreferrer"
>
<BookOpenIcon size={14} weight="regular" />
Open in library
</NextLink>
</Button>
<Button variant="outline" size="small">
<NextLink
href={output.agent_page_link}
target="_blank"
rel="noopener noreferrer"
className="inline-flex items-center gap-1.5"
>
<PencilSimpleIcon size={14} weight="regular" />
Open in builder
</NextLink>
</Button>
</div>
</div>
<ContentCodeBlock>
{truncateText(
formatMaybeJson({ agent_id: output.agent_id }),
800,
)}
</ContentCodeBlock>
</ContentGrid>
)}
{isAgentPreviewOutput(output) && (

View File

@@ -1,21 +0,0 @@
"use client";
import { useMiniGame } from "./useMiniGame";
export function MiniGame() {
const { canvasRef } = useMiniGame();
return (
<div
className="w-full overflow-hidden rounded-md bg-background text-foreground"
style={{ border: "1px solid #d17fff" }}
>
<canvas
ref={canvasRef}
tabIndex={0}
className="block w-full outline-none"
style={{ imageRendering: "pixelated" }}
/>
</div>
);
}

View File

@@ -1,579 +0,0 @@
import { useEffect, useRef } from "react";
/* ------------------------------------------------------------------ */
/* Constants */
/* ------------------------------------------------------------------ */
const CANVAS_HEIGHT = 150;
const GRAVITY = 0.55;
const JUMP_FORCE = -9.5;
const BASE_SPEED = 3;
const SPEED_INCREMENT = 0.0008;
const SPAWN_MIN = 70;
const SPAWN_MAX = 130;
const CHAR_SIZE = 18;
const CHAR_X = 50;
const GROUND_PAD = 20;
const STORAGE_KEY = "copilot-minigame-highscore";
// Colors
const COLOR_BG = "#E8EAF6";
const COLOR_CHAR = "#263238";
const COLOR_BOSS = "#F50057";
// Boss
const BOSS_SIZE = 36;
const BOSS_ENTER_SPEED = 2;
const BOSS_LEAVE_SPEED = 3;
const BOSS_SHOOT_COOLDOWN = 90;
const BOSS_SHOTS_TO_EVADE = 5;
const BOSS_INTERVAL = 20; // every N score
const PROJ_SPEED = 4.5;
const PROJ_SIZE = 12;
/* ------------------------------------------------------------------ */
/* Types */
/* ------------------------------------------------------------------ */
interface Obstacle {
x: number;
width: number;
height: number;
scored: boolean;
}
interface Projectile {
x: number;
y: number;
speed: number;
evaded: boolean;
type: "low" | "high";
}
interface BossState {
phase: "inactive" | "entering" | "fighting" | "leaving";
x: number;
targetX: number;
shotsEvaded: number;
cooldown: number;
projectiles: Projectile[];
bob: number;
}
interface GameState {
charY: number;
vy: number;
obstacles: Obstacle[];
score: number;
highScore: number;
speed: number;
frame: number;
nextSpawn: number;
running: boolean;
over: boolean;
groundY: number;
boss: BossState;
bossThreshold: number;
}
/* ------------------------------------------------------------------ */
/* Helpers */
/* ------------------------------------------------------------------ */
function randInt(min: number, max: number) {
return Math.floor(Math.random() * (max - min + 1)) + min;
}
function readHighScore(): number {
try {
return parseInt(localStorage.getItem(STORAGE_KEY) || "0", 10) || 0;
} catch {
return 0;
}
}
function writeHighScore(score: number) {
try {
localStorage.setItem(STORAGE_KEY, String(score));
} catch {
/* noop */
}
}
function makeBoss(): BossState {
return {
phase: "inactive",
x: 0,
targetX: 0,
shotsEvaded: 0,
cooldown: 0,
projectiles: [],
bob: 0,
};
}
function makeState(groundY: number): GameState {
return {
charY: groundY - CHAR_SIZE,
vy: 0,
obstacles: [],
score: 0,
highScore: readHighScore(),
speed: BASE_SPEED,
frame: 0,
nextSpawn: randInt(SPAWN_MIN, SPAWN_MAX),
running: false,
over: false,
groundY,
boss: makeBoss(),
bossThreshold: BOSS_INTERVAL,
};
}
function gameOver(s: GameState) {
s.running = false;
s.over = true;
if (s.score > s.highScore) {
s.highScore = s.score;
writeHighScore(s.score);
}
}
/* ------------------------------------------------------------------ */
/* Projectile collision — shared between fighting & leaving phases */
/* ------------------------------------------------------------------ */
/** Returns true if the player died. */
function tickProjectiles(s: GameState): boolean {
const boss = s.boss;
for (const p of boss.projectiles) {
p.x -= p.speed;
if (!p.evaded && p.x + PROJ_SIZE < CHAR_X) {
p.evaded = true;
boss.shotsEvaded++;
}
// Collision
if (
!p.evaded &&
CHAR_X + CHAR_SIZE > p.x &&
CHAR_X < p.x + PROJ_SIZE &&
s.charY + CHAR_SIZE > p.y &&
s.charY < p.y + PROJ_SIZE
) {
gameOver(s);
return true;
}
}
boss.projectiles = boss.projectiles.filter((p) => p.x + PROJ_SIZE > -20);
return false;
}
/* ------------------------------------------------------------------ */
/* Update */
/* ------------------------------------------------------------------ */
function update(s: GameState, canvasWidth: number) {
if (!s.running) return;
s.frame++;
// Speed only ramps during regular play
if (s.boss.phase === "inactive") {
s.speed = BASE_SPEED + s.frame * SPEED_INCREMENT;
}
// ---- Character physics (always active) ---- //
s.vy += GRAVITY;
s.charY += s.vy;
if (s.charY + CHAR_SIZE >= s.groundY) {
s.charY = s.groundY - CHAR_SIZE;
s.vy = 0;
}
// ---- Trigger boss ---- //
if (s.boss.phase === "inactive" && s.score >= s.bossThreshold) {
s.boss.phase = "entering";
s.boss.x = canvasWidth + 10;
s.boss.targetX = canvasWidth - BOSS_SIZE - 40;
s.boss.shotsEvaded = 0;
s.boss.cooldown = BOSS_SHOOT_COOLDOWN;
s.boss.projectiles = [];
s.obstacles = [];
}
// ---- Boss: entering ---- //
if (s.boss.phase === "entering") {
s.boss.bob = Math.sin(s.frame * 0.05) * 3;
s.boss.x -= BOSS_ENTER_SPEED;
if (s.boss.x <= s.boss.targetX) {
s.boss.x = s.boss.targetX;
s.boss.phase = "fighting";
}
return; // no obstacles while entering
}
// ---- Boss: fighting ---- //
if (s.boss.phase === "fighting") {
s.boss.bob = Math.sin(s.frame * 0.05) * 3;
// Shoot
s.boss.cooldown--;
if (s.boss.cooldown <= 0) {
const isLow = Math.random() < 0.5;
s.boss.projectiles.push({
x: s.boss.x - PROJ_SIZE,
y: isLow ? s.groundY - 14 : s.groundY - 70,
speed: PROJ_SPEED,
evaded: false,
type: isLow ? "low" : "high",
});
s.boss.cooldown = BOSS_SHOOT_COOLDOWN;
}
if (tickProjectiles(s)) return;
// Boss defeated?
if (s.boss.shotsEvaded >= BOSS_SHOTS_TO_EVADE) {
s.boss.phase = "leaving";
s.score += 5; // bonus
s.bossThreshold = s.score + BOSS_INTERVAL;
}
return;
}
// ---- Boss: leaving ---- //
if (s.boss.phase === "leaving") {
s.boss.bob = Math.sin(s.frame * 0.05) * 3;
s.boss.x += BOSS_LEAVE_SPEED;
// Still check in-flight projectiles
if (tickProjectiles(s)) return;
if (s.boss.x > canvasWidth + 50) {
s.boss = makeBoss();
s.nextSpawn = s.frame + randInt(SPAWN_MIN / 2, SPAWN_MAX / 2);
}
return;
}
// ---- Regular obstacle play ---- //
if (s.frame >= s.nextSpawn) {
s.obstacles.push({
x: canvasWidth + 10,
width: randInt(10, 16),
height: randInt(20, 48),
scored: false,
});
s.nextSpawn = s.frame + randInt(SPAWN_MIN, SPAWN_MAX);
}
for (const o of s.obstacles) {
o.x -= s.speed;
if (!o.scored && o.x + o.width < CHAR_X) {
o.scored = true;
s.score++;
}
}
s.obstacles = s.obstacles.filter((o) => o.x + o.width > -20);
for (const o of s.obstacles) {
const oY = s.groundY - o.height;
if (
CHAR_X + CHAR_SIZE > o.x &&
CHAR_X < o.x + o.width &&
s.charY + CHAR_SIZE > oY
) {
gameOver(s);
return;
}
}
}
/* ------------------------------------------------------------------ */
/* Drawing */
/* ------------------------------------------------------------------ */
function drawBoss(ctx: CanvasRenderingContext2D, s: GameState, bg: string) {
const bx = s.boss.x;
const by = s.groundY - BOSS_SIZE + s.boss.bob;
// Body
ctx.save();
ctx.fillStyle = COLOR_BOSS;
ctx.globalAlpha = 0.9;
ctx.beginPath();
ctx.roundRect(bx, by, BOSS_SIZE, BOSS_SIZE, 4);
ctx.fill();
ctx.restore();
// Eyes
ctx.save();
ctx.fillStyle = bg;
const eyeY = by + 13;
ctx.beginPath();
ctx.arc(bx + 10, eyeY, 4, 0, Math.PI * 2);
ctx.fill();
ctx.beginPath();
ctx.arc(bx + 26, eyeY, 4, 0, Math.PI * 2);
ctx.fill();
ctx.restore();
// Angry eyebrows
ctx.save();
ctx.strokeStyle = bg;
ctx.lineWidth = 2;
ctx.beginPath();
ctx.moveTo(bx + 5, eyeY - 7);
ctx.lineTo(bx + 14, eyeY - 4);
ctx.stroke();
ctx.beginPath();
ctx.moveTo(bx + 31, eyeY - 7);
ctx.lineTo(bx + 22, eyeY - 4);
ctx.stroke();
ctx.restore();
// Zigzag mouth
ctx.save();
ctx.strokeStyle = bg;
ctx.lineWidth = 1.5;
ctx.beginPath();
ctx.moveTo(bx + 10, by + 27);
ctx.lineTo(bx + 14, by + 24);
ctx.lineTo(bx + 18, by + 27);
ctx.lineTo(bx + 22, by + 24);
ctx.lineTo(bx + 26, by + 27);
ctx.stroke();
ctx.restore();
}
function drawProjectiles(ctx: CanvasRenderingContext2D, boss: BossState) {
ctx.save();
ctx.fillStyle = COLOR_BOSS;
ctx.globalAlpha = 0.8;
for (const p of boss.projectiles) {
if (p.evaded) continue;
ctx.beginPath();
ctx.arc(
p.x + PROJ_SIZE / 2,
p.y + PROJ_SIZE / 2,
PROJ_SIZE / 2,
0,
Math.PI * 2,
);
ctx.fill();
}
ctx.restore();
}
function draw(
ctx: CanvasRenderingContext2D,
s: GameState,
w: number,
h: number,
fg: string,
started: boolean,
) {
ctx.fillStyle = COLOR_BG;
ctx.fillRect(0, 0, w, h);
// Ground
ctx.save();
ctx.strokeStyle = fg;
ctx.globalAlpha = 0.15;
ctx.setLineDash([4, 4]);
ctx.beginPath();
ctx.moveTo(0, s.groundY);
ctx.lineTo(w, s.groundY);
ctx.stroke();
ctx.restore();
// Character
ctx.save();
ctx.fillStyle = COLOR_CHAR;
ctx.globalAlpha = 0.85;
ctx.beginPath();
ctx.roundRect(CHAR_X, s.charY, CHAR_SIZE, CHAR_SIZE, 3);
ctx.fill();
ctx.restore();
// Eyes
ctx.save();
ctx.fillStyle = COLOR_BG;
ctx.beginPath();
ctx.arc(CHAR_X + 6, s.charY + 7, 2.5, 0, Math.PI * 2);
ctx.fill();
ctx.beginPath();
ctx.arc(CHAR_X + 12, s.charY + 7, 2.5, 0, Math.PI * 2);
ctx.fill();
ctx.restore();
// Obstacles
ctx.save();
ctx.fillStyle = fg;
ctx.globalAlpha = 0.55;
for (const o of s.obstacles) {
ctx.fillRect(o.x, s.groundY - o.height, o.width, o.height);
}
ctx.restore();
// Boss + projectiles
if (s.boss.phase !== "inactive") {
drawBoss(ctx, s, COLOR_BG);
drawProjectiles(ctx, s.boss);
}
// Score HUD
ctx.save();
ctx.fillStyle = fg;
ctx.globalAlpha = 0.5;
ctx.font = "bold 11px monospace";
ctx.textAlign = "right";
ctx.fillText(`Score: ${s.score}`, w - 12, 20);
ctx.fillText(`Best: ${s.highScore}`, w - 12, 34);
if (s.boss.phase === "fighting") {
ctx.fillText(
`Evade: ${s.boss.shotsEvaded}/${BOSS_SHOTS_TO_EVADE}`,
w - 12,
48,
);
}
ctx.restore();
// Prompts
if (!started && !s.running && !s.over) {
ctx.save();
ctx.fillStyle = fg;
ctx.globalAlpha = 0.5;
ctx.font = "12px sans-serif";
ctx.textAlign = "center";
ctx.fillText("Click or press Space to play while you wait", w / 2, h / 2);
ctx.restore();
}
if (s.over) {
ctx.save();
ctx.fillStyle = fg;
ctx.globalAlpha = 0.7;
ctx.font = "bold 13px sans-serif";
ctx.textAlign = "center";
ctx.fillText("Game Over", w / 2, h / 2 - 8);
ctx.font = "11px sans-serif";
ctx.fillText("Click or Space to restart", w / 2, h / 2 + 10);
ctx.restore();
}
}
/* ------------------------------------------------------------------ */
/* Hook */
/* ------------------------------------------------------------------ */
export function useMiniGame() {
const canvasRef = useRef<HTMLCanvasElement>(null);
const stateRef = useRef<GameState | null>(null);
const rafRef = useRef(0);
const startedRef = useRef(false);
useEffect(() => {
const canvas = canvasRef.current;
if (!canvas) return;
const container = canvas.parentElement;
if (container) {
canvas.width = container.clientWidth;
canvas.height = CANVAS_HEIGHT;
}
const groundY = canvas.height - GROUND_PAD;
stateRef.current = makeState(groundY);
const style = getComputedStyle(canvas);
let fg = style.color || "#71717a";
// -------------------------------------------------------------- //
// Jump //
// -------------------------------------------------------------- //
function jump() {
const s = stateRef.current;
if (!s) return;
if (s.over) {
const hs = s.highScore;
const gy = s.groundY;
stateRef.current = makeState(gy);
stateRef.current.highScore = hs;
stateRef.current.running = true;
startedRef.current = true;
return;
}
if (!s.running) {
s.running = true;
startedRef.current = true;
return;
}
// Only jump when on the ground
if (s.charY + CHAR_SIZE >= s.groundY) {
s.vy = JUMP_FORCE;
}
}
function onKey(e: KeyboardEvent) {
if (e.code === "Space" || e.key === " ") {
e.preventDefault();
jump();
}
}
function onClick() {
canvas?.focus();
jump();
}
// -------------------------------------------------------------- //
// Loop //
// -------------------------------------------------------------- //
function loop() {
const s = stateRef.current;
if (!canvas || !s) return;
const ctx = canvas.getContext("2d");
if (!ctx) return;
update(s, canvas.width);
draw(ctx, s, canvas.width, canvas.height, fg, startedRef.current);
rafRef.current = requestAnimationFrame(loop);
}
rafRef.current = requestAnimationFrame(loop);
canvas.addEventListener("click", onClick);
canvas.addEventListener("keydown", onKey);
const observer = new ResizeObserver((entries) => {
for (const entry of entries) {
canvas.width = entry.contentRect.width;
canvas.height = CANVAS_HEIGHT;
if (stateRef.current) {
stateRef.current.groundY = canvas.height - GROUND_PAD;
}
const cs = getComputedStyle(canvas);
fg = cs.color || fg;
}
});
if (container) observer.observe(container);
return () => {
cancelAnimationFrame(rafRef.current);
canvas.removeEventListener("click", onClick);
canvas.removeEventListener("keydown", onKey);
observer.disconnect();
};
}, []);
return { canvasRef };
}

View File

@@ -1,227 +0,0 @@
"use client";
import type { ToolUIPart } from "ai";
import { useMemo } from "react";
import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
import {
ContentBadge,
ContentCard,
ContentCardDescription,
ContentCardHeader,
ContentCardTitle,
ContentGrid,
ContentMessage,
ContentSuggestionsList,
} from "../../components/ToolAccordion/AccordionContent";
import { ToolAccordion } from "../../components/ToolAccordion/ToolAccordion";
import {
AccordionIcon,
getAccordionTitle,
getAnimationText,
getFeatureRequestOutput,
isCreatedOutput,
isErrorOutput,
isNoResultsOutput,
isSearchResultsOutput,
ToolIcon,
type FeatureRequestToolType,
} from "./helpers";
export interface FeatureRequestToolPart {
type: FeatureRequestToolType;
toolCallId: string;
state: ToolUIPart["state"];
input?: unknown;
output?: unknown;
}
interface Props {
part: FeatureRequestToolPart;
}
function truncate(text: string, maxChars: number): string {
const trimmed = text.trim();
if (trimmed.length <= maxChars) return trimmed;
return `${trimmed.slice(0, maxChars).trimEnd()}`;
}
export function SearchFeatureRequestsTool({ part }: Props) {
const output = getFeatureRequestOutput(part);
const text = getAnimationText(part);
const isStreaming =
part.state === "input-streaming" || part.state === "input-available";
const isError =
part.state === "output-error" || (!!output && isErrorOutput(output));
const normalized = useMemo(() => {
if (!output) return null;
return { title: getAccordionTitle(part.type, output) };
}, [output, part.type]);
const isOutputAvailable = part.state === "output-available" && !!output;
const searchOutput =
isOutputAvailable && output && isSearchResultsOutput(output)
? output
: null;
const noResultsOutput =
isOutputAvailable && output && isNoResultsOutput(output) ? output : null;
const errorOutput =
isOutputAvailable && output && isErrorOutput(output) ? output : null;
const hasExpandableContent =
isOutputAvailable &&
((!!searchOutput && searchOutput.count > 0) ||
!!noResultsOutput ||
!!errorOutput);
const accordionDescription =
hasExpandableContent && searchOutput
? `Found ${searchOutput.count} result${searchOutput.count === 1 ? "" : "s"} for "${searchOutput.query}"`
: hasExpandableContent && (noResultsOutput || errorOutput)
? ((noResultsOutput ?? errorOutput)?.message ?? null)
: null;
return (
<div className="py-2">
<div className="flex items-center gap-2 text-sm text-muted-foreground">
<ToolIcon
toolType={part.type}
isStreaming={isStreaming}
isError={isError}
/>
<MorphingTextAnimation
text={text}
className={isError ? "text-red-500" : undefined}
/>
</div>
{hasExpandableContent && normalized && (
<ToolAccordion
icon={<AccordionIcon toolType={part.type} />}
title={normalized.title}
description={accordionDescription}
>
{searchOutput && (
<ContentGrid>
{searchOutput.results.map((r) => (
<ContentCard key={r.id}>
<ContentCardHeader>
<ContentCardTitle>{r.title}</ContentCardTitle>
</ContentCardHeader>
{r.description && (
<ContentCardDescription>
{truncate(r.description, 200)}
</ContentCardDescription>
)}
</ContentCard>
))}
</ContentGrid>
)}
{noResultsOutput && (
<div>
<ContentMessage>{noResultsOutput.message}</ContentMessage>
{noResultsOutput.suggestions &&
noResultsOutput.suggestions.length > 0 && (
<ContentSuggestionsList items={noResultsOutput.suggestions} />
)}
</div>
)}
{errorOutput && (
<div>
<ContentMessage>{errorOutput.message}</ContentMessage>
{errorOutput.error && (
<ContentCardDescription>
{errorOutput.error}
</ContentCardDescription>
)}
</div>
)}
</ToolAccordion>
)}
</div>
);
}
export function CreateFeatureRequestTool({ part }: Props) {
const output = getFeatureRequestOutput(part);
const text = getAnimationText(part);
const isStreaming =
part.state === "input-streaming" || part.state === "input-available";
const isError =
part.state === "output-error" || (!!output && isErrorOutput(output));
const normalized = useMemo(() => {
if (!output) return null;
return { title: getAccordionTitle(part.type, output) };
}, [output, part.type]);
const isOutputAvailable = part.state === "output-available" && !!output;
const createdOutput =
isOutputAvailable && output && isCreatedOutput(output) ? output : null;
const errorOutput =
isOutputAvailable && output && isErrorOutput(output) ? output : null;
const hasExpandableContent =
isOutputAvailable && (!!createdOutput || !!errorOutput);
const accordionDescription =
hasExpandableContent && createdOutput
? createdOutput.issue_title
: hasExpandableContent && errorOutput
? errorOutput.message
: null;
return (
<div className="py-2">
<div className="flex items-center gap-2 text-sm text-muted-foreground">
<ToolIcon
toolType={part.type}
isStreaming={isStreaming}
isError={isError}
/>
<MorphingTextAnimation
text={text}
className={isError ? "text-red-500" : undefined}
/>
</div>
{hasExpandableContent && normalized && (
<ToolAccordion
icon={<AccordionIcon toolType={part.type} />}
title={normalized.title}
description={accordionDescription}
>
{createdOutput && (
<ContentCard>
<ContentCardHeader>
<ContentCardTitle>{createdOutput.issue_title}</ContentCardTitle>
</ContentCardHeader>
<div className="mt-2 flex items-center gap-2">
<ContentBadge>
{createdOutput.is_new_issue ? "New" : "Existing"}
</ContentBadge>
</div>
<ContentMessage>{createdOutput.message}</ContentMessage>
</ContentCard>
)}
{errorOutput && (
<div>
<ContentMessage>{errorOutput.message}</ContentMessage>
{errorOutput.error && (
<ContentCardDescription>
{errorOutput.error}
</ContentCardDescription>
)}
</div>
)}
</ToolAccordion>
)}
</div>
);
}

View File

@@ -1,271 +0,0 @@
import {
CheckCircleIcon,
LightbulbIcon,
MagnifyingGlassIcon,
PlusCircleIcon,
} from "@phosphor-icons/react";
import type { ToolUIPart } from "ai";
/* ------------------------------------------------------------------ */
/* Types (local until API client is regenerated) */
/* ------------------------------------------------------------------ */
interface FeatureRequestInfo {
id: string;
identifier: string;
title: string;
description?: string | null;
}
export interface FeatureRequestSearchResponse {
type: "feature_request_search";
message: string;
results: FeatureRequestInfo[];
count: number;
query: string;
}
export interface FeatureRequestCreatedResponse {
type: "feature_request_created";
message: string;
issue_id: string;
issue_identifier: string;
issue_title: string;
issue_url: string;
is_new_issue: boolean;
customer_name: string;
}
interface NoResultsResponse {
type: "no_results";
message: string;
suggestions?: string[];
}
interface ErrorResponse {
type: "error";
message: string;
error?: string;
}
export type FeatureRequestOutput =
| FeatureRequestSearchResponse
| FeatureRequestCreatedResponse
| NoResultsResponse
| ErrorResponse;
export type FeatureRequestToolType =
| "tool-search_feature_requests"
| "tool-create_feature_request"
| string;
/* ------------------------------------------------------------------ */
/* Output parsing */
/* ------------------------------------------------------------------ */
function parseOutput(output: unknown): FeatureRequestOutput | null {
if (!output) return null;
if (typeof output === "string") {
const trimmed = output.trim();
if (!trimmed) return null;
try {
return parseOutput(JSON.parse(trimmed) as unknown);
} catch {
return null;
}
}
if (typeof output === "object") {
const type = (output as { type?: unknown }).type;
if (
type === "feature_request_search" ||
type === "feature_request_created" ||
type === "no_results" ||
type === "error"
) {
return output as FeatureRequestOutput;
}
// Fallback structural checks
if ("results" in output && "query" in output)
return output as FeatureRequestSearchResponse;
if ("issue_identifier" in output)
return output as FeatureRequestCreatedResponse;
if ("suggestions" in output && !("error" in output))
return output as NoResultsResponse;
if ("error" in output || "details" in output)
return output as ErrorResponse;
}
return null;
}
export function getFeatureRequestOutput(
part: unknown,
): FeatureRequestOutput | null {
if (!part || typeof part !== "object") return null;
return parseOutput((part as { output?: unknown }).output);
}
/* ------------------------------------------------------------------ */
/* Type guards */
/* ------------------------------------------------------------------ */
export function isSearchResultsOutput(
output: FeatureRequestOutput,
): output is FeatureRequestSearchResponse {
return (
output.type === "feature_request_search" ||
("results" in output && "query" in output)
);
}
export function isCreatedOutput(
output: FeatureRequestOutput,
): output is FeatureRequestCreatedResponse {
return (
output.type === "feature_request_created" || "issue_identifier" in output
);
}
export function isNoResultsOutput(
output: FeatureRequestOutput,
): output is NoResultsResponse {
return (
output.type === "no_results" ||
("suggestions" in output && !("error" in output))
);
}
export function isErrorOutput(
output: FeatureRequestOutput,
): output is ErrorResponse {
return output.type === "error" || "error" in output;
}
/* ------------------------------------------------------------------ */
/* Accordion metadata */
/* ------------------------------------------------------------------ */
export function getAccordionTitle(
toolType: FeatureRequestToolType,
output: FeatureRequestOutput,
): string {
if (toolType === "tool-search_feature_requests") {
if (isSearchResultsOutput(output)) return "Feature requests";
if (isNoResultsOutput(output)) return "No feature requests found";
return "Feature request search error";
}
if (isCreatedOutput(output)) {
return output.is_new_issue
? "Feature request created"
: "Added to feature request";
}
if (isErrorOutput(output)) return "Feature request error";
return "Feature request";
}
/* ------------------------------------------------------------------ */
/* Animation text */
/* ------------------------------------------------------------------ */
interface AnimationPart {
type: FeatureRequestToolType;
state: ToolUIPart["state"];
input?: unknown;
output?: unknown;
}
export function getAnimationText(part: AnimationPart): string {
if (part.type === "tool-search_feature_requests") {
const query = (part.input as { query?: string } | undefined)?.query?.trim();
const queryText = query ? ` for "${query}"` : "";
switch (part.state) {
case "input-streaming":
case "input-available":
return `Searching feature requests${queryText}`;
case "output-available": {
const output = parseOutput(part.output);
if (!output) return `Searching feature requests${queryText}`;
if (isSearchResultsOutput(output)) {
return `Found ${output.count} feature request${output.count === 1 ? "" : "s"}${queryText}`;
}
if (isNoResultsOutput(output))
return `No feature requests found${queryText}`;
return `Error searching feature requests${queryText}`;
}
case "output-error":
return `Error searching feature requests${queryText}`;
default:
return "Searching feature requests";
}
}
// create_feature_request
const title = (part.input as { title?: string } | undefined)?.title?.trim();
const titleText = title ? ` "${title}"` : "";
switch (part.state) {
case "input-streaming":
case "input-available":
return `Creating feature request${titleText}`;
case "output-available": {
const output = parseOutput(part.output);
if (!output) return `Creating feature request${titleText}`;
if (isCreatedOutput(output)) {
return output.is_new_issue
? "Feature request created"
: "Added to existing feature request";
}
if (isErrorOutput(output)) return "Error creating feature request";
return `Created feature request${titleText}`;
}
case "output-error":
return "Error creating feature request";
default:
return "Creating feature request";
}
}
/* ------------------------------------------------------------------ */
/* Icons */
/* ------------------------------------------------------------------ */
export function ToolIcon({
toolType,
isStreaming,
isError,
}: {
toolType: FeatureRequestToolType;
isStreaming?: boolean;
isError?: boolean;
}) {
const IconComponent =
toolType === "tool-create_feature_request"
? PlusCircleIcon
: MagnifyingGlassIcon;
return (
<IconComponent
size={14}
weight="regular"
className={
isError
? "text-red-500"
: isStreaming
? "text-neutral-500"
: "text-neutral-400"
}
/>
);
}
export function AccordionIcon({
toolType,
}: {
toolType: FeatureRequestToolType;
}) {
const IconComponent =
toolType === "tool-create_feature_request"
? CheckCircleIcon
: LightbulbIcon;
return <IconComponent size={32} weight="light" />;
}

View File

@@ -0,0 +1,63 @@
"use client";
import { ToolUIPart } from "ai";
import { GearIcon } from "@phosphor-icons/react";
import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
interface Props {
part: ToolUIPart;
}
function extractToolName(part: ToolUIPart): string {
// ToolUIPart.type is "tool-{name}", extract the name portion.
return part.type.replace(/^tool-/, "");
}
function formatToolName(name: string): string {
// "search_docs" → "Search docs", "Read" → "Read"
return name.replace(/_/g, " ").replace(/^\w/, (c) => c.toUpperCase());
}
function getAnimationText(part: ToolUIPart): string {
const label = formatToolName(extractToolName(part));
switch (part.state) {
case "input-streaming":
case "input-available":
return `Running ${label}`;
case "output-available":
return `${label} completed`;
case "output-error":
return `${label} failed`;
default:
return `Running ${label}`;
}
}
export function GenericTool({ part }: Props) {
const isStreaming =
part.state === "input-streaming" || part.state === "input-available";
const isError = part.state === "output-error";
return (
<div className="py-2">
<div className="flex items-center gap-2 text-sm text-muted-foreground">
<GearIcon
size={14}
weight="regular"
className={
isError
? "text-red-500"
: isStreaming
? "animate-spin text-neutral-500"
: "text-neutral-400"
}
/>
<MorphingTextAnimation
text={getAnimationText(part)}
className={isError ? "text-red-500" : undefined}
/>
</div>
</div>
);
}

View File

@@ -3,7 +3,6 @@
import type { ToolUIPart } from "ai";
import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
import { ToolAccordion } from "../../components/ToolAccordion/ToolAccordion";
import { BlockDetailsCard } from "./components/BlockDetailsCard/BlockDetailsCard";
import { BlockOutputCard } from "./components/BlockOutputCard/BlockOutputCard";
import { ErrorCard } from "./components/ErrorCard/ErrorCard";
import { SetupRequirementsCard } from "./components/SetupRequirementsCard/SetupRequirementsCard";
@@ -12,7 +11,6 @@ import {
getAnimationText,
getRunBlockToolOutput,
isRunBlockBlockOutput,
isRunBlockDetailsOutput,
isRunBlockErrorOutput,
isRunBlockSetupRequirementsOutput,
ToolIcon,
@@ -43,7 +41,6 @@ export function RunBlockTool({ part }: Props) {
part.state === "output-available" &&
!!output &&
(isRunBlockBlockOutput(output) ||
isRunBlockDetailsOutput(output) ||
isRunBlockSetupRequirementsOutput(output) ||
isRunBlockErrorOutput(output));
@@ -61,10 +58,6 @@ export function RunBlockTool({ part }: Props) {
<ToolAccordion {...getAccordionMeta(output)}>
{isRunBlockBlockOutput(output) && <BlockOutputCard output={output} />}
{isRunBlockDetailsOutput(output) && (
<BlockDetailsCard output={output} />
)}
{isRunBlockSetupRequirementsOutput(output) && (
<SetupRequirementsCard output={output} />
)}

View File

@@ -1,188 +0,0 @@
import type { Meta, StoryObj } from "@storybook/nextjs";
import { ResponseType } from "@/app/api/__generated__/models/responseType";
import type { BlockDetailsResponse } from "../../helpers";
import { BlockDetailsCard } from "./BlockDetailsCard";
const meta: Meta<typeof BlockDetailsCard> = {
title: "Copilot/RunBlock/BlockDetailsCard",
component: BlockDetailsCard,
parameters: {
layout: "centered",
},
tags: ["autodocs"],
decorators: [
(Story) => (
<div style={{ maxWidth: 480 }}>
<Story />
</div>
),
],
};
export default meta;
type Story = StoryObj<typeof meta>;
const baseBlock: BlockDetailsResponse = {
type: ResponseType.block_details,
message:
"Here are the details for the GetWeather block. Provide the required inputs to run it.",
session_id: "session-123",
user_authenticated: true,
block: {
id: "block-abc-123",
name: "GetWeather",
description: "Fetches current weather data for a given location.",
inputs: {
type: "object",
properties: {
location: {
title: "Location",
type: "string",
description:
"City name or coordinates (e.g. 'London' or '51.5,-0.1')",
},
units: {
title: "Units",
type: "string",
description: "Temperature units: 'metric' or 'imperial'",
},
},
required: ["location"],
},
outputs: {
type: "object",
properties: {
temperature: {
title: "Temperature",
type: "number",
description: "Current temperature in the requested units",
},
condition: {
title: "Condition",
type: "string",
description: "Weather condition description (e.g. 'Sunny', 'Rain')",
},
},
},
credentials: [],
},
};
export const Default: Story = {
args: {
output: baseBlock,
},
};
export const InputsOnly: Story = {
args: {
output: {
...baseBlock,
message: "This block requires inputs. No outputs are defined.",
block: {
...baseBlock.block,
outputs: {},
},
},
},
};
export const OutputsOnly: Story = {
args: {
output: {
...baseBlock,
message: "This block has no required inputs.",
block: {
...baseBlock.block,
inputs: {},
},
},
},
};
export const ManyFields: Story = {
args: {
output: {
...baseBlock,
message: "Block with many input and output fields.",
block: {
...baseBlock.block,
name: "SendEmail",
description: "Sends an email via SMTP.",
inputs: {
type: "object",
properties: {
to: {
title: "To",
type: "string",
description: "Recipient email address",
},
subject: {
title: "Subject",
type: "string",
description: "Email subject line",
},
body: {
title: "Body",
type: "string",
description: "Email body content",
},
cc: {
title: "CC",
type: "string",
description: "CC recipients (comma-separated)",
},
bcc: {
title: "BCC",
type: "string",
description: "BCC recipients (comma-separated)",
},
},
required: ["to", "subject", "body"],
},
outputs: {
type: "object",
properties: {
message_id: {
title: "Message ID",
type: "string",
description: "Unique ID of the sent email",
},
status: {
title: "Status",
type: "string",
description: "Delivery status",
},
},
},
},
},
},
};
export const NoFieldDescriptions: Story = {
args: {
output: {
...baseBlock,
message: "Fields without descriptions.",
block: {
...baseBlock.block,
name: "SimpleBlock",
inputs: {
type: "object",
properties: {
input_a: { title: "Input A", type: "string" },
input_b: { title: "Input B", type: "number" },
},
required: ["input_a"],
},
outputs: {
type: "object",
properties: {
result: { title: "Result", type: "string" },
},
},
},
},
},
};

View File

@@ -1,103 +0,0 @@
"use client";
import type { BlockDetailsResponse } from "../../helpers";
import {
ContentBadge,
ContentCard,
ContentCardDescription,
ContentCardTitle,
ContentGrid,
ContentMessage,
} from "../../../../components/ToolAccordion/AccordionContent";
interface Props {
output: BlockDetailsResponse;
}
function SchemaFieldList({
title,
properties,
required,
}: {
title: string;
properties: Record<string, unknown>;
required?: string[];
}) {
const entries = Object.entries(properties);
if (entries.length === 0) return null;
const requiredSet = new Set(required ?? []);
return (
<ContentCard>
<ContentCardTitle className="text-xs">{title}</ContentCardTitle>
<div className="mt-2 grid gap-2">
{entries.map(([name, schema]) => {
const field = schema as Record<string, unknown> | undefined;
const fieldTitle =
typeof field?.title === "string" ? field.title : name;
const fieldType =
typeof field?.type === "string" ? field.type : "unknown";
const description =
typeof field?.description === "string"
? field.description
: undefined;
return (
<div key={name} className="rounded-xl border p-2">
<div className="flex items-center justify-between gap-2">
<ContentCardTitle className="text-xs">
{fieldTitle}
</ContentCardTitle>
<div className="flex gap-1">
<ContentBadge>{fieldType}</ContentBadge>
{requiredSet.has(name) && (
<ContentBadge>Required</ContentBadge>
)}
</div>
</div>
{description && (
<ContentCardDescription className="mt-1 text-xs">
{description}
</ContentCardDescription>
)}
</div>
);
})}
</div>
</ContentCard>
);
}
export function BlockDetailsCard({ output }: Props) {
const inputs = output.block.inputs as {
properties?: Record<string, unknown>;
required?: string[];
} | null;
const outputs = output.block.outputs as {
properties?: Record<string, unknown>;
required?: string[];
} | null;
return (
<ContentGrid>
<ContentMessage>{output.message}</ContentMessage>
{inputs?.properties && Object.keys(inputs.properties).length > 0 && (
<SchemaFieldList
title="Inputs"
properties={inputs.properties}
required={inputs.required}
/>
)}
{outputs?.properties && Object.keys(outputs.properties).length > 0 && (
<SchemaFieldList
title="Outputs"
properties={outputs.properties}
required={outputs.required}
/>
)}
</ContentGrid>
);
}

View File

@@ -10,37 +10,18 @@ import {
import type { ToolUIPart } from "ai";
import { OrbitLoader } from "../../components/OrbitLoader/OrbitLoader";
/** Block details returned on first run_block attempt (before input_data provided). */
export interface BlockDetailsResponse {
type: typeof ResponseType.block_details;
message: string;
session_id?: string | null;
block: {
id: string;
name: string;
description: string;
inputs: Record<string, unknown>;
outputs: Record<string, unknown>;
credentials: unknown[];
};
user_authenticated: boolean;
}
export interface RunBlockInput {
block_id?: string;
block_name?: string;
input_data?: Record<string, unknown>;
}
export type RunBlockToolOutput =
| SetupRequirementsResponse
| BlockDetailsResponse
| BlockOutputResponse
| ErrorResponse;
const RUN_BLOCK_OUTPUT_TYPES = new Set<string>([
ResponseType.setup_requirements,
ResponseType.block_details,
ResponseType.block_output,
ResponseType.error,
]);
@@ -54,15 +35,6 @@ export function isRunBlockSetupRequirementsOutput(
);
}
export function isRunBlockDetailsOutput(
output: RunBlockToolOutput,
): output is BlockDetailsResponse {
return (
output.type === ResponseType.block_details ||
("block" in output && typeof output.block === "object")
);
}
export function isRunBlockBlockOutput(
output: RunBlockToolOutput,
): output is BlockOutputResponse {
@@ -92,7 +64,6 @@ function parseOutput(output: unknown): RunBlockToolOutput | null {
return output as RunBlockToolOutput;
}
if ("block_id" in output) return output as BlockOutputResponse;
if ("block" in output) return output as BlockDetailsResponse;
if ("setup_info" in output) return output as SetupRequirementsResponse;
if ("error" in output || "details" in output)
return output as ErrorResponse;
@@ -113,25 +84,17 @@ export function getAnimationText(part: {
output?: unknown;
}): string {
const input = part.input as RunBlockInput | undefined;
const blockName = input?.block_name?.trim();
const blockId = input?.block_id?.trim();
// Prefer block_name if available, otherwise fall back to block_id
const blockText = blockName
? ` "${blockName}"`
: blockId
? ` "${blockId}"`
: "";
const blockText = blockId ? ` "${blockId}"` : "";
switch (part.state) {
case "input-streaming":
case "input-available":
return `Running${blockText}`;
return `Running the block${blockText}`;
case "output-available": {
const output = parseOutput(part.output);
if (!output) return `Running${blockText}`;
if (!output) return `Running the block${blockText}`;
if (isRunBlockBlockOutput(output)) return `Ran "${output.block_name}"`;
if (isRunBlockDetailsOutput(output))
return `Details for "${output.block.name}"`;
if (isRunBlockSetupRequirementsOutput(output)) {
return `Setup needed for "${output.setup_info.agent_name}"`;
}
@@ -195,21 +158,6 @@ export function getAccordionMeta(output: RunBlockToolOutput): {
};
}
if (isRunBlockDetailsOutput(output)) {
const inputKeys = Object.keys(
(output.block.inputs as { properties?: Record<string, unknown> })
?.properties ?? {},
);
return {
icon,
title: output.block.name,
description:
inputKeys.length > 0
? `${inputKeys.length} input field${inputKeys.length === 1 ? "" : "s"} available`
: output.message,
};
}
if (isRunBlockSetupRequirementsOutput(output)) {
const missingCredsCount = Object.keys(
(output.setup_info.user_readiness?.missing_credentials ?? {}) as Record<

View File

@@ -1,14 +1,10 @@
import { useGetV2ListSessions } from "@/app/api/__generated__/endpoints/chat/chat";
import { toast } from "@/components/molecules/Toast/use-toast";
import { useBreakpoint } from "@/lib/hooks/useBreakpoint";
import { useSupabase } from "@/lib/supabase/hooks/useSupabase";
import { useChat } from "@ai-sdk/react";
import { DefaultChatTransport } from "ai";
import { useEffect, useMemo, useRef, useState } from "react";
import { useEffect, useMemo, useState } from "react";
import { useChatSession } from "./useChatSession";
import { useLongRunningToolPolling } from "./hooks/useLongRunningToolPolling";
const STREAM_START_TIMEOUT_MS = 12_000;
export function useCopilotPage() {
const { isUserLoading, isLoggedIn } = useSupabase();
@@ -56,24 +52,6 @@ export function useCopilotPage() {
transport: transport ?? undefined,
});
// Abort the stream if the backend doesn't start sending data within 12s.
const stopRef = useRef(stop);
stopRef.current = stop;
useEffect(() => {
if (status !== "submitted") return;
const timer = setTimeout(() => {
stopRef.current();
toast({
title: "Stream timed out",
description: "The server took too long to respond. Please try again.",
variant: "destructive",
});
}, STREAM_START_TIMEOUT_MS);
return () => clearTimeout(timer);
}, [status]);
useEffect(() => {
if (!hydratedMessages || hydratedMessages.length === 0) return;
setMessages((prev) => {
@@ -82,11 +60,6 @@ export function useCopilotPage() {
});
}, [hydratedMessages, setMessages]);
// Poll session endpoint when a long-running tool (create_agent, edit_agent)
// is in progress. When the backend completes, the session data will contain
// the final tool output — this hook detects the change and updates messages.
useLongRunningToolPolling(sessionId, messages, setMessages);
// Clear messages when session is null
useEffect(() => {
if (!sessionId) setMessages([]);

View File

@@ -29,7 +29,6 @@ export function ScheduleListItem({
description={formatDistanceToNow(schedule.next_run_time, {
addSuffix: true,
})}
descriptionTitle={new Date(schedule.next_run_time).toString()}
onClick={onClick}
selected={selected}
icon={

View File

@@ -7,7 +7,6 @@ import React from "react";
interface Props {
title: string;
description?: string;
descriptionTitle?: string;
icon?: React.ReactNode;
selected?: boolean;
onClick?: () => void;
@@ -17,7 +16,6 @@ interface Props {
export function SidebarItemCard({
title,
description,
descriptionTitle,
icon,
selected,
onClick,
@@ -40,11 +38,7 @@ export function SidebarItemCard({
>
{title}
</Text>
<Text
variant="body"
className="leading-tight !text-zinc-500"
title={descriptionTitle}
>
<Text variant="body" className="leading-tight !text-zinc-500">
{description}
</Text>
</div>

View File

@@ -81,9 +81,6 @@ export function TaskListItem({
? formatDistanceToNow(run.started_at, { addSuffix: true })
: "—"
}
descriptionTitle={
run.started_at ? new Date(run.started_at).toString() : undefined
}
onClick={onClick}
selected={selected}
actions={

View File

@@ -1053,7 +1053,6 @@
"$ref": "#/components/schemas/ClarificationNeededResponse"
},
{ "$ref": "#/components/schemas/BlockListResponse" },
{ "$ref": "#/components/schemas/BlockDetailsResponse" },
{ "$ref": "#/components/schemas/BlockOutputResponse" },
{ "$ref": "#/components/schemas/DocSearchResultsResponse" },
{ "$ref": "#/components/schemas/DocPageResponse" },
@@ -6959,58 +6958,6 @@
"enum": ["run", "byte", "second"],
"title": "BlockCostType"
},
"BlockDetails": {
"properties": {
"id": { "type": "string", "title": "Id" },
"name": { "type": "string", "title": "Name" },
"description": { "type": "string", "title": "Description" },
"inputs": {
"additionalProperties": true,
"type": "object",
"title": "Inputs",
"default": {}
},
"outputs": {
"additionalProperties": true,
"type": "object",
"title": "Outputs",
"default": {}
},
"credentials": {
"items": { "$ref": "#/components/schemas/CredentialsMetaInput" },
"type": "array",
"title": "Credentials",
"default": []
}
},
"type": "object",
"required": ["id", "name", "description"],
"title": "BlockDetails",
"description": "Detailed block information."
},
"BlockDetailsResponse": {
"properties": {
"type": {
"$ref": "#/components/schemas/ResponseType",
"default": "block_details"
},
"message": { "type": "string", "title": "Message" },
"session_id": {
"anyOf": [{ "type": "string" }, { "type": "null" }],
"title": "Session Id"
},
"block": { "$ref": "#/components/schemas/BlockDetails" },
"user_authenticated": {
"type": "boolean",
"title": "User Authenticated",
"default": false
}
},
"type": "object",
"required": ["message", "block"],
"title": "BlockDetailsResponse",
"description": "Response for block details (first run_block attempt)."
},
"BlockInfo": {
"properties": {
"id": { "type": "string", "title": "Id" },
@@ -7066,13 +7013,57 @@
"properties": {
"id": { "type": "string", "title": "Id" },
"name": { "type": "string", "title": "Name" },
"description": { "type": "string", "title": "Description" }
"description": { "type": "string", "title": "Description" },
"categories": {
"items": { "type": "string" },
"type": "array",
"title": "Categories"
},
"input_schema": {
"additionalProperties": true,
"type": "object",
"title": "Input Schema",
"description": "Full JSON schema for block inputs"
},
"output_schema": {
"additionalProperties": true,
"type": "object",
"title": "Output Schema",
"description": "Full JSON schema for block outputs"
},
"required_inputs": {
"items": { "$ref": "#/components/schemas/BlockInputFieldInfo" },
"type": "array",
"title": "Required Inputs",
"description": "List of input fields for this block"
}
},
"type": "object",
"required": ["id", "name", "description"],
"required": ["id", "name", "description", "categories"],
"title": "BlockInfoSummary",
"description": "Summary of a block for search results."
},
"BlockInputFieldInfo": {
"properties": {
"name": { "type": "string", "title": "Name" },
"type": { "type": "string", "title": "Type" },
"description": {
"type": "string",
"title": "Description",
"default": ""
},
"required": {
"type": "boolean",
"title": "Required",
"default": false
},
"default": { "anyOf": [{}, { "type": "null" }], "title": "Default" }
},
"type": "object",
"required": ["name", "type"],
"title": "BlockInputFieldInfo",
"description": "Information about a block input field."
},
"BlockListResponse": {
"properties": {
"type": {
@@ -7090,7 +7081,12 @@
"title": "Blocks"
},
"count": { "type": "integer", "title": "Count" },
"query": { "type": "string", "title": "Query" }
"query": { "type": "string", "title": "Query" },
"usage_hint": {
"type": "string",
"title": "Usage Hint",
"default": "To execute a block, call run_block with block_id set to the block's 'id' field and input_data containing the fields listed in required_inputs."
}
},
"type": "object",
"required": ["message", "blocks", "count", "query"],
@@ -10483,7 +10479,6 @@
"agent_saved",
"clarification_needed",
"block_list",
"block_details",
"block_output",
"doc_search_results",
"doc_page",
@@ -10496,8 +10491,9 @@
"operation_pending",
"operation_in_progress",
"input_validation_error",
"feature_request_search",
"feature_request_created"
"web_fetch",
"bash_exec",
"operation_status"
],
"title": "ResponseType",
"description": "Types of tool responses."

View File

@@ -180,14 +180,3 @@ body[data-google-picker-open="true"] [data-dialog-content] {
z-index: 1 !important;
pointer-events: none !important;
}
/* CoPilot chat table styling — remove left/right borders, increase padding */
[data-streamdown="table-wrapper"] table {
border-left: none;
border-right: none;
}
[data-streamdown="table-wrapper"] th,
[data-streamdown="table-wrapper"] td {
padding: 0.875rem 1rem; /* py-3.5 px-4 */
}

View File

@@ -226,7 +226,7 @@ function renderMarkdown(
table: ({ children, ...props }) => (
<div className="my-4 overflow-x-auto">
<table
className="min-w-full divide-y divide-gray-200 border-y border-gray-200 dark:divide-gray-700 dark:border-gray-700"
className="min-w-full divide-y divide-gray-200 rounded-lg border border-gray-200 dark:divide-gray-700 dark:border-gray-700"
{...props}
>
{children}
@@ -235,7 +235,7 @@ function renderMarkdown(
),
th: ({ children, ...props }) => (
<th
className="bg-gray-50 px-4 py-3.5 text-left text-xs font-semibold uppercase tracking-wider text-gray-700 dark:bg-gray-800 dark:text-gray-300"
className="bg-gray-50 px-4 py-3 text-left text-xs font-semibold uppercase tracking-wider text-gray-700 dark:bg-gray-800 dark:text-gray-300"
{...props}
>
{children}
@@ -243,7 +243,7 @@ function renderMarkdown(
),
td: ({ children, ...props }) => (
<td
className="border-t border-gray-200 px-4 py-3.5 text-sm text-gray-600 dark:border-gray-700 dark:text-gray-400"
className="border-t border-gray-200 px-4 py-3 text-sm text-gray-600 dark:border-gray-700 dark:text-gray-400"
{...props}
>
{children}

View File

@@ -1,165 +0,0 @@
# Implementation Plan: SECRT-1950 - Apply E2E CI Optimizations to Claude Code Workflows
## Ticket
[SECRT-1950](https://linear.app/autogpt/issue/SECRT-1950)
## Summary
Apply Pwuts's CI performance optimizations from PR #12090 to Claude Code workflows.
## Reference PR
https://github.com/Significant-Gravitas/AutoGPT/pull/12090
---
## Analysis
### Current State (claude.yml)
**pnpm caching (lines 104-118):**
```yaml
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22"
- name: Enable corepack
run: corepack enable
- name: Set pnpm store directory
run: |
pnpm config set store-dir ~/.pnpm-store
echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
- name: Cache frontend dependencies
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
```
**Docker setup (lines 134-165):**
- Uses `docker-buildx-action@v3`
- Has manual Docker image caching via `actions/cache`
- Runs `docker compose up` without buildx bake optimization
### Pwuts's Optimizations (PR #12090)
1. **Simplified pnpm caching** - Use `setup-node` built-in cache:
```yaml
- name: Enable corepack
run: corepack enable
- name: Set up Node
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
```
2. **Docker build caching via buildx bake**:
```yaml
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
driver: docker-container
driver-opts: network=host
- name: Expose GHA cache to docker buildx CLI
uses: crazy-max/ghaction-github-runtime@v3
- name: Build Docker images (with cache)
run: |
pip install pyyaml
docker compose -f docker-compose.yml config > docker-compose.resolved.yml
python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
--source docker-compose.resolved.yml \
--cache-from "type=gha" \
--cache-to "type=gha,mode=max" \
...
docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
```
---
## Proposed Changes
### 1. Update pnpm caching in `claude.yml`
**Before:**
- Manual cache key generation
- Separate `actions/cache` step
- Manual pnpm store directory config
**After:**
- Use `setup-node` built-in `cache: "pnpm"` option
- Remove manual cache step
- Keep `corepack enable` before `setup-node`
### 2. Update Docker build in `claude.yml`
**Before:**
- Manual Docker layer caching via `actions/cache` with `/tmp/.buildx-cache`
- Simple `docker compose build`
**After:**
- Use `crazy-max/ghaction-github-runtime@v3` to expose GHA cache
- Use `docker-ci-fix-compose-build-cache.py` script
- Build with `docker buildx bake`
### 3. Apply same changes to other Claude workflows
- `claude-dependabot.yml` - Check if it has similar patterns
- `claude-ci-failure-auto-fix.yml` - Check if it has similar patterns
- `copilot-setup-steps.yml` - Reusable workflow, may be the source of truth
---
## Files to Modify
1. `.github/workflows/claude.yml`
2. `.github/workflows/claude-dependabot.yml` (if applicable)
3. `.github/workflows/claude-ci-failure-auto-fix.yml` (if applicable)
## Dependencies
- PR #12090 must be merged first (provides the `docker-ci-fix-compose-build-cache.py` script)
- Backend Dockerfile optimizations (already in PR #12090)
---
## Test Plan
1. Create PR with changes
2. Trigger Claude workflow manually or via `@claude` mention on a test issue
3. Compare CI runtime before/after
4. Verify Claude agent still works correctly (can checkout, build, run tests)
---
## Risk Assessment
**Low risk:**
- These are CI infrastructure changes, not code changes
- If caching fails, builds fall back to uncached (slower but works)
- Changes mirror proven patterns from PR #12090
---
## Questions for Reviewer
1. Should we wait for PR #12090 to merge before creating this PR?
2. Does `copilot-setup-steps.yml` need updating, or is it a separate concern?
3. Any concerns about cache key collisions between frontend E2E and Claude workflows?
---
## Verified
-**`claude-dependabot.yml`**: Has same pnpm caching pattern as `claude.yml` (manual `actions/cache`) — NEEDS UPDATE
-**`claude-ci-failure-auto-fix.yml`**: Simple workflow with no pnpm or Docker caching — NO CHANGES NEEDED
-**Script path**: `docker-ci-fix-compose-build-cache.py` will be at `.github/workflows/scripts/` after PR #12090 merges
-**Test seed caching**: NOT APPLICABLE — Claude workflows spin up a dev environment but don't run E2E tests with pre-seeded data. The seed caching in PR #12090 is specific to the frontend E2E test suite which needs consistent test data. Claude just needs the services running.