Mirror of https://github.com/Significant-Gravitas/AutoGPT.git
Synced 2026-04-08 03:00:28 -04:00
Latest commit: 11cfd8756ce2a1abe434d4a3c835fc12fca61425
7939 commits
---

11cfd8756c
fix(backend): standardize microservice host/port configuration
- Change agentgenerator_host default from empty string to "localhost"
  - Consistent with other client services (rabbitmq, redis, clamav)
  - Fixes "service_not_configured" error when only port is set
- Change agentgenerator_port default from 8000 to 8009
  - Avoids conflict with Kong gateway (port 8000)
  - Follows sequential port allocation (8001-8008 already in use)
- Add AGENTGENERATOR_HOST to .env.default for clarity

This standardization ensures:

1. Consistent host naming across all microservice client configurations
2. No port conflicts with Kong or other services
3. Agent Generator service works out of the box when enabled
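The default-vs-configured behavior can be pictured with a minimal sketch; class and field names are illustrative stand-ins for the real backend settings object:

```python
from dataclasses import dataclass

# Illustrative sketch of the standardized client settings described above.
# Class name and the `configured` helper are assumptions, not the real config.
@dataclass
class AgentGeneratorSettings:
    # "localhost" default matches the other client services (rabbitmq, redis, clamav)
    agentgenerator_host: str = "localhost"
    # 8009 avoids Kong on port 8000 and continues the 8001-8008 sequence
    agentgenerator_port: int = 8009

    @property
    def configured(self) -> bool:
        # With a non-empty host default, setting only the port no longer
        # yields a "service_not_configured" error.
        return bool(self.agentgenerator_host)

settings = AgentGeneratorSettings()
```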
---

e40c8c70ce
fix(copilot): collision detection, session locking, and sync for concurrent message saves (#12177)
Requested by @majdyz

Concurrent writers (incremental streaming saves from PR #12173 and long-running tool callbacks) can race to persist messages with the same `(sessionId, sequence)` pair, causing unique constraint violations on `ChatMessage`.

**Root cause:** The streaming loop tracks `saved_msg_count` in-memory, but the long-running tool callback (`_build_long_running_callback`) also appends messages and calls `upsert_chat_session` independently — without coordinating sequence numbers. When the streaming loop does its next incremental save with the stale `saved_msg_count`, it tries to insert at a sequence that already exists.

**Fix:** Multi-layered defense-in-depth approach:

1. **Collision detection with retry** (db.py): `add_chat_messages_batch` uses `create_many()` in a transaction. On `UniqueViolationError`, queries `MAX(sequence)+1` from DB and retries with the correct offset (max 5 attempts).
2. **Robust sequence tracking** (db.py): `get_next_sequence()` uses indexed `find_first` with `order={"sequence": "desc"}` for O(1) MAX lookup, immune to deleted messages.
3. **Session-based counter** (model.py): Added `saved_message_count` field to `ChatSession`. `upsert_chat_session` returns the session with updated count, eliminating tuple returns throughout the codebase.
4. **MessageCounter dataclass** (sdk/service.py): Replaced the `list[int]` mutable reference pattern with a clean `MessageCounter` dataclass for shared state between streaming loop and long-running callbacks.
5. **Session locking** (sdk/service.py): Prevent concurrent streams on the same session using Redis `SET NX EX` distributed locks with TTL refresh on heartbeats (`config.stream_ttl` = 3600s).
6. **Atomic operations** (db.py): Single timestamp for all messages and session update in batch operations for consistency. Parallel queries with `asyncio.gather` for lower latency.
7. **Config-based TTL** (sdk/service.py, config.py): Consolidated all TTL constants to use `config.stream_ttl` (3600s) with lock refresh on heartbeats.

### Key implementation details

- **create_many**: Uses `sessionId` directly (not nested `Session.connect`) as `create_many` doesn't support nested creates
- **Type narrowing**: Added explicit `assert session is not None` statements for pyright type checking in async contexts
- **Parallel operations**: Use `asyncio.gather` for independent DB operations (create_many + session update)
- **Single timestamp**: All messages in a batch share the same `createdAt` timestamp for atomicity

### Changes

- `backend/copilot/db.py`: Collision detection with `create_many` + retry, indexed sequence lookup, single timestamp, parallel queries
- `backend/copilot/model.py`: Added `saved_message_count` field, simplified return types
- `backend/copilot/sdk/service.py`: MessageCounter dataclass, session locking with refresh, config-based TTL, type narrowing
- `backend/copilot/service.py`: Updated all callers to handle new return types
- `backend/copilot/config.py`: Increased long_running_operation_ttl to 3600s with clarified docstring
- `backend/copilot/*_test.py`: Tests updated for new signatures

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
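The retry logic in point 1 can be sketched against an in-memory stand-in for the `ChatMessage` table; the real code uses Prisma's `create_many()` in a transaction, so everything below is illustrative:

```python
# Illustrative sketch of collision detection with retry: on a unique-key
# violation, re-read MAX(sequence)+1 from the "DB" and retry the batch.
class UniqueViolationError(Exception):
    pass

class FakeMessageTable:
    """In-memory stand-in for the ChatMessage table (not the real Prisma model)."""
    def __init__(self):
        self.rows = {}  # (session_id, sequence) -> message text

    def create_many(self, session_id, start_seq, texts):
        keys = [(session_id, start_seq + i) for i in range(len(texts))]
        if any(k in self.rows for k in keys):
            # Mirrors the unique constraint on (sessionId, sequence)
            raise UniqueViolationError(keys)
        for k, text in zip(keys, texts):
            self.rows[k] = text

    def next_sequence(self, session_id):
        seqs = [s for (sid, s) in self.rows if sid == session_id]
        return max(seqs) + 1 if seqs else 0

def add_chat_messages_batch(db, session_id, texts, start_seq, max_attempts=5):
    for _ in range(max_attempts):
        try:
            db.create_many(session_id, start_seq, texts)
            return start_seq
        except UniqueViolationError:
            # Another writer got there first: recompute the offset from the DB.
            start_seq = db.next_sequence(session_id)
    raise RuntimeError("exhausted retries")
```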
---

9cdcd6793f
fix(copilot): remove stream timeout, add error propagation to frontend (#12175)
## Summary
Fixes critical reliability issues where long-running copilot sessions
were forcibly terminated and failures showed no error messages to users.
## Issues Fixed
1. **Silent failures**: Tasks failed but frontend showed "stopped" with
zero explanation
2. **Premature timeout**: Sessions auto-expired after 5 minutes even
when actively running
## Changes
### Error propagation to frontend
- Add `error_message` parameter to `mark_task_completed()`
- When `status="failed"`, publish `StreamError` before `StreamFinish` so
frontend displays reason
- Update all failure callers with specific error messages:
- Session not found: `"Session {id} not found"`
- Tool setup failed: `"Failed to setup tool {name}: {error}"`
- Task cancelled: `"Task was cancelled"`
### Remove stream timeout
- Delete `stream_timeout` config (was 300s/5min)
- Remove auto-expiry logic in `get_active_task_for_session()`
- Sessions now run indefinitely — user controls stopping via UI
## Why
**Auto-expiry was broken:**
- Used `created_at` (task start) not last activity
- SDK sessions with multiple LLM calls + subagent Tasks easily run
20-30+ minutes
- A task publishing chunks every second still got killed at 5min mark
- Hard timeout is inappropriate for long-running AI agents
**Error propagation was missing:**
- `mark_task_completed(status="failed")` only sent `StreamFinish`
- No `StreamError` event = frontend had no message to show user
- Backend logs showed errors but user saw nothing
## Test Plan
- [x] Formatter, linter, type-check pass
- [ ] Start a copilot session with Task tool (spawns subagent)
- [ ] Verify session runs beyond 5 minutes without auto-expiry
- [ ] Cancel a running session → frontend shows "Task was cancelled"
error
- [ ] Trigger a tool setup failure → frontend shows error message
- [ ] Session continues running until user clicks stop or task completes
## Files Changed
- `backend/copilot/config.py` — removed `stream_timeout`
- `backend/copilot/stream_registry.py` — removed auto-expiry, added
error propagation
- `backend/copilot/service.py` — error messages for 2 failure paths
- `backend/copilot/executor/processor.py` — error message for
cancellation
---

fc64f83331
fix(copilot): SDK streaming reliability, parallel tools, incremental saves, frontend reconnection (#12173)
## Summary
Fixes multiple reliability issues in the copilot's Claude Agent SDK
streaming pipeline — tool outputs getting stuck, parallel tool calls
flushing prematurely, messages lost on page refresh, and SSE
reconnection failures.
## Changes
### Backend: Streaming loop rewrite (`sdk/service.py`)
- **Non-cancelling heartbeat pattern**: Replace `asyncio.timeout()` with
`asyncio.wait()` for SDK message iteration. The old approach corrupted
the SDK's internal anyio memory stream when timeouts fired
mid-`__anext__()`, causing `StopAsyncIteration` on the next call and
silently dropping all in-flight tool results.
- **Hook synchronization**: Add `wait_for_stash()` before
`convert_message()` — the SDK fires PostToolUse hooks via `start_soon()`
(fire-and-forget), so the next message can arrive before the hook
stashes its output. The new asyncio.Event-based mechanism bridges this
gap without arbitrary sleeps.
- **Error handling**: Add `asyncio.CancelledError` handling at both
inner (streaming loop) and outer (session) levels, plus pending task
cleanup in `finally` block to prevent leaked coroutines. Catch
`Exception` from `done.pop().result()` for SDK error messages.
- **Safety-net flush**: After streaming loop ends, flush any remaining
unresolved tool calls so the frontend stops showing spinners even if the
stream drops unexpectedly.
- **StreamFinish fallback**: Emit `StreamFinishStep` + `StreamFinish`
when stream ends without `ResultMessage` (StopAsyncIteration) so the
frontend transitions to "ready" state.
- **Incremental session saves**: Save session to PostgreSQL after each
tool input/output event (not just at stream end), so page refresh and
other devices see recent messages.
- **Enhanced logging**: All log lines now include `session_id[:12]`
prefix and tool call resolution state (unresolved/current/resolved
counts).
### Backend: Response adapter (`sdk/response_adapter.py`)
- **Parallel tool call support**: Skip `_flush_unresolved_tool_calls()`
when an AssistantMessage contains only ToolUseBlocks (parallel
continuation) — the prior tools are still executing concurrently and
haven't finished yet.
- **Duplicate output prevention**: Skip already-resolved tool results in
both UserMessage (ToolResultBlock) and parent_tool_use_id handling to
prevent duplicate `StreamToolOutputAvailable` events.
- **`has_unresolved_tool_calls` property**: Used by the streaming loop
to decide whether to wait for PostToolUse hooks.
- **`session_id` parameter**: Passed through for structured logging.
### Backend: Hook synchronization (`sdk/tool_adapter.py`)
- **`_stash_event` ContextVar**: asyncio.Event signaled by
`stash_pending_tool_output()` whenever a PostToolUse hook stashes
output.
- **`wait_for_stash()`**: Awaits the event with configurable timeout —
replaces the racy "hope the hook finished" approach.
### Backend: Security hooks (`sdk/security_hooks.py`)
- Enhanced logging in `post_tool_use_hook` — log whether tool is
built-in, preview of stashed output, warning when `tool_response` is
None.
### Backend: Incremental save optimization (`model.py`)
- **`existing_message_count` parameter** on `upsert_chat_session`: Skip
the DB query to count existing messages when the caller already tracks
this (streaming loop).
- **`skip_existence_check` parameter** on `_save_session_to_db`: Skip
the `get_chat_session` existence query when we know the session row
already exists. Reduces from 4 DB round trips to 2 per incremental save.
### Backend: SDK version bump (`pyproject.toml`, `poetry.lock`)
- Bump `claude-agent-sdk` from `^0.1.0` to `^0.1.39`.
### Backend: New tests
- **`sdk_compat_test.py`** (new file): SDK compatibility tests — verify
the installed SDK exposes every class, attribute, and method the copilot
integration relies on. Catches SDK upgrade breakage immediately.
- **`response_adapter_test.py`**: 9 new tests covering
flush-at-ResultMessage, flush-at-next-AssistantMessage, stashed output
flush, wait_for_stash signaling/timeout/fast-path, parallel tool call
non-premature-flush, text-message flush of prior tools, and
already-resolved tool skip in UserMessage.
### Frontend: Session hydration (`convertChatSessionToUiMessages.ts`)
- **`isComplete` option**: When session has no active stream, mark
dangling tool calls (no output in DB) as `output-available` with empty
output — stops stale spinners after page refresh.
### Frontend: Chat session hook (`useChatSession.ts`)
- Reorder `hasActiveStream` memo before `hydratedMessages` so
`isComplete` flag is available.
- Pass `{ isComplete: !hasActiveStream }` to
`convertChatSessionMessagesToUiMessages`.
### Frontend: Copilot page hook (`useCopilotPage.ts`)
- **Cache invalidation on stream end**: Invalidate React Query session
cache when stream transitions active → idle, so next hydration fetches
fresh messages from backend (staleTime: Infinity otherwise keeps stale
data).
- **Resume ref reset**: Reset `hasResumedRef` on stream end to allow
re-resume if SSE drops but backend task is still running.
- **Remove old `resolveInProgressTools` effect**: Replaced by
backend-side safety-net flush + hydration-time `isComplete` marking.
## Test plan
- [ ] Existing copilot tests pass (`pytest backend/copilot/ -x -q`)
- [ ] SDK compat tests pass (`pytest
backend/copilot/sdk/sdk_compat_test.py -v`)
- [ ] Tool outputs (bash_exec, web_fetch, WebSearch) appear in the UI
instead of getting stuck
- [ ] Parallel tool calls (e.g. multiple WebSearch) complete and display
results without premature flush
- [ ] Page refresh during active stream reconnects and recovers messages
- [ ] Opening session from another device shows recent tool results
- [ ] SSE drop → automatic reconnection without losing messages
- [ ] Long-running tools (create_agent) still delegate to background
infrastructure
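The non-cancelling heartbeat pattern described under "Streaming loop rewrite" can be sketched in plain asyncio; `slow_source` stands in for the SDK message stream, and the key point is that the pending `__anext__()` task survives heartbeat ticks instead of being cancelled mid-flight:

```python
import asyncio

async def slow_source():
    # Stand-in for the SDK's async message iterator.
    for i in range(3):
        await asyncio.sleep(0.05)
        yield i

async def consume(events):
    it = slow_source().__aiter__()
    pending_next = None
    try:
        while True:
            if pending_next is None:
                pending_next = asyncio.ensure_future(it.__anext__())
            # asyncio.wait with a timeout leaves the task running on timeout,
            # unlike asyncio.timeout(), which would cancel __anext__ mid-flight.
            done, _ = await asyncio.wait({pending_next}, timeout=0.02)
            if not done:
                events.append("heartbeat")  # iterator still pending, not cancelled
                continue
            pending_next = None
            try:
                events.append(done.pop().result())
            except StopAsyncIteration:
                break
    finally:
        # Pending-task cleanup, as the commit describes, to avoid leaked coroutines.
        if pending_next is not None:
            pending_next.cancel()

events = []
asyncio.run(consume(events))
```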
---

7718c49f05
Make CoPilot todo/task list card expanded by default (#12168)
The todo card rendered by GenericTool was collapsed by default, requiring users to click to see their checklist items. Now passes `defaultExpanded` when the category is `"todo"` so the task list is immediately visible.

**File changed:** `autogpt_platform/frontend/src/app/(platform)/copilot/tools/GenericTool/GenericTool.tsx`

Resolves [SECRT-2017](https://linear.app/autogpt/issue/SECRT-2017)
---

0a1591fce2
refactor(frontend): remove old builder code and monitoring components
(#12082)

### Changes 🏗️

This PR removes old builder code and monitoring components as part of the migration to the new flow editor:

- **NewControlPanel**: Simplified component by removing unused props (`flowExecutionID`, `visualizeBeads`, `pinSavePopover`, `pinBlocksPopover`, `nodes`, `onNodeSelect`, `onNodeHover`) and cleaned up commented legacy code
- **Import paths**: Updated all references from `legacy-builder/CustomNode` to `FlowEditor/nodes/CustomNode`
- **GraphContent**: Fixed type safety by properly handling `customized_name` metadata and using `categoryColorMap` instead of `getPrimaryCategoryColor`
- **useNewControlPanel**: Removed unused state and query parameter handling related to old builder
- Removed dead code and commented-out imports throughout

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verify NewControlPanel renders correctly
  - [x] Test BlockMenu functionality
  - [x] Test Save Control
  - [x] Test Undo/Redo buttons
  - [x] Verify graph search menu still works with updated imports

<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>

This PR removes legacy builder components and monitoring page (~12,000 lines of code), simplifying `NewControlPanel` to focus only on the new flow editor.

**Key changes:**
- Removed entire `legacy-builder/` directory (36 files) containing old CustomNode, CustomEdge, Flow, and control components
- Deleted `/monitoring` page and all related components (9 files)
- Deleted `useAgentGraph` hook (1,043 lines) that was only used by legacy components
- Simplified `NewControlPanel` by removing unused props (`flowExecutionID`, `nodes`, `onNodeSelect`, etc.) and commented-out code
- Updated imports in `NewSearchGraph` components to reference new `FlowEditor/nodes/CustomNode` instead of deleted `legacy-builder/CustomNode`
- Removed `/monitoring` from protected pages in `helpers.ts`
- Updated test files to remove monitoring-related test helpers

**Minor style issues:**
- `useNewControlPanel` hook returns unused state (`blockMenuSelected`) that should be cleaned up
- Unnecessary double negation (`!!`) in `GraphContent.tsx:136`
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>

- This PR is safe to merge with minor style improvements recommended
- The refactor is a straightforward deletion of legacy code with no references remaining in the codebase. All imports have been updated correctly, tests cleaned up, and routing configuration updated. The only issues are minor unused code that could be cleaned up but won't cause runtime errors.
- No files require special attention; the unused state in `useNewControlPanel.ts` is a minor style issue
</details>
<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant NewControlPanel
    participant BlockMenu
    participant NewSaveControl
    participant UndoRedoButtons
    participant Store as blockMenuStore (Zustand)
    Note over NewControlPanel: Simplified component (removed props & legacy code)
    User->>NewControlPanel: Render
    NewControlPanel->>useNewControlPanel: Call hook (unused return)
    NewControlPanel->>BlockMenu: Render
    BlockMenu->>Store: Access state via useBlockMenuStore
    Store-->>BlockMenu: Return search, filters, etc.
    NewControlPanel->>NewSaveControl: Render
    NewControlPanel->>UndoRedoButtons: Render
    Note over NewControlPanel,Store: State management moved from hook to Zustand store
    Note over User: Legacy components (CustomNode, Flow, etc.) completely removed
```
</details>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
---

681bb7c2b4
feat(copilot): workspace file tools, context reconstruction, transcript upload protection (#12164)
## Summary

- **Workspace file tools**: `write_workspace_file` now accepts plain text `content`, `source_path` (copy from ephemeral disk), and graceful fallback for invalid base64. `read_workspace_file` gains `save_to_path` to download workspace files to the ephemeral working directory. Both validate paths against the session-specific ephemeral directory.
- **Context reconstruction**: `_format_conversation_context` now includes tool call summaries and tool results (not just user/assistant text), fixing agent amnesia when the transcript is unavailable or stale.
- **Transcript upload protection**: Moved transcript upload from inside the inner `try` block to the `finally` block, ensuring it always runs even on streaming exceptions — prevents transcript loss that caused staleness on subsequent turns.
- **Agent inactivity timeout**: Configurable timeout (default 300s) kills hung Claude agents that stop producing SDK messages.
- **SDK system prompt**: Restructured with clear sections for shell commands, the two storage systems, file transfer workflows, and long-running tools.
- **Path validation hardening**: `_validate_ephemeral_path` uses `os.path.realpath` for both the session dir and the target path, fixing the macOS `/tmp` → `/private/tmp` symlink mismatch. Empty-string params are normalised to `None` to prevent dispatch assertion failures.

## Test plan

- [x] `_format_conversation_context` — empty, user, assistant, tool calls, tool results, full conversation (query_builder_test.py)
- [x] `_build_query_message` — resume up-to-date, stale transcript gap, zero msg count, no resume single/multi (query_builder_test.py)
- [x] `_validate_ephemeral_path` — valid path, traversal, cross-session, symlink escape, nested (workspace_files_test.py)
- [x] `_resolve_write_content` — no sources, multiple sources, plain text, base64, invalid base64, source_path, not found, outside ephemeral, empty strings (workspace_files_test.py)
- [ ] Verify transcript upload occurs even after streaming error
- [ ] Verify agent inactivity timeout kills hung agents (300s default)

---------

Co-authored-by: Otto (AGPT) <otto@agpt.co>
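The realpath-on-both-sides validation can be sketched as follows; this is a simplified stand-in for `_validate_ephemeral_path`, not the actual implementation:

```python
import os

def validate_ephemeral_path(session_dir: str, user_path: str) -> str:
    # Resolve BOTH the session root and the target through realpath, so a
    # symlinked root (macOS /tmp -> /private/tmp) compares correctly.
    root = os.path.realpath(session_dir)
    target = os.path.realpath(os.path.join(root, user_path))
    # Reject anything that resolves outside the session directory
    # (path traversal, cross-session access, symlink escapes).
    if target != root and not target.startswith(root + os.sep):
        raise ValueError(f"path escapes session directory: {user_path}")
    return target
```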
---

0818cd6683
fix(copilot): prevent background agent stalls and context hallucination (#12167)
## Summary
- **Block background Task agents**: The SDK's `Task` tool with
`run_in_background=true` stalls the SSE stream (no messages flow while
they execute) and the agents get killed when the main agent's turn ends
and we SIGTERM the CLI. The `PreToolUse` hook now denies these and tells
the agent to run tasks in the foreground instead.
- **Add heartbeats to SDK streaming**: Replaced the `async for` loop
with an explicit async iterator + `asyncio.wait_for(15s)`. Sends
`StreamHeartbeat` when the CLI is idle (e.g. during long tool execution)
to keep SSE connections alive through proxies/LBs.
- **Fix summarization hallucination**: The `_summarize_messages_llm`
prompt forced the LLM to produce ALL 9 sections ("You MUST include
ALL"), causing fabrication when the conversation didn't have content for
every section. Changed to optional sections with explicit
anti-hallucination instructions.
## Context
Session `7a9dda34-1068-4cfb-9132-5daf8ad31253` exhibited both issues:
1. The copilot tried to spin up background agents to create files in
parallel, then stopped responding
2. On resume, the copilot hallucinated having completed a "comprehensive
competitive analysis" with "9 deliverables" that never happened
## Test plan
- [x] All 26 security hooks tests pass (3 new: background blocked,
foreground allowed, limit enforced)
- [x] All 44 prompt utility tests pass
- [x] Linting and typecheck pass
- [ ] Manual test: copilot session where agent attempts to use Task tool
— should run foreground only
- [ ] Manual test: long-running tool execution — SSE should stay alive
via heartbeats
- [ ] Manual test: resume a multi-turn session — no hallucinated context
in summary
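A hypothetical sketch of the background-Task denial described above; the real `PreToolUse` hook signature in the SDK differs, but the decision logic has this shape:

```python
# Hypothetical hook-style check (NOT the SDK's actual hook API): deny Task
# invocations that request run_in_background=True and tell the agent to run
# the task in the foreground instead.
def pre_tool_use_check(tool_name: str, tool_input: dict) -> dict:
    if tool_name == "Task" and tool_input.get("run_in_background"):
        return {
            "decision": "deny",
            "reason": "Background Task agents stall the SSE stream and get "
                      "killed when the turn ends; run this task in the foreground.",
        }
    return {"decision": "allow"}
```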
---

7a39bdfaf8
feat(copilot): wire up stop button to cancel executor tasks (#12171)
## Summary
- The stop button was completely disconnected — clicking it only aborted
the client-side SSE fetch while the executor kept running indefinitely
- The executor already had full cancel infrastructure (RabbitMQ FANOUT
consumer, `CancelCoPilotEvent`, `threading.Event`, periodic cancel
checks) but nobody ever published a cancel message
- This PR wires up the missing pieces: a cancel REST endpoint, a publish
function, and frontend integration
## Changes
- **`executor/utils.py`**: Add `enqueue_cancel_task()` to publish
`CancelCoPilotEvent` to the existing FANOUT exchange
- **`routes.py`**: Add `POST /sessions/{session_id}/cancel` that finds
the active task, publishes cancel, and polls Redis until the task
confirms stopped (up to 10s timeout)
- **`cancel/route.ts`**: Next.js API proxy route for the cancel endpoint
- **`useCopilotPage.ts`**: Wrap AI SDK's `stop()` to also call the
backend cancel API — `sdkStop()` fires first for instant UI feedback,
then the cancel API waits for executor confirmation
## Test plan
- [ ] Start a copilot chat session and send a message
- [ ] Click "Stop generating" while streaming
- [ ] Verify executor logs show `Received cancel for {task_id}` and
`Cancelled during streaming`
- [ ] Verify the cancel endpoint returns `{"cancelled": true}` (not
timeout)
- [ ] Verify frontend transitions to idle state
- [ ] Verify clicking stop when no task is running returns
`{"cancelled": false, "reason": "no_active_task"}`
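The cancel-then-poll flow can be sketched with stand-in publish and status functions; the real endpoint publishes `CancelCoPilotEvent` and polls Redis for up to 10s, while all names below are illustrative:

```python
import asyncio

async def cancel_session(publish_cancel, is_task_stopped, timeout=0.5, interval=0.05):
    # Publish the cancel event, then poll until the executor confirms it
    # stopped or the timeout expires (shortened here from the 10s in the PR).
    publish_cancel()
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        if await is_task_stopped():
            return {"cancelled": True}
        await asyncio.sleep(interval)
    return {"cancelled": False, "reason": "timeout"}

async def demo():
    state = {"stopped": False}

    def publish():
        # Simulate the executor acting on the cancel ~0.1s later.
        asyncio.get_running_loop().call_later(0.1, state.__setitem__, "stopped", True)

    async def stopped():
        return state["stopped"]

    return await cancel_session(publish, stopped)

result = asyncio.run(demo())
```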
---

0b151f64e8
feat(copilot): Execute parallel tool calls concurrently (#12165)
When the LLM returns multiple tool calls in a single response (e.g. multiple web fetches for a research task), they now execute concurrently instead of sequentially. This can dramatically reduce latency for multi-tool turns.

**Before:** Tool calls execute one after another — 7 web fetches × 2s each = 14s total
**After:** All tool calls fire concurrently — 7 web fetches = ~2s total

### Changes

- **`service.py`**: New `_execute_tool_calls_parallel()` function that spawns tool calls as concurrent `asyncio` tasks, collecting stream events via `asyncio.Queue`
- **`service.py`**: `_yield_tool_call()` now accepts an optional `session_lock` parameter for concurrent-safe session mutations
- **`base.py`**: Session lock exposed via `contextvars` so tools that need it can access it without interface changes
- **`run_agent.py`**: Rate-limit counters (`successful_agent_runs`, `successful_agent_schedules`) protected with the session lock to prevent race conditions

### Concurrency Safety

| Shared State | Risk | Mitigation |
|---|---|---|
| `session.messages` (long-running tools only) | Race on append + upsert | `session_lock` wraps mutations |
| `session.successful_agent_runs` counter | Bypass max-runs check | `session_lock` wraps read-check-increment |
| Tool-internal state (DB queries, API calls) | None — stateless | No mitigation needed |

### Testing

- Added `parallel_tool_calls_test.py` with tests for:
  - Parallel timing verification (sum vs max of delays)
  - Single tool call regression
  - Retryable error propagation
  - Shared session lock verification
  - Cancellation cleanup

Closes SECRT-2016

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
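The sum-vs-max latency claim is easy to demonstrate with stand-in tool calls; this is not the real `_execute_tool_calls_parallel`, which also threads stream events through an `asyncio.Queue`:

```python
import asyncio
import time

async def fake_fetch(i):
    # Stand-in for a tool call such as a web fetch.
    await asyncio.sleep(0.05)
    return f"result-{i}"

async def sequential(n):
    # Before: one after another, total time is the SUM of latencies.
    return [await fake_fetch(i) for i in range(n)]

async def parallel(n):
    # After: spawn all calls as concurrent tasks, total time is roughly the MAX.
    return await asyncio.gather(*(fake_fetch(i) for i in range(n)))

async def main():
    t0 = time.monotonic()
    await sequential(5)
    seq = time.monotonic() - t0
    t0 = time.monotonic()
    results = await parallel(5)
    par = time.monotonic() - t0
    return seq, par, list(results)

seq, par, results = asyncio.run(main())
```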
---

be2a48aedb
feat(platform/copilot): add SuggestedGoalResponse for vague/unachievable goals (#12139)
## Summary
- Add `SUGGESTED_GOAL` response type and `SuggestedGoalResponse` model
to backend; vague/unachievable goals now return a structured suggestion
instead of a generic error
- Add `SuggestedGoalCard` frontend component (amber styling, "Use this
goal" button) that lets users accept and re-submit a refined goal in one
click
- Add error recovery buttons ("Try again", "Simplify goal") to the error
output block
- Update copilot system prompt with explicit guidance for handling
`suggested_goal` and `clarifying_questions` feedback loops
- Add `create_agent_test.py` covering all four decomposition result
types
## Test plan
- [ ] Trigger vague goal (e.g. "monitor social media") →
`SuggestedGoalCard` renders with amber styling
- [ ] Trigger unachievable goal (e.g. "read my mind") → card shows goal
type "Goal cannot be accomplished" with reason
- [ ] Click "Use this goal" → sends message and triggers new
`create_agent` call with the suggested goal
- [ ] Trigger an error → "Try again" and "Simplify goal" buttons appear
below the error
- [ ] Clarifying questions answered → LLM re-calls `create_agent` with
context (system prompt guidance)
- [ ] Backend tests pass: `poetry run pytest
backend/api/features/chat/tools/create_agent_test.py -xvs` (requires
Docker services)
<!-- greptile_comment -->
<details><summary><h3>Greptile Summary</h3></summary>
Replaced generic `ErrorResponse` with structured `SuggestedGoalResponse`
for vague/unachievable goals in the copilot agent creation flow. Added
frontend `SuggestedGoalCard` component with amber styling and "Use this
goal" button for one-click goal refinement. Enhanced system prompt with
explicit feedback loop handling for `suggested_goal` and
`clarifying_questions`. Added comprehensive test coverage for all four
decomposition result types.
**Key improvements:**
- Better UX: Users can now accept refined goals with one click instead
of manually retyping
- Clearer error recovery: Added "Try again" and "Simplify goal" buttons
to error blocks
- Structured data: Backend now returns `suggested_goal`, `reason`,
`original_goal`, and `goal_type` fields instead of embedding everything
in error messages
**Issue found:**
- The `reason` field from the backend is not being passed to or
displayed by the `SuggestedGoalCard` component, so users won't see the
explanation for why their goal was rejected (especially important for
unachievable goals where it explains what blocks are missing)
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- Safe to merge after fixing the missing `reason` field in the frontend
component
- Implementation is well-structured with good test coverage and follows
established patterns. The issue with the missing `reason` field is
straightforward to fix but important for UX - users won't understand why
their goal was rejected without it. All other changes are solid: backend
properly returns structured data, tests cover all cases, and the
component integration follows the project's conventions.
-
autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/CreateAgent.tsx
and SuggestedGoalCard.tsx need the `reason` prop added
</details>
<details><summary><h3>Flowchart</h3></summary>
```mermaid
flowchart TD
Start[User submits goal to create_agent] --> Decompose[decompose_goal analyzes request]
Decompose --> CheckType{Decomposition result type?}
CheckType -->|clarifying_questions| Questions[Return ClarificationNeededResponse]
Questions --> UserAnswers[User answers questions]
UserAnswers --> Retry[Retry with context]
Retry --> Decompose
CheckType -->|vague_goal| VagueResponse[Return SuggestedGoalResponse<br/>goal_type: vague]
VagueResponse --> ShowSuggestion[Frontend: SuggestedGoalCard<br/>amber styling]
ShowSuggestion --> UserAccepts{User clicks<br/>Use this goal?}
UserAccepts -->|Yes| NewGoal[Send suggested goal]
NewGoal --> Decompose
UserAccepts -->|No| End1[User refines manually]
CheckType -->|unachievable_goal| UnachievableResponse[Return SuggestedGoalResponse<br/>goal_type: unachievable<br/>reason: missing blocks]
UnachievableResponse --> ShowSuggestion
CheckType -->|success| Generate[generate_agent creates workflow]
Generate --> SaveOrPreview{save parameter?}
SaveOrPreview -->|true| Save[Save to library<br/>AgentSavedResponse]
SaveOrPreview -->|false| Preview[AgentPreviewResponse]
CheckType -->|error| ErrorFlow[Return ErrorResponse]
ErrorFlow --> ShowError[Frontend: Show error with<br/>Try again & Simplify goal buttons]
ShowError --> UserRetry{User action?}
UserRetry -->|Try again| Decompose
UserRetry -->|Simplify goal| GetHelp[Ask LLM to simplify]
GetHelp --> Decompose
Save --> End2[Done]
Preview --> End2
End1 --> End2
```
</details>
<sub>Last reviewed commit: 2f37aee</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
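The structured payload might look roughly like this; a dataclass stand-in for illustration only, since the real model is a backend Pydantic class and any field beyond those named in the PR (`suggested_goal`, `reason`, `original_goal`, `goal_type`) is an assumption:

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical stand-in for SuggestedGoalResponse, using the fields the PR lists.
@dataclass
class SuggestedGoalResponse:
    suggested_goal: str
    reason: str
    original_goal: str
    goal_type: Literal["vague", "unachievable"]

resp = SuggestedGoalResponse(
    suggested_goal="Post a daily summary of top posts from one platform to Slack",
    reason="'monitor social media' does not specify a platform, cadence, or output",
    original_goal="monitor social media",
    goal_type="vague",
)
```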
---

aeca4dbb79
docs(frontend): add mandatory pre-completion checks to CLAUDE.md (#12161)
### Changes 🏗️

Adds a **Pre-completion Checks (MANDATORY)** section to `frontend/CLAUDE.md` that instructs Claude Code agents to always run the following commands in order before reporting frontend work as done:

1. `pnpm format` — auto-fix formatting issues
2. `pnpm lint` — check for lint errors and fix them
3. `pnpm types` — check for type errors and fix them

This ensures code quality gates are enforced consistently by AI agents working on the frontend.

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified `pnpm format` passes cleanly
  - [x] Verified `pnpm lint` passes cleanly
  - [x] Verified `pnpm types` passes cleanly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
---

7b85eeaae2
refactor(frontend): fix flaky e2e tests (#12156)
### Changes 🏗️

Some fixes to make running e2e tests more predictable.

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] e2e tests are idempotent

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
---

4db3be2d61
fix(frontend): switch minigame to snake (#12160)
## Changes 🏗️

<img width="600" height="416" alt="Screenshot 2026-02-19 at 18 05 39" src="https://github.com/user-attachments/assets/930116ad-b611-4398-bee7-4e33ca4dc688" />

Make the mini game a snake 🐍 game, so we don't use assets (_possible license issues_), and it's simpler.

## Checklist 📋

### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app and test
---

f57a1995d0
fix(frontend): make chat spinner centred when loading (#12154)
## Changes 🏗️

<img width="800" height="969" alt="Screenshot 2026-02-18 at 20 30 36" src="https://github.com/user-attachments/assets/30d7d211-98c1-4159-94e1-86e81e29ad43" />

- Make the spinner centred when the chat is loading

## Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app and test locally
---

3928c35928
feat(copilot): SDK tool output, transcript resume, stream reconnection, GenericTool UI (#12159)
## Summary
### SDK built-in tool output forwarding
- WebSearch, Read, TodoWrite outputs now render in the frontend —
PostToolUse hook stashes outputs before SDK truncation, response adapter
flushes unresolved tool calls via `_flush_unresolved_tool_calls` +
`parent_tool_use_id` handling
- Multi-call stash upgraded to `dict[str, list[str]]` to support
multiple calls to the same built-in tool in one turn
### Transcript-based `--resume` with staleness detection
- Simplified to single upload block after `async with` (Stop hook +
`appendFileSync` guarantees), extracted `_try_upload_transcript` helper
- **NEW**: `message_count` watermark + timestamp metadata stored
alongside transcript — on the next turn, detects staleness and
compresses only the missed messages instead of the full history (hybrid:
transcript via `--resume` + compressed gap)
- Removed redundant dual-strategy code and dead
`find_cli_transcript`/`read_fallback_transcript` functions
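
The watermark check above can be sketched like this (a hypothetical helper under assumed names, not the actual AutoGPT function):

```python
def transcript_gap(transcript_message_count: int, messages: list[str]) -> list[str]:
    """Return the messages the stored transcript has not yet seen.

    transcript_message_count is the watermark stored alongside the
    transcript. An empty result means the transcript is fresh and
    --resume alone suffices; otherwise only the returned gap needs to
    be compressed, instead of the full history.
    """
    if transcript_message_count >= len(messages):
        return []  # transcript covers everything: resume only
    return messages[transcript_message_count:]  # hybrid: resume + compressed gap


history = ["m1", "m2", "m3", "m4", "m5"]
assert transcript_gap(5, history) == []            # fresh transcript
assert transcript_gap(3, history) == ["m4", "m5"]  # stale: compress the gap
```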
### Frontend stream reconnection
- **NEW**: Enabled `resume: true` on `useChat` with
`prepareReconnectToStreamRequest` — page refresh reconnects to active
backend streams via Redis replay (backend `resume_session_stream`
endpoint was already wired up)
### GenericTool.tsx UI overhaul
- Tool-specific icons (terminal, globe, file, search, edit, checklist)
with category-based display
- TodoWrite checklist rendering with status indicators
(completed/in-progress/pending)
- WebSearch/MCP content display via `extractMcpText` for MCP-style
content blocks + raw JSON fallback
- Defensive TodoItem filter per coderabbit review
- Proper accordion content per tool category (bash, web, file, search,
edit, todo)
### Image support
- MCP tool results now include `{"type": "image"}` content blocks when
workspace file responses contain `content_base64` with image MIME types
### Security & cleanup
- `AskUserQuestion` added to `SDK_DISALLOWED_TOOLS` (interactive CLI
tool, no terminal in copilot)
- 36 per-operation `[TIMING]`/`[TASK_LOOKUP]` diagnostic logs downgraded
info→debug
- Silent exception fixes: warning logs for swallowed errors in
stream_registry + service
## Test plan
- [ ] Verify copilot multi-turn conversations use `--resume` (check logs
for `Using --resume`)
- [ ] Verify stale transcript detection fills gap (check logs for
`Transcript stale: covers N of M messages`)
- [ ] Verify page refresh reconnects to active stream (check network tab
for GET to `/stream` returning SSE)
- [ ] Verify WebSearch, Read, TodoWrite tool outputs render in frontend
accordion
- [ ] Verify GenericTool icons and accordion content display correctly
for each tool type
- [ ] Verify production log volume is reduced (no more `[TIMING]` at
info level)
---------
Co-authored-by: Otto (AGPT) <otto@agpt.co>
|
||
|
|
dc77e7b6e6 |
feat(frontend): Replace advanced switch with chevron on builder nodes (#12152)
## Summary Replaces the "Advanced" switch/toggle on builder nodes with a chevron control, matching the UX pattern used for the outputs section. Resolves [OPEN-3006](https://linear.app/autogpt/issue/OPEN-3006/replace-advanced-switch-with-chevron-on-builder-nodes) Before <img width="443" height="348" alt="Screenshot 2026-02-17 at 9 01 31 pm" src="https://github.com/user-attachments/assets/40e98669-3136-4e53-8d46-df18ea32c4d7" /> After <img width="443" height="348" alt="Screenshot 2026-02-17 at 9 00 21 pm" src="https://github.com/user-attachments/assets/0836e3ac-1d0a-43d7-9392-c9d5741b32b6" /> ## Changes - `NodeAdvancedToggle.tsx` — Replaced switch component with a chevron expand/collapse toggle ## Testing Tested and verified by @kpczerwinski <!-- greptile_comment --> <details><summary><h3>Greptile Summary</h3></summary> Replaces the `Switch` toggle for the "Advanced" section on builder nodes with a chevron (`CaretDownIcon`) expand/collapse control, matching the existing UX pattern used in `OutputHandler.tsx`. The change is clean and consistent with the codebase. - Swapped `Switch` component for a ghost `Button` + `CaretDownIcon` with a `rotate-180` transition for visual feedback - Pattern closely mirrors the output section toggle in `OutputHandler.tsx` (lines 120-136) - Removed the top border separator and rounded bottom corners from the container, adjusting the visual spacing - Toggle logic correctly inverts the `showAdvanced` boolean state - Uses Phosphor Icons and design system components per project conventions </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge — it is a small, focused UI change with no logic or security concerns. - Single file changed with a straightforward UI component swap. The new implementation follows an established pattern already in use in OutputHandler.tsx. Toggle logic is correct and all conventions (Phosphor Icons, design system components, Tailwind styling) are followed. 
- No files require special attention. </details> <details><summary><h3>Sequence Diagram</h3></summary> ```mermaid sequenceDiagram participant User participant NodeAdvancedToggle participant nodeStore User->>NodeAdvancedToggle: Click chevron button NodeAdvancedToggle->>nodeStore: setShowAdvanced(nodeId, !showAdvanced) nodeStore-->>NodeAdvancedToggle: Updated showAdvanced state NodeAdvancedToggle->>NodeAdvancedToggle: Rotate CaretDownIcon (0° ↔ 180°) Note over NodeAdvancedToggle: Advanced fields shown/hidden via FormCreator ``` </details> <sub>Last reviewed commit: ad66080</sub> <!-- greptile_other_comments_section --> **Context used:** - Context from `dashboard` - autogpt_platform/frontend/CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=39861924-d320-41ba-a1a7-a8bff44f780a)) - Context from `dashboard` - autogpt_platform/frontend/src/app/(platform)/build/components/FlowEditor/ARCHITECTURE_FLOW_EDITOR.md ([source](https://app.greptile.com/review/custom-context?memory=0c5511fe-9aeb-4cf1-bbe9-798f2093b748)) <!-- /greptile_comment --> --------- Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Ubbe <0ubbe@users.noreply.github.com> Co-authored-by: Ubbe <hi@ubbe.dev> |
||
|
|
ba75cc28b5 |
fix(copilot): Remove description from feature request search, add PII prevention (#12155)
Two targeted changes to the CoPilot feature request tools: 1. **Remove description from search results** — The `search_feature_requests` tool no longer returns issue descriptions. Only the title is needed for duplicate detection, reducing unnecessary data exposure. 2. **Prevent PII in created issues** — Updated the `create_feature_request` tool description and parameter descriptions to explicitly instruct the LLM to never include personally identifiable information (names, emails, company names, etc.) in Linear issue titles and descriptions. Resolves [SECRT-2010](https://linear.app/autogpt/issue/SECRT-2010) |
||
|
|
15bcdae4e8 |
fix(backend/copilot): Clean up GCSWorkspaceStorage per worker (#12153)
The copilot executor runs each worker in its own thread with a dedicated event loop (`asyncio.new_event_loop()`). `aiohttp.ClientSession` is bound to the event loop where it was created — using it from a different loop causes `asyncio.timeout()` to fail with: ``` RuntimeError: Timeout context manager should be used inside a task ``` This was the root cause of transcript upload failures tracked in SECRT-2009 and [Sentry #7272473694](https://significant-gravitas.sentry.io/issues/7272473694/). ### Fix **One `GCSWorkspaceStorage` instance per event loop** instead of a single shared global. - `get_workspace_storage()` now returns a per-loop GCS instance (keyed by `id(asyncio.get_running_loop())`). Local storage remains shared since it has no async I/O. - `shutdown_workspace_storage()` closes the instance for the **current** loop only, so `session.close()` always runs on the loop that created the session. - `CoPilotProcessor.cleanup()` shuts down workspace storage on the worker's own loop, then stops the loop. - Manager cleanup submits `cleanup_worker` to each thread pool worker before shutting down the executor — replacing the old approach of creating a temporary event loop that couldn't close cross-loop sessions. ### Changes | File | Change | |------|--------| | `util/workspace_storage.py` | `GCSWorkspaceStorage` back to simple single-session class; `get_workspace_storage()` returns per-loop GCS instance; `shutdown_workspace_storage()` scoped to current loop | | `copilot/executor/processor.py` | Added `CoPilotProcessor.cleanup()` and `cleanup_worker()` | | `copilot/executor/manager.py` | Calls `cleanup_worker` on each thread pool worker during shutdown | Fixes SECRT-2009 --------- Co-authored-by: Reinier van der Leer <pwuts@agpt.co> |
||
|
|
e9ba7e51db |
fix(copilot): Route workspace through db_accessors, fix transcript upload (#12148)
## Summary Fixes two bugs in the copilot executor: ### SECRT-2008: WorkspaceManager bypasses db_accessors `backend/util/workspace.py` imported 6 workspace functions directly from `backend/data/workspace.py`, which call `prisma()` directly. In the copilot executor (no Prisma connection), these fail. **Fix:** Replace direct imports with `workspace_db()` from `db_accessors`, routing through the database_manager HTTP client when Prisma is unavailable. Also: - Register all 6 workspace functions in `DatabaseManager` and `DatabaseManagerAsyncClient` - Add `UniqueViolationError` to the service `EXCEPTION_MAPPING` so it's properly re-raised over HTTP (needed for race-condition retry logic) ### SECRT-2009: Transcript upload asyncio.timeout error `asyncio.create_task()` at line 696 of `service.py` creates an orphaned background task in the executor's thread event loop. `gcloud-aio-storage`'s `asyncio.timeout()` fails in this context. **Fix:** Replace `create_task` with direct `await`. The upload runs after streaming completes (all chunks already yielded), so no user-facing latency impact. The function already has internal try/except error handling. |
||
|
|
d23248f065 |
feat(backend/copilot): Copilot Executor Microservice (#12057)
Uncouple Copilot task execution from the REST API server. This should help performance and scalability, and allows task execution to continue regardless of the state of the user's connection. - Resolves #12023 ### Changes 🏗️ - Add `backend.copilot.executor`->`CoPilotExecutor` (setup similar to `backend.executor`->`ExecutionManager`). This executor service uses RabbitMQ-based task distribution, and sticks with the existing Redis Streams setup for task output. It uses a cluster lock mechanism to ensure a task is only executed by one pod, and the `DatabaseManager` for pooled DB access. - Add `backend.data.db_accessors` for automatic choice of direct/proxied DB access Chat requests now flow: API → RabbitMQ → CoPilot Executor → Redis Streams → SSE Client. This enables horizontal scaling of chat processing and isolates long-running LLM operations from the API service. - Move non-API Copilot stuff into `backend.copilot` (from `backend.api.features.chat`) - Updated import paths for all usages - Move `backend.executor.database` to `backend.data.db_manager` and add methods for copilot executor - Updated import paths for all usages - Make `backend.copilot.db` RPC-compatible (-> DB ops return ~~Prisma~~ Pydantic models) - Make `backend.data.workspace` RPC-compatible - Make `backend.data.graphs.get_store_listed_graphs` RPC-compatible DX: - Add `copilot_executor` service to Docker setup Config: - Add `Config.num_copilot_workers` (default 5) and `Config.copilot_executor_port` (default 8008) - Remove unused `Config.agent_server_port` > [!WARNING] > **This change adds a new microservice to the system, with entrypoint `backend.copilot.executor`.** > The `docker compose` setup has been updated, but if you run the Platform on something else, you'll have to update your deployment config to include this new service. > > When running locally, the `CoPilotExecutor` uses port 8008 by default. 
### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Copilot works - [x] Processes messages when triggered - [x] Can use its tools #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under **Changes**) --------- Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> |
||
|
|
905373a712 |
fix(frontend): use singleton Shiki highlighter for code syntax highlighting (#12144)
## Summary
Addresses SENTRY-1051: Shiki warning about multiple highlighter
instances.
## Problem
The `@streamdown/code` package creates a **new Shiki highlighter for
each language** encountered. When users view AI chat responses with code
blocks in multiple languages (JavaScript, Python, JSON, YAML, etc.),
this creates 10+ highlighter instances, triggering Shiki's warning:
> "10 instances have been created. Shiki is supposed to be used as a
singleton, consider refactoring your code to cache your highlighter
instance"
This causes memory bloat and performance degradation.
## Solution
Introduced a custom code highlighting plugin that properly implements
the singleton pattern:
### New files:
- `src/lib/shiki-highlighter.ts` - Singleton highlighter management
- `src/lib/streamdown-code-plugin.ts` - Drop-in replacement for
`@streamdown/code`
### Key features:
- **Single shared highlighter** - One instance serves all code blocks
- **Preloaded common languages** - JS, TS, Python, JSON, Bash, YAML,
etc.
- **Lazy loading** - Additional languages loaded on demand
- **Result caching** - Avoids re-highlighting identical code blocks
### Changes:
- Added `shiki` as direct dependency
- Updated `message.tsx` to use the new plugin
## Testing
- [ ] Verify code blocks render correctly in AI chat
- [ ] Confirm no Shiki singleton warnings in console
- [ ] Test with multiple languages in same conversation
## Related
- Linear: SENTRY-1051
- Sentry: Multiple Shiki instances warning
<!-- greptile_comment -->
<details><summary><h3>Greptile Summary</h3></summary>
Replaced `@streamdown/code` with a custom singleton-based Shiki
highlighter implementation to resolve memory bloat from creating
multiple highlighter instances per language. The new implementation
creates a single shared highlighter with preloaded common languages (JS,
TS, Python, JSON, etc.) and lazy-loads additional languages on demand.
Results are cached to avoid re-highlighting identical code blocks.
**Key changes:**
- Added `shiki` v3.21.0 as a direct dependency
- Created `shiki-highlighter.ts` with singleton pattern and language
management utilities
- Created `streamdown-code-plugin.ts` as a drop-in replacement for
`@streamdown/code`
- Updated `message.tsx` to import from the new plugin instead of
`@streamdown/code`
The implementation follows React best practices with async highlighting
and callback-based notifications. The cache key uses code length +
prefix/suffix for efficient lookups on large code blocks.
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- Safe to merge with minor considerations for edge cases
- The implementation is solid with proper singleton pattern, caching,
and async handling. The code is well-structured and addresses the stated
problem. However, there's a subtle potential race condition in the
callback handling where multiple concurrent requests for the same cache
key could trigger duplicate highlight operations before the first
completes. The cache key generation using prefix/suffix could
theoretically cause false cache hits for large files with identical
prefixes and suffixes. Despite these edge cases, the implementation
should work correctly for the vast majority of use cases.
- No files require special attention
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant UI as Streamdown Component
participant Plugin as Custom Code Plugin
participant Cache as Token Cache
participant Singleton as Shiki Highlighter (Singleton)
participant Callbacks as Pending Callbacks
UI->>Plugin: highlight(code, lang)
Plugin->>Cache: Check cache key
alt Cache hit
Cache-->>Plugin: Return cached result
Plugin-->>UI: Return highlighted tokens
else Cache miss
Plugin->>Callbacks: Register callback
Plugin->>Singleton: Get highlighter instance
alt First call
Singleton->>Singleton: Create highlighter with preloaded languages
end
Singleton-->>Plugin: Return highlighter
alt Language not loaded
Plugin->>Singleton: Load language dynamically
end
Plugin->>Singleton: codeToTokens(code, lang, themes)
Singleton-->>Plugin: Return tokens
Plugin->>Cache: Store result
Plugin->>Callbacks: Notify all waiting callbacks
Callbacks-->>UI: Async callback with result
end
```
</details>
<sub>Last reviewed commit: 96c793b</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
|
||
|
|
ee9d39bc0f |
refactor(copilot): Replace legacy delete dialog with molecules/Dialog (#12136)
## Summary
Updates the session delete confirmation in CoPilot to use the new
`Dialog` component from `molecules/Dialog` instead of the legacy
`DeleteConfirmDialog`.
## Changes
- **ChatSidebar**: Use Dialog component for delete confirmation
(desktop)
- **CopilotPage**: Use Dialog component for delete confirmation (mobile)
## Behavior
- Dialog stays **open** during deletion with loading state on button
- Cancel button **disabled** while delete is in progress
- Delete button shows **loading spinner** during deletion
- Dialog only closes on successful delete or when cancel is clicked (if
not deleting)
## Screenshots
*Dialog uses the same styling as other molecules/Dialog instances in the
app*
## Requested by
@0ubbe
<!-- greptile_comment -->
<details><summary><h3>Greptile Summary</h3></summary>
Replaces the legacy `DeleteConfirmDialog` component with the new
`molecules/Dialog` component for session delete confirmations in both
desktop (ChatSidebar) and mobile (CopilotPage) views. The new
implementation maintains the same behavior: dialog stays open during
deletion with a loading state on the delete button and disabled cancel
button, closing only on successful deletion or cancel click.
</details>
<details><summary><h3>Confidence Score: 5/5</h3></summary>
- This PR is safe to merge with minimal risk
- This is a straightforward component replacement that maintains the
same behavior and UX. The Dialog component API is properly used with
controlled state, the loading states are correctly implemented, and both
mobile and desktop views are handled consistently. The changes are
well-tested patterns used elsewhere in the codebase.
- No files require special attention
</details>
<details><summary><h3>Flowchart</h3></summary>
```mermaid
flowchart TD
A[User clicks delete button] --> B{isMobile?}
B -->|Yes| C[CopilotPage Dialog]
B -->|No| D[ChatSidebar Dialog]
C --> E[Set sessionToDelete state]
D --> E
E --> F[Dialog opens with controlled.isOpen]
F --> G{User action?}
G -->|Cancel| H{isDeleting?}
H -->|No| I[handleCancelDelete: setSessionToDelete null]
H -->|Yes| J[Cancel button disabled]
G -->|Confirm Delete| K[handleConfirmDelete called]
K --> L[deleteSession mutation]
L --> M[isDeleting = true]
M --> N[Button shows loading spinner]
M --> O[Cancel button disabled]
L --> P{Mutation result?}
P -->|Success| Q[Invalidate sessions query]
Q --> R[Clear sessionId if current]
R --> S[setSessionToDelete null]
S --> T[Dialog closes]
P -->|Error| U[Show toast error]
U --> V[setSessionToDelete null]
V --> W[Dialog closes]
```
</details>
<sub>Last reviewed commit: 275950c</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
---------
Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Ubbe <hi@ubbe.dev>
|
||
|
|
05aaf7a85e |
fix(backend): Rename LINEAR_API_KEY to COPILOT_LINEAR_API_KEY to prevent global access (#12143)
The `LINEAR_API_KEY` environment variable name is too generic — it matches the key name used by integrations/blocks, meaning that if set globally, it could inadvertently grant all users access to Linear through the blocks system rather than restricting it to the copilot feature-request tool. This renames the setting to `COPILOT_LINEAR_API_KEY` to make it clear this key is scoped exclusively to the copilot's feature-request functionality, preventing it from being picked up as a general-purpose Linear credential. ### Changes 🏗️ - Renamed `linear_api_key` → `copilot_linear_api_key` in `Secrets` settings model (`backend/util/settings.py`) - Updated all references in the copilot feature-request tool (`backend/api/features/chat/tools/feature_requests.py`) ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified the rename is consistent across all references (settings + feature_requests tool) - [x] No other files reference the old `linear_api_key` setting name #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under **Changes**) > **Note:** The env var changes from `LINEAR_API_KEY` to `COPILOT_LINEAR_API_KEY`. Any deployment using the old name will need to update accordingly. <!-- greptile_comment --> <details><summary><h3>Greptile Summary</h3></summary> Renamed `LINEAR_API_KEY` to `COPILOT_LINEAR_API_KEY` in settings and the copilot feature-request tool to prevent unintended access through Linear blocks. 
**Key changes:** - Updated `Secrets.linear_api_key` → `Secrets.copilot_linear_api_key` in `backend/util/settings.py` - Updated all references in `backend/api/features/chat/tools/feature_requests.py` - The rename prevents the copilot Linear key from being picked up by the Linear blocks integration (which uses `LINEAR_API_KEY` via `ProviderBuilder` in `backend/blocks/linear/_config.py`) **Issues found:** - `.env.default` still references `LINEAR_API_KEY` instead of `COPILOT_LINEAR_API_KEY` - Frontend styleguide has a hardcoded error message with the old variable name </details> <details><summary><h3>Confidence Score: 3/5</h3></summary> - Generally safe but requires fixing `.env.default` before deployment - The code changes are correct and achieve the intended security improvement by preventing scope leakage. However, the PR is incomplete - `.env.default` wasn't updated (critical for deployment) and a frontend error message reference was missed. These issues will cause configuration problems for anyone deploying with the new variable name. - Check `autogpt_platform/backend/.env.default` and `autogpt_platform/frontend/src/app/(platform)/copilot/styleguide/page.tsx` - both need updates to match the renamed variable </details> <details><summary><h3>Flowchart</h3></summary> ```mermaid flowchart TD A[".env file<br/>COPILOT_LINEAR_API_KEY"] --> B["Secrets model<br/>copilot_linear_api_key"] B --> C["feature_requests.py<br/>_get_linear_config()"] C --> D["Creates APIKeyCredentials<br/>for copilot feature requests"] E[".env file<br/>LINEAR_API_KEY"] --> F["ProviderBuilder<br/>in blocks/linear/_config.py"] F --> G["Linear blocks integration<br/>for user workflows"] style A fill:#90EE90 style B fill:#90EE90 style C fill:#90EE90 style D fill:#90EE90 style E fill:#FFD700 style F fill:#FFD700 style G fill:#FFD700 ``` </details> <sub>Last reviewed commit: 86dc57a</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> |
||
|
|
9d4dcbd9e0 |
fix(backend/docker): Make server last (= default) build stage
Without specifying an explicit build target, it would build the `migrate` stage because it is the last stage in the Dockerfile. This caused deployment failures.
- Follow-up to #12124 and
|
||
|
|
074be7aea6 |
fix(backend/docker): Update run commands to match deployment
- Follow-up to #12124 Changes: - Update `run` commands for all backend services in `docker-compose.platform.yml` to match the deployment commands used in production - Add trigger on `docker-compose(.platform)?.yml` changes to the Frontend CI workflow |
||
|
|
39d28b24fc |
ci(backend): Upgrade RabbitMQ from 3.12 (EOL) to 4.1.4 (#12118)
## Summary Upgrades RabbitMQ from the end-of-life `rabbitmq:3.12-management` to `rabbitmq:4.1.4`, aligning CI, local dev, and e2e testing with production. ## Changes ### CI Workflow (`.github/workflows/platform-backend-ci.yml`) - **Image:** `rabbitmq:3.12-management` → `rabbitmq:4.1.4` - **Port:** Removed 15672 (management UI) — not used - **Health check:** Added to prevent flaky tests from race conditions during startup ### Docker Compose (`docker-compose.platform.yml`, `docker-compose.test.yaml`) - **Image:** `rabbitmq:management` → `rabbitmq:4.1.4` - **Port:** Removed 15672 (management UI) — not used ## Why - RabbitMQ 3.12 is EOL - We don't use the management interface, so `-management` variant is unnecessary - CI and local dev/e2e should match production (4.1.4) ## Testing CI validates that backend tests pass against RabbitMQ 4.1.4 on Python 3.11, 3.12, and 3.13. --- Closes SECRT-1703 |
||
|
|
bf79a7748a |
fix(backend/build): Update stale Poetry usage in Dockerfile (#12124)
[SECRT-2006: Dev deployment failing: poetry not found in container PATH](https://linear.app/autogpt/issue/SECRT-2006) - Follow-up to #12090 ### Changes 🏗️ - Remove now-broken Poetry path config values - Remove usage of now-broken `poetry run` in container run command - Add trigger on `backend/Dockerfile` changes to Frontend CI workflow ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - If it works, CI will pass |
||
|
|
649d4ab7f5 |
feat(chat): Add delete chat session endpoint and UI (#12112)
## Summary
Adds the ability to delete chat sessions from the CoPilot interface.
## Changes
### Backend
- Add `DELETE /api/chat/sessions/{session_id}` endpoint in `routes.py`
- Returns 204 on success, 404 if not found or not owned by user
- Reuses existing `delete_chat_session` function from `model.py`
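
The 204/404 semantics hinge on the ownership filter: a session owned by someone else behaves exactly like a missing one, so the endpoint leaks nothing about other users' sessions. A hypothetical in-memory stand-in for the Prisma-backed `delete_many(where={id, userId})` pattern:

```python
def delete_chat_session(session_id: str, user_id: str,
                        sessions: dict[str, dict]) -> bool:
    """Delete a session only if it exists AND belongs to user_id."""
    session = sessions.get(session_id)
    if session is None or session["user_id"] != user_id:
        return False  # missing and not-owned are indistinguishable
    del sessions[session_id]
    return True


def delete_route_status(session_id: str, user_id: str,
                        sessions: dict[str, dict]) -> int:
    """Return the HTTP status the route responds with."""
    return 204 if delete_chat_session(session_id, user_id, sessions) else 404


db = {"s1": {"user_id": "alice"}}
assert delete_route_status("s1", "bob", db) == 404    # not the owner
assert delete_route_status("s1", "alice", db) == 204  # deleted
assert delete_route_status("s1", "alice", db) == 404  # already gone
```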
### Frontend
- Add delete button (trash icon) that appears on hover for each chat
session
- Add confirmation dialog before deletion using existing
`DeleteConfirmDialog` component
- Refresh session list after successful delete
- Clear current session selection if the deleted session was active
- Update OpenAPI spec with new endpoint
## Testing
1. Hover over a chat session in sidebar → trash icon appears
2. Click trash icon → confirmation dialog
3. Confirm deletion → session removed, list refreshes
4. If deleted session was active, selection is cleared
## Screenshots
Delete button appears on hover, confirmation dialog on click.
## Related Issues
Closes SECRT-1928
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Adds the ability to delete chat sessions from the CoPilot interface — a
new `DELETE /api/chat/sessions/{session_id}` backend endpoint and a
corresponding delete button with confirmation dialog in the
`ChatSidebar` frontend component.
- **Backend route** (`routes.py`): Clean implementation reusing the
existing `delete_chat_session` model function with proper auth guards
and 204/404 responses. No issues.
- **Frontend** (`ChatSidebar.tsx`): Adds hover-visible trash icon per
session, confirmation dialog, mutation with cache invalidation, and
active session clearing on delete. However, it uses a `__legacy__`
component (`DeleteConfirmDialog`) which violates the project's style
guide — new code should use the modern design system components. Error
handling only logs to console without user-facing feedback (project
convention is to use toast notifications for mutation errors).
`isDeleting` is destructured but unused.
- **OpenAPI spec** updated correctly.
- **Unrelated file included**:
`notes/plan-SECRT-1959-graph-edge-desync.md` is a planning document for
a different ticket and should be removed from this PR. The `notes/`
directory is newly introduced and both plan files should be reconsidered
for inclusion.
</details>
<details><summary><h3>Confidence Score: 3/5</h3></summary>
- Functionally correct but has style guide violations and includes
unrelated files that should be addressed before merge.
- The core feature implementation (backend DELETE endpoint and frontend
mutation logic) is sound and follows existing patterns. Score is lowered
because: (1) the frontend uses a legacy component explicitly prohibited
by the project's style guide, (2) mutation errors are not surfaced to
the user, and (3) the PR includes an unrelated planning document for a
different ticket.
- Pay close attention to `ChatSidebar.tsx` for the legacy component
import and error handling, and
`notes/plan-SECRT-1959-graph-edge-desync.md` which should be removed.
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant User
participant ChatSidebar as ChatSidebar (Frontend)
participant ReactQuery as React Query
participant API as DELETE /api/chat/sessions/{id}
participant Model as model.delete_chat_session
participant DB as db.delete_chat_session (Prisma)
participant Redis as Redis Cache
User->>ChatSidebar: Click trash icon on session
ChatSidebar->>ChatSidebar: Show DeleteConfirmDialog
User->>ChatSidebar: Confirm deletion
ChatSidebar->>ReactQuery: deleteSession({ sessionId })
ReactQuery->>API: DELETE /api/chat/sessions/{session_id}
API->>Model: delete_chat_session(session_id, user_id)
Model->>DB: delete_many(where: {id, userId})
DB-->>Model: bool (deleted count > 0)
Model->>Redis: Delete session cache key
Model->>Model: Clean up session lock
Model-->>API: True
API-->>ReactQuery: 204 No Content
ReactQuery->>ChatSidebar: onSuccess callback
ChatSidebar->>ReactQuery: invalidateQueries(sessions list)
ChatSidebar->>ChatSidebar: Clear sessionId if deleted was active
```
</details>
<sub>Last reviewed commit: 44a92c6</sub>
<!-- greptile_other_comments_section -->
<details><summary><h4>Context used (3)</h4></summary>
- Context from `dashboard` - autogpt_platform/frontend/CLAUDE.md
([source](https://app.greptile.com/review/custom-context?memory=39861924-d320-41ba-a1a7-a8bff44f780a))
- Context from `dashboard` - autogpt_platform/frontend/CONTRIBUTING.md
([source](https://app.greptile.com/review/custom-context?memory=cc4f1b17-cb5c-4b63-b218-c772b48e20ee))
- Context from `dashboard` - autogpt_platform/CLAUDE.md
([source](https://app.greptile.com/review/custom-context?memory=6e9dc5dc-8942-47df-8677-e60062ec8c3a))
</details>
<!-- /greptile_comment -->
---------
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
|
||
|
|
223df9d3da |
feat(frontend): improve create/edit copilot UX (#12117)
## Changes 🏗️

Make the UX nicer when running long tasks in Copilot, like creating an agent, editing it or running a task.

## Checklist 📋

### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and play the game!

<!-- greptile_comment -->
<details><summary><h3>Greptile Summary</h3></summary>
This PR replaces the static progress bar and idle wait screens with an interactive mini-game across the Create, Edit, and Run Agent copilot tools. The existing mini-game (a simple runner with projectile-dodge boss encounters) is significantly overhauled into a two-mode game: a runner mode with animated tree obstacles and a duel mode featuring a melee boss fight with attack, guard, and movement mechanics. Sprite-based rendering replaces the previous shape-drawing approach.

- **Create/Edit/Run Agent UX**: All three tool views now show the mini-game with contextual overlays during long-running operations, replacing the progress bar in EditAgent and adding the game to RunAgent
- **Game mechanics overhaul**: Boss encounters changed from projectile-dodging to melee duel with attack (Z), block (X), movement (arrows), and jump (Space) controls
- **Sprite rendering**: Added 9 sprite sheet assets for characters, trees, and boss animations with fallback to shape rendering if images fail to load
- **UI overlays**: Added React-managed overlay states for idle, boss-intro, boss-defeated, and game-over screens with continue/retry buttons
- **Minor issues found**: Unused `isRunActive` variable in `MiniGame.tsx`, unreachable "leaving" boss phase in `useMiniGame.ts`, and a missing `expanded` property in the `getAccordionMeta` return type annotation in `EditAgent.tsx`
- **Unused asset**: `archer-shoot.png` is included in the PR but never imported or referenced in any code
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- This PR is safe to merge — it only affects the copilot mini-game UX with no backend or data model changes.
- The changes are entirely frontend/cosmetic, scoped to the copilot tools' waiting UX. The mini-game logic is self-contained in a canvas-based hook and doesn't affect any application state, API calls, or routing. The issues found are minor (unused variable, dead code, type annotation gap, unused asset) and don't impact runtime behavior.
- `useMiniGame.ts` has the most complex logic changes (boss AI, death animations, sprite rendering) and contains unreachable dead code in the "leaving" phase handler. `EditAgent.tsx` has a return type annotation that doesn't include `expanded`.
</details>
<details><summary><h3>Flowchart</h3></summary>

```mermaid
flowchart TD
A[Game Idle] -->|"Start button"| B[Run Mode]
B -->|"Jump over trees"| C{Score >= Threshold?}
C -->|No| B
C -->|"Yes, obstacles clear"| D[Boss Intro Overlay]
D -->|"Continue button"| E[Duel Mode]
E -->|"Attack Z / Guard X / Move ←→"| F{Boss HP <= 0?}
F -->|No| G{Player hit & not guarding?}
G -->|No| E
G -->|Yes| H[Player Death Animation]
H --> I[Game Over Overlay]
I -->|"Retry button"| B
F -->|Yes| J[Boss Death Animation]
J --> K[Boss Defeated Overlay]
K -->|"Continue button"| L[Reset Boss & Resume Run]
L --> B
```
</details>
<sub>Last reviewed commit: ad80e24</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
|
||
|
|
187ab04745 |
refactor(frontend): remove OldAgentLibraryView and NEW_AGENT_RUNS flag (#12088)
## Summary

- Removes the deprecated `OldAgentLibraryView` directory (13 files, ~2200 lines deleted)
- Removes the `NEW_AGENT_RUNS` feature flag from the `Flag` enum and defaults
- Removes the legacy agent library page at `library/legacy/[id]`
- Moves shared `CronScheduler` components to `src/components/contextual/CronScheduler/`
- Moves `agent-run-draft-view` and `agent-status-chip` to `legacy-builder/` (co-located with their only consumer)
- Updates all import paths in consuming files (`AgentInfoStep`, `SaveControl`, `RunnerInputUI`, `useRunGraph`)

## Test plan

- [x] `pnpm format` passes
- [x] `pnpm types` passes (no TypeScript errors)
- [x] No remaining references to `OldAgentLibraryView`, `NEW_AGENT_RUNS`, or `new-agent-runs` in the codebase
- [x] Verify `RunnerInputUI` dialog still works in the legacy builder
- [x] Verify `AgentInfoStep` cron scheduling works in the publish modal
- [x] Verify `SaveControl` cron scheduling works in the legacy builder

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
This PR removes deprecated code from the legacy agent library view system and consolidates the codebase to use the new agent runs implementation exclusively. The refactor successfully removes ~2200 lines of code across 13 deleted files while properly relocating shared components.

**Key changes:**
- Removed the entire `OldAgentLibraryView` directory and its 13 component files
- Removed the `NEW_AGENT_RUNS` feature flag from the `Flag` enum and defaults
- Deleted the legacy agent library page route at `library/legacy/[id]`
- Moved `CronScheduler` components to `src/components/contextual/CronScheduler/` for shared use across the application
- Moved `agent-run-draft-view` and `agent-status-chip` to `legacy-builder/` directory, co-locating them with their only consumer
- Updated `useRunGraph.ts` to import `GraphExecutionMeta` from the generated API models instead of the deleted custom type definition
- Updated all import paths in consuming components (`AgentInfoStep`, `SaveControl`, `RunnerInputUI`)

**Technical notes:**
- The new import path for `GraphExecutionMeta` (`@/app/api/__generated__/models/graphExecutionMeta`) will be generated when running `pnpm generate:api` from the OpenAPI spec
- All references to the old code have been cleanly removed from the codebase
- The refactor maintains proper separation of concerns by moving shared components to contextual locations
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- This PR is safe to merge with minimal risk, pending manual verification of the UI components mentioned in the test plan
- The refactor is well-structured and all code changes are correct. The score of 4 (rather than 5) reflects that the PR author has marked three manual testing items as incomplete in the test plan: verifying the `RunnerInputUI` dialog, `AgentInfoStep` cron scheduling, and `SaveControl` cron scheduling. While the code changes are sound, these UI components should be manually tested before merging to ensure the moved components work correctly in their new locations.
- No files require special attention. The author should complete the manual testing checklist items for `RunnerInputUI`, `AgentInfoStep`, and `SaveControl` as noted in the test plan.
</details>
<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
participant Dev as Developer
participant FE as Frontend Build
participant API as Backend API
participant Gen as Generated Types
Note over Dev,Gen: Refactor: Remove OldAgentLibraryView & NEW_AGENT_RUNS flag
Dev->>FE: Delete OldAgentLibraryView (13 files, ~2200 lines)
Dev->>FE: Remove NEW_AGENT_RUNS from Flag enum
Dev->>FE: Delete library/legacy/[id]/page.tsx
Dev->>FE: Move CronScheduler → src/components/contextual/
Dev->>FE: Move agent-run-draft-view → legacy-builder/
Dev->>FE: Move agent-status-chip → legacy-builder/
Dev->>FE: Update RunnerInputUI import path
Dev->>FE: Update SaveControl import path
Dev->>FE: Update AgentInfoStep import path
Dev->>FE: Update useRunGraph.ts
FE->>Gen: Import GraphExecutionMeta from generated models
Note over Gen: Type available after pnpm generate:api
Gen-->>API: Uses OpenAPI spec schema
API-->>FE: Type-safe GraphExecutionMeta model
```
</details>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
||
|
|
e2d3c8a217 |
fix(frontend): Prevent node drag when selecting text in object editor key input (#11955)
## Summary

- Add `nodrag` class to the key name input wrapper in `WrapIfAdditionalTemplate.tsx`
- This prevents the node from being dragged when users try to select text in the key name input field
- Follows the same pattern used by other input components like `TextWidget.tsx`

## Test plan

- [x] Open the new builder
- [x] Add a custom node with an Object input field
- [x] Try to select text in the key name input by clicking and dragging
- [x] Verify that text selection works without moving the block

Co-authored-by: Claude <noreply@anthropic.com>
|
||
|
|
647c8ed8d4 |
feat(backend/blocks): enhance list concatenation with advanced operations (#12105)
## Summary

Enhances the existing `ConcatenateListsBlock` and adds five new companion blocks for comprehensive list manipulation, addressing issue #11139 ("Implement block to concatenate lists").

### Changes

- **Enhanced `ConcatenateListsBlock`** with optional deduplication (`deduplicate`) and None-value filtering (`remove_none`), plus an output `length` field
- **New `FlattenListBlock`**: Recursively flattens nested list structures with configurable `max_depth`
- **New `InterleaveListsBlock`**: Round-robin interleaving of elements from multiple lists
- **New `ZipListsBlock`**: Zips corresponding elements from multiple lists with support for padding to longest or truncating to shortest
- **New `ListDifferenceBlock`**: Computes set difference between two lists (regular or symmetric)
- **New `ListIntersectionBlock`**: Finds common elements between two lists, preserving order

### Helper Utilities

Extracted reusable helper functions for validation, flattening, deduplication, interleaving, chunking, and statistics computation to support the blocks and enable future reuse.

### Test Coverage

Comprehensive test suite with 188 test functions across 29 test classes covering:
- Built-in block test harness validation for all 6 blocks
- Manual edge-case tests for each block (empty inputs, large lists, mixed types, nested structures)
- Internal method tests for all block classes
- Unit tests for all helper utility functions

Closes #11139

## Test plan

- [x] All files pass Python syntax validation (`ast.parse`)
- [x] Built-in `test_input`/`test_output` tests defined for all blocks
- [x] Manual tests cover edge cases: empty lists, large lists, mixed types, nested structures, deduplication, None removal
- [x] Helper function tests validate all utility functions independently
- [x] All block IDs are valid UUID4
- [x] Block categories set to `BlockCategory.BASIC` for consistency with existing list blocks

<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Enhanced `ConcatenateListsBlock` with deduplication and None-filtering options, and added five new list manipulation blocks (`FlattenListBlock`, `InterleaveListsBlock`, `ZipListsBlock`, `ListDifferenceBlock`, `ListIntersectionBlock`) with comprehensive helper functions and test coverage.

**Key Changes:**
- Enhanced `ConcatenateListsBlock` with `deduplicate` and `remove_none` options, plus `length` output field
- Added `FlattenListBlock` for recursively flattening nested lists with configurable `max_depth`
- Added `InterleaveListsBlock` for round-robin element interleaving
- Added `ZipListsBlock` with support for padding/truncation
- Added `ListDifferenceBlock` and `ListIntersectionBlock` for set operations
- Extracted 12 reusable helper functions for validation, flattening, deduplication, etc.
- Comprehensive test suite with 188 test functions covering edge cases

**Minor Issues:**
- Helper function `_deduplicate_list` has redundant logic in the `else` branch that duplicates the `if` branch
- Three helper functions (`_filter_empty_collections`, `_compute_list_statistics`, `_chunk_list`) are defined but unused - consider removing unless planned for future use
- The `_make_hashable` function uses `hash(repr(item))` for unhashable types, which correctly treats structurally identical dicts/lists as duplicates
</details>
<details><summary><h3>Confidence Score: 4/5</h3></summary>
- Safe to merge with minor style improvements recommended
- The implementation is well-structured with comprehensive test coverage (188 tests), proper error handling, and follows existing block patterns. All blocks use valid UUID4 IDs and correct categories. The helper functions provide good code reuse. The minor issues are purely stylistic (redundant code, unused helpers) and don't affect functionality or safety.
- No files require special attention - both files are well-tested and follow project conventions
</details>
<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
participant User
participant Block as List Block
participant Helper as Helper Functions
participant Output
User->>Block: Input (lists/parameters)
Block->>Helper: _validate_all_lists()
Helper-->>Block: validation result
alt validation fails
Block->>Output: error message
else validation succeeds
Block->>Helper: _concatenate_lists_simple() / _flatten_nested_list() / etc.
Helper-->>Block: processed result
opt deduplicate enabled
Block->>Helper: _deduplicate_list()
Helper-->>Block: deduplicated result
end
opt remove_none enabled
Block->>Helper: _filter_none_values()
Helper-->>Block: filtered result
end
Block->>Output: result + length
end
Output-->>User: Block outputs
```
</details>
<sub>Last reviewed commit: a6d5445</sub>
<!-- greptile_other_comments_section -->
<sub>(2/5) Greptile learns from your feedback when you react with thumbs up/down!</sub>
<!-- /greptile_comment -->

---------

Co-authored-by: Otto <otto@agpt.co>
|
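As a rough sketch of the helper behaviors described above — repr-based hashing for dedup, depth-limited flattening, and round-robin interleaving. Function names here are simplified stand-ins; the real helpers (`_make_hashable`, `_deduplicate_list`, `_flatten_nested_list`) live in the blocks module and handle more edge cases.

```python
from itertools import zip_longest


def make_hashable(item):
    """Fallback key for unhashable items (dicts/lists): hash their repr,
    so structurally identical values compare as duplicates."""
    try:
        hash(item)
        return item
    except TypeError:
        return hash(repr(item))


def deduplicate(items: list) -> list:
    """Order-preserving dedup keyed by make_hashable."""
    seen, out = set(), []
    for item in items:
        key = make_hashable(item)
        if key not in seen:
            seen.add(key)
            out.append(item)
    return out


def flatten(items: list, max_depth: int = -1) -> list:
    """Recursively flatten nested lists, stopping at max_depth (-1 = unlimited)."""
    out = []
    for item in items:
        if isinstance(item, list) and max_depth != 0:
            out.extend(flatten(item, max_depth - 1))
        else:
            out.append(item)
    return out


def interleave(lists: list) -> list:
    """Round-robin: one element from each list in turn, skipping exhausted lists."""
    sentinel = object()
    return [x for group in zip_longest(*lists, fillvalue=sentinel)
            for x in group if x is not sentinel]
```

For example, `flatten([1, [2, [3, [4]]]], max_depth=2)` unwraps only two levels, leaving the innermost `[4]` intact.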
||
|
|
27d94e395c |
feat(backend/sdk): enable WebSearch, block WebFetch, consolidate tool constants (#12108)
## Summary
- Enable Claude Agent SDK built-in **WebSearch** tool (Brave Search via
Anthropic API) for the CoPilot SDK agent
- Explicitly **block WebFetch** via `SDK_DISALLOWED_TOOLS`. The agent
uses the SSRF-protected `mcp__copilot__web_fetch` MCP tool instead
- **Consolidate** all tool security constants (`BLOCKED_TOOLS`,
`WORKSPACE_SCOPED_TOOLS`, `DANGEROUS_PATTERNS`, `SDK_DISALLOWED_TOOLS`)
into `tool_adapter.py` as a single source of truth — previously
scattered across `tool_adapter.py`, `security_hooks.py`, and inline in
`service.py`
## Changes
- `tool_adapter.py`: Add `WebSearch` to `_SDK_BUILTIN_TOOLS`, add
`SDK_DISALLOWED_TOOLS`, move security constants here
- `security_hooks.py`: Import constants from `tool_adapter.py` instead
of defining locally
- `service.py`: Use `SDK_DISALLOWED_TOOLS` instead of inline `["Bash"]`
## Test plan
- [x] All 21 security hooks tests pass
- [x] Ruff lint clean
- [x] All pre-commit hooks pass
- [ ] Verify WebSearch works in CoPilot chat (manual test)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Consolidates tool security constants into `tool_adapter.py` as single
source of truth, enables WebSearch (Brave via Anthropic API), and
explicitly blocks WebFetch to prevent SSRF attacks. The change improves
security by ensuring the agent uses the SSRF-protected
`mcp__copilot__web_fetch` tool instead of the built-in WebFetch which
can access internal networks like `localhost:8006`.
</details>
<details><summary><h3>Confidence Score: 5/5</h3></summary>
- This PR is safe to merge with minimal risk
- The changes improve security by blocking WebFetch (SSRF risk) while
enabling safe WebSearch. The consolidation of constants into a single
source of truth improves maintainability. All existing tests pass (21
security hooks tests), and the refactoring is straightforward with no
behavioral changes to existing security logic. The only suggestions are
minor improvements: adding a test for WebFetch blocking and considering
a lowercase alias for consistency.
- No files require special attention
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant Agent as SDK Agent
participant Hooks as Security Hooks
participant TA as tool_adapter.py
participant MCP as MCP Tools
Note over TA: SDK_DISALLOWED_TOOLS = ["Bash", "WebFetch"]
Note over TA: _SDK_BUILTIN_TOOLS includes WebSearch
Agent->>Hooks: Request WebSearch (Brave API)
Hooks->>TA: Check BLOCKED_TOOLS
TA-->>Hooks: Not blocked
Hooks-->>Agent: Allowed ✓
Agent->>Agent: Execute via Anthropic API
Agent->>Hooks: Request WebFetch (SSRF risk)
Hooks->>TA: Check BLOCKED_TOOLS
Note over TA: WebFetch in SDK_DISALLOWED_TOOLS
TA-->>Hooks: Blocked
Hooks-->>Agent: Denied ✗
Note over Agent: Use mcp__copilot__web_fetch instead
Agent->>Hooks: Request mcp__copilot__web_fetch
Hooks->>MCP: Validate (MCP tool, not SDK builtin)
MCP-->>Hooks: Has SSRF protection
Hooks-->>Agent: Allowed ✓
Agent->>MCP: Execute with SSRF checks
```
</details>
<sub>Last reviewed commit: 2d9975f</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
|
||
|
|
b8f5c208d0 |
Handle errors in Jina ExtractWebsiteContentBlock (#12048)
## Summary

- catch Jina reader client/server errors in ExtractWebsiteContentBlock and surface a clear error output keyed to the user URL
- guard empty responses to return an explicit error instead of yielding blank content
- add regression tests covering the happy path and HTTP client failures via a monkeypatched fetch

## Testing

- not run (pytest unavailable in this environment)

---------

Co-authored-by: Nicholas Tindle <nicktindle@outlook.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
|
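The catch-and-surface pattern described above can be sketched as follows. `fetch` here is a hypothetical stand-in for the block's HTTP client (the real block is async and its output field names may differ); the point is that errors and empty bodies both become explicit outputs keyed to the user's URL rather than silent blanks.

```python
def extract_content(url: str, fetch) -> dict:
    """Fetch a page via the Jina reader, keying any error output to the user's URL."""
    try:
        content = fetch(f"https://r.jina.ai/{url}")
    except Exception as exc:  # client/server errors become an explicit error output
        return {"error": f"Failed to fetch {url}: {exc}"}
    if not content or not content.strip():
        # Guard empty responses instead of yielding blank content.
        return {"error": f"Empty response from Jina reader for {url}"}
    return {"content": content}
```

This shape also makes the regression tests straightforward: monkeypatch `fetch` to return content, raise, or return whitespace.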
||
|
|
ca216dfd7f |
ci(docs-claude-review): Update comments instead of creating new ones (#12106)
## Changes 🏗️

This PR updates the Claude Block Docs Review CI workflow to update existing comments instead of creating new ones on each push.

### What's Changed:

1. **Concurrency group** - Prevents race conditions if the workflow runs twice simultaneously
2. **Comment cleanup step** - Deletes any previous Claude review comment before posting a new one
3. **Marker instruction** - Instructs Claude to include a `<!-- CLAUDE_DOCS_REVIEW -->` marker in its comment for identification

### Why:

Previously, every PR push would create a new review comment, cluttering the PR with multiple comments. Now only the most recent review is shown.

### Testing:

1. Create a PR that triggers this workflow (modify a file in `docs/integrations/` or `autogpt_platform/backend/backend/blocks/`)
2. Verify first run creates comment with marker
3. Push another commit
4. Verify old comment is deleted and new comment is created (not accumulated)

Requested by @Bentlybro

---

## Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan (will be tested on merge)

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my changes
- [x] I have included a list of my configuration changes in the PR description (under **Changes**)

<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Added concurrency control and comment deduplication to prevent multiple Claude review comments from accumulating on PRs. The workflow now deletes previous review comments (identified by the `<!-- CLAUDE_DOCS_REVIEW -->` marker) before posting new ones, and uses concurrency groups to prevent race conditions.
</details>
<details><summary><h3>Confidence Score: 5/5</h3></summary>
- This PR is safe to merge with minimal risk
- The changes are well-contained, follow GitHub Actions best practices, and use built-in GitHub APIs safely. The concurrency control prevents race conditions, and the comment cleanup logic uses proper filtering with `head -1` to handle edge cases. The HTML comment marker approach is standard and reliable.
- No files require special attention
</details>
<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
participant GH as GitHub PR Event
participant WF as Workflow
participant API as GitHub API
participant Claude as Claude Action
GH->>WF: PR opened/synchronized
WF->>WF: Check concurrency group
Note over WF: Cancel any in-progress runs<br/>for same PR number
WF->>API: Query PR comments
API-->>WF: Return all comments
WF->>WF: Filter for CLAUDE_DOCS_REVIEW marker
alt Previous comment exists
WF->>API: DELETE comment by ID
API-->>WF: Comment deleted
else No previous comment
WF->>WF: Skip deletion
end
WF->>Claude: Run code review
Claude->>API: POST new comment with marker
API-->>Claude: Comment created
```
</details>
<sub>Last reviewed commit: fb1b436</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
|
||
|
|
f9f358c526 |
feat(mcp): Add MCP tool block with OAuth, tool discovery, and standard credential integration (#12011)
## Summary

<img width="1000" alt="image" src="https://github.com/user-attachments/assets/18e8ef34-d222-453c-8b0a-1b25ef8cf806" />
<img width="250" alt="image" src="https://github.com/user-attachments/assets/ba97556c-09c5-4f76-9f4e-49a2e8e57468" />
<img width="250" alt="image" src="https://github.com/user-attachments/assets/68f7804a-fe74-442d-9849-39a229c052cf" />
<img width="250" alt="image" src="https://github.com/user-attachments/assets/700690ba-f9fe-4726-8871-3bfbab586001" />

Full-stack MCP (Model Context Protocol) tool block integration that allows users to connect to any MCP server, discover available tools, authenticate via OAuth, and execute tools — all through the standard AutoGPT credential system.

### Backend

- **MCPToolBlock** (`blocks/mcp/block.py`): New block using the `CredentialsMetaInput` pattern with optional credentials (`default={}`), supporting both authenticated (OAuth) and public MCP servers. Includes auto-lookup fallback for backward compatibility.
- **MCP Client** (`blocks/mcp/client.py`): HTTP transport with JSON-RPC 2.0, tool discovery, tool execution with robust error handling (type-checked error fields, non-JSON response handling)
- **MCP OAuth Handler** (`blocks/mcp/oauth.py`): RFC 8414 discovery, dynamic per-server OAuth with PKCE, token storage and refresh via `raise_for_status=True`
- **MCP API Routes** (`api/features/mcp/routes.py`): `discover-tools`, `oauth/login`, `oauth/callback` endpoints with credential cleanup, defensive OAuth metadata validation
- **Credential system integration**:
  - `CredentialsMetaInput` model_validator normalizes the legacy `"ProviderName.MCP"` format from Python 3.13's `str(StrEnum)` change
  - `CredentialsFieldInfo.combine()` supports URL-based credential discrimination (each MCP server gets its own credential entry)
  - `aggregate_credentials_inputs` checks block schema defaults for credential optionality
  - Executor normalizes credential data for both Pydantic and JSON schema validation paths
  - Chat credential matching handles MCP server URL filtering
  - `provider_matches()` helper used consistently for Python 3.13 StrEnum compatibility
- **Pre-run validation**: `_validate_graph_get_errors` now calls `get_missing_input()` for custom block-level validation (MCP tool arguments)
- **Security**: HTML tag stripping loop to prevent XSS bypass, SSRF protection (removed trusted_origins)

### Frontend

- **MCPToolDialog** (`MCPToolDialog.tsx`): Full tool discovery UI — enter server URL, authenticate if needed, browse tools, select tool and configure
- **OAuth popup** (`oauth-popup.ts`): Shared utility supporting cross-origin MCP OAuth flows with BroadcastChannel + localStorage fallback
- **Credential integration**: MCP-specific OAuth flow in `useCredentialsInput`, server URL filtering in `useCredentials`, MCP callback page
- **CredentialsSelect**: Auto-selects first available credential instead of defaulting to "None", credentials listed before "None" in dropdown
- **Node rendering**: Dynamic tool input schema rendering on MCP nodes, proper handling in both legacy and new flow editors
- **Block title persistence**: `customized_name` set at block creation for both MCP and Agent blocks — no fallback logic needed, titles survive save/load reliably
- **Stable credential ordering**: Removed `sortByUnsetFirst` that caused credential inputs to jump when selected

### Tests (~2060 lines)

- Unit tests: block, client, tool execution
- Integration tests: mock MCP server with auth
- OAuth flow tests
- API endpoint tests
- Credential combining/optionality tests
- E2e tests (skipped in CI, run manually)

## Key Design Decisions

1. **Optional credentials via `default={}`**: MCP servers can be public (no auth) or private (OAuth). The `credentials` field has `default={}` making it optional at the schema level, so public servers work without prompting for credentials.
2. **URL-based credential discrimination**: Each MCP server URL gets its own credential entry in the "Run agent" form (via `discriminator="server_url"`), so agents using multiple MCP servers prompt for each independently.
3. **Model-level normalization**: Python 3.13 changed `str(StrEnum)` to return `"ClassName.MEMBER"`. Rather than scattering fixes across the codebase, a Pydantic `model_validator(mode="before")` on `CredentialsMetaInput` handles normalization centrally, and `provider_matches()` handles lookups.
4. **Credential auto-select**: The `CredentialsSelect` component defaults to the first available credential and notifies the parent state, ensuring credentials are pre-filled in the "Run agent" dialog without requiring manual selection.
5. **customized_name for block titles**: Both MCP and Agent blocks set `customized_name` in metadata at creation time. This eliminates convoluted runtime fallback logic (`agent_name`, hostname extraction) — the title is persisted once and read directly.

## Test plan

- [x] Unit/integration tests pass (68 MCP + 11 graph = 79 tests)
- [x] Manual: MCP block with public server (DeepWiki) — no credentials needed, tools discovered and executable
- [x] Manual: MCP block with OAuth server (Linear, Sentry) — OAuth flow prompts correctly
- [x] Manual: "Run agent" form shows correct credential requirements per MCP server
- [x] Manual: Credential auto-selects when exactly one matches, pre-selects first when multiple exist
- [x] Manual: Credential ordering stays stable when selecting/deselecting
- [x] Manual: MCP block title persists after save and refresh
- [x] Manual: Agent block title persists after save and refresh (via customized_name)
- [ ] Manual: Shared agent with MCP block prompts new user for credentials

---------

Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: Ubbe <hi@ubbe.dev>
|
||
|
|
52b3aebf71 |
feat(backend/sdk): Claude Agent SDK integration for CoPilot (#12103)
## Summary
Full integration of the **Claude Agent SDK** to replace the existing
one-turn OpenAI-compatible CoPilot implementation with a multi-turn,
tool-using AI agent.
### What changed
**Core SDK Integration** (`chat/sdk/` — new module)
- **`service.py`**: Main orchestrator — spawns Claude Code CLI as a
subprocess per user message, streams responses back via SSE. Handles
conversation history compression, session lifecycle, and error recovery.
- **`response_adapter.py`**: Translates Claude Agent SDK events (text
deltas, tool use, errors, result messages) into the existing CoPilot
`StreamEvent` protocol so the frontend works unchanged.
- **`tool_adapter.py`**: Bridges CoPilot's MCP tools (find_block,
run_block, create_agent, etc.) into the SDK's tool format. Handles
schema conversion and result serialization.
- **`security_hooks.py`**: Pre/Post tool-use hooks that enforce a strict
allowlist of tools, block path traversal, sandbox file operations to
per-session workspace directories, cap sub-agent spawning, and prevent
the model from accessing unauthorized system resources.
- **`transcript.py`**: JSONL transcript I/O utilities for the stateless
`--resume` feature (see below).
**Stateless Multi-Turn Resume** (new)
- Instead of compressing conversation history via LLM on every turn
(lossy and expensive), we capture Claude Code's native JSONL session
transcript via a **Stop hook** callback, persist it in the DB
(`ChatSession.sdkTranscript`), and restore it on the next turn via
`--resume <file>`.
- This preserves full tool call/result context across turns with zero
token overhead for history.
- Feature-flagged via `CLAUDE_AGENT_USE_RESUME` (default: off).
- DB migration: `ALTER TABLE "ChatSession" ADD COLUMN "sdkTranscript"
TEXT`.
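Under this design, the Stop-hook capture and `--resume` restore amount to a JSONL round-trip through the DB column. A rough sketch — the function names are hypothetical, not the actual `transcript.py` API:

```python
import json
import tempfile
from pathlib import Path


def capture_transcript(transcript_path: str) -> str:
    """Stop-hook side: read Claude Code's native JSONL session transcript
    so it can be persisted to ChatSession.sdkTranscript."""
    return Path(transcript_path).read_text()


def restore_transcript(sdk_transcript: str) -> str:
    """Next turn: validate and write the stored JSONL back to a tempfile,
    whose path is handed to the SDK via --resume."""
    for line in sdk_transcript.splitlines():
        json.loads(line)  # every record must be valid JSON before resuming
    f = tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False)
    f.write(sdk_transcript)
    f.close()
    return f.name
```

Because the transcript is restored verbatim, tool call/result context survives across turns without re-sending history as tokens.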
**Sandboxed Tool Execution** (`chat/tools/`)
- **`bash_exec.py`**: Sandboxed bash execution using bubblewrap
(`bwrap`) with read-only root filesystem, per-session writable
workspace, resource limits (CPU, memory, file size), and network
isolation.
- **`sandbox.py`**: Shared bubblewrap sandbox infrastructure — generates
`bwrap` command lines with configurable mounts, environment, and
resource constraints.
- **`web_fetch.py`**: URL fetching tool with domain allowlist, size
limits, and content-type filtering.
- **`check_operation_status.py`**: Polling tool for long-running
operations (agent creation, block execution) so the SDK doesn't block
waiting.
- **`find_block.py`** / **`run_block.py`**: Enhanced with category
filtering, optimized response size (removed raw JSON schemas), and
better error handling.
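As a rough illustration of what `sandbox.py` generates, a bubblewrap invocation with read-only system mounts, a single writable per-session workspace, and network isolation might look like the following. The flags shown are real `bwrap` options, but the exact mounts, resource limits, and environment handling in the real module may differ.

```python
def build_bwrap_command(workspace: str, argv: list) -> list:
    """Assemble an illustrative bwrap command line for sandboxed execution."""
    return [
        "bwrap",
        "--ro-bind", "/usr", "/usr",        # read-only system directories
        "--ro-bind", "/bin", "/bin",
        "--ro-bind", "/lib", "/lib",
        "--bind", workspace, "/workspace",  # the only writable mount
        "--unshare-net",                    # network isolation
        "--die-with-parent",                # reap sandbox if the service dies
        "--chdir", "/workspace",
        *argv,                              # the user's command
    ]
```

The resulting list would be passed to `subprocess.run` (with CPU/memory/file-size limits applied separately, e.g. via `resource` limits in a preexec hook).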
**Security**
- Path traversal prevention: session IDs sanitized, all file ops
confined to workspace dirs, symlink resolution.
- Tool allowlist enforcement via SDK hooks — model cannot call arbitrary
tools.
- Built-in `Bash` tool blocked via `disallowed_tools` to prevent
bypassing sandboxed `bash_exec`.
- Sub-agent (`Task`) spawning capped at configurable limit (default:
10).
- CodeQL-clean path sanitization patterns.
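The workspace confinement described above typically boils down to resolve-then-compare: normalize symlinks and `..` first, then verify the result still sits under the workspace root. A minimal sketch (not the actual `security_hooks.py` code):

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, user_path: str) -> Path:
    """Confine a requested path to the session workspace: resolve symlinks
    and '..' components first, then check containment against the root."""
    root = workspace.resolve()
    # Treat absolute requests as workspace-relative rather than rejecting them.
    candidate = (root / user_path.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {user_path}")
    return candidate
```

Checking containment only after `resolve()` is what closes the classic `../` and symlink escapes; comparing the unresolved string (or expanding `~` afterwards) is the kind of ordering bug flagged in the review.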
**Streaming & Reconnection**
- SSE stream registry backed by Redis Streams for crash-resilient
reconnection.
- Long-running operation tracking with TTL-based cleanup.
- Atomic message append to prevent race conditions on concurrent writes.
**Configuration** (`config.py`)
- `use_claude_agent_sdk` — master toggle (default: on)
- `claude_agent_model` — model override for SDK path
- `claude_agent_max_buffer_size` — JSON parsing buffer (10MB)
- `claude_agent_max_subtasks` — sub-agent cap (10)
- `claude_agent_use_resume` — transcript-based resume (default: off)
- `thinking_enabled` — extended thinking for Claude models
**Tests**
- `sdk/response_adapter_test.py` — 366 lines covering all event
translation paths
- `sdk/security_hooks_test.py` — 165 lines covering tool blocking, path
traversal, subtask limits
- `chat/model_test.py` — 214 lines covering session model serialization
- `chat/service_test.py` — Integration tests including multi-turn resume
keyword recall
- `tools/find_block_test.py` / `run_block_test.py` — Extended with new
tool behavior tests
## Test plan
- [x] Unit tests pass (`sdk/response_adapter_test.py`,
`security_hooks_test.py`, `model_test.py`)
- [x] Integration test: multi-turn keyword recall via `--resume`
(`service_test.py::test_sdk_resume_multi_turn`)
- [x] Manual E2E: CoPilot chat sessions with tool calls, bash execution,
and multi-turn context
- [x] Pre-commit hooks pass (ruff, isort, black, pyright, flake8)
- [ ] Staging deployment with `claude_agent_use_resume=false` initially
- [ ] Enable resume in staging, verify transcript capture and recall
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
This PR replaces the existing OpenAI-compatible CoPilot with a full
Claude Agent SDK integration, introducing multi-turn conversations,
stateless resume via JSONL transcripts, and sandboxed tool execution.
**Key changes:**
- **SDK integration** (`chat/sdk/`): spawns Claude Code CLI subprocess
per message, translates events to frontend protocol, bridges MCP tools
- **Stateless resume**: captures JSONL transcripts via Stop hook,
persists in `ChatSession.sdkTranscript`, restores with `--resume`
(feature-flagged, default off)
- **Sandboxed execution**: bubblewrap sandbox for bash commands with
filesystem whitelist, network isolation, resource limits
- **Security hooks**: tool allowlist enforcement, path traversal
prevention, workspace-scoped file operations, sub-agent spawn limits
- **Long-running operations**: delegates `create_agent`/`edit_agent` to
existing stream_registry infrastructure for SSE reconnection
- **Feature flag**: `CHAT_USE_CLAUDE_AGENT_SDK` with LaunchDarkly
support, defaults to enabled
**Security issues found:**
- Path traversal validation has logic errors in `security_hooks.py:82`
(tilde expansion order) and `service.py:266` (redundant `..` check)
- Config validator always prefers env var over explicit `False` value
(`config.py:162`)
- Race condition in `routes.py:323` — message persisted before task
registration, could duplicate on retry
- Resource limits in sandbox may fail silently (`sandbox.py:109`)
**Test coverage is strong** with 366 lines for response adapter, 165 for
security hooks, and integration tests for multi-turn resume.
</details>
<details><summary><h3>Confidence Score: 3/5</h3></summary>
- This PR is generally safe but has critical security issues in path
validation that must be fixed before merge
- Score reflects strong architecture and test coverage offset by real
security vulnerabilities: the tilde expansion bug in `security_hooks.py`
could allow sandbox escape, the race condition could cause message
duplication, and the silent ulimit failures could bypass resource
limits. The bubblewrap sandbox and allowlist enforcement are
well-designed, but the path validation bugs need fixing. The transcript
resume feature is properly feature-flagged. Overall the implementation
is solid but the security issues prevent a higher score.
- Pay close attention to
`backend/api/features/chat/sdk/security_hooks.py` (path traversal
vulnerability), `backend/api/features/chat/routes.py` (race condition),
`backend/api/features/chat/tools/sandbox.py` (silent resource limit
failures), and `backend/api/features/chat/sdk/service.py` (redundant
security check)
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant Frontend
participant Routes as routes.py
participant SDKService as sdk/service.py
participant ClaudeSDK as Claude Agent SDK CLI
participant SecurityHooks as security_hooks.py
participant ToolAdapter as tool_adapter.py
participant CoPilotTools as tools/*
participant Sandbox as sandbox.py (bwrap)
participant DB as Database
participant Redis as stream_registry
Frontend->>Routes: POST /chat (user message)
Routes->>SDKService: stream_chat_completion_sdk()
SDKService->>DB: get_chat_session()
DB-->>SDKService: session + messages
alt Resume enabled AND transcript exists
SDKService->>SDKService: validate_transcript()
SDKService->>SDKService: write_transcript_to_tempfile()
Note over SDKService: Pass --resume to SDK
else No resume
SDKService->>SDKService: _compress_conversation_history()
Note over SDKService: Inject history into user message
end
SDKService->>SecurityHooks: create_security_hooks()
SDKService->>ToolAdapter: create_copilot_mcp_server()
SDKService->>ClaudeSDK: spawn subprocess with MCP server
loop Streaming Conversation
ClaudeSDK->>SDKService: AssistantMessage (text/tool_use)
SDKService->>Frontend: StreamTextDelta / StreamToolInputAvailable
alt Tool Call
ClaudeSDK->>SecurityHooks: PreToolUse hook
SecurityHooks->>SecurityHooks: validate path, check allowlist
alt Tool blocked
SecurityHooks-->>ClaudeSDK: deny
else Tool allowed
SecurityHooks-->>ClaudeSDK: allow
ClaudeSDK->>ToolAdapter: call MCP tool
alt Long-running tool (create_agent, edit_agent)
ToolAdapter->>Redis: register task
ToolAdapter->>DB: save OperationPendingResponse
ToolAdapter->>ToolAdapter: spawn background task
ToolAdapter-->>ClaudeSDK: OperationStartedResponse
else Regular tool (find_block, bash_exec)
ToolAdapter->>CoPilotTools: execute()
alt bash_exec
CoPilotTools->>Sandbox: run_sandboxed()
Sandbox->>Sandbox: build bwrap command
Note over Sandbox: Network isolation,<br/>filesystem whitelist,<br/>resource limits
Sandbox-->>CoPilotTools: stdout, stderr, exit_code
end
CoPilotTools-->>ToolAdapter: result
ToolAdapter->>ToolAdapter: stash full output
ToolAdapter-->>ClaudeSDK: MCP response
end
SecurityHooks->>SecurityHooks: PostToolUse hook (log)
end
end
ClaudeSDK->>SDKService: UserMessage (ToolResultBlock)
SDKService->>ToolAdapter: pop_pending_tool_output()
SDKService->>Frontend: StreamToolOutputAvailable
end
ClaudeSDK->>SecurityHooks: Stop hook
SecurityHooks->>SDKService: transcript_path callback
SDKService->>SDKService: read_transcript_file()
SDKService->>DB: save transcript to session.sdkTranscript
ClaudeSDK->>SDKService: ResultMessage (success)
SDKService->>Frontend: StreamFinish
SDKService->>DB: upsert_chat_session()
```
</details>
<sub>Last reviewed commit: 28c1121</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
---------
Co-authored-by: Swifty <craigswift13@gmail.com>
965b7d3e04 |
dx: Add PR overlap detection & alert (#12104)
## Summary Adds an automated workflow that detects potential merge conflicts between open PRs, helping contributors coordinate proactively. **Example output:** [See comment on PR #12057](https://github.com/Significant-Gravitas/AutoGPT/pull/12057#issuecomment-3897330632) ## How it works 1. **Triggered on PR events** — runs when a PR is opened, pushed to, or reopened 2. **Compares against all open PRs** targeting the same base branch 3. **Detects overlaps** at multiple levels: - File overlap (same files modified) - Line overlap (same line ranges modified) - Actual merge conflicts (attempts real merges) 4. **Posts a comment** on the PR with findings ## Features - Full file paths with common prefix extraction for readability - Conflict size (number of conflict regions + lines affected) - Conflict types (content, added, deleted, modified/deleted, etc.) - Last-updated timestamps for each PR - Risk categorization (conflict, medium, low) - Ignores noise files (openapi.json, lock files) - Updates existing comment on subsequent pushes (no spam) - Filters out PRs older than 14 days - Clone-once optimization for fast merge testing (~48s for 19 PRs) ## Files - `.github/scripts/detect_overlaps.py` — main detection script - `.github/workflows/pr-overlap-check.yml` — workflow definition |
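The overlap levels described above reduce to simple set and interval checks. A rough sketch (hypothetical helpers, not the actual `detect_overlaps.py`):

```python
# Files the workflow treats as noise (illustrative subset).
NOISE_FILES = {"openapi.json", "pnpm-lock.yaml", "poetry.lock"}

def file_overlap(pr_a_files: set[str], pr_b_files: set[str]) -> set[str]:
    """Files modified by both PRs, minus known-noisy generated files."""
    return {f for f in pr_a_files & pr_b_files
            if f.split("/")[-1] not in NOISE_FILES}

def ranges_overlap(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two inclusive (start, end) line ranges touch the same lines."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Actual merge conflicts still require attempting a real merge, which is why the workflow clones once and test-merges each candidate PR.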
c2368f15ff |
fix(blocks): disable PrintToConsoleBlock (#12100)
## Summary Disables the Print to Console block as requested by Nick Tindle. ## Changes - Added `disabled=True` to PrintToConsoleBlock in `basic.py` ## Testing - Block will no longer appear in the platform UI - Existing graphs using this block should be checked (block ID: `f3b1c1b2-4c4f-4f0d-8d2f-4c4f0d8d2f4c`) Closes OPEN-3000 <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Added `disabled=True` parameter to `PrintToConsoleBlock` in `basic.py` per Nick Tindle's request (OPEN-3000). - Block follows the same disabling pattern used by other blocks in the codebase (e.g., `BlockInstallationBlock`, video blocks, Ayrshare blocks) - Block will no longer appear in the platform UI for new graph creation - Existing graphs using this block (ID: `f3b1c1b2-4c4f-4f0d-8d2f-4c4f0d8d2f4c`) will need to be checked for compatibility - Comment properly documents the reason for disabling </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge with minimal risk - Single-line change that adds a well-documented flag following existing patterns used throughout the codebase. The change is non-destructive and only affects UI visibility of the block for new graphs. - No files require special attention </details> <sub>Last reviewed commit: 759003b</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> |
9ac3f64d56 |
chore(deps): bump github/codeql-action from 3 to 4 (#12033)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3 to 4. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/github/codeql-action/releases">github/codeql-action's releases</a>.</em></p> <blockquote> <h2>v3.32.2</h2> <ul> <li>Update default CodeQL bundle version to <a href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.1">2.24.1</a>. <a href="https://redirect.github.com/github/codeql-action/pull/3460">#3460</a></li> </ul> <h2>v3.32.1</h2> <ul> <li>A warning is now shown in Default Setup workflow logs if a <a href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private package registry is configured</a> using a GitHub Personal Access Token (PAT), but no username is configured. <a href="https://redirect.github.com/github/codeql-action/pull/3422">#3422</a></li> <li>Fixed a bug which caused the CodeQL Action to fail when repository properties cannot successfully be retrieved. <a href="https://redirect.github.com/github/codeql-action/pull/3421">#3421</a></li> </ul> <h2>v3.32.0</h2> <ul> <li>Update default CodeQL bundle version to <a href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.0">2.24.0</a>. <a href="https://redirect.github.com/github/codeql-action/pull/3425">#3425</a></li> </ul> <h2>v3.31.11</h2> <ul> <li>When running a Default Setup workflow with <a href="https://docs.github.com/en/actions/how-tos/monitor-workflows/enable-debug-logging">Actions debugging enabled</a>, the CodeQL Action will now use more unique names when uploading logs from the Dependabot authentication proxy as workflow artifacts. This ensures that the artifact names do not clash between multiple jobs in a build matrix. <a href="https://redirect.github.com/github/codeql-action/pull/3409">#3409</a></li> <li>Improved error handling throughout the CodeQL Action. 
<a href="https://redirect.github.com/github/codeql-action/pull/3415">#3415</a></li> <li>Added experimental support for automatically excluding <a href="https://docs.github.com/en/repositories/working-with-files/managing-files/customizing-how-changed-files-appear-on-github">generated files</a> from the analysis. This feature is not currently enabled for any analysis. In the future, it may be enabled by default for some GitHub-managed analyses. <a href="https://redirect.github.com/github/codeql-action/pull/3318">#3318</a></li> <li>The changelog extracts that are included with releases of the CodeQL Action are now shorter to avoid duplicated information from appearing in Dependabot PRs. <a href="https://redirect.github.com/github/codeql-action/pull/3403">#3403</a></li> </ul> <h2>v3.31.10</h2> <h1>CodeQL Action Changelog</h1> <p>See the <a href="https://github.com/github/codeql-action/releases">releases page</a> for the relevant changes to the CodeQL CLI and language packs.</p> <h2>3.31.10 - 12 Jan 2026</h2> <ul> <li>Update default CodeQL bundle version to 2.23.9. <a href="https://redirect.github.com/github/codeql-action/pull/3393">#3393</a></li> </ul> <p>See the full <a href="https://github.com/github/codeql-action/blob/v3.31.10/CHANGELOG.md">CHANGELOG.md</a> for more information.</p> <h2>v3.31.9</h2> <h1>CodeQL Action Changelog</h1> <p>See the <a href="https://github.com/github/codeql-action/releases">releases page</a> for the relevant changes to the CodeQL CLI and language packs.</p> <h2>3.31.9 - 16 Dec 2025</h2> <p>No user facing changes.</p> <p>See the full <a href="https://github.com/github/codeql-action/blob/v3.31.9/CHANGELOG.md">CHANGELOG.md</a> for more information.</p> <h2>v3.31.8</h2> <h1>CodeQL Action Changelog</h1> <p>See the <a href="https://github.com/github/codeql-action/releases">releases page</a> for the relevant changes to the CodeQL CLI and language packs.</p> <h2>3.31.8 - 11 Dec 2025</h2> <ul> <li>Update default CodeQL bundle version to 2.23.8. 
<a href="https://redirect.github.com/github/codeql-action/pull/3354">#3354</a></li> </ul> <p>See the full <a href="https://github.com/github/codeql-action/blob/v3.31.8/CHANGELOG.md">CHANGELOG.md</a> for more information.</p> <h2>v3.31.7</h2> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/github/codeql-action/blob/main/CHANGELOG.md">github/codeql-action's changelog</a>.</em></p> <blockquote> <h2>4.31.11 - 23 Jan 2026</h2> <ul> <li>When running a Default Setup workflow with <a href="https://docs.github.com/en/actions/how-tos/monitor-workflows/enable-debug-logging">Actions debugging enabled</a>, the CodeQL Action will now use more unique names when uploading logs from the Dependabot authentication proxy as workflow artifacts. This ensures that the artifact names do not clash between multiple jobs in a build matrix. <a href="https://redirect.github.com/github/codeql-action/pull/3409">#3409</a></li> <li>Improved error handling throughout the CodeQL Action. <a href="https://redirect.github.com/github/codeql-action/pull/3415">#3415</a></li> <li>Added experimental support for automatically excluding <a href="https://docs.github.com/en/repositories/working-with-files/managing-files/customizing-how-changed-files-appear-on-github">generated files</a> from the analysis. This feature is not currently enabled for any analysis. In the future, it may be enabled by default for some GitHub-managed analyses. <a href="https://redirect.github.com/github/codeql-action/pull/3318">#3318</a></li> <li>The changelog extracts that are included with releases of the CodeQL Action are now shorter to avoid duplicated information from appearing in Dependabot PRs. <a href="https://redirect.github.com/github/codeql-action/pull/3403">#3403</a></li> </ul> <h2>4.31.10 - 12 Jan 2026</h2> <ul> <li>Update default CodeQL bundle version to 2.23.9. 
<a href="https://redirect.github.com/github/codeql-action/pull/3393">#3393</a></li> </ul> <h2>4.31.9 - 16 Dec 2025</h2> <p>No user facing changes.</p> <h2>4.31.8 - 11 Dec 2025</h2> <ul> <li>Update default CodeQL bundle version to 2.23.8. <a href="https://redirect.github.com/github/codeql-action/pull/3354">#3354</a></li> </ul> <h2>4.31.7 - 05 Dec 2025</h2> <ul> <li>Update default CodeQL bundle version to 2.23.7. <a href="https://redirect.github.com/github/codeql-action/pull/3343">#3343</a></li> </ul> <h2>4.31.6 - 01 Dec 2025</h2> <p>No user facing changes.</p> <h2>4.31.5 - 24 Nov 2025</h2> <ul> <li>Update default CodeQL bundle version to 2.23.6. <a href="https://redirect.github.com/github/codeql-action/pull/3321">#3321</a></li> </ul> <h2>4.31.4 - 18 Nov 2025</h2> <p>No user facing changes.</p> <h2>4.31.3 - 13 Nov 2025</h2> <ul> <li>CodeQL Action v3 will be deprecated in December 2026. The Action now logs a warning for customers who are running v3 but could be running v4. For more information, see <a href="https://github.blog/changelog/2025-10-28-upcoming-deprecation-of-codeql-action-v3/">Upcoming deprecation of CodeQL Action v3</a>.</li> <li>Update default CodeQL bundle version to 2.23.5. <a href="https://redirect.github.com/github/codeql-action/pull/3288">#3288</a></li> </ul> <h2>4.31.2 - 30 Oct 2025</h2> <p>No user facing changes.</p> <h2>4.31.1 - 30 Oct 2025</h2> <ul> <li>The <code>add-snippets</code> input has been removed from the <code>analyze</code> action. This input has been deprecated since CodeQL Action 3.26.4 in August 2024 when this removal was announced.</li> </ul> <h2>4.31.0 - 24 Oct 2025</h2> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
5035b69c79 |
feat(platform): add feature request tools for CoPilot chat (#12102)
Users can now search for existing feature requests and submit new ones directly through the CoPilot chat interface. Requests are tracked in Linear with customer need attribution. ### Changes 🏗️ **Backend:** - Added `SearchFeatureRequestsTool` and `CreateFeatureRequestTool` to the CoPilot chat tools registry - Integrated with Linear GraphQL API for searching issues in the feature requests project, creating new issues, upserting customers, and attaching customer needs - Added `linear_api_key` secret to settings for system-level Linear API access - Added response models (`FeatureRequestSearchResponse`, `FeatureRequestCreatedResponse`, `FeatureRequestInfo`) to the tools models **Frontend:** - Added `SearchFeatureRequestsTool` and `CreateFeatureRequestTool` UI components with full streaming state handling (input-streaming, input-available, output-available, output-error) - Added helper utilities for output parsing, type guards, animation text, and icon rendering - Wired tools into `ChatMessagesContainer` for rendering in the chat - Added styleguide examples covering all tool states ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified search returns matching feature requests from Linear - [x] Verified creating a new feature request creates an issue and customer need in Linear - [x] Verified adding a need to an existing issue works via `existing_issue_id` - [x] Verified error states render correctly in the UI - [x] Verified styleguide page renders all tool states #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under **Changes**) New secret: `LINEAR_API_KEY` — required for system-level Linear API operations (defaults to empty string). 
<!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Adds feature request search and creation tools to CoPilot chat, integrating with Linear's GraphQL API to track user feedback. Users can now search existing feature requests and submit new ones (or add their need to existing issues) directly through conversation. **Key changes:** - Backend: `SearchFeatureRequestsTool` and `CreateFeatureRequestTool` with Linear API integration via system-level `LINEAR_API_KEY` - Frontend: React components with streaming state handling and accordion UI for search results and creation confirmations - Models: Added `FeatureRequestSearchResponse` and `FeatureRequestCreatedResponse` to response types - Customer need tracking: Upserts customers in Linear and attaches needs to issues for better feedback attribution **Issues found:** - Missing `LINEAR_API_KEY` entry in `.env.default` (required per PR description checklist) - Hardcoded project/team IDs reduce maintainability - Global singleton pattern could cause issues in async contexts - Using `user_id` as customer name reduces readability in Linear </details> <details><summary><h3>Confidence Score: 4/5</h3></summary> - Safe to merge with minor configuration fix required - The implementation is well-structured with proper error handling, type safety, and follows existing patterns in the codebase. The missing `.env.default` entry is a straightforward configuration issue that must be fixed before deployment but doesn't affect code quality. The other findings are style improvements that don't impact functionality. 
- Verify that `LINEAR_API_KEY` is added to `.env.default` before merging </details> <details><summary><h3>Sequence Diagram</h3></summary> ```mermaid sequenceDiagram participant User participant CoPilot UI participant LLM participant FeatureRequestTool participant LinearClient participant Linear API User->>CoPilot UI: Request feature via chat CoPilot UI->>LLM: Send user message LLM->>FeatureRequestTool: search_feature_requests(query) FeatureRequestTool->>LinearClient: query(SEARCH_ISSUES_QUERY) LinearClient->>Linear API: POST /graphql (search) Linear API-->>LinearClient: searchIssues.nodes[] LinearClient-->>FeatureRequestTool: Feature request data FeatureRequestTool-->>LLM: FeatureRequestSearchResponse alt No existing requests found LLM->>FeatureRequestTool: create_feature_request(title, description) FeatureRequestTool->>LinearClient: mutate(CUSTOMER_UPSERT_MUTATION) LinearClient->>Linear API: POST /graphql (upsert customer) Linear API-->>LinearClient: customer {id, name} LinearClient-->>FeatureRequestTool: Customer data FeatureRequestTool->>LinearClient: mutate(ISSUE_CREATE_MUTATION) LinearClient->>Linear API: POST /graphql (create issue) Linear API-->>LinearClient: issue {id, identifier, url} LinearClient-->>FeatureRequestTool: Issue data FeatureRequestTool->>LinearClient: mutate(CUSTOMER_NEED_CREATE_MUTATION) LinearClient->>Linear API: POST /graphql (attach need) Linear API-->>LinearClient: need {id, issue} LinearClient-->>FeatureRequestTool: Need data FeatureRequestTool-->>LLM: FeatureRequestCreatedResponse else Existing request found LLM->>FeatureRequestTool: create_feature_request(title, description, existing_issue_id) FeatureRequestTool->>LinearClient: mutate(CUSTOMER_UPSERT_MUTATION) LinearClient->>Linear API: POST /graphql (upsert customer) Linear API-->>LinearClient: customer {id} LinearClient-->>FeatureRequestTool: Customer data FeatureRequestTool->>LinearClient: mutate(CUSTOMER_NEED_CREATE_MUTATION) LinearClient->>Linear API: POST /graphql (attach need 
to existing) Linear API-->>LinearClient: need {id, issue} LinearClient-->>FeatureRequestTool: Need data FeatureRequestTool-->>LLM: FeatureRequestCreatedResponse end LLM-->>CoPilot UI: Tool response + continuation CoPilot UI-->>User: Display result with accordion UI ``` </details> <sub>Last reviewed commit: af2e093</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> |
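The Linear operations in the diagram above are all plain GraphQL POSTs. A minimal sketch of building such a request (endpoint per Linear's public API; the query shape is illustrative, not the PR's exact operations — personal API keys are passed directly in the `Authorization` header, without a `Bearer` prefix):

```python
import json
import urllib.request

LINEAR_GRAPHQL_URL = "https://api.linear.app/graphql"

# Illustrative query; the PR's operations (searchIssues, issueCreate,
# customerUpsert, customerNeedCreate) follow the same request shape.
SEARCH_ISSUES_QUERY = """
query Search($term: String!) {
  searchIssues(term: $term) { nodes { id identifier title url } }
}
"""

def build_linear_request(api_key: str, query: str, variables: dict) -> urllib.request.Request:
    payload = json.dumps({"query": query, "variables": variables}).encode()
    return urllib.request.Request(
        LINEAR_GRAPHQL_URL,
        data=payload,
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )

# Executing is then simply:
#   urllib.request.urlopen(build_linear_request(key, SEARCH_ISSUES_QUERY, {"term": "dark mode"}))
```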
86af8fc856 |
ci: apply E2E CI optimizations to Claude workflows (#12097)
## Summary Applies the CI performance optimizations from #12090 to Claude Code workflows. ## Changes ### `claude.yml` & `claude-dependabot.yml` - **pnpm caching**: Replaced manual `actions/cache` with `setup-node` built-in `cache: "pnpm"` - Removes 4 steps (set pnpm store dir, cache step, manual config) → 1 step ### `claude-ci-failure-auto-fix.yml` - **Added dev environment setup** with optimized caching - Now Claude can run lint/tests when fixing CI failures (previously could only edit files) - Uses the same optimized caching patterns ## Dependency This PR is based on #12090 and will merge after it. ## Testing - Workflow YAML syntax validated - Patterns match proven #12090 implementation - CI caching changes fail gracefully to uncached builds ## Linear Fixes [SECRT-1950](https://linear.app/autogpt/issue/SECRT-1950) ## Future Enhancements E2E test data caching could be added to Claude workflows if needed for running integration tests. Currently Claude workflows set up a dev environment but don't run E2E tests by default. <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Applies proven CI performance optimizations to Claude workflows by simplifying pnpm caching and adding dev environment setup to the auto-fix workflow. **Key changes:** - Replaced manual pnpm cache configuration (4 steps) with built-in `setup-node` `cache: "pnpm"` support in `claude.yml` and `claude-dependabot.yml` - Added complete dev environment setup (Python/Poetry + Node.js/pnpm) to `claude-ci-failure-auto-fix.yml` so Claude can run linting and tests when fixing CI failures - Correctly orders `corepack enable` before `setup-node` to ensure pnpm is available for caching The changes mirror the optimizations from PR #12090 and maintain consistency across all Claude workflows. 
</details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge with minimal risk - The changes are CI infrastructure optimizations that mirror proven patterns from PR #12090. The pnpm caching simplification reduces complexity without changing functionality (caching failures gracefully fall back to uncached builds). The dev environment setup in the auto-fix workflow is additive and enables Claude to run linting/tests. All YAML syntax is correct and the step ordering follows best practices. - No files require special attention </details> <details><summary><h3>Sequence Diagram</h3></summary> ```mermaid sequenceDiagram participant GHA as GitHub Actions participant Corepack as Corepack participant SetupNode as setup-node@v6 participant Cache as GHA Cache participant pnpm as pnpm Note over GHA,pnpm: Before (Manual Caching) GHA->>SetupNode: Set up Node.js 22 SetupNode-->>GHA: Node.js ready GHA->>Corepack: Enable corepack Corepack-->>GHA: pnpm available GHA->>pnpm: Configure store directory pnpm-->>GHA: Store path set GHA->>Cache: actions/cache (manual key) Cache-->>GHA: Cache restored/missed GHA->>pnpm: Install dependencies pnpm-->>GHA: Dependencies installed Note over GHA,pnpm: After (Built-in Caching) GHA->>Corepack: Enable corepack Corepack-->>GHA: pnpm available GHA->>SetupNode: Set up Node.js 22<br/>cache: "pnpm"<br/>cache-dependency-path: pnpm-lock.yaml SetupNode->>Cache: Auto-detect pnpm store Cache-->>SetupNode: Cache restored/missed SetupNode-->>GHA: Node.js + cache ready GHA->>pnpm: Install dependencies pnpm-->>GHA: Dependencies installed ``` </details> <sub>Last reviewed commit: f1681a0</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> --------- Co-authored-by: Reinier van der Leer <pwuts@agpt.co> Co-authored-by: Ubbe <hi@ubbe.dev> |
dfa517300b |
debug(copilot): Add detailed API error logging (#11942)
## Summary Adds comprehensive error logging for OpenRouter/OpenAI API errors to help diagnose issues like provider routing failures, context length exceeded, rate limits, etc. ## Background While investigating [SECRT-1859](https://linear.app/autogpt/issue/SECRT-1859), we found that when OpenRouter returns errors, the actual error details weren't being captured or logged. Langfuse traces showed `provider_name: 'unknown'` and `completion: null` without any insight into WHY all providers rejected the request. ## Changes - Add `_extract_api_error_details()` to extract rich information from API errors including: - Status code and request ID - Response body (contains OpenRouter's actual error message) - OpenRouter-specific headers (provider, model) - Rate limit headers - Add `_log_api_error()` helper that logs errors with context: - Session ID for correlation - Message count (helps identify context length issues) - Model being used - Retry count - Update error handling in `_stream_chat_chunks()` and `_generate_llm_continuation()` to use new logging - Extract provider's error message from response body for better user feedback ## Example log output ``` API error: { 'error_type': 'APIStatusError', 'error_message': 'Provider returned error', 'status_code': 400, 'request_id': 'req_xxx', 'response_body': {'error': {'message': 'context_length_exceeded', 'type': 'invalid_request_error'}}, 'openrouter_provider': 'unknown', 'session_id': '44fbb803-...', 'message_count': 52, 'model': 'anthropic/claude-opus-4.5', 'retry_count': 0 } ``` ## Testing - [ ] Verified code passes linting (black, isort, ruff) - [ ] Error details are properly extracted from different error types ## Refs - Linear: SECRT-1859 - Thread: https://discord.com/channels/1126875755960336515/1467066151002571034 --------- Co-authored-by: Reinier van der Leer <pwuts@agpt.co> |
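The extraction described above can be approximated with defensive attribute lookups, so logging never raises even for non-SDK exceptions. A hypothetical sketch mirroring the example log (not the PR's `_extract_api_error_details()` verbatim):

```python
def extract_api_error_details(err: Exception) -> dict:
    """Pull status/request-id/body off an API error, tolerating absence.

    SDK errors such as openai.APIStatusError carry these attributes;
    every lookup is guarded so a plain exception still logs cleanly.
    """
    return {
        "error_type": type(err).__name__,
        "error_message": str(err),
        "status_code": getattr(err, "status_code", None),
        "request_id": getattr(err, "request_id", None),
        # Contains the provider's actual error message when present.
        "response_body": getattr(err, "body", None),
    }
```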
43b25b5e2f |
ci(frontend): Speed up E2E test job (#12090)
The frontend `e2e_test` doesn't have a working build cache setup, causing really slow builds = slow test jobs. These changes reduce total test runtime from ~12 minutes to ~5 minutes. ### Changes 🏗️ - Inject build cache config into docker compose config; let `buildx bake` use GHA cache directly - Add `docker-ci-fix-compose-build-cache.py` script - Optimize `backend/Dockerfile` + root `.dockerignore` - Replace broken DIY pnpm store caching with `actions/setup-node` built-in cache management - Add caching for test seed data created in DB ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - CI |
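The compose cache-injection step amounts to adding `cache_from`/`cache_to` entries (buildx's `type=gha` cache backend) to each service that builds an image. A hypothetical sketch operating on an already-parsed compose mapping (scope naming is illustrative, not the actual `docker-ci-fix-compose-build-cache.py`):

```python
def inject_build_cache(compose: dict, scope_prefix: str = "compose") -> dict:
    """Add GHA cache_from/cache_to to every service with a build section."""
    for name, service in compose.get("services", {}).items():
        build = service.get("build")
        if not isinstance(build, dict):
            continue  # no build section, or short-form build string
        scope = f"{scope_prefix}-{name}"
        build["cache_from"] = [f"type=gha,scope={scope}"]
        build["cache_to"] = [f"type=gha,scope={scope},mode=max"]
    return compose
```

`docker buildx bake` reads these fields directly from the compose file, which is what lets the CI job reuse layer caches across runs.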
ab0b537cc7 |
refactor(backend): optimize find_block response size by removing raw JSON schemas (#12020)
### Changes 🏗️ The `find_block` AutoPilot tool was returning ~90K characters per response (10 blocks). The bloat came from including full JSON Schema objects (`input_schema`, `output_schema`) with all nested `$defs`, `anyOf`, and type definitions for every block. **What changed:** - **`BlockInfoSummary` model**: Removed `input_schema` (raw JSON Schema), `output_schema` (raw JSON Schema), and `categories`. Added `output_fields` (compact field-level summaries matching the existing `required_inputs` format). - **`BlockListResponse` model**: Removed `usage_hint` (info now in `message`). - **`FindBlockTool._execute()`**: Now extracts compact `output_fields` from output schema properties instead of including the entire raw schema. Credentials handling is unchanged. - **Test**: Added `test_response_size_average_chars_per_block` with realistic block schemas (HTTP, Email, Claude Code) to measure and assert response size stays under 2K chars/block. - **`CLAUDE.md`**: Clarified `dev` vs `master` branching strategy. **Result:** Average response size reduced from ~9,000 to ~1,300 chars per block (~85% reduction). This directly reduces LLM token consumption, latency, and API costs for AutoPilot interactions. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified models import and serialize correctly - [x] Verified response size: 3,970 chars for 3 realistic blocks (avg 1,323/block) - [x] Lint (`ruff check`) and type check (`pyright`) pass on changed files - [x] Frontend compatibility preserved: `blocks[].name` and `count` fields retained for `block_list` handler --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com> |
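The compact `output_fields` extraction described above can be sketched as flattening a JSON Schema's `properties` into name/type/description triples (a hypothetical simplification; the real code also handles `required_inputs` and credentials):

```python
def summarize_output_fields(output_schema: dict) -> list[dict]:
    """Collapse a JSON Schema's properties into compact field summaries,
    dropping nested $defs/anyOf structure that bloats LLM responses."""
    fields = []
    for name, prop in output_schema.get("properties", {}).items():
        fields.append({
            "name": name,
            "type": prop.get("type", "any"),
            "description": prop.get("description", ""),
        })
    return fields
```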
9a8c6ad609 |
chore(libs/deps): bump the production-dependencies group across 1 directory with 4 updates (#12056)
Bumps the production-dependencies group with 4 updates in the /autogpt_platform/autogpt_libs directory: [cryptography](https://github.com/pyca/cryptography), [fastapi](https://github.com/fastapi/fastapi), [launchdarkly-server-sdk](https://github.com/launchdarkly/python-server-sdk) and [supabase](https://github.com/supabase/supabase-py). Updates `cryptography` from 46.0.4 to 46.0.5 <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst">cryptography's changelog</a>.</em></p> <blockquote> <p>46.0.5 - 2026-02-10</p> <pre><code> * An attacker could create a malicious public key that reveals portions of your private key when using certain uncommon elliptic curves (binary curves). This version now includes additional security checks to prevent this attack. This issue only affects binary elliptic curves, which are rarely used in real-world applications. Credit to **XlabAI Team of Tencent Xuanwu Lab and Atuin Automated Vulnerability Discovery Engine** for reporting the issue. **CVE-2026-26007** * Support for ``SECT*`` binary elliptic curves is deprecated and will be removed in the next release. <p>.. v46-0-4:<br /> </code></pre></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
e8c50b96d1 |
fix(frontend): improve CoPilot chat table styling (#12094)
## Summary - Remove left and right borders from tables rendered in CoPilot chat - Increase cell padding (py-3 → py-3.5) for better spacing between text and lines - Applies to both Streamdown (main chat) and MarkdownRenderer (tool outputs) Design feedback from Olivia to make tables "breathe" more. ## Test plan - [ ] Open CoPilot chat and trigger a response containing a table - [ ] Verify tables no longer have left/right borders - [ ] Verify increased spacing between rows - [ ] Check both light and dark modes 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Improved CoPilot chat table styling by removing left and right borders and increasing vertical padding from `py-3` to `py-3.5`. Changes apply to both: - Streamdown-rendered tables (via CSS selector in `globals.css`) - MarkdownRenderer tables (via Tailwind classes) The changes make tables "breathe" more per design feedback from Olivia. **Issue Found:** - The CSS padding value in `globals.css:192` is `0.625rem` (`py-2.5`) but should be `0.875rem` (`py-3.5`) to match the PR description and the MarkdownRenderer implementation. </details> <details><summary><h3>Confidence Score: 2/5</h3></summary> - This PR has a logical error that will cause inconsistent table styling between Streamdown and MarkdownRenderer tables - The implementation has an inconsistency where the CSS file uses `py-2.5` padding while the PR description and MarkdownRenderer use `py-3.5`. This will result in different table padding between the two rendering systems, contradicting the goal of consistent styling improvements. 
- Pay close attention to `autogpt_platform/frontend/src/app/globals.css` - the padding value needs to be corrected to match the intended design </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> |
30e854569a |
feat(frontend): add exact timestamp tooltip on run timestamps (#12087)
Resolves OPEN-2693: Make exact timestamp of runs accessible through UI.
The NewAgentLibraryView shows relative timestamps ("2 days ago") for
runs and schedules but, unlike the OldAgentLibraryView, did not show
the exact timestamp on hover. This PR adds a native `title` tooltip so
users can see the full date/time by hovering.
### Changes 🏗️
- Added `descriptionTitle` prop to `SidebarItemCard` that renders as a
`title` attribute on the description text
- `TaskListItem` now passes the exact `run.started_at` timestamp via
`descriptionTitle`
- `ScheduleListItem` now passes the exact `schedule.next_run_time`
timestamp via `descriptionTitle`
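The changes above can be sketched as follows. Only the `descriptionTitle` prop name and the `run.started_at` / `schedule.next_run_time` sources come from the PR; the helper below is a hypothetical illustration of how an optional prop becomes a native `title` attribute (which the browser renders as a tooltip):

```typescript
// Minimal sketch of the new optional prop on SidebarItemCard.
interface SidebarItemCardProps {
  description: string;        // relative timestamp, e.g. "2 days ago"
  descriptionTitle?: string;  // exact timestamp, e.g. run.started_at
}

// Hypothetical helper: attributes forwarded to the description text
// element. When descriptionTitle is absent, no title attribute is set,
// so existing callers are unaffected.
function descriptionAttrs(props: SidebarItemCardProps): Record<string, string> {
  const attrs: Record<string, string> = {};
  if (props.descriptionTitle !== undefined) {
    attrs.title = props.descriptionTitle;
  }
  return attrs;
}
```

Using a native `title` attribute rather than a tooltip component keeps the change dependency-free; the trade-off is that the tooltip's appearance and delay are controlled by the browser.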
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [ ] Open an agent in the library view
- [ ] Hover over a run's relative timestamp (e.g. "2 days ago") and
confirm the full date/time tooltip appears
- [ ] Hover over a schedule's relative timestamp and confirm the full
date/time tooltip appears
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>
Added native tooltip functionality to show exact timestamps in the
library view. The implementation adds a `descriptionTitle` prop to
`SidebarItemCard` that renders as a `title` attribute on the description
text. This allows users to hover over relative timestamps (e.g., "2 days
ago") to see the full date/time.
**Changes:**
- Added optional `descriptionTitle` prop to `SidebarItemCard` component
(SidebarItemCard.tsx:10)
- `TaskListItem` passes `run.started_at` as the tooltip value
(TaskListItem.tsx:84-86)
- `ScheduleListItem` passes `schedule.next_run_time` as the tooltip
value (ScheduleListItem.tsx:32)
- Unrelated fix included: Sentry configuration updated to suppress
cross-origin stylesheet errors (instrumentation-client.ts:25-28)
**Note:** The PR includes two separate commits - the main timestamp
tooltip feature and a Sentry error suppression fix. The PR description
only documents the timestamp feature.
</details>
<details><summary><h3>Confidence Score: 5/5</h3></summary>
- This PR is safe to merge with minimal risk
- The changes are straightforward and limited in scope - adding an
optional prop that forwards a native HTML attribute for tooltip
functionality. The Text component already supports forwarding arbitrary
HTML attributes through its spread operator (...rest), ensuring the
`title` attribute works correctly. Both the timestamp tooltip feature
and the Sentry configuration fix are low-risk improvements with no
breaking changes.
- No files require special attention
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
participant User
participant TaskListItem
participant ScheduleListItem
participant SidebarItemCard
participant Text
participant Browser
User->>TaskListItem: Hover over run timestamp
TaskListItem->>SidebarItemCard: Pass descriptionTitle (run.started_at)
SidebarItemCard->>Text: Render with title attribute
Text->>Browser: Forward title attribute to DOM
Browser->>User: Display native tooltip with exact timestamp
User->>ScheduleListItem: Hover over schedule timestamp
ScheduleListItem->>SidebarItemCard: Pass descriptionTitle (schedule.next_run_time)
SidebarItemCard->>Text: Render with title attribute
Text->>Browser: Forward title attribute to DOM
Browser->>User: Display native tooltip with exact timestamp
```
</details>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
301d7cbada |
fix(frontend): suppress cross-origin stylesheet security error (#12086)
## Summary

- Adds `ignoreErrors` to the Sentry client configuration (`instrumentation-client.ts`) to filter out `SecurityError: CSSStyleSheet.cssRules getter: Not allowed to access cross-origin stylesheet` errors
- These errors are caused by Sentry Replay (rrweb) attempting to serialize DOM snapshots that include cross-origin stylesheets (from browser extensions or CDN-loaded CSS)
- This was reported via Sentry on production, occurring on any page when logged in

## Changes

- **`frontend/instrumentation-client.ts`**: Added `ignoreErrors: [/Not allowed to access cross-origin stylesheet/]` to `Sentry.init()` config

## Test plan

- [ ] Verify the error no longer appears in Sentry after deployment
- [ ] Verify Sentry Replay still works correctly for other errors
- [ ] Verify no regressions in error tracking (other errors should still be captured)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<details><summary><h3>Greptile Summary</h3></summary>

Adds error filtering to the Sentry client configuration to suppress cross-origin stylesheet security errors that occur when Sentry Replay (rrweb) attempts to serialize DOM snapshots containing stylesheets from browser extensions or CDN-loaded CSS. This prevents noise in Sentry error logs without affecting the capture of legitimate errors.
</details>

<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- The change adds a simple error filter to suppress benign cross-origin stylesheet errors that are caused by Sentry Replay itself. The regex pattern is specific and only affects client-side error reporting, with no impact on application functionality or legitimate error capture
- No files require special attention
</details>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
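Sentry matches each `ignoreErrors` entry (a substring or a regex) against the error message and drops matches client-side before they are reported. A minimal sketch of the filter pattern, using the regex quoted in the PR description (the sample messages are illustrative):

```typescript
// The regex added to Sentry's ignoreErrors in instrumentation-client.ts.
const crossOriginStylesheetError = /Not allowed to access cross-origin stylesheet/;

// Example of the message reported from production, vs. an unrelated error
// that should still reach Sentry.
const reported =
  "SecurityError: CSSStyleSheet.cssRules getter: Not allowed to access cross-origin stylesheet";
const unrelated = "TypeError: fetch failed";

console.log(crossOriginStylesheetError.test(reported));  // true: suppressed
console.log(crossOriginStylesheetError.test(unrelated)); // false: still captured
```

Keeping the pattern narrow (matching the specific "cross-origin stylesheet" phrase rather than all `SecurityError`s) is what limits the filter to this one benign failure mode.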