## Changes 🏗️
Adds Redis-based SSE reconnection support for long-running CoPilot
operations (like Agent Generator), enabling clients to reconnect and
resume receiving updates after disconnection.
### What this does:
- **Stream Registry** - Redis-backed task tracking with message
persistence via Redis Streams
- **SSE Reconnection** - Clients can reconnect to active tasks using
`task_id` and `last_message_id`
- **Duplicate Message Fix** - Filters out in-progress assistant messages
from session response when active stream exists
- **Completion Consumer** - Handles background task completion
notifications via Redis Streams
### Architecture:
```
1. User sends message → Backend creates task in Redis
2. SSE chunks written to Redis Stream for persistence
3. Client receives chunks via SSE subscription
4. If client disconnects → Task continues in background
5. Client reconnects → GET /sessions/{id} returns active_stream info
6. Client subscribes to /tasks/{task_id}/stream with last_message_id
7. Missed messages replayed from Redis Stream
```
### Key endpoints:
- `GET /sessions/{session_id}` - Returns `active_stream` info if task is
running
- `GET /tasks/{task_id}/stream?last_message_id=X` - SSE endpoint for
reconnection
- `GET /tasks/{task_id}` - Get task status
- `POST /operations/{op_id}/complete` - Webhook for external service
completion
### Duplicate message fix:
When `GET /sessions/{id}` detects an active stream:
1. Filters out the in-progress assistant message from response
2. Returns `last_message_id="0-0"` so client replays stream from
beginning
3. Client receives complete response only through SSE (single source of
truth)
### Frontend changes:
- Task persistence in localStorage for cross-tab reconnection
- Stream event dispatcher handles reconnection flow
- Deduplication logic prevents duplicate messages
### Testing:
- Manual testing of reconnection scenarios
- Verified duplicate message fix works correctly
## Related
- Resolves SSE timeout issues for Agent Generator
- Fixes duplicate message bug on reconnection
- Replaced custom expandable sections with ToolAccordion component in both FindAgents and SearchDocs tools.
- Simplified state management by removing unnecessary useState and useReducedMotion hooks.
- Enhanced accessibility and readability of agent and document search results with clearer descriptions and structured layouts.
- Updated the `FindBlocksTool` to utilize the new `MorphingTextAnimation` for improved visual feedback.
- Refactored `MorphingTextAnimation` to accept a `text` prop, simplifying its usage and enhancing flexibility.
- Improved the rendering logic in `ChatMessagesContainer` to ensure proper key assignment for dynamic elements.
These changes aim to enhance the user experience by providing better visual transitions and cleaner component interactions.
- Simplified the `handleMessageSubmit` function in the chat page for better readability.
- Refactored the `ChatMessagesContainer` to improve message rendering logic, including the addition of the `FindBlocksTool` for tool call outputs.
- Updated the `ChatSidebar` component for better organization and clarity in props definition.
- Introduced a new `MorphingTextAnimation` component to enhance visual feedback during message transitions.
- Removed the obsolete `chat-store.ts` file to streamline the codebase.
These changes aim to improve the overall functionality and user experience of the chat interface.
## Problem
When marketplace agents are included in the `library_agents` payload
sent to the Agent Generator service, they were missing required fields
(`graph_id`, `graph_version`, `input_schema`, `output_schema`). This
caused Pydantic validation to fail with HTTP 422 Unprocessable Entity.
**Root cause:** The `MarketplaceAgentSummary` TypedDict had a different
shape than `LibraryAgentInfo` expected by the Agent Generator:
- Agent Generator expects: `graph_id`, `graph_version`, `name`,
`description`, `input_schema`, `output_schema`
- MarketplaceAgentSummary had: `name`, `description`, `sub_heading`,
`creator`, `is_marketplace_agent`
## Solution
1. **Add `agent_graph_id` to `StoreAgent` model** - The field was
already in the database view but not exposed
2. **Include `agentGraphId` in hybrid search SQL query** - Carry the
field through the search CTEs
3. **Update `search_marketplace_agents_for_generation()`** - Now fetches
full graph schemas using `get_graph()` and returns `LibraryAgentSummary`
(same type as library agents)
4. **Update deduplication logic** - Use `graph_id` instead of name for
more accurate deduplication
## Changes
- `backend/api/features/store/model.py`: Add optional `agent_graph_id`
field to `StoreAgent`
- `backend/api/features/store/hybrid_search.py`: Include `agentGraphId`
in SQL query columns
- `backend/api/features/store/db.py`: Map `agentGraphId` when creating
`StoreAgent` objects
- `backend/api/features/chat/tools/agent_generator/core.py`: Update
`search_marketplace_agents_for_generation()` to fetch and include full
graph schemas
## Testing
- [ ] Agent creation on dev with marketplace agents in context
- [ ] Verify no 422 errors from Agent Generator
- [ ] Verify marketplace agents can be used as sub-agents
Fixes: SECRT-1817
---------
Co-authored-by: majdyz <majdyz@users.noreply.github.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
## Context
When users ask the chat to create agents, they may want to compose
workflows that reuse their existing agents as sub-agents. For this to
work, the Agent Generator service needs to know what agents the user has
available.
**Challenge:** Users can have large libraries with many agents. Fetching
all of them would be slow and provide too much context to the LLM.
## Solution
This PR implements **search-based library agent fetching** with a
**two-phase search** strategy:
1. **Phase 1 (Initial Search):** When the user describes their goal, we
search for relevant library agents using the goal as the search query
2. **Phase 2 (Step-Based Enrichment):** After the goal is decomposed
into steps, we extract keywords from those steps and search for
additional relevant agents
This ensures we find agents that are relevant to both the high-level
goal AND the specific steps identified.
### Example Flow
```
User goal: "Create an agent that fetches weather and sends a summary email"
Phase 1: Search for "weather email summary" → finds "Weather Fetcher" agent
Phase 2: After decomposition identifies steps like "send email notification"
→ searches "send email notification" → finds "Gmail Sender" agent
```
### Changes
**Library Agent Fetching:**
- `get_library_agents_for_generation()` - Search-based fetching from
user's library
- `search_marketplace_agents_for_generation()` - Search public
marketplace
- `get_all_relevant_agents_for_generation()` - Combines both with
deduplication
**Two-Phase Search:**
- `extract_search_terms_from_steps()` - Extracts keywords from
decomposed steps
- `enrich_library_agents_from_steps()` - Searches for additional agents
based on steps
- Integrated into `create_agent.py` as "Step 1.5" after goal
decomposition
**Type Safety:**
- Added `TypedDict` definitions: `LibraryAgentSummary`,
`MarketplaceAgentSummary`, `DecompositionStep`, `DecompositionResult`
### Design Decisions
- **Search-based, not fetch-all:** Scalable for large libraries
- **Library agents prioritized:** They have full schemas; marketplace
agents have basic info only
- **Deduplication by name and graph_id:** Prevents duplicates across
searches
- **Graceful degradation:** Failures don't block agent generation
- **Limited to 3 search terms:** Avoids excessive API calls during
enrichment
## Related PR
- Agent Generator:
https://github.com/Significant-Gravitas/AutoGPT-Agent-Generator/pull/103
## Test plan
- [x] `test_library_agents.py` - 19 tests covering all new functions
- [x] `test_service.py` - 4 tests for library_agents passthrough
- [ ] Integration test: Create agent with library sub-agent composition
### Changes 🏗️
Further improvements to LaunchDarkly initialisation and homepage
redirect...
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run the app locally with the flag disabled/enabled, and the
redirects work
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Ubbe <0ubbe@users.noreply.github.com>
## Changes 🏗️
Removes the `key` prop from `LDProvider` that was causing full remounts
when user context changed.
### The Problem
The `key={context.key}` prop was forcing React to unmount and remount
the entire LDProvider when switching from anonymous → logged in user:
```
1. Page loads, user loading → key="anonymous" → LD mounts → flags available ✅
2. User finishes loading → key="user-123" → React sees key changed
3. LDProvider UNMOUNTS → flags become undefined ❌
4. New LDProvider MOUNTS → initializes again → flags available ✅
```
This caused the flag values to cycle: `undefined → value → undefined →
value`
### The Fix
Remove the `key` prop. The LDProvider handles context changes internally
via the `context` prop, which triggers `identify()` without remounting
the provider.
## Checklist 📋
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
- [ ] Flag values don't flicker on page load
- [ ] Flag values update correctly when logging in/out
- [ ] No redirect race conditions
Related: SECRT-1845
- Added new dependencies for Streamdown components to improve rendering capabilities.
- Updated the chat page layout to utilize new conversation components, enhancing user experience.
- Refactored message handling to streamline input submission and improve message rendering logic.
These changes aim to enhance the overall functionality and usability of the chat interface.
### Changes 🏗️
The page used a client-side redirect (`useEffect` + `router.replace`)
which only works after JavaScript loads and hydrates. On deployed sites,
if there's any delay or failure in JS execution, users see an
empty/black page because the component returns null.
**Fix:** Converted to a server-side redirect using redirect() from
next/navigation. This is a server component now, so:
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Tested locally but will see it fully working once deployed
- Refactored the chat page to utilize a new `ChatSidebar` component for better organization and user experience.
- Improved message handling by simplifying session creation logic and ensuring proper state management.
- Updated UI elements for consistency, including button labels and input handling.
- Enhanced message rendering to support tool call outputs, improving the chat interaction flow.
These changes aim to streamline the chat interface and improve overall usability.
## Changes 🏗️
Fixes the hard refresh redirect bug (SECRT-1845) by removing the double
feature flag check.
### Before (buggy)
```
/ → checks flag → /copilot or /library
/copilot (layout) → checks flag → /library if OFF
```
On hard refresh, two sequential LD checks created a race condition
window.
### After (fixed)
```
/ → always redirects to /copilot
/copilot (layout) → single flag check via FeatureFlagPage
```
Single check point = no double-check race condition.
## Root Cause
As identified by @0ubbe: the root page and copilot layout were both
checking the feature flag. On hard refresh with network latency, the
second check could fire before LaunchDarkly fully initialized, causing
users to be bounced to `/library`.
## Test Plan
- [ ] Hard refresh on `/` → should go to `/copilot` (flag ON)
- [ ] Hard refresh on `/copilot` → should stay on `/copilot` (flag ON)
- [ ] With flag OFF → should redirect to `/library`
- [ ] Normal navigation still works
Fixes: SECRT-1845
cc @0ubbe
## Summary
Refocuses the chat input textarea after voice transcription finishes,
allowing users to immediately use `spacebar+enter` to record and send
their prompt.
## Changes
- Added `inputId` parameter to `useVoiceRecording` hook
- After transcription completes, the input is automatically focused
- This improves the voice input UX flow
## Testing
1. Click mic button or press spacebar to record voice
2. Record a message and stop
3. After transcription completes, the input should be focused
4. User can now press Enter to send or spacebar to record again
---------
Co-authored-by: Lluis Agusti <hi@llu.lu>
## Summary
Fixes flaky E2E marketplace and library tests that were causing PRs to
be removed from the merge queue.
## Root Cause
1. **Test data was probabilistic** - `e2e_test_data.py` used random
chances (40% approve, then 20-50% feature), which could result in 0
featured agents
2. **Library pagination threshold wrong** - Checked `>= 10`, but page
size is 20
3. **Fixed timeouts** - Used `waitForTimeout(2000)` /
`waitForTimeout(10000)` instead of proper waits
## Changes
### Backend (`e2e_test_data.py`)
- Add guaranteed minimums: 8 featured agents, 5 featured creators, 10
top agents
- First N submissions are deterministically approved and featured
- Increase agents per user from 15 → 25 (for pagination with
page_size=20)
- Fix library agent creation to use constants instead of hardcoded `10`
### Frontend Tests
- `library.spec.ts`: Fix pagination threshold to `PAGE_SIZE` (20)
- `library.page.ts`: Replace 2s timeout with `networkidle` +
`waitForFunction`
- `marketplace.page.ts`: Add `networkidle` wait, 30s waits in
`getFirst*` methods
- `marketplace.spec.ts`: Replace 10s timeout with `waitForFunction`
- `marketplace-creator.spec.ts`: Add `networkidle` + element waits
## Related
- Closes SECRT-1848, SECRT-1849
- Should unblock #11841 and other PRs in merge queue
---------
Co-authored-by: Ubbe <hi@ubbe.dev>
Split `autogpt_platform/CLAUDE.md` into project-specific files, to make
the scope of the instructions clearer.
Also, some minor improvements:
- Change references to other Markdown files to @file/path.md syntax that
Claude recognizes
- Update ambiguous/incorrect/outdated instructions
- Remove trailing slashes
- Fix broken file path references in other docs (including comments)
## Changes 🏗️
- Refactor homepage redirect logic to always point to `/`
- the `/` route handles whether to redirect to `/copilot` or `/library`
based on flag
- Simplify `useGetFlag` checks
- Add `<FeatureFlagRedirect />` and `<FeatureFlagPage />` wrapper
components
- helpers to do 1 thing or the other, depending on chat enabled/disabled
- avoids boilerplate code, checking flagss and redirects mistakes
(especially around race conditions with LD init )
## Checklist 📋
### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Log in / out of AutoGPT with flag disabled/enabled
- [x] Sign up to AutoGPT with flag disabled/enabled
- [x] Redirects to homepage always work `/`
- [x] Can't access Copilot with disabled flag
## Changes 🏗️https://github.com/user-attachments/assets/d9c12ac0-625c-4b38-8834-e494b5eda9c0
Add a "speech to text" feature in the Chat input fox of Copilot, similar
as what you have in ChatGPT.
## Checklist 📋
### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run locally and try the speech to text feature as part of the chat
input box
### For configuration changes:
We need to add `OPENAI_API_KEY=` to Vercel ( used in the Front-end )
both in Dev and Prod.
- [x] `.env.default` is updated or already compatible with my changes
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Implements persistent User Workspace storage for CoPilot, enabling
blocks to save and retrieve files across sessions. Files are stored in
session-scoped virtual paths (`/sessions/{session_id}/`).
Fixes SECRT-1833
### Changes 🏗️
**Database & Storage:**
- Add `UserWorkspace` and `UserWorkspaceFile` Prisma models
- Implement `WorkspaceStorageBackend` abstraction (GCS for cloud, local
filesystem for self-hosted)
- Add `workspace_id` and `session_id` fields to `ExecutionContext`
**Backend API:**
- Add REST endpoints: `GET/POST /api/workspace/files`, `GET/DELETE
/api/workspace/files/{id}`, `GET /api/workspace/files/{id}/download`
- Add CoPilot tools: `list_workspace_files`, `read_workspace_file`,
`write_workspace_file`
- Integrate workspace storage into `store_media_file()` - returns
`workspace://file-id` references
**Block Updates:**
- Refactor all file-handling blocks to use unified `ExecutionContext`
parameter
- Update media-generating blocks to persist outputs to workspace
(AIImageGenerator, AIImageCustomizer, FluxKontext, TalkingHead, FAL
video, Bannerbear, etc.)
**Frontend:**
- Render `workspace://` image references in chat via proxy endpoint
- Add "AI cannot see this image" overlay indicator
**CoPilot Context Mapping:**
- Session = Agent (graph_id) = Run (graph_exec_id)
- Files scoped to `/sessions/{session_id}/`
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
- [ ] Create CoPilot session, generate image with AIImageGeneratorBlock
- [ ] Verify image returns `workspace://file-id` (not base64)
- [ ] Verify image renders in chat with visibility indicator
- [ ] Verify workspace files persist across sessions
- [ ] Test list/read/write workspace files via CoPilot tools
- [ ] Test local storage backend for self-hosted deployments
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
🤖 Generated with [Claude Code](https://claude.ai/code)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Introduces a new persistent file-storage surface area (DB tables,
storage backends, download API, and chat tools) and rewires
`store_media_file()`/block execution context across many blocks, so
regressions could impact file handling, access control, or storage
costs.
>
> **Overview**
> Adds a **persistent per-user Workspace** (new
`UserWorkspace`/`UserWorkspaceFile` models plus `WorkspaceManager` +
`WorkspaceStorageBackend` with GCS/local implementations) and wires it
into the API via a new `/api/workspace/files/{file_id}/download` route
(including header-sanitized `Content-Disposition`) and shutdown
lifecycle hooks.
>
> Extends `ExecutionContext` to carry execution identity +
`workspace_id`/`session_id`, updates executor tooling to clone
node-specific contexts, and updates `run_block` (CoPilot) to create a
session-scoped workspace and synthetic graph/run/node IDs.
>
> Refactors `store_media_file()` to require `execution_context` +
`return_format` and to support `workspace://` references; migrates many
media/file-handling blocks and related tests to the new API and to
persist generated media as `workspace://...` (or fall back to data URIs
outside CoPilot), and adds CoPilot chat tools for
listing/reading/writing/deleting workspace files with safeguards against
context bloat.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
6abc70f793. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
## Summary
- Add `InvalidInputError` for validation errors (search term too long,
invalid pagination) - returns 400 instead of 500
- Remove redundant try/catch blocks in library routes - global exception
handlers already handle `ValueError`→400 and `NotFoundError`→404
- Aggregate embedding backfill errors and log once at the end instead of
per content type to prevent Sentry issue spam
## Test plan
- [x] Verify validation errors (search term >100 chars) return 400 Bad
Request
- [x] Verify NotFoundError still returns 404
- [x] Verify embedding errors are logged once at the end with aggregated
counts
Fixes AUTOGPT-SERVER-7K5, BUILDER-6NC
---------
Co-authored-by: Swifty <craigswift13@gmail.com>
Disable automatic onboarding redirects on signup/login while keeping the
checklist/wallet functional. Users now receive $5 (500 credits) on their
first visit to /copilot.
### Changes 🏗️
- **Frontend**: `shouldShowOnboarding()` now returns `false`, disabling
auto-redirects to `/onboarding`
- **Backend**: Added `VISIT_COPILOT` onboarding step with 500 credit
($5) reward
- **Frontend**: Copilot page automatically completes `VISIT_COPILOT`
step on mount
- **Database**: Migration to add `VISIT_COPILOT` to `OnboardingStep`
enum
NOTE: /onboarding/1-welcome -> /library now as shouldShowOnboardin is
always false
Users land directly on `/copilot` after signup/login and receive $5
invisibly (not shown in checklist UI).
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] New user signup (email/password) → lands on `/copilot`, wallet
shows 500 credits
- [x] Verified credits are only granted once (idempotent via onboarding
reward mechanism)
- [x] Existing user login (already granted flag set) → lands on
`/copilot`, no duplicate credits
- [x] Checklist/wallet remains functional
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
No configuration changes required.
---
OPEN-2967
🤖 Generated with [Claude Code](https://claude.ai/code)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces a new onboarding step and adjusts onboarding flow.
>
> - Adds `VISIT_COPILOT` onboarding step (+500 credits) with DB enum
migration and API/type updates
> - Copilot page auto-completes `VISIT_COPILOT` on mount to grant the
welcome bonus
> - Changes `/onboarding/enabled` to require user context and return
`false` when `CHAT` feature is enabled (skips legacy onboarding)
> - Wallet now refreshes credits on any onboarding `step_completed`
notification; confetti limited to visible tasks
> - Test flows updated to accept redirects to `copilot`/`library` and
verify authenticated state
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
ec5a5a4dfd. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
## Summary
Fixes the issue where the "Creating Agent" spinner doesn't auto-update
when agent generation completes - user had to refresh the browser.
**Changes:**
- **Frontend polling**: Add `onOperationStarted` callback to trigger
polling when `operation_started` is received via SSE
- **Polling backoff**: 2s, 4s, 6s, 8s... up to 30s max
- **Message deduplication**: Use content-based keys (role + content)
instead of timestamps to prevent duplicate messages
- **Message ordering**: Preserve server message order instead of
timestamp-based sorting
- **Debug cleanup**: Remove verbose console.log/console.info statements
## Test plan
- [ ] Start agent generation in copilot
- [ ] Verify "Creating Agent" spinner appears
- [ ] Wait for completion (2-5 min) WITHOUT refreshing
- [ ] Verify agent carousel appears automatically when done
- [ ] Verify no duplicate messages in chat
- [ ] Verify message order is correct (user → assistant → tool_call →
tool_response)
## Summary
Agent generation (`create_agent`, `edit_agent`) can take 1-5 minutes.
Previously, if the user closed their browser tab during this time:
1. The SSE connection would die
2. The tool execution would be cancelled via `CancelledError`
3. The result would be lost - even if the agent-generator service
completed successfully
This PR ensures long-running tool operations survive SSE disconnections.
### Changes 🏗️
**Backend:**
- **base.py**: Added `is_long_running` property to `BaseTool` for tools
to opt-in to background execution
- **create_agent.py / edit_agent.py**: Set `is_long_running = True`
- **models.py**: Added `OperationStartedResponse`,
`OperationPendingResponse`, `OperationInProgressResponse` types
- **service.py**: Modified `_yield_tool_call()` to:
- Check if tool is `is_long_running`
- Save "pending" message to chat history immediately
- Spawn background task that runs independently of SSE
- Return `operation_started` immediately (don't wait)
- Update chat history with result when background task completes
- Track running operations for idempotency (prevents duplicate ops on
refresh)
- **db.py**: Added `update_tool_message_content()` to update pending
messages
- **model.py**: Added `invalidate_session_cache()` to clear Redis after
background completion
**Frontend:**
- **useChatMessage.ts**: Added operation message types
- **helpers.ts**: Handle `operation_started`, `operation_pending`,
`operation_in_progress` response types
- **PendingOperationWidget**: New component to display operation status
with spinner
- **ChatMessage.tsx**: Render `PendingOperationWidget` for operation
messages
### How It Works
```
User Request → Save "pending" message → Spawn background task → Return immediately
↓
Task runs independently of SSE
↓
On completion: Update message in chat history
↓
User refreshes → Loads history → Sees result
```
### User Experience
1. User requests agent creation
2. Sees "Agent creation started. You can close this tab - check your
library in a few minutes."
3. Can close browser tab safely
4. When they return, chat shows the completed result (or error)
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] pyright passes (0 errors)
- [x] TypeScript checks pass
- [x] Formatters applied
### Test Plan
1. Start agent creation in copilot
2. Close browser tab immediately after seeing "operation_started"
3. Wait 2-3 minutes
4. Reopen chat
5. Verify: Chat history shows completion message and agent appears in
library
---------
Co-authored-by: Ubbe <hi@ubbe.dev>
## Changes 🏗️
On the **Copilot** page:
- prevent unnecessary sidebar repaints
- show a disclaimer when switching chats on the sidebar to terminate a
current stream
- handle loading better
- save streams better when disconnecting
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run the app locally and test the above
## Changes 🏗️
- **Fix infinite loop in copilot page**
- use Zustand selectors instead of full store object to get stable
function references
- **Centralize chat streaming logic**
- move all streaming files from `providers/chat-stream/` to
`components/contextual/Chat/` for better colocation and reusability
- **Rename `copilot-store` → `copilot-page-store`**: Clarify scope
- **Fix message duplication**
- Only replay chunks from active streams (not completed ones) since
backend already provides persisted messages in `initialMessages`
- **Auto-focus chat input**
- Focus textarea when streaming ends and input is re-enabled
- **Graceful error display**
- Render tool response errors in muted style (small text + warning icon)
instead of raw "Error: ..." text
## Checklist 📋
### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Navigate to copilot page - no infinite loop errors
- [x] Start a new chat, send message, verify streaming works
- [x] Navigate away and back to a completed session - no duplicate
messages
- [x] After stream completes, verify chat input receives focus
- [x] Trigger a tool error - verify it displays with muted styling
## Summary
Add interactive UI to collect user answers when the agent-generator
service returns clarifying questions during agent creation/editing.
Previously, when the backend asked clarifying questions, the frontend
would just display them as text with no way for users to answer. This
caused the chat to keep retrying without the necessary context.
## Changes
- **ChatMessageData type**: Add `clarification_needed` variant with
questions field
- **ClarificationQuestionsWidget**: New component with interactive form
to collect answers
- **parseToolResponse**: Detect and parse `clarification_needed`
responses from backend
- **ChatMessage**: Render the widget when clarification is needed
## How It Works
1. User requests to create/edit agent
2. Backend returns `ClarificationNeededResponse` with list of questions
3. Frontend shows interactive form with text inputs for each question
4. User fills in answers and clicks "Submit Answers"
5. Answers are sent back as context to the tool
6. Backend receives full context and continues
## UI Features
- Shows all questions with examples (if provided)
- Input validation (all questions must be answered to submit)
- Visual feedback (checkmarks when answered)
- Numbered questions for clarity
- Submit button disabled until all answered
- Follows same design pattern as `credentials_needed` flow
## Related
- Backend support for clarification was added in #11819
- Fixes the issue shown in the screenshot where users couldn't answer
clarifying questions
## Test plan
- [ ] Test creating agent that requires clarifying questions
- [ ] Verify questions are displayed in interactive form
- [ ] Verify all questions must be answered before submitting
- [ ] Verify answers are sent back to backend as context
- [ ] Verify agent creation continues with full context
## Changes 🏗️
This prevents Posthog from being initialised locally, where we should
not be collecting analytics during local development.
## Checklist 📋
### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run locally and test the above
### Changes 🏗️
- Refactored node execution results storage to maintain a history of
executions instead of just the latest result
- Added support for viewing accumulated output data across multiple
executions
- Implemented a cleaner UI for viewing historical execution results with
proper grouping
- Added functionality to clear execution results when starting a new run
- Created helper functions to normalize and process execution data
consistently
- Updated the NodeDataViewer component to display both latest and
historical execution data
- Added ability to view input data alongside output data in the
execution history
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create and run a flow with multiple blocks that produce output
- [x] Verify that execution results are properly accumulated and
displayed
- [x] Run the same flow multiple times and confirm historical data is
preserved
- [x] Test the "View more data" functionality to ensure it displays all
execution history
- [x] Verify that execution results are properly cleared when starting a
new run
Adds analytics tracking to the chat copilot system for better
observability of user interactions and agent operations.
### Changes 🏗️
**PostHog Analytics Integration:**
- Added `posthog` dependency (v7.6.0) to track chat events
- Created new tracking module (`backend/api/features/chat/tracking.py`)
with events:
- `chat_message_sent` - When a user sends a message
- `chat_tool_called` - When a tool is called (includes tool name)
- `chat_agent_run_success` - When an agent runs successfully
- `chat_agent_scheduled` - When an agent is scheduled
- `chat_trigger_setup` - When a trigger is set up
- Added PostHog configuration to settings:
- `POSTHOG_API_KEY` - API key for PostHog
- `POSTHOG_HOST` - PostHog host URL (defaults to
`https://us.i.posthog.com`)
**OpenRouter Tracing:**
- Added `user` and `session_id` fields to chat completion API calls for
OpenRouter tracing
- Added `posthogDistinctId` and `posthogProperties` (with environment)
to API calls
**Files Changed:**
- `backend/api/features/chat/tracking.py` - New PostHog tracking module
- `backend/api/features/chat/service.py` - Added user message tracking
and OpenRouter tracing
- `backend/api/features/chat/tools/__init__.py` - Added tool call
tracking
- `backend/api/features/chat/tools/run_agent.py` - Added agent
run/schedule tracking
- `backend/util/settings.py` - Added PostHog configuration fields
- `pyproject.toml` - Added posthog dependency
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified code passes linting and formatting
- [x] Verified PostHog client initializes correctly when API key is
provided
- [x] Verified tracking is gracefully skipped when PostHog is not
configured
#### For configuration changes:
- [ ] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
**New environment variables (optional):**
- `POSTHOG_API_KEY` - PostHog project API key
- `POSTHOG_HOST` - PostHog host URL (optional, defaults to US cloud)
## Summary
This PR implements comprehensive improvements to the human-in-the-loop
(HITL) review system, including safety features, architectural changes,
and bug fixes:
### Key Features
- **SECRT-1798: One-time safety popup** - Shows informational popup
before first run of AI-generated agents with sensitive actions/HITL
blocks
- **SECRT-1795: Auto-approval toggle UX** - Toggle in pending reviews
panel to auto-approve future actions from the same node
- **Node-specific auto-approval** - Changed from execution-specific to
node-specific using special key pattern
`auto_approve_{graph_exec_id}_{node_id}`
- **Consolidated approval checking** - Merged `check_auto_approval` into
`check_approval` using single OR query for better performance
- **Race condition prevention** - Added execution status check before
resuming to prevent duplicate execution when approving while graph is
running
- **Parallel auto-approval creation** - Uses `asyncio.gather` for better
performance when creating multiple auto-approval records
## Changes
### Backend Architecture
- **`human_review.py`**:
- Added `check_approval()` function that checks both normal and
auto-approval in single query
- Added `create_auto_approval_record()` for node-specific auto-approval
using special key pattern
- Added `get_auto_approve_key()` helper to generate consistent
auto-approval keys
- **`review/routes.py`**:
- Added execution status check before resuming to prevent race
conditions
- Refactored auto-approval record creation to use parallel execution
with `asyncio.gather`
- Removed obvious comments for cleaner code
- **`review/model.py`**: Added `auto_approve_future_actions` field to
`ReviewRequest`
- **`blocks/helpers/review.py`**: Updated to use consolidated
`check_approval` via database manager client
- **`executor/database.py`**: Exposed `check_approval` through
DatabaseManager RPC for block execution context
- **`data/block.py`**: Fixed safe mode checks for sensitive action
blocks
### Frontend
- **New `AIAgentSafetyPopup`** component with localStorage-based
one-time display
- **`PendingReviewsList`**:
- Replaced "Approve all future actions" button with toggle
- Toggle resets data to original values and disables editing when
enabled
- Shows warning message explaining auto-approval behavior
- **`RunAgentModal`**: Integrated safety popup before first run
- **`usePendingReviews`**: Added polling for real-time badge updates
- **`FloatingSafeModeToggle` & `SafeModeToggle`**: Simplified visibility
logic
- **`local-storage.ts`**: Added localStorage key for popup state
tracking
### Bug Fixes
- Fixed "Client is not connected to query engine" error by using
database manager client pattern
- Fixed race condition where approving reviews while graph is RUNNING
could queue execution twice
- Fixed migration to only drop FK constraint, not non-existent column
- Fixed card data reset when auto-approve toggle changes
### Code Quality
- Removed duplicate/obvious comments
- Moved imports to top-level instead of local scope in tests
- Used walrus operator for cleaner conditional assignments
- Parallel execution for auto-approval record creation
## Test plan
- [ ] Create an AI-generated agent with sensitive actions (e.g., email
sending)
- [ ] First run should show the safety popup before starting
- [ ] Subsequent runs should not show the popup
- [ ] Clear localStorage (`AI_AGENT_SAFETY_POPUP_SHOWN`) to verify popup
shows again
- [ ] Create an agent with human-in-the-loop blocks
- [ ] Run it and verify the pending reviews panel appears
- [ ] Enable the "Auto-approve all future actions" toggle
- [ ] Verify editing is disabled and shows warning message
- [ ] Click "Approve" and verify subsequent blocks from same node
auto-approve
- [ ] Verify auto-approval persists across multiple executions of same
graph
- [ ] Disable toggle and verify editing works normally
- [ ] Verify "Reject" button still works regardless of toggle state
- [ ] Test race condition: Approve reviews while graph is RUNNING
(should skip resume)
- [ ] Test race condition: Approve reviews while graph is REVIEW (should
resume)
- [ ] Verify pending reviews badge updates in real-time when new reviews
are created