Commit Graph

7824 Commits

Author SHA1 Message Date
Swifty
f4bf492f24 feat(platform): Add Redis-based SSE reconnection for long-running CoPilot operations (#11877)
## Changes 🏗️

Adds Redis-based SSE reconnection support for long-running CoPilot
operations (like Agent Generator), enabling clients to reconnect and
resume receiving updates after disconnection.

### What this does:
- **Stream Registry** - Redis-backed task tracking with message
persistence via Redis Streams
- **SSE Reconnection** - Clients can reconnect to active tasks using
`task_id` and `last_message_id`
- **Duplicate Message Fix** - Filters out in-progress assistant messages
from session response when active stream exists
- **Completion Consumer** - Handles background task completion
notifications via Redis Streams

### Architecture:
```
1. User sends message → Backend creates task in Redis
2. SSE chunks written to Redis Stream for persistence
3. Client receives chunks via SSE subscription
4. If client disconnects → Task continues in background
5. Client reconnects → GET /sessions/{id} returns active_stream info
6. Client subscribes to /tasks/{task_id}/stream with last_message_id
7. Missed messages replayed from Redis Stream
```

### Key endpoints:
- `GET /sessions/{session_id}` - Returns `active_stream` info if task is
running
- `GET /tasks/{task_id}/stream?last_message_id=X` - SSE endpoint for
reconnection
- `GET /tasks/{task_id}` - Get task status
- `POST /operations/{op_id}/complete` - Webhook for external service
completion

### Duplicate message fix:
When `GET /sessions/{id}` detects an active stream:
1. Filters out the in-progress assistant message from response
2. Returns `last_message_id="0-0"` so client replays stream from
beginning
3. Client receives complete response only through SSE (single source of
truth)

### Frontend changes:
- Task persistence in localStorage for cross-tab reconnection
- Stream event dispatcher handles reconnection flow
- Deduplication logic prevents duplicate messages

### Testing:
- Manual testing of reconnection scenarios
- Verified duplicate message fix works correctly

## Related
- Resolves SSE timeout issues for Agent Generator
- Fixes duplicate message bug on reconnection
2026-02-03 16:52:06 +01:00
Zamil Majdy
81e48c00a4 feat(copilot): add customize_agent tool for marketplace templates (#11943)
## Summary

Adds a new copilot tool that allows users to customize
marketplace/template agents using natural language before adding them to
their library.

This exposes the Agent Generator's `/api/template-modification` endpoint
to the copilot, which was previously not available.

## Changes

- **service.py**: Add `customize_template_external` to call Agent
Generator's template modification endpoint
- **core.py**: 
  - Add `customize_template` wrapper function
- Extract `graph_to_json` as a reusable function (was previously inline
in `get_agent_as_json`)
- **customize_agent.py**: New tool that:
  - Takes marketplace agent ID (format: `creator/slug`)
  - Fetches template from store via `store_db.get_agent()`
  - Calls Agent Generator for customization
  - Handles clarifying questions from the generator
  - Saves customized agent to user's library
- **__init__.py**: Register the tool in `TOOL_REGISTRY` for
auto-discovery

## Usage Flow

1. User searches marketplace: *"Find me a newsletter agent"*
2. Copilot calls `find_agent` → returns `autogpt/newsletter-writer`
3. User: *"Customize that agent to post to Discord instead of email"*
4. Copilot calls:
   ```
   customize_agent(
       agent_id="autogpt/newsletter-writer",
       modifications="Post to Discord instead of sending email"
   )
   ```
5. Agent Generator may ask clarifying questions (e.g., "What Discord
channel?")
6. Customized agent is saved to user's library

## Test plan

- [x] Verified tool imports correctly
- [x] Verified tool is registered in `TOOL_REGISTRY`
- [x] Verified OpenAI function schema is valid
- [x] Ran existing tests (`pytest backend/api/features/chat/tools/`) -
all pass
- [x] Type checker (`pyright`) passes with 0 errors
- [ ] Manual testing with copilot (requires Agent Generator service)
2026-02-03 14:59:25 +00:00
Otto
7dc53071e8 fix(backend): Add retry and error handling to block initialization (#11946)
## Summary
Adds retry logic and graceful error handling to `initialize_blocks()` to
prevent transient DB errors from crashing server startup.

## Problem
When a transient database error occurs during block initialization
(e.g., Prisma P1017 "Server has closed the connection"), the entire
server fails to start. This is overly aggressive since:
1. Blocks are already registered in memory
2. The DB sync is primarily for tracking/schema storage
3. One flaky connection shouldn't prevent the server from starting

**Triggered by:** [Sentry
AUTOGPT-SERVER-7PW](https://significant-gravitas.sentry.io/issues/7238733543/)

## Solution
- Add retry decorator (3 attempts with exponential backoff) for DB
operations
- On failure after retries, log a warning and continue to the next block
- Blocks remain available in memory even if DB sync fails
- Log summary of any failed blocks at the end

## Changes
- `autogpt_platform/backend/backend/data/block.py`: Wrap block DB sync
in retry logic with graceful fallback

## Testing
- Existing block initialization behavior unchanged on success
- On transient DB errors: retries up to 3 times, then continues with
warning
2026-02-03 12:43:30 +00:00
Zamil Majdy
4878665c66 Merge branch 'master' into dev 2026-02-03 16:01:23 +04:00
Zamil Majdy
678ddde751 refactor(backend): unify context compression into compress_context() (#11937)
## Background

This PR consolidates and unifies context window management for the
CoPilot backend.

### Problem
The CoPilot backend had **two separate implementations** of context
window management:

1. **`service.py` → `_manage_context_window()`** - Chat service
streaming/continuation
2. **`prompt.py` → `compress_prompt()`** - Sync LLM blocks

This duplication led to inconsistent behavior, maintenance burden, and
duplicate code.

---

## Solution: Unified `compress_context()`

A single async function that handles both use cases:

| Caller | Usage | Behavior |
|--------|-------|----------|
| **Chat service** | `compress_context(msgs, client=openai_client)` |
Summarization → Truncation |
| **LLM blocks** | `compress_context(msgs, client=None)` | Truncation
only (no API call) |

---

## Strategy Order

| Step | Description | Runs When |
|------|-------------|-----------|
| **1. LLM Summarization** | Summarize old messages into single context
message, keep recent 15 | Only if `client` provided |
| **2. Content Truncation** | Progressively truncate message content
(8192→4096→...→128 tokens) | If still over limit |
| **3. Middle-out Deletion** | Delete messages one at a time from center
outward | If still over limit |
| **4. First/Last Trim** | Truncate system prompt and last message
content | Last resort |

### Why This Order?

1. **Summarization first** (if available) - Preserves semantic meaning
of old messages
2. **Content truncation before deletion** - Keeps all conversation
turns, just shorter
3. **Middle-out deletion** - More granular than dropping all old
messages at once
4. **First/last trim** - Only touch system prompt as last resort

---

## Key Fixes

| Issue | Before | After |
|-------|--------|-------|
| **Socket leak** | `AsyncOpenAI` client never closed | `async with`
context manager |
| **Timeout ignored** | `timeout=30` passed to `create()` (invalid) |
`client.with_options(timeout=30)` |
| **OpenAI tool messages** | Not truncated | Properly truncated |
| **Tool pair integrity** | OpenAI format only | Both OpenAI + Anthropic
formats |

---

## Tool Format Support

`_ensure_tool_pairs_intact()` now supports both formats:

### OpenAI Format
```python
# Assistant with tool_calls
{"role": "assistant", "tool_calls": [{"id": "call_1", ...}]}
# Tool response
{"role": "tool", "tool_call_id": "call_1", "content": "result"}
```

### Anthropic Format
```python
# Assistant with tool_use
{"role": "assistant", "content": [{"type": "tool_use", "id": "toolu_1", ...}]}
# Tool result
{"role": "user", "content": [{"type": "tool_result", "tool_use_id": "toolu_1", ...}]}
```

---

## Files Changed

| File | Change |
|------|--------|
| `backend/util/prompt.py` | +450 lines: Add `CompressResult`,
`compress_context()`, helpers |
| `backend/api/features/chat/service.py` | -380 lines: Remove duplicate,
use thin wrapper |
| `backend/blocks/llm.py` | Migrate `llm_call()` to use
`compress_context(client=None)` |
| `backend/util/prompt_test.py` | +400 lines: Comprehensive tests
(OpenAI + Anthropic) |

### Removed
- `compress_prompt()` - Replaced by `compress_context(client=None)`
- `_manage_context_window()` - Replaced by
`compress_context(client=openai_client)`

---

## API

```python
async def compress_context(
    messages: list[dict],
    target_tokens: int = 120_000,
    *,
    model: str = "gpt-4o",
    client: AsyncOpenAI | None = None,  # None = truncation only
    keep_recent: int = 15,
    reserve: int = 2_048,
    start_cap: int = 8_192,
    floor_cap: int = 128,
) -> CompressResult:
    ...

@dataclass
class CompressResult:
    messages: list[dict]
    token_count: int
    was_compacted: bool
    error: str | None = None
    original_token_count: int = 0
    messages_summarized: int = 0
    messages_dropped: int = 0
```

---

## Tests Added

| Test Class | Coverage |
|------------|----------|
| `TestMsgTokens` | Token counting for regular messages, OpenAI tool
calls, Anthropic tool_use |
| `TestTruncateToolMessageContent` | OpenAI + Anthropic tool message
truncation |
| `TestEnsureToolPairsIntact` | OpenAI format (3 tests), Anthropic
format (3 tests), edge cases (3 tests) |
| `TestCompressContext` | No compression, truncation-only, tool pair
preservation, error handling |

---

## Checklist

- [x] Code follows project conventions
- [x] Linting passes (`poetry run format`)
- [x] Type checking passes (`pyright`)
- [x] Tests added for all new functions
- [x] Both OpenAI and Anthropic tool formats supported
- [x] Backward compatible behavior preserved
- [x] All review comments addressed
2026-02-03 10:36:10 +00:00
Otto
aef6f57cfd fix(scheduler): route db calls through DatabaseManager (#11941)
## Summary

Routes `increment_onboarding_runs` and `cleanup_expired_oauth_tokens`
through the DatabaseManager RPC client instead of calling Prisma
directly.

## Problem

The Scheduler service never connects its Prisma client. While
`add_graph_execution()` in `utils.py` has a fallback that routes through
DatabaseManager when Prisma isn't connected, subsequent calls in the
scheduler were hitting Prisma directly:

- `increment_onboarding_runs()` after successful graph execution
- `cleanup_expired_oauth_tokens()` in the scheduled job

These threw `ClientNotConnectedError`, caught by generic exception
handlers but spamming Sentry (~696K events since December per the
original analysis in #11926).

## Solution

Follow the same pattern as `utils.py`:
1. Add `cleanup_expired_oauth_tokens` to `DatabaseManager` and
`DatabaseManagerAsyncClient`
2. Update scheduler to use `get_database_manager_async_client()` for
both calls

## Changes

- **database.py**: Import and expose `cleanup_expired_oauth_tokens` in
both manager classes
- **scheduler.py**: Use `db.increment_onboarding_runs()` and
`db.cleanup_expired_oauth_tokens()` via the async client

## Impact

- Eliminates Sentry error spam from scheduler
- Onboarding run counters now actually increment for scheduled
executions
- OAuth token cleanup now actually runs

## Testing

Deploy to staging with scheduled graphs and verify:
1. No more `ClientNotConnectedError` in scheduler logs
2. `UserOnboarding.agentRuns` increments on scheduled runs
3. Expired OAuth tokens get cleaned up

Refs: #11926 (original fix that was closed)
2026-02-03 09:54:49 +00:00
Krzysztof Czerwinski
14cee1670a fix(backend): Prevent leaking Redis connections in ws_api (#11869)
Fixing
https://github.com/Significant-Gravitas/AutoGPT/pull/11297#discussion_r2496833421

### Changes 🏗️

1. event_bus.py - Added close method to AsyncRedisEventBus
- Added __init__ method to track the _pubsub instance attribute
- Added async def close() method that closes the PubSub connection
safely
- Modified listen_events() to store the pubsub reference in self._pubsub

2. ws_api.py - Added cleanup in event_broadcaster
- Wrapped the worker coroutines in try/finally block
- The finally block calls close() on both event buses to ensure cleanup
happens on any exit (including exceptions before retry)
2026-02-03 08:07:48 +00:00
Zamil Majdy
d81d1ce024 refactor(backend): extract context window management and fix LLM continuation (#11936)
## Summary

Fixes CoPilot becoming unresponsive after long-running tools complete,
and refactors context window management into a reusable function.

## Problem

After `create_agent` completes, `_generate_llm_continuation()` was
sending ALL messages to OpenRouter without any context compaction. When
conversations exceeded ~50 messages, OpenRouter rejected requests with
`provider_name: 'unknown'` (no provider would accept).

**Evidence:** Langfuse session
[44fbb803-092e-4ebd-b288-852959f4faf5](https://cloud.langfuse.com/project/cmk5qhf210003ad079sd8utjt/sessions/44fbb803-092e-4ebd-b288-852959f4faf5)
showed:
- Successful calls: 32-50 messages, known providers
- Failed calls: 52+ messages, `provider: unknown`, `completion: null`

## Changes

### Refactor: Extract reusable `_manage_context_window()`
- Counts tokens and checks against 120k threshold
- Summarizes old messages while keeping recent 15
- Ensures tool_call/tool_response pairs stay intact
- Progressive truncation if still over limit
- Returns `ContextWindowResult` dataclass with messages, token count,
compaction status, and errors
- Helper `_messages_to_dicts()` reduces code duplication

### Fix: Update `_generate_llm_continuation()`
- Now calls `_manage_context_window()` before making LLM calls
- Adds retry logic with exponential backoff (matching
`_stream_chat_chunks` behavior)

### Cleanup: Update `_stream_chat_chunks()`
- Replaced inline context management with call to
`_manage_context_window()`
- Eliminates code duplication between the two functions

## Testing

- Syntax check: 
- Ruff lint: 
- Import verification: 

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings
- [x] I have checked that my changes do not break existing functionality

---------

Co-authored-by: Otto <otto@agpt.co>
2026-02-03 04:41:43 +00:00
Zamil Majdy
2dd341c369 refactor: enrich description with context before calling Agent Generator (#11932)
## Summary
Updates the Agent Generator client to enrich the description with
context before calling, instead of sending `user_instruction` as a
separate parameter.

## Context
Companion PR to Significant-Gravitas/AutoGPT-Agent-Generator#105 which
removes unused parameters from the decompose API.

## Changes
- Enrich `description` with `context` (e.g., clarifying question
answers) before sending
- Remove `user_instruction` from request payload

## How it works
Both input boxes and chat box work the same way - the frontend
constructs a formatted message with answers and sends it as a user
message. The backend then enriches the description with this context
before calling the external Agent Generator service.
2026-02-03 02:31:07 +00:00
Otto
f7350c797a fix(copilot): use messages_dict in fallback context compaction (#11922)
## Summary

Fixes a bug where the fallback path in context compaction passes
`recent_messages` (already sliced) instead of `messages_dict` (full
conversation) to `_ensure_tool_pairs_intact`.

This caused the function to fail to find assistant messages that exist
in the original conversation but were outside the sliced window,
resulting in orphan tool_results being sent to Anthropic and rejected
with:

```
messages.66.content.0: unexpected tool_use_id found in tool_result blocks: toolu_vrtx_019bi1PDvEn7o5ByAxcS3VdA
```

## Changes

- Pass `messages_dict` and `slice_start` (relative to full conversation)
instead of `recent_messages` and `reduced_slice_start` (relative to
already-sliced list)

## Testing

This is a targeted fix for the fallback path. The bug only manifests
when:
1. Token count > 120k (triggers compaction)
2. Initial compaction + summary still exceeds limit (triggers fallback)
3. A tool_result's corresponding assistant is in `messages_dict` but not
in `recent_messages`

## Related

- Fixes SECRT-1861
- Related: SECRT-1839 (original fix that missed this code path)
2026-02-02 13:01:05 +00:00
Guofang.Tang
1081590384 feat(backend): cover webhook ingress URL route (#11747)
### Changes 🏗️

- Add a unit test to verify webhook ingress URL generation matches the
FastAPI route.

  ### Checklist 📋

  #### For code changes:

  - [x] I have clearly listed my changes in the PR description
  - [x] I have made a test plan
  - [x] I have tested my changes according to the test plan:
- [x] poetry run pytest backend/integrations/webhooks/utils_test.py
--confcutdir=backend/integrations/webhooks

  #### For configuration changes:

  - [x] .env.default is updated or already compatible with my changes
- [x] docker-compose.yml is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under Changes)



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added a unit test that validates webhook ingress URL generation
matches the application's resolved route (scheme, host, and path) for
provider-specific webhook endpoints, improving confidence in routing
behavior and helping prevent regressions.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-01 20:29:15 +00:00
Otto
7e37de8e30 fix: Include graph schemas for marketplace agents in Agent Generator (#11920)
## Problem

When marketplace agents are included in the `library_agents` payload
sent to the Agent Generator service, they were missing required fields
(`graph_id`, `graph_version`, `input_schema`, `output_schema`). This
caused Pydantic validation to fail with HTTP 422 Unprocessable Entity.

**Root cause:** The `MarketplaceAgentSummary` TypedDict had a different
shape than `LibraryAgentInfo` expected by the Agent Generator:
- Agent Generator expects: `graph_id`, `graph_version`, `name`,
`description`, `input_schema`, `output_schema`
- MarketplaceAgentSummary had: `name`, `description`, `sub_heading`,
`creator`, `is_marketplace_agent`

## Solution

1. **Add `agent_graph_id` to `StoreAgent` model** - The field was
already in the database view but not exposed
2. **Include `agentGraphId` in hybrid search SQL query** - Carry the
field through the search CTEs
3. **Update `search_marketplace_agents_for_generation()`** - Now fetches
full graph schemas using `get_graph()` and returns `LibraryAgentSummary`
(same type as library agents)
4. **Update deduplication logic** - Use `graph_id` instead of name for
more accurate deduplication

## Changes

- `backend/api/features/store/model.py`: Add optional `agent_graph_id`
field to `StoreAgent`
- `backend/api/features/store/hybrid_search.py`: Include `agentGraphId`
in SQL query columns
- `backend/api/features/store/db.py`: Map `agentGraphId` when creating
`StoreAgent` objects
- `backend/api/features/chat/tools/agent_generator/core.py`: Update
`search_marketplace_agents_for_generation()` to fetch and include full
graph schemas

## Testing

- [ ] Agent creation on dev with marketplace agents in context
- [ ] Verify no 422 errors from Agent Generator
- [ ] Verify marketplace agents can be used as sub-agents

Fixes: SECRT-1817

---------

Co-authored-by: majdyz <majdyz@users.noreply.github.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-01-31 19:17:36 +00:00
Otto
2abbb7fbc8 hotfix(backend): use discriminator for credential matching in run_block (#11908)
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 21:50:21 -06:00
Otto
7ee94d986c docs: add credentials prerequisites to create-basic-agent guide (#11913)
## Summary
Addresses #11785 - users were encountering `openai_api_key_credentials`
errors when following the create-basic-agent guide because it didn't
mention the need to configure API credentials before using AI blocks.

## Changes
Added a **Prerequisites** section to
`docs/platform/create-basic-agent.md` explaining:
- **Cloud users:** Go to Profile → Integrations to add API keys
- **Self-hosted (Docker):** Add keys to `autogpt_platform/backend/.env`
and restart services

Also added a note that the Calculator example doesn't need credentials,
making it a good first test.

## Related
- Issue: #11785
2026-01-31 03:05:31 +00:00
Nicholas Tindle
05b60db554 fix(backend/chat): Include input schema in discovery and validate unknown fields (#11916)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 21:00:43 -06:00
Zamil Majdy
18a1661fa3 feat: add library agent fetching with two-phase search for sub-agent support (#11889)
## Context

When users ask the chat to create agents, they may want to compose
workflows that reuse their existing agents as sub-agents. For this to
work, the Agent Generator service needs to know what agents the user has
available.

**Challenge:** Users can have large libraries with many agents. Fetching
all of them would be slow and provide too much context to the LLM.

## Solution

This PR implements **search-based library agent fetching** with a
**two-phase search** strategy:

1. **Phase 1 (Initial Search):** When the user describes their goal, we
search for relevant library agents using the goal as the search query
2. **Phase 2 (Step-Based Enrichment):** After the goal is decomposed
into steps, we extract keywords from those steps and search for
additional relevant agents

This ensures we find agents that are relevant to both the high-level
goal AND the specific steps identified.

### Example Flow

```
User goal: "Create an agent that fetches weather and sends a summary email"

Phase 1: Search for "weather email summary" → finds "Weather Fetcher" agent
Phase 2: After decomposition identifies steps like "send email notification"
         → searches "send email notification" → finds "Gmail Sender" agent
```

### Changes

**Library Agent Fetching:**
- `get_library_agents_for_generation()` - Search-based fetching from
user's library
- `search_marketplace_agents_for_generation()` - Search public
marketplace
- `get_all_relevant_agents_for_generation()` - Combines both with
deduplication

**Two-Phase Search:**
- `extract_search_terms_from_steps()` - Extracts keywords from
decomposed steps
- `enrich_library_agents_from_steps()` - Searches for additional agents
based on steps
- Integrated into `create_agent.py` as "Step 1.5" after goal
decomposition

**Type Safety:**
- Added `TypedDict` definitions: `LibraryAgentSummary`,
`MarketplaceAgentSummary`, `DecompositionStep`, `DecompositionResult`

### Design Decisions

- **Search-based, not fetch-all:** Scalable for large libraries
- **Library agents prioritized:** They have full schemas; marketplace
agents have basic info only
- **Deduplication by name and graph_id:** Prevents duplicates across
searches
- **Graceful degradation:** Failures don't block agent generation
- **Limited to 3 search terms:** Avoids excessive API calls during
enrichment

## Related PR
- Agent Generator:
https://github.com/Significant-Gravitas/AutoGPT-Agent-Generator/pull/103

## Test plan
- [x] `test_library_agents.py` - 19 tests covering all new functions
- [x] `test_service.py` - 4 tests for library_agents passthrough
- [ ] Integration test: Create agent with library sub-agent composition
2026-01-31 00:18:21 +00:00
Otto
b72521daa9 fix(readme): update broken self-hosting docs link (#11911)
## Summary
The self-hosting guide link in README.md was broken.

**Old link:** `https://docs.agpt.co/platform/getting-started/`
- Redirects to `https://agpt.co/docs/platform/getting-started`
- Returns HTTP 400 

**New link:**
`https://agpt.co/docs/platform/getting-started/getting-started`
- Works correctly 

## Changes
- Updated the self-hosting guide URL in README.md

Fixes #OPEN-2973
2026-01-30 22:59:45 +00:00
Ubbe
cc4839bedb hotfix(frontend): fix home redirect (3) (#11904)
### Changes 🏗️

Further improvements to LaunchDarkly initialisation and homepage
redirect...

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run the app locally with the flag disabled/enabled, and the
redirects work

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Ubbe <0ubbe@users.noreply.github.com>
2026-01-30 20:40:46 +07:00
Otto
dbbff04616 hotfix(frontend): LD remount (#11903)
## Changes 🏗️

Removes the `key` prop from `LDProvider` that was causing full remounts
when user context changed.

### The Problem

The `key={context.key}` prop was forcing React to unmount and remount
the entire LDProvider when switching from anonymous → logged in user:

```
1. Page loads, user loading → key="anonymous" → LD mounts → flags available 
2. User finishes loading → key="user-123" → React sees key changed
3. LDProvider UNMOUNTS → flags become undefined 
4. New LDProvider MOUNTS → initializes again → flags available 
```

This caused the flag values to cycle: `undefined → value → undefined →
value`

### The Fix

Remove the `key` prop. The LDProvider handles context changes internally
via the `context` prop, which triggers `identify()` without remounting
the provider.

## Checklist 📋

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  - [ ] Flag values don't flicker on page load
  - [ ] Flag values update correctly when logging in/out
  - [ ] No redirect race conditions

Related: SECRT-1845
2026-01-30 19:08:26 +07:00
Reinier van der Leer
350ad3591b fix(backend/chat): Filter credentials for graph execution by scopes (#11881)
[SECRT-1842: run_agent tool does not correctly use credentials - agents
fail with insufficient auth
scopes](https://linear.app/autogpt/issue/SECRT-1842)

### Changes 🏗️

- Include scopes in credentials filter in
`backend.api.features.chat.tools.utils.match_user_credentials_to_graph`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI must pass
- It's broken now and a simple change so we'll test in the dev
deployment
2026-01-30 11:01:51 +00:00
Ubbe
e6438b9a76 hotfix(frontend): use server redirect (#11900)
### Changes 🏗️

The page used a client-side redirect (`useEffect` + `router.replace`)
which only works after JavaScript loads and hydrates. On deployed sites,
if there's any delay or failure in JS execution, users see an
empty/black page because the component returns null.

**Fix:** Converted to a server-side redirect using redirect() from
next/navigation. This is a server component now, so:

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Tested locally but will see it fully working once deployed
2026-01-30 17:20:03 +07:00
Bently
de0ec3d388 chore(llm): remove deprecated Claude 3.7 Sonnet model with migration and defensive handling (#11841)
## Summary
Remove `claude-3-7-sonnet-20250219` from LLM model definitions ahead of
Anthropic's API retirement, with comprehensive migration and defensive
error handling.

## Background
Anthropic is retiring Claude 3.7 Sonnet (`claude-3-7-sonnet-20250219`)
on **February 19, 2026 at 9:00 AM PT**. This PR removes the model from
the platform and migrates existing users to prevent service
interruptions.

## Changes

### Code Changes
- Remove `CLAUDE_3_7_SONNET` enum member from `LlmModel` in `llm.py`
- Remove corresponding `ModelMetadata` entry
- Remove `CLAUDE_3_7_SONNET` from `StagehandRecommendedLlmModel` enum
- Remove `CLAUDE_3_7_SONNET` from block cost config
- Add `CLAUDE_4_5_SONNET` to `StagehandRecommendedLlmModel` enum
- Update Stagehand block defaults from `CLAUDE_3_7_SONNET` to
`CLAUDE_4_5_SONNET` (staying in Claude family)
- Add defensive error handling in `CredentialsFieldInfo.discriminate()`
for deprecated model values

### Database Migration
- Adds migration `20260126120000_migrate_claude_3_7_to_4_5_sonnet`
- Migrates `AgentNode.constantInput` model references
- Migrates `AgentNodeExecutionInputOutput.data` preset overrides

### Documentation
- Updated `docs/integrations/block-integrations/llm.md` to remove
deprecated model
- Updated `docs/integrations/block-integrations/stagehand/blocks.md` to
remove deprecated model and add Claude 4.5 Sonnet

## Notes
- Agent JSON files in `autogpt_platform/backend/agents/` still reference
this model in their provider mappings. These are auto-generated and
should be regenerated separately.

## Testing
- [ ] Verify LLM block still functions with remaining models
- [ ] Confirm no import errors in affected files
- [ ] Verify migration runs successfully
- [ ] Verify deprecated model gives helpful error message instead of
KeyError
2026-01-30 08:40:55 +00:00
Otto
e10ff8d37f fix(frontend): remove double flag check on homepage redirect (#11894)
## Changes 🏗️

Fixes the hard refresh redirect bug (SECRT-1845) by removing the double
feature flag check.

### Before (buggy)
```
/                    → checks flag → /copilot or /library
/copilot (layout)    → checks flag → /library if OFF
```

On hard refresh, two sequential LD checks created a race condition
window.

### After (fixed)
```
/                    → always redirects to /copilot
/copilot (layout)    → single flag check via FeatureFlagPage
```

Single check point = no double-check race condition.

## Root Cause

As identified by @0ubbe: the root page and copilot layout were both
checking the feature flag. On hard refresh with network latency, the
second check could fire before LaunchDarkly fully initialized, causing
users to be bounced to `/library`.

## Test Plan

- [ ] Hard refresh on `/` → should go to `/copilot` (flag ON)
- [ ] Hard refresh on `/copilot` → should stay on `/copilot` (flag ON)  
- [ ] With flag OFF → should redirect to `/library`
- [ ] Normal navigation still works

Fixes: SECRT-1845

cc @0ubbe
2026-01-30 08:32:50 +00:00
Otto
7cb1e588b0 fix(frontend): Refocus ChatInput after voice transcription completes (#11893)
## Summary
Refocuses the chat input textarea after voice transcription finishes,
allowing users to immediately use `spacebar+enter` to record and send
their prompt.

## Changes
- Added `inputId` parameter to `useVoiceRecording` hook
- After transcription completes, the input is automatically focused
- This improves the voice input UX flow

## Testing
1. Click mic button or press spacebar to record voice
2. Record a message and stop
3. After transcription completes, the input should be focused
4. User can now press Enter to send or spacebar to record again

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-01-30 14:49:05 +07:00
Otto
582c6cad36 fix(e2e): Make E2E test data deterministic and fix flaky tests (#11890)
## Summary
Fixes flaky E2E marketplace and library tests that were causing PRs to
be removed from the merge queue.

## Root Cause
1. **Test data was probabilistic** - `e2e_test_data.py` used random
chances (40% approve, then 20-50% feature), which could result in 0
featured agents
2. **Library pagination threshold wrong** - Checked `>= 10`, but page
size is 20
3. **Fixed timeouts** - Used `waitForTimeout(2000)` /
`waitForTimeout(10000)` instead of proper waits

## Changes

### Backend (`e2e_test_data.py`)
- Add guaranteed minimums: 8 featured agents, 5 featured creators, 10
top agents
- First N submissions are deterministically approved and featured
- Increase agents per user from 15 → 25 (for pagination with
page_size=20)
- Fix library agent creation to use constants instead of hardcoded `10`

### Frontend Tests
- `library.spec.ts`: Fix pagination threshold to `PAGE_SIZE` (20)
- `library.page.ts`: Replace 2s timeout with `networkidle` +
`waitForFunction`
- `marketplace.page.ts`: Add `networkidle` wait, 30s waits in
`getFirst*` methods
- `marketplace.spec.ts`: Replace 10s timeout with `waitForFunction`
- `marketplace-creator.spec.ts`: Add `networkidle` + element waits

## Related
- Closes SECRT-1848, SECRT-1849
- Should unblock #11841 and other PRs in merge queue

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
2026-01-30 05:12:35 +00:00
Nicholas Tindle
3b822cdaf7 chore(branchlet): Remove docs pip install from postCreateCmd (#11883)
### Changes 🏗️

- Removed `cd docs && pip install -r requirements.txt` from
`postCreateCmd` in `.branchlet.json`
- Docs dependencies will no longer be auto-installed during branchlet
worktree creation

### Rationale

The docs setup step was adding unnecessary overhead to the worktree
creation process. Developers who need to work on documentation can
manually install the docs requirements when needed.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified branchlet worktree creation still works without the docs
pip install step

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2026-01-30 00:31:34 +00:00
Zamil Majdy
b2eb4831bd feat(chat): improve agent generator error propagation (#11884)
## Summary
- Add helper functions in `service.py` to create standardized error
responses with `error_type` classification
- Update service functions to return error dicts instead of `None`,
preserving error details from the Agent Generator microservice
- Update `core.py` to pass through error responses properly
- Update `create_agent.py` to handle error responses with user-friendly
messages based on error type

## Error Types Now Propagated
| Error Type | Description | User Message |
|------------|-------------|--------------|
| `llm_parse_error` | LLM returned unparseable response | "The AI had
trouble understanding this request" |
| `llm_timeout` / `timeout` | Request timed out | "The request took too
long" |
| `llm_rate_limit` / `rate_limit` | Rate limited | "The service is
currently busy" |
| `validation_error` | Agent validation failed | "The generated agent
failed validation" |
| `connection_error` | Could not connect to Agent Generator | Generic
error message |
| `http_error` | HTTP error from Agent Generator | Generic error message
|
| `unknown` | Unclassified error | Generic error message |

## Motivation
This enables better debugging for issues like SECRT-1817 where
decomposition failed due to transient LLM errors but the root cause was
unclear in the logs. Now:
1. Error details from the Agent Generator microservice are preserved
2. Users get more helpful error messages based on error type
3. Debugging is easier with `error_type` in response details

## Related PR
- Agent Generator side:
https://github.com/Significant-Gravitas/AutoGPT-Agent-Generator/pull/102

## Test Plan
- [ ] Test decomposition with various error scenarios (timeout, parse
error)
- [ ] Verify user-friendly messages are shown based on error type
- [ ] Check that error details are logged properly
2026-01-29 19:53:40 +00:00
Reinier van der Leer
4cd5da678d refactor(claude): Split autogpt_platform/CLAUDE.md into project-specific files (#11788)
Split `autogpt_platform/CLAUDE.md` into project-specific files, to make
the scope of the instructions clearer.

Also, some minor improvements:

- Change references to other Markdown files to @file/path.md syntax that
Claude recognizes
- Update ambiguous/incorrect/outdated instructions
- Remove trailing slashes
- Fix broken file path references in other docs (including comments)
2026-01-29 17:33:02 +00:00
Ubbe
9538992eaf hotfix(frontend): flags copilot redirects (#11878)
## Changes 🏗️

- Refactor homepage redirect logic to always point to `/`
- the `/` route handles whether to redirect to `/copilot` or `/library`
based on flag
- Simplify `useGetFlag` checks
- Add `<FeatureFlagRedirect />` and `<FeatureFlagPage />` wrapper
components
- helpers to do 1 thing or the other, depending on chat enabled/disabled
- avoids boilerplate code, checking flagss and redirects mistakes
(especially around race conditions with LD init )

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Log in / out of AutoGPT with flag disabled/enabled
  - [x] Sign up to AutoGPT with flag disabled/enabled
  - [x] Redirects to homepage always work `/`
  - [x] Can't access Copilot with disabled flag
2026-01-29 18:13:28 +07:00
Ubbe
b94c83aacc feat(frontend): Copilot speech to text via Whisper model (#11871)
## Changes 🏗️


https://github.com/user-attachments/assets/d9c12ac0-625c-4b38-8834-e494b5eda9c0

Add a "speech to text" feature in the Chat input fox of Copilot, similar
as what you have in ChatGPT.

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run locally and try the speech to text feature as part of the chat
input box

### For configuration changes:

We need to add `OPENAI_API_KEY=` to Vercel ( used in the Front-end )
both in Dev and Prod.

- [x] `.env.default` is updated or already compatible with my changes

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 17:46:36 +07:00
Nicholas Tindle
7668c17d9c feat(platform): add User Workspace for persistent CoPilot file storage (#11867)
Implements persistent User Workspace storage for CoPilot, enabling
blocks to save and retrieve files across sessions. Files are stored in
session-scoped virtual paths (`/sessions/{session_id}/`).

Fixes SECRT-1833

### Changes 🏗️

**Database & Storage:**
- Add `UserWorkspace` and `UserWorkspaceFile` Prisma models
- Implement `WorkspaceStorageBackend` abstraction (GCS for cloud, local
filesystem for self-hosted)
- Add `workspace_id` and `session_id` fields to `ExecutionContext`

**Backend API:**
- Add REST endpoints: `GET/POST /api/workspace/files`, `GET/DELETE
/api/workspace/files/{id}`, `GET /api/workspace/files/{id}/download`
- Add CoPilot tools: `list_workspace_files`, `read_workspace_file`,
`write_workspace_file`
- Integrate workspace storage into `store_media_file()` - returns
`workspace://file-id` references

**Block Updates:**
- Refactor all file-handling blocks to use unified `ExecutionContext`
parameter
- Update media-generating blocks to persist outputs to workspace
(AIImageGenerator, AIImageCustomizer, FluxKontext, TalkingHead, FAL
video, Bannerbear, etc.)

**Frontend:**
- Render `workspace://` image references in chat via proxy endpoint
- Add "AI cannot see this image" overlay indicator

**CoPilot Context Mapping:**
- Session = Agent (graph_id) = Run (graph_exec_id)
- Files scoped to `/sessions/{session_id}/`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
- [ ] Create CoPilot session, generate image with AIImageGeneratorBlock
  - [ ] Verify image returns `workspace://file-id` (not base64)
  - [ ] Verify image renders in chat with visibility indicator
  - [ ] Verify workspace files persist across sessions
  - [ ] Test list/read/write workspace files via CoPilot tools
  - [ ] Test local storage backend for self-hosted deployments

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

🤖 Generated with [Claude Code](https://claude.ai/code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Introduces a new persistent file-storage surface area (DB tables,
storage backends, download API, and chat tools) and rewires
`store_media_file()`/block execution context across many blocks, so
regressions could impact file handling, access control, or storage
costs.
> 
> **Overview**
> Adds a **persistent per-user Workspace** (new
`UserWorkspace`/`UserWorkspaceFile` models plus `WorkspaceManager` +
`WorkspaceStorageBackend` with GCS/local implementations) and wires it
into the API via a new `/api/workspace/files/{file_id}/download` route
(including header-sanitized `Content-Disposition`) and shutdown
lifecycle hooks.
> 
> Extends `ExecutionContext` to carry execution identity +
`workspace_id`/`session_id`, updates executor tooling to clone
node-specific contexts, and updates `run_block` (CoPilot) to create a
session-scoped workspace and synthetic graph/run/node IDs.
> 
> Refactors `store_media_file()` to require `execution_context` +
`return_format` and to support `workspace://` references; migrates many
media/file-handling blocks and related tests to the new API and to
persist generated media as `workspace://...` (or fall back to data URIs
outside CoPilot), and adds CoPilot chat tools for
listing/reading/writing/deleting workspace files with safeguards against
context bloat.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
6abc70f793. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-01-29 05:49:47 +00:00
Nicholas Tindle
27b72062f2 Merge branch 'dev' autogpt-platform-beta-v0.6.45 2026-01-28 15:17:57 -06:00
Nicholas Tindle
e0dfae5732 fix(platform): evaluate chat flag after auth for correct redirect (#11873)
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 14:58:02 -06:00
Zamil Majdy
9a79a8d257 Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT 2026-01-28 12:32:17 -06:00
Zamil Majdy
7df867d645 Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-01-28 12:29:41 -06:00
Zamil Majdy
a9bf08748b Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT 2026-01-28 12:28:48 -06:00
Zamil Majdy
d855f79874 fix(platform): reduce Sentry alert spam for expected errors (#11872)
## Summary
- Add `InvalidInputError` for validation errors (search term too long,
invalid pagination) - returns 400 instead of 500
- Remove redundant try/catch blocks in library routes - global exception
handlers already handle `ValueError`→400 and `NotFoundError`→404
- Aggregate embedding backfill errors and log once at the end instead of
per content type to prevent Sentry issue spam

## Test plan
- [x] Verify validation errors (search term >100 chars) return 400 Bad
Request
- [x] Verify NotFoundError still returns 404
- [x] Verify embedding errors are logged once at the end with aggregated
counts

Fixes AUTOGPT-SERVER-7K5, BUILDER-6NC

---------

Co-authored-by: Swifty <craigswift13@gmail.com>
2026-01-29 01:28:27 +07:00
Swifty
dac99694fe Merge branch 'release/v0.6.44' v0.6.44 2026-01-28 12:19:13 +01:00
Nicholas Tindle
0953983944 feat(platform): disable onboarding redirects and add $5 signup bonus (#11862)
Disable automatic onboarding redirects on signup/login while keeping the
checklist/wallet functional. Users now receive $5 (500 credits) on their
first visit to /copilot.

### Changes 🏗️

- **Frontend**: `shouldShowOnboarding()` now returns `false`, disabling
auto-redirects to `/onboarding`
- **Backend**: Added `VISIT_COPILOT` onboarding step with 500 credit
($5) reward
- **Frontend**: Copilot page automatically completes `VISIT_COPILOT`
step on mount
- **Database**: Migration to add `VISIT_COPILOT` to `OnboardingStep`
enum

NOTE: /onboarding/1-welcome -> /library now as shouldShowOnboardin is
always false

Users land directly on `/copilot` after signup/login and receive $5
invisibly (not shown in checklist UI).

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] New user signup (email/password) → lands on `/copilot`, wallet
shows 500 credits
- [x] Verified credits are only granted once (idempotent via onboarding
reward mechanism)
- [x] Existing user login (already granted flag set) → lands on
`/copilot`, no duplicate credits
  - [x] Checklist/wallet remains functional

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

No configuration changes required.

---

OPEN-2967

🤖 Generated with [Claude Code](https://claude.ai/code)


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces a new onboarding step and adjusts onboarding flow.
> 
> - Adds `VISIT_COPILOT` onboarding step (+500 credits) with DB enum
migration and API/type updates
> - Copilot page auto-completes `VISIT_COPILOT` on mount to grant the
welcome bonus
> - Changes `/onboarding/enabled` to require user context and return
`false` when `CHAT` feature is enabled (skips legacy onboarding)
> - Wallet now refreshes credits on any onboarding `step_completed`
notification; confetti limited to visible tasks
> - Test flows updated to accept redirects to `copilot`/`library` and
verify authenticated state
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
ec5a5a4dfd. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-28 07:22:46 +00:00
Zamil Majdy
0058cd3ba6 fix(frontend): auto-poll for long-running tool completion (#11866)
## Summary
Fixes the issue where the "Creating Agent" spinner doesn't auto-update
when agent generation completes - user had to refresh the browser.

**Changes:**
- **Frontend polling**: Add `onOperationStarted` callback to trigger
polling when `operation_started` is received via SSE
- **Polling backoff**: 2s, 4s, 6s, 8s... up to 30s max
- **Message deduplication**: Use content-based keys (role + content)
instead of timestamps to prevent duplicate messages
- **Message ordering**: Preserve server message order instead of
timestamp-based sorting
- **Debug cleanup**: Remove verbose console.log/console.info statements

## Test plan
- [ ] Start agent generation in copilot
- [ ] Verify "Creating Agent" spinner appears
- [ ] Wait for completion (2-5 min) WITHOUT refreshing
- [ ] Verify agent carousel appears automatically when done
- [ ] Verify no duplicate messages in chat
- [ ] Verify message order is correct (user → assistant → tool_call →
tool_response)
2026-01-28 10:03:21 +07:00
Nicholas Tindle
ea035224bc feat(copilot): Increase max_agent_runs and max_agent_schedules (#11865)
<!-- Clearly explain the need for these changes: -->
Config change to increase the max times an agent can run in the chat and
the max number of scheduels created by copilot in one chat

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Increases per-chat operational limits for Copilot.
> 
> - Bumps `max_agent_runs` default from `3` to `30` in `ChatConfig`
> - Bumps `max_agent_schedules` default from `3` to `30` in `ChatConfig`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
93cbae6d27. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2026-01-28 01:08:02 +00:00
Nicholas Tindle
62813a1ea6 Delete backend/blocks/video/__init__.py (#11864)
<!-- Clearly explain the need for these changes: -->
oops file
### Changes 🏗️

<!-- Concisely describe all of the changes made in this pull request:
-->
removes file that should have not been commited

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Removes erroneous `backend/blocks/video/__init__.py`, eliminating an
unintended `video` package.
> 
> - Deletes a placeholder comment-only file
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
3b84576c33. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2026-01-28 00:58:49 +00:00
Bently
67405f7eb9 fix(copilot): ensure tool_call/tool_response pairs stay intact during context compaction (#11863)
## Summary

Fixes context compaction breaking tool_call/tool_response pairs, causing
API validation errors.

## Problem

When context compaction slices messages with `messages[-KEEP_RECENT:]`,
a naive slice can separate an assistant message containing `tool_calls`
from its corresponding tool response messages. This causes API
validation errors like:

```
messages.0.content.1: unexpected 'tool_use_id' found in 'tool_result' blocks: orphan_12345.
Each 'tool_result' block must have a corresponding 'tool_use' block in the previous message.
```

## Solution

Added `_ensure_tool_pairs_intact()` helper function that:
1. Detects orphan tool responses in a slice (tool messages whose
`tool_call_id` has no matching assistant message)
2. Extends the slice backwards to include the missing assistant messages
3. Falls back to removing orphan tool responses if the assistant cannot
be found (edge case)

Applied this safeguard to:
- The initial `KEEP_RECENT` slice (line ~990)
- The progressive fallback slices when still over token limit (line
~1079)

## Testing

- Syntax validated with `python -m py_compile`
- Logic reviewed for correctness

## Linear

Fixes SECRT-1839

---
*Debugged by Toran & Orion in #agpt Discord*
2026-01-28 00:21:54 +00:00
Zamil Majdy
171ff6e776 feat(backend): persist long-running tool results to survive SSE disconnects (#11856)
## Summary

Agent generation (`create_agent`, `edit_agent`) can take 1-5 minutes.
Previously, if the user closed their browser tab during this time:
1. The SSE connection would die
2. The tool execution would be cancelled via `CancelledError`
3. The result would be lost - even if the agent-generator service
completed successfully

This PR ensures long-running tool operations survive SSE disconnections.

### Changes 🏗️

**Backend:**
- **base.py**: Added `is_long_running` property to `BaseTool` for tools
to opt-in to background execution
- **create_agent.py / edit_agent.py**: Set `is_long_running = True`
- **models.py**: Added `OperationStartedResponse`,
`OperationPendingResponse`, `OperationInProgressResponse` types
- **service.py**: Modified `_yield_tool_call()` to:
  - Check if tool is `is_long_running`
  - Save "pending" message to chat history immediately
  - Spawn background task that runs independently of SSE
  - Return `operation_started` immediately (don't wait)
  - Update chat history with result when background task completes
- Track running operations for idempotency (prevents duplicate ops on
refresh)
- **db.py**: Added `update_tool_message_content()` to update pending
messages
- **model.py**: Added `invalidate_session_cache()` to clear Redis after
background completion

**Frontend:**
- **useChatMessage.ts**: Added operation message types
- **helpers.ts**: Handle `operation_started`, `operation_pending`,
`operation_in_progress` response types
- **PendingOperationWidget**: New component to display operation status
with spinner
- **ChatMessage.tsx**: Render `PendingOperationWidget` for operation
messages

### How It Works

```
User Request → Save "pending" message → Spawn background task → Return immediately
                                              ↓
                                     Task runs independently of SSE
                                              ↓
                                     On completion: Update message in chat history
                                              ↓
                                     User refreshes → Loads history → Sees result
```

### User Experience

1. User requests agent creation
2. Sees "Agent creation started. You can close this tab - check your
library in a few minutes."
3. Can close browser tab safely
4. When they return, chat shows the completed result (or error)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] pyright passes (0 errors)
  - [x] TypeScript checks pass
  - [x] Formatters applied

### Test Plan

1. Start agent creation in copilot
2. Close browser tab immediately after seeing "operation_started" 
3. Wait 2-3 minutes
4. Reopen chat
5. Verify: Chat history shows completion message and agent appears in
library

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
2026-01-28 05:09:34 +07:00
Lluis Agusti
349b1f9c79 hotfix(frontend): copilot session handling refinements... 2026-01-28 02:53:45 +07:00
Lluis Agusti
277b0537e9 hotfix(frontend): copilot simplication... 2026-01-28 02:10:18 +07:00
Ubbe
071b3bb5cd fix(frontend): more copilot refinements (#11858)
## Changes 🏗️

On the **Copilot** page:

- prevent unnecessary sidebar repaints 
- show a disclaimer when switching chats on the sidebar to terminate a
current stream
- handle loading better
- save streams better when disconnecting


### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and test the above
2026-01-28 00:49:28 +07:00
Swifty
2134d777be fix(backend): exclude disabled blocks from chat search and indexing (#11854)
## Summary

Disabled blocks (e.g., webhook blocks without `platform_base_url`
configured) were being indexed and returned in chat tool search results.
This PR ensures they are properly filtered out.

### Changes 🏗️

- **find_block.py**: Skip disabled blocks when enriching search results
- **content_handlers.py**: 
  - Skip disabled blocks during embedding indexing
- Update `get_stats()` to only count enabled blocks for accurate
coverage metrics

### Why

Blocks can be disabled for various reasons (missing OAuth config, no
platform URL for webhooks, etc.). These blocks shouldn't appear in
search results since users cannot use them.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified disabled blocks are filtered from search results
  - [x] Verified disabled blocks are not indexed
  - [x] Verified stats accurately reflect enabled block count
2026-01-27 15:21:13 +00:00
Ubbe
962824c8af refactor(frontend): copilot session management stream updates (#11853)
## Changes 🏗️

- **Fix infinite loop in copilot page**
- use Zustand selectors instead of full store object to get stable
function references
- **Centralize chat streaming logic**
- move all streaming files from `providers/chat-stream/` to
`components/contextual/Chat/` for better colocation and reusability
- **Rename `copilot-store` → `copilot-page-store`**: Clarify scope
- **Fix message duplication**
- Only replay chunks from active streams (not completed ones) since
backend already provides persisted messages in `initialMessages`
- **Auto-focus chat input**
  - Focus textarea when streaming ends and input is re-enabled
- **Graceful error display**
- Render tool response errors in muted style (small text + warning icon)
instead of raw "Error: ..." text

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Navigate to copilot page - no infinite loop errors
  - [x] Start a new chat, send message, verify streaming works
- [x] Navigate away and back to a completed session - no duplicate
messages
  - [x] After stream completes, verify chat input receives focus
  - [x] Trigger a tool error - verify it displays with muted styling
2026-01-27 22:09:25 +07:00
Zamil Majdy
3e9d5d0d50 fix(backend): handle race condition in review processing gracefully (#11845)
## Summary
- Fixes race condition when multiple concurrent requests try to process
the same reviews (e.g., double-click, multiple browser tabs)
- Previously the second request would fail with "Reviews not found,
access denied, or not in WAITING status"
- Now handles this gracefully by treating already-processed reviews with
the same decision as success

## Changes
- Added `get_reviews_by_node_exec_ids()` function that fetches reviews
regardless of status
- Modified `process_all_reviews_for_execution()` to handle
already-processed reviews
- Updated route to use idempotent validation

## Test plan
- [x] Linter passes (`poetry run ruff check`)
- [x] Type checker passes (`poetry run pyright`)
- [x] Formatter passes (`poetry run format`)
- [ ] Manual testing: double-click approve button should not cause
errors

Fixes AUTOGPT-SERVER-7HE
2026-01-27 21:43:31 +07:00