Commit Graph

7874 Commits

Author SHA1 Message Date
Nicholas Tindle
2602913f7d fix(blocks): use MediaFileType for loop block video_out
Aligns with all other video blocks which use MediaFileType for output.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 22:40:32 -06:00
Nicholas Tindle
cbab0c2149 fix(blocks): validate transition_duration against shortest clip in concat
Prevents confusing MoviePy errors when transition_duration exceeds
the duration of any input clip.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 22:24:09 -06:00
Nicholas Tindle
dd2ccd31d3 fix(backend): add ImageMagick to Dockerfile for text overlay block
MoviePy's TextClip requires ImageMagick to render text. Without it,
VideoTextOverlayBlock fails in Docker with "ImageMagick not found".

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 22:15:19 -06:00
Nicholas Tindle
5923e5c211 fix(backend): use correct lookup method for path-based workspace refs
When adding MIME type to workspace refs without fragments, use
get_file_info_by_path() for path-based refs instead of get_file_info()
which expects an ID.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 22:09:09 -06:00
Nicholas Tindle
2deb9b96cf fix(blocks): strip chapter metadata from audio files in add_audio block
Audio files (especially audiobooks, podcasts) can have chapter metadata
that causes MoviePy to crash with IndexError.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 21:58:04 -06:00
Nicholas Tindle
4a84f34b31 fix(blocks): add proper resource cleanup to video blocks
Add try/finally blocks to close MoviePy clips in loop.py, duration.py,
and add_audio.py. Without this, file handles and ffmpeg subprocesses
leak on each block execution.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 21:54:54 -06:00
Nicholas Tindle
ecdf2f8c09 fix(backend): use fixed dates in credit_test to avoid month boundary flakiness
The test was failing because datetime.now().replace(month=1) in February
still produced a January 2026 date, which matched the "35 days ago"
updatedAt also landing in January 2026 - causing the refill to be skipped.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 21:43:54 -06:00
Nicholas Tindle
fe264545a9 fix(blocks): prevent exponential filename growth when chaining video blocks
Add extract_source_name() utility that strips accumulated
{node_exec_id}_{operation}_ prefixes so chained blocks produce flat
filenames like copilot-uuid_clip_My_Video.mp4 instead of embedding
the entire input path.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 21:27:06 -06:00
Nicholas Tindle
93d49b34d2 Merge branch 'dev' into feature/video-editing-blocks 2026-02-04 18:58:59 -06:00
Nicholas Tindle
e871edd387 feat(frontend): render video/audio workspace refs using MIME fragment
Use the #mimeType fragment on workspace:// URIs to determine media
category (video/image/audio) instead of relying solely on keyword
matching. Adds video rendering support in MarkdownContent, broader
format support in render.tsx, and enhanced output handling in the
builder's DataTable and NodeOutputs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 18:25:55 -06:00
Nicholas Tindle
ed26d81c2b fix(backend): strip MIME fragment from workspace URI before resolving file ID
store_media_file() appends #mimeType fragments to workspace URIs on output
(e.g. workspace://abc123#video/mp4). When a downstream block receives this
URI as input, the same function was using "abc123#video/mp4" as a file ID,
which fails. Add parse_workspace_uri() utility to cleanly separate the file
ID from the MIME fragment, fixing block-to-block file passing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 18:24:43 -06:00
Nicholas Tindle
2b08ea9d4d fix(blocks): add yt-dlp format fallback for video download
Append `/best` to yt-dlp format string so it falls back to best
available format when the preferred quality isn't available.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 17:56:16 -06:00
Nicholas Tindle
448e8b8876 fix(blocks): strip chapter metadata before MoviePy opens files
MoviePy 2.x crashes with IndexError when parsing video/audio files
with embedded chapter metadata (moviepy#2419). Add strip_chapters_inplace()
which runs `ffmpeg -map_chapters -1 -codec copy` to remove chapters
without re-encoding before any VideoFileClip/AudioFileClip call.

Applied to all video blocks: clip, concat, text_overlay, narration,
add_audio, loop, and duration.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 17:52:12 -06:00
Otto
4f908d5cb3 fix(platform): Improve Linear Search Block [SECRT-1880] (#11967)
## Summary

Implements [SECRT-1880](https://linear.app/autogpt/issue/SECRT-1880) -
Improve Linear Search Block

## Changes

### Models (`models.py`)
- Added `State` model with `id`, `name`, and `type` fields for workflow
state information
- Added `state: State | None` field to `Issue` model

### API Client (`_api.py`)
- Updated `try_search_issues()` to:
- Add `max_results` parameter (default 10, was ~50) to reduce token
usage
  - Add `team_id` parameter for team filtering
- Return `createdAt`, `state`, `project`, and `assignee` fields in
results
- Fixed `try_get_team_by_name()` to return descriptive error message
when team not found instead of crashing with `IndexError`

### Block (`issues.py`)
- Added `max_results` input parameter (1-100, default 10)
- Added `team_name` input parameter for optional team filtering
- Added `error` output field for graceful error handling
- Added categories (`PRODUCTIVITY`, `ISSUE_TRACKING`)
- Updated test fixtures to include new fields

## Breaking Changes

| Change | Before | After | Mitigation |
|--------|--------|-------|------------|
| Default result count | ~50 | 10 | Users can set `max_results` up to
100 if needed |

## Non-Breaking Changes

- `state` field added to `Issue` (optional, defaults to `None`)
- `max_results` param added (has default value)
- `team_name` param added (optional, defaults to `None`)
- `error` output added (follows established pattern from GitHub blocks)

## Testing

- [x] Format/lint checks pass
- [x] Unit test fixtures updated

Resolves SECRT-1880

---------

Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Toran Bruce Richards <Torantulino@users.noreply.github.com>
2026-02-04 22:54:46 +00:00
Reinier van der Leer
c1aa684743 fix(platform/chat): Filter host-scoped credentials for run_agent tool (#11905)
- Fixes [SECRT-1851: \[Copilot\] `run_agent` tool doesn't filter
host-scoped credentials](https://linear.app/autogpt/issue/SECRT-1851)
- Follow-up to #11881

### Changes 🏗️

- Filter host-scoped credentials for `run_agent` tool
- Tighten validation on host input field in `HostScopedCredentialsModal`
- Use netloc (w/ port) rather than just hostname (w/o port) as host
scope

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - Create graph that requires host-scoped credentials to work
  - Create host-scoped credentials with a *different* host
  - Try to have Copilot run the graph
  - [x] -> no matching credentials available
  - Create new credentials
  - [x] -> works

---------

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-02-04 16:27:14 +00:00
Otto
7e5b84cc5c fix(copilot): update homepage copy to focus on problem discovery (#11956)
## Summary
Update the CoPilot homepage to shift from "what do you want to
automate?" to "tell me about your problems." This lowers the barrier to
engagement by letting users describe their work frustrations instead of
requiring them to identify automations themselves.

## Changes
| Element | Before | After |
|---------|--------|-------|
| Headline | "What do you want to automate?" | "Tell me about your work
— I'll find what to automate." |
| Placeholder | "You can search or just ask - e.g. 'create a blog post
outline'" | "What's your role and what eats up most of your day? e.g.
'I'm a real estate agent and I hate...'" |
| Button 1 | "Show me what I can automate" | "I don't know where to
start, just ask me stuff" |
| Button 2 | "Design a custom workflow" | "I do the same thing every
week and it's killing me" |
| Button 3 | "Help me with content creation" | "Help me find where I'm
wasting my time" |
| Container | max-w-2xl | max-w-3xl |

> **Note on container width:** The `max-w-2xl` → `max-w-3xl` change is
just to keep the longer headline on one line. This works but may not be
the ideal solution — @lluis-xai should advise on the proper approach.

## Why This Matters
The current UX assumes users know what they want to automate. In
reality, most users know what frustrates them but can't identify
automations. The current screen blocks Otto from starting the discovery
conversation that leads to useful recommendations.

## Files Changed
- `autogpt_platform/frontend/src/app/(platform)/copilot/page.tsx` —
headline, placeholder, container width
- `autogpt_platform/frontend/src/app/(platform)/copilot/helpers.ts` —
quick action button text

Resolves: [SECRT-1876](https://linear.app/autogpt/issue/SECRT-1876)

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
autogpt-platform-beta-v0.6.46
2026-02-04 17:38:58 +07:00
Swifty
09cb313211 fix(frontend): Prevent reflected XSS in OAuth callback route (#11963)
## Summary

Fixes a reflected cross-site scripting (XSS) vulnerability in the OAuth
callback route.

**Security Issue:**
https://github.com/Significant-Gravitas/AutoGPT/security/code-scanning/202

### Vulnerability

The OAuth callback route at
`frontend/src/app/(platform)/auth/integrations/oauth_callback/route.ts`
was writing user-controlled data directly into an HTML response without
proper sanitization. This allowed potential attackers to inject
malicious scripts via OAuth callback parameters.

### Fix

Added a `safeJsonStringify()` function that escapes characters that
could break out of the script context:
- `<` → `\u003c`
- `>` → `\u003e`  
- `&` → `\u0026`

This prevents any user-provided values from being interpreted as
HTML/script content when embedded in the response.

### References

- [OWASP XSS Prevention Cheat
Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html)
- [CWE-79: Improper Neutralization of Input During Web Page
Generation](https://cwe.mitre.org/data/definitions/79.html)

## Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified the OAuth callback still functions correctly
- [x] Confirmed special characters in OAuth responses are properly
escaped
2026-02-04 10:53:17 +01:00
Krzysztof Czerwinski
c026485023 feat(frontend): Disable auto-opening wallet (#11961)
<!-- Clearly explain the need for these changes: -->

### Changes 🏗️

- Disable auto-opening Wallet for first time user and on credit increase
- Remove no longer needed `lastSeenCredits` state and storage

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Wallet doesn't open automatically
2026-02-04 06:11:41 +00:00
Nicholas Tindle
1eabc60484 Merge commit from fork
Fixes GHSA-rc89-6g7g-v5v7 / CVE-2026-22038

The logger.info() calls were explicitly logging API keys via
get_secret_value(), exposing credentials in plaintext logs.

Changes:
- Replace info-level credential logging with debug-level provider logging
- Remove all explicit secret value logging from observe/act/extract blocks

Co-authored-by: Otto <otto@agpt.co>
2026-02-03 11:16:57 -06:00
Swifty
f4bf492f24 feat(platform): Add Redis-based SSE reconnection for long-running CoPilot operations (#11877)
## Changes 🏗️

Adds Redis-based SSE reconnection support for long-running CoPilot
operations (like Agent Generator), enabling clients to reconnect and
resume receiving updates after disconnection.

### What this does:
- **Stream Registry** - Redis-backed task tracking with message
persistence via Redis Streams
- **SSE Reconnection** - Clients can reconnect to active tasks using
`task_id` and `last_message_id`
- **Duplicate Message Fix** - Filters out in-progress assistant messages
from session response when active stream exists
- **Completion Consumer** - Handles background task completion
notifications via Redis Streams

### Architecture:
```
1. User sends message → Backend creates task in Redis
2. SSE chunks written to Redis Stream for persistence
3. Client receives chunks via SSE subscription
4. If client disconnects → Task continues in background
5. Client reconnects → GET /sessions/{id} returns active_stream info
6. Client subscribes to /tasks/{task_id}/stream with last_message_id
7. Missed messages replayed from Redis Stream
```

### Key endpoints:
- `GET /sessions/{session_id}` - Returns `active_stream` info if task is
running
- `GET /tasks/{task_id}/stream?last_message_id=X` - SSE endpoint for
reconnection
- `GET /tasks/{task_id}` - Get task status
- `POST /operations/{op_id}/complete` - Webhook for external service
completion

### Duplicate message fix:
When `GET /sessions/{id}` detects an active stream:
1. Filters out the in-progress assistant message from response
2. Returns `last_message_id="0-0"` so client replays stream from
beginning
3. Client receives complete response only through SSE (single source of
truth)

### Frontend changes:
- Task persistence in localStorage for cross-tab reconnection
- Stream event dispatcher handles reconnection flow
- Deduplication logic prevents duplicate messages

### Testing:
- Manual testing of reconnection scenarios
- Verified duplicate message fix works correctly

## Related
- Resolves SSE timeout issues for Agent Generator
- Fixes duplicate message bug on reconnection
2026-02-03 16:52:06 +01:00
Zamil Majdy
81e48c00a4 feat(copilot): add customize_agent tool for marketplace templates (#11943)
## Summary

Adds a new copilot tool that allows users to customize
marketplace/template agents using natural language before adding them to
their library.

This exposes the Agent Generator's `/api/template-modification` endpoint
to the copilot, which was previously not available.

## Changes

- **service.py**: Add `customize_template_external` to call Agent
Generator's template modification endpoint
- **core.py**: 
  - Add `customize_template` wrapper function
- Extract `graph_to_json` as a reusable function (was previously inline
in `get_agent_as_json`)
- **customize_agent.py**: New tool that:
  - Takes marketplace agent ID (format: `creator/slug`)
  - Fetches template from store via `store_db.get_agent()`
  - Calls Agent Generator for customization
  - Handles clarifying questions from the generator
  - Saves customized agent to user's library
- **__init__.py**: Register the tool in `TOOL_REGISTRY` for
auto-discovery

## Usage Flow

1. User searches marketplace: *"Find me a newsletter agent"*
2. Copilot calls `find_agent` → returns `autogpt/newsletter-writer`
3. User: *"Customize that agent to post to Discord instead of email"*
4. Copilot calls:
   ```
   customize_agent(
       agent_id="autogpt/newsletter-writer",
       modifications="Post to Discord instead of sending email"
   )
   ```
5. Agent Generator may ask clarifying questions (e.g., "What Discord
channel?")
6. Customized agent is saved to user's library

## Test plan

- [x] Verified tool imports correctly
- [x] Verified tool is registered in `TOOL_REGISTRY`
- [x] Verified OpenAI function schema is valid
- [x] Ran existing tests (`pytest backend/api/features/chat/tools/`) -
all pass
- [x] Type checker (`pyright`) passes with 0 errors
- [ ] Manual testing with copilot (requires Agent Generator service)
2026-02-03 14:59:25 +00:00
Otto
7dc53071e8 fix(backend): Add retry and error handling to block initialization (#11946)
## Summary
Adds retry logic and graceful error handling to `initialize_blocks()` to
prevent transient DB errors from crashing server startup.

## Problem
When a transient database error occurs during block initialization
(e.g., Prisma P1017 "Server has closed the connection"), the entire
server fails to start. This is overly aggressive since:
1. Blocks are already registered in memory
2. The DB sync is primarily for tracking/schema storage
3. One flaky connection shouldn't prevent the server from starting

**Triggered by:** [Sentry
AUTOGPT-SERVER-7PW](https://significant-gravitas.sentry.io/issues/7238733543/)

## Solution
- Add retry decorator (3 attempts with exponential backoff) for DB
operations
- On failure after retries, log a warning and continue to the next block
- Blocks remain available in memory even if DB sync fails
- Log summary of any failed blocks at the end

## Changes
- `autogpt_platform/backend/backend/data/block.py`: Wrap block DB sync
in retry logic with graceful fallback

## Testing
- Existing block initialization behavior unchanged on success
- On transient DB errors: retries up to 3 times, then continues with
warning
2026-02-03 12:43:30 +00:00
Zamil Majdy
4878665c66 Merge branch 'master' into dev 2026-02-03 16:01:23 +04:00
Zamil Majdy
678ddde751 refactor(backend): unify context compression into compress_context() (#11937)
## Background

This PR consolidates and unifies context window management for the
CoPilot backend.

### Problem
The CoPilot backend had **two separate implementations** of context
window management:

1. **`service.py` → `_manage_context_window()`** - Chat service
streaming/continuation
2. **`prompt.py` → `compress_prompt()`** - Sync LLM blocks

This duplication led to inconsistent behavior, maintenance burden, and
duplicate code.

---

## Solution: Unified `compress_context()`

A single async function that handles both use cases:

| Caller | Usage | Behavior |
|--------|-------|----------|
| **Chat service** | `compress_context(msgs, client=openai_client)` |
Summarization → Truncation |
| **LLM blocks** | `compress_context(msgs, client=None)` | Truncation
only (no API call) |

---

## Strategy Order

| Step | Description | Runs When |
|------|-------------|-----------|
| **1. LLM Summarization** | Summarize old messages into single context
message, keep recent 15 | Only if `client` provided |
| **2. Content Truncation** | Progressively truncate message content
(8192→4096→...→128 tokens) | If still over limit |
| **3. Middle-out Deletion** | Delete messages one at a time from center
outward | If still over limit |
| **4. First/Last Trim** | Truncate system prompt and last message
content | Last resort |

### Why This Order?

1. **Summarization first** (if available) - Preserves semantic meaning
of old messages
2. **Content truncation before deletion** - Keeps all conversation
turns, just shorter
3. **Middle-out deletion** - More granular than dropping all old
messages at once
4. **First/last trim** - Only touch system prompt as last resort

---

## Key Fixes

| Issue | Before | After |
|-------|--------|-------|
| **Socket leak** | `AsyncOpenAI` client never closed | `async with`
context manager |
| **Timeout ignored** | `timeout=30` passed to `create()` (invalid) |
`client.with_options(timeout=30)` |
| **OpenAI tool messages** | Not truncated | Properly truncated |
| **Tool pair integrity** | OpenAI format only | Both OpenAI + Anthropic
formats |

---

## Tool Format Support

`_ensure_tool_pairs_intact()` now supports both formats:

### OpenAI Format
```python
# Assistant with tool_calls
{"role": "assistant", "tool_calls": [{"id": "call_1", ...}]}
# Tool response
{"role": "tool", "tool_call_id": "call_1", "content": "result"}
```

### Anthropic Format
```python
# Assistant with tool_use
{"role": "assistant", "content": [{"type": "tool_use", "id": "toolu_1", ...}]}
# Tool result
{"role": "user", "content": [{"type": "tool_result", "tool_use_id": "toolu_1", ...}]}
```

---

## Files Changed

| File | Change |
|------|--------|
| `backend/util/prompt.py` | +450 lines: Add `CompressResult`,
`compress_context()`, helpers |
| `backend/api/features/chat/service.py` | -380 lines: Remove duplicate,
use thin wrapper |
| `backend/blocks/llm.py` | Migrate `llm_call()` to use
`compress_context(client=None)` |
| `backend/util/prompt_test.py` | +400 lines: Comprehensive tests
(OpenAI + Anthropic) |

### Removed
- `compress_prompt()` - Replaced by `compress_context(client=None)`
- `_manage_context_window()` - Replaced by
`compress_context(client=openai_client)`

---

## API

```python
async def compress_context(
    messages: list[dict],
    target_tokens: int = 120_000,
    *,
    model: str = "gpt-4o",
    client: AsyncOpenAI | None = None,  # None = truncation only
    keep_recent: int = 15,
    reserve: int = 2_048,
    start_cap: int = 8_192,
    floor_cap: int = 128,
) -> CompressResult:
    ...

@dataclass
class CompressResult:
    messages: list[dict]
    token_count: int
    was_compacted: bool
    error: str | None = None
    original_token_count: int = 0
    messages_summarized: int = 0
    messages_dropped: int = 0
```

---

## Tests Added

| Test Class | Coverage |
|------------|----------|
| `TestMsgTokens` | Token counting for regular messages, OpenAI tool
calls, Anthropic tool_use |
| `TestTruncateToolMessageContent` | OpenAI + Anthropic tool message
truncation |
| `TestEnsureToolPairsIntact` | OpenAI format (3 tests), Anthropic
format (3 tests), edge cases (3 tests) |
| `TestCompressContext` | No compression, truncation-only, tool pair
preservation, error handling |

---

## Checklist

- [x] Code follows project conventions
- [x] Linting passes (`poetry run format`)
- [x] Type checking passes (`pyright`)
- [x] Tests added for all new functions
- [x] Both OpenAI and Anthropic tool formats supported
- [x] Backward compatible behavior preserved
- [x] All review comments addressed
2026-02-03 10:36:10 +00:00
Otto
aef6f57cfd fix(scheduler): route db calls through DatabaseManager (#11941)
## Summary

Routes `increment_onboarding_runs` and `cleanup_expired_oauth_tokens`
through the DatabaseManager RPC client instead of calling Prisma
directly.

## Problem

The Scheduler service never connects its Prisma client. While
`add_graph_execution()` in `utils.py` has a fallback that routes through
DatabaseManager when Prisma isn't connected, subsequent calls in the
scheduler were hitting Prisma directly:

- `increment_onboarding_runs()` after successful graph execution
- `cleanup_expired_oauth_tokens()` in the scheduled job

These threw `ClientNotConnectedError`, caught by generic exception
handlers but spamming Sentry (~696K events since December per the
original analysis in #11926).

## Solution

Follow the same pattern as `utils.py`:
1. Add `cleanup_expired_oauth_tokens` to `DatabaseManager` and
`DatabaseManagerAsyncClient`
2. Update scheduler to use `get_database_manager_async_client()` for
both calls

## Changes

- **database.py**: Import and expose `cleanup_expired_oauth_tokens` in
both manager classes
- **scheduler.py**: Use `db.increment_onboarding_runs()` and
`db.cleanup_expired_oauth_tokens()` via the async client

## Impact

- Eliminates Sentry error spam from scheduler
- Onboarding run counters now actually increment for scheduled
executions
- OAuth token cleanup now actually runs

## Testing

Deploy to staging with scheduled graphs and verify:
1. No more `ClientNotConnectedError` in scheduler logs
2. `UserOnboarding.agentRuns` increments on scheduled runs
3. Expired OAuth tokens get cleaned up

Refs: #11926 (original fix that was closed)
2026-02-03 09:54:49 +00:00
Krzysztof Czerwinski
14cee1670a fix(backend): Prevent leaking Redis connections in ws_api (#11869)
Fixing
https://github.com/Significant-Gravitas/AutoGPT/pull/11297#discussion_r2496833421

### Changes 🏗️

1. event_bus.py - Added close method to AsyncRedisEventBus
- Added __init__ method to track the _pubsub instance attribute
- Added async def close() method that closes the PubSub connection
safely
- Modified listen_events() to store the pubsub reference in self._pubsub

2. ws_api.py - Added cleanup in event_broadcaster
- Wrapped the worker coroutines in try/finally block
- The finally block calls close() on both event buses to ensure cleanup
happens on any exit (including exceptions before retry)
2026-02-03 08:07:48 +00:00
Zamil Majdy
d81d1ce024 refactor(backend): extract context window management and fix LLM continuation (#11936)
## Summary

Fixes CoPilot becoming unresponsive after long-running tools complete,
and refactors context window management into a reusable function.

## Problem

After `create_agent` completes, `_generate_llm_continuation()` was
sending ALL messages to OpenRouter without any context compaction. When
conversations exceeded ~50 messages, OpenRouter rejected requests with
`provider_name: 'unknown'` (no provider would accept).

**Evidence:** Langfuse session
[44fbb803-092e-4ebd-b288-852959f4faf5](https://cloud.langfuse.com/project/cmk5qhf210003ad079sd8utjt/sessions/44fbb803-092e-4ebd-b288-852959f4faf5)
showed:
- Successful calls: 32-50 messages, known providers
- Failed calls: 52+ messages, `provider: unknown`, `completion: null`

## Changes

### Refactor: Extract reusable `_manage_context_window()`
- Counts tokens and checks against 120k threshold
- Summarizes old messages while keeping recent 15
- Ensures tool_call/tool_response pairs stay intact
- Progressive truncation if still over limit
- Returns `ContextWindowResult` dataclass with messages, token count,
compaction status, and errors
- Helper `_messages_to_dicts()` reduces code duplication

### Fix: Update `_generate_llm_continuation()`
- Now calls `_manage_context_window()` before making LLM calls
- Adds retry logic with exponential backoff (matching
`_stream_chat_chunks` behavior)

### Cleanup: Update `_stream_chat_chunks()`
- Replaced inline context management with call to
`_manage_context_window()`
- Eliminates code duplication between the two functions

## Testing

- Syntax check: 
- Ruff lint: 
- Import verification: 

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings
- [x] I have checked that my changes do not break existing functionality

---------

Co-authored-by: Otto <otto@agpt.co>
2026-02-03 04:41:43 +00:00
Zamil Majdy
2dd341c369 refactor: enrich description with context before calling Agent Generator (#11932)
## Summary
Updates the Agent Generator client to enrich the description with
context before calling, instead of sending `user_instruction` as a
separate parameter.

## Context
Companion PR to Significant-Gravitas/AutoGPT-Agent-Generator#105 which
removes unused parameters from the decompose API.

## Changes
- Enrich `description` with `context` (e.g., clarifying question
answers) before sending
- Remove `user_instruction` from request payload

## How it works
Both input boxes and chat box work the same way - the frontend
constructs a formatted message with answers and sends it as a user
message. The backend then enriches the description with this context
before calling the external Agent Generator service.
2026-02-03 02:31:07 +00:00
Otto
f7350c797a fix(copilot): use messages_dict in fallback context compaction (#11922)
## Summary

Fixes a bug where the fallback path in context compaction passes
`recent_messages` (already sliced) instead of `messages_dict` (full
conversation) to `_ensure_tool_pairs_intact`.

This caused the function to fail to find assistant messages that exist
in the original conversation but were outside the sliced window,
resulting in orphan tool_results being sent to Anthropic and rejected
with:

```
messages.66.content.0: unexpected tool_use_id found in tool_result blocks: toolu_vrtx_019bi1PDvEn7o5ByAxcS3VdA
```

## Changes

- Pass `messages_dict` and `slice_start` (relative to full conversation)
instead of `recent_messages` and `reduced_slice_start` (relative to
already-sliced list)

## Testing

This is a targeted fix for the fallback path. The bug only manifests
when:
1. Token count > 120k (triggers compaction)
2. Initial compaction + summary still exceeds limit (triggers fallback)
3. A tool_result's corresponding assistant is in `messages_dict` but not
in `recent_messages`

## Related

- Fixes SECRT-1861
- Related: SECRT-1839 (original fix that missed this code path)
2026-02-02 13:01:05 +00:00
Guofang.Tang
1081590384 feat(backend): cover webhook ingress URL route (#11747)
### Changes 🏗️

- Add a unit test to verify webhook ingress URL generation matches the
FastAPI route.

  ### Checklist 📋

  #### For code changes:

  - [x] I have clearly listed my changes in the PR description
  - [x] I have made a test plan
  - [x] I have tested my changes according to the test plan:
- [x] poetry run pytest backend/integrations/webhooks/utils_test.py
--confcutdir=backend/integrations/webhooks

  #### For configuration changes:

  - [x] .env.default is updated or already compatible with my changes
- [x] docker-compose.yml is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under Changes)



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added a unit test that validates webhook ingress URL generation
matches the application's resolved route (scheme, host, and path) for
provider-specific webhook endpoints, improving confidence in routing
behavior and helping prevent regressions.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-01 20:29:15 +00:00
Otto
7e37de8e30 fix: Include graph schemas for marketplace agents in Agent Generator (#11920)
## Problem

When marketplace agents are included in the `library_agents` payload
sent to the Agent Generator service, they were missing required fields
(`graph_id`, `graph_version`, `input_schema`, `output_schema`). This
caused Pydantic validation to fail with HTTP 422 Unprocessable Entity.

**Root cause:** The `MarketplaceAgentSummary` TypedDict had a different
shape than `LibraryAgentInfo` expected by the Agent Generator:
- Agent Generator expects: `graph_id`, `graph_version`, `name`,
`description`, `input_schema`, `output_schema`
- MarketplaceAgentSummary had: `name`, `description`, `sub_heading`,
`creator`, `is_marketplace_agent`

## Solution

1. **Add `agent_graph_id` to `StoreAgent` model** - The field was
already in the database view but not exposed
2. **Include `agentGraphId` in hybrid search SQL query** - Carry the
field through the search CTEs
3. **Update `search_marketplace_agents_for_generation()`** - Now fetches
full graph schemas using `get_graph()` and returns `LibraryAgentSummary`
(same type as library agents)
4. **Update deduplication logic** - Use `graph_id` instead of name for
more accurate deduplication

## Changes

- `backend/api/features/store/model.py`: Add optional `agent_graph_id`
field to `StoreAgent`
- `backend/api/features/store/hybrid_search.py`: Include `agentGraphId`
in SQL query columns
- `backend/api/features/store/db.py`: Map `agentGraphId` when creating
`StoreAgent` objects
- `backend/api/features/chat/tools/agent_generator/core.py`: Update
`search_marketplace_agents_for_generation()` to fetch and include full
graph schemas

## Testing

- [ ] Agent creation on dev with marketplace agents in context
- [ ] Verify no 422 errors from Agent Generator
- [ ] Verify marketplace agents can be used as sub-agents

Fixes: SECRT-1817

---------

Co-authored-by: majdyz <majdyz@users.noreply.github.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-01-31 19:17:36 +00:00
Otto
2abbb7fbc8 hotfix(backend): use discriminator for credential matching in run_block (#11908)
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 21:50:21 -06:00
Otto
7ee94d986c docs: add credentials prerequisites to create-basic-agent guide (#11913)
## Summary
Addresses #11785 - users were encountering `openai_api_key_credentials`
errors when following the create-basic-agent guide because it didn't
mention the need to configure API credentials before using AI blocks.

## Changes
Added a **Prerequisites** section to
`docs/platform/create-basic-agent.md` explaining:
- **Cloud users:** Go to Profile → Integrations to add API keys
- **Self-hosted (Docker):** Add keys to `autogpt_platform/backend/.env`
and restart services

Also added a note that the Calculator example doesn't need credentials,
making it a good first test.

## Related
- Issue: #11785
2026-01-31 03:05:31 +00:00
Nicholas Tindle
05b60db554 fix(backend/chat): Include input schema in discovery and validate unknown fields (#11916)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 21:00:43 -06:00
Zamil Majdy
18a1661fa3 feat: add library agent fetching with two-phase search for sub-agent support (#11889)
## Context

When users ask the chat to create agents, they may want to compose
workflows that reuse their existing agents as sub-agents. For this to
work, the Agent Generator service needs to know what agents the user has
available.

**Challenge:** Users can have large libraries with many agents. Fetching
all of them would be slow and provide too much context to the LLM.

## Solution

This PR implements **search-based library agent fetching** with a
**two-phase search** strategy:

1. **Phase 1 (Initial Search):** When the user describes their goal, we
search for relevant library agents using the goal as the search query
2. **Phase 2 (Step-Based Enrichment):** After the goal is decomposed
into steps, we extract keywords from those steps and search for
additional relevant agents

This ensures we find agents that are relevant to both the high-level
goal AND the specific steps identified.

### Example Flow

```
User goal: "Create an agent that fetches weather and sends a summary email"

Phase 1: Search for "weather email summary" → finds "Weather Fetcher" agent
Phase 2: After decomposition identifies steps like "send email notification"
         → searches "send email notification" → finds "Gmail Sender" agent
```

### Changes

**Library Agent Fetching:**
- `get_library_agents_for_generation()` - Search-based fetching from
user's library
- `search_marketplace_agents_for_generation()` - Search public
marketplace
- `get_all_relevant_agents_for_generation()` - Combines both with
deduplication

**Two-Phase Search:**
- `extract_search_terms_from_steps()` - Extracts keywords from
decomposed steps
- `enrich_library_agents_from_steps()` - Searches for additional agents
based on steps
- Integrated into `create_agent.py` as "Step 1.5" after goal
decomposition

**Type Safety:**
- Added `TypedDict` definitions: `LibraryAgentSummary`,
`MarketplaceAgentSummary`, `DecompositionStep`, `DecompositionResult`

### Design Decisions

- **Search-based, not fetch-all:** Scalable for large libraries
- **Library agents prioritized:** They have full schemas; marketplace
agents have basic info only
- **Deduplication by name and graph_id:** Prevents duplicates across
searches
- **Graceful degradation:** Failures don't block agent generation
- **Limited to 3 search terms:** Avoids excessive API calls during
enrichment

## Related PR
- Agent Generator:
https://github.com/Significant-Gravitas/AutoGPT-Agent-Generator/pull/103

## Test plan
- [x] `test_library_agents.py` - 19 tests covering all new functions
- [x] `test_service.py` - 4 tests for library_agents passthrough
- [ ] Integration test: Create agent with library sub-agent composition
2026-01-31 00:18:21 +00:00
Otto
b72521daa9 fix(readme): update broken self-hosting docs link (#11911)
## Summary
The self-hosting guide link in README.md was broken.

**Old link:** `https://docs.agpt.co/platform/getting-started/`
- Redirects to `https://agpt.co/docs/platform/getting-started`
- Returns HTTP 400 

**New link:**
`https://agpt.co/docs/platform/getting-started/getting-started`
- Works correctly 

## Changes
- Updated the self-hosting guide URL in README.md

Fixes #OPEN-2973
2026-01-30 22:59:45 +00:00
Ubbe
cc4839bedb hotfix(frontend): fix home redirect (3) (#11904)
### Changes 🏗️

Further improvements to LaunchDarkly initialisation and homepage
redirect...

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run the app locally with the flag disabled/enabled, and the
redirects work

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Ubbe <0ubbe@users.noreply.github.com>
2026-01-30 20:40:46 +07:00
Otto
dbbff04616 hotfix(frontend): LD remount (#11903)
## Changes 🏗️

Removes the `key` prop from `LDProvider` that was causing full remounts
when user context changed.

### The Problem

The `key={context.key}` prop was forcing React to unmount and remount
the entire LDProvider when switching from anonymous → logged in user:

```
1. Page loads, user loading → key="anonymous" → LD mounts → flags available 
2. User finishes loading → key="user-123" → React sees key changed
3. LDProvider UNMOUNTS → flags become undefined 
4. New LDProvider MOUNTS → initializes again → flags available 
```

This caused the flag values to cycle: `undefined → value → undefined →
value`

### The Fix

Remove the `key` prop. The LDProvider handles context changes internally
via the `context` prop, which triggers `identify()` without remounting
the provider.

## Checklist 📋

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  - [ ] Flag values don't flicker on page load
  - [ ] Flag values update correctly when logging in/out
  - [ ] No redirect race conditions

Related: SECRT-1845
2026-01-30 19:08:26 +07:00
Reinier van der Leer
350ad3591b fix(backend/chat): Filter credentials for graph execution by scopes (#11881)
[SECRT-1842: run_agent tool does not correctly use credentials - agents
fail with insufficient auth
scopes](https://linear.app/autogpt/issue/SECRT-1842)

### Changes 🏗️

- Include scopes in credentials filter in
`backend.api.features.chat.tools.utils.match_user_credentials_to_graph`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI must pass
- It's broken now and a simple change so we'll test in the dev
deployment
2026-01-30 11:01:51 +00:00
Ubbe
e6438b9a76 hotfix(frontend): use server redirect (#11900)
### Changes 🏗️

The page used a client-side redirect (`useEffect` + `router.replace`)
which only works after JavaScript loads and hydrates. On deployed sites,
if there's any delay or failure in JS execution, users see an
empty/black page because the component returns null.

**Fix:** Converted to a server-side redirect using redirect() from
next/navigation. This is a server component now, so:

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Tested locally but will see it fully working once deployed
2026-01-30 17:20:03 +07:00
Bently
de0ec3d388 chore(llm): remove deprecated Claude 3.7 Sonnet model with migration and defensive handling (#11841)
## Summary
Remove `claude-3-7-sonnet-20250219` from LLM model definitions ahead of
Anthropic's API retirement, with comprehensive migration and defensive
error handling.

## Background
Anthropic is retiring Claude 3.7 Sonnet (`claude-3-7-sonnet-20250219`)
on **February 19, 2026 at 9:00 AM PT**. This PR removes the model from
the platform and migrates existing users to prevent service
interruptions.

## Changes

### Code Changes
- Remove `CLAUDE_3_7_SONNET` enum member from `LlmModel` in `llm.py`
- Remove corresponding `ModelMetadata` entry
- Remove `CLAUDE_3_7_SONNET` from `StagehandRecommendedLlmModel` enum
- Remove `CLAUDE_3_7_SONNET` from block cost config
- Add `CLAUDE_4_5_SONNET` to `StagehandRecommendedLlmModel` enum
- Update Stagehand block defaults from `CLAUDE_3_7_SONNET` to
`CLAUDE_4_5_SONNET` (staying in Claude family)
- Add defensive error handling in `CredentialsFieldInfo.discriminate()`
for deprecated model values

### Database Migration
- Adds migration `20260126120000_migrate_claude_3_7_to_4_5_sonnet`
- Migrates `AgentNode.constantInput` model references
- Migrates `AgentNodeExecutionInputOutput.data` preset overrides

### Documentation
- Updated `docs/integrations/block-integrations/llm.md` to remove
deprecated model
- Updated `docs/integrations/block-integrations/stagehand/blocks.md` to
remove deprecated model and add Claude 4.5 Sonnet

## Notes
- Agent JSON files in `autogpt_platform/backend/agents/` still reference
this model in their provider mappings. These are auto-generated and
should be regenerated separately.

## Testing
- [ ] Verify LLM block still functions with remaining models
- [ ] Confirm no import errors in affected files
- [ ] Verify migration runs successfully
- [ ] Verify deprecated model gives helpful error message instead of
KeyError
2026-01-30 08:40:55 +00:00
Otto
e10ff8d37f fix(frontend): remove double flag check on homepage redirect (#11894)
## Changes 🏗️

Fixes the hard refresh redirect bug (SECRT-1845) by removing the double
feature flag check.

### Before (buggy)
```
/                    → checks flag → /copilot or /library
/copilot (layout)    → checks flag → /library if OFF
```

On hard refresh, two sequential LD checks created a race condition
window.

### After (fixed)
```
/                    → always redirects to /copilot
/copilot (layout)    → single flag check via FeatureFlagPage
```

Single check point = no double-check race condition.

## Root Cause

As identified by @0ubbe: the root page and copilot layout were both
checking the feature flag. On hard refresh with network latency, the
second check could fire before LaunchDarkly fully initialized, causing
users to be bounced to `/library`.

## Test Plan

- [ ] Hard refresh on `/` → should go to `/copilot` (flag ON)
- [ ] Hard refresh on `/copilot` → should stay on `/copilot` (flag ON)  
- [ ] With flag OFF → should redirect to `/library`
- [ ] Normal navigation still works

Fixes: SECRT-1845

cc @0ubbe
2026-01-30 08:32:50 +00:00
Otto
7cb1e588b0 fix(frontend): Refocus ChatInput after voice transcription completes (#11893)
## Summary
Refocuses the chat input textarea after voice transcription finishes,
allowing users to immediately use `spacebar+enter` to record and send
their prompt.

## Changes
- Added `inputId` parameter to `useVoiceRecording` hook
- After transcription completes, the input is automatically focused
- This improves the voice input UX flow

## Testing
1. Click mic button or press spacebar to record voice
2. Record a message and stop
3. After transcription completes, the input should be focused
4. User can now press Enter to send or spacebar to record again

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-01-30 14:49:05 +07:00
Otto
582c6cad36 fix(e2e): Make E2E test data deterministic and fix flaky tests (#11890)
## Summary
Fixes flaky E2E marketplace and library tests that were causing PRs to
be removed from the merge queue.

## Root Cause
1. **Test data was probabilistic** - `e2e_test_data.py` used random
chances (40% approve, then 20-50% feature), which could result in 0
featured agents
2. **Library pagination threshold wrong** - Checked `>= 10`, but page
size is 20
3. **Fixed timeouts** - Used `waitForTimeout(2000)` /
`waitForTimeout(10000)` instead of proper waits

## Changes

### Backend (`e2e_test_data.py`)
- Add guaranteed minimums: 8 featured agents, 5 featured creators, 10
top agents
- First N submissions are deterministically approved and featured
- Increase agents per user from 15 → 25 (for pagination with
page_size=20)
- Fix library agent creation to use constants instead of hardcoded `10`

### Frontend Tests
- `library.spec.ts`: Fix pagination threshold to `PAGE_SIZE` (20)
- `library.page.ts`: Replace 2s timeout with `networkidle` +
`waitForFunction`
- `marketplace.page.ts`: Add `networkidle` wait, 30s waits in
`getFirst*` methods
- `marketplace.spec.ts`: Replace 10s timeout with `waitForFunction`
- `marketplace-creator.spec.ts`: Add `networkidle` + element waits

## Related
- Closes SECRT-1848, SECRT-1849
- Should unblock #11841 and other PRs in merge queue

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
2026-01-30 05:12:35 +00:00
Nicholas Tindle
059c94afac docs(blocks): update video block documentation
- Remove deprecated output_return_type parameter
- Add model_id parameter to narration block

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 21:55:46 -06:00
Nicholas Tindle
3ee7c9bfa8 chore(backend): format video blocks and update poetry.lock
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 21:47:34 -06:00
Otto
0fde14bf23 refactor: Move all media blocks into video/ folder
- Moved MediaDurationBlock, LoopVideoBlock, AddAudioToVideoBlock from media.py to video/
- Deleted media.py - all video/media blocks now consolidated in video/ folder
- Updated video/__init__.py to export all 8 blocks
2026-01-30 03:26:41 +00:00
Otto
e8b33f9dbe Merge dev into feature/video-editing-blocks
- Resolved conflicts keeping dev's media.py with ExecutionContext pattern
- Updated video blocks (clip, concat, download, narration, text_overlay) to use ExecutionContext
- Removed duplicate blocks (duration, loop, add_audio) - now provided by media.py
- Updated video/__init__.py to only export new video blocks
2026-01-30 03:16:41 +00:00
Nicholas Tindle
3b822cdaf7 chore(branchlet): Remove docs pip install from postCreateCmd (#11883)
### Changes 🏗️

- Removed `cd docs && pip install -r requirements.txt` from
`postCreateCmd` in `.branchlet.json`
- Docs dependencies will no longer be auto-installed during branchlet
worktree creation

### Rationale

The docs setup step was adding unnecessary overhead to the worktree
creation process. Developers who need to work on documentation can
manually install the docs requirements when needed.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified branchlet worktree creation still works without the docs
pip install step

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2026-01-30 00:31:34 +00:00
Zamil Majdy
b2eb4831bd feat(chat): improve agent generator error propagation (#11884)
## Summary
- Add helper functions in `service.py` to create standardized error
responses with `error_type` classification
- Update service functions to return error dicts instead of `None`,
preserving error details from the Agent Generator microservice
- Update `core.py` to pass through error responses properly
- Update `create_agent.py` to handle error responses with user-friendly
messages based on error type

## Error Types Now Propagated
| Error Type | Description | User Message |
|------------|-------------|--------------|
| `llm_parse_error` | LLM returned unparseable response | "The AI had
trouble understanding this request" |
| `llm_timeout` / `timeout` | Request timed out | "The request took too
long" |
| `llm_rate_limit` / `rate_limit` | Rate limited | "The service is
currently busy" |
| `validation_error` | Agent validation failed | "The generated agent
failed validation" |
| `connection_error` | Could not connect to Agent Generator | Generic
error message |
| `http_error` | HTTP error from Agent Generator | Generic error message
|
| `unknown` | Unclassified error | Generic error message |

## Motivation
This enables better debugging for issues like SECRT-1817 where
decomposition failed due to transient LLM errors but the root cause was
unclear in the logs. Now:
1. Error details from the Agent Generator microservice are preserved
2. Users get more helpful error messages based on error type
3. Debugging is easier with `error_type` in response details

## Related PR
- Agent Generator side:
https://github.com/Significant-Gravitas/AutoGPT-Agent-Generator/pull/102

## Test Plan
- [ ] Test decomposition with various error scenarios (timeout, parse
error)
- [ ] Verify user-friendly messages are shown based on error type
- [ ] Check that error details are logged properly
2026-01-29 19:53:40 +00:00