Compare commits

...

486 Commits

Author SHA1 Message Date
coderabbitai[bot]
3db2a944f7 fix: apply CodeRabbit auto-fixes
Fixed 1 file(s) based on 1 unresolved review comment.

Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
2026-04-01 19:24:39 +00:00
Bentlybro
59192102a6 refactor(frontend): remove useMemo from useLibraryAgentList; fix useState staleness in useSitrepItems and useFleetSummary
- useLibraryAgentList: replace both useMemo calls with plain expressions;
  extract filterAgentsByStatus as a standalone helper function
- useSitrepItems: replace useState initializer with useMemo([agentIDs,
  maxItems]) so items recompute when the agent list changes
- useFleetSummary: replace useState initializer with useMemo([agentIDs])
  so fleet counts recompute when agents are added or removed
2026-04-01 19:10:03 +00:00
Bentlybro
65cca9bef8 fix(frontend): fix TS errors and AgentBriefingPanel regression
- Replace `variant="xsmall"` (not in Text component's type union) with
  `variant="small"` in PulseChips, StatsGrid, LibraryAgentCard, SitrepList
  — fixes the `check API types` CI failure
- useLibraryAgentList: expose `allAgentIDs` (unfiltered) and
  `displayedCount` (filtered count when filter is active)
- LibraryAgentList: pass `allAgentIDs` to AgentBriefingPanel so the
  sitrep covers the full fleet regardless of the active filter; pass
  `displayedCount` to the tab label so "All N" reflects the current view

Addresses Sentry comments 3023964054 and 3023964058.
2026-04-01 19:09:41 +00:00
Bentlybro
6b32e43d84 fix(frontend): wire statusFilter to agent list, fix consumePrompt, add NOTE comments
- useLibraryAgentList: accept statusFilter prop and apply client-side
  filtering using mockStatusForAgent; maps "attention"→health=attention,
  "healthy"→health=good, others match status directly
- LibraryAgentList: pass statusFilter through to useLibraryAgentList
- AutoPilotBridgeContext.consumePrompt: use `prompt !== null` instead of
  truthy check so empty string correctly clears sessionStorage
- Add NOTE comments near useState initialisers in useSitrepItems,
  useLibraryFleetSummary, useAgentStatus, and useFleetSummary explaining
  that they do not recompute on prop changes
2026-04-01 19:09:41 +00:00
Bentlybro
b73d05c23e fix(frontend): address review feedback — lint, format, and bug fixes
- Remove unused Button import in PulseChips (fixes lint CI failure)
- Fix AutoPilotBridgeContext: use Next.js router + sessionStorage instead
  of window.location.href which destroyed React state before consumption
- Add defensive handling in formatTimeAgo for invalid/future dates
- Use cn() utility in LibraryAgentCard for className consistency
- Fix prettier formatting across AgentFilterMenu, SitrepList, useAgentStatus
2026-03-31 17:22:23 +00:00
John Ababseh
8277cce835 feat(frontend): add Agent Intelligence Layer to library and home
Implements 7 new features for agent awareness across the platform:

1. Agent Briefing Panel — collapsible stats grid showing fleet-wide
   counts (running, error, listening, scheduled, idle, monthly spend)
2. Enhanced Library Cards — StatusBadge, progress bar, error messages,
   run counts, spend, and time-ago on every agent card
3. Expanded Filter & Sort — new AgentFilterMenu dropdown with status-
   based filtering (running, attention, listening, scheduled, idle)
4. AI Summary / Situation Report — prioritized SitrepList with error-
   first ranking and contextual action buttons
5. Ask AutoPilot Bridge — shared context that lets sitrep items and
   pulse chips pre-populate the Home chat with agent-specific prompts
6. Home Pulse Chips — lightweight agent status chips on the empty
   Home/Copilot screen linking back to the library
7. Contextual Action Buttons — status-aware actions (View error,
   Reconnect, Watch live, Run now, Retry) on cards and sitrep items

All features use deterministic mock data via useAgentStatus hook,
marked with TODO comments for backend API integration. Follows
existing component patterns (atoms/molecules/organisms), reuses
shadcn/ui primitives, Phosphor Icons, and platform design tokens.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 18:12:51 +02:00
Abhimanyu Yadav
57b17dc8e1 feat(platform): generic managed credential system with AgentMail auto-provisioning (#12537)
### Why / What / How

**Why:** We need a third credential type: **system-provided but unique
per user** (managed credentials). Currently we have system credentials
(same for all users) and user credentials (user provides their own
keys). Managed credentials bridge the gap — the platform provisions them
automatically, one per user, for integrations like AgentMail where each
user needs their own pod-scoped API key.

**What:**
- Generic **managed credential provider registry** — any integration can
register a provider that auto-provisions per-user credentials
- **AgentMail** is the first consumer: creates a pod + pod-scoped API
key using the org-level API key
- Managed credentials appear in the credential dropdown like normal API
keys but with `autogpt_managed=True` — users **cannot update or delete**
them
- **Auto-provisioning** on `GET /credentials` — lazily creates managed
credentials when users browse their credential list
- **Account deletion cleanup** utility — revokes external resources
(pods, API keys) before user deletion
- **Frontend UX** — hides the delete button for managed credentials on
the integrations page

**How:**

### Backend

**New files:**
- `backend/integrations/managed_credentials.py` —
`ManagedCredentialProvider` ABC, global registry,
`ensure_managed_credentials()` (with per-user asyncio lock +
`asyncio.gather` for concurrency), `cleanup_managed_credentials()`
- `backend/integrations/managed_providers/__init__.py` —
`register_all()` called at startup
- `backend/integrations/managed_providers/agentmail.py` —
`AgentMailManagedProvider` with `provision()` (creates pod + API key via
agentmail SDK) and `deprovision()` (deletes pod)

**Modified files:**
- `credentials_store.py` — `autogpt_managed` guards on update/delete,
`has_managed_credential()` / `add_managed_credential()` helpers
- `model.py` — `autogpt_managed: bool` + `metadata: dict` on
`_BaseCredentials`
- `router.py` — calls `ensure_managed_credentials()` in list endpoints,
removed explicit `/agentmail/connect` endpoint
- `user.py` — `cleanup_user_managed_credentials()` for account deletion
- `rest_api.py` — registers managed providers at startup
- `settings.py` — `agentmail_api_key` setting

### Frontend
- Added `autogpt_managed` to `CredentialsMetaResponse` type
- Conditionally hides delete button on integrations page for managed
credentials

### Key design decisions
- **Auto-provision in API layer, not data layer** — keeps
`get_all_creds()` side-effect-free
- **Race-safe** — per-(user, provider) asyncio lock with double-check
pattern prevents duplicate pods
- **Idempotent** — AgentMail SDK `client_id` ensures pod creation is
idempotent; `add_managed_credential()` uses upsert under Redis lock
- **Error-resilient** — provisioning failures are logged but never block
credential listing

### Changes 🏗️

| File | Action | Description |
|------|--------|-------------|
| `backend/integrations/managed_credentials.py` | NEW | ABC, registry,
ensure/cleanup |
| `backend/integrations/managed_providers/__init__.py` | NEW | Registers
all providers at startup |
| `backend/integrations/managed_providers/agentmail.py` | NEW |
AgentMail provisioning/deprovisioning |
| `backend/integrations/credentials_store.py` | MODIFY | Guards +
managed credential helpers |
| `backend/data/model.py` | MODIFY | `autogpt_managed` + `metadata`
fields |
| `backend/api/features/integrations/router.py` | MODIFY |
Auto-provision on list, removed `/agentmail/connect` |
| `backend/data/user.py` | MODIFY | Account deletion cleanup |
| `backend/api/rest_api.py` | MODIFY | Provider registration at startup
|
| `backend/util/settings.py` | MODIFY | `agentmail_api_key` setting |
| `frontend/.../integrations/page.tsx` | MODIFY | Hide delete for
managed creds |
| `frontend/.../types.ts` | MODIFY | `autogpt_managed` field |

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] 23 tests pass in `router_test.py` (9 new tests for
ensure/cleanup/auto-provisioning)
  - [x] `poetry run format && poetry run lint` — clean
  - [x] OpenAPI schema regenerated
- [x] Manual: verify managed credential appears in AgentMail block
dropdown
  - [x] Manual: verify delete button hidden for managed credentials
- [x] Manual: verify managed credential cannot be deleted via API (403)

#### For configuration changes:
- [x] `.env.default` is updated with `AGENTMAIL_API_KEY=`

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-31 12:56:18 +00:00
Krishna Chaitanya
a20188ae59 fix(blocks): validate non-empty input in AIConversationBlock before LLM call (#12545)
### Why / What / How

**Why:** When `AIConversationBlock` receives an empty messages list and
an empty prompt, the block blindly forwards the empty array to the
downstream LLM API, which returns a cryptic `400 Bad Request` error:
`"Invalid 'messages': empty array. Expected an array with minimum length
1."` This is confusing for users who don't understand why their agent
failed.

**What:** Add early input validation in `AIConversationBlock.run()` that
raises a clear `ValueError` when both `messages` and `prompt` are empty.
Also add three unit tests covering the validation logic.

**How:** A simple guard clause at the top of the `run` method checks `if
not input_data.messages and not input_data.prompt` before the LLM call
is made. If both are empty, a descriptive `ValueError` is raised. If
either one has content, the block proceeds normally.

### Changes

- `autogpt_platform/backend/backend/blocks/llm.py`: Add validation guard
in `AIConversationBlock.run()` to reject empty messages + empty prompt
before calling the LLM
- `autogpt_platform/backend/backend/blocks/test/test_llm.py`: Add
`TestAIConversationBlockValidation` with three tests:
- `test_empty_messages_and_empty_prompt_raises_error` — validates the
guard clause
- `test_empty_messages_with_prompt_succeeds` — ensures prompt-only usage
still works
- `test_nonempty_messages_with_empty_prompt_succeeds` — ensures
messages-only usage still works

### Checklist

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Lint passes (`ruff check`)
  - [x] Formatting passes (`ruff format`)
- [x] New unit tests validate the empty-input guard and the happy paths

Closes #11875

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-31 12:43:42 +00:00
goingforstudying-ctrl
c410be890e fix: add empty choices guard in extract_openai_tool_calls() (#12540)
## Summary

`extract_openai_tool_calls()` in `llm.py` crashes with `IndexError` when
the LLM provider returns a response with an empty `choices` list.

### Changes 🏗️

- Added a guard check `if not response.choices: return None` before
accessing `response.choices[0]`
- This is consistent with the function's existing pattern of returning
`None` when no tool calls are found

### Bug Details

When an LLM provider returns a response with an empty choices list
(e.g., due to content filtering, rate limiting, or API errors),
`response.choices[0]` raises `IndexError`. This can crash the entire
agent execution pipeline.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- Verified that the function returns `None` when `response.choices` is
empty
- Verified existing behavior is unchanged when `response.choices` is
non-empty

---------

Co-authored-by: goingforstudying-ctrl <forgithubuse@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-31 20:10:27 +07:00
Zamil Majdy
37d9863552 feat(platform): add extended thinking execution mode to OrchestratorBlock (#12512)
## Summary
- Adds `ExecutionMode` enum with `BUILT_IN` (default built-in tool-call
loop) and `EXTENDED_THINKING` (delegates to Claude Agent SDK for richer
reasoning)
- Extracts shared `tool_call_loop` into `backend/util/tool_call_loop.py`
— reusable by both OrchestratorBlock agent mode and copilot baseline
- Refactors copilot baseline to use the shared `tool_call_loop` with
callback-driven iteration

## ExecutionMode enum
`ExecutionMode` (`backend/blocks/orchestrator.py`) controls how
OrchestratorBlock executes tool calls:
- **`BUILT_IN`** — Default mode. Runs the built-in tool-call loop
(supports all LLM providers).
- **`EXTENDED_THINKING`** — Delegates to the Claude Agent SDK for
extended thinking and multi-step planning. Requires Anthropic-compatible
providers (`anthropic` / `open_router`) and direct API credentials
(subscription mode not supported). Validates both provider and model
name at runtime.

## Shared tool_call_loop
`backend/util/tool_call_loop.py` provides a generic, provider-agnostic
conversation loop:
1. Call LLM with tools → 2. Extract tool calls → 3. Execute tools → 4.
Update conversation → 5. Repeat

Callers provide three callbacks:
- `llm_call`: wraps any LLM provider (OpenAI streaming, Anthropic,
llm.llm_call, etc.)
- `execute_tool`: wraps any tool execution (TOOL_REGISTRY, graph block
execution, etc.)
- `update_conversation`: formats messages for the specific protocol

## OrchestratorBlock EXTENDED_THINKING mode
- `_create_graph_mcp_server()` converts graph-connected blocks to MCP
tools
- `_execute_tools_sdk_mode()` runs `ClaudeSDKClient` with those MCP
tools
- Agent mode refactored to use shared `tool_call_loop`

## Copilot baseline refactored
- Streaming callbacks buffer `Stream*` events during loop execution
- Events are drained after `tool_call_loop` returns
- Same conversation logic, less code duplication

## SDK environment builder extraction
- `build_sdk_env()` extracted to `backend/copilot/sdk/env.py` for reuse
by both copilot SDK service and OrchestratorBlock

## Provider validation
EXTENDED_THINKING mode validates `provider in ('anthropic',
'open_router')` and `model_name.startswith('claude')` because the Claude
Agent SDK requires an Anthropic API key or OpenRouter key. Subscription
mode is not supported — it uses the platform's internal credit system
which doesn't provide raw API keys needed by the SDK. The validation
raises a clear `ValueError` if an unsupported provider or model is used.

## PR Dependencies
This PR builds on #12511 (Claude SDK client). It can be reviewed
independently — #12511 only adds the SDK client module which this PR
imports. If #12511 merges first, this PR will have no conflicts.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All pre-commit hooks pass (typecheck, lint, format)
  - [x] Existing OrchestratorBlock tests still pass
- [x] Copilot baseline behavior unchanged (same stream events, same tool
execution)
- [x] Manual: OrchestratorBlock with execution_mode=EXTENDED_THINKING +
downstream blocks → SDK calls tools
  - [x] Agent mode regression test (non-SDK path works as before)
  - [x] SDK mode error handling (invalid provider raises ValueError)
2026-03-31 20:04:13 +07:00
Krishna Chaitanya
2f42ff9b47 fix(blocks): validate email recipients in Gmail blocks before API call (#12546)
### Why / What / How

**Why:** When a user or LLM supplies a malformed recipient string (e.g.
a bare username, a JSON blob, or an empty value) to `GmailSendBlock`,
`GmailCreateDraftBlock`, or any reply block, the Gmail API returns an
opaque `HttpError 400: "Invalid To header"`. This surfaces as a
`BlockUnknownError` with no actionable guidance, making it impossible
for the LLM to self-correct. (Fixes #11954)

**What:** Adds a lightweight `validate_email_recipients()` function that
checks every recipient against a simplified RFC 5322 pattern
(`local@domain.tld`) and raises a clear `ValueError` listing all invalid
entries before any API call is made.

**How:** The validation is called in two shared code paths —
`create_mime_message()` (used by send and draft blocks) and
`_build_reply_message()` (used by reply blocks) — so all Gmail blocks
that compose outgoing email benefit from it with zero per-block changes.
The regex is intentionally permissive (any `x@y.z` passes) to avoid
false positives on unusual but valid addresses.

### Changes 🏗️

- Added `validate_email_recipients()` helper in `gmail.py` with a
compiled regex
- Hooked validation into `create_mime_message()` for `to`, `cc`, and
`bcc` fields
- Hooked validation into `_build_reply_message()` for reply/draft-reply
blocks
- Added `TestValidateEmailRecipients` test class covering valid,
invalid, mixed, empty, JSON-string, and field-name scenarios

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified `validate_email_recipients` correctly accepts valid
emails (`user@example.com`, `a@b.com`, `test@sub.domain.co`)
- [x] Verified it rejects malformed entries (bare names, missing domain
dot, empty strings, JSON strings)
- [x] Verified error messages include the field name and all invalid
entries
  - [x] Verified empty recipient lists pass without error
  - [x] Confirmed `gmail.py` and test file parse correctly (AST check)

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-31 12:37:33 +00:00
Zamil Majdy
914efc53e5 fix(backend): disambiguate duplicate tool names in OrchestratorBlock (#12555)
## Why
The OrchestratorBlock fails with `Tool names must be unique` when
multiple nodes use the same block type (e.g., two "Web Search" blocks
connected as tools). The Anthropic API rejects the request because
duplicate tool names are sent.

## What
- Detect duplicate tool names after building tool signatures
- Append `_1`, `_2`, etc. suffixes to disambiguate
- Enrich descriptions of duplicate tools with their hardcoded default
values so the LLM can distinguish between them
- Clean up internal `_hardcoded_defaults` metadata before sending to API
- Exclude sensitive/credential fields from default value descriptions

## How
- After `_create_tool_node_signatures` builds all tool functions, count
name occurrences
- For duplicates: rename with suffix and append `[Pre-configured:
key=value]` to description using the node's `input_default` (excluding
linked fields that the LLM provides)
- Added defensive `isinstance(defaults, dict)` check for compatibility
with test mocks
- Suffix collision avoidance: skips candidates that collide with
existing tool names
- Long tool names truncated to fit within 64-character API limit
- 47 unit tests covering: basic dedup, description enrichment, unique
names unchanged, no metadata leaks, single tool, triple duplicates,
linked field exclusion, mixed unique/duplicate scenarios, sensitive
field exclusion, long name truncation, suffix collision, malformed
tools, missing description, empty list, 10-tool all-same-name, multiple
distinct groups, large default truncation, suffix collision cascade,
parameter preservation, boundary name lengths, nested dict/list
defaults, null defaults, customized name priority, required fields

## Test plan
- [x] All 47 tests in `test_orchestrator_tool_dedup.py` pass
- [x] All 11 existing orchestrator unit tests pass (dict, dynamic
fields, responses API)
- [x] Pre-commit hooks pass (ruff, black, isort, pyright)
- [ ] Manual test: connect two same-type blocks to an orchestrator and
verify the LLM call succeeds

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 11:54:10 +00:00
Carson Kahn
17e78ca382 fix(docs): remove extraneous whitespace in README (#12587)
### Why / What / How

Remove extraneous whitespace in README.md:
- "Workflow Management" description: extra spaces between "block" and
"performs"
- "Agent Interaction" description: extra spaces between "user-friendly"
and "interface"

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-31 08:38:45 +00:00
Ubbe
7ba05366ed feat(platform/copilot): live timer stats with persisted duration (#12583)
## Why

The copilot chat had no indication of how long the AI spent "thinking"
on a response. Users couldn't tell if a long wait was normal or
something was stuck. Additionally, the thinking duration was lost on
page reload since it was only tracked client-side.

## What

- **Live elapsed timer**: Shows elapsed time ("23s", "1m 5s") in the
ThinkingIndicator while the AI is processing (appears after 20s to avoid
spam on quick responses)
- **Frozen "Thought for Xm Ys"**: Displays the final thinking duration
in TurnStatsBar after the response completes
- **Persisted duration**: Saves `durationMs` on the last assistant
message in the DB so the timer survives page reloads

## How

**Backend:**
- Added `durationMs Int?` column to `ChatMessage` (Prisma migration)
- `mark_session_completed` in `stream_registry.py` computes wall-clock
duration from Redis session `created_at` and saves it via
`DatabaseManager.set_turn_duration()`
- Invalidates Redis session cache after writing so GET returns fresh
data

**Frontend:**
- `useElapsedTimer` hook tracks client-side elapsed seconds during
streaming
- `ThinkingIndicator` shows only the elapsed time (no phrases) after
20s, with `font-mono text-sm` styling
- `TurnStatsBar` displays "Thought for Xs" after completion, preferring
live `elapsedSeconds` and falling back to persisted `durationMs`
- `convertChatSessionToUiMessages` extracts `duration_ms` from
historical messages into a `Map<string, number>` threaded through to
`ChatMessagesContainer`

## Test plan

- [ ] Send a message in copilot — verify ThinkingIndicator shows elapsed
time after 20s
- [ ] After response completes — verify "Thought for Xs" appears below
the response
- [ ] Refresh the page — verify "Thought for Xs" still appears
(persisted from DB)
- [ ] Check older conversations — they should NOT show timer (no
historical data)
- [ ] Verify no Zod/SSE validation errors in browser console

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:46:31 +07:00
Zamil Majdy
ca74f980c1 fix(copilot): resolve host-scoped credentials for authenticated web requests (#12579)
## Summary
- Fixed `_resolve_discriminated_credentials()` in `helpers.py` to handle
URL/host-based credential discrimination (used by
`SendAuthenticatedWebRequestBlock`)
- Previously, only provider-based discrimination (with
`discriminator_mapping`) was handled; URL-based discrimination (with
`discriminator` set but no `discriminator_mapping`) was silently skipped
- This caused host-scoped credentials to either match the wrong host or
fail to match at all when the CoPilot called `run_block` for
authenticated HTTP requests
- Added 14 targeted tests covering discriminator resolution, host
matching, credential resolution integration, and RunBlockTool end-to-end
flows

## Root Cause
`_resolve_discriminated_credentials()` checked `if
field_info.discriminator and field_info.discriminator_mapping:` which
excluded host-scoped credentials where `discriminator="url"` but
`discriminator_mapping=None`. The URL from `input_data` was never added
to `discriminator_values`, so `_credential_is_for_host()` received empty
`discriminator_values` and returned `True` for **any** host-scoped
credential regardless of URL match.

## Fix
When `discriminator` is set without `discriminator_mapping`, the URL
value from `input_data` is now copied into `discriminator_values` on a
shallow copy of the field info (to avoid mutating the cached schema).
This enables `_credential_is_for_host()` to properly match the
credential's host against the target URL.

## Test plan
- [x] `TestResolveDiscriminatedCredentials` - 4 tests verifying URL
discriminator populates values, handles missing URL, doesn't mutate
original, preserves provider/type
- [x] `TestFindMatchingHostScopedCredential` - 5 tests verifying
correct/wrong host matching, wildcard hosts, multiple credential
selection
- [x] `TestResolveBlockCredentials` - 3 integration tests verifying full
credential resolution with matching/wrong/missing hosts
- [x] `TestRunBlockToolAuthenticatedHttp` - 2 end-to-end tests verifying
SetupRequirementsResponse when creds missing and BlockDetailsResponse
when creds matched
- [x] All 28 existing + new tests pass
- [x] Ruff lint, isort, Black formatting, pyright typecheck all pass

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 08:12:33 +00:00
Zamil Majdy
68f5d2ad08 fix(blocks): raise AIConditionBlock errors instead of swallowing them (#12593)
## Why

Sentry alert
[AUTOGPT-SERVER-8C8](https://significant-gravitas.sentry.io/issues/7367978095/)
— `AIConditionBlock` failing in prod with:

```
Invalid 'max_output_tokens': integer below minimum value.
Expected a value >= 16, but got 10 instead.
```

Two problems:
1. `max_tokens=10` is below OpenAI's new minimum of 16
2. The `except Exception` handler was calling `logger.error()` which
triggered Sentry for what are known block errors, AND silently
defaulting to `result=False` — making the block appear to succeed with
an incorrect answer

## What

- Bump `max_tokens` from 10 to 16 (fixes the root cause)
- Remove the `try/except` entirely — the executor already handles
exceptions correctly (`ValueError` = known/no Sentry, everything else =
unknown/Sentry). The old handler was just swallowing errors and
producing wrong results.

## Test plan

- [x] Existing `AIConditionBlock` tests pass (block only expects
"true"/"false", 16 tokens is plenty)
- [x] No more silent `result=False` on errors
- [x] No more spurious Sentry alerts from `logger.error()`

Fixes AUTOGPT-SERVER-8C8

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 10:28:14 +00:00
Nicholas Tindle
2b3d730ca9 dx(skills): add /open-pr and /setup-repo skills (#12591)
### Why / What / How

**Why:** Agents working in worktrees lack guidance on two of the most
common workflows: properly opening PRs (using the repo template,
validating test coverage, triggering the review bot) and bootstrapping
the repo from scratch with a worktree-based layout. Without these
skills, agents either skip steps (no test plan, wrong template) or
require manual hand-holding for setup.

**What:** Adds two new Claude Code skills under `.claude/skills/`:
- `/open-pr` — A structured PR creation workflow that enforces the
canonical `.github/PULL_REQUEST_TEMPLATE.md`, validates test coverage
for existing and new behaviors, supports a configurable base branch, and
integrates the `/review` bot workflow for agents without local testing
capability. Cross-references `/pr-test`, `/pr-review`, and `/pr-address`
for the full PR lifecycle.
- `/setup-repo` — An interactive repo bootstrapping skill that creates a
worktree-based layout (main + reviews + N numbered work branches).
Handles .env file provisioning with graceful fallbacks (.env.default,
.env.example), copies branchlet config, installs dependencies, and is
fully idempotent (safe to re-run).

**How:** Markdown-based SKILL.md files following the existing skill
conventions. Both skills use proper bash patterns (seq-based loops
instead of brace expansion with variables, existence checks before
branch/worktree creation, error reporting on install failures).
`/open-pr` delegates to AskUserQuestion-style prompts for base branch
selection. `/setup-repo` uses AskUserQuestion for interactive branch
count and base branch selection.

### Changes 🏗️

- Added `.claude/skills/open-pr/SKILL.md` — PR creation workflow with:
  - Pre-flight checks (committed, pushed, formatted)
- Test coverage validation (existing behavior not broken, new behavior
covered)
- Canonical PR template enforcement (read and fill verbatim, no
pre-checked boxes)
  - Configurable base branch (defaults to dev)
- Review bot workflow (`/review` comment + 30min wait) for agents
without local testing
  - Related skills table linking `/pr-test`, `/pr-review`, `/pr-address`

- Added `.claude/skills/setup-repo/SKILL.md` — Repo bootstrap workflow
with:
- Interactive setup (branch count: 4/8/16/custom, base branch selection)
- Idempotent branch creation (skips existing branches with info message)
  - Idempotent worktree creation (skips existing directories)
- .env provisioning with fallback chain (.env → .env.default →
.env.example → warning)
  - Branchlet config propagation
  - Dependency installation with success/failure reporting per worktree

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified SKILL.md frontmatter follows existing skill conventions
  - [x] Verified trigger conditions match expected user intents
  - [x] Verified cross-references to existing skills are accurate
- [x] Verified PR template section matches
`.github/PULL_REQUEST_TEMPLATE.md`
- [x] Verified bash snippets use correct patterns (seq, show-ref, quoted
vars)
  - [x] Pre-commit hooks pass on all commits
  - [x] Addressed all CodeRabbit, Sentry, and Cursor review comments

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Low risk documentation-only change: adds new markdown skills without
modifying runtime code. Main risk is workflow guidance drift (e.g.,
`.env`/worktree steps) if it diverges from actual repo conventions.
> 
> **Overview**
> Adds two new Claude Code skills under `.claude/skills/` to standardize
common developer workflows.
> 
> `/open-pr` documents a PR creation flow that enforces using
`.github/PULL_REQUEST_TEMPLATE.md` verbatim, calls out required test
coverage, and describes how to trigger/poll the `/review` bot when local
testing isn’t available.
> 
> `/setup-repo` documents an idempotent, interactive bootstrap for a
multi-worktree layout (creates `reviews` and `branch1..N`, provisions
`.env` files with `.env.default`/`.env.example` fallbacks, copies
`.branchlet.json`, and installs dependencies), complementing the
existing `/worktree` skill.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
80dbeb1596. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-03-27 10:22:03 +00:00
Zamil Majdy
f28628e34b fix(backend): preserve thinking blocks during transcript compaction (#12574)
## Why

AutoPilot users hit `invalid_request_error` ("thinking or
redacted_thinking blocks in the latest assistant message cannot be
modified") when sessions get long enough to trigger transcript
compaction. The Anthropic API requires thinking blocks in the last
assistant message to be byte-for-byte identical to the original response
— our compaction was flattening them to plain text, destroying the
cryptographic signatures.

Reported in Discord `#breakage` by John Ababseh with session
`31d3f08a-cb94-45eb-9fce-56b3f0287ef4`.

## What

- **`compact_transcript`** now splits the transcript into a compressible
prefix and a preserved tail (last assistant entry + trailing entries).
Only the prefix is compressed; the tail is re-appended verbatim,
preserving thinking blocks exactly.
- **`_flatten_assistant_content`** now silently drops `thinking` and
`redacted_thinking` blocks instead of creating `[__thinking__]`
placeholders — they carry no useful context for compression summaries.
- **`response_adapter`** explicitly handles `ThinkingBlock` (skip
gracefully instead of silently falling through the isinstance chain).
- **`_format_sdk_content_blocks`** now passes through raw dict blocks
(e.g. `redacted_thinking` that the SDK may not have a typed class for)
verbatim to the transcript.

## How

The key insight is the Anthropic API's asymmetric constraint:
- **Last assistant message**: thinking/redacted_thinking blocks must be
preserved byte-for-byte
- **Older assistant messages**: thinking blocks can be removed entirely

`compact_transcript` uses `_find_last_assistant_entry()` to split the
JSONL into two parts:
1. **Prefix** (everything before the last assistant): flattened and
compressed normally
2. **Tail** (last assistant + any trailing user message): preserved
verbatim and re-chained via `_rechain_tail()` to maintain the
`parentUuid` chain

This ensures the API always sees the original thinking blocks in the
last assistant message while still achieving meaningful compression on
older turns.

## Test plan
- [x] 25 new tests across `thinking_blocks_test.py` (TDD: written before
implementation)
- [x] `_find_last_assistant_entry` splits correctly at last assistant,
handles edges (no assistant, index 0, trailing user)
  - [x] `_rechain_tail` patches parentUuid chain, handles empty tail
- [x] `_flatten_assistant_content` strips thinking/redacted_thinking
blocks, handles mixed content
  - [x] `compact_transcript` preserves last assistant's thinking blocks
- [x] `compact_transcript` strips thinking from older assistant messages
- [x] Edge cases: trailing user message, single assistant, no thinking
blocks
  - [x] `response_adapter` handles ThinkingBlock without crash
- [x] `_format_sdk_content_blocks` preserves thinking block format and
raw dict blocks
- [x] All existing copilot SDK tests pass
- [x] Pre-commit hooks (lint, format, typecheck) all pass
2026-03-27 06:36:52 +00:00
Zamil Majdy
b6a027fd2b fix(platform): fix prod Sentry errors and reduce on-call alert noise (#12565)
## Why

Multiple Sentry issues paging on-call in prod:

1. **AUTOGPT-SERVER-8BP**: `ConversionError: Failed to convert
anthropic/claude-sonnet-4-6 to <enum 'LlmModel'>` — the copilot passes
OpenRouter-style provider-prefixed model names
(`anthropic/claude-sonnet-4-6`) to blocks, but the `LlmModel` enum only
recognizes the bare model ID (`claude-sonnet-4-6`).

2. **BUILDER-7GF**: `Error invoking postEvent: Method not found` —
Sentry SDK internal error on Chrome Mobile Android, not a platform bug.

3. **XMLParserBlock**: `BlockUnknownError raised by XMLParserBlock with
message: Error in input xml syntax` — user sent bad XML but the block
raised `SyntaxError`, which gets wrapped as `BlockUnknownError`
(unexpected) instead of `BlockExecutionError` (expected).

4. **AUTOGPT-SERVER-8BS**: `Virus scanning failed for Screenshot
2026-03-26 091900.png: range() arg 3 must not be zero` — empty (0-byte)
file upload causes `range(0, 0, 0)` in the virus scanner chunking loop,
and the failure is logged at `error` level which pages on-call.

5. **AUTOGPT-SERVER-8BT**: `ValueError: <Token var=<ContextVar
name='current_context'>> was created in a different Context` —
OpenTelemetry `context.detach()` fails when the SDK streaming async
generator is garbage-collected in a different context than where it was
created (client disconnect mid-stream).

6. **AUTOGPT-SERVER-8BW**: `RuntimeError: Attempted to exit cancel scope
in a different task than it was entered in` — anyio's
`TaskGroup.__aexit__` detects cancel scope entered in one task but
exited in another when `GeneratorExit` interrupts the SDK cleanup during
client disconnect.

7. **Workspace UniqueViolationError**: `UniqueViolationError: Unique
constraint failed on (workspaceId, path)` — race condition during
concurrent file uploads handled by `WorkspaceManager._persist_db_record`
retry logic, but Sentry still captures the exception at the raise site.

8. **Library UniqueViolationError**: `UniqueViolationError` on
`LibraryAgent (userId, agentGraphId, agentGraphVersion)` — race
conditions in `add_graph_to_library` and `create_library_agent` caused
crashes or silent data loss.

9. **Graph version collision**: `UniqueViolationError` on `AgentGraph
(id, version)` — copilot re-saving an agent at an existing version
collides with the primary key.

## What

### Backend: `LlmModel._missing_()` for provider-prefixed model names
- Adds `_missing_` classmethod to `LlmModel` enum that strips the
provider prefix (e.g., `anthropic/`) when direct lookup fails
- Self-contained in the enum — no changes to the generic type conversion
system

### Frontend: Filter Sentry SDK noise
- Adds `postEvent: Method not found` to `ignoreErrors` — a known Sentry
SDK issue on certain mobile browsers

### Backend: XMLParserBlock — raise ValueError instead of SyntaxError
- Changed `_validate_tokens()` to raise `ValueError` instead of
`SyntaxError`
- Changed the `except SyntaxError` handler in `run()` to re-raise as
`ValueError`
- This ensures `Block.execute()` wraps XML parsing failures as
`BlockExecutionError` (expected/user-caused) instead of
`BlockUnknownError` (unexpected/alerts Sentry)

### Backend: Virus scanner — handle empty files + reduce alert noise
- Added early return for empty (0-byte) files in `scan_file()` to avoid
`range() arg 3 must not be zero` when `chunk_size` is 0
- Added `max(1, len(content))` guard on `chunk_size` as defense-in-depth
- Downgraded `scan_content_safe` failure log from `error` to `warning`
so single-file scan failures don't page on-call via Sentry

### Backend: Suppress SDK client cleanup errors on SSE disconnect
- Replaced `async with ClaudeSDKClient` in `_run_stream_attempt` with
manual `__aenter__`/`__aexit__` wrapped in new
`_safe_close_sdk_client()` helper
- `_safe_close_sdk_client()` catches `ValueError` (OTEL context token
mismatch) and `RuntimeError` (anyio cancel scope in wrong task) during
`__aexit__` and logs at `debug` level — these are expected when SSE
client disconnects mid-stream
- Added `_is_sdk_disconnect_error()` helper for defense-in-depth at the
outer `except BaseException` handler in `stream_chat_completion_sdk`
- Both Sentry errors (8BT and 8BW) are now suppressed without affecting
normal cleanup flow

### Backend: Filter workspace UniqueViolationError from Sentry alerts
- Added `before_send` filter in `_before_send()` to drop
`UniqueViolationError` events where the message contains `workspaceId`
and `path`
- The error is already handled by `WorkspaceManager._persist_db_record`
retry logic — it must propagate for the retry logic to work, so the fix
is at the Sentry filter level rather than catching/suppressing at source

### Backend: Library agent race condition fixes
- **`add_graph_to_library`**: Replaced check-then-create pattern with
create-then-catch-`UniqueViolationError`-then-update. On collision,
updates the existing row (restoring soft-deleted/archived agents)
instead of crashing.
- **`create_library_agent`**: Replaced `create` with `upsert` on the
`(userId, agentGraphId, agentGraphVersion)` composite unique constraint,
so concurrent adds restore soft-deleted entries instead of throwing.

### Backend: Graph version auto-increment on collision
- `__create_graph` now checks if the `(id, version)` already exists
before `create_many`, and auto-increments the version to `max_existing +
1` to avoid `UniqueViolationError` when the copilot re-saves an agent.

### Backend: Workspace `get_or_create_workspace` upsert
- Changed from find-then-create to `upsert` to atomically handle
concurrent workspace creation.

## Test plan

- [x] `LlmModel("anthropic/claude-sonnet-4-6")` resolves correctly
- [x] `LlmModel("claude-sonnet-4-6")` still works (no regression)
- [x] `LlmModel("invalid/nonexistent-model")` still raises `ValueError`
- [x] XMLParserBlock: unclosed tags, extra closing tags, empty XML all
raise `ValueError`
- [x] XMLParserBlock: `SyntaxError` from gravitasml library is caught
and re-raised as `ValueError`
- [x] Virus scanner: empty file (0 bytes) returns clean without hitting
ClamAV
- [x] Virus scanner: single-byte file scans normally (regression test)
- [x] Virus scanner: `scan_content_safe` logs at WARNING not ERROR on
failure
- [x] SDK disconnect: `_is_sdk_disconnect_error` correctly identifies
cancel scope and context var errors
- [x] SDK disconnect: `_is_sdk_disconnect_error` rejects unrelated
errors
- [x] SDK disconnect: `_safe_close_sdk_client` suppresses ValueError,
RuntimeError, and unexpected exceptions
- [x] SDK disconnect: `_safe_close_sdk_client` calls `__aexit__` on
clean exit
- [x] Library: `add_graph_to_library` creates new agent on first call
- [x] Library: `add_graph_to_library` updates existing on
UniqueViolationError
- [x] Library: `create_library_agent` uses upsert to handle concurrent
adds
- [x] All existing workspace overwrite tests still pass
- [x] All tests passing (existing + 4 XML syntax + 3 virus scanner + 10
SDK disconnect + library tests)
2026-03-27 06:09:42 +00:00
Zamil Majdy
fb74fcf4a4 feat(platform): add shared admin user search + rate-limit modal on spending page (#12577)
## Why
Admin rate-limit management required manually entering user UUIDs. The
spending page already had user search but it wasn't reusable.

## What
- Extract `AdminUserSearch` as shared component from spending page
search
- Add rate-limit modal (usage bars + reset) to spending page user rows
- Add email/name/UUID search to standalone rate-limits page
- Backend: add email query parameter to rate-limit endpoint

## How
- `AdminUserSearch` in `admin/components/` — reused by both spending and
rate-limits
- `RateLimitModal` opens from spending page "Rate Limits" button
- Backend `_resolve_user_id()` accepts email or user_id
- Smart routing: exact email → direct lookup, UUID → direct, partial →
fuzzy search

### Follow-up
- `AdminUserSearch` is a plain text input with no typeahead/fuzzy
suggestions — consider adding autocomplete dropdown with debounced
search

### Checklist 📋
- [x] Shared search component extracted and reused
- [x] Tests pass
- [x] Type-checked
2026-03-27 05:53:04 +00:00
Zamil Majdy
28b26dde94 feat(platform): spend credits to reset CoPilot daily rate limit (#12526)
## Summary
- When users hit their daily CoPilot token limit, they can now spend
credits ($2.00 default) to reset it and continue working
- Adds a dialog prompt when rate limit error occurs, offering the
credit-based reset option
- Adds a "Reset daily limit" button in the usage limits panel when the
daily limit is reached
- Backend: new `POST /api/chat/usage/reset` endpoint,
`reset_daily_usage()` Redis helper, `rate_limit_reset_cost` config
- Frontend: `RateLimitResetDialog` component, updated
`UsagePanelContent` with reset button, `useCopilotStream` exposes rate
limit state
- **NEW: Resetting the daily limit also reduces weekly usage by the
daily limit amount**, effectively granting 1 extra day's worth of weekly
capacity (e.g., daily_limit=10000 → weekly usage reduced by 10000,
clamped to 0)

## Context
Users have been confused about having credits available but being
blocked by rate limits (REQ-63, REQ-61). This provides a short-term
solution allowing users to spend credits to bypass their daily limit.

The weekly usage reduction ensures that a paid daily reset doesn't just
move the bottleneck to the weekly limit — users get genuine additional
capacity for the day they paid to unlock.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Hit daily rate limit → dialog appears with reset option
- [x] Click "Reset for $2.00" → credits charged, daily counter reset,
dialog closes
- [x] Usage panel shows "Reset daily limit" button when at 100% daily
usage
- [x] When `rate_limit_reset_cost=0` (disabled), rate limit shows toast
instead of dialog
  - [x] Insufficient credits → error toast shown
  - [x] Verify existing rate limit tests pass
  - [x] Unit tests: weekly counter reduced by daily_limit on reset
  - [x] Unit tests: weekly counter clamped to 0 when usage < daily_limit
  - [x] Unit tests: no weekly reduction when daily_token_limit=0

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
(new config fields `rate_limit_reset_cost` and `max_daily_resets` have
defaults in code)
- [x] `docker-compose.yml` is updated or already compatible with my
changes (no Docker changes needed)
2026-03-26 13:52:08 +00:00
Zamil Majdy
d677978c90 feat(platform): admin rate limit check and reset with LD-configurable global limits (#12566)
## Why
Admins need visibility into per-user CoPilot rate limit usage and the
ability to reset a user's counters when needed (e.g., after a false
positive or for debugging). Additionally, the global rate limits were
hardcoded deploy-time constants with no way to adjust without
redeploying.

## What
- Admin endpoints to **check** a user's current rate limit usage and
**reset** their daily/weekly counters to zero
- Global rate limits are now **LaunchDarkly-configurable** via
`copilot-daily-token-limit` and `copilot-weekly-token-limit` flags,
falling back to existing `ChatConfig` values
- Frontend admin page at `/admin/rate-limits` with user lookup, usage
visualization, and reset capability
- Chat routes updated to source global limits from LD flags

## How
- **Backend**: Added `reset_user_usage()` to `rate_limit.py` that
deletes Redis usage keys. New admin routes in
`rate_limit_admin_routes.py` (GET `/api/copilot/admin/rate_limit` and
POST `/api/copilot/admin/rate_limit/reset`). Added
`COPILOT_DAILY_TOKEN_LIMIT` and `COPILOT_WEEKLY_TOKEN_LIMIT` to the
`Flag` enum. Chat routes use `_get_global_rate_limits()` helper that
checks LD first.
- **Frontend**: New `/admin/rate-limits` page with `RateLimitManager`
(user lookup) and `RateLimitDisplay` (usage bars + reset button). Added
`getUserRateLimit` and `resetUserRateLimit` to `BackendAPI` client.

## Test plan
- [x] Backend: 4 tests covering get, reset, redis failure, and
admin-only access
- [ ] Manual: Look up a user's rate limits in the admin UI
- [ ] Manual: Reset a user's usage counters
- [ ] Manual: Verify LD flag overrides are respected for global limits
2026-03-26 08:29:40 +00:00
Otto
a347c274b7 fix(frontend): replace unrealistic CoPilot suggestion prompt (#12564)
Replaces "Sort my bookmarks into categories" with "Summarize my unread
emails" in the Organize suggestion category. CoPilot has no access to
browser bookmarks or local files, so the original prompt was misleading.

---
Co-authored-by: Toran Bruce Richards (@Torantulino)
<Torantulino@users.noreply.github.com>
2026-03-26 08:10:28 +00:00
Zamil Majdy
f79d8f0449 fix(backend): move placeholder_values exclusively to AgentDropdownInputBlock (#12551)
## Why

`AgentInputBlock` has a `placeholder_values` field whose
`generate_schema()` converts it into a JSON schema `enum`. The frontend
renders any field with `enum` as a dropdown/select. This means
AI-generated agents that populate `placeholder_values` with example
values (e.g. URLs) on regular `AgentInputBlock` nodes end up with
dropdowns instead of free-text inputs — users can't type custom values.

Only `AgentDropdownInputBlock` should produce dropdown behavior.

## What

- Removed `placeholder_values` field from `AgentInputBlock.Input`
- Moved the `enum` generation logic to
`AgentDropdownInputBlock.Input.generate_schema()`
- Cleaned up test data for non-dropdown input blocks
- Updated copilot agent generation guide to stop suggesting
`placeholder_values` for `AgentInputBlock`

## How

The base `AgentInputBlock.Input.generate_schema()` no longer converts
`placeholder_values` → `enum`. Only `AgentDropdownInputBlock.Input`
defines `placeholder_values` and overrides `generate_schema()` to
produce the `enum`.

**Backward compatibility**: Existing agents with `placeholder_values` on
`AgentInputBlock` nodes load fine — `model_construct()` silently ignores
extra fields not defined on the model. Those inputs will now render as
text fields (desired behavior).

## Test plan
- [x] `poetry run pytest backend/blocks/test/test_block.py -xvs` — all
block tests pass
- [x] `poetry run format && poetry run lint` — clean
- [ ] Import an agent JSON with `placeholder_values` on an
`AgentInputBlock` — verify it loads and renders as text input
- [ ] Create an agent with `AgentDropdownInputBlock` — verify dropdown
still works
2026-03-26 08:09:38 +00:00
Otto
1bc48c55d5 feat(copilot): add copy button to user prompt messages [SECRT-2172] (#12571)
Requested by @itsababseh

Users can copy assistant output messages but not their own prompts. This
adds the same copy button to user messages — appears on hover,
right-aligned, using the existing `CopyButton` component.

## Why

Users write long prompts and need to copy them to reuse or share.
Currently requires manual text selection. ChatGPT shows copy on hover
for user messages — this matches that pattern.

## What

- Added `CopyButton` to user prompt messages in
`ChatMessagesContainer.tsx`
- Shows on hover (`group-hover:opacity-100`), positioned right-aligned
below the message
- Reuses the existing `CopyButton` and `MessageActions` components —
zero new code

## How

One file changed, 11 lines added:
1. Import `MessageActions` and `CopyButton`
2. Render them after user `MessageContent`, gated on `message.role ===
"user"` and having text parts

---
Co-authored-by: itsababseh (@itsababseh)
<36419647+itsababseh@users.noreply.github.com>
2026-03-26 08:02:28 +00:00
Abhimanyu Yadav
9d0a31c0f1 fix(frontend/builder): fix array field item layout and add FormRenderer stories (#12532)
Fix broken UI when selecting nodes with array fields (list[str],
list[Enum]) in the builder. The select/input inside array items was
squeezed by the Remove button instead of taking full width.
<img width="2559" height="1077" alt="Screenshot 2026-03-26 at 10 23
34 AM"
src="https://github.com/user-attachments/assets/2ffc28a2-8d6c-428c-897c-021b1575723c"
/>

### Changes 🏗️

- **ArrayFieldItemTemplate**: Changed layout from horizontal flex-row to
vertical flex-col so the input takes full width and Remove button sits
below aligned left, with tighter spacing between them
- **Storybook config**: Added `renderers/**` glob to
`.storybook/main.ts` so renderer stories are discoverable
- **FormRenderer stories**: Added comprehensive Storybook stories
covering all backend field types (string, int, float, bool, enum,
date/time, list[str], list[int], list[Enum], list[bool], nested objects,
Optional, anyOf unions, oneOf discriminated unions, multi-select, list
of objects, and a kitchen sink). Includes exact Twitter GetUserBlock
schema for realistic oneOf + multi-select testing.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified array field items render with full-width input and Remove
button below in Storybook
  - [x] Verified list[Enum] select dropdown takes full width
  - [x] Verified list[str] text input takes full width
- [x] Verified all FormRenderer stories render without errors in
Storybook
- [x] Verified multi-select and oneOf discriminated union stories match
real backend schemas
2026-03-26 06:15:30 +00:00
Abhimanyu Yadav
9b086e39c6 fix(frontend): hide placeholder text when copilot voice recording is active (#12534)
### Why / What / How

**Why:** When voice recording is active in the CoPilot chat input, the
recording UI (waveform + timer) overlays on top of the placeholder/hint
text, creating a visually broken appearance. Reported by a user via
SECRT-2163.

**What:** Hide the textarea placeholder text while voice recording is
active so it doesn't bleed through the `RecordingIndicator` overlay.

**How:** When `isRecording` is true, the placeholder is set to an empty
string. The existing `RecordingIndicator` overlay (waveform animation +
elapsed time) then displays cleanly without the hint text showing
underneath.

### Changes 🏗️

- Clear the `PromptInputTextarea` placeholder to `""` when voice
recording is active, preventing it from rendering behind the
`RecordingIndicator` overlay

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Open CoPilot chat at /copilot
- [x] Click the microphone button or press Space to start voice
recording
- [x] Verify the placeholder text ("Type your message..." / "What else
can I help with?") is hidden during recording
- [x] Verify the RecordingIndicator (waveform + timer) displays cleanly
without overlapping text
  - [x] Stop recording and verify placeholder text reappears
  - [x] Verify "Transcribing..." placeholder shows during transcription
2026-03-26 05:41:09 +00:00
Zamil Majdy
5867e4d613 Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-03-26 07:30:56 +07:00
Zamil Majdy
85f0d8353a fix(platform): fix prod Sentry errors and reduce on-call alert noise (#12560)
## Summary
Hotfix targeting master for production Sentry errors that are triggering
on-call pages. Fixes actual bugs and expands Sentry filters to suppress
user-caused errors that are not platform issues.

### Bug Fixes
- **Workspace race condition** (`get_or_create_workspace`): Replaced
Prisma's non-atomic `upsert` with find-then-create pattern. Prisma's
upsert translates to SELECT + INSERT (not PostgreSQL's native `INSERT
... ON CONFLICT`), causing `UniqueViolationError` when concurrent
requests hit for the same user (e.g. copilot + file upload
simultaneously).
- **ChatSidebar crash**: Added null-safe `?.` for `sessions` which can
be `undefined` during error/loading states, preventing `TypeError:
Cannot read properties of undefined (reading 'length')`.
- **UsageLimits crash**: Added null-safe `?.` for
`usage.daily`/`usage.weekly` which can be `undefined` when the API
returns partial data, preventing `TypeError: Cannot read properties of
undefined (reading 'limit')`.

### Sentry Filter Improvements
Expanded backend `_before_send` to stop user-caused errors from reaching
Sentry and triggering on-call alerts:
- **Consolidated auth keywords** into a shared `_USER_AUTH_KEYWORDS`
list used by both exception-based and log-based filters (previously
duplicated).
- **Added missing auth keywords**: `"unauthorized"`, `"bad
credentials"`, `"insufficient authentication scopes"` — these were
leaking through.
- **Added user integration HTTP error filter**: `"http 401 error"`,
`"http 403 error"`, `"http 404 error"` — catches `BlockUnknownError` and
`HTTPClientError` from user integrations (expired GitHub tokens, wrong
Airtable IDs, etc.).
- **Fixed log-based event gap**: User auth errors logged via
`logger.error()` (not raised as exceptions) were bypassing the
`exc_info` filter. Now the same `_USER_AUTH_KEYWORDS` list is checked
against log messages too.

## On-Call Alerts Addressed

### Fixed (actual bugs)
| Alert | Issue | Root Cause |
|-------|-------|------------|
| `Unique constraint failed on the fields: (userId)` |
[AUTOGPT-SERVER-8BM](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BM)
| Prisma upsert race condition |
| `Unique constraint failed on the fields: (userId)` |
[AUTOGPT-SERVER-8BK](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BK)
| Same — via `/api/workspace/files/upload` |
| `Unique constraint failed on the fields: (userId)` |
[AUTOGPT-SERVER-8BN](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BN)
| Same — via `tools/call run_block` |
| `Upload failed (500): Unique constraint failed` |
[BUILDER-7GA](https://significant-gravitas.sentry.io/issues/BUILDER-7GA)
| Frontend surface of same workspace bug |
| `Cannot read properties of undefined (reading 'length')` |
[BUILDER-7GD](https://significant-gravitas.sentry.io/issues/BUILDER-7GD)
| `sessions` undefined in ChatSidebar |
| `Cannot read properties of undefined (reading 'limit')` |
[BUILDER-7GB](https://significant-gravitas.sentry.io/issues/BUILDER-7GB)
| `usage.daily` undefined in UsageLimits |

### Filtered (user-caused, not platform bugs)
| Alert | Issue | Why it's not a platform bug |
|-------|-------|-----------------------------|
| `Anthropic API error: invalid x-api-key` |
[AUTOGPT-SERVER-8B6](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8B6),
8B7, 8B8 | User provided invalid Anthropic API key |
| `AI condition evaluation failed: Incorrect API key` |
[AUTOGPT-SERVER-83Y](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-83Y)
| User's OpenAI key is wrong (4.5K events, 1 user) |
| `GithubListIssuesBlock: HTTP 401 Bad credentials` |
[AUTOGPT-SERVER-8BF](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BF)
| User's GitHub token expired |
| `HTTPClientError: HTTP 401 Unauthorized` |
[AUTOGPT-SERVER-8BG](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BG)
| Same — credential check endpoint |
| `GithubReadIssueBlock: HTTP 401 Bad credentials` |
[AUTOGPT-SERVER-8BH](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BH)
| Same — different block |
| `AirtableCreateBaseBlock: HTTP 404 MODEL_ID_NOT_FOUND` |
[AUTOGPT-SERVER-8BC](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BC)
| User's Airtable model ID is wrong |

### Not addressed in this PR
| Alert | Issue | Reason |
|-------|-------|--------|
| `Unexpected token '<', "<html><hea"...` |
[BUILDER-7GC](https://significant-gravitas.sentry.io/issues/BUILDER-7GC)
| Transient — backend briefly returned HTML error page |
| `undefined is not an object (activeResponse.state)` |
[BUILDER-71J](https://significant-gravitas.sentry.io/issues/BUILDER-71J)
| Bug in Vercel AI SDK `ai@6.0.59`, already resolved |
| `Last Tool Output is needed` |
[AUTOGPT-SERVER-72T](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-72T)
| User graph misconfiguration (1 user, 21 events) |
| `Cannot set property ethereum` |
[BUILDER-7G6](https://significant-gravitas.sentry.io/issues/BUILDER-7G6)
| Browser wallet extension conflict |
| `File already exists at path` |
[BUILDER-7FS](https://significant-gravitas.sentry.io/issues/BUILDER-7FS)
| Expected 409 conflict |

## Test plan
- [ ] Verify workspace creation works for new users
- [ ] Verify concurrent workspace access (e.g. copilot + file upload)
doesn't error
- [ ] Verify copilot ChatSidebar and UsageLimits load correctly when API
returns partial/error data
- [ ] Verify user auth errors (invalid API keys, expired tokens) no
longer appear in Sentry after deployment
2026-03-25 23:25:32 +07:00
An Vy Le
f871717f68 fix(backend): add sink input validation to AgentValidator (#12514)
## Summary

- Added `validate_sink_input_existence` method to `AgentValidator` to
ensure all sink names in links and input defaults reference valid input
schema fields in the corresponding block
- Added comprehensive tests covering valid/invalid sink names, nested
inputs, and default key handling
- Updated `ReadDiscordMessagesBlock` description to clarify it reads new
messages and triggers on new posts
- Removed leftover test function file

## Test plan

- [ ] Run `pytest` on `validator_test.py` to verify all sink input
validation cases pass
- [ ] Verify existing agent validation flow is unaffected
- [ ] Confirm `ReadDiscordMessagesBlock` description update is accurate

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-25 16:08:17 +00:00
Ubbe
f08e52dc86 fix(frontend): marketplace card description 3 lines + fallback color (#12557)
## Summary
- Increase the marketplace StoreCard description from 2 lines to 3 lines
for better readability
- Change fallback background colour for missing agent images from
`bg-violet-50` to `rgb(216, 208, 255)`

<img width="933" height="458" alt="Screenshot 2026-03-25 at 20 25 41"
src="https://github.com/user-attachments/assets/ea433741-1397-4585-b64c-c7c3b8109584"
/>
<img width="350" height="457" alt="Screenshot 2026-03-25 at 20 25 55"
src="https://github.com/user-attachments/assets/e2029c09-518a-4404-aa95-e202b4064d0b"
/>


## Test plan
- [x] Verified `pnpm format`, `pnpm lint`, `pnpm types` all pass
- [x] Visually confirmed description shows 3 lines on marketplace cards
- [x] Visually confirmed fallback color renders correctly for cards
without images

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 20:58:45 +08:00
Ubbe
500b345b3b fix(frontend): auto-reconnect copilot chat after device sleep/wake (#12519)
## Summary

- Adds `visibilitychange`-based sleep/wake detection to the copilot chat
— when the page becomes visible after >30s hidden, automatically refetch
the session and either resume an active stream or hydrate completed
messages
- Blocks chat input during re-sync (`isSyncing` state) to prevent users
from accidentally sending a message that overwrites the agent's
completed work
- Replaces `PulseLoader` with a spinning `CircleNotch` icon on sidebar
session names for background streaming sessions (closer to ChatGPT's UX)

## How it works

1. When the page goes hidden, we record a timestamp
2. When the page becomes visible, we check elapsed time
3. If >30s elapsed (indicating sleep or long background), we refetch the
session from the API
4. If backend still has `active_stream=true` → remove stale assistant
message and resume SSE
5. If backend is done → the refetch triggers React Query invalidation
which hydrates the completed messages
6. Chat input stays disabled (`isSyncing=true`) until re-sync completes

## Test plan

- [ ] Open copilot, start a long-running agent task
- [ ] Close laptop lid / lock screen for >30 seconds
- [ ] Wake device — verify chat shows the agent's completed response (or
resumes streaming)
- [ ] Verify chat input is temporarily disabled during re-sync, then
re-enables
- [ ] Verify sidebar shows spinning icon (not pulse loader) for
background sessions
- [ ] Verify no duplicate messages appear after wake
- [ ] Verify normal streaming (no sleep) still works as expected

Resolves: [SECRT-2159](https://linear.app/autogpt/issue/SECRT-2159)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 20:15:33 +08:00
Ubbe
995dd1b5f3 feat(platform): replace suggestion pills with themed prompt categories (#12515)
## Summary

<img width="700" height="575" alt="Screenshot 2026-03-23 at 21 40 07"
src="https://github.com/user-attachments/assets/f6138c63-dd5e-4bde-a2e4-7434d0d3ec72"
/>

Re-applies #12452 which was reverted as collateral in #12485 (invite
system revert).

Replaces the flat list of suggestion pills in the CoPilot empty session
with themed prompt categories (Learn, Create, Automate, Organize), each
shown as a popover with contextual prompts.

- **Backend**: Adds `suggested_prompts` as a themed `dict[str,
list[str]]` keyed by category. Updates Tally extraction LLM prompt to
generate prompts per theme, and the `/suggested-prompts` API to return
grouped themes. Legacy `list[str]` rows are preserved under a
`"General"` key for backward compatibility.
- **Frontend**: Replaces inline pill buttons with a `SuggestionThemes`
popover component. Each theme button (with icon) opens a dropdown of 5
relevant prompts. Falls back to hardcoded defaults when the API has no
personalized prompts. Normalizes partial API responses by padding
missing themes with defaults. Legacy `"General"` prompts are distributed
round-robin across themes.

### Changes 🏗️

- `backend/data/understanding.py`: `suggested_prompts` field added as
`dict[str, list[str]]`; legacy list rows preserved under `"General"` key
via `_json_to_themed_prompts`
- `backend/data/tally.py`: LLM prompt updated to generate themed
prompts; validation now per-theme with blank-string rejection
- `backend/api/features/chat/routes.py`: New `SuggestedTheme` model;
endpoint returns `themes[]`
- `frontend/copilot/components/EmptySession/EmptySession.tsx`: Uses
generated API hooks for suggested prompts
- `frontend/copilot/components/EmptySession/helpers.ts`:
`DEFAULT_THEMES` replaces `DEFAULT_QUICK_ACTIONS`; `getSuggestionThemes`
normalizes partial API responses
-
`frontend/copilot/components/EmptySession/components/SuggestionThemes/`:
New popover component with theme icons and loading states

### Checklist 📋

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verify themed suggestion buttons render on CoPilot empty session
  - [x] Click each theme button and confirm popover opens with prompts
  - [x] Click a prompt and confirm it sends the message
- [x] Verify fallback to default themes when API returns no custom
prompts
- [x] Verify legacy users' personalized prompts are preserved and
visible

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 15:32:49 +08:00
Zamil Majdy
336114f217 fix(backend): prevent graph execution stuck + steer SDK away from bash_exec (#12548)
## Summary

Two backend fixes for CoPilot stability:

1. **Steer model away from bash_exec for SDK tool-result files** — When
the SDK returns tool results as file paths, the copilot model was
attempting to use `bash_exec` to read them instead of treating the
content directly. Added system prompt guidance to prevent this.

2. **Guard against missing 'name' in execution input_data** —
`GraphExecution.from_db()` assumed all INPUT/OUTPUT block node
executions have a `name` field in `input_data`. This crashes with
`KeyError: 'name'` when non-standard blocks (e.g., OrchestratorBlock)
produce node executions without this field. Added `"name" in
exec.input_data` guards.

## Why

- The bash_exec issue causes copilot to fail when processing SDK tool
outputs
- The KeyError crashes the `update_graph_execution_stats` endpoint,
causing graph executions to appear stuck (retries 35+ times, never
completes)

## How

- Added system prompt instruction to treat tool result file contents
directly
- Added `"name" in exec.input_data` guard in both input extraction (line
340) and output extraction (line 365) in `execution.py`

### Changes
- `backend/copilot/sdk/service.py` — system prompt guidance
- `backend/data/execution.py` — KeyError guard for missing `name` field

### Checklist 📋
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan

#### Test plan:
- [x] OrchestratorBlock graph execution no longer gets stuck
- [x] Standard Agent Input/Output blocks still work correctly
- [x] Copilot SDK tool results are processed without bash_exec
2026-03-25 13:58:24 +07:00
Nicholas Tindle
866563ad25 feat(platform): admin preview marketplace submissions before approving (#12536)
## Why

Admins reviewing marketplace submissions currently approve blindly —
they can see raw metadata in the admin table but cannot see what the
listing actually looks like (images, video, branding, layout). This
risks approving inappropriate content. With full-scale production
approaching, this is critical.

Additionally, when a creator un-publishes an agent, users who already
added it to their library lose access — breaking their workflows.
Product decided on a "you added it, you keep it" model.

## What

- **Admin preview page** at `/admin/marketplace/preview/[id]` — renders
the listing exactly as it would appear on the public marketplace
- **Add to Library** for admins to test-run pending agents before
approving
- **Library membership grants graph access** — if you added an agent to
your library, you keep access even if it's un-published or rejected
- **Preview button** on every submission row in the admin marketplace
table
- **Cross-reference comments** on original functions to prevent
SECRT-2162-style regressions

## How

### Backend

**Admin preview (`store/db.py`):**
- `get_store_agent_details_as_admin()` queries `StoreListingVersion`
directly, bypassing the APPROVED-only `StoreAgent` DB view
- Validates `CreatorProfile` FK integrity, reads all fields including
`recommendedScheduleCron`

**Admin add-to-library (`library/_add_to_library.py`):**
- Extracted shared logic into `resolve_graph_for_library()` +
`add_graph_to_library()` — eliminates duplication between public and
admin paths
- Admin path uses `get_graph_as_admin()` to bypass marketplace status
checks
- Handles concurrent double-click race via `UniqueViolationError` catch

**Library membership grants graph access (`data/graph.py`):**
- `get_graph()` now falls back to `LibraryAgent` lookup if ownership and
marketplace checks fail
- Only for authenticated users with non-deleted, non-archived library
records
- `validate_graph_execution_permissions()` updated to match — library
membership grants execution access too

**New endpoints (`store_admin_routes.py`):**
- `GET /admin/submissions/{id}/preview` — returns `StoreAgentDetails`
- `POST /admin/submissions/{id}/add-to-library` — creates `LibraryAgent`
via admin path

### Frontend

- Preview page reuses `AgentInfo` + `AgentImages` with admin banner
- Shows instructions, recommended schedule, and slug
- "Add to My Library" button wired to admin endpoint
- Preview button added to `ExpandableRow` (header + version history)
- Categories column uncommented in version history table

### Testing (19 tests)

**Graph access control (9 in `graph_test.py`):** Owner access,
marketplace access, library member access (unpublished),
deleted/archived/anonymous denied, null FK denied, efficiency checks

**Admin bypass (5 in `store_admin_routes_test.py`):** Preview uses
StoreListingVersion not StoreAgent, admin path uses get_graph_as_admin,
regular path uses get_graph, library member can view in builder

**Security (3):** Non-admin 403 on preview, non-admin 403 on
add-to-library, nonexistent 404

**SECRT-2162 regression (2):** Admin access to pending agent, export
with sub-graphs

### Checklist
- [x] Changes clearly listed
- [x] Test plan made
- [x] 19 backend tests pass
- [x] Frontend lints and types clean

## Test plan
- [x] Navigate to `/admin/marketplace`, click Preview on a PENDING
submission
- [x] Verify images, video, description, categories, instructions,
schedule render correctly
- [x] Click "Add to My Library", verify agent appears in library and
opens in builder
- [x] Verify non-admin users get 403
- [x] Verify un-publishing doesn't break access for users who already
added it

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **High Risk**
> Adds new admin-only endpoints that bypass marketplace
approval/ownership checks and changes `get_graph`/execution
authorization to grant access via library membership, which impacts
security-sensitive access control paths.
> 
> **Overview**
> Adds **admin preview + review workflow support** for marketplace
submissions: new admin routes to `GET /admin/submissions/{id}/preview`
(querying `StoreListingVersion` directly) and `POST
/admin/submissions/{id}/add-to-library` (admin bypass to pull pending
graphs into an admin’s library).
> 
> Refactors library add-from-store logic into shared helpers
(`resolve_graph_for_library`, `add_graph_to_library`) and introduces an
admin variant `add_store_agent_to_library_as_admin`, including restore
of archived/deleted entries and dedup/race handling.
> 
> Changes core graph access rules: `get_graph()` now falls back to
**library membership** (non-deleted/non-archived, version-specific) when
ownership and marketplace approval don’t apply, and
`validate_graph_execution_permissions()` is updated accordingly.
Frontend adds a preview link and a dedicated admin preview page with
“Add to My Library”; tests expand significantly to lock in the new
bypass and access-control behavior.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
a362415d12. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 04:26:36 +00:00
Zamil Majdy
e79928a815 fix(backend): prevent logging sensitive data in SafeJson fallback (#12547)
### Why / What / How

**Why:** GitHub's code scanning detected a HIGH severity security
vulnerability in `/autogpt_platform/backend/backend/util/json.py:172`.
The error handler in `sanitize_json()` was logging sensitive data
(potentially including secrets, API keys, credentials) as clear text
when serialization fails.

**What:** This PR removes the logging of actual data content from the
error handler while preserving useful debugging metadata (error type,
error message, and data type).

**How:** Removed the `"Data preview: %s"` format parameter and the
corresponding `truncate(str(data), 100)` argument from the
logger.error() call. The error handler now logs only safe metadata that
helps debugging without exposing sensitive information.

### Changes 🏗️

- **Security Fix**: Modified `sanitize_json()` function in
`backend/util/json.py`
- Removed logging of data content (`truncate(str(data), 100)`) from the
error handler
  - Retained logging of error type (`type(e).__name__`)
- Retained logging of truncated error message (`truncate(str(e), 200)`)
  - Retained logging of data type (`type(data).__name__`)
- Error handler still provides useful debugging information without
exposing secrets

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified the code passes type checking (`poetry run pyright
backend/util/json.py`)
- [x] Verified the code passes linting (`poetry run ruff check
backend/util/json.py`)
  - [x] Verified all pre-commit hooks pass
- [x] Reviewed the diff to ensure only the sensitive data logging was
removed
- [x] Confirmed that useful debugging information (error type, error
message, data type) is still logged

#### For configuration changes:
- N/A - No configuration changes required
2026-03-25 04:21:21 +00:00
Zamil Majdy
1771ed3bef dx(skills): codify PR workflow rules in skill docs and CLAUDE.md (#12531)
## Summary

- **pr-address skill**: Add explicit rule against empty commits for CI
re-triggers, and strengthen push-immediately guidance with rationale
- **Platform CLAUDE.md**: Add "split PRs by concern" guideline under
Creating Pull Requests

### Changes
- Updated `.claude/skills/pr-address/SKILL.md`
- Updated `autogpt_platform/CLAUDE.md`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan

#### Test plan:
- [x] Documentation-only changes — no functional tests needed
- [x] Verified markdown renders correctly
2026-03-25 10:19:30 +07:00
Zamil Majdy
550fa5a319 fix(backend): register AutoPilot sessions with stream registry for SSE updates (#12500)
### Changes 🏗️
- When the AutoPilot block executes a copilot session via
`collect_copilot_response`, it calls `stream_chat_completion_sdk`
directly, bypassing the copilot executor and stream registry. This means
the frontend sees no `active_stream` on the session and cannot connect
via SSE — users see a frozen chat with no updates until the turn fully
completes.
- Fix: register a `stream_registry` session in
`collect_copilot_response` and publish each chunk to Redis as events are
consumed. This allows the frontend to detect `active_stream=true` and
connect via the SSE reconnect endpoint for live streaming updates during
AutoPilot execution.
- Error handling is graceful — if stream registry fails, AutoPilot still
works normally, just without real-time frontend updates.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Trigger an AutoPilot block execution that creates a new chat
session
- [x] Verify the new session appears in the sidebar with streaming
indicator
- [x] Click on the session while AutoPilot is still executing — verify
SSE connects and messages stream in real-time
- [x] Verify that after AutoPilot completes, the session shows as
complete (no active_stream)
- [x] Test reconnection: disconnect and reconnect while AutoPilot is
running — verify stream resumes (found and fixed GeneratorExit bug that
caused stuck sessions)
- [x] E2E: 10 stream events published to Redis (StreamStart,
3×ToolInput, 3×ToolOutput, TextStart, TextEnd, StreamFinish)
  - [x] E2E: Redis xadd latency 0.2–3.4ms per chunk
  - [x] E2E: Chat sessions registered in Redis (confirmed via redis-cli)
2026-03-25 01:08:49 +00:00
Zamil Majdy
8528dffbf2 fix(backend): allow /tmp as valid path in E2B sandbox file tools (#12501)
## Summary
- Allow `/tmp` as a valid writable directory in E2B sandbox file tools
(`write_file`, `read_file`, `edit_file`, `glob`, `grep`)
- The E2B sandbox is already fully isolated, so restricting writes to
only `/home/user` was unnecessarily limiting — scripts and tools
commonly use `/tmp` for temporary files
- Extract `is_within_allowed_dirs()` helper in `context.py` to
centralize the allowed-directory check for both path resolution and
symlink escape detection

## Changes
- `context.py`: Add `E2B_ALLOWED_DIRS` tuple and `E2B_ALLOWED_DIRS_STR`,
introduce `is_within_allowed_dirs()`, update `resolve_sandbox_path()` to
use it
- `e2b_file_tools.py`: Update `_check_sandbox_symlink_escape()` to use
`is_within_allowed_dirs()`, update tool descriptions
- Tests: Add coverage for `/tmp` paths in both `context_test.py` and
`e2b_file_tools_test.py`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] All 59 existing + new tests pass (`poetry run pytest
backend/copilot/context_test.py
backend/copilot/sdk/e2b_file_tools_test.py`)
  - [x] `poetry run format` and `poetry run lint` pass clean
  - [x] Verify `/tmp` write works in live E2B sandbox
  - [x] E2E: Write file to /tmp/test.py in E2B sandbox via copilot
  - [x] E2E: Execute script from /tmp — output "Hello, World!"
  - [x] E2E: E2B sandbox lifecycle (create, use, pause) works correctly
2026-03-25 00:52:58 +00:00
Zamil Majdy
8fbf6a4b09 Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-03-25 06:55:47 +07:00
Zamil Majdy
239148596c fix(backend): filter SDK default credentials from credentials API responses (#12544)
## Summary

- Filter SDK-provisioned default credentials from credentials API list
endpoints
- Reuse `CredentialsMetaResponse` model from internal router in external
API (removes duplicate `CredentialSummary`)
- Add `is_sdk_default()` helper for identifying platform-provisioned
credentials
- Add `provider_matches()` to credential store for consistent provider
filtering
- Add tests for credential filtering behavior

### Changes
- `backend/data/model.py` — add `is_sdk_default()` helper
- `backend/api/features/integrations/router.py` — filter SDK defaults
from list endpoints
- `backend/api/external/v1/integrations.py` — reuse
`CredentialsMetaResponse`, filter SDK defaults
- `backend/integrations/credentials_store.py` — add `provider_matches()`
- `backend/sdk/registry.py` — update credential registration
- `backend/api/features/integrations/router_test.py` — new tests
- `backend/api/features/integrations/conftest.py` — test fixtures

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan

#### Test plan:
- [x] Unit tests for credential filtering (`router_test.py`)
- [x] Verify SDK default credentials excluded from API responses
- [x] Verify user-created credentials still returned normally
2026-03-25 06:54:54 +07:00
Zamil Majdy
a880d73481 feat(platform): dry-run execution mode with LLM block simulation (#12483)
## Why

Agent generation and building needs a way to test-run agents without
requiring real credentials or producing side effects. Currently, every
execution hits real APIs, consumes credits, and requires valid
credentials — making it impossible to debug or validate agent graphs
during the build phase without real consequences.

## Summary

Adds a `dry_run` execution mode to the copilot's `run_block` and
`run_agent` tools. When `dry_run=True`, every block execution is
simulated by an LLM instead of calling the real service — no real API
calls, no credentials consumed, no side effects.

Inspired by
[Significant-Gravitas/agent-simulator](https://github.com/Significant-Gravitas/agent-simulator).

### How it works

- **`backend/executor/simulator.py`** (new): `simulate_block()` builds a
prompt from the block's name, description, input/output schemas, and
actual input values, then calls `gpt-4o-mini` via the existing
OpenRouter client with JSON mode. Retries up to 5 times on JSON parse
failures. Missing output pins are filled with `None` (or `""` for the
`error` pin). Long inputs (>20k chars) are truncated before sending to
the LLM.
- **`ExecutionContext`**: Added `dry_run: bool = False` field; threaded
through `add_graph_execution()` so graph-level dry runs propagate to
every block execution.
- **`execute_block()` helper**: When `dry_run=True`, the function
short-circuits before any credential injection or credit checks, calls
`simulate_block()`, and returns a `[DRY RUN]`-prefixed
`BlockOutputResponse`.
- **`RunBlockTool`**: New `dry_run` boolean parameter.
- **`RunAgentTool`**: New `dry_run` boolean parameter; passes
`ExecutionContext(dry_run=True)` to graph execution.

### Tests

11 tests in `backend/copilot/tools/test_dry_run.py`:
- Correct output tuples from LLM response
- JSON retry logic (3 total calls when first 2 fail)
- All-retries-exhausted yields `SIMULATOR ERROR`
- Missing output pins filled with `None`/`""`
- No-client case
- Input truncation at 20k chars
- `execute_block(dry_run=True)` skips real `block.execute()`
- Response format: `[DRY RUN]` message, `success=True`
- `dry_run=False` unchanged (real path)
- `RunBlockTool` parameter presence
- `dry_run` kwarg forwarding

## Test plan
- [x] Run `pytest backend/copilot/tools/test_dry_run.py -v` — all 11
pass
- [x] Call `run_block` with `dry_run=true` in copilot; verify no real
API calls occur and output contains `[DRY RUN]`
- [x] Call `run_agent` with `dry_run=true`; verify execution is created
with `dry_run=True` in context
- [x] E2E: Simulate button (flask icon) present in builder alongside
play button
- [x] E2E: Simulated run labeled with "(Simulated)" suffix and badge in
Library
- [x] E2E: No credits consumed during dry-run
2026-03-24 22:36:47 +00:00
Zamil Majdy
80bfd64ffa Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-03-24 21:18:11 +07:00
Zamil Majdy
0076ad2a1a hotfix(blocks): bump stagehand ^0.5.1 → ^3.4.0 to fix yanked litellm (#12539)
## Summary

**Critical CI fix** — litellm was compromised in a supply chain attack
(versions 1.82.7/1.82.8 contained infostealer malware) and PyPI
subsequently yanked many litellm versions including the 1.7x range that
stagehand 0.5.x depended on. This breaks `poetry lock` in CI for all
PRs.

- Bump `stagehand` from `^0.5.1` to `^3.4.0` — Stagehand v3 is a
Stainless-generated HTTP API client that **no longer depends on
litellm**, completely removing litellm from our dependency tree
- Migrate stagehand blocks to use `AsyncStagehand` + session-based API
(`sessions.start`, `session.navigate/act/observe/extract`)
- Net reduction of ~430 lines in `poetry.lock` from dropping litellm and
its transitive dependencies

## Why

All CI pipelines are blocked because `poetry lock` fails to resolve
yanked litellm versions that stagehand 0.5.x required.

## Test plan

- [x] CI passes (poetry lock resolves, backend tests green)
- [ ] Verify stagehand blocks still function with the new session-based
API
2026-03-24 21:17:19 +07:00
Zamil Majdy
edb3d322f0 feat(backend/copilot): parallel block execution via infrastructure-level pre-launch (#12472)
## Summary

- Implements **infrastructure-level parallel tool execution** for
CoPilot: all tools called in a single LLM turn now execute concurrently
with zero changes to individual tool implementations or LLM prompts.
- Adds `pre_launch_tool_call()` to `tool_adapter.py`: when an
`AssistantMessage` with `ToolUseBlock`s arrives, all tools are
immediately fired as `asyncio.Task`s before the SDK dispatches MCP
handlers. Each MCP handler then awaits its pre-launched task instead of
executing fresh.
- Adds a `_tool_task_queues` `ContextVar` (initialized per-session in
`set_execution_context()`) so concurrent sessions never share task
queues.
- DRY refactor: extracts `prepare_block_for_execution()`,
`check_hitl_review()`, and `BlockPreparation` dataclass into
`helpers.py` so the execution pipeline is reusable.
- 10 unit tests for the parallel pre-launch infrastructure (queue
enqueue/dequeue, MCP prefix stripping, fallback path, `CancelledError`
handling, multi-same-tool FIFO ordering).

## Root cause

The Claude Agent SDK CLI sends MCP tool calls as sequential
request-response pairs: it waits for each `control_response` before
issuing the next `mcp_message`. Even though Python dispatches handlers
with `start_soon`, the CLI never issues call B until call A's response
is sent — blocks always ran sequentially. The pre-launch pattern fixes
this at the infrastructure level by starting all tasks before the SDK
even dispatches the first handler.

## Test plan

- [x] `poetry run pytest backend/copilot/sdk/tool_adapter_test.py` — 27
tests pass (10 new parallel infra tests)
- [x] `poetry run pytest backend/copilot/tools/helpers_test.py` — 20
tests pass
- [x] `poetry run pytest backend/copilot/tools/run_block_test.py
backend/copilot/tools/test_run_block_details.py` — all pass
- [x] Manually test in CoPilot: ask the agent to run two blocks
simultaneously — verify both start executing before either completes
- [x] E2E: Both GetCurrentTimeBlock and CalculatorBlock executed
concurrently (time=09:35:42, 42×7=294)
- [x] E2E: Pre-launch mechanism active — two run_block events at same
timestamp (3ms apart)
- [x] E2E: Arg-mismatch fallback tested — system correctly cancels and
falls back to direct execution
2026-03-24 20:27:46 +07:00
Zamil Majdy
9381057079 refactor(platform): rename SmartDecisionMakerBlock to OrchestratorBlock (#12511)
## Summary
- Renames `SmartDecisionMakerBlock` to `OrchestratorBlock` across the
entire codebase
- The block supports iteration/agent mode and general tool
orchestration, so "Smart Decision Maker" no longer accurately describes
its capabilities
- Block UUID (`3b191d9f-356f-482d-8238-ba04b6d18381`) remains unchanged
— fully backward compatible with existing graphs

## Changes
- Renamed block class, constants, file names, test files, docs, and
frontend enum
- Updated copilot agent generator (helpers, validator, fixer) references
- Updated agent generation guide documentation
- No functional changes — pure rename refactor

### For code changes
- [x] I have clearly listed my changes in the PR description
- [x] I have made corresponding changes to the documentation
- [x] My changes do not generate new warnings or errors
- [x] New and existing unit tests pass locally with my changes

## Test plan
- [x] All pre-commit hooks pass (typecheck, lint, format)
- [x] Existing graphs with this block continue to load and execute (same
UUID)
- [x] Agent mode / iteration mode works as before
- [x] Copilot agent generator correctly references the renamed block
2026-03-24 19:16:42 +07:00
Otto
f21a36ca37 fix(backend): downgrade user-caused LLM API errors to warning level (#12516)
Requested by @majdyz

Follow-up to #12513. Anthropic/OpenAI 401, 403, and 429 errors are
user-caused (bad API keys, forbidden, rate limits) and should not hit
Sentry as exceptions.

### Changes

**Changes in `blocks/llm.py`:**
- Anthropic `APIError` handler (line ~950): check `status_code` — use
`logger.warning()` for 401/403/429, keep `logger.error()` for server
errors
- Generic `Exception` handler in LLM block `run()` (line ~1467): same
pattern — `logger.warning()` for user-caused status codes,
`logger.exception()` for everything else
- Extracted `USER_ERROR_STATUS_CODES = (401, 403, 429)` module-level
constant
- Added `break` to short-circuit retry loop for user-caused errors
- Removed double-logging from inner Anthropic handler

**Changes in `blocks/test/test_llm.py`:**
- Added 8 regression tests covering 401/403/429 fast-exit and 500 retry
behavior

**Sentry issues addressed:**
- AUTOGPT-SERVER-8B6, 8B7, 8B8 — `[LLM-Block] Anthropic API error: Error
code: 401 - invalid x-api-key`
- Any OpenAI 401/403/429 errors hitting the generic exception handler

Part of SECRT-2166

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan

#### Test plan:
- [x] Unit tests for 401/403/429 Anthropic errors → warning log, no
retry
- [x] Unit tests for 500 Anthropic errors → error log, retry
- [x] Unit tests for 401/403/429 OpenAI errors → warning log, no retry
- [x] Unit tests for 500 OpenAI errors → error log, retry
- [x] Verified USER_ERROR_STATUS_CODES constant is used consistently
- [x] Verified no double-logging in Anthropic handler path

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>

---------

Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-24 10:59:04 +00:00
Zamil Majdy
ee5382a064 feat(copilot): add tool/block capability filtering to AutoPilotBlock (#12482)
## Summary

- Adds `CopilotPermissions` model (`copilot/permissions.py`) — a
capability filter that restricts which tools and blocks the
AutoPilot/Copilot may use during a single execution
- Exposes 4 new `advanced=True` fields on `AutoPilotBlock`: `tools`,
`tools_exclude`, `blocks`, `blocks_exclude`
- Threads permissions through the full execution path: `AutoPilotBlock`
→ `collect_copilot_response` → `stream_chat_completion_sdk` →
`run_block`
- Implements recursion inheritance via contextvar: sub-agent executions
can only be *more* restrictive than their parent

## Design

**Tool filtering** (`tools` + `tools_exclude`):
- `tools_exclude=True` (default): `tools` is a **blacklist** — listed
tools denied, all others allowed. Empty list = allow all.
- `tools_exclude=False`: `tools` is a **whitelist** — only listed tools
are allowed.
- Users specify short names (`run_block`, `web_fetch`, `Read`, `Task`,
…) — mapped to full SDK format internally.
- Validated eagerly at block-run time with a clear error listing valid
names.

**Block filtering** (`blocks` + `blocks_exclude`):
- Same semantics as tool filtering, applied inside `run_block` via
contextvar.
- Each entry can be a full UUID, an 8-char partial UUID (first segment),
or a case-insensitive block name.
- Validated against the live block registry; invalid identifiers surface
a helpful error before the session is created.

**Recursion inheritance**:
- `_inherited_permissions` contextvar stores the parent execution's
permissions.
- On each `AutoPilotBlock.run()`, the child's permissions are merged
with the parent via `merged_with_parent()` — effective allowed sets are
intersected (tools) and the parent chain is kept for block checks.
- Sub-agents can never expand what the parent allowed.

## Test plan

- [x] 68 new unit tests in `copilot/permissions_test.py` and
`blocks/autopilot_permissions_test.py`
- [x] Block identifier matching: full UUID, partial UUID, name,
case-insensitivity
- [x] Tool allow/deny list semantics including edge cases (empty list,
unknown tool)
- [x] Parent/child merging and recursion ceiling correctness
- [x] `validate_tool_names` / `validate_block_identifiers` with mock
block registry
- [x] `apply_tool_permissions` SDK tool-list integration
- [x] `AutoPilotBlock.run()` — invalid tool/block yields error before
session creation
- [x] `AutoPilotBlock.run()` — valid permissions forwarded to
`execute_copilot`
- [x] Existing `AutoPilotBlock` block tests still pass (2/2)
- [x] All hooks pass (pyright, ruff, black, isort)
- [x] E2E: CoPilot chat works end-to-end with E2B sandbox (12s stream)
- [x] E2E: Permission fields render in Builder UI (Tools combobox,
exclude toggles)
- [x] E2E: Agent with restricted permissions (whitelist web_fetch only)
executes correctly
- [x] E2E: Permission values preserved through API round-trip
2026-03-24 07:49:58 +00:00
Nicholas Tindle
b80e5ea987 fix(backend): allow admins to download submitted agents pending review (#12535)
## Why

Admins cannot download submitted-but-not-yet-approved agents from
`/admin/marketplace`. Clicking "Download" fails silently with a Server
Components render error. This blocks admins from reviewing agents that
companies have submitted.

## What

Remove the redundant ownership/marketplace check from
`get_graph_as_admin()` that was silently tightened in PR #11323 (Nov
2025). Add regression tests for both the admin download path and the
non-admin marketplace access control.

## How

**Root cause:** In PR #11323, Reinier refactored an inline
`StoreListingVersion` query (which had no status filter) into a call to
`is_graph_published_in_marketplace()` (which requires `submissionStatus:
APPROVED`). This was collateral cleanup — his PR focused on sub-agent
execution permissions — but it broke admin download of pending agents.

**Fix:** Remove the ownership/marketplace check from
`get_graph_as_admin()`, keeping only the null guard. This is safe
because `get_graph_as_admin` is only callable through admin-protected
routes (`requires_admin_user` at router level).

**Tests added:**
- `test_admin_can_access_pending_agent_not_owned` — admin can access a
graph they don't own that isn't APPROVED
- `test_admin_download_pending_agent_with_subagents` — admin export
includes sub-graphs
- `test_get_graph_non_owner_approved_marketplace_agent` — protects PR
#11323: non-owners CAN access APPROVED agents
- `test_get_graph_non_owner_pending_marketplace_agent_denied` — protects
PR #11323: non-owners CANNOT access PENDING agents

### Checklist

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] 4 regression tests pass locally
  - [x] Admin can download pending agents (verified via unit test)
  - [x] Non-admin marketplace access control preserved

## Test plan
- [ ] Verify admin can download a submitted-but-not-approved agent from
`/admin/marketplace`
- [ ] Verify non-admin users still cannot access admin endpoints
- [ ] Verify the download succeeds without console errors

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Changes access-control behavior for admin graph retrieval; risk is
mitigated by route-level admin auth but misuse of `get_graph_as_admin()`
outside admin-protected routes would expose non-approved graphs.
> 
> **Overview**
> Admins can now download/review **submitted-but-not-approved**
marketplace agents: `get_graph_as_admin()` no longer enforces ownership
or *marketplace APPROVED* checks, only returning `None` when the graph
doesn’t exist.
> 
> Adds regression tests covering the admin download/export path
(including sub-graphs) and confirming non-admin behavior is unchanged:
non-owners can fetch **APPROVED** marketplace graphs but cannot access
**pending** ones.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
a6d2d69ae4. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 07:40:17 +00:00
Zamil Majdy
3d4fcfacb6 fix(backend): add circuit breaker for infinite tool call retry loops (#12499)
## Summary
- Adds a two-layer circuit breaker to prevent AutoPilot from looping
infinitely when tool calls fail with empty parameters
- **Tool-level**: After 3 consecutive identical failures per tool,
returns a hard-stop message instructing the model to output content as
text instead of retrying
- **Stream-level**: After 6 consecutive empty tool calls (`input: {}`),
aborts the stream entirely with a user-visible error and retry button

## Background
In session `c5548b48`, the model completed all research successfully but
then spent 51+ minutes in an infinite loop trying to write output —
every tool call was sent with `input: {}` (likely due to context
saturation preventing argument serialization). 21+ identical failing
tool calls with no circuit breaker.

## Changes
- `tool_adapter.py`: Added `_check_circuit_breaker`,
`_record_tool_failure`, `_clear_tool_failures` functions with a
`ContextVar`-based tracker. Integrated into both `create_tool_handler`
(BaseTool) and the `_truncating` wrapper (all tools).
- `service.py`: Added empty-tool-call detection in the main stream loop
that counts consecutive `AssistantMessage`s with empty
`ToolUseBlock.input` and aborts after the limit.
- `test_circuit_breaker.py`: 7 unit tests covering threshold behavior,
per-args tracking, reset on success, and uninitialized tracker safety.

## Test plan
- [x] Unit tests pass (`pytest
backend/copilot/sdk/test_circuit_breaker.py` — 8/8 passing)
- [x] Pre-commit hooks pass (Ruff, Black, isort, typecheck all pass)
- [x] E2E: CoPilot tool calls work normally (GetCurrentTimeBlock
returned 09:16:39 UTC)
- [x] E2E: Circuit breaker pass-through verified (successful calls don't
trigger breaker)
- [x] E2E: Circuit breaker code integrated into tool_adapter truncating
wrapper
2026-03-24 05:45:12 +00:00
Zamil Majdy
32eac6d52e dx(skills): improve /pr-test to require screenshots, state verification, and fix accountability (#12527)
## Summary
- Add "Critical Requirements" section making screenshots at every step,
PR comment posting, state verification, negative tests, and full
evidence reports non-negotiable
- Add "State Manipulation for Realistic Testing" section with Redis CLI,
DB query, and API before/after patterns
- Strengthen fix mode to require before/after screenshot pairs, rebuild
only affected services, and commit after each fix
- Expand test report format to include API evidence and screenshot
evidence columns
- Bump version to 2.0.0

## Test plan
- [x] Run `/pr-test` on an existing PR and verify it follows the new
critical requirements
- [x] Verify screenshots are posted to PR comment
- [x] Verify fix mode produces before/after screenshot pairs
2026-03-24 12:35:05 +07:00
dependabot[bot]
9762f4cde7 chore(libs/deps-dev): bump the development-dependencies group across 1 directory with 2 updates (#12523)
Bumps the development-dependencies group with 2 updates in the
/autogpt_platform/autogpt_libs directory:
[pytest-cov](https://github.com/pytest-dev/pytest-cov) and
[ruff](https://github.com/astral-sh/ruff).

Updates `pytest-cov` from 7.0.0 to 7.1.0
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst">pytest-cov's
changelog</a>.</em></p>
<blockquote>
<h2>7.1.0 (2026-03-21)</h2>
<ul>
<li>
<p>Fixed total coverage computation to always be consistent, regardless
of reporting settings.
Previously some reports could produce different total counts, and
consequently can make --cov-fail-under behave different depending on
reporting options.
See <code>[#641](https://github.com/pytest-dev/pytest-cov/issues/641)
&lt;https://github.com/pytest-dev/pytest-cov/issues/641&gt;</code>_.</p>
</li>
<li>
<p>Improve handling of ResourceWarning from sqlite3.</p>
<p>The plugin adds warning filter for sqlite3
<code>ResourceWarning</code> unclosed database (since 6.2.0).
It checks if there is already existing plugin for this message by
comparing filter regular expression.
When filter is specified on command line the message is escaped and does
not match an expected message.
A check for an escaped regular expression is added to handle this
case.</p>
<p>With this fix one can suppress <code>ResourceWarning</code> from
sqlite3 from command line::</p>
<p>pytest -W &quot;ignore:unclosed database in &lt;sqlite3.Connection
object at:ResourceWarning&quot; ...</p>
</li>
<li>
<p>Various improvements to documentation.
Contributed by Art Pelling in
<code>[#718](https://github.com/pytest-dev/pytest-cov/issues/718)
&lt;https://github.com/pytest-dev/pytest-cov/pull/718&gt;</code>_ and
&quot;vivodi&quot; in
<code>[#738](https://github.com/pytest-dev/pytest-cov/issues/738)
&lt;https://github.com/pytest-dev/pytest-cov/pull/738&gt;</code><em>.
Also closed
<code>[#736](https://github.com/pytest-dev/pytest-cov/issues/736)
&lt;https://github.com/pytest-dev/pytest-cov/issues/736&gt;</code></em>.</p>
</li>
<li>
<p>Fixed some assertions in tests.
Contributed by in Markéta Machová in
<code>[#722](https://github.com/pytest-dev/pytest-cov/issues/722)
&lt;https://github.com/pytest-dev/pytest-cov/pull/722&gt;</code>_.</p>
</li>
<li>
<p>Removed unnecessary coverage configuration copying (meant as a backup
because reporting commands had configuration side-effects before
coverage 5.0).</p>
</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="66c8a526b1"><code>66c8a52</code></a>
Bump version: 7.0.0 → 7.1.0</li>
<li><a
href="f707662478"><code>f707662</code></a>
Make the examples use pypy 3.11.</li>
<li><a
href="6049a78478"><code>6049a78</code></a>
Make context test use the old ctracer (seems the new sysmon tracer
behaves di...</li>
<li><a
href="8ebf20bbbc"><code>8ebf20b</code></a>
Update changelog.</li>
<li><a
href="861d30e60d"><code>861d30e</code></a>
Remove the backup context manager - shouldn't be needed since coverage
5.0, ...</li>
<li><a
href="fd4c956014"><code>fd4c956</code></a>
Pass the precision on the nulled total (seems that there's some caching
goion...</li>
<li><a
href="78c9c4ecb0"><code>78c9c4e</code></a>
Only run the 3.9 on older deps.</li>
<li><a
href="4849a922e8"><code>4849a92</code></a>
Punctuation.</li>
<li><a
href="197c35e2f3"><code>197c35e</code></a>
Update changelog and hopefully I don't forget to publish release again
:))</li>
<li><a
href="14dc1c92d4"><code>14dc1c9</code></a>
Update examples to use 3.11 and make the adhoc layout example look a bit
more...</li>
<li>Additional commits viewable in <a
href="https://github.com/pytest-dev/pytest-cov/compare/v7.0.0...v7.1.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `ruff` from 0.15.0 to 0.15.7
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/ruff/releases">ruff's
releases</a>.</em></p>
<blockquote>
<h2>0.15.7</h2>
<h2>Release Notes</h2>
<p>Released on 2026-03-19.</p>
<h3>Preview features</h3>
<ul>
<li>Display output severity in preview (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23845">#23845</a>)</li>
<li>Don't show <code>noqa</code> hover for non-Python documents (<a
href="https://redirect.github.com/astral-sh/ruff/pull/24040">#24040</a>)</li>
</ul>
<h3>Rule changes</h3>
<ul>
<li>[<code>pycodestyle</code>] Recognize <code>pyrefly:</code> as a
pragma comment (<code>E501</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/24019">#24019</a>)</li>
</ul>
<h3>Server</h3>
<ul>
<li>Don't return code actions for non-Python documents (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23905">#23905</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>Add company AI policy to contributing guide (<a
href="https://redirect.github.com/astral-sh/ruff/pull/24021">#24021</a>)</li>
<li>Document editor features for Markdown code formatting (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23924">#23924</a>)</li>
<li>[<code>pylint</code>] Improve phrasing (<code>PLC0208</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/24033">#24033</a>)</li>
</ul>
<h3>Other changes</h3>
<ul>
<li>Use PEP 639 license information (<a
href="https://redirect.github.com/astral-sh/ruff/pull/19661">#19661</a>)</li>
</ul>
<h3>Contributors</h3>
<ul>
<li><a
href="https://github.com/tmimmanuel"><code>@​tmimmanuel</code></a></li>
<li><a
href="https://github.com/DimitriPapadopoulos"><code>@​DimitriPapadopoulos</code></a></li>
<li><a
href="https://github.com/amyreese"><code>@​amyreese</code></a></li>
<li><a href="https://github.com/statxc"><code>@​statxc</code></a></li>
<li><a href="https://github.com/dylwil3"><code>@​dylwil3</code></a></li>
<li><a
href="https://github.com/hunterhogan"><code>@​hunterhogan</code></a></li>
<li><a
href="https://github.com/renovate"><code>@​renovate</code></a></li>
</ul>
<h2>Install ruff 0.15.7</h2>
<h3>Install prebuilt binaries via shell script</h3>
<pre lang="sh"><code>curl --proto '=https' --tlsv1.2 -LsSf
https://releases.astral.sh/github/ruff/releases/download/0.15.7/ruff-installer.sh
| sh
</code></pre>
<h3>Install prebuilt binaries via powershell script</h3>
<pre lang="sh"><code>powershell -ExecutionPolicy Bypass -c &quot;irm
https://releases.astral.sh/github/ruff/releases/download/0.15.7/ruff-installer.ps1
| iex&quot;
&lt;/tr&gt;&lt;/table&gt; 
</code></pre>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md">ruff's
changelog</a>.</em></p>
<blockquote>
<h2>0.15.7</h2>
<p>Released on 2026-03-19.</p>
<h3>Preview features</h3>
<ul>
<li>Display output severity in preview (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23845">#23845</a>)</li>
<li>Don't show <code>noqa</code> hover for non-Python documents (<a
href="https://redirect.github.com/astral-sh/ruff/pull/24040">#24040</a>)</li>
</ul>
<h3>Rule changes</h3>
<ul>
<li>[<code>pycodestyle</code>] Recognize <code>pyrefly:</code> as a
pragma comment (<code>E501</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/24019">#24019</a>)</li>
</ul>
<h3>Server</h3>
<ul>
<li>Don't return code actions for non-Python documents (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23905">#23905</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>Add company AI policy to contributing guide (<a
href="https://redirect.github.com/astral-sh/ruff/pull/24021">#24021</a>)</li>
<li>Document editor features for Markdown code formatting (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23924">#23924</a>)</li>
<li>[<code>pylint</code>] Improve phrasing (<code>PLC0208</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/24033">#24033</a>)</li>
</ul>
<h3>Other changes</h3>
<ul>
<li>Use PEP 639 license information (<a
href="https://redirect.github.com/astral-sh/ruff/pull/19661">#19661</a>)</li>
</ul>
<h3>Contributors</h3>
<ul>
<li><a
href="https://github.com/tmimmanuel"><code>@​tmimmanuel</code></a></li>
<li><a
href="https://github.com/DimitriPapadopoulos"><code>@​DimitriPapadopoulos</code></a></li>
<li><a
href="https://github.com/amyreese"><code>@​amyreese</code></a></li>
<li><a href="https://github.com/statxc"><code>@​statxc</code></a></li>
<li><a href="https://github.com/dylwil3"><code>@​dylwil3</code></a></li>
<li><a
href="https://github.com/hunterhogan"><code>@​hunterhogan</code></a></li>
<li><a
href="https://github.com/renovate"><code>@​renovate</code></a></li>
</ul>
<h2>0.15.6</h2>
<p>Released on 2026-03-12.</p>
<h3>Preview features</h3>
<ul>
<li>Add support for <code>lazy</code> import parsing (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23755">#23755</a>)</li>
<li>Add support for star-unpacking of comprehensions (PEP 798) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23788">#23788</a>)</li>
<li>Reject semantic syntax errors for lazy imports (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23757">#23757</a>)</li>
<li>Drop a few rules from the preview default set (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23879">#23879</a>)</li>
<li>[<code>airflow</code>] Flag <code>Variable.get()</code> calls
outside of task execution context (<code>AIR003</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23584">#23584</a>)</li>
<li>[<code>airflow</code>] Flag runtime-varying values in DAG/task
constructor arguments (<code>AIR304</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23631">#23631</a>)</li>
<li>[<code>flake8-bugbear</code>] Implement
<code>delattr-with-constant</code> (<code>B043</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/23737">#23737</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0ef39de46c"><code>0ef39de</code></a>
Bump 0.15.7 (<a
href="https://redirect.github.com/astral-sh/ruff/issues/24049">#24049</a>)</li>
<li><a
href="beb543b5c6"><code>beb543b</code></a>
[ty] ecosystem-analyzer: Fail on newly panicking projects (<a
href="https://redirect.github.com/astral-sh/ruff/issues/24043">#24043</a>)</li>
<li><a
href="378fe73092"><code>378fe73</code></a>
Don't show noqa hover for non-Python documents (<a
href="https://redirect.github.com/astral-sh/ruff/issues/24040">#24040</a>)</li>
<li><a
href="b5665bd18e"><code>b5665bd</code></a>
[<code>pylint</code>] Improve phrasing (<code>PLC0208</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/issues/24033">#24033</a>)</li>
<li><a
href="6e20f22190"><code>6e20f22</code></a>
test: migrate <code>show_settings</code> and <code>version</code> tests
to use <code>CliTest</code> (<a
href="https://redirect.github.com/astral-sh/ruff/issues/23702">#23702</a>)</li>
<li><a
href="f99b284c1f"><code>f99b284</code></a>
Drain file watcher events during test setup (<a
href="https://redirect.github.com/astral-sh/ruff/issues/24030">#24030</a>)</li>
<li><a
href="744c996c35"><code>744c996</code></a>
[ty] Filter out unsatisfiable inference attempts during generic call
narrowin...</li>
<li><a
href="16160958bd"><code>1616095</code></a>
[ty] Avoid inferring intersection types for call arguments (<a
href="https://redirect.github.com/astral-sh/ruff/issues/23933">#23933</a>)</li>
<li><a
href="7f275f431b"><code>7f275f4</code></a>
[ty] Pin mypy_primer in <code>setup_primer_project.py</code> (<a
href="https://redirect.github.com/astral-sh/ruff/issues/24020">#24020</a>)</li>
<li><a
href="7255e362e4"><code>7255e36</code></a>
[<code>pycodestyle</code>] Recognize <code>pyrefly:</code> as a pragma
comment (<code>E501</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/issues/24019">#24019</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/astral-sh/ruff/compare/0.15.0...0.15.7">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 01:36:45 +00:00
Otto
76901ba22f docs: add Why/What/How structure to PR template, CLAUDE.md, and PR skills (#12525)
Requested by @majdyz

### Why / What / How

**Why:** PR descriptions currently explain the *what* and *how* but not
the *why*. Without motivation context, reviewers can't judge whether an
approach fits the problem. Nick flagged this in standup: "The PR
descriptions you use are explaining the what not the why."

**What:** Adds a consistent Why / What / How structure to PR
descriptions across the entire workflow — template, CLAUDE.md guidance,
and all PR-related skills (`/pr-review`, `/pr-test`, `/pr-address`).

**How:**
- **`.github/PULL_REQUEST_TEMPLATE.md`**: Replaced the old vague
`Changes` heading with a single `Why / What / How` section with guiding
comments
- **`autogpt_platform/CLAUDE.md`**: Added bullet under "Creating Pull
Requests" requiring the Why/What/How structure
- **`.claude/skills/pr-review/SKILL.md`**: Added "Read the PR
description" step before reading the diff, and "Description quality" to
the review checklist
- **`.claude/skills/pr-test/SKILL.md`**: Updated Step 1 to read the PR
description and understand Why/What/How before testing
- **`.claude/skills/pr-address/SKILL.md`**: Added "Read the PR
description" step before fetching comments

## Test plan
- [x] All five files reviewed for correct formatting and consistency

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-24 01:35:39 +00:00
Zamil Majdy
23b65939f3 fix(backend/db): add DB_STATEMENT_CACHE_SIZE env var for Prisma engine (#12521)
## Summary
- Add `DB_STATEMENT_CACHE_SIZE` env var support for Prisma query engine
- Wires through as `statement_cache_size` URL parameter to control the
LRU prepared statement cache per connection in the Rust binary engine

## Why
Live investigation on dev pods showed the Prisma Rust engine growing
from 34MB to 932MB over ~1hr due to unbounded query plan cache. Despite
`pgbouncer=true` in the DATABASE_URL (which should disable caching), the
engine still caches.

This gives explicit control: setting `DB_STATEMENT_CACHE_SIZE=0`
disables the cache entirely.

## Live data (dev)
```
Fresh pod:  Python=693MB, Engine=34MB,  Total=727MB
Bloated:    Python=2.1GB, Engine=932MB, Total=3GB
```

## Infra companion PR

[AutoGPT_cloud_infrastructure#299](https://github.com/Significant-Gravitas/AutoGPT_cloud_infrastructure/pull/299)
sets `DB_STATEMENT_CACHE_SIZE=0` along with `PYTHONMALLOC=malloc` and
memory limit changes.

## Test plan
- [ ] Deploy to dev and monitor Prisma engine memory over 1hr
- [ ] Verify queries still work correctly with cache disabled
- [ ] Compare engine RSS on fresh vs aged pods
2026-03-23 23:57:28 +07:00
Zamil Majdy
1c27eaac53 dx(skills): improve /pr-test skill to show screenshots with explanations (#12518)
## Summary
- Update /pr-test skill to consistently show screenshots inline to the
user with explanations
- Post PR comments with inline images and per-screenshot descriptions
(not just local file paths)
- Simplify GitHub Git API upload flow for screenshot hosting

## Changes
- Step 5: Take screenshots at every significant test step (aim for 1+
per scenario)
- Step 6 (new): Show every screenshot to the user via Read tool with 2-3
sentence explanations
- Step 7: Post PR comment with inline images, summary table, and
per-screenshot context

## Test plan
- [x] Tested end-to-end on PR #12512 — screenshots uploaded and rendered
correctly in PR comment
2026-03-23 23:11:21 +07:00
Zamil Majdy
923b164794 fix(backend): use system chromium for agent-browser on all architectures (#12473)
## Summary

- Replaces the arch-conditional chromium install (ARM64 vs AMD64) with a
single approach: always use the distro-packaged `chromium` and set
`AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium`
- Removes `agent-browser install` entirely (it downloads Chrome for
Testing, which has no ARM64 binary)
- Removes the `entrypoint.sh` wrapper script that was setting the env
var at runtime
- Updates `autogpt_platform/db/docker/docker-compose.yml`: removes
`external: true` from the network declarations so the Supabase stack can
be brought up standalone (needed for the Docker integration tests in the
test plan below — without this, `docker compose up` fails unless the
platform stack is already running); also sets
`GOTRUE_MAILER_AUTOCONFIRM: true` for local dev convenience (no SMTP
setup required on first run — this compose file is not used in
production)
- Updates `autogpt_platform/docker-compose.platform.yml`: mounts the
`workspace` volume so agent-browser results (screenshots, snapshots) are
accessible from other services; without this the copilot workspace write
fails in Docker

## Verification

Tested via Docker build on arm64 (Apple Silicon):
```
=== Testing agent-browser with system chromium ===
✓ Example Domain
  https://example.com/
=== SUCCESS: agent-browser launched with system chromium ===
```
agent-browser navigated to example.com in ~1.5s using system chromium
(v146 from Debian trixie).

## Test plan

- [x] Docker build test on arm64: `agent-browser open
https://example.com` succeeds with system chromium
- [x] Verify amd64 Docker build still works (CI)
2026-03-23 20:54:03 +07:00
Zamil Majdy
e86ac21c43 feat(platform): add workflow import from other tools (n8n, Make.com, Zapier) (#12440)
## Summary
- Enable one-click import of workflows from other platforms (n8n,
Make.com, Zapier, etc.) into AutoGPT via CoPilot
- **No backend endpoint** — import is entirely client-side: the dialog
reads the file or fetches the n8n template URL, uploads the JSON to the
workspace via `uploadFileDirect`, stores the file reference in
`sessionStorage`, and redirects to CoPilot with `autosubmit=true`
- CoPilot receives the workflow JSON as a proper file attachment and
uses the existing agent-generator pipeline to convert it
- Library dialog redesigned: 2 tabs — "AutoGPT agent" (upload exported
agent JSON) and "Another platform" (file upload + optional n8n URL)

## How it works
1. User uploads a workflow JSON (or pastes an n8n template URL)
2. Frontend fetches/reads the JSON and uploads it to the user's
workspace via the existing file upload API
3. User is redirected to `/copilot?source=import&autosubmit=true`
4. CoPilot picks up the file from `sessionStorage` and sends it as a
`FileUIPart` attachment with a prompt to recreate the workflow as an
AutoGPT agent

## Test plan
- [x] Manual test: import a real n8n workflow JSON via the dialog
- [x] Manual test: paste an n8n template URL and verify it fetches +
converts
- [x] Manual test: import Make.com / Zapier workflow export JSON
- [x] Repeated imports don't cause 409 conflicts (filenames use
`crypto.randomUUID()`)
- [x] E2E: Import dialog has 2 tabs (AutoGPT agent + Another platform)
- [x] E2E: n8n quick-start template buttons present
- [x] E2E: n8n URL input enables Import button on valid URL
- [x] E2E: Workspace upload API returns file_id
2026-03-23 13:03:02 +00:00
Lluis Agusti
94224be841 Merge remote-tracking branch 'origin/master' into dev 2026-03-23 20:42:32 +08:00
Otto
da4bdc7ab9 fix(backend+frontend): reduce Sentry noise from user-caused errors (#12513)
Requested by @majdyz

User-caused errors (no payment method, webhook agent invocation, missing
credentials, bad API keys) were hitting Sentry via `logger.exception()`
in the `ValueError` handler, creating noise that obscures real bugs.
Additionally, a frontend crash on the copilot page (BUILDER-71J) needed
fixing.

**Changes:**

**Backend — rest_api.py**
- Set `log_error=False` for the `ValueError` exception handler (line
278), consistent with how `FolderValidationError` and `NotFoundError`
are already handled. User-caused 400 errors no longer trigger
`logger.exception()` → Sentry.

**Backend — executor/manager.py**
- Downgrade `ExecutionManager` input validation skip errors from `error`
to `warning` level. Missing credentials is expected user behavior, not
an internal error.

**Backend — blocks/llm.py**
- Sanitize unpaired surrogates in LLM prompt content before sending to
provider APIs. Prevents `UnicodeEncodeError: surrogates not allowed`
when httpx encodes the JSON body (AUTOGPT-SERVER-8AX).

**Frontend — package.json**
- Upgrade `ai` SDK from `6.0.59` to `6.0.134` to fix BUILDER-71J
(`TypeError: undefined is not an object (evaluating
'this.activeResponse.state')` on /copilot page). This is a known issue
in the Vercel AI SDK fixed in later patch versions.

**Sentry issues addressed:**
- `No payment method found` (ValueError → 400)
- `This agent is triggered by an external event (webhook)` (ValueError →
400)
- `Node input updated with non-existent credentials` (ValueError → 400)
- `[ExecutionManager] Skip execution, input validation error: missing
input {credentials}`
- `UnicodeEncodeError: surrogates not allowed` (AUTOGPT-SERVER-8AX)
- `TypeError: activeResponse.state` (BUILDER-71J)

Resolves SECRT-2166

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>

---------

Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-23 12:22:49 +00:00
Zamil Majdy
7176cecf25 perf(copilot): reduce tool schema token cost by 34% (#12398)
## Summary

Reduce CoPilot per-turn token overhead by systematically trimming tool
descriptions, parameter schemas, and system prompt content. All 35 MCP
tool schemas are passed on every SDK call — this PR reduces their size.

### Strategy

1. **Tool descriptions**: Trimmed verbose multi-sentence explanations to
concise single-sentence summaries while preserving meaning
2. **Parameter schemas**: Shortened parameter descriptions to essential
info, removed some `default` values (handled in code)
3. **System prompt**: Condensed `_SHARED_TOOL_NOTES` and storage
supplement template in `prompting.py`
4. **Cross-tool references**: Removed duplicate workflow hints (e.g.
"call find_block before run_block" appeared in BOTH tools — kept only in
the dependent tool). Critical cross-tool references retained (e.g.
`continue_run_block` in `run_block`, `fix_agent_graph` in
`validate_agent`, `get_doc_page` in `search_docs`, `web_fetch`
preference in `browser_navigate`)

### Token Impact

| Metric | Before | After | Reduction |
|--------|--------|-------|-----------|
| System Prompt | ~865 tokens | ~497 tokens | 43% |
| Tool Schemas | ~9,744 tokens | ~6,470 tokens | 34% |
| **Grand Total** | **~10,609 tokens** | **~6,967 tokens** | **34%** |

Saves **~3,642 tokens per conversation turn**.

### Key Decisions

- **Mostly description changes**: Tool logic, parameters, and types
unchanged. However, some schema-level `default` fields were removed
(e.g. `save` in `customize_agent`) — these are machine-readable
metadata, not just prose, and may affect LLM behavior.
- **Quality preserved**: All descriptions still convey what the tool
does and essential usage patterns
- **Cross-references trimmed carefully**: Kept prerequisite hints in the
dependent tool (run_block mentions find_block) but removed the reverse
(find_block no longer mentions run_block). Critical cross-tool guidance
retained where removal would degrade model behavior.
- **`run_time` description fixed**: Added missing supported values
(today, last 30 days, ISO datetime) per review feedback

### Future Optimization

The SDK passes all 35 tools on every call. The MCP protocol's
`list_tools()` handler supports dynamic tool registration — a follow-up
PR could implement lazy tool loading (register core tools + a discovery
meta-tool) to further reduce per-turn token cost.

### Changes

- Trimmed descriptions across 25 tool files
- Condensed `_SHARED_TOOL_NOTES` and `_build_storage_supplement` in
`prompting.py`
- Fixed `run_time` schema description in `agent_output.py`

### Checklist

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All 273 copilot tests pass locally
  - [x] All 35 tools load and produce valid schemas
  - [x] Before/after token dumps compared
  - [x] Formatting passes (`poetry run format`)
  - [x] CI green
2026-03-23 08:27:24 +00:00
Zamil Majdy
f35210761c feat(devops): add /pr-test skill + subscription mode auto-provisioning (#12507)
## Summary
- Adds `/pr-test` skill for automated E2E testing of PRs using docker
compose, agent-browser, and API calls
- Covers full environment setup (copy .env, configure copilot auth,
ARM64 Docker fix)
- Includes browser UI testing, direct API testing, screenshot capture,
and test report generation
- Has `--fix` mode for auto-fixing bugs found during testing (similar to
`/pr-address`)
- **Screenshot uploads use GitHub Git API** (blobs → tree → commit →
ref) — no local git operations, safe for worktrees
- **Subscription mode improvements:**
- Extract subscription auth logic to `sdk/subscription.py` — uses SDK's
bundled CLI binary instead of requiring `npm install -g
@anthropic-ai/claude-code`
- Auto-provision `~/.claude/.credentials.json` from
`CLAUDE_CODE_OAUTH_TOKEN` env var on container startup — no `claude
login` needed in Docker
- Add `scripts/refresh_claude_token.sh` — cross-platform helper
(macOS/Linux/Windows) to extract OAuth tokens from host and update
`backend/.env`

## Test plan
- [x] Validated skill on multiple PRs (#12482, #12483, #12499, #12500,
#12501, #12440, #12472) — all test scenarios passed
- [x] Confirmed screenshot upload via GitHub Git API renders correctly
on all 7 PRs
- [x] Verified subscription mode E2E in Docker:
`refresh_claude_token.sh` → `docker compose up` → copilot chat responds
correctly with no API keys (pure OAuth subscription)
- [x] Verified auto-provisioning of credentials file inside container
from `CLAUDE_CODE_OAUTH_TOKEN` env var
- [x] Confirmed bundled CLI detection
(`claude_agent_sdk._bundled/claude`) works without system-installed
`claude`
- [x] `poetry run pytest backend/copilot/sdk/service_test.py` — 24/24
tests pass
2026-03-23 15:29:00 +07:00
Zamil Majdy
1ebcf85669 fix(platform): resolve 5 production Sentry alerts (#12496)
## Summary

Fixes 5 high-priority Sentry alerts from production:

- **AUTOGPT-SERVER-8AM**: Fix `TypeError: TypedDict does not support
instance and class checks` — `_value_satisfies_type` in `type.py` now
handles TypedDict classes that don't support `isinstance()` checks
- **AUTOGPT-SERVER-8AN**: Fix `ValueError: No payment method found`
triggering Sentry error — catch the expected ValueError in the
auto-top-up endpoint and return HTTP 422 instead
- **BUILDER-7F5**: Fix `Upload failed (409): File already exists` — add
`overwrite` query param to workspace upload endpoint and set it to
`true` from the frontend direct-upload
- **BUILDER-7F0**: Fix `LaTeX-incompatible input` KaTeX warnings
flooding Sentry — set `strict: false` on rehype-katex plugin to suppress
warnings for unrecognized Unicode characters
- **AUTOGPT-SERVER-89N**: Fix `Tool execution with manager failed:
validation error for dict[str,list[any]]` — make RPC return type
validation resilient (log warning instead of crash) and downgrade
SmartDecisionMaker tool execution errors to warnings

## Test plan
- [ ] Verify TypedDict type coercion works for
GithubMultiFileCommitBlock inputs
- [ ] Verify auto-top-up without payment method returns 422, not 500
- [ ] Verify file re-upload in copilot succeeds (overwrites instead of
409)
- [ ] Verify LaTeX rendering with Unicode characters doesn't produce
console warnings
- [ ] Verify SmartDecisionMaker tool execution failures are logged at
warning level
2026-03-23 08:05:08 +00:00
Otto
ab7c38bda7 fix(frontend): detect closed OAuth popup and allow dismissing waiting modal (#12443)
Requested by @kcze

When a user closes the OAuth sign-in popup without completing
authentication, the 'Waiting on sign-in process' modal was stuck open
with no way to dismiss it, forcing a page refresh.

Two bugs caused this:

1. `oauth-popup.ts` had no detection for the popup being closed by the
user. The promise would hang until the 5-minute timeout.

2. The modal's cancel button aborted a disconnected `AbortController`
instead of the actual OAuth flow's abort function, so clicking
cancel/close did nothing.

### Changes

- Add `popup.closed` polling (500ms) in `openOAuthPopup()` that rejects
the promise when the user closes the auth window
- Add reject-on-abort so the cancel button properly terminates the flow
- Replace the disconnected `oAuthPopupController` with a direct
`cancelOAuthFlow()` function that calls the real abort ref
- Handle popup-closed and user-canceled as silent cancellations (no
error toast)

### Testing

Tested manually 
- [x] Start OAuth flow → close popup window → modal dismisses
automatically 
- [x] Start OAuth flow → click cancel on modal → popup closes, modal
dismisses 
- [x] Complete OAuth flow normally → works as before 

Resolves SECRT-2054

---
Co-authored-by: Krzysztof Czerwinski (@kcze)
<krzysztof.czerwinski@agpt.co>

---------

Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:41:09 +00:00
Ubbe
0f67e45d05 hotfix(marketplace): adjust card height overflow (#12497)
## Summary

### Before

<img width="500" height="501" alt="Screenshot 2026-03-20 at 21 50 31"
src="https://github.com/user-attachments/assets/6154cffb-6772-4c3d-a703-527c8ca0daff"
/>

### After

<img width="500" height="581" alt="Screenshot 2026-03-20 at 21 33 12"
src="https://github.com/user-attachments/assets/2f9bd69d-30c5-4d06-ad1e-ed76b184afe5"
/>

### Other minor fixes

- minor spacing adjustments in creator/search pages when empty and
between sections


### Summary

- Increase StoreCard height from 25rem to 26.5rem to prevent content
overflow
- Replace manual tooltip-based title truncation with `OverflowText`
component in StoreCard
- Adjust carousel indicator positioning and hide it on md+ when exactly
3 featured agents are shown

## Test plan
- [x] Verify marketplace cards display without text overflow
- [x] Verify featured section carousel indicators behave correctly
- [x] Check responsive behavior at common breakpoints

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:03:28 +08:00
Ubbe
b9ce37600e refactor(frontend/marketplace): move download below Add to library with contextual text (#12486)
## Summary

<img width="1487" height="670" alt="Screenshot 2026-03-20 at 00 52 58"
src="https://github.com/user-attachments/assets/f09de2a0-3c5b-4bce-b6f4-8a853f6792cf"
/>


- Move the download button from inline next to "Add to library" to a
separate line below it
- Add contextual text: "Want to use this agent locally? Download here"
- Style the "Download here" as a violet ghost button link with the
download icon

## Test plan
- [ ] Visit a marketplace agent page
- [ ] Verify "Add to library" button renders in its row
- [ ] Verify "Want to use this agent locally? Download here" appears
below it
- [ ] Click "Download here" and confirm the agent downloads correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:13:59 +00:00
Otto
3921deaef1 fix(frontend): truncate marketplace card description to 2 lines (#12494)
Reduces `line-clamp` from 3 to 2 on the marketplace `StoreCard`
description to prevent text from overlapping with the
absolutely-positioned run count and +Add button at the bottom of the
card.

Resolves SECRT-2156.

---
Co-authored-by: Abhimanyu Yadav (@Abhi1992002)
<122007096+Abhi1992002@users.noreply.github.com>
2026-03-20 09:10:21 +00:00
Nicholas Tindle
f01f668674 fix(backend): support Responses API in SmartDecisionMakerBlock (#12489)
## Summary

- Fixes SmartDecisionMakerBlock conversation management to work with
OpenAI's Responses API, which was introduced in #12099 (commit 1240f38)
- The migration to `responses.create` updated the outbound LLM call but
missed the conversation history serialization — the `raw_response` is
now the entire `Response` object (not a `ChatCompletionMessage`), and
tool calls/results use `function_call` / `function_call_output` types
instead of role-based messages
- This caused a 400 error on the second LLM call in agent mode:
`"Invalid value: ''. Supported values are: 'assistant', 'system',
'developer', and 'user'."`

### Changes

**`smart_decision_maker.py`** — 6 functions updated:
| Function | Fix |
|---|---|
| `_convert_raw_response_to_dict` | Detects Responses API `Response`
objects, extracts output items as a list |
| `_get_tool_requests` | Recognizes `type: "function_call"` items |
| `_get_tool_responses` | Recognizes `type: "function_call_output"`
items |
| `_create_tool_response` | New `responses_api` kwarg produces
`function_call_output` format |
| `_update_conversation` | Handles list return from
`_convert_raw_response_to_dict` |
| Non-agent mode path | Same list handling for traditional execution |

**`test_smart_decision_maker_responses_api.py`** — 61 tests covering:
- Every branch of all 6 affected helper functions
- Chat Completions, Anthropic, and Responses API formats
- End-to-end agent mode and traditional mode conversation validity

## Test plan

- [x] 61 new unit tests all pass
- [x] 11 existing SmartDecisionMakerBlock tests still pass (no
regressions)
- [x] All pre-commit hooks pass (ruff, black, isort, pyright)
- [ ] CI integration tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Updates core LLM invocation and agent conversation/tool-call
bookkeeping to match OpenAI’s Responses API, which can affect tool
execution loops and prompt serialization across providers. Risk is
mitigated by extensive new unit tests, but regressions could surface in
production agent-mode flows or token/usage accounting.
> 
> **Overview**
> **Migrates OpenAI calls from Chat Completions to the Responses API
end-to-end**, including tool schema conversion, output parsing,
reasoning/text extraction, and updated token usage fields in
`LLMResponse`.
> 
> **Fixes SmartDecisionMakerBlock conversation/tool handling for
Responses API** by treating `raw_response` as a Response object
(splitting it into `output` items for replay), recognizing
`function_call`/`function_call_output` entries, and emitting tool
outputs in the correct Responses format to prevent invalid follow-up
prompts.
> 
> Also adjusts prompt compaction/token estimation to understand
Responses API tool items, changes
`get_execution_outputs_by_node_exec_id` to return list-valued
`CompletedBlockOutput`, removes `gpt-3.5-turbo` from model/cost/docs
lists, and adds focused unit tests plus a lightweight `conftest.py` to
run these tests without the full server stack.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
ff292efd3d. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
2026-03-20 03:23:52 +00:00
Otto
f7a3491f91 docs(platform): add TDD guidance to CLAUDE.md files (#12491)
Requested by @majdyz

Adds TDD (test-driven development) guidance to CLAUDE.md files so Claude
Code follows a test-first workflow when fixing bugs or adding features.

**Changes:**
- **Parent `CLAUDE.md`**: Cross-cutting TDD workflow — write a failing
`xfail` test, implement the fix, remove the marker
- **Backend `CLAUDE.md`**: Concrete pytest example with
`@pytest.mark.xfail` pattern
- **Frontend `CLAUDE.md`**: Note about using Playwright `.fixme`
annotation for bug-fix tests

The workflow is: write a failing test first → confirm it fails for the
right reason → implement → confirm it passes. This ensures every bug fix
is covered by a test that would have caught the regression.

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-20 02:13:16 +00:00
Nicholas Tindle
cbff3b53d3 Revert "feat(backend): migrate OpenAI provider to Responses API" (#12490)
Reverts Significant-Gravitas/AutoGPT#12099

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Reverts the OpenAI integration in `llm_call` from the Responses API
back to `chat.completions`, which can change tool-calling, JSON-mode
behavior, and token accounting across core AI blocks. The change is
localized but touches the primary LLM execution path and associated
tests/docs.
> 
> **Overview**
> Reverts the OpenAI path in `backend/blocks/llm.py` from the Responses
API back to `chat.completions`, including updating JSON-mode
(`response_format`), tool handling, and usage extraction to match the
Chat Completions response shape.
> 
> Removes the now-unused `backend/util/openai_responses.py` helpers and
their unit tests, updates LLM tests to mock `chat.completions.create`,
and adds `gpt-3.5-turbo` to the supported model list, cost config, and
LLM docs.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
7d6226d10e. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2026-03-20 01:51:56 +00:00
Reinier van der Leer
5b9a4c52c9 revert(platform): Revert invite system (#12485)
## Summary

Reverts the invite system PRs due to security gaps identified during
review:

- The move from Supabase-native `allowed_users` gating to
application-level gating allows orphaned Supabase auth accounts (valid
JWT without a platform `User`)
- The auth middleware never verifies `User` existence, so orphaned users
get 500s instead of clean 403s
- OAuth/Google SSO signup completely bypasses the invite gate
- The DB trigger that atomically created `User` + `Profile` on signup
was dropped in favor of a client-initiated API call, introducing a
failure window

### Reverted PRs
- Reverts #12347 — Foundation: InvitedUser model, invite-gated signup,
admin UI
- Reverts #12374 — Tally enrichment: personalized prompts from form
submissions
- Reverts #12451 — Pre-check: POST /auth/check-invite endpoint
- Reverts #12452 (collateral) — Themed prompt categories /
SuggestionThemes UI. This PR built on top of #12374's
`suggested_prompts` backend field and `/chat/suggested-prompts`
endpoint, so it cannot remain without #12374. The copilot empty session
falls back to hardcoded default prompts.

### Migration
Includes a new migration (`20260319120000_revert_invite_system`) that:
- Drops the `InvitedUser` table and its enums (`InvitedUserStatus`,
`TallyComputationStatus`)
- Restores the `add_user_and_profile_to_platform()` trigger on
`auth.users`
- Backfills `User` + `Profile` rows for any auth accounts created during
the invite-gate window

### What's NOT reverted
- The `generate_username()` function (never dropped, still used by
backfill migration)
- The old `add_user_to_platform()` function (superseded by
`add_user_and_profile_to_platform()`)
- PR #12471 (admin UX improvements) — was never merged, no action needed

## Test plan
- [x] Verify migration: `InvitedUser` table dropped, enums dropped,
trigger restored
- [x] Verify backfill: no orphaned auth users, no users without Profile
- [x] Verify existing users can still log in (email + OAuth)
- [x] Verify CoPilot chat page loads with default prompts
- [ ] Verify new user signup creates `User` + `Profile` via the restored
trigger
- [ ] Verify admin `/admin/users` page loads without crashing
- [ ] Run backend tests: `poetry run test`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-19 17:15:30 +00:00
Otto
0ce1c90b55 fix(frontend): rename "CoPilot" to "AutoPilot" on credits page (#12481)
Requested by @kcze

Renames "CoPilot" → "AutoPilot" on the credits/usage limits page:

- **Heading:** "CoPilot Usage Limits" → "AutoPilot Usage Limits"
- **Button:** "Open CoPilot" → "Open AutoPilot"
- Comment updated to match

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>

Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-19 15:25:21 +00:00
Ubbe
d4c6eb9adc fix(frontend): collapse navbar text to icons below 1280px (#12484)
## Summary

<img width="400" height="339" alt="Screenshot 2026-03-19 at 22 53 23"
src="https://github.com/user-attachments/assets/2fa76b8f-424d-4764-90ac-b7a331f5f610"
/>

<img width="600" height="595" alt="Screenshot 2026-03-19 at 22 53 31"
src="https://github.com/user-attachments/assets/23f51cc7-b01e-4d83-97ba-2c43683877db"
/>

<img width="800" height="523" alt="Screenshot 2026-03-19 at 22 53 36"
src="https://github.com/user-attachments/assets/1e447b9a-1cca-428c-bccd-1730f1670b8e"
/>

Now that we have the `Give feedback` button on the Navigation bar,
collpase some of the links below `1280px` so there is more space and
they don't collide with each other...

- Collapse navbar link text to icon-only below 1280px (`xl` breakpoint)
to prevent crowding
- Wallet button shows only the wallet icon below 1280px instead of "Earn
credits" text
- Feedback button shows only the chat icon below 1280px instead of "Give
Feedback" text
- Added `whitespace-nowrap` to feedback button to prevent wrapping

## Changes
- `NavbarLink.tsx`: `lg:block` → `xl:block` for link text
- `Wallet.tsx`: `md:hidden`/`md:inline-block` →
`xl:hidden`/`xl:inline-block`
- `FeedbackButton.tsx`: wrap text in `hidden xl:inline` span, add
`whitespace-nowrap`

## Test plan
- [ ] Resize browser between 1024px–1280px and verify navbar shows only
icons
- [ ] At 1280px+ verify full text labels appear for links, wallet, and
feedback
- [ ] Verify mobile navbar still works correctly below `md` breakpoint

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 15:10:27 +00:00
Ubbe
1bb91b53b7 fix(frontend/marketplace): comprehensive marketplace UI redesign (#12462)
## Summary

<img width="600" height="964" alt="Screenshot_2026-03-19_at_00 07 52"
src="https://github.com/user-attachments/assets/95c0430a-26a3-499b-8f6a-25b9715d3012"
/>
<img width="600" height="968" alt="Screenshot_2026-03-19_at_00 08 01"
src="https://github.com/user-attachments/assets/d440c3b0-c247-4f13-bf82-a51ff2e50902"
/>
<img width="600" height="939" alt="Screenshot_2026-03-19_at_00 08 14"
src="https://github.com/user-attachments/assets/f19be759-e102-4a95-9474-64f18bce60cf"
/>"
<img width="600" height="953" alt="Screenshot_2026-03-19_at_00 08 24"
src="https://github.com/user-attachments/assets/ba4fa644-3958-45e2-89e9-a6a4448c63c5"
/>



- Re-style and re-skin the Marketplace pages to look more "professional"
...
- Move the `Give feedback` button to the header

## Test plan
- [x] Verify marketplace page search bar matches Form text field styling
- [x] Verify agent cards have padding and subtle border
- [x] Verify hover/focus states work correctly
- [x] Check responsive behavior at different breakpoints

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:28:01 +08:00
Ubbe
a5f9c43a41 feat(platform): replace suggestion pills with themed prompt categories (#12452)
## Summary



https://github.com/user-attachments/assets/13da6d36-5f35-429b-a6cf-e18316bb8709



Replaces the flat list of suggestion pills in the CoPilot empty session
with themed prompt categories (Learn, Create, Automate, Organize), each
shown as a popover with contextual prompts.

- **Backend**: Changes `suggested_prompts` from a flat `list[str]` to a
themed `dict[str, list[str]]` keyed by category. Updates Tally
extraction LLM prompt to generate prompts per theme, and the
`/suggested-prompts` API to return grouped themes. Legacy `list[str]`
rows are preserved under a `"General"` key for backward compatibility.
- **Frontend**: Replaces inline pill buttons with a `SuggestionThemes`
popover component. Each theme button (with icon) opens a dropdown of 5
relevant prompts. Falls back to hardcoded defaults when the API has no
personalized prompts. Normalizes partial API responses by padding
missing themes with defaults. Legacy `"General"` prompts are distributed
round-robin across themes so existing users keep their personalized
suggestions.

### Changes 🏗️

- `backend/data/understanding.py`: `suggested_prompts` field changed
from `list[str]` to `dict[str, list[str]]`; legacy list rows preserved
under `"General"` key; list items validated as strings
- `backend/data/tally.py`: LLM prompt updated to generate themed
prompts; validation now per-theme with blank-string rejection
- `backend/api/features/chat/routes.py`: New `SuggestedTheme` model;
endpoint returns `themes[]`
- `frontend/copilot/components/EmptySession/EmptySession.tsx`: Uses
generated API types directly (no cast)
- `frontend/copilot/components/EmptySession/helpers.ts`:
`DEFAULT_THEMES` replaces `DEFAULT_QUICK_ACTIONS`; `getSuggestionThemes`
normalizes partial API responses and distributes legacy `"General"`
prompts across themes
-
`frontend/copilot/components/EmptySession/components/SuggestionThemes/`:
New popover component with theme icons and loading states

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verify themed suggestion buttons render on CoPilot empty session
  - [x] Click each theme button and confirm popover opens with prompts
  - [x] Click a prompt and confirm it sends the message
- [x] Verify fallback to default themes when API returns no custom
prompts
- [x] Verify legacy users' personalized prompts are preserved and
visible


🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 18:46:12 +08:00
Otto
1240f38f75 feat(backend): migrate OpenAI provider to Responses API (#12099)
## Summary

Migrates the OpenAI provider in the LLM block from
`chat.completions.create` to `responses.create` — OpenAI's newer,
unified API. Also removes the obsolete GPT-3.5-turbo model.

Resolves #11624
Linear:
[OPEN-2911](https://linear.app/autogpt/issue/OPEN-2911/update-openai-calls-to-use-responsescreate)

## Changes

- **`backend/blocks/llm.py`** — OpenAI provider now uses
`responses.create` exclusively. Removed GPT-3.5-turbo enum + metadata.
- **`backend/util/openai_responses.py`** *(new)* — Helpers for the
Responses API: tool format conversion, content/reasoning/usage/tool-call
extraction.
- **`backend/util/openai_responses_test.py`** *(new)* — Unit tests for
all helper functions.
- **`backend/data/block_cost_config.py`** — Removed GPT-3.5 cost entry.
- **`docs/integrations/block-integrations/llm.md`** — Regenerated block
docs.

## Key API differences handled

| Aspect | Chat Completions | Responses API |
|--------|-----------------|---------------|
| Messages param | `messages` | `input` |
| Max tokens param | `max_completion_tokens` | `max_output_tokens` |
| Usage fields | `prompt_tokens` / `completion_tokens` | `input_tokens`
/ `output_tokens` |
| Tool format | Nested under `function` key | Flat structure |

## Test plan

- [x] Unit tests for all `openai_responses.py` helpers
- [x] Existing LLM block tests updated for Responses API mocks
- [x] Regular OpenAI models work
- [x] Reasoning OpenAI models work
- [x] Non-OpenAI models work

---------

Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 09:19:31 +00:00
Zamil Majdy
f617f50f0b dx(skills): improve pr-address skill — full thread context + PR description backtick fix (#12480)
## Summary

Improves the `pr-address` skill with two fixes:

- **Full comment thread loading**: Adds `--paginate` to the inline
comments fetch and explicit instructions to reconstruct threads using
`in_reply_to_id`, reading root-to-last-reply before acting. Previously,
only the opening comment was visible — missing reviewer replies led to
wrong fixes.
- **Backtick-safe PR descriptions**: Adds instructions to write the PR
body to a temp file via `<<'PREOF'` heredoc before passing to `gh pr
edit/create`. Inlining the body directly causes backticks to be
shell-escaped, breaking markdown rendering.

## Test plan
- [ ] Run `/pr-address` on a PR with multi-reply inline comment threads
— verify the last reply is what gets acted on
- [ ] Update a PR description containing backticks — verify they render
correctly in GitHub
2026-03-19 15:11:14 +07:00
Otto
943a1df815 dx(backend): Make Builder and Marketplace search work without embeddings (#12479)
When OpenAI credentials are unavailable (fork PRs, dev envs without API
keys), both builder block search and store agent functionality break:

1. **Block search returns wrong results.** `unified_hybrid_search` falls
back to a zero vector when embedding generation fails. With ~200 blocks
in `UnifiedContentEmbedding`, the zero-vector semantic scores are
garbage, and lexical matching on short block names is too weak — "Store
Value" doesn't appear in the top results for query "Store Value".

2. **Store submission approval fails entirely.**
`review_store_submission` calls `ensure_embedding()` inside a
transaction. When it throws, the entire transaction rolls back — no
store submissions get approved, the `StoreAgent` materialized view stays
empty, and all marketplace e2e tests fail.

3. **Store search returns nothing.** Even when store data exists,
`hybrid_search` queries `UnifiedContentEmbedding` which has no store
agent rows (backfill failed). It succeeds with zero results rather than
throwing, so the existing exception-based fallback never triggers.

### Changes 🏗️

- Replace `unified_hybrid_search` with in-memory text search in
`_hybrid_search_blocks` (-> `_text_search_blocks`). All ~200 blocks are
already loaded in memory, and `_score_primary_fields` provides correct
deterministic text relevance scoring against block name, description,
and input schema field descriptions — the same rich text the embedding
pipeline uses. CamelCase block names are split via `split_camelcase()`
to match the tokenization from PR #12400.

- Make embedding generation in `review_store_submission` best-effort:
catch failures and log a warning instead of rolling back the approval
transaction. The backfill scheduler retries later when credentials
become available.

- Fall through to direct DB search when `hybrid_search` returns empty
results (not just when it throws). The fallback uses ad-hoc
`to_tsvector`/`plainto_tsquery` with `ts_rank_cd` ranking on
`StoreAgent` view fields, restoring the search quality of the original
pre-hybrid implementation (stemming, stop-word removal, relevance
ranking).

- Fix Playwright artifact upload in end-to-end test CI

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `build.spec.ts`: 8/8 pass locally (was 0/7 before fix)
  - [x] All 79 e2e tests pass in CI (was 15 failures before fix)

---
Co-authored-by: Reinier van der Leer (@Pwuts)

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:11:06 +00:00
Otto
593001e0c8 fix(frontend): Remove dead Tutorial button from TallyPopup (#12474)
After the legacy builder was removed in #12082, the TallyPopup component
still showed a "Tutorial" button (bottom-right, next to "Give Feedback")
that navigated to `/build?resetTutorial=true`. Nothing handles that
param anymore, so clicking it did nothing.

This removes the dead button and its associated state/handler from
TallyPopup and useTallyPopup. The working tutorial (Shepherd.js
chalkboard icon in CustomControls) is unaffected.

**Changes:**
- `TallyPopup.tsx`: Remove Tutorial button JSX, unused imports
(`usePathname`, `useSearchParams`), and `isNewBuilder` check
- `useTallyPopup.ts`: Remove `showTutorial` state, `handleResetTutorial`
handler, unused `useRouter` import

Resolves SECRT-2109

---
Co-authored-by: Reinier van der Leer (@Pwuts) <pwuts@agpt.co>

Co-authored-by: Reinier van der Leer (@Pwuts) <pwuts@agpt.co>
2026-03-19 00:09:46 +00:00
Ubbe
e1db8234a3 fix(frontend/copilot): constrain markdown heading sizes in user chat messages (#12463)
### Before

<img width="600" height="489" alt="Screenshot 2026-03-18 at 19 24 41"
src="https://github.com/user-attachments/assets/bb8dc0fa-04cd-4f32-8125-2d7930b4acde"
/>

Formatted headings in user messages would look massive

### After

<img width="600" height="549" alt="Screenshot 2026-03-18 at 19 24 33"
src="https://github.com/user-attachments/assets/51230232-c914-42dd-821f-3b067b80bab4"
/>

Markdown headings (`# H1` through `###### H6`) and setext-style headings
(`====`) in user chat messages rendered at their full HTML heading size,
which looked disproportionately large in the chat bubble context.

### Changes 🏗️

- Added Tailwind CSS overrides on the user message `MessageContent`
wrapper to cap all heading elements (h1-h6) at `text-lg font-semibold`
- Only affects user messages in copilot chat (via `group-[.is-user]`
selector); assistant messages are unchanged

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [ ] Send a user message containing `# Heading 1` through `######
Heading 6` and verify they all render at constrained size
- [ ] Send a message with `====` separator pattern and verify it doesn't
render as a mega H1
  - [ ] Verify assistant messages with headings still render normally

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:33:09 +08:00
Zamil Majdy
282173be9d feat(copilot): GitHub CLI support — inject GH_TOKEN and connect_integration tool (#12426)
## Summary

- When a user has connected GitHub, `GH_TOKEN` is automatically injected
into the Claude Agent SDK subprocess environment so `gh` CLI commands
work without any manual auth step
- When GitHub is **not** connected, the copilot can call a new
`connect_integration(provider="github")` MCP tool, which surfaces the
same credential setup card used by regular GitHub blocks — the user
connects inline without leaving the chat
- After connecting, the copilot is instructed to retry the operation
automatically

## Changes

**Backend**
- `sdk/service.py`: `_get_github_token_for_user()` fetches OAuth2 or API
key credentials and injects `GH_TOKEN` + `GITHUB_TOKEN` into `sdk_env`
before the SDK subprocess starts (per-request, thread-safe via
`ClaudeAgentOptions.env`)
- `tools/connect_integration.py`: new `ConnectIntegrationTool` MCP tool
— returns `SetupRequirementsResponse` for a given provider (`github` for
now); extensible via `_PROVIDER_INFO` dict
- `tools/__init__.py`: registers `connect_integration` in
`TOOL_REGISTRY`
- `prompting.py`: adds GitHub CLI / `connect_integration` guidance to
`_SHARED_TOOL_NOTES`

**Frontend**
- `ConnectIntegrationTool/ConnectIntegrationTool.tsx`: thin wrapper
around the existing `SetupRequirementsCard` with a tailored retry
instruction
- `MessagePartRenderer.tsx`: dispatches `tool-connect_integration` to
the new component

## Test plan

- [ ] User with GitHub credentials: `gh pr list` works without any auth
step in copilot
- [ ] User without GitHub credentials: copilot calls
`connect_integration`, card renders with GitHub credential input, after
connecting copilot retries and `gh` works
- [ ] `GH_TOKEN` is NOT leaked across users (injected via
`ClaudeAgentOptions.env`, not `os.environ`)
- [ ] `connect_integration` with unknown provider returns a graceful
error message
2026-03-18 11:52:42 +00:00
Zamil Majdy
5d9a169e04 feat(blocks): add AutoPilotBlock for invoking AutoPilot from graphs (#12439)
## Summary
- Adds `AutogptCopilotBlock` that invokes the platform's copilot system
(`stream_chat_completion_sdk`) directly from graph executions
- Enables sub-agent patterns: copilot can call this block recursively
(with depth limiting via `contextvars`)
- Enables scheduled copilot execution through the agent executor system
- No user credentials needed — uses server-side copilot config

## Inputs/Outputs
**Inputs:** prompt, system_context, session_id (continuation), timeout,
max_recursion_depth
**Outputs:** response text, tool_calls list, conversation_history JSON,
session_id, token_usage

## Test plan
- [x] Block test passes (`test_available_blocks[AutogptCopilotBlock]`)
- [x] Pre-commit hooks pass (format, lint, typecheck)
- [ ] Manual test: add block to graph, send prompt, verify response
- [ ] Manual test: chain two copilot blocks with session_id to verify
continuation
2026-03-18 11:22:25 +00:00
Ubbe
6fd1050457 fix(backend): arch-conditional chromium in Docker for ARM64 compatibility (#12466)
## Summary
- On **amd64**: keep `agent-browser install` (Chrome for Testing —
pinned version tested with Playwright) + restore runtime libs
- On **arm64**: install system `chromium` package (Chrome for Testing
has no ARM64 binary) + skip `agent-browser install`
- An entrypoint script sets
`AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium` at container startup
on arm64 (detected via presence of `/usr/bin/chromium`); on amd64 the
var is left unset so agent-browser uses Chrome for Testing as before

**Why not system chromium on amd64?** `agent-browser install` downloads
a specific Chrome for Testing version pinned to the Playwright version
in use. Using whatever Debian ships on amd64 could cause protocol
compatibility issues.

Introduced by #12301 (cc @Significant-Gravitas/zamil-majdy)

## Test plan
- [ ] `docker compose up --build` succeeds on ARM64 (Apple Silicon)
- [ ] `docker compose up --build` succeeds on x86_64
- [ ] Copilot browser tools (`browser_navigate`, `browser_act`,
`browser_screenshot`) work in a Copilot session on both architectures

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-18 19:08:14 +08:00
Otto
02708bcd00 fix(platform): pre-check invite eligibility before Supabase signup (#12451)
Requested by @Swiftyos

The invite gate check in `get_or_activate_user()` runs after Supabase
creates the auth user, resulting in orphaned auth accounts with no
platform access when a non-invited user signs up. Users could create a
Supabase account but had no `User`, `Profile`, or `Onboarding` records —
they could log in but access nothing.

### Changes 🏗️

**Backend** (`v1.py`, `invited_user.py`):
- Add public `POST /api/auth/check-invite` endpoint (no auth required —
this is a pre-signup check)
- Add `check_invite_eligibility()` helper in the data layer
- Returns `{allowed: true}` when `enable_invite_gate` is disabled
- Extracted `is_internal_email()` helper to deduplicate `@agpt.co`
bypass logic (was duplicated between route and `get_or_activate_user`)
- Checks `InvitedUser` table for `INVITED` status
- Added IP-based Redis rate limiting (10 req/60 s per IP, fails open if
Redis unavailable, returns HTTP 429 when exceeded)
- Fixed Redis pipeline atomicity: `incr` + `expire` now sent in a single
pipeline round-trip, preventing a TTL-less key if `expire` had
previously failed after `incr`
- Fixed incorrect `await` on `pipe.incr()` / `pipe.expire()` — redis-py
async pipeline queue methods are synchronous; only `execute()` is
awaitable. The erroneous `await` was silently swallowed by the `except`
block, making the rate limiter completely non-functional

**Frontend** (`signup/actions.ts`):
- Call the generated `postV1CheckIfAnEmailIsAllowedToSignUp` client
(replacing raw `fetch`) before `supabase.auth.signUp()`
- `ApiError` (non-OK HTTP responses) logs a Sentry warning with the HTTP
status; network/other errors capture a Sentry exception
- If not allowed, return `not_allowed` error (existing
`EmailNotAllowedModal` handles this)
- Graceful fallback: if the pre-check fails (backend unreachable), falls
through to the existing flow — `get_or_activate_user()` remains as
defense-in-depth

**Tests** (`v1_test.py`, `invited_user_test.py`):
- 5 route-level tests covering: gate disabled → allowed, `@agpt.co`
bypass, eligible email, ineligible email, rate-limit exceeded
- Rate-limit test mock updated to use pipeline interface
(`pipeline().execute()` returns `[count, True]`)
- Existing `invited_user_test.py` updated to cover
`check_invite_eligibility` branches

**Not changed:**
- Google OAuth flow — already gated by OAuth provider settings
- `get_or_activate_user()` — stays as backend safety net
- All admin invite CRUD routes — unchanged

### Test plan
1. Email/password signup with invited email → signup proceeds normally
2. Email/password signup with non-invited email → `EmailNotAllowedModal`
shown, no Supabase user created
3. `enable_invite_gate=false` → all emails allowed
4. Backend unreachable during pre-check → falls through to existing flow
5. Same IP exceeds 10 requests/60 s → HTTP 429 returned

---
Co-authored-by: Craig Swift (@Swiftyos) <craigswift13@gmail.com>

---------

Co-authored-by: Craig Swift (@Swiftyos) <craigswift13@gmail.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-18 10:36:50 +00:00
Zamil Majdy
156d61fe5c dx(skills): add merge conflict detection and resolution to pr-address (#12469)
## Summary
- Adds merge conflict detection as step 2 of the polling loop (between
CI check and comment check), including handling of the transient
`"UNKNOWN"` state
- Adds a "Resolving merge conflicts" section with step-by-step
instructions using 3-way merge (no force push needed since PRs are
squash-merged)
- Validates all three git conflict markers before staging to prevent
committing broken code
- Fixes `args` → `argument-hint` in skill frontmatter

## Test plan
- [ ] Verify skill renders correctly in Claude Code
2026-03-18 17:46:32 +07:00
Zamil Majdy
5a29de0e0e fix(platform): try-compact-retry for prompt-too-long errors in CoPilot SDK (#12413)
## Summary

When the Claude SDK returns a prompt-too-long error (e.g. transcript +
query exceeds the model's context window), the streaming loop now
retries with escalating fallbacks instead of failing immediately:

1. **Attempt 1**: Use the transcript as-is (normal path)
2. **Attempt 2**: Compact the transcript via LLM summarization
(`compact_transcript`) and retry
3. **Attempt 3**: Drop the transcript entirely and fall back to
DB-reconstructed context (`_build_query_message`)

If all 3 attempts fail, a `StreamError(code="prompt_too_long")` is
yielded to the frontend.

### Key changes

**`service.py`**
- Add `_is_prompt_too_long(err)` — pattern-matches SDK exceptions for
prompt-length errors (`prompt is too long`, `prompt_too_long`,
`context_length_exceeded`, `request too large`)
- Wrap `async with ClaudeSDKClient` in a 3-attempt retry `for` loop with
compaction/fallback logic
- Move `current_message`, `_build_query_message`, and
`_prepare_file_attachments` before the retry loop (computed once,
reused)
- Skip transcript upload in `finally` when `transcript_caused_error`
(avoids persisting a broken/empty transcript)
- Reset `stream_completed` between retry iterations
- Document outer-scope variable contract in `_run_stream_attempt`
closure (which variables are reassigned between retries vs read-only)

**`transcript.py`**
- Add `compact_transcript(content, log_prefix, model)` — converts JSONL
→ messages → `compress_context` (LLM summarization with truncation
fallback) → JSONL
- Add helpers: `_flatten_assistant_content`,
`_flatten_tool_result_content`, `_transcript_to_messages`,
`_messages_to_transcript`, `_run_compression`
- Returns `None` when compaction fails or transcript is already within
budget (signals caller to fall through to DB fallback)
- Truncation fallback wrapped in 30s timeout to prevent unbounded CPU
time on large transcripts
- Accepts `model` parameter to avoid creating a new `ChatConfig()` on
every call

**`util/prompt.py`**
- Fix `_truncate_middle_tokens` edge case: returns empty string when
`max_tok < 1`, properly handles `max_tok < 3`

**`config.py`**
- E2B sandbox timeout raised from 5 min to 15 min to accommodate
compaction retries

**`prompt_too_long_test.py`** (new, 45 tests)
- `_is_prompt_too_long` positive/negative patterns, case sensitivity,
BaseException handling
- Flatten helpers for assistant/tool_result content blocks
- `_transcript_to_messages` / `_messages_to_transcript` roundtrip,
strippable types, empty content
- `compact_transcript` async tests: too few messages, not compacted,
successful compaction, compression failure

**`retry_scenarios_test.py`** (new, 27 tests)
- Full retry state machine simulation covering all 8 scenarios:
  1. Normal flow (no retry)
  2. Compact succeeds → retry succeeds
  3. Compact fails → DB fallback succeeds
  4. No transcript → DB fallback succeeds
  5. Double fail → DB fallback on attempt 3
  6. All 3 attempts exhausted
  7. Non-prompt-too-long error (no retry)
  8. Compaction returns identical content → DB fallback
- Edge cases: nested exceptions, case insensitivity, unicode content,
large transcripts, resume-after-compaction flow

**Shared test fixtures** (`conftest.py`)
- Extracted `build_test_transcript` helper used across 3 test files to
eliminate duplication

## Test plan

- [x] `_is_prompt_too_long` correctly identifies prompt-length errors (8
positive, 5 negative patterns)
- [x] `compact_transcript` compacts oversized transcripts via LLM
summarization
- [x] `compact_transcript` returns `None` on failure or when already
within budget
- [x] Retry loop state machine: all 8 scenarios verified with state
assertions
- [x] `TranscriptBuilder` works correctly after loading compacted
transcripts
- [x] `_messages_to_transcript` roundtrip preserves content including
unicode
- [x] `transcript_caused_error` prevents stale transcript upload
- [x] Truncation timeout prevents unbounded CPU time
- [x] All 139 unit tests pass locally
- [x] CI green (tests 3.11/3.12/3.13, types, CodeQL, linting)
2026-03-18 10:27:31 +00:00
Otto
e657472162 feat(blocks): Add Nano Banana 2 to image generator, customizer, and editor blocks (#12218)
Requested by @Torantulino

Add `google/nano-banana-2` (Gemini 3.1 Flash Image) support across all
three image blocks.

### Changes

**`ai_image_customizer.py`**
- Add `NANO_BANANA_2 = "google/nano-banana-2"` to `GeminiImageModel`
enum
- Update block description to reference Nano-Banana models generically

**`ai_image_generator_block.py`**
- Add `NANO_BANANA_2` to `ImageGenModel` enum
- Add generation branch (identical to NBP except model name)

**`flux_kontext.py` (AI Image Editor)**
- Rename `FluxKontextModelName` → `ImageEditorModel` (with
backwards-compatible alias)
- Add `NANO_BANANA_PRO` and `NANO_BANANA_2` to the editor
- Model-aware branching in `run_model()`: NB models use `image_input`
list (not `input_image`), no `seed`, and add `output_format`

**`block_cost_config.py`**
- Add NB2 cost entries for all three blocks (14 credits, matching NBP)
- Add NB Pro cost entry for editor block
- Update editor block refs from `.PRO`/`.MAX` to
`.FLUX_KONTEXT_PRO`/`.FLUX_KONTEXT_MAX`

Resolves SECRT-2047

---------

Co-authored-by: Torantulino <Torantulino@users.noreply.github.com>
Co-authored-by: Abhimanyu Yadav <122007096+Abhi1992002@users.noreply.github.com>
2026-03-18 09:42:18 +00:00
DEEVEN SERU
4d00e0f179 fix(blocks): allow falsy entries in AddToListBlock (#12028)
## Summary
- treat AddToListBlock.entry as optional rather than truthy so
0/""/False are appended
- extend block self-tests with a falsy entry case

## Testing
- Not run (pytest not available in environment)

Co-authored-by: DEEVEN SERU <144827577+DEVELOPER-DEEVEN@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-03-18 09:42:14 +00:00
DEEVEN SERU
1d7282b5f3 fix(backend): Truncate filenames with excessively long 'extensions' (#12025)
Fixes issue where filenames with no dots until the end (or massive
extensions) bypassed truncation logic, causing OSError [Errno 36].
Limits extension preservation to 20 chars.

---------

Co-authored-by: DEVELOPER-DEEVEN <144827577+DEVELOPER-DEEVEN@users.noreply.github.com>
2026-03-18 09:42:06 +00:00
Reinier van der Leer
e3591fcaa3 ci(backend): Python version specific type checking (#12453)
- Resolves #10657
- Partially based on #10913

### Changes 🏗️

- Run Pyright separately for each supported Python version
  - Move type checking and linting into separate jobs
    - Add `--skip-pyright` option to lint script
- Move `linter.py` into `backend/scripts`
  - Move other scripts in `backend/` too for consistency

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI

---

Co-authored-by: @Joaco2603 <jpappa2603@gmail.com>

---------

Co-authored-by: Joaco2603 <jpappa2603@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 09:41:35 +00:00
Reinier van der Leer
876dc32e17 chore(backend): Update poetry to v2.2.1 (#12459)
Poetry v2.2.1 has bugfixes that are relevant in context of our
`.pre-commit-config.yaml`

### Changes 🏗️

- Update `poetry` from v2.1.1 to v2.2.1 (latest version supported by
Dependabot)
- Re-generate `poetry.lock`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI
2026-03-18 09:41:28 +00:00
Reinier van der Leer
616e29f5e4 fix tests for 6d0e206 2026-03-18 10:39:51 +01:00
Zamil Majdy
280a98ad38 dx(skills): poll for new PR comments while waiting for CI (#12461)
## Summary
- Updates the `pr-address` skill to poll for new PR comments while
waiting for CI, instead of blocking solely on `gh pr checks --watch
--fail-fast`
- Runs CI watch in the background and polls all 3 comment endpoints
every 30s
- Allows bot comments (coderabbitai, sentry) to be addressed in parallel
with CI rather than sequentially

## Test plan
- [ ] Run `/pr-address` on a PR with pending CI and verify it detects
new comments while CI is running
- [ ] Verify CI failures are still handled correctly after the combined
wait
2026-03-18 15:07:13 +07:00
Reinier van der Leer
c7f2a7dd03 fix formatting 2026-03-17 20:30:33 +01:00
Otto
6d0e2063ec Merge commit from fork
* fix(backend): add resource limits to Jinja2 template rendering

Prevent DoS via computational exhaustion in FillTextTemplateBlock by:

- Subclassing SandboxedEnvironment to intercept ** and * operators
  with caps on exponent size (1000) and string repeat length (10K)
- Replacing range() global with a capped version (max 10K items)
- Wrapping template.render() in a ThreadPoolExecutor with a 10s
  timeout to kill runaway expressions

Addresses GHSA-ppw9-h7rv-gwq9 (CWE-400).

* address review: move helpers after TextFormatter, drop ThreadPoolExecutor

- Move _safe_range and _RestrictedEnvironment below TextFormatter
  (helpers after the function that uses them)
- Remove ThreadPoolExecutor timeout wrapper from format_string() —
  it has problematic behavior in async contexts and the static
  interception (operator caps, range limit) already covers the
  known attack vectors

* address review: extend sequence guard, harden format_email, add tests

- Extend * guard to cover list and tuple repetition, not just strings
  (blocks {{ [0] * 999999999 }} and {{ (0,) * 999999999 }})
- Rename MAX_STRING_REPEAT → MAX_SEQUENCE_REPEAT
- Use _RestrictedEnvironment in format_email (defense-in-depth)
- Add tests: list repeat, tuple repeat, negative exponent, nested
  exponentiation (18 tests total)

* add async timeout wrapper at block level

Wrap format_string calls in FillTextTemplateBlock and AgentOutputBlock
with asyncio.wait_for(asyncio.to_thread(...), timeout=10s).

This provides defense-in-depth: if an expression somehow bypasses the
static operator checks, the async timeout will cancel it. Uses
asyncio.to_thread for proper async integration (no event loop blocking)
and asyncio.wait_for for real cancellation on timeout.

* make format_string async with timeout kwarg

Move asyncio.wait_for + asyncio.to_thread into format_string() itself
with a timeout kwarg (default 10s). This way all callers get the
timeout automatically — no wrapper needed at each call site.

- format_string() is now async, callers use await
- format_email() is now async (calls format_string internally)
- Updated all callers: text.py, io.py, llm.py, smart_decision_maker.py,
  email.py, notifications.py
- Tests updated to use asyncio.run()

* use Jinja2 native async rendering instead of to_thread

Switch from asyncio.to_thread(template.render) to Jinja2's native
enable_async=True + template.render_async(). No thread overhead,
proper async integration. asyncio.wait_for timeout still applies.

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-03-17 20:24:04 +01:00
Zamil Majdy
8b577ae194 feat(backend/copilot): add direct ID lookup to find_agent and find_block tools (#12446)
## Summary
- Add direct `creator/slug` lookup to `find_agent` marketplace search,
bypassing full-text search when an exact identifier is provided
- Add direct UUID lookup to `find_block`, returning the block
immediately when a valid block ID is given
- Update tool descriptions and parameter hints to document the new
lookup capabilities

## Test plan
- [ ] Verify `find_agent` with a `creator/slug` query returns the exact
agent
- [ ] Verify `find_agent` falls back to search when slug lookup fails
- [ ] Verify `find_block` with a block UUID returns the exact block
- [ ] Verify `find_block` with a non-existent UUID falls through to
search
- [ ] Verify excluded block types/IDs are still filtered in direct
lookup
2026-03-17 16:41:17 +00:00
Zamil Majdy
d8f5f783ae feat(copilot): enable SmartDecisionMakerBlock in agent generator (#12438)
## Summary
- Enable the agent generator to create orchestrator agents using
**SmartDecisionMakerBlock** with agent mode
- SmartDecisionMaker + AgentExecutorBlock tools = autonomous agent that
decides which sub-agents to call, executes them, reads results, and
loops until done
- Follows existing patterns (AgentExecutorBlock/MCPToolBlock) for fixer,
validator, and guide documentation

## Changes
- Remove SmartDecisionMakerBlock from `COPILOT_EXCLUDED_BLOCK_IDS` in
`find_block.py`
- Add `SMART_DECISION_MAKER_BLOCK_ID` constant to `helpers.py`
- Add `fix_smart_decision_maker_blocks()` in `fixer.py` — populates
agent-mode defaults (`max_iterations=-1`,
`conversation_compaction=True`, etc.)
- Add `validate_smart_decision_maker_blocks()` in `validator.py` —
ensures downstream tool blocks are connected
- Add SmartDecisionMakerBlock documentation section in
`agent_generation_guide.md`
- Add 18 tests: 7 fixer, 7 validator, 4 e2e pipeline

## Test plan
- [x] All 18 new tests pass
(`test/agent_generator/test_smart_decision_maker.py`)
- [x] All 31 existing agent generator tests still pass
- [x] Pre-commit hooks (ruff, black, isort, pyright) all pass
- [ ] Manual: use CoPilot to generate an orchestrator agent with
SmartDecisionMakerBlock

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-03-17 16:30:04 +00:00
Reinier van der Leer
82d22f3680 dx(backend): Update CLAUDE.md (#12458)
- Prefer f-strings except for debug statements
- Top-down module/function/class ordering

As suggested by @majdyz, this is more effective than commenting on every
single instance on PRs.
2026-03-17 16:27:09 +00:00
Zamil Majdy
50622333d1 fix(backend/copilot): fix tool-result file read failures across turns (#12399)
## Summary
- **Path validation fix**: `is_allowed_local_path()` now correctly
handles the SDK's nested conversation UUID path structure
(`<encoded-cwd>/<conversation-uuid>/tool-results/<file>`) instead of
only matching `<encoded-cwd>/tool-results/<file>`
- **`read_workspace_file` fallback**: When the model mistakenly calls
`read_workspace_file` for an SDK tool-result path (local disk, not cloud
storage), the tool now falls back to reading from local disk instead of
returning "file not found"
- **Cross-turn cleanup fix**: Stopped deleting
`~/.claude/projects/<encoded-cwd>/` between turns — tool-result files
now persist across `--resume` turns so the model can re-read them. Added
TTL-based stale directory sweeping (24h) to prevent unbounded disk
growth.
- **System prompt**: Added guidance telling the model to use `read_file`
(not `read_workspace_file`) for SDK tool-result paths
- **Symlink escape fix** (e2b_file_tools.py): Added `readlink -f`
canonicalization inside the E2B sandbox to detect symlink-based path
escapes before writes
- **Stash timeout increase**: `wait_for_stash` timeout increased from
0.5s to 2.0s, with a post-timeout `sleep(0)` fallback

### Root cause
Investigated via Langfuse trace `5116befdca6a6ff9a8af6153753e267d`
(session `d5841fd8`). The model ran 3 Perplexity deep research calls,
SDK truncated large outputs to `~/.claude/projects/.../tool-results/`
files. Model then called `read_workspace_file` (cloud DB) instead of
`read_file` (local disk), getting "file not found". Additionally, the
path validation check didn't account for the SDK's nested UUID directory
structure, and cleanup between turns deleted tool-result files that the
transcript still referenced.

## Test plan
- [x] All 653 copilot tests pass (excluding 1 pre-existing infra test)
- [x] Security test `test_read_claude_projects_settings_json_denied`
still passes — non-tool-result files under the project dir are still
blocked
- [x] `poetry run format` passes all checks
2026-03-17 15:57:15 +00:00
Zamil Majdy
27af5782a9 feat(skills): add gh pr checks --watch to pr-address loop (#12457)
## Summary
- Teaches the `pr-address` skill to use `gh pr checks --watch
--fail-fast` for efficient CI waiting instead of manual polling
- Adds guidance on investigating failures with `gh run view
--log-failed`
- Adds explicit "between CI waits" section: re-fetch and address new bot
comments while CI runs

## Test plan
- [x] Verified the updated skill renders correctly
- [ ] Use `/pr-address` on a PR with pending CI to confirm the new flow
works
2026-03-17 22:10:18 +07:00
Otto
522f932e67 Merge commit from fork
SendEmailBlock accepted user-supplied smtp_server and smtp_port inputs
and passed them directly to smtplib.SMTP() with no IP validation,
bypassing the platform's SSRF protections in request.py.

This fix:
- Makes _resolve_and_check_blocked public in request.py so non-HTTP
  blocks can reuse the same IP validation
- Validates the SMTP server hostname via resolve_and_check_blocked()
  before connecting
- Restricts allowed SMTP ports to standard values (25, 465, 587, 2525)
- Catches SMTPConnectError and SMTPServerDisconnected to prevent TCP
  banner leakage in error messages

Fixes GHSA-4jwj-6mg5-wrwf
2026-03-17 15:55:49 +01:00
Otto
a6124b06d5 Merge commit from fork
* fix(backend): add HMAC signing to Redis cache to prevent pickle deserialization attacks

Add HMAC-SHA256 integrity verification to all values stored in the shared
Redis cache. This prevents cache poisoning attacks where an attacker with
Redis access injects malicious pickled payloads that execute arbitrary code
on deserialization.

Changes:
- Sign pickled values with HMAC-SHA256 before storing in Redis
- Verify HMAC signature before deserializing cached values
- Reject tampered or unsigned (legacy) cache entries gracefully
  (treated as cache misses, logged as warnings)
- Derive HMAC key from redis_password or unsubscribe_secret_key
- Add tests for HMAC round-trip, tamper detection, and legacy rejection

Fixes GHSA-rfg2-37xq-w4m9

* improve log message

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-03-17 15:52:37 +01:00
Otto
ae660ea04f Merge commit from fork
Replace NamedTemporaryFile(delete=False) with a direct Response,
preventing unbounded disk consumption on the public download endpoint.

Fixes: GHSA-374w-2pxq-c9jp
2026-03-17 15:33:55 +01:00
Otto
2479f3a1c4 Merge commit from fork
- Normalize IPv4-mapped IPv6 addresses (e.g. ::ffff:127.0.0.1) to IPv4
  before checking against blocked networks, preventing blocklist bypass
- Add missing blocked ranges: CGNAT (100.64.0.0/10), IETF Protocol
  Assignments (192.0.0.0/24), Benchmarking (198.18.0.0/15)
- Add comprehensive tests for IPv4-mapped bypass and new blocked ranges
2026-03-17 14:43:38 +01:00
Abhimanyu Yadav
8153306384 feat(frontend): reusable confetti with enhanced particles and dual bursts (#12454)
<!-- Clearly explain the need for these changes: -->

The previous confetti implementation using party-js was causing lag.
Replaced it with canvas-confetti for smoother, more performant
celebrations with enhanced visual effects.

### Changes 🏗️

- **New Confetti Component**: Reusable canvas-confetti wrapper with
AutoGPT purple color palette and Storybook stories demonstrating various
effects
- **Enhanced Wallet Confetti**: Dual simultaneous bursts at 45° and 135°
angles with larger particles (scalar 1.2) for better visibility
- **Enhanced Task Celebration**: Dual-burst confetti for task group and
individual task completion events
- **Onboarding Congrats Page**: Replaced party-js with canvas-confetti
for side-cannon animation effect
- **Dependency**: Added canvas-confetti v1.9.4, removed party-js

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Trigger task completion in wallet to see dual-burst confetti at
45° and 135° angles
- [x] Complete tasks/groups to verify celebration confetti displays with
larger particles
  - [x] Visit onboarding congratulations page to see side-cannon effect
  - [x] Verify confetti rendering performance and no console errors
2026-03-17 12:49:15 +00:00
Abhimanyu Yadav
9c3d100a22 feat(frontend): add builder e2e tests for new Flow Editor (#12436)
### Changes
- Replace skipped legacy builder tests with 8 working Playwright e2e
tests
  targeting the new Flow Editor
- Rewrite `BuildPage` page object to match new `data-id`/`data-testid`
  selectors
- Update `agent-activity.spec.ts` to use new `BuildPage` API

### Tests added
  - Build page loads successfully (canvas + control buttons)
  - Add a block via block menu search
  - Add multiple blocks
  - Remove a block (select + Backspace)
  - Save an agent (name/description, verify flowID in URL)
  - Save and verify run button becomes enabled
  - Copy and paste a node (Cmd+C/V)
  - Run an agent from the builder

 ### Test plan
  - [x] All 8 builder tests pass locally (`pnpm test:no-build
  src/tests/build.spec.ts`)
  - [x] `pnpm format`, `pnpm lint`, `pnpm types` all clean
  - [x] CI passes
2026-03-17 12:48:59 +00:00
Zamil Majdy
fc3bf6c154 fix(copilot): handle transient Anthropic API connection errors gracefully (#12445)
## Summary
- Detect transient Anthropic API errors (ECONNRESET, "socket connection
was closed unexpectedly") across all error paths in the copilot SDK
streaming loop
- Replace raw technical error messages with user-friendly text:
**"Anthropic connection interrupted — please retry"**
- Add `retryable` field to `StreamError` model so the frontend can
distinguish retryable errors
- Add **"Try Again" button** on the error card for transient errors,
which re-sends the last user message

### Background
Sentry issue
[AUTOGPT-SERVER-875](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-875)
— 25+ events since March 13, caused by Anthropic API infrastructure
instability (confirmed by their status page). Same SDK/code on dev and
prod, prod-only because of higher volume of long-running streaming
sessions.

### Changes
**Backend (`constants.py`, `service.py`, `response_adapter.py`,
`response_model.py`):**
- `is_transient_api_error()` — pattern-matching helper for known
transient error strings
- Intercept transient errors in 3 places: `AssistantMessage.error`,
stream exceptions, `BaseException` handler
- Use friendly message in error markers persisted to session (so it
shows properly on page refresh too)
- `StreamError.retryable` field for frontend consumption

**Frontend (`ChatContainer`, `ChatMessagesContainer`,
`MessagePartRenderer`):**
- Thread `onRetry` callback from `ChatContainer` →
`ChatMessagesContainer` → `MessagePartRenderer`
- Detect transient error text in error markers and show "Try Again"
button via existing `ErrorCard.onRetry`
- Clicking "Try Again" re-sends the last user message (backend
auto-cleans stale error markers)

Fixes SECRT-2128, SECRT-2129, SECRT-2130

## Test plan
- [ ] Verify transient error detection with `is_transient_api_error()`
for known patterns
- [ ] Confirm error card shows "Anthropic connection interrupted —
please retry" instead of raw socket error
- [ ] Confirm "Try Again" button appears on transient error cards
- [ ] Confirm "Try Again" re-sends the last user message successfully
- [ ] Confirm non-transient errors (e.g., "Prompt is too long") still
show original error text without retry button
- [ ] Verify error marker persists correctly on page refresh
2026-03-17 12:48:53 +00:00
Abhimanyu Yadav
e32d258a7e feat(blocks): add AgentMail integration blocks (#12417)
## Summary
- Add a full AgentMail integration with blocks for managing inboxes,
messages, threads, drafts, attachments, lists, and pods
- Includes shared provider configuration (`_config.py`) with API key
authentication
- 8 block modules covering ~25 individual blocks across all AgentMail
API surfaces

  ## Block Modules
  | Module | Blocks |
  |--------|--------|
  | `inbox.py` | Create, Get, List, Update, Delete inboxes |
| `messages.py` | Send, Get, List, Delete messages + org-wide listing |
  | `threads.py` | Get, List, Delete threads + org-wide listing |
| `drafts.py` | Create, Get, List, Update, Send, Delete drafts +
org-wide listing |
  | `attachments.py` | Download attachments |
  | `lists.py` | Create, Get, List, Update, Delete mailing lists |
  | `pods.py` | Create, Get, List, Update, Delete pods |

  ## Test plan
- [x] `poetry run pytest 'backend/blocks/test/test_block.py' -xvs` — all
new blocks pass the standard block test suite
  - [x] test all blocks manually
2026-03-17 12:40:32 +00:00
Abhimanyu Yadav
3e86544bfe feat(frontend): add graph search functionality to new builder (#12395)
### Changes
- Integrates the existing graph search components into the new builder's
control panel
- Search by block name/title, block type, node inputs/outputs, and
description with fuzzy matching
  (Jaro-Winkler)
- Clicking a result zooms/navigates to the node on the canvas
- Keyboard shortcut Cmd/Ctrl+F to open search
- Arrow key navigation and Enter to select within results
- Styled to match the new builder's block menu card pattern


https://github.com/user-attachments/assets/41ed676d-83b1-4f00-8611-00d20987a7af


### Test plan

- [x] Open builder with a graph containing multiple nodes
- [x] Click magnifying glass icon in control panel — search panel opens
- [x] Type a query — results filter by name, type, inputs, outputs
- [x] Click a result — canvas zooms to that node
- [x] Use arrow keys + Enter to navigate and select results
- [x] Press Cmd/Ctrl+F — search panel opens
- [x] Press Escape or click outside — search panel closes and query
clears
2026-03-17 12:19:54 +00:00
Abhimanyu Yadav
c6b729bdfa fix(frontend): replace custom LibraryTabs with design system TabsLine (#12444)
Replaces the custom LibraryTabs component with the design system's
TabsLine component throughout the library page for better UI
consistency. Also wires up favorite animation refs and removes the
unused `agentGraphVersion` field from the test fixture.

### Changes 🏗️

- Replace `LibraryTabs` with `TabsLine` from design system in
`FavoritesSection`, `LibrarySubSection`, and `page.tsx`
- Add favorite animation ref registration in `FavoritesSection` and
`LibrarySubSection`
- Inline tab type definition as `{ id: string; title: string; icon: Icon
}` in component props
- Remove unused `agentGraphVersion` field from `load_store_agents.py`
test

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Library page renders with both "All" and "Favorites" tabs using
TabsLine component
  - [x] Tab switching between all agents and favorites works correctly
  - [x] Favorite animations reference the correct tab element
2026-03-17 10:39:12 +00:00
Zamil Majdy
7a391fbd99 feat(platform): CoPilot credit charging, token rate limiting, and usage UI (#12385)
### Background
CoPilot block execution was not charging credits, LLM token usage was
not tracked, and there was no per-user rate limiting. This PR adds all
three, plus a frontend usage indicator.

### Screenshot

<!-- Drag-drop the usage limits screenshot here -->

### Changes

**Credit Charging** (`copilot/tools/helpers.py`)
- Pre-execution balance check + post-execution credit deduction via
`block_usage_cost` / `spend_credits`
- Uses adapter pattern (RPC fallback) so it works in the CoPilot
executor which has no Prisma connection

**Token Rate Limiting** (`copilot/rate_limit.py`)
- Redis-backed daily + weekly fixed-window counters per user
- Fail-open on Redis outages, clock-skew-safe weekly boundaries
- Configurable via `daily_token_limit` / `weekly_token_limit` (0 =
unlimited)

**Token Tracking**
- *Baseline* (`copilot/baseline/service.py`):
`stream_options={"include_usage": True}` with tiktoken fallback
estimation
- *SDK* (`copilot/sdk/service.py`): Extract usage from Claude Agent SDK
`ResultMessage`, including cached tokens
- Both: yield `StreamUsage` SSE events, persist `Usage` records, call
`record_token_usage` in `finally`

**Usage API** (`api/features/chat/routes.py`)
- `GET /api/chat/usage` — returns `CoPilotUsageStatus` (daily/weekly
used, limit, resets_at)
- Pre-turn `check_rate_limit` in `stream_chat_post` (returns 429 on
exceed)

**Frontend** (`copilot/components/UsageLimits/`)
- `UsageLimits` popover with daily/weekly progress bars, reset times,
dark mode
- `useUsageLimits` hook with 30s auto-refresh via generated Orval API
hook

### Tests
| Area | Tests | File |
|------|-------|------|
| Rate limiting | 22 | `rate_limit_test.py` |
| Credit charging | 12 | `helpers_test.py` |
| Usage API | 3 | `routes_test.py` |
| Frontend UI | 9 | `UsageLimits.test.tsx` |

### Checklist

- [x] Changes clearly listed
- [x] Test plan created and executed (46 backend + 9 frontend tests)
- [x] Pre-commit hooks pass (formatting, linting, type checks)
- [x] `.env.default` compatible (new config defaults to 0 = unlimited)
- [x] `docker-compose.yml` compatible (no changes needed)
2026-03-17 08:43:27 +00:00
Zamil Majdy
791dd7cb48 fix(backend): split CamelCase block names and filter disabled blocks before batch slicing (#12400)
## Summary

Two bugs causing blocks to be invisible in CoPilot search:

### Bug 1: CamelCase block names not tokenized
Block names like `AITextGeneratorBlock` were indexed as single tokens in
the search database. PostgreSQL's `plainto_tsquery('english', ...)` and
the BM25 tokenizer both treat CamelCase as one word, so searching for
"text generator" produced zero lexical/BM25 match.

**Fix:** Split CamelCase names into separate words before indexing (e.g.
`"AI Text Generator Block"`) and in the BM25 tokenizer.

### Bug 2: Disabled blocks exhausting batch budget (root cause of 36
missing blocks)
The `batch_size` limit in `get_missing_items()` was applied **before**
filtering out disabled blocks. With 120+ disabled blocks and
`batch_size=100`, the first 100 missing entries were all disabled
(skipped via `continue`), leaving the 36 enabled blocks beyond the slice
boundary **never indexed**. This made core blocks like
`AITextGeneratorBlock`, `AIConversationBlock`, `AIListGeneratorBlock`,
etc. completely invisible to search.

**Fix:** Filter disabled blocks from the missing list before slicing by
`batch_size`.

### Changes
- **`content_handlers.py`**: 
- Split CamelCase block names into space-separated words when building
`searchableText`
- Filter disabled blocks before applying `batch_size` slice so enabled
blocks aren't starved
- **`hybrid_search.py`**: Updated BM25 `tokenize()` to split CamelCase
tokens

### Evidence from local DB
```
Indexed blocks: 341
Total blocks: 497 (156 missing from index)
Missing (non-disabled): 36 — including AITextGeneratorBlock, AIConversationBlock, etc.

# batch_size analysis:
First 100 missing: 0 enabled, 100 disabled  ← batch exhausted by disabled blocks
After 100: 36 enabled                        ← never reached!
```

## Test plan
- [ ] Verify CamelCase splitting: `AITextGeneratorBlock` → `AI Text
Generator Block`
- [ ] Run `poetry run pytest backend/api/features/store/` for
regressions
- [ ] After deploy, trigger embedding backfill and verify all 36 blocks
get indexed
- [ ] Search for "text generator" in CoPilot and verify
`AITextGeneratorBlock` appears
2026-03-17 08:36:53 +00:00
Abhimanyu Yadav
f0800b9420 feat(frontend): add rich media previews for Builder node outputs and file inputs (#12432)
### Changes
- Add YouTube/Vimeo embed support to `VideoRenderer` — URLs render as
embedded
  iframe players instead of plain text
- Add new `AudioRenderer` — HTTP audio URLs (.mp3, .wav, .ogg, .m4a,
.aac,
  .flac) and data URIs render as inline audio players
- Add new `LinkRenderer` — any HTTP/HTTPS URL not claimed by a media
renderer
  becomes a clickable link with an external-link icon
- Add media preview button to `FileInput` — uploaded audio, video, and
image
files show an Eye icon that opens a preview dialog reusing the
OutputRenderer
  system
- Update `ContentRenderer` shortContent gate to allow new renderers
through in
  node previews


https://github.com/user-attachments/assets/eea27fb7-3870-4a1e-8d08-ba23b6e07d74

### Test plan
- [x] `pnpm vitest run src/components/contextual/OutputRenderers/` — 36
tests
  passing
- [x] `pnpm format && pnpm lint && pnpm types` — all clean
- [x] Manual: run a block that outputs a YouTube URL → embedded player
- [x] Manual: run a block that outputs an audio file URL → audio player
- [x] Manual: run a block that outputs a generic URL → clickable link
- [x] Manual: upload an audio/video/image file to a file input → Eye
icon
  appears, clicking opens preview dialog
2026-03-17 07:09:02 +00:00
Abhimanyu Yadav
60bc49ba50 fix(platform): fix image delete button on EditAgentForm (#12362)
### Summary
- SECRT-2094: Fix store image delete button accidentally submitting the
edit form — the remove image <button> in ThumbnailImages.tsx was missing
type="button", causing it to act as a form submit inside the
EditAgentForm. This closed the modal and showed a success toast without
the user clicking "Update submission".

https://github.com/user-attachments/assets/86cbdd7d-90b1-473c-9709-e75e956dea6b

###  Changes
- `frontend/.../ThumbnailImages.tsx` — added type="button" to image
remove button
2026-03-17 07:06:05 +00:00
Abhimanyu Yadav
ba4f4b6242 test(frontend): add integration tests for builder UI state stores and draft recovery (part-2) (#12435)
### Changes
- Add integration tests for `controlPanelStore` (sidebar panel state
  management)
- Add integration tests for `blockMenuStore` (search/filter/category
state,
  creator list deduplication, reset behavior)
- Add integration tests for `tutorialStore` (tutorial lifecycle, step
  progression, input values)
- Add integration tests for `DraftRecoveryPopup` (diff summary
rendering,
  restore/discard actions, null diff fallback, singular/plural text)

### Test plan
  - [x] All 54 tests pass across 4 new test files
  - [x] `pnpm format` clean
  - [x] `pnpm lint` clean
  - [x] `pnpm types` clean
2026-03-17 07:05:51 +00:00
Nicholas Tindle
8892bcd230 docs: Add workspace and media file architecture documentation (#11989)
### Changes 🏗️

- Added comprehensive architecture documentation at
`docs/platform/workspace-media-architecture.md` covering:
  - Database models (`UserWorkspace`, `UserWorkspaceFile`)
  - `WorkspaceManager` API with session scoping
- `store_media_file()` media normalization pipeline (input types, return
formats)
  - Virus scanning responsibility boundaries
- Decision tree for choosing `WorkspaceManager` vs `store_media_file()`
- Configuration reference including `clamav_max_concurrency` and
`clamav_mark_failed_scans_as_clean`
  - Common patterns with error handling examples
- Updated `autogpt_platform/backend/CLAUDE.md` with a "Workspace & Media
Files" section referencing the new docs
- Removed duplicate `scan_content_safe()` call from
`WriteWorkspaceFileTool` — `WorkspaceManager.write_file()` already scans
internally, so the tool was double-scanning every file
- Replaced removed comment in `workspace.py` with explicit ownership
comment clarifying that `WorkspaceManager` is the single scanning
boundary

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified `scan_content_safe()` is called inside
`WorkspaceManager.write_file()` (workspace.py:186)
- [x] Verified `store_media_file()` scans all input branches including
local paths (file.py:351)
- [x] Verified documentation accuracy against current source code after
merge with dev
  - [x] CI checks all passing

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Mostly adds documentation and internal developer guidance; the only
code change is a comment clarifying `WorkspaceManager.write_file()` as
the single virus-scanning boundary, with no behavior change.
> 
> **Overview**
> Adds a new `docs/platform/workspace-media-architecture.md` describing
the Workspace storage layer vs the `store_media_file()` media pipeline,
including session scoping and virus-scanning/persistence responsibility
boundaries.
> 
> Updates backend `CLAUDE.md` to point contributors to the new doc when
working on CoPilot uploads/downloads or
`WorkspaceManager`/`store_media_file()`, and clarifies in
`WorkspaceManager.write_file()` (comment-only) that callers should not
duplicate virus scanning.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
18fcfa03f8. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 06:12:26 +00:00
Zamil Majdy
48ff8300a4 Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-03-17 13:13:42 +07:00
Abhimanyu Yadav
c268fc6464 test(frontend/builder): add integration tests for builder stores, components, and hooks (part-1) (#12433)
### Changes
- Add 329 integration tests across 11 test files for the builder (visual
  workflow editor)
- Cover all Zustand stores (nodeStore, edgeStore, historyStore,
graphStore,
  copyPasteStore, blockMenuStore, controlPanelStore)
- Cover key components (CustomNode, NewBlockMenu, NewSaveControl,
RunGraph)
- Cover hooks (useFlow, useCopyPaste)

### Test files

  | File | Tests | Coverage |
  |------|-------|----------|
| `nodeStore.test.ts` | 58 | Node lifecycle, bulk ops, backend
conversion,
  execution tracking, status, errors, resolution mode |
  | `edgeStore.test.ts` | 37 | Edge CRUD, duplicate rejection, bead
  visualization, backend link conversion, upsert |
| `historyStore.test.ts` | 22 | Undo/redo, history limits (50),
microtask
  batching, deduplication, canUndo/canRedo |
| `graphStore.test.ts` | 28 | Execution status transitions,
isGraphRunning,
  schema management, sub-graphs |
| `copyPasteStore.test.ts` | 8 | Copy/paste with ID remapping, position
offset,
   edge preservation |
| `CustomNode.test.tsx` | 25 | Rendering by block type (NOTE, WEBHOOK,
AGENT,
  OUTPUT, AYRSHARE), error states |
| `NewBlockMenu.test.tsx` | 29 | Store state (search, filters, creators,
  categories), search/default view routing |
| `NewSaveControl.test.tsx` | 11 | Save dialog rendering, form
validation,
  version display, popover state |
| `RunGraph.test.tsx` | 11 | Run/stop button states, loading, click
handlers,
  RunInputDialog visibility |
  | `useFlow.test.ts` | 4 | Loading states, initial load completion |
| `useCopyPaste.test.ts` | 16 | Clipboard copy/paste, UUID remapping,
viewport
  centering, input field guard |
2026-03-17 05:24:55 +00:00
Reinier van der Leer
aff3fb44af ci(platform): Improve end-to-end CI & reduce its cost (#12437)
Our CI costs are skyrocketing, most of it because of
`platform-fullstack-ci.yml`. The `types` job currently uses in a
`big-boi` runner (= expensive), but doesn't need to.
Additionally, the "end-to-end tests" job is currently in
`platform-frontend-ci.yml` instead of `platform-fullstack-ci.yml`,
causing it not to run on backend changes (which it should).

### Changes 🏗️

- Simplify `check-api-types` job (renamed from `types`) and make it use
regular `ubuntu-latest` runner
- Export API schema from backend through CLI (instead of spinning it up
in docker)
- Fix dependency caching in `platform-fullstack-ci.yml` (based on recent
improvements in `platform-frontend-ci.yml`)
- Move `e2e_tests` job to `platform-fullstack-ci.yml`

Out-of-scope but necessary:
- Eliminate module-level init of OpenAI client in
`backend.copilot.service`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI
2026-03-16 23:08:18 +00:00
Zamil Majdy
9a41312769 feat(backend/copilot): parse @@agptfile bare refs by file extension (#12392)
The `@@agptfile:` expansion system previously used content-sniffing
(trying
`json.loads` then `csv.Sniffer`) to decide whether to parse file content
as
structured data. This was fragile — a file containing just `"42"` would
be
parsed as an integer, and the heuristics could misfire on ambiguous
content.

This PR replaces content-sniffing with **extension/MIME-based format
detection**.
When the file has a well-known extension (`.json`, `.csv`, etc.) or MIME
type
fragment (`workspace://id#application/json`), the content is parsed
accordingly.
Unknown formats or parse failures always fall back to plain string — no
surprises.

> [!NOTE]
> This PR builds on the `@@agptfile:` file reference protocol introduced
in #12332 and the structured data auto-parsing added in #12390.
>
> **What is `@@agptfile:`?**
> It is a special URI prefix (e.g. `@@agptfile:workspace:///report.csv`)
that the CoPilot SDK expands inline before sending tool arguments to
blocks. This lets the AI reference workspace files by name, and the SDK
automatically reads and injects the file content. See #12332 for the
full design.

### Changes 🏗️

**New utility: `backend/util/file_content_parser.py`**
- `infer_format(uri)` — determines format from file extension or MIME
fragment
- `parse_file_content(content, fmt)` — parses content, never raises
- Supported text formats: JSON, JSONL/NDJSON, CSV, TSV, YAML, TOML
- Supported binary formats: Parquet (via pyarrow), Excel/XLSX (via
openpyxl)
- JSON scalars (strings, numbers, booleans, null) stay as strings — only
  containers (arrays, objects) are promoted
- CSV/TSV require ≥1 row and ≥2 columns to qualify as tabular data
- Added `openpyxl` dependency for Excel reading via pandas
- Case-insensitive MIME fragment matching per RFC 2045
- Shared `PARSE_EXCEPTIONS` constant to avoid duplication between
modules

**Updated `expand_file_refs_in_args` in `file_ref.py`**
- Bare refs now use `infer_format` + `parse_file_content` instead of the
  old `_try_parse_structured` content-sniffing function
- Binary formats (parquet, xlsx) read raw bytes via `read_file_bytes`
- Embedded refs (text around `@@agptfile:`) still produce plain strings
- **Size guards**: Workspace and sandbox file reads now enforce a 10 MB
limit
  (matching the existing local file limit) to prevent OOM on large files

**Updated `blocks/github/commits.py`**
- Consolidated `_create_blob` and `_create_binary_blob` into a single
function
  with an `encoding` parameter

**Updated copilot system prompt**
- Documents the extension-based structured data parsing and supported
formats

**66 new tests** in `file_content_parser_test.py` covering:
- Format inference (extension, MIME, case-insensitive, precedence)
- All 8 format parsers (happy path + edge cases + fallbacks)
- Binary format handling (string input fallback, invalid bytes fallback)
- Unknown format passthrough

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All 66 file_content_parser_test.py tests pass
  - [x] All 31 file_ref_test.py tests pass
  - [x] All 13 file_ref_integration_test.py tests pass
  - [x] `poetry run format` passes clean (including pyright)
2026-03-16 22:31:21 +00:00
Ubbe
048fb06b0a feat(frontend): add "Jump Back In" button to Library page (#12387)
Adds a "Jump Back In" CTA at the top of the Library page to encourage
users to quickly rerun their most recently successful agent.

Closes SECRT-1536

### Changes 🏗️

- New `JumpBackIn` component with `useJumpBackIn` hook at
`library/components/JumpBackIn/`
- Fetches first page of library agents sorted by `updatedAt`
- Finds the first agent with a `COMPLETED` execution in
`recent_executions`
- Shows banner with agent name + "Jump Back In" button linking to
`/library/agents/{id}`
- Returns `null` (hidden) when loading or when no agent with a
successful run exists

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `pnpm format`, `pnpm lint`, `pnpm types` all pass
- [x] Verified banner is hidden when no successful runs exist (edge
case)
- [x] Verified library page renders correctly with no visual regressions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 21:35:03 +08:00
Zamil Majdy
3f653e6614 dx(.claude): refactor and consolidate Claude Code skills (#12424)
Refactors the Claude Code skills for a cleaner, more intuitive dev loop.

### Changes 🏗️

- **`/pr-review` (new)**: Actual code review skill — reads the PR diff,
fetches existing comments to avoid duplicates, and posts inline GitHub
comments with structured feedback (Blockers / Should Fix / Nice to Have
/ Nit) covering correctness, security, code quality, architecture, and
testing.

- **`/pr-address` (was `/babysit-pr`)**: Addresses review comments and
monitors CI until green. Renamed from `/babysit-pr` to `/pr-address` to
better reflect its purpose. Handles bot-specific feedback
(autogpt-reviewer, sentry, coderabbitai) and loops until all comments
are addressed and CI is green.

- **`/backend-check` + `/frontend-check` → `/check`**: Unified into a
single `/check` skill that auto-detects whether backend (Python) or
frontend (TypeScript) code changed and runs the appropriate formatting,
linting, type checking, and tests. Shared code quality rules applied to
both.

- **`/code-style` enhanced**: Now covers both Python and
TypeScript/React. Added learnings from real PR work: lazy `%s` logging,
TOCTOU awareness, SSE protocol rules (`data:` vs `: comment`), FastAPI
`Security()` vs `Depends()`, Redis pipeline atomicity, error path
sanitization, mock target rules after refactoring.

- **`/worktree` fixed**: Normal `git worktree` is now the default (was
branchlet-first). Branchlet moved to optional section. All paths derived
from `git rev-parse --show-toplevel`.

- **`/pr-create`, `/openapi-regen`, `/new-block` cleaned up**: Reference
`/check` and `/code-style` instead of duplicating instructions.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified all skill files parse correctly (valid YAML frontmatter)
  - [x] Verified skill auto-detection triggers updated in descriptions
- [x] Verified old backend-check and frontend-check directories removed
- [x] Verified pr-review and pr-address directories created with correct
content
2026-03-16 10:35:05 +00:00
Zamil Majdy
c9c3d54b2b fix(platform): reduce Sentry noise by filtering expected errors and downgrading log levels (#12430)
## Summary

Reduces Sentry error noise by ~90% by filtering out expected/transient
errors and downgrading inappropriate error-level logs to warnings. Most
of the top Sentry issues are not actual bugs but expected conditions
(user errors, transient infra, business logic) that were incorrectly
logged at ERROR level, causing them to be captured as Sentry events.

## Changes

### 1. Sentry `before_send` filter (`metrics.py`)
Added a `before_send` hook to filter known expected errors before they
reach Sentry:
- **AMQP/RabbitMQ connection errors** — transient during
deploys/restarts
- **User credential errors** — invalid API keys, missing auth headers
(user error, not platform bug)
- **Insufficient balance** — expected business logic
- **Blocked IP access** — security check working as intended
- **Discord bot token errors** — misconfiguration, not runtime error
- **Google metadata DNS errors** — expected in non-GCP environments
- **Inactive email recipients** — expected for bounced addresses
- **Unclosed client sessions/connectors** — resource cleanup noise

### 2. Connection retry log levels (`retry.py`)
- `conn_retry` final failure: `error` → `warning` (these are infra
retries, not bugs)
- `conn_retry` wrapper final failure: `error` → `warning`
- Discord alert send failure: `error` → `warning`

### 3. Block execution Sentry capture (`manager.py`)
- Skip `sentry_sdk.capture_exception()` for `ValueError` subclasses
(BlockExecutionError, BlockInputError, InsufficientBalanceError, etc.) —
these are user-caused errors, not platform bugs
- Downgrade executor shutdown/disconnect errors to warning

### 4. Scheduler log levels (`scheduler.py`)
- Graph validation failure: `error` → `warning` (expected for
old/invalid graphs)
- Unable to unschedule graph: `error` → `warning`
- Job listener failure: `error` → `warning`
- Async operation failure: `error` → `warning`

### 5. Discord system alert (`notifications.py`)
- Wrapped `discord_system_alert` endpoint with try/catch to prevent
unhandled exceptions (fixes AUTOGPT-SERVER-743, AUTOGPT-SERVER-7MW)

### 6. Notification system log levels (`notifications.py`)
- All batch processing errors: `error` → `warning`
- User email not found: `error` → `warning`
- Notification parsing errors: `error` → `warning`
- Email sending failures: `error` → `warning`
- Summary data gathering failure: `error` → `warning`
- Cleaned up unprofessional error messages

### 7. Cloud storage cleanup (`cloud_storage.py`)
- Cleanup error: `error` → `warning`

## Sentry Issues Addressed

### AMQP/RabbitMQ (~3.4M events total)
-
[AUTOGPT-SERVER-3H2](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H2)
— AMQPConnector ConnectionRefusedError (1.2M events)
-
[AUTOGPT-SERVER-3H3](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H3)
— AMQPConnectionWorkflowFailed (770K events)
-
[AUTOGPT-SERVER-3H4](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H4)
— AMQP connection workflow failed (770K events)
-
[AUTOGPT-SERVER-3H5](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H5)
— AMQPConnectionWorkflow reporting failure (770K events)
-
[AUTOGPT-SERVER-3H7](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H7)
— Socket failed to connect (514K events)
-
[AUTOGPT-SERVER-3H8](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H8)
— TCP Connection attempt failed (514K events)
-
[AUTOGPT-SERVER-3H6](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H6)
— AMQPConnectionError (93K events)
-
[AUTOGPT-SERVER-7SX](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7SX)
— Error creating transport (69K events)
-
[AUTOGPT-SERVER-1TN](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-1TN)
— ChannelInvalidStateError (39K events)
-
[AUTOGPT-SERVER-6JC](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6JC)
— ConnectionClosedByBroker (2K events)
-
[AUTOGPT-SERVER-6RJ/6RK/6RN/6RQ/6RP/6RR](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6RJ)
— Various connection failures (~15K events)
-
[AUTOGPT-SERVER-4A5/6RM/7XN](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-4A5)
— Connection close/transport errors (~540 events)

### User Credential Errors (~15K events)
-
[AUTOGPT-SERVER-6S5](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6S5)
— Incorrect OpenAI API key (9.2K events)
-
[AUTOGPT-SERVER-7W4](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7W4)
— Incorrect API key in AIConditionBlock (3.4K events)
-
[AUTOGPT-SERVER-83Y](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-83Y)
— AI condition invalid key (2.3K events)
-
[AUTOGPT-SERVER-7ZP](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7ZP)
— Perplexity missing auth header (451 events)
-
[AUTOGPT-SERVER-7XK/7XM](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7XK)
— Anthropic invalid key (125 events)
-
[AUTOGPT-SERVER-82C](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-82C)
— Missing auth header (27 events)
-
[AUTOGPT-SERVER-721](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-721)
— Ideogram invalid token (165 events)

### Business Logic / Validation (~120K events)
-
[AUTOGPT-SERVER-7YQ](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7YQ)
— Disabled block used in graph (56K events)
-
[AUTOGPT-SERVER-6W3](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6W3)
— Graph failed validation (46K events)
-
[AUTOGPT-SERVER-6W2](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6W2)
— Unable to unschedule graph (46K events)
-
[AUTOGPT-SERVER-83X](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-83X)
— Blocked IP access (15K events)
-
[AUTOGPT-SERVER-6K9](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6K9)
— Insufficient balance (4K events)

### Discord Alert Failures (~24K events)
-
[AUTOGPT-SERVER-743](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-743)
— Discord improper token (22K events)
-
[AUTOGPT-SERVER-7MW](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7MW)
— Discord 403 Missing Access (1.5K events)

### Notification System (~16K events)
-
[AUTOGPT-SERVER-550](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-550)
— Notification batch create error (8.3K events)
-
[AUTOGPT-SERVER-58H](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-58H)
— ValidationError for NotificationEventModel (3K events)
-
[AUTOGPT-SERVER-5C6](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-5C6)
— Get notification batch error (2.1K events)
-
[AUTOGPT-SERVER-4BT](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-4BT)
— Notification batch create error (1.8K events)
-
[AUTOGPT-SERVER-5E4](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-5E4)
— NotificationPreference validation (1.4K events)
-
[AUTOGPT-SERVER-508](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-508)
— Inactive email recipients (702 events)

### Infrastructure / Transient (~20K events)
-
[AUTOGPT-SERVER-6WJ](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6WJ)
— Unclosed client session (13K events)
-
[AUTOGPT-SERVER-745](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-745)
— Unclosed connector (5.8K events)
-
[AUTOGPT-SERVER-4V1](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-4V1)
— Google metadata DNS error (2.2K events)
-
[AUTOGPT-SERVER-80J](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-80J)
— CloudStorage DNS error (35 events)

### Executor Shutdown
-
[AUTOGPT-SERVER-55J](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-55J)
— Error disconnecting run client (118 events)

## Test plan
- [x] All pre-commit hooks pass (Ruff, isort, Black, Pyright typecheck)
- [x] All changed modules import successfully
- [ ] Deploy to staging and verify Sentry event volume drops
significantly
- [ ] Verify legitimate errors still appear in Sentry
2026-03-16 10:29:01 +00:00
Ubbe
53d58e21d3 feat(frontend): replace technical block terminology with user-friendly labels (#12389)
## Summary
- Replaces all user-facing "block" terminology in the CoPilot activity
stream with plain-English labels ("Step failed", "action",
"Credentials", etc.)
- Adds `humanizeFileName()` utility to display file names without
extensions, with title-case and spaces (e.g. `executive_memo.md` →
`"Executive Memo"`)
- Updates error messages across RunBlock, RunAgent, and FindBlocks tools
to use friendly language

## Test plan
- [ ] Open CoPilot and trigger a block execution — verify animation text
says "Running" / "Step failed" instead of "Running the block" / "Error
running block"
- [ ] Trigger a file read/write action — verify the activity shows
humanized file names (e.g. `Reading "Executive Memo"` not `Reading
executive_memo.md`)
- [ ] Trigger FindBlocks — verify labels say "Searching for actions" and
"Results" instead of "Searching for blocks" and "Block results"
- [ ] Check the work-done stats bar — verify it shows "action" /
"actions" instead of "block run" / "block runs"
- [ ] Trigger a setup requirements card — verify labels say
"Credentials" and "Inputs" instead of "Block credentials" and "Block
inputs"
- [ ] Visit `/copilot/styleguide` — verify error test data no longer
contains "Block execution" text

Resolves: [SECRT-2025](https://linear.app/autogpt/issue/SECRT-2025)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 09:00:25 +00:00
Ubbe
fa04fb41d8 feat(frontend): add "Run now" button to schedule view (#12388)
Adds a "Run now" action to the schedule detail view and sidebar
dropdown, allowing users to immediately trigger a scheduled agent run
without waiting for the next cron execution.

### Changes 🏗️

- **`useSelectedScheduleActions.ts`**: Added
`usePostV1ExecuteGraphAgent` hook and `handleRunNow` function that
executes the agent using the schedule's stored `input_data` and
`input_credentials`. On success, invalidates runs query and navigates to
the new run
- **`SelectedScheduleActions.tsx`**: Added Play icon button as first
action button, with loading spinner while running
- **`SelectedScheduleView.tsx`**: Threads `onSelectRun` prop and
`schedule` object to action components (both mobile and desktop layouts)
- **`NewAgentLibraryView.tsx`**: Passes `onSelectRun` handler to enable
navigation to the new run after execution
- **`ScheduleActionsDropdown.tsx`**: Added "Run now" dropdown menu item
with same execution logic
- **`ScheduleListItem.tsx`**: Added `onRunCreated` prop passed to
dropdown
- **`SidebarRunsList.tsx`**: Connects sidebar dropdown to run
selection/navigation

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `pnpm format`, `pnpm lint`, `pnpm types` all pass
- [x] Code review: follows existing patterns (mirrors "Run Again" in
SelectedRunActions)
  - [x] No visual regressions on agent detail page

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 17:00:41 +08:00
Otto
d9c16ded65 fix(copilot): prioritize block discovery over MCP and sanitize HTML errors (#12394)
Requested by @majdyz

When a user asks for Google Sheets integration, the CoPilot agent skips
block discovery entirely (despite 55+ Google Sheets blocks being
available), jumps straight to MCP, guesses a fake URL
(`https://sheets.googleapis.com/mcp`), and gets a raw HTML 404 error
page dumped into the conversation.

**Changes:**

1. **MCP guide** (`mcp_tool_guide.md`): Added "Check blocks first"
section directing the agent to use `find_block` before attempting MCP
for any service not in the known servers list. Explicitly prohibits
guessing/constructing MCP server URLs.

2. **Error handling** (`run_mcp_tool.py`): Detects HTML error pages in
HTTP responses (e.g. raw 404 pages from non-MCP endpoints) and returns a
clean one-liner like "This URL does not appear to host an MCP server"
instead of dumping the full HTML body.

**Note:** The main CoPilot system prompt (managed externally, not in
repo) should also be updated to reinforce block-first behavior in the
Capability Check section. This PR covers the in-repo changes.

Session reference: `9216df83-5f4a-48eb-9457-3ba2057638ae` (turn 3)
Ticket: [SECRT-2116](https://linear.app/autogpt/issue/SECRT-2116)

---
Co-authored-by: Zamil Majdy (@majdyz) <majdyz@gmail.com>

---------

Co-authored-by: Zamil Majdy (@majdyz) <majdyz@gmail.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-14 12:49:03 +00:00
Otto
6dc8429ae7 fix(copilot): downgrade agent validation failure log from error to warning (#12409)
Agent validation failures are expected when the LLM generates invalid
agent graphs (wrong block IDs, missing required inputs, bad output field
names). The validator catches these and returns proper error responses.

However, `validator.py:938` used `logger.error()`, which Sentry captures
as error events — flooding #platform-alerts with non-errors.

This changes it to `logger.warning()`, keeping the log visible for
debugging without triggering Sentry alerts.

Fixes SECRT-2120

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-14 12:48:36 +00:00
Zamil Majdy
cfe22e5a8f fix(backend/copilot): sync TranscriptBuilder with CLI on mid-stream compaction (#12401)
## Summary
- **Root cause**: `TranscriptBuilder` accumulates all raw SDK stream
messages including pre-compaction content. When the CLI compacts
mid-stream, the uploaded transcript was still uncompacted, causing
"Prompt is too long" errors on the next `--resume` turn.
- **Fix**: Detect mid-stream compaction via the `PreCompact` hook, read
the CLI's session file to get the compacted entries (summary +
post-compaction messages), and call
`TranscriptBuilder.replace_entries()` to sync it with the CLI's active
context. This ensures the uploaded transcript always matches what the
CLI sees.
- **Key changes**:
- `CompactionTracker`: stores `transcript_path` from `PreCompact` hook,
one-shot `compaction_just_ended` flag that correctly resets for multiple
compactions
- `read_compacted_entries()`: reads CLI session JSONL, finds
`isCompactSummary: true` entry, returns it + all entries after. Includes
path validation against the CLI projects directory.
- `TranscriptBuilder.replace_entries()`: clears and replaces all entries
with compacted ones, preserving `isCompactSummary` entries (which have
`type: "summary"` that would normally be stripped)
- `load_previous()`: also preserves `isCompactSummary` entries when
loading a previously compacted transcript
- Service stream loop: after compaction ends, reads compacted entries
and syncs TranscriptBuilder

## Test plan
- [x] 69 tests pass across `compaction_test.py` and `transcript_test.py`
- [x] Tests cover: one-shot flag behavior, multiple compactions within a
query, transcript path storage, path traversal rejection,
`read_compacted_entries` (7 tests), `replace_entries` (4 tests),
`load_previous` with compacted content (2 tests)
- [x] Pre-commit hooks pass (lint, format, typecheck)
- [ ] Manual test: trigger compaction in a multi-turn session and verify
the uploaded transcript reflects compaction
2026-03-13 22:17:46 +00:00
Otto
0b594a219c feat(copilot): support prompt-in-URL for shareable prompt links (#12406)
Requested by @torantula

Add support for shareable AutoPilot URLs that contain a prompt in the
URL hash fragment, inspired by [Lovable's
implementation](https://docs.lovable.dev/integrations/build-with-url).

**URL format:**
- `/copilot#prompt=URL-encoded-text` — pre-fills the input for the user
to review before sending
- `/copilot?autosubmit=true#prompt=...` — auto-creates a session and
sends the prompt immediately

**Example:**
```
https://platform.agpt.co/copilot#prompt=Create%20a%20todo%20app
https://platform.agpt.co/copilot?autosubmit=true#prompt=Create%20a%20todo%20app
```

**Key design decisions:**
- Uses URL fragment (`#`) instead of query params — fragments never hit
the server, so prompts stay client-side only (better for privacy, no
backend URL length limits)
- URL is cleaned via `history.replaceState` immediately after extraction
to prevent re-triggering on navigation/reload
- Leverages existing `pendingMessage` + `createSession()` flow for
auto-submit — no new backend APIs needed
- For populate-only mode, passes `initialPrompt` down through component
tree to pre-fill the chat input

**Files changed:**
- `useCopilotPage.ts` — URL hash extraction logic + `initialPrompt`
state
- `CopilotPage.tsx` — passes `initialPrompt` to `ChatContainer`
- `ChatContainer.tsx` — passes `initialPrompt` to `EmptySession`
- `EmptySession.tsx` — passes `initialPrompt` to `ChatInput`
- `ChatInput.tsx` / `useChatInput.ts` — accepts `initialValue` to
pre-fill the textarea

Fixes SECRT-2119

---
Co-authored-by: Toran Bruce Richards (@Torantulino) <toran@agpt.co>
2026-03-13 23:54:54 +07:00
Zamil Majdy
a8259ca935 feat(analytics): read-only SQL views layer with analytics schema (#12367)
### Changes 🏗️

Adds `autogpt_platform/analytics/` — 14 SQL view definitions that expose
production data safely through a locked-down `analytics` schema.

**Security model:**
- Views use `security_invoker = false` (PostgreSQL 15+), so they execute
as their owner (`postgres`), not the caller
- `analytics_readonly` role only has access to `analytics.*` — cannot
touch `platform` or `auth` tables directly

**Files:**
- `backend/generate_views.py` — does everything; auto-reads credentials
from `backend/.env`
- `analytics/queries/*.sql` — 14 documented view definitions (auth, user
activity, executions, onboarding funnel, cohort retention)

---

### Running locally (dev)

```bash
cd autogpt_platform/backend

# First time only — creates analytics schema, role, grants
poetry run analytics-setup

# Create / refresh views (auto-reads backend/.env)
poetry run analytics-views
```

### Running in production (Supabase)

```bash
cd autogpt_platform/backend

# Step 1 — first time only (run in Supabase SQL Editor as postgres superuser)
poetry run analytics-setup --dry-run
# Paste the output into Supabase SQL Editor and run

# Step 2 — apply views (use direct connection host, not pooler)
poetry run analytics-views --db-url "postgresql://postgres:PASSWORD@db.<ref>.supabase.co:5432/postgres"

# Step 3 — set password for analytics_readonly so external tools can connect
# Run in Supabase SQL Editor:
# ALTER ROLE analytics_readonly WITH PASSWORD 'your-password';
```

---

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Setup + views applied cleanly on local Postgres 15
- [x] `analytics_readonly` can `SELECT` from all 14 `analytics.*` views
- [x] `analytics_readonly` gets `permission denied` on `platform.*` and
`auth.*` directly

---------

Co-authored-by: Otto (AGPT) <otto@agpt.co>
2026-03-13 12:04:42 +00:00
Swifty
1f1288d623 feat(copilot): generate personalized quick-action prompts from Tally business understanding (#12374)
During Tally data extraction, the system now also generates personalized
quick-action prompts as part of the existing LLM extraction call
(configurable model, defaults to GPT-4o-mini, `temperature=0.0`). The
prompt asks the LLM for 5 candidates, then the code validates (filters
prompts >20 words) and keeps the top 3. These prompts are stored in the
existing `CoPilotUnderstanding.data` JSON field (at the top level, not
under `business`) and served to the frontend via a new API endpoint. The
copilot chat page uses them instead of hardcoded defaults when
available.

### Changes 🏗️

**Backend – Data models** (`understanding.py`):
- Added `suggested_prompts` field to `BusinessUnderstandingInput`
(optional) and `BusinessUnderstanding` (default empty list)
- Updated `from_db()` to deserialize `suggested_prompts` from top-level
of the data JSON
- Updated `merge_business_understanding_data()` with overwrite strategy
for prompts (full replace, not append)
- `format_understanding_for_prompt()` intentionally does **not** include
`suggested_prompts` — they are UI-only

**Backend – Prompt generation** (`tally.py`):
- Extended `_EXTRACTION_PROMPT` to request 5 suggested prompts alongside
the existing business understanding fields — all extracted in a single
LLM call (`temperature=0.0`)
- Post-extraction validation filters out prompts exceeding 20 words and
slices to the top 3
- Model is now configurable via `tally_extraction_llm_model` setting
(defaults to `openai/gpt-4o-mini`)

**Backend – API endpoint** (`routes.py`):
- Added `GET /api/chat/suggested-prompts` (auth required)
- Returns `{prompts: string[]}` from the user's cached business
understanding (48h Redis TTL)
- Returns empty array if no understanding or no prompts exist

**Frontend** (`EmptySession/`):
- `helpers.ts`: Extracted defaults to `DEFAULT_QUICK_ACTIONS`,
`getQuickActions()` now accepts optional custom prompts and falls back
to defaults
- `EmptySession.tsx`: Calls `useGetV2GetSuggestedPrompts` hook
(`staleTime: Infinity`) and passes results to `getQuickActions()` with
hardcoded fallback
- Fixed `useEffect` resize handler that previously used
`window.innerWidth` as a dependency (re-ran every render); now uses a
proper resize event listener
- Added skeleton loading state while prompts are being fetched

**Generated** (`__generated__/`):
- Regenerated Orval API client with new endpoint types and hooks

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Backend format + lint + pyright pass
  - [x] Frontend format + lint pass
  - [x] All existing tally tests pass (28/28)
  - [x] All chat route tests pass (9/9)
  - [x] All invited_user tests pass (7/7)
- [x] E2E: New user with tally data sees custom prompts on copilot page
  - [x] E2E: User without tally data sees hardcoded default prompts
  - [x] E2E: Clicking a custom prompt sends it as a chat message
2026-03-13 12:11:31 +01:00
Otto
02645732b8 feat(backend/copilot): enable E2B auto_resume and reduce safety-net timeout (#12397)
Enable E2B `auto_resume` lifecycle option and reduce the safety-net
timeout from 3 hours to 5 minutes.

Currently, if the explicit per-turn `pause_sandbox_direct()` call fails
(process crash, network issue, fire-and-forget task cancellation), the
sandbox keeps running for up to **3 hours** before the safety-net
timeout fires. With this change, worst-case billing drops to **5
minutes**.

### Changes
- Add `auto_resume: True` to sandbox lifecycle config — paused sandboxes
wake transparently on SDK activity
- Reduce `e2b_sandbox_timeout` default from 10800s (3h) → 300s (5min)
- Add `e2b_sandbox_auto_resume` config field (default: `True`)
- Guard: `auto_resume` only added when `on_timeout == "pause"`

### What doesn't change
- Explicit per-turn `pause_sandbox_direct()` remains the primary
mechanism
- `connect()` / `_try_reconnect()` flow unchanged
- Redis key management unchanged
- No latency impact (resume is ~1-2s regardless of trigger)

### Risk
Very low — `auto_resume` is additive. If it doesn't work as advertised,
`connect()` still resumes paused sandboxes exactly as before.

Ref: https://e2b.dev/docs/sandbox/auto-resume
Linear: SECRT-2118

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-13 10:29:28 +00:00
Swifty
ba301a3912 feat(platform): add whitelisting-backed beta user provisioning (#12347)
### Changes 🏗️

- add invite-backed beta provisioning with a new `InvitedUser` platform
model, Prisma migration, and first-login activation path that
materializes `User`, `Profile`, `UserOnboarding`, and
`CoPilotUnderstanding`
- replace the legacy beta allowlist check with invite-backed gating for
email/password signup and Tally pre-seeding during activation
- add admin backend APIs and frontend `/admin/users` management UI for
listing, creating, revoking, retrying, and bulk-uploading invited users
- add the design doc for the beta invite system and extend backend
coverage for invite activation, bulk uploads, and auth-route behavior
- configuration changes: introduce the new invite/tally schema objects
and migration; no new env vars or docker service changes are required

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `cd autogpt_platform/backend && poetry run format`
- [x] `cd autogpt_platform/backend && poetry run pytest -q` (run against
an isolated local Postgres database with non-conflicting service port
overrides)

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2026-03-13 10:25:49 +01:00
Abhimanyu Yadav
0cd9c0d87a fix(frontend): show sub-folders when navigating inside a folder (#12316)
## Summary

When opening a folder in the library, sub-folders were not displayed —
only agents were shown. This was caused by two issues:

1. The folder list query always fetched root-level folders (no
`parent_id` filter), so sub-folders were never requested
2. `showFolders` was set to `false` whenever a folder was selected,
hiding all folders from the view

### Changes 🏗️

- Pass `parent_id` to the `useGetV2ListLibraryFolders` hook so it
fetches child folders of the currently selected folder
- Remove the `!selectedFolderId` condition from `showFolders` so folders
render inside other folders
- Fetch the current folder via `useGetV2GetFolder` instead of searching
the (now differently-scoped) folder list
- Clean up breadcrumb: remove emoji icon, match folder name text size to
"My Library", replace `Button` with plain `<button>` to remove extra
padding/gap

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Open a folder in the library and verify sub-folders are displayed
  - [x] Verify agents inside the folder still display correctly
- [x] Verify breadcrumb shows folder name without emoji, matching "My
Library" text size
- [x] Verify clicking "My Library" in breadcrumb navigates back to root
  - [x] Verify root-level view still shows all top-level folders
  - [x] Verify favorites tab does not show folders
2026-03-13 04:40:09 +00:00
Zamil Majdy
a083493aa2 fix(backend/copilot): auto-parse structured data and robust type coercion (#12390)
The copilot's `@@agptfile:` reference system always produces strings
when expanding
file references. This breaks blocks that expect structured types — e.g.
`GoogleSheetsWriteBlock` expects `values: list[list[str]]`, but receives
a raw CSV
string instead. Additionally, the copilot's input coercion was
duplicating logic from
the executor instead of reusing the shared `convert()` utility, and the
coercion had
no type-aware gating — it would always call `convert()`, which could
incorrectly
transform values that already matched the expected type (e.g.
stringifying a valid
`list[str]` in a `str | list[str]` union).

### Changes 🏗️

**Structured data parsing for `@@agptfile:` bare references:**
- When an entire tool argument value is a bare `@@agptfile:` reference,
the resolved
content is now auto-parsed: JSON → native types, CSV/TSV →
`list[list[str]]`
- Embedded references within larger strings still do plain text
substitution
- Updated copilot system prompt to document the structured data
capability

**Shared type coercion utility (`coerce_inputs_to_schema`):**
- Extracted `coerce_inputs_to_schema()` into `backend/util/type.py` —
shared by both
  the executor's `validate_exec()` and the copilot's `execute_block()`
- Uses Pydantic `model_fields` (not `__annotations__`) to include
inherited fields
- Added `_value_satisfies_type()` gate: only calls `convert()` when the
value doesn't
already match the target type, including recursive inner-element
checking for generics

**`_value_satisfies_type` — recursive type checking:**
- Handles `Any`, `Optional`, `Union`, `list[T]`, `dict[K,V]`, `set[T]`,
`tuple[T, ...]`,
  heterogeneous `tuple[str, int, bool]`, bare generics, nested generics
- Guards against non-runtime origins (`Literal`, etc.) to prevent
`isinstance()` crashes
- Returns `False` (not `True`) for unhandled generic origins as a safe
fallback

**Test coverage:**
- 51 new tests for `_value_satisfies_type` and `coerce_inputs_to_schema`
in `type_test.py`
- 8 new tests for `execute_block` type coercion in `helpers_test.py`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All existing file_ref tests pass
- [x] All new type_test.py tests pass (51 tests covering
_value_satisfies_type and coerce_inputs_to_schema)
- [x] All new helpers_test.py tests pass (8 tests covering execute_block
coercion)
  - [x] `poetry run format` passes clean
  - [x] `poetry run lint` passes clean
  - [x] Pyright type checking passes
2026-03-12 19:27:41 +00:00
Zamil Majdy
c51dc7ad99 fix(backend): agent generator sets invalid model on PerplexityBlocks (#12391)
Fixes the agent generator setting `gpt-5.2-2025-12-11` (or `gpt-4o`) as
the model for PerplexityBlocks instead of valid Perplexity models,
causing 100% failure rate for agents using Perplexity blocks.

### Changes 🏗️

- **Fixer: block-aware model validation** — `fix_ai_model_parameter()`
now reads the block's `inputSchema` to check for `enum` constraints on
the model field. Blocks with their own model enum (PerplexityBlock,
IdeogramBlock, CodexBlock, etc.) are validated against their own allowed
values with the correct default, instead of the hardcoded generic set
(`gpt-4o`, `claude-opus-4-6`). This also fixes `edit_agent` which runs
through the same fixer pipeline.
- **PerplexityBlock: runtime fallback** — Added a `field_validator` on
the model field that gracefully falls back to `SONAR` instead of
crashing when an invalid model value is encountered at runtime. Also
overrides `validate_data` to sanitize invalid model values *before* JSON
schema validation (which runs in `Block._execute` before Pydantic
instantiation), ensuring the fallback is actually reachable during block
execution.
- **DB migration** — Fixes existing PerplexityBlock nodes with invalid
model values in both `AgentNode.constantInput` and
`AgentNodeExecutionInputOutput` (preset overrides), matching the pattern
from the Gemini migration.
- **Tests** — Fixer tests for block-specific enum validation, plus
`validate_data`-level tests ensuring invalid models are sanitized before
JSON schema validation rejects them.

Resolves [SECRT-2097](https://linear.app/autogpt/issue/SECRT-2097)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All existing + new fixer tests pass
  - [x] PerplexityBlock block test passes
- [x] 11 perplexity_test.py tests pass (field_validator + validate_data
paths)
- [x] Verified invalid model (`gpt-5.2-2025-12-11`) falls back to
`perplexity/sonar` at runtime
  - [x] Verified valid Perplexity models are preserved by the fixer
  - [x] Migration covers both constantInput and preset overrides
2026-03-12 18:54:18 +00:00
Krzysztof Czerwinski
bc6b82218a feat(platform): add autopilot notification system (#12364)
Adds a notification system for the Copilot (AutoPilot) so users know
when background chats finish processing — via in-app indicators, sounds,
browser notifications, and document title badges.

### Changes 🏗️

**Backend**
- Add `is_processing` field to `SessionSummaryResponse` — batch-checks
Redis for active stream status on each session in the list endpoint
- Fix `is_processing` always returning `false` due to bytes vs string
comparison (`b"running"` → `"running"`) with `decode_responses=True`
Redis client
- Add `CopilotCompletionPayload` model for WebSocket notification events
- Publish `copilot_completion` notification via WebSocket when a session
completes in `stream_registry.mark_session_completed`

**Frontend — Notification UI**
- Add `NotificationBanner` component — amber banner prompting users to
enable browser notifications (auto-hides when already enabled or
dismissed)
- Add `NotificationDialog` component — modal dialog for enabling
notifications, supports force-open from sidebar menu for testing
- Fix repeated word "response" in dialog copy

**Frontend — Sidebar**
- Add bell icon in sidebar header with popover menu containing:
- Notifications toggle (requests browser permission on enable; shows
toast if denied)
  - Sound toggle (disabled when notifications are off)
  - "Show notification popup" button (for testing the dialog)
  - "Clear local data" button (resets all copilot localStorage keys)
- Bell icon states: `BellSlash` (disabled), `Bell` (enabled, no sound),
`BellRinging` (enabled + sound)
- Add processing indicator (PulseLoader) and completion checkmark
(CheckCircle) inline with chat title, to the left of the hamburger menu
- Processing indicator hides immediately when completion arrives (no
overlap with checkmark)
- Fix PulseLoader initial flash — start at `scale(0); opacity: 0` with
smoother keyframes
- Add 10s polling (`refetchInterval`) to session list so `is_processing`
updates automatically
- Clear document title badge when navigating to a completed chat
- Remove duplicate "Your chats" heading that appeared in both
SidebarHeader and SidebarContent

**Frontend — Notification Hook (`useCopilotNotifications`)**
- Listen for `copilot_completion` WebSocket events
- Track completed sessions in Zustand store
- Play notification sound (only for background sessions, not active
chat)
- Update `document.title` with unread count badge
- Send browser `Notification` when tab is hidden, with click-to-navigate
to the completed chat
- Reset document title on tab focus

**Frontend — Store & Storage**
- Add `completedSessionIDs`, `isNotificationsEnabled`, `isSoundEnabled`,
`showNotificationDialog`, `clearCopilotLocalData` to Zustand store
- Persist notification and sound preferences in localStorage
- On init, validate `isNotificationsEnabled` against actual
`Notification.permission`
- Add localStorage keys: `COPILOT_NOTIFICATIONS_ENABLED`,
`COPILOT_SOUND_ENABLED`, `COPILOT_NOTIFICATION_BANNER_DISMISSED`,
`COPILOT_NOTIFICATION_DIALOG_DISMISSED`

**Mobile**
- Add processing/completion indicators and sound toggle to MobileDrawer

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Open copilot, start a chat, switch to another chat — verify
processing indicator appears on the background chat
- [x] Wait for background chat to complete — verify checkmark appears,
processing indicator disappears
- [x] Enable notifications via bell menu — verify browser permission
prompt appears
- [x] With notifications enabled, complete a background chat while on
another tab — verify system notification appears with sound
- [x] Click system notification — verify it navigates to the completed
chat
- [x] Verify document title shows unread count and resets when
navigating to the chat or focusing the tab
  - [x] Toggle sound off — verify no sound plays on completion
- [x] Toggle notifications off — verify no sound, no system
notification, no badge
  - [x] Clear local data — verify all preferences reset
- [x] Verify notification banner hides when notifications already
enabled
- [x] Verify dialog auto-shows for first-time users and can be
force-opened from menu

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 14:03:24 +00:00
Otto
83e49f71cd fix(frontend): pass through Supabase error params in password reset callback (#12384)
When Supabase rejects a password reset token (expired, already used,
etc.), it redirects to the callback URL with `error`, `error_code`, and
`error_description` params instead of a `code`. Previously, the callback
only checked for `!code` and returned a generic "Missing verification
code" error, swallowing the actual Supabase error.

This meant the `ExpiredLinkMessage` UX (added in SECRT-1369 / #12123)
was never triggered for these cases — users just saw the email input
form again with no explanation.

Now the callback reads Supabase's error params and forwards them to
`/reset-password`, where the existing expired link detection picks them
up correctly.

**Note:** This doesn't fix the root cause of Pwuts's token expiry issue
(likely link preview/prefetch consuming the OTP), but it ensures users
see the proper "link expired" message with a "Request new link" button
instead of a confusing silent redirect.

---
Co-authored-by: Reinier van der Leer (@Pwuts) <pwuts@agpt.co>
2026-03-12 13:51:15 +00:00
Bently
ef446e4fe9 feat(llm): Add Cohere Command A Family Models (#12339)
## Summary
Adds the Cohere Command A family of models to AutoGPT Platform with
proper pricing configuration.

## Models Added
- **Command A 03.2025**: Flagship model (256k context, 8k output) - 3
credits
- **Command A Translate 08.2025**: State-of-the-art translation (8k
context, 8k output) - 3 credits
- **Command A Reasoning 08.2025**: First reasoning model (256k context,
32k output) - 6 credits
- **Command A Vision 07.2025**: First vision-capable model (128k
context, 8k output) - 3 credits

## Changes
- Added 4 new LlmModel enum entries with proper OpenRouter model IDs
- Added ModelMetadata for each model with correct context windows,
output limits, and price tiers
- Added pricing configuration in block_cost_config.py

## Testing
- [ ] Models appear in AutoGPT Platform model selector
- [ ] Pricing is correctly applied when using models

Resolves **SECRT-2083**
2026-03-12 11:56:30 +00:00
Bently
7b1e8ed786 feat(llm): Add Microsoft Phi-4 model support (#12342)
## Changes
- Added `MICROSOFT_PHI_4` to LlmModel enum (`microsoft/phi-4`)
- Configured model metadata:
  - 16K context window
  - 16K max output tokens
  - OpenRouter provider
- Set cost tier: 1
  - Input: $0.06 per 1M tokens
  - Output: $0.14 per 1M tokens

## Details
Microsoft Phi-4 is a 14B parameter model available through OpenRouter.
This PR adds proper support in the autogpt_platform backend.

Resolves SECRT-2086
2026-03-12 11:15:27 +00:00
Abhimanyu Yadav
7ccfff1040 feat(frontend): add credential type selector for multi-auth providers (#12378)
### Changes

- When a provider supports multiple credential types (e.g. GitHub with
both OAuth and API Key),
clicking "Add credential" now opens a tabbed dialog where users can
choose which type to use.
  Previously, OAuth always took priority and API key was unreachable.
- Each credential in the list now shows a type-specific icon (provider
icon for OAuth, key for API Key,
password/lock for others) and a small label badge (e.g. "API Key",
"OAuth").
- The native dropdown options also include the credential type in
parentheses for clarity.
- Single credential type providers behave exactly as before — no dialog,
direct action.


https://github.com/user-attachments/assets/79f3a097-ea97-426b-a2d9-781d7dcdb8a4



  ## Test plan
- [x] Test with a provider that has only one credential type (e.g.
OpenAI with api_key only) — should
  behave as before
- [x] Test with a provider that has multiple types (e.g. GitHub with
OAuth + API Key configured) —
  should show tabbed dialog
  - [x] Verify OAuth tab triggers the OAuth flow correctly
  - [x] Verify API Key tab shows the inline form and creates credentials
  - [x] Verify credential list shows correct icons and type badges
  - [x] Verify dropdown options show type in parentheses
2026-03-12 10:17:58 +00:00
Otto
81c7685a82 fix(frontend): release test fixes — scheduler time picker, unpublished banner (#12376)
Two frontend fixes from release testing (2026-03-11):

**SECRT-2102:** The schedule dialog shows an "At [hh]:[mm]" time picker
when selecting Custom > Every x Minutes or Hours, which makes no sense
for sub-day intervals. Now only shows the time picker for Custom > Days
and other frequency types.

**SECRT-2103:** The "Unpublished changes" banner shows for agents the
user doesn't own or create. Root cause: `owner_user_id` is the library
copy owner, not the graph creator. Changed to use `can_access_graph`
which correctly reflects write access.

---
Co-authored-by: Reinier van der Leer (@Pwuts) <pwuts@agpt.co>

---------

Co-authored-by: Reinier van der Leer (@Pwuts) <reinier@agpt.co>
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-03-12 10:02:26 +00:00
Bently
3595c6e769 feat(llm): add Perplexity Sonar Reasoning Pro model (#12341)
## Summary
Adds support for Perplexity's new reasoning model:
`perplexity/sonar-reasoning-pro`

## Changes
-  Added `PERPLEXITY_SONAR_REASONING_PRO` to `LlmModel` enum
-  Added model metadata (128K context window, 8K max output tokens,
tier 2)
-  Set pricing at 5 credits (matches sonar-pro tier)

## Model Details
- **Model ID:** `perplexity/sonar-reasoning-pro`
- **Provider:** OpenRouter
- **Context Window:** 128,000 tokens
- **Max Output:** 8,000 tokens
- **Pricing:** $0.000002/token (prompt), $0.000008/token (completion)
- **Cost Tier:** 2 (5 credits)

## Testing
-  Black formatting passed
-  Ruff linting passed

Resolves SECRT-2084
2026-03-12 09:58:29 +00:00
Abhimanyu Yadav
1c2953d61b fix(frontend): restore broken tutorial in builder (#12377)
### Changes
- Restored missing `shepherd.js/dist/css/shepherd.css` base styles
import
- Added missing .new-builder-tutorial-disable and
.new-builder-tutorial-highlight CSS classes to
  tutorial.css
- Fixed getFormContainerSelector() to include -node suffix matching the
actual DOM attribute

###  What broke
The old legacy-builder/tutorial.ts was the only file importing
Shepherd's base CSS. When #12082 removed
the legacy builder, the new tutorial lost all base Shepherd styles
(close button positioning, modal
overlay, tooltip layout). The new tutorial's custom CSS overrides
depended on these base styles
  existing.

  Test plan
  - [x] Start the tutorial from the builder (click the chalkboard icon)
- [x] Verify the close (X) button is positioned correctly in the
top-right of the popover
  - [x] Verify the modal overlay dims the background properly
- [x] Verify element highlighting works when the tutorial points to
blocks/buttons
- [x] Verify non-target blocks are grayed out during the "select
calculator" step
- [x] Complete the full tutorial flow end-to-end (add block → configure
→ connect → save → run)
2026-03-12 09:23:34 +00:00
Zamil Majdy
755bc84b1a fix(copilot): replace MCP jargon with user-friendly language (#12381)
Closes SECRT-2105

### Changes 🏗️

Replace all user-facing MCP technical terminology with plain, friendly
language across the CoPilot UI and LLM prompting.

**Backend (`run_mcp_tool.py`)**
- Added `_service_name()` helper that extracts a readable name from an
MCP host (`mcp.sentry.dev` → `Sentry`)
- `agent_name` in `SetupRequirementsResponse`: `"MCP: mcp.sentry.dev"` →
`"Sentry"`
- Auth message: `"The MCP server at X requires authentication. Please
connect your credentials to continue."` → `"To continue, sign in to
Sentry and approve access."`

**Backend (`mcp_tool_guide.md`)**
- Added "Communication style" section with before/after examples to
teach the LLM to avoid "MCP server", "OAuth", "credentials" jargon in
responses to users

**Frontend (`MCPSetupCard.tsx`)**
- Button: `"Connect to mcp.sentry.dev"` → `"Connect Sentry"`
- Connected state: `"Connected to mcp.sentry.dev!"` → `"Connected to
Sentry!"`
- Retry message: `"I've connected the MCP server credentials. Please
retry."` → `"I've connected. Please retry."`

**Frontend (`helpers.tsx`)**
- Added `serviceNameFromHost()` helper (exported, mirrors the backend
logic)
- Run text: `"Discovering MCP tools on mcp.sentry.dev"` → `"Connecting
to Sentry…"`
- Run text: `"Connecting to MCP server"` → `"Connecting…"`
- Run text: `"Connect to MCP: mcp.sentry.dev"` → `"Connect Sentry"`
(uses `agent_name` which is now just `"Sentry"`)
- Run text: `"Discovered N tool(s) on mcp.sentry.dev"` → `"Connected to
Sentry"`
- Error text: `"MCP error"` → `"Connection error"`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [ ] Open CoPilot and ask it to connect to a service (e.g. Sentry,
Notion)
- [ ] Verify the run text accordion title shows `"Connecting to
Sentry…"` instead of `"Discovering MCP tools on mcp.sentry.dev"`
- [ ] Verify the auth card button shows `"Connect Sentry"` instead of
`"Connect to mcp.sentry.dev"`
- [ ] Verify the connected state shows `"Connected to Sentry!"` instead
of `"Connected to mcp.sentry.dev!"`
- [ ] Verify the LLM response text avoids "MCP server", "OAuth",
"credentials" terminology
2026-03-12 08:54:15 +00:00
Bently
ade2baa58f feat(llm): Add Grok 3 model support (#12343)
## Summary
Adds support for xAI's Grok 3 model to AutoGPT.

## Changes
- Added `GROK_3` to `LlmModel` enum with identifier `x-ai/grok-3`
- Configured model metadata:
  - Context window: 131,072 tokens (128k)
  - Max output: 32,768 tokens (32k)  
  - Provider: OpenRouter
  - Creator: xAI
  - Price tier: 2 (mid-tier)
- Set model cost to 3 credits (mid-tier pricing between fast models and
Grok 4)
- Updated block documentation to include Grok 3 in model lists

## Pricing Rationale
- **Grok 4**: 9 credits (tier 3 - premium, 256k context)
- **Grok 3**: 3 credits (tier 2 - mid-tier, 128k context) ← NEW
- **Grok 4 Fast/4.1 Fast/Code Fast**: 1 credit (tier 1 - affordable)

Grok 3 is positioned as a mid-tier model, priced similarly to other tier
2 models.

## Testing
- [x] Code passes `black` formatting
- [x] Code passes `ruff` linting
- [x] Model metadata and cost configuration added
- [x] Documentation updated

Closes SECRT-2079
2026-03-12 07:31:59 +00:00
Reinier van der Leer
4d35534a89 Merge branch 'master' into dev 2026-03-11 22:26:35 +01:00
Zamil Majdy
2cc748f34c chore(frontend): remove accidentally committed generated file (#12373)
`responseType.ts` was accidentally committed inside
`src/app/api/__generated__/models/` despite that directory being listed
in `.gitignore` (added in PR #12238).

### Changes 🏗️

- Removes
`autogpt_platform/frontend/src/app/api/__generated__/models/responseType.ts`
from git tracking — the file is already covered by the `.gitignore` rule
`src/app/api/__generated__/` and should never have been committed.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] No functional changes — only removes a stale tracked file that is
already gitignored
2026-03-11 14:22:37 +00:00
Shunyu Wu
c2e79fa5e1 fix(gmail): fallback to raw HTML when html2text conversion fails (#12369)
## Summary
- keep Gmail body extraction resilient when `html2text` converter raises
- fallback to raw HTML instead of failing extraction
- add regression test for converter failure path

Closes #12368

## Testing
- added unit test in
`autogpt_platform/backend/test/blocks/test_gmail.py`

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-11 11:46:57 +00:00
Bently
89a5b3178a fix(llm): Update Gemini model lineup - add 3.1 models, deprecate 3 Pro Preview (#12331)
## 🔴 URGENT: Gemini 3 Pro Preview Shutdown - March 9, 2026

Google is shutting down Gemini 3 Pro Preview **tomorrow (March 9,
2026)**. This PR addresses SECRT-2067 by updating the Gemini model
lineup to prevent disruption.

---

## Changes

###  P0 - Critical (This Week)
- [x] **Remove/Replace Gemini 3 Pro Preview** → Migrated to 3.1 Pro
Preview
- [x] **Add Gemini 3.1 Pro Preview** (released Feb 19, 2026)

###  P1 - High Priority  
- [x] **Add Gemini 3.1 Flash Lite Preview** (released Mar 3, 2026)
- [x] **Add Gemini 3 Flash Preview** (released Dec 17, 2025)

###  P2 - Medium Priority
- [x] **Add Gemini 2.5 Pro (stable/GA)** (released Jun 17, 2025)

---

## Model Details

| Model | Context | Input Cost | Output Cost | Price Tier |
|-------|---------|------------|-------------|------------|
| **Gemini 3.1 Pro Preview** | 1.05M | $2.00/1M | $12.00/1M | 2 |
| **Gemini 3.1 Flash Lite Preview** | 1.05M | $0.25/1M | $1.50/1M | 1 |
| **Gemini 3 Flash Preview** | 1.05M | $0.50/1M | $3.00/1M | 1 |
| **Gemini 2.5 Pro (GA)** | 1.05M | $1.25/1M | $10.00/1M | 2 |
| ~~Gemini 3 Pro Preview~~ | ~~1.05M~~ | ~~$2.00/1M~~ | ~~$12.00/1M~~ |
**DEPRECATED** |

---

## Migration Strategy

**Database Migration:**
`20260308095500_migrate_deprecated_gemini_3_pro_preview`

- Automatically migrates all existing graphs using
`google/gemini-3-pro-preview` to `google/gemini-3.1-pro-preview`
- Updates: AgentBlock, AgentGraphExecution, AgentNodeExecution,
AgentGraph
- Zero user-facing disruption
- Migration runs on next deployment (before March 9 shutdown)

---

## Testing

- [ ] Verify new models appear in LLM block dropdown
- [ ] Test migration on staging database
- [ ] Confirm existing graphs using deprecated model auto-migrate
- [ ] Validate cost calculations for new models

---

## References

- **Linear Issue:**
[SECRT-2067](https://linear.app/autogpt/issue/SECRT-2067)
- **OpenRouter Models:** https://openrouter.ai/models/google
- **Google Deprecation Notice:**
https://ai.google.dev/gemini-api/docs/deprecations

---

## Checklist

- [x] Models added to `LlmModel` enum
- [x] Model metadata configured
- [x] Cost config updated
- [x] Database migration created
- [x] Deprecated model commented out (not removed for historical
reference)
- [ ] PR reviewed and approved
- [ ] Merged before March 9, 2026 deadline

---

**Priority:** 🔴 Critical - Must merge before March 9, 2026
2026-03-11 11:21:16 +00:00
Abhimanyu Yadav
c62d9a24ff fix(frontend): show correct status in agent submission view modal (#12360)
### Changes 🏗️
- The "View" modal for agent submissions hardcoded "Agent is awaiting
review" regardless of actual status
- Now displays "Agent approved", "Agent rejected", or "Agent is awaiting
review" based on the submission's actual status
- Shows review feedback in a highlighted section for rejected agents
when review comments are available

<img width="1127" height="788" alt="Screenshot 2026-03-11 at 9 02 29 AM"
src="https://github.com/user-attachments/assets/840e0fb1-22c2-4fda-891b-967c8b8b4043"
/>
<img width="1105" height="680" alt="Screenshot 2026-03-11 at 9 02 46 AM"
src="https://github.com/user-attachments/assets/f0c407e6-c58e-4ec8-9988-9f5c69bfa9a7"
/>

  ## Test plan
- [x] Submit an agent and verify the view modal shows "Agent is awaiting
review"
- [x] View an approved agent submission and verify it shows "Agent
approved"
- [x] View a rejected agent submission and verify it shows "Agent
rejected"
- [x] View a rejected agent with review comments and verify the feedback
section appears

  Closes SECRT-2092
2026-03-11 10:08:17 +00:00
Abhimanyu Yadav
0e0bfaac29 fix(frontend): show specific error messages for store image upload failures (#12361)
### Changes
- Surface backend error details (file size limit, invalid file type,
virus detected, etc.) in the upload failed toast instead of showing a
generic "Upload Failed" message
- The backend already returns specific error messages (e.g., "File too
large. Maximum size is 50MB") but the frontend was discarding them with
a catch-all handler
  
<img width="1222" height="411" alt="Screenshot 2026-03-11 at 9 13 30 AM"
src="https://github.com/user-attachments/assets/34ab3d90-fffa-4788-917a-fe2a7f4144b9"
/>

  ## Test plan
- [x] Upload an image larger than 50MB to a store submission → should
see "File too large. Maximum size is 50MB"
- [x] Upload an unsupported file type → should see file type error
message
  - [x] Upload a valid image → should still work normally

  Resolves SECRT-2093
2026-03-11 10:07:37 +00:00
Bently
0633475915 fix(frontend/library): graceful schedule deletion with auto-selection (#12278)
### Motivation 🎯

Fixes the issue where deleting a schedule shows an error screen instead
of gracefully handling the deletion. Previously, when a user deleted a
schedule, a race condition occurred where the query cache refetch
completed before the URL
state updated, causing the component to try rendering a schedule that no
longer existed (resulting in a 404 error screen).

### Changes 🏗️

**1. Fixed deletion order to prevent error screen flash**
- `useSelectedScheduleActions.ts` - Call `onDeleted()` callback
**before** invalidating queries to clear selection first
- `ScheduleActionsDropdown.tsx` - Same fix for sidebar dropdown deletion

**2. Added smart auto-selection logic**
- `useNewAgentLibraryView.ts`:
  - Added query to fetch current schedules list
  - Added `handleScheduleDeleted(deletedScheduleId)` function that:
    - Auto-selects the first remaining schedule if others exist
    - Clears selection to show empty state if no schedules remain

**3. Wired up smart deletion handler throughout component tree**
- `NewAgentLibraryView.tsx` - Passes `handleScheduleDeleted` to child
components
- `SelectedScheduleView.tsx` - Changed callback from
`onClearSelectedRun` to `onScheduleDeleted` and passes schedule ID
- `SidebarRunsList.tsx` - Added `onScheduleDeleted` prop and passes it
through to list items

### Checklist 📋

**Test Plan:**
- [] Create 2-3 test schedules for an agent
- [] Delete a schedule from the detail view (trash icon in actions) when
other schedules exist → Verify next schedule auto-selects without error
- [] Delete a schedule from the sidebar dropdown (three-dot menu) when
other schedules exist → Verify next schedule auto-selects without error
- [] Delete all schedules until only one remains → Verify empty state
shows gracefully without error
- [] Verify "Schedule deleted" toast appears on successful deletion
- [] Verify no error screen appears at any point during deletion flow
2026-03-11 09:01:55 +00:00
Bently
34a2f9a0a2 feat(llm): add Mistral flagship models (Large 3, Medium 3.1, Small 3.2, Codestral) (#12337)
## Summary

Adds four missing Mistral AI flagship models to address the critical
coverage gap identified in
[SECRT-2082](https://linear.app/autogpt/issue/SECRT-2082).

## Models Added

| Model | Context | Max Output | Price Tier | Use Case |
|-------|---------|------------|------------|----------|
| **Mistral Large 3** | 262K | None | 2 (Medium) | Flagship reasoning
model, 41B active params (675B total), MoE architecture |
| **Mistral Medium 3.1** | 131K | None | 2 (Medium) | Balanced
performance/cost, 8x cheaper than traditional large models |
| **Mistral Small 3.2** | 131K | 131K | 1 (Low) | Fast, cost-efficient,
high-volume use cases |
| **Codestral 2508** | 256K | None | 1 (Low) | Code generation
specialist (FIM, correction, test gen) |

## Problem

Previously, the platform only offered:
- Mistral Nemo (1 official model)
- dolphin-mistral (third-party Ollama fine-tune)

This left significant gaps in Mistral's lineup, particularly:
- No flagship reasoning model
- No balanced mid-tier option
- No code-specialized model
- Missing multimodal capabilities (Large 3, Medium 3.1, Small 3.2 all
support text+image)

## Changes

**File:** `autogpt_platform/backend/backend/blocks/llm.py`

- Added 4 enum entries in `LlmModel` class
- Added 4 metadata entries in `MODEL_METADATA` dict
- All models use OpenRouter provider
- Follows existing pattern for model additions

## Testing

-  Enum values match OpenRouter model IDs
-  Metadata follows existing format
-  Context windows verified from OpenRouter API
-  Price tiers assigned appropriately

## Closes

- SECRT-2082

---

**Note:** All models are available via OpenRouter and tested. This
brings Mistral coverage in line with other major providers (OpenAI,
Anthropic, Google).
2026-03-11 08:48:48 +00:00
Zamil Majdy
9f4caa7dfc feat(blocks): add and harden GitHub blocks for full-cycle development (#12334)
## Summary
- Add 8 new GitHub blocks: GetRepositoryInfo, ForkRepository,
ListCommits, SearchCode, CompareBranches, GetRepositoryTree,
MultiFileCommit, MergePullRequest
- Split `repo.py` (2094 lines, 19 blocks) into domain-specific modules:
`repo.py`, `repo_branches.py`, `repo_files.py`, `commits.py`
- Concurrent blob creation via `asyncio.gather()` in MultiFileCommit
- URL-encode branch/ref params via `urllib.parse.quote()` for
defense-in-depth
- Step-level error handling in MultiFileCommit ref update with recovery
SHA
- Collapse FileOperation CREATE/UPDATE into UPSERT (Git Trees API treats
them identically)
- Add `ge=1, le=100` constraints on per_page SchemaFields
- Preserve URL scheme in `prepare_pr_api_url`
- Handle null commit authors gracefully in ListCommits
- Add unit tests for `prepare_pr_api_url`, error-path tests for
MergePR/MultiFileCommit, FileOperation enum validation tests

## Test plan
- [ ] Block tests pass for all 19 GitHub blocks (CI:
`test_available_blocks`)
- [ ] New test file `test_github_blocks.py` passes (prepare_pr_api_url,
error paths, enum)
- [ ] `check-docs-sync` passes with regenerated docs
- [ ] pyright/ruff clean on all changed files
2026-03-11 08:35:37 +00:00
Otto
0876d22e22 feat(frontend/copilot): improve TTS voice selection to avoid robotic voices (#12317)
Requested by @0ubbe

Refines the `pickBestVoice()` function to ensure non-robotic voices are
always preferred:

- **Filter out known low-quality engines** — eSpeak, Festival, MBROLA,
Flite, and Pico voices are deprioritized
- **Prefer remote/cloud-backed voices** — `localService: false` voices
are typically higher quality
- **Expand preferred voices list** — added Moira, Tessa (macOS), Jenny,
Aria, Guy (Windows OneCore)
- **Smarter fallback chain** — English default → English → any default →
first available

The previous fallback could select eSpeak or Festival voices on Linux
systems, resulting in robotic output. Now those are filtered out unless
they're the only option.

---
Co-authored-by: Ubbe <ubbe@users.noreply.github.com>

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:47:42 +08:00
Zamil Majdy
15e3980d65 fix(frontend): buffer workspace file downloads to prevent truncation (#12349)
## Summary
- Workspace file downloads (images, CSVs, etc.) were silently truncated
(~10 KB lost from the end) when served through the Next.js proxy
- Root cause: `new NextResponse(response.body)` passes a
`ReadableStream` directly, which Next.js / Vercel silently truncates for
larger files
- Fix: fully buffer with `response.arrayBuffer()` before forwarding, and
set `Content-Length` from the actual buffer size
- Keeps the auth proxy intact — no signed URLs (which would be public
and expire, breaking chat history)

## Root cause verification
Confirmed locally on session `080f27f9-0379-4085-a67a-ee34cc40cd62`:
- Backend `write_workspace_file` logs **978,831 bytes** written
- Direct backend download (`curl
localhost:8006/api/workspace/files/.../download`): **978,831 bytes** 
- Browser download through Next.js proxy: **truncated** 

## Why not signed URLs?
- Signed URLs are effectively public — anyone with the link can download
the file (privacy concern)
- Signed URLs expire, but chat history persists — reopening a
conversation later would show broken downloads
- Buffering is fine: workspace files are capped at 100 MB, Vercel
function memory is 1 GB

## Related
- Discord thread: `#Truncated File Bug` channel
- Related PR #12319 (signed URL approach) — this fix is simpler and
preserves auth

## Test plan
- [ ] Download a workspace file (CSV, PNG, any type) through the copilot
UI
- [ ] Verify downloaded file size matches the original
- [ ] Verify PNGs open correctly and CSVs have all rows

cc @Swiftyos @uberdot @AdarshRawat1
2026-03-10 18:23:51 +00:00
Zamil Majdy
fe9eb2564b feat(copilot): HITL review for sensitive block execution (#12356)
## Summary
- Integrates existing Human-In-The-Loop (HITL) review infrastructure
into CoPilot's direct block execution (`run_block`) for blocks marked
with `is_sensitive_action=True`
- Removes the `PendingHumanReview → AgentGraphExecution` FK constraint
to support synthetic CoPilot session IDs (migration included)
- Adds `ReviewRequiredResponse` model + frontend `ReviewRequiredCard`
component to surface review status in the chat UI
- Auto-approval works within a CoPilot session: once a block is
approved, subsequent executions of the same block in the same session
are auto-approved (using `copilot-session-{session_id}` as
`graph_exec_id` and `copilot-node-{block_id}` as `node_id`)

## Test plan
- [x] All 11 `run_block_test.py` tests pass (3 new sensitive action
tests)
- [ ] Manual: Execute a block with `is_sensitive_action=True` in CoPilot
→ verify ReviewRequiredResponse is returned and rendered
- [ ] Manual: Approve in review panel → re-execute the same block →
verify auto-approval kicks in
- [ ] Manual: Verify non-sensitive blocks still execute without review
2026-03-10 18:20:11 +00:00
Otto
5641cdd3ca fix(backend): update test patches for validate_url → validate_url_host rename (#12358)
bfb843a renamed `validate_url` to `validate_url_host` in
`agent_browser`, `run_mcp_tool`, and MCP routes, but the corresponding
test files still patched the old name, causing `AttributeError` in CI.

Updates all mock patch targets and assertions across 3 test files:
- `agent_browser_test.py`
- `test_run_mcp_tool.py`  
- `mcp/test_routes.py`

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
Co-authored-by: Reinier van der Leer (@Pwuts) <pwuts@agpt.co>
2026-03-10 17:22:11 +00:00
Otto
bfb843a56e Merge commit from fork
* Fix SSRF via user-controlled ollama_host field

Validate ollama_host against BLOCKED_IP_NETWORKS before passing to
ollama.AsyncClient(). The server-configured default (env: OLLAMA_HOST)
is allowed without validation; user-supplied values that differ are
checked for private/internal IP resolution.

Fixes GHSA-6jx2-4h7q-3fx3

* Generalize validate_ollama_host to validate_host; fix description line length

* Rename to validate_untrusted_host with whitelist parameter

* Apply PR suggestion: include whitelist in error message; run formatting

* Move whitelist check after URL normalization; match on netloc

* revert unrelated formatting changes

* Dedup validate_url and validate_untrusted_host; normalize whitelist

* Move _resolve_and_check_blocked after calling functions

* dedup and clean up

* make trusted_hostnames truly optional

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-03-10 15:51:58 +01:00
Abhimanyu Yadav
684845d946 fix(frontend/builder): handle discriminated unions and improve node layout (#12354)
## Summary
- **Discriminated union support (oneOf)**: Added a new `OneOfField`
component that properly
renders Pydantic discriminated unions. Hides the unusable parent object
handle, auto-populates
the discriminator value, shows a dropdown with variant titles (e.g.,
"Username" / "UserId"), and
filters out the internal discriminator field from the form.
Non-discriminated `oneOf` schemas
  fall back to existing `AnyOfField` behavior.
- **Collapsible object outputs**: Object-type outputs with nested keys
(e.g.,
`PersonLookupResponse.Url`, `PersonLookupResponse.profile`) are now
collapsed by default behind a
caret toggle. Nested keys show short names instead of the full
`Parent.Key` prefix.
- **Node layout cleanup**: Removed excessive bottom margin (`mb-6`) from
`FormRenderer`, hide the
Advanced toggle when no advanced fields exist, and add rounded bottom
corners on OUTPUT-type
  blocks.

<img width="440" height="427" alt="Screenshot 2026-03-10 at 11 31 55 AM"
src="https://github.com/user-attachments/assets/06cc5414-4e02-4371-bdeb-1695e7cb2c97"
/>
<img width="371" height="320" alt="Screenshot 2026-03-10 at 11 36 52 AM"
src="https://github.com/user-attachments/assets/1a55f87a-c602-4f4d-b91b-6e49f810e5d5"
/>

  ## Test plan
- [x] Add a Twitter Get User block — verify "Identifier" shows a
dropdown (Username/UserId) with
no unusable parent handle, discriminator field is hidden, and the block
can run without staying
  INCOMPLETE
- [x] Add any block with object outputs (e.g., PersonLookupResponse) —
verify nested keys are
  collapsed by default and expand on click with short labels
- [x] Verify blocks without advanced fields don't show the Advanced
toggle
- [x] Verify existing `anyOf` schemas (optional types, 3+ variant
unions) still render correctly
  - [x] Check OUTPUT-type blocks have rounded bottom corners

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
Co-authored-by: eureka928 <meobius123@gmail.com>
2026-03-10 14:13:32 +00:00
Bently
6a6b23c2e1 fix(frontend): Remove unused Otto Server Action causing 107K+ errors (#12336)
## Summary

Fixes [OPEN-3025](https://linear.app/autogpt/issue/OPEN-3025) —
**107,571+ Server Action errors** in production

Removes the orphaned `askOtto` Server Action that was left behind after
the Otto chat widget removal in PR #12082.

## Problem

Next.js Server Actions that are never imported are excluded from the
server manifest. Old client bundles still reference the action ID,
causing "not found" errors.

**Sentry impact:**
- **BUILDER-3BN:** 107,571 events
- **BUILDER-729:** 285 events  
- **BUILDER-3QH:** 1,611 events
- **36+ users affected**

## Root Cause

1. **Mar 2025:** Otto widget added to `/build` page with `askOtto`
Server Action
2. **Feb 2026:** Otto widget removed (PR #12082), but `actions.ts` left
behind
3. **Result:** Dead code → not in manifest → errors

## Evidence

```bash
# Zero imports across frontend:
grep -r "askOtto" src/ --exclude="actions.ts"
# → No results

# Server manifest missing the action:
cat .next/server/server-reference-manifest.json
# → Only includes login/supabase actions, NOT build/actions
```

## Changes

-  Delete
`autogpt_platform/frontend/src/app/(platform)/build/actions.ts`

## Testing

1. Verify no imports of `askOtto` in codebase 
2. Check Sentry for error drop after deploy
3. Monitor for new "Server Action not found" errors

## Checklist

- [x] Dead code confirmed (zero imports)
- [x] Sentry issues documented
- [x] Clear commit message with context
2026-03-10 09:03:38 +00:00
Dream
d0a1d72e8a fix(frontend/builder): batch undo history for cascading operations (#12344)
## Summary

Fixes undo in the Builder not working correctly when deleting nodes.
When a node is deleted, React Flow fires `onNodesChange` (node removal)
and `onEdgesChange` (cascading edge cleanup) as separate callbacks —
each independently pushing to the undo history stack. This creates
intermediate states that break undo:

- Single undo restores a partial state (e.g. edges pointing to a deleted
node)
- Multiple undos required to fully restore the graph
- Redo also produces inconsistent states

Resolves #10999

### Changes 🏗️

- **`historyStore.ts`** — Added microtask-based batching to
`pushState()`. Multiple calls within the same synchronous execution
(same event loop tick) are coalesced into a single history entry,
keeping only the first pre-change snapshot. Uses `queueMicrotask` so all
cascading store updates from a single user action settle before the
history entry is committed.
- Reset `pendingState` in `initializeHistory()` and `clear()` to prevent
stale batched state from leaking across graph loads or navigation.

**Side benefit:** Copy/paste operations that add multiple nodes and
edges now also produce a single history entry instead of one per
node/edge.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Place 3 blocks A, B, C and connect A→B→C
  - [x] Delete block C (removes node + cascading edge B→C)
  - [x] Delete connection A→B
  - [x] Undo — connection A→B restored (single undo, not multiple)
  - [x] Undo — block C and connection B→C restored
  - [x] Redo — block C removed again with its connections
- [x] Copy/paste multiple connected blocks — single undo reverts entire
paste

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
Co-authored-by: Abhimanyu Yadav <122007096+Abhi1992002@users.noreply.github.com>
2026-03-10 04:55:07 +00:00
Zamil Majdy
f1945d6a2f feat(platform/copilot): @@agptfile: file-ref protocol for tool call inputs + block input toggle (#12332)
## Summary

- **Problem**: When the LLM calls a tool with large file content, it
must rewrite all content token-by-token. This is wasteful since the
files are already accessible on disk.
- **Solution**: Introduces an \`@@agptfile:\` reference protocol. The
LLM passes a file path reference; the processor loads and substitutes
the content before executing the tool.

### Protocol

\`\`\`
@@agptfile:<uri>[<start>-<end>]
\`\`\`

**Supported URI types:**
| URI | Source |
|-----|--------|
| \`workspace://<file_id>\` | Persistent workspace file by ID |
| \`workspace:///<path>\` | Workspace file by virtual path |
| \`/absolute/path\` | Absolute host or sandbox path |

**Line range** is optional; omitting it reads the whole file.

### Backend changes

- Rename \`@file:\` → \`@@agptfile:\` prefix for uniqueness; extract
\`FILE_REF_PREFIX\` constant
- Extract shared execution-context ContextVars into
\`backend/copilot/context.py\` — eliminates duplicate ContextVar objects
that caused \`e2b_file_tools.py\` to always see empty context
- \`tool_adapter.py\` imports ContextVars from \`context.py\` (single
source of truth)
- \`expand_file_refs_in_string\` raises \`FileRefExpansionError\` on
failure (instead of inline error strings), blocking tool execution and
returning a clear error hint to the model
- Tighten URI regex: only expand refs starting with \`workspace://\` or
\`/\`
- Aggregate budget: 1 MB total expansion cap across all refs in one
string
- Per-file cap: 200 KB per individual ref
- Fix \`_read_file_handler\` to pass \`get_sdk_cwd()\` to
\`is_allowed_local_path\` — ephemeral working directory files were
incorrectly blocked
- Fix \`_is_allowed_local\` in \`e2b_file_tools.py\` to pass
\`get_sdk_cwd()\`
- Restrict local path allow-list to \`tool-results/\` subdirectory only
(was entire session project dir)
- Add \`raise_on_error\` param + remove two-pass \`_FILE_REF_ERROR_RE\`
detection
- Update system prompt docs and tool_adapter error messages

### Frontend changes

- \`BlockInputCard\`: hidden by default with Show/Hide toggle + \`mb-2\`
spacing

## Test plan

- [ ] \`poetry run pytest backend/copilot/ -x
--ignore=backend/copilot/sdk/file_ref_integration_test.py\` passes
- [ ] \`@@agptfile:workspace:///<path>[1-50]\` expands correctly in tool
calls
- [ ] Invalid line ranges produce \`[file-ref error: ...]\` inline
messages
- [ ] Files outside \`sdk_cwd\` / \`tool-results/\` are rejected
- [ ] Block input card shows hidden by default with toggle
2026-03-09 18:39:13 +00:00
Zamil Majdy
6491cb1e23 feat(copilot): local agent generation with validation, fixing, MCP & sub-agent support (#12238)
## Summary

Port the agent generation pipeline from the external AgentGenerator
service into local copilot tools, making the Claude Agent SDK itself
handle validation, fixing, and block recommendation — no separate inner
LLM calls needed.

Key capabilities:
- **Local agent generation**: Create, edit, and customize agents
entirely within the SDK session
- **Graph validation**: 9 validation checks (block existence, link
references, type compatibility, IO blocks, etc.)
- **Graph fixing**: 17+ auto-fix methods (ID repair, link rewiring, type
conversion, credential stripping, dynamic block sink names, etc.)
- **MCP tool blocks**: Guide and fixer support for MCPToolBlock nodes
with proper dynamic input schema handling
- **Sub-agent composition**: AgentExecutorBlock support with library
agent schema enrichment
- **Embedding fallback**: Falls back to OpenRouter for embeddings when
`openai_internal_api_key` is unavailable
- **Actionable error messages**: Excluded block types (MCP, Agent)
return specific hints redirecting to the correct tool

### New Tools
- `validate_agent_graph` — run 9 validation checks on agent JSON
- `fix_agent_graph` — apply 17+ auto-fixes to agent JSON
- `get_blocks_for_goal` — recommend blocks for a given goal (with
optimized descriptions)

### Refactored Tools
- `create_agent`, `edit_agent`, `customize_agent` — accept `agent_json`
for local generation with shared fix→validate→save pipeline
- `find_block` — added `include_schemas` parameter, excludes MCP/Agent
blocks with actionable hints
- `run_block` — actionable error messages for excluded block types
- `find_library_agent` — enriched with `graph_version`, `input_schema`,
`output_schema` for sub-agent composition

### Architecture
- Split 2,558-line `validation.py` into `fixer.py`, `validator.py`,
`helpers.py`, `pipeline.py`
- Extracted shared `fix_validate_and_save()` pipeline (was duplicated
across 3 tools)
- Shared `OPENROUTER_BASE_URL` constant across codebase
- Comprehensive test coverage: 78+ unit tests for fixer/validator, 8
run_block tests, 17 SDK compat tests

## Test plan
- [x] `poetry run format` passes
- [x] `poetry run pytest -s -vvv backend/copilot/` — all tests pass
- [x] CI green on all Python versions (3.11, 3.12, 3.13)
- [x] Manual E2E: copilot generates agents with correct IO blocks,
links, and node structure
- [x] Manual E2E: MCP tool blocks use bare field names for dynamic
inputs
- [x] Manual E2E: sub-agent composition with AgentExecutorBlock
2026-03-09 16:10:22 +00:00
nKOxxx
c7124a5240 Add documentation for Google Gemini integration (#12283)
## Summary
Adding comprehensive documentation for Google Gemini integration with
AutoGPT.

## Changes
- Added setup instructions for Gemini API
- Documented configuration options
- Added examples and best practices

## Related Issues
N/A - Documentation improvement

## Testing
- Verified documentation accuracy
- Tested all code examples

## Checklist
- [x] Code follows project style
- [x] Documentation updated
- [x] Tests pass (if applicable)
2026-03-09 15:13:28 +00:00
Zamil Majdy
5537cb2858 dx: add shared Claude Code skills as auto-triggered guidelines (#12297)
## Summary
- Add 8 Claude Code skills under \`.claude/skills/\` that act as
**auto-triggered guidelines** — the LLM invokes them automatically based
on context, no manual \`/command\` needed
- Skills: \`pr-review\`, \`pr-create\`, \`new-block\`,
\`openapi-regen\`, \`backend-check\`, \`frontend-check\`,
\`worktree-setup\`, \`code-style\`
- Each skill has an explicit TRIGGER condition so the LLM knows when to
apply it without being asked

## Changes

### Skills (all auto-triggered by context)
| Skill | Trigger |
|-------|---------|
| \`pr-review\` | User shares a PR URL or asks to address review
comments |
| \`pr-create\` | User asks to create a PR, push changes for review, or
submit work |
| \`new-block\` | User asks to create a new block or add a new
integration |
| \`openapi-regen\` | API routes change, new endpoints added, or
frontend types are stale |
| \`backend-check\` | Backend Python code has been modified |
| \`frontend-check\` | Frontend TypeScript/React code has been modified
|
| \`worktree-setup\` | User asks to work on a branch in isolation or set
up a worktree |
| \`code-style\` | Writing or reviewing Python code |

## Test plan
- [ ] Verify skills appear automatically in Claude Code when context
matches (no \`/command\` needed)
- [ ] Modify frontend code — confirm \`frontend-check\` fires
automatically
- [ ] Ask Claude to "create a PR" — confirm \`pr-create\` fires without
\`/pr-create\`
- [ ] Share a PR URL — confirm \`pr-review\` fires automatically

---------

Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 15:10:38 +00:00
Zamil Majdy
aef5f6d666 feat(copilot): E2B sandbox auto-pause between turns to eliminate idle billing (#12330)
## Summary

### Before
- E2B sandboxes ran continuously between CoPilot turns, billing for idle
time
- Sandbox timeout caused **termination** (kill), losing all session
state
- No explicit cleanup when sessions were deleted — sandboxes leaked
- Single timeout concept with no separation between pause and kill
semantics

### After
- **Per-turn pause**: `pause_sandbox()` is called in the `finally` block
after every CoPilot turn, stopping billing instantly between turns
(paused sandboxes cost \$0 compute)
- **Auto-pause safety net**: Sandboxes are created with
`lifecycle={"on_timeout": "pause"}` (`pause_timeout` = 4h default) so
they auto-pause rather than terminate if the explicit pause is missed
- **Auto-reconnect**: `AsyncSandbox.connect()` in e2b SDK v2
auto-resumes paused sandboxes transparently — no extra code needed
- **Session delete cleanup**: `kill_sandbox()` is now called in
`delete_chat_session()` to explicitly terminate sandboxes and free
resources
- **Two distinct timeouts**: `pause_timeout` (4h, e2b auto-pause) vs
`redis_ttl` (12h, session key lifetime)

### Key Changes

| File | Change |
|------|--------|
| `pyproject.toml` | Bump `e2b-code-interpreter` `1.x` → `2.x` |
| `e2b_sandbox.py` | Add `pause_sandbox()`, `kill_sandbox()`,
`_act_on_sandbox()` helper; `lifecycle={"on_timeout": "pause"}`;
separate `pause_timeout` / `redis_ttl` params |
| `sdk/service.py` | Call `pause_sandbox()` in `finally` block
**before** transcript upload; use walrus operator for type-safe
`e2b_api_key` narrowing |
| `model.py` | Call `kill_sandbox()` in `delete_chat_session()`; inline
import to avoid circular dependency |
| `config.py` | Add `e2b_active` property; rename `e2b_sandbox_timeout`
default to 4h |
| `e2b_sandbox_test.py` | Add `test_pause_then_reconnect_reuses_sandbox`
test; update all `sandbox_timeout` → `pause_timeout` |

### Verified E2E
- Used real `E2B_API_KEY` from k8s dev cluster to manually verify:
sandbox created → paused → `is_running() == False` → reconnected via
`connect()` → state preserved → killed

## Test plan
- [x] `poetry run pytest backend/copilot/tools/e2b_sandbox_test.py` —
all 19 tests pass
- [x] CI: test (3.11, 3.12, 3.13), types — all green
- [x] E2E verified with real E2B credentials
2026-03-09 14:55:10 +00:00
Ubbe
8063391d0a feat(frontend/copilot): pin interactive tool cards outside reasoning collapse (#12346)
## Summary

<img width="400" height="227" alt="Screenshot 2026-03-09 at 22 43 10"
src="https://github.com/user-attachments/assets/0116e260-860d-4466-9763-e02de2766e50"
/>

<img width="600" height="618" alt="Screenshot 2026-03-09 at 22 43 14"
src="https://github.com/user-attachments/assets/beaa6aca-afa8-483f-ac06-439bf162c951"
/>

- When the copilot stream finishes, tool calls that require user
interaction (credentials, inputs, clarification) are now **pinned**
outside the "Show reasoning" collapse instead of being hidden
- Added `isInteractiveToolPart()` helper that checks tool output's
`type` field against a set of interactive response types
- Modified `splitReasoningAndResponse()` to extract interactive tools
from reasoning into the visible response section
- Added styleguide section with 3 demos: `setup_requirements`,
`agent_details`, and `agent_saved` pinning scenarios

### Interactive response types kept visible:
`setup_requirements`, `agent_details`, `block_details`, `need_login`,
`input_validation_error`, `clarification_needed`, `suggested_goal`,
`agent_preview`, `agent_saved`

Error responses remain in reasoning (LLM explains them in final text).

Closes SECRT-2088

## Test plan
- [ ] Verify copilot stream with interactive tool (e.g. run_agent
requiring credentials) keeps the tool card visible after stream ends
- [ ] Verify non-interactive tools (find_block, bash_exec) still
collapse into "Show reasoning"
- [ ] Verify styleguide page at `/copilot/styleguide` renders the new
"Reasoning Collapse: Interactive Tool Pinning" section correctly
- [ ] Verify `pnpm types`, `pnpm lint`, `pnpm format` all pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 23:12:14 +08:00
Otto
0bbb12d688 fix(frontend/copilot): hide New Chat button on Autopilot homepage (#12321)
Requested by @0ubbe

The **New Chat** button was visible on the Autopilot homepage where
clicking it has no effect (since `sessionId` is already `null`). This
hides the button when no chat session is active, so it only appears when
the user is viewing a conversation and wants to start a new one.

**Changes:**
- `ChatSidebar.tsx` — hide button in both collapsed and expanded sidebar
states when `sessionId` is null
- `MobileDrawer.tsx` — same fix for mobile drawer

---
Co-authored-by: Ubbe <ubbe@users.noreply.github.com>
2026-03-09 22:41:11 +08:00
Otto
eadc68f2a5 feat(frontend/copilot): move microphone button to right side of input box (#12320)
Requested by @olivia-1421

Moves the microphone/recording button from the left-side tools group to
the right side, next to the submit button. The left side is now reserved
for the attachment/upload (plus) button only.

**Before:** `[ 📎 🎤 ] .................. [ ➤ ]`
**After:**  `[ 📎 ] .................. [ 🎤 ➤ ]`

---
Co-authored-by: Olivia <olivia-1421@users.noreply.github.com>

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 18:37:02 +08:00
Reinier van der Leer
19d775c435 Merge commit from fork 2026-03-08 10:25:24 +01:00
Reinier van der Leer
eca7b5e793 Merge commit from fork 2026-03-08 10:24:44 +01:00
Otto
c304a4937a fix(backend): Handle manual run attempts for triggered agents (#12298)
When a webhook-triggered agent is executed directly (e.g. via Copilot)
without actual webhook data, `GraphExecution.from_db()` crashes with
`KeyError: 'payload'` because it does a hard key access on
`exec.input_data["payload"]` for webhook blocks.

This caused 232 Sentry events (AUTOGPT-SERVER-821) and multiple
INCOMPLETE graph executions due to retries.

**Changes:**

1. **Defensive fix in `from_db()`** — use `.get("payload")` instead of
`["payload"]` to handle missing keys gracefully (matches existing
pattern for input blocks using `.get("value")`)

2. **Upfront refusal in `_construct_starting_node_execution_input()`** —
refuse execution of webhook/webhook_manual blocks when no payload is
provided. The check is placed after `nodes_input_masks` application, so
legitimate webhook triggers (which inject payload via
`nodes_input_masks`) pass through fine.

Resolves [SENTRY-1113: Copilot is able to manually initiate runs for
triggered agents (which
fails)](https://linear.app/autogpt/issue/SENTRY-1113/copilot-is-able-to-manually-initiate-runs-for-triggered-agents-which)

---
Co-authored-by: Reinier van der Leer (@Pwuts) <pwuts@agpt.co>
2026-03-06 20:47:51 +00:00
Zamil Majdy
7ead4c040f hotfix(backend/copilot): capture tool results in transcript (#12323)
## Summary
- Fixes tool results not being captured in the CoPilot transcript during
SDK-based streaming
- Adds `transcript_builder.add_user_message()` call with `tool_result`
content block when a `StreamToolOutputAvailable` event is received
- Ensures transcript accurately reflects the full conversation including
tool outputs, which is critical for Langfuse tracing and debugging

## Context
After the transcript refactor in #12318, tool call results from the SDK
streaming loop were not being recorded in the transcript. This meant
Langfuse traces were missing tool outputs, making it hard to debug agent
behavior.

## Test plan
- [ ] Verify CoPilot conversation with tool calls captures tool results
in Langfuse traces
- [ ] Verify transcript includes tool_result content blocks after tool
execution
2026-03-06 18:58:48 +00:00
Zamil Majdy
8cfabcf4fd refactor(backend/copilot): centralize prompt building in prompting.py (#12324)
## Summary

Centralizes all prompt building logic into a new
`backend/copilot/prompting.py` module with clear SDK vs baseline and
local vs E2B distinctions.

### Key Changes

**New `prompting.py` module:**
- `get_sdk_supplement(use_e2b, cwd)` - For SDK mode (NO tool docs -
Claude gets schemas automatically)
- `get_baseline_supplement(use_e2b, cwd)` - For baseline mode (WITH
auto-generated tool docs from TOOL_REGISTRY)
- Handles local/E2B storage differences

**SDK mode (`sdk/service.py`):**
- Removed 165+ lines of duplicate constants
- Now imports and uses `get_sdk_supplement()`
- Cleaner, more maintainable

**Baseline mode (`baseline/service.py`):**
- Now appends `get_baseline_supplement()` to system prompt
- Baseline mode finally gets tool documentation!

**Enhanced tool descriptions:**
- `create_agent`: Added feedback loop workflow (suggested_goal,
clarifying_questions)
- `run_mcp_tool`: Added known server URLs, 2-step workflow, auth
handling

**Tests:**
- Updated to verify SDK excludes tool docs, baseline includes them
- All existing tests pass

### Architecture Benefits

 Single source of truth for prompt supplements
 Clear SDK vs baseline distinction (SDK doesn't need tool docs)
 Clear local vs E2B distinction (storage systems)
 Easy to maintain and update
 Eliminates code duplication

## Test plan

- [x] Unit tests pass (TestPromptSupplement class)
- [x] SDK mode excludes tool documentation
- [x] Baseline mode includes tool documentation
- [x] E2B vs local mode differences handled correctly
2026-03-06 18:56:20 +00:00
Zamil Majdy
7bf407b66c Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-03-07 02:01:41 +07:00
Abhimanyu Yadav
0f813f1bf9 feat(copilot): Add folder management tools to CoPilot (#12290)
Adds folder management capabilities to the CoPilot, allowing users to
organize agents into folders directly from the chat interface.

<img width="823" height="356" alt="Screenshot 2026-03-05 at 5 26 30 PM"
src="https://github.com/user-attachments/assets/4c55f926-1e71-488f-9eb6-fca87c4ab01b"
/>
<img width="797" height="150" alt="Screenshot 2026-03-05 at 5 28 40 PM"
src="https://github.com/user-attachments/assets/5c9c6f8b-57ac-4122-b17d-b9f091bb7c4e"
/>
<img width="763" height="196" alt="Screenshot 2026-03-05 at 5 28 36 PM"
src="https://github.com/user-attachments/assets/d1b22b5d-921d-44ac-90e8-a5820bb3146d"
/>
<img width="756" height="199" alt="Screenshot 2026-03-05 at 5 30 17 PM"
src="https://github.com/user-attachments/assets/40a59748-f42e-4521-bae0-cc786918a9b5"
/>

### Changes

**Backend -- 6 new CoPilot tools** (`manage_folders.py`):
- `create_folder` -- Create folders with optional parent, icon, and
color
- `list_folders` -- List folder tree or children of a specific folder,
with optional `include_agents` to show agents inside each folder
- `update_folder` -- Rename or change icon/color
- `move_folder` -- Reparent a folder or move to root
- `delete_folder` -- Soft-delete (agents moved to root, not deleted)
- `move_agents_to_folder` -- Bulk-move agents into a folder or back to
root

**Backend -- DatabaseManager RPC registration**:
- Registered all 7 folder DB functions (`create_folder`, `list_folders`,
`get_folder_tree`, `update_folder`, `move_folder`, `delete_folder`,
`bulk_move_agents_to_folder`) in `DatabaseManager` and
`DatabaseManagerAsyncClient` so they work via RPC in the CoPilotExecutor
process
- `manage_folders.py` uses `db_accessors.library_db()` pattern
(consistent with all other copilot tools) instead of direct Prisma
imports

**Backend -- folder_id threading**:
- `create_agent` and `customize_agent` tools accept optional `folder_id`
to save agents directly into a folder
- `save_agent_to_library` -> `create_graph_in_library` ->
`create_library_agent` pipeline passes `folder_id` through
- `create_library_agent` refactored from `asyncio.gather` to sequential
loop to support conditional `folderId` assignment on the main graph only
(not sub-graphs)

**Backend -- system prompt and models**:
- Added folder tool descriptions and usage guidance to Otto's system
prompt
- Added `FolderAgentSummary` model for lightweight agent info in folder
listings
- Added 6 `ResponseType` enum values and corresponding Pydantic response
models (`FolderInfo`, `FolderTreeInfo`, `FolderCreatedResponse`, etc.)

**Frontend -- FolderTool UI component**:
- `FolderTool.tsx` -- Renders folder operations in chat using the
`file-tree` molecule component for tree view, with `FileIcon` for agents
and `FolderIcon` for folders (both `text-neutral-600`)
- `helpers.ts` -- Type guards, output parsing, animation text helpers,
and `FolderAgentSummary` type
- `MessagePartRenderer.tsx` -- Routes 6 folder tool types to
`FolderTool` component
- Flat folder list view shows agents inside `FolderCard` when
`include_agents` is set

**Frontend -- file-tree molecule**:
- Fixed 3 pre-existing lint errors in `file-tree.tsx` (unused `ref`,
`handleSelect`, `className` params)
- Updated tree indicator line color from `bg-neutral-100` to
`bg-neutral-400` for visibility
- Added `file-tree.stories.tsx` with 5 stories: Default, AllExpanded,
FoldersOnly, WithInitialSelection, NoIndicator
- Added `ui/scroll-area.tsx` (dependency of file-tree, was missing from
non-legacy ui folder)

### Checklist

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create a folder via copilot chat ("create a folder called
Marketing")
  - [x] List folders ("show me my folders")
- [x] List folders with agents ("show me my folders and the agents in
them")
- [x] Update folder name/icon/color ("rename Marketing folder to Sales")
- [x] Move folder to a different parent ("move Sales into the Projects
folder")
  - [x] Delete a folder and verify agents move to root
- [x] Move agents into a folder ("put my newsletter agent in the
Marketing folder")
- [x] Create agent with folder_id ("create a scraper agent and save it
in my Tools folder")
- [x] Verify FolderTool UI renders loading, success, error, and empty
states correctly
- [x] Verify folder tree renders nested folders with file-tree component
- [x] Verify agents appear as FileIcon nodes in tree view when
include_agents is true
  - [x] Verify file-tree storybook stories render correctly
2026-03-06 14:59:03 +00:00
Reinier van der Leer
aa08063939 refactor(backend/db): Improve & clean up Marketplace DB layer & API (#12284)
These changes were part of #12206, but here they are separately for
easier review.
This is all primarily to make the v2 API (#11678) work possible/easier.

### Changes 🏗️

- Fix relations between `Profile`, `StoreListing`, and `AgentGraph`
- Redefine `StoreSubmission` view with more efficient joins (100x
speed-up on dev DB) and more consistent field names
- Clean up query functions in `store/db.py`
- Clean up models in `store/model.py`
- Add missing fields to `StoreAgent` and `StoreSubmission` views
- Rename ambiguous `agent_id` -> `graph_id`
- Clean up API route definitions & docs in `store/routes.py`
  - Make routes more consistent
- Avoid collision edge-case between `/agents/{username}/{agent_name}`
and `/agents/{store_listing_version_id}/*`
- Replace all usages of legacy `BackendAPI` for store endpoints with
generated client
- Remove scope requirements on public store endpoints in v1 external API

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test all Marketplace views (including admin views)
    - [x] Download an agent from the marketplace
  - [x] Submit an agent to the Marketplace
  - [x] Approve/reject Marketplace submission
2026-03-06 14:38:12 +00:00
Zamil Majdy
bde6a4c0df Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev
# Conflicts:
#	autogpt_platform/backend/backend/copilot/sdk/service.py
2026-03-06 21:07:37 +07:00
Zamil Majdy
d56452898a hotfix(backend/copilot): refactor transcript to SDK-based atomic full-context model (#12318)
## Summary

Major refactor to eliminate CLI transcript race conditions and simplify
the codebase by building transcripts directly from SDK messages instead
of reading CLI files.

## Problem

The previous approach had race conditions:
- SDK reads CLI transcript file during stop hook
- CLI may not have finished writing → incomplete transcript
- Complex merge logic to detect and fix incomplete writes
- ~200 lines of synthetic entry detection and merge code

## Solution

**Atomic Full-Context Transcript Model:**
- Build transcript from SDK messages during streaming
(`TranscriptBuilder`)
- Each upload REPLACES the previous transcript entirely (atomic)
- No CLI file reading → no race conditions
- Eliminates all merge complexity

## Key Changes

### Core Refactor
- **NEW**: `transcript_builder.py` - Build JSONL from SDK messages
during streaming
- **SIMPLIFIED**: `transcript.py` - Removed merge logic, simplified
upload/download
- **SIMPLIFIED**: `service.py` - Use TranscriptBuilder, removed stop
hook callback
- **CLEANED**: `security_hooks.py` - Removed `on_stop` parameter

### Performance & Code Quality
- **orjson migration**: Use `backend.util.json` (2-3x faster than
stdlib)
- Added `fallback` parameter to `json.loads()` for cleaner error
handling
- Moved SDK imports to top-level per code style guidelines

### Bug Fixes
- Fixed garbage collection bug in background task handling
- Fixed double upload bug in timeout handling  
- Downgraded PII-risk logging from WARNING to DEBUG
- Added 30s timeout to prevent session lock hang

## Code Removed (~200 lines)

- `merge_with_previous_transcript()` - No longer needed
- `read_transcript_file()` - No longer needed
- `CapturedTranscript` dataclass - No longer needed
- `_on_stop()` callback - No longer needed
- Synthetic entry detection logic - No longer needed
- Manual append/merge logic in finally block - No longer needed

## Testing

-  All transcript tests passing (24/24)
-  Verified with real session logs showing proper transcript growth
-  Verified with Langfuse traces showing proper turn tracking (1-8)

## Transcript Growth Pattern

From session logs:
- **Turn 1**: 2 entries (initial)
- **Turn 2**: 5 entries (+3), 2257B uploaded
- **Turn N**: ~2N entries (linear growth)

Each upload is the **complete atomic state** - always REPLACES, never
incremental.

## Files Changed

```
backend/copilot/sdk/transcript_builder.py (NEW)   | +140 lines
backend/copilot/sdk/transcript.py                  | -198, +125 lines  
backend/copilot/sdk/service.py                     | -214, +160 lines
backend/copilot/sdk/security_hooks.py              | -33, +10 lines
backend/copilot/sdk/transcript_test.py             | -85, +36 lines
backend/util/json.py                               | +45 lines
```

**Net result**: -200 lines, more reliable, faster JSON operations.

## Migration Notes

This is a **breaking change** for any code that:
- Directly calls `merge_with_previous_transcript()` or
`read_transcript_file()`
- Relies on incremental transcript uploads
- Expects stop hook callbacks

All internal usage has been updated.

---

@ntindle - Tagging for autogpt-reviewer
2026-03-06 21:03:49 +07:00
Ubbe
7507240177 feat(copilot): collapse repeated tool calls and fix stream stuck on completion (#12282)
## Summary
- **Frontend:** Group consecutive completed generic tool parts into
collapsible summary rows with a "Reasoning" collapse for finalized
messages. Merge consecutive assistant messages on hydration to avoid
split bubbles. Extract GenericTool helpers. Add `reconnectExhausted`
state and a brief delay before refetching session to reduce stale
`active_stream` reconnect cycles.
- **Backend:** Make transcript upload fire-and-forget instead of
blocking the generator exit. The 30s upload timeout in
`_try_upload_transcript` was delaying `mark_session_completed()`,
keeping the SSE stream alive with only heartbeats after the LLM had
finished — causing the UI to stay stuck in "streaming" state.

## Test plan
- [ ] Send a message in Copilot that triggers multiple tool calls —
verify they collapse into a grouped summary row once completed
- [ ] Verify the final text response appears below the collapsed
reasoning section
- [ ] Confirm the stream properly closes after the agent finishes (no
stuck "Stop" button)
- [ ] Refresh mid-stream and verify reconnection works correctly
- [ ] Click Stop during streaming — verify the UI becomes responsive
immediately

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 21:21:59 +08:00
Abhimanyu Yadav
d7c3f5b8fc fix(frontend): bypass Next.js proxy for file uploads to fix 413 error (#12315)
## Summary
- File uploads routed through the Next.js API proxy (`/api/proxy/...`)
fail with HTTP 413 for files >4.5MB due to Vercel's serverless function
body size limit
- Created shared `uploadFileDirect` utility (`src/lib/direct-upload.ts`)
that uploads files directly from the browser to the Python backend,
bypassing the proxy entirely
- Updated `useWorkspaceUpload` to use direct upload instead of the
generated hook (which went through the proxy)
- Deduplicated the copilot page's inline upload logic to use the same
shared utility

## Changes 🏗️
- **New**: `src/lib/direct-upload.ts` — shared utility for
direct-to-backend file uploads (up to 256MB)
- **Updated**: `useWorkspaceUpload.ts` — replaced proxy-based generated
hook with `uploadFileDirect`
- **Updated**: `useCopilotPage.ts` — replaced inline upload logic with
shared `uploadFileDirect`, removed unused imports

## Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Upload a file >5MB via workspace file input (e.g. in agent
builder) — should succeed without 413
  - [x] Upload a file >5MB via copilot chat — should succeed without 413
  - [x] Upload a small file (<1MB) via both paths — should still work
  - [x] Verify file delete still works from workspace file input
2026-03-06 12:20:18 +00:00
Otto
3e108a813a fix(backend): Use db_manager for workspace in add_graph_execution (#12312)
When `add_graph_execution` is called from a context where the global
Prisma client isn't connected (e.g. CoPilot tools, external API), the
call to `get_or_create_workspace(user_id)` crashes with
`ClientNotConnectedError` because it directly accesses
`UserWorkspace.prisma()`.

The fix adds `workspace_db` to the existing `if prisma.is_connected()`
fallback pattern, consistent with how all other DB calls in the function
already work.

**Sentry:** AUTOGPT-SERVER-83T (and ~15 related issues going back to Jan
2026)

---
Co-authored-by: Reinier van der Leer (@Pwuts) <pwuts@agpt.co>

Co-authored-by: Reinier van der Leer (@Pwuts) <pwuts@agpt.co>
2026-03-06 08:48:15 +01:00
Krzysztof Czerwinski
08c49a78f8 feat(copilot): UX improvements (#12258)
CoPilot conversation UX improvements (SECRT-2055):

1. **Rename conversations** — Inline rename via the session dropdown
menu. New `PATCH /sessions/{session_id}/title` endpoint with server-side
validation (rejects blank/whitespace-only titles, normalizes
whitespace). Pressing Enter or clicking away submits; Escape cancels
without submitting.

2. **New Chat button moved to top & sticky** — The 'New Chat' button is
now at the top of the sidebar (under 'Your chats') instead of the
footer, and stays fixed — only the session list below it scrolls. A
subtle shadow separator mirrors the original footer style.

3. **Auto-generated title appears live** — After the first message in a
new chat, the sidebar polls for the backend-generated title and animates
it in smoothly once available. The backend also guards against
auto-title overwriting a user-set title.

4. **External Link popup redesign** — Replaced the CSS-hacked external
link confirmation dialog with a proper AutoGPT `Dialog` component using
the design system (`Button`, `Text`, `Dialog`). Removed the old
`globals.css` workaround.

<img width="321" height="263" alt="Screenshot 2026-03-03 at 6 31 50 pm"
src="https://github.com/user-attachments/assets/3cdd1c6f-cca6-4f16-8165-15a1dc2d53f7"
/>

<img width="374" height="74" alt="Screenshot 2026-03-02 at 6 39 07 pm"
src="https://github.com/user-attachments/assets/6f9fc953-5fa7-4469-9eab-7074e7604519"
/>

<img width="548" height="293" alt="Screenshot 2026-03-02 at 6 36 28 pm"
src="https://github.com/user-attachments/assets/0f34683b-7281-4826-ac6f-ac7926e67854"
/>

### Changes 🏗️

**Backend:**
- `routes.py`: Added `PATCH /sessions/{session_id}/title` endpoint with
`UpdateSessionTitleRequest` Pydantic model — validates non-blank title,
normalizes whitespace, returns 404 vs 500 correctly
- `routes_test.py`: New test file — 7 test cases covering success,
whitespace trimming, blank rejection (422), not found (404), internal
failure (500)
- `service.py`: Auto-title generation now checks if a user-set title
already exists before overwriting
- `openapi.json`: Updated with new endpoint schema

**Frontend:**
- `ChatSidebar.tsx`: Inline rename (Enter/blur submits, Escape cancels
via ref flag); "New Chat" button sticky at top with shadow separator;
session title animates when auto-generated title appears
(`AnimatePresence`)
- `useCopilotPage.ts`: Polls for auto-generated title after stream ends,
stops as soon as title appears in cache
- `MobileDrawer.tsx`: Updated to match sidebar layout changes
- `DeleteChatDialog.tsx`: Removed redundant `onClose` prop (controlled
Dialog already handles close)
- `message.tsx`: Added `ExternalLinkModal` using AutoGPT design system;
removed redundant `onClose` prop
- `globals.css`: Removed old CSS hack for external link modal

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create a new chat, send a message — verify auto-generated title
appears in sidebar without refresh
- [x] Rename a chat via dropdown — Enter submits, Escape reverts, blank
title rejected
- [x] Rename a chat, then send another message — verify user title is
not overwritten by auto-title
- [x] With many chats, scroll the sidebar — verify "New Chat" button
stays fixed at top
- [x] Click an external link in a message — verify the new dialog
appears with AutoGPT styling

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 06:01:41 +00:00
Bently
5d56548e6b fix(frontend): prevent crash on /library with 401 error from pagination helper (#12292)
## Changes
Fixes crash on `/library` page when backend returns a 401 authentication
error.

### Problem

When the backend returns a 401 error, React Query still calls
`getNextPageParam` with the error response. The response doesn't have
the expected pagination structure, causing `pagination` to be
`undefined`. The code then crashes trying
 to access `pagination.current_page`.

Error:
TypeError: Cannot read properties of undefined (reading 'current_page')
    at Object.getNextPageParam

### Solution

Added a defensive null check in `getPaginationNextPageNumber()` to
handle cases where `pagination` is undefined:

```typescript
const { pagination } = lastPage.data;
if (!pagination) return undefined;
```
When undefined is returned, React Query interprets this as "no next page
available" and gracefully stops pagination instead of crashing.

Testing

- Manual testing: Verify /library page handles 401 errors without
crashing
- The fix is defensive and doesn't change behavior for successful
responses

Related Issues

Closes OPEN-2684
2026-03-05 19:52:36 +00:00
Otto
6ecf55d214 fix(frontend): fix 'Open link' button text color to white for contrast (#12304)
Requested by @ntindle

The Streamdown external link safety modal's "Open link" button had dark
text (`color: black`) on a dark background, making it unreadable.
Changed to `color: white` for proper contrast per our design system.

**File:** `autogpt_platform/frontend/src/app/globals.css`

Resolves SECRT-2061

---
Co-authored-by: Nick Tindle (@ntindle)
2026-03-05 19:50:39 +00:00
Bently
7c8c7bf395 feat(llm): add Claude Sonnet 4.6 model (#12158)
## Summary
Adds Claude Sonnet 4.6 (`claude-sonnet-4-6`) to the platform.

## Model Details (from [Anthropic
docs](https://www.anthropic.com/news/claude-sonnet-4-6))
- **API ID:** `claude-sonnet-4-6`
- **Pricing:** $3 / input MTok, $15 / output MTok (same as Sonnet 4.5)
- **Context window:** 200K tokens (1M beta)
- **Max output:** 64K tokens
- **Knowledge cutoff:** Aug 2025 (reliable), Jan 2026 (training data)

## Changes
- Added `CLAUDE_4_6_SONNET` to `LlmModel` enum
- Added metadata entry with correct context/output limits
- Updated Stagehand to use Sonnet 4.6 (better for browser automation
tasks)

## Why
Sonnet 4.6 brings major improvements in coding, computer use, and
reasoning. Developers with early access often prefer it to even Opus
4.5.

---------

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-03-05 19:36:56 +00:00
Zamil Majdy
0b9e0665dd Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT 2026-03-06 02:32:36 +07:00
Zamil Majdy
be18436e8f Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-03-06 02:31:40 +07:00
Zamil Majdy
f6f268a1f0 Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into HEAD 2026-03-06 02:29:56 +07:00
Zamil Majdy
ea0333c1fc fix(copilot): always upload transcript instead of size-based skip (#12303)
## Summary

Fixes copilot sessions "forgetting" previous turns due to stale
transcript storage.

**Root cause:** The transcript upload logic used byte size comparison
(`existing >= new → skip`) to prevent overwriting newer transcripts with
older ones. However, with `--resume` the CLI compacts old tool results,
so newer transcripts can have **fewer bytes** despite containing **more
conversation events**. This caused the stored transcript to freeze at
whatever the largest historical upload was — every subsequent turn
downloaded the same stale transcript and the agent lost context of
recent turns.

**Evidence from prod session `41a3814c`:**
- Stored transcript: 764KB (frozen, never updated)
- Turn 1 output: 379KB (75 lines) → upload skipped (764KB >= 379KB)
- Turn 2 output: 422KB (71 lines) → upload skipped (764KB >= 422KB)
- Turn 3 output: **empty** → upload skipped
- Agent resumed from the same stale 764KB transcript every turn, losing
context of the PR it created

**Fix:** Remove the size comparison entirely. The executor holds a
cluster lock per session, so concurrent uploads cannot race. Just always
overwrite with the latest transcript.

## Test plan
- [x] `poetry run pytest backend/copilot/sdk/transcript_test.py` — 25/25
pass
- [x] All pre-commit hooks pass
- [ ] After deploy: verify multi-turn sessions retain context across
turns
2026-03-06 02:26:52 +07:00
Zamil Majdy
21c705af6e fix(backend/copilot): prevent title update from overwriting session messages (#12302)
### Changes 🏗️

Fixes a race condition in `update_session_title()` where the background
title generation task could overwrite the Redis session cache with a
stale snapshot, causing the copilot to "forget" its previous turns.

**Root cause:** `update_session_title()` performs a read-modify-write on
the Redis cache (read full session → set title → write back). Meanwhile,
`upsert_chat_session()` writes a newer version with more messages during
streaming. If the title task reads early (e.g., 34 messages) and writes
late (after streaming persisted 101 messages), the stale 34-message
version overwrites the 101-message version. When the next message lands
on a different pod, it loads the stale session from Redis.

**Fix:** Replace the read-modify-write with a simple cache invalidation
(`invalidate_session_cache`). The title is already updated in the DB;
the next access just reloads from DB with the correct title and
messages. No locks, no deserialization of the full session blob, no risk
of stale overwrites.

**Evidence from prod logs (session `41a3814c`):**
- Pod `tm2jb` persisted session with 101 messages
- Pod `phflm` loaded session from Redis cache with only 35 messages (66
messages lost)
- The title background task ran between these events, overwriting the
cache

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `poetry run pytest backend/copilot/model_test.py` — 15/15 pass
  - [x] All pre-commit hooks pass (ruff, black, isort, pyright)
- [ ] After deploy: verify long sessions no longer lose context on
multi-pod setups
2026-03-05 18:49:41 +00:00
Zamil Majdy
a576be9db2 fix(backend): install agent-browser + Chromium in Docker image (#12301)
The Copilot browser tool (`browser_navigate`, `browser_act`,
`browser_screenshot`) has been broken on dev because `agent-browser` CLI
+ Chromium were never installed in the backend Docker image.

### Changes 🏗️

- Added `npx playwright install-deps chromium` to install Chromium
runtime libraries (libnss3, libatk, etc.)
- Added `npm install -g agent-browser` to install the CLI
- Added `agent-browser install` to download the Chromium binary
- Layer is placed after existing COPY-from-builder lines to preserve
Docker cache ordering

### Root cause

Every `browser_navigate` call fails with:
```
WARNING  [browser_navigate] open failed for <url>: agent-browser is not installed
(run: npm install -g agent-browser && agent-browser install).
```
The error originates from `FileNotFoundError` in `agent_browser.py:101`
when the subprocess tries to execute the `agent-browser` binary which
doesn't exist in the container.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified `agent-browser` binary is missing from current dev pod
via `kubectl logs`
- [x] Confirmed session `01eeac29-5a7` shows repeated failures for all
URLs
- [ ] After deploy: verify browser_navigate works in a Copilot session
on dev

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2026-03-05 18:44:55 +00:00
dependabot[bot]
5e90585f10 chore(deps): bump crazy-max/ghaction-github-runtime from 3 to 4 (#12262)
Bumps
[crazy-max/ghaction-github-runtime](https://github.com/crazy-max/ghaction-github-runtime)
from 3 to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/crazy-max/ghaction-github-runtime/releases">crazy-max/ghaction-github-runtime's
releases</a>.</em></p>
<blockquote>
<h2>v3.1.0</h2>
<ul>
<li>Bump <code>@​actions/core</code> from 1.10.0 to 1.11.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/58">crazy-max/ghaction-github-runtime#58</a></li>
<li>Bump braces from 3.0.2 to 3.0.3 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/54">crazy-max/ghaction-github-runtime#54</a></li>
<li>Bump cross-spawn from 7.0.3 to 7.0.6 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/59">crazy-max/ghaction-github-runtime#59</a></li>
<li>Bump ip from 2.0.0 to 2.0.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/50">crazy-max/ghaction-github-runtime#50</a></li>
<li>Bump micromatch from 4.0.5 to 4.0.8 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/55">crazy-max/ghaction-github-runtime#55</a></li>
<li>Bump tar from 6.1.14 to 6.2.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/51">crazy-max/ghaction-github-runtime#51</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/crazy-max/ghaction-github-runtime/compare/v3.0.0...v3.1.0">https://github.com/crazy-max/ghaction-github-runtime/compare/v3.0.0...v3.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="04d248b846"><code>04d248b</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/issues/76">#76</a>
from crazy-max/node24</li>
<li><a
href="c8f8e4e4e2"><code>c8f8e4e</code></a>
node 24 as default runtime</li>
<li><a
href="494a382acb"><code>494a382</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/issues/68">#68</a>
from crazy-max/dependabot/npm_and_yarn/actions/core-2.0.1</li>
<li><a
href="5d51b8ef32"><code>5d51b8e</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/issues/74">#74</a>
from crazy-max/dependabot/npm_and_yarn/minimatch-3.1.5</li>
<li><a
href="f7077dccce"><code>f7077dc</code></a>
chore: update generated content</li>
<li><a
href="4d1e03547a"><code>4d1e035</code></a>
chore(deps): bump minimatch from 3.1.2 to 3.1.5</li>
<li><a
href="b59d56d5bc"><code>b59d56d</code></a>
chore(deps): bump <code>@​actions/core</code> from 1.11.1 to 2.0.1</li>
<li><a
href="6d0e2ef281"><code>6d0e2ef</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/issues/75">#75</a>
from crazy-max/esm</li>
<li><a
href="41d6f6acdb"><code>41d6f6a</code></a>
remove codecov config</li>
<li><a
href="b5018eca65"><code>b5018ec</code></a>
chore: update generated content</li>
<li>Additional commits viewable in <a
href="https://github.com/crazy-max/ghaction-github-runtime/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=crazy-max/ghaction-github-runtime&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-03-05 15:59:06 +00:00
Zamil Majdy
3e22a0e786 fix(copilot): pin claude-agent-sdk to 0.1.45 to fix tool_reference content block validation error (#12294)
Requested by @majdyz

## Problem

CoPilot throws `400 Invalid Anthropic Messages API request` errors on
first message, both locally and on Dev.

## Root Cause

The CLI's built-in `ToolSearch` tool returns `tool_reference` content
blocks (`{"type": "tool_reference", "tool_name":
"mcp__copilot__find_block"}`). When the CLI constructs the next
Anthropic API request, it passes these blocks as-is in the
`tool_result.content` field. However, the Anthropic Messages API only
accepts `text` and `image` content block types in tool results.

This causes a Zod validation error:
```
messages[3].content[0].content: Invalid input: expected string, received array
```

The error only manifests when using **OpenRouter** (`ANTHROPIC_BASE_URL`
set) because the Anthropic TypeScript SDK performs stricter client-side
Zod validation in that code path vs the subscription auth path.

PR #12288 bumped `claude-agent-sdk` from `0.1.39` to `^0.1.46`, which
upgraded the bundled Claude CLI from `v2.1.49` to `v2.1.69` where this
issue was introduced.

## Fix

Pin to `0.1.45` which has a CLI version that doesn't produce
`tool_reference` content blocks in tool results.

## Testing
- CoPilot first message should work without 400 errors via OpenRouter
- SDK compat tests should still pass
2026-03-05 13:12:26 +00:00
Ubbe
6abe39b33a feat(frontend/copilot): add per-turn work-done summary stats (#12257)
## Summary
- Adds per-turn work-done counters (e.g. "3 searches", "1 agent run")
shown as plain text on the final assistant message of each
user/assistant interaction pair
- Counters aggregate tool calls by category (searches, agents run,
blocks run, agents created/edited, agents scheduled)
- Copy and TTS actions now appear only on the final assistant message
per turn, with text aggregated from all assistant messages in that turn
- Removes the global JobStatsBar above the chat input

Resolves: SECRT-2026

## Test plan
- [ ] Work-done counters appear only on the last assistant message of
each turn (not on intermediate assistant messages)
- [ ] Counters increment correctly as tool call parts appear in messages
- [ ] Internal operations (add_understanding, search_docs, get_doc_page,
find_block) are NOT counted
- [ ] Max 3 counter categories shown, sorted by volume
- [ ] Copy/TTS actions appear only on the final assistant message per
turn
- [ ] Copy/TTS aggregate text from all assistant messages in the turn
- [ ] No counters or actions shown while streaming is still in progress
- [ ] No type errors, lint errors, or format issues introduced

Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 19:32:48 +08:00
Zamil Majdy
476cf1c601 feat(copilot): support Claude Code subscription auth for SDK mode (#12288)
## Summary

- Adds `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION` config flag to let the
copilot SDK path use the Claude CLI's own subscription auth (from
`claude login`) instead of API keys
- When enabled, the SDK subprocess inherits CLI credentials — no
`ANTHROPIC_BASE_URL`/`AUTH_TOKEN` override is injected
- Forces SDK mode regardless of LaunchDarkly flag (baseline path uses
`openai.AsyncOpenAI` which requires an API key)
- Validates CLI installation on first use with clear error messages

## Setup

```bash
npm install -g @anthropic-ai/claude-code
claude login
# then set in .env:
CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true
```

## Changes

| File | Change |
|------|--------|
| `copilot/config.py` | New `use_claude_code_subscription` field + env
var validator |
| `copilot/sdk/service.py` | `_validate_claude_code_subscription()` +
`_build_sdk_env()` early-return + fail-fast guard |
| `copilot/executor/processor.py` | Force SDK mode via short-circuit
`or` |

## Test plan

- [ ] Set `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true`, unset all API keys
- [ ] Run `claude login` on the host
- [ ] Start backend, send a copilot message — verify SDK subprocess uses
CLI auth
- [ ] Verify existing OpenRouter/API key flows still work (no
regression)
2026-03-05 09:55:35 +00:00
Zamil Majdy
25022f2d1e fix(copilot): handle empty tool_call arguments in baseline path (#12289)
## Summary

Handle empty/None `tool_call.arguments` in the baseline copilot path
that cause OpenRouter 400 errors when converting to Anthropic format.

## Changes

**`backend/copilot/baseline/service.py`**:
- Default empty `tc["arguments"]` to `"{}"` to prevent OpenRouter from
failing on empty tool arguments during format conversion.

## Test plan

- [x] Existing baseline tests pass
- [ ] Verify on staging: trigger a tool call in baseline mode and
confirm normal flow works
2026-03-05 09:53:05 +00:00
Zamil Majdy
ce1675cfc7 feat(copilot): add Langfuse tracing to baseline LLM path (#12281)
## Summary
Depends on #12276 (baseline code).

- Swap shared OpenAI client to `langfuse.openai.AsyncOpenAI` —
auto-captures all LLM calls (token usage, latency, model, prompts) as
Langfuse generations when configured
- Add `propagate_attributes()` context in baseline streaming for
`user_id`/`session_id` attribution, matching the SDK path's OTEL tracing
- No-op when Langfuse is not configured — `langfuse.openai.AsyncOpenAI`
falls back to standard `openai.AsyncOpenAI` behavior

## Observability parity

| Aspect | SDK path | Baseline path (after this PR) |
|--------|----------|-------------------------------|
| LLM call tracing | OTEL via `configure_claude_agent_sdk()` |
`langfuse.openai.AsyncOpenAI` auto-instrumentation |
| User/session context | `propagate_attributes()` |
`propagate_attributes()` |
| Langfuse prompts | Shared `_build_system_prompt()` | Shared
`_build_system_prompt()` |
| Token/cost tracking | Via OTEL spans | Via Langfuse generation objects
|

## Test plan
- [x] `poetry run format` passes (pyright, ruff, black, isort)
- [ ] Verify Langfuse traces appear for baseline path with
`CHAT_USE_CLAUDE_AGENT_SDK=false`
- [ ] Verify SDK path tracing is unaffected
2026-03-05 09:51:16 +00:00
Otto
3d0ede9f34 feat(backend/copilot): attach uploaded images and PDFs as multimodal vision blocks (#12273)
Requested by @majdyz

When users upload images or PDFs to CoPilot, the AI couldn't see the
content because the CLI's Zod validator rejects large base64 in MCP tool
results and even small images were misidentified (the CLI silently drops
or corrupts image content blocks in tool results).

## Approach

Embed uploaded images directly as **vision content blocks** in the user
message via `client._transport.write()`. The SDK's `client.query()` only
accepts string content, so we bypass it for multimodal messages —
writing a properly structured user message with `[...image_blocks,
{"type": "text", "text": query}]` directly to the transport. This
ensures the CLI binary receives images as native vision blocks, matching
how the Anthropic API handles multimodal input.

For binary files accessed via workspace tools at runtime, we save them
to the SDK's ephemeral working directory (`sdk_cwd`) and return a file
path for the CLI's built-in `Read` tool to handle natively.

## Changes

### Vision content blocks for attached files — `service.py`
- `_prepare_file_attachments` downloads workspace files before the
query, converts images to base64 vision blocks (`{"type": "image",
"source": {"type": "base64", ...}}`)
- When vision blocks are present, writes multimodal user message
directly to `client._transport` instead of using `client.query()`
- Non-image files (PDFs, text) are saved to `sdk_cwd` with a hint to use
the Read tool

### File-path based access for workspace tools — `workspace_files.py`
- `read_workspace_file` saves binary files to `sdk_cwd` instead of
returning base64, returning a path for the Read tool

### SDK context for ephemeral directory — `tool_adapter.py`
- Added `sdk_cwd` context variable so workspace tools can access the
ephemeral directory
- Removed inline base64 multimodal block machinery
(`_extract_content_block`, `_strip_base64_from_text`, `_BLOCK_BUILDERS`,
etc.)

### Frontend — rendering improvements
- `MessageAttachments.tsx` — uses `OutputRenderers` system
(`globalRegistry` + `OutputItem`) for image/video preview rendering
instead of custom components
- `GenericTool.tsx` — uses `OutputRenderers` system for inline image
rendering of base64 content
- `routes.py` — returns 409 for duplicate workspace filenames

### Tests
- `tool_adapter_test.py` — removed multimodal extraction/stripping
tests, added `get_sdk_cwd` tests
- `service_test.py` — rewritten for `_prepare_file_attachments` with
file-on-disk assertions

Closes OPEN-3022

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-05 09:09:59 +00:00
Ubbe
5474f7c495 feat(frontend/copilot): add output action buttons (upvote, downvote) with Langfuse feedback (#12260)
## Summary
- Feedback is submitted to the backend Langfuse integration
(`/api/chat/sessions/{id}/feedback`) for observability
- Downvote opens a modal dialog for optional detailed feedback text (max
2000 chars)
- Buttons are hidden during streaming and appear on hover; once feedback
is selected they stay visible

## Changes
- **`AssistantMessageActions.tsx`** (new): Renders copy (CopySimple),
thumbs-up, and thumbs-down buttons using `MessageAction` from the design
system. Visual states for selected feedback (green for upvote, red for
downvote with filled icons).
- **`FeedbackModal.tsx`** (new): Dialog with a textarea for optional
downvote comment, using the design system `Dialog` component.
- **`useMessageFeedback.ts`** (new): Hook managing per-message feedback
state and backend submission via `POST
/api/chat/sessions/{id}/feedback`.
- **`ChatMessagesContainer.tsx`** (modified): Renders
`AssistantMessageActions` after `MessageContent` for assistant messages
when not streaming.
- **`ChatContainer.tsx`** (modified): Passes `sessionID` prop through to
`ChatMessagesContainer`.

## Test plan
- [ ] Verify action buttons appear on hover over assistant messages
- [ ] Verify buttons are hidden during active streaming
- [ ] Click copy button → text copied to clipboard, success toast shown
- [ ] Click upvote → green highlight, "Thank you" toast, button locked
- [ ] Click downvote → red highlight, feedback modal opens
- [ ] Submit feedback modal with/without comment → modal closes,
feedback sent
- [ ] Cancel feedback modal → modal closes, downvote stays locked
- [ ] Verify feedback POST reaches `/api/chat/sessions/{id}/feedback`

### Linear issue
Closes SECRT-2051

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 17:01:05 +08:00
Abhimanyu Yadav
f1b771b7ee feat(platform): switch builder file inputs from base64 to workspace uploads (#12226)
## Summary

Builder node file inputs were stored as base64 data URIs directly in
graph JSON, bloating saves and causing lag. This PR uploads files to the
existing workspace system and stores lightweight `workspace://`
references instead.

## What changed

- **Upload**: When a user picks a file in a builder node input, it gets
uploaded to workspace storage and the graph stores a small
`workspace://file-id#mime/type` URI instead of a huge base64 string.

- **Delete**: When a user clears a file input, the workspace file is
soft-deleted from storage so it doesn't leave orphaned files behind.

- **Execution**: Wired up `workspace_id` on `ExecutionContext` so blocks
can resolve `workspace://` URIs during graph runs. `store_media_file()`
already knew how to handle them.

- **Output rendering**: Added a renderer that displays `workspace://`
URIs as images, videos, audio players, or download cards in node output.

- **Proxy fix**: Removed a `Content-Type: text/plain` override on
multipart form responses that was breaking the generated hooks' response
parsing.

Existing graphs with base64 `data:` URIs continue to work — no migration
needed.

## Test plan

- [x] Upload file in builder → spinner shows, completes, file label
appears
- [x] Save/reload graph → `workspace://` URI persists, not base64
- [x] Clear file input → workspace file is deleted
- [x] Run graph → blocks resolve `workspace://` files correctly
- [x] Output renders images/video/audio from `workspace://` URIs
- [x] Old graphs with base64 still work

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 08:38:18 +00:00
Otto
aa7a2f0a48 hotfix(frontend/signup): Add missing createUser() call in password signup (#12287)
Requested by @0ubbe

Password signup was missing the backend `createUser()` call that the
OAuth callback flow already had. This caused `getOnboardingStatus()` to
fail/hang for new users whose backend record didn't exist yet, resulting
in an infinite spinner after account creation.

## Root Cause

| Flow | createUser() | getOnboardingStatus() | Result |
|------|-------------|----------------------|--------|
| OAuth signup |  Called |  Works | Redirects correctly |
| Password signup |  Missing |  Fails/hangs | Infinite spinner |

## Fix

Adds `createUser()` call in `signup/actions.ts` after session is set,
before onboarding status check — matching the OAuth callback pattern.
Includes error handling with Sentry reporting.

## Testing

- Create a new password account → should redirect without spinner
- OAuth signup unaffected (no changes to that flow)

Fixes OPEN-3023

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-03-05 16:11:51 +08:00
Nicholas Tindle
3722d05b9b fix(frontend/builder): make Google Drive file inputs chainable (#12274)
Resolves: OPEN-3018

Google Drive picker fields on INPUT blocks were missing connection
handles, making them non-chainable in the new builder.

### Changes 🏗️

- **Render `TitleFieldTemplate` with `InputNodeHandle`** — uses
`getHandleId()` with `fieldPathId.$id` (which correctly resolves to e.g.
`agpt_%_spreadsheet`), fixing the previous `_@_` handle error caused by
using `idSchema.$id` (undefined for custom RJSF FieldProps)
- **Override `showHandles: !!nodeId`** in uiOptions — the INPUT block's
`generate-ui-schema.ts` sets `showHandles: false`, but Google Drive
fields need handles to be chainable
- **Hide picker content when handle is connected** — uses
`useEdgeStore.isInputConnected()` to detect wired connections and
conditionally hides the picker/placeholder UI

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Add a Google Drive file input block to a graph in the new builder
  - [x] Verify the connection handle appears on the input
  - [x] Connect another block's output to the Google Drive input handle
- [x] Verify the picker UI hides when connected and reappears when
disconnected
- [x] Verify the Google Drive picker still works normally on non-INPUT
block nodes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Changes input-handle ID generation and conditional rendering for
Google Drive fields in the builder; regressions could break edge
connections or hide the picker unexpectedly on some nodes.
> 
> **Overview**
> Google Drive picker fields now render a proper RJSF
`TitleFieldTemplate` (and thus input handles) using a computed
`handleId` derived from `fieldPathId.$id`, and force `showHandles` on
when a `nodeId` is present.
> 
> The picker/placeholder UI is now conditionally hidden when
`useEdgeStore.isInputConnected()` reports the input handle is connected,
preventing duplicate input UI when the value comes from an upstream
node.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
1f1df53a38. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: abhi1992002 <abhimanyu1992002@gmail.com>
Co-authored-by: Abhimanyu Yadav <122007096+Abhi1992002@users.noreply.github.com>
2026-03-05 04:28:01 +00:00
Zamil Majdy
592830ce9b feat(copilot): persist large tool outputs to workspace with retrieval instructions (#12279)
## Summary
- Large tool outputs (>80K chars) are now persisted to session workspace
storage before truncation, preventing permanent data loss
- Truncated output includes a head preview (50K chars) with clear
retrieval instructions referencing `read_workspace_file` with
offset/length
- Added `offset` and `length` parameters to `ReadWorkspaceFileTool` for
paginated reads of large files without re-triggering truncation

## Problem
Tool outputs exceeding 100K chars were permanently lost — truncated by
`StreamToolOutputAvailable.model_post_init` using middle-out truncation.
The model had no way to retrieve the full output later, causing
recursive read loops where the agent repeatedly tries to re-read
truncated data.

## Solution
1. **`BaseTool.execute()`** — When output exceeds 80K chars, persist
full output to workspace at `tool-outputs/{tool_call_id}.json`, then
replace with a head preview wrapped in `<tool-output-truncated>` tags
containing retrieval instructions
2. **`ReadWorkspaceFileTool`** — New `offset`/`length` parameters enable
paginated reads so the agent can fetch slices without re-triggering
truncation
3. **Graceful fallback** — If workspace write fails, returns raw output
unchanged for existing truncation to handle

## Test plan
- [x] `base_test.py`: 5 tests covering persist+preview, fallback on
error, small output passthrough, large output persistence, anonymous
user skip
- [x] `workspace_files_test.py`: Ranged read test covering offset+length
slice, offset-only, offset beyond file length
- [ ] CI passes
- [ ] Review comments addressed
2026-03-05 00:50:01 +00:00
Zamil Majdy
6cc680f71c feat(copilot): improve SDK loading time (#12280)
## Summary
- Skip CLI version check at worker init (saves ~300ms/request)
- Pre-warm bundled CLI binary at startup to warm OS page caches (~500ms
saved on first request per worker)
- Parallelize E2B setup, system prompt fetch, and transcript download
with `asyncio.gather()` (saves ~200-500ms)
- Enable Langfuse prompt caching with configurable TTL (default 300s)

## Test plan
- [ ] `poetry run pytest backend/copilot/sdk/service_test.py -s -vvv`
- [ ] Manual: send copilot messages via SDK path, verify resume still
works on multi-turn
- [ ] Check executor logs for "CLI pre-warm done" messages
2026-03-05 00:49:14 +00:00
Otto
b342bfa3ba fix(frontend): revalidate layout after email/password login (#12285)
Requested by @ntindle

After logging in with email/password, the page navigates but renders a
blank/unauthenticated state (just logo + cookie banner). A manual page
refresh fixes it.

The `login` server action calls `signInWithPassword()` server-side but
doesn't call `revalidatePath()`, so Next.js serves cached RSC payloads
that don't reflect the new auth state. The OAuth callback route already
does this correctly.

**Fix:** Add `revalidatePath(next, "layout")` after successful login,
matching the OAuth callback pattern.

Closes SECRT-2059
2026-03-04 22:25:48 +00:00
Zamil Majdy
0215332386 feat(copilot): remove legacy copilot, add baseline non-SDK mode with tool calling (#12276)
## Summary
- Remove ~1200 lines of broken/unmaintained non-SDK copilot streaming
code (retry logic, parallel tool calls, context window management)
- Add `stream_chat_completion_baseline()` as a clean fallback LLM path
with full tool-calling support when `CHAT_USE_CLAUDE_AGENT_SDK=false`
(e.g. when Anthropic is down)
- Baseline reuses the same shared `TOOL_REGISTRY`,
`get_available_tools()`, and `execute_tool()` as the SDK path
- Move baseline code to dedicated `baseline/` folder (mirrors `sdk/`
structure)
- Clean up SDK service: remove unused params, fix model/env resolution,
fix stream error persistence
- Clean up config: remove `max_retries`, `thinking_enabled` fields
(non-SDK only)

## Changes
| File | Action |
|------|--------|
| `backend/copilot/baseline/__init__.py` | New — package export |
| `backend/copilot/baseline/service.py` | New — baseline streaming with
tool-call loop |
| `backend/copilot/baseline/service_test.py` | New — multi-turn keyword
recall test |
| `backend/copilot/service.py` | Remove ~1200 lines of legacy code, keep
shared helpers only |
| `backend/copilot/executor/processor.py` | Simplify branching to SDK vs
baseline |
| `backend/copilot/sdk/service.py` | Remove unused params, fix model/env
separation, fix stream error persistence |
| `backend/copilot/config.py` | Remove `max_retries`, `thinking_enabled`
|
| `backend/copilot/service_test.py` | Keep SDK test only (baseline test
moved) |
| `backend/copilot/parallel_tool_calls_test.py` | Deleted (tested
removed code) |

## Test plan
- [x] `poetry run format` passes
- [x] CI passes (all 3 Python versions, types, CodeQL)
- [ ] SDK path works unchanged in production
- [x] Baseline path (`CHAT_USE_CLAUDE_AGENT_SDK=false`) streams
responses with tool calling
- [x] Baseline emits correct Vercel AI SDK stream protocol events
2026-03-04 13:51:46 +00:00
Zamil Majdy
160d6eddfb feat(copilot): enable OpenRouter broadcast for SDK /messages endpoint (#12277)
## Summary

OpenRouter Broadcast silently drops traces for the Anthropic-native
`/api/v1/messages` endpoint unless an `x-session-id` HTTP header is
present. This was confirmed by systematic testing against our Langfuse
integration:

| Test | Endpoint | `x-session-id` header | Broadcast to Langfuse |
|------|----------|-----------------------|----------------------|
| 1 | `/chat/completions` | N/A (body fields work) |  |
| 2 | `/messages` (body fields only) |  |  |
| 3 | `/messages` (header + body) |  |  |
| 4 | `/messages` (`metadata.user_id` only) |  |  |
| 5 | `/messages` (header only) |  |  |

**Root cause:** OpenRouter only triggers broadcast for the `/messages`
endpoint when the `x-session-id` HTTP header is present — body-level
`session_id` and `metadata.user_id` are insufficient.

### Changes
- **SDK path:** Inject `x-session-id` and `x-user-id` via
`ANTHROPIC_CUSTOM_HEADERS` env var in `_build_sdk_env()`, which the
Claude Agent SDK CLI reads and attaches to every outgoing API request
- **Non-SDK path:** Add `trace` object (`trace_name` + `environment`) to
`extra_body` for richer broadcast metadata in Langfuse

This creates complementary traces alongside the existing OTEL
integration: broadcast provides cost/usage data from OpenRouter while
OTEL provides full tool-call observability with `userId`, `sessionId`,
`environment`, and `tags`.

## Test plan
- [x] Verified via test script: `/messages` with `x-session-id` header →
trace appears in Langfuse with correct `sessionId`
- [x] Verified `/chat/completions` with `trace` object → trace appears
with custom `trace_name`
- [x] Pre-commit hooks pass (ruff, black, isort, pyright)
- [ ] Deploy to dev and verify broadcast traces appear for real copilot
SDK sessions
2026-03-04 09:07:48 +00:00
Ubbe
a5db9c05d0 feat(frontend/copilot): add text-to-speech and share output actions (#12256)
## Summary
- Add text-to-speech action button to CoPilot assistant messages using
the browser Web Speech API
- Add share action button that uses the Web Share API with clipboard
fallback
- Replace inline SVG copy icon with Phosphor CopyIcon for consistency

## Linked Issue
SECRT-2052

## Test plan
- [ ] Verify copy button still works
- [ ] Click speaker icon and verify TTS reads aloud
- [ ] Click stop while playing and verify speech stops
- [ ] Click share icon and verify native share or clipboard fallback

Note: This PR should be merged after SECRT-2051 PR

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 16:08:54 +08:00
Otto
b74d41d50c fix(backend): handle UniqueViolationError in workspace file retry path (#12267)
Requested by @majdyz

When two concurrent requests write to the same workspace file path with
`overwrite=True`, the retry after deleting the conflicting file could
also hit a `UniqueViolationError`. This raw Prisma exception was
bubbling up unhandled to Sentry as a high-priority alert
(AUTOGPT-SERVER-7ZA).

Now the retry path catches `UniqueViolationError` specifically and
converts it to a `ValueError` with a clear message, matching the
existing pattern for the non-overwrite path.

**Change:** `autogpt_platform/backend/backend/util/workspace.py` — added
a specific `UniqueViolationError` catch before the generic `Exception`
catch in the retry block.

**Risk:** Minimal — only affects the already-failing retry path. No
behavior change for success paths.

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-04 07:04:50 +00:00
Otto
a897f9e124 feat(copilot): render context compaction as tool-call UI events (#12250)
Requested by @majdyz

When CoPilot compacts (summarizes/truncates) conversation history to fit
within context limits, the user now sees it rendered like a tool call —
a spinner while compaction runs, then a completion notice.

**Backend:**
- Added `compaction_start_events()`, `compaction_end_events()`,
`compaction_events()` in `response_model.py` using the existing
tool-call SSE protocol (`tool-input-start` → `tool-input-available` →
`tool-output-available`)
- All three compaction paths (legacy `service.py`, SDK pre-query, SDK
mid-stream) use the same pattern
- Pre-query and SDK-internal compaction tracked independently so neither
suppresses the other

**Frontend:**
- Added `compaction` tool category to `GenericTool` with
`ArrowsClockwise` icon
- Shows "Summarizing earlier messages…" with spinner while running
- Shows "Earlier messages were summarized" when done
- No expandable accordion — just the status line

**Cleanup:**
- Removed unused `system_notice_start/end_events`,
`COMPACTION_STARTED_MSG`
- Removed unused `system_notice_events`, `system_error_events`,
`_system_text_events`

Closes SECRT-2053

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-04 05:31:32 +00:00
Zamil Majdy
7fd26d3554 feat(copilot): run_mcp_tool — MCP server discovery and execution in Otto (#12213)
## Summary

Enables Otto (the AutoGPT copilot) to connect to any MCP (Model Context
Protocol) server, discover its tools, and execute them — with the same
credential login UI used in the graph builder.

**Why a dedicated `run_mcp_tool` instead of reusing `run_block` +
MCPToolBlock?**
Two blockers make `run_block` unworkable for MCP:
1. **No discovery mode** — `MCPToolBlock` errors with "No tool selected"
when `selected_tool` is empty; the agent can't learn what tools exist
before picking one.
2. **Credential matching bug** — `find_matching_credential()` (the block
execution path) does NOT check MCP server URLs; it would match any
stored MCP OAuth credential regardless of server. The correct
`_credential_is_for_mcp_server()` helper only applies in the graph path.

## Changes

### Backend
- **New `run_mcp_tool` copilot tool** (`run_mcp_tool.py`) — two-stage
flow:
1. `run_mcp_tool(server_url)` → discovers available tools via
`MCPClient.list_tools()`
2. `run_mcp_tool(server_url, tool_name, tool_arguments)` → executes via
`MCPClient.call_tool()`
- Lazy auth: fast DB credential lookup first
(`MCPToolBlock._auto_lookup_credential`); on HTTP 401/403 with no stored
creds, returns `SetupRequirementsResponse` so the frontend renders the
existing CredentialsGroupedView OAuth login card
- **New response models** in `models.py`: `MCPToolsDiscoveredResponse`,
`MCPToolOutputResponse`, `MCPToolInfo`
- **Exclude MCPToolBlock** from `find_block` / `run_block`
(`COPILOT_EXCLUDED_BLOCK_TYPES`)
- **System prompt update** — MCP section with two-step flow,
`input_schema` guidance, auth-wait instruction, and registry URL
(`registry.modelcontextprotocol.io`)

### Frontend
- **`RunMCPToolComponent`** — routes between credential prompt (reuses
`SetupRequirementsCard` from RunBlock) and result card; discovery step
shows only a minimal in-progress animation (agent-internal, not
user-facing)
- **`MCPToolOutputCard`** — renders tool result as formatted JSON or
plain text
- **`helpers.tsx`** — type guards (`isMCPToolOutput`,
`isSetupRequirementsOutput`, `isErrorOutput`), output parsing, animation
text
- Registered `tool-run_mcp_tool` case in `ChatMessagesContainer`

## Test plan

- [ ] Call `run_mcp_tool(server_url)` with a public MCP server → see
discovery animation, agent gets tool list
- [ ] Call `run_mcp_tool(server_url, tool_name, tool_arguments)` → see
`MCPToolOutputCard` with result
- [ ] Call with an auth-required server and no stored creds →
`SetupRequirementsCard` renders with MCP OAuth button
- [ ] After connecting credentials, retry → executes successfully
- [ ] `find_block("MCP")` returns no results (MCPToolBlock excluded)
- [ ] Backend unit tests: mock `MCPClient` for discovery + execution +
auth error paths

---------

Co-authored-by: Otto (AGPT) <otto@agpt.co>
2026-03-04 05:30:38 +00:00
Zamil Majdy
b504cf9854 feat(copilot): Add agent-browser multi-step browser automation tools (#12230)
## Summary

Adds three new Copilot tools for multi-step browser automation using the
[agent-browser](https://github.com/vercel-labs/agent-browser) CLI
(Playwright-based local daemon):

- **`browser_navigate`** — navigate to a URL and get an
accessibility-tree snapshot with `@ref` IDs
- **`browser_act`** — interact with page elements (click, fill, scroll,
check, press, select, `dblclick`, `type`, `wait`, back, forward,
reload); returns updated snapshot
- **`browser_screenshot`** — capture annotated screenshot (with `@ref`
overlays) and save to user workspace

Also adds **`browse_web`** (Stagehand + Browserbase) for one-shot
JS-rendered page extraction.

### Why two browser tools?

| Tool | When to use |
|------|-------------|
| `browse_web` | Single-shot extraction — cloud Browserbase session, no
local daemon needed |
| `browser_navigate` / `browser_act` | Multi-step flows (login →
navigate → scrape), persistent session within a Copilot session |

### Design decisions

- **SSRF protection**: Uses the same `validate_url()` from
`backend.util.request` as HTTP blocks — async DNS, all IPs checked, full
RFC 1918 + link-local + IPv6 coverage
- **Session isolation**: `_run()` passes both `--session <id>` (isolated
Chromium context per Copilot session) **and** `--session-name <id>`
(persist cookies within a session), preventing cross-session state
leakage while supporting login flows
- **Snapshot truncation**: Interactive-only accessibility tree
(`snapshot -i`) capped at 20 000 chars with a continuation hint
- **Screenshot storage**: PNG bytes uploaded to user workspace via
`WriteWorkspaceFileTool`; returns `file_id` for retrieval

### Bugs fixed in this PR

- Session isolation bug: `--session-name` alone shared browser history
across different Copilot sessions; added `--session` to isolate contexts
- Missing actions: added `dblclick`, `type` (append without clearing),
`wait` (CSS selector or ms delay)

## Test plan

- [x] 53 unit tests covering all three tools, all actions, SSRF
integration, auth check, session isolation, snapshot truncation,
timeout, missing binary
- [x] Integration test: real `agent-browser` CLI + Anthropic API
tool-calling loop (3/3 scenarios passed)
- [x] Linting (Ruff, isort, Black, Pyright) all passing

```
backend/copilot/tools/agent_browser_test.py  53 passed in 17.79s
```
2026-03-03 21:55:28 +00:00
Zamil Majdy
29da8db48e feat(copilot): E2B cloud sandbox — unified file tools, persistent execution, output truncation (#12212)
## Summary

- **E2B file tools**: New MCP tools
(`read_file`/`write_file`/`edit_file`/`glob`/`grep`) that operate
directly on the E2B sandbox filesystem (`/home/user`). When E2B is
active, these replace SDK built-in `Read/Write/Edit/Glob/Grep` so all
tools share a single coherent filesystem with `bash_exec` — no sync
needed.
- **E2B sandbox lifecycle**: New `e2b_sandbox.py` manages sandbox
creation and reconnection via Redis, with stale-key cleanup on
reconnection failure.
- **E2B enabled by default**: `use_e2b_sandbox` defaults to `True`; set
`CHAT_USE_E2B_SANDBOX=false` to disable.
- **Centralized output truncation**: All MCP tool outputs are truncated
via `_truncating` wrapper and stashed (`_pending_tool_outputs`) to
bypass SDK's head-truncation for the frontend.
- **Frontend tool display**: `GenericTool.tsx` now renders bash
stdout/stderr, file content, edit diffs (old/new), todo lists, and
glob/grep results with category-specific icons and status text.
- **Workspace file tools + E2B**: `read_workspace_file`'s `save_to_path`
and `write_workspace_file`'s `source_path` route to E2B sandbox when
active.

## Files changed

| Area | Files | What |
|------|-------|------|
| E2B file tools | `sdk/e2b_file_tools.py`, `sdk/e2b_file_tools_test.py`
| MCP file tool handlers + tests |
| E2B sandbox | `tools/e2b_sandbox.py` | Sandbox lifecycle
(create/reconnect/Redis) |
| Tool adapter | `sdk/tool_adapter.py` | MCP server, truncation, stash,
path validation |
| Service | `sdk/service.py` | E2B integration, prompt supplements |
| Security | `sdk/security_hooks.py`, `sdk/security_hooks_test.py` |
Path validation for E2B mode |
| Bash exec | `tools/bash_exec.py` | E2B execution path |
| Workspace files | `tools/workspace_files.py`,
`tools/workspace_files_test.py` | E2B-aware save/source paths |
| Config | `copilot/config.py` | E2B config fields (default on) |
| Truncation | `util/truncate.py` | Middle-out truncation fix |
| Frontend | `GenericTool.tsx` | Tool-specific display rendering |

## Test plan

- [x] `security_hooks_test.py` — 43 tests (path validation, tool access,
deny messages)
- [x] `e2b_file_tools_test.py` — 19 tests (path resolution, local read
safety)
- [x] `workspace_files_test.py` — 17 tests (ephemeral path validation)
- [x] CI green (backend 3.11/3.12/3.13, lint, types, e2e)
2026-03-03 21:31:38 +00:00
Nicholas Tindle
757ec1f064 feat(platform): Add file upload to copilot chat [SECRT-1788] (#12220)
## Summary

- Add file attachment support to copilot chat (documents, images,
spreadsheets, video, audio)
- Show upload progress with spinner overlays on file chips during upload
- Display attached files as styled pills in sent user messages using AI
SDK's native `FileUIPart`
- Backend upload endpoint with virus scanning (ClamAV), per-file size
limits, and per-user storage caps
- Enrich chat stream with file metadata so the LLM can access files via
`read_workspace_file`

Resolves: [SECRT-1788](https://linear.app/autogpt/issue/SECRT-1788)

### Backend
| File | Change |
|------|--------|
| `chat/routes.py` | Accept `file_ids` in stream request, enrich user
message with file metadata |
| `workspace/routes.py` | New `POST /files/upload` and `GET
/storage/usage` endpoints |
| `executor/utils.py` | Thread `file_ids` through
`CoPilotExecutionEntry` and RabbitMQ |
| `settings.py` | Add `max_file_size_mb` and `max_workspace_storage_mb`
config |

### Frontend
| File | Change |
|------|--------|
| `AttachmentMenu.tsx` | **New** — `+` button with popover for file
category selection |
| `FileChips.tsx` | **New** — file preview chips with upload spinner
state |
| `MessageAttachments.tsx` | **New** — paperclip pills rendering
`FileUIPart` in chat bubbles |
| `upload/route.ts` | **New** — Next.js API proxy for multipart uploads
to backend |
| `ChatInput.tsx` | Integrate attachment menu, file chips, upload
progress |
| `useCopilotPage.ts` | Upload flow, `FileUIPart` construction,
transport `file_ids` extraction |
| `ChatMessagesContainer.tsx` | Render file parts as
`MessageAttachments` |
| `ChatContainer.tsx` / `EmptySession.tsx` | Thread `isUploadingFiles`
prop |
| `useChatInput.ts` | `canSendEmpty` option for file-only sends |
| `stream/route.ts` | Forward `file_ids` to backend |

## Test plan

- [x] Attach files via `+` button → file chips appear with X buttons
- [x] Remove a chip → file is removed from the list
- [x] Send message with files → chips show upload spinners → message
appears with file attachment pills
- [x] Upload failure → toast error, chips revert to editable (no phantom
message sent)
- [x] New session (empty form): same upload flow works
- [x] Messages without files render normally
- [x] Network tab: `file_ids` present in stream POST body

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Adds authenticated file upload/storage-quota enforcement and threads
`file_ids` through the chat streaming path, which affects data handling
and storage behavior. Risk is mitigated by UUID/workspace scoping, size
limits, and virus scanning but still touches security- and
reliability-sensitive upload flows.
> 
> **Overview**
> Copilot chat now supports attaching files: the frontend adds
drag-and-drop and an attach button, shows selected files as removable
chips with an upload-in-progress state, and renders sent attachments
using AI SDK `FileUIPart` with download links.
> 
> On send, files are uploaded to the backend (with client-side limits
and failure handling) and the chat stream request includes `file_ids`;
the backend sanitizes/filters IDs, scopes them to the user’s workspace,
appends an `[Attached files]` metadata block to the user message for the
LLM, and forwards the sanitized IDs through `enqueue_copilot_turn`.
> 
> The backend adds `POST /workspace/files/upload` (filename
sanitization, per-file size limit, ClamAV scan, and per-user storage
quota with post-write rollback) plus `GET /workspace/storage/usage`,
introduces `max_workspace_storage_mb` config, optimizes workspace size
calculation, and fixes executor cleanup to avoid un-awaited coroutine
warnings; new route tests cover file ID validation and upload quota/scan
behaviors.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
8d3b95d046. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:23:27 +00:00
Ubbe
9442c648a4 fix(platform/copilot): bypass Vercel SSE proxy, refactor hook architecture (#12254)
## Summary

Reliability, architecture, and UX improvements for the CoPilot SSE
streaming pipeline.

### Frontend

- **SSE proxy bypass**: Connect directly to the Python backend for SSE
streams, avoiding the Next.js serverless proxy and its 800s Vercel
function timeout ceiling
- **Hook refactor**: Decompose the 490-line `useCopilotPage` monolith
into focused domain modules:
- `helpers.ts` — pure functions (`deduplicateMessages`,
`resolveInProgressTools`)
- `store.ts` — Zustand store for shared UI state (`sessionToDelete`,
drawer open/close)
- `useCopilotStream.ts` — SSE transport, `useChat` wrapper,
reconnect/resume logic, stop+cancel
  - `useCopilotPage.ts` — thin orchestrator (~160 lines)
- **ChatMessagesContainer refactor**: Split 525-line monolith into
sub-components:
  - `helpers.ts` — pure text parsing (markers, workspace URLs)
- `components/ThinkingIndicator.tsx` — ScaleLoader animation + cycling
phrases with pulse
- `components/MessagePartRenderer.tsx` — tool dispatch switch +
workspace media
- **Stop UX fixes**:
- Guard `isReconnecting` and resume effect with `isUserStoppingRef` so
the input unlocks immediately after explicit stop (previously stuck
until page refresh)
- Inject cancellation marker locally in `stop()` so "You manually
stopped this chat" shows instantly
- **Thinking indicator polish**: Replace MorphingBlob SVG with
ScaleLoader (16px), fix initial dark circle flash via
`animation-fill-mode: backwards`, smooth `animate-pulse` text instead of
shimmer gradient
- **ChatSidebar consolidation**: Reads `sessionToDelete` from Zustand
store instead of duplicating delete state/mutation locally
- **Auth error handling**: `getAuthHeaders()` throws on failure instead
of silently returning empty headers; 401 errors show user-facing toast
- **Stale closure fix**: Use refs for reconnect guards to avoid stale
closures during rapid reconnect cycles
- **Session switch resume**: Clear `hasResumedRef` on session switch so
returning to a session with an active stream auto-reconnects
- **Target session cache invalidation**: Invalidate the target session's
React Query cache on switch so `active_stream` is accurate for resume
- **Dedup hardening**: Content-fingerprint dedup resets on non-assistant
messages, preventing legitimate repeated responses from being dropped
- **Marker prefixes**: Hex-suffixed markers (`[__COPILOT_ERROR_f7a1__]`)
to prevent LLM false-positives
- **Code style**: Remove unnecessary `useCallback` wrappers per project
convention, replace unsafe `as` cast with runtime type guard

### Backend (minimal)

- **Faster heartbeat**: 10s → 3s interval to keep SSE alive through
proxies/LBs
- **Faster stall detection**: SSE subscriber queue timeout 30s → 10s
- **Marker prefixes**: Matching hex-suffixed prefixes for error/system
markers

## Test plan

- [ ] Verify SSE streams connect directly to backend (no Next.js proxy
in network tab)
- [ ] Verify reconnect works on transient disconnects (up to 3 attempts
with backoff)
- [ ] Verify auth failure shows user-facing toast
- [ ] Verify switching sessions and switching back shows messages and
resumes active stream
- [ ] Verify deleting a chat from sidebar works (shared Zustand state)
- [ ] Verify mobile drawer delete works (shared Zustand state)
- [ ] Verify thinking indicator shows ScaleLoader + pulsing text, no
dark circle flash
- [ ] Verify stopping a stream immediately unlocks the input and shows
"You manually stopped this chat"
- [ ] Verify marker prefix parsing still works with hex-suffixed
prefixes
- [ ] `pnpm format && pnpm lint && pnpm types` pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 19:56:24 +08:00
Zamil Majdy
1c51dd18aa fix(test): backdate UserBalance.updatedAt in test_block_credit_reset (#12236)
## Root cause

The test constructs \`month3\` using \`datetime.now().replace(month=3,
day=1)\` — hardcoded to **March of the real current year**. When
\`update(balance=400)\` runs, Prisma auto-sets \`updatedAt\` to the
**real wall-clock time**.

The refill guard in \`BetaUserCredit.get_credits\` is:
\`\`\`python
if (snapshot_time.year, snapshot_time.month) == (cur_time.year,
cur_time.month):
    return balance  # same month → skip refill
\`\`\`

This means the test only fails when run **during the real month of
March**, because the mocked \`month3\` and the real \`updatedAt\` both
land in March:

| Test runs in | \`snapshot_time\` (real \`updatedAt\`) | \`cur_time\`
(mocked month3) | Same? | Result |
|---|---|---|---|---|
| January 2026 | \`(2026, 1)\` | \`(2026, 3)\` |  | refill triggers  |
| February 2026 | \`(2026, 2)\` | \`(2026, 3)\` |  | refill triggers 
|
| **March 2026** | **\`(2026, 3)\`** | **\`(2026, 3)\`** | **** |
**skips refill ** |
| April 2026 | \`(2026, 4)\` | \`(2026, 3)\` |  | refill triggers  |

It would silently pass again in April, then fail again next March 2027.

## Fix

Explicitly pass \`updatedAt=month2\` when updating the balance to 400,
so the month2→month3 transition is correctly detected regardless of when
the test actually runs. This matches the existing pattern used earlier
in the same test for the month1 setup.

## Test plan
- [ ] \`pytest backend/data/credit_test.py::test_block_credit_reset\`
passes
- [ ] No other credit tests broken
2026-03-01 07:46:04 +00:00
Zamil Majdy
6f4f80871d feat(copilot): Langfuse SDK tracing for Claude Agent SDK path (#12228)
## Problem

The Copilot SDK path (`ClaudeSDKClient`) routes API calls through `POST
/api/v1/messages` (Anthropic-native endpoint). OpenRouter Broadcast
**silently excludes** this endpoint — it only forwards `POST
/api/v1/chat/completions` (OpenAI-compat) to Langfuse. As a result, all
SDK-path turns were invisible in Langfuse.

**Root cause confirmed** via live pod test: two HTTP calls (one per
endpoint), only the `/chat/completions` one appeared in Langfuse.

## Solution

Add **Langfuse SDK direct tracing** in `sdk/service.py`, wrapping each
`stream_chat_completion_sdk()` call with a `generation` observation.

### What gets captured per user turn
| Field | Value |
|---|---|
| `name` | `copilot-sdk-session` |
| `model` | resolved SDK model |
| `input` | user message |
| `output` | final accumulated assistant text |
| `usage_details.input` | aggregated input tokens (from
`ResultMessage.usage`) |
| `usage_details.output` | aggregated output tokens |
| `cost_details.total` | total cost USD |
| trace `session_id` | copilot session ID |
| trace `user_id` | authenticated user ID |
| trace `tags` | `["sdk"]` |

Token counts and cost are **aggregated** across all internal Anthropic
API calls in the session (tool-use turns included), sourced from
`ResultMessage.usage`.

### Implementation notes
- Span is opened via
`start_as_current_observation(as_type='generation')` before
`ClaudeSDKClient` enters
- Span is **always closed in `finally`** — survives errors,
cancellations, and user stops
- Fails open: any Langfuse init error is caught and logged at `DEBUG`,
tracing is disabled for that turn but the session continues normally
- Only runs when `_is_langfuse_configured()` returns true (same guard as
the non-SDK path)

## Also included

`reproduce_openrouter_broadcast_gap.py` — standalone repro script (no
sensitive data) demonstrating that `/api/v1/messages` is not captured by
OpenRouter Broadcast while `/api/v1/chat/completions` is. To be filed
with OpenRouter support.

## Test plan

- [ ] Deploy to dev, send a Copilot message via the SDK path
- [ ] Confirm trace appears in Langfuse with `tags=["sdk"]`, correct
`session_id`/`user_id`, non-zero token counts
- [ ] Confirm session still works normally when `LANGFUSE_PUBLIC_KEY` is
not set (no-op path)
- [ ] Confirm session still works on error/cancellation (span closed in
finally)
2026-02-27 16:26:46 +00:00
Ubbe
e8cca6cd9a feat(frontend/copilot): migrate ChatInput to ai-sdk prompt-input component (#12207)
## Summary

- **Migrate ChatInput** to composable `PromptInput*` sub-components from
AI SDK Elements, replacing the custom implementation with a boxy,
Claude-style input layout (textarea + footer with tools and submit)
- **Eliminate JS-based DOM height manipulation** (60+ lines removed from
`useChatInput.ts`) in favor of CSS-driven auto-resize via
`min-h`/`max-h`, fixing input sizing jumps (SECRT-2040)
- **Change stop button color** from red to black (`bg-zinc-800`) per
SECRT-2038, while keeping mic recording button red
- **Add new UI primitives**: `InputGroup`, `Spinner`, `Textarea`, and
`prompt-input` composition layer

### New files
- `src/components/ai-elements/prompt-input.tsx` — Composable prompt
input sub-components (PromptInput, PromptInputBody, PromptInputTextarea,
PromptInputFooter, PromptInputTools, PromptInputButton,
PromptInputSubmit)
- `src/components/ui/input-group.tsx` — Layout primitive with flex-col
support, rounded-xlarge styling
- `src/components/ui/spinner.tsx` — Loading spinner using Phosphor
CircleNotch
- `src/components/ui/textarea.tsx` — Base shadcn Textarea component

### Modified files
- `ChatInput.tsx` — Rewritten to compose PromptInput* sub-components
with InputGroup
- `useChatInput.ts` — Simplified: removed maxRows, hasMultipleLines,
handleKeyDown, all DOM style manipulation
- `useVoiceRecording.ts` — Removed `baseHandleKeyDown` dependency;
PromptInputTextarea handles Enter→submit natively

## Resolves
- SECRT-2042: Migrate copilot chat input to ai-sdk prompt-input
component
- SECRT-2038: Change stop button color from red to black

## Test plan
- [ ] Type a message and send it — verify it submits and clears the
input
- [ ] Multi-line input grows smoothly without sizing jumps
- [ ] Press Enter to send, Shift+Enter for new line
- [ ] Voice recording: press space on empty input to start, space again
to stop
- [ ] Mic button stays red while recording; stop-generating button is
black
- [ ] Input has boxy rectangular shape with rounded-xlarge corners
- [ ] Streaming: stop button appears during generation, clicking it
stops the stream
- [ ] EmptySession layout renders correctly with the new input
- [ ] Input is disabled during transcription with "Transcribing..."
placeholder

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 15:24:19 +00:00
Abhimanyu Yadav
bf6308e87c fix(frontend): truncate node output and fix dialog overflow (#12222)
## Summary
- **Truncate node output on canvas**: Input content now uses
`shortContent={true}` so it renders truncated text instead of full
content. Output items are capped to 3 per pin with `.slice(0, 3)`.
- **Increase truncation limit**: `TextRenderer` truncation raised from
100 to 200 characters for better readability.
- **Fix dialog content overflow**: Removed legacy `ScrollArea` from the
Full Preview dialog (`NodeDataViewer`) — it was preventing proper width
constraint, causing JSON/code content to overflow beyond the dialog
boundary. Replaced with a simple flex container that respects the
dialog's width.
- **Reposition action buttons**: Copy/download buttons moved from
right-side/absolute positioning to below the content, aligned left, for
better layout with horizontally-scrollable content.
- **Add overflow protection to ContentRenderer**: Added
`overflow-hidden` and `pre` word-wrap rules to prevent content from
breaking out of the node card on the canvas.

## Test plan
- [x] Open a node with long JSON output on the builder canvas — verify
content is truncated
- [x] Click the expand button to open "Full Output Preview" dialog —
verify content stays within dialog bounds and scrolls horizontally if
needed
- [x] Verify copy/download buttons appear below the content,
left-aligned
- [x] Check that input data also shows truncated on the canvas node
- [x] Verify output items are capped at 3 per pin on the canvas

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-02-27 10:53:16 +00:00
Otto
4e59143d16 Add plans/ to .gitignore (#12229)
Requested by @torantula

Adds `plans/` to `.gitignore` and removes one existing tracked plan
file.
2026-02-27 10:12:46 +00:00
Reinier van der Leer
d5efb6915b dx: Sync types & dependencies on pre-commit and post-checkout (#12211)
Our pre-commit hooks can use an update: the type check often fails based
on stale type definitions, the OpenAPI schema isn't synced/checked, and
the pre-push checks aren't installed by default.

### Changes 🏗️

- Regenerate Prisma `.pyi` type stub in on `prisma generate` hook:
Pyright prefers `.pyi` over `.py`, so a stale stub shadows the
regenerated `types.py`
- Also run setup hooks (dependency install, `prisma generate`, `pnpm
generate:api`) on `post-checkout`, to keep types and packages in sync
after switching branches
- Switch these hooks to `git diff` checks because `post-checkout`
doesn't support file triggers/filters
- Add `Check & Install dependencies - AutoGPT Platform - Frontend` hook
- Add `Sync API types - AutoGPT Platform - Backend -> Frontend` hook
- Fix non-ASCII issue in `export-api-schema` (`ensure_ascii=False`)
- Exclude `pnpm-lock.yaml` from `detect-secrets` hook (integrity hashes
cause ~1800 false positives)
- Add `default_stages: [pre-commit]`
- Add `post-checkout`, `pre-push` to `default_install_hook_types`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Tested locally
2026-02-26 22:28:59 +01:00
Nicholas Tindle
b9aac42056 Merge branch 'master' into dev 2026-02-26 13:39:34 -06:00
Otto
95651d33da feat(backend): add fpdf2 dependency for PDF operations in copilot executor (#12216)
Requested by @majdyz

Adds [fpdf2](https://github.com/py-pdf/fpdf2) (v2.8.6) to backend
dependencies to enable PDF generation and manipulation in the copilot
executor.

fpdf2 is a lightweight PDF generation library (no external dependencies,
pure Python) that allows creating PDFs with text, images, tables, and
more.
2026-02-26 18:21:34 +00:00
Zamil Majdy
b30418d833 fix(copilot): inject working directory into SDK prompt + workspace download links (#12215)
## Summary

- Replaces the static `_SDK_TOOL_SUPPLEMENT` placeholder path with
`_build_sdk_tool_supplement(cwd: str)` that injects the session-specific
working directory
- `sdk_cwd` is computed once via `_make_sdk_cwd(session_id)`,
`os.makedirs` is called after lock acquisition (inside the protected
`try/finally`), and the same variable is used everywhere — no drift
between prompt and execution directory
- Added `ValueError`/`OSError` error handling for cwd preparation with
proper `StreamError` emission
- Teaches the SDK agent how to share workspace files with the user via
`workspace://` Markdown links (images render inline, videos render with
player controls, other files as download links)
- `WorkspaceWriteResponse` now includes `download_url` (pre-formatted
`workspace://file_id#mime` string) and a normalised `mime_type` field
(MIME parameters stripped, lowercased)
- Frontend: workspace `workspace://` regular links now resolve to
absolute URLs so Streamdown's "Copy link" copies the full URL
- Frontend: Streamdown's "Open link" button colour overridden to match
the design system (violet accent) — previously near-invisible in dark
mode due to `--primary` resolving to near-white

## Motivation

The SDK agent was seeing a hardcoded placeholder path in the system
prompt instead of the real working directory, causing it to reference
wrong paths in tool calls. Additionally, there was no guidance for the
agent on how to share files it writes to the workspace with the user in
chat.

## Test plan

- [ ] CI green (test 3.11 / 3.12 / 3.13)
- [ ] Start a copilot session with `CHAT_USE_CLAUDE_AGENT_SDK=true` and
verify the agent references the correct `sdk_cwd` path in its tool calls
- [ ] Ask the agent to write a file and confirm it responds with a
clickable download link / inline image using the `workspace://` syntax
- [ ] Verify the "Open link" button in the Streamdown external-link
dialog is visible in both light and dark mode
- [ ] Click "Copy link" on a workspace file link and confirm it copies
the full URL (including host)
2026-02-26 17:26:19 +00:00
Otto
ed729ddbe2 feat(copilot): Wait for agent execution completion (#12147)
Adds the ability for CoPilot to wait for agent execution to complete
before returning results.

Closes SECRT-2003.

## Changes

### New: `execution_utils.py`
- `wait_for_execution()` — uses Redis pubsub to wait for execution to
reach terminal state
- `TERMINAL_STATUSES` — shared frozenset of completed/failed/terminated
- `PAUSED_STATUSES` — handles REVIEW (human-in-the-loop) as a
stop-waiting state
- `get_execution_outputs()` — helper to extract outputs

### `run_agent.py`
- New `wait_for_result` parameter (0-300 seconds)
- When >0, waits for execution to complete and returns outputs directly
- Handles completed, failed, terminated, review, and timeout states with
appropriate responses

### `agent_output.py` (view_agent_output)
- New `wait_if_running` parameter (0-300 seconds)
- Includes running/queued/review executions when waiting is requested
- Status-aware response messages (completed, failed, running, review,
etc.)

## How it works
1. After starting execution, subscribes to Redis pubsub channel for
execution events
2. Re-checks DB after subscribing to close the race window
3. `asyncio.wait_for` enforces the timeout
4. On completion: returns full outputs via `AgentOutputResponse`
5. On timeout: returns current state with guidance to check again later
6. On error/terminated: returns `ErrorResponse` with details
7. Redis connection always cleaned up via `finally` block

## Testing

- [x] Run an agent with `wait_for_result=0` — should return immediately
with execution ID (existing behavior)
- [x] Run a fast agent with `wait_for_result=60` — should return
completed outputs
- [x] Run a slow agent with `wait_for_result=5` — should timeout and
return current status
- [x] Use `view_agent_output` with `wait_if_running=0` on a completed
execution — should return outputs
- [x] Use `view_agent_output` with `wait_if_running=30` on a running
execution — should wait and return
- [ ] ~~Verify Redis connections are cleaned up (no leaked pubsub
connections after timeout)~~
- [ ] ~~Test with a failed execution — should return error response~~
- [ ] ~~Test with a terminated execution — should return error response
(not "still running")~~

## Collaboration

This PR was developed in collaboration with @Pwuts.

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-26 16:41:33 +00:00
Otto
8c7030af0b fix(copilot): handle 'all' keyword in find_library_agent tool (#12138)
When users ask CoPilot to "show all my agents" or similar, the LLM was
passing the literal string "all" as a search query to
`find_library_agent`, which matched no agents because there's no agent
named "all". (issue:
[SECRT-2002](https://linear.app/autogpt/issue/SECRT-2002))

## Changes

- **Make `query` parameter optional** in `FindLibraryAgentTool` - users
can now omit it to list all agents
- **Add special keyword handling** - keywords like "all", "*",
"everything", "any", or empty string are treated as "list all" rather
than literal searches
- **Update response messages** - differentiate between "listing all
agents" vs "searching for X"

## Example

Before:
```
User: Show me all my agents
CoPilot: find_library_agent(query="all")
Result: No agents matching 'all' found in your library
```

After:
```
User: Show me all my agents  
CoPilot: find_library_agent(query="all") OR find_library_agent()
Result: Found 5 agents in your library
```

## Testing

- [x] Test with "show me all my agents" prompt
- [x] Test with empty query
- [x] Test with specific search terms (should still work as before)

## Collaboration

This PR was developed in collaboration with @Pwuts.
2026-02-26 16:07:40 +00:00
Otto
195b14286a fix(frontend): fix Streamdown link safety modal and add origin check (#12209)
Requested by @ntindle

Fixes the Streamdown link safety modal in CoPilot with three changes:

**1. Fix invisible "Open link" button (HIGH)**
Added Streamdown's dist directory to the Tailwind content scan in
`tailwind.config.ts`. Previously, Tailwind was only scanning
`./src/**/*.{ts,tsx}`, so classes used by Streamdown's internal modal
components (like `bg-primary`, `text-primary-foreground`,
`hover:bg-primary/90`) were being purged. The "Open link" button
rendered invisible but remained clickable.

**2. Add same-origin URL whitelist (MEDIUM)**
Configured `linkSafety.onLinkCheck` on the `<Streamdown>` component in
`message.tsx` to whitelist same-origin URLs. Previously, ALL links
(including internal `/api/proxy/...` workspace download URLs) triggered
the "Open external link?" modal. Now same-origin links open directly.

**3. Add Storybook stories (LOW)**
Added `message.stories.tsx` with stories covering default messages, user
messages, internal/external links, the link safety modal, and
conversations.

### Testing
- [ ] Open link safety modal → "Open link" button is visible with proper
styling
- [ ] Click a workspace download link → opens directly (no modal)
- [ ] Click an external link → shows safety modal
- [ ] Verify in both light and dark mode
- [ ] Verify on mobile viewport
- [ ] Storybook stories render correctly

Fixes SECRT-2044
2026-02-26 15:19:54 +00:00
Zamil Majdy
29ca034e40 fix(backend/frontend): error handling, stream reconnection, and chat switching (#12205)
## Problem

CoPilot executions were experiencing:
1. **Duplicate error markers** - Both `execute()` and `_execute_async()`
called `mark_session_completed`, sending duplicate completion markers
2. **RuntimeError bypass** - RuntimeErrors that weren't SDK cleanup
issues bypassed error persistence logic
3. **Generic error messages** - StreamError showed "An error occurred"
instead of actual error text
4. **Empty chat on reconnect** - Messages cleared immediately when
reconnecting, before new messages arrived
5. **Stream not resuming** - Switching chats (A → B → A) didn't resume
active streams due to stale `hasResumedRef`
6. **Excessive diagnostic logging** - 60+ lines of STREAM_DIAG console
logs not needed in production

## Changes 🏗️

### 1. Consolidated Exception Handling
**Files:** `backend/copilot/executor/processor.py`,
`backend/copilot/sdk/service.py`

**processor.py:**
- Removed all error handling from `execute()` method
- Kept error handling only in `_execute_async()` where work happens
- Merged `CancelledError` and `BaseException` handlers into single catch
- Uses `isinstance()` to determine error message

**service.py:**
- Merged `CancelledError` and `Exception` handlers into single catch
- Moved RuntimeError check inside main Exception handler
- Prevents non-cancel-scope RuntimeErrors from bypassing error
persistence

**Impact:** Eliminates duplicate `mark_session_completed` calls, ~70
lines of code removed

---

### 2. Fixed StreamError Message
**File:** `backend/copilot/sdk/service.py`

- Changed from generic `"An error occurred. Please try again."`
- Now shows actual error: `errorText=error_msg`
- Provides real error details to frontend during active stream

---

### 3. Deferred Message Clearing on Reconnect
**File:** `frontend/src/app/(platform)/copilot/useCopilotPage.ts`

- Added `shouldClearOnNextMessageRef` flag
- Set flag when reconnect starts
- Clear old assistant messages only AFTER first new message arrives
- Prevents empty chat flicker during reconnection

---

### 4. Fixed Chat Switching Stream Resume
**File:** `frontend/src/app/(platform)/copilot/useCopilotPage.ts`

**Problem:** When switching Chat A → B → A, the stream didn't resume
because `hasResumedRef.current.get(sessionId)` was still `true`

**Fix:** Clear `hasResumedRef` entry when navigating away from session

**Flow now:**
1. In Chat A with active stream
2. Switch to Chat B → clears `hasResumedRef` for Chat A
3. Switch back to Chat A → `hasResumedRef` is false → resumes stream 

---

### 5. Removed Diagnostic Logging
**Files:** `frontend/useCopilotPage.ts`, `frontend/useChatSession.ts`,
`backend/stream_registry.py`, `backend/processor.py`,
`backend/routes.py`

- Removed all `[STREAM_DIAG]` console logs (60+ lines)
- Logs were useful for debugging but not needed in production
- Cleaner codebase, reduced noise in logs

---

### 6. Exception Handling Order Consistency
**File:** `backend/copilot/executor/processor.py`

- Made both CancelledError and regular exception branches follow same
pattern
- Set `error_msg` before logging in both cases
- Consistent code structure

---

## Architecture Quality: **9/10**

**Strengths:**
- Eliminated duplicate completion markers
- All RuntimeErrors now get proper error persistence
- Real error messages shown to users
- Stream resume works reliably when switching chats
- Cleaner codebase with diagnostic logs removed
- Consistent exception handling patterns

**Trade-offs:**
- Message clearing deferred means brief period with stale + new messages
(acceptable, prevents empty chat)

## Test Plan

- [x] Verify no duplicate completion markers sent
- [x] Trigger RuntimeError, verify error persists
- [x] Check StreamError shows actual error message
- [x] Reconnect, verify chat doesn't go empty
- [x] Switch Chat A → B → A with active stream, verify resume works
- [x] Verify no STREAM_DIAG logs in console
- [x] Run `pnpm format && pnpm lint && pnpm types` - all passed
- [x] Run `poetry run format` - all passed
- [ ] Test in production

## Checklist 📋

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan  
- [x] I have tested my changes according to the test plan
- [x] `.env.default` is updated or compatible (no config changes)
- [x] `docker-compose.yml` is updated or compatible (no config changes)
2026-02-26 13:32:25 +00:00
Reinier van der Leer
1d9dd782a8 feat(backend/api): Add POST /graphs endpoint to external API (#12208)
- Resolves [SECRT-2031: Add upload agent to Library endpoint on external
API](https://linear.app/autogpt/issue/SECRT-2031)

### Changes 🏗️

- Add `POST /graphs` to v1 external API
- Add support for requiring multiple scopes in `require_permission`
middleware
- Add `WRITE_GRAPH` and `WRITE_LIBRARY` API permission scopes

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test `POST /graphs` endpoint through `/docs` Swagger UI
2026-02-26 12:54:39 +01:00
Krzysztof Czerwinski
a1cb3d2a91 feat(blocks): Add Telegram blocks (#12141)
Add Telegram blocks that allow the use of [Telegram bots' API
features](https://core.telegram.org/bots/features).

### Changes 🏗️

1. Credentials & API layer: Bot token auth via `APIKeyCredentials`,
helper functions for JSON API calls (call_telegram_api) and multipart
file uploads (call_telegram_api_with_file)
2. Trigger blocks:
- `TelegramMessageTriggerBlock` — receives messages (text, photo, voice,
audio, document, video, edited message) with configurable event filters
- `TelegramMessageReactionTriggerBlock` — fires on reaction changes
(private chats auto, groups require admin)
2. Action blocks (11 total):
  - Send: Message, Photo, Voice, Audio, Document, Video
  - Reply to Message, Edit Message, Delete Message
  - Get File (download by file_id)
3. Webhook manager: Registers/deregisters webhooks via Telegram's
setWebhook API, validates incoming requests using
X-Telegram-Bot-Api-Secret-Token header
4. Provider registration: Added TELEGRAM to ProviderName enum and
registered `TelegramWebhooksManager`
5. Media send blocks support both URL passthrough (Telegram fetches
directly) and file upload for workspace/data URI inputs

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Non-AI UUIDs
  - [x] Blocks work correctly
    - [x] SendTelegramMessageBlock
    - [x] SendTelegramPhotoBlock
    - [x] SendTelegramVoiceBlock
    - [x] SendTelegramAudioBlock
    - [x] SendTelegramDocumentBlock
    - [x] SendTelegramVideoBlock
    - [x] ReplyToTelegramMessageBlock
    - [x] GetTelegramFileBlock
    - [x] DeleteTelegramMessageBlock
    - [x] EditTelegramMessageBlock
    - [x] TelegramMessageTriggerBlock (works for every trigger type)
    - [x] TelegramMessageReactionTriggerBlock

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-26 10:25:08 +00:00
Otto
1b91327034 fix(builder): Show X button on edge line hover, not just button hover (#12083)
## Summary

Fixes the issue where the X button for removing connections between
nodes only appears when hovering directly over the button itself. Users
now see the button when hovering anywhere on the connection line.

## Changes

- Added an invisible interaction path along the edge with a 20px stroke
width
- The path triggers the same hover state as the button
- This makes the X button visible when hovering the line OR the button
- Preserves existing behavior for broken edges (always visible)

## Testing

1. Hover over an edge line (not the button) → X button should appear
2. Move from line to button → button should stay visible  
3. Move away from both → button should fade out
4. Broken edges should still show X button always

## Linear

Fixes SECRT-1943

## Screenshots

This is a UX improvement - no visual changes except the button now
appears on line hover.

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

This PR improves the UX for edge deletion by adding an invisible
interaction path with a 20px stroke width that makes the delete button
(X) appear when hovering anywhere along the connection line, not just
when hovering directly over the button.

**Key Changes:**
- Added invisible `<path>` element before `BaseEdge` with
`stroke="transparent"` and `strokeWidth={20}`
- Path has `onMouseEnter` and `onMouseLeave` handlers that trigger the
same `setIsHovered` state used by the delete button
- Delete button visibility logic remains unchanged: fades in when
`isHovered` is true (or always visible for broken edges)
- Works uniformly for all edge types (regular, static, and broken edges)

**How It Works:**
The invisible path creates a wider hit area (20px) around the edge
curve, making it much easier for users to trigger the hover state. When
the mouse enters this area, `isHovered` becomes true, which causes the
delete button to fade in (via the existing opacity transition logic).
The button itself also has hover handlers, so moving from the line to
the button maintains the visible state smoothly.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk - it's a small, focused UX
improvement with no logic changes
- The implementation is clean and focused: adds only 9 lines of code,
uses existing state management (`isHovered`), and doesn't modify any
deletion logic. The invisible path is a standard SVG/React pattern for
expanding hit areas, and the approach is consistent with how the delete
button already handles hover events. No breaking changes, no side
effects.
- No files require special attention
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
2026-02-26 10:02:01 +00:00
Krzysztof Czerwinski
c7cdb40c5b feat(platform): Update new builder search (#11806)
### Changes 🏗️

- Add materialized view for suggested blocks
- Make `search` in builder accept comma separated filter list in query
- Remove Otto suggestions
- Use hybrid search for blocks search in builder
- Exclude `AgentExecutorBlocks` from builder
- Remove `Block` suffix from builder items

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Materialized view function works (when running manually)
- [x] Higher execution count blocks are shown first in "suggested
blocks" (uses materialized view)
  - [x] Hybrid search works
- [x] `AgentExecutorBlocks` doesn't appear on search results and in
blocks list
  - [x] `Block` suffix isn't displayed on blocks names in builder items

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-26 09:56:40 +00:00
Nicholas Tindle
77fb4419d0 Handle workspace:// URLs in regular markdown links (#12166)
### Changes 🏗️

Extended the `resolveWorkspaceUrls` function to handle both image syntax
(`![alt](workspace://id#mime)`) and regular link syntax
(`[text](workspace://id)`).

Previously, only image links were being resolved. Regular workspace
links were being blocked by Streamdown's rehype-harden sanitizer because
`workspace://` is not in the allowed URL-scheme whitelist, causing
"[blocked]" to appear next to link text.

The fix:
- Refactored the function to process image links first (existing
behavior)
- Added a second regex replacement to handle regular links using a
negative lookbehind (`(?<!!)`) to avoid matching image syntax
- Both patterns now resolve `workspace://` URLs to proxy download URLs
via `/api/proxy`
- Updated JSDoc comments to clarify the dual handling

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified image links with MIME type hints still resolve correctly
- [x] Verified regular workspace links now resolve to proxy URLs instead
of being blocked
- [x] Confirmed negative lookbehind prevents double-processing of image
syntax

https://claude.ai/code/session_0184TVJJcEoB8wbX9htCnv4b

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Low risk: a small, localized frontend markdown preprocessing change
that only rewrites `workspace://` URLs to existing `/api/proxy` download
URLs; main risk is regex edge cases affecting link rendering.
> 
> **Overview**
> Updates `resolveWorkspaceUrls` in `ChatMessagesContainer` to rewrite
**both** `workspace://` image markdown and regular markdown links into
`/api/proxy` download URLs so Streamdown sanitization no longer shows
`[blocked]` for workspace links.
> 
> Image handling is preserved (including `#video/*` MIME hints via
`video:` alt text), and a second regex pass with a negative lookbehind
avoids double-processing image syntax when rewriting plain links.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e17749b72c. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Ubbe <hi@ubbe.dev>
Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-02-25 12:33:10 +00:00
Bently
9f002ce8f6 fix(frontend): improve UX for expired or duplicate password reset links (#12123)
## Summary
Improves the user experience when a password reset link has expired or
been used, replacing the confusing generic error with a clean, helpful
message.

## Changes
- Added `ExpiredLinkMessage` component that displays a user-friendly
error state
- Updated reset password page to detect expired/used links from:
- Supabase error format
(`error=access_denied&error_code=otp_expired&error_description=...`)
  - Internal clean format (`error=link_expired`)
- Enhanced callback route to detect and map expired/invalid link errors
- Clear, actionable UI with:
  - Friendly error message explaining what happened
  - "Send Me a New Link" button to request a new reset email
  - Login link for users who already have access

## Before
Users saw a confusing URL with error parameters and an unclear form:
```
/reset-password?error=access_denied&error_code=otp_expired&error_description=Email+link+is+invalid+or+has+expired
```

## After
Users see a clean, helpful message explaining the issue and how to fix
it.

<img width="548" height="454" alt="image"
src="https://github.com/user-attachments/assets/e867e522-146c-4d43-91b3-9e62d2957f95"
/>


Closes SECRT-1369

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [ ] Navigate to `/reset-password?error=link_expired` and verify the
ExpiredLinkMessage component appears
  - [ ] Click "Send Me a New Link" and verify the email form appears
- [ ] Navigate to
`/reset-password?error=access_denied&error_code=otp_expired` and verify
same behavior

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

Improved password reset UX by adding an `ExpiredLinkMessage` component
that displays when users follow expired or already-used reset links. The
implementation detects expired link errors from Supabase
(`error_code=otp_expired`) and internal format (`error=link_expired`),
replacing confusing URL parameters with a clean message.

**Key changes:**
- Added error detection logic in both callback route and reset password
page to identify expired/invalid links
- Created new `ExpiredLinkMessage` component with friendly messaging
- Enhanced error handling to differentiate between expired links and
other errors

**Issues found:**
- The "Send Me a New Link" button misleadingly suggests it will send an
email, but it only reveals the email form - user must still enter email
and submit
- `access_denied` error detection may be too broad and could incorrectly
classify non-expired errors as expired links
</details>


<details><summary><h3>Confidence Score: 3/5</h3></summary>

- This PR improves UX but has logic issues that could mislead users
- The implementation correctly detects expired links and displays
helpful UI, but the "Send Me a New Link" button doesn't actually send an
email (just shows the form), which creates a misleading user experience.
Additionally, the `access_denied` error check is overly broad and could
incorrectly classify errors. These are functional issues that should be
addressed before merge.
- Pay close attention to `page.tsx` - the `handleSendNewLink` function
and error detection logic need refinement
</details>


<details><summary><h3>Flowchart</h3></summary>

```mermaid
flowchart TD
    Start[User clicks reset link with code] --> Callback[API: /auth/callback/reset-password]
    Callback --> CheckCode{Code valid?}
    
    CheckCode -->|No - expired/invalid/used| DetectError[Detect error type]
    DetectError --> CheckExpired{Contains expired/<br/>invalid/otp_expired/<br/>already/used?}
    CheckExpired -->|Yes| RedirectExpired[Redirect to /reset-password?error=link_expired]
    CheckExpired -->|No| RedirectOther[Redirect to /reset-password?error=message]
    
    CheckCode -->|Yes| RedirectSuccess[Redirect to /reset-password]
    
    RedirectExpired --> PageLoad[Page: /reset-password]
    RedirectOther --> PageLoad
    RedirectSuccess --> PageLoad
    
    PageLoad --> ParseParams[Parse URL params]
    ParseParams --> CheckErrorParams{Has error or<br/>error_code?}
    
    CheckErrorParams -->|Yes| CheckExpiredParams{error=link_expired OR<br/>errorCode=otp_expired OR<br/>error=access_denied OR<br/>description contains<br/>expired/invalid?}
    CheckExpiredParams -->|Yes| ShowExpired[Show ExpiredLinkMessage]
    CheckExpiredParams -->|No| ShowToast[Show error toast]
    
    CheckErrorParams -->|No| CheckUser{User<br/>authenticated?}
    
    ShowExpired --> ClickButton[User clicks 'Send Me a New Link']
    ClickButton --> HideExpired[setShowExpiredMessage false]
    HideExpired --> ShowForm[Show email form]
    
    ShowToast --> ClearParams[Clear error params from URL]
    ClearParams --> CheckUser
    
    CheckUser -->|Yes| ShowPasswordForm[Show password change form]
    CheckUser -->|No| ShowForm[Show email form]
```
</details>


<sub>Last reviewed commit: 80e9f40</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-25 12:11:55 +00:00
Ubbe
74691076c6 fix(frontend/copilot): show clarification and agent-saved cards without accordion (#12204)
### Background

The CoPilot tool UI wraps several output cards (clarification questions,
agent saved confirmation) inside a collapsible `ToolAccordion`. This
means users have to expand the accordion to see important interactive
content — clarification questions they need to answer, or confirmation
that their agent was created/updated.

### Changes 🏗️

- **Clarification questions always visible**: Moved
`ClarificationQuestionsCard` out of the `ToolAccordion` in both
`CreateAgent` and `EditAgent` tools so users immediately see and can
answer questions without expanding an accordion
- **Agent saved card always visible**: Moved the agent-saved
confirmation card out of the `ToolAccordion` in both tools so the
success state with library/builder links is immediately visible
- **Extracted `AgentSavedCard` component**: The agent-saved card was
duplicated between `CreateAgent` and `EditAgent` — extracted it into a
shared `copilot/components/AgentSavedCard/AgentSavedCard.tsx` component,
parameterized by `message` ("has been saved to your library!" vs "has
been updated!")
- **ClarificationQuestionsCard polish**: Updated spacing, icon
(`ChatTeardropDotsIcon`), typography variants, border styles, and number
badge sizing for a cleaner look
- **Minor atom tweaks**: Lightened `secondary` button variant
(`zinc-200` → `zinc-100`), changed textarea border radius from
`rounded-3xl` to `rounded-xl`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `pnpm format` passes
  - [x] `pnpm lint` passes
  - [x] `pnpm types` passes
- [ ] Create an agent via CoPilot and verify the saved card shows
without accordion
- [ ] Trigger clarification questions and verify they show without
accordion
- [ ] Edit an agent via CoPilot and verify the updated card shows
without accordion
- [ ] Verify the ClarificationQuestionsCard styling looks correct
(spacing, icons, borders)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 18:13:01 +07:00
Otto
b15ad0df9b hotfix(frontend): fix null credits TypeError on /copilot (#12202)
Requested by @majdyz

Fix `TypeError: Cannot read properties of null (reading 'credits')` on
the /copilot page.

**Sentry:**
[BUILDER-71P](https://significant-gravitas.sentry.io/issues/7256025912/)
**Linear:** SENTRY-1110

## Root Cause

Two issues combined:

1. **`getUserCredit()` had a broken try/catch** — it wasn't `await`ing
`_get()`, so async errors (including null responses) were never caught
2. **`_makeClientRequest` returns `null` during logout** — when a user
is logging out and `/credits` races with auth teardown, the response is
`null`

Chain: logout starts → `/credits` fetch races → auth error →
`_makeClientRequest` returns `null` → `getUserCredit` passes `null`
through → `fetchCredits` does `null.credits` → 💥

## Fix

- `getUserCredit()`: Add `await` + null coalescing fallback to `{
credits: 0 }`
- `fetchCredits()`: Add optional chaining guard (`response?.credits ??
null`)
2026-02-25 10:38:08 +00:00
Abhimanyu Yadav
2136defea8 feat(library): implement folder organization system for agents (#12101)
### Changes 🏗️

This PR adds folder organization capabilities to the library, allowing
users to organize their agents into folders:

- Added new `LibraryFolder` model and database schema
- Created folder management API endpoints for CRUD operations
- Implemented folder tree structure with proper parent-child
relationships
- Added drag-and-drop functionality for moving agents between folders
- Created folder creation dialog with emoji picker for folder icons
- Added folder editing and deletion capabilities
- Implemented folder navigation in the library UI
- Added validation to prevent circular references and excessive nesting
- Created animation for favoriting agents
- Updated library agent list to show folder structure
- Added folder filtering to agent list queries

<img width="1512" height="950" alt="Screenshot 2026-02-13 at 9 08 45 PM"
src="https://github.com/user-attachments/assets/78778e03-4349-4d50-ad71-d83028ca004a"
/>

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Create a new folder with custom name, icon, and color
  - [x] Move agents into folders via drag and drop
  - [x] Move agents into folders via context menu
  - [x] Navigate between folders
  - [x] Edit folder properties (name, icon, color)
  - [x] Delete folders and verify agents return to root
  - [x] Verify favorite animation works when adding to favorites
  - [x] Test folder navigation with search functionality
  - [x] Verify folder tree structure is maintained

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

This PR implements a comprehensive folder organization system for
library agents, enabling hierarchical structure up to 5 levels deep.

**Backend Changes:**
- Added `LibraryFolder` model with self-referential hierarchy
(`parentId` → `Parent`/`Children`)
- Implemented CRUD operations with validation for circular references
and depth limits (MAX_FOLDER_DEPTH=5)
- Added `folderId` foreign key to `LibraryAgent` table
- Created folder management endpoints: list, get, create, update, move,
delete, and bulk agent moves
- Proper soft-delete cascade handling for folders and their contained
agents

**Frontend Changes:**
- Created folder creation/edit/delete dialogs with emoji picker
integration
- Implemented folder navigation UI with breadcrumbs and folder tree
structure
- Added drag-and-drop support for moving agents between folders
- Created context menu for agent actions (move to folder, remove from
folder)
- Added favorite animation system with `FavoriteAnimationProvider`
- Integrated folder filtering into agent list queries

**Key Features:**
- Folders support custom names, emoji icons, and hex colors
- Unique constraint per parent folder per user prevents duplicate names
- Validation prevents circular folder hierarchies and excessive nesting
- Agents can be moved between folders via drag-drop or context menu
- Deleting a folder soft-deletes all descendant folders and contained
agents
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- This PR is safe to merge with minor considerations for performance
optimization
- The implementation is well-structured with proper validation, error
handling, and database constraints. The folder hierarchy logic correctly
prevents circular references and enforces depth limits. However, there
are some performance concerns with N+1 queries in depth calculation and
circular reference checking that could be optimized for deeply nested
hierarchies. The foreign key constraint (ON DELETE RESTRICT) conflicts
with the hard-delete code path but shouldn't cause issues since
soft-deletes are the default. The client-side duplicate validation is
redundant but not harmful.
- Pay close attention to migration file (foreign key constraint) and
db.py (performance of recursive queries)
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant API
    participant DB

    User->>Frontend: Create folder with name/icon/color
    Frontend->>API: POST /v2/folders
    API->>DB: Validate parent exists & depth limit
    API->>DB: Check unique constraint (userId, parentId, name)
    DB-->>API: Folder created
    API-->>Frontend: LibraryFolder response
    Frontend-->>User: Show success toast

    User->>Frontend: Drag agent to folder
    Frontend->>API: POST /v2/folders/agents/bulk-move
    API->>DB: Verify folder exists
    API->>DB: Update LibraryAgent.folderId
    DB-->>API: Agents updated
    API-->>Frontend: Updated agents
    Frontend-->>User: Refresh agent list

    User->>Frontend: Navigate into folder
    Frontend->>API: GET /v2/library/agents?folder_id=X
    API->>DB: Query agents WHERE folderId=X
    DB-->>API: Filtered agents
    API-->>Frontend: Agent list
    Frontend-->>User: Display folder contents

    User->>Frontend: Delete folder
    Frontend->>API: DELETE /v2/folders/{id}
    API->>DB: Get descendant folders recursively
    API->>DB: Soft-delete folders + agents in transaction
    DB-->>API: Deletion complete
    API-->>Frontend: 204 No Content
    Frontend-->>User: Show success toast
```
</details>


<sub>Last reviewed commit: a6c2f64</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-24 15:04:56 +00:00
Zamil Majdy
6e61cb103c fix(copilot): workspace file listing fix (#12190)
Requested by @majdyz

Improves workspace file display in GenericTool:
- Base64 content decoding for workspace files
- Rich file object rendering (path, size, mime type)
- MCP text extraction from SDK tool responses (Read, Glob, Grep, Edit)
- Better file list formatting for both string and object file entries

---------

Co-authored-by: Otto (AGPT) <otto@agpt.co>
2026-02-24 12:33:24 +00:00
Zamil Majdy
0e72e1f5e7 fix(platform/copilot): fix stuck sessions, stop button, and StreamFinish reliability (#12191)
## Summary

- **Fix stuck sessions**: Root cause was `_stream_listener` infinite
xread loop when Redis session metadata TTL expired — `hget` returned
`None` which bypassed the `status != "running"` break condition. Fixed
by treating `None` status as non-running.
- **Fix stop button reliability**: Cancel endpoint now force-completes
via `mark_session_completed` when executor doesn't respond within 5s.
Returns `cancelled=True` for already-expired sessions.
- **Single-owner StreamFinish**: All `yield StreamFinish()` removed from
service layers (sdk/service.py, service.py, dummy.py).
`mark_session_completed` is now the single atomic source of truth for
publishing StreamFinish via Lua CAS script.
- **Rename task → session/turn**: Consistent terminology across
stream_registry and processor.
- **Frontend session refetch**: Added `refetchOnMount: true` so page
refresh re-fetches session state.
- **Test fixes**: Updated e2e, service, and run_agent tests for
StreamFinish removal; fixed async fixture decorators.

## Test plan
- [x] E2E dummy streaming tests pass (13 passed, 1 xfailed)
- [x] run_agent_test.py passes (async fixture decorator fix)
- [x] service_test.py passes (StreamFinish assertions removed)
- [ ] Manual: verify stuck sessions recover on page refresh
- [ ] Manual: verify stop button works for active and expired sessions
- [ ] Manual: verify no duplicate StreamFinish events in SSE stream
2026-02-24 10:49:22 +00:00
Swifty
163b0b3c9d feat(backend): pre-populate CoPilotUnderstanding from Tally form on signup (#12119)
When new users sign up, check if they previously filled out the Tally
beta application form and, if so, pre-populate their
CoPilotUnderstanding with business data extracted from that form. This
gives the CoPilot (Otto) immediate context about the user on their very
first chat interaction.

### Changes 🏗️

- **`backend/util/settings.py`**: Added `tally_api_key` to `Secrets`
class
- **`backend/.env.default`**: Added `TALLY_API_KEY=` env var entry
- **`backend/data/tally.py`** (new): Core Tally integration module
- Redis-cached email index of form submissions (1h TTL) with incremental
refresh via `startDate`
  - Paginated Tally API fetching with Bearer token auth
  - Email matching (case-insensitive) against submission data
- LLM extraction (gpt-4o-mini via OpenRouter) of
`BusinessUnderstandingInput` fields
  - Fire-and-forget orchestrator that is idempotent and never raises
- **`backend/api/features/v1.py`**: Added background task in
`get_or_create_user_route` to trigger Tally lookup on login (skips if
understanding already exists)
- **`backend/data/tally_test.py`** (new): 15 unit tests covering index
building, email case-insensitivity, cache hit/miss, format helpers,
idempotency, graceful degradation, and error resilience

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] All 15 unit tests pass (`poetry run pytest
backend/data/tally_test.py --noconftest -xvs`)
  - [x] Lint clean (`poetry run ruff check` on changed files)
  - [x] Type check clean (`poetry run pyright` on new files)
- [ ] Manual: Set `TALLY_API_KEY` in `.env`, create a new user, verify
CoPilotUnderstanding is populated
- [ ] Manual: Verify user creation succeeds when Tally API key is
missing or API is down

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
- Added `TALLY_API_KEY=` to `.env.default` (optional, empty by default —
feature is a no-op without it)

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

This PR adds a Tally form integration that pre-populates
`CoPilotUnderstanding` for new users by matching their signup email
against cached Tally beta application form submissions, then using an
LLM (gpt-4o-mini via OpenRouter) to extract structured business data.

- **New module `tally.py`** implements Redis-cached email indexing of
Tally form submissions with incremental refresh, email matching, LLM
extraction, and an idempotent fire-and-forget orchestrator
- **`v1.py`** adds a background task on the `get_or_create_user_route`
to trigger Tally lookup on every login (idempotency check is inside the
called function)
- **`settings.py` / `.env.default`** adds `tally_api_key` as an optional
secret — feature is a no-op without it
- **`tally_test.py`** adds 15 unit tests with thorough mocking coverage
- **Bug: TTL mismatch** — `_LAST_FETCH_TTL` (2h) > `_INDEX_TTL` (1h)
creates a window where incremental refresh loses all previously indexed
emails because the base index has expired but `last_fetch` persists.
This will cause silent data loss for users whose form submissions were
indexed before the cache expiry
- **Bug: `str.format()` on LLM prompt** — form data containing `{` or
`}` will crash the prompt formatting, silently preventing understanding
population for those users
</details>


<details><summary><h3>Confidence Score: 2/5</h3></summary>

- This PR has two logic bugs that will cause silent data loss in
production — recommend fixing before merge.
- The TTL mismatch between `_LAST_FETCH_TTL` and `_INDEX_TTL` will
intermittently cause incomplete caches, silently dropping users from the
email index. The `str.format()` issue will cause failures for any form
submission containing curly braces. Both bugs are caught by the
top-level exception handler, so they won't crash the service, but they
will silently prevent the feature from working correctly for affected
users. The overall architecture is sound and well-tested for normal
paths.
- `autogpt_platform/backend/backend/data/tally.py` — contains both the
TTL mismatch bug in `_refresh_cache` and the `str.format()` issue in
`extract_business_understanding`
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant API as v1.py (get_or_create_user_route)
    participant Tally as tally.py (populate_understanding_from_tally)
    participant DB as Database (understanding)
    participant Redis
    participant TallyAPI as Tally API
    participant LLM as OpenRouter (gpt-4o-mini)

    User->>API: POST /auth/user (JWT)
    API->>API: get_or_create_user(user_data)
    API-->>User: Return user (immediate)
    API->>Tally: asyncio.create_task(populate_understanding_from_tally)

    Tally->>DB: get_business_understanding(user_id)
    alt Understanding exists
        DB-->>Tally: existing understanding
        Note over Tally: Skip (idempotent)
    else No understanding
        DB-->>Tally: None
        Tally->>Tally: Check tally_api_key configured
        Tally->>Redis: Check cached email index
        alt Cache hit
            Redis-->>Tally: email_index + questions
        else Cache miss
            Redis-->>Tally: None
            Tally->>TallyAPI: GET /forms/{id}/submissions (paginated)
            TallyAPI-->>Tally: submissions + questions
            Tally->>Tally: Build email index
            Tally->>Redis: Cache index (1h TTL)
        end
        Tally->>Tally: Lookup email in index
        alt Email found
            Tally->>Tally: format_submission_for_llm()
            Tally->>LLM: Extract BusinessUnderstandingInput
            LLM-->>Tally: JSON structured data
            Tally->>DB: upsert_business_understanding(user_id, input)
        end
    end
```
</details>


<sub>Last reviewed commit: 92d2da4</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Otto (AGPT) <otto@agpt.co>
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-24 11:31:29 +01:00
Bently
ef42b17e3b docs: add Podman compatibility warning (#12120)
## Summary
Adds a warning to the Getting Started docs clarifying that **Podman and
podman-compose are not supported**.

## Problem
Users on Windows using `podman-compose` instead of Docker get errors
like:
```
Error: the specified Containerfile or Dockerfile does not exist, ..\..\autogpt_platform\backend\Dockerfile
```

This is because Podman handles relative paths differently than Docker,
causing incorrect path resolution on Windows.

## Solution
- Added a clear warning section after the Windows WSL 2 notes
- Explains the error users might see
- Directs them to install Docker Desktop instead

Closes #11358

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

Adds a "Podman Not Supported" warning section to the Getting Started
documentation, placed after the Windows/WSL 2 installation notes. The
section clarifies that Docker is required, shows the typical error
message users encounter when using Podman, and directs them to install
Docker Desktop instead. This addresses issue #11358 where Windows users
using `podman-compose` hit path resolution errors.

- Adds `### ⚠️ Podman Not Supported` section under Manual Setup, after
Windows Installation Note
- Includes the specific error message users see with Podman for easy
identification
- Links to Docker Desktop installation docs as the recommended solution
- Formatting is consistent with existing sections in the document (emoji
headings, code blocks for errors)
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge — it only adds a documentation warning
section with no code changes.
- The change is a small, well-written documentation addition that adds a
Podman compatibility warning. It touches only one markdown file,
introduces no code changes, and is consistent with the existing document
structure and style. No issues were found.
- No files require special attention.
</details>


<details><summary><h3>Flowchart</h3></summary>

```mermaid
flowchart TD
    A[User wants to run AutoGPT] --> B{Which container runtime?}
    B -->|Docker / Docker Desktop| C[docker compose up -d --build]
    C --> D[AutoGPT starts successfully]
    B -->|Podman / podman-compose| E[podman-compose up -d --build]
    E --> F[Error: Containerfile or Dockerfile does not exist]
    F --> G[New warning section directs user to install Docker Desktop]
    G --> C
```
</details>


<sub>Last reviewed commit: 23ea6bd</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-23 15:19:24 +00:00
Ubbe
a18ffd0b21 fix(frontend/copilot): always-visible credentials, inputs, and login prompts (#12194)
Credentials, inputs, and login prompts in copilot tool outputs were
hidden inside collapsible accordions — users could accidentally collapse
them, hiding blocking actionable UI. This PR extracts all blocking
requirements out of accordions so they're always visible.

### Changes 🏗️

- **RunAgent & RunBlock**: Extract `SetupRequirementsCard` (credentials
picker) out of `ToolAccordion` — renders standalone, always visible
- **RunAgent**: Also extract `AgentDetailsCard` (inputs needed) and
`need_login` message out of accordion
- **SetupRequirementsCard (RunBlock)**: Input form always visible
(removed toggle button and animation), unified "Proceed" button disabled
until credentials + inputs are satisfied
- **SetupRequirementsCard (RunAgent)**: "Proceed" button disabled until
all credentials are selected
- **Both cards**: Added titled box with border for credentials section
("Block credentials" / "Agent credentials"), matching the existing
inputs box pattern
- **CredentialsFlatView**: "Add" button uses `variant="primary"` when
user has no credentials (was `secondary`)
- **Styleguide**: Added mock `CredentialsProvidersContext` with two
scenarios:
  - No credentials → shows "add new" flow
  - Has credentials → shows selection list with existing accounts
- **CreateAgent & EditAgent**: Picked up user-initiated styling
refinements

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `pnpm format && pnpm lint && pnpm types` all pass
  - [ ] Visit `/copilot/styleguide` and verify:
- [ ] "Setup requirements — no credentials" shows add-credential button
(primary variant)
- [ ] "Setup requirements — has credentials" shows credential selection
dropdown
- [ ] Both RunAgent and RunBlock setup requirements render outside
accordion
- [ ] Trigger a copilot agent run that requires credentials — credential
picker always visible
- [ ] Trigger a copilot block run that requires credentials + inputs —
both sections visible, "Proceed" disabled until ready
- [ ] Trigger a copilot agent run that returns "agent details" — card
renders outside accordion
- [ ] Verify other output types (execution_started, error) still render
inside accordions


🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 16:39:21 +07:00
Otto
e40c8c70ce fix(copilot): collision detection, session locking, and sync for concurrent message saves (#12177)
Requested by @majdyz

Concurrent writers (incremental streaming saves from PR #12173 and
long-running tool callbacks) can race to persist messages with the same
`(sessionId, sequence)` pair, causing unique constraint violations on
`ChatMessage`.

**Root cause:** The streaming loop tracks `saved_msg_count` in-memory,
but the long-running tool callback (`_build_long_running_callback`) also
appends messages and calls `upsert_chat_session` independently — without
coordinating sequence numbers. When the streaming loop does its next
incremental save with the stale `saved_msg_count`, it tries to insert at
a sequence that already exists.

**Fix:** Multi-layered defense-in-depth approach:

1. **Collision detection with retry** (db.py): `add_chat_messages_batch`
uses `create_many()` in a transaction. On `UniqueViolationError`,
queries `MAX(sequence)+1` from DB and retries with the correct offset
(max 5 attempts).

2. **Robust sequence tracking** (db.py): `get_next_sequence()` uses
indexed `find_first` with `order={"sequence": "desc"}` for O(1) MAX
lookup, immune to deleted messages.

3. **Session-based counter** (model.py): Added `saved_message_count`
field to `ChatSession`. `upsert_chat_session` returns the session with
updated count, eliminating tuple returns throughout the codebase.

4. **MessageCounter dataclass** (sdk/service.py): Replaced list[int]
mutable reference pattern with a clean `MessageCounter` dataclass for
shared state between streaming loop and long-running callbacks.

5. **Session locking** (sdk/service.py): Prevent concurrent streams on
the same session using Redis `SET NX EX` distributed locks with TTL
refresh on heartbeats (config.stream_ttl = 3600s).

6. **Atomic operations** (db.py): Single timestamp for all messages and
session update in batch operations for consistency. Parallel queries
with `asyncio.gather` for lower latency.

7. **Config-based TTL** (sdk/service.py, config.py): Consolidated all
TTL constants to use `config.stream_ttl` (3600s) with lock refresh on
heartbeats.

### Key implementation details

- **create_many**: Uses `sessionId` directly (not nested
`Session.connect`) as `create_many` doesn't support nested creates
- **Type narrowing**: Added explicit `assert session is not None`
statements for pyright type checking in async contexts
- **Parallel operations**: Use `asyncio.gather` for independent DB
operations (create_many + session update)
- **Single timestamp**: All messages in a batch share the same
`createdAt` timestamp for atomicity

### Changes
- `backend/copilot/db.py`: Collision detection with `create_many` +
retry, indexed sequence lookup, single timestamp, parallel queries
- `backend/copilot/model.py`: Added `saved_message_count` field,
simplified return types
- `backend/copilot/sdk/service.py`: MessageCounter dataclass, session
locking with refresh, config-based TTL, type narrowing
- `backend/copilot/service.py`: Updated all callers to handle new return
types
- `backend/copilot/config.py`: Increased long_running_operation_ttl to
3600s with clarified docstring
- `backend/copilot/*_test.py`: Tests updated for new signatures

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-02-20 15:05:03 +00:00
Zamil Majdy
9cdcd6793f fix(copilot): remove stream timeout, add error propagation to frontend (#12175)
## Summary

Fixes critical reliability issues where long-running copilot sessions
were forcibly terminated and failures showed no error messages to users.

## Issues Fixed

1. **Silent failures**: Tasks failed but frontend showed "stopped" with
zero explanation
2. **Premature timeout**: Sessions auto-expired after 5 minutes even
when actively running

## Changes

### Error propagation to frontend
- Add `error_message` parameter to `mark_task_completed()`
- When `status="failed"`, publish `StreamError` before `StreamFinish` so
frontend displays reason
- Update all failure callers with specific error messages:
  - Session not found: `"Session {id} not found"`
  - Tool setup failed: `"Failed to setup tool {name}: {error}"`  
  - Task cancelled: `"Task was cancelled"`

### Remove stream timeout
- Delete `stream_timeout` config (was 300s/5min)
- Remove auto-expiry logic in `get_active_task_for_session()`
- Sessions now run indefinitely — user controls stopping via UI

## Why

**Auto-expiry was broken:**
- Used `created_at` (task start) not last activity
- SDK sessions with multiple LLM calls + subagent Tasks easily run
20-30+ minutes
- A task publishing chunks every second still got killed at 5min mark
- Hard timeout is inappropriate for long-running AI agents

**Error propagation was missing:**
- `mark_task_completed(status="failed")` only sent `StreamFinish`
- No `StreamError` event = frontend had no message to show user
- Backend logs showed errors but user saw nothing

## Test Plan

- [x] Formatter, linter, type-check pass
- [ ] Start a copilot session with Task tool (spawns subagent)
- [ ] Verify session runs beyond 5 minutes without auto-expiry
- [ ] Cancel a running session → frontend shows "Task was cancelled"
error
- [ ] Trigger a tool setup failure → frontend shows error message
- [ ] Session continues running until user clicks stop or task completes

## Files Changed

- `backend/copilot/config.py` — removed `stream_timeout`
- `backend/copilot/stream_registry.py` — removed auto-expiry, added
error propagation
- `backend/copilot/service.py` — error messages for 2 failure paths
- `backend/copilot/executor/processor.py` — error message for
cancellation
2026-02-20 09:16:22 +00:00
Zamil Majdy
fc64f83331 fix(copilot): SDK streaming reliability, parallel tools, incremental saves, frontend reconnection (#12173)
## Summary

Fixes multiple reliability issues in the copilot's Claude Agent SDK
streaming pipeline — tool outputs getting stuck, parallel tool calls
flushing prematurely, messages lost on page refresh, and SSE
reconnection failures.

## Changes

### Backend: Streaming loop rewrite (`sdk/service.py`)
- **Non-cancelling heartbeat pattern**: Replace `asyncio.timeout()` with
`asyncio.wait()` for SDK message iteration. The old approach corrupted
the SDK's internal anyio memory stream when timeouts fired
mid-`__anext__()`, causing `StopAsyncIteration` on the next call and
silently dropping all in-flight tool results.
- **Hook synchronization**: Add `wait_for_stash()` before
`convert_message()` — the SDK fires PostToolUse hooks via `start_soon()`
(fire-and-forget), so the next message can arrive before the hook
stashes its output. The new asyncio.Event-based mechanism bridges this
gap without arbitrary sleeps.
- **Error handling**: Add `asyncio.CancelledError` handling at both
inner (streaming loop) and outer (session) levels, plus pending task
cleanup in `finally` block to prevent leaked coroutines. Catch
`Exception` from `done.pop().result()` for SDK error messages.
- **Safety-net flush**: After streaming loop ends, flush any remaining
unresolved tool calls so the frontend stops showing spinners even if the
stream drops unexpectedly.
- **StreamFinish fallback**: Emit `StreamFinishStep` + `StreamFinish`
when stream ends without `ResultMessage` (StopAsyncIteration) so the
frontend transitions to "ready" state.
- **Incremental session saves**: Save session to PostgreSQL after each
tool input/output event (not just at stream end), so page refresh and
other devices see recent messages.
- **Enhanced logging**: All log lines now include `session_id[:12]`
prefix and tool call resolution state (unresolved/current/resolved
counts).

### Backend: Response adapter (`sdk/response_adapter.py`)
- **Parallel tool call support**: Skip `_flush_unresolved_tool_calls()`
when an AssistantMessage contains only ToolUseBlocks (parallel
continuation) — the prior tools are still executing concurrently and
haven't finished yet.
- **Duplicate output prevention**: Skip already-resolved tool results in
both UserMessage (ToolResultBlock) and parent_tool_use_id handling to
prevent duplicate `StreamToolOutputAvailable` events.
- **`has_unresolved_tool_calls` property**: Used by the streaming loop
to decide whether to wait for PostToolUse hooks.
- **`session_id` parameter**: Passed through for structured logging.

### Backend: Hook synchronization (`sdk/tool_adapter.py`)
- **`_stash_event` ContextVar**: asyncio.Event signaled by
`stash_pending_tool_output()` whenever a PostToolUse hook stashes
output.
- **`wait_for_stash()`**: Awaits the event with configurable timeout —
replaces the racy "hope the hook finished" approach.

### Backend: Security hooks (`sdk/security_hooks.py`)
- Enhanced logging in `post_tool_use_hook` — log whether tool is
built-in, preview of stashed output, warning when `tool_response` is
None.

### Backend: Incremental save optimization (`model.py`)
- **`existing_message_count` parameter** on `upsert_chat_session`: Skip
the DB query to count existing messages when the caller already tracks
this (streaming loop).
- **`skip_existence_check` parameter** on `_save_session_to_db`: Skip
the `get_chat_session` existence query when we know the session row
already exists. Reduces from 4 DB round trips to 2 per incremental save.

### Backend: SDK version bump (`pyproject.toml`, `poetry.lock`)
- Bump `claude-agent-sdk` from `^0.1.0` to `^0.1.39`.

### Backend: New tests
- **`sdk_compat_test.py`** (new file): SDK compatibility tests — verify
the installed SDK exposes every class, attribute, and method the copilot
integration relies on. Catches SDK upgrade breakage immediately.
- **`response_adapter_test.py`**: 9 new tests covering
flush-at-ResultMessage, flush-at-next-AssistantMessage, stashed output
flush, wait_for_stash signaling/timeout/fast-path, parallel tool call
non-premature-flush, text-message flush of prior tools, and
already-resolved tool skip in UserMessage.

### Frontend: Session hydration (`convertChatSessionToUiMessages.ts`)
- **`isComplete` option**: When session has no active stream, mark
dangling tool calls (no output in DB) as `output-available` with empty
output — stops stale spinners after page refresh.

### Frontend: Chat session hook (`useChatSession.ts`)
- Reorder `hasActiveStream` memo before `hydratedMessages` so
`isComplete` flag is available.
- Pass `{ isComplete: !hasActiveStream }` to
`convertChatSessionMessagesToUiMessages`.

### Frontend: Copilot page hook (`useCopilotPage.ts`)
- **Cache invalidation on stream end**: Invalidate React Query session
cache when stream transitions active → idle, so next hydration fetches
fresh messages from backend (staleTime: Infinity otherwise keeps stale
data).
- **Resume ref reset**: Reset `hasResumedRef` on stream end to allow
re-resume if SSE drops but backend task is still running.
- **Remove old `resolveInProgressTools` effect**: Replaced by
backend-side safety-net flush + hydration-time `isComplete` marking.

## Test plan
- [ ] Existing copilot tests pass (`pytest backend/copilot/ -x -q`)
- [ ] SDK compat tests pass (`pytest
backend/copilot/sdk/sdk_compat_test.py -v`)
- [ ] Tool outputs (bash_exec, web_fetch, WebSearch) appear in the UI
instead of getting stuck
- [ ] Parallel tool calls (e.g. multiple WebSearch) complete and display
results without premature flush
- [ ] Page refresh during active stream reconnects and recovers messages
- [ ] Opening session from another device shows recent tool results
- [ ] SSE drop → automatic reconnection without losing messages
- [ ] Long-running tools (create_agent) still delegate to background
infrastructure
2026-02-20 08:25:08 +00:00
Otto
7718c49f05 Make CoPilot todo/task list card expanded by default (#12168)
The todo card rendered by GenericTool was collapsed by default,
requiring users to click to see their checklist items. Now passes
`defaultExpanded` when the category is `"todo"` so the task list is
immediately visible.

**File changed:**
`autogpt_platform/frontend/src/app/(platform)/copilot/tools/GenericTool/GenericTool.tsx`

Resolves [SECRT-2017](https://linear.app/autogpt/issue/SECRT-2017)
2026-02-20 05:36:16 +00:00
Abhimanyu Yadav
0a1591fce2 refactor(frontend): remove old builder code and monitoring components
(#12082)

### Changes 🏗️

This PR removes old builder code and monitoring components as part of
the migration to the new flow editor:

- **NewControlPanel**: Simplified component by removing unused props
(`flowExecutionID`, `visualizeBeads`, `pinSavePopover`,
`pinBlocksPopover`, `nodes`, `onNodeSelect`, `onNodeHover`) and cleaned
up commented legacy code
- **Import paths**: Updated all references from
`legacy-builder/CustomNode` to `FlowEditor/nodes/CustomNode`
- **GraphContent**: Fixed type safety by properly handling
`customized_name` metadata and using `categoryColorMap` instead of
`getPrimaryCategoryColor`
- **useNewControlPanel**: Removed unused state and query parameter
handling related to old builder
- Removed dead code and commented-out imports throughout

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
    - [x] Verify NewControlPanel renders correctly
    - [x] Test BlockMenu functionality
    - [x] Test Save Control
    - [x] Test Undo/Redo buttons
    - [x] Verify graph search menu still works with updated imports

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

This PR removes legacy builder components and monitoring page (~12,000
lines of code), simplifying `NewControlPanel` to focus only on the new
flow editor.

**Key changes:**
- Removed entire `legacy-builder/` directory (36 files) containing old
CustomNode, CustomEdge, Flow, and control components
- Deleted `/monitoring` page and all related components (9 files)
- Deleted `useAgentGraph` hook (1,043 lines) that was only used by
legacy components
- Simplified `NewControlPanel` by removing unused props
(`flowExecutionID`, `nodes`, `onNodeSelect`, etc.) and commented-out
code
- Updated imports in `NewSearchGraph` components to reference new
`FlowEditor/nodes/CustomNode` instead of deleted
`legacy-builder/CustomNode`
- Removed `/monitoring` from protected pages in `helpers.ts`
- Updated test files to remove monitoring-related test helpers

**Minor style issues:**
- `useNewControlPanel` hook returns unused state (`blockMenuSelected`)
that should be cleaned up
- Unnecessary double negation (`!!`) in `GraphContent.tsx:136`
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- This PR is safe to merge with minor style improvements recommended
- The refactor is a straightforward deletion of legacy code with no
references remaining in the codebase. All imports have been updated
correctly, tests cleaned up, and routing configuration updated. The only
issues are minor unused code that could be cleaned up but won't cause
runtime errors.
- No files require special attention - the unused state in
`useNewControlPanel.ts` is a minor style issue
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant NewControlPanel
    participant BlockMenu
    participant NewSaveControl
    participant UndoRedoButtons
    participant Store as blockMenuStore (Zustand)

    Note over NewControlPanel: Simplified component (removed props & legacy code)
    
    User->>NewControlPanel: Render
    NewControlPanel->>useNewControlPanel: Call hook (unused return)
    
    NewControlPanel->>BlockMenu: Render
    BlockMenu->>Store: Access state via useBlockMenuStore
    Store-->>BlockMenu: Return search, filters, etc.
    
    NewControlPanel->>NewSaveControl: Render
    NewControlPanel->>UndoRedoButtons: Render
    
    Note over NewControlPanel,Store: State management moved from hook to Zustand store
    Note over User: Legacy components (CustomNode, Flow, etc.) completely removed
```
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-20 05:19:08 +00:00
Zamil Majdy
681bb7c2b4 feat(copilot): workspace file tools, context reconstruction, transcript upload protection (#12164)
## Summary
- **Workspace file tools**: `write_workspace_file` now accepts plain
text `content`, `source_path` (copy from ephemeral disk), and graceful
fallback for invalid base64. `read_workspace_file` gains `save_to_path`
to download workspace files to the ephemeral working directory. Both
validate paths against session-specific ephemeral directory.
- **Context reconstruction**: `_format_conversation_context` now
includes tool call summaries and tool results (not just user/assistant
text), fixing agent amnesia when transcript is unavailable or stale.
- **Transcript upload protection**: Moved transcript upload from inside
the inner `try` block to the `finally` block, ensuring it always runs
even on streaming exceptions — prevents transcript loss that caused
staleness on subsequent turns.
- **Agent inactivity timeout**: Configurable timeout (default 300s)
kills hung Claude agents that stop producing SDK messages.
- **SDK system prompt**: Restructured with clear sections for shell
commands, two storage systems, file transfer workflows, and long-running
tools.
- **Path validation hardening**: `_validate_ephemeral_path` uses
`os.path.realpath` for both session dir and target path, fixing macOS
`/tmp` → `/private/tmp` symlink mismatch. Empty-string params normalised
to `None` to prevent dispatch assertion failures.

## Test plan
- [x] `_format_conversation_context` — empty, user, assistant, tool
calls, tool results, full conversation (query_builder_test.py)
- [x] `_build_query_message` — resume up-to-date, stale transcript gap,
zero msg count, no resume single/multi (query_builder_test.py)
- [x] `_validate_ephemeral_path` — valid path, traversal, cross-session,
symlink escape, nested (workspace_files_test.py)
- [x] `_resolve_write_content` — no sources, multiple sources, plain
text, base64, invalid base64, source_path, not found, outside ephemeral,
empty strings (workspace_files_test.py)
- [ ] Verify transcript upload occurs even after streaming error
- [ ] Verify agent inactivity timeout kills hung agents (300s default)

---------

Co-authored-by: Otto (AGPT) <otto@agpt.co>
2026-02-20 03:20:12 +00:00
Zamil Majdy
0818cd6683 fix(copilot): prevent background agent stalls and context hallucination (#12167)
## Summary
- **Block background Task agents**: The SDK's `Task` tool with
`run_in_background=true` stalls the SSE stream (no messages flow while
they execute) and the agents get killed when the main agent's turn ends
and we SIGTERM the CLI. The `PreToolUse` hook now denies these and tells
the agent to run tasks in the foreground instead.
- **Add heartbeats to SDK streaming**: Replaced the `async for` loop
with an explicit async iterator + `asyncio.wait_for(15s)`. Sends
`StreamHeartbeat` when the CLI is idle (e.g. during long tool execution)
to keep SSE connections alive through proxies/LBs.
- **Fix summarization hallucination**: The `_summarize_messages_llm`
prompt forced the LLM to produce ALL 9 sections ("You MUST include
ALL"), causing fabrication when the conversation didn't have content for
every section. Changed to optional sections with explicit
anti-hallucination instructions.

## Context
Session `7a9dda34-1068-4cfb-9132-5daf8ad31253` exhibited both issues:
1. The copilot tried to spin up background agents to create files in
parallel, then stopped responding
2. On resume, the copilot hallucinated having completed a "comprehensive
competitive analysis" with "9 deliverables" that never happened

## Test plan
- [x] All 26 security hooks tests pass (3 new: background blocked,
foreground allowed, limit enforced)
- [x] All 44 prompt utility tests pass
- [x] Linting and typecheck pass
- [ ] Manual test: copilot session where agent attempts to use Task tool
— should run foreground only
- [ ] Manual test: long-running tool execution — SSE should stay alive
via heartbeats
- [ ] Manual test: resume a multi-turn session — no hallucinated context
in summary
2026-02-19 20:00:15 +00:00
Zamil Majdy
7a39bdfaf8 feat(copilot): wire up stop button to cancel executor tasks (#12171)
## Summary

- The stop button was completely disconnected — clicking it only aborted
the client-side SSE fetch while the executor kept running indefinitely
- The executor already had full cancel infrastructure (RabbitMQ FANOUT
consumer, `CancelCoPilotEvent`, `threading.Event`, periodic cancel
checks) but nobody ever published a cancel message
- This PR wires up the missing pieces: a cancel REST endpoint, a publish
function, and frontend integration

## Changes

- **`executor/utils.py`**: Add `enqueue_cancel_task()` to publish
`CancelCoPilotEvent` to the existing FANOUT exchange
- **`routes.py`**: Add `POST /sessions/{session_id}/cancel` that finds
the active task, publishes cancel, and polls Redis until the task
confirms stopped (up to 10s timeout)
- **`cancel/route.ts`**: Next.js API proxy route for the cancel endpoint
- **`useCopilotPage.ts`**: Wrap AI SDK's `stop()` to also call the
backend cancel API — `sdkStop()` fires first for instant UI feedback,
then the cancel API waits for executor confirmation

## Test plan

- [ ] Start a copilot chat session and send a message
- [ ] Click "Stop generating" while streaming
- [ ] Verify executor logs show `Received cancel for {task_id}` and
`Cancelled during streaming`
- [ ] Verify the cancel endpoint returns `{"cancelled": true}` (not
timeout)
- [ ] Verify frontend transitions to idle state
- [ ] Verify clicking stop when no task is running returns
`{"cancelled": false, "reason": "no_active_task"}`
2026-02-19 19:57:51 +00:00
Otto
0b151f64e8 feat(copilot): Execute parallel tool calls concurrently (#12165)
When the LLM returns multiple tool calls in a single response (e.g.
multiple web fetches for a research task), they now execute concurrently
instead of sequentially. This can dramatically reduce latency for
multi-tool turns.

**Before:** Tool calls execute one after another — 7 web fetches × 2s
each = 14s total
**After:** All tool calls fire concurrently — 7 web fetches = ~2s total

### Changes

- **`service.py`**: New `_execute_tool_calls_parallel()` function that
spawns tool calls as concurrent `asyncio` tasks, collecting stream
events via `asyncio.Queue`
- **`service.py`**: `_yield_tool_call()` now accepts an optional
`session_lock` parameter for concurrent-safe session mutations
- **`base.py`**: Session lock exposed via `contextvars` so tools that
need it can access it without interface changes
- **`run_agent.py`**: Rate-limit counters (`successful_agent_runs`,
`successful_agent_schedules`) protected with the session lock to prevent
race conditions

### Concurrency Safety

| Shared State | Risk | Mitigation |
|---|---|---|
| `session.messages` (long-running tools only) | Race on append + upsert
| `session_lock` wraps mutations |
| `session.successful_agent_runs` counter | Bypass max-runs check |
`session_lock` wraps read-check-increment |
| Tool-internal state (DB queries, API calls) | None — stateless | No
mitigation needed |

### Testing

- Added `parallel_tool_calls_test.py` with tests for:
  - Parallel timing verification (sum vs max of delays)
  - Single tool call regression
  - Retryable error propagation
  - Shared session lock verification
  - Cancellation cleanup

Closes SECRT-2016

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-02-19 17:53:36 +00:00
Zamil Majdy
be2a48aedb feat(platform/copilot): add SuggestedGoalResponse for vague/unachievable goals (#12139)
## Summary

- Add `SUGGESTED_GOAL` response type and `SuggestedGoalResponse` model
to backend; vague/unachievable goals now return a structured suggestion
instead of a generic error
- Add `SuggestedGoalCard` frontend component (amber styling, "Use this
goal" button) that lets users accept and re-submit a refined goal in one
click
- Add error recovery buttons ("Try again", "Simplify goal") to the error
output block
- Update copilot system prompt with explicit guidance for handling
`suggested_goal` and `clarifying_questions` feedback loops
- Add `create_agent_test.py` covering all four decomposition result
types

## Test plan

- [ ] Trigger vague goal (e.g. "monitor social media") →
`SuggestedGoalCard` renders with amber styling
- [ ] Trigger unachievable goal (e.g. "read my mind") → card shows goal
type "Goal cannot be accomplished" with reason
- [ ] Click "Use this goal" → sends message and triggers new
`create_agent` call with the suggested goal
- [ ] Trigger an error → "Try again" and "Simplify goal" buttons appear
below the error
- [ ] Clarifying questions answered → LLM re-calls `create_agent` with
context (system prompt guidance)
- [ ] Backend tests pass: `poetry run pytest
backend/api/features/chat/tools/create_agent_test.py -xvs` (requires
Docker services)

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

Replaced generic `ErrorResponse` with structured `SuggestedGoalResponse`
for vague/unachievable goals in the copilot agent creation flow. Added
frontend `SuggestedGoalCard` component with amber styling and "Use this
goal" button for one-click goal refinement. Enhanced system prompt with
explicit feedback loop handling for `suggested_goal` and
`clarifying_questions`. Added comprehensive test coverage for all four
decomposition result types.

**Key improvements:**
- Better UX: Users can now accept refined goals with one click instead
of manually retyping
- Clearer error recovery: Added "Try again" and "Simplify goal" buttons
to error blocks
- Structured data: Backend now returns `suggested_goal`, `reason`,
`original_goal`, and `goal_type` fields instead of embedding everything
in error messages

**Issue found:**
- The `reason` field from the backend is not being passed to or
displayed by the `SuggestedGoalCard` component, so users won't see the
explanation for why their goal was rejected (especially important for
unachievable goals where it explains what blocks are missing)
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- Safe to merge after fixing the missing `reason` field in the frontend
component
- Implementation is well-structured with good test coverage and follows
established patterns. The issue with the missing `reason` field is
straightforward to fix but important for UX - users won't understand why
their goal was rejected without it. All other changes are solid: backend
properly returns structured data, tests cover all cases, and the
component integration follows the project's conventions.
-
autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/CreateAgent.tsx
and SuggestedGoalCard.tsx need the `reason` prop added
</details>


<details><summary><h3>Flowchart</h3></summary>

```mermaid
flowchart TD
    Start[User submits goal to create_agent] --> Decompose[decompose_goal analyzes request]
    
    Decompose --> CheckType{Decomposition result type?}
    
    CheckType -->|clarifying_questions| Questions[Return ClarificationNeededResponse]
    Questions --> UserAnswers[User answers questions]
    UserAnswers --> Retry[Retry with context]
    Retry --> Decompose
    
    CheckType -->|vague_goal| VagueResponse[Return SuggestedGoalResponse<br/>goal_type: vague]
    VagueResponse --> ShowSuggestion[Frontend: SuggestedGoalCard<br/>amber styling]
    ShowSuggestion --> UserAccepts{User clicks<br/>Use this goal?}
    UserAccepts -->|Yes| NewGoal[Send suggested goal]
    NewGoal --> Decompose
    UserAccepts -->|No| End1[User refines manually]
    
    CheckType -->|unachievable_goal| UnachievableResponse[Return SuggestedGoalResponse<br/>goal_type: unachievable<br/>reason: missing blocks]
    UnachievableResponse --> ShowSuggestion
    
    CheckType -->|success| Generate[generate_agent creates workflow]
    Generate --> SaveOrPreview{save parameter?}
    SaveOrPreview -->|true| Save[Save to library<br/>AgentSavedResponse]
    SaveOrPreview -->|false| Preview[AgentPreviewResponse]
    
    CheckType -->|error| ErrorFlow[Return ErrorResponse]
    ErrorFlow --> ShowError[Frontend: Show error with<br/>Try again & Simplify goal buttons]
    ShowError --> UserRetry{User action?}
    UserRetry -->|Try again| Decompose
    UserRetry -->|Simplify goal| GetHelp[Ask LLM to simplify]
    GetHelp --> Decompose
    
    Save --> End2[Done]
    Preview --> End2
    End1 --> End2
```
</details>


<sub>Last reviewed commit: 2f37aee</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-19 16:11:41 +00:00
Ubbe
aeca4dbb79 docs(frontend): add mandatory pre-completion checks to CLAUDE.md (#12161)
### Changes 🏗️

Adds a **Pre-completion Checks (MANDATORY)** section to
`frontend/CLAUDE.md` that instructs Claude Code agents to always run the
following commands in order before reporting frontend work as done:

1. `pnpm format` — auto-fix formatting issues
2. `pnpm lint` — check for lint errors and fix them
3. `pnpm types` — check for type errors and fix them

This ensures code quality gates are enforced consistently by AI agents
working on the frontend.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified `pnpm format` passes cleanly
  - [x] Verified `pnpm lint` passes cleanly
  - [x] Verified `pnpm types` passes cleanly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 23:07:55 +08:00
Ubbe
7b85eeaae2 refactor(frontend): fix flaky e2e tests (#12156)
### Changes 🏗️

Some fixes to make running e2e more predictable...

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] e2e are imdempotent

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:38:50 +07:00
Ubbe
4db3be2d61 fix(frontend): switch minigame to snake (#12160)
## Changes 🏗️

<img width="600" height="416" alt="Screenshot 2026-02-19 at 18 05 39"
src="https://github.com/user-attachments/assets/930116ad-b611-4398-bee7-4e33ca4dc688"
/>

Make the mini game a snake 🐍 game, so we don't use assets (_possible
license issues_ ), and it's simpler...

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app and test
2026-02-19 19:28:59 +07:00
Ubbe
f57a1995d0 fix(frontend): make chat spinner centred when loading (#12154)
## Changes 🏗️

<img width="800" height="969" alt="Screenshot 2026-02-18 at 20 30 36"
src="https://github.com/user-attachments/assets/30d7d211-98c1-4159-94e1-86e81e29ad43"
/>

- Make the spinner centred when the chat is loading

## Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app and test locally
2026-02-19 17:31:07 +07:00
Zamil Majdy
3928c35928 feat(copilot): SDK tool output, transcript resume, stream reconnection, GenericTool UI (#12159)
## Summary

### SDK built-in tool output forwarding
- WebSearch, Read, TodoWrite outputs now render in the frontend —
PostToolUse hook stashes outputs before SDK truncation, response adapter
flushes unresolved tool calls via `_flush_unresolved_tool_calls` +
`parent_tool_use_id` handling
- Multi-call stash upgraded to `dict[str, list[str]]` to support
multiple calls to the same built-in tool in one turn

### Transcript-based `--resume` with staleness detection
- Simplified to single upload block after `async with` (Stop hook +
`appendFileSync` guarantees), extracted `_try_upload_transcript` helper
- **NEW**: `message_count` watermark + timestamp metadata stored
alongside transcript — on the next turn, detects staleness and
compresses only the missed messages instead of the full history (hybrid:
transcript via `--resume` + compressed gap)
- Removed redundant dual-strategy code and dead
`find_cli_transcript`/`read_fallback_transcript` functions

### Frontend stream reconnection
- **NEW**: Enabled `resume: true` on `useChat` with
`prepareReconnectToStreamRequest` — page refresh reconnects to active
backend streams via Redis replay (backend `resume_session_stream`
endpoint was already wired up)

### GenericTool.tsx UI overhaul
- Tool-specific icons (terminal, globe, file, search, edit, checklist)
with category-based display
- TodoWrite checklist rendering with status indicators
(completed/in-progress/pending)
- WebSearch/MCP content display via `extractMcpText` for MCP-style
content blocks + raw JSON fallback
- Defensive TodoItem filter per coderabbit review
- Proper accordion content per tool category (bash, web, file, search,
edit, todo)

### Image support
- MCP tool results now include `{"type": "image"}` content blocks when
workspace file responses contain `content_base64` with image MIME types

### Security & cleanup
- `AskUserQuestion` added to `SDK_DISALLOWED_TOOLS` (interactive CLI
tool, no terminal in copilot)
- 36 per-operation `[TIMING]`/`[TASK_LOOKUP]` diagnostic logs downgraded
info→debug
- Silent exception fixes: warning logs for swallowed errors in
stream_registry + service

## Test plan
- [ ] Verify copilot multi-turn conversations use `--resume` (check logs
for `Using --resume`)
- [ ] Verify stale transcript detection fills gap (check logs for
`Transcript stale: covers N of M messages`)
- [ ] Verify page refresh reconnects to active stream (check network tab
for GET to `/stream` returning SSE)
- [ ] Verify WebSearch, Read, TodoWrite tool outputs render in frontend
accordion
- [ ] Verify GenericTool icons and accordion content display correctly
for each tool type
- [ ] Verify production log volume is reduced (no more `[TIMING]` at
info level)

---------

Co-authored-by: Otto (AGPT) <otto@agpt.co>
2026-02-19 08:48:12 +00:00
Otto
dc77e7b6e6 feat(frontend): Replace advanced switch with chevron on builder nodes (#12152)
## Summary

Replaces the "Advanced" switch/toggle on builder nodes with a chevron
control, matching the UX pattern used for the outputs section.

Resolves
[OPEN-3006](https://linear.app/autogpt/issue/OPEN-3006/replace-advanced-switch-with-chevron-on-builder-nodes)

Before
<img width="443" height="348" alt="Screenshot 2026-02-17 at 9 01 31 pm"
src="https://github.com/user-attachments/assets/40e98669-3136-4e53-8d46-df18ea32c4d7"
/>
After
<img width="443" height="348" alt="Screenshot 2026-02-17 at 9 00 21 pm"
src="https://github.com/user-attachments/assets/0836e3ac-1d0a-43d7-9392-c9d5741b32b6"
/>

## Changes

- `NodeAdvancedToggle.tsx` — Replaced switch component with a chevron
expand/collapse toggle

## Testing

Tested and verified by @kpczerwinski

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

Replaces the `Switch` toggle for the "Advanced" section on builder nodes
with a chevron (`CaretDownIcon`) expand/collapse control, matching the
existing UX pattern used in `OutputHandler.tsx`. The change is clean and
consistent with the codebase.

- Swapped `Switch` component for a ghost `Button` + `CaretDownIcon` with
a `rotate-180` transition for visual feedback
- Pattern closely mirrors the output section toggle in
`OutputHandler.tsx` (lines 120-136)
- Removed the top border separator and rounded bottom corners from the
container, adjusting the visual spacing
- Toggle logic correctly inverts the `showAdvanced` boolean state
- Uses Phosphor Icons and design system components per project
conventions
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge — it is a small, focused UI change with no
logic or security concerns.
- Single file changed with a straightforward UI component swap. The new
implementation follows an established pattern already in use in
OutputHandler.tsx. Toggle logic is correct and all conventions (Phosphor
Icons, design system components, Tailwind styling) are followed.
- No files require special attention.
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant NodeAdvancedToggle
    participant nodeStore

    User->>NodeAdvancedToggle: Click chevron button
    NodeAdvancedToggle->>nodeStore: setShowAdvanced(nodeId, !showAdvanced)
    nodeStore-->>NodeAdvancedToggle: Updated showAdvanced state
    NodeAdvancedToggle->>NodeAdvancedToggle: Rotate CaretDownIcon (0° ↔ 180°)
    Note over NodeAdvancedToggle: Advanced fields shown/hidden via FormCreator
```
</details>


<sub>Last reviewed commit: ad66080</sub>

<!-- greptile_other_comments_section -->

**Context used:**

- Context from `dashboard` - autogpt_platform/frontend/CLAUDE.md
([source](https://app.greptile.com/review/custom-context?memory=39861924-d320-41ba-a1a7-a8bff44f780a))
- Context from `dashboard` -
autogpt_platform/frontend/src/app/(platform)/build/components/FlowEditor/ARCHITECTURE_FLOW_EDITOR.md
([source](https://app.greptile.com/review/custom-context?memory=0c5511fe-9aeb-4cf1-bbe9-798f2093b748))

<!-- /greptile_comment -->

---------

Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Ubbe <0ubbe@users.noreply.github.com>
Co-authored-by: Ubbe <hi@ubbe.dev>
2026-02-18 15:34:02 +00:00
Otto
ba75cc28b5 fix(copilot): Remove description from feature request search, add PII prevention (#12155)
Two targeted changes to the CoPilot feature request tools:

1. **Remove description from search results** — The
`search_feature_requests` tool no longer returns issue descriptions.
Only the title is needed for duplicate detection, reducing unnecessary
data exposure.

2. **Prevent PII in created issues** — Updated the
`create_feature_request` tool description and parameter descriptions to
explicitly instruct the LLM to never include personally identifiable
information (names, emails, company names, etc.) in Linear issue titles
and descriptions.

Resolves [SECRT-2010](https://linear.app/autogpt/issue/SECRT-2010)
2026-02-18 14:36:12 +01:00
Otto
15bcdae4e8 fix(backend/copilot): Clean up GCSWorkspaceStorage per worker (#12153)
The copilot executor runs each worker in its own thread with a dedicated
event loop (`asyncio.new_event_loop()`). `aiohttp.ClientSession` is
bound to the event loop where it was created — using it from a different
loop causes `asyncio.timeout()` to fail with:

```
RuntimeError: Timeout context manager should be used inside a task
```

This was the root cause of transcript upload failures tracked in
SECRT-2009 and [Sentry
#7272473694](https://significant-gravitas.sentry.io/issues/7272473694/).

### Fix

**One `GCSWorkspaceStorage` instance per event loop** instead of a
single shared global.

- `get_workspace_storage()` now returns a per-loop GCS instance (keyed
by `id(asyncio.get_running_loop())`). Local storage remains shared since
it has no async I/O.
- `shutdown_workspace_storage()` closes the instance for the **current**
loop only, so `session.close()` always runs on the loop that created the
session.
- `CoPilotProcessor.cleanup()` shuts down workspace storage on the
worker's own loop, then stops the loop.
- Manager cleanup submits `cleanup_worker` to each thread pool worker
before shutting down the executor — replacing the old approach of
creating a temporary event loop that couldn't close cross-loop sessions.

### Changes

| File | Change |
|------|--------|
| `util/workspace_storage.py` | `GCSWorkspaceStorage` back to simple
single-session class; `get_workspace_storage()` returns per-loop GCS
instance; `shutdown_workspace_storage()` scoped to current loop |
| `copilot/executor/processor.py` | Added `CoPilotProcessor.cleanup()`
and `cleanup_worker()` |
| `copilot/executor/manager.py` | Calls `cleanup_worker` on each thread
pool worker during shutdown |

Fixes SECRT-2009

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-18 11:17:39 +00:00
Otto
e9ba7e51db fix(copilot): Route workspace through db_accessors, fix transcript upload (#12148)
## Summary

Fixes two bugs in the copilot executor:

### SECRT-2008: WorkspaceManager bypasses db_accessors
`backend/util/workspace.py` imported 6 workspace functions directly from
`backend/data/workspace.py`, which call `prisma()` directly. In the
copilot executor (no Prisma connection), these fail.

**Fix:** Replace direct imports with `workspace_db()` from
`db_accessors`, routing through the database_manager HTTP client when
Prisma is unavailable. Also:
- Register all 6 workspace functions in `DatabaseManager` and
`DatabaseManagerAsyncClient`
- Add `UniqueViolationError` to the service `EXCEPTION_MAPPING` so it's
properly re-raised over HTTP (needed for race-condition retry logic)

### SECRT-2009: Transcript upload asyncio.timeout error
`asyncio.create_task()` at line 696 of `service.py` creates an orphaned
background task in the executor's thread event loop.
`gcloud-aio-storage`'s `asyncio.timeout()` fails in this context.

**Fix:** Replace `create_task` with direct `await`. The upload runs
after streaming completes (all chunks already yielded), so no
user-facing latency impact. The function already has internal try/except
error handling.
2026-02-17 22:24:19 +00:00
Reinier van der Leer
d23248f065 feat(backend/copilot): Copilot Executor Microservice (#12057)
Uncouple Copilot task execution from the REST API server. This should
help performance and scalability, and allows task execution to continue
regardless of the state of the user's connection.

- Resolves #12023

### Changes 🏗️

- Add `backend.copilot.executor`->`CoPilotExecutor` (setup similar to
`backend.executor`->`ExecutionManager`).

This executor service uses RabbitMQ-based task distribution, and sticks
with the existing Redis Streams setup for task output. It uses a cluster
lock mechanism to ensure a task is only executed by one pod, and the
`DatabaseManager` for pooled DB access.

- Add `backend.data.db_accessors` for automatic choice of direct/proxied
DB access

Chat requests now flow: API → RabbitMQ → CoPilot Executor → Redis
Streams → SSE Client. This enables horizontal scaling of chat processing
and isolates long-running LLM operations from the API service.

- Move non-API Copilot stuff into `backend.copilot` (from
`backend.api.features.chat`)
  - Updated import paths for all usages

- Move `backend.executor.database` to `backend.data.db_manager` and add
methods for copilot executor
  - Updated import paths for all usages
- Make `backend.copilot.db` RPC-compatible (-> DB ops return ~~Prisma~~
Pydantic models)
  - Make `backend.data.workspace` RPC-compatible
  - Make `backend.data.graphs.get_store_listed_graphs` RPC-compatible

DX:
- Add `copilot_executor` service to Docker setup

Config:
- Add `Config.num_copilot_workers` (default 5) and
`Config.copilot_executor_port` (default 8008)
- Remove unused `Config.agent_server_port`

> [!WARNING]
> **This change adds a new microservice to the system, with entrypoint
`backend.copilot.executor`.**
> The `docker compose` setup has been updated, but if you run the
Platform on something else, you'll have to update your deployment config
to include this new service.
>
> When running locally, the `CoPilotExecutor` uses port 8008 by default.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Copilot works
    - [x] Processes messages when triggered
    - [x] Can use its tools

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-02-17 16:15:28 +00:00
Bently
905373a712 fix(frontend): use singleton Shiki highlighter for code syntax highlighting (#12144)
## Summary
Addresses SENTRY-1051: Shiki warning about multiple highlighter
instances.

## Problem
The `@streamdown/code` package creates a **new Shiki highlighter for
each language** encountered. When users view AI chat responses with code
blocks in multiple languages (JavaScript, Python, JSON, YAML, etc.),
this creates 10+ highlighter instances, triggering Shiki's warning:

> "10 instances have been created. Shiki is supposed to be used as a
singleton, consider refactoring your code to cache your highlighter
instance"

This causes memory bloat and performance degradation.

## Solution
Introduced a custom code highlighting plugin that properly implements
the singleton pattern:

### New files:
- `src/lib/shiki-highlighter.ts` - Singleton highlighter management
- `src/lib/streamdown-code-plugin.ts` - Drop-in replacement for
`@streamdown/code`

### Key features:
- **Single shared highlighter** - One instance serves all code blocks
- **Preloaded common languages** - JS, TS, Python, JSON, Bash, YAML,
etc.
- **Lazy loading** - Additional languages loaded on demand
- **Result caching** - Avoids re-highlighting identical code blocks

### Changes:
- Added `shiki` as direct dependency
- Updated `message.tsx` to use the new plugin

## Testing
- [ ] Verify code blocks render correctly in AI chat
- [ ] Confirm no Shiki singleton warnings in console
- [ ] Test with multiple languages in same conversation

## Related
- Linear: SENTRY-1051
- Sentry: Multiple Shiki instances warning

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

Replaced `@streamdown/code` with a custom singleton-based Shiki
highlighter implementation to resolve memory bloat from creating
multiple highlighter instances per language. The new implementation
creates a single shared highlighter with preloaded common languages (JS,
TS, Python, JSON, etc.) and lazy-loads additional languages on demand.
Results are cached to avoid re-highlighting identical code blocks.

**Key changes:**
- Added `shiki` v3.21.0 as a direct dependency
- Created `shiki-highlighter.ts` with singleton pattern and language
management utilities
- Created `streamdown-code-plugin.ts` as a drop-in replacement for
`@streamdown/code`
- Updated `message.tsx` to import from the new plugin instead of
`@streamdown/code`

The implementation follows React best practices with async highlighting
and callback-based notifications. The cache key uses code length +
prefix/suffix for efficient lookups on large code blocks.
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- Safe to merge with minor considerations for edge cases
- The implementation is solid with proper singleton pattern, caching,
and async handling. The code is well-structured and addresses the stated
problem. However, there's a subtle potential race condition in the
callback handling where multiple concurrent requests for the same cache
key could trigger duplicate highlight operations before the first
completes. The cache key generation using prefix/suffix could
theoretically cause false cache hits for large files with identical
prefixes and suffixes. Despite these edge cases, the implementation
should work correctly for the vast majority of use cases.
- No files require special attention
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant UI as Streamdown Component
    participant Plugin as Custom Code Plugin
    participant Cache as Token Cache
    participant Singleton as Shiki Highlighter (Singleton)
    participant Callbacks as Pending Callbacks

    UI->>Plugin: highlight(code, lang)
    Plugin->>Cache: Check cache key
    
    alt Cache hit
        Cache-->>Plugin: Return cached result
        Plugin-->>UI: Return highlighted tokens
    else Cache miss
        Plugin->>Callbacks: Register callback
        Plugin->>Singleton: Get highlighter instance
        
        alt First call
            Singleton->>Singleton: Create highlighter with preloaded languages
        end
        
        Singleton-->>Plugin: Return highlighter
        
        alt Language not loaded
            Plugin->>Singleton: Load language dynamically
        end
        
        Plugin->>Singleton: codeToTokens(code, lang, themes)
        Singleton-->>Plugin: Return tokens
        Plugin->>Cache: Store result
        Plugin->>Callbacks: Notify all waiting callbacks
        Callbacks-->>UI: Async callback with result
    end
```
</details>


<sub>Last reviewed commit: 96c793b</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-17 12:15:53 +00:00
Otto
ee9d39bc0f refactor(copilot): Replace legacy delete dialog with molecules/Dialog (#12136)
## Summary
Updates the session delete confirmation in CoPilot to use the new
`Dialog` component from `molecules/Dialog` instead of the legacy
`DeleteConfirmDialog`.

## Changes
- **ChatSidebar**: Use Dialog component for delete confirmation
(desktop)
- **CopilotPage**: Use Dialog component for delete confirmation (mobile)

## Behavior
- Dialog stays **open** during deletion with loading state on button
- Cancel button **disabled** while delete is in progress
- Delete button shows **loading spinner** during deletion
- Dialog only closes on successful delete or when cancel is clicked (if
not deleting)

## Screenshots
*Dialog uses the same styling as other molecules/Dialog instances in the
app*

## Requested by
@0ubbe

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

Replaces the legacy `DeleteConfirmDialog` component with the new
`molecules/Dialog` component for session delete confirmations in both
desktop (ChatSidebar) and mobile (CopilotPage) views. The new
implementation maintains the same behavior: dialog stays open during
deletion with a loading state on the delete button and disabled cancel
button, closing only on successful deletion or cancel click.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- This is a straightforward component replacement that maintains the
same behavior and UX. The Dialog component API is properly used with
controlled state, the loading states are correctly implemented, and both
mobile and desktop views are handled consistently. The changes are
well-tested patterns used elsewhere in the codebase.
- No files require special attention
</details>


<details><summary><h3>Flowchart</h3></summary>

```mermaid
flowchart TD
    A[User clicks delete button] --> B{isMobile?}
    B -->|Yes| C[CopilotPage Dialog]
    B -->|No| D[ChatSidebar Dialog]
    
    C --> E[Set sessionToDelete state]
    D --> E
    
    E --> F[Dialog opens with controlled.isOpen]
    F --> G{User action?}
    
    G -->|Cancel| H{isDeleting?}
    H -->|No| I[handleCancelDelete: setSessionToDelete null]
    H -->|Yes| J[Cancel button disabled]
    
    G -->|Confirm Delete| K[handleConfirmDelete called]
    K --> L[deleteSession mutation]
    L --> M[isDeleting = true]
    M --> N[Button shows loading spinner]
    M --> O[Cancel button disabled]
    
    L --> P{Mutation result?}
    P -->|Success| Q[Invalidate sessions query]
    Q --> R[Clear sessionId if current]
    R --> S[setSessionToDelete null]
    S --> T[Dialog closes]
    
    P -->|Error| U[Show toast error]
    U --> V[setSessionToDelete null]
    V --> W[Dialog closes]
```
</details>


<sub>Last reviewed commit: 275950c</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Ubbe <hi@ubbe.dev>
2026-02-17 19:12:27 +07:00
Swifty
05aaf7a85e fix(backend): Rename LINEAR_API_KEY to COPILOT_LINEAR_API_KEY to prevent global access (#12143)
The `LINEAR_API_KEY` environment variable name is too generic — it
matches the key name used by integrations/blocks, meaning that if set
globally, it could inadvertently grant all users access to Linear
through the blocks system rather than restricting it to the copilot
feature-request tool.

This renames the setting to `COPILOT_LINEAR_API_KEY` to make it clear
this key is scoped exclusively to the copilot's feature-request
functionality, preventing it from being picked up as a general-purpose
Linear credential.

### Changes 🏗️

- Renamed `linear_api_key` → `copilot_linear_api_key` in `Secrets`
settings model (`backend/util/settings.py`)
- Updated all references in the copilot feature-request tool
(`backend/api/features/chat/tools/feature_requests.py`)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified the rename is consistent across all references (settings
+ feature_requests tool)
  - [x] No other files reference the old `linear_api_key` setting name

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

> **Note:** The env var changes from `LINEAR_API_KEY` to
`COPILOT_LINEAR_API_KEY`. Any deployment using the old name will need to
update accordingly.

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

Renamed `LINEAR_API_KEY` to `COPILOT_LINEAR_API_KEY` in settings and the
copilot feature-request tool to prevent unintended access through Linear
blocks.

**Key changes:**
- Updated `Secrets.linear_api_key` → `Secrets.copilot_linear_api_key` in
`backend/util/settings.py`
- Updated all references in
`backend/api/features/chat/tools/feature_requests.py`
- The rename prevents the copilot Linear key from being picked up by the
Linear blocks integration (which uses `LINEAR_API_KEY` via
`ProviderBuilder` in `backend/blocks/linear/_config.py`)

**Issues found:**
- `.env.default` still references `LINEAR_API_KEY` instead of
`COPILOT_LINEAR_API_KEY`
- Frontend styleguide has a hardcoded error message with the old
variable name
</details>


<details><summary><h3>Confidence Score: 3/5</h3></summary>

- Generally safe but requires fixing `.env.default` before deployment
- The code changes are correct and achieve the intended security
improvement by preventing scope leakage. However, the PR is incomplete -
`.env.default` wasn't updated (critical for deployment) and a frontend
error message reference was missed. These issues will cause
configuration problems for anyone deploying with the new variable name.
- Check `autogpt_platform/backend/.env.default` and
`autogpt_platform/frontend/src/app/(platform)/copilot/styleguide/page.tsx`
- both need updates to match the renamed variable
</details>


<details><summary><h3>Flowchart</h3></summary>

```mermaid
flowchart TD
    A[".env file<br/>COPILOT_LINEAR_API_KEY"] --> B["Secrets model<br/>copilot_linear_api_key"]
    B --> C["feature_requests.py<br/>_get_linear_config()"]
    C --> D["Creates APIKeyCredentials<br/>for copilot feature requests"]
    
    E[".env file<br/>LINEAR_API_KEY"] --> F["ProviderBuilder<br/>in blocks/linear/_config.py"]
    F --> G["Linear blocks integration<br/>for user workflows"]
    
    style A fill:#90EE90
    style B fill:#90EE90
    style C fill:#90EE90
    style D fill:#90EE90
    style E fill:#FFD700
    style F fill:#FFD700
    style G fill:#FFD700
```
</details>


<sub>Last reviewed commit: 86dc57a</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-17 11:16:43 +01:00
Reinier van der Leer
9d4dcbd9e0 fix(backend/docker): Make server last (= default) build stage
Without specifying an explicit build target it would build the `migrate` stage because it is the last stage in the Dockerfile. This caused deployment failures.

- Follow-up to #12124 and 074be7ae
2026-02-16 14:49:30 +01:00
Reinier van der Leer
074be7aea6 fix(backend/docker): Update run commands to match deployment
- Follow-up to #12124

Changes:
- Update `run` commands for all backend services in `docker-compose.platform.yml` to match the deployment commands used in production
- Add trigger on `docker-compose(.platform)?.yml` changes to the Frontend CI workflow
2026-02-16 14:23:29 +01:00
Otto
39d28b24fc ci(backend): Upgrade RabbitMQ from 3.12 (EOL) to 4.1.4 (#12118)
## Summary
Upgrades RabbitMQ from the end-of-life `rabbitmq:3.12-management` to
`rabbitmq:4.1.4`, aligning CI, local dev, and e2e testing with
production.

## Changes

### CI Workflow (`.github/workflows/platform-backend-ci.yml`)
- **Image:** `rabbitmq:3.12-management` → `rabbitmq:4.1.4`
- **Port:** Removed 15672 (management UI) — not used
- **Health check:** Added to prevent flaky tests from race conditions
during startup

### Docker Compose (`docker-compose.platform.yml`,
`docker-compose.test.yaml`)
- **Image:** `rabbitmq:management` → `rabbitmq:4.1.4`
- **Port:** Removed 15672 (management UI) — not used

## Why
- RabbitMQ 3.12 is EOL
- We don't use the management interface, so `-management` variant is
unnecessary
- CI and local dev/e2e should match production (4.1.4)

## Testing
CI validates that backend tests pass against RabbitMQ 4.1.4 on Python
3.11, 3.12, and 3.13.

---
Closes SECRT-1703
2026-02-16 12:45:39 +00:00
Reinier van der Leer
bf79a7748a fix(backend/build): Update stale Poetry usage in Dockerfile (#12124)
[SECRT-2006: Dev deployment failing: poetry not found in container
PATH](https://linear.app/autogpt/issue/SECRT-2006)

- Follow-up to #12090

### Changes 🏗️

- Remove now-broken Poetry path config values
- Remove usage of now-broken `poetry run` in container run command
- Add trigger on `backend/Dockerfile` changes to Frontend CI workflow

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - If it works, CI will pass
2026-02-16 13:54:20 +01:00
Otto
649d4ab7f5 feat(chat): Add delete chat session endpoint and UI (#12112)
## Summary

Adds the ability to delete chat sessions from the CoPilot interface.

## Changes

### Backend
- Add `DELETE /api/chat/sessions/{session_id}` endpoint in `routes.py`
- Returns 204 on success, 404 if not found or not owned by user
- Reuses existing `delete_chat_session` function from `model.py`

### Frontend
- Add delete button (trash icon) that appears on hover for each chat
session
- Add confirmation dialog before deletion using existing
`DeleteConfirmDialog` component
- Refresh session list after successful delete
- Clear current session selection if the deleted session was active
- Update OpenAPI spec with new endpoint

## Testing

1. Hover over a chat session in sidebar → trash icon appears
2. Click trash icon → confirmation dialog
3. Confirm deletion → session removed, list refreshes
4. If deleted session was active, selection is cleared

## Screenshots

Delete button appears on hover, confirmation dialog on click.

## Related Issues

Closes SECRT-1928

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Adds the ability to delete chat sessions from the CoPilot interface — a
new `DELETE /api/chat/sessions/{session_id}` backend endpoint and a
corresponding delete button with confirmation dialog in the
`ChatSidebar` frontend component.

- **Backend route** (`routes.py`): Clean implementation reusing the
existing `delete_chat_session` model function with proper auth guards
and 204/404 responses. No issues.
- **Frontend** (`ChatSidebar.tsx`): Adds hover-visible trash icon per
session, confirmation dialog, mutation with cache invalidation, and
active session clearing on delete. However, it uses a `__legacy__`
component (`DeleteConfirmDialog`) which violates the project's style
guide — new code should use the modern design system components. Error
handling only logs to console without user-facing feedback (project
convention is to use toast notifications for mutation errors).
`isDeleting` is destructured but unused.
- **OpenAPI spec** updated correctly.
- **Unrelated file included**:
`notes/plan-SECRT-1959-graph-edge-desync.md` is a planning document for
a different ticket and should be removed from this PR. The `notes/`
directory is newly introduced and both plan files should be reconsidered
for inclusion.
</details>


<details><summary><h3>Confidence Score: 3/5</h3></summary>

- Functionally correct but has style guide violations and includes
unrelated files that should be addressed before merge.
- The core feature implementation (backend DELETE endpoint and frontend
mutation logic) is sound and follows existing patterns. Score is lowered
because: (1) the frontend uses a legacy component explicitly prohibited
by the project's style guide, (2) mutation errors are not surfaced to
the user, and (3) the PR includes an unrelated planning document for a
different ticket.
- Pay close attention to `ChatSidebar.tsx` for the legacy component
import and error handling, and
`notes/plan-SECRT-1959-graph-edge-desync.md` which should be removed.
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant ChatSidebar as ChatSidebar (Frontend)
    participant ReactQuery as React Query
    participant API as DELETE /api/chat/sessions/{id}
    participant Model as model.delete_chat_session
    participant DB as db.delete_chat_session (Prisma)
    participant Redis as Redis Cache

    User->>ChatSidebar: Click trash icon on session
    ChatSidebar->>ChatSidebar: Show DeleteConfirmDialog
    User->>ChatSidebar: Confirm deletion
    ChatSidebar->>ReactQuery: deleteSession({ sessionId })
    ReactQuery->>API: DELETE /api/chat/sessions/{session_id}
    API->>Model: delete_chat_session(session_id, user_id)
    Model->>DB: delete_many(where: {id, userId})
    DB-->>Model: bool (deleted count > 0)
    Model->>Redis: Delete session cache key
    Model->>Model: Clean up session lock
    Model-->>API: True
    API-->>ReactQuery: 204 No Content
    ReactQuery->>ChatSidebar: onSuccess callback
    ChatSidebar->>ReactQuery: invalidateQueries(sessions list)
    ChatSidebar->>ChatSidebar: Clear sessionId if deleted was active
```
</details>


<sub>Last reviewed commit: 44a92c6</sub>

<!-- greptile_other_comments_section -->

<details><summary><h4>Context used (3)</h4></summary>

- Context from `dashboard` - autogpt_platform/frontend/CLAUDE.md
([source](https://app.greptile.com/review/custom-context?memory=39861924-d320-41ba-a1a7-a8bff44f780a))
- Context from `dashboard` - autogpt_platform/frontend/CONTRIBUTING.md
([source](https://app.greptile.com/review/custom-context?memory=cc4f1b17-cb5c-4b63-b218-c772b48e20ee))
- Context from `dashboard` - autogpt_platform/CLAUDE.md
([source](https://app.greptile.com/review/custom-context?memory=6e9dc5dc-8942-47df-8677-e60062ec8c3a))
</details>


<!-- /greptile_comment -->

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-02-16 12:19:18 +00:00
Ubbe
223df9d3da feat(frontend): improve create/edit copilot UX (#12117)
## Changes 🏗️

Make the UX nicer when running long tasks in Copilot, like creating an
agent, editing it or running a task.

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and play the game!

<!-- greptile_comment -->

<details><summary><h3>Greptile Summary</h3></summary>

This PR replaces the static progress bar and idle wait screens with an
interactive mini-game across the Create, Edit, and Run Agent copilot
tools. The existing mini-game (a simple runner with projectile-dodge
boss encounters) is significantly overhauled into a two-mode game: a
runner mode with animated tree obstacles and a duel mode featuring a
melee boss fight with attack, guard, and movement mechanics.
Sprite-based rendering replaces the previous shape-drawing approach.

- **Create/Edit/Run Agent UX**: All three tool views now show the
mini-game with contextual overlays during long-running operations,
replacing the progress bar in EditAgent and adding the game to RunAgent
- **Game mechanics overhaul**: Boss encounters changed from
projectile-dodging to melee duel with attack (Z), block (X), movement
(arrows), and jump (Space) controls
- **Sprite rendering**: Added 9 sprite sheet assets for characters,
trees, and boss animations with fallback to shape rendering if images
fail to load
- **UI overlays**: Added React-managed overlay states for idle,
boss-intro, boss-defeated, and game-over screens with continue/retry
buttons
- **Minor issues found**: Unused `isRunActive` variable in
`MiniGame.tsx`, unreachable "leaving" boss phase in `useMiniGame.ts`,
and a missing `expanded` property in `getAccordionMeta` return type
annotation in `EditAgent.tsx`
- **Unused asset**: `archer-shoot.png` is included in the PR but never
imported or referenced in any code
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- This PR is safe to merge — it only affects the copilot mini-game UX
with no backend or data model changes.
- The changes are entirely frontend/cosmetic, scoped to the copilot
tools' waiting UX. The mini-game logic is self-contained in a
canvas-based hook and doesn't affect any application state, API calls,
or routing. The issues found are minor (unused variable, dead code, type
annotation gap, unused asset) and don't impact runtime behavior.
- `useMiniGame.ts` has the most complex logic changes (boss AI, death
animations, sprite rendering) and contains unreachable dead code in the
"leaving" phase handler. `EditAgent.tsx` has a return type annotation
that doesn't include `expanded`.
</details>


<details><summary><h3>Flowchart</h3></summary>

```mermaid
flowchart TD
    A[Game Idle] -->|"Start button"| B[Run Mode]
    B -->|"Jump over trees"| C{Score >= Threshold?}
    C -->|No| B
    C -->|"Yes, obstacles clear"| D[Boss Intro Overlay]
    D -->|"Continue button"| E[Duel Mode]
    E -->|"Attack Z / Guard X / Move ←→"| F{Boss HP <= 0?}
    F -->|No| G{Player hit & not guarding?}
    G -->|No| E
    G -->|Yes| H[Player Death Animation]
    H --> I[Game Over Overlay]
    I -->|"Retry button"| B
    F -->|Yes| J[Boss Death Animation]
    J --> K[Boss Defeated Overlay]
    K -->|"Continue button"| L[Reset Boss & Resume Run]
    L --> B
```
</details>


<sub>Last reviewed commit: ad80e24</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-16 10:53:08 +00:00
Ubbe
187ab04745 refactor(frontend): remove OldAgentLibraryView and NEW_AGENT_RUNS flag (#12088)
## Summary
- Removes the deprecated `OldAgentLibraryView` directory (13 files,
~2200 lines deleted)
- Removes the `NEW_AGENT_RUNS` feature flag from the `Flag` enum and
defaults
- Removes the legacy agent library page at `library/legacy/[id]`
- Moves shared `CronScheduler` components to
`src/components/contextual/CronScheduler/`
- Moves `agent-run-draft-view` and `agent-status-chip` to
`legacy-builder/` (co-located with their only consumer)
- Updates all import paths in consuming files (`AgentInfoStep`,
`SaveControl`, `RunnerInputUI`, `useRunGraph`)

## Test plan
- [x] `pnpm format` passes
- [x] `pnpm types` passes (no TypeScript errors)
- [x] No remaining references to `OldAgentLibraryView`,
`NEW_AGENT_RUNS`, or `new-agent-runs` in the codebase
- [x] Verify `RunnerInputUI` dialog still works in the legacy builder
- [x] Verify `AgentInfoStep` cron scheduling works in the publish modal
- [x] Verify `SaveControl` cron scheduling works in the legacy builder

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

This PR removes deprecated code from the legacy agent library view
system and consolidates the codebase to use the new agent runs
implementation exclusively. The refactor successfully removes ~2200
lines of code across 13 deleted files while properly relocating shared
components.

**Key changes:**
- Removed the entire `OldAgentLibraryView` directory and its 13
component files
- Removed the `NEW_AGENT_RUNS` feature flag from the `Flag` enum and
defaults
- Deleted the legacy agent library page route at `library/legacy/[id]`
- Moved `CronScheduler` components to
`src/components/contextual/CronScheduler/` for shared use across the
application
- Moved `agent-run-draft-view` and `agent-status-chip` to
`legacy-builder/` directory, co-locating them with their only consumer
- Updated `useRunGraph.ts` to import `GraphExecutionMeta` from the
generated API models instead of the deleted custom type definition
- Updated all import paths in consuming components (`AgentInfoStep`,
`SaveControl`, `RunnerInputUI`)

**Technical notes:**
- The new import path for `GraphExecutionMeta`
(`@/app/api/__generated__/models/graphExecutionMeta`) will be generated
when running `pnpm generate:api` from the OpenAPI spec
- All references to the old code have been cleanly removed from the
codebase
- The refactor maintains proper separation of concerns by moving shared
components to contextual locations
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- This PR is safe to merge with minimal risk, pending manual
verification of the UI components mentioned in the test plan
- The refactor is well-structured and all code changes are correct. The
score of 4 (rather than 5) reflects that the PR author has marked three
manual testing items as incomplete in the test plan: verifying
`RunnerInputUI` dialog, `AgentInfoStep` cron scheduling, and
`SaveControl` cron scheduling. While the code changes are sound, these
UI components should be manually tested before merging to ensure the
moved components work correctly in their new locations.
- No files require special attention. The author should complete the
manual testing checklist items for `RunnerInputUI`, `AgentInfoStep`, and
`SaveControl` as noted in the test plan.
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant FE as Frontend Build
    participant API as Backend API
    participant Gen as Generated Types

    Note over Dev,Gen: Refactor: Remove OldAgentLibraryView & NEW_AGENT_RUNS flag

    Dev->>FE: Delete OldAgentLibraryView (13 files, ~2200 lines)
    Dev->>FE: Remove NEW_AGENT_RUNS from Flag enum
    Dev->>FE: Delete library/legacy/[id]/page.tsx
    
    Dev->>FE: Move CronScheduler → src/components/contextual/
    Dev->>FE: Move agent-run-draft-view → legacy-builder/
    Dev->>FE: Move agent-status-chip → legacy-builder/
    
    Dev->>FE: Update RunnerInputUI import path
    Dev->>FE: Update SaveControl import path
    Dev->>FE: Update AgentInfoStep import path
    
    Dev->>FE: Update useRunGraph.ts
    FE->>Gen: Import GraphExecutionMeta from generated models
    Note over Gen: Type available after pnpm generate:api
    
    Gen-->>API: Uses OpenAPI spec schema
    API-->>FE: Type-safe GraphExecutionMeta model
```
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:29:59 +08:00
Abhimanyu Yadav
e2d3c8a217 fix(frontend): Prevent node drag when selecting text in object editor key input (#11955)
## Summary
- Add `nodrag` class to the key name input wrapper in
`WrapIfAdditionalTemplate.tsx`
- This prevents the node from being dragged when users try to select
text in the key name input field
- Follows the same pattern used by other input components like
`TextWidget.tsx`

## Test plan
- [x] Open the new builder
- [x] Add a custom node with an Object input field
- [x] Try to select text in the key name input by clicking and dragging
- [x] Verify that text selection works without moving the block

Co-authored-by: Claude <noreply@anthropic.com>
2026-02-16 06:59:33 +00:00
Eve
647c8ed8d4 feat(backend/blocks): enhance list concatenation with advanced operations (#12105)
## Summary

Enhances the existing `ConcatenateListsBlock` and adds five new
companion blocks for comprehensive list manipulation, addressing issue
#11139 ("Implement block to concatenate lists").

### Changes

- **Enhanced `ConcatenateListsBlock`** with optional deduplication
(`deduplicate`) and None-value filtering (`remove_none`), plus an output
`length` field
- **New `FlattenListBlock`**: Recursively flattens nested list
structures with configurable `max_depth`
- **New `InterleaveListsBlock`**: Round-robin interleaving of elements
from multiple lists
- **New `ZipListsBlock`**: Zips corresponding elements from multiple
lists with support for padding to longest or truncating to shortest
- **New `ListDifferenceBlock`**: Computes set difference between two
lists (regular or symmetric)
- **New `ListIntersectionBlock`**: Finds common elements between two
lists, preserving order

### Helper Utilities

Extracted reusable helper functions for validation, flattening,
deduplication, interleaving, chunking, and statistics computation to
support the blocks and enable future reuse.

### Test Coverage

Comprehensive test suite with 188 test functions across 29 test classes
covering:
- Built-in block test harness validation for all 6 blocks
- Manual edge-case tests for each block (empty inputs, large lists,
mixed types, nested structures)
- Internal method tests for all block classes
- Unit tests for all helper utility functions

Closes #11139

## Test plan

- [x] All files pass Python syntax validation (`ast.parse`)
- [x] Built-in `test_input`/`test_output` tests defined for all blocks
- [x] Manual tests cover edge cases: empty lists, large lists, mixed
types, nested structures, deduplication, None removal
- [x] Helper function tests validate all utility functions independently
- [x] All block IDs are valid UUID4
- [x] Block categories set to `BlockCategory.BASIC` for consistency with
existing list blocks


<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Enhanced `ConcatenateListsBlock` with deduplication and None-filtering
options, and added five new list manipulation blocks
(`FlattenListBlock`, `InterleaveListsBlock`, `ZipListsBlock`,
`ListDifferenceBlock`, `ListIntersectionBlock`) with comprehensive
helper functions and test coverage.

**Key Changes:**
- Enhanced `ConcatenateListsBlock` with `deduplicate` and `remove_none`
options, plus `length` output field
- Added `FlattenListBlock` for recursively flattening nested lists with
configurable `max_depth`
- Added `InterleaveListsBlock` for round-robin element interleaving
- Added `ZipListsBlock` with support for padding/truncation
- Added `ListDifferenceBlock` and `ListIntersectionBlock` for set
operations
- Extracted 12 reusable helper functions for validation, flattening,
deduplication, etc.
- Comprehensive test suite with 188 test functions covering edge cases

**Minor Issues:**
- Helper function `_deduplicate_list` has redundant logic in the `else`
branch that duplicates the `if` branch
- Three helper functions (`_filter_empty_collections`,
`_compute_list_statistics`, `_chunk_list`) are defined but unused -
consider removing unless planned for future use
- The `_make_hashable` function uses `hash(repr(item))` for unhashable
types, which correctly treats structurally identical dicts/lists as
duplicates
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- Safe to merge with minor style improvements recommended
- The implementation is well-structured with comprehensive test coverage
(188 tests), proper error handling, and follows existing block patterns.
All blocks use valid UUID4 IDs and correct categories. The helper
functions provide good code reuse. The minor issues are purely stylistic
(redundant code, unused helpers) and don't affect functionality or
safety.
- No files require special attention - both files are well-tested and
follow project conventions
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant Block as List Block
    participant Helper as Helper Functions
    participant Output
    
    User->>Block: Input (lists/parameters)
    Block->>Helper: _validate_all_lists()
    Helper-->>Block: validation result
    
    alt validation fails
        Block->>Output: error message
    else validation succeeds
        Block->>Helper: _concatenate_lists_simple() / _flatten_nested_list() / etc.
        Helper-->>Block: processed result
        
        opt deduplicate enabled
            Block->>Helper: _deduplicate_list()
            Helper-->>Block: deduplicated result
        end
        
        opt remove_none enabled
            Block->>Helper: _filter_none_values()
            Helper-->>Block: filtered result
        end
        
        Block->>Output: result + length
    end
    
    Output-->>User: Block outputs
```
</details>


<sub>Last reviewed commit: a6d5445</sub>

<!-- greptile_other_comments_section -->

<sub>(2/5) Greptile learns from your feedback when you react with thumbs
up/down!</sub>

<!-- /greptile_comment -->

---------

Co-authored-by: Otto <otto@agpt.co>
2026-02-16 05:39:53 +00:00
Zamil Majdy
27d94e395c feat(backend/sdk): enable WebSearch, block WebFetch, consolidate tool constants (#12108)
## Summary
- Enable Claude Agent SDK built-in **WebSearch** tool (Brave Search via
Anthropic API) for the CoPilot SDK agent
- Explicitly **block WebFetch** via `SDK_DISALLOWED_TOOLS`. The agent
uses the SSRF-protected `mcp__copilot__web_fetch` MCP tool instead
- **Consolidate** all tool security constants (`BLOCKED_TOOLS`,
`WORKSPACE_SCOPED_TOOLS`, `DANGEROUS_PATTERNS`, `SDK_DISALLOWED_TOOLS`)
into `tool_adapter.py` as a single source of truth — previously
scattered across `tool_adapter.py`, `security_hooks.py`, and inline in
`service.py`

## Changes
- `tool_adapter.py`: Add `WebSearch` to `_SDK_BUILTIN_TOOLS`, add
`SDK_DISALLOWED_TOOLS`, move security constants here
- `security_hooks.py`: Import constants from `tool_adapter.py` instead
of defining locally
- `service.py`: Use `SDK_DISALLOWED_TOOLS` instead of inline `["Bash"]`

## Test plan
- [x] All 21 security hooks tests pass
- [x] Ruff lint clean
- [x] All pre-commit hooks pass
- [ ] Verify WebSearch works in CoPilot chat (manual test)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Consolidates tool security constants into `tool_adapter.py` as single
source of truth, enables WebSearch (Brave via Anthropic API), and
explicitly blocks WebFetch to prevent SSRF attacks. The change improves
security by ensuring the agent uses the SSRF-protected
`mcp__copilot__web_fetch` tool instead of the built-in WebFetch which
can access internal networks like `localhost:8006`.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- The changes improve security by blocking WebFetch (SSRF risk) while
enabling safe WebSearch. The consolidation of constants into a single
source of truth improves maintainability. All existing tests pass (21
security hooks tests), and the refactoring is straightforward with no
behavioral changes to existing security logic. The only suggestions are
minor improvements: adding a test for WebFetch blocking and considering
a lowercase alias for consistency.
- No files require special attention
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant Agent as SDK Agent
    participant Hooks as Security Hooks
    participant TA as tool_adapter.py
    participant MCP as MCP Tools
    
    Note over TA: SDK_DISALLOWED_TOOLS = ["Bash", "WebFetch"]
    Note over TA: _SDK_BUILTIN_TOOLS includes WebSearch
    
    Agent->>Hooks: Request WebSearch (Brave API)
    Hooks->>TA: Check BLOCKED_TOOLS
    TA-->>Hooks: Not blocked
    Hooks-->>Agent: Allowed ✓
    Agent->>Agent: Execute via Anthropic API
    
    Agent->>Hooks: Request WebFetch (SSRF risk)
    Hooks->>TA: Check BLOCKED_TOOLS
    Note over TA: WebFetch in SDK_DISALLOWED_TOOLS
    TA-->>Hooks: Blocked
    Hooks-->>Agent: Denied ✗
    Note over Agent: Use mcp__copilot__web_fetch instead
    
    Agent->>Hooks: Request mcp__copilot__web_fetch
    Hooks->>MCP: Validate (MCP tool, not SDK builtin)
    MCP-->>Hooks: Has SSRF protection
    Hooks-->>Agent: Allowed ✓
    Agent->>MCP: Execute with SSRF checks
```
</details>


<sub>Last reviewed commit: 2d9975f</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-15 06:51:25 +00:00
DEEVEN SERU
b8f5c208d0 Handle errors in Jina ExtractWebsiteContentBlock (#12048)
## Summary
- catch Jina reader client/server errors in ExtractWebsiteContentBlock
and surface a clear error output keyed to the user URL
- guard empty responses to return an explicit error instead of yielding
blank content
- add regression tests covering the happy path and HTTP client failures
via a monkeypatched fetch

## Testing
- not run (pytest unavailable in this environment)

---------

Co-authored-by: Nicholas Tindle <nicktindle@outlook.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-02-13 19:15:09 +00:00
Otto
ca216dfd7f ci(docs-claude-review): Update comments instead of creating new ones (#12106)
## Changes 🏗️

This PR updates the Claude Block Docs Review CI workflow to update
existing comments instead of creating new ones on each push.

### What's Changed:
1. **Concurrency group** - Prevents race conditions if the workflow runs
twice simultaneously
2. **Comment cleanup step** - Deletes any previous Claude review comment
before posting a new one
3. **Marker instruction** - Instructs Claude to include a `<!--
CLAUDE_DOCS_REVIEW -->` marker in its comment for identification

### Why:
Previously, every PR push would create a new review comment, cluttering
the PR with multiple comments. Now only the most recent review is shown.

### Testing:
1. Create a PR that triggers this workflow (modify a file in
`docs/integrations/` or `autogpt_platform/backend/backend/blocks/`)
2. Verify first run creates comment with marker
3. Push another commit
4. Verify old comment is deleted and new comment is created (not
accumulated)

Requested by @Bentlybro

---

## Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan (will be
tested on merge)

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Added concurrency control and comment deduplication to prevent multiple
Claude review comments from accumulating on PRs. The workflow now
deletes previous review comments (identified by `<!-- CLAUDE_DOCS_REVIEW
-->` marker) before posting new ones, and uses concurrency groups to
prevent race conditions.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- The changes are well-contained, follow GitHub Actions best practices,
and use built-in GitHub APIs safely. The concurrency control prevents
race conditions, and the comment cleanup logic uses proper filtering
with `head -1` to handle edge cases. The HTML comment marker approach is
standard and reliable.
- No files require special attention
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant GH as GitHub PR Event
    participant WF as Workflow
    participant API as GitHub API
    participant Claude as Claude Action
    
    GH->>WF: PR opened/synchronized
    WF->>WF: Check concurrency group
    Note over WF: Cancel any in-progress runs<br/>for same PR number
    WF->>API: Query PR comments
    API-->>WF: Return all comments
    WF->>WF: Filter for CLAUDE_DOCS_REVIEW marker
    alt Previous comment exists
        WF->>API: DELETE comment by ID
        API-->>WF: Comment deleted
    else No previous comment
        WF->>WF: Skip deletion
    end
    WF->>Claude: Run code review
    Claude->>API: POST new comment with marker
    API-->>Claude: Comment created
```
</details>


<sub>Last reviewed commit: fb1b436</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-13 16:46:23 +00:00
Zamil Majdy
f9f358c526 feat(mcp): Add MCP tool block with OAuth, tool discovery, and standard credential integration (#12011)
## Summary

<img width="1000" alt="image"
src="https://github.com/user-attachments/assets/18e8ef34-d222-453c-8b0a-1b25ef8cf806"
/>


<img width="250" alt="image"
src="https://github.com/user-attachments/assets/ba97556c-09c5-4f76-9f4e-49a2e8e57468"
/>

<img width="250" alt="image"
src="https://github.com/user-attachments/assets/68f7804a-fe74-442d-9849-39a229c052cf"
/>

<img width="250" alt="image"
src="https://github.com/user-attachments/assets/700690ba-f9fe-4726-8871-3bfbab586001"
/>

Full-stack MCP (Model Context Protocol) tool block integration that
allows users to connect to any MCP server, discover available tools,
authenticate via OAuth, and execute tools — all through the standard
AutoGPT credential system.

### Backend

- **MCPToolBlock** (`blocks/mcp/block.py`): New block using
`CredentialsMetaInput` pattern with optional credentials (`default={}`),
supporting both authenticated (OAuth) and public MCP servers. Includes
auto-lookup fallback for backward compatibility.
- **MCP Client** (`blocks/mcp/client.py`): HTTP transport with JSON-RPC
2.0, tool discovery, tool execution with robust error handling
(type-checked error fields, non-JSON response handling)
- **MCP OAuth Handler** (`blocks/mcp/oauth.py`): RFC 8414 discovery,
dynamic per-server OAuth with PKCE, token storage and refresh via
`raise_for_status=True`
- **MCP API Routes** (`api/features/mcp/routes.py`): `discover-tools`,
`oauth/login`, `oauth/callback` endpoints with credential cleanup,
defensive OAuth metadata validation
- **Credential system integration**:
- `CredentialsMetaInput` model_validator normalizes legacy
`"ProviderName.MCP"` format from Python 3.13's `str(StrEnum)` change
- `CredentialsFieldInfo.combine()` supports URL-based credential
discrimination (each MCP server gets its own credential entry)
- `aggregate_credentials_inputs` checks block schema defaults for
credential optionality
- Executor normalizes credential data for both Pydantic and JSON schema
validation paths
  - Chat credential matching handles MCP server URL filtering
- `provider_matches()` helper used consistently for Python 3.13 StrEnum
compatibility
- **Pre-run validation**: `_validate_graph_get_errors` now calls
`get_missing_input()` for custom block-level validation (MCP tool
arguments)
- **Security**: HTML tag stripping loop to prevent XSS bypass, SSRF
protection (removed trusted_origins)

### Frontend

- **MCPToolDialog** (`MCPToolDialog.tsx`): Full tool discovery UI —
enter server URL, authenticate if needed, browse tools, select tool and
configure
- **OAuth popup** (`oauth-popup.ts`): Shared utility supporting
cross-origin MCP OAuth flows with BroadcastChannel + localStorage
fallback
- **Credential integration**: MCP-specific OAuth flow in
`useCredentialsInput`, server URL filtering in `useCredentials`, MCP
callback page
- **CredentialsSelect**: Auto-selects first available credential instead
of defaulting to "None", credentials listed before "None" in dropdown
- **Node rendering**: Dynamic tool input schema rendering on MCP nodes,
proper handling in both legacy and new flow editors
- **Block title persistence**: `customized_name` set at block creation
for both MCP and Agent blocks — no fallback logic needed, titles survive
save/load reliably
- **Stable credential ordering**: Removed `sortByUnsetFirst` that caused
credential inputs to jump when selected

### Tests (~2060 lines)

- Unit tests: block, client, tool execution
- Integration tests: mock MCP server with auth
- OAuth flow tests
- API endpoint tests
- Credential combining/optionality tests
- E2e tests (skipped in CI, run manually)

## Key Design Decisions

1. **Optional credentials via `default={}`**: MCP servers can be public
(no auth) or private (OAuth). The `credentials` field has `default={}`
making it optional at the schema level, so public servers work without
prompting for credentials.

2. **URL-based credential discrimination**: Each MCP server URL gets its
own credential entry in the "Run agent" form (via
`discriminator="server_url"`), so agents using multiple MCP servers
prompt for each independently.

3. **Model-level normalization**: Python 3.13 changed `str(StrEnum)` to
return `"ClassName.MEMBER"`. Rather than scattering fixes across the
codebase, a Pydantic `model_validator(mode="before")` on
`CredentialsMetaInput` handles normalization centrally, and
`provider_matches()` handles lookups.

4. **Credential auto-select**: `CredentialsSelect` component defaults to
the first available credential and notifies the parent state, ensuring
credentials are pre-filled in the "Run agent" dialog without requiring
manual selection.

5. **customized_name for block titles**: Both MCP and Agent blocks set
`customized_name` in metadata at creation time. This eliminates
convoluted runtime fallback logic (`agent_name`, hostname extraction) —
the title is persisted once and read directly.

## Test plan

- [x] Unit/integration tests pass (68 MCP + 11 graph = 79 tests)
- [x] Manual: MCP block with public server (DeepWiki) — no credentials
needed, tools discovered and executable
- [x] Manual: MCP block with OAuth server (Linear, Sentry) — OAuth flow
prompts correctly
- [x] Manual: "Run agent" form shows correct credential requirements per
MCP server
- [x] Manual: Credential auto-selects when exactly one matches,
pre-selects first when multiple exist
- [x] Manual: Credential ordering stays stable when
selecting/deselecting
- [x] Manual: MCP block title persists after save and refresh
- [x] Manual: Agent block title persists after save and refresh (via
customized_name)
- [ ] Manual: Shared agent with MCP block prompts new user for
credentials

---------

Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: Ubbe <hi@ubbe.dev>
2026-02-13 16:17:03 +00:00
Zamil Majdy
52b3aebf71 feat(backend/sdk): Claude Agent SDK integration for CoPilot (#12103)
## Summary

Full integration of the **Claude Agent SDK** to replace the existing
one-turn OpenAI-compatible CoPilot implementation with a multi-turn,
tool-using AI agent.

### What changed

**Core SDK Integration** (`chat/sdk/` — new module)
- **`service.py`**: Main orchestrator — spawns Claude Code CLI as a
subprocess per user message, streams responses back via SSE. Handles
conversation history compression, session lifecycle, and error recovery.
- **`response_adapter.py`**: Translates Claude Agent SDK events (text
deltas, tool use, errors, result messages) into the existing CoPilot
`StreamEvent` protocol so the frontend works unchanged.
- **`tool_adapter.py`**: Bridges CoPilot's MCP tools (find_block,
run_block, create_agent, etc.) into the SDK's tool format. Handles
schema conversion and result serialization.
- **`security_hooks.py`**: Pre/Post tool-use hooks that enforce a strict
allowlist of tools, block path traversal, sandbox file operations to
per-session workspace directories, cap sub-agent spawning, and prevent
the model from accessing unauthorized system resources.
- **`transcript.py`**: JSONL transcript I/O utilities for the stateless
`--resume` feature (see below).

**Stateless Multi-Turn Resume** (new)
- Instead of compressing conversation history via LLM on every turn
(lossy and expensive), we capture Claude Code's native JSONL session
transcript via a **Stop hook** callback, persist it in the DB
(`ChatSession.sdkTranscript`), and restore it on the next turn via
`--resume <file>`.
- This preserves full tool call/result context across turns with zero
token overhead for history.
- Feature-flagged via `CLAUDE_AGENT_USE_RESUME` (default: off).
- DB migration: `ALTER TABLE "ChatSession" ADD COLUMN "sdkTranscript"
TEXT`.

**Sandboxed Tool Execution** (`chat/tools/`)
- **`bash_exec.py`**: Sandboxed bash execution using bubblewrap
(`bwrap`) with read-only root filesystem, per-session writable
workspace, resource limits (CPU, memory, file size), and network
isolation.
- **`sandbox.py`**: Shared bubblewrap sandbox infrastructure — generates
`bwrap` command lines with configurable mounts, environment, and
resource constraints.
- **`web_fetch.py`**: URL fetching tool with domain allowlist, size
limits, and content-type filtering.
- **`check_operation_status.py`**: Polling tool for long-running
operations (agent creation, block execution) so the SDK doesn't block
waiting.
- **`find_block.py`** / **`run_block.py`**: Enhanced with category
filtering, optimized response size (removed raw JSON schemas), and
better error handling.

**Security**
- Path traversal prevention: session IDs sanitized, all file ops
confined to workspace dirs, symlink resolution.
- Tool allowlist enforcement via SDK hooks — model cannot call arbitrary
tools.
- Built-in `Bash` tool blocked via `disallowed_tools` to prevent
bypassing sandboxed `bash_exec`.
- Sub-agent (`Task`) spawning capped at configurable limit (default:
10).
- CodeQL-clean path sanitization patterns.

**Streaming & Reconnection**
- SSE stream registry backed by Redis Streams for crash-resilient
reconnection.
- Long-running operation tracking with TTL-based cleanup.
- Atomic message append to prevent race conditions on concurrent writes.

**Configuration** (`config.py`)
- `use_claude_agent_sdk` — master toggle (default: on)
- `claude_agent_model` — model override for SDK path
- `claude_agent_max_buffer_size` — JSON parsing buffer (10MB)
- `claude_agent_max_subtasks` — sub-agent cap (10)
- `claude_agent_use_resume` — transcript-based resume (default: off)
- `thinking_enabled` — extended thinking for Claude models

**Tests**
- `sdk/response_adapter_test.py` — 366 lines covering all event
translation paths
- `sdk/security_hooks_test.py` — 165 lines covering tool blocking, path
traversal, subtask limits
- `chat/model_test.py` — 214 lines covering session model serialization
- `chat/service_test.py` — Integration tests including multi-turn resume
keyword recall
- `tools/find_block_test.py` / `run_block_test.py` — Extended with new
tool behavior tests

## Test plan
- [x] Unit tests pass (`sdk/response_adapter_test.py`,
`security_hooks_test.py`, `model_test.py`)
- [x] Integration test: multi-turn keyword recall via `--resume`
(`service_test.py::test_sdk_resume_multi_turn`)
- [x] Manual E2E: CoPilot chat sessions with tool calls, bash execution,
and multi-turn context
- [x] Pre-commit hooks pass (ruff, isort, black, pyright, flake8)
- [ ] Staging deployment with `claude_agent_use_resume=false` initially
- [ ] Enable resume in staging, verify transcript capture and recall

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

This PR replaces the existing OpenAI-compatible CoPilot with a full
Claude Agent SDK integration, introducing multi-turn conversations,
stateless resume via JSONL transcripts, and sandboxed tool execution.

**Key changes:**
- **SDK integration** (`chat/sdk/`): spawns Claude Code CLI subprocess
per message, translates events to frontend protocol, bridges MCP tools
- **Stateless resume**: captures JSONL transcripts via Stop hook,
persists in `ChatSession.sdkTranscript`, restores with `--resume`
(feature-flagged, default off)
- **Sandboxed execution**: bubblewrap sandbox for bash commands with
filesystem whitelist, network isolation, resource limits
- **Security hooks**: tool allowlist enforcement, path traversal
prevention, workspace-scoped file operations, sub-agent spawn limits
- **Long-running operations**: delegates `create_agent`/`edit_agent` to
existing stream_registry infrastructure for SSE reconnection
- **Feature flag**: `CHAT_USE_CLAUDE_AGENT_SDK` with LaunchDarkly
support, defaults to enabled

**Security issues found:**
- Path traversal validation has logic errors in `security_hooks.py:82`
(tilde expansion order) and `service.py:266` (redundant `..` check)
- Config validator always prefers env var over explicit `False` value
(`config.py:162`)
- Race condition in `routes.py:323` — message persisted before task
registration, could duplicate on retry
- Resource limits in sandbox may fail silently (`sandbox.py:109`)

**Test coverage is strong** with 366 lines for response adapter, 165 for
security hooks, and integration tests for multi-turn resume.
</details>


<details><summary><h3>Confidence Score: 3/5</h3></summary>

- This PR is generally safe but has critical security issues in path
validation that must be fixed before merge
- Score reflects strong architecture and test coverage offset by real
security vulnerabilities: the tilde expansion bug in `security_hooks.py`
could allow sandbox escape, the race condition could cause message
duplication, and the silent ulimit failures could bypass resource
limits. The bubblewrap sandbox and allowlist enforcement are
well-designed, but the path validation bugs need fixing. The transcript
resume feature is properly feature-flagged. Overall the implementation
is solid but the security issues prevent a higher score.
- Pay close attention to
`backend/api/features/chat/sdk/security_hooks.py` (path traversal
vulnerability), `backend/api/features/chat/routes.py` (race condition),
`backend/api/features/chat/tools/sandbox.py` (silent resource limit
failures), and `backend/api/features/chat/sdk/service.py` (redundant
security check)
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant Frontend
    participant Routes as routes.py
    participant SDKService as sdk/service.py
    participant ClaudeSDK as Claude Agent SDK CLI
    participant SecurityHooks as security_hooks.py
    participant ToolAdapter as tool_adapter.py
    participant CoPilotTools as tools/*
    participant Sandbox as sandbox.py (bwrap)
    participant DB as Database
    participant Redis as stream_registry

    Frontend->>Routes: POST /chat (user message)
    Routes->>SDKService: stream_chat_completion_sdk()
    
    SDKService->>DB: get_chat_session()
    DB-->>SDKService: session + messages
    
    alt Resume enabled AND transcript exists
        SDKService->>SDKService: validate_transcript()
        SDKService->>SDKService: write_transcript_to_tempfile()
        Note over SDKService: Pass --resume to SDK
    else No resume
        SDKService->>SDKService: _compress_conversation_history()
        Note over SDKService: Inject history into user message
    end
    
    SDKService->>SecurityHooks: create_security_hooks()
    SDKService->>ToolAdapter: create_copilot_mcp_server()
    SDKService->>ClaudeSDK: spawn subprocess with MCP server
    
    loop Streaming Conversation
        ClaudeSDK->>SDKService: AssistantMessage (text/tool_use)
        SDKService->>Frontend: StreamTextDelta / StreamToolInputAvailable
        
        alt Tool Call
            ClaudeSDK->>SecurityHooks: PreToolUse hook
            SecurityHooks->>SecurityHooks: validate path, check allowlist
            alt Tool blocked
                SecurityHooks-->>ClaudeSDK: deny
            else Tool allowed
                SecurityHooks-->>ClaudeSDK: allow
                ClaudeSDK->>ToolAdapter: call MCP tool
                
                alt Long-running tool (create_agent, edit_agent)
                    ToolAdapter->>Redis: register task
                    ToolAdapter->>DB: save OperationPendingResponse
                    ToolAdapter->>ToolAdapter: spawn background task
                    ToolAdapter-->>ClaudeSDK: OperationStartedResponse
                else Regular tool (find_block, bash_exec)
                    ToolAdapter->>CoPilotTools: execute()
                    alt bash_exec
                        CoPilotTools->>Sandbox: run_sandboxed()
                        Sandbox->>Sandbox: build bwrap command
                        Note over Sandbox: Network isolation,<br/>filesystem whitelist,<br/>resource limits
                        Sandbox-->>CoPilotTools: stdout, stderr, exit_code
                    end
                    CoPilotTools-->>ToolAdapter: result
                    ToolAdapter->>ToolAdapter: stash full output
                    ToolAdapter-->>ClaudeSDK: MCP response
                end
                
                SecurityHooks->>SecurityHooks: PostToolUse hook (log)
            end
        end
        
        ClaudeSDK->>SDKService: UserMessage (ToolResultBlock)
        SDKService->>ToolAdapter: pop_pending_tool_output()
        SDKService->>Frontend: StreamToolOutputAvailable
    end
    
    ClaudeSDK->>SecurityHooks: Stop hook
    SecurityHooks->>SDKService: transcript_path callback
    SDKService->>SDKService: read_transcript_file()
    SDKService->>DB: save transcript to session.sdkTranscript
    
    ClaudeSDK->>SDKService: ResultMessage (success)
    SDKService->>Frontend: StreamFinish
    SDKService->>DB: upsert_chat_session()
```
</details>


<sub>Last reviewed commit: 28c1121</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Co-authored-by: Swifty <craigswift13@gmail.com>
2026-02-13 15:49:03 +00:00
Otto
965b7d3e04 dx: Add PR overlap detection & alert (#12104)
## Summary

Adds an automated workflow that detects potential merge conflicts
between open PRs, helping contributors coordinate proactively.

**Example output:** [See comment on PR
#12057](https://github.com/Significant-Gravitas/AutoGPT/pull/12057#issuecomment-3897330632)

## How it works

1. **Triggered on PR events** — runs when a PR is opened, pushed to, or
reopened
2. **Compares against all open PRs** targeting the same base branch
3. **Detects overlaps** at multiple levels:
   - File overlap (same files modified)
   - Line overlap (same line ranges modified)
   - Actual merge conflicts (attempts real merges)
4. **Posts a comment** on the PR with findings

## Features

- Full file paths with common prefix extraction for readability
- Conflict size (number of conflict regions + lines affected)
- Conflict types (content, added, deleted, modified/deleted, etc.)
- Last-updated timestamps for each PR
- Risk categorization (conflict, medium, low)
- Ignores noise files (openapi.json, lock files)
- Updates existing comment on subsequent pushes (no spam)
- Filters out PRs older than 14 days
- Clone-once optimization for fast merge testing (~48s for 19 PRs)

## Files

- `.github/scripts/detect_overlaps.py` — main detection script
- `.github/workflows/pr-overlap-check.yml` — workflow definition
2026-02-13 15:45:10 +00:00
Bently
c2368f15ff fix(blocks): disable PrintToConsoleBlock (#12100)
## Summary
Disables the Print to Console block as requested by Nick Tindle.

## Changes
- Added `disabled=True` to PrintToConsoleBlock in `basic.py`

## Testing
- Block will no longer appear in the platform UI
- Existing graphs using this block should be checked (block ID:
`f3b1c1b2-4c4f-4f0d-8d2f-4c4f0d8d2f4c`)

Closes OPEN-3000

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Added `disabled=True` parameter to `PrintToConsoleBlock` in `basic.py`
per Nick Tindle's request (OPEN-3000).

- Block follows the same disabling pattern used by other blocks in the
codebase (e.g., `BlockInstallationBlock`, video blocks, Ayrshare blocks)
- Block will no longer appear in the platform UI for new graph creation
- Existing graphs using this block (ID:
`f3b1c1b2-4c4f-4f0d-8d2f-4c4f0d8d2f4c`) will need to be checked for
compatibility
- Comment properly documents the reason for disabling
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- Single-line change that adds a well-documented flag following existing
patterns used throughout the codebase. The change is non-destructive and
only affects UI visibility of the block for new graphs.
- No files require special attention
</details>


<sub>Last reviewed commit: 759003b</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-13 15:20:23 +00:00
dependabot[bot]
9ac3f64d56 chore(deps): bump github/codeql-action from 3 to 4 (#12033)
Bumps [github/codeql-action](https://github.com/github/codeql-action)
from 3 to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/releases">github/codeql-action's
releases</a>.</em></p>
<blockquote>
<h2>v3.32.2</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.1">2.24.1</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3460">#3460</a></li>
</ul>
<h2>v3.32.1</h2>
<ul>
<li>A warning is now shown in Default Setup workflow logs if a <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registry is configured</a> using a GitHub Personal Access Token
(PAT), but no username is configured. <a
href="https://redirect.github.com/github/codeql-action/pull/3422">#3422</a></li>
<li>Fixed a bug which caused the CodeQL Action to fail when repository
properties cannot successfully be retrieved. <a
href="https://redirect.github.com/github/codeql-action/pull/3421">#3421</a></li>
</ul>
<h2>v3.32.0</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.0">2.24.0</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3425">#3425</a></li>
</ul>
<h2>v3.31.11</h2>
<ul>
<li>When running a Default Setup workflow with <a
href="https://docs.github.com/en/actions/how-tos/monitor-workflows/enable-debug-logging">Actions
debugging enabled</a>, the CodeQL Action will now use more unique names
when uploading logs from the Dependabot authentication proxy as workflow
artifacts. This ensures that the artifact names do not clash between
multiple jobs in a build matrix. <a
href="https://redirect.github.com/github/codeql-action/pull/3409">#3409</a></li>
<li>Improved error handling throughout the CodeQL Action. <a
href="https://redirect.github.com/github/codeql-action/pull/3415">#3415</a></li>
<li>Added experimental support for automatically excluding <a
href="https://docs.github.com/en/repositories/working-with-files/managing-files/customizing-how-changed-files-appear-on-github">generated
files</a> from the analysis. This feature is not currently enabled for
any analysis. In the future, it may be enabled by default for some
GitHub-managed analyses. <a
href="https://redirect.github.com/github/codeql-action/pull/3318">#3318</a></li>
<li>The changelog extracts that are included with releases of the CodeQL
Action are now shorter to avoid duplicated information from appearing in
Dependabot PRs. <a
href="https://redirect.github.com/github/codeql-action/pull/3403">#3403</a></li>
</ul>
<h2>v3.31.10</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.31.10 - 12 Jan 2026</h2>
<ul>
<li>Update default CodeQL bundle version to 2.23.9. <a
href="https://redirect.github.com/github/codeql-action/pull/3393">#3393</a></li>
</ul>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.31.10/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.31.9</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.31.9 - 16 Dec 2025</h2>
<p>No user facing changes.</p>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.31.9/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.31.8</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.31.8 - 11 Dec 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.23.8. <a
href="https://redirect.github.com/github/codeql-action/pull/3354">#3354</a></li>
</ul>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.31.8/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.31.7</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/blob/main/CHANGELOG.md">github/codeql-action's
changelog</a>.</em></p>
<blockquote>
<h2>4.31.11 - 23 Jan 2026</h2>
<ul>
<li>When running a Default Setup workflow with <a
href="https://docs.github.com/en/actions/how-tos/monitor-workflows/enable-debug-logging">Actions
debugging enabled</a>, the CodeQL Action will now use more unique names
when uploading logs from the Dependabot authentication proxy as workflow
artifacts. This ensures that the artifact names do not clash between
multiple jobs in a build matrix. <a
href="https://redirect.github.com/github/codeql-action/pull/3409">#3409</a></li>
<li>Improved error handling throughout the CodeQL Action. <a
href="https://redirect.github.com/github/codeql-action/pull/3415">#3415</a></li>
<li>Added experimental support for automatically excluding <a
href="https://docs.github.com/en/repositories/working-with-files/managing-files/customizing-how-changed-files-appear-on-github">generated
files</a> from the analysis. This feature is not currently enabled for
any analysis. In the future, it may be enabled by default for some
GitHub-managed analyses. <a
href="https://redirect.github.com/github/codeql-action/pull/3318">#3318</a></li>
<li>The changelog extracts that are included with releases of the CodeQL
Action are now shorter to avoid duplicated information from appearing in
Dependabot PRs. <a
href="https://redirect.github.com/github/codeql-action/pull/3403">#3403</a></li>
</ul>
<h2>4.31.10 - 12 Jan 2026</h2>
<ul>
<li>Update default CodeQL bundle version to 2.23.9. <a
href="https://redirect.github.com/github/codeql-action/pull/3393">#3393</a></li>
</ul>
<h2>4.31.9 - 16 Dec 2025</h2>
<p>No user facing changes.</p>
<h2>4.31.8 - 11 Dec 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.23.8. <a
href="https://redirect.github.com/github/codeql-action/pull/3354">#3354</a></li>
</ul>
<h2>4.31.7 - 05 Dec 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.23.7. <a
href="https://redirect.github.com/github/codeql-action/pull/3343">#3343</a></li>
</ul>
<h2>4.31.6 - 01 Dec 2025</h2>
<p>No user facing changes.</p>
<h2>4.31.5 - 24 Nov 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.23.6. <a
href="https://redirect.github.com/github/codeql-action/pull/3321">#3321</a></li>
</ul>
<h2>4.31.4 - 18 Nov 2025</h2>
<p>No user facing changes.</p>
<h2>4.31.3 - 13 Nov 2025</h2>
<ul>
<li>CodeQL Action v3 will be deprecated in December 2026. The Action now
logs a warning for customers who are running v3 but could be running v4.
For more information, see <a
href="https://github.blog/changelog/2025-10-28-upcoming-deprecation-of-codeql-action-v3/">Upcoming
deprecation of CodeQL Action v3</a>.</li>
<li>Update default CodeQL bundle version to 2.23.5. <a
href="https://redirect.github.com/github/codeql-action/pull/3288">#3288</a></li>
</ul>
<h2>4.31.2 - 30 Oct 2025</h2>
<p>No user facing changes.</p>
<h2>4.31.1 - 30 Oct 2025</h2>
<ul>
<li>The <code>add-snippets</code> input has been removed from the
<code>analyze</code> action. This input has been deprecated since CodeQL
Action 3.26.4 in August 2024 when this removal was announced.</li>
</ul>
<h2>4.31.0 - 24 Oct 2025</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8aac4e47ac"><code>8aac4e4</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3448">#3448</a>
from github/mergeback/v4.32.1-to-main-6bc82e05</li>
<li><a
href="e8d7df4f04"><code>e8d7df4</code></a>
Rebuild</li>
<li><a
href="c1bba77db0"><code>c1bba77</code></a>
Update changelog and version after v4.32.1</li>
<li><a
href="6bc82e05fd"><code>6bc82e0</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3447">#3447</a>
from github/update-v4.32.1-f52cbc830</li>
<li><a
href="42f00f2d33"><code>42f00f2</code></a>
Add a couple of change notes</li>
<li><a
href="cedee6de9f"><code>cedee6d</code></a>
Update changelog for v4.32.1</li>
<li><a
href="f52cbc8309"><code>f52cbc8</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3445">#3445</a>
from github/dependabot/npm_and_yarn/fast-xml-parser-...</li>
<li>See full diff in <a
href="https://github.com/github/codeql-action/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github/codeql-action&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-13 15:04:05 +00:00
Swifty
5035b69c79 feat(platform): add feature request tools for CoPilot chat (#12102)
Users can now search for existing feature requests and submit new ones
directly through the CoPilot chat interface. Requests are tracked in
Linear with customer need attribution.

### Changes 🏗️

**Backend:**
- Added `SearchFeatureRequestsTool` and `CreateFeatureRequestTool` to
the CoPilot chat tools registry
- Integrated with Linear GraphQL API for searching issues in the feature
requests project, creating new issues, upserting customers, and
attaching customer needs
- Added `linear_api_key` secret to settings for system-level Linear API
access
- Added response models (`FeatureRequestSearchResponse`,
`FeatureRequestCreatedResponse`, `FeatureRequestInfo`) to the tools
models

**Frontend:**
- Added `SearchFeatureRequestsTool` and `CreateFeatureRequestTool` UI
components with full streaming state handling (input-streaming,
input-available, output-available, output-error)
- Added helper utilities for output parsing, type guards, animation
text, and icon rendering
- Wired tools into `ChatMessagesContainer` for rendering in the chat
- Added styleguide examples covering all tool states

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified search returns matching feature requests from Linear
- [x] Verified creating a new feature request creates an issue and
customer need in Linear
- [x] Verified adding a need to an existing issue works via
`existing_issue_id`
  - [x] Verified error states render correctly in the UI
  - [x] Verified styleguide page renders all tool states

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

New secret: `LINEAR_API_KEY` — required for system-level Linear API
operations (defaults to empty string).

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Adds feature request search and creation tools to CoPilot chat,
integrating with Linear's GraphQL API to track user feedback. Users can
now search existing feature requests and submit new ones (or add their
need to existing issues) directly through conversation.

**Key changes:**
- Backend: `SearchFeatureRequestsTool` and `CreateFeatureRequestTool`
with Linear API integration via system-level `LINEAR_API_KEY`
- Frontend: React components with streaming state handling and accordion
UI for search results and creation confirmations
- Models: Added `FeatureRequestSearchResponse` and
`FeatureRequestCreatedResponse` to response types
- Customer need tracking: Upserts customers in Linear and attaches needs
to issues for better feedback attribution

**Issues found:**
- Missing `LINEAR_API_KEY` entry in `.env.default` (required per PR
description checklist)
- Hardcoded project/team IDs reduce maintainability
- Global singleton pattern could cause issues in async contexts
- Using `user_id` as customer name reduces readability in Linear
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- Safe to merge with minor configuration fix required
- The implementation is well-structured with proper error handling, type
safety, and follows existing patterns in the codebase. The missing
`.env.default` entry is a straightforward configuration issue that must
be fixed before deployment but doesn't affect code quality. The other
findings are style improvements that don't impact functionality.
- Verify that `LINEAR_API_KEY` is added to `.env.default` before merging
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant CoPilot UI
    participant LLM
    participant FeatureRequestTool
    participant LinearClient
    participant Linear API

    User->>CoPilot UI: Request feature via chat
    CoPilot UI->>LLM: Send user message
    
    LLM->>FeatureRequestTool: search_feature_requests(query)
    FeatureRequestTool->>LinearClient: query(SEARCH_ISSUES_QUERY)
    LinearClient->>Linear API: POST /graphql (search)
    Linear API-->>LinearClient: searchIssues.nodes[]
    LinearClient-->>FeatureRequestTool: Feature request data
    FeatureRequestTool-->>LLM: FeatureRequestSearchResponse
    
    alt No existing requests found
        LLM->>FeatureRequestTool: create_feature_request(title, description)
        FeatureRequestTool->>LinearClient: mutate(CUSTOMER_UPSERT_MUTATION)
        LinearClient->>Linear API: POST /graphql (upsert customer)
        Linear API-->>LinearClient: customer {id, name}
        LinearClient-->>FeatureRequestTool: Customer data
        
        FeatureRequestTool->>LinearClient: mutate(ISSUE_CREATE_MUTATION)
        LinearClient->>Linear API: POST /graphql (create issue)
        Linear API-->>LinearClient: issue {id, identifier, url}
        LinearClient-->>FeatureRequestTool: Issue data
        
        FeatureRequestTool->>LinearClient: mutate(CUSTOMER_NEED_CREATE_MUTATION)
        LinearClient->>Linear API: POST /graphql (attach need)
        Linear API-->>LinearClient: need {id, issue}
        LinearClient-->>FeatureRequestTool: Need data
        FeatureRequestTool-->>LLM: FeatureRequestCreatedResponse
    else Existing request found
        LLM->>FeatureRequestTool: create_feature_request(title, description, existing_issue_id)
        FeatureRequestTool->>LinearClient: mutate(CUSTOMER_UPSERT_MUTATION)
        LinearClient->>Linear API: POST /graphql (upsert customer)
        Linear API-->>LinearClient: customer {id}
        LinearClient-->>FeatureRequestTool: Customer data
        
        FeatureRequestTool->>LinearClient: mutate(CUSTOMER_NEED_CREATE_MUTATION)
        LinearClient->>Linear API: POST /graphql (attach need to existing)
        Linear API-->>LinearClient: need {id, issue}
        LinearClient-->>FeatureRequestTool: Need data
        FeatureRequestTool-->>LLM: FeatureRequestCreatedResponse
    end
    
    LLM-->>CoPilot UI: Tool response + continuation
    CoPilot UI-->>User: Display result with accordion UI
```
</details>


<sub>Last reviewed commit: af2e093</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-13 15:27:00 +01:00
Otto
86af8fc856 ci: apply E2E CI optimizations to Claude workflows (#12097)
## Summary

Applies the CI performance optimizations from #12090 to Claude Code
workflows.

## Changes

### `claude.yml` & `claude-dependabot.yml`
- **pnpm caching**: Replaced manual `actions/cache` with `setup-node`
built-in `cache: "pnpm"`
- Removes 4 steps (set pnpm store dir, cache step, manual config) → 1
step

### `claude-ci-failure-auto-fix.yml`
- **Added dev environment setup** with optimized caching
- Now Claude can run lint/tests when fixing CI failures (previously
could only edit files)
- Uses the same optimized caching patterns

## Dependency

This PR is based on #12090 and will merge after it.

## Testing

- Workflow YAML syntax validated
- Patterns match proven #12090 implementation
- CI caching changes fail gracefully to uncached builds

## Linear

Fixes [SECRT-1950](https://linear.app/autogpt/issue/SECRT-1950)

## Future Enhancements

E2E test data caching could be added to Claude workflows if needed for
running integration tests. Currently Claude workflows set up a dev
environment but don't run E2E tests by default.

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Applies proven CI performance optimizations to Claude workflows by
simplifying pnpm caching and adding dev environment setup to the
auto-fix workflow.

**Key changes:**
- Replaced manual pnpm cache configuration (4 steps) with built-in
`setup-node` `cache: "pnpm"` support in `claude.yml` and
`claude-dependabot.yml`
- Added complete dev environment setup (Python/Poetry + Node.js/pnpm) to
`claude-ci-failure-auto-fix.yml` so Claude can run linting and tests
when fixing CI failures
- Correctly orders `corepack enable` before `setup-node` to ensure pnpm
is available for caching

The changes mirror the optimizations from PR #12090 and maintain
consistency across all Claude workflows.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- The changes are CI infrastructure optimizations that mirror proven
patterns from PR #12090. The pnpm caching simplification reduces
complexity without changing functionality (caching failures gracefully
fall back to uncached builds). The dev environment setup in the auto-fix
workflow is additive and enables Claude to run linting/tests. All YAML
syntax is correct and the step ordering follows best practices.
- No files require special attention
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant GHA as GitHub Actions
    participant Corepack as Corepack
    participant SetupNode as setup-node@v6
    participant Cache as GHA Cache
    participant pnpm as pnpm

    Note over GHA,pnpm: Before (Manual Caching)
    GHA->>SetupNode: Set up Node.js 22
    SetupNode-->>GHA: Node.js ready
    GHA->>Corepack: Enable corepack
    Corepack-->>GHA: pnpm available
    GHA->>pnpm: Configure store directory
    pnpm-->>GHA: Store path set
    GHA->>Cache: actions/cache (manual key)
    Cache-->>GHA: Cache restored/missed
    GHA->>pnpm: Install dependencies
    pnpm-->>GHA: Dependencies installed

    Note over GHA,pnpm: After (Built-in Caching)
    GHA->>Corepack: Enable corepack
    Corepack-->>GHA: pnpm available
    GHA->>SetupNode: Set up Node.js 22<br/>cache: "pnpm"<br/>cache-dependency-path: pnpm-lock.yaml
    SetupNode->>Cache: Auto-detect pnpm store
    Cache-->>SetupNode: Cache restored/missed
    SetupNode-->>GHA: Node.js + cache ready
    GHA->>pnpm: Install dependencies
    pnpm-->>GHA: Dependencies installed
```
</details>


<sub>Last reviewed commit: f1681a0</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
Co-authored-by: Ubbe <hi@ubbe.dev>
2026-02-13 13:48:04 +00:00
Otto
dfa517300b debug(copilot): Add detailed API error logging (#11942)
## Summary
Adds comprehensive error logging for OpenRouter/OpenAI API errors to
help diagnose issues like provider routing failures, context length
exceeded, rate limits, etc.

## Background
While investigating
[SECRT-1859](https://linear.app/autogpt/issue/SECRT-1859), we found that
when OpenRouter returns errors, the actual error details weren't being
captured or logged. Langfuse traces showed `provider_name: 'unknown'`
and `completion: null` without any insight into WHY all providers
rejected the request.

## Changes
- Add `_extract_api_error_details()` to extract rich information from
API errors including:
  - Status code and request ID
  - Response body (contains OpenRouter's actual error message)
  - OpenRouter-specific headers (provider, model)
  - Rate limit headers
- Add `_log_api_error()` helper that logs errors with context:
  - Session ID for correlation
  - Message count (helps identify context length issues)
  - Model being used
  - Retry count
- Update error handling in `_stream_chat_chunks()` and
`_generate_llm_continuation()` to use new logging
- Extract provider's error message from response body for better user
feedback

## Example log output
```
API error: {
  'error_type': 'APIStatusError',
  'error_message': 'Provider returned error',
  'status_code': 400,
  'request_id': 'req_xxx',
  'response_body': {'error': {'message': 'context_length_exceeded', 'type': 'invalid_request_error'}},
  'openrouter_provider': 'unknown',
  'session_id': '44fbb803-...',
  'message_count': 52,
  'model': 'anthropic/claude-opus-4.5',
  'retry_count': 0
}
```

## Testing
- [ ] Verified code passes linting (black, isort, ruff)
- [ ] Error details are properly extracted from different error types

## Refs
- Linear: SECRT-1859
- Thread:
https://discord.com/channels/1126875755960336515/1467066151002571034

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-13 13:15:17 +00:00
Reinier van der Leer
43b25b5e2f ci(frontend): Speed up E2E test job (#12090)
The frontend `e2e_test` doesn't have a working build cache setup,
causing really slow builds = slow test jobs. These changes reduce total
test runtime from ~12 minutes to ~5 minutes.

### Changes 🏗️

- Inject build cache config into docker compose config; let `buildx
bake` use GHA cache directly
  - Add `docker-ci-fix-compose-build-cache.py` script
- Optimize `backend/Dockerfile` + root `.dockerignore`
- Replace broken DIY pnpm store caching with `actions/setup-node`
built-in cache management
- Add caching for test seed data created in DB

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI
2026-02-13 11:09:41 +01:00
Swifty
ab0b537cc7 refactor(backend): optimize find_block response size by removing raw JSON schemas (#12020)
### Changes 🏗️

The `find_block` AutoPilot tool was returning ~90K characters per
response (10 blocks). The bloat came from including full JSON Schema
objects (`input_schema`, `output_schema`) with all nested `$defs`,
`anyOf`, and type definitions for every block.

**What changed:**

- **`BlockInfoSummary` model**: Removed `input_schema` (raw JSON
Schema), `output_schema` (raw JSON Schema), and `categories`. Added
`output_fields` (compact field-level summaries matching the existing
`required_inputs` format).
- **`BlockListResponse` model**: Removed `usage_hint` (info now in
`message`).
- **`FindBlockTool._execute()`**: Now extracts compact `output_fields`
from output schema properties instead of including the entire raw
schema. Credentials handling is unchanged.
- **Test**: Added `test_response_size_average_chars_per_block` with
realistic block schemas (HTTP, Email, Claude Code) to measure and assert
response size stays under 2K chars/block.
- **`CLAUDE.md`**: Clarified `dev` vs `master` branching strategy.

**Result:** Average response size reduced from ~9,000 to ~1,300 chars
per block (~85% reduction). This directly reduces LLM token consumption,
latency, and API costs for AutoPilot interactions.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified models import and serialize correctly
- [x] Verified response size: 3,970 chars for 3 realistic blocks (avg
1,323/block)
- [x] Lint (`ruff check`) and type check (`pyright`) pass on changed
files
- [x] Frontend compatibility preserved: `blocks[].name` and `count`
fields retained for `block_list` handler

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
2026-02-13 11:08:51 +01:00
dependabot[bot]
9a8c6ad609 chore(libs/deps): bump the production-dependencies group across 1 directory with 4 updates (#12056)
Bumps the production-dependencies group with 4 updates in the
/autogpt_platform/autogpt_libs directory:
[cryptography](https://github.com/pyca/cryptography),
[fastapi](https://github.com/fastapi/fastapi),
[launchdarkly-server-sdk](https://github.com/launchdarkly/python-server-sdk)
and [supabase](https://github.com/supabase/supabase-py).

Updates `cryptography` from 46.0.4 to 46.0.5
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst">cryptography's
changelog</a>.</em></p>
<blockquote>
<p>46.0.5 - 2026-02-10</p>
<pre><code>
* An attacker could create a malicious public key that reveals portions
of your
private key when using certain uncommon elliptic curves (binary curves).
This version now includes additional security checks to prevent this
attack.
This issue only affects binary elliptic curves, which are rarely used in
real-world applications. Credit to **XlabAI Team of Tencent Xuanwu Lab
and
Atuin Automated Vulnerability Discovery Engine** for reporting the
issue.
  **CVE-2026-26007**
* Support for ``SECT*`` binary elliptic curves is deprecated and will be
  removed in the next release.
<p>.. v46-0-4:<br />
</code></pre></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="06e120e682"><code>06e120e</code></a>
bump version for 46.0.5 release (<a
href="https://redirect.github.com/pyca/cryptography/issues/14289">#14289</a>)</li>
<li><a
href="0eebb9dbb6"><code>0eebb9d</code></a>
EC check key on cofactor &gt; 1 (<a
href="https://redirect.github.com/pyca/cryptography/issues/14287">#14287</a>)</li>
<li><a
href="bedf6e186b"><code>bedf6e1</code></a>
fix openssl version on 46 branch (<a
href="https://redirect.github.com/pyca/cryptography/issues/14220">#14220</a>)</li>
<li>See full diff in <a
href="https://github.com/pyca/cryptography/compare/46.0.4...46.0.5">compare
view</a></li>
</ul>
</details>
<br />

Updates `fastapi` from 0.128.0 to 0.128.7
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/fastapi/fastapi/releases">fastapi's
releases</a>.</em></p>
<blockquote>
<h2>0.128.7</h2>
<h3>Features</h3>
<ul>
<li> Show a clear error on attempt to include router into itself. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14258">#14258</a>
by <a
href="https://github.com/JavierSanchezCastro"><code>@​JavierSanchezCastro</code></a>.</li>
<li> Replace <code>dict</code> by <code>Mapping</code> on
<code>HTTPException.headers</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/12997">#12997</a>
by <a
href="https://github.com/rijenkii"><code>@​rijenkii</code></a>.</li>
</ul>
<h3>Refactors</h3>
<ul>
<li>♻️ Simplify reading files in memory, do it sequentially instead of
(fake) parallel. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14884">#14884</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Docs</h3>
<ul>
<li>📝 Use <code>dfn</code> tag for definitions instead of
<code>abbr</code> in docs. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14744">#14744</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li> Tweak comment in test to reference PR. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14885">#14885</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li>🔧 Update LLM-prompt for <code>abbr</code> and <code>dfn</code> tags.
PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14747">#14747</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
<li> Test order for the submitted byte Files. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14828">#14828</a>
by <a
href="https://github.com/valentinDruzhinin"><code>@​valentinDruzhinin</code></a>.</li>
<li>🔧 Configure <code>test</code> workflow to run tests with
<code>inline-snapshot=review</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14876">#14876</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
</ul>
<h2>0.128.6</h2>
<h3>Fixes</h3>
<ul>
<li>🐛 Fix <code>on_startup</code> and <code>on_shutdown</code>
parameters of <code>APIRouter</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14873">#14873</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
</ul>
<h3>Translations</h3>
<ul>
<li>🌐 Update translations for zh (update-outdated). PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14843">#14843</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li> Fix parameterized tests with snapshots. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14875">#14875</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
</ul>
<h2>0.128.5</h2>
<h3>Refactors</h3>
<ul>
<li>♻️ Refactor and simplify Pydantic v2 (and v1) compatibility internal
utils. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14862">#14862</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li> Add inline snapshot tests for OpenAPI before changes from Pydantic
v2. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14864">#14864</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h2>0.128.4</h2>
<h3>Refactors</h3>
<ul>
<li>♻️ Refactor internals, simplify Pydantic v2/v1 utils,
<code>create_model_field</code>, better types for
<code>lenient_issubclass</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14860">#14860</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li>♻️ Simplify internals, remove Pydantic v1 only logic, no longer
needed. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14857">#14857</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li>♻️ Refactor internals, cleanup unneeded Pydantic v1 specific logic.
PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14856">#14856</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8f82c94de0"><code>8f82c94</code></a>
🔖 Release version 0.128.7</li>
<li><a
href="5bb3423205"><code>5bb3423</code></a>
📝 Update release notes</li>
<li><a
href="6ce5e3e961"><code>6ce5e3e</code></a>
 Tweak comment in test to reference PR (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14885">#14885</a>)</li>
<li><a
href="65da3dde12"><code>65da3dd</code></a>
📝 Update release notes</li>
<li><a
href="81f82fd955"><code>81f82fd</code></a>
🔧 Update LLM-prompt for <code>abbr</code> and <code>dfn</code> tags (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14747">#14747</a>)</li>
<li><a
href="ff721017df"><code>ff72101</code></a>
📝 Update release notes</li>
<li><a
href="ca76a4eba9"><code>ca76a4e</code></a>
📝 Use <code>dfn</code> tag for definitions instead of <code>abbr</code>
in docs (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14744">#14744</a>)</li>
<li><a
href="1133a4594d"><code>1133a45</code></a>
📝 Update release notes</li>
<li><a
href="38f965985e"><code>38f9659</code></a>
 Test order for the submitted byte Files (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14828">#14828</a>)</li>
<li><a
href="3f1cc8f8f5"><code>3f1cc8f</code></a>
📝 Update release notes</li>
<li>Additional commits viewable in <a
href="https://github.com/fastapi/fastapi/compare/0.128.0...0.128.7">compare
view</a></li>
</ul>
</details>
<br />

Updates `launchdarkly-server-sdk` from 9.14.1 to 9.15.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/launchdarkly/python-server-sdk/releases">launchdarkly-server-sdk's
releases</a>.</em></p>
<blockquote>
<h2>v9.15.0</h2>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.14.1...9.15.0">9.15.0</a>
(2026-02-10)</h2>
<h3>Features</h3>
<ul>
<li>Drop support for python 3.9 (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/393">#393</a>)
(<a
href="5b761bd306">5b761bd</a>)</li>
<li>Update ChangeSet to always require a Selector (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/405">#405</a>)
(<a
href="5dc4f81688">5dc4f81</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Add context manager for clearer, safer locks (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/396">#396</a>)
(<a
href="beca0fa498">beca0fa</a>)</li>
<li>Address potential race condition in FeatureStore update_availability
(<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/391">#391</a>)
(<a
href="31cf4875c3">31cf487</a>)</li>
<li>Allow modifying fdv2 data source options independent of main config
(<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/403">#403</a>)
(<a
href="d78079e7f3">d78079e</a>)</li>
<li>Mark copy_with_new_sdk_key method as deprecated (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/353">#353</a>)
(<a
href="e471ccc3d5">e471ccc</a>)</li>
<li>Prevent immediate polling on recoverable error (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/399">#399</a>)
(<a
href="da565a2dce">da565a2</a>)</li>
<li>Redis store is considered initialized when <code>$inited</code> key
is written (<a
href="e99a27d48f">e99a27d</a>)</li>
<li>Stop FeatureStoreClientWrapper poller on close (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/397">#397</a>)
(<a
href="468afdfef3">468afdf</a>)</li>
<li>Update DataSystemConfig to accept list of synchronizers (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/404">#404</a>)
(<a
href="c73ad14090">c73ad14</a>)</li>
<li>Update reason documentation with inExperiment value (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/401">#401</a>)
(<a
href="cbfc3dd887">cbfc3dd</a>)</li>
<li>Update Redis to write missing <code>$inited</code> key (<a
href="e99a27d48f">e99a27d</a>)</li>
</ul>
<hr />
<p>This PR was generated with <a
href="https://github.com/googleapis/release-please">Release Please</a>.
See <a
href="https://github.com/googleapis/release-please#release-please">documentation</a>.</p>
<!-- raw HTML omitted -->
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/launchdarkly/python-server-sdk/blob/main/CHANGELOG.md">launchdarkly-server-sdk's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.14.1...9.15.0">9.15.0</a>
(2026-02-10)</h2>
<h3>⚠ BREAKING CHANGES</h3>
<p><strong>Note:</strong> The following breaking changes apply only to
FDv2 (Flag Delivery v2) early access features, which are not subject to
semantic versioning and may change without a major version bump.</p>
<ul>
<li>Update ChangeSet to always require a Selector (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/405">#405</a>)
(<a
href="5dc4f81688">5dc4f81</a>)
<ul>
<li>The <code>ChangeSetBuilder.finish()</code> method now requires a
<code>Selector</code> parameter.</li>
</ul>
</li>
<li>Update DataSystemConfig to accept list of synchronizers (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/404">#404</a>)
(<a
href="c73ad14090">c73ad14</a>)
<ul>
<li>The <code>DataSystemConfig.synchronizers</code> field now accepts a
list of synchronizers, and the
<code>ConfigBuilder.synchronizers()</code> method accepts variadic
arguments.</li>
</ul>
</li>
</ul>
<h3>Features</h3>
<ul>
<li>Drop support for python 3.9 (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/393">#393</a>)
(<a
href="5b761bd306">5b761bd</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Add context manager for clearer, safer locks (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/396">#396</a>)
(<a
href="beca0fa498">beca0fa</a>)</li>
<li>Address potential race condition in FeatureStore update_availability
(<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/391">#391</a>)
(<a
href="31cf4875c3">31cf487</a>)</li>
<li>Allow modifying fdv2 data source options independent of main config
(<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/403">#403</a>)
(<a
href="d78079e7f3">d78079e</a>)</li>
<li>Mark copy_with_new_sdk_key method as deprecated (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/353">#353</a>)
(<a
href="e471ccc3d5">e471ccc</a>)</li>
<li>Prevent immediate polling on recoverable error (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/399">#399</a>)
(<a
href="da565a2dce">da565a2</a>)</li>
<li>Redis store is considered initialized when <code>$inited</code> key
is written (<a
href="e99a27d48f">e99a27d</a>)</li>
<li>Stop FeatureStoreClientWrapper poller on close (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/397">#397</a>)
(<a
href="468afdfef3">468afdf</a>)</li>
<li>Update reason documentation with inExperiment value (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/401">#401</a>)
(<a
href="cbfc3dd887">cbfc3dd</a>)</li>
<li>Update Redis to write missing <code>$inited</code> key (<a
href="e99a27d48f">e99a27d</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="e542f737a6"><code>e542f73</code></a>
chore(main): release 9.15.0 (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/394">#394</a>)</li>
<li><a
href="e471ccc3d5"><code>e471ccc</code></a>
fix: Mark copy_with_new_sdk_key method as deprecated (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/353">#353</a>)</li>
<li><a
href="5dc4f81688"><code>5dc4f81</code></a>
feat: Update ChangeSet to always require a Selector (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/405">#405</a>)</li>
<li><a
href="f20fffeb1e"><code>f20fffe</code></a>
chore: Remove dead code, clarify names, other cleanup (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/398">#398</a>)</li>
<li><a
href="c73ad14090"><code>c73ad14</code></a>
fix: Update DataSystemConfig to accept list of synchronizers (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/404">#404</a>)</li>
<li><a
href="d78079e7f3"><code>d78079e</code></a>
fix: Allow modifying fdv2 data source options independent of main config
(<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/403">#403</a>)</li>
<li><a
href="e99a27d48f"><code>e99a27d</code></a>
chore: Support persistent data store verification in contract tests (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/402">#402</a>)</li>
<li><a
href="cbfc3dd887"><code>cbfc3dd</code></a>
fix: Update reason documentation with inExperiment value (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/401">#401</a>)</li>
<li><a
href="5a1adbb2de"><code>5a1adbb</code></a>
chore: Update sdk_metadata features (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/400">#400</a>)</li>
<li><a
href="da565a2dce"><code>da565a2</code></a>
fix: Prevent immediate polling on recoverable error (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/399">#399</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.14.1...9.15.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `supabase` from 2.27.2 to 2.28.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/supabase/supabase-py/releases">supabase's
releases</a>.</em></p>
<blockquote>
<h2>v2.28.0</h2>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.27.3...v2.28.0">2.28.0</a>
(2026-02-10)</h2>
<h3>Features</h3>
<ul>
<li><strong>storage:</strong> add list_v2 method to file_api client (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1377">#1377</a>)
(<a
href="259f4ad42d">259f4ad</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>auth:</strong> add missing is_sso_user, deleted_at,
banned_until to User model (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1375">#1375</a>)
(<a
href="7f84a62996">7f84a62</a>)</li>
<li><strong>realtime:</strong> ensure remove_channel removes channel
from channels dict (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1373">#1373</a>)
(<a
href="0923314039">0923314</a>)</li>
<li><strong>realtime:</strong> use pop with default in _handle_message
to prevent KeyError (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1388">#1388</a>)
(<a
href="baea26f7ce">baea26f</a>)</li>
<li><strong>storage3:</strong> replace print() with warnings.warn() for
trailing slash notice (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1380">#1380</a>)
(<a
href="50b099fa06">50b099f</a>)</li>
</ul>
<h2>v2.27.3</h2>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.27.2...v2.27.3">2.27.3</a>
(2026-02-03)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>deprecate python 3.9 in all packages (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1365">#1365</a>)
(<a
href="cc72ed75d4">cc72ed7</a>)</li>
<li>ensure storage_url has trailing slash to prevent warning (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1367">#1367</a>)
(<a
href="4267ff1345">4267ff1</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/supabase/supabase-py/blob/main/CHANGELOG.md">supabase's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.27.3...v2.28.0">2.28.0</a>
(2026-02-10)</h2>
<h3>Features</h3>
<ul>
<li><strong>storage:</strong> add list_v2 method to file_api client (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1377">#1377</a>)
(<a
href="259f4ad42d">259f4ad</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>auth:</strong> add missing is_sso_user, deleted_at,
banned_until to User model (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1375">#1375</a>)
(<a
href="7f84a62996">7f84a62</a>)</li>
<li><strong>realtime:</strong> ensure remove_channel removes channel
from channels dict (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1373">#1373</a>)
(<a
href="0923314039">0923314</a>)</li>
<li><strong>realtime:</strong> use pop with default in _handle_message
to prevent KeyError (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1388">#1388</a>)
(<a
href="baea26f7ce">baea26f</a>)</li>
<li><strong>storage3:</strong> replace print() with warnings.warn() for
trailing slash notice (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1380">#1380</a>)
(<a
href="50b099fa06">50b099f</a>)</li>
</ul>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.27.2...v2.27.3">2.27.3</a>
(2026-02-03)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>deprecate python 3.9 in all packages (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1365">#1365</a>)
(<a
href="cc72ed75d4">cc72ed7</a>)</li>
<li>ensure storage_url has trailing slash to prevent warning (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1367">#1367</a>)
(<a
href="4267ff1345">4267ff1</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="59e338400b"><code>59e3384</code></a>
chore(main): release 2.28.0 (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1378">#1378</a>)</li>
<li><a
href="baea26f7ce"><code>baea26f</code></a>
fix(realtime): use pop with default in _handle_message to prevent
KeyError (#...</li>
<li><a
href="259f4ad42d"><code>259f4ad</code></a>
feat(storage): add list_v2 method to file_api client (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1377">#1377</a>)</li>
<li><a
href="50b099fa06"><code>50b099f</code></a>
fix(storage3): replace print() with warnings.warn() for trailing slash
notice...</li>
<li><a
href="0923314039"><code>0923314</code></a>
fix(realtime): ensure remove_channel removes channel from channels dict
(<a
href="https://redirect.github.com/supabase/supabase-py/issues/1373">#1373</a>)</li>
<li><a
href="7f84a62996"><code>7f84a62</code></a>
fix(auth): add missing is_sso_user, deleted_at, banned_until to User
model (#...</li>
<li><a
href="57dd6e2195"><code>57dd6e2</code></a>
chore(deps): bump the uv group across 1 directory with 3 updates (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1369">#1369</a>)</li>
<li><a
href="c357def670"><code>c357def</code></a>
chore(main): release 2.27.3 (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1368">#1368</a>)</li>
<li><a
href="4267ff1345"><code>4267ff1</code></a>
fix: ensure storage_url has trailing slash to prevent warning (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1367">#1367</a>)</li>
<li><a
href="cc72ed75d4"><code>cc72ed7</code></a>
fix: deprecate python 3.9 in all packages (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1365">#1365</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/supabase/supabase-py/compare/v2.27.2...v2.28.0">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Dependency update bumps 4 packages in the production-dependencies group,
including a **critical security patch for `cryptography`**
(CVE-2026-26007) that prevents malicious public key attacks on binary
elliptic curves. The update also includes bug fixes for `fastapi`,
`launchdarkly-server-sdk`, and `supabase`.

- **cryptography** 46.0.4 → 46.0.5: patches CVE-2026-26007, deprecates
SECT* binary curves
- **fastapi** 0.128.0 → 0.128.7: bug fixes, improved error handling,
relaxed Starlette constraint
- **launchdarkly-server-sdk** 9.14.1 → 9.15.0: drops Python 3.9 support
(requires >=3.10), fixes race conditions
- **supabase** 2.27.2/2.27.3 → 2.28.0: realtime fixes, new User model
fields

The lock files correctly resolve all dependencies. Python 3.10+
requirement is already enforced in both packages. However, backend's
`pyproject.toml` still specifies `launchdarkly-server-sdk = "^9.14.1"`
while the lock file uses 9.15.0 (pulled from autogpt_libs dependency),
creating a minor version constraint inconsistency.
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- This PR is safe to merge with one minor style suggestion
- Automated dependency update with critical security patch for
cryptography. All updates are backwards-compatible within semver
constraints. Lock files correctly resolve all dependencies. Python 3.10+
is already enforced. Only minor issue is version constraint
inconsistency in backend's pyproject.toml for launchdarkly-server-sdk,
which doesn't affect functionality but should be aligned for clarity.
- autogpt_platform/backend/pyproject.toml needs launchdarkly-server-sdk
version constraint updated to ^9.15.0
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Otto <otto@agpt.co>
2026-02-13 09:10:11 +00:00
Ubbe
e8c50b96d1 fix(frontend): improve CoPilot chat table styling (#12094)
## Summary
- Remove left and right borders from tables rendered in CoPilot chat
- Increase cell padding (py-3 → py-3.5) for better spacing between text
and lines
- Applies to both Streamdown (main chat) and MarkdownRenderer (tool
outputs)

Design feedback from Olivia to make tables "breathe" more.

## Test plan
- [ ] Open CoPilot chat and trigger a response containing a table
- [ ] Verify tables no longer have left/right borders
- [ ] Verify increased spacing between rows
- [ ] Check both light and dark modes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Improved CoPilot chat table styling by removing left and right borders
and increasing vertical padding from `py-3` to `py-3.5`. Changes apply
to both:
- Streamdown-rendered tables (via CSS selector in `globals.css`)  
- MarkdownRenderer tables (via Tailwind classes)

The changes make tables "breathe" more per design feedback from Olivia.

**Issue Found:**
- The CSS padding value in `globals.css:192` is `0.625rem` (`py-2.5`)
but should be `0.875rem` (`py-3.5`) to match the PR description and the
MarkdownRenderer implementation.
</details>


<details><summary><h3>Confidence Score: 2/5</h3></summary>

- This PR has a logical error that will cause inconsistent table styling
between Streamdown and MarkdownRenderer tables
- The implementation has an inconsistency where the CSS file uses
`py-2.5` padding while the PR description and MarkdownRenderer use
`py-3.5`. This will result in different table padding between the two
rendering systems, contradicting the goal of consistent styling
improvements.
- Pay close attention to `autogpt_platform/frontend/src/app/globals.css`
- the padding value needs to be corrected to match the intended design
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-02-13 09:38:59 +08:00
Ubbe
30e854569a feat(frontend): add exact timestamp tooltip on run timestamps (#12087)
Resolves OPEN-2693: Make exact timestamp of runs accessible through UI.

The NewAgentLibraryView shows relative timestamps ("2 days ago") for
runs and schedules, but unlike the OldAgentLibraryView it didn't show
the exact timestamp on hover. This PR adds a native `title` tooltip so
users can see the full date/time by hovering.

### Changes 🏗️

- Added `descriptionTitle` prop to `SidebarItemCard` that renders as a
`title` attribute on the description text
- `TaskListItem` now passes the exact `run.started_at` timestamp via
`descriptionTitle`
- `ScheduleListItem` now passes the exact `schedule.next_run_time`
timestamp via `descriptionTitle`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [ ] Open an agent in the library view
- [ ] Hover over a run's relative timestamp (e.g. "2 days ago") and
confirm the full date/time tooltip appears
- [ ] Hover over a schedule's relative timestamp and confirm the full
date/time tooltip appears

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Added native tooltip functionality to show exact timestamps in the
library view. The implementation adds a `descriptionTitle` prop to
`SidebarItemCard` that renders as a `title` attribute on the description
text. This allows users to hover over relative timestamps (e.g., "2 days
ago") to see the full date/time.

**Changes:**
- Added optional `descriptionTitle` prop to `SidebarItemCard` component
(SidebarItemCard.tsx:10)
- `TaskListItem` passes `run.started_at` as the tooltip value
(TaskListItem.tsx:84-86)
- `ScheduleListItem` passes `schedule.next_run_time` as the tooltip
value (ScheduleListItem.tsx:32)
- Unrelated fix included: Sentry configuration updated to suppress
cross-origin stylesheet errors (instrumentation-client.ts:25-28)

**Note:** The PR includes two separate commits - the main timestamp
tooltip feature and a Sentry error suppression fix. The PR description
only documents the timestamp feature.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- The changes are straightforward and limited in scope - adding an
optional prop that forwards a native HTML attribute for tooltip
functionality. The Text component already supports forwarding arbitrary
HTML attributes through its spread operator (...rest), ensuring the
`title` attribute works correctly. Both the timestamp tooltip feature
and the Sentry configuration fix are low-risk improvements with no
breaking changes.
- No files require special attention
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant TaskListItem
    participant ScheduleListItem
    participant SidebarItemCard
    participant Text
    participant Browser

    User->>TaskListItem: Hover over run timestamp
    TaskListItem->>SidebarItemCard: Pass descriptionTitle (run.started_at)
    SidebarItemCard->>Text: Render with title attribute
    Text->>Browser: Forward title attribute to DOM
    Browser->>User: Display native tooltip with exact timestamp

    User->>ScheduleListItem: Hover over schedule timestamp
    ScheduleListItem->>SidebarItemCard: Pass descriptionTitle (schedule.next_run_time)
    SidebarItemCard->>Text: Render with title attribute
    Text->>Browser: Forward title attribute to DOM
    Browser->>User: Display native tooltip with exact timestamp
```
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 09:38:16 +08:00
Ubbe
301d7cbada fix(frontend): suppress cross-origin stylesheet security error (#12086)
## Summary
- Adds `ignoreErrors` to the Sentry client configuration
(`instrumentation-client.ts`) to filter out `SecurityError:
CSSStyleSheet.cssRules getter: Not allowed to access cross-origin
stylesheet` errors
- These errors are caused by Sentry Replay (rrweb) attempting to
serialize DOM snapshots that include cross-origin stylesheets (from
browser extensions or CDN-loaded CSS)
- This was reported via Sentry on production, occurring on any page when
logged in

## Changes
- **`frontend/instrumentation-client.ts`**: Added `ignoreErrors: [/Not
allowed to access cross-origin stylesheet/]` to `Sentry.init()` config

## Test plan
- [ ] Verify the error no longer appears in Sentry after deployment
- [ ] Verify Sentry Replay still works correctly for other errors
- [ ] Verify no regressions in error tracking (other errors should still
be captured)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Adds error filtering to Sentry client configuration to suppress
cross-origin stylesheet security errors that occur when Sentry Replay
(rrweb) attempts to serialize DOM snapshots containing stylesheets from
browser extensions or CDN-loaded CSS. This prevents noise in Sentry
error logs without affecting the capture of legitimate errors.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- The change adds a simple error filter to suppress benign cross-origin
stylesheet errors that are caused by Sentry Replay itself. The regex
pattern is specific and only affects client-side error reporting, with
no impact on application functionality or legitimate error capture
- No files require special attention
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 09:37:54 +08:00
Ubbe
d95aef7665 fix(copilot): stream timeout, long-running tool polling, and CreateAgent UI refresh (#12070)
Agent generation completes on the backend but the UI does not
update/refresh to show the result.

### Changes 🏗️

![Uploading Screenshot 2026-02-13 at 00.44.54.png…]()


- **Stream start timeout (12s):** If the backend doesn't begin streaming
within 12 seconds of submitting a message, the stream is aborted and a
destructive toast is shown to the user.
- **Long-running tool polling:** Added `useLongRunningToolPolling` hook
that polls the session endpoint every 1.5s while a tool output is in an
operating state (`operation_started` / `operation_pending` /
`operation_in_progress`). When the backend completes, messages are
refreshed so the UI reflects the final result.
- **CreateAgent UI improvements:** Replaced the orbit loader / progress
bar with a mini-game, added expanded accordion for saved agents, and
improved the saved-agent card with image, icons, and links that open in
new tabs.
- **Backend tweaks:** Added `image_url` to `CreateAgentToolOutput`,
minor model/service updates for the dummy agent generator.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Send a message and verify the stream starts within 12s or a toast
appears
- [x] Trigger agent creation and verify the UI updates when the backend
completes
- [x] Verify the saved-agent card renders correctly with image, links,
and icons

---------

Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 20:06:40 +00:00
Nicholas Tindle
cb166dd6fb feat(blocks): Store sandbox files to workspace (#12073)
Store files created by sandbox blocks (Claude Code, Code Executor) to
the user's workspace for persistence across runs.

### Changes 🏗️

- **New `sandbox_files.py` utility** (`backend/util/sandbox_files.py`)
  - Shared module for extracting files from E2B sandboxes
- Stores files to workspace via `store_media_file()` (includes virus
scanning, size limits)
  - Returns `SandboxFileOutput` with path, content, and `workspace_ref`

- **Claude Code block** (`backend/blocks/claude_code.py`)
  - Added `workspace_ref` field to `FileOutput` schema
  - Replaced inline `_extract_files()` with shared utility
  - Files from working directory now stored to workspace automatically

- **Code Executor block** (`backend/blocks/code_executor.py`)
  - Added `files` output field to `ExecuteCodeBlock.Output`
  - Creates `/output` directory in sandbox before execution
  - Extracts all files (text + binary) from `/output` after execution
- Updated `execute_code()` to support file extraction with
`extract_files` param

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create agent with Claude Code block, have it create a file, verify
`workspace_ref` in output
- [x] Create agent with Code Executor block, write file to `/output`,
verify `workspace_ref` in output
  - [x] Verify files persist in workspace after sandbox disposal
- [x] Verify binary files (images, etc.) work correctly in Code Executor
- [x] Verify existing graphs using `content` field still work (backward
compat)

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

No configuration changes required - this is purely additive backend
code.

---

**Related:** Closes SECRT-1931

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Adds automatic extraction and workspace storage of sandbox-written
files (including binaries for code execution), which can affect output
payload size, performance, and file-handling edge cases.
> 
> **Overview**
> **Sandbox blocks now persist generated files to workspace.** A new
shared utility (`backend/util/sandbox_files.py`) extracts files from an
E2B sandbox (scoped by a start timestamp) and stores them via
`store_media_file`, returning `SandboxFileOutput` with `workspace_ref`.
> 
> `ClaudeCodeBlock` replaces its inline file-scraping logic with this
utility and updates the `files` output schema to include
`workspace_ref`.
> 
> `ExecuteCodeBlock` adds a `files` output and extends the executor
mixin to optionally extract/store files (text + binary) when an
`execution_context` is provided; related mocks/tests and docs are
updated accordingly.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
343854c0cf. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 15:56:59 +00:00
Swifty
3d31f62bf1 Revert "added feature request tooling"
This reverts commit b8b6c9de23.
2026-02-12 16:39:24 +01:00
Swifty
b8b6c9de23 added feature request tooling 2026-02-12 16:38:17 +01:00
Abhimanyu Yadav
4f6055f494 refactor(frontend): remove default expiration date from API key credentials form (#12092)
### Changes 🏗️

Removed the default expiration date for API keys in the credentials
modal. Previously, API keys were set to expire the next day by default,
but now the expiration date field starts empty, allowing users to
explicitly choose whether they want to set an expiration date.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Open the API key credentials modal and verify the expiration date
field is empty by default
  - [x] Test creating an API key with and without an expiration date
  - [x] Verify both scenarios work correctly

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Removed the default expiration date for API key credentials in the
credentials modal. Previously, API keys were automatically set to expire
the next day at midnight. Now the expiration date field starts empty,
allowing users to explicitly choose whether to set an expiration.

- Removed `getDefaultExpirationDate()` helper function that calculated
tomorrow's date
- Changed default `expiresAt` value from calculated date to empty string
- Backend already supports optional expiration (`expires_at?: number`),
so no backend changes needed
- Form submission correctly handles empty expiration by passing
`undefined` to the API
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- The changes are straightforward and well-contained. The refactor
removes a helper function and changes a default value. The backend API
already supports optional expiration dates, and the form submission
logic correctly handles empty values by passing undefined. The change
improves UX by not forcing a default expiration date on users.
- No files require special attention
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-12 12:57:06 +00:00
Otto
695a185fa1 fix(frontend): remove fixed min-height from CoPilot message container (#12091)
## Summary

Removes the `min-h-screen` class from `ConversationContent` in
ChatMessagesContainer, which was causing fixed height layout issues in
the CoPilot chat interface.

## Changes

- Removed `min-h-screen` from ConversationContent className

## Linear

Fixes [SECRT-1944](https://linear.app/autogpt/issue/SECRT-1944)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Removes the `min-h-screen` (100vh) class from `ConversationContent` that
was causing the chat message container to enforce a minimum viewport
height. The parent container already handles height constraints with
`h-full min-h-0` and flexbox layout, so the fixed minimum height was
creating layout conflicts. The component now properly grows within its
flex container using `flex-1`.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with minimal risk
- The change removes a single problematic CSS class that was causing
fixed height layout issues. The parent container already handles height
constraints properly with flexbox, and removing min-h-screen allows the
component to size correctly within its flex parent. This is a targeted,
low-risk bug fix with no logic changes.
- No files require special attention
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-12 12:46:29 +00:00
Reinier van der Leer
113e87a23c refactor(backend): Reduce circular imports (#12068)
I'm getting circular import issues because there is a lot of
cross-importing between `backend.data`, `backend.blocks`, and other
modules. This change reduces block-related cross-imports and thus risk
of breaking circular imports.

### Changes 🏗️

- Strip down `backend.data.block`
- Move `Block` base class and related class/enum defs to
`backend.blocks._base`
  - Move `is_block_auth_configured` to `backend.blocks._utils`
- Move `get_blocks()`, `get_io_block_ids()` etc. to `backend.blocks`
(`__init__.py`)
  - Update imports everywhere
- Remove unused and poorly typed `Block.create()`
  - Change usages from `block_cls.create()` to `block_cls()`
- Improve typing of `load_all_blocks` and `get_blocks`
- Move cross-import of `backend.api.features.library.model` from
`backend/data/__init__.py` to `backend/data/integrations.py`
- Remove deprecated attribute `NodeModel.webhook`
  - Re-generate OpenAPI spec and fix frontend usage
- Eliminate module-level `backend.blocks` import from `blocks/agent.py`
- Eliminate module-level `backend.data.execution` and
`backend.executor.manager` imports from `blocks/helpers/review.py`
- Replace `BlockInput` with `GraphInput` for graph inputs

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI static type-checking + tests should be sufficient for this
2026-02-12 12:07:49 +00:00
Abhimanyu Yadav
d09f1532a4 feat(frontend): replace legacy builder with new flow editor
(#12081)

### Changes 🏗️

This PR completes the migration from the legacy builder to the new Flow
editor by removing all legacy code and feature flags.

**Removed:**
- Old builder view toggle functionality (`BuilderViewTabs.tsx`)
- Legacy debug panel (`RightSidebar.tsx`)
- Feature flags: `NEW_FLOW_EDITOR` and `BUILDER_VIEW_SWITCH`
- `useBuilderView` hook and related view-switching logic

**Updated:**
- Simplified `build/page.tsx` to always render the new Flow editor
- Added CSS styling (`flow.css`) to properly render Phosphor icons in
React Flow handles

**Tests:**
- Skipped e2e test suite in `build.spec.ts` (legacy builder tests)
- Follow-up PR (#12082) will add new e2e tests for the Flow editor

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
    - [x] Create a new flow and verify it loads correctly
    - [x] Add nodes and connections to verify basic functionality works
    - [x] Verify that node handles render correctly with the new CSS
- [x] Check that the UI is clean without the old debug panel or view
toggles

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
2026-02-12 11:16:01 +00:00
Zamil Majdy
a78145505b fix(copilot): merge split assistant messages to prevent Anthropic API errors (#12062)
## Summary
- When the copilot model responds with both text content AND a
long-running tool call (e.g., `create_agent`), the streaming code
created two separate consecutive assistant messages — one with text, one
with `tool_calls`. This caused Anthropic's API to reject with
`"unexpected tool_use_id found in tool_result blocks"` because the
`tool_result` couldn't find a matching `tool_use` in the immediately
preceding assistant message.
- Added a defensive merge of consecutive assistant messages in
`to_openai_messages()` (fixes existing corrupt sessions too)
- Fixed `_yield_tool_call` to add tool_calls to the existing
current-turn assistant message instead of creating a new one
- Changed `accumulated_tool_calls` assignment to use `extend` to prevent
overwriting tool_calls added by long-running tool flow

## Test plan
- [x] All 23 chat feature tests pass (`backend/api/features/chat/`)
- [x] All 44 prompt utility tests pass (`backend/util/prompt_test.py`)
- [x] All pre-commit hooks pass (ruff, isort, black, pyright)
- [ ] Manual test: create an agent via copilot, then ask a follow-up
question — should no longer get 400 error

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Fixes a critical bug where long-running tool calls (like `create_agent`)
caused Anthropic API 400 errors due to split assistant messages. The fix
ensures tool calls are added to the existing assistant message instead
of creating new ones, and adds a defensive merge function to repair any
existing corrupt sessions.

**Key changes:**
- Added `_merge_consecutive_assistant_messages()` to defensively merge
split assistant messages in `to_openai_messages()`
- Modified `_yield_tool_call()` to append tool calls to the current-turn
assistant message instead of creating a new one
- Changed `accumulated_tool_calls` from assignment to `extend` to
preserve tool calls already added by long-running tool flow

**Impact:** Resolves the issue where users received 400 errors after
creating agents via copilot and asking follow-up questions.
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- Safe to merge with minor verification recommended
- The changes are well-targeted and solve a real API compatibility
issue. The logic is sound: searching backwards for the current assistant
message is correct, and using `extend` instead of assignment prevents
overwriting. The defensive merge in `to_openai_messages()` also fixes
existing corrupt sessions. All existing tests pass according to the PR
description.
- No files require special attention - changes are localized and
defensive
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant StreamAPI as stream_chat_completion
    participant Chunks as _stream_chat_chunks
    participant ToolCall as _yield_tool_call
    participant Session as ChatSession
    
    User->>StreamAPI: Send message
    StreamAPI->>Chunks: Stream chat chunks
    
    alt Text + Long-running tool call
        Chunks->>StreamAPI: Text delta (content)
        StreamAPI->>Session: Append assistant message with content
        Chunks->>ToolCall: Tool call detected
        
        Note over ToolCall: OLD: Created new assistant message<br/>NEW: Appends to existing assistant
        
        ToolCall->>Session: Search backwards for current assistant
        ToolCall->>Session: Append tool_call to existing message
        ToolCall->>Session: Add pending tool result
    end
    
    StreamAPI->>StreamAPI: Merge accumulated_tool_calls
    Note over StreamAPI: Use extend (not assign)<br/>to preserve existing tool_calls
    
    StreamAPI->>Session: to_openai_messages()
    Session->>Session: _merge_consecutive_assistant_messages()
    Note over Session: Defensive: Merges any split<br/>assistant messages
    Session-->>StreamAPI: Merged messages
    
    StreamAPI->>User: Return response
```
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-12 01:52:17 +00:00
Otto
36aeb0b2b3 docs(blocks): clarify HumanInTheLoop output descriptions for agent builder (#12069)
## Problem

The agent builder (LLM) misinterprets the HumanInTheLoop block outputs.
It thinks `approved_data` and `rejected_data` will yield status strings
like "APPROVED" or "REJECTED" instead of understanding that the actual
input data passes through.

This leads to unnecessary complexity - the agent builder adds comparison
blocks to check for status strings that don't exist.

## Solution

Enriched the block docstring and all input/output field descriptions to
make it explicit that:
1. The output is the actual data itself, not a status string
2. The routing is determined by which output pin fires
3. How to use the block correctly (connect downstream blocks to
appropriate output pins)

## Changes

- Updated block docstring with clear "How it works" and "Example usage"
sections
- Enhanced `data` input description to explain data flow
- Enhanced `name` input description for reviewer context
- Enhanced `approved_data` output to explicitly state it's NOT a status
string
- Enhanced `rejected_data` output to explicitly state it's NOT a status
string
- Enhanced `review_message` output for clarity

## Testing

Documentation-only change to schema descriptions. No functional changes.

Fixes SECRT-1930

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Enhanced documentation for the `HumanInTheLoopBlock` to clarify how
output pins work. The key improvement explicitly states that output pins
(`approved_data` and `rejected_data`) yield the actual input data, not
status strings like "APPROVED" or "REJECTED". This prevents the agent
builder (LLM) from misinterpreting the block's behavior and adding
unnecessary comparison blocks.

**Key changes:**
- Added "How it works" and "Example usage" sections to the block
docstring
- Clarified that routing is determined by which output pin fires, not by
comparing output values
- Enhanced all input/output field descriptions with explicit data flow
explanations
- Emphasized that downstream blocks should be connected to the
appropriate output pin based on desired workflow path

This is a documentation-only change with no functional modifications to
the code logic.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge with no risk
- Documentation-only change that accurately reflects the existing code
behavior. No functional changes, no runtime impact, and the enhanced
descriptions correctly explain how the block outputs work based on
verification of the implementation code.
- No files require special attention
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-02-11 15:43:58 +00:00
Ubbe
2a189c44c4 fix(frontend): API stream issues leaking into prompt (#12063)
## Changes 🏗️

<img width="800" height="621" alt="Screenshot 2026-02-11 at 19 32 39"
src="https://github.com/user-attachments/assets/e97be1a7-972e-4ae0-8dfa-6ade63cf287b"
/>

When the BE API has an error, prevent it from leaking into the stream
and instead handle it gracefully via toast.

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and trust the changes

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

This PR fixes an issue where backend API stream errors were leaking into
the chat prompt instead of being handled gracefully. The fix involves
both backend and frontend changes to ensure error events conform to the
AI SDK's strict schema.

**Key Changes:**
- **Backend (`response_model.py`)**: Added custom `to_sse()` method for
`StreamError` that only emits `type` and `errorText` fields, stripping
extra fields like `code` and `details` that cause AI SDK validation
failures
- **Backend (`prompt.py`)**: Added validation step after context
compression to remove orphaned tool responses without matching tool
calls, preventing "unexpected tool_use_id" API errors
- **Frontend (`route.ts`)**: Implemented SSE stream normalization with
`normalizeSSEStream()` and `normalizeSSEEvent()` functions to strip
non-conforming fields from error events before they reach the AI SDK
- **Frontend (`ChatMessagesContainer.tsx`)**: Added toast notifications
for errors and improved error display UI with deduplication logic

The changes ensure a clean separation between internal error metadata
(useful for logging/debugging) and the strict schema required by the AI
SDK on the frontend.
</details>


<details><summary><h3>Confidence Score: 4/5</h3></summary>

- This PR is safe to merge with low risk
- The changes are well-structured and address a specific bug with proper
error handling. The dual-layer approach (backend filtering in `to_sse()`
+ frontend normalization) provides defense-in-depth. However, the lack
of automated tests for the new error normalization logic and the
potential for edge cases in SSE parsing prevent a perfect score.
- Pay close attention to
`autogpt_platform/frontend/src/app/api/chat/sessions/[sessionId]/stream/route.ts`
- the SSE normalization logic should be tested with various error
scenarios
</details>


<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant User
    participant Frontend as ChatMessagesContainer
    participant Proxy as /api/chat/.../stream
    participant Backend as Backend API
    participant AISDK as AI SDK

    User->>Frontend: Send message
    Frontend->>Proxy: POST with message
    Proxy->>Backend: Forward request with auth
    Backend->>Backend: Process message
    
    alt Success Path
        Backend->>Proxy: SSE stream (text-delta, etc.)
        Proxy->>Proxy: normalizeSSEStream (pass through)
        Proxy->>AISDK: Forward SSE events
        AISDK->>Frontend: Update messages
        Frontend->>User: Display response
    else Error Path
        Backend->>Backend: StreamError.to_sse()
        Note over Backend: Only emit {type, errorText}
        Backend->>Proxy: SSE error event
        Proxy->>Proxy: normalizeSSEEvent()
        Note over Proxy: Strip extra fields (code, details)
        Proxy->>AISDK: {type: "error", errorText: "..."}
        AISDK->>Frontend: error state updated
        Frontend->>Frontend: Toast notification (deduplicated)
        Frontend->>User: Show error UI + toast
    end
```
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Otto-AGPT <otto@agpt.co>
2026-02-11 22:46:37 +08:00
Abhimanyu Yadav
508759610f fix(frontend): add min-width-0 to ContentCard to prevent overflow (#12060)
### Changes 🏗️

Added `min-w-0` class to the ContentCard component in the ToolAccordion
to prevent content overflow issues. This CSS fix ensures that the card
properly respects its container width constraints and allows text
truncation to work correctly when content is too wide.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified that tool content displays correctly in the accordion
- [x] Confirmed that long content properly truncates instead of
overflowing
  - [x] Tested with various screen sizes to ensure responsive behavior

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Added `min-w-0` class to `ContentCard` component to fix text truncation
overflow in grid layouts. This is a standard CSS fix that allows grid
items to shrink below their content size, enabling `truncate` classes on
child elements (`ContentCardTitle`, `ContentCardSubtitle`) to work
correctly. The fix follows the same pattern already used in
`ContentCardHeader` (line 54) and `ToolAccordion` (line 54).
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- Safe to merge with no risk
- Single-line CSS fix that addresses a well-known flexbox/grid layout
issue. The change follows existing patterns in the codebase and is
thoroughly tested. No logic changes, no breaking changes, no side
effects.
- No files require special attention
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
2026-02-11 21:09:21 +08:00
Otto
062fe1aa70 fix(security): enforce disabled flag on blocks in graph validation (#12059)
## Summary
Blocks marked `disabled=True` (like BlockInstallationBlock) were not
being checked during graph validation, allowing them to be used via
direct API calls despite being hidden from the UI.

This adds a security check in `_validate_graph_get_errors()` to reject
any graph containing disabled blocks.

## Security Advisory
GHSA-4crw-9p35-9x54

## Linear
SECRT-1927

## Changes
- Added `block.disabled` check in graph validation (6 lines)

## Testing
- Graphs with disabled blocks → rejected with clear error message
- Graphs with valid blocks → unchanged behavior

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<details><summary><h3>Greptile Summary</h3></summary>

Adds critical security validation to prevent execution of disabled
blocks (like `BlockInstallationBlock`) via direct API calls. The fix
validates that `block.disabled` is `False` during graph validation in
`_validate_graph_get_errors()` on line 747-750, ensuring disabled blocks
are rejected before graph creation or execution. This closes a
vulnerability where blocks marked disabled in the UI could still be used
through API endpoints.
</details>


<details><summary><h3>Confidence Score: 5/5</h3></summary>

- This PR is safe to merge and addresses a critical security
vulnerability
- The fix is minimal (6 lines), correctly placed in the validation flow,
includes clear security context (GHSA reference), and follows existing
validation patterns. The check is positioned after block existence
validation and before input validation, ensuring disabled blocks are
caught early in both graph creation and execution paths.
- No files require special attention
</details>


<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->

---------

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 03:28:19 +00:00
dependabot[bot]
2cd0d4fe0f chore(deps): bump actions/checkout from 4 to 6 (#12034)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to
6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>v6-beta by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2298">actions/checkout#2298</a></li>
<li>update readme/changelog for v6 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2311">actions/checkout#2311</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5.0.0...v6.0.0">https://github.com/actions/checkout/compare/v5.0.0...v6.0.0</a></p>
<h2>v6-beta</h2>
<h2>What's Changed</h2>
<p>Updated persist-credentials to store the credentials under
<code>$RUNNER_TEMP</code> instead of directly in the local git
config.</p>
<p>This requires a minimum Actions Runner version of <a
href="https://github.com/actions/runner/releases/tag/v2.329.0">v2.329.0</a>
to access the persisted credentials for <a
href="https://docs.github.com/en/actions/tutorials/use-containerized-services/create-a-docker-container-action">Docker
container action</a> scenarios.</p>
<h2>v5.0.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5...v5.0.1">https://github.com/actions/checkout/compare/v5...v5.0.1</a></p>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
<li>Prepare v5.0.0 release by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2238">actions/checkout#2238</a></li>
</ul>
<h2>⚠️ Minimum Compatible Runner Version</h2>
<p><strong>v2.327.1</strong><br />
<a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></p>
<p>Make sure your runner is updated to this version or newer to use this
release.</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4...v5.0.0">https://github.com/actions/checkout/compare/v4...v5.0.0</a></p>
<h2>v4.3.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4...v4.3.1">https://github.com/actions/checkout/compare/v4...v4.3.1</a></p>
<h2>v4.3.0</h2>
<h2>What's Changed</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<h2>v6.0.2</h2>
<ul>
<li>Fix tag handling: preserve annotations and explicit fetch-tags by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2356">actions/checkout#2356</a></li>
</ul>
<h2>v6.0.1</h2>
<ul>
<li>Add worktree support for persist-credentials includeIf by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2327">actions/checkout#2327</a></li>
</ul>
<h2>v6.0.0</h2>
<ul>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
</ul>
<h2>v5.0.1</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<h2>v5.0.0</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
</ul>
<h2>v4.3.1</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<h2>v4.3.0</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li>Adjust positioning of user email note and permissions heading by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li>Update CODEOWNERS for actions by <a
href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li>
<li>Update package dependencies by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
</ul>
<h2>v4.2.2</h2>
<ul>
<li><code>url-helper.ts</code> now leverages well-known environment
variables by <a href="https://github.com/jww3"><code>@​jww3</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li>
<li>Expand unit test coverage for <code>isGhes</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li>
</ul>
<h2>v4.2.1</h2>
<ul>
<li>Check out other refs/* by commit if provided, fall back to ref by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li>
</ul>
<h2>v4.2.0</h2>
<ul>
<li>Add Ref and Commit outputs by <a
href="https://github.com/lucacome"><code>@​lucacome</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li>
<li>Dependency updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>- <a
href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>,
<a
href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li>
</ul>
<h2>v4.1.7</h2>
<ul>
<li>Bump the minor-npm-dependencies group across 1 directory with 4
updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li>
<li>Bump actions/checkout from 3 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li>
<li>Check out other refs/* by commit by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li>
<li>Pin actions/checkout's own workflows to a known, good, stable
version. by <a href="https://github.com/jww3"><code>@​jww3</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li>
</ul>
<h2>v4.1.6</h2>
<ul>
<li>Check platform to set archive extension appropriately by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="de0fac2e45"><code>de0fac2</code></a>
Fix tag handling: preserve annotations and explicit fetch-tags (<a
href="https://redirect.github.com/actions/checkout/issues/2356">#2356</a>)</li>
<li><a
href="064fe7f331"><code>064fe7f</code></a>
Add orchestration_id to git user-agent when ACTIONS_ORCHESTRATION_ID is
set (...</li>
<li><a
href="8e8c483db8"><code>8e8c483</code></a>
Clarify v6 README (<a
href="https://redirect.github.com/actions/checkout/issues/2328">#2328</a>)</li>
<li><a
href="033fa0dc0b"><code>033fa0d</code></a>
Add worktree support for persist-credentials includeIf (<a
href="https://redirect.github.com/actions/checkout/issues/2327">#2327</a>)</li>
<li><a
href="c2d88d3ecc"><code>c2d88d3</code></a>
Update all references from v5 and v4 to v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2314">#2314</a>)</li>
<li><a
href="1af3b93b68"><code>1af3b93</code></a>
update readme/changelog for v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2311">#2311</a>)</li>
<li><a
href="71cf2267d8"><code>71cf226</code></a>
v6-beta (<a
href="https://redirect.github.com/actions/checkout/issues/2298">#2298</a>)</li>
<li><a
href="069c695914"><code>069c695</code></a>
Persist creds to a separate file (<a
href="https://redirect.github.com/actions/checkout/issues/2286">#2286</a>)</li>
<li><a
href="ff7abcd0c3"><code>ff7abcd</code></a>
Update README to include Node.js 24 support details and requirements (<a
href="https://redirect.github.com/actions/checkout/issues/2248">#2248</a>)</li>
<li><a
href="08c6903cd8"><code>08c6903</code></a>
Prepare v5.0.0 release (<a
href="https://redirect.github.com/actions/checkout/issues/2238">#2238</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/checkout/compare/v4...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/checkout&package-manager=github_actions&previous-version=4&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Otto <otto@agpt.co>
2026-02-11 02:25:51 +00:00
dependabot[bot]
1ecae8c87e chore(backend/deps): bump aiofiles from 24.1.0 to 25.1.0 in /autogpt_platform/backend (#12043)
Bumps [aiofiles](https://github.com/Tinche/aiofiles) from 24.1.0 to
25.1.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/Tinche/aiofiles/releases">aiofiles's
releases</a>.</em></p>
<blockquote>
<h2>v25.1.0</h2>
<ul>
<li>Switch to <a href="https://docs.astral.sh/uv/">uv</a> + add Python
v3.14 support. (<a
href="https://redirect.github.com/Tinche/aiofiles/pull/219">#219</a>)</li>
<li>Add <code>ruff</code> formatter and linter. <a
href="https://redirect.github.com/Tinche/aiofiles/pull/216">#216</a></li>
<li>Drop Python 3.8 support. If you require it, use version 24.1.0. <a
href="https://redirect.github.com/Tinche/aiofiles/pull/204">#204</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/danielsmyers"><code>@​danielsmyers</code></a>
made their first contribution in <a
href="https://redirect.github.com/Tinche/aiofiles/pull/185">Tinche/aiofiles#185</a></li>
<li><a
href="https://github.com/stankudrow"><code>@​stankudrow</code></a> made
their first contribution in <a
href="https://redirect.github.com/Tinche/aiofiles/pull/192">Tinche/aiofiles#192</a></li>
<li><a
href="https://github.com/waketzheng"><code>@​waketzheng</code></a> made
their first contribution in <a
href="https://redirect.github.com/Tinche/aiofiles/pull/221">Tinche/aiofiles#221</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/Tinche/aiofiles/compare/v24.1.0...v25.1.0">https://github.com/Tinche/aiofiles/compare/v24.1.0...v25.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Tinche/aiofiles/blob/main/CHANGELOG.md">aiofiles's
changelog</a>.</em></p>
<blockquote>
<h2>25.1.0 (2025-10-09)</h2>
<ul>
<li>Switch to <a href="https://docs.astral.sh/uv/">uv</a> + add Python
v3.14 support.
(<a
href="https://redirect.github.com/Tinche/aiofiles/pull/219">#219</a>)</li>
<li>Add <code>ruff</code> formatter and linter.
<a
href="https://redirect.github.com/Tinche/aiofiles/pull/216">#216</a></li>
<li>Drop Python 3.8 support. If you require it, use version 24.1.0.
<a
href="https://redirect.github.com/Tinche/aiofiles/pull/204">#204</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="348f5ef656"><code>348f5ef</code></a>
v25.1.0</li>
<li><a
href="5e1bb8f12b"><code>5e1bb8f</code></a>
docs: update readme to use ruff badge (<a
href="https://redirect.github.com/Tinche/aiofiles/issues/221">#221</a>)</li>
<li><a
href="6fdc25c781"><code>6fdc25c</code></a>
Move to uv. (<a
href="https://redirect.github.com/Tinche/aiofiles/issues/219">#219</a>)</li>
<li><a
href="1989132423"><code>1989132</code></a>
set 'function' as a default fixture loop scope value</li>
<li><a
href="8986452a1b"><code>8986452</code></a>
add the 'asyncio_default_fixture_loop_scope=session' option</li>
<li><a
href="ccab1ff776"><code>ccab1ff</code></a>
update pytest-asyncio==1.0.0</li>
<li><a
href="8727c96f5b"><code>8727c96</code></a>
add PR <a
href="https://redirect.github.com/Tinche/aiofiles/issues/216">#216</a>
into the CHANGELOG</li>
<li><a
href="a9388e5f8d"><code>a9388e5</code></a>
add TID and ignore TID252</li>
<li><a
href="760366489a"><code>7603664</code></a>
remove [ruff].exclude keyval</li>
<li><a
href="7c49a5c5f2"><code>7c49a5c</code></a>
add final newlines</li>
<li>Additional commits viewable in <a
href="https://github.com/Tinche/aiofiles/compare/v24.1.0...v25.1.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=aiofiles&package-manager=pip&previous-version=24.1.0&new-version=25.1.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Otto <otto@agpt.co>
2026-02-10 23:32:30 +00:00
dependabot[bot]
659338f90c chore(deps): bump peter-evans/repository-dispatch from 3 to 4 (#12035)
Bumps
[peter-evans/repository-dispatch](https://github.com/peter-evans/repository-dispatch)
from 3 to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/peter-evans/repository-dispatch/releases">peter-evans/repository-dispatch's
releases</a>.</em></p>
<blockquote>
<h2>Repository Dispatch v4.0.0</h2>
<p>⚙️ Requires <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Actions
Runner v2.327.1</a> or later if you are using a self-hosted runner for
Node 24 support.</p>
<h2>What's Changed</h2>
<ul>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.8 to
18.19.10 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/306">peter-evans/repository-dispatch#306</a></li>
<li>build(deps): bump peter-evans/repository-dispatch from 2 to 3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/307">peter-evans/repository-dispatch#307</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.10 to
18.19.14 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/308">peter-evans/repository-dispatch#308</a></li>
<li>build(deps): bump peter-evans/create-pull-request from 5 to 6 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/310">peter-evans/repository-dispatch#310</a></li>
<li>build(deps): bump peter-evans/slash-command-dispatch from 3 to 4 by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/309">peter-evans/repository-dispatch#309</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.14 to
18.19.15 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/311">peter-evans/repository-dispatch#311</a></li>
<li>build(deps-dev): bump prettier from 3.2.4 to 3.2.5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/312">peter-evans/repository-dispatch#312</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.15 to
18.19.17 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/313">peter-evans/repository-dispatch#313</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.17 to
18.19.18 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/314">peter-evans/repository-dispatch#314</a></li>
<li>build(deps-dev): bump eslint-plugin-github from 4.10.1 to 4.10.2 by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/316">peter-evans/repository-dispatch#316</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.18 to
18.19.21 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/317">peter-evans/repository-dispatch#317</a></li>
<li>build(deps-dev): bump eslint from 8.56.0 to 8.57.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/318">peter-evans/repository-dispatch#318</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.21 to
18.19.22 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/319">peter-evans/repository-dispatch#319</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.22 to
18.19.24 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/320">peter-evans/repository-dispatch#320</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.24 to
18.19.26 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/321">peter-evans/repository-dispatch#321</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.26 to
18.19.29 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/322">peter-evans/repository-dispatch#322</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.29 to
18.19.31 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/323">peter-evans/repository-dispatch#323</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.31 to
18.19.33 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/324">peter-evans/repository-dispatch#324</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.33 to
18.19.34 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/325">peter-evans/repository-dispatch#325</a></li>
<li>build(deps-dev): bump prettier from 3.2.5 to 3.3.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/326">peter-evans/repository-dispatch#326</a></li>
<li>build(deps-dev): bump prettier from 3.3.1 to 3.3.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/327">peter-evans/repository-dispatch#327</a></li>
<li>build(deps-dev): bump braces from 3.0.2 to 3.0.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/328">peter-evans/repository-dispatch#328</a></li>
<li>build(deps-dev): bump ws from 7.5.9 to 7.5.10 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/329">peter-evans/repository-dispatch#329</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.34 to
18.19.38 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/330">peter-evans/repository-dispatch#330</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.38 to
18.19.39 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/332">peter-evans/repository-dispatch#332</a></li>
<li>build(deps-dev): bump prettier from 3.3.2 to 3.3.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/334">peter-evans/repository-dispatch#334</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.39 to
18.19.41 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/335">peter-evans/repository-dispatch#335</a></li>
<li>build(deps-dev): bump eslint-plugin-prettier from 5.1.3 to 5.2.1 by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/336">peter-evans/repository-dispatch#336</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.41 to
18.19.42 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/337">peter-evans/repository-dispatch#337</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.42 to
18.19.43 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/338">peter-evans/repository-dispatch#338</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.43 to
18.19.44 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/339">peter-evans/repository-dispatch#339</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.44 to
18.19.45 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/340">peter-evans/repository-dispatch#340</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.45 to
18.19.47 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/341">peter-evans/repository-dispatch#341</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.47 to
18.19.50 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/343">peter-evans/repository-dispatch#343</a></li>
<li>build(deps): bump peter-evans/create-pull-request from 6 to 7 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/342">peter-evans/repository-dispatch#342</a></li>
<li>build(deps-dev): bump eslint from 8.57.0 to 8.57.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/344">peter-evans/repository-dispatch#344</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.50 to
18.19.53 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/345">peter-evans/repository-dispatch#345</a></li>
<li>build(deps-dev): bump <code>@​vercel/ncc</code> from 0.38.1 to
0.38.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/346">peter-evans/repository-dispatch#346</a></li>
<li>Update distribution by <a
href="https://github.com/actions-bot"><code>@​actions-bot</code></a> in
<a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/347">peter-evans/repository-dispatch#347</a></li>
<li>build(deps): bump <code>@​actions/core</code> from 1.10.1 to 1.11.0
by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/349">peter-evans/repository-dispatch#349</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.53 to
18.19.54 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/348">peter-evans/repository-dispatch#348</a></li>
<li>Update distribution by <a
href="https://github.com/actions-bot"><code>@​actions-bot</code></a> in
<a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/350">peter-evans/repository-dispatch#350</a></li>
<li>build(deps): bump <code>@​actions/core</code> from 1.11.0 to 1.11.1
by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/351">peter-evans/repository-dispatch#351</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.54 to
18.19.55 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/352">peter-evans/repository-dispatch#352</a></li>
<li>Update distribution by <a
href="https://github.com/actions-bot"><code>@​actions-bot</code></a> in
<a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/353">peter-evans/repository-dispatch#353</a></li>
<li>build(deps-dev): bump <code>@​types/node</code> from 18.19.55 to
18.19.56 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/repository-dispatch/pull/354">peter-evans/repository-dispatch#354</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="28959ce8df"><code>28959ce</code></a>
Fix node version in actions.yml (<a
href="https://redirect.github.com/peter-evans/repository-dispatch/issues/433">#433</a>)</li>
<li><a
href="25d29c2bbf"><code>25d29c2</code></a>
build(deps-dev): bump <code>@​types/node</code> in the npm group (<a
href="https://redirect.github.com/peter-evans/repository-dispatch/issues/432">#432</a>)</li>
<li><a
href="830136c664"><code>830136c</code></a>
build(deps): bump the github-actions group with 3 updates (<a
href="https://redirect.github.com/peter-evans/repository-dispatch/issues/431">#431</a>)</li>
<li><a
href="2c856c63fe"><code>2c856c6</code></a>
ci: update dependabot config</li>
<li><a
href="66739071c2"><code>6673907</code></a>
build(deps-dev): bump <code>@​types/node</code> from 18.19.127 to
18.19.129 (<a
href="https://redirect.github.com/peter-evans/repository-dispatch/issues/429">#429</a>)</li>
<li><a
href="952a211c1e"><code>952a211</code></a>
build(deps): bump peter-evans/repository-dispatch from 3 to 4 (<a
href="https://redirect.github.com/peter-evans/repository-dispatch/issues/428">#428</a>)</li>
<li><a
href="5fc4efd1a4"><code>5fc4efd</code></a>
docs: update readme</li>
<li><a
href="a628c95fd1"><code>a628c95</code></a>
feat: v4 (<a
href="https://redirect.github.com/peter-evans/repository-dispatch/issues/427">#427</a>)</li>
<li><a
href="de78ac1a71"><code>de78ac1</code></a>
build(deps-dev): bump <code>@​vercel/ncc</code> from 0.38.3 to 0.38.4
(<a
href="https://redirect.github.com/peter-evans/repository-dispatch/issues/425">#425</a>)</li>
<li><a
href="f49fa7f26b"><code>f49fa7f</code></a>
build(deps-dev): bump <code>@​types/node</code> from 18.19.124 to
18.19.127 (<a
href="https://redirect.github.com/peter-evans/repository-dispatch/issues/426">#426</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/peter-evans/repository-dispatch/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=peter-evans/repository-dispatch&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-02-10 21:28:23 +00:00
Abhimanyu Yadav
4df5b7bde7 refactor(frontend): remove defaultExpanded prop from ToolAccordion components (#12054)
### Changes

- Removed `defaultExpanded` prop from `ToolAccordion` in CreateAgent,
EditAgent, RunAgent, and RunBlock components to streamline the code and
improve readability.

### Impact

- This refactor enhances maintainability by reducing complexity in the
component structure while preserving existing functionality.

### Changes 🏗️

- Removed conditional expansion logic from all tool components
- Simplified ToolAccordion implementation across all affected components

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create and run agents with various tools to verify accordion
behavior works correctly
    - [x] Verify that UI components expand and collapse as expected
    - [x] Test with different output types to ensure proper rendering

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-02-11 00:22:01 +08:00
Otto
017a00af46 feat(copilot): Enable extended thinking for Claude models (#12052)
## Summary

Enables Anthropic's extended thinking feature for Claude models in
CoPilot via OpenRouter. This keeps the model's chain-of-thought
reasoning internal rather than outputting it to users.

## Problem

The CoPilot prompt was designed for a thinking agent (with
`<internal_reasoning>` tags), but extended thinking wasn't enabled on
the API side. This caused the model to output its reasoning as regular
text, leaking internal analysis to users.

## Solution

Added thinking configuration to the OpenRouter `extra_body` for
Anthropic models:
```python
extra_body["provider"] = {
    "anthropic": {
        "thinking": {
            "type": "enabled",
            "budget_tokens": config.thinking_budget_tokens,
        }
    }
}
```

## Configuration

New settings in `ChatConfig`:
| Setting | Default | Description |
|---------|---------|-------------|
| `thinking_enabled` | `True` | Enable extended thinking for Claude
models |
| `thinking_budget_tokens` | `10000` | Token budget for thinking
(1000-100000) |

## Changes

- `config.py`: Added `thinking_enabled` and `thinking_budget_tokens`
settings
- `service.py`: Added thinking config to all 3 places where `extra_body`
is built for LLM calls

## Testing

- Verify CoPilot responses no longer include internal reasoning text
- Check that Claude's extended thinking is working (should see thinking
tokens in usage)
- Confirm non-Anthropic models are unaffected

## Related

Discussion:
https://discord.com/channels/1126875755960336515/1126875756925046928/1470779843552612607

---------

Co-authored-by: Swifty <craigswift13@gmail.com>
2026-02-10 16:18:05 +01:00
Reinier van der Leer
52650eed1d refactor(frontend/auth): Move /copilot auth check to middleware (#12053)
These "is the user authenticated, and should they be?" checks should not
be spread across the codebase, it's complex enough as it is. :')

- Follow-up to #12050

### Changes 🏗️

- Revert "fix(frontend): copilot redirect logout (#12050)"
- Add `/copilot` to `PROTECTED_PAGES` in `@/lib/supabase/helpers`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Trivial change, we know this works for other pages
2026-02-10 14:43:33 +00:00
dependabot[bot]
81c1524658 chore(backend/deps): bump the production-dependencies group in /autogpt_platform/backend with 2 updates (#12037)
Bumps the production-dependencies group in /autogpt_platform/backend
with 2 updates: [fastapi](https://github.com/fastapi/fastapi) and
[langfuse](https://github.com/langfuse/langfuse).

Updates `fastapi` from 0.128.5 to 0.128.6
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/fastapi/fastapi/releases">fastapi's
releases</a>.</em></p>
<blockquote>
<h2>0.128.6</h2>
<h3>Fixes</h3>
<ul>
<li>🐛 Fix <code>on_startup</code> and <code>on_shutdown</code>
parameters of <code>APIRouter</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14873">#14873</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
</ul>
<h3>Translations</h3>
<ul>
<li>🌐 Update translations for zh (update-outdated). PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14843">#14843</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li> Fix parameterized tests with snapshots. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14875">#14875</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="fbca586c1d"><code>fbca586</code></a>
📝 Update release notes</li>
<li><a
href="4e879799dd"><code>4e87979</code></a>
📝 Update release notes</li>
<li><a
href="0a4033aeee"><code>0a4033a</code></a>
🔖 Release version 0.128.6</li>
<li><a
href="ed2512a5ec"><code>ed2512a</code></a>
🐛 Fix <code>on_startup</code> and <code>on_shutdown</code> parameters of
<code>APIRouter</code> (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14873">#14873</a>)</li>
<li><a
href="0c0f6332e2"><code>0c0f633</code></a>
📝 Update release notes</li>
<li><a
href="227cb85a03"><code>227cb85</code></a>
 Fix parameterized tests with snapshots (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14875">#14875</a>)</li>
<li><a
href="cd31576d57"><code>cd31576</code></a>
📝 Update release notes</li>
<li><a
href="376e108580"><code>376e108</code></a>
🌐 Update translations for zh (update-outdated) (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14843">#14843</a>)</li>
<li>See full diff in <a
href="https://github.com/fastapi/fastapi/compare/0.128.5...0.128.6">compare
view</a></li>
</ul>
</details>
<br />

Updates `langfuse` from 3.13.0 to 3.14.1
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/langfuse/langfuse/commits">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Otto <otto@agpt.co>
2026-02-10 13:32:48 +00:00
Ubbe
f2ead70f3d fix(frontend): copilot redirect logout (#12050)
## Changes 🏗️

Redirect to `/login` if the user is not authenticated and tries to
access `/copilot`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and tested
2026-02-10 21:39:11 +08:00
Abhimanyu Yadav
7d4c020a9b feat(chat): implement AI SDK integration with custom streaming response handling (#11901)
### Changes 🏗️

- Added AI SDK integration for chat streaming with proper message
handling
- Implemented custom to_sse method in StreamToolOutputAvailable to
exclude non-spec fields
- Modified stream_chat_completion to reuse message IDs for tool call
continuations
- Created new Copilot 2.0 UI with AI SDK React components
- Added streamdown and related packages for markdown rendering
- Built reusable conversation and message components for the chat
interface
- Added support for tool output display in the chat UI

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Start a new chat session and verify streaming works correctly
  - [x] Test tool calls and verify they display properly in the UI
  - [x] Verify message continuations don't create duplicate messages
  - [x] Test markdown rendering with code blocks and other formatting
  - [x] Verify the UI is responsive and scrolls correctly

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Ubbe <hi@ubbe.dev>
2026-02-10 21:12:21 +08:00
dependabot[bot]
e596ea87cb chore(libs/deps-dev): bump pytest-cov from 6.2.1 to 7.0.0 in /autogpt_platform/autogpt_libs (#12030)
Bumps [pytest-cov](https://github.com/pytest-dev/pytest-cov) from 6.2.1
to 7.0.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst">pytest-cov's
changelog</a>.</em></p>
<blockquote>
<h2>7.0.0 (2025-09-09)</h2>
<ul>
<li>
<p>Dropped support for subprocesses measurement.</p>
<p>It was a feature added long time ago when coverage lacked a nice way
to measure subprocesses created in tests.
It relied on a <code>.pth</code> file, there was no way to opt-out and
it created bad interations
with <code>coverage's new patch system
&lt;https://coverage.readthedocs.io/en/latest/config.html#run-patch&gt;</code>_
added
in <code>7.10
&lt;https://coverage.readthedocs.io/en/7.10.6/changes.html#version-7-10-0-2025-07-24&gt;</code>_.</p>
<p>To migrate to this release you might need to enable the suprocess
patch, example for <code>.coveragerc</code>:</p>
<p>.. code-block:: ini</p>
<p>[run]
patch = subprocess</p>
<p>This release also requires at least coverage 7.10.6.</p>
</li>
<li>
<p>Switched packaging to have metadata completely in
<code>pyproject.toml</code> and use <code>hatchling
&lt;https://pypi.org/project/hatchling/&gt;</code>_ for
building.
Contributed by Ofek Lev in
<code>[#551](https://github.com/pytest-dev/pytest-cov/issues/551)
&lt;https://github.com/pytest-dev/pytest-cov/pull/551&gt;</code>_
with some extras in
<code>[#716](https://github.com/pytest-dev/pytest-cov/issues/716)
&lt;https://github.com/pytest-dev/pytest-cov/pull/716&gt;</code>_.</p>
</li>
<li>
<p>Removed some not really necessary testing deps like
<code>six</code>.</p>
</li>
</ul>
<h2>6.3.0 (2025-09-06)</h2>
<ul>
<li>Added support for markdown reports.
Contributed by Marcos Boger in
<code>[#712](https://github.com/pytest-dev/pytest-cov/issues/712)
&lt;https://github.com/pytest-dev/pytest-cov/pull/712&gt;</code>_
and <code>[#714](https://github.com/pytest-dev/pytest-cov/issues/714)
&lt;https://github.com/pytest-dev/pytest-cov/pull/714&gt;</code>_.</li>
<li>Fixed some formatting issues in docs.
Anonymous contribution in
<code>[#706](https://github.com/pytest-dev/pytest-cov/issues/706)
&lt;https://github.com/pytest-dev/pytest-cov/pull/706&gt;</code>_.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="224d8964ca"><code>224d896</code></a>
Bump version: 6.3.0 → 7.0.0</li>
<li><a
href="73424e3999"><code>73424e3</code></a>
Cleanup the docs a bit.</li>
<li><a
href="36f1cc2967"><code>36f1cc2</code></a>
Bump pins in template.</li>
<li><a
href="f299c590a6"><code>f299c59</code></a>
Bump the github-actions group with 2 updates</li>
<li><a
href="25f0b2e0cd"><code>25f0b2e</code></a>
Update docs/config.rst</li>
<li><a
href="bb23eacc55"><code>bb23eac</code></a>
Improve configuration docs</li>
<li><a
href="a19531e91e"><code>a19531e</code></a>
Switch from build/pre-commit to uv/prek - this should make this
faster.</li>
<li><a
href="82f9993910"><code>82f9993</code></a>
Update changelog.</li>
<li><a
href="211b5cd41c"><code>211b5cd</code></a>
Fix links.</li>
<li><a
href="97aadd74bc"><code>97aadd7</code></a>
Update some ci config, reformat and apply some lint fixes.</li>
<li>Additional commits viewable in <a
href="https://github.com/pytest-dev/pytest-cov/compare/v6.2.1...v7.0.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pytest-cov&package-manager=pip&previous-version=6.2.1&new-version=7.0.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Otto <otto@agpt.co>
2026-02-10 11:22:25 +00:00
Otto
81f8290f01 debug(backend/db): Add diagnostic logging for vector type errors (#12024)
Adds diagnostic logging when the `type vector does not exist` error
occurs in raw SQL queries.

## Problem

We're seeing intermittent "type vector does not exist" errors on
dev-behave ([Sentry
issue](https://significant-gravitas.sentry.io/issues/7205929979/)). The
pgvector extension should be in the search_path, but occasionally
queries fail to resolve the vector type.

## Solution

When a query fails with this specific error, we now log:
- `SHOW search_path` - what schemas are being searched
- `SELECT current_schema()` - the active schema
- `SELECT current_user, session_user, current_database()` - connection
context

This diagnostic info will help identify why the vector extension isn't
visible in certain cases.

## Changes

- Added `_log_vector_error_diagnostics()` helper function in
`backend/data/db.py`
- Wrapped SQL execution in try/except to catch and diagnose vector type
errors
- Original exception is re-raised after logging (no behavior change)

## Testing

This is observational/diagnostic code. It will be validated by waiting
for the error to occur naturally on dev and checking the logs.

## Rollout

Once we've captured diagnostic logs and identified the root cause, this
logging can be removed or reduced in verbosity.
2026-02-10 07:35:13 +00:00
Reinier van der Leer
6467f6734f debug(backend/chat): Add timing logging to chat stream generation mechanism (#12019)
[SECRT-1912: Investigate & eliminate chat session start
latency](https://linear.app/autogpt/issue/SECRT-1912)

### Changes 🏗️

- Add timing logs to `backend.api.features.chat` in `routes.py`,
`service.py`, and `stream_registry.py`
- Remove unneeded DB join in `create_chat_session`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI checks
2026-02-09 14:05:29 +00:00
Otto
5a30d11416 refactor(copilot): Code cleanup and deduplication (#11950)
## Summary

Code cleanup of the AI Copilot codebase - rebased onto latest dev.

## Changes

### New Files
- `backend/util/validation.py` - UUID validation helpers
- `backend/api/features/chat/tools/helpers.py` - Shared tool utilities

### Credential Matching Consolidation  
- Added shared utilities to `utils.py`
- Refactored `run_block._check_block_credentials()` with discriminator
support
- Extracted `_resolve_discriminated_credentials()` for multi-provider
handling

### Routes Cleanup
- Extracted `_create_stream_generator()` and `SSE_RESPONSE_HEADERS`

### Tool Files Cleanup
- Updated `run_agent.py` and `run_block.py` to use shared helpers

**WIP** - This PR will be updated incrementally.
2026-02-09 13:43:55 +00:00
Bently
1f4105e8f9 fix(frontend): Handle object values in FileInput component (#11948)
Fixes
[#11800](https://github.com/Significant-Gravitas/AutoGPT/issues/11800)

## Problem
The FileInput component crashed with `TypeError: e.startsWith is not a
function` when the value was an object (from external API) instead of a
string.

## Example Input Object
When using the external API
(`/external-api/v1/graphs/{id}/execute/{version}`), file inputs can be
passed as objects:

```json
{
  "node_input": {
    "input_image": {
      "name": "image.jpeg",
      "type": "image/jpeg",
      "size": 131147,
      "data": "/9j/4QAW..."
    }
  }
}
```

## Changes
- Updated `getFileLabelFromValue()` to handle object format: `{ name,
type, size, data }`
- Added type guards for string vs object values
- Graceful fallback for edge cases (null, undefined, empty object)

## Test cases verified
- Object with name: returns filename
- Object with type only: extracts and formats MIME type
- String data URI: parses correctly
- String file path: extracts extension
- Edge cases: returns "File" fallback
2026-02-09 10:25:08 +00:00
Bently
caf9ff34e6 fix(backend): Handle stale RabbitMQ channels on connection drop (#11929)
### Changes 🏗️

Fixes
[**AUTOGPT-SERVER-1TN**](https://autoagpt.sentry.io/issues/?query=AUTOGPT-SERVER-1TN)
(~39K events since Feb 2025) and related connection issues
**6JC/6JD/6JE/6JF** (~6K combined).

#### Problem

When the RabbitMQ TCP connection drops (network blip, server restart,
etc.):

1. `connect_robust` (aio_pika) automatically reconnects the underlying
AMQP connection
2. But `AsyncRabbitMQ._channel` still references the **old dead
channel**
3. `is_ready` checks `not self._channel.is_closed` — but the channel
object doesn't know the transport is gone
4. `publish_message` tries to use the stale channel →
`ChannelInvalidStateError: No active transport in channel`
5. `@func_retry` retries 5 times, but each retry hits the same stale
channel (it passes `is_ready`)

This means every connection drop generates errors until the process is
restarted.

#### Fix

**New `_ensure_channel()` helper** that resets stale channels before
reconnecting, so `connect()` creates a fresh one instead of
short-circuiting on `is_connected`.

**Explicit `ChannelInvalidStateError` handling in `publish_message`:**
1. First attempt uses `_ensure_channel()` (handles normal staleness)
2. If publish throws `ChannelInvalidStateError`, does a full reconnect
(resets both `_channel` and `_connection`) and retries once
3. `@func_retry` provides additional retry resilience on top

**Simplified `get_channel()`** to use the same resilient helper.

**1 file changed, 62 insertions, 24 deletions.**

#### Impact
- Eliminates ~39K `ChannelInvalidStateError` Sentry events
- RabbitMQ operations self-heal after connection drops without process
restart
- Related transport EOF errors (6JC/6JD/6JE/6JF) should also reduce
2026-02-09 10:24:08 +00:00
Nicholas Tindle
e8fc8ee623 fix(backend): filter graph-only blocks from CoPilot's find_block results (#11892)
Filters out blocks that are unsuitable for standalone execution from
CoPilot's block search and execution. These blocks serve graph-specific
purposes and will either fail, hang, or confuse users when run outside
of a graph context.

**Important:** This does NOT affect the Builder UI which uses
`load_all_blocks()` directly.

### Changes 🏗️

- **find_block.py**: Added `EXCLUDED_BLOCK_TYPES` and
`EXCLUDED_BLOCK_IDS` constants, skip excluded blocks in search results
- **run_block.py**: Added execution guard that returns clear error
message for excluded blocks
- **content_handlers.py**: Added filtering to
`BlockHandler.get_missing_items()` and `get_stats()` to prevent indexing
excluded blocks

**Excluded by BlockType:**
| BlockType | Reason |
|-----------|--------|
| `INPUT` | Graph interface definition - data enters via chat, not graph
inputs |
| `OUTPUT` | Graph interface definition - data exits via chat, not graph
outputs |
| `WEBHOOK` | Wait for external events - would hang forever in CoPilot |
| `WEBHOOK_MANUAL` | Same as WEBHOOK |
| `NOTE` | Visual annotation only - no runtime behavior |
| `HUMAN_IN_THE_LOOP` | Pauses for human approval - CoPilot IS
human-in-the-loop |
| `AGENT` | AgentExecutorBlock requires graph context - use `run_agent`
tool instead |

**Excluded by ID:**
| Block | Reason |
|-------|--------|
| `SmartDecisionMakerBlock` | Dynamically discovers downstream blocks
via graph topology |

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [ ] Search for "input" in CoPilot - should NOT return AgentInputBlock
variants
- [ ] Search for "output" in CoPilot - should NOT return
AgentOutputBlock
- [ ] Search for "webhook" in CoPilot - should NOT return trigger blocks
- [ ] Search for "human" in CoPilot - should NOT return
HumanInTheLoopBlock
- [ ] Search for "decision" in CoPilot - should NOT return
SmartDecisionMakerBlock
- [ ] Verify functional blocks still appear (e.g., "email", "http",
"text")
  - [ ] Verify Builder UI still shows ALL blocks (no regression)

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

No configuration changes required.

---

Resolves: [SECRT-1831](https://linear.app/autogpt/issue/SECRT-1831)

🤖 Generated with [Claude Code](https://claude.ai/code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Behavior change is limited to CoPilot’s block discovery/execution
guards and is covered by new tests; main risk is inadvertently excluding
a block that should be runnable.
> 
> **Overview**
> CoPilot now **filters out graph-only blocks** from `find_block`
results and prevents them from being executed via `run_block`, returning
a clear error when a user attempts to run an excluded block.
> 
> `find_block` introduces explicit exclusion lists (by `BlockType` and a
specific block ID), over-fetches search results to maintain up to 10
usable matches after filtering, and adds debug logging when results are
reduced. New unit tests cover both the search filtering and the
`run_block` execution guard; a minor cleanup removes an unused `pytest`
import in `execution_queue_test.py`.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
bc50755dcf. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Otto <otto@agpt.co>
2026-02-09 07:19:43 +00:00
dependabot[bot]
1a16e203b8 chore(deps): Bump actions/setup-node from 4 to 6 (#11213)
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4
to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-node/releases">actions/setup-node's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<p><strong>Breaking Changes</strong></p>
<ul>
<li>Limit automatic caching to npm, update workflows and documentation
by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1374">actions/setup-node#1374</a></li>
</ul>
<p><strong>Dependency Upgrades</strong></p>
<ul>
<li>Upgrade ts-jest from 29.1.2 to 29.4.1 and document breaking changes
in v5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1336">#1336</a></li>
<li>Upgrade prettier from 2.8.8 to 3.6.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1334">#1334</a></li>
<li>Upgrade actions/publish-action from 0.3.0 to 0.4.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1362">#1362</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v5...v6.0.0">https://github.com/actions/setup-node/compare/v5...v6.0.0</a></p>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Enhance caching in setup-node with automatic package manager
detection by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
</ul>
<p>This update, introduces automatic caching when a valid
<code>packageManager</code> field is present in your
<code>package.json</code>. This aims to improve workflow performance and
make dependency management more seamless.
To disable this automatic caching, set <code>package-manager-cache:
false</code></p>
<pre lang="yaml"><code>steps:
- uses: actions/checkout@v5
- uses: actions/setup-node@v5
  with:
    package-manager-cache: false
</code></pre>
<ul>
<li>Upgrade action to use node24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p>Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">See
Release Notes</a></p>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade <code>@​octokit/request-error</code> and
<code>@​actions/github</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1227">actions/setup-node#1227</a></li>
<li>Upgrade uuid from 9.0.1 to 11.1.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1273">actions/setup-node#1273</a></li>
<li>Upgrade undici from 5.28.5 to 5.29.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1295">actions/setup-node#1295</a></li>
<li>Upgrade form-data to bring in fix for critical vulnerability by <a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a> in
<a
href="https://redirect.github.com/actions/setup-node/pull/1332">actions/setup-node#1332</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1345">actions/setup-node#1345</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v4...v5.0.0">https://github.com/actions/setup-node/compare/v4...v5.0.0</a></p>
<h2>v4.4.0</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2028fbc5c2"><code>2028fbc</code></a>
Limit automatic caching to npm, update workflows and documentation (<a
href="https://redirect.github.com/actions/setup-node/issues/1374">#1374</a>)</li>
<li><a
href="13427813f7"><code>1342781</code></a>
Bump actions/publish-action from 0.3.0 to 0.4.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1362">#1362</a>)</li>
<li><a
href="89d709d423"><code>89d709d</code></a>
Bump prettier from 2.8.8 to 3.6.2 (<a
href="https://redirect.github.com/actions/setup-node/issues/1334">#1334</a>)</li>
<li><a
href="cd2651c462"><code>cd2651c</code></a>
Bump ts-jest from 29.1.2 to 29.4.1 (<a
href="https://redirect.github.com/actions/setup-node/issues/1336">#1336</a>)</li>
<li><a
href="a0853c2454"><code>a0853c2</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-node/issues/1345">#1345</a>)</li>
<li><a
href="b7234cc9fe"><code>b7234cc</code></a>
Upgrade action to use node24 (<a
href="https://redirect.github.com/actions/setup-node/issues/1325">#1325</a>)</li>
<li><a
href="d7a11313b5"><code>d7a1131</code></a>
Enhance caching in setup-node with automatic package manager detection
(<a
href="https://redirect.github.com/actions/setup-node/issues/1348">#1348</a>)</li>
<li><a
href="5e2628c959"><code>5e2628c</code></a>
Bumps form-data (<a
href="https://redirect.github.com/actions/setup-node/issues/1332">#1332</a>)</li>
<li><a
href="65beceff8e"><code>65becef</code></a>
Bump undici from 5.28.5 to 5.29.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1295">#1295</a>)</li>
<li><a
href="7e24a656e1"><code>7e24a65</code></a>
Bump uuid from 9.0.1 to 11.1.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1273">#1273</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/setup-node/compare/v4...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-node&package-manager=github_actions&previous-version=4&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-09 07:11:21 +00:00
dependabot[bot]
5dae303ce0 chore(frontend/deps): Bump react-window and @types/react-window in /autogpt_platform/frontend (#10943)
Bumps [react-window](https://github.com/bvaughn/react-window) and
[@types/react-window](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-window).
These dependencies needed to be updated together.
Updates `react-window` from 1.8.11 to 2.1.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/bvaughn/react-window/releases">react-window's
releases</a>.</em></p>
<blockquote>
<h2>2.1.0</h2>
<p>Improved ARIA support:</p>
<ul>
<li>Add better default ARIA attributes for outer
<code>HTMLDivElement</code></li>
<li>Add optional <code>ariaAttributes</code> prop to row and cell
renderers to simplify better ARIA attributes for user-rendered
cells</li>
<li>Remove intermediate <code>HTMLDivElement</code> from
<code>List</code> and <code>Grid</code>
<ul>
<li>This may enable more/better custom CSS styling</li>
<li>This may also enable adding an optional <code>children</code> prop
to <code>List</code> and <code>Grid</code> for e.g.
overlays/tooltips</li>
</ul>
</li>
<li>Add optional <code>tagName</code> prop; defaults to
<code>&quot;div&quot;</code> but can be changed to e.g.
<code>&quot;ul&quot;</code></li>
</ul>
<pre lang="tsx"><code>// Example of how to use new `ariaAttributes` prop
function RowComponent({
  ariaAttributes,
  index,
  style,
  ...rest
}: RowComponentProps&lt;object&gt;) {
  return (
    &lt;div style={style} {...ariaAttributes}&gt;
      ...
    &lt;/div&gt;
  );
}
</code></pre>
<p>Added optional <code>children</code> prop to better support edge
cases like sticky rows.</p>
<p>Minor changes to <code>onRowsRendered</code> and
<code>onCellsRendered</code> callbacks to make it easier to
differentiate between <em>visible</em> items and items rendered due to
overscan settings. These methods will now receive two params– the first
for <em>visible</em> rows and the second for <em>all</em> rows
(including overscan), e.g.:</p>
<pre lang="ts"><code>function onRowsRendered(
  visibleRows: {
    startIndex: number;
    stopIndex: number;
  },
  allRows: {
    startIndex: number;
    stopIndex: number;
  }
): void {
  // ...
}
<p>function onCellsRendered(<br />
visibleCells: {<br />
columnStartIndex: number;<br />
columnStopIndex: number;<br />
rowStartIndex: number;<br />
rowStopIndex: number;<br />
&lt;/tr&gt;&lt;/table&gt;<br />
</code></pre></p>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/bvaughn/react-window/blob/master/CHANGELOG.md">react-window's
changelog</a>.</em></p>
<blockquote>
<h2>2.1.0</h2>
<p>Improved ARIA support:</p>
<ul>
<li>Add better default ARIA attributes for outer
<code>HTMLDivElement</code></li>
<li>Add optional <code>ariaAttributes</code> prop to row and cell
renderers to simplify better ARIA attributes for user-rendered
cells</li>
<li>Remove intermediate <code>HTMLDivElement</code> from
<code>List</code> and <code>Grid</code>
<ul>
<li>This may enable more/better custom CSS styling</li>
<li>This may also enable adding an optional <code>children</code> prop
to <code>List</code> and <code>Grid</code> for e.g.
overlays/tooltips</li>
</ul>
</li>
<li>Add optional <code>tagName</code> prop; defaults to
<code>&quot;div&quot;</code> but can be changed to e.g.
<code>&quot;ul&quot;</code></li>
</ul>
<pre lang="tsx"><code>// Example of how to use new `ariaAttributes` prop
function RowComponent({
  ariaAttributes,
  index,
  style,
  ...rest
}: RowComponentProps&lt;object&gt;) {
  return (
    &lt;div style={style} {...ariaAttributes}&gt;
      ...
    &lt;/div&gt;
  );
}
</code></pre>
<p>Added optional <code>children</code> prop to better support edge
cases like sticky rows.</p>
<p>Minor changes to <code>onRowsRendered</code> and
<code>onCellsRendered</code> callbacks to make it easier to
differentiate between <em>visible</em> items and items rendered due to
overscan settings. These methods will now receive two params– the first
for <em>visible</em> rows and the second for <em>all</em> rows
(including overscan), e.g.:</p>
<pre lang="ts"><code>function onRowsRendered(
  visibleRows: {
    startIndex: number;
    stopIndex: number;
  },
  allRows: {
    startIndex: number;
    stopIndex: number;
  }
): void {
  // ...
}
<p>function onCellsRendered(<br />
visibleCells: {<br />
columnStartIndex: number;<br />
columnStopIndex: number;<br />
rowStartIndex: number;<br />
&lt;/tr&gt;&lt;/table&gt;<br />
</code></pre></p>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1b6840ba35"><code>1b6840b</code></a>
Merge pull request <a
href="https://redirect.github.com/bvaughn/react-window/issues/836">#836</a>
from bvaughn/ARIA-roles</li>
<li><a
href="35f651b615"><code>35f651b</code></a>
Revert accidental change to docs example</li>
<li><a
href="8bce7f555b"><code>8bce7f5</code></a>
onRowsRendered/onCellsRendered separate visible and overscan items</li>
<li><a
href="9f1e8f2f0a"><code>9f1e8f2</code></a>
Support custom tagName for outer element and (optional) children</li>
<li><a
href="7f07ac33cb"><code>7f07ac3</code></a>
Improve ARIA attributes</li>
<li><a
href="7234ec3c09"><code>7234ec3</code></a>
Reduced network waterfalls between routes</li>
<li><a
href="5c431a294f"><code>5c431a2</code></a>
Stronger typing for doc website routes</li>
<li><a
href="c9349a4b7b"><code>c9349a4</code></a>
2.0.1 -&gt; 2.0.2</li>
<li><a
href="6adc6c04a1"><code>6adc6c0</code></a>
Merge pull request <a
href="https://redirect.github.com/bvaughn/react-window/issues/832">#832</a>
from bvaughn/issues/831</li>
<li><a
href="bd562c5734"><code>bd562c5</code></a>
Add tests</li>
<li>Additional commits viewable in <a
href="https://github.com/bvaughn/react-window/compare/1.8.11...2.1.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `@types/react-window` from 1.8.8 to 2.0.0
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-window">compare
view</a></li>
</ul>
</details>
<br />


You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-09 06:42:47 +00:00
dependabot[bot]
6cbfbdd013 chore(libs/deps-dev): bump the development-dependencies group across 1 directory with 4 updates (#11349)
Bumps the development-dependencies group with 4 updates in the
/autogpt_platform/autogpt_libs directory:
[pyright](https://github.com/RobertCraigie/pyright-python),
[pytest-asyncio](https://github.com/pytest-dev/pytest-asyncio),
[pytest-mock](https://github.com/pytest-dev/pytest-mock) and
[ruff](https://github.com/astral-sh/ruff).

Updates `pyright` from 1.1.404 to 1.1.407
<details>
<summary>Commits</summary>
<ul>
<li><a
href="53e8efb463"><code>53e8efb</code></a>
Pyright NPM Package update to 1.1.407 (<a
href="https://redirect.github.com/RobertCraigie/pyright-python/issues/356">#356</a>)</li>
<li><a
href="1d515b7129"><code>1d515b7</code></a>
Pyright NPM Package update to 1.1.406 (<a
href="https://redirect.github.com/RobertCraigie/pyright-python/issues/355">#355</a>)</li>
<li><a
href="e211ec8df8"><code>e211ec8</code></a>
Pyright NPM Package update to 1.1.405 (<a
href="https://redirect.github.com/RobertCraigie/pyright-python/issues/353">#353</a>)</li>
<li>See full diff in <a
href="https://github.com/RobertCraigie/pyright-python/compare/v1.1.404...v1.1.407">compare
view</a></li>
</ul>
</details>
<br />

Updates `pytest-asyncio` from 1.1.0 to 1.3.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pytest-dev/pytest-asyncio/releases">pytest-asyncio's
releases</a>.</em></p>
<blockquote>
<h2>pytest-asyncio 1.3.0</h2>
<h1><a
href="https://github.com/pytest-dev/pytest-asyncio/tree/1.3.0">1.3.0</a>
- 2025-11-10</h1>
<h2>Removed</h2>
<ul>
<li>Support for Python 3.9 (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/1278">#1278</a>)</li>
</ul>
<h2>Added</h2>
<ul>
<li>Support for pytest 9 (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/1279">#1279</a>)</li>
</ul>
<h2>Notes for Downstream Packagers</h2>
<ul>
<li>Tested Python versions include free threaded Python 3.14t (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/1274">#1274</a>)</li>
<li>Tests are run in the same pytest process, instead of spawning a
subprocess with <code>pytest.Pytester.runpytest_subprocess</code>. This
prevents the test suite from accidentally using a system installation of
pytest-asyncio, which could result in test errors. (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/1275">#1275</a>)</li>
</ul>
<h2>pytest-asyncio 1.2.0</h2>
<h1><a
href="https://github.com/pytest-dev/pytest-asyncio/tree/1.2.0">1.2.0</a>
- 2025-09-12</h1>
<h2>Added</h2>
<ul>
<li><code>--asyncio-debug</code> CLI option and
<code>asyncio_debug</code> configuration option to enable asyncio debug
mode for the default event loop. (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/980">#980</a>)</li>
<li>A <code>pytest.UsageError</code> for invalid configuration values of
<code>asyncio_default_fixture_loop_scope</code> and
<code>asyncio_default_test_loop_scope</code>. (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/1189">#1189</a>)</li>
<li>Compatibility with the Pyright type checker (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/731">#731</a>)</li>
</ul>
<h2>Fixed</h2>
<ul>
<li><code>RuntimeError: There is no current event loop in thread
'MainThread'</code> when any test unsets the event loop (such as when
using <code>asyncio.run</code> and <code>asyncio.Runner</code>). (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/1177">#1177</a>)</li>
<li>Deprecation warning when decorating an asynchronous fixture with
<code>@pytest.fixture</code> in [strict]{.title-ref} mode. The warning
message now refers to the correct package. (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/1198">#1198</a>)</li>
</ul>
<h2>Notes for Downstream Packagers</h2>
<ul>
<li>Bump the minimum required version of tox to v4.28. This change is
only relevant if you use the <code>tox.ini</code> file provided by
pytest-asyncio to run tests.</li>
<li>Extend dependency on typing-extensions&gt;=4.12 from Python&lt;3.10
to Python&lt;3.13.</li>
</ul>
<h2>pytest-asyncio 1.1.1</h2>
<h1><a
href="https://github.com/pytest-dev/pytest-asyncio/tree/v1.1.1">v1.1.1</a>
- 2025-09-12</h1>
<h2>Notes for Downstream Packagers</h2>
<p>- Addresses a build problem with setuptoos-scm &gt;= 9 caused by
invalid setuptools-scm configuration in pytest-asyncio. (<a
href="https://redirect.github.com/pytest-dev/pytest-asyncio/issues/1192">#1192</a>)</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2e9695fcf8"><code>2e9695f</code></a>
docs: Compile changelog for v1.3.0</li>
<li><a
href="dd0e9ba3fa"><code>dd0e9ba</code></a>
docs: Reference correct issue in news fragment.</li>
<li><a
href="4c31abe5bf"><code>4c31abe</code></a>
Build(deps): Bump nh3 from 0.3.1 to 0.3.2</li>
<li><a
href="13e94770d7"><code>13e9477</code></a>
Link to migration guides from changelog</li>
<li><a
href="4d2cf3c36f"><code>4d2cf3c</code></a>
tests: handle Python 3.14 DefaultEventLoopPolicy deprecation
warnings</li>
<li><a
href="ee3549b6ef"><code>ee3549b</code></a>
test: Remove obsolete test for the event_loop fixture.</li>
<li><a
href="7a67c82c5a"><code>7a67c82</code></a>
tests: Fix failing test by preventing warning conversion to error.</li>
<li><a
href="a17b689a75"><code>a17b689</code></a>
test: add pytest config to isolated test directories</li>
<li><a
href="18afc9df5a"><code>18afc9d</code></a>
fix(tests): replace runpytest_subprocess with runpytest</li>
<li><a
href="cdc6bd1de7"><code>cdc6bd1</code></a>
Add support for pytest 9 and drop Python 3.9 support</li>
<li>Additional commits viewable in <a
href="https://github.com/pytest-dev/pytest-asyncio/compare/v1.1.0...v1.3.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `pytest-mock` from 3.14.1 to 3.15.1
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pytest-dev/pytest-mock/releases">pytest-mock's
releases</a>.</em></p>
<blockquote>
<h2>v3.15.1</h2>
<p><em>2025-09-16</em></p>
<ul>
<li><a
href="https://redirect.github.com/pytest-dev/pytest-mock/issues/529">#529</a>:
Fixed <code>itertools._tee object has no attribute error</code> -- now
<code>duplicate_iterators=True</code> must be passed to
<code>mocker.spy</code> to duplicate iterators.</li>
</ul>
<h2>v3.15.0</h2>
<p><em>2025-09-04</em></p>
<ul>
<li>Python 3.8 (EOL) is no longer supported.</li>
<li><a
href="https://redirect.github.com/pytest-dev/pytest-mock/pull/524">#524</a>:
Added <code>spy_return_iter</code> to <code>mocker.spy</code>, which
contains a duplicate of the return value of the spied method if it is an
<code>Iterator</code>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pytest-dev/pytest-mock/blob/main/CHANGELOG.rst">pytest-mock's
changelog</a>.</em></p>
<blockquote>
<h2>3.15.1</h2>
<p><em>2025-09-16</em></p>
<ul>
<li><code>[#529](https://github.com/pytest-dev/pytest-mock/issues/529)
&lt;https://github.com/pytest-dev/pytest-mock/issues/529&gt;</code>_:
Fixed <code>itertools._tee object has no attribute error</code> -- now
<code>duplicate_iterators=True</code> must be passed to
<code>mocker.spy</code> to duplicate iterators.</li>
</ul>
<h2>3.15.0</h2>
<p><em>2025-09-04</em></p>
<ul>
<li>Python 3.8 (EOL) is no longer supported.</li>
<li><code>[#524](https://github.com/pytest-dev/pytest-mock/issues/524)
&lt;https://github.com/pytest-dev/pytest-mock/pull/524&gt;</code>_:
Added <code>spy_return_iter</code> to <code>mocker.spy</code>, which
contains a duplicate of the return value of the spied method if it is an
<code>Iterator</code>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="e1b5c62a38"><code>e1b5c62</code></a>
Release 3.15.1</li>
<li><a
href="184eb190d6"><code>184eb19</code></a>
Set <code>spy_return_iter</code> only when explicitly requested (<a
href="https://redirect.github.com/pytest-dev/pytest-mock/issues/537">#537</a>)</li>
<li><a
href="4fa0088a0a"><code>4fa0088</code></a>
[pre-commit.ci] pre-commit autoupdate (<a
href="https://redirect.github.com/pytest-dev/pytest-mock/issues/536">#536</a>)</li>
<li><a
href="f5aff33ce7"><code>f5aff33</code></a>
Fix test failure with pytest 8+ and verbose mode (<a
href="https://redirect.github.com/pytest-dev/pytest-mock/issues/535">#535</a>)</li>
<li><a
href="adc41873c9"><code>adc4187</code></a>
Bump actions/setup-python from 5 to 6 in the github-actions group (<a
href="https://redirect.github.com/pytest-dev/pytest-mock/issues/533">#533</a>)</li>
<li><a
href="95ad570060"><code>95ad570</code></a>
[pre-commit.ci] pre-commit autoupdate (<a
href="https://redirect.github.com/pytest-dev/pytest-mock/issues/532">#532</a>)</li>
<li><a
href="e696bf02c1"><code>e696bf0</code></a>
Fix standalone mock support (<a
href="https://redirect.github.com/pytest-dev/pytest-mock/issues/531">#531</a>)</li>
<li><a
href="5b29b03ce9"><code>5b29b03</code></a>
Fix gen-release-notes script</li>
<li><a
href="7d22ef4e56"><code>7d22ef4</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest-mock/issues/528">#528</a>
from pytest-dev/release-3.15.0</li>
<li><a
href="90b29f89e2"><code>90b29f8</code></a>
Update CHANGELOG for 3.15.0</li>
<li>Additional commits viewable in <a
href="https://github.com/pytest-dev/pytest-mock/compare/v3.14.1...v3.15.1">compare
view</a></li>
</ul>
</details>
<br />

Updates `ruff` from 0.12.11 to 0.14.4
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/ruff/releases">ruff's
releases</a>.</em></p>
<blockquote>
<h2>0.14.4</h2>
<h2>Release Notes</h2>
<p>Released on 2025-11-06.</p>
<h3>Preview features</h3>
<ul>
<li>[formatter] Allow newlines after function headers without docstrings
(<a
href="https://redirect.github.com/astral-sh/ruff/pull/21110">#21110</a>)</li>
<li>[formatter] Avoid extra parentheses for long <code>match</code>
patterns with <code>as</code> captures (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21176">#21176</a>)</li>
<li>[<code>refurb</code>] Expand fix safety for keyword arguments and
<code>Decimal</code>s (<code>FURB164</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21259">#21259</a>)</li>
<li>[<code>refurb</code>] Preserve argument ordering in autofix
(<code>FURB103</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/20790">#20790</a>)</li>
</ul>
<h3>Bug fixes</h3>
<ul>
<li>[server] Fix missing diagnostics for notebooks (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21156">#21156</a>)</li>
<li>[<code>flake8-bugbear</code>] Ignore non-NFKC attribute names in
<code>B009</code> and <code>B010</code> (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21131">#21131</a>)</li>
<li>[<code>refurb</code>] Fix false negative for underscores before sign
in <code>Decimal</code> constructor (<code>FURB157</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21190">#21190</a>)</li>
<li>[<code>ruff</code>] Fix false positives on starred arguments
(<code>RUF057</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21256">#21256</a>)</li>
</ul>
<h3>Rule changes</h3>
<ul>
<li>[<code>airflow</code>] extend deprecated argument
<code>concurrency</code> in <code>airflow..DAG</code>
(<code>AIR301</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21220">#21220</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>Improve <code>extend</code> docs (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21135">#21135</a>)</li>
<li>[<code>flake8-comprehensions</code>] Fix typo in <code>C416</code>
documentation (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21184">#21184</a>)</li>
<li>Revise Ruff setup instructions for Zed editor (<a
href="https://redirect.github.com/astral-sh/ruff/pull/20935">#20935</a>)</li>
</ul>
<h3>Other changes</h3>
<ul>
<li>Make <code>ruff analyze graph</code> work with jupyter notebooks (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21161">#21161</a>)</li>
</ul>
<h3>Contributors</h3>
<ul>
<li><a
href="https://github.com/chirizxc"><code>@​chirizxc</code></a></li>
<li><a href="https://github.com/Lee-W"><code>@​Lee-W</code></a></li>
<li><a
href="https://github.com/musicinmybrain"><code>@​musicinmybrain</code></a></li>
<li><a
href="https://github.com/MichaReiser"><code>@​MichaReiser</code></a></li>
<li><a href="https://github.com/tjkuson"><code>@​tjkuson</code></a></li>
<li><a
href="https://github.com/danparizher"><code>@​danparizher</code></a></li>
<li><a
href="https://github.com/renovate"><code>@​renovate</code></a></li>
<li><a href="https://github.com/ntBre"><code>@​ntBre</code></a></li>
<li><a
href="https://github.com/gauthsvenkat"><code>@​gauthsvenkat</code></a></li>
<li><a
href="https://github.com/LoicRiegel"><code>@​LoicRiegel</code></a></li>
</ul>
<h2>Install ruff 0.14.4</h2>
<h3>Install prebuilt binaries via shell script</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md">ruff's
changelog</a>.</em></p>
<blockquote>
<h2>0.14.4</h2>
<p>Released on 2025-11-06.</p>
<h3>Preview features</h3>
<ul>
<li>[formatter] Allow newlines after function headers without docstrings
(<a
href="https://redirect.github.com/astral-sh/ruff/pull/21110">#21110</a>)</li>
<li>[formatter] Avoid extra parentheses for long <code>match</code>
patterns with <code>as</code> captures (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21176">#21176</a>)</li>
<li>[<code>refurb</code>] Expand fix safety for keyword arguments and
<code>Decimal</code>s (<code>FURB164</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21259">#21259</a>)</li>
<li>[<code>refurb</code>] Preserve argument ordering in autofix
(<code>FURB103</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/20790">#20790</a>)</li>
</ul>
<h3>Bug fixes</h3>
<ul>
<li>[server] Fix missing diagnostics for notebooks (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21156">#21156</a>)</li>
<li>[<code>flake8-bugbear</code>] Ignore non-NFKC attribute names in
<code>B009</code> and <code>B010</code> (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21131">#21131</a>)</li>
<li>[<code>refurb</code>] Fix false negative for underscores before sign
in <code>Decimal</code> constructor (<code>FURB157</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21190">#21190</a>)</li>
<li>[<code>ruff</code>] Fix false positives on starred arguments
(<code>RUF057</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21256">#21256</a>)</li>
</ul>
<h3>Rule changes</h3>
<ul>
<li>[<code>airflow</code>] extend deprecated argument
<code>concurrency</code> in <code>airflow..DAG</code>
(<code>AIR301</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21220">#21220</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>Improve <code>extend</code> docs (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21135">#21135</a>)</li>
<li>[<code>flake8-comprehensions</code>] Fix typo in <code>C416</code>
documentation (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21184">#21184</a>)</li>
<li>Revise Ruff setup instructions for Zed editor (<a
href="https://redirect.github.com/astral-sh/ruff/pull/20935">#20935</a>)</li>
</ul>
<h3>Other changes</h3>
<ul>
<li>Make <code>ruff analyze graph</code> work with jupyter notebooks (<a
href="https://redirect.github.com/astral-sh/ruff/pull/21161">#21161</a>)</li>
</ul>
<h3>Contributors</h3>
<ul>
<li><a
href="https://github.com/chirizxc"><code>@​chirizxc</code></a></li>
<li><a href="https://github.com/Lee-W"><code>@​Lee-W</code></a></li>
<li><a
href="https://github.com/musicinmybrain"><code>@​musicinmybrain</code></a></li>
<li><a
href="https://github.com/MichaReiser"><code>@​MichaReiser</code></a></li>
<li><a href="https://github.com/tjkuson"><code>@​tjkuson</code></a></li>
<li><a
href="https://github.com/danparizher"><code>@​danparizher</code></a></li>
<li><a
href="https://github.com/renovate"><code>@​renovate</code></a></li>
<li><a href="https://github.com/ntBre"><code>@​ntBre</code></a></li>
<li><a
href="https://github.com/gauthsvenkat"><code>@​gauthsvenkat</code></a></li>
<li><a
href="https://github.com/LoicRiegel"><code>@​LoicRiegel</code></a></li>
</ul>
<h2>0.14.3</h2>
<p>Released on 2025-10-30.</p>
<h3>Preview features</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="c7ff9826d6"><code>c7ff982</code></a>
Bump 0.14.4 (<a
href="https://redirect.github.com/astral-sh/ruff/issues/21306">#21306</a>)</li>
<li><a
href="35640dd853"><code>35640dd</code></a>
Fix main by using <code>infer_expression</code> (<a
href="https://redirect.github.com/astral-sh/ruff/issues/21299">#21299</a>)</li>
<li><a
href="cb2e277482"><code>cb2e277</code></a>
[ty] Understand legacy and PEP 695 <code>ParamSpec</code> (<a
href="https://redirect.github.com/astral-sh/ruff/issues/21139">#21139</a>)</li>
<li><a
href="132d10fb6f"><code>132d10f</code></a>
[ty] Discover site-packages from the environment that ty is installed in
(<a
href="https://redirect.github.com/astral-sh/ruff/issues/21">#21</a>...</li>
<li><a
href="f189aad6d2"><code>f189aad</code></a>
[ty] Make special cases for <code>UnionType</code> slightly narrower (<a
href="https://redirect.github.com/astral-sh/ruff/issues/21276">#21276</a>)</li>
<li><a
href="5517c9943a"><code>5517c99</code></a>
Require ignore 0.4.24 in <code>Cargo.toml</code> (<a
href="https://redirect.github.com/astral-sh/ruff/issues/21292">#21292</a>)</li>
<li><a
href="b5ff96595d"><code>b5ff965</code></a>
[ty] Favour imported symbols over builtin symbols (<a
href="https://redirect.github.com/astral-sh/ruff/issues/21285">#21285</a>)</li>
<li><a
href="c6573b16ac"><code>c6573b1</code></a>
docs: revise Ruff setup instructions for Zed editor (<a
href="https://redirect.github.com/astral-sh/ruff/issues/20935">#20935</a>)</li>
<li><a
href="76127e5fb5"><code>76127e5</code></a>
[ty] Update salsa (<a
href="https://redirect.github.com/astral-sh/ruff/issues/21281">#21281</a>)</li>
<li><a
href="cddc0fedc2"><code>cddc0fe</code></a>
[syntax-error]: no binding for nonlocal PLE0117 as a semantic syntax
error (...</li>
<li>Additional commits viewable in <a
href="https://github.com/astral-sh/ruff/compare/0.12.11...0.14.4">compare
view</a></li>
</ul>
</details>
<br />


You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-09 04:54:05 +00:00
dependabot[bot]
0c6fa60436 chore(deps): Bump actions/github-script from 7 to 8 (#10870)
Bumps [actions/github-script](https://github.com/actions/github-script)
from 7 to 8.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/github-script/releases">actions/github-script's
releases</a>.</em></p>
<blockquote>
<h2>v8.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update Node.js version support to 24.x by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/637">actions/github-script#637</a></li>
<li>README for updating actions/github-script from v7 to v8 by <a
href="https://github.com/sneha-krip"><code>@​sneha-krip</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/653">actions/github-script#653</a></li>
</ul>
<h2>⚠️ Minimum Compatible Runner Version</h2>
<p><strong>v2.327.1</strong><br />
<a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></p>
<p>Make sure your runner is updated to this version or newer to use this
release.</p>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/github-script/pull/637">actions/github-script#637</a></li>
<li><a
href="https://github.com/sneha-krip"><code>@​sneha-krip</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/github-script/pull/653">actions/github-script#653</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/github-script/compare/v7.1.0...v8.0.0">https://github.com/actions/github-script/compare/v7.1.0...v8.0.0</a></p>
<h2>v7.1.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Upgrade husky to v9 by <a
href="https://github.com/benelan"><code>@​benelan</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/482">actions/github-script#482</a></li>
<li>Add workflow file for publishing releases to immutable action
package by <a
href="https://github.com/Jcambass"><code>@​Jcambass</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/485">actions/github-script#485</a></li>
<li>Upgrade IA Publish by <a
href="https://github.com/Jcambass"><code>@​Jcambass</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/486">actions/github-script#486</a></li>
<li>Fix workflow status badges by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/497">actions/github-script#497</a></li>
<li>Update usage of <code>actions/upload-artifact</code> by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/512">actions/github-script#512</a></li>
<li>Clear up package name confusion by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/514">actions/github-script#514</a></li>
<li>Update dependencies with <code>npm audit fix</code> by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/515">actions/github-script#515</a></li>
<li>Specify that the used script is JavaScript by <a
href="https://github.com/timotk"><code>@​timotk</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/478">actions/github-script#478</a></li>
<li>chore: Add Dependabot for NPM and Actions by <a
href="https://github.com/nschonni"><code>@​nschonni</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/472">actions/github-script#472</a></li>
<li>Define <code>permissions</code> in workflows and update actions by
<a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in
<a
href="https://redirect.github.com/actions/github-script/pull/531">actions/github-script#531</a></li>
<li>chore: Add Dependabot for .github/actions/install-dependencies by <a
href="https://github.com/nschonni"><code>@​nschonni</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/532">actions/github-script#532</a></li>
<li>chore: Remove .vscode settings by <a
href="https://github.com/nschonni"><code>@​nschonni</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/533">actions/github-script#533</a></li>
<li>ci: Use github/setup-licensed by <a
href="https://github.com/nschonni"><code>@​nschonni</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/473">actions/github-script#473</a></li>
<li>make octokit instance available as octokit on top of github, to make
it easier to seamlessly copy examples from GitHub rest api or octokit
documentations by <a
href="https://github.com/iamstarkov"><code>@​iamstarkov</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/508">actions/github-script#508</a></li>
<li>Remove <code>octokit</code> README updates for v7 by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/557">actions/github-script#557</a></li>
<li>docs: add &quot;exec&quot; usage examples by <a
href="https://github.com/neilime"><code>@​neilime</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/546">actions/github-script#546</a></li>
<li>Bump ruby/setup-ruby from 1.213.0 to 1.222.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/github-script/pull/563">actions/github-script#563</a></li>
<li>Bump ruby/setup-ruby from 1.222.0 to 1.229.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/github-script/pull/575">actions/github-script#575</a></li>
<li>Clearly document passing inputs to the <code>script</code> by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/603">actions/github-script#603</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/github-script/pull/610">actions/github-script#610</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/benelan"><code>@​benelan</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/github-script/pull/482">actions/github-script#482</a></li>
<li><a href="https://github.com/Jcambass"><code>@​Jcambass</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/github-script/pull/485">actions/github-script#485</a></li>
<li><a href="https://github.com/timotk"><code>@​timotk</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/github-script/pull/478">actions/github-script#478</a></li>
<li><a
href="https://github.com/iamstarkov"><code>@​iamstarkov</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/github-script/pull/508">actions/github-script#508</a></li>
<li><a href="https://github.com/neilime"><code>@​neilime</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/github-script/pull/546">actions/github-script#546</a></li>
<li><a href="https://github.com/nebuk89"><code>@​nebuk89</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/github-script/pull/610">actions/github-script#610</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/github-script/compare/v7...v7.1.0">https://github.com/actions/github-script/compare/v7...v7.1.0</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ed597411d8"><code>ed59741</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/github-script/issues/653">#653</a>
from actions/sneha-krip/readme-for-v8</li>
<li><a
href="2dc352e4ba"><code>2dc352e</code></a>
Bold minimum Actions Runner version in README</li>
<li><a
href="01e118c8d0"><code>01e118c</code></a>
Update README for Node 24 runtime requirements</li>
<li><a
href="8b222ac82e"><code>8b222ac</code></a>
Apply suggestion from <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a></li>
<li><a
href="adc0eeac99"><code>adc0eea</code></a>
README for updating actions/github-script from v7 to v8</li>
<li><a
href="20fe497b3f"><code>20fe497</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/github-script/issues/637">#637</a>
from actions/node24</li>
<li><a
href="e7b7f222b1"><code>e7b7f22</code></a>
update licenses</li>
<li><a
href="2c81ba05f3"><code>2c81ba0</code></a>
Update Node.js version support to 24.x</li>
<li>See full diff in <a
href="https://github.com/actions/github-script/compare/v7...v8">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/github-script&package-manager=github_actions&previous-version=7&new-version=8)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Update GitHub Actions workflows to use actions/github-script v8.
> 
> - **CI Workflows**:
>   - Update `actions/github-script` from `v7` to `v8` in:
>     - `.github/workflows/claude-ci-failure-auto-fix.yml`
>     - `.github/workflows/platform-dev-deploy-event-dispatcher.yml`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
cfdccf966b. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-02-09 04:27:07 +00:00
dependabot[bot]
b04e916c23 chore(backend/deps-dev): bump the development-dependencies group across 1 directory with 3 updates (#12005)
Bumps the development-dependencies group with 3 updates in the
/autogpt_platform/backend directory:
[poethepoet](https://github.com/nat-n/poethepoet),
[pytest-watcher](https://github.com/olzhasar/pytest-watcher) and
[ruff](https://github.com/astral-sh/ruff).

Updates `poethepoet` from 0.37.0 to 0.40.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/nat-n/poethepoet/releases">poethepoet's
releases</a>.</em></p>
<blockquote>
<h2>0.40.0</h2>
<h2>Enhancements</h2>
<ul>
<li>Allow optional envfiles without warnings by <a
href="https://github.com/cnaples79"><code>@​cnaples79</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/337">nat-n/poethepoet#337</a></li>
<li>Add support for the <code>capture_output</code> option in ref tasks
by <a href="https://github.com/kzrnm"><code>@​kzrnm</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/343">nat-n/poethepoet#343</a></li>
<li>Set uv to quiet mode during shell completion to avoid console spam
by <a href="https://github.com/nat-n"><code>@​nat-n</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/338">nat-n/poethepoet#338</a></li>
<li>Support <code>ignore_fail</code> on execution task types and ref
tasks by <a href="https://github.com/nat-n"><code>@​nat-n</code></a> in
<a
href="https://redirect.github.com/nat-n/poethepoet/pull/347">nat-n/poethepoet#347</a></li>
<li>Add choices option to constrain named arguments by <a
href="https://github.com/nat-n"><code>@​nat-n</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/348">nat-n/poethepoet#348</a></li>
</ul>
<h2>Fixes</h2>
<ul>
<li>Handle SIGHUP and SIGBREAK signals to stop tasks by <a
href="https://github.com/nat-n"><code>@​nat-n</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/344">nat-n/poethepoet#344</a></li>
<li>Accept string for type name in global executor option by <a
href="https://github.com/kzrnm"><code>@​kzrnm</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/340">nat-n/poethepoet#340</a></li>
</ul>
<h2>Code improvements</h2>
<ul>
<li>Modernize type annotations by <a
href="https://github.com/nat-n"><code>@​nat-n</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/339">nat-n/poethepoet#339</a></li>
<li>Ensure test virtual environments are always cleaned up by <a
href="https://github.com/kzrnm"><code>@​kzrnm</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/346">nat-n/poethepoet#346</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/nat-n/poethepoet/compare/v0.39.0...v0.40.0">https://github.com/nat-n/poethepoet/compare/v0.39.0...v0.40.0</a></p>
<h2>0.39.0</h2>
<h2>Enhancements</h2>
<ul>
<li>Add support for uv executor options by <a
href="https://github.com/rochacbruno"><code>@​rochacbruno</code></a> and
<a href="https://github.com/nat-n"><code>@​nat-n</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/327">nat-n/poethepoet#327</a>
<ul>
<li>feat: add <a
href="https://poethepoet.natn.io/global_options.html#uv-executor">various
options to the uv executor</a> to be passed to the uv run command</li>
<li>feat: allow task executor to be configure with just the type as a
string</li>
<li>feat executor options to be set at runtime via the new
--executor-opt cli global option</li>
<li>feat: allow inheritance of compatible executor options from global
to task to runtime</li>
<li>refactor: extend PoeOptions to support annotating config fields with
a config_name to parse, separate from the attribute name</li>
<li>refactor: some micro-optimizations to PoeOptions and
AnnotationType</li>
<li>doc: Add <a
href="https://poethepoet.natn.io/guides/tox_replacement_guide.html">guide
for replacing tox with poe + uv</a></li>
<li>doc: tidy up executor docs</li>
<li>doc: fix typo in doc for expr task</li>
<li>test: improve test coverage of PoeOptions</li>
<li>test: disable some test cases on windows that are too flaky</li>
</ul>
</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/rochacbruno"><code>@​rochacbruno</code></a>
made their first contribution in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/327">nat-n/poethepoet#327</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/nat-n/poethepoet/compare/v0.38.0...v0.39.0">https://github.com/nat-n/poethepoet/compare/v0.38.0...v0.39.0</a></p>
<h2>0.38.0</h2>
<h2>Enhancements</h2>
<ul>
<li>feat: Add parallel task type by <a
href="https://github.com/nat-n"><code>@​nat-n</code></a> in <a
href="https://redirect.github.com/nat-n/poethepoet/pull/323">nat-n/poethepoet#323</a></li>
</ul>
<h2>Breaking changes</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0a7247d8f7"><code>0a7247d</code></a>
Bump version to 0.40.0</li>
<li><a
href="312e74a5be"><code>312e74a</code></a>
feat: Add choices option to constrain named arguments (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/348">#348</a>)</li>
<li><a
href="5e0b3e5590"><code>5e0b3e5</code></a>
feat: support ignore_fail on execution task types and ref tasks (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/347">#347</a>)</li>
<li><a
href="a3c97e1e94"><code>a3c97e1</code></a>
test: ensure the test virtual environment is always removed (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/346">#346</a>)</li>
<li><a
href="bc04e2fe18"><code>bc04e2f</code></a>
feat: support <code>capture_output</code> on ref tasks (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/343">#343</a>)</li>
<li><a
href="f7b82ef954"><code>f7b82ef</code></a>
fix: global executor option (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/340">#340</a>)</li>
<li><a
href="8e7b1166a0"><code>8e7b116</code></a>
fix: handle SIGHUP and SIGBREAK signals to stop tasks (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/344">#344</a>)</li>
<li><a
href="8e51f2b79f"><code>8e51f2b</code></a>
refactor: modernize type annotations (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/339">#339</a>)</li>
<li><a
href="72a9225dac"><code>72a9225</code></a>
fix: set uv to quiet during shell completion (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/338">#338</a>)</li>
<li><a
href="c6c7306276"><code>c6c7306</code></a>
feat: allow optional envfiles without warnings (<a
href="https://redirect.github.com/nat-n/poethepoet/issues/337">#337</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/nat-n/poethepoet/compare/v0.37.0...v0.40.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `pytest-watcher` from 0.4.3 to 0.6.3
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/olzhasar/pytest-watcher/releases">pytest-watcher's
releases</a>.</em></p>
<blockquote>
<h2>v0.6.3</h2>
<h3>Features</h3>
<ul>
<li>Add debug mode activated with <code>PTW_DEBUG</code> environment
variable and improve log messages.</li>
</ul>
<h3>Bugfixes</h3>
<ul>
<li>Fix terminal flushing after menu and header prints.</li>
<li>Use monotonic clock for trigger detection to avoid misbehavior on
clock changes.</li>
</ul>
<h2>v0.6.2</h2>
<h3>Bugfixes</h3>
<ul>
<li>Allow specifying blank patterns via CLI</li>
<li>Fix duplicate command entries in menu</li>
</ul>
<h2>v0.6.1</h2>
<h3>Bugfixes</h3>
<ul>
<li>Trigger tests in interactive mode for carriage return character</li>
</ul>
<h3>Improved Documentation</h3>
<ul>
<li>Add contributing guide</li>
</ul>
<h3>Misc</h3>
<ul>
<li>Integrate <a
href="https://towncrier.readthedocs.io/en/stable/index.html">towncrier</a>
into the development process</li>
</ul>
<h2>v0.6.0</h2>
<h2>Features</h2>
<ul>
<li>Add <code>notify-on-failure</code> flag (and config option) to emit
BEL symbol on test suite failure.</li>
</ul>
<h2>Infrastructure</h2>
<ul>
<li>Migrate from poetry to uv.</li>
<li>Remove tox.</li>
</ul>
<h2>v0.5.0</h2>
<h2>Fixes</h2>
<ul>
<li>Merge arguments passed to the runner from config and CLI instead of
overriding.</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Drop support for Python 3.7 &amp; 3.8</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/olzhasar/pytest-watcher/blob/master/CHANGELOG.md">pytest-watcher's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/olzhasar/pytest-watcher/releases/tag/0.6.3">0.6.3</a>
- 2026-01-11</h2>
<h3>Features</h3>
<ul>
<li>Add debug mode activated with <code>PTW_DEBUG</code> environment
variable and improve log messages.</li>
</ul>
<h3>Bugfixes</h3>
<ul>
<li>Fix terminal flushing after menu and header prints.</li>
<li>Use monotonic clock for trigger detection to avoid misbehavior on
clock changes.</li>
</ul>
<h2><a
href="https://github.com/olzhasar/pytest-watcher/releases/tag/0.6.2">0.6.2</a>
- 2025-12-28</h2>
<h3>Bugfixes</h3>
<ul>
<li>Allow specifying blank patterns via CLI</li>
<li>Fix duplicate command entries in menu</li>
</ul>
<h2><a
href="https://github.com/olzhasar/pytest-watcher/releases/tag/0.6.1">0.6.1</a>
- 2025-12-26</h2>
<h3>Bugfixes</h3>
<ul>
<li>Trigger tests in interactive mode for carriage return character</li>
</ul>
<h3>Improved Documentation</h3>
<ul>
<li>Add contributing guide</li>
</ul>
<h3>Misc</h3>
<ul>
<li>Integrate <a
href="https://towncrier.readthedocs.io/en/stable/index.html">towncrier</a>
into the development process</li>
</ul>
<h2><a
href="https://github.com/olzhasar/pytest-watcher/releases/tag/0.6.0">0.6.0</a>
- 2025-12-22</h2>
<h3>Features</h3>
<ul>
<li>Add notify-on-failure flag (and config option) to emit BEL symbol on
test suite failure.</li>
</ul>
<h3>Infrastructure</h3>
<ul>
<li>Migrate from <code>poetry</code> to <code>uv</code>.</li>
<li>Remove <code>tox</code>.</li>
</ul>
<h2><a
href="https://github.com/olzhasar/pytest-watcher/releases/tag/0.5.0">0.5.0</a>
- 2025-12-21</h2>
<h3>Fixes</h3>
<ul>
<li>Merge arguments passed to the runner from config and CLI instead of
overriding.</li>
</ul>
<h3>Changes</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="c52925b613"><code>c52925b</code></a>
release v0.6.3</li>
<li><a
href="23d49893f7"><code>23d4989</code></a>
Add debug mode. Improve log messages</li>
<li><a
href="e3dffa1cb3"><code>e3dffa1</code></a>
Fix terminal flushing after menu and header prints</li>
<li><a
href="0eeaf6080e"><code>0eeaf60</code></a>
Use monotonic clock for trigger detection</li>
<li><a
href="5ed9d0e262"><code>5ed9d0e</code></a>
Update CHANGELOG. Fix changelog_reader action</li>
<li><a
href="756f005f5d"><code>756f005</code></a>
release v0.6.2</li>
<li><a
href="902aa9e07b"><code>902aa9e</code></a>
Merge pull request <a
href="https://redirect.github.com/olzhasar/pytest-watcher/issues/51">#51</a>
from olzhasar/fix-duplicate-menu</li>
<li><a
href="e6b20d35b9"><code>e6b20d3</code></a>
Allow specifying empty patterns via CLI</li>
<li><a
href="2d522dabf9"><code>2d522da</code></a>
Fix duplicate menu entries</li>
<li><a
href="171e6f1282"><code>171e6f1</code></a>
Fix towncrier CHANGELOG versioning</li>
<li>Additional commits viewable in <a
href="https://github.com/olzhasar/pytest-watcher/compare/v0.4.3...v0.6.3">compare
view</a></li>
</ul>
</details>
<br />

Updates `ruff` from 0.14.14 to 0.15.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/ruff/releases">ruff's
releases</a>.</em></p>
<blockquote>
<h2>0.15.0</h2>
<h2>Release Notes</h2>
<p>Released on 2026-02-03.</p>
<p>Check out the <a href="https://astral.sh/blog/ruff-v0.15.0">blog
post</a> for a migration guide and overview of the changes!</p>
<h3>Breaking changes</h3>
<ul>
<li>
<p>Ruff now formats your code according to the 2026 style guide. See the
formatter section below or in the blog post for a detailed list of
changes.</p>
</li>
<li>
<p>The linter now supports block suppression comments. For example, to
suppress <code>N803</code> for all parameters in this function:</p>
<pre lang="python"><code># ruff: disable[N803]
def foo(
    legacyArg1,
    legacyArg2,
    legacyArg3,
    legacyArg4,
): ...
# ruff: enable[N803]
</code></pre>
<p>See the <a
href="https://docs.astral.sh/ruff/linter/#block-level">documentation</a>
for more details.</p>
</li>
<li>
<p>The <code>ruff:alpine</code> Docker image is now based on Alpine 3.23
(up from 3.21).</p>
</li>
<li>
<p>The <code>ruff:debian</code> and <code>ruff:debian-slim</code> Docker
images are now based on Debian 13 &quot;Trixie&quot; instead of Debian
12 &quot;Bookworm.&quot;</p>
</li>
<li>
<p>Binaries for the <code>ppc64</code> (64-bit big-endian PowerPC)
architecture are no longer included in our releases. It should still be
possible to build Ruff manually for this platform, if needed.</p>
</li>
<li>
<p>Ruff now resolves all <code>extend</code>ed configuration files
before falling back on a default Python version.</p>
</li>
</ul>
<h3>Stabilization</h3>
<p>The following rules have been stabilized and are no longer in
preview:</p>
<ul>
<li><a
href="https://docs.astral.sh/ruff/rules/blocking-http-call-httpx-in-async-function"><code>blocking-http-call-httpx-in-async-function</code></a>
(<code>ASYNC212</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/blocking-path-method-in-async-function"><code>blocking-path-method-in-async-function</code></a>
(<code>ASYNC240</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/blocking-input-in-async-function"><code>blocking-input-in-async-function</code></a>
(<code>ASYNC250</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/map-without-explicit-strict"><code>map-without-explicit-strict</code></a>
(<code>B912</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/if-exp-instead-of-or-operator"><code>if-exp-instead-of-or-operator</code></a>
(<code>FURB110</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/single-item-membership-test"><code>single-item-membership-test</code></a>
(<code>FURB171</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/missing-maxsplit-arg"><code>missing-maxsplit-arg</code></a>
(<code>PLC0207</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/unnecessary-lambda"><code>unnecessary-lambda</code></a>
(<code>PLW0108</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/unnecessary-empty-iterable-within-deque-call"><code>unnecessary-empty-iterable-within-deque-call</code></a>
(<code>RUF037</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/in-empty-collection"><code>in-empty-collection</code></a>
(<code>RUF060</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/legacy-form-pytest-raises"><code>legacy-form-pytest-raises</code></a>
(<code>RUF061</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/non-octal-permissions"><code>non-octal-permissions</code></a>
(<code>RUF064</code>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md">ruff's
changelog</a>.</em></p>
<blockquote>
<h2>0.15.0</h2>
<p>Released on 2026-02-03.</p>
<p>Check out the <a href="https://astral.sh/blog/ruff-v0.15.0">blog
post</a> for a migration
guide and overview of the changes!</p>
<h3>Breaking changes</h3>
<ul>
<li>
<p>Ruff now formats your code according to the 2026 style guide. See the
formatter section below or in the blog post for a detailed list of
changes.</p>
</li>
<li>
<p>The linter now supports block suppression comments. For example, to
suppress <code>N803</code> for all parameters in this function:</p>
<pre lang="python"><code># ruff: disable[N803]
def foo(
    legacyArg1,
    legacyArg2,
    legacyArg3,
    legacyArg4,
): ...
# ruff: enable[N803]
</code></pre>
<p>See the <a
href="https://docs.astral.sh/ruff/linter/#block-level">documentation</a>
for more details.</p>
</li>
<li>
<p>The <code>ruff:alpine</code> Docker image is now based on Alpine 3.23
(up from 3.21).</p>
</li>
<li>
<p>The <code>ruff:debian</code> and <code>ruff:debian-slim</code> Docker
images are now based on Debian 13 &quot;Trixie&quot; instead of Debian
12 &quot;Bookworm.&quot;</p>
</li>
<li>
<p>Binaries for the <code>ppc64</code> (64-bit big-endian PowerPC)
architecture are no longer included in our releases. It should still be
possible to build Ruff manually for this platform, if needed.</p>
</li>
<li>
<p>Ruff now resolves all <code>extend</code>ed configuration files
before falling back on a default Python version.</p>
</li>
</ul>
<h3>Stabilization</h3>
<p>The following rules have been stabilized and are no longer in
preview:</p>
<ul>
<li><a
href="https://docs.astral.sh/ruff/rules/blocking-http-call-httpx-in-async-function"><code>blocking-http-call-httpx-in-async-function</code></a>
(<code>ASYNC212</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/blocking-path-method-in-async-function"><code>blocking-path-method-in-async-function</code></a>
(<code>ASYNC240</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/blocking-input-in-async-function"><code>blocking-input-in-async-function</code></a>
(<code>ASYNC250</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/map-without-explicit-strict"><code>map-without-explicit-strict</code></a>
(<code>B912</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/if-exp-instead-of-or-operator"><code>if-exp-instead-of-or-operator</code></a>
(<code>FURB110</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/single-item-membership-test"><code>single-item-membership-test</code></a>
(<code>FURB171</code>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ce5f7b6127"><code>ce5f7b6</code></a>
Bump 0.15.0 (<a
href="https://redirect.github.com/astral-sh/ruff/issues/23055">#23055</a>)</li>
<li><a
href="b4e40f539c"><code>b4e40f5</code></a>
[ty] Fix <code>__contains__</code> to respect descriptors (<a
href="https://redirect.github.com/astral-sh/ruff/issues/23056">#23056</a>)</li>
<li><a
href="848cb72dc1"><code>848cb72</code></a>
[ty] Fix narrowing of nonlocal variables with conditional assignments
(<a
href="https://redirect.github.com/astral-sh/ruff/issues/22966">#22966</a>)</li>
<li><a
href="da7f33af22"><code>da7f33a</code></a>
[ty] Add a diagnostic for <code>Final</code> without assignment (<a
href="https://redirect.github.com/astral-sh/ruff/issues/23001">#23001</a>)</li>
<li><a
href="e65f9a6b03"><code>e65f9a6</code></a>
Document markdown formatting feature (<a
href="https://redirect.github.com/astral-sh/ruff/issues/22990">#22990</a>)</li>
<li><a
href="c0c1b985c9"><code>c0c1b98</code></a>
Format markdown code blocks with line-by-line regex parse (<a
href="https://redirect.github.com/astral-sh/ruff/issues/22996">#22996</a>)</li>
<li><a
href="9f8f3e196b"><code>9f8f3e1</code></a>
Allow positional-only params with defaults in method overrides (<a
href="https://redirect.github.com/astral-sh/ruff/issues/23037">#23037</a>)</li>
<li><a
href="ef83810e11"><code>ef83810</code></a>
[ty] ecosystem-analyzer: Support bare git repositories (<a
href="https://redirect.github.com/astral-sh/ruff/issues/23054">#23054</a>)</li>
<li><a
href="54dfee4cb8"><code>54dfee4</code></a>
Customize where the <code>fix_title</code> sub-diagnostic appears (<a
href="https://redirect.github.com/astral-sh/ruff/issues/23044">#23044</a>)</li>
<li><a
href="b53460799b"><code>b534607</code></a>
2026 Ruff Formatter Style (<a
href="https://redirect.github.com/astral-sh/ruff/issues/22735">#22735</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/astral-sh/ruff/compare/0.14.14...0.15.0">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Tindle <nick@ntindle.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-02-09 04:26:58 +00:00
dependabot[bot]
1a32ba7d9a chore(deps): bump urllib3 from 2.5.0 to 2.6.0 in /autogpt_platform/backend (#11607)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/releases">urllib3's
releases</a>.</em></p>
<blockquote>
<h2>2.6.0</h2>
<h2>🚀 urllib3 is fundraising for HTTP/2 support</h2>
<p><a
href="https://sethmlarson.dev/urllib3-is-fundraising-for-http2-support">urllib3
is raising ~$40,000 USD</a> to release HTTP/2 support and ensure
long-term sustainable maintenance of the project after a sharp decline
in financial support. If your company or organization uses Python and
would benefit from HTTP/2 support in Requests, pip, cloud SDKs, and
thousands of other projects <a
href="https://opencollective.com/urllib3">please consider contributing
financially</a> to ensure HTTP/2 support is developed sustainably and
maintained for the long-haul.</p>
<p>Thank you for your support.</p>
<h2>Security</h2>
<ul>
<li>Fixed a security issue where streaming API could improperly handle
highly compressed HTTP content (&quot;decompression bombs&quot;) leading
to excessive resource consumption even when a small amount of data was
requested. Reading small chunks of compressed data is safer and much
more efficient now. (CVE-2025-66471 reported by <a
href="https://github.com/Cycloctane"><code>@​Cycloctane</code></a>, 8.9
High, GHSA-2xpw-w6gg-jr37)</li>
<li>Fixed a security issue where an attacker could compose an HTTP
response with virtually unlimited links in the
<code>Content-Encoding</code> header, potentially leading to a denial of
service (DoS) attack by exhausting system resources during decoding. The
number of allowed chained encodings is now limited to 5. (CVE-2025-66418
reported by <a
href="https://github.com/illia-v"><code>@​illia-v</code></a>, 8.9 High,
GHSA-gm62-xv2j-4w53)</li>
</ul>
<blockquote>
<p>[!IMPORTANT]</p>
<ul>
<li>If urllib3 is not installed with the optional
<code>urllib3[brotli]</code> extra, but your environment contains a
Brotli/brotlicffi/brotlipy package anyway, make sure to upgrade it to at
least Brotli 1.2.0 or brotlicffi 1.2.0.0 to benefit from the security
fixes and avoid warnings. Prefer using <code>urllib3[brotli]</code> to
install a compatible Brotli package automatically.</li>
<li>If you use custom decompressors, please make sure to update them to
respect the changed API of
<code>urllib3.response.ContentDecoder</code>.</li>
</ul>
</blockquote>
<h2>Features</h2>
<ul>
<li>Enabled retrieval, deletion, and membership testing in
<code>HTTPHeaderDict</code> using bytes keys. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3653">#3653</a>)</li>
<li>Added host and port information to string representations of
<code>HTTPConnection</code>. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3666">#3666</a>)</li>
<li>Added support for Python 3.14 free-threading builds explicitly. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3696">#3696</a>)</li>
</ul>
<h2>Removals</h2>
<ul>
<li>Removed the <code>HTTPResponse.getheaders()</code> method in favor
of <code>HTTPResponse.headers</code>. Removed the
<code>HTTPResponse.getheader(name, default)</code> method in favor of
<code>HTTPResponse.headers.get(name, default)</code>. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3622">#3622</a>)</li>
</ul>
<h2>Bugfixes</h2>
<ul>
<li>Fixed redirect handling in <code>urllib3.PoolManager</code> when an
integer is passed for the retries parameter. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3649">#3649</a>)</li>
<li>Fixed <code>HTTPConnectionPool</code> when used in Emscripten with
no explicit port. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3664">#3664</a>)</li>
<li>Fixed handling of <code>SSLKEYLOGFILE</code> with expandable
variables. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3700">#3700</a>)</li>
</ul>
<h2>Misc</h2>
<ul>
<li>Changed the <code>zstd</code> extra to install
<code>backports.zstd</code> instead of <code>zstandard</code> on Python
3.13 and before. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3693">#3693</a>)</li>
<li>Improved the performance of content decoding by optimizing
<code>BytesQueueBuffer</code> class. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3710">#3710</a>)</li>
<li>Allowed building the urllib3 package with newer setuptools-scm v9.x.
(<a
href="https://redirect.github.com/urllib3/urllib3/issues/3652">#3652</a>)</li>
<li>Ensured successful urllib3 builds by setting Hatchling requirement
to ≥ 1.27.0. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3638">#3638</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/blob/main/CHANGES.rst">urllib3's
changelog</a>.</em></p>
<blockquote>
<h1>2.6.0 (2025-12-05)</h1>
<h2>Security</h2>
<ul>
<li>Fixed a security issue where streaming API could improperly handle
highly
compressed HTTP content (&quot;decompression bombs&quot;) leading to
excessive resource
consumption even when a small amount of data was requested. Reading
small
chunks of compressed data is safer and much more efficient now.
(<code>GHSA-2xpw-w6gg-jr37
&lt;https://github.com/urllib3/urllib3/security/advisories/GHSA-2xpw-w6gg-jr37&gt;</code>__)</li>
<li>Fixed a security issue where an attacker could compose an HTTP
response with
virtually unlimited links in the <code>Content-Encoding</code> header,
potentially
leading to a denial of service (DoS) attack by exhausting system
resources
during decoding. The number of allowed chained encodings is now limited
to 5.
(<code>GHSA-gm62-xv2j-4w53
&lt;https://github.com/urllib3/urllib3/security/advisories/GHSA-gm62-xv2j-4w53&gt;</code>__)</li>
</ul>
<p>.. caution::</p>
<ul>
<li>
<p>If urllib3 is not installed with the optional
<code>urllib3[brotli]</code> extra, but
your environment contains a Brotli/brotlicffi/brotlipy package anyway,
make
sure to upgrade it to at least Brotli 1.2.0 or brotlicffi 1.2.0.0 to
benefit from the security fixes and avoid warnings. Prefer using
<code>urllib3[brotli]</code> to install a compatible Brotli package
automatically.</p>
</li>
<li>
<p>If you use custom decompressors, please make sure to update them to
respect the changed API of
<code>urllib3.response.ContentDecoder</code>.</p>
</li>
</ul>
<h2>Features</h2>
<ul>
<li>Enabled retrieval, deletion, and membership testing in
<code>HTTPHeaderDict</code> using bytes keys.
(<code>[#3653](https://github.com/urllib3/urllib3/issues/3653)
&lt;https://github.com/urllib3/urllib3/issues/3653&gt;</code>__)</li>
<li>Added host and port information to string representations of
<code>HTTPConnection</code>.
(<code>[#3666](https://github.com/urllib3/urllib3/issues/3666)
&lt;https://github.com/urllib3/urllib3/issues/3666&gt;</code>__)</li>
<li>Added support for Python 3.14 free-threading builds explicitly.
(<code>[#3696](https://github.com/urllib3/urllib3/issues/3696)
&lt;https://github.com/urllib3/urllib3/issues/3696&gt;</code>__)</li>
</ul>
<h2>Removals</h2>
<ul>
<li>Removed the <code>HTTPResponse.getheaders()</code> method in favor
of <code>HTTPResponse.headers</code>.
Removed the <code>HTTPResponse.getheader(name, default)</code> method in
favor of <code>HTTPResponse.headers.get(name, default)</code>.
(<code>[#3622](https://github.com/urllib3/urllib3/issues/3622)
&lt;https://github.com/urllib3/urllib3/issues/3622&gt;</code>__)</li>
</ul>
<h2>Bugfixes</h2>
<ul>
<li>Fixed redirect handling in <code>urllib3.PoolManager</code> when an
integer is passed
for the retries parameter.
(<code>[#3649](https://github.com/urllib3/urllib3/issues/3649)
&lt;https://github.com/urllib3/urllib3/issues/3649&gt;</code>__)</li>
<li>Fixed <code>HTTPConnectionPool</code> when used in Emscripten with
no explicit port.
(<code>[#3664](https://github.com/urllib3/urllib3/issues/3664)
&lt;https://github.com/urllib3/urllib3/issues/3664&gt;</code>__)</li>
<li>Fixed handling of <code>SSLKEYLOGFILE</code> with expandable
variables.
(<code>[#3700](https://github.com/urllib3/urllib3/issues/3700)
&lt;https://github.com/urllib3/urllib3/issues/3700&gt;</code>__)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="720f484b60"><code>720f484</code></a>
Release 2.6.0</li>
<li><a
href="24d7b67eac"><code>24d7b67</code></a>
Merge commit from fork</li>
<li><a
href="c19571de34"><code>c19571d</code></a>
Merge commit from fork</li>
<li><a
href="816fcf0452"><code>816fcf0</code></a>
Bump actions/setup-python from 6.0.0 to 6.1.0 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3725">#3725</a>)</li>
<li><a
href="18af0a10ef"><code>18af0a1</code></a>
Improve speed of <code>BytesQueueBuffer.get()</code> by using memoryview
(<a
href="https://redirect.github.com/urllib3/urllib3/issues/3711">#3711</a>)</li>
<li><a
href="1f6abac3e6"><code>1f6abac</code></a>
Bump versions of pre-commit hooks (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3716">#3716</a>)</li>
<li><a
href="1c8fbf787b"><code>1c8fbf7</code></a>
Bump actions/checkout from 5.0.0 to 6.0.0 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3722">#3722</a>)</li>
<li><a
href="7784b9eee9"><code>7784b9e</code></a>
Add Python 3.15 to CI (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3717">#3717</a>)</li>
<li><a
href="0241c9e728"><code>0241c9e</code></a>
Updated docs to reflect change in optional zstd dependency from
<code>zstandard</code> t...</li>
<li><a
href="7afcabb648"><code>7afcabb</code></a>
Expand environment variable of SSLKEYLOGFILE (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3705">#3705</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/urllib3/urllib3/compare/2.5.0...2.6.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=urllib3&package-manager=pip&previous-version=2.5.0&new-version=2.6.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/Significant-Gravitas/AutoGPT/network/alerts).

</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-09 03:39:05 +00:00
dependabot[bot]
deccc26f1f chore(deps): bump actions/cache from 4 to 5 (#11665)
Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/releases">actions/cache's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<blockquote>
<p>[!IMPORTANT]
<strong><code>actions/cache@v5</code> runs on the Node.js 24 runtime and
requires a minimum Actions Runner version of
<code>2.327.1</code>.</strong></p>
<p>If you are using self-hosted runners, ensure they are updated before
upgrading.</p>
</blockquote>
<hr />
<h2>What's Changed</h2>
<ul>
<li>Upgrade to use node24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1630">actions/cache#1630</a></li>
<li>Prepare v5.0.0 release by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1684">actions/cache#1684</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v4.3.0...v5.0.0">https://github.com/actions/cache/compare/v4.3.0...v5.0.0</a></p>
<h2>v4.3.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Add note on runner versions by <a
href="https://github.com/GhadimiR"><code>@​GhadimiR</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1642">actions/cache#1642</a></li>
<li>Prepare <code>v4.3.0</code> release by <a
href="https://github.com/Link"><code>@​Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1655">actions/cache#1655</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/GhadimiR"><code>@​GhadimiR</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/cache/pull/1642">actions/cache#1642</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v4...v4.3.0">https://github.com/actions/cache/compare/v4...v4.3.0</a></p>
<h2>v4.2.4</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1620">actions/cache#1620</a></li>
<li>Upgrade <code>@actions/cache</code> to <code>4.0.5</code> and move
<code>@protobuf-ts/plugin</code> to dev depdencies by <a
href="https://github.com/Link"><code>@​Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1634">actions/cache#1634</a></li>
<li>Prepare release <code>4.2.4</code> by <a
href="https://github.com/Link"><code>@​Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1636">actions/cache#1636</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/nebuk89"><code>@​nebuk89</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/cache/pull/1620">actions/cache#1620</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v4...v4.2.4">https://github.com/actions/cache/compare/v4...v4.2.4</a></p>
<h2>v4.2.3</h2>
<h2>What's Changed</h2>
<ul>
<li>Update to use <code>@​actions/cache</code> 4.0.3 package &amp;
prepare for new release by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1577">actions/cache#1577</a>
(SAS tokens for cache entries are now masked in debug logs)</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/cache/pull/1577">actions/cache#1577</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v4.2.2...v4.2.3">https://github.com/actions/cache/compare/v4.2.2...v4.2.3</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/blob/main/RELEASES.md">actions/cache's
changelog</a>.</em></p>
<blockquote>
<h1>Releases</h1>
<h2>Changelog</h2>
<h3>5.0.1</h3>
<ul>
<li>Update <code>@azure/storage-blob</code> to <code>^12.29.1</code> via
<code>@actions/cache@5.0.1</code> <a
href="https://redirect.github.com/actions/cache/pull/1685">#1685</a></li>
</ul>
<h3>5.0.0</h3>
<blockquote>
<p>[!IMPORTANT]
<code>actions/cache@v5</code> runs on the Node.js 24 runtime and
requires a minimum Actions Runner version of <code>2.327.1</code>.
If you are using self-hosted runners, ensure they are updated before
upgrading.</p>
</blockquote>
<h3>4.3.0</h3>
<ul>
<li>Bump <code>@actions/cache</code> to <a
href="https://redirect.github.com/actions/toolkit/pull/2132">v4.1.0</a></li>
</ul>
<h3>4.2.4</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.5</li>
</ul>
<h3>4.2.3</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.3 (obfuscates SAS token in
debug logs for cache entries)</li>
</ul>
<h3>4.2.2</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.2</li>
</ul>
<h3>4.2.1</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.1</li>
</ul>
<h3>4.2.0</h3>
<p>TLDR; The cache backend service has been rewritten from the ground up
for improved performance and reliability. <a
href="https://github.com/actions/cache">actions/cache</a> now integrates
with the new cache service (v2) APIs.</p>
<p>The new service will gradually roll out as of <strong>February 1st,
2025</strong>. The legacy service will also be sunset on the same date.
Changes in these release are <strong>fully backward
compatible</strong>.</p>
<p><strong>We are deprecating some versions of this action</strong>. We
recommend upgrading to version <code>v4</code> or <code>v3</code> as
soon as possible before <strong>February 1st, 2025.</strong> (Upgrade
instructions below).</p>
<p>If you are using pinned SHAs, please use the SHAs of versions
<code>v4.2.0</code> or <code>v3.4.0</code></p>
<p>If you do not upgrade, all workflow runs using any of the deprecated
<a href="https://github.com/actions/cache">actions/cache</a> will
fail.</p>
<p>Upgrading to the recommended versions will not break your
workflows.</p>
<h3>4.1.2</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9255dc7a25"><code>9255dc7</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1686">#1686</a>
from actions/cache-v5.0.1-release</li>
<li><a
href="8ff5423e8b"><code>8ff5423</code></a>
chore: release v5.0.1</li>
<li><a
href="9233019a15"><code>9233019</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1685">#1685</a>
from salmanmkc/node24-storage-blob-fix</li>
<li><a
href="b975f2bb84"><code>b975f2b</code></a>
fix: add peer property to package-lock.json for dependencies</li>
<li><a
href="d0a0e18134"><code>d0a0e18</code></a>
fix: update license files for <code>@​actions/cache</code>,
fast-xml-parser, and strnum</li>
<li><a
href="74de208dcf"><code>74de208</code></a>
fix: update <code>@​actions/cache</code> to ^5.0.1 for Node.js 24
punycode fix</li>
<li><a
href="ac7f1152ea"><code>ac7f115</code></a>
peer</li>
<li><a
href="b0f846b50b"><code>b0f846b</code></a>
fix: update <code>@​actions/cache</code> with storage-blob fix for
Node.js 24 punycode depr...</li>
<li><a
href="a783357455"><code>a783357</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1684">#1684</a>
from actions/prepare-cache-v5-release</li>
<li><a
href="3bb0d78750"><code>3bb0d78</code></a>
docs: highlight v5 runner requirement in releases</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/cache/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/cache&package-manager=github_actions&previous-version=4&new-version=5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-09 03:28:23 +00:00
dependabot[bot]
9e38bd5b78 chore(backend/deps): bump the production-dependencies group across 1 directory with 8 updates (#12014)
Bumps the production-dependencies group with 8 updates in the
/autogpt_platform/backend directory:

| Package | From | To |
| --- | --- | --- |
| [anthropic](https://github.com/anthropics/anthropic-sdk-python) |
`0.59.0` | `0.79.0` |
| [fastapi](https://github.com/fastapi/fastapi) | `0.128.3` | `0.128.5`
|
| [ollama](https://github.com/ollama/ollama-python) | `0.5.4` | `0.6.1`
|
| [prometheus-client](https://github.com/prometheus/client_python) |
`0.22.1` | `0.24.1` |
| [python-multipart](https://github.com/Kludex/python-multipart) |
`0.0.20` | `0.0.22` |
| [supabase](https://github.com/supabase/supabase-py) | `2.27.2` |
`2.27.3` |
| [tenacity](https://github.com/jd/tenacity) | `9.1.3` | `9.1.4` |
| [tiktoken](https://github.com/openai/tiktoken) | `0.9.0` | `0.12.0` |


Updates `anthropic` from 0.59.0 to 0.79.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/anthropics/anthropic-sdk-python/releases">anthropic's
releases</a>.</em></p>
<blockquote>
<h2>v0.79.0</h2>
<h2>0.79.0 (2026-02-07)</h2>
<p>Full Changelog: <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.78.0...v0.79.0">v0.78.0...v0.79.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> enabling fast-mode in claude-opus-4-6 (<a
href="5953ba7b42">5953ba7</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>pass speed parameter through in sync beta count_tokens (<a
href="1dd6119dac">1dd6119</a>)</li>
</ul>
<h2>v0.78.0</h2>
<h2>0.78.0 (2026-02-05)</h2>
<p>Full Changelog: <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.77.1...v0.78.0">v0.77.1...v0.78.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> Release Claude Opus 4.6, adaptive thinking,
and other features (<a
href="3ef1529b45">3ef1529</a>)</li>
</ul>
<h2>v0.77.1</h2>
<h2>0.77.1 (2026-02-03)</h2>
<p>Full Changelog: <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.77.0...v0.77.1">v0.77.0...v0.77.1</a></p>
<h3>Bug Fixes</h3>
<ul>
<li><strong>structured outputs:</strong> send structured output beta
header when format is omitted (<a
href="https://redirect.github.com/anthropics/anthropic-sdk-python/issues/1158">#1158</a>)
(<a
href="258494e2b8">258494e</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li>remove claude-code-review workflow (<a
href="https://redirect.github.com/anthropics/anthropic-sdk-python/issues/1338">#1338</a>)
(<a
href="aec4512305">aec4512</a>)</li>
</ul>
<h2>v0.77.0</h2>
<h2>0.77.0 (2026-01-29)</h2>
<p>Full Changelog: <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.76.0...v0.77.0">v0.76.0...v0.77.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> add support for Structured Outputs in the
Messages API (<a
href="ad5667774a">ad56677</a>)</li>
<li><strong>api:</strong> migrate sending message format in
output_config rather than output_format (<a
href="af405e473f">af405e4</a>)</li>
<li><strong>client:</strong> add custom JSON encoder for extended type
support (<a
href="7780e90bd2">7780e90</a>)</li>
<li>use output_config for structured outputs (<a
href="82d669db65">82d669d</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/anthropics/anthropic-sdk-python/blob/main/CHANGELOG.md">anthropic's
changelog</a>.</em></p>
<blockquote>
<h2>0.79.0 (2026-02-07)</h2>
<p>Full Changelog: <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.78.0...v0.79.0">v0.78.0...v0.79.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> enabling fast-mode in claude-opus-4-6 (<a
href="5953ba7b42">5953ba7</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>pass speed parameter through in sync beta count_tokens (<a
href="1dd6119dac">1dd6119</a>)</li>
</ul>
<h2>0.78.0 (2026-02-05)</h2>
<p>Full Changelog: <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.77.1...v0.78.0">v0.77.1...v0.78.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> Release Claude Opus 4.6, adaptive thinking,
and other features (<a
href="3ef1529b45">3ef1529</a>)</li>
</ul>
<h2>0.77.1 (2026-02-03)</h2>
<p>Full Changelog: <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.77.0...v0.77.1">v0.77.0...v0.77.1</a></p>
<h3>Bug Fixes</h3>
<ul>
<li><strong>structured outputs:</strong> send structured output beta
header when format is omitted (<a
href="https://redirect.github.com/anthropics/anthropic-sdk-python/issues/1158">#1158</a>)
(<a
href="258494e2b8">258494e</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li>remove claude-code-review workflow (<a
href="https://redirect.github.com/anthropics/anthropic-sdk-python/issues/1338">#1338</a>)
(<a
href="aec4512305">aec4512</a>)</li>
</ul>
<h2>0.77.0 (2026-01-29)</h2>
<p>Full Changelog: <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.76.0...v0.77.0">v0.76.0...v0.77.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> add support for Structured Outputs in the
Messages API (<a
href="ad5667774a">ad56677</a>)</li>
<li><strong>api:</strong> migrate sending message format in
output_config rather than output_format (<a
href="af405e473f">af405e4</a>)</li>
<li><strong>client:</strong> add custom JSON encoder for extended type
support (<a
href="7780e90bd2">7780e90</a>)</li>
<li>use output_config for structured outputs (<a
href="82d669db65">82d669d</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>client:</strong> run formatter (<a
href="2e4ff86d7b">2e4ff86</a>)</li>
<li>remove class causing breaking change (<a
href="https://redirect.github.com/anthropics/anthropic-sdk-python/issues/1333">#1333</a>)
(<a
href="81ee9533d1">81ee953</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="cd1b39bf07"><code>cd1b39b</code></a>
release: 0.79.0</li>
<li><a
href="fb52a6a09d"><code>fb52a6a</code></a>
fix: pass speed parameter through in sync beta count_tokens</li>
<li><a
href="b7c2df239d"><code>b7c2df2</code></a>
feat(api): enabling fast-mode in claude-opus-4-6</li>
<li><a
href="7c42e4b04b"><code>7c42e4b</code></a>
Update CHANGELOG.md (<a
href="https://redirect.github.com/anthropics/anthropic-sdk-python/issues/1163">#1163</a>)</li>
<li><a
href="f2b61ed11c"><code>f2b61ed</code></a>
release: 0.78.0</li>
<li><a
href="a4a29cab92"><code>a4a29ca</code></a>
feat(api): manual updates</li>
<li><a
href="3955600d74"><code>3955600</code></a>
release: 0.77.1</li>
<li><a
href="eca8ddfb19"><code>eca8ddf</code></a>
fix(structured outputs): send structured output beta header when format
is om...</li>
<li><a
href="ee44c52131"><code>ee44c52</code></a>
chore: remove claude-code-review workflow (<a
href="https://redirect.github.com/anthropics/anthropic-sdk-python/issues/1338">#1338</a>)</li>
<li><a
href="9c485f6966"><code>9c485f6</code></a>
release: 0.77.0 (<a
href="https://redirect.github.com/anthropics/anthropic-sdk-python/issues/1117">#1117</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/anthropics/anthropic-sdk-python/compare/v0.59.0...v0.79.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `fastapi` from 0.128.3 to 0.128.5
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/fastapi/fastapi/releases">fastapi's
releases</a>.</em></p>
<blockquote>
<h2>0.128.5</h2>
<h3>Refactors</h3>
<ul>
<li>♻️ Refactor and simplify Pydantic v2 (and v1) compatibility internal
utils. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14862">#14862</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li> Add inline snapshot tests for OpenAPI before changes from Pydantic
v2. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14864">#14864</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h2>0.128.4</h2>
<h3>Refactors</h3>
<ul>
<li>♻️ Refactor internals, simplify Pydantic v2/v1 utils,
<code>create_model_field</code>, better types for
<code>lenient_issubclass</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14860">#14860</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li>♻️ Simplify internals, remove Pydantic v1 only logic, no longer
needed. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14857">#14857</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li>♻️ Refactor internals, cleanup unneeded Pydantic v1 specific logic.
PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14856">#14856</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Translations</h3>
<ul>
<li>🌐 Update translations for fr (outdated pages). PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14839">#14839</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
<li>🌐 Update translations for tr (outdated and missing). PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14838">#14838</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li>⬆️ Upgrade development dependencies. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14854">#14854</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="dedf1409fe"><code>dedf140</code></a>
🔖 Release version 0.128.5</li>
<li><a
href="79d4dfb37f"><code>79d4dfb</code></a>
📝 Update release notes</li>
<li><a
href="9f4ecf562c"><code>9f4ecf5</code></a>
 Add inline snapshot tests for OpenAPI before changes from Pydantic v2
(<a
href="https://redirect.github.com/fastapi/fastapi/issues/14864">#14864</a>)</li>
<li><a
href="c48539f4c6"><code>c48539f</code></a>
📝 Update release notes</li>
<li><a
href="2e7d3754cd"><code>2e7d375</code></a>
♻️ Refactor and simplify Pydantic v2 (and v1) compatibility internal
utils (#...</li>
<li><a
href="8eac94bd91"><code>8eac94b</code></a>
🔖 Release version 0.128.4</li>
<li><a
href="58cdfc7f4b"><code>58cdfc7</code></a>
📝 Update release notes</li>
<li><a
href="d59fbc3494"><code>d59fbc3</code></a>
♻️ Refactor internals, simplify Pydantic v2/v1 utils,
<code>create_model_field</code>, b...</li>
<li><a
href="cc6ced6345"><code>cc6ced6</code></a>
📝 Update release notes</li>
<li><a
href="cf55bade7e"><code>cf55bad</code></a>
♻️ Simplify internals, remove Pydantic v1 only logic, no longer needed
(<a
href="https://redirect.github.com/fastapi/fastapi/issues/14857">#14857</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/fastapi/fastapi/compare/0.128.3...0.128.5">compare
view</a></li>
</ul>
</details>
<br />

Updates `ollama` from 0.5.4 to 0.6.1
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/ollama/ollama-python/releases">ollama's
releases</a>.</em></p>
<blockquote>
<h2>v0.6.1</h2>
<h2>What's Changed</h2>
<ul>
<li>client/types: add logprobs support by <a
href="https://github.com/ParthSareen"><code>@​ParthSareen</code></a> in
<a
href="https://redirect.github.com/ollama/ollama-python/pull/601">ollama/ollama-python#601</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/ollama/ollama-python/compare/v0.6.0...v0.6.1">https://github.com/ollama/ollama-python/compare/v0.6.0...v0.6.1</a></p>
<h2>v0.6.0</h2>
<h2>What's Changed</h2>
<ul>
<li>
<p>client: add web search and web crawl capabilities by <a
href="https://github.com/ParthSareen"><code>@​ParthSareen</code></a> in
<a
href="https://redirect.github.com/ollama/ollama-python/pull/578">ollama/ollama-python#578</a></p>
</li>
<li>
<p>client: load OLLAMA_API_KEY on init by <a
href="https://github.com/ParthSareen"><code>@​ParthSareen</code></a> in
<a
href="https://redirect.github.com/ollama/ollama-python/pull/583">ollama/ollama-python#583</a></p>
</li>
<li>
<p>client/types: update web search and fetch API by <a
href="https://github.com/npardal"><code>@​npardal</code></a> in <a
href="https://redirect.github.com/ollama/ollama-python/pull/584">ollama/ollama-python#584</a></p>
</li>
<li>
<p>examples: add mcp server for web_search web_crawl by <a
href="https://github.com/ParthSareen"><code>@​ParthSareen</code></a> in
<a
href="https://redirect.github.com/ollama/ollama-python/pull/585">ollama/ollama-python#585</a></p>
</li>
<li>
<p>examples: gpt oss browser tool by <a
href="https://github.com/ParthSareen"><code>@​ParthSareen</code></a> in
<a
href="https://redirect.github.com/ollama/ollama-python/pull/588">ollama/ollama-python#588</a></p>
</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/npardal"><code>@​npardal</code></a> made
their first contribution in <a
href="https://redirect.github.com/ollama/ollama-python/pull/584">ollama/ollama-python#584</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/ollama/ollama-python/compare/v0.5.4...v0.6.0">https://github.com/ollama/ollama-python/compare/v0.5.4...v0.6.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0008226fda"><code>0008226</code></a>
client/types: add logprobs support (<a
href="https://redirect.github.com/ollama/ollama-python/issues/601">#601</a>)</li>
<li><a
href="9ddd5f0182"><code>9ddd5f0</code></a>
examples: fix model web search (<a
href="https://redirect.github.com/ollama/ollama-python/issues/589">#589</a>)</li>
<li><a
href="d967f048d9"><code>d967f04</code></a>
examples: gpt oss browser tool (<a
href="https://redirect.github.com/ollama/ollama-python/issues/588">#588</a>)</li>
<li><a
href="ab49a669cd"><code>ab49a66</code></a>
examples: add mcp server for web_search web_crawl (<a
href="https://redirect.github.com/ollama/ollama-python/issues/585">#585</a>)</li>
<li><a
href="16f344f635"><code>16f344f</code></a>
client/types: update web search and fetch API (<a
href="https://redirect.github.com/ollama/ollama-python/issues/584">#584</a>)</li>
<li><a
href="d0f71bc8b8"><code>d0f71bc</code></a>
client: load OLLAMA_API_KEY on init (<a
href="https://redirect.github.com/ollama/ollama-python/issues/583">#583</a>)</li>
<li><a
href="b22c5fdabb"><code>b22c5fd</code></a>
init: fix export for web_search (<a
href="https://redirect.github.com/ollama/ollama-python/issues/581">#581</a>)</li>
<li><a
href="4d0b81b37a"><code>4d0b81b</code></a>
client: add web search and web crawl capabilities (<a
href="https://redirect.github.com/ollama/ollama-python/issues/578">#578</a>)</li>
<li>See full diff in <a
href="https://github.com/ollama/ollama-python/compare/v0.5.4...v0.6.1">compare
view</a></li>
</ul>
</details>
<br />

Updates `prometheus-client` from 0.22.1 to 0.24.1
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/prometheus/client_python/releases">prometheus-client's
releases</a>.</em></p>
<blockquote>
<h2>v0.24.1</h2>
<ul>
<li>[Django] Pass correct registry to MultiProcessCollector by <a
href="https://github.com/jelly"><code>@​jelly</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1152">prometheus/client_python#1152</a></li>
</ul>
<h2>v0.24.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Add an AIOHTTP exporter by <a
href="https://github.com/Lexicality"><code>@​Lexicality</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1139">prometheus/client_python#1139</a></li>
<li>Add remove_matching() method for metric label deletion by <a
href="https://github.com/hazel-shen"><code>@​hazel-shen</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1121">prometheus/client_python#1121</a></li>
<li>fix(multiprocess): avoid double-building child metric names (<a
href="https://redirect.github.com/prometheus/client_python/issues/1035">#1035</a>)
by <a href="https://github.com/hazel-shen"><code>@​hazel-shen</code></a>
in <a
href="https://redirect.github.com/prometheus/client_python/pull/1146">prometheus/client_python#1146</a></li>
<li>Don't interleave histogram metrics in multi-process collector by <a
href="https://github.com/cjwatson"><code>@​cjwatson</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1148">prometheus/client_python#1148</a></li>
<li>Relax registry type annotations for exposition by <a
href="https://github.com/cjwatson"><code>@​cjwatson</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1149">prometheus/client_python#1149</a></li>
<li>Added compression support in pushgateway by <a
href="https://github.com/ritesh-avesha"><code>@​ritesh-avesha</code></a>
in <a
href="https://redirect.github.com/prometheus/client_python/pull/1144">prometheus/client_python#1144</a></li>
<li>Add Django exporter (<a
href="https://redirect.github.com/prometheus/client_python/issues/1088">#1088</a>)
by <a href="https://github.com/Chadys"><code>@​Chadys</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1143">prometheus/client_python#1143</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/prometheus/client_python/compare/v0.23.1...v0.24.0">https://github.com/prometheus/client_python/compare/v0.23.1...v0.24.0</a></p>
<h2>v0.23.1</h2>
<h2>What's Changed</h2>
<ul>
<li>fix: use tuples instead of packaging Version by <a
href="https://github.com/efiop"><code>@​efiop</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1136">prometheus/client_python#1136</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/efiop"><code>@​efiop</code></a> made
their first contribution in <a
href="https://redirect.github.com/prometheus/client_python/pull/1136">prometheus/client_python#1136</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/prometheus/client_python/compare/v0.23.0...v0.23.1">https://github.com/prometheus/client_python/compare/v0.23.0...v0.23.1</a></p>
<h2>v0.23.0</h2>
<h2>What's Changed</h2>
<ul>
<li>UTF-8 Content Negotiation by <a
href="https://github.com/ywwg"><code>@​ywwg</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1102">prometheus/client_python#1102</a></li>
<li>Re include test data by <a
href="https://github.com/mgorny"><code>@​mgorny</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1113">prometheus/client_python#1113</a></li>
<li>Improve parser performance by <a
href="https://github.com/csmarchbanks"><code>@​csmarchbanks</code></a>
in <a
href="https://redirect.github.com/prometheus/client_python/pull/1117">prometheus/client_python#1117</a></li>
<li>Add support to <code>write_to_textfile</code> for custom tmpdir by
<a
href="https://github.com/aadityadhruv"><code>@​aadityadhruv</code></a>
in <a
href="https://redirect.github.com/prometheus/client_python/pull/1115">prometheus/client_python#1115</a></li>
<li>OM text exposition for NH by <a
href="https://github.com/vesari"><code>@​vesari</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1087">prometheus/client_python#1087</a></li>
<li>Fix bug which caused metric publishing to not accept query string
parameters in ASGI app by <a
href="https://github.com/hacksparr0w"><code>@​hacksparr0w</code></a> in
<a
href="https://redirect.github.com/prometheus/client_python/pull/1125">prometheus/client_python#1125</a></li>
<li>Emit native histograms only when OM 2.0.0 is requested by <a
href="https://github.com/vesari"><code>@​vesari</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1128">prometheus/client_python#1128</a></li>
<li>fix: remove space after comma in openmetrics exposition by <a
href="https://github.com/theSuess"><code>@​theSuess</code></a> in <a
href="https://redirect.github.com/prometheus/client_python/pull/1132">prometheus/client_python#1132</a></li>
<li>Fix issue parsing double spaces after # HELP/# TYPE by <a
href="https://github.com/csmarchbanks"><code>@​csmarchbanks</code></a>
in <a
href="https://redirect.github.com/prometheus/client_python/pull/1134">prometheus/client_python#1134</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/mgorny"><code>@​mgorny</code></a> made
their first contribution in <a
href="https://redirect.github.com/prometheus/client_python/pull/1113">prometheus/client_python#1113</a></li>
<li><a
href="https://github.com/aadityadhruv"><code>@​aadityadhruv</code></a>
made their first contribution in <a
href="https://redirect.github.com/prometheus/client_python/pull/1115">prometheus/client_python#1115</a></li>
<li><a
href="https://github.com/hacksparr0w"><code>@​hacksparr0w</code></a>
made their first contribution in <a
href="https://redirect.github.com/prometheus/client_python/pull/1125">prometheus/client_python#1125</a></li>
<li><a href="https://github.com/theSuess"><code>@​theSuess</code></a>
made their first contribution in <a
href="https://redirect.github.com/prometheus/client_python/pull/1132">prometheus/client_python#1132</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/prometheus/client_python/compare/v0.22.1...v0.23.0">https://github.com/prometheus/client_python/compare/v0.22.1...v0.23.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f417f6ea8f"><code>f417f6e</code></a>
Release 0.24.1</li>
<li><a
href="6f0e967c1f"><code>6f0e967</code></a>
Pass correct registry to MultiProcessCollector (<a
href="https://redirect.github.com/prometheus/client_python/issues/1152">#1152</a>)</li>
<li><a
href="c5024d310f"><code>c5024d3</code></a>
Release 0.24.0</li>
<li><a
href="e1cdc203b1"><code>e1cdc20</code></a>
Add Django exporter (<a
href="https://redirect.github.com/prometheus/client_python/issues/1088">#1088</a>)
(<a
href="https://redirect.github.com/prometheus/client_python/issues/1143">#1143</a>)</li>
<li><a
href="7b99592094"><code>7b99592</code></a>
Added compression support in pushgateway (<a
href="https://redirect.github.com/prometheus/client_python/issues/1144">#1144</a>)</li>
<li><a
href="13df12421e"><code>13df124</code></a>
Relax registry type annotations for exposition (<a
href="https://redirect.github.com/prometheus/client_python/issues/1149">#1149</a>)</li>
<li><a
href="a264ec0d85"><code>a264ec0</code></a>
Don't interleave histogram metrics in multi-process collector (<a
href="https://redirect.github.com/prometheus/client_python/issues/1148">#1148</a>)</li>
<li><a
href="e8f8bae655"><code>e8f8bae</code></a>
fix(multiprocess): avoid double-building child metric names (<a
href="https://redirect.github.com/prometheus/client_python/issues/1035">#1035</a>)
(<a
href="https://redirect.github.com/prometheus/client_python/issues/1146">#1146</a>)</li>
<li><a
href="1783ca87ac"><code>1783ca8</code></a>
Add support for Python 3.14 (<a
href="https://redirect.github.com/prometheus/client_python/issues/1142">#1142</a>)</li>
<li><a
href="378510b8ae"><code>378510b</code></a>
Add remove_matching() method for metric label deletion (<a
href="https://redirect.github.com/prometheus/client_python/issues/1121">#1121</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/prometheus/client_python/compare/v0.22.1...v0.24.1">compare
view</a></li>
</ul>
</details>
<br />

Updates `python-multipart` from 0.0.20 to 0.0.22
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/Kludex/python-multipart/releases">python-multipart's
releases</a>.</em></p>
<blockquote>
<h2>Version 0.0.22</h2>
<h2>What's Changed</h2>
<ul>
<li>Drop directory path from filename in <code>File</code> <a
href="9433f4bbc9">9433f4b</a>.</li>
</ul>
<hr />
<p><strong>Full Changelog</strong>: <a
href="https://github.com/Kludex/python-multipart/compare/0.0.21...0.0.22">https://github.com/Kludex/python-multipart/compare/0.0.21...0.0.22</a></p>
<h2>Version 0.0.21</h2>
<h2>What's Changed</h2>
<ul>
<li>Add support for Python 3.14 and drop EOL 3.8 and 3.9 by <a
href="https://github.com/hugovk"><code>@​hugovk</code></a> in <a
href="https://redirect.github.com/Kludex/python-multipart/pull/216">Kludex/python-multipart#216</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/waketzheng"><code>@​waketzheng</code></a> made
their first contribution in <a
href="https://redirect.github.com/Kludex/python-multipart/pull/203">Kludex/python-multipart#203</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/Kludex/python-multipart/compare/0.0.20...0.0.21">https://github.com/Kludex/python-multipart/compare/0.0.20...0.0.21</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Kludex/python-multipart/blob/master/CHANGELOG.md">python-multipart's
changelog</a>.</em></p>
<blockquote>
<h2>0.0.22 (2026-01-25)</h2>
<ul>
<li>Drop directory path from filename in <code>File</code> <a
href="9433f4bbc9">9433f4b</a>.</li>
</ul>
<h2>0.0.21 (2025-12-17)</h2>
<ul>
<li>Add support for Python 3.14 and drop EOL 3.8 and 3.9 <a
href="https://redirect.github.com/Kludex/python-multipart/pull/216">#216</a>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="bea7bbb290"><code>bea7bbb</code></a>
Version 0.0.22 (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/222">#222</a>)</li>
<li><a
href="0fb59a9df0"><code>0fb59a9</code></a>
chore: add return type on test (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/221">#221</a>)</li>
<li><a
href="9433f4bbc9"><code>9433f4b</code></a>
Merge commit from fork</li>
<li><a
href="d5c91ecb0a"><code>d5c91ec</code></a>
Bump the github-actions group with 2 updates (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/219">#219</a>)</li>
<li><a
href="5a90631b48"><code>5a90631</code></a>
bump uv (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/218">#218</a>)</li>
<li><a
href="1f72955602"><code>1f72955</code></a>
Version 0.0.21 (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/217">#217</a>)</li>
<li><a
href="47ecfed353"><code>47ecfed</code></a>
Add support for Python 3.14 and drop EOL 3.8 and 3.9 (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/216">#216</a>)</li>
<li><a
href="f18b70941b"><code>f18b709</code></a>
Bump the github-actions group across 1 directory with 4 updates (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/214">#214</a>)</li>
<li><a
href="b388e9a7a8"><code>b388e9a</code></a>
chore: use depedency-groups in <code>pyproject.toml</code> (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/212">#212</a>)</li>
<li><a
href="6113e75097"><code>6113e75</code></a>
Bump the github-actions group across 1 directory with 3 updates (<a
href="https://redirect.github.com/Kludex/python-multipart/issues/210">#210</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/Kludex/python-multipart/compare/0.0.20...0.0.22">compare
view</a></li>
</ul>
</details>
<br />

Updates `supabase` from 2.27.2 to 2.27.3
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/supabase/supabase-py/releases">supabase's
releases</a>.</em></p>
<blockquote>
<h2>v2.27.3</h2>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.27.2...v2.27.3">2.27.3</a>
(2026-02-03)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>deprecate python 3.9 in all packages (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1365">#1365</a>)
(<a
href="cc72ed75d4">cc72ed7</a>)</li>
<li>ensure storage_url has trailing slash to prevent warning (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1367">#1367</a>)
(<a
href="4267ff1345">4267ff1</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/supabase/supabase-py/blob/main/CHANGELOG.md">supabase's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.27.2...v2.27.3">2.27.3</a>
(2026-02-03)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>deprecate python 3.9 in all packages (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1365">#1365</a>)
(<a
href="cc72ed75d4">cc72ed7</a>)</li>
<li>ensure storage_url has trailing slash to prevent warning (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1367">#1367</a>)
(<a
href="4267ff1345">4267ff1</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="c357def670"><code>c357def</code></a>
chore(main): release 2.27.3 (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1368">#1368</a>)</li>
<li><a
href="4267ff1345"><code>4267ff1</code></a>
fix: ensure storage_url has trailing slash to prevent warning (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1367">#1367</a>)</li>
<li><a
href="cc72ed75d4"><code>cc72ed7</code></a>
fix: deprecate python 3.9 in all packages (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1365">#1365</a>)</li>
<li><a
href="9d3620da64"><code>9d3620d</code></a>
chore(realtime): move most 'info' level logs into 'debug' (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1358">#1358</a>)</li>
<li><a
href="30f5e84022"><code>30f5e84</code></a>
Upgrade GitHub Actions for Node 24 compatibility (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1357">#1357</a>)</li>
<li><a
href="1df3afcd7c"><code>1df3afc</code></a>
chore(ci): add python package to ci matrix (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1351">#1351</a>)</li>
<li>See full diff in <a
href="https://github.com/supabase/supabase-py/compare/v2.27.2...v2.27.3">compare
view</a></li>
</ul>
</details>
<br />

Updates `tenacity` from 9.1.3 to 9.1.4
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/jd/tenacity/releases">tenacity's
releases</a>.</em></p>
<blockquote>
<h2>9.1.4</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix <code>retry()</code> annotations with async <code>sleep=</code>
function by <a
href="https://github.com/Zac-HD"><code>@​Zac-HD</code></a> in <a
href="https://redirect.github.com/jd/tenacity/pull/555">jd/tenacity#555</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/jd/tenacity/compare/9.1.3...9.1.4">https://github.com/jd/tenacity/compare/9.1.3...9.1.4</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d4e868d6b8"><code>d4e868d</code></a>
Fix <code>retry()</code> annotations with async <code>sleep=</code>
function (<a
href="https://redirect.github.com/jd/tenacity/issues/555">#555</a>)</li>
<li>See full diff in <a
href="https://github.com/jd/tenacity/compare/9.1.3...9.1.4">compare
view</a></li>
</ul>
</details>
<br />

Updates `tiktoken` from 0.9.0 to 0.12.0
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/openai/tiktoken/blob/main/CHANGELOG.md">tiktoken's
changelog</a>.</em></p>
<blockquote>
<h2>[v0.12.0]</h2>
<ul>
<li>Build wheels for Python 3.14</li>
<li>Build musllinux aarch64 wheels</li>
<li>Support for free-threaded Python</li>
<li>Update version of <code>pyo3</code> and <code>rustc-hash</code></li>
<li>Avoid use of <code>blobfile</code> for reading local files</li>
<li>Recognise <code>gpt-5</code> model identifier</li>
<li>Minor performance improvement for file reading</li>
</ul>
<h2>[v0.11.0]</h2>
<ul>
<li>Support for <code>GPT-5</code></li>
<li>Update version of <code>pyo3</code></li>
<li>Use new Rust edition</li>
<li>Fix special token handling in <code>encode_to_numpy</code></li>
<li>Better error handling</li>
<li>Improvements to private APIs</li>
</ul>
<h2>[v0.10.0]</h2>
<ul>
<li>Support for newer models</li>
<li>Improvements to private APIs</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="97e49cbadd"><code>97e49cb</code></a>
Release 0.12.0</li>
<li><a
href="948549882b"><code>9485498</code></a>
Partial sync of codebase (<a
href="https://redirect.github.com/openai/tiktoken/issues/451">#451</a>)</li>
<li><a
href="00ff187f59"><code>00ff187</code></a>
Add GPT-5 model support with o200k_base encoding (<a
href="https://redirect.github.com/openai/tiktoken/issues/440">#440</a>)</li>
<li><a
href="5ee89ca1fa"><code>5ee89ca</code></a>
chore: update dependencies (<a
href="https://redirect.github.com/openai/tiktoken/issues/449">#449</a>)</li>
<li><a
href="2ab6d3706d"><code>2ab6d37</code></a>
Support the free-threaded build (<a
href="https://redirect.github.com/openai/tiktoken/issues/443">#443</a>)</li>
<li><a
href="82dc3bbacc"><code>82dc3bb</code></a>
bump PyO3 version (<a
href="https://redirect.github.com/openai/tiktoken/issues/444">#444</a>)</li>
<li><a
href="eedc856364"><code>eedc856</code></a>
Partial sync of codebase</li>
<li><a
href="5818d56626"><code>5818d56</code></a>
Partial sync of codebase</li>
<li><a
href="3591ff175d"><code>3591ff1</code></a>
Sync codebase</li>
<li><a
href="4560a8896f"><code>4560a88</code></a>
Sync codebase (<a
href="https://redirect.github.com/openai/tiktoken/issues/389">#389</a>)</li>
<li>See full diff in <a
href="https://github.com/openai/tiktoken/compare/0.9.0...0.12.0">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Otto <otto@agpt.co>
2026-02-09 03:28:22 +00:00
Otto
a329831b0b feat(backend): Add ClamAV scanning for local file paths (#11988)
## Context

From PR #11796 review discussion. Files processed by the video blocks
(downloads, uploads, generated videos) should be scanned through ClamAV
for malware detection.

## Problem

`store_media_file()` in `backend/util/file.py` already scans:
- `workspace://` references
- Cloud storage paths  
- Data URIs (`data:...`)
- HTTP/HTTPS URLs

**But local file paths were NOT scanned.** The `else` branch only
verified the file exists.

This gap affected video processing blocks (e.g., `LoopVideoBlock`,
`AddAudioToVideoBlock`) that:
1. Download/receive input media
2. Process it locally (loop, add audio, etc.)
3. Write output to temp directory
4. Call `store_media_file(output_filename, ...)` with a local path →
**skipped virus scanning**

## Solution

Added virus scanning to the local file path branch:
```python
# Virus scan the local file before any further processing
local_content = target_path.read_bytes()
if len(local_content) > MAX_FILE_SIZE_BYTES:
    raise ValueError(...)
await scan_content_safe(local_content, filename=sanitized_file)
```

## Changes

- `backend/util/file.py` - Added ~7 lines to scan local files
(consistent with other input types)
- `backend/util/file_test.py` - Added 2 test cases for local file
scanning

## Risk Assessment

- **Low risk:** Single point of change, follows existing pattern
- **Backwards compatible:** No API changes
- **Fail-safe:** If scanning fails, file is rejected (existing behavior)

Closes SECRT-1904

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-02-09 00:24:18 +00:00
dependabot[bot]
98dd1a9480 chore(libs/deps): Bump cryptography from 45.0.6 to 46.0.1 in /autogpt_platform/autogpt_libs (#10968)
Bumps [cryptography](https://github.com/pyca/cryptography) from 45.0.6
to 46.0.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst">cryptography's
changelog</a>.</em></p>
<blockquote>
<p>46.0.1 - 2025-09-16</p>
<pre><code>
* Fixed an issue where users installing via ``pip`` on Python 3.14
development
  versions would not properly install a dependency.
* Fixed an issue building the free-threaded macOS 3.14 wheels.
<p>.. _v46-0-0:</p>
<p>46.0.0 - 2025-09-16<br />
</code></pre></p>
<ul>
<li><strong>BACKWARDS INCOMPATIBLE:</strong> Support for Python 3.7 has
been removed.</li>
<li>Support for OpenSSL &lt; 3.0 is deprecated and will be removed in
the next
release.</li>
<li>Support for <code>x86_64</code> macOS (including publishing wheels)
is deprecated
and will be removed in two releases. We will switch to publishing an
<code>arm64</code> only wheel for macOS.</li>
<li>Support for 32-bit Windows (including publishing wheels) is
deprecated
and will be removed in two releases. Users should move to a 64-bit
Python installation.</li>
<li>Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL
3.5.3.</li>
<li>We now build <code>ppc64le</code> <code>manylinux</code> wheels and
publish them to PyPI.</li>
<li>We now build <code>win_arm64</code> (Windows on Arm) wheels and
publish them to PyPI.</li>
<li>Added support for free-threaded Python 3.14.</li>
<li>Removed the deprecated <code>get_attribute_for_oid</code> method on
:class:<code>~cryptography.x509.CertificateSigningRequest</code>. Users
should use
:meth:<code>~cryptography.x509.Attributes.get_attribute_for_oid</code>
instead.</li>
<li>Removed the deprecated <code>CAST5</code>, <code>SEED</code>,
<code>IDEA</code>, and <code>Blowfish</code>
classes from the cipher module. These are still available in
:doc:<code>/hazmat/decrepit/index</code>.</li>
<li>In X.509, when performing a PSS signature with a SHA-3 hash, it is
now
encoded with the official NIST SHA3 OID.</li>
</ul>
<p>.. _v45-0-7:</p>
<p>45.0.7 - 2025-09-01</p>
<pre><code>
* Added a function to support an upcoming ``pyOpenSSL`` release.
<p>.. _v45-0-6:<br />
</code></pre></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="e735cfc275"><code>e735cfc</code></a>
release 46.0.1 (<a
href="https://redirect.github.com/pyca/cryptography/issues/13450">#13450</a>)</li>
<li><a
href="4e457ffba4"><code>4e457ff</code></a>
Explicitly specify python in mac uv build invocation (<a
href="https://redirect.github.com/pyca/cryptography/issues/13447">#13447</a>)</li>
<li><a
href="2726efdb6d"><code>2726efd</code></a>
Depend on CFFI 2.0.0 or newer on Python &gt; 3.8 (<a
href="https://redirect.github.com/pyca/cryptography/issues/13448">#13448</a>)</li>
<li><a
href="62230623d1"><code>6223062</code></a>
release 46.0.0 (<a
href="https://redirect.github.com/pyca/cryptography/issues/13446">#13446</a>)</li>
<li><a
href="563c4915b0"><code>563c491</code></a>
Update comment for pyopenssl-release tag (<a
href="https://redirect.github.com/pyca/cryptography/issues/13445">#13445</a>)</li>
<li><a
href="d2f6f7face"><code>d2f6f7f</code></a>
Bump downstream dependencies in CI (<a
href="https://redirect.github.com/pyca/cryptography/issues/13439">#13439</a>)</li>
<li><a
href="e7ab02bd67"><code>e7ab02b</code></a>
we'll ship this with 3.5.3 why not (<a
href="https://redirect.github.com/pyca/cryptography/issues/13442">#13442</a>)</li>
<li><a
href="0b68a4bffb"><code>0b68a4b</code></a>
Another pair of bump dependencies fix (<a
href="https://redirect.github.com/pyca/cryptography/issues/13444">#13444</a>)</li>
<li><a
href="e076d08ee4"><code>e076d08</code></a>
Attempt to fix commit message for bump downstreams (<a
href="https://redirect.github.com/pyca/cryptography/issues/13440">#13440</a>)</li>
<li><a
href="6835ce899e"><code>6835ce8</code></a>
Put correct version bounds for pyenchant in pins (<a
href="https://redirect.github.com/pyca/cryptography/issues/13441">#13441</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/pyca/cryptography/compare/45.0.6...46.0.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=cryptography&package-manager=pip&previous-version=45.0.6&new-version=46.0.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-08 23:40:15 +00:00
dependabot[bot]
9c7c598c7d chore(deps): bump peter-evans/create-pull-request from 7 to 8 (#11663)
Bumps
[peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request)
from 7 to 8.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/peter-evans/create-pull-request/releases">peter-evans/create-pull-request's
releases</a>.</em></p>
<blockquote>
<h2>Create Pull Request v8.0.0</h2>
<h2>What's new in v8</h2>
<ul>
<li>Requires <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Actions
Runner v2.327.1</a> or later if you are using a self-hosted runner for
Node 24 support.</li>
</ul>
<h2>What's Changed</h2>
<ul>
<li>chore: Update checkout action version to v6 by <a
href="https://github.com/yonas"><code>@​yonas</code></a> in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4258">peter-evans/create-pull-request#4258</a></li>
<li>Update actions/checkout references to <a
href="https://github.com/v6"><code>@​v6</code></a> in docs by <a
href="https://github.com/Copilot"><code>@​Copilot</code></a> in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4259">peter-evans/create-pull-request#4259</a></li>
<li>feat: v8 by <a
href="https://github.com/peter-evans"><code>@​peter-evans</code></a> in
<a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4260">peter-evans/create-pull-request#4260</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/yonas"><code>@​yonas</code></a> made
their first contribution in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4258">peter-evans/create-pull-request#4258</a></li>
<li><a href="https://github.com/Copilot"><code>@​Copilot</code></a> made
their first contribution in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4259">peter-evans/create-pull-request#4259</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/peter-evans/create-pull-request/compare/v7.0.11...v8.0.0">https://github.com/peter-evans/create-pull-request/compare/v7.0.11...v8.0.0</a></p>
<h2>Create Pull Request v7.0.11</h2>
<h2>What's Changed</h2>
<ul>
<li>fix: restrict remote prune to self-hosted runners by <a
href="https://github.com/peter-evans"><code>@​peter-evans</code></a> in
<a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4250">peter-evans/create-pull-request#4250</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/peter-evans/create-pull-request/compare/v7.0.10...v7.0.11">https://github.com/peter-evans/create-pull-request/compare/v7.0.10...v7.0.11</a></p>
<h2>Create Pull Request v7.0.10</h2>
<p>⚙️ Fixes an issue where updating a pull request failed when targeting
a forked repository with the same owner as its parent.</p>
<h2>What's Changed</h2>
<ul>
<li>build(deps): bump the github-actions group with 2 updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4235">peter-evans/create-pull-request#4235</a></li>
<li>build(deps-dev): bump prettier from 3.6.2 to 3.7.3 in the npm group
by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4240">peter-evans/create-pull-request#4240</a></li>
<li>fix: provider list pulls fallback for multi fork same owner by <a
href="https://github.com/peter-evans"><code>@​peter-evans</code></a> in
<a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4245">peter-evans/create-pull-request#4245</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/obnyis"><code>@​obnyis</code></a> made
their first contribution in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4064">peter-evans/create-pull-request#4064</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/peter-evans/create-pull-request/compare/v7.0.9...v7.0.10">https://github.com/peter-evans/create-pull-request/compare/v7.0.9...v7.0.10</a></p>
<h2>Create Pull Request v7.0.9</h2>
<p>⚙️ Fixes an <a
href="https://redirect.github.com/peter-evans/create-pull-request/issues/4228">incompatibility</a>
with the recently released <code>actions/checkout@v6</code>.</p>
<h2>What's Changed</h2>
<ul>
<li>~70 dependency updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a></li>
<li>docs: fix workaround description about <code>ready_for_review</code>
by <a href="https://github.com/ybiquitous"><code>@​ybiquitous</code></a>
in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/3939">peter-evans/create-pull-request#3939</a></li>
<li>Docs: <code>add-paths</code> default behavior by <a
href="https://github.com/joeflack4"><code>@​joeflack4</code></a> in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/3928">peter-evans/create-pull-request#3928</a></li>
<li>docs: update to create-github-app-token v2 by <a
href="https://github.com/Goooler"><code>@​Goooler</code></a> in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4063">peter-evans/create-pull-request#4063</a></li>
<li>Fix compatibility with actions/checkout@v6 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4230">peter-evans/create-pull-request#4230</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/joeflack4"><code>@​joeflack4</code></a>
made their first contribution in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/3928">peter-evans/create-pull-request#3928</a></li>
<li><a href="https://github.com/Goooler"><code>@​Goooler</code></a> made
their first contribution in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4063">peter-evans/create-pull-request#4063</a></li>
<li><a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> made
their first contribution in <a
href="https://redirect.github.com/peter-evans/create-pull-request/pull/4230">peter-evans/create-pull-request#4230</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="98357b18bf"><code>98357b1</code></a>
feat: v8 (<a
href="https://redirect.github.com/peter-evans/create-pull-request/issues/4260">#4260</a>)</li>
<li><a
href="41c0e4b789"><code>41c0e4b</code></a>
Update actions/checkout references to <a
href="https://github.com/v6"><code>@​v6</code></a> in docs (<a
href="https://redirect.github.com/peter-evans/create-pull-request/issues/4259">#4259</a>)</li>
<li><a
href="994332de4c"><code>994332d</code></a>
chore: Update checkout action version to v6 (<a
href="https://redirect.github.com/peter-evans/create-pull-request/issues/4258">#4258</a>)</li>
<li>See full diff in <a
href="https://github.com/peter-evans/create-pull-request/compare/v7...v8">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=peter-evans/create-pull-request&package-manager=github_actions&previous-version=7&new-version=8)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-08 23:06:40 +00:00
Nikhil Bhagat
728c40def5 fix(backend): replace multiprocessing queue with thread safe queue in ExecutionQueue (#11618)
<!-- Clearly explain the need for these changes: -->

The `ExecutionQueue` class was using `multiprocessing.Manager().Queue()`
which spawns a subprocess for inter-process communication. However,
analysis showed that `ExecutionQueue` is only accessed from threads
within the same process, not across processes. This caused:
- Unnecessary subprocess spawning per graph execution
- IPC overhead for every queue operation
- Potential resource leaks if Manager processes weren't properly cleaned
up
- Limited scalability when many graphs execute concurrently

### Changes 

<!-- Concisely describe all of the changes made in this pull request:
-->

- Replaced `multiprocessing.Manager().Queue()` with `queue.Queue()` in
`ExecutionQueue` class
- Updated imports: removed `from multiprocessing import Manager` and
`from queue import Empty`, added `import queue`
- Updated exception handling from `except Empty:` to `except
queue.Empty:`
- Added comprehensive docstring explaining the bug and fix

**File changed:** `autogpt_platform/backend/backend/data/execution.py`

### Checklist 

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
- [x] Verified `ExecutionQueue` uses `queue.Queue` (not
`multiprocessing.Manager().Queue()`)
- [x] Tested all queue operations: `add()`, `get()`, `empty()`,
`get_or_none()`
- [x] Verified thread-safety with concurrent producer/consumer threads
(100 items)
- [x] Verified multi-producer/consumer scenario (3 producers, 2
consumers, 150 items)
  - [x] Confirmed no subprocess spawning when creating multiple queues
  - [x] Code passes Black formatting check

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

> No configuration changes required - this is a code-only fix with no
external API changes.

---------

Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: Zamil Majdy <majdyz@users.noreply.github.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-02-08 16:28:04 +00:00
dependabot[bot]
cd64562e1b chore(libs/deps): bump the production-dependencies group across 1 directory with 8 updates (#11934)
Bumps the production-dependencies group with 8 updates in the
/autogpt_platform/autogpt_libs directory:

| Package | From | To |
| --- | --- | --- |
| [fastapi](https://github.com/fastapi/fastapi) | `0.116.1` | `0.128.0`
|
| [google-cloud-logging](https://github.com/googleapis/python-logging) |
`3.12.1` | `3.13.0` |
|
[launchdarkly-server-sdk](https://github.com/launchdarkly/python-server-sdk)
| `9.12.0` | `9.14.1` |
| [pydantic](https://github.com/pydantic/pydantic) | `2.11.7` | `2.12.5`
|
| [pydantic-settings](https://github.com/pydantic/pydantic-settings) |
`2.10.1` | `2.12.0` |
| [pyjwt](https://github.com/jpadilla/pyjwt) | `2.10.1` | `2.11.0` |
| [supabase](https://github.com/supabase/supabase-py) | `2.16.0` |
`2.27.2` |
| [uvicorn](https://github.com/Kludex/uvicorn) | `0.35.0` | `0.40.0` |


Updates `fastapi` from 0.116.1 to 0.128.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/fastapi/fastapi/releases">fastapi's
releases</a>.</em></p>
<blockquote>
<h2>0.128.0</h2>
<h3>Breaking Changes</h3>
<ul>
<li> Drop support for <code>pydantic.v1</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14609">#14609</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li> Run performance tests only on Pydantic v2. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14608">#14608</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h2>0.127.1</h2>
<h3>Refactors</h3>
<ul>
<li>🔊 Add a custom <code>FastAPIDeprecationWarning</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14605">#14605</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Docs</h3>
<ul>
<li>📝 Add documentary to website. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14600">#14600</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Translations</h3>
<ul>
<li>🌐 Update translations for de (update-outdated). PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14602">#14602</a>
by <a
href="https://github.com/nilslindemann"><code>@​nilslindemann</code></a>.</li>
<li>🌐 Update translations for de (update-outdated). PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14581">#14581</a>
by <a
href="https://github.com/nilslindemann"><code>@​nilslindemann</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li>🔧 Update pre-commit to use local Ruff instead of hook. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14604">#14604</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li> Add missing tests for code examples. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14569">#14569</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
<li>👷 Remove <code>lint</code> job from <code>test</code> CI workflow.
PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14593">#14593</a>
by <a
href="https://github.com/YuriiMotov"><code>@​YuriiMotov</code></a>.</li>
<li>👷 Update secrets check. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14592">#14592</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li>👷 Run CodSpeed tests in parallel to other tests to speed up CI. PR
<a
href="https://redirect.github.com/fastapi/fastapi/pull/14586">#14586</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li>🔨 Update scripts and pre-commit to autofix files. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14585">#14585</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h2>0.127.0</h2>
<h3>Breaking Changes</h3>
<ul>
<li>🔊 Add deprecation warnings when using <code>pydantic.v1</code>. PR
<a
href="https://redirect.github.com/fastapi/fastapi/pull/14583">#14583</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Translations</h3>
<ul>
<li>🔧 Add LLM prompt file for Korean, generated from the existing
translations. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14546">#14546</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
<li>🔧 Add LLM prompt file for Japanese, generated from the existing
translations. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14545">#14545</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h3>Internal</h3>
<ul>
<li>⬆️ Upgrade OpenAI model for translations to gpt-5.2. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14579">#14579</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<h2>0.126.0</h2>
<h3>Upgrades</h3>
<ul>
<li> Drop support for Pydantic v1, keeping short temporary support for
Pydantic v2's <code>pydantic.v1</code>. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14575">#14575</a>
by <a
href="https://github.com/tiangolo"><code>@​tiangolo</code></a>.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8322a4445a"><code>8322a44</code></a>
🔖 Release version 0.128.0</li>
<li><a
href="4b2cfcfd34"><code>4b2cfcf</code></a>
📝 Update release notes</li>
<li><a
href="e300630551"><code>e300630</code></a>
 Drop support for <code>pydantic.v1</code> (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14609">#14609</a>)</li>
<li><a
href="1b3bea8b6b"><code>1b3bea8</code></a>
📝 Update release notes</li>
<li><a
href="34e884156f"><code>34e8841</code></a>
 Run performance tests only on Pydantic v2 (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14608">#14608</a>)</li>
<li><a
href="cd90c78391"><code>cd90c78</code></a>
🔖 Release version 0.127.1</li>
<li><a
href="93f4dfd88b"><code>93f4dfd</code></a>
📝 Update release notes</li>
<li><a
href="535b5daa31"><code>535b5da</code></a>
🔊 Add a custom <code>FastAPIDeprecationWarning</code> (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14605">#14605</a>)</li>
<li><a
href="6b53786f62"><code>6b53786</code></a>
📝 Update release notes</li>
<li><a
href="d98f4eb56e"><code>d98f4eb</code></a>
🔧 Update pre-commit to use local Ruff instead of hook (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14604">#14604</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/fastapi/fastapi/compare/0.116.1...0.128.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `google-cloud-logging` from 3.12.1 to 3.13.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/googleapis/python-logging/releases">google-cloud-logging's
releases</a>.</em></p>
<blockquote>
<h2>google-cloud-logging 3.13.0</h2>
<h2><a
href="https://github.com/googleapis/python-logging/compare/v3.12.1...v3.13.0">3.13.0</a>
(2025-12-15)</h2>
<h3>Features</h3>
<ul>
<li>Add support for python 3.14 (<a
href="https://redirect.github.com/googleapis/python-logging/issues/1065">#1065</a>)
(<a
href="https://github.com/googleapis/python-logging/commit/6be3df6a">6be3df6a</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>remove setup.cfg configuration for creating universal wheels (<a
href="https://redirect.github.com/googleapis/python-logging/issues/981">#981</a>)
(<a
href="https://github.com/googleapis/python-logging/commit/70f612c3">70f612c3</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/googleapis/python-logging/blob/main/CHANGELOG.md">google-cloud-logging's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/googleapis/python-logging/compare/v3.12.1...v3.13.0">3.13.0</a>
(2025-12-15)</h2>
<h3>Features</h3>
<ul>
<li>Add support for python 3.14 (<a
href="https://redirect.github.com/googleapis/python-logging/issues/1065">#1065</a>)
(<a
href="6be3df6aa9">6be3df6aa94539cd2ab22a4fac55b343862228b2</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>remove setup.cfg configuration for creating universal wheels (<a
href="https://redirect.github.com/googleapis/python-logging/issues/981">#981</a>)
(<a
href="70f612c328">70f612c3281f1df13f3aba6b19bc4e9397297f3d</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1415883be0"><code>1415883</code></a>
chore: librarian release pull request: 20251215T134006Z (<a
href="https://redirect.github.com/googleapis/python-logging/issues/1066">#1066</a>)</li>
<li><a
href="6be3df6aa9"><code>6be3df6</code></a>
feat: Add support for python 3.14 (<a
href="https://redirect.github.com/googleapis/python-logging/issues/1065">#1065</a>)</li>
<li><a
href="36fb4270b3"><code>36fb427</code></a>
chore(librarian): onboard to librarian (<a
href="https://redirect.github.com/googleapis/python-logging/issues/1061">#1061</a>)</li>
<li><a
href="eb189bf712"><code>eb189bf</code></a>
chore: update Python generator version to 1.25.1 (<a
href="https://redirect.github.com/googleapis/python-logging/issues/1003">#1003</a>)</li>
<li><a
href="a7a28d1b93"><code>a7a28d1</code></a>
test: ignore DeprecationWarning for <code>credentials_file</code>
argument and Python ve...</li>
<li><a
href="70f612c328"><code>70f612c</code></a>
fix: remove setup.cfg configuration for creating universal wheels (<a
href="https://redirect.github.com/googleapis/python-logging/issues/981">#981</a>)</li>
<li><a
href="e4c445a856"><code>e4c445a</code></a>
chore: Update gapic-generator-python to 1.25.0 (<a
href="https://redirect.github.com/googleapis/python-logging/issues/985">#985</a>)</li>
<li><a
href="14364a534a"><code>14364a5</code></a>
test: Added cleanup of old sink storage buckets (<a
href="https://redirect.github.com/googleapis/python-logging/issues/991">#991</a>)</li>
<li>See full diff in <a
href="https://github.com/googleapis/python-logging/compare/v3.12.1...v3.13.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `launchdarkly-server-sdk` from 9.12.0 to 9.14.1
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/launchdarkly/python-server-sdk/releases">launchdarkly-server-sdk's
releases</a>.</em></p>
<blockquote>
<h2>v9.14.1</h2>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.14.0...9.14.1">9.14.1</a>
(2025-12-15)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Remove all synchronizers in daemon mode (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/388">#388</a>)
(<a
href="441a5ecb3d">441a5ec</a>)</li>
</ul>
<hr />
<p>This PR was generated with <a
href="https://github.com/googleapis/release-please">Release Please</a>.
See <a
href="https://github.com/googleapis/release-please#release-please">documentation</a>.</p>
<!-- raw HTML omitted -->
<h2>v9.14.0</h2>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.13.1...9.14.0">9.14.0</a>
(2025-12-04)</h2>
<h3>Features</h3>
<ul>
<li>adding data system option to create file datasource intializer (<a
href="e5b121f92a">e5b121f</a>)</li>
<li>adding file data source as an intializer (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/381">#381</a>)
(<a
href="3700d1ddd9">3700d1d</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Add warning if relying on Redis <code>max_connections</code>
parameter (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/387">#387</a>)
(<a
href="e6395fa531">e6395fa</a>),
closes <a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/386">#386</a></li>
<li>modified initializer behavior to spec (<a
href="064f65c761">064f65c</a>)</li>
</ul>
<hr />
<p>This PR was generated with <a
href="https://github.com/googleapis/release-please">Release Please</a>.
See <a
href="https://github.com/googleapis/release-please#release-please">documentation</a>.</p>
<!-- raw HTML omitted -->
<h2>v9.13.1</h2>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.13.0...9.13.1">9.13.1</a>
(2025-11-19)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Include ldclient.datasystem in docs (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/379">#379</a>)
(<a
href="318c6fea07">318c6fe</a>)</li>
</ul>
<hr />
<p>This PR was generated with <a
href="https://github.com/googleapis/release-please">Release Please</a>.
See <a
href="https://github.com/googleapis/release-please#release-please">documentation</a>.</p>
<!-- raw HTML omitted -->
<h2>v9.13.0</h2>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.12.3...9.13.0">9.13.0</a>
(2025-11-19)</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/launchdarkly/python-server-sdk/blob/main/CHANGELOG.md">launchdarkly-server-sdk's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.14.0...9.14.1">9.14.1</a>
(2025-12-15)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Remove all synchronizers in daemon mode (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/388">#388</a>)
(<a
href="441a5ecb3d">441a5ec</a>)</li>
</ul>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.13.1...9.14.0">9.14.0</a>
(2025-12-04)</h2>
<h3>Features</h3>
<ul>
<li>adding data system option to create file datasource intializer (<a
href="e5b121f92a">e5b121f</a>)</li>
<li>adding file data source as an intializer (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/381">#381</a>)
(<a
href="3700d1ddd9">3700d1d</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Add warning if relying on Redis <code>max_connections</code>
parameter (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/387">#387</a>)
(<a
href="e6395fa531">e6395fa</a>),
closes <a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/386">#386</a></li>
<li>modified initializer behavior to spec (<a
href="064f65c761">064f65c</a>)</li>
</ul>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.13.0...9.13.1">9.13.1</a>
(2025-11-19)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Include ldclient.datasystem in docs (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/379">#379</a>)
(<a
href="318c6fea07">318c6fe</a>)</li>
</ul>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.12.3...9.13.0">9.13.0</a>
(2025-11-19)</h2>
<h3>Features</h3>
<ul>
<li><strong>experimental:</strong> Release EAP support for FDv2 data
system (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/376">#376</a>)
(<a
href="0e7c32b4df">0e7c32b</a>)</li>
</ul>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.12.2...9.12.3">9.12.3</a>
(2025-10-30)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Fix overly generic type hint on File data source (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/365">#365</a>)
(<a
href="52a7499f7c">52a7499</a>),
closes <a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/364">#364</a></li>
</ul>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.12.1...9.12.2">9.12.2</a>
(2025-10-27)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Fix incorrect event count in failure message (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/359">#359</a>)
(<a
href="91f416329b">91f4163</a>)</li>
</ul>
<h2><a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.12.0...9.12.1">9.12.1</a>
(2025-09-30)</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="54e62cc706"><code>54e62cc</code></a>
chore(main): release 9.14.1 (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/389">#389</a>)</li>
<li><a
href="441a5ecb3d"><code>441a5ec</code></a>
fix: Remove all synchronizers in daemon mode (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/388">#388</a>)</li>
<li><a
href="7bb537827f"><code>7bb5378</code></a>
chore(main): release 9.14.0 (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/382">#382</a>)</li>
<li><a
href="e6395fa531"><code>e6395fa</code></a>
fix: Add warning if relying on Redis <code>max_connections</code>
parameter (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/387">#387</a>)</li>
<li><a
href="45786a9a7e"><code>45786a9</code></a>
chore: Expose flag change listeners from data system (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/384">#384</a>)</li>
<li><a
href="2b7eedc836"><code>2b7eedc</code></a>
chore: Clean up unused _data_availability (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/383">#383</a>)</li>
<li><a
href="3700d1ddd9"><code>3700d1d</code></a>
feat: adding file data source as an intializer (<a
href="https://redirect.github.com/launchdarkly/python-server-sdk/issues/381">#381</a>)</li>
<li><a
href="04a2c538e5"><code>04a2c53</code></a>
chore: PR comments</li>
<li><a
href="064f65c761"><code>064f65c</code></a>
fix: modified initializer behavior to spec</li>
<li><a
href="e5b121f92a"><code>e5b121f</code></a>
feat: adding data system option to create file datasource
intializer</li>
<li>Additional commits viewable in <a
href="https://github.com/launchdarkly/python-server-sdk/compare/9.12.0...9.14.1">compare
view</a></li>
</ul>
</details>
<br />

Updates `pydantic` from 2.11.7 to 2.12.5
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pydantic/pydantic/releases">pydantic's
releases</a>.</em></p>
<blockquote>
<h2>v2.12.5 2025-11-26</h2>
<h2>v2.12.5 (2025-11-26)</h2>
<p>This is the fifth 2.12 patch release, addressing an issue with the
<code>MISSING</code> sentinel and providing several documentation
improvements.</p>
<p>The next 2.13 minor release will be published in a couple weeks, and
will include a new <em>polymorphic serialization</em> feature addressing
the remaining unexpected changes to the <em>serialize as any</em>
behavior.</p>
<ul>
<li>Fix pickle error when using <code>model_construct()</code> on a
model with <code>MISSING</code> as a default value by <a
href="https://github.com/ornariece"><code>@​ornariece</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/12522">#12522</a>.</li>
<li>Several updates to the documentation by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a>.</li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/pydantic/pydantic/compare/v2.12.4...v2.12.5">https://github.com/pydantic/pydantic/compare/v2.12.4...v2.12.5</a></p>
<h2>v2.12.4 2025-11-05</h2>
<h2>v2.12.4 (2025-11-05)</h2>
<p>This is the fourth 2.12 patch release, fixing more regressions, and
reverting a change in the <code>build()</code> method
of the <a
href="https://docs.pydantic.dev/latest/api/networks/"><code>AnyUrl</code>
and Dsn types</a>.</p>
<p>This patch release also fixes an issue with the serialization of IP
address types, when <code>serialize_as_any</code> is used. The next
patch release
will try to address the remaining issues with <em>serialize as any</em>
behavior by introducing a new <em>polymorphic serialization</em>
feature, that
should be used in most cases in place of <em>serialize as any</em>.</p>
<ul>
<li>
<p>Fix issue with forward references in parent <code>TypedDict</code>
classes by <a href="https://github.com/Viicos"><code>@​Viicos</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic/pull/12427">#12427</a>.</p>
<p>This issue is only relevant on Python 3.14 and greater.</p>
</li>
<li>
<p>Exclude fields with <code>exclude_if</code> from JSON Schema required
fields by <a href="https://github.com/Viicos"><code>@​Viicos</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic/pull/12430">#12430</a></p>
</li>
<li>
<p>Revert URL percent-encoding of credentials in the
<code>build()</code> method of the <a
href="https://docs.pydantic.dev/latest/api/networks/"><code>AnyUrl</code>
and Dsn types</a> by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1833">pydantic-core#1833</a>.</p>
<p>This was initially considered as a bugfix, but caused regressions and
as such was fully reverted. The next release will include
an opt-in option to percent-encode components of the URL.</p>
</li>
<li>
<p>Add type inference for IP address types by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1868">pydantic-core#1868</a>.</p>
<p>The 2.12 changes to the <code>serialize_as_any</code> behavior made
it so that IP address types could not properly serialize to JSON.</p>
</li>
<li>
<p>Avoid getting default values from defaultdict by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1853">pydantic-core#1853</a>.</p>
<p>This fixes a subtle regression in the validation behavior of the <a
href="https://docs.python.org/3/library/collections.html#collections.defaultdict"><code>collections.defaultdict</code></a>
type.</p>
</li>
<li>
<p>Fix issue with field serializers on nested typed dictionaries by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1879">pydantic-core#1879</a>.</p>
</li>
<li>
<p>Add more <code>pydantic-core</code> builds for the three-threaded
version of Python 3.14 by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1864">pydantic-core#1864</a>.</p>
</li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/pydantic/pydantic/compare/v2.12.3...v2.12.4">https://github.com/pydantic/pydantic/compare/v2.12.3...v2.12.4</a></p>
<h2>v2.12.3 2025-10-17</h2>
<h2>v2.12.3 (2025-10-17)</h2>
<h3>What's Changed</h3>
<p>This is the third 2.13 patch release, fixing issues related to the
<code>FieldInfo</code> class, and reverting a change to the supported <a
href="https://docs.pydantic.dev/latest/concepts/validators/#model-validators"><em>after</em>
model validator</a> function signatures.</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pydantic/pydantic/blob/main/HISTORY.md">pydantic's
changelog</a>.</em></p>
<blockquote>
<h2>v2.12.5 (2025-11-26)</h2>
<p><a
href="https://github.com/pydantic/pydantic/releases/tag/v2.12.5">GitHub
release</a></p>
<p>This is the fifth 2.12 patch release, addressing an issue with the
<code>MISSING</code> sentinel and providing several documentation
improvements.</p>
<p>The next 2.13 minor release will be published in a couple weeks, and
will include a new <em>polymorphic serialization</em> feature addressing
the remaining unexpected changes to the <em>serialize as any</em>
behavior.</p>
<ul>
<li>Fix pickle error when using <code>model_construct()</code> on a
model with <code>MISSING</code> as a default value by <a
href="https://github.com/ornariece"><code>@​ornariece</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/12522">#12522</a>.</li>
<li>Several updates to the documentation by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a>.</li>
</ul>
<h2>v2.12.4 (2025-11-05)</h2>
<p><a
href="https://github.com/pydantic/pydantic/releases/tag/v2.12.4">GitHub
release</a></p>
<p>This is the fourth 2.12 patch release, fixing more regressions, and
reverting a change in the <code>build()</code> method
of the <a
href="https://docs.pydantic.dev/latest/api/networks/"><code>AnyUrl</code>
and Dsn types</a>.</p>
<p>This patch release also fixes an issue with the serialization of IP
address types, when <code>serialize_as_any</code> is used. The next
patch release
will try to address the remaining issues with <em>serialize as any</em>
behavior by introducing a new <em>polymorphic serialization</em>
feature, that
should be used in most cases in place of <em>serialize as any</em>.</p>
<ul>
<li>
<p>Fix issue with forward references in parent <code>TypedDict</code>
classes by <a href="https://github.com/Viicos"><code>@​Viicos</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic/pull/12427">#12427</a>.</p>
<p>This issue is only relevant on Python 3.14 and greater.</p>
</li>
<li>
<p>Exclude fields with <code>exclude_if</code> from JSON Schema required
fields by <a href="https://github.com/Viicos"><code>@​Viicos</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic/pull/12430">#12430</a></p>
</li>
<li>
<p>Revert URL percent-encoding of credentials in the
<code>build()</code> method
of the <a
href="https://docs.pydantic.dev/latest/api/networks/"><code>AnyUrl</code>
and Dsn types</a> by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1833">pydantic-core#1833</a>.</p>
<p>This was initially considered as a bugfix, but caused regressions and
as such was fully reverted. The next release will include
an opt-in option to percent-encode components of the URL.</p>
</li>
<li>
<p>Add type inference for IP address types by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1868">pydantic-core#1868</a>.</p>
<p>The 2.12 changes to the <code>serialize_as_any</code> behavior made
it so that IP address types could not properly serialize to JSON.</p>
</li>
<li>
<p>Avoid getting default values from defaultdict by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1853">pydantic-core#1853</a>.</p>
<p>This fixes a subtle regression in the validation behavior of the <a
href="https://docs.python.org/3/library/collections.html#collections.defaultdict"><code>collections.defaultdict</code></a>
type.</p>
</li>
<li>
<p>Fix issue with field serializers on nested typed dictionaries by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1879">pydantic-core#1879</a>.</p>
</li>
<li>
<p>Add more <code>pydantic-core</code> builds for the three-threaded
version of Python 3.14 by <a
href="https://github.com/davidhewitt"><code>@​davidhewitt</code></a> in
<a
href="https://redirect.github.com/pydantic/pydantic-core/pull/1864">pydantic-core#1864</a>.</p>
</li>
</ul>
<h2>v2.12.3 (2025-10-17)</h2>
<p><a
href="https://github.com/pydantic/pydantic/releases/tag/v2.12.3">GitHub
release</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="bd2d0dd013"><code>bd2d0dd</code></a>
Prepare release v2.12.5</li>
<li><a
href="7d0302ec7e"><code>7d0302e</code></a>
Document security implications when using
<code>create_model()</code></li>
<li><a
href="e9ef980def"><code>e9ef980</code></a>
Fix typo in Standard Library Types documentation</li>
<li><a
href="f2c20c00c2"><code>f2c20c0</code></a>
Add <code>pydantic-docs</code> dev dependency, make use of versioning
blocks</li>
<li><a
href="a76c1aa26f"><code>a76c1aa</code></a>
Update documentation about JSON Schema</li>
<li><a
href="8cbc72ca48"><code>8cbc72c</code></a>
Add documentation about custom <code>__init__()</code></li>
<li><a
href="99eba59906"><code>99eba59</code></a>
Add additional test for <code>FieldInfo.get_default()</code></li>
<li><a
href="c71076988e"><code>c710769</code></a>
Special case <code>MISSING</code> sentinel in
<code>smart_deepcopy()</code></li>
<li><a
href="20a9d771c2"><code>20a9d77</code></a>
Do not delete mock validator/serializer in
<code>rebuild_dataclass()</code></li>
<li><a
href="c86515a3a8"><code>c86515a</code></a>
Update parts of the model and <code>revalidate_instances</code>
documentation</li>
<li>Additional commits viewable in <a
href="https://github.com/pydantic/pydantic/compare/v2.11.7...v2.12.5">compare
view</a></li>
</ul>
</details>
<br />

Updates `pydantic-settings` from 2.10.1 to 2.12.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pydantic/pydantic-settings/releases">pydantic-settings's
releases</a>.</em></p>
<blockquote>
<h2>v2.12.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Support for enum kebab case. by <a
href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/686">pydantic/pydantic-settings#686</a></li>
<li>Apply source order: init &gt; env &gt; dotenv &gt; secrets &gt;
defaults and pres… by <a
href="https://github.com/chbndrhnns"><code>@​chbndrhnns</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/688">pydantic/pydantic-settings#688</a></li>
<li>Add NestedSecretsSettings source by <a
href="https://github.com/makukha"><code>@​makukha</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/690">pydantic/pydantic-settings#690</a></li>
<li>Strip non-explicit default values. by <a
href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/692">pydantic/pydantic-settings#692</a></li>
<li>Coerce env vars if strict is True. by <a
href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/693">pydantic/pydantic-settings#693</a></li>
<li>Restore init kwarg names before returning final state dictionary. by
<a href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/700">pydantic/pydantic-settings#700</a></li>
<li>Drop Python3.9 support by <a
href="https://github.com/hramezani"><code>@​hramezani</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/699">pydantic/pydantic-settings#699</a></li>
<li>Adapt test_protected_namespace_defaults for dev. Pydantic by <a
href="https://github.com/musicinmybrain"><code>@​musicinmybrain</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/637">pydantic/pydantic-settings#637</a></li>
<li>Add Python 3.14 by <a
href="https://github.com/hramezani"><code>@​hramezani</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/704">pydantic/pydantic-settings#704</a></li>
<li>Prepare release 2.12 by <a
href="https://github.com/hramezani"><code>@​hramezani</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/705">pydantic/pydantic-settings#705</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/chbndrhnns"><code>@​chbndrhnns</code></a> made
their first contribution in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/688">pydantic/pydantic-settings#688</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/pydantic/pydantic-settings/compare/v2.11.0...v2.12.0">https://github.com/pydantic/pydantic-settings/compare/v2.11.0...v2.12.0</a></p>
<h2>v2.11.0</h2>
<h2>What's Changed</h2>
<ul>
<li>CLI Serialize Support by <a
href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/643">pydantic/pydantic-settings#643</a></li>
<li>Inspect type aliases to determine if an annotation is complex by <a
href="https://github.com/tselepakis"><code>@​tselepakis</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/644">pydantic/pydantic-settings#644</a></li>
<li>Revert &quot;fix: Respect 'cli_parse_args' from model_config with
settings_customise_sources (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/611">#611</a>)&quot;
by <a href="https://github.com/hramezani"><code>@​hramezani</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/655">pydantic/pydantic-settings#655</a></li>
<li>Remove parsing of command line arguments from
<code>CliSettingsSource.__init__</code>. by <a
href="https://github.com/trygve-baerland"><code>@​trygve-baerland</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/656">pydantic/pydantic-settings#656</a></li>
<li>turn off allow_abbrev on subparsers by <a
href="https://github.com/mroch"><code>@​mroch</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/658">pydantic/pydantic-settings#658</a></li>
<li>CLI Serialization Fixes by <a
href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/649">pydantic/pydantic-settings#649</a></li>
<li>Fix PydanticModel type checking. by <a
href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/659">pydantic/pydantic-settings#659</a></li>
<li>Avoid env_prefix falling back to env vars without prefix by <a
href="https://github.com/tselepakis"><code>@​tselepakis</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/648">pydantic/pydantic-settings#648</a></li>
<li>Warn if model_config sets unused keys for missing settings sources
by <a href="https://github.com/HomerusJa"><code>@​HomerusJa</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/663">pydantic/pydantic-settings#663</a></li>
<li>Included endpoint_url kwarg in AWSSecretsManagerSettingsSource class
by <a href="https://github.com/adrianohrl"><code>@​adrianohrl</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/664">pydantic/pydantic-settings#664</a></li>
<li>Fix typo (&quot;Accesing&quot;) in the &quot;Adding sources&quot;
docs by <a
href="https://github.com/deepyaman"><code>@​deepyaman</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/668">pydantic/pydantic-settings#668</a></li>
<li>CLI Windows Path Fix by <a
href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/669">pydantic/pydantic-settings#669</a></li>
<li>Cli root model support by <a
href="https://github.com/kschwab"><code>@​kschwab</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/677">pydantic/pydantic-settings#677</a></li>
<li>Snake case conversion in Azure Key Vault by <a
href="https://github.com/AndreuCodina"><code>@​AndreuCodina</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/680">pydantic/pydantic-settings#680</a></li>
<li>Make <code>InitSettingsSource</code> resolution deterministic by <a
href="https://github.com/enrico-stauss"><code>@​enrico-stauss</code></a>
in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/681">pydantic/pydantic-settings#681</a></li>
<li>Update deps by <a
href="https://github.com/hramezani"><code>@​hramezani</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/683">pydantic/pydantic-settings#683</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/tselepakis"><code>@​tselepakis</code></a> made
their first contribution in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/644">pydantic/pydantic-settings#644</a></li>
<li><a
href="https://github.com/trygve-baerland"><code>@​trygve-baerland</code></a>
made their first contribution in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/656">pydantic/pydantic-settings#656</a></li>
<li><a href="https://github.com/mroch"><code>@​mroch</code></a> made
their first contribution in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/658">pydantic/pydantic-settings#658</a></li>
<li><a href="https://github.com/HomerusJa"><code>@​HomerusJa</code></a>
made their first contribution in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/663">pydantic/pydantic-settings#663</a></li>
<li><a
href="https://github.com/adrianohrl"><code>@​adrianohrl</code></a> made
their first contribution in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/664">pydantic/pydantic-settings#664</a></li>
<li><a href="https://github.com/deepyaman"><code>@​deepyaman</code></a>
made their first contribution in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/668">pydantic/pydantic-settings#668</a></li>
<li><a
href="https://github.com/enrico-stauss"><code>@​enrico-stauss</code></a>
made their first contribution in <a
href="https://redirect.github.com/pydantic/pydantic-settings/pull/681">pydantic/pydantic-settings#681</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/pydantic/pydantic-settings/compare/2.10.1...v2.11.0">https://github.com/pydantic/pydantic-settings/compare/2.10.1...v2.11.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="584983d253"><code>584983d</code></a>
Prepare release 2.12 (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/705">#705</a>)</li>
<li><a
href="6b4d87e776"><code>6b4d87e</code></a>
Add Python 3.14 (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/704">#704</a>)</li>
<li><a
href="02de5b622b"><code>02de5b6</code></a>
Adapt test_protected_namespace_defaults for dev. Pydantic (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/637">#637</a>)</li>
<li><a
href="4239ea460a"><code>4239ea4</code></a>
Drop Python3.9 support (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/699">#699</a>)</li>
<li><a
href="5008c694f6"><code>5008c69</code></a>
Restore init kwarg names before returning final state dictionary. (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/700">#700</a>)</li>
<li><a
href="4433101fef"><code>4433101</code></a>
Coerce env vars if strict is True. (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/693">#693</a>)</li>
<li><a
href="4d2ebfd543"><code>4d2ebfd</code></a>
Strip non-explicit default values. (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/692">#692</a>)</li>
<li><a
href="4a6ffcaeae"><code>4a6ffca</code></a>
Add NestedSecretsSettings source (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/690">#690</a>)</li>
<li><a
href="7a6e96ebfc"><code>7a6e96e</code></a>
Apply source order: init &gt; env &gt; dotenv &gt; secrets &gt; defaults
and pres… (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/688">#688</a>)</li>
<li><a
href="68563eddc0"><code>68563ed</code></a>
Support for enum kebab case. (<a
href="https://redirect.github.com/pydantic/pydantic-settings/issues/686">#686</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/pydantic/pydantic-settings/compare/2.10.1...v2.12.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `pyjwt` from 2.10.1 to 2.11.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/jpadilla/pyjwt/releases">pyjwt's
releases</a>.</em></p>
<blockquote>
<h2>2.11.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Fixed type error in comment by <a
href="https://github.com/shuhaib-aot"><code>@​shuhaib-aot</code></a> in
<a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1026">jpadilla/pyjwt#1026</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1018">jpadilla/pyjwt#1018</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1033">jpadilla/pyjwt#1033</a></li>
<li>Make note of use of leeway with nbf by <a
href="https://github.com/djw8605"><code>@​djw8605</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1034">jpadilla/pyjwt#1034</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1035">jpadilla/pyjwt#1035</a></li>
<li>Fixes <a
href="https://redirect.github.com/jpadilla/pyjwt/issues/964">#964</a>:
Validate key against allowed types for Algorithm family by <a
href="https://github.com/pachewise"><code>@​pachewise</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/985">jpadilla/pyjwt#985</a></li>
<li>Feat <a
href="https://redirect.github.com/jpadilla/pyjwt/issues/1024">#1024</a>:
Add iterator for PyJWKSet by <a
href="https://github.com/pachewise"><code>@​pachewise</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1041">jpadilla/pyjwt#1041</a></li>
<li>Fixes <a
href="https://redirect.github.com/jpadilla/pyjwt/issues/1039">#1039</a>:
Add iss, issuer type checks by <a
href="https://github.com/pachewise"><code>@​pachewise</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1040">jpadilla/pyjwt#1040</a></li>
<li>Fixes <a
href="https://redirect.github.com/jpadilla/pyjwt/issues/660">#660</a>:
Improve typing/logic for <code>options</code> in decode,
decode_complete; Improve docs by <a
href="https://github.com/pachewise"><code>@​pachewise</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1045">jpadilla/pyjwt#1045</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1042">jpadilla/pyjwt#1042</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1052">jpadilla/pyjwt#1052</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1053">jpadilla/pyjwt#1053</a></li>
<li>Fix <a
href="https://redirect.github.com/jpadilla/pyjwt/issues/1022">#1022</a>:
Map <code>algorithm=None</code> to &quot;none&quot; by <a
href="https://github.com/qqii"><code>@​qqii</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1056">jpadilla/pyjwt#1056</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1055">jpadilla/pyjwt#1055</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1058">jpadilla/pyjwt#1058</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1060">jpadilla/pyjwt#1060</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1061">jpadilla/pyjwt#1061</a></li>
<li>Fixes <a
href="https://redirect.github.com/jpadilla/pyjwt/issues/1047">#1047</a>:
Correct <code>PyJWKClient.get_signing_key_from_jwt</code> annotation by
<a href="https://github.com/khvn26"><code>@​khvn26</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1048">jpadilla/pyjwt#1048</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1062">jpadilla/pyjwt#1062</a></li>
<li>Fixed doc string typo in _validate_jti() function <a
href="https://redirect.github.com/jpadilla/pyjwt/issues/1063">#1063</a>
by <a
href="https://github.com/kuldeepkhatke"><code>@​kuldeepkhatke</code></a>
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1064">jpadilla/pyjwt#1064</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1065">jpadilla/pyjwt#1065</a></li>
<li>Update SECURITY.md by <a
href="https://github.com/auvipy"><code>@​auvipy</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1057">jpadilla/pyjwt#1057</a></li>
<li>Typing fix: use <code>float</code> instead of <code>int</code> for
<code>lifespan</code> and <code>timeout</code> by <a
href="https://github.com/nikitagashkov"><code>@​nikitagashkov</code></a>
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1068">jpadilla/pyjwt#1068</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1067">jpadilla/pyjwt#1067</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1071">jpadilla/pyjwt#1071</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1076">jpadilla/pyjwt#1076</a></li>
<li>Fix TYP header documentation by <a
href="https://github.com/fobiasmog"><code>@​fobiasmog</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1046">jpadilla/pyjwt#1046</a></li>
<li>doc: Document claims sub and jti by <a
href="https://github.com/cleder"><code>@​cleder</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1088">jpadilla/pyjwt#1088</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1077">jpadilla/pyjwt#1077</a></li>
<li>Bump actions/setup-python from 5 to 6 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1089">jpadilla/pyjwt#1089</a></li>
<li>Bump actions/stale from 8 to 10 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1090">jpadilla/pyjwt#1090</a></li>
<li>Bump actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1083">jpadilla/pyjwt#1083</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1091">jpadilla/pyjwt#1091</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1093">jpadilla/pyjwt#1093</a></li>
<li>[pre-commit.ci] pre-commit autoupdate by <a
href="https://github.com/pre-commit-ci"><code>@​pre-commit-ci</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1096">jpadilla/pyjwt#1096</a></li>
<li>Resolve package build warnings by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1105">jpadilla/pyjwt#1105</a></li>
<li>Support Python 3.14, and test against PyPy 3.10+ by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1104">jpadilla/pyjwt#1104</a></li>
<li>Fix a <code>SyntaxWarning</code> caused by invalid escape sequences
by <a href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a>
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1103">jpadilla/pyjwt#1103</a></li>
<li>Standardize CHANGELOG links to PRs by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1110">jpadilla/pyjwt#1110</a></li>
<li>Migrate from <code>pep517</code>, which is deprecated, to
<code>build</code> by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1108">jpadilla/pyjwt#1108</a></li>
<li>Fix incorrectly-named test suite function by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1116">jpadilla/pyjwt#1116</a></li>
<li>Fix Read the Docs builds by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1111">jpadilla/pyjwt#1111</a></li>
<li>Bump actions/download-artifact from 4 to 6 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1118">jpadilla/pyjwt#1118</a></li>
<li>Escalate test suite warnings to errors by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1107">jpadilla/pyjwt#1107</a></li>
<li>Add pyupgrade as a pre-commit hook by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1109">jpadilla/pyjwt#1109</a></li>
<li>Simplify the test suite decorators by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1113">jpadilla/pyjwt#1113</a></li>
<li>Improve coverage config and eliminate unused test suite code by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1115">jpadilla/pyjwt#1115</a></li>
<li>Build a shared wheel once in the test suite by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in <a
href="https://redirect.github.com/jpadilla/pyjwt/pull/1114">jpadilla/pyjwt#1114</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/jpadilla/pyjwt/blob/master/CHANGELOG.rst">pyjwt's
changelog</a>.</em></p>
<blockquote>
<h2><code>v2.11.0
&lt;https://github.com/jpadilla/pyjwt/compare/2.10.1...2.11.0&gt;</code>__</h2>
<p>Fixed</p>
<pre><code>
- Enforce ECDSA curve validation per RFC 7518 Section 3.4.
- Fix build system warnings by @kurtmckee in
`[#1105](https://github.com/jpadilla/pyjwt/issues/1105)
&lt;https://github.com/jpadilla/pyjwt/pull/1105&gt;`__
- Validate key against allowed types for Algorithm family in
`[#964](https://github.com/jpadilla/pyjwt/issues/964)
&lt;https://github.com/jpadilla/pyjwt/pull/964&gt;`__
- Add iterator for JWKSet in
`[#1041](https://github.com/jpadilla/pyjwt/issues/1041)
&lt;https://github.com/jpadilla/pyjwt/pull/1041&gt;`__
- Validate `iss` claim is a string during encoding and decoding by
@pachewise in `[#1040](https://github.com/jpadilla/pyjwt/issues/1040)
&lt;https://github.com/jpadilla/pyjwt/pull/1040&gt;`__
- Improve typing/logic for `options` in decode, decode_complete by
@pachewise in `[#1045](https://github.com/jpadilla/pyjwt/issues/1045)
&lt;https://github.com/jpadilla/pyjwt/pull/1045&gt;`__
- Declare float supported type for lifespan and timeout by
@nikitagashkov in
`[#1068](https://github.com/jpadilla/pyjwt/issues/1068)
&lt;https://github.com/jpadilla/pyjwt/pull/1068&gt;`__
- Fix ``SyntaxWarning``\s/``DeprecationWarning``\s caused by invalid
escape sequences by @kurtmckee in
`[#1103](https://github.com/jpadilla/pyjwt/issues/1103)
&lt;https://github.com/jpadilla/pyjwt/pull/1103&gt;`__
- Development: Build a shared wheel once to speed up test suite setup
times by @kurtmckee in
`[#1114](https://github.com/jpadilla/pyjwt/issues/1114)
&lt;https://github.com/jpadilla/pyjwt/pull/1114&gt;`__
- Development: Test type annotations across all supported Python
versions,
increase the strictness of the type checking, and remove the mypy
pre-commit hook
by @kurtmckee in `[#1112](https://github.com/jpadilla/pyjwt/issues/1112)
&lt;https://github.com/jpadilla/pyjwt/pull/1112&gt;`__
<p>Added
</code></pre></p>
<ul>
<li>Support Python 3.14, and test against PyPy 3.10 and 3.11 by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in
<code>[#1104](https://github.com/jpadilla/pyjwt/issues/1104)
&lt;https://github.com/jpadilla/pyjwt/pull/1104&gt;</code>__</li>
<li>Development: Migrate to <code>build</code> to test package building
in CI by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in
<code>[#1108](https://github.com/jpadilla/pyjwt/issues/1108)
&lt;https://github.com/jpadilla/pyjwt/pull/1108&gt;</code>__</li>
<li>Development: Improve coverage config and eliminate unused test suite
code by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in
<code>[#1115](https://github.com/jpadilla/pyjwt/issues/1115)
&lt;https://github.com/jpadilla/pyjwt/pull/1115&gt;</code>__</li>
<li>Docs: Standardize CHANGELOG links to PRs by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in
<code>[#1110](https://github.com/jpadilla/pyjwt/issues/1110)
&lt;https://github.com/jpadilla/pyjwt/pull/1110&gt;</code>__</li>
<li>Docs: Fix Read the Docs builds by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in
<code>[#1111](https://github.com/jpadilla/pyjwt/issues/1111)
&lt;https://github.com/jpadilla/pyjwt/pull/1111&gt;</code>__</li>
<li>Docs: Add example of using leeway with nbf by <a
href="https://github.com/djw8605"><code>@​djw8605</code></a> in
<code>[#1034](https://github.com/jpadilla/pyjwt/issues/1034)
&lt;https://github.com/jpadilla/pyjwt/pull/1034&gt;</code>__</li>
<li>Docs: Refactored docs with <code>autodoc</code>; added
<code>PyJWS</code> and <code>jwt.algorithms</code> docs by <a
href="https://github.com/pachewise"><code>@​pachewise</code></a> in
<code>[#1045](https://github.com/jpadilla/pyjwt/issues/1045)
&lt;https://github.com/jpadilla/pyjwt/pull/1045&gt;</code>__</li>
<li>Docs: Documentation improvements for &quot;sub&quot; and
&quot;jti&quot; claims by <a
href="https://github.com/cleder"><code>@​cleder</code></a> in
<code>[#1088](https://github.com/jpadilla/pyjwt/issues/1088)
&lt;https://github.com/jpadilla/pyjwt/pull/1088&gt;</code>__</li>
<li>Development: Add pyupgrade as a pre-commit hook by <a
href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> in
<code>[#1109](https://github.com/jpadilla/pyjwt/issues/1109)
&lt;https://github.com/jpadilla/pyjwt/pull/1109&gt;</code>__</li>
<li>Add minimum key length validation for HMAC and RSA keys (CWE-326).
Warns by default via <code>InsecureKeyLengthWarning</code> when keys are
below
minimum recommended lengths per RFC 7518 Section 3.2 (HMAC) and
NIST SP 800-131A (RSA). Pass
<code>enforce_minimum_key_length=True</code> in
options to <code>PyJWT</code> or <code>PyJWS</code> to raise
<code>InvalidKeyError</code> instead.</li>
<li>Refactor <code>PyJWT</code> to own an internal <code>PyJWS</code>
instance instead of
calling global <code>api_jws</code> functions.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="697344d259"><code>697344d</code></a>
bump up version</li>
<li><a
href="e4d0aec024"><code>e4d0aec</code></a>
fix: pre-commit</li>
<li><a
href="df9a6a0c44"><code>df9a6a0</code></a>
fix: failing test</li>
<li><a
href="2b2e53cd23"><code>2b2e53c</code></a>
fix: docs</li>
<li><a
href="635c8d89dd"><code>635c8d8</code></a>
fix: failing mypy</li>
<li><a
href="96ae3563b9"><code>96ae356</code></a>
feat: add minimum key length validation for HMAC and RSA</li>
<li><a
href="5b86227733"><code>5b86227</code></a>
fix: enforce ECDSA curve validation per RFC 7518 Section 3.4</li>
<li><a
href="04947d75dc"><code>04947d7</code></a>
Bump actions/download-artifact from 6 to 7 (<a
href="https://redirect.github.com/jpadilla/pyjwt/issues/1125">#1125</a>)</li>
<li><a
href="dd448344c3"><code>dd44834</code></a>
Fix leeway value in usage documentation (<a
href="https://redirect.github.com/jpadilla/pyjwt/issues/1124">#1124</a>)</li>
<li><a
href="407f0bde99"><code>407f0bd</code></a>
Thoroughly test type annotations, and resolve errors (<a
href="https://redirect.github.com/jpadilla/pyjwt/issues/1112">#1112</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/jpadilla/pyjwt/compare/2.10.1...2.11.0">compare
view</a></li>
</ul>
</details>
<br />

Updates `supabase` from 2.16.0 to 2.27.2
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/supabase/supabase-py/releases">supabase's
releases</a>.</em></p>
<blockquote>
<h2>v2.27.2</h2>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.27.1...v2.27.2">2.27.2</a>
(2026-01-14)</h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>ci:</strong> generate new token for release-please (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1348">#1348</a>)
(<a
href="c2ad37f9dc">c2ad37f</a>)</li>
<li><strong>ci:</strong> run CI when .github files change (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1349">#1349</a>)
(<a
href="a221aac029">a221aac</a>)</li>
<li><strong>realtime:</strong> ammend reconnect logic to not unsubscribe
(<a
href="https://redirect.github.com/supabase/supabase-py/issues/1346">#1346</a>)
(<a
href="cfbe5943cb">cfbe594</a>)</li>
</ul>
<h2>v2.27.1</h2>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.27.0...v2.27.1">2.27.1</a>
(2026-01-06)</h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>realtime:</strong> use 'event' instead of 'events' in
postgres_changes protocol (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1339">#1339</a>)
(<a
href="c1e7986c5e">c1e7986</a>)</li>
<li><strong>storage:</strong> catch bad responses from server (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1344">#1344</a>)
(<a
href="ddb50547db">ddb5054</a>)</li>
</ul>
<h2>v2.27.0</h2>
<h2><a
href="https://github.com/supabase/supabase-py/compare/v2.26.0...v2.27.0">2.27.0</a>
(2025-12-16)</h2>
<h3>Features</h3>
<ul>
<li><strong>auth:</strong> add X (OAuth 2.0) provider (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1335">#1335</a>)
(<a
href="f600f96b52">f600f96</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>storage:</strong> replace deprecated pydantic Extra with
literal values (<a
href="https://redirect.github.com/supabase/supabase-py/issues/1334">#1334</a>)
(<a
href="6df3545785">6df3545</a>)</li>
</ul>
<h2>v2.26....

_Description has been truncated_

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-07 02:17:38 +00:00
Reinier van der Leer
8fddc9d71f fix(backend): Reduce GET /api/graphs expense + latency (#11986)
[SECRT-1896: Fix crazy `GET /api/graphs` latency (P95 =
107s)](https://linear.app/autogpt/issue/SECRT-1896)

These changes should decrease latency of this endpoint by ~~60-65%~~ a
lot.

### Changes 🏗️

- Make `Graph.credentials_input_schema` cheaper by avoiding constructing
a new `BlockSchema` subclass
- Strip down `GraphMeta` - drop all computed fields
- Replace with either `GraphModel` or `GraphModelWithoutNodes` wherever
those computed fields are used
- Simplify usage in `list_graphs_paginated` and
`fetch_graph_from_store_slug`
- Refactor and clarify relationships between the different graph models
  - Split `BaseGraph` into `GraphBaseMeta` + `BaseGraph`
- Strip down `Graph` - move `credentials_input_schema` and
`aggregate_credentials_inputs` to `GraphModel`
- Refactor to eliminate double `aggregate_credentials_inputs()` call in
`credentials_input_schema` call tree
  - Add `GraphModelWithoutNodes` (similar to current `GraphMeta`)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `GET /api/graphs` works as it should
  - [x] Running a graph succeeds
  - [x] Adding a sub-agent in the Builder works as it should
2026-02-06 19:13:21 +00:00
Ubbe
3d1cd03fc8 ci(frontend): disable chromatic for this month (#11994)
### Changes 🏗️

- we react the max snapshots quota and don't wanna upgrade
- make it run (when re-enabled) on `src/components` changes only to
reduce snapshots

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] CI hope for the best
2026-02-06 19:17:25 +07:00
Swifty
e7ebe42306 fix(frontend): Revert ThinkingMessage progress bar delay to original values (#11993) 2026-02-06 12:23:32 +01:00
Otto
e0fab7e34e fix(frontend): Improve clarification answer message formatting (#11985)
## Summary

Improves the auto-generated message format when users submit
clarification answers in the agent generator.

## Before

```
I have the answers to your questions:

keyword_1: User answer 1
keyword_2: User answer 2

Please proceed with creating the agent.
```
<img width="748" height="153" alt="image"
src="https://github.com/user-attachments/assets/7231aaab-8ea4-406b-ba31-fa2b6055b82d"
/>

## After

```
**Here are my answers:**

> What is the primary purpose?

User answer 1

> What is the target audience?

User answer 2

Please proceed with creating the agent.
```
<img width="619" height="352" alt="image"
src="https://github.com/user-attachments/assets/ef8c1fbf-fb60-4488-b51f-407c1b9e3e44"
/>


## Changes

- Use human-readable question text instead of machine-readable keywords
- Use blockquote format for questions (natural "quote and reply"
pattern)
- Use double newlines for proper Markdown paragraph breaks
- Iterate over `message.questions` array to preserve original question
order
- Move handler inside conditional block for proper TypeScript type
narrowing

## Why

- The old format was ugly and hard to read (raw keywords, no line
breaks)
- The new format uses a natural "quoting and replying" pattern
- Better readability for both users and the LLM (verified: backend does
NOT parse keywords)

## Linear Ticket

Fixes [SECRT-1822](https://linear.app/autogpt/issue/SECRT-1822)

## Testing

- [ ] Trigger agent creation that requires clarifying questions
- [ ] Fill out the form and submit
- [ ] Verify message appears with new blockquote format
- [ ] Verify questions appear in original order
- [ ] Verify agent generation proceeds correctly

Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
2026-02-06 08:41:06 +00:00
Nicholas Tindle
29ee85c86f fix: add virus scanning to WorkspaceManager.write_file() (#11990)
## Summary

Adds virus scanning at the `WorkspaceManager.write_file()` layer for
defense in depth.

## Problem

Previously, virus scanning was only performed at entry points:
- `store_media_file()` in `backend/util/file.py`
- `WriteWorkspaceFileTool` in
`backend/api/features/chat/tools/workspace_files.py`

This created a trust boundary where any new caller of
`WorkspaceManager.write_file()` would need to remember to scan first.

## Solution

Add `scan_content_safe()` call directly in
`WorkspaceManager.write_file()` before persisting to storage. This
ensures all content is scanned regardless of the caller.

## Changes

- Added import for `scan_content_safe` from `backend.util.virus_scanner`
- Added virus scan call after file size validation, before storage

## Testing

Existing tests should pass. The scan is a no-op in test environments
where ClamAV isn't running.

Closes https://linear.app/autogpt/issue/OPEN-2993

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Introduces a new required async scan step in the workspace write path,
which can add latency or cause new failures if the scanner/ClamAV is
misconfigured or unavailable.
> 
> **Overview**
> Adds a **defense-in-depth** virus scan to
`WorkspaceManager.write_file()` by invoking `scan_content_safe()` after
file-size validation and before any storage/database persistence.
> 
> This centralizes scanning so any caller writing workspace files gets
the same malware check without relying on upstream entry points to
remember to scan.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
0f5ac68b92. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2026-02-06 04:38:32 +00:00
Nicholas Tindle
85b6520710 feat(blocks): Add video editing blocks (#11796)
<!-- Clearly explain the need for these changes: -->
This PR adds general-purpose video editing blocks for the AutoGPT
Platform, enabling automated video production workflows like documentary
creation, marketing videos, tutorial assembly, and content repurposing.

### Changes 🏗️

<!-- Concisely describe all of the changes made in this pull request:
-->

**New blocks added in `backend/blocks/video/`:**
- `VideoDownloadBlock` - Download videos from URLs (YouTube, Vimeo, news
sites, direct links) using yt-dlp
- `VideoClipBlock` - Extract time segments from videos with start/end
time validation
- `VideoConcatBlock` - Merge multiple video clips with optional
transitions (none, crossfade, fade_black)
- `VideoTextOverlayBlock` - Add text overlays/captions with positioning
and timing options
- `VideoNarrationBlock` - Generate AI narration via ElevenLabs and mix
with video audio (replace, mix, or ducking modes)

**Dependencies required:**
- `yt-dlp` - For video downloading
- `moviepy` - For video editing operations

**Implementation details:**
- All blocks follow the SDK pattern with proper error handling and
exception chaining
- Proper resource cleanup in `finally` blocks to prevent memory leaks
- Input validation (e.g., end_time > start_time)
- Test mocks included for CI

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Blocks follow the SDK pattern with
`BlockSchemaInput`/`BlockSchemaOutput`
  - [x] Resource cleanup is implemented in `finally` blocks
  - [x] Exception chaining is properly implemented
  - [x] Input validation is in place
  - [x] Test mocks are provided for CI environments

#### For configuration changes:
- [ ] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [ ] I have included a list of my configuration changes in the PR
description (under **Changes**)

N/A - No configuration changes required.


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Adds new multimedia blocks that invoke ffmpeg/MoviePy and introduces
new external dependencies (plus container packages), which can impact
runtime stability and resource usage; download/overlay blocks are
present but disabled due to sandbox/policy concerns.
> 
> **Overview**
> Adds a new `backend.blocks.video` module with general-purpose video
workflow blocks (download, clip, concat w/ transitions, loop, add-audio,
text overlay, and ElevenLabs-powered narration), including shared
utilities for codec selection, filename cleanup, and an ffmpeg-based
chapter-strip workaround for MoviePy.
> 
> Extends credentials/config to support ElevenLabs
(`ELEVENLABS_API_KEY`, provider enum, system credentials, and cost
config) and adds new dependencies (`elevenlabs`, `yt-dlp`) plus Docker
runtime packages (`ffmpeg`, `imagemagick`).
> 
> Improves file/reference handling end-to-end by embedding MIME types in
`workspace://...#mime` outputs and updating frontend rendering to detect
video vs image from MIME fragments (and broaden supported audio/video
extensions), with optional enhanced output rendering behind a feature
flag in the legacy builder UI.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
da7a44d794. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 22:22:33 +00:00
Bently
bfa942e032 feat(platform): Add Claude Opus 4.6 model support (#11983)
## Summary
Adds support for Anthropic's newly released Claude Opus 4.6 model.

## Changes
- Added `claude-opus-4-6` to the `LlmModel` enum
- Added model metadata: 200K context window (1M beta), **128K max output
tokens**
- Added block cost config (same pricing tier as Opus 4.5: $5/MTok input,
$25/MTok output)
- Updated chat config default model to Claude Opus 4.6

## Model Details
From [Anthropic's
docs](https://docs.anthropic.com/en/docs/about-claude/models):
- **API ID:** `claude-opus-4-6`
- **Context window:** 200K tokens (1M beta)
- **Max output:** 128K tokens (up from 64K on Opus 4.5)
- **Extended thinking:** Yes
- **Adaptive thinking:** Yes (new, Opus 4.6 exclusive)
- **Knowledge cutoff:** May 2025 (reliable), Aug 2025 (training)
- **Pricing:** $5/MTok input, $25/MTok output (same as Opus 4.5)

---------

Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
2026-02-05 19:19:51 +00:00
Otto
11256076d8 fix(frontend): Rename "Tasks" tab to "Agents" in navbar (#11982)
## Summary
Renames the "Tasks" tab in the navbar to "Agents" per the Figma design.

## Changes
- `Navbar.tsx`: Changed label from "Tasks" to "Agents"

<img width="1069" height="153" alt="image"
src="https://github.com/user-attachments/assets/3869d2a2-9bd9-4346-b650-15dabbdb46c4"
/>


## Why
- "Tasks" was incorrectly named and confusing for users trying to find
their agent builds
- Matches the Figma design

## Linear Ticket
Fixes [SECRT-1894](https://linear.app/autogpt/issue/SECRT-1894)

## Related
- [SECRT-1865](https://linear.app/autogpt/issue/SECRT-1865) - Find and
Manage Existing/Unpublished or Recent Agent Builds Is Unintuitive
2026-02-05 17:54:39 +00:00
Bently
3ca2387631 feat(blocks): Implement Text Encode block (#11857)
## Summary
Implements a `TextEncoderBlock` that encodes plain text into escape
sequences (the reverse of `TextDecoderBlock`).

## Changes

### Block Implementation
- Added `encoder_block.py` with `TextEncoderBlock` in
`autogpt_platform/backend/backend/blocks/`
- Uses `codecs.encode(text, "unicode_escape").decode("utf-8")` for
encoding
- Mirrors the structure and patterns of the existing `TextDecoderBlock`
- Categorised as `BlockCategory.TEXT`

### Documentation
- Added Text Encoder section to
`docs/integrations/block-integrations/text.md` (the auto-generated docs
file for TEXT category blocks)
- Expanded "How it works" with technical details on the encoding method,
validation, and edge cases
- Added 3 structured use cases per docs guidelines: JSON payload
preparation, Config/ENV generation, Snapshot fixtures
- Added Text Encoder to the overview table in
`docs/integrations/README.md`
- Removed standalone `encoder_block.md` (TEXT category blocks belong in
`text.md` per `CATEGORY_FILE_MAP` in `generate_block_docs.py`)

### Documentation Formatting (CodeRabbit feedback)
- Added blank lines around markdown tables (MD058)
- Added `text` language tags to fenced code blocks (MD040)
- Restructured use case section with bold headings per coding guidelines

## How Docs Were Synced
The `check-docs-sync` CI job runs `poetry run python
scripts/generate_block_docs.py --check` which expects blocks to be
documented in category-grouped files. Since `TextEncoderBlock` uses
`BlockCategory.TEXT`, the `CATEGORY_FILE_MAP` maps it to `text.md` — not
a standalone file. The block entry was added to `text.md` following the
exact format used by the generator (with `<!-- MANUAL -->` markers for
hand-written sections).

## Related Issue
Fixes #11111

---------

Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: lif <19658300+majiayu000@users.noreply.github.com>
Co-authored-by: Aryan Kaul <134673289+aryancodes1@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Nick Tindle <nick@ntindle.com>
2026-02-05 17:31:02 +00:00
Otto
ed07f02738 fix(copilot): edit_agent updates existing agent instead of creating duplicate (#11981)
## Summary

When editing an agent via CoPilot's `edit_agent` tool, the code was
always creating a new `LibraryAgent` entry instead of updating the
existing one to point to the new graph version. This caused duplicate
agents to appear in the user's library.

## Changes

In `save_agent_to_library()`:
- When `is_update=True`, now checks if there's an existing library agent
for the graph using `get_library_agent_by_graph_id()`
- If found, uses `update_agent_version_in_library()` to update the
existing library agent to point to the new version
- Falls back to creating a new library agent if no existing one is found
(e.g., if editing a graph that wasn't added to library yet)

## Testing

- Verified lint/format checks pass
- Plan reviewed and approved by Staff Engineer Plan Reviewer agent

## Related

Fixes [SECRT-1857](https://linear.app/autogpt/issue/SECRT-1857)

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-02-05 15:02:26 +00:00
Swifty
b121030c94 feat(frontend): Add progress indicator during agent generation [SECRT-1883] (#11974)
## Summary
- Add asymptotic progress bar that appears during long-running chat
tasks
- Progress bar shows after 10 seconds with "Working on it..." label and
percentage
- Uses half-life formula: ~50% at 30s, ~75% at 60s, ~87.5% at 90s, etc.
- Creates the classic "game loading bar" effect that never reaches 100%



https://github.com/user-attachments/assets/3c59289e-793c-4a08-b3fc-69e1eef28b1f



## Test plan
- [x] Start a chat that triggers agent generation
- [x] Wait 10+ seconds for the progress bar to appear
- [x] Verify progress bar is centered with label and percentage
- [x] Verify progress follows expected timing (~50% at 30s)
- [x] Verify progress bar disappears when task completes

---------

Co-authored-by: Otto <otto@agpt.co>
2026-02-05 15:37:51 +01:00
Swifty
c22c18374d feat(frontend): Add ready-to-test prompt after agent creation [SECRT-1882] (#11975)
## Summary
- Add special UI prompt when agent is successfully created in chat
- Show "Agent Created Successfully" with agent name
- Provide two action buttons:
- **Run with example values**: Sends chat message asking AI to run with
placeholders
- **Run with my inputs**: Opens RunAgentModal for custom input
configuration
- After run/schedule, automatically send chat message with execution
details for AI monitoring



https://github.com/user-attachments/assets/b11e118c-de59-4b79-a629-8bd0d52d9161



## Test plan
- [x] Create an agent through chat
- [x] Verify "Agent Created Successfully" prompt appears
- [x] Click "Run with example values" - verify chat message is sent
- [x] Click "Run with my inputs" - verify RunAgentModal opens
- [x] Fill inputs and run - verify chat message with execution ID is
sent
- [x] Fill inputs and schedule - verify chat message with schedule
details is sent

---------

Co-authored-by: Otto <otto@agpt.co>
2026-02-05 15:37:31 +01:00
Swifty
e40233a3ac fix(backend/chat): Guide find_agent users toward action with CTAs (#11976)
When users search for agents, guide them toward creating custom agents
if no results are found or after showing results. This improves user
engagement by offering a clear next step.

### Changes 🏗️

- Updated `agent_search.py` to add CTAs in search responses
- Added messaging to inform users they can create custom agents based on
their needs
- Applied to both "no results found" and "agents found" scenarios

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Search for agents in marketplace with matching results
  - [x] Search for agents in marketplace with no results
  - [x] Search for agents in library with matching results  
  - [x] Search for agents in library with no results
  - [x] Verify CTA message appears in all cases

---------

Co-authored-by: Otto <otto@agpt.co>
2026-02-05 15:36:55 +01:00
Swifty
3ae5eabf9d fix(backend/chat): Use latest prompt label in non-production environments (#11977)
In non-production environments, the chat service now fetches prompts
with the `latest` label instead of the default production-labeled
prompt. This makes it easier to test and iterate on prompt changes in
dev/staging without needing to promote them to production first.

### Changes 🏗️

- Updated `_get_system_prompt_template()` in chat service to pass
`label="latest"` when `app_env` is not `PRODUCTION`
- Production environments continue using the default behavior
(production-labeled prompts)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified that in non-production environments, prompts with
`latest` label are fetched
- [x] Verified that production environments still use the default
(production) labeled prompts

Co-authored-by: Otto <otto@agpt.co>
2026-02-05 14:54:39 +01:00
Otto
a077ba9f03 fix(platform): YouTube block yields only error on failure (#11980)
## Summary

Fixes [SECRT-1889](https://linear.app/autogpt/issue/SECRT-1889): The
YouTube transcription block was yielding both `video_id` and `error`
when the transcript fetch failed.

## Problem

The block yielded `video_id` immediately upon extracting it from the
URL, before attempting to fetch the transcript. If the transcript fetch
failed, both outputs were present.

```python
# Before
video_id = self.extract_video_id(input_data.youtube_url)
yield "video_id", video_id  # ← Yielded before transcript attempt

transcript = self.get_transcript(video_id, credentials)  # ← Could fail here
```

## Solution

Wrap the entire operation in try/except and only yield outputs after all
operations succeed:

```python
# After
try:
    video_id = self.extract_video_id(input_data.youtube_url)
    transcript = self.get_transcript(video_id, credentials)
    transcript_text = self.format_transcript(transcript=transcript)

    # Only yield after all operations succeed
    yield "video_id", video_id
    yield "transcript", transcript_text
except Exception as e:
    yield "error", str(e)
```

This follows the established pattern in other blocks (e.g.,
`ai_image_generator_block.py`).

## Testing

- All 10 unit tests pass (`test/blocks/test_youtube.py`)
- Lint/format checks pass

Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
2026-02-05 11:51:32 +00:00
Bently
5401d54eaa fix(backend): Handle StreamHeartbeat in CoPilot stream handler (#11928)
### Changes 🏗️

Fixes **AUTOGPT-SERVER-7JA** (123 events since Jan 27, 2026).

#### Problem

`StreamHeartbeat` was added to keep SSE connections alive during
long-running tool executions (yielded every 15s while waiting). However,
the main `stream_chat_completion` handler's `elif` chain didn't have a
case for it:

```
StreamTextStart →  handled
StreamTextDelta →  handled
StreamTextEnd →  handled
StreamToolInputStart →  handled
StreamToolInputAvailable →  handled
StreamToolOutputAvailable →  handled
StreamFinish →  handled
StreamError →  handled
StreamUsage →  handled
StreamHeartbeat →  fell through to 'Unknown chunk type' error
```

This meant every heartbeat during tool execution generated a Sentry
error instead of keeping the connection alive.

#### Fix

Add `StreamHeartbeat` to the `elif` chain and yield it through. The
route handler already calls `to_sse()` on all yielded chunks, and
`StreamHeartbeat.to_sse()` correctly returns `: heartbeat\n\n` (SSE
comment format, ignored by clients but keeps proxies/load balancers
happy).

**1 file changed, 3 insertions.**
2026-02-05 12:04:46 +01:00
Otto
5ac89d7c0b fix(test): fix timing bug in test_block_credit_reset (#11978)
## Summary
Fixes the flaky `test_block_credit_reset` test that was failing on
multiple PRs with `assert 0 == 1000`.

## Root Cause
The test calls `disable_test_user_transactions()` which sets `updatedAt`
to 35 days ago from the **actual current time**. It then mocks
`time_now` to January 1st.

**The bug**: If the test runs in early February, 35 days ago is January
— the **same month** as the mocked `time_now`. The credit refill logic
only triggers when the balance snapshot is from a *different* month, so
no refill happens and the balance stays at 0.

## Fix
After calling `disable_test_user_transactions()`, explicitly set
`updatedAt` to December of the previous year. This ensures it's always
in a different month than the mocked `month1` (January), regardless of
when the test runs.

## Testing
CI will verify the fix.
2026-02-05 11:56:26 +01:00
Otto
4f908d5cb3 fix(platform): Improve Linear Search Block [SECRT-1880] (#11967)
## Summary

Implements [SECRT-1880](https://linear.app/autogpt/issue/SECRT-1880) -
Improve Linear Search Block

## Changes

### Models (`models.py`)
- Added `State` model with `id`, `name`, and `type` fields for workflow
state information
- Added `state: State | None` field to `Issue` model

### API Client (`_api.py`)
- Updated `try_search_issues()` to:
- Add `max_results` parameter (default 10, was ~50) to reduce token
usage
  - Add `team_id` parameter for team filtering
- Return `createdAt`, `state`, `project`, and `assignee` fields in
results
- Fixed `try_get_team_by_name()` to return descriptive error message
when team not found instead of crashing with `IndexError`

### Block (`issues.py`)
- Added `max_results` input parameter (1-100, default 10)
- Added `team_name` input parameter for optional team filtering
- Added `error` output field for graceful error handling
- Added categories (`PRODUCTIVITY`, `ISSUE_TRACKING`)
- Updated test fixtures to include new fields

## Breaking Changes

| Change | Before | After | Mitigation |
|--------|--------|-------|------------|
| Default result count | ~50 | 10 | Users can set `max_results` up to
100 if needed |

## Non-Breaking Changes

- `state` field added to `Issue` (optional, defaults to `None`)
- `max_results` param added (has default value)
- `team_name` param added (optional, defaults to `None`)
- `error` output added (follows established pattern from GitHub blocks)

## Testing

- [x] Format/lint checks pass
- [x] Unit test fixtures updated

Resolves SECRT-1880

---------

Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Toran Bruce Richards <Torantulino@users.noreply.github.com>
2026-02-04 22:54:46 +00:00
Reinier van der Leer
c1aa684743 fix(platform/chat): Filter host-scoped credentials for run_agent tool (#11905)
- Fixes [SECRT-1851: \[Copilot\] `run_agent` tool doesn't filter
host-scoped credentials](https://linear.app/autogpt/issue/SECRT-1851)
- Follow-up to #11881

### Changes 🏗️

- Filter host-scoped credentials for `run_agent` tool
- Tighten validation on host input field in `HostScopedCredentialsModal`
- Use netloc (w/ port) rather than just hostname (w/o port) as host
scope

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - Create graph that requires host-scoped credentials to work
  - Create host-scoped credentials with a *different* host
  - Try to have Copilot run the graph
  - [x] -> no matching credentials available
  - Create new credentials
  - [x] -> works

---------

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-02-04 16:27:14 +00:00
Otto
7e5b84cc5c fix(copilot): update homepage copy to focus on problem discovery (#11956)
## Summary
Update the CoPilot homepage to shift from "what do you want to
automate?" to "tell me about your problems." This lowers the barrier to
engagement by letting users describe their work frustrations instead of
requiring them to identify automations themselves.

## Changes
| Element | Before | After |
|---------|--------|-------|
| Headline | "What do you want to automate?" | "Tell me about your work
— I'll find what to automate." |
| Placeholder | "You can search or just ask - e.g. 'create a blog post
outline'" | "What's your role and what eats up most of your day? e.g.
'I'm a real estate agent and I hate...'" |
| Button 1 | "Show me what I can automate" | "I don't know where to
start, just ask me stuff" |
| Button 2 | "Design a custom workflow" | "I do the same thing every
week and it's killing me" |
| Button 3 | "Help me with content creation" | "Help me find where I'm
wasting my time" |
| Container | max-w-2xl | max-w-3xl |

> **Note on container width:** The `max-w-2xl` → `max-w-3xl` change is
just to keep the longer headline on one line. This works but may not be
the ideal solution — @lluis-xai should advise on the proper approach.

## Why This Matters
The current UX assumes users know what they want to automate. In
reality, most users know what frustrates them but can't identify
automations. The current screen blocks Otto from starting the discovery
conversation that leads to useful recommendations.

## Files Changed
- `autogpt_platform/frontend/src/app/(platform)/copilot/page.tsx` —
headline, placeholder, container width
- `autogpt_platform/frontend/src/app/(platform)/copilot/helpers.ts` —
quick action button text

Resolves: [SECRT-1876](https://linear.app/autogpt/issue/SECRT-1876)

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-02-04 17:38:58 +07:00
Swifty
09cb313211 fix(frontend): Prevent reflected XSS in OAuth callback route (#11963)
## Summary

Fixes a reflected cross-site scripting (XSS) vulnerability in the OAuth
callback route.

**Security Issue:**
https://github.com/Significant-Gravitas/AutoGPT/security/code-scanning/202

### Vulnerability

The OAuth callback route at
`frontend/src/app/(platform)/auth/integrations/oauth_callback/route.ts`
was writing user-controlled data directly into an HTML response without
proper sanitization. This allowed potential attackers to inject
malicious scripts via OAuth callback parameters.

### Fix

Added a `safeJsonStringify()` function that escapes characters that
could break out of the script context:
- `<` → `\u003c`
- `>` → `\u003e`  
- `&` → `\u0026`

This prevents any user-provided values from being interpreted as
HTML/script content when embedded in the response.

### References

- [OWASP XSS Prevention Cheat
Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html)
- [CWE-79: Improper Neutralization of Input During Web Page
Generation](https://cwe.mitre.org/data/definitions/79.html)

## Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified the OAuth callback still functions correctly
- [x] Confirmed special characters in OAuth responses are properly
escaped
2026-02-04 10:53:17 +01:00
Krzysztof Czerwinski
c026485023 feat(frontend): Disable auto-opening wallet (#11961)
<!-- Clearly explain the need for these changes: -->

### Changes 🏗️

- Disable auto-opening Wallet for first time user and on credit increase
- Remove no longer needed `lastSeenCredits` state and storage

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Wallet doesn't open automatically
2026-02-04 06:11:41 +00:00
Nicholas Tindle
1eabc60484 Merge commit from fork
Fixes GHSA-rc89-6g7g-v5v7 / CVE-2026-22038

The logger.info() calls were explicitly logging API keys via
get_secret_value(), exposing credentials in plaintext logs.

Changes:
- Replace info-level credential logging with debug-level provider logging
- Remove all explicit secret value logging from observe/act/extract blocks

Co-authored-by: Otto <otto@agpt.co>
2026-02-03 11:16:57 -06:00
Swifty
f4bf492f24 feat(platform): Add Redis-based SSE reconnection for long-running CoPilot operations (#11877)
## Changes 🏗️

Adds Redis-based SSE reconnection support for long-running CoPilot
operations (like Agent Generator), enabling clients to reconnect and
resume receiving updates after disconnection.

### What this does:
- **Stream Registry** - Redis-backed task tracking with message
persistence via Redis Streams
- **SSE Reconnection** - Clients can reconnect to active tasks using
`task_id` and `last_message_id`
- **Duplicate Message Fix** - Filters out in-progress assistant messages
from session response when active stream exists
- **Completion Consumer** - Handles background task completion
notifications via Redis Streams

### Architecture:
```
1. User sends message → Backend creates task in Redis
2. SSE chunks written to Redis Stream for persistence
3. Client receives chunks via SSE subscription
4. If client disconnects → Task continues in background
5. Client reconnects → GET /sessions/{id} returns active_stream info
6. Client subscribes to /tasks/{task_id}/stream with last_message_id
7. Missed messages replayed from Redis Stream
```

### Key endpoints:
- `GET /sessions/{session_id}` - Returns `active_stream` info if task is
running
- `GET /tasks/{task_id}/stream?last_message_id=X` - SSE endpoint for
reconnection
- `GET /tasks/{task_id}` - Get task status
- `POST /operations/{op_id}/complete` - Webhook for external service
completion

### Duplicate message fix:
When `GET /sessions/{id}` detects an active stream:
1. Filters out the in-progress assistant message from response
2. Returns `last_message_id="0-0"` so client replays stream from
beginning
3. Client receives complete response only through SSE (single source of
truth)

### Frontend changes:
- Task persistence in localStorage for cross-tab reconnection
- Stream event dispatcher handles reconnection flow
- Deduplication logic prevents duplicate messages

### Testing:
- Manual testing of reconnection scenarios
- Verified duplicate message fix works correctly

## Related
- Resolves SSE timeout issues for Agent Generator
- Fixes duplicate message bug on reconnection
2026-02-03 16:52:06 +01:00
Zamil Majdy
81e48c00a4 feat(copilot): add customize_agent tool for marketplace templates (#11943)
## Summary

Adds a new copilot tool that allows users to customize
marketplace/template agents using natural language before adding them to
their library.

This exposes the Agent Generator's `/api/template-modification` endpoint
to the copilot, which was previously not available.

## Changes

- **service.py**: Add `customize_template_external` to call Agent
Generator's template modification endpoint
- **core.py**: 
  - Add `customize_template` wrapper function
- Extract `graph_to_json` as a reusable function (was previously inline
in `get_agent_as_json`)
- **customize_agent.py**: New tool that:
  - Takes marketplace agent ID (format: `creator/slug`)
  - Fetches template from store via `store_db.get_agent()`
  - Calls Agent Generator for customization
  - Handles clarifying questions from the generator
  - Saves customized agent to user's library
- **__init__.py**: Register the tool in `TOOL_REGISTRY` for
auto-discovery

## Usage Flow

1. User searches marketplace: *"Find me a newsletter agent"*
2. Copilot calls `find_agent` → returns `autogpt/newsletter-writer`
3. User: *"Customize that agent to post to Discord instead of email"*
4. Copilot calls:
   ```
   customize_agent(
       agent_id="autogpt/newsletter-writer",
       modifications="Post to Discord instead of sending email"
   )
   ```
5. Agent Generator may ask clarifying questions (e.g., "What Discord
channel?")
6. Customized agent is saved to user's library

## Test plan

- [x] Verified tool imports correctly
- [x] Verified tool is registered in `TOOL_REGISTRY`
- [x] Verified OpenAI function schema is valid
- [x] Ran existing tests (`pytest backend/api/features/chat/tools/`) -
all pass
- [x] Type checker (`pyright`) passes with 0 errors
- [ ] Manual testing with copilot (requires Agent Generator service)
2026-02-03 14:59:25 +00:00
Otto
7dc53071e8 fix(backend): Add retry and error handling to block initialization (#11946)
## Summary
Adds retry logic and graceful error handling to `initialize_blocks()` to
prevent transient DB errors from crashing server startup.

## Problem
When a transient database error occurs during block initialization
(e.g., Prisma P1017 "Server has closed the connection"), the entire
server fails to start. This is overly aggressive since:
1. Blocks are already registered in memory
2. The DB sync is primarily for tracking/schema storage
3. One flaky connection shouldn't prevent the server from starting

**Triggered by:** [Sentry
AUTOGPT-SERVER-7PW](https://significant-gravitas.sentry.io/issues/7238733543/)

## Solution
- Add retry decorator (3 attempts with exponential backoff) for DB
operations
- On failure after retries, log a warning and continue to the next block
- Blocks remain available in memory even if DB sync fails
- Log summary of any failed blocks at the end

## Changes
- `autogpt_platform/backend/backend/data/block.py`: Wrap block DB sync
in retry logic with graceful fallback

## Testing
- Existing block initialization behavior unchanged on success
- On transient DB errors: retries up to 3 times, then continues with
warning
2026-02-03 12:43:30 +00:00
Zamil Majdy
4878665c66 Merge branch 'master' into dev 2026-02-03 16:01:23 +04:00
Zamil Majdy
678ddde751 refactor(backend): unify context compression into compress_context() (#11937)
## Background

This PR consolidates and unifies context window management for the
CoPilot backend.

### Problem
The CoPilot backend had **two separate implementations** of context
window management:

1. **`service.py` → `_manage_context_window()`** - Chat service
streaming/continuation
2. **`prompt.py` → `compress_prompt()`** - Sync LLM blocks

This duplication led to inconsistent behavior, maintenance burden, and
duplicate code.

---

## Solution: Unified `compress_context()`

A single async function that handles both use cases:

| Caller | Usage | Behavior |
|--------|-------|----------|
| **Chat service** | `compress_context(msgs, client=openai_client)` |
Summarization → Truncation |
| **LLM blocks** | `compress_context(msgs, client=None)` | Truncation
only (no API call) |

---

## Strategy Order

| Step | Description | Runs When |
|------|-------------|-----------|
| **1. LLM Summarization** | Summarize old messages into single context
message, keep recent 15 | Only if `client` provided |
| **2. Content Truncation** | Progressively truncate message content
(8192→4096→...→128 tokens) | If still over limit |
| **3. Middle-out Deletion** | Delete messages one at a time from center
outward | If still over limit |
| **4. First/Last Trim** | Truncate system prompt and last message
content | Last resort |

### Why This Order?

1. **Summarization first** (if available) - Preserves semantic meaning
of old messages
2. **Content truncation before deletion** - Keeps all conversation
turns, just shorter
3. **Middle-out deletion** - More granular than dropping all old
messages at once
4. **First/last trim** - Only touch system prompt as last resort

---

## Key Fixes

| Issue | Before | After |
|-------|--------|-------|
| **Socket leak** | `AsyncOpenAI` client never closed | `async with`
context manager |
| **Timeout ignored** | `timeout=30` passed to `create()` (invalid) |
`client.with_options(timeout=30)` |
| **OpenAI tool messages** | Not truncated | Properly truncated |
| **Tool pair integrity** | OpenAI format only | Both OpenAI + Anthropic
formats |

---

## Tool Format Support

`_ensure_tool_pairs_intact()` now supports both formats:

### OpenAI Format
```python
# Assistant with tool_calls
{"role": "assistant", "tool_calls": [{"id": "call_1", ...}]}
# Tool response
{"role": "tool", "tool_call_id": "call_1", "content": "result"}
```

### Anthropic Format
```python
# Assistant with tool_use
{"role": "assistant", "content": [{"type": "tool_use", "id": "toolu_1", ...}]}
# Tool result
{"role": "user", "content": [{"type": "tool_result", "tool_use_id": "toolu_1", ...}]}
```

---

## Files Changed

| File | Change |
|------|--------|
| `backend/util/prompt.py` | +450 lines: Add `CompressResult`,
`compress_context()`, helpers |
| `backend/api/features/chat/service.py` | -380 lines: Remove duplicate,
use thin wrapper |
| `backend/blocks/llm.py` | Migrate `llm_call()` to use
`compress_context(client=None)` |
| `backend/util/prompt_test.py` | +400 lines: Comprehensive tests
(OpenAI + Anthropic) |

### Removed
- `compress_prompt()` - Replaced by `compress_context(client=None)`
- `_manage_context_window()` - Replaced by
`compress_context(client=openai_client)`

---

## API

```python
async def compress_context(
    messages: list[dict],
    target_tokens: int = 120_000,
    *,
    model: str = "gpt-4o",
    client: AsyncOpenAI | None = None,  # None = truncation only
    keep_recent: int = 15,
    reserve: int = 2_048,
    start_cap: int = 8_192,
    floor_cap: int = 128,
) -> CompressResult:
    ...

@dataclass
class CompressResult:
    messages: list[dict]
    token_count: int
    was_compacted: bool
    error: str | None = None
    original_token_count: int = 0
    messages_summarized: int = 0
    messages_dropped: int = 0
```

---

## Tests Added

| Test Class | Coverage |
|------------|----------|
| `TestMsgTokens` | Token counting for regular messages, OpenAI tool
calls, Anthropic tool_use |
| `TestTruncateToolMessageContent` | OpenAI + Anthropic tool message
truncation |
| `TestEnsureToolPairsIntact` | OpenAI format (3 tests), Anthropic
format (3 tests), edge cases (3 tests) |
| `TestCompressContext` | No compression, truncation-only, tool pair
preservation, error handling |

---

## Checklist

- [x] Code follows project conventions
- [x] Linting passes (`poetry run format`)
- [x] Type checking passes (`pyright`)
- [x] Tests added for all new functions
- [x] Both OpenAI and Anthropic tool formats supported
- [x] Backward compatible behavior preserved
- [x] All review comments addressed
2026-02-03 10:36:10 +00:00
Otto
aef6f57cfd fix(scheduler): route db calls through DatabaseManager (#11941)
## Summary

Routes `increment_onboarding_runs` and `cleanup_expired_oauth_tokens`
through the DatabaseManager RPC client instead of calling Prisma
directly.

## Problem

The Scheduler service never connects its Prisma client. While
`add_graph_execution()` in `utils.py` has a fallback that routes through
DatabaseManager when Prisma isn't connected, subsequent calls in the
scheduler were hitting Prisma directly:

- `increment_onboarding_runs()` after successful graph execution
- `cleanup_expired_oauth_tokens()` in the scheduled job

These threw `ClientNotConnectedError`, caught by generic exception
handlers but spamming Sentry (~696K events since December per the
original analysis in #11926).

## Solution

Follow the same pattern as `utils.py`:
1. Add `cleanup_expired_oauth_tokens` to `DatabaseManager` and
`DatabaseManagerAsyncClient`
2. Update scheduler to use `get_database_manager_async_client()` for
both calls

## Changes

- **database.py**: Import and expose `cleanup_expired_oauth_tokens` in
both manager classes
- **scheduler.py**: Use `db.increment_onboarding_runs()` and
`db.cleanup_expired_oauth_tokens()` via the async client

## Impact

- Eliminates Sentry error spam from scheduler
- Onboarding run counters now actually increment for scheduled
executions
- OAuth token cleanup now actually runs

## Testing

Deploy to staging with scheduled graphs and verify:
1. No more `ClientNotConnectedError` in scheduler logs
2. `UserOnboarding.agentRuns` increments on scheduled runs
3. Expired OAuth tokens get cleaned up

Refs: #11926 (original fix that was closed)
2026-02-03 09:54:49 +00:00
Krzysztof Czerwinski
14cee1670a fix(backend): Prevent leaking Redis connections in ws_api (#11869)
Fixing
https://github.com/Significant-Gravitas/AutoGPT/pull/11297#discussion_r2496833421

### Changes 🏗️

1. event_bus.py - Added close method to AsyncRedisEventBus
- Added __init__ method to track the _pubsub instance attribute
- Added async def close() method that closes the PubSub connection
safely
- Modified listen_events() to store the pubsub reference in self._pubsub

2. ws_api.py - Added cleanup in event_broadcaster
- Wrapped the worker coroutines in try/finally block
- The finally block calls close() on both event buses to ensure cleanup
happens on any exit (including exceptions before retry)
2026-02-03 08:07:48 +00:00
Zamil Majdy
d81d1ce024 refactor(backend): extract context window management and fix LLM continuation (#11936)
## Summary

Fixes CoPilot becoming unresponsive after long-running tools complete,
and refactors context window management into a reusable function.

## Problem

After `create_agent` completes, `_generate_llm_continuation()` was
sending ALL messages to OpenRouter without any context compaction. When
conversations exceeded ~50 messages, OpenRouter rejected requests with
`provider_name: 'unknown'` (no provider would accept).

**Evidence:** Langfuse session
[44fbb803-092e-4ebd-b288-852959f4faf5](https://cloud.langfuse.com/project/cmk5qhf210003ad079sd8utjt/sessions/44fbb803-092e-4ebd-b288-852959f4faf5)
showed:
- Successful calls: 32-50 messages, known providers
- Failed calls: 52+ messages, `provider: unknown`, `completion: null`

## Changes

### Refactor: Extract reusable `_manage_context_window()`
- Counts tokens and checks against 120k threshold
- Summarizes old messages while keeping recent 15
- Ensures tool_call/tool_response pairs stay intact
- Progressive truncation if still over limit
- Returns `ContextWindowResult` dataclass with messages, token count,
compaction status, and errors
- Helper `_messages_to_dicts()` reduces code duplication

### Fix: Update `_generate_llm_continuation()`
- Now calls `_manage_context_window()` before making LLM calls
- Adds retry logic with exponential backoff (matching
`_stream_chat_chunks` behavior)

### Cleanup: Update `_stream_chat_chunks()`
- Replaced inline context management with call to
`_manage_context_window()`
- Eliminates code duplication between the two functions

## Testing

- Syntax check: 
- Ruff lint: 
- Import verification: 

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings
- [x] I have checked that my changes do not break existing functionality

---------

Co-authored-by: Otto <otto@agpt.co>
2026-02-03 04:41:43 +00:00
Zamil Majdy
2dd341c369 refactor: enrich description with context before calling Agent Generator (#11932)
## Summary
Updates the Agent Generator client to enrich the description with
context before calling, instead of sending `user_instruction` as a
separate parameter.

## Context
Companion PR to Significant-Gravitas/AutoGPT-Agent-Generator#105 which
removes unused parameters from the decompose API.

## Changes
- Enrich `description` with `context` (e.g., clarifying question
answers) before sending
- Remove `user_instruction` from request payload

## How it works
Both input boxes and chat box work the same way - the frontend
constructs a formatted message with answers and sends it as a user
message. The backend then enriches the description with this context
before calling the external Agent Generator service.
2026-02-03 02:31:07 +00:00
Otto
f7350c797a fix(copilot): use messages_dict in fallback context compaction (#11922)
## Summary

Fixes a bug where the fallback path in context compaction passes
`recent_messages` (already sliced) instead of `messages_dict` (full
conversation) to `_ensure_tool_pairs_intact`.

This caused the function to fail to find assistant messages that exist
in the original conversation but were outside the sliced window,
resulting in orphan tool_results being sent to Anthropic and rejected
with:

```
messages.66.content.0: unexpected tool_use_id found in tool_result blocks: toolu_vrtx_019bi1PDvEn7o5ByAxcS3VdA
```

## Changes

- Pass `messages_dict` and `slice_start` (relative to full conversation)
instead of `recent_messages` and `reduced_slice_start` (relative to
already-sliced list)

## Testing

This is a targeted fix for the fallback path. The bug only manifests
when:
1. Token count > 120k (triggers compaction)
2. Initial compaction + summary still exceeds limit (triggers fallback)
3. A tool_result's corresponding assistant is in `messages_dict` but not
in `recent_messages`

## Related

- Fixes SECRT-1861
- Related: SECRT-1839 (original fix that missed this code path)
2026-02-02 13:01:05 +00:00
Guofang.Tang
1081590384 feat(backend): cover webhook ingress URL route (#11747)
### Changes 🏗️

- Add a unit test to verify webhook ingress URL generation matches the
FastAPI route.

  ### Checklist 📋

  #### For code changes:

  - [x] I have clearly listed my changes in the PR description
  - [x] I have made a test plan
  - [x] I have tested my changes according to the test plan:
- [x] poetry run pytest backend/integrations/webhooks/utils_test.py
--confcutdir=backend/integrations/webhooks

  #### For configuration changes:

  - [x] .env.default is updated or already compatible with my changes
- [x] docker-compose.yml is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under Changes)



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added a unit test that validates webhook ingress URL generation
matches the application's resolved route (scheme, host, and path) for
provider-specific webhook endpoints, improving confidence in routing
behavior and helping prevent regressions.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-02-01 20:29:15 +00:00
Otto
7e37de8e30 fix: Include graph schemas for marketplace agents in Agent Generator (#11920)
## Problem

When marketplace agents are included in the `library_agents` payload
sent to the Agent Generator service, they were missing required fields
(`graph_id`, `graph_version`, `input_schema`, `output_schema`). This
caused Pydantic validation to fail with HTTP 422 Unprocessable Entity.

**Root cause:** The `MarketplaceAgentSummary` TypedDict had a different
shape than `LibraryAgentInfo` expected by the Agent Generator:
- Agent Generator expects: `graph_id`, `graph_version`, `name`,
`description`, `input_schema`, `output_schema`
- MarketplaceAgentSummary had: `name`, `description`, `sub_heading`,
`creator`, `is_marketplace_agent`

## Solution

1. **Add `agent_graph_id` to `StoreAgent` model** - The field was
already in the database view but not exposed
2. **Include `agentGraphId` in hybrid search SQL query** - Carry the
field through the search CTEs
3. **Update `search_marketplace_agents_for_generation()`** - Now fetches
full graph schemas using `get_graph()` and returns `LibraryAgentSummary`
(same type as library agents)
4. **Update deduplication logic** - Use `graph_id` instead of name for
more accurate deduplication

## Changes

- `backend/api/features/store/model.py`: Add optional `agent_graph_id`
field to `StoreAgent`
- `backend/api/features/store/hybrid_search.py`: Include `agentGraphId`
in SQL query columns
- `backend/api/features/store/db.py`: Map `agentGraphId` when creating
`StoreAgent` objects
- `backend/api/features/chat/tools/agent_generator/core.py`: Update
`search_marketplace_agents_for_generation()` to fetch and include full
graph schemas

## Testing

- [ ] Agent creation on dev with marketplace agents in context
- [ ] Verify no 422 errors from Agent Generator
- [ ] Verify marketplace agents can be used as sub-agents

Fixes: SECRT-1817

---------

Co-authored-by: majdyz <majdyz@users.noreply.github.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-01-31 19:17:36 +00:00
Otto
2abbb7fbc8 hotfix(backend): use discriminator for credential matching in run_block (#11908)
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 21:50:21 -06:00
Otto
7ee94d986c docs: add credentials prerequisites to create-basic-agent guide (#11913)
## Summary
Addresses #11785 - users were encountering `openai_api_key_credentials`
errors when following the create-basic-agent guide because it didn't
mention the need to configure API credentials before using AI blocks.

## Changes
Added a **Prerequisites** section to
`docs/platform/create-basic-agent.md` explaining:
- **Cloud users:** Go to Profile → Integrations to add API keys
- **Self-hosted (Docker):** Add keys to `autogpt_platform/backend/.env`
and restart services

Also added a note that the Calculator example doesn't need credentials,
making it a good first test.

## Related
- Issue: #11785
2026-01-31 03:05:31 +00:00
Nicholas Tindle
05b60db554 fix(backend/chat): Include input schema in discovery and validate unknown fields (#11916)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 21:00:43 -06:00
Zamil Majdy
18a1661fa3 feat: add library agent fetching with two-phase search for sub-agent support (#11889)
## Context

When users ask the chat to create agents, they may want to compose
workflows that reuse their existing agents as sub-agents. For this to
work, the Agent Generator service needs to know what agents the user has
available.

**Challenge:** Users can have large libraries with many agents. Fetching
all of them would be slow and provide too much context to the LLM.

## Solution

This PR implements **search-based library agent fetching** with a
**two-phase search** strategy:

1. **Phase 1 (Initial Search):** When the user describes their goal, we
search for relevant library agents using the goal as the search query
2. **Phase 2 (Step-Based Enrichment):** After the goal is decomposed
into steps, we extract keywords from those steps and search for
additional relevant agents

This ensures we find agents that are relevant to both the high-level
goal AND the specific steps identified.

### Example Flow

```
User goal: "Create an agent that fetches weather and sends a summary email"

Phase 1: Search for "weather email summary" → finds "Weather Fetcher" agent
Phase 2: After decomposition identifies steps like "send email notification"
         → searches "send email notification" → finds "Gmail Sender" agent
```

### Changes

**Library Agent Fetching:**
- `get_library_agents_for_generation()` - Search-based fetching from
user's library
- `search_marketplace_agents_for_generation()` - Search public
marketplace
- `get_all_relevant_agents_for_generation()` - Combines both with
deduplication

**Two-Phase Search:**
- `extract_search_terms_from_steps()` - Extracts keywords from
decomposed steps
- `enrich_library_agents_from_steps()` - Searches for additional agents
based on steps
- Integrated into `create_agent.py` as "Step 1.5" after goal
decomposition

**Type Safety:**
- Added `TypedDict` definitions: `LibraryAgentSummary`,
`MarketplaceAgentSummary`, `DecompositionStep`, `DecompositionResult`

### Design Decisions

- **Search-based, not fetch-all:** Scalable for large libraries
- **Library agents prioritized:** They have full schemas; marketplace
agents have basic info only
- **Deduplication by name and graph_id:** Prevents duplicates across
searches
- **Graceful degradation:** Failures don't block agent generation
- **Limited to 3 search terms:** Avoids excessive API calls during
enrichment

## Related PR
- Agent Generator:
https://github.com/Significant-Gravitas/AutoGPT-Agent-Generator/pull/103

## Test plan
- [x] `test_library_agents.py` - 19 tests covering all new functions
- [x] `test_service.py` - 4 tests for library_agents passthrough
- [ ] Integration test: Create agent with library sub-agent composition
2026-01-31 00:18:21 +00:00
Otto
b72521daa9 fix(readme): update broken self-hosting docs link (#11911)
## Summary
The self-hosting guide link in README.md was broken.

**Old link:** `https://docs.agpt.co/platform/getting-started/`
- Redirects to `https://agpt.co/docs/platform/getting-started`
- Returns HTTP 400 

**New link:**
`https://agpt.co/docs/platform/getting-started/getting-started`
- Works correctly 

## Changes
- Updated the self-hosting guide URL in README.md

Fixes #OPEN-2973
2026-01-30 22:59:45 +00:00
Ubbe
cc4839bedb hotfix(frontend): fix home redirect (3) (#11904)
### Changes 🏗️

Further improvements to LaunchDarkly initialisation and homepage
redirect...

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run the app locally with the flag disabled/enabled, and the
redirects work

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Ubbe <0ubbe@users.noreply.github.com>
2026-01-30 20:40:46 +07:00
Otto
dbbff04616 hotfix(frontend): LD remount (#11903)
## Changes 🏗️

Removes the `key` prop from `LDProvider` that was causing full remounts
when user context changed.

### The Problem

The `key={context.key}` prop was forcing React to unmount and remount
the entire LDProvider when switching from anonymous → logged in user:

```
1. Page loads, user loading → key="anonymous" → LD mounts → flags available 
2. User finishes loading → key="user-123" → React sees key changed
3. LDProvider UNMOUNTS → flags become undefined 
4. New LDProvider MOUNTS → initializes again → flags available 
```

This caused the flag values to cycle: `undefined → value → undefined →
value`

### The Fix

Remove the `key` prop. The LDProvider handles context changes internally
via the `context` prop, which triggers `identify()` without remounting
the provider.

## Checklist 📋

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  - [ ] Flag values don't flicker on page load
  - [ ] Flag values update correctly when logging in/out
  - [ ] No redirect race conditions

Related: SECRT-1845
2026-01-30 19:08:26 +07:00
Reinier van der Leer
350ad3591b fix(backend/chat): Filter credentials for graph execution by scopes (#11881)
[SECRT-1842: run_agent tool does not correctly use credentials - agents
fail with insufficient auth
scopes](https://linear.app/autogpt/issue/SECRT-1842)

### Changes 🏗️

- Include scopes in credentials filter in
`backend.api.features.chat.tools.utils.match_user_credentials_to_graph`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI must pass
- It's broken now and a simple change so we'll test in the dev
deployment
2026-01-30 11:01:51 +00:00
Ubbe
e6438b9a76 hotfix(frontend): use server redirect (#11900)
### Changes 🏗️

The page used a client-side redirect (`useEffect` + `router.replace`)
which only works after JavaScript loads and hydrates. On deployed sites,
if there's any delay or failure in JS execution, users see an
empty/black page because the component returns null.

**Fix:** Converted to a server-side redirect using redirect() from
next/navigation. This is a server component now, so:

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Tested locally but will see it fully working once deployed
2026-01-30 17:20:03 +07:00
Bently
de0ec3d388 chore(llm): remove deprecated Claude 3.7 Sonnet model with migration and defensive handling (#11841)
## Summary
Remove `claude-3-7-sonnet-20250219` from LLM model definitions ahead of
Anthropic's API retirement, with comprehensive migration and defensive
error handling.

## Background
Anthropic is retiring Claude 3.7 Sonnet (`claude-3-7-sonnet-20250219`)
on **February 19, 2026 at 9:00 AM PT**. This PR removes the model from
the platform and migrates existing users to prevent service
interruptions.

## Changes

### Code Changes
- Remove `CLAUDE_3_7_SONNET` enum member from `LlmModel` in `llm.py`
- Remove corresponding `ModelMetadata` entry
- Remove `CLAUDE_3_7_SONNET` from `StagehandRecommendedLlmModel` enum
- Remove `CLAUDE_3_7_SONNET` from block cost config
- Add `CLAUDE_4_5_SONNET` to `StagehandRecommendedLlmModel` enum
- Update Stagehand block defaults from `CLAUDE_3_7_SONNET` to
`CLAUDE_4_5_SONNET` (staying in Claude family)
- Add defensive error handling in `CredentialsFieldInfo.discriminate()`
for deprecated model values

### Database Migration
- Adds migration `20260126120000_migrate_claude_3_7_to_4_5_sonnet`
- Migrates `AgentNode.constantInput` model references
- Migrates `AgentNodeExecutionInputOutput.data` preset overrides

### Documentation
- Updated `docs/integrations/block-integrations/llm.md` to remove
deprecated model
- Updated `docs/integrations/block-integrations/stagehand/blocks.md` to
remove deprecated model and add Claude 4.5 Sonnet

## Notes
- Agent JSON files in `autogpt_platform/backend/agents/` still reference
this model in their provider mappings. These are auto-generated and
should be regenerated separately.

## Testing
- [ ] Verify LLM block still functions with remaining models
- [ ] Confirm no import errors in affected files
- [ ] Verify migration runs successfully
- [ ] Verify deprecated model gives helpful error message instead of
KeyError
2026-01-30 08:40:55 +00:00
Otto
e10ff8d37f fix(frontend): remove double flag check on homepage redirect (#11894)
## Changes 🏗️

Fixes the hard refresh redirect bug (SECRT-1845) by removing the double
feature flag check.

### Before (buggy)
```
/                    → checks flag → /copilot or /library
/copilot (layout)    → checks flag → /library if OFF
```

On hard refresh, two sequential LD checks created a race condition
window.

### After (fixed)
```
/                    → always redirects to /copilot
/copilot (layout)    → single flag check via FeatureFlagPage
```

Single check point = no double-check race condition.

## Root Cause

As identified by @0ubbe: the root page and copilot layout were both
checking the feature flag. On hard refresh with network latency, the
second check could fire before LaunchDarkly fully initialized, causing
users to be bounced to `/library`.

## Test Plan

- [ ] Hard refresh on `/` → should go to `/copilot` (flag ON)
- [ ] Hard refresh on `/copilot` → should stay on `/copilot` (flag ON)  
- [ ] With flag OFF → should redirect to `/library`
- [ ] Normal navigation still works

Fixes: SECRT-1845

cc @0ubbe
2026-01-30 08:32:50 +00:00
Otto
7cb1e588b0 fix(frontend): Refocus ChatInput after voice transcription completes (#11893)
## Summary
Refocuses the chat input textarea after voice transcription finishes,
allowing users to immediately use `spacebar+enter` to record and send
their prompt.

## Changes
- Added `inputId` parameter to `useVoiceRecording` hook
- After transcription completes, the input is automatically focused
- This improves the voice input UX flow

## Testing
1. Click mic button or press spacebar to record voice
2. Record a message and stop
3. After transcription completes, the input should be focused
4. User can now press Enter to send or spacebar to record again

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-01-30 14:49:05 +07:00
Otto
582c6cad36 fix(e2e): Make E2E test data deterministic and fix flaky tests (#11890)
## Summary
Fixes flaky E2E marketplace and library tests that were causing PRs to
be removed from the merge queue.

## Root Cause
1. **Test data was probabilistic** - `e2e_test_data.py` used random
chances (40% approve, then 20-50% feature), which could result in 0
featured agents
2. **Library pagination threshold wrong** - Checked `>= 10`, but page
size is 20
3. **Fixed timeouts** - Used `waitForTimeout(2000)` /
`waitForTimeout(10000)` instead of proper waits

## Changes

### Backend (`e2e_test_data.py`)
- Add guaranteed minimums: 8 featured agents, 5 featured creators, 10
top agents
- First N submissions are deterministically approved and featured
- Increase agents per user from 15 → 25 (for pagination with
page_size=20)
- Fix library agent creation to use constants instead of hardcoded `10`

### Frontend Tests
- `library.spec.ts`: Fix pagination threshold to `PAGE_SIZE` (20)
- `library.page.ts`: Replace 2s timeout with `networkidle` +
`waitForFunction`
- `marketplace.page.ts`: Add `networkidle` wait, 30s waits in
`getFirst*` methods
- `marketplace.spec.ts`: Replace 10s timeout with `waitForFunction`
- `marketplace-creator.spec.ts`: Add `networkidle` + element waits

## Related
- Closes SECRT-1848, SECRT-1849
- Should unblock #11841 and other PRs in merge queue

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
2026-01-30 05:12:35 +00:00
Nicholas Tindle
3b822cdaf7 chore(branchlet): Remove docs pip install from postCreateCmd (#11883)
### Changes 🏗️

- Removed `cd docs && pip install -r requirements.txt` from
`postCreateCmd` in `.branchlet.json`
- Docs dependencies will no longer be auto-installed during branchlet
worktree creation

### Rationale

The docs setup step was adding unnecessary overhead to the worktree
creation process. Developers who need to work on documentation can
manually install the docs requirements when needed.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified branchlet worktree creation still works without the docs
pip install step

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2026-01-30 00:31:34 +00:00
Zamil Majdy
b2eb4831bd feat(chat): improve agent generator error propagation (#11884)
## Summary
- Add helper functions in `service.py` to create standardized error
responses with `error_type` classification
- Update service functions to return error dicts instead of `None`,
preserving error details from the Agent Generator microservice
- Update `core.py` to pass through error responses properly
- Update `create_agent.py` to handle error responses with user-friendly
messages based on error type

## Error Types Now Propagated
| Error Type | Description | User Message |
|------------|-------------|--------------|
| `llm_parse_error` | LLM returned unparseable response | "The AI had
trouble understanding this request" |
| `llm_timeout` / `timeout` | Request timed out | "The request took too
long" |
| `llm_rate_limit` / `rate_limit` | Rate limited | "The service is
currently busy" |
| `validation_error` | Agent validation failed | "The generated agent
failed validation" |
| `connection_error` | Could not connect to Agent Generator | Generic
error message |
| `http_error` | HTTP error from Agent Generator | Generic error message
|
| `unknown` | Unclassified error | Generic error message |

## Motivation
This enables better debugging for issues like SECRT-1817 where
decomposition failed due to transient LLM errors but the root cause was
unclear in the logs. Now:
1. Error details from the Agent Generator microservice are preserved
2. Users get more helpful error messages based on error type
3. Debugging is easier with `error_type` in response details

## Related PR
- Agent Generator side:
https://github.com/Significant-Gravitas/AutoGPT-Agent-Generator/pull/102

## Test Plan
- [ ] Test decomposition with various error scenarios (timeout, parse
error)
- [ ] Verify user-friendly messages are shown based on error type
- [ ] Check that error details are logged properly
2026-01-29 19:53:40 +00:00
Reinier van der Leer
4cd5da678d refactor(claude): Split autogpt_platform/CLAUDE.md into project-specific files (#11788)
Split `autogpt_platform/CLAUDE.md` into project-specific files, to make
the scope of the instructions clearer.

Also, some minor improvements:

- Change references to other Markdown files to @file/path.md syntax that
Claude recognizes
- Update ambiguous/incorrect/outdated instructions
- Remove trailing slashes
- Fix broken file path references in other docs (including comments)
2026-01-29 17:33:02 +00:00
Ubbe
9538992eaf hotfix(frontend): flags copilot redirects (#11878)
## Changes 🏗️

- Refactor homepage redirect logic to always point to `/`
- the `/` route handles whether to redirect to `/copilot` or `/library`
based on flag
- Simplify `useGetFlag` checks
- Add `<FeatureFlagRedirect />` and `<FeatureFlagPage />` wrapper
components
- helpers to do 1 thing or the other, depending on chat enabled/disabled
- avoids boilerplate code, checking flagss and redirects mistakes
(especially around race conditions with LD init )

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Log in / out of AutoGPT with flag disabled/enabled
  - [x] Sign up to AutoGPT with flag disabled/enabled
  - [x] Redirects to homepage always work `/`
  - [x] Can't access Copilot with disabled flag
2026-01-29 18:13:28 +07:00
Ubbe
b94c83aacc feat(frontend): Copilot speech to text via Whisper model (#11871)
## Changes 🏗️


https://github.com/user-attachments/assets/d9c12ac0-625c-4b38-8834-e494b5eda9c0

Add a "speech to text" feature in the Chat input fox of Copilot, similar
as what you have in ChatGPT.

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run locally and try the speech to text feature as part of the chat
input box

### For configuration changes:

We need to add `OPENAI_API_KEY=` to Vercel ( used in the Front-end )
both in Dev and Prod.

- [x] `.env.default` is updated or already compatible with my changes

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 17:46:36 +07:00
Nicholas Tindle
7668c17d9c feat(platform): add User Workspace for persistent CoPilot file storage (#11867)
Implements persistent User Workspace storage for CoPilot, enabling
blocks to save and retrieve files across sessions. Files are stored in
session-scoped virtual paths (`/sessions/{session_id}/`).

Fixes SECRT-1833

### Changes 🏗️

**Database & Storage:**
- Add `UserWorkspace` and `UserWorkspaceFile` Prisma models
- Implement `WorkspaceStorageBackend` abstraction (GCS for cloud, local
filesystem for self-hosted)
- Add `workspace_id` and `session_id` fields to `ExecutionContext`

**Backend API:**
- Add REST endpoints: `GET/POST /api/workspace/files`, `GET/DELETE
/api/workspace/files/{id}`, `GET /api/workspace/files/{id}/download`
- Add CoPilot tools: `list_workspace_files`, `read_workspace_file`,
`write_workspace_file`
- Integrate workspace storage into `store_media_file()` - returns
`workspace://file-id` references

**Block Updates:**
- Refactor all file-handling blocks to use unified `ExecutionContext`
parameter
- Update media-generating blocks to persist outputs to workspace
(AIImageGenerator, AIImageCustomizer, FluxKontext, TalkingHead, FAL
video, Bannerbear, etc.)

**Frontend:**
- Render `workspace://` image references in chat via proxy endpoint
- Add "AI cannot see this image" overlay indicator

**CoPilot Context Mapping:**
- Session = Agent (graph_id) = Run (graph_exec_id)
- Files scoped to `/sessions/{session_id}/`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
- [ ] Create CoPilot session, generate image with AIImageGeneratorBlock
  - [ ] Verify image returns `workspace://file-id` (not base64)
  - [ ] Verify image renders in chat with visibility indicator
  - [ ] Verify workspace files persist across sessions
  - [ ] Test list/read/write workspace files via CoPilot tools
  - [ ] Test local storage backend for self-hosted deployments

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

🤖 Generated with [Claude Code](https://claude.ai/code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Introduces a new persistent file-storage surface area (DB tables,
storage backends, download API, and chat tools) and rewires
`store_media_file()`/block execution context across many blocks, so
regressions could impact file handling, access control, or storage
costs.
> 
> **Overview**
> Adds a **persistent per-user Workspace** (new
`UserWorkspace`/`UserWorkspaceFile` models plus `WorkspaceManager` +
`WorkspaceStorageBackend` with GCS/local implementations) and wires it
into the API via a new `/api/workspace/files/{file_id}/download` route
(including header-sanitized `Content-Disposition`) and shutdown
lifecycle hooks.
> 
> Extends `ExecutionContext` to carry execution identity +
`workspace_id`/`session_id`, updates executor tooling to clone
node-specific contexts, and updates `run_block` (CoPilot) to create a
session-scoped workspace and synthetic graph/run/node IDs.
> 
> Refactors `store_media_file()` to require `execution_context` +
`return_format` and to support `workspace://` references; migrates many
media/file-handling blocks and related tests to the new API and to
persist generated media as `workspace://...` (or fall back to data URIs
outside CoPilot), and adds CoPilot chat tools for
listing/reading/writing/deleting workspace files with safeguards against
context bloat.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
6abc70f793. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2026-01-29 05:49:47 +00:00
Nicholas Tindle
27b72062f2 Merge branch 'dev' 2026-01-28 15:17:57 -06:00
Nicholas Tindle
e0dfae5732 fix(platform): evaluate chat flag after auth for correct redirect (#11873)
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 14:58:02 -06:00
Zamil Majdy
9a79a8d257 Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT 2026-01-28 12:32:17 -06:00
Zamil Majdy
7df867d645 Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-01-28 12:29:41 -06:00
Zamil Majdy
a9bf08748b Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT 2026-01-28 12:28:48 -06:00
Zamil Majdy
d855f79874 fix(platform): reduce Sentry alert spam for expected errors (#11872)
## Summary
- Add `InvalidInputError` for validation errors (search term too long,
invalid pagination) - returns 400 instead of 500
- Remove redundant try/catch blocks in library routes - global exception
handlers already handle `ValueError`→400 and `NotFoundError`→404
- Aggregate embedding backfill errors and log once at the end instead of
per content type to prevent Sentry issue spam

## Test plan
- [x] Verify validation errors (search term >100 chars) return 400 Bad
Request
- [x] Verify NotFoundError still returns 404
- [x] Verify embedding errors are logged once at the end with aggregated
counts

Fixes AUTOGPT-SERVER-7K5, BUILDER-6NC

---------

Co-authored-by: Swifty <craigswift13@gmail.com>
2026-01-29 01:28:27 +07:00
Swifty
dac99694fe Merge branch 'release/v0.6.44' 2026-01-28 12:19:13 +01:00
Nicholas Tindle
0953983944 feat(platform): disable onboarding redirects and add $5 signup bonus (#11862)
Disable automatic onboarding redirects on signup/login while keeping the
checklist/wallet functional. Users now receive $5 (500 credits) on their
first visit to /copilot.

### Changes 🏗️

- **Frontend**: `shouldShowOnboarding()` now returns `false`, disabling
auto-redirects to `/onboarding`
- **Backend**: Added `VISIT_COPILOT` onboarding step with 500 credit
($5) reward
- **Frontend**: Copilot page automatically completes `VISIT_COPILOT`
step on mount
- **Database**: Migration to add `VISIT_COPILOT` to `OnboardingStep`
enum

NOTE: /onboarding/1-welcome -> /library now as shouldShowOnboardin is
always false

Users land directly on `/copilot` after signup/login and receive $5
invisibly (not shown in checklist UI).

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] New user signup (email/password) → lands on `/copilot`, wallet
shows 500 credits
- [x] Verified credits are only granted once (idempotent via onboarding
reward mechanism)
- [x] Existing user login (already granted flag set) → lands on
`/copilot`, no duplicate credits
  - [x] Checklist/wallet remains functional

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

No configuration changes required.

---

OPEN-2967

🤖 Generated with [Claude Code](https://claude.ai/code)


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces a new onboarding step and adjusts onboarding flow.
> 
> - Adds `VISIT_COPILOT` onboarding step (+500 credits) with DB enum
migration and API/type updates
> - Copilot page auto-completes `VISIT_COPILOT` on mount to grant the
welcome bonus
> - Changes `/onboarding/enabled` to require user context and return
`false` when `CHAT` feature is enabled (skips legacy onboarding)
> - Wallet now refreshes credits on any onboarding `step_completed`
notification; confetti limited to visible tasks
> - Test flows updated to accept redirects to `copilot`/`library` and
verify authenticated state
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
ec5a5a4dfd. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-28 07:22:46 +00:00
Zamil Majdy
0058cd3ba6 fix(frontend): auto-poll for long-running tool completion (#11866)
## Summary
Fixes the issue where the "Creating Agent" spinner doesn't auto-update
when agent generation completes - user had to refresh the browser.

**Changes:**
- **Frontend polling**: Add `onOperationStarted` callback to trigger
polling when `operation_started` is received via SSE
- **Polling backoff**: 2s, 4s, 6s, 8s... up to 30s max
- **Message deduplication**: Use content-based keys (role + content)
instead of timestamps to prevent duplicate messages
- **Message ordering**: Preserve server message order instead of
timestamp-based sorting
- **Debug cleanup**: Remove verbose console.log/console.info statements

## Test plan
- [ ] Start agent generation in copilot
- [ ] Verify "Creating Agent" spinner appears
- [ ] Wait for completion (2-5 min) WITHOUT refreshing
- [ ] Verify agent carousel appears automatically when done
- [ ] Verify no duplicate messages in chat
- [ ] Verify message order is correct (user → assistant → tool_call →
tool_response)
2026-01-28 10:03:21 +07:00
Nicholas Tindle
ea035224bc feat(copilot): Increase max_agent_runs and max_agent_schedules (#11865)
<!-- Clearly explain the need for these changes: -->
Config change to increase the max times an agent can run in the chat and
the max number of scheduels created by copilot in one chat

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Increases per-chat operational limits for Copilot.
> 
> - Bumps `max_agent_runs` default from `3` to `30` in `ChatConfig`
> - Bumps `max_agent_schedules` default from `3` to `30` in `ChatConfig`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
93cbae6d27. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2026-01-28 01:08:02 +00:00
Nicholas Tindle
62813a1ea6 Delete backend/blocks/video/__init__.py (#11864)
<!-- Clearly explain the need for these changes: -->
oops file
### Changes 🏗️

<!-- Concisely describe all of the changes made in this pull request:
-->
removes file that should have not been commited

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Removes erroneous `backend/blocks/video/__init__.py`, eliminating an
unintended `video` package.
> 
> - Deletes a placeholder comment-only file
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
3b84576c33. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2026-01-28 00:58:49 +00:00
Bently
67405f7eb9 fix(copilot): ensure tool_call/tool_response pairs stay intact during context compaction (#11863)
## Summary

Fixes context compaction breaking tool_call/tool_response pairs, causing
API validation errors.

## Problem

When context compaction slices messages with `messages[-KEEP_RECENT:]`,
a naive slice can separate an assistant message containing `tool_calls`
from its corresponding tool response messages. This causes API
validation errors like:

```
messages.0.content.1: unexpected 'tool_use_id' found in 'tool_result' blocks: orphan_12345.
Each 'tool_result' block must have a corresponding 'tool_use' block in the previous message.
```

## Solution

Added `_ensure_tool_pairs_intact()` helper function that:
1. Detects orphan tool responses in a slice (tool messages whose
`tool_call_id` has no matching assistant message)
2. Extends the slice backwards to include the missing assistant messages
3. Falls back to removing orphan tool responses if the assistant cannot
be found (edge case)

Applied this safeguard to:
- The initial `KEEP_RECENT` slice (line ~990)
- The progressive fallback slices when still over token limit (line
~1079)

## Testing

- Syntax validated with `python -m py_compile`
- Logic reviewed for correctness

## Linear

Fixes SECRT-1839

---
*Debugged by Toran & Orion in #agpt Discord*
2026-01-28 00:21:54 +00:00
Zamil Majdy
171ff6e776 feat(backend): persist long-running tool results to survive SSE disconnects (#11856)
## Summary

Agent generation (`create_agent`, `edit_agent`) can take 1-5 minutes.
Previously, if the user closed their browser tab during this time:
1. The SSE connection would die
2. The tool execution would be cancelled via `CancelledError`
3. The result would be lost - even if the agent-generator service
completed successfully

This PR ensures long-running tool operations survive SSE disconnections.

### Changes 🏗️

**Backend:**
- **base.py**: Added `is_long_running` property to `BaseTool` for tools
to opt-in to background execution
- **create_agent.py / edit_agent.py**: Set `is_long_running = True`
- **models.py**: Added `OperationStartedResponse`,
`OperationPendingResponse`, `OperationInProgressResponse` types
- **service.py**: Modified `_yield_tool_call()` to:
  - Check if tool is `is_long_running`
  - Save "pending" message to chat history immediately
  - Spawn background task that runs independently of SSE
  - Return `operation_started` immediately (don't wait)
  - Update chat history with result when background task completes
- Track running operations for idempotency (prevents duplicate ops on
refresh)
- **db.py**: Added `update_tool_message_content()` to update pending
messages
- **model.py**: Added `invalidate_session_cache()` to clear Redis after
background completion

**Frontend:**
- **useChatMessage.ts**: Added operation message types
- **helpers.ts**: Handle `operation_started`, `operation_pending`,
`operation_in_progress` response types
- **PendingOperationWidget**: New component to display operation status
with spinner
- **ChatMessage.tsx**: Render `PendingOperationWidget` for operation
messages

### How It Works

```
User Request → Save "pending" message → Spawn background task → Return immediately
                                              ↓
                                     Task runs independently of SSE
                                              ↓
                                     On completion: Update message in chat history
                                              ↓
                                     User refreshes → Loads history → Sees result
```

### User Experience

1. User requests agent creation
2. Sees "Agent creation started. You can close this tab - check your
library in a few minutes."
3. Can close browser tab safely
4. When they return, chat shows the completed result (or error)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] pyright passes (0 errors)
  - [x] TypeScript checks pass
  - [x] Formatters applied

### Test Plan

1. Start agent creation in copilot
2. Close browser tab immediately after seeing "operation_started" 
3. Wait 2-3 minutes
4. Reopen chat
5. Verify: Chat history shows completion message and agent appears in
library

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
2026-01-28 05:09:34 +07:00
Lluis Agusti
349b1f9c79 hotfix(frontend): copilot session handling refinements... 2026-01-28 02:53:45 +07:00
Lluis Agusti
277b0537e9 hotfix(frontend): copilot simplication... 2026-01-28 02:10:18 +07:00
Ubbe
071b3bb5cd fix(frontend): more copilot refinements (#11858)
## Changes 🏗️

On the **Copilot** page:

- prevent unnecessary sidebar repaints 
- show a disclaimer when switching chats on the sidebar to terminate a
current stream
- handle loading better
- save streams better when disconnecting


### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and test the above
2026-01-28 00:49:28 +07:00
Swifty
2134d777be fix(backend): exclude disabled blocks from chat search and indexing (#11854)
## Summary

Disabled blocks (e.g., webhook blocks without `platform_base_url`
configured) were being indexed and returned in chat tool search results.
This PR ensures they are properly filtered out.

### Changes 🏗️

- **find_block.py**: Skip disabled blocks when enriching search results
- **content_handlers.py**: 
  - Skip disabled blocks during embedding indexing
- Update `get_stats()` to only count enabled blocks for accurate
coverage metrics

### Why

Blocks can be disabled for various reasons (missing OAuth config, no
platform URL for webhooks, etc.). These blocks shouldn't appear in
search results since users cannot use them.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified disabled blocks are filtered from search results
  - [x] Verified disabled blocks are not indexed
  - [x] Verified stats accurately reflect enabled block count
2026-01-27 15:21:13 +00:00
Ubbe
962824c8af refactor(frontend): copilot session management stream updates (#11853)
## Changes 🏗️

- **Fix infinite loop in copilot page**
- use Zustand selectors instead of full store object to get stable
function references
- **Centralize chat streaming logic**
- move all streaming files from `providers/chat-stream/` to
`components/contextual/Chat/` for better colocation and reusability
- **Rename `copilot-store` → `copilot-page-store`**: Clarify scope
- **Fix message duplication**
- Only replay chunks from active streams (not completed ones) since
backend already provides persisted messages in `initialMessages`
- **Auto-focus chat input**
  - Focus textarea when streaming ends and input is re-enabled
- **Graceful error display**
- Render tool response errors in muted style (small text + warning icon)
instead of raw "Error: ..." text

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Navigate to copilot page - no infinite loop errors
  - [x] Start a new chat, send message, verify streaming works
- [x] Navigate away and back to a completed session - no duplicate
messages
  - [x] After stream completes, verify chat input receives focus
  - [x] Trigger a tool error - verify it displays with muted styling
2026-01-27 22:09:25 +07:00
Zamil Majdy
3e9d5d0d50 fix(backend): handle race condition in review processing gracefully (#11845)
## Summary
- Fixes race condition when multiple concurrent requests try to process
the same reviews (e.g., double-click, multiple browser tabs)
- Previously the second request would fail with "Reviews not found,
access denied, or not in WAITING status"
- Now handles this gracefully by treating already-processed reviews with
the same decision as success

## Changes
- Added `get_reviews_by_node_exec_ids()` function that fetches reviews
regardless of status
- Modified `process_all_reviews_for_execution()` to handle
already-processed reviews
- Updated route to use idempotent validation

## Test plan
- [x] Linter passes (`poetry run ruff check`)
- [x] Type checker passes (`poetry run pyright`)
- [x] Formatter passes (`poetry run format`)
- [ ] Manual testing: double-click approve button should not cause
errors

Fixes AUTOGPT-SERVER-7HE
2026-01-27 21:43:31 +07:00
Swifty
fac10c422b fix(backend): add SSE heartbeats to prevent tool execution timeouts (#11855)
## Summary

Long-running chat tools (like `create_agent` and `edit_agent`) were
timing out because no SSE data was sent during tool execution. GCP load
balancers and proxies have idle connection timeouts (~60 seconds), and
when the external Agent Generator service takes longer than this, the
connection would drop.

This PR adds SSE heartbeat comments during tool execution to keep
connections alive.

### Changes 🏗️

- **response_model.py**: Added `StreamHeartbeat` response type that
emits SSE comments (`: heartbeat\n\n`)
- **service.py**: Modified `_yield_tool_call()` to:
  - Run tool execution in a background asyncio task
  - Yield heartbeat events every 15 seconds while waiting
- Handle task failures with explicit error responses (no silent
failures)
  - Handle cancellation gracefully
- **create_agent.py**: Improved error messages with more context and
details
- **edit_agent.py**: Improved error messages with more context and
details

### How It Works

```
Tool Call → Background Task Started
     │
     ├── Every 15 seconds: yield `: heartbeat\n\n` (SSE comment)
     │
     └── Task Complete → yield tool result OR error response
```

SSE comments (`: heartbeat\n\n`) are:
- Ignored by SSE clients (don't trigger events)
- Keep TCP connections alive through proxies/load balancers
- Don't affect the AI SDK data protocol

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All chat service tests pass (17 tests)
  - [x] Verified heartbeats are sent during long tool execution
  - [x] Verified errors are properly reported to frontend
2026-01-27 15:41:58 +01:00
Bently
91c7896859 fix(backend): implement context window management for long chat sessions (#11848)
## Changes 🏗️

Implements automatic context window management to prevent chat failures
when conversations exceed token limits.

  ### Problem
- **Issue**: [SECRT-1800] Long chat conversations stop working when
context grows beyond model limits (~113k tokens observed)
- **Root Cause**: Chat service sends ALL messages to LLM without
token-aware compression, eventually exceeding Claude Opus 4.5's 200k
context window

  ### Solution
  Implements a sliding window with summarization strategy:
1. Monitors token count before sending to LLM (triggers at 120k tokens)
2. Keeps last 15 messages completely intact (preserves recent
conversation flow)
  3. Summarizes older messages using gpt-4o-mini (fast & cheap)
4. Rebuilds context: `[system_prompt] + [summary] +
[recent_15_messages]`
5. Full history preserved in database (only compresses when sending to
LLM)

  ### Changes Made
- **Added** `_summarize_messages()` helper function to create concise
summaries using gpt-4o-mini
- **Modified** `_stream_chat_chunks()` to implement token counting and
conditional summarization
- **Integrated** existing `estimate_token_count()` utility for accurate
token measurement
- **Added** graceful fallback - continues with original messages if
summarization fails

  ## Motivation and Context 🎯

Without context management, users with long chat sessions (250+
messages) experience:
  - Complete chat failure when hitting 200k token limit
  - Lost conversation context
  - Poor user experience

  This fix enables:
  -  Unlimited conversation length
  -  Transparent operation (no UX changes)
  -  Preserved conversation quality (recent messages intact)
  -  Cost-efficient (~$0.0001 per summarization)

  ## Testing 🧪

  ### Expected Behavior
  - Conversations < 120k tokens: No change (normal operation)
  - Conversations > 120k tokens:
- Log message: `Context summarized: {tokens} tokens, kept last 15
messages + summary`
    - Chat continues working smoothly
    - Recent context remains intact

  ### How to Verify
  1. Start a chat session in copilot
  2. Send 250-600 messages (or 50+ with large code blocks)
  3. Check logs for "Context summarized:" message
  4. Verify chat continues working without errors
  5. Verify conversation quality remains good

  ## Checklist 

  - [x] My code follows the style guidelines of this project
  - [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
  - [x] My changes generate no new warnings
  - [x] I have tested my changes and verified they work as expected
2026-01-27 15:37:17 +01:00
Swifty
bab436231a refactor(backend): remove Langfuse tracing from chat system (#11829)
We are removing Langfuse tracing from the chat/copilot system in favor
of using OpenRouter's broadcast feature, which keeps our codebase
simpler. Langfuse prompt management is retained for fetching system
prompts.

### Changes 🏗️

**Removed Langfuse tracing:**
- Removed `@observe` decorators from all 11 chat tool files
- Removed `langfuse.openai` wrapper (now using standard `openai` client)
- Removed `start_as_current_observation` and `propagate_attributes`
context managers from `service.py`
- Removed `update_current_trace()`, `update_current_span()`,
`span.update()` calls

**Retained Langfuse prompt management:**
- `langfuse.get_prompt()` for fetching system prompts
- `_is_langfuse_configured()` check for prompt availability
- Configuration for `langfuse_prompt_name`

**Files modified:**
- `backend/api/features/chat/service.py`
- `backend/api/features/chat/tools/*.py` (11 tool files)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified `poetry run format` passes
  - [x] Verified no `@observe` decorators remain in chat tools
- [x] Verified Langfuse prompt fetching is still functional (code
preserved)
2026-01-27 13:07:42 +01:00
Zamil Majdy
859f3f8c06 feat(frontend): implement clarification questions UI for agent generation (#11833)
## Summary
Add interactive UI to collect user answers when the agent-generator
service returns clarifying questions during agent creation/editing.

Previously, when the backend asked clarifying questions, the frontend
would just display them as text with no way for users to answer. This
caused the chat to keep retrying without the necessary context.

## Changes
- **ChatMessageData type**: Add `clarification_needed` variant with
questions field
- **ClarificationQuestionsWidget**: New component with interactive form
to collect answers
- **parseToolResponse**: Detect and parse `clarification_needed`
responses from backend
- **ChatMessage**: Render the widget when clarification is needed

## How It Works
1. User requests to create/edit agent
2. Backend returns `ClarificationNeededResponse` with list of questions
3. Frontend shows interactive form with text inputs for each question
4. User fills in answers and clicks "Submit Answers"
5. Answers are sent back as context to the tool
6. Backend receives full context and continues

## UI Features
- Shows all questions with examples (if provided)
- Input validation (all questions must be answered to submit)
- Visual feedback (checkmarks when answered)
- Numbered questions for clarity
- Submit button disabled until all answered
- Follows same design pattern as `credentials_needed` flow

## Related
- Backend support for clarification was added in #11819
- Fixes the issue shown in the screenshot where users couldn't answer
clarifying questions

## Test plan
- [ ] Test creating agent that requires clarifying questions
- [ ] Verify questions are displayed in interactive form
- [ ] Verify all questions must be answered before submitting
- [ ] Verify answers are sent back to backend as context
- [ ] Verify agent creation continues with full context
2026-01-27 09:22:30 +00:00
Swifty
d5c0f5b2df refactor(backend): remove page context from chat service (#11844)
### Background
The chat service previously supported including page context (URL and
content) in user messages. This functionality is being removed.

### Changes 🏗️

- Removed page context handling from `stream_chat_completion` in the
chat service
- User messages are now passed directly without URL/content context
injection
- Removed associated logging for page context

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verify chat functionality works without page context
  - [x] Confirm no regressions in basic chat message handling
2026-01-26 16:00:48 +00:00
Ubbe
fbc2da36e6 fix(analytics): only try to init Posthog when on cloud (#11843)
## Changes 🏗️

This prevents Posthog from being initialised locally, where we should
not be collecting analytics during local development.

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above
2026-01-26 22:54:19 +07:00
Swifty
75ecc4de92 fix(backend): enforce block disabled flag on execution endpoints (#11839)
## Summary
This PR adds security checks to prevent execution of disabled blocks
across all block execution endpoints.

- Add `disabled` flag check to main web API endpoint
(`/api/blocks/{block_id}/execute`)
- Add `disabled` flag check to external API endpoint
(`/api/blocks/{block_id}/execute`)
- Add `disabled` flag check to chat tool block execution

Previously, block execution endpoints only checked if a block existed
but did not verify the `disabled` flag, allowing any authenticated user
to execute disabled blocks.

## Test plan
- [x] Verify disabled blocks return 403 Forbidden on main API endpoint
- [x] Verify disabled blocks return 403 Forbidden on external API
endpoint
- [x] Verify disabled blocks return error response in chat tool
execution
- [x] Verify enabled blocks continue to execute normally
2026-01-26 13:56:24 +00:00
Abhimanyu Yadav
f0c2503608 feat(frontend): support multiple node execution results and accumulated data display (#11834)
### Changes 🏗️

- Refactored node execution results storage to maintain a history of
executions instead of just the latest result
- Added support for viewing accumulated output data across multiple
executions
- Implemented a cleaner UI for viewing historical execution results with
proper grouping
- Added functionality to clear execution results when starting a new run
- Created helper functions to normalize and process execution data
consistently
- Updated the NodeDataViewer component to display both latest and
historical execution data
- Added ability to view input data alongside output data in the
execution history

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Create and run a flow with multiple blocks that produce output
- [x] Verify that execution results are properly accumulated and
displayed
- [x] Run the same flow multiple times and confirm historical data is
preserved
- [x] Test the "View more data" functionality to ensure it displays all
execution history
- [x] Verify that execution results are properly cleared when starting a
new run
2026-01-26 12:33:22 +00:00
Swifty
cfb7dc5aca feat(backend): Add PostHog analytics and OpenRouter tracing to chat system (#11828)
Adds analytics tracking to the chat copilot system for better
observability of user interactions and agent operations.

### Changes 🏗️

**PostHog Analytics Integration:**
- Added `posthog` dependency (v7.6.0) to track chat events
- Created new tracking module (`backend/api/features/chat/tracking.py`)
with events:
  - `chat_message_sent` - When a user sends a message
  - `chat_tool_called` - When a tool is called (includes tool name)
  - `chat_agent_run_success` - When an agent runs successfully
  - `chat_agent_scheduled` - When an agent is scheduled
  - `chat_trigger_setup` - When a trigger is set up
- Added PostHog configuration to settings:
  - `POSTHOG_API_KEY` - API key for PostHog
- `POSTHOG_HOST` - PostHog host URL (defaults to
`https://us.i.posthog.com`)

**OpenRouter Tracing:**
- Added `user` and `session_id` fields to chat completion API calls for
OpenRouter tracing
- Added `posthogDistinctId` and `posthogProperties` (with environment)
to API calls

**Files Changed:**
- `backend/api/features/chat/tracking.py` - New PostHog tracking module
- `backend/api/features/chat/service.py` - Added user message tracking
and OpenRouter tracing
- `backend/api/features/chat/tools/__init__.py` - Added tool call
tracking
- `backend/api/features/chat/tools/run_agent.py` - Added agent
run/schedule tracking
- `backend/util/settings.py` - Added PostHog configuration fields
- `pyproject.toml` - Added posthog dependency

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified code passes linting and formatting
- [x] Verified PostHog client initializes correctly when API key is
provided
- [x] Verified tracking is gracefully skipped when PostHog is not
configured

#### For configuration changes:

- [ ] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

**New environment variables (optional):**
- `POSTHOG_API_KEY` - PostHog project API key
- `POSTHOG_HOST` - PostHog host URL (optional, defaults to US cloud)
2026-01-26 12:26:15 +00:00
Zamil Majdy
9a6e17ff52 feat(backend): add external Agent Generator service integration (#11819)
## Summary
- Add support for delegating agent generation to an external
microservice when `AGENTGENERATOR_HOST` is configured
- Falls back to built-in LLM-based implementation when not configured
(default behavior)
- Add comprehensive tests for the service client and core integration
(34 tests)

## Changes
- Add `agentgenerator_host`, `agentgenerator_port`,
`agentgenerator_timeout` settings to `backend/util/settings.py`
- Add `service.py` client for external Agent Generator API endpoints:
  - `/api/decompose-description` - Break down goals into steps
  - `/api/generate-agent` - Generate agent from instructions
  - `/api/update-agent` - Generate patches to update existing agents
  - `/api/blocks` - Get available blocks
  - `/health` - Health check
- Update `core.py` to delegate to external service when configured
- Export `is_external_service_configured` and
`check_external_service_health` from the module

## Related PRs
- Infrastructure repo:
https://github.com/Significant-Gravitas/AutoGPT-cloud-infrastructure/pull/273

## Test plan
- [x] All 34 new tests pass (`poetry run pytest test/agent_generator/
-v`)
- [ ] Deploy with `AGENTGENERATOR_HOST` configured and verify external
service is used
- [ ] Verify built-in implementation still works when
`AGENTGENERATOR_HOST` is empty
2026-01-25 04:08:56 +07:00
Zamil Majdy
fb58827c61 feat(backend;frontend): Implement node-specific auto-approval, safety popup, and race condition fixes (#11810)
## Summary

This PR implements comprehensive improvements to the human-in-the-loop
(HITL) review system, including safety features, architectural changes,
and bug fixes:

### Key Features
- **SECRT-1798: One-time safety popup** - Shows informational popup
before first run of AI-generated agents with sensitive actions/HITL
blocks
- **SECRT-1795: Auto-approval toggle UX** - Toggle in pending reviews
panel to auto-approve future actions from the same node
- **Node-specific auto-approval** - Changed from execution-specific to
node-specific using special key pattern
`auto_approve_{graph_exec_id}_{node_id}`
- **Consolidated approval checking** - Merged `check_auto_approval` into
`check_approval` using single OR query for better performance
- **Race condition prevention** - Added execution status check before
resuming to prevent duplicate execution when approving while graph is
running
- **Parallel auto-approval creation** - Uses `asyncio.gather` for better
performance when creating multiple auto-approval records

## Changes

### Backend Architecture
- **`human_review.py`**: 
- Added `check_approval()` function that checks both normal and
auto-approval in single query
- Added `create_auto_approval_record()` for node-specific auto-approval
using special key pattern
- Added `get_auto_approve_key()` helper to generate consistent
auto-approval keys
- **`review/routes.py`**: 
- Added execution status check before resuming to prevent race
conditions
- Refactored auto-approval record creation to use parallel execution
with `asyncio.gather`
  - Removed obvious comments for cleaner code
- **`review/model.py`**: Added `auto_approve_future_actions` field to
`ReviewRequest`
- **`blocks/helpers/review.py`**: Updated to use consolidated
`check_approval` via database manager client
- **`executor/database.py`**: Exposed `check_approval` through
DatabaseManager RPC for block execution context
- **`data/block.py`**: Fixed safe mode checks for sensitive action
blocks

### Frontend
- **New `AIAgentSafetyPopup`** component with localStorage-based
one-time display
- **`PendingReviewsList`**: 
  - Replaced "Approve all future actions" button with toggle
- Toggle resets data to original values and disables editing when
enabled
  - Shows warning message explaining auto-approval behavior
- **`RunAgentModal`**: Integrated safety popup before first run
- **`usePendingReviews`**: Added polling for real-time badge updates
- **`FloatingSafeModeToggle` & `SafeModeToggle`**: Simplified visibility
logic
- **`local-storage.ts`**: Added localStorage key for popup state
tracking

### Bug Fixes
- Fixed "Client is not connected to query engine" error by using
database manager client pattern
- Fixed race condition where approving reviews while graph is RUNNING
could queue execution twice
- Fixed migration to only drop FK constraint, not non-existent column
- Fixed card data reset when auto-approve toggle changes

### Code Quality
- Removed duplicate/obvious comments
- Moved imports to top-level instead of local scope in tests
- Used walrus operator for cleaner conditional assignments
- Parallel execution for auto-approval record creation

## Test plan
- [ ] Create an AI-generated agent with sensitive actions (e.g., email
sending)
- [ ] First run should show the safety popup before starting
- [ ] Subsequent runs should not show the popup
- [ ] Clear localStorage (`AI_AGENT_SAFETY_POPUP_SHOWN`) to verify popup
shows again
- [ ] Create an agent with human-in-the-loop blocks
- [ ] Run it and verify the pending reviews panel appears
- [ ] Enable the "Auto-approve all future actions" toggle
- [ ] Verify editing is disabled and shows warning message
- [ ] Click "Approve" and verify subsequent blocks from same node
auto-approve
- [ ] Verify auto-approval persists across multiple executions of same
graph
- [ ] Disable toggle and verify editing works normally
- [ ] Verify "Reject" button still works regardless of toggle state
- [ ] Test race condition: Approve reviews while graph is RUNNING
(should skip resume)
- [ ] Test race condition: Approve reviews while graph is REVIEW (should
resume)
- [ ] Verify pending reviews badge updates in real-time when new reviews
are created
2026-01-25 04:05:25 +07:00
Zamil Majdy
595f3508c1 refactor(backend): consolidate embedding error logging to prevent Sentry spam (#11832)
## Summary

Refactors error handling in the embedding service to prevent Sentry
alert spam. Previously, batch operations would log one error per failed
file, causing hundreds of duplicate alerts. Now, exceptions bubble up
from individual functions and are aggregated at the batch level,
producing a single log entry showing all unique error types with counts.

## Changes

### Removed Error Swallowing
- Removed try/except blocks from `generate_embedding()`,
`store_content_embedding()`, `ensure_content_embedding()`,
`get_content_embedding()`, and `ensure_embedding()`
- These functions now raise exceptions instead of returning None/False
on failure
- Added docstring notes: "Raises exceptions on failure - caller should
handle"

### Improved Batch Error Aggregation
- Updated `backfill_all_content_types()` to aggregate unique errors
- Collects all exceptions from batch results
- Groups by error type and message, shows counts
- Single log entry per content type instead of per-file

### Example Output
Before: 50 separate error logs for same issue
After: `BLOCK: 50/100 embeddings failed. Errors: PrismaError: type
vector does not exist (50x)`

## Motivation

This was triggered by the AUTOGPT-SERVER-7D2 Sentry issue where pgvector
errors created hundreds of duplicate alerts. Even after the root cause
was fixed (stale database connections), the error logging pattern would
create spam for any future issues.

## Impact

-  Reduces Sentry noise - single alert per batch instead of per-file
-  Better diagnostics - shows all unique error types with counts
-  Cleaner code - removed ~24 lines of unnecessary error swallowing
-  Proper exception propagation follows Python best practices

## Testing

- Existing tests should pass (error handling moved to batch level)
- Error aggregation logic tested via
asyncio.gather(return_exceptions=True)

## Related Issues

- Fixes Sentry alert spam from AUTOGPT-SERVER-7D2
2026-01-24 21:49:32 +07:00
Ubbe
7892590b12 feat(frontend): refine copilot loading states (#11827)
## Changes 🏗️

- Make the loading UX better when switching between chats or loading a
new chat
- Make session/chat management logic more manageable
- Improving "Deep thinking" loading states
- Fix bug that happened when returning to chat after navigating away

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and test the above
2026-01-23 18:25:45 +07:00
Bently
82d7134fc6 feat(blocks): Add ClaudeCodeBlock for executing tasks via Claude Code in E2B sandbox (#11761)
Introduces a new ClaudeCodeBlock that enables execution of coding tasks
using Anthropic's Claude Code in an E2B sandbox. This block unlocks
powerful agentic coding capabilities - Claude Code can autonomously
create files, install packages, run commands, and build complete
applications within a secure sandboxed environment.

Changes 🏗️

- New file backend/blocks/claude_code.py:
  - ClaudeCodeBlock - Execute tasks using Claude Code in an E2B sandbox
- Dual credential support: E2B API key (sandbox) + Anthropic API key
(Claude Code)
- Session continuation support via session_id, sandbox_id, and
conversation_history
- Automatic file extraction with path, relative_path, name, and content
fields
  - Configurable timeout, setup commands, and working directory
- dispose_sandbox option to keep sandbox alive for multi-turn
conversations

Checklist 📋

For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create and execute ClaudeCodeBlock with a simple prompt ("Create a
hello world HTML file")
- [x] Verify files output includes correct path, relative_path, name,
and content
- [x] Test session continuation by passing session_id and sandbox_id
back
- [x] Build "Any API → Instant App" demo agent combining Firecrawl +
ClaudeCodeBlock + GitHub blocks
- [x] Verify generated files are pushed to GitHub with correct folder
structure using relative_path

Here are two example agents i made that can be used to test this agent,
they require github, anthropic and e2b access via api keys that are set
via the user/on the platform is testing on dev

The first agent is my

Any API → Instant App
"Transform any API documentation into a fully functional web
application. Just provide a docs URL and get a complete, ready-to-deploy
app pushed to a new GitHub repository."

[Any API → Instant
App_v36.json](https://github.com/user-attachments/files/24600326/Any.API.Instant.App_v36.json)


The second agent is my
Idea to project
"Simply enter your coding project's idea and this agent will make all of
the base initial code needed for you to start working on that project
and place it on github for you!"

[Idea to
project_v11.json](https://github.com/user-attachments/files/24600346/Idea.to.project_v11.json)

If you have any questions or issues let me know.

References
https://e2b.dev/blog/python-guide-run-claude-code-in-an-e2b-sandbox

https://github.com/e2b-dev/e2b-cookbook/tree/main/examples/anthropic-claude-code-in-sandbox-python
https://code.claude.com/docs/en/cli-reference

I tried to use E2b's "anthropic-claude-code" template but it kept
complaining it was out of date, so I make it manually spin up a E2b
instance and make it install the latest claude code and it uses that
2026-01-23 10:05:32 +00:00
Nicholas Tindle
90466908a8 refactor(docs): restructure platform docs for GitBook and remove MkDo… (#11825)
<!-- Clearly explain the need for these changes: -->
we met some reality when merging into the docs site but this fixes it
### Changes 🏗️
updates paths, adds some guides
<!-- Concisely describe all of the changes made in this pull request:
-->
update to match reality
### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] deploy it and validate

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Aligns block integrations documentation with GitBook.
> 
> - Changes generator default output to
`docs/integrations/block-integrations` and writes overview `README.md`
and `SUMMARY.md` at `docs/integrations/`
> - Adds GitBook frontmatter and hint syntax to overview; prefixes block
links with `block-integrations/`
> - Introduces `generate_summary_md` to build GitBook navigation
(including optional `guides/`)
> - Preserves per-block manual sections and adds optional `extras` +
file-level `additional_content`
> - Updates sync checker to validate parent `README.md` and `SUMMARY.md`
> - Rewrites `docs/integrations/README.md` with GitBook frontmatter and
updated links; adds `docs/integrations/SUMMARY.md`
> - Adds new guides: `guides/llm-providers.md`,
`guides/voice-providers.md`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
fdb7ff8111. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: bobby.gaffin <bobby.gaffin@agpt.co>
2026-01-23 06:18:16 +00:00
Zamil Majdy
f9f984a8f4 fix(db): Remove redundant migration and fix pgvector schema handling (#11822)
### Changes 🏗️

This PR includes two database migration fixes:

#### 1. Remove redundant Supabase extensions migration

Removes the `20260112173500_add_supabase_extensions_to_platform_schema`
migration which was attempting to manage Supabase-provided extensions
and schemas.

**What was removed:**
- Migration that created extensions (pgcrypto, uuid-ossp,
pg_stat_statements, pg_net, pgjwt, pg_graphql, pgsodium, supabase_vault)
- Schema creation for these extensions

**Why it was removed:**
- These extensions and schemas are pre-installed and managed by Supabase
automatically
- The migration was redundant and could cause schema drift warnings
- Attempting to manage Supabase-owned resources in our migrations is an
anti-pattern

#### 2. Fix pgvector extension schema handling

Improves the `20260109181714_add_docs_embedding` migration to handle
cases where pgvector exists in the wrong schema.

**Problem:**
- If pgvector was previously installed in `public` schema, `CREATE
EXTENSION IF NOT EXISTS` would succeed but not actually install it in
the `platform` schema
- This causes `type "vector" does not exist` errors because the type
isn't in the search_path

**Solution:**
- Detect if vector extension exists in a different schema than the
current one
- Drop it with CASCADE and reinstall in the correct schema (platform)
- Use dynamic SQL with `EXECUTE format()` to explicitly specify the
target schema
- Split exception handling: catch errors during removal, but let
installation fail naturally with clear PostgreSQL errors

**Impact:**
- No functional changes - Supabase continues to provide extensions as
before
- pgvector now correctly installs in the platform schema
- Cleaner migration history
- Prevents schema-related errors

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified migrations run successfully without the redundant file
  - [x] Confirmed Supabase extensions are still available
  - [x] Tested pgvector migration handles wrong-schema scenario
  - [x] No schema drift warnings

#### For configuration changes:
- [x] .env.default is updated or already compatible with my changes
- [x] docker-compose.yml is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
  - N/A - No configuration changes required
2026-01-22 12:06:00 +00:00
Abhimanyu Yadav
fc87ed4e34 feat(ci): add integration test job and rename e2e test job (#11820)
### Changes 🏗️

- Renamed the `test` job to `e2e_test` in the CI workflow for better
clarity
- Added a new `integration_test` job to the CI workflow that runs unit
tests using `pnpm test:unit`
- Created a basic integration test for the MainMarketplacePage component
to verify CI functionality

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified the CI workflow runs both e2e and integration tests
  - [x] Confirmed the integration test for MainMarketplacePage passes

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
2026-01-22 11:14:48 +00:00
Abhimanyu Yadav
b0953654d9 feat(frontend): add integration testing setup with Vitest, MSW, and RTL (#11813)
### Changes 🏗️

- Added Vitest and React Testing Library for frontend unit testing
- Configured MSW (Mock Service Worker) for API mocking in tests
- Created test utilities and setup files for integration tests
- Added comprehensive testing documentation in `AGENTS.md`
- Updated Orval configuration to generate MSW mock handlers
- Added mock server and browser implementations for development testing

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run `pnpm test:unit` to verify tests pass
  - [x] Verify MSW mock handlers are generated correctly
  - [x] Check that test utilities work with sample component tests

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2026-01-22 10:10:00 +00:00
Ubbe
c5069ca48f fix(frontend): chat UX improvements (#11804)
### Changes 🏗️

<img width="1920" height="998" alt="Screenshot 2026-01-19 at 22 14 51"
src="https://github.com/user-attachments/assets/ecd1c241-6f77-4702-9774-5e58806b0b64"
/>

This PR lays the groundwork for the new UX of AutoGPT Copilot. 
- moves the Copilot to its own route `/copilot`
- Makes the Copilot the homepage when enabled
- Updates the labelling of the homepage icons
- Makes the Library the homepage when Copilot is disabled
- Improves Copilot's:
  - session handling
  - styles and UX
  - message parsing
  
### Other improvements

- Improve the log out UX by adding a new `/logout` page and using a
re-direct

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Launches the new Copilot experience and aligns API behavior with the
UI.
> 
> - **Routing/Home**: Add `/copilot` with `CopilotShell` (desktop
sidebar + mobile drawer), make homepage route flag-driven; update
login/signup/error redirects and root page to use `getHomepageRoute`.
> - **Chat UX**: Replace legacy chat with `components/contextual/Chat/*`
(new message list, bubbles, tool call/response formatting, stop button,
initial-prompt handling, refined streaming/error handling); remove old
platform chat components.
> - **Sessions**: Add paginated session list (infinite load),
auto-select/create logic, mobile/desktop navigation, and improved
session fetching/claiming guards.
> - **Auth/Logout**: New `/logout` flow with delayed redirect; gate
various queries on auth state and logout-in-progress.
> - **Backend**: `GET /api/chat/sessions/{id}` returns `null` instead of
404; service saves assistant message on `StreamFinish` to avoid loss and
prevents duplicate saves; OpenAPI updated accordingly.
> - **Misc**: Minor UI polish in library modals, loader styling, docs
(CONTRIBUTING) additions, and small formatting fixes in block docs
generator.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
1b4776dcf52ccd6987830ada3a58a87a160ce36c. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2026-01-22 16:43:42 +07:00
Zamil Majdy
5d0cd88d98 fix(backend): Use unqualified vector type for pgvector queries (#11818)
## Summary
- Remove explicit schema qualification (`{schema}.vector` and
`OPERATOR({schema}.<=>)`) from pgvector queries in `embeddings.py` and
`hybrid_search.py`
- Use unqualified `::vector` type cast and `<=>` operator which work
because pgvector is in the search_path on all environments

## Problem
The previous approach tried to explicitly qualify the vector type with
schema names, but this failed because:
- **CI environment**: pgvector is in `public` schema → `platform.vector`
doesn't exist
- **Dev (Supabase)**: pgvector is in `platform` schema → `public.vector`
doesn't exist

## Solution
Use unqualified `::vector` and `<=>` operator. PostgreSQL resolves these
via `search_path`, which includes the schema where pgvector is installed
on all environments.

Tested on both local and dev environments with a test script that
verified:
-  Unqualified `::vector` type cast
-  Unqualified `<=>` operator in ORDER BY
-  Unqualified `<=>` in SELECT (similarity calculation)
-  Combined query patterns matching actual usage

## Test plan
- [ ] CI tests pass
- [ ] Marketplace approval works on dev after deployment

Fixes: AUTOGPT-SERVER-763, AUTOGPT-SERVER-764, AUTOGPT-SERVER-76B
2026-01-21 18:11:58 +00:00
Zamil Majdy
033f58c075 fix(backend): Make Redis event bus gracefully handle connection failures (#11817)
## Summary
Adds graceful error handling to AsyncRedisEventBus and RedisEventBus so
that connection failures log exceptions with full traceback while
remaining non-breaking. This allows DatabaseManager to operate without
Redis connectivity.

## Problem
DatabaseManager was failing with "Authentication required" when trying
to publish notifications via AsyncRedisNotificationEventBus. The service
has no Redis credentials configured, causing `increment_onboarding_runs`
to fail.

## Root Cause
When `increment_onboarding_runs` publishes a notification:
1. Calls `AsyncRedisNotificationEventBus().publish()`
2. Attempts to connect to Redis via `get_redis_async()`
3. Connection fails due to missing credentials
4. Exception propagates, failing the entire DB operation

Previous fix (#11775) made the cache module lazy, but didn't address the
notification bus which also requires Redis.

## Solution
Wrap Redis operations in try-except blocks:
- `publish_event`: Logs exception with traceback, continues without
publishing
- `listen_events`: Logs exception with traceback, returns empty
generator
- `wait_for_event`: Returns None on connection failure

Using `logger.exception()` instead of `logger.warning()` ensures full
stack traces are captured for debugging while keeping operations
non-breaking.

This allows services to operate without Redis when only using event bus
for non-critical notifications.

## Changes
- Modified `backend/data/event_bus.py`:
- Added graceful error handling to `RedisEventBus` and
`AsyncRedisEventBus`
- All Redis operations now catch exceptions and log with
`logger.exception()`
- Added `backend/data/event_bus_test.py`:
  - Tests verify graceful degradation when Redis is unavailable
  - Tests verify normal operation when Redis is available

## Test Plan
- [x] New tests verify graceful degradation when Redis unavailable
- [x] Existing notification tests still pass
- [x] DatabaseManager can increment onboarding runs without Redis

## Related Issues
Fixes https://significant-gravitas.sentry.io/issues/7205834440/
(AUTOGPT-SERVER-76D)
2026-01-21 15:51:26 +00:00
Ubbe
40ef2d511f fix(frontend): auto-select credentials correctly in old builder (#11815)
## Changes 🏗️

On the **Old Builder**, when running an agent...

### Before

<img width="800" height="614" alt="Screenshot 2026-01-21 at 21 27 05"
src="https://github.com/user-attachments/assets/a3b2ec17-597f-44d2-9130-9e7931599c38"
/>

Credentials are there, but it is not recognising them, you need to click
on them to be selected

### After

<img width="1029" height="728" alt="Screenshot 2026-01-21 at 21 26 47"
src="https://github.com/user-attachments/assets/c6e83846-6048-439e-919d-6807674f2d5a"
/>

It uses the new credentials UI and correctly auto-selects existing ones.

### Other

Fixed a small timezone display glitch on the new library view.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run agent in old builder
- [x] Credentials are auto-selected and using the new collapsed system
credentials UI
2026-01-21 14:55:49 +00:00
Zamil Majdy
b714c0c221 fix(backend): handle null values in GraphSettings validation (#11812)
## Summary
- Fixes AUTOGPT-SERVER-76H - Error parsing LibraryAgent from database
due to null values in GraphSettings fields
- When parsing LibraryAgent settings from the database, null values for
`human_in_the_loop_safe_mode` and `sensitive_action_safe_mode` were
causing Pydantic validation errors
- Adds `BeforeValidator` annotations to coerce null values to their
defaults (True and False respectively)

## Test plan
- [x] Verified with unit tests that GraphSettings can now handle
None/null values
- [x] Backend tests pass
- [x] Manually tested with all scenarios (None, empty dict, explicit
values)
2026-01-21 08:40:38 -05:00
Krzysztof Czerwinski
ebabc4287e feat(platform): New LLM Picker UI (#11726)
Add new LLM Picker for the new Builder.

### Changes 🏗️

- Enrich `LlmModelMeta` (in `llm.py`) with human readable model, creator
and provider names and price tier (note: this is temporary measure and
all LlmModelMeta will be removed completely once LLM Registry is ready)
- Add provider icons
- Add custom input field `LlmModelField` and its components&helpers

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] LLM model picker works correctly in the new Builder
  - [x] Legacy LLM model picker works in the old Builder
2026-01-21 10:52:55 +00:00
Zamil Majdy
8b25e62959 feat(backend,frontend): add explicit safe mode toggles for HITL and sensitive actions (#11756)
## Summary

This PR introduces two explicit safe mode toggles for controlling agent
execution behavior, providing clearer and more granular control over
when agents should pause for human review.

### Key Changes

**New Safe Mode Settings:**
- **`human_in_the_loop_safe_mode`** (bool, default `true`) - Controls
whether human-in-the-loop (HITL) blocks pause for review
- **`sensitive_action_safe_mode`** (bool, default `false`) - Controls
whether sensitive action blocks pause for review

**New Computed Properties on LibraryAgent:**
- `has_human_in_the_loop` - Indicates if agent contains HITL blocks
- `has_sensitive_action` - Indicates if agent contains sensitive action
blocks

**Block Changes:**
- Renamed `requires_human_review` to `is_sensitive_action` on blocks for
clarity
- Blocks marked as `is_sensitive_action=True` pause only when
`sensitive_action_safe_mode=True`
- HITL blocks pause when `human_in_the_loop_safe_mode=True`

**Frontend Changes:**
- Two separate toggles in Agent Settings based on block types present
- Toggle visibility based on `has_human_in_the_loop` and
`has_sensitive_action` computed properties
- Settings cog hidden if neither toggle applies
- Proper state management for both toggles with defaults

**AI-Generated Agent Behavior:**
- AI-generated agents set `sensitive_action_safe_mode=True` by default
- This ensures sensitive actions are reviewed for AI-generated content

## Changes

**Backend:**
- `backend/data/graph.py` - Updated `GraphSettings` with two boolean
toggles (non-optional with defaults), added `has_sensitive_action`
computed property
- `backend/data/block.py` - Renamed `requires_human_review` to
`is_sensitive_action`, updated review logic
- `backend/data/execution.py` - Updated `ExecutionContext` with both
safe mode fields
- `backend/api/features/library/model.py` - Added
`has_human_in_the_loop` and `has_sensitive_action` to `LibraryAgent`
- `backend/api/features/library/db.py` - Updated to use
`sensitive_action_safe_mode` parameter
- `backend/executor/utils.py` - Simplified execution context creation

**Frontend:**
- `useAgentSafeMode.ts` - Rewritten to support two independent toggles
- `AgentSettingsModal.tsx` - Shows two separate toggles
- `SelectedSettingsView.tsx` - Shows two separate toggles
- Regenerated API types with new schema

## Test Plan

- [x] All backend tests pass (Python 3.11, 3.12, 3.13)
- [x] All frontend tests pass
- [x] Backend format and lint pass
- [x] Frontend format and lint pass
- [x] Pre-commit hooks pass

---------

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-01-21 00:56:02 +00:00
Zamil Majdy
35a13e3df5 fix(backend): Use explicit schema qualification for pgvector types (#11805)
## Summary
- Fix intermittent "type 'vector' does not exist" errors when using
PgBouncer in transaction mode
- The issue was that `SET search_path` and the actual query could run on
different backend connections
- Use explicit schema qualification (`{schema}.vector`,
`OPERATOR({schema}.<=>)`) instead of relying on search_path

## Test plan
- [x] Tested vector type cast on local: `'[1,2,3]'::platform.vector`
works
- [x] Tested OPERATOR syntax on local: `OPERATOR(platform.<=>)` works
- [x] Tested on dev via kubectl exec: both work correctly
- [ ] Deploy to dev and verify backfill_missing_embeddings endpoint no
longer errors

## Related Issues
Fixes: AUTOGPT-SERVER-763, AUTOGPT-SERVER-764

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 22:18:16 +00:00
Mewael Tsegay Desta
2169b433c9 feat(backend/blocks): add ConcatenateListsBlock (#11567)
# feat(backend/blocks): add ConcatenateListsBlock

## Description

This PR implements a new block `ConcatenateListsBlock` that concatenates
multiple lists into a single list. This addresses the "good first issue"
for implementing a list concatenation block in the platform/blocks area.

The block takes a list of lists as input and combines all elements in
order into a single concatenated list. This is useful for workflows that
need to merge data from multiple sources or combine results from
different operations.

### Changes 🏗️

- **Added `ConcatenateListsBlock` class** in
`autogpt_platform/backend/backend/blocks/data_manipulation.py`
- Input: `lists: List[List[Any]]` - accepts a list of lists to
concatenate
- Output: `concatenated_list: List[Any]` - returns a single concatenated
list
- Error output: `error: str` - provides clear error messages for invalid
input types
  - Block ID: `3cf9298b-5817-4141-9d80-7c2cc5199c8e`
- Category: `BlockCategory.BASIC` (consistent with other list
manipulation blocks)
  
- **Added comprehensive test suite** in
`autogpt_platform/backend/test/blocks/test_concatenate_lists.py`
  - Tests using built-in `test_input`/`test_output` validation
- Manual test cases covering edge cases (empty lists, single list, empty
input)
  - Error handling tests for invalid input types
  - Category consistency verification
  - All tests passing

- **Implementation details:**
  - Uses `extend()` method for efficient list concatenation
  - Preserves element order from all input lists
- **Runtime type validation**: Explicitly checks `isinstance(lst, list)`
before calling `extend()` to prevent:
- Strings being iterated character-by-character (e.g., `extend("abc")` →
`['a', 'b', 'c']`)
    - Non-iterable types causing `TypeError` (e.g., `extend(1)`)
  - Clear error messages indicating which index has invalid input
- Handles edge cases: empty lists, empty input, single list, None values
  - Follows existing block patterns and conventions

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run `poetry run pytest test/blocks/test_concatenate_lists.py -v` -
all tests pass
  - [x] Verified block can be imported and instantiated
  - [x] Tested with built-in test cases (4 test scenarios)
  - [x] Tested manual edge cases (empty lists, single list, empty input)
  - [x] Tested error handling for invalid input types
  - [x] Verified category is `BASIC` for consistency
  - [x] Verified no linting errors
- [x] Confirmed block follows same patterns as other blocks in
`data_manipulation.py`

#### Code Quality:
- [x] Code follows existing patterns and conventions
- [x] Type hints are properly used
- [x] Documentation strings are clear and descriptive
- [x] Runtime type validation implemented
- [x] Error handling with clear error messages
- [x] No linting errors
- [x] Prisma client generated successfully

### Testing

**Test Results:**
```
test/blocks/test_concatenate_lists.py::test_concatenate_lists_block_builtin_tests PASSED
test/blocks/test_concatenate_lists.py::test_concatenate_lists_manual PASSED

============================== 2 passed in 8.35s ==============================
```

**Test Coverage:**
- Basic concatenation: `[[1, 2, 3], [4, 5, 6]]` → `[1, 2, 3, 4, 5, 6]`
- Mixed types: `[["a", "b"], ["c"], ["d", "e", "f"]]` → `["a", "b", "c",
"d", "e", "f"]`
- Empty list handling: `[[1, 2], []]` → `[1, 2]`
- Empty input: `[]` → `[]`
- Single list: `[[1, 2, 3]]` → `[1, 2, 3]`
- Error handling: Invalid input types (strings, non-lists) produce clear
error messages
- Category verification: Confirmed `BlockCategory.BASIC` for consistency

### Review Feedback Addressed

- **Category Consistency**: Changed from `BlockCategory.DATA` to
`BlockCategory.BASIC` to match other list manipulation blocks
(`AddToListBlock`, `FindInListBlock`, etc.)
- **Type Robustness**: Added explicit runtime validation with
`isinstance(lst, list)` check before calling `extend()` to prevent:
  - Strings being iterated character-by-character
  - Non-iterable types causing `TypeError`
- **Error Handling**: Added `error` output field with clear, descriptive
error messages indicating which index has invalid input
- **Test Coverage**: Added test case for error handling with invalid
input types

### Related Issues

- Addresses: "Implement block to concatenate lists" (good first issue,
platform/blocks, hacktoberfest)

### Notes

- This is a straightforward data manipulation block that doesn't require
external dependencies
- The block will be automatically discovered by the block loading system
- No database or configuration changes required
- Compatible with existing workflow system
- All review feedback has been addressed and incorporated


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Adds a new list utility and updates docs.
> 
> - **New block**: `ConcatenateListsBlock` in
`backend/blocks/data_manipulation.py`
> - Input `lists: List[List[Any]]`; outputs `concatenated_list` or
`error`
> - Skips `None` entries; emits error for non-list items; preserves
order
> - **Docs**: Adds "Concatenate Lists" section to
`docs/integrations/basic.md` and links it in
`docs/integrations/README.md`
> - **Contributor guide**: New `docs/CLAUDE.md` with manual doc section
guidelines
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
4f56dd86c2. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 18:04:12 +00:00
Nicholas Tindle
fa0b7029dd fix(platform): make chat credentials type selection deterministic (#11795)
## Background

When using chat to run blocks/agents that support multiple credential
types (e.g., GitHub blocks support both `api_key` and `oauth2`), users
reported that the credentials setup UI would randomly show either "Add
API key" or "Connect account (OAuth)" - seemingly at random between
requests or server restarts.

## Root Cause

The bug was in how the backend selected which credential type to return
when building the missing credentials response:

```python
cred_type = next(iter(field_info.supported_types), "api_key")
```

The problem is that `supported_types` is a **frozenset**. When you call
`iter()` on a frozenset and take `next()`, the iteration order is
**non-deterministic** due to Python's hash randomization. This means:
- `frozenset({'api_key', 'oauth2'})` could iterate as either
`['api_key', 'oauth2']` or `['oauth2', 'api_key']`
- The order varies between Python process restarts and sometimes between
requests
- This caused the UI to randomly show different credential options

### Changes 🏗️

**Backend (`utils.py`, `run_block.py`, `run_agent.py`):**
- Added `_serialize_missing_credential()` helper that uses `sorted()`
for deterministic ordering
- Added `build_missing_credentials_from_graph()` and
`build_missing_credentials_from_field_info()` utilities
- Now returns both `type` (first sorted type, for backwards compat) and
`types` (full array with ALL supported types)

**Frontend (`helpers.ts`, `ChatCredentialsSetup.tsx`,
`useChatMessage.ts`):**
- Updated to read the `types` array from backend response
- Changed `credentialType` (single) to `credentialTypes` (array)
throughout the chat credentials flow
- Passes all supported types to `CredentialsInput` via
`credentials_types` schema field

### Result

Now `useCredentials.ts` correctly sets both `supportsApiKey=true` AND
`supportsOAuth2=true` when both are supported, ensuring:
1. **Deterministic behavior** - no more random type selection
2. **All saved credentials shown** - credentials of any supported type
appear in the selection list

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified GitHub block shows consistent credential options across
page reloads
- [x] Verified both OAuth and API key credentials appear in selection
when user has both saved
- [x] Verified backend returns `types: ["api_key", "oauth2"]` array
(checked via Python REPL)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Ensures deterministic credential type selection and surfaces all
supported types end-to-end.
> 
> - Backend: add `_serialize_missing_credential`,
`build_missing_credentials_from_graph/field_info`;
`run_agent`/`run_block` now return missing credentials with stable
ordering and both `type` (first) and `types` (all).
> - Frontend: chat helpers and UI (`helpers.ts`,
`ChatCredentialsSetup.tsx`, `useChatMessage.ts`) now read `types`,
switch from single `credentialType` to `credentialTypes`, and pass all
supported `credentials_types` in schemas.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
7d80f4f0e0. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-20 16:19:57 +00:00
Abhimanyu Yadav
c20ca47bb0 feat(frontend): enhance RunGraph and RunInputDialog components with loading states and improved UI (#11808)
### Changes 🏗️

- Enhanced UI for the Run Graph button with improved loading states and
animations
- Added color-coded edges in the flow editor based on output data types
- Improved the layout of the Run Input Dialog with a two-column grid
design
- Refined the styling of flow editor controls with consistent icon sizes
and colors
- Updated tutorial icons with better color and size customization
- Fixed credential field display to show provider name with "credential"
suffix
- Optimized draft saving by excluding node position changes to prevent
excessive saves when dragging nodes

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified that the Run Graph button shows proper loading states
  - [x] Confirmed that edges display correct colors based on data types
- [x] Tested the Run Input Dialog layout with various input
configurations
  - [x] Checked that flow editor controls display consistently
  - [x] Verified that tutorial icons render properly
  - [x] Confirmed credential fields show proper provider names
- [x] Tested that dragging nodes doesn't trigger unnecessary draft saves
2026-01-20 15:50:23 +00:00
Abhimanyu Yadav
7756e2d12d refactor(frontend): refactor credentials input with unified CredentialsGroupedView component (#11801)
### Changes 🏗️

- Refactored the credentials input handling in the RunInputDialog to use
the shared CredentialsGroupedView component
- Moved CredentialsGroupedView from agent library to a shared component
location for reuse
- Fixed source name handling in edge creation to properly handle tool
source names
- Improved node output UI by replacing custom expand/collapse with
Accordion component
- Fixed timing of hardcoded values synchronization with handle IDs to
ensure proper loading
- Enabled NEW_FLOW_EDITOR and BUILDER_VIEW_SWITCH feature flags by
default

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified credentials input works in both agent run dialog and
builder run dialog
  - [x] Confirmed node output accordion works correctly
- [x] Tested flow editor with tools to ensure source name handling works
properly
  - [x] Verified hardcoded values sync correctly with handle IDs

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2026-01-20 12:20:25 +00:00
Swifty
bc75d70e7d refactor(backend): Improve Langfuse tracing with v3 SDK patterns and @observe decorators (#11803)
<!-- Clearly explain the need for these changes: -->

This PR improves the Langfuse tracing implementation in the chat feature
by adopting the v3 SDK patterns, resulting in cleaner code and better
observability.

### Changes 🏗️

- **Simplified Langfuse client usage**: Replace manual client
initialization with `langfuse.get_client()` global singleton
- **Use v3 context managers**: Switch to
`start_as_current_observation()` and `propagate_attributes()` for
automatic trace propagation
- **Auto-instrument OpenAI calls**: Use `langfuse.openai` wrapper for
automatic LLM call tracing instead of manual generation tracking
- **Add `@observe` decorators**: All chat tools now have
`@observe(as_type="tool")` decorators for automatic tool execution
tracing:
  - `add_understanding`
  - `view_agent_output` (renamed from `agent_output`)
  - `create_agent`
  - `edit_agent`
  - `find_agent`
  - `find_block`
  - `find_library_agent`
  - `get_doc_page`
  - `run_agent`
  - `run_block`
  - `search_docs`
- **Remove manual trace lifecycle**: Eliminated the verbose `finally`
block that manually ended traces/generations
- **Rename tool**: `agent_output` → `view_agent_output` for clarity

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified chat feature works with Langfuse tracing enabled
- [x] Confirmed traces appear correctly in Langfuse dashboard with tool
spans
  - [x] Tested tool execution flows show up as nested observations

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

No configuration changes required - uses existing Langfuse environment
variables.
2026-01-19 20:56:51 +00:00
Nicholas Tindle
c1a1767034 feat(docs): Add block documentation auto-generation system (#11707)
- Add generate_block_docs.py script that introspects block code to
generate markdown
- Support manual content preservation via <!-- MANUAL: --> markers
- Add migrate_block_docs.py to preserve existing manual content from git
HEAD
- Add CI workflow (docs-block-sync.yml) to fail if docs drift from code
- Add Claude PR review workflow (docs-claude-review.yml) for doc changes
- Add manual LLM enhancement workflow (docs-enhance.yml)
- Add GitBook configuration (.gitbook.yaml, SUMMARY.md)
- Fix non-deterministic category ordering (categories is a set)
- Add comprehensive test suite (32 tests)
- Generate docs for 444 blocks with 66 preserved manual sections

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

<!-- Clearly explain the need for these changes: -->

### Changes 🏗️

<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Extensively test code generation for the docs pages



<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces an automated documentation pipeline for blocks and
integrates it into CI.
> 
> - Adds `scripts/generate_block_docs.py` (+ tests) to introspect blocks
and generate `docs/integrations/**`, preserving `<!-- MANUAL: -->`
sections
> - New CI workflows: **docs-block-sync** (fails if docs drift),
**docs-claude-review** (AI review for block/docs PRs), and
**docs-enhance** (optional LLM improvements)
> - Updates existing Claude workflows to use `CLAUDE_CODE_OAUTH_TOKEN`
instead of `ANTHROPIC_API_KEY`
> - Improves numerous block descriptions/typos and links across backend
blocks to standardize docs output
> - Commits initial generated docs including
`docs/integrations/README.md` and many provider/category pages
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
631e53e0f6. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 07:03:19 +00:00
Nicholas Tindle
1b56ff13d9 test 2026-01-18 15:32:10 -06:00
Zamil Majdy
f31c160043 feat(platform): add endedAt field and fix execution analytics timestamps (#11759)
## Summary

This PR adds proper execution end time tracking and fixes timestamp
handling throughout the execution analytics system.

### Key Changes

1. **Added `endedAt` field to database schema** - Executions now have a
dedicated field for tracking when they finish
2. **Fixed timestamp nullable handling** - `started_at` and `ended_at`
are now properly nullable in types
3. **Fixed chart aggregation** - Reduced threshold from ≥3 to ≥1
executions per day
4. **Improved timestamp display** - Moved timestamps to expandable
details section in analytics table
5. **Fixed nullable timestamp bugs** - Updated all frontend code to
handle null timestamps correctly

## Problem Statement

### Issue 1: Missing Execution End Times
Previously, executions used `updatedAt` (last DB update) as a proxy for
"end time". This broke when adding correctness scores retroactively -
the end time would change to whenever the score was added, not when the
execution actually finished.

### Issue 2: Chart Shows Only One Data Point
The accuracy trends chart showed only one data point despite having
executions across multiple days. Root cause: aggregation required ≥3
executions per day.

### Issue 3: Incorrect Type Definitions
Manually maintained types defined `started_at` and `ended_at` as
non-nullable `Date`, contradicting reality where QUEUED executions
haven't started yet.

## Solution

### Database Schema (`schema.prisma`)
```prisma
model AgentGraphExecution {
  // ...
  startedAt DateTime?
  endedAt   DateTime?  // NEW FIELD
  // ...
}
```

### Execution Lifecycle
- **QUEUED**: `startedAt = null`, `endedAt = null` (not started)
- **RUNNING**: `startedAt = set`, `endedAt = null` (in progress)  
- **COMPLETED/FAILED/TERMINATED**: `startedAt = set`, `endedAt = set`
(finished)

### Migration Strategy
```sql
-- Add endedAt column
ALTER TABLE "AgentGraphExecution" ADD COLUMN "endedAt" TIMESTAMP(3);

-- Backfill ONLY terminal executions (prevents marking RUNNING executions as ended)
UPDATE "AgentGraphExecution"
SET "endedAt" = "updatedAt"
WHERE "endedAt" IS NULL
  AND "executionStatus" IN ('COMPLETED', 'FAILED', 'TERMINATED');
```

## Changes by Component

### Backend

**`schema.prisma`**
- Added `endedAt` field to `AgentGraphExecution`

**`execution.py`**
- Made `started_at` and `ended_at` optional with Field descriptions
- Updated `from_db()` to use `endedAt` instead of `updatedAt`
- `update_graph_execution_stats()` sets `endedAt` when status becomes
terminal

**`execution_analytics_routes.py`**
- Removed `created_at`/`updated_at` from `ExecutionAnalyticsResult` (DB
metadata, not execution data)
- Kept only `started_at`/`ended_at` (actual execution runtime)
- Made settings global (avoid recreation)
- Moved OpenAI key validation to `_process_batch` (only check when LLM
actually runs)

**`analytics.py`**
- Fixed aggregation: `COUNT(*) >= 1` (was 3) - include all days with ≥1
execution
- Uses `createdAt` for chart grouping (when execution was queued)

**`late_execution_monitor.py`**
- Handle optional `started_at` with fallback to `datetime.min` for
sorting
- Display "Not started" when `started_at` is null

### Frontend

**Type Definitions**
- Fixed manually maintained `types.ts`: `started_at: Date | null` (was
non-nullable)
- Generated types were already correct

**Analytics Components**
- `AnalyticsResultsTable.tsx`: Show only `started_at`/`ended_at` in
2-column expandable grid
- `ExecutionAnalyticsForm.tsx`: Added filter explanation UI

**Monitoring Components** - Fixed null handling bugs:
- `OldAgentLibraryView.tsx`: Handle null in reduce function
- `agent-runs-selector-list.tsx`: Safe sorting with `?.getTime() ?? 0`
- `AgentFlowList.tsx`: Filter/sort with null checks
- `FlowRunsStatus.tsx`: Filter null timestamps
- `FlowRunsTimeline.tsx`: Filter executions with null timestamps before
rendering
- `monitoring/page.tsx`: Safe sorting
- `ActivityItem.tsx`: Fallback to "recently" for null timestamps

## Benefits

 **Accurate End Times**: `endedAt` is frozen when execution finishes,
not updated later
 **Type Safety**: Nullable types match reality, exposing real bugs  
 **Better UX**: Chart shows all days with data (not just days with ≥3
executions)
 **Bug Fixes**: 7+ frontend components now handle null timestamps
correctly
 **Documentation**: Field descriptions explain when timestamps are null

## Testing

### Backend
```bash
cd autogpt_platform/backend
poetry run format  #  All checks passed
poetry run lint    #  All checks passed
```

### Frontend  
```bash
cd autogpt_platform/frontend
pnpm format        #  All checks passed
pnpm lint          #  All checks passed
pnpm types         #  All type errors fixed
```

### Test Data Generation
Created script to generate 35 test executions across 7 days with
correctness scores:
```bash
poetry run python scripts/generate_test_analytics_data.py
```

## Migration Notes

⚠️ **Important**: The migration only backfills `endedAt` for executions
with terminal status (COMPLETED, FAILED, TERMINATED). Active executions
(QUEUED, RUNNING) correctly keep `endedAt = null`.

## Breaking Changes

None - this is backward compatible:
- `endedAt` is nullable, existing code that doesn't use it is unaffected
- Frontend already used generated types which were correct
- Migration safely backfills historical data

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces explicit execution end-time tracking and normalizes
timestamp handling across backend and frontend.
> 
> - Adds `endedAt` to `AgentGraphExecution` (schema + migration);
backfills terminal executions; sets `endedAt` on terminal status updates
> - Makes `GraphExecutionMeta.started_at/ended_at` optional; updates
`from_db()` to use DB `endedAt`; exposes timestamps in
`ExecutionAnalyticsResult`
> - Moves OpenAI key validation into batch processing; instantiates
`Settings` once
> - Accuracy trends: reduce daily aggregation threshold to `>= 1`;
optional historical series
> - Monitoring/analytics UI: results table shows/export
`started_at`/`ended_at`; adds chart filter explainer
> - Frontend null-safety: update types (`Date | null`) and fix
sorting/filtering/rendering for nullable timestamps across monitoring
and library views
> - Late execution monitor: safe sorting/display when `started_at` is
null
> - OpenAPI specs updated for new/nullable fields
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
1d987ca6e5. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-01-16 21:44:24 +00:00
Nicholas Tindle
06550a87eb feat(backend): add missed default credentials (#11760)
### Changes 🏗️

**Fixed missing default credentials and provider name mismatch in the
credentials store:**

1. **Provider name correction** (`credentials_store.py:97-103`)
- Changed `provider="unreal"` → `provider="unreal_speech"` to match the
existing `unreal_speech_api_key` setting and block usage
- Updated title from "Use Credits for Unreal" → "Use Credits for Unreal
Speech" for clarity

2. **Added missing OpenWeatherMap credentials**
(`credentials_store.py:219-226`)
- New `openweathermap_credentials` definition with `APIKeyCredentials`
- Uses existing `settings.secrets.openweathermap_api_key` setting that
was previously defined but had no credential object
   - Added to `DEFAULT_CREDENTIALS` list

3. **Fixed credentials not exposed in `get_all_creds()`**
(`credentials_store.py:343-354`)
- Added `llama_api_credentials` conditional append (was defined but not
returned to users)
- Added `v0_credentials` conditional append (was defined but not
returned to users)
   - Added `openweathermap_credentials` conditional append

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified provider name `unreal_speech` matches block usage in
`text_to_speech_block.py`
  - [x] Confirmed `openweathermap_api_key` setting exists in secrets
- [x] Confirmed `llama_api_key` and `v0_api_key` settings exist in
secrets

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Aligns backend credential definitions and exposes missing system
creds; updates frontend to hide new built-ins.
> 
> - Backend `credentials_store.py`:
>   - Corrects `provider` to `unreal_speech` and updates title
> - Adds `openweathermap_credentials`; includes in `DEFAULT_CREDENTIALS`
and `get_all_creds()` when key present
> - Ensures `llama_api_credentials` and `v0_credentials` are returned by
`get_all_creds()`
> - Frontend `integrations/page.tsx`:
> - Extends `hiddenCredentials` with IDs for `v0`, `webshare_proxy`, and
`openweathermap`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e7d46b76c6. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-16 21:18:12 +00:00
Nicholas Tindle
088b9998dc fix(frontend): Fix flaky agent-activity tests by targeting correct agent (#11790)
This PR fixes flaky agent-activity Playwright tests that were failing
intermittently in CI.

Closes #11789

### Changes 🏗️

- **Navigate to specific agent by name**: Replace
`LibraryPage.clickFirstAgent(page)` with
`LibraryPage.navigateToAgentByName(page, "Test Agent")` to ensure we're
testing the correct agent rather than relying on the first agent in the
list
- **Add retry mechanism for async data loading**: Replace direct
visibility check with `expect(...).toPass({ timeout: 15000 })` pattern
to properly handle asynchronous agent data fetching
- **Increase timeout**: Extended timeout from 8000ms to 15000ms to
accommodate slower CI environments

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified the test file syntax is correct
- [x] Changes target the correct file
(`autogpt_platform/frontend/src/tests/agent-activity.spec.ts`)
- [x] The retry mechanism follows Playwright best practices using
`toPass()`

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
(N/A - no config changes)
- [x] `docker-compose.yml` is updated or already compatible with my
changes (N/A - no config changes)
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**) (N/A - no config changes)

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-16 20:33:47 +00:00
Nicholas Tindle
05c89fa5c0 feat(claude): add vercel-react-best-practices skill (#11777) 2026-01-16 09:40:58 -07:00
Swifty
8cc8295f14 feat(backend): add agent generator tools for chat copilot (#11781)
This PR adds the ability to create and edit agents from natural language
descriptions in the chat copilot.

### Changes 🏗️

- Added `agent_generator/` module with:
  - LLM client for OpenAI API calls
- Core generation logic for decomposing goals and generating agent JSON
  - Fixer module to correct common LLM generation errors
  - Validator to ensure generated agents are structurally valid
  - Prompts for goal decomposition and agent generation
  - Utility functions for blocks info and agent saving
- Added `CreateAgentTool` - creates new agents from natural language
descriptions
- Added `EditAgentTool` - edits existing agents using natural language
patches
- Added response models: `AgentPreviewResponse`, `AgentSavedResponse`,
`ClarificationNeededResponse`
- Registered new tools in the tools registry

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run `poetry run format` to ensure code passes linting
- [x] Test creating an agent via chat with a natural language
description
  - [x] Test editing an existing agent via chat
2026-01-16 17:11:57 +01:00
Swifty
e55f05c7a8 feat(backend): add chat search tools and BM25 reranking (#11782)
This PR adds new chat tools for searching blocks and documentation,
along with BM25 reranking for improved search relevance.

### Changes 🏗️

**New Chat Tools:**
- `find_block` - Search for available blocks by name/description using
hybrid search
- `run_block` - Execute a block directly with provided inputs and
credentials
- `search_docs` - Search documentation with section-level granularity  
- `get_doc_page` - Retrieve full documentation page content

**Search Improvements:**
- Added BM25 reranking to hybrid search for better lexical relevance
- Documentation handler now chunks markdown by headings (##) for
finer-grained embeddings
- Section-based content IDs (`doc_path::section_index`) for precise doc
retrieval
- Startup embedding backfill in scheduler for immediate searchability

**Other Changes:**
- New response models for block and documentation search results
- Updated orphan cleanup to handle section-based doc embeddings
- Added `rank-bm25` dependency for BM25 scoring
- Removed max message limit check in chat service

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run find_block tool to search for blocks (e.g., "current time")
  - [x] Run run_block tool to execute a found block
  - [x] Run search_docs tool to search documentation
  - [x] Run get_doc_page tool to retrieve full doc content
- [x] Verify BM25 reranking improves search relevance for exact term
matches
  - [x] Verify documentation sections are properly chunked and embedded

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

**Dependencies added:** `rank-bm25` for BM25 scoring algorithm
2026-01-16 16:18:10 +01:00
Swifty
4a9b13acb6 feat(frontend): extract frontend changes from hackathon/copilot branch (#11717)
Frontend changes extracted from the hackathon/copilot branch for the
copilot feature development.

### Changes 🏗️

- New Chat system with contextual components (`Chat`, `ChatDrawer`,
`ChatContainer`, `ChatMessage`, etc.)
- Form renderer system with RJSF v6 integration and new input renderers
- Enhanced credentials management with improved OAuth flow and
credential selection
- New output renderers for various content types (Code, Image, JSON,
Markdown, Text, Video)
- Scrollable tabs component for better UI organization
- Marketplace update notifications and publishing workflow improvements
- Draft recovery feature with IndexedDB persistence
- Safe mode toggle functionality
- Various UI/UX improvements across the platform

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [ ] Test new Chat components functionality
  - [ ] Verify form renderer with various input types
  - [ ] Test credential management flows
  - [ ] Verify output renderers display correctly
  - [ ] Test draft recovery feature

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-01-16 22:15:39 +07:00
Zamil Majdy
5ff669e999 fix(backend): Make Redis connection lazy in cache module (#11775)
## Summary
- Makes Redis connection lazy in the cache module - connection is only
established when `shared_cache=True` is actually used
- Fixes DatabaseManager failing to start because it imports
`onboarding.py` which imports `cache.py`, triggering Redis connection at
module load time even though it only uses in-memory caching

## Root Cause
Commit `b01ea3fcb` (merged today) added `increment_onboarding_runs` to
DatabaseManager, which imports from `onboarding.py`. That module imports
`@cached` decorator from `cache.py`, which was creating a Redis
connection at module import time:

```python
# Old code - ran at import time!
redis = Redis(connection_pool=_get_cache_pool())
```

Since `onboarding.py` only uses `@cached(shared_cache=False)` (in-memory
caching), it doesn't actually need Redis. But the import triggered the
connection attempt.

## Changes
- Wrapped Redis connection in a singleton class with lazy initialization
- Connection is only established when `_get_redis()` is first called
(i.e., when `shared_cache=True` is used)
- Services using only in-memory caching can now import `cache.py`
without Redis configuration

## Test plan
- [ ] Services using `shared_cache=False` work without Redis configured
- [ ] Services using `shared_cache=True` still work correctly with Redis
- [ ] Existing cache tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 14:28:36 +00:00
Abhimanyu Yadav
ec03a13e26 fix(frontend): improve history tracking, error handling (#11786)
### Changes 🏗️

- **Improved Error Handling**: Enhanced error handling in
`useRunInputDialog.ts` to properly handle cases where node errors are
empty or undefined
- **Fixed Node Collision Resolution**: Updated `Flow.tsx` to use the
current state from the store instead of stale props
- **Enhanced History Management**:
    - Added proper state tracking for edge removal operations
    - Improved undo/redo functionality to prevent duplicate states
- Fixed edge case where history wasn't properly tracked during node
dragging
- **UI Improvements**:
- Fixed potential null reference in NodeHeader when accessing agent_name
    - Added placeholder for GoogleDrivePicker in INPUT mode
    - Fixed spacing in ArrayFieldTemplate
- **Bug Fixes**:
    - Added proper state tracking before modifying nodes/edges
    - Fixed history tracking to avoid redundant states
    - Improved collision detection and resolution

### Checklist ���

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Test undo/redo functionality after adding, removing, and moving
nodes
    - [x] Test edge creation and deletion with history tracking
    - [x] Verify error handling when graph validation fails
    - [x] Test Google Drive picker in different UI modes
    - [x] Verify node collision resolution works correctly
2026-01-16 13:34:57 +00:00
Abhimanyu Yadav
b08851f5d7 feat(frontend): improve GoogleDrivePickerField with input mode support and array field spacing (#11780)
### Changes 🏗️

- Added a placeholder UI for Google Drive Picker in INPUT block type
- Improved detection of Google Drive file objects in schema validation
- Extracted `isGoogleDrivePickerSchema` function for better code
organization
- Added spacing between array field elements with a gap-2 class
- Added debug logging for preprocessed schema in FormRenderer

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified Google Drive Picker shows placeholder in INPUT blocks
  - [x] Confirmed array field elements have proper spacing
  - [x] Tested that Google Drive file objects are properly detected
2026-01-16 13:02:36 +00:00
Abhimanyu Yadav
8b1720e61d feat(frontend): improve graph validation error handling and node navigation (#11779)
### Changes 🏗️

- Enhanced error handling for graph validation failures with detailed
user feedback
- Added automatic viewport navigation to the first node with errors when
validation fails
- Improved node title display to prioritize agent_name from hardcoded
values
- Removed console.log debugging statement from OutputHandler
- Added ApiError import and improved error type handling
- Reorganized imports for better code organization

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create a graph with intentional validation errors and verify error
messages display correctly
- [x] Verify the viewport automatically navigates to the first node with
errors
- [x] Check that node titles correctly display customized names or agent
names
- [x] Test error recovery by fixing validation errors and successfully
running the graph
2026-01-16 11:14:00 +00:00
Abhimanyu Yadav
aa5a039c5e feat(frontend): add special rendering for NOTE UI type in FieldTemplate (#11771)
### Changes 🏗️

Added support for Note blocks in the FieldTemplate component by:
- Importing the BlockUIType enum from the build components types
- Extracting the uiType from the registry.formContext
- Adding a conditional rendering check that returns children directly
when the uiType is BlockUIType.NOTE

This change allows Note blocks to render without the standard field
template wrapper, providing a cleaner display for note-type content.


![Screenshot 2026-01-15 at
1.01.03 PM.png](https://app.graphite.com/user-attachments/assets/7d654eed-abbe-4ec3-9c80-24a77a8373e3.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Created a Note block and verified it renders correctly without
field template wrapper
- [x] Confirmed other block types still render with proper field
template
- [x] Verified that Note blocks maintain proper functionality in the
node graph
2026-01-16 11:10:21 +00:00
Zamil Majdy
8b83bb8647 feat(backend): unified hybrid search with embedding backfill for all content types (#11767)
## Summary

This PR extends the embedding system to support **blocks** and
**documentation** content types in addition to store agents, and
introduces **unified hybrid search** across all content types using a
single `UnifiedContentEmbedding` table.

### Key Changes

1. **Unified Hybrid Search Architecture**
   - Added `search` tsvector column to `UnifiedContentEmbedding` table
- New `unified_hybrid_search()` function searches across all content
types (agents, blocks, docs)
- Updated `hybrid_search()` for store agents to use
`UnifiedContentEmbedding.search`
   - Removed deprecated `search` column from `StoreListingVersion` table

2. **Pluggable Content Handler Architecture**
   - Created abstract `ContentHandler` base class for extensibility
- Implemented handlers: `StoreAgentHandler`, `BlockHandler`,
`DocumentationHandler`
   - Registry pattern for easy addition of new content types

3. **Block Embeddings**
   - Discovers all blocks using `get_blocks()`
- Extracts searchable text from: name, description, categories,
input/output schemas

4. **Documentation Embeddings**
   - Scans `/docs/` directory for `.md` and `.mdx` files
   - Extracts title from first `#` heading or uses filename as fallback

5. **Hybrid Search Graceful Degradation**
- Falls back to lexical-only search if query embedding generation fails
   - Redistributes semantic weight proportionally to other components
   - Logs warning instead of throwing error

6. **Database Migrations**
- `20260115200000_add_unified_search_tsvector`: Adds search column to
UnifiedContentEmbedding with auto-update trigger
- `20260115210000_remove_storelistingversion_search`: Removes deprecated
search column and updates StoreAgent view

7. **Orphan Cleanup**
- `cleanup_orphaned_embeddings()` removes embeddings for deleted content
   - Always runs after backfill, even at 100% coverage

### Review Comments Addressed

-  SQL parameter index bug when user_id provided (embeddings.py)
-  Early return skipping cleanup at 100% coverage (scheduler.py)
-  Inconsistent return structure across code paths (scheduler.py)
-  SQL UNION syntax error - added parentheses for ORDER BY/LIMIT
(hybrid_search.py)
-  Version numeric ordering in aggregations (migration)
-  Embedding dimension uses EMBEDDING_DIM constant

### Files Changed

- `backend/api/features/store/content_handlers.py` (NEW): Handler
architecture
- `backend/api/features/store/embeddings.py`: Refactored to use handlers
- `backend/api/features/store/hybrid_search.py`: Unified search +
graceful degradation
- `backend/executor/scheduler.py`: Process all content types, consistent
returns
- `migrations/20260115200000_add_unified_search_tsvector/`: Add tsvector
to unified table
- `migrations/20260115210000_remove_storelistingversion_search/`: Remove
old search column
- `schema.prisma`: Updated UnifiedContentEmbedding and
StoreListingVersion models
- `*_test.py`: Added tests for unified_hybrid_search

## Test Plan

1.  All tests passing on Python 3.11, 3.12, 3.13
2.  Types check passing
3.  CodeRabbit and Sentry reviews addressed
4. Deploy to staging and verify:
   - Backfill job processes all content types
   - Search results include blocks and docs
   - Search works without OpenAI API (graceful degradation)

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Swifty <craigswift13@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 09:47:19 +01:00
Nicholas Tindle
e80e4d9cbb ci: update dev from gitbook (#11757)
<!-- Clearly explain the need for these changes: -->
gitbook changes via ui

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Docs sync from GitBook**
> 
> - Updates `docs/home/README.md` with a new Developer Platform landing
page (cards, links to Platform, Integrations, Contribute, Discord,
GitHub) and metadata/cover settings
> - Adds `docs/home/SUMMARY.md` defining the table of contents linking
to `README.md`
> - No application/runtime code changes
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
446c71fec8. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-01-15 20:02:48 +00:00
Ubbe
375d33cca9 fix(frontend): agent credentials improvements (#11763)
## Changes 🏗️

### System credentials in Run Modal

We had the issue that "system" credentials were mixed with "user"
credentials in the run agent modal:

#### Before

<img width="400" height="466" alt="Screenshot 2026-01-14 at 19 05 56"
src="https://github.com/user-attachments/assets/9d1ee766-5004-491f-ae14-a0cf89a9118e"
/>

This created confusion among the users. This "system" credentials are
supplied by AutoGPT ( _most of the time_ ) and a user running an agent
should not bother with them ( _unless they want to change them_ ). For
example in this case, the credential that matters is the **Google** one
🙇🏽

### After

<img width="400" height="350" alt="Screenshot 2026-01-14 at 19 04 12"
src="https://github.com/user-attachments/assets/e2bbc015-ce4c-496c-a76f-293c01a11c6f"
/>

<img width="400" height="672" alt="Screenshot 2026-01-14 at 19 04 19"
src="https://github.com/user-attachments/assets/d704dae2-ecb2-4306-bd04-3d812fed4401"
/>

"System" credentials are collapsed by default, reducing noise in the
Task Credentials section. The user can still see and change them by
expanding the accordion.

<img width="400" height="190" alt="Screenshot 2026-01-14 at 19 04 27"
src="https://github.com/user-attachments/assets/edc69612-4588-48e4-981a-f59c26cfa390"
/>

If some "system" credentials are missing, there is a red label
indicating so, it wasn't that obvious with the previous implementation,

<img width="400" height="309" alt="Screenshot 2026-01-14 at 19 04 30"
src="https://github.com/user-attachments/assets/f27081c7-40ad-4757-97b3-f29636616fc2"
/>

### New endpoint

There is a new REST endpoint, `GET /providers/system`, to list system
credential providers so it is easy to access in the Front-end to group
them together vs user ones.

### Other improvements

#### `<CredentialsInput />` refinements

<img width="715" height="200" alt="Screenshot 2026-01-14 at 19 09 31"
src="https://github.com/user-attachments/assets/01b39b16-25f3-428d-a6c8-da608038a38b"
/>

Use a normal browser `<select>` for the Credentials Dropdown ( _when you
have more than 1 for a provider_ ). This simplifies the UI shennagians a
lot and provides a better UX in 📱 ( _eventually we should move all our
selects to the native ones as they are much better for mobile and touch
screens and less code to maintain our end_ ).

I also renamed some files for clarity and tidied up some of the existing
logic.

#### Other

- Fix **Open telemetry** warnings on the server console by making the
packages external
- Fix `require-in-the-middle` console warnings
- Prettier tidy ups

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and test the above
2026-01-15 17:44:44 +07:00
Swifty
3b1b2fe30c feat(backend): Extract backend copilot/chat enhancements from hackathon (#11719)
This PR extracts backend changes from the hackathon/copilot branch,
adding enhanced chat capabilities, agent management tools, store
embeddings, and hybrid search functionality.

### Changes 🏗️

**Chat Features:**
- Added chat database layer (`db.py`) for conversation and message
persistence
- Extended chat models with new types and response structures
- New onboarding system prompt for guided user experiences
- Enhanced chat routes with additional endpoints
- Expanded chat service with more capabilities

**Chat Agent Tools:**
- `agent_output.py` - Handle agent execution outputs
- `create_agent.py` - Tool for creating new agents via chat
- `edit_agent.py` - Tool for modifying existing agents
- `find_library_agent.py` - Search and discover library agents
- Enhanced `run_agent.py` with additional functionality
- New `models.py` for shared tool types

**Store Enhancements:**
- `embeddings.py` - Vector embeddings support for semantic search
- `hybrid_search.py` - Combined keyword and semantic search
- `backfill_embeddings.py` - Utility for backfilling existing data
- Updated store database operations

**Admin:**
- Enhanced store admin routes

**Data Layer:**
- New `understanding.py` module for agent understanding/context

**Database Migrations:**
- `add_chat_tables` - Chat conversation and message tables
- `add_store_embeddings` - Embeddings storage for store items
- `enhance_search` - Search index improvements

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Chat endpoints respond correctly
  - [x] Agent tools (create/edit/find/run) function properly
  - [x] Store embeddings and hybrid search work
  - [x] Database migrations apply cleanly

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

---------

Co-authored-by: Torantulino <40276179@live.napier.ac.uk>
2026-01-15 11:11:36 +01:00
Abhimanyu Yadav
af63b3678e feat(frontend): hide children of connected array and object fields
(#11770)

### Changes 🏗️

- Added conditional rendering for array and object field children based
on connection status
- Implemented `shouldShowChildren` logic in `ArrayFieldTemplate` and
`ObjectFieldTemplate` components
- Modified the `shouldShowChildren` condition in `FieldTemplate` to
handle different schema types
- Imported and utilized `cleanUpHandleId` and `useEdgeStore` to check if
inputs are connected
- Added connection status checks to hide form fields when their inputs
are connected to other nodes

![Screenshot 2026-01-15 at
12.55.32 PM.png](https://app.graphite.com/user-attachments/assets/d3fffade-872e-4fd8-a347-28d1bae3072e.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified that object and array fields hide their children when
connected to other nodes
- [x] Confirmed that unconnected fields display their children properly
- [x] Tested with various schema types to ensure correct rendering
behavior
- [x] Checked that the connection status is properly detected and
applied
2026-01-15 08:10:52 +00:00
Abhimanyu Yadav
631f1bd50a feat(frontend): add interactive tutorial for the new builder interface (#11458)
### Changes 🏗️

This PR adds a comprehensive interactive tutorial for the new Builder UI
to help users learn how to create agents. Key changes include:

- Added a tutorial button to the canvas controls that launches a
step-by-step guide
- Created a Shepherd.js-based tutorial with multiple steps covering:
    - Adding blocks from the Block Menu
    - Understanding input and output handles
    - Configuring block values
    - Connecting blocks together
    - Saving and running agents
- Added data-id attributes to key UI elements for tutorial targeting
- Implemented tutorial state management with a new tutorialStore
- Added helper functions for tutorial navigation and block manipulation
- Created CSS styles for tutorial tooltips and highlights
- Integrated with the Run Input dialog to support tutorial flow
- Added prefetching of tutorial blocks for better performance


https://github.com/user-attachments/assets/3db964b3-855c-4fcc-aa5f-6cd74ab33d7d


### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
    - [x] Complete the tutorial from start to finish
    - [x] Test tutorial on different screen sizes
    - [x] Verify all tutorial steps work correctly
    - [x] Ensure tutorial can be canceled and restarted
- [x] Check that tutorial doesn't interfere with normal builder
functionality
2026-01-15 07:47:27 +00:00
Swifty
5ac941fe2f feat(backend): add hybrid search for store listings, docs and blocks (#11721)
This PR adds hybrid search functionality combining semantic embeddings
with traditional text search for improved store listing discovery.

### Changes 🏗️

- Add `embeddings.py` - OpenAI-based embedding generation and similarity
search
- Add `hybrid_search.py` - Combines vector similarity with text matching
for better search results
- Add `backfill_embeddings.py` - Script to generate embeddings for
existing store listings
- Update `db.py` - Integrate hybrid search into store database queries
- Update `schema.prisma` - Add embedding storage fields and indexes
- Add migrations for embedding columns and HNSW index for vector search

### Architecture Decisions 🏛️

**Fail-Fast Approach (No Silent Fallbacks)**

We explicitly chose NOT to implement graceful degradation when hybrid
search fails. Here's why:

 **Benefits:**
- Errors surface immediately → faster fixes
- Tests verify hybrid search actually works (not just fallback)
- Consistent search quality for all users
- Forces proper infrastructure setup (API keys, database)

 **Why Not Fallback:**
- Silent degradation hides production issues
- Users get inconsistent results without knowing why
- Tests can pass even when hybrid search is broken
- Reduces operational visibility

**How We Prevent Failures:**
1. Embedding generation in approval flow (db.py:1545)
2. Error logging with `logger.error` (not warning)
3. Clear error messages (ValueError explains what's wrong)
4. Comprehensive test coverage (9/9 tests passing)

If embeddings fail, it indicates a real infrastructure issue (missing
API key, OpenAI down, database issues) that needs immediate attention,
not silent degradation.

### Test Coverage 

**All tests passing (1625 total):**
- 9/9 hybrid_search tests (including fail-fast validation)
- 3/3 db search integration tests
- Full schema compatibility (public/platform schemas)
- Error handling verification

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test hybrid search returns relevant results
  - [x] Test embedding generation for new listings
  - [x] Test backfill script on existing data
  - [x] Verify search performance with embeddings
  - [x] Test fail-fast behavior when embeddings unavailable

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] Configuration: Requires `openai_internal_api_key` in secrets

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 04:17:03 +00:00
Reinier van der Leer
b01ea3fcbd fix(backend/executor): Centralize increment_runs calls & make add_graph_execution more robust (#11764)
[OPEN-2946: \[Scheduler\] Error executing graph <graph_id> after 19.83s:
ClientNotConnectedError: Client is not connected to the query engine,
you must call `connect()` before attempting to query
data.](https://linear.app/autogpt/issue/OPEN-2946)

- Follow-up to #11375
  <sub>(broken `increment_runs` call)</sub>
- Follow-up to #11380
  <sub>(direct `get_graph_execution` call)</sub>

### Changes 🏗️

- Move `increment_runs` call from `scheduler._execute_graph` to
`executor.utils.add_graph_execution` so it can be made through
`DatabaseManager`
  - Add `increment_onboarding_runs` to `DatabaseManager`
- Remove now-redundant `increment_onboarding_runs` calls in other places
- Make `add_graph_execution` more resilient
  - Split up large try/except block
  - Fix direct `get_graph_execution` call

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI + a thorough review
2026-01-15 04:08:19 +00:00
Reinier van der Leer
3b09a94e3f feat(frontend/builder): Add sub-graph update UX (#11631)
[OPEN-2743: Ability to Update Sub-Agents in Graph (Without
Re-Adding)](https://linear.app/autogpt/issue/OPEN-2743/ability-to-update-sub-agents-in-graph-without-re-adding)

Updating sub-graphs is a cumbersome experience at the moment, this
should help. :)

Demo in Builder v2:


https://github.com/user-attachments/assets/df564f32-4d1d-432c-bb91-fe9065068360


https://github.com/user-attachments/assets/f169471a-1f22-46e9-a958-ddb72d3f65af


### Changes 🏗️

- Add sub-graph update banner with I/O incompatibility notification and
resolution mode
  - Red visual indicators for broken inputs/outputs and edges
  - Update bars and tooltips show compatibility details
- Sub-agent update UI with compatibility checks, incompatibility dialog,
and guided resolution workflow
- Resolution mode banner guiding users to remove incompatible
connections
- Visual controls to stage/apply updates and auto-apply when broken
connections are fixed
  
  Technical:
- Builder v1: Add `CustomNode` > `IncompatibilityDialog` +
`SubAgentUpdateBar` sub-components
- Builder v2: Add `SubAgentUpdateFeature` + `ResolutionModeBar` +
`IncompatibleUpdateDialog` + `useSubAgentUpdateState` sub-components
  - Add `useSubAgentUpdate` hook

- Related fixes in Builder v1:
  - Fix static edges not rendering as such
  - Fix edge styling not applying
- Related fixes in Builder v2:
  - Fix excess spacing for nested node input fields

Other:
- "Retry" button in error view now reloads the page instead of
navigating to `/marketplace`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI for existing frontend UX flows
- [x] Updating to a new sub-agent version with compatibility issues: UX
flow works
- [x] Updating to a new sub-agent version with *no* compatibility
issues: works
  - [x] Designer approves of the look

---------

Co-authored-by: abhi1992002 <abhimanyu1992002@gmail.com>
Co-authored-by: Abhimanyu Yadav <122007096+Abhi1992002@users.noreply.github.com>
2026-01-14 13:25:20 +00:00
Zamil Majdy
61efee4139 fix(frontend): Remove hardcoded bypass of billing feature flag (#11762)
## Summary

Fixes a critical security issue where the billing button in the settings
sidebar was always visible to all users, bypassing the
`ENABLE_PLATFORM_PAYMENT` feature flag.

## Changes 🏗️

- Removed hardcoded `|| true` condition in
`frontend/src/app/(platform)/profile/(user)/layout.tsx:32` that was
bypassing the feature flag check
- The billing button is now properly gated by the
`ENABLE_PLATFORM_PAYMENT` feature flag as intended

## Root Cause

The `|| true` was accidentally left in commit
3dbc03e488 (PR #11617 - OAuth API & Single
Sign-On) from December 19, 2025. It was likely added temporarily during
development/testing to always show the billing button, but was not
removed before merging.

## Test Plan

1. Verify feature flag is set to disabled in LaunchDarkly
2. Navigate to settings page (`/profile/settings`)
3. Confirm billing button is NOT visible in the sidebar
4. Enable feature flag in LaunchDarkly
5. Refresh page and confirm billing button IS now visible
6. Verify billing page (`/profile/credits`) is still accessible via
direct URL when feature flag is disabled

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan

Fixes SECRT-1791

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* The Billing link in the profile sidebar now respects the payment
feature flag configuration and will only display when payment
functionality is enabled.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-14 03:28:36 +00:00
Bently
e539280e98 fix(blocks): set User-Agent header and URL-encode topic in GetWikipediaSummaryBlock (#11754)
The GetWikipediaSummaryBlock was returning HTTP 403 errors from
Wikipedia's API because it wasn't explicitly setting a User-Agent header
that complies with https://wikitech.wikimedia.org/wiki/Robot_policy.
Additionally, topics with spaces or special characters would cause
malformed URLs.

Fixes: OPEN-2889

Changes 🏗️

- URL-encode the topic parameter using urllib.parse.quote() to handle
spaces and special characters
- Explicitly set required headers per Wikimedia robot policy:
- User-Agent: Platform default user agent (includes app name, URL, and
contact email)
- Accept-Encoding: gzip, deflate: Recommended by Wikimedia to reduce
bandwidth
- Updated test mock to match the new function signature

Checklist 📋

For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verify code passes syntax check
  - [x] Verify code passes ruff linting
- [x] Create an agent using GetWikipediaSummaryBlock with a topic
containing spaces (e.g., "Artificial Intelligence")
  - [x] Verify the block returns a Wikipedia summary without 403 errors

For configuration changes:

- .env.default is updated or already compatible with my changes
- docker-compose.yml is updated or already compatible with my changes
- I have included a list of my configuration changes in the PR
description (under Changes)
.
N/A - No configuration changes required.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Improved Wikipedia API requests by adding compatible request headers
(including a proper user agent and encoding acceptance) for more
reliable responses.
* Enhanced handling of search topics by URL-encoding terms so queries
with spaces or special characters return correct results.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-13 12:24:51 +00:00
Toran Bruce Richards
db8b43bb3d feat(blocks): Add WordPress Get All Posts block and Publish Post draft toggle (#11003)
**Implements issue #11002**

This PR adds WordPress post management functionality and improves error
handling in DataForSEO blocks.

### Changes 🏗️

1. **New WordPress Blocks:**
- Added `WordPressGetAllPostsBlock` - Fetches posts from WordPress sites
with filtering and pagination support
- Enhanced `WordPressCreatePostBlock` with `publish_as_draft` toggle to
control post publication status

2. **WordPress API Enhancements:**
- Added `get_posts()` function in `_api.py` to retrieve posts with
filtering by status
- Added `PostsResponse` model for handling WordPress posts list API
responses
- Support for pagination with `number` and `offset` parameters (max 100
posts per request)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  
  **Test Plan:**
- [x] Test `WordPressGetAllPostsBlock` with valid WordPress credentials
  - [x] Verify filtering posts by status (publish, draft, pending, etc.)
  - [x] Test pagination with different number and offset values
- [x] Test `WordPressCreatePostBlock` with publish_as_draft=True to
create draft posts
- [x] Test `WordPressCreatePostBlock` with publish_as_draft=False to
publish posts publicly

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

**Note:** No configuration changes were required for this PR.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added a WordPress “Get All Posts” block to fetch posts with optional
status filtering and pagination; returns total found and post details.
* **Enhancements**
* WordPress “Create Post” block now supports a “Publish as draft”
option, allowing posts to be created as drafts or published immediately.
* WordPress blocks are now surfaced consistently in the block catalog
for easier use.
* **Error Handling**
* Clearer error messages when fetching posts fails, aiding
troubleshooting.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces WordPress post listing and improves post creation and API
robustness.
> 
> - Adds `WordPressGetAllPostsBlock` to fetch posts with optional
`status` filter and pagination (`number`, `offset`); outputs `found`,
`posts`, and streams each `post`
> - Enhances `WordPressCreatePostBlock` with `publish_as_draft` input
and adds `site` to outputs; sets `status` accordingly
> - WordPress API updates in `_api.py`: new `get_posts`, `Post`,
`PostsResponse`, and `normalize_site`; apply
`Requests(raise_for_status=False)` across OAuth/token/info and post
creation; better error propagation
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
10be1c4709. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Toran Bruce Richards <Torantulino@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-12 19:57:47 +00:00
Abhimanyu Yadav
923d8baedc feat(frontend): add JsonTextField component for complex nested form data (#11752)
### Changes 🏗️

- Added a new `JsonTextField` component to handle complex nested JSON
types (objects/arrays inside other objects/arrays)
- Created helper functions for JSON parsing, validation, and formatting
- Implemented `useJsonTextField` hook to manage state and validation
- Enhanced `generateUiSchemaForCustomFields` to detect nested complex
types and render them as JSON text fields
- Updated `TextInputExpanderModal` to support JSON-specific styling
- Added `JSON_TEXT_FIELD_ID` constant to custom registry for field
identification

This change improves the user experience by preventing deeply nested
form UIs. Instead, complex nested structures are presented as editable
JSON text fields with proper validation and formatting.

### Before

![Screenshot 2026-01-12 at
1.07.54 PM.png](https://app.graphite.com/user-attachments/assets/dc2b96cc-562a-4e6b-8278-76de941e3bd9.png)

### After

![Screenshot 2026-01-12 at
12.35.19 PM.png](https://app.graphite.com/user-attachments/assets/ea0028a5-c119-43c3-8100-b103484e0b54.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test with simple JSON objects in forms
  - [x] Test with nested arrays and objects
  - [x] Test with anyOf/oneOf schemas containing complex types
  - [x] Test the expander modal with JSON content

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* New JSON text field with expandable modal editor, inline validation,
and helpful placeholders.
* Complex nested objects/arrays now render as JSON fields to simplify
editing.
* Modal editor uses monospace, smaller text when editing JSON for
improved readability.

* **Chores**
* Added a non-functional runtime debug log (no user-facing behavior
changes).

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 12:22:41 +00:00
Abhimanyu Yadav
a55b2e02dc feat(frontend): enhance CredentialsInput and CredentialRow components with variant support (#11753)
### Changes 🏗️

- Added a new `variant` prop to `CredentialsInput` component with
options "default" or "node"
- Implemented compact styling for the "node" variant in `CredentialRow`
component
- Modified layout and overflow handling for credential display in node
context
- Added conditional rendering of masked key display based on variant
- Passed the variant prop through the component hierarchy
- Applied the "node" variant to the `CredentialsField` component with
appropriate styling

Before

![Screenshot 2026-01-12 at
4.39.35 PM.png](https://app.graphite.com/user-attachments/assets/2b605b2d-7abf-4e8a-adc5-6a6e8b712ef7.png)

After

![Screenshot 2026-01-12 at
4.55.39 PM.png](https://app.graphite.com/user-attachments/assets/20bb1452-870a-4111-a246-c4e3a3b456ea.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified credential selection works correctly in node context
  - [x] Confirmed compact styling is applied properly in node variant
  - [x] Tested overflow handling for long credential names
  - [x] Verified both default and node variants display correctly

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Credential input and selection components now support multiple
configurable visual variants, enabling better text display handling,
optimized layouts, and improved visual consistency across different
application contexts and specific use cases.

* **Style**
* Credential field displays now feature enhanced text truncation and
overflow management for a more polished and consistent user interface
experience.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 12:22:20 +00:00
Abhimanyu Yadav
6b6648b290 feat(frontend): add Table component with TableField renderer for tabular data input (#11751)
### Changes 🏗️

- Added a new `Table` component for handling tabular data input
- Created supporting hooks and helper functions for the Table component
- Added Storybook stories to showcase different Table configurations
- Implemented a custom `TableField` renderer for JSON Schema forms
- Updated type display info to support the new "table" format
- Added schema matcher to detect and render table fields appropriately

![Screenshot 2026-01-12 at
11.29.04 AM.png](https://app.graphite.com/user-attachments/assets/71469d59-469f-4cb0-882b-a49791fe948d.png)

![Screenshot 2026-01-12 at
11.28.54 AM.png](https://app.graphite.com/user-attachments/assets/81193f32-0e16-435e-bb66-5d2aea98266a.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified Table component renders correctly with various
configurations
  - [x] Tested adding and removing rows in the Table
- [x] Confirmed data changes are properly tracked and reported via
onChange
  - [x] Verified TableField renderer works with JSON Schema forms
  - [x] Checked that table format is properly detected in the schema

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added a Table component for displaying and editing tabular data with
support for adding/deleting rows, read-only mode, and customizable
labels.
* Added support for rendering array fields as tables in form inputs with
configurable columns and values.

* **Tests**
* Added comprehensive Storybook stories demonstrating various Table
configurations and behaviors.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 10:32:14 +00:00
Abhimanyu Yadav
c0a9c0410b feat(frontend): add MultiSelectField component and improve node title cursor styling (#11744)
## Changes 🏗️

- Added a new `MultiSelectField` component for handling multiple boolean
selections in a dropdown format
- Implemented `useMultiSelectField` hook to manage the state and logic
of the multi-select component
- Added support for custom fields in `AnyOfField` by checking if the
option schema matches a custom field
- Added `isMultiSelectSchema` utility function to detect schemas
suitable for the multi-select component
- Added hover cursor styling to node headers to indicate text
editability

![Screenshot 2026-01-10 at
11.15.12 AM.png](https://app.graphite.com/user-attachments/assets/8254497b-604f-4ccc-a40b-eb8994c073b4.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified that multi-select fields render correctly in the UI
  - [x] Confirmed that selecting multiple options works as expected
  - [x] Tested that the node header shows the text cursor on hover
- [x] Verified that AnyOf fields correctly use custom field renderers
when applicable

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a multi-select field allowing selection of multiple options with
improved selection UI.
* AnyOf options can now resolve and render custom field types, improving
form composition when schemas map to custom controls.

* **Style**
  * Tooltip header cursor updated for clearer hover feedback.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 09:48:58 +00:00
Abhimanyu Yadav
17a77b02c7 fix(frontend): exclude schemas with enum from anyOf detection (#11743)
### Changes 🏗️

Fixed the `isAnyOfSchema` function in schema-utils.ts to exclude schemas
that have an `enum` property. This prevents incorrect schema processing
for enums that also have anyOf definitions. Added a console.log
statement in FormRenderer.tsx to help debug schema preprocessing.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified that forms with enum values render correctly
- [x] Confirmed that anyOf schemas are properly identified and processed
- [x] Tested with various schema combinations to ensure the fix doesn't
break existing functionality

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Bug Fixes
* Improved validation logic for form field schemas to correctly handle
edge cases when multiple constraint types are defined.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 09:48:47 +00:00
Zamil Majdy
701fce83ca fix(backend): add missing metadata attribute to mock nodes in SmartDecisionMaker tests (#11750)
This PR fixes failing SmartDecisionMaker tests by adding missing
`metadata` attribute to mock nodes.

### Changes 🏗️

Mock nodes in SmartDecisionMaker tests were missing the `metadata = {}`
attribute, which was introduced in commit 4a52b7eca for the
customized_name feature. This caused tests to fail with:

```
TypeError: expected string or bytes-like object, got 'Mock'
```

**Files fixed**:
- `backend/blocks/test/test_smart_decision_maker_dict.py`: Added
`metadata = {}` to mock nodes in all 3 tests
- `backend/blocks/test/test_smart_decision_maker_dynamic_fields.py`:
Added `metadata = {}` to mock nodes in all 8 tests

**Root cause**: The `_create_block_function_signature` method calls
`sink_node.metadata.get("customized_name")`, but mock nodes in tests
didn't have the metadata attribute initialized.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run `poetry run pytest
backend/blocks/test/test_smart_decision_maker_dict.py -xvs` - 3 passed
- [x] Run `poetry run pytest
backend/blocks/test/test_smart_decision_maker_dynamic_fields.py -xvs` -
8 passed
  - [x] All tests pass successfully

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Tests**
* Updated test infrastructure to enhance mock object configuration for
improved test reliability and consistency across test suites.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-11 17:00:36 -06:00
Zamil Majdy
78d89d0faf Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-01-11 13:09:23 -06:00
Zamil Majdy
f482eb668b hotfix(backend): resolve tool pin name mismatch in SmartDecisionMakerBlock (#11749)
## Root Cause

Execution a40bdb4a-964d-4684-94e8-b148eb6bcfc2 and all similar
executions have been failing since Nov 12, 2025 when tool pin routing
was refactored to use node IDs. The SmartDecisionMakerBlock was
double-sanitizing field names when emitting tool call outputs:

```python
# Original field name from link: "Max Keyword Difficulty"
original_field_name = field_mapping.get(clean_arg_name)  #  Retrieved correctly
sanitized_arg_name = self.cleanup(original_field_name)   #  Sanitized AGAIN!
emit_key = f"tools_^_{node_id}_~_{sanitized_arg_name}"   # Emits "max_keyword_difficulty"
```

But the parser expected original names from graph links:
```python
# Parser expects: "Max Keyword Difficulty" (from link.sink_name)
# Emit provides: "max_keyword_difficulty" (sanitized)
# Result: Mismatch → Tool never executes
```

### Changes 🏗️

**1. Fixed Emit Logic** (`smart_decision_maker.py` line 1135)
- Removed double sanitization: `sanitized_arg_name =
self.cleanup(original_field_name)`
- Now emits with original field names: `emit_key =
f"tools_^_{node_id}_~_{original_field_name}"`

**2. Made Agent Nodes Consistent** (`smart_decision_maker.py` lines
497-530)
- Added `field_mapping` to agent function signatures (was missing)
- Agent signatures now sanitize property keys for Anthropic API (like
block signatures)
- Stores field_mapping for use during emit

### Impact

**Fixes:**
-  All graphs with multi-word field names (e.g., "Max Keyword
Difficulty", "Minimum Volume")
-  All graphs with special characters in field names (e.g., "API-Key")
-  Both block nodes AND agent nodes now work consistently

**Unaffected:**
- Single-word lowercase field names (e.g., "keyword", "url") - these
were already working

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified parse_execution_output handles exact match correctly
  - [x] Verified emit uses original field names
  - [x] Verified field_mapping works for both block and agent nodes
- [x] Re-run execution a40bdb4a-964d-4684-94e8-b148eb6bcfc2 after
deployment to verify fix

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
(no changes)
- [x] `docker-compose.yml` is updated or already compatible with my
changes (no changes)
- [x] No configuration changes in this PR

### Test Plan

1. **Unit test validation** (completed):
- Field name cleanup: "Max Keyword Difficulty" →
"max_keyword_difficulty" 
   - Parse with exact match: Success 
   - Parse with mismatch: Returns None 

2. **Production validation** (to be done after deployment):
   - Re-run execution a40bdb4a-964d-4684-94e8-b148eb6bcfc2
- Verify AgentExecutor (node 767682f5-694f-4b2a-bf52-fbdcad6a4a4f)
executes successfully
   - Verify execution completes with high correctness score (not 0.20)
   - Monitor for any regressions in existing graphs

### Files Changed

- `backend/blocks/smart_decision_maker.py`: Remove double sanitization,
add agent field_mapping

### Related Issues

- Resolves execution failure a40bdb4a-964d-4684-94e8-b148eb6bcfc2
- Fixes bug introduced in commit 536e2a5ec (Nov 12, 2025)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Improved field name mapping consistency in the SmartDecisionMaker
block to ensure proper handling of field names throughout function
signatures and tool execution workflows.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 02:08:12 +07:00
Nicholas Tindle
4a52b7eca0 fix(backend): use customized block names in smart decision maker
The SmartDecisionMakerBlock now respects the customized_name field from
node metadata when generating tool function signatures for the LLM.

Previously, the block always used the static block.name from the block
class definition, ignoring any custom names users set in the builder UI.

Changes:
- _create_block_function_signature: Check sink_node.metadata for
  customized_name before falling back to block.name
- _create_agent_function_signature: Check sink_node.metadata for
  customized_name before falling back to sink_graph_meta.name
- Added 4 unit tests for the customized_name feature

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 16:51:39 -07:00
Zamil Majdy
97847f59f7 feat(backend): add human-in-the-loop review system for blocks requiring approval (#11732)
## Summary
Introduces a comprehensive Human-In-The-Loop (HITL) review system that
allows any block to require human approval before execution. This
extends the existing HITL infrastructure to support automatic review
requests for potentially dangerous operations.

## 🚀 Key Features

### **Automatic HITL for Any Block**
- **Simple opt-in**: Set `self.requires_human_review = True` in any
block constructor
- **Safe mode integration**: Only activates when
`execution_context.safe_mode = True`
- **Seamless workflow**: Blocks pause execution → Human reviews via
existing UI → Execution continues or stops

### **Unified Review Infrastructure**
- **Shared HITLReviewHelper**: Clean, reusable helper class for all
review operations
- **Single API**: `handle_review_decision()` method with structured
return type
- **Type-safe**: Proper typing with non-nullable
`ReviewDecision.review_result`

### **Smart Graph Detection** 
- **Updated `has_human_in_the_loop`**: Now detects both dedicated HITL
blocks and blocks with `requires_human_review = True`
- **Frontend awareness**: UI can properly indicate graphs requiring
human intervention

## 🏗️ Implementation

### **Block Usage**
```python
class MyBlock(Block):
    def __init__(self):
        super().__init__(...)
        self.requires_human_review = True  # Enable automatic HITL
        
    async def run(self, input_data, **kwargs):
        # If we reach here, either safe mode is off OR human approved
        # No additional HITL code needed - handled automatically by base class
        yield "result", "Operation completed"
```

### **Review Workflow**
1. **Block execution starts** → Base class checks
`requires_human_review` flag
2. **Safe mode enabled** → Creates review entry, pauses execution 
3. **Human reviews** → Uses existing review UI to approve/reject
4. **Execution resumes** → Continues if approved, raises error if
rejected
5. **Safe mode disabled** → Executes normally without review

## 🔧 Technical Improvements

### **Code Quality Enhancements**
- **Better naming**: `risky_block` → `requires_human_review` (clearer
intent)
- **Type safety**: Non-nullable `ReviewDecision.review_result`
(eliminates Optional checks)
- **Exhaustive handling**: Proper error handling for unexpected review
statuses
- **Clean exception handling**: Removed redundant try-catch-log-reraise
patterns

### **Architecture Fixes**
- **Circular import resolution**: Fixed `ExecutionContext` import issues
breaking 444+ block tests
- **Early returns**: Cleaner control flow without nested conditionals
- **Defensive programming**: Handles edge cases with clear error
messages

## 📊 Changes Made

### **Core Files**
- **`Block.requires_human_review`**: New flag for marking blocks
requiring approval
- **`HITLReviewHelper`**: Shared helper class with clean, testable API
- **`HumanInTheLoopBlock`**: Refactored to use shared infrastructure
- **`Graph.has_human_in_the_loop`**: Updated to include review-requiring
blocks

### **Quality Improvements**
- **Type hints**: Proper typing throughout with runtime compatibility
- **Error handling**: Exhaustive status handling with descriptive errors
- **Code reduction**: -16 lines through removal of redundant exception
handling
- **Test compatibility**: All 444/445 block tests pass

##  Testing & Validation

- **All tests pass**: 444/445 block tests passing 
- **Type checking**: All pyright/mypy checks pass   
- **Formatting**: All linting and formatting checks pass 
- **Circular imports**: Resolved import issues that were breaking tests

- **Backward compatibility**: Existing HITL functionality unchanged 

## 🎯 Use Cases

This enables automatic human oversight for blocks performing:
- **File operations**: Deletion, modification, system access
- **External API calls**: Payments, data modifications, destructive
operations
- **System commands**: Shell execution, configuration changes
- **Data processing**: Sensitive data handling, compliance-required
operations

## 🔄 Migration Path

**Existing code**: No changes required - fully backward compatible
**New blocks**: Simply set `self.requires_human_review = True` to enable
automatic HITL
**Safe mode**: Controls whether review requests are created (production
vs development)

---

This creates a robust, type-safe foundation for human oversight in
automated workflows while maintaining the existing HITL user experience
and API compatibility.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Human-in-the-loop review support so executions can pause for human
review and resume based on decisions.

* **Improvements**
* Blocks can opt into requiring human review and will use reviewed input
when proceeding.
* Unified review decision flow with clearer approved/rejected outcomes
and messaging.
* Graph detection expanded to recognize nodes that require human review.

* **Chores**
  * Test config adjusted to avoid pytest plugin conflicts.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 21:14:37 +00:00
Zamil Majdy
22ca8955c5 fix(backend): library agent creation and version update improvements (#11731)
## Summary
Fixes library agent creation and version update logic to properly handle
both user-created and marketplace agents.

## Changes
- **Remove useGraphIsActiveVersion filter** from
`update_agent_version_in_library` to allow both manual and auto updates
- **Set useGraphIsActiveVersion correctly**:
- `False` for marketplace agents (require manual updates to avoid
breaking workflows)
- `True` for user-created agents (can safely auto-update since user
controls source)
- Update function documentation to reflect new behavior

## Problem Solved
- Marketplace agents can now be updated manually via API
- User-created agents maintain auto-update capability  
- Resolves Sentry error AUTOGPT-SERVER-722 about "Expected a record,
found none"
- Fixes store submission modal issues

## Test Plan
- [x] Verify marketplace agents are created with
`useGraphIsActiveVersion: False`
- [x] Verify user agents are created with `useGraphIsActiveVersion:
True`
- [x] Confirm `update_agent_version_in_library` works for both types
- [x] Test store submission flow works without modal issues

## Review Notes
This change ensures proper separation between user-controlled agents
(auto-update) and marketplace agents (manual update), while allowing the
API to service both use cases.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **New Features**
* Enhanced agent publishing workflow with improved version tracking and
change detection for marketplace updates

* **Bug Fixes**
  * Improved error handling when updating agent versions in the library
  * Better detection of unpublished changes before publishing agents

* **Improvements**
* Changes Summary field now supports longer descriptions (up to 500
characters) with multi-line editing capability

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 21:14:05 +00:00
Nicholas Tindle
43cbe2e011 feat!(blocks): Add Reddit OAuth2 integration and advanced Reddit blocks (#11623)
Replaces user/password Reddit credentials with OAuth2, adds
RedditOAuthHandler, and updates Reddit blocks to support OAuth2
authentication. Introduces new blocks for creating posts, fetching post
details, searching, editing posts, and retrieving subreddit info.
Updates test credentials and input handling to use OAuth2 tokens.

<!-- Clearly explain the need for these changes: -->

### Changes 🏗️
Rebuild the reddit blocks to support oauth2 rather than requiring users
to provide their password and username.
This is done via a swap from script based to web based authentication on
the reddit side faciliatated by the approval of an oauth app by reddit
on the account `ntindle`
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Build a super agent
  - [x] Upload the super agent and a video of it working

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces full Reddit OAuth2 support and substantially expands Reddit
capabilities across the platform.
> 
> - Adds `RedditOAuthHandler` with token exchange, refresh, revoke;
registers handler in `integrations/oauth/__init__.py`
> - Refactors Reddit blocks to use `OAuth2Credentials` and `praw` via
refresh tokens; updates models (e.g., `post_id`, richer outputs) and
adds `strip_reddit_prefix`
> - New blocks: create/edit/delete posts, post/get/delete comments,
reply to comments, get post details, user posts (self/others), search,
inbox, subreddit info/rules/flairs, send messages
> - Updates default `settings.config.reddit_user_agent` and test
credentials; minor `.branchlet.json` addition
> - Docs: clarifies block error-handling with
`BlockInputError`/`BlockExecutionError` guidance
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
4f1f26c7e7. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added OAuth2-based authentication for Reddit integration, replacing
legacy credential methods
* Expanded Reddit capabilities with new blocks for creating posts,
retrieving post details, managing comments, accessing inbox, and
fetching user/subreddit information
* Enhanced data models to support richer Reddit interactions and
chainable workflows

* **Documentation**
* Updated error handling guidance to distinguish between validation
errors and runtime errors with improved exception patterns

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-01-09 20:53:03 +00:00
Nicholas Tindle
a318832414 feat(docs): update dev from gitbook changes (#11740)
<!-- Clearly explain the need for these changes: -->
gitbook branch has changes that need synced to dev
### Changes 🏗️
Pull changes from gitbook into dev
<!-- Concisely describe all of the changes made in this pull request:
-->

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Migrates documentation to GitBook and removes the old MkDocs setup.
> 
> - Removes MkDocs configuration and infra: `docs/mkdocs.yml`,
`docs/netlify.toml`, `docs/overrides/main.html`,
`docs/requirements.txt`, and JS assets (`_javascript/mathjax.js`,
`_javascript/tablesort.js`)
> - Updates `docs/content/contribute/index.md` to describe GitBook
workflow (gitbook branch, editing, previews, and `SUMMARY.md`)
> - Adds GitBook navigation file `docs/platform/SUMMARY.md` and a new
platform overview page `docs/platform/what-is-autogpt-platform.md`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e7e118b5a8. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Updated contribution guide for new documentation platform and workflow
  * Added new platform overview and navigation documentation

* **Chores**
  * Removed MkDocs configuration and related dependencies
  * Removed deprecated JavaScript integrations and deployment overrides

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 19:22:05 +00:00
Swifty
843c487500 feat(backend): add prisma types stub generator for pyright compatibility (#11736)
Prisma's generated `types.py` file is 57,000+ lines with complex
recursive TypedDict definitions that exhaust Pyright's type inference
budget. This causes random type errors and makes the type checker
unreliable.

### Changes 🏗️

- Add `gen_prisma_types_stub.py` script that generates a lightweight
`.pyi` stub file
- The stub preserves safe types (Literal, TypeVar) while collapsing
complex TypedDicts to `dict[str, Any]`
- Integrate stub generation into all workflows that run `prisma
generate`:
  - `platform-backend-ci.yml`
  - `claude.yml`
  - `claude-dependabot.yml`
  - `copilot-setup-steps.yml`
  - `docker-compose.platform.yml`
  - `Dockerfile`
  - `Makefile` (migrate & reset-db targets)
  - `linter.py` (lint & format commands)
- Add `gen-prisma-stub` poetry script entry
- Fix two pre-existing type errors that were previously masked:
- `store/db.py`: Replace private type
`_StoreListingVersion_version_OrderByInput` with dict literal
  - `airtable/_webhook.py`: Add cast for `Serializable` type

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run `poetry run format` - passes with 0 errors (down from 57+)
  - [x] Run `poetry run lint` - passes with 0 errors
  - [x] Run `poetry run gen-prisma-stub` - generates stub successfully
- [x] Verify stub file is created at correct location with proper
content

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Added a lightweight Prisma type-stub generator and integrated it into
build, lint, CI/CD, and container workflows.
* Build, migration, formatting, and lint steps now generate these stubs
to improve type-checking performance and reduce overhead during builds
and deployments.
  * Exposed a project command to run stub generation manually.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 16:31:10 +01:00
Nicholas Tindle
47a3a5ef41 feat(backend,frontend): optional credentials flag for blocks at agent level (#11716)
This feature allows agent makers to mark credential fields as optional.
When credentials are not configured for an optional block, the block
will be skipped during execution rather than causing a validation error.

**Use case:** An agent with multiple notification channels (Discord,
Twilio, Slack) where the user only needs to configure one - unconfigured
channels are simply skipped.

### Changes 🏗️

#### Backend

**Data Model Changes:**
- `backend/data/graph.py`: Added `credentials_optional` property to
`Node` model that reads from node metadata
- `backend/data/execution.py`: Added `nodes_to_skip` field to
`GraphExecutionEntry` model to track nodes that should be skipped

**Validation Changes:**
- `backend/executor/utils.py`:
- Updated `_validate_node_input_credentials()` to return a tuple of
`(credential_errors, nodes_to_skip)`
- Nodes with `credentials_optional=True` and missing credentials are
added to `nodes_to_skip` instead of raising validation errors
- Updated `validate_graph_with_credentials()` to propagate
`nodes_to_skip` set
- Updated `validate_and_construct_node_execution_input()` to return
`nodes_to_skip`
- Updated `add_graph_execution()` to pass `nodes_to_skip` to execution
entry

**Execution Changes:**
- `backend/executor/manager.py`:
  - Added skip logic in `_on_graph_execution()` dispatch loop
- When a node is in `nodes_to_skip`, it is marked as `COMPLETED` without
execution
  - No outputs are produced, so downstream nodes won't trigger

#### Frontend

**Node Store:**
- `frontend/src/app/(platform)/build/stores/nodeStore.ts`:
- Added `credentials_optional` to node metadata serialization in
`convertCustomNodeToBackendNode()`
- Added `getCredentialsOptional()` and `setCredentialsOptional()` helper
methods

**Credential Field Component:**
-
`frontend/src/components/renderers/input-renderer/fields/CredentialField/CredentialField.tsx`:
  - Added "Optional - skip block if not configured" switch toggle
  - Switch controls the `credentials_optional` metadata flag
  - Placeholder text updates based on optional state

**Credential Field Hook:**
-
`frontend/src/components/renderers/input-renderer/fields/CredentialField/useCredentialField.ts`:
  - Added `disableAutoSelect` parameter
- When credentials are optional, auto-selection of credentials is
disabled

**Feature Flags:**
- `frontend/src/services/feature-flags/use-get-flag.ts`: Minor refactor
(condition ordering)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Build an agent using smart decision maker and down stream blocks
to test this

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces optional credentials across graph execution and UI,
allowing nodes to be skipped (no outputs, no downstream triggers) when
their credentials are not configured.
> 
> - Backend
> - Adds `Node.credentials_optional` (from node `metadata`) and computes
required credential fields in `Graph.credentials_input_schema` based on
usage.
> - Validates credentials with `_validate_node_input_credentials` →
returns `(errors, nodes_to_skip)`; plumbs `nodes_to_skip` through
`validate_graph_with_credentials`,
`_construct_starting_node_execution_input`,
`validate_and_construct_node_execution_input`, and `add_graph_execution`
into `GraphExecutionEntry`.
> - Executor: dispatch loop skips nodes in `nodes_to_skip` (marks
`COMPLETED`); `execute_node`/`on_node_execution` accept `nodes_to_skip`;
`SmartDecisionMakerBlock.run` filters tool functions whose
`_sink_node_id` is in `nodes_to_skip` and errors only if all tools are
filtered.
> - Models: `GraphExecutionEntry` gains `nodes_to_skip` field. Tests and
snapshots updated accordingly.
> 
> - Frontend
> - Builder: credential field uses `custom/credential_field` with an
"Optional – skip block if not configured" toggle; `nodeStore` persists
`credentials_optional` and history; UI hides optional toggle in run
dialogs.
> - Run dialogs: compute required credentials from
`credentials_input_schema.required`; allow selecting "None"; avoid
auto-select for optional; filter out incomplete creds before execute.
>   - Minor schema/UI wiring updates (`uiSchema`, form context flags).
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
5e01fd6a3e. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-09 14:11:35 +00:00
Ubbe
ec00aa951a fix(frontend): agent favorites layout (#11733)
## Changes 🏗️

<img width="800" height="744" alt="Screenshot 2026-01-09 at 16 07 08"
src="https://github.com/user-attachments/assets/034c97e2-18f3-441c-a13d-71f668ad672f"
/>

- Remove feature flag for agent favourites ( _keep it always visible_ )
- Fix the layout on the card so the ❤️ icon appears next to the `...`
menu
- Remove icons on toasts

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and check the above


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Favorites now respond to the current search term and are available to
all users (no feature-flag).

* **UI/UX Improvements**
* Redesigned Favorites section with simplified header, inline agent
counts, updated spacing/dividers, and removal of skeleton placeholders.
  * Favorite button repositioned and visually simplified on agent cards.
* Toast visuals simplified by removing per-type icons and adjusting
close-button positioning.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 18:52:07 +07:00
Zamil Majdy
36fb1ea004 fix(platform): store submission validation and marketplace improvements (#11706)
## Summary

Major improvements to AutoGPT Platform store submission deletion,
creator detection, and marketplace functionality. This PR addresses
critical issues with submission management and significantly improves
performance.

### 🔧 **Store Submission Deletion Issues Fixed**

**Problems Solved**:
-  **Wrong deletion granularity**: Deleting entire `StoreListing` (all
versions) when users expected to delete individual submissions
-  **"Graph not found" errors**: Cascade deletion removing AgentGraphs
that were still referenced
-  **Multiple submissions deleted**: When removing one submission, all
submissions for that agent were removed
-  **Deletion of approved content**: Users could accidentally remove
live store content

**Solutions Implemented**:
-  **Granular deletion**: Now deletes individual `StoreListingVersion`
records instead of entire listings
-  **Protected approved content**: Prevents deletion of approved
submissions to keep store content safe
-  **Automatic cleanup**: Empty listings are automatically removed when
last version is deleted
-  **Simplified logic**: Reduced deletion function from 85 lines to 32
lines for better maintainability

### 🔧 **Creator Detection Performance Issues Fixed**

**Problems Solved**:
-  **Inefficient API calls**: Fetching ALL user submissions just to
check if they own one specific agent
-  **Complex logic**: Convoluted creator detection requiring multiple
database queries
-  **Performance impact**: Especially bad for non-creators who would
never need this data

**Solutions Implemented**:
-  **Added `owner_user_id` field**: Direct ownership reference in
`LibraryAgent` model
-  **Simple ownership check**: `owner_user_id === user.id` instead of
complex submission fetching
-  **90%+ performance improvement**: Massive reduction in unnecessary
API calls for non-creators
-  **Optimized data fetching**: Only fetch submissions when user is
creator AND has marketplace listing

### 🔧 **Original Store Submission Validation Issues (BUILDER-59F)**
Fixes "Agent not found for this user. User ID: ..., Agent ID: , Version:
0" errors:

- **Backend validation**: Added Pydantic validation for `agent_id`
(min_length=1) and `agent_version` (>0)
- **Frontend validation**: Pre-submission validation with user-friendly
error messages
- **Agent selection flow**: Fixed `agentId` not being set from
`selectedAgentId`
- **State management**: Prevented state reset conflicts clearing
selected agent

### 🔧 **Marketplace Display Improvements**
Enhanced version history and changelog display:

- Updated title from "Changelog" to "Version history"
- Added "Last updated X ago" with proper relative time formatting  
- Display version numbers as "Version X.0" format
- Replaced all hardcoded values with dynamic API data
- Improved text sizes and layout structure

### 📁 **Files Changed**

**Backend Changes**:
- `backend/api/features/store/db.py` - Simplified deletion logic, added
approval protection
- `backend/api/features/store/model.py` - Added `listing_id` field,
Pydantic validation
- `backend/api/features/library/model.py` - Added `owner_user_id` field
for efficient creator detection
- All test files - Updated with new required fields

**Frontend Changes**:
- `useMarketplaceUpdate.ts` - Optimized creator detection logic 
- `MainDashboardPage.tsx` - Added `listing_id` mapping for proper type
safety
- `useAgentTableRow.ts` - Updated deletion logic to use
`store_listing_version_id`
- `usePublishAgentModal.ts` - Fixed state reset conflicts
- Marketplace components - Enhanced version history display

###  **Benefits**

**Performance**:
- 🚀 **90%+ reduction** in unnecessary API calls for creator detection
- 🚀 **Instant ownership checks** (no database queries needed)
- 🚀 **Optimized submissions fetching** (only when needed)

**User Experience**: 
-  **Granular submission control** (delete individual versions, not
entire listings)
-  **Protected approved content** (prevents accidental store content
removal)
-  **Better error prevention** (no more "Graph not found" errors)
-  **Clear validation messages** (user-friendly error feedback)

**Code Quality**:
-  **Simplified deletion logic** (85 lines → 32 lines)
-  **Better type safety** (proper `listing_id` field usage)  
-  **Cleaner creator detection** (explicit ownership vs inferred)
-  **Automatic cleanup** (empty listings removed automatically)

### 🧪 **Testing**
- [x] Backend validation rejects empty agent_id and zero agent_version
- [x] Frontend TypeScript compilation passes
- [x] Store submission works from both creator dashboard and "become a
creator" flows
- [x] Granular submission deletion works correctly
- [x] Approved submissions are protected from deletion
- [x] Creator detection is fast and accurate
- [x] Marketplace displays version history correctly

**Breaking Changes**: None - All changes are additive and backwards
compatible.

Fixes critical submission deletion issues, improves performance
significantly, and enhances user experience across the platform.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Agent ownership is now tracked and exposed across the platform.
* Store submissions and versions now include a required listing_id to
preserve listing linkage.

* **Bug Fixes**
* Prevent deletion of APPROVED submissions; remove empty listings after
deletions.
* Edits restricted to PENDING submissions with clearer invalid-operation
messages.

* **Improvements**
* Stronger publish validation and UX guards; deduplicated images and
modal open/reset refinements.
* Version history shows relative "Last updated" times and version
badges.

* **Tests**
* E2E tests updated to target pending-submission flows for edit/delete.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-01-08 19:11:38 +00:00
Abhimanyu Yadav
a81ac150da fix(frontend): add word wrapping to CodeRenderer and improve output actions visibility (#11724)
## Changes 🏗️
- Updated the `CodeRenderer` component to add `whitespace-pre-wrap` and
`break-words` CSS classes to the `<code>` element
- This enables proper wrapping of long code lines while preserving
whitespace formatting

Before


![image.png](https://app.graphite.com/user-attachments/assets/aca769cc-0f6f-4e25-8cdd-c491fcbf21bb.png)

After

![Screenshot 2026-01-08 at
3.02.53 PM.png](https://app.graphite.com/user-attachments/assets/99e23efa-be2a-441b-b0d6-50fa2a08cdb0.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified code with long lines wraps correctly
  - [x] Confirmed whitespace and indentation are preserved
  - [x] Tested code display in various viewport sizes

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Code blocks now preserve whitespace and wrap long lines for improved
readability.
* Output action controls are hidden when there is only a single output
item, reducing unnecessary UI elements.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-08 11:13:47 +00:00
Abhimanyu Yadav
49ee087496 feat(frontend): add new integration images for Webshare and WordPress (#11725)
### Changes 🏗️

Added two new integration icons to the frontend:
- `webshare_proxy.png` - Icon for WebShare Proxy integration
- `wordpress.png` - Icon for WordPress integration

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified both icons display correctly in the integrations section
  - [x] Confirmed icons render properly at different screen sizes
  - [x] Checked that the icons maintain quality when scaled

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
2026-01-08 11:13:34 +00:00
Ubbe
fc25e008b3 feat(frontend): update library agent cards to use DS (#11720)
## Changes 🏗️

<img width="700" height="838" alt="Screenshot 2026-01-07 at 16 11 04"
src="https://github.com/user-attachments/assets/0b38d2e1-d4a8-4036-862c-b35c82c496c2"
/>

- Update the agent library cards to new designs
- Update page to use Design System components
- Allow to edit/delete/duplicate agents on the library list page
- Add missing actions on library agent detail page

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Marketplace info shown on agent cards and improved favoriting with
optimistic UI and feedback.
  * Delete agent and delete schedule flows with confirmation dialogs.

* **Refactor**
* New composable form system, modernized upload dialog, streamlined
search bar, and multiple library components converted to named exports
with layout tweaks.
  * New agent card menu and favorite button UI.

* **Chores**
  * Removed notification UI and dropped a drag-drop dependency.

* **Tests**
  * Increased timeouts and stabilized upload/pagination flows.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-08 18:28:27 +07:00
Ubbe
b0855e8cf2 feat(frontend): context menu right click new builder (#11703)
## Changes 🏗️

<img width="250" height="504" alt="Screenshot 2026-01-06 at 17 53 26"
src="https://github.com/user-attachments/assets/52013448-f49c-46b6-b86a-39f98270cbc3"
/>

<img width="300" height="544" alt="Screenshot 2026-01-06 at 17 53 29"
src="https://github.com/user-attachments/assets/e6334034-68e4-4346-9092-3774ab3e8445"
/>

On the **New Builder**:
- right-click on a node menu make it show the context menu
- use the same menu for right-click and when clicking on `...`

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a custom right-click context menu for nodes with Copy, Open
agent (when available), and Delete actions; browser default menu is
suppressed while preserving zoom/drag/wiring.
* Introduced reusable SecondaryMenu primitives for context and dropdown
menus.

* **Documentation**
* Added Storybook examples demonstrating the context menu and dropdown
menu usage.

* **Style**
* Updated menu styling and icons with improved consistency and dark-mode
support.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-08 17:35:49 +07:00
1748 changed files with 222492 additions and 42752 deletions

36
.branchlet.json Normal file
View File

@@ -0,0 +1,36 @@
{
"worktreeCopyPatterns": [
".env*",
".vscode/**",
".auth/**",
".claude/**",
"autogpt_platform/.env*",
"autogpt_platform/backend/.env*",
"autogpt_platform/frontend/.env*",
"autogpt_platform/frontend/.auth/**",
"autogpt_platform/db/docker/.env*"
],
"worktreeCopyIgnores": [
"**/node_modules/**",
"**/dist/**",
"**/.git/**",
"**/Thumbs.db",
"**/.DS_Store",
"**/.next/**",
"**/__pycache__/**",
"**/.ruff_cache/**",
"**/.pytest_cache/**",
"**/*.pyc",
"**/playwright-report/**",
"**/logs/**",
"**/site/**"
],
"worktreePathTemplate": "$BASE_PATH.worktree",
"postCreateCmd": [
"cd autogpt_platform/autogpt_libs && poetry install",
"cd autogpt_platform/backend && poetry install && poetry run prisma generate",
"cd autogpt_platform/frontend && pnpm install"
],
"terminalCommand": "code .",
"deleteBranchWithWorktree": false
}

View File

@@ -0,0 +1,106 @@
---
name: open-pr
description: Open a pull request with proper PR template, test coverage, and review workflow. Guides agents through creating a PR that follows repo conventions, ensures existing behaviors aren't broken, covers new behaviors with tests, and handles review via bot when local testing isn't possible. TRIGGER when user asks to "open a PR", "create a PR", "make a PR", "submit a PR", "open pull request", "push and create PR", or any variation of opening/submitting a pull request.
user-invocable: true
args: "[base-branch] — optional target branch (defaults to dev)."
metadata:
author: autogpt-team
version: "1.0.0"
---
# Open a Pull Request
## Step 1: Pre-flight checks
Before opening the PR:
1. Ensure all changes are committed
2. Ensure the branch is pushed to the remote (`git push -u origin <branch>`)
3. Run linters/formatters across the whole repo (not just changed files) and commit any fixes
## Step 2: Test coverage
**This is critical.** Before opening the PR, verify:
### Existing behavior is not broken
- Identify which modules/components your changes touch
- Run the existing test suites for those areas
- If tests fail, fix them before opening the PR — do not open a PR with known regressions
### New behavior has test coverage
- Every new feature, endpoint, or behavior change needs tests
- If you added a new block, add tests for that block
- If you changed API behavior, add or update API tests
- If you changed frontend behavior, verify it doesn't break existing flows
If you cannot run the full test suite locally, note which tests you ran and which you couldn't in the test plan.
## Step 3: Create the PR using the repo template
Read the canonical PR template at `.github/PULL_REQUEST_TEMPLATE.md` and use it **verbatim** as your PR body:
1. Read the template: `cat .github/PULL_REQUEST_TEMPLATE.md`
2. Preserve the exact section titles and formatting, including:
- `### Why / What / How`
- `### Changes 🏗️`
- `### Checklist 📋`
3. Replace HTML comment prompts (`<!-- ... -->`) with actual content; do not leave them in
4. **Do not pre-check boxes** — leave all checkboxes as `- [ ]` until each step is actually completed
5. Do not alter the template structure, rename sections, or remove any checklist items
**PR title must use conventional commit format** (e.g., `feat(backend): add new block`, `fix(frontend): resolve routing bug`, `dx(skills): update PR workflow`). See CLAUDE.md for the full list of scopes.
Use `gh pr create` with the base branch (defaults to `dev` if no `[base-branch]` was provided). Use `--body-file` to avoid shell interpretation of backticks and special characters:
```bash
BASE_BRANCH="${BASE_BRANCH:-dev}"
PR_BODY=$(mktemp)
cat > "$PR_BODY" << 'PREOF'
<filled-in template from .github/PULL_REQUEST_TEMPLATE.md>
PREOF
gh pr create --base "$BASE_BRANCH" --title "<type>(scope): short description" --body-file "$PR_BODY"
rm "$PR_BODY"
```
## Step 4: Review workflow
### If you have a workspace that allows testing (docker, running backend, etc.)
- Run `/pr-test` to do E2E manual testing of the PR using docker compose, agent-browser, and API calls. This is the most thorough way to validate your changes before review.
- After testing, run `/pr-review` to self-review the PR for correctness, security, code quality, and testing gaps before requesting human review.
### If you do NOT have a workspace that allows testing
This is common for agents running in worktrees without a full stack. In this case:
1. Run `/pr-review` locally to catch obvious issues before pushing
2. **Comment `/review` on the PR** after creating it to trigger the review bot
3. **Poll for the review** rather than blindly waiting — check for new review comments every 30 seconds using `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate` and the GraphQL inline threads query. The bot typically responds within 30 minutes, but polling lets the agent react as soon as it arrives.
4. Do NOT proceed or merge until the bot review comes back
5. Address any issues the bot raises — use `/pr-address` which has a full polling loop with CI + comment tracking
```bash
# After creating the PR:
PR_NUMBER=$(gh pr view --json number -q .number)
gh pr comment "$PR_NUMBER" --body "/review"
# Then use /pr-address to poll for and address the review when it arrives
```
## Step 5: Address review feedback
Once the review bot or human reviewers leave comments:
- Run `/pr-address` to address review comments. It will loop until CI is green and all comments are resolved.
- Do not merge without human approval.
## Related skills
| Skill | When to use |
|---|---|
| `/pr-test` | E2E testing with docker compose, agent-browser, API calls — use when you have a running workspace |
| `/pr-review` | Review for correctness, security, code quality — use before requesting human review |
| `/pr-address` | Address reviewer comments and loop until CI green — use after reviews come in |
## Step 6: Post-creation
After the PR is created and review is triggered:
- Share the PR URL with the user
- If waiting on the review bot, let the user know the expected wait time (~30 min)
- Do not merge without human approval

View File

@@ -0,0 +1,210 @@
---
name: pr-address
description: Address PR review comments and loop until CI green and all comments resolved. TRIGGER when user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR.
user-invocable: true
argument-hint: "[PR number or URL] — if omitted, finds PR for current branch."
metadata:
author: autogpt-team
version: "1.0.0"
---
# PR Address
## Find the PR
```bash
gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
gh pr view {N}
```
## Read the PR description
Understand the **Why / What / How** before addressing comments — you need context to make good fixes:
```bash
gh pr view {N} --json body --jq '.body'
```
## Fetch comments (all sources)
### 1. Inline review threads — GraphQL (primary source of actionable items)
Use GraphQL to fetch inline threads. It natively exposes `isResolved`, returns threads already grouped with all replies, and paginates via cursor — no manual thread reconstruction needed.
```bash
gh api graphql -f query='
{
repository(owner: "Significant-Gravitas", name: "AutoGPT") {
pullRequest(number: {N}) {
reviewThreads(first: 100) {
pageInfo { hasNextPage endCursor }
nodes {
id
isResolved
path
comments(last: 1) {
nodes { databaseId body author { login } createdAt }
}
}
}
}
}
}'
```
If `pageInfo.hasNextPage` is true, fetch subsequent pages by adding `after: "<endCursor>"` to `reviewThreads(first: 100, after: "...")` and repeat until `hasNextPage` is false.
**Filter to unresolved threads only** — skip any thread where `isResolved: true`. `comments(last: 1)` returns the most recent comment in the thread — act on that; it reflects the reviewer's final ask. Use the thread `id` (Relay global ID) to track threads across polls.
### 2. Top-level reviews — REST (MUST paginate)
```bash
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
```
**CRITICAL — always `--paginate`.** Reviews default to 30 per page. PRs can have 80170+ reviews (mostly empty resolution events). Without pagination you miss reviews past position 30 — including `autogpt-reviewer`'s structured review which is typically posted after several CI runs and sits well beyond the first page.
Two things to extract:
- **Overall state**: look for `CHANGES_REQUESTED` or `APPROVED` reviews.
- **Actionable feedback**: non-empty bodies only. Empty-body reviews are thread-resolution events — they indicate progress but have no feedback to act on.
**Where each reviewer posts:**
- `autogpt-reviewer` — posts detailed structured reviews ("Blockers", "Should Fix", "Nice to Have") as **top-level reviews**. Not present on every PR. Address ALL items.
- `sentry[bot]` — posts bug predictions as **inline threads**. Fix real bugs, explain false positives.
- `coderabbitai[bot]` — posts summaries as **top-level reviews** AND actionable items as **inline threads**. Address actionable items.
- Human reviewers — can post in any source. Address ALL non-empty feedback.
### 3. PR conversation comments — REST
```bash
gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
```
Mostly contains: bot summaries (`coderabbitai[bot]`), CI/conflict detection (`github-actions[bot]`), and author status updates. Scan for non-empty messages from non-bot human reviewers that aren't the PR author — those are the ones that need a response.
## For each unaddressed comment
Address comments **one at a time**: fix → commit → push → inline reply → next.
1. Read the referenced code, make the fix (or reply explaining why it's not needed)
2. Commit and push the fix
3. Reply **inline** (not as a new top-level comment) referencing the fixing commit — this is what resolves the conversation for bot reviewers (coderabbitai, sentry):
| Comment type | How to reply |
|---|---|
| Inline review (`pulls/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies -f body="🤖 Fixed in <commit-sha>: <description>"` |
| Conversation (`issues/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments -f body="🤖 Fixed in <commit-sha>: <description>"` |
## Format and commit
After fixing, format the changed code:
- **Backend** (from `autogpt_platform/backend/`): `poetry run format`
- **Frontend** (from `autogpt_platform/frontend/`): `pnpm format && pnpm lint && pnpm types`
If API routes changed, regenerate the frontend client:
```bash
cd autogpt_platform/backend && poetry run rest &
REST_PID=$!
trap "kill $REST_PID 2>/dev/null" EXIT
WAIT=0; until curl -sf http://localhost:8006/health > /dev/null 2>&1; do sleep 1; WAIT=$((WAIT+1)); [ $WAIT -ge 60 ] && echo "Timed out" && exit 1; done
cd ../frontend && pnpm generate:api:force
kill $REST_PID 2>/dev/null; trap - EXIT
```
Never manually edit files in `src/app/api/__generated__/`.
Then commit and **push immediately** — never batch commits without pushing. Each fix should be visible on GitHub right away so CI can start and reviewers can see progress.
**Never push empty commits** (`git commit --allow-empty`) to re-trigger CI or bot checks. When a check fails, investigate the root cause (unchecked PR checklist, unaddressed review comments, code issues) and fix those directly. Empty commits add noise to git history.
For backend commits in worktrees: `poetry run git commit` (pre-commit hooks).
## The loop
```text
address comments → format → commit → push
→ wait for CI (while addressing new comments) → fix failures → push
→ re-check comments after CI settles
→ repeat until: all comments addressed AND CI green AND no new comments arriving
```
### Polling for CI + new comments
After pushing, poll for **both** CI status and new comments in a single loop. Do not use `gh pr checks --watch` — it blocks the tool and prevents reacting to new comments while CI is running.
> **Note:** `gh pr checks --watch --fail-fast` is tempting but it blocks the entire Bash tool call, meaning the agent cannot check for or address new comments until CI fully completes. Always poll manually instead.
**Polling loop — repeat every 30 seconds:**
1. Check CI status:
```bash
gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,name,link
```
Parse the results: if every check has `bucket` of `"pass"` or `"skipping"`, CI is green. If any has `"fail"`, CI has failed. Otherwise CI is still pending.
2. Check for merge conflicts:
```bash
gh pr view {N} --repo Significant-Gravitas/AutoGPT --json mergeable --jq '.mergeable'
```
If the result is `"CONFLICTING"`, the PR has a merge conflict — see "Resolving merge conflicts" below. If `"UNKNOWN"`, GitHub is still computing mergeability — wait and re-check next poll.
3. Check for new/changed comments (all three sources):
**Inline threads** — re-run the GraphQL query from "Fetch comments". For each unresolved thread, record `{thread_id, last_comment_databaseId}` as your baseline. On each poll, action is needed if:
- A new thread `id` appears that wasn't in the baseline (new thread), OR
- An existing thread's `last_comment_databaseId` has changed (new reply on existing thread)
**Conversation comments:**
```bash
gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
```
Compare total count and newest `id` against baseline. Filter to non-empty, non-bot, non-author-update messages.
**Top-level reviews:**
```bash
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
```
Watch for new non-empty reviews (`CHANGES_REQUESTED` or `COMMENTED` with body). Compare total count and newest `id` against baseline.
4. **React in this precedence order (first match wins):**
| What happened | Action |
|---|---|
| Merge conflict detected | See "Resolving merge conflicts" below. |
| Mergeability is `UNKNOWN` | GitHub is still computing mergeability. Sleep 30 seconds, then restart polling from the top. |
| New comments detected | Address them (fix → commit → push → reply). After pushing, re-fetch all comments to update your baseline, then restart this polling loop from the top (new commits invalidate CI status). |
| CI failed (bucket == "fail") | Get failed check links: `gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,link --jq '.[] \| select(.bucket == "fail") \| .link'`. Extract run ID from link (format: `.../actions/runs/<run-id>/job/...`), read logs with `gh run view <run-id> --repo Significant-Gravitas/AutoGPT --log-failed`. Fix → commit → push → restart polling. |
| CI green + no new comments | **Do not exit immediately.** Bots (coderabbitai, sentry) often post reviews shortly after CI settles. Continue polling for **2 more cycles (60s)** after CI goes green. Only exit after 2 consecutive green+quiet polls. |
| CI pending + no new comments | Sleep 30 seconds, then poll again. |
**The loop ends when:** CI fully green + all comments addressed + **2 consecutive polls with no new comments after CI settled.**
### Resolving merge conflicts
1. Identify the PR's target branch and remote:
```bash
gh pr view {N} --repo Significant-Gravitas/AutoGPT --json baseRefName --jq '.baseRefName'
git remote -v # find the remote pointing to Significant-Gravitas/AutoGPT (typically 'upstream' in forks, 'origin' for direct contributors)
```
2. Pull the latest base branch with a 3-way merge:
```bash
git pull {base-remote} {base-branch} --no-rebase
```
3. Resolve conflicting files, then verify no conflict markers remain:
```bash
if grep -R -n -E '^(<<<<<<<|=======|>>>>>>>)' <conflicted-files>; then
echo "Unresolved conflict markers found — resolve before proceeding."
exit 1
fi
```
4. Stage and push:
```bash
git add <conflicted-files>
git commit -m "Resolve merge conflicts with {base-branch}"
git push
```
5. Restart the polling loop from the top — new commits reset CI status.

View File

@@ -0,0 +1,86 @@
---
name: pr-review
description: Review a PR for correctness, security, code quality, and testing issues. TRIGGER when user asks to review a PR, check PR quality, or give feedback on a PR.
user-invocable: true
args: "[PR number or URL] — if omitted, finds PR for current branch."
metadata:
author: autogpt-team
version: "1.0.0"
---
# PR Review
## Find the PR
```bash
gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
gh pr view {N}
```
## Read the PR description
Before reading code, understand the **why**, **what**, and **how** from the PR description:
```bash
gh pr view {N} --json body --jq '.body'
```
Every PR should have a Why / What / How structure. If any of these are missing, note it as feedback.
## Read the diff
```bash
gh pr diff {N}
```
## Fetch existing review comments
Before posting anything, fetch existing inline comments to avoid duplicates:
```bash
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments --paginate
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews
```
## What to check
**Description quality:** Does the PR description cover Why (motivation/problem), What (summary of changes), and How (approach/implementation details)? If any are missing, request them — you can't judge the approach without understanding the problem and intent.
**Correctness:** logic errors, off-by-one, missing edge cases, race conditions (TOCTOU in file access, credit charging), error handling gaps, async correctness (missing `await`, unclosed resources).
**Security:** input validation at boundaries, no injection (command, XSS, SQL), secrets not logged, file paths sanitized (`os.path.basename()` in error messages).
**Code quality:** apply rules from backend/frontend CLAUDE.md files.
**Architecture:** DRY, single responsibility, modular functions. `Security()` vs `Depends()` for FastAPI auth. `data:` for SSE events, `: comment` for heartbeats. `transaction=True` for Redis pipelines.
**Testing:** edge cases covered, colocated `*_test.py` (backend) / `__tests__/` (frontend), mocks target where symbol is **used** not defined, `AsyncMock` for async.
## Output format
Every comment **must** be prefixed with `🤖` and a criticality badge:
| Tier | Badge | Meaning |
|---|---|---|
| Blocker | `🔴 **Blocker**` | Must fix before merge |
| Should Fix | `🟠 **Should Fix**` | Important improvement |
| Nice to Have | `🟡 **Nice to Have**` | Minor suggestion |
| Nit | `🔵 **Nit**` | Style / wording |
Example: `🤖 🔴 **Blocker**: Missing error handling for X — suggest wrapping in try/except.`
## Post inline comments
For each finding, post an inline comment on the PR (do not just write a local report):
```bash
# Get the latest commit SHA for the PR
COMMIT_SHA=$(gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.head.sha')
# Post an inline comment on a specific file/line
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments \
-f body="🤖 🔴 **Blocker**: <description>" \
-f commit_id="$COMMIT_SHA" \
-f path="<file path>" \
-F line=<line number>
```

View File

@@ -0,0 +1,754 @@
---
name: pr-test
description: "E2E manual testing of PRs/branches using docker compose, agent-browser, and API calls. TRIGGER when user asks to manually test a PR, test a feature end-to-end, or run integration tests against a running system."
user-invocable: true
argument-hint: "[worktree path or PR number] — tests the PR in the given worktree. Optional flags: --fix (auto-fix issues found)"
metadata:
author: autogpt-team
version: "2.0.0"
---
# Manual E2E Test
Test a PR/branch end-to-end by building the full platform, interacting via browser and API, capturing screenshots, and reporting results.
## Critical Requirements
These are NON-NEGOTIABLE. Every test run MUST satisfy ALL the following:
### 1. Screenshots at Every Step
- Take a screenshot at EVERY significant test step — not just at the end
- Every test scenario MUST have at least one BEFORE and one AFTER screenshot
- Name screenshots sequentially: `{NN}-{action}-{state}.png` (e.g., `01-credits-before.png`, `02-credits-after.png`)
- If a screenshot is missing for a scenario, the test is INCOMPLETE — go back and take it
### 2. Screenshots MUST Be Posted to PR
- Push ALL screenshots to a temp branch `test-screenshots/pr-{N}`
- Post a PR comment with ALL screenshots embedded inline using GitHub raw URLs
- This is NOT optional — every test run MUST end with a PR comment containing screenshots
- If screenshot upload fails, retry. If it still fails, list failed files and require manual drag-and-drop/paste attachment in the PR comment
### 3. State Verification with Before/After Evidence
- For EVERY state-changing operation (API call, user action), capture the state BEFORE and AFTER
- Log the actual API response values (e.g., `credits_before=100, credits_after=95`)
- Screenshot MUST show the relevant UI state change
- Compare expected vs actual values explicitly — do not just eyeball it
### 4. Negative Test Cases Are Mandatory
- Test at least ONE negative case per feature (e.g., insufficient credits, invalid input, unauthorized access)
- Verify error messages are user-friendly and accurate
- Verify the system state did NOT change after a rejected operation
### 5. Test Report Must Include Full Evidence
Each test scenario in the report MUST have:
- **Steps**: What was done (exact commands or UI actions)
- **Expected**: What should happen
- **Actual**: What actually happened
- **API Evidence**: Before/after API response values for state-changing operations
- **Screenshot Evidence**: Before/after screenshots with explanations
## State Manipulation for Realistic Testing
When testing features that depend on specific states (rate limits, credits, quotas):
1. **Use Redis CLI to set counters directly:**
```bash
# Find the Redis container
REDIS_CONTAINER=$(docker ps --format '{{.Names}}' | grep redis | head -1)
# Set a key with expiry
docker exec $REDIS_CONTAINER redis-cli SET key value EX ttl
# Example: Set rate limit counter to near-limit
docker exec $REDIS_CONTAINER redis-cli SET "rate_limit:user:test@test.com" 99 EX 3600
# Example: Check current value
docker exec $REDIS_CONTAINER redis-cli GET "rate_limit:user:test@test.com"
```
2. **Use API calls to check before/after state:**
```bash
# BEFORE: Record current state
BEFORE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
echo "Credits BEFORE: $BEFORE"
# Perform the action...
# AFTER: Record new state and compare
AFTER=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
echo "Credits AFTER: $AFTER"
echo "Delta: $(( BEFORE - AFTER ))"
```
3. **Take screenshots BEFORE and AFTER state changes** — the UI must reflect the backend state change
4. **Never rely on mocked/injected browser state** — always use real backend state. Do NOT use `agent-browser eval` to fake UI state. The backend must be the source of truth.
5. **Use direct DB queries when needed:**
```bash
# Query via Supabase's PostgREST or docker exec into the DB
docker exec supabase-db psql -U supabase_admin -d postgres -c "SELECT credits FROM user_credits WHERE user_id = '...';"
```
6. **After every API test, verify the state change actually persisted:**
```bash
# Example: After a credits purchase, verify DB matches API
API_CREDITS=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
DB_CREDITS=$(docker exec supabase-db psql -U supabase_admin -d postgres -t -c "SELECT credits FROM user_credits WHERE user_id = '...';" | tr -d ' ')
[ "$API_CREDITS" = "$DB_CREDITS" ] && echo "CONSISTENT" || echo "MISMATCH: API=$API_CREDITS DB=$DB_CREDITS"
```
## Arguments
- `$ARGUMENTS` — worktree path (e.g. `$REPO_ROOT`) or PR number
- If `--fix` flag is present, auto-fix bugs found and push fixes (like pr-address loop)
## Step 0: Resolve the target
```bash
# If argument is a PR number, find its worktree
gh pr view {N} --json headRefName --jq '.headRefName'
# If argument is a path, use it directly
```
Determine:
- `REPO_ROOT` — the root repo directory: `git -C "$WORKTREE_PATH" worktree list | head -1 | awk '{print $1}'` (or `git rev-parse --show-toplevel` if not a worktree)
- `WORKTREE_PATH` — the worktree directory
- `PLATFORM_DIR` — `$WORKTREE_PATH/autogpt_platform`
- `BACKEND_DIR` — `$PLATFORM_DIR/backend`
- `FRONTEND_DIR` — `$PLATFORM_DIR/frontend`
- `PR_NUMBER` — the PR number (from `gh pr list --head $(git branch --show-current)`)
- `PR_TITLE` — the PR title, slugified (e.g. "Add copilot permissions" → "add-copilot-permissions")
- `RESULTS_DIR` — `$REPO_ROOT/test-results/PR-{PR_NUMBER}-{slugified-title}`
Create the results directory:
```bash
PR_NUMBER=$(cd $WORKTREE_PATH && gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT --json number --jq '.[0].number')
PR_TITLE=$(cd $WORKTREE_PATH && gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT --json title --jq '.[0].title' | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//;s/-$//' | head -c 50)
RESULTS_DIR="$REPO_ROOT/test-results/PR-${PR_NUMBER}-${PR_TITLE}"
mkdir -p $RESULTS_DIR
```
**Test user credentials** (for logging into the UI or verifying results manually):
- Email: `test@test.com`
- Password: `testtest123`
## Step 1: Understand the PR
Before testing, understand what changed:
```bash
cd $WORKTREE_PATH
# Read PR description to understand the WHY
gh pr view {N} --json body --jq '.body'
git log --oneline dev..HEAD | head -20
git diff dev --stat
```
Read the PR description (Why / What / How) and changed files to understand:
0. **Why** does this PR exist? What problem does it solve?
1. **What** feature/fix does this PR implement?
2. **How** does it work? What's the approach?
3. What components are affected? (backend, frontend, copilot, executor, etc.)
4. What are the key user-facing behaviors to test?
## Step 2: Write test scenarios
Based on the PR analysis, write a test plan to `$RESULTS_DIR/test-plan.md`:
```markdown
# Test Plan: PR #{N} — {title}
## Scenarios
1. [Scenario name] — [what to verify]
2. ...
## API Tests (if applicable)
1. [Endpoint] — [expected behavior]
- Before state: [what to check before]
- After state: [what to verify changed]
## UI Tests (if applicable)
1. [Page/component] — [interaction to test]
- Screenshot before: [what to capture]
- Screenshot after: [what to capture]
## Negative Tests (REQUIRED — at least one per feature)
1. [What should NOT happen] — [how to trigger it]
- Expected error: [what error message/code]
- State unchanged: [what to verify did NOT change]
```
**Be critical** — include edge cases, error paths, and security checks. Every scenario MUST specify what screenshots to take and what state to verify.
## Step 3: Environment setup
### 3a. Copy .env files from the root worktree
The root worktree (`$REPO_ROOT`) has the canonical `.env` files with all API keys. Copy them to the target worktree:
```bash
# CRITICAL: .env files are NOT checked into git. They must be copied manually.
cp $REPO_ROOT/autogpt_platform/.env $PLATFORM_DIR/.env
cp $REPO_ROOT/autogpt_platform/backend/.env $BACKEND_DIR/.env
cp $REPO_ROOT/autogpt_platform/frontend/.env $FRONTEND_DIR/.env
```
### 3b. Configure copilot authentication
The copilot needs an LLM API to function. Two approaches (try subscription first):
#### Option 1: Subscription mode (preferred — uses your Claude Max/Pro subscription)
The `claude_agent_sdk` Python package **bundles its own Claude CLI binary** — no need to install `@anthropic-ai/claude-code` via npm. The backend auto-provisions credentials from environment variables on startup.
Run the helper script to extract tokens from your host and auto-update `backend/.env` (works on macOS, Linux, and Windows/WSL):
```bash
# Extracts OAuth tokens and writes CLAUDE_CODE_OAUTH_TOKEN + CLAUDE_CODE_REFRESH_TOKEN into .env
bash $BACKEND_DIR/scripts/refresh_claude_token.sh --env-file $BACKEND_DIR/.env
```
**How it works:** The script reads the OAuth token from:
- **macOS**: system keychain (`"Claude Code-credentials"`)
- **Linux/WSL**: `~/.claude/.credentials.json`
- **Windows**: `%APPDATA%/claude/.credentials.json`
It sets `CLAUDE_CODE_OAUTH_TOKEN`, `CLAUDE_CODE_REFRESH_TOKEN`, and `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true` in the `.env` file. On container startup, the backend auto-provisions `~/.claude/.credentials.json` inside the container from these env vars. The SDK's bundled CLI then authenticates using that file. No `claude login`, no npm install needed.
**Note:** The OAuth token expires (~24h). If copilot returns auth errors, re-run the script and restart: `$BACKEND_DIR/scripts/refresh_claude_token.sh --env-file $BACKEND_DIR/.env && docker compose up -d copilot_executor`
#### Option 2: OpenRouter API key mode (fallback)
If subscription mode doesn't work, switch to API key mode using OpenRouter:
```bash
# In $BACKEND_DIR/.env, ensure these are set:
CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=false
CHAT_API_KEY=<value of OPEN_ROUTER_API_KEY from the same .env>
CHAT_BASE_URL=https://openrouter.ai/api/v1
CHAT_USE_CLAUDE_AGENT_SDK=true
```
Use `sed` to update these values:
```bash
ORKEY=$(grep "^OPEN_ROUTER_API_KEY=" $BACKEND_DIR/.env | cut -d= -f2)
[ -n "$ORKEY" ] || { echo "ERROR: OPEN_ROUTER_API_KEY is missing in $BACKEND_DIR/.env"; exit 1; }
perl -i -pe 's/CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true/CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=false/' $BACKEND_DIR/.env
# Add or update CHAT_API_KEY and CHAT_BASE_URL
grep -q "^CHAT_API_KEY=" $BACKEND_DIR/.env && perl -i -pe "s|^CHAT_API_KEY=.*|CHAT_API_KEY=$ORKEY|" $BACKEND_DIR/.env || echo "CHAT_API_KEY=$ORKEY" >> $BACKEND_DIR/.env
grep -q "^CHAT_BASE_URL=" $BACKEND_DIR/.env && perl -i -pe 's|^CHAT_BASE_URL=.*|CHAT_BASE_URL=https://openrouter.ai/api/v1|' $BACKEND_DIR/.env || echo "CHAT_BASE_URL=https://openrouter.ai/api/v1" >> $BACKEND_DIR/.env
```
### 3c. Stop conflicting containers
```bash
# Stop any running app containers (keep infra: supabase, redis, rabbitmq, clamav)
docker ps --format "{{.Names}}" | grep -E "rest_server|executor|copilot|websocket|database_manager|scheduler|notification|frontend|migrate" | while read name; do
docker stop "$name" 2>/dev/null
done
```
### 3e. Build and start
```bash
cd $PLATFORM_DIR && docker compose build --no-cache 2>&1 | tail -20
if [ ${PIPESTATUS[0]} -ne 0 ]; then echo "ERROR: Docker build failed"; exit 1; fi
cd $PLATFORM_DIR && docker compose up -d 2>&1 | tail -20
if [ ${PIPESTATUS[0]} -ne 0 ]; then echo "ERROR: Docker compose up failed"; exit 1; fi
```
**Note:** If the container appears to be running old code (e.g. missing PR changes), use `docker compose build --no-cache` to force a full rebuild. Docker BuildKit may sometimes reuse cached `COPY` layers from a previous build on a different branch.
**Expected time: 3-8 minutes** for build, 5-10 minutes with `--no-cache`.
### 3f. Wait for services to be ready
```bash
# Poll until backend and frontend respond
for i in $(seq 1 60); do
BACKEND=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8006/docs 2>/dev/null)
FRONTEND=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000 2>/dev/null)
if [ "$BACKEND" = "200" ] && [ "$FRONTEND" = "200" ]; then
echo "Services ready"
break
fi
sleep 5
done
```
### 3h. Create test user and get auth token
```bash
ANON_KEY=$(grep "NEXT_PUBLIC_SUPABASE_ANON_KEY=" $FRONTEND_DIR/.env | sed 's/.*NEXT_PUBLIC_SUPABASE_ANON_KEY=//' | tr -d '[:space:]')
# Signup (idempotent — returns "User already registered" if exists)
RESULT=$(curl -s -X POST 'http://localhost:8000/auth/v1/signup' \
-H "apikey: $ANON_KEY" \
-H 'Content-Type: application/json' \
-d '{"email":"test@test.com","password":"testtest123"}')
# If "Database error finding user", restart supabase-auth and retry
if echo "$RESULT" | grep -q "Database error"; then
docker restart supabase-auth && sleep 5
curl -s -X POST 'http://localhost:8000/auth/v1/signup' \
-H "apikey: $ANON_KEY" \
-H 'Content-Type: application/json' \
-d '{"email":"test@test.com","password":"testtest123"}'
fi
# Get auth token
TOKEN=$(curl -s -X POST 'http://localhost:8000/auth/v1/token?grant_type=password' \
-H "apikey: $ANON_KEY" \
-H 'Content-Type: application/json' \
-d '{"email":"test@test.com","password":"testtest123"}' | jq -r '.access_token // ""')
```
**Use this token for ALL API calls:**
```bash
curl -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/...
```
## Step 4: Run tests
### Service ports reference
| Service | Port | URL |
|---------|------|-----|
| Frontend | 3000 | http://localhost:3000 |
| Backend REST | 8006 | http://localhost:8006 |
| Supabase Auth (via Kong) | 8000 | http://localhost:8000 |
| Executor | 8002 | http://localhost:8002 |
| Copilot Executor | 8008 | http://localhost:8008 |
| WebSocket | 8001 | http://localhost:8001 |
| Database Manager | 8005 | http://localhost:8005 |
| Redis | 6379 | localhost:6379 |
| RabbitMQ | 5672 | localhost:5672 |
### API testing
Use `curl` with the auth token for backend API tests. **For EVERY API call that changes state, record before/after values:**
```bash
# Example: List agents
curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/graphs | jq . | head -20
# Example: Create an agent
curl -s -X POST http://localhost:8006/api/graphs \
-H "Authorization: Bearer $TOKEN" \
-H 'Content-Type: application/json' \
-d '{...}' | jq .
# Example: Run an agent
curl -s -X POST "http://localhost:8006/api/graphs/{graph_id}/execute" \
-H "Authorization: Bearer $TOKEN" \
-H 'Content-Type: application/json' \
-d '{"data": {...}}'
# Example: Get execution results
curl -s -H "Authorization: Bearer $TOKEN" \
"http://localhost:8006/api/graphs/{graph_id}/executions/{exec_id}" | jq .
```
**State verification pattern (use for EVERY state-changing API call):**
```bash
# 1. Record BEFORE state
BEFORE_STATE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/{resource} | jq '{relevant_fields}')
echo "BEFORE: $BEFORE_STATE"
# 2. Perform the action
ACTION_RESULT=$(curl -s -X POST ... | jq .)
echo "ACTION RESULT: $ACTION_RESULT"
# 3. Record AFTER state
AFTER_STATE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/{resource} | jq '{relevant_fields}')
echo "AFTER: $AFTER_STATE"
# 4. Log the comparison
echo "=== STATE CHANGE VERIFICATION ==="
echo "Before: $BEFORE_STATE"
echo "After: $AFTER_STATE"
echo "Expected change: {describe what should have changed}"
```
### Browser testing with agent-browser
```bash
# Close any existing session
agent-browser close 2>/dev/null || true
# Use --session-name to persist cookies across navigations
# This means login only needs to happen once per test session
agent-browser --session-name pr-test open 'http://localhost:3000/login' --timeout 15000
# Get interactive elements
agent-browser --session-name pr-test snapshot | grep "textbox\|button"
# Login
agent-browser --session-name pr-test fill {email_ref} "test@test.com"
agent-browser --session-name pr-test fill {password_ref} "testtest123"
agent-browser --session-name pr-test click {login_button_ref}
sleep 5
# Dismiss cookie banner if present
agent-browser --session-name pr-test click 'text=Accept All' 2>/dev/null || true
# Navigate — cookies are preserved so login persists
agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --timeout 10000
# Take screenshot
agent-browser --session-name pr-test screenshot $RESULTS_DIR/01-page.png
# Interact with elements
agent-browser --session-name pr-test fill {ref} "text"
agent-browser --session-name pr-test press "Enter"
agent-browser --session-name pr-test click {ref}
agent-browser --session-name pr-test click 'text=Button Text'
# Read page content
agent-browser --session-name pr-test snapshot | grep "text:"
```
**Key pages:**
- `/copilot` — CoPilot chat (for testing copilot features)
- `/build` — Agent builder (for testing block/node features)
- `/build?flowID={id}` — Specific agent in builder
- `/library` — Agent library (for testing listing/import features)
- `/library/agents/{id}` — Agent detail with run history
- `/marketplace` — Marketplace
### Checking logs
```bash
# Backend REST server
docker logs autogpt_platform-rest_server-1 2>&1 | tail -30
# Executor (runs agent graphs)
docker logs autogpt_platform-executor-1 2>&1 | tail -30
# Copilot executor (runs copilot chat sessions)
docker logs autogpt_platform-copilot_executor-1 2>&1 | tail -30
# Frontend
docker logs autogpt_platform-frontend-1 2>&1 | tail -30
# Filter for errors
docker logs autogpt_platform-executor-1 2>&1 | grep -i "error\|exception\|traceback" | tail -20
```
### Copilot chat testing
The copilot uses SSE streaming. To test via API:
```bash
# Create a session
SESSION_ID=$(curl -s -X POST 'http://localhost:8006/api/chat/sessions' \
-H "Authorization: Bearer $TOKEN" \
-H 'Content-Type: application/json' \
-d '{}' | jq -r '.id // .session_id // ""')
# Stream a message (SSE - will stream chunks)
curl -N -X POST "http://localhost:8006/api/chat/sessions/$SESSION_ID/stream" \
-H "Authorization: Bearer $TOKEN" \
-H 'Content-Type: application/json' \
-d '{"message": "Hello, what can you help me with?"}' \
--max-time 60 2>/dev/null | head -50
```
Or test via browser (preferred for UI verification):
```bash
agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --timeout 10000
# ... fill chat input and press Enter, wait 20-30s for response
```
## Step 5: Record results and take screenshots
**Take a screenshot at EVERY significant test step** — before and after interactions, on success, and on failure. This is NON-NEGOTIABLE.
**Required screenshot pattern for each test scenario:**
```bash
# BEFORE the action
agent-browser --session-name pr-test screenshot $RESULTS_DIR/{NN}-{scenario}-before.png
# Perform the action...
# AFTER the action
agent-browser --session-name pr-test screenshot $RESULTS_DIR/{NN}-{scenario}-after.png
```
**Naming convention:**
```bash
# Examples:
# $RESULTS_DIR/01-login-page-before.png
# $RESULTS_DIR/02-login-page-after.png
# $RESULTS_DIR/03-credits-page-before.png
# $RESULTS_DIR/04-credits-purchase-after.png
# $RESULTS_DIR/05-negative-insufficient-credits.png
# $RESULTS_DIR/06-error-state.png
```
**Minimum requirements:**
- At least TWO screenshots per test scenario (before + after)
- At least ONE screenshot for each negative test case showing the error state
- If a test fails, screenshot the failure state AND any error logs visible in the UI
## Step 6: Show results to user with screenshots
**CRITICAL: After all tests complete, you MUST show every screenshot to the user using the Read tool, with an explanation of what each screenshot shows.** This is the most important part of the test report — the user needs to visually verify the results.
For each screenshot:
1. Use the `Read` tool to display the PNG file (Claude can read images)
2. Write a 1-2 sentence explanation below it describing:
- What page/state is being shown
- What the screenshot proves (which test scenario it validates)
- Any notable details visible in the UI
Format the output like this:
```markdown
### Screenshot 1: {descriptive title}
[Read the PNG file here]
**What it shows:** {1-2 sentence explanation of what this screenshot proves}
---
```
After showing all screenshots, output a **detailed** summary table:
| # | Scenario | Result | API Evidence | Screenshot Evidence |
|---|----------|--------|-------------|-------------------|
| 1 | {name} | PASS/FAIL | Before: X, After: Y | 01-before.png, 02-after.png |
| 2 | ... | ... | ... | ... |
**IMPORTANT:** As you show each screenshot and record test results, persist them in shell variables for Step 7:
```bash
# Build these variables during Step 6 — they are required by Step 7's script
# NOTE: declare -A requires Bash 4.0+. This is standard on modern systems (macOS ships zsh
# but Homebrew bash is 5.x; Linux typically has bash 5.x). If running on Bash <4, use a
# plain variable with a lookup function instead.
declare -A SCREENSHOT_EXPLANATIONS=(
["01-login-page.png"]="Shows the login page loaded successfully with SSO options visible."
["02-builder-with-block.png"]="The builder canvas displays the newly added block connected to the trigger."
# ... one entry per screenshot, using the same explanations you showed the user above
)
TEST_RESULTS_TABLE="| 1 | Login flow | PASS | N/A | 01-login-before.png, 02-login-after.png |
| 2 | Credits purchase | PASS | Before: 100, After: 95 | 03-credits-before.png, 04-credits-after.png |
| 3 | Insufficient credits (negative) | PASS | Credits: 0, rejected | 05-insufficient-credits-error.png |"
# ... one row per test scenario with actual results
```
## Step 7: Post test report as PR comment with screenshots
Upload screenshots to the PR using the GitHub Git API (no local git operations — safe for worktrees), then post a comment with inline images and per-screenshot explanations.
**This step is MANDATORY. Every test run MUST post a PR comment with screenshots. No exceptions.**
```bash
# Upload screenshots via GitHub Git API (creates blobs, tree, commit, and ref remotely)
REPO="Significant-Gravitas/AutoGPT"
SCREENSHOTS_BRANCH="test-screenshots/pr-${PR_NUMBER}"
SCREENSHOTS_DIR="test-screenshots/PR-${PR_NUMBER}"
# Step 1: Create blobs for each screenshot and build tree JSON
# Retry each blob upload up to 3 times. If still failing, list them at end of report.
shopt -s nullglob
SCREENSHOT_FILES=("$RESULTS_DIR"/*.png)
if [ ${#SCREENSHOT_FILES[@]} -eq 0 ]; then
echo "ERROR: No screenshots found in $RESULTS_DIR. Test run is incomplete."
exit 1
fi
TREE_JSON='['
FIRST=true
FAILED_UPLOADS=()
for img in "${SCREENSHOT_FILES[@]}"; do
BASENAME=$(basename "$img")
B64=$(base64 < "$img")
BLOB_SHA=""
for attempt in 1 2 3; do
BLOB_SHA=$(gh api "repos/${REPO}/git/blobs" -f content="$B64" -f encoding="base64" --jq '.sha' 2>/dev/null || true)
[ -n "$BLOB_SHA" ] && break
sleep 1
done
if [ -z "$BLOB_SHA" ]; then
FAILED_UPLOADS+=("$img")
continue
fi
if [ "$FIRST" = true ]; then FIRST=false; else TREE_JSON+=','; fi
TREE_JSON+="{\"path\":\"${SCREENSHOTS_DIR}/${BASENAME}\",\"mode\":\"100644\",\"type\":\"blob\",\"sha\":\"${BLOB_SHA}\"}"
done
TREE_JSON+=']'
# Step 2: Create tree, commit, and branch ref
TREE_SHA=$(echo "$TREE_JSON" | jq -c '{tree: .}' | gh api "repos/${REPO}/git/trees" --input - --jq '.sha')
COMMIT_SHA=$(gh api "repos/${REPO}/git/commits" \
-f message="test: add E2E test screenshots for PR #${PR_NUMBER}" \
-f tree="$TREE_SHA" \
--jq '.sha')
gh api "repos/${REPO}/git/refs" \
-f ref="refs/heads/${SCREENSHOTS_BRANCH}" \
-f sha="$COMMIT_SHA" 2>/dev/null \
|| gh api "repos/${REPO}/git/refs/heads/${SCREENSHOTS_BRANCH}" \
-X PATCH -f sha="$COMMIT_SHA" -f force=true
```
Then post the comment with **inline images AND explanations for each screenshot**:
```bash
REPO_URL="https://raw.githubusercontent.com/${REPO}/${SCREENSHOTS_BRANCH}"
# Build image markdown using uploaded image URLs; skip FAILED_UPLOADS (listed separately)
IMAGE_MARKDOWN=""
for img in "${SCREENSHOT_FILES[@]}"; do
BASENAME=$(basename "$img")
TITLE=$(echo "${BASENAME%.png}" | sed 's/^[0-9]*-//' | sed 's/-/ /g' | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1)) tolower(substr($i,2))}1')
# Skip images that failed to upload — they will be listed at the end
IS_FAILED=false
for failed in "${FAILED_UPLOADS[@]}"; do
[ "$(basename "$failed")" = "$BASENAME" ] && IS_FAILED=true && break
done
if [ "$IS_FAILED" = true ]; then
continue
fi
EXPLANATION="${SCREENSHOT_EXPLANATIONS[$BASENAME]}"
if [ -z "$EXPLANATION" ]; then
echo "ERROR: Missing screenshot explanation for $BASENAME. Add it to SCREENSHOT_EXPLANATIONS in Step 6."
exit 1
fi
IMAGE_MARKDOWN="${IMAGE_MARKDOWN}
### ${TITLE}
![${BASENAME}](${REPO_URL}/${SCREENSHOTS_DIR}/${BASENAME})
${EXPLANATION}
"
done
# Write comment body to file to avoid shell interpretation issues with special characters
COMMENT_FILE=$(mktemp)
# If any uploads failed, append a section listing them with instructions
FAILED_SECTION=""
if [ ${#FAILED_UPLOADS[@]} -gt 0 ]; then
FAILED_SECTION="
## ⚠️ Failed Screenshot Uploads
The following screenshots could not be uploaded via the GitHub API after 3 retries.
**To add them:** drag-and-drop or paste these files into a PR comment manually:
"
for failed in "${FAILED_UPLOADS[@]}"; do
FAILED_SECTION="${FAILED_SECTION}
- \`$(basename "$failed")\` (local path: \`$failed\`)"
done
FAILED_SECTION="${FAILED_SECTION}
**Run status:** INCOMPLETE until the files above are manually attached and visible inline in the PR."
fi
cat > "$COMMENT_FILE" <<INNEREOF
## E2E Test Report
| # | Scenario | Result | API Evidence | Screenshot Evidence |
|---|----------|--------|-------------|-------------------|
${TEST_RESULTS_TABLE}
${IMAGE_MARKDOWN}
${FAILED_SECTION}
INNEREOF
gh api "repos/${REPO}/issues/$PR_NUMBER/comments" -F body=@"$COMMENT_FILE"
rm -f "$COMMENT_FILE"
```
**The PR comment MUST include:**
1. A summary table of all scenarios with PASS/FAIL and before/after API evidence
2. Every successfully uploaded screenshot rendered inline; any failed uploads listed with manual attachment instructions
3. A 1-2 sentence explanation below each screenshot describing what it proves
This approach uses the GitHub Git API to create blobs, trees, commits, and refs entirely server-side. No local `git checkout` or `git push` — safe for worktrees and won't interfere with the PR branch.
## Fix mode (--fix flag)
When `--fix` is present, the standard is HIGHER. Do not just note issues — FIX them immediately.
### Fix protocol for EVERY issue found (including UX issues):
1. **Identify** the root cause in the code — read the relevant source files
2. **Write a failing test first** (TDD): For backend bugs, write a test marked with `pytest.mark.xfail(reason="...")`. For frontend/Playwright bugs, write a test with `.fixme` annotation. Run it to confirm it fails as expected.
3. **Screenshot** the broken state: `agent-browser screenshot $RESULTS_DIR/{NN}-broken-{description}.png`
4. **Fix** the code in the worktree
5. **Rebuild** ONLY the affected service (not the whole stack):
```bash
cd $PLATFORM_DIR && docker compose up --build -d {service_name}
# e.g., docker compose up --build -d rest_server
# e.g., docker compose up --build -d frontend
```
6. **Wait** for the service to be ready (poll health endpoint)
7. **Re-test** the same scenario
8. **Screenshot** the fixed state: `agent-browser screenshot $RESULTS_DIR/{NN}-fixed-{description}.png`
9. **Remove the xfail/fixme marker** from the test written in step 2, and verify it passes
10. **Verify** the fix did not break other scenarios (run a quick smoke test)
11. **Commit and push** immediately:
```bash
cd $WORKTREE_PATH
git add -A
git commit -m "fix: {description of fix}"
git push
```
12. **Continue** to the next test scenario
### Fix loop (like pr-address)
```text
test scenario → find issue (bug OR UX problem) → screenshot broken state
→ fix code → rebuild affected service only → re-test → screenshot fixed state
→ verify no regressions → commit + push
→ repeat for next scenario
→ after ALL scenarios pass, run full re-test to verify everything together
```
**Key differences from non-fix mode:**
- UX issues count as bugs — fix them (bad alignment, confusing labels, missing loading states)
- Every fix MUST have a before/after screenshot pair proving it works
- Commit after EACH fix, not in a batch at the end
- The final re-test must produce a clean set of all-passing screenshots
## Known issues and workarounds
### Problem: "Database error finding user" on signup
**Cause:** Supabase auth service schema cache is stale after migration.
**Fix:** `docker restart supabase-auth && sleep 5` then retry signup.
### Problem: Copilot returns auth errors in subscription mode
**Cause:** `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true` but `CLAUDE_CODE_OAUTH_TOKEN` is not set or expired.
**Fix:** Re-extract the OAuth token from macOS keychain (see step 3b, Option 1) and recreate the container (`docker compose up -d copilot_executor`). The backend auto-provisions `~/.claude/.credentials.json` from the env var on startup. No `npm install` or `claude login` needed — the SDK bundles its own CLI binary.
### Problem: agent-browser can't find chromium
**Cause:** The Dockerfile auto-provisions system chromium on all architectures (including ARM64). If your branch is behind `dev`, this may not be present yet.
**Fix:** Check if chromium exists: `which chromium || which chromium-browser`. If missing, install it: `apt-get install -y chromium` and set `AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium` in the container environment.
### Problem: agent-browser selector matches multiple elements
**Cause:** `text=X` matches all elements containing that text.
**Fix:** Use `agent-browser snapshot` to get specific `ref=eNN` references, then use those: `agent-browser click eNN`.
### Problem: Frontend shows cookie banner blocking interaction
**Fix:** `agent-browser click 'text=Accept All'` before other interactions.
### Problem: Container loses npm packages after rebuild
**Cause:** `docker compose up --build` rebuilds the image, losing runtime installs.
**Fix:** Add packages to the Dockerfile instead of installing at runtime.
### Problem: Services not starting after `docker compose up`
**Fix:** Wait and check health: `docker compose ps`. Common cause: migration hasn't finished. Check: `docker logs autogpt_platform-migrate-1 2>&1 | tail -5`. If supabase-db isn't healthy: `docker restart supabase-db && sleep 10`.
### Problem: Docker uses cached layers with old code (PR changes not visible)
**Cause:** `docker compose up --build` reuses cached `COPY` layers from previous builds. If the PR branch changes Python files but the previous build already cached that layer from `dev`, the container runs `dev` code.
**Fix:** Always use `docker compose build --no-cache` for the first build of a PR branch. Subsequent rebuilds within the same branch can use `--build`.
### Problem: `agent-browser open` loses login session
**Cause:** Without session persistence, `agent-browser open` starts fresh.
**Fix:** Use `--session-name pr-test` on ALL agent-browser commands. This auto-saves/restores cookies and localStorage across navigations. Alternatively, use `agent-browser eval "window.location.href = '...'"` to navigate within the same context.
### Problem: Supabase auth returns "Database error querying schema"
**Cause:** The database schema changed (migration ran) but supabase-auth has a stale schema cache.
**Fix:** `docker restart supabase-db && sleep 10 && docker restart supabase-auth && sleep 8`. If user data was lost, re-signup.

View File

@@ -0,0 +1,195 @@
---
name: setup-repo
description: Initialize a worktree-based repo layout for parallel development. Creates a main worktree, a reviews worktree for PR reviews, and N numbered work branches. Handles .env creation, dependency installation, and branchlet config. TRIGGER when user asks to set up the repo from scratch, initialize worktrees, bootstrap their dev environment, "setup repo", "setup worktrees", "initialize dev environment", "set up branches", or when a freshly cloned repo has no sibling worktrees.
user-invocable: true
args: "No arguments — interactive setup via prompts."
metadata:
author: autogpt-team
version: "1.0.0"
---
# Repository Setup
This skill sets up a worktree-based development layout from a freshly cloned repo. It creates:
- A **main** worktree (the primary checkout)
- A **reviews** worktree (for PR reviews)
- **N work branches** (branch1..branchN) for parallel development
## Step 1: Identify the repo
Determine the repo root and parent directory:
```bash
ROOT=$(git rev-parse --show-toplevel)
REPO_NAME=$(basename "$ROOT")
PARENT=$(dirname "$ROOT")
```
Detect if the repo is already inside a worktree layout by counting sibling worktrees (not just checking the directory name, which could be anything):
```bash
# Count worktrees that are siblings (live under $PARENT but aren't $ROOT itself)
SIBLING_COUNT=$(git worktree list --porcelain 2>/dev/null | grep "^worktree " | grep -c "$PARENT/" || true)
if [ "$SIBLING_COUNT" -gt 1 ]; then
echo "INFO: Existing worktree layout detected at $PARENT ($SIBLING_COUNT worktrees)"
# Use $ROOT as-is; skip renaming/restructuring
else
echo "INFO: Fresh clone detected, proceeding with setup"
fi
```
## Step 2: Ask the user questions
Use AskUserQuestion to gather setup preferences:
1. **How many parallel work branches do you need?** (Options: 4, 8, 16, or custom)
- These become `branch1` through `branchN`
2. **Which branch should be the base?** (Options: origin/master, origin/dev, or custom)
- All work branches and reviews will start from this
## Step 3: Fetch and set up branches
```bash
cd "$ROOT"
git fetch origin
# Create the reviews branch from base (skip if already exists)
if git show-ref --verify --quiet refs/heads/reviews; then
echo "INFO: Branch 'reviews' already exists, skipping"
else
git branch reviews <base-branch>
fi
# Create numbered work branches from base (skip if already exists)
for i in $(seq 1 "$COUNT"); do
if git show-ref --verify --quiet "refs/heads/branch$i"; then
echo "INFO: Branch 'branch$i' already exists, skipping"
else
git branch "branch$i" <base-branch>
fi
done
```
## Step 4: Create worktrees
Create worktrees as siblings to the main checkout:
```bash
if [ -d "$PARENT/reviews" ]; then
echo "INFO: Worktree '$PARENT/reviews' already exists, skipping"
else
git worktree add "$PARENT/reviews" reviews
fi
for i in $(seq 1 "$COUNT"); do
if [ -d "$PARENT/branch$i" ]; then
echo "INFO: Worktree '$PARENT/branch$i' already exists, skipping"
else
git worktree add "$PARENT/branch$i" "branch$i"
fi
done
```
## Step 5: Set up environment files
**Do NOT assume .env files exist.** For each worktree (including main if needed):
1. Check if `.env` exists in the source worktree for each path
2. If `.env` exists, copy it
3. If only `.env.default` or `.env.example` exists, copy that as `.env`
4. If neither exists, warn the user and list which env files are missing
Env file locations to check (same as the `/worktree` skill — keep these in sync):
- `autogpt_platform/.env`
- `autogpt_platform/backend/.env`
- `autogpt_platform/frontend/.env`
> **Note:** This env copying logic intentionally mirrors the `/worktree` skill's approach. If you update the path list or fallback logic here, update `/worktree` as well.
```bash
SOURCE="$ROOT"
WORKTREES="reviews"
for i in $(seq 1 "$COUNT"); do WORKTREES="$WORKTREES branch$i"; done
FOUND_ANY_ENV=0
for wt in $WORKTREES; do
TARGET="$PARENT/$wt"
for envpath in autogpt_platform autogpt_platform/backend autogpt_platform/frontend; do
if [ -f "$SOURCE/$envpath/.env" ]; then
FOUND_ANY_ENV=1
cp "$SOURCE/$envpath/.env" "$TARGET/$envpath/.env"
elif [ -f "$SOURCE/$envpath/.env.default" ]; then
FOUND_ANY_ENV=1
cp "$SOURCE/$envpath/.env.default" "$TARGET/$envpath/.env"
echo "NOTE: $wt/$envpath/.env was created from .env.default — you may need to edit it"
elif [ -f "$SOURCE/$envpath/.env.example" ]; then
FOUND_ANY_ENV=1
cp "$SOURCE/$envpath/.env.example" "$TARGET/$envpath/.env"
echo "NOTE: $wt/$envpath/.env was created from .env.example — you may need to edit it"
else
echo "WARNING: No .env, .env.default, or .env.example found at $SOURCE/$envpath/"
fi
done
done
if [ "$FOUND_ANY_ENV" -eq 0 ]; then
echo "WARNING: No environment files or templates were found in the source worktree."
# Use AskUserQuestion to confirm: "Continue setup without env files?"
# If the user declines, stop here and let them set up .env files first.
fi
```
## Step 6: Copy branchlet config
Copy `.branchlet.json` from main to each worktree so branchlet can manage sub-worktrees:
```bash
if [ -f "$ROOT/.branchlet.json" ]; then
for wt in $WORKTREES; do
cp "$ROOT/.branchlet.json" "$PARENT/$wt/.branchlet.json"
done
fi
```
## Step 7: Install dependencies
Install deps in all worktrees. Run these sequentially per worktree:
```bash
for wt in $WORKTREES; do
TARGET="$PARENT/$wt"
echo "=== Installing deps for $wt ==="
(cd "$TARGET/autogpt_platform/autogpt_libs" && poetry install) &&
(cd "$TARGET/autogpt_platform/backend" && poetry install && poetry run prisma generate) &&
(cd "$TARGET/autogpt_platform/frontend" && pnpm install) &&
echo "=== Done: $wt ===" ||
echo "=== FAILED: $wt ==="
done
```
This is slow. Run in background if possible and notify when complete.
## Step 8: Verify and report
After setup, verify and report to the user:
```bash
git worktree list
```
Summarize:
- Number of worktrees created
- Which env files were copied vs created from defaults vs missing
- Any warnings or errors encountered
## Final directory layout
```
parent/
main/ # Primary checkout (already exists)
reviews/ # PR review worktree
branch1/ # Work branch 1
branch2/ # Work branch 2
...
branchN/ # Work branch N
```

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,125 @@
---
name: vercel-react-best-practices
description: React and Next.js performance optimization guidelines from Vercel Engineering. This skill should be used when writing, reviewing, or refactoring React/Next.js code to ensure optimal performance patterns. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements.
license: MIT
metadata:
author: vercel
version: "1.0.0"
---
# Vercel React Best Practices
Comprehensive performance optimization guide for React and Next.js applications, maintained by Vercel. Contains 45 rules across 8 categories, prioritized by impact to guide automated refactoring and code generation.
## When to Apply
Reference these guidelines when:
- Writing new React components or Next.js pages
- Implementing data fetching (client or server-side)
- Reviewing code for performance issues
- Refactoring existing React/Next.js code
- Optimizing bundle size or load times
## Rule Categories by Priority
| Priority | Category | Impact | Prefix |
|----------|----------|--------|--------|
| 1 | Eliminating Waterfalls | CRITICAL | `async-` |
| 2 | Bundle Size Optimization | CRITICAL | `bundle-` |
| 3 | Server-Side Performance | HIGH | `server-` |
| 4 | Client-Side Data Fetching | MEDIUM-HIGH | `client-` |
| 5 | Re-render Optimization | MEDIUM | `rerender-` |
| 6 | Rendering Performance | MEDIUM | `rendering-` |
| 7 | JavaScript Performance | LOW-MEDIUM | `js-` |
| 8 | Advanced Patterns | LOW | `advanced-` |
## Quick Reference
### 1. Eliminating Waterfalls (CRITICAL)
- `async-defer-await` - Move await into branches where actually used
- `async-parallel` - Use Promise.all() for independent operations
- `async-dependencies` - Use better-all for partial dependencies
- `async-api-routes` - Start promises early, await late in API routes
- `async-suspense-boundaries` - Use Suspense to stream content
### 2. Bundle Size Optimization (CRITICAL)
- `bundle-barrel-imports` - Import directly, avoid barrel files
- `bundle-dynamic-imports` - Use next/dynamic for heavy components
- `bundle-defer-third-party` - Load analytics/logging after hydration
- `bundle-conditional` - Load modules only when feature is activated
- `bundle-preload` - Preload on hover/focus for perceived speed
### 3. Server-Side Performance (HIGH)
- `server-cache-react` - Use React.cache() for per-request deduplication
- `server-cache-lru` - Use LRU cache for cross-request caching
- `server-serialization` - Minimize data passed to client components
- `server-parallel-fetching` - Restructure components to parallelize fetches
- `server-after-nonblocking` - Use after() for non-blocking operations
### 4. Client-Side Data Fetching (MEDIUM-HIGH)
- `client-swr-dedup` - Use SWR for automatic request deduplication
- `client-event-listeners` - Deduplicate global event listeners
### 5. Re-render Optimization (MEDIUM)
- `rerender-defer-reads` - Don't subscribe to state only used in callbacks
- `rerender-memo` - Extract expensive work into memoized components
- `rerender-dependencies` - Use primitive dependencies in effects
- `rerender-derived-state` - Subscribe to derived booleans, not raw values
- `rerender-functional-setstate` - Use functional setState for stable callbacks
- `rerender-lazy-state-init` - Pass function to useState for expensive values
- `rerender-transitions` - Use startTransition for non-urgent updates
### 6. Rendering Performance (MEDIUM)
- `rendering-animate-svg-wrapper` - Animate div wrapper, not SVG element
- `rendering-content-visibility` - Use content-visibility for long lists
- `rendering-hoist-jsx` - Extract static JSX outside components
- `rendering-svg-precision` - Reduce SVG coordinate precision
- `rendering-hydration-no-flicker` - Use inline script for client-only data
- `rendering-activity` - Use Activity component for show/hide
- `rendering-conditional-render` - Use ternary, not && for conditionals
### 7. JavaScript Performance (LOW-MEDIUM)
- `js-batch-dom-css` - Group CSS changes via classes or cssText
- `js-index-maps` - Build Map for repeated lookups
- `js-cache-property-access` - Cache object properties in loops
- `js-cache-function-results` - Cache function results in module-level Map
- `js-cache-storage` - Cache localStorage/sessionStorage reads
- `js-combine-iterations` - Combine multiple filter/map into one loop
- `js-length-check-first` - Check array length before expensive comparison
- `js-early-exit` - Return early from functions
- `js-hoist-regexp` - Hoist RegExp creation outside loops
- `js-min-max-loop` - Use loop for min/max instead of sort
- `js-set-map-lookups` - Use Set/Map for O(1) lookups
- `js-tosorted-immutable` - Use toSorted() for immutability
### 8. Advanced Patterns (LOW)
- `advanced-event-handler-refs` - Store event handlers in refs
- `advanced-use-latest` - useLatest for stable callback refs
## How to Use
Read individual rule files for detailed explanations and code examples:
```
rules/async-parallel.md
rules/bundle-barrel-imports.md
rules/_sections.md
```
Each rule file contains:
- Brief explanation of why it matters
- Incorrect code example with explanation
- Correct code example with explanation
- Additional context and references
## Full Compiled Document
For the complete guide with all rules expanded: `AGENTS.md`

View File

@@ -0,0 +1,55 @@
---
title: Store Event Handlers in Refs
impact: LOW
impactDescription: stable subscriptions
tags: advanced, hooks, refs, event-handlers, optimization
---
## Store Event Handlers in Refs
Store callbacks in refs when used in effects that shouldn't re-subscribe on callback changes.
**Incorrect (re-subscribes on every render):**
```tsx
function useWindowEvent(event: string, handler: () => void) {
useEffect(() => {
window.addEventListener(event, handler)
return () => window.removeEventListener(event, handler)
}, [event, handler])
}
```
**Correct (stable subscription):**
```tsx
function useWindowEvent(event: string, handler: () => void) {
const handlerRef = useRef(handler)
useEffect(() => {
handlerRef.current = handler
}, [handler])
useEffect(() => {
const listener = () => handlerRef.current()
window.addEventListener(event, listener)
return () => window.removeEventListener(event, listener)
}, [event])
}
```
**Alternative: use `useEffectEvent` if you're on latest React:**
```tsx
import { useEffectEvent } from 'react'
function useWindowEvent(event: string, handler: () => void) {
const onEvent = useEffectEvent(handler)
useEffect(() => {
window.addEventListener(event, onEvent)
return () => window.removeEventListener(event, onEvent)
}, [event])
}
```
`useEffectEvent` provides a cleaner API for the same pattern: it creates a stable function reference that always calls the latest version of the handler.

View File

@@ -0,0 +1,49 @@
---
title: useLatest for Stable Callback Refs
impact: LOW
impactDescription: prevents effect re-runs
tags: advanced, hooks, useLatest, refs, optimization
---
## useLatest for Stable Callback Refs
Access latest values in callbacks without adding them to dependency arrays. Prevents effect re-runs while avoiding stale closures.
**Implementation:**
```typescript
function useLatest<T>(value: T) {
const ref = useRef(value)
useEffect(() => {
ref.current = value
}, [value])
return ref
}
```
**Incorrect (effect re-runs on every callback change):**
```tsx
function SearchInput({ onSearch }: { onSearch: (q: string) => void }) {
const [query, setQuery] = useState('')
useEffect(() => {
const timeout = setTimeout(() => onSearch(query), 300)
return () => clearTimeout(timeout)
}, [query, onSearch])
}
```
**Correct (stable effect, fresh callback):**
```tsx
function SearchInput({ onSearch }: { onSearch: (q: string) => void }) {
const [query, setQuery] = useState('')
const onSearchRef = useLatest(onSearch)
useEffect(() => {
const timeout = setTimeout(() => onSearchRef.current(query), 300)
return () => clearTimeout(timeout)
}, [query])
}
```

View File

@@ -0,0 +1,38 @@
---
title: Prevent Waterfall Chains in API Routes
impact: CRITICAL
impactDescription: 2-10× improvement
tags: api-routes, server-actions, waterfalls, parallelization
---
## Prevent Waterfall Chains in API Routes
In API routes and Server Actions, start independent operations immediately, even if you don't await them yet.
**Incorrect (config waits for auth, data waits for both):**
```typescript
export async function GET(request: Request) {
const session = await auth()
const config = await fetchConfig()
const data = await fetchData(session.user.id)
return Response.json({ data, config })
}
```
**Correct (auth and config start immediately):**
```typescript
export async function GET(request: Request) {
const sessionPromise = auth()
const configPromise = fetchConfig()
const session = await sessionPromise
const [config, data] = await Promise.all([
configPromise,
fetchData(session.user.id)
])
return Response.json({ data, config })
}
```
For operations with more complex dependency chains, use `better-all` to automatically maximize parallelism (see Dependency-Based Parallelization).

View File

@@ -0,0 +1,80 @@
---
title: Defer Await Until Needed
impact: HIGH
impactDescription: avoids blocking unused code paths
tags: async, await, conditional, optimization
---
## Defer Await Until Needed
Move `await` operations into the branches where they're actually used to avoid blocking code paths that don't need them.
**Incorrect (blocks both branches):**
```typescript
async function handleRequest(userId: string, skipProcessing: boolean) {
const userData = await fetchUserData(userId)
if (skipProcessing) {
// Returns immediately but still waited for userData
return { skipped: true }
}
// Only this branch uses userData
return processUserData(userData)
}
```
**Correct (only blocks when needed):**
```typescript
async function handleRequest(userId: string, skipProcessing: boolean) {
if (skipProcessing) {
// Returns immediately without waiting
return { skipped: true }
}
// Fetch only when needed
const userData = await fetchUserData(userId)
return processUserData(userData)
}
```
**Another example (early return optimization):**
```typescript
// Incorrect: always fetches permissions
async function updateResource(resourceId: string, userId: string) {
const permissions = await fetchPermissions(userId)
const resource = await getResource(resourceId)
if (!resource) {
return { error: 'Not found' }
}
if (!permissions.canEdit) {
return { error: 'Forbidden' }
}
return await updateResourceData(resource, permissions)
}
// Correct: fetches only when needed
async function updateResource(resourceId: string, userId: string) {
const resource = await getResource(resourceId)
if (!resource) {
return { error: 'Not found' }
}
const permissions = await fetchPermissions(userId)
if (!permissions.canEdit) {
return { error: 'Forbidden' }
}
return await updateResourceData(resource, permissions)
}
```
This optimization is especially valuable when the skipped branch is frequently taken, or when the deferred operation is expensive.

View File

@@ -0,0 +1,36 @@
---
title: Dependency-Based Parallelization
impact: CRITICAL
impactDescription: 2-10× improvement
tags: async, parallelization, dependencies, better-all
---
## Dependency-Based Parallelization
For operations with partial dependencies, use `better-all` to maximize parallelism. It automatically starts each task at the earliest possible moment.
**Incorrect (profile waits for config unnecessarily):**
```typescript
const [user, config] = await Promise.all([
fetchUser(),
fetchConfig()
])
const profile = await fetchProfile(user.id)
```
**Correct (config and profile run in parallel):**
```typescript
import { all } from 'better-all'
const { user, config, profile } = await all({
async user() { return fetchUser() },
async config() { return fetchConfig() },
async profile() {
return fetchProfile((await this.$.user).id)
}
})
```
Reference: [https://github.com/shuding/better-all](https://github.com/shuding/better-all)

View File

@@ -0,0 +1,28 @@
---
title: Promise.all() for Independent Operations
impact: CRITICAL
impactDescription: 2-10× improvement
tags: async, parallelization, promises, waterfalls
---
## Promise.all() for Independent Operations
When async operations have no interdependencies, execute them concurrently using `Promise.all()`.
**Incorrect (sequential execution, 3 round trips):**
```typescript
const user = await fetchUser()
const posts = await fetchPosts()
const comments = await fetchComments()
```
**Correct (parallel execution, 1 round trip):**
```typescript
const [user, posts, comments] = await Promise.all([
fetchUser(),
fetchPosts(),
fetchComments()
])
```

View File

@@ -0,0 +1,99 @@
---
title: Strategic Suspense Boundaries
impact: HIGH
impactDescription: faster initial paint
tags: async, suspense, streaming, layout-shift
---
## Strategic Suspense Boundaries
Instead of awaiting data in async components before returning JSX, use Suspense boundaries to show the wrapper UI faster while data loads.
**Incorrect (wrapper blocked by data fetching):**
```tsx
async function Page() {
const data = await fetchData() // Blocks entire page
return (
<div>
<div>Sidebar</div>
<div>Header</div>
<div>
<DataDisplay data={data} />
</div>
<div>Footer</div>
</div>
)
}
```
The entire layout waits for data even though only the middle section needs it.
**Correct (wrapper shows immediately, data streams in):**
```tsx
function Page() {
return (
<div>
<div>Sidebar</div>
<div>Header</div>
<div>
<Suspense fallback={<Skeleton />}>
<DataDisplay />
</Suspense>
</div>
<div>Footer</div>
</div>
)
}
async function DataDisplay() {
const data = await fetchData() // Only blocks this component
return <div>{data.content}</div>
}
```
Sidebar, Header, and Footer render immediately. Only DataDisplay waits for data.
**Alternative (share promise across components):**
```tsx
function Page() {
// Start fetch immediately, but don't await
const dataPromise = fetchData()
return (
<div>
<div>Sidebar</div>
<div>Header</div>
<Suspense fallback={<Skeleton />}>
<DataDisplay dataPromise={dataPromise} />
<DataSummary dataPromise={dataPromise} />
</Suspense>
<div>Footer</div>
</div>
)
}
function DataDisplay({ dataPromise }: { dataPromise: Promise<Data> }) {
const data = use(dataPromise) // Unwraps the promise
return <div>{data.content}</div>
}
function DataSummary({ dataPromise }: { dataPromise: Promise<Data> }) {
const data = use(dataPromise) // Reuses the same promise
return <div>{data.summary}</div>
}
```
Both components share the same promise, so only one fetch occurs. Layout renders immediately while both components wait together.
**When NOT to use this pattern:**
- Critical data needed for layout decisions (affects positioning)
- SEO-critical content above the fold
- Small, fast queries where suspense overhead isn't worth it
- When you want to avoid layout shift (loading → content jump)
**Trade-off:** Faster initial paint vs potential layout shift. Choose based on your UX priorities.

View File

@@ -0,0 +1,59 @@
---
title: Avoid Barrel File Imports
impact: CRITICAL
impactDescription: 200-800ms import cost, slow builds
tags: bundle, imports, tree-shaking, barrel-files, performance
---
## Avoid Barrel File Imports
Import directly from source files instead of barrel files to avoid loading thousands of unused modules. **Barrel files** are entry points that re-export multiple modules (e.g., `index.js` that does `export * from './module'`).
Popular icon and component libraries can have **up to 10,000 re-exports** in their entry file. For many React packages, **it takes 200-800ms just to import them**, affecting both development speed and production cold starts.
**Why tree-shaking doesn't help:** When a library is marked as external (not bundled), the bundler can't optimize it. If you bundle it to enable tree-shaking, builds become substantially slower analyzing the entire module graph.
**Incorrect (imports entire library):**
```tsx
import { Check, X, Menu } from 'lucide-react'
// Loads 1,583 modules, takes ~2.8s extra in dev
// Runtime cost: 200-800ms on every cold start
import { Button, TextField } from '@mui/material'
// Loads 2,225 modules, takes ~4.2s extra in dev
```
**Correct (imports only what you need):**
```tsx
import Check from 'lucide-react/dist/esm/icons/check'
import X from 'lucide-react/dist/esm/icons/x'
import Menu from 'lucide-react/dist/esm/icons/menu'
// Loads only 3 modules (~2KB vs ~1MB)
import Button from '@mui/material/Button'
import TextField from '@mui/material/TextField'
// Loads only what you use
```
**Alternative (Next.js 13.5+):**
```js
// next.config.js - use optimizePackageImports
module.exports = {
experimental: {
optimizePackageImports: ['lucide-react', '@mui/material']
}
}
// Then you can keep the ergonomic barrel imports:
import { Check, X, Menu } from 'lucide-react'
// Automatically transformed to direct imports at build time
```
Direct imports provide 15-70% faster dev boot, 28% faster builds, 40% faster cold starts, and significantly faster HMR.
Libraries commonly affected: `lucide-react`, `@mui/material`, `@mui/icons-material`, `@tabler/icons-react`, `react-icons`, `@headlessui/react`, `@radix-ui/react-*`, `lodash`, `ramda`, `date-fns`, `rxjs`, `react-use`.
Reference: [How we optimized package imports in Next.js](https://vercel.com/blog/how-we-optimized-package-imports-in-next-js)

View File

@@ -0,0 +1,31 @@
---
title: Conditional Module Loading
impact: HIGH
impactDescription: loads large data only when needed
tags: bundle, conditional-loading, lazy-loading
---
## Conditional Module Loading
Load large data or modules only when a feature is activated.
**Example (lazy-load animation frames):**
```tsx
function AnimationPlayer({ enabled }: { enabled: boolean }) {
const [frames, setFrames] = useState<Frame[] | null>(null)
useEffect(() => {
if (enabled && !frames && typeof window !== 'undefined') {
import('./animation-frames.js')
.then(mod => setFrames(mod.frames))
.catch(() => setEnabled(false))
}
}, [enabled, frames])
if (!frames) return <Skeleton />
return <Canvas frames={frames} />
}
```
The `typeof window !== 'undefined'` check prevents bundling this module for SSR, optimizing server bundle size and build speed.

View File

@@ -0,0 +1,49 @@
---
title: Defer Non-Critical Third-Party Libraries
impact: MEDIUM
impactDescription: loads after hydration
tags: bundle, third-party, analytics, defer
---
## Defer Non-Critical Third-Party Libraries
Analytics, logging, and error tracking don't block user interaction. Load them after hydration.
**Incorrect (blocks initial bundle):**
```tsx
import { Analytics } from '@vercel/analytics/react'
export default function RootLayout({ children }) {
return (
<html>
<body>
{children}
<Analytics />
</body>
</html>
)
}
```
**Correct (loads after hydration):**
```tsx
import dynamic from 'next/dynamic'
const Analytics = dynamic(
() => import('@vercel/analytics/react').then(m => m.Analytics),
{ ssr: false }
)
export default function RootLayout({ children }) {
return (
<html>
<body>
{children}
<Analytics />
</body>
</html>
)
}
```

View File

@@ -0,0 +1,35 @@
---
title: Dynamic Imports for Heavy Components
impact: CRITICAL
impactDescription: directly affects TTI and LCP
tags: bundle, dynamic-import, code-splitting, next-dynamic
---
## Dynamic Imports for Heavy Components
Use `next/dynamic` to lazy-load large components not needed on initial render.
**Incorrect (Monaco bundles with main chunk ~300KB):**
```tsx
import { MonacoEditor } from './monaco-editor'
function CodePanel({ code }: { code: string }) {
return <MonacoEditor value={code} />
}
```
**Correct (Monaco loads on demand):**
```tsx
import dynamic from 'next/dynamic'
const MonacoEditor = dynamic(
() => import('./monaco-editor').then(m => m.MonacoEditor),
{ ssr: false }
)
function CodePanel({ code }: { code: string }) {
return <MonacoEditor value={code} />
}
```

View File

@@ -0,0 +1,50 @@
---
title: Preload Based on User Intent
impact: MEDIUM
impactDescription: reduces perceived latency
tags: bundle, preload, user-intent, hover
---
## Preload Based on User Intent
Preload heavy bundles before they're needed to reduce perceived latency.
**Example (preload on hover/focus):**
```tsx
function EditorButton({ onClick }: { onClick: () => void }) {
const preload = () => {
if (typeof window !== 'undefined') {
void import('./monaco-editor')
}
}
return (
<button
onMouseEnter={preload}
onFocus={preload}
onClick={onClick}
>
Open Editor
</button>
)
}
```
**Example (preload when feature flag is enabled):**
```tsx
function FlagsProvider({ children, flags }: Props) {
useEffect(() => {
if (flags.editorEnabled && typeof window !== 'undefined') {
void import('./monaco-editor').then(mod => mod.init())
}
}, [flags.editorEnabled])
return <FlagsContext.Provider value={flags}>
{children}
</FlagsContext.Provider>
}
```
The `typeof window !== 'undefined'` check prevents bundling preloaded modules for SSR, optimizing server bundle size and build speed.

View File

@@ -0,0 +1,74 @@
---
title: Deduplicate Global Event Listeners
impact: LOW
impactDescription: single listener for N components
tags: client, swr, event-listeners, subscription
---
## Deduplicate Global Event Listeners
Use `useSWRSubscription()` to share global event listeners across component instances.
**Incorrect (N instances = N listeners):**
```tsx
function useKeyboardShortcut(key: string, callback: () => void) {
useEffect(() => {
const handler = (e: KeyboardEvent) => {
if (e.metaKey && e.key === key) {
callback()
}
}
window.addEventListener('keydown', handler)
return () => window.removeEventListener('keydown', handler)
}, [key, callback])
}
```
When using the `useKeyboardShortcut` hook multiple times, each instance will register a new listener.
**Correct (N instances = 1 listener):**
```tsx
import useSWRSubscription from 'swr/subscription'
// Module-level Map to track callbacks per key
const keyCallbacks = new Map<string, Set<() => void>>()
function useKeyboardShortcut(key: string, callback: () => void) {
// Register this callback in the Map
useEffect(() => {
if (!keyCallbacks.has(key)) {
keyCallbacks.set(key, new Set())
}
keyCallbacks.get(key)!.add(callback)
return () => {
const set = keyCallbacks.get(key)
if (set) {
set.delete(callback)
if (set.size === 0) {
keyCallbacks.delete(key)
}
}
}
}, [key, callback])
useSWRSubscription('global-keydown', () => {
const handler = (e: KeyboardEvent) => {
if (e.metaKey && keyCallbacks.has(e.key)) {
keyCallbacks.get(e.key)!.forEach(cb => cb())
}
}
window.addEventListener('keydown', handler)
return () => window.removeEventListener('keydown', handler)
})
}
function Profile() {
// Multiple shortcuts will share the same listener
useKeyboardShortcut('p', () => { /* ... */ })
useKeyboardShortcut('k', () => { /* ... */ })
// ...
}
```

View File

@@ -0,0 +1,56 @@
---
title: Use SWR for Automatic Deduplication
impact: MEDIUM-HIGH
impactDescription: automatic deduplication
tags: client, swr, deduplication, data-fetching
---
## Use SWR for Automatic Deduplication
SWR enables request deduplication, caching, and revalidation across component instances.
**Incorrect (no deduplication, each instance fetches):**
```tsx
function UserList() {
const [users, setUsers] = useState([])
useEffect(() => {
fetch('/api/users')
.then(r => r.json())
.then(setUsers)
}, [])
}
```
**Correct (multiple instances share one request):**
```tsx
import useSWR from 'swr'
function UserList() {
const { data: users } = useSWR('/api/users', fetcher)
}
```
**For immutable data:**
```tsx
import { useImmutableSWR } from '@/lib/swr'
function StaticContent() {
const { data } = useImmutableSWR('/api/config', fetcher)
}
```
**For mutations:**
```tsx
import { useSWRMutation } from 'swr/mutation'
function UpdateButton() {
const { trigger } = useSWRMutation('/api/user', updateUser)
return <button onClick={() => trigger()}>Update</button>
}
```
Reference: [https://swr.vercel.app](https://swr.vercel.app)

View File

@@ -0,0 +1,82 @@
---
title: Batch DOM CSS Changes
impact: MEDIUM
impactDescription: reduces reflows/repaints
tags: javascript, dom, css, performance, reflow
---
## Batch DOM CSS Changes
Avoid changing styles one property at a time. Group multiple CSS changes together via classes or `cssText` to minimize browser reflows.
**Incorrect (multiple reflows):**
```typescript
function updateElementStyles(element: HTMLElement) {
// Each line triggers a reflow
element.style.width = '100px'
element.style.height = '200px'
element.style.backgroundColor = 'blue'
element.style.border = '1px solid black'
}
```
**Correct (add class - single reflow):**
```typescript
// CSS file
.highlighted-box {
width: 100px;
height: 200px;
background-color: blue;
border: 1px solid black;
}
// JavaScript
function updateElementStyles(element: HTMLElement) {
element.classList.add('highlighted-box')
}
```
**Correct (change cssText - single reflow):**
```typescript
function updateElementStyles(element: HTMLElement) {
element.style.cssText = `
width: 100px;
height: 200px;
background-color: blue;
border: 1px solid black;
`
}
```
**React example:**
```tsx
// Incorrect: changing styles one by one
function Box({ isHighlighted }: { isHighlighted: boolean }) {
const ref = useRef<HTMLDivElement>(null)
useEffect(() => {
if (ref.current && isHighlighted) {
ref.current.style.width = '100px'
ref.current.style.height = '200px'
ref.current.style.backgroundColor = 'blue'
}
}, [isHighlighted])
return <div ref={ref}>Content</div>
}
// Correct: toggle class
function Box({ isHighlighted }: { isHighlighted: boolean }) {
return (
<div className={isHighlighted ? 'highlighted-box' : ''}>
Content
</div>
)
}
```
Prefer CSS classes over inline styles when possible. Classes are cached by the browser and provide better separation of concerns.

View File

@@ -0,0 +1,80 @@
---
title: Cache Repeated Function Calls
impact: MEDIUM
impactDescription: avoid redundant computation
tags: javascript, cache, memoization, performance
---
## Cache Repeated Function Calls
Use a module-level Map to cache function results when the same function is called repeatedly with the same inputs during render.
**Incorrect (redundant computation):**
```typescript
function ProjectList({ projects }: { projects: Project[] }) {
return (
<div>
{projects.map(project => {
// slugify() called 100+ times for same project names
const slug = slugify(project.name)
return <ProjectCard key={project.id} slug={slug} />
})}
</div>
)
}
```
**Correct (cached results):**
```typescript
// Module-level cache
const slugifyCache = new Map<string, string>()
function cachedSlugify(text: string): string {
if (slugifyCache.has(text)) {
return slugifyCache.get(text)!
}
const result = slugify(text)
slugifyCache.set(text, result)
return result
}
function ProjectList({ projects }: { projects: Project[] }) {
return (
<div>
{projects.map(project => {
// Computed only once per unique project name
const slug = cachedSlugify(project.name)
return <ProjectCard key={project.id} slug={slug} />
})}
</div>
)
}
```
**Simpler pattern for single-value functions:**
```typescript
let isLoggedInCache: boolean | null = null
function isLoggedIn(): boolean {
if (isLoggedInCache !== null) {
return isLoggedInCache
}
isLoggedInCache = document.cookie.includes('auth=')
return isLoggedInCache
}
// Clear cache when auth changes
function onAuthChange() {
isLoggedInCache = null
}
```
Use a Map (not a hook) so it works everywhere: utilities, event handlers, not just React components.
Reference: [How we made the Vercel Dashboard twice as fast](https://vercel.com/blog/how-we-made-the-vercel-dashboard-twice-as-fast)

View File

@@ -0,0 +1,28 @@
---
title: Cache Property Access in Loops
impact: LOW-MEDIUM
impactDescription: reduces lookups
tags: javascript, loops, optimization, caching
---
## Cache Property Access in Loops
Cache object property lookups in hot paths.
**Incorrect (3 lookups × N iterations):**
```typescript
for (let i = 0; i < arr.length; i++) {
process(obj.config.settings.value)
}
```
**Correct (1 lookup total):**
```typescript
const value = obj.config.settings.value
const len = arr.length
for (let i = 0; i < len; i++) {
process(value)
}
```

View File

@@ -0,0 +1,70 @@
---
title: Cache Storage API Calls
impact: LOW-MEDIUM
impactDescription: reduces expensive I/O
tags: javascript, localStorage, storage, caching, performance
---
## Cache Storage API Calls
`localStorage`, `sessionStorage`, and `document.cookie` are synchronous and expensive. Cache reads in memory.
**Incorrect (reads storage on every call):**
```typescript
function getTheme() {
return localStorage.getItem('theme') ?? 'light'
}
// Called 10 times = 10 storage reads
```
**Correct (Map cache):**
```typescript
const storageCache = new Map<string, string | null>()
function getLocalStorage(key: string) {
if (!storageCache.has(key)) {
storageCache.set(key, localStorage.getItem(key))
}
return storageCache.get(key)
}
function setLocalStorage(key: string, value: string) {
localStorage.setItem(key, value)
storageCache.set(key, value) // keep cache in sync
}
```
Use a Map (not a hook) so it works everywhere: utilities, event handlers, not just React components.
**Cookie caching:**
```typescript
let cookieCache: Record<string, string> | null = null
function getCookie(name: string) {
if (!cookieCache) {
cookieCache = Object.fromEntries(
document.cookie.split('; ').map(c => c.split('='))
)
}
return cookieCache[name]
}
```
**Important (invalidate on external changes):**
If storage can change externally (another tab, server-set cookies), invalidate cache:
```typescript
window.addEventListener('storage', (e) => {
if (e.key) storageCache.delete(e.key)
})
document.addEventListener('visibilitychange', () => {
if (document.visibilityState === 'visible') {
storageCache.clear()
}
})
```

View File

@@ -0,0 +1,32 @@
---
title: Combine Multiple Array Iterations
impact: LOW-MEDIUM
impactDescription: reduces iterations
tags: javascript, arrays, loops, performance
---
## Combine Multiple Array Iterations
Multiple `.filter()` or `.map()` calls iterate the array multiple times. Combine into one loop.
**Incorrect (3 iterations):**
```typescript
const admins = users.filter(u => u.isAdmin)
const testers = users.filter(u => u.isTester)
const inactive = users.filter(u => !u.isActive)
```
**Correct (1 iteration):**
```typescript
const admins: User[] = []
const testers: User[] = []
const inactive: User[] = []
for (const user of users) {
if (user.isAdmin) admins.push(user)
if (user.isTester) testers.push(user)
if (!user.isActive) inactive.push(user)
}
```

View File

@@ -0,0 +1,50 @@
---
title: Early Return from Functions
impact: LOW-MEDIUM
impactDescription: avoids unnecessary computation
tags: javascript, functions, optimization, early-return
---
## Early Return from Functions
Return early when result is determined to skip unnecessary processing.
**Incorrect (processes all items even after finding answer):**
```typescript
function validateUsers(users: User[]) {
let hasError = false
let errorMessage = ''
for (const user of users) {
if (!user.email) {
hasError = true
errorMessage = 'Email required'
}
if (!user.name) {
hasError = true
errorMessage = 'Name required'
}
// Continues checking all users even after error found
}
return hasError ? { valid: false, error: errorMessage } : { valid: true }
}
```
**Correct (returns immediately on first error):**
```typescript
function validateUsers(users: User[]) {
for (const user of users) {
if (!user.email) {
return { valid: false, error: 'Email required' }
}
if (!user.name) {
return { valid: false, error: 'Name required' }
}
}
return { valid: true }
}
```

View File

@@ -0,0 +1,45 @@
---
title: Hoist RegExp Creation
impact: LOW-MEDIUM
impactDescription: avoids recreation
tags: javascript, regexp, optimization, memoization
---
## Hoist RegExp Creation
Don't create RegExp inside render. Hoist to module scope or memoize with `useMemo()`.
**Incorrect (new RegExp every render):**
```tsx
function Highlighter({ text, query }: Props) {
const regex = new RegExp(`(${query})`, 'gi')
const parts = text.split(regex)
return <>{parts.map((part, i) => ...)}</>
}
```
**Correct (memoize or hoist):**
```tsx
const EMAIL_REGEX = /^[^\s@]+@[^\s@]+\.[^\s@]+$/
function Highlighter({ text, query }: Props) {
const regex = useMemo(
() => new RegExp(`(${escapeRegex(query)})`, 'gi'),
[query]
)
const parts = text.split(regex)
return <>{parts.map((part, i) => ...)}</>
}
```
**Warning (global regex has mutable state):**
Global regex (`/g`) has mutable `lastIndex` state:
```typescript
const regex = /foo/g
regex.test('foo') // true, lastIndex = 3
regex.test('foo') // false, lastIndex = 0
```

View File

@@ -0,0 +1,37 @@
---
title: Build Index Maps for Repeated Lookups
impact: LOW-MEDIUM
impactDescription: 1M ops to 2K ops
tags: javascript, map, indexing, optimization, performance
---
## Build Index Maps for Repeated Lookups
Multiple `.find()` calls by the same key should use a Map.
**Incorrect (O(n) per lookup):**
```typescript
function processOrders(orders: Order[], users: User[]) {
return orders.map(order => ({
...order,
user: users.find(u => u.id === order.userId)
}))
}
```
**Correct (O(1) per lookup):**
```typescript
function processOrders(orders: Order[], users: User[]) {
const userById = new Map(users.map(u => [u.id, u]))
return orders.map(order => ({
...order,
user: userById.get(order.userId)
}))
}
```
Build map once (O(n)), then all lookups are O(1).
For 1000 orders × 1000 users: 1M ops → 2K ops.

View File

@@ -0,0 +1,49 @@
---
title: Early Length Check for Array Comparisons
impact: MEDIUM-HIGH
impactDescription: avoids expensive operations when lengths differ
tags: javascript, arrays, performance, optimization, comparison
---
## Early Length Check for Array Comparisons
When comparing arrays with expensive operations (sorting, deep equality, serialization), check lengths first. If lengths differ, the arrays cannot be equal.
In real-world applications, this optimization is especially valuable when the comparison runs in hot paths (event handlers, render loops).
**Incorrect (always runs expensive comparison):**
```typescript
function hasChanges(current: string[], original: string[]) {
// Always sorts and joins, even when lengths differ
return current.sort().join() !== original.sort().join()
}
```
Two O(n log n) sorts run even when `current.length` is 5 and `original.length` is 100. There is also overhead of joining the arrays and comparing the strings.
**Correct (O(1) length check first):**
```typescript
function hasChanges(current: string[], original: string[]) {
// Early return if lengths differ
if (current.length !== original.length) {
return true
}
// Only sort/join when lengths match
const currentSorted = current.toSorted()
const originalSorted = original.toSorted()
for (let i = 0; i < currentSorted.length; i++) {
if (currentSorted[i] !== originalSorted[i]) {
return true
}
}
return false
}
```
This new approach is more efficient because:
- It avoids the overhead of sorting and joining the arrays when lengths differ
- It avoids consuming memory for the joined strings (especially important for large arrays)
- It avoids mutating the original arrays
- It returns early when a difference is found

View File

@@ -0,0 +1,82 @@
---
title: Use Loop for Min/Max Instead of Sort
impact: LOW
impactDescription: O(n) instead of O(n log n)
tags: javascript, arrays, performance, sorting, algorithms
---
## Use Loop for Min/Max Instead of Sort
Finding the smallest or largest element only requires a single pass through the array. Sorting is wasteful and slower.
**Incorrect (O(n log n) - sort to find latest):**
```typescript
interface Project {
id: string
name: string
updatedAt: number
}
function getLatestProject(projects: Project[]) {
const sorted = [...projects].sort((a, b) => b.updatedAt - a.updatedAt)
return sorted[0]
}
```
Sorts the entire array just to find the maximum value.
**Incorrect (O(n log n) - sort for oldest and newest):**
```typescript
function getOldestAndNewest(projects: Project[]) {
const sorted = [...projects].sort((a, b) => a.updatedAt - b.updatedAt)
return { oldest: sorted[0], newest: sorted[sorted.length - 1] }
}
```
Still sorts unnecessarily when only min/max are needed.
**Correct (O(n) - single loop):**
```typescript
function getLatestProject(projects: Project[]) {
if (projects.length === 0) return null
let latest = projects[0]
for (let i = 1; i < projects.length; i++) {
if (projects[i].updatedAt > latest.updatedAt) {
latest = projects[i]
}
}
return latest
}
function getOldestAndNewest(projects: Project[]) {
if (projects.length === 0) return { oldest: null, newest: null }
let oldest = projects[0]
let newest = projects[0]
for (let i = 1; i < projects.length; i++) {
if (projects[i].updatedAt < oldest.updatedAt) oldest = projects[i]
if (projects[i].updatedAt > newest.updatedAt) newest = projects[i]
}
return { oldest, newest }
}
```
Single pass through the array, no copying, no sorting.
**Alternative (Math.min/Math.max for small arrays):**
```typescript
const numbers = [5, 2, 8, 1, 9]
const min = Math.min(...numbers)
const max = Math.max(...numbers)
```
This works for small arrays but can be slower for very large arrays due to spread operator limitations. Use the loop approach for reliability.

View File

@@ -0,0 +1,24 @@
---
title: Use Set/Map for O(1) Lookups
impact: LOW-MEDIUM
impactDescription: O(n) to O(1)
tags: javascript, set, map, data-structures, performance
---
## Use Set/Map for O(1) Lookups
Convert arrays to Set/Map for repeated membership checks.
**Incorrect (O(n) per check):**
```typescript
const allowedIds = ['a', 'b', 'c', ...]
items.filter(item => allowedIds.includes(item.id))
```
**Correct (O(1) per check):**
```typescript
const allowedIds = new Set(['a', 'b', 'c', ...])
items.filter(item => allowedIds.has(item.id))
```

View File

@@ -0,0 +1,57 @@
---
title: Use toSorted() Instead of sort() for Immutability
impact: MEDIUM-HIGH
impactDescription: prevents mutation bugs in React state
tags: javascript, arrays, immutability, react, state, mutation
---
## Use toSorted() Instead of sort() for Immutability
`.sort()` mutates the array in place, which can cause bugs with React state and props. Use `.toSorted()` to create a new sorted array without mutation.
**Incorrect (mutates original array):**
```typescript
function UserList({ users }: { users: User[] }) {
// Mutates the users prop array!
const sorted = useMemo(
() => users.sort((a, b) => a.name.localeCompare(b.name)),
[users]
)
return <div>{sorted.map(renderUser)}</div>
}
```
**Correct (creates new array):**
```typescript
function UserList({ users }: { users: User[] }) {
// Creates new sorted array, original unchanged
const sorted = useMemo(
() => users.toSorted((a, b) => a.name.localeCompare(b.name)),
[users]
)
return <div>{sorted.map(renderUser)}</div>
}
```
**Why this matters in React:**
1. Props/state mutations break React's immutability model - React expects props and state to be treated as read-only
2. Causes stale closure bugs - Mutating arrays inside closures (callbacks, effects) can lead to unexpected behavior
**Browser support (fallback for older browsers):**
`.toSorted()` is available in all modern browsers (Chrome 110+, Safari 16+, Firefox 115+, Node.js 20+). For older environments, use spread operator:
```typescript
// Fallback for older browsers
const sorted = [...items].sort((a, b) => a.value - b.value)
```
**Other immutable array methods:**
- `.toSorted()` - immutable sort
- `.toReversed()` - immutable reverse
- `.toSpliced()` - immutable splice
- `.with()` - immutable element replacement

View File

@@ -0,0 +1,26 @@
---
title: Use Activity Component for Show/Hide
impact: MEDIUM
impactDescription: preserves state/DOM
tags: rendering, activity, visibility, state-preservation
---
## Use Activity Component for Show/Hide
Use React's `<Activity>` to preserve state/DOM for expensive components that frequently toggle visibility.
**Usage:**
```tsx
import { Activity } from 'react'
function Dropdown({ isOpen }: Props) {
return (
<Activity mode={isOpen ? 'visible' : 'hidden'}>
<ExpensiveMenu />
</Activity>
)
}
```
Avoids expensive re-renders and state loss.

View File

@@ -0,0 +1,47 @@
---
title: Animate SVG Wrapper Instead of SVG Element
impact: LOW
impactDescription: enables hardware acceleration
tags: rendering, svg, css, animation, performance
---
## Animate SVG Wrapper Instead of SVG Element
Many browsers don't have hardware acceleration for CSS3 animations on SVG elements. Wrap SVG in a `<div>` and animate the wrapper instead.
**Incorrect (animating SVG directly - no hardware acceleration):**
```tsx
function LoadingSpinner() {
return (
<svg
className="animate-spin"
width="24"
height="24"
viewBox="0 0 24 24"
>
<circle cx="12" cy="12" r="10" stroke="currentColor" />
</svg>
)
}
```
**Correct (animating wrapper div - hardware accelerated):**
```tsx
function LoadingSpinner() {
return (
<div className="animate-spin">
<svg
width="24"
height="24"
viewBox="0 0 24 24"
>
<circle cx="12" cy="12" r="10" stroke="currentColor" />
</svg>
</div>
)
}
```
This applies to all CSS transforms and transitions (`transform`, `opacity`, `translate`, `scale`, `rotate`). The wrapper div allows browsers to use GPU acceleration for smoother animations.

View File

@@ -0,0 +1,40 @@
---
title: Use Explicit Conditional Rendering
impact: LOW
impactDescription: prevents rendering 0 or NaN
tags: rendering, conditional, jsx, falsy-values
---
## Use Explicit Conditional Rendering
Use explicit ternary operators (`? :`) instead of `&&` for conditional rendering when the condition can be `0`, `NaN`, or other falsy values that render.
**Incorrect (renders "0" when count is 0):**
```tsx
function Badge({ count }: { count: number }) {
return (
<div>
{count && <span className="badge">{count}</span>}
</div>
)
}
// When count = 0, renders: <div>0</div>
// When count = 5, renders: <div><span class="badge">5</span></div>
```
**Correct (renders nothing when count is 0):**
```tsx
function Badge({ count }: { count: number }) {
return (
<div>
{count > 0 ? <span className="badge">{count}</span> : null}
</div>
)
}
// When count = 0, renders: <div></div>
// When count = 5, renders: <div><span class="badge">5</span></div>
```

View File

@@ -0,0 +1,38 @@
---
title: CSS content-visibility for Long Lists
impact: HIGH
impactDescription: faster initial render
tags: rendering, css, content-visibility, long-lists
---
## CSS content-visibility for Long Lists
Apply `content-visibility: auto` to defer off-screen rendering.
**CSS:**
```css
.message-item {
content-visibility: auto;
contain-intrinsic-size: 0 80px;
}
```
**Example:**
```tsx
function MessageList({ messages }: { messages: Message[] }) {
return (
<div className="overflow-y-auto h-screen">
{messages.map(msg => (
<div key={msg.id} className="message-item">
<Avatar user={msg.author} />
<div>{msg.content}</div>
</div>
))}
</div>
)
}
```
For 1000 messages, browser skips layout/paint for ~990 off-screen items (10× faster initial render).

View File

@@ -0,0 +1,46 @@
---
title: Hoist Static JSX Elements
impact: LOW
impactDescription: avoids re-creation
tags: rendering, jsx, static, optimization
---
## Hoist Static JSX Elements
Extract static JSX outside components to avoid re-creation.
**Incorrect (recreates element every render):**
```tsx
function LoadingSkeleton() {
return <div className="animate-pulse h-20 bg-gray-200" />
}
function Container() {
return (
<div>
{loading && <LoadingSkeleton />}
</div>
)
}
```
**Correct (reuses same element):**
```tsx
const loadingSkeleton = (
<div className="animate-pulse h-20 bg-gray-200" />
)
function Container() {
return (
<div>
{loading && loadingSkeleton}
</div>
)
}
```
This is especially helpful for large and static SVG nodes, which can be expensive to recreate on every render.
**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, the compiler automatically hoists static JSX elements and optimizes component re-renders, making manual hoisting unnecessary.

View File

@@ -0,0 +1,82 @@
---
title: Prevent Hydration Mismatch Without Flickering
impact: MEDIUM
impactDescription: avoids visual flicker and hydration errors
tags: rendering, ssr, hydration, localStorage, flicker
---
## Prevent Hydration Mismatch Without Flickering
When rendering content that depends on client-side storage (localStorage, cookies), avoid both SSR breakage and post-hydration flickering by injecting a synchronous script that updates the DOM before React hydrates.
**Incorrect (breaks SSR):**
```tsx
function ThemeWrapper({ children }: { children: ReactNode }) {
// localStorage is not available on server - throws error
const theme = localStorage.getItem('theme') || 'light'
return (
<div className={theme}>
{children}
</div>
)
}
```
Server-side rendering will fail because `localStorage` is undefined.
**Incorrect (visual flickering):**
```tsx
function ThemeWrapper({ children }: { children: ReactNode }) {
const [theme, setTheme] = useState('light')
useEffect(() => {
// Runs after hydration - causes visible flash
const stored = localStorage.getItem('theme')
if (stored) {
setTheme(stored)
}
}, [])
return (
<div className={theme}>
{children}
</div>
)
}
```
Component first renders with default value (`light`), then updates after hydration, causing a visible flash of incorrect content.
**Correct (no flicker, no hydration mismatch):**
```tsx
function ThemeWrapper({ children }: { children: ReactNode }) {
return (
<>
<div id="theme-wrapper">
{children}
</div>
<script
dangerouslySetInnerHTML={{
__html: `
(function() {
try {
var theme = localStorage.getItem('theme') || 'light';
var el = document.getElementById('theme-wrapper');
if (el) el.className = theme;
} catch (e) {}
})();
`,
}}
/>
</>
)
}
```
The inline script executes synchronously before showing the element, ensuring the DOM already has the correct value. No flickering, no hydration mismatch.
This pattern is especially useful for theme toggles, user preferences, authentication states, and any client-only data that should render immediately without flashing default values.

View File

@@ -0,0 +1,28 @@
---
title: Optimize SVG Precision
impact: LOW
impactDescription: reduces file size
tags: rendering, svg, optimization, svgo
---
## Optimize SVG Precision
Reduce SVG coordinate precision to decrease file size. The optimal precision depends on the viewBox size, but in general reducing precision should be considered.
**Incorrect (excessive precision):**
```svg
<path d="M 10.293847 20.847362 L 30.938472 40.192837" />
```
**Correct (1 decimal place):**
```svg
<path d="M 10.3 20.8 L 30.9 40.2" />
```
**Automate with SVGO:**
```bash
npx svgo --precision=1 --multipass icon.svg
```

View File

@@ -0,0 +1,39 @@
---
title: Defer State Reads to Usage Point
impact: MEDIUM
impactDescription: avoids unnecessary subscriptions
tags: rerender, searchParams, localStorage, optimization
---
## Defer State Reads to Usage Point
Don't subscribe to dynamic state (searchParams, localStorage) if you only read it inside callbacks.
**Incorrect (subscribes to all searchParams changes):**
```tsx
function ShareButton({ chatId }: { chatId: string }) {
const searchParams = useSearchParams()
const handleShare = () => {
const ref = searchParams.get('ref')
shareChat(chatId, { ref })
}
return <button onClick={handleShare}>Share</button>
}
```
**Correct (reads on demand, no subscription):**
```tsx
function ShareButton({ chatId }: { chatId: string }) {
const handleShare = () => {
const params = new URLSearchParams(window.location.search)
const ref = params.get('ref')
shareChat(chatId, { ref })
}
return <button onClick={handleShare}>Share</button>
}
```

View File

@@ -0,0 +1,45 @@
---
title: Narrow Effect Dependencies
impact: LOW
impactDescription: minimizes effect re-runs
tags: rerender, useEffect, dependencies, optimization
---
## Narrow Effect Dependencies
Specify primitive dependencies instead of objects to minimize effect re-runs.
**Incorrect (re-runs on any user field change):**
```tsx
useEffect(() => {
console.log(user.id)
}, [user])
```
**Correct (re-runs only when id changes):**
```tsx
useEffect(() => {
console.log(user.id)
}, [user.id])
```
**For derived state, compute outside effect:**
```tsx
// Incorrect: runs on width=767, 766, 765...
useEffect(() => {
if (width < 768) {
enableMobileMode()
}
}, [width])
// Correct: runs only on boolean transition
const isMobile = width < 768
useEffect(() => {
if (isMobile) {
enableMobileMode()
}
}, [isMobile])
```

View File

@@ -0,0 +1,29 @@
---
title: Subscribe to Derived State
impact: MEDIUM
impactDescription: reduces re-render frequency
tags: rerender, derived-state, media-query, optimization
---
## Subscribe to Derived State
Subscribe to derived boolean state instead of continuous values to reduce re-render frequency.
**Incorrect (re-renders on every pixel change):**
```tsx
function Sidebar() {
const width = useWindowWidth() // updates continuously
const isMobile = width < 768
return <nav className={isMobile ? 'mobile' : 'desktop'}>
}
```
**Correct (re-renders only when boolean changes):**
```tsx
function Sidebar() {
const isMobile = useMediaQuery('(max-width: 767px)')
return <nav className={isMobile ? 'mobile' : 'desktop'}>
}
```

View File

@@ -0,0 +1,74 @@
---
title: Use Functional setState Updates
impact: MEDIUM
impactDescription: prevents stale closures and unnecessary callback recreations
tags: react, hooks, useState, useCallback, callbacks, closures
---
## Use Functional setState Updates
When updating state based on the current state value, use the functional update form of setState instead of directly referencing the state variable. This prevents stale closures, eliminates unnecessary dependencies, and creates stable callback references.
**Incorrect (requires state as dependency):**
```tsx
function TodoList() {
const [items, setItems] = useState(initialItems)
// Callback must depend on items, recreated on every items change
const addItems = useCallback((newItems: Item[]) => {
setItems([...items, ...newItems])
}, [items]) // ❌ items dependency causes recreations
// Risk of stale closure if dependency is forgotten
const removeItem = useCallback((id: string) => {
setItems(items.filter(item => item.id !== id))
}, []) // ❌ Missing items dependency - will use stale items!
return <ItemsEditor items={items} onAdd={addItems} onRemove={removeItem} />
}
```
The first callback is recreated every time `items` changes, which can cause child components to re-render unnecessarily. The second callback has a stale closure bug—it will always reference the initial `items` value.
**Correct (stable callbacks, no stale closures):**
```tsx
function TodoList() {
const [items, setItems] = useState(initialItems)
// Stable callback, never recreated
const addItems = useCallback((newItems: Item[]) => {
setItems(curr => [...curr, ...newItems])
}, []) // ✅ No dependencies needed
// Always uses latest state, no stale closure risk
const removeItem = useCallback((id: string) => {
setItems(curr => curr.filter(item => item.id !== id))
}, []) // ✅ Safe and stable
return <ItemsEditor items={items} onAdd={addItems} onRemove={removeItem} />
}
```
**Benefits:**
1. **Stable callback references** - Callbacks don't need to be recreated when state changes
2. **No stale closures** - Always operates on the latest state value
3. **Fewer dependencies** - Simplifies dependency arrays and reduces memory leaks
4. **Prevents bugs** - Eliminates the most common source of React closure bugs
**When to use functional updates:**
- Any setState that depends on the current state value
- Inside useCallback/useMemo when state is needed
- Event handlers that reference state
- Async operations that update state
**When direct updates are fine:**
- Setting state to a static value: `setCount(0)`
- Setting state from props/arguments only: `setName(newName)`
- State doesn't depend on previous value
**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, the compiler can automatically optimize some cases, but functional updates are still recommended for correctness and to prevent stale closure bugs.

View File

@@ -0,0 +1,58 @@
---
title: Use Lazy State Initialization
impact: MEDIUM
impactDescription: wasted computation on every render
tags: react, hooks, useState, performance, initialization
---
## Use Lazy State Initialization
Pass a function to `useState` for expensive initial values. Without the function form, the initializer runs on every render even though the value is only used once.
**Incorrect (runs on every render):**
```tsx
function FilteredList({ items }: { items: Item[] }) {
// buildSearchIndex() runs on EVERY render, even after initialization
const [searchIndex, setSearchIndex] = useState(buildSearchIndex(items))
const [query, setQuery] = useState('')
// When query changes, buildSearchIndex runs again unnecessarily
return <SearchResults index={searchIndex} query={query} />
}
function UserProfile() {
// JSON.parse runs on every render
const [settings, setSettings] = useState(
JSON.parse(localStorage.getItem('settings') || '{}')
)
return <SettingsForm settings={settings} onChange={setSettings} />
}
```
**Correct (runs only once):**
```tsx
function FilteredList({ items }: { items: Item[] }) {
// buildSearchIndex() runs ONLY on initial render
const [searchIndex, setSearchIndex] = useState(() => buildSearchIndex(items))
const [query, setQuery] = useState('')
return <SearchResults index={searchIndex} query={query} />
}
function UserProfile() {
// JSON.parse runs only on initial render
const [settings, setSettings] = useState(() => {
const stored = localStorage.getItem('settings')
return stored ? JSON.parse(stored) : {}
})
return <SettingsForm settings={settings} onChange={setSettings} />
}
```
Use lazy initialization when computing initial values from localStorage/sessionStorage, building data structures (indexes, maps), reading from the DOM, or performing heavy transformations.
For simple primitives (`useState(0)`), direct references (`useState(props.value)`), or cheap literals (`useState({})`), the function form is unnecessary.

View File

@@ -0,0 +1,44 @@
---
title: Extract to Memoized Components
impact: MEDIUM
impactDescription: enables early returns
tags: rerender, memo, useMemo, optimization
---
## Extract to Memoized Components
Extract expensive work into memoized components to enable early returns before computation.
**Incorrect (computes avatar even when loading):**
```tsx
function Profile({ user, loading }: Props) {
const avatar = useMemo(() => {
const id = computeAvatarId(user)
return <Avatar id={id} />
}, [user])
if (loading) return <Skeleton />
return <div>{avatar}</div>
}
```
**Correct (skips computation when loading):**
```tsx
const UserAvatar = memo(function UserAvatar({ user }: { user: User }) {
const id = useMemo(() => computeAvatarId(user), [user])
return <Avatar id={id} />
})
function Profile({ user, loading }: Props) {
if (loading) return <Skeleton />
return (
<div>
<UserAvatar user={user} />
</div>
)
}
```
**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, manual memoization with `memo()` and `useMemo()` is not necessary. The compiler automatically optimizes re-renders.

View File

@@ -0,0 +1,40 @@
---
title: Use Transitions for Non-Urgent Updates
impact: MEDIUM
impactDescription: maintains UI responsiveness
tags: rerender, transitions, startTransition, performance
---
## Use Transitions for Non-Urgent Updates
Mark frequent, non-urgent state updates as transitions to maintain UI responsiveness.
**Incorrect (blocks UI on every scroll):**
```tsx
function ScrollTracker() {
const [scrollY, setScrollY] = useState(0)
useEffect(() => {
const handler = () => setScrollY(window.scrollY)
window.addEventListener('scroll', handler, { passive: true })
return () => window.removeEventListener('scroll', handler)
}, [])
}
```
**Correct (non-blocking updates):**
```tsx
import { startTransition } from 'react'
function ScrollTracker() {
const [scrollY, setScrollY] = useState(0)
useEffect(() => {
const handler = () => {
startTransition(() => setScrollY(window.scrollY))
}
window.addEventListener('scroll', handler, { passive: true })
return () => window.removeEventListener('scroll', handler)
}, [])
}
```

View File

@@ -0,0 +1,73 @@
---
title: Use after() for Non-Blocking Operations
impact: MEDIUM
impactDescription: faster response times
tags: server, async, logging, analytics, side-effects
---
## Use after() for Non-Blocking Operations
Use Next.js's `after()` to schedule work that should execute after a response is sent. This prevents logging, analytics, and other side effects from blocking the response.
**Incorrect (blocks response):**
```tsx
import { logUserAction } from '@/app/utils'
export async function POST(request: Request) {
// Perform mutation
await updateDatabase(request)
// Logging blocks the response
const userAgent = request.headers.get('user-agent') || 'unknown'
await logUserAction({ userAgent })
return new Response(JSON.stringify({ status: 'success' }), {
status: 200,
headers: { 'Content-Type': 'application/json' }
})
}
```
**Correct (non-blocking):**
```tsx
import { after } from 'next/server'
import { headers, cookies } from 'next/headers'
import { logUserAction } from '@/app/utils'
export async function POST(request: Request) {
// Perform mutation
await updateDatabase(request)
// Log after response is sent
after(async () => {
const userAgent = (await headers()).get('user-agent') || 'unknown'
const sessionCookie = (await cookies()).get('session-id')?.value || 'anonymous'
logUserAction({ sessionCookie, userAgent })
})
return new Response(JSON.stringify({ status: 'success' }), {
status: 200,
headers: { 'Content-Type': 'application/json' }
})
}
```
The response is sent immediately while logging happens in the background.
**Common use cases:**
- Analytics tracking
- Audit logging
- Sending notifications
- Cache invalidation
- Cleanup tasks
**Important notes:**
- `after()` runs even if the response fails or redirects
- Works in Server Actions, Route Handlers, and Server Components
Reference: [https://nextjs.org/docs/app/api-reference/functions/after](https://nextjs.org/docs/app/api-reference/functions/after)

View File

@@ -0,0 +1,41 @@
---
title: Cross-Request LRU Caching
impact: HIGH
impactDescription: caches across requests
tags: server, cache, lru, cross-request
---
## Cross-Request LRU Caching
`React.cache()` only works within one request. For data shared across sequential requests (user clicks button A then button B), use an LRU cache.
**Implementation:**
```typescript
import { LRUCache } from 'lru-cache'
const cache = new LRUCache<string, any>({
max: 1000,
ttl: 5 * 60 * 1000 // 5 minutes
})
export async function getUser(id: string) {
const cached = cache.get(id)
if (cached) return cached
const user = await db.user.findUnique({ where: { id } })
cache.set(id, user)
return user
}
// Request 1: DB query, result cached
// Request 2: cache hit, no DB query
```
Use when sequential user actions hit multiple endpoints needing the same data within seconds.
**With Vercel's [Fluid Compute](https://vercel.com/docs/fluid-compute):** LRU caching is especially effective because multiple concurrent requests can share the same function instance and cache. This means the cache persists across requests without needing external storage like Redis.
**In traditional serverless:** Each invocation runs in isolation, so consider Redis for cross-process caching.
Reference: [https://github.com/isaacs/node-lru-cache](https://github.com/isaacs/node-lru-cache)

View File

@@ -0,0 +1,26 @@
---
title: Per-Request Deduplication with React.cache()
impact: MEDIUM
impactDescription: deduplicates within request
tags: server, cache, react-cache, deduplication
---
## Per-Request Deduplication with React.cache()
Use `React.cache()` for server-side request deduplication. Authentication and database queries benefit most.
**Usage:**
```typescript
import { cache } from 'react'
export const getCurrentUser = cache(async () => {
const session = await auth()
if (!session?.user?.id) return null
return await db.user.findUnique({
where: { id: session.user.id }
})
})
```
Within a single request, multiple calls to `getCurrentUser()` execute the query only once.

View File

@@ -0,0 +1,79 @@
---
title: Parallel Data Fetching with Component Composition
impact: CRITICAL
impactDescription: eliminates server-side waterfalls
tags: server, rsc, parallel-fetching, composition
---
## Parallel Data Fetching with Component Composition
React Server Components execute sequentially within a tree. Restructure with composition to parallelize data fetching.
**Incorrect (Sidebar waits for Page's fetch to complete):**
```tsx
export default async function Page() {
const header = await fetchHeader()
return (
<div>
<div>{header}</div>
<Sidebar />
</div>
)
}
async function Sidebar() {
const items = await fetchSidebarItems()
return <nav>{items.map(renderItem)}</nav>
}
```
**Correct (both fetch simultaneously):**
```tsx
async function Header() {
const data = await fetchHeader()
return <div>{data}</div>
}
async function Sidebar() {
const items = await fetchSidebarItems()
return <nav>{items.map(renderItem)}</nav>
}
export default function Page() {
return (
<div>
<Header />
<Sidebar />
</div>
)
}
```
**Alternative with children prop:**
```tsx
async function Layout({ children }: { children: ReactNode }) {
const header = await fetchHeader()
return (
<div>
<div>{header}</div>
{children}
</div>
)
}
async function Sidebar() {
const items = await fetchSidebarItems()
return <nav>{items.map(renderItem)}</nav>
}
export default function Page() {
return (
<Layout>
<Sidebar />
</Layout>
)
}
```

View File

@@ -0,0 +1,38 @@
---
title: Minimize Serialization at RSC Boundaries
impact: HIGH
impactDescription: reduces data transfer size
tags: server, rsc, serialization, props
---
## Minimize Serialization at RSC Boundaries
The React Server/Client boundary serializes all object properties into strings and embeds them in the HTML response and subsequent RSC requests. This serialized data directly impacts page weight and load time, so **size matters a lot**. Only pass fields that the client actually uses.
**Incorrect (serializes all 50 fields):**
```tsx
async function Page() {
const user = await fetchUser() // 50 fields
return <Profile user={user} />
}
'use client'
function Profile({ user }: { user: User }) {
return <div>{user.name}</div> // uses 1 field
}
```
**Correct (serializes only 1 field):**
```tsx
async function Page() {
const user = await fetchUser()
return <Profile name={user.name} />
}
'use client'
function Profile({ name }: { name: string }) {
return <div>{name}</div>
}
```

View File

@@ -0,0 +1,85 @@
---
name: worktree
description: Set up a new git worktree for parallel development. Creates the worktree, copies .env files, installs dependencies, and generates Prisma client. TRIGGER when user asks to set up a worktree, work on a branch in isolation, or needs a separate environment for a branch or PR.
user-invocable: true
args: "[name] — optional worktree name (e.g., 'AutoGPT7'). If omitted, uses next available AutoGPT<N>."
metadata:
author: autogpt-team
version: "3.0.0"
---
# Worktree Setup
## Create the worktree
Derive paths from the git toplevel. If a name is provided as argument, use it. Otherwise, check `git worktree list` and pick the next `AutoGPT<N>`.
```bash
ROOT=$(git rev-parse --show-toplevel)
PARENT=$(dirname "$ROOT")
# From an existing branch
git worktree add "$PARENT/<NAME>" <branch-name>
# From a new branch off dev
git worktree add -b <new-branch> "$PARENT/<NAME>" dev
```
## Copy environment files
Copy `.env` from the root worktree. Falls back to `.env.default` if `.env` doesn't exist.
```bash
ROOT=$(git rev-parse --show-toplevel)
TARGET="$(dirname "$ROOT")/<NAME>"
for envpath in autogpt_platform/backend autogpt_platform/frontend autogpt_platform; do
if [ -f "$ROOT/$envpath/.env" ]; then
cp "$ROOT/$envpath/.env" "$TARGET/$envpath/.env"
elif [ -f "$ROOT/$envpath/.env.default" ]; then
cp "$ROOT/$envpath/.env.default" "$TARGET/$envpath/.env"
fi
done
```
## Install dependencies
```bash
TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
cd "$TARGET/autogpt_platform/autogpt_libs" && poetry install
cd "$TARGET/autogpt_platform/backend" && poetry install && poetry run prisma generate
cd "$TARGET/autogpt_platform/frontend" && pnpm install
```
Replace `<NAME>` with the actual worktree name (e.g., `AutoGPT7`).
## Running the app (optional)
Backend uses ports: 8001, 8002, 8003, 8005, 8006, 8007, 8008. Free them first if needed:
```bash
TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
for port in 8001 8002 8003 8005 8006 8007 8008; do
lsof -ti :$port | xargs kill -9 2>/dev/null || true
done
cd "$TARGET/autogpt_platform/backend" && poetry run app
```
## CoPilot testing
SDK mode spawns a Claude subprocess — won't work inside Claude Code. Set `CHAT_USE_CLAUDE_AGENT_SDK=false` in `backend/.env` to use baseline mode.
## Cleanup
```bash
# Replace <NAME> with the actual worktree name (e.g., AutoGPT7)
git worktree remove "$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
```
## Alternative: Branchlet (optional)
If [branchlet](https://www.npmjs.com/package/branchlet) is installed:
```bash
branchlet create -n <name> -s <source-branch> -b <new-branch>
```

View File

@@ -1,42 +1,17 @@
# Ignore everything by default, selectively add things to context
*
# Documentation (for embeddings/search)
!docs/
# Platform - Libs
!autogpt_platform/autogpt_libs/autogpt_libs/
!autogpt_platform/autogpt_libs/pyproject.toml
!autogpt_platform/autogpt_libs/poetry.lock
!autogpt_platform/autogpt_libs/README.md
!autogpt_platform/autogpt_libs/
# Platform - Backend
!autogpt_platform/backend/backend/
!autogpt_platform/backend/test/e2e_test_data.py
!autogpt_platform/backend/migrations/
!autogpt_platform/backend/schema.prisma
!autogpt_platform/backend/pyproject.toml
!autogpt_platform/backend/poetry.lock
!autogpt_platform/backend/README.md
!autogpt_platform/backend/.env
# Platform - Market
!autogpt_platform/market/market/
!autogpt_platform/market/scripts.py
!autogpt_platform/market/schema.prisma
!autogpt_platform/market/pyproject.toml
!autogpt_platform/market/poetry.lock
!autogpt_platform/market/README.md
!autogpt_platform/backend/
# Platform - Frontend
!autogpt_platform/frontend/src/
!autogpt_platform/frontend/public/
!autogpt_platform/frontend/scripts/
!autogpt_platform/frontend/package.json
!autogpt_platform/frontend/pnpm-lock.yaml
!autogpt_platform/frontend/tsconfig.json
!autogpt_platform/frontend/README.md
## config
!autogpt_platform/frontend/*.config.*
!autogpt_platform/frontend/.env.*
!autogpt_platform/frontend/.env
!autogpt_platform/frontend/
# Classic - AutoGPT
!classic/original_autogpt/autogpt/
@@ -60,6 +35,38 @@
# Classic - Frontend
!classic/frontend/build/web/
# Explicitly re-ignore some folders
.*
**/__pycache__
# Explicitly re-ignore unwanted files from whitelisted directories
# Note: These patterns MUST come after the whitelist rules to take effect
# Hidden files and directories (but keep frontend .env files needed for build)
**/.*
!autogpt_platform/frontend/.env
!autogpt_platform/frontend/.env.default
!autogpt_platform/frontend/.env.production
# Python artifacts
**/__pycache__/
**/*.pyc
**/*.pyo
**/.venv/
**/.ruff_cache/
**/.pytest_cache/
**/.coverage
**/htmlcov/
# Node artifacts
**/node_modules/
**/.next/
**/storybook-static/
**/playwright-report/
**/test-results/
# Build artifacts
**/dist/
**/build/
!autogpt_platform/frontend/src/**/build/
**/target/
# Logs and temp files
**/*.log
**/*.tmp

View File

@@ -1,8 +1,12 @@
<!-- Clearly explain the need for these changes: -->
### Why / What / How
<!-- Why: Why does this PR exist? What problem does it solve, or what's broken/missing without it? -->
<!-- What: What does this PR change? Summarize the changes at a high level. -->
<!-- How: How does it work? Describe the approach, key implementation details, or architecture decisions. -->
### Changes 🏗️
<!-- Concisely describe all of the changes made in this pull request: -->
<!-- List the key changes. Keep it higher level than the diff but specific enough to highlight what's new/modified. -->
### Checklist 📋

View File

@@ -160,7 +160,7 @@ pnpm storybook # Start component development server
**Backend Entry Points:**
- `backend/backend/server/server.py` - FastAPI application setup
- `backend/backend/api/rest_api.py` - FastAPI application setup
- `backend/backend/data/` - Database models and user management
- `backend/blocks/` - Agent execution blocks and logic
@@ -219,7 +219,7 @@ Agents are built using a visual block-based system where each block performs a s
### API Development
1. Update routes in `/backend/backend/server/routers/`
1. Update routes in `/backend/backend/api/features/`
2. Add/update Pydantic models in same directory
3. Write tests alongside route files
4. For `data/*.py` changes, validate user ID checks
@@ -285,7 +285,7 @@ Agents are built using a visual block-based system where each block performs a s
### Security Guidelines
**Cache Protection Middleware** (`/backend/backend/server/middleware/security.py`):
**Cache Protection Middleware** (`/backend/backend/api/middleware/security.py`):
- Default: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
- Uses allow list approach for cacheable paths (static assets, health checks, public pages)

1229
.github/scripts/detect_overlaps.py vendored Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -49,7 +49,7 @@ jobs:
- name: Create PR ${{ env.BUILD_BRANCH }} -> ${{ github.ref_name }}
if: github.event_name == 'push'
uses: peter-evans/create-pull-request@v7
uses: peter-evans/create-pull-request@v8
with:
add-paths: classic/frontend/build/web
base: ${{ github.ref_name }}

View File

@@ -22,7 +22,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
ref: ${{ github.event.workflow_run.head_branch }}
fetch-depth: 0
@@ -40,9 +40,51 @@ jobs:
git checkout -b "$BRANCH_NAME"
echo "branch_name=$BRANCH_NAME" >> $GITHUB_OUTPUT
# Backend Python/Poetry setup (so Claude can run linting/tests)
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Set up Python dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
- name: Install Poetry
run: |
cd autogpt_platform/backend
HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Install Python dependencies
working-directory: autogpt_platform/backend
run: poetry install
- name: Generate Prisma Client
working-directory: autogpt_platform/backend
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (so Claude can run linting/tests)
- name: Enable corepack
run: corepack enable
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install JavaScript dependencies
working-directory: autogpt_platform/frontend
run: pnpm install --frozen-lockfile
- name: Get CI failure details
id: failure_details
uses: actions/github-script@v7
uses: actions/github-script@v8
with:
script: |
const run = await github.rest.actions.getWorkflowRun({
@@ -93,5 +135,5 @@ jobs:
Error logs:
${{ toJSON(fromJSON(steps.failure_details.outputs.result).errorLogs) }}
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
claude_args: "--allowedTools 'Edit,MultiEdit,Write,Read,Glob,Grep,LS,Bash(git:*),Bash(bun:*),Bash(npm:*),Bash(npx:*),Bash(gh:*)'"

View File

@@ -7,7 +7,7 @@
# - Provide actionable recommendations for the development team
#
# Triggered on: Dependabot PRs (opened, synchronize)
# Requirements: ANTHROPIC_API_KEY secret must be configured
# Requirements: CLAUDE_CODE_OAUTH_TOKEN secret must be configured
name: Claude Dependabot PR Review
@@ -30,7 +30,7 @@ jobs:
actions: read # Required for CI access
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
fetch-depth: 1
@@ -41,7 +41,7 @@ jobs:
python-version: "3.11" # Use standard version matching CI
- name: Set up Python dependency cache
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
@@ -74,30 +74,18 @@ jobs:
- name: Generate Prisma Client
working-directory: autogpt_platform/backend
run: poetry run prisma generate
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "22"
- name: Enable corepack
run: corepack enable
- name: Set pnpm store directory
run: |
pnpm config set store-dir ~/.pnpm-store
echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
- name: Cache frontend dependencies
uses: actions/cache@v4
- name: Set up Node.js
uses: actions/setup-node@v6
with:
path: ~/.pnpm-store
key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
node-version: "22"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install JavaScript dependencies
working-directory: autogpt_platform/frontend
@@ -124,7 +112,7 @@ jobs:
# Phase 1: Cache and load Docker images for faster setup
- name: Set up Docker image cache
id: docker-cache
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: ~/docker-cache
# Use a versioned key for cache invalidation when image list changes
@@ -308,7 +296,8 @@ jobs:
id: claude_review
uses: anthropics/claude-code-action@v1
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
allowed_bots: "dependabot[bot]"
claude_args: |
--allowedTools "Bash(npm:*),Bash(pnpm:*),Bash(poetry:*),Bash(git:*),Edit,Replace,NotebookEditCell,mcp__github_inline_comment__create_inline_comment,Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*)"
prompt: |

View File

@@ -40,7 +40,7 @@ jobs:
actions: read # Required for CI access
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
fetch-depth: 1
@@ -57,7 +57,7 @@ jobs:
python-version: "3.11" # Use standard version matching CI
- name: Set up Python dependency cache
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
@@ -90,30 +90,18 @@ jobs:
- name: Generate Prisma Client
working-directory: autogpt_platform/backend
run: poetry run prisma generate
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "22"
- name: Enable corepack
run: corepack enable
- name: Set pnpm store directory
run: |
pnpm config set store-dir ~/.pnpm-store
echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
- name: Cache frontend dependencies
uses: actions/cache@v4
- name: Set up Node.js
uses: actions/setup-node@v6
with:
path: ~/.pnpm-store
key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
node-version: "22"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install JavaScript dependencies
working-directory: autogpt_platform/frontend
@@ -140,7 +128,7 @@ jobs:
# Phase 1: Cache and load Docker images for faster setup
- name: Set up Docker image cache
id: docker-cache
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: ~/docker-cache
# Use a versioned key for cache invalidation when image list changes
@@ -323,7 +311,7 @@ jobs:
id: claude
uses: anthropics/claude-code-action@v1
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
claude_args: |
--allowedTools "Bash(npm:*),Bash(pnpm:*),Bash(poetry:*),Bash(git:*),Edit,Replace,NotebookEditCell,mcp__github_inline_comment__create_inline_comment,Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*), Bash(gh pr edit:*)"
--model opus

View File

@@ -58,11 +58,11 @@ jobs:
# your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v6
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@v4
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
@@ -93,6 +93,6 @@ jobs:
exit 1
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@v4
with:
category: "/language:${{matrix.language}}"

View File

@@ -27,7 +27,7 @@ jobs:
# If you do not check out your code, Copilot will do this for you.
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
fetch-depth: 0
submodules: true
@@ -39,7 +39,7 @@ jobs:
python-version: "3.11" # Use standard version matching CI
- name: Set up Python dependency cache
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
@@ -72,11 +72,11 @@ jobs:
- name: Generate Prisma Client
working-directory: autogpt_platform/backend
run: poetry run prisma generate
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
- name: Set up Node.js
uses: actions/setup-node@v4
uses: actions/setup-node@v6
with:
node-version: "22"
@@ -89,7 +89,7 @@ jobs:
echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
- name: Cache frontend dependencies
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
@@ -108,6 +108,16 @@ jobs:
# run: pnpm playwright install --with-deps chromium
# Docker setup for development environment
- name: Free up disk space
run: |
# Remove large unused tools to free disk space for Docker builds
sudo rm -rf /usr/share/dotnet
sudo rm -rf /usr/local/lib/android
sudo rm -rf /opt/ghc
sudo rm -rf /opt/hostedtoolcache/CodeQL
sudo docker system prune -af
df -h
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
@@ -122,7 +132,7 @@ jobs:
# Phase 1: Cache and load Docker images for faster setup
- name: Set up Docker image cache
id: docker-cache
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: ~/docker-cache
# Use a versioned key for cache invalidation when image list changes

78
.github/workflows/docs-block-sync.yml vendored Normal file
View File

@@ -0,0 +1,78 @@
name: Block Documentation Sync Check
on:
push:
branches: [master, dev]
paths:
- "autogpt_platform/backend/backend/blocks/**"
- "docs/integrations/**"
- "autogpt_platform/backend/scripts/generate_block_docs.py"
- ".github/workflows/docs-block-sync.yml"
pull_request:
branches: [master, dev]
paths:
- "autogpt_platform/backend/backend/blocks/**"
- "docs/integrations/**"
- "autogpt_platform/backend/scripts/generate_block_docs.py"
- ".github/workflows/docs-block-sync.yml"
jobs:
check-docs-sync:
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- name: Checkout code
uses: actions/checkout@v6
with:
fetch-depth: 1
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Set up Python dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
restore-keys: |
poetry-${{ runner.os }}-
- name: Install Poetry
run: |
cd autogpt_platform/backend
HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
echo "Found Poetry version ${HEAD_POETRY_VERSION} in backend/poetry.lock"
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Install dependencies
working-directory: autogpt_platform/backend
run: |
poetry install --only main
poetry run prisma generate
- name: Check block documentation is in sync
working-directory: autogpt_platform/backend
run: |
echo "Checking if block documentation is in sync with code..."
poetry run python scripts/generate_block_docs.py --check
- name: Show diff if out of sync
if: failure()
working-directory: autogpt_platform/backend
run: |
echo "::error::Block documentation is out of sync with code!"
echo ""
echo "To fix this, run the following command locally:"
echo " cd autogpt_platform/backend && poetry run python scripts/generate_block_docs.py"
echo ""
echo "Then commit the updated documentation files."
echo ""
echo "Regenerating docs to show diff..."
poetry run python scripts/generate_block_docs.py
echo ""
echo "Changes detected:"
git diff ../../docs/integrations/ || true

129
.github/workflows/docs-claude-review.yml vendored Normal file
View File

@@ -0,0 +1,129 @@
name: Claude Block Docs Review
on:
pull_request:
types: [opened, synchronize]
paths:
- "docs/integrations/**"
- "autogpt_platform/backend/backend/blocks/**"
concurrency:
group: claude-docs-review-${{ github.event.pull_request.number }}
cancel-in-progress: true
jobs:
claude-review:
# Only run for PRs from members/collaborators
if: |
github.event.pull_request.author_association == 'OWNER' ||
github.event.pull_request.author_association == 'MEMBER' ||
github.event.pull_request.author_association == 'COLLABORATOR'
runs-on: ubuntu-latest
timeout-minutes: 15
permissions:
contents: read
pull-requests: write
id-token: write
steps:
- name: Checkout code
uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Set up Python dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
restore-keys: |
poetry-${{ runner.os }}-
- name: Install Poetry
run: |
cd autogpt_platform/backend
HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Install dependencies
working-directory: autogpt_platform/backend
run: |
poetry install --only main
poetry run prisma generate
- name: Run Claude Code Review
uses: anthropics/claude-code-action@v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
claude_args: |
--allowedTools "Read,Glob,Grep,Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*)"
prompt: |
You are reviewing a PR that modifies block documentation or block code for AutoGPT.
## Your Task
Review the changes in this PR and provide constructive feedback. Focus on:
1. **Documentation Accuracy**: For any block code changes, verify that:
- Input/output tables in docs match the actual block schemas
- Description text accurately reflects what the block does
- Any new blocks have corresponding documentation
2. **Manual Content Quality**: Check manual sections (marked with `<!-- MANUAL: -->` markers):
- "How it works" sections should have clear technical explanations
- "Possible use case" sections should have practical, real-world examples
- Content should be helpful for users trying to understand the blocks
3. **Template Compliance**: Ensure docs follow the standard template:
- What it is (brief intro)
- What it does (description)
- How it works (technical explanation)
- Inputs table
- Outputs table
- Possible use case
4. **Cross-references**: Check that links and anchors are correct
## Review Process
1. First, get the PR diff to see what changed: `gh pr diff ${{ github.event.pull_request.number }}`
2. Read any modified block files to understand the implementation
3. Read corresponding documentation files to verify accuracy
4. Provide your feedback as a PR comment
## IMPORTANT: Comment Marker
Start your PR comment with exactly this HTML comment marker on its own line:
<!-- CLAUDE_DOCS_REVIEW -->
This marker is used to identify and replace your comment on subsequent runs.
Be constructive and specific. If everything looks good, say so!
If there are issues, explain what's wrong and suggest how to fix it.
- name: Delete old Claude review comments
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
# Get all comment IDs with our marker, sorted by creation date (oldest first)
COMMENT_IDS=$(gh api \
repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/comments \
--jq '[.[] | select(.body | contains("<!-- CLAUDE_DOCS_REVIEW -->"))] | sort_by(.created_at) | .[].id')
# Count comments
COMMENT_COUNT=$(echo "$COMMENT_IDS" | grep -c . || true)
if [ "$COMMENT_COUNT" -gt 1 ]; then
# Delete all but the last (newest) comment
echo "$COMMENT_IDS" | head -n -1 | while read -r COMMENT_ID; do
if [ -n "$COMMENT_ID" ]; then
echo "Deleting old review comment: $COMMENT_ID"
gh api -X DELETE repos/${{ github.repository }}/issues/comments/$COMMENT_ID
fi
done
else
echo "No old review comments to clean up"
fi

194
.github/workflows/docs-enhance.yml vendored Normal file
View File

@@ -0,0 +1,194 @@
name: Enhance Block Documentation
on:
workflow_dispatch:
inputs:
block_pattern:
description: 'Block file pattern to enhance (e.g., "google/*.md" or "*" for all blocks)'
required: true
default: '*'
type: string
dry_run:
description: 'Dry run mode - show proposed changes without committing'
type: boolean
default: true
max_blocks:
description: 'Maximum number of blocks to process (0 for unlimited)'
type: number
default: 10
jobs:
enhance-docs:
runs-on: ubuntu-latest
timeout-minutes: 45
permissions:
contents: write
pull-requests: write
id-token: write
steps:
- name: Checkout code
uses: actions/checkout@v6
with:
fetch-depth: 1
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Set up Python dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
restore-keys: |
poetry-${{ runner.os }}-
- name: Install Poetry
run: |
cd autogpt_platform/backend
HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Install dependencies
working-directory: autogpt_platform/backend
run: |
poetry install --only main
poetry run prisma generate
- name: Run Claude Enhancement
uses: anthropics/claude-code-action@v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
claude_args: |
--allowedTools "Read,Edit,Glob,Grep,Write,Bash(git:*),Bash(gh:*),Bash(find:*),Bash(ls:*)"
prompt: |
You are enhancing block documentation for AutoGPT. Your task is to improve the MANUAL sections
of block documentation files by reading the actual block implementations and writing helpful content.
## Configuration
- Block pattern: ${{ inputs.block_pattern }}
- Dry run: ${{ inputs.dry_run }}
- Max blocks to process: ${{ inputs.max_blocks }}
## Your Task
1. **Find Documentation Files**
Find block documentation files matching the pattern in `docs/integrations/`
Pattern: ${{ inputs.block_pattern }}
Use: `find docs/integrations -name "*.md" -type f`
2. **For Each Documentation File** (up to ${{ inputs.max_blocks }} files):
a. Read the documentation file
b. Identify which block(s) it documents (look for the block class name)
c. Find and read the corresponding block implementation in `autogpt_platform/backend/backend/blocks/`
d. Improve the MANUAL sections:
**"How it works" section** (within `<!-- MANUAL: how_it_works -->` markers):
- Explain the technical flow of the block
- Describe what APIs or services it connects to
- Note any important configuration or prerequisites
- Keep it concise but informative (2-4 paragraphs)
**"Possible use case" section** (within `<!-- MANUAL: use_case -->` markers):
- Provide 2-3 practical, real-world examples
- Make them specific and actionable
- Show how this block could be used in an automation workflow
3. **Important Rules**
- ONLY modify content within `<!-- MANUAL: -->` and `<!-- END MANUAL -->` markers
- Do NOT modify auto-generated sections (inputs/outputs tables, descriptions)
- Keep content accurate based on the actual block implementation
- Write for users who may not be technical experts
4. **Output**
${{ inputs.dry_run == true && 'DRY RUN MODE: Show proposed changes for each file but do NOT actually edit the files. Describe what you would change.' || 'LIVE MODE: Actually edit the files to improve the documentation.' }}
## Example Improvements
**Before (How it works):**
```
_Add technical explanation here._
```
**After (How it works):**
```
This block connects to the GitHub API to retrieve issue information. When executed,
it authenticates using your GitHub credentials and fetches issue details including
title, body, labels, and assignees.
The block requires a valid GitHub OAuth connection with repository access permissions.
It supports both public and private repositories you have access to.
```
**Before (Possible use case):**
```
_Add practical use case examples here._
```
**After (Possible use case):**
```
**Customer Support Automation**: Monitor a GitHub repository for new issues with
the "bug" label, then automatically create a ticket in your support system and
notify the on-call engineer via Slack.
**Release Notes Generation**: When a new release is published, gather all closed
issues since the last release and generate a summary for your changelog.
```
Begin by finding and listing the documentation files to process.
- name: Create PR with enhanced documentation
if: ${{ inputs.dry_run == false }}
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
# Check if there are changes
if git diff --quiet docs/integrations/; then
echo "No changes to commit"
exit 0
fi
# Configure git
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
# Create branch and commit
BRANCH_NAME="docs/enhance-blocks-$(date +%Y%m%d-%H%M%S)"
git checkout -b "$BRANCH_NAME"
git add docs/integrations/
git commit -m "docs: enhance block documentation with LLM-generated content
Pattern: ${{ inputs.block_pattern }}
Max blocks: ${{ inputs.max_blocks }}
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>"
# Push and create PR
git push -u origin "$BRANCH_NAME"
gh pr create \
--title "docs: LLM-enhanced block documentation" \
--body "## Summary
This PR contains LLM-enhanced documentation for block files matching pattern: \`${{ inputs.block_pattern }}\`
The following manual sections were improved:
- **How it works**: Technical explanations based on block implementations
- **Possible use case**: Practical, real-world examples
## Review Checklist
- [ ] Content is accurate based on block implementations
- [ ] Examples are practical and helpful
- [ ] No auto-generated sections were modified
---
🤖 Generated with [Claude Code](https://claude.com/claude-code)" \
--base dev

View File

@@ -25,7 +25,7 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
ref: ${{ github.event.inputs.git_ref || github.ref_name }}
@@ -52,7 +52,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Trigger deploy workflow
uses: peter-evans/repository-dispatch@v3
uses: peter-evans/repository-dispatch@v4
with:
token: ${{ secrets.DEPLOY_TOKEN }}
repository: Significant-Gravitas/AutoGPT_cloud_infrastructure

View File

@@ -17,7 +17,7 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
ref: ${{ github.ref_name || 'master' }}
@@ -45,7 +45,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Trigger deploy workflow
uses: peter-evans/repository-dispatch@v3
uses: peter-evans/repository-dispatch@v4
with:
token: ${{ secrets.DEPLOY_TOKEN }}
repository: Significant-Gravitas/AutoGPT_cloud_infrastructure

View File

@@ -5,12 +5,14 @@ on:
branches: [master, dev, ci-test*]
paths:
- ".github/workflows/platform-backend-ci.yml"
- ".github/workflows/scripts/get_package_version_from_lockfile.py"
- "autogpt_platform/backend/**"
- "autogpt_platform/autogpt_libs/**"
pull_request:
branches: [master, dev, release-*]
paths:
- ".github/workflows/platform-backend-ci.yml"
- ".github/workflows/scripts/get_package_version_from_lockfile.py"
- "autogpt_platform/backend/**"
- "autogpt_platform/autogpt_libs/**"
merge_group:
@@ -25,10 +27,91 @@ defaults:
working-directory: autogpt_platform/backend
jobs:
lint:
permissions:
contents: read
timeout-minutes: 10
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v6
- name: Set up Python 3.12
uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: Set up Python dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-py3.12-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
- name: Install Poetry
run: |
HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
echo "Using Poetry version ${HEAD_POETRY_VERSION}"
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
- name: Install Python dependencies
run: poetry install
- name: Run Linters
run: poetry run lint --skip-pyright
env:
CI: true
PLAIN_OUTPUT: True
type-check:
permissions:
contents: read
timeout-minutes: 10
strategy:
fail-fast: false
matrix:
python-version: ["3.11", "3.12", "3.13"]
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v6
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Set up Python dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-py${{ matrix.python-version }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
- name: Install Poetry
run: |
HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
echo "Using Poetry version ${HEAD_POETRY_VERSION}"
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
- name: Install Python dependencies
run: poetry install
- name: Generate Prisma Client
run: poetry run prisma generate && poetry run gen-prisma-stub
- name: Run Pyright
run: poetry run pyright --pythonversion ${{ matrix.python-version }}
env:
CI: true
PLAIN_OUTPUT: True
test:
permissions:
contents: read
timeout-minutes: 30
timeout-minutes: 15
strategy:
fail-fast: false
matrix:
@@ -41,13 +124,18 @@ jobs:
ports:
- 6379:6379
rabbitmq:
image: rabbitmq:3.12-management
image: rabbitmq:4.1.4
ports:
- 5672:5672
- 15672:15672
env:
RABBITMQ_DEFAULT_USER: ${{ env.RABBITMQ_DEFAULT_USER }}
RABBITMQ_DEFAULT_PASS: ${{ env.RABBITMQ_DEFAULT_PASS }}
options: >-
--health-cmd "rabbitmq-diagnostics -q ping"
--health-interval 30s
--health-timeout 10s
--health-retries 5
--health-start-period 10s
clamav:
image: clamav/clamav-debian:latest
ports:
@@ -68,7 +156,7 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
fetch-depth: 0
submodules: true
@@ -88,12 +176,12 @@ jobs:
run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT
- name: Set up Python dependency cache
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
key: poetry-${{ runner.os }}-py${{ matrix.python-version }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
- name: Install Poetry (Unix)
- name: Install Poetry
run: |
# Extract Poetry version from backend/poetry.lock
HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
@@ -134,7 +222,7 @@ jobs:
run: poetry install
- name: Generate Prisma Client
run: poetry run prisma generate
run: poetry run prisma generate && poetry run gen-prisma-stub
- id: supabase
name: Start Supabase
@@ -151,22 +239,22 @@ jobs:
echo "Waiting for ClamAV daemon to start..."
max_attempts=60
attempt=0
until nc -z localhost 3310 || [ $attempt -eq $max_attempts ]; do
echo "ClamAV is unavailable - sleeping (attempt $((attempt+1))/$max_attempts)"
sleep 5
attempt=$((attempt+1))
done
if [ $attempt -eq $max_attempts ]; then
echo "ClamAV failed to start after $((max_attempts*5)) seconds"
echo "Checking ClamAV service logs..."
docker logs $(docker ps -q --filter "ancestor=clamav/clamav-debian:latest") 2>&1 | tail -50 || echo "No ClamAV container found"
exit 1
fi
echo "ClamAV is ready!"
# Verify ClamAV is responsive
echo "Testing ClamAV connection..."
timeout 10 bash -c 'echo "PING" | nc localhost 3310' || {
@@ -176,23 +264,18 @@ jobs:
}
- name: Run Database Migrations
run: poetry run prisma migrate dev --name updates
run: poetry run prisma migrate deploy
env:
DATABASE_URL: ${{ steps.supabase.outputs.DB_URL }}
DIRECT_URL: ${{ steps.supabase.outputs.DB_URL }}
- id: lint
name: Run Linter
run: poetry run lint
- name: Run pytest with coverage
- name: Run pytest
run: |
if [[ "${{ runner.debug }}" == "1" ]]; then
poetry run pytest -s -vv -o log_cli=true -o log_cli_level=DEBUG
else
poetry run pytest -s -vv
fi
if: success() || (failure() && steps.lint.outcome == 'failure')
env:
LOG_LEVEL: ${{ runner.debug && 'DEBUG' || 'INFO' }}
DATABASE_URL: ${{ steps.supabase.outputs.DB_URL }}
@@ -204,6 +287,12 @@ jobs:
REDIS_PORT: "6379"
ENCRYPTION_KEY: "dvziYgz0KSK8FENhju0ZYi8-fRTfAdlz6YLhdB_jhNw=" # DO NOT USE IN PRODUCTION!!
# - name: Upload coverage reports to Codecov
# uses: codecov/codecov-action@v4
# with:
# token: ${{ secrets.CODECOV_TOKEN }}
# flags: backend,${{ runner.os }}
env:
CI: true
PLAIN_OUTPUT: True
@@ -217,9 +306,3 @@ jobs:
# the backend service, docker composes, and examples
RABBITMQ_DEFAULT_USER: "rabbitmq_user_default"
RABBITMQ_DEFAULT_PASS: "k0VMxyIJF9S35f3x2uaw5IWAl6Y536O7"
# - name: Upload coverage reports to Codecov
# uses: codecov/codecov-action@v4
# with:
# token: ${{ secrets.CODECOV_TOKEN }}
# flags: backend,${{ runner.os }}

View File

@@ -17,7 +17,7 @@ jobs:
- name: Check comment permissions and deployment status
id: check_status
if: github.event_name == 'issue_comment' && github.event.issue.pull_request
uses: actions/github-script@v7
uses: actions/github-script@v8
with:
script: |
const commentBody = context.payload.comment.body.trim();
@@ -55,7 +55,7 @@ jobs:
- name: Post permission denied comment
if: steps.check_status.outputs.permission_denied == 'true'
uses: actions/github-script@v7
uses: actions/github-script@v8
with:
script: |
await github.rest.issues.createComment({
@@ -68,7 +68,7 @@ jobs:
- name: Get PR details for deployment
id: pr_details
if: steps.check_status.outputs.should_deploy == 'true' || steps.check_status.outputs.should_undeploy == 'true'
uses: actions/github-script@v7
uses: actions/github-script@v8
with:
script: |
const pr = await github.rest.pulls.get({
@@ -82,7 +82,7 @@ jobs:
- name: Dispatch Deploy Event
if: steps.check_status.outputs.should_deploy == 'true'
uses: peter-evans/repository-dispatch@v3
uses: peter-evans/repository-dispatch@v4
with:
token: ${{ secrets.DISPATCH_TOKEN }}
repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
@@ -98,7 +98,7 @@ jobs:
- name: Post deploy success comment
if: steps.check_status.outputs.should_deploy == 'true'
uses: actions/github-script@v7
uses: actions/github-script@v8
with:
script: |
await github.rest.issues.createComment({
@@ -110,7 +110,7 @@ jobs:
- name: Dispatch Undeploy Event (from comment)
if: steps.check_status.outputs.should_undeploy == 'true'
uses: peter-evans/repository-dispatch@v3
uses: peter-evans/repository-dispatch@v4
with:
token: ${{ secrets.DISPATCH_TOKEN }}
repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
@@ -126,7 +126,7 @@ jobs:
- name: Post undeploy success comment
if: steps.check_status.outputs.should_undeploy == 'true'
uses: actions/github-script@v7
uses: actions/github-script@v8
with:
script: |
await github.rest.issues.createComment({
@@ -139,7 +139,7 @@ jobs:
- name: Check deployment status on PR close
id: check_pr_close
if: github.event_name == 'pull_request' && github.event.action == 'closed'
uses: actions/github-script@v7
uses: actions/github-script@v8
with:
script: |
const comments = await github.rest.issues.listComments({
@@ -168,7 +168,7 @@ jobs:
github.event_name == 'pull_request' &&
github.event.action == 'closed' &&
steps.check_pr_close.outputs.should_undeploy == 'true'
uses: peter-evans/repository-dispatch@v3
uses: peter-evans/repository-dispatch@v4
with:
token: ${{ secrets.DISPATCH_TOKEN }}
repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
@@ -187,7 +187,7 @@ jobs:
github.event_name == 'pull_request' &&
github.event.action == 'closed' &&
steps.check_pr_close.outputs.should_undeploy == 'true'
uses: actions/github-script@v7
uses: actions/github-script@v8
with:
script: |
await github.rest.issues.createComment({

View File

@@ -6,11 +6,18 @@ on:
paths:
- ".github/workflows/platform-frontend-ci.yml"
- "autogpt_platform/frontend/**"
- "autogpt_platform/backend/Dockerfile"
- "autogpt_platform/docker-compose.yml"
- "autogpt_platform/docker-compose.platform.yml"
pull_request:
paths:
- ".github/workflows/platform-frontend-ci.yml"
- "autogpt_platform/frontend/**"
- "autogpt_platform/backend/Dockerfile"
- "autogpt_platform/docker-compose.yml"
- "autogpt_platform/docker-compose.platform.yml"
merge_group:
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event_name == 'merge_group' && format('merge-queue-{0}', github.ref) || format('{0}-{1}', github.ref, github.event.pull_request.number || github.sha) }}
@@ -25,34 +32,31 @@ jobs:
setup:
runs-on: ubuntu-latest
outputs:
cache-key: ${{ steps.cache-key.outputs.key }}
components-changed: ${{ steps.filter.outputs.components }}
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v6
- name: Set up Node.js
uses: actions/setup-node@v4
- name: Check for component changes
uses: dorny/paths-filter@v3
id: filter
with:
node-version: "22.18.0"
filters: |
components:
- 'autogpt_platform/frontend/src/components/**'
- name: Enable corepack
run: corepack enable
- name: Generate cache key
id: cache-key
run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
- name: Cache dependencies
uses: actions/cache@v4
- name: Set up Node
uses: actions/setup-node@v6
with:
path: ~/.pnpm-store
key: ${{ steps.cache-key.outputs.key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install dependencies
- name: Install dependencies to populate cache
run: pnpm install --frozen-lockfile
lint:
@@ -61,24 +65,17 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "22.18.0"
uses: actions/checkout@v6
- name: Enable corepack
run: corepack enable
- name: Restore dependencies cache
uses: actions/cache@v4
- name: Set up Node
uses: actions/setup-node@v6
with:
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install dependencies
run: pnpm install --frozen-lockfile
@@ -89,31 +86,27 @@ jobs:
chromatic:
runs-on: ubuntu-latest
needs: setup
# Only run on dev branch pushes or PRs targeting dev
if: github.ref == 'refs/heads/dev' || github.base_ref == 'dev'
# Disabled: to re-enable, remove 'false &&' from the condition below
if: >-
false
&& (github.ref == 'refs/heads/dev' || github.base_ref == 'dev')
&& needs.setup.outputs.components-changed == 'true'
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "22.18.0"
- name: Enable corepack
run: corepack enable
- name: Restore dependencies cache
uses: actions/cache@v4
- name: Set up Node
uses: actions/setup-node@v6
with:
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install dependencies
run: pnpm install --frozen-lockfile
@@ -127,113 +120,31 @@ jobs:
token: ${{ secrets.GITHUB_TOKEN }}
exitOnceUploaded: true
test:
runs-on: big-boi
integration_test:
runs-on: ubuntu-latest
needs: setup
strategy:
fail-fast: false
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
submodules: recursive
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "22.18.0"
- name: Enable corepack
run: corepack enable
- name: Copy default supabase .env
run: |
cp ../.env.default ../.env
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Cache Docker layers
uses: actions/cache@v4
- name: Set up Node
uses: actions/setup-node@v6
with:
path: /tmp/.buildx-cache
key: ${{ runner.os }}-buildx-frontend-test-${{ hashFiles('autogpt_platform/docker-compose.yml', 'autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/pyproject.toml', 'autogpt_platform/backend/poetry.lock') }}
restore-keys: |
${{ runner.os }}-buildx-frontend-test-
- name: Run docker compose
run: |
NEXT_PUBLIC_PW_TEST=true docker compose -f ../docker-compose.yml up -d
env:
DOCKER_BUILDKIT: 1
BUILDX_CACHE_FROM: type=local,src=/tmp/.buildx-cache
BUILDX_CACHE_TO: type=local,dest=/tmp/.buildx-cache-new,mode=max
- name: Move cache
run: |
rm -rf /tmp/.buildx-cache
if [ -d "/tmp/.buildx-cache-new" ]; then
mv /tmp/.buildx-cache-new /tmp/.buildx-cache
fi
- name: Wait for services to be ready
run: |
echo "Waiting for rest_server to be ready..."
timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
echo "Waiting for database to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
- name: Create E2E test data
run: |
echo "Creating E2E test data..."
# First try to run the script from inside the container
if docker compose -f ../docker-compose.yml exec -T rest_server test -f /app/autogpt_platform/backend/test/e2e_test_data.py; then
echo "✅ Found e2e_test_data.py in container, running it..."
docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python backend/test/e2e_test_data.py" || {
echo "❌ E2E test data creation failed!"
docker compose -f ../docker-compose.yml logs --tail=50 rest_server
exit 1
}
else
echo "⚠️ e2e_test_data.py not found in container, copying and running..."
# Copy the script into the container and run it
docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.yml ps -q rest_server):/tmp/e2e_test_data.py || {
echo "❌ Failed to copy script to container"
exit 1
}
docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
echo "❌ E2E test data creation failed!"
docker compose -f ../docker-compose.yml logs --tail=50 rest_server
exit 1
}
fi
- name: Restore dependencies cache
uses: actions/cache@v4
with:
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Install Browser 'chromium'
run: pnpm playwright install --with-deps chromium
- name: Generate API client
run: pnpm generate:api
- name: Run Playwright tests
run: pnpm test:no-build
- name: Upload Playwright artifacts
if: failure()
uses: actions/upload-artifact@v4
with:
name: playwright-report
path: playwright-report
- name: Print Final Docker Compose logs
if: always()
run: docker compose -f ../docker-compose.yml logs
- name: Run Integration Tests
run: pnpm test:unit

View File

@@ -1,14 +1,18 @@
name: AutoGPT Platform - Frontend CI
name: AutoGPT Platform - Full-stack CI
on:
push:
branches: [master, dev]
paths:
- ".github/workflows/platform-fullstack-ci.yml"
- ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
- ".github/workflows/scripts/get_package_version_from_lockfile.py"
- "autogpt_platform/**"
pull_request:
paths:
- ".github/workflows/platform-fullstack-ci.yml"
- ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
- ".github/workflows/scripts/get_package_version_from_lockfile.py"
- "autogpt_platform/**"
merge_group:
@@ -24,113 +28,285 @@ defaults:
jobs:
setup:
runs-on: ubuntu-latest
outputs:
cache-key: ${{ steps.cache-key.outputs.key }}
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "22.18.0"
uses: actions/checkout@v6
- name: Enable corepack
run: corepack enable
- name: Generate cache key
id: cache-key
run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
- name: Cache dependencies
uses: actions/cache@v4
- name: Set up Node
uses: actions/setup-node@v6
with:
path: ~/.pnpm-store
key: ${{ steps.cache-key.outputs.key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install dependencies
- name: Install dependencies to populate cache
run: pnpm install --frozen-lockfile
types:
check-api-types:
name: check API types
runs-on: ubuntu-latest
needs: setup
strategy:
fail-fast: false
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
submodules: recursive
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "22.18.0"
# ------------------------ Backend setup ------------------------
- name: Enable corepack
- name: Set up Backend - Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: Set up Backend - Install Poetry
working-directory: autogpt_platform/backend
run: |
POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
echo "Installing Poetry version ${POETRY_VERSION}"
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$POETRY_VERSION python3 -
- name: Set up Backend - Set up dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
- name: Set up Backend - Install dependencies
working-directory: autogpt_platform/backend
run: poetry install
- name: Set up Backend - Generate Prisma client
working-directory: autogpt_platform/backend
run: poetry run prisma generate && poetry run gen-prisma-stub
- name: Set up Frontend - Export OpenAPI schema from Backend
working-directory: autogpt_platform/backend
run: poetry run export-api-schema --output ../frontend/src/app/api/openapi.json
# ------------------------ Frontend setup ------------------------
- name: Set up Frontend - Enable corepack
run: corepack enable
- name: Copy default supabase .env
run: |
cp ../.env.default ../.env
- name: Copy backend .env
run: |
cp ../backend/.env.default ../backend/.env
- name: Run docker compose
run: |
docker compose -f ../docker-compose.yml --profile local --profile deps_backend up -d
- name: Restore dependencies cache
uses: actions/cache@v4
- name: Set up Frontend - Set up Node
uses: actions/setup-node@v6
with:
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install dependencies
- name: Set up Frontend - Install dependencies
run: pnpm install --frozen-lockfile
- name: Setup .env
run: cp .env.default .env
- name: Wait for services to be ready
run: |
echo "Waiting for rest_server to be ready..."
timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
echo "Waiting for database to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
- name: Generate API queries
run: pnpm generate:api:force
- name: Set up Frontend - Format OpenAPI schema
id: format-schema
run: pnpm prettier --write ./src/app/api/openapi.json
- name: Check for API schema changes
run: |
if ! git diff --exit-code src/app/api/openapi.json; then
echo "❌ API schema changes detected in src/app/api/openapi.json"
echo ""
echo "The openapi.json file has been modified after running 'pnpm generate:api-all'."
echo "The openapi.json file has been modified after exporting the API schema."
echo "This usually means changes have been made in the BE endpoints without updating the Frontend."
echo "The API schema is now out of sync with the Front-end queries."
echo ""
echo "To fix this:"
echo "1. Pull the backend 'docker compose pull && docker compose up -d --build --force-recreate'"
echo "2. Run 'pnpm generate:api' locally"
echo "3. Run 'pnpm types' locally"
echo "4. Fix any TypeScript errors that may have been introduced"
echo "5. Commit and push your changes"
echo "\nIn the backend directory:"
echo "1. Run 'poetry run export-api-schema --output ../frontend/src/app/api/openapi.json'"
echo "\nIn the frontend directory:"
echo "2. Run 'pnpm prettier --write src/app/api/openapi.json'"
echo "3. Run 'pnpm generate:api'"
echo "4. Run 'pnpm types'"
echo "5. Fix any TypeScript errors that may have been introduced"
echo "6. Commit and push your changes"
echo ""
exit 1
else
echo "✅ No API schema changes detected"
fi
- name: Run Typescript checks
- name: Set up Frontend - Generate API client
id: generate-api-client
run: pnpm orval --config ./orval.config.ts
# Continue with type generation & check even if there are schema changes
if: success() || (steps.format-schema.outcome == 'success')
- name: Check for TypeScript errors
run: pnpm types
if: success() || (steps.generate-api-client.outcome == 'success')
e2e_test:
name: end-to-end tests
runs-on: big-boi
steps:
- name: Checkout repository
uses: actions/checkout@v6
with:
submodules: recursive
- name: Set up Platform - Copy default supabase .env
run: |
cp ../.env.default ../.env
- name: Set up Platform - Copy backend .env and set OpenAI API key
run: |
cp ../backend/.env.default ../backend/.env
echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
env:
# Used by E2E test data script to generate embeddings for approved store agents
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
- name: Set up Platform - Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
driver: docker-container
driver-opts: network=host
- name: Set up Platform - Expose GHA cache to docker buildx CLI
uses: crazy-max/ghaction-github-runtime@v4
- name: Set up Platform - Build Docker images (with cache)
working-directory: autogpt_platform
run: |
pip install pyyaml
# Resolve extends and generate a flat compose file that bake can understand
docker compose -f docker-compose.yml config > docker-compose.resolved.yml
# Add cache configuration to the resolved compose file
python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
--source docker-compose.resolved.yml \
--cache-from "type=gha" \
--cache-to "type=gha,mode=max" \
--backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend/**') }}" \
--frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src/**') }}" \
--git-ref "${{ github.ref }}"
# Build with bake using the resolved compose file (now includes cache config)
docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Cache E2E test data
id: e2e-data-cache
uses: actions/cache@v5
with:
path: /tmp/e2e_test_data.sql
key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-fullstack-ci.yml') }}
- name: Set up Platform - Start Supabase DB + Auth
run: |
docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
echo "Waiting for database to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
echo "Waiting for auth service to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
- name: Set up Platform - Run migrations
run: |
echo "Running migrations..."
docker compose -f ../docker-compose.resolved.yml run --rm migrate
echo "✅ Migrations completed"
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Load cached E2E test data
if: steps.e2e-data-cache.outputs.cache-hit == 'true'
run: |
echo "✅ Found cached E2E test data, restoring..."
{
echo "SET session_replication_role = 'replica';"
cat /tmp/e2e_test_data.sql
echo "SET session_replication_role = 'origin';"
} | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
# Refresh materialized views after restore
docker compose -f ../docker-compose.resolved.yml exec -T db \
psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
echo "✅ E2E test data restored from cache"
- name: Set up Platform - Start (all other services)
run: |
docker compose -f ../docker-compose.resolved.yml up -d --no-build
echo "Waiting for rest_server to be ready..."
timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Create E2E test data
if: steps.e2e-data-cache.outputs.cache-hit != 'true'
run: |
echo "Creating E2E test data..."
docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
echo "❌ E2E test data creation failed!"
docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
exit 1
}
# Dump auth.users + platform schema for cache (two separate dumps)
echo "Dumping database for cache..."
{
docker compose -f ../docker-compose.resolved.yml exec -T db \
pg_dump -U postgres --data-only --column-inserts \
--table='auth.users' postgres
docker compose -f ../docker-compose.resolved.yml exec -T db \
pg_dump -U postgres --data-only --column-inserts \
--schema=platform \
--exclude-table='platform._prisma_migrations' \
--exclude-table='platform.apscheduler_jobs' \
--exclude-table='platform.apscheduler_jobs_batched_notifications' \
postgres
} > /tmp/e2e_test_data.sql
echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
- name: Set up tests - Enable corepack
run: corepack enable
- name: Set up tests - Set up Node
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Set up tests - Install dependencies
run: pnpm install --frozen-lockfile
- name: Set up tests - Install browser 'chromium'
run: pnpm playwright install --with-deps chromium
- name: Run Playwright tests
run: pnpm test:no-build
continue-on-error: false
- name: Upload Playwright report
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-report
path: autogpt_platform/frontend/playwright-report
if-no-files-found: ignore
retention-days: 3
- name: Upload Playwright test results
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-test-results
path: autogpt_platform/frontend/test-results
if-no-files-found: ignore
retention-days: 3
- name: Print Final Docker Compose logs
if: always()
run: docker compose -f ../docker-compose.resolved.yml logs

39
.github/workflows/pr-overlap-check.yml vendored Normal file
View File

@@ -0,0 +1,39 @@
name: PR Overlap Detection
on:
pull_request:
types: [opened, synchronize, reopened]
branches:
- dev
- master
permissions:
contents: read
pull-requests: write
jobs:
check-overlaps:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0 # Need full history for merge testing
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Configure git
run: |
git config user.email "github-actions[bot]@users.noreply.github.com"
git config user.name "github-actions[bot]"
- name: Run overlap detection
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Always succeed - this check informs contributors, it shouldn't block merging
continue-on-error: true
run: |
python .github/scripts/detect_overlaps.py ${{ github.event.pull_request.number }}

View File

@@ -11,7 +11,7 @@ jobs:
steps:
# - name: Wait some time for all actions to start
# run: sleep 30
- uses: actions/checkout@v4
- uses: actions/checkout@v6
# with:
# fetch-depth: 0
- name: Set up Python

View File

@@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
Add cache configuration to a resolved docker-compose file for all services
that have a build key, and ensure image names match what docker compose expects.
"""
import argparse
import yaml
DEFAULT_BRANCH = "dev"
CACHE_BUILDS_FOR_COMPONENTS = ["backend", "frontend"]
def main():
parser = argparse.ArgumentParser(
description="Add cache config to a resolved compose file"
)
parser.add_argument(
"--source",
required=True,
help="Source compose file to read (should be output of `docker compose config`)",
)
parser.add_argument(
"--cache-from",
default="type=gha",
help="Cache source configuration",
)
parser.add_argument(
"--cache-to",
default="type=gha,mode=max",
help="Cache destination configuration",
)
for component in CACHE_BUILDS_FOR_COMPONENTS:
parser.add_argument(
f"--{component}-hash",
default="",
help=f"Hash for {component} cache scope (e.g., from hashFiles())",
)
parser.add_argument(
"--git-ref",
default="",
help="Git ref for branch-based cache scope (e.g., refs/heads/master)",
)
args = parser.parse_args()
# Normalize git ref to a safe scope name (e.g., refs/heads/master -> master)
git_ref_scope = ""
if args.git_ref:
git_ref_scope = args.git_ref.replace("refs/heads/", "").replace("/", "-")
with open(args.source, "r") as f:
compose = yaml.safe_load(f)
# Get project name from compose file or default
project_name = compose.get("name", "autogpt_platform")
def get_image_name(dockerfile: str, target: str) -> str:
"""Generate image name based on Dockerfile folder and build target."""
dockerfile_parts = dockerfile.replace("\\", "/").split("/")
if len(dockerfile_parts) >= 2:
folder_name = dockerfile_parts[-2] # e.g., "backend" or "frontend"
else:
folder_name = "app"
return f"{project_name}-{folder_name}:{target}"
def get_build_key(dockerfile: str, target: str) -> str:
"""Generate a unique key for a Dockerfile+target combination."""
return f"{dockerfile}:{target}"
def get_component(dockerfile: str) -> str | None:
"""Get component name (frontend/backend) from dockerfile path."""
for component in CACHE_BUILDS_FOR_COMPONENTS:
if component in dockerfile:
return component
return None
# First pass: collect all services with build configs and identify duplicates
# Track which (dockerfile, target) combinations we've seen
build_key_to_first_service: dict[str, str] = {}
services_to_build: list[str] = []
services_to_dedupe: list[str] = []
for service_name, service_config in compose.get("services", {}).items():
if "build" not in service_config:
continue
build_config = service_config["build"]
dockerfile = build_config.get("dockerfile", "Dockerfile")
target = build_config.get("target", "default")
build_key = get_build_key(dockerfile, target)
if build_key not in build_key_to_first_service:
# First service with this build config - it will do the actual build
build_key_to_first_service[build_key] = service_name
services_to_build.append(service_name)
else:
# Duplicate - will just use the image from the first service
services_to_dedupe.append(service_name)
# Second pass: configure builds and deduplicate
modified_services = []
for service_name, service_config in compose.get("services", {}).items():
if "build" not in service_config:
continue
build_config = service_config["build"]
dockerfile = build_config.get("dockerfile", "Dockerfile")
target = build_config.get("target", "latest")
image_name = get_image_name(dockerfile, target)
# Set image name for all services (needed for both builders and deduped)
service_config["image"] = image_name
if service_name in services_to_dedupe:
# Remove build config - this service will use the pre-built image
del service_config["build"]
continue
# This service will do the actual build - add cache config
cache_from_list = []
cache_to_list = []
component = get_component(dockerfile)
if not component:
# Skip services that don't clearly match frontend/backend
continue
# Get the hash for this component
component_hash = getattr(args, f"{component}_hash")
# Scope format: platform-{component}-{target}-{hash|ref}
# Example: platform-backend-server-abc123
if "type=gha" in args.cache_from:
# 1. Primary: exact hash match (most specific)
if component_hash:
hash_scope = f"platform-{component}-{target}-{component_hash}"
cache_from_list.append(f"{args.cache_from},scope={hash_scope}")
# 2. Fallback: branch-based cache
if git_ref_scope:
ref_scope = f"platform-{component}-{target}-{git_ref_scope}"
cache_from_list.append(f"{args.cache_from},scope={ref_scope}")
# 3. Fallback: dev branch cache (for PRs/feature branches)
if git_ref_scope and git_ref_scope != DEFAULT_BRANCH:
master_scope = f"platform-{component}-{target}-{DEFAULT_BRANCH}"
cache_from_list.append(f"{args.cache_from},scope={master_scope}")
if "type=gha" in args.cache_to:
# Write to both hash-based and branch-based scopes
if component_hash:
hash_scope = f"platform-{component}-{target}-{component_hash}"
cache_to_list.append(f"{args.cache_to},scope={hash_scope}")
if git_ref_scope:
ref_scope = f"platform-{component}-{target}-{git_ref_scope}"
cache_to_list.append(f"{args.cache_to},scope={ref_scope}")
# Ensure we have at least one cache source/target
if not cache_from_list:
cache_from_list.append(args.cache_from)
if not cache_to_list:
cache_to_list.append(args.cache_to)
build_config["cache_from"] = cache_from_list
build_config["cache_to"] = cache_to_list
modified_services.append(service_name)
# Write back to the same file
with open(args.source, "w") as f:
yaml.dump(compose, f, default_flow_style=False, sort_keys=False)
print(f"Added cache config to {len(modified_services)} services in {args.source}:")
for svc in modified_services:
svc_config = compose["services"][svc]
build_cfg = svc_config.get("build", {})
cache_from_list = build_cfg.get("cache_from", ["none"])
cache_to_list = build_cfg.get("cache_to", ["none"])
print(f" - {svc}")
print(f" image: {svc_config.get('image', 'N/A')}")
print(f" cache_from: {cache_from_list}")
print(f" cache_to: {cache_to_list}")
if services_to_dedupe:
print(
f"Deduplicated {len(services_to_dedupe)} services (will use pre-built images):"
)
for svc in services_to_dedupe:
print(f" - {svc} -> {compose['services'][svc].get('image', 'N/A')}")
if __name__ == "__main__":
main()

4
.gitignore vendored
View File

@@ -178,4 +178,8 @@ autogpt_platform/backend/settings.py
*.ign.*
.test-contents
.claude/settings.local.json
CLAUDE.local.md
/autogpt_platform/backend/logs
.next
# Implementation plans (generated by AI agents)
plans/

1
.nvmrc Normal file
View File

@@ -0,0 +1 @@
22

View File

@@ -1,3 +1,10 @@
default_install_hook_types:
- pre-commit
- pre-push
- post-checkout
default_stages: [pre-commit]
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
@@ -17,6 +24,7 @@ repos:
name: Detect secrets
description: Detects high entropy strings that are likely to be passwords.
files: ^autogpt_platform/
exclude: pnpm-lock\.yaml$
stages: [pre-push]
- repo: local
@@ -26,49 +34,106 @@ repos:
- id: poetry-install
name: Check & Install dependencies - AutoGPT Platform - Backend
alias: poetry-install-platform-backend
entry: poetry -C autogpt_platform/backend install
# include autogpt_libs source (since it's a path dependency)
files: ^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$
types: [file]
entry: >
bash -c '
if [ -n "$PRE_COMMIT_FROM_REF" ]; then
git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
else
git diff --cached --name-only
fi | grep -qE "^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$" || exit 0;
poetry -C autogpt_platform/backend install
'
always_run: true
language: system
pass_filenames: false
stages: [pre-commit, post-checkout]
- id: poetry-install
name: Check & Install dependencies - AutoGPT Platform - Libs
alias: poetry-install-platform-libs
entry: poetry -C autogpt_platform/autogpt_libs install
files: ^autogpt_platform/autogpt_libs/poetry\.lock$
types: [file]
entry: >
bash -c '
if [ -n "$PRE_COMMIT_FROM_REF" ]; then
git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
else
git diff --cached --name-only
fi | grep -qE "^autogpt_platform/autogpt_libs/poetry\.lock$" || exit 0;
poetry -C autogpt_platform/autogpt_libs install
'
always_run: true
language: system
pass_filenames: false
stages: [pre-commit, post-checkout]
- id: pnpm-install
name: Check & Install dependencies - AutoGPT Platform - Frontend
alias: pnpm-install-platform-frontend
entry: >
bash -c '
if [ -n "$PRE_COMMIT_FROM_REF" ]; then
git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
else
git diff --cached --name-only
fi | grep -qE "^autogpt_platform/frontend/pnpm-lock\.yaml$" || exit 0;
pnpm --prefix autogpt_platform/frontend install
'
always_run: true
language: system
pass_filenames: false
stages: [pre-commit, post-checkout]
- id: poetry-install
name: Check & Install dependencies - Classic - AutoGPT
alias: poetry-install-classic-autogpt
entry: poetry -C classic/original_autogpt install
entry: >
bash -c '
if [ -n "$PRE_COMMIT_FROM_REF" ]; then
git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
else
git diff --cached --name-only
fi | grep -qE "^classic/(original_autogpt|forge)/poetry\.lock$" || exit 0;
poetry -C classic/original_autogpt install
'
# include forge source (since it's a path dependency)
files: ^classic/(original_autogpt|forge)/poetry\.lock$
types: [file]
always_run: true
language: system
pass_filenames: false
stages: [pre-commit, post-checkout]
- id: poetry-install
name: Check & Install dependencies - Classic - Forge
alias: poetry-install-classic-forge
entry: poetry -C classic/forge install
files: ^classic/forge/poetry\.lock$
types: [file]
entry: >
bash -c '
if [ -n "$PRE_COMMIT_FROM_REF" ]; then
git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
else
git diff --cached --name-only
fi | grep -qE "^classic/forge/poetry\.lock$" || exit 0;
poetry -C classic/forge install
'
always_run: true
language: system
pass_filenames: false
stages: [pre-commit, post-checkout]
- id: poetry-install
name: Check & Install dependencies - Classic - Benchmark
alias: poetry-install-classic-benchmark
entry: poetry -C classic/benchmark install
files: ^classic/benchmark/poetry\.lock$
types: [file]
entry: >
bash -c '
if [ -n "$PRE_COMMIT_FROM_REF" ]; then
git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
else
git diff --cached --name-only
fi | grep -qE "^classic/benchmark/poetry\.lock$" || exit 0;
poetry -C classic/benchmark install
'
always_run: true
language: system
pass_filenames: false
stages: [pre-commit, post-checkout]
- repo: local
# For proper type checking, Prisma client must be up-to-date.
@@ -76,12 +141,54 @@ repos:
- id: prisma-generate
name: Prisma Generate - AutoGPT Platform - Backend
alias: prisma-generate-platform-backend
entry: bash -c 'cd autogpt_platform/backend && poetry run prisma generate'
entry: >
bash -c '
if [ -n "$PRE_COMMIT_FROM_REF" ]; then
git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
else
git diff --cached --name-only
fi | grep -qE "^autogpt_platform/((backend|autogpt_libs)/poetry\.lock|backend/schema\.prisma)$" || exit 0;
cd autogpt_platform/backend
&& poetry run prisma generate
&& poetry run gen-prisma-stub
'
# include everything that triggers poetry install + the prisma schema
files: ^autogpt_platform/((backend|autogpt_libs)/poetry\.lock|backend/schema.prisma)$
types: [file]
always_run: true
language: system
pass_filenames: false
stages: [pre-commit, post-checkout]
- id: export-api-schema
name: Export API schema - AutoGPT Platform - Backend -> Frontend
alias: export-api-schema-platform
entry: >
bash -c '
cd autogpt_platform/backend
&& poetry run export-api-schema --output ../frontend/src/app/api/openapi.json
&& cd ../frontend
&& pnpm prettier --write ./src/app/api/openapi.json
'
files: ^autogpt_platform/backend/
language: system
pass_filenames: false
- id: generate-api-client
name: Generate API client - AutoGPT Platform - Frontend
alias: generate-api-client-platform-frontend
entry: >
bash -c '
SCHEMA=autogpt_platform/frontend/src/app/api/openapi.json;
if [ -n "$PRE_COMMIT_FROM_REF" ]; then
git diff --quiet "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF" -- "$SCHEMA" && exit 0
else
git diff --quiet HEAD -- "$SCHEMA" && exit 0
fi;
cd autogpt_platform/frontend && pnpm generate:api
'
always_run: true
language: system
pass_filenames: false
stages: [pre-commit, post-checkout]
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.7.2

View File

@@ -16,6 +16,34 @@ See `docs/content/platform/getting-started.md` for setup instructions.
- Format Python code with `poetry run format`.
- Format frontend code using `pnpm format`.
## Frontend guidelines:
See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
1. **Pages**: Create in `src/app/(platform)/feature-name/page.tsx`
- Add `usePageName.ts` hook for logic
- Put sub-components in local `components/` folder
2. **Components**: Structure as `ComponentName/ComponentName.tsx` + `useComponentName.ts` + `helpers.ts`
- Use design system components from `src/components/` (atoms, molecules, organisms)
- Never use `src/components/__legacy__/*`
3. **Data fetching**: Use generated API hooks from `@/app/api/__generated__/endpoints/`
- Regenerate with `pnpm generate:api`
- Pattern: `use{Method}{Version}{OperationName}`
4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
5. **Testing**: Add Storybook stories for new components, Playwright for E2E
6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
- Component props should be `interface Props { ... }` (not exported) unless the interface needs to be used outside the component
- Separate render logic from business logic (component.tsx + useComponent.ts + helpers.ts)
- Colocate state when possible and avoid creating large components, use sub-components ( local `/components` folder next to the parent component ) when sensible
- Avoid large hooks, abstract logic into `helpers.ts` files when sensible
- Use function declarations for components, arrow functions only for callbacks
- No barrel files or `index.ts` re-exports
- Avoid comments at all times unless the code is very complex
- Do not use `useCallback` or `useMemo` unless asked to optimise a given function
- Do not type hook returns, let Typescript infer as much as possible
- Never type with `any`, if not types available use `unknown`
## Testing
- Backend: `poetry run test` (runs pytest with a docker based postgres + prisma).
@@ -23,22 +51,8 @@ See `docs/content/platform/getting-started.md` for setup instructions.
Always run the relevant linters and tests before committing.
Use conventional commit messages for all commits (e.g. `feat(backend): add API`).
Types:
- feat
- fix
- refactor
- ci
- dx (developer experience)
Scopes:
- platform
- platform/library
- platform/marketplace
- backend
- backend/executor
- frontend
- frontend/library
- frontend/marketplace
- blocks
Types: - feat - fix - refactor - ci - dx (developer experience)
Scopes: - platform - platform/library - platform/marketplace - backend - backend/executor - frontend - frontend/library - frontend/marketplace - blocks
## Pull requests

View File

@@ -54,7 +54,7 @@ Before proceeding with the installation, ensure your system meets the following
### Updated Setup Instructions:
We've moved to a fully maintained and regularly updated documentation site.
👉 [Follow the official self-hosting guide here](https://docs.agpt.co/platform/getting-started/)
👉 [Follow the official self-hosting guide here](https://agpt.co/docs/platform/getting-started/getting-started)
This tutorial assumes you have Docker, VSCode, git and npm installed.
@@ -83,13 +83,13 @@ The AutoGPT frontend is where users interact with our powerful AI automation pla
**Agent Builder:** For those who want to customize, our intuitive, low-code interface allows you to design and configure your own AI agents.
**Workflow Management:** Build, modify, and optimize your automation workflows with ease. You build your agent by connecting blocks, where each block performs a single action.
**Workflow Management:** Build, modify, and optimize your automation workflows with ease. You build your agent by connecting blocks, where each block performs a single action.
**Deployment Controls:** Manage the lifecycle of your agents, from testing to production.
**Ready-to-Use Agents:** Don't want to build? Simply select from our library of pre-configured agents and put them to work immediately.
**Agent Interaction:** Whether you've built your own or are using pre-configured agents, easily run and interact with them through our user-friendly interface.
**Agent Interaction:** Whether you've built your own or are using pre-configured agents, easily run and interact with them through our user-friendly interface.
**Monitoring and Analytics:** Keep track of your agents' performance and gain insights to continually improve your automation processes.

View File

@@ -1,2 +1,3 @@
*.ignore.*
*.ign.*
*.ign.*
.application.logs

View File

@@ -6,152 +6,30 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
AutoGPT Platform is a monorepo containing:
- **Backend** (`/backend`): Python FastAPI server with async support
- **Frontend** (`/frontend`): Next.js React application
- **Shared Libraries** (`/autogpt_libs`): Common Python utilities
- **Backend** (`backend`): Python FastAPI server with async support
- **Frontend** (`frontend`): Next.js React application
- **Shared Libraries** (`autogpt_libs`): Common Python utilities
## Essential Commands
## Component Documentation
### Backend Development
- **Backend**: See @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks
- **Frontend**: See @frontend/CLAUDE.md for frontend-specific commands, architecture, and development patterns
```bash
# Install dependencies
cd backend && poetry install
# Run database migrations
poetry run prisma migrate dev
# Start all services (database, redis, rabbitmq, clamav)
docker compose up -d
# Run the backend server
poetry run serve
# Run tests
poetry run test
# Run specific test
poetry run pytest path/to/test_file.py::test_function_name
# Run block tests (tests that validate all blocks work correctly)
poetry run pytest backend/blocks/test/test_block.py -xvs
# Run tests for a specific block (e.g., GetCurrentTimeBlock)
poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
# Lint and format
# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
poetry run format # Black + isort
poetry run lint # ruff
```
More details can be found in TESTING.md
#### Creating/Updating Snapshots
When you first write a test or when the expected output changes:
```bash
poetry run pytest path/to/test.py --snapshot-update
```
⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
### Frontend Development
```bash
# Install dependencies
cd frontend && pnpm i
# Generate API client from OpenAPI spec
pnpm generate:api
# Start development server
pnpm dev
# Run E2E tests
pnpm test
# Run Storybook for component development
pnpm storybook
# Build production
pnpm build
# Format and lint
pnpm format
# Type checking
pnpm types
```
**📖 Complete Guide**: See `/frontend/CONTRIBUTING.md` and `/frontend/.cursorrules` for comprehensive frontend patterns.
**Key Frontend Conventions:**
- Separate render logic from data/behavior in components
- Use generated API hooks from `@/app/api/__generated__/endpoints/`
- Use function declarations (not arrow functions) for components/handlers
- Use design system components from `src/components/` (atoms, molecules, organisms)
- Only use Phosphor Icons
- Never use `src/components/__legacy__/*` or deprecated `BackendAPI`
## Architecture Overview
### Backend Architecture
- **API Layer**: FastAPI with REST and WebSocket endpoints
- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
- **Queue System**: RabbitMQ for async task processing
- **Execution Engine**: Separate executor service processes agent workflows
- **Authentication**: JWT-based with Supabase integration
- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
### Frontend Architecture
- **Framework**: Next.js 15 App Router (client-first approach)
- **Data Fetching**: Type-safe generated API hooks via Orval + React Query
- **State Management**: React Query for server state, co-located UI state in components/hooks
- **Component Structure**: Separate render logic (`.tsx`) from business logic (`use*.ts` hooks)
- **Workflow Builder**: Visual graph editor using @xyflow/react
- **UI Components**: shadcn/ui (Radix UI primitives) with Tailwind CSS styling
- **Icons**: Phosphor Icons only
- **Feature Flags**: LaunchDarkly integration
- **Error Handling**: ErrorCard for render errors, toast for mutations, Sentry for exceptions
- **Testing**: Playwright for E2E, Storybook for component development
### Key Concepts
## Key Concepts
1. **Agent Graphs**: Workflow definitions stored as JSON, executed by the backend
2. **Blocks**: Reusable components in `/backend/blocks/` that perform specific tasks
2. **Blocks**: Reusable components in `backend/backend/blocks/` that perform specific tasks
3. **Integrations**: OAuth and API connections stored per user
4. **Store**: Marketplace for sharing agent templates
5. **Virus Scanning**: ClamAV integration for file upload security
### Testing Approach
- Backend uses pytest with snapshot testing for API responses
- Test files are colocated with source files (`*_test.py`)
- Frontend uses Playwright for E2E tests
- Component testing via Storybook
### Database Schema
Key models (defined in `/backend/schema.prisma`):
- `User`: Authentication and profile data
- `AgentGraph`: Workflow definitions with version control
- `AgentGraphExecution`: Execution history and results
- `AgentNode`: Individual nodes in a workflow
- `StoreListing`: Marketplace listings for sharing agents
### Environment Configuration
#### Configuration Files
- **Backend**: `/backend/.env.default` (defaults) → `/backend/.env` (user overrides)
- **Frontend**: `/frontend/.env.default` (defaults) → `/frontend/.env` (user overrides)
- **Platform**: `/.env.default` (Supabase/shared defaults) → `/.env` (user overrides)
- **Backend**: `backend/.env.default` (defaults) → `backend/.env` (user overrides)
- **Frontend**: `frontend/.env.default` (defaults) → `frontend/.env` (user overrides)
- **Platform**: `.env.default` (Supabase/shared defaults) → `.env` (user overrides)
#### Docker Environment Loading Order
@@ -167,82 +45,49 @@ Key models (defined in `/backend/schema.prisma`):
- Backend/Frontend services use YAML anchors for consistent configuration
- Supabase services (`db/docker/docker-compose.yml`) follow the same pattern
### Common Development Tasks
### Branching Strategy
**Adding a new block:**
Follow the comprehensive [Block SDK Guide](../../../docs/content/platform/block-sdk-guide.md) which covers:
- Provider configuration with `ProviderBuilder`
- Block schema definition
- Authentication (API keys, OAuth, webhooks)
- Testing and validation
- File organization
Quick steps:
1. Create new file in `/backend/backend/blocks/`
2. Configure provider using `ProviderBuilder` in `_config.py`
3. Inherit from `Block` base class
4. Define input/output schemas using `BlockSchema`
5. Implement async `run` method
6. Generate unique block ID using `uuid.uuid4()`
7. Test with `poetry run pytest backend/blocks/test/test_block.py`
Note: when making many new blocks analyze the interfaces for each of these blocks and picture if they would go well together in a graph based editor or would they struggle to connect productively?
ex: do the inputs and outputs tie well together?
If you get any pushback or hit complex block conditions check the new_blocks guide in the docs.
**Modifying the API:**
1. Update route in `/backend/backend/server/routers/`
2. Add/update Pydantic models in same directory
3. Write tests alongside the route file
4. Run `poetry run test` to verify
**Frontend feature development:**
See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
1. **Pages**: Create in `src/app/(platform)/feature-name/page.tsx`
- Add `usePageName.ts` hook for logic
- Put sub-components in local `components/` folder
2. **Components**: Structure as `ComponentName/ComponentName.tsx` + `useComponentName.ts` + `helpers.ts`
- Use design system components from `src/components/` (atoms, molecules, organisms)
- Never use `src/components/__legacy__/*`
3. **Data fetching**: Use generated API hooks from `@/app/api/__generated__/endpoints/`
- Regenerate with `pnpm generate:api`
- Pattern: `use{Method}{Version}{OperationName}`
4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
5. **Testing**: Add Storybook stories for new components, Playwright for E2E
6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
### Security Implementation
**Cache Protection Middleware:**
- Located in `/backend/backend/server/middleware/security.py`
- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
- Uses an allow list approach - only explicitly permitted paths can be cached
- Cacheable paths include: static assets (`/static/*`, `/_next/static/*`), health checks, public store pages, documentation
- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
- Applied to both main API server and external API applications
- **`dev`** is the main development branch. All PRs should target `dev`.
- **`master`** is the production branch. Only used for production releases.
### Creating Pull Requests
- Create the PR aginst the `dev` branch of the repository.
- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)/
- Use conventional commit messages (see below)/
- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description/
- Create the PR against the `dev` branch of the repository.
- **Split PRs by concern** — each PR should have a single clear purpose. For example, "usage tracking" and "credit charging" should be separate PRs even if related. Combining multiple concerns makes it harder for reviewers to understand what belongs to what.
- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)
- Use conventional commit messages (see below)
- **Structure the PR description with Why / What / How** — Why: the motivation (what problem it solves, what's broken/missing without it); What: high-level summary of changes; How: approach, key implementation details, or architecture decisions. Reviewers need all three to judge whether the approach fits the problem.
- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description
- Always use `--body-file` to pass PR body — avoids shell interpretation of backticks and special characters:
```bash
PR_BODY=$(mktemp)
cat > "$PR_BODY" << 'PREOF'
## Summary
- use `backticks` freely here
PREOF
gh pr create --title "..." --body-file "$PR_BODY" --base dev
rm "$PR_BODY"
```
- Run the github pre-commit hooks to ensure code quality.
### Test-Driven Development (TDD)
When fixing a bug or adding a feature, follow a test-first approach:
1. **Write a failing test first** — create a test that reproduces the bug or validates the new behavior, marked with `@pytest.mark.xfail` (backend) or `.fixme` (Playwright). Run it to confirm it fails for the right reason.
2. **Implement the fix/feature** — write the minimal code to make the test pass.
3. **Remove the xfail marker** — once the test passes, remove the `xfail`/`.fixme` annotation and run the full test suite to confirm nothing else broke.
This ensures every change is covered by a test and that the test actually validates the intended behavior.
### Reviewing/Revising Pull Requests
- When the user runs /pr-comments or tries to fetch them, also run gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews to get the reviews
- Use gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews/[review_id]/comments to get the review contents
- Use gh api /repos/Significant-Gravitas/AutoGPT/issues/9924/comments to get the pr specific comments
Use `/pr-review` to review a PR or `/pr-address` to address comments.
When fetching comments manually:
- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate` — top-level reviews
- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments --paginate` — inline review comments (always paginate to avoid missing comments beyond page 1)
- `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments` — PR conversation comments
### Conventional Commits

View File

@@ -6,12 +6,14 @@ start-core:
# Stop core services
stop-core:
docker compose stop deps
docker compose stop
reset-db:
docker compose stop db
rm -rf db/docker/volumes/db/data
cd backend && poetry run prisma migrate deploy
cd backend && poetry run prisma generate
cd backend && poetry run gen-prisma-stub
# View logs for core services
logs-core:
@@ -33,6 +35,7 @@ init-env:
migrate:
cd backend && poetry run prisma migrate deploy
cd backend && poetry run prisma generate
cd backend && poetry run gen-prisma-stub
run-backend:
cd backend && poetry run app
@@ -58,4 +61,4 @@ help:
@echo " run-backend - Run the backend FastAPI server"
@echo " run-frontend - Run the frontend Next.js development server"
@echo " test-data - Run the test data creator"
@echo " load-store-agents - Load store agents from agents/ folder into test database"
@echo " load-store-agents - Load store agents from agents/ folder into test database"

View File

@@ -0,0 +1,40 @@
-- =============================================================
-- View: analytics.auth_activities
-- Looker source alias: ds49 | Charts: 1
-- =============================================================
-- DESCRIPTION
-- Tracks authentication events (login, logout, SSO, password
-- reset, etc.) from Supabase's internal audit log.
-- Useful for monitoring sign-in patterns and detecting anomalies.
--
-- SOURCE TABLES
-- auth.audit_log_entries — Supabase internal auth event log
--
-- OUTPUT COLUMNS
-- created_at TIMESTAMPTZ When the auth event occurred
-- actor_id TEXT User ID who triggered the event
-- actor_via_sso TEXT Whether the action was via SSO ('true'/'false')
-- action TEXT Event type (e.g. 'login', 'logout', 'token_refreshed')
--
-- WINDOW
-- Rolling 90 days from current date
--
-- EXAMPLE QUERIES
-- -- Daily login counts
-- SELECT DATE_TRUNC('day', created_at) AS day, COUNT(*) AS logins
-- FROM analytics.auth_activities
-- WHERE action = 'login'
-- GROUP BY 1 ORDER BY 1;
--
-- -- SSO vs password login breakdown
-- SELECT actor_via_sso, COUNT(*) FROM analytics.auth_activities
-- WHERE action = 'login' GROUP BY 1;
-- =============================================================
SELECT
created_at,
payload->>'actor_id' AS actor_id,
payload->>'actor_via_sso' AS actor_via_sso,
payload->>'action' AS action
FROM auth.audit_log_entries
WHERE created_at >= NOW() - INTERVAL '90 days'

View File

@@ -0,0 +1,105 @@
-- =============================================================
-- View: analytics.graph_execution
-- Looker source alias: ds16 | Charts: 21
-- =============================================================
-- DESCRIPTION
-- One row per agent graph execution (last 90 days).
-- Unpacks the JSONB stats column into individual numeric columns
-- and normalises the executionStatus — runs that failed due to
-- insufficient credits are reclassified as 'NO_CREDITS' for
-- easier filtering. Error messages are scrubbed of IDs and URLs
-- to allow safe grouping.
--
-- SOURCE TABLES
-- platform.AgentGraphExecution — Execution records
-- platform.AgentGraph — Agent graph metadata (for name)
-- platform.LibraryAgent — To flag possibly-AI (safe-mode) agents
--
-- OUTPUT COLUMNS
-- id TEXT Execution UUID
-- agentGraphId TEXT Agent graph UUID
-- agentGraphVersion INT Graph version number
-- executionStatus TEXT COMPLETED | FAILED | NO_CREDITS | RUNNING | QUEUED | TERMINATED
-- createdAt TIMESTAMPTZ When the execution was queued
-- updatedAt TIMESTAMPTZ Last status update time
-- userId TEXT Owner user UUID
-- agentGraphName TEXT Human-readable agent name
-- cputime DECIMAL Total CPU seconds consumed
-- walltime DECIMAL Total wall-clock seconds
-- node_count DECIMAL Number of nodes in the graph
-- nodes_cputime DECIMAL CPU time across all nodes
-- nodes_walltime DECIMAL Wall time across all nodes
-- execution_cost DECIMAL Credit cost of this execution
-- correctness_score FLOAT AI correctness score (if available)
-- possibly_ai BOOLEAN True if agent has sensitive_action_safe_mode enabled
-- groupedErrorMessage TEXT Scrubbed error string (IDs/URLs replaced with wildcards)
--
-- WINDOW
-- Rolling 90 days (createdAt > CURRENT_DATE - 90 days)
--
-- EXAMPLE QUERIES
-- -- Daily execution counts by status
-- SELECT DATE_TRUNC('day', "createdAt") AS day, "executionStatus", COUNT(*)
-- FROM analytics.graph_execution
-- GROUP BY 1, 2 ORDER BY 1;
--
-- -- Average cost per execution by agent
-- SELECT "agentGraphName", AVG("execution_cost") AS avg_cost, COUNT(*) AS runs
-- FROM analytics.graph_execution
-- WHERE "executionStatus" = 'COMPLETED'
-- GROUP BY 1 ORDER BY avg_cost DESC;
--
-- -- Top error messages
-- SELECT "groupedErrorMessage", COUNT(*) AS occurrences
-- FROM analytics.graph_execution
-- WHERE "executionStatus" = 'FAILED'
-- GROUP BY 1 ORDER BY 2 DESC LIMIT 20;
-- =============================================================
SELECT
ge."id" AS id,
ge."agentGraphId" AS agentGraphId,
ge."agentGraphVersion" AS agentGraphVersion,
CASE
WHEN jsonb_exists(ge."stats"::jsonb, 'error')
AND (
(ge."stats"::jsonb->>'error') ILIKE '%insufficient balance%'
OR (ge."stats"::jsonb->>'error') ILIKE '%you have no credits left%'
)
THEN 'NO_CREDITS'
ELSE CAST(ge."executionStatus" AS TEXT)
END AS executionStatus,
ge."createdAt" AS createdAt,
ge."updatedAt" AS updatedAt,
ge."userId" AS userId,
g."name" AS agentGraphName,
(ge."stats"::jsonb->>'cputime')::decimal AS cputime,
(ge."stats"::jsonb->>'walltime')::decimal AS walltime,
(ge."stats"::jsonb->>'node_count')::decimal AS node_count,
(ge."stats"::jsonb->>'nodes_cputime')::decimal AS nodes_cputime,
(ge."stats"::jsonb->>'nodes_walltime')::decimal AS nodes_walltime,
(ge."stats"::jsonb->>'cost')::decimal AS execution_cost,
(ge."stats"::jsonb->>'correctness_score')::float AS correctness_score,
COALESCE(la.possibly_ai, FALSE) AS possibly_ai,
REGEXP_REPLACE(
REGEXP_REPLACE(
TRIM(BOTH '"' FROM ge."stats"::jsonb->>'error'),
'(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?',
'\1\2/...', 'gi'
),
'[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*', '*', 'g'
) AS groupedErrorMessage
FROM platform."AgentGraphExecution" ge
LEFT JOIN platform."AgentGraph" g
ON ge."agentGraphId" = g."id"
AND ge."agentGraphVersion" = g."version"
LEFT JOIN (
SELECT DISTINCT ON ("userId", "agentGraphId")
"userId", "agentGraphId",
("settings"::jsonb->>'sensitive_action_safe_mode')::boolean AS possibly_ai
FROM platform."LibraryAgent"
WHERE "isDeleted" = FALSE
AND "isArchived" = FALSE
ORDER BY "userId", "agentGraphId", "agentGraphVersion" DESC
) la ON la."userId" = ge."userId" AND la."agentGraphId" = ge."agentGraphId"
WHERE ge."createdAt" > CURRENT_DATE - INTERVAL '90 days'

View File

@@ -0,0 +1,101 @@
-- =============================================================
-- View: analytics.node_block_execution
-- Looker source alias: ds14 | Charts: 11
-- =============================================================
-- DESCRIPTION
-- One row per node (block) execution (last 90 days).
-- Unpacks stats JSONB and joins to identify which block type
-- was run. For failed nodes, joins the error output and
-- scrubs it for safe grouping.
--
-- SOURCE TABLES
-- platform.AgentNodeExecution — Node execution records
-- platform.AgentNode — Node → block mapping
-- platform.AgentBlock — Block name/ID
-- platform.AgentNodeExecutionInputOutput — Error output values
--
-- OUTPUT COLUMNS
-- id TEXT Node execution UUID
-- agentGraphExecutionId TEXT Parent graph execution UUID
-- agentNodeId TEXT Node UUID within the graph
-- executionStatus TEXT COMPLETED | FAILED | QUEUED | RUNNING | TERMINATED
-- addedTime TIMESTAMPTZ When the node was queued
-- queuedTime TIMESTAMPTZ When it entered the queue
-- startedTime TIMESTAMPTZ When execution started
-- endedTime TIMESTAMPTZ When execution finished
-- inputSize BIGINT Input payload size in bytes
-- outputSize BIGINT Output payload size in bytes
-- walltime NUMERIC Wall-clock seconds for this node
-- cputime NUMERIC CPU seconds for this node
-- llmRetryCount INT Number of LLM retries
-- llmCallCount INT Number of LLM API calls made
-- inputTokenCount BIGINT LLM input tokens consumed
-- outputTokenCount BIGINT LLM output tokens produced
-- blockName TEXT Human-readable block name (e.g. 'OpenAIBlock')
-- blockId TEXT Block UUID
-- groupedErrorMessage TEXT Scrubbed error (IDs/URLs wildcarded)
-- errorMessage TEXT Raw error output (only set when FAILED)
--
-- WINDOW
-- Rolling 90 days (addedTime > CURRENT_DATE - 90 days)
--
-- EXAMPLE QUERIES
-- -- Most-used blocks by execution count
-- SELECT "blockName", COUNT(*) AS executions,
-- COUNT(*) FILTER (WHERE "executionStatus"='FAILED') AS failures
-- FROM analytics.node_block_execution
-- GROUP BY 1 ORDER BY executions DESC LIMIT 20;
--
-- -- Average LLM token usage per block
-- SELECT "blockName",
-- AVG("inputTokenCount") AS avg_input_tokens,
-- AVG("outputTokenCount") AS avg_output_tokens
-- FROM analytics.node_block_execution
-- WHERE "llmCallCount" > 0
-- GROUP BY 1 ORDER BY avg_input_tokens DESC;
--
-- -- Top failure reasons
-- SELECT "blockName", "groupedErrorMessage", COUNT(*) AS count
-- FROM analytics.node_block_execution
-- WHERE "executionStatus" = 'FAILED'
-- GROUP BY 1, 2 ORDER BY count DESC LIMIT 20;
-- =============================================================
SELECT
ne."id" AS id,
ne."agentGraphExecutionId" AS agentGraphExecutionId,
ne."agentNodeId" AS agentNodeId,
CAST(ne."executionStatus" AS TEXT) AS executionStatus,
ne."addedTime" AS addedTime,
ne."queuedTime" AS queuedTime,
ne."startedTime" AS startedTime,
ne."endedTime" AS endedTime,
(ne."stats"::jsonb->>'input_size')::bigint AS inputSize,
(ne."stats"::jsonb->>'output_size')::bigint AS outputSize,
(ne."stats"::jsonb->>'walltime')::numeric AS walltime,
(ne."stats"::jsonb->>'cputime')::numeric AS cputime,
(ne."stats"::jsonb->>'llm_retry_count')::int AS llmRetryCount,
(ne."stats"::jsonb->>'llm_call_count')::int AS llmCallCount,
(ne."stats"::jsonb->>'input_token_count')::bigint AS inputTokenCount,
(ne."stats"::jsonb->>'output_token_count')::bigint AS outputTokenCount,
b."name" AS blockName,
b."id" AS blockId,
REGEXP_REPLACE(
REGEXP_REPLACE(
TRIM(BOTH '"' FROM eio."data"::text),
'(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?',
'\1\2/...', 'gi'
),
'[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*', '*', 'g'
) AS groupedErrorMessage,
eio."data" AS errorMessage
FROM platform."AgentNodeExecution" ne
LEFT JOIN platform."AgentNode" nd
ON ne."agentNodeId" = nd."id"
LEFT JOIN platform."AgentBlock" b
ON nd."agentBlockId" = b."id"
LEFT JOIN platform."AgentNodeExecutionInputOutput" eio
ON eio."referencedByOutputExecId" = ne."id"
AND eio."name" = 'error'
AND ne."executionStatus" = 'FAILED'
WHERE ne."addedTime" > CURRENT_DATE - INTERVAL '90 days'

View File

@@ -0,0 +1,97 @@
-- =============================================================
-- View: analytics.retention_agent
-- Looker source alias: ds35 | Charts: 2
-- =============================================================
-- DESCRIPTION
-- Weekly cohort retention broken down per individual agent.
-- Cohort = week of a user's first use of THAT specific agent.
-- Tells you which agents keep users coming back vs. one-shot
-- use. Only includes cohorts from the last 180 days.
--
-- SOURCE TABLES
-- platform.AgentGraphExecution — Execution records (user × agent × time)
-- platform.AgentGraph — Agent names
--
-- OUTPUT COLUMNS
-- agent_id TEXT Agent graph UUID
-- agent_label TEXT 'AgentName [first8chars]'
-- agent_label_n TEXT 'AgentName [first8chars] (n=total_users)'
-- cohort_week_start DATE Week users first ran this agent
-- cohort_label TEXT ISO week label
-- cohort_label_n TEXT ISO week label with cohort size
-- user_lifetime_week INT Weeks since first use of this agent
-- cohort_users BIGINT Users in this cohort for this agent
-- active_users BIGINT Users who ran the agent again in week k
-- retention_rate FLOAT active_users / cohort_users
-- cohort_users_w0 BIGINT cohort_users only at week 0 (safe to SUM)
-- agent_total_users BIGINT Total users across all cohorts for this agent
--
-- EXAMPLE QUERIES
-- -- Best-retained agents at week 2
-- SELECT agent_label, AVG(retention_rate) AS w2_retention
-- FROM analytics.retention_agent
-- WHERE user_lifetime_week = 2 AND cohort_users >= 10
-- GROUP BY 1 ORDER BY w2_retention DESC LIMIT 10;
--
-- -- Agents with most unique users
-- SELECT DISTINCT agent_label, agent_total_users
-- FROM analytics.retention_agent
-- ORDER BY agent_total_users DESC LIMIT 20;
-- =============================================================
WITH params AS (SELECT 12::int AS max_weeks, (CURRENT_DATE - INTERVAL '180 days') AS cohort_start),
events AS (
SELECT e."userId"::text AS user_id, e."agentGraphId" AS agent_id,
e."createdAt"::timestamptz AS created_at,
DATE_TRUNC('week', e."createdAt")::date AS week_start
FROM platform."AgentGraphExecution" e
),
first_use AS (
SELECT user_id, agent_id, MIN(created_at) AS first_use_at,
DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
FROM events GROUP BY 1,2
HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
),
activity_weeks AS (SELECT DISTINCT user_id, agent_id, week_start FROM events),
user_week_age AS (
SELECT aw.user_id, aw.agent_id, fu.cohort_week_start,
((aw.week_start - DATE_TRUNC('week',fu.first_use_at)::date)/7)::int AS user_lifetime_week
FROM activity_weeks aw JOIN first_use fu USING (user_id, agent_id)
WHERE aw.week_start >= DATE_TRUNC('week',fu.first_use_at)::date
),
active_counts AS (
SELECT agent_id, cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users
FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2,3
),
cohort_sizes AS (
SELECT agent_id, cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_use GROUP BY 1,2
),
cohort_caps AS (
SELECT cs.agent_id, cs.cohort_week_start, cs.cohort_users,
LEAST((SELECT max_weeks FROM params),
GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
FROM cohort_sizes cs
),
grid AS (
SELECT cc.agent_id, cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
),
agent_names AS (SELECT DISTINCT ON (g."id") g."id" AS agent_id, g."name" AS agent_name FROM platform."AgentGraph" g ORDER BY g."id", g."version" DESC),
agent_total_users AS (SELECT agent_id, SUM(cohort_users) AS agent_total_users FROM cohort_sizes GROUP BY 1)
SELECT
g.agent_id,
COALESCE(an.agent_name,'(unnamed)')||' ['||LEFT(g.agent_id::text,8)||']' AS agent_label,
COALESCE(an.agent_name,'(unnamed)')||' ['||LEFT(g.agent_id::text,8)||'] (n='||COALESCE(atu.agent_total_users,0)||')' AS agent_label_n,
g.cohort_week_start,
TO_CHAR(g.cohort_week_start,'IYYY-"W"IW') AS cohort_label,
TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')' AS cohort_label_n,
g.user_lifetime_week, g.cohort_users,
COALESCE(ac.active_users,0) AS active_users,
COALESCE(ac.active_users,0)::float / NULLIF(g.cohort_users,0) AS retention_rate,
CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END AS cohort_users_w0,
COALESCE(atu.agent_total_users,0) AS agent_total_users
FROM grid g
LEFT JOIN active_counts ac ON ac.agent_id=g.agent_id AND ac.cohort_week_start=g.cohort_week_start AND ac.user_lifetime_week=g.user_lifetime_week
LEFT JOIN agent_names an ON an.agent_id=g.agent_id
LEFT JOIN agent_total_users atu ON atu.agent_id=g.agent_id
ORDER BY agent_label, g.cohort_week_start, g.user_lifetime_week;

View File

@@ -0,0 +1,81 @@
-- =============================================================
-- View: analytics.retention_execution_daily
-- Looker source alias: ds111 | Charts: 1
-- =============================================================
-- DESCRIPTION
-- Daily cohort retention based on agent executions.
-- Cohort anchor = day of user's FIRST ever execution.
-- Only includes cohorts from the last 90 days, up to day 30.
-- Great for early engagement analysis (did users run another
-- agent the next day?).
--
-- SOURCE TABLES
-- platform.AgentGraphExecution — Execution records
--
-- OUTPUT COLUMNS
-- Same pattern as retention_login_daily.
-- cohort_day_start = day of first execution (not first login)
--
-- EXAMPLE QUERIES
-- -- Day-3 execution retention
-- SELECT cohort_label, retention_rate_bounded AS d3_retention
-- FROM analytics.retention_execution_daily
-- WHERE user_lifetime_day = 3 ORDER BY cohort_day_start;
-- =============================================================
WITH params AS (SELECT 30::int AS max_days, (CURRENT_DATE - INTERVAL '90 days') AS cohort_start),
events AS (
SELECT e."userId"::text AS user_id, e."createdAt"::timestamptz AS created_at,
DATE_TRUNC('day', e."createdAt")::date AS day_start
FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
),
first_exec AS (
SELECT user_id, MIN(created_at) AS first_exec_at,
DATE_TRUNC('day', MIN(created_at))::date AS cohort_day_start
FROM events GROUP BY 1
HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
),
activity_days AS (SELECT DISTINCT user_id, day_start FROM events),
user_day_age AS (
SELECT ad.user_id, fe.cohort_day_start,
(ad.day_start - DATE_TRUNC('day',fe.first_exec_at)::date)::int AS user_lifetime_day
FROM activity_days ad JOIN first_exec fe USING (user_id)
WHERE ad.day_start >= DATE_TRUNC('day',fe.first_exec_at)::date
),
bounded_counts AS (
SELECT cohort_day_start, user_lifetime_day, COUNT(DISTINCT user_id) AS active_users_bounded
FROM user_day_age WHERE user_lifetime_day >= 0 GROUP BY 1,2
),
last_active AS (
SELECT cohort_day_start, user_id, MAX(user_lifetime_day) AS last_active_day FROM user_day_age GROUP BY 1,2
),
unbounded_counts AS (
SELECT la.cohort_day_start, gs AS user_lifetime_day, COUNT(*) AS retained_users_unbounded
FROM last_active la
CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_day,(SELECT max_days FROM params))) gs
GROUP BY 1,2
),
cohort_sizes AS (SELECT cohort_day_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_exec GROUP BY 1),
cohort_caps AS (
SELECT cs.cohort_day_start, cs.cohort_users,
LEAST((SELECT max_days FROM params), GREATEST(0,(CURRENT_DATE-cs.cohort_day_start)::int)) AS cap_days
FROM cohort_sizes cs
),
grid AS (
SELECT cc.cohort_day_start, gs AS user_lifetime_day, cc.cohort_users
FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_days) gs
)
SELECT
g.cohort_day_start,
TO_CHAR(g.cohort_day_start,'YYYY-MM-DD') AS cohort_label,
TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')||' (n='||g.cohort_users||')' AS cohort_label_n,
g.user_lifetime_day, g.cohort_users,
COALESCE(b.active_users_bounded,0) AS active_users_bounded,
COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END AS retention_rate_bounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
CASE WHEN g.user_lifetime_day=0 THEN g.cohort_users ELSE 0 END AS cohort_users_d0
FROM grid g
LEFT JOIN bounded_counts b ON b.cohort_day_start=g.cohort_day_start AND b.user_lifetime_day=g.user_lifetime_day
LEFT JOIN unbounded_counts u ON u.cohort_day_start=g.cohort_day_start AND u.user_lifetime_day=g.user_lifetime_day
ORDER BY g.cohort_day_start, g.user_lifetime_day;

View File

@@ -0,0 +1,81 @@
-- =============================================================
-- View: analytics.retention_execution_weekly
-- Looker source alias: ds92 | Charts: 2
-- =============================================================
-- DESCRIPTION
-- Weekly cohort retention based on agent executions.
-- Cohort anchor = week of user's FIRST ever agent execution
-- (not first login). Only includes cohorts from the last 180 days.
-- Useful when you care about product engagement, not just visits.
--
-- SOURCE TABLES
-- platform.AgentGraphExecution — Execution records
--
-- OUTPUT COLUMNS
-- Same pattern as retention_login_weekly.
-- cohort_week_start = week of first execution (not first login)
--
-- EXAMPLE QUERIES
-- -- Week-2 execution retention
-- SELECT cohort_label, retention_rate_bounded
-- FROM analytics.retention_execution_weekly
-- WHERE user_lifetime_week = 2 ORDER BY cohort_week_start;
-- =============================================================
WITH params AS (SELECT 12::int AS max_weeks, (CURRENT_DATE - INTERVAL '180 days') AS cohort_start),
events AS (
SELECT e."userId"::text AS user_id, e."createdAt"::timestamptz AS created_at,
DATE_TRUNC('week', e."createdAt")::date AS week_start
FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
),
first_exec AS (
SELECT user_id, MIN(created_at) AS first_exec_at,
DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
FROM events GROUP BY 1
HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
),
activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
user_week_age AS (
SELECT aw.user_id, fe.cohort_week_start,
((aw.week_start - DATE_TRUNC('week',fe.first_exec_at)::date)/7)::int AS user_lifetime_week
FROM activity_weeks aw JOIN first_exec fe USING (user_id)
WHERE aw.week_start >= DATE_TRUNC('week',fe.first_exec_at)::date
),
bounded_counts AS (
SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
),
last_active AS (
SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
),
unbounded_counts AS (
SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
FROM last_active la
CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
GROUP BY 1,2
),
cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_exec GROUP BY 1),
cohort_caps AS (
SELECT cs.cohort_week_start, cs.cohort_users,
LEAST((SELECT max_weeks FROM params),
GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
FROM cohort_sizes cs
),
grid AS (
SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
)
SELECT
g.cohort_week_start,
TO_CHAR(g.cohort_week_start,'IYYY-"W"IW') AS cohort_label,
TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')' AS cohort_label_n,
g.user_lifetime_week, g.cohort_users,
COALESCE(b.active_users_bounded,0) AS active_users_bounded,
COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END AS retention_rate_bounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END AS cohort_users_w0
FROM grid g
LEFT JOIN bounded_counts b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
ORDER BY g.cohort_week_start, g.user_lifetime_week;

View File

@@ -0,0 +1,94 @@
-- =============================================================
-- View: analytics.retention_login_daily
-- Looker source alias: ds112 | Charts: 1
-- =============================================================
-- DESCRIPTION
-- Daily cohort retention based on login sessions.
-- Same logic as retention_login_weekly but at day granularity,
-- showing up to day 30 for cohorts from the last 90 days.
-- Useful for analysing early activation (days 1-7) in detail.
--
-- SOURCE TABLES
-- auth.sessions — Login session records
--
-- OUTPUT COLUMNS (same pattern as retention_login_weekly)
-- cohort_day_start DATE First day the cohort logged in
-- cohort_label TEXT Date string (e.g. '2025-03-01')
-- cohort_label_n TEXT Date + cohort size (e.g. '2025-03-01 (n=12)')
-- user_lifetime_day INT Days since first login (0 = signup day)
-- cohort_users BIGINT Total users in cohort
-- active_users_bounded BIGINT Users active on exactly day k
-- retained_users_unbounded BIGINT Users active any time on/after day k
-- retention_rate_bounded FLOAT bounded / cohort_users
-- retention_rate_unbounded FLOAT unbounded / cohort_users
-- cohort_users_d0 BIGINT cohort_users only at day 0, else 0 (safe to SUM)
--
-- EXAMPLE QUERIES
-- -- Day-1 retention rate (came back next day)
-- SELECT cohort_label, retention_rate_bounded AS d1_retention
-- FROM analytics.retention_login_daily
-- WHERE user_lifetime_day = 1 ORDER BY cohort_day_start;
--
-- -- Average retention curve across all cohorts
-- SELECT user_lifetime_day,
-- SUM(active_users_bounded)::float / NULLIF(SUM(cohort_users_d0), 0) AS avg_retention
-- FROM analytics.retention_login_daily
-- GROUP BY 1 ORDER BY 1;
-- =============================================================
WITH params AS (SELECT 30::int AS max_days, (CURRENT_DATE - INTERVAL '90 days')::date AS cohort_start),
events AS (
SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
DATE_TRUNC('day', s.created_at)::date AS day_start
FROM auth.sessions s WHERE s.user_id IS NOT NULL
),
first_login AS (
SELECT user_id, MIN(created_at) AS first_login_time,
DATE_TRUNC('day', MIN(created_at))::date AS cohort_day_start
FROM events GROUP BY 1
HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
),
activity_days AS (SELECT DISTINCT user_id, day_start FROM events),
user_day_age AS (
SELECT ad.user_id, fl.cohort_day_start,
(ad.day_start - DATE_TRUNC('day', fl.first_login_time)::date)::int AS user_lifetime_day
FROM activity_days ad JOIN first_login fl USING (user_id)
WHERE ad.day_start >= DATE_TRUNC('day', fl.first_login_time)::date
),
bounded_counts AS (
SELECT cohort_day_start, user_lifetime_day, COUNT(DISTINCT user_id) AS active_users_bounded
FROM user_day_age WHERE user_lifetime_day >= 0 GROUP BY 1,2
),
last_active AS (
SELECT cohort_day_start, user_id, MAX(user_lifetime_day) AS last_active_day FROM user_day_age GROUP BY 1,2
),
unbounded_counts AS (
SELECT la.cohort_day_start, gs AS user_lifetime_day, COUNT(*) AS retained_users_unbounded
FROM last_active la
CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_day,(SELECT max_days FROM params))) gs
GROUP BY 1,2
),
cohort_sizes AS (SELECT cohort_day_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
cohort_caps AS (
SELECT cs.cohort_day_start, cs.cohort_users,
LEAST((SELECT max_days FROM params), GREATEST(0,(CURRENT_DATE-cs.cohort_day_start)::int)) AS cap_days
FROM cohort_sizes cs
),
grid AS (
SELECT cc.cohort_day_start, gs AS user_lifetime_day, cc.cohort_users
FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_days) gs
)
SELECT
g.cohort_day_start,
TO_CHAR(g.cohort_day_start,'YYYY-MM-DD') AS cohort_label,
TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')||' (n='||g.cohort_users||')' AS cohort_label_n,
g.user_lifetime_day, g.cohort_users,
COALESCE(b.active_users_bounded,0) AS active_users_bounded,
COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END AS retention_rate_bounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
CASE WHEN g.user_lifetime_day=0 THEN g.cohort_users ELSE 0 END AS cohort_users_d0
FROM grid g
LEFT JOIN bounded_counts b ON b.cohort_day_start=g.cohort_day_start AND b.user_lifetime_day=g.user_lifetime_day
LEFT JOIN unbounded_counts u ON u.cohort_day_start=g.cohort_day_start AND u.user_lifetime_day=g.user_lifetime_day
ORDER BY g.cohort_day_start, g.user_lifetime_day;

View File

@@ -0,0 +1,96 @@
-- =============================================================
-- View: analytics.retention_login_onboarded_weekly
-- Looker source alias: ds101 | Charts: 2
-- =============================================================
-- DESCRIPTION
-- Weekly cohort retention from login sessions, restricted to
-- users who "onboarded" — defined as running at least one
-- agent within 365 days of their first login.
-- Filters out users who signed up but never activated,
-- giving a cleaner view of engaged-user retention.
--
-- SOURCE TABLES
-- auth.sessions — Login session records
-- platform.AgentGraphExecution — Used to identify onboarders
--
-- OUTPUT COLUMNS
-- Same as retention_login_weekly (cohort_week_start, user_lifetime_week,
-- retention_rate_bounded, retention_rate_unbounded, etc.)
-- Only difference: cohort is filtered to onboarded users only.
--
-- EXAMPLE QUERIES
-- -- Compare week-4 retention: all users vs onboarded only
-- SELECT 'all_users' AS segment, AVG(retention_rate_bounded) AS w4_retention
-- FROM analytics.retention_login_weekly WHERE user_lifetime_week = 4
-- UNION ALL
-- SELECT 'onboarded', AVG(retention_rate_bounded)
-- FROM analytics.retention_login_onboarded_weekly WHERE user_lifetime_week = 4;
-- =============================================================
WITH params AS (SELECT 12::int AS max_weeks, 365::int AS onboarding_window_days),
events AS (
SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
DATE_TRUNC('week', s.created_at)::date AS week_start
FROM auth.sessions s WHERE s.user_id IS NOT NULL
),
first_login_all AS (
SELECT user_id, MIN(created_at) AS first_login_time,
DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
FROM events GROUP BY 1
),
onboarders AS (
SELECT fl.user_id FROM first_login_all fl
WHERE EXISTS (
SELECT 1 FROM platform."AgentGraphExecution" e
WHERE e."userId"::text = fl.user_id
AND e."createdAt" >= fl.first_login_time
AND e."createdAt" < fl.first_login_time
+ make_interval(days => (SELECT onboarding_window_days FROM params))
)
),
first_login AS (SELECT * FROM first_login_all WHERE user_id IN (SELECT user_id FROM onboarders)),
activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
user_week_age AS (
SELECT aw.user_id, fl.cohort_week_start,
((aw.week_start - DATE_TRUNC('week',fl.first_login_time)::date)/7)::int AS user_lifetime_week
FROM activity_weeks aw JOIN first_login fl USING (user_id)
WHERE aw.week_start >= DATE_TRUNC('week',fl.first_login_time)::date
),
bounded_counts AS (
SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
),
last_active AS (
SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
),
unbounded_counts AS (
SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
FROM last_active la
CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
GROUP BY 1,2
),
cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
cohort_caps AS (
SELECT cs.cohort_week_start, cs.cohort_users,
LEAST((SELECT max_weeks FROM params),
GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
FROM cohort_sizes cs
),
grid AS (
SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
)
SELECT
g.cohort_week_start,
TO_CHAR(g.cohort_week_start,'IYYY-"W"IW') AS cohort_label,
TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')' AS cohort_label_n,
g.user_lifetime_week, g.cohort_users,
COALESCE(b.active_users_bounded,0) AS active_users_bounded,
COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END AS retention_rate_bounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END AS cohort_users_w0
FROM grid g
LEFT JOIN bounded_counts b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
ORDER BY g.cohort_week_start, g.user_lifetime_week;

View File

@@ -0,0 +1,103 @@
-- =============================================================
-- View: analytics.retention_login_weekly
-- Looker source alias: ds83 | Charts: 2
-- =============================================================
-- DESCRIPTION
-- Weekly cohort retention based on login sessions.
-- Users are grouped by the ISO week of their first ever login.
-- For each cohort × lifetime-week combination, outputs both:
-- - bounded rate: % active in exactly that week
-- - unbounded rate: % who were ever active on or after that week
-- Weeks are capped to the cohort's actual age (no future data points).
--
-- SOURCE TABLES
-- auth.sessions — Login session records
--
-- HOW TO READ THE OUTPUT
-- cohort_week_start The Monday of the week users first logged in
-- user_lifetime_week 0 = signup week, 1 = one week later, etc.
-- retention_rate_bounded = active_users_bounded / cohort_users
-- retention_rate_unbounded = retained_users_unbounded / cohort_users
--
-- OUTPUT COLUMNS
-- cohort_week_start DATE First day of the cohort's signup week
-- cohort_label TEXT ISO week label (e.g. '2025-W01')
-- cohort_label_n TEXT ISO week label with cohort size (e.g. '2025-W01 (n=42)')
-- user_lifetime_week INT Weeks since first login (0 = signup week)
-- cohort_users BIGINT Total users in this cohort (denominator)
-- active_users_bounded BIGINT Users active in exactly week k
-- retained_users_unbounded BIGINT Users active any time on/after week k
-- retention_rate_bounded FLOAT bounded active / cohort_users
-- retention_rate_unbounded FLOAT unbounded retained / cohort_users
-- cohort_users_w0 BIGINT cohort_users only at week 0, else 0 (safe to SUM in pivot tables)
--
-- EXAMPLE QUERIES
-- -- Week-1 retention rate per cohort
-- SELECT cohort_label, retention_rate_bounded AS w1_retention
-- FROM analytics.retention_login_weekly
-- WHERE user_lifetime_week = 1
-- ORDER BY cohort_week_start;
--
-- -- Overall average retention curve (all cohorts combined)
-- SELECT user_lifetime_week,
-- SUM(active_users_bounded)::float / NULLIF(SUM(cohort_users_w0), 0) AS avg_retention
-- FROM analytics.retention_login_weekly
-- GROUP BY 1 ORDER BY 1;
-- =============================================================
WITH params AS (SELECT 12::int AS max_weeks),
events AS (
SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
DATE_TRUNC('week', s.created_at)::date AS week_start
FROM auth.sessions s WHERE s.user_id IS NOT NULL
),
first_login AS (
SELECT user_id, MIN(created_at) AS first_login_time,
DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
FROM events GROUP BY 1
),
activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
user_week_age AS (
SELECT aw.user_id, fl.cohort_week_start,
((aw.week_start - DATE_TRUNC('week', fl.first_login_time)::date) / 7)::int AS user_lifetime_week
FROM activity_weeks aw JOIN first_login fl USING (user_id)
WHERE aw.week_start >= DATE_TRUNC('week', fl.first_login_time)::date
),
bounded_counts AS (
SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
),
last_active AS (
SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
),
unbounded_counts AS (
SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
FROM last_active la
CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
GROUP BY 1,2
),
cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
cohort_caps AS (
SELECT cs.cohort_week_start, cs.cohort_users,
LEAST((SELECT max_weeks FROM params),
GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date - cs.cohort_week_start)/7)::int)) AS cap_weeks
FROM cohort_sizes cs
),
grid AS (
SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
)
SELECT
g.cohort_week_start,
TO_CHAR(g.cohort_week_start,'IYYY-"W"IW') AS cohort_label,
TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')' AS cohort_label_n,
g.user_lifetime_week, g.cohort_users,
COALESCE(b.active_users_bounded,0) AS active_users_bounded,
COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END AS retention_rate_bounded,
CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END AS cohort_users_w0
FROM grid g
LEFT JOIN bounded_counts b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
ORDER BY g.cohort_week_start, g.user_lifetime_week

View File

@@ -0,0 +1,71 @@
-- =============================================================
-- View: analytics.user_block_spending
-- Looker source alias: ds6 | Charts: 5
-- =============================================================
-- DESCRIPTION
-- One row per credit transaction (last 90 days).
-- Shows how users spend credits broken down by block type,
-- LLM provider and model. Joins node execution stats for
-- token-level detail.
--
-- SOURCE TABLES
-- platform.CreditTransaction — Credit debit/credit records
-- platform.AgentNodeExecution — Node execution stats (for token counts)
--
-- OUTPUT COLUMNS
-- transactionKey TEXT Unique transaction identifier
-- userId TEXT User who was charged
-- amount DECIMAL Credit amount (positive = credit, negative = debit)
-- negativeAmount DECIMAL amount * -1 (convenience for spend charts)
-- transactionType TEXT Transaction type (e.g. 'USAGE', 'REFUND', 'TOP_UP')
-- transactionTime TIMESTAMPTZ When the transaction was recorded
-- blockId TEXT Block UUID that triggered the spend
-- blockName TEXT Human-readable block name
-- llm_provider TEXT LLM provider (e.g. 'openai', 'anthropic')
-- llm_model TEXT Model name (e.g. 'gpt-4o', 'claude-3-5-sonnet')
-- node_exec_id TEXT Linked node execution UUID
-- llm_call_count INT LLM API calls made in that execution
-- llm_retry_count INT LLM retries in that execution
-- llm_input_token_count INT Input tokens consumed
-- llm_output_token_count INT Output tokens produced
--
-- WINDOW
-- Rolling 90 days (createdAt > CURRENT_DATE - 90 days)
--
-- EXAMPLE QUERIES
-- -- Total spend per user (last 90 days)
-- SELECT "userId", SUM("negativeAmount") AS total_spent
-- FROM analytics.user_block_spending
-- WHERE "transactionType" = 'USAGE'
-- GROUP BY 1 ORDER BY total_spent DESC;
--
-- -- Spend by LLM provider + model
-- SELECT "llm_provider", "llm_model",
-- SUM("negativeAmount") AS total_cost,
-- SUM("llm_input_token_count") AS input_tokens,
-- SUM("llm_output_token_count") AS output_tokens
-- FROM analytics.user_block_spending
-- WHERE "llm_provider" IS NOT NULL
-- GROUP BY 1, 2 ORDER BY total_cost DESC;
-- =============================================================
SELECT
c."transactionKey" AS transactionKey,
c."userId" AS userId,
c."amount" AS amount,
c."amount" * -1 AS negativeAmount,
c."type" AS transactionType,
c."createdAt" AS transactionTime,
c.metadata->>'block_id' AS blockId,
c.metadata->>'block' AS blockName,
c.metadata->'input'->'credentials'->>'provider' AS llm_provider,
c.metadata->'input'->>'model' AS llm_model,
c.metadata->>'node_exec_id' AS node_exec_id,
(ne."stats"->>'llm_call_count')::int AS llm_call_count,
(ne."stats"->>'llm_retry_count')::int AS llm_retry_count,
(ne."stats"->>'input_token_count')::int AS llm_input_token_count,
(ne."stats"->>'output_token_count')::int AS llm_output_token_count
FROM platform."CreditTransaction" c
LEFT JOIN platform."AgentNodeExecution" ne
ON (c.metadata->>'node_exec_id') = ne."id"::text
WHERE c."createdAt" > CURRENT_DATE - INTERVAL '90 days'

View File

@@ -0,0 +1,45 @@
-- =============================================================
-- View: analytics.user_onboarding
-- Looker source alias: ds68 | Charts: 3
-- =============================================================
-- DESCRIPTION
-- One row per user onboarding record. Contains the user's
-- stated usage reason, selected integrations, completed
-- onboarding steps and optional first agent selection.
-- Full history (no date filter) since onboarding happens
-- once per user.
--
-- SOURCE TABLES
-- platform.UserOnboarding — Onboarding state per user
--
-- OUTPUT COLUMNS
-- id TEXT Onboarding record UUID
-- createdAt TIMESTAMPTZ When onboarding started
-- updatedAt TIMESTAMPTZ Last update to onboarding state
-- usageReason TEXT Why user signed up (e.g. 'work', 'personal')
-- integrations TEXT[] Array of integration names the user selected
-- userId TEXT User UUID
-- completedSteps TEXT[] Array of onboarding step enums completed
-- selectedStoreListingVersionId TEXT First marketplace agent the user chose (if any)
--
-- EXAMPLE QUERIES
-- -- Usage reason breakdown
-- SELECT "usageReason", COUNT(*) FROM analytics.user_onboarding GROUP BY 1;
--
-- -- Completion rate per step
-- SELECT step, COUNT(*) AS users_completed
-- FROM analytics.user_onboarding
-- CROSS JOIN LATERAL UNNEST("completedSteps") AS step
-- GROUP BY 1 ORDER BY users_completed DESC;
-- =============================================================
SELECT
id,
"createdAt",
"updatedAt",
"usageReason",
integrations,
"userId",
"completedSteps",
"selectedStoreListingVersionId"
FROM platform."UserOnboarding"

View File

@@ -0,0 +1,100 @@
-- =============================================================
-- View: analytics.user_onboarding_funnel
-- Looker source alias: ds74 | Charts: 1
-- =============================================================
-- DESCRIPTION
-- Pre-aggregated onboarding funnel showing how many users
-- completed each step and the drop-off percentage from the
-- previous step. One row per onboarding step (all 22 steps
-- always present, even with 0 completions — prevents sparse
-- gaps from making LAG compare the wrong predecessors).
--
-- SOURCE TABLES
-- platform.UserOnboarding — Onboarding records with completedSteps array
--
-- OUTPUT COLUMNS
-- step TEXT Onboarding step enum name (e.g. 'WELCOME', 'CONGRATS')
-- step_order INT Numeric position in the funnel (1=first, 22=last)
-- users_completed BIGINT Distinct users who completed this step
-- pct_from_prev NUMERIC % of users from the previous step who reached this one
--
-- STEP ORDER
-- 1 WELCOME 9 MARKETPLACE_VISIT 17 SCHEDULE_AGENT
-- 2 USAGE_REASON 10 MARKETPLACE_ADD_AGENT 18 RUN_AGENTS
-- 3 INTEGRATIONS 11 MARKETPLACE_RUN_AGENT 19 RUN_3_DAYS
-- 4 AGENT_CHOICE 12 BUILDER_OPEN 20 TRIGGER_WEBHOOK
-- 5 AGENT_NEW_RUN 13 BUILDER_SAVE_AGENT 21 RUN_14_DAYS
-- 6 AGENT_INPUT 14 BUILDER_RUN_AGENT 22 RUN_AGENTS_100
-- 7 CONGRATS 15 VISIT_COPILOT
-- 8 GET_RESULTS 16 RE_RUN_AGENT
--
-- WINDOW
-- Users who started onboarding in the last 90 days
--
-- EXAMPLE QUERIES
-- -- Full funnel
-- SELECT * FROM analytics.user_onboarding_funnel ORDER BY step_order;
--
-- -- Biggest drop-off point
-- SELECT step, pct_from_prev FROM analytics.user_onboarding_funnel
-- ORDER BY pct_from_prev ASC LIMIT 3;
-- =============================================================
WITH all_steps AS (
-- Complete ordered grid of all 22 steps so zero-completion steps
-- are always present, keeping LAG comparisons correct.
SELECT step_name, step_order
FROM (VALUES
('WELCOME', 1),
('USAGE_REASON', 2),
('INTEGRATIONS', 3),
('AGENT_CHOICE', 4),
('AGENT_NEW_RUN', 5),
('AGENT_INPUT', 6),
('CONGRATS', 7),
('GET_RESULTS', 8),
('MARKETPLACE_VISIT', 9),
('MARKETPLACE_ADD_AGENT', 10),
('MARKETPLACE_RUN_AGENT', 11),
('BUILDER_OPEN', 12),
('BUILDER_SAVE_AGENT', 13),
('BUILDER_RUN_AGENT', 14),
('VISIT_COPILOT', 15),
('RE_RUN_AGENT', 16),
('SCHEDULE_AGENT', 17),
('RUN_AGENTS', 18),
('RUN_3_DAYS', 19),
('TRIGGER_WEBHOOK', 20),
('RUN_14_DAYS', 21),
('RUN_AGENTS_100', 22)
) AS t(step_name, step_order)
),
raw AS (
SELECT
u."userId",
step_txt::text AS step
FROM platform."UserOnboarding" u
CROSS JOIN LATERAL UNNEST(u."completedSteps") AS step_txt
WHERE u."createdAt" >= CURRENT_DATE - INTERVAL '90 days'
),
step_counts AS (
SELECT step, COUNT(DISTINCT "userId") AS users_completed
FROM raw GROUP BY step
),
funnel AS (
SELECT
a.step_name AS step,
a.step_order,
COALESCE(sc.users_completed, 0) AS users_completed,
ROUND(
100.0 * COALESCE(sc.users_completed, 0)
/ NULLIF(
LAG(COALESCE(sc.users_completed, 0)) OVER (ORDER BY a.step_order),
0
),
2
) AS pct_from_prev
FROM all_steps a
LEFT JOIN step_counts sc ON sc.step = a.step_name
)
SELECT * FROM funnel ORDER BY step_order

View File

@@ -0,0 +1,41 @@
-- =============================================================
-- View: analytics.user_onboarding_integration
-- Looker source alias: ds75 | Charts: 1
-- =============================================================
-- DESCRIPTION
-- Pre-aggregated count of users who selected each integration
-- during onboarding. One row per integration type, sorted
-- by popularity.
--
-- SOURCE TABLES
-- platform.UserOnboarding — integrations array column
--
-- OUTPUT COLUMNS
-- integration TEXT Integration name (e.g. 'github', 'slack', 'notion')
-- users_with_integration BIGINT Distinct users who selected this integration
--
-- WINDOW
-- Users who started onboarding in the last 90 days
--
-- EXAMPLE QUERIES
-- -- Full integration popularity ranking
-- SELECT * FROM analytics.user_onboarding_integration;
--
-- -- Top 5 integrations
-- SELECT * FROM analytics.user_onboarding_integration LIMIT 5;
-- =============================================================
WITH exploded AS (
SELECT
u."userId" AS user_id,
UNNEST(u."integrations") AS integration
FROM platform."UserOnboarding" u
WHERE u."createdAt" >= CURRENT_DATE - INTERVAL '90 days'
)
SELECT
integration,
COUNT(DISTINCT user_id) AS users_with_integration
FROM exploded
WHERE integration IS NOT NULL AND integration <> ''
GROUP BY integration
ORDER BY users_with_integration DESC

View File

@@ -0,0 +1,145 @@
-- =============================================================
-- View: analytics.users_activities
-- Looker source alias: ds56 | Charts: 5
-- =============================================================
-- DESCRIPTION
-- One row per user with lifetime activity summary.
-- Joins login sessions with agent graphs, executions and
-- node-level runs to give a full picture of how engaged
-- each user is. Includes a convenience flag for 7-day
-- activation (did the user return at least 7 days after
-- their first login?).
--
-- SOURCE TABLES
-- auth.sessions — Login/session records
-- platform.AgentGraph — Graphs (agents) built by the user
-- platform.AgentGraphExecution — Agent run history
-- platform.AgentNodeExecution — Individual block execution history
--
-- PERFORMANCE NOTE
-- Each CTE aggregates its own table independently by userId.
-- This avoids the fan-out that occurs when driving every join
-- from user_logins across the two largest tables
-- (AgentGraphExecution and AgentNodeExecution).
--
-- OUTPUT COLUMNS
-- user_id TEXT Supabase user UUID
-- first_login_time TIMESTAMPTZ First ever session created_at
-- last_login_time TIMESTAMPTZ Most recent session created_at
-- last_visit_time TIMESTAMPTZ Max of last refresh or login
-- last_agent_save_time TIMESTAMPTZ Last time user saved an agent graph
-- agent_count BIGINT Number of distinct active graphs built (0 if none)
-- first_agent_run_time TIMESTAMPTZ First ever graph execution
-- last_agent_run_time TIMESTAMPTZ Most recent graph execution
-- unique_agent_runs BIGINT Distinct agent graphs ever run (0 if none)
-- agent_runs BIGINT Total graph execution count (0 if none)
-- node_execution_count BIGINT Total node executions across all runs
-- node_execution_failed BIGINT Node executions with FAILED status
-- node_execution_completed BIGINT Node executions with COMPLETED status
-- node_execution_terminated BIGINT Node executions with TERMINATED status
-- node_execution_queued BIGINT Node executions with QUEUED status
-- node_execution_running BIGINT Node executions with RUNNING status
-- is_active_after_7d INT 1=returned after day 7, 0=did not, NULL=too early to tell
-- node_execution_incomplete BIGINT Node executions with INCOMPLETE status
-- node_execution_review BIGINT Node executions with REVIEW status
--
-- EXAMPLE QUERIES
-- -- Users who ran at least one agent and returned after 7 days
-- SELECT COUNT(*) FROM analytics.users_activities
-- WHERE agent_runs > 0 AND is_active_after_7d = 1;
--
-- -- Top 10 most active users by agent runs
-- SELECT user_id, agent_runs, node_execution_count
-- FROM analytics.users_activities
-- ORDER BY agent_runs DESC LIMIT 10;
--
-- -- 7-day activation rate
-- SELECT
-- SUM(CASE WHEN is_active_after_7d = 1 THEN 1 ELSE 0 END)::float
-- / NULLIF(COUNT(CASE WHEN is_active_after_7d IS NOT NULL THEN 1 END), 0)
-- AS activation_rate
-- FROM analytics.users_activities;
-- =============================================================
WITH user_logins AS (
SELECT
user_id::text AS user_id,
MIN(created_at) AS first_login_time,
MAX(created_at) AS last_login_time,
GREATEST(
MAX(refreshed_at)::timestamptz,
MAX(created_at)::timestamptz
) AS last_visit_time
FROM auth.sessions
GROUP BY user_id
),
user_agents AS (
-- Aggregate AgentGraph directly by userId (no fan-out from user_logins)
SELECT
"userId"::text AS user_id,
MAX("updatedAt") AS last_agent_save_time,
COUNT(DISTINCT "id") AS agent_count
FROM platform."AgentGraph"
WHERE "isActive"
GROUP BY "userId"
),
user_graph_runs AS (
-- Aggregate AgentGraphExecution directly by userId
SELECT
"userId"::text AS user_id,
MIN("createdAt") AS first_agent_run_time,
MAX("createdAt") AS last_agent_run_time,
COUNT(DISTINCT "agentGraphId") AS unique_agent_runs,
COUNT("id") AS agent_runs
FROM platform."AgentGraphExecution"
GROUP BY "userId"
),
user_node_runs AS (
-- Aggregate AgentNodeExecution directly; resolve userId via a
-- single join to AgentGraphExecution instead of fanning out from
-- user_logins through both large tables.
SELECT
g."userId"::text AS user_id,
COUNT(*) AS node_execution_count,
COUNT(*) FILTER (WHERE n."executionStatus" = 'FAILED') AS node_execution_failed,
COUNT(*) FILTER (WHERE n."executionStatus" = 'COMPLETED') AS node_execution_completed,
COUNT(*) FILTER (WHERE n."executionStatus" = 'TERMINATED') AS node_execution_terminated,
COUNT(*) FILTER (WHERE n."executionStatus" = 'QUEUED') AS node_execution_queued,
COUNT(*) FILTER (WHERE n."executionStatus" = 'RUNNING') AS node_execution_running,
COUNT(*) FILTER (WHERE n."executionStatus" = 'INCOMPLETE') AS node_execution_incomplete,
COUNT(*) FILTER (WHERE n."executionStatus" = 'REVIEW') AS node_execution_review
FROM platform."AgentNodeExecution" n
JOIN platform."AgentGraphExecution" g
ON g."id" = n."agentGraphExecutionId"
GROUP BY g."userId"
)
SELECT
ul.user_id,
ul.first_login_time,
ul.last_login_time,
ul.last_visit_time,
ua.last_agent_save_time,
COALESCE(ua.agent_count, 0) AS agent_count,
gr.first_agent_run_time,
gr.last_agent_run_time,
COALESCE(gr.unique_agent_runs, 0) AS unique_agent_runs,
COALESCE(gr.agent_runs, 0) AS agent_runs,
COALESCE(nr.node_execution_count, 0) AS node_execution_count,
COALESCE(nr.node_execution_failed, 0) AS node_execution_failed,
COALESCE(nr.node_execution_completed, 0) AS node_execution_completed,
COALESCE(nr.node_execution_terminated, 0) AS node_execution_terminated,
COALESCE(nr.node_execution_queued, 0) AS node_execution_queued,
COALESCE(nr.node_execution_running, 0) AS node_execution_running,
CASE
WHEN ul.first_login_time < NOW() - INTERVAL '7 days'
AND ul.last_visit_time >= ul.first_login_time + INTERVAL '7 days' THEN 1
WHEN ul.first_login_time < NOW() - INTERVAL '7 days'
AND ul.last_visit_time < ul.first_login_time + INTERVAL '7 days' THEN 0
ELSE NULL
END AS is_active_after_7d,
COALESCE(nr.node_execution_incomplete, 0) AS node_execution_incomplete,
COALESCE(nr.node_execution_review, 0) AS node_execution_review
FROM user_logins ul
LEFT JOIN user_agents ua ON ul.user_id = ua.user_id
LEFT JOIN user_graph_runs gr ON ul.user_id = gr.user_id
LEFT JOIN user_node_runs nr ON ul.user_id = nr.user_id

File diff suppressed because it is too large Load Diff

View File

@@ -9,25 +9,25 @@ packages = [{ include = "autogpt_libs" }]
[tool.poetry.dependencies]
python = ">=3.10,<4.0"
colorama = "^0.4.6"
cryptography = "^45.0"
cryptography = "^46.0"
expiringdict = "^1.2.2"
fastapi = "^0.116.1"
google-cloud-logging = "^3.12.1"
launchdarkly-server-sdk = "^9.12.0"
pydantic = "^2.11.7"
pydantic-settings = "^2.10.1"
pyjwt = { version = "^2.10.1", extras = ["crypto"] }
fastapi = "^0.128.7"
google-cloud-logging = "^3.13.0"
launchdarkly-server-sdk = "^9.15.0"
pydantic = "^2.12.5"
pydantic-settings = "^2.12.0"
pyjwt = { version = "^2.11.0", extras = ["crypto"] }
redis = "^6.2.0"
supabase = "^2.16.0"
uvicorn = "^0.35.0"
supabase = "^2.28.0"
uvicorn = "^0.40.0"
[tool.poetry.group.dev.dependencies]
pyright = "^1.1.404"
pyright = "^1.1.408"
pytest = "^8.4.1"
pytest-asyncio = "^1.1.0"
pytest-mock = "^3.14.1"
pytest-cov = "^6.2.1"
ruff = "^0.12.11"
pytest-asyncio = "^1.3.0"
pytest-mock = "^3.15.1"
pytest-cov = "^7.1.0"
ruff = "^0.15.7"
[build-system]
requires = ["poetry-core"]

Some files were not shown because too many files have changed in this diff Show More