Compare commits

19 Commits

Author SHA1 Message Date
Toran Bruce Richards
61e9c58930 docs: tweak LLM section copy (#12664)
Minor copy tweak in the Available Language Models section.

---------

Co-authored-by: Toran Bruce Richards <22963551+Torantulino@users.noreply.github.com>
2026-04-03 14:21:40 +01:00
Toran Bruce Richards
a148c9659f docs: list model developers (not providers) in Available Language Models (#12657)
## What changed

Replaces the **Available Language Models** section in
`what-is-autogpt-platform.md`.

### Before
The list mixed model creators (OpenAI, Anthropic) with inference
providers and aggregators (Groq, AI/ML API) and linked 'Llama' to
LlamaIndex — which isn't a model developer at all.

### After
- Uses the industry-standard term **model developers** instead of 'LLM
providers'
- Lists **every** model developer whose models are in the platform's
`LlmModel` enum (`dev` branch) — 17 total
- Removes entries that are inference providers/aggregators, not model
developers: **Groq, AI/ML API, Llama (LlamaIndex)**
- Ranked by approximate industry adoption based on [OpenRouter
market-share data](https://openrouter.ai/rankings#market-share), not
alphabetically
- Each entry links to the developer's site and highlights their key
model families

### Full list (ranked by adoption)
1. Google DeepMind — Gemini 3, 2.5, and 2.0
2. Anthropic — Claude Opus, Sonnet, and Haiku
3. DeepSeek — DeepSeek R1 and V3
4. Qwen (Alibaba) — Qwen 3
5. OpenAI — GPT-5, GPT-4.1, O3, and more
6. Meta — Llama 4 and Llama 3
7. Mistral AI — Mistral Large, Medium, Small, and Codestral
8. xAI — Grok 4 and Grok 3
9. Moonshot AI — Kimi K2
10. Perplexity — Sonar, Sonar Pro, and Sonar Deep Research
11. Amazon — Nova Pro, Lite, and Micro
12. Microsoft — Phi-4 and WizardLM 2
13. Cohere — Command A and Command R
14. Nvidia — Nemotron
15. Nous Research — Hermes
16. Vercel — v0
17. Gryphe — MythoMax

---------

Co-authored-by: Toran Bruce Richards <22963551+Torantulino@users.noreply.github.com>
2026-04-03 11:55:16 +01:00
Toran Bruce Richards
e50135ca12 docs: add user-facing platform documentation (#11996)
## Summary

Resolves [SECRT-1862](https://linear.app/autogpt/issue/SECRT-1862)

Adds 11 new user-facing documentation pages for the hosted AutoGPT
Platform, filling the gap between the existing developer-focused docs
and what end users of platform.agpt.co actually need to know.

The content was sourced from a structured interview with the platform
team to capture accurate, ground-truth information about the user
experience.

### New pages

- **Getting Started (Cloud)** — onboarding for the hosted platform,
navigation overview, key concepts
- **AutoPilot** — what AutoPilot can do, how to access it, comparison
with the builder
- **Agent Builder Guide** — blocks, pins, connections, saving,
configuring, error handling
- **Agent Library** — running agents, viewing tasks, favourites,
uploading, deleting
- **Marketplace** — browsing, adding agents to library, full publishing
guide
- **Scheduling & Triggers** — recurring schedules and webhook-based
triggers
- **Templates** — saving and reusing input configurations
- **Credits & Billing** — per-block pricing, balance, viewing task costs
- **Integrations & Credentials** — OAuth/API key/password, in-context
credential flow
- **Data Flow & Execution** — execution order, pins, lists, error
handling model
- **Sharing & Exporting Agents** — task URLs, file export/import,
marketplace publishing

### Other changes

- **SUMMARY.md** restructured into clear sections: "Using the Platform",
"Self-Hosting", "Tutorials", etc.
- **what-is-autogpt-platform.md** updated to link to both cloud and
self-host getting started guides (removed outdated waitlist link)

## Test plan

- [ ] Verify GitBook renders the new pages correctly via the PR preview
link (SUMMARY.md navigation, hint blocks, tables, cross-links)
- [ ] Review content accuracy against current platform behaviour
- [ ] Check all internal doc cross-links resolve correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 09:36:52 +01:00
itsababseh
7b8b20fa4c docs: changelog for v0.6.53 — Import workflows & marketplace polish (#12562)
## Changelog for v0.6.53 (March 20–25, 2026)

### Highlights
- **Import workflows from other platforms** — n8n, Make.com, and Zapier
workflows can now be converted into AutoGPT agents
- **Marketplace UI polish** — cleaner card layout, better button
placement, overflow fixes
- **Dry-run execution mode** — test agents end-to-end without real API
calls
- **Parallel block execution** — independent blocks run simultaneously
for faster workflows

### Files changed
- `docs/platform/changelog/march-20-march-25-2026.md` — new changelog
entry
- `docs/platform/.gitbook/assets/import-workflows-hero.png` — hero image
- `docs/platform/.gitbook/assets/marketplace-ui-v053-hero.png` — hero
image
- `docs/platform/SUMMARY.md` — navigation update
- `docs/platform/changelog/README.md` — index table update

### Release

https://github.com/Significant-Gravitas/AutoGPT/releases/tag/autogpt-platform-beta-v0.6.53

---------

Co-authored-by: AutoGPT AutoPilot <autopilot@agpt.co>
Co-authored-by: Torantulino <40276179@live.napier.ac.uk>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: AutoPilot <autopilot@autogpt.com>
2026-03-25 16:02:15 -07:00
itsababseh
753ef080b6 docs: changelog for v0.6.52 (March 13–20, 2026) (#12493)
## Changelog for platform release v0.6.52

### Featured highlights (with hero images):
1. **A brand-new marketplace experience** — comprehensive UI redesign
(#12462)
2. **Track your usage at a glance** — credit charging, token rate
limiting, and usage UI (#12385)
3. **Give feedback from anywhere** — feedback button moved to header
(#12462)
4. **Smarter, personalized prompt suggestions** — business-aware
quick-action prompts (#12374)

### Also includes:
- 8 improvements (shareable prompt links, Run Now button, Jump Back In,
rich media previews, graph search, themed prompts, Nano Banana 2,
AgentMail)
- 8 bug fixes
- 7 under-the-hood changes

### Files changed:
- `docs/platform/changelog/march-13-march-20-2026.md` — new changelog
entry
- `docs/platform/changelog/README.md` — updated index table
- `docs/platform/SUMMARY.md` — updated sidebar navigation
- 4 hero images in `docs/platform/.gitbook/assets/`

Hero images generated with **Nano Banana 2** from user-provided
screenshots.

---------

Co-authored-by: AutoPilot <autopilot@agpt.co>
Co-authored-by: AutoPilot <autopilot@autogpt.com>
Co-authored-by: Torantulino <40276179@live.napier.ac.uk>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 15:52:59 -07:00
itsababseh
a64176e16a docs: add changelog for v0.6.51 (March 5–12, 2026) (#12411)
## Changelog for v0.6.51

Adds the changelog entry for the `autogpt-platform-beta-v0.6.51` release
covering **March 5–12, 2026**.

### What's included

**New changelog page**
(`docs/platform/changelog/march-5-march-12-2026.md`) highlighting:
- 📁 **Organize agents into folders** — color-coded folders in the
library
- 🔔 **Background chat notifications** — toast alerts when AutoPilot
finishes
- 🧠 **Cleaner reasoning & chat** — collapsed intermediate steps
- 📊 **Inline action outputs** — results displayed right after each step
- Plus collapsible Improvements, Fixes, and Under the Hood sections

**4 AI-generated hero images** in `docs/platform/.gitbook/assets/`:
- `folders-hero.png`
- `notifications-hero.png`
- `reasoning-hero.png`
- `outputs-hero.png`

**Updated navigation:**
- `SUMMARY.md` — new sidebar entry
- `changelog/README.md` — new row in the changelog table

Follows the exact same GitBook formatting as the previous changelog
entry (Feb 26–Mar 4).

---------

Co-authored-by: Torantulino <40276179@live.napier.ac.uk>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 16:06:27 -07:00
Toran Bruce Richards
d948bec269 docs: Add changelog for February 11-26, 2026 (v0.6.49) (#12231)
Adds the changelog entry for autogpt-platform-beta-v0.6.49, covering
February 11-26, 2026.

## Changes
- **New file:** docs/platform/changelog/february-11-february-26-2026.md
- **Updated:** docs/platform/SUMMARY.md (added nav entry)
- **Updated:** docs/platform/changelog/README.md (added table row)

## What's covered
All user-facing changes from the v0.6.49 release, written for a
non-technical audience:
- 9 headline features (MCP tools, Telegram, rebuilt builder, folders,
AutoPilot upgrade, file persistence, feature requests, goal refinement,
stop button)
- 19 improvements
- 21 bug fixes
- 11 under-the-hood items

Every PR from the
[release](https://github.com/Significant-Gravitas/AutoGPT/releases/tag/autogpt-platform-beta-v0.6.49)
is accounted for. Internal-only items (CI, test infrastructure) are in
the collapsible 'Under the hood' section.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 19:44:49 +01:00
Toran Bruce Richards
ca666b07c7 docs: Add changelog for February 26 – March 4, 2026 (v0.6.50) (#12300)
Adds the changelog entry for autogpt-platform-beta-v0.6.50, covering
February 26 – March 4, 2026.

## Changes
- **New file:** docs/platform/changelog/february-26-march-4-2026.md
- **Updated:** docs/platform/SUMMARY.md (added nav entry)
- **Updated:** docs/platform/changelog/README.md (added table row)

## What's covered
All user-facing changes from the v0.6.50 release, written for a
non-technical audience:
- 7 headline features (file uploads, cloud sandbox, browser automation,
MCP tools, text-to-speech & share, context compaction indicator,
redesigned chat input)
- 3 improvements
- 3 bug fixes
- 4 under-the-hood items

Every PR from the
[release](https://github.com/Significant-Gravitas/AutoGPT/releases/tag/autogpt-platform-beta-v0.6.50)
is accounted for. Internal-only items (test fix, .gitignore) are
omitted.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 19:37:46 +01:00
Toran Bruce Richards
b84bc469c9 docs: Add changelog section to platform docs (#12077)
## Summary

- Adds a new **Changelog** section to the platform docs sidebar
navigation
- Creates the first bi-weekly changelog entry covering January 29 –
February 11, 2026
- Establishes the directory structure (`docs/platform/changelog/`) for
future entries

### What's included in the changelog entry

The entry covers voice input, persistent file workspace, Claude Opus 4.6
upgrade, marketplace agent customization, rebuilt chat streaming (Vercel
AI SDK migration), reconnection during long tasks, agent creation UX
improvements, homepage redesign, and agent library reuse — plus
collapsible sections for smaller improvements and fixes.

### How to add future entries

1. Create a new markdown file in `docs/platform/changelog/` (e.g.,
`february-12-february-25-2026.md`)
2. Add a link to `docs/platform/SUMMARY.md` under the Changelog section (see the sketch below)
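
A hypothetical `SUMMARY.md` entry, assuming the Changelog section follows the GitBook list syntax used elsewhere in these docs (the second path is illustrative, not an actual file):

```markdown
* [Changelog](changelog/README.md)
  * [January 29 – February 11, 2026](changelog/january-29-february-11-2026.md)
  * [February 12 – February 25, 2026](changelog/february-12-february-25-2026.md)
```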

## Test plan

- [ ] Verify the Changelog section appears in the GitBook sidebar after
merge
- [ ] Verify the changelog entry renders correctly (headings, table,
collapsible sections)
- [ ] Verify navigation between the changelog index and the entry works

🤖 Generated with [Claude Code](https://claude.com/claude-code)


---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 18:26:57 +00:00
Nicholas Tindle
fdb7ff8111 docs(blocks): complete block documentation migration cleanup
Move remaining block docs to block-integrations/ subdirectory:
- Delete old docs from docs/integrations/ root
- Add new docs under docs/integrations/block-integrations/
- Add guides/ directory with LLM and voice provider docs
- Update SUMMARY.md with correct navigation structure

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 14:18:10 -06:00
Nicholas Tindle
0e42efb7d5 docs(blocks): migrate block documentation to docs/integrations
Restructure block documentation for GitBook integration:
- Move all block docs from docs/platform/blocks to docs/integrations/block-integrations
- Add generate_block_docs.py script to auto-generate documentation from block schemas
- Generate README.md with overview table of all 463 blocks
- Generate SUMMARY.md for GitBook navigation
- Add guides/ directory for LLM and voice provider documentation
- Fix bug in generate_block_docs.py where summary_path was incorrectly passed instead of summary_content

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 14:17:18 -06:00
bobby.gaffin
f2d82d8802 GITBOOK-73: No subject 2026-01-22 18:59:20 +00:00
Nicholas Tindle
446c71fec8 Merge branch 'dev' into gitbook 2026-01-15 12:59:51 -07:00
claude[bot]
ec4c2caa14 Merge remote-tracking branch 'origin/dev' into gitbook 2026-01-12 21:45:54 +00:00
Nicholas Tindle
516e8b4b25 fix: move files to the right places 2026-01-12 13:46:56 -06:00
Nicholas Tindle
e7e118b5a8 wip: fixes 2026-01-09 10:23:31 -07:00
Nicholas Tindle
92a7a7e6d6 wip: fixes 2026-01-09 10:21:06 -07:00
Nicholas Tindle
e16995347f Refactor/gitbook platform structure (#11739)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 11:17:32 -06:00
Nicholas Tindle
234d3acb4c refactor(docs): restructure platform docs for GitBook and remove MkDocs (#11738)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 11:09:17 -06:00
101 changed files with 3860 additions and 4883 deletions

View File

@@ -1,8 +1,9 @@
import asyncio
import logging
import uuid
from contextlib import asynccontextmanager
from datetime import UTC, datetime
from typing import Any
from weakref import WeakValueDictionary
from openai.types.chat import (
ChatCompletionAssistantMessageParam,
@@ -51,36 +52,28 @@ def _get_session_cache_key(session_id: str) -> str:
return f"{CHAT_SESSION_CACHE_PREFIX}{session_id}"
CHAT_SESSION_LOCK_PREFIX = "chat:session_lock:"
CHAT_SESSION_LOCK_TIMEOUT = 60 # seconds
# Session-level locks to prevent race conditions during concurrent upserts.
# Uses WeakValueDictionary to automatically garbage collect locks when no longer referenced,
# preventing unbounded memory growth while maintaining lock semantics for active sessions.
# Invalidation: Locks are auto-removed by GC when no coroutine holds a reference (after
# async with lock: completes). Explicit cleanup also occurs in delete_chat_session().
_session_locks: WeakValueDictionary[str, asyncio.Lock] = WeakValueDictionary()
_session_locks_mutex = asyncio.Lock()
@asynccontextmanager
async def _session_lock(session_id: str):
"""Distributed lock for a chat session using Redis.
async def _get_session_lock(session_id: str) -> asyncio.Lock:
"""Get or create a lock for a specific session to prevent concurrent upserts.
Provides system-wide locking across horizontally scaled backend instances
to prevent race conditions during concurrent session mutations.
Uses WeakValueDictionary for automatic cleanup: locks are garbage collected
when no coroutine holds a reference to them, preventing memory leaks from
unbounded growth of session locks.
"""
async_redis = await get_redis_async()
lock_key = _get_session_lock_key(session_id)
lock = async_redis.lock(lock_key, timeout=CHAT_SESSION_LOCK_TIMEOUT)
try:
await lock.acquire()
yield
finally:
if await lock.locked() and await lock.owned():
try:
await lock.release()
except Exception as e:
logger.warning(
f"Failed to release lock for chat session #{session_id}: {e}"
)
def _get_session_lock_key(session_id: str) -> str:
"""Get the Redis lock key for a chat session."""
return f"{CHAT_SESSION_LOCK_PREFIX}{session_id}"
async with _session_locks_mutex:
lock = _session_locks.get(session_id)
if lock is None:
lock = asyncio.Lock()
_session_locks[session_id] = lock
return lock
class ChatMessage(BaseModel):
@@ -446,8 +439,10 @@ async def upsert_chat_session(
callers are aware of the persistence failure.
RedisError: If the cache write fails (after successful DB write).
"""
# Acquire distributed session-specific lock to prevent concurrent upserts
async with _session_lock(session.session_id):
# Acquire session-specific lock to prevent concurrent upserts
lock = await _get_session_lock(session.session_id)
async with lock:
# Get existing message count from DB for incremental saves
existing_message_count = await chat_db.get_chat_session_message_count(
session.session_id
@@ -558,7 +553,7 @@ async def delete_chat_session(session_id: str, user_id: str | None = None) -> bool:
if not deleted:
return False
# Invalidate cache after DB confirms deletion
# Only invalidate cache and clean up lock after DB confirms deletion
try:
redis_key = _get_session_cache_key(session_id)
async_redis = await get_redis_async()
@@ -566,6 +561,10 @@ async def delete_chat_session(session_id: str, user_id: str | None = None) -> bool:
except Exception as e:
logger.warning(f"Failed to delete session {session_id} from cache: {e}")
# Clean up session lock (belt-and-suspenders with WeakValueDictionary)
async with _session_locks_mutex:
_session_locks.pop(session_id, None)
return True
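
The hunk above swaps the Redis-based distributed lock for an in-process, per-session `asyncio.Lock` held in a `WeakValueDictionary`. A minimal, self-contained sketch of that pattern (names here are illustrative, not the codebase's):

```python
# Sketch of per-key asyncio locks with automatic cleanup.
# A WeakValueDictionary entry disappears once no coroutine still
# references its lock, so the mapping cannot grow without bound.
import asyncio
from weakref import WeakValueDictionary

_locks: WeakValueDictionary[str, asyncio.Lock] = WeakValueDictionary()
_locks_mutex = asyncio.Lock()

async def get_lock(key: str) -> asyncio.Lock:
    # The mutex makes the check-then-create step atomic across coroutines.
    async with _locks_mutex:
        lock = _locks.get(key)
        if lock is None:
            lock = asyncio.Lock()
            _locks[key] = lock  # weak entry: kept alive by callers, not the dict
        return lock

async def main() -> None:
    async def worker(i: int) -> None:
        lock = await get_lock("session-42")
        async with lock:  # serializes the guarded section per key
            print(f"worker {i} holds the session lock")

    await asyncio.gather(*(worker(i) for i in range(3)))

asyncio.run(main())
```

Note the trade-off implied by the diff: the removed Redis lock provided system-wide locking across horizontally scaled instances, while this variant only serializes upserts within a single process.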

View File

@@ -1,28 +1,29 @@
"""Agent generator package - Creates agents from natural language."""
from .core import (
AgentGeneratorNotConfiguredError,
apply_agent_patch,
decompose_goal,
generate_agent,
generate_agent_patch,
get_agent_as_json,
json_to_graph,
save_agent_to_library,
)
from .service import health_check as check_external_service_health
from .service import is_external_service_configured
from .fixer import apply_all_fixes
from .utils import get_blocks_info
from .validator import validate_agent
__all__ = [
# Core functions
"decompose_goal",
"generate_agent",
"generate_agent_patch",
"apply_agent_patch",
"save_agent_to_library",
"get_agent_as_json",
"json_to_graph",
# Exceptions
"AgentGeneratorNotConfiguredError",
# Service
"is_external_service_configured",
"check_external_service_health",
# Fixer
"apply_all_fixes",
# Validator
"validate_agent",
# Utils
"get_blocks_info",
]

View File

@@ -0,0 +1,25 @@
"""OpenRouter client configuration for agent generation."""
import os
from openai import AsyncOpenAI
# Configuration - use OPEN_ROUTER_API_KEY for consistency with chat/config.py
OPENROUTER_API_KEY = os.getenv("OPEN_ROUTER_API_KEY")
AGENT_GENERATOR_MODEL = os.getenv("AGENT_GENERATOR_MODEL", "anthropic/claude-opus-4.5")
# OpenRouter client (OpenAI-compatible API)
_client: AsyncOpenAI | None = None
def get_client() -> AsyncOpenAI:
"""Get or create the OpenRouter client."""
global _client
if _client is None:
if not OPENROUTER_API_KEY:
raise ValueError("OPENROUTER_API_KEY environment variable is required")
_client = AsyncOpenAI(
base_url="https://openrouter.ai/api/v1",
api_key=OPENROUTER_API_KEY,
)
return _client
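
A hedged usage sketch of the factory above, assuming `OPEN_ROUTER_API_KEY` is set in the environment; `AsyncOpenAI` speaks the OpenAI-compatible API that OpenRouter exposes:

```python
# Illustrative only: exercise get_client() from the module above.
import asyncio

async def main() -> None:
    client = get_client()  # lazily constructs the shared AsyncOpenAI instance
    response = await client.chat.completions.create(
        model=AGENT_GENERATOR_MODEL,  # defaults to anthropic/claude-opus-4.5
        messages=[{"role": "user", "content": "Say hello."}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```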

View File

@@ -1,5 +1,7 @@
"""Core agent generation functions."""
import copy
import json
import logging
import uuid
from typing import Any
@@ -7,35 +9,13 @@ from typing import Any
from backend.api.features.library import db as library_db
from backend.data.graph import Graph, Link, Node, create_graph
from .service import (
decompose_goal_external,
generate_agent_external,
generate_agent_patch_external,
is_external_service_configured,
)
from .client import AGENT_GENERATOR_MODEL, get_client
from .prompts import DECOMPOSITION_PROMPT, GENERATION_PROMPT, PATCH_PROMPT
from .utils import get_block_summaries, parse_json_from_llm
logger = logging.getLogger(__name__)
class AgentGeneratorNotConfiguredError(Exception):
"""Raised when the external Agent Generator service is not configured."""
pass
def _check_service_configured() -> None:
"""Check if the external Agent Generator service is configured.
Raises:
AgentGeneratorNotConfiguredError: If the service is not configured.
"""
if not is_external_service_configured():
raise AgentGeneratorNotConfiguredError(
"Agent Generator service is not configured. "
"Set AGENTGENERATOR_HOST environment variable to enable agent generation."
)
async def decompose_goal(description: str, context: str = "") -> dict[str, Any] | None:
"""Break down a goal into steps or return clarifying questions.
@@ -48,13 +28,40 @@ async def decompose_goal(description: str, context: str = "") -> dict[str, Any]
- {"type": "clarifying_questions", "questions": [...]}
- {"type": "instructions", "steps": [...]}
Or None on error
Raises:
AgentGeneratorNotConfiguredError: If the external service is not configured.
"""
_check_service_configured()
logger.info("Calling external Agent Generator service for decompose_goal")
return await decompose_goal_external(description, context)
client = get_client()
prompt = DECOMPOSITION_PROMPT.format(block_summaries=get_block_summaries())
full_description = description
if context:
full_description = f"{description}\n\nAdditional context:\n{context}"
try:
response = await client.chat.completions.create(
model=AGENT_GENERATOR_MODEL,
messages=[
{"role": "system", "content": prompt},
{"role": "user", "content": full_description},
],
temperature=0,
)
content = response.choices[0].message.content
if content is None:
logger.error("LLM returned empty content for decomposition")
return None
result = parse_json_from_llm(content)
if result is None:
logger.error(f"Failed to parse decomposition response: {content[:200]}")
return None
return result
except Exception as e:
logger.error(f"Error decomposing goal: {e}")
return None
async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
@@ -65,14 +72,31 @@ async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
Returns:
Agent JSON dict or None on error
Raises:
AgentGeneratorNotConfiguredError: If the external service is not configured.
"""
_check_service_configured()
logger.info("Calling external Agent Generator service for generate_agent")
result = await generate_agent_external(instructions)
if result:
client = get_client()
prompt = GENERATION_PROMPT.format(block_summaries=get_block_summaries())
try:
response = await client.chat.completions.create(
model=AGENT_GENERATOR_MODEL,
messages=[
{"role": "system", "content": prompt},
{"role": "user", "content": json.dumps(instructions, indent=2)},
],
temperature=0,
)
content = response.choices[0].message.content
if content is None:
logger.error("LLM returned empty content for agent generation")
return None
result = parse_json_from_llm(content)
if result is None:
logger.error(f"Failed to parse agent JSON: {content[:200]}")
return None
# Ensure required fields
if "id" not in result:
result["id"] = str(uuid.uuid4())
@@ -80,7 +104,12 @@ async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
result["version"] = 1
if "is_active" not in result:
result["is_active"] = True
return result
return result
except Exception as e:
logger.error(f"Error generating agent: {e}")
return None
def json_to_graph(agent_json: dict[str, Any]) -> Graph:
@@ -255,23 +284,108 @@ async def get_agent_as_json(
async def generate_agent_patch(
update_request: str, current_agent: dict[str, Any]
) -> dict[str, Any] | None:
"""Update an existing agent using natural language.
The external Agent Generator service handles:
- Generating the patch
- Applying the patch
- Fixing and validating the result
"""Generate a patch to update an existing agent.
Args:
update_request: Natural language description of changes
current_agent: Current agent JSON
Returns:
Updated agent JSON, clarifying questions dict, or None on error
Raises:
AgentGeneratorNotConfiguredError: If the external service is not configured.
Patch dict or clarifying questions, or None on error
"""
_check_service_configured()
logger.info("Calling external Agent Generator service for generate_agent_patch")
return await generate_agent_patch_external(update_request, current_agent)
client = get_client()
prompt = PATCH_PROMPT.format(
current_agent=json.dumps(current_agent, indent=2),
block_summaries=get_block_summaries(),
)
try:
response = await client.chat.completions.create(
model=AGENT_GENERATOR_MODEL,
messages=[
{"role": "system", "content": prompt},
{"role": "user", "content": update_request},
],
temperature=0,
)
content = response.choices[0].message.content
if content is None:
logger.error("LLM returned empty content for patch generation")
return None
return parse_json_from_llm(content)
except Exception as e:
logger.error(f"Error generating patch: {e}")
return None
def apply_agent_patch(
current_agent: dict[str, Any], patch: dict[str, Any]
) -> dict[str, Any]:
"""Apply a patch to an existing agent.
Args:
current_agent: Current agent JSON
patch: Patch dict with operations
Returns:
Updated agent JSON
"""
agent = copy.deepcopy(current_agent)
patches = patch.get("patches", [])
for p in patches:
patch_type = p.get("type")
if patch_type == "modify":
node_id = p.get("node_id")
changes = p.get("changes", {})
for node in agent.get("nodes", []):
if node["id"] == node_id:
_deep_update(node, changes)
logger.debug(f"Modified node {node_id}")
break
elif patch_type == "add":
new_nodes = p.get("new_nodes", [])
new_links = p.get("new_links", [])
agent["nodes"] = agent.get("nodes", []) + new_nodes
agent["links"] = agent.get("links", []) + new_links
logger.debug(f"Added {len(new_nodes)} nodes, {len(new_links)} links")
elif patch_type == "remove":
node_ids_to_remove = set(p.get("node_ids", []))
link_ids_to_remove = set(p.get("link_ids", []))
# Remove nodes
agent["nodes"] = [
n for n in agent.get("nodes", []) if n["id"] not in node_ids_to_remove
]
# Remove links (both explicit and those referencing removed nodes)
agent["links"] = [
link
for link in agent.get("links", [])
if link["id"] not in link_ids_to_remove
and link["source_id"] not in node_ids_to_remove
and link["sink_id"] not in node_ids_to_remove
]
logger.debug(
f"Removed {len(node_ids_to_remove)} nodes, {len(link_ids_to_remove)} links"
)
return agent
def _deep_update(target: dict, source: dict) -> None:
"""Recursively update a dict with another dict."""
for key, value in source.items():
if key in target and isinstance(target[key], dict) and isinstance(value, dict):
_deep_update(target[key], value)
else:
target[key] = value
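
A toy walk-through of `apply_agent_patch` with a single `modify` operation; the IDs and field values here are made up for illustration:

```python
# apply_agent_patch deep-copies the agent, so the original is untouched.
agent = {
    "id": "agent-1",
    "nodes": [{"id": "node-1", "input_default": {"prompt": "old"}, "metadata": {}}],
    "links": [],
}
patch = {
    "type": "patch",
    "intent": "Update the prompt on node-1",
    "patches": [
        {
            "type": "modify",
            "node_id": "node-1",
            "changes": {"input_default": {"prompt": "new"}},  # merged via _deep_update
        }
    ],
}
updated = apply_agent_patch(agent, patch)
assert updated["nodes"][0]["input_default"]["prompt"] == "new"
assert agent["nodes"][0]["input_default"]["prompt"] == "old"
```

Because `_deep_update` recurses into nested dicts, sibling keys under `input_default` survive a partial `changes` payload rather than being overwritten wholesale.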

View File

@@ -0,0 +1,606 @@
"""Agent fixer - Fixes common LLM generation errors."""
import logging
import re
import uuid
from typing import Any
from .utils import (
ADDTODICTIONARY_BLOCK_ID,
ADDTOLIST_BLOCK_ID,
CODE_EXECUTION_BLOCK_ID,
CONDITION_BLOCK_ID,
CREATEDICT_BLOCK_ID,
CREATELIST_BLOCK_ID,
DATA_SAMPLING_BLOCK_ID,
DOUBLE_CURLY_BRACES_BLOCK_IDS,
GET_CURRENT_DATE_BLOCK_ID,
STORE_VALUE_BLOCK_ID,
UNIVERSAL_TYPE_CONVERTER_BLOCK_ID,
get_blocks_info,
is_valid_uuid,
)
logger = logging.getLogger(__name__)
def fix_agent_ids(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix invalid UUIDs in agent and link IDs."""
# Fix agent ID
if not is_valid_uuid(agent.get("id", "")):
agent["id"] = str(uuid.uuid4())
logger.debug(f"Fixed agent ID: {agent['id']}")
# Fix node IDs
id_mapping = {} # Old ID -> New ID
for node in agent.get("nodes", []):
if not is_valid_uuid(node.get("id", "")):
old_id = node.get("id", "")
new_id = str(uuid.uuid4())
id_mapping[old_id] = new_id
node["id"] = new_id
logger.debug(f"Fixed node ID: {old_id} -> {new_id}")
# Fix link IDs and update references
for link in agent.get("links", []):
if not is_valid_uuid(link.get("id", "")):
link["id"] = str(uuid.uuid4())
logger.debug(f"Fixed link ID: {link['id']}")
# Update source/sink IDs if they were remapped
if link.get("source_id") in id_mapping:
link["source_id"] = id_mapping[link["source_id"]]
if link.get("sink_id") in id_mapping:
link["sink_id"] = id_mapping[link["sink_id"]]
return agent
def fix_double_curly_braces(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix single curly braces to double in template blocks."""
for node in agent.get("nodes", []):
if node.get("block_id") not in DOUBLE_CURLY_BRACES_BLOCK_IDS:
continue
input_data = node.get("input_default", {})
for key in ("prompt", "format"):
if key in input_data and isinstance(input_data[key], str):
original = input_data[key]
# Fix simple variable references: {var} -> {{var}}
fixed = re.sub(
r"(?<!\{)\{([a-zA-Z_][a-zA-Z0-9_]*)\}(?!\})",
r"{{\1}}",
original,
)
if fixed != original:
input_data[key] = fixed
logger.debug(f"Fixed curly braces in {key}")
return agent
def fix_storevalue_before_condition(agent: dict[str, Any]) -> dict[str, Any]:
"""Add StoreValueBlock before ConditionBlock if needed for value2."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
# Find all ConditionBlock nodes
condition_node_ids = {
node["id"] for node in nodes if node.get("block_id") == CONDITION_BLOCK_ID
}
if not condition_node_ids:
return agent
new_nodes = []
new_links = []
processed_conditions = set()
for link in links:
sink_id = link.get("sink_id")
sink_name = link.get("sink_name")
# Check if this link goes to a ConditionBlock's value2
if sink_id in condition_node_ids and sink_name == "value2":
source_node = next(
(n for n in nodes if n["id"] == link.get("source_id")), None
)
# Skip if source is already a StoreValueBlock
if source_node and source_node.get("block_id") == STORE_VALUE_BLOCK_ID:
continue
# Skip if we already processed this condition
if sink_id in processed_conditions:
continue
processed_conditions.add(sink_id)
# Create StoreValueBlock
store_node_id = str(uuid.uuid4())
store_node = {
"id": store_node_id,
"block_id": STORE_VALUE_BLOCK_ID,
"input_default": {"data": None},
"metadata": {"position": {"x": 0, "y": -100}},
}
new_nodes.append(store_node)
# Create link: original source -> StoreValueBlock
new_links.append(
{
"id": str(uuid.uuid4()),
"source_id": link["source_id"],
"source_name": link["source_name"],
"sink_id": store_node_id,
"sink_name": "input",
"is_static": False,
}
)
# Update original link: StoreValueBlock -> ConditionBlock
link["source_id"] = store_node_id
link["source_name"] = "output"
logger.debug(f"Added StoreValueBlock before ConditionBlock {sink_id}")
if new_nodes:
agent["nodes"] = nodes + new_nodes
return agent
def fix_addtolist_blocks(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix AddToList blocks by adding prerequisite empty AddToList block.
When an AddToList block is found:
1. Checks if there's a CreateListBlock before it
2. Removes CreateListBlock if linked directly to AddToList
3. Adds an empty AddToList block before the original
4. Ensures the original has a self-referencing link
"""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
new_nodes = []
original_addtolist_ids = set()
nodes_to_remove = set()
links_to_remove = []
# First pass: identify CreateListBlock nodes to remove
for link in links:
source_node = next(
(n for n in nodes if n.get("id") == link.get("source_id")), None
)
sink_node = next((n for n in nodes if n.get("id") == link.get("sink_id")), None)
if (
source_node
and sink_node
and source_node.get("block_id") == CREATELIST_BLOCK_ID
and sink_node.get("block_id") == ADDTOLIST_BLOCK_ID
):
nodes_to_remove.add(source_node.get("id"))
links_to_remove.append(link)
logger.debug(f"Removing CreateListBlock {source_node.get('id')}")
# Second pass: process AddToList blocks
filtered_nodes = []
for node in nodes:
if node.get("id") in nodes_to_remove:
continue
if node.get("block_id") == ADDTOLIST_BLOCK_ID:
original_addtolist_ids.add(node.get("id"))
node_id = node.get("id")
pos = node.get("metadata", {}).get("position", {"x": 0, "y": 0})
# Check if already has prerequisite
has_prereq = any(
link.get("sink_id") == node_id
and link.get("sink_name") == "list"
and link.get("source_name") == "updated_list"
for link in links
)
if not has_prereq:
# Remove links to "list" input (except self-reference)
for link in links:
if (
link.get("sink_id") == node_id
and link.get("sink_name") == "list"
and link.get("source_id") != node_id
and link not in links_to_remove
):
links_to_remove.append(link)
# Create prerequisite AddToList block
prereq_id = str(uuid.uuid4())
prereq_node = {
"id": prereq_id,
"block_id": ADDTOLIST_BLOCK_ID,
"input_default": {"list": [], "entry": None, "entries": []},
"metadata": {
"position": {"x": pos.get("x", 0) - 800, "y": pos.get("y", 0)}
},
}
new_nodes.append(prereq_node)
# Link prerequisite to original
links.append(
{
"id": str(uuid.uuid4()),
"source_id": prereq_id,
"source_name": "updated_list",
"sink_id": node_id,
"sink_name": "list",
"is_static": False,
}
)
logger.debug(f"Added prerequisite AddToList block for {node_id}")
filtered_nodes.append(node)
# Remove marked links
filtered_links = [link for link in links if link not in links_to_remove]
# Add self-referencing links for original AddToList blocks
for node in filtered_nodes + new_nodes:
if (
node.get("block_id") == ADDTOLIST_BLOCK_ID
and node.get("id") in original_addtolist_ids
):
node_id = node.get("id")
has_self_ref = any(
link["source_id"] == node_id
and link["sink_id"] == node_id
and link["source_name"] == "updated_list"
and link["sink_name"] == "list"
for link in filtered_links
)
if not has_self_ref:
filtered_links.append(
{
"id": str(uuid.uuid4()),
"source_id": node_id,
"source_name": "updated_list",
"sink_id": node_id,
"sink_name": "list",
"is_static": False,
}
)
logger.debug(f"Added self-reference for AddToList {node_id}")
agent["nodes"] = filtered_nodes + new_nodes
agent["links"] = filtered_links
return agent
def fix_addtodictionary_blocks(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix AddToDictionary blocks by removing empty CreateDictionary nodes."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
nodes_to_remove = set()
links_to_remove = []
for link in links:
source_node = next(
(n for n in nodes if n.get("id") == link.get("source_id")), None
)
sink_node = next((n for n in nodes if n.get("id") == link.get("sink_id")), None)
if (
source_node
and sink_node
and source_node.get("block_id") == CREATEDICT_BLOCK_ID
and sink_node.get("block_id") == ADDTODICTIONARY_BLOCK_ID
):
nodes_to_remove.add(source_node.get("id"))
links_to_remove.append(link)
logger.debug(f"Removing CreateDictionary {source_node.get('id')}")
agent["nodes"] = [n for n in nodes if n.get("id") not in nodes_to_remove]
agent["links"] = [link for link in links if link not in links_to_remove]
return agent
def fix_code_execution_output(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix CodeExecutionBlock output: change 'response' to 'stdout_logs'."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
for link in links:
source_node = next(
(n for n in nodes if n.get("id") == link.get("source_id")), None
)
if (
source_node
and source_node.get("block_id") == CODE_EXECUTION_BLOCK_ID
and link.get("source_name") == "response"
):
link["source_name"] = "stdout_logs"
logger.debug("Fixed CodeExecutionBlock output: response -> stdout_logs")
return agent
def fix_data_sampling_sample_size(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix DataSamplingBlock by setting sample_size to 1 as default."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
links_to_remove = []
for node in nodes:
if node.get("block_id") == DATA_SAMPLING_BLOCK_ID:
node_id = node.get("id")
input_default = node.get("input_default", {})
# Remove links to sample_size
for link in links:
if (
link.get("sink_id") == node_id
and link.get("sink_name") == "sample_size"
):
links_to_remove.append(link)
# Set default
input_default["sample_size"] = 1
node["input_default"] = input_default
logger.debug(f"Fixed DataSamplingBlock {node_id} sample_size to 1")
if links_to_remove:
agent["links"] = [link for link in links if link not in links_to_remove]
return agent
def fix_node_x_coordinates(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix node x-coordinates to ensure 800+ unit spacing between linked nodes."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
node_lookup = {n.get("id"): n for n in nodes}
for link in links:
source_id = link.get("source_id")
sink_id = link.get("sink_id")
source_node = node_lookup.get(source_id)
sink_node = node_lookup.get(sink_id)
if not source_node or not sink_node:
continue
source_pos = source_node.get("metadata", {}).get("position", {})
sink_pos = sink_node.get("metadata", {}).get("position", {})
source_x = source_pos.get("x", 0)
sink_x = sink_pos.get("x", 0)
if abs(sink_x - source_x) < 800:
new_x = source_x + 800
if "metadata" not in sink_node:
sink_node["metadata"] = {}
if "position" not in sink_node["metadata"]:
sink_node["metadata"]["position"] = {}
sink_node["metadata"]["position"]["x"] = new_x
logger.debug(f"Fixed node {sink_id} x: {sink_x} -> {new_x}")
return agent
def fix_getcurrentdate_offset(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix GetCurrentDateBlock offset to ensure it's positive."""
for node in agent.get("nodes", []):
if node.get("block_id") == GET_CURRENT_DATE_BLOCK_ID:
input_default = node.get("input_default", {})
if "offset" in input_default:
offset = input_default["offset"]
if isinstance(offset, (int, float)) and offset < 0:
input_default["offset"] = abs(offset)
logger.debug(f"Fixed offset: {offset} -> {abs(offset)}")
return agent
def fix_ai_model_parameter(
agent: dict[str, Any],
blocks_info: list[dict[str, Any]],
default_model: str = "gpt-4o",
) -> dict[str, Any]:
"""Add default model parameter to AI blocks if missing."""
block_map = {b.get("id"): b for b in blocks_info}
for node in agent.get("nodes", []):
block_id = node.get("block_id")
block = block_map.get(block_id)
if not block:
continue
# Check if block has AI category
categories = block.get("categories", [])
is_ai_block = any(
cat.get("category") == "AI" for cat in categories if isinstance(cat, dict)
)
if is_ai_block:
input_default = node.get("input_default", {})
if "model" not in input_default:
input_default["model"] = default_model
node["input_default"] = input_default
logger.debug(
f"Added model '{default_model}' to AI block {node.get('id')}"
)
return agent
def fix_link_static_properties(
agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> dict[str, Any]:
"""Fix is_static property based on source block's staticOutput."""
block_map = {b.get("id"): b for b in blocks_info}
node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
for link in agent.get("links", []):
source_node = node_lookup.get(link.get("source_id"))
if not source_node:
continue
source_block = block_map.get(source_node.get("block_id"))
if not source_block:
continue
static_output = source_block.get("staticOutput", False)
if link.get("is_static") != static_output:
link["is_static"] = static_output
logger.debug(f"Fixed link {link.get('id')} is_static to {static_output}")
return agent
def fix_data_type_mismatch(
agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> dict[str, Any]:
"""Fix data type mismatches by inserting UniversalTypeConverterBlock."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
block_map = {b.get("id"): b for b in blocks_info}
node_lookup = {n.get("id"): n for n in nodes}
def get_property_type(schema: dict, name: str) -> str | None:
if "_#_" in name:
parent, child = name.split("_#_", 1)
parent_schema = schema.get(parent, {})
if "properties" in parent_schema:
return parent_schema["properties"].get(child, {}).get("type")
return None
return schema.get(name, {}).get("type")
def are_types_compatible(src: str, sink: str) -> bool:
if {src, sink} <= {"integer", "number"}:
return True
return src == sink
type_mapping = {
"string": "string",
"text": "string",
"integer": "number",
"number": "number",
"float": "number",
"boolean": "boolean",
"bool": "boolean",
"array": "list",
"list": "list",
"object": "dictionary",
"dict": "dictionary",
"dictionary": "dictionary",
}
new_links = []
nodes_to_add = []
for link in links:
source_node = node_lookup.get(link.get("source_id"))
sink_node = node_lookup.get(link.get("sink_id"))
if not source_node or not sink_node:
new_links.append(link)
continue
source_block = block_map.get(source_node.get("block_id"))
sink_block = block_map.get(sink_node.get("block_id"))
if not source_block or not sink_block:
new_links.append(link)
continue
source_outputs = source_block.get("outputSchema", {}).get("properties", {})
sink_inputs = sink_block.get("inputSchema", {}).get("properties", {})
source_type = get_property_type(source_outputs, link.get("source_name", ""))
sink_type = get_property_type(sink_inputs, link.get("sink_name", ""))
if (
source_type
and sink_type
and not are_types_compatible(source_type, sink_type)
):
# Insert type converter
converter_id = str(uuid.uuid4())
target_type = type_mapping.get(sink_type, sink_type)
converter_node = {
"id": converter_id,
"block_id": UNIVERSAL_TYPE_CONVERTER_BLOCK_ID,
"input_default": {"type": target_type},
"metadata": {"position": {"x": 0, "y": 100}},
}
nodes_to_add.append(converter_node)
# source -> converter
new_links.append(
{
"id": str(uuid.uuid4()),
"source_id": link["source_id"],
"source_name": link["source_name"],
"sink_id": converter_id,
"sink_name": "value",
"is_static": False,
}
)
# converter -> sink
new_links.append(
{
"id": str(uuid.uuid4()),
"source_id": converter_id,
"source_name": "value",
"sink_id": link["sink_id"],
"sink_name": link["sink_name"],
"is_static": False,
}
)
logger.debug(f"Inserted type converter: {source_type} -> {target_type}")
else:
new_links.append(link)
if nodes_to_add:
agent["nodes"] = nodes + nodes_to_add
agent["links"] = new_links
return agent
def apply_all_fixes(
agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
) -> dict[str, Any]:
"""Apply all fixes to an agent JSON.
Args:
agent: Agent JSON dict
blocks_info: Optional list of block info dicts for advanced fixes
Returns:
Fixed agent JSON
"""
# Basic fixes (no block info needed)
agent = fix_agent_ids(agent)
agent = fix_double_curly_braces(agent)
agent = fix_storevalue_before_condition(agent)
agent = fix_addtolist_blocks(agent)
agent = fix_addtodictionary_blocks(agent)
agent = fix_code_execution_output(agent)
agent = fix_data_sampling_sample_size(agent)
agent = fix_node_x_coordinates(agent)
agent = fix_getcurrentdate_offset(agent)
# Advanced fixes (require block info)
if blocks_info is None:
blocks_info = get_blocks_info()
agent = fix_ai_model_parameter(agent, blocks_info)
agent = fix_link_static_properties(agent, blocks_info)
agent = fix_data_type_mismatch(agent, blocks_info)
return agent
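
A minimal sketch of running the fix pipeline; passing an empty `blocks_info` list skips the block-registry lookup so the example stays self-contained (that shortcut is an assumption for illustration, not how the platform invokes it):

```python
# The basic fixes need no block metadata; an invalid agent ID gets regenerated.
raw_agent = {"id": "not-a-uuid", "nodes": [], "links": []}
fixed = apply_all_fixes(raw_agent, blocks_info=[])
assert is_valid_uuid(fixed["id"])  # fix_agent_ids assigned a fresh UUID v4
```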

View File

@@ -0,0 +1,225 @@
"""Prompt templates for agent generation."""
DECOMPOSITION_PROMPT = """
You are an expert AutoGPT Workflow Decomposer. Your task is to analyze a user's high-level goal and break it down into a clear, step-by-step plan using the available blocks.
Each step should represent a distinct, automatable action suitable for execution by an AI automation system.
---
FIRST: Analyze the user's goal and determine:
1) Design-time configuration (fixed settings that won't change per run)
2) Runtime inputs (values the agent's end-user will provide each time it runs)
For anything that can vary per run (email addresses, names, dates, search terms, etc.):
- DO NOT ask for the actual value
- Instead, define it as an Agent Input with a clear name, type, and description
Only ask clarifying questions about design-time config that affects how you build the workflow:
- Which external service to use (e.g., "Gmail vs Outlook", "Notion vs Google Docs")
- Required formats or structures (e.g., "CSV, JSON, or PDF output?")
- Business rules that must be hard-coded
IMPORTANT CLARIFICATIONS POLICY:
- Ask no more than five essential questions
- Do not ask for concrete values that can be provided at runtime as Agent Inputs
- Do not ask for API keys or credentials; the platform handles those directly
- If there is enough information to infer reasonable defaults, prefer to propose defaults
---
GUIDELINES:
1. List each step as a numbered item
2. Describe the action clearly and specify inputs/outputs
3. Ensure steps are in logical, sequential order
4. Mention block names naturally (e.g., "Use GetWeatherByLocationBlock to...")
5. Help the user reach their goal efficiently
---
RULES:
1. OUTPUT FORMAT: Only output either clarifying questions OR step-by-step instructions, not both
2. USE ONLY THE BLOCKS PROVIDED
3. ALL required_input fields must be provided
4. Data types of linked properties must match
5. Write expert-level prompts for AI-related blocks
---
CRITICAL BLOCK RESTRICTIONS:
1. AddToListBlock: Outputs updated list EVERY addition, not after all additions
2. SendEmailBlock: Draft the email for user review; set SMTP config based on email type
3. ConditionBlock: value2 is reference, value1 is contrast
4. CodeExecutionBlock: DO NOT USE - use AI blocks instead
5. ReadCsvBlock: Only use the 'rows' output, not 'row'
---
OUTPUT FORMAT:
If more information is needed:
```json
{{
"type": "clarifying_questions",
"questions": [
{{
"question": "Which email provider should be used? (Gmail, Outlook, custom SMTP)",
"keyword": "email_provider",
"example": "Gmail"
}}
]
}}
```
If ready to proceed:
```json
{{
"type": "instructions",
"steps": [
{{
"step_number": 1,
"block_name": "AgentShortTextInputBlock",
"description": "Get the URL of the content to analyze.",
"inputs": [{{"name": "name", "value": "URL"}}],
"outputs": [{{"name": "result", "description": "The URL entered by user"}}]
}}
]
}}
```
---
AVAILABLE BLOCKS:
{block_summaries}
"""
GENERATION_PROMPT = """
You are an expert AI workflow builder. Generate a valid agent JSON from the given instructions.
---
NODES:
Each node must include:
- `id`: Unique UUID v4 (e.g. `a8f5b1e2-c3d4-4e5f-8a9b-0c1d2e3f4a5b`)
- `block_id`: The block identifier (must match an Allowed Block)
- `input_default`: Dict of inputs (can be empty if no static inputs needed)
- `metadata`: Must contain:
- `position`: {{"x": number, "y": number}} - adjacent nodes should differ by 800+ in X
- `customized_name`: Clear name describing this block's purpose in the workflow
---
LINKS:
Each link connects a source node's output to a sink node's input:
- `id`: MUST be UUID v4 (NOT "link-1", "link-2", etc.)
- `source_id`: ID of the source node
- `source_name`: Output field name from the source block
- `sink_id`: ID of the sink node
- `sink_name`: Input field name on the sink block
- `is_static`: true only if source block has static_output: true
CRITICAL: All IDs must be valid UUID v4 format!
---
AGENT (GRAPH):
Wrap nodes and links in:
- `id`: UUID of the agent
- `name`: Short, generic name (avoid specific company names, URLs)
- `description`: Short, generic description
- `nodes`: List of all nodes
- `links`: List of all links
- `version`: 1
- `is_active`: true
---
TIPS:
- All required_input fields must be provided via input_default or a valid link
- Ensure consistent source_id and sink_id references
- Avoid dangling links
- Input/output pins must match block schemas
- Do not invent unknown block_ids
---
ALLOWED BLOCKS:
{block_summaries}
---
Generate the complete agent JSON. Output ONLY valid JSON, no explanation.
"""
PATCH_PROMPT = """
You are an expert at modifying AutoGPT agent workflows. Given the current agent and a modification request, generate a JSON patch to update the agent.
CURRENT AGENT:
{current_agent}
AVAILABLE BLOCKS:
{block_summaries}
---
PATCH FORMAT:
Return a JSON object with the following structure:
```json
{{
"type": "patch",
"intent": "Brief description of what the patch does",
"patches": [
{{
"type": "modify",
"node_id": "uuid-of-node-to-modify",
"changes": {{
"input_default": {{"field": "new_value"}},
"metadata": {{"customized_name": "New Name"}}
}}
}},
{{
"type": "add",
"new_nodes": [
{{
"id": "new-uuid",
"block_id": "block-uuid",
"input_default": {{}},
"metadata": {{"position": {{"x": 0, "y": 0}}, "customized_name": "Name"}}
}}
],
"new_links": [
{{
"id": "link-uuid",
"source_id": "source-node-id",
"source_name": "output_field",
"sink_id": "sink-node-id",
"sink_name": "input_field"
}}
]
}},
{{
"type": "remove",
"node_ids": ["uuid-of-node-to-remove"],
"link_ids": ["uuid-of-link-to-remove"]
}}
]
}}
```
If you need more information, return:
```json
{{
"type": "clarifying_questions",
"questions": [
{{
"question": "What specific change do you want?",
"keyword": "change_type",
"example": "Add error handling"
}}
]
}}
```
Generate the minimal patch needed. Output ONLY valid JSON.
"""

View File

@@ -1,269 +0,0 @@
"""External Agent Generator service client.
This module provides a client for communicating with the external Agent Generator
microservice. When AGENTGENERATOR_HOST is configured, the agent generation functions
will delegate to the external service instead of using the built-in LLM-based implementation.
"""
import logging
from typing import Any
import httpx
from backend.util.settings import Settings
logger = logging.getLogger(__name__)
_client: httpx.AsyncClient | None = None
_settings: Settings | None = None
def _get_settings() -> Settings:
"""Get or create settings singleton."""
global _settings
if _settings is None:
_settings = Settings()
return _settings
def is_external_service_configured() -> bool:
"""Check if external Agent Generator service is configured."""
settings = _get_settings()
return bool(settings.config.agentgenerator_host)
def _get_base_url() -> str:
"""Get the base URL for the external service."""
settings = _get_settings()
host = settings.config.agentgenerator_host
port = settings.config.agentgenerator_port
return f"http://{host}:{port}"
def _get_client() -> httpx.AsyncClient:
"""Get or create the HTTP client for the external service."""
global _client
if _client is None:
settings = _get_settings()
_client = httpx.AsyncClient(
base_url=_get_base_url(),
timeout=httpx.Timeout(settings.config.agentgenerator_timeout),
)
return _client
async def decompose_goal_external(
description: str, context: str = ""
) -> dict[str, Any] | None:
"""Call the external service to decompose a goal.
Args:
description: Natural language goal description
context: Additional context (e.g., answers to previous questions)
Returns:
Dict with either:
- {"type": "clarifying_questions", "questions": [...]}
- {"type": "instructions", "steps": [...]}
- {"type": "unachievable_goal", ...}
- {"type": "vague_goal", ...}
Or None on error
"""
client = _get_client()
# Build the request payload
payload: dict[str, Any] = {"description": description}
if context:
# The external service uses user_instruction for additional context
payload["user_instruction"] = context
try:
response = await client.post("/api/decompose-description", json=payload)
response.raise_for_status()
data = response.json()
if not data.get("success"):
logger.error(f"External service returned error: {data.get('error')}")
return None
# Map the response to the expected format
response_type = data.get("type")
if response_type == "instructions":
return {"type": "instructions", "steps": data.get("steps", [])}
elif response_type == "clarifying_questions":
return {
"type": "clarifying_questions",
"questions": data.get("questions", []),
}
elif response_type == "unachievable_goal":
return {
"type": "unachievable_goal",
"reason": data.get("reason"),
"suggested_goal": data.get("suggested_goal"),
}
elif response_type == "vague_goal":
return {
"type": "vague_goal",
"suggested_goal": data.get("suggested_goal"),
}
else:
logger.error(
f"Unknown response type from external service: {response_type}"
)
return None
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
async def generate_agent_external(
instructions: dict[str, Any]
) -> dict[str, Any] | None:
"""Call the external service to generate an agent from instructions.
Args:
instructions: Structured instructions from decompose_goal
Returns:
Agent JSON dict or None on error
"""
client = _get_client()
try:
response = await client.post(
"/api/generate-agent", json={"instructions": instructions}
)
response.raise_for_status()
data = response.json()
if not data.get("success"):
logger.error(f"External service returned error: {data.get('error')}")
return None
return data.get("agent_json")
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
async def generate_agent_patch_external(
update_request: str, current_agent: dict[str, Any]
) -> dict[str, Any] | None:
"""Call the external service to generate a patch for an existing agent.
Args:
update_request: Natural language description of changes
current_agent: Current agent JSON
Returns:
Updated agent JSON, clarifying questions dict, or None on error
"""
client = _get_client()
try:
response = await client.post(
"/api/update-agent",
json={
"update_request": update_request,
"current_agent_json": current_agent,
},
)
response.raise_for_status()
data = response.json()
if not data.get("success"):
logger.error(f"External service returned error: {data.get('error')}")
return None
# Check if it's clarifying questions
if data.get("type") == "clarifying_questions":
return {
"type": "clarifying_questions",
"questions": data.get("questions", []),
}
# Otherwise return the updated agent JSON
return data.get("agent_json")
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
async def get_blocks_external() -> list[dict[str, Any]] | None:
"""Get available blocks from the external service.
Returns:
List of block info dicts or None on error
"""
client = _get_client()
try:
response = await client.get("/api/blocks")
response.raise_for_status()
data = response.json()
if not data.get("success"):
logger.error("External service returned error getting blocks")
return None
return data.get("blocks", [])
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error getting blocks from external service: {e}")
return None
except httpx.RequestError as e:
logger.error(f"Request error getting blocks from external service: {e}")
return None
except Exception as e:
logger.error(f"Unexpected error getting blocks from external service: {e}")
return None
async def health_check() -> bool:
"""Check if the external service is healthy.
Returns:
True if healthy, False otherwise
"""
if not is_external_service_configured():
return False
client = _get_client()
try:
response = await client.get("/health")
response.raise_for_status()
data = response.json()
return data.get("status") == "healthy" and data.get("blocks_loaded", False)
except Exception as e:
logger.warning(f"External agent generator health check failed: {e}")
return False
async def close_client() -> None:
"""Close the HTTP client."""
global _client
if _client is not None:
await _client.aclose()
_client = None

View File

@@ -0,0 +1,213 @@
"""Utilities for agent generation."""
import json
import re
from typing import Any
from backend.data.block import get_blocks
# UUID validation regex
UUID_REGEX = re.compile(
r"^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}$"
)
# Block IDs for various fixes
STORE_VALUE_BLOCK_ID = "1ff065e9-88e8-4358-9d82-8dc91f622ba9"
CONDITION_BLOCK_ID = "715696a0-e1da-45c8-b209-c2fa9c3b0be6"
ADDTOLIST_BLOCK_ID = "aeb08fc1-2fc1-4141-bc8e-f758f183a822"
ADDTODICTIONARY_BLOCK_ID = "31d1064e-7446-4693-a7d4-65e5ca1180d1"
CREATELIST_BLOCK_ID = "a912d5c7-6e00-4542-b2a9-8034136930e4"
CREATEDICT_BLOCK_ID = "b924ddf4-de4f-4b56-9a85-358930dcbc91"
CODE_EXECUTION_BLOCK_ID = "0b02b072-abe7-11ef-8372-fb5d162dd712"
DATA_SAMPLING_BLOCK_ID = "4a448883-71fa-49cf-91cf-70d793bd7d87"
UNIVERSAL_TYPE_CONVERTER_BLOCK_ID = "95d1b990-ce13-4d88-9737-ba5c2070c97b"
GET_CURRENT_DATE_BLOCK_ID = "b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1"
DOUBLE_CURLY_BRACES_BLOCK_IDS = [
"44f6c8ad-d75c-4ae1-8209-aad1c0326928", # FillTextTemplateBlock
"6ab085e2-20b3-4055-bc3e-08036e01eca6",
"90f8c45e-e983-4644-aa0b-b4ebe2f531bc",
"363ae599-353e-4804-937e-b2ee3cef3da4", # AgentOutputBlock
"3b191d9f-356f-482d-8238-ba04b6d18381",
"db7d8f02-2f44-4c55-ab7a-eae0941f0c30",
"3a7c4b8d-6e2f-4a5d-b9c1-f8d23c5a9b0e",
"ed1ae7a0-b770-4089-b520-1f0005fad19a",
"a892b8d9-3e4e-4e9c-9c1e-75f8efcf1bfa",
"b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1",
"716a67b3-6760-42e7-86dc-18645c6e00fc",
"530cf046-2ce0-4854-ae2c-659db17c7a46",
"ed55ac19-356e-4243-a6cb-bc599e9b716f",
"1f292d4a-41a4-4977-9684-7c8d560b9f91", # LLM blocks
"32a87eab-381e-4dd4-bdb8-4c47151be35a",
]
def is_valid_uuid(value: str) -> bool:
"""Check if a string is a valid UUID v4."""
return isinstance(value, str) and UUID_REGEX.match(value) is not None
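
The regex pins the version nibble to 4 and the variant nibble to 8, 9, a, or b, and accepts only lowercase hex, so canonical v4 IDs pass while v1 IDs and uppercase forms are rejected. A quick illustration (the last three values are made-up non-matches):

# STORE_VALUE_BLOCK_ID from the constants above: a canonical v4 UUID.
assert is_valid_uuid("1ff065e9-88e8-4358-9d82-8dc91f622ba9")
assert not is_valid_uuid("1FF065E9-88E8-4358-9D82-8DC91F622BA9")  # uppercase
assert not is_valid_uuid("a8098c1a-f86e-11da-bd1a-00112444be1e")  # version 1
assert not is_valid_uuid("not-a-uuid")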
def _compact_schema(schema: dict) -> dict[str, str]:
"""Extract compact type info from a JSON schema properties dict.
Returns a dict of {field_name: type_string} for essential info only.
"""
props = schema.get("properties", {})
result = {}
for name, prop in props.items():
# Skip internal/complex fields
if name.startswith("_"):
continue
# Get type string
type_str = prop.get("type", "any")
# Handle anyOf/oneOf (optional types)
if "anyOf" in prop:
types = [t.get("type", "?") for t in prop["anyOf"] if t.get("type")]
type_str = "|".join(types) if types else "any"
elif "allOf" in prop:
type_str = "object"
# Add array item type if present
if type_str == "array" and "items" in prop:
items = prop["items"]
if isinstance(items, dict):
item_type = items.get("type", "any")
type_str = f"array[{item_type}]"
result[name] = type_str
return result
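
A sketch of what _compact_schema produces for a typical properties dict (the field names here are invented):

schema = {
    "properties": {
        "prompt": {"type": "string"},
        "max_tokens": {"anyOf": [{"type": "integer"}, {"type": "null"}]},
        "tags": {"type": "array", "items": {"type": "string"}},
        "config": {"allOf": [{"$ref": "#/defs/Config"}]},
        "_internal": {"type": "object"},  # skipped: leading underscore
    }
}
print(_compact_schema(schema))
# -> {'prompt': 'string', 'max_tokens': 'integer|null',
#     'tags': 'array[string]', 'config': 'object'}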
def get_block_summaries(include_schemas: bool = True) -> str:
"""Generate compact block summaries for prompts.
Args:
include_schemas: Whether to include input/output type info
Returns:
Formatted string of block summaries (compact format)
"""
blocks = get_blocks()
summaries = []
for block_id, block_cls in blocks.items():
block = block_cls()
name = block.name
desc = getattr(block, "description", "") or ""
# Truncate description
if len(desc) > 150:
desc = desc[:147] + "..."
if not include_schemas:
summaries.append(f"- {name} (id: {block_id}): {desc}")
else:
# Compact format with type info only
inputs = {}
outputs = {}
required = []
if hasattr(block, "input_schema"):
try:
schema = block.input_schema.jsonschema()
inputs = _compact_schema(schema)
required = schema.get("required", [])
except Exception:
pass
if hasattr(block, "output_schema"):
try:
schema = block.output_schema.jsonschema()
outputs = _compact_schema(schema)
except Exception:
pass
# Build compact line format
# Format: NAME (id): desc | in: {field:type, ...} [required] | out: {field:type}
in_str = ", ".join(f"{k}:{v}" for k, v in inputs.items())
out_str = ", ".join(f"{k}:{v}" for k, v in outputs.items())
req_str = f" req=[{','.join(required)}]" if required else ""
static = " [static]" if getattr(block, "static_output", False) else ""
line = f"- {name} (id: {block_id}): {desc}"
if in_str:
line += f"\n in: {{{in_str}}}{req_str}"
if out_str:
line += f"\n out: {{{out_str}}}{static}"
summaries.append(line)
return "\n".join(summaries)
def get_blocks_info() -> list[dict[str, Any]]:
"""Get block information with schemas for validation and fixing."""
blocks = get_blocks()
blocks_info = []
for block_id, block_cls in blocks.items():
block = block_cls()
blocks_info.append(
{
"id": block_id,
"name": block.name,
"description": getattr(block, "description", ""),
"categories": getattr(block, "categories", []),
"staticOutput": getattr(block, "static_output", False),
"inputSchema": (
block.input_schema.jsonschema()
if hasattr(block, "input_schema")
else {}
),
"outputSchema": (
block.output_schema.jsonschema()
if hasattr(block, "output_schema")
else {}
),
}
)
return blocks_info
def parse_json_from_llm(text: str) -> dict[str, Any] | None:
"""Extract JSON from LLM response (handles markdown code blocks)."""
if not text:
return None
# Try fenced code block
match = re.search(r"```(?:json)?\s*([\s\S]*?)```", text, re.IGNORECASE)
if match:
try:
return json.loads(match.group(1).strip())
except json.JSONDecodeError:
pass
# Try raw text
try:
return json.loads(text.strip())
except json.JSONDecodeError:
pass
# Try finding {...} span
start = text.find("{")
end = text.rfind("}")
if start != -1 and end > start:
try:
return json.loads(text[start : end + 1])
except json.JSONDecodeError:
pass
# Try finding [...] span
start = text.find("[")
end = text.rfind("]")
if start != -1 and end > start:
try:
return json.loads(text[start : end + 1])
except json.JSONDecodeError:
pass
return None
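
The parser tries progressively looser strategies, which makes it tolerant of the usual LLM framing. A small sketch with illustrative inputs:

# Fenced code block: the ```json fence is stripped before json.loads.
print(parse_json_from_llm('```json\n{"name": "Demo", "nodes": []}\n```'))
# -> {'name': 'Demo', 'nodes': []}

# JSON embedded in prose: the {...} span fallback still finds it.
print(parse_json_from_llm('Sure! {"ok": true} Done.'))
# -> {'ok': True}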

View File

@@ -0,0 +1,279 @@
"""Agent validator - Validates agent structure and connections."""
import logging
import re
from typing import Any
from .utils import get_blocks_info
logger = logging.getLogger(__name__)
class AgentValidator:
"""Validator for AutoGPT agents with detailed error reporting."""
def __init__(self):
self.errors: list[str] = []
def add_error(self, error: str) -> None:
"""Add an error message."""
self.errors.append(error)
def validate_block_existence(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> bool:
"""Validate all block IDs exist in the blocks library."""
valid = True
valid_block_ids = {b.get("id") for b in blocks_info if b.get("id")}
for node in agent.get("nodes", []):
block_id = node.get("block_id")
node_id = node.get("id")
if not block_id:
self.add_error(f"Node '{node_id}' is missing 'block_id' field.")
valid = False
continue
if block_id not in valid_block_ids:
self.add_error(
f"Node '{node_id}' references block_id '{block_id}' which does not exist."
)
valid = False
return valid
def validate_link_node_references(self, agent: dict[str, Any]) -> bool:
"""Validate all node IDs referenced in links exist."""
valid = True
valid_node_ids = {n.get("id") for n in agent.get("nodes", []) if n.get("id")}
for link in agent.get("links", []):
link_id = link.get("id", "Unknown")
source_id = link.get("source_id")
sink_id = link.get("sink_id")
if not source_id:
self.add_error(f"Link '{link_id}' is missing 'source_id'.")
valid = False
elif source_id not in valid_node_ids:
self.add_error(
f"Link '{link_id}' references non-existent source_id '{source_id}'."
)
valid = False
if not sink_id:
self.add_error(f"Link '{link_id}' is missing 'sink_id'.")
valid = False
elif sink_id not in valid_node_ids:
self.add_error(
f"Link '{link_id}' references non-existent sink_id '{sink_id}'."
)
valid = False
return valid
def validate_required_inputs(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> bool:
"""Validate required inputs are provided."""
valid = True
block_map = {b.get("id"): b for b in blocks_info}
for node in agent.get("nodes", []):
block_id = node.get("block_id")
block = block_map.get(block_id)
if not block:
continue
required_inputs = block.get("inputSchema", {}).get("required", [])
input_defaults = node.get("input_default", {})
node_id = node.get("id")
# Get linked inputs
linked_inputs = {
link["sink_name"]
for link in agent.get("links", [])
if link.get("sink_id") == node_id
}
for req_input in required_inputs:
if (
req_input not in input_defaults
and req_input not in linked_inputs
and req_input != "credentials"
):
block_name = block.get("name", "Unknown Block")
self.add_error(
f"Node '{node_id}' ({block_name}) is missing required input '{req_input}'."
)
valid = False
return valid
def validate_data_type_compatibility(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> bool:
"""Validate linked data types are compatible."""
valid = True
block_map = {b.get("id"): b for b in blocks_info}
node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
def get_type(schema: dict, name: str) -> str | None:
if "_#_" in name:
parent, child = name.split("_#_", 1)
parent_schema = schema.get(parent, {})
if "properties" in parent_schema:
return parent_schema["properties"].get(child, {}).get("type")
return None
return schema.get(name, {}).get("type")
def are_compatible(src: str, sink: str) -> bool:
if {src, sink} <= {"integer", "number"}:
return True
return src == sink
for link in agent.get("links", []):
source_node = node_lookup.get(link.get("source_id"))
sink_node = node_lookup.get(link.get("sink_id"))
if not source_node or not sink_node:
continue
source_block = block_map.get(source_node.get("block_id"))
sink_block = block_map.get(sink_node.get("block_id"))
if not source_block or not sink_block:
continue
source_outputs = source_block.get("outputSchema", {}).get("properties", {})
sink_inputs = sink_block.get("inputSchema", {}).get("properties", {})
source_type = get_type(source_outputs, link.get("source_name", ""))
sink_type = get_type(sink_inputs, link.get("sink_name", ""))
if source_type and sink_type and not are_compatible(source_type, sink_type):
self.add_error(
f"Type mismatch: {source_block.get('name')} output '{link['source_name']}' "
f"({source_type}) -> {sink_block.get('name')} input '{link['sink_name']}' ({sink_type})."
)
valid = False
return valid
def validate_nested_sink_links(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> bool:
"""Validate nested sink links (with _#_ notation)."""
valid = True
block_map = {b.get("id"): b for b in blocks_info}
node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
for link in agent.get("links", []):
sink_name = link.get("sink_name", "")
if "_#_" in sink_name:
parent, child = sink_name.split("_#_", 1)
sink_node = node_lookup.get(link.get("sink_id"))
if not sink_node:
continue
block = block_map.get(sink_node.get("block_id"))
if not block:
continue
input_props = block.get("inputSchema", {}).get("properties", {})
parent_schema = input_props.get(parent)
if not parent_schema:
self.add_error(
f"Invalid nested link '{sink_name}': parent '{parent}' not found."
)
valid = False
continue
if not parent_schema.get("additionalProperties"):
if not (
isinstance(parent_schema, dict)
and "properties" in parent_schema
and child in parent_schema.get("properties", {})
):
self.add_error(
f"Invalid nested link '{sink_name}': child '{child}' not found in '{parent}'."
)
valid = False
return valid
def validate_prompt_spaces(self, agent: dict[str, Any]) -> bool:
"""Validate prompts don't have spaces in template variables."""
valid = True
for node in agent.get("nodes", []):
input_default = node.get("input_default", {})
prompt = input_default.get("prompt", "")
if not isinstance(prompt, str):
continue
# Find {{...}} with spaces
matches = re.finditer(r"\{\{([^}]+)\}\}", prompt)
for match in matches:
content = match.group(1)
if " " in content:
self.add_error(
f"Node '{node.get('id')}' has spaces in template variable: "
f"'{{{{{content}}}}}' should be '{{{{{content.replace(' ', '_')}}}}}'."
)
valid = False
return valid
def validate(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
) -> tuple[bool, str | None]:
"""Run all validations.
Returns:
Tuple of (is_valid, error_message)
"""
self.errors = []
if blocks_info is None:
blocks_info = get_blocks_info()
checks = [
self.validate_block_existence(agent, blocks_info),
self.validate_link_node_references(agent),
self.validate_required_inputs(agent, blocks_info),
self.validate_data_type_compatibility(agent, blocks_info),
self.validate_nested_sink_links(agent, blocks_info),
self.validate_prompt_spaces(agent),
]
all_passed = all(checks)
if all_passed:
logger.info("Agent validation successful")
return True, None
error_message = "Agent validation failed:\n"
for i, error in enumerate(self.errors, 1):
error_message += f"{i}. {error}\n"
logger.warning(f"Agent validation failed with {len(self.errors)} errors")
return False, error_message
def validate_agent(
agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
) -> tuple[bool, str | None]:
"""Convenience function to validate an agent.
Returns:
Tuple of (is_valid, error_message)
"""
validator = AgentValidator()
return validator.validate(agent, blocks_info)
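
A minimal sketch of the validator in action on a hand-built agent. The block ID and schemas are invented; real callers would pass the result of get_blocks_info():

blocks_info = [
    {
        "id": "block-text",  # invented block for illustration
        "name": "TextBlock",
        "inputSchema": {
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        "outputSchema": {"properties": {"output": {"type": "string"}}},
    }
]
agent = {
    "nodes": [{"id": "n1", "block_id": "block-text", "input_default": {}}],
    "links": [],
}
is_valid, error_message = validate_agent(agent, blocks_info)
print(is_valid)
# -> False: required input 'text' is neither defaulted nor linked, so
# error_message reads "Agent validation failed:\n1. Node 'n1' (TextBlock)
# is missing required input 'text'.\n"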

View File

@@ -8,10 +8,12 @@ from langfuse import observe
from backend.api.features.chat.model import ChatSession
from .agent_generator import (
AgentGeneratorNotConfiguredError,
apply_all_fixes,
decompose_goal,
generate_agent,
get_blocks_info,
save_agent_to_library,
validate_agent,
)
from .base import BaseTool
from .models import (
@@ -25,6 +27,9 @@ from .models import (
logger = logging.getLogger(__name__)
# Maximum retries for agent generation with validation feedback
MAX_GENERATION_RETRIES = 2
class CreateAgentTool(BaseTool):
"""Tool for creating agents from natural language descriptions."""
@@ -86,8 +91,9 @@ class CreateAgentTool(BaseTool):
Flow:
1. Decompose the description into steps (may return clarifying questions)
2. Generate agent JSON (external service handles fixing and validation)
3. Preview or save based on the save parameter
2. Generate agent JSON from the steps
3. Apply fixes to correct common LLM errors
4. Preview or save based on the save parameter
"""
description = kwargs.get("description", "").strip()
context = kwargs.get("context", "")
@@ -104,13 +110,11 @@ class CreateAgentTool(BaseTool):
# Step 1: Decompose goal into steps
try:
decomposition_result = await decompose_goal(description, context)
except AgentGeneratorNotConfiguredError:
except ValueError as e:
# Handle missing API key or configuration errors
return ErrorResponse(
message=(
"Agent generation is not available. "
"The Agent Generator service is not configured."
),
error="service_not_configured",
message=f"Agent generation is not configured: {str(e)}",
error="configuration_error",
session_id=session_id,
)
@@ -167,32 +171,72 @@ class CreateAgentTool(BaseTool):
session_id=session_id,
)
# Step 2: Generate agent JSON (external service handles fixing and validation)
try:
agent_json = await generate_agent(decomposition_result)
except AgentGeneratorNotConfiguredError:
return ErrorResponse(
message=(
"Agent generation is not available. "
"The Agent Generator service is not configured."
),
error="service_not_configured",
session_id=session_id,
# Step 2: Generate agent JSON with retry on validation failure
blocks_info = get_blocks_info()
agent_json = None
validation_errors = None
for attempt in range(MAX_GENERATION_RETRIES + 1):
# Generate agent (include validation errors from previous attempt)
if attempt == 0:
agent_json = await generate_agent(decomposition_result)
else:
# Retry with validation error feedback
logger.info(
f"Retry {attempt}/{MAX_GENERATION_RETRIES} with validation feedback"
)
retry_instructions = {
**decomposition_result,
"previous_errors": validation_errors,
"retry_instructions": (
"The previous generation had validation errors. "
"Please fix these issues in the new generation:\n"
f"{validation_errors}"
),
}
agent_json = await generate_agent(retry_instructions)
if agent_json is None:
if attempt == MAX_GENERATION_RETRIES:
return ErrorResponse(
message="Failed to generate the agent. Please try again.",
error="Generation failed",
session_id=session_id,
)
continue
# Step 3: Apply fixes to correct common errors
agent_json = apply_all_fixes(agent_json, blocks_info)
# Step 4: Validate the agent
is_valid, validation_errors = validate_agent(agent_json, blocks_info)
if is_valid:
logger.info(f"Agent generated successfully on attempt {attempt + 1}")
break
logger.warning(
f"Validation failed on attempt {attempt + 1}: {validation_errors}"
)
if agent_json is None:
return ErrorResponse(
message="Failed to generate the agent. Please try again.",
error="Generation failed",
session_id=session_id,
)
if attempt == MAX_GENERATION_RETRIES:
# Return error with validation details
return ErrorResponse(
message=(
f"Generated agent has validation errors after {MAX_GENERATION_RETRIES + 1} attempts. "
f"Please try rephrasing your request or simplify the workflow."
),
error="validation_failed",
details={"validation_errors": validation_errors},
session_id=session_id,
)
agent_name = agent_json.get("name", "Generated Agent")
agent_description = agent_json.get("description", "")
node_count = len(agent_json.get("nodes", []))
link_count = len(agent_json.get("links", []))
# Step 3: Preview or save
# Step 4: Preview or save
if not save:
return AgentPreviewResponse(
message=(

View File
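
The generate → fix → validate loop added above is the heart of this change. A condensed, runnable sketch of the same control flow, with stubs standing in for generate_agent, apply_all_fixes, and validate_agent (all stubs are invented for illustration):

import asyncio

async def generate(request: dict) -> dict | None:
    # Stub for generate_agent(): pretend the first attempt omits block_id
    # and the retry (which sees previous_errors) produces a valid node.
    if "previous_errors" in request:
        return {"nodes": [{"id": "n1", "block_id": "b1"}], "links": []}
    return {"nodes": [{"id": "n1"}], "links": []}

def fix(agent: dict) -> dict:
    # Stub for apply_all_fixes(): deterministic repairs would happen here.
    return agent

def validate(agent: dict) -> tuple[bool, str | None]:
    # Stub for validate_agent(): require block_id on every node.
    for node in agent["nodes"]:
        if "block_id" not in node:
            return False, f"Node '{node['id']}' is missing 'block_id' field."
    return True, None

async def generate_with_retries(spec: dict, max_retries: int = 2) -> dict:
    errors: str | None = None
    for attempt in range(max_retries + 1):
        request = spec if attempt == 0 else {**spec, "previous_errors": errors}
        agent = await generate(request)
        if agent is None:
            continue  # outright generation failure: retry without feedback
        ok, errors = validate(fix(agent))
        if ok:
            return agent
    raise RuntimeError(f"still invalid after {max_retries + 1} attempts: {errors}")

print(asyncio.run(generate_with_retries({"goal": "demo"})))

Running the deterministic fixes before validation means the retry prompt only carries errors the LLM actually has to reason about.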

@@ -8,10 +8,13 @@ from langfuse import observe
from backend.api.features.chat.model import ChatSession
from .agent_generator import (
AgentGeneratorNotConfiguredError,
apply_agent_patch,
apply_all_fixes,
generate_agent_patch,
get_agent_as_json,
get_blocks_info,
save_agent_to_library,
validate_agent,
)
from .base import BaseTool
from .models import (
@@ -25,6 +28,9 @@ from .models import (
logger = logging.getLogger(__name__)
# Maximum retries for patch generation with validation feedback
MAX_GENERATION_RETRIES = 2
class EditAgentTool(BaseTool):
"""Tool for editing existing agents using natural language."""
@@ -37,7 +43,7 @@ class EditAgentTool(BaseTool):
def description(self) -> str:
return (
"Edit an existing agent from the user's library using natural language. "
"Generates updates to the agent while preserving unchanged parts."
"Generates a patch to update the agent while preserving unchanged parts."
)
@property
@@ -92,8 +98,9 @@ class EditAgentTool(BaseTool):
Flow:
1. Fetch the current agent
2. Generate updated agent (external service handles fixing and validation)
3. Preview or save based on the save parameter
2. Generate a patch based on the requested changes
3. Apply the patch to create an updated agent
4. Preview or save based on the save parameter
"""
agent_id = kwargs.get("agent_id", "").strip()
changes = kwargs.get("changes", "").strip()
@@ -130,58 +137,121 @@ class EditAgentTool(BaseTool):
if context:
update_request = f"{changes}\n\nAdditional context:\n{context}"
# Step 2: Generate updated agent (external service handles fixing and validation)
try:
result = await generate_agent_patch(update_request, current_agent)
except AgentGeneratorNotConfiguredError:
return ErrorResponse(
message=(
"Agent editing is not available. "
"The Agent Generator service is not configured."
),
error="service_not_configured",
session_id=session_id,
)
# Step 2: Generate patch with retry on validation failure
blocks_info = get_blocks_info()
updated_agent = None
validation_errors = None
intent = "Applied requested changes"
if result is None:
return ErrorResponse(
message="Failed to generate changes. Please try rephrasing.",
error="Update generation failed",
session_id=session_id,
)
# Check if LLM returned clarifying questions
if result.get("type") == "clarifying_questions":
questions = result.get("questions", [])
return ClarificationNeededResponse(
message=(
"I need some more information about the changes. "
"Please answer the following questions:"
),
questions=[
ClarifyingQuestion(
question=q.get("question", ""),
keyword=q.get("keyword", ""),
example=q.get("example"),
for attempt in range(MAX_GENERATION_RETRIES + 1):
# Generate patch (include validation errors from previous attempt)
try:
if attempt == 0:
patch_result = await generate_agent_patch(
update_request, current_agent
)
for q in questions
],
session_id=session_id,
else:
# Retry with validation error feedback
logger.info(
f"Retry {attempt}/{MAX_GENERATION_RETRIES} with validation feedback"
)
retry_request = (
f"{update_request}\n\n"
f"IMPORTANT: The previous edit had validation errors. "
f"Please fix these issues:\n{validation_errors}"
)
patch_result = await generate_agent_patch(
retry_request, current_agent
)
except ValueError as e:
# Handle missing API key or configuration errors
return ErrorResponse(
message=f"Agent generation is not configured: {str(e)}",
error="configuration_error",
session_id=session_id,
)
if patch_result is None:
if attempt == MAX_GENERATION_RETRIES:
return ErrorResponse(
message="Failed to generate changes. Please try rephrasing.",
error="Patch generation failed",
session_id=session_id,
)
continue
# Check if LLM returned clarifying questions
if patch_result.get("type") == "clarifying_questions":
questions = patch_result.get("questions", [])
return ClarificationNeededResponse(
message=(
"I need some more information about the changes. "
"Please answer the following questions:"
),
questions=[
ClarifyingQuestion(
question=q.get("question", ""),
keyword=q.get("keyword", ""),
example=q.get("example"),
)
for q in questions
],
session_id=session_id,
)
# Step 3: Apply patch and fixes
try:
updated_agent = apply_agent_patch(current_agent, patch_result)
updated_agent = apply_all_fixes(updated_agent, blocks_info)
except Exception as e:
if attempt == MAX_GENERATION_RETRIES:
return ErrorResponse(
message=f"Failed to apply changes: {str(e)}",
error="patch_apply_failed",
details={"exception": str(e)},
session_id=session_id,
)
validation_errors = str(e)
continue
# Step 4: Validate the updated agent
is_valid, validation_errors = validate_agent(updated_agent, blocks_info)
if is_valid:
logger.info(f"Agent edited successfully on attempt {attempt + 1}")
intent = patch_result.get("intent", "Applied requested changes")
break
logger.warning(
f"Validation failed on attempt {attempt + 1}: {validation_errors}"
)
# Result is the updated agent JSON
updated_agent = result
if attempt == MAX_GENERATION_RETRIES:
# Return error with validation details
return ErrorResponse(
message=(
f"Updated agent has validation errors after "
f"{MAX_GENERATION_RETRIES + 1} attempts. "
f"Please try rephrasing your request or simplify the changes."
),
error="validation_failed",
details={"validation_errors": validation_errors},
session_id=session_id,
)
# At this point, updated_agent is guaranteed to be set (we return on all failure paths)
assert updated_agent is not None
agent_name = updated_agent.get("name", "Updated Agent")
agent_description = updated_agent.get("description", "")
node_count = len(updated_agent.get("nodes", []))
link_count = len(updated_agent.get("links", []))
# Step 3: Preview or save
# Step 5: Preview or save
if not save:
return AgentPreviewResponse(
message=(
f"I've updated the agent. "
f"I've updated the agent. Changes: {intent}. "
f"The agent now has {node_count} blocks. "
f"Review it and call edit_agent with save=true to save the changes."
),
@@ -207,7 +277,10 @@ class EditAgentTool(BaseTool):
)
return AgentSavedResponse(
message=f"Updated agent '{created_graph.name}' has been saved to your library!",
message=(
f"Updated agent '{created_graph.name}' has been saved to your library! "
f"Changes: {intent}"
),
agent_id=created_graph.id,
agent_name=created_graph.name,
library_agent_id=library_agent.id,

View File

@@ -29,7 +29,7 @@ def mock_embedding_functions():
yield
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent(setup_test_data):
"""Test that the run_agent tool successfully executes an approved agent"""
# Use test data from fixture
@@ -70,7 +70,7 @@ async def test_run_agent(setup_test_data):
assert result_data["graph_name"] == "Test Agent"
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_missing_inputs(setup_test_data):
"""Test that the run_agent tool returns error when inputs are missing"""
# Use test data from fixture
@@ -106,7 +106,7 @@ async def test_run_agent_missing_inputs(setup_test_data):
assert "message" in result_data
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_invalid_agent_id(setup_test_data):
"""Test that the run_agent tool returns error for invalid agent ID"""
# Use test data from fixture
@@ -141,7 +141,7 @@ async def test_run_agent_invalid_agent_id(setup_test_data):
)
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_with_llm_credentials(setup_llm_test_data):
"""Test that run_agent works with an agent requiring LLM credentials"""
# Use test data from fixture
@@ -185,7 +185,7 @@ async def test_run_agent_with_llm_credentials(setup_llm_test_data):
assert result_data["graph_name"] == "LLM Test Agent"
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_shows_available_inputs_when_none_provided(setup_test_data):
"""Test that run_agent returns available inputs when called without inputs or use_defaults."""
user = setup_test_data["user"]
@@ -219,7 +219,7 @@ async def test_run_agent_shows_available_inputs_when_none_provided(setup_test_da
assert "inputs" in result_data["message"].lower()
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_with_use_defaults(setup_test_data):
"""Test that run_agent executes successfully with use_defaults=True."""
user = setup_test_data["user"]
@@ -251,7 +251,7 @@ async def test_run_agent_with_use_defaults(setup_test_data):
assert result_data["graph_id"] == graph.id
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_missing_credentials(setup_firecrawl_test_data):
"""Test that run_agent returns setup_requirements when credentials are missing."""
user = setup_firecrawl_test_data["user"]
@@ -285,7 +285,7 @@ async def test_run_agent_missing_credentials(setup_firecrawl_test_data):
assert len(setup_info["user_readiness"]["missing_credentials"]) > 0
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_invalid_slug_format(setup_test_data):
"""Test that run_agent returns error for invalid slug format (no slash)."""
user = setup_test_data["user"]
@@ -313,7 +313,7 @@ async def test_run_agent_invalid_slug_format(setup_test_data):
assert "username/agent-name" in result_data["message"]
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_unauthenticated():
"""Test that run_agent returns need_login for unauthenticated users."""
tool = RunAgentTool()
@@ -340,7 +340,7 @@ async def test_run_agent_unauthenticated():
assert "sign in" in result_data["message"].lower()
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_schedule_without_cron(setup_test_data):
"""Test that run_agent returns error when scheduling without cron expression."""
user = setup_test_data["user"]
@@ -372,7 +372,7 @@ async def test_run_agent_schedule_without_cron(setup_test_data):
assert "cron" in result_data["message"].lower()
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.asyncio(scope="session")
async def test_run_agent_schedule_without_name(setup_test_data):
"""Test that run_agent returns error when scheduling without schedule_name."""
user = setup_test_data["user"]

View File

@@ -23,7 +23,6 @@ class PendingHumanReviewModel(BaseModel):
id: Unique identifier for the review record
user_id: ID of the user who must perform the review
node_exec_id: ID of the node execution that created this review
node_id: ID of the node definition (for grouping reviews from same node)
graph_exec_id: ID of the graph execution containing the node
graph_id: ID of the graph template being executed
graph_version: Version number of the graph template
@@ -38,10 +37,6 @@ class PendingHumanReviewModel(BaseModel):
"""
node_exec_id: str = Field(description="Node execution ID (primary key)")
node_id: str = Field(
description="Node definition ID (for grouping)",
default="", # Temporary default for test compatibility
)
user_id: str = Field(description="User ID associated with the review")
graph_exec_id: str = Field(description="Graph execution ID")
graph_id: str = Field(description="Graph ID")
@@ -71,9 +66,7 @@ class PendingHumanReviewModel(BaseModel):
)
@classmethod
def from_db(
cls, review: "PendingHumanReview", node_id: str
) -> "PendingHumanReviewModel":
def from_db(cls, review: "PendingHumanReview") -> "PendingHumanReviewModel":
"""
Convert a database model to a response model.
@@ -81,14 +74,9 @@ class PendingHumanReviewModel(BaseModel):
payload, instructions, and editable flag.
Handles invalid data gracefully by using safe defaults.
Args:
review: Database review object
node_id: Node definition ID (fetched from NodeExecution)
"""
return cls(
node_exec_id=review.nodeExecId,
node_id=node_id,
user_id=review.userId,
graph_exec_id=review.graphExecId,
graph_id=review.graphId,
@@ -119,13 +107,6 @@ class ReviewItem(BaseModel):
reviewed_data: SafeJsonData | None = Field(
None, description="Optional edited data (ignored if approved=False)"
)
auto_approve_future: bool = Field(
default=False,
description=(
"If true and this review is approved, future executions of this same "
"block (node) will be automatically approved. This only affects approved reviews."
),
)
@field_validator("reviewed_data")
@classmethod
@@ -193,9 +174,6 @@ class ReviewRequest(BaseModel):
This request must include ALL pending reviews for a graph execution.
Each review will be either approved (with optional data modifications)
or rejected (data ignored). The execution will resume only after ALL reviews are processed.
Each review item can individually specify whether to auto-approve future executions
of the same block via the `auto_approve_future` field on ReviewItem.
"""
reviews: List[ReviewItem] = Field(

View File

@@ -1,27 +1,17 @@
import asyncio
import logging
from typing import Any, List
from typing import List
import autogpt_libs.auth as autogpt_auth_lib
from fastapi import APIRouter, HTTPException, Query, Security, status
from prisma.enums import ReviewStatus
from backend.data.execution import (
ExecutionContext,
ExecutionStatus,
get_graph_execution_meta,
)
from backend.data.graph import get_graph_settings
from backend.data.execution import get_graph_execution_meta
from backend.data.human_review import (
create_auto_approval_record,
get_pending_reviews_by_node_exec_ids,
get_pending_reviews_for_execution,
get_pending_reviews_for_user,
has_pending_reviews_for_graph_exec,
process_all_reviews_for_execution,
)
from backend.data.model import USER_TIMEZONE_NOT_SET
from backend.data.user import get_user_by_id
from backend.executor.utils import add_graph_execution
from .model import PendingHumanReviewModel, ReviewRequest, ReviewResponse
@@ -137,70 +127,17 @@ async def process_review_action(
detail="At least one review must be provided",
)
# Batch fetch all requested reviews
reviews_map = await get_pending_reviews_by_node_exec_ids(
list(all_request_node_ids), user_id
)
# Validate all reviews were found
missing_ids = all_request_node_ids - set(reviews_map.keys())
if missing_ids:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=f"No pending review found for node execution(s): {', '.join(missing_ids)}",
)
# Validate all reviews belong to the same execution
graph_exec_ids = {review.graph_exec_id for review in reviews_map.values()}
if len(graph_exec_ids) > 1:
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail="All reviews in a single request must belong to the same execution.",
)
graph_exec_id = next(iter(graph_exec_ids))
# Validate execution status before processing reviews
graph_exec_meta = await get_graph_execution_meta(
user_id=user_id, execution_id=graph_exec_id
)
if not graph_exec_meta:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=f"Graph execution #{graph_exec_id} not found",
)
# Only allow processing reviews if execution is paused for review
# or incomplete (partial execution with some reviews already processed)
if graph_exec_meta.status not in (
ExecutionStatus.REVIEW,
ExecutionStatus.INCOMPLETE,
):
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}. "
f"Reviews can only be processed when execution is paused (REVIEW status). "
f"Current status: {graph_exec_meta.status}",
)
# Build review decisions map and track which reviews requested auto-approval
# Auto-approved reviews use original data (no modifications allowed)
# Build review decisions map
review_decisions = {}
auto_approve_requests = {} # Map node_exec_id -> auto_approve_future flag
for review in request.reviews:
review_status = (
ReviewStatus.APPROVED if review.approved else ReviewStatus.REJECTED
)
# If this review requested auto-approval, don't allow data modifications
reviewed_data = None if review.auto_approve_future else review.reviewed_data
review_decisions[review.node_exec_id] = (
review_status,
reviewed_data,
review.reviewed_data,
review.message,
)
auto_approve_requests[review.node_exec_id] = review.auto_approve_future
# Process all reviews
updated_reviews = await process_all_reviews_for_execution(
@@ -208,87 +145,6 @@ async def process_review_action(
review_decisions=review_decisions,
)
# Create auto-approval records for approved reviews that requested it
# Deduplicate by node_id to avoid race conditions when multiple reviews
# for the same node are processed in parallel
async def create_auto_approval_for_node(
node_id: str, review_result
) -> tuple[str, bool]:
"""
Create auto-approval record for a node.
Returns (node_id, success) tuple for tracking failures.
"""
try:
await create_auto_approval_record(
user_id=user_id,
graph_exec_id=review_result.graph_exec_id,
graph_id=review_result.graph_id,
graph_version=review_result.graph_version,
node_id=node_id,
payload=review_result.payload,
)
return (node_id, True)
except Exception as e:
logger.error(
f"Failed to create auto-approval record for node {node_id}",
exc_info=e,
)
return (node_id, False)
# Collect node_exec_ids that need auto-approval
node_exec_ids_needing_auto_approval = [
node_exec_id
for node_exec_id, review_result in updated_reviews.items()
if review_result.status == ReviewStatus.APPROVED
and auto_approve_requests.get(node_exec_id, False)
]
# Batch-fetch node executions to get node_ids
nodes_needing_auto_approval: dict[str, Any] = {}
if node_exec_ids_needing_auto_approval:
from backend.data.execution import get_node_executions
node_execs = await get_node_executions(
graph_exec_id=graph_exec_id, include_exec_data=False
)
node_exec_map = {node_exec.node_exec_id: node_exec for node_exec in node_execs}
for node_exec_id in node_exec_ids_needing_auto_approval:
node_exec = node_exec_map.get(node_exec_id)
if node_exec:
review_result = updated_reviews[node_exec_id]
# Use the first approved review for this node (deduplicate by node_id)
if node_exec.node_id not in nodes_needing_auto_approval:
nodes_needing_auto_approval[node_exec.node_id] = review_result
else:
logger.error(
f"Failed to create auto-approval record for {node_exec_id}: "
f"Node execution not found. This may indicate a race condition "
f"or data inconsistency."
)
# Execute all auto-approval creations in parallel (deduplicated by node_id)
auto_approval_results = await asyncio.gather(
*[
create_auto_approval_for_node(node_id, review_result)
for node_id, review_result in nodes_needing_auto_approval.items()
],
return_exceptions=True,
)
# Count auto-approval failures
auto_approval_failed_count = 0
for result in auto_approval_results:
if isinstance(result, Exception):
# Unexpected exception during auto-approval creation
auto_approval_failed_count += 1
logger.error(
f"Unexpected exception during auto-approval creation: {result}"
)
elif isinstance(result, tuple) and len(result) == 2 and not result[1]:
# Auto-approval creation failed (returned False)
auto_approval_failed_count += 1
# Count results
approved_count = sum(
1
@@ -301,53 +157,30 @@ async def process_review_action(
if review.status == ReviewStatus.REJECTED
)
# Resume execution only if ALL pending reviews for this execution have been processed
# Resume execution if we processed some reviews
if updated_reviews:
# Get graph execution ID from any processed review
first_review = next(iter(updated_reviews.values()))
graph_exec_id = first_review.graph_exec_id
# Check if any pending reviews remain for this execution
still_has_pending = await has_pending_reviews_for_graph_exec(graph_exec_id)
if not still_has_pending:
# Get the graph_id from any processed review
first_review = next(iter(updated_reviews.values()))
# Resume execution
try:
# Fetch user and settings to build complete execution context
user = await get_user_by_id(user_id)
settings = await get_graph_settings(
user_id=user_id, graph_id=first_review.graph_id
)
# Preserve user's timezone preference when resuming execution
user_timezone = (
user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
)
execution_context = ExecutionContext(
human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
user_timezone=user_timezone,
)
await add_graph_execution(
graph_id=first_review.graph_id,
user_id=user_id,
graph_exec_id=graph_exec_id,
execution_context=execution_context,
)
logger.info(f"Resumed execution {graph_exec_id}")
except Exception as e:
logger.error(f"Failed to resume execution {graph_exec_id}: {str(e)}")
# Build error message if auto-approvals failed
error_message = None
if auto_approval_failed_count > 0:
error_message = (
f"{auto_approval_failed_count} auto-approval setting(s) could not be saved. "
f"You may need to manually approve these reviews in future executions."
)
return ReviewResponse(
approved_count=approved_count,
rejected_count=rejected_count,
failed_count=auto_approval_failed_count,
error=error_message,
failed_count=0,
error=None,
)

View File

@@ -583,13 +583,7 @@ async def update_library_agent(
)
update_fields["isDeleted"] = is_deleted
if settings is not None:
existing_agent = await get_library_agent(id=library_agent_id, user_id=user_id)
current_settings_dict = (
existing_agent.settings.model_dump() if existing_agent.settings else {}
)
new_settings = settings.model_dump(exclude_unset=True)
merged_settings = {**current_settings_dict, **new_settings}
update_fields["settings"] = SafeJson(merged_settings)
update_fields["settings"] = SafeJson(settings.model_dump())
try:
# If graph_version is provided, update to that specific version

View File

@@ -20,7 +20,6 @@ from typing import AsyncGenerator
import httpx
import pytest
import pytest_asyncio
from autogpt_libs.api_key.keysmith import APIKeySmith
from prisma.enums import APIKeyPermission
from prisma.models import OAuthAccessToken as PrismaOAuthAccessToken
@@ -39,13 +38,13 @@ keysmith = APIKeySmith()
# ============================================================================
@pytest.fixture(scope="session")
@pytest.fixture
def test_user_id() -> str:
"""Test user ID for OAuth tests."""
return str(uuid.uuid4())
@pytest_asyncio.fixture(scope="session", loop_scope="session")
@pytest.fixture
async def test_user(server, test_user_id: str):
"""Create a test user in the database."""
await PrismaUser.prisma().create(
@@ -68,7 +67,7 @@ async def test_user(server, test_user_id: str):
await PrismaUser.prisma().delete(where={"id": test_user_id})
@pytest_asyncio.fixture
@pytest.fixture
async def test_oauth_app(test_user: str):
"""Create a test OAuth application in the database."""
app_id = str(uuid.uuid4())
@@ -123,7 +122,7 @@ def pkce_credentials() -> tuple[str, str]:
return generate_pkce()
@pytest_asyncio.fixture
@pytest.fixture
async def client(server, test_user: str) -> AsyncGenerator[httpx.AsyncClient, None]:
"""
Create an async HTTP client that talks directly to the FastAPI app.
@@ -288,7 +287,7 @@ async def test_authorize_invalid_client_returns_error(
assert query_params["error"][0] == "invalid_client"
@pytest_asyncio.fixture
@pytest.fixture
async def inactive_oauth_app(test_user: str):
"""Create an inactive test OAuth application in the database."""
app_id = str(uuid.uuid4())
@@ -1005,7 +1004,7 @@ async def test_token_refresh_revoked(
assert "revoked" in response.json()["detail"].lower()
@pytest_asyncio.fixture
@pytest.fixture
async def other_oauth_app(test_user: str):
"""Create a second OAuth application for cross-app tests."""
app_id = str(uuid.uuid4())

View File

@@ -1552,7 +1552,7 @@ async def review_store_submission(
# Generate embedding for approved listing (blocking - admin operation)
# Inside transaction: if embedding fails, entire transaction rolls back
await ensure_embedding(
embedding_success = await ensure_embedding(
version_id=store_listing_version_id,
name=store_listing_version.name,
description=store_listing_version.description,
@@ -1560,6 +1560,12 @@ async def review_store_submission(
categories=store_listing_version.categories or [],
tx=tx,
)
if not embedding_success:
raise ValueError(
f"Failed to generate embedding for listing {store_listing_version_id}. "
"This is likely due to OpenAI API being unavailable. "
"Please try again later or contact support if the issue persists."
)
await prisma.models.StoreListing.prisma(tx).update(
where={"id": store_listing_version.StoreListing.id},

View File

@@ -21,6 +21,7 @@ from backend.util.json import dumps
logger = logging.getLogger(__name__)
# OpenAI embedding model configuration
EMBEDDING_MODEL = "text-embedding-3-small"
# Embedding dimension for the model above
@@ -62,42 +63,49 @@ def build_searchable_text(
return " ".join(parts)
async def generate_embedding(text: str) -> list[float]:
async def generate_embedding(text: str) -> list[float] | None:
"""
Generate embedding for text using OpenAI API.
Raises exceptions on failure - caller should handle.
Returns None if embedding generation fails.
Fail-fast: no retries to maintain consistency with approval flow.
"""
client = get_openai_client()
if not client:
raise RuntimeError("openai_internal_api_key not set, cannot generate embedding")
try:
client = get_openai_client()
if not client:
logger.error("openai_internal_api_key not set, cannot generate embedding")
return None
# Truncate text to token limit using tiktoken
# Character-based truncation is insufficient because token ratios vary by content type
enc = encoding_for_model(EMBEDDING_MODEL)
tokens = enc.encode(text)
if len(tokens) > EMBEDDING_MAX_TOKENS:
tokens = tokens[:EMBEDDING_MAX_TOKENS]
truncated_text = enc.decode(tokens)
logger.info(
f"Truncated text from {len(enc.encode(text))} to {len(tokens)} tokens"
# Truncate text to token limit using tiktoken
# Character-based truncation is insufficient because token ratios vary by content type
enc = encoding_for_model(EMBEDDING_MODEL)
tokens = enc.encode(text)
if len(tokens) > EMBEDDING_MAX_TOKENS:
tokens = tokens[:EMBEDDING_MAX_TOKENS]
truncated_text = enc.decode(tokens)
logger.info(
f"Truncated text from {len(enc.encode(text))} to {len(tokens)} tokens"
)
else:
truncated_text = text
start_time = time.time()
response = await client.embeddings.create(
model=EMBEDDING_MODEL,
input=truncated_text,
)
else:
truncated_text = text
latency_ms = (time.time() - start_time) * 1000
start_time = time.time()
response = await client.embeddings.create(
model=EMBEDDING_MODEL,
input=truncated_text,
)
latency_ms = (time.time() - start_time) * 1000
embedding = response.data[0].embedding
logger.info(
f"Generated embedding: {len(embedding)} dims, "
f"{len(tokens)} tokens, {latency_ms:.0f}ms"
)
return embedding
embedding = response.data[0].embedding
logger.info(
f"Generated embedding: {len(embedding)} dims, "
f"{len(tokens)} tokens, {latency_ms:.0f}ms"
)
return embedding
except Exception as e:
logger.error(f"Failed to generate embedding: {e}")
return None
async def store_embedding(
@@ -136,45 +144,48 @@ async def store_content_embedding(
New function for unified content embedding storage.
Uses raw SQL since Prisma doesn't natively support pgvector.
Raises exceptions on failure - caller should handle.
"""
client = tx if tx else prisma.get_client()
try:
client = tx if tx else prisma.get_client()
# Convert embedding to PostgreSQL vector format
embedding_str = embedding_to_vector_string(embedding)
metadata_json = dumps(metadata or {})
# Convert embedding to PostgreSQL vector format
embedding_str = embedding_to_vector_string(embedding)
metadata_json = dumps(metadata or {})
# Upsert the embedding
# WHERE clause in DO UPDATE prevents PostgreSQL 15 bug with NULLS NOT DISTINCT
# Use unqualified ::vector - pgvector is in search_path on all environments
await execute_raw_with_schema(
"""
INSERT INTO {schema_prefix}"UnifiedContentEmbedding" (
"id", "contentType", "contentId", "userId", "embedding", "searchableText", "metadata", "createdAt", "updatedAt"
# Upsert the embedding
# WHERE clause in DO UPDATE prevents PostgreSQL 15 bug with NULLS NOT DISTINCT
# Use unqualified ::vector - pgvector is in search_path on all environments
await execute_raw_with_schema(
"""
INSERT INTO {schema_prefix}"UnifiedContentEmbedding" (
"id", "contentType", "contentId", "userId", "embedding", "searchableText", "metadata", "createdAt", "updatedAt"
)
VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::vector, $5, $6::jsonb, NOW(), NOW())
ON CONFLICT ("contentType", "contentId", "userId")
DO UPDATE SET
"embedding" = $4::vector,
"searchableText" = $5,
"metadata" = $6::jsonb,
"updatedAt" = NOW()
WHERE {schema_prefix}"UnifiedContentEmbedding"."contentType" = $1::{schema_prefix}"ContentType"
AND {schema_prefix}"UnifiedContentEmbedding"."contentId" = $2
AND ({schema_prefix}"UnifiedContentEmbedding"."userId" = $3 OR ($3 IS NULL AND {schema_prefix}"UnifiedContentEmbedding"."userId" IS NULL))
""",
content_type,
content_id,
user_id,
embedding_str,
searchable_text,
metadata_json,
client=client,
)
VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::vector, $5, $6::jsonb, NOW(), NOW())
ON CONFLICT ("contentType", "contentId", "userId")
DO UPDATE SET
"embedding" = $4::vector,
"searchableText" = $5,
"metadata" = $6::jsonb,
"updatedAt" = NOW()
WHERE {schema_prefix}"UnifiedContentEmbedding"."contentType" = $1::{schema_prefix}"ContentType"
AND {schema_prefix}"UnifiedContentEmbedding"."contentId" = $2
AND ({schema_prefix}"UnifiedContentEmbedding"."userId" = $3 OR ($3 IS NULL AND {schema_prefix}"UnifiedContentEmbedding"."userId" IS NULL))
""",
content_type,
content_id,
user_id,
embedding_str,
searchable_text,
metadata_json,
client=client,
)
logger.info(f"Stored embedding for {content_type}:{content_id}")
return True
logger.info(f"Stored embedding for {content_type}:{content_id}")
return True
except Exception as e:
logger.error(f"Failed to store embedding for {content_type}:{content_id}: {e}")
return False
async def get_embedding(version_id: str) -> dict[str, Any] | None:
@@ -206,31 +217,34 @@ async def get_content_embedding(
New function for unified content embedding retrieval.
Returns dict with contentType, contentId, embedding, timestamps or None if not found.
Raises exceptions on failure - caller should handle.
"""
result = await query_raw_with_schema(
"""
SELECT
"contentType",
"contentId",
"userId",
"embedding"::text as "embedding",
"searchableText",
"metadata",
"createdAt",
"updatedAt"
FROM {schema_prefix}"UnifiedContentEmbedding"
WHERE "contentType" = $1::{schema_prefix}"ContentType" AND "contentId" = $2 AND ("userId" = $3 OR ($3 IS NULL AND "userId" IS NULL))
""",
content_type,
content_id,
user_id,
)
try:
result = await query_raw_with_schema(
"""
SELECT
"contentType",
"contentId",
"userId",
"embedding"::text as "embedding",
"searchableText",
"metadata",
"createdAt",
"updatedAt"
FROM {schema_prefix}"UnifiedContentEmbedding"
WHERE "contentType" = $1::{schema_prefix}"ContentType" AND "contentId" = $2 AND ("userId" = $3 OR ($3 IS NULL AND "userId" IS NULL))
""",
content_type,
content_id,
user_id,
)
if result and len(result) > 0:
return result[0]
return None
if result and len(result) > 0:
return result[0]
return None
except Exception as e:
logger.error(f"Failed to get embedding for {content_type}:{content_id}: {e}")
return None
async def ensure_embedding(
@@ -258,38 +272,46 @@ async def ensure_embedding(
tx: Optional transaction client
Returns:
True if embedding exists/was created
Raises exceptions on failure - caller should handle.
True if embedding exists/was created, False on failure
"""
# Check if embedding already exists
if not force:
existing = await get_embedding(version_id)
if existing and existing.get("embedding"):
logger.debug(f"Embedding for version {version_id} already exists")
return True
try:
# Check if embedding already exists
if not force:
existing = await get_embedding(version_id)
if existing and existing.get("embedding"):
logger.debug(f"Embedding for version {version_id} already exists")
return True
# Build searchable text for embedding
searchable_text = build_searchable_text(name, description, sub_heading, categories)
# Build searchable text for embedding
searchable_text = build_searchable_text(
name, description, sub_heading, categories
)
# Generate new embedding
embedding = await generate_embedding(searchable_text)
# Generate new embedding
embedding = await generate_embedding(searchable_text)
if embedding is None:
logger.warning(f"Could not generate embedding for version {version_id}")
return False
# Store the embedding with metadata using new function
metadata = {
"name": name,
"subHeading": sub_heading,
"categories": categories,
}
return await store_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id=version_id,
embedding=embedding,
searchable_text=searchable_text,
metadata=metadata,
user_id=None, # Store agents are public
tx=tx,
)
# Store the embedding with metadata using new function
metadata = {
"name": name,
"subHeading": sub_heading,
"categories": categories,
}
return await store_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id=version_id,
embedding=embedding,
searchable_text=searchable_text,
metadata=metadata,
user_id=None, # Store agents are public
tx=tx,
)
except Exception as e:
logger.error(f"Failed to ensure embedding for version {version_id}: {e}")
return False
async def delete_embedding(version_id: str) -> bool:
@@ -499,24 +521,6 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
success = sum(1 for result in results if result is True)
failed = len(results) - success
# Aggregate unique errors to avoid Sentry spam
if failed > 0:
# Group errors by type and message
error_summary: dict[str, int] = {}
for result in results:
if isinstance(result, Exception):
error_key = f"{type(result).__name__}: {str(result)}"
error_summary[error_key] = error_summary.get(error_key, 0) + 1
# Log aggregated error summary
error_details = ", ".join(
f"{error} ({count}x)" for error, count in error_summary.items()
)
logger.error(
f"{content_type.value}: {failed}/{len(results)} embeddings failed. "
f"Errors: {error_details}"
)
results_by_type[content_type.value] = {
"processed": len(missing_items),
"success": success,
@@ -553,12 +557,11 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
}
async def embed_query(query: str) -> list[float]:
async def embed_query(query: str) -> list[float] | None:
"""
Generate embedding for a search query.
Same as generate_embedding but with clearer intent.
Raises exceptions on failure - caller should handle.
"""
return await generate_embedding(query)
@@ -591,30 +594,40 @@ async def ensure_content_embedding(
tx: Optional transaction client
Returns:
True if embedding exists/was created
Raises exceptions on failure - caller should handle.
True if embedding exists/was created, False on failure
"""
# Check if embedding already exists
if not force:
existing = await get_content_embedding(content_type, content_id, user_id)
if existing and existing.get("embedding"):
logger.debug(f"Embedding for {content_type}:{content_id} already exists")
return True
try:
# Check if embedding already exists
if not force:
existing = await get_content_embedding(content_type, content_id, user_id)
if existing and existing.get("embedding"):
logger.debug(
f"Embedding for {content_type}:{content_id} already exists"
)
return True
# Generate new embedding
embedding = await generate_embedding(searchable_text)
# Generate new embedding
embedding = await generate_embedding(searchable_text)
if embedding is None:
logger.warning(
f"Could not generate embedding for {content_type}:{content_id}"
)
return False
# Store the embedding
return await store_content_embedding(
content_type=content_type,
content_id=content_id,
embedding=embedding,
searchable_text=searchable_text,
metadata=metadata or {},
user_id=user_id,
tx=tx,
)
# Store the embedding
return await store_content_embedding(
content_type=content_type,
content_id=content_id,
embedding=embedding,
searchable_text=searchable_text,
metadata=metadata or {},
user_id=user_id,
tx=tx,
)
except Exception as e:
logger.error(f"Failed to ensure embedding for {content_type}:{content_id}: {e}")
return False
async def cleanup_orphaned_embeddings() -> dict[str, Any]:
@@ -841,8 +854,9 @@ async def semantic_search(
limit = 100
# Generate query embedding
try:
query_embedding = await embed_query(query)
query_embedding = await embed_query(query)
if query_embedding is not None:
# Semantic search with embeddings
embedding_str = embedding_to_vector_string(query_embedding)
@@ -893,21 +907,24 @@ async def semantic_search(
"""
)
results = await query_raw_with_schema(sql, *params)
return [
{
"content_id": row["content_id"],
"content_type": row["content_type"],
"searchable_text": row["searchable_text"],
"metadata": row["metadata"],
"similarity": float(row["similarity"]),
}
for row in results
]
except Exception as e:
logger.warning(f"Semantic search failed, falling back to lexical search: {e}")
try:
results = await query_raw_with_schema(sql, *params)
return [
{
"content_id": row["content_id"],
"content_type": row["content_type"],
"searchable_text": row["searchable_text"],
"metadata": row["metadata"],
"similarity": float(row["similarity"]),
}
for row in results
]
except Exception as e:
logger.error(f"Semantic search failed: {e}")
# Fall through to lexical search below
# Fallback to lexical search if embeddings unavailable
logger.warning("Falling back to lexical search (embeddings unavailable)")
params_lexical: list[Any] = [limit]
user_filter = ""

View File

@@ -298,16 +298,17 @@ async def test_schema_handling_error_cases():
mock_client.execute_raw.side_effect = Exception("Database error")
mock_get_client.return_value = mock_client
# Should raise exception on error
with pytest.raises(Exception, match="Database error"):
await embeddings.store_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id="test-id",
embedding=[0.1] * EMBEDDING_DIM,
searchable_text="test",
metadata=None,
user_id=None,
)
result = await embeddings.store_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id="test-id",
embedding=[0.1] * EMBEDDING_DIM,
searchable_text="test",
metadata=None,
user_id=None,
)
# Should return False on error, not raise
assert result is False
if __name__ == "__main__":

View File

@@ -80,8 +80,9 @@ async def test_generate_embedding_no_api_key():
) as mock_get_client:
mock_get_client.return_value = None
with pytest.raises(RuntimeError, match="openai_internal_api_key not set"):
await embeddings.generate_embedding("test text")
result = await embeddings.generate_embedding("test text")
assert result is None
@pytest.mark.asyncio(loop_scope="session")
@@ -96,8 +97,9 @@ async def test_generate_embedding_api_error():
) as mock_get_client:
mock_get_client.return_value = mock_client
with pytest.raises(Exception, match="API Error"):
await embeddings.generate_embedding("test text")
result = await embeddings.generate_embedding("test text")
assert result is None
@pytest.mark.asyncio(loop_scope="session")
@@ -171,10 +173,11 @@ async def test_store_embedding_database_error(mocker):
embedding = [0.1, 0.2, 0.3]
with pytest.raises(Exception, match="Database error"):
await embeddings.store_embedding(
version_id="test-version-id", embedding=embedding, tx=mock_client
)
result = await embeddings.store_embedding(
version_id="test-version-id", embedding=embedding, tx=mock_client
)
assert result is False
@pytest.mark.asyncio(loop_scope="session")
@@ -274,16 +277,17 @@ async def test_ensure_embedding_create_new(mock_get, mock_store, mock_generate):
async def test_ensure_embedding_generation_fails(mock_get, mock_generate):
"""Test ensure_embedding when generation fails."""
mock_get.return_value = None
mock_generate.side_effect = Exception("Generation failed")
mock_generate.return_value = None
with pytest.raises(Exception, match="Generation failed"):
await embeddings.ensure_embedding(
version_id="test-id",
name="Test",
description="Test description",
sub_heading="Test heading",
categories=["test"],
)
result = await embeddings.ensure_embedding(
version_id="test-id",
name="Test",
description="Test description",
sub_heading="Test heading",
categories=["test"],
)
assert result is False
@pytest.mark.asyncio(loop_scope="session")

View File

@@ -186,12 +186,13 @@ async def unified_hybrid_search(
offset = (page - 1) * page_size
# Generate query embedding with graceful degradation
try:
query_embedding = await embed_query(query)
except Exception as e:
# Generate query embedding
query_embedding = await embed_query(query)
# Graceful degradation if embedding unavailable
if query_embedding is None or not query_embedding:
logger.warning(
f"Failed to generate query embedding - falling back to lexical-only search: {e}. "
"Failed to generate query embedding - falling back to lexical-only search. "
"Check that openai_internal_api_key is configured and OpenAI API is accessible."
)
query_embedding = [0.0] * EMBEDDING_DIM
@@ -463,12 +464,13 @@ async def hybrid_search(
offset = (page - 1) * page_size
# Generate query embedding with graceful degradation
try:
query_embedding = await embed_query(query)
except Exception as e:
# Generate query embedding
query_embedding = await embed_query(query)
# Graceful degradation
if query_embedding is None or not query_embedding:
logger.warning(
f"Failed to generate query embedding - falling back to lexical-only search: {e}"
"Failed to generate query embedding - falling back to lexical-only search."
)
query_embedding = [0.0] * EMBEDDING_DIM
total_non_semantic = (
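
With exceptions gone from `embed_query`, failure is signalled by a `None` return, and both search paths substitute a zero vector of length `EMBEDDING_DIM`. The reason this is safe: under a dot-product similarity, a zero query vector scores zero against every document, so the semantic component of the hybrid score contributes nothing and ranking degrades to lexical signals only. The exact operator and weighting in the real SQL are assumptions in this sketch:

```python
# Assumed hybrid score shape: the real ranking SQL may weight and normalise
# differently, but any dot-product term vanishes for a zero query vector.
EMBEDDING_DIM = 8  # stand-in; the real constant is imported from the module

def hybrid_score(query_emb: list[float], doc_emb: list[float],
                 lexical: float, w_sem: float = 0.7, w_lex: float = 0.3) -> float:
    semantic = sum(q * d for q, d in zip(query_emb, doc_emb))
    return w_sem * semantic + w_lex * lexical

zero_query = [0.0] * EMBEDDING_DIM  # the fallback used above
doc = [0.3] * EMBEDDING_DIM
print(hybrid_score(zero_query, doc, lexical=0.8))  # 0.24 -> lexical-only ranking
```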

View File

@@ -172,8 +172,8 @@ async def test_hybrid_search_without_embeddings():
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
# Simulate embedding failure by raising exception
mock_embed.side_effect = Exception("Embedding generation failed")
# Simulate embedding failure
mock_embed.return_value = None
mock_query.return_value = mock_results
# Should NOT raise - graceful degradation
@@ -613,9 +613,7 @@ async def test_unified_hybrid_search_graceful_degradation():
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_query.return_value = mock_results
mock_embed.side_effect = Exception(
"Embedding generation failed"
) # Embedding failure
mock_embed.return_value = None # Embedding failure
# Should NOT raise - graceful degradation
results, total = await unified_hybrid_search(

View File

@@ -116,7 +116,6 @@ class PrintToConsoleBlock(Block):
input_schema=PrintToConsoleBlock.Input,
output_schema=PrintToConsoleBlock.Output,
test_input={"text": "Hello, World!"},
is_sensitive_action=True,
test_output=[
("output", "Hello, World!"),
("status", "printed"),

View File

@@ -1,659 +0,0 @@
import json
import shlex
import uuid
from typing import Literal, Optional
from e2b import AsyncSandbox as BaseAsyncSandbox
from pydantic import BaseModel, SecretStr
from backend.data.block import (
Block,
BlockCategory,
BlockOutput,
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.model import (
APIKeyCredentials,
CredentialsField,
CredentialsMetaInput,
SchemaField,
)
from backend.integrations.providers import ProviderName
class ClaudeCodeExecutionError(Exception):
"""Exception raised when Claude Code execution fails.
Carries the sandbox_id so it can be returned to the user for cleanup
when dispose_sandbox=False.
"""
def __init__(self, message: str, sandbox_id: str = ""):
super().__init__(message)
self.sandbox_id = sandbox_id
# Test credentials for E2B
TEST_E2B_CREDENTIALS = APIKeyCredentials(
id="01234567-89ab-cdef-0123-456789abcdef",
provider="e2b",
api_key=SecretStr("mock-e2b-api-key"),
title="Mock E2B API key",
expires_at=None,
)
TEST_E2B_CREDENTIALS_INPUT = {
"provider": TEST_E2B_CREDENTIALS.provider,
"id": TEST_E2B_CREDENTIALS.id,
"type": TEST_E2B_CREDENTIALS.type,
"title": TEST_E2B_CREDENTIALS.title,
}
# Test credentials for Anthropic
TEST_ANTHROPIC_CREDENTIALS = APIKeyCredentials(
id="2e568a2b-b2ea-475a-8564-9a676bf31c56",
provider="anthropic",
api_key=SecretStr("mock-anthropic-api-key"),
title="Mock Anthropic API key",
expires_at=None,
)
TEST_ANTHROPIC_CREDENTIALS_INPUT = {
"provider": TEST_ANTHROPIC_CREDENTIALS.provider,
"id": TEST_ANTHROPIC_CREDENTIALS.id,
"type": TEST_ANTHROPIC_CREDENTIALS.type,
"title": TEST_ANTHROPIC_CREDENTIALS.title,
}
class ClaudeCodeBlock(Block):
"""
Execute tasks using Claude Code (Anthropic's AI coding assistant) in an E2B sandbox.
Claude Code can create files, install tools, run commands, and perform complex
coding tasks autonomously within a secure sandbox environment.
"""
# Use base template - we'll install Claude Code ourselves for latest version
DEFAULT_TEMPLATE = "base"
class Input(BlockSchemaInput):
e2b_credentials: CredentialsMetaInput[
Literal[ProviderName.E2B], Literal["api_key"]
] = CredentialsField(
description=(
"API key for the E2B platform to create the sandbox. "
"Get one at the [e2b website](https://e2b.dev/docs)"
),
)
anthropic_credentials: CredentialsMetaInput[
Literal[ProviderName.ANTHROPIC], Literal["api_key"]
] = CredentialsField(
description=(
"API key for Anthropic to power Claude Code. "
"Get one at [Anthropic's website](https://console.anthropic.com)"
),
)
prompt: str = SchemaField(
description=(
"The task or instruction for Claude Code to execute. "
"Claude Code can create files, install packages, run commands, "
"and perform complex coding tasks."
),
placeholder="Create a hello world index.html file",
default="",
advanced=False,
)
timeout: int = SchemaField(
description=(
"Sandbox timeout in seconds. Claude Code tasks can take "
"a while, so set this appropriately for your task complexity. "
"Note: This only applies when creating a new sandbox. "
"When reconnecting to an existing sandbox via sandbox_id, "
"the original timeout is retained."
),
default=300, # 5 minutes default
advanced=True,
)
setup_commands: list[str] = SchemaField(
description=(
"Optional shell commands to run before executing Claude Code. "
"Useful for installing dependencies or setting up the environment."
),
default_factory=list,
advanced=True,
)
working_directory: str = SchemaField(
description="Working directory for Claude Code to operate in.",
default="/home/user",
advanced=True,
)
# Session/continuation support
session_id: str = SchemaField(
description=(
"Session ID to resume a previous conversation. "
"Leave empty for a new conversation. "
"Use the session_id from a previous run to continue that conversation."
),
default="",
advanced=True,
)
sandbox_id: str = SchemaField(
description=(
"Sandbox ID to reconnect to an existing sandbox. "
"Required when resuming a session (along with session_id). "
"Use the sandbox_id from a previous run where dispose_sandbox was False."
),
default="",
advanced=True,
)
conversation_history: str = SchemaField(
description=(
"Previous conversation history to continue from. "
"Use this to restore context on a fresh sandbox if the previous one timed out. "
"Pass the conversation_history output from a previous run."
),
default="",
advanced=True,
)
dispose_sandbox: bool = SchemaField(
description=(
"Whether to dispose of the sandbox immediately after execution. "
"Set to False if you want to continue the conversation later "
"(you'll need both sandbox_id and session_id from the output)."
),
default=True,
advanced=True,
)
class FileOutput(BaseModel):
"""A file extracted from the sandbox."""
path: str
relative_path: str # Path relative to working directory (for GitHub, etc.)
name: str
content: str
class Output(BlockSchemaOutput):
response: str = SchemaField(
description="The output/response from Claude Code execution"
)
files: list["ClaudeCodeBlock.FileOutput"] = SchemaField(
description=(
"List of text files created/modified by Claude Code during this execution. "
"Each file has 'path', 'relative_path', 'name', and 'content' fields."
)
)
conversation_history: str = SchemaField(
description=(
"Full conversation history including this turn. "
"Pass this to conversation_history input to continue on a fresh sandbox "
"if the previous sandbox timed out."
)
)
session_id: str = SchemaField(
description=(
"Session ID for this conversation. "
"Pass this back along with sandbox_id to continue the conversation."
)
)
sandbox_id: Optional[str] = SchemaField(
description=(
"ID of the sandbox instance. "
"Pass this back along with session_id to continue the conversation. "
"This is None if dispose_sandbox was True (sandbox was disposed)."
),
default=None,
)
error: str = SchemaField(description="Error message if execution failed")
def __init__(self):
super().__init__(
id="4e34f4a5-9b89-4326-ba77-2dd6750b7194",
description=(
"Execute tasks using Claude Code in an E2B sandbox. "
"Claude Code can create files, install tools, run commands, "
"and perform complex coding tasks autonomously."
),
categories={BlockCategory.DEVELOPER_TOOLS, BlockCategory.AI},
input_schema=ClaudeCodeBlock.Input,
output_schema=ClaudeCodeBlock.Output,
test_credentials={
"e2b_credentials": TEST_E2B_CREDENTIALS,
"anthropic_credentials": TEST_ANTHROPIC_CREDENTIALS,
},
test_input={
"e2b_credentials": TEST_E2B_CREDENTIALS_INPUT,
"anthropic_credentials": TEST_ANTHROPIC_CREDENTIALS_INPUT,
"prompt": "Create a hello world HTML file",
"timeout": 300,
"setup_commands": [],
"working_directory": "/home/user",
"session_id": "",
"sandbox_id": "",
"conversation_history": "",
"dispose_sandbox": True,
},
test_output=[
("response", "Created index.html with hello world content"),
(
"files",
[
{
"path": "/home/user/index.html",
"relative_path": "index.html",
"name": "index.html",
"content": "<html>Hello World</html>",
}
],
),
(
"conversation_history",
"User: Create a hello world HTML file\n"
"Claude: Created index.html with hello world content",
),
("session_id", str),
("sandbox_id", None), # None because dispose_sandbox=True in test_input
],
test_mock={
"execute_claude_code": lambda *args, **kwargs: (
"Created index.html with hello world content", # response
[
ClaudeCodeBlock.FileOutput(
path="/home/user/index.html",
relative_path="index.html",
name="index.html",
content="<html>Hello World</html>",
)
], # files
"User: Create a hello world HTML file\n"
"Claude: Created index.html with hello world content", # conversation_history
"test-session-id", # session_id
"sandbox_id", # sandbox_id
),
},
)
async def execute_claude_code(
self,
e2b_api_key: str,
anthropic_api_key: str,
prompt: str,
timeout: int,
setup_commands: list[str],
working_directory: str,
session_id: str,
existing_sandbox_id: str,
conversation_history: str,
dispose_sandbox: bool,
) -> tuple[str, list["ClaudeCodeBlock.FileOutput"], str, str, str]:
"""
Execute Claude Code in an E2B sandbox.
Returns:
Tuple of (response, files, conversation_history, session_id, sandbox_id)
"""
# Validate that sandbox_id is provided when resuming a session
if session_id and not existing_sandbox_id:
raise ValueError(
"sandbox_id is required when resuming a session with session_id. "
"The session state is stored in the original sandbox. "
"If the sandbox has timed out, use conversation_history instead "
"to restore context on a fresh sandbox."
)
sandbox = None
sandbox_id = ""
try:
# Either reconnect to existing sandbox or create a new one
if existing_sandbox_id:
# Reconnect to existing sandbox for conversation continuation
sandbox = await BaseAsyncSandbox.connect(
sandbox_id=existing_sandbox_id,
api_key=e2b_api_key,
)
else:
# Create new sandbox
sandbox = await BaseAsyncSandbox.create(
template=self.DEFAULT_TEMPLATE,
api_key=e2b_api_key,
timeout=timeout,
envs={"ANTHROPIC_API_KEY": anthropic_api_key},
)
# Install Claude Code from npm (ensures we get the latest version)
install_result = await sandbox.commands.run(
"npm install -g @anthropic-ai/claude-code@latest",
timeout=120, # 2 min timeout for install
)
if install_result.exit_code != 0:
raise Exception(
f"Failed to install Claude Code: {install_result.stderr}"
)
# Run any user-provided setup commands
for cmd in setup_commands:
setup_result = await sandbox.commands.run(cmd)
if setup_result.exit_code != 0:
raise Exception(
f"Setup command failed: {cmd}\n"
f"Exit code: {setup_result.exit_code}\n"
f"Stdout: {setup_result.stdout}\n"
f"Stderr: {setup_result.stderr}"
)
# Capture sandbox_id immediately after creation/connection
# so it's available for error recovery if dispose_sandbox=False
sandbox_id = sandbox.sandbox_id
# Generate or use provided session ID
current_session_id = session_id if session_id else str(uuid.uuid4())
# Build base Claude flags
base_flags = "-p --dangerously-skip-permissions --output-format json"
# Add conversation history context if provided (for fresh sandbox continuation)
history_flag = ""
if conversation_history and not session_id:
# Inject previous conversation as context via system prompt
# Use consistent escaping via _escape_prompt helper
escaped_history = self._escape_prompt(
f"Previous conversation context: {conversation_history}"
)
history_flag = f" --append-system-prompt {escaped_history}"
# Build Claude command based on whether we're resuming or starting new
# Use shlex.quote for working_directory and session IDs to prevent injection
safe_working_dir = shlex.quote(working_directory)
if session_id:
# Resuming existing session (sandbox still alive)
safe_session_id = shlex.quote(session_id)
claude_command = (
f"cd {safe_working_dir} && "
f"echo {self._escape_prompt(prompt)} | "
f"claude --resume {safe_session_id} {base_flags}"
)
else:
# New session with specific ID
safe_current_session_id = shlex.quote(current_session_id)
claude_command = (
f"cd {safe_working_dir} && "
f"echo {self._escape_prompt(prompt)} | "
f"claude --session-id {safe_current_session_id} {base_flags}{history_flag}"
)
# Capture timestamp before running Claude Code to filter files later
# Capture timestamp 1 second in the past to avoid race condition with file creation
timestamp_result = await sandbox.commands.run(
"date -u -d '1 second ago' +%Y-%m-%dT%H:%M:%S"
)
if timestamp_result.exit_code != 0:
raise RuntimeError(
f"Failed to capture timestamp: {timestamp_result.stderr}"
)
start_timestamp = (
timestamp_result.stdout.strip() if timestamp_result.stdout else None
)
result = await sandbox.commands.run(
claude_command,
timeout=0, # No command timeout - let sandbox timeout handle it
)
# Check for command failure
if result.exit_code != 0:
error_msg = result.stderr or result.stdout or "Unknown error"
raise Exception(
f"Claude Code command failed with exit code {result.exit_code}:\n"
f"{error_msg}"
)
raw_output = result.stdout or ""
# Parse JSON output to extract response and build conversation history
response = ""
new_conversation_history = conversation_history or ""
try:
# The JSON output contains the result
output_data = json.loads(raw_output)
response = output_data.get("result", raw_output)
# Build conversation history entry
turn_entry = f"User: {prompt}\nClaude: {response}"
if new_conversation_history:
new_conversation_history = (
f"{new_conversation_history}\n\n{turn_entry}"
)
else:
new_conversation_history = turn_entry
except json.JSONDecodeError:
# If not valid JSON, use raw output
response = raw_output
turn_entry = f"User: {prompt}\nClaude: {response}"
if new_conversation_history:
new_conversation_history = (
f"{new_conversation_history}\n\n{turn_entry}"
)
else:
new_conversation_history = turn_entry
# Extract files created/modified during this run
files = await self._extract_files(
sandbox, working_directory, start_timestamp
)
return (
response,
files,
new_conversation_history,
current_session_id,
sandbox_id,
)
except Exception as e:
# Wrap exception with sandbox_id so caller can access/cleanup
# the preserved sandbox when dispose_sandbox=False
raise ClaudeCodeExecutionError(str(e), sandbox_id) from e
finally:
if dispose_sandbox and sandbox:
await sandbox.kill()
async def _extract_files(
self,
sandbox: BaseAsyncSandbox,
working_directory: str,
since_timestamp: str | None = None,
) -> list["ClaudeCodeBlock.FileOutput"]:
"""
Extract text files created/modified during this Claude Code execution.
Args:
sandbox: The E2B sandbox instance
working_directory: Directory to search for files
since_timestamp: ISO timestamp - only return files modified after this time
Returns:
List of FileOutput objects with path, relative_path, name, and content
"""
files: list[ClaudeCodeBlock.FileOutput] = []
# Text file extensions we can safely read as text
text_extensions = {
".txt",
".md",
".html",
".htm",
".css",
".js",
".ts",
".jsx",
".tsx",
".json",
".xml",
".yaml",
".yml",
".toml",
".ini",
".cfg",
".conf",
".py",
".rb",
".php",
".java",
".c",
".cpp",
".h",
".hpp",
".cs",
".go",
".rs",
".swift",
".kt",
".scala",
".sh",
".bash",
".zsh",
".sql",
".graphql",
".env",
".gitignore",
".dockerfile",
"Dockerfile",
".vue",
".svelte",
".astro",
".mdx",
".rst",
".tex",
".csv",
".log",
}
try:
# List files recursively using find command
# Exclude node_modules and .git directories, but allow hidden files
# like .env and .gitignore (they're filtered by text_extensions later)
# Filter by timestamp to only get files created/modified during this run
safe_working_dir = shlex.quote(working_directory)
timestamp_filter = ""
if since_timestamp:
timestamp_filter = f"-newermt {shlex.quote(since_timestamp)} "
find_result = await sandbox.commands.run(
f"find {safe_working_dir} -type f "
f"{timestamp_filter}"
f"-not -path '*/node_modules/*' "
f"-not -path '*/.git/*' "
f"2>/dev/null"
)
if find_result.stdout:
for file_path in find_result.stdout.strip().split("\n"):
if not file_path:
continue
# Check if it's a text file we can read
is_text = any(
file_path.endswith(ext) for ext in text_extensions
) or file_path.endswith("Dockerfile")
if is_text:
try:
content = await sandbox.files.read(file_path)
# Handle bytes or string
if isinstance(content, bytes):
content = content.decode("utf-8", errors="replace")
# Extract filename from path
file_name = file_path.split("/")[-1]
# Calculate relative path by stripping working directory
relative_path = file_path
if file_path.startswith(working_directory):
relative_path = file_path[len(working_directory) :]
# Remove leading slash if present
if relative_path.startswith("/"):
relative_path = relative_path[1:]
files.append(
ClaudeCodeBlock.FileOutput(
path=file_path,
relative_path=relative_path,
name=file_name,
content=content,
)
)
except Exception:
# Skip files that can't be read
pass
except Exception:
# If file extraction fails, return empty results
pass
return files
def _escape_prompt(self, prompt: str) -> str:
"""Escape the prompt for safe shell execution."""
# Use single quotes and escape any single quotes in the prompt
escaped = prompt.replace("'", "'\"'\"'")
return f"'{escaped}'"
async def run(
self,
input_data: Input,
*,
e2b_credentials: APIKeyCredentials,
anthropic_credentials: APIKeyCredentials,
**kwargs,
) -> BlockOutput:
try:
(
response,
files,
conversation_history,
session_id,
sandbox_id,
) = await self.execute_claude_code(
e2b_api_key=e2b_credentials.api_key.get_secret_value(),
anthropic_api_key=anthropic_credentials.api_key.get_secret_value(),
prompt=input_data.prompt,
timeout=input_data.timeout,
setup_commands=input_data.setup_commands,
working_directory=input_data.working_directory,
session_id=input_data.session_id,
existing_sandbox_id=input_data.sandbox_id,
conversation_history=input_data.conversation_history,
dispose_sandbox=input_data.dispose_sandbox,
)
yield "response", response
# Always yield files (empty list if none) to match Output schema
yield "files", [f.model_dump() for f in files]
# Always yield conversation_history so user can restore context on fresh sandbox
yield "conversation_history", conversation_history
# Always yield session_id so user can continue conversation
yield "session_id", session_id
# Always yield sandbox_id (None if disposed) to match Output schema
yield "sandbox_id", sandbox_id if not input_data.dispose_sandbox else None
except ClaudeCodeExecutionError as e:
yield "error", str(e)
# If sandbox was preserved (dispose_sandbox=False), yield sandbox_id
# so user can reconnect to or clean up the orphaned sandbox
if not input_data.dispose_sandbox and e.sandbox_id:
yield "sandbox_id", e.sandbox_id
except Exception as e:
yield "error", str(e)
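
One detail of this now-deleted block worth calling out is `_escape_prompt`, which makes arbitrary prompts safe to pipe through the sandbox shell. It uses the standard POSIX single-quote idiom: close the quoted string, emit the quote inside double quotes, then reopen. A quick demonstration:

```python
# The quoting idiom used by _escape_prompt above.
def escape_prompt(prompt: str) -> str:
    escaped = prompt.replace("'", "'\"'\"'")
    return f"'{escaped}'"

print(escape_prompt("say 'hello'"))
# 'say '"'"'hello'"'"''  -- safe to interpolate into: echo <prompt> | claude ...
```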

View File

@@ -9,7 +9,7 @@ from typing import Any, Optional
from prisma.enums import ReviewStatus
from pydantic import BaseModel
from backend.data.execution import ExecutionStatus
from backend.data.execution import ExecutionContext, ExecutionStatus
from backend.data.human_review import ReviewResult
from backend.executor.manager import async_update_node_execution_status
from backend.util.clients import get_database_manager_async_client
@@ -28,11 +28,6 @@ class ReviewDecision(BaseModel):
class HITLReviewHelper:
"""Helper class for Human-In-The-Loop review operations."""
@staticmethod
async def check_approval(**kwargs) -> Optional[ReviewResult]:
"""Check if there's an existing approval for this node execution."""
return await get_database_manager_async_client().check_approval(**kwargs)
@staticmethod
async def get_or_create_human_review(**kwargs) -> Optional[ReviewResult]:
"""Create or retrieve a human review from the database."""
@@ -60,11 +55,11 @@ class HITLReviewHelper:
async def _handle_review_request(
input_data: Any,
user_id: str,
node_id: str,
node_exec_id: str,
graph_exec_id: str,
graph_id: str,
graph_version: int,
execution_context: ExecutionContext,
block_name: str = "Block",
editable: bool = False,
) -> Optional[ReviewResult]:
@@ -74,11 +69,11 @@ class HITLReviewHelper:
Args:
input_data: The input data to be reviewed
user_id: ID of the user requesting the review
node_id: ID of the node in the graph definition
node_exec_id: ID of the node execution
graph_exec_id: ID of the graph execution
graph_id: ID of the graph
graph_version: Version of the graph
execution_context: Current execution context
block_name: Name of the block requesting review
editable: Whether the reviewer can edit the data
@@ -88,41 +83,15 @@ class HITLReviewHelper:
Raises:
Exception: If review creation or status update fails
"""
# Note: Safe mode checks (human_in_the_loop_safe_mode, sensitive_action_safe_mode)
# are handled by the caller:
# - HITL blocks check human_in_the_loop_safe_mode in their run() method
# - Sensitive action blocks check sensitive_action_safe_mode in is_block_exec_need_review()
# This function only handles checking for existing approvals.
# Check if this node has already been approved (normal or auto-approval)
if approval_result := await HITLReviewHelper.check_approval(
node_exec_id=node_exec_id,
graph_exec_id=graph_exec_id,
node_id=node_id,
user_id=user_id,
input_data=input_data,
):
# Skip review if safe mode is disabled - return auto-approved result
if not execution_context.human_in_the_loop_safe_mode:
logger.info(
f"Block {block_name} skipping review for node {node_exec_id} - "
f"found existing approval"
)
# Return a new ReviewResult with the current node_exec_id but approved status
# For auto-approvals, always use current input_data
# For normal approvals, use approval_result.data unless it's None
is_auto_approval = approval_result.node_exec_id != node_exec_id
approved_data = (
input_data
if is_auto_approval
else (
approval_result.data
if approval_result.data is not None
else input_data
)
f"Block {block_name} skipping review for node {node_exec_id} - safe mode disabled"
)
return ReviewResult(
data=approved_data,
data=input_data,
status=ReviewStatus.APPROVED,
message=approval_result.message,
message="Auto-approved (safe mode disabled)",
processed=True,
node_exec_id=node_exec_id,
)
@@ -134,7 +103,7 @@ class HITLReviewHelper:
graph_id=graph_id,
graph_version=graph_version,
input_data=input_data,
message=block_name, # Use block_name directly as the message
message=f"Review required for {block_name} execution",
editable=editable,
)
@@ -160,11 +129,11 @@ class HITLReviewHelper:
async def handle_review_decision(
input_data: Any,
user_id: str,
node_id: str,
node_exec_id: str,
graph_exec_id: str,
graph_id: str,
graph_version: int,
execution_context: ExecutionContext,
block_name: str = "Block",
editable: bool = False,
) -> Optional[ReviewDecision]:
@@ -174,11 +143,11 @@ class HITLReviewHelper:
Args:
input_data: The input data to be reviewed
user_id: ID of the user requesting the review
node_id: ID of the node in the graph definition
node_exec_id: ID of the node execution
graph_exec_id: ID of the graph execution
graph_id: ID of the graph
graph_version: Version of the graph
execution_context: Current execution context
block_name: Name of the block requesting review
editable: Whether the reviewer can edit the data
@@ -189,11 +158,11 @@ class HITLReviewHelper:
review_result = await HITLReviewHelper._handle_review_request(
input_data=input_data,
user_id=user_id,
node_id=node_id,
node_exec_id=node_exec_id,
graph_exec_id=graph_exec_id,
graph_id=graph_id,
graph_version=graph_version,
execution_context=execution_context,
block_name=block_name,
editable=editable,
)
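
The shape of this refactor: both helpers now take an `ExecutionContext` instead of a `node_id`, and the removed approval-lookup branch is replaced by a single safe-mode check that auto-approves with the current input when safe mode is off. A sketch of the new control flow, with an assumed minimal `ExecutionContext` (the real class lives in `backend.data.execution` and carries more fields):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ExecutionContext:  # assumed minimal shape
    human_in_the_loop_safe_mode: bool = True

@dataclass
class ReviewResult:
    data: dict
    status: str
    message: str
    processed: bool
    node_exec_id: str

async def handle_review_request(input_data: dict, node_exec_id: str,
                                execution_context: ExecutionContext,
                                block_name: str = "Block") -> ReviewResult | None:
    if not execution_context.human_in_the_loop_safe_mode:
        # Safe mode off: skip review and auto-approve with the current input.
        return ReviewResult(
            data=input_data,
            status="APPROVED",
            message="Auto-approved (safe mode disabled)",
            processed=True,
            node_exec_id=node_exec_id,
        )
    return None  # safe mode on: fall through to creating a pending review

ctx = ExecutionContext(human_in_the_loop_safe_mode=False)
print(asyncio.run(handle_review_request({"x": 1}, "node-exec-1", ctx)))
```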

View File

@@ -97,7 +97,6 @@ class HumanInTheLoopBlock(Block):
input_data: Input,
*,
user_id: str,
node_id: str,
node_exec_id: str,
graph_exec_id: str,
graph_id: str,
@@ -116,12 +115,12 @@ class HumanInTheLoopBlock(Block):
decision = await self.handle_review_decision(
input_data=input_data.data,
user_id=user_id,
node_id=node_id,
node_exec_id=node_exec_id,
graph_exec_id=graph_exec_id,
graph_id=graph_id,
graph_version=graph_version,
block_name=input_data.name, # Use user-provided name instead of block type
execution_context=execution_context,
block_name=self.name,
editable=input_data.editable,
)

View File

@@ -1,7 +1,7 @@
import logging
import os
import pytest_asyncio
import pytest
from dotenv import load_dotenv
from backend.util.logging import configure_logging
@@ -19,7 +19,7 @@ if not os.getenv("PRISMA_DEBUG"):
prisma_logger.setLevel(logging.INFO)
@pytest_asyncio.fixture(scope="session", loop_scope="session")
@pytest.fixture(scope="session")
async def server():
from backend.util.test import SpinTestServer
@@ -27,7 +27,7 @@ async def server():
yield server
@pytest_asyncio.fixture(scope="session", loop_scope="session", autouse=True)
@pytest.fixture(scope="session", autouse=True)
async def graph_cleanup(server):
created_graph_ids = []
original_create_graph = server.agent_server.test_create_graph
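
The fixtures here swap `@pytest_asyncio.fixture(scope="session", loop_scope="session")` for plain `@pytest.fixture(scope="session")`. Assuming the project runs pytest-asyncio in auto mode, async fixtures declared this way are still collected and driven by the plugin, with loop scoping coming from configuration rather than per-fixture arguments. A sketch of the resulting fixture style:

```python
import pytest

# With asyncio_mode = auto (an assumption about this repo's pytest config),
# a plain session-scoped pytest.fixture may be an async generator and
# pytest-asyncio will still drive it on the session event loop.
@pytest.fixture(scope="session")
async def server():
    srv = object()  # stand-in for SpinTestServer()
    yield srv       # teardown runs once, after the whole session
```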

View File

@@ -441,7 +441,6 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
static_output: bool = False,
block_type: BlockType = BlockType.STANDARD,
webhook_config: Optional[BlockWebhookConfig | BlockManualWebhookConfig] = None,
is_sensitive_action: bool = False,
):
"""
Initialize the block with the given schema.
@@ -474,8 +473,8 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
self.static_output = static_output
self.block_type = block_type
self.webhook_config = webhook_config
self.is_sensitive_action = is_sensitive_action
self.execution_stats: NodeExecutionStats = NodeExecutionStats()
self.is_sensitive_action: bool = False
if self.webhook_config:
if isinstance(self.webhook_config, BlockWebhookConfig):
@@ -623,7 +622,6 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
input_data: BlockInput,
*,
user_id: str,
node_id: str,
node_exec_id: str,
graph_exec_id: str,
graph_id: str,
@@ -650,11 +648,11 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
decision = await HITLReviewHelper.handle_review_decision(
input_data=input_data,
user_id=user_id,
node_id=node_id,
node_exec_id=node_exec_id,
graph_exec_id=graph_exec_id,
graph_id=graph_id,
graph_version=graph_version,
execution_context=execution_context,
block_name=self.name,
editable=True,
)
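
Together with the `PrintToConsoleBlock` hunk earlier, this removes `is_sensitive_action` from the `Block` constructor and pins it as an instance attribute that always starts out `False`. A reduced sketch of the before/after shape:

```python
# Reduced sketch: the flag is no longer a kwarg, so subclasses such as
# PrintToConsoleBlock simply stop passing it to super().__init__().
class Block:
    def __init__(self, *, static_output: bool = False):
        self.static_output = static_output
        self.is_sensitive_action: bool = False  # previously: a constructor kwarg

class PrintToConsoleBlock(Block):
    def __init__(self):
        super().__init__()  # no is_sensitive_action=True any more

print(PrintToConsoleBlock().is_sensitive_action)  # False
```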

View File

@@ -6,10 +6,10 @@ Handles all database operations for pending human reviews.
import asyncio
import logging
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Optional
from typing import Optional
from prisma.enums import ReviewStatus
from prisma.models import AgentNodeExecution, PendingHumanReview
from prisma.models import PendingHumanReview
from prisma.types import PendingHumanReviewUpdateInput
from pydantic import BaseModel
@@ -17,12 +17,8 @@ from backend.api.features.executions.review.model import (
PendingHumanReviewModel,
SafeJsonData,
)
from backend.data.execution import get_graph_execution_meta
from backend.util.json import SafeJson
if TYPE_CHECKING:
pass
logger = logging.getLogger(__name__)
@@ -36,125 +32,6 @@ class ReviewResult(BaseModel):
node_exec_id: str
def get_auto_approve_key(graph_exec_id: str, node_id: str) -> str:
"""Generate the special nodeExecId key for auto-approval records."""
return f"auto_approve_{graph_exec_id}_{node_id}"
async def check_approval(
node_exec_id: str,
graph_exec_id: str,
node_id: str,
user_id: str,
input_data: SafeJsonData | None = None,
) -> Optional[ReviewResult]:
"""
Check if there's an existing approval for this node execution.
Checks both:
1. Normal approval by node_exec_id (previous run of the same node execution)
2. Auto-approval by special key pattern "auto_approve_{graph_exec_id}_{node_id}"
Args:
node_exec_id: ID of the node execution
graph_exec_id: ID of the graph execution
node_id: ID of the node definition (not execution)
user_id: ID of the user (for data isolation)
input_data: Current input data (used for auto-approvals to avoid stale data)
Returns:
ReviewResult if approval found (either normal or auto), None otherwise
"""
auto_approve_key = get_auto_approve_key(graph_exec_id, node_id)
# Check for either normal approval or auto-approval in a single query
existing_review = await PendingHumanReview.prisma().find_first(
where={
"OR": [
{"nodeExecId": node_exec_id},
{"nodeExecId": auto_approve_key},
],
"status": ReviewStatus.APPROVED,
"userId": user_id,
},
)
if existing_review:
is_auto_approval = existing_review.nodeExecId == auto_approve_key
logger.info(
f"Found {'auto-' if is_auto_approval else ''}approval for node {node_id} "
f"(exec: {node_exec_id}) in execution {graph_exec_id}"
)
# For auto-approvals, use current input_data to avoid replaying stale payload
# For normal approvals, use the stored payload (which may have been edited)
return ReviewResult(
data=(
input_data
if is_auto_approval and input_data is not None
else existing_review.payload
),
status=ReviewStatus.APPROVED,
message=(
"Auto-approved (user approved all future actions for this node)"
if is_auto_approval
else existing_review.reviewMessage or ""
),
processed=True,
node_exec_id=existing_review.nodeExecId,
)
return None
async def create_auto_approval_record(
user_id: str,
graph_exec_id: str,
graph_id: str,
graph_version: int,
node_id: str,
payload: SafeJsonData,
) -> None:
"""
Create an auto-approval record for a node in this execution.
This is stored as a PendingHumanReview with a special nodeExecId pattern
and status=APPROVED, so future executions of the same node can skip review.
Raises:
ValueError: If the graph execution doesn't belong to the user
"""
# Validate that the graph execution belongs to this user (defense in depth)
graph_exec = await get_graph_execution_meta(
user_id=user_id, execution_id=graph_exec_id
)
if not graph_exec:
raise ValueError(
f"Graph execution {graph_exec_id} not found or doesn't belong to user {user_id}"
)
auto_approve_key = get_auto_approve_key(graph_exec_id, node_id)
await PendingHumanReview.prisma().upsert(
where={"nodeExecId": auto_approve_key},
data={
"create": {
"nodeExecId": auto_approve_key,
"userId": user_id,
"graphExecId": graph_exec_id,
"graphId": graph_id,
"graphVersion": graph_version,
"payload": SafeJson(payload),
"instructions": "Auto-approval record",
"editable": False,
"status": ReviewStatus.APPROVED,
"processed": True,
"reviewedAt": datetime.now(timezone.utc),
},
"update": {}, # Already exists, no update needed
},
)
async def get_or_create_human_review(
user_id: str,
node_exec_id: str,
@@ -231,87 +108,6 @@ async def get_or_create_human_review(
)
async def get_pending_review_by_node_exec_id(
node_exec_id: str, user_id: str
) -> Optional["PendingHumanReviewModel"]:
"""
Get a pending review by its node execution ID.
Args:
node_exec_id: The node execution ID to look up
user_id: User ID for authorization (only returns if review belongs to this user)
Returns:
The pending review if found and belongs to user, None otherwise
"""
review = await PendingHumanReview.prisma().find_first(
where={
"nodeExecId": node_exec_id,
"userId": user_id,
"status": ReviewStatus.WAITING,
}
)
if not review:
return None
# Local import to avoid event loop conflicts in tests
from backend.data.execution import get_node_execution
node_exec = await get_node_execution(review.nodeExecId)
node_id = node_exec.node_id if node_exec else review.nodeExecId
return PendingHumanReviewModel.from_db(review, node_id=node_id)
async def get_pending_reviews_by_node_exec_ids(
node_exec_ids: list[str], user_id: str
) -> dict[str, "PendingHumanReviewModel"]:
"""
Get multiple pending reviews by their node execution IDs in a single batch query.
Args:
node_exec_ids: List of node execution IDs to look up
user_id: User ID for authorization (only returns reviews belonging to this user)
Returns:
Dictionary mapping node_exec_id -> PendingHumanReviewModel for found reviews
"""
if not node_exec_ids:
return {}
reviews = await PendingHumanReview.prisma().find_many(
where={
"nodeExecId": {"in": node_exec_ids},
"userId": user_id,
"status": ReviewStatus.WAITING,
}
)
if not reviews:
return {}
# Batch fetch all node executions to avoid N+1 queries
node_exec_ids_to_fetch = [review.nodeExecId for review in reviews]
node_execs = await AgentNodeExecution.prisma().find_many(
where={"id": {"in": node_exec_ids_to_fetch}},
include={"Node": True},
)
# Create mapping from node_exec_id to node_id
node_exec_id_to_node_id = {
node_exec.id: node_exec.agentNodeId for node_exec in node_execs
}
result = {}
for review in reviews:
node_id = node_exec_id_to_node_id.get(review.nodeExecId, review.nodeExecId)
result[review.nodeExecId] = PendingHumanReviewModel.from_db(
review, node_id=node_id
)
return result
async def has_pending_reviews_for_graph_exec(graph_exec_id: str) -> bool:
"""
Check if a graph execution has any pending reviews.
@@ -341,11 +137,8 @@ async def get_pending_reviews_for_user(
page_size: Number of reviews per page
Returns:
List of pending review models with node_id included
List of pending review models
"""
# Local import to avoid event loop conflicts in tests
from backend.data.execution import get_node_execution
# Calculate offset for pagination
offset = (page - 1) * page_size
@@ -356,14 +149,7 @@ async def get_pending_reviews_for_user(
take=page_size,
)
# Fetch node_id for each review from NodeExecution
result = []
for review in reviews:
node_exec = await get_node_execution(review.nodeExecId)
node_id = node_exec.node_id if node_exec else review.nodeExecId
result.append(PendingHumanReviewModel.from_db(review, node_id=node_id))
return result
return [PendingHumanReviewModel.from_db(review) for review in reviews]
async def get_pending_reviews_for_execution(
@@ -377,11 +163,8 @@ async def get_pending_reviews_for_execution(
user_id: User ID for security validation
Returns:
List of pending review models with node_id included
List of pending review models
"""
# Local import to avoid event loop conflicts in tests
from backend.data.execution import get_node_execution
reviews = await PendingHumanReview.prisma().find_many(
where={
"userId": user_id,
@@ -391,14 +174,7 @@ async def get_pending_reviews_for_execution(
order={"createdAt": "asc"},
)
# Fetch node_id for each review from NodeExecution
result = []
for review in reviews:
node_exec = await get_node_execution(review.nodeExecId)
node_id = node_exec.node_id if node_exec else review.nodeExecId
result.append(PendingHumanReviewModel.from_db(review, node_id=node_id))
return result
return [PendingHumanReviewModel.from_db(review) for review in reviews]
async def process_all_reviews_for_execution(
@@ -468,19 +244,11 @@ async def process_all_reviews_for_execution(
# Note: Execution resumption is now handled at the API layer after ALL reviews
# for an execution are processed (both approved and rejected)
# Fetch node_id for each review and return as dict for easy access
# Local import to avoid event loop conflicts in tests
from backend.data.execution import get_node_execution
result = {}
for review in updated_reviews:
node_exec = await get_node_execution(review.nodeExecId)
node_id = node_exec.node_id if node_exec else review.nodeExecId
result[review.nodeExecId] = PendingHumanReviewModel.from_db(
review, node_id=node_id
)
return result
# Return as dict for easy access
return {
review.nodeExecId: PendingHumanReviewModel.from_db(review)
for review in updated_reviews
}
async def update_review_processed_status(node_exec_id: str, processed: bool) -> None:
@@ -488,44 +256,3 @@ async def update_review_processed_status(node_exec_id: str, processed: bool) ->
await PendingHumanReview.prisma().update(
where={"nodeExecId": node_exec_id}, data={"processed": processed}
)
async def cancel_pending_reviews_for_execution(graph_exec_id: str, user_id: str) -> int:
"""
Cancel all pending reviews for a graph execution (e.g., when execution is stopped).
Marks all WAITING reviews as REJECTED with a message indicating the execution was stopped.
Args:
graph_exec_id: The graph execution ID
user_id: User ID who owns the execution (for security validation)
Returns:
Number of reviews cancelled
Raises:
ValueError: If the graph execution doesn't belong to the user
"""
# Validate user ownership before cancelling reviews
graph_exec = await get_graph_execution_meta(
user_id=user_id, execution_id=graph_exec_id
)
if not graph_exec:
raise ValueError(
f"Graph execution {graph_exec_id} not found or doesn't belong to user {user_id}"
)
result = await PendingHumanReview.prisma().update_many(
where={
"graphExecId": graph_exec_id,
"userId": user_id,
"status": ReviewStatus.WAITING,
},
data={
"status": ReviewStatus.REJECTED,
"reviewMessage": "Execution was stopped by user",
"processed": True,
"reviewedAt": datetime.now(timezone.utc),
},
)
return result
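
This hunk strips out the auto-approval machinery (`check_approval`, `create_auto_approval_record`, `cancel_pending_reviews_for_execution`, and the per-review `node_id` lookups). For reference, auto-approvals were stored as ordinary `PendingHumanReview` rows whose primary key followed a synthetic pattern, letting `check_approval` match either a real `nodeExecId` or the auto-approve key in one OR query:

```python
# Format of the special key (copied from the removed get_auto_approve_key):
def get_auto_approve_key(graph_exec_id: str, node_id: str) -> str:
    return f"auto_approve_{graph_exec_id}_{node_id}"

print(get_auto_approve_key("exec-123", "node-abc"))
# auto_approve_exec-123_node-abc
```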

View File

@@ -36,7 +36,7 @@ def sample_db_review():
return mock_review
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_get_or_create_human_review_new(
mocker: pytest_mock.MockFixture,
sample_db_review,
@@ -46,8 +46,8 @@ async def test_get_or_create_human_review_new(
sample_db_review.status = ReviewStatus.WAITING
sample_db_review.processed = False
mock_prisma = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_prisma.return_value.upsert = AsyncMock(return_value=sample_db_review)
mock_upsert = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_upsert.return_value.upsert = AsyncMock(return_value=sample_db_review)
result = await get_or_create_human_review(
user_id="test-user-123",
@@ -64,7 +64,7 @@ async def test_get_or_create_human_review_new(
assert result is None
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_get_or_create_human_review_approved(
mocker: pytest_mock.MockFixture,
sample_db_review,
@@ -75,8 +75,8 @@ async def test_get_or_create_human_review_approved(
sample_db_review.processed = False
sample_db_review.reviewMessage = "Looks good"
mock_prisma = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_prisma.return_value.upsert = AsyncMock(return_value=sample_db_review)
mock_upsert = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_upsert.return_value.upsert = AsyncMock(return_value=sample_db_review)
result = await get_or_create_human_review(
user_id="test-user-123",
@@ -96,7 +96,7 @@ async def test_get_or_create_human_review_approved(
assert result.message == "Looks good"
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_has_pending_reviews_for_graph_exec_true(
mocker: pytest_mock.MockFixture,
):
@@ -109,7 +109,7 @@ async def test_has_pending_reviews_for_graph_exec_true(
assert result is True
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_has_pending_reviews_for_graph_exec_false(
mocker: pytest_mock.MockFixture,
):
@@ -122,7 +122,7 @@ async def test_has_pending_reviews_for_graph_exec_false(
assert result is False
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_get_pending_reviews_for_user(
mocker: pytest_mock.MockFixture,
sample_db_review,
@@ -131,19 +131,10 @@ async def test_get_pending_reviews_for_user(
mock_find_many = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_find_many.return_value.find_many = AsyncMock(return_value=[sample_db_review])
# Mock get_node_execution to return node with node_id (async function)
mock_node_exec = Mock()
mock_node_exec.node_id = "test_node_def_789"
mocker.patch(
"backend.data.execution.get_node_execution",
new=AsyncMock(return_value=mock_node_exec),
)
result = await get_pending_reviews_for_user("test_user", page=2, page_size=10)
assert len(result) == 1
assert result[0].node_exec_id == "test_node_123"
assert result[0].node_id == "test_node_def_789"
# Verify pagination parameters
call_args = mock_find_many.return_value.find_many.call_args
@@ -151,7 +142,7 @@ async def test_get_pending_reviews_for_user(
assert call_args.kwargs["take"] == 10
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_get_pending_reviews_for_execution(
mocker: pytest_mock.MockFixture,
sample_db_review,
@@ -160,21 +151,12 @@ async def test_get_pending_reviews_for_execution(
mock_find_many = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_find_many.return_value.find_many = AsyncMock(return_value=[sample_db_review])
# Mock get_node_execution to return node with node_id (async function)
mock_node_exec = Mock()
mock_node_exec.node_id = "test_node_def_789"
mocker.patch(
"backend.data.execution.get_node_execution",
new=AsyncMock(return_value=mock_node_exec),
)
result = await get_pending_reviews_for_execution(
"test_graph_exec_456", "test-user-123"
)
assert len(result) == 1
assert result[0].graph_exec_id == "test_graph_exec_456"
assert result[0].node_id == "test_node_def_789"
# Verify it filters by execution and user
call_args = mock_find_many.return_value.find_many.call_args
@@ -184,7 +166,7 @@ async def test_get_pending_reviews_for_execution(
assert where_clause["status"] == ReviewStatus.WAITING
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_process_all_reviews_for_execution_success(
mocker: pytest_mock.MockFixture,
sample_db_review,
@@ -219,14 +201,6 @@ async def test_process_all_reviews_for_execution_success(
new=AsyncMock(return_value=[updated_review]),
)
# Mock get_node_execution to return node with node_id (async function)
mock_node_exec = Mock()
mock_node_exec.node_id = "test_node_def_789"
mocker.patch(
"backend.data.execution.get_node_execution",
new=AsyncMock(return_value=mock_node_exec),
)
result = await process_all_reviews_for_execution(
user_id="test-user-123",
review_decisions={
@@ -237,10 +211,9 @@ async def test_process_all_reviews_for_execution_success(
assert len(result) == 1
assert "test_node_123" in result
assert result["test_node_123"].status == ReviewStatus.APPROVED
assert result["test_node_123"].node_id == "test_node_def_789"
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_process_all_reviews_for_execution_validation_errors(
mocker: pytest_mock.MockFixture,
):
@@ -260,7 +233,7 @@ async def test_process_all_reviews_for_execution_validation_errors(
)
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_process_all_reviews_edit_permission_error(
mocker: pytest_mock.MockFixture,
sample_db_review,
@@ -286,7 +259,7 @@ async def test_process_all_reviews_edit_permission_error(
)
@pytest.mark.asyncio(loop_scope="function")
@pytest.mark.asyncio
async def test_process_all_reviews_mixed_approval_rejection(
mocker: pytest_mock.MockFixture,
sample_db_review,
@@ -356,14 +329,6 @@ async def test_process_all_reviews_mixed_approval_rejection(
new=AsyncMock(return_value=[approved_review, rejected_review]),
)
# Mock get_node_execution to return node with node_id (async function)
mock_node_exec = Mock()
mock_node_exec.node_id = "test_node_def_789"
mocker.patch(
"backend.data.execution.get_node_execution",
new=AsyncMock(return_value=mock_node_exec),
)
result = await process_all_reviews_for_execution(
user_id="test-user-123",
review_decisions={
@@ -375,5 +340,3 @@ async def test_process_all_reviews_mixed_approval_rejection(
assert len(result) == 2
assert "test_node_123" in result
assert "test_node_456" in result
assert result["test_node_123"].node_id == "test_node_def_789"
assert result["test_node_456"].node_id == "test_node_def_789"

View File

@@ -50,8 +50,6 @@ from backend.data.graph import (
validate_graph_execution_permissions,
)
from backend.data.human_review import (
cancel_pending_reviews_for_execution,
check_approval,
get_or_create_human_review,
has_pending_reviews_for_graph_exec,
update_review_processed_status,
@@ -192,8 +190,6 @@ class DatabaseManager(AppService):
get_user_notification_preference = _(get_user_notification_preference)
# Human In The Loop
cancel_pending_reviews_for_execution = _(cancel_pending_reviews_for_execution)
check_approval = _(check_approval)
get_or_create_human_review = _(get_or_create_human_review)
has_pending_reviews_for_graph_exec = _(has_pending_reviews_for_graph_exec)
update_review_processed_status = _(update_review_processed_status)
@@ -317,8 +313,6 @@ class DatabaseManagerAsyncClient(AppServiceClient):
set_execution_kv_data = d.set_execution_kv_data
# Human In The Loop
cancel_pending_reviews_for_execution = d.cancel_pending_reviews_for_execution
check_approval = d.check_approval
get_or_create_human_review = d.get_or_create_human_review
update_review_processed_status = d.update_review_processed_status

View File

@@ -10,7 +10,6 @@ from pydantic import BaseModel, JsonValue, ValidationError
from backend.data import execution as execution_db
from backend.data import graph as graph_db
from backend.data import human_review as human_review_db
from backend.data import onboarding as onboarding_db
from backend.data import user as user_db
from backend.data.block import (
@@ -750,27 +749,9 @@ async def stop_graph_execution(
if graph_exec.status in [
ExecutionStatus.QUEUED,
ExecutionStatus.INCOMPLETE,
ExecutionStatus.REVIEW,
]:
# If the graph is queued/incomplete/paused for review, terminate immediately
# No need to wait for executor since it's not actively running
# If graph is in REVIEW status, clean up pending reviews before terminating
if graph_exec.status == ExecutionStatus.REVIEW:
# Use human_review_db if Prisma connected, else database manager
review_db = (
human_review_db
if prisma.is_connected()
else get_database_manager_async_client()
)
# Mark all pending reviews as rejected/cancelled
cancelled_count = await review_db.cancel_pending_reviews_for_execution(
graph_exec_id, user_id
)
logger.info(
f"Cancelled {cancelled_count} pending review(s) for stopped execution {graph_exec_id}"
)
# If the graph is still on the queue, we can prevent them from being executed
# by setting the status to TERMINATED.
graph_exec.status = ExecutionStatus.TERMINATED
await asyncio.gather(
@@ -906,28 +887,9 @@ async def add_graph_execution(
nodes_to_skip=nodes_to_skip,
execution_context=execution_context,
)
logger.info(f"Queueing execution {graph_exec.id}")
# Update execution status to QUEUED BEFORE publishing to prevent race condition
# where two concurrent requests could both publish the same execution
updated_exec = await edb.update_graph_execution_stats(
graph_exec_id=graph_exec.id,
status=ExecutionStatus.QUEUED,
)
# Verify the status update succeeded (prevents duplicate queueing in race conditions)
# If another request already updated the status, this execution will not be QUEUED
if not updated_exec or updated_exec.status != ExecutionStatus.QUEUED:
logger.warning(
f"Skipping queue publish for execution {graph_exec.id} - "
f"status update failed or execution already queued by another request"
)
return graph_exec
graph_exec.status = ExecutionStatus.QUEUED
logger.info(f"Publishing execution {graph_exec.id} to execution queue")
# Publish to execution queue for executor to pick up
# This happens AFTER status update to ensure only one request publishes
exec_queue = await get_async_execution_queue()
await exec_queue.publish_message(
routing_key=GRAPH_EXECUTION_ROUTING_KEY,
@@ -935,6 +897,13 @@ async def add_graph_execution(
exchange=GRAPH_EXECUTION_EXCHANGE,
)
logger.info(f"Published execution {graph_exec.id} to RabbitMQ queue")
# Update execution status to QUEUED
graph_exec.status = ExecutionStatus.QUEUED
await edb.update_graph_execution_stats(
graph_exec_id=graph_exec.id,
status=graph_exec.status,
)
except BaseException as e:
err = str(e) or type(e).__name__
if not graph_exec:
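
The ordering change is the crux of this hunk. The removed code flipped the status to QUEUED before publishing and used the update result as a compare-and-set guard, so two concurrent requests could not both publish the same execution; the added code publishes first and updates the status afterwards. A toy sketch of the guard pattern on the removed side, with in-memory stand-ins for the database call:

```python
import asyncio

# In-memory stand-in for the execution status column; the real guard does
# the same compare-and-set through update_graph_execution_stats.
_status = {"exec-1": "INCOMPLETE"}
_lock = asyncio.Lock()

async def try_mark_queued(exec_id: str) -> bool:
    async with _lock:
        if _status[exec_id] == "QUEUED":
            return False  # another request already queued this execution
        _status[exec_id] = "QUEUED"
        return True

async def queue_execution(exec_id: str) -> str:
    if not await try_mark_queued(exec_id):
        return f"skipped duplicate publish for {exec_id}"
    # Publish to the queue only after winning the status update.
    return f"published {exec_id}"

async def main() -> None:
    print(await asyncio.gather(queue_execution("exec-1"), queue_execution("exec-1")))

asyncio.run(main())  # exactly one "published", one "skipped"
```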

View File

@@ -4,7 +4,6 @@ import pytest
from pytest_mock import MockerFixture
from backend.data.dynamic_fields import merge_execution_input, parse_execution_output
from backend.data.execution import ExecutionStatus
from backend.util.mock import MockObject
@@ -347,7 +346,6 @@ async def test_add_graph_execution_is_repeatable(mocker: MockerFixture):
mock_graph_exec = mocker.MagicMock(spec=GraphExecutionWithNodes)
mock_graph_exec.id = "execution-id-123"
mock_graph_exec.node_executions = [] # Add this to avoid AttributeError
mock_graph_exec.status = ExecutionStatus.QUEUED # Required for race condition check
mock_graph_exec.to_graph_execution_entry.return_value = mocker.MagicMock()
# Mock the queue and event bus
@@ -613,7 +611,6 @@ async def test_add_graph_execution_with_nodes_to_skip(mocker: MockerFixture):
mock_graph_exec = mocker.MagicMock(spec=GraphExecutionWithNodes)
mock_graph_exec.id = "execution-id-123"
mock_graph_exec.node_executions = []
mock_graph_exec.status = ExecutionStatus.QUEUED # Required for race condition check
# Track what's passed to to_graph_execution_entry
captured_kwargs = {}
@@ -673,232 +670,3 @@ async def test_add_graph_execution_with_nodes_to_skip(mocker: MockerFixture):
# Verify nodes_to_skip was passed to to_graph_execution_entry
assert "nodes_to_skip" in captured_kwargs
assert captured_kwargs["nodes_to_skip"] == nodes_to_skip
@pytest.mark.asyncio
async def test_stop_graph_execution_in_review_status_cancels_pending_reviews(
mocker: MockerFixture,
):
"""Test that stopping an execution in REVIEW status cancels pending reviews."""
from backend.data.execution import ExecutionStatus, GraphExecutionMeta
from backend.executor.utils import stop_graph_execution
user_id = "test-user"
graph_exec_id = "test-exec-123"
# Mock graph execution in REVIEW status
mock_graph_exec = mocker.MagicMock(spec=GraphExecutionMeta)
mock_graph_exec.id = graph_exec_id
mock_graph_exec.status = ExecutionStatus.REVIEW
# Mock dependencies
mock_get_queue = mocker.patch("backend.executor.utils.get_async_execution_queue")
mock_queue_client = mocker.AsyncMock()
mock_get_queue.return_value = mock_queue_client
mock_prisma = mocker.patch("backend.executor.utils.prisma")
mock_prisma.is_connected.return_value = True
mock_human_review_db = mocker.patch("backend.executor.utils.human_review_db")
mock_human_review_db.cancel_pending_reviews_for_execution = mocker.AsyncMock(
return_value=2 # 2 reviews cancelled
)
mock_execution_db = mocker.patch("backend.executor.utils.execution_db")
mock_execution_db.get_graph_execution_meta = mocker.AsyncMock(
return_value=mock_graph_exec
)
mock_execution_db.update_graph_execution_stats = mocker.AsyncMock()
mock_get_event_bus = mocker.patch(
"backend.executor.utils.get_async_execution_event_bus"
)
mock_event_bus = mocker.MagicMock()
mock_event_bus.publish = mocker.AsyncMock()
mock_get_event_bus.return_value = mock_event_bus
mock_get_child_executions = mocker.patch(
"backend.executor.utils._get_child_executions"
)
mock_get_child_executions.return_value = [] # No children
# Call stop_graph_execution with timeout to allow status check
await stop_graph_execution(
user_id=user_id,
graph_exec_id=graph_exec_id,
wait_timeout=1.0, # Wait to allow status check
cascade=True,
)
# Verify pending reviews were cancelled
mock_human_review_db.cancel_pending_reviews_for_execution.assert_called_once_with(
graph_exec_id, user_id
)
# Verify execution status was updated to TERMINATED
mock_execution_db.update_graph_execution_stats.assert_called_once()
call_kwargs = mock_execution_db.update_graph_execution_stats.call_args[1]
assert call_kwargs["graph_exec_id"] == graph_exec_id
assert call_kwargs["status"] == ExecutionStatus.TERMINATED
@pytest.mark.asyncio
async def test_stop_graph_execution_with_database_manager_when_prisma_disconnected(
mocker: MockerFixture,
):
"""Test that stop uses database manager when Prisma is not connected."""
from backend.data.execution import ExecutionStatus, GraphExecutionMeta
from backend.executor.utils import stop_graph_execution
user_id = "test-user"
graph_exec_id = "test-exec-456"
# Mock graph execution in REVIEW status
mock_graph_exec = mocker.MagicMock(spec=GraphExecutionMeta)
mock_graph_exec.id = graph_exec_id
mock_graph_exec.status = ExecutionStatus.REVIEW
# Mock dependencies
mock_get_queue = mocker.patch("backend.executor.utils.get_async_execution_queue")
mock_queue_client = mocker.AsyncMock()
mock_get_queue.return_value = mock_queue_client
# Prisma is NOT connected
mock_prisma = mocker.patch("backend.executor.utils.prisma")
mock_prisma.is_connected.return_value = False
# Mock database manager client
mock_get_db_manager = mocker.patch(
"backend.executor.utils.get_database_manager_async_client"
)
mock_db_manager = mocker.AsyncMock()
mock_db_manager.get_graph_execution_meta = mocker.AsyncMock(
return_value=mock_graph_exec
)
mock_db_manager.cancel_pending_reviews_for_execution = mocker.AsyncMock(
return_value=3 # 3 reviews cancelled
)
mock_db_manager.update_graph_execution_stats = mocker.AsyncMock()
mock_get_db_manager.return_value = mock_db_manager
mock_get_event_bus = mocker.patch(
"backend.executor.utils.get_async_execution_event_bus"
)
mock_event_bus = mocker.MagicMock()
mock_event_bus.publish = mocker.AsyncMock()
mock_get_event_bus.return_value = mock_event_bus
mock_get_child_executions = mocker.patch(
"backend.executor.utils._get_child_executions"
)
mock_get_child_executions.return_value = [] # No children
# Call stop_graph_execution with timeout
await stop_graph_execution(
user_id=user_id,
graph_exec_id=graph_exec_id,
wait_timeout=1.0,
cascade=True,
)
# Verify database manager was used for cancel_pending_reviews
mock_db_manager.cancel_pending_reviews_for_execution.assert_called_once_with(
graph_exec_id, user_id
)
# Verify execution status was updated via database manager
mock_db_manager.update_graph_execution_stats.assert_called_once()
@pytest.mark.asyncio
async def test_stop_graph_execution_cascades_to_child_with_reviews(
mocker: MockerFixture,
):
"""Test that stopping parent execution cascades to children and cancels their reviews."""
from backend.data.execution import ExecutionStatus, GraphExecutionMeta
from backend.executor.utils import stop_graph_execution
user_id = "test-user"
parent_exec_id = "parent-exec"
child_exec_id = "child-exec"
# Mock parent execution in RUNNING status
mock_parent_exec = mocker.MagicMock(spec=GraphExecutionMeta)
mock_parent_exec.id = parent_exec_id
mock_parent_exec.status = ExecutionStatus.RUNNING
# Mock child execution in REVIEW status
mock_child_exec = mocker.MagicMock(spec=GraphExecutionMeta)
mock_child_exec.id = child_exec_id
mock_child_exec.status = ExecutionStatus.REVIEW
# Mock dependencies
mock_get_queue = mocker.patch("backend.executor.utils.get_async_execution_queue")
mock_queue_client = mocker.AsyncMock()
mock_get_queue.return_value = mock_queue_client
mock_prisma = mocker.patch("backend.executor.utils.prisma")
mock_prisma.is_connected.return_value = True
mock_human_review_db = mocker.patch("backend.executor.utils.human_review_db")
mock_human_review_db.cancel_pending_reviews_for_execution = mocker.AsyncMock(
return_value=1 # 1 child review cancelled
)
# Mock execution_db to return different status based on which execution is queried
mock_execution_db = mocker.patch("backend.executor.utils.execution_db")
# Track call count to simulate status transition
call_count = {"count": 0}
async def get_exec_meta_side_effect(execution_id, user_id):
call_count["count"] += 1
if execution_id == parent_exec_id:
# After a few calls (child processing happens), transition parent to TERMINATED
# This simulates the executor service processing the stop request
if call_count["count"] > 3:
mock_parent_exec.status = ExecutionStatus.TERMINATED
return mock_parent_exec
elif execution_id == child_exec_id:
return mock_child_exec
return None
mock_execution_db.get_graph_execution_meta = mocker.AsyncMock(
side_effect=get_exec_meta_side_effect
)
mock_execution_db.update_graph_execution_stats = mocker.AsyncMock()
mock_get_event_bus = mocker.patch(
"backend.executor.utils.get_async_execution_event_bus"
)
mock_event_bus = mocker.MagicMock()
mock_event_bus.publish = mocker.AsyncMock()
mock_get_event_bus.return_value = mock_event_bus
# Mock _get_child_executions to return the child
mock_get_child_executions = mocker.patch(
"backend.executor.utils._get_child_executions"
)
def get_children_side_effect(parent_id):
if parent_id == parent_exec_id:
return [mock_child_exec]
return []
mock_get_child_executions.side_effect = get_children_side_effect
# Call stop_graph_execution on parent with cascade=True
await stop_graph_execution(
user_id=user_id,
graph_exec_id=parent_exec_id,
wait_timeout=1.0,
cascade=True,
)
# Verify child reviews were cancelled
mock_human_review_db.cancel_pending_reviews_for_execution.assert_called_once_with(
child_exec_id, user_id
)
# Verify both parent and child status updates
assert mock_execution_db.update_graph_execution_stats.call_count >= 1
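Read together, these two tests pin down the cascade contract: stopping a parent execution must also cancel pending reviews on each child execution returned by _get_child_executions. A minimal sketch of that contract, with collaborators injected so it stands alone — the real stop_graph_execution in backend.executor.utils additionally handles queueing, status polling, and the wait_timeout:
# Illustrative sketch of the cascade behaviour asserted above, not the real
# implementation. `review_db` stands in for the human-review store and
# `get_children` for _get_child_executions.
async def cancel_reviews_cascade(review_db, get_children, user_id: str, exec_id: str) -> None:
    # Cancel this execution's own pending reviews first...
    await review_db.cancel_pending_reviews_for_execution(exec_id, user_id)
    # ...then recurse so child executions stuck in REVIEW status are unblocked too.
    for child in await get_children(exec_id):
        await cancel_reviews_cascade(review_db, get_children, user_id, child.id)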

View File

@@ -350,19 +350,6 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
description="Whether to mark failed scans as clean or not",
)
agentgenerator_host: str = Field(
default="",
description="The host for the Agent Generator service (empty to use built-in)",
)
agentgenerator_port: int = Field(
default=8000,
description="The port for the Agent Generator service",
)
agentgenerator_timeout: int = Field(
default=120,
description="The timeout in seconds for Agent Generator service requests",
)
enable_example_blocks: bool = Field(
default=False,
description="Whether to enable example blocks in production",

View File

@@ -1,4 +1,3 @@
import asyncio
import inspect
import logging
import time
@@ -59,11 +58,6 @@ class SpinTestServer:
self.db_api.__exit__(exc_type, exc_val, exc_tb)
self.notif_manager.__exit__(exc_type, exc_val, exc_tb)
# Give services time to fully shut down
# This prevents event loop issues where services haven't fully cleaned up
# before the next test starts
await asyncio.sleep(0.5)
def setup_dependency_overrides(self):
# Override get_user_id for testing
self.agent_server.set_test_dependency_overrides(

View File

@@ -1,7 +0,0 @@
-- Remove NodeExecution foreign key from PendingHumanReview
-- The nodeExecId column remains as the primary key, but we remove the FK constraint
-- to AgentNodeExecution since PendingHumanReview records can persist after node
-- execution records are deleted.
-- Drop foreign key constraint that linked PendingHumanReview.nodeExecId to AgentNodeExecution.id
ALTER TABLE "PendingHumanReview" DROP CONSTRAINT IF EXISTS "PendingHumanReview_nodeExecId_fkey";

View File

@@ -517,6 +517,8 @@ model AgentNodeExecution {
stats Json?
PendingHumanReview PendingHumanReview?
@@index([agentGraphExecutionId, agentNodeId, executionStatus])
@@index([agentNodeId, executionStatus])
@@index([addedTime, queuedTime])
@@ -565,7 +567,6 @@ enum ReviewStatus {
}
// Pending human reviews for Human-in-the-loop blocks
// Also stores auto-approval records with special nodeExecId patterns (e.g., "auto_approve_{graph_exec_id}_{node_id}")
model PendingHumanReview {
nodeExecId String @id
userId String
@@ -584,6 +585,7 @@ model PendingHumanReview {
reviewedAt DateTime?
User User @relation(fields: [userId], references: [id], onDelete: Cascade)
NodeExecution AgentNodeExecution @relation(fields: [nodeExecId], references: [id], onDelete: Cascade)
GraphExecution AgentGraphExecution @relation(fields: [graphExecId], references: [id], onDelete: Cascade)
@@unique([nodeExecId]) // One pending review per node execution
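As the model comment above notes, auto-approval records reuse the nodeExecId primary key with a synthetic value rather than a real node-execution ID. A minimal sketch of that key pattern (the helper name is illustrative, not from the codebase):
def auto_approve_key(graph_exec_id: str, node_id: str) -> str:
    # Synthetic nodeExecId for auto-approval records, following the pattern
    # documented in the schema comment: "auto_approve_{graph_exec_id}_{node_id}".
    return f"auto_approve_{graph_exec_id}_{node_id}"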

View File

@@ -1 +0,0 @@
"""Tests for agent generator module."""

View File

@@ -1,273 +0,0 @@
"""
Tests for the Agent Generator core module.
This test suite verifies that the core functions correctly delegate to
the external Agent Generator service.
"""
from unittest.mock import AsyncMock, patch
import pytest
from backend.api.features.chat.tools.agent_generator import core
from backend.api.features.chat.tools.agent_generator.core import (
AgentGeneratorNotConfiguredError,
)
class TestServiceNotConfigured:
"""Test that functions raise AgentGeneratorNotConfiguredError when service is not configured."""
@pytest.mark.asyncio
async def test_decompose_goal_raises_when_not_configured(self):
"""Test that decompose_goal raises error when service not configured."""
with patch.object(core, "is_external_service_configured", return_value=False):
with pytest.raises(AgentGeneratorNotConfiguredError):
await core.decompose_goal("Build a chatbot")
@pytest.mark.asyncio
async def test_generate_agent_raises_when_not_configured(self):
"""Test that generate_agent raises error when service not configured."""
with patch.object(core, "is_external_service_configured", return_value=False):
with pytest.raises(AgentGeneratorNotConfiguredError):
await core.generate_agent({"steps": []})
@pytest.mark.asyncio
async def test_generate_agent_patch_raises_when_not_configured(self):
"""Test that generate_agent_patch raises error when service not configured."""
with patch.object(core, "is_external_service_configured", return_value=False):
with pytest.raises(AgentGeneratorNotConfiguredError):
await core.generate_agent_patch("Add a node", {"nodes": []})
class TestDecomposeGoal:
"""Test decompose_goal function service delegation."""
@pytest.mark.asyncio
async def test_calls_external_service(self):
"""Test that decompose_goal calls the external service."""
expected_result = {"type": "instructions", "steps": ["Step 1"]}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "decompose_goal_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
result = await core.decompose_goal("Build a chatbot")
mock_external.assert_called_once_with("Build a chatbot", "")
assert result == expected_result
@pytest.mark.asyncio
async def test_passes_context_to_external_service(self):
"""Test that decompose_goal passes context to external service."""
expected_result = {"type": "instructions", "steps": ["Step 1"]}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "decompose_goal_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
await core.decompose_goal("Build a chatbot", "Use Python")
mock_external.assert_called_once_with("Build a chatbot", "Use Python")
@pytest.mark.asyncio
async def test_returns_none_on_service_failure(self):
"""Test that decompose_goal returns None when external service fails."""
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "decompose_goal_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = None
result = await core.decompose_goal("Build a chatbot")
assert result is None
class TestGenerateAgent:
"""Test generate_agent function service delegation."""
@pytest.mark.asyncio
async def test_calls_external_service(self):
"""Test that generate_agent calls the external service."""
expected_result = {"name": "Test Agent", "nodes": [], "links": []}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
instructions = {"type": "instructions", "steps": ["Step 1"]}
result = await core.generate_agent(instructions)
mock_external.assert_called_once_with(instructions)
# Result should have id, version, is_active added if not present
assert result is not None
assert result["name"] == "Test Agent"
assert "id" in result
assert result["version"] == 1
assert result["is_active"] is True
@pytest.mark.asyncio
async def test_preserves_existing_id_and_version(self):
"""Test that external service result preserves existing id and version."""
expected_result = {
"id": "existing-id",
"version": 3,
"is_active": False,
"name": "Test Agent",
}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result.copy()
result = await core.generate_agent({"steps": []})
assert result is not None
assert result["id"] == "existing-id"
assert result["version"] == 3
assert result["is_active"] is False
@pytest.mark.asyncio
async def test_returns_none_when_external_service_fails(self):
"""Test that generate_agent returns None when external service fails."""
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = None
result = await core.generate_agent({"steps": []})
assert result is None
class TestGenerateAgentPatch:
"""Test generate_agent_patch function service delegation."""
@pytest.mark.asyncio
async def test_calls_external_service(self):
"""Test that generate_agent_patch calls the external service."""
expected_result = {"name": "Updated Agent", "nodes": [], "links": []}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_patch_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
current_agent = {"nodes": [], "links": []}
result = await core.generate_agent_patch("Add a node", current_agent)
mock_external.assert_called_once_with("Add a node", current_agent)
assert result == expected_result
@pytest.mark.asyncio
async def test_returns_clarifying_questions(self):
"""Test that generate_agent_patch returns clarifying questions."""
expected_result = {
"type": "clarifying_questions",
"questions": [{"question": "What type of node?"}],
}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_patch_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
result = await core.generate_agent_patch("Add a node", {"nodes": []})
assert result == expected_result
@pytest.mark.asyncio
async def test_returns_none_when_external_service_fails(self):
"""Test that generate_agent_patch returns None when service fails."""
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_patch_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = None
result = await core.generate_agent_patch("Add a node", {"nodes": []})
assert result is None
class TestJsonToGraph:
"""Test json_to_graph function."""
def test_converts_agent_json_to_graph(self):
"""Test conversion of agent JSON to Graph model."""
agent_json = {
"id": "test-id",
"version": 2,
"is_active": True,
"name": "Test Agent",
"description": "A test agent",
"nodes": [
{
"id": "node1",
"block_id": "block1",
"input_default": {"key": "value"},
"metadata": {"x": 100},
}
],
"links": [
{
"id": "link1",
"source_id": "node1",
"sink_id": "output",
"source_name": "result",
"sink_name": "input",
"is_static": False,
}
],
}
graph = core.json_to_graph(agent_json)
assert graph.id == "test-id"
assert graph.version == 2
assert graph.is_active is True
assert graph.name == "Test Agent"
assert graph.description == "A test agent"
assert len(graph.nodes) == 1
assert graph.nodes[0].id == "node1"
assert graph.nodes[0].block_id == "block1"
assert len(graph.links) == 1
assert graph.links[0].source_id == "node1"
def test_generates_ids_if_missing(self):
"""Test that missing IDs are generated."""
agent_json = {
"name": "Test Agent",
"nodes": [{"block_id": "block1"}],
"links": [],
}
graph = core.json_to_graph(agent_json)
assert graph.id is not None
assert graph.nodes[0].id is not None
if __name__ == "__main__":
pytest.main([__file__, "-v"])
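The deleted suite above effectively documents the delegation contract of the core module. A minimal sketch of the shape its assertions imply, using the names the tests import from backend.api.features.chat.tools.agent_generator.core — the guard and pass-through behaviour are what the tests assert; the body shown here is illustrative:
# Sketch of the delegation pattern the deleted tests exercised.
async def decompose_goal(description: str, context: str = ""):
    if not is_external_service_configured():
        # Asserted by TestServiceNotConfigured for all three entry points.
        raise AgentGeneratorNotConfiguredError()
    # Delegate to the external client; a None result signals service failure
    # and is passed through unchanged (see test_returns_none_on_service_failure).
    return await decompose_goal_external(description, context)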

View File

@@ -1,422 +0,0 @@
"""
Tests for the Agent Generator external service client.
This test suite verifies the external Agent Generator service integration,
including service detection, API calls, and error handling.
"""
from unittest.mock import AsyncMock, MagicMock, patch
import httpx
import pytest
from backend.api.features.chat.tools.agent_generator import service
class TestServiceConfiguration:
"""Test service configuration detection."""
def setup_method(self):
"""Reset settings singleton before each test."""
service._settings = None
service._client = None
def test_external_service_not_configured_when_host_empty(self):
"""Test that external service is not configured when host is empty."""
mock_settings = MagicMock()
mock_settings.config.agentgenerator_host = ""
with patch.object(service, "_get_settings", return_value=mock_settings):
assert service.is_external_service_configured() is False
def test_external_service_configured_when_host_set(self):
"""Test that external service is configured when host is set."""
mock_settings = MagicMock()
mock_settings.config.agentgenerator_host = "agent-generator.local"
with patch.object(service, "_get_settings", return_value=mock_settings):
assert service.is_external_service_configured() is True
def test_get_base_url(self):
"""Test base URL construction."""
mock_settings = MagicMock()
mock_settings.config.agentgenerator_host = "agent-generator.local"
mock_settings.config.agentgenerator_port = 8000
with patch.object(service, "_get_settings", return_value=mock_settings):
url = service._get_base_url()
assert url == "http://agent-generator.local:8000"
class TestDecomposeGoalExternal:
"""Test decompose_goal_external function."""
def setup_method(self):
"""Reset client singleton before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_decompose_goal_returns_instructions(self):
"""Test successful decomposition returning instructions."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "instructions",
"steps": ["Step 1", "Step 2"],
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build a chatbot")
assert result == {"type": "instructions", "steps": ["Step 1", "Step 2"]}
mock_client.post.assert_called_once_with(
"/api/decompose-description", json={"description": "Build a chatbot"}
)
@pytest.mark.asyncio
async def test_decompose_goal_returns_clarifying_questions(self):
"""Test decomposition returning clarifying questions."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "clarifying_questions",
"questions": ["What platform?", "What language?"],
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build something")
assert result == {
"type": "clarifying_questions",
"questions": ["What platform?", "What language?"],
}
@pytest.mark.asyncio
async def test_decompose_goal_with_context(self):
"""Test decomposition with additional context."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "instructions",
"steps": ["Step 1"],
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
await service.decompose_goal_external(
"Build a chatbot", context="Use Python"
)
mock_client.post.assert_called_once_with(
"/api/decompose-description",
json={"description": "Build a chatbot", "user_instruction": "Use Python"},
)
@pytest.mark.asyncio
async def test_decompose_goal_returns_unachievable_goal(self):
"""Test decomposition returning unachievable goal response."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "unachievable_goal",
"reason": "Cannot do X",
"suggested_goal": "Try Y instead",
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Do something impossible")
assert result == {
"type": "unachievable_goal",
"reason": "Cannot do X",
"suggested_goal": "Try Y instead",
}
@pytest.mark.asyncio
async def test_decompose_goal_handles_http_error(self):
"""Test decomposition handles HTTP errors gracefully."""
mock_client = AsyncMock()
mock_client.post.side_effect = httpx.HTTPStatusError(
"Server error", request=MagicMock(), response=MagicMock()
)
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build a chatbot")
assert result is None
@pytest.mark.asyncio
async def test_decompose_goal_handles_request_error(self):
"""Test decomposition handles request errors gracefully."""
mock_client = AsyncMock()
mock_client.post.side_effect = httpx.RequestError("Connection failed")
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build a chatbot")
assert result is None
@pytest.mark.asyncio
async def test_decompose_goal_handles_service_error(self):
"""Test decomposition handles service returning error."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": False,
"error": "Internal error",
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build a chatbot")
assert result is None
class TestGenerateAgentExternal:
"""Test generate_agent_external function."""
def setup_method(self):
"""Reset client singleton before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_generate_agent_success(self):
"""Test successful agent generation."""
agent_json = {
"name": "Test Agent",
"nodes": [],
"links": [],
}
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"agent_json": agent_json,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
instructions = {"type": "instructions", "steps": ["Step 1"]}
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.generate_agent_external(instructions)
assert result == agent_json
mock_client.post.assert_called_once_with(
"/api/generate-agent", json={"instructions": instructions}
)
@pytest.mark.asyncio
async def test_generate_agent_handles_error(self):
"""Test agent generation handles errors gracefully."""
mock_client = AsyncMock()
mock_client.post.side_effect = httpx.RequestError("Connection failed")
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.generate_agent_external({"steps": []})
assert result is None
class TestGenerateAgentPatchExternal:
"""Test generate_agent_patch_external function."""
def setup_method(self):
"""Reset client singleton before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_generate_patch_returns_updated_agent(self):
"""Test successful patch generation returning updated agent."""
updated_agent = {
"name": "Updated Agent",
"nodes": [{"id": "1", "block_id": "test"}],
"links": [],
}
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"agent_json": updated_agent,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
current_agent = {"name": "Old Agent", "nodes": [], "links": []}
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.generate_agent_patch_external(
"Add a new node", current_agent
)
assert result == updated_agent
mock_client.post.assert_called_once_with(
"/api/update-agent",
json={
"update_request": "Add a new node",
"current_agent_json": current_agent,
},
)
@pytest.mark.asyncio
async def test_generate_patch_returns_clarifying_questions(self):
"""Test patch generation returning clarifying questions."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "clarifying_questions",
"questions": ["What type of node?"],
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.generate_agent_patch_external(
"Add something", {"nodes": []}
)
assert result == {
"type": "clarifying_questions",
"questions": ["What type of node?"],
}
class TestHealthCheck:
"""Test health_check function."""
def setup_method(self):
"""Reset singletons before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_health_check_returns_false_when_not_configured(self):
"""Test health check returns False when service not configured."""
with patch.object(
service, "is_external_service_configured", return_value=False
):
result = await service.health_check()
assert result is False
@pytest.mark.asyncio
async def test_health_check_returns_true_when_healthy(self):
"""Test health check returns True when service is healthy."""
mock_response = MagicMock()
mock_response.json.return_value = {
"status": "healthy",
"blocks_loaded": True,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.get.return_value = mock_response
with patch.object(service, "is_external_service_configured", return_value=True):
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.health_check()
assert result is True
mock_client.get.assert_called_once_with("/health")
@pytest.mark.asyncio
async def test_health_check_returns_false_when_not_healthy(self):
"""Test health check returns False when service is not healthy."""
mock_response = MagicMock()
mock_response.json.return_value = {
"status": "unhealthy",
"blocks_loaded": False,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.get.return_value = mock_response
with patch.object(service, "is_external_service_configured", return_value=True):
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.health_check()
assert result is False
@pytest.mark.asyncio
async def test_health_check_returns_false_on_error(self):
"""Test health check returns False on connection error."""
mock_client = AsyncMock()
mock_client.get.side_effect = httpx.RequestError("Connection failed")
with patch.object(service, "is_external_service_configured", return_value=True):
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.health_check()
assert result is False
class TestGetBlocksExternal:
"""Test get_blocks_external function."""
def setup_method(self):
"""Reset client singleton before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_get_blocks_success(self):
"""Test successful blocks retrieval."""
blocks = [
{"id": "block1", "name": "Block 1"},
{"id": "block2", "name": "Block 2"},
]
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"blocks": blocks,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.get.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.get_blocks_external()
assert result == blocks
mock_client.get.assert_called_once_with("/api/blocks")
@pytest.mark.asyncio
async def test_get_blocks_handles_error(self):
"""Test blocks retrieval handles errors gracefully."""
mock_client = AsyncMock()
mock_client.get.side_effect = httpx.RequestError("Connection failed")
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.get_blocks_external()
assert result is None
if __name__ == "__main__":
pytest.main([__file__, "-v"])
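Taken together, these client tests encode the service's HTTP contract: a base URL of http://{agentgenerator_host}:{agentgenerator_port}, POST endpoints /api/decompose-description, /api/generate-agent, and /api/update-agent, GET endpoints /health and /api/blocks, and a {"success": ...} response envelope where network errors and success=false both collapse to None. A minimal sketch of one call under exactly those assumptions (client construction elided; the httpx names are real):
import httpx

async def decompose_goal_external_sketch(client: httpx.AsyncClient, description: str):
    # Sketch of the request/response handling the deleted tests assert.
    try:
        response = await client.post(
            "/api/decompose-description", json={"description": description}
        )
        response.raise_for_status()
    except (httpx.HTTPStatusError, httpx.RequestError):
        return None  # connection/HTTP failures are swallowed, per the error tests
    payload = response.json()
    if not payload.get("success"):
        return None  # service-level errors also yield None
    payload.pop("success", None)  # tests compare against the envelope minus "success"
    return payload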

View File

@@ -86,6 +86,7 @@ export function FloatingSafeModeToggle({
const {
currentHITLSafeMode,
showHITLToggle,
isHITLStateUndetermined,
handleHITLToggle,
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
@@ -98,9 +99,16 @@ export function FloatingSafeModeToggle({
return null;
}
const showHITL = showHITLToggle && !isHITLStateUndetermined;
const showSensitive = showSensitiveActionToggle;
if (!showHITL && !showSensitive) {
return null;
}
return (
<div className={cn("fixed z-50 flex flex-col gap-2", className)}>
{showHITLToggle && (
{showHITL && (
<SafeModeButton
isEnabled={currentHITLSafeMode}
label="Human in the loop block approval"
@@ -111,7 +119,7 @@ export function FloatingSafeModeToggle({
fullWidth={fullWidth}
/>
)}
{showSensitiveActionToggle && (
{showSensitive && (
<SafeModeButton
isEnabled={currentSensitiveActionSafeMode}
label="Sensitive actions blocks approval"

View File

@@ -1,41 +0,0 @@
"use client";
import { createContext, useContext, useRef, type ReactNode } from "react";
interface NewChatContextValue {
onNewChatClick: () => void;
setOnNewChatClick: (handler?: () => void) => void;
performNewChat?: () => void;
setPerformNewChat: (handler?: () => void) => void;
}
const NewChatContext = createContext<NewChatContextValue | null>(null);
export function NewChatProvider({ children }: { children: ReactNode }) {
const onNewChatRef = useRef<(() => void) | undefined>();
const performNewChatRef = useRef<(() => void) | undefined>();
const contextValueRef = useRef<NewChatContextValue>({
onNewChatClick() {
onNewChatRef.current?.();
},
setOnNewChatClick(handler?: () => void) {
onNewChatRef.current = handler;
},
performNewChat() {
performNewChatRef.current?.();
},
setPerformNewChat(handler?: () => void) {
performNewChatRef.current = handler;
},
});
return (
<NewChatContext.Provider value={contextValueRef.current}>
{children}
</NewChatContext.Provider>
);
}
export function useNewChat() {
return useContext(NewChatContext);
}

View File

@@ -1,10 +1,8 @@
"use client";
import { ChatLoader } from "@/components/contextual/Chat/components/ChatLoader/ChatLoader";
import { LoadingSpinner } from "@/components/atoms/LoadingSpinner/LoadingSpinner";
import { NAVBAR_HEIGHT_PX } from "@/lib/constants";
import type { ReactNode } from "react";
import { useEffect } from "react";
import { useNewChat } from "../../NewChatContext";
import { DesktopSidebar } from "./components/DesktopSidebar/DesktopSidebar";
import { LoadingState } from "./components/LoadingState/LoadingState";
import { MobileDrawer } from "./components/MobileDrawer/MobileDrawer";
@@ -35,25 +33,10 @@ export function CopilotShell({ children }: Props) {
isReadyToShowContent,
} = useCopilotShell();
const newChatContext = useNewChat();
const handleNewChatClickWrapper =
newChatContext?.onNewChatClick || handleNewChat;
useEffect(
function registerNewChatHandler() {
if (!newChatContext) return;
newChatContext.setPerformNewChat(handleNewChat);
return function cleanup() {
newChatContext.setPerformNewChat(undefined);
};
},
[newChatContext, handleNewChat],
);
if (!isLoggedIn) {
return (
<div className="flex h-full items-center justify-center">
<ChatLoader />
<LoadingSpinner size="large" />
</div>
);
}
@@ -72,7 +55,7 @@ export function CopilotShell({ children }: Props) {
isFetchingNextPage={isFetchingNextPage}
onSelectSession={handleSelectSession}
onFetchNextPage={fetchNextPage}
onNewChat={handleNewChatClickWrapper}
onNewChat={handleNewChat}
hasActiveSession={Boolean(hasActiveSession)}
/>
)}
@@ -94,7 +77,7 @@ export function CopilotShell({ children }: Props) {
isFetchingNextPage={isFetchingNextPage}
onSelectSession={handleSelectSession}
onFetchNextPage={fetchNextPage}
onNewChat={handleNewChatClickWrapper}
onNewChat={handleNewChat}
onClose={handleCloseDrawer}
onOpenChange={handleDrawerOpenChange}
hasActiveSession={Boolean(hasActiveSession)}

View File

@@ -148,15 +148,13 @@ export function useCopilotShell() {
setHasAutoSelectedSession(false);
}
const isLoading = isSessionsLoading && accumulatedSessions.length === 0;
return {
isMobile,
isDrawerOpen,
isLoggedIn,
hasActiveSession:
Boolean(currentSessionId) && (!isOnHomepage || Boolean(paramSessionId)),
isLoading,
isLoading: isSessionsLoading || !areAllSessionsLoaded,
sessions: visibleSessions,
currentSessionId: sidebarSelectedSessionId,
handleSelectSession,

View File

@@ -1,28 +1,5 @@
import type { User } from "@supabase/supabase-js";
export type PageState =
| { type: "welcome" }
| { type: "newChat" }
| { type: "creating"; prompt: string }
| { type: "chat"; sessionId: string; initialPrompt?: string };
export function getInitialPromptFromState(
pageState: PageState,
storedInitialPrompt: string | undefined,
) {
if (storedInitialPrompt) return storedInitialPrompt;
if (pageState.type === "creating") return pageState.prompt;
if (pageState.type === "chat") return pageState.initialPrompt;
}
export function shouldResetToWelcome(pageState: PageState) {
return (
pageState.type !== "newChat" &&
pageState.type !== "creating" &&
pageState.type !== "welcome"
);
}
export function getGreetingName(user?: User | null): string {
if (!user) return "there";
const metadata = user.user_metadata as Record<string, unknown> | undefined;

View File

@@ -1,11 +1,6 @@
import type { ReactNode } from "react";
import { NewChatProvider } from "./NewChatContext";
import { CopilotShell } from "./components/CopilotShell/CopilotShell";
export default function CopilotLayout({ children }: { children: ReactNode }) {
return (
<NewChatProvider>
<CopilotShell>{children}</CopilotShell>
</NewChatProvider>
);
return <CopilotShell>{children}</CopilotShell>;
}

View File

@@ -1,35 +1,142 @@
"use client";
import { postV2CreateSession } from "@/app/api/__generated__/endpoints/chat/chat";
import { Skeleton } from "@/components/__legacy__/ui/skeleton";
import { Button } from "@/components/atoms/Button/Button";
import { LoadingSpinner } from "@/components/atoms/LoadingSpinner/LoadingSpinner";
import { Text } from "@/components/atoms/Text/Text";
import { Chat } from "@/components/contextual/Chat/Chat";
import { ChatInput } from "@/components/contextual/Chat/components/ChatInput/ChatInput";
import { ChatLoader } from "@/components/contextual/Chat/components/ChatLoader/ChatLoader";
import { Dialog } from "@/components/molecules/Dialog/Dialog";
import { useCopilotPage } from "./useCopilotPage";
import { getHomepageRoute } from "@/lib/constants";
import { useSupabase } from "@/lib/supabase/hooks/useSupabase";
import {
Flag,
type FlagValues,
useGetFlag,
} from "@/services/feature-flags/use-get-flag";
import { useFlags } from "launchdarkly-react-client-sdk";
import { useRouter, useSearchParams } from "next/navigation";
import { useEffect, useMemo, useRef, useState } from "react";
import { getGreetingName, getQuickActions } from "./helpers";
type PageState =
| { type: "welcome" }
| { type: "creating"; prompt: string }
| { type: "chat"; sessionId: string; initialPrompt?: string };
export default function CopilotPage() {
const { state, handlers } = useCopilotPage();
const {
greetingName,
quickActions,
isLoading,
pageState,
isNewChatModalOpen,
isReady,
} = state;
const {
handleQuickAction,
startChatWithPrompt,
handleSessionNotFound,
handleStreamingChange,
handleCancelNewChat,
proceedWithNewChat,
handleNewChatModalOpen,
} = handlers;
const router = useRouter();
const searchParams = useSearchParams();
const { user, isLoggedIn, isUserLoading } = useSupabase();
if (!isReady) {
const isChatEnabled = useGetFlag(Flag.CHAT);
const flags = useFlags<FlagValues>();
const homepageRoute = getHomepageRoute(isChatEnabled);
const envEnabled = process.env.NEXT_PUBLIC_LAUNCHDARKLY_ENABLED === "true";
const clientId = process.env.NEXT_PUBLIC_LAUNCHDARKLY_CLIENT_ID;
const isLaunchDarklyConfigured = envEnabled && Boolean(clientId);
const isFlagReady =
!isLaunchDarklyConfigured || flags[Flag.CHAT] !== undefined;
const [pageState, setPageState] = useState<PageState>({ type: "welcome" });
const initialPromptRef = useRef<Map<string, string>>(new Map());
const urlSessionId = searchParams.get("sessionId");
// Sync with URL sessionId (preserve initialPrompt from ref)
useEffect(
function syncSessionFromUrl() {
if (urlSessionId) {
// If we're already in chat state with this sessionId, don't overwrite
if (pageState.type === "chat" && pageState.sessionId === urlSessionId) {
return;
}
// Get initialPrompt from ref or current state
const storedInitialPrompt = initialPromptRef.current.get(urlSessionId);
const currentInitialPrompt =
storedInitialPrompt ||
(pageState.type === "creating"
? pageState.prompt
: pageState.type === "chat"
? pageState.initialPrompt
: undefined);
if (currentInitialPrompt) {
initialPromptRef.current.set(urlSessionId, currentInitialPrompt);
}
setPageState({
type: "chat",
sessionId: urlSessionId,
initialPrompt: currentInitialPrompt,
});
} else if (pageState.type === "chat") {
setPageState({ type: "welcome" });
}
},
[urlSessionId],
);
useEffect(
function ensureAccess() {
if (!isFlagReady) return;
if (isChatEnabled === false) {
router.replace(homepageRoute);
}
},
[homepageRoute, isChatEnabled, isFlagReady, router],
);
const greetingName = useMemo(
function getName() {
return getGreetingName(user);
},
[user],
);
const quickActions = useMemo(function getActions() {
return getQuickActions();
}, []);
async function startChatWithPrompt(prompt: string) {
if (!prompt?.trim()) return;
if (pageState.type === "creating") return;
const trimmedPrompt = prompt.trim();
setPageState({ type: "creating", prompt: trimmedPrompt });
try {
// Create session
const sessionResponse = await postV2CreateSession({
body: JSON.stringify({}),
});
if (sessionResponse.status !== 200 || !sessionResponse.data?.id) {
throw new Error("Failed to create session");
}
const sessionId = sessionResponse.data.id;
// Store initialPrompt in ref so it persists across re-renders
initialPromptRef.current.set(sessionId, trimmedPrompt);
// Update URL and show Chat with initial prompt
// Chat will handle sending the message and streaming
window.history.replaceState(null, "", `/copilot?sessionId=${sessionId}`);
setPageState({ type: "chat", sessionId, initialPrompt: trimmedPrompt });
} catch (error) {
console.error("[CopilotPage] Failed to start chat:", error);
setPageState({ type: "welcome" });
}
}
function handleQuickAction(action: string) {
startChatWithPrompt(action);
}
function handleSessionNotFound() {
router.replace("/copilot");
}
if (!isFlagReady || isChatEnabled === false || !isLoggedIn) {
return null;
}
@@ -43,55 +150,7 @@ export default function CopilotPage() {
urlSessionId={pageState.sessionId}
initialPrompt={pageState.initialPrompt}
onSessionNotFound={handleSessionNotFound}
onStreamingChange={handleStreamingChange}
/>
<Dialog
title="Interrupt current chat?"
styling={{ maxWidth: 300, width: "100%" }}
controlled={{
isOpen: isNewChatModalOpen,
set: handleNewChatModalOpen,
}}
onClose={handleCancelNewChat}
>
<Dialog.Content>
<div className="flex flex-col gap-4">
<Text variant="body">
The current chat response will be interrupted. Are you sure you
want to start a new chat?
</Text>
<Dialog.Footer>
<Button
type="button"
variant="outline"
onClick={handleCancelNewChat}
>
Cancel
</Button>
<Button
type="button"
variant="primary"
onClick={proceedWithNewChat}
>
Start new chat
</Button>
</Dialog.Footer>
</div>
</Dialog.Content>
</Dialog>
</div>
);
}
if (pageState.type === "newChat") {
return (
<div className="flex h-full flex-1 flex-col items-center justify-center bg-[#f8f8f9]">
<div className="flex flex-col items-center gap-4">
<ChatLoader />
<Text variant="body" className="text-zinc-500">
Loading your chats...
</Text>
</div>
</div>
);
}
@@ -99,18 +158,18 @@ export default function CopilotPage() {
// Show loading state while creating session and sending first message
if (pageState.type === "creating") {
return (
<div className="flex h-full flex-1 flex-col items-center justify-center bg-[#f8f8f9]">
<div className="flex flex-col items-center gap-4">
<ChatLoader />
<Text variant="body" className="text-zinc-500">
Loading your chats...
</Text>
</div>
<div className="flex h-full flex-1 flex-col items-center justify-center bg-[#f8f8f9] px-6 py-10">
<LoadingSpinner size="large" />
<Text variant="body" className="mt-4 text-zinc-500">
Starting your chat...
</Text>
</div>
);
}
// Show Welcome screen
const isLoading = isUserLoading;
return (
<div className="flex h-full flex-1 items-center justify-center overflow-y-auto bg-[#f8f8f9] px-6 py-10">
<div className="w-full text-center">

View File

@@ -1,266 +0,0 @@
import { postV2CreateSession } from "@/app/api/__generated__/endpoints/chat/chat";
import { useToast } from "@/components/molecules/Toast/use-toast";
import { getHomepageRoute } from "@/lib/constants";
import { useSupabase } from "@/lib/supabase/hooks/useSupabase";
import {
Flag,
type FlagValues,
useGetFlag,
} from "@/services/feature-flags/use-get-flag";
import * as Sentry from "@sentry/nextjs";
import { useFlags } from "launchdarkly-react-client-sdk";
import { useRouter } from "next/navigation";
import { useEffect, useReducer } from "react";
import { useNewChat } from "./NewChatContext";
import { getGreetingName, getQuickActions, type PageState } from "./helpers";
import { useCopilotURLState } from "./useCopilotURLState";
type CopilotState = {
pageState: PageState;
isStreaming: boolean;
isNewChatModalOpen: boolean;
initialPrompts: Record<string, string>;
previousSessionId: string | null;
};
type CopilotAction =
| { type: "setPageState"; pageState: PageState }
| { type: "setStreaming"; isStreaming: boolean }
| { type: "setNewChatModalOpen"; isOpen: boolean }
| { type: "setInitialPrompt"; sessionId: string; prompt: string }
| { type: "setPreviousSessionId"; sessionId: string | null };
function isSamePageState(next: PageState, current: PageState) {
if (next.type !== current.type) return false;
if (next.type === "creating" && current.type === "creating") {
return next.prompt === current.prompt;
}
if (next.type === "chat" && current.type === "chat") {
return (
next.sessionId === current.sessionId &&
next.initialPrompt === current.initialPrompt
);
}
return true;
}
function copilotReducer(
state: CopilotState,
action: CopilotAction,
): CopilotState {
if (action.type === "setPageState") {
if (isSamePageState(action.pageState, state.pageState)) return state;
return { ...state, pageState: action.pageState };
}
if (action.type === "setStreaming") {
if (action.isStreaming === state.isStreaming) return state;
return { ...state, isStreaming: action.isStreaming };
}
if (action.type === "setNewChatModalOpen") {
if (action.isOpen === state.isNewChatModalOpen) return state;
return { ...state, isNewChatModalOpen: action.isOpen };
}
if (action.type === "setInitialPrompt") {
if (state.initialPrompts[action.sessionId] === action.prompt) return state;
return {
...state,
initialPrompts: {
...state.initialPrompts,
[action.sessionId]: action.prompt,
},
};
}
if (action.type === "setPreviousSessionId") {
if (state.previousSessionId === action.sessionId) return state;
return { ...state, previousSessionId: action.sessionId };
}
return state;
}
export function useCopilotPage() {
const router = useRouter();
const { user, isLoggedIn, isUserLoading } = useSupabase();
const { toast } = useToast();
const isChatEnabled = useGetFlag(Flag.CHAT);
const flags = useFlags<FlagValues>();
const homepageRoute = getHomepageRoute(isChatEnabled);
const envEnabled = process.env.NEXT_PUBLIC_LAUNCHDARKLY_ENABLED === "true";
const clientId = process.env.NEXT_PUBLIC_LAUNCHDARKLY_CLIENT_ID;
const isLaunchDarklyConfigured = envEnabled && Boolean(clientId);
const isFlagReady =
!isLaunchDarklyConfigured || flags[Flag.CHAT] !== undefined;
const [state, dispatch] = useReducer(copilotReducer, {
pageState: { type: "welcome" },
isStreaming: false,
isNewChatModalOpen: false,
initialPrompts: {},
previousSessionId: null,
});
const newChatContext = useNewChat();
const greetingName = getGreetingName(user);
const quickActions = getQuickActions();
function setPageState(pageState: PageState) {
dispatch({ type: "setPageState", pageState });
}
function setInitialPrompt(sessionId: string, prompt: string) {
dispatch({ type: "setInitialPrompt", sessionId, prompt });
}
function setPreviousSessionId(sessionId: string | null) {
dispatch({ type: "setPreviousSessionId", sessionId });
}
const { setUrlSessionId } = useCopilotURLState({
pageState: state.pageState,
initialPrompts: state.initialPrompts,
previousSessionId: state.previousSessionId,
setPageState,
setInitialPrompt,
setPreviousSessionId,
});
useEffect(
function registerNewChatHandler() {
if (!newChatContext) return;
newChatContext.setOnNewChatClick(handleNewChatClick);
return function cleanup() {
newChatContext.setOnNewChatClick(undefined);
};
},
[newChatContext, handleNewChatClick],
);
useEffect(
function transitionNewChatToWelcome() {
if (state.pageState.type === "newChat") {
function setWelcomeState() {
dispatch({ type: "setPageState", pageState: { type: "welcome" } });
}
const timer = setTimeout(setWelcomeState, 300);
return function cleanup() {
clearTimeout(timer);
};
}
},
[state.pageState.type],
);
useEffect(
function ensureAccess() {
if (!isFlagReady) return;
if (isChatEnabled === false) {
router.replace(homepageRoute);
}
},
[homepageRoute, isChatEnabled, isFlagReady, router],
);
async function startChatWithPrompt(prompt: string) {
if (!prompt?.trim()) return;
if (state.pageState.type === "creating") return;
const trimmedPrompt = prompt.trim();
dispatch({
type: "setPageState",
pageState: { type: "creating", prompt: trimmedPrompt },
});
try {
const sessionResponse = await postV2CreateSession({
body: JSON.stringify({}),
});
if (sessionResponse.status !== 200 || !sessionResponse.data?.id) {
throw new Error("Failed to create session");
}
const sessionId = sessionResponse.data.id;
dispatch({
type: "setInitialPrompt",
sessionId,
prompt: trimmedPrompt,
});
await setUrlSessionId(sessionId, { shallow: false });
dispatch({
type: "setPageState",
pageState: { type: "chat", sessionId, initialPrompt: trimmedPrompt },
});
} catch (error) {
console.error("[CopilotPage] Failed to start chat:", error);
toast({ title: "Failed to start chat", variant: "destructive" });
Sentry.captureException(error);
dispatch({ type: "setPageState", pageState: { type: "welcome" } });
}
}
function handleQuickAction(action: string) {
startChatWithPrompt(action);
}
function handleSessionNotFound() {
router.replace("/copilot");
}
function handleStreamingChange(isStreamingValue: boolean) {
dispatch({ type: "setStreaming", isStreaming: isStreamingValue });
}
async function proceedWithNewChat() {
dispatch({ type: "setNewChatModalOpen", isOpen: false });
if (newChatContext?.performNewChat) {
newChatContext.performNewChat();
return;
}
try {
await setUrlSessionId(null, { shallow: false });
} catch (error) {
console.error("[CopilotPage] Failed to clear session:", error);
}
router.replace("/copilot");
}
function handleCancelNewChat() {
dispatch({ type: "setNewChatModalOpen", isOpen: false });
}
function handleNewChatModalOpen(isOpen: boolean) {
dispatch({ type: "setNewChatModalOpen", isOpen });
}
function handleNewChatClick() {
if (state.isStreaming) {
dispatch({ type: "setNewChatModalOpen", isOpen: true });
} else {
proceedWithNewChat();
}
}
return {
state: {
greetingName,
quickActions,
isLoading: isUserLoading,
pageState: state.pageState,
isNewChatModalOpen: state.isNewChatModalOpen,
isReady: isFlagReady && isChatEnabled !== false && isLoggedIn,
},
handlers: {
handleQuickAction,
startChatWithPrompt,
handleSessionNotFound,
handleStreamingChange,
handleCancelNewChat,
proceedWithNewChat,
handleNewChatModalOpen,
},
};
}

View File

@@ -1,80 +0,0 @@
import { parseAsString, useQueryState } from "nuqs";
import { useLayoutEffect } from "react";
import {
getInitialPromptFromState,
type PageState,
shouldResetToWelcome,
} from "./helpers";
interface UseCopilotUrlStateArgs {
pageState: PageState;
initialPrompts: Record<string, string>;
previousSessionId: string | null;
setPageState: (pageState: PageState) => void;
setInitialPrompt: (sessionId: string, prompt: string) => void;
setPreviousSessionId: (sessionId: string | null) => void;
}
export function useCopilotURLState({
pageState,
initialPrompts,
previousSessionId,
setPageState,
setInitialPrompt,
setPreviousSessionId,
}: UseCopilotUrlStateArgs) {
const [urlSessionId, setUrlSessionId] = useQueryState(
"sessionId",
parseAsString,
);
function syncSessionFromUrl() {
if (urlSessionId) {
if (pageState.type === "chat" && pageState.sessionId === urlSessionId) {
setPreviousSessionId(urlSessionId);
return;
}
const storedInitialPrompt = initialPrompts[urlSessionId];
const currentInitialPrompt = getInitialPromptFromState(
pageState,
storedInitialPrompt,
);
if (currentInitialPrompt) {
setInitialPrompt(urlSessionId, currentInitialPrompt);
}
setPageState({
type: "chat",
sessionId: urlSessionId,
initialPrompt: currentInitialPrompt,
});
setPreviousSessionId(urlSessionId);
return;
}
const wasInChat = previousSessionId !== null && pageState.type === "chat";
setPreviousSessionId(null);
if (wasInChat) {
setPageState({ type: "newChat" });
return;
}
if (shouldResetToWelcome(pageState)) {
setPageState({ type: "welcome" });
}
}
useLayoutEffect(syncSessionFromUrl, [
urlSessionId,
pageState.type,
previousSessionId,
initialPrompts,
]);
return {
urlSessionId,
setUrlSessionId,
};
}

View File

@@ -14,10 +14,6 @@ import {
import { Dialog } from "@/components/molecules/Dialog/Dialog";
import { useEffect, useRef, useState } from "react";
import { ScheduleAgentModal } from "../ScheduleAgentModal/ScheduleAgentModal";
import {
AIAgentSafetyPopup,
useAIAgentSafetyPopup,
} from "./components/AIAgentSafetyPopup/AIAgentSafetyPopup";
import { ModalHeader } from "./components/ModalHeader/ModalHeader";
import { ModalRunSection } from "./components/ModalRunSection/ModalRunSection";
import { RunActions } from "./components/RunActions/RunActions";
@@ -87,18 +83,8 @@ export function RunAgentModal({
const [isScheduleModalOpen, setIsScheduleModalOpen] = useState(false);
const [hasOverflow, setHasOverflow] = useState(false);
const [isSafetyPopupOpen, setIsSafetyPopupOpen] = useState(false);
const [pendingRunAction, setPendingRunAction] = useState<(() => void) | null>(
null,
);
const contentRef = useRef<HTMLDivElement>(null);
const { shouldShowPopup, dismissPopup } = useAIAgentSafetyPopup(
agent.id,
agent.has_sensitive_action,
agent.has_human_in_the_loop,
);
const hasAnySetupFields =
Object.keys(agentInputFields || {}).length > 0 ||
Object.keys(agentCredentialsInputFields || {}).length > 0;
@@ -179,24 +165,6 @@ export function RunAgentModal({
onScheduleCreated?.(schedule);
}
function handleRunWithSafetyCheck() {
if (shouldShowPopup) {
setPendingRunAction(() => handleRun);
setIsSafetyPopupOpen(true);
} else {
handleRun();
}
}
function handleSafetyPopupAcknowledge() {
setIsSafetyPopupOpen(false);
dismissPopup();
if (pendingRunAction) {
pendingRunAction();
setPendingRunAction(null);
}
}
return (
<>
<Dialog
@@ -280,7 +248,7 @@ export function RunAgentModal({
)}
<RunActions
defaultRunType={defaultRunType}
onRun={handleRunWithSafetyCheck}
onRun={handleRun}
isExecuting={isExecuting}
isSettingUpTrigger={isSettingUpTrigger}
isRunReady={allRequiredInputsAreSet}
@@ -298,12 +266,6 @@ export function RunAgentModal({
</div>
</Dialog.Content>
</Dialog>
<AIAgentSafetyPopup
agentId={agent.id}
isOpen={isSafetyPopupOpen}
onAcknowledge={handleSafetyPopupAcknowledge}
/>
</>
);
}

View File

@@ -1,108 +0,0 @@
"use client";
import { Button } from "@/components/atoms/Button/Button";
import { Text } from "@/components/atoms/Text/Text";
import { Dialog } from "@/components/molecules/Dialog/Dialog";
import { Key, storage } from "@/services/storage/local-storage";
import { ShieldCheckIcon } from "@phosphor-icons/react";
import { useCallback, useEffect, useState } from "react";
interface Props {
agentId: string;
onAcknowledge: () => void;
isOpen: boolean;
}
export function AIAgentSafetyPopup({ agentId, onAcknowledge, isOpen }: Props) {
function handleAcknowledge() {
// Add this agent to the list of agents for which popup has been shown
const seenAgentsJson = storage.get(Key.AI_AGENT_SAFETY_POPUP_SHOWN);
const seenAgents: string[] = seenAgentsJson
? JSON.parse(seenAgentsJson)
: [];
if (!seenAgents.includes(agentId)) {
seenAgents.push(agentId);
storage.set(Key.AI_AGENT_SAFETY_POPUP_SHOWN, JSON.stringify(seenAgents));
}
onAcknowledge();
}
if (!isOpen) return null;
return (
<Dialog
controlled={{ isOpen, set: () => {} }}
styling={{ maxWidth: "480px" }}
>
<Dialog.Content>
<div className="flex flex-col items-center p-6 text-center">
<div className="mb-6 flex h-16 w-16 items-center justify-center rounded-full bg-blue-50">
<ShieldCheckIcon
weight="fill"
size={32}
className="text-blue-600"
/>
</div>
<Text variant="h3" className="mb-4">
Safety Checks Enabled
</Text>
<Text variant="body" className="mb-2 text-zinc-700">
AI-generated agents may take actions that affect your data or
external systems.
</Text>
<Text variant="body" className="mb-8 text-zinc-700">
AutoGPT includes safety checks so you&apos;ll always have the
opportunity to review and approve sensitive actions before they
happen.
</Text>
<Button
variant="primary"
size="large"
className="w-full"
onClick={handleAcknowledge}
>
Got it
</Button>
</div>
</Dialog.Content>
</Dialog>
);
}
export function useAIAgentSafetyPopup(
agentId: string,
hasSensitiveAction: boolean,
hasHumanInTheLoop: boolean,
) {
const [shouldShowPopup, setShouldShowPopup] = useState(false);
const [hasChecked, setHasChecked] = useState(false);
useEffect(() => {
if (hasChecked) return;
const seenAgentsJson = storage.get(Key.AI_AGENT_SAFETY_POPUP_SHOWN);
const seenAgents: string[] = seenAgentsJson
? JSON.parse(seenAgentsJson)
: [];
const hasSeenPopupForThisAgent = seenAgents.includes(agentId);
const isRelevantAgent = hasSensitiveAction || hasHumanInTheLoop;
setShouldShowPopup(!hasSeenPopupForThisAgent && isRelevantAgent);
setHasChecked(true);
}, [agentId, hasSensitiveAction, hasHumanInTheLoop, hasChecked]);
const dismissPopup = useCallback(() => {
setShouldShowPopup(false);
}, []);
return {
shouldShowPopup,
dismissPopup,
};
}

View File

@@ -69,6 +69,7 @@ export function SafeModeToggle({ graph, className }: Props) {
const {
currentHITLSafeMode,
showHITLToggle,
isHITLStateUndetermined,
handleHITLToggle,
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
@@ -77,13 +78,20 @@ export function SafeModeToggle({ graph, className }: Props) {
shouldShowToggle,
} = useAgentSafeMode(graph);
if (!shouldShowToggle) {
if (!shouldShowToggle || isHITLStateUndetermined) {
return null;
}
const showHITL = showHITLToggle && !isHITLStateUndetermined;
const showSensitive = showSensitiveActionToggle;
if (!showHITL && !showSensitive) {
return null;
}
return (
<div className={cn("flex gap-1", className)}>
{showHITLToggle && (
{showHITL && (
<SafeModeIconButton
isEnabled={currentHITLSafeMode}
label="Human-in-the-loop"
@@ -93,7 +101,7 @@ export function SafeModeToggle({ graph, className }: Props) {
isPending={isPending}
/>
)}
{showSensitiveActionToggle && (
{showSensitive && (
<SafeModeIconButton
isEnabled={currentSensitiveActionSafeMode}
label="Sensitive actions"

View File

@@ -8809,12 +8809,6 @@
"title": "Node Exec Id",
"description": "Node execution ID (primary key)"
},
"node_id": {
"type": "string",
"title": "Node Id",
"description": "Node definition ID (for grouping)",
"default": ""
},
"user_id": {
"type": "string",
"title": "User Id",
@@ -8914,7 +8908,7 @@
"created_at"
],
"title": "PendingHumanReviewModel",
"description": "Response model for pending human review data.\n\nRepresents a human review request that is awaiting user action.\nContains all necessary information for a user to review and approve\nor reject data from a Human-in-the-Loop block execution.\n\nAttributes:\n id: Unique identifier for the review record\n user_id: ID of the user who must perform the review\n node_exec_id: ID of the node execution that created this review\n node_id: ID of the node definition (for grouping reviews from same node)\n graph_exec_id: ID of the graph execution containing the node\n graph_id: ID of the graph template being executed\n graph_version: Version number of the graph template\n payload: The actual data payload awaiting review\n instructions: Instructions or message for the reviewer\n editable: Whether the reviewer can edit the data\n status: Current review status (WAITING, APPROVED, or REJECTED)\n review_message: Optional message from the reviewer\n created_at: Timestamp when review was created\n updated_at: Timestamp when review was last modified\n reviewed_at: Timestamp when review was completed (if applicable)"
"description": "Response model for pending human review data.\n\nRepresents a human review request that is awaiting user action.\nContains all necessary information for a user to review and approve\nor reject data from a Human-in-the-Loop block execution.\n\nAttributes:\n id: Unique identifier for the review record\n user_id: ID of the user who must perform the review\n node_exec_id: ID of the node execution that created this review\n graph_exec_id: ID of the graph execution containing the node\n graph_id: ID of the graph template being executed\n graph_version: Version number of the graph template\n payload: The actual data payload awaiting review\n instructions: Instructions or message for the reviewer\n editable: Whether the reviewer can edit the data\n status: Current review status (WAITING, APPROVED, or REJECTED)\n review_message: Optional message from the reviewer\n created_at: Timestamp when review was created\n updated_at: Timestamp when review was last modified\n reviewed_at: Timestamp when review was completed (if applicable)"
},
"PostmarkBounceEnum": {
"type": "integer",
@@ -9417,12 +9411,6 @@
],
"title": "Reviewed Data",
"description": "Optional edited data (ignored if approved=False)"
},
"auto_approve_future": {
"type": "boolean",
"title": "Auto Approve Future",
"description": "If true and this review is approved, future executions of this same block (node) will be automatically approved. This only affects approved reviews.",
"default": false
}
},
"type": "object",
@@ -9442,7 +9430,7 @@
"type": "object",
"required": ["reviews"],
"title": "ReviewRequest",
"description": "Request model for processing ALL pending reviews for an execution.\n\nThis request must include ALL pending reviews for a graph execution.\nEach review will be either approved (with optional data modifications)\nor rejected (data ignored). The execution will resume only after ALL reviews are processed.\n\nEach review item can individually specify whether to auto-approve future executions\nof the same block via the `auto_approve_future` field on ReviewItem."
"description": "Request model for processing ALL pending reviews for an execution.\n\nThis request must include ALL pending reviews for a graph execution.\nEach review will be either approved (with optional data modifications)\nor rejected (data ignored). The execution will resume only after ALL reviews are processed."
},
"ReviewResponse": {
"properties": {

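The trimmed ReviewRequest description is easier to read with a concrete payload in mind. A hypothetical body consistent with the schema above — only reviews, approved, and reviewed_data appear in this hunk, so the per-item identifier key is invented for illustration:
# Hypothetical ReviewRequest payload: ALL pending reviews for the execution,
# each approved (optionally with edited data) or rejected.
review_request = {
    "reviews": [
        {"node_exec_id": "node-exec-1", "approved": True, "reviewed_data": {"text": "edited"}},
        {"node_exec_id": "node-exec-2", "approved": False},  # reviewed_data ignored when approved=False
    ]
}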
View File

@@ -13,7 +13,6 @@ export interface ChatProps {
urlSessionId?: string | null;
initialPrompt?: string;
onSessionNotFound?: () => void;
onStreamingChange?: (isStreaming: boolean) => void;
}
export function Chat({
@@ -21,7 +20,6 @@ export function Chat({
urlSessionId,
initialPrompt,
onSessionNotFound,
onStreamingChange,
}: ChatProps) {
const hasHandledNotFoundRef = useRef(false);
const {
@@ -75,7 +73,6 @@ export function Chat({
initialMessages={messages}
initialPrompt={initialPrompt}
className="flex-1"
onStreamingChange={onStreamingChange}
/>
)}
</main>

View File

@@ -4,7 +4,6 @@ import { Text } from "@/components/atoms/Text/Text";
import { Dialog } from "@/components/molecules/Dialog/Dialog";
import { useBreakpoint } from "@/lib/hooks/useBreakpoint";
import { cn } from "@/lib/utils";
import { useEffect } from "react";
import { ChatInput } from "../ChatInput/ChatInput";
import { MessageList } from "../MessageList/MessageList";
import { useChatContainer } from "./useChatContainer";
@@ -14,7 +13,6 @@ export interface ChatContainerProps {
initialMessages: SessionDetailResponse["messages"];
initialPrompt?: string;
className?: string;
onStreamingChange?: (isStreaming: boolean) => void;
}
export function ChatContainer({
@@ -22,7 +20,6 @@ export function ChatContainer({
initialMessages,
initialPrompt,
className,
onStreamingChange,
}: ChatContainerProps) {
const {
messages,
@@ -39,10 +36,6 @@ export function ChatContainer({
initialPrompt,
});
useEffect(() => {
onStreamingChange?.(isStreaming);
}, [isStreaming, onStreamingChange]);
const breakpoint = useBreakpoint();
const isMobile =
breakpoint === "base" || breakpoint === "sm" || breakpoint === "md";

View File

@@ -1,7 +1,12 @@
import { Text } from "@/components/atoms/Text/Text";
export function ChatLoader() {
return (
<div className="flex items-center gap-2">
<div className="h-5 w-5 animate-loader rounded-full bg-black" />
</div>
<Text
variant="small"
className="bg-gradient-to-r from-neutral-600 via-neutral-500 to-neutral-600 bg-[length:200%_100%] bg-clip-text text-xs text-transparent [animation:shimmer_2s_ease-in-out_infinite]"
>
Taking a bit more time...
</Text>
);
}

View File

@@ -7,6 +7,7 @@ import {
ArrowsClockwiseIcon,
CheckCircleIcon,
CheckIcon,
CopyIcon,
} from "@phosphor-icons/react";
import { useRouter } from "next/navigation";
import { useCallback, useState } from "react";
@@ -339,26 +340,11 @@ export function ChatMessage({
size="icon"
onClick={handleCopy}
aria-label="Copy message"
className="p-1"
>
{copied ? (
<CheckIcon className="size-4 text-green-600" />
) : (
<svg
xmlns="http://www.w3.org/2000/svg"
width="24"
height="24"
viewBox="0 0 24 24"
fill="none"
stroke="currentColor"
strokeWidth="2"
strokeLinecap="round"
strokeLinejoin="round"
className="size-3 text-zinc-600"
>
<rect width="14" height="14" x="8" y="8" rx="2" ry="2" />
<path d="M4 16c-1.1 0-2-.9-2-2V4c0-1.1.9-2 2-2h10c1.1 0 2 .9 2 2" />
</svg>
<CopyIcon className="size-4 text-zinc-600" />
)}
</Button>
)}

View File

@@ -1,6 +1,7 @@
import { cn } from "@/lib/utils";
import { useEffect, useRef, useState } from "react";
import { AIChatBubble } from "../AIChatBubble/AIChatBubble";
import { ChatLoader } from "../ChatLoader/ChatLoader";
export interface ThinkingMessageProps {
className?: string;
@@ -8,9 +9,7 @@ export interface ThinkingMessageProps {
export function ThinkingMessage({ className }: ThinkingMessageProps) {
const [showSlowLoader, setShowSlowLoader] = useState(false);
const [showCoffeeMessage, setShowCoffeeMessage] = useState(false);
const timerRef = useRef<NodeJS.Timeout | null>(null);
const coffeeTimerRef = useRef<NodeJS.Timeout | null>(null);
useEffect(() => {
if (timerRef.current === null) {
@@ -19,21 +18,11 @@ export function ThinkingMessage({ className }: ThinkingMessageProps) {
}, 8000);
}
if (coffeeTimerRef.current === null) {
coffeeTimerRef.current = setTimeout(() => {
setShowCoffeeMessage(true);
}, 10000);
}
return () => {
if (timerRef.current) {
clearTimeout(timerRef.current);
timerRef.current = null;
}
if (coffeeTimerRef.current) {
clearTimeout(coffeeTimerRef.current);
coffeeTimerRef.current = null;
}
};
}, []);
@@ -48,16 +37,16 @@ export function ThinkingMessage({ className }: ThinkingMessageProps) {
<div className="flex min-w-0 flex-1 flex-col">
<AIChatBubble>
<div className="transition-all duration-500 ease-in-out">
{showCoffeeMessage ? (
<span className="inline-block animate-shimmer bg-gradient-to-r from-neutral-400 via-neutral-600 to-neutral-400 bg-[length:200%_100%] bg-clip-text text-transparent">
This could take a few minutes, grab a coffee
</span>
) : showSlowLoader ? (
<span className="inline-block animate-shimmer bg-gradient-to-r from-neutral-400 via-neutral-600 to-neutral-400 bg-[length:200%_100%] bg-clip-text text-transparent">
Taking a bit more time...
</span>
{showSlowLoader ? (
<ChatLoader />
) : (
<span className="inline-block animate-shimmer bg-gradient-to-r from-neutral-400 via-neutral-600 to-neutral-400 bg-[length:200%_100%] bg-clip-text text-transparent">
<span
className="inline-block bg-gradient-to-r from-neutral-400 via-neutral-600 to-neutral-400 bg-clip-text text-transparent"
style={{
backgroundSize: "200% 100%",
animation: "shimmer 2s ease-in-out infinite",
}}
>
Thinking...
</span>
)}

View File

@@ -31,29 +31,6 @@ export function FloatingReviewsPanel({
query: {
enabled: !!(graphId && executionId),
select: okData,
// Poll while execution is in progress to detect status changes
refetchInterval: (q) => {
// Note: refetchInterval callback receives raw data before select transform
const rawData = q.state.data as
| { status: number; data?: { status?: string } }
| undefined;
if (rawData?.status !== 200) return false;
const status = rawData?.data?.status;
if (!status) return false;
// Poll every 2 seconds while running or in review
if (
status === AgentExecutionStatus.RUNNING ||
status === AgentExecutionStatus.QUEUED ||
status === AgentExecutionStatus.INCOMPLETE ||
status === AgentExecutionStatus.REVIEW
) {
return 2000;
}
return false;
},
refetchIntervalInBackground: true,
},
},
);
@@ -63,47 +40,28 @@ export function FloatingReviewsPanel({
useShallow((state) => state.graphExecutionStatus),
);
// Determine if we should poll for pending reviews
const isInReviewStatus =
executionDetails?.status === AgentExecutionStatus.REVIEW ||
graphExecutionStatus === AgentExecutionStatus.REVIEW;
const { pendingReviews, isLoading, refetch } = usePendingReviewsForExecution(
executionId || "",
{
enabled: !!executionId,
// Poll every 2 seconds when in REVIEW status to catch new reviews
refetchInterval: isInReviewStatus ? 2000 : false,
},
);
// Refetch pending reviews when execution status changes
useEffect(() => {
if (executionId && executionDetails?.status) {
if (executionId) {
refetch();
}
}, [executionDetails?.status, executionId, refetch]);
// Hide panel if:
// 1. No execution ID
// 2. No pending reviews and not in REVIEW status
// 3. Execution is RUNNING or QUEUED (hasn't paused for review yet)
if (!executionId) {
return null;
}
// Refetch when graph execution status changes to REVIEW
useEffect(() => {
if (graphExecutionStatus === AgentExecutionStatus.REVIEW && executionId) {
refetch();
}
}, [graphExecutionStatus, executionId, refetch]);
if (
!isLoading &&
pendingReviews.length === 0 &&
executionDetails?.status !== AgentExecutionStatus.REVIEW
) {
return null;
}
// Don't show panel while execution is still running/queued (not paused for review)
if (
executionDetails?.status === AgentExecutionStatus.RUNNING ||
executionDetails?.status === AgentExecutionStatus.QUEUED
!executionId ||
(!isLoading &&
pendingReviews.length === 0 &&
executionDetails?.status !== AgentExecutionStatus.REVIEW)
) {
return null;
}

View File

@@ -1,8 +1,10 @@
import { PendingHumanReviewModel } from "@/app/api/__generated__/models/pendingHumanReviewModel";
import { Text } from "@/components/atoms/Text/Text";
import { Button } from "@/components/atoms/Button/Button";
import { Input } from "@/components/atoms/Input/Input";
import { Switch } from "@/components/atoms/Switch/Switch";
import { useEffect, useState } from "react";
import { TrashIcon, EyeSlashIcon } from "@phosphor-icons/react";
import { useState } from "react";
interface StructuredReviewPayload {
data: unknown;
@@ -38,49 +40,37 @@ function extractReviewData(payload: unknown): {
interface PendingReviewCardProps {
review: PendingHumanReviewModel;
onReviewDataChange: (nodeExecId: string, data: string) => void;
autoApproveFuture?: boolean;
onAutoApproveFutureChange?: (nodeExecId: string, enabled: boolean) => void;
externalDataValue?: string;
showAutoApprove?: boolean;
nodeId?: string;
reviewMessage?: string;
onReviewMessageChange?: (nodeExecId: string, message: string) => void;
isDisabled?: boolean;
onToggleDisabled?: (nodeExecId: string) => void;
}
export function PendingReviewCard({
review,
onReviewDataChange,
autoApproveFuture = false,
onAutoApproveFutureChange,
externalDataValue,
showAutoApprove = true,
nodeId,
reviewMessage = "",
onReviewMessageChange,
isDisabled = false,
onToggleDisabled,
}: PendingReviewCardProps) {
const extractedData = extractReviewData(review.payload);
const isDataEditable = review.editable;
let instructions = review.instructions;
const isHITLBlock = instructions && !instructions.includes("Block");
if (instructions && !isHITLBlock) {
instructions = undefined;
}
const instructions = extractedData.instructions || review.instructions;
const [currentData, setCurrentData] = useState(extractedData.data);
useEffect(() => {
if (externalDataValue !== undefined) {
try {
const parsedData = JSON.parse(externalDataValue);
setCurrentData(parsedData);
} catch {}
}
}, [externalDataValue]);
const handleDataChange = (newValue: unknown) => {
setCurrentData(newValue);
onReviewDataChange(review.node_exec_id, JSON.stringify(newValue, null, 2));
};
const handleMessageChange = (newMessage: string) => {
onReviewMessageChange?.(review.node_exec_id, newMessage);
};
// Show simplified view when no toggle functionality is provided (Screenshot 1 mode)
const showSimplified = !onToggleDisabled;
const renderDataInput = () => {
const data = currentData;
@@ -147,59 +137,97 @@ export function PendingReviewCard({
}
};
const getShortenedNodeId = (id: string) => {
if (id.length <= 8) return id;
return `${id.slice(0, 4)}...${id.slice(-4)}`;
// Helper function to get proper field label
const getFieldLabel = (instructions?: string) => {
if (instructions)
return instructions.charAt(0).toUpperCase() + instructions.slice(1);
return "Data to Review";
};
// Use the existing HITL review interface
return (
<div className="space-y-4">
{nodeId && (
<Text variant="small" className="text-gray-500">
Node #{getShortenedNodeId(nodeId)}
</Text>
{!showSimplified && (
<div className="flex items-start justify-between">
<div className="flex-1">
{isDisabled && (
<Text variant="small" className="text-muted-foreground">
This item will be rejected
</Text>
)}
</div>
<Button
onClick={() => onToggleDisabled!(review.node_exec_id)}
variant={isDisabled ? "primary" : "secondary"}
size="small"
leftIcon={
isDisabled ? <EyeSlashIcon size={14} /> : <TrashIcon size={14} />
}
>
{isDisabled ? "Include" : "Exclude"}
</Button>
</div>
)}
<div className="space-y-3">
{instructions && (
{/* Show instructions as field label */}
{instructions && (
<div className="space-y-3">
<Text variant="body" className="font-semibold text-gray-900">
{instructions}
{getFieldLabel(instructions)}
</Text>
)}
{isDataEditable && !autoApproveFuture ? (
renderDataInput()
) : (
<div className="rounded-lg border border-gray-200 bg-white p-3">
<Text variant="small" className="text-gray-600">
{JSON.stringify(currentData, null, 2)}
</Text>
</div>
)}
</div>
{/* Auto-approve toggle for this review */}
{showAutoApprove && onAutoApproveFutureChange && (
<div className="space-y-2 pt-2">
<div className="flex items-center gap-3">
<Switch
checked={autoApproveFuture}
onCheckedChange={(enabled: boolean) =>
onAutoApproveFutureChange(review.node_exec_id, enabled)
}
/>
<Text variant="small" className="text-gray-700">
Auto-approve future executions of this block
</Text>
</div>
{autoApproveFuture && (
<Text variant="small" className="pl-11 text-gray-500">
Original data will be used for this and all future reviews from
this block.
</Text>
{isDataEditable && !isDisabled ? (
renderDataInput()
) : (
<div className="rounded-lg border border-gray-200 bg-white p-3">
<Text variant="small" className="text-gray-600">
{JSON.stringify(currentData, null, 2)}
</Text>
</div>
)}
</div>
)}
{/* If no instructions, show data directly */}
{!instructions && (
<div className="space-y-3">
<Text variant="body" className="font-semibold text-gray-900">
Data to Review
{!isDataEditable && (
<span className="ml-2 text-xs text-muted-foreground">
(Read-only)
</span>
)}
</Text>
{isDataEditable && !isDisabled ? (
renderDataInput()
) : (
<div className="rounded-lg border border-gray-200 bg-white p-3">
<Text variant="small" className="text-gray-600">
{JSON.stringify(currentData, null, 2)}
</Text>
</div>
)}
</div>
)}
{!showSimplified && isDisabled && (
<div>
<Text variant="body" className="mb-2 font-semibold">
Rejection Reason (Optional):
</Text>
<Input
id="rejection-reason"
label="Rejection Reason"
hideLabel
size="small"
type="textarea"
rows={3}
value={reviewMessage}
onChange={(e) => handleMessageChange(e.target.value)}
placeholder="Add any notes about why you're rejecting this..."
/>
</div>
)}
</div>
);
}

View File

@@ -1,16 +1,10 @@
import { useMemo, useState } from "react";
import { useState } from "react";
import { PendingHumanReviewModel } from "@/app/api/__generated__/models/pendingHumanReviewModel";
import { PendingReviewCard } from "@/components/organisms/PendingReviewCard/PendingReviewCard";
import { Text } from "@/components/atoms/Text/Text";
import { Button } from "@/components/atoms/Button/Button";
import { Switch } from "@/components/atoms/Switch/Switch";
import { useToast } from "@/components/molecules/Toast/use-toast";
import {
ClockIcon,
WarningIcon,
CaretDownIcon,
CaretRightIcon,
} from "@phosphor-icons/react";
import { ClockIcon, WarningIcon } from "@phosphor-icons/react";
import { usePostV2ProcessReviewAction } from "@/app/api/__generated__/endpoints/executions/executions";
interface PendingReviewsListProps {
@@ -38,34 +32,16 @@ export function PendingReviewsList({
},
);
const [reviewMessageMap, setReviewMessageMap] = useState<
Record<string, string>
>({});
const [pendingAction, setPendingAction] = useState<
"approve" | "reject" | null
>(null);
const [autoApproveFutureMap, setAutoApproveFutureMap] = useState<
Record<string, boolean>
>({});
const [collapsedGroups, setCollapsedGroups] = useState<
Record<string, boolean>
>({});
const { toast } = useToast();
const groupedReviews = useMemo(() => {
return reviews.reduce(
(acc, review) => {
const nodeId = review.node_id || "unknown";
if (!acc[nodeId]) {
acc[nodeId] = [];
}
acc[nodeId].push(review);
return acc;
},
{} as Record<string, PendingHumanReviewModel[]>,
);
}, [reviews]);
const reviewActionMutation = usePostV2ProcessReviewAction({
mutation: {
onSuccess: (res) => {
@@ -112,33 +88,8 @@ export function PendingReviewsList({
setReviewDataMap((prev) => ({ ...prev, [nodeExecId]: data }));
}
function handleAutoApproveFutureToggle(nodeId: string, enabled: boolean) {
setAutoApproveFutureMap((prev) => ({
...prev,
[nodeId]: enabled,
}));
if (enabled) {
const nodeReviews = groupedReviews[nodeId] || [];
setReviewDataMap((prev) => {
const updated = { ...prev };
nodeReviews.forEach((review) => {
updated[review.node_exec_id] = JSON.stringify(
review.payload,
null,
2,
);
});
return updated;
});
}
}
function toggleGroupCollapse(nodeId: string) {
setCollapsedGroups((prev) => ({
...prev,
[nodeId]: !prev[nodeId],
}));
function handleReviewMessageChange(nodeExecId: string, message: string) {
setReviewMessageMap((prev) => ({ ...prev, [nodeExecId]: message }));
}
function processReviews(approved: boolean) {
@@ -156,25 +107,22 @@ export function PendingReviewsList({
for (const review of reviews) {
const reviewData = reviewDataMap[review.node_exec_id];
const autoApproveThisNode = autoApproveFutureMap[review.node_id || ""];
const reviewMessage = reviewMessageMap[review.node_exec_id];
let parsedData: any = undefined;
let parsedData: any = review.payload; // Default to original payload
if (!autoApproveThisNode) {
if (review.editable && reviewData) {
try {
parsedData = JSON.parse(reviewData);
} catch (error) {
toast({
title: "Invalid JSON",
description: `Please fix the JSON format in review for node ${review.node_exec_id}: ${error instanceof Error ? error.message : "Invalid syntax"}`,
variant: "destructive",
});
setPendingAction(null);
return;
}
} else {
parsedData = review.payload;
// Parse edited data if available and editable
if (review.editable && reviewData) {
try {
parsedData = JSON.parse(reviewData);
} catch (error) {
toast({
title: "Invalid JSON",
description: `Please fix the JSON format in review for node ${review.node_exec_id}: ${error instanceof Error ? error.message : "Invalid syntax"}`,
variant: "destructive",
});
setPendingAction(null);
return;
}
}
@@ -182,7 +130,7 @@ export function PendingReviewsList({
node_exec_id: review.node_exec_id,
approved,
reviewed_data: parsedData,
auto_approve_future: autoApproveThisNode && approved,
message: reviewMessage || undefined,
});
}
@@ -210,6 +158,7 @@ export function PendingReviewsList({
return (
<div className="space-y-7 rounded-xl border border-yellow-150 bg-yellow-25 p-6">
{/* Warning Box Header */}
<div className="space-y-6">
<div className="flex items-start gap-2">
<WarningIcon
@@ -231,76 +180,23 @@ export function PendingReviewsList({
</div>
<div className="space-y-7">
{Object.entries(groupedReviews).map(([nodeId, nodeReviews]) => {
const isCollapsed = collapsedGroups[nodeId] ?? nodeReviews.length > 1;
const reviewCount = nodeReviews.length;
const firstReview = nodeReviews[0];
const blockName = firstReview?.instructions;
const reviewTitle = `Review required for ${blockName}`;
const getShortenedNodeId = (id: string) => {
if (id.length <= 8) return id;
return `${id.slice(0, 4)}...${id.slice(-4)}`;
};
return (
<div key={nodeId} className="space-y-4">
<button
onClick={() => toggleGroupCollapse(nodeId)}
className="flex w-full items-center gap-2 text-left"
>
{isCollapsed ? (
<CaretRightIcon size={20} className="text-gray-600" />
) : (
<CaretDownIcon size={20} className="text-gray-600" />
)}
<div className="flex-1">
<Text variant="body" className="font-semibold text-gray-900">
{reviewTitle}
</Text>
<Text variant="small" className="text-gray-500">
Node #{getShortenedNodeId(nodeId)}
</Text>
</div>
<span className="text-xs text-gray-600">
{reviewCount} {reviewCount === 1 ? "review" : "reviews"}
</span>
</button>
{!isCollapsed && (
<div className="space-y-4">
{nodeReviews.map((review) => (
<PendingReviewCard
key={review.node_exec_id}
review={review}
onReviewDataChange={handleReviewDataChange}
autoApproveFuture={autoApproveFutureMap[nodeId] || false}
externalDataValue={reviewDataMap[review.node_exec_id]}
showAutoApprove={false}
/>
))}
<div className="flex items-center gap-3 pt-2">
<Switch
checked={autoApproveFutureMap[nodeId] || false}
onCheckedChange={(enabled: boolean) =>
handleAutoApproveFutureToggle(nodeId, enabled)
}
/>
<Text variant="small" className="text-gray-700">
Auto-approve future executions of this node
</Text>
</div>
</div>
)}
</div>
);
})}
{reviews.map((review) => (
<PendingReviewCard
key={review.node_exec_id}
review={review}
onReviewDataChange={handleReviewDataChange}
onReviewMessageChange={handleReviewMessageChange}
reviewMessage={reviewMessageMap[review.node_exec_id] || ""}
/>
))}
</div>
<div className="space-y-4">
<div className="flex flex-wrap gap-2">
<div className="space-y-7">
<Text variant="body" className="text-textGrey">
Note: Changes you make here apply only to this task
</Text>
<div className="flex gap-2">
<Button
onClick={() => processReviews(true)}
disabled={reviewActionMutation.isPending || reviews.length === 0}
@@ -324,11 +220,6 @@ export function PendingReviewsList({
Reject
</Button>
</div>
<Text variant="small" className="text-textGrey">
You can turn auto-approval on or off using the toggle above for each
node.
</Text>
</div>
</div>
);

View File

@@ -15,22 +15,8 @@ export function usePendingReviews() {
};
}
interface UsePendingReviewsForExecutionOptions {
enabled?: boolean;
refetchInterval?: number | false;
}
export function usePendingReviewsForExecution(
graphExecId: string,
options?: UsePendingReviewsForExecutionOptions,
) {
const query = useGetV2GetPendingReviewsForExecution(graphExecId, {
query: {
enabled: options?.enabled ?? !!graphExecId,
refetchInterval: options?.refetchInterval,
refetchIntervalInBackground: !!options?.refetchInterval,
},
});
export function usePendingReviewsForExecution(graphExecId: string) {
const query = useGetV2GetPendingReviewsForExecution(graphExecId);
return {
pendingReviews: okData(query.data) || [],

View File

@@ -10,7 +10,6 @@ export enum Key {
LIBRARY_AGENTS_CACHE = "library-agents-cache",
CHAT_SESSION_ID = "chat_session_id",
COOKIE_CONSENT = "autogpt_cookie_consent",
AI_AGENT_SAFETY_POPUP_SHOWN = "ai-agent-safety-popup-shown",
}
function get(key: Key) {

View File

@@ -157,21 +157,12 @@ const config = {
backgroundPosition: "-200% 0",
},
},
loader: {
"0%": {
boxShadow: "0 0 0 0 rgba(0, 0, 0, 0.25)",
},
"100%": {
boxShadow: "0 0 0 30px rgba(0, 0, 0, 0)",
},
},
},
animation: {
"accordion-down": "accordion-down 0.2s ease-out",
"accordion-up": "accordion-up 0.2s ease-out",
"fade-in": "fade-in 0.2s ease-out",
shimmer: "shimmer 2s ease-in-out infinite",
loader: "loader 1s infinite",
},
transitionDuration: {
"2000": "2000ms",

View File

@@ -216,7 +216,6 @@ Below is a comprehensive list of all available blocks, categorized by their prim
| [AI Text Summarizer](block-integrations/llm.md#ai-text-summarizer) | A block that summarizes long texts using a Large Language Model (LLM), with configurable focus topics and summary styles |
| [AI Video Generator](block-integrations/fal/ai_video_generator.md#ai-video-generator) | Generate videos using FAL AI models |
| [Bannerbear Text Overlay](block-integrations/bannerbear/text_overlay.md#bannerbear-text-overlay) | Add text overlay to images using Bannerbear templates |
| [Claude Code](block-integrations/llm.md#claude-code) | Execute tasks using Claude Code in an E2B sandbox |
| [Code Generation](block-integrations/llm.md#code-generation) | Generate or refactor code using OpenAI's Codex (Responses API) |
| [Create Talking Avatar Video](block-integrations/llm.md#create-talking-avatar-video) | This block integrates with D-ID to create video clips and retrieve their URLs |
| [Exa Answer](block-integrations/exa/answers.md#exa-answer) | Get an LLM answer to a question informed by Exa search results |

View File

@@ -1,67 +0,0 @@
# Claude Code Execution
## What it is
The Claude Code block executes complex coding tasks using Anthropic's Claude Code AI assistant in a secure E2B sandbox environment.
## What it does
This block allows you to delegate coding tasks to Claude Code, which can autonomously create files, install packages, run commands, and build complete applications within a sandboxed environment. Claude Code can handle multi-step development tasks and maintain conversation context across multiple turns.
## How it works
When activated, the block:
1. Creates or connects to an E2B sandbox (a secure, isolated Linux environment)
2. Installs the latest version of Claude Code in the sandbox
3. Optionally runs setup commands to prepare the environment
4. Executes your prompt using Claude Code, which can:
- Create and edit files
- Install dependencies (npm, pip, etc.)
- Run terminal commands
- Build and test applications
5. Extracts all text files created/modified during execution
6. Returns the response and files, optionally keeping the sandbox alive for follow-up tasks
The block supports conversation continuation through three mechanisms:
- **Same sandbox continuation** (via `session_id` + `sandbox_id`): Resume on the same live sandbox
- **Fresh sandbox continuation** (via `conversation_history`): Restore context on a new sandbox if the previous one timed out
- **Dispose control** (`dispose_sandbox` flag): Keep sandbox alive for multi-turn conversations
## Inputs
| Input | Description |
|-------|-------------|
| E2B Credentials | API key for the E2B platform to create the sandbox. Get one at [e2b.dev](https://e2b.dev/docs) |
| Anthropic Credentials | API key for Anthropic to power Claude Code. Get one at [Anthropic's website](https://console.anthropic.com) |
| Prompt | The task or instruction for Claude Code to execute. Claude Code can create files, install packages, run commands, and perform complex coding tasks |
| Timeout | Sandbox timeout in seconds (default: 300). Set higher for complex tasks. Note: Only applies when creating a new sandbox |
| Setup Commands | Optional shell commands to run before executing Claude Code (e.g., installing dependencies) |
| Working Directory | Working directory for Claude Code to operate in (default: /home/user) |
| Session ID | Session ID to resume a previous conversation. Leave empty for new conversations |
| Sandbox ID | Sandbox ID to reconnect to an existing sandbox. Required when resuming a session |
| Conversation History | Previous conversation history to restore context on a fresh sandbox if the previous one timed out |
| Dispose Sandbox | Whether to dispose of the sandbox after execution (default: true). Set to false to continue conversations later |
## Outputs
| Output | Description |
|--------|-------------|
| Response | The output/response from Claude Code execution |
| Files | List of text files created/modified during execution. Each file includes path, relative_path, name, and content fields |
| Conversation History | Full conversation history including this turn. Use to restore context on a fresh sandbox |
| Session ID | Session ID for this conversation. Pass back with sandbox_id to continue the conversation |
| Sandbox ID | ID of the sandbox instance (null if disposed). Pass back with session_id to continue the conversation |
| Error | Error message if execution failed |
## Possible use case
**API Documentation to Full Application:**
A product team wants to quickly prototype applications based on API documentation. They create an agent that:
1. Uses Firecrawl to fetch API documentation from a URL
2. Passes the docs to Claude Code with a prompt like "Create a web app that demonstrates all the key features of this API"
3. Claude Code builds a complete application with HTML/CSS/JS frontend, proper error handling, and example API calls
4. The Files output is used with GitHub blocks to push the generated code to a new repository
The team can then iterate on the application by passing the sandbox_id and session_id back to Claude Code with refinement requests like "Add authentication" or "Improve the UI", and Claude Code will modify the existing files in the same sandbox.
**Multi-turn Development:**
A developer uses Claude Code to scaffold a new project:
- Turn 1: "Create a Python FastAPI project with user authentication" (dispose_sandbox=false)
- Turn 2: Uses the returned session_id + sandbox_id to ask "Add rate limiting middleware"
- Turn 3: Continues with "Add comprehensive tests"
Each turn builds on the previous work in the same sandbox environment.
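As a concrete illustration of the continuation mechanisms above, here is a minimal TypeScript sketch of the multi-turn pattern. `runClaudeCodeBlock` is a hypothetical stand-in for however the block is invoked in your setup; only the field names (`prompt`, `dispose_sandbox`, `session_id`, `sandbox_id`, `conversation_history`, `response`) come from the Inputs and Outputs tables:

```typescript
// Hypothetical wrapper around the Claude Code block; the field names mirror
// the block's documented inputs and outputs, but the function itself is a
// stand-in, not a real API.
interface ClaudeCodeResult {
  response: string;
  session_id: string;
  sandbox_id: string | null; // null when the sandbox was disposed
  conversation_history: string;
}

declare function runClaudeCodeBlock(input: {
  prompt: string;
  dispose_sandbox?: boolean;
  session_id?: string;
  sandbox_id?: string;
  conversation_history?: string;
}): Promise<ClaudeCodeResult>;

async function multiTurnExample(): Promise<void> {
  // Turn 1: keep the sandbox alive so later turns can reuse it
  const turn1 = await runClaudeCodeBlock({
    prompt: "Create a Python FastAPI project with user authentication",
    dispose_sandbox: false,
  });

  // Turn 2: same-sandbox continuation via session_id + sandbox_id
  const turn2 = await runClaudeCodeBlock({
    prompt: "Add rate limiting middleware",
    session_id: turn1.session_id,
    sandbox_id: turn1.sandbox_id ?? undefined,
    dispose_sandbox: false,
  });

  // If the sandbox has since timed out, fall back to fresh-sandbox
  // continuation by replaying the conversation history
  await runClaudeCodeBlock({
    prompt: "Add comprehensive tests",
    conversation_history: turn2.conversation_history,
  });
}
```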

View File

@@ -523,62 +523,6 @@ Summarizing lengthy research papers or articles to quickly grasp the main points
---
## Claude Code
### What it is
Execute tasks using Claude Code in an E2B sandbox. Claude Code can create files, install tools, run commands, and perform complex coding tasks autonomously.
### How it works
<!-- MANUAL: how_it_works -->
When activated, the block:
1. Creates or connects to an E2B sandbox (a secure, isolated Linux environment)
2. Installs the latest version of Claude Code in the sandbox
3. Optionally runs setup commands to prepare the environment
4. Executes your prompt using Claude Code, which can create/edit files, install dependencies, run terminal commands, and build applications
5. Extracts all text files created/modified during execution
6. Returns the response and files, optionally keeping the sandbox alive for follow-up tasks
The block supports conversation continuation through three mechanisms:
- **Same sandbox continuation** (via `session_id` + `sandbox_id`): Resume on the same live sandbox
- **Fresh sandbox continuation** (via `conversation_history`): Restore context on a new sandbox if the previous one timed out
- **Dispose control** (`dispose_sandbox` flag): Keep sandbox alive for multi-turn conversations
<!-- END MANUAL -->
### Inputs
| Input | Description | Type | Required |
|-------|-------------|------|----------|
| prompt | The task or instruction for Claude Code to execute. Claude Code can create files, install packages, run commands, and perform complex coding tasks. | str | No |
| timeout | Sandbox timeout in seconds. Claude Code tasks can take a while, so set this appropriately for your task complexity. Note: This only applies when creating a new sandbox. When reconnecting to an existing sandbox via sandbox_id, the original timeout is retained. | int | No |
| setup_commands | Optional shell commands to run before executing Claude Code. Useful for installing dependencies or setting up the environment. | List[str] | No |
| working_directory | Working directory for Claude Code to operate in. | str | No |
| session_id | Session ID to resume a previous conversation. Leave empty for a new conversation. Use the session_id from a previous run to continue that conversation. | str | No |
| sandbox_id | Sandbox ID to reconnect to an existing sandbox. Required when resuming a session (along with session_id). Use the sandbox_id from a previous run where dispose_sandbox was False. | str | No |
| conversation_history | Previous conversation history to continue from. Use this to restore context on a fresh sandbox if the previous one timed out. Pass the conversation_history output from a previous run. | str | No |
| dispose_sandbox | Whether to dispose of the sandbox immediately after execution. Set to False if you want to continue the conversation later (you'll need both sandbox_id and session_id from the output). | bool | No |
### Outputs
| Output | Description | Type |
|--------|-------------|------|
| error | Error message if execution failed | str |
| response | The output/response from Claude Code execution | str |
| files | List of text files created/modified by Claude Code during this execution. Each file has 'path', 'relative_path', 'name', and 'content' fields. | List[FileOutput] |
| conversation_history | Full conversation history including this turn. Pass this to conversation_history input to continue on a fresh sandbox if the previous sandbox timed out. | str |
| session_id | Session ID for this conversation. Pass this back along with sandbox_id to continue the conversation. | str |
| sandbox_id | ID of the sandbox instance. Pass this back along with session_id to continue the conversation. This is None if dispose_sandbox was True (sandbox was disposed). | str |
### Possible use case
<!-- MANUAL: use_case -->
**API Documentation to Full Application**: A product team wants to quickly prototype applications based on API documentation. They fetch API docs with Firecrawl, pass them to Claude Code with a prompt like "Create a web app that demonstrates all the key features of this API", and Claude Code builds a complete application with HTML/CSS/JS frontend, proper error handling, and example API calls. The Files output can then be pushed to GitHub.
**Multi-turn Development**: A developer uses Claude Code to scaffold a new project iteratively - Turn 1: "Create a Python FastAPI project with user authentication" (dispose_sandbox=false), Turn 2: Uses the returned session_id + sandbox_id to ask "Add rate limiting middleware", Turn 3: Continues with "Add comprehensive tests". Each turn builds on the previous work in the same sandbox environment.
**Automated Code Review and Fixes**: An agent receives code from a PR, sends it to Claude Code with "Review this code for bugs and security issues, then fix any problems you find", and Claude Code analyzes the code, makes fixes, and returns the corrected files ready to commit.
<!-- END MANUAL -->
---
## Code Generation
### What it is

[13 binary image files added (screenshots, 4.4–5.8 MiB each); file contents not shown]

View File

@@ -2,20 +2,34 @@
* [What is the AutoGPT Platform?](what-is-autogpt-platform.md)
## Getting Started
## Using the Platform
* [Setting Up Auto-GPT (Local Host)](getting-started.md)
* [Getting Started (Cloud)](getting-started-cloud.md)
* [AutoPilot](autopilot.md)
* [Agent Builder Guide](agent-builder-guide.md)
* [Agent Library](agent-library.md)
* [Marketplace](marketplace.md)
* [Scheduling & Triggers](scheduling-and-triggers.md)
* [Templates](templates.md)
* [Credits & Billing](credits-and-billing.md)
* [Integrations & Credentials](integrations-and-credentials.md)
* [Data Flow & Execution](data-flow-and-execution.md)
* [Sharing & Exporting Agents](sharing-and-exporting.md)
## Self-Hosting
* [Setting Up AutoGPT (Self-Host)](getting-started.md)
* [AutoGPT Platform Installer](installer.md)
* [Advanced Setup](advanced_setup.md)
## Tutorials
* [Create a Basic Agent](create-basic-agent.md)
* [Edit an Agent](edit-agent.md)
* [Delete an Agent](delete-agent.md)
* [Download & Import an Agent](download-agent-from-marketplace-local.md)
* [Create a Basic Agent](create-basic-agent.md)
* [Submit an Agent to the Marketplace](submit-agent-to-marketplace.md)
## Advanced Setup
* [Advanced Setup](advanced_setup.md)
## Building Blocks
* [Agent Blocks Overview](agent-blocks.md)
@@ -32,3 +46,13 @@
* [API Introduction](integrating/api-guide.md)
* [OAuth & SSO](integrating/oauth-guide.md)
## Changelog
* [What's New](changelog/README.md)
* [Import workflows from other platforms and enjoy a polished marketplace](changelog/march-20-march-25-2026.md)
* [Refreshed marketplace, usage monitor, and personalized prompts](changelog/march-13-march-20-2026.md)
* [Organize your agents, get notified, and enjoy a cleaner chat](changelog/march-5-march-12-2026.md)
* [Connect any app, share files, and let AutoPilot browse for you](changelog/february-26-march-4-2026.md)
* [Telegram bots, agent folders, and a rebuilt builder](changelog/february-11-february-26-2026.md)
* [Voice input, persistent files, and a smarter AI](changelog/january-29-february-11-2026.md)

View File

@@ -0,0 +1,105 @@
# Agent Builder Guide
## Overview
The Agent Builder is the visual editor where you design and build agents by connecting blocks together on a canvas. Each block is a single action — like generating text, calling an API, or processing data — and you wire them together to create automated workflows.
**URL:** [platform.agpt.co/build](https://platform.agpt.co/build)
## The Builder Interface
When you open the builder, you'll see:
- **Canvas**: The main workspace where you place and connect blocks
- **Blocks Menu**: A panel on the left-hand side where you browse and search for blocks
- **Save Button**: Save your agent with a title and description
## Working with Blocks
### Types of Blocks
Blocks fall into three categories:
| Type | Description |
|------|-------------|
| **Input Blocks** | Define what information the agent needs when it runs. These become the input fields users fill in when starting a task. Types include text inputs, file inputs, and more. |
| **Action Blocks** | Perform operations — AI text generation, image creation, API calls, data processing, and hundreds of integrations with external platforms. |
| **Output Blocks** | Define what the agent returns as its result. These become the visible output when a task completes. |
Input and output blocks define the **schema** of your agent — they determine what users see when running the agent. All other blocks inside the agent are internal and not exposed to the user at runtime.
{% hint style="info" %}
There is also a special type of input block: **Trigger Blocks**. These allow your agent to be activated by external events via webhooks rather than manual input. See [Scheduling & Triggers](scheduling-and-triggers.md) for details.
{% endhint %}
### Adding Blocks
1. Open the **Blocks menu** on the left-hand side of the builder
2. Browse categories or use the search bar to find a specific block
3. Click on a block to add it to the canvas
There are hundreds of blocks available, integrating with many platforms and services.
### Connecting Blocks
Blocks have **input pins** and **output pins**. Pins are typed — they handle specific data types like text, numbers, files, and more.
To connect blocks:
1. Click on an **output pin** of one block
2. Drag to an **input pin** of another block (or simply click the output pin, then click the input pin)
3. A **connection line** will appear between the two pins
When the agent runs, data flows along these connections. This is visually represented by a **coloured bead** that slides along the connection line from the output pin to the input pin.
### Configuring Blocks
Many blocks have settings you can configure directly on the block. For example:
- The **AI Text Generator** block lets you choose which language model to use
- **Integration blocks** may require credentials (see [Integrations & Credentials](integrations-and-credentials.md))
- Some blocks allow you to hardcode values on their input pins instead of connecting them to other blocks
If a block requires a credential you haven't connected yet, a **credential bar** will appear prompting you to add it (via OAuth, API key, or username/password depending on the service).
## Saving Your Agent
To save your agent:
- Press **Ctrl+S**, or
- Click the **Save button** in the builder
When saving, you can provide a **title** and **description** for the agent. This is a local save to your personal library. For publishing to the public marketplace, see [Publishing to the Marketplace](marketplace.md#publishing-an-agent).
{% hint style="info" %}
There is currently no draft vs. saved state — saving an agent immediately updates it in your library.
{% endhint %}
## Navigating Back to Your Library
After saving, click the **Agents** button in the navigation bar to return to your library and find your agent.
## Editing an Existing Agent
To open an existing agent in the builder:
1. Go to your **Agent Library** (click **Agents** in the nav bar)
2. Click on the agent you want to edit
3. Click the **three dots** menu (⋯) on the far right-hand side of the screen
4. Select **Edit Agent**
This will open the agent in the builder with all its existing blocks and connections.
## Error Handling
When a block fails during execution, it produces data on its **error pin**. As the agent creator, you decide how to handle errors:
- **Surface the error**: Connect the error pin to an output block so the error is returned as the agent's result
- **Handle gracefully**: Connect the error pin to other blocks that provide fallback behaviour, ensuring the agent continues working even when individual blocks encounter problems
## Tips
- **Name your input and output blocks clearly** — these names become the labels users see when running your agent
- **Test incrementally** — save and run your agent frequently as you build to catch issues early
- **Use the search** in the blocks menu to quickly find what you need among hundreds of available blocks
- **Check the pin types** — connections only work between compatible pin types

View File

@@ -0,0 +1,93 @@
# Agent Library
## Overview
The Agent Library is your personal collection of agents. From here you can run agents, view task history, set up schedules and triggers, edit agents, and manage your collection.
**URL:** [platform.agpt.co/library](https://platform.agpt.co/library)
**Access:** Click the **Agents** button in the navigation bar.
## Library View
Your library displays all of your agents, including agents you've built yourself and agents you've added from the marketplace. Each agent shows its name and key information at a glance.
### Favouriting Agents
To keep important agents easy to find, click the **heart icon** on any agent to favourite it. Favourited agents are pinned to the top of your library.
{% hint style="info" %}
There is currently no folder system for organising agents. Use favourites to pin your most-used agents to the top.
{% endhint %}
## Agent Detail View
Click on any agent to open its detail view. This screen provides:
- **Tasks**: A full history of every time this agent has been executed
- **Scheduled**: A list of active schedules for this agent
- **Templates**: Saved input configurations for quick re-runs
These are accessible via tabs on the left-hand side of the agent screen (e.g., `Tasks 286 | Scheduled 1 | Templates 0`).
### Agent Actions Menu
Click the **three dots** (⋯) on the far right-hand side of the agent screen to access:
| Action | Description |
|--------|-------------|
| **Edit Agent** | Opens the agent in the builder for editing |
| **Delete Agent** | Permanently removes the agent from your library |
| **Export Agent to File** | Downloads the agent as a file you can share with others |
## Running an Agent
To run an agent manually:
1. Open the agent from your library
2. Click **New Task**
3. Fill in the required input fields
4. Click **Start Task**
The task will begin executing immediately. You can watch the progress and view the results once it completes.
{% hint style="warning" %}
If the agent uses **trigger blocks** instead of standard input blocks, the **New Task** button is replaced with **New Trigger**. See [Scheduling & Triggers](scheduling-and-triggers.md) for details.
{% endhint %}
## Viewing Task Results
Every time an agent runs, it creates a **task**. To view task results:
1. Open the agent from your library
2. In the left-hand pane, browse the list of completed tasks
3. Click on a task to view its details
A task detail view shows:
- **Inputs**: The values that were provided when the task started
- **Outputs**: The results the agent produced
- **Cost**: The total credit cost for this task execution, displayed at the top of the task
You can also **share a task** by copying its URL, which allows others to view the task output directly.
## Uploading an Agent
To import an agent from a file:
1. Go to your Agent Library
2. Click **Upload Agent** at the top
3. Select the agent file from your computer
The agent will be added to your library and can be run or edited like any other agent. This is useful for importing agents shared by other users outside of the marketplace.
## Deleting an Agent
1. Open the agent from your library
2. Click the **three dots** (⋯) on the far right
3. Select **Delete Agent**
4. Confirm the deletion when prompted
{% hint style="danger" %}
Deleting an agent is permanent and cannot be undone. Make sure you want to remove it before confirming.
{% endhint %}

View File

@@ -0,0 +1,62 @@
# AutoPilot
## Overview
AutoPilot is your AI assistant built directly into the AutoGPT Platform. It can perform virtually any action on the platform through natural conversation — from running agents to generating images, conducting research, and even building entire agents for you.
## Accessing AutoPilot
AutoPilot is always available by clicking the **Home** button in the top-left of the navigation bar, or by navigating directly to [platform.agpt.co](https://platform.agpt.co).
## What AutoPilot Can Do
AutoPilot has access to the full power of the platform. Here's what it can help you with:
### Run Agents
Ask AutoPilot to run any agent in your library. It will handle filling in the inputs and executing the task for you.
### Build Agents
Describe the workflow you want and AutoPilot will create an agent for you in the builder. You can also ask it to edit your existing agents or modify agents you've added from the marketplace.
### Browse the Marketplace
Ask AutoPilot to find agents for a specific use case and it will search the marketplace for you.
### Execute Blocks Directly
AutoPilot can run individual blocks without building a full agent, giving it direct access to around **400 tools and counting**. This means you can:
- **Conduct research** with Perplexity
- **Generate images** with the latest image models
- **Edit pictures** using AI image editing blocks
- **Generate videos** using video generation blocks
- **Run any model** on inference services like Replicate
- **Make custom HTTP requests** to any API
- **Write and execute code**, including delegating coding tasks to Claude Code
### Manage Your Library
AutoPilot can help you manage your agent library, view task results, set up schedules, and more.
## Tips for Using AutoPilot
- **Be specific**: The more detail you provide, the better AutoPilot can assist you. Instead of "make me an agent", try "build me an agent that takes a blog topic as input, generates an outline with Claude, then writes the full article".
- **Iterate**: You can refine results by asking follow-up questions or requesting changes.
- **Explore capabilities**: If you're unsure whether AutoPilot can do something, just ask — it has access to a vast number of tools through the platform's block system.
## AutoPilot vs. the Agent Builder
AutoPilot and the Agent Builder are two ways to achieve the same things on the platform. AutoPilot can do anything the builder can — including creating and editing full, reusable agents — and it can also run blocks directly without building an agent first.
| | AutoPilot | Agent Builder |
|---|-----------|---------------|
| **Who it's for** | Everyone — no technical knowledge required | Technical users with a grasp of visual programming |
| **Best for** | Natural language interaction, doing things fast, and accessing blocks directly | Hands-on visual control over exactly how an agent is wired together |
| **How it works** | Conversational — describe what you want | Visual — drag, drop, and connect blocks on a canvas |
| **Can build agents** | Yes — describe what you want and it builds the agent for you | Yes — you build the agent manually |
| **Can edit agents** | Yes — including agents from the marketplace | Yes — full visual editing |
| **Block access** | Can run any block directly without building an agent | Blocks must be connected into an agent workflow |
Choose whichever approach suits you. Many users use both — AutoPilot for speed, and the builder when they want fine-grained visual control over their workflow.

View File

@@ -0,0 +1,18 @@
# Changelog
What's new in AutoPilot and the AutoGPT Platform. We ship updates regularly and document everything here.
{% hint style="info" %}
AutoPilot was previously called CoPilot. References to CoPilot in older entries refer to the same product.
{% endhint %}
## 2026
| Date | Highlights |
| ---- | ---------- |
| [March 20 – March 25](march-20-march-25-2026.md) | Import workflows from other tools, marketplace UI polish, dry-run mode, parallel AutoPilot actions |
| [March 13 – March 20](march-13-march-20-2026.md) | Refreshed marketplace, usage monitor, feedback button, personalized prompts |
| [March 5 – March 12](march-5-march-12-2026.md) | Folders, notifications, cleaner reasoning, inline outputs |
| [February 26 – March 4](february-26-march-4-2026.md) | Connect any app, share files, browse the web, run code, text-to-speech |
| [February 11 – 26](february-11-february-26-2026.md) | Telegram bots, agent folders, rebuilt flow editor, major AutoPilot upgrade |
| [January 29 – February 11](january-29-february-11-2026.md) | Voice input, persistent file workspace, Claude Opus 4.6, marketplace agent customization, rebuilt chat streaming |

View File

@@ -0,0 +1,117 @@
# Telegram bots, agent folders, and a rebuilt builder
*February 11 – 26, 2026*
***
## AutoPilot got a major brain upgrade
AutoPilot is now powered by a completely new AI engine. It can hold real multi-step conversations, use tools behind the scenes, and handle complex requests that would have confused it before. When you need multiple things done at once, it runs them in parallel instead of one at a time — so everything finishes faster. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12103)
Ask it to run one of your agents and it will actually wait for the result and tell you what happened — no more "your agent started, go check later." [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12147)
{% hint style="info" %}
If your connection drops while AutoPilot is working on something, it keeps going in the background and picks up right where you left off when you reconnect. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12057)
{% endhint %}
## Automate your Telegram bots
Thirteen new Telegram blocks let you build agents that respond to messages, photos, voice notes, videos, and reactions. Set up triggers for incoming content, then reply with text, images, audio, documents, or video. Connect your bot token from [@BotFather](https://t.me/BotFather) and you're ready to go. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12141)
## The builder has been rebuilt
The old agent builder is gone. Everyone now uses the new flow editor — a cleaner, faster interface with improved node rendering and no more toggling between views. If you've been using the legacy builder, the experience should feel familiar but noticeably more polished. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12081)
## Organize agents into folders
Your agent library now supports folders. Create them with custom names, emoji icons, and colors, then drag and drop agents into them or right-click to move. Folders can be nested up to five levels deep, and a breadcrumb bar helps you navigate. Search still works across everything regardless of which folder you're in. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12101)
## Your files follow you everywhere
Files created during agent runs — whether from code execution, image generation, or document creation — now persist in your workspace automatically. They're accessible across conversations and sessions, and images render inline in chat. AutoPilot also retains full context about files and tools it used earlier in the conversation, so it won't "forget" what just happened. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12073) [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12164)
## Tell us what to build next
You can now submit feature requests directly through AutoPilot chat. Describe what you want — it checks for existing requests first and either creates a new one or adds your vote to a matching request so the team can see demand. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12102)
## Vague goals get refined, not rejected
If you ask AutoPilot to create an agent with a vague goal like "monitor social media," it now suggests a clearer, more actionable version instead of returning an error. Accept the suggestion with one click or refine it further. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12139)
## The stop button actually stops things
Clicking Stop now sends a real cancellation signal through the entire system. Previously it didn't always halt what was running behind the scenes — now it does. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12171)
***
<details>
<summary>Improvements</summary>
* [Delete chat sessions](https://github.com/Significant-Gravitas/AutoGPT/pull/12112) — clean up old conversations with a trash icon and confirmation dialog
* [Five new list blocks](https://github.com/Significant-Gravitas/AutoGPT/pull/12105) — Flatten, Interleave, Zip, Difference, and Intersection for agents that work with lists
* [Web search enabled for agents](https://github.com/Significant-Gravitas/AutoGPT/pull/12108) — agents can now search the web as part of their workflows
* [PDF generation support](https://github.com/Significant-Gravitas/AutoGPT/pull/12216) — agents can create and manipulate PDF documents
* [Faster, smarter block search](https://github.com/Significant-Gravitas/AutoGPT/pull/11806) — the builder search now combines keyword and semantic matching so you find what you need even without the exact name
* [Parallel tool execution](https://github.com/Significant-Gravitas/AutoGPT/pull/12165) — when an agent needs multiple tools at once, they run simultaneously instead of one at a time
* [Signup answers carry forward](https://github.com/Significant-Gravitas/AutoGPT/pull/12119) — information from your signup form pre-populates AutoPilot setup so you don't repeat yourself
* [Improved agent creation UX](https://github.com/Significant-Gravitas/AutoGPT/pull/12117) — cleaner create and edit experience with better progress feedback
* [Create agents via API](https://github.com/Significant-Gravitas/AutoGPT/pull/12208) — developers can now programmatically create agents through the external API
* [Exact timestamps on hover](https://github.com/Significant-Gravitas/AutoGPT/pull/12087) — hover over "2 hours ago" to see the full date and time
* [Credentials and inputs always visible](https://github.com/Significant-Gravitas/AutoGPT/pull/12194) — required setup steps and login prompts no longer hide inside collapsible sections
* [Clarification and save cards always visible](https://github.com/Significant-Gravitas/AutoGPT/pull/12204) — questions from AutoPilot and save confirmations are shown upfront instead of inside accordions
* [Task lists expanded by default](https://github.com/Significant-Gravitas/AutoGPT/pull/12168) — AutoPilot's to-do and task lists start open so you see everything immediately
* [Cleaner builder nodes](https://github.com/Significant-Gravitas/AutoGPT/pull/12152) — the "Advanced" switch on nodes replaced with a simpler chevron toggle
* [Easier connection deletion](https://github.com/Significant-Gravitas/AutoGPT/pull/12083) — the delete button appears when hovering anywhere on the connection line, not just a tiny target
* [Workspace files render in markdown](https://github.com/Significant-Gravitas/AutoGPT/pull/12166) — images and links from your workspace display correctly in chat messages
* [Better password reset experience](https://github.com/Significant-Gravitas/AutoGPT/pull/12123) — expired or already-used reset links now explain what happened and how to get a new one
* [Snake minigame while you wait](https://github.com/Significant-Gravitas/AutoGPT/pull/12160) — play a quick game of Snake during longer operations
* [Cleaner chat interface](https://github.com/Significant-Gravitas/AutoGPT/pull/12094) — improved spacing and styling in AutoPilot conversations
</details>
<details>
<summary>Fixes</summary>
* [Streaming completely overhauled](https://github.com/Significant-Gravitas/AutoGPT/pull/12173) — tool outputs no longer get lost, and refreshing the page preserves your conversation
* [Sessions no longer get stuck](https://github.com/Significant-Gravitas/AutoGPT/pull/12191) — fixed cases where the AI would stop responding with no way to recover
* [Long-running tasks no longer time out](https://github.com/Significant-Gravitas/AutoGPT/pull/12175) — complex tasks can run as long as needed instead of being killed after five minutes
* [Background agents no longer stall or hallucinate](https://github.com/Significant-Gravitas/AutoGPT/pull/12167) — the AI won't claim to have completed something it didn't actually do
* [Error messages are now actually helpful](https://github.com/Significant-Gravitas/AutoGPT/pull/12205) — real error details instead of generic failures, and switching between chats resumes properly
* [Reconnection preserves your messages](https://github.com/Significant-Gravitas/AutoGPT/pull/12159) — dropping and reconnecting no longer clears your chat history
* [API errors stay out of the chat](https://github.com/Significant-Gravitas/AutoGPT/pull/12063) — backend errors no longer appear as garbled text in your conversation
* [Agent creation follow-ups work](https://github.com/Significant-Gravitas/AutoGPT/pull/12062) — asking follow-up questions after creating an agent no longer causes errors
* [API key expiration default removed](https://github.com/Significant-Gravitas/AutoGPT/pull/12092) — API keys no longer auto-expire the next day; you choose whether to set an expiration
* [Text selection works in the builder](https://github.com/Significant-Gravitas/AutoGPT/pull/11955) — selecting text in input fields no longer accidentally drags the entire node
* [Website extraction errors handled](https://github.com/Significant-Gravitas/AutoGPT/pull/12048) — clear error messages when a website can't be fetched instead of silent failures
* [Content no longer overflows cards](https://github.com/Significant-Gravitas/AutoGPT/pull/12060) — block output text stays within its container instead of getting cut off
* [Workspace file listing improved](https://github.com/Significant-Gravitas/AutoGPT/pull/12190) — files display with proper names, sizes, and types instead of raw data
* ["Show all my agents" works](https://github.com/Significant-Gravitas/AutoGPT/pull/12138) — asking AutoPilot to list all your agents now actually lists them instead of returning nothing
* [External link button visible](https://github.com/Significant-Gravitas/AutoGPT/pull/12209) — the "Open link" button in the safety modal is now properly styled and clickable
* [No more crashes on logout](https://github.com/Significant-Gravitas/AutoGPT/pull/12202) — the AutoPilot page no longer breaks if you log out while it's open
* [Code highlighting performance](https://github.com/Significant-Gravitas/AutoGPT/pull/12144) — code blocks load faster and use less memory
* [Agent generation UI refreshes](https://github.com/Significant-Gravitas/AutoGPT/pull/12070) — the interface now updates when agent creation finishes instead of appearing stuck
* [Concurrent saves no longer conflict](https://github.com/Significant-Gravitas/AutoGPT/pull/12177) — sending multiple rapid messages no longer causes database errors
* [Workspace file downloads work](https://github.com/Significant-Gravitas/AutoGPT/pull/12215) — files created by the AI now include proper download links
* [Chat message spacing fixed](https://github.com/Significant-Gravitas/AutoGPT/pull/12091) — messages no longer have awkward extra whitespace
</details>
<details>
<summary>Under the hood</summary>
* [Message queue infrastructure upgraded](https://github.com/Significant-Gravitas/AutoGPT/pull/12118) — moved from an end-of-life version to a current, supported release for better reliability
* [Production dependencies updated](https://github.com/Significant-Gravitas/AutoGPT/pull/12056) — security patches and stability improvements across the stack
* [Security scanning updated](https://github.com/Significant-Gravitas/AutoGPT/pull/12033) — upgraded the tools that automatically check for vulnerabilities
* [Block search responses optimized](https://github.com/Significant-Gravitas/AutoGPT/pull/12020) — reduced data sent during block lookups so searches feel snappier
* [Legacy builder code removed](https://github.com/Significant-Gravitas/AutoGPT/pull/12082) — with the new flow editor live, the old code has been fully cleaned out
* [Legacy agent views removed](https://github.com/Significant-Gravitas/AutoGPT/pull/12088) — retired old UI components for a lighter, faster app
* [Internal module structure improved](https://github.com/Significant-Gravitas/AutoGPT/pull/12068) — reduced circular dependencies for more reliable startup
* [UI components modernized](https://github.com/Significant-Gravitas/AutoGPT/pull/12136) — replaced legacy dialog components with the current design system
* [Error logging improved](https://github.com/Significant-Gravitas/AutoGPT/pull/11942) — more detailed API error tracking so issues get found and resolved faster
* [HumanInTheLoop block docs clarified](https://github.com/Significant-Gravitas/AutoGPT/pull/12069) — clearer output descriptions for the agent builder
* [Podman compatibility note added](https://github.com/Significant-Gravitas/AutoGPT/pull/12120) — documented a known limitation for users running Podman instead of Docker
</details>


@@ -0,0 +1,74 @@
# Connect any app, share files, and let AutoPilot browse for you
*February 26 – March 4, 2026*
***
## Connect to any app — instantly
That integration you've been waiting for? You don't need to wait anymore. AutoPilot now supports MCP (Model Context Protocol) — an open standard backed by hundreds of ready-made connectors for apps like Notion, Slack, Jira, Stripe, Postgres, and more. **Just tell AutoPilot what you want to connect to, and it finds and sets it up automatically** — nothing to install, nothing to configure. Search your Notion workspace, query a database, create a Jira ticket — all from the chat. If the service requires a login, you'll get the same familiar sign-in prompt you already know. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12213)
<figure><img src="../.gitbook/assets/mcp-notion-hero.png" alt="AutoPilot connecting to Notion via MCP and searching a workspace in real time"><figcaption><p>AutoPilot connecting to Notion and pulling release highlights — no integration setup required</p></figcaption></figure>
## Upload files to AutoPilot
You can now attach files directly in the AutoPilot chat — documents, images, spreadsheets, audio, and video. Hit the **+** button or drag and drop files into the chat. Once sent, your attachments display inline in the conversation and the AI can read and reference them. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12220)
<figure><img src="../.gitbook/assets/file-upload-hero.png" alt="Dragging a file into the AutoPilot chat"><figcaption><p>Drop files directly into the chat and AutoPilot picks them up instantly</p></figcaption></figure>
## AutoPilot can run code and create files for you
You can now drop a messy spreadsheet into AutoPilot and ask it to clean the data, find trends, and generate a polished chart. It can crunch numbers, build scripts, create documents, and produce downloadable files, all inside the conversation. Each step builds on the last, so you can go from raw data to finished deliverable just by asking. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12212)
<figure><img src="../.gitbook/assets/code-execution-hero.png" alt="AutoPilot analyzing business data and generating a sales performance dashboard"><figcaption><p>AutoPilot analyzing 14 months of business data and generating a dashboard — all from a single request</p></figcaption></figure>
## AutoPilot can browse the web for you
Two new browsing capabilities let AutoPilot interact with websites on your behalf. For quick lookups, it can fetch and extract content from any page in one shot. For multi-step tasks — like logging into a site, navigating through menus, and pulling data — it drives a full browser session that persists across steps within your conversation. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12230)
<figure><img src="../.gitbook/assets/web-browsing-hero.png" alt="AutoPilot browsing a website autonomously in a live browser session"><figcaption><p>AutoPilot navigating a website in a live browser session</p></figcaption></figure>
## Listen to responses or copy them instantly
Multitasking? Tap the speaker icon on any message to have AutoPilot read it aloud while you do something else. Need to save a response? Hit the copy button to grab it in one click — ready to paste into an email, doc, or chat. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12256)
## See when the AI is summarizing
When a conversation gets long enough that AutoPilot needs to summarize earlier messages to stay within its memory limits, you'll now see a clear indicator — a spinner with "Summarizing earlier messages…" — instead of it happening silently in the background. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12250)
## Redesigned chat input
The chat input bar has been rebuilt with a cleaner layout — a spacious text area that grows smoothly as you type, with tool buttons and the send button in a tidy footer row. No more jarring height jumps when composing longer messages. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12207)
***
<details>
<summary>Improvements</summary>
* [More reliable streaming](https://github.com/Significant-Gravitas/AutoGPT/pull/12254) — AutoPilot now connects directly to the backend for real-time updates, avoiding an intermediate proxy that could cause timeouts on long-running tasks
* [Faster stall recovery](https://github.com/Significant-Gravitas/AutoGPT/pull/12254) — if the connection goes quiet, AutoPilot detects it in 10 seconds instead of 30 and reconnects automatically
* [Stop button works instantly](https://github.com/Significant-Gravitas/AutoGPT/pull/12254) — clicking Stop now immediately unlocks the input so you can keep working
</details>
<details>
<summary>Fixes</summary>
* [Node output no longer overflows](https://github.com/Significant-Gravitas/AutoGPT/pull/12222) — long text and JSON in the agent builder stays within its container, with output items capped for readability
* [Workspace file saves no longer conflict](https://github.com/Significant-Gravitas/AutoGPT/pull/12267) — when two operations write to the same file at the same time, you get a clear message instead of a raw error
* [Login no longer shows a blank page](https://github.com/Significant-Gravitas/AutoGPT/pull/12285) — signing in with email and password now loads the app immediately instead of requiring a manual refresh
</details>
<details>
<summary>Under the hood</summary>
* [Observability for the AI engine](https://github.com/Significant-Gravitas/AutoGPT/pull/12228) — every AutoPilot conversation turn is now traced with token counts, costs, and timing for faster debugging
* [Broadcast tracing enabled](https://github.com/Significant-Gravitas/AutoGPT/pull/12277) — monitoring coverage extended to the AI model routing layer
* [Legacy code removed](https://github.com/Significant-Gravitas/AutoGPT/pull/12276) — ~1,200 lines of old, unmaintained chat code cleaned out and replaced with a streamlined fallback path
* [Auto-synced developer tooling](https://github.com/Significant-Gravitas/AutoGPT/pull/12211) — switching branches now automatically keeps types and dependencies in sync
</details>


@@ -0,0 +1,95 @@
# Voice input, persistent files, and a smarter AI
*January 29 – February 11, 2026*
***
## Talk to AutoPilot with your voice
A mic button has been added to the chat input. Press it, speak your request, and your voice is transcribed instantly. The text box auto-focuses so you can hit Enter to send or tap the mic again to keep talking. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11871)
## Your files stick around
Files generated during conversations — images, documents, anything — are now saved to your personal workspace instead of disappearing when the session ends. You can browse and manage them directly through chat, and images show up inline in the conversation. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11867)
{% hint style="success" %}
This also unblocked **20+ blocks** (file storage, email attachments, screenshots, image generation, and more) that weren't working in chat.
{% endhint %}
## Upgraded to Claude Opus 4.6
AutoPilot now runs on Anthropic's newest and most capable model, with a larger context window and doubled output capacity. Extended thinking is enabled, so the AI reasons internally before responding — you get cleaner, more focused answers instead of raw chain-of-thought in the chat. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11983)
{% hint style="info" %}
Claude 3.7 Sonnet has been retired. Agents using it were auto-migrated to Claude 4.5 Sonnet.
{% endhint %}
## Customize marketplace agents
Ask AutoPilot to modify any marketplace agent before adding it to your library. "Customize that newsletter writer to post to Discord instead." AutoPilot adapts the workflow, asks follow-up questions if needed, and saves the customized version for you. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11943)
## Rebuilt chat streaming
The entire chat system has been rebuilt on the Vercel AI SDK. More reliable message delivery, better markdown formatting, tool outputs shown in clean collapsible panels, and errors surfaced as brief notifications instead of breaking the chat. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11901)
## Stay connected during long tasks
If your connection drops while AutoPilot is working (network hiccup, laptop sleep, switching tabs), it keeps going in the background, saves progress, and replays what you missed when you reconnect. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11877)
## New agents reuse your existing ones
When building a new agent, AutoPilot searches your library for agents that can be incorporated as building blocks — so you don't rebuild from scratch. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11889)
## Better agent creation
Progress bar during generation so you're not staring at a blank screen. A "Your agent is ready!" prompt with buttons to test immediately or provide your own inputs. Cleaner formatting when AutoPilot asks you clarifying questions. And helpful error messages instead of generic failures. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11974)
## Updated homepage text
The homepage now says "Tell me about your work — I'll find what to automate" instead of assuming you already know what to build. The quick-start buttons have been rewritten too: [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/11956)
{% tabs %}
{% tab title="After" %}
* "I don't know where to start, just ask me stuff"
* "I do the same thing every week and it's killing me"
* "Help me find where I'm wasting my time"
{% endtab %}
{% tab title="Before" %}
* ~~"Show me what I can automate"~~
* ~~"Design a custom workflow"~~
* ~~"Help me with content creation"~~
{% endtab %}
{% endtabs %}
***
<details>
<summary>Improvements</summary>
* [Editing agents updates the original](https://github.com/Significant-Gravitas/AutoGPT/pull/11981) instead of creating a duplicate in your library
* ["Tasks" tab renamed to "Agents"](https://github.com/Significant-Gravitas/AutoGPT/pull/11982) to better describe what's there
* [Wallet stays closed](https://github.com/Significant-Gravitas/AutoGPT/pull/11961) — no longer pops open automatically for new users or on balance changes
* [Search suggests next steps](https://github.com/Significant-Gravitas/AutoGPT/pull/11976), like offering to create a custom agent from your query
* [Smarter block filtering](https://github.com/Significant-Gravitas/AutoGPT/pull/11892) — chat no longer shows blocks that only work in the visual builder
* [Linear search block upgraded](https://github.com/Significant-Gravitas/AutoGPT/pull/11967) — now returns status, assignee, and project info with team filtering
* [New Text Encoder block](https://github.com/Significant-Gravitas/AutoGPT/pull/11857) for escaping special characters in JSON payloads and config files
</details>
<details>
<summary>Fixes</summary>
* [Improved credential matching](https://github.com/Significant-Gravitas/AutoGPT/pull/11908) so agents reliably use the correct API keys and permissions for each provider
* [Better input validation](https://github.com/Significant-Gravitas/AutoGPT/pull/11916) — agent inputs are now validated upfront with clear feedback on available fields
* [Marketplace agents work as sub-agents](https://github.com/Significant-Gravitas/AutoGPT/pull/11920) — referencing marketplace templates in new builds no longer fails
* [YouTube transcription fixed](https://github.com/Significant-Gravitas/AutoGPT/pull/11980) — cleanly reports either success or failure, not both
* [Long conversations stay responsive](https://github.com/Significant-Gravitas/AutoGPT/pull/11937) — improved context management for longer chat sessions
* [Agent list loads faster](https://github.com/Significant-Gravitas/AutoGPT/pull/12053) — optimized the endpoint that loads your agents
* [Login redirects fixed](https://github.com/Significant-Gravitas/AutoGPT/pull/11894) — resolved an issue where hard refresh could briefly show the wrong page
</details>


@@ -0,0 +1,77 @@
# Explore a new marketplace, track usage, and give instant feedback
*March 13 – March 20, 2026*
**Platform version:** `v0.6.52`
This week's release brings a **refreshed marketplace**, a **usage monitor** so you always know where your credits stand, and a **new feedback button**.
***
## A brand-new marketplace experience
The marketplace has been **completely redesigned** with a cleaner, more professional look. Agent cards now feature subtle borders and refined spacing, the search bar matches our design system, and everything feels polished at every screen size. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12462)
<figure><img src="../.gitbook/assets/marketplace-hero.png" alt="The redesigned AutoGPT marketplace with polished agent cards and clean layout"><figcaption><p>The redesigned marketplace — cleaner cards, better search, and a professional new look</p></figcaption></figure>
***
## Track your usage at a glance
A new **usage monitor** gives you full visibility into your credit balance and token usage. You'll see daily and weekly progress bars, reset times, and a clear breakdown of how your credits are being spent — all from a simple popover in the interface. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12385)
<figure><img src="../.gitbook/assets/usage-monitor-hero.png" alt="The new usage monitor showing credit balance and token usage"><figcaption><p>Keep tabs on your credit usage with daily and weekly breakdowns</p></figcaption></figure>
***
## Give feedback from anywhere
The **feedback button** has moved to the header, making it accessible from any page. Share your thoughts, report issues, or request features without having to navigate away from what you're doing. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12462)
<figure><img src="../.gitbook/assets/feedback-button-hero.png" alt="The new feedback button in the header"><figcaption><p>The feedback button is now always within reach in the header</p></figcaption></figure>
***
<details>
<summary>Improvements</summary>
* **Shareable prompt links** — share an AutoPilot prompt via URL so anyone can try it out with a single click — a great way to share use cases and ideas [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12406)
* **"Run now" button on schedules** — manually trigger any scheduled agent run with one click [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12388)
* **"Jump Back In" on Library** — quickly resume your most recent agent right from the Library page [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12387)
* **Rich media previews in Builder** — see images, files, and other media directly in node outputs and file inputs [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12432)
* **Graph search in Builder** — find any node or block instantly in your agent graphs [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12395)
* **Themed prompt categories** — suggestion pills are now organized into themed categories for easier discovery [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12452)
* **Nano Banana 2** — the latest image generation model is now available in the image generator, customizer, and editor blocks [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12218)
* **AgentMail integration** — new blocks for managing email workflows through AgentMail [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12417)
</details>
<details>
<summary>Fixes</summary>
* Fixed sub-folders not showing when navigating inside a folder [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12316)
* Fixed image delete button on the Edit Agent form [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12362)
* Improved handling of transient API connection errors [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12445)
* Fixed tool-result file reads failing across turns [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12399)
* Fixed long filenames being truncated incorrectly [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12025)
* Allowed falsy values in list building blocks [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12028)
* Renamed "CoPilot" to "AutoPilot" on the credits page [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12481)
* Constrained markdown heading sizes in chat messages [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12463)
</details>
<details>
<summary>Under the hood</summary>
* Read-only SQL views layer with analytics schema [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12367)
* AutoPilot block for running AutoPilot from inside agents or as a sub-agent within AutoPilot itself [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12439)
* GitHub CLI support with automatic token injection [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12426)
* Improved end-to-end CI with reduced costs [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12437)
* New integration tests for Builder stores, components, and hooks [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12433)
* Builder end-to-end tests for the new Flow Editor [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12436)
* Collapsed navbar text to icons below 1280px [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12484)
</details>


@@ -0,0 +1,68 @@
# Import workflows from other platforms and enjoy a polished marketplace
*March 20 – March 25, 2026*
**Platform version:** `v0.6.53`
This release makes it easier than ever to **bring your existing automations into AutoGPT**. You can now import workflows straight from n8n, Make.com, and Zapier, and **test your agents before they go live** with a new dry-run mode. The marketplace also gets another round of visual polish.
***
## Import workflows from n8n, Make.com & Zapier
Switching to AutoGPT no longer means starting from scratch. A new **workflow import** feature lets you bring in automations you've already built on other platforms and convert them into AutoGPT agents. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12440)
<figure><img src="../.gitbook/assets/import-workflows-hero.png" alt="Import workflows from other platforms"><figcaption><p>Import your existing workflows from n8n, Make.com, and Zapier directly into AutoGPT.</p></figcaption></figure>
***
## A more polished marketplace
The marketplace continues to get cleaner and easier to browse. Card descriptions are now **neatly truncated** to keep the layout consistent, the download button has been **repositioned** for better flow, and card overflow issues have been resolved. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12494)
<figure><img src="../.gitbook/assets/marketplace-ui-v053-hero.png" alt="UI improvements to the marketplace"><figcaption><p>Cleaner marketplace cards with consistent layout and improved button placement.</p></figcaption></figure>
***
## Test your agents before they go live
A new **dry-run mode** lets you simulate a full run of any agent without real-world side effects. Every block executes and produces realistic outputs, so you can verify your agent works correctly — before it sends emails, updates spreadsheets, or takes any real action. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12483)
This is especially powerful with AutoPilot. When AutoPilot builds an agent for you, it can now automatically test it to make sure it works and the quality is up to standard. It can also preview exactly what an agent will do before you commit to running it — so you always know what to expect.
***
<details>
<summary>Improvements</summary>
* **Parallel AutoPilot actions** — When AutoPilot needs to perform several steps at once, it now runs them simultaneously — no more waiting for each to finish before starting the next. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12472)
* **Scoped AutoPilot tools** — You can now control exactly which tools and blocks AutoPilot has access to — whether running it as a block or using sub-agents — so you can build tightly constrained agentic systems. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12482)
* **Leaner tool schemas** — Tool schema token cost has been reduced by 34%, making AutoPilot faster and cheaper. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12398)
* **Admin marketplace preview** — Admins can now preview and download submitted agents before approving them. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12536)
</details>
<details>
<summary>Fixes</summary>
* Fixed blocks with complex inputs (e.g. GitHub Multi-File Commit) sometimes failing silently [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12496)
* Fixed auto top-up setup showing a generic error when no payment method is on file [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12496)
* Fixed re-uploading a file to AutoPilot failing instead of replacing the existing file [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12496)
* OAuth popup detection — the app now notices when you close an OAuth window and lets you dismiss the waiting modal [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12443)
* Added circuit breaker to prevent infinite tool-call retry loops [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12499)
* Reduced noisy error logging from user-caused LLM API errors [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12516)
* Fixed browser automation to use system Chromium on all architectures [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12473)
</details>
<details>
<summary>Under the hood</summary>
* Renamed SmartDecisionMakerBlock to OrchestratorBlock for clarity [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12511)
* Added DB_STATEMENT_CACHE_SIZE env var for Prisma engine tuning [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12521)
* Bumped stagehand ^0.5.1 → ^3.4.0 to fix yanked litellm dependency [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12539)
* Registered AutoPilot sessions with stream registry for SSE updates [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12500)
* Allowed /tmp as valid path in E2B sandbox file tools [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12501)
</details>


@@ -0,0 +1,85 @@
# Organize your agents, get notified, and enjoy a cleaner chat
*March 5 – March 12, 2026*
***
## Organize agents into folders
You can now **create folders in your library** to keep your agents organized. Drag and drop agents between folders, rename them, and color-code them to match your workflow. Whether you have five agents or fifty, everything stays tidy. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12290)
<figure><img src="../.gitbook/assets/folders-hero.png" alt="Organizing agents into color-coded folders in the AutoGPT library"><figcaption><p>Organizing agents into color-coded folders in the AutoGPT library</p></figcaption></figure>
## Get notified when background chats finish
AutoPilot now sends you a **notification when a background chat completes**. No more switching tabs to check — you'll see a toast alert the moment your agent finishes its work, so you can jump right back into the results. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12364)
<figure><img src="../.gitbook/assets/notifications-hero.png" alt="AutoPilot notification appearing when a background chat finishes its work"><figcaption><p>AutoPilot notification appearing when a background chat finishes its work</p></figcaption></figure>
## Cleaner reasoning & chat experience
We've redesigned how **reasoning steps and tool calls appear in chat**. Intermediate thinking is now collapsed by default, keeping your conversation clean and easy to read. Expand any step if you want the full details. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12282)
<figure><img src="../.gitbook/assets/reasoning-hero.png" alt="Collapsed reasoning steps keeping the AutoPilot conversation clean and focused"><figcaption><p>Collapsed reasoning steps keeping the AutoPilot conversation clean and focused</p></figcaption></figure>
## Give feedback and share outputs instantly
Every response now comes with **action buttons to copy, upvote, downvote, or listen to the output**. Give quick feedback on any response so AutoPilot keeps getting better, and copy or share results in one tap. [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12260)
<figure><img src="../.gitbook/assets/outputs-hero.png" alt="Output action buttons for copying, rating, and sharing AutoPilot responses"><figcaption><p>Output action buttons for copying, rating, and sharing AutoPilot responses</p></figcaption></figure>
***
<details>
<summary>Improvements</summary>
- **Credential selector redesign** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12378) — Completely rebuilt the credential picker for a smoother, faster experience when connecting your accounts.
- **Pinned tool cards** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12346) — Interactive tool cards now stay pinned outside collapsed reasoning, so results are always visible without expanding steps.
- **Workspace file uploads** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12226) — Upload files directly to your workspace and reference them across agents and conversations.
- **Multimodal vision support** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12273) — AutoPilot can now analyze images and PDFs you share in chat, powered by the latest vision models.
- **Per-turn summary stats** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12257) — Each agent turn now shows a work-done summary so you can see exactly what happened at a glance.
- **Text-to-speech voice selection** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12317) — Choose from multiple AI voices when listening to agent responses, with improved quality to avoid robotic-sounding output.
- **Batch undo in agent builder** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12344) — Undo multiple changes at once when editing agents in the visual builder.
</details>
<details>
<summary>Fixes</summary>
- **Session message preservation** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12302) — Title updates no longer overwrite session messages, keeping your full conversation intact.
- **Transcript reliability** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12318) — Refactored transcripts to an atomic full-context model, preventing data loss during complex multi-step runs.
- **File download integrity** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12349) — Workspace file downloads are now buffered to prevent truncation of large files.
- **Node layout handling** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12354) — Improved how the builder handles discriminated unions and node positioning when editing agents.
- **Triggered agent runs** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12298) — Manual run attempts for triggered agents are now handled gracefully instead of failing silently.
- **Password reset flow** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12384) — Supabase error details now pass through correctly during password reset, giving clearer feedback when something goes wrong.
</details>
<details>
<summary>Under the hood</summary>
- **New models added** — Claude Sonnet 4.6, Grok 3, Mistral Large, Medium, Small & Codestral, Perplexity Sonar Reasoning Pro, Google Gemini 3.1 Flash & Pro, Phi-4, and Cohere Command A are now available.
- **File-ref protocol** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12332) — Internal file references now use a consistent protocol, making cross-block file passing more reliable.
- **Langfuse tracing** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12281) — Better observability for agent execution with enhanced trace correlation and feedback loops.
- **Sandbox auto-pause** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12330) — Code execution sandboxes now automatically pause when idle, eliminating unnecessary billing.
- **Claude Code auth** [*↗*](https://github.com/Significant-Gravitas/AutoGPT/pull/12288) — Added subscription authentication support for Claude Code integration in SDK mode.
</details>


@@ -0,0 +1,55 @@
# Credits & Billing
## Overview
The AutoGPT Platform uses a credit system to manage usage. Credits are consumed when blocks execute during agent runs. This guide explains how credits work, how pricing is determined, and how to monitor your spending.
{% hint style="info" %}
The platform is currently in a **pre-release closed beta**. Pricing is subject to change.
{% endhint %}
## How Credits Work
Credits are consumed on a **per-block-run** basis. Each time a block executes during an agent run, it costs a certain number of credits. The price of a block covers its compute, development, and operational costs — there are no separate charges for infrastructure or API usage.
### Block Pricing
Block prices vary depending on the block:
- **Fixed-price blocks**: Some blocks have a flat price regardless of how they are configured (e.g., basic data processing blocks)
- **Variable-price blocks**: Some blocks have a price that changes based on the settings you choose. For example, the **AI Text Generator** block's price changes depending on which large language model you select
{% hint style="info" %}
The current pricing system charges a flat rate per model for AI blocks — you are **not** charged per token.
{% endhint %}
Users are not charged for anything else on the platform beyond block execution. There are no subscription fees, storage fees, or platform access fees.
## Checking Your Balance
Your credit balance is displayed in the **top-right corner** of the screen at all times, visible from any page on the platform.
## Viewing Task Costs
To see how many credits a specific agent run consumed:
1. Go to your [Agent Library](agent-library.md)
2. Open the agent
3. Click on a completed task in the left-hand pane
4. The **total credit cost** for that task is displayed at the top of the task detail view
{% hint style="info" %}
There is no centralised ledger for browsing all credit spend across your account. Credit costs are available on a per-task basis within each agent.
{% endhint %}
## Running Out of Credits
There are no hard limits on usage beyond your credit balance. If your credit balance reaches zero:
- **Running agents will stop executing**
- **Scheduled agents will not run** until credits are replenished
- You will need to add more credits to continue using the platform
## Adding Credits
Credits can be added through the platform. Navigate to your profile settings to manage your credit balance.


@@ -0,0 +1,84 @@
# Data Flow & Execution
## Overview
Understanding how agents execute is key to building effective workflows. This guide explains how data flows through an agent, what determines execution order, and how to work with lists and errors.
## Execution Order
Agent execution is entirely **determined by data flow**. There is no separate execution flow or ordering mechanism — data dependencies are the only thing that controls which block runs when.
### How It Works
1. **Execution starts from input blocks**, which yield their data when the agent is triggered (either manually or via a trigger/schedule)
2. The next block to run is whichever block has **all of its connected inputs satisfied**
3. This continues block by block, following the data flow, until all blocks have executed
4. **Output blocks** collect the final results and present them to the user
### Required Inputs
A block will only execute when:
- All **connected input pins** have received data from their upstream blocks
- All **required input pins** have values — either from a connection or from a hardcoded value set directly on the block
This means blocks that don't depend on each other can execute in any order, while a block that depends on another block's output will always wait for it.
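To make that rule concrete, here is a minimal Python sketch of data-driven scheduling. It is a toy model rather than the platform's actual engine, and the block names and wiring are purely illustrative:

```python
# Toy model of data-driven execution: a block runs as soon as every
# block it depends on has produced output. Not the real engine,
# just the scheduling rule described above.

blocks = {
    "input":   [],           # no dependencies: runs first
    "ai_text": ["input"],    # waits for the input block
    "output":  ["ai_text"],  # waits for the AI block
}

produced = {}           # block name -> output value
pending = set(blocks)   # blocks that haven't run yet

while pending:
    # A block is ready once all of its connected inputs are satisfied
    ready = [b for b in pending if all(dep in produced for dep in blocks[b])]
    if not ready:
        break  # remaining blocks are waiting on inputs that never arrive
    for b in ready:
        upstream = [produced[dep] for dep in blocks[b]]
        produced[b] = f"{b}({', '.join(upstream)})"  # stand-in for real work
        pending.remove(b)

print(produced["output"])  # -> output(ai_text(input()))
```

Blocks with no mutual dependencies land in the same `ready` batch, which is why independent branches of an agent can execute in any order.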
## Working with Pins
### Pin Types
Input and output pins are typed. Common types include:
- **Text**: String values
- **Number**: Numeric values
- **File**: File uploads or downloads
- **List**: Arrays of items
- **Boolean**: True/false values
- **Object**: Structured data
Connections can only be made between compatible pin types.
### Data Flow Visualisation
When an agent is running, you can see data moving through the workflow in real time. Data flow is represented by a **coloured bead** that slides along each connection line from the output pin to the input pin, giving you a clear visual of what's happening.
## Working with Lists
Blocks can handle list data in flexible ways:
- **Outputting lists**: Some blocks produce a list of items as their output. You can choose to receive the full list as a single output or receive individual items one at a time.
- **Iterating over lists**: You can send a list into a block that iterates through its contents, yielding each item one by one. This is useful for processing each item in a list independently.
This makes it straightforward to build agents that process batches of data — for example, fetching a list of URLs and then processing each one through an AI block.
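As a rough illustration (plain Python, not platform code), the two modes correspond to passing the whole list along versus yielding items one at a time:

```python
# Illustrative only: "full list" vs "item by item" handling.

def fetch_urls():
    # Stand-in for a block that outputs a list
    return ["https://a.example", "https://b.example", "https://c.example"]

def iterate(items):
    # Stand-in for a block that yields each list item individually
    for item in items:
        yield item

urls = fetch_urls()         # downstream block receives the full list at once
for url in iterate(urls):   # or each item flows downstream independently
    print("processing", url)
```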
## Error Handling
When a block fails during execution, it does **not** automatically stop the entire agent. Instead:
1. The failed block produces data on its **error pin**
2. What happens next depends on how you've wired the agent
### Handling Errors Gracefully
You have full control over error handling through the block connections:
- **Surface the error**: Connect the error pin to an output block to return the error as part of the agent's result. This is useful for debugging or when you want users to see what went wrong.
- **Handle and continue**: Connect the error pin to other blocks that provide fallback behaviour. For example, retry with different settings, use a default value, or route to an alternative workflow path.
- **Ignore the error**: If the error pin is not connected, the error data is simply not propagated. Downstream blocks that depend on the failed block's normal output pins will not execute (since their inputs won't be satisfied).
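Conceptually, the error pin behaves like a second output that only fires on failure. Here is a rough sketch of the idea (not the platform's internals):

```python
def run_block(fn, *inputs):
    """Toy block runner: success feeds the normal output pin,
    failure feeds the error pin instead of crashing the agent."""
    try:
        return {"output": fn(*inputs), "error": None}
    except Exception as exc:
        return {"output": None, "error": str(exc)}

result = run_block(lambda x: 1 / x, 0)  # deliberately fails
if result["error"] is not None:
    # "Error pin connected": surface it, retry, or fall back
    print("block failed:", result["error"])
```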
{% hint style="info" %}
Building robust agents means thinking about what happens when things go wrong. Consider connecting error pins to output blocks during development so you can see any issues, then add proper error handling once your agent is working.
{% endhint %}
## Execution Summary
| Concept | How It Works |
|---------|-------------|
| **Execution order** | Determined entirely by data flow — blocks run when all inputs are ready |
| **Starting point** | Input blocks yield data first |
| **Ending point** | Output blocks collect final results |
| **Parallel execution** | Blocks with no dependencies on each other can execute in any order |
| **Error handling** | Failed blocks yield data on their error pin — you decide what to do with it |
| **Lists** | Can be processed as a whole or iterated item by item |
| **Visual feedback** | Coloured beads slide along connection lines during execution |


@@ -0,0 +1,79 @@
# Getting Started with AutoGPT (Cloud)
## Introduction
This guide will get you up and running on the hosted AutoGPT Platform at [platform.agpt.co](https://platform.agpt.co). No installation, Docker, or API keys required — just sign up and start building.
{% hint style="info" %}
Looking to self-host instead? See the [Self-Hosting Guide](getting-started.md).
{% endhint %}
## Creating Your Account
1. Navigate to [platform.agpt.co](https://platform.agpt.co)
2. Click **Sign Up** and create your account
3. Once signed in, you'll land on the **AutoPilot** home screen
That's it — you're ready to go. The cloud platform comes with built-in credits and pre-configured API keys for services like OpenAI and Replicate, so you can start using AI blocks immediately without providing your own keys.
## Platform Navigation
The AutoGPT Platform has four main areas, accessible from the navigation bar at the top of every screen:
| Nav Button | URL | Description |
|------------|-----|-------------|
| **Home** | [platform.agpt.co](https://platform.agpt.co) | AutoPilot — your AI assistant for the platform |
| **Build** | [platform.agpt.co/build](https://platform.agpt.co/build) | Agent Builder — visual editor for creating agents |
| **Agents** | [platform.agpt.co/library](https://platform.agpt.co/library) | Your Agent Library — all your saved agents |
| **Marketplace** | [platform.agpt.co/marketplace](https://platform.agpt.co/marketplace) | Community marketplace for discovering agents |
Your **credit balance** is displayed in the top-right corner of the screen at all times. Your **profile menu** (top-right avatar) gives you access to account settings, integrations, and agent publishing.
## What to Do First
Here are three great ways to get started:
### Option 1: Chat with AutoPilot
Click **Home** in the nav bar to open AutoPilot. You can ask it to do almost anything on the platform — browse the marketplace, run agents, build agents, generate images, conduct research, and more. It's the fastest way to experience the platform.
### Option 2: Add an Agent from the Marketplace
1. Click **Marketplace** in the nav bar
2. Browse or search for an agent that interests you
3. Click on an agent to view its details
4. Click **Add to Library**
5. Navigate to **Agents** to find it in your library
6. Click on the agent and press **New Task** to run it
### Option 3: Build Your Own Agent
1. Click **Build** in the nav bar to open the Agent Builder
2. Open the Blocks menu on the left-hand side
3. Add an **Input Block**, an **AI Text Generator Block**, and an **Output Block**
4. Connect them together by dragging between their pins
5. Press **Ctrl+S** or click the save button to save your agent
6. Navigate to **Agents** to find and run it
For a detailed walkthrough, see [Creating a Basic Agent](create-basic-agent.md).
## Key Concepts
Before diving deeper, here are the core concepts you'll encounter:
- **Agent**: An automated workflow you design to perform specific tasks. Agents are made up of connected blocks.
- **Block**: A single action within an agent — such as generating text with AI, sending an email, or looking up data. There are hundreds of blocks that integrate with a wide range of external services.
- **Task**: A single execution of an agent. When you run an agent with a set of inputs, that creates a task. You can view the task's inputs, outputs, and credit cost.
- **AutoPilot**: Your AI assistant that can perform any action on the platform through natural conversation.
- **Marketplace**: A public library of community-built agents you can add to your own library.
- **Credits**: The currency used to run blocks. Each block has its own price. Your balance is shown in the top-right corner.
## Next Steps
| Guide | Description |
|-------|-------------|
| [AutoPilot](autopilot.md) | Learn what AutoPilot can do for you |
| [Agent Builder Guide](agent-builder-guide.md) | Master the visual agent builder |
| [Agent Library](agent-library.md) | Manage your agents, tasks, and schedules |
| [Marketplace](marketplace.md) | Discover and share agents |
| [Credits & Billing](credits-and-billing.md) | Understand the credit system |


@@ -0,0 +1,67 @@
# Integrations & Credentials
## Overview
Many blocks on the AutoGPT Platform integrate with external services like Google, GitHub, Linear, Twitter, and more. These integrations require credentials — such as OAuth connections, API keys, or username/password pairs — to access your accounts on those services.
This guide explains how credentials work on the platform, how to add them, and how to manage them.
## How Credentials Work
### Platform-Provided Credentials
On the cloud-hosted platform at [platform.agpt.co](https://platform.agpt.co), many credentials are **provided by default**. Services like OpenAI, Anthropic, and Replicate are pre-configured, so you can use AI blocks and many other features without providing your own API keys.
### User-Provided Credentials
For services tied to your personal accounts — such as Google, Linear, GitHub, or Twitter — you'll need to connect your own credentials. This only needs to be done **once per service per account**. After connecting, all agents that use that service will automatically have access.
## Adding Credentials
Credentials are added **in context** — when you encounter a block that needs them, rather than from a central setup page.
### When Building an Agent
If you add a block to the builder that requires a credential you haven't connected yet, a **credential bar** will appear on the block prompting you to add it.
### When Running an Agent
If you run an agent (including marketplace agents) that requires credentials, one of the input fields will be the credential selector. This only appears for services you haven't connected yet.
### Credential Types
Depending on the service, you'll be prompted to authenticate in one of three ways:
| Type | Description | Example Services |
|------|-------------|------------------|
| **OAuth** | Click to authorise via the service's login page | Google, GitHub, Twitter |
| **API Key** | Paste your API key from the service's dashboard | Linear, OpenAI (if self-hosting) |
| **Username & Password** | Enter your account credentials | Varies by service |
{% hint style="info" %}
You only need to connect a credential **once per service**. For example, after adding your Linear API key, every agent that uses Linear blocks will have access automatically.
{% endhint %}
## Managing Credentials
### Viewing Connected Integrations
To view and manage your connected integrations:
1. Click your **profile picture** in the top-right corner
2. Select **Integrations**
3. You'll see a list of all your connected integrations
**URL:** [platform.agpt.co/profile/integrations](https://platform.agpt.co/profile/integrations)
### Removing a Credential
From the integrations page, you can **browse** your list of connected integrations and **delete** any you no longer need.
{% hint style="warning" %}
You **cannot add** new integrations from the integrations management screen. New credentials are only added when you encounter a block or agent that requires them.
{% endhint %}
## Self-Hosted Credentials
If you're running the platform locally via self-hosting, you'll need to provide your own API keys for all services, including AI providers. These are configured in the `autogpt_platform/backend/.env` file. See the [Self-Hosting Guide](getting-started.md) for details.
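As a rough illustration, provider keys live in that file as ordinary environment variables. The variable names below are placeholders; consult the `.env.example` shipped with your version of the repository for the exact names:

```bash
# autogpt_platform/backend/.env -- illustrative entries only;
# check the repo's .env.example for the exact variable names.
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
REPLICATE_API_KEY=r8_...
```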


@@ -0,0 +1,82 @@
# Marketplace
## Overview
The AutoGPT Marketplace is a public library of agents created by the community and by the AutoGPT team. You can discover agents for a wide range of use cases, add them to your library with one click, and even publish your own agents for others to use.
**URL:** [platform.agpt.co/marketplace](https://platform.agpt.co/marketplace)
**Access:** Click **Marketplace** in the navigation bar.
## Browsing the Marketplace
The marketplace allows you to:
- **Search** for agents by name or keyword
- **Browse by category** to explore different use cases
- **View agent details** by clicking on any agent
### Agent Detail Page
Each agent in the marketplace has a dedicated page that includes:
- **Title and description** of what the agent does
- **Creator information** — who built the agent and a link to their other agents
- **Video** (when available) — a demonstration of the agent in action
- **Output showcase** (when available) — examples of the agent's results
- **Instructions** — guidance on how to run the agent and what to expect
## Adding an Agent to Your Library
When you find an agent you want to use:
1. Click on the agent to open its marketplace page
2. Click the **Add to Library** button
The agent will immediately appear in your [Agent Library](agent-library.md), where you can run it, schedule it, or edit it.
{% hint style="info" %}
If you already have the agent in your library, the button changes to **See Runs**, which takes you directly to that agent in your library.
{% endhint %}
### For Self-Hosted Users
If you are not logged into the cloud platform and are instead self-hosting, you will see a **Download** button instead of "Add to Library". This downloads the agent as a file which you can import into your local instance. See [Download & Import an Agent](download-agent-from-marketplace-local.md) for details.
## Publishing an Agent
You can publish your own agents to the marketplace for the community to discover and use.
### How to Publish
1. Click your **profile picture** in the top-right corner of any screen
2. Select **Publish an Agent**
3. Choose the agent you want to publish from your list of recent agents
4. Fill out the publishing form:
| Field | Description |
|-------|-------------|
| **Title** | The name of your agent as it will appear in the marketplace |
| **Subheader** | A short tagline for your agent |
| **Slug** | A URL-friendly identifier (e.g., `resume-rewriter`) |
| **Thumbnail Images** | Upload images — the first image becomes the marketplace thumbnail |
| **YouTube Video Link** | Optional — a video demonstrating your agent |
| **Category** | Select the most relevant category for your agent |
| **Description** | A detailed explanation of what your agent does |
| **Agent Output Demo** | Optional — a short video showing the agent's results in action |
| **Instructions** | Explain to users how to run this agent and what to expect |
| **Recommended Schedule** | Suggest when users should run this agent for best results |
5. Click **Submit for Review**
### Review Process
After submission, a member of the AutoGPT team will review your agent against curation standards. If approved, your agent will be published to the marketplace and visible to all users.
{% hint style="info" %}
Creator monetisation (earning money from your published agents) is planned but not yet implemented.
{% endhint %}
## Editing Marketplace Agents
When you add a marketplace agent to your library, you get your own copy. You are free to edit it in the builder just like any agent you've created yourself. Your changes only affect your copy — the original marketplace listing is not modified.


@@ -0,0 +1,96 @@
# Scheduling & Triggers
## Overview
AutoGPT agents can be run on demand, on a recurring schedule, or automatically in response to external events via webhooks. This guide covers both scheduling and trigger-based execution.
## Scheduling an Agent
Scheduling lets you run an agent automatically on a recurring basis with pre-configured inputs.
### Setting Up a Schedule
1. Go to your [Agent Library](agent-library.md) and open the agent you want to schedule
2. Click **New Task** (the same button used for manual runs)
3. Fill in all the input fields with the values you want the agent to use
4. At the bottom of the input form, you'll see two buttons: **Start Task** and **Schedule Task**
5. Click **Schedule Task**
### Configuring the Schedule
The schedule configuration screen allows you to set:
| Setting | Description |
|---------|-------------|
| **Schedule Name** | A descriptive name for this schedule |
| **Repeats** | Frequency — e.g., Weekly |
| **Repeats On** | Select specific days (individual days, weekdays, weekends, or select all) |
| **At** | The time of day to run (hour and minute) |
| **Timezone** | Schedules run in your local timezone, which is shown on screen |
**Example:** Run "Keyword SEO Expert" every weekday at 9:00 AM CST.
### Managing Schedules
After creating a schedule, you can view and manage it on your agent's detail page:
- Open the agent from your library
- Select the **Scheduled** tab on the left-hand side
- The tab shows the count of active schedules (e.g., `Scheduled 1`)
## Triggers (Webhook-Based Execution)
Triggers allow external services or your own code to start an agent automatically by sending data to a webhook URL.
### How Triggers Work
Unlike standard input blocks, **trigger blocks** are a special type of input block. When you add a trigger block to your agent in the builder, the agent's execution model changes — instead of being started manually, it waits for incoming webhook events.
### Setting Up a Trigger
#### Step 1: Add a Trigger Block in the Builder
1. Open the [Agent Builder](agent-builder-guide.md)
2. From the blocks menu, add a **trigger block** to your agent
3. Connect it to the rest of your workflow like any other input block
4. Save your agent
#### Step 2: Configure the Trigger in Your Library
1. Go to your [Agent Library](agent-library.md) and open the agent
2. Click **New Trigger** (this replaces the "New Task" button for trigger-based agents)
3. Give the trigger a **name** and **description**
#### Step 3: Copy the Webhook URL
Once the trigger is created, you'll see a status panel:
> **Trigger Status**
>
> Status: **Active**
>
> This trigger is ready to be used. Use the Webhook URL below to set up the trigger connection with the service of your choosing.
>
> **Webhook URL:** `https://backend.agpt.co/api/integrations/generic_webhook/webhooks/...`
Copy this webhook URL and provide it to the external platform or code that will be sending events to trigger your agent.
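For instance, here is a minimal sketch of firing the trigger from your own code. The payload fields are hypothetical; your agent receives whatever JSON you send as the trigger block's output, so shape it to match what your downstream blocks expect:

```python
import requests

# Paste the Webhook URL copied from the trigger's status panel
WEBHOOK_URL = "https://backend.agpt.co/api/integrations/generic_webhook/webhooks/..."

# Illustrative payload -- the structure is up to you and the sending
# service; the agent sees it as the trigger block's output
payload = {
    "event": "form_submission",
    "email": "user@example.com",
    "message": "New signup from the landing page",
}

response = requests.post(WEBHOOK_URL, json=payload, timeout=30)
response.raise_for_status()
print("Trigger delivered:", response.status_code)
```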
{% hint style="info" %}
Trigger-based agents cannot be started manually with the "New Task" button. The only way to execute them is by sending data to the webhook URL.
{% endhint %}
### Example Use Cases
- **GitHub webhook**: Trigger an agent whenever a pull request is opened
- **Payment processor**: Run an agent when a new payment is received
- **Form submission**: Process data when a user submits a form on your website
- **Custom integration**: Send data from any service that supports webhooks
## Schedule vs. Trigger
| | Schedule | Trigger |
|---|----------|---------|
| **How it starts** | Automatically at configured times | When an external event sends data to the webhook URL |
| **Input source** | Pre-configured when the schedule is created | Provided by the incoming webhook payload |
| **Use case** | Recurring tasks with fixed inputs (daily reports, weekly summaries) | Event-driven tasks (new PR opened, form submitted, payment received) |
| **Setup** | Through the "New Task" → "Schedule Task" flow | Through trigger blocks in the builder + "New Trigger" in the library |


@@ -0,0 +1,48 @@
# Sharing & Exporting Agents
## Overview
There are several ways to share agents and their outputs with others on the AutoGPT Platform. This guide covers all the available options.
## Sharing Options
### Share Task Output via URL
Every completed task has a unique URL that can be shared with others. To share a task result:
1. Go to your [Agent Library](agent-library.md) and open the agent
2. Click on the completed task you want to share
3. Copy the task URL from your browser's address bar
4. Send the URL to anyone — they can view the task's inputs and outputs directly
### Publish to the Marketplace
The most visible way to share an agent is to publish it to the [Marketplace](marketplace.md). Published agents are discoverable by all platform users and go through a review process by the AutoGPT team.
See [Publishing an Agent](marketplace.md#publishing-an-agent) for the full guide.
### Export as a File
For sharing agents privately — without publishing to the marketplace — you can export an agent as a file:
1. Go to your [Agent Library](agent-library.md) and open the agent
2. Click the **three dots** (⋯) on the far right
3. Select **Export Agent to File**
4. The agent file will download to your computer
You can then send this file to anyone via email, messaging, or any other method.
### Import a Shared File
To import an agent file someone has shared with you:
1. Go to your [Agent Library](agent-library.md)
2. Click **Upload Agent** at the top
3. Select the agent file
4. The agent will be added to your library
## Teams & Collaboration
{% hint style="info" %}
Team and organisation features are not yet available on the platform. Currently, the only way to share agents privately between users is through the file export/import process described above.
{% endhint %}


@@ -0,0 +1,43 @@
# Templates
## Overview
Templates are saved input configurations that let you re-run an agent with the same inputs quickly. Instead of filling in every input field each time, you can save a set of inputs as a template and trigger it again with one click.
## Creating a Template
Templates are created from previously completed tasks:
1. Go to your [Agent Library](agent-library.md) and open the agent
2. Find a completed task in the left-hand pane whose inputs you want to reuse
3. Click the **three dots** (⋯) on the task
4. Select **Save as Template**
5. Give the template a **name** and **description**
6. Click **Save**
The template captures all the input values that were used for that task.
## Using a Template
To run an agent using a saved template:
1. Open the agent from your library
2. Switch to the **Templates** tab on the left-hand side
3. Click on the template you want to use
4. The agent will run with the saved input values
## Managing Templates
Templates are listed under the **Templates** tab on your agent's detail page. The tab shows the count of available templates (e.g., `Templates 3`).
## When to Use Templates
Templates are particularly useful for:
- **Recurring tasks**: When you run an agent regularly with the same inputs (e.g., a daily SEO report for the same keyword)
- **Standard configurations**: When you've found input settings that work well and want to reuse them
- **Quick re-runs**: When you want to repeat a successful task without re-entering all the inputs
{% hint style="info" %}
Templates save the **input values** only — they do not save any settings or modifications to the agent itself. If you edit the agent's blocks or connections, existing templates will still use the saved input values with the updated agent design.
{% endhint %}

Some files were not shown because too many files have changed in this diff.