Merge branch 'dev' into refactor/remove-old-agent-library-view

fix(frontend): improve CoPilot chat table styling (#12094 )
## Summary - Remove left and right borders from tables rendered in CoPilot chat - Increase cell padding (py-3 → py-3.5) for better spacing between text and lines - Applies to both Streamdown (main chat) and MarkdownRenderer (tool outputs) Design feedback from Olivia to make tables "breathe" more. ## Test plan - [ ] Open CoPilot chat and trigger a response containing a table - [ ] Verify tables no longer have left/right borders - [ ] Verify increased spacing between rows - [ ] Check both light and dark modes 🤖 Generated with [Claude Code](https://claude.com/claude-code)  <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Improved CoPilot chat table styling by removing left and right borders and increasing vertical padding from `py-3` to `py-3.5`. Changes apply to both: - Streamdown-rendered tables (via CSS selector in `globals.css`) - MarkdownRenderer tables (via Tailwind classes) The changes make tables "breathe" more per design feedback from Olivia. **Issue Found:** - The CSS padding value in `globals.css:192` is `0.625rem` (`py-2.5`) but should be `0.875rem` (`py-3.5`) to match the PR description and the MarkdownRenderer implementation. </details> <details><summary><h3>Confidence Score: 2/5</h3></summary> - This PR has a logical error that will cause inconsistent table styling between Streamdown and MarkdownRenderer tables - The implementation has an inconsistency where the CSS file uses `py-2.5` padding while the PR description and MarkdownRenderer use `py-3.5`. This will result in different table padding between the two rendering systems, contradicting the goal of consistent styling improvements. - Pay close attention to `autogpt_platform/frontend/src/app/globals.css` - the padding value needs to be corrected to match the intended design </details>   --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-02-13 00:05:02 -05:00 · 2026-02-13 09:39:20 +08:00 · 2026-02-13 09:38:59 +08:00 · 2026-02-13 09:38:16 +08:00 · 2026-02-13 09:37:54 +08:00 · 2026-02-12 20:06:40 +00:00
63 changed files with 1129 additions and 5539 deletions
--- a/autogpt_platform/backend/Dockerfile
+++ b/autogpt_platform/backend/Dockerfile
@@ -62,18 +62,12 @@ ENV POETRY_HOME=/opt/poetry \
    DEBIAN_FRONTEND=noninteractive
 ENV PATH=/opt/poetry/bin:$PATH
-# Install Python, FFmpeg, ImageMagick, and CLI tools for agent use.
+# Install Python, FFmpeg, and ImageMagick (required for video processing blocks)
 # bubblewrap provides OS-level sandbox (whitelist-only FS + no network)
 # for the bash_exec MCP tool.
 RUN apt-get update && apt-get install -y \
    python3.13 \
    python3-pip \
    ffmpeg \
    imagemagick \
    jq \
    ripgrep \
    tree \
    bubblewrap \
    && rm -rf /var/lib/apt/lists/*
 # Copy only necessary files from builder
--- a/autogpt_platform/backend/backend/api/features/chat/config.py
+++ b/autogpt_platform/backend/backend/api/features/chat/config.py
@@ -27,11 +27,12 @@ class ChatConfig(BaseSettings):
    session_ttl: int = Field(default=43200, description="Session TTL in seconds")
    # Streaming Configuration
-    stream_timeout: int = Field(default=300, description="Stream timeout in seconds")
+    max_context_messages: int = Field(
-    max_retries: int = Field(
+        default=50, ge=1, le=200, description="Maximum context messages"
        default=3,
        description="Max retries for fallback path (SDK handles retries internally)",
    )
    stream_timeout: int = Field(default=300, description="Stream timeout in seconds")
    max_retries: int = Field(default=3, description="Maximum number of retries")
    max_agent_runs: int = Field(default=30, description="Maximum number of agent runs")
    max_agent_schedules: int = Field(
        default=30, description="Maximum number of agent schedules"
@@ -92,26 +93,6 @@ class ChatConfig(BaseSettings):
        description="Name of the prompt in Langfuse to fetch",
    )
    # Claude Agent SDK Configuration
    use_claude_agent_sdk: bool = Field(
        default=True,
        description="Use Claude Agent SDK for chat completions",
    )
    claude_agent_model: str | None = Field(
        default=None,
        description="Model for the Claude Agent SDK path. If None, derives from "
        "the `model` field by stripping the OpenRouter provider prefix.",
    )
    claude_agent_max_buffer_size: int = Field(
        default=10 * 1024 * 1024,  # 10MB (default SDK is 1MB)
        description="Max buffer size in bytes for Claude Agent SDK JSON message parsing. "
        "Increase if tool outputs exceed the limit.",
    )
    claude_agent_max_subtasks: int = Field(
        default=10,
        description="Max number of sub-agent Tasks the SDK can spawn per session.",
    )
    # Extended thinking configuration for Claude models
    thinking_enabled: bool = Field(
        default=True,
@@ -157,17 +138,6 @@ class ChatConfig(BaseSettings):
            v = os.getenv("CHAT_INTERNAL_API_KEY")
        return v
    @field_validator("use_claude_agent_sdk", mode="before")
    @classmethod
    def get_use_claude_agent_sdk(cls, v):
        """Get use_claude_agent_sdk from environment if not provided."""
        # Check environment variable - default to True if not set
        env_val = os.getenv("CHAT_USE_CLAUDE_AGENT_SDK", "").lower()
        if env_val:
            return env_val in ("true", "1", "yes", "on")
        # Default to True (SDK enabled by default)
        return True if v is None else v
    # Prompt paths for different contexts
    PROMPT_PATHS: dict[str, str] = {
        "default": "prompts/chat_system.md",
--- a/autogpt_platform/backend/backend/api/features/chat/model.py
+++ b/autogpt_platform/backend/backend/api/features/chat/model.py
@@ -334,8 +334,9 @@ async def _get_session_from_cache(session_id: str) -> ChatSession | None:
    try:
        session = ChatSession.model_validate_json(raw_session)
        logger.info(
-            f"[CACHE] Loaded session {session_id}: {len(session.messages)} messages, "
+            f"Loading session {session_id} from cache: "
-            f"last_roles={[m.role for m in session.messages[-3:]]}"  # Last 3 roles
+            f"message_count={len(session.messages)}, "
            f"roles={[m.role for m in session.messages]}"
        )
        return session
    except Exception as e:
@@ -377,9 +378,11 @@ async def _get_session_from_db(session_id: str) -> ChatSession | None:
        return None
    messages = prisma_session.Messages
-    logger.debug(
+    logger.info(
-        f"[DB] Loaded session {session_id}: {len(messages) if messages else 0} messages, "
+        f"Loading session {session_id} from DB: "
-        f"roles={[m.role for m in messages[-3:]] if messages else []}"  # Last 3 roles
+        f"has_messages={messages is not None}, "
        f"message_count={len(messages) if messages else 0}, "
        f"roles={[m.role for m in messages] if messages else []}"
    )
    return ChatSession.from_db(prisma_session, messages)
@@ -430,9 +433,10 @@ async def _save_session_to_db(
                    "function_call": msg.function_call,
                }
            )
-        logger.debug(
+        logger.info(
-            f"[DB] Saving {len(new_messages)} messages to session {session.session_id}, "
+            f"Saving {len(new_messages)} new messages to DB for session {session.session_id}: "
-            f"roles={[m['role'] for m in messages_data]}"
+            f"roles={[m['role'] for m in messages_data]}, "
            f"start_sequence={existing_message_count}"
        )
        await chat_db.add_chat_messages_batch(
            session_id=session.session_id,
@@ -472,7 +476,7 @@ async def get_chat_session(
        logger.warning(f"Unexpected cache error for session {session_id}: {e}")
    # Fall back to database
-    logger.debug(f"Session {session_id} not in cache, checking database")
+    logger.info(f"Session {session_id} not in cache, checking database")
    session = await _get_session_from_db(session_id)
    if session is None:
@@ -489,6 +493,7 @@ async def get_chat_session(
    # Cache the session from DB
    try:
        await _cache_session(session)
        logger.info(f"Cached session {session_id} from database")
    except Exception as e:
        logger.warning(f"Failed to cache session {session_id}: {e}")
@@ -553,40 +558,6 @@ async def upsert_chat_session(
        return session
 async def append_and_save_message(session_id: str, message: ChatMessage) -> ChatSession:
    """Atomically append a message to a session and persist it.
    Acquires the session lock, re-fetches the latest session state,
    appends the message, and saves — preventing message loss when
    concurrent requests modify the same session.
    """
    lock = await _get_session_lock(session_id)
    async with lock:
        session = await get_chat_session(session_id)
        if session is None:
            raise ValueError(f"Session {session_id} not found")
        session.messages.append(message)
        existing_message_count = await chat_db.get_chat_session_message_count(
            session_id
        )
        try:
            await _save_session_to_db(session, existing_message_count)
        except Exception as e:
            raise DatabaseError(
                f"Failed to persist message to session {session_id}"
            ) from e
        try:
            await _cache_session(session)
        except Exception as e:
            logger.warning(f"Cache write failed for session {session_id}: {e}")
        return session
 async def create_chat_session(user_id: str) -> ChatSession:
    """Create a new chat session and persist it.
@@ -693,19 +664,13 @@ async def update_session_title(session_id: str, title: str) -> bool:
            logger.warning(f"Session {session_id} not found for title update")
            return False
-        # Update title in cache if it exists (instead of invalidating).
+        # Invalidate cache so next fetch gets updated title
        # This prevents race conditions where cache invalidation causes
        # the frontend to see stale DB data while streaming is still in progress.
        try:
-            cached = await _get_session_from_cache(session_id)
+            redis_key = _get_session_cache_key(session_id)
-            if cached:
+            async_redis = await get_redis_async()
-                cached.title = title
+            await async_redis.delete(redis_key)
                await _cache_session(cached)
        except Exception as e:
-            # Not critical - title will be correct on next full cache refresh
+            logger.warning(f"Failed to invalidate cache for session {session_id}: {e}")
            logger.warning(
                f"Failed to update title in cache for session {session_id}: {e}"
            )
        return True
    except Exception as e:
--- a/autogpt_platform/backend/backend/api/features/chat/routes.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes.py
@@ -1,6 +1,5 @@
 """Chat API routes for chat session management and streaming via SSE."""
 import asyncio
 import logging
 import uuid as uuid_module
 from collections.abc import AsyncGenerator
@@ -12,22 +11,13 @@ from fastapi.responses import StreamingResponse
 from pydantic import BaseModel
 from backend.util.exceptions import NotFoundError
 from backend.util.feature_flag import Flag, is_feature_enabled
 from . import service as chat_service
 from . import stream_registry
 from .completion_handler import process_operation_failure, process_operation_success
 from .config import ChatConfig
-from .model import (
+from .model import ChatSession, create_chat_session, get_chat_session, get_user_sessions
-    ChatMessage,
+from .response_model import StreamFinish, StreamHeartbeat
    ChatSession,
    append_and_save_message,
    create_chat_session,
    get_chat_session,
    get_user_sessions,
 )
 from .response_model import StreamError, StreamFinish, StreamHeartbeat, StreamStart
 from .sdk import service as sdk_service
 from .tools.models import (
    AgentDetailsResponse,
    AgentOutputResponse,
@@ -50,7 +40,6 @@ from .tools.models import (
    SetupRequirementsResponse,
    UnderstandingUpdatedResponse,
 )
 from .tracking import track_user_message
 config = ChatConfig()
@@ -242,10 +231,6 @@ async def get_session(
    active_task, last_message_id = await stream_registry.get_active_task_for_session(
        session_id, user_id
    )
    logger.info(
        f"[GET_SESSION] session={session_id}, active_task={active_task is not None}, "
        f"msg_count={len(messages)}, last_role={messages[-1].get('role') if messages else 'none'}"
    )
    if active_task:
        # Filter out the in-progress assistant message from the session response.
        # The client will receive the complete assistant response through the SSE
@@ -315,9 +300,10 @@ async def stream_chat_post(
        f"user={user_id}, message_len={len(request.message)}",
        extra={"json_fields": log_meta},
    )
    session = await _validate_and_get_session(session_id, user_id)
    logger.info(
-        f"[TIMING] session validated in {(time.perf_counter() - stream_start_time) * 1000:.1f}ms",
+        f"[TIMING] session validated in {(time.perf_counter() - stream_start_time)*1000:.1f}ms",
        extra={
            "json_fields": {
                **log_meta,
@@ -326,25 +312,6 @@ async def stream_chat_post(
        },
    )
    # Atomically append user message to session BEFORE creating task to avoid
    # race condition where GET_SESSION sees task as "running" but message isn't
    # saved yet.  append_and_save_message re-fetches inside a lock to prevent
    # message loss from concurrent requests.
    if request.message:
        message = ChatMessage(
            role="user" if request.is_user_message else "assistant",
            content=request.message,
        )
        if request.is_user_message:
            track_user_message(
                user_id=user_id,
                session_id=session_id,
                message_length=len(request.message),
            )
        logger.info(f"[STREAM] Saving user message to session {session_id}")
        session = await append_and_save_message(session_id, message)
        logger.info(f"[STREAM] User message saved for session {session_id}")
    # Create a task in the stream registry for reconnection support
    task_id = str(uuid_module.uuid4())
    operation_id = str(uuid_module.uuid4())
@@ -360,7 +327,7 @@ async def stream_chat_post(
        operation_id=operation_id,
    )
    logger.info(
-        f"[TIMING] create_task completed in {(time.perf_counter() - task_create_start) * 1000:.1f}ms",
+        f"[TIMING] create_task completed in {(time.perf_counter() - task_create_start)*1000:.1f}ms",
        extra={
            "json_fields": {
                **log_meta,
@@ -381,47 +348,15 @@ async def stream_chat_post(
        first_chunk_time, ttfc = None, None
        chunk_count = 0
        try:
-            # Emit a start event with task_id for reconnection
+            async for chunk in chat_service.stream_chat_completion(
            start_chunk = StreamStart(messageId=task_id, taskId=task_id)
            await stream_registry.publish_chunk(task_id, start_chunk)
            logger.info(
                f"[TIMING] StreamStart published at {(time_module.perf_counter() - gen_start_time) * 1000:.1f}ms",
                extra={
                    "json_fields": {
                        **log_meta,
                        "elapsed_ms": (time_module.perf_counter() - gen_start_time)
                        * 1000,
                    }
                },
            )
            # Choose service based on LaunchDarkly flag (falls back to config default)
            use_sdk = await is_feature_enabled(
                Flag.COPILOT_SDK,
                user_id or "anonymous",
                default=config.use_claude_agent_sdk,
            )
            stream_fn = (
                sdk_service.stream_chat_completion_sdk
                if use_sdk
                else chat_service.stream_chat_completion
            )
            logger.info(
                f"[TIMING] Calling {'sdk' if use_sdk else 'standard'} stream_chat_completion",
                extra={"json_fields": log_meta},
            )
            # Pass message=None since we already added it to the session above
            async for chunk in stream_fn(
                session_id,
-                None,  # Message already in session
+                request.message,
                is_user_message=request.is_user_message,
                user_id=user_id,
-                session=session,  # Pass session with message already added
+                session=session,  # Pass pre-fetched session to avoid double-fetch
                context=request.context,
                _task_id=task_id,  # Pass task_id so service emits start with taskId for reconnection
            ):
                # Skip duplicate StreamStart — we already published one above
                if isinstance(chunk, StreamStart):
                    continue
                chunk_count += 1
                if first_chunk_time is None:
                    first_chunk_time = time_module.perf_counter()
@@ -442,7 +377,7 @@ async def stream_chat_post(
            gen_end_time = time_module.perf_counter()
            total_time = (gen_end_time - gen_start_time) * 1000
            logger.info(
-                f"[TIMING] run_ai_generation FINISHED in {total_time / 1000:.1f}s; "
+                f"[TIMING] run_ai_generation FINISHED in {total_time/1000:.1f}s; "
                f"task={task_id}, session={session_id}, "
                f"ttfc={ttfc or -1:.2f}s, n_chunks={chunk_count}",
                extra={
@@ -469,17 +404,6 @@ async def stream_chat_post(
                    }
                },
            )
            # Publish a StreamError so the frontend can display an error message
            try:
                await stream_registry.publish_chunk(
                    task_id,
                    StreamError(
                        errorText="An error occurred. Please try again.",
                        code="stream_error",
                    ),
                )
            except Exception:
                pass  # Best-effort; mark_task_completed will publish StreamFinish
            await stream_registry.mark_task_completed(task_id, "failed")
    # Start the AI generation in a background task
@@ -582,14 +506,8 @@ async def stream_chat_post(
                    "json_fields": {**log_meta, "elapsed_ms": elapsed, "error": str(e)}
                },
            )
            # Surface error to frontend so it doesn't appear stuck
            yield StreamError(
                errorText="An error occurred. Please try again.",
                code="stream_error",
            ).to_sse()
            yield StreamFinish().to_sse()
        finally:
-            # Unsubscribe when client disconnects or stream ends
+            # Unsubscribe when client disconnects or stream ends to prevent resource leak
            if subscriber_queue is not None:
                try:
                    await stream_registry.unsubscribe_from_task(
@@ -833,6 +751,8 @@ async def stream_task(
        )
    async def event_generator() -> AsyncGenerator[str, None]:
        import asyncio
        heartbeat_interval = 15.0  # Send heartbeat every 15 seconds
        try:
            while True:
--- a/autogpt_platform/backend/backend/api/features/chat/sdk/init.py
+++ b/autogpt_platform/backend/backend/api/features/chat/sdk/init.py
@@ -1,14 +0,0 @@
 """Claude Agent SDK integration for CoPilot.
 This module provides the integration layer between the Claude Agent SDK
 and the existing CoPilot tool system, enabling drop-in replacement of
 the current LLM orchestration with the battle-tested Claude Agent SDK.
 """
 from .service import stream_chat_completion_sdk
 from .tool_adapter import create_copilot_mcp_server
 __all__ = [
    "stream_chat_completion_sdk",
    "create_copilot_mcp_server",
 ]
--- a/autogpt_platform/backend/backend/api/features/chat/sdk/response_adapter.py
+++ b/autogpt_platform/backend/backend/api/features/chat/sdk/response_adapter.py
@@ -1,198 +0,0 @@
 """Response adapter for converting Claude Agent SDK messages to Vercel AI SDK format.
 This module provides the adapter layer that converts streaming messages from
 the Claude Agent SDK into the Vercel AI SDK UI Stream Protocol format that
 the frontend expects.
 """
 import json
 import logging
 import uuid
 from claude_agent_sdk import (
    AssistantMessage,
    Message,
    ResultMessage,
    SystemMessage,
    TextBlock,
    ToolResultBlock,
    ToolUseBlock,
    UserMessage,
 )
 from backend.api.features.chat.response_model import (
    StreamBaseResponse,
    StreamError,
    StreamFinish,
    StreamFinishStep,
    StreamStart,
    StreamStartStep,
    StreamTextDelta,
    StreamTextEnd,
    StreamTextStart,
    StreamToolInputAvailable,
    StreamToolInputStart,
    StreamToolOutputAvailable,
 )
 from backend.api.features.chat.sdk.tool_adapter import (
    MCP_TOOL_PREFIX,
    pop_pending_tool_output,
 )
 logger = logging.getLogger(__name__)
 class SDKResponseAdapter:
    """Adapter for converting Claude Agent SDK messages to Vercel AI SDK format.
    This class maintains state during a streaming session to properly track
    text blocks, tool calls, and message lifecycle.
    """
    def __init__(self, message_id: str | None = None):
        self.message_id = message_id or str(uuid.uuid4())
        self.text_block_id = str(uuid.uuid4())
        self.has_started_text = False
        self.has_ended_text = False
        self.current_tool_calls: dict[str, dict[str, str]] = {}
        self.task_id: str | None = None
        self.step_open = False
    def set_task_id(self, task_id: str) -> None:
        """Set the task ID for reconnection support."""
        self.task_id = task_id
    def convert_message(self, sdk_message: Message) -> list[StreamBaseResponse]:
        """Convert a single SDK message to Vercel AI SDK format."""
        responses: list[StreamBaseResponse] = []
        if isinstance(sdk_message, SystemMessage):
            if sdk_message.subtype == "init":
                responses.append(
                    StreamStart(messageId=self.message_id, taskId=self.task_id)
                )
                # Open the first step (matches non-SDK: StreamStart then StreamStartStep)
                responses.append(StreamStartStep())
                self.step_open = True
        elif isinstance(sdk_message, AssistantMessage):
            # After tool results, the SDK sends a new AssistantMessage for the
            # next LLM turn. Open a new step if the previous one was closed.
            if not self.step_open:
                responses.append(StreamStartStep())
                self.step_open = True
            for block in sdk_message.content:
                if isinstance(block, TextBlock):
                    if block.text:
                        self._ensure_text_started(responses)
                        responses.append(
                            StreamTextDelta(id=self.text_block_id, delta=block.text)
                        )
                elif isinstance(block, ToolUseBlock):
                    self._end_text_if_open(responses)
                    # Strip MCP prefix so frontend sees "find_block"
                    # instead of "mcp__copilot__find_block".
                    tool_name = block.name.removeprefix(MCP_TOOL_PREFIX)
                    responses.append(
                        StreamToolInputStart(toolCallId=block.id, toolName=tool_name)
                    )
                    responses.append(
                        StreamToolInputAvailable(
                            toolCallId=block.id,
                            toolName=tool_name,
                            input=block.input,
                        )
                    )
                    self.current_tool_calls[block.id] = {"name": tool_name}
        elif isinstance(sdk_message, UserMessage):
            # UserMessage carries tool results back from tool execution.
            content = sdk_message.content
            blocks = content if isinstance(content, list) else []
            for block in blocks:
                if isinstance(block, ToolResultBlock) and block.tool_use_id:
                    tool_info = self.current_tool_calls.get(block.tool_use_id, {})
                    tool_name = tool_info.get("name", "unknown")
                    # Prefer the stashed full output over the SDK's
                    # (potentially truncated) ToolResultBlock content.
                    # The SDK truncates large results, writing them to disk,
                    # which breaks frontend widget parsing.
                    output = pop_pending_tool_output(tool_name) or (
                        _extract_tool_output(block.content)
                    )
                    responses.append(
                        StreamToolOutputAvailable(
                            toolCallId=block.tool_use_id,
                            toolName=tool_name,
                            output=output,
                            success=not (block.is_error or False),
                        )
                    )
            # Close the current step after tool results — the next
            # AssistantMessage will open a new step for the continuation.
            if self.step_open:
                responses.append(StreamFinishStep())
                self.step_open = False
        elif isinstance(sdk_message, ResultMessage):
            self._end_text_if_open(responses)
            # Close the step before finishing.
            if self.step_open:
                responses.append(StreamFinishStep())
                self.step_open = False
            if sdk_message.subtype == "success":
                responses.append(StreamFinish())
            elif sdk_message.subtype in ("error", "error_during_execution"):
                error_msg = getattr(sdk_message, "result", None) or "Unknown error"
                responses.append(
                    StreamError(errorText=str(error_msg), code="sdk_error")
                )
                responses.append(StreamFinish())
        else:
            logger.debug(f"Unhandled SDK message type: {type(sdk_message).__name__}")
        return responses
    def _ensure_text_started(self, responses: list[StreamBaseResponse]) -> None:
        """Start (or restart) a text block if needed."""
        if not self.has_started_text or self.has_ended_text:
            if self.has_ended_text:
                self.text_block_id = str(uuid.uuid4())
                self.has_ended_text = False
            responses.append(StreamTextStart(id=self.text_block_id))
            self.has_started_text = True
    def _end_text_if_open(self, responses: list[StreamBaseResponse]) -> None:
        """End the current text block if one is open."""
        if self.has_started_text and not self.has_ended_text:
            responses.append(StreamTextEnd(id=self.text_block_id))
            self.has_ended_text = True
 def _extract_tool_output(content: str | list[dict[str, str]] | None) -> str:
    """Extract a string output from a ToolResultBlock's content field."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = [item.get("text", "") for item in content if item.get("type") == "text"]
        if parts:
            return "".join(parts)
        try:
            return json.dumps(content)
        except (TypeError, ValueError):
            return str(content)
    if content is None:
        return ""
    try:
        return json.dumps(content)
    except (TypeError, ValueError):
        return str(content)
--- a/autogpt_platform/backend/backend/api/features/chat/sdk/response_adapter_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/sdk/response_adapter_test.py
@@ -1,366 +0,0 @@
 """Unit tests for the SDK response adapter."""
 from claude_agent_sdk import (
    AssistantMessage,
    ResultMessage,
    SystemMessage,
    TextBlock,
    ToolResultBlock,
    ToolUseBlock,
    UserMessage,
 )
 from backend.api.features.chat.response_model import (
    StreamBaseResponse,
    StreamError,
    StreamFinish,
    StreamFinishStep,
    StreamStart,
    StreamStartStep,
    StreamTextDelta,
    StreamTextEnd,
    StreamTextStart,
    StreamToolInputAvailable,
    StreamToolInputStart,
    StreamToolOutputAvailable,
 )
 from .response_adapter import SDKResponseAdapter
 from .tool_adapter import MCP_TOOL_PREFIX
 def _adapter() -> SDKResponseAdapter:
    a = SDKResponseAdapter(message_id="msg-1")
    a.set_task_id("task-1")
    return a
 # -- SystemMessage -----------------------------------------------------------
 def test_system_init_emits_start_and_step():
    adapter = _adapter()
    results = adapter.convert_message(SystemMessage(subtype="init", data={}))
    assert len(results) == 2
    assert isinstance(results[0], StreamStart)
    assert results[0].messageId == "msg-1"
    assert results[0].taskId == "task-1"
    assert isinstance(results[1], StreamStartStep)
 def test_system_non_init_emits_nothing():
    adapter = _adapter()
    results = adapter.convert_message(SystemMessage(subtype="other", data={}))
    assert results == []
 # -- AssistantMessage with TextBlock -----------------------------------------
 def test_text_block_emits_step_start_and_delta():
    adapter = _adapter()
    msg = AssistantMessage(content=[TextBlock(text="hello")], model="test")
    results = adapter.convert_message(msg)
    assert len(results) == 3
    assert isinstance(results[0], StreamStartStep)
    assert isinstance(results[1], StreamTextStart)
    assert isinstance(results[2], StreamTextDelta)
    assert results[2].delta == "hello"
 def test_empty_text_block_emits_only_step():
    adapter = _adapter()
    msg = AssistantMessage(content=[TextBlock(text="")], model="test")
    results = adapter.convert_message(msg)
    # Empty text skipped, but step still opens
    assert len(results) == 1
    assert isinstance(results[0], StreamStartStep)
 def test_multiple_text_deltas_reuse_block_id():
    adapter = _adapter()
    msg1 = AssistantMessage(content=[TextBlock(text="a")], model="test")
    msg2 = AssistantMessage(content=[TextBlock(text="b")], model="test")
    r1 = adapter.convert_message(msg1)
    r2 = adapter.convert_message(msg2)
    # First gets step+start+delta, second only delta (block & step already started)
    assert len(r1) == 3
    assert isinstance(r1[0], StreamStartStep)
    assert isinstance(r1[1], StreamTextStart)
    assert len(r2) == 1
    assert isinstance(r2[0], StreamTextDelta)
    assert r1[1].id == r2[0].id  # same block ID
 # -- AssistantMessage with ToolUseBlock --------------------------------------
 def test_tool_use_emits_input_start_and_available():
    """Tool names arrive with MCP prefix and should be stripped for the frontend."""
    adapter = _adapter()
    msg = AssistantMessage(
        content=[
            ToolUseBlock(
                id="tool-1",
                name=f"{MCP_TOOL_PREFIX}find_agent",
                input={"q": "x"},
            )
        ],
        model="test",
    )
    results = adapter.convert_message(msg)
    assert len(results) == 3
    assert isinstance(results[0], StreamStartStep)
    assert isinstance(results[1], StreamToolInputStart)
    assert results[1].toolCallId == "tool-1"
    assert results[1].toolName == "find_agent"  # prefix stripped
    assert isinstance(results[2], StreamToolInputAvailable)
    assert results[2].toolName == "find_agent"  # prefix stripped
    assert results[2].input == {"q": "x"}
 def test_text_then_tool_ends_text_block():
    adapter = _adapter()
    text_msg = AssistantMessage(content=[TextBlock(text="thinking...")], model="test")
    tool_msg = AssistantMessage(
        content=[ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}tool", input={})],
        model="test",
    )
    adapter.convert_message(text_msg)  # opens step + text
    results = adapter.convert_message(tool_msg)
    # Step already open, so: TextEnd, ToolInputStart, ToolInputAvailable
    assert len(results) == 3
    assert isinstance(results[0], StreamTextEnd)
    assert isinstance(results[1], StreamToolInputStart)
 # -- UserMessage with ToolResultBlock ----------------------------------------
 def test_tool_result_emits_output_and_finish_step():
    adapter = _adapter()
    # First register the tool call (opens step) — SDK sends prefixed name
    tool_msg = AssistantMessage(
        content=[ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}find_agent", input={})],
        model="test",
    )
    adapter.convert_message(tool_msg)
    # Now send tool result
    result_msg = UserMessage(
        content=[ToolResultBlock(tool_use_id="t1", content="found 3 agents")]
    )
    results = adapter.convert_message(result_msg)
    assert len(results) == 2
    assert isinstance(results[0], StreamToolOutputAvailable)
    assert results[0].toolCallId == "t1"
    assert results[0].toolName == "find_agent"  # prefix stripped
    assert results[0].output == "found 3 agents"
    assert results[0].success is True
    assert isinstance(results[1], StreamFinishStep)
 def test_tool_result_error():
    adapter = _adapter()
    adapter.convert_message(
        AssistantMessage(
            content=[
                ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}run_agent", input={})
            ],
            model="test",
        )
    )
    result_msg = UserMessage(
        content=[ToolResultBlock(tool_use_id="t1", content="timeout", is_error=True)]
    )
    results = adapter.convert_message(result_msg)
    assert isinstance(results[0], StreamToolOutputAvailable)
    assert results[0].success is False
    assert isinstance(results[1], StreamFinishStep)
 def test_tool_result_list_content():
    adapter = _adapter()
    adapter.convert_message(
        AssistantMessage(
            content=[ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}tool", input={})],
            model="test",
        )
    )
    result_msg = UserMessage(
        content=[
            ToolResultBlock(
                tool_use_id="t1",
                content=[
                    {"type": "text", "text": "line1"},
                    {"type": "text", "text": "line2"},
                ],
            )
        ]
    )
    results = adapter.convert_message(result_msg)
    assert isinstance(results[0], StreamToolOutputAvailable)
    assert results[0].output == "line1line2"
    assert isinstance(results[1], StreamFinishStep)
 def test_string_user_message_ignored():
    """A plain string UserMessage (not tool results) produces no output."""
    adapter = _adapter()
    results = adapter.convert_message(UserMessage(content="hello"))
    assert results == []
 # -- ResultMessage -----------------------------------------------------------
 def test_result_success_emits_finish_step_and_finish():
    adapter = _adapter()
    # Start some text first (opens step)
    adapter.convert_message(
        AssistantMessage(content=[TextBlock(text="done")], model="test")
    )
    msg = ResultMessage(
        subtype="success",
        duration_ms=100,
        duration_api_ms=50,
        is_error=False,
        num_turns=1,
        session_id="s1",
    )
    results = adapter.convert_message(msg)
    # TextEnd + FinishStep + StreamFinish
    assert len(results) == 3
    assert isinstance(results[0], StreamTextEnd)
    assert isinstance(results[1], StreamFinishStep)
    assert isinstance(results[2], StreamFinish)
 def test_result_error_emits_error_and_finish():
    adapter = _adapter()
    msg = ResultMessage(
        subtype="error",
        duration_ms=100,
        duration_api_ms=50,
        is_error=True,
        num_turns=0,
        session_id="s1",
        result="API rate limited",
    )
    results = adapter.convert_message(msg)
    # No step was open, so no FinishStep — just Error + Finish
    assert len(results) == 2
    assert isinstance(results[0], StreamError)
    assert "API rate limited" in results[0].errorText
    assert isinstance(results[1], StreamFinish)
 # -- Text after tools (new block ID) ----------------------------------------
 def test_text_after_tool_gets_new_block_id():
    adapter = _adapter()
    # Text -> Tool -> ToolResult -> Text should get a new text block ID and step
    adapter.convert_message(
        AssistantMessage(content=[TextBlock(text="before")], model="test")
    )
    adapter.convert_message(
        AssistantMessage(
            content=[ToolUseBlock(id="t1", name=f"{MCP_TOOL_PREFIX}tool", input={})],
            model="test",
        )
    )
    # Send tool result (closes step)
    adapter.convert_message(
        UserMessage(content=[ToolResultBlock(tool_use_id="t1", content="ok")])
    )
    results = adapter.convert_message(
        AssistantMessage(content=[TextBlock(text="after")], model="test")
    )
    # Should get StreamStartStep (new step) + StreamTextStart (new block) + StreamTextDelta
    assert len(results) == 3
    assert isinstance(results[0], StreamStartStep)
    assert isinstance(results[1], StreamTextStart)
    assert isinstance(results[2], StreamTextDelta)
    assert results[2].delta == "after"
 # -- Full conversation flow --------------------------------------------------
 def test_full_conversation_flow():
    """Simulate a complete conversation: init -> text -> tool -> result -> text -> finish."""
    adapter = _adapter()
    all_responses: list[StreamBaseResponse] = []
    # 1. Init
    all_responses.extend(
        adapter.convert_message(SystemMessage(subtype="init", data={}))
    )
    # 2. Assistant text
    all_responses.extend(
        adapter.convert_message(
            AssistantMessage(content=[TextBlock(text="Let me search")], model="test")
        )
    )
    # 3. Tool use
    all_responses.extend(
        adapter.convert_message(
            AssistantMessage(
                content=[
                    ToolUseBlock(
                        id="t1",
                        name=f"{MCP_TOOL_PREFIX}find_agent",
                        input={"query": "email"},
                    )
                ],
                model="test",
            )
        )
    )
    # 4. Tool result
    all_responses.extend(
        adapter.convert_message(
            UserMessage(
                content=[ToolResultBlock(tool_use_id="t1", content="Found 2 agents")]
            )
        )
    )
    # 5. More text
    all_responses.extend(
        adapter.convert_message(
            AssistantMessage(content=[TextBlock(text="I found 2")], model="test")
        )
    )
    # 6. Result
    all_responses.extend(
        adapter.convert_message(
            ResultMessage(
                subtype="success",
                duration_ms=500,
                duration_api_ms=400,
                is_error=False,
                num_turns=2,
                session_id="s1",
            )
        )
    )
    types = [type(r).__name__ for r in all_responses]
    assert types == [
        "StreamStart",
        "StreamStartStep",  # step 1: text + tool call
        "StreamTextStart",
        "StreamTextDelta",  # "Let me search"
        "StreamTextEnd",  # closed before tool
        "StreamToolInputStart",
        "StreamToolInputAvailable",
        "StreamToolOutputAvailable",  # tool result
        "StreamFinishStep",  # step 1 closed after tool result
        "StreamStartStep",  # step 2: continuation text
        "StreamTextStart",  # new block after tool
        "StreamTextDelta",  # "I found 2"
        "StreamTextEnd",  # closed by result
        "StreamFinishStep",  # step 2 closed
        "StreamFinish",
    ]
--- a/autogpt_platform/backend/backend/api/features/chat/sdk/security_hooks.py
+++ b/autogpt_platform/backend/backend/api/features/chat/sdk/security_hooks.py
@@ -1,296 +0,0 @@
 """Security hooks for Claude Agent SDK integration.
 This module provides security hooks that validate tool calls before execution,
 ensuring multi-user isolation and preventing unauthorized operations.
 """
 import json
 import logging
 import os
 import re
 from typing import Any, cast
 from backend.api.features.chat.sdk.tool_adapter import MCP_TOOL_PREFIX
 logger = logging.getLogger(__name__)
 # Tools that are blocked entirely (CLI/system access).
 # "Bash" (capital) is the SDK built-in — it's NOT in allowed_tools but blocked
 # here as defence-in-depth.  The agent uses mcp__copilot__bash_exec instead,
 # which has kernel-level network isolation (unshare --net).
 BLOCKED_TOOLS = {
    "Bash",
    "bash",
    "shell",
    "exec",
    "terminal",
    "command",
 }
 # Tools allowed only when their path argument stays within the SDK workspace.
 # The SDK uses these to handle oversized tool results (writes to tool-results/
 # files, then reads them back) and for workspace file operations.
 WORKSPACE_SCOPED_TOOLS = {"Read", "Write", "Edit", "Glob", "Grep"}
 # Dangerous patterns in tool inputs
 DANGEROUS_PATTERNS = [
    r"sudo",
    r"rm\s+-rf",
    r"dd\s+if=",
    r"/etc/passwd",
    r"/etc/shadow",
    r"chmod\s+777",
    r"curl\s+.*\|.*sh",
    r"wget\s+.*\|.*sh",
    r"eval\s*\(",
    r"exec\s*\(",
    r"__import__",
    r"os\.system",
    r"subprocess",
 ]
 def _deny(reason: str) -> dict[str, Any]:
    """Return a hook denial response."""
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": reason,
        }
    }
 def _validate_workspace_path(
    tool_name: str, tool_input: dict[str, Any], sdk_cwd: str | None
 ) -> dict[str, Any]:
    """Validate that a workspace-scoped tool only accesses allowed paths.
    Allowed directories:
    - The SDK working directory (``/tmp/copilot-<session>/``)
    - The SDK tool-results directory (``~/.claude/projects/…/tool-results/``)
    """
    path = tool_input.get("file_path") or tool_input.get("path") or ""
    if not path:
        # Glob/Grep without a path default to cwd which is already sandboxed
        return {}
    # Resolve relative paths against sdk_cwd (the SDK sets cwd so the LLM
    # naturally uses relative paths like "test.txt" instead of absolute ones).
    if not os.path.isabs(path) and sdk_cwd:
        resolved = os.path.normpath(os.path.join(sdk_cwd, path))
    else:
        resolved = os.path.normpath(os.path.expanduser(path))
    # Allow access within the SDK working directory
    if sdk_cwd:
        norm_cwd = os.path.normpath(sdk_cwd)
        if resolved.startswith(norm_cwd + os.sep) or resolved == norm_cwd:
            return {}
    # Allow access to ~/.claude/projects/*/tool-results/ (big tool results)
    claude_dir = os.path.normpath(os.path.expanduser("~/.claude/projects"))
    if resolved.startswith(claude_dir + os.sep) and "tool-results" in resolved:
        return {}
    logger.warning(
        f"Blocked {tool_name} outside workspace: {path} (resolved={resolved})"
    )
    workspace_hint = f" Allowed workspace: {sdk_cwd}" if sdk_cwd else ""
    return _deny(
        f"[SECURITY] Tool '{tool_name}' can only access files within the workspace "
        f"directory.{workspace_hint} "
        "This is enforced by the platform and cannot be bypassed."
    )
 def _validate_tool_access(
    tool_name: str, tool_input: dict[str, Any], sdk_cwd: str | None = None
 ) -> dict[str, Any]:
    """Validate that a tool call is allowed.
    Returns:
        Empty dict to allow, or dict with hookSpecificOutput to deny
    """
    # Block forbidden tools
    if tool_name in BLOCKED_TOOLS:
        logger.warning(f"Blocked tool access attempt: {tool_name}")
        return _deny(
            f"[SECURITY] Tool '{tool_name}' is blocked for security. "
            "This is enforced by the platform and cannot be bypassed. "
            "Use the CoPilot-specific MCP tools instead."
        )
    # Workspace-scoped tools: allowed only within the SDK workspace directory
    if tool_name in WORKSPACE_SCOPED_TOOLS:
        return _validate_workspace_path(tool_name, tool_input, sdk_cwd)
    # Check for dangerous patterns in tool input
    # Use json.dumps for predictable format (str() produces Python repr)
    input_str = json.dumps(tool_input) if tool_input else ""
    for pattern in DANGEROUS_PATTERNS:
        if re.search(pattern, input_str, re.IGNORECASE):
            logger.warning(
                f"Blocked dangerous pattern in tool input: {pattern} in {tool_name}"
            )
            return _deny(
                "[SECURITY] Input contains a blocked pattern. "
                "This is enforced by the platform and cannot be bypassed."
            )
    return {}
 def _validate_user_isolation(
    tool_name: str, tool_input: dict[str, Any], user_id: str | None
 ) -> dict[str, Any]:
    """Validate that tool calls respect user isolation."""
    # For workspace file tools, ensure path doesn't escape
    if "workspace" in tool_name.lower():
        path = tool_input.get("path", "") or tool_input.get("file_path", "")
        if path:
            # Check for path traversal
            if ".." in path or path.startswith("/"):
                logger.warning(
                    f"Blocked path traversal attempt: {path} by user {user_id}"
                )
                return {
                    "hookSpecificOutput": {
                        "hookEventName": "PreToolUse",
                        "permissionDecision": "deny",
                        "permissionDecisionReason": "Path traversal not allowed",
                    }
                }
    return {}
 def create_security_hooks(
    user_id: str | None,
    sdk_cwd: str | None = None,
    max_subtasks: int = 3,
 ) -> dict[str, Any]:
    """Create the security hooks configuration for Claude Agent SDK.
    Includes security validation and observability hooks:
    - PreToolUse: Security validation before tool execution
    - PostToolUse: Log successful tool executions
    - PostToolUseFailure: Log and handle failed tool executions
    - PreCompact: Log context compaction events (SDK handles compaction automatically)
    Args:
        user_id: Current user ID for isolation validation
        sdk_cwd: SDK working directory for workspace-scoped tool validation
        max_subtasks: Maximum Task (sub-agent) spawns allowed per session
    Returns:
        Hooks configuration dict for ClaudeAgentOptions
    """
    try:
        from claude_agent_sdk import HookMatcher
        from claude_agent_sdk.types import HookContext, HookInput, SyncHookJSONOutput
        # Per-session counter for Task sub-agent spawns
        task_spawn_count = 0
        async def pre_tool_use_hook(
            input_data: HookInput,
            tool_use_id: str | None,
            context: HookContext,
        ) -> SyncHookJSONOutput:
            """Combined pre-tool-use validation hook."""
            nonlocal task_spawn_count
            _ = context  # unused but required by signature
            tool_name = cast(str, input_data.get("tool_name", ""))
            tool_input = cast(dict[str, Any], input_data.get("tool_input", {}))
            # Rate-limit Task (sub-agent) spawns per session
            if tool_name == "Task":
                task_spawn_count += 1
                if task_spawn_count > max_subtasks:
                    logger.warning(
                        f"[SDK] Task limit reached ({max_subtasks}), user={user_id}"
                    )
                    return cast(
                        SyncHookJSONOutput,
                        _deny(
                            f"Maximum {max_subtasks} sub-tasks per session. "
                            "Please continue in the main conversation."
                        ),
                    )
            # Strip MCP prefix for consistent validation
            is_copilot_tool = tool_name.startswith(MCP_TOOL_PREFIX)
            clean_name = tool_name.removeprefix(MCP_TOOL_PREFIX)
            # Only block non-CoPilot tools; our MCP-registered tools
            # (including Read for oversized results) are already sandboxed.
            if not is_copilot_tool:
                result = _validate_tool_access(clean_name, tool_input, sdk_cwd)
                if result:
                    return cast(SyncHookJSONOutput, result)
            # Validate user isolation
            result = _validate_user_isolation(clean_name, tool_input, user_id)
            if result:
                return cast(SyncHookJSONOutput, result)
            logger.debug(f"[SDK] Tool start: {tool_name}, user={user_id}")
            return cast(SyncHookJSONOutput, {})
        async def post_tool_use_hook(
            input_data: HookInput,
            tool_use_id: str | None,
            context: HookContext,
        ) -> SyncHookJSONOutput:
            """Log successful tool executions for observability."""
            _ = context
            tool_name = cast(str, input_data.get("tool_name", ""))
            logger.debug(f"[SDK] Tool success: {tool_name}, tool_use_id={tool_use_id}")
            return cast(SyncHookJSONOutput, {})
        async def post_tool_failure_hook(
            input_data: HookInput,
            tool_use_id: str | None,
            context: HookContext,
        ) -> SyncHookJSONOutput:
            """Log failed tool executions for debugging."""
            _ = context
            tool_name = cast(str, input_data.get("tool_name", ""))
            error = input_data.get("error", "Unknown error")
            logger.warning(
                f"[SDK] Tool failed: {tool_name}, error={error}, "
                f"user={user_id}, tool_use_id={tool_use_id}"
            )
            return cast(SyncHookJSONOutput, {})
        async def pre_compact_hook(
            input_data: HookInput,
            tool_use_id: str | None,
            context: HookContext,
        ) -> SyncHookJSONOutput:
            """Log when SDK triggers context compaction.
            The SDK automatically compacts conversation history when it grows too large.
            This hook provides visibility into when compaction happens.
            """
            _ = context, tool_use_id
            trigger = input_data.get("trigger", "auto")
            logger.info(
                f"[SDK] Context compaction triggered: {trigger}, user={user_id}"
            )
            return cast(SyncHookJSONOutput, {})
        return {
            "PreToolUse": [HookMatcher(matcher="*", hooks=[pre_tool_use_hook])],
            "PostToolUse": [HookMatcher(matcher="*", hooks=[post_tool_use_hook])],
            "PostToolUseFailure": [
                HookMatcher(matcher="*", hooks=[post_tool_failure_hook])
            ],
            "PreCompact": [HookMatcher(matcher="*", hooks=[pre_compact_hook])],
        }
    except ImportError:
        # Fallback for when SDK isn't available - return empty hooks
        logger.warning("claude-agent-sdk not available, security hooks disabled")
        return {}
--- a/autogpt_platform/backend/backend/api/features/chat/sdk/security_hooks_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/sdk/security_hooks_test.py
@@ -1,165 +0,0 @@
 """Unit tests for SDK security hooks."""
 import os
 from .security_hooks import _validate_tool_access, _validate_user_isolation
 SDK_CWD = "/tmp/copilot-abc123"
 def _is_denied(result: dict) -> bool:
    hook = result.get("hookSpecificOutput", {})
    return hook.get("permissionDecision") == "deny"
 # -- Blocked tools -----------------------------------------------------------
 def test_blocked_tools_denied():
    for tool in ("bash", "shell", "exec", "terminal", "command"):
        result = _validate_tool_access(tool, {})
        assert _is_denied(result), f"{tool} should be blocked"
 def test_unknown_tool_allowed():
    result = _validate_tool_access("SomeCustomTool", {})
    assert result == {}
 # -- Workspace-scoped tools --------------------------------------------------
 def test_read_within_workspace_allowed():
    result = _validate_tool_access(
        "Read", {"file_path": f"{SDK_CWD}/file.txt"}, sdk_cwd=SDK_CWD
    )
    assert result == {}
 def test_write_within_workspace_allowed():
    result = _validate_tool_access(
        "Write", {"file_path": f"{SDK_CWD}/output.json"}, sdk_cwd=SDK_CWD
    )
    assert result == {}
 def test_edit_within_workspace_allowed():
    result = _validate_tool_access(
        "Edit", {"file_path": f"{SDK_CWD}/src/main.py"}, sdk_cwd=SDK_CWD
    )
    assert result == {}
 def test_glob_within_workspace_allowed():
    result = _validate_tool_access("Glob", {"path": f"{SDK_CWD}/src"}, sdk_cwd=SDK_CWD)
    assert result == {}
 def test_grep_within_workspace_allowed():
    result = _validate_tool_access("Grep", {"path": f"{SDK_CWD}/src"}, sdk_cwd=SDK_CWD)
    assert result == {}
 def test_read_outside_workspace_denied():
    result = _validate_tool_access(
        "Read", {"file_path": "/etc/passwd"}, sdk_cwd=SDK_CWD
    )
    assert _is_denied(result)
 def test_write_outside_workspace_denied():
    result = _validate_tool_access(
        "Write", {"file_path": "/home/user/secrets.txt"}, sdk_cwd=SDK_CWD
    )
    assert _is_denied(result)
 def test_traversal_attack_denied():
    result = _validate_tool_access(
        "Read",
        {"file_path": f"{SDK_CWD}/../../etc/passwd"},
        sdk_cwd=SDK_CWD,
    )
    assert _is_denied(result)
 def test_no_path_allowed():
    """Glob/Grep without a path argument defaults to cwd — should pass."""
    result = _validate_tool_access("Glob", {}, sdk_cwd=SDK_CWD)
    assert result == {}
 def test_read_no_cwd_denies_absolute():
    """If no sdk_cwd is set, absolute paths are denied."""
    result = _validate_tool_access("Read", {"file_path": "/tmp/anything"})
    assert _is_denied(result)
 # -- Tool-results directory --------------------------------------------------
 def test_read_tool_results_allowed():
    home = os.path.expanduser("~")
    path = f"{home}/.claude/projects/-tmp-copilot-abc123/tool-results/12345.txt"
    result = _validate_tool_access("Read", {"file_path": path}, sdk_cwd=SDK_CWD)
    assert result == {}
 def test_read_claude_projects_without_tool_results_denied():
    home = os.path.expanduser("~")
    path = f"{home}/.claude/projects/-tmp-copilot-abc123/settings.json"
    result = _validate_tool_access("Read", {"file_path": path}, sdk_cwd=SDK_CWD)
    assert _is_denied(result)
 # -- Built-in Bash is blocked (use bash_exec MCP tool instead) ---------------
 def test_bash_builtin_always_blocked():
    """SDK built-in Bash is blocked — bash_exec MCP tool with bubblewrap is used instead."""
    result = _validate_tool_access("Bash", {"command": "echo hello"}, sdk_cwd=SDK_CWD)
    assert _is_denied(result)
 # -- Dangerous patterns ------------------------------------------------------
 def test_dangerous_pattern_blocked():
    result = _validate_tool_access("SomeTool", {"cmd": "sudo rm -rf /"})
    assert _is_denied(result)
 def test_subprocess_pattern_blocked():
    result = _validate_tool_access("SomeTool", {"code": "subprocess.run(...)"})
    assert _is_denied(result)
 # -- User isolation ----------------------------------------------------------
 def test_workspace_path_traversal_blocked():
    result = _validate_user_isolation(
        "workspace_read", {"path": "../../../etc/shadow"}, user_id="user-1"
    )
    assert _is_denied(result)
 def test_workspace_absolute_path_blocked():
    result = _validate_user_isolation(
        "workspace_read", {"path": "/etc/passwd"}, user_id="user-1"
    )
    assert _is_denied(result)
 def test_workspace_normal_path_allowed():
    result = _validate_user_isolation(
        "workspace_read", {"path": "src/main.py"}, user_id="user-1"
    )
    assert result == {}
 def test_non_workspace_tool_passes_isolation():
    result = _validate_user_isolation(
        "find_agent", {"query": "email"}, user_id="user-1"
    )
    assert result == {}
--- a/autogpt_platform/backend/backend/api/features/chat/sdk/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/sdk/service.py
@@ -1,668 +0,0 @@
 """Claude Agent SDK service layer for CoPilot chat completions."""
 import asyncio
 import json
 import logging
 import os
 import uuid
 from collections.abc import AsyncGenerator
 from typing import Any
 from backend.util.exceptions import NotFoundError
 from .. import stream_registry
 from ..config import ChatConfig
 from ..model import (
    ChatMessage,
    ChatSession,
    get_chat_session,
    update_session_title,
    upsert_chat_session,
 )
 from ..response_model import (
    StreamBaseResponse,
    StreamError,
    StreamFinish,
    StreamStart,
    StreamTextDelta,
    StreamToolInputAvailable,
    StreamToolOutputAvailable,
 )
 from ..service import (
    _build_system_prompt,
    _execute_long_running_tool_with_streaming,
    _generate_session_title,
 )
 from ..tools.models import OperationPendingResponse, OperationStartedResponse
 from ..tools.sandbox import WORKSPACE_PREFIX, make_session_path
 from ..tracking import track_user_message
 from .response_adapter import SDKResponseAdapter
 from .security_hooks import create_security_hooks
 from .tool_adapter import (
    COPILOT_TOOL_NAMES,
    LongRunningCallback,
    create_copilot_mcp_server,
    set_execution_context,
 )
 logger = logging.getLogger(__name__)
 config = ChatConfig()
 # Set to hold background tasks to prevent garbage collection
 _background_tasks: set[asyncio.Task[Any]] = set()
 _SDK_CWD_PREFIX = WORKSPACE_PREFIX
 # Appended to the system prompt to inform the agent about available tools.
 # The SDK built-in Bash is NOT available — use mcp__copilot__bash_exec instead,
 # which has kernel-level network isolation (unshare --net).
 _SDK_TOOL_SUPPLEMENT = """
 ## Tool notes
 - The SDK built-in Bash tool is NOT available.  Use the `bash_exec` MCP tool
  for shell commands — it runs in a network-isolated sandbox.
 - **Shared workspace**: The SDK Read/Write tools and `bash_exec` share the
  same working directory. Files created by one are readable by the other.
  These files are **ephemeral** — they exist only for the current session.
 - **Persistent storage**: Use `write_workspace_file` / `read_workspace_file`
  for files that should persist across sessions (stored in cloud storage).
 - Long-running tools (create_agent, edit_agent, etc.) are handled
  asynchronously.  You will receive an immediate response; the actual result
  is delivered to the user via a background stream.
 """
 def _build_long_running_callback(user_id: str | None) -> LongRunningCallback:
    """Build a callback that delegates long-running tools to the non-SDK infrastructure.
    Long-running tools (create_agent, edit_agent, etc.) are delegated to the
    existing background infrastructure: stream_registry (Redis Streams),
    database persistence, and SSE reconnection.  This means results survive
    page refreshes / pod restarts, and the frontend shows the proper loading
    widget with progress updates.
    The returned callback matches the ``LongRunningCallback`` signature:
    ``(tool_name, args, session) -> MCP response dict``.
    """
    async def _callback(
        tool_name: str, args: dict[str, Any], session: ChatSession
    ) -> dict[str, Any]:
        operation_id = str(uuid.uuid4())
        task_id = str(uuid.uuid4())
        tool_call_id = f"sdk-{uuid.uuid4().hex[:12]}"
        session_id = session.session_id
        # --- Build user-friendly messages (matches non-SDK service) ---
        if tool_name == "create_agent":
            desc = args.get("description", "")
            desc_preview = (desc[:100] + "...") if len(desc) > 100 else desc
            pending_msg = (
                f"Creating your agent: {desc_preview}"
                if desc_preview
                else "Creating agent... This may take a few minutes."
            )
            started_msg = (
                "Agent creation started. You can close this tab - "
                "check your library in a few minutes."
            )
        elif tool_name == "edit_agent":
            changes = args.get("changes", "")
            changes_preview = (changes[:100] + "...") if len(changes) > 100 else changes
            pending_msg = (
                f"Editing agent: {changes_preview}"
                if changes_preview
                else "Editing agent... This may take a few minutes."
            )
            started_msg = (
                "Agent edit started. You can close this tab - "
                "check your library in a few minutes."
            )
        else:
            pending_msg = f"Running {tool_name}... This may take a few minutes."
            started_msg = (
                f"{tool_name} started. You can close this tab - "
                "check back in a few minutes."
            )
        # --- Register task in Redis for SSE reconnection ---
        await stream_registry.create_task(
            task_id=task_id,
            session_id=session_id,
            user_id=user_id,
            tool_call_id=tool_call_id,
            tool_name=tool_name,
            operation_id=operation_id,
        )
        # --- Save OperationPendingResponse to chat history ---
        pending_message = ChatMessage(
            role="tool",
            content=OperationPendingResponse(
                message=pending_msg,
                operation_id=operation_id,
                tool_name=tool_name,
            ).model_dump_json(),
            tool_call_id=tool_call_id,
        )
        session.messages.append(pending_message)
        await upsert_chat_session(session)
        # --- Spawn background task (reuses non-SDK infrastructure) ---
        bg_task = asyncio.create_task(
            _execute_long_running_tool_with_streaming(
                tool_name=tool_name,
                parameters=args,
                tool_call_id=tool_call_id,
                operation_id=operation_id,
                task_id=task_id,
                session_id=session_id,
                user_id=user_id,
            )
        )
        _background_tasks.add(bg_task)
        bg_task.add_done_callback(_background_tasks.discard)
        await stream_registry.set_task_asyncio_task(task_id, bg_task)
        logger.info(
            f"[SDK] Long-running tool {tool_name} delegated to background "
            f"(operation_id={operation_id}, task_id={task_id})"
        )
        # --- Return OperationStartedResponse as MCP tool result ---
        # This flows through SDK → response adapter → frontend, triggering
        # the loading widget with SSE reconnection support.
        started_json = OperationStartedResponse(
            message=started_msg,
            operation_id=operation_id,
            tool_name=tool_name,
            task_id=task_id,
        ).model_dump_json()
        return {
            "content": [{"type": "text", "text": started_json}],
            "isError": False,
        }
    return _callback
 def _resolve_sdk_model() -> str | None:
    """Resolve the model name for the Claude Agent SDK CLI.
    Uses ``config.claude_agent_model`` if set, otherwise derives from
    ``config.model`` by stripping the OpenRouter provider prefix (e.g.,
    ``"anthropic/claude-opus-4.6"`` → ``"claude-opus-4.6"``).
    """
    if config.claude_agent_model:
        return config.claude_agent_model
    model = config.model
    if "/" in model:
        return model.split("/", 1)[1]
    return model
 def _build_sdk_env() -> dict[str, str]:
    """Build env vars for the SDK CLI process.
    Routes API calls through OpenRouter (or a custom base_url) using
    the same ``config.api_key`` / ``config.base_url`` as the non-SDK path.
    This gives per-call token and cost tracking on the OpenRouter dashboard.
    Only overrides ``ANTHROPIC_API_KEY`` when a valid proxy URL and auth
    token are both present — otherwise returns an empty dict so the SDK
    falls back to its default credentials.
    """
    env: dict[str, str] = {}
    if config.api_key and config.base_url:
        # Strip /v1 suffix — SDK expects the base URL without a version path
        base = config.base_url.rstrip("/")
        if base.endswith("/v1"):
            base = base[:-3]
        if not base or not base.startswith("http"):
            # Invalid base_url — don't override SDK defaults
            return env
        env["ANTHROPIC_BASE_URL"] = base
        env["ANTHROPIC_AUTH_TOKEN"] = config.api_key
        # Must be explicitly empty so the CLI uses AUTH_TOKEN instead
        env["ANTHROPIC_API_KEY"] = ""
    return env
 def _make_sdk_cwd(session_id: str) -> str:
    """Create a safe, session-specific working directory path.
    Delegates to :func:`~backend.api.features.chat.tools.sandbox.make_session_path`
    (single source of truth for path sanitization) and adds a defence-in-depth
    assertion.
    """
    cwd = make_session_path(session_id)
    # Defence-in-depth: normpath + startswith is a CodeQL-recognised sanitizer
    cwd = os.path.normpath(cwd)
    if not cwd.startswith(_SDK_CWD_PREFIX):
        raise ValueError(f"SDK cwd escaped prefix: {cwd}")
    return cwd
 def _cleanup_sdk_tool_results(cwd: str) -> None:
    """Remove SDK tool-result files for a specific session working directory.
    The SDK creates tool-result files under ~/.claude/projects/<encoded-cwd>/tool-results/.
    We clean only the specific cwd's results to avoid race conditions between
    concurrent sessions.
    Security: cwd MUST be created by _make_sdk_cwd() which sanitizes session_id.
    """
    import shutil
    # Security check 1: Validate cwd is under the expected prefix
    normalized = os.path.normpath(cwd)
    if not normalized.startswith(_SDK_CWD_PREFIX):
        logger.warning(f"[SDK] Rejecting cleanup for invalid path: {cwd}")
        return
    # Security check 2: Ensure no path traversal in the normalized path
    if ".." in normalized:
        logger.warning(f"[SDK] Rejecting cleanup for traversal attempt: {cwd}")
        return
    # SDK encodes the cwd path by replacing '/' with '-'
    encoded_cwd = normalized.replace("/", "-")
    # Construct the project directory path (known-safe home expansion)
    claude_projects = os.path.expanduser("~/.claude/projects")
    project_dir = os.path.join(claude_projects, encoded_cwd)
    # Security check 3: Validate project_dir is under ~/.claude/projects
    project_dir = os.path.normpath(project_dir)
    if not project_dir.startswith(claude_projects):
        logger.warning(
            f"[SDK] Rejecting cleanup for escaped project path: {project_dir}"
        )
        return
    results_dir = os.path.join(project_dir, "tool-results")
    if os.path.isdir(results_dir):
        for filename in os.listdir(results_dir):
            file_path = os.path.join(results_dir, filename)
            try:
                if os.path.isfile(file_path):
                    os.remove(file_path)
            except OSError:
                pass
    # Also clean up the temp cwd directory itself
    try:
        shutil.rmtree(normalized, ignore_errors=True)
    except OSError:
        pass
 async def _compress_conversation_history(
    session: ChatSession,
 ) -> list[ChatMessage]:
    """Compress prior conversation messages if they exceed the token threshold.
    Uses the shared compress_context() from prompt.py which supports:
    - LLM summarization of old messages (keeps recent ones intact)
    - Progressive content truncation as fallback
    - Middle-out deletion as last resort
    Returns the compressed prior messages (everything except the current message).
    """
    prior = session.messages[:-1]
    if len(prior) < 2:
        return prior
    from backend.util.prompt import compress_context
    # Convert ChatMessages to dicts for compress_context
    messages_dict = []
    for msg in prior:
        msg_dict: dict[str, Any] = {"role": msg.role}
        if msg.content:
            msg_dict["content"] = msg.content
        if msg.tool_calls:
            msg_dict["tool_calls"] = msg.tool_calls
        if msg.tool_call_id:
            msg_dict["tool_call_id"] = msg.tool_call_id
        messages_dict.append(msg_dict)
    try:
        import openai
        async with openai.AsyncOpenAI(
            api_key=config.api_key, base_url=config.base_url, timeout=30.0
        ) as client:
            result = await compress_context(
                messages=messages_dict,
                model=config.model,
                client=client,
            )
    except Exception as e:
        logger.warning(f"[SDK] Context compression with LLM failed: {e}")
        # Fall back to truncation-only (no LLM summarization)
        result = await compress_context(
            messages=messages_dict,
            model=config.model,
            client=None,
        )
    if result.was_compacted:
        logger.info(
            f"[SDK] Context compacted: {result.original_token_count} -> "
            f"{result.token_count} tokens "
            f"({result.messages_summarized} summarized, "
            f"{result.messages_dropped} dropped)"
        )
        # Convert compressed dicts back to ChatMessages
        return [
            ChatMessage(
                role=m["role"],
                content=m.get("content"),
                tool_calls=m.get("tool_calls"),
                tool_call_id=m.get("tool_call_id"),
            )
            for m in result.messages
        ]
    return prior
 def _format_conversation_context(messages: list[ChatMessage]) -> str | None:
    """Format conversation messages into a context prefix for the user message.
    Returns a string like:
        <conversation_history>
        User: hello
        You responded: Hi! How can I help?
        </conversation_history>
    Returns None if there are no messages to format.
    """
    if not messages:
        return None
    lines: list[str] = []
    for msg in messages:
        if not msg.content:
            continue
        if msg.role == "user":
            lines.append(f"User: {msg.content}")
        elif msg.role == "assistant":
            lines.append(f"You responded: {msg.content}")
        # Skip tool messages — they're internal details
    if not lines:
        return None
    return "<conversation_history>\n" + "\n".join(lines) + "\n</conversation_history>"
 async def stream_chat_completion_sdk(
    session_id: str,
    message: str | None = None,
    tool_call_response: str | None = None,  # noqa: ARG001
    is_user_message: bool = True,
    user_id: str | None = None,
    retry_count: int = 0,  # noqa: ARG001
    session: ChatSession | None = None,
    context: dict[str, str] | None = None,  # noqa: ARG001
 ) -> AsyncGenerator[StreamBaseResponse, None]:
    """Stream chat completion using Claude Agent SDK.
    Drop-in replacement for stream_chat_completion with improved reliability.
    """
    if session is None:
        session = await get_chat_session(session_id, user_id)
    if not session:
        raise NotFoundError(
            f"Session {session_id} not found. Please create a new session first."
        )
    if message:
        session.messages.append(
            ChatMessage(
                role="user" if is_user_message else "assistant", content=message
            )
        )
        if is_user_message:
            track_user_message(
                user_id=user_id, session_id=session_id, message_length=len(message)
            )
    session = await upsert_chat_session(session)
    # Generate title for new sessions (first user message)
    if is_user_message and not session.title:
        user_messages = [m for m in session.messages if m.role == "user"]
        if len(user_messages) == 1:
            first_message = user_messages[0].content or message or ""
            if first_message:
                task = asyncio.create_task(
                    _update_title_async(session_id, first_message, user_id)
                )
                _background_tasks.add(task)
                task.add_done_callback(_background_tasks.discard)
    # Build system prompt (reuses non-SDK path with Langfuse support)
    has_history = len(session.messages) > 1
    system_prompt, _ = await _build_system_prompt(
        user_id, has_conversation_history=has_history
    )
    system_prompt += _SDK_TOOL_SUPPLEMENT
    message_id = str(uuid.uuid4())
    task_id = str(uuid.uuid4())
    yield StreamStart(messageId=message_id, taskId=task_id)
    stream_completed = False
    # Initialise sdk_cwd before the try so the finally can reference it
    # even if _make_sdk_cwd raises (in that case it stays as "").
    sdk_cwd = ""
    try:
        # Use a session-specific temp dir to avoid cleanup race conditions
        # between concurrent sessions.
        sdk_cwd = _make_sdk_cwd(session_id)
        os.makedirs(sdk_cwd, exist_ok=True)
        set_execution_context(
            user_id,
            session,
            long_running_callback=_build_long_running_callback(user_id),
        )
        try:
            from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
            # Fail fast when no API credentials are available at all
            sdk_env = _build_sdk_env()
            if not sdk_env and not os.environ.get("ANTHROPIC_API_KEY"):
                raise RuntimeError(
                    "No API key configured. Set OPEN_ROUTER_API_KEY "
                    "(or CHAT_API_KEY) for OpenRouter routing, "
                    "or ANTHROPIC_API_KEY for direct Anthropic access."
                )
            mcp_server = create_copilot_mcp_server()
            sdk_model = _resolve_sdk_model()
            security_hooks = create_security_hooks(
                user_id,
                sdk_cwd=sdk_cwd,
                max_subtasks=config.claude_agent_max_subtasks,
            )
            options = ClaudeAgentOptions(
                system_prompt=system_prompt,
                mcp_servers={"copilot": mcp_server},  # type: ignore[arg-type]
                allowed_tools=COPILOT_TOOL_NAMES,
                hooks=security_hooks,  # type: ignore[arg-type]
                cwd=sdk_cwd,
                max_buffer_size=config.claude_agent_max_buffer_size,
                # Only pass model/env when OpenRouter is configured
                **({"model": sdk_model, "env": sdk_env} if sdk_env else {}),
            )
            adapter = SDKResponseAdapter(message_id=message_id)
            adapter.set_task_id(task_id)
            async with ClaudeSDKClient(options=options) as client:
                current_message = message or ""
                if not current_message and session.messages:
                    last_user = [m for m in session.messages if m.role == "user"]
                    if last_user:
                        current_message = last_user[-1].content or ""
                if not current_message.strip():
                    yield StreamError(
                        errorText="Message cannot be empty.",
                        code="empty_prompt",
                    )
                    yield StreamFinish()
                    return
                # Build query with conversation history context.
                # Compress history first to handle long conversations.
                query_message = current_message
                if len(session.messages) > 1:
                    compressed = await _compress_conversation_history(session)
                    history_context = _format_conversation_context(compressed)
                    if history_context:
                        query_message = (
                            f"{history_context}\n\n"
                            f"Now, the user says:\n{current_message}"
                        )
                logger.info(
                    f"[SDK] Sending query: {current_message[:80]!r}"
                    f" ({len(session.messages)} msgs in session)"
                )
                await client.query(query_message, session_id=session_id)
                assistant_response = ChatMessage(role="assistant", content="")
                accumulated_tool_calls: list[dict[str, Any]] = []
                has_appended_assistant = False
                has_tool_results = False
                async for sdk_msg in client.receive_messages():
                    logger.debug(
                        f"[SDK] Received: {type(sdk_msg).__name__} "
                        f"{getattr(sdk_msg, 'subtype', '')}"
                    )
                    for response in adapter.convert_message(sdk_msg):
                        if isinstance(response, StreamStart):
                            continue
                        yield response
                        if isinstance(response, StreamTextDelta):
                            delta = response.delta or ""
                            # After tool results, start a new assistant
                            # message for the post-tool text.
                            if has_tool_results and has_appended_assistant:
                                assistant_response = ChatMessage(
                                    role="assistant", content=delta
                                )
                                accumulated_tool_calls = []
                                has_appended_assistant = False
                                has_tool_results = False
                                session.messages.append(assistant_response)
                                has_appended_assistant = True
                            else:
                                assistant_response.content = (
                                    assistant_response.content or ""
                                ) + delta
                                if not has_appended_assistant:
                                    session.messages.append(assistant_response)
                                    has_appended_assistant = True
                        elif isinstance(response, StreamToolInputAvailable):
                            accumulated_tool_calls.append(
                                {
                                    "id": response.toolCallId,
                                    "type": "function",
                                    "function": {
                                        "name": response.toolName,
                                        "arguments": json.dumps(response.input or {}),
                                    },
                                }
                            )
                            assistant_response.tool_calls = accumulated_tool_calls
                            if not has_appended_assistant:
                                session.messages.append(assistant_response)
                                has_appended_assistant = True
                        elif isinstance(response, StreamToolOutputAvailable):
                            session.messages.append(
                                ChatMessage(
                                    role="tool",
                                    content=(
                                        response.output
                                        if isinstance(response.output, str)
                                        else str(response.output)
                                    ),
                                    tool_call_id=response.toolCallId,
                                )
                            )
                            has_tool_results = True
                        elif isinstance(response, StreamFinish):
                            stream_completed = True
                    if stream_completed:
                        break
                if (
                    assistant_response.content or assistant_response.tool_calls
                ) and not has_appended_assistant:
                    session.messages.append(assistant_response)
        except ImportError:
            raise RuntimeError(
                "claude-agent-sdk is not installed. "
                "Disable SDK mode (CHAT_USE_CLAUDE_AGENT_SDK=false) "
                "to use the OpenAI-compatible fallback."
            )
        await upsert_chat_session(session)
        logger.debug(
            f"[SDK] Session {session_id} saved with {len(session.messages)} messages"
        )
        if not stream_completed:
            yield StreamFinish()
    except Exception as e:
        logger.error(f"[SDK] Error: {e}", exc_info=True)
        try:
            await upsert_chat_session(session)
        except Exception as save_err:
            logger.error(f"[SDK] Failed to save session on error: {save_err}")
        yield StreamError(
            errorText="An error occurred. Please try again.",
            code="sdk_error",
        )
        yield StreamFinish()
    finally:
        if sdk_cwd:
            _cleanup_sdk_tool_results(sdk_cwd)
 async def _update_title_async(
    session_id: str, message: str, user_id: str | None = None
 ) -> None:
    """Background task to update session title."""
    try:
        title = await _generate_session_title(
            message, user_id=user_id, session_id=session_id
        )
        if title:
            await update_session_title(session_id, title)
            logger.debug(f"[SDK] Generated title for {session_id}: {title}")
    except Exception as e:
        logger.warning(f"[SDK] Failed to update session title: {e}")
--- a/autogpt_platform/backend/backend/api/features/chat/sdk/tool_adapter.py
+++ b/autogpt_platform/backend/backend/api/features/chat/sdk/tool_adapter.py
@@ -1,320 +0,0 @@
 """Tool adapter for wrapping existing CoPilot tools as Claude Agent SDK MCP tools.
 This module provides the adapter layer that converts existing BaseTool implementations
 into in-process MCP tools that can be used with the Claude Agent SDK.
 Long-running tools (``is_long_running=True``) are delegated to the non-SDK
 background infrastructure (stream_registry, Redis persistence, SSE reconnection)
 via a callback provided by the service layer.  This avoids wasteful SDK polling
 and makes results survive page refreshes.
 """
 import json
 import logging
 import os
 import uuid
 from collections.abc import Awaitable, Callable
 from contextvars import ContextVar
 from typing import Any
 from backend.api.features.chat.model import ChatSession
 from backend.api.features.chat.tools import TOOL_REGISTRY
 from backend.api.features.chat.tools.base import BaseTool
 logger = logging.getLogger(__name__)
 # Allowed base directory for the Read tool (SDK saves oversized tool results here).
 # Restricted to ~/.claude/projects/ and further validated to require "tool-results"
 # in the path — prevents reading settings, credentials, or other sensitive files.
 _SDK_PROJECTS_DIR = os.path.expanduser("~/.claude/projects/")
 # MCP server naming - the SDK prefixes tool names as "mcp__{server_name}__{tool}"
 MCP_SERVER_NAME = "copilot"
 MCP_TOOL_PREFIX = f"mcp__{MCP_SERVER_NAME}__"
 # Context variables to pass user/session info to tool execution
 _current_user_id: ContextVar[str | None] = ContextVar("current_user_id", default=None)
 _current_session: ContextVar[ChatSession | None] = ContextVar(
    "current_session", default=None
 )
 # Stash for MCP tool outputs before the SDK potentially truncates them.
 # Keyed by tool_name → full output string. Consumed (popped) by the
 # response adapter when it builds StreamToolOutputAvailable.
 _pending_tool_outputs: ContextVar[dict[str, str]] = ContextVar(
    "pending_tool_outputs", default=None  # type: ignore[arg-type]
 )
 # Callback type for delegating long-running tools to the non-SDK infrastructure.
 # Args: (tool_name, arguments, session) → MCP-formatted response dict.
 LongRunningCallback = Callable[
    [str, dict[str, Any], ChatSession], Awaitable[dict[str, Any]]
 ]
 # ContextVar so the service layer can inject the callback per-request.
 _long_running_callback: ContextVar[LongRunningCallback | None] = ContextVar(
    "long_running_callback", default=None
 )
 def set_execution_context(
    user_id: str | None,
    session: ChatSession,
    long_running_callback: LongRunningCallback | None = None,
 ) -> None:
    """Set the execution context for tool calls.
    This must be called before streaming begins to ensure tools have access
    to user_id and session information.
    Args:
        user_id: Current user's ID.
        session: Current chat session.
        long_running_callback: Optional callback to delegate long-running tools
            to the non-SDK background infrastructure (stream_registry + Redis).
    """
    _current_user_id.set(user_id)
    _current_session.set(session)
    _pending_tool_outputs.set({})
    _long_running_callback.set(long_running_callback)
 def get_execution_context() -> tuple[str | None, ChatSession | None]:
    """Get the current execution context."""
    return (
        _current_user_id.get(),
        _current_session.get(),
    )
 def pop_pending_tool_output(tool_name: str) -> str | None:
    """Pop and return the stashed full output for *tool_name*.
    The SDK CLI may truncate large tool results (writing them to disk and
    replacing the content with a file reference). This stash keeps the
    original MCP output so the response adapter can forward it to the
    frontend for proper widget rendering.
    Returns ``None`` if nothing was stashed for *tool_name*.
    """
    pending = _pending_tool_outputs.get(None)
    if pending is None:
        return None
    return pending.pop(tool_name, None)
 async def _execute_tool_sync(
    base_tool: BaseTool,
    user_id: str | None,
    session: ChatSession,
    args: dict[str, Any],
 ) -> dict[str, Any]:
    """Execute a tool synchronously and return MCP-formatted response."""
    effective_id = f"sdk-{uuid.uuid4().hex[:12]}"
    result = await base_tool.execute(
        user_id=user_id,
        session=session,
        tool_call_id=effective_id,
        **args,
    )
    text = (
        result.output if isinstance(result.output, str) else json.dumps(result.output)
    )
    # Stash the full output before the SDK potentially truncates it.
    pending = _pending_tool_outputs.get(None)
    if pending is not None:
        pending[base_tool.name] = text
    return {
        "content": [{"type": "text", "text": text}],
        "isError": not result.success,
    }
 def _mcp_error(message: str) -> dict[str, Any]:
    return {
        "content": [
            {"type": "text", "text": json.dumps({"error": message, "type": "error"})}
        ],
        "isError": True,
    }
 def create_tool_handler(base_tool: BaseTool):
    """Create an async handler function for a BaseTool.
    This wraps the existing BaseTool._execute method to be compatible
    with the Claude Agent SDK MCP tool format.
    Long-running tools (``is_long_running=True``) are delegated to the
    non-SDK background infrastructure via a callback set in the execution
    context.  The callback persists the operation in Redis (stream_registry)
    so results survive page refreshes and pod restarts.
    """
    async def tool_handler(args: dict[str, Any]) -> dict[str, Any]:
        """Execute the wrapped tool and return MCP-formatted response."""
        user_id, session = get_execution_context()
        if session is None:
            return _mcp_error("No session context available")
        # --- Long-running: delegate to non-SDK background infrastructure ---
        if base_tool.is_long_running:
            callback = _long_running_callback.get(None)
            if callback:
                try:
                    return await callback(base_tool.name, args, session)
                except Exception as e:
                    logger.error(
                        f"Long-running callback failed for {base_tool.name}: {e}",
                        exc_info=True,
                    )
                    return _mcp_error(f"Failed to start {base_tool.name}: {e}")
            # No callback — fall through to synchronous execution
            logger.warning(
                f"[SDK] No long-running callback for {base_tool.name}, "
                f"executing synchronously (may block)"
            )
        # --- Normal (fast) tool: execute synchronously ---
        try:
            return await _execute_tool_sync(base_tool, user_id, session, args)
        except Exception as e:
            logger.error(f"Error executing tool {base_tool.name}: {e}", exc_info=True)
            return _mcp_error(f"Failed to execute {base_tool.name}: {e}")
    return tool_handler
 def _build_input_schema(base_tool: BaseTool) -> dict[str, Any]:
    """Build a JSON Schema input schema for a tool."""
    return {
        "type": "object",
        "properties": base_tool.parameters.get("properties", {}),
        "required": base_tool.parameters.get("required", []),
    }
 async def _read_file_handler(args: dict[str, Any]) -> dict[str, Any]:
    """Read a file with optional offset/limit. Restricted to SDK working directory.
    After reading, the file is deleted to prevent accumulation in long-running pods.
    """
    file_path = args.get("file_path", "")
    offset = args.get("offset", 0)
    limit = args.get("limit", 2000)
    # Security: only allow reads under ~/.claude/projects/**/tool-results/
    real_path = os.path.realpath(file_path)
    if not real_path.startswith(_SDK_PROJECTS_DIR) or "tool-results" not in real_path:
        return {
            "content": [{"type": "text", "text": f"Access denied: {file_path}"}],
            "isError": True,
        }
    try:
        with open(real_path) as f:
            lines = f.readlines()
        selected = lines[offset : offset + limit]
        content = "".join(selected)
        return {"content": [{"type": "text", "text": content}], "isError": False}
    except FileNotFoundError:
        return {
            "content": [{"type": "text", "text": f"File not found: {file_path}"}],
            "isError": True,
        }
    except Exception as e:
        return {
            "content": [{"type": "text", "text": f"Error reading file: {e}"}],
            "isError": True,
        }
 _READ_TOOL_NAME = "Read"
 _READ_TOOL_DESCRIPTION = (
    "Read a file from the local filesystem. "
    "Use offset and limit to read specific line ranges for large files."
 )
 _READ_TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        "file_path": {
            "type": "string",
            "description": "The absolute path to the file to read",
        },
        "offset": {
            "type": "integer",
            "description": "Line number to start reading from (0-indexed). Default: 0",
        },
        "limit": {
            "type": "integer",
            "description": "Number of lines to read. Default: 2000",
        },
    },
    "required": ["file_path"],
 }
 # Create the MCP server configuration
 def create_copilot_mcp_server():
    """Create an in-process MCP server configuration for CoPilot tools.
    This can be passed to ClaudeAgentOptions.mcp_servers.
    Note: The actual SDK MCP server creation depends on the claude-agent-sdk
    package being available. This function returns the configuration that
    can be used with the SDK.
    """
    try:
        from claude_agent_sdk import create_sdk_mcp_server, tool
        # Create decorated tool functions
        sdk_tools = []
        for tool_name, base_tool in TOOL_REGISTRY.items():
            handler = create_tool_handler(base_tool)
            decorated = tool(
                tool_name,
                base_tool.description,
                _build_input_schema(base_tool),
            )(handler)
            sdk_tools.append(decorated)
        # Add the Read tool so the SDK can read back oversized tool results
        read_tool = tool(
            _READ_TOOL_NAME,
            _READ_TOOL_DESCRIPTION,
            _READ_TOOL_SCHEMA,
        )(_read_file_handler)
        sdk_tools.append(read_tool)
        server = create_sdk_mcp_server(
            name=MCP_SERVER_NAME,
            version="1.0.0",
            tools=sdk_tools,
        )
        return server
    except ImportError:
        # Let ImportError propagate so service.py handles the fallback
        raise
 # SDK built-in tools allowed within the workspace directory.
 # Security hooks validate that file paths stay within sdk_cwd.
 # Bash is NOT included — use the sandboxed MCP bash_exec tool instead,
 # which provides kernel-level network isolation via unshare --net.
 # Task allows spawning sub-agents (rate-limited by security hooks).
 _SDK_BUILTIN_TOOLS = ["Read", "Write", "Edit", "Glob", "Grep", "Task"]
 # List of tool names for allowed_tools configuration
 # Include MCP tools, the MCP Read tool for oversized results,
 # and SDK built-in file tools for workspace operations.
 COPILOT_TOOL_NAMES = [
    *[f"{MCP_TOOL_PREFIX}{name}" for name in TOOL_REGISTRY.keys()],
    f"{MCP_TOOL_PREFIX}{_READ_TOOL_NAME}",
    *_SDK_BUILTIN_TOOLS,
 ]
--- a/autogpt_platform/backend/backend/api/features/chat/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service.py
@@ -245,16 +245,12 @@ async def _get_system_prompt_template(context: str) -> str:
    return DEFAULT_SYSTEM_PROMPT.format(users_information=context)
-async def _build_system_prompt(
+async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
    user_id: str | None, has_conversation_history: bool = False
 ) -> tuple[str, Any]:
    """Build the full system prompt including business understanding if available.
    Args:
-        user_id: The user ID for fetching business understanding.
+        user_id: The user ID for fetching business understanding
-        has_conversation_history: Whether there's existing conversation history.
+                     If "default" and this is the user's first session, will use "onboarding" instead.
            If True, we don't tell the model to greet/introduce (since they're
            already in a conversation).
    Returns:
        Tuple of (compiled prompt string, business understanding object)
@@ -270,8 +266,6 @@ async def _build_system_prompt(
    if understanding:
        context = format_understanding_for_prompt(understanding)
    elif has_conversation_history:
        context = "No prior understanding saved yet. Continue the existing conversation naturally."
    else:
        context = "This is the first time you are meeting the user. Greet them and introduce them to the platform"
@@ -380,6 +374,7 @@ async def stream_chat_completion(
    Raises:
        NotFoundError: If session_id is invalid
        ValueError: If max_context_messages is exceeded
    """
    completion_start = time.monotonic()
@@ -464,9 +459,8 @@ async def stream_chat_completion(
    # Generate title for new sessions on first user message (non-blocking)
    # Check: is_user_message, no title yet, and this is the first user message
-    user_messages = [m for m in session.messages if m.role == "user"]
+    if is_user_message and message and not session.title:
-    first_user_msg = message or (user_messages[0].content if user_messages else None)
+        user_messages = [m for m in session.messages if m.role == "user"]
    if is_user_message and first_user_msg and not session.title:
        if len(user_messages) == 1:
            # First user message - generate title in background
            import asyncio
@@ -474,7 +468,7 @@ async def stream_chat_completion(
            # Capture only the values we need (not the session object) to avoid
            # stale data issues when the main flow modifies the session
            captured_session_id = session_id
-            captured_message = first_user_msg
+            captured_message = message
            captured_user_id = user_id
            async def _update_title():
@@ -1243,7 +1237,7 @@ async def _stream_chat_chunks(
                total_time = (time_module.perf_counter() - stream_chunks_start) * 1000
                logger.info(
-                    f"[TIMING] _stream_chat_chunks COMPLETED in {total_time / 1000:.1f}s; "
+                    f"[TIMING] _stream_chat_chunks COMPLETED in {total_time/1000:.1f}s; "
                    f"session={session.session_id}, user={session.user_id}",
                    extra={"json_fields": {**log_meta, "total_time_ms": total_time}},
                )
--- a/autogpt_platform/backend/backend/api/features/chat/stream_registry.py
+++ b/autogpt_platform/backend/backend/api/features/chat/stream_registry.py
@@ -814,28 +814,6 @@ async def get_active_task_for_session(
                if task_user_id and user_id != task_user_id:
                    continue
                # Auto-expire stale tasks that exceeded stream_timeout
                created_at_str = meta.get("created_at", "")
                if created_at_str:
                    try:
                        created_at = datetime.fromisoformat(created_at_str)
                        age_seconds = (
                            datetime.now(timezone.utc) - created_at
                        ).total_seconds()
                        if age_seconds > config.stream_timeout:
                            logger.warning(
                                f"[TASK_LOOKUP] Auto-expiring stale task {task_id[:8]}... "
                                f"(age={age_seconds:.0f}s > timeout={config.stream_timeout}s)"
                            )
                            await mark_task_completed(task_id, "failed")
                            continue
                    except (ValueError, TypeError):
                        pass
                logger.info(
                    f"[TASK_LOOKUP] Found running task {task_id[:8]}... for session {session_id[:8]}..."
                )
                # Get the last message ID from Redis Stream
                stream_key = _get_task_stream_key(task_id)
                last_id = "0-0"
--- a/autogpt_platform/backend/backend/api/features/chat/tools/init.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/init.py
@@ -9,8 +9,6 @@ from backend.api.features.chat.tracking import track_tool_called
 from .add_understanding import AddUnderstandingTool
 from .agent_output import AgentOutputTool
 from .base import BaseTool
 from .bash_exec import BashExecTool
 from .check_operation_status import CheckOperationStatusTool
 from .create_agent import CreateAgentTool
 from .customize_agent import CustomizeAgentTool
 from .edit_agent import EditAgentTool
@@ -21,7 +19,6 @@ from .get_doc_page import GetDocPageTool
 from .run_agent import RunAgentTool
 from .run_block import RunBlockTool
 from .search_docs import SearchDocsTool
 from .web_fetch import WebFetchTool
 from .workspace_files import (
    DeleteWorkspaceFileTool,
    ListWorkspaceFilesTool,
@@ -46,14 +43,9 @@ TOOL_REGISTRY: dict[str, BaseTool] = {
    "run_agent": RunAgentTool(),
    "run_block": RunBlockTool(),
    "view_agent_output": AgentOutputTool(),
    "check_operation_status": CheckOperationStatusTool(),
    "search_docs": SearchDocsTool(),
    "get_doc_page": GetDocPageTool(),
-    # Web fetch for safe URL retrieval
+    # Workspace tools for CoPilot file operations
    "web_fetch": WebFetchTool(),
    # Sandboxed code execution (bubblewrap)
    "bash_exec": BashExecTool(),
    # Persistent workspace tools (cloud storage, survives across sessions)
    "list_workspace_files": ListWorkspaceFilesTool(),
    "read_workspace_file": ReadWorkspaceFileTool(),
    "write_workspace_file": WriteWorkspaceFileTool(),
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/dummy.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/dummy.py
@@ -0,0 +1,154 @@
 """Dummy Agent Generator for testing.
 Returns mock responses matching the format expected from the external service.
 Enable via AGENTGENERATOR_USE_DUMMY=true in settings.
 WARNING: This is for testing only. Do not use in production.
 """
 import asyncio
 import logging
 import uuid
 from typing import Any
 logger = logging.getLogger(__name__)
 # Dummy decomposition result (instructions type)
 DUMMY_DECOMPOSITION_RESULT: dict[str, Any] = {
    "type": "instructions",
    "steps": [
        {
            "description": "Get input from user",
            "action": "input",
            "block_name": "AgentInputBlock",
        },
        {
            "description": "Process the input",
            "action": "process",
            "block_name": "TextFormatterBlock",
        },
        {
            "description": "Return output to user",
            "action": "output",
            "block_name": "AgentOutputBlock",
        },
    ],
 }
 # Block IDs from backend/blocks/io.py
 AGENT_INPUT_BLOCK_ID = "c0a8e994-ebf1-4a9c-a4d8-89d09c86741b"
 AGENT_OUTPUT_BLOCK_ID = "363ae599-353e-4804-937e-b2ee3cef3da4"
 def _generate_dummy_agent_json() -> dict[str, Any]:
    """Generate a minimal valid agent JSON for testing."""
    input_node_id = str(uuid.uuid4())
    output_node_id = str(uuid.uuid4())
    return {
        "id": str(uuid.uuid4()),
        "version": 1,
        "is_active": True,
        "name": "Dummy Test Agent",
        "description": "A dummy agent generated for testing purposes",
        "nodes": [
            {
                "id": input_node_id,
                "block_id": AGENT_INPUT_BLOCK_ID,
                "input_default": {
                    "name": "input",
                    "title": "Input",
                    "description": "Enter your input",
                    "placeholder_values": [],
                },
                "metadata": {"position": {"x": 0, "y": 0}},
            },
            {
                "id": output_node_id,
                "block_id": AGENT_OUTPUT_BLOCK_ID,
                "input_default": {
                    "name": "output",
                    "title": "Output",
                    "description": "Agent output",
                    "format": "{output}",
                },
                "metadata": {"position": {"x": 400, "y": 0}},
            },
        ],
        "links": [
            {
                "id": str(uuid.uuid4()),
                "source_id": input_node_id,
                "sink_id": output_node_id,
                "source_name": "result",
                "sink_name": "value",
                "is_static": False,
            },
        ],
    }
 async def decompose_goal_dummy(
    description: str,
    context: str = "",
    library_agents: list[dict[str, Any]] | None = None,
 ) -> dict[str, Any]:
    """Return dummy decomposition result."""
    logger.info("Using dummy agent generator for decompose_goal")
    return DUMMY_DECOMPOSITION_RESULT.copy()
 async def generate_agent_dummy(
    instructions: dict[str, Any],
    library_agents: list[dict[str, Any]] | None = None,
    operation_id: str | None = None,
    task_id: str | None = None,
 ) -> dict[str, Any]:
    """Return dummy agent JSON after a simulated delay."""
    logger.info("Using dummy agent generator for generate_agent (30s delay)")
    await asyncio.sleep(30)
    return _generate_dummy_agent_json()
 async def generate_agent_patch_dummy(
    update_request: str,
    current_agent: dict[str, Any],
    library_agents: list[dict[str, Any]] | None = None,
    operation_id: str | None = None,
    task_id: str | None = None,
 ) -> dict[str, Any]:
    """Return dummy patched agent (returns the current agent with updated description)."""
    logger.info("Using dummy agent generator for generate_agent_patch")
    patched = current_agent.copy()
    patched["description"] = (
        f"{current_agent.get('description', '')} (updated: {update_request})"
    )
    return patched
 async def customize_template_dummy(
    template_agent: dict[str, Any],
    modification_request: str,
    context: str = "",
 ) -> dict[str, Any]:
    """Return dummy customized template (returns template with updated description)."""
    logger.info("Using dummy agent generator for customize_template")
    customized = template_agent.copy()
    customized["description"] = (
        f"{template_agent.get('description', '')} (customized: {modification_request})"
    )
    return customized
 async def get_blocks_dummy() -> list[dict[str, Any]]:
    """Return dummy blocks list."""
    logger.info("Using dummy agent generator for get_blocks")
    return [
        {"id": AGENT_INPUT_BLOCK_ID, "name": "AgentInputBlock"},
        {"id": AGENT_OUTPUT_BLOCK_ID, "name": "AgentOutputBlock"},
    ]
 async def health_check_dummy() -> bool:
    """Always returns healthy for dummy service."""
    return True
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/service.py
@@ -12,8 +12,19 @@ import httpx
 from backend.util.settings import Settings
 from .dummy import (
    customize_template_dummy,
    decompose_goal_dummy,
    generate_agent_dummy,
    generate_agent_patch_dummy,
    get_blocks_dummy,
    health_check_dummy,
 )
 logger = logging.getLogger(__name__)
 _dummy_mode_warned = False
 def _create_error_response(
    error_message: str,
@@ -90,10 +101,26 @@ def _get_settings() -> Settings:
    return _settings
-def is_external_service_configured() -> bool:
+def _is_dummy_mode() -> bool:
-    """Check if external Agent Generator service is configured."""
+    """Check if dummy mode is enabled for testing."""
    global _dummy_mode_warned
    settings = _get_settings()
-    return bool(settings.config.agentgenerator_host)
+    is_dummy = bool(settings.config.agentgenerator_use_dummy)
    if is_dummy and not _dummy_mode_warned:
        logger.warning(
            "Agent Generator running in DUMMY MODE - returning mock responses. "
            "Do not use in production!"
        )
        _dummy_mode_warned = True
    return is_dummy
 def is_external_service_configured() -> bool:
    """Check if external Agent Generator service is configured (or dummy mode)."""
    settings = _get_settings()
    return bool(settings.config.agentgenerator_host) or bool(
        settings.config.agentgenerator_use_dummy
    )
 def _get_base_url() -> str:
@@ -137,6 +164,9 @@ async def decompose_goal_external(
        - {"type": "error", "error": "...", "error_type": "..."} on error
        Or None on unexpected error
    """
    if _is_dummy_mode():
        return await decompose_goal_dummy(description, context, library_agents)
    client = _get_client()
    if context:
@@ -226,6 +256,11 @@ async def generate_agent_external(
    Returns:
        Agent JSON dict, {"status": "accepted"} for async, or error dict {"type": "error", ...} on error
    """
    if _is_dummy_mode():
        return await generate_agent_dummy(
            instructions, library_agents, operation_id, task_id
        )
    client = _get_client()
    # Build request payload
@@ -297,6 +332,11 @@ async def generate_agent_patch_external(
    Returns:
        Updated agent JSON, clarifying questions dict, {"status": "accepted"} for async, or error dict on error
    """
    if _is_dummy_mode():
        return await generate_agent_patch_dummy(
            update_request, current_agent, library_agents, operation_id, task_id
        )
    client = _get_client()
    # Build request payload
@@ -383,6 +423,11 @@ async def customize_template_external(
    Returns:
        Customized agent JSON, clarifying questions dict, or error dict on error
    """
    if _is_dummy_mode():
        return await customize_template_dummy(
            template_agent, modification_request, context
        )
    client = _get_client()
    request = modification_request
@@ -445,6 +490,9 @@ async def get_blocks_external() -> list[dict[str, Any]] | None:
    Returns:
        List of block info dicts or None on error
    """
    if _is_dummy_mode():
        return await get_blocks_dummy()
    client = _get_client()
    try:
@@ -478,6 +526,9 @@ async def health_check() -> bool:
    if not is_external_service_configured():
        return False
    if _is_dummy_mode():
        return await health_check_dummy()
    client = _get_client()
    try:
--- a/autogpt_platform/backend/backend/api/features/chat/tools/bash_exec.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/bash_exec.py
@@ -1,131 +0,0 @@
 """Bash execution tool — run shell commands in a bubblewrap sandbox.
 Full Bash scripting is allowed (loops, conditionals, pipes, functions, etc.).
 Safety comes from OS-level isolation (bubblewrap): only system dirs visible
 read-only, writable workspace only, clean env, no network.
 Requires bubblewrap (``bwrap``) — the tool is disabled when bwrap is not
 available (e.g. macOS development).
 """
 import logging
 from typing import Any
 from backend.api.features.chat.model import ChatSession
 from backend.api.features.chat.tools.base import BaseTool
 from backend.api.features.chat.tools.models import (
    BashExecResponse,
    ErrorResponse,
    ToolResponseBase,
 )
 from backend.api.features.chat.tools.sandbox import (
    get_workspace_dir,
    has_full_sandbox,
    run_sandboxed,
 )
 logger = logging.getLogger(__name__)
 class BashExecTool(BaseTool):
    """Execute Bash commands in a bubblewrap sandbox."""
    @property
    def name(self) -> str:
        return "bash_exec"
    @property
    def description(self) -> str:
        if not has_full_sandbox():
            return (
                "Bash execution is DISABLED — bubblewrap sandbox is not "
                "available on this platform. Do not call this tool."
            )
        return (
            "Execute a Bash command or script in a bubblewrap sandbox. "
            "Full Bash scripting is supported (loops, conditionals, pipes, "
            "functions, etc.). "
            "The sandbox shares the same working directory as the SDK Read/Write "
            "tools — files created by either are accessible to both. "
            "SECURITY: Only system directories (/usr, /bin, /lib, /etc) are "
            "visible read-only, the per-session workspace is the only writable "
            "path, environment variables are wiped (no secrets), all network "
            "access is blocked at the kernel level, and resource limits are "
            "enforced (max 64 processes, 512MB memory, 50MB file size). "
            "Application code, configs, and other directories are NOT accessible. "
            "To fetch web content, use the web_fetch tool instead. "
            "Execution is killed after the timeout (default 30s, max 120s). "
            "Returns stdout and stderr. "
            "Useful for file manipulation, data processing with Unix tools "
            "(grep, awk, sed, jq, etc.), and running shell scripts."
        )
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "Bash command or script to execute.",
                },
                "timeout": {
                    "type": "integer",
                    "description": (
                        "Max execution time in seconds (default 30, max 120)."
                    ),
                    "default": 30,
                },
            },
            "required": ["command"],
        }
    @property
    def requires_auth(self) -> bool:
        return False
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs: Any,
    ) -> ToolResponseBase:
        session_id = session.session_id if session else None
        if not has_full_sandbox():
            return ErrorResponse(
                message="bash_exec requires bubblewrap sandbox (Linux only).",
                error="sandbox_unavailable",
                session_id=session_id,
            )
        command: str = (kwargs.get("command") or "").strip()
        timeout: int = kwargs.get("timeout", 30)
        if not command:
            return ErrorResponse(
                message="No command provided.",
                error="empty_command",
                session_id=session_id,
            )
        workspace = get_workspace_dir(session_id or "default")
        stdout, stderr, exit_code, timed_out = await run_sandboxed(
            command=["bash", "-c", command],
            cwd=workspace,
            timeout=timeout,
        )
        return BashExecResponse(
            message=(
                "Execution timed out"
                if timed_out
                else f"Command executed (exit {exit_code})"
            ),
            stdout=stdout,
            stderr=stderr,
            exit_code=exit_code,
            timed_out=timed_out,
            session_id=session_id,
        )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/check_operation_status.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/check_operation_status.py
@@ -1,127 +0,0 @@
 """CheckOperationStatusTool — query the status of a long-running operation."""
 import logging
 from typing import Any
 from backend.api.features.chat.model import ChatSession
 from backend.api.features.chat.tools.base import BaseTool
 from backend.api.features.chat.tools.models import (
    ErrorResponse,
    ResponseType,
    ToolResponseBase,
 )
 logger = logging.getLogger(__name__)
 class OperationStatusResponse(ToolResponseBase):
    """Response for check_operation_status tool."""
    type: ResponseType = ResponseType.OPERATION_STATUS
    task_id: str
    operation_id: str
    status: str  # "running", "completed", "failed"
    tool_name: str | None = None
    message: str = ""
 class CheckOperationStatusTool(BaseTool):
    """Check the status of a long-running operation (create_agent, edit_agent, etc.).
    The CoPilot uses this tool to report back to the user whether an
    operation that was started earlier has completed, failed, or is still
    running.
    """
    @property
    def name(self) -> str:
        return "check_operation_status"
    @property
    def description(self) -> str:
        return (
            "Check the current status of a long-running operation such as "
            "create_agent or edit_agent. Accepts either an operation_id or "
            "task_id from a previous operation_started response. "
            "Returns the current status: running, completed, or failed."
        )
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "operation_id": {
                    "type": "string",
                    "description": (
                        "The operation_id from an operation_started response."
                    ),
                },
                "task_id": {
                    "type": "string",
                    "description": (
                        "The task_id from an operation_started response. "
                        "Used as fallback if operation_id is not provided."
                    ),
                },
            },
            "required": [],
        }
    @property
    def requires_auth(self) -> bool:
        return False
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs,
    ) -> ToolResponseBase:
        from backend.api.features.chat import stream_registry
        operation_id: str = kwargs.get("operation_id", "").strip()
        task_id: str = kwargs.get("task_id", "").strip()
        if not operation_id and not task_id:
            return ErrorResponse(
                message="Please provide an operation_id or task_id.",
                error="missing_parameter",
            )
        task = None
        if operation_id:
            task = await stream_registry.find_task_by_operation_id(operation_id)
        if task is None and task_id:
            task = await stream_registry.get_task(task_id)
        if task is None:
            # Task not in Redis — it may have already expired (TTL).
            # Check conversation history for the result instead.
            return ErrorResponse(
                message=(
                    "Operation not found — it may have already completed and "
                    "expired from the status tracker. Check the conversation "
                    "history for the result."
                ),
                error="not_found",
            )
        status_messages = {
            "running": (
                f"The {task.tool_name or 'operation'} is still running. "
                "Please wait for it to complete."
            ),
            "completed": (
                f"The {task.tool_name or 'operation'} has completed successfully."
            ),
            "failed": f"The {task.tool_name or 'operation'} has failed.",
        }
        return OperationStatusResponse(
            task_id=task.task_id,
            operation_id=task.operation_id,
            status=task.status,
            tool_name=task.tool_name,
            message=status_messages.get(task.status, f"Status: {task.status}"),
        )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/models.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/models.py
@@ -40,12 +40,6 @@ class ResponseType(str, Enum):
    OPERATION_IN_PROGRESS = "operation_in_progress"
    # Input validation
    INPUT_VALIDATION_ERROR = "input_validation_error"
    # Web fetch
    WEB_FETCH = "web_fetch"
    # Code execution
    BASH_EXEC = "bash_exec"
    # Operation status check
    OPERATION_STATUS = "operation_status"
 # Base response model
@@ -341,17 +335,11 @@ class BlockInfoSummary(BaseModel):
    name: str
    description: str
    categories: list[str]
-    input_schema: dict[str, Any] = Field(
+    input_schema: dict[str, Any]
-        default_factory=dict,
+    output_schema: dict[str, Any]
        description="Full JSON schema for block inputs",
    )
    output_schema: dict[str, Any] = Field(
        default_factory=dict,
        description="Full JSON schema for block outputs",
    )
    required_inputs: list[BlockInputFieldInfo] = Field(
        default_factory=list,
-        description="List of input fields for this block",
+        description="List of required input fields for this block",
    )
@@ -364,7 +352,7 @@ class BlockListResponse(ToolResponseBase):
    query: str
    usage_hint: str = Field(
        default="To execute a block, call run_block with block_id set to the block's "
-        "'id' field and input_data containing the fields listed in required_inputs."
+        "'id' field and input_data containing the required fields from input_schema."
    )
@@ -433,24 +421,3 @@ class AsyncProcessingResponse(ToolResponseBase):
    status: str = "accepted"  # Must be "accepted" for detection
    operation_id: str | None = None
    task_id: str | None = None
 class WebFetchResponse(ToolResponseBase):
    """Response for web_fetch tool."""
    type: ResponseType = ResponseType.WEB_FETCH
    url: str
    status_code: int
    content_type: str
    content: str
    truncated: bool = False
 class BashExecResponse(ToolResponseBase):
    """Response for bash_exec tool."""
    type: ResponseType = ResponseType.BASH_EXEC
    stdout: str
    stderr: str
    exit_code: int
    timed_out: bool = False
--- a/autogpt_platform/backend/backend/api/features/chat/tools/sandbox.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/sandbox.py
@@ -1,267 +0,0 @@
 """Sandbox execution utilities for code execution tools.
 Provides filesystem + network isolated command execution using **bubblewrap**
 (``bwrap``): whitelist-only filesystem (only system dirs visible read-only),
 writable workspace only, clean environment, network blocked.
 Tools that call :func:`run_sandboxed` must first check :func:`has_full_sandbox`
 and refuse to run if bubblewrap is not available.
 """
 import asyncio
 import logging
 import os
 import platform
 import shutil
 logger = logging.getLogger(__name__)
 # Output limits — prevent blowing up LLM context
 _MAX_OUTPUT_CHARS = 50_000
 _DEFAULT_TIMEOUT = 30
 _MAX_TIMEOUT = 120
 # ---------------------------------------------------------------------------
 # Sandbox capability detection (cached at first call)
 # ---------------------------------------------------------------------------
 _BWRAP_AVAILABLE: bool | None = None
 def has_full_sandbox() -> bool:
    """Return True if bubblewrap is available (filesystem + network isolation).
    On non-Linux platforms (macOS), always returns False.
    """
    global _BWRAP_AVAILABLE
    if _BWRAP_AVAILABLE is None:
        _BWRAP_AVAILABLE = (
            platform.system() == "Linux" and shutil.which("bwrap") is not None
        )
    return _BWRAP_AVAILABLE
 WORKSPACE_PREFIX = "/tmp/copilot-"
 def make_session_path(session_id: str) -> str:
    """Build a sanitized, session-specific path under :data:`WORKSPACE_PREFIX`.
    Shared by both the SDK working-directory setup and the sandbox tools so
    they always resolve to the same directory for a given session.
    Steps:
        1. Strip all characters except ``[A-Za-z0-9-]``.
        2. Construct ``/tmp/copilot-<safe_id>``.
        3. Validate via ``os.path.normpath`` + ``startswith`` (CodeQL-recognised
           sanitizer) to prevent path traversal.
    Raises:
        ValueError: If the resulting path escapes the prefix.
    """
    import re
    safe_id = re.sub(r"[^A-Za-z0-9-]", "", session_id)
    if not safe_id:
        safe_id = "default"
    path = os.path.normpath(f"{WORKSPACE_PREFIX}{safe_id}")
    if not path.startswith(WORKSPACE_PREFIX):
        raise ValueError(f"Session path escaped prefix: {path}")
    return path
 def get_workspace_dir(session_id: str) -> str:
    """Get or create the workspace directory for a session.
    Uses :func:`make_session_path` — the same path the SDK uses — so that
    bash_exec shares the workspace with the SDK file tools.
    """
    workspace = make_session_path(session_id)
    os.makedirs(workspace, exist_ok=True)
    return workspace
 # ---------------------------------------------------------------------------
 # Bubblewrap command builder
 # ---------------------------------------------------------------------------
 # System directories mounted read-only inside the sandbox.
 # ONLY these are visible — /app, /root, /home, /opt, /var etc. are NOT accessible.
 _SYSTEM_RO_BINDS = [
    "/usr",  # binaries, libraries, Python interpreter
    "/etc",  # system config: ld.so, locale, passwd, alternatives
 ]
 # Compat paths: symlinks to /usr/* on modern Debian, real dirs on older systems.
 # On Debian 13 these are symlinks (e.g. /bin -> usr/bin).  bwrap --ro-bind
 # can't create a symlink target, so we detect and use --symlink instead.
 # /lib64 is critical: the ELF dynamic linker lives at /lib64/ld-linux-x86-64.so.2.
 _COMPAT_PATHS = [
    ("/bin", "usr/bin"),  # -> /usr/bin on Debian 13
    ("/sbin", "usr/sbin"),  # -> /usr/sbin on Debian 13
    ("/lib", "usr/lib"),  # -> /usr/lib on Debian 13
    ("/lib64", "usr/lib64"),  # 64-bit libraries / ELF interpreter
 ]
 # Resource limits to prevent fork bombs, memory exhaustion, and disk abuse.
 # Applied via ulimit inside the sandbox before exec'ing the user command.
 _RESOURCE_LIMITS = (
    "ulimit -u 64"  # max 64 processes  (prevents fork bombs)
    " -v 524288"  # 512 MB virtual memory
    " -f 51200"  # 50 MB max file size  (1024-byte blocks)
    " -n 256"  # 256 open file descriptors
    " 2>/dev/null"
 )
 def _build_bwrap_command(
    command: list[str], cwd: str, env: dict[str, str]
 ) -> list[str]:
    """Build a bubblewrap command with strict filesystem + network isolation.
    Security model:
    - **Whitelist-only filesystem**: only system directories (``/usr``, ``/etc``,
      ``/bin``, ``/lib``) are mounted read-only.  Application code (``/app``),
      home directories, ``/var``, ``/opt``, etc. are NOT accessible at all.
    - **Writable workspace only**: the per-session workspace is the sole
      writable path.
    - **Clean environment**: ``--clearenv`` wipes all inherited env vars.
      Only the explicitly-passed safe env vars are set inside the sandbox.
    - **Network isolation**: ``--unshare-net`` blocks all network access.
    - **Resource limits**: ulimit caps on processes (64), memory (512MB),
      file size (50MB), and open FDs (256) to prevent fork bombs and abuse.
    - **New session**: prevents terminal control escape.
    - **Die with parent**: prevents orphaned sandbox processes.
    """
    cmd = [
        "bwrap",
        # Create a new user namespace so bwrap can set up sandboxing
        # inside unprivileged Docker containers (no CAP_SYS_ADMIN needed).
        "--unshare-user",
        # Wipe all inherited environment variables (API keys, secrets, etc.)
        "--clearenv",
    ]
    # Set only the safe env vars inside the sandbox
    for key, value in env.items():
        cmd.extend(["--setenv", key, value])
    # System directories: read-only
    for path in _SYSTEM_RO_BINDS:
        cmd.extend(["--ro-bind", path, path])
    # Compat paths: use --symlink when host path is a symlink (Debian 13),
    # --ro-bind when it's a real directory (older distros).
    for path, symlink_target in _COMPAT_PATHS:
        if os.path.islink(path):
            cmd.extend(["--symlink", symlink_target, path])
        elif os.path.exists(path):
            cmd.extend(["--ro-bind", path, path])
    # Wrap the user command with resource limits:
    #   sh -c 'ulimit ...; exec "$@"' -- <original command>
    # `exec "$@"` replaces the shell so there's no extra process overhead,
    # and properly handles arguments with spaces.
    limited_command = [
        "sh",
        "-c",
        f'{_RESOURCE_LIMITS}; exec "$@"',
        "--",
        *command,
    ]
    cmd.extend(
        [
            # Fresh virtual filesystems
            "--dev",
            "/dev",
            "--proc",
            "/proc",
            "--tmpfs",
            "/tmp",
            # Workspace bind AFTER --tmpfs /tmp so it's visible through the tmpfs.
            # (workspace lives under /tmp/copilot-<session>)
            "--bind",
            cwd,
            cwd,
            # Isolation
            "--unshare-net",
            "--die-with-parent",
            "--new-session",
            "--chdir",
            cwd,
            "--",
            *limited_command,
        ]
    )
    return cmd
 # ---------------------------------------------------------------------------
 # Public API
 # ---------------------------------------------------------------------------
 async def run_sandboxed(
    command: list[str],
    cwd: str,
    timeout: int = _DEFAULT_TIMEOUT,
    env: dict[str, str] | None = None,
 ) -> tuple[str, str, int, bool]:
    """Run a command inside a bubblewrap sandbox.
    Callers **must** check :func:`has_full_sandbox` before calling this
    function.  If bubblewrap is not available, this function raises
    :class:`RuntimeError` rather than running unsandboxed.
    Returns:
        (stdout, stderr, exit_code, timed_out)
    """
    if not has_full_sandbox():
        raise RuntimeError(
            "run_sandboxed() requires bubblewrap but bwrap is not available. "
            "Callers must check has_full_sandbox() before calling this function."
        )
    timeout = min(max(timeout, 1), _MAX_TIMEOUT)
    safe_env = {
        "PATH": "/usr/local/bin:/usr/bin:/bin",
        "HOME": cwd,
        "TMPDIR": cwd,
        "LANG": "en_US.UTF-8",
        "PYTHONDONTWRITEBYTECODE": "1",
        "PYTHONIOENCODING": "utf-8",
    }
    if env:
        safe_env.update(env)
    full_command = _build_bwrap_command(command, cwd, safe_env)
    try:
        proc = await asyncio.create_subprocess_exec(
            *full_command,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            cwd=cwd,
            env=safe_env,
        )
        try:
            stdout_bytes, stderr_bytes = await asyncio.wait_for(
                proc.communicate(), timeout=timeout
            )
            stdout = stdout_bytes.decode("utf-8", errors="replace")[:_MAX_OUTPUT_CHARS]
            stderr = stderr_bytes.decode("utf-8", errors="replace")[:_MAX_OUTPUT_CHARS]
            return stdout, stderr, proc.returncode or 0, False
        except asyncio.TimeoutError:
            proc.kill()
            await proc.communicate()
            return "", f"Execution timed out after {timeout}s", -1, True
    except RuntimeError:
        raise
    except Exception as e:
        return "", f"Sandbox error: {e}", -1, False
--- a/autogpt_platform/backend/backend/api/features/chat/tools/web_fetch.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/web_fetch.py
@@ -1,156 +0,0 @@
 """Web fetch tool — safely retrieve public web page content."""
 import logging
 from typing import Any
 import aiohttp
 import html2text
 from backend.api.features.chat.model import ChatSession
 from backend.api.features.chat.tools.base import BaseTool
 from backend.api.features.chat.tools.models import (
    ErrorResponse,
    ToolResponseBase,
    WebFetchResponse,
 )
 from backend.util.request import Requests
 logger = logging.getLogger(__name__)
 # Limits
 _MAX_CONTENT_BYTES = 102_400  # 100 KB download cap
 _MAX_OUTPUT_CHARS = 50_000  # 50K char truncation for LLM context
 _REQUEST_TIMEOUT = aiohttp.ClientTimeout(total=15)
 # Content types we'll read as text
 _TEXT_CONTENT_TYPES = {
    "text/html",
    "text/plain",
    "text/xml",
    "text/csv",
    "text/markdown",
    "application/json",
    "application/xml",
    "application/xhtml+xml",
    "application/rss+xml",
    "application/atom+xml",
 }
 def _is_text_content(content_type: str) -> bool:
    base = content_type.split(";")[0].strip().lower()
    return base in _TEXT_CONTENT_TYPES or base.startswith("text/")
 def _html_to_text(html: str) -> str:
    h = html2text.HTML2Text()
    h.ignore_links = False
    h.ignore_images = True
    h.body_width = 0
    return h.handle(html)
 class WebFetchTool(BaseTool):
    """Safely fetch content from a public URL using SSRF-protected HTTP."""
    @property
    def name(self) -> str:
        return "web_fetch"
    @property
    def description(self) -> str:
        return (
            "Fetch the content of a public web page by URL. "
            "Returns readable text extracted from HTML by default. "
            "Useful for reading documentation, articles, and API responses. "
            "Only supports HTTP/HTTPS GET requests to public URLs "
            "(private/internal network addresses are blocked)."
        )
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "The public HTTP/HTTPS URL to fetch.",
                },
                "extract_text": {
                    "type": "boolean",
                    "description": (
                        "If true (default), extract readable text from HTML. "
                        "If false, return raw content."
                    ),
                    "default": True,
                },
            },
            "required": ["url"],
        }
    @property
    def requires_auth(self) -> bool:
        return False
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs: Any,
    ) -> ToolResponseBase:
        url: str = (kwargs.get("url") or "").strip()
        extract_text: bool = kwargs.get("extract_text", True)
        session_id = session.session_id if session else None
        if not url:
            return ErrorResponse(
                message="Please provide a URL to fetch.",
                error="missing_url",
                session_id=session_id,
            )
        try:
            client = Requests(raise_for_status=False, retry_max_attempts=1)
            response = await client.get(url, timeout=_REQUEST_TIMEOUT)
        except ValueError as e:
            # validate_url raises ValueError for SSRF / blocked IPs
            return ErrorResponse(
                message=f"URL blocked: {e}",
                error="url_blocked",
                session_id=session_id,
            )
        except Exception as e:
            logger.warning(f"[web_fetch] Request failed for {url}: {e}")
            return ErrorResponse(
                message=f"Failed to fetch URL: {e}",
                error="fetch_failed",
                session_id=session_id,
            )
        content_type = response.headers.get("content-type", "")
        if not _is_text_content(content_type):
            return ErrorResponse(
                message=f"Non-text content type: {content_type.split(';')[0]}",
                error="unsupported_content_type",
                session_id=session_id,
            )
        raw = response.content[:_MAX_CONTENT_BYTES]
        text = raw.decode("utf-8", errors="replace")
        if extract_text and "html" in content_type.lower():
            text = _html_to_text(text)
        truncated = len(text) > _MAX_OUTPUT_CHARS
        if truncated:
            text = text[:_MAX_OUTPUT_CHARS]
        return WebFetchResponse(
            message=f"Fetched {url}" + (" (truncated)" if truncated else ""),
            url=response.url,
            status_code=response.status,
            content_type=content_type.split(";")[0].strip(),
            content=text,
            truncated=truncated,
            session_id=session_id,
        )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/workspace_files.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/workspace_files.py
@@ -88,9 +88,7 @@ class ListWorkspaceFilesTool(BaseTool):
    @property
    def description(self) -> str:
        return (
-            "List files in the user's persistent workspace (cloud storage). "
+            "List files in the user's workspace. "
            "These files survive across sessions. "
            "For ephemeral session files, use the SDK Read/Glob tools instead. "
            "Returns file names, paths, sizes, and metadata. "
            "Optionally filter by path prefix."
        )
@@ -206,9 +204,7 @@ class ReadWorkspaceFileTool(BaseTool):
    @property
    def description(self) -> str:
        return (
-            "Read a file from the user's persistent workspace (cloud storage). "
+            "Read a file from the user's workspace. "
            "These files survive across sessions. "
            "For ephemeral session files, use the SDK Read tool instead. "
            "Specify either file_id or path to identify the file. "
            "For small text files, returns content directly. "
            "For large or binary files, returns metadata and a download URL. "
@@ -382,9 +378,7 @@ class WriteWorkspaceFileTool(BaseTool):
    @property
    def description(self) -> str:
        return (
-            "Write or create a file in the user's persistent workspace (cloud storage). "
+            "Write or create a file in the user's workspace. "
            "These files survive across sessions. "
            "For ephemeral session files, use the SDK Write tool instead. "
            "Provide the content as a base64-encoded string. "
            f"Maximum file size is {Config().max_file_size_mb}MB. "
            "Files are saved to the current session's folder by default. "
@@ -529,7 +523,7 @@ class DeleteWorkspaceFileTool(BaseTool):
    @property
    def description(self) -> str:
        return (
-            "Delete a file from the user's persistent workspace (cloud storage). "
+            "Delete a file from the user's workspace. "
            "Specify either file_id or path to identify the file. "
            "Paths are scoped to the current session by default. "
            "Use /sessions/<session_id>/... for cross-session access."
--- a/autogpt_platform/backend/backend/util/feature_flag.py
+++ b/autogpt_platform/backend/backend/util/feature_flag.py
@@ -38,7 +38,6 @@ class Flag(str, Enum):
    AGENT_ACTIVITY = "agent-activity"
    ENABLE_PLATFORM_PAYMENT = "enable-platform-payment"
    CHAT = "chat"
    COPILOT_SDK = "copilot-sdk"
 def is_configured() -> bool:
--- a/autogpt_platform/backend/backend/util/settings.py
+++ b/autogpt_platform/backend/backend/util/settings.py
@@ -368,6 +368,10 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
        default=600,
        description="The timeout in seconds for Agent Generator service requests (includes retries for rate limits)",
    )
    agentgenerator_use_dummy: bool = Field(
        default=False,
        description="Use dummy agent generator responses for testing (bypasses external service)",
    )
    enable_example_blocks: bool = Field(
        default=False,
--- a/autogpt_platform/backend/poetry.lock
+++ b/autogpt_platform/backend/poetry.lock
@@ -897,29 +897,6 @@ files = [
    {file = "charset_normalizer-3.4.4.tar.gz", hash = "sha256:94537985111c35f28720e43603b8e7b43a6ecfb2ce1d3058bbe955b73404e21a"},
 ]
 [[package]]
 name = "claude-agent-sdk"
 version = "0.1.35"
 description = "Python SDK for Claude Code"
 optional = false
 python-versions = ">=3.10"
 groups = ["main"]
 files = [
    {file = "claude_agent_sdk-0.1.35-py3-none-macosx_11_0_arm64.whl", hash = "sha256:df67f4deade77b16a9678b3a626c176498e40417f33b04beda9628287f375591"},
    {file = "claude_agent_sdk-0.1.35-py3-none-manylinux_2_17_aarch64.whl", hash = "sha256:14963944f55ded7c8ed518feebfa5b4284aa6dd8d81aeff2e5b21a962ce65097"},
    {file = "claude_agent_sdk-0.1.35-py3-none-manylinux_2_17_x86_64.whl", hash = "sha256:84344dcc535d179c1fc8a11c6f34c37c3b583447bdf09d869effb26514fd7a65"},
    {file = "claude_agent_sdk-0.1.35-py3-none-win_amd64.whl", hash = "sha256:1b3d54b47448c93f6f372acd4d1757f047c3c1e8ef5804be7a1e3e53e2c79a5f"},
    {file = "claude_agent_sdk-0.1.35.tar.gz", hash = "sha256:0f98e2b3c71ca85abfc042e7a35c648df88e87fda41c52e6779ef7b038dcbb52"},
 ]
 [package.dependencies]
 anyio = ">=4.0.0"
 mcp = ">=0.1.0"
 typing-extensions = {version = ">=4.0.0", markers = "python_version < \"3.11\""}
 [package.extras]
 dev = ["anyio[trio] (>=4.0.0)", "mypy (>=1.0.0)", "pytest (>=7.0.0)", "pytest-asyncio (>=0.20.0)", "pytest-cov (>=4.0.0)", "ruff (>=0.1.0)"]
 [[package]]
 name = "cleo"
 version = "2.1.0"
@@ -2616,18 +2593,6 @@ http2 = ["h2 (>=3,<5)"]
 socks = ["socksio (==1.*)"]
 zstd = ["zstandard (>=0.18.0)"]
 [[package]]
 name = "httpx-sse"
 version = "0.4.3"
 description = "Consume Server-Sent Event (SSE) messages with HTTPX."
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
    {file = "httpx_sse-0.4.3-py3-none-any.whl", hash = "sha256:0ac1c9fe3c0afad2e0ebb25a934a59f4c7823b60792691f779fad2c5568830fc"},
    {file = "httpx_sse-0.4.3.tar.gz", hash = "sha256:9b1ed0127459a66014aec3c56bebd93da3c1bc8bb6618c8082039a44889a755d"},
 ]
 [[package]]
 name = "huggingface-hub"
 version = "1.4.1"
@@ -3345,39 +3310,6 @@ files = [
    {file = "mccabe-0.7.0.tar.gz", hash = "sha256:348e0240c33b60bbdf4e523192ef919f28cb2c3d7d5c7794f74009290f236325"},
 ]
 [[package]]
 name = "mcp"
 version = "1.26.0"
 description = "Model Context Protocol SDK"
 optional = false
 python-versions = ">=3.10"
 groups = ["main"]
 files = [
    {file = "mcp-1.26.0-py3-none-any.whl", hash = "sha256:904a21c33c25aa98ddbeb47273033c435e595bbacfdb177f4bd87f6dceebe1ca"},
    {file = "mcp-1.26.0.tar.gz", hash = "sha256:db6e2ef491eecc1a0d93711a76f28dec2e05999f93afd48795da1c1137142c66"},
 ]
 [package.dependencies]
 anyio = ">=4.5"
 httpx = ">=0.27.1"
 httpx-sse = ">=0.4"
 jsonschema = ">=4.20.0"
 pydantic = ">=2.11.0,<3.0.0"
 pydantic-settings = ">=2.5.2"
 pyjwt = {version = ">=2.10.1", extras = ["crypto"]}
 python-multipart = ">=0.0.9"
 pywin32 = {version = ">=310", markers = "sys_platform == \"win32\""}
 sse-starlette = ">=1.6.1"
 starlette = ">=0.27"
 typing-extensions = ">=4.9.0"
 typing-inspection = ">=0.4.1"
 uvicorn = {version = ">=0.31.1", markers = "sys_platform != \"emscripten\""}
 [package.extras]
 cli = ["python-dotenv (>=1.0.0)", "typer (>=0.16.0)"]
 rich = ["rich (>=13.9.4)"]
 ws = ["websockets (>=15.0.1)"]
 [[package]]
 name = "mdurl"
 version = "0.1.2"
@@ -6062,7 +5994,7 @@ description = "Python for Window Extensions"
 optional = false
 python-versions = "*"
 groups = ["main"]
-markers = "sys_platform == \"win32\" or platform_system == \"Windows\""
+markers = "platform_system == \"Windows\""
 files = [
    {file = "pywin32-311-cp310-cp310-win32.whl", hash = "sha256:d03ff496d2a0cd4a5893504789d4a15399133fe82517455e78bad62efbb7f0a3"},
    {file = "pywin32-311-cp310-cp310-win_amd64.whl", hash = "sha256:797c2772017851984b97180b0bebe4b620bb86328e8a884bb626156295a63b3b"},
@@ -7042,28 +6974,6 @@ postgresql-psycopgbinary = ["psycopg[binary] (>=3.0.7)"]
 pymysql = ["pymysql"]
 sqlcipher = ["sqlcipher3_binary"]
 [[package]]
 name = "sse-starlette"
 version = "3.2.0"
 description = "SSE plugin for Starlette"
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
    {file = "sse_starlette-3.2.0-py3-none-any.whl", hash = "sha256:5876954bd51920fc2cd51baee47a080eb88a37b5b784e615abb0b283f801cdbf"},
    {file = "sse_starlette-3.2.0.tar.gz", hash = "sha256:8127594edfb51abe44eac9c49e59b0b01f1039d0c7461c6fd91d4e03b70da422"},
 ]
 [package.dependencies]
 anyio = ">=4.7.0"
 starlette = ">=0.49.1"
 [package.extras]
 daphne = ["daphne (>=4.2.0)"]
 examples = ["aiosqlite (>=0.21.0)", "fastapi (>=0.115.12)", "sqlalchemy[asyncio] (>=2.0.41)", "uvicorn (>=0.34.0)"]
 granian = ["granian (>=2.3.1)"]
 uvicorn = ["uvicorn (>=0.34.0)"]
 [[package]]
 name = "stagehand"
 version = "0.5.9"
@@ -8530,4 +8440,4 @@ cffi = ["cffi (>=1.17,<2.0) ; platform_python_implementation != \"PyPy\" and pyt
 [metadata]
 lock-version = "2.1"
 python-versions = ">=3.10,<3.14"
-content-hash = "942dea6daf671c3be65a22f3445feda26c1af9409d7173765e9a0742f0aa05dc"
+content-hash = "c06e96ad49388ba7a46786e9ea55ea2c1a57408e15613237b4bee40a592a12af"
--- a/autogpt_platform/backend/pyproject.toml
+++ b/autogpt_platform/backend/pyproject.toml
@@ -16,7 +16,6 @@ anthropic = "^0.79.0"
 apscheduler = "^3.11.1"
 autogpt-libs = { path = "../autogpt_libs", develop = true }
 bleach = { extras = ["css"], version = "^6.2.0" }
 claude-agent-sdk = "^0.1.0"
 click = "^8.2.0"
 cryptography = "^46.0"
 discord-py = "^2.5.2"
--- a/autogpt_platform/backend/test/agent_generator/test_service.py
+++ b/autogpt_platform/backend/test/agent_generator/test_service.py
@@ -25,6 +25,7 @@ class TestServiceConfiguration:
        """Test that external service is not configured when host is empty."""
        mock_settings = MagicMock()
        mock_settings.config.agentgenerator_host = ""
        mock_settings.config.agentgenerator_use_dummy = False
        with patch.object(service, "_get_settings", return_value=mock_settings):
            assert service.is_external_service_configured() is False
--- a/autogpt_platform/backend/test/chat/init.py
+++ b/autogpt_platform/backend/test/chat/init.py
--- a/autogpt_platform/backend/test/chat/test_security_hooks.py
+++ b/autogpt_platform/backend/test/chat/test_security_hooks.py
@@ -1,133 +0,0 @@
 """Tests for SDK security hooks — workspace paths, tool access, and deny messages.
 These are pure unit tests with no external dependencies (no SDK, no DB, no server).
 They validate that the security hooks correctly block unauthorized paths,
 tool access, and dangerous input patterns.
 Note: Bash command validation was removed — the SDK built-in Bash tool is not in
 allowed_tools, and the bash_exec MCP tool has kernel-level network isolation
 (unshare --net) making command-level parsing unnecessary.
 """
 from backend.api.features.chat.sdk.security_hooks import (
    _validate_tool_access,
    _validate_workspace_path,
 )
 SDK_CWD = "/tmp/copilot-test-session"
 def _is_denied(result: dict) -> bool:
    hook = result.get("hookSpecificOutput", {})
    return hook.get("permissionDecision") == "deny"
 def _reason(result: dict) -> str:
    return result.get("hookSpecificOutput", {}).get("permissionDecisionReason", "")
 # ============================================================
 # Workspace path validation (Read, Write, Edit, etc.)
 # ============================================================
 class TestWorkspacePathValidation:
    def test_path_in_workspace(self):
        result = _validate_workspace_path(
            "Read", {"file_path": f"{SDK_CWD}/file.txt"}, SDK_CWD
        )
        assert not _is_denied(result)
    def test_path_outside_workspace(self):
        result = _validate_workspace_path("Read", {"file_path": "/etc/passwd"}, SDK_CWD)
        assert _is_denied(result)
    def test_tool_results_allowed(self):
        result = _validate_workspace_path(
            "Read",
            {"file_path": "~/.claude/projects/abc/tool-results/out.txt"},
            SDK_CWD,
        )
        assert not _is_denied(result)
    def test_claude_settings_blocked(self):
        result = _validate_workspace_path(
            "Read", {"file_path": "~/.claude/settings.json"}, SDK_CWD
        )
        assert _is_denied(result)
    def test_claude_projects_without_tool_results(self):
        result = _validate_workspace_path(
            "Read", {"file_path": "~/.claude/projects/abc/credentials.json"}, SDK_CWD
        )
        assert _is_denied(result)
    def test_no_path_allowed(self):
        """Glob/Grep without path defaults to cwd — should be allowed."""
        result = _validate_workspace_path("Grep", {"pattern": "foo"}, SDK_CWD)
        assert not _is_denied(result)
    def test_path_traversal_with_dotdot(self):
        result = _validate_workspace_path(
            "Read", {"file_path": f"{SDK_CWD}/../../../etc/passwd"}, SDK_CWD
        )
        assert _is_denied(result)
 # ============================================================
 # Tool access validation
 # ============================================================
 class TestToolAccessValidation:
    def test_blocked_tools(self):
        for tool in ("bash", "shell", "exec", "terminal", "command"):
            result = _validate_tool_access(tool, {})
            assert _is_denied(result), f"Tool '{tool}' should be blocked"
    def test_bash_builtin_blocked(self):
        """SDK built-in Bash (capital) is blocked as defence-in-depth."""
        result = _validate_tool_access("Bash", {"command": "echo hello"}, SDK_CWD)
        assert _is_denied(result)
        assert "Bash" in _reason(result)
    def test_workspace_tools_delegate(self):
        result = _validate_tool_access(
            "Read", {"file_path": f"{SDK_CWD}/file.txt"}, SDK_CWD
        )
        assert not _is_denied(result)
    def test_dangerous_pattern_blocked(self):
        result = _validate_tool_access("SomeUnknownTool", {"data": "sudo rm -rf /"})
        assert _is_denied(result)
    def test_safe_unknown_tool_allowed(self):
        result = _validate_tool_access("SomeSafeTool", {"data": "hello world"})
        assert not _is_denied(result)
 # ============================================================
 # Deny message quality (ntindle feedback)
 # ============================================================
 class TestDenyMessageClarity:
    """Deny messages must include [SECURITY] and 'cannot be bypassed'
    so the model knows the restriction is enforced, not a suggestion."""
    def test_blocked_tool_message(self):
        reason = _reason(_validate_tool_access("bash", {}))
        assert "[SECURITY]" in reason
        assert "cannot be bypassed" in reason
    def test_bash_builtin_blocked_message(self):
        reason = _reason(_validate_tool_access("Bash", {"command": "echo hello"}))
        assert "[SECURITY]" in reason
        assert "cannot be bypassed" in reason
    def test_workspace_path_message(self):
        reason = _reason(
            _validate_workspace_path("Read", {"file_path": "/etc/passwd"}, SDK_CWD)
        )
        assert "[SECURITY]" in reason
        assert "cannot be bypassed" in reason
--- a/autogpt_platform/frontend/instrumentation-client.ts
+++ b/autogpt_platform/frontend/instrumentation-client.ts
@@ -22,6 +22,11 @@ Sentry.init({
  enabled: shouldEnable,
  // Suppress cross-origin stylesheet errors from Sentry Replay (rrweb)
  // serializing DOM snapshots with cross-origin stylesheets
  // (e.g., from browser extensions or CDN-loaded CSS)
  ignoreErrors: [/Not allowed to access cross-origin stylesheet/],
  // Add optional integrations for additional features
  integrations: [
    Sentry.captureConsoleIntegration(),
--- a/autogpt_platform/frontend/src/app/(platform)/build/components/BuilderActions/components/RunGraph/useRunGraph.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/build/components/BuilderActions/components/RunGraph/useRunGraph.ts
@@ -4,7 +4,7 @@ import {
 } from "@/app/api/__generated__/endpoints/graphs/graphs";
 import { useToast } from "@/components/molecules/Toast/use-toast";
 import { parseAsInteger, parseAsString, useQueryStates } from "nuqs";
-import { GraphExecutionMeta } from "@/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/use-agent-runs";
+import { GraphExecutionMeta } from "@/app/api/__generated__/models/graphExecutionMeta";
 import { useGraphStore } from "@/app/(platform)/build/stores/graphStore";
 import { useShallow } from "zustand/react/shallow";
 import { useEffect, useState } from "react";
--- a/autogpt_platform/frontend/src/app/(platform)/build/components/legacy-builder/RunnerInputUI.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/build/components/legacy-builder/RunnerInputUI.tsx
@@ -1,6 +1,6 @@
 import { useCallback } from "react";
-import { AgentRunDraftView } from "@/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-draft-view";
+import { AgentRunDraftView } from "@/app/(platform)/build/components/legacy-builder/agent-run-draft-view";
 import { Dialog } from "@/components/molecules/Dialog/Dialog";
 import type {
  CredentialsMetaInput,
--- a/autogpt_platform/frontend/src/app/(platform)/build/components/legacy-builder/SaveControl.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/build/components/legacy-builder/SaveControl.tsx
@@ -18,7 +18,7 @@ import {
 import { useToast } from "@/components/molecules/Toast/use-toast";
 import { useQueryClient } from "@tanstack/react-query";
 import { getGetV2ListMySubmissionsQueryKey } from "@/app/api/__generated__/endpoints/store/store";
-import { CronExpressionDialog } from "@/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/cron-scheduler-dialog";
+import { CronExpressionDialog } from "@/components/contextual/CronScheduler/cron-scheduler-dialog";
 import { humanizeCronExpression } from "@/lib/cron-expression-utils";
 import { CalendarClockIcon } from "lucide-react";
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-draft-view.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-draft-view.tsx
@@ -20,7 +20,7 @@ import {
 import { useBackendAPI } from "@/lib/autogpt-server-api/context";
 import { RunAgentInputs } from "@/app/(platform)/library/agents/[id]/components/NewAgentLibraryView/components/modals/RunAgentInputs/RunAgentInputs";
-import { ScheduleTaskDialog } from "@/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/cron-scheduler-dialog";
+import { ScheduleTaskDialog } from "@/components/contextual/CronScheduler/cron-scheduler-dialog";
 import ActionButtonGroup from "@/components/__legacy__/action-button-group";
 import type { ButtonAction } from "@/components/__legacy__/types";
 import {
@@ -53,7 +53,10 @@ import { ClockIcon, CopyIcon, InfoIcon } from "@phosphor-icons/react";
 import { CalendarClockIcon, Trash2Icon } from "lucide-react";
 import { analytics } from "@/services/analytics";
-import { AgentStatus, AgentStatusChip } from "./agent-status-chip";
+import {
  AgentStatus,
  AgentStatusChip,
 } from "@/app/(platform)/build/components/legacy-builder/agent-status-chip";
 export function AgentRunDraftView({
  graph,
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-status-chip.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-status-chip.tsx
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatMessagesContainer/ChatMessagesContainer.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatMessagesContainer/ChatMessagesContainer.tsx
@@ -20,7 +20,6 @@ import { FindBlocksTool } from "../../tools/FindBlocks/FindBlocks";
 import { RunAgentTool } from "../../tools/RunAgent/RunAgent";
 import { RunBlockTool } from "../../tools/RunBlock/RunBlock";
 import { SearchDocsTool } from "../../tools/SearchDocs/SearchDocs";
 import { GenericTool } from "../../tools/GenericTool/GenericTool";
 import { ViewAgentOutputTool } from "../../tools/ViewAgentOutput/ViewAgentOutput";
 // ---------------------------------------------------------------------------
@@ -256,16 +255,6 @@ export const ChatMessagesContainer = ({
                        />
                      );
                    default:
                      // Render a generic tool indicator for SDK built-in
                      // tools (Read, Glob, Grep, etc.) or any unrecognized tool
                      if (part.type.startsWith("tool-")) {
                        return (
                          <GenericTool
                            key={`${message.id}-${i}`}
                            part={part as ToolUIPart}
                          />
                        );
                      }
                      return null;
                  }
                })}
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/hooks/Untitled
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/hooks/Untitled
@@ -1,10 +0,0 @@
 import { parseAsString, useQueryState } from "nuqs";
 export function useCopilotSessionId() {
  const [urlSessionId, setUrlSessionId] = useQueryState(
    "sessionId",
    parseAsString,
  );
  return { urlSessionId, setUrlSessionId };
 }
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/hooks/useLongRunningToolPolling.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/hooks/useLongRunningToolPolling.ts
@@ -0,0 +1,126 @@
 import { getGetV2GetSessionQueryKey } from "@/app/api/__generated__/endpoints/chat/chat";
 import { useQueryClient } from "@tanstack/react-query";
 import type { UIDataTypes, UIMessage, UITools } from "ai";
 import { useCallback, useEffect, useRef } from "react";
 import { convertChatSessionMessagesToUiMessages } from "../helpers/convertChatSessionToUiMessages";
 const OPERATING_TYPES = new Set([
  "operation_started",
  "operation_pending",
  "operation_in_progress",
 ]);
 const POLL_INTERVAL_MS = 1_500;
 /**
 * Detects whether any message contains a tool part whose output indicates
 * a long-running operation is still in progress.
 */
 function hasOperatingTool(
  messages: UIMessage<unknown, UIDataTypes, UITools>[],
 ) {
  for (const msg of messages) {
    for (const part of msg.parts) {
      if (!part.type.startsWith("tool-")) continue;
      const toolPart = part as { output?: unknown };
      if (!toolPart.output) continue;
      const output =
        typeof toolPart.output === "string"
          ? safeParse(toolPart.output)
          : toolPart.output;
      if (
        output &&
        typeof output === "object" &&
        "type" in output &&
        OPERATING_TYPES.has((output as { type: string }).type)
      ) {
        return true;
      }
    }
  }
  return false;
 }
 function safeParse(value: string): unknown {
  try {
    return JSON.parse(value);
  } catch {
    return null;
  }
 }
 /**
 * Polls the session endpoint while any tool is in an "operating" state
 * (operation_started / operation_pending / operation_in_progress).
 *
 * When the session data shows the tool output has changed (e.g. to
 * agent_saved), it calls `setMessages` with the updated messages.
 */
 export function useLongRunningToolPolling(
  sessionId: string | null,
  messages: UIMessage<unknown, UIDataTypes, UITools>[],
  setMessages: (
    updater: (
      prev: UIMessage<unknown, UIDataTypes, UITools>[],
    ) => UIMessage<unknown, UIDataTypes, UITools>[],
  ) => void,
 ) {
  const queryClient = useQueryClient();
  const intervalRef = useRef<ReturnType<typeof setInterval> | null>(null);
  const stopPolling = useCallback(() => {
    if (intervalRef.current) {
      clearInterval(intervalRef.current);
      intervalRef.current = null;
    }
  }, []);
  const poll = useCallback(async () => {
    if (!sessionId) return;
    // Invalidate the query cache so the next fetch gets fresh data
    await queryClient.invalidateQueries({
      queryKey: getGetV2GetSessionQueryKey(sessionId),
    });
    // Fetch fresh session data
    const data = queryClient.getQueryData<{
      status: number;
      data: { messages?: unknown[] };
    }>(getGetV2GetSessionQueryKey(sessionId));
    if (data?.status !== 200 || !data.data.messages) return;
    const freshMessages = convertChatSessionMessagesToUiMessages(
      sessionId,
      data.data.messages,
    );
    if (!freshMessages || freshMessages.length === 0) return;
    // Update when the long-running tool completed
    if (!hasOperatingTool(freshMessages)) {
      setMessages(() => freshMessages);
      stopPolling();
    }
  }, [sessionId, queryClient, setMessages, stopPolling]);
  useEffect(() => {
    const shouldPoll = hasOperatingTool(messages);
    // Always clear any previous interval first so we never leak timers
    // when the effect re-runs due to dependency changes (e.g. messages
    // updating as the LLM streams text after the tool call).
    stopPolling();
    if (shouldPoll && sessionId) {
      intervalRef.current = setInterval(() => {
        poll();
      }, POLL_INTERVAL_MS);
    }
    return () => {
      stopPolling();
    };
  }, [messages, sessionId, poll, stopPolling]);
 }
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/CreateAgent.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/CreateAgent.tsx
@@ -1,24 +1,30 @@
 "use client";
-import { WarningDiamondIcon } from "@phosphor-icons/react";
+import { Button } from "@/components/atoms/Button/Button";
 import { Text } from "@/components/atoms/Text/Text";
 import {
  BookOpenIcon,
  CheckFatIcon,
  PencilSimpleIcon,
  WarningDiamondIcon,
 } from "@phosphor-icons/react";
 import type { ToolUIPart } from "ai";
 import NextLink from "next/link";
 import { useCopilotChatActions } from "../../components/CopilotChatActionsProvider/useCopilotChatActions";
 import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
 import { ProgressBar } from "../../components/ProgressBar/ProgressBar";
 import {
  ContentCardDescription,
  ContentCodeBlock,
  ContentGrid,
  ContentHint,
  ContentLink,
  ContentMessage,
 } from "../../components/ToolAccordion/AccordionContent";
 import { ToolAccordion } from "../../components/ToolAccordion/ToolAccordion";
 import { useAsymptoticProgress } from "../../hooks/useAsymptoticProgress";
 import {
  ClarificationQuestionsCard,
  ClarifyingQuestion,
 } from "./components/ClarificationQuestionsCard";
 import { MiniGame } from "./components/MiniGame/MiniGame";
 import {
  AccordionIcon,
  formatMaybeJson,
@@ -52,7 +58,7 @@ function getAccordionMeta(output: CreateAgentToolOutput) {
  const icon = <AccordionIcon />;
  if (isAgentSavedOutput(output)) {
-    return { icon, title: output.agent_name };
+    return { icon, title: output.agent_name, expanded: true };
  }
  if (isAgentPreviewOutput(output)) {
    return {
@@ -78,6 +84,7 @@ function getAccordionMeta(output: CreateAgentToolOutput) {
    return {
      icon,
      title: "Creating agent, this may take a few minutes. Sit back and relax.",
      expanded: true,
    };
  }
  return {
@@ -107,8 +114,6 @@ export function CreateAgentTool({ part }: Props) {
      isOperationPendingOutput(output) ||
      isOperationInProgressOutput(output));
  const progress = useAsymptoticProgress(isOperating);
  const hasExpandableContent =
    part.state === "output-available" &&
    !!output &&
@@ -152,31 +157,53 @@ export function CreateAgentTool({ part }: Props) {
        <ToolAccordion {...getAccordionMeta(output)}>
          {isOperating && (
            <ContentGrid>
-              <ProgressBar value={progress} className="max-w-[280px]" />
+              <MiniGame />
              <ContentHint>
-                This could take a few minutes, grab a coffee ☕
+                This could take a few minutes — play while you wait!
              </ContentHint>
            </ContentGrid>
          )}
          {isAgentSavedOutput(output) && (
-            <ContentGrid>
+            <div className="rounded-xl border border-border/60 bg-card p-4 shadow-sm">
-              <ContentMessage>{output.message}</ContentMessage>
+              <div className="flex items-baseline gap-2">
-              <div className="flex flex-wrap gap-2">
+                <CheckFatIcon
-                <ContentLink href={output.library_agent_link}>
+                  size={18}
-                  Open in library
+                  weight="regular"
-                </ContentLink>
+                  className="relative top-1 text-green-500"
-                <ContentLink href={output.agent_page_link}>
+                />
-                  Open in builder
+                <Text
-                </ContentLink>
+                  variant="body-medium"
                  className="text-blacks mb-2 text-[16px]"
                >
                  {output.message}
                </Text>
              </div>
-              <ContentCodeBlock>
+              <div className="mt-3 flex flex-wrap gap-4">
-                {truncateText(
+                <Button variant="outline" size="small">
-                  formatMaybeJson({ agent_id: output.agent_id }),
+                  <NextLink
-                  800,
+                    href={output.library_agent_link}
-                )}
+                    className="inline-flex items-center gap-1.5"
-              </ContentCodeBlock>
+                    target="_blank"
-            </ContentGrid>
+                    rel="noopener noreferrer"
                  >
                    <BookOpenIcon size={14} weight="regular" />
                    Open in library
                  </NextLink>
                </Button>
                <Button variant="outline" size="small">
                  <NextLink
                    href={output.agent_page_link}
                    target="_blank"
                    rel="noopener noreferrer"
                    className="inline-flex items-center gap-1.5"
                  >
                    <PencilSimpleIcon size={14} weight="regular" />
                    Open in builder
                  </NextLink>
                </Button>
              </div>
            </div>
          )}
          {isAgentPreviewOutput(output) && (
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/components/MiniGame/MiniGame.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/components/MiniGame/MiniGame.tsx
@@ -0,0 +1,21 @@
 "use client";
 import { useMiniGame } from "./useMiniGame";
 export function MiniGame() {
  const { canvasRef } = useMiniGame();
  return (
    <div
      className="w-full overflow-hidden rounded-md bg-background text-foreground"
      style={{ border: "1px solid #d17fff" }}
    >
      <canvas
        ref={canvasRef}
        tabIndex={0}
        className="block w-full outline-none"
        style={{ imageRendering: "pixelated" }}
      />
    </div>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/components/MiniGame/useMiniGame.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/components/MiniGame/useMiniGame.ts
@@ -0,0 +1,579 @@
 import { useEffect, useRef } from "react";
 /* ------------------------------------------------------------------ */
 /*  Constants                                                         */
 /* ------------------------------------------------------------------ */
 const CANVAS_HEIGHT = 150;
 const GRAVITY = 0.55;
 const JUMP_FORCE = -9.5;
 const BASE_SPEED = 3;
 const SPEED_INCREMENT = 0.0008;
 const SPAWN_MIN = 70;
 const SPAWN_MAX = 130;
 const CHAR_SIZE = 18;
 const CHAR_X = 50;
 const GROUND_PAD = 20;
 const STORAGE_KEY = "copilot-minigame-highscore";
 // Colors
 const COLOR_BG = "#E8EAF6";
 const COLOR_CHAR = "#263238";
 const COLOR_BOSS = "#F50057";
 // Boss
 const BOSS_SIZE = 36;
 const BOSS_ENTER_SPEED = 2;
 const BOSS_LEAVE_SPEED = 3;
 const BOSS_SHOOT_COOLDOWN = 90;
 const BOSS_SHOTS_TO_EVADE = 5;
 const BOSS_INTERVAL = 20; // every N score
 const PROJ_SPEED = 4.5;
 const PROJ_SIZE = 12;
 /* ------------------------------------------------------------------ */
 /*  Types                                                             */
 /* ------------------------------------------------------------------ */
 interface Obstacle {
  x: number;
  width: number;
  height: number;
  scored: boolean;
 }
 interface Projectile {
  x: number;
  y: number;
  speed: number;
  evaded: boolean;
  type: "low" | "high";
 }
 interface BossState {
  phase: "inactive" | "entering" | "fighting" | "leaving";
  x: number;
  targetX: number;
  shotsEvaded: number;
  cooldown: number;
  projectiles: Projectile[];
  bob: number;
 }
 interface GameState {
  charY: number;
  vy: number;
  obstacles: Obstacle[];
  score: number;
  highScore: number;
  speed: number;
  frame: number;
  nextSpawn: number;
  running: boolean;
  over: boolean;
  groundY: number;
  boss: BossState;
  bossThreshold: number;
 }
 /* ------------------------------------------------------------------ */
 /*  Helpers                                                           */
 /* ------------------------------------------------------------------ */
 function randInt(min: number, max: number) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
 }
 function readHighScore(): number {
  try {
    return parseInt(localStorage.getItem(STORAGE_KEY) || "0", 10) || 0;
  } catch {
    return 0;
  }
 }
 function writeHighScore(score: number) {
  try {
    localStorage.setItem(STORAGE_KEY, String(score));
  } catch {
    /* noop */
  }
 }
 function makeBoss(): BossState {
  return {
    phase: "inactive",
    x: 0,
    targetX: 0,
    shotsEvaded: 0,
    cooldown: 0,
    projectiles: [],
    bob: 0,
  };
 }
 function makeState(groundY: number): GameState {
  return {
    charY: groundY - CHAR_SIZE,
    vy: 0,
    obstacles: [],
    score: 0,
    highScore: readHighScore(),
    speed: BASE_SPEED,
    frame: 0,
    nextSpawn: randInt(SPAWN_MIN, SPAWN_MAX),
    running: false,
    over: false,
    groundY,
    boss: makeBoss(),
    bossThreshold: BOSS_INTERVAL,
  };
 }
 function gameOver(s: GameState) {
  s.running = false;
  s.over = true;
  if (s.score > s.highScore) {
    s.highScore = s.score;
    writeHighScore(s.score);
  }
 }
 /* ------------------------------------------------------------------ */
 /*  Projectile collision — shared between fighting & leaving phases   */
 /* ------------------------------------------------------------------ */
 /** Returns true if the player died. */
 function tickProjectiles(s: GameState): boolean {
  const boss = s.boss;
  for (const p of boss.projectiles) {
    p.x -= p.speed;
    if (!p.evaded && p.x + PROJ_SIZE < CHAR_X) {
      p.evaded = true;
      boss.shotsEvaded++;
    }
    // Collision
    if (
      !p.evaded &&
      CHAR_X + CHAR_SIZE > p.x &&
      CHAR_X < p.x + PROJ_SIZE &&
      s.charY + CHAR_SIZE > p.y &&
      s.charY < p.y + PROJ_SIZE
    ) {
      gameOver(s);
      return true;
    }
  }
  boss.projectiles = boss.projectiles.filter((p) => p.x + PROJ_SIZE > -20);
  return false;
 }
 /* ------------------------------------------------------------------ */
 /*  Update                                                            */
 /* ------------------------------------------------------------------ */
 function update(s: GameState, canvasWidth: number) {
  if (!s.running) return;
  s.frame++;
  // Speed only ramps during regular play
  if (s.boss.phase === "inactive") {
    s.speed = BASE_SPEED + s.frame * SPEED_INCREMENT;
  }
  // ---- Character physics (always active) ---- //
  s.vy += GRAVITY;
  s.charY += s.vy;
  if (s.charY + CHAR_SIZE >= s.groundY) {
    s.charY = s.groundY - CHAR_SIZE;
    s.vy = 0;
  }
  // ---- Trigger boss ---- //
  if (s.boss.phase === "inactive" && s.score >= s.bossThreshold) {
    s.boss.phase = "entering";
    s.boss.x = canvasWidth + 10;
    s.boss.targetX = canvasWidth - BOSS_SIZE - 40;
    s.boss.shotsEvaded = 0;
    s.boss.cooldown = BOSS_SHOOT_COOLDOWN;
    s.boss.projectiles = [];
    s.obstacles = [];
  }
  // ---- Boss: entering ---- //
  if (s.boss.phase === "entering") {
    s.boss.bob = Math.sin(s.frame * 0.05) * 3;
    s.boss.x -= BOSS_ENTER_SPEED;
    if (s.boss.x <= s.boss.targetX) {
      s.boss.x = s.boss.targetX;
      s.boss.phase = "fighting";
    }
    return; // no obstacles while entering
  }
  // ---- Boss: fighting ---- //
  if (s.boss.phase === "fighting") {
    s.boss.bob = Math.sin(s.frame * 0.05) * 3;
    // Shoot
    s.boss.cooldown--;
    if (s.boss.cooldown <= 0) {
      const isLow = Math.random() < 0.5;
      s.boss.projectiles.push({
        x: s.boss.x - PROJ_SIZE,
        y: isLow ? s.groundY - 14 : s.groundY - 70,
        speed: PROJ_SPEED,
        evaded: false,
        type: isLow ? "low" : "high",
      });
      s.boss.cooldown = BOSS_SHOOT_COOLDOWN;
    }
    if (tickProjectiles(s)) return;
    // Boss defeated?
    if (s.boss.shotsEvaded >= BOSS_SHOTS_TO_EVADE) {
      s.boss.phase = "leaving";
      s.score += 5; // bonus
      s.bossThreshold = s.score + BOSS_INTERVAL;
    }
    return;
  }
  // ---- Boss: leaving ---- //
  if (s.boss.phase === "leaving") {
    s.boss.bob = Math.sin(s.frame * 0.05) * 3;
    s.boss.x += BOSS_LEAVE_SPEED;
    // Still check in-flight projectiles
    if (tickProjectiles(s)) return;
    if (s.boss.x > canvasWidth + 50) {
      s.boss = makeBoss();
      s.nextSpawn = s.frame + randInt(SPAWN_MIN / 2, SPAWN_MAX / 2);
    }
    return;
  }
  // ---- Regular obstacle play ---- //
  if (s.frame >= s.nextSpawn) {
    s.obstacles.push({
      x: canvasWidth + 10,
      width: randInt(10, 16),
      height: randInt(20, 48),
      scored: false,
    });
    s.nextSpawn = s.frame + randInt(SPAWN_MIN, SPAWN_MAX);
  }
  for (const o of s.obstacles) {
    o.x -= s.speed;
    if (!o.scored && o.x + o.width < CHAR_X) {
      o.scored = true;
      s.score++;
    }
  }
  s.obstacles = s.obstacles.filter((o) => o.x + o.width > -20);
  for (const o of s.obstacles) {
    const oY = s.groundY - o.height;
    if (
      CHAR_X + CHAR_SIZE > o.x &&
      CHAR_X < o.x + o.width &&
      s.charY + CHAR_SIZE > oY
    ) {
      gameOver(s);
      return;
    }
  }
 }
 /* ------------------------------------------------------------------ */
 /*  Drawing                                                           */
 /* ------------------------------------------------------------------ */
 function drawBoss(ctx: CanvasRenderingContext2D, s: GameState, bg: string) {
  const bx = s.boss.x;
  const by = s.groundY - BOSS_SIZE + s.boss.bob;
  // Body
  ctx.save();
  ctx.fillStyle = COLOR_BOSS;
  ctx.globalAlpha = 0.9;
  ctx.beginPath();
  ctx.roundRect(bx, by, BOSS_SIZE, BOSS_SIZE, 4);
  ctx.fill();
  ctx.restore();
  // Eyes
  ctx.save();
  ctx.fillStyle = bg;
  const eyeY = by + 13;
  ctx.beginPath();
  ctx.arc(bx + 10, eyeY, 4, 0, Math.PI * 2);
  ctx.fill();
  ctx.beginPath();
  ctx.arc(bx + 26, eyeY, 4, 0, Math.PI * 2);
  ctx.fill();
  ctx.restore();
  // Angry eyebrows
  ctx.save();
  ctx.strokeStyle = bg;
  ctx.lineWidth = 2;
  ctx.beginPath();
  ctx.moveTo(bx + 5, eyeY - 7);
  ctx.lineTo(bx + 14, eyeY - 4);
  ctx.stroke();
  ctx.beginPath();
  ctx.moveTo(bx + 31, eyeY - 7);
  ctx.lineTo(bx + 22, eyeY - 4);
  ctx.stroke();
  ctx.restore();
  // Zigzag mouth
  ctx.save();
  ctx.strokeStyle = bg;
  ctx.lineWidth = 1.5;
  ctx.beginPath();
  ctx.moveTo(bx + 10, by + 27);
  ctx.lineTo(bx + 14, by + 24);
  ctx.lineTo(bx + 18, by + 27);
  ctx.lineTo(bx + 22, by + 24);
  ctx.lineTo(bx + 26, by + 27);
  ctx.stroke();
  ctx.restore();
 }
 function drawProjectiles(ctx: CanvasRenderingContext2D, boss: BossState) {
  ctx.save();
  ctx.fillStyle = COLOR_BOSS;
  ctx.globalAlpha = 0.8;
  for (const p of boss.projectiles) {
    if (p.evaded) continue;
    ctx.beginPath();
    ctx.arc(
      p.x + PROJ_SIZE / 2,
      p.y + PROJ_SIZE / 2,
      PROJ_SIZE / 2,
      0,
      Math.PI * 2,
    );
    ctx.fill();
  }
  ctx.restore();
 }
 function draw(
  ctx: CanvasRenderingContext2D,
  s: GameState,
  w: number,
  h: number,
  fg: string,
  started: boolean,
 ) {
  ctx.fillStyle = COLOR_BG;
  ctx.fillRect(0, 0, w, h);
  // Ground
  ctx.save();
  ctx.strokeStyle = fg;
  ctx.globalAlpha = 0.15;
  ctx.setLineDash([4, 4]);
  ctx.beginPath();
  ctx.moveTo(0, s.groundY);
  ctx.lineTo(w, s.groundY);
  ctx.stroke();
  ctx.restore();
  // Character
  ctx.save();
  ctx.fillStyle = COLOR_CHAR;
  ctx.globalAlpha = 0.85;
  ctx.beginPath();
  ctx.roundRect(CHAR_X, s.charY, CHAR_SIZE, CHAR_SIZE, 3);
  ctx.fill();
  ctx.restore();
  // Eyes
  ctx.save();
  ctx.fillStyle = COLOR_BG;
  ctx.beginPath();
  ctx.arc(CHAR_X + 6, s.charY + 7, 2.5, 0, Math.PI * 2);
  ctx.fill();
  ctx.beginPath();
  ctx.arc(CHAR_X + 12, s.charY + 7, 2.5, 0, Math.PI * 2);
  ctx.fill();
  ctx.restore();
  // Obstacles
  ctx.save();
  ctx.fillStyle = fg;
  ctx.globalAlpha = 0.55;
  for (const o of s.obstacles) {
    ctx.fillRect(o.x, s.groundY - o.height, o.width, o.height);
  }
  ctx.restore();
  // Boss + projectiles
  if (s.boss.phase !== "inactive") {
    drawBoss(ctx, s, COLOR_BG);
    drawProjectiles(ctx, s.boss);
  }
  // Score HUD
  ctx.save();
  ctx.fillStyle = fg;
  ctx.globalAlpha = 0.5;
  ctx.font = "bold 11px monospace";
  ctx.textAlign = "right";
  ctx.fillText(`Score: ${s.score}`, w - 12, 20);
  ctx.fillText(`Best: ${s.highScore}`, w - 12, 34);
  if (s.boss.phase === "fighting") {
    ctx.fillText(
      `Evade: ${s.boss.shotsEvaded}/${BOSS_SHOTS_TO_EVADE}`,
      w - 12,
      48,
    );
  }
  ctx.restore();
  // Prompts
  if (!started && !s.running && !s.over) {
    ctx.save();
    ctx.fillStyle = fg;
    ctx.globalAlpha = 0.5;
    ctx.font = "12px sans-serif";
    ctx.textAlign = "center";
    ctx.fillText("Click or press Space to play while you wait", w / 2, h / 2);
    ctx.restore();
  }
  if (s.over) {
    ctx.save();
    ctx.fillStyle = fg;
    ctx.globalAlpha = 0.7;
    ctx.font = "bold 13px sans-serif";
    ctx.textAlign = "center";
    ctx.fillText("Game Over", w / 2, h / 2 - 8);
    ctx.font = "11px sans-serif";
    ctx.fillText("Click or Space to restart", w / 2, h / 2 + 10);
    ctx.restore();
  }
 }
 /* ------------------------------------------------------------------ */
 /*  Hook                                                              */
 /* ------------------------------------------------------------------ */
 export function useMiniGame() {
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const stateRef = useRef<GameState | null>(null);
  const rafRef = useRef(0);
  const startedRef = useRef(false);
  useEffect(() => {
    const canvas = canvasRef.current;
    if (!canvas) return;
    const container = canvas.parentElement;
    if (container) {
      canvas.width = container.clientWidth;
      canvas.height = CANVAS_HEIGHT;
    }
    const groundY = canvas.height - GROUND_PAD;
    stateRef.current = makeState(groundY);
    const style = getComputedStyle(canvas);
    let fg = style.color || "#71717a";
    // -------------------------------------------------------------- //
    //  Jump                                                           //
    // -------------------------------------------------------------- //
    function jump() {
      const s = stateRef.current;
      if (!s) return;
      if (s.over) {
        const hs = s.highScore;
        const gy = s.groundY;
        stateRef.current = makeState(gy);
        stateRef.current.highScore = hs;
        stateRef.current.running = true;
        startedRef.current = true;
        return;
      }
      if (!s.running) {
        s.running = true;
        startedRef.current = true;
        return;
      }
      // Only jump when on the ground
      if (s.charY + CHAR_SIZE >= s.groundY) {
        s.vy = JUMP_FORCE;
      }
    }
    function onKey(e: KeyboardEvent) {
      if (e.code === "Space" || e.key === " ") {
        e.preventDefault();
        jump();
      }
    }
    function onClick() {
      canvas?.focus();
      jump();
    }
    // -------------------------------------------------------------- //
    //  Loop                                                           //
    // -------------------------------------------------------------- //
    function loop() {
      const s = stateRef.current;
      if (!canvas || !s) return;
      const ctx = canvas.getContext("2d");
      if (!ctx) return;
      update(s, canvas.width);
      draw(ctx, s, canvas.width, canvas.height, fg, startedRef.current);
      rafRef.current = requestAnimationFrame(loop);
    }
    rafRef.current = requestAnimationFrame(loop);
    canvas.addEventListener("click", onClick);
    canvas.addEventListener("keydown", onKey);
    const observer = new ResizeObserver((entries) => {
      for (const entry of entries) {
        canvas.width = entry.contentRect.width;
        canvas.height = CANVAS_HEIGHT;
        if (stateRef.current) {
          stateRef.current.groundY = canvas.height - GROUND_PAD;
        }
        const cs = getComputedStyle(canvas);
        fg = cs.color || fg;
      }
    });
    if (container) observer.observe(container);
    return () => {
      cancelAnimationFrame(rafRef.current);
      canvas.removeEventListener("click", onClick);
      canvas.removeEventListener("keydown", onKey);
      observer.disconnect();
    };
  }, []);
  return { canvasRef };
 }
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/tools/GenericTool/GenericTool.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/tools/GenericTool/GenericTool.tsx
@@ -1,63 +0,0 @@
 "use client";
 import { ToolUIPart } from "ai";
 import { GearIcon } from "@phosphor-icons/react";
 import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
 interface Props {
  part: ToolUIPart;
 }
 function extractToolName(part: ToolUIPart): string {
  // ToolUIPart.type is "tool-{name}", extract the name portion.
  return part.type.replace(/^tool-/, "");
 }
 function formatToolName(name: string): string {
  // "search_docs" → "Search docs", "Read" → "Read"
  return name.replace(/_/g, " ").replace(/^\w/, (c) => c.toUpperCase());
 }
 function getAnimationText(part: ToolUIPart): string {
  const label = formatToolName(extractToolName(part));
  switch (part.state) {
    case "input-streaming":
    case "input-available":
      return `Running ${label}…`;
    case "output-available":
      return `${label} completed`;
    case "output-error":
      return `${label} failed`;
    default:
      return `Running ${label}…`;
  }
 }
 export function GenericTool({ part }: Props) {
  const isStreaming =
    part.state === "input-streaming" || part.state === "input-available";
  const isError = part.state === "output-error";
  return (
    <div className="py-2">
      <div className="flex items-center gap-2 text-sm text-muted-foreground">
        <GearIcon
          size={14}
          weight="regular"
          className={
            isError
              ? "text-red-500"
              : isStreaming
                ? "animate-spin text-neutral-500"
                : "text-neutral-400"
          }
        />
        <MorphingTextAnimation
          text={getAnimationText(part)}
          className={isError ? "text-red-500" : undefined}
        />
      </div>
    </div>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotPage.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotPage.ts
@@ -1,10 +1,14 @@
 import { useGetV2ListSessions } from "@/app/api/__generated__/endpoints/chat/chat";
 import { toast } from "@/components/molecules/Toast/use-toast";
 import { useBreakpoint } from "@/lib/hooks/useBreakpoint";
 import { useSupabase } from "@/lib/supabase/hooks/useSupabase";
 import { useChat } from "@ai-sdk/react";
 import { DefaultChatTransport } from "ai";
-import { useEffect, useMemo, useState } from "react";
+import { useEffect, useMemo, useRef, useState } from "react";
 import { useChatSession } from "./useChatSession";
 import { useLongRunningToolPolling } from "./hooks/useLongRunningToolPolling";
 const STREAM_START_TIMEOUT_MS = 12_000;
 export function useCopilotPage() {
  const { isUserLoading, isLoggedIn } = useSupabase();
@@ -52,6 +56,24 @@ export function useCopilotPage() {
    transport: transport ?? undefined,
  });
  // Abort the stream if the backend doesn't start sending data within 12s.
  const stopRef = useRef(stop);
  stopRef.current = stop;
  useEffect(() => {
    if (status !== "submitted") return;
    const timer = setTimeout(() => {
      stopRef.current();
      toast({
        title: "Stream timed out",
        description: "The server took too long to respond. Please try again.",
        variant: "destructive",
      });
    }, STREAM_START_TIMEOUT_MS);
    return () => clearTimeout(timer);
  }, [status]);
  useEffect(() => {
    if (!hydratedMessages || hydratedMessages.length === 0) return;
    setMessages((prev) => {
@@ -60,6 +82,11 @@ export function useCopilotPage() {
    });
  }, [hydratedMessages, setMessages]);
  // Poll session endpoint when a long-running tool (create_agent, edit_agent)
  // is in progress. When the backend completes, the session data will contain
  // the final tool output — this hook detects the change and updates messages.
  useLongRunningToolPolling(sessionId, messages, setMessages);
  // Clear messages when session is null
  useEffect(() => {
    if (!sessionId) setMessages([]);
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/NewAgentLibraryView/components/sidebar/SidebarRunsList/components/ScheduleListItem.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/NewAgentLibraryView/components/sidebar/SidebarRunsList/components/ScheduleListItem.tsx
@@ -29,6 +29,7 @@ export function ScheduleListItem({
      description={formatDistanceToNow(schedule.next_run_time, {
        addSuffix: true,
      })}
      descriptionTitle={new Date(schedule.next_run_time).toString()}
      onClick={onClick}
      selected={selected}
      icon={
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/NewAgentLibraryView/components/sidebar/SidebarRunsList/components/SidebarItemCard.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/NewAgentLibraryView/components/sidebar/SidebarRunsList/components/SidebarItemCard.tsx
@@ -7,6 +7,7 @@ import React from "react";
 interface Props {
  title: string;
  description?: string;
  descriptionTitle?: string;
  icon?: React.ReactNode;
  selected?: boolean;
  onClick?: () => void;
@@ -16,6 +17,7 @@ interface Props {
 export function SidebarItemCard({
  title,
  description,
  descriptionTitle,
  icon,
  selected,
  onClick,
@@ -38,7 +40,11 @@ export function SidebarItemCard({
          >
            {title}
          </Text>
-          <Text variant="body" className="leading-tight !text-zinc-500">
+          <Text
            variant="body"
            className="leading-tight !text-zinc-500"
            title={descriptionTitle}
          >
            {description}
          </Text>
        </div>
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/NewAgentLibraryView/components/sidebar/SidebarRunsList/components/TaskListItem.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/NewAgentLibraryView/components/sidebar/SidebarRunsList/components/TaskListItem.tsx
@@ -81,6 +81,9 @@ export function TaskListItem({
          ? formatDistanceToNow(run.started_at, { addSuffix: true })
          : "—"
      }
      descriptionTitle={
        run.started_at ? new Date(run.started_at).toString() : undefined
      }
      onClick={onClick}
      selected={selected}
      actions={
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/OldAgentLibraryView.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/OldAgentLibraryView.tsx
@@ -1,631 +0,0 @@
 "use client";
 import { useParams, useRouter } from "next/navigation";
 import { useQueryState } from "nuqs";
 import React, {
  useCallback,
  useEffect,
  useMemo,
  useRef,
  useState,
 } from "react";
 import {
  Graph,
  GraphExecution,
  GraphExecutionID,
  GraphExecutionMeta,
  GraphID,
  LibraryAgent,
  LibraryAgentID,
  LibraryAgentPreset,
  LibraryAgentPresetID,
  Schedule,
  ScheduleID,
 } from "@/lib/autogpt-server-api";
 import { useBackendAPI } from "@/lib/autogpt-server-api/context";
 import { exportAsJSONFile } from "@/lib/utils";
 import DeleteConfirmDialog from "@/components/__legacy__/delete-confirm-dialog";
 import type { ButtonAction } from "@/components/__legacy__/types";
 import { Button } from "@/components/__legacy__/ui/button";
 import {
  Dialog,
  DialogContent,
  DialogDescription,
  DialogFooter,
  DialogHeader,
  DialogTitle,
 } from "@/components/__legacy__/ui/dialog";
 import LoadingBox, { LoadingSpinner } from "@/components/__legacy__/ui/loading";
 import {
  useToast,
  useToastOnFail,
 } from "@/components/molecules/Toast/use-toast";
 import { AgentRunDetailsView } from "./components/agent-run-details-view";
 import { AgentRunDraftView } from "./components/agent-run-draft-view";
 import { CreatePresetDialog } from "./components/create-preset-dialog";
 import { useAgentRunsInfinite } from "./use-agent-runs";
 import { AgentRunsSelectorList } from "./components/agent-runs-selector-list";
 import { AgentScheduleDetailsView } from "./components/agent-schedule-details-view";
 export function OldAgentLibraryView() {
  const { id: agentID }: { id: LibraryAgentID } = useParams();
  const [executionId, setExecutionId] = useQueryState("executionId");
  const toastOnFail = useToastOnFail();
  const { toast } = useToast();
  const router = useRouter();
  const api = useBackendAPI();
  // ============================ STATE =============================
  const [graph, setGraph] = useState<Graph | null>(null); // Graph version corresponding to LibraryAgent
  const [agent, setAgent] = useState<LibraryAgent | null>(null);
  const agentRunsQuery = useAgentRunsInfinite(graph?.id); // only runs once graph.id is known
  const agentRuns = agentRunsQuery.agentRuns;
  const [agentPresets, setAgentPresets] = useState<LibraryAgentPreset[]>([]);
  const [schedules, setSchedules] = useState<Schedule[]>([]);
  const [selectedView, selectView] = useState<
    | { type: "run"; id?: GraphExecutionID }
    | { type: "preset"; id: LibraryAgentPresetID }
    | { type: "schedule"; id: ScheduleID }
  >({ type: "run" });
  const [selectedRun, setSelectedRun] = useState<
    GraphExecution | GraphExecutionMeta | null
  >(null);
  const selectedSchedule =
    selectedView.type == "schedule"
      ? schedules.find((s) => s.id == selectedView.id)
      : null;
  const [isFirstLoad, setIsFirstLoad] = useState<boolean>(true);
  const [agentDeleteDialogOpen, setAgentDeleteDialogOpen] =
    useState<boolean>(false);
  const [confirmingDeleteAgentRun, setConfirmingDeleteAgentRun] =
    useState<GraphExecutionMeta | null>(null);
  const [confirmingDeleteAgentPreset, setConfirmingDeleteAgentPreset] =
    useState<LibraryAgentPresetID | null>(null);
  const [copyAgentDialogOpen, setCopyAgentDialogOpen] = useState(false);
  const [creatingPresetFromExecutionID, setCreatingPresetFromExecutionID] =
    useState<GraphExecutionID | null>(null);
  // Set page title with agent name
  useEffect(() => {
    if (agent) {
      document.title = `${agent.name} - Library - AutoGPT Platform`;
    }
  }, [agent]);
  const openRunDraftView = useCallback(() => {
    selectView({ type: "run" });
  }, []);
  const selectRun = useCallback((id: GraphExecutionID) => {
    selectView({ type: "run", id });
  }, []);
  const selectPreset = useCallback((id: LibraryAgentPresetID) => {
    selectView({ type: "preset", id });
  }, []);
  const selectSchedule = useCallback((id: ScheduleID) => {
    selectView({ type: "schedule", id });
  }, []);
  const graphVersions = useRef<Record<number, Graph>>({});
  const loadingGraphVersions = useRef<Record<number, Promise<Graph>>>({});
  const getGraphVersion = useCallback(
    async (graphID: GraphID, version: number) => {
      if (version in graphVersions.current)
        return graphVersions.current[version];
      if (version in loadingGraphVersions.current)
        return loadingGraphVersions.current[version];
      const pendingGraph = api.getGraph(graphID, version).then((graph) => {
        graphVersions.current[version] = graph;
        return graph;
      });
      // Cache promise as well to avoid duplicate requests
      loadingGraphVersions.current[version] = pendingGraph;
      return pendingGraph;
    },
    [api, graphVersions, loadingGraphVersions],
  );
  const lastRefresh = useRef<number>(0);
  const refreshPageData = useCallback(() => {
    if (Date.now() - lastRefresh.current < 2e3) return; // 2 second debounce
    lastRefresh.current = Date.now();
    api.getLibraryAgent(agentID).then((agent) => {
      setAgent(agent);
      getGraphVersion(agent.graph_id, agent.graph_version).then(
        (_graph) =>
          (graph && graph.version == _graph.version) || setGraph(_graph),
      );
      Promise.all([
        agentRunsQuery.refetchRuns(),
        api.listLibraryAgentPresets({
          graph_id: agent.graph_id,
          page_size: 100,
        }),
      ]).then(([runsQueryResult, presets]) => {
        setAgentPresets(presets.presets);
        const newestAgentRunsResponse = runsQueryResult.data?.pages[0];
        if (!newestAgentRunsResponse || newestAgentRunsResponse.status != 200)
          return;
        const newestAgentRuns = newestAgentRunsResponse.data.executions;
        // Preload the corresponding graph versions for the latest 10 runs
        new Set(
          newestAgentRuns.slice(0, 10).map((run) => run.graph_version),
        ).forEach((version) => getGraphVersion(agent.graph_id, version));
      });
    });
  }, [api, agentID, getGraphVersion, graph]);
  // On first load: select the latest run
  useEffect(() => {
    // Only for first load or first execution
    if (selectedView.id || !isFirstLoad) return;
    if (agentRuns.length == 0 && agentPresets.length == 0) return;
    setIsFirstLoad(false);
    if (agentRuns.length > 0) {
      // select latest run
      const latestRun = agentRuns.reduce((latest, current) => {
        if (!latest.started_at && !current.started_at) return latest;
        if (!latest.started_at) return current;
        if (!current.started_at) return latest;
        return latest.started_at > current.started_at ? latest : current;
      }, agentRuns[0]);
      selectRun(latestRun.id as GraphExecutionID);
    } else {
      // select top preset
      const latestPreset = agentPresets.toSorted(
        (a, b) => b.updated_at.getTime() - a.updated_at.getTime(),
      )[0];
      selectPreset(latestPreset.id);
    }
  }, [
    isFirstLoad,
    selectedView.id,
    agentRuns,
    agentPresets,
    selectRun,
    selectPreset,
  ]);
  useEffect(() => {
    if (executionId) {
      selectRun(executionId as GraphExecutionID);
      setExecutionId(null);
    }
  }, [executionId, selectRun, setExecutionId]);
  // Initial load
  useEffect(() => {
    refreshPageData();
    // Show a toast when the WebSocket connection disconnects
    let connectionToast: ReturnType<typeof toast> | null = null;
    const cancelDisconnectHandler = api.onWebSocketDisconnect(() => {
      connectionToast ??= toast({
        title: "Connection to server was lost",
        variant: "destructive",
        description: (
          <div className="flex items-center">
            Trying to reconnect...
            <LoadingSpinner className="ml-1.5 size-3.5" />
          </div>
        ),
        duration: Infinity,
        dismissable: true,
      });
    });
    const cancelConnectHandler = api.onWebSocketConnect(() => {
      if (connectionToast)
        connectionToast.update({
          id: connectionToast.id,
          title: "✅ Connection re-established",
          variant: "default",
          description: (
            <div className="flex items-center">
              Refreshing data...
              <LoadingSpinner className="ml-1.5 size-3.5" />
            </div>
          ),
          duration: 2000,
          dismissable: true,
        });
      connectionToast = null;
    });
    return () => {
      cancelDisconnectHandler();
      cancelConnectHandler();
    };
  }, []);
  // Subscribe to WebSocket updates for agent runs
  useEffect(() => {
    if (!agent?.graph_id) return;
    return api.onWebSocketConnect(() => {
      refreshPageData(); // Sync up on (re)connect
      // Subscribe to all executions for this agent
      api.subscribeToGraphExecutions(agent.graph_id);
    });
  }, [api, agent?.graph_id, refreshPageData]);
  // Handle execution updates
  useEffect(() => {
    const detachExecUpdateHandler = api.onWebSocketMessage(
      "graph_execution_event",
      (data) => {
        if (data.graph_id != agent?.graph_id) return;
        agentRunsQuery.upsertAgentRun(data);
        if (data.id === selectedView.id) {
          // Update currently viewed run
          setSelectedRun(data);
        }
      },
    );
    return () => {
      detachExecUpdateHandler();
    };
  }, [api, agent?.graph_id, selectedView.id]);
  // Pre-load selectedRun based on selectedView
  useEffect(() => {
    if (selectedView.type != "run" || !selectedView.id) return;
    const newSelectedRun = agentRuns.find((run) => run.id == selectedView.id);
    if (selectedView.id !== selectedRun?.id) {
      // Pull partial data from "cache" while waiting for the rest to load
      setSelectedRun((newSelectedRun as GraphExecutionMeta) ?? null);
    }
  }, [api, selectedView, agentRuns, selectedRun?.id]);
  // Load selectedRun based on selectedView; refresh on agent refresh
  useEffect(() => {
    if (selectedView.type != "run" || !selectedView.id || !agent) return;
    api
      .getGraphExecutionInfo(agent.graph_id, selectedView.id)
      .then(async (run) => {
        // Ensure corresponding graph version is available before rendering I/O
        await getGraphVersion(run.graph_id, run.graph_version);
        setSelectedRun(run);
      });
  }, [api, selectedView, agent, getGraphVersion]);
  const fetchSchedules = useCallback(async () => {
    if (!agent) return;
    setSchedules(await api.listGraphExecutionSchedules(agent.graph_id));
  }, [api, agent?.graph_id]);
  useEffect(() => {
    fetchSchedules();
  }, [fetchSchedules]);
  // =========================== ACTIONS ============================
  const deleteRun = useCallback(
    async (run: GraphExecutionMeta) => {
      if (run.status == "RUNNING" || run.status == "QUEUED") {
        await api.stopGraphExecution(run.graph_id, run.id);
      }
      await api.deleteGraphExecution(run.id);
      setConfirmingDeleteAgentRun(null);
      if (selectedView.type == "run" && selectedView.id == run.id) {
        openRunDraftView();
      }
      agentRunsQuery.removeAgentRun(run.id);
    },
    [api, selectedView, openRunDraftView],
  );
  const deletePreset = useCallback(
    async (presetID: LibraryAgentPresetID) => {
      await api.deleteLibraryAgentPreset(presetID);
      setConfirmingDeleteAgentPreset(null);
      if (selectedView.type == "preset" && selectedView.id == presetID) {
        openRunDraftView();
      }
      setAgentPresets((presets) => presets.filter((p) => p.id !== presetID));
    },
    [api, selectedView, openRunDraftView],
  );
  const deleteSchedule = useCallback(
    async (scheduleID: ScheduleID) => {
      const removedSchedule =
        await api.deleteGraphExecutionSchedule(scheduleID);
      setSchedules((schedules) => {
        const newSchedules = schedules.filter(
          (s) => s.id !== removedSchedule.id,
        );
        if (
          selectedView.type == "schedule" &&
          selectedView.id == removedSchedule.id
        ) {
          if (newSchedules.length > 0) {
            // Select next schedule if available
            selectSchedule(newSchedules[0].id);
          } else {
            // Reset to draft view if current schedule was deleted
            openRunDraftView();
          }
        }
        return newSchedules;
      });
      openRunDraftView();
    },
    [schedules, api],
  );
  const handleCreatePresetFromRun = useCallback(
    async (name: string, description: string) => {
      if (!creatingPresetFromExecutionID) return;
      await api
        .createLibraryAgentPreset({
          name,
          description,
          graph_execution_id: creatingPresetFromExecutionID,
        })
        .then((preset) => {
          setAgentPresets((prev) => [...prev, preset]);
          selectPreset(preset.id);
          setCreatingPresetFromExecutionID(null);
        })
        .catch(toastOnFail("create a preset"));
    },
    [api, creatingPresetFromExecutionID, selectPreset, toast],
  );
  const downloadGraph = useCallback(
    async () =>
      agent &&
      // Export sanitized graph from backend
      api
        .getGraph(agent.graph_id, agent.graph_version, true)
        .then((graph) =>
          exportAsJSONFile(graph, `${graph.name}_v${graph.version}.json`),
        ),
    [api, agent],
  );
  const copyAgent = useCallback(async () => {
    setCopyAgentDialogOpen(false);
    api
      .forkLibraryAgent(agentID)
      .then((newAgent) => {
        router.push(`/library/agents/${newAgent.id}`);
      })
      .catch((error) => {
        console.error("Error copying agent:", error);
        toast({
          title: "Error copying agent",
          description: `An error occurred while copying the agent: ${error.message}`,
          variant: "destructive",
        });
      });
  }, [agentID, api, router, toast]);
  const agentActions: ButtonAction[] = useMemo(
    () => [
      {
        label: "Customize agent",
        href: `/build?flowID=${agent?.graph_id}&flowVersion=${agent?.graph_version}`,
        disabled: !agent?.can_access_graph,
      },
      { label: "Export agent to file", callback: downloadGraph },
      ...(!agent?.can_access_graph
        ? [
            {
              label: "Edit a copy",
              callback: () => setCopyAgentDialogOpen(true),
            },
          ]
        : []),
      {
        label: "Delete agent",
        callback: () => setAgentDeleteDialogOpen(true),
      },
    ],
    [agent, downloadGraph],
  );
  const runGraph =
    graphVersions.current[selectedRun?.graph_version ?? 0] ?? graph;
  const onCreateSchedule = useCallback(
    (schedule: Schedule) => {
      setSchedules((prev) => [...prev, schedule]);
      selectSchedule(schedule.id);
    },
    [selectView],
  );
  const onCreatePreset = useCallback(
    (preset: LibraryAgentPreset) => {
      setAgentPresets((prev) => [...prev, preset]);
      selectPreset(preset.id);
    },
    [selectPreset],
  );
  const onUpdatePreset = useCallback(
    (updated: LibraryAgentPreset) => {
      setAgentPresets((prev) =>
        prev.map((p) => (p.id === updated.id ? updated : p)),
      );
      selectPreset(updated.id);
    },
    [selectPreset],
  );
  if (!agent || !graph) {
    return <LoadingBox className="h-[90vh]" />;
  }
  return (
    <div className="container justify-stretch p-0 pt-16 lg:flex">
      {/* Sidebar w/ list of runs */}
      {/* TODO: render this below header in sm and md layouts */}
      <AgentRunsSelectorList
        className="agpt-div w-full border-b pb-2 lg:w-auto lg:border-b-0 lg:border-r lg:pb-0"
        agent={agent}
        agentRunsQuery={agentRunsQuery}
        agentPresets={agentPresets}
        schedules={schedules}
        selectedView={selectedView}
        onSelectRun={selectRun}
        onSelectPreset={selectPreset}
        onSelectSchedule={selectSchedule}
        onSelectDraftNewRun={openRunDraftView}
        doDeleteRun={setConfirmingDeleteAgentRun}
        doDeletePreset={setConfirmingDeleteAgentPreset}
        doDeleteSchedule={deleteSchedule}
        doCreatePresetFromRun={setCreatingPresetFromExecutionID}
      />
      <div className="flex-1">
        {/* Header */}
        <div className="agpt-div w-full border-b">
          <h1
            data-testid="agent-title"
            className="font-poppins text-3xl font-medium"
          >
            {
              agent.name /* TODO: use dynamic/custom run title - https://github.com/Significant-Gravitas/AutoGPT/issues/9184 */
            }
          </h1>
        </div>
        {/* Run / Schedule views */}
        {(selectedView.type == "run" && selectedView.id ? (
          selectedRun && runGraph ? (
            <AgentRunDetailsView
              agent={agent}
              graph={runGraph}
              run={selectedRun}
              agentActions={agentActions}
              onRun={selectRun}
              doDeleteRun={() => setConfirmingDeleteAgentRun(selectedRun)}
              doCreatePresetFromRun={() =>
                setCreatingPresetFromExecutionID(selectedRun.id)
              }
            />
          ) : null
        ) : selectedView.type == "run" ? (
          /* Draft new runs / Create new presets */
          <AgentRunDraftView
            graph={graph}
            onRun={selectRun}
            onCreateSchedule={onCreateSchedule}
            onCreatePreset={onCreatePreset}
            agentActions={agentActions}
            recommendedScheduleCron={agent?.recommended_schedule_cron || null}
          />
        ) : selectedView.type == "preset" ? (
          /* Edit & update presets */
          <AgentRunDraftView
            graph={graph}
            agentPreset={
              agentPresets.find((preset) => preset.id == selectedView.id)!
            }
            onRun={selectRun}
            recommendedScheduleCron={agent?.recommended_schedule_cron || null}
            onCreateSchedule={onCreateSchedule}
            onUpdatePreset={onUpdatePreset}
            doDeletePreset={setConfirmingDeleteAgentPreset}
            agentActions={agentActions}
          />
        ) : selectedView.type == "schedule" ? (
          selectedSchedule &&
          graph && (
            <AgentScheduleDetailsView
              graph={graph}
              schedule={selectedSchedule}
              // agent={agent}
              agentActions={agentActions}
              onForcedRun={selectRun}
              doDeleteSchedule={deleteSchedule}
            />
          )
        ) : null) || <LoadingBox className="h-[70vh]" />}
        <DeleteConfirmDialog
          entityType="agent"
          open={agentDeleteDialogOpen}
          onOpenChange={setAgentDeleteDialogOpen}
          onDoDelete={() =>
            agent &&
            api.deleteLibraryAgent(agent.id).then(() => router.push("/library"))
          }
        />
        <DeleteConfirmDialog
          entityType="agent run"
          open={!!confirmingDeleteAgentRun}
          onOpenChange={(open) => !open && setConfirmingDeleteAgentRun(null)}
          onDoDelete={() =>
            confirmingDeleteAgentRun && deleteRun(confirmingDeleteAgentRun)
          }
        />
        <DeleteConfirmDialog
          entityType={agent.has_external_trigger ? "trigger" : "agent preset"}
          open={!!confirmingDeleteAgentPreset}
          onOpenChange={(open) => !open && setConfirmingDeleteAgentPreset(null)}
          onDoDelete={() =>
            confirmingDeleteAgentPreset &&
            deletePreset(confirmingDeleteAgentPreset)
          }
        />
        {/* Copy agent confirmation dialog */}
        <Dialog
          onOpenChange={setCopyAgentDialogOpen}
          open={copyAgentDialogOpen}
        >
          <DialogContent>
            <DialogHeader>
              <DialogTitle>You&apos;re making an editable copy</DialogTitle>
              <DialogDescription className="pt-2">
                The original Marketplace agent stays the same and cannot be
                edited. We&apos;ll save a new version of this agent to your
                Library. From there, you can customize it however you&apos;d
                like by clicking &quot;Customize agent&quot; — this will open
                the builder where you can see and modify the inner workings.
              </DialogDescription>
            </DialogHeader>
            <DialogFooter className="justify-end">
              <Button
                type="button"
                variant="outline"
                onClick={() => setCopyAgentDialogOpen(false)}
              >
                Cancel
              </Button>
              <Button type="button" onClick={copyAgent}>
                Continue
              </Button>
            </DialogFooter>
          </DialogContent>
        </Dialog>
        <CreatePresetDialog
          open={!!creatingPresetFromExecutionID}
          onOpenChange={() => setCreatingPresetFromExecutionID(null)}
          onConfirm={handleCreatePresetFromRun}
        />
      </div>
    </div>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-details-view.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-details-view.tsx
@@ -1,445 +0,0 @@
 "use client";
 import { format, formatDistanceToNow, formatDistanceStrict } from "date-fns";
 import React, { useCallback, useMemo, useEffect } from "react";
 import {
  Graph,
  GraphExecution,
  GraphExecutionID,
  GraphExecutionMeta,
  LibraryAgent,
 } from "@/lib/autogpt-server-api";
 import { useBackendAPI } from "@/lib/autogpt-server-api/context";
 import ActionButtonGroup from "@/components/__legacy__/action-button-group";
 import type { ButtonAction } from "@/components/__legacy__/types";
 import {
  Card,
  CardContent,
  CardHeader,
  CardTitle,
 } from "@/components/__legacy__/ui/card";
 import {
  IconRefresh,
  IconSquare,
  IconCircleAlert,
 } from "@/components/__legacy__/ui/icons";
 import { Input } from "@/components/__legacy__/ui/input";
 import LoadingBox from "@/components/__legacy__/ui/loading";
 import {
  Tooltip,
  TooltipContent,
  TooltipProvider,
  TooltipTrigger,
 } from "@/components/atoms/Tooltip/BaseTooltip";
 import { useToastOnFail } from "@/components/molecules/Toast/use-toast";
 import { AgentRunStatus, agentRunStatusMap } from "./agent-run-status-chip";
 import useCredits from "@/hooks/useCredits";
 import { AgentRunOutputView } from "./agent-run-output-view";
 import { analytics } from "@/services/analytics";
 import { PendingReviewsList } from "@/components/organisms/PendingReviewsList/PendingReviewsList";
 import { usePendingReviewsForExecution } from "@/hooks/usePendingReviews";
 export function AgentRunDetailsView({
  agent,
  graph,
  run,
  agentActions,
  onRun,
  doDeleteRun,
  doCreatePresetFromRun,
 }: {
  agent: LibraryAgent;
  graph: Graph;
  run: GraphExecution | GraphExecutionMeta;
  agentActions: ButtonAction[];
  onRun: (runID: GraphExecutionID) => void;
  doDeleteRun: () => void;
  doCreatePresetFromRun: () => void;
 }): React.ReactNode {
  const api = useBackendAPI();
  const { formatCredits } = useCredits();
  const runStatus: AgentRunStatus = useMemo(
    () => agentRunStatusMap[run.status],
    [run],
  );
  const {
    pendingReviews,
    isLoading: reviewsLoading,
    refetch: refetchReviews,
  } = usePendingReviewsForExecution(run.id);
  const toastOnFail = useToastOnFail();
  // Refetch pending reviews when execution status changes to REVIEW
  useEffect(() => {
    if (runStatus === "review" && run.id) {
      refetchReviews();
    }
  }, [runStatus, run.id, refetchReviews]);
  const infoStats: { label: string; value: React.ReactNode }[] = useMemo(() => {
    if (!run) return [];
    return [
      {
        label: "Status",
        value: runStatus.charAt(0).toUpperCase() + runStatus.slice(1),
      },
      {
        label: "Started",
        value: run.started_at
          ? `${formatDistanceToNow(run.started_at, { addSuffix: true })}, ${format(run.started_at, "HH:mm")}`
          : "—",
      },
      ...(run.stats
        ? [
            {
              label: "Duration",
              value: formatDistanceStrict(0, run.stats.duration * 1000),
            },
            { label: "Steps", value: run.stats.node_exec_count },
            { label: "Cost", value: formatCredits(run.stats.cost) },
          ]
        : []),
    ];
  }, [run, runStatus, formatCredits]);
  const agentRunInputs:
    | Record<
        string,
        {
          title?: string;
          /* type: BlockIOSubType; */
          value: string | number | undefined;
        }
      >
    | undefined = useMemo(() => {
    if (!run.inputs) return undefined;
    // TODO: show (link to) preset - https://github.com/Significant-Gravitas/AutoGPT/issues/9168
    // Add type info from agent input schema
    return Object.fromEntries(
      Object.entries(run.inputs).map(([k, v]) => [
        k,
        {
          title: graph.input_schema.properties[k]?.title,
          // type: graph.input_schema.properties[k].type, // TODO: implement typed graph inputs
          value: typeof v == "object" ? JSON.stringify(v, undefined, 2) : v,
        },
      ]),
    );
  }, [graph, run]);
  const runAgain = useCallback(() => {
    if (
      !run.inputs ||
      !(graph.credentials_input_schema?.required ?? []).every(
        (k) => k in (run.credential_inputs ?? {}),
      )
    )
      return;
    if (run.preset_id) {
      return api
        .executeLibraryAgentPreset(
          run.preset_id,
          run.inputs!,
          run.credential_inputs!,
        )
        .then(({ id }) => {
          analytics.sendDatafastEvent("run_agent", {
            name: graph.name,
            id: graph.id,
          });
          onRun(id);
        })
        .catch(toastOnFail("execute agent preset"));
    }
    return api
      .executeGraph(
        graph.id,
        graph.version,
        run.inputs!,
        run.credential_inputs!,
        "library",
      )
      .then(({ id }) => {
        analytics.sendDatafastEvent("run_agent", {
          name: graph.name,
          id: graph.id,
        });
        onRun(id);
      })
      .catch(toastOnFail("execute agent"));
  }, [api, graph, run, onRun, toastOnFail]);
  const stopRun = useCallback(
    () => api.stopGraphExecution(graph.id, run.id),
    [api, graph.id, run.id],
  );
  const agentRunOutputs:
    | Record<
        string,
        {
          title?: string;
          /* type: BlockIOSubType; */
          values: Array<React.ReactNode>;
        }
      >
    | null
    | undefined = useMemo(() => {
    if (!("outputs" in run)) return undefined;
    if (!["running", "success", "failed", "stopped"].includes(runStatus))
      return null;
    // Add type info from agent input schema
    return Object.fromEntries(
      Object.entries(run.outputs).map(([k, vv]) => [
        k,
        {
          title: graph.output_schema.properties[k].title,
          /* type: agent.output_schema.properties[k].type */
          values: vv.map((v) =>
            typeof v == "object" ? JSON.stringify(v, undefined, 2) : v,
          ),
        },
      ]),
    );
  }, [graph, run, runStatus]);
  const runActions: ButtonAction[] = useMemo(
    () => [
      ...(["running", "queued"].includes(runStatus)
        ? ([
            {
              label: (
                <>
                  <IconSquare className="mr-2 size-4" />
                  Stop run
                </>
              ),
              variant: "secondary",
              callback: stopRun,
            },
          ] satisfies ButtonAction[])
        : []),
      ...(["success", "failed", "stopped"].includes(runStatus) &&
      !graph.has_external_trigger &&
      (graph.credentials_input_schema?.required ?? []).every(
        (k) => k in (run.credential_inputs ?? {}),
      )
        ? [
            {
              label: (
                <>
                  <IconRefresh className="mr-2 size-4" />
                  Run again
                </>
              ),
              callback: runAgain,
              dataTestId: "run-again-button",
            },
          ]
        : []),
      ...(agent.can_access_graph
        ? [
            {
              label: "Open run in builder",
              href: `/build?flowID=${run.graph_id}&flowVersion=${run.graph_version}&flowExecutionID=${run.id}`,
            },
          ]
        : []),
      { label: "Create preset from run", callback: doCreatePresetFromRun },
      { label: "Delete run", variant: "secondary", callback: doDeleteRun },
    ],
    [
      runStatus,
      runAgain,
      stopRun,
      doDeleteRun,
      doCreatePresetFromRun,
      graph.has_external_trigger,
      graph.credentials_input_schema?.required,
      agent.can_access_graph,
      run.graph_id,
      run.graph_version,
      run.id,
    ],
  );
  return (
    <div className="agpt-div flex gap-6">
      <div className="flex flex-1 flex-col gap-4">
        <Card className="agpt-box">
          <CardHeader>
            <CardTitle className="font-poppins text-lg">Info</CardTitle>
          </CardHeader>
          <CardContent>
            <div className="flex justify-stretch gap-4">
              {infoStats.map(({ label, value }) => (
                <div key={label} className="flex-1">
                  <p className="text-sm font-medium text-black">{label}</p>
                  <p className="text-sm text-neutral-600">{value}</p>
                </div>
              ))}
            </div>
            {run.status === "FAILED" && (
              <div className="mt-4 rounded-md border border-red-200 bg-red-50 p-3 dark:border-red-800 dark:bg-red-900/20">
                <p className="text-sm text-red-800 dark:text-red-200">
                  <strong>Error:</strong>{" "}
                  {run.stats?.error ||
                    "The execution failed due to an internal error. You can re-run the agent to retry."}
                </p>
              </div>
            )}
          </CardContent>
        </Card>
        {/* Smart Agent Execution Summary */}
        {run.stats?.activity_status && (
          <Card className="agpt-box">
            <CardHeader>
              <CardTitle className="flex items-center gap-2 font-poppins text-lg">
                Task Summary
                <TooltipProvider>
                  <Tooltip>
                    <TooltipTrigger asChild>
                      <IconCircleAlert className="size-4 cursor-help text-neutral-500 hover:text-neutral-700" />
                    </TooltipTrigger>
                    <TooltipContent>
                      <p className="max-w-xs">
                        This AI-generated summary describes how the agent
                        handled your task. It’s an experimental feature and may
                        occasionally be inaccurate.
                      </p>
                    </TooltipContent>
                  </Tooltip>
                </TooltipProvider>
              </CardTitle>
            </CardHeader>
            <CardContent className="space-y-4">
              <p className="text-sm leading-relaxed text-neutral-700">
                {run.stats.activity_status}
              </p>
              {/* Correctness Score */}
              {typeof run.stats.correctness_score === "number" && (
                <div className="flex items-center gap-3 rounded-lg bg-neutral-50 p-3">
                  <div className="flex items-center gap-2">
                    <span className="text-sm font-medium text-neutral-600">
                      Success Estimate:
                    </span>
                    <div className="flex items-center gap-2">
                      <div className="relative h-2 w-16 overflow-hidden rounded-full bg-neutral-200">
                        <div
                          className={`h-full transition-all ${
                            run.stats.correctness_score >= 0.8
                              ? "bg-green-500"
                              : run.stats.correctness_score >= 0.6
                                ? "bg-yellow-500"
                                : run.stats.correctness_score >= 0.4
                                  ? "bg-orange-500"
                                  : "bg-red-500"
                          }`}
                          style={{
                            width: `${Math.round(run.stats.correctness_score * 100)}%`,
                          }}
                        />
                      </div>
                      <span className="text-sm font-medium">
                        {Math.round(run.stats.correctness_score * 100)}%
                      </span>
                    </div>
                  </div>
                  <TooltipProvider>
                    <Tooltip>
                      <TooltipTrigger asChild>
                        <IconCircleAlert className="size-4 cursor-help text-neutral-400 hover:text-neutral-600" />
                      </TooltipTrigger>
                      <TooltipContent>
                        <p className="max-w-xs">
                          AI-generated estimate of how well this execution
                          achieved its intended purpose. This score indicates
                          {run.stats.correctness_score >= 0.8
                            ? " the agent was highly successful."
                            : run.stats.correctness_score >= 0.6
                              ? " the agent was mostly successful with minor issues."
                              : run.stats.correctness_score >= 0.4
                                ? " the agent was partially successful with some gaps."
                                : " the agent had limited success with significant issues."}
                        </p>
                      </TooltipContent>
                    </Tooltip>
                  </TooltipProvider>
                </div>
              )}
            </CardContent>
          </Card>
        )}
        {agentRunOutputs !== null && (
          <AgentRunOutputView agentRunOutputs={agentRunOutputs} />
        )}
        {/* Pending Reviews Section */}
        {runStatus === "review" && (
          <Card className="agpt-box">
            <CardHeader>
              <CardTitle className="font-poppins text-lg">
                Pending Reviews ({pendingReviews.length})
              </CardTitle>
            </CardHeader>
            <CardContent>
              {reviewsLoading ? (
                <LoadingBox spinnerSize={12} className="h-24" />
              ) : pendingReviews.length > 0 ? (
                <PendingReviewsList
                  reviews={pendingReviews}
                  onReviewComplete={refetchReviews}
                  emptyMessage="No pending reviews for this execution"
                />
              ) : (
                <div className="py-4 text-neutral-600">
                  No pending reviews for this execution
                </div>
              )}
            </CardContent>
          </Card>
        )}
        <Card className="agpt-box">
          <CardHeader>
            <CardTitle className="font-poppins text-lg">Input</CardTitle>
          </CardHeader>
          <CardContent className="flex flex-col gap-4">
            {agentRunInputs !== undefined ? (
              Object.entries(agentRunInputs).map(([key, { title, value }]) => (
                <div key={key} className="flex flex-col gap-1.5">
                  <label className="text-sm font-medium">{title || key}</label>
                  <Input value={value} className="rounded-full" disabled />
                </div>
              ))
            ) : (
              <LoadingBox spinnerSize={12} className="h-24" />
            )}
          </CardContent>
        </Card>
      </div>
      {/* Run / Agent Actions */}
      <aside className="w-48 xl:w-56">
        <div className="flex flex-col gap-8">
          <ActionButtonGroup title="Run actions" actions={runActions} />
          <ActionButtonGroup title="Agent actions" actions={agentActions} />
        </div>
      </aside>
    </div>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-output-view.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-output-view.tsx
@@ -1,178 +0,0 @@
 "use client";
 import { Flag, useGetFlag } from "@/services/feature-flags/use-get-flag";
 import React, { useMemo } from "react";
 import {
  Card,
  CardContent,
  CardHeader,
  CardTitle,
 } from "@/components/__legacy__/ui/card";
 import LoadingBox from "@/components/__legacy__/ui/loading";
 import type { OutputMetadata } from "../../../../../../../../components/contextual/OutputRenderers";
 import {
  globalRegistry,
  OutputActions,
  OutputItem,
 } from "../../../../../../../../components/contextual/OutputRenderers";
 export function AgentRunOutputView({
  agentRunOutputs,
 }: {
  agentRunOutputs:
    | Record<
        string,
        {
          title?: string;
          /* type: BlockIOSubType; */
          values: Array<React.ReactNode>;
        }
      >
    | undefined;
 }) {
  const enableEnhancedOutputHandling = useGetFlag(
    Flag.ENABLE_ENHANCED_OUTPUT_HANDLING,
  );
  // Prepare items for the renderer system
  const outputItems = useMemo(() => {
    if (!agentRunOutputs) return [];
    const items: Array<{
      key: string;
      label: string;
      value: unknown;
      metadata?: OutputMetadata;
      renderer: any;
    }> = [];
    Object.entries(agentRunOutputs).forEach(([key, { title, values }]) => {
      values.forEach((value, index) => {
        // Enhanced metadata extraction
        const metadata: OutputMetadata = {};
        // Type guard to safely access properties
        if (
          typeof value === "object" &&
          value !== null &&
          !React.isValidElement(value)
        ) {
          const objValue = value as any;
          if (objValue.type) metadata.type = objValue.type;
          if (objValue.mimeType) metadata.mimeType = objValue.mimeType;
          if (objValue.filename) metadata.filename = objValue.filename;
        }
        const renderer = globalRegistry.getRenderer(value, metadata);
        if (renderer) {
          items.push({
            key: `${key}-${index}`,
            label: index === 0 ? title || key : "",
            value,
            metadata,
            renderer,
          });
        } else {
          const textRenderer = globalRegistry
            .getAllRenderers()
            .find((r) => r.name === "TextRenderer");
          if (textRenderer) {
            items.push({
              key: `${key}-${index}`,
              label: index === 0 ? title || key : "",
              value: JSON.stringify(value, null, 2),
              metadata,
              renderer: textRenderer,
            });
          }
        }
      });
    });
    return items;
  }, [agentRunOutputs]);
  return (
    <>
      {enableEnhancedOutputHandling ? (
        <Card className="agpt-box" style={{ maxWidth: "950px" }}>
          <CardHeader>
            <div className="flex items-center justify-between">
              <CardTitle className="font-poppins text-lg">Output</CardTitle>
              {outputItems.length > 0 && (
                <OutputActions
                  items={outputItems.map((item) => ({
                    value: item.value,
                    metadata: item.metadata,
                    renderer: item.renderer,
                  }))}
                />
              )}
            </div>
          </CardHeader>
          <CardContent
            className="flex flex-col gap-4"
            style={{ maxWidth: "660px" }}
          >
            {agentRunOutputs !== undefined ? (
              outputItems.length > 0 ? (
                outputItems.map((item) => (
                  <OutputItem
                    key={item.key}
                    value={item.value}
                    metadata={item.metadata}
                    renderer={item.renderer}
                    label={item.label}
                  />
                ))
              ) : (
                <p className="text-sm text-muted-foreground">
                  No outputs to display
                </p>
              )
            ) : (
              <LoadingBox spinnerSize={12} className="h-24" />
            )}
          </CardContent>
        </Card>
      ) : (
        <Card className="agpt-box" style={{ maxWidth: "950px" }}>
          <CardHeader>
            <CardTitle className="font-poppins text-lg">Output</CardTitle>
          </CardHeader>
          <CardContent
            className="flex flex-col gap-4"
            style={{ maxWidth: "660px" }}
          >
            {agentRunOutputs !== undefined ? (
              Object.entries(agentRunOutputs).map(
                ([key, { title, values }]) => (
                  <div key={key} className="flex flex-col gap-1.5">
                    <label className="text-sm font-medium">
                      {title || key}
                    </label>
                    {values.map((value, i) => (
                      <p
                        className="resize-none overflow-x-auto whitespace-pre-wrap break-words border-none text-sm text-neutral-700 disabled:cursor-not-allowed"
                        key={i}
                      >
                        {value}
                      </p>
                    ))}
                    {/* TODO: pretty type-dependent rendering */}
                  </div>
                ),
              )
            ) : (
              <LoadingBox spinnerSize={12} className="h-24" />
            )}
          </CardContent>
        </Card>
      )}
    </>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-status-chip.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-status-chip.tsx
@@ -1,68 +0,0 @@
 import React from "react";
 import { Badge } from "@/components/__legacy__/ui/badge";
 import { GraphExecutionMeta } from "@/lib/autogpt-server-api/types";
 export type AgentRunStatus =
  | "success"
  | "failed"
  | "queued"
  | "running"
  | "stopped"
  | "scheduled"
  | "draft"
  | "review";
 export const agentRunStatusMap: Record<
  GraphExecutionMeta["status"],
  AgentRunStatus
 > = {
  INCOMPLETE: "draft",
  COMPLETED: "success",
  FAILED: "failed",
  QUEUED: "queued",
  RUNNING: "running",
  TERMINATED: "stopped",
  REVIEW: "review",
 };
 const statusData: Record<
  AgentRunStatus,
  { label: string; variant: keyof typeof statusStyles }
 > = {
  success: { label: "Success", variant: "success" },
  running: { label: "Running", variant: "info" },
  failed: { label: "Failed", variant: "destructive" },
  queued: { label: "Queued", variant: "warning" },
  draft: { label: "Draft", variant: "secondary" },
  stopped: { label: "Stopped", variant: "secondary" },
  scheduled: { label: "Scheduled", variant: "secondary" },
  review: { label: "In Review", variant: "warning" },
 };
 const statusStyles = {
  success:
    "bg-green-100 text-green-800 hover:bg-green-100 hover:text-green-800",
  destructive: "bg-red-100 text-red-800 hover:bg-red-100 hover:text-red-800",
  warning:
    "bg-yellow-100 text-yellow-800 hover:bg-yellow-100 hover:text-yellow-800",
  info: "bg-blue-100 text-blue-800 hover:bg-blue-100 hover:text-blue-800",
  secondary:
    "bg-slate-100 text-slate-800 hover:bg-slate-100 hover:text-slate-800",
 };
 export function AgentRunStatusChip({
  status,
 }: {
  status: AgentRunStatus;
 }): React.ReactElement {
  return (
    <Badge
      variant="secondary"
      className={`text-xs font-medium ${statusStyles[statusData[status]?.variant]} rounded-[45px] px-[9px] py-[3px]`}
    >
      {statusData[status]?.label}
    </Badge>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-summary-card.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-run-summary-card.tsx
@@ -1,130 +0,0 @@
 import React from "react";
 import { formatDistanceToNow, isPast } from "date-fns";
 import { cn } from "@/lib/utils";
 import { Link2Icon, Link2OffIcon, MoreVertical } from "lucide-react";
 import { Card, CardContent } from "@/components/__legacy__/ui/card";
 import { Button } from "@/components/__legacy__/ui/button";
 import {
  DropdownMenu,
  DropdownMenuContent,
  DropdownMenuItem,
  DropdownMenuTrigger,
 } from "@/components/__legacy__/ui/dropdown-menu";
 import { AgentStatus, AgentStatusChip } from "./agent-status-chip";
 import { AgentRunStatus, AgentRunStatusChip } from "./agent-run-status-chip";
 import { PushPinSimpleIcon } from "@phosphor-icons/react";
 export type AgentRunSummaryProps = (
  | {
      type: "run";
      status: AgentRunStatus;
    }
  | {
      type: "preset";
      status?: undefined;
    }
  | {
      type: "preset.triggered";
      status: AgentStatus;
    }
  | {
      type: "schedule";
      status: "scheduled";
    }
 ) & {
  title: string;
  timestamp?: number | Date;
  selected?: boolean;
  onClick?: () => void;
  // onRename: () => void;
  onDelete: () => void;
  onPinAsPreset?: () => void;
  className?: string;
 };
 export function AgentRunSummaryCard({
  type,
  status,
  title,
  timestamp,
  selected = false,
  onClick,
  // onRename,
  onDelete,
  onPinAsPreset,
  className,
 }: AgentRunSummaryProps): React.ReactElement {
  return (
    <Card
      className={cn(
        "agpt-rounded-card cursor-pointer border-zinc-300",
        selected ? "agpt-card-selected" : "",
        className,
      )}
      onClick={onClick}
    >
      <CardContent className="relative p-2.5 lg:p-4">
        {(type == "run" || type == "schedule") && (
          <AgentRunStatusChip status={status} />
        )}
        {type == "preset" && (
          <div className="flex items-center text-sm font-medium text-neutral-700">
            <PushPinSimpleIcon className="mr-1 size-4 text-foreground" /> Preset
          </div>
        )}
        {type == "preset.triggered" && (
          <div className="flex items-center justify-between">
            <AgentStatusChip status={status} />
            <div className="flex items-center text-sm font-medium text-neutral-700">
              {status == "inactive" ? (
                <Link2OffIcon className="mr-1 size-4 text-foreground" />
              ) : (
                <Link2Icon className="mr-1 size-4 text-foreground" />
              )}{" "}
              Trigger
            </div>
          </div>
        )}
        <div className="mt-5 flex items-center justify-between">
          <h3 className="truncate pr-2 text-base font-medium text-neutral-900">
            {title}
          </h3>
          <DropdownMenu>
            <DropdownMenuTrigger asChild>
              <Button variant="ghost" className="h-5 w-5 p-0">
                <MoreVertical className="h-5 w-5" />
              </Button>
            </DropdownMenuTrigger>
            <DropdownMenuContent>
              {onPinAsPreset && (
                <DropdownMenuItem onClick={onPinAsPreset}>
                  Pin as a preset
                </DropdownMenuItem>
              )}
              {/* <DropdownMenuItem onClick={onRename}>Rename</DropdownMenuItem> */}
              <DropdownMenuItem onClick={onDelete}>Delete</DropdownMenuItem>
            </DropdownMenuContent>
          </DropdownMenu>
        </div>
        {timestamp && (
          <p
            className="mt-1 text-sm font-normal text-neutral-500"
            title={new Date(timestamp).toString()}
          >
            {isPast(timestamp) ? "Ran" : "Runs in"}{" "}
            {formatDistanceToNow(timestamp, { addSuffix: true })}
          </p>
        )}
      </CardContent>
    </Card>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-runs-selector-list.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-runs-selector-list.tsx
@@ -1,237 +0,0 @@
 "use client";
 import { Plus } from "lucide-react";
 import React, { useEffect, useState } from "react";
 import {
  GraphExecutionID,
  GraphExecutionMeta,
  LibraryAgent,
  LibraryAgentPreset,
  LibraryAgentPresetID,
  Schedule,
  ScheduleID,
 } from "@/lib/autogpt-server-api";
 import { cn } from "@/lib/utils";
 import { Badge } from "@/components/__legacy__/ui/badge";
 import { Button } from "@/components/atoms/Button/Button";
 import LoadingBox, { LoadingSpinner } from "@/components/__legacy__/ui/loading";
 import { Separator } from "@/components/__legacy__/ui/separator";
 import { ScrollArea } from "@/components/__legacy__/ui/scroll-area";
 import { InfiniteScroll } from "@/components/contextual/InfiniteScroll/InfiniteScroll";
 import { AgentRunsQuery } from "../use-agent-runs";
 import { agentRunStatusMap } from "./agent-run-status-chip";
 import { AgentRunSummaryCard } from "./agent-run-summary-card";
 interface AgentRunsSelectorListProps {
  agent: LibraryAgent;
  agentRunsQuery: AgentRunsQuery;
  agentPresets: LibraryAgentPreset[];
  schedules: Schedule[];
  selectedView: { type: "run" | "preset" | "schedule"; id?: string };
  allowDraftNewRun?: boolean;
  onSelectRun: (id: GraphExecutionID) => void;
  onSelectPreset: (preset: LibraryAgentPresetID) => void;
  onSelectSchedule: (id: ScheduleID) => void;
  onSelectDraftNewRun: () => void;
  doDeleteRun: (id: GraphExecutionMeta) => void;
  doDeletePreset: (id: LibraryAgentPresetID) => void;
  doDeleteSchedule: (id: ScheduleID) => void;
  doCreatePresetFromRun?: (id: GraphExecutionID) => void;
  className?: string;
 }
 export function AgentRunsSelectorList({
  agent,
  agentRunsQuery: {
    agentRuns,
    agentRunCount,
    agentRunsLoading,
    hasMoreRuns,
    fetchMoreRuns,
    isFetchingMoreRuns,
  },
  agentPresets,
  schedules,
  selectedView,
  allowDraftNewRun = true,
  onSelectRun,
  onSelectPreset,
  onSelectSchedule,
  onSelectDraftNewRun,
  doDeleteRun,
  doDeletePreset,
  doDeleteSchedule,
  doCreatePresetFromRun,
  className,
 }: AgentRunsSelectorListProps): React.ReactElement {
  const [activeListTab, setActiveListTab] = useState<"runs" | "scheduled">(
    "runs",
  );
  useEffect(() => {
    if (selectedView.type === "schedule") {
      setActiveListTab("scheduled");
    } else {
      setActiveListTab("runs");
    }
  }, [selectedView]);
  const listItemClasses = "h-28 w-72 lg:w-full lg:h-32";
  return (
    <aside className={cn("flex flex-col gap-4", className)}>
      {allowDraftNewRun ? (
        <Button
          className={"mb-4 hidden lg:flex"}
          onClick={onSelectDraftNewRun}
          leftIcon={<Plus className="h-6 w-6" />}
        >
          New {agent.has_external_trigger ? "trigger" : "run"}
        </Button>
      ) : null}
      <div className="flex gap-2">
        <Badge
          variant={activeListTab === "runs" ? "secondary" : "outline"}
          className="cursor-pointer gap-2 rounded-full text-base"
          onClick={() => setActiveListTab("runs")}
        >
          <span>Runs</span>
          <span className="text-neutral-600">
            {agentRunCount ?? <LoadingSpinner className="size-4" />}
          </span>
        </Badge>
        <Badge
          variant={activeListTab === "scheduled" ? "secondary" : "outline"}
          className="cursor-pointer gap-2 rounded-full text-base"
          onClick={() => setActiveListTab("scheduled")}
        >
          <span>Scheduled</span>
          <span className="text-neutral-600">{schedules.length}</span>
        </Badge>
      </div>
      {/* Runs / Schedules list */}
      {agentRunsLoading && activeListTab === "runs" ? (
        <LoadingBox className="h-28 w-full lg:h-[calc(100vh-300px)] lg:w-72 xl:w-80" />
      ) : (
        <ScrollArea
          className="w-full lg:h-[calc(100vh-300px)] lg:w-72 xl:w-80"
          orientation={window.innerWidth >= 1024 ? "vertical" : "horizontal"}
        >
          <InfiniteScroll
            direction={window.innerWidth >= 1024 ? "vertical" : "horizontal"}
            hasNextPage={hasMoreRuns}
            fetchNextPage={fetchMoreRuns}
            isFetchingNextPage={isFetchingMoreRuns}
          >
            <div className="flex items-center gap-2 lg:flex-col">
              {/* New Run button - only in small layouts */}
              {allowDraftNewRun && (
                <Button
                  size="large"
                  className={
                    "flex h-12 w-40 items-center gap-2 py-6 lg:hidden " +
                    (selectedView.type == "run" && !selectedView.id
                      ? "agpt-card-selected text-accent"
                      : "")
                  }
                  onClick={onSelectDraftNewRun}
                  leftIcon={<Plus className="h-6 w-6" />}
                >
                  New {agent.has_external_trigger ? "trigger" : "run"}
                </Button>
              )}
              {activeListTab === "runs" ? (
                <>
                  {agentPresets
                    .filter((preset) => preset.webhook) // Triggers
                    .toSorted(
                      (a, b) => b.updated_at.getTime() - a.updated_at.getTime(),
                    )
                    .map((preset) => (
                      <AgentRunSummaryCard
                        className={cn(listItemClasses, "lg:h-auto")}
                        key={preset.id}
                        type="preset.triggered"
                        status={preset.is_active ? "active" : "inactive"}
                        title={preset.name}
                        // timestamp={preset.last_run_time} // TODO: implement this
                        selected={selectedView.id === preset.id}
                        onClick={() => onSelectPreset(preset.id)}
                        onDelete={() => doDeletePreset(preset.id)}
                      />
                    ))}
                  {agentPresets
                    .filter((preset) => !preset.webhook) // Presets
                    .toSorted(
                      (a, b) => b.updated_at.getTime() - a.updated_at.getTime(),
                    )
                    .map((preset) => (
                      <AgentRunSummaryCard
                        className={cn(listItemClasses, "lg:h-auto")}
                        key={preset.id}
                        type="preset"
                        title={preset.name}
                        // timestamp={preset.last_run_time} // TODO: implement this
                        selected={selectedView.id === preset.id}
                        onClick={() => onSelectPreset(preset.id)}
                        onDelete={() => doDeletePreset(preset.id)}
                      />
                    ))}
                  {agentPresets.length > 0 && <Separator className="my-1" />}
                  {agentRuns
                    .toSorted((a, b) => {
                      const aTime = a.started_at?.getTime() ?? 0;
                      const bTime = b.started_at?.getTime() ?? 0;
                      return bTime - aTime;
                    })
                    .map((run) => (
                      <AgentRunSummaryCard
                        className={listItemClasses}
                        key={run.id}
                        type="run"
                        status={agentRunStatusMap[run.status]}
                        title={
                          (run.preset_id
                            ? agentPresets.find((p) => p.id == run.preset_id)
                                ?.name
                            : null) ?? agent.name
                        }
                        timestamp={run.started_at ?? undefined}
                        selected={selectedView.id === run.id}
                        onClick={() => onSelectRun(run.id)}
                        onDelete={() => doDeleteRun(run as GraphExecutionMeta)}
                        onPinAsPreset={
                          doCreatePresetFromRun
                            ? () => doCreatePresetFromRun(run.id)
                            : undefined
                        }
                      />
                    ))}
                </>
              ) : (
                schedules.map((schedule) => (
                  <AgentRunSummaryCard
                    className={listItemClasses}
                    key={schedule.id}
                    type="schedule"
                    status="scheduled" // TODO: implement active/inactive status for schedules
                    title={schedule.name}
                    timestamp={schedule.next_run_time}
                    selected={selectedView.id === schedule.id}
                    onClick={() => onSelectSchedule(schedule.id)}
                    onDelete={() => doDeleteSchedule(schedule.id)}
                  />
                ))
              )}
            </div>
          </InfiniteScroll>
        </ScrollArea>
      )}
    </aside>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-schedule-details-view.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/agent-schedule-details-view.tsx
@@ -1,180 +0,0 @@
 "use client";
 import React, { useCallback, useMemo } from "react";
 import {
  Graph,
  GraphExecutionID,
  Schedule,
  ScheduleID,
 } from "@/lib/autogpt-server-api";
 import { useBackendAPI } from "@/lib/autogpt-server-api/context";
 import ActionButtonGroup from "@/components/__legacy__/action-button-group";
 import type { ButtonAction } from "@/components/__legacy__/types";
 import {
  Card,
  CardContent,
  CardHeader,
  CardTitle,
 } from "@/components/__legacy__/ui/card";
 import { IconCross } from "@/components/__legacy__/ui/icons";
 import { Input } from "@/components/__legacy__/ui/input";
 import LoadingBox from "@/components/__legacy__/ui/loading";
 import { useToastOnFail } from "@/components/molecules/Toast/use-toast";
 import { humanizeCronExpression } from "@/lib/cron-expression-utils";
 import { formatScheduleTime } from "@/lib/timezone-utils";
 import { useUserTimezone } from "@/lib/hooks/useUserTimezone";
 import { PlayIcon } from "lucide-react";
 import { AgentRunStatus } from "./agent-run-status-chip";
 export function AgentScheduleDetailsView({
  graph,
  schedule,
  agentActions,
  onForcedRun,
  doDeleteSchedule,
 }: {
  graph: Graph;
  schedule: Schedule;
  agentActions: ButtonAction[];
  onForcedRun: (runID: GraphExecutionID) => void;
  doDeleteSchedule: (scheduleID: ScheduleID) => void;
 }): React.ReactNode {
  const api = useBackendAPI();
  const selectedRunStatus: AgentRunStatus = "scheduled";
  const toastOnFail = useToastOnFail();
  // Get user's timezone for displaying schedule times
  const userTimezone = useUserTimezone();
  const infoStats: { label: string; value: React.ReactNode }[] = useMemo(() => {
    return [
      {
        label: "Status",
        value:
          selectedRunStatus.charAt(0).toUpperCase() +
          selectedRunStatus.slice(1),
      },
      {
        label: "Schedule",
        value: humanizeCronExpression(schedule.cron),
      },
      {
        label: "Next run",
        value: formatScheduleTime(schedule.next_run_time, userTimezone),
      },
    ];
  }, [schedule, selectedRunStatus, userTimezone]);
  const agentRunInputs: Record<
    string,
    { title?: string; /* type: BlockIOSubType; */ value: any }
  > = useMemo(() => {
    // TODO: show (link to) preset - https://github.com/Significant-Gravitas/AutoGPT/issues/9168
    // Add type info from agent input schema
    return Object.fromEntries(
      Object.entries(schedule.input_data).map(([k, v]) => [
        k,
        {
          title: graph.input_schema.properties[k].title,
          /* TODO: type: agent.input_schema.properties[k].type */
          value: v,
        },
      ]),
    );
  }, [graph, schedule]);
  const runNow = useCallback(
    () =>
      api
        .executeGraph(
          graph.id,
          graph.version,
          schedule.input_data,
          schedule.input_credentials,
          "library",
        )
        .then((run) => onForcedRun(run.id))
        .catch(toastOnFail("execute agent")),
    [api, graph, schedule, onForcedRun, toastOnFail],
  );
  const runActions: ButtonAction[] = useMemo(
    () => [
      {
        label: (
          <>
            <PlayIcon className="mr-2 size-4" />
            Run now
          </>
        ),
        callback: runNow,
      },
      {
        label: (
          <>
            <IconCross className="mr-2 size-4 px-0.5" />
            Delete schedule
          </>
        ),
        callback: () => doDeleteSchedule(schedule.id),
        variant: "destructive",
      },
    ],
    [runNow],
  );
  return (
    <div className="agpt-div flex gap-6">
      <div className="flex flex-1 flex-col gap-4">
        <Card className="agpt-box">
          <CardHeader>
            <CardTitle className="font-poppins text-lg">Info</CardTitle>
          </CardHeader>
          <CardContent>
            <div className="flex justify-stretch gap-4">
              {infoStats.map(({ label, value }) => (
                <div key={label} className="flex-1">
                  <p className="text-sm font-medium text-black">{label}</p>
                  <p className="text-sm text-neutral-600">{value}</p>
                </div>
              ))}
            </div>
          </CardContent>
        </Card>
        <Card className="agpt-box">
          <CardHeader>
            <CardTitle className="font-poppins text-lg">Input</CardTitle>
          </CardHeader>
          <CardContent className="flex flex-col gap-4">
            {agentRunInputs !== undefined ? (
              Object.entries(agentRunInputs).map(([key, { title, value }]) => (
                <div key={key} className="flex flex-col gap-1.5">
                  <label className="text-sm font-medium">{title || key}</label>
                  <Input value={value} className="rounded-full" disabled />
                </div>
              ))
            ) : (
              <LoadingBox spinnerSize={12} className="h-24" />
            )}
          </CardContent>
        </Card>
      </div>
      {/* Run / Agent Actions */}
      <aside className="w-48 xl:w-56">
        <div className="flex flex-col gap-8">
          <ActionButtonGroup title="Run actions" actions={runActions} />
          <ActionButtonGroup title="Agent actions" actions={agentActions} />
        </div>
      </aside>
    </div>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/create-preset-dialog.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/create-preset-dialog.tsx
@@ -1,100 +0,0 @@
 "use client";
 import React, { useState } from "react";
 import { Button } from "@/components/__legacy__/ui/button";
 import {
  Dialog,
  DialogContent,
  DialogDescription,
  DialogFooter,
  DialogHeader,
  DialogTitle,
 } from "@/components/__legacy__/ui/dialog";
 import { Input } from "@/components/__legacy__/ui/input";
 import { Textarea } from "@/components/__legacy__/ui/textarea";
 interface CreatePresetDialogProps {
  open: boolean;
  onOpenChange: (open: boolean) => void;
  onConfirm: (name: string, description: string) => Promise<void> | void;
 }
 export function CreatePresetDialog({
  open,
  onOpenChange,
  onConfirm,
 }: CreatePresetDialogProps) {
  const [name, setName] = useState("");
  const [description, setDescription] = useState("");
  const handleSubmit = async () => {
    if (name.trim()) {
      await onConfirm(name.trim(), description.trim());
      setName("");
      setDescription("");
      onOpenChange(false);
    }
  };
  const handleCancel = () => {
    setName("");
    setDescription("");
    onOpenChange(false);
  };
  const handleKeyDown = (e: React.KeyboardEvent) => {
    if (e.key === "Enter" && (e.metaKey || e.ctrlKey)) {
      e.preventDefault();
      handleSubmit();
    }
  };
  return (
    <Dialog open={open} onOpenChange={onOpenChange}>
      <DialogContent className="sm:max-w-[425px]">
        <DialogHeader>
          <DialogTitle>Create Preset</DialogTitle>
          <DialogDescription>
            Give your preset a name and description to help identify it later.
          </DialogDescription>
        </DialogHeader>
        <div className="grid gap-4 py-4">
          <div className="grid gap-2">
            <label htmlFor="preset-name" className="text-sm font-medium">
              Name *
            </label>
            <Input
              id="preset-name"
              placeholder="Enter preset name"
              value={name}
              onChange={(e) => setName(e.target.value)}
              onKeyDown={handleKeyDown}
              autoFocus
            />
          </div>
          <div className="grid gap-2">
            <label htmlFor="preset-description" className="text-sm font-medium">
              Description
            </label>
            <Textarea
              id="preset-description"
              placeholder="Optional description"
              value={description}
              onChange={(e) => setDescription(e.target.value)}
              onKeyDown={handleKeyDown}
              rows={3}
            />
          </div>
        </div>
        <DialogFooter>
          <Button variant="outline" onClick={handleCancel}>
            Cancel
          </Button>
          <Button onClick={handleSubmit} disabled={!name.trim()}>
            Create Preset
          </Button>
        </DialogFooter>
      </DialogContent>
    </Dialog>
  );
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/use-agent-runs.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/use-agent-runs.ts
@@ -1,210 +0,0 @@
 import {
  GraphExecutionMeta as LegacyGraphExecutionMeta,
  GraphID,
  GraphExecutionID,
 } from "@/lib/autogpt-server-api";
 import { getQueryClient } from "@/lib/react-query/queryClient";
 import {
  getPaginatedTotalCount,
  getPaginationNextPageNumber,
  unpaginate,
 } from "@/app/api/helpers";
 import {
  getV1ListGraphExecutionsResponse,
  getV1ListGraphExecutionsResponse200,
  useGetV1ListGraphExecutionsInfinite,
 } from "@/app/api/__generated__/endpoints/graphs/graphs";
 import { GraphExecutionsPaginated } from "@/app/api/__generated__/models/graphExecutionsPaginated";
 import { GraphExecutionMeta as RawGraphExecutionMeta } from "@/app/api/__generated__/models/graphExecutionMeta";
 export type GraphExecutionMeta = Omit<
  RawGraphExecutionMeta,
  "id" | "user_id" | "graph_id" | "preset_id" | "stats"
 > &
  Pick<
    LegacyGraphExecutionMeta,
    "id" | "user_id" | "graph_id" | "preset_id" | "stats"
  >;
 /** Hook to fetch runs for a specific graph, with support for infinite scroll.
 *
 * @param graphID - The ID of the graph to fetch agent runs for. This parameter is
 *                  optional in the sense that the hook doesn't run unless it is passed.
 *                  This way, it can be used in components where the graph ID is not
 *                  immediately available.
 */
 export const useAgentRunsInfinite = (graphID?: GraphID) => {
  const queryClient = getQueryClient();
  const {
    data: queryResults,
    refetch: refetchRuns,
    isPending: agentRunsLoading,
    isRefetching: agentRunsReloading,
    hasNextPage: hasMoreRuns,
    fetchNextPage: fetchMoreRuns,
    isFetchingNextPage: isFetchingMoreRuns,
    queryKey,
  } = useGetV1ListGraphExecutionsInfinite(
    graphID!,
    { page: 1, page_size: 20 },
    {
      query: {
        getNextPageParam: getPaginationNextPageNumber,
        // Prevent query from running if graphID is not available (yet)
        ...(!graphID
          ? {
              enabled: false,
              queryFn: () =>
                // Fake empty response if graphID is not available (yet)
                Promise.resolve({
                  status: 200,
                  data: {
                    executions: [],
                    pagination: {
                      current_page: 1,
                      page_size: 20,
                      total_items: 0,
                      total_pages: 0,
                    },
                  },
                  headers: new Headers(),
                } satisfies getV1ListGraphExecutionsResponse),
            }
          : {}),
      },
    },
    queryClient,
  );
  const agentRuns = queryResults ? unpaginate(queryResults, "executions") : [];
  const agentRunCount = getPaginatedTotalCount(queryResults);
  const upsertAgentRun = (newAgentRun: GraphExecutionMeta) => {
    queryClient.setQueryData(
      queryKey,
      (currentQueryData: typeof queryResults) => {
        if (!currentQueryData?.pages || agentRunCount === undefined)
          return currentQueryData;
        const exists = currentQueryData.pages.some((page) => {
          if (page.status !== 200) return false;
          const response = page.data;
          return response.executions.some((run) => run.id === newAgentRun.id);
        });
        if (exists) {
          // If the run already exists, we update it
          return {
            ...currentQueryData,
            pages: currentQueryData.pages.map((page) => {
              if (page.status !== 200) return page;
              const response = page.data;
              const executions = response.executions;
              const index = executions.findIndex(
                (run) => run.id === newAgentRun.id,
              );
              if (index === -1) return page;
              const newExecutions = [...executions];
              newExecutions[index] = newAgentRun;
              return {
                ...page,
                data: {
                  ...response,
                  executions: newExecutions,
                },
              } satisfies getV1ListGraphExecutionsResponse;
            }),
          };
        }
        // If the run does not exist, we add it to the first page
        const page = currentQueryData
          .pages[0] as getV1ListGraphExecutionsResponse200 & {
          headers: Headers;
        };
        const updatedExecutions = [newAgentRun, ...page.data.executions];
        const updatedPage = {
          ...page,
          data: {
            ...page.data,
            executions: updatedExecutions,
          },
        } satisfies getV1ListGraphExecutionsResponse;
        const updatedPages = [updatedPage, ...currentQueryData.pages.slice(1)];
        return {
          ...currentQueryData,
          pages: updatedPages.map(
            // Increment the total runs count in the pagination info of all pages
            (page) =>
              page.status === 200
                ? {
                    ...page,
                    data: {
                      ...page.data,
                      pagination: {
                        ...page.data.pagination,
                        total_items: agentRunCount + 1,
                      },
                    },
                  }
                : page,
          ),
        };
      },
    );
  };
  const removeAgentRun = (runID: GraphExecutionID) => {
    queryClient.setQueryData(
      [queryKey, { page: 1, page_size: 20 }],
      (currentQueryData: typeof queryResults) => {
        if (!currentQueryData?.pages) return currentQueryData;
        let found = false;
        return {
          ...currentQueryData,
          pages: currentQueryData.pages.map((page) => {
            const response = page.data as GraphExecutionsPaginated;
            const filteredExecutions = response.executions.filter(
              (run) => run.id !== runID,
            );
            if (filteredExecutions.length < response.executions.length) {
              found = true;
            }
            return {
              ...page,
              data: {
                ...response,
                executions: filteredExecutions,
                pagination: {
                  ...response.pagination,
                  total_items:
                    response.pagination.total_items - (found ? 1 : 0),
                },
              },
            };
          }),
        };
      },
    );
  };
  return {
    agentRuns: agentRuns as GraphExecutionMeta[],
    refetchRuns,
    agentRunCount,
    agentRunsLoading: agentRunsLoading || agentRunsReloading,
    hasMoreRuns,
    fetchMoreRuns,
    isFetchingMoreRuns,
    upsertAgentRun,
    removeAgentRun,
  };
 };
 export type AgentRunsQuery = ReturnType<typeof useAgentRunsInfinite>;
--- a/autogpt_platform/frontend/src/app/(platform)/library/legacy/[id]/page.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/legacy/[id]/page.tsx
@@ -1,7 +0,0 @@
 "use client";
 import { OldAgentLibraryView } from "../../agents/[id]/components/OldAgentLibraryView/OldAgentLibraryView";
 export default function OldAgentLibraryPage() {
  return <OldAgentLibraryView />;
 }
--- a/autogpt_platform/frontend/src/app/api/openapi.json
+++ b/autogpt_platform/frontend/src/app/api/openapi.json
@@ -7022,24 +7022,29 @@
          "input_schema": {
            "additionalProperties": true,
            "type": "object",
-            "title": "Input Schema",
+            "title": "Input Schema"
            "description": "Full JSON schema for block inputs"
          },
          "output_schema": {
            "additionalProperties": true,
            "type": "object",
-            "title": "Output Schema",
+            "title": "Output Schema"
            "description": "Full JSON schema for block outputs"
          },
          "required_inputs": {
            "items": { "$ref": "#/components/schemas/BlockInputFieldInfo" },
            "type": "array",
            "title": "Required Inputs",
-            "description": "List of input fields for this block"
+            "description": "List of required input fields for this block"
          }
        },
        "type": "object",
-        "required": ["id", "name", "description", "categories"],
+        "required": [
          "id",
          "name",
          "description",
          "categories",
          "input_schema",
          "output_schema"
        ],
        "title": "BlockInfoSummary",
        "description": "Summary of a block for search results."
      },
@@ -7085,7 +7090,7 @@
          "usage_hint": {
            "type": "string",
            "title": "Usage Hint",
-            "default": "To execute a block, call run_block with block_id set to the block's 'id' field and input_data containing the fields listed in required_inputs."
+            "default": "To execute a block, call run_block with block_id set to the block's 'id' field and input_data containing the required fields from input_schema."
          }
        },
        "type": "object",
@@ -10490,10 +10495,7 @@
          "operation_started",
          "operation_pending",
          "operation_in_progress",
-          "input_validation_error",
+          "input_validation_error"
          "web_fetch",
          "bash_exec",
          "operation_status"
        ],
        "title": "ResponseType",
        "description": "Types of tool responses."
--- a/autogpt_platform/frontend/src/app/globals.css
+++ b/autogpt_platform/frontend/src/app/globals.css
@@ -180,3 +180,14 @@ body[data-google-picker-open="true"] [data-dialog-content] {
  z-index: 1 !important;
  pointer-events: none !important;
 }
 /* CoPilot chat table styling — remove left/right borders, increase padding */
 [data-streamdown="table-wrapper"] table {
  border-left: none;
  border-right: none;
 }
 [data-streamdown="table-wrapper"] th,
 [data-streamdown="table-wrapper"] td {
  padding: 0.875rem 1rem; /* py-3.5 px-4 */
 }
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/cron-scheduler-dialog.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/cron-scheduler-dialog.tsx
@@ -2,7 +2,7 @@ import { useEffect, useState } from "react";
 import { Input } from "@/components/__legacy__/ui/input";
 import { Button } from "@/components/__legacy__/ui/button";
 import { useToast } from "@/components/molecules/Toast/use-toast";
-import { CronScheduler } from "@/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/cron-scheduler";
+import { CronScheduler } from "@/components/contextual/CronScheduler/cron-scheduler";
 import { Dialog } from "@/components/molecules/Dialog/Dialog";
 import { getTimezoneDisplayName } from "@/lib/timezone-utils";
 import { useUserTimezone } from "@/lib/hooks/useUserTimezone";
--- a/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/cron-scheduler.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/cron-scheduler.tsx
--- a/autogpt_platform/frontend/src/components/contextual/OutputRenderers/renderers/MarkdownRenderer.tsx
+++ b/autogpt_platform/frontend/src/components/contextual/OutputRenderers/renderers/MarkdownRenderer.tsx
@@ -226,7 +226,7 @@ function renderMarkdown(
          table: ({ children, ...props }) => (
            <div className="my-4 overflow-x-auto">
              <table
-                className="min-w-full divide-y divide-gray-200 rounded-lg border border-gray-200 dark:divide-gray-700 dark:border-gray-700"
+                className="min-w-full divide-y divide-gray-200 border-y border-gray-200 dark:divide-gray-700 dark:border-gray-700"
                {...props}
              >
                {children}
@@ -235,7 +235,7 @@ function renderMarkdown(
          ),
          th: ({ children, ...props }) => (
            <th
-              className="bg-gray-50 px-4 py-3 text-left text-xs font-semibold uppercase tracking-wider text-gray-700 dark:bg-gray-800 dark:text-gray-300"
+              className="bg-gray-50 px-4 py-3.5 text-left text-xs font-semibold uppercase tracking-wider text-gray-700 dark:bg-gray-800 dark:text-gray-300"
              {...props}
            >
              {children}
@@ -243,7 +243,7 @@ function renderMarkdown(
          ),
          td: ({ children, ...props }) => (
            <td
-              className="border-t border-gray-200 px-4 py-3 text-sm text-gray-600 dark:border-gray-700 dark:text-gray-400"
+              className="border-t border-gray-200 px-4 py-3.5 text-sm text-gray-600 dark:border-gray-700 dark:text-gray-400"
              {...props}
            >
              {children}
--- a/autogpt_platform/frontend/src/components/contextual/PublishAgentModal/components/AgentInfoStep/AgentInfoStep.tsx
+++ b/autogpt_platform/frontend/src/components/contextual/PublishAgentModal/components/AgentInfoStep/AgentInfoStep.tsx
@@ -1,6 +1,6 @@
 "use client";
-import { CronExpressionDialog } from "@/app/(platform)/library/agents/[id]/components/OldAgentLibraryView/components/cron-scheduler-dialog";
+import { CronExpressionDialog } from "@/components/contextual/CronScheduler/cron-scheduler-dialog";
 import { Form, FormField } from "@/components/__legacy__/ui/form";
 import { Button } from "@/components/atoms/Button/Button";
 import { Input } from "@/components/atoms/Input/Input";
--- a/autogpt_platform/frontend/src/services/feature-flags/use-get-flag.ts
+++ b/autogpt_platform/frontend/src/services/feature-flags/use-get-flag.ts
@@ -7,7 +7,6 @@ import { useFlags } from "launchdarkly-react-client-sdk";
 export enum Flag {
  BETA_BLOCKS = "beta-blocks",
  NEW_BLOCK_MENU = "new-block-menu",
  NEW_AGENT_RUNS = "new-agent-runs",
  GRAPH_SEARCH = "graph-search",
  ENABLE_ENHANCED_OUTPUT_HANDLING = "enable-enhanced-output-handling",
  SHARE_EXECUTION_RESULTS = "share-execution-results",
@@ -22,7 +21,6 @@ const isPwMockEnabled = process.env.NEXT_PUBLIC_PW_TEST === "true";
 const defaultFlags = {
  [Flag.BETA_BLOCKS]: [],
  [Flag.NEW_BLOCK_MENU]: false,
  [Flag.NEW_AGENT_RUNS]: false,
  [Flag.GRAPH_SEARCH]: false,
  [Flag.ENABLE_ENHANCED_OUTPUT_HANDLING]: false,
  [Flag.SHARE_EXECUTION_RESULTS]: false,
Author	SHA1	Message	Date
Ubbe	361d6ff6fc	Merge branch 'dev' into refactor/remove-old-agent-library-view	2026-02-13 09:39:20 +08:00
Ubbe	e8c50b96d1	fix(frontend): improve CoPilot chat table styling (#12094 ) ## Summary - Remove left and right borders from tables rendered in CoPilot chat - Increase cell padding (py-3 → py-3.5) for better spacing between text and lines - Applies to both Streamdown (main chat) and MarkdownRenderer (tool outputs) Design feedback from Olivia to make tables "breathe" more. ## Test plan - [ ] Open CoPilot chat and trigger a response containing a table - [ ] Verify tables no longer have left/right borders - [ ] Verify increased spacing between rows - [ ] Check both light and dark modes 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Improved CoPilot chat table styling by removing left and right borders and increasing vertical padding from `py-3` to `py-3.5`. Changes apply to both: - Streamdown-rendered tables (via CSS selector in `globals.css`) - MarkdownRenderer tables (via Tailwind classes) The changes make tables "breathe" more per design feedback from Olivia. Issue Found: - The CSS padding value in `globals.css:192` is `0.625rem` (`py-2.5`) but should be `0.875rem` (`py-3.5`) to match the PR description and the MarkdownRenderer implementation. </details> <details><summary><h3>Confidence Score: 2/5</h3></summary> - This PR has a logical error that will cause inconsistent table styling between Streamdown and MarkdownRenderer tables - The implementation has an inconsistency where the CSS file uses `py-2.5` padding while the PR description and MarkdownRenderer use `py-3.5`. This will result in different table padding between the two rendering systems, contradicting the goal of consistent styling improvements. - Pay close attention to `autogpt_platform/frontend/src/app/globals.css` - the padding value needs to be corrected to match the intended design </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2026-02-13 09:38:59 +08:00
Ubbe	30e854569a	feat(frontend): add exact timestamp tooltip on run timestamps (#12087 ) Resolves OPEN-2693: Make exact timestamp of runs accessible through UI. The NewAgentLibraryView shows relative timestamps ("2 days ago") for runs and schedules, but unlike the OldAgentLibraryView it didn't show the exact timestamp on hover. This PR adds a native `title` tooltip so users can see the full date/time by hovering. ### Changes 🏗️ - Added `descriptionTitle` prop to `SidebarItemCard` that renders as a `title` attribute on the description text - `TaskListItem` now passes the exact `run.started_at` timestamp via `descriptionTitle` - `ScheduleListItem` now passes the exact `schedule.next_run_time` timestamp via `descriptionTitle` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [ ] Open an agent in the library view - [ ] Hover over a run's relative timestamp (e.g. "2 days ago") and confirm the full date/time tooltip appears - [ ] Hover over a schedule's relative timestamp and confirm the full date/time tooltip appears 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Added native tooltip functionality to show exact timestamps in the library view. The implementation adds a `descriptionTitle` prop to `SidebarItemCard` that renders as a `title` attribute on the description text. This allows users to hover over relative timestamps (e.g., "2 days ago") to see the full date/time. Changes: - Added optional `descriptionTitle` prop to `SidebarItemCard` component (SidebarItemCard.tsx:10) - `TaskListItem` passes `run.started_at` as the tooltip value (TaskListItem.tsx:84-86) - `ScheduleListItem` passes `schedule.next_run_time` as the tooltip value (ScheduleListItem.tsx:32) - Unrelated fix included: Sentry configuration updated to suppress cross-origin stylesheet errors (instrumentation-client.ts:25-28) Note: The PR includes two separate commits - the main timestamp tooltip feature and a Sentry error suppression fix. The PR description only documents the timestamp feature. </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge with minimal risk - The changes are straightforward and limited in scope - adding an optional prop that forwards a native HTML attribute for tooltip functionality. The Text component already supports forwarding arbitrary HTML attributes through its spread operator (...rest), ensuring the `title` attribute works correctly. Both the timestamp tooltip feature and the Sentry configuration fix are low-risk improvements with no breaking changes. - No files require special attention </details> <details><summary><h3>Sequence Diagram</h3></summary> ```mermaid sequenceDiagram participant User participant TaskListItem participant ScheduleListItem participant SidebarItemCard participant Text participant Browser User->>TaskListItem: Hover over run timestamp TaskListItem->>SidebarItemCard: Pass descriptionTitle (run.started_at) SidebarItemCard->>Text: Render with title attribute Text->>Browser: Forward title attribute to DOM Browser->>User: Display native tooltip with exact timestamp User->>ScheduleListItem: Hover over schedule timestamp ScheduleListItem->>SidebarItemCard: Pass descriptionTitle (schedule.next_run_time) SidebarItemCard->>Text: Render with title attribute Text->>Browser: Forward title attribute to DOM Browser->>User: Display native tooltip with exact timestamp ``` </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 09:38:16 +08:00
Ubbe	301d7cbada	fix(frontend): suppress cross-origin stylesheet security error (#12086 ) ## Summary - Adds `ignoreErrors` to the Sentry client configuration (`instrumentation-client.ts`) to filter out `SecurityError: CSSStyleSheet.cssRules getter: Not allowed to access cross-origin stylesheet` errors - These errors are caused by Sentry Replay (rrweb) attempting to serialize DOM snapshots that include cross-origin stylesheets (from browser extensions or CDN-loaded CSS) - This was reported via Sentry on production, occurring on any page when logged in ## Changes - `frontend/instrumentation-client.ts`: Added `ignoreErrors: [/Not allowed to access cross-origin stylesheet/]` to `Sentry.init()` config ## Test plan - [ ] Verify the error no longer appears in Sentry after deployment - [ ] Verify Sentry Replay still works correctly for other errors - [ ] Verify no regressions in error tracking (other errors should still be captured) 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Adds error filtering to Sentry client configuration to suppress cross-origin stylesheet security errors that occur when Sentry Replay (rrweb) attempts to serialize DOM snapshots containing stylesheets from browser extensions or CDN-loaded CSS. This prevents noise in Sentry error logs without affecting the capture of legitimate errors. </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge with minimal risk - The change adds a simple error filter to suppress benign cross-origin stylesheet errors that are caused by Sentry Replay itself. The regex pattern is specific and only affects client-side error reporting, with no impact on application functionality or legitimate error capture - No files require special attention </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 09:37:54 +08:00
Ubbe	d95aef7665	fix(copilot): stream timeout, long-running tool polling, and CreateAgent UI refresh (#12070 ) Agent generation completes on the backend but the UI does not update/refresh to show the result. ### Changes 🏗️ ![Uploading Screenshot 2026-02-13 at 00.44.54.png…]() - Stream start timeout (12s): If the backend doesn't begin streaming within 12 seconds of submitting a message, the stream is aborted and a destructive toast is shown to the user. - Long-running tool polling: Added `useLongRunningToolPolling` hook that polls the session endpoint every 1.5s while a tool output is in an operating state (`operation_started` / `operation_pending` / `operation_in_progress`). When the backend completes, messages are refreshed so the UI reflects the final result. - CreateAgent UI improvements: Replaced the orbit loader / progress bar with a mini-game, added expanded accordion for saved agents, and improved the saved-agent card with image, icons, and links that open in new tabs. - Backend tweaks: Added `image_url` to `CreateAgentToolOutput`, minor model/service updates for the dummy agent generator. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Send a message and verify the stream starts within 12s or a toast appears - [x] Trigger agent creation and verify the UI updates when the backend completes - [x] Verify the saved-agent card renders correctly with image, links, and icons --------- Co-authored-by: Otto <otto@agpt.co> Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 20:06:40 +00:00
Lluis Agusti	0fe6cc8dc7	refactor(frontend): remove OldAgentLibraryView and NEW_AGENT_RUNS flag - Delete the entire OldAgentLibraryView directory (13 files, ~2200 lines) - Remove the legacy agent library page at library/legacy/[id] - Remove the NEW_AGENT_RUNS feature flag from the Flag enum and defaults - Move cron-scheduler components to shared CronScheduler directory - Move agent-run-draft-view and agent-status-chip to legacy-builder - Update all import paths in consuming files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 19:52:48 +08:00