Merge remote-tracking branch 'origin/dev' into feat/autogpt-copilot-block

fix(platform): use typing_extensions.TypedDict, snake_case attrs, and handle CancelledError
- Use typing_extensions.TypedDict instead of typing.TypedDict for Python <3.12 compat
- Rename TypedDict attributes to snake_case (prompt_tokens, tool_call_id, etc.)
- Add explicit asyncio.CancelledError handler to prevent orphaned sessions
- Regenerate block docs
2026-03-17 03:00:27 -04:00 · 2026-03-17 06:17:13 +07:00 · 2026-03-17 04:50:33 +07:00 · 2026-03-17 04:03:56 +07:00 · 2026-03-17 02:18:58 +07:00 · 2026-03-17 00:37:08 +07:00
18 changed files with 419 additions and 1003 deletions
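Note on the CancelledError bullet: since Python 3.8, asyncio.CancelledError derives from BaseException, so a bare `except Exception` never sees a cancellation. The sketch below illustrates the general pattern only; `run_with_session` and `demo` are hypothetical names and are not part of this change.

```python
import asyncio


async def run_with_session(session_id: str, log: list[str]) -> str:
    """Hypothetical stand-in for a long-running block execution."""
    try:
        await asyncio.sleep(10)  # simulated streaming work
        return "done"
    except asyncio.CancelledError:
        # Not caught by `except Exception`: CancelledError is a BaseException.
        log.append(f"cancelled:{session_id}")
        raise  # re-raise so the task is actually marked as cancelled
    except Exception as exc:
        log.append(f"error:{session_id}:{exc}")
        return "failed"


async def demo() -> list[str]:
    log: list[str] = []
    task = asyncio.create_task(run_with_session("sid-1", log))
    await asyncio.sleep(0)  # let the task start and reach its first await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("task-cancelled")
    return log
```

Calling `asyncio.run(demo())` records the session ID before the cancellation propagates, which is the property the handler in `run` is after.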
--- a/autogpt_platform/backend/backend/blocks/autopilot.py
+++ b/autogpt_platform/backend/backend/blocks/autopilot.py
@@ -0,0 +1,341 @@
+from __future__ import annotations
+
+import asyncio
+import contextvars
+import json
+from typing import TYPE_CHECKING
+
+from typing_extensions import TypedDict
+
+from backend.blocks._base import (
+    Block,
+    BlockCategory,
+    BlockOutput,
+    BlockSchemaInput,
+    BlockSchemaOutput,
+)
+from backend.data.model import SchemaField
+
+if TYPE_CHECKING:
+    from backend.executor.utils import ExecutionContext
+
+
+class ToolCallEntry(TypedDict):
+    tool_call_id: str
+    tool_name: str
+    input: object
+    output: object | None
+    success: bool | None
+
+
+class TokenUsage(TypedDict):
+    prompt_tokens: int
+    completion_tokens: int
+    total_tokens: int
+
+
+# Task-scoped recursion depth counter & chain-wide limit.
+# Each asyncio task runs in a copy of the context captured at its creation,
+# so concurrent graph executions each get independent counters.
+_autopilot_recursion_depth: contextvars.ContextVar[int] = contextvars.ContextVar(
+    "_autopilot_recursion_depth", default=0
+)
+_autopilot_recursion_limit: contextvars.ContextVar[int | None] = contextvars.ContextVar(
+    "_autopilot_recursion_limit", default=None
+)
+
+
+def _check_recursion(max_depth: int) -> tuple:
+    """Check and increment recursion depth. Returns tokens to reset on exit."""
+    current = _autopilot_recursion_depth.get()
+    inherited = _autopilot_recursion_limit.get()
+    limit = max_depth if inherited is None else min(inherited, max_depth)
+    if current >= limit:
+        raise RuntimeError(
+            f"AutoPilot recursion depth limit reached ({limit}). "
+            "The autopilot has called itself too many times."
+        )
+    return (
+        _autopilot_recursion_depth.set(current + 1),
+        _autopilot_recursion_limit.set(limit),
+    )
+
+
+def _reset_recursion(tokens: tuple) -> None:
+    _autopilot_recursion_depth.reset(tokens[0])
+    _autopilot_recursion_limit.reset(tokens[1])
+
+
+class AutoPilotBlock(Block):
+    """Execute tasks using AutoGPT AutoPilot with full access to platform tools.
+
+    The autopilot can manage agents, access workspace files, fetch web content,
+    run blocks, and more. This block enables sub-agent patterns (autopilot calling
+    autopilot) and scheduled autopilot execution via the agent executor.
+    """
+
+    class Input(BlockSchemaInput):
+        prompt: str = SchemaField(
+            description=(
+                "The task or instruction for the autopilot to execute. "
+                "The autopilot has access to platform tools like agent management, "
+                "workspace files, web fetch, block execution, and more."
+            ),
+            placeholder="Find my agents and list them",
+            advanced=False,
+        )
+
+        system_context: str = SchemaField(
+            description=(
+                "Optional additional context prepended to the prompt. "
+                "Use this to constrain autopilot behavior, provide domain "
+                "context, or set output format requirements."
+            ),
+            default="",
+            advanced=True,
+        )
+
+        session_id: str = SchemaField(
+            description=(
+                "Session ID to continue an existing autopilot conversation. "
+                "Leave empty to start a new session. "
+                "Use the session_id output from a previous run to continue."
+            ),
+            default="",
+            advanced=True,
+        )
+
+        max_recursion_depth: int = SchemaField(
+            description=(
+                "Maximum nesting depth when the autopilot calls this block "
+                "recursively (sub-agent pattern). Prevents infinite loops."
+            ),
+            default=3,
+            ge=1,
+            advanced=True,
+        )
+
+    class Output(BlockSchemaOutput):
+        response: str = SchemaField(
+            description="The final text response from the autopilot."
+        )
+        tool_calls: list[ToolCallEntry] = SchemaField(
+            description=(
+                "List of tools called during execution. Each entry has "
+                "tool_call_id, tool_name, input, output, and success fields."
+            ),
+        )
+        conversation_history: str = SchemaField(
+            description=(
+                "Full conversation history as JSON. "
+                "It can be used for logging or analysis."
+            ),
+        )
+        session_id: str = SchemaField(
+            description=(
+                "Session ID for this conversation. "
+                "Pass this back to continue the conversation in a future run."
+            ),
+        )
+        token_usage: TokenUsage = SchemaField(
+            description=(
+                "Token usage statistics: prompt_tokens, "
+                "completion_tokens, total_tokens."
+            ),
+        )
+        error: str = SchemaField(
+            description="Error message if execution failed.",
+        )
+
+    def __init__(self):
+        super().__init__(
+            id="c069dc6b-c3ed-4c12-b6e5-d47361e64ce6",
+            description=(
+                "Execute tasks using AutoGPT AutoPilot with full access to "
+                "platform tools (agent management, workspace files, web fetch, "
+                "block execution, and more). Enables sub-agent patterns and "
+                "scheduled autopilot execution."
+            ),
+            categories={BlockCategory.AI, BlockCategory.AGENT},
+            input_schema=AutoPilotBlock.Input,
+            output_schema=AutoPilotBlock.Output,
+            test_input={
+                "prompt": "List my agents",
+                "system_context": "",
+                "session_id": "",
+                "max_recursion_depth": 3,
+            },
+            test_output=[
+                ("response", "You have 2 agents: Agent A and Agent B."),
+                ("tool_calls", []),
+                (
+                    "conversation_history",
+                    '[{"role": "user", "content": "List my agents"}]',
+                ),
+                ("session_id", "test-session-id"),
+                (
+                    "token_usage",
+                    {
+                        "prompt_tokens": 100,
+                        "completion_tokens": 50,
+                        "total_tokens": 150,
+                    },
+                ),
+            ],
+            test_mock={
+                "create_session": lambda *args, **kwargs: "test-session-id",
+                "execute_copilot": lambda *args, **kwargs: (
+                    "You have 2 agents: Agent A and Agent B.",
+                    [],
+                    '[{"role": "user", "content": "List my agents"}]',
+                    "test-session-id",
+                    {
+                        "prompt_tokens": 100,
+                        "completion_tokens": 50,
+                        "total_tokens": 150,
+                    },
+                ),
+            },
+        )
+
+    async def create_session(self, user_id: str) -> str:
+        """Create a new chat session and return its ID (mockable for tests)."""
+        from backend.copilot.model import create_chat_session
+
+        session = await create_chat_session(user_id)
+        return session.session_id
+
+    async def execute_copilot(
+        self,
+        prompt: str,
+        system_context: str,
+        session_id: str,
+        max_recursion_depth: int,
+        user_id: str,
+    ) -> tuple[str, list[ToolCallEntry], str, str, TokenUsage]:
+        """Invoke the copilot and collect all stream results.
+
+        Follows the same path as the normal copilot: create session if needed,
+        then let stream_chat_completion_sdk handle everything (session loading,
+        message append, lock, transcript, cleanup).
+        """
+        from backend.copilot.model import get_chat_session
+        from backend.copilot.response_model import (
+            StreamError,
+            StreamTextDelta,
+            StreamToolInputAvailable,
+            StreamToolOutputAvailable,
+            StreamUsage,
+        )
+        from backend.copilot.sdk.service import stream_chat_completion_sdk
+
+        tokens = _check_recursion(max_recursion_depth)
+        try:
+            effective_prompt = prompt
+            if system_context:
+                effective_prompt = f"[System Context: {system_context}]\n\n{prompt}"
+
+            # Consume the stream — same as the executor processor.
+            # Do NOT pass a session object; let the SDK load it internally
+            # so all session management (lock, persist, transcript) is handled
+            # by the SDK's own finally block.
+            response_parts: list[str] = []
+            tool_calls: list[ToolCallEntry] = []
+            tool_calls_by_id: dict[str, ToolCallEntry] = {}
+            total_usage: TokenUsage = {
+                "prompt_tokens": 0,
+                "completion_tokens": 0,
+                "total_tokens": 0,
+            }
+
+            async for event in stream_chat_completion_sdk(
+                session_id=session_id,
+                message=effective_prompt,
+                is_user_message=True,
+                user_id=user_id,
+            ):
+                if isinstance(event, StreamTextDelta):
+                    response_parts.append(event.delta)
+                elif isinstance(event, StreamToolInputAvailable):
+                    entry: ToolCallEntry = {
+                        "tool_call_id": event.toolCallId,
+                        "tool_name": event.toolName,
+                        "input": event.input,
+                        "output": None,
+                        "success": None,
+                    }
+                    tool_calls.append(entry)
+                    tool_calls_by_id[event.toolCallId] = entry
+                elif isinstance(event, StreamToolOutputAvailable):
+                    if tc := tool_calls_by_id.get(event.toolCallId):
+                        tc["output"] = event.output
+                        tc["success"] = event.success
+                elif isinstance(event, StreamUsage):
+                    total_usage["prompt_tokens"] += event.promptTokens
+                    total_usage["completion_tokens"] += event.completionTokens
+                    total_usage["total_tokens"] += event.totalTokens
+                elif isinstance(event, StreamError):
+                    raise RuntimeError(f"AutoPilot error: {event.errorText}")
+
+            # Session was persisted by the SDK's finally block.
+            # Re-fetch for conversation history output.
+            updated_session = await get_chat_session(session_id, user_id)
+            history_json = "[]"
+            if updated_session and updated_session.messages:
+                history_json = json.dumps(
+                    [m.model_dump(exclude_none=True) for m in updated_session.messages],
+                    default=str,
+                )
+
+            return (
+                "".join(response_parts),
+                tool_calls,
+                history_json,
+                session_id,
+                total_usage,
+            )
+        finally:
+            _reset_recursion(tokens)
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        execution_context: ExecutionContext,
+        **kwargs,
+    ) -> BlockOutput:
+        if not input_data.prompt.strip():
+            yield "error", "Prompt cannot be empty."
+            return
+
+        if not execution_context.user_id:
+            yield "error", "Cannot run autopilot without an authenticated user."
+            return
+
+        # Create session eagerly so the user always gets the session_id,
+        # even if the downstream stream fails (avoids orphaned sessions).
+        sid = input_data.session_id
+        if not sid:
+            sid = await self.create_session(execution_context.user_id)
+
+        try:
+            response, tool_calls, history, _, usage = await self.execute_copilot(
+                prompt=input_data.prompt,
+                system_context=input_data.system_context,
+                session_id=sid,
+                max_recursion_depth=input_data.max_recursion_depth,
+                user_id=execution_context.user_id,
+            )
+
+            yield "response", response
+            yield "tool_calls", tool_calls
+            yield "conversation_history", history
+            yield "session_id", sid
+            yield "token_usage", usage
+        except asyncio.CancelledError:
+            yield "session_id", sid
+            yield "error", "AutoPilot execution was cancelled."
+            raise
+        except Exception as e:
+            yield "session_id", sid
+            yield "error", str(e)
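Note on the recursion guard: the set/reset-token pattern used by `_check_recursion` and `_reset_recursion` can be tried in isolation. This is a simplified sketch with a single counter and hypothetical names (`_depth`, `enter`, `nested`, `demo`); it assumes nothing about how the executor schedules blocks.

```python
import asyncio
import contextvars

# Hypothetical single counter; the block above also tracks an inherited limit.
_depth: contextvars.ContextVar[int] = contextvars.ContextVar("_depth", default=0)


def enter(limit: int) -> contextvars.Token:
    """Increment the depth, or raise if the limit is already reached."""
    current = _depth.get()
    if current >= limit:
        raise RuntimeError(f"depth limit reached ({limit})")
    return _depth.set(current + 1)


async def nested(levels: int, limit: int) -> int:
    """Recurse `levels` times and return the deepest depth observed."""
    token = enter(limit)
    try:
        deepest = _depth.get()
        if levels > 1:
            deepest = await nested(levels - 1, limit)
        return deepest
    finally:
        _depth.reset(token)  # restore the previous depth, even on error


async def demo() -> tuple[int, int, bool, int]:
    # gather() wraps each coroutine in a task; each task gets its own
    # copy of the context, so the two counters do not interfere.
    a, b = await asyncio.gather(nested(2, 3), nested(3, 3))
    try:
        await nested(4, 3)
        overflow = False
    except RuntimeError:
        overflow = True
    return a, b, overflow, _depth.get()
```

Resetting through the token in `finally` returns the counter to its prior value on every exit path, so a failed nested call does not leave the depth inflated.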
--- a/autogpt_platform/backend/backend/copilot/integration_creds.py
+++ b/autogpt_platform/backend/backend/copilot/integration_creds.py
@@ -1,162 +0,0 @@
-"""Integration credential lookup with per-process TTL cache.
-
-Provides token retrieval for connected integrations so that copilot tools
-(e.g. bash_exec) can inject auth tokens into the execution environment without
-hitting the database on every command.
-
-Cache semantics (handled automatically by TTLCache):
-- Token found → cached for _TOKEN_CACHE_TTL (5 min).  Avoids repeated DB hits
-  for users who have credentials and are running many bash commands.
-- No credentials found → cached for _NULL_CACHE_TTL (60 s).  Avoids a DB hit
-  on every E2B command for users who haven't connected an account yet, while
-  still picking up a newly-connected account within one minute.
-
-Both caches are bounded to _CACHE_MAX_SIZE entries; cachetools evicts the
-least-recently-used entry when the limit is reached.
-
-Multi-worker note: both caches are in-process only.  Each worker/replica
-maintains its own independent cache, so a credential fetch may be duplicated
-across processes.  This is acceptable for the current goal (reduce DB hits per
-session per-process), but if cache efficiency across replicas becomes important
-a shared cache (e.g. Redis) should be used instead.
-"""
-
-import logging
-from typing import cast
-
-from cachetools import TTLCache
-
-from backend.data.model import APIKeyCredentials, OAuth2Credentials
-from backend.integrations.creds_manager import (
-    IntegrationCredentialsManager,
-    register_creds_changed_hook,
-)
-
-logger = logging.getLogger(__name__)
-
-# Maps provider slug → env var names to inject when the provider is connected.
-# Add new providers here when adding integration support.
-# NOTE: keep in sync with connect_integration._PROVIDER_INFO — both registries
-# must be updated when adding a new provider.
-PROVIDER_ENV_VARS: dict[str, list[str]] = {
-    "github": ["GH_TOKEN", "GITHUB_TOKEN"],
-}
-
-_TOKEN_CACHE_TTL = 300.0  # seconds — for found tokens
-_NULL_CACHE_TTL = 60.0  # seconds — for "not connected" results
-_CACHE_MAX_SIZE = 10_000
-
-# (user_id, provider) → token string.  TTLCache handles expiry + eviction.
-# Thread-safety note: TTLCache is NOT thread-safe, but that is acceptable here
-# because all callers (get_provider_token, invalidate_user_provider_cache) run
-# exclusively on the asyncio event loop.  There are no await points between a
-# cache read and its corresponding write within any function, so no concurrent
-# coroutine can interleave.  If ThreadPoolExecutor workers are ever added to
-# this path, a threading.RLock should be wrapped around these caches.
-_token_cache: TTLCache[tuple[str, str], str] = TTLCache(
-    maxsize=_CACHE_MAX_SIZE, ttl=_TOKEN_CACHE_TTL
-)
-# Separate cache for "no credentials" results with a shorter TTL.
-_null_cache: TTLCache[tuple[str, str], bool] = TTLCache(
-    maxsize=_CACHE_MAX_SIZE, ttl=_NULL_CACHE_TTL
-)
-
-
-def invalidate_user_provider_cache(user_id: str, provider: str) -> None:
-    """Remove the cached entry for *user_id*/*provider* from both caches.
-
-    Call this after storing new credentials so that the next
-    ``get_provider_token()`` call performs a fresh DB lookup instead of
-    serving a stale TTL-cached result.
-    """
-    key = (user_id, provider)
-    _token_cache.pop(key, None)
-    _null_cache.pop(key, None)
-
-
-# Register this module's cache-bust function with the credentials manager so
-# that any create/update/delete operation immediately evicts stale cache
-# entries.  This avoids a lazy import inside creds_manager and eliminates the
-# circular-import risk.
-register_creds_changed_hook(invalidate_user_provider_cache)
-
-# Module-level singleton to avoid re-instantiating IntegrationCredentialsManager
-# on every cache-miss call to get_provider_token().
-_manager = IntegrationCredentialsManager()
-
-
-async def get_provider_token(user_id: str, provider: str) -> str | None:
-    """Return the user's access token for *provider*, or ``None`` if not connected.
-
-    OAuth2 tokens are preferred (refreshed if needed); API keys are the fallback.
-    Found tokens are cached for _TOKEN_CACHE_TTL (5 min).  "Not connected" results
-    are cached for _NULL_CACHE_TTL (60 s) to avoid a DB hit on every bash_exec
-    command for users who haven't connected yet, while still picking up a
-    newly-connected account within one minute.
-    """
-    cache_key = (user_id, provider)
-
-    if cache_key in _null_cache:
-        return None
-    if cached := _token_cache.get(cache_key):
-        return cached
-
-    manager = _manager
-    try:
-        creds_list = await manager.store.get_creds_by_provider(user_id, provider)
-    except Exception:
-        logger.debug("Failed to fetch %s credentials for user %s", provider, user_id)
-        return None
-
-    # Pass 1: prefer OAuth2 (carry scope info, refreshable via token endpoint).
-    # Sort so broader-scoped tokens come first: a token with "repo" scope covers
-    # full git access, while a public-data-only token lacks push/pull permission.
-    # lock=False — background injection; not worth a distributed lock acquisition.
-    oauth2_creds = sorted(
-        [c for c in creds_list if c.type == "oauth2"],
-        key=lambda c: 0 if "repo" in (cast(OAuth2Credentials, c).scopes or []) else 1,
-    )
-    for creds in oauth2_creds:
-        if creds.type == "oauth2":
-            try:
-                fresh = await manager.refresh_if_needed(
-                    user_id, cast(OAuth2Credentials, creds), lock=False
-                )
-                token = fresh.access_token.get_secret_value()
-            except Exception:
-                logger.warning(
-                    "Failed to refresh %s OAuth token for user %s; "
-                    "falling back to potentially stale token",
-                    provider,
-                    user_id,
-                )
-                token = cast(OAuth2Credentials, creds).access_token.get_secret_value()
-            _token_cache[cache_key] = token
-            return token
-
-    # Pass 2: fall back to API key (no expiry, no refresh needed).
-    for creds in creds_list:
-        if creds.type == "api_key":
-            token = cast(APIKeyCredentials, creds).api_key.get_secret_value()
-            _token_cache[cache_key] = token
-            return token
-
-    # No credentials found — cache to avoid repeated DB hits.
-    _null_cache[cache_key] = True
-    return None
-
-
-async def get_integration_env_vars(user_id: str) -> dict[str, str]:
-    """Return env vars for all providers the user has connected.
-
-    Iterates :data:`PROVIDER_ENV_VARS`, fetches each token, and builds a flat
-    ``{env_var: token}`` dict ready to pass to a subprocess or E2B sandbox.
-    Only providers with a stored credential contribute entries.
-    """
-    env: dict[str, str] = {}
-    for provider, var_names in PROVIDER_ENV_VARS.items():
-        token = await get_provider_token(user_id, provider)
-        if token:
-            for var in var_names:
-                env[var] = token
-    return env
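Note on the removed cache: the module above kept found tokens and "not connected" results in two `cachetools.TTLCache` instances with different lifetimes. For reference, the same idea can be sketched with the standard library alone. `TwoTierTTLCache` is a hypothetical name, the clock is injectable for testing, and this sketch omits the size bound and eviction that the removed code had.

```python
import time
from typing import Callable, Optional


class TwoTierTTLCache:
    """Found values live for `hit_ttl` seconds, misses for `miss_ttl`."""

    def __init__(
        self,
        hit_ttl: float,
        miss_ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._hit_ttl = hit_ttl
        self._miss_ttl = miss_ttl
        self._clock = clock
        # key -> (expiry timestamp, value or None for "nothing found")
        self._entries: dict[str, tuple[float, Optional[str]]] = {}

    def lookup(self, key: str) -> tuple[bool, Optional[str]]:
        """Return (is_cached, value); expired entries count as not cached."""
        entry = self._entries.get(key)
        if entry is None:
            return False, None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return False, None
        return True, value

    def store(self, key: str, value: Optional[str]) -> None:
        """Cache a value, using the shorter TTL when nothing was found."""
        ttl = self._hit_ttl if value is not None else self._miss_ttl
        self._entries[key] = (self._clock() + ttl, value)

    def invalidate(self, key: str) -> None:
        """Drop the entry, e.g. after credentials change."""
        self._entries.pop(key, None)
```

The short lifetime for misses is what lets a newly connected account be picked up quickly while still avoiding a database lookup on every command.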
--- a/autogpt_platform/backend/backend/copilot/integration_creds_test.py
+++ b/autogpt_platform/backend/backend/copilot/integration_creds_test.py
@@ -1,193 +0,0 @@
-"""Tests for integration_creds — TTL cache and token lookup paths."""
-
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-from pydantic import SecretStr
-
-from backend.copilot.integration_creds import (
-    _NULL_CACHE_TTL,
-    _TOKEN_CACHE_TTL,
-    PROVIDER_ENV_VARS,
-    _null_cache,
-    _token_cache,
-    get_integration_env_vars,
-    get_provider_token,
-    invalidate_user_provider_cache,
-)
-from backend.data.model import APIKeyCredentials, OAuth2Credentials
-
-_USER = "user-integration-creds-test"
-_PROVIDER = "github"
-
-
-def _make_api_key_creds(key: str = "test-api-key") -> APIKeyCredentials:
-    return APIKeyCredentials(
-        id="creds-api-key",
-        provider=_PROVIDER,
-        api_key=SecretStr(key),
-        title="Test API Key",
-        expires_at=None,
-    )
-
-
-def _make_oauth2_creds(token: str = "test-oauth-token") -> OAuth2Credentials:
-    return OAuth2Credentials(
-        id="creds-oauth2",
-        provider=_PROVIDER,
-        title="Test OAuth",
-        access_token=SecretStr(token),
-        refresh_token=SecretStr("test-refresh"),
-        access_token_expires_at=None,
-        refresh_token_expires_at=None,
-        scopes=[],
-    )
-
-
-@pytest.fixture(autouse=True)
-def clear_caches():
-    """Ensure clean caches before and after every test."""
-    _token_cache.clear()
-    _null_cache.clear()
-    yield
-    _token_cache.clear()
-    _null_cache.clear()
-
-
-class TestInvalidateUserProviderCache:
-    def test_removes_token_entry(self):
-        key = (_USER, _PROVIDER)
-        _token_cache[key] = "tok"
-        invalidate_user_provider_cache(_USER, _PROVIDER)
-        assert key not in _token_cache
-
-    def test_removes_null_entry(self):
-        key = (_USER, _PROVIDER)
-        _null_cache[key] = True
-        invalidate_user_provider_cache(_USER, _PROVIDER)
-        assert key not in _null_cache
-
-    def test_noop_when_key_not_cached(self):
-        # Should not raise even when there is no cache entry.
-        invalidate_user_provider_cache("no-such-user", _PROVIDER)
-
-    def test_only_removes_targeted_key(self):
-        other_key = ("other-user", _PROVIDER)
-        _token_cache[other_key] = "other-tok"
-        invalidate_user_provider_cache(_USER, _PROVIDER)
-        assert other_key in _token_cache
-
-
-class TestGetProviderToken:
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_returns_cached_token_without_db_hit(self):
-        _token_cache[(_USER, _PROVIDER)] = "cached-tok"
-
-        mock_manager = MagicMock()
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result == "cached-tok"
-        mock_manager.store.get_creds_by_provider.assert_not_called()
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_returns_none_for_null_cached_provider(self):
-        _null_cache[(_USER, _PROVIDER)] = True
-
-        mock_manager = MagicMock()
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result is None
-        mock_manager.store.get_creds_by_provider.assert_not_called()
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_api_key_creds_returned_and_cached(self):
-        api_creds = _make_api_key_creds("my-api-key")
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[api_creds])
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result == "my-api-key"
-        assert _token_cache.get((_USER, _PROVIDER)) == "my-api-key"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth2_preferred_over_api_key(self):
-        oauth_creds = _make_oauth2_creds("oauth-tok")
-        api_creds = _make_api_key_creds("api-tok")
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(
-            return_value=[api_creds, oauth_creds]
-        )
-        mock_manager.refresh_if_needed = AsyncMock(return_value=oauth_creds)
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result == "oauth-tok"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth2_refresh_failure_falls_back_to_stale_token(self):
-        oauth_creds = _make_oauth2_creds("stale-oauth-tok")
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[oauth_creds])
-        mock_manager.refresh_if_needed = AsyncMock(side_effect=RuntimeError("network"))
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result == "stale-oauth-tok"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_no_credentials_caches_null_entry(self):
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[])
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result is None
-        assert _null_cache.get((_USER, _PROVIDER)) is True
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_db_exception_returns_none_without_caching(self):
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(
-            side_effect=RuntimeError("db down")
-        )
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result is None
-        # DB errors are not cached — next call will retry
-        assert (_USER, _PROVIDER) not in _token_cache
-        assert (_USER, _PROVIDER) not in _null_cache
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_null_cache_has_shorter_ttl_than_token_cache(self):
-        """Verify the TTL constants are set correctly for each cache."""
-        assert _null_cache.ttl == _NULL_CACHE_TTL
-        assert _token_cache.ttl == _TOKEN_CACHE_TTL
-        assert _NULL_CACHE_TTL < _TOKEN_CACHE_TTL
-
-
-class TestGetIntegrationEnvVars:
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_injects_all_env_vars_for_provider(self):
-        _token_cache[(_USER, "github")] = "gh-tok"
-
-        result = await get_integration_env_vars(_USER)
-
-        for var in PROVIDER_ENV_VARS["github"]:
-            assert result[var] == "gh-tok"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_empty_dict_when_no_credentials(self):
-        _null_cache[(_USER, "github")] = True
-
-        result = await get_integration_env_vars(_USER)
-
-        assert result == {}
--- a/autogpt_platform/backend/backend/copilot/prompting.py
+++ b/autogpt_platform/backend/backend/copilot/prompting.py
@@ -93,25 +93,22 @@ Example — committing an image file to GitHub:
 ### Sub-agent tasks
 - When using the Task tool, NEVER set `run_in_background` to true.
  All tasks must run in the foreground.
-"""

-# E2B-only notes — E2B has full internet access so gh CLI works there.
-# Not shown in local (bubblewrap) mode: --unshare-net blocks all network.
-_E2B_TOOL_NOTES = """
-### GitHub CLI (`gh`) and git
-- If the user has connected their GitHub account, both `gh` and `git` are
-  pre-authenticated — use them directly without any manual login step.
-  `git` HTTPS operations (clone, push, pull) work automatically.
-- If the token changes mid-session (e.g. user reconnects with a new token),
-  run `gh auth setup-git` to re-register the credential helper.
-- If `gh` or `git` fails with an authentication error (e.g. "authentication
-  required", "could not read Username", or exit code 128), call
-  `connect_integration(provider="github")` to surface the GitHub credentials
-  setup card so the user can connect their account. Once connected, retry
-  the operation.
-- For operations that need broader access (e.g. private org repos, GitHub
-  Actions), pass the required scopes: e.g.
-  `connect_integration(provider="github", scopes=["repo", "read:org"])`.
+### Delegating to another autopilot (sub-autopilot pattern)
+Use the **AutoPilotBlock** (`run_block` with block_id
+`c069dc6b-c3ed-4c12-b6e5-d47361e64ce6`) to delegate a task to a fresh
+autopilot instance.  The sub-autopilot has its own full tool set and can
+perform multi-step work autonomously.
+
+- **Input**: `prompt` (required) — the task description.
+  Optional: `system_context` to constrain behavior, `session_id` to
+  continue a previous conversation, `max_recursion_depth` (default 3).
+- **Output**: `response` (text), `tool_calls` (list), `session_id`
+  (for continuation), `conversation_history`, `token_usage`.
+
+Use this when a task is complex enough to benefit from a separate
+autopilot context, e.g. "research X and write a report" while the
+parent autopilot handles orchestration.
 """


@@ -124,7 +121,6 @@ def _build_storage_supplement(
    storage_system_1_persistence: list[str],
    file_move_name_1_to_2: str,
    file_move_name_2_to_1: str,
-    extra_notes: str = "",
 ) -> str:
    """Build storage/filesystem supplement for a specific environment.

@@ -139,7 +135,6 @@ def _build_storage_supplement(
        storage_system_1_persistence: List of persistence behavior descriptions
        file_move_name_1_to_2: Direction label for primary→persistent
        file_move_name_2_to_1: Direction label for persistent→primary
-        extra_notes: Environment-specific notes appended after shared notes
    """
    # Format lists as bullet points with proper indentation
    characteristics = "\n".join(f"   - {c}" for c in storage_system_1_characteristics)
@@ -173,16 +168,12 @@ def _build_storage_supplement(

 ### File persistence
 Important files (code, configs, outputs) should be saved to workspace to ensure they persist.
-{_SHARED_TOOL_NOTES}{extra_notes}"""
+{_SHARED_TOOL_NOTES}"""


 # Pre-built supplements for common environments
 def _get_local_storage_supplement(cwd: str) -> str:
-    """Local ephemeral storage (files lost between turns).
-
-    Network is isolated (bubblewrap --unshare-net), so internet-dependent CLIs
-    like gh will not work — no integration env-var notes are included.
-    """
+    """Local ephemeral storage (files lost between turns)."""
    return _build_storage_supplement(
        working_dir=cwd,
        sandbox_type="in a network-isolated sandbox",
@@ -200,11 +191,7 @@ def _get_local_storage_supplement(cwd: str) -> str:


 def _get_cloud_sandbox_supplement() -> str:
-    """Cloud persistent sandbox (files survive across turns in session).
-
-    E2B has full internet access, so integration tokens (GH_TOKEN etc.) are
-    injected per command in bash_exec — include the CLI guidance notes.
-    """
+    """Cloud persistent sandbox (files survive across turns in session)."""
    return _build_storage_supplement(
        working_dir="/home/user",
        sandbox_type="in a cloud sandbox with full internet access",
@@ -219,7 +206,6 @@ def _get_cloud_sandbox_supplement() -> str:
        ],
        file_move_name_1_to_2="Sandbox → Persistent",
        file_move_name_2_to_1="Persistent → Sandbox",
-        extra_notes=_E2B_TOOL_NOTES,
    )


--- a/autogpt_platform/backend/backend/copilot/sdk/service.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/service.py
@@ -769,7 +769,7 @@ async def stream_chat_completion_sdk(
                    )
                return None
            try:
-                sandbox = await get_or_create_sandbox(
+                return await get_or_create_sandbox(
                    session_id,
                    api_key=e2b_api_key,
                    template=config.e2b_sandbox_template,
@@ -783,9 +783,7 @@ async def stream_chat_completion_sdk(
                    e2b_err,
                    exc_info=True,
                )
-                return None
-
-            return sandbox
+            return None

        async def _fetch_transcript():
            """Download transcript for --resume if applicable."""
--- a/autogpt_platform/backend/backend/copilot/tools/__init__.py
+++ b/autogpt_platform/backend/backend/copilot/tools/__init__.py
@@ -12,7 +12,6 @@ from .agent_browser import BrowserActTool, BrowserNavigateTool, BrowserScreensho
 from .agent_output import AgentOutputTool
 from .base import BaseTool
 from .bash_exec import BashExecTool
-from .connect_integration import ConnectIntegrationTool
 from .continue_run_block import ContinueRunBlockTool
 from .create_agent import CreateAgentTool
 from .customize_agent import CustomizeAgentTool
@@ -85,7 +84,6 @@ TOOL_REGISTRY: dict[str, BaseTool] = {
    "browser_screenshot": BrowserScreenshotTool(),
    # Sandboxed code execution (bubblewrap)
    "bash_exec": BashExecTool(),
-    "connect_integration": ConnectIntegrationTool(),
    # Persistent workspace tools (cloud storage, survives across sessions)
    # Feature request tools
    "search_feature_requests": SearchFeatureRequestsTool(),
--- a/autogpt_platform/backend/backend/copilot/tools/bash_exec.py
+++ b/autogpt_platform/backend/backend/copilot/tools/bash_exec.py
@@ -22,7 +22,6 @@ from e2b import AsyncSandbox
 from e2b.exceptions import TimeoutException

 from backend.copilot.context import E2B_WORKDIR, get_current_sandbox
-from backend.copilot.integration_creds import get_integration_env_vars
 from backend.copilot.model import ChatSession

 from .base import BaseTool
@@ -97,9 +96,7 @@ class BashExecTool(BaseTool):

        sandbox = get_current_sandbox()
        if sandbox is not None:
-            return await self._execute_on_e2b(
-                sandbox, command, timeout, session_id, user_id
-            )
+            return await self._execute_on_e2b(sandbox, command, timeout, session_id)

        # Bubblewrap fallback: local isolated execution.
        if not has_full_sandbox():
@@ -136,27 +133,14 @@ class BashExecTool(BaseTool):
        command: str,
        timeout: int,
        session_id: str | None,
-        user_id: str | None = None,
    ) -> ToolResponseBase:
-        """Execute *command* on the E2B sandbox via commands.run().
-
-        Integration tokens (e.g. GH_TOKEN) are injected into the sandbox env
-        for any user with connected accounts. E2B has full internet access, so
-        CLI tools like ``gh`` work without manual authentication.
-        """
-        envs: dict[str, str] = {
-            "PATH": "/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin",
-        }
-        if user_id is not None:
-            integration_env = await get_integration_env_vars(user_id)
-            envs.update(integration_env)
-
+        """Execute *command* on the E2B sandbox via commands.run()."""
        try:
            result = await sandbox.commands.run(
                f"bash -c {shlex.quote(command)}",
                cwd=E2B_WORKDIR,
                timeout=timeout,
-                envs=envs,
+                envs={"PATH": "/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin"},
            )
            return BashExecResponse(
                message=f"Command executed on E2B (exit {result.exit_code})",
--- a/autogpt_platform/backend/backend/copilot/tools/bash_exec_test.py
+++ b/autogpt_platform/backend/backend/copilot/tools/bash_exec_test.py
@@ -1,78 +0,0 @@
-"""Tests for BashExecTool — E2B path with token injection."""
-
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-
-from ._test_data import make_session
-from .bash_exec import BashExecTool
-from .models import BashExecResponse
-
-_USER = "user-bash-exec-test"
-
-
-def _make_tool() -> BashExecTool:
-    return BashExecTool()
-
-
-def _make_sandbox(exit_code: int = 0, stdout: str = "", stderr: str = "") -> MagicMock:
-    result = MagicMock()
-    result.exit_code = exit_code
-    result.stdout = stdout
-    result.stderr = stderr
-
-    sandbox = MagicMock()
-    sandbox.commands.run = AsyncMock(return_value=result)
-    return sandbox
-
-
-class TestBashExecE2BTokenInjection:
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_token_injected_when_user_id_set(self):
-        """When user_id is provided, integration env vars are merged into sandbox envs."""
-        tool = _make_tool()
-        session = make_session(user_id=_USER)
-        sandbox = _make_sandbox(stdout="ok")
-        env_vars = {"GH_TOKEN": "gh-secret", "GITHUB_TOKEN": "gh-secret"}
-
-        with patch(
-            "backend.copilot.tools.bash_exec.get_integration_env_vars",
-            new=AsyncMock(return_value=env_vars),
-        ) as mock_get_env:
-            result = await tool._execute_on_e2b(
-                sandbox=sandbox,
-                command="echo hi",
-                timeout=10,
-                session_id=session.session_id,
-                user_id=_USER,
-            )
-
-        mock_get_env.assert_awaited_once_with(_USER)
-        call_kwargs = sandbox.commands.run.call_args[1]
-        assert call_kwargs["envs"]["GH_TOKEN"] == "gh-secret"
-        assert call_kwargs["envs"]["GITHUB_TOKEN"] == "gh-secret"
-        assert isinstance(result, BashExecResponse)
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_no_token_injection_when_user_id_is_none(self):
-        """When user_id is None, get_integration_env_vars must NOT be called."""
-        tool = _make_tool()
-        session = make_session(user_id=_USER)
-        sandbox = _make_sandbox(stdout="ok")
-
-        with patch(
-            "backend.copilot.tools.bash_exec.get_integration_env_vars",
-            new=AsyncMock(return_value={"GH_TOKEN": "should-not-appear"}),
-        ) as mock_get_env:
-            result = await tool._execute_on_e2b(
-                sandbox=sandbox,
-                command="echo hi",
-                timeout=10,
-                session_id=session.session_id,
-                user_id=None,
-            )
-
-        mock_get_env.assert_not_called()
-        call_kwargs = sandbox.commands.run.call_args[1]
-        assert "GH_TOKEN" not in call_kwargs["envs"]
-        assert isinstance(result, BashExecResponse)
--- a/autogpt_platform/backend/backend/copilot/tools/connect_integration.py
+++ b/autogpt_platform/backend/backend/copilot/tools/connect_integration.py
@@ -1,215 +0,0 @@
-"""Tool for prompting the user to connect a required integration.
-
-When the copilot encounters an authentication failure (e.g. `gh` CLI returns
-"authentication required"), it calls this tool to surface the credentials
-setup card in the chat — the same UI that appears when a GitHub block runs
-without configured credentials.
-"""
-
-import functools
-from typing import Any, TypedDict
-
-from backend.copilot.model import ChatSession
-from backend.copilot.tools.models import (
-    ErrorResponse,
-    ResponseType,
-    SetupInfo,
-    SetupRequirementsResponse,
-    ToolResponseBase,
-    UserReadiness,
-)
-
-from .base import BaseTool
-
-
-class _ProviderInfo(TypedDict):
-    name: str
-    types: list[str]
-    # Default OAuth scopes requested when the agent doesn't specify any.
-    scopes: list[str]
-
-
-class _CredentialEntry(TypedDict):
-    """Shape of each entry inside SetupRequirementsResponse.user_readiness.missing_credentials."""
-
-    id: str
-    title: str
-    provider: str
-    provider_name: str
-    type: str
-    types: list[str]
-    scopes: list[str]
-
-
-@functools.lru_cache(maxsize=1)
-def _is_github_oauth_configured() -> bool:
-    """Return True if GitHub OAuth env vars are set.
-
-    Evaluated lazily (not at import time) to avoid triggering Secrets() during
-    module import, which can fail in environments where secrets are not loaded.
-    """
-    from backend.blocks.github._auth import GITHUB_OAUTH_IS_CONFIGURED
-
-    return GITHUB_OAUTH_IS_CONFIGURED
-
-
-# Registry of known providers: name + supported credential types for the UI.
-# When adding a new provider, also add its env var names to
-# backend.copilot.integration_creds.PROVIDER_ENV_VARS.
-def _get_provider_info() -> dict[str, _ProviderInfo]:
-    """Build the provider registry, evaluating OAuth config lazily."""
-    return {
-        "github": {
-            "name": "GitHub",
-            "types": (
-                ["api_key", "oauth2"] if _is_github_oauth_configured() else ["api_key"]
-            ),
-            # Default: repo scope covers clone/push/pull for public and private repos.
-            # Agent can request additional scopes (e.g. "read:org") via the scopes param.
-            "scopes": ["repo"],
-        },
-    }
-
-
-class ConnectIntegrationTool(BaseTool):
-    """Surface the credentials setup UI when an integration is not connected."""
-
-    @property
-    def name(self) -> str:
-        return "connect_integration"
-
-    @property
-    def description(self) -> str:
-        return (
-            "Prompt the user to connect a required integration (e.g. GitHub). "
-            "Call this when an external CLI or API call fails because the user "
-            "has not connected the relevant account. "
-            "The tool surfaces a credentials setup card in the chat so the user "
-            "can authenticate without leaving the page. "
-            "After the user connects the account, retry the operation. "
-            "In E2B/cloud sandbox mode the token (GH_TOKEN/GITHUB_TOKEN) is "
-            "automatically injected per-command in bash_exec — no manual export needed. "
-            "In local bubblewrap mode network is isolated so GitHub CLI commands "
-            "will still fail after connecting; inform the user of this limitation."
-        )
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "provider": {
-                    "type": "string",
-                    "description": (
-                        "Integration provider slug, e.g. 'github'. "
-                        "Must be one of the supported providers."
-                    ),
-                    "enum": list(_get_provider_info().keys()),
-                },
-                "reason": {
-                    "type": "string",
-                    "description": (
-                        "Brief explanation of why the integration is needed, "
-                        "shown to the user in the setup card."
-                    ),
-                    "maxLength": 500,
-                },
-                "scopes": {
-                    "type": "array",
-                    "items": {"type": "string"},
-                    "description": (
-                        "OAuth scopes to request. Omit to use the provider default. "
-                        "Add extra scopes when you need more access — e.g. for GitHub: "
-                        "'repo' (clone/push/pull), 'read:org' (org membership), "
-                        "'workflow' (GitHub Actions). "
-                        "Requesting only the scopes you actually need is best practice."
-                    ),
-                },
-            },
-            "required": ["provider"],
-        }
-
-    @property
-    def requires_auth(self) -> bool:
-        # Require auth so only authenticated users can trigger the setup card.
-        # The card itself is user-agnostic (no per-user data needed), so
-        # user_id is intentionally unused in _execute.
-        return True
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs: Any,
-    ) -> ToolResponseBase:
-        del user_id  # setup card is user-agnostic; auth is enforced via requires_auth
-        session_id = session.session_id if session else None
-        provider: str = (kwargs.get("provider") or "").strip().lower()
-        reason: str = (kwargs.get("reason") or "").strip()[
-            :500
-        ]  # cap LLM-controlled text
-        extra_scopes: list[str] = [
-            str(s).strip() for s in (kwargs.get("scopes") or []) if str(s).strip()
-        ]
-
-        provider_info = _get_provider_info()
-        info = provider_info.get(provider)
-        if not info:
-            supported = ", ".join(f"'{p}'" for p in provider_info)
-            return ErrorResponse(
-                message=(
-                    f"Unknown provider '{provider}'. "
-                    f"Supported providers: {supported}."
-                ),
-                error="unknown_provider",
-                session_id=session_id,
-            )
-
-        provider_name: str = info["name"]
-        supported_types: list[str] = info["types"]
-        # Merge agent-requested scopes with provider defaults (deduplicated, order preserved).
-        default_scopes: list[str] = info["scopes"]
-        seen: set[str] = set()
-        scopes: list[str] = []
-        for s in default_scopes + extra_scopes:
-            if s not in seen:
-                seen.add(s)
-                scopes.append(s)
-        field_key = f"{provider}_credentials"
-
-        message_parts = [
-            f"To continue, please connect your {provider_name} account.",
-        ]
-        if reason:
-            message_parts.append(reason)
-
-        credential_entry: _CredentialEntry = {
-            "id": field_key,
-            "title": f"{provider_name} Credentials",
-            "provider": provider,
-            "provider_name": provider_name,
-            "type": supported_types[0],
-            "types": supported_types,
-            "scopes": scopes,
-        }
-        missing_credentials: dict[str, _CredentialEntry] = {field_key: credential_entry}
-
-        return SetupRequirementsResponse(
-            type=ResponseType.SETUP_REQUIREMENTS,
-            message=" ".join(message_parts),
-            session_id=session_id,
-            setup_info=SetupInfo(
-                agent_id=f"connect_{provider}",
-                agent_name=provider_name,
-                user_readiness=UserReadiness(
-                    has_all_credentials=False,
-                    missing_credentials=missing_credentials,
-                    ready_to_run=False,
-                ),
-                requirements={
-                    "credentials": [missing_credentials[field_key]],
-                    "inputs": [],
-                    "execution_modes": [],
-                },
-            ),
-        )
--- a/autogpt_platform/backend/backend/copilot/tools/connect_integration_test.py
+++ b/autogpt_platform/backend/backend/copilot/tools/connect_integration_test.py
@@ -1,135 +0,0 @@
-"""Tests for ConnectIntegrationTool."""
-
-import pytest
-
-from ._test_data import make_session
-from .connect_integration import ConnectIntegrationTool
-from .models import ErrorResponse, SetupRequirementsResponse
-
-_TEST_USER_ID = "test-user-connect-integration"
-
-
-class TestConnectIntegrationTool:
-    def _make_tool(self) -> ConnectIntegrationTool:
-        return ConnectIntegrationTool()
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_unknown_provider_returns_error(self):
-        tool = self._make_tool()
-        session = make_session(user_id=_TEST_USER_ID)
-        result = await tool._execute(
-            user_id=_TEST_USER_ID, session=session, provider="nonexistent"
-        )
-        assert isinstance(result, ErrorResponse)
-        assert result.error == "unknown_provider"
-        assert "nonexistent" in result.message
-        assert "github" in result.message
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_empty_provider_returns_error(self):
-        tool = self._make_tool()
-        session = make_session(user_id=_TEST_USER_ID)
-        result = await tool._execute(
-            user_id=_TEST_USER_ID, session=session, provider=""
-        )
-        assert isinstance(result, ErrorResponse)
-        assert result.error == "unknown_provider"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_github_provider_returns_setup_response(self):
-        tool = self._make_tool()
-        session = make_session(user_id=_TEST_USER_ID)
-        result = await tool._execute(
-            user_id=_TEST_USER_ID, session=session, provider="github"
-        )
-        assert isinstance(result, SetupRequirementsResponse)
-        assert result.setup_info.agent_name == "GitHub"
-        assert result.setup_info.agent_id == "connect_github"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_github_has_missing_credentials_in_readiness(self):
-        tool = self._make_tool()
-        session = make_session(user_id=_TEST_USER_ID)
-        result = await tool._execute(
-            user_id=_TEST_USER_ID, session=session, provider="github"
-        )
-        assert isinstance(result, SetupRequirementsResponse)
-        readiness = result.setup_info.user_readiness
-        assert readiness.has_all_credentials is False
-        assert readiness.ready_to_run is False
-        assert "github_credentials" in readiness.missing_credentials
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_github_requirements_include_credential_entry(self):
-        tool = self._make_tool()
-        session = make_session(user_id=_TEST_USER_ID)
-        result = await tool._execute(
-            user_id=_TEST_USER_ID, session=session, provider="github"
-        )
-        assert isinstance(result, SetupRequirementsResponse)
-        creds = result.setup_info.requirements["credentials"]
-        assert len(creds) == 1
-        assert creds[0]["provider"] == "github"
-        assert creds[0]["id"] == "github_credentials"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_reason_appears_in_message(self):
-        tool = self._make_tool()
-        session = make_session(user_id=_TEST_USER_ID)
-        reason = "Needed to create a pull request."
-        result = await tool._execute(
-            user_id=_TEST_USER_ID, session=session, provider="github", reason=reason
-        )
-        assert isinstance(result, SetupRequirementsResponse)
-        assert reason in result.message
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_session_id_propagated(self):
-        tool = self._make_tool()
-        session = make_session(user_id=_TEST_USER_ID)
-        result = await tool._execute(
-            user_id=_TEST_USER_ID, session=session, provider="github"
-        )
-        assert isinstance(result, SetupRequirementsResponse)
-        assert result.session_id == session.session_id
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_provider_case_insensitive(self):
-        """Provider slug is normalised to lowercase before lookup."""
-        tool = self._make_tool()
-        session = make_session(user_id=_TEST_USER_ID)
-        result = await tool._execute(
-            user_id=_TEST_USER_ID, session=session, provider="GitHub"
-        )
-        assert isinstance(result, SetupRequirementsResponse)
-
-    def test_tool_name(self):
-        assert ConnectIntegrationTool().name == "connect_integration"
-
-    def test_requires_auth(self):
-        assert ConnectIntegrationTool().requires_auth is True
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_unauthenticated_user_gets_need_login_response(self):
-        """execute() with user_id=None must return NeedLoginResponse, not the setup card.
-
-        This verifies that the requires_auth guard in BaseTool.execute() fires
-        before _execute() is called, so unauthenticated callers cannot probe
-        which integrations are configured.
-        """
-        import json
-
-        tool = self._make_tool()
-        # Session still needs a user_id string; the None is passed to execute()
-        # to simulate an unauthenticated call.
-        session = make_session(user_id=_TEST_USER_ID)
-        result = await tool.execute(
-            user_id=None,
-            session=session,
-            tool_call_id="test-call-id",
-            provider="github",
-        )
-        raw = result.output
-        output = json.loads(raw) if isinstance(raw, str) else raw
-        assert output.get("type") == "need_login"
-        assert result.success is False
--- a/autogpt_platform/backend/backend/integrations/creds_manager.py
+++ b/autogpt_platform/backend/backend/integrations/creds_manager.py
@@ -25,35 +25,6 @@ logger = logging.getLogger(__name__)
 settings = Settings()


-_on_creds_changed: Callable[[str, str], None] | None = None
-
-
-def register_creds_changed_hook(hook: Callable[[str, str], None]) -> None:
-    """Register a callback invoked after any credential is created/updated/deleted.
-
-    The callback receives ``(user_id, provider)`` and should be idempotent.
-    Only one hook can be registered at a time; calling this again replaces the
-    previous hook.  Intended to be called once at application startup by the
-    copilot module to bust its token cache without creating an import cycle.
-    """
-    global _on_creds_changed
-    _on_creds_changed = hook
-
-
-def _bust_copilot_cache(user_id: str, provider: str) -> None:
-    """Invoke the registered hook (if any) to bust downstream token caches."""
-    if _on_creds_changed is not None:
-        try:
-            _on_creds_changed(user_id, provider)
-        except Exception:
-            logger.warning(
-                "Credential-change hook failed for user=%s provider=%s",
-                user_id,
-                provider,
-                exc_info=True,
-            )
-
-
 class IntegrationCredentialsManager:
    """
    Handles the lifecycle of integration credentials.
@@ -98,11 +69,7 @@ class IntegrationCredentialsManager:
        return self._locks

    async def create(self, user_id: str, credentials: Credentials) -> None:
-        result = await self.store.add_creds(user_id, credentials)
-        # Bust the copilot token cache so that the next bash_exec picks up the
-        # new credential immediately instead of waiting for _NULL_CACHE_TTL.
-        _bust_copilot_cache(user_id, credentials.provider)
-        return result
+        return await self.store.add_creds(user_id, credentials)

    async def exists(self, user_id: str, credentials_id: str) -> bool:
        return (await self.store.get_creds_by_id(user_id, credentials_id)) is not None
@@ -189,8 +156,6 @@ class IntegrationCredentialsManager:

                fresh_credentials = await oauth_handler.refresh_tokens(credentials)
                await self.store.update_creds(user_id, fresh_credentials)
-                # Bust copilot cache so the refreshed token is picked up immediately.
-                _bust_copilot_cache(user_id, fresh_credentials.provider)
                if _lock and (await _lock.locked()) and (await _lock.owned()):
                    try:
                        await _lock.release()
@@ -203,17 +168,10 @@ class IntegrationCredentialsManager:
    async def update(self, user_id: str, updated: Credentials) -> None:
        async with self._locked(user_id, updated.id):
            await self.store.update_creds(user_id, updated)
-        # Bust the copilot token cache so the updated credential is picked up immediately.
-        _bust_copilot_cache(user_id, updated.provider)

    async def delete(self, user_id: str, credentials_id: str) -> None:
        async with self._locked(user_id, credentials_id):
-            # Read inside the lock to avoid TOCTOU — another coroutine could
-            # delete the same credential between the read and the delete.
-            creds = await self.store.get_creds_by_id(user_id, credentials_id)
            await self.store.delete_creds_by_id(user_id, credentials_id)
-        if creds:
-            _bust_copilot_cache(user_id, creds.provider)

    # -- Locking utilities -- #

--- a/autogpt_platform/backend/scripts/generate_block_docs.py
+++ b/autogpt_platform/backend/scripts/generate_block_docs.py
@@ -112,6 +112,11 @@ CATEGORY_FILE_MAP = {
 }


+_BRAND_NAMES: dict[str, str] = {
+    "AutoPilot": "AutoPilot",
+}
+
+
 def class_name_to_display_name(class_name: str) -> str:
    """Convert BlockClassName to 'Block Class Name'."""
    # Remove 'Block' suffix (only at the end, not all occurrences)
@@ -120,7 +125,13 @@ def class_name_to_display_name(class_name: str) -> str:
    name = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    # Handle consecutive capitals (e.g., 'HTTPRequest' -> 'HTTP Request')
    name = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", name)
-    return name.strip()
+    name = name.strip()
+    # Restore brand names that shouldn't be split
+    for split_form, brand in _BRAND_NAMES.items():
+        # Build the split version (e.g., "AutoPilot" -> "Auto Pilot")
+        split = re.sub(r"([a-z])([A-Z])", r"\1 \2", split_form)
+        name = name.replace(split, brand)
+    return name


 def type_to_readable(type_schema: dict[str, Any] | Any) -> str:
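
Review note: the brand-name restoration in the hunk above can be sketched as a standalone function. This is a minimal sketch, not the script itself: the `Block`-suffix removal line is not shown in the hunk, so the end-anchored `re.sub(r"Block$", ...)` below is an assumption, and the loop variable is renamed to `brand_key` for illustration only.

```python
import re

# Mirrors the table added in the hunk: keys are the unsplit brand spellings.
_BRAND_NAMES: dict[str, str] = {"AutoPilot": "AutoPilot"}


def class_name_to_display_name(class_name: str) -> str:
    """Convert BlockClassName to 'Block Class Name', keeping brand names intact."""
    # Assumption: the suffix is stripped with an end-anchored regex
    # (that line is outside the hunk shown above).
    name = re.sub(r"Block$", "", class_name)
    # Insert a space at every lowercase -> uppercase boundary.
    name = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    # Split runs of capitals from a following capitalised word.
    name = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", name)
    name = name.strip()
    # Undo the split for known brand names ("Auto Pilot" -> "AutoPilot").
    for brand_key, brand in _BRAND_NAMES.items():
        split = re.sub(r"([a-z])([A-Z])", r"\1 \2", brand_key)
        name = name.replace(split, brand)
    return name


if __name__ == "__main__":
    for cls in ("AutoPilotBlock", "AgentExecutorBlock", "HTTPRequestBlock"):
        print(cls, "->", class_name_to_display_name(cls))
```

Because the split form is derived from the key with the same regex used on the class name, adding another brand only requires a new dictionary entry.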
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatMessagesContainer/components/MessagePartRenderer.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatMessagesContainer/components/MessagePartRenderer.tsx
@@ -3,7 +3,6 @@ import { ErrorCard } from "@/components/molecules/ErrorCard/ErrorCard";
 import { ExclamationMarkIcon } from "@phosphor-icons/react";
 import { ToolUIPart, UIDataTypes, UIMessage, UITools } from "ai";
 import { useState } from "react";
-import { ConnectIntegrationTool } from "../../../tools/ConnectIntegrationTool/ConnectIntegrationTool";
 import { CreateAgentTool } from "../../../tools/CreateAgent/CreateAgent";
 import { EditAgentTool } from "../../../tools/EditAgent/EditAgent";
 import {
@@ -130,8 +129,6 @@ export function MessagePartRenderer({ part, messageID, partIndex }: Props) {
    case "tool-search_docs":
    case "tool-get_doc_page":
      return <SearchDocsTool key={key} part={part as ToolUIPart} />;
-    case "tool-connect_integration":
-      return <ConnectIntegrationTool key={key} part={part as ToolUIPart} />;
    case "tool-run_block":
    case "tool-continue_run_block":
      return <RunBlockTool key={key} part={part as ToolUIPart} />;
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/tools/ConnectIntegrationTool/ConnectIntegrationTool.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/tools/ConnectIntegrationTool/ConnectIntegrationTool.tsx
@@ -1,104 +0,0 @@
-"use client";
-
-import type { SetupRequirementsResponse } from "@/app/api/__generated__/models/setupRequirementsResponse";
-import type { ToolUIPart } from "ai";
-import { useState } from "react";
-import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
-import { ContentMessage } from "../../components/ToolAccordion/AccordionContent";
-import { SetupRequirementsCard } from "../RunBlock/components/SetupRequirementsCard/SetupRequirementsCard";
-
-type Props = {
-  part: ToolUIPart;
-};
-
-function parseJson(raw: unknown): unknown {
-  if (typeof raw === "string") {
-    try {
-      return JSON.parse(raw);
-    } catch {
-      return null;
-    }
-  }
-  return raw;
-}
-
-function parseOutput(raw: unknown): SetupRequirementsResponse | null {
-  const parsed = parseJson(raw);
-  if (parsed && typeof parsed === "object" && "setup_info" in parsed) {
-    return parsed as SetupRequirementsResponse;
-  }
-  return null;
-}
-
-function parseError(raw: unknown): string | null {
-  const parsed = parseJson(raw);
-  if (parsed && typeof parsed === "object" && "message" in parsed) {
-    return String((parsed as { message: unknown }).message);
-  }
-  return null;
-}
-
-export function ConnectIntegrationTool({ part }: Props) {
-  // Persist dismissed state here so SetupRequirementsCard remounts don't re-enable Proceed.
-  const [isDismissed, setIsDismissed] = useState(false);
-
-  const isStreaming =
-    part.state === "input-streaming" || part.state === "input-available";
-  const isError = part.state === "output-error";
-
-  const output =
-    part.state === "output-available"
-      ? parseOutput((part as { output?: unknown }).output)
-      : null;
-
-  const errorMessage = isError
-    ? (parseError((part as { output?: unknown }).output) ??
-      "Failed to connect integration")
-    : null;
-
-  const rawProvider =
-    (part as { input?: { provider?: string } }).input?.provider ?? "";
-  const providerName =
-    output?.setup_info?.agent_name ??
-    // Sanitize LLM-controlled provider slug: trim and cap at 64 chars to
-    // prevent runaway text in the DOM.
-    (rawProvider ? rawProvider.trim().slice(0, 64) : "integration");
-
-  const label = isStreaming
-    ? `Connecting ${providerName}…`
-    : isError
-      ? `Failed to connect ${providerName}`
-      : output
-        ? `Connect ${output.setup_info?.agent_name ?? providerName}`
-        : `Connect ${providerName}`;
-
-  return (
-    <div className="py-2">
-      <div className="flex items-center gap-2 text-sm text-muted-foreground">
-        <MorphingTextAnimation
-          text={label}
-          className={isError ? "text-red-500" : undefined}
-        />
-      </div>
-
-      {isError && errorMessage && (
-        <p className="mt-1 text-sm text-red-500">{errorMessage}</p>
-      )}
-
-      {output && (
-        <div className="mt-2">
-          {isDismissed ? (
-            <ContentMessage>Connected. Continuing…</ContentMessage>
-          ) : (
-            <SetupRequirementsCard
-              output={output}
-              credentialsLabel={`${output.setup_info?.agent_name ?? providerName} credentials`}
-              retryInstruction="I've connected my account. Please continue."
-              onComplete={() => setIsDismissed(true)}
-            />
-          )}
-        </div>
-      )}
-    </div>
-  );
-}
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/tools/RunBlock/components/SetupRequirementsCard/SetupRequirementsCard.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/tools/RunBlock/components/SetupRequirementsCard/SetupRequirementsCard.tsx
@@ -23,16 +23,12 @@ interface Props {
  /** Override the label shown above the credentials section.
   * Defaults to "Credentials". */
  credentialsLabel?: string;
-  /** Called after Proceed is clicked so the parent can persist the dismissed state
-   * across remounts (avoids re-enabling the Proceed button on remount). */
-  onComplete?: () => void;
 }

 export function SetupRequirementsCard({
  output,
  retryInstruction,
  credentialsLabel,
-  onComplete,
 }: Props) {
  const { onSend } = useCopilotChatActions();

@@ -72,17 +68,13 @@ export function SetupRequirementsCard({
      return v !== undefined && v !== null && v !== "";
    });

-  if (hasSent) {
-    return <ContentMessage>Connected. Continuing…</ContentMessage>;
-  }
-
  const canRun =
+    !hasSent &&
    (!needsCredentials || isAllCredentialsComplete) &&
    (!needsInputs || isAllInputsComplete);

  function handleRun() {
    setHasSent(true);
-    onComplete?.();

    const parts: string[] = [];
    if (needsCredentials) {
--- a/autogpt_platform/frontend/src/components/contextual/CredentialsInput/useCredentialsInput.ts
+++ b/autogpt_platform/frontend/src/components/contextual/CredentialsInput/useCredentialsInput.ts
@@ -125,9 +125,9 @@ export function useCredentialsInput({
      if (hasAttemptedAutoSelect.current) return;
      hasAttemptedAutoSelect.current = true;

-      // Auto-select only when there is exactly one saved credential.
-      // With multiple options the user must choose — regardless of optional/required.
-      if (savedCreds.length > 1) return;
+      // Auto-select the first saved credential.
+      // For optional fields with multiple options, let the user choose instead.
+      if (isOptional && savedCreds.length > 1) return;

      const cred = savedCreds[0];
      onSelectCredential({
--- a/docs/integrations/README.md
+++ b/docs/integrations/README.md
@@ -579,6 +579,7 @@ Below is a comprehensive list of all available blocks, categorized by their prim
 | Block Name | Description |
 |------------|-------------|
 | [Agent Executor](block-integrations/misc.md#agent-executor) | Executes an existing agent inside your agent |
+| [AutoPilot](block-integrations/misc.md#autopilot) | Execute tasks using AutoGPT AutoPilot with full access to platform tools (agent management, workspace files, web fetch, block execution, and more) |

 ## CRM Services

--- a/docs/integrations/block-integrations/misc.md
+++ b/docs/integrations/block-integrations/misc.md
@@ -38,6 +38,43 @@ Input and output schemas define the expected data structure for communication be

 ---

+## AutoPilot
+
+### What it is
+Execute tasks using AutoGPT AutoPilot with full access to platform tools (agent management, workspace files, web fetch, block execution, and more). Enables sub-agent patterns and scheduled autopilot execution.
+
+### How it works
+<!-- MANUAL: how_it_works -->
+This block invokes the platform's copilot system directly via `stream_chat_completion_sdk`. It creates (or resumes) a chat session, streams the autopilot's response collecting text deltas, tool call details, and token usage, then returns the aggregated results. A recursion depth guard prevents infinite loops when the autopilot calls this block as a sub-agent.
+<!-- END MANUAL -->
+
+### Inputs
+
+| Input | Description | Type | Required |
+|-------|-------------|------|----------|
+| prompt | The task or instruction for the autopilot to execute. The autopilot has access to platform tools like agent management, workspace files, web fetch, block execution, and more. | str | Yes |
+| system_context | Optional additional context prepended to the prompt. Use this to constrain autopilot behavior, provide domain context, or set output format requirements. | str | No |
+| session_id | Session ID to continue an existing autopilot conversation. Leave empty to start a new session. Use the session_id output from a previous run to continue. | str | No |
+| max_recursion_depth | Maximum nesting depth when the autopilot calls this block recursively (sub-agent pattern). Prevents infinite loops. | int | No |
+
+### Outputs
+
+| Output | Description | Type |
+|--------|-------------|------|
+| error | Error message if execution failed. | str |
+| response | The final text response from the autopilot. | str |
+| tool_calls | List of tools called during execution. Each entry has tool_call_id, tool_name, input, output, and success fields. | List[ToolCallEntry] |
+| conversation_history | Full conversation history as JSON. It can be used for logging or analysis. | str |
+| session_id | Session ID for this conversation. Pass this back to continue the conversation in a future run. | str |
+| token_usage | Token usage statistics: prompt_tokens, completion_tokens, total_tokens. | TokenUsage |
+
+### Possible use case
+<!-- MANUAL: use_case -->
+Schedule an autopilot to run daily that checks workspace files, summarizes recent agent activity, and posts a report. Or chain autopilot blocks where one gathers data and another analyzes it, enabling multi-step AI workflows within the graph editor.
+<!-- END MANUAL -->
+
+---
+
 ## Create Reddit Post

 ### What it is
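
The "How it works" text for the AutoPilot block above says the block streams the response while collecting text deltas, tool call details, and token usage, then returns the aggregated results. Below is a minimal sketch of that aggregation step, not the block's actual code. The event shape (plain dicts with a `type` key and the values `text_delta`, `tool_input`, `tool_output`, `usage`) and the function name `aggregate_stream` are assumptions made for illustration; only the output field names (`tool_call_id`, `tool_name`, `input`, `output`, `success`, `prompt_tokens`, `completion_tokens`, `total_tokens`) come from the documented schema.

```python
from __future__ import annotations

from typing import Any, Iterable


def aggregate_stream(events: Iterable[dict[str, Any]]) -> dict[str, Any]:
    """Fold a stream of (hypothetical) event dicts into one result."""
    text_parts: list[str] = []  # joined once at the end
    calls: dict[str, dict[str, Any]] = {}  # keyed by tool_call_id
    usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}

    for event in events:
        kind = event["type"]  # assumed discriminator field
        if kind == "text_delta":
            text_parts.append(event["delta"])
        elif kind == "tool_input":
            calls[event["tool_call_id"]] = {
                "tool_call_id": event["tool_call_id"],
                "tool_name": event["tool_name"],
                "input": event["input"],
                "output": None,  # filled in when the output arrives
                "success": None,
            }
        elif kind == "tool_output":
            # Match the output to its call by id; ignore unknown ids.
            entry = calls.get(event["tool_call_id"])
            if entry is not None:
                entry["output"] = event["output"]
                entry["success"] = event["success"]
        elif kind == "usage":
            usage["prompt_tokens"] += event["prompt_tokens"]
            usage["completion_tokens"] += event["completion_tokens"]

    usage["total_tokens"] = usage["prompt_tokens"] + usage["completion_tokens"]
    return {
        "response": "".join(text_parts),
        "tool_calls": list(calls.values()),
        "token_usage": usage,
    }
```

Collecting text in a list and keying tool calls by id mirrors two choices named in the commit history ("list+join for response text accumulation" and "dict lookup for tool call output matching"). Everything else here is illustrative.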
Author	SHA1	Message	Date
Zamil Majdy	d5ddf7013e	Merge remote-tracking branch 'origin/dev' into feat/autogpt-copilot-block	2026-03-17 06:17:13 +07:00
Zamil Majdy	0b55e55cda	fix(platform): use typing_extensions.TypedDict, snake_case attrs, and handle CancelledError - Use typing_extensions.TypedDict instead of typing.TypedDict for Python <3.12 compat - Rename TypedDict attributes to snake_case (prompt_tokens, tool_call_id, etc.) - Add explicit asyncio.CancelledError handler to prevent orphaned sessions - Regenerate block docs	2026-03-17 04:50:33 +07:00
Zamil Majdy	23764b9eb5	fix(platform): type tool_calls output with ToolCallEntry TypedDict Address PR review: replace generic `list[dict]` with properly typed `list[ToolCallEntry]` for the tool_calls output field, matching the described structure (toolCallId, toolName, input, output, success).	2026-03-17 04:03:56 +07:00
Zamil Majdy	69231dc627	fix(platform): type token_usage output with TypedDict instead of bare dict Address PR review: use TokenUsage TypedDict with promptTokens, completionTokens, and totalTokens fields for better type safety.	2026-03-17 02:18:58 +07:00
Zamil Majdy	3e459c1235	fix(backend): remove duplicate session creation from execute_copilot Session creation was moved to run() in the previous commit. Clean up the now-redundant create_chat_session import and logic in execute_copilot.	2026-03-17 00:37:08 +07:00
Zamil Majdy	040d1e851c	fix(backend): always yield session_id even on stream failure Move session creation from execute_copilot into run() so the session_id is yielded to the user even if the downstream stream raises an exception, preventing orphaned sessions.	2026-03-17 00:32:22 +07:00
Zamil Majdy	cbc16e43a6	docs: regenerate block docs overview table	2026-03-17 00:27:58 +07:00
Zamil Majdy	f6fada8e0a	docs: add Auto Pilot block to AI and Language Models table The AutoPilotBlock is categorized under both AI and AGENT categories, but the README only listed it under Agent Integration. Add it to the AI and Language Models table as well to match runtime categorization.	2026-03-17 00:22:26 +07:00
Zamil Majdy	ee9cec12aa	refactor(platform): rename AutogptCopilotBlock to AutoPilotBlock Rename class, file, and all user-facing strings from "Copilot" to "AutoPilot" per reviewer feedback. Internal method names (execute_copilot) kept as-is since they invoke the copilot system.	2026-03-17 00:14:14 +07:00
Zamil Majdy	3b70f61b17	fix(platform): align copilot block with normal copilot flow - Don't pass session object to stream_chat_completion_sdk; let the SDK load it internally so all session management (lock, persist, transcript, E2B) is handled by its own finally block. This fixes the "stream already active" error when continuing via chat UI. - Create session only when needed, pass just session_id downstream — same pattern as the executor processor and chat API route. - Add sub-copilot delegation guidance to shared tool notes so the copilot knows about the AutogptCopilotBlock for recursive patterns. - Extract recursion helpers as module-level functions.	2026-03-16 23:53:34 +07:00
Zamil Majdy	180578bfe4	fix(platform): use future annotations instead of noqa suppression	2026-03-16 23:32:15 +07:00
Zamil Majdy	8f73813565	refactor(platform): simplify copilot block, drop asyncio.timeout - Remove asyncio.timeout wrapper: the SDK's internal stream must not be cancelled mid-flight (corrupts anyio memory stream, see sdk/service.py L998-1001). Matches normal copilot behavior. - Remove timeout input field (no longer needed). - Extract _check_recursion, _reset_recursion, _get_or_create_session as static helpers to reduce indentation and improve readability. - Use ge=1 on max_recursion_depth SchemaField for schema-level validation instead of runtime check. - Drop explicit TimeoutError catch (no longer raised).	2026-03-16 23:27:57 +07:00
Zamil Majdy	348e9f8e27	fix(platform): enforce chain-wide recursion limit for copilot block The recursion guard now stores the effective limit in a ContextVar so nested calls cannot raise the cap set by the original caller. This prevents a recursive copilot call from bypassing the intended depth limit by passing a higher max_recursion_depth value.	2026-03-16 23:23:43 +07:00
Zamil Majdy	436aab7edc	fix(platform): add bounds validation and fix doc sentence fragment Add validation for timeout (>0) and max_recursion_depth (>=1) in the run method to fail early with clear error messages. Fix sentence fragment in conversation_history description.	2026-03-16 22:27:45 +07:00
Zamil Majdy	abacc25e58	fix(platform): make prompt required in AutogptCopilotBlock schema Removes default="" so schema/UI correctly marks prompt as required, matching the runtime validation that rejects empty prompts.	2026-03-16 22:14:55 +07:00
Zamil Majdy	35bb208b9e	docs: regenerate integrations README overview table	2026-03-16 22:10:16 +07:00
Zamil Majdy	e2fefcf550	docs: fill manual sections for AutogptCopilotBlock in misc.md	2026-03-16 22:07:38 +07:00
Zamil Majdy	da94d4b28e	docs: regenerate block docs to include AutogptCopilotBlock	2026-03-16 22:03:28 +07:00
Zamil Majdy	4e8144e7b7	fix(platform): address PR review comments on AutogptCopilotBlock - Validate user_id is present before running copilot - Catch TimeoutError explicitly with user-friendly message - Use list+join for response text accumulation - Use dict lookup for tool call output matching - Use exclude_none=True in conversation history serialization	2026-03-16 22:00:14 +07:00
Zamil Majdy	0a532185d0	feat(platform): add AutogptCopilotBlock for invoking copilot from graphs Enables sub-agent patterns (copilot calling copilot recursively) and scheduled copilot execution via the agent executor. The block calls stream_chat_completion_sdk directly with full platform tool access.	2026-03-16 21:56:17 +07:00