Compare commits

..

23 Commits

Author SHA1 Message Date
Zamil Majdy
88eaab2baa Merge remote-tracking branch 'origin/dev' into feat/github-cli-copilot 2026-03-17 06:17:03 +07:00
Zamil Majdy
4b0a445635 fix(copilot): remove implicit gh auth setup-git from sandbox creation
Remove the automatic GitHub credential helper configuration that ran on
every E2B sandbox connect/reconnect. This addressed a review concern
about implicitly giving AutoPilot full GitHub access without user
awareness or opt-in.

The bash_exec tool already injects GH_TOKEN/GITHUB_TOKEN per-command
for users who have connected their account via connect_integration,
which is the explicit opt-in path.
2026-03-17 00:36:51 +07:00
Zamil Majdy
36312d2c6e fix(backend/copilot): bust cache on OAuth refresh + persist dismissed state
- creds_manager: call _bust_copilot_cache after refresh_if_needed
  persists the refreshed token so the copilot cache doesn't serve
  a stale access token after silent refresh
- ConnectIntegrationTool: lift isDismissed state to parent so
  SetupRequirementsCard remounts don't re-enable the Proceed button;
  onComplete callback propagates the dismissed signal up
2026-03-16 17:10:18 +07:00
Zamil Majdy
d6d3b8d710 fix(copilot): address coderabbitai major issues — scope token description to E2B, guard cache-bust hook
- connect_integration.py: clarify that GH_TOKEN is injected per-command in
  E2B/cloud only; note that bubblewrap isolates network so retry won't work
- creds_manager._bust_copilot_cache: wrap _on_creds_changed in try/except so
  a failing hook doesn't turn successful create/update/delete into a 500
2026-03-16 15:52:40 +07:00
Zamil Majdy
17d8d0bf05 fix(backend/copilot): run gh auth setup-git once on sandbox connect/reconnect
Move git credential helper setup out of bash_exec (where it ran on
every command) and into _setup_e2b so it runs exactly once per sandbox
connect or reconnect. Non-fatal: logged at debug level on failure.
2026-03-16 15:45:18 +07:00
Zamil Majdy
5a2ab65f41 fix(backend/copilot): run gh auth setup-git once per sandbox session
Use grep to skip re-running if the credential helper is already
configured in ~/.gitconfig — only pays the cost on first command.
Agent can still call it manually if GH_TOKEN changes mid-session.
2026-03-16 15:42:54 +07:00
Zamil Majdy
81a318de3e feat(backend/copilot): improve GitHub OAuth UX and git auth
- Dynamic OAuth scopes: connect_integration tool now accepts a `scopes`
  param so the agent can request exactly the access it needs (e.g.
  `["repo", "read:org"]`); GitHub defaults to `["repo"]` so git
  push/pull works out of the box instead of public-data-only
- Lazy git auth: prepend `gh auth setup-git` on every E2B bash_exec
  when GH_TOKEN is present — git HTTPS clone/push/pull now work
  automatically without the agent needing to set this up manually
- Prefer broadest-scoped OAuth2 credential: sort repo-scoped tokens
  first so a stale public-data token is never picked over a full one
- Collapse SetupRequirementsCard to "Connected. Continuing…" after
  Proceed is clicked instead of leaving the full card visible
- Fix credential auto-select: don't silently pick the first token when
  multiple credentials exist — let the user choose via the dropdown
2026-03-16 15:26:14 +07:00
Zamil Majdy
62c8e8634b fix(copilot): patch _manager singleton directly in tests instead of class constructor
The module-level _manager singleton is created at import time, so patching
IntegrationCredentialsManager after import has no effect. Patch the _manager
attribute directly so get_provider_token uses the mock.
2026-03-16 06:32:36 +07:00
Zamil Majdy
b91c959cd9 fix(copilot): address remaining review findings
- creds_manager: fix TOCTOU in delete() — move get_creds_by_id inside the lock
- creds_manager: replace lazy import in _bust_copilot_cache with a
  register_creds_changed_hook() callback so creds_manager has no runtime
  dependency on the copilot module
- integration_creds: register invalidate_user_provider_cache at module import
  via register_creds_changed_hook() — eliminates the circular-import risk
- integration_creds: add module-level _manager singleton (avoids re-instantiating
  IntegrationCredentialsManager on every cache miss)
- integration_creds: document TTLCache asyncio-only thread-safety assumption
- connect_integration: defer GITHUB_OAUTH_IS_CONFIGURED evaluation to runtime
  with an lru_cache'd helper; importing the module no longer triggers Secrets()
- connect_integration: type missing_credentials dict with _CredentialEntry TypedDict
- connect_integration: cap reason field at 500 chars; add maxLength to JSON schema
- bash_exec: use 'user_id is not None' instead of truthy check
- connect_integration_test: add test for unauthenticated caller (requires_auth guard)
- bash_exec_test: add E2B path tests — token injected when user_id set, skipped
  when user_id is None
- ConnectIntegrationTool.tsx: sanitize LLM-controlled providerName fallback
  (trim + slice to 64 chars)
2026-03-16 06:21:44 +07:00
Zamil Majdy
5b95a2a1ef refactor(copilot): strongly type _PROVIDER_INFO with TypedDict
Replace dict[str, Any] with a _ProviderInfo TypedDict for provider
metadata entries, eliminating key/type drift as new providers are added.
2026-03-16 06:04:02 +07:00
Zamil Majdy
9c2a601167 refactor(copilot): simplify cache with cachetools.TTLCache, fix prompt wording
- Replace manual dict+sentinel cache with two TTLCache instances:
  _token_cache (5min TTL) and _null_cache (60s TTL)
- Remove _cache_set helper and _NO_TOKEN sentinel — TTLCache handles
  expiry and LRU eviction natively
- Update tests to use _token_cache/_null_cache directly; add TTL constant test
- Change _E2B_TOOL_NOTES from "GH_TOKEN is set" to "gh is pre-authenticated"
  so the AI doesn't attempt to read the env var directly
2026-03-16 00:16:26 +07:00
Zamil Majdy
b98e37bf23 refactor(copilot): DRY cache-bust helper, fast eviction test, unified JSON parse
Backend:
- Extract _bust_copilot_cache() in creds_manager.py; create/update/delete
  now each call it once instead of repeating the try/except ImportError block
- test_evicts_oldest_when_full: patch _CACHE_MAX_SIZE to 3 to avoid
  allocating 10 000 entries in CI; remove now-unused _CACHE_MAX_SIZE import

Frontend:
- Extract parseJson() helper shared by parseOutput and parseError in
  ConnectIntegrationTool.tsx, eliminating duplicated try/catch logic
2026-03-16 00:01:10 +07:00
Zamil Majdy
fec8924361 fix(copilot): bust token cache on update/delete, tighten except clause
- creds_manager.create/update/delete now all call invalidate_user_provider_cache
  after mutating credentials, so the next bash_exec always picks up the
  current state without waiting for TTL to expire
- Change broad `except Exception` to `except ImportError` in all three methods
  so real bugs inside invalidate_user_provider_cache are not silently swallowed
- delete() reads the provider before deletion so we know which cache key to evict
- Add tests for invalidate_user_provider_cache: removes sentinel/token entry,
  no-op when key absent, only removes the targeted key
2026-03-15 23:57:11 +07:00
Zamil Majdy
712aee7302 fix(copilot): warn on stale OAuth token fallback, document per-process cache
- Log at WARNING (not DEBUG) when OAuth refresh fails and we fall back to
  a potentially stale token, so operators can diagnose repeated auth failures
- Add multi-worker note to module docstring: _token_cache is process-local;
  each replica maintains its own cache (acceptable for current goal, but a
  shared cache would be needed for cross-replica efficiency)
2026-03-15 23:53:10 +07:00
Zamil Majdy
bef292033e fix(copilot): render error state in ConnectIntegrationTool
When part.state is 'output-error', show the error message from the
backend (ErrorResponse.message) in red text below the status line.
Without this, errors from unknown/unsupported providers were silently
discarded, leaving the user without any feedback.
2026-03-15 23:51:09 +07:00
Zamil Majdy
ec6974e3b8 fix(copilot): invalidate null cache on credential creation
When a user connects an integration, IntegrationCredentialsManager.create()
now calls invalidate_user_provider_cache() to remove any stale _NO_TOKEN
sentinel from the TTL cache. Without this, the first retry after connecting
would still return None for up to _NULL_CACHE_TTL (60 s).

The import is done lazily inside create() to avoid a circular import
between integrations.creds_manager and copilot.integration_creds.
2026-03-15 23:50:34 +07:00
Zamil Majdy
2ef5e2fe77 feat(copilot): bounded TTL cache with sentinel for integration creds
- Replace empty-string sentinel with explicit _NO_TOKEN = object() to
  avoid ambiguity with zero-length tokens
- Bound _token_cache to _CACHE_MAX_SIZE=10_000 entries; _cache_set()
  evicts oldest insertion-order entry when full
- Cache "not connected" results with _NULL_CACHE_TTL=60s (vs 300s for
  found tokens) to avoid a DB hit on every E2B bash_exec for users who
  haven't connected yet, while still picking up a new connection quickly
- Add integration_creds_test.py covering all cache paths, sentinel,
  eviction, OAuth2 preferred/fallback, DB exception, and env var injection
2026-03-15 23:46:05 +07:00
Zamil Majdy
0a8c7221ce fix(copilot): address all review findings
- prompting: rename _SDK_TOOL_NOTES → _E2B_TOOL_NOTES; pass it only to
  _get_cloud_sandbox_supplement() via new extra_notes param — local
  (bubblewrap) mode uses --unshare-net so gh CLI cannot reach GitHub
- integration_creds: cache None results with 60 s TTL (_NULL_CACHE_TTL)
  to avoid a DB hit on every E2B bash_exec for users without GitHub creds;
  found tokens still cached for 5 min (_TOKEN_CACHE_TTL)
- connect_integration: add cross-reference comment to PROVIDER_ENV_VARS
- ConnectIntegrationTool: use provider-specific credentialsLabel
  (e.g. "GitHub credentials" instead of "Integration credentials")
2026-03-15 23:35:25 +07:00
Zamil Majdy
840d1de636 refactor(copilot): move token injection to bash_exec + add integration_creds module
- Extract integration token lookup into backend/copilot/integration_creds.py:
  * Generic get_provider_token(user_id, provider) with 5-min TTL cache
  * get_integration_env_vars(user_id) loops over PROVIDER_ENV_VARS registry
  * Adding a new provider only requires a one-line PROVIDER_ENV_VARS entry

- Inject tokens lazily in bash_exec._execute_on_e2b (E2B has internet access;
  bubblewrap uses --unshare-net so gh CLI cannot reach GitHub regardless)

- Remove eager per-turn GH_TOKEN injection from sdk/service.py (wrong layer:
  bubblewrap is network-isolated, E2B injection now done per-command in bash_exec)

- Fix unsafe output.setup_info?.agent_name access in ConnectIntegrationTool

- Add connect_integration_test.py: unknown provider, known provider structure,
  reason in message, session_id propagation, case-insensitive provider slug
2026-03-15 23:29:33 +07:00
Zamil Majdy
ac55ab619b fix(copilot): address coderabbitai nitpicks
- Use `type Props` instead of `interface Props` in ConnectIntegrationTool
- Simplify parseOutput: skip stringify→parse round-trip for objects
- Document why requires_auth=True despite user_id not being used
2026-03-15 23:14:54 +07:00
Zamil Majdy
a8014d1e92 fix(copilot): address sentry — refresh expired OAuth tokens, handle object output in parseOutput
- service.py: use IntegrationCredentialsManager.refresh_if_needed() instead of
  raw IntegrationCredentialsStore so expired GitHub OAuth tokens are refreshed
  before injection; falls back to stale token on refresh failure to avoid
  breaking the turn entirely (lock=False to avoid blocking the turn)
- ConnectIntegrationTool.tsx: parseOutput now handles both string and
  already-parsed object inputs, matching the RunBlock helper pattern
  used elsewhere in the codebase
2026-03-15 23:06:57 +07:00
Zamil Majdy
7de13c7713 fix(copilot): address self-review — GH_TOKEN OAuth preference, unknown provider error, baseline note scope
- service.py: two-pass loop in _get_github_token_for_user() to genuinely
  prefer OAuth2 tokens over API keys; use creds.type discriminator instead
  of isinstance to match codebase style
- connect_integration.py: return ErrorResponse (not SetupRequirementsResponse)
  for unknown providers so the frontend renders a proper error instead of a
  blank broken card; trim "Integration" suffix from agent_name to avoid
  "Connect GitHub Integration" redundancy
- prompting.py: move GitHub CLI / connect_integration guidance from
  _SHARED_TOOL_NOTES (baseline+SDK) to _SDK_TOOL_NOTES (SDK-only) since
  baseline mode has no subprocess, no gh CLI, and no connect_integration tool
- ConnectIntegrationTool.tsx: simplify parseOutput to short-circuit when raw
  is not a string, removing unnecessary JSON.stringify round-trip
2026-03-15 23:00:46 +07:00
Zamil Majdy
9358b525a0 feat(copilot): inject GH_TOKEN and add connect_integration tool for missing GitHub credentials
When the user has connected GitHub, GH_TOKEN is automatically injected
into the Claude Agent SDK subprocess environment so `gh` CLI works
without any manual auth step.

When GitHub is not connected, the copilot can call the new
connect_integration(provider="github") MCP tool, which surfaces the
same credentials setup card used by GitHub blocks — letting the user
connect their account inline without leaving the chat.

- backend: _get_github_token_for_user() fetches the user's GitHub
  credentials (OAuth2 or API key) and injects GH_TOKEN + GITHUB_TOKEN
  into sdk_env before the Claude Agent SDK subprocess starts
- backend: ConnectIntegrationTool MCP tool returns a
  SetupRequirementsResponse for any known provider (github for now)
- backend: prompting.py documents the gh CLI / connect_integration
  flow in _SHARED_TOOL_NOTES so the copilot knows when to call it
- frontend: ConnectIntegrationTool component renders the existing
  SetupRequirementsCard with a tailored retry instruction
- frontend: MessagePartRenderer dispatches tool-connect_integration
  to the new component
2026-03-15 22:55:08 +07:00
18 changed files with 1003 additions and 419 deletions

View File

@@ -1,341 +0,0 @@
from __future__ import annotations
import asyncio
import contextvars
import json
from typing import TYPE_CHECKING
from typing_extensions import TypedDict
from backend.blocks._base import (
Block,
BlockCategory,
BlockOutput,
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.model import SchemaField
if TYPE_CHECKING:
from backend.executor.utils import ExecutionContext
class ToolCallEntry(TypedDict):
tool_call_id: str
tool_name: str
input: object
output: object | None
success: bool | None
class TokenUsage(TypedDict):
prompt_tokens: int
completion_tokens: int
total_tokens: int
# Task-scoped recursion depth counter & chain-wide limit.
# contextvars are scoped to the current asyncio task, so concurrent
# graph executions each get independent counters.
_autopilot_recursion_depth: contextvars.ContextVar[int] = contextvars.ContextVar(
"_autopilot_recursion_depth", default=0
)
_autopilot_recursion_limit: contextvars.ContextVar[int | None] = contextvars.ContextVar(
"_autopilot_recursion_limit", default=None
)
def _check_recursion(max_depth: int) -> tuple:
"""Check and increment recursion depth. Returns tokens to reset on exit."""
current = _autopilot_recursion_depth.get()
inherited = _autopilot_recursion_limit.get()
limit = max_depth if inherited is None else min(inherited, max_depth)
if current >= limit:
raise RuntimeError(
f"AutoPilot recursion depth limit reached ({limit}). "
"The autopilot has called itself too many times."
)
return (
_autopilot_recursion_depth.set(current + 1),
_autopilot_recursion_limit.set(limit),
)
def _reset_recursion(tokens: tuple) -> None:
_autopilot_recursion_depth.reset(tokens[0])
_autopilot_recursion_limit.reset(tokens[1])
class AutoPilotBlock(Block):
"""Execute tasks using AutoGPT AutoPilot with full access to platform tools.
The autopilot can manage agents, access workspace files, fetch web content,
run blocks, and more. This block enables sub-agent patterns (autopilot calling
autopilot) and scheduled autopilot execution via the agent executor.
"""
class Input(BlockSchemaInput):
prompt: str = SchemaField(
description=(
"The task or instruction for the autopilot to execute. "
"The autopilot has access to platform tools like agent management, "
"workspace files, web fetch, block execution, and more."
),
placeholder="Find my agents and list them",
advanced=False,
)
system_context: str = SchemaField(
description=(
"Optional additional context prepended to the prompt. "
"Use this to constrain autopilot behavior, provide domain "
"context, or set output format requirements."
),
default="",
advanced=True,
)
session_id: str = SchemaField(
description=(
"Session ID to continue an existing autopilot conversation. "
"Leave empty to start a new session. "
"Use the session_id output from a previous run to continue."
),
default="",
advanced=True,
)
max_recursion_depth: int = SchemaField(
description=(
"Maximum nesting depth when the autopilot calls this block "
"recursively (sub-agent pattern). Prevents infinite loops."
),
default=3,
ge=1,
advanced=True,
)
class Output(BlockSchemaOutput):
response: str = SchemaField(
description="The final text response from the autopilot."
)
tool_calls: list[ToolCallEntry] = SchemaField(
description=(
"List of tools called during execution. Each entry has "
"tool_call_id, tool_name, input, output, and success fields."
),
)
conversation_history: str = SchemaField(
description=(
"Full conversation history as JSON. "
"It can be used for logging or analysis."
),
)
session_id: str = SchemaField(
description=(
"Session ID for this conversation. "
"Pass this back to continue the conversation in a future run."
),
)
token_usage: TokenUsage = SchemaField(
description=(
"Token usage statistics: prompt_tokens, "
"completion_tokens, total_tokens."
),
)
error: str = SchemaField(
description="Error message if execution failed.",
)
def __init__(self):
super().__init__(
id="c069dc6b-c3ed-4c12-b6e5-d47361e64ce6",
description=(
"Execute tasks using AutoGPT AutoPilot with full access to "
"platform tools (agent management, workspace files, web fetch, "
"block execution, and more). Enables sub-agent patterns and "
"scheduled autopilot execution."
),
categories={BlockCategory.AI, BlockCategory.AGENT},
input_schema=AutoPilotBlock.Input,
output_schema=AutoPilotBlock.Output,
test_input={
"prompt": "List my agents",
"system_context": "",
"session_id": "",
"max_recursion_depth": 3,
},
test_output=[
("response", "You have 2 agents: Agent A and Agent B."),
("tool_calls", []),
(
"conversation_history",
'[{"role": "user", "content": "List my agents"}]',
),
("session_id", "test-session-id"),
(
"token_usage",
{
"prompt_tokens": 100,
"completion_tokens": 50,
"total_tokens": 150,
},
),
],
test_mock={
"create_session": lambda *args, **kwargs: "test-session-id",
"execute_copilot": lambda *args, **kwargs: (
"You have 2 agents: Agent A and Agent B.",
[],
'[{"role": "user", "content": "List my agents"}]',
"test-session-id",
{
"prompt_tokens": 100,
"completion_tokens": 50,
"total_tokens": 150,
},
),
},
)
async def create_session(self, user_id: str) -> str:
"""Create a new chat session and return its ID (mockable for tests)."""
from backend.copilot.model import create_chat_session
session = await create_chat_session(user_id)
return session.session_id
async def execute_copilot(
self,
prompt: str,
system_context: str,
session_id: str,
max_recursion_depth: int,
user_id: str,
) -> tuple[str, list[ToolCallEntry], str, str, TokenUsage]:
"""Invoke the copilot and collect all stream results.
Follows the same path as the normal copilot: create session if needed,
then let stream_chat_completion_sdk handle everything (session loading,
message append, lock, transcript, cleanup).
"""
from backend.copilot.model import get_chat_session
from backend.copilot.response_model import (
StreamError,
StreamTextDelta,
StreamToolInputAvailable,
StreamToolOutputAvailable,
StreamUsage,
)
from backend.copilot.sdk.service import stream_chat_completion_sdk
tokens = _check_recursion(max_recursion_depth)
try:
effective_prompt = prompt
if system_context:
effective_prompt = f"[System Context: {system_context}]\n\n{prompt}"
# Consume the stream — same as the executor processor.
# Do NOT pass a session object; let the SDK load it internally
# so all session management (lock, persist, transcript) is handled
# by the SDK's own finally block.
response_parts: list[str] = []
tool_calls: list[ToolCallEntry] = []
tool_calls_by_id: dict[str, ToolCallEntry] = {}
total_usage: TokenUsage = {
"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0,
}
async for event in stream_chat_completion_sdk(
session_id=session_id,
message=effective_prompt,
is_user_message=True,
user_id=user_id,
):
if isinstance(event, StreamTextDelta):
response_parts.append(event.delta)
elif isinstance(event, StreamToolInputAvailable):
entry: ToolCallEntry = {
"tool_call_id": event.toolCallId,
"tool_name": event.toolName,
"input": event.input,
"output": None,
"success": None,
}
tool_calls.append(entry)
tool_calls_by_id[event.toolCallId] = entry
elif isinstance(event, StreamToolOutputAvailable):
if tc := tool_calls_by_id.get(event.toolCallId):
tc["output"] = event.output
tc["success"] = event.success
elif isinstance(event, StreamUsage):
total_usage["prompt_tokens"] += event.promptTokens
total_usage["completion_tokens"] += event.completionTokens
total_usage["total_tokens"] += event.totalTokens
elif isinstance(event, StreamError):
raise RuntimeError(f"AutoPilot error: {event.errorText}")
# Session was persisted by the SDK's finally block.
# Re-fetch for conversation history output.
updated_session = await get_chat_session(session_id, user_id)
history_json = "[]"
if updated_session and updated_session.messages:
history_json = json.dumps(
[m.model_dump(exclude_none=True) for m in updated_session.messages],
default=str,
)
return (
"".join(response_parts),
tool_calls,
history_json,
session_id,
total_usage,
)
finally:
_reset_recursion(tokens)
async def run(
self,
input_data: Input,
*,
execution_context: ExecutionContext,
**kwargs,
) -> BlockOutput:
if not input_data.prompt.strip():
yield "error", "Prompt cannot be empty."
return
if not execution_context.user_id:
yield "error", "Cannot run autopilot without an authenticated user."
return
# Create session eagerly so the user always gets the session_id,
# even if the downstream stream fails (avoids orphaned sessions).
sid = input_data.session_id
if not sid:
sid = await self.create_session(execution_context.user_id)
try:
response, tool_calls, history, _, usage = await self.execute_copilot(
prompt=input_data.prompt,
system_context=input_data.system_context,
session_id=sid,
max_recursion_depth=input_data.max_recursion_depth,
user_id=execution_context.user_id,
)
yield "response", response
yield "tool_calls", tool_calls
yield "conversation_history", history
yield "session_id", sid
yield "token_usage", usage
except asyncio.CancelledError:
yield "session_id", sid
yield "error", "AutoPilot execution was cancelled."
raise
except Exception as e:
yield "session_id", sid
yield "error", str(e)

View File

@@ -0,0 +1,162 @@
"""Integration credential lookup with per-process TTL cache.
Provides token retrieval for connected integrations so that copilot tools
(e.g. bash_exec) can inject auth tokens into the execution environment without
hitting the database on every command.
Cache semantics (handled automatically by TTLCache):
- Token found → cached for _TOKEN_CACHE_TTL (5 min). Avoids repeated DB hits
for users who have credentials and are running many bash commands.
- No credentials found → cached for _NULL_CACHE_TTL (60 s). Avoids a DB hit
on every E2B command for users who haven't connected an account yet, while
still picking up a newly-connected account within one minute.
Both caches are bounded to _CACHE_MAX_SIZE entries; cachetools evicts the
least-recently-used entry when the limit is reached.
Multi-worker note: both caches are in-process only. Each worker/replica
maintains its own independent cache, so a credential fetch may be duplicated
across processes. This is acceptable for the current goal (reduce DB hits per
session per-process), but if cache efficiency across replicas becomes important
a shared cache (e.g. Redis) should be used instead.
"""
import logging
from typing import cast
from cachetools import TTLCache
from backend.data.model import APIKeyCredentials, OAuth2Credentials
from backend.integrations.creds_manager import (
IntegrationCredentialsManager,
register_creds_changed_hook,
)
logger = logging.getLogger(__name__)
# Maps provider slug → env var names to inject when the provider is connected.
# Add new providers here when adding integration support.
# NOTE: keep in sync with connect_integration._PROVIDER_INFO — both registries
# must be updated when adding a new provider.
PROVIDER_ENV_VARS: dict[str, list[str]] = {
"github": ["GH_TOKEN", "GITHUB_TOKEN"],
}
_TOKEN_CACHE_TTL = 300.0 # seconds — for found tokens
_NULL_CACHE_TTL = 60.0 # seconds — for "not connected" results
_CACHE_MAX_SIZE = 10_000
# (user_id, provider) → token string. TTLCache handles expiry + eviction.
# Thread-safety note: TTLCache is NOT thread-safe, but that is acceptable here
# because all callers (get_provider_token, invalidate_user_provider_cache) run
# exclusively on the asyncio event loop. There are no await points between a
# cache read and its corresponding write within any function, so no concurrent
# coroutine can interleave. If ThreadPoolExecutor workers are ever added to
# this path, a threading.RLock should be wrapped around these caches.
_token_cache: TTLCache[tuple[str, str], str] = TTLCache(
maxsize=_CACHE_MAX_SIZE, ttl=_TOKEN_CACHE_TTL
)
# Separate cache for "no credentials" results with a shorter TTL.
_null_cache: TTLCache[tuple[str, str], bool] = TTLCache(
maxsize=_CACHE_MAX_SIZE, ttl=_NULL_CACHE_TTL
)
def invalidate_user_provider_cache(user_id: str, provider: str) -> None:
"""Remove the cached entry for *user_id*/*provider* from both caches.
Call this after storing new credentials so that the next
``get_provider_token()`` call performs a fresh DB lookup instead of
serving a stale TTL-cached result.
"""
key = (user_id, provider)
_token_cache.pop(key, None)
_null_cache.pop(key, None)
# Register this module's cache-bust function with the credentials manager so
# that any create/update/delete operation immediately evicts stale cache
# entries. This avoids a lazy import inside creds_manager and eliminates the
# circular-import risk.
register_creds_changed_hook(invalidate_user_provider_cache)
# Module-level singleton to avoid re-instantiating IntegrationCredentialsManager
# on every cache-miss call to get_provider_token().
_manager = IntegrationCredentialsManager()
async def get_provider_token(user_id: str, provider: str) -> str | None:
"""Return the user's access token for *provider*, or ``None`` if not connected.
OAuth2 tokens are preferred (refreshed if needed); API keys are the fallback.
Found tokens are cached for _TOKEN_CACHE_TTL (5 min). "Not connected" results
are cached for _NULL_CACHE_TTL (60 s) to avoid a DB hit on every bash_exec
command for users who haven't connected yet, while still picking up a
newly-connected account within one minute.
"""
cache_key = (user_id, provider)
if cache_key in _null_cache:
return None
if cached := _token_cache.get(cache_key):
return cached
manager = _manager
try:
creds_list = await manager.store.get_creds_by_provider(user_id, provider)
except Exception:
logger.debug("Failed to fetch %s credentials for user %s", provider, user_id)
return None
# Pass 1: prefer OAuth2 (carry scope info, refreshable via token endpoint).
# Sort so broader-scoped tokens come first: a token with "repo" scope covers
# full git access, while a public-data-only token lacks push/pull permission.
# lock=False — background injection; not worth a distributed lock acquisition.
oauth2_creds = sorted(
[c for c in creds_list if c.type == "oauth2"],
key=lambda c: 0 if "repo" in (cast(OAuth2Credentials, c).scopes or []) else 1,
)
for creds in oauth2_creds:
if creds.type == "oauth2":
try:
fresh = await manager.refresh_if_needed(
user_id, cast(OAuth2Credentials, creds), lock=False
)
token = fresh.access_token.get_secret_value()
except Exception:
logger.warning(
"Failed to refresh %s OAuth token for user %s; "
"falling back to potentially stale token",
provider,
user_id,
)
token = cast(OAuth2Credentials, creds).access_token.get_secret_value()
_token_cache[cache_key] = token
return token
# Pass 2: fall back to API key (no expiry, no refresh needed).
for creds in creds_list:
if creds.type == "api_key":
token = cast(APIKeyCredentials, creds).api_key.get_secret_value()
_token_cache[cache_key] = token
return token
# No credentials found — cache to avoid repeated DB hits.
_null_cache[cache_key] = True
return None
async def get_integration_env_vars(user_id: str) -> dict[str, str]:
"""Return env vars for all providers the user has connected.
Iterates :data:`PROVIDER_ENV_VARS`, fetches each token, and builds a flat
``{env_var: token}`` dict ready to pass to a subprocess or E2B sandbox.
Only providers with a stored credential contribute entries.
"""
env: dict[str, str] = {}
for provider, var_names in PROVIDER_ENV_VARS.items():
token = await get_provider_token(user_id, provider)
if token:
for var in var_names:
env[var] = token
return env

View File

@@ -0,0 +1,193 @@
"""Tests for integration_creds — TTL cache and token lookup paths."""
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from pydantic import SecretStr
from backend.copilot.integration_creds import (
_NULL_CACHE_TTL,
_TOKEN_CACHE_TTL,
PROVIDER_ENV_VARS,
_null_cache,
_token_cache,
get_integration_env_vars,
get_provider_token,
invalidate_user_provider_cache,
)
from backend.data.model import APIKeyCredentials, OAuth2Credentials
_USER = "user-integration-creds-test"
_PROVIDER = "github"
def _make_api_key_creds(key: str = "test-api-key") -> APIKeyCredentials:
return APIKeyCredentials(
id="creds-api-key",
provider=_PROVIDER,
api_key=SecretStr(key),
title="Test API Key",
expires_at=None,
)
def _make_oauth2_creds(token: str = "test-oauth-token") -> OAuth2Credentials:
return OAuth2Credentials(
id="creds-oauth2",
provider=_PROVIDER,
title="Test OAuth",
access_token=SecretStr(token),
refresh_token=SecretStr("test-refresh"),
access_token_expires_at=None,
refresh_token_expires_at=None,
scopes=[],
)
@pytest.fixture(autouse=True)
def clear_caches():
"""Ensure clean caches before and after every test."""
_token_cache.clear()
_null_cache.clear()
yield
_token_cache.clear()
_null_cache.clear()
class TestInvalidateUserProviderCache:
def test_removes_token_entry(self):
key = (_USER, _PROVIDER)
_token_cache[key] = "tok"
invalidate_user_provider_cache(_USER, _PROVIDER)
assert key not in _token_cache
def test_removes_null_entry(self):
key = (_USER, _PROVIDER)
_null_cache[key] = True
invalidate_user_provider_cache(_USER, _PROVIDER)
assert key not in _null_cache
def test_noop_when_key_not_cached(self):
# Should not raise even when there is no cache entry.
invalidate_user_provider_cache("no-such-user", _PROVIDER)
def test_only_removes_targeted_key(self):
other_key = ("other-user", _PROVIDER)
_token_cache[other_key] = "other-tok"
invalidate_user_provider_cache(_USER, _PROVIDER)
assert other_key in _token_cache
class TestGetProviderToken:
@pytest.mark.asyncio(loop_scope="session")
async def test_returns_cached_token_without_db_hit(self):
_token_cache[(_USER, _PROVIDER)] = "cached-tok"
mock_manager = MagicMock()
with patch("backend.copilot.integration_creds._manager", mock_manager):
result = await get_provider_token(_USER, _PROVIDER)
assert result == "cached-tok"
mock_manager.store.get_creds_by_provider.assert_not_called()
@pytest.mark.asyncio(loop_scope="session")
async def test_returns_none_for_null_cached_provider(self):
_null_cache[(_USER, _PROVIDER)] = True
mock_manager = MagicMock()
with patch("backend.copilot.integration_creds._manager", mock_manager):
result = await get_provider_token(_USER, _PROVIDER)
assert result is None
mock_manager.store.get_creds_by_provider.assert_not_called()
@pytest.mark.asyncio(loop_scope="session")
async def test_api_key_creds_returned_and_cached(self):
api_creds = _make_api_key_creds("my-api-key")
mock_manager = MagicMock()
mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[api_creds])
with patch("backend.copilot.integration_creds._manager", mock_manager):
result = await get_provider_token(_USER, _PROVIDER)
assert result == "my-api-key"
assert _token_cache.get((_USER, _PROVIDER)) == "my-api-key"
@pytest.mark.asyncio(loop_scope="session")
async def test_oauth2_preferred_over_api_key(self):
oauth_creds = _make_oauth2_creds("oauth-tok")
api_creds = _make_api_key_creds("api-tok")
mock_manager = MagicMock()
mock_manager.store.get_creds_by_provider = AsyncMock(
return_value=[api_creds, oauth_creds]
)
mock_manager.refresh_if_needed = AsyncMock(return_value=oauth_creds)
with patch("backend.copilot.integration_creds._manager", mock_manager):
result = await get_provider_token(_USER, _PROVIDER)
assert result == "oauth-tok"
@pytest.mark.asyncio(loop_scope="session")
async def test_oauth2_refresh_failure_falls_back_to_stale_token(self):
oauth_creds = _make_oauth2_creds("stale-oauth-tok")
mock_manager = MagicMock()
mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[oauth_creds])
mock_manager.refresh_if_needed = AsyncMock(side_effect=RuntimeError("network"))
with patch("backend.copilot.integration_creds._manager", mock_manager):
result = await get_provider_token(_USER, _PROVIDER)
assert result == "stale-oauth-tok"
@pytest.mark.asyncio(loop_scope="session")
async def test_no_credentials_caches_null_entry(self):
mock_manager = MagicMock()
mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[])
with patch("backend.copilot.integration_creds._manager", mock_manager):
result = await get_provider_token(_USER, _PROVIDER)
assert result is None
assert _null_cache.get((_USER, _PROVIDER)) is True
@pytest.mark.asyncio(loop_scope="session")
async def test_db_exception_returns_none_without_caching(self):
mock_manager = MagicMock()
mock_manager.store.get_creds_by_provider = AsyncMock(
side_effect=RuntimeError("db down")
)
with patch("backend.copilot.integration_creds._manager", mock_manager):
result = await get_provider_token(_USER, _PROVIDER)
assert result is None
# DB errors are not cached — next call will retry
assert (_USER, _PROVIDER) not in _token_cache
assert (_USER, _PROVIDER) not in _null_cache
@pytest.mark.asyncio(loop_scope="session")
async def test_null_cache_has_shorter_ttl_than_token_cache(self):
"""Verify the TTL constants are set correctly for each cache."""
assert _null_cache.ttl == _NULL_CACHE_TTL
assert _token_cache.ttl == _TOKEN_CACHE_TTL
assert _NULL_CACHE_TTL < _TOKEN_CACHE_TTL
class TestGetIntegrationEnvVars:
@pytest.mark.asyncio(loop_scope="session")
async def test_injects_all_env_vars_for_provider(self):
_token_cache[(_USER, "github")] = "gh-tok"
result = await get_integration_env_vars(_USER)
for var in PROVIDER_ENV_VARS["github"]:
assert result[var] == "gh-tok"
@pytest.mark.asyncio(loop_scope="session")
async def test_empty_dict_when_no_credentials(self):
_null_cache[(_USER, "github")] = True
result = await get_integration_env_vars(_USER)
assert result == {}

View File

@@ -93,22 +93,25 @@ Example — committing an image file to GitHub:
### Sub-agent tasks
- When using the Task tool, NEVER set `run_in_background` to true.
All tasks must run in the foreground.
"""
### Delegating to another autopilot (sub-autopilot pattern)
Use the **AutoPilotBlock** (`run_block` with block_id
`c069dc6b-c3ed-4c12-b6e5-d47361e64ce6`) to delegate a task to a fresh
autopilot instance. The sub-autopilot has its own full tool set and can
perform multi-step work autonomously.
- **Input**: `prompt` (required) — the task description.
Optional: `system_context` to constrain behavior, `session_id` to
continue a previous conversation, `max_recursion_depth` (default 3).
- **Output**: `response` (text), `tool_calls` (list), `session_id`
(for continuation), `conversation_history`, `token_usage`.
Use this when a task is complex enough to benefit from a separate
autopilot context, e.g. "research X and write a report" while the
parent autopilot handles orchestration.
# E2B-only notes — E2B has full internet access so gh CLI works there.
# Not shown in local (bubblewrap) mode: --unshare-net blocks all network.
_E2B_TOOL_NOTES = """
### GitHub CLI (`gh`) and git
- If the user has connected their GitHub account, both `gh` and `git` are
pre-authenticated — use them directly without any manual login step.
`git` HTTPS operations (clone, push, pull) work automatically.
- If the token changes mid-session (e.g. user reconnects with a new token),
run `gh auth setup-git` to re-register the credential helper.
- If `gh` or `git` fails with an authentication error (e.g. "authentication
required", "could not read Username", or exit code 128), call
`connect_integration(provider="github")` to surface the GitHub credentials
setup card so the user can connect their account. Once connected, retry
the operation.
- For operations that need broader access (e.g. private org repos, GitHub
Actions), pass the required scopes: e.g.
`connect_integration(provider="github", scopes=["repo", "read:org"])`.
"""
@@ -121,6 +124,7 @@ def _build_storage_supplement(
storage_system_1_persistence: list[str],
file_move_name_1_to_2: str,
file_move_name_2_to_1: str,
extra_notes: str = "",
) -> str:
"""Build storage/filesystem supplement for a specific environment.
@@ -135,6 +139,7 @@ def _build_storage_supplement(
storage_system_1_persistence: List of persistence behavior descriptions
file_move_name_1_to_2: Direction label for primary→persistent
file_move_name_2_to_1: Direction label for persistent→primary
extra_notes: Environment-specific notes appended after shared notes
"""
# Format lists as bullet points with proper indentation
characteristics = "\n".join(f" - {c}" for c in storage_system_1_characteristics)
@@ -168,12 +173,16 @@ def _build_storage_supplement(
### File persistence
Important files (code, configs, outputs) should be saved to workspace to ensure they persist.
{_SHARED_TOOL_NOTES}"""
{_SHARED_TOOL_NOTES}{extra_notes}"""
# Pre-built supplements for common environments
def _get_local_storage_supplement(cwd: str) -> str:
"""Local ephemeral storage (files lost between turns)."""
"""Local ephemeral storage (files lost between turns).
Network is isolated (bubblewrap --unshare-net), so internet-dependent CLIs
like gh will not work — no integration env-var notes are included.
"""
return _build_storage_supplement(
working_dir=cwd,
sandbox_type="in a network-isolated sandbox",
@@ -191,7 +200,11 @@ def _get_local_storage_supplement(cwd: str) -> str:
def _get_cloud_sandbox_supplement() -> str:
"""Cloud persistent sandbox (files survive across turns in session)."""
"""Cloud persistent sandbox (files survive across turns in session).
E2B has full internet access, so integration tokens (GH_TOKEN etc.) are
injected per command in bash_exec — include the CLI guidance notes.
"""
return _build_storage_supplement(
working_dir="/home/user",
sandbox_type="in a cloud sandbox with full internet access",
@@ -206,6 +219,7 @@ def _get_cloud_sandbox_supplement() -> str:
],
file_move_name_1_to_2="Sandbox → Persistent",
file_move_name_2_to_1="Persistent → Sandbox",
extra_notes=_E2B_TOOL_NOTES,
)

View File

@@ -769,7 +769,7 @@ async def stream_chat_completion_sdk(
)
return None
try:
return await get_or_create_sandbox(
sandbox = await get_or_create_sandbox(
session_id,
api_key=e2b_api_key,
template=config.e2b_sandbox_template,
@@ -783,7 +783,9 @@ async def stream_chat_completion_sdk(
e2b_err,
exc_info=True,
)
return None
return None
return sandbox
async def _fetch_transcript():
"""Download transcript for --resume if applicable."""

View File

@@ -12,6 +12,7 @@ from .agent_browser import BrowserActTool, BrowserNavigateTool, BrowserScreensho
from .agent_output import AgentOutputTool
from .base import BaseTool
from .bash_exec import BashExecTool
from .connect_integration import ConnectIntegrationTool
from .continue_run_block import ContinueRunBlockTool
from .create_agent import CreateAgentTool
from .customize_agent import CustomizeAgentTool
@@ -84,6 +85,7 @@ TOOL_REGISTRY: dict[str, BaseTool] = {
"browser_screenshot": BrowserScreenshotTool(),
# Sandboxed code execution (bubblewrap)
"bash_exec": BashExecTool(),
"connect_integration": ConnectIntegrationTool(),
# Persistent workspace tools (cloud storage, survives across sessions)
# Feature request tools
"search_feature_requests": SearchFeatureRequestsTool(),

View File

@@ -22,6 +22,7 @@ from e2b import AsyncSandbox
from e2b.exceptions import TimeoutException
from backend.copilot.context import E2B_WORKDIR, get_current_sandbox
from backend.copilot.integration_creds import get_integration_env_vars
from backend.copilot.model import ChatSession
from .base import BaseTool
@@ -96,7 +97,9 @@ class BashExecTool(BaseTool):
sandbox = get_current_sandbox()
if sandbox is not None:
return await self._execute_on_e2b(sandbox, command, timeout, session_id)
return await self._execute_on_e2b(
sandbox, command, timeout, session_id, user_id
)
# Bubblewrap fallback: local isolated execution.
if not has_full_sandbox():
@@ -133,14 +136,27 @@ class BashExecTool(BaseTool):
command: str,
timeout: int,
session_id: str | None,
user_id: str | None = None,
) -> ToolResponseBase:
"""Execute *command* on the E2B sandbox via commands.run()."""
"""Execute *command* on the E2B sandbox via commands.run().
Integration tokens (e.g. GH_TOKEN) are injected into the sandbox env
for any user with connected accounts. E2B has full internet access, so
CLI tools like ``gh`` work without manual authentication.
"""
envs: dict[str, str] = {
"PATH": "/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin",
}
if user_id is not None:
integration_env = await get_integration_env_vars(user_id)
envs.update(integration_env)
try:
result = await sandbox.commands.run(
f"bash -c {shlex.quote(command)}",
cwd=E2B_WORKDIR,
timeout=timeout,
envs={"PATH": "/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin"},
envs=envs,
)
return BashExecResponse(
message=f"Command executed on E2B (exit {result.exit_code})",

View File

@@ -0,0 +1,78 @@
"""Tests for BashExecTool — E2B path with token injection."""
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from ._test_data import make_session
from .bash_exec import BashExecTool
from .models import BashExecResponse
_USER = "user-bash-exec-test"
def _make_tool() -> BashExecTool:
return BashExecTool()
def _make_sandbox(exit_code: int = 0, stdout: str = "", stderr: str = "") -> MagicMock:
result = MagicMock()
result.exit_code = exit_code
result.stdout = stdout
result.stderr = stderr
sandbox = MagicMock()
sandbox.commands.run = AsyncMock(return_value=result)
return sandbox
class TestBashExecE2BTokenInjection:
@pytest.mark.asyncio(loop_scope="session")
async def test_token_injected_when_user_id_set(self):
"""When user_id is provided, integration env vars are merged into sandbox envs."""
tool = _make_tool()
session = make_session(user_id=_USER)
sandbox = _make_sandbox(stdout="ok")
env_vars = {"GH_TOKEN": "gh-secret", "GITHUB_TOKEN": "gh-secret"}
with patch(
"backend.copilot.tools.bash_exec.get_integration_env_vars",
new=AsyncMock(return_value=env_vars),
) as mock_get_env:
result = await tool._execute_on_e2b(
sandbox=sandbox,
command="echo hi",
timeout=10,
session_id=session.session_id,
user_id=_USER,
)
mock_get_env.assert_awaited_once_with(_USER)
call_kwargs = sandbox.commands.run.call_args[1]
assert call_kwargs["envs"]["GH_TOKEN"] == "gh-secret"
assert call_kwargs["envs"]["GITHUB_TOKEN"] == "gh-secret"
assert isinstance(result, BashExecResponse)
@pytest.mark.asyncio(loop_scope="session")
async def test_no_token_injection_when_user_id_is_none(self):
"""When user_id is None, get_integration_env_vars must NOT be called."""
tool = _make_tool()
session = make_session(user_id=_USER)
sandbox = _make_sandbox(stdout="ok")
with patch(
"backend.copilot.tools.bash_exec.get_integration_env_vars",
new=AsyncMock(return_value={"GH_TOKEN": "should-not-appear"}),
) as mock_get_env:
result = await tool._execute_on_e2b(
sandbox=sandbox,
command="echo hi",
timeout=10,
session_id=session.session_id,
user_id=None,
)
mock_get_env.assert_not_called()
call_kwargs = sandbox.commands.run.call_args[1]
assert "GH_TOKEN" not in call_kwargs["envs"]
assert isinstance(result, BashExecResponse)

View File

@@ -0,0 +1,215 @@
"""Tool for prompting the user to connect a required integration.
When the copilot encounters an authentication failure (e.g. `gh` CLI returns
"authentication required"), it calls this tool to surface the credentials
setup card in the chat — the same UI that appears when a GitHub block runs
without configured credentials.
"""
import functools
from typing import Any, TypedDict
from backend.copilot.model import ChatSession
from backend.copilot.tools.models import (
ErrorResponse,
ResponseType,
SetupInfo,
SetupRequirementsResponse,
ToolResponseBase,
UserReadiness,
)
from .base import BaseTool
class _ProviderInfo(TypedDict):
name: str
types: list[str]
# Default OAuth scopes requested when the agent doesn't specify any.
scopes: list[str]
class _CredentialEntry(TypedDict):
"""Shape of each entry inside SetupRequirementsResponse.user_readiness.missing_credentials."""
id: str
title: str
provider: str
provider_name: str
type: str
types: list[str]
scopes: list[str]
@functools.lru_cache(maxsize=1)
def _is_github_oauth_configured() -> bool:
"""Return True if GitHub OAuth env vars are set.
Evaluated lazily (not at import time) to avoid triggering Secrets() during
module import, which can fail in environments where secrets are not loaded.
"""
from backend.blocks.github._auth import GITHUB_OAUTH_IS_CONFIGURED
return GITHUB_OAUTH_IS_CONFIGURED
# Registry of known providers: name + supported credential types for the UI.
# When adding a new provider, also add its env var names to
# backend.copilot.integration_creds.PROVIDER_ENV_VARS.
def _get_provider_info() -> dict[str, _ProviderInfo]:
"""Build the provider registry, evaluating OAuth config lazily."""
return {
"github": {
"name": "GitHub",
"types": (
["api_key", "oauth2"] if _is_github_oauth_configured() else ["api_key"]
),
# Default: repo scope covers clone/push/pull for public and private repos.
# Agent can request additional scopes (e.g. "read:org") via the scopes param.
"scopes": ["repo"],
},
}
class ConnectIntegrationTool(BaseTool):
"""Surface the credentials setup UI when an integration is not connected."""
@property
def name(self) -> str:
return "connect_integration"
@property
def description(self) -> str:
return (
"Prompt the user to connect a required integration (e.g. GitHub). "
"Call this when an external CLI or API call fails because the user "
"has not connected the relevant account. "
"The tool surfaces a credentials setup card in the chat so the user "
"can authenticate without leaving the page. "
"After the user connects the account, retry the operation. "
"In E2B/cloud sandbox mode the token (GH_TOKEN/GITHUB_TOKEN) is "
"automatically injected per-command in bash_exec — no manual export needed. "
"In local bubblewrap mode network is isolated so GitHub CLI commands "
"will still fail after connecting; inform the user of this limitation."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"provider": {
"type": "string",
"description": (
"Integration provider slug, e.g. 'github'. "
"Must be one of the supported providers."
),
"enum": list(_get_provider_info().keys()),
},
"reason": {
"type": "string",
"description": (
"Brief explanation of why the integration is needed, "
"shown to the user in the setup card."
),
"maxLength": 500,
},
"scopes": {
"type": "array",
"items": {"type": "string"},
"description": (
"OAuth scopes to request. Omit to use the provider default. "
"Add extra scopes when you need more access — e.g. for GitHub: "
"'repo' (clone/push/pull), 'read:org' (org membership), "
"'workflow' (GitHub Actions). "
"Requesting only the scopes you actually need is best practice."
),
},
},
"required": ["provider"],
}
@property
def requires_auth(self) -> bool:
# Require auth so only authenticated users can trigger the setup card.
# The card itself is user-agnostic (no per-user data needed), so
# user_id is intentionally unused in _execute.
return True
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs: Any,
) -> ToolResponseBase:
del user_id # setup card is user-agnostic; auth is enforced via requires_auth
session_id = session.session_id if session else None
provider: str = (kwargs.get("provider") or "").strip().lower()
reason: str = (kwargs.get("reason") or "").strip()[
:500
] # cap LLM-controlled text
extra_scopes: list[str] = [
str(s).strip() for s in (kwargs.get("scopes") or []) if str(s).strip()
]
provider_info = _get_provider_info()
info = provider_info.get(provider)
if not info:
supported = ", ".join(f"'{p}'" for p in provider_info)
return ErrorResponse(
message=(
f"Unknown provider '{provider}'. "
f"Supported providers: {supported}."
),
error="unknown_provider",
session_id=session_id,
)
provider_name: str = info["name"]
supported_types: list[str] = info["types"]
# Merge agent-requested scopes with provider defaults (deduplicated, order preserved).
default_scopes: list[str] = info["scopes"]
seen: set[str] = set()
scopes: list[str] = []
for s in default_scopes + extra_scopes:
if s not in seen:
seen.add(s)
scopes.append(s)
field_key = f"{provider}_credentials"
message_parts = [
f"To continue, please connect your {provider_name} account.",
]
if reason:
message_parts.append(reason)
credential_entry: _CredentialEntry = {
"id": field_key,
"title": f"{provider_name} Credentials",
"provider": provider,
"provider_name": provider_name,
"type": supported_types[0],
"types": supported_types,
"scopes": scopes,
}
missing_credentials: dict[str, _CredentialEntry] = {field_key: credential_entry}
return SetupRequirementsResponse(
type=ResponseType.SETUP_REQUIREMENTS,
message=" ".join(message_parts),
session_id=session_id,
setup_info=SetupInfo(
agent_id=f"connect_{provider}",
agent_name=provider_name,
user_readiness=UserReadiness(
has_all_credentials=False,
missing_credentials=missing_credentials,
ready_to_run=False,
),
requirements={
"credentials": [missing_credentials[field_key]],
"inputs": [],
"execution_modes": [],
},
),
)

View File

@@ -0,0 +1,135 @@
"""Tests for ConnectIntegrationTool."""
import pytest
from ._test_data import make_session
from .connect_integration import ConnectIntegrationTool
from .models import ErrorResponse, SetupRequirementsResponse
_TEST_USER_ID = "test-user-connect-integration"
class TestConnectIntegrationTool:
def _make_tool(self) -> ConnectIntegrationTool:
return ConnectIntegrationTool()
@pytest.mark.asyncio(loop_scope="session")
async def test_unknown_provider_returns_error(self):
tool = self._make_tool()
session = make_session(user_id=_TEST_USER_ID)
result = await tool._execute(
user_id=_TEST_USER_ID, session=session, provider="nonexistent"
)
assert isinstance(result, ErrorResponse)
assert result.error == "unknown_provider"
assert "nonexistent" in result.message
assert "github" in result.message
@pytest.mark.asyncio(loop_scope="session")
async def test_empty_provider_returns_error(self):
tool = self._make_tool()
session = make_session(user_id=_TEST_USER_ID)
result = await tool._execute(
user_id=_TEST_USER_ID, session=session, provider=""
)
assert isinstance(result, ErrorResponse)
assert result.error == "unknown_provider"
@pytest.mark.asyncio(loop_scope="session")
async def test_github_provider_returns_setup_response(self):
tool = self._make_tool()
session = make_session(user_id=_TEST_USER_ID)
result = await tool._execute(
user_id=_TEST_USER_ID, session=session, provider="github"
)
assert isinstance(result, SetupRequirementsResponse)
assert result.setup_info.agent_name == "GitHub"
assert result.setup_info.agent_id == "connect_github"
@pytest.mark.asyncio(loop_scope="session")
async def test_github_has_missing_credentials_in_readiness(self):
tool = self._make_tool()
session = make_session(user_id=_TEST_USER_ID)
result = await tool._execute(
user_id=_TEST_USER_ID, session=session, provider="github"
)
assert isinstance(result, SetupRequirementsResponse)
readiness = result.setup_info.user_readiness
assert readiness.has_all_credentials is False
assert readiness.ready_to_run is False
assert "github_credentials" in readiness.missing_credentials
@pytest.mark.asyncio(loop_scope="session")
async def test_github_requirements_include_credential_entry(self):
tool = self._make_tool()
session = make_session(user_id=_TEST_USER_ID)
result = await tool._execute(
user_id=_TEST_USER_ID, session=session, provider="github"
)
assert isinstance(result, SetupRequirementsResponse)
creds = result.setup_info.requirements["credentials"]
assert len(creds) == 1
assert creds[0]["provider"] == "github"
assert creds[0]["id"] == "github_credentials"
@pytest.mark.asyncio(loop_scope="session")
async def test_reason_appears_in_message(self):
tool = self._make_tool()
session = make_session(user_id=_TEST_USER_ID)
reason = "Needed to create a pull request."
result = await tool._execute(
user_id=_TEST_USER_ID, session=session, provider="github", reason=reason
)
assert isinstance(result, SetupRequirementsResponse)
assert reason in result.message
@pytest.mark.asyncio(loop_scope="session")
async def test_session_id_propagated(self):
tool = self._make_tool()
session = make_session(user_id=_TEST_USER_ID)
result = await tool._execute(
user_id=_TEST_USER_ID, session=session, provider="github"
)
assert isinstance(result, SetupRequirementsResponse)
assert result.session_id == session.session_id
@pytest.mark.asyncio(loop_scope="session")
async def test_provider_case_insensitive(self):
"""Provider slug is normalised to lowercase before lookup."""
tool = self._make_tool()
session = make_session(user_id=_TEST_USER_ID)
result = await tool._execute(
user_id=_TEST_USER_ID, session=session, provider="GitHub"
)
assert isinstance(result, SetupRequirementsResponse)
def test_tool_name(self):
assert ConnectIntegrationTool().name == "connect_integration"
def test_requires_auth(self):
assert ConnectIntegrationTool().requires_auth is True
@pytest.mark.asyncio(loop_scope="session")
async def test_unauthenticated_user_gets_need_login_response(self):
"""execute() with user_id=None must return NeedLoginResponse, not the setup card.
This verifies that the requires_auth guard in BaseTool.execute() fires
before _execute() is called, so unauthenticated callers cannot probe
which integrations are configured.
"""
import json
tool = self._make_tool()
# Session still needs a user_id string; the None is passed to execute()
# to simulate an unauthenticated call.
session = make_session(user_id=_TEST_USER_ID)
result = await tool.execute(
user_id=None,
session=session,
tool_call_id="test-call-id",
provider="github",
)
raw = result.output
output = json.loads(raw) if isinstance(raw, str) else raw
assert output.get("type") == "need_login"
assert result.success is False

View File

@@ -25,6 +25,35 @@ logger = logging.getLogger(__name__)
settings = Settings()
_on_creds_changed: Callable[[str, str], None] | None = None
def register_creds_changed_hook(hook: Callable[[str, str], None]) -> None:
"""Register a callback invoked after any credential is created/updated/deleted.
The callback receives ``(user_id, provider)`` and should be idempotent.
Only one hook can be registered at a time; calling this again replaces the
previous hook. Intended to be called once at application startup by the
copilot module to bust its token cache without creating an import cycle.
"""
global _on_creds_changed
_on_creds_changed = hook
def _bust_copilot_cache(user_id: str, provider: str) -> None:
"""Invoke the registered hook (if any) to bust downstream token caches."""
if _on_creds_changed is not None:
try:
_on_creds_changed(user_id, provider)
except Exception:
logger.warning(
"Credential-change hook failed for user=%s provider=%s",
user_id,
provider,
exc_info=True,
)
class IntegrationCredentialsManager:
"""
Handles the lifecycle of integration credentials.
@@ -69,7 +98,11 @@ class IntegrationCredentialsManager:
return self._locks
async def create(self, user_id: str, credentials: Credentials) -> None:
return await self.store.add_creds(user_id, credentials)
result = await self.store.add_creds(user_id, credentials)
# Bust the copilot token cache so that the next bash_exec picks up the
# new credential immediately instead of waiting for _NULL_CACHE_TTL.
_bust_copilot_cache(user_id, credentials.provider)
return result
async def exists(self, user_id: str, credentials_id: str) -> bool:
return (await self.store.get_creds_by_id(user_id, credentials_id)) is not None
@@ -156,6 +189,8 @@ class IntegrationCredentialsManager:
fresh_credentials = await oauth_handler.refresh_tokens(credentials)
await self.store.update_creds(user_id, fresh_credentials)
# Bust copilot cache so the refreshed token is picked up immediately.
_bust_copilot_cache(user_id, fresh_credentials.provider)
if _lock and (await _lock.locked()) and (await _lock.owned()):
try:
await _lock.release()
@@ -168,10 +203,17 @@ class IntegrationCredentialsManager:
async def update(self, user_id: str, updated: Credentials) -> None:
async with self._locked(user_id, updated.id):
await self.store.update_creds(user_id, updated)
# Bust the copilot token cache so the updated credential is picked up immediately.
_bust_copilot_cache(user_id, updated.provider)
async def delete(self, user_id: str, credentials_id: str) -> None:
async with self._locked(user_id, credentials_id):
# Read inside the lock to avoid TOCTOU — another coroutine could
# delete the same credential between the read and the delete.
creds = await self.store.get_creds_by_id(user_id, credentials_id)
await self.store.delete_creds_by_id(user_id, credentials_id)
if creds:
_bust_copilot_cache(user_id, creds.provider)
# -- Locking utilities -- #

View File

@@ -112,11 +112,6 @@ CATEGORY_FILE_MAP = {
}
_BRAND_NAMES: dict[str, str] = {
"AutoPilot": "AutoPilot",
}
def class_name_to_display_name(class_name: str) -> str:
"""Convert BlockClassName to 'Block Class Name'."""
# Remove 'Block' suffix (only at the end, not all occurrences)
@@ -125,13 +120,7 @@ def class_name_to_display_name(class_name: str) -> str:
name = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
# Handle consecutive capitals (e.g., 'HTTPRequest' -> 'HTTP Request')
name = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", name)
name = name.strip()
# Restore brand names that shouldn't be split
for split_form, brand in _BRAND_NAMES.items():
# Build the split version (e.g., "AutoPilot" -> "Auto Pilot")
split = re.sub(r"([a-z])([A-Z])", r"\1 \2", split_form)
name = name.replace(split, brand)
return name
return name.strip()
def type_to_readable(type_schema: dict[str, Any] | Any) -> str:

View File

@@ -3,6 +3,7 @@ import { ErrorCard } from "@/components/molecules/ErrorCard/ErrorCard";
import { ExclamationMarkIcon } from "@phosphor-icons/react";
import { ToolUIPart, UIDataTypes, UIMessage, UITools } from "ai";
import { useState } from "react";
import { ConnectIntegrationTool } from "../../../tools/ConnectIntegrationTool/ConnectIntegrationTool";
import { CreateAgentTool } from "../../../tools/CreateAgent/CreateAgent";
import { EditAgentTool } from "../../../tools/EditAgent/EditAgent";
import {
@@ -129,6 +130,8 @@ export function MessagePartRenderer({ part, messageID, partIndex }: Props) {
case "tool-search_docs":
case "tool-get_doc_page":
return <SearchDocsTool key={key} part={part as ToolUIPart} />;
case "tool-connect_integration":
return <ConnectIntegrationTool key={key} part={part as ToolUIPart} />;
case "tool-run_block":
case "tool-continue_run_block":
return <RunBlockTool key={key} part={part as ToolUIPart} />;

View File

@@ -0,0 +1,104 @@
"use client";
import type { SetupRequirementsResponse } from "@/app/api/__generated__/models/setupRequirementsResponse";
import type { ToolUIPart } from "ai";
import { useState } from "react";
import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
import { ContentMessage } from "../../components/ToolAccordion/AccordionContent";
import { SetupRequirementsCard } from "../RunBlock/components/SetupRequirementsCard/SetupRequirementsCard";
type Props = {
part: ToolUIPart;
};
function parseJson(raw: unknown): unknown {
if (typeof raw === "string") {
try {
return JSON.parse(raw);
} catch {
return null;
}
}
return raw;
}
function parseOutput(raw: unknown): SetupRequirementsResponse | null {
const parsed = parseJson(raw);
if (parsed && typeof parsed === "object" && "setup_info" in parsed) {
return parsed as SetupRequirementsResponse;
}
return null;
}
function parseError(raw: unknown): string | null {
const parsed = parseJson(raw);
if (parsed && typeof parsed === "object" && "message" in parsed) {
return String((parsed as { message: unknown }).message);
}
return null;
}
export function ConnectIntegrationTool({ part }: Props) {
// Persist dismissed state here so SetupRequirementsCard remounts don't re-enable Proceed.
const [isDismissed, setIsDismissed] = useState(false);
const isStreaming =
part.state === "input-streaming" || part.state === "input-available";
const isError = part.state === "output-error";
const output =
part.state === "output-available"
? parseOutput((part as { output?: unknown }).output)
: null;
const errorMessage = isError
? (parseError((part as { output?: unknown }).output) ??
"Failed to connect integration")
: null;
const rawProvider =
(part as { input?: { provider?: string } }).input?.provider ?? "";
const providerName =
output?.setup_info?.agent_name ??
// Sanitize LLM-controlled provider slug: trim and cap at 64 chars to
// prevent runaway text in the DOM.
(rawProvider ? rawProvider.trim().slice(0, 64) : "integration");
const label = isStreaming
? `Connecting ${providerName}`
: isError
? `Failed to connect ${providerName}`
: output
? `Connect ${output.setup_info?.agent_name ?? providerName}`
: `Connect ${providerName}`;
return (
<div className="py-2">
<div className="flex items-center gap-2 text-sm text-muted-foreground">
<MorphingTextAnimation
text={label}
className={isError ? "text-red-500" : undefined}
/>
</div>
{isError && errorMessage && (
<p className="mt-1 text-sm text-red-500">{errorMessage}</p>
)}
{output && (
<div className="mt-2">
{isDismissed ? (
<ContentMessage>Connected. Continuing</ContentMessage>
) : (
<SetupRequirementsCard
output={output}
credentialsLabel={`${output.setup_info?.agent_name ?? providerName} credentials`}
retryInstruction="I've connected my account. Please continue."
onComplete={() => setIsDismissed(true)}
/>
)}
</div>
)}
</div>
);
}

View File

@@ -23,12 +23,16 @@ interface Props {
/** Override the label shown above the credentials section.
* Defaults to "Credentials". */
credentialsLabel?: string;
/** Called after Proceed is clicked so the parent can persist the dismissed state
* across remounts (avoids re-enabling the Proceed button on remount). */
onComplete?: () => void;
}
export function SetupRequirementsCard({
output,
retryInstruction,
credentialsLabel,
onComplete,
}: Props) {
const { onSend } = useCopilotChatActions();
@@ -68,13 +72,17 @@ export function SetupRequirementsCard({
return v !== undefined && v !== null && v !== "";
});
if (hasSent) {
return <ContentMessage>Connected. Continuing</ContentMessage>;
}
const canRun =
!hasSent &&
(!needsCredentials || isAllCredentialsComplete) &&
(!needsInputs || isAllInputsComplete);
function handleRun() {
setHasSent(true);
onComplete?.();
const parts: string[] = [];
if (needsCredentials) {

View File

@@ -125,9 +125,9 @@ export function useCredentialsInput({
if (hasAttemptedAutoSelect.current) return;
hasAttemptedAutoSelect.current = true;
// Auto-select if exactly one credential matches.
// For optional fields with multiple options, let the user choose.
if (isOptional && savedCreds.length > 1) return;
// Auto-select only when there is exactly one saved credential.
// With multiple options the user must choose — regardless of optional/required.
if (savedCreds.length > 1) return;
const cred = savedCreds[0];
onSelectCredential({

View File

@@ -579,7 +579,6 @@ Below is a comprehensive list of all available blocks, categorized by their prim
| Block Name | Description |
|------------|-------------|
| [Agent Executor](block-integrations/misc.md#agent-executor) | Executes an existing agent inside your agent |
| [AutoPilot](block-integrations/misc.md#autopilot) | Execute tasks using AutoGPT AutoPilot with full access to platform tools (agent management, workspace files, web fetch, block execution, and more) |
## CRM Services

View File

@@ -38,43 +38,6 @@ Input and output schemas define the expected data structure for communication be
---
## AutoPilot
### What it is
Execute tasks using AutoGPT AutoPilot with full access to platform tools (agent management, workspace files, web fetch, block execution, and more). Enables sub-agent patterns and scheduled autopilot execution.
### How it works
<!-- MANUAL: how_it_works -->
This block invokes the platform's copilot system directly via `stream_chat_completion_sdk`. It creates (or resumes) a chat session, streams the autopilot's response collecting text deltas, tool call details, and token usage, then returns the aggregated results. A recursion depth guard prevents infinite loops when the autopilot calls this block as a sub-agent.
<!-- END MANUAL -->
### Inputs
| Input | Description | Type | Required |
|-------|-------------|------|----------|
| prompt | The task or instruction for the autopilot to execute. The autopilot has access to platform tools like agent management, workspace files, web fetch, block execution, and more. | str | Yes |
| system_context | Optional additional context prepended to the prompt. Use this to constrain autopilot behavior, provide domain context, or set output format requirements. | str | No |
| session_id | Session ID to continue an existing autopilot conversation. Leave empty to start a new session. Use the session_id output from a previous run to continue. | str | No |
| max_recursion_depth | Maximum nesting depth when the autopilot calls this block recursively (sub-agent pattern). Prevents infinite loops. | int | No |
### Outputs
| Output | Description | Type |
|--------|-------------|------|
| error | Error message if execution failed. | str |
| response | The final text response from the autopilot. | str |
| tool_calls | List of tools called during execution. Each entry has tool_call_id, tool_name, input, output, and success fields. | List[ToolCallEntry] |
| conversation_history | Full conversation history as JSON. It can be used for logging or analysis. | str |
| session_id | Session ID for this conversation. Pass this back to continue the conversation in a future run. | str |
| token_usage | Token usage statistics: prompt_tokens, completion_tokens, total_tokens. | TokenUsage |
### Possible use case
<!-- MANUAL: use_case -->
Schedule an autopilot to run daily that checks workspace files, summarizes recent agent activity, and posts a report. Or chain autopilot blocks where one gathers data and another analyzes it, enabling multi-step AI workflows within the graph editor.
<!-- END MANUAL -->
---
## Create Reddit Post
### What it is