Compare commits

...

20 Commits

Author SHA1 Message Date
Zamil Majdy
d5ddf7013e Merge remote-tracking branch 'origin/dev' into feat/autogpt-copilot-block 2026-03-17 06:17:13 +07:00
Zamil Majdy
0b55e55cda fix(platform): use typing_extensions.TypedDict, snake_case attrs, and handle CancelledError
- Use typing_extensions.TypedDict instead of typing.TypedDict for Python <3.12 compat
- Rename TypedDict attributes to snake_case (prompt_tokens, tool_call_id, etc.)
- Add explicit asyncio.CancelledError handler to prevent orphaned sessions
- Regenerate block docs
2026-03-17 04:50:33 +07:00
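The CancelledError handling described in the last bullet can be sketched as a standalone pattern (names here are illustrative stand-ins, not the block's actual API): an async generator that catches the cancellation, surfaces the session id to its consumer, then re-raises so cancellation still propagates.

```python
import asyncio

async def run_block(sid: str = "session-1"):
    # Minimal sketch of the cancellation pattern; names are illustrative.
    try:
        await asyncio.sleep(60)  # stand-in for the long-running copilot stream
        yield "response", "done"
    except asyncio.CancelledError:
        # Surface the session id before propagating cancellation, so the
        # caller can resume the conversation instead of orphaning the session.
        yield "session_id", sid
        yield "error", "execution was cancelled"
        raise  # re-raise so the cancellation is not swallowed
```

The bare `raise` matters: swallowing `CancelledError` would make the task appear to finish normally after a cancel request.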
Zamil Majdy
23764b9eb5 fix(platform): type tool_calls output with ToolCallEntry TypedDict
Address PR review: replace generic `list[dict]` with properly typed
`list[ToolCallEntry]` for the tool_calls output field, matching the
described structure (toolCallId, toolName, input, output, success).
2026-03-17 04:03:56 +07:00
Zamil Majdy
69231dc627 fix(platform): type token_usage output with TypedDict instead of bare dict
Address PR review: use TokenUsage TypedDict with promptTokens,
completionTokens, and totalTokens fields for better type safety.
2026-03-17 02:18:58 +07:00
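As a minimal illustration of why a TypedDict beats a bare dict here (stdlib `typing.TypedDict` shown for brevity; the PR uses `typing_extensions` for pre-3.12 compatibility, and `merge_usage` is a hypothetical helper, not part of the block):

```python
from typing import TypedDict

class TokenUsage(TypedDict):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

def merge_usage(a: TokenUsage, b: TokenUsage) -> TokenUsage:
    # Type checkers now flag typos like a["promptTokens"]; with a bare
    # dict the same mistake only surfaces as a runtime KeyError.
    return TokenUsage(
        prompt_tokens=a["prompt_tokens"] + b["prompt_tokens"],
        completion_tokens=a["completion_tokens"] + b["completion_tokens"],
        total_tokens=a["total_tokens"] + b["total_tokens"],
    )
```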
Zamil Majdy
3e459c1235 fix(backend): remove duplicate session creation from execute_copilot
Session creation was moved to run() in the previous commit. Clean up
the now-redundant create_chat_session import and logic in execute_copilot.
2026-03-17 00:37:08 +07:00
Zamil Majdy
040d1e851c fix(backend): always yield session_id even on stream failure
Move session creation from execute_copilot into run() so the
session_id is yielded to the user even if the downstream stream
raises an exception, preventing orphaned sessions.
2026-03-17 00:32:22 +07:00
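The "yield session_id even on stream failure" pattern can be sketched independently of the platform (all names below are stand-ins):

```python
import asyncio

async def _fake_stream():
    yield "partial "
    raise ConnectionError("stream failed")  # simulated downstream failure

async def run_block(session_id: str = ""):
    # Create the session before the risky streaming work so its id
    # always reaches the caller, even when the stream blows up.
    sid = session_id or "new-session-id"  # stand-in for create_chat_session()
    try:
        parts = []
        async for delta in _fake_stream():
            parts.append(delta)
        yield "response", "".join(parts)
        yield "session_id", sid
    except Exception as e:
        yield "session_id", sid  # the id survives the failure
        yield "error", str(e)
```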
Zamil Majdy
cbc16e43a6 docs: regenerate block docs overview table 2026-03-17 00:27:58 +07:00
Zamil Majdy
f6fada8e0a docs: add Auto Pilot block to AI and Language Models table
The AutoPilotBlock is categorized under both AI and AGENT categories,
but the README only listed it under Agent Integration. Add it to the
AI and Language Models table as well to match runtime categorization.
2026-03-17 00:22:26 +07:00
Zamil Majdy
ee9cec12aa refactor(platform): rename AutogptCopilotBlock to AutoPilotBlock
Rename class, file, and all user-facing strings from "Copilot" to
"AutoPilot" per reviewer feedback. Internal method names (execute_copilot)
kept as-is since they invoke the copilot system.
2026-03-17 00:14:14 +07:00
Zamil Majdy
3b70f61b17 fix(platform): align copilot block with normal copilot flow
- Don't pass session object to stream_chat_completion_sdk; let the SDK
  load it internally so all session management (lock, persist,
  transcript, E2B) is handled by its own finally block. This fixes the
  "stream already active" error when continuing via chat UI.
- Create session only when needed, pass just session_id downstream —
  same pattern as the executor processor and chat API route.
- Add sub-copilot delegation guidance to shared tool notes so the
  copilot knows about the AutogptCopilotBlock for recursive patterns.
- Extract recursion helpers as module-level functions.
2026-03-16 23:53:34 +07:00
Zamil Majdy
180578bfe4 fix(platform): use future annotations instead of noqa suppression 2026-03-16 23:32:15 +07:00
Zamil Majdy
8f73813565 refactor(platform): simplify copilot block, drop asyncio.timeout
- Remove asyncio.timeout wrapper: the SDK's internal stream must not
  be cancelled mid-flight (corrupts anyio memory stream, see
  sdk/service.py L998-1001). Matches normal copilot behavior.
- Remove timeout input field (no longer needed).
- Extract _check_recursion, _reset_recursion, _get_or_create_session
  as static helpers to reduce indentation and improve readability.
- Use ge=1 on max_recursion_depth SchemaField for schema-level
  validation instead of runtime check.
- Drop explicit TimeoutError catch (no longer raised).
2026-03-16 23:27:57 +07:00
Zamil Majdy
348e9f8e27 fix(platform): enforce chain-wide recursion limit for copilot block
The recursion guard now stores the effective limit in a ContextVar so
nested calls cannot raise the cap set by the original caller. This
prevents a recursive copilot call from bypassing the intended depth
limit by passing a higher max_recursion_depth value.
2026-03-16 23:23:43 +07:00
Zamil Majdy
436aab7edc fix(platform): add bounds validation and fix doc sentence fragment
Add validation for timeout (>0) and max_recursion_depth (>=1) in the
run method to fail early with clear error messages. Fix sentence
fragment in conversation_history description.
2026-03-16 22:27:45 +07:00
Zamil Majdy
abacc25e58 fix(platform): make prompt required in AutogptCopilotBlock schema
Removes default="" so schema/UI correctly marks prompt as required,
matching the runtime validation that rejects empty prompts.
2026-03-16 22:14:55 +07:00
Zamil Majdy
35bb208b9e docs: regenerate integrations README overview table 2026-03-16 22:10:16 +07:00
Zamil Majdy
e2fefcf550 docs: fill manual sections for AutogptCopilotBlock in misc.md 2026-03-16 22:07:38 +07:00
Zamil Majdy
da94d4b28e docs: regenerate block docs to include AutogptCopilotBlock 2026-03-16 22:03:28 +07:00
Zamil Majdy
4e8144e7b7 fix(platform): address PR review comments on AutogptCopilotBlock
- Validate user_id is present before running copilot
- Catch TimeoutError explicitly with user-friendly message
- Use list+join for response text accumulation
- Use dict lookup for tool call output matching
- Use exclude_none=True in conversation history serialization
2026-03-16 22:00:14 +07:00
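Two of these review fixes (list+join accumulation, dict-keyed tool-call matching) can be sketched on a simplified event stream (the tuple event format here is invented for illustration):

```python
def collect(events):
    # events: ("text", delta) | ("tool_in", call_id, name) | ("tool_out", call_id, output)
    parts = []        # accumulate deltas in a list and join once at the end;
                      # repeated `s += delta` copies the string every time
    calls = []
    calls_by_id = {}  # O(1) lookup instead of scanning `calls` per output event
    for event in events:
        kind = event[0]
        if kind == "text":
            parts.append(event[1])
        elif kind == "tool_in":
            entry = {"tool_call_id": event[1], "tool_name": event[2], "output": None}
            calls.append(entry)
            calls_by_id[event[1]] = entry
        elif kind == "tool_out":
            if entry := calls_by_id.get(event[1]):
                entry["output"] = event[2]
    return "".join(parts), calls
```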
Zamil Majdy
0a532185d0 feat(platform): add AutogptCopilotBlock for invoking copilot from graphs
Enables sub-agent patterns (copilot calling copilot recursively) and
scheduled copilot execution via the agent executor. The block calls
stream_chat_completion_sdk directly with full platform tool access.
2026-03-16 21:56:17 +07:00
5 changed files with 407 additions and 1 deletion


@@ -0,0 +1,341 @@
from __future__ import annotations

import asyncio
import contextvars
import json
from typing import TYPE_CHECKING

from typing_extensions import TypedDict

from backend.blocks._base import (
    Block,
    BlockCategory,
    BlockOutput,
    BlockSchemaInput,
    BlockSchemaOutput,
)
from backend.data.model import SchemaField

if TYPE_CHECKING:
    from backend.executor.utils import ExecutionContext


class ToolCallEntry(TypedDict):
    tool_call_id: str
    tool_name: str
    input: object
    output: object | None
    success: bool | None


class TokenUsage(TypedDict):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


# Task-scoped recursion depth counter & chain-wide limit.
# contextvars are scoped to the current asyncio task, so concurrent
# graph executions each get independent counters.
_autopilot_recursion_depth: contextvars.ContextVar[int] = contextvars.ContextVar(
    "_autopilot_recursion_depth", default=0
)
_autopilot_recursion_limit: contextvars.ContextVar[int | None] = contextvars.ContextVar(
    "_autopilot_recursion_limit", default=None
)


def _check_recursion(max_depth: int) -> tuple:
    """Check and increment recursion depth. Returns tokens to reset on exit."""
    current = _autopilot_recursion_depth.get()
    inherited = _autopilot_recursion_limit.get()
    limit = max_depth if inherited is None else min(inherited, max_depth)
    if current >= limit:
        raise RuntimeError(
            f"AutoPilot recursion depth limit reached ({limit}). "
            "The autopilot has called itself too many times."
        )
    return (
        _autopilot_recursion_depth.set(current + 1),
        _autopilot_recursion_limit.set(limit),
    )


def _reset_recursion(tokens: tuple) -> None:
    _autopilot_recursion_depth.reset(tokens[0])
    _autopilot_recursion_limit.reset(tokens[1])


class AutoPilotBlock(Block):
    """Execute tasks using AutoGPT AutoPilot with full access to platform tools.

    The autopilot can manage agents, access workspace files, fetch web content,
    run blocks, and more. This block enables sub-agent patterns (autopilot calling
    autopilot) and scheduled autopilot execution via the agent executor.
    """

    class Input(BlockSchemaInput):
        prompt: str = SchemaField(
            description=(
                "The task or instruction for the autopilot to execute. "
                "The autopilot has access to platform tools like agent management, "
                "workspace files, web fetch, block execution, and more."
            ),
            placeholder="Find my agents and list them",
            advanced=False,
        )
        system_context: str = SchemaField(
            description=(
                "Optional additional context prepended to the prompt. "
                "Use this to constrain autopilot behavior, provide domain "
                "context, or set output format requirements."
            ),
            default="",
            advanced=True,
        )
        session_id: str = SchemaField(
            description=(
                "Session ID to continue an existing autopilot conversation. "
                "Leave empty to start a new session. "
                "Use the session_id output from a previous run to continue."
            ),
            default="",
            advanced=True,
        )
        max_recursion_depth: int = SchemaField(
            description=(
                "Maximum nesting depth when the autopilot calls this block "
                "recursively (sub-agent pattern). Prevents infinite loops."
            ),
            default=3,
            ge=1,
            advanced=True,
        )

    class Output(BlockSchemaOutput):
        response: str = SchemaField(
            description="The final text response from the autopilot."
        )
        tool_calls: list[ToolCallEntry] = SchemaField(
            description=(
                "List of tools called during execution. Each entry has "
                "tool_call_id, tool_name, input, output, and success fields."
            ),
        )
        conversation_history: str = SchemaField(
            description=(
                "Full conversation history as JSON. "
                "It can be used for logging or analysis."
            ),
        )
        session_id: str = SchemaField(
            description=(
                "Session ID for this conversation. "
                "Pass this back to continue the conversation in a future run."
            ),
        )
        token_usage: TokenUsage = SchemaField(
            description=(
                "Token usage statistics: prompt_tokens, "
                "completion_tokens, total_tokens."
            ),
        )
        error: str = SchemaField(
            description="Error message if execution failed.",
        )

    def __init__(self):
        super().__init__(
            id="c069dc6b-c3ed-4c12-b6e5-d47361e64ce6",
            description=(
                "Execute tasks using AutoGPT AutoPilot with full access to "
                "platform tools (agent management, workspace files, web fetch, "
                "block execution, and more). Enables sub-agent patterns and "
                "scheduled autopilot execution."
            ),
            categories={BlockCategory.AI, BlockCategory.AGENT},
            input_schema=AutoPilotBlock.Input,
            output_schema=AutoPilotBlock.Output,
            test_input={
                "prompt": "List my agents",
                "system_context": "",
                "session_id": "",
                "max_recursion_depth": 3,
            },
            test_output=[
                ("response", "You have 2 agents: Agent A and Agent B."),
                ("tool_calls", []),
                (
                    "conversation_history",
                    '[{"role": "user", "content": "List my agents"}]',
                ),
                ("session_id", "test-session-id"),
                (
                    "token_usage",
                    {
                        "prompt_tokens": 100,
                        "completion_tokens": 50,
                        "total_tokens": 150,
                    },
                ),
            ],
            test_mock={
                "create_session": lambda *args, **kwargs: "test-session-id",
                "execute_copilot": lambda *args, **kwargs: (
                    "You have 2 agents: Agent A and Agent B.",
                    [],
                    '[{"role": "user", "content": "List my agents"}]',
                    "test-session-id",
                    {
                        "prompt_tokens": 100,
                        "completion_tokens": 50,
                        "total_tokens": 150,
                    },
                ),
            },
        )

    async def create_session(self, user_id: str) -> str:
        """Create a new chat session and return its ID (mockable for tests)."""
        from backend.copilot.model import create_chat_session

        session = await create_chat_session(user_id)
        return session.session_id

    async def execute_copilot(
        self,
        prompt: str,
        system_context: str,
        session_id: str,
        max_recursion_depth: int,
        user_id: str,
    ) -> tuple[str, list[ToolCallEntry], str, str, TokenUsage]:
        """Invoke the copilot and collect all stream results.

        Follows the same path as the normal copilot: create session if needed,
        then let stream_chat_completion_sdk handle everything (session loading,
        message append, lock, transcript, cleanup).
        """
        from backend.copilot.model import get_chat_session
        from backend.copilot.response_model import (
            StreamError,
            StreamTextDelta,
            StreamToolInputAvailable,
            StreamToolOutputAvailable,
            StreamUsage,
        )
        from backend.copilot.sdk.service import stream_chat_completion_sdk

        tokens = _check_recursion(max_recursion_depth)
        try:
            effective_prompt = prompt
            if system_context:
                effective_prompt = f"[System Context: {system_context}]\n\n{prompt}"

            # Consume the stream — same as the executor processor.
            # Do NOT pass a session object; let the SDK load it internally
            # so all session management (lock, persist, transcript) is handled
            # by the SDK's own finally block.
            response_parts: list[str] = []
            tool_calls: list[ToolCallEntry] = []
            tool_calls_by_id: dict[str, ToolCallEntry] = {}
            total_usage: TokenUsage = {
                "prompt_tokens": 0,
                "completion_tokens": 0,
                "total_tokens": 0,
            }
            async for event in stream_chat_completion_sdk(
                session_id=session_id,
                message=effective_prompt,
                is_user_message=True,
                user_id=user_id,
            ):
                if isinstance(event, StreamTextDelta):
                    response_parts.append(event.delta)
                elif isinstance(event, StreamToolInputAvailable):
                    entry: ToolCallEntry = {
                        "tool_call_id": event.toolCallId,
                        "tool_name": event.toolName,
                        "input": event.input,
                        "output": None,
                        "success": None,
                    }
                    tool_calls.append(entry)
                    tool_calls_by_id[event.toolCallId] = entry
                elif isinstance(event, StreamToolOutputAvailable):
                    if tc := tool_calls_by_id.get(event.toolCallId):
                        tc["output"] = event.output
                        tc["success"] = event.success
                elif isinstance(event, StreamUsage):
                    total_usage["prompt_tokens"] += event.promptTokens
                    total_usage["completion_tokens"] += event.completionTokens
                    total_usage["total_tokens"] += event.totalTokens
                elif isinstance(event, StreamError):
                    raise RuntimeError(f"AutoPilot error: {event.errorText}")

            # Session was persisted by the SDK's finally block.
            # Re-fetch for conversation history output.
            updated_session = await get_chat_session(session_id, user_id)
            history_json = "[]"
            if updated_session and updated_session.messages:
                history_json = json.dumps(
                    [m.model_dump(exclude_none=True) for m in updated_session.messages],
                    default=str,
                )
            return (
                "".join(response_parts),
                tool_calls,
                history_json,
                session_id,
                total_usage,
            )
        finally:
            _reset_recursion(tokens)

    async def run(
        self,
        input_data: Input,
        *,
        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        if not input_data.prompt.strip():
            yield "error", "Prompt cannot be empty."
            return
        if not execution_context.user_id:
            yield "error", "Cannot run autopilot without an authenticated user."
            return

        # Create session eagerly so the user always gets the session_id,
        # even if the downstream stream fails (avoids orphaned sessions).
        sid = input_data.session_id
        if not sid:
            sid = await self.create_session(execution_context.user_id)

        try:
            response, tool_calls, history, _, usage = await self.execute_copilot(
                prompt=input_data.prompt,
                system_context=input_data.system_context,
                session_id=sid,
                max_recursion_depth=input_data.max_recursion_depth,
                user_id=execution_context.user_id,
            )
            yield "response", response
            yield "tool_calls", tool_calls
            yield "conversation_history", history
            yield "session_id", sid
            yield "token_usage", usage
        except asyncio.CancelledError:
            yield "session_id", sid
            yield "error", "AutoPilot execution was cancelled."
            raise
        except Exception as e:
            yield "session_id", sid
            yield "error", str(e)


@@ -93,6 +93,22 @@ Example — committing an image file to GitHub:
### Sub-agent tasks
- When using the Task tool, NEVER set `run_in_background` to true.
  All tasks must run in the foreground.
### Delegating to another autopilot (sub-autopilot pattern)
Use the **AutoPilotBlock** (`run_block` with block_id
`c069dc6b-c3ed-4c12-b6e5-d47361e64ce6`) to delegate a task to a fresh
autopilot instance. The sub-autopilot has its own full tool set and can
perform multi-step work autonomously.
- **Input**: `prompt` (required) — the task description.
  Optional: `system_context` to constrain behavior, `session_id` to
  continue a previous conversation, `max_recursion_depth` (default 3).
- **Output**: `response` (text), `tool_calls` (list), `session_id`
  (for continuation), `conversation_history`, `token_usage`.
Use this when a task is complex enough to benefit from a separate
autopilot context, e.g. "research X and write a report" while the
parent autopilot handles orchestration.
"""


@@ -112,6 +112,11 @@ CATEGORY_FILE_MAP = {
 }
 
+_BRAND_NAMES: dict[str, str] = {
+    "AutoPilot": "AutoPilot",
+}
+
+
 def class_name_to_display_name(class_name: str) -> str:
     """Convert BlockClassName to 'Block Class Name'."""
     # Remove 'Block' suffix (only at the end, not all occurrences)
@@ -120,7 +125,13 @@ def class_name_to_display_name(class_name: str) -> str:
     name = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
     # Handle consecutive capitals (e.g., 'HTTPRequest' -> 'HTTP Request')
     name = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", name)
-    return name.strip()
+    name = name.strip()
+    # Restore brand names that shouldn't be split
+    for split_form, brand in _BRAND_NAMES.items():
+        # Build the split version (e.g., "AutoPilot" -> "Auto Pilot")
+        split = re.sub(r"([a-z])([A-Z])", r"\1 \2", split_form)
+        name = name.replace(split, brand)
+    return name
 
 
 def type_to_readable(type_schema: dict[str, Any] | Any) -> str:


@@ -579,6 +579,7 @@ Below is a comprehensive list of all available blocks, categorized by their prim
| Block Name | Description |
|------------|-------------|
| [Agent Executor](block-integrations/misc.md#agent-executor) | Executes an existing agent inside your agent |
| [AutoPilot](block-integrations/misc.md#autopilot) | Execute tasks using AutoGPT AutoPilot with full access to platform tools (agent management, workspace files, web fetch, block execution, and more) |
## CRM Services


@@ -38,6 +38,43 @@ Input and output schemas define the expected data structure for communication be
---
## AutoPilot
### What it is
Execute tasks using AutoGPT AutoPilot with full access to platform tools (agent management, workspace files, web fetch, block execution, and more). Enables sub-agent patterns and scheduled autopilot execution.
### How it works
<!-- MANUAL: how_it_works -->
This block invokes the platform's copilot system directly via `stream_chat_completion_sdk`. It creates (or resumes) a chat session, streams the autopilot's response collecting text deltas, tool call details, and token usage, then returns the aggregated results. A recursion depth guard prevents infinite loops when the autopilot calls this block as a sub-agent.
<!-- END MANUAL -->
### Inputs
| Input | Description | Type | Required |
|-------|-------------|------|----------|
| prompt | The task or instruction for the autopilot to execute. The autopilot has access to platform tools like agent management, workspace files, web fetch, block execution, and more. | str | Yes |
| system_context | Optional additional context prepended to the prompt. Use this to constrain autopilot behavior, provide domain context, or set output format requirements. | str | No |
| session_id | Session ID to continue an existing autopilot conversation. Leave empty to start a new session. Use the session_id output from a previous run to continue. | str | No |
| max_recursion_depth | Maximum nesting depth when the autopilot calls this block recursively (sub-agent pattern). Prevents infinite loops. | int | No |
### Outputs
| Output | Description | Type |
|--------|-------------|------|
| error | Error message if execution failed. | str |
| response | The final text response from the autopilot. | str |
| tool_calls | List of tools called during execution. Each entry has tool_call_id, tool_name, input, output, and success fields. | List[ToolCallEntry] |
| conversation_history | Full conversation history as JSON. It can be used for logging or analysis. | str |
| session_id | Session ID for this conversation. Pass this back to continue the conversation in a future run. | str |
| token_usage | Token usage statistics: prompt_tokens, completion_tokens, total_tokens. | TokenUsage |
### Possible use case
<!-- MANUAL: use_case -->
Schedule an autopilot to run daily that checks workspace files, summarizes recent agent activity, and posts a report. Or chain autopilot blocks where one gathers data and another analyzes it, enabling multi-step AI workflows within the graph editor.
<!-- END MANUAL -->
---
## Create Reddit Post
### What it is