refactor(backend/copilot): move imports to module level

- Move KEY_WORKFLOWS and TOOL_REGISTRY imports to top of file - Better code organization following Python conventions
test(backend/copilot): add tests for auto-generated tool documentation
2026-03-17 03:00:27 -04:00 · 2026-03-06 23:15:39 +07:00 · 2026-03-06 23:15:39 +07:00 · 2026-03-06 23:10:42 +07:00 · 2026-03-06 23:10:42 +07:00 · 2026-03-06 23:10:42 +07:00
4 changed files with 186 additions and 118 deletions
--- a/autogpt_platform/backend/backend/copilot/prompt_constants.py
+++ b/autogpt_platform/backend/backend/copilot/prompt_constants.py
@@ -0,0 +1,29 @@
+"""Prompt constants for CoPilot - workflow guidance and supplementary documentation.
+
+This module contains workflow patterns and guidance that supplement the main system prompt.
+These are appended dynamically to the prompt along with auto-generated tool documentation.
+"""
+
+# Workflow guidance for key tool patterns
+# This is appended after the auto-generated tool list to provide usage patterns
+KEY_WORKFLOWS = """
+
+## KEY WORKFLOWS
+
+### MCP Integration Workflow
+When using `run_mcp_tool`:
+1. **Known servers** (use directly): Notion (https://mcp.notion.com/mcp), Linear (https://mcp.linear.app/mcp), Stripe (https://mcp.stripe.com), Intercom (https://mcp.intercom.com/mcp), Cloudflare (https://mcp.cloudflare.com/mcp), Atlassian (https://mcp.atlassian.com/mcp)
+2. **Unknown servers**: Use `web_search("{{service}} MCP server URL")` to find the endpoint
+3. **Discovery**: Call `run_mcp_tool(server_url)` to see available tools
+4. **Execution**: Call `run_mcp_tool(server_url, tool_name, tool_arguments)`
+5. **Authentication**: If credentials needed, user will be prompted. When they confirm, retry immediately with same arguments.
+
+### Agent Creation Workflow
+When using `create_agent`:
+1. Always check `find_library_agent` first for existing solutions
+2. Call `create_agent` with description
+3. **If `suggested_goal` returned**: Present to user, ask for confirmation, call again with suggested goal if accepted
+4. **If `clarifying_questions` returned**: After user answers, call again with original description AND answers in `context` parameter
+
+### Folder Management
+Use folder tools (`create_folder`, `list_folders`, `move_agents_to_folder`) to organize agents in the user's library for better discoverability."""
--- a/autogpt_platform/backend/backend/copilot/sdk/service.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/service.py
@@ -44,6 +44,7 @@ from ..model import (
    update_session_title,
    upsert_chat_session,
 )
+from ..prompt_constants import KEY_WORKFLOWS
 from ..response_model import (
    StreamBaseResponse,
    StreamError,
@@ -59,6 +60,7 @@ from ..service import (
    _generate_session_title,
    _is_langfuse_configured,
 )
+from ..tools import TOOL_REGISTRY
 from ..tools.e2b_sandbox import get_or_create_sandbox
 from ..tools.sandbox import WORKSPACE_PREFIX, make_session_path
 from ..tools.workspace_files import get_manager
@@ -149,8 +151,37 @@ _HEARTBEAT_INTERVAL = 10.0  # seconds
 # Appended to the system prompt to inform the agent about available tools.
 # The SDK built-in Bash is NOT available — use mcp__copilot__bash_exec instead,
 # which has kernel-level network isolation (unshare --net).
+def _generate_tool_documentation() -> str:
+    """Auto-generate tool documentation from TOOL_REGISTRY.
+
+    This generates a complete list of available tools with their descriptions,
+    ensuring the documentation stays in sync with the actual tool implementations.
+    """
+    docs = "\n## AVAILABLE TOOLS\n\n"
+
+    # Sort tools alphabetically for consistent output
+    for name in sorted(TOOL_REGISTRY.keys()):
+        tool = TOOL_REGISTRY[name]
+        schema = tool.as_openai_tool()
+        desc = schema["function"].get("description", "No description available")
+        # Format as bullet list with tool name in code style
+        docs += f"- **`{name}`**: {desc}\n"
+
+    # Add workflow guidance for key tools
+    docs += KEY_WORKFLOWS
+
+    return docs
+
+
 _SHARED_TOOL_NOTES = """\

+### Web search and research
+- **`web_search(query)`** — Search the web for current information (uses Claude's
+  native web search). Use this when you need up-to-date information, facts,
+  statistics, or current events that are beyond your knowledge cutoff.
+- **`web_fetch(url)`** — Retrieve and analyze content from a specific URL.
+  Use this when you have a specific URL to read (documentation, articles, etc.).
+
 ### Sharing files with the user
 After saving a file to the persistent workspace with `write_workspace_file`,
 share it with the user by embedding the `download_url` from the response in
@@ -965,10 +996,16 @@ async def stream_chat_completion_sdk(
        )

        use_e2b = e2b_sandbox is not None
-        system_prompt = base_system_prompt + (
-            _E2B_TOOL_SUPPLEMENT
-            if use_e2b
-            else _LOCAL_TOOL_SUPPLEMENT.format(cwd=sdk_cwd)
+        # Generate tool documentation and append appropriate supplement
+        tool_docs = _generate_tool_documentation()
+        system_prompt = (
+            base_system_prompt
+            + tool_docs
+            + (
+                _E2B_TOOL_SUPPLEMENT
+                if use_e2b
+                else _LOCAL_TOOL_SUPPLEMENT.format(cwd=sdk_cwd)
+            )
        )

        # Process transcript download result
@@ -1355,17 +1392,28 @@ async def stream_chat_completion_sdk(
                                has_appended_assistant = True

                        elif isinstance(response, StreamToolOutputAvailable):
+                            tool_result_content = (
+                                response.output
+                                if isinstance(response.output, str)
+                                else str(response.output)
+                            )
                            session.messages.append(
                                ChatMessage(
                                    role="tool",
-                                    content=(
-                                        response.output
-                                        if isinstance(response.output, str)
-                                        else str(response.output)
-                                    ),
+                                    content=tool_result_content,
                                    tool_call_id=response.toolCallId,
                                )
                            )
+                            # Capture tool result in transcript as user message with tool_result content
+                            transcript_builder.add_user_message(
+                                content=[
+                                    {
+                                        "type": "tool_result",
+                                        "tool_use_id": response.toolCallId,
+                                        "content": tool_result_content,
+                                    }
+                                ]
+                            )
                            has_tool_results = True

                        elif isinstance(response, StreamFinish):
--- a/autogpt_platform/backend/backend/copilot/sdk/service_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/service_test.py
@@ -7,7 +7,7 @@ from unittest.mock import AsyncMock, patch

 import pytest

-from .service import _prepare_file_attachments
+from .service import _generate_tool_documentation, _prepare_file_attachments


@dataclass
@@ -145,3 +145,94 @@ class TestPrepareFileAttachments:

        assert "Read tool" not in result.hint
        assert len(result.image_blocks) == 1
+
+
+class TestGenerateToolDocumentation:
+    """Tests for auto-generated tool documentation from TOOL_REGISTRY."""
+
+    def test_generate_tool_documentation_structure(self):
+        """Test that tool documentation has expected structure."""
+        docs = _generate_tool_documentation()
+
+        # Check main sections exist
+        assert "## AVAILABLE TOOLS" in docs
+        assert "## KEY WORKFLOWS" in docs
+
+        # Verify no duplicate sections
+        assert docs.count("## AVAILABLE TOOLS") == 1
+        assert docs.count("## KEY WORKFLOWS") == 1
+
+    def test_tool_documentation_includes_key_tools(self):
+        """Test that documentation includes essential copilot tools."""
+        docs = _generate_tool_documentation()
+
+        # Core agent workflow tools
+        assert "`create_agent`" in docs
+        assert "`run_agent`" in docs
+        assert "`find_library_agent`" in docs
+        assert "`edit_agent`" in docs
+
+        # MCP integration
+        assert "`run_mcp_tool`" in docs
+
+        # Browser automation
+        assert "`browser_navigate`" in docs
+
+        # Folder management
+        assert "`create_folder`" in docs
+
+    def test_tool_documentation_format(self):
+        """Test that each tool follows bullet list format."""
+        docs = _generate_tool_documentation()
+
+        lines = docs.split("\n")
+        tool_lines = [line for line in lines if line.strip().startswith("- **`")]
+
+        # Should have multiple tools (at least 20 from TOOL_REGISTRY)
+        assert len(tool_lines) >= 20
+
+        # Each tool line should have proper markdown format
+        for line in tool_lines:
+            assert line.startswith("- **`"), f"Bad format: {line}"
+            assert "`**:" in line, f"Missing description separator: {line}"
+
+    def test_tool_documentation_includes_workflows(self):
+        """Test that key workflow patterns are documented."""
+        docs = _generate_tool_documentation()
+
+        # Check workflow sections
+        assert "MCP Integration Workflow" in docs
+        assert "Agent Creation Workflow" in docs
+        assert "Folder Management" in docs
+
+        # Check workflow details
+        assert "suggested_goal" in docs  # Agent creation feedback loop
+        assert "clarifying_questions" in docs  # Agent creation feedback loop
+        assert "run_mcp_tool(server_url)" in docs  # MCP discovery pattern
+
+    def test_tool_documentation_completeness(self):
+        """Test that all tools from TOOL_REGISTRY appear in documentation."""
+        from backend.copilot.tools import TOOL_REGISTRY
+
+        docs = _generate_tool_documentation()
+
+        # Verify each registered tool is documented
+        for tool_name in TOOL_REGISTRY.keys():
+            assert (
+                f"`{tool_name}`" in docs
+            ), f"Tool '{tool_name}' missing from auto-generated documentation"
+
+    def test_tool_documentation_no_duplicate_tools(self):
+        """Test that no tool appears multiple times in the list."""
+        from backend.copilot.tools import TOOL_REGISTRY
+
+        docs = _generate_tool_documentation()
+
+        # Extract the tools section (before KEY WORKFLOWS)
+        tools_section = docs.split("## KEY WORKFLOWS")[0]
+
+        # Count occurrences of each tool
+        for tool_name in TOOL_REGISTRY.keys():
+            # Count how many times this tool appears as a bullet point
+            count = tools_section.count(f"- **`{tool_name}`**")
+            assert count == 1, f"Tool '{tool_name}' appears {count} times (should be 1)"
--- a/autogpt_platform/backend/backend/copilot/service.py
+++ b/autogpt_platform/backend/backend/copilot/service.py
@@ -34,8 +34,9 @@ client = LangfuseAsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
 langfuse = get_client()

 # Default system prompt used when Langfuse is not configured
-# This is a snapshot of the "CoPilot Prompt" from Langfuse (version 11)
-DEFAULT_SYSTEM_PROMPT = """You are **Otto**, an AI Co-Pilot for AutoGPT and a Forward-Deployed Automation Engineer serving small business owners. Your mission is to help users automate business tasks with AI by delivering tangible value through working automations—not through documentation or lengthy explanations.
+# Provides minimal baseline tone and personality - all workflow, tools, and
+# technical details are provided via the supplement.
+DEFAULT_SYSTEM_PROMPT = """You are an AI automation assistant helping users build and run automations.

 Here is everything you know about the current user from previous interactions:

@@ -43,113 +44,12 @@ Here is everything you know about the current user from previous interactions:
 {users_information}
 </users_information>

-## YOUR CORE MANDATE
+Your goal is to help users automate tasks by:
+- Understanding their needs and business context
+- Building and running working automations
+- Delivering tangible value through action, not just explanation

-You are action-oriented. Your success is measured by:
- **Value Delivery**: Does the user think "wow, that was amazing" or "what was the point"?
- **Demonstrable Proof**: Show working automations, not descriptions of what's possible
- **Time Saved**: Focus on tangible efficiency gains
- **Quality Output**: Deliver results that meet or exceed expectations
-
-## YOUR WORKFLOW
-
-Adapt flexibly to the conversation context. Not every interaction requires all stages:
-
-1. **Explore & Understand**: Learn about the user's business, tasks, and goals. Use `add_understanding` to capture important context that will improve future conversations.
-
-2. **Assess Automation Potential**: Help the user understand whether and how AI can automate their task.
-
-3. **Prepare for AI**: Provide brief, actionable guidance on prerequisites (data, access, etc.).
-
-4. **Discover or Create Agents**:
-   - **Always check the user's library first** with `find_library_agent` (these may be customized to their needs)
-   - Search the marketplace with `find_agent` for pre-built automations
-   - Find reusable components with `find_block`
-   - **For live integrations** (read a GitHub repo, query a database, post to Slack, etc.) consider `run_mcp_tool` — it connects directly to external services without building a full agent
-   - Create custom solutions with `create_agent` if nothing suitable exists
-   - Modify existing library agents with `edit_agent`
-   - **When `create_agent` returns `suggested_goal`**: Present the suggestion to the user and ask "Would you like me to proceed with this refined goal?" If they accept, call `create_agent` again with the suggested goal.
-   - **When `create_agent` returns `clarifying_questions`**: After the user answers, call `create_agent` again with the original description AND the answers in the `context` parameter.
-
-5. **Execute**: Run automations immediately, schedule them, or set up webhooks using `run_agent`. Test specific components with `run_block`.
-
-6. **Show Results**: Display outputs using `agent_output`.
-
-## AVAILABLE TOOLS
-
-**Understanding & Discovery:**
- `add_understanding`: Create a memory about the user's business or use cases for future sessions
- `search_docs`: Search platform documentation for specific technical information
- `get_doc_page`: Retrieve full text of a specific documentation page
-
-**Agent Discovery:**
- `find_library_agent`: Search the user's existing agents (CHECK HERE FIRST—these may be customized)
- `find_agent`: Search the marketplace for pre-built automations
- `find_block`: Find pre-written code units that perform specific tasks (agents are built from blocks)
-
-**Agent Creation & Editing:**
- `create_agent`: Create a new automation agent
- `edit_agent`: Modify an agent in the user's library
-
-**Execution & Output:**
- `run_agent`: Run an agent now, schedule it, or set up a webhook trigger
- `run_block`: Test or run a specific block independently
- `agent_output`: View results from previous agent runs
-
-**MCP (Model Context Protocol) Servers:**
- `run_mcp_tool`: Connect to any MCP server to discover and run its tools
-
-  **Two-step flow:**
-  1. `run_mcp_tool(server_url)` → returns a list of available tools. Each tool has `name`, `description`, and `input_schema` (JSON Schema). Read `input_schema.properties` to understand what arguments are needed.
-  2. `run_mcp_tool(server_url, tool_name, tool_arguments)` → executes the tool. Build `tool_arguments` as a flat `{{key: value}}` object matching the tool's `input_schema.properties`.
-
-  **Authentication:** If the MCP server requires credentials, the UI will show an OAuth connect button. Once the user connects and clicks Proceed, they will automatically send you a message confirming credentials are ready (e.g. "I've connected the MCP server credentials. Please retry run_mcp_tool..."). When you receive that confirmation, **immediately** call `run_mcp_tool` again with the exact same `server_url` — and the same `tool_name`/`tool_arguments` if you were already mid-execution. Do not ask the user what to do next; just retry.
-
-  **Finding server URLs (fastest → slowest):**
-  1. **Known hosted servers** — use directly, no lookup:
-     - Notion: `https://mcp.notion.com/mcp`
-     - Linear: `https://mcp.linear.app/mcp`
-     - Stripe: `https://mcp.stripe.com`
-     - Intercom: `https://mcp.intercom.com/mcp`
-     - Cloudflare: `https://mcp.cloudflare.com/mcp`
-     - Atlassian (Jira/Confluence): `https://mcp.atlassian.com/mcp`
-  2. **`web_search`** — use `web_search("{{service}} MCP server URL")` for any service not in the list above. This is the fastest way to find unlisted servers.
-  3. **Registry API** — `web_fetch("https://registry.modelcontextprotocol.io/v0.1/servers?search={{query}}&limit=10")` to browse what's available. Returns names + GitHub repo URLs but NOT the endpoint URL; follow up with `web_search` to find the actual endpoint.
-  - **Never** `web_fetch` the registry homepage — it is JavaScript-rendered and returns a blank page.
-
-  **When to use:** Use `run_mcp_tool` when the user wants to interact with an external service (GitHub, Slack, a database, a SaaS tool, etc.) via its MCP integration. Unlike `web_fetch` (which just retrieves a raw URL), MCP servers expose structured typed tools — prefer `run_mcp_tool` for any service with an MCP server, and `web_fetch` only for plain URL retrieval with no MCP server involved.
-
-  **CRITICAL**: `run_mcp_tool` is **always available** in your tool list. If the user explicitly provides an MCP server URL or asks you to call `run_mcp_tool`, you MUST use it — never claim it is unavailable, and never substitute `web_fetch` for an explicit MCP request.
-
-## BEHAVIORAL GUIDELINES
-
-**Be Concise:**
- Target 2-5 short lines maximum
- Make every word count—no repetition or filler
- Use lightweight structure for scannability (bullets, numbered lists, short prompts)
- Avoid jargon (blocks, slugs, cron) unless the user asks
-
-**Be Proactive:**
- Suggest next steps before being asked
- Anticipate needs based on conversation context and user information
- Look for opportunities to expand scope when relevant
- Reveal capabilities through action, not explanation
-
-**Use Tools Effectively:**
- Select the right tool for each task
- **Always check `find_library_agent` before searching the marketplace**
- Use `add_understanding` to capture valuable business context
- When tool calls fail, try alternative approaches
- **For MCP integrations**: Known URL (see list) or `web_search("{{service}} MCP server URL")` → `run_mcp_tool(server_url)` → `run_mcp_tool(server_url, tool_name, tool_arguments)`. If credentials needed, UI prompts automatically; when user confirms, retry immediately with same arguments.
-
-**Handle Feedback Loops:**
- When a tool returns a suggested alternative (like a refined goal), present it clearly and ask the user for confirmation before proceeding
- When clarifying questions are answered, immediately re-call the tool with the accumulated context
- Don't ask redundant questions if the user has already provided context in the conversation
-
-## CRITICAL REMINDER
-
-You are NOT a chatbot. You are NOT documentation. You are a partner who helps busy business owners get value quickly by showing proof through working automations. Bias toward action over explanation."""
+Be concise, proactive, and action-oriented. Bias toward showing working solutions over lengthy explanations."""


 # ---------------------------------------------------------------------------
Author	SHA1	Message	Date
Zamil Majdy	3d880cd591	refactor(backend/copilot): move imports to module level - Move KEY_WORKFLOWS and TOOL_REGISTRY imports to top of file - Better code organization following Python conventions	2026-03-06 23:15:39 +07:00
Zamil Majdy	73f5ff9983	test(backend/copilot): add tests for auto-generated tool documentation - Test tool documentation structure (sections, format) - Test that all TOOL_REGISTRY tools are included - Test workflow sections are present - Test no duplicate tools - Verify markdown formatting compliance - All 6 tests passing	2026-03-06 23:15:39 +07:00
Zamil Majdy	6d9faf5f91	refactor(backend/copilot): auto-generate tool docs in supplement, simplify default prompt - Add _generate_tool_documentation() to auto-generate tool list from TOOL_REGISTRY - Extract KEY_WORKFLOWS constant to prompt_constants.py for maintainability - Append auto-generated tool docs + workflow guidance to system prompt supplement - Simplify DEFAULT_SYSTEM_PROMPT to minimal tone/style baseline (Langfuse handles details) - Add KEY WORKFLOWS section covering MCP integration, agent creation, folder management - Ensures tool documentation stays in sync with actual implementations - Fix Pyright error by safely accessing description field with .get()	2026-03-06 23:10:42 +07:00
Zamil Majdy	7774717104	docs(backend/copilot): document web_search and web_fetch in tool supplement Add clear documentation for web_search and web_fetch to the shared tool notes that get appended to all system prompts (Langfuse or default). This ensures the copilot knows to use web_search for general web queries instead of incorrectly using find_block to search for web search blocks. - web_search: For current information beyond knowledge cutoff - web_fetch: For retrieving content from specific URLs	2026-03-06 23:10:42 +07:00
Zamil Majdy	89ed628609	fix(backend/copilot): capture tool results in transcript Tool results (StreamToolOutputAvailable) were being added to session.messages but NOT to transcript_builder, causing the transcript to miss tool executions. This made the copilot claim '(no tool used)' when tools were actually called. Now tool results are captured as user messages with tool_result content blocks, matching the Claude API transcript format and ensuring --resume has complete conversation history including all tool interactions.	2026-03-06 23:10:42 +07:00