
Block Creation with SDK

This guide explains how to create new blocks for the AutoGPT Platform using the SDK pattern with advanced features.

Overview

Blocks are reusable components that perform specific tasks in AutoGPT workflows. They can integrate with external services, process data, or perform any programmatic operation.

Basic Structure

1. Create Provider Configuration

First, create a _config.py file to configure your provider using the ProviderBuilder:

from backend.sdk import BlockCostType, ProviderBuilder

# Simple API key provider
my_provider = (
    ProviderBuilder("my_provider")
    .with_api_key("MY_PROVIDER_API_KEY", "My Provider API Key")
    .with_base_cost(1, BlockCostType.RUN)
    .build()
)

For OAuth providers:

from backend.sdk import BlockCostType, ProviderBuilder
from ._oauth import MyProviderOAuthHandler

my_provider = (
    ProviderBuilder("my_provider")
    .with_oauth(
        MyProviderOAuthHandler,
        scopes=["read", "write"],
        client_id_env_var="MY_PROVIDER_CLIENT_ID",
        client_secret_env_var="MY_PROVIDER_CLIENT_SECRET",
    )
    .with_base_cost(1, BlockCostType.RUN)
    .build()
)

2. Create the Block Class

Create your block file (e.g., my_block.py):

import uuid
from backend.sdk import (
    APIKeyCredentials,
    Block,
    BlockCategory,
    BlockOutput,
    BlockSchemaInput,
    BlockSchemaOutput,
    CredentialsMetaInput,
    SchemaField,
)
from ._config import my_provider


class MyBlock(Block):
    class Input(BlockSchemaInput):
        # Define input fields
        credentials: CredentialsMetaInput = my_provider.credentials_field(
            description="API credentials for My Provider"
        )
        query: str = SchemaField(description="The query to process")
        limit: int = SchemaField(
            description="Number of results", 
            default=10,
            ge=1,  # Greater than or equal to 1
            le=100  # Less than or equal to 100
        )
        advanced_option: str = SchemaField(
            description="Advanced setting",
            default="",
            advanced=True  # Hidden by default in UI
        )

    class Output(BlockSchemaOutput):
        # Define output fields
        results: list = SchemaField(description="List of results")
        count: int = SchemaField(description="Total count")
        # error output pin is already defined on BlockSchemaOutput

    def __init__(self):
        super().__init__(
            id="00000000-0000-0000-0000-000000000000",  # Replace with a UUID generated once (uuid.uuid4()); it must stay stable across versions
            description="Brief description of what this block does",
            categories={BlockCategory.SEARCH},  # Choose appropriate categories
            input_schema=self.Input,
            output_schema=self.Output,
        )

    async def run(
        self, 
        input_data: Input, 
        *, 
        credentials: APIKeyCredentials,
        **kwargs
    ) -> BlockOutput:
        try:
            # Your block logic here
            results = await self.process_data(
                input_data.query,
                input_data.limit,
                credentials
            )
            
            # Yield outputs
            yield "results", results
            yield "count", len(results)
            
        except Exception as e:
            yield "error", str(e)

    async def process_data(self, query, limit, credentials):
        # Implement your logic
        # Use credentials.api_key.get_secret_value() to access the API key
        pass
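A block's run() method is an async generator: the executor consumes it as a stream of (output_name, value) pairs and routes each pair to the matching output pin. The following stand-alone sketch uses plain functions instead of a real Block subclass (no backend.sdk imports) just to illustrate that mechanic:

```python
import asyncio

# Stand-alone illustration of the yield protocol: each yield emits an
# (output_name, value) pair that the executor routes to an output pin.
async def run(query: str, limit: int):
    results = [f"{query}-{i}" for i in range(limit)]
    yield "results", results
    yield "count", len(results)

async def collect() -> list[tuple]:
    # The executor iterates the async generator much like this.
    return [pair async for pair in run("demo", 3)]

outputs = dict(asyncio.run(collect()))
# outputs == {"results": ["demo-0", "demo-1", "demo-2"], "count": 3}
```

Because outputs are yielded one at a time, downstream nodes can start consuming early results before the block finishes.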

Key Components Explained

Provider Configuration

The ProviderBuilder allows you to:

  • .with_api_key(): Add API key authentication
  • .with_oauth(): Add OAuth authentication
  • .with_base_cost(): Set resource costs for the block
  • .with_webhook_manager(): Add webhook support
  • .with_user_password(): Add username/password auth
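These methods follow the fluent-builder pattern: each with_*() call returns the builder itself so calls chain, and build() produces the finished provider object. A stand-in sketch of that pattern (the classes below are illustrative, not the real backend.sdk implementation, which supports more options):

```python
from dataclasses import dataclass, field

# Illustrative stand-ins for the real ProviderBuilder in backend.sdk.
@dataclass
class Provider:
    name: str
    auth: dict = field(default_factory=dict)
    base_cost: tuple = ()

class ProviderBuilder:
    def __init__(self, name: str):
        self._provider = Provider(name)

    def with_api_key(self, env_var: str, title: str) -> "ProviderBuilder":
        self._provider.auth = {"type": "api_key", "env_var": env_var, "title": title}
        return self  # returning self is what makes the calls chainable

    def with_base_cost(self, amount: int, cost_type: str) -> "ProviderBuilder":
        self._provider.base_cost = (amount, cost_type)
        return self

    def build(self) -> Provider:
        return self._provider

provider = (
    ProviderBuilder("my_provider")
    .with_api_key("MY_PROVIDER_API_KEY", "My Provider API Key")
    .with_base_cost(1, "RUN")
    .build()
)
```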

Block Schema

  • Input/Output classes: Define the data structures by subclassing BlockSchemaInput and BlockSchemaOutput
  • SchemaField: Define individual fields with validation
  • CredentialsMetaInput: Special field for handling credentials

Block Implementation

  1. Unique ID: Generate once with uuid.uuid4() and hardcode the result; the ID must stay stable across versions so saved graphs keep resolving to the block
  2. Categories: Choose from BlockCategory enum (e.g., SEARCH, AI, PRODUCTIVITY)
  3. async run(): Main execution method that yields outputs
  4. Error handling: Error output pin is already defined on BlockSchemaOutput

Advanced Features

Testing

Add test configuration to your block:

def __init__(self):
    super().__init__(
        # ... other config ...
        test_input={
            "query": "test query",
            "limit": 5,
            "credentials": {
                "provider": "my_provider",
                "id": str(uuid.uuid4()),
                "type": "api_key"
            }
        },
        test_output=[
            ("results", ["result1", "result2"]),
            ("count", 2)
        ],
        test_mock={
            "process_data": lambda *args, **kwargs: ["result1", "result2"]
        }
    )
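Conceptually, the test harness runs the block with test_input, substitutes the methods named in test_mock, collects the yielded (name, value) pairs, and compares them against test_output. A rough stand-in of that loop (fake_run stands in for MyBlock.run with process_data mocked; the real harness also validates schemas and credentials):

```python
import asyncio

# fake_run stands in for a block's run() with its helper mocked out.
async def fake_run(input_data: dict):
    results = ["result1", "result2"]  # the mocked process_data result
    yield "results", results
    yield "count", len(results)

async def check(test_input: dict, test_output: list[tuple]) -> list[tuple]:
    produced = [pair async for pair in fake_run(test_input)]
    assert produced == test_output, f"mismatch: {produced}"
    return produced

produced = asyncio.run(check(
    {"query": "test query", "limit": 5},
    [("results", ["result1", "result2"]), ("count", 2)],
))
```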

OAuth Support

Create an OAuth handler in _oauth.py:

from backend.integrations.oauth.base import BaseOAuthHandler

class MyProviderOAuthHandler(BaseOAuthHandler):
    PROVIDER_NAME = "my_provider"
    
    def _get_authorization_url(self, scopes: list[str], state: str) -> str:
        # Implementation
        pass
    
    def _exchange_code_for_token(self, code: str, scopes: list[str]) -> dict:
        # Implementation
        pass
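An authorization URL is typically the provider's authorize endpoint plus response_type, client_id, redirect_uri, scope, and state query parameters. A stdlib-only sketch of that construction (the endpoint, client ID, and callback below are placeholders, not real values for any provider):

```python
from urllib.parse import urlencode

def build_authorization_url(
    authorize_endpoint: str,
    client_id: str,
    redirect_uri: str,
    scopes: list[str],
    state: str,
) -> str:
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,  # echoed back in the callback; verify it to block CSRF
    }
    return f"{authorize_endpoint}?{urlencode(params)}"

url = build_authorization_url(
    "https://example.com/oauth/authorize",  # placeholder endpoint
    "my-client-id",
    "https://app.example.com/oauth_callback",
    ["read", "write"],
    "opaque-random-state",
)
```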

Webhook Support

Create a webhook manager in _webhook.py:

from backend.integrations.webhooks._base import BaseWebhooksManager

class MyProviderWebhookManager(BaseWebhooksManager):
    PROVIDER_NAME = "my_provider"
    
    async def validate_event(self, event: dict) -> bool:
        # Implementation
        pass
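Many providers sign webhook payloads with an HMAC over the raw request body, delivered in a signature header. Whether your provider uses this exact scheme is an assumption to verify against its docs; the general pattern looks like:

```python
import hashlib
import hmac

def verify_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest prevents timing attacks on the comparison
    return hmac.compare_digest(expected, signature_hex)

secret = b"webhook-signing-secret"
body = b'{"event": "item.created"}'
good_sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
```

Always compute the HMAC over the raw bytes of the request body, not a re-serialized JSON object, since serialization differences will change the digest.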

File Organization

backend/blocks/my_provider/
├── __init__.py          # Export your blocks
├── _config.py           # Provider configuration
├── _oauth.py            # OAuth handler (optional)
├── _webhook.py          # Webhook manager (optional)
├── _api.py              # API client wrapper (optional)
├── models.py            # Data models (optional)
└── my_block.py          # Block implementations

Best Practices

  1. Error Handling: Use BlockInputError for validation failures and BlockExecutionError for runtime errors (import from backend.util.exceptions). These inherit from ValueError so the executor treats them as user-fixable. See Error Handling in new_blocks.md for details.
  2. Credentials: Use the provider's credentials_field() method
  3. Validation: Use SchemaField constraints (ge, le, min_length, etc.)
  4. Categories: Choose appropriate categories for discoverability
  5. Advanced Fields: Mark complex options as advanced=True
  6. Async Operations: Use async/await for I/O operations
  7. API Clients: Use Requests() from SDK or external libraries
  8. Testing: Include test inputs/outputs for validation
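For practice 1, a minimal sketch of the distinction (the exception classes here are stand-ins mirroring backend.util.exceptions; what matters is that both subclass ValueError, which is what signals "user-fixable" to the executor):

```python
# Stand-ins mirroring backend.util.exceptions for illustration only.
class BlockInputError(ValueError):
    """The user supplied invalid input."""

class BlockExecutionError(ValueError):
    """The block failed at runtime in a way the user can act on."""

def validate_limit(limit: int) -> int:
    if not 1 <= limit <= 100:
        raise BlockInputError(f"limit must be between 1 and 100, got {limit}")
    return limit

validate_limit(10)  # passes silently
try:
    validate_limit(0)
except BlockInputError as e:
    message = str(e)
```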

Common Patterns

Making API Requests

from backend.sdk import Requests

async def run(self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs):
    headers = {
        "Authorization": f"Bearer {credentials.api_key.get_secret_value()}",
        "Content-Type": "application/json"
    }
    
    response = await Requests().post(
        "https://api.example.com/endpoint",
        headers=headers,
        json={"query": input_data.query}
    )
    
    data = response.json()
    yield "results", data.get("results", [])

Multiple Auth Types

async def run(
    self, 
    input_data: Input, 
    *, 
    credentials: OAuth2Credentials | APIKeyCredentials,
    **kwargs
):
    if isinstance(credentials, OAuth2Credentials):
        # Handle OAuth
        token = credentials.access_token.get_secret_value()
    else:
        # Handle API key
        token = credentials.api_key.get_secret_value()

Handling Files

When your block works with files (images, videos, documents), use store_media_file():

from backend.data.execution import ExecutionContext
from backend.util.file import store_media_file
from backend.util.type import MediaFileType

async def run(
    self,
    input_data: Input,
    *,
    execution_context: ExecutionContext,
    **kwargs,
):
    # PROCESSING: Need local file path for tools like ffmpeg, MoviePy, PIL
    local_path = await store_media_file(
        file=input_data.video,
        execution_context=execution_context,
        return_format="for_local_processing",
    )

    # EXTERNAL API: Need base64 content for APIs like Replicate, OpenAI
    image_b64 = await store_media_file(
        file=input_data.image,
        execution_context=execution_context,
        return_format="for_external_api",
    )

    # OUTPUT: Return to user/next block (auto-adapts to context)
    result = await store_media_file(
        file=generated_url,
        execution_context=execution_context,
        return_format="for_block_output",  # workspace:// in CoPilot, data URI in graphs
    )
    yield "image_url", result

Return format options:

  • "for_local_processing" - Local file path for processing tools
  • "for_external_api" - Data URI for external APIs needing base64
  • "for_block_output" - Always use for outputs - automatically picks best format
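The auto-adaptation behind "for_block_output" can be pictured as a simple dispatch on execution context: a workspace:// reference inside a CoPilot session, a data URI otherwise. The function and field names below are illustrative, not the real store_media_file() internals:

```python
import base64

def format_block_output(
    content: bytes, mime_type: str, file_id: str, in_copilot_session: bool
) -> str:
    if in_copilot_session:
        # CoPilot resolves workspace:// references via its proxy endpoint
        return f"workspace://{file_id}"
    # Graph runs fall back to embedding the file as a data URI
    encoded = base64.b64encode(content).decode()
    return f"data:{mime_type};base64,{encoded}"

copilot_ref = format_block_output(b"<png bytes>", "image/png", "file-123", True)
graph_ref = format_block_output(b"hi", "text/plain", "file-123", False)
# copilot_ref == "workspace://file-123"
# graph_ref == "data:text/plain;base64,aGk="
```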

Testing Your Block

# Run all block tests
poetry run pytest backend/blocks/test/test_block.py -xvs

# Test specific block
poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[MyBlock]' -xvs

Integration Checklist

  • Create provider configuration in _config.py
  • Implement block class with Input/Output schemas
  • Generate unique block ID with uuid.uuid4()
  • Choose appropriate block categories
  • Implement async run() method
  • Handle errors gracefully
  • Add test configuration
  • Export block in __init__.py
  • Test the block
  • Document any special requirements

Example Blocks for Reference

  • Simple API: /backend/blocks/firecrawl/ - Basic API key authentication
  • OAuth + API: /backend/blocks/linear/ - OAuth and API key support
  • Webhooks: /backend/blocks/exa/ - Includes webhook manager

Study these examples to understand different patterns and approaches for building blocks.