AutoGPT/docs/platform/block-sdk-guide.md
Nicholas Tindle 7668c17d9c feat(platform): add User Workspace for persistent CoPilot file storage (#11867)
2026-01-29 05:49:47 +00:00


# Block Creation with SDK
This guide explains how to create new blocks for the AutoGPT Platform using the SDK pattern, including advanced features such as OAuth, webhooks, and test configuration.
## Overview
Blocks are reusable components that perform specific tasks in AutoGPT workflows. They can integrate with external services, process data, or perform any programmatic operation.
## Basic Structure
### 1. Create Provider Configuration
First, create a `_config.py` file to configure your provider using the `ProviderBuilder`:
```python
from backend.sdk import BlockCostType, ProviderBuilder

# Simple API key provider
my_provider = (
    ProviderBuilder("my_provider")
    .with_api_key("MY_PROVIDER_API_KEY", "My Provider API Key")
    .with_base_cost(1, BlockCostType.RUN)
    .build()
)
```
For OAuth providers:
```python
from backend.sdk import BlockCostType, ProviderBuilder

from ._oauth import MyProviderOAuthHandler

my_provider = (
    ProviderBuilder("my_provider")
    .with_oauth(
        MyProviderOAuthHandler,
        scopes=["read", "write"],
        client_id_env_var="MY_PROVIDER_CLIENT_ID",
        client_secret_env_var="MY_PROVIDER_CLIENT_SECRET",
    )
    .with_base_cost(1, BlockCostType.RUN)
    .build()
)
```
### 2. Create the Block Class
Create your block file (e.g., `my_block.py`):
```python
import uuid

from backend.sdk import (
    APIKeyCredentials,
    Block,
    BlockCategory,
    BlockOutput,
    BlockSchemaInput,
    BlockSchemaOutput,
    CredentialsMetaInput,
    SchemaField,
)

from ._config import my_provider


class MyBlock(Block):
    class Input(BlockSchemaInput):
        # Define input fields
        credentials: CredentialsMetaInput = my_provider.credentials_field(
            description="API credentials for My Provider"
        )
        query: str = SchemaField(description="The query to process")
        limit: int = SchemaField(
            description="Number of results",
            default=10,
            ge=1,    # Greater than or equal to 1
            le=100,  # Less than or equal to 100
        )
        advanced_option: str = SchemaField(
            description="Advanced setting",
            default="",
            advanced=True,  # Hidden by default in UI
        )

    class Output(BlockSchemaOutput):
        # Define output fields
        results: list = SchemaField(description="List of results")
        count: int = SchemaField(description="Total count")
        # error output pin is already defined on BlockSchemaOutput

    def __init__(self):
        super().__init__(
            id=str(uuid.uuid4()),  # Generate unique ID
            description="Brief description of what this block does",
            categories={BlockCategory.SEARCH},  # Choose appropriate categories
            input_schema=self.Input,
            output_schema=self.Output,
        )

    async def run(
        self,
        input_data: Input,
        *,
        credentials: APIKeyCredentials,
        **kwargs,
    ) -> BlockOutput:
        try:
            # Your block logic here
            results = await self.process_data(
                input_data.query,
                input_data.limit,
                credentials,
            )
            # Yield outputs
            yield "results", results
            yield "count", len(results)
        except Exception as e:
            yield "error", str(e)

    async def process_data(self, query, limit, credentials):
        # Implement your logic
        # Use credentials.api_key.get_secret_value() to access the API key
        pass
```
## Key Components Explained
### Provider Configuration
The `ProviderBuilder` allows you to:
- **`.with_api_key()`**: Add API key authentication
- **`.with_oauth()`**: Add OAuth authentication
- **`.with_base_cost()`**: Set resource costs for the block
- **`.with_webhook_manager()`**: Add webhook support
- **`.with_user_password()`**: Add username/password auth
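
Each of these builder methods returns the builder itself, which is what makes the fluent chains shown above possible. A minimal standalone sketch of that pattern (simplified, hypothetical classes for illustration, not the real SDK):

```python
# Sketch of the fluent-builder pattern ProviderBuilder uses: every
# .with_* call records configuration and returns self, so calls chain.
class Provider:
    def __init__(self, name: str, config: dict):
        self.name = name
        self.config = config


class SimpleProviderBuilder:
    def __init__(self, name: str):
        self._name = name
        self._config = {}

    def with_api_key(self, env_var: str, title: str):
        self._config["api_key"] = {"env_var": env_var, "title": title}
        return self  # returning self is what enables chaining

    def with_base_cost(self, amount: int, cost_type: str):
        self._config["base_cost"] = (amount, cost_type)
        return self

    def build(self) -> Provider:
        return Provider(self._name, self._config)


provider = (
    SimpleProviderBuilder("my_provider")
    .with_api_key("MY_PROVIDER_API_KEY", "My Provider API Key")
    .with_base_cost(1, "run")
    .build()
)
```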
### Block Schema
- **Input/Output classes**: Define the data structures by subclassing `BlockSchemaInput` and `BlockSchemaOutput`
- **SchemaField**: Define individual fields with validation
- **CredentialsMetaInput**: Special field for handling credentials
### Block Implementation
1. **Unique ID**: Generate using `uuid.uuid4()`
2. **Categories**: Choose from `BlockCategory` enum (e.g., SEARCH, AI, PRODUCTIVITY)
3. **async run()**: Main execution method that yields outputs
4. **Error handling**: Error output pin is already defined on BlockSchemaOutput
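
The `run()` method is an async generator: each `yield` emits an `(output_pin, value)` pair that the executor routes to the matching `Output` field. A standalone sketch of how those yields are consumed (simplified driver for illustration, not the real executor):

```python
import asyncio

# A block's run() yields (pin_name, value) pairs; the platform iterates
# the async generator and routes each pair to the named output pin.
async def run():
    results = ["a", "b"]
    yield "results", results
    yield "count", len(results)


async def collect(gen):
    # Drain the async generator into a list, like the executor would
    return [pair async for pair in gen]


outputs = asyncio.run(collect(run()))
# outputs == [("results", ["a", "b"]), ("count", 2)]
```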
## Advanced Features
### Testing
Add test configuration to your block:
```python
def __init__(self):
    super().__init__(
        # ... other config ...
        test_input={
            "query": "test query",
            "limit": 5,
            "credentials": {
                "provider": "my_provider",
                "id": str(uuid.uuid4()),
                "type": "api_key",
            },
        },
        test_output=[
            ("results", ["result1", "result2"]),
            ("count", 2),
        ],
        test_mock={
            "process_data": lambda *args, **kwargs: ["result1", "result2"]
        },
    )
```
### OAuth Support
Create an OAuth handler in `_oauth.py`:
```python
from backend.integrations.oauth.base import BaseOAuthHandler


class MyProviderOAuthHandler(BaseOAuthHandler):
    PROVIDER_NAME = "my_provider"

    def _get_authorization_url(self, scopes: list[str], state: str) -> str:
        # Implementation
        pass

    def _exchange_code_for_token(self, code: str, scopes: list[str]) -> dict:
        # Implementation
        pass
```
### Webhook Support
Create a webhook manager in `_webhook.py`:
```python
from backend.integrations.webhooks._base import BaseWebhooksManager


class MyProviderWebhookManager(BaseWebhooksManager):
    PROVIDER_NAME = "my_provider"

    async def validate_event(self, event: dict) -> bool:
        # Implementation
        pass
```
## File Organization
```
backend/blocks/my_provider/
├── __init__.py # Export your blocks
├── _config.py # Provider configuration
├── _oauth.py # OAuth handler (optional)
├── _webhook.py # Webhook manager (optional)
├── _api.py # API client wrapper (optional)
├── models.py # Data models (optional)
└── my_block.py # Block implementations
```
## Best Practices
1. **Error Handling**: Use `BlockInputError` for validation failures and `BlockExecutionError` for runtime errors (import from `backend.util.exceptions`). These inherit from `ValueError` so the executor treats them as user-fixable. See [Error Handling in new_blocks.md](new_blocks.md#error-handling) for details.
2. **Credentials**: Use the provider's `credentials_field()` method
3. **Validation**: Use SchemaField constraints (ge, le, min_length, etc.)
4. **Categories**: Choose appropriate categories for discoverability
5. **Advanced Fields**: Mark complex options as `advanced=True`
6. **Async Operations**: Use `async`/`await` for I/O operations
7. **API Clients**: Use `Requests()` from SDK or external libraries
8. **Testing**: Include test inputs/outputs for validation
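
To illustrate point 1, here is a standalone sketch of how those error classes behave (the class bodies below are illustrative stand-ins; import the real ones from `backend.util.exceptions` in actual block code):

```python
# Illustrative stand-ins: per the guide, the real BlockInputError /
# BlockExecutionError inherit from ValueError so the executor treats
# them as user-fixable. Do NOT redefine these in real blocks; import
# them from backend.util.exceptions instead.
class BlockInputError(ValueError):
    """Invalid user-supplied input."""


class BlockExecutionError(ValueError):
    """Runtime failure while the block executes."""


def validate_limit(limit: int) -> int:
    # Mirrors the ge=1 / le=100 SchemaField constraints from the example block
    if not 1 <= limit <= 100:
        raise BlockInputError(f"limit must be between 1 and 100, got {limit}")
    return limit
```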
## Common Patterns
### Making API Requests
```python
from backend.sdk import Requests

async def run(self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs):
    headers = {
        "Authorization": f"Bearer {credentials.api_key.get_secret_value()}",
        "Content-Type": "application/json",
    }
    response = await Requests().post(
        "https://api.example.com/endpoint",
        headers=headers,
        json={"query": input_data.query},
    )
    data = response.json()
    yield "results", data.get("results", [])
```
### Multiple Auth Types
```python
async def run(
    self,
    input_data: Input,
    *,
    credentials: OAuth2Credentials | APIKeyCredentials,
    **kwargs,
):
    if isinstance(credentials, OAuth2Credentials):
        # Handle OAuth
        token = credentials.access_token.get_secret_value()
    else:
        # Handle API key
        token = credentials.api_key.get_secret_value()
```
### Handling Files
When your block works with files (images, videos, documents), use `store_media_file()`:
```python
from backend.data.execution import ExecutionContext
from backend.util.file import store_media_file
from backend.util.type import MediaFileType

async def run(
    self,
    input_data: Input,
    *,
    execution_context: ExecutionContext,
    **kwargs,
):
    # PROCESSING: Need local file path for tools like ffmpeg, MoviePy, PIL
    local_path = await store_media_file(
        file=input_data.video,
        execution_context=execution_context,
        return_format="for_local_processing",
    )

    # EXTERNAL API: Need base64 content for APIs like Replicate, OpenAI
    image_b64 = await store_media_file(
        file=input_data.image,
        execution_context=execution_context,
        return_format="for_external_api",
    )

    # OUTPUT: Return to user/next block (auto-adapts to context)
    result = await store_media_file(
        file=generated_url,
        execution_context=execution_context,
        return_format="for_block_output",  # workspace:// in CoPilot, data URI in graphs
    )
    yield "image_url", result
```
**Return format options:**
- `"for_local_processing"` - Local file path for processing tools
- `"for_external_api"` - Data URI for external APIs needing base64
- `"for_block_output"` - **Always use for outputs** - automatically picks best format
## Testing Your Block
```bash
# Run all block tests
poetry run pytest backend/blocks/test/test_block.py -xvs
# Test specific block
poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[MyBlock]' -xvs
```
## Integration Checklist
- [ ] Create provider configuration in `_config.py`
- [ ] Implement block class with Input/Output schemas
- [ ] Generate unique block ID with `uuid.uuid4()`
- [ ] Choose appropriate block categories
- [ ] Implement `async run()` method
- [ ] Handle errors gracefully
- [ ] Add test configuration
- [ ] Export block in `__init__.py`
- [ ] Test the block
- [ ] Document any special requirements
## Example Blocks for Reference
- **Simple API**: `/backend/blocks/firecrawl/` - Basic API key authentication
- **OAuth + API**: `/backend/blocks/linear/` - OAuth and API key support
- **Webhooks**: `/backend/blocks/exa/` - Includes webhook manager
Study these examples to understand different patterns and approaches for building blocks.