AutoGPT/autogpt_platform/backend/AGENTS.md
Nicholas Tindle 88589764b5 dx(platform): normalize agent instructions for Claude and Codex (#12592)
### Why / What / How

Why: repo guidance was split between Claude-specific `CLAUDE.md` files
and Codex-specific `AGENTS.md` files, which duplicated instruction
content and made the same repository behave differently across agents.
The repo also had Claude skills under `.claude/skills` but no
Codex-visible repo skill path.

What: this PR bridges the repo's Claude skills into Codex and normalizes
shared instruction files so `AGENTS.md` becomes the canonical source
while each `CLAUDE.md` imports its sibling `AGENTS.md`.

How: add a repo-local `.agents/skills` symlink pointing to
`../.claude/skills`; move nested `CLAUDE.md` content into sibling
`AGENTS.md` files; replace each repo `CLAUDE.md` with a one-line
`@AGENTS.md` shim so Claude and Codex read the same scoped guidance
without duplicating text. The root `CLAUDE.md` now imports the root
`AGENTS.md` rather than symlinking to it.

Note: the instruction-file normalization commit was created with
`--no-verify` because the repo's frontend pre-commit `tsc` hook
currently fails on unrelated existing errors, largely missing
`autogpt_platform/frontend/src/app/api/__generated__/*` modules.

### Changes 🏗️

- Add `.agents/skills` as a repo-local symlink to `../.claude/skills` so
Codex discovers the existing Claude repo skills.
- Add a real root `CLAUDE.md` shim that imports the canonical root
`AGENTS.md`.
- Promote nested scoped instruction content into sibling `AGENTS.md`
files under `autogpt_platform/`, `autogpt_platform/backend/`,
`autogpt_platform/frontend/`, `autogpt_platform/frontend/src/tests/`,
and `docs/`.
- Replace the corresponding nested `CLAUDE.md` files with one-line
`@AGENTS.md` shims.
- Preserve the existing scoped instruction hierarchy while making the
shared content cross-compatible between Claude and Codex.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified `.agents/skills` resolves to `../.claude/skills`
  - [x] Verified each repo `CLAUDE.md` now contains only `@AGENTS.md`
- [x] Verified the expected `AGENTS.md` files exist at the root and
nested scoped directories
- [x] Verified the branch contains only the intended agent-guidance
commits relative to `dev` and the working tree is clean

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

No runtime configuration changes are included in this PR.

2026-04-01 09:08:51 +00:00


Backend

This file provides guidance to coding agents when working with the backend.

Essential Commands

To run anything that depends on the Python packages, you MUST use poetry run ... (for example, poetry run pytest).

# Install dependencies
poetry install

# Run database migrations
poetry run prisma migrate dev

# Start all services (database, redis, rabbitmq, clamav)
docker compose up -d

# Run the backend as a whole
poetry run app

# Run tests
poetry run test

# Run specific test
poetry run pytest path/to/test_file.py::test_function_name

# Run block tests (tests that validate all blocks work correctly)
poetry run pytest backend/blocks/test/test_block.py -xvs

# Run tests for a specific block (e.g., GetCurrentTimeBlock)
poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs

# Lint and format
# Prefer format when you just want things fixed: it autofixes what it can and reports only the errors it can't
poetry run format  # Black + isort
poetry run lint    # ruff

More details can be found in @TESTING.md

Creating/Updating Snapshots

When you first write a test or when the expected output changes:

poetry run pytest path/to/test.py --snapshot-update

⚠️ Important: Always review snapshot changes before committing! Use git diff to verify the changes are expected.

Architecture

  • API Layer: FastAPI with REST and WebSocket endpoints
  • Database: PostgreSQL with Prisma ORM, includes pgvector for embeddings
  • Queue System: RabbitMQ for async task processing
  • Execution Engine: Separate executor service processes agent workflows
  • Authentication: JWT-based with Supabase integration
  • Security: Cache protection middleware prevents sensitive data caching in browsers/proxies

Code Style

  • Top-level imports only — no local/inner imports (lazy imports only for heavy optional deps like openpyxl)
  • Absolute imports — use from backend.module import ... for cross-package imports. Single-dot relative (from .sibling import ...) is acceptable for sibling modules within the same package (e.g., blocks). Avoid double-dot relative imports (from ..parent import ...) — use the absolute path instead
  • No duck typing — no hasattr/getattr/isinstance for type dispatch; use typed interfaces/unions/protocols
  • Pydantic models over dataclass/namedtuple/dict for structured data
  • No linter suppressors — no # type: ignore, # noqa, # pyright: ignore; fix the type/code
  • List comprehensions over manual loop-and-append
  • Early return — guard clauses first, avoid deep nesting
  • f-strings vs printf syntax in log statements — Use %s for deferred interpolation in debug statements, f-strings elsewhere for readability: logger.debug("Processing %s items", count), logger.info(f"Processing {count} items")
  • Sanitize error paths: use os.path.basename() in error messages to avoid leaking directory structure
  • TOCTOU awareness — avoid check-then-act patterns for file access and credit charging
  • Security() vs Depends() — use Security() for auth deps to get proper OpenAPI security spec
  • Redis pipelines: use transaction=True for atomicity on multi-step operations
  • max(0, value) guards — for computed values that should never be negative
  • SSE protocol: use "data:" lines for frontend-parsed events (must match the Zod schema) and ":" comment lines for heartbeats/status
  • File length — keep files under ~300 lines; if a file grows beyond this, split by responsibility (e.g. extract helpers, models, or a sub-module into a new file). Never keep appending to a long file.
  • Function length — keep functions under ~40 lines; extract named helpers when a function grows longer. Long functions are a sign of mixed concerns, not complexity.
  • Top-down ordering — define the main/public function or class first, then the helpers it uses below. A reader should encounter high-level logic before implementation details.
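Several of the rules above can be sketched in one small module. This is only an illustration under stated assumptions: HasBalance, remaining_credits, read_text_file, sse_data_line and sse_comment_line are hypothetical names made up for this example, not functions from the backend.

```python
import logging
import os
from typing import Protocol

logger = logging.getLogger(__name__)


class HasBalance(Protocol):
    """Typed interface used instead of hasattr()/getattr() checks (hypothetical)."""

    balance: int


def remaining_credits(account: HasBalance, cost: int) -> int:
    # max(0, value) guard: a computed balance should never go negative.
    return max(0, account.balance - cost)


def read_text_file(path: str) -> str:
    # Early return style: reject bad input first, keep the main path unnested.
    if not path:
        raise ValueError("path must not be empty")
    # TOCTOU awareness: open the file directly instead of checking
    # os.path.exists() first and then opening it.
    try:
        with open(path, encoding="utf-8") as handle:
            content = handle.read()
    except FileNotFoundError as exc:
        # Sanitize error paths: report only the file name, not the directory.
        raise FileNotFoundError(f"File not found: {os.path.basename(path)}") from exc
    # %s defers interpolation until the debug record is actually emitted.
    logger.debug("Read %s characters", len(content))
    return content


def sse_data_line(payload_json: str) -> str:
    # A "data:" line carries an event the frontend parses; a blank line ends it.
    return f"data: {payload_json}\n\n"


def sse_comment_line(text: str) -> str:
    # A line starting with ":" is an SSE comment, suitable for heartbeats/status.
    return f": {text}\n\n"
```

The public helpers come first and each stays short, which also follows the top-down ordering and function length rules.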

Testing Approach

  • Uses pytest with snapshot testing for API responses
  • Test files are colocated with source files (*_test.py)
  • Mock at boundaries — mock where the symbol is used, not where it's defined
  • After refactoring, update mock targets to match new module paths
  • Use AsyncMock for async functions (from unittest.mock import AsyncMock)
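The "mock where the symbol is used" rule can be sketched with two throwaway modules. Everything named here (provider, consumer, fetch_price, price_report) is hypothetical and built in memory only so the example is self-contained; real tests would patch actual backend module paths.

```python
import asyncio
import sys
import types
from unittest.mock import AsyncMock, patch

# Hypothetical module where the symbol is DEFINED.
provider = types.ModuleType("provider")
exec(
    "async def fetch_price(symbol):\n"
    "    raise RuntimeError('network call not allowed in tests')\n",
    provider.__dict__,
)
sys.modules["provider"] = provider

# Hypothetical module where the symbol is USED (imported by name).
consumer = types.ModuleType("consumer")
exec(
    "from provider import fetch_price\n"
    "async def price_report(symbol):\n"
    "    price = await fetch_price(symbol)\n"
    "    return f'{symbol}: {price:.2f}'\n",
    consumer.__dict__,
)
sys.modules["consumer"] = consumer


def run_report_with_mock() -> str:
    # Patch the name in the consumer's namespace, because that is where
    # price_report looks it up. AsyncMock makes the replacement awaitable.
    with patch("consumer.fetch_price", new=AsyncMock(return_value=12.5)) as mock_fetch:
        result = asyncio.run(consumer.price_report("ABC"))
        mock_fetch.assert_awaited_once_with("ABC")
    return result
```

Patching "provider.fetch_price" instead would leave the name already imported into consumer untouched, so the real function would still run. This is also why mock targets need updating after a refactor moves code between modules.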

Test-Driven Development (TDD)

When fixing a bug or adding a feature, write the test before the implementation:

import pytest

# widget and Widget are placeholders for the code under test.

# 1. Write a failing test marked xfail
@pytest.mark.xfail(reason="Bug #1234: widget crashes on empty input")
def test_widget_handles_empty_input():
    result = widget.process("")
    assert result == Widget.EMPTY_RESULT

# 2. Run it and confirm it fails (XFAIL)
# poetry run pytest path/to/test.py::test_widget_handles_empty_input -xvs

# 3. Implement the fix

# 4. Remove xfail, run again, and confirm it passes
def test_widget_handles_empty_input():
    result = widget.process("")
    assert result == Widget.EMPTY_RESULT

This catches regressions and proves the fix actually works. Every bug fix should include a test that would have caught it.

Database Schema

Key models (defined in schema.prisma):

  • User: Authentication and profile data
  • AgentGraph: Workflow definitions with version control
  • AgentGraphExecution: Execution history and results
  • AgentNode: Individual nodes in a workflow
  • StoreListing: Marketplace listings for sharing agents

Environment Configuration

  • Backend: .env.default (defaults) → .env (user overrides)
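The precedence described above (defaults first, user overrides on top) can be illustrated with a minimal sketch. It is not how the backend actually loads its settings; parse_env and layered_env are hypothetical helpers that only demonstrate the ordering.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def layered_env(default_text: str, override_text: str) -> Dict[str, str]:
    # Later entries win, so values from .env replace those from .env.default.
    return {**parse_env(default_text), **parse_env(override_text)}
```

Under this sketch, a key present only in .env.default keeps its default value, while a key present in both files takes the value from .env.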

Common Development Tasks

Adding a new block

Follow the comprehensive Block SDK Guide which covers:

  • Provider configuration with ProviderBuilder
  • Block schema definition
  • Authentication (API keys, OAuth, webhooks)
  • Testing and validation
  • File organization

Quick steps:

  1. Create new file in backend/blocks/
  2. Configure provider using ProviderBuilder in _config.py
  3. Inherit from Block base class
  4. Define input/output schemas using BlockSchema
  5. Implement async run method
  6. Generate unique block ID using uuid.uuid4()
  7. Test with poetry run pytest backend/blocks/test/test_block.py
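For step 6, a minimal sketch of generating and sanity-checking an ID with the standard library. The helper names are hypothetical. It is also an assumption here that the ID is generated once and then stored as a literal in the block, so that it stays stable, rather than being regenerated on every import.

```python
import uuid


def new_block_id() -> str:
    """Return a fresh random UUID (version 4) as a string."""
    return str(uuid.uuid4())


def is_valid_block_id(value: str) -> bool:
    """Check that a string parses as a version 4 UUID."""
    try:
        return uuid.UUID(value).version == 4
    except ValueError:
        return False
```

Running new_block_id() once in a Python shell and pasting the result into the new block is one way to use it.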

Note: when creating many new blocks, analyze each block's interface and consider whether the blocks would work well together in a graph-based editor or would struggle to connect productively. For example, do the inputs and outputs tie together well?

If you get any pushback or hit complex block conditions, check the new_blocks guide in the docs.

Handling files in blocks with store_media_file()

When blocks need to work with files (images, videos, documents), use store_media_file() from backend.util.file. The return_format parameter determines what you get back:

| Format | Use When | Returns |
|---|---|---|
| "for_local_processing" | Processing with local tools (ffmpeg, MoviePy, PIL) | Local file path (e.g., "image.png") |
| "for_external_api" | Sending content to external APIs (Replicate, OpenAI) | Data URI (e.g., "data:image/png;base64,...") |
| "for_block_output" | Returning output from your block | Smart: workspace:// in CoPilot, data URI in graphs |

Examples:

# INPUT: Need to process file locally with ffmpeg
local_path = await store_media_file(
    file=input_data.video,
    execution_context=execution_context,
    return_format="for_local_processing",
)
# local_path = "video.mp4" - use with Path/ffmpeg/etc

# INPUT: Need to send to external API like Replicate
image_b64 = await store_media_file(
    file=input_data.image,
    execution_context=execution_context,
    return_format="for_external_api",
)
# image_b64 = "data:image/png;base64,iVBORw0..." - send to API

# OUTPUT: Returning result from block
result_url = await store_media_file(
    file=generated_image_url,
    execution_context=execution_context,
    return_format="for_block_output",
)
yield "image_url", result_url
# In CoPilot: result_url = "workspace://abc123"
# In graphs:  result_url = "data:image/png;base64,..."

Key points:

  • for_block_output is the ONLY format that auto-adapts to execution context
  • Always use for_block_output for block outputs unless you have a specific reason not to
  • Never hardcode workspace checks - let for_block_output handle it

Modifying the API

  1. Update route in backend/api/features/
  2. Add/update Pydantic models in same directory
  3. Write tests alongside the route file
  4. Run poetry run test to verify

Workspace & Media Files

Read Workspace & Media Architecture when:

  • Working on CoPilot file upload/download features
  • Building blocks that handle MediaFileType inputs/outputs
  • Modifying WorkspaceManager or store_media_file()
  • Debugging file persistence or virus scanning issues

Covers: WorkspaceManager (persistent storage with session scoping), store_media_file() (media normalization pipeline), and responsibility boundaries for virus scanning and persistence.

Security Implementation

Cache Protection Middleware

  • Located in backend/api/middleware/security.py
  • Default behavior: Disables caching for ALL endpoints with Cache-Control: no-store, no-cache, must-revalidate, private
  • Uses an allow list approach - only explicitly permitted paths can be cached
  • Cacheable paths include: static assets (static/*, _next/static/*), health checks, public store pages, documentation
  • Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
  • To allow caching for a new endpoint, add it to CACHEABLE_PATHS in the middleware
  • Applied to both main API server and external API applications
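The allow-list behavior described above can be sketched as a pure function. This is an assumption-laden illustration, not the code in backend/api/middleware/security.py: the contents and pattern format of CACHEABLE_PATHS here are hypothetical, and cache_control_for is a made-up helper name.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Header value applied by default, as listed in this section.
NO_CACHE_VALUE = "no-store, no-cache, must-revalidate, private"

# Hypothetical allow list; the real entries and their format may differ.
CACHEABLE_PATHS = ("/static/*", "/_next/static/*", "/health")


def is_cacheable(path: str) -> bool:
    """Return True only for paths explicitly allowed to be cached."""
    return any(fnmatchcase(path, pattern) for pattern in CACHEABLE_PATHS)


def cache_control_for(path: str) -> Optional[str]:
    """Return the Cache-Control value to force, or None to leave caching alone."""
    if is_cacheable(path):
        return None
    return NO_CACHE_VALUE
```

Because the default branch is the restrictive one, a newly added endpoint is uncacheable until someone deliberately adds it to the allow list.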