mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-02-12 07:45:14 -05:00

Nicholas Tindle ab95077e5b refactor(forge): remove VCR cassettes, use real API calls with skip for forks

- Remove vcrpy and pytest-recording dependencies
- Remove tests/vcr/ directory and vcr_cassettes submodule
- Remove .gitmodules (only had cassette submodule)
- Simplify CI workflow - no more cassette checkout/push/PAT_REVIEW
- Tests requiring API keys now skip if not set (fork PRs)
- Update CLAUDE.md files to remove cassette references
- Fix broken agbenchmark path in pyproject.toml

Security improvement: removes need for PAT with cross-repo write access.
Fork PRs will have API-dependent tests skipped (GitHub protects secrets).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-19 22:51:57 -06:00

8.6 KiB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

AutoGPT Classic is an experimental, unsupported project demonstrating autonomous GPT-4 operation. Dependencies will not be updated, and the codebase contains known vulnerabilities. This is preserved for educational/historical purposes.

Repository Structure

/forge            - Core autonomous agent framework (main library)
/original_autogpt - Original AutoGPT implementation (depends on forge)
/direct_benchmark - Benchmark harness for testing agent performance

Each Python subproject has its own pyproject.toml and uses Poetry for dependency management.

Common Commands

Setup & Install

# Install forge (core library)
cd forge && poetry install

# Install original_autogpt (includes forge as dependency)
cd original_autogpt && poetry install

# Install benchmark
cd direct_benchmark && poetry install

# Install with benchmark support (optional extra)
cd forge && poetry install --extras benchmark
cd original_autogpt && poetry install --extras benchmark

Running Agents

# Run forge agent (from forge directory)
cd forge && poetry run python -m forge

# Run original autogpt (from original_autogpt directory)
cd original_autogpt && poetry run serve --debug

# Run autogpt CLI
cd original_autogpt && poetry run autogpt

Agents run on http://localhost:8000 by default.

Benchmarking

# Run benchmarks
cd direct_benchmark && poetry run python -m direct_benchmark run

# Run specific strategies and models
poetry run python -m direct_benchmark run \
    --strategies one_shot,rewoo \
    --models claude \
    --parallel 4

# Run a single test
poetry run python -m direct_benchmark run --tests ReadFile

# List available commands
poetry run python -m direct_benchmark --help

Testing

cd forge && poetry run pytest                    # All tests
cd forge && poetry run pytest tests/             # Tests directory only
cd forge && poetry run pytest -k test_name       # Single test by name
cd forge && poetry run pytest path/to/test.py   # Specific test file
cd forge && poetry run pytest --cov             # With coverage

Linting & Formatting

Run from forge/ or original_autogpt/ directory:

# Format everything (recommended to run together)
poetry run black . && poetry run isort .

# Check formatting (CI-style, no changes)
poetry run black --check . && poetry run isort --check-only .

# Lint
poetry run flake8        # Style linting

# Type check
poetry run pyright       # Type checking (some errors are expected in infrastructure code)

Note: Always run linters over the entire directory, not specific files, for best results.

Architecture

Forge (Core Framework)

The forge package is the foundation that other components depend on:

forge/agent/ - Agent implementation and protocols
forge/llm/ - Multi-provider LLM integrations (OpenAI, Anthropic, Groq, LiteLLM)
forge/components/ - Reusable agent components
forge/file_storage/ - File system abstraction
forge/config/ - Configuration management

Original AutoGPT

Depends on forge via local path (autogpt-forge = { path = "../forge" }):

autogpt/app/ - CLI application entry points
autogpt/agents/ - Agent implementations
autogpt/agent_factory/ - Agent creation logic

Direct Benchmark

Benchmark harness for testing agent performance:

direct_benchmark/ - CLI and harness code
benchmark/agbenchmark/challenges/ - Test cases organized by category (code, retrieval, data, etc.)
Reports generated in direct_benchmark/reports/

Dependency Chain

original_autogpt → forge
direct_benchmark → original_autogpt → forge

Code Style

Python 3.12 target
Line length: 88 characters (Black default)
Black for formatting, isort for imports (profile="black")
Type hints with Pyright checking

Testing Patterns

Async support via pytest-asyncio
Fixtures defined in conftest.py files provide: tmp_project_root, storage, config, llm_provider, agent
Tests requiring API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY) will skip if not set
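
The skip condition described above can be sketched as a small helper. This is a minimal illustration, not the repository's actual conftest code: the function name `missing_api_keys` is hypothetical, and only the two key names come from the list above.

```python
import os
from typing import Mapping, Sequence

# Key names taken from the list above; the helper itself is hypothetical.
API_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_api_keys(
    env: Mapping[str, str] = os.environ,
    required: Sequence[str] = API_KEYS,
) -> list[str]:
    """Return the required key names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]
```

A test module could then pass `bool(missing_api_keys())` as the condition to `pytest.mark.skipif`, with a `reason` naming the missing keys, so API-dependent tests skip cleanly on fork PRs where secrets are unavailable.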

Environment Setup

Copy .env.example to .env in the relevant directory and add your API keys:

cp .env.example .env
# Edit .env with your OPENAI_API_KEY, etc.

Workspaces

Agents operate within a workspace - a directory containing all agent data and files. The workspace root defaults to the current working directory.

Workspace Structure

{workspace}/
├── .autogpt/
│   ├── autogpt.yaml              # Workspace-level permissions
│   ├── ap_server.db              # Agent Protocol database (server mode)
│   └── agents/
│       └── AutoGPT-{agent_id}/
│           ├── state.json        # Agent profile, directives, action history
│           ├── permissions.yaml  # Agent-specific permission overrides
│           └── workspace/        # Agent's sandboxed working directory
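
For scripts that need to locate these files, the layout above can be expressed with `pathlib`. This is a sketch under the assumption that the tree is exactly as drawn; `agent_paths` and its dictionary keys are hypothetical names, not part of AutoGPT.

```python
from pathlib import PurePosixPath


def agent_paths(workspace: str, agent_id: str) -> dict[str, PurePosixPath]:
    """Build the paths from the workspace tree for one agent (illustrative)."""
    root = PurePosixPath(workspace) / ".autogpt"
    agent_dir = root / "agents" / f"AutoGPT-{agent_id}"
    return {
        "workspace_permissions": root / "autogpt.yaml",  # workspace-level permissions
        "ap_database": root / "ap_server.db",            # Agent Protocol database
        "state": agent_dir / "state.json",               # profile, directives, history
        "agent_permissions": agent_dir / "permissions.yaml",
        "sandbox": agent_dir / "workspace",              # agent's working directory
    }
```

`PurePosixPath` keeps the sketch free of filesystem access; swap in `Path` when the paths should be opened or created.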

Key Concepts

Multiple agents can coexist in the same workspace (each gets its own subdirectory)
File access is sandboxed to the agent's workspace/ directory by default
State persistence - agent state saves to state.json and survives across sessions
Storage backends - supports local filesystem, S3, and GCS (via FILE_STORAGE_BACKEND env var)
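
As a rough illustration of the storage backend switch, the helper below validates `FILE_STORAGE_BACKEND` against the three values named above. The function name is hypothetical, and falling back to `local` when the variable is unset is an assumption of this sketch, not confirmed behaviour.

```python
import os
from typing import Mapping

# The three backends named in the list above.
SUPPORTED_BACKENDS = ("local", "s3", "gcs")


def resolve_storage_backend(env: Mapping[str, str] = os.environ) -> str:
    """Return a validated backend name read from FILE_STORAGE_BACKEND."""
    # Assumption: an unset variable means the local filesystem.
    value = env.get("FILE_STORAGE_BACKEND", "local").strip().lower()
    if value not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"FILE_STORAGE_BACKEND must be one of {SUPPORTED_BACKENDS}, got {value!r}"
        )
    return value
```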

Specifying a Workspace

# Default: uses current directory
cd /path/to/my/project && poetry run autogpt

# Or specify explicitly via CLI (if supported)
poetry run autogpt --workspace /path/to/workspace

Settings Location

Configuration uses a layered system with three levels (in order of precedence):

1. Environment Variables (Global)

Loaded from .env file in the working directory:

# Required
OPENAI_API_KEY=sk-...

# Optional LLM settings
SMART_LLM=gpt-4o                    # Model for complex reasoning
FAST_LLM=gpt-4o-mini                # Model for simple tasks
EMBEDDING_MODEL=text-embedding-3-small

# Optional search providers (for web search component)
TAVILY_API_KEY=tvly-...
SERPER_API_KEY=...
GOOGLE_API_KEY=...
GOOGLE_CUSTOM_SEARCH_ENGINE_ID=...

# Optional infrastructure
LOG_LEVEL=DEBUG                     # DEBUG, INFO, WARNING, ERROR
DATABASE_STRING=sqlite:///agent.db  # Agent Protocol database
PORT=8000                           # Server port
FILE_STORAGE_BACKEND=local          # local, s3, or gcs
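
To make the file format concrete, here is a deliberately simplified reader for `KEY=value  # comment` lines like those above. It is a sketch only: `parse_env_text` is a hypothetical helper, it ignores quoting and escapes, and it is not how AutoGPT itself loads `.env`.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks and full-line comments."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' so values such as sqlite:///... stay intact.
        key, _, value = line.partition("=")
        # Drop a trailing inline comment; it must be preceded by whitespace.
        value = value.split(" #", 1)[0].strip()
        settings[key.strip()] = value
    return settings
```

Every value comes back as a string, so a setting such as `PORT` still needs an explicit `int(...)` conversion by the caller.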

2. Workspace Settings (`{workspace}/.autogpt/autogpt.yaml`)

Workspace-wide permissions that apply to all agents in this workspace:

allow:
  - read_file({workspace}/**)
  - write_to_file({workspace}/**)
  - list_folder({workspace}/**)
  - web_search(*)

deny:
  - read_file(**.env)
  - read_file(**.env.*)
  - read_file(**.key)
  - read_file(**.pem)
  - execute_shell(rm -rf:*)
  - execute_shell(sudo:*)

Auto-generated with sensible defaults if missing.

3. Agent Settings (`{workspace}/.autogpt/agents/{id}/permissions.yaml`)

Agent-specific permission overrides:

allow:
  - execute_python(*)
  - web_search(*)

deny:
  - execute_shell(*)

Permissions

The permission system uses pattern matching with a first-match-wins evaluation order.

Permission Check Order

1. Agent deny list → Block
2. Workspace deny list → Block
3. Agent allow list → Allow
4. Workspace allow list → Allow
5. Session denied list → Block (commands denied during this session)
6. Prompt user → Interactive approval (if in interactive mode)
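
The check order above can be sketched as a loop that stops at the first list containing a matching pattern. Everything here is illustrative: `decide` and its parameters are hypothetical names, and the default matcher `fnmatch.fnmatchcase` is a stand-in whose `*` also crosses `/`, unlike the pattern syntax this file describes.

```python
from fnmatch import fnmatchcase
from typing import Callable, Iterable


def decide(
    call: str,
    agent_deny: Iterable[str],
    workspace_deny: Iterable[str],
    agent_allow: Iterable[str],
    workspace_allow: Iterable[str],
    session_denied: Iterable[str] = (),
    matches: Callable[[str, str], bool] = fnmatchcase,
) -> str:
    """Return "deny", "allow", or "prompt" using first-match-wins order."""
    steps = (
        (agent_deny, "deny"),
        (workspace_deny, "deny"),
        (agent_allow, "allow"),
        (workspace_allow, "allow"),
        (session_denied, "deny"),
    )
    for patterns, verdict in steps:
        if any(matches(call, pattern) for pattern in patterns):
            return verdict
    # Nothing matched: fall through to interactive approval.
    return "prompt"
```

Because both deny lists are consulted before either allow list, an agent-level `execute_shell(*)` deny overrides a workspace-level `execute_shell(git:*)` allow.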

Pattern Syntax

Format: command_name(glob_pattern)

Pattern	Description
`read_file({workspace}/**)`	Read any file in workspace (recursive)
`write_to_file({workspace}/*.txt)`	Write only .txt files in workspace root
`execute_shell(python:**)`	Execute Python commands only
`execute_shell(git:*)`	Execute any git command
`web_search(*)`	Allow all web searches

Special tokens:

{workspace} - Replaced with actual workspace path
** - Matches any path including /
* - Matches any characters except /
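
One way to read the three special tokens is as a translation to a regular expression. The sketch below follows the descriptions above literally; `pattern_to_regex` and `permits` are hypothetical helpers, and the real matcher may differ in details such as escaping or case handling.

```python
import re


def pattern_to_regex(pattern: str, workspace: str) -> re.Pattern:
    """Translate command_name(glob_pattern) into a compiled regex."""
    expanded = pattern.replace("{workspace}", workspace)
    parts = []
    i = 0
    while i < len(expanded):
        if expanded.startswith("**", i):
            parts.append(".*")       # ** matches any path, including '/'
            i += 2
        elif expanded[i] == "*":
            parts.append("[^/]*")    # * matches any characters except '/'
            i += 1
        else:
            parts.append(re.escape(expanded[i]))
            i += 1
    return re.compile("".join(parts))


def permits(pattern: str, call: str, workspace: str) -> bool:
    """Check whether the whole call string matches the pattern."""
    return pattern_to_regex(pattern, workspace).fullmatch(call) is not None
```

Under this reading, `write_to_file({workspace}/*.txt)` covers `notes.txt` in the workspace root but not `sub/notes.txt`, which matches the table's "workspace root" wording.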

Interactive Approval Scopes

When prompted for permission, users can choose:

Scope	Effect
Once	Allow this one time only (not saved)
Agent	Always allow for this agent (saves to agent `permissions.yaml`)
Workspace	Always allow for all agents (saves to `autogpt.yaml`)
Deny	Deny this command (saves to appropriate deny list)

Default Security

Out of the box, the following are denied:

Reading sensitive files (.env, .key, .pem)
Destructive shell commands (rm -rf, sudo)
Operations outside the workspace directory
