mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-04-30 03:00:41 -04:00

Nicholas Tindle e33b1e2105 feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797)

## Summary

This PR modernizes AutoGPT Classic to make it more useful for day-to-day
autonomous agent development. Major changes include consolidating the
project structure, adding new prompt strategies, modernizing the
benchmark system, and improving the development experience.

**Note: AutoGPT Classic is an experimental, unsupported project
preserved for educational/historical purposes. Dependencies will not be
actively updated.**

## Changes 🏗️

### Project Structure & Build System
- **Consolidated Poetry projects** - Merged `forge/`,
`original_autogpt/`, and benchmark packages into a single
`pyproject.toml` at `classic/` root
- **Removed old benchmark infrastructure** - Deleted the complex
`agbenchmark` package (3000+ lines) in favor of the new
`direct_benchmark` harness
- **Removed frontend** - Deleted `benchmark/frontend/` React app (no
longer needed)
- **Cleaned up CI workflows** - Simplified GitHub Actions workflows for
the consolidated project structure
- **Added CLAUDE.md** - Documentation for working with the codebase
using Claude Code

### New Direct Benchmark System
- **`direct_benchmark` harness** - New streamlined benchmark runner
with:
  - Rich TUI with multi-panel layout showing parallel test execution
  - Incremental resume and selective reset capabilities
  - CI mode for non-interactive environments
  - Step-level logging with colored prefixes
  - "Would have passed" tracking for timed-out challenges
  - Copy-paste completion blocks for sharing results

### Multiple Prompt Strategies
Added a pluggable prompt strategy system supporting:
- **one_shot** - Single-prompt completion
- **plan_execute** - Plan first, then execute steps
- **rewoo** - Reasoning without observation (deferred tool execution)
- **react** - Reason + Act iterative loop
- **lats** - Language Agent Tree Search (MCTS-based exploration)
- **sub_agent** - Multi-agent delegation architecture
- **debate** - Multi-agent debate for consensus

### LLM Provider Improvements
- Added support for modern **Anthropic Claude models**
(claude-3.5-sonnet, claude-3-haiku, etc.)
- Added **Groq** provider support
- Improved tool call error feedback for LLM self-correction
- Fixed deprecated API usage

### Web Components
- **Replaced Selenium with Playwright** for web browsing (better async
support, faster)
- Added **lightweight web fetch component** for simple URL fetching
- **Modernized web search** with tiered provider system (Tavily, Serper,
Google)

### Agent Capabilities
- **Workspace permissions system** - Pattern-based allow/deny lists for
agent commands
- **Rich interactive selector** for command approval with scopes
(once/agent/workspace/deny)
- **TodoComponent** with LLM-powered task decomposition
- **Platform blocks integration** - Connect to AutoGPT Platform API for
additional blocks
- **Sub-agent architecture** - Agents can spawn and coordinate
sub-agents

### Developer Experience
- **Python 3.12+ support** with CI testing on 3.12, 3.13, 3.14
- **Current working directory as default workspace** - Run `autogpt`
from any project directory
- Simplified log format (removed timestamps)
- Improved configuration and setup flow
- External benchmark adapters for GAIA, SWE-bench, and AgentBench

### Bug Fixes
- Fixed N/A command loop when using native tool calling
- Fixed auto-advancing of plan steps in the Plan-Execute strategy
- Fixed approve+feedback so the command executes first and the feedback is sent afterward
- Fixed parallel tool calls in action history
- Always recreate Docker containers for code execution
- Various pyright type errors resolved
- Linting and formatting issues fixed across codebase

## Test Plan

- [x] CI lint, type, and test checks pass
- [x] Run `poetry install` from `classic/` directory
- [x] Run `poetry run autogpt` and verify CLI starts
- [x] Run `poetry run direct-benchmark run --tests ReadFile` to verify
benchmark works

## Notes

- This is a WIP PR for personal use improvements
- The project is marked as **unsupported** - no active maintenance
planned
- Contains known vulnerabilities in dependencies (intentionally not
updated)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> CI/build workflows are substantially reworked (runner matrix removal,
path/layout changes, new benchmark runner), so breakage is most likely
in automation and packaging rather than runtime behavior.
> 
> **Overview**
> **Modernizes the `classic/` project layout and automation around a
single consolidated Poetry project** (root
`classic/pyproject.toml`/`poetry.lock`) and updates docs
(`classic/README.md`, new `classic/CLAUDE.md`) accordingly.
> 
> **Replaces the old `agbenchmark` CI usage with `direct-benchmark` in
GitHub Actions**, including new/updated benchmark smoke and regression
workflows, standardized `working-directory: classic`, and a move to
**Python 3.12** on Ubuntu-only runners (plus updated caching, coverage
flags, and required `ANTHROPIC_API_KEY` wiring).
> 
> Cleans up repo/dev tooling by removing the classic frontend workflow,
deleting the Forge VCR cassette submodule (`.gitmodules`) and associated
CI steps, consolidating `flake8`/`isort`/`pyright` pre-commit hooks to
run from `classic/`, updating ignores for new report/workspace
artifacts, and updating `classic/Dockerfile.autogpt` to build from
Python 3.12 with the consolidated project structure.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
de67834dac. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>

2026-04-03 07:16:36 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| agbenchmark_config | refactor: AutoGPT Platform Stealth Launch Repo Re-Org (#8113) | 2024-09-20 16:50:43 +02:00 |
| forge | feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797) | 2026-04-03 07:16:36 +00:00 |
| tests | feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797) | 2026-04-03 07:16:36 +00:00 |
| .env.example | feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797) | 2026-04-03 07:16:36 +00:00 |
| .flake8 | refactor: AutoGPT Platform Stealth Launch Repo Re-Org (#8113) | 2024-09-20 16:50:43 +02:00 |
| .gitignore | refactor: AutoGPT Platform Stealth Launch Repo Re-Org (#8113) | 2024-09-20 16:50:43 +02:00 |
| CLAUDE.md | feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797) | 2026-04-03 07:16:36 +00:00 |
| conftest.py | feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797) | 2026-04-03 07:16:36 +00:00 |
| Dockerfile | feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797) | 2026-04-03 07:16:36 +00:00 |
| README.md | feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797) | 2026-04-03 07:16:36 +00:00 |

README.md

AutoGPT Forge

Core autonomous agent framework for building AI agents.

Quick Start

All commands run from the classic/ directory (parent of this directory):

# Install (one-time setup)
cd classic
poetry install

# Configure
cp .env.example .env
# Edit .env with your OPENAI_API_KEY

# Run
poetry run python -m forge

The agent server runs on http://localhost:8000 by default.

Configuration

Environment Variables (`.env`)

# Required
OPENAI_API_KEY=sk-...

# Optional LLM settings
SMART_LLM=gpt-4o                    # Model for complex reasoning
FAST_LLM=gpt-4o-mini                # Model for simple tasks
EMBEDDING_MODEL=text-embedding-3-small

# Optional search providers
TAVILY_API_KEY=tvly-...
SERPER_API_KEY=...
GOOGLE_API_KEY=...
GOOGLE_CUSTOM_SEARCH_ENGINE_ID=...

# Optional infrastructure
LOG_LEVEL=DEBUG                     # DEBUG, INFO, WARNING, ERROR
DATABASE_STRING=sqlite:///agent.db  # Agent Protocol database
PORT=8000                           # Server port
FILE_STORAGE_BACKEND=local          # local, s3, or gcs
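
As a rough illustration of the required/optional split above, the following sketch reads a few of these variables from the environment. `Settings` and `load_settings` are hypothetical names used only for this example and are not Forge's actual configuration code; the only default assumed here is port 8000, which the Quick Start states.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container (illustration only, not Forge's API)."""

    openai_api_key: str
    smart_llm: Optional[str]
    fast_llm: Optional[str]
    port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env

    # OPENAI_API_KEY is the only variable listed as required.
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required but not set")

    return Settings(
        openai_api_key=api_key,
        smart_llm=env.get("SMART_LLM"),      # optional, None when unset
        fast_llm=env.get("FAST_LLM"),        # optional, None when unset
        port=int(env.get("PORT", "8000")),   # server port, 8000 by default
    )
```

Passing an explicit mapping, for example `load_settings({"OPENAI_API_KEY": "sk-..."})`, keeps the function easy to exercise without touching the real environment.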

Workspace Settings (`.autogpt/autogpt.yaml`)

Workspace-wide permissions for all agents:

allow:
  - read_file({workspace}/**)
  - write_to_file({workspace}/**)
  - list_folder({workspace}/**)
  - web_search(*)

deny:
  - read_file(**.env)
  - read_file(**.key)
  - execute_shell(rm -rf:*)
  - execute_shell(sudo:*)

Agent Settings (`.autogpt/agents/{id}/permissions.yaml`)

Agent-specific permission overrides:

allow:
  - execute_python(*)
deny:
  - execute_shell(*)

Workspace Structure

{workspace}/
├── .autogpt/
│   ├── autogpt.yaml              # Workspace permissions
│   ├── ap_server.db              # Agent Protocol database
│   └── agents/
│       └── AutoGPT-{agent_id}/
│           ├── state.json        # Agent state
│           ├── permissions.yaml  # Agent permissions
│           └── workspace/        # Agent's working directory
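
The layout above can be expressed as a small path helper. This is a minimal sketch with `pathlib`; `agent_paths` and the dictionary keys are hypothetical names chosen for the example, and only the file and directory names come from the tree shown.

```python
from pathlib import Path
from typing import Dict


def agent_paths(workspace: str, agent_id: str) -> Dict[str, Path]:
    """Build the paths from the workspace tree for one agent (illustrative helper)."""
    root = Path(workspace) / ".autogpt"
    agent_dir = root / "agents" / f"AutoGPT-{agent_id}"
    return {
        "workspace_permissions": root / "autogpt.yaml",
        "database": root / "ap_server.db",
        "state": agent_dir / "state.json",
        "agent_permissions": agent_dir / "permissions.yaml",
        "agent_workspace": agent_dir / "workspace",
    }
```

For instance, `agent_paths("/tmp/proj", "abc123")["state"]` points at `/tmp/proj/.autogpt/agents/AutoGPT-abc123/state.json`.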

Permissions

Permission checks follow this order (first match wins):

1. Agent deny list → Block
2. Workspace deny list → Block
3. Agent allow list → Allow
4. Workspace allow list → Allow
5. Prompt user → Interactive approval
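
The resolution order above can be sketched as a short function. This is an illustration under stated assumptions, not Forge's implementation: `Decision` and `check_permission` are hypothetical names, each rule set is assumed to be a dict with optional `allow` and `deny` lists, and `fnmatch.fnmatchcase` is only a stand-in matcher (its `*` also matches `/`, unlike the pattern syntax described in this README).

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    PROMPT = "prompt"


def check_permission(call, agent, workspace, matches=fnmatchcase):
    """Resolve a call such as 'execute_shell(ls)' against both rule sets.

    `matches(call, pattern)` is pluggable; the default is a stand-in and does
    not reproduce the real `*` versus `**` distinction.
    """
    # Deny lists are consulted before allow lists; agent rules before workspace rules.
    ordered_rules = [
        (agent.get("deny", []), Decision.BLOCK),
        (workspace.get("deny", []), Decision.BLOCK),
        (agent.get("allow", []), Decision.ALLOW),
        (workspace.get("allow", []), Decision.ALLOW),
    ]
    for patterns, decision in ordered_rules:
        if any(matches(call, pattern) for pattern in patterns):
            return decision  # first match wins
    # Nothing matched: fall back to interactive approval.
    return Decision.PROMPT
```

With an agent that denies `execute_shell(*)` and allows `execute_python(*)`, a shell call is blocked at the first step even if the workspace would allow it, while a call that matches no list at all falls through to the prompt.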

Pattern Syntax

Format: command_name(glob_pattern)

| Pattern | Description |
| --- | --- |
| `read_file({workspace}/**)` | Read any file in workspace |
| `execute_shell(python:**)` | Execute Python commands |
| `web_search(*)` | All web searches |

Special tokens:

- `{workspace}` - Replaced with workspace path
- `**` - Matches any path including /
- `*` - Matches any characters except /
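
One way to read the token rules above is as a translation from a permission pattern to a regular expression. The sketch below applies the three tokens literally; `pattern_to_regex` and `matches` are hypothetical helper names, and the real matcher may handle edge cases (such as `/` inside a search query) differently.

```python
import re


def pattern_to_regex(pattern: str, workspace: str) -> "re.Pattern[str]":
    """Translate 'command_name(glob_pattern)' into a compiled regex (illustrative)."""
    expanded = pattern.replace("{workspace}", workspace)
    parts = []
    i = 0
    while i < len(expanded):
        if expanded.startswith("**", i):
            parts.append(".*")       # ** matches anything, including /
            i += 2
        elif expanded[i] == "*":
            parts.append("[^/]*")    # * matches anything except /
            i += 1
        else:
            parts.append(re.escape(expanded[i]))  # every other character is literal
            i += 1
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern: str, call: str, workspace: str) -> bool:
    """Return True when the whole call string matches the pattern."""
    return pattern_to_regex(pattern, workspace).fullmatch(call) is not None
```

Under this reading, `read_file({workspace}/**)` matches a file in any subdirectory of the workspace, whereas `read_file({workspace}/*)` would only match files directly inside it.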

Tutorials

The tutorial series guides you through building a custom agent.
