mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-04-08 03:00:28 -04:00

Nicholas Tindle e33b1e2105 feat(classic): update classic autogpt a bit to make it more useful for my day to day (#11797)

## Summary

This PR modernizes AutoGPT Classic to make it more useful for day-to-day
autonomous agent development. Major changes include consolidating the
project structure, adding new prompt strategies, modernizing the
benchmark system, and improving the development experience.

**Note: AutoGPT Classic is an experimental, unsupported project
preserved for educational/historical purposes. Dependencies will not be
actively updated.**

## Changes 🏗️

### Project Structure & Build System
- **Consolidated Poetry projects** - Merged `forge/`,
`original_autogpt/`, and benchmark packages into a single
`pyproject.toml` at `classic/` root
- **Removed old benchmark infrastructure** - Deleted the complex
`agbenchmark` package (3000+ lines) in favor of the new
`direct_benchmark` harness
- **Removed frontend** - Deleted `benchmark/frontend/` React app (no
longer needed)
- **Cleaned up CI workflows** - Simplified GitHub Actions workflows for
the consolidated project structure
- **Added CLAUDE.md** - Documentation for working with the codebase
using Claude Code

### New Direct Benchmark System
- **`direct_benchmark` harness** - New streamlined benchmark runner
with:
  - Rich TUI with multi-panel layout showing parallel test execution
  - Incremental resume and selective reset capabilities
  - CI mode for non-interactive environments
  - Step-level logging with colored prefixes
  - "Would have passed" tracking for timed-out challenges
  - Copy-paste completion blocks for sharing results

### Multiple Prompt Strategies
Added pluggable prompt strategy system supporting:
- **one_shot** - Single-prompt completion
- **plan_execute** - Plan first, then execute steps
- **rewoo** - Reasoning without observation (deferred tool execution)
- **react** - Reason + Act iterative loop
- **lats** - Language Agent Tree Search (MCTS-based exploration)
- **sub_agent** - Multi-agent delegation architecture
- **debate** - Multi-agent debate for consensus

### LLM Provider Improvements
- Added support for modern **Anthropic Claude models**
(claude-3.5-sonnet, claude-3-haiku, etc.)
- Added **Groq** provider support
- Improved tool call error feedback for LLM self-correction
- Fixed deprecated API usage

### Web Components
- **Replaced Selenium with Playwright** for web browsing (better async
support, faster)
- Added **lightweight web fetch component** for simple URL fetching
- **Modernized web search** with tiered provider system (Tavily, Serper,
Google)

### Agent Capabilities
- **Workspace permissions system** - Pattern-based allow/deny lists for
agent commands
- **Rich interactive selector** for command approval with scopes
(once/agent/workspace/deny)
- **TodoComponent** with LLM-powered task decomposition
- **Platform blocks integration** - Connect to AutoGPT Platform API for
additional blocks
- **Sub-agent architecture** - Agents can spawn and coordinate
sub-agents

### Developer Experience
- **Python 3.12+ support** with CI testing on 3.12, 3.13, 3.14
- **Current working directory as default workspace** - Run `autogpt`
from any project directory
- Simplified log format (removed timestamps)
- Improved configuration and setup flow
- External benchmark adapters for GAIA, SWE-bench, and AgentBench

### Bug Fixes
- Fixed N/A command loop when using native tool calling
- Fixed auto-advance plan steps in Plan-Execute strategy
- Fixed approve+feedback to execute command then send feedback
- Fixed parallel tool calls in action history
- Always recreate Docker containers for code execution
- Various pyright type errors resolved
- Linting and formatting issues fixed across codebase

## Test Plan

- [x] CI lint, type, and test checks pass
- [x] Run `poetry install` from `classic/` directory
- [x] Run `poetry run autogpt` and verify CLI starts
- [x] Run `poetry run direct-benchmark run --tests ReadFile` to verify
benchmark works

## Notes

- This is a WIP PR for personal use improvements
- The project is marked as **unsupported** - no active maintenance
planned
- Contains known vulnerabilities in dependencies (intentionally not
updated)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> CI/build workflows are substantially reworked (runner matrix removal,
path/layout changes, new benchmark runner), so breakage is most likely
in automation and packaging rather than runtime behavior.
> 
> **Overview**
> **Modernizes the `classic/` project layout and automation around a
single consolidated Poetry project** (root
`classic/pyproject.toml`/`poetry.lock`) and updates docs
(`classic/README.md`, new `classic/CLAUDE.md`) accordingly.
> 
> **Replaces the old `agbenchmark` CI usage with `direct-benchmark` in
GitHub Actions**, including new/updated benchmark smoke and regression
workflows, standardized `working-directory: classic`, and a move to
**Python 3.12** on Ubuntu-only runners (plus updated caching, coverage
flags, and required `ANTHROPIC_API_KEY` wiring).
> 
> Cleans up repo/dev tooling by removing the classic frontend workflow,
deleting the Forge VCR cassette submodule (`.gitmodules`) and associated
CI steps, consolidating `flake8`/`isort`/`pyright` pre-commit hooks to
run from `classic/`, updating ignores for new report/workspace
artifacts, and updating `classic/Dockerfile.autogpt` to build from
Python 3.12 with the consolidated project structure.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
de67834dac. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>

2026-04-03 07:16:36 +00:00

5.3 KiB


AutoGPT Classic

AutoGPT Classic was an experimental project to demonstrate autonomous GPT-4 operation. It was designed to make GPT-4 independently operate and chain together tasks to achieve more complex goals.

Project Status

This project is unsupported, and dependencies will not be updated. It was an experiment that has concluded its initial research phase. If you want to use AutoGPT, you should use the AutoGPT Platform.

For those interested in autonomous AI agents, we recommend exploring more actively maintained alternatives or referring to this codebase for educational purposes only.

Overview

AutoGPT Classic was one of the first implementations of autonomous AI agents: AI systems that can independently:

- Break down complex goals into smaller tasks
- Execute those tasks using available tools and APIs
- Learn from the results and adjust their approach
- Chain multiple actions together to achieve an objective

Structure

classic/
├── pyproject.toml          # Single consolidated Poetry project
├── poetry.lock             # Single lock file
├── forge/                  # Core autonomous agent framework
├── original_autogpt/       # Original implementation
├── direct_benchmark/       # Benchmark harness
└── benchmark/              # Challenge definitions (data)

Getting Started

Prerequisites

- Python 3.12+
- Poetry

Installation

# Clone the repository
git clone https://github.com/Significant-Gravitas/AutoGPT.git
cd AutoGPT/classic

# Install everything
poetry install

Configuration

Configuration uses a layered system:

- Environment variables (`.env` file)
- Workspace settings (`.autogpt/autogpt.yaml`)
- Agent settings (`.autogpt/agents/{id}/permissions.yaml`)

Copy the example environment file and add your API keys:

cp .env.example .env

Key environment variables:

# Required
OPENAI_API_KEY=sk-...

# Optional LLM settings
SMART_LLM=gpt-4o                    # Model for complex reasoning
FAST_LLM=gpt-4o-mini                # Model for simple tasks

# Optional search providers
TAVILY_API_KEY=tvly-...
SERPER_API_KEY=...

# Optional infrastructure
LOG_LEVEL=DEBUG
PORT=8000
FILE_STORAGE_BACKEND=local          # local, s3, or gcs
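
As a rough illustration of how the variables above might be consumed, here is a minimal Python sketch. It is not the project's actual configuration loader: the `EnvSettings` class and `load_env_settings` function are hypothetical names, and treating the example values (`gpt-4o`, `gpt-4o-mini`, `8000`, `local`) as defaults is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for the environment variables listed above."""

    openai_api_key: str
    smart_llm: str
    fast_llm: str
    port: int
    file_storage_backend: str


def load_env_settings(env=None) -> EnvSettings:
    """Read settings from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env

    # OPENAI_API_KEY is the only variable the README marks as required.
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise ValueError("OPENAI_API_KEY is required")

    # The README lists local, s3, and gcs as the storage backends.
    backend = env.get("FILE_STORAGE_BACKEND", "local")
    if backend not in {"local", "s3", "gcs"}:
        raise ValueError(f"unsupported FILE_STORAGE_BACKEND: {backend!r}")

    # Fallback values below are assumptions taken from the example values.
    return EnvSettings(
        openai_api_key=api_key,
        smart_llm=env.get("SMART_LLM", "gpt-4o"),
        fast_llm=env.get("FAST_LLM", "gpt-4o-mini"),
        port=int(env.get("PORT", "8000")),
        file_storage_backend=backend,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to try without touching the real environment.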

Running

All commands run from the classic/ directory:

# Run forge agent
poetry run python -m forge

# Run original autogpt server
poetry run serve --debug

# Run autogpt CLI
poetry run autogpt

Agents run on http://localhost:8000 by default.

Benchmarking

poetry run direct-benchmark run

Testing

poetry run pytest                        # All tests
poetry run pytest forge/tests/           # Forge tests only
poetry run pytest original_autogpt/tests/ # AutoGPT tests only

Workspaces

Agents operate within a workspace directory that contains all agent data and files:

{workspace}/
├── .autogpt/
│   ├── autogpt.yaml              # Workspace-level permissions
│   ├── ap_server.db              # Agent Protocol database (server mode)
│   └── agents/
│       └── AutoGPT-{agent_id}/
│           ├── state.json        # Agent profile, directives, history
│           ├── permissions.yaml  # Agent-specific permissions
│           └── workspace/        # Agent's sandboxed working directory

- The workspace defaults to the current working directory.
- Multiple agents can coexist in the same workspace.
- Agent file access is sandboxed to their `workspace/` subdirectory.
- State persists across sessions via `state.json`.
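
The layout and sandboxing described above can be sketched with `pathlib`. This is an illustrative sketch only, not the project's implementation: `agent_dir` and `resolve_in_sandbox` are hypothetical helpers, and the containment check shown is one common way to enforce a sandbox.

```python
from pathlib import Path


def agent_dir(workspace: Path, agent_id: str) -> Path:
    """Directory holding one agent's state, following the tree shown above."""
    return workspace / ".autogpt" / "agents" / f"AutoGPT-{agent_id}"


def resolve_in_sandbox(workspace: Path, agent_id: str, relative: str) -> Path:
    """Resolve a path inside the agent's workspace/ directory.

    Raises PermissionError if the resolved path would leave the sandbox,
    for example through ".." segments or an absolute path.
    """
    sandbox = (agent_dir(workspace, agent_id) / "workspace").resolve()
    candidate = (sandbox / relative).resolve()
    if not candidate.is_relative_to(sandbox):
        raise PermissionError(f"{relative!r} escapes the agent sandbox")
    return candidate
```

For example, `resolve_in_sandbox(Path.cwd(), "abc", "notes/todo.txt")` returns a path under `.autogpt/agents/AutoGPT-abc/workspace/`, while `"../state.json"` is rejected.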

Permissions

AutoGPT uses a layered permission system with pattern matching:

Permission Files

| File | Scope | Location |
|------|-------|----------|
| `autogpt.yaml` | All agents in workspace | `.autogpt/autogpt.yaml` |
| `permissions.yaml` | Single agent | `.autogpt/agents/{id}/permissions.yaml` |

Permission Format

allow:
  - read_file({workspace}/**)     # Read any file in workspace
  - write_to_file({workspace}/**) # Write any file in workspace
  - web_search(*)                 # All web searches

deny:
  - read_file(**.env)             # Block .env files
  - execute_shell(sudo:*)         # Block sudo commands
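
To make the rule format concrete, the following Python sketch matches a single rule of the form `command(pattern)` against a command call. It is a simplification under stated assumptions, not the project's matcher: `rule_matches` is a hypothetical name, and the standard-library `fnmatch` is used for globbing, where `*` already crosses `/` so `**` behaves the same as `*`. The real implementation may distinguish the two.

```python
import re
from fnmatch import fnmatchcase

# A rule looks like "command_name(argument_pattern)".
_RULE = re.compile(r"^(\w+)\((.*)\)$")


def rule_matches(rule: str, command: str, argument: str, workspace: str) -> bool:
    """Return True if `rule` covers a call to `command` with `argument`."""
    parsed = _RULE.match(rule.strip())
    if parsed is None:
        raise ValueError(f"malformed rule: {rule!r}")
    rule_command, pattern = parsed.groups()
    if rule_command != command:
        return False
    # Expand the {workspace} placeholder before glob matching.
    pattern = pattern.replace("{workspace}", workspace)
    return fnmatchcase(argument, pattern)
```

With a workspace of `/ws`, the rule `read_file({workspace}/**)` covers `read_file` on `/ws/src/main.py` but not on `/etc/passwd`, and `read_file(**.env)` covers `/ws/.env`.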

Check Order (First Match Wins)

1. Agent deny → Block
2. Workspace deny → Block
3. Agent allow → Allow
4. Workspace allow → Allow
5. Prompt user → Interactive approval
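
The check order above can be sketched as a short first-match-wins loop. This is a minimal illustration, not the project's code: `Decision`, `RuleSet`, and `check` are hypothetical names, and for brevity each call is rendered as a single string such as `read_file(/ws/a.txt)` and matched with the standard-library `fnmatch`.

```python
from dataclasses import dataclass, field
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    PROMPT = "prompt"


@dataclass
class RuleSet:
    """Allow and deny patterns loaded from one permission file."""

    allow: list[str] = field(default_factory=list)
    deny: list[str] = field(default_factory=list)


def check(call: str, agent: RuleSet, workspace: RuleSet) -> Decision:
    """Apply the four rule lists in order; the first match decides."""
    steps = [
        (agent.deny, Decision.BLOCK),
        (workspace.deny, Decision.BLOCK),
        (agent.allow, Decision.ALLOW),
        (workspace.allow, Decision.ALLOW),
    ]
    for patterns, decision in steps:
        if any(fnmatchcase(call, pattern) for pattern in patterns):
            return decision
    # Nothing matched: fall through to interactive approval.
    return Decision.PROMPT
```

Because both deny lists are consulted before either allow list, a workspace-level deny overrides an agent-level allow for the same call.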

Interactive Approval

When prompted, users can approve commands with different scopes:

- Once - Allow this one time only
- Agent - Always allow for this agent
- Workspace - Always allow for all agents
- Deny - Block this command

Default Security

Denied by default:

- Sensitive files (`.env`, `.key`, `.pem`)
- Destructive commands (`rm -rf`, `sudo`)
- Operations outside the workspace

Security Notice

This codebase has known vulnerabilities and issues with its dependencies. It will not be updated to new dependencies. Use for educational purposes only.

License

This project segment is licensed under the MIT License - see the LICENSE file for details.

Documentation

Please refer to the documentation for more detailed information about the project's architecture and concepts.
