AutoGPT

mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-02-13 08:14:58 -05:00

Author	SHA1	Message	Date
Nicholas Tindle	791e1d8982	fix(classic): resolve CI lint, type, and test failures - Fix line-too-long in test_permissions.py docstring - Fix type annotation in validators.py (callable -> Callable) - Add --fresh flag to benchmark tests to prevent state resumption - Exclude direct_benchmark/adapters from pyright (optional deps) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-29 14:31:11 -06:00
Nicholas Tindle	0040636948	fix(permissions): update wildcard handling for command patterns	2026-01-26 12:42:21 -06:00
Nicholas Tindle	6f2783468c	feat(classic): add sub-agent architecture and LATS/multi-agent debate strategies Add comprehensive sub-agent spawning infrastructure that enables prompt strategies to coordinate multiple agents for advanced reasoning patterns. New files: - forge/agent/execution_context.py: ExecutionContext, ResourceBudget, SubAgentHandle, and AgentFactory protocol for sub-agent lifecycle - agent_factory/default_factory.py: DefaultAgentFactory implementation - prompt_strategies/lats.py: Language Agent Tree Search using MCTS with sub-agents for action expansion and evaluation - prompt_strategies/multi_agent_debate.py: Multi-agent debate with proposal, critique, and consensus phases Key changes: - BaseMultiStepPromptStrategy gains spawn_sub_agent(), run_sub_agent(), spawn_and_run(), and run_parallel() methods - Agent class accepts optional ExecutionContext and injects it into strategies - Sub-agents enabled by default (enable_sub_agents=True) - Resource limits: max_depth=5, max_sub_agents=25, max_cycles=25 All 7 strategies now available in benchmark: one_shot, rewoo, plan_execute, reflexion, tree_of_thoughts, lats, multi_agent_debate Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-20 01:01:28 -06:00
Nicholas Tindle	ab95077e5b	refactor(forge): remove VCR cassettes, use real API calls with skip for forks - Remove vcrpy and pytest-recording dependencies - Remove tests/vcr/ directory and vcr_cassettes submodule - Remove .gitmodules (only had cassette submodule) - Simplify CI workflow - no more cassette checkout/push/PAT_REVIEW - Tests requiring API keys now skip if not set (fork PRs) - Update CLAUDE.md files to remove cassette references - Fix broken agbenchmark path in pyproject.toml Security improvement: removes need for PAT with cross-repo write access. Fork PRs will have API-dependent tests skipped (GitHub protects secrets). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 22:51:57 -06:00
Nicholas Tindle	44182aff9c	feat(classic): add strategy benchmark test harness for CI - Add test_prompt_strategies.py harness to compare prompt strategies - Add pytest wrapper (test_strategy_benchmark.py) for CI integration - Fix serve command (remove invalid --port flag, use AP_SERVER_PORT env) - Fix test category (interface -> general) - Add aiohttp-retry dependency for agbenchmark - Add pytest markers: slow, integration, requires_agent Usage: poetry run python agbenchmark_config/test_prompt_strategies.py --quick poetry run pytest tests/integration/test_strategy_benchmark.py -v Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-18 23:36:19 -06:00
Nicholas Tindle	864c5a7846	fix(classic): approve+feedback now executes command then sends feedback Previously, when a user selected "Once" or "Always" with feedback (via Tab), the command was NOT executed because UserFeedbackProvided was raised before checking the approval scope. This fix changes the architecture from exception-based to return-value-based. Changes: - Add PermissionCheckResult class with allowed, scope, and feedback fields - Change check_command() to return PermissionCheckResult instead of bool - Update prompt_fn signature to return (ApprovalScope, feedback) tuple - Add pending_user_feedback mechanism to EpisodicActionHistory - Update execute() to handle feedback after successful command execution - Feedback message explicitly states "Command executed successfully" - Add on_auto_approve callback for displaying auto-approved commands - Add comprehensive tests for approval/denial with feedback scenarios Behavior: - Once + feedback → Execute command, then send feedback to agent - Always + feedback → Execute command, save permission, send feedback - Deny + feedback → Don't execute, send feedback to agent Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-18 22:32:43 -06:00
Nicholas Tindle	4c264b7ae9	feat(classic): add TodoComponent with LLM-powered decomposition Add a task management component modeled after Claude Code's TodoWrite: - TodoItem with recursive sub_items for hierarchical task structure - todo_write: atomic list replacement with sub-items support - todo_read: retrieve current todos with nested structure - todo_clear: clear all todos - todo_decompose: use smart LLM to break down tasks into sub-steps Features: - Hierarchical task tracking with independent status per sub-item - MessageProvider shows todos in LLM context with proper indentation - DirectiveProvider adds best practices for task management - Graceful fallback when LLM provider not configured Integrates with: - original_autogpt Agent (full LLM decomposition support) - ForgeAgent (basic task tracking, no decomposition) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-18 18:49:48 -06:00
Swifty	ef7cfbb860	refactor: AutoGPT Platform Stealth Launch Repo Re-Org (#8113) Restructuring the Repo to make it clear the difference between classic autogpt and the autogpt platform: * Move the "classic" projects `autogpt`, `forge`, `frontend`, and `benchmark` into a `classic` folder * Also rename `autogpt` to `original_autogpt` for absolute clarity * Rename `rnd/` to `autogpt_platform/` * `rnd/autogpt_builder` -> `autogpt_platform/frontend` * `rnd/autogpt_server` -> `autogpt_platform/backend` * Adjust any paths accordingly	2024-09-20 16:50:43 +02:00
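
Commit 864c5a7846 above describes moving the approval flow from exception-based to return-value-based, so that "Once" or "Always" with feedback still runs the command. The following is a minimal sketch of that idea, not the repository's actual code. The names `PermissionCheckResult`, `ApprovalScope`, `check_command`, `prompt_fn`, and `execute` come from the commit message. The enum members, function signatures, the `saved` set, the `run` callback, and the wording of the denial message are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Set, Tuple


class ApprovalScope(Enum):
    # Member names are assumed; the commit only mentions "Once", "Always", "Deny".
    ONCE = "once"
    ALWAYS = "always"
    DENY = "deny"


@dataclass
class PermissionCheckResult:
    # Fields named in the commit message: allowed, scope, feedback.
    allowed: bool
    scope: ApprovalScope
    feedback: Optional[str] = None


# Per the commit, prompt_fn returns an (ApprovalScope, feedback) tuple.
PromptFn = Callable[[str], Tuple[ApprovalScope, Optional[str]]]


def check_command(
    command: str, prompt_fn: PromptFn, saved: Set[str]
) -> PermissionCheckResult:
    """Return a result object instead of raising when feedback is given."""
    if command in saved:  # hypothetical store of "Always" approvals
        return PermissionCheckResult(True, ApprovalScope.ALWAYS)
    scope, feedback = prompt_fn(command)
    if scope is ApprovalScope.ALWAYS:
        saved.add(command)
    return PermissionCheckResult(scope is not ApprovalScope.DENY, scope, feedback)


def execute(
    command: str,
    prompt_fn: PromptFn,
    saved: Set[str],
    run: Callable[[str], str],
) -> List[str]:
    """Run the command first when approved, then pass feedback to the agent."""
    result = check_command(command, prompt_fn, saved)
    messages: List[str] = []
    if result.allowed:
        messages.append(run(command))
        if result.feedback:
            messages.append(
                f"Command executed successfully. User feedback: {result.feedback}"
            )
    elif result.feedback:
        # Denied: the command is not executed, only the feedback is forwarded.
        messages.append(f"Command denied. User feedback: {result.feedback}")
    return messages
```

Because the feedback travels in the return value, the caller decides the order of operations: execution happens before the feedback is queued, which is the behavior the commit lists for "Once + feedback" and "Always + feedback".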

8 Commits
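
Commit 4c264b7ae9 mentions a `TodoItem` with recursive `sub_items`, an independent status per sub-item, and todos shown in the LLM context with indentation. A rough sketch of such a structure might look like the following. Only `TodoItem` and `sub_items` are taken from the commit message. The `content` and `status` field names, the status strings, and the `render` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TodoItem:
    content: str                      # assumed field name
    status: str = "pending"           # assumed field name and default value
    sub_items: List["TodoItem"] = field(default_factory=list)


def render(items: List[TodoItem], depth: int = 0) -> List[str]:
    """Flatten a todo tree into indented lines, one per item (hypothetical helper)."""
    lines: List[str] = []
    for item in items:
        lines.append(f"{'  ' * depth}- [{item.status}] {item.content}")
        # Each sub-item carries its own status and is indented one level deeper.
        lines.extend(render(item.sub_items, depth + 1))
    return lines


todos = [
    TodoItem(
        "Ship release",
        "in_progress",
        [TodoItem("Write tests", "completed"), TodoItem("Fix lint")],
    )
]
print("\n".join(render(todos)))
```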