Commit Graph

2 Commits

Author SHA1 Message Date
Nicholas Tindle
44182aff9c feat(classic): add strategy benchmark test harness for CI
- Add test_prompt_strategies.py harness to compare prompt strategies
- Add pytest wrapper (test_strategy_benchmark.py) for CI integration
- Fix serve command (remove invalid --port flag, use AP_SERVER_PORT env)
- Fix test category (interface -> general)
- Add aiohttp-retry dependency for agbenchmark
- Add pytest markers: slow, integration, requires_agent

Usage:
  poetry run python agbenchmark_config/test_prompt_strategies.py --quick
  poetry run pytest tests/integration/test_strategy_benchmark.py -v

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 23:36:19 -06:00
Nicholas Tindle
4c264b7ae9 feat(classic): add TodoComponent with LLM-powered decomposition
Add a task management component modeled after Claude Code's TodoWrite:
- TodoItem with recursive sub_items for hierarchical task structure
- todo_write: atomic list replacement with sub-items support
- todo_read: retrieve current todos with nested structure
- todo_clear: clear all todos
- todo_decompose: use smart LLM to break down tasks into sub-steps

Features:
- Hierarchical task tracking with independent status per sub-item
- MessageProvider shows todos in LLM context with proper indentation
- DirectiveProvider adds best practices for task management
- Graceful fallback when LLM provider not configured

Integrates with:
- original_autogpt Agent (full LLM decomposition support)
- ForgeAgent (basic task tracking, no decomposition)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:49:48 -06:00