mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-04-30 03:00:41 -04:00
Enable agents to execute shell commands during benchmarks by setting execute_local_commands=True and using denylist mode to block dangerous commands (rm, sudo, chmod, kill, etc.) while allowing safe operations. Also adds ExecutePython challenge to test code execution capability. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>