mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-04-08 03:00:28 -04:00
fix(classic): resolve CI lint, type, and test failures
- Fix line-too-long in test_permissions.py docstring - Fix type annotation in validators.py (callable -> Callable) - Add --fresh flag to benchmark tests to prevent state resumption - Exclude direct_benchmark/adapters from pyright (optional deps) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
5
.github/workflows/classic-benchmark-ci.yml
vendored
5
.github/workflows/classic-benchmark-ci.yml
vendored
@@ -66,6 +66,7 @@ jobs:
|
||||
run: |
|
||||
echo "Testing ReadFile challenge with one_shot strategy..."
|
||||
poetry run direct-benchmark run \
|
||||
--fresh \
|
||||
--strategies one_shot \
|
||||
--models claude \
|
||||
--tests ReadFile \
|
||||
@@ -73,6 +74,7 @@ jobs:
|
||||
|
||||
echo "Testing WriteFile challenge..."
|
||||
poetry run direct-benchmark run \
|
||||
--fresh \
|
||||
--strategies one_shot \
|
||||
--models claude \
|
||||
--tests WriteFile \
|
||||
@@ -87,6 +89,7 @@ jobs:
|
||||
run: |
|
||||
echo "Testing coding category..."
|
||||
poetry run direct-benchmark run \
|
||||
--fresh \
|
||||
--strategies one_shot \
|
||||
--models claude \
|
||||
--categories coding \
|
||||
@@ -102,6 +105,7 @@ jobs:
|
||||
run: |
|
||||
echo "Testing multiple strategies..."
|
||||
poetry run direct-benchmark run \
|
||||
--fresh \
|
||||
--strategies one_shot,plan_execute \
|
||||
--models claude \
|
||||
--tests ReadFile \
|
||||
@@ -145,6 +149,7 @@ jobs:
|
||||
run: |
|
||||
echo "Running regression tests (previously beaten challenges)..."
|
||||
poetry run direct-benchmark run \
|
||||
--fresh \
|
||||
--strategies one_shot \
|
||||
--models claude \
|
||||
--maintain \
|
||||
|
||||
Reference in New Issue
Block a user