Mirror of https://github.com/Significant-Gravitas/AutoGPT.git, synced 2026-04-08 03:00:28 -04:00

Compare commits: remove-cla...spare/16 (23 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | 80bfd64ffa | | |
| | 0076ad2a1a | | |
| | edb3d322f0 | | |
| | 9381057079 | | |
| | f21a36ca37 | | |
| | ee5382a064 | | |
| | b80e5ea987 | | |
| | 3d4fcfacb6 | | |
| | 32eac6d52e | | |
| | 9762f4cde7 | | |
| | 76901ba22f | | |
| | 23b65939f3 | | |
| | 1c27eaac53 | | |
| | 923b164794 | | |
| | e86ac21c43 | | |
| | 94224be841 | | |
| | da4bdc7ab9 | | |
| | 7176cecf25 | | |
| | f35210761c | | |
| | 1ebcf85669 | | |
| | ab7c38bda7 | | |
| | b9ce37600e | | |
| | 3921deaef1 | | |
@@ -17,6 +17,14 @@ gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoG
|
||||
gh pr view {N}
|
||||
```
|
||||
|
||||
## Read the PR description
|
||||
|
||||
Understand the **Why / What / How** before addressing comments — you need context to make good fixes:
|
||||
|
||||
```bash
|
||||
gh pr view {N} --json body --jq '.body'
|
||||
```
|
||||
|
||||
## Fetch comments (all sources)
|
||||
|
||||
### 1. Inline review threads — GraphQL (primary source of actionable items)
|
||||
|
||||
@@ -17,6 +17,16 @@ gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoG
|
||||
gh pr view {N}
|
||||
```
|
||||
|
||||
## Read the PR description
|
||||
|
||||
Before reading code, understand the **why**, **what**, and **how** from the PR description:
|
||||
|
||||
```bash
|
||||
gh pr view {N} --json body --jq '.body'
|
||||
```
|
||||
|
||||
Every PR should have a Why / What / How structure. If any of these are missing, note it as feedback.
|
||||
|
||||
## Read the diff
|
||||
|
||||
```bash
|
||||
@@ -34,6 +44,8 @@ gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews
|
||||
|
||||
## What to check
|
||||
|
||||
**Description quality:** Does the PR description cover Why (motivation/problem), What (summary of changes), and How (approach/implementation details)? If any are missing, request them — you can't judge the approach without understanding the problem and intent.
|
||||
|
||||
**Correctness:** logic errors, off-by-one, missing edge cases, race conditions (TOCTOU in file access, credit charging), error handling gaps, async correctness (missing `await`, unclosed resources).
|
||||
|
||||
**Security:** input validation at boundaries, no injection (command, XSS, SQL), secrets not logged, file paths sanitized (`os.path.basename()` in error messages).
|
||||
|
||||
754
.claude/skills/pr-test/SKILL.md
Normal file
@@ -0,0 +1,754 @@
|
||||
---
|
||||
name: pr-test
|
||||
description: "E2E manual testing of PRs/branches using docker compose, agent-browser, and API calls. TRIGGER when user asks to manually test a PR, test a feature end-to-end, or run integration tests against a running system."
|
||||
user-invocable: true
|
||||
argument-hint: "[worktree path or PR number] — tests the PR in the given worktree. Optional flags: --fix (auto-fix issues found)"
|
||||
metadata:
|
||||
author: autogpt-team
|
||||
version: "2.0.0"
|
||||
---
|
||||
|
||||
# Manual E2E Test
|
||||
|
||||
Test a PR/branch end-to-end by building the full platform, interacting via browser and API, capturing screenshots, and reporting results.
|
||||
|
||||
## Critical Requirements
|
||||
|
||||
These are NON-NEGOTIABLE. Every test run MUST satisfy ALL of the following:
|
||||
|
||||
### 1. Screenshots at Every Step
|
||||
- Take a screenshot at EVERY significant test step — not just at the end
|
||||
- Every test scenario MUST have at least one BEFORE and one AFTER screenshot
|
||||
- Name screenshots sequentially: `{NN}-{action}-{state}.png` (e.g., `01-credits-before.png`, `02-credits-after.png`)
|
||||
- If a screenshot is missing for a scenario, the test is INCOMPLETE — go back and take it
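
The sequential naming rule can be kept consistent with a small counter. This is a minimal sketch, not part of the skill: `next_screenshot`, `SHOT_INDEX`, and `SHOT_NAME` are hypothetical names introduced here for illustration.

```bash
# Hypothetical helper: builds "{NN}-{action}-{state}.png" with a zero-padded counter.
# The result is stored in SHOT_NAME (not printed) so the counter survives between calls.
SHOT_INDEX=0
SHOT_NAME=""

next_screenshot() {
  SHOT_INDEX=$((SHOT_INDEX + 1))
  printf -v SHOT_NAME '%02d-%s-%s.png' "$SHOT_INDEX" "$1" "$2"
}

next_screenshot credits before   # SHOT_NAME is now 01-credits-before.png
next_screenshot credits after    # SHOT_NAME is now 02-credits-after.png
```

The name can then be passed to the screenshot command, e.g. `"$RESULTS_DIR/$SHOT_NAME"`.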
|
||||
|
||||
### 2. Screenshots MUST Be Posted to PR
|
||||
- Push ALL screenshots to a temp branch `test-screenshots/pr-{N}`
|
||||
- Post a PR comment with ALL screenshots embedded inline using GitHub raw URLs
|
||||
- This is NOT optional — every test run MUST end with a PR comment containing screenshots
|
||||
- If screenshot upload fails, retry. If it still fails, list failed files and require manual drag-and-drop/paste attachment in the PR comment
|
||||
|
||||
### 3. State Verification with Before/After Evidence
|
||||
- For EVERY state-changing operation (API call, user action), capture the state BEFORE and AFTER
|
||||
- Log the actual API response values (e.g., `credits_before=100, credits_after=95`)
|
||||
- Screenshot MUST show the relevant UI state change
|
||||
- Compare expected vs actual values explicitly — do not just eyeball it
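
One way to make the expected-vs-actual comparison explicit is a small assertion. This is a sketch under the assumption that the values are integers (as in the credits example); `assert_delta` is a hypothetical name.

```bash
# Hypothetical helper: fails unless (before - after) equals the expected delta.
assert_delta() {
  local before=$1 after=$2 expected=$3
  local actual=$((before - after))
  if [ "$actual" -eq "$expected" ]; then
    echo "PASS: delta=$actual (before=$before, after=$after)"
  else
    echo "FAIL: expected delta=$expected, got $actual (before=$before, after=$after)"
    return 1
  fi
}

assert_delta 100 95 5
```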
|
||||
|
||||
### 4. Negative Test Cases Are Mandatory
|
||||
- Test at least ONE negative case per feature (e.g., insufficient credits, invalid input, unauthorized access)
|
||||
- Verify error messages are user-friendly and accurate
|
||||
- Verify the system state did NOT change after a rejected operation
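
A negative case can be checked mechanically along these lines. This is a sketch only: `assert_rejected` is a hypothetical helper, and it assumes rejections surface as 4xx HTTP statuses; the exact status code for a given feature is not specified here, so `402` below is purely illustrative.

```bash
# Hypothetical helper: a rejected request should return 4xx and leave state untouched.
assert_rejected() {
  local status=$1 before=$2 after=$3
  case "$status" in
    4[0-9][0-9]) ;;
    *) echo "FAIL: expected a 4xx status, got $status"; return 1 ;;
  esac
  if [ "$before" != "$after" ]; then
    echo "FAIL: state changed after rejection ($before -> $after)"
    return 1
  fi
  echo "PASS: rejected with $status, state unchanged ($before)"
}

assert_rejected 402 0 0
```

The status can be captured with `curl -s -o /dev/null -w "%{http_code}" ...`, and the before/after values with the same API reads used for positive cases.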
|
||||
|
||||
### 5. Test Report Must Include Full Evidence
|
||||
Each test scenario in the report MUST have:
|
||||
- **Steps**: What was done (exact commands or UI actions)
|
||||
- **Expected**: What should happen
|
||||
- **Actual**: What actually happened
|
||||
- **API Evidence**: Before/after API response values for state-changing operations
|
||||
- **Screenshot Evidence**: Before/after screenshots with explanations
|
||||
|
||||
## State Manipulation for Realistic Testing
|
||||
|
||||
When testing features that depend on specific states (rate limits, credits, quotas):
|
||||
|
||||
1. **Use Redis CLI to set counters directly:**
|
||||
```bash
|
||||
# Find the Redis container
|
||||
REDIS_CONTAINER=$(docker ps --format '{{.Names}}' | grep redis | head -1)
|
||||
# Set a key with expiry
|
||||
docker exec $REDIS_CONTAINER redis-cli SET key value EX ttl
|
||||
# Example: Set rate limit counter to near-limit
|
||||
docker exec $REDIS_CONTAINER redis-cli SET "rate_limit:user:test@test.com" 99 EX 3600
|
||||
# Example: Check current value
|
||||
docker exec $REDIS_CONTAINER redis-cli GET "rate_limit:user:test@test.com"
|
||||
```
|
||||
|
||||
2. **Use API calls to check before/after state:**
|
||||
```bash
|
||||
# BEFORE: Record current state
|
||||
BEFORE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
|
||||
echo "Credits BEFORE: $BEFORE"
|
||||
|
||||
# Perform the action...
|
||||
|
||||
# AFTER: Record new state and compare
|
||||
AFTER=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
|
||||
echo "Credits AFTER: $AFTER"
|
||||
echo "Delta: $(( BEFORE - AFTER ))"
|
||||
```
|
||||
|
||||
3. **Take screenshots BEFORE and AFTER state changes** — the UI must reflect the backend state change
|
||||
|
||||
4. **Never rely on mocked/injected browser state** — always use real backend state. Do NOT use `agent-browser eval` to fake UI state. The backend must be the source of truth.
|
||||
|
||||
5. **Use direct DB queries when needed:**
|
||||
```bash
|
||||
# Query via Supabase's PostgREST or docker exec into the DB
|
||||
docker exec supabase-db psql -U supabase_admin -d postgres -c "SELECT credits FROM user_credits WHERE user_id = '...';"
|
||||
```
|
||||
|
||||
6. **After every API test, verify the state change actually persisted:**
|
||||
```bash
|
||||
# Example: After a credits purchase, verify DB matches API
|
||||
API_CREDITS=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
|
||||
DB_CREDITS=$(docker exec supabase-db psql -U supabase_admin -d postgres -t -c "SELECT credits FROM user_credits WHERE user_id = '...';" | tr -d ' ')
|
||||
[ "$API_CREDITS" = "$DB_CREDITS" ] && echo "CONSISTENT" || echo "MISMATCH: API=$API_CREDITS DB=$DB_CREDITS"
|
||||
```
|
||||
|
||||
## Arguments
|
||||
|
||||
- `$ARGUMENTS` — worktree path (e.g. `$REPO_ROOT`) or PR number
|
||||
- If `--fix` flag is present, auto-fix bugs found and push fixes (like pr-address loop)
|
||||
|
||||
## Step 0: Resolve the target
|
||||
|
||||
```bash
|
||||
# If argument is a PR number, find its worktree
|
||||
gh pr view {N} --json headRefName --jq '.headRefName'
|
||||
# If argument is a path, use it directly
|
||||
```
|
||||
|
||||
Determine:
|
||||
- `REPO_ROOT` — the root repo directory: `git -C "$WORKTREE_PATH" worktree list | head -1 | awk '{print $1}'` (or `git rev-parse --show-toplevel` if not a worktree)
|
||||
- `WORKTREE_PATH` — the worktree directory
|
||||
- `PLATFORM_DIR` — `$WORKTREE_PATH/autogpt_platform`
|
||||
- `BACKEND_DIR` — `$PLATFORM_DIR/backend`
|
||||
- `FRONTEND_DIR` — `$PLATFORM_DIR/frontend`
|
||||
- `PR_NUMBER` — the PR number (from `gh pr list --head $(git branch --show-current)`)
|
||||
- `PR_TITLE` — the PR title, slugified (e.g. "Add copilot permissions" → "add-copilot-permissions")
|
||||
- `RESULTS_DIR` — `$REPO_ROOT/test-results/PR-{PR_NUMBER}-{slugified-title}`
|
||||
|
||||
Create the results directory:
|
||||
```bash
|
||||
PR_NUMBER=$(cd $WORKTREE_PATH && gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT --json number --jq '.[0].number')
|
||||
PR_TITLE=$(cd $WORKTREE_PATH && gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT --json title --jq '.[0].title' | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//;s/-$//' | head -c 50)
|
||||
RESULTS_DIR="$REPO_ROOT/test-results/PR-${PR_NUMBER}-${PR_TITLE}"
|
||||
mkdir -p "$RESULTS_DIR"
|
||||
```
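
The slug pipeline above trims hyphens before truncating to 50 characters, so a cut that lands on a hyphen can leave a trailing `-`. A possible variant, sketched here with a hypothetical function name `slugify`, truncates first and trims afterwards:

```bash
# Sketch: lower-case, replace non-alphanumerics with "-", collapse repeats,
# cut to 50 characters, then trim leading/trailing hyphens.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed 's/[^a-z0-9]/-/g; s/--*/-/g' \
    | cut -c1-50 \
    | sed 's/^-//; s/-$//'
}

slugify "Add copilot permissions"   # prints add-copilot-permissions
```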
|
||||
|
||||
**Test user credentials** (for logging into the UI or verifying results manually):
|
||||
- Email: `test@test.com`
|
||||
- Password: `testtest123`
|
||||
|
||||
## Step 1: Understand the PR
|
||||
|
||||
Before testing, understand what changed:
|
||||
|
||||
```bash
|
||||
cd $WORKTREE_PATH
|
||||
|
||||
# Read PR description to understand the WHY
|
||||
gh pr view {N} --json body --jq '.body'
|
||||
|
||||
git log --oneline dev..HEAD | head -20
|
||||
git diff dev --stat
|
||||
```
|
||||
|
||||
Read the PR description (Why / What / How) and changed files to understand:
|
||||
0. **Why** does this PR exist? What problem does it solve?
|
||||
1. **What** feature/fix does this PR implement?
|
||||
2. **How** does it work? What's the approach?
|
||||
3. What components are affected? (backend, frontend, copilot, executor, etc.)
|
||||
4. What are the key user-facing behaviors to test?
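
A rough check for the Why / What / How structure can be scripted. This is a heuristic sketch, not a real parser: `missing_sections` is a hypothetical helper that only looks for the three words anywhere in the body, so it can report a section as present when the word merely appears in prose.

```bash
# Hypothetical check: reads a PR body on stdin and prints which of Why/What/How are absent.
missing_sections() {
  local body section missing=""
  body=$(cat)
  for section in Why What How; do
    # -w matches whole words only, -i ignores case
    printf '%s\n' "$body" | grep -qiw "$section" || missing="$missing $section"
  done
  printf '%s\n' "${missing# }"
}

printf '## Why\nFix login.\n## What\nPatch auth.\n' | missing_sections   # prints How
```

It can be fed from the command above: `gh pr view {N} --json body --jq '.body' | missing_sections`.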
|
||||
|
||||
## Step 2: Write test scenarios
|
||||
|
||||
Based on the PR analysis, write a test plan to `$RESULTS_DIR/test-plan.md`:
|
||||
|
||||
```markdown
|
||||
# Test Plan: PR #{N} — {title}
|
||||
|
||||
## Scenarios
|
||||
1. [Scenario name] — [what to verify]
|
||||
2. ...
|
||||
|
||||
## API Tests (if applicable)
|
||||
1. [Endpoint] — [expected behavior]
|
||||
- Before state: [what to check before]
|
||||
- After state: [what to verify changed]
|
||||
|
||||
## UI Tests (if applicable)
|
||||
1. [Page/component] — [interaction to test]
|
||||
- Screenshot before: [what to capture]
|
||||
- Screenshot after: [what to capture]
|
||||
|
||||
## Negative Tests (REQUIRED — at least one per feature)
|
||||
1. [What should NOT happen] — [how to trigger it]
|
||||
- Expected error: [what error message/code]
|
||||
- State unchanged: [what to verify did NOT change]
|
||||
```
|
||||
|
||||
**Be critical** — include edge cases, error paths, and security checks. Every scenario MUST specify what screenshots to take and what state to verify.
|
||||
|
||||
## Step 3: Environment setup
|
||||
|
||||
### 3a. Copy .env files from the root worktree
|
||||
|
||||
The root worktree (`$REPO_ROOT`) has the canonical `.env` files with all API keys. Copy them to the target worktree:
|
||||
|
||||
```bash
|
||||
# CRITICAL: .env files are NOT checked into git. They must be copied manually.
|
||||
cp $REPO_ROOT/autogpt_platform/.env $PLATFORM_DIR/.env
|
||||
cp $REPO_ROOT/autogpt_platform/backend/.env $BACKEND_DIR/.env
|
||||
cp $REPO_ROOT/autogpt_platform/frontend/.env $FRONTEND_DIR/.env
|
||||
```
|
||||
|
||||
### 3b. Configure copilot authentication
|
||||
|
||||
The copilot needs an LLM API to function. Two approaches (try subscription first):
|
||||
|
||||
#### Option 1: Subscription mode (preferred — uses your Claude Max/Pro subscription)
|
||||
|
||||
The `claude_agent_sdk` Python package **bundles its own Claude CLI binary** — no need to install `@anthropic-ai/claude-code` via npm. The backend auto-provisions credentials from environment variables on startup.
|
||||
|
||||
Run the helper script to extract tokens from your host and auto-update `backend/.env` (works on macOS, Linux, and Windows/WSL):
|
||||
|
||||
```bash
|
||||
# Extracts OAuth tokens and writes CLAUDE_CODE_OAUTH_TOKEN + CLAUDE_CODE_REFRESH_TOKEN into .env
|
||||
bash $BACKEND_DIR/scripts/refresh_claude_token.sh --env-file $BACKEND_DIR/.env
|
||||
```
|
||||
|
||||
**How it works:** The script reads the OAuth token from:
|
||||
- **macOS**: system keychain (`"Claude Code-credentials"`)
|
||||
- **Linux/WSL**: `~/.claude/.credentials.json`
|
||||
- **Windows**: `%APPDATA%/claude/.credentials.json`
|
||||
|
||||
It sets `CLAUDE_CODE_OAUTH_TOKEN`, `CLAUDE_CODE_REFRESH_TOKEN`, and `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true` in the `.env` file. On container startup, the backend auto-provisions `~/.claude/.credentials.json` inside the container from these env vars. The SDK's bundled CLI then authenticates using that file. No `claude login`, no npm install needed.
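
The per-platform lookup described above could be expressed as a mapping from `uname -s` output to a credential source. This is an illustrative sketch only; the real `refresh_claude_token.sh` may detect platforms differently, and `credentials_source` is a hypothetical name.

```bash
# Hypothetical mapping from `uname -s` output to the credential sources listed above.
credentials_source() {
  case "$1" in
    Darwin) echo 'keychain:Claude Code-credentials' ;;
    Linux) echo "$HOME/.claude/.credentials.json" ;;
    MINGW*|MSYS*|CYGWIN*) echo "${APPDATA:-%APPDATA%}/claude/.credentials.json" ;;
    *) echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

credentials_source "$(uname -s)" || echo "no known credential source"
```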
|
||||
|
||||
**Note:** The OAuth token expires (~24h). If copilot returns auth errors, re-run the script and restart: `$BACKEND_DIR/scripts/refresh_claude_token.sh --env-file $BACKEND_DIR/.env && docker compose up -d copilot_executor`
|
||||
|
||||
#### Option 2: OpenRouter API key mode (fallback)
|
||||
|
||||
If subscription mode doesn't work, switch to API key mode using OpenRouter:
|
||||
|
||||
```bash
|
||||
# In $BACKEND_DIR/.env, ensure these are set:
|
||||
CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=false
|
||||
CHAT_API_KEY=<value of OPEN_ROUTER_API_KEY from the same .env>
|
||||
CHAT_BASE_URL=https://openrouter.ai/api/v1
|
||||
CHAT_USE_CLAUDE_AGENT_SDK=true
|
||||
```
|
||||
|
||||
Use `perl -i -pe` (in-place edit) to update these values:
|
||||
```bash
|
||||
# -f2- keeps the whole value even if the key itself contains '='
ORKEY=$(grep "^OPEN_ROUTER_API_KEY=" $BACKEND_DIR/.env | cut -d= -f2-)
|
||||
[ -n "$ORKEY" ] || { echo "ERROR: OPEN_ROUTER_API_KEY is missing in $BACKEND_DIR/.env"; exit 1; }
|
||||
perl -i -pe 's/CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true/CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=false/' $BACKEND_DIR/.env
|
||||
# Add or update CHAT_API_KEY and CHAT_BASE_URL
|
||||
grep -q "^CHAT_API_KEY=" $BACKEND_DIR/.env && perl -i -pe "s|^CHAT_API_KEY=.*|CHAT_API_KEY=$ORKEY|" $BACKEND_DIR/.env || echo "CHAT_API_KEY=$ORKEY" >> $BACKEND_DIR/.env
|
||||
grep -q "^CHAT_BASE_URL=" $BACKEND_DIR/.env && perl -i -pe 's|^CHAT_BASE_URL=.*|CHAT_BASE_URL=https://openrouter.ai/api/v1|' $BACKEND_DIR/.env || echo "CHAT_BASE_URL=https://openrouter.ai/api/v1" >> $BACKEND_DIR/.env
|
||||
```
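
The add-or-update pattern above can also be written as one reusable function that never interpolates the value into a regex, which avoids surprises when a key contains characters such as `|` or `$`. This is a sketch with a hypothetical name `set_env_var`; note that it moves the updated key to the end of the file.

```bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing any existing assignment.
set_env_var() {
  local file=$1 key=$2 value=$3 tmp
  [ -f "$file" ] || : > "$file"
  tmp=$(mktemp) || return 1
  # Keep every line except an existing assignment of KEY, then append the new one.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Demo against a throwaway file rather than the real .env
demo_env=$(mktemp)
printf 'CHAT_API_KEY=old\nOTHER=1\n' > "$demo_env"
set_env_var "$demo_env" CHAT_API_KEY new-key
set_env_var "$demo_env" CHAT_BASE_URL https://openrouter.ai/api/v1
cat "$demo_env"
rm -f "$demo_env"
```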
|
||||
|
||||
### 3c. Stop conflicting containers
|
||||
|
||||
```bash
|
||||
# Stop any running app containers (keep infra: supabase, redis, rabbitmq, clamav)
|
||||
docker ps --format "{{.Names}}" | grep -E "rest_server|executor|copilot|websocket|database_manager|scheduler|notification|frontend|migrate" | while read name; do
|
||||
docker stop "$name" 2>/dev/null
|
||||
done
|
||||
```
|
||||
|
||||
### 3e. Build and start
|
||||
|
||||
```bash
|
||||
cd $PLATFORM_DIR && docker compose build --no-cache 2>&1 | tail -20
|
||||
if [ ${PIPESTATUS[0]} -ne 0 ]; then echo "ERROR: Docker build failed"; exit 1; fi
|
||||
|
||||
cd $PLATFORM_DIR && docker compose up -d 2>&1 | tail -20
|
||||
if [ ${PIPESTATUS[0]} -ne 0 ]; then echo "ERROR: Docker compose up failed"; exit 1; fi
|
||||
```
|
||||
|
||||
**Note:** The build above passes `--no-cache` to force a full rebuild, because Docker BuildKit may sometimes reuse cached `COPY` layers from a previous build on a different branch. If a container still appears to be running old code (e.g. missing PR changes), confirm the rebuild actually ran with `docker compose build --no-cache`.
|
||||
|
||||
**Expected time: 3-8 minutes** for build, 5-10 minutes with `--no-cache`.
|
||||
|
||||
### 3f. Wait for services to be ready
|
||||
|
||||
```bash
|
||||
# Poll until backend and frontend respond
|
||||
for i in $(seq 1 60); do
|
||||
BACKEND=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8006/docs 2>/dev/null)
|
||||
FRONTEND=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000 2>/dev/null)
|
||||
if [ "$BACKEND" = "200" ] && [ "$FRONTEND" = "200" ]; then
|
||||
echo "Services ready"
|
||||
break
|
||||
fi
|
||||
sleep 5
|
||||
done
|
||||
```
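
The loop above stops polling after 60 attempts but does not report a failure if the services never come up. A generic retry helper that returns a non-zero status on timeout might look like this; `wait_until` is a hypothetical name and the sketch is not part of the skill.

```bash
# Hypothetical helper: retry a command up to N times, sleeping between attempts.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  echo "ERROR: gave up after $attempts attempts: $*" >&2
  return 1
}

wait_until 3 0 true && echo "Services ready"
```

For the backend this could be called as `wait_until 60 5 curl -sf -o /dev/null http://localhost:8006/docs`, aborting the run if it returns non-zero.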
|
||||
|
||||
|
||||
### 3h. Create test user and get auth token
|
||||
|
||||
```bash
|
||||
ANON_KEY=$(grep "NEXT_PUBLIC_SUPABASE_ANON_KEY=" $FRONTEND_DIR/.env | sed 's/.*NEXT_PUBLIC_SUPABASE_ANON_KEY=//' | tr -d '[:space:]')
|
||||
|
||||
# Signup (idempotent — returns "User already registered" if exists)
|
||||
RESULT=$(curl -s -X POST 'http://localhost:8000/auth/v1/signup' \
|
||||
-H "apikey: $ANON_KEY" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"email":"test@test.com","password":"testtest123"}')
|
||||
|
||||
# If "Database error finding user", restart supabase-auth and retry
|
||||
if echo "$RESULT" | grep -q "Database error"; then
|
||||
docker restart supabase-auth && sleep 5
|
||||
curl -s -X POST 'http://localhost:8000/auth/v1/signup' \
|
||||
-H "apikey: $ANON_KEY" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"email":"test@test.com","password":"testtest123"}'
|
||||
fi
|
||||
|
||||
# Get auth token
|
||||
TOKEN=$(curl -s -X POST 'http://localhost:8000/auth/v1/token?grant_type=password' \
|
||||
-H "apikey: $ANON_KEY" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"email":"test@test.com","password":"testtest123"}' | jq -r '.access_token // ""')
|
||||
```
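
Because the `jq` filter falls back to an empty string, a failed login leaves `TOKEN` empty without any error. A small guard can stop the run early; this is a sketch and `require_value` is a hypothetical helper.

```bash
# Hypothetical guard: fail early when a required value (e.g. the auth token) is empty.
require_value() {
  local name=$1 value=$2
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "ERROR: $name is empty; check the signup/login response" >&2
    return 1
  fi
}

require_value TOKEN "example-token" && echo "TOKEN looks usable"
```

In a real run the call would be `require_value TOKEN "$TOKEN" || exit 1`.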
|
||||
|
||||
**Use this token for ALL API calls:**
|
||||
```bash
|
||||
curl -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/...
|
||||
```
|
||||
|
||||
## Step 4: Run tests
|
||||
|
||||
### Service ports reference
|
||||
|
||||
| Service | Port | URL |
|---------|------|-----|
| Frontend | 3000 | http://localhost:3000 |
| Backend REST | 8006 | http://localhost:8006 |
| Supabase Auth (via Kong) | 8000 | http://localhost:8000 |
| Executor | 8002 | http://localhost:8002 |
| Copilot Executor | 8008 | http://localhost:8008 |
| WebSocket | 8001 | http://localhost:8001 |
| Database Manager | 8005 | http://localhost:8005 |
| Redis | 6379 | localhost:6379 |
| RabbitMQ | 5672 | localhost:5672 |
|
||||
|
||||
### API testing
|
||||
|
||||
Use `curl` with the auth token for backend API tests. **For EVERY API call that changes state, record before/after values:**
|
||||
|
||||
```bash
|
||||
# Example: List agents
|
||||
curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/graphs | jq . | head -20
|
||||
|
||||
# Example: Create an agent
|
||||
curl -s -X POST http://localhost:8006/api/graphs \
|
||||
-H "Authorization: Bearer $TOKEN" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{...}' | jq .
|
||||
|
||||
# Example: Run an agent
|
||||
curl -s -X POST "http://localhost:8006/api/graphs/{graph_id}/execute" \
|
||||
-H "Authorization: Bearer $TOKEN" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"data": {...}}'
|
||||
|
||||
# Example: Get execution results
|
||||
curl -s -H "Authorization: Bearer $TOKEN" \
|
||||
"http://localhost:8006/api/graphs/{graph_id}/executions/{exec_id}" | jq .
|
||||
```
|
||||
|
||||
**State verification pattern (use for EVERY state-changing API call):**
|
||||
```bash
|
||||
# 1. Record BEFORE state
|
||||
BEFORE_STATE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/{resource} | jq '{relevant_fields}')
|
||||
echo "BEFORE: $BEFORE_STATE"
|
||||
|
||||
# 2. Perform the action
|
||||
ACTION_RESULT=$(curl -s -X POST ... | jq .)
|
||||
echo "ACTION RESULT: $ACTION_RESULT"
|
||||
|
||||
# 3. Record AFTER state
|
||||
AFTER_STATE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/{resource} | jq '{relevant_fields}')
|
||||
echo "AFTER: $AFTER_STATE"
|
||||
|
||||
# 4. Log the comparison
|
||||
echo "=== STATE CHANGE VERIFICATION ==="
|
||||
echo "Before: $BEFORE_STATE"
|
||||
echo "After: $AFTER_STATE"
|
||||
echo "Expected change: {describe what should have changed}"
|
||||
```
|
||||
|
||||
### Browser testing with agent-browser
|
||||
|
||||
```bash
|
||||
# Close any existing session
|
||||
agent-browser close 2>/dev/null || true
|
||||
|
||||
# Use --session-name to persist cookies across navigations
|
||||
# This means login only needs to happen once per test session
|
||||
agent-browser --session-name pr-test open 'http://localhost:3000/login' --timeout 15000
|
||||
|
||||
# Get interactive elements
|
||||
agent-browser --session-name pr-test snapshot | grep "textbox\|button"
|
||||
|
||||
# Login
|
||||
agent-browser --session-name pr-test fill {email_ref} "test@test.com"
|
||||
agent-browser --session-name pr-test fill {password_ref} "testtest123"
|
||||
agent-browser --session-name pr-test click {login_button_ref}
|
||||
sleep 5
|
||||
|
||||
# Dismiss cookie banner if present
|
||||
agent-browser --session-name pr-test click 'text=Accept All' 2>/dev/null || true
|
||||
|
||||
# Navigate — cookies are preserved so login persists
|
||||
agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --timeout 10000
|
||||
|
||||
# Take screenshot
|
||||
agent-browser --session-name pr-test screenshot $RESULTS_DIR/01-page.png
|
||||
|
||||
# Interact with elements
|
||||
agent-browser --session-name pr-test fill {ref} "text"
|
||||
agent-browser --session-name pr-test press "Enter"
|
||||
agent-browser --session-name pr-test click {ref}
|
||||
agent-browser --session-name pr-test click 'text=Button Text'
|
||||
|
||||
# Read page content
|
||||
agent-browser --session-name pr-test snapshot | grep "text:"
|
||||
```
|
||||
|
||||
**Key pages:**
|
||||
- `/copilot` — CoPilot chat (for testing copilot features)
|
||||
- `/build` — Agent builder (for testing block/node features)
|
||||
- `/build?flowID={id}` — Specific agent in builder
|
||||
- `/library` — Agent library (for testing listing/import features)
|
||||
- `/library/agents/{id}` — Agent detail with run history
|
||||
- `/marketplace` — Marketplace
|
||||
|
||||
### Checking logs
|
||||
|
||||
```bash
|
||||
# Backend REST server
|
||||
docker logs autogpt_platform-rest_server-1 2>&1 | tail -30
|
||||
|
||||
# Executor (runs agent graphs)
|
||||
docker logs autogpt_platform-executor-1 2>&1 | tail -30
|
||||
|
||||
# Copilot executor (runs copilot chat sessions)
|
||||
docker logs autogpt_platform-copilot_executor-1 2>&1 | tail -30
|
||||
|
||||
# Frontend
|
||||
docker logs autogpt_platform-frontend-1 2>&1 | tail -30
|
||||
|
||||
# Filter for errors
|
||||
docker logs autogpt_platform-executor-1 2>&1 | grep -i "error\|exception\|traceback" | tail -20
|
||||
```
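
To record log health as a number in the test report rather than eyeballing the tail, the error filter above can be wrapped in a counter. This is a sketch; `count_error_lines` is a hypothetical helper.

```bash
# Hypothetical helper: count log lines on stdin that look like errors.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_error_lines() {
  grep -ciE 'error|exception|traceback' || true
}

printf 'INFO started\nERROR boom\nTraceback (most recent call last):\n' | count_error_lines   # prints 2
```

Example use: `docker logs autogpt_platform-executor-1 2>&1 | count_error_lines`.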
|
||||
|
||||
### Copilot chat testing
|
||||
|
||||
The copilot uses SSE streaming. To test via API:
|
||||
|
||||
```bash
|
||||
# Create a session
|
||||
SESSION_ID=$(curl -s -X POST 'http://localhost:8006/api/chat/sessions' \
|
||||
-H "Authorization: Bearer $TOKEN" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{}' | jq -r '.id // .session_id // ""')
|
||||
|
||||
# Stream a message (SSE - will stream chunks)
|
||||
curl -N -X POST "http://localhost:8006/api/chat/sessions/$SESSION_ID/stream" \
|
||||
-H "Authorization: Bearer $TOKEN" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"message": "Hello, what can you help me with?"}' \
|
||||
--max-time 60 2>/dev/null | head -50
|
||||
```
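
SSE responses arrive as `data:` lines separated by blank lines, so the raw stream is noisy to read. A minimal filter for the payloads is sketched below; `sse_data` is a hypothetical helper, and the sample payloads are illustrative because the actual chunk format of this endpoint is not documented here.

```bash
# Sketch: print the payload of each SSE "data:" line read from stdin,
# dropping the field name and the optional single space after the colon.
sse_data() {
  sed -n 's/^data: \{0,1\}//p'
}

printf 'event: message\ndata: {"text":"Hi"}\n\ndata: [DONE]\n' | sse_data
```

The streaming `curl -N ...` command above can be piped into `sse_data` in place of `head -50`.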
|
||||
|
||||
Or test via browser (preferred for UI verification):
|
||||
```bash
|
||||
agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --timeout 10000
|
||||
# ... fill chat input and press Enter, wait 20-30s for response
|
||||
```
|
||||
|
||||
## Step 5: Record results and take screenshots
|
||||
|
||||
**Take a screenshot at EVERY significant test step** — before and after interactions, on success, and on failure. This is NON-NEGOTIABLE.
|
||||
|
||||
**Required screenshot pattern for each test scenario:**
|
||||
```bash
|
||||
# BEFORE the action
|
||||
agent-browser --session-name pr-test screenshot $RESULTS_DIR/{NN}-{scenario}-before.png
|
||||
|
||||
# Perform the action...
|
||||
|
||||
# AFTER the action
|
||||
agent-browser --session-name pr-test screenshot $RESULTS_DIR/{NN}-{scenario}-after.png
|
||||
```
|
||||
|
||||
**Naming convention:**
|
||||
```bash
|
||||
# Examples:
|
||||
# $RESULTS_DIR/01-login-page-before.png
|
||||
# $RESULTS_DIR/02-login-page-after.png
|
||||
# $RESULTS_DIR/03-credits-page-before.png
|
||||
# $RESULTS_DIR/04-credits-purchase-after.png
|
||||
# $RESULTS_DIR/05-negative-insufficient-credits.png
|
||||
# $RESULTS_DIR/06-error-state.png
|
||||
```
|
||||
|
||||
**Minimum requirements:**
|
||||
- At least TWO screenshots per test scenario (before + after)
|
||||
- At least ONE screenshot for each negative test case showing the error state
|
||||
- If a test fails, screenshot the failure state AND any error logs visible in the UI
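
The before/after requirement can be verified mechanically before the report is written. This sketch assumes files follow the `{NN}-{scenario}-before.png` / `{NN}-{scenario}-after.png` pattern shown above; `check_screenshot_pairs` is a hypothetical helper.

```bash
# Hypothetical check: every "*-before.png" needs a matching "*-after.png"
# for the same scenario, ignoring the leading two-digit sequence number.
check_screenshot_pairs() {
  local dir=$1 before scenario missing=0
  for before in "$dir"/*-before.png; do
    [ -e "$before" ] || continue          # glob matched nothing
    scenario=${before##*/}                # strip the directory
    scenario=${scenario%-before.png}      # strip the suffix
    scenario=${scenario#[0-9][0-9]-}      # strip the {NN}- prefix
    if ! ls "$dir"/*-"$scenario"-after.png >/dev/null 2>&1; then
      echo "MISSING after screenshot for scenario: $scenario"
      missing=1
    fi
  done
  return "$missing"
}

# Demo in a throwaway directory
demo=$(mktemp -d)
touch "$demo/01-login-before.png" "$demo/02-login-after.png" "$demo/03-credits-before.png"
check_screenshot_pairs "$demo" || echo "Test run is INCOMPLETE"
rm -rf "$demo"
```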
|
||||
|
||||
## Step 6: Show results to user with screenshots
|
||||
|
||||
**CRITICAL: After all tests complete, you MUST show every screenshot to the user using the Read tool, with an explanation of what each screenshot shows.** This is the most important part of the test report — the user needs to visually verify the results.
|
||||
|
||||
For each screenshot:
|
||||
1. Use the `Read` tool to display the PNG file (Claude can read images)
|
||||
2. Write a 1-2 sentence explanation below it describing:
|
||||
- What page/state is being shown
|
||||
- What the screenshot proves (which test scenario it validates)
|
||||
- Any notable details visible in the UI
|
||||
|
||||
Format the output like this:
|
||||
|
||||
```markdown
|
||||
### Screenshot 1: {descriptive title}
|
||||
[Read the PNG file here]
|
||||
|
||||
**What it shows:** {1-2 sentence explanation of what this screenshot proves}
|
||||
|
||||
---
|
||||
```
|
||||
|
||||
After showing all screenshots, output a **detailed** summary table:
|
||||
|
||||
| # | Scenario | Result | API Evidence | Screenshot Evidence |
|---|----------|--------|-------------|-------------------|
| 1 | {name} | PASS/FAIL | Before: X, After: Y | 01-before.png, 02-after.png |
| 2 | ... | ... | ... | ... |
|
||||
|
||||
**IMPORTANT:** As you show each screenshot and record test results, persist them in shell variables for Step 7:
|
||||
|
||||
```bash
|
||||
# Build these variables during Step 6 — they are required by Step 7's script
|
||||
# NOTE: declare -A requires Bash 4.0+. Most Linux systems ship bash 5.x, but macOS bundles
# bash 3.2 as /bin/bash (zsh is the default shell), so use Homebrew bash 5.x there.
# If running on Bash <4, use a plain variable with a lookup function instead.
|
||||
declare -A SCREENSHOT_EXPLANATIONS=(
|
||||
["01-login-page.png"]="Shows the login page loaded successfully with SSO options visible."
|
||||
["02-builder-with-block.png"]="The builder canvas displays the newly added block connected to the trigger."
|
||||
# ... one entry per screenshot, using the same explanations you showed the user above
|
||||
)
|
||||
|
||||
TEST_RESULTS_TABLE="| 1 | Login flow | PASS | N/A | 01-login-before.png, 02-login-after.png |
|
||||
| 2 | Credits purchase | PASS | Before: 100, After: 95 | 03-credits-before.png, 04-credits-after.png |
|
||||
| 3 | Insufficient credits (negative) | PASS | Credits: 0, rejected | 05-insufficient-credits-error.png |"
|
||||
# ... one row per test scenario with actual results
|
||||
```
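
For Bash older than 4.0, the lookup-function fallback mentioned in the comment can be a plain `case` statement. This is a sketch: `screenshot_explanation` is a hypothetical name, and the entries reuse the illustrative file names from the example above.

```bash
# Sketch of the Bash 3.x fallback: a case-based lookup instead of `declare -A`.
screenshot_explanation() {
  case "$1" in
    01-login-page.png) echo "Shows the login page loaded successfully with SSO options visible." ;;
    02-builder-with-block.png) echo "The builder canvas displays the newly added block connected to the trigger." ;;
    *) return 1 ;;   # unknown screenshot: the caller should treat this as a missing explanation
  esac
}

screenshot_explanation "01-login-page.png"
```

A script using this fallback would call `EXPLANATION=$(screenshot_explanation "$BASENAME")` wherever it would otherwise read the associative array.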
|
||||
|
||||
## Step 7: Post test report as PR comment with screenshots
|
||||
|
||||
Upload screenshots to the PR using the GitHub Git API (no local git operations — safe for worktrees), then post a comment with inline images and per-screenshot explanations.
|
||||
|
||||
**This step is MANDATORY. Every test run MUST post a PR comment with screenshots. No exceptions.**
|
||||
|
||||
```bash
|
||||
# Upload screenshots via GitHub Git API (creates blobs, tree, commit, and ref remotely)
|
||||
REPO="Significant-Gravitas/AutoGPT"
|
||||
SCREENSHOTS_BRANCH="test-screenshots/pr-${PR_NUMBER}"
|
||||
SCREENSHOTS_DIR="test-screenshots/PR-${PR_NUMBER}"
|
||||
|
||||
# Step 1: Create blobs for each screenshot and build tree JSON
|
||||
# Retry each blob upload up to 3 times. If still failing, list them at end of report.
|
||||
shopt -s nullglob
|
||||
SCREENSHOT_FILES=("$RESULTS_DIR"/*.png)
|
||||
if [ ${#SCREENSHOT_FILES[@]} -eq 0 ]; then
|
||||
echo "ERROR: No screenshots found in $RESULTS_DIR. Test run is incomplete."
|
||||
exit 1
|
||||
fi
|
||||
TREE_JSON='['
|
||||
FIRST=true
|
||||
FAILED_UPLOADS=()
|
||||
for img in "${SCREENSHOT_FILES[@]}"; do
|
||||
BASENAME=$(basename "$img")
|
||||
B64=$(base64 < "$img")
|
||||
BLOB_SHA=""
|
||||
for attempt in 1 2 3; do
|
||||
BLOB_SHA=$(gh api "repos/${REPO}/git/blobs" -f content="$B64" -f encoding="base64" --jq '.sha' 2>/dev/null || true)
|
||||
[ -n "$BLOB_SHA" ] && break
|
||||
sleep 1
|
||||
done
|
||||
if [ -z "$BLOB_SHA" ]; then
|
||||
FAILED_UPLOADS+=("$img")
|
||||
continue
|
||||
fi
|
||||
if [ "$FIRST" = true ]; then FIRST=false; else TREE_JSON+=','; fi
|
||||
TREE_JSON+="{\"path\":\"${SCREENSHOTS_DIR}/${BASENAME}\",\"mode\":\"100644\",\"type\":\"blob\",\"sha\":\"${BLOB_SHA}\"}"
|
||||
done
|
||||
TREE_JSON+=']'
|
||||
|
||||
# Step 2: Create tree, commit, and branch ref
|
||||
TREE_SHA=$(echo "$TREE_JSON" | jq -c '{tree: .}' | gh api "repos/${REPO}/git/trees" --input - --jq '.sha')
|
||||
COMMIT_SHA=$(gh api "repos/${REPO}/git/commits" \
|
||||
-f message="test: add E2E test screenshots for PR #${PR_NUMBER}" \
|
||||
-f tree="$TREE_SHA" \
|
||||
--jq '.sha')
|
||||
gh api "repos/${REPO}/git/refs" \
|
||||
-f ref="refs/heads/${SCREENSHOTS_BRANCH}" \
|
||||
-f sha="$COMMIT_SHA" 2>/dev/null \
|
||||
|| gh api "repos/${REPO}/git/refs/heads/${SCREENSHOTS_BRANCH}" \
|
||||
-X PATCH -f sha="$COMMIT_SHA" -f force=true
|
||||
```
|
||||
|
||||
Then post the comment with **inline images AND explanations for each screenshot**:
|
||||
|
||||
```bash
|
||||
REPO_URL="https://raw.githubusercontent.com/${REPO}/${SCREENSHOTS_BRANCH}"
|
||||
|
||||
# Build image markdown using uploaded image URLs; skip FAILED_UPLOADS (listed separately)
|
||||
|
||||
IMAGE_MARKDOWN=""
|
||||
for img in "${SCREENSHOT_FILES[@]}"; do
|
||||
BASENAME=$(basename "$img")
|
||||
TITLE=$(echo "${BASENAME%.png}" | sed 's/^[0-9]*-//' | sed 's/-/ /g' | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1)) tolower(substr($i,2))}1')
|
||||
# Skip images that failed to upload — they will be listed at the end
|
||||
IS_FAILED=false
|
||||
for failed in "${FAILED_UPLOADS[@]}"; do
|
||||
[ "$(basename "$failed")" = "$BASENAME" ] && IS_FAILED=true && break
|
||||
done
|
||||
if [ "$IS_FAILED" = true ]; then
|
||||
continue
|
||||
fi
|
||||
EXPLANATION="${SCREENSHOT_EXPLANATIONS[$BASENAME]}"
|
||||
if [ -z "$EXPLANATION" ]; then
|
||||
echo "ERROR: Missing screenshot explanation for $BASENAME. Add it to SCREENSHOT_EXPLANATIONS in Step 6."
|
||||
exit 1
|
||||
fi
|
||||
IMAGE_MARKDOWN="${IMAGE_MARKDOWN}
|
||||
### ${TITLE}
![${BASENAME}](${REPO_URL}/${SCREENSHOTS_DIR}/${BASENAME})
${EXPLANATION}
|
||||
"
|
||||
done
|
||||
|
||||
# Write comment body to file to avoid shell interpretation issues with special characters
|
||||
COMMENT_FILE=$(mktemp)
|
||||
# If any uploads failed, append a section listing them with instructions
|
||||
FAILED_SECTION=""
|
||||
if [ ${#FAILED_UPLOADS[@]} -gt 0 ]; then
|
||||
FAILED_SECTION="
|
||||
## ⚠️ Failed Screenshot Uploads
|
||||
The following screenshots could not be uploaded via the GitHub API after 3 retries.
|
||||
**To add them:** drag-and-drop or paste these files into a PR comment manually:
|
||||
"
|
||||
for failed in "${FAILED_UPLOADS[@]}"; do
|
||||
FAILED_SECTION="${FAILED_SECTION}
|
||||
- \`$(basename "$failed")\` (local path: \`$failed\`)"
|
||||
done
|
||||
FAILED_SECTION="${FAILED_SECTION}
|
||||
|
||||
**Run status:** INCOMPLETE until the files above are manually attached and visible inline in the PR."
|
||||
fi
|
||||
|
||||
cat > "$COMMENT_FILE" <<INNEREOF
|
||||
## E2E Test Report
|
||||
|
||||
| # | Scenario | Result | API Evidence | Screenshot Evidence |
|
||||
|---|----------|--------|-------------|-------------------|
|
||||
${TEST_RESULTS_TABLE}
|
||||
|
||||
${IMAGE_MARKDOWN}
|
||||
${FAILED_SECTION}
|
||||
INNEREOF
|
||||
|
||||
gh api "repos/${REPO}/issues/$PR_NUMBER/comments" -F body=@"$COMMENT_FILE"
|
||||
rm -f "$COMMENT_FILE"
|
||||
```
|
||||
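The `TITLE` line in the loop above packs three transformations into one `sed`/`awk` pipeline: drop the leading numeric prefix, turn hyphens into spaces, and capitalize each word. Below is a minimal Python sketch of the same string logic, useful for checking which heading a given filename will produce. The function name `screenshot_title` is hypothetical and is not part of the skill.

```python
import re


def screenshot_title(filename: str) -> str:
    """Mirror the shell pipeline: strip '.png', drop a leading 'NN-' prefix,
    replace hyphens with spaces, and capitalize each word."""
    # ${BASENAME%.png}: remove a trailing .png if present
    stem = filename[:-4] if filename.endswith(".png") else filename
    # sed 's/^[0-9]*-//': remove leading digits followed by one hyphen
    stem = re.sub(r"^[0-9]*-", "", stem)
    # sed 's/-/ /g' and awk field splitting: hyphens become word separators
    words = stem.replace("-", " ").split()
    # awk: uppercase the first character, lowercase the rest of each word
    return " ".join(w[:1].upper() + w[1:].lower() for w in words)


print(screenshot_title("01-login-page.png"))  # Login Page
```

Note that the capitalization step lowercases everything after the first letter, so an acronym such as `API` in a filename is rendered as `Api` in the heading.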
|
||||
**The PR comment MUST include:**
|
||||
1. A summary table of all scenarios with PASS/FAIL and before/after API evidence
|
||||
2. Every successfully uploaded screenshot rendered inline; any failed uploads listed with manual attachment instructions
|
||||
3. A 1-2 sentence explanation below each screenshot describing what it proves
|
||||
|
||||
This approach uses the GitHub Git API to create blobs, trees, commits, and refs entirely server-side. No local `git checkout` or `git push` — safe for worktrees and won't interfere with the PR branch.
|
||||
|
||||
## Fix mode (--fix flag)
|
||||
|
||||
When `--fix` is present, the standard is HIGHER. Do not just note issues — FIX them immediately.
|
||||
|
||||
### Fix protocol for EVERY issue found (including UX issues):
|
||||
|
||||
1. **Identify** the root cause in the code — read the relevant source files
|
||||
2. **Write a failing test first** (TDD): For backend bugs, write a test marked with `pytest.mark.xfail(reason="...")`. For frontend/Playwright bugs, write a test with `.fixme` annotation. Run it to confirm it fails as expected.
|
||||
3. **Screenshot** the broken state: `agent-browser screenshot $RESULTS_DIR/{NN}-broken-{description}.png`
|
||||
4. **Fix** the code in the worktree
|
||||
5. **Rebuild** ONLY the affected service (not the whole stack):
|
||||
```bash
|
||||
cd $PLATFORM_DIR && docker compose up --build -d {service_name}
|
||||
# e.g., docker compose up --build -d rest_server
|
||||
# e.g., docker compose up --build -d frontend
|
||||
```
|
||||
6. **Wait** for the service to be ready (poll health endpoint)
|
||||
7. **Re-test** the same scenario
|
||||
8. **Screenshot** the fixed state: `agent-browser screenshot $RESULTS_DIR/{NN}-fixed-{description}.png`
|
||||
9. **Remove the xfail/fixme marker** from the test written in step 2, and verify it passes
|
||||
10. **Verify** the fix did not break other scenarios (run a quick smoke test)
|
||||
11. **Commit and push** immediately:
|
||||
```bash
|
||||
cd $WORKTREE_PATH
|
||||
git add -A
|
||||
git commit -m "fix: {description of fix}"
|
||||
git push
|
||||
```
|
||||
12. **Continue** to the next test scenario
|
||||
|
||||
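Step 6 above says to poll a health endpoint but does not show how. The following is a minimal sketch, assuming the rebuilt service exposes an HTTP endpoint that returns a 2xx status once it is ready. The names `http_probe` and `wait_until_healthy`, the default timings, and the URL you pass in are all assumptions, not part of the skill.

```python
import time
import urllib.error
import urllib.request


def http_probe(url: str) -> bool:
    """Return True if a GET to the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or non-2xx response: not ready yet
        return False


def wait_until_healthy(url, timeout=120.0, interval=2.0, probe=http_probe) -> bool:
    """Call the probe repeatedly until it succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Call `wait_until_healthy(...)` with the health URL of the service you rebuilt, and only re-test the scenario when it returns `True`. The `probe` parameter is injectable so the loop can be exercised without a running container.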
### Fix loop (like pr-address)
|
||||
|
||||
```text
|
||||
test scenario → find issue (bug OR UX problem) → screenshot broken state
|
||||
→ fix code → rebuild affected service only → re-test → screenshot fixed state
|
||||
→ verify no regressions → commit + push
|
||||
→ repeat for next scenario
|
||||
→ after ALL scenarios pass, run full re-test to verify everything together
|
||||
```
|
||||
|
||||
**Key differences from non-fix mode:**
|
||||
- UX issues count as bugs — fix them (bad alignment, confusing labels, missing loading states)
|
||||
- Every fix MUST have a before/after screenshot pair proving it works
|
||||
- Commit after EACH fix, not in a batch at the end
|
||||
- The final re-test must produce a clean set of all-passing screenshots
|
||||
|
||||
## Known issues and workarounds
|
||||
|
||||
### Problem: "Database error finding user" on signup
|
||||
**Cause:** Supabase auth service schema cache is stale after migration.
|
||||
**Fix:** `docker restart supabase-auth && sleep 5` then retry signup.
|
||||
|
||||
### Problem: Copilot returns auth errors in subscription mode
|
||||
**Cause:** `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true` but `CLAUDE_CODE_OAUTH_TOKEN` is not set or expired.
|
||||
**Fix:** Re-extract the OAuth token from macOS keychain (see step 3b, Option 1) and recreate the container (`docker compose up -d copilot_executor`). The backend auto-provisions `~/.claude/.credentials.json` from the env var on startup. No `npm install` or `claude login` needed — the SDK bundles its own CLI binary.
|
||||
|
||||
### Problem: agent-browser can't find chromium
|
||||
**Cause:** On current `dev`, the Dockerfile installs system chromium on all architectures (including ARM64). If your branch is behind `dev`, the image may have been built without it.
|
||||
**Fix:** Check if chromium exists: `which chromium || which chromium-browser`. If missing, install it: `apt-get install -y chromium` and set `AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium` in the container environment.
|
||||
|
||||
### Problem: agent-browser selector matches multiple elements
|
||||
**Cause:** `text=X` matches all elements containing that text.
|
||||
**Fix:** Use `agent-browser snapshot` to get specific `ref=eNN` references, then use those: `agent-browser click eNN`.
|
||||
|
||||
### Problem: Frontend shows cookie banner blocking interaction
|
||||
**Fix:** `agent-browser click 'text=Accept All'` before other interactions.
|
||||
|
||||
### Problem: Container loses npm packages after rebuild
|
||||
**Cause:** `docker compose up --build` rebuilds the image, losing runtime installs.
|
||||
**Fix:** Add packages to the Dockerfile instead of installing at runtime.
|
||||
|
||||
### Problem: Services not starting after `docker compose up`
|
||||
**Fix:** Wait and check health: `docker compose ps`. Common cause: migration hasn't finished. Check: `docker logs autogpt_platform-migrate-1 2>&1 | tail -5`. If supabase-db isn't healthy: `docker restart supabase-db && sleep 10`.
|
||||
|
||||
### Problem: Docker uses cached layers with old code (PR changes not visible)
|
||||
**Cause:** `docker compose up --build` reuses cached `COPY` layers from previous builds. If the PR branch changes Python files but the previous build already cached that layer from `dev`, the container runs `dev` code.
|
||||
**Fix:** Always use `docker compose build --no-cache` for the first build of a PR branch. Subsequent rebuilds within the same branch can use `--build`.
|
||||
|
||||
### Problem: `agent-browser open` loses login session
|
||||
**Cause:** Without session persistence, `agent-browser open` starts fresh.
|
||||
**Fix:** Use `--session-name pr-test` on ALL agent-browser commands. This auto-saves/restores cookies and localStorage across navigations. Alternatively, use `agent-browser eval "window.location.href = '...'"` to navigate within the same context.
|
||||
|
||||
### Problem: Supabase auth returns "Database error querying schema"
|
||||
**Cause:** The database schema changed (migration ran) but supabase-auth has a stale schema cache.
|
||||
**Fix:** `docker restart supabase-db && sleep 10 && docker restart supabase-auth && sleep 8`. If user data was lost, re-signup.
|
||||
8
.github/PULL_REQUEST_TEMPLATE.md
vendored
@@ -1,8 +1,12 @@
|
||||
<!-- Clearly explain the need for these changes: -->
|
||||
### Why / What / How
|
||||
|
||||
<!-- Why: Why does this PR exist? What problem does it solve, or what's broken/missing without it? -->
|
||||
<!-- What: What does this PR change? Summarize the changes at a high level. -->
|
||||
<!-- How: How does it work? Describe the approach, key implementation details, or architecture decisions. -->
|
||||
|
||||
### Changes 🏗️
|
||||
|
||||
<!-- Concisely describe all of the changes made in this pull request: -->
|
||||
<!-- List the key changes. Keep it higher level than the diff but specific enough to highlight what's new/modified. -->
|
||||
|
||||
### Checklist 📋
|
||||
|
||||
|
||||
@@ -55,6 +55,7 @@ AutoGPT Platform is a monorepo containing:
|
||||
- Create the PR against the `dev` branch of the repository.
|
||||
- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)
|
||||
- Use conventional commit messages (see below)
|
||||
- **Structure the PR description with Why / What / How** — Why: the motivation (what problem it solves, what's broken/missing without it); What: high-level summary of changes; How: approach, key implementation details, or architecture decisions. Reviewers need all three to judge whether the approach fits the problem.
|
||||
- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description
|
||||
- Always use `--body-file` to pass PR body — avoids shell interpretation of backticks and special characters:
|
||||
```bash
|
||||
|
||||
54
autogpt_platform/autogpt_libs/poetry.lock
generated
@@ -1,4 +1,4 @@
|
||||
# This file is automatically @generated by Poetry 2.1.1 and should not be changed by hand.
|
||||
# This file is automatically @generated by Poetry 2.2.1 and should not be changed by hand.
|
||||
|
||||
[[package]]
|
||||
name = "annotated-doc"
|
||||
@@ -67,7 +67,7 @@ description = "Backport of asyncio.Runner, a context manager that controls event
|
||||
optional = false
|
||||
python-versions = "<3.11,>=3.8"
|
||||
groups = ["dev"]
|
||||
markers = "python_version < \"3.11\""
|
||||
markers = "python_version == \"3.10\""
|
||||
files = [
|
||||
{file = "backports_asyncio_runner-1.2.0-py3-none-any.whl", hash = "sha256:0da0a936a8aeb554eccb426dc55af3ba63bcdc69fa1a600b5bb305413a4477b5"},
|
||||
{file = "backports_asyncio_runner-1.2.0.tar.gz", hash = "sha256:a5aa7b2b7d8f8bfcaa2b57313f70792df84e32a2a746f585213373f900b42162"},
|
||||
@@ -541,7 +541,7 @@ description = "Backport of PEP 654 (exception groups)"
|
||||
optional = false
|
||||
python-versions = ">=3.7"
|
||||
groups = ["main", "dev"]
|
||||
markers = "python_version < \"3.11\""
|
||||
markers = "python_version == \"3.10\""
|
||||
files = [
|
||||
{file = "exceptiongroup-1.3.0-py3-none-any.whl", hash = "sha256:4d111e6e0c13d0644cad6ddaa7ed0261a0b36971f6d23e7ec9b4b9097da78a10"},
|
||||
{file = "exceptiongroup-1.3.0.tar.gz", hash = "sha256:b241f5885f560bc56a59ee63ca4c6a8bfa46ae4ad651af316d4e81817bb9fd88"},
|
||||
@@ -2181,14 +2181,14 @@ testing = ["coverage (>=6.2)", "hypothesis (>=5.7.1)"]
|
||||
|
||||
[[package]]
|
||||
name = "pytest-cov"
|
||||
version = "7.0.0"
|
||||
version = "7.1.0"
|
||||
description = "Pytest plugin for measuring coverage."
|
||||
optional = false
|
||||
python-versions = ">=3.9"
|
||||
groups = ["dev"]
|
||||
files = [
|
||||
{file = "pytest_cov-7.0.0-py3-none-any.whl", hash = "sha256:3b8e9558b16cc1479da72058bdecf8073661c7f57f7d3c5f22a1c23507f2d861"},
|
||||
{file = "pytest_cov-7.0.0.tar.gz", hash = "sha256:33c97eda2e049a0c5298e91f519302a1334c26ac65c1a483d6206fd458361af1"},
|
||||
{file = "pytest_cov-7.1.0-py3-none-any.whl", hash = "sha256:a0461110b7865f9a271aa1b51e516c9a95de9d696734a2f71e3e78f46e1d4678"},
|
||||
{file = "pytest_cov-7.1.0.tar.gz", hash = "sha256:30674f2b5f6351aa09702a9c8c364f6a01c27aae0c1366ae8016160d1efc56b2"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
@@ -2342,30 +2342,30 @@ pyasn1 = ">=0.1.3"
|
||||
|
||||
[[package]]
|
||||
name = "ruff"
|
||||
version = "0.15.0"
|
||||
version = "0.15.7"
|
||||
description = "An extremely fast Python linter and code formatter, written in Rust."
|
||||
optional = false
|
||||
python-versions = ">=3.7"
|
||||
groups = ["dev"]
|
||||
files = [
|
||||
{file = "ruff-0.15.0-py3-none-linux_armv6l.whl", hash = "sha256:aac4ebaa612a82b23d45964586f24ae9bc23ca101919f5590bdb368d74ad5455"},
|
||||
{file = "ruff-0.15.0-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:dcd4be7cc75cfbbca24a98d04d0b9b36a270d0833241f776b788d59f4142b14d"},
|
||||
{file = "ruff-0.15.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:d747e3319b2bce179c7c1eaad3d884dc0a199b5f4d5187620530adf9105268ce"},
|
||||
{file = "ruff-0.15.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:650bd9c56ae03102c51a5e4b554d74d825ff3abe4db22b90fd32d816c2e90621"},
|
||||
{file = "ruff-0.15.0-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:a6664b7eac559e3048223a2da77769c2f92b43a6dfd4720cef42654299a599c9"},
|
||||
{file = "ruff-0.15.0-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6f811f97b0f092b35320d1556f3353bf238763420ade5d9e62ebd2b73f2ff179"},
|
||||
{file = "ruff-0.15.0-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:761ec0a66680fab6454236635a39abaf14198818c8cdf691e036f4bc0f406b2d"},
|
||||
{file = "ruff-0.15.0-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:940f11c2604d317e797b289f4f9f3fa5555ffe4fb574b55ed006c3d9b6f0eb78"},
|
||||
{file = "ruff-0.15.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bcbca3d40558789126da91d7ef9a7c87772ee107033db7191edefa34e2c7f1b4"},
|
||||
{file = "ruff-0.15.0-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:9a121a96db1d75fa3eb39c4539e607f628920dd72ff1f7c5ee4f1b768ac62d6e"},
|
||||
{file = "ruff-0.15.0-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:5298d518e493061f2eabd4abd067c7e4fb89e2f63291c94332e35631c07c3662"},
|
||||
{file = "ruff-0.15.0-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:afb6e603d6375ff0d6b0cee563fa21ab570fd15e65c852cb24922cef25050cf1"},
|
||||
{file = "ruff-0.15.0-py3-none-musllinux_1_2_i686.whl", hash = "sha256:77e515f6b15f828b94dc17d2b4ace334c9ddb7d9468c54b2f9ed2b9c1593ef16"},
|
||||
{file = "ruff-0.15.0-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:6f6e80850a01eb13b3e42ee0ebdf6e4497151b48c35051aab51c101266d187a3"},
|
||||
{file = "ruff-0.15.0-py3-none-win32.whl", hash = "sha256:238a717ef803e501b6d51e0bdd0d2c6e8513fe9eec14002445134d3907cd46c3"},
|
||||
{file = "ruff-0.15.0-py3-none-win_amd64.whl", hash = "sha256:dd5e4d3301dc01de614da3cdffc33d4b1b96fb89e45721f1598e5532ccf78b18"},
|
||||
{file = "ruff-0.15.0-py3-none-win_arm64.whl", hash = "sha256:c480d632cc0ca3f0727acac8b7d053542d9e114a462a145d0b00e7cd658c515a"},
|
||||
{file = "ruff-0.15.0.tar.gz", hash = "sha256:6bdea47cdbea30d40f8f8d7d69c0854ba7c15420ec75a26f463290949d7f7e9a"},
|
||||
{file = "ruff-0.15.7-py3-none-linux_armv6l.whl", hash = "sha256:a81cc5b6910fb7dfc7c32d20652e50fa05963f6e13ead3c5915c41ac5d16668e"},
|
||||
{file = "ruff-0.15.7-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:722d165bd52403f3bdabc0ce9e41fc47070ac56d7a91b4e0d097b516a53a3477"},
|
||||
{file = "ruff-0.15.7-py3-none-macosx_11_0_arm64.whl", hash = "sha256:7fbc2448094262552146cbe1b9643a92f66559d3761f1ad0656d4991491af49e"},
|
||||
{file = "ruff-0.15.7-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6b39329b60eba44156d138275323cc726bbfbddcec3063da57caa8a8b1d50adf"},
|
||||
{file = "ruff-0.15.7-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:87768c151808505f2bfc93ae44e5f9e7c8518943e5074f76ac21558ef5627c85"},
|
||||
{file = "ruff-0.15.7-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:fb0511670002c6c529ec66c0e30641c976c8963de26a113f3a30456b702468b0"},
|
||||
{file = "ruff-0.15.7-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e0d19644f801849229db8345180a71bee5407b429dd217f853ec515e968a6912"},
|
||||
{file = "ruff-0.15.7-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4806d8e09ef5e84eb19ba833d0442f7e300b23fe3f0981cae159a248a10f0036"},
|
||||
{file = "ruff-0.15.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:dce0896488562f09a27b9c91b1f58a097457143931f3c4d519690dea54e624c5"},
|
||||
{file = "ruff-0.15.7-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:1852ce241d2bc89e5dc823e03cff4ce73d816b5c6cdadd27dbfe7b03217d2a12"},
|
||||
{file = "ruff-0.15.7-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:5f3e4b221fb4bd293f79912fc5e93a9063ebd6d0dcbd528f91b89172a9b8436c"},
|
||||
{file = "ruff-0.15.7-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:b15e48602c9c1d9bdc504b472e90b90c97dc7d46c7028011ae67f3861ceba7b4"},
|
||||
{file = "ruff-0.15.7-py3-none-musllinux_1_2_i686.whl", hash = "sha256:1b4705e0e85cedc74b0a23cf6a179dbb3df184cb227761979cc76c0440b5ab0d"},
|
||||
{file = "ruff-0.15.7-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:112c1fa316a558bb34319282c1200a8bf0495f1b735aeb78bfcb2991e6087580"},
|
||||
{file = "ruff-0.15.7-py3-none-win32.whl", hash = "sha256:6d39e2d3505b082323352f733599f28169d12e891f7dd407f2d4f54b4c2886de"},
|
||||
{file = "ruff-0.15.7-py3-none-win_amd64.whl", hash = "sha256:4d53d712ddebcd7dace1bc395367aec12c057aacfe9adbb6d832302575f4d3a1"},
|
||||
{file = "ruff-0.15.7-py3-none-win_arm64.whl", hash = "sha256:18e8d73f1c3fdf27931497972250340f92e8c861722161a9caeb89a58ead6ed2"},
|
||||
{file = "ruff-0.15.7.tar.gz", hash = "sha256:04f1ae61fc20fe0b148617c324d9d009b5f63412c0b16474f3d5f1a1a665f7ac"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -2564,7 +2564,7 @@ description = "A lil' TOML parser"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
groups = ["dev"]
|
||||
markers = "python_version < \"3.11\""
|
||||
markers = "python_version == \"3.10\""
|
||||
files = [
|
||||
{file = "tomli-2.2.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:678e4fa69e4575eb77d103de3df8a895e1591b48e740211bd1067378c69e8249"},
|
||||
{file = "tomli-2.2.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:023aa114dd824ade0100497eb2318602af309e5a55595f76b626d6d9f3b7b0a6"},
|
||||
@@ -2912,4 +2912,4 @@ type = ["pytest-mypy"]
|
||||
[metadata]
|
||||
lock-version = "2.1"
|
||||
python-versions = ">=3.10,<4.0"
|
||||
content-hash = "9619cae908ad38fa2c48016a58bcf4241f6f5793aa0e6cc140276e91c433cbbb"
|
||||
content-hash = "e0936a065565550afed18f6298b7e04e814b44100def7049f1a0d68662624a39"
|
||||
|
||||
@@ -26,8 +26,8 @@ pyright = "^1.1.408"
|
||||
pytest = "^8.4.1"
|
||||
pytest-asyncio = "^1.3.0"
|
||||
pytest-mock = "^3.15.1"
|
||||
pytest-cov = "^7.0.0"
|
||||
ruff = "^0.15.0"
|
||||
pytest-cov = "^7.1.0"
|
||||
ruff = "^0.15.7"
|
||||
|
||||
[build-system]
|
||||
requires = ["poetry-core"]
|
||||
|
||||
@@ -121,36 +121,20 @@ RUN ln -s ../lib/node_modules/npm/bin/npm-cli.js /usr/bin/npm \
|
||||
&& ln -s ../lib/node_modules/npm/bin/npx-cli.js /usr/bin/npx
|
||||
COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries
|
||||
|
||||
# Install agent-browser (Copilot browser tool) + Chromium.
|
||||
# On amd64: install runtime libs + run `agent-browser install` to download
|
||||
# Chrome for Testing (pinned version, tested with Playwright).
|
||||
# On arm64: install system chromium package — Chrome for Testing has no ARM64
|
||||
# binary. AGENT_BROWSER_EXECUTABLE_PATH is set at runtime by the entrypoint
|
||||
# script (below) to redirect agent-browser to the system binary.
|
||||
ARG TARGETARCH
|
||||
# Install agent-browser (Copilot browser tool) using the system chromium package.
|
||||
# Chrome for Testing (the binary agent-browser downloads via `agent-browser install`)
|
||||
# has no ARM64 builds, so we use the distro-packaged chromium instead — verified to
|
||||
# work with agent-browser via Docker tests on arm64; amd64 is validated in CI.
|
||||
# Note: system chromium tracks the Debian package schedule rather than a pinned
|
||||
# Chrome for Testing release. If agent-browser requires a specific Chrome version,
|
||||
# verify compatibility against the chromium package version in the base image.
|
||||
RUN apt-get update \
|
||||
&& if [ "$TARGETARCH" = "arm64" ]; then \
|
||||
apt-get install -y --no-install-recommends chromium fonts-liberation; \
|
||||
else \
|
||||
apt-get install -y --no-install-recommends \
|
||||
libnss3 libnspr4 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 \
|
||||
libdbus-1-3 libxkbcommon0 libatspi2.0-0t64 libxcomposite1 libxdamage1 \
|
||||
libxfixes3 libxrandr2 libgbm1 libasound2t64 libpango-1.0-0 libcairo2 \
|
||||
libx11-6 libx11-xcb1 libxcb1 libxext6 libglib2.0-0t64 \
|
||||
fonts-liberation libfontconfig1; \
|
||||
fi \
|
||||
&& apt-get install -y --no-install-recommends chromium fonts-liberation \
|
||||
&& rm -rf /var/lib/apt/lists/* \
|
||||
&& npm install -g agent-browser \
|
||||
&& ([ "$TARGETARCH" = "arm64" ] || agent-browser install) \
|
||||
&& rm -rf /tmp/* /root/.npm
|
||||
|
||||
# On arm64 the system chromium is at /usr/bin/chromium; set
|
||||
# AGENT_BROWSER_EXECUTABLE_PATH so agent-browser's daemon uses it instead of
|
||||
# Chrome for Testing (which has no ARM64 binary). On amd64 the variable is left
|
||||
# unset so agent-browser uses the Chrome for Testing binary it downloaded above.
|
||||
RUN printf '#!/bin/sh\n[ -x /usr/bin/chromium ] && export AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium\nexec "$@"\n' \
|
||||
> /usr/local/bin/entrypoint.sh \
|
||||
&& chmod +x /usr/local/bin/entrypoint.sh
|
||||
ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
|
||||
|
||||
WORKDIR /app/autogpt_platform/backend
|
||||
|
||||
@@ -173,5 +157,4 @@ RUN POETRY_VIRTUALENVS_CREATE=true POETRY_VIRTUALENVS_IN_PROJECT=true \
|
||||
|
||||
ENV PORT=8000
|
||||
|
||||
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
|
||||
CMD ["rest"]
|
||||
|
||||
@@ -0,0 +1,93 @@
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from backend.data.graph import get_graph_as_admin
|
||||
|
||||
# Shared constants
|
||||
ADMIN_USER_ID = "admin-user-id"
|
||||
CREATOR_USER_ID = "other-creator-id"
|
||||
GRAPH_ID = "test-graph-id"
|
||||
GRAPH_VERSION = 3
|
||||
|
||||
|
||||
def _make_mock_graph(user_id: str = CREATOR_USER_ID) -> MagicMock:
|
||||
graph = MagicMock()
|
||||
graph.userId = user_id
|
||||
graph.id = GRAPH_ID
|
||||
graph.version = GRAPH_VERSION
|
||||
graph.Nodes = []
|
||||
return graph
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_admin_can_access_pending_agent_not_owned() -> None:
|
||||
"""Admin must be able to access a graph they don't own even if it's not
|
||||
APPROVED in the marketplace. This is the core use case: reviewing a
|
||||
submitted-but-pending agent from the admin dashboard."""
|
||||
mock_graph = _make_mock_graph()
|
||||
mock_graph_model = MagicMock(name="GraphModel")
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.data.graph.AgentGraph.prisma",
|
||||
) as mock_prisma,
|
||||
patch(
|
||||
"backend.data.graph.GraphModel.from_db",
|
||||
return_value=mock_graph_model,
|
||||
),
|
||||
):
|
||||
mock_prisma.return_value.find_first = AsyncMock(return_value=mock_graph)
|
||||
|
||||
result = await get_graph_as_admin(
|
||||
graph_id=GRAPH_ID,
|
||||
version=GRAPH_VERSION,
|
||||
user_id=ADMIN_USER_ID,
|
||||
for_export=False,
|
||||
)
|
||||
|
||||
assert (
|
||||
result is not None
|
||||
), "Admin should be able to access a pending agent they don't own"
|
||||
assert result is mock_graph_model
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_admin_download_pending_agent_with_subagents() -> None:
|
||||
"""Admin export (for_export=True) of a pending agent must include
|
||||
sub-graphs. This exercises the full export code path that the Download
|
||||
button uses."""
|
||||
mock_graph = _make_mock_graph()
|
||||
mock_sub_graph = MagicMock(name="SubGraph")
|
||||
mock_graph_model = MagicMock(name="GraphModel")
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.data.graph.AgentGraph.prisma",
|
||||
) as mock_prisma,
|
||||
patch(
|
||||
"backend.data.graph.get_sub_graphs",
|
||||
new_callable=AsyncMock,
|
||||
return_value=[mock_sub_graph],
|
||||
) as mock_get_sub,
|
||||
patch(
|
||||
"backend.data.graph.GraphModel.from_db",
|
||||
return_value=mock_graph_model,
|
||||
) as mock_from_db,
|
||||
):
|
||||
mock_prisma.return_value.find_first = AsyncMock(return_value=mock_graph)
|
||||
|
||||
result = await get_graph_as_admin(
|
||||
graph_id=GRAPH_ID,
|
||||
version=GRAPH_VERSION,
|
||||
user_id=ADMIN_USER_ID,
|
||||
for_export=True,
|
||||
)
|
||||
|
||||
assert result is not None, "Admin export of pending agent must succeed"
|
||||
mock_get_sub.assert_awaited_once_with(mock_graph)
|
||||
mock_from_db.assert_called_once_with(
|
||||
graph=mock_graph,
|
||||
sub_graphs=[mock_sub_graph],
|
||||
for_export=True,
|
||||
)
|
||||
@@ -592,6 +592,11 @@ async def fulfill_checkout(user_id: Annotated[str, Security(get_user_id)]):
|
||||
async def configure_user_auto_top_up(
|
||||
request: AutoTopUpConfig, user_id: Annotated[str, Security(get_user_id)]
|
||||
) -> str:
|
||||
"""Configure auto top-up settings and perform an immediate top-up if needed.
|
||||
|
||||
Raises HTTPException(422) if the request parameters are invalid or if
|
||||
the credit top-up fails.
|
||||
"""
|
||||
if request.threshold < 0:
|
||||
raise HTTPException(status_code=422, detail="Threshold must not be negative")
|
||||
if request.amount < 500 and request.amount != 0:
|
||||
@@ -606,10 +611,20 @@ async def configure_user_auto_top_up(
|
||||
user_credit_model = await get_user_credit_model(user_id)
|
||||
current_balance = await user_credit_model.get_credits(user_id)
|
||||
|
||||
if current_balance < request.threshold:
|
||||
await user_credit_model.top_up_credits(user_id, request.amount)
|
||||
else:
|
||||
await user_credit_model.top_up_credits(user_id, 0)
|
||||
try:
|
||||
if current_balance < request.threshold:
|
||||
await user_credit_model.top_up_credits(user_id, request.amount)
|
||||
else:
|
||||
await user_credit_model.top_up_credits(user_id, 0)
|
||||
except ValueError as e:
|
||||
known_messages = (
|
||||
"must not be negative",
|
||||
"already exists for user",
|
||||
"No payment method found",
|
||||
)
|
||||
if any(msg in str(e) for msg in known_messages):
|
||||
raise HTTPException(status_code=422, detail=str(e))
|
||||
raise
|
||||
|
||||
await set_auto_top_up(
|
||||
user_id, AutoTopUpConfig(threshold=request.threshold, amount=request.amount)
|
||||
|
||||
@@ -188,6 +188,7 @@ async def upload_file(
|
||||
user_id: Annotated[str, fastapi.Security(get_user_id)],
|
||||
file: UploadFile,
|
||||
session_id: str | None = Query(default=None),
|
||||
overwrite: bool = Query(default=False),
|
||||
) -> UploadFileResponse:
|
||||
"""
|
||||
Upload a file to the user's workspace.
|
||||
@@ -248,7 +249,9 @@ async def upload_file(
|
||||
# Write file via WorkspaceManager
|
||||
manager = WorkspaceManager(user_id, workspace.id, session_id)
|
||||
try:
|
||||
workspace_file = await manager.write_file(content, filename)
|
||||
workspace_file = await manager.write_file(
|
||||
content, filename, overwrite=overwrite
|
||||
)
|
||||
except ValueError as e:
|
||||
raise fastapi.HTTPException(status_code=409, detail=str(e)) from e
|
||||
|
||||
|
||||
@@ -210,13 +210,22 @@ instrument_fastapi(
|
||||
def handle_internal_http_error(status_code: int = 500, log_error: bool = True):
|
||||
def handler(request: fastapi.Request, exc: Exception):
|
||||
if log_error:
|
||||
logger.exception(
|
||||
"%s %s failed. Investigate and resolve the underlying issue: %s",
|
||||
request.method,
|
||||
request.url.path,
|
||||
exc,
|
||||
exc_info=exc,
|
||||
)
|
||||
if status_code >= 500:
|
||||
logger.exception(
|
||||
"%s %s failed. Investigate and resolve the underlying issue: %s",
|
||||
request.method,
|
||||
request.url.path,
|
||||
exc,
|
||||
exc_info=exc,
|
||||
)
|
||||
else:
|
||||
logger.warning(
|
||||
"%s %s failed with %d: %s",
|
||||
request.method,
|
||||
request.url.path,
|
||||
status_code,
|
||||
exc,
|
||||
)
|
||||
|
||||
hint = (
|
||||
"Adjust the request and retry."
|
||||
@@ -266,12 +275,10 @@ async def validation_error_handler(
|
||||
|
||||
|
||||
app.add_exception_handler(PrismaError, handle_internal_http_error(500))
|
||||
app.add_exception_handler(
|
||||
FolderAlreadyExistsError, handle_internal_http_error(409, False)
|
||||
)
|
||||
app.add_exception_handler(FolderValidationError, handle_internal_http_error(400, False))
|
||||
app.add_exception_handler(NotFoundError, handle_internal_http_error(404, False))
|
||||
app.add_exception_handler(NotAuthorizedError, handle_internal_http_error(403, False))
|
||||
app.add_exception_handler(FolderAlreadyExistsError, handle_internal_http_error(409))
|
||||
app.add_exception_handler(FolderValidationError, handle_internal_http_error(400))
|
||||
app.add_exception_handler(NotFoundError, handle_internal_http_error(404))
|
||||
app.add_exception_handler(NotAuthorizedError, handle_internal_http_error(403))
|
||||
app.add_exception_handler(RequestValidationError, validation_error_handler)
|
||||
app.add_exception_handler(pydantic.ValidationError, validation_error_handler)
|
||||
app.add_exception_handler(MissingConfigError, handle_internal_http_error(503))
|
||||
|
||||
@@ -15,6 +15,12 @@ from backend.blocks._base import (
|
||||
BlockSchemaInput,
|
||||
BlockSchemaOutput,
|
||||
)
|
||||
from backend.copilot.permissions import (
|
||||
CopilotPermissions,
|
||||
ToolName,
|
||||
all_known_tool_names,
|
||||
validate_block_identifiers,
|
||||
)
|
||||
from backend.data.model import SchemaField
|
||||
|
||||
if TYPE_CHECKING:
|
||||
@@ -96,6 +102,50 @@ class AutoPilotBlock(Block):
|
||||
advanced=True,
|
||||
)
|
||||
|
||||
tools: list[ToolName] = SchemaField(
|
||||
description=(
|
||||
"Tool names to filter. Works with tools_exclude to form an "
|
||||
"allow-list or deny-list. "
|
||||
"Leave empty to apply no tool filter."
|
||||
),
|
||||
default=[],
|
||||
advanced=True,
|
||||
)
|
||||
|
||||
tools_exclude: bool = SchemaField(
|
||||
description=(
|
||||
"Controls how the 'tools' list is interpreted. "
|
||||
"True (default): 'tools' is a deny-list — listed tools are blocked, "
|
||||
"all others are allowed. An empty 'tools' list means allow everything. "
|
||||
"False: 'tools' is an allow-list — only listed tools are permitted."
|
||||
),
|
||||
default=True,
|
||||
advanced=True,
|
||||
)
|
||||
|
||||
blocks: list[str] = SchemaField(
|
||||
description=(
|
||||
"Block identifiers to filter when the copilot uses run_block. "
|
||||
"Each entry can be: a block name (e.g. 'HTTP Request'), "
|
||||
"a full block UUID, or the first 8 hex characters of the UUID "
|
||||
"(e.g. 'c069dc6b'). Works with blocks_exclude. "
|
||||
"Leave empty to apply no block filter."
|
||||
),
|
||||
default=[],
|
||||
advanced=True,
|
||||
)
|
||||
|
||||
blocks_exclude: bool = SchemaField(
|
||||
description=(
|
||||
"Controls how the 'blocks' list is interpreted. "
|
||||
"True (default): 'blocks' is a deny-list — listed blocks are blocked, "
|
||||
"all others are allowed. An empty 'blocks' list means allow everything. "
|
||||
"False: 'blocks' is an allow-list — only listed blocks are permitted."
|
||||
),
|
||||
default=True,
|
||||
advanced=True,
|
||||
)
|
||||
|
||||
# timeout_seconds removed: the SDK manages its own heartbeat-based
|
||||
# timeouts internally; wrapping with asyncio.timeout corrupts the
|
||||
# SDK's internal stream (see service.py CRITICAL comment).
|
||||
@@ -184,7 +234,7 @@ class AutoPilotBlock(Block):
|
||||
|
||||
async def create_session(self, user_id: str) -> str:
|
||||
"""Create a new chat session and return its ID (mockable for tests)."""
|
||||
from backend.copilot.model import create_chat_session
|
||||
from backend.copilot.model import create_chat_session # avoid circular import
|
||||
|
||||
session = await create_chat_session(user_id)
|
||||
return session.session_id
|
||||
@@ -196,6 +246,7 @@ class AutoPilotBlock(Block):
|
||||
session_id: str,
|
||||
max_recursion_depth: int,
|
||||
user_id: str,
|
||||
permissions: "CopilotPermissions | None" = None,
|
||||
) -> tuple[str, list[ToolCallEntry], str, str, TokenUsage]:
|
||||
"""Invoke the copilot and collect all stream results.
|
||||
|
||||
@@ -209,14 +260,21 @@ class AutoPilotBlock(Block):
|
||||
session_id: Chat session to use.
|
||||
max_recursion_depth: Maximum allowed recursion nesting.
|
||||
user_id: Authenticated user ID.
|
||||
permissions: Optional capability filter restricting tools/blocks.
|
||||
|
||||
Returns:
|
||||
A tuple of (response_text, tool_calls, history_json, session_id, usage).
|
||||
"""
|
||||
from backend.copilot.sdk.collect import collect_copilot_response
|
||||
from backend.copilot.sdk.collect import (
|
||||
collect_copilot_response, # avoid circular import
|
||||
)
|
||||
|
||||
tokens = _check_recursion(max_recursion_depth)
|
||||
perm_token = None
|
||||
try:
|
||||
effective_permissions, perm_token = _merge_inherited_permissions(
|
||||
permissions
|
||||
)
|
||||
effective_prompt = prompt
|
||||
if system_context:
|
||||
effective_prompt = f"[System Context: {system_context}]\n\n{prompt}"
|
||||
@@ -225,6 +283,7 @@ class AutoPilotBlock(Block):
|
||||
session_id=session_id,
|
||||
message=effective_prompt,
|
||||
user_id=user_id,
|
||||
permissions=effective_permissions,
|
||||
)
|
||||
|
||||
# Build a lightweight conversation summary from streamed data.
|
||||
@@ -271,6 +330,8 @@ class AutoPilotBlock(Block):
|
||||
)
|
||||
finally:
|
||||
_reset_recursion(tokens)
|
||||
if perm_token is not None:
|
||||
_inherited_permissions.reset(perm_token)
|
||||
|
||||
async def run(
|
||||
self,
|
||||
@@ -295,6 +356,13 @@ class AutoPilotBlock(Block):
|
||||
yield "error", "max_recursion_depth must be at least 1."
|
||||
return
|
||||
|
||||
# Validate and build permissions eagerly — fail before creating a session.
|
||||
permissions = await _build_and_validate_permissions(input_data)
|
||||
if isinstance(permissions, str):
|
||||
# Validation error returned as a string message.
|
||||
yield "error", permissions
|
||||
return
|
||||
|
||||
# Create session eagerly so the user always gets the session_id,
|
||||
# even if the downstream stream fails (avoids orphaned sessions).
|
||||
sid = input_data.session_id
|
||||
@@ -312,6 +380,7 @@ class AutoPilotBlock(Block):
|
||||
session_id=sid,
|
||||
max_recursion_depth=input_data.max_recursion_depth,
|
||||
user_id=execution_context.user_id,
|
||||
permissions=permissions,
|
||||
)
|
||||
|
||||
yield "response", response
|
||||
@@ -374,3 +443,78 @@ def _reset_recursion(
|
||||
    """Restore recursion depth and limit to their previous values."""
    _autopilot_recursion_depth.reset(tokens[0])
    _autopilot_recursion_limit.reset(tokens[1])


# ---------------------------------------------------------------------------
# Permission helpers
# ---------------------------------------------------------------------------

# Inherited permissions from a parent AutoPilotBlock execution.
# This acts as a ceiling: child executions can only be more restrictive.
_inherited_permissions: contextvars.ContextVar["CopilotPermissions | None"] = (
    contextvars.ContextVar("_inherited_permissions", default=None)
)


async def _build_and_validate_permissions(
    input_data: "AutoPilotBlock.Input",
) -> "CopilotPermissions | str":
    """Build a :class:`CopilotPermissions` from block input and validate it.

    Returns a :class:`CopilotPermissions` on success or a human-readable
    error string if validation fails.
    """
    # Tool names are validated by Pydantic via the ToolName Literal type
    # at model construction time — no runtime check needed here.
    # Validate block identifiers against live block registry.
    if input_data.blocks:
        invalid_blocks = await validate_block_identifiers(input_data.blocks)
        if invalid_blocks:
            return (
                f"Unknown block identifier(s) in 'blocks': {invalid_blocks}. "
                "Use find_block to discover valid block names and IDs. "
                "You may also use the first 8 characters of a block UUID."
            )

    return CopilotPermissions(
        tools=list(input_data.tools),
        tools_exclude=input_data.tools_exclude,
        blocks=input_data.blocks,
        blocks_exclude=input_data.blocks_exclude,
    )


def _merge_inherited_permissions(
    permissions: "CopilotPermissions | None",
) -> "tuple[CopilotPermissions | None, contextvars.Token[CopilotPermissions | None] | None]":
    """Merge *permissions* with any inherited parent permissions.

    The merged result is stored back into the contextvar so that any nested
    AutoPilotBlock invocation (sub-agent) inherits the merged ceiling.

    Returns a tuple of (merged_permissions, reset_token). The caller MUST
    reset the contextvar via ``_inherited_permissions.reset(token)`` in a
    ``finally`` block when ``reset_token`` is not None — this prevents
    permission leakage between sequential independent executions in the same
    asyncio task.
    """
    parent = _inherited_permissions.get()

    if permissions is None and parent is None:
        return None, None

    all_tools = all_known_tool_names()

    if permissions is None:
        permissions = CopilotPermissions()  # allow-all; will be narrowed by parent

    merged = (
        permissions.merged_with_parent(parent, all_tools)
        if parent is not None
        else permissions
    )

    # Store merged permissions as the new inherited ceiling for nested calls.
    # Return the token so the caller can restore the previous value in finally.
    token = _inherited_permissions.set(merged)
    return merged, token
|
||||
|
||||
@@ -0,0 +1,265 @@
|
||||
"""Tests for AutoPilotBlock permission fields and validation."""

from __future__ import annotations

from unittest.mock import AsyncMock, MagicMock, patch

import pytest
from pydantic import ValidationError

from backend.blocks.autopilot import (
    AutoPilotBlock,
    _build_and_validate_permissions,
    _inherited_permissions,
    _merge_inherited_permissions,
)
from backend.copilot.permissions import CopilotPermissions, all_known_tool_names
from backend.data.execution import ExecutionContext

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------


def _make_input(**kwargs) -> AutoPilotBlock.Input:
    defaults = {
        "prompt": "Do something",
        "system_context": "",
        "session_id": "",
        "max_recursion_depth": 3,
        "tools": [],
        "tools_exclude": True,
        "blocks": [],
        "blocks_exclude": True,
    }
    defaults.update(kwargs)
    return AutoPilotBlock.Input(**defaults)


# ---------------------------------------------------------------------------
# _build_and_validate_permissions
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
class TestBuildAndValidatePermissions:
    async def test_empty_inputs_returns_empty_permissions(self):
        inp = _make_input()
        result = await _build_and_validate_permissions(inp)
        assert isinstance(result, CopilotPermissions)
        assert result.is_empty()

    async def test_valid_tool_names_accepted(self):
        inp = _make_input(tools=["run_block", "web_fetch"], tools_exclude=True)
        result = await _build_and_validate_permissions(inp)
        assert isinstance(result, CopilotPermissions)
        assert result.tools == ["run_block", "web_fetch"]
        assert result.tools_exclude is True

    async def test_invalid_tool_rejected_by_pydantic(self):
        """Invalid tool names are now caught at Pydantic validation time
        (Literal type), before ``_build_and_validate_permissions`` is called."""
        with pytest.raises(ValidationError, match="not_a_real_tool"):
            _make_input(tools=["not_a_real_tool"])

    async def test_valid_block_name_accepted(self):
        mock_block_cls = MagicMock()
        mock_block_cls.return_value.name = "HTTP Request"
        with patch(
            "backend.blocks.get_blocks",
            return_value={"c069dc6b-c3ed-4c12-b6e5-d47361e64ce6": mock_block_cls},
        ):
            inp = _make_input(blocks=["HTTP Request"], blocks_exclude=True)
            result = await _build_and_validate_permissions(inp)
            assert isinstance(result, CopilotPermissions)
            assert result.blocks == ["HTTP Request"]

    async def test_valid_partial_uuid_accepted(self):
        mock_block_cls = MagicMock()
        mock_block_cls.return_value.name = "HTTP Request"
        with patch(
            "backend.blocks.get_blocks",
            return_value={"c069dc6b-c3ed-4c12-b6e5-d47361e64ce6": mock_block_cls},
        ):
            inp = _make_input(blocks=["c069dc6b"], blocks_exclude=False)
            result = await _build_and_validate_permissions(inp)
            assert isinstance(result, CopilotPermissions)

    async def test_invalid_block_identifier_returns_error(self):
        mock_block_cls = MagicMock()
        mock_block_cls.return_value.name = "HTTP Request"
        with patch(
            "backend.blocks.get_blocks",
            return_value={"c069dc6b-c3ed-4c12-b6e5-d47361e64ce6": mock_block_cls},
        ):
            inp = _make_input(blocks=["totally_fake_block"])
            result = await _build_and_validate_permissions(inp)
            assert isinstance(result, str)
            assert "totally_fake_block" in result
            assert "Unknown block identifier" in result

    async def test_sdk_builtin_tool_names_accepted(self):
        inp = _make_input(tools=["Read", "Task", "WebSearch"], tools_exclude=False)
        result = await _build_and_validate_permissions(inp)
        assert isinstance(result, CopilotPermissions)
        assert not result.tools_exclude

    async def test_empty_blocks_skips_validation(self):
        # Should not call validate_block_identifiers at all when blocks=[].
        with patch(
            "backend.copilot.permissions.validate_block_identifiers"
        ) as mock_validate:
            inp = _make_input(blocks=[])
            await _build_and_validate_permissions(inp)
            mock_validate.assert_not_called()


# ---------------------------------------------------------------------------
# _merge_inherited_permissions
# ---------------------------------------------------------------------------


class TestMergeInheritedPermissions:
    def test_no_permissions_no_parent_returns_none(self):
        merged, token = _merge_inherited_permissions(None)
        assert merged is None
        assert token is None

    def test_permissions_no_parent_returned_unchanged(self):
        perms = CopilotPermissions(tools=["bash_exec"], tools_exclude=True)
        merged, token = _merge_inherited_permissions(perms)
        try:
            assert merged is perms
            assert token is not None
        finally:
            if token is not None:
                _inherited_permissions.reset(token)

    def test_child_narrows_parent(self):
        parent = CopilotPermissions(tools=["bash_exec"], tools_exclude=True)
        # Set parent as inherited
        outer_token = _inherited_permissions.set(parent)
        try:
            child = CopilotPermissions(tools=["web_fetch"], tools_exclude=True)
            merged, inner_token = _merge_inherited_permissions(child)
            try:
                assert merged is not None
                all_t = all_known_tool_names()
                effective = merged.effective_allowed_tools(all_t)
                assert "bash_exec" not in effective
                assert "web_fetch" not in effective
            finally:
                if inner_token is not None:
                    _inherited_permissions.reset(inner_token)
        finally:
            _inherited_permissions.reset(outer_token)

    def test_none_permissions_with_parent_uses_parent(self):
        parent = CopilotPermissions(tools=["bash_exec"], tools_exclude=True)
        outer_token = _inherited_permissions.set(parent)
        try:
            merged, inner_token = _merge_inherited_permissions(None)
            try:
                assert merged is not None
                # Merged should have parent's restrictions
                effective = merged.effective_allowed_tools(all_known_tool_names())
                assert "bash_exec" not in effective
            finally:
                if inner_token is not None:
                    _inherited_permissions.reset(inner_token)
        finally:
            _inherited_permissions.reset(outer_token)

    def test_child_cannot_expand_parent_whitelist(self):
        parent = CopilotPermissions(tools=["run_block"], tools_exclude=False)
        outer_token = _inherited_permissions.set(parent)
        try:
            # Child tries to allow more tools
            child = CopilotPermissions(
                tools=["run_block", "bash_exec"], tools_exclude=False
            )
            merged, inner_token = _merge_inherited_permissions(child)
            try:
                assert merged is not None
                effective = merged.effective_allowed_tools(all_known_tool_names())
                assert "bash_exec" not in effective
                assert "run_block" in effective
            finally:
                if inner_token is not None:
                    _inherited_permissions.reset(inner_token)
        finally:
            _inherited_permissions.reset(outer_token)


# ---------------------------------------------------------------------------
# AutoPilotBlock.run — validation integration
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
class TestAutoPilotBlockRunPermissions:
    async def _collect_outputs(self, block, input_data, user_id="test-user"):
        """Helper to collect all yields from block.run()."""
        ctx = ExecutionContext(
            user_id=user_id,
            graph_id="g1",
            graph_exec_id="ge1",
            node_exec_id="ne1",
            node_id="n1",
        )
        outputs = {}
        async for key, val in block.run(input_data, execution_context=ctx):
            outputs[key] = val
        return outputs

    async def test_invalid_tool_rejected_by_pydantic(self):
        """Invalid tool names are caught at Pydantic validation (Literal type)."""
        with pytest.raises(ValidationError, match="not_a_tool"):
            _make_input(tools=["not_a_tool"])

    async def test_invalid_block_yields_error(self):
        mock_block_cls = MagicMock()
        mock_block_cls.return_value.name = "HTTP Request"
        with patch(
            "backend.blocks.get_blocks",
            return_value={"c069dc6b-c3ed-4c12-b6e5-d47361e64ce6": mock_block_cls},
        ):
            block = AutoPilotBlock()
            inp = _make_input(blocks=["nonexistent_block"])
            outputs = await self._collect_outputs(block, inp)
            assert "error" in outputs
            assert "nonexistent_block" in outputs["error"]

    async def test_empty_prompt_yields_error_before_permission_check(self):
        block = AutoPilotBlock()
        inp = _make_input(prompt=" ", tools=["run_block"])
        outputs = await self._collect_outputs(block, inp)
        assert "error" in outputs
        assert "Prompt cannot be empty" in outputs["error"]

    async def test_valid_permissions_passed_to_execute(self):
        """Permissions are forwarded to execute_copilot when valid."""
        block = AutoPilotBlock()
        captured: dict = {}

        async def fake_execute_copilot(self_inner, **kwargs):
            captured["permissions"] = kwargs.get("permissions")
            return (
                "ok",
                [],
                '[{"role":"user","content":"hi"}]',
                "test-sid",
                {"prompt_tokens": 1, "completion_tokens": 1, "total_tokens": 2},
            )

        with patch.object(
            AutoPilotBlock, "create_session", new=AsyncMock(return_value="test-sid")
        ), patch.object(AutoPilotBlock, "execute_copilot", new=fake_execute_copilot):
            inp = _make_input(tools=["run_block"], tools_exclude=False)
            outputs = await self._collect_outputs(block, inp)

        assert "error" not in outputs
        perms = captured.get("permissions")
        assert isinstance(perms, CopilotPermissions)
        assert perms.tools == ["run_block"]
        assert perms.tools_exclude is False
|
||||
@@ -49,6 +49,9 @@ settings = Settings()
|
||||
logger = TruncatedLogger(logging.getLogger(__name__), "[LLM-Block]")
|
||||
fmt = TextFormatter(autoescape=False)
|
||||
|
||||
# HTTP status codes for user-caused errors that should not be reported to Sentry.
|
||||
USER_ERROR_STATUS_CODES = (401, 403, 429)
|
||||
|
||||
LLMProviderName = Literal[
|
||||
ProviderName.AIML_API,
|
||||
ProviderName.ANTHROPIC,
|
||||
@@ -796,6 +799,19 @@ async def llm_call(
|
||||
)
|
||||
prompt = result.messages
|
||||
|
||||
# Sanitize unpaired surrogates in message content to prevent
|
||||
# UnicodeEncodeError when httpx encodes the JSON request body.
|
||||
for msg in prompt:
|
||||
content = msg.get("content")
|
||||
if isinstance(content, str):
|
||||
try:
|
||||
content.encode("utf-8")
|
||||
except UnicodeEncodeError:
|
||||
logger.warning("Sanitized unpaired surrogates in LLM prompt content")
|
||||
msg["content"] = content.encode("utf-8", errors="surrogatepass").decode(
|
||||
"utf-8", errors="replace"
|
||||
)
|
||||
|
||||
# Calculate available tokens based on context window and input length
|
||||
estimated_input_tokens = estimate_token_count(prompt)
|
||||
model_max_output = llm_model.max_output_tokens or int(2**15)
|
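The surrogate sanitization added in the hunk above can be tried in isolation. The sketch below wraps the same encode/decode round trip in a hypothetical helper, `sanitize_surrogates`, which is not a function from this PR. It assumes the goal is only to make the string encodable as UTF-8, replacing unpaired surrogates with U+FFFD.

```python
def sanitize_surrogates(text: str) -> str:
    """Hypothetical helper mirroring the round trip used in llm_call."""
    try:
        # Fast path: well-formed text encodes without error.
        text.encode("utf-8")
        return text
    except UnicodeEncodeError:
        # "surrogatepass" lets the lone surrogate through as raw bytes;
        # decoding those bytes with "replace" turns them into U+FFFD.
        return text.encode("utf-8", errors="surrogatepass").decode(
            "utf-8", errors="replace"
        )


# Usage: "\ud83d" is half of a surrogate pair and cannot be encoded as-is.
broken = "ok \ud83d broken"
clean = sanitize_surrogates(broken)
clean.encode("utf-8")  # no longer raises
```

Checking with a plain `encode` first keeps the common case cheap and leaves well-formed content untouched.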
||||
@@ -878,65 +894,60 @@ async def llm_call(
|
||||
client = anthropic.AsyncAnthropic(
|
||||
api_key=credentials.api_key.get_secret_value()
|
||||
)
|
||||
try:
|
||||
resp = await client.messages.create(
|
||||
model=llm_model.value,
|
||||
system=sysprompt,
|
||||
messages=messages,
|
||||
max_tokens=max_tokens,
|
||||
tools=an_tools,
|
||||
timeout=600,
|
||||
)
|
||||
resp = await client.messages.create(
|
||||
model=llm_model.value,
|
||||
system=sysprompt,
|
||||
messages=messages,
|
||||
max_tokens=max_tokens,
|
||||
tools=an_tools,
|
||||
timeout=600,
|
||||
)
|
||||
|
||||
if not resp.content:
|
||||
raise ValueError("No content returned from Anthropic.")
|
||||
if not resp.content:
|
||||
raise ValueError("No content returned from Anthropic.")
|
||||
|
||||
tool_calls = None
|
||||
for content_block in resp.content:
|
||||
# Anthropic differs from OpenAI: we need to iterate through
|
||||
# the content blocks to find the tool calls
|
||||
if content_block.type == "tool_use":
|
||||
if tool_calls is None:
|
||||
tool_calls = []
|
||||
tool_calls.append(
|
||||
ToolContentBlock(
|
||||
id=content_block.id,
|
||||
type=content_block.type,
|
||||
function=ToolCall(
|
||||
name=content_block.name,
|
||||
arguments=json.dumps(content_block.input),
|
||||
),
|
||||
)
|
||||
tool_calls = None
|
||||
for content_block in resp.content:
|
||||
# Anthropic differs from OpenAI: we need to iterate through
|
||||
# the content blocks to find the tool calls
|
||||
if content_block.type == "tool_use":
|
||||
if tool_calls is None:
|
||||
tool_calls = []
|
||||
tool_calls.append(
|
||||
ToolContentBlock(
|
||||
id=content_block.id,
|
||||
type=content_block.type,
|
||||
function=ToolCall(
|
||||
name=content_block.name,
|
||||
arguments=json.dumps(content_block.input),
|
||||
),
|
||||
)
|
||||
|
||||
if not tool_calls and resp.stop_reason == "tool_use":
|
||||
logger.warning(
|
||||
f"Tool use stop reason but no tool calls found in content. {resp}"
|
||||
)
|
||||
|
||||
reasoning = None
|
||||
for content_block in resp.content:
|
||||
if hasattr(content_block, "type") and content_block.type == "thinking":
|
||||
reasoning = content_block.thinking
|
||||
break
|
||||
|
||||
return LLMResponse(
|
||||
raw_response=resp,
|
||||
prompt=prompt,
|
||||
response=(
|
||||
resp.content[0].name
|
||||
if isinstance(resp.content[0], anthropic.types.ToolUseBlock)
|
||||
else getattr(resp.content[0], "text", "")
|
||||
),
|
||||
tool_calls=tool_calls,
|
||||
prompt_tokens=resp.usage.input_tokens,
|
||||
completion_tokens=resp.usage.output_tokens,
|
||||
reasoning=reasoning,
|
||||
if not tool_calls and resp.stop_reason == "tool_use":
|
||||
logger.warning(
|
||||
f"Tool use stop reason but no tool calls found in content. {resp}"
|
||||
)
|
||||
except anthropic.APIError as e:
|
||||
error_message = f"Anthropic API error: {str(e)}"
|
||||
logger.error(error_message)
|
||||
raise ValueError(error_message)
|
||||
|
||||
reasoning = None
|
||||
for content_block in resp.content:
|
||||
if hasattr(content_block, "type") and content_block.type == "thinking":
|
||||
reasoning = content_block.thinking
|
||||
break
|
||||
|
||||
return LLMResponse(
|
||||
raw_response=resp,
|
||||
prompt=prompt,
|
||||
response=(
|
||||
resp.content[0].name
|
||||
if isinstance(resp.content[0], anthropic.types.ToolUseBlock)
|
||||
else getattr(resp.content[0], "text", "")
|
||||
),
|
||||
tool_calls=tool_calls,
|
||||
prompt_tokens=resp.usage.input_tokens,
|
||||
completion_tokens=resp.usage.output_tokens,
|
||||
reasoning=reasoning,
|
||||
)
|
||||
elif provider == "groq":
|
||||
if tools:
|
||||
raise ValueError("Groq does not support tools.")
|
||||
@@ -1449,7 +1460,16 @@ class AIStructuredResponseGeneratorBlock(AIBlockBase):
|
||||
yield "prompt", self.prompt
|
||||
return
|
||||
except Exception as e:
|
||||
logger.exception(f"Error calling LLM: {e}")
|
||||
is_user_error = (
|
||||
isinstance(e, (anthropic.APIStatusError, openai.APIStatusError))
|
||||
and e.status_code in USER_ERROR_STATUS_CODES
|
||||
)
|
||||
if is_user_error:
|
||||
logger.warning(f"Error calling LLM: {e}")
|
||||
error_feedback_message = f"Error calling LLM: {e}"
|
||||
break
|
||||
else:
|
||||
logger.exception(f"Error calling LLM: {e}")
|
||||
if (
|
||||
"maximum context length" in str(e).lower()
|
||||
or "token limit" in str(e).lower()
|
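The hunk above downgrades logging for provider errors the user caused (bad key, forbidden, rate limited), so they are not reported to Sentry. A minimal sketch of that classification follows. `FakeAPIStatusError` and `log_level_for` are hypothetical stand-ins so the example runs without the `anthropic` or `openai` packages; the real code checks `anthropic.APIStatusError` and `openai.APIStatusError`.

```python
import logging

# Same tuple as the constant introduced in the diff.
USER_ERROR_STATUS_CODES = (401, 403, 429)


class FakeAPIStatusError(Exception):
    """Hypothetical stand-in for a provider error carrying an HTTP status."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


def log_level_for(exc: Exception) -> int:
    """Return WARNING for user-caused status errors, ERROR otherwise."""
    is_user_error = (
        isinstance(exc, FakeAPIStatusError)
        and exc.status_code in USER_ERROR_STATUS_CODES
    )
    return logging.WARNING if is_user_error else logging.ERROR


# Usage: a rate-limit response is treated as a user error.
rate_limited_level = log_level_for(FakeAPIStatusError("rate limited", 429))
server_error_level = log_level_for(FakeAPIStatusError("server error", 500))
```

The `isinstance` check comes first so that exceptions without a `status_code` attribute never reach the membership test.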
||||
|
||||
@@ -258,9 +258,10 @@ def get_pending_tool_calls(conversation_history: list[Any] | None) -> dict[str,
|
||||
return {call_id: count for call_id, count in pending_calls.items() if count > 0}
|
||||
|
||||
|
||||
class SmartDecisionMakerBlock(Block):
|
||||
class OrchestratorBlock(Block):
|
||||
"""
|
||||
A block that uses a language model to make smart decisions based on a given prompt.
|
||||
A block that uses a language model to orchestrate tool calls, supporting both
|
||||
single-shot and iterative agent mode execution.
|
||||
"""
|
||||
|
||||
class Input(BlockSchemaInput):
|
||||
@@ -401,8 +402,8 @@ class SmartDecisionMakerBlock(Block):
|
||||
description="Uses AI to intelligently decide what tool to use.",
|
||||
categories={BlockCategory.AI},
|
||||
block_type=BlockType.AI,
|
||||
input_schema=SmartDecisionMakerBlock.Input,
|
||||
output_schema=SmartDecisionMakerBlock.Output,
|
||||
input_schema=OrchestratorBlock.Input,
|
||||
output_schema=OrchestratorBlock.Output,
|
||||
test_input={
|
||||
"prompt": "Hello, World!",
|
||||
"credentials": llm.TEST_CREDENTIALS_INPUT,
|
||||
@@ -440,7 +441,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
tool_name = custom_name if custom_name else block.name
|
||||
|
||||
tool_function: dict[str, Any] = {
|
||||
"name": SmartDecisionMakerBlock.cleanup(tool_name),
|
||||
"name": OrchestratorBlock.cleanup(tool_name),
|
||||
"description": block.description,
|
||||
}
|
||||
sink_block_input_schema = block.input_schema
|
||||
@@ -451,7 +452,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
field_name = link.sink_name
|
||||
is_dynamic = is_dynamic_field(field_name)
|
||||
# Clean property key to ensure Anthropic API compatibility for ALL fields
|
||||
clean_field_name = SmartDecisionMakerBlock.cleanup(field_name)
|
||||
clean_field_name = OrchestratorBlock.cleanup(field_name)
|
||||
field_mapping[clean_field_name] = field_name
|
||||
|
||||
if is_dynamic:
|
||||
@@ -485,7 +486,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
field_name = link.sink_name
|
||||
is_dynamic = is_dynamic_field(field_name)
|
||||
# Always use cleaned field name for property key (Anthropic API compliance)
|
||||
clean_field_name = SmartDecisionMakerBlock.cleanup(field_name)
|
||||
clean_field_name = OrchestratorBlock.cleanup(field_name)
|
||||
|
||||
if is_dynamic:
|
||||
base_name = extract_base_field_name(field_name)
|
||||
@@ -542,7 +543,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
tool_name = custom_name if custom_name else sink_graph_meta.name
|
||||
|
||||
tool_function: dict[str, Any] = {
|
||||
"name": SmartDecisionMakerBlock.cleanup(tool_name),
|
||||
"name": OrchestratorBlock.cleanup(tool_name),
|
||||
"description": sink_graph_meta.description,
|
||||
}
|
||||
|
||||
@@ -552,7 +553,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
for link in links:
|
||||
field_name = link.sink_name
|
||||
|
||||
clean_field_name = SmartDecisionMakerBlock.cleanup(field_name)
|
||||
clean_field_name = OrchestratorBlock.cleanup(field_name)
|
||||
field_mapping[clean_field_name] = field_name
|
||||
|
||||
sink_block_input_schema = sink_node.input_default["input_schema"]
|
||||
@@ -618,17 +619,13 @@ class SmartDecisionMakerBlock(Block):
|
||||
raise ValueError(f"Sink node not found: {links[0].sink_id}")
|
||||
|
||||
if sink_node.block_id == AgentExecutorBlock().id:
|
||||
tool_func = (
|
||||
await SmartDecisionMakerBlock._create_agent_function_signature(
|
||||
sink_node, links
|
||||
)
|
||||
tool_func = await OrchestratorBlock._create_agent_function_signature(
|
||||
sink_node, links
|
||||
)
|
||||
return_tool_functions.append(tool_func)
|
||||
else:
|
||||
tool_func = (
|
||||
await SmartDecisionMakerBlock._create_block_function_signature(
|
||||
sink_node, links
|
||||
)
|
||||
tool_func = await OrchestratorBlock._create_block_function_signature(
|
||||
sink_node, links
|
||||
)
|
||||
return_tool_functions.append(tool_func)
|
||||
|
||||
@@ -908,7 +905,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
task=node_exec_future,
|
||||
)
|
||||
|
||||
# Execute the node directly since we're in the SmartDecisionMaker context
|
||||
# Execute the node directly since we're in the Orchestrator context
|
||||
node_exec_future.set_result(
|
||||
await execution_processor.on_node_execution(
|
||||
node_exec=node_exec_entry,
|
||||
@@ -934,7 +931,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Tool execution with manager failed: {e}")
|
||||
logger.warning(f"Tool execution with manager failed: {e}")
|
||||
# Return error response
|
||||
return _create_tool_response(
|
||||
tool_call.id,
|
||||
@@ -1112,7 +1109,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
return
|
||||
elif input_data.last_tool_output:
|
||||
logger.error(
|
||||
f"[SmartDecisionMakerBlock-node_exec_id={node_exec_id}] "
|
||||
f"[OrchestratorBlock-node_exec_id={node_exec_id}] "
|
||||
f"No pending tool calls found. This may indicate an issue with the "
|
||||
f"conversation history, or the tool giving a response more than once. "
|
||||
f"This should not happen! Please check the conversation history for any inconsistencies."
|
||||
@@ -1249,7 +1246,7 @@ class SmartDecisionMakerBlock(Block):
|
||||
emit_key = f"tools_^_{sink_node_id}_~_{original_field_name}"
|
||||
|
||||
logger.debug(
|
||||
"[SmartDecisionMakerBlock|geid:%s|neid:%s] emit %s",
|
||||
"[OrchestratorBlock|geid:%s|neid:%s] emit %s",
|
||||
graph_exec_id,
|
||||
node_exec_id,
|
||||
emit_key,
|
||||
@@ -1,13 +1,8 @@
|
||||
import logging
|
||||
import signal
|
||||
import threading
|
||||
import warnings
|
||||
from contextlib import contextmanager
|
||||
from enum import Enum
|
||||
|
||||
# Monkey patch Stagehands to prevent signal handling in worker threads
|
||||
import stagehand.main
|
||||
from stagehand import Stagehand
|
||||
from stagehand import AsyncStagehand
|
||||
from stagehand.types.session_act_params import Options as ActOptions
|
||||
|
||||
from backend.blocks.llm import (
|
||||
MODEL_METADATA,
|
||||
@@ -28,46 +23,6 @@ from backend.sdk import (
|
||||
SchemaField,
|
||||
)
|
||||
|
||||
# Suppress false positive cleanup warning of litellm (a dependency of stagehand)
|
||||
warnings.filterwarnings("ignore", module="litellm.llms.custom_httpx")
|
||||
|
||||
# Store the original method
|
||||
original_register_signal_handlers = stagehand.main.Stagehand._register_signal_handlers
|
||||
|
||||
|
||||
def safe_register_signal_handlers(self):
|
||||
"""Only register signal handlers in the main thread"""
|
||||
if threading.current_thread() is threading.main_thread():
|
||||
original_register_signal_handlers(self)
|
||||
else:
|
||||
# Skip signal handling in worker threads
|
||||
pass
|
||||
|
||||
|
||||
# Replace the method
|
||||
stagehand.main.Stagehand._register_signal_handlers = safe_register_signal_handlers
|
||||
|
||||
|
||||
@contextmanager
|
||||
def disable_signal_handling():
|
||||
"""Context manager to temporarily disable signal handling"""
|
||||
if threading.current_thread() is not threading.main_thread():
|
||||
# In worker threads, temporarily replace signal.signal with a no-op
|
||||
original_signal = signal.signal
|
||||
|
||||
def noop_signal(*args, **kwargs):
|
||||
pass
|
||||
|
||||
signal.signal = noop_signal
|
||||
try:
|
||||
yield
|
||||
finally:
|
||||
signal.signal = original_signal
|
||||
else:
|
||||
# In main thread, don't modify anything
|
||||
yield
|
||||
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@@ -148,13 +103,10 @@ class StagehandObserveBlock(Block):
|
||||
instruction: str = SchemaField(
|
||||
description="Natural language description of elements or actions to discover.",
|
||||
)
|
||||
iframes: bool = SchemaField(
|
||||
description="Whether to search within iframes. If True, Stagehand will search for actions within iframes.",
|
||||
default=True,
|
||||
)
|
||||
domSettleTimeoutMs: int = SchemaField(
|
||||
description="Timeout in milliseconds for DOM settlement.Wait longer for dynamic content",
|
||||
default=45000,
|
||||
dom_settle_timeout_ms: int = SchemaField(
|
||||
description="Timeout in ms to wait for the DOM to settle after navigation.",
|
||||
default=30000,
|
||||
advanced=True,
|
||||
)
|
||||
|
||||
class Output(BlockSchemaOutput):
|
||||
@@ -185,32 +137,28 @@ class StagehandObserveBlock(Block):
|
||||
|
||||
logger.debug(f"OBSERVE: Using model provider {model_credentials.provider}")
|
||||
|
||||
with disable_signal_handling():
|
||||
stagehand = Stagehand(
|
||||
api_key=stagehand_credentials.api_key.get_secret_value(),
|
||||
project_id=input_data.browserbase_project_id,
|
||||
async with AsyncStagehand(
|
||||
browserbase_api_key=stagehand_credentials.api_key.get_secret_value(),
|
||||
browserbase_project_id=input_data.browserbase_project_id,
|
||||
model_api_key=model_credentials.api_key.get_secret_value(),
|
||||
) as client:
|
||||
session = await client.sessions.start(
|
||||
model_name=input_data.model.provider_name,
|
||||
model_api_key=model_credentials.api_key.get_secret_value(),
|
||||
dom_settle_timeout_ms=input_data.dom_settle_timeout_ms,
|
||||
)
|
||||
try:
|
||||
await session.navigate(url=input_data.url)
|
||||
|
||||
await stagehand.init()
|
||||
|
||||
page = stagehand.page
|
||||
|
||||
assert page is not None, "Stagehand page is not initialized"
|
||||
|
||||
await page.goto(input_data.url)
|
||||
|
||||
observe_results = await page.observe(
|
||||
input_data.instruction,
|
||||
iframes=input_data.iframes,
|
||||
domSettleTimeoutMs=input_data.domSettleTimeoutMs,
|
||||
)
|
||||
for result in observe_results:
|
||||
yield "selector", result.selector
|
||||
yield "description", result.description
|
||||
yield "method", result.method
|
||||
yield "arguments", result.arguments
|
||||
observe_response = await session.observe(
|
||||
instruction=input_data.instruction,
|
||||
)
|
||||
for result in observe_response.data.result:
|
||||
yield "selector", result.selector
|
||||
yield "description", result.description
|
||||
yield "method", result.method
|
||||
yield "arguments", result.arguments
|
||||
finally:
|
||||
await session.end()
|
||||
|
||||
|
||||
class StagehandActBlock(Block):
|
||||
@@ -242,24 +190,22 @@ class StagehandActBlock(Block):
|
||||
description="Variables to use in the action. Variables contains data you want the action to use.",
|
||||
default_factory=dict,
|
||||
)
|
||||
iframes: bool = SchemaField(
|
||||
description="Whether to search within iframes. If True, Stagehand will search for actions within iframes.",
|
||||
default=True,
|
||||
dom_settle_timeout_ms: int = SchemaField(
|
||||
description="Timeout in ms to wait for the DOM to settle after navigation.",
|
||||
default=30000,
|
||||
advanced=True,
|
||||
)
|
||||
domSettleTimeoutMs: int = SchemaField(
|
||||
description="Timeout in milliseconds for DOM settlement.Wait longer for dynamic content",
|
||||
default=45000,
|
||||
)
|
||||
timeoutMs: int = SchemaField(
|
||||
description="Timeout in milliseconds for DOM ready. Extended timeout for slow-loading forms",
|
||||
default=60000,
|
||||
timeout_ms: int = SchemaField(
|
||||
description="Timeout in ms for each action.",
|
||||
default=30000,
|
||||
advanced=True,
|
||||
)
|
||||
|
||||
class Output(BlockSchemaOutput):
|
||||
success: bool = SchemaField(
|
||||
description="Whether the action was completed successfully"
|
||||
)
|
||||
message: str = SchemaField(description="Details about the action’s execution.")
|
||||
message: str = SchemaField(description="Details about the action's execution.")
|
||||
action: str = SchemaField(description="Action performed")
|
||||
|
||||
def __init__(self):
|
||||
@@ -282,32 +228,33 @@ class StagehandActBlock(Block):
|
||||
|
||||
logger.debug(f"ACT: Using model provider {model_credentials.provider}")
|
||||
|
||||
with disable_signal_handling():
|
||||
stagehand = Stagehand(
|
||||
api_key=stagehand_credentials.api_key.get_secret_value(),
|
||||
project_id=input_data.browserbase_project_id,
|
||||
async with AsyncStagehand(
|
||||
browserbase_api_key=stagehand_credentials.api_key.get_secret_value(),
|
||||
browserbase_project_id=input_data.browserbase_project_id,
|
||||
model_api_key=model_credentials.api_key.get_secret_value(),
|
||||
) as client:
|
||||
session = await client.sessions.start(
|
||||
model_name=input_data.model.provider_name,
|
||||
model_api_key=model_credentials.api_key.get_secret_value(),
|
||||
dom_settle_timeout_ms=input_data.dom_settle_timeout_ms,
|
||||
)
|
||||
try:
|
||||
await session.navigate(url=input_data.url)
|
||||
|
||||
await stagehand.init()
|
||||
|
||||
page = stagehand.page
|
||||
|
||||
assert page is not None, "Stagehand page is not initialized"
|
||||
|
||||
await page.goto(input_data.url)
|
||||
for action in input_data.action:
|
||||
action_results = await page.act(
|
||||
action,
|
||||
variables=input_data.variables,
|
||||
iframes=input_data.iframes,
|
||||
domSettleTimeoutMs=input_data.domSettleTimeoutMs,
|
||||
timeoutMs=input_data.timeoutMs,
|
||||
)
|
||||
yield "success", action_results.success
|
||||
yield "message", action_results.message
|
||||
yield "action", action_results.action
|
||||
for action in input_data.action:
|
||||
act_options = ActOptions(
|
||||
variables={k: v for k, v in input_data.variables.items()},
|
||||
timeout=input_data.timeout_ms,
|
||||
)
|
||||
act_response = await session.act(
|
||||
input=action,
|
||||
options=act_options,
|
||||
)
|
||||
result = act_response.data.result
|
||||
yield "success", result.success
|
||||
yield "message", result.message
|
||||
yield "action", result.action_description
|
||||
finally:
|
||||
await session.end()
|
||||
|
||||
|
||||
class StagehandExtractBlock(Block):
|
||||
@@ -335,13 +282,10 @@ class StagehandExtractBlock(Block):
|
||||
instruction: str = SchemaField(
|
||||
description="Natural language description of elements or actions to discover.",
|
||||
)
|
||||
iframes: bool = SchemaField(
|
||||
description="Whether to search within iframes. If True, Stagehand will search for actions within iframes.",
|
||||
default=True,
|
||||
)
|
||||
domSettleTimeoutMs: int = SchemaField(
|
||||
description="Timeout in milliseconds for DOM settlement.Wait longer for dynamic content",
|
||||
default=45000,
|
||||
dom_settle_timeout_ms: int = SchemaField(
|
||||
description="Timeout in ms to wait for the DOM to settle after navigation.",
|
||||
default=30000,
|
||||
advanced=True,
|
||||
)
|
||||
|
||||
class Output(BlockSchemaOutput):
|
||||
@@ -367,24 +311,21 @@ class StagehandExtractBlock(Block):
|
||||
|
||||
logger.debug(f"EXTRACT: Using model provider {model_credentials.provider}")
|
||||
|
||||
with disable_signal_handling():
|
||||
stagehand = Stagehand(
|
||||
api_key=stagehand_credentials.api_key.get_secret_value(),
|
||||
project_id=input_data.browserbase_project_id,
|
||||
async with AsyncStagehand(
|
||||
browserbase_api_key=stagehand_credentials.api_key.get_secret_value(),
|
||||
browserbase_project_id=input_data.browserbase_project_id,
|
||||
model_api_key=model_credentials.api_key.get_secret_value(),
|
||||
) as client:
|
||||
session = await client.sessions.start(
|
||||
model_name=input_data.model.provider_name,
|
||||
model_api_key=model_credentials.api_key.get_secret_value(),
|
||||
dom_settle_timeout_ms=input_data.dom_settle_timeout_ms,
|
||||
)
|
||||
try:
|
||||
await session.navigate(url=input_data.url)
|
||||
|
||||
await stagehand.init()
|
||||
|
||||
page = stagehand.page
|
||||
|
||||
assert page is not None, "Stagehand page is not initialized"
|
||||
|
||||
await page.goto(input_data.url)
|
||||
extraction = await page.extract(
|
||||
input_data.instruction,
|
||||
iframes=input_data.iframes,
|
||||
domSettleTimeoutMs=input_data.domSettleTimeoutMs,
|
||||
)
|
||||
yield "extraction", str(extraction.model_dump()["extraction"])
|
||||
extract_response = await session.extract(
|
||||
instruction=input_data.instruction,
|
||||
)
|
||||
yield "extraction", str(extract_response.data.result)
|
||||
finally:
|
||||
await session.end()
|
||||
|
||||
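The three Stagehand hunks above share one shape: start a session, yield block outputs, and end the session in a `finally` clause so it is released even when a call fails. The sketch below illustrates that cleanup pattern only. `FakeSession`, `run_block`, and `collect` are hypothetical stand-ins invented for this example, not the Stagehand SDK, whose real signatures are not confirmed here.

```python
import asyncio
from typing import List, Tuple


class FakeSession:
    """Hypothetical stand-in for a remote browser session (not the Stagehand SDK)."""

    def __init__(self) -> None:
        self.ended = False

    async def observe(self, instruction: str) -> List[str]:
        if not instruction:
            raise ValueError("instruction must not be empty")
        return [f"element matching {instruction!r}"]

    async def end(self) -> None:
        self.ended = True


async def run_block(session: FakeSession, instruction: str):
    """Async generator shaped like a block's run(): yield outputs, always end the session."""
    try:
        for description in await session.observe(instruction):
            yield "description", description
    finally:
        # Runs on success and on failure, so the session is never leaked.
        await session.end()


async def collect(session: FakeSession, instruction: str) -> List[Tuple[str, str]]:
    return [item async for item in run_block(session, instruction)]


ok_session = FakeSession()
outputs = asyncio.run(collect(ok_session, "login button"))

failed_session = FakeSession()
try:
    asyncio.run(collect(failed_session, ""))
except ValueError:
    pass  # the error still propagates; only the cleanup is guaranteed
```

In both runs `ended` is `True` afterwards, which is the property the `finally: await session.end()` lines in the diff are there to provide.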
@@ -1,9 +1,18 @@
from typing import cast
from unittest.mock import AsyncMock, MagicMock, patch

import anthropic
import httpx
import openai
import pytest

import backend.blocks.llm as llm
from backend.data.model import NodeExecutionStats

# TEST_CREDENTIALS_INPUT is a plain dict that satisfies AICredentials at runtime
# but not at the type level. Cast once here to avoid per-test suppressors.
_TEST_AI_CREDENTIALS = cast(llm.AICredentials, llm.TEST_CREDENTIALS_INPUT)


class TestLLMStatsTracking:
    """Test that LLM blocks correctly track token usage statistics."""
@@ -655,3 +664,148 @@ class TestAITextSummarizerValidation:
        error_message = str(exc_info.value)
        assert "Expected a string summary" in error_message
        assert "received dict" in error_message


def _make_anthropic_status_error(status_code: int) -> anthropic.APIStatusError:
    """Create an anthropic.APIStatusError with the given status code."""
    request = httpx.Request("POST", "https://api.anthropic.com/v1/messages")
    response = httpx.Response(status_code, request=request)
    return anthropic.APIStatusError(
        f"Error code: {status_code}", response=response, body=None
    )


def _make_openai_status_error(status_code: int) -> openai.APIStatusError:
    """Create an openai.APIStatusError with the given status code."""
    response = httpx.Response(
        status_code, request=httpx.Request("POST", "https://api.openai.com/v1/chat")
    )
    return openai.APIStatusError(
        f"Error code: {status_code}", response=response, body=None
    )


class TestUserErrorStatusCodeHandling:
    """Test that user-caused LLM API errors (401/403/429) break the retry loop
    and are logged as warnings, while server errors (500) trigger retries."""

    @pytest.mark.asyncio
    @pytest.mark.parametrize("status_code", [401, 403, 429])
    async def test_anthropic_user_error_breaks_retry_loop(self, status_code: int):
        """401/403/429 Anthropic errors should break immediately, not retry."""
        import backend.blocks.llm as llm

        block = llm.AIStructuredResponseGeneratorBlock()
        call_count = 0

        async def mock_llm_call(*args, **kwargs):
            nonlocal call_count
            call_count += 1
            raise _make_anthropic_status_error(status_code)

        with patch.object(block, "llm_call", new=AsyncMock(side_effect=mock_llm_call)):
            input_data = llm.AIStructuredResponseGeneratorBlock.Input(
                prompt="Test",
                expected_format={"key": "desc"},
                model=llm.DEFAULT_LLM_MODEL,
                credentials=_TEST_AI_CREDENTIALS,
                retry=3,
            )

            with pytest.raises(RuntimeError):
                async for _ in block.run(input_data, credentials=llm.TEST_CREDENTIALS):
                    pass

        assert (
            call_count == 1
        ), f"Expected exactly 1 call for status {status_code}, got {call_count}"

    @pytest.mark.asyncio
    @pytest.mark.parametrize("status_code", [401, 403, 429])
    async def test_openai_user_error_breaks_retry_loop(self, status_code: int):
        """401/403/429 OpenAI errors should break immediately, not retry."""
        import backend.blocks.llm as llm

        block = llm.AIStructuredResponseGeneratorBlock()
        call_count = 0

        async def mock_llm_call(*args, **kwargs):
            nonlocal call_count
            call_count += 1
            raise _make_openai_status_error(status_code)

        with patch.object(block, "llm_call", new=AsyncMock(side_effect=mock_llm_call)):
            input_data = llm.AIStructuredResponseGeneratorBlock.Input(
                prompt="Test",
                expected_format={"key": "desc"},
                model=llm.DEFAULT_LLM_MODEL,
                credentials=_TEST_AI_CREDENTIALS,
                retry=3,
            )

            with pytest.raises(RuntimeError):
                async for _ in block.run(input_data, credentials=llm.TEST_CREDENTIALS):
                    pass

        assert (
            call_count == 1
        ), f"Expected exactly 1 call for status {status_code}, got {call_count}"

    @pytest.mark.asyncio
    async def test_server_error_retries(self):
        """500 errors should be retried (not break immediately)."""
        import backend.blocks.llm as llm

        block = llm.AIStructuredResponseGeneratorBlock()
        call_count = 0

        async def mock_llm_call(*args, **kwargs):
            nonlocal call_count
            call_count += 1
            raise _make_anthropic_status_error(500)

        with patch.object(block, "llm_call", new=AsyncMock(side_effect=mock_llm_call)):
            input_data = llm.AIStructuredResponseGeneratorBlock.Input(
                prompt="Test",
                expected_format={"key": "desc"},
                model=llm.DEFAULT_LLM_MODEL,
                credentials=_TEST_AI_CREDENTIALS,
                retry=3,
            )

            with pytest.raises(RuntimeError):
                async for _ in block.run(input_data, credentials=llm.TEST_CREDENTIALS):
                    pass

        assert (
            call_count > 1
        ), f"Expected multiple retry attempts for 500, got {call_count}"

    @pytest.mark.asyncio
    async def test_user_error_logs_warning_not_exception(self):
        """User-caused errors should log with logger.warning, not logger.exception."""
        import backend.blocks.llm as llm

        block = llm.AIStructuredResponseGeneratorBlock()

        async def mock_llm_call(*args, **kwargs):
            raise _make_anthropic_status_error(401)

        with patch.object(block, "llm_call", new=AsyncMock(side_effect=mock_llm_call)):
            input_data = llm.AIStructuredResponseGeneratorBlock.Input(
                prompt="Test",
                expected_format={"key": "desc"},
                model=llm.DEFAULT_LLM_MODEL,
                credentials=_TEST_AI_CREDENTIALS,
            )

            with (
                patch.object(llm.logger, "warning") as mock_warning,
                patch.object(llm.logger, "exception") as mock_exception,
                pytest.raises(RuntimeError),
            ):
                async for _ in block.run(input_data, credentials=llm.TEST_CREDENTIALS):
                    pass

        mock_warning.assert_called_once()
        mock_exception.assert_not_called()
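The tests above pin down a behaviour without showing the loop that produces it: a 401, 403, or 429 ends the retry loop after one call, while a 500 is retried. The sketch below is one way such a loop could look. It is an assumption for illustration, not the code in `backend.blocks.llm`; `FakeStatusError`, `call_with_retry`, and `count_calls` are hypothetical names.

```python
USER_ERROR_STATUS_CODES = frozenset({401, 403, 429})


class FakeStatusError(Exception):
    """Hypothetical stand-in for a provider's APIStatusError."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"Error code: {status_code}")
        self.status_code = status_code


def call_with_retry(llm_call, retry: int):
    """Call llm_call up to `retry` times, stopping at once on user-caused errors."""
    last_error = None
    for _ in range(retry):
        try:
            return llm_call()
        except FakeStatusError as error:
            last_error = error
            if error.status_code in USER_ERROR_STATUS_CODES:
                # Retrying cannot fix bad credentials or an exhausted quota.
                break
    raise RuntimeError(f"LLM call failed: {last_error}") from last_error


def count_calls(status_code: int, retry: int = 3) -> int:
    """Return how many attempts are made when every call fails with status_code."""
    calls = 0

    def failing_call():
        nonlocal calls
        calls += 1
        raise FakeStatusError(status_code)

    try:
        call_with_retry(failing_call, retry)
    except RuntimeError:
        pass
    return calls
```

Counting attempts with a closure, as `count_calls` does, mirrors the `nonlocal call_count` technique used by the tests in the diff.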
@@ -57,7 +57,7 @@ async def execute_graph(
@pytest.mark.asyncio(loop_scope="session")
async def test_graph_validation_with_tool_nodes_correct(server: SpinTestServer):
    from backend.blocks.agent import AgentExecutorBlock
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock
    from backend.data import graph

    test_user = await create_test_user()
@@ -66,7 +66,7 @@ async def test_graph_validation_with_tool_nodes_correct(server: SpinTestServer):

    nodes = [
        graph.Node(
            block_id=SmartDecisionMakerBlock().id,
            block_id=OrchestratorBlock().id,
            input_default={
                "prompt": "Hello, World!",
                "credentials": creds,
@@ -108,10 +108,10 @@ async def test_graph_validation_with_tool_nodes_correct(server: SpinTestServer):


@pytest.mark.asyncio(loop_scope="session")
async def test_smart_decision_maker_function_signature(server: SpinTestServer):
async def test_orchestrator_function_signature(server: SpinTestServer):
    from backend.blocks.agent import AgentExecutorBlock
    from backend.blocks.basic import StoreValueBlock
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock
    from backend.data import graph

    test_user = await create_test_user()
@@ -120,7 +120,7 @@ async def test_smart_decision_maker_function_signature(server: SpinTestServer):

    nodes = [
        graph.Node(
            block_id=SmartDecisionMakerBlock().id,
            block_id=OrchestratorBlock().id,
            input_default={
                "prompt": "Hello, World!",
                "credentials": creds,
@@ -169,7 +169,7 @@ async def test_smart_decision_maker_function_signature(server: SpinTestServer):
    )
    test_graph = await create_graph(server, test_graph, test_user)

    tool_functions = await SmartDecisionMakerBlock._create_tool_node_signatures(
    tool_functions = await OrchestratorBlock._create_tool_node_signatures(
        test_graph.nodes[0].id
    )
    assert tool_functions is not None, "Tool functions should not be None"
@@ -198,12 +198,12 @@ async def test_smart_decision_maker_function_signature(server: SpinTestServer):


@pytest.mark.asyncio
async def test_smart_decision_maker_tracks_llm_stats():
    """Test that SmartDecisionMakerBlock correctly tracks LLM usage stats."""
async def test_orchestrator_tracks_llm_stats():
    """Test that OrchestratorBlock correctly tracks LLM usage stats."""
    import backend.blocks.llm as llm_module
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock

    block = SmartDecisionMakerBlock()
    block = OrchestratorBlock()

    # Mock the llm.llm_call function to return controlled data
    mock_response = MagicMock()
@@ -224,14 +224,14 @@ async def test_smart_decision_maker_tracks_llm_stats():
        new_callable=AsyncMock,
        return_value=mock_response,
    ), patch.object(
        SmartDecisionMakerBlock,
        OrchestratorBlock,
        "_create_tool_node_signatures",
        new_callable=AsyncMock,
        return_value=[],
    ):

        # Create test input
        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Should I continue with this task?",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -274,12 +274,12 @@ async def test_smart_decision_maker_tracks_llm_stats():


@pytest.mark.asyncio
async def test_smart_decision_maker_parameter_validation():
    """Test that SmartDecisionMakerBlock correctly validates tool call parameters."""
async def test_orchestrator_parameter_validation():
    """Test that OrchestratorBlock correctly validates tool call parameters."""
    import backend.blocks.llm as llm_module
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock

    block = SmartDecisionMakerBlock()
    block = OrchestratorBlock()

    # Mock tool functions with specific parameter schema
    mock_tool_functions = [
@@ -327,13 +327,13 @@ async def test_smart_decision_maker_parameter_validation():
        new_callable=AsyncMock,
        return_value=mock_response_with_typo,
    ) as mock_llm_call, patch.object(
        SmartDecisionMakerBlock,
        OrchestratorBlock,
        "_create_tool_node_signatures",
        new_callable=AsyncMock,
        return_value=mock_tool_functions,
    ):

        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Search for keywords",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -394,13 +394,13 @@ async def test_smart_decision_maker_parameter_validation():
        new_callable=AsyncMock,
        return_value=mock_response_missing_required,
    ), patch.object(
        SmartDecisionMakerBlock,
        OrchestratorBlock,
        "_create_tool_node_signatures",
        new_callable=AsyncMock,
        return_value=mock_tool_functions,
    ):

        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Search for keywords",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -454,13 +454,13 @@ async def test_smart_decision_maker_parameter_validation():
        new_callable=AsyncMock,
        return_value=mock_response_valid,
    ), patch.object(
        SmartDecisionMakerBlock,
        OrchestratorBlock,
        "_create_tool_node_signatures",
        new_callable=AsyncMock,
        return_value=mock_tool_functions,
    ):

        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Search for keywords",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -518,13 +518,13 @@ async def test_smart_decision_maker_parameter_validation():
        new_callable=AsyncMock,
        return_value=mock_response_all_params,
    ), patch.object(
        SmartDecisionMakerBlock,
        OrchestratorBlock,
        "_create_tool_node_signatures",
        new_callable=AsyncMock,
        return_value=mock_tool_functions,
    ):

        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Search for keywords",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -562,12 +562,12 @@ async def test_smart_decision_maker_parameter_validation():


@pytest.mark.asyncio
async def test_smart_decision_maker_raw_response_conversion():
    """Test that SmartDecisionMaker correctly handles different raw_response types with retry mechanism."""
async def test_orchestrator_raw_response_conversion():
    """Test that Orchestrator correctly handles different raw_response types with retry mechanism."""
    import backend.blocks.llm as llm_module
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock

    block = SmartDecisionMakerBlock()
    block = OrchestratorBlock()

    # Mock tool functions
    mock_tool_functions = [
@@ -637,7 +637,7 @@ async def test_smart_decision_maker_raw_response_conversion():
    with patch(
        "backend.blocks.llm.llm_call", new_callable=AsyncMock
    ) as mock_llm_call, patch.object(
        SmartDecisionMakerBlock,
        OrchestratorBlock,
        "_create_tool_node_signatures",
        new_callable=AsyncMock,
        return_value=mock_tool_functions,
@@ -646,7 +646,7 @@ async def test_smart_decision_maker_raw_response_conversion():
        # Second call returns successful response
        mock_llm_call.side_effect = [mock_response_retry, mock_response_success]

        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Test prompt",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -715,12 +715,12 @@ async def test_smart_decision_maker_raw_response_conversion():
        new_callable=AsyncMock,
        return_value=mock_response_ollama,
    ), patch.object(
        SmartDecisionMakerBlock,
        OrchestratorBlock,
        "_create_tool_node_signatures",
        new_callable=AsyncMock,
        return_value=[],  # No tools for this test
    ):
        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Simple prompt",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -771,12 +771,12 @@ async def test_smart_decision_maker_raw_response_conversion():
        new_callable=AsyncMock,
        return_value=mock_response_dict,
    ), patch.object(
        SmartDecisionMakerBlock,
        OrchestratorBlock,
        "_create_tool_node_signatures",
        new_callable=AsyncMock,
        return_value=[],
    ):
        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Another test",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -811,12 +811,12 @@ async def test_smart_decision_maker_raw_response_conversion():


@pytest.mark.asyncio
async def test_smart_decision_maker_agent_mode():
async def test_orchestrator_agent_mode():
    """Test that agent mode executes tools directly and loops until finished."""
    import backend.blocks.llm as llm_module
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock

    block = SmartDecisionMakerBlock()
    block = OrchestratorBlock()

    # Mock tool call that requires multiple iterations
    mock_tool_call_1 = MagicMock()
@@ -893,7 +893,7 @@ async def test_smart_decision_maker_agent_mode():
    with patch("backend.blocks.llm.llm_call", llm_call_mock), patch.object(
        block, "_create_tool_node_signatures", return_value=mock_tool_signatures
    ), patch(
        "backend.blocks.smart_decision_maker.get_database_manager_async_client",
        "backend.blocks.orchestrator.get_database_manager_async_client",
        return_value=mock_db_client,
    ), patch(
        "backend.executor.manager.async_update_node_execution_status",
@@ -929,7 +929,7 @@ async def test_smart_decision_maker_agent_mode():
        }

        # Test agent mode with max_iterations = 3
        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Complete this task using tools",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -969,12 +969,12 @@ async def test_smart_decision_maker_agent_mode():


@pytest.mark.asyncio
async def test_smart_decision_maker_traditional_mode_default():
async def test_orchestrator_traditional_mode_default():
    """Test that default behavior (agent_mode_max_iterations=0) works as traditional mode."""
    import backend.blocks.llm as llm_module
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock

    block = SmartDecisionMakerBlock()
    block = OrchestratorBlock()

    # Mock tool call
    mock_tool_call = MagicMock()
@@ -1018,7 +1018,7 @@ async def test_smart_decision_maker_traditional_mode_default():
    ):

        # Test default behavior (traditional mode)
        input_data = SmartDecisionMakerBlock.Input(
        input_data = OrchestratorBlock.Input(
            prompt="Test prompt",
            model=llm_module.DEFAULT_LLM_MODEL,
            credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -1060,12 +1060,12 @@ async def test_smart_decision_maker_traditional_mode_default():


@pytest.mark.asyncio
async def test_smart_decision_maker_uses_customized_name_for_blocks():
    """Test that SmartDecisionMakerBlock uses customized_name from node metadata for tool names."""
async def test_orchestrator_uses_customized_name_for_blocks():
    """Test that OrchestratorBlock uses customized_name from node metadata for tool names."""
    from unittest.mock import MagicMock

    from backend.blocks.basic import StoreValueBlock
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock
    from backend.data.graph import Link, Node

    # Create a mock node with customized_name in metadata
@@ -1080,7 +1080,7 @@ async def test_smart_decision_maker_uses_customized_name_for_blocks():
    mock_link.sink_name = "input"

    # Call the function directly
    result = await SmartDecisionMakerBlock._create_block_function_signature(
    result = await OrchestratorBlock._create_block_function_signature(
        mock_node, [mock_link]
    )

@@ -1091,12 +1091,12 @@ async def test_smart_decision_maker_uses_customized_name_for_blocks():


@pytest.mark.asyncio
async def test_smart_decision_maker_falls_back_to_block_name():
    """Test that SmartDecisionMakerBlock falls back to block.name when no customized_name."""
async def test_orchestrator_falls_back_to_block_name():
    """Test that OrchestratorBlock falls back to block.name when no customized_name."""
    from unittest.mock import MagicMock

    from backend.blocks.basic import StoreValueBlock
    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock
    from backend.data.graph import Link, Node

    # Create a mock node without customized_name
@@ -1111,7 +1111,7 @@ async def test_smart_decision_maker_falls_back_to_block_name():
    mock_link.sink_name = "input"

    # Call the function directly
    result = await SmartDecisionMakerBlock._create_block_function_signature(
    result = await OrchestratorBlock._create_block_function_signature(
        mock_node, [mock_link]
    )

@@ -1122,11 +1122,11 @@ async def test_smart_decision_maker_falls_back_to_block_name():


@pytest.mark.asyncio
async def test_smart_decision_maker_uses_customized_name_for_agents():
    """Test that SmartDecisionMakerBlock uses customized_name from metadata for agent nodes."""
async def test_orchestrator_uses_customized_name_for_agents():
    """Test that OrchestratorBlock uses customized_name from metadata for agent nodes."""
    from unittest.mock import AsyncMock, MagicMock, patch

    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock
    from backend.data.graph import Link, Node

    # Create a mock node with customized_name in metadata
@@ -1152,10 +1152,10 @@ async def test_smart_decision_maker_uses_customized_name_for_agents():
    mock_db_client.get_graph_metadata.return_value = mock_graph_meta

    with patch(
        "backend.blocks.smart_decision_maker.get_database_manager_async_client",
        "backend.blocks.orchestrator.get_database_manager_async_client",
        return_value=mock_db_client,
    ):
        result = await SmartDecisionMakerBlock._create_agent_function_signature(
        result = await OrchestratorBlock._create_agent_function_signature(
            mock_node, [mock_link]
        )

@@ -1166,11 +1166,11 @@ async def test_smart_decision_maker_uses_customized_name_for_agents():


@pytest.mark.asyncio
async def test_smart_decision_maker_agent_falls_back_to_graph_name():
async def test_orchestrator_agent_falls_back_to_graph_name():
    """Test that agent node falls back to graph name when no customized_name."""
    from unittest.mock import AsyncMock, MagicMock, patch

    from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
    from backend.blocks.orchestrator import OrchestratorBlock
    from backend.data.graph import Link, Node

    # Create a mock node without customized_name
@@ -1196,10 +1196,10 @@ async def test_smart_decision_maker_agent_falls_back_to_graph_name():
    mock_db_client.get_graph_metadata.return_value = mock_graph_meta

    with patch(
        "backend.blocks.smart_decision_maker.get_database_manager_async_client",
        "backend.blocks.orchestrator.get_database_manager_async_client",
        return_value=mock_db_client,
    ):
        result = await SmartDecisionMakerBlock._create_agent_function_signature(
        result = await OrchestratorBlock._create_agent_function_signature(
            mock_node, [mock_link]
        )

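The rename in this file touches more than class names: every string passed to `patch(...)` changes from `backend.blocks.smart_decision_maker...` to `backend.blocks.orchestrator...`, because `unittest.mock.patch` resolves its target by importing the module named in the string. The sketch below shows that a stale target fails outright instead of being silently ignored. The `demo_orchestrator` module and its functions are hypothetical, built in memory for this illustration.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical in-memory module standing in for the renamed one.
orchestrator = types.ModuleType("demo_orchestrator")
orchestrator.get_client = lambda: "real client"
# run() looks the helper up on the module at call time, so a patch is visible.
orchestrator.run = lambda: orchestrator.get_client()
sys.modules["demo_orchestrator"] = orchestrator

# Patching under the current module name replaces the helper for the block's duration.
with patch("demo_orchestrator.get_client", return_value="mock client"):
    patched_result = orchestrator.run()

unpatched_result = orchestrator.run()

# A target string that still uses the old module name cannot be imported.
try:
    with patch("demo_smart_decision_maker.get_client"):
        pass
    stale_target_error = None
except ModuleNotFoundError as error:
    stale_target_error = error
```

This is why a module rename has to update string targets by hand: they are invisible to most refactoring tools, and a missed one only surfaces when the test runs.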
@@ -3,12 +3,12 @@ from unittest.mock import Mock
|
||||
import pytest
|
||||
|
||||
from backend.blocks.data_manipulation import AddToListBlock, CreateDictionaryBlock
|
||||
from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
|
||||
from backend.blocks.orchestrator import OrchestratorBlock
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_smart_decision_maker_handles_dynamic_dict_fields():
|
||||
"""Test Smart Decision Maker can handle dynamic dictionary fields (_#_) for any block"""
|
||||
async def test_orchestrator_handles_dynamic_dict_fields():
|
||||
"""Test Orchestrator can handle dynamic dictionary fields (_#_) for any block"""
|
||||
|
||||
# Create a mock node for CreateDictionaryBlock
|
||||
mock_node = Mock()
|
||||
@@ -23,24 +23,24 @@ async def test_smart_decision_maker_handles_dynamic_dict_fields():
|
||||
source_name="tools_^_create_dict_~_name",
|
||||
sink_name="values_#_name", # Dynamic dict field
|
||||
sink_id="dict_node_id",
|
||||
source_id="smart_decision_node_id",
|
||||
source_id="orchestrator_node_id",
|
||||
),
|
||||
Mock(
|
||||
source_name="tools_^_create_dict_~_age",
|
||||
sink_name="values_#_age", # Dynamic dict field
|
||||
sink_id="dict_node_id",
|
||||
source_id="smart_decision_node_id",
|
||||
source_id="orchestrator_node_id",
|
||||
),
|
||||
Mock(
|
||||
source_name="tools_^_create_dict_~_city",
|
||||
sink_name="values_#_city", # Dynamic dict field
|
||||
sink_id="dict_node_id",
|
||||
source_id="smart_decision_node_id",
|
||||
source_id="orchestrator_node_id",
|
||||
),
|
||||
]
|
||||
|
||||
# Generate function signature
|
||||
signature = await SmartDecisionMakerBlock._create_block_function_signature(
|
||||
signature = await OrchestratorBlock._create_block_function_signature(
|
||||
mock_node, mock_links # type: ignore
|
||||
)
|
||||
|
||||
@@ -70,8 +70,8 @@ async def test_smart_decision_maker_handles_dynamic_dict_fields():


 @pytest.mark.asyncio
-async def test_smart_decision_maker_handles_dynamic_list_fields():
-    """Test Smart Decision Maker can handle dynamic list fields (_$_) for any block"""
+async def test_orchestrator_handles_dynamic_list_fields():
+    """Test Orchestrator can handle dynamic list fields (_$_) for any block"""

     # Create a mock node for AddToListBlock
     mock_node = Mock()
@@ -86,18 +86,18 @@ async def test_smart_decision_maker_handles_dynamic_list_fields():
             source_name="tools_^_add_to_list_~_0",
             sink_name="entries_$_0",  # Dynamic list field
             sink_id="list_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
         Mock(
             source_name="tools_^_add_to_list_~_1",
             sink_name="entries_$_1",  # Dynamic list field
             sink_id="list_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
     ]

     # Generate function signature
-    signature = await SmartDecisionMakerBlock._create_block_function_signature(
+    signature = await OrchestratorBlock._create_block_function_signature(
         mock_node, mock_links  # type: ignore
     )

@@ -1,4 +1,4 @@
-"""Comprehensive tests for SmartDecisionMakerBlock dynamic field handling."""
+"""Comprehensive tests for OrchestratorBlock dynamic field handling."""

 import json
 from unittest.mock import AsyncMock, MagicMock, Mock, patch
@@ -6,7 +6,7 @@ from unittest.mock import AsyncMock, MagicMock, Mock, patch
 import pytest

 from backend.blocks.data_manipulation import AddToListBlock, CreateDictionaryBlock
-from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
+from backend.blocks.orchestrator import OrchestratorBlock
 from backend.blocks.text import MatchTextPatternBlock
 from backend.data.dynamic_fields import get_dynamic_field_description

@@ -37,7 +37,7 @@ async def test_dynamic_field_description_generation():
 @pytest.mark.asyncio
 async def test_create_block_function_signature_with_dict_fields():
     """Test that function signatures are created correctly for dictionary dynamic fields."""
-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     # Create a mock node for CreateDictionaryBlock
     mock_node = Mock()
@@ -52,19 +52,19 @@ async def test_create_block_function_signature_with_dict_fields():
             source_name="tools_^_create_dict_~_values___name",  # Sanitized source
             sink_name="values_#_name",  # Original sink
             sink_id="dict_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
         Mock(
             source_name="tools_^_create_dict_~_values___age",  # Sanitized source
             sink_name="values_#_age",  # Original sink
             sink_id="dict_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
         Mock(
             source_name="tools_^_create_dict_~_values___email",  # Sanitized source
             sink_name="values_#_email",  # Original sink
             sink_id="dict_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
     ]

@@ -100,7 +100,7 @@ async def test_create_block_function_signature_with_dict_fields():
 @pytest.mark.asyncio
 async def test_create_block_function_signature_with_list_fields():
     """Test that function signatures are created correctly for list dynamic fields."""
-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     # Create a mock node for AddToListBlock
     mock_node = Mock()
@@ -115,19 +115,19 @@ async def test_create_block_function_signature_with_list_fields():
             source_name="tools_^_add_list_~_0",
             sink_name="entries_$_0",  # Dynamic list field
             sink_id="list_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
         Mock(
             source_name="tools_^_add_list_~_1",
             sink_name="entries_$_1",  # Dynamic list field
             sink_id="list_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
         Mock(
             source_name="tools_^_add_list_~_2",
             sink_name="entries_$_2",  # Dynamic list field
             sink_id="list_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
     ]

@@ -154,7 +154,7 @@ async def test_create_block_function_signature_with_list_fields():
 @pytest.mark.asyncio
 async def test_create_block_function_signature_with_object_fields():
     """Test that function signatures are created correctly for object dynamic fields."""
-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     # Create a mock node for MatchTextPatternBlock (simulating object fields)
     mock_node = Mock()
@@ -169,13 +169,13 @@ async def test_create_block_function_signature_with_object_fields():
             source_name="tools_^_extract_~_user_name",
             sink_name="data_@_user_name",  # Dynamic object field
             sink_id="extract_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
         Mock(
             source_name="tools_^_extract_~_user_email",
             sink_name="data_@_user_email",  # Dynamic object field
             sink_id="extract_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
     ]

@@ -197,11 +197,11 @@ async def test_create_block_function_signature_with_object_fields():
 @pytest.mark.asyncio
 async def test_create_tool_node_signatures():
     """Test that the mapping between sanitized and original field names is built correctly."""
-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     # Mock the database client and connected nodes
     with patch(
-        "backend.blocks.smart_decision_maker.get_database_manager_async_client"
+        "backend.blocks.orchestrator.get_database_manager_async_client"
     ) as mock_db:
         mock_client = AsyncMock()
         mock_db.return_value = mock_client
@@ -281,7 +281,7 @@ async def test_create_tool_node_signatures():
 @pytest.mark.asyncio
 async def test_output_yielding_with_dynamic_fields():
     """Test that outputs are yielded correctly with dynamic field names mapped back."""
-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     # No more sanitized mapping needed since we removed sanitization

@@ -309,13 +309,13 @@ async def test_output_yielding_with_dynamic_fields():

     # Mock the LLM call
     with patch(
-        "backend.blocks.smart_decision_maker.llm.llm_call", new_callable=AsyncMock
+        "backend.blocks.orchestrator.llm.llm_call", new_callable=AsyncMock
     ) as mock_llm:
         mock_llm.return_value = mock_response

         # Mock the database manager to avoid HTTP calls during tool execution
         with patch(
-            "backend.blocks.smart_decision_maker.get_database_manager_async_client"
+            "backend.blocks.orchestrator.get_database_manager_async_client"
         ) as mock_db_manager, patch.object(
             block, "_create_tool_node_signatures", new_callable=AsyncMock
         ) as mock_sig:
@@ -420,7 +420,7 @@ async def test_output_yielding_with_dynamic_fields():
 @pytest.mark.asyncio
 async def test_mixed_regular_and_dynamic_fields():
     """Test handling of blocks with both regular and dynamic fields."""
-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     # Create a mock node
     mock_node = Mock()
@@ -450,19 +450,19 @@ async def test_mixed_regular_and_dynamic_fields():
             source_name="tools_^_test_~_regular",
             sink_name="regular_field",  # Regular field
             sink_id="test_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
         Mock(
             source_name="tools_^_test_~_dict_key",
             sink_name="values_#_key1",  # Dynamic dict field
             sink_id="test_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
         Mock(
             source_name="tools_^_test_~_dict_key2",
             sink_name="values_#_key2",  # Dynamic dict field
             sink_id="test_node_id",
-            source_id="smart_decision_node_id",
+            source_id="orchestrator_node_id",
         ),
     ]

@@ -488,7 +488,7 @@ async def test_mixed_regular_and_dynamic_fields():
 @pytest.mark.asyncio
 async def test_validation_errors_dont_pollute_conversation():
     """Test that validation errors are only used during retries and don't pollute the conversation."""
-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     # Track conversation history changes
     conversation_snapshots = []
@@ -535,7 +535,7 @@ async def test_validation_errors_dont_pollute_conversation():

     # Mock the LLM call
     with patch(
-        "backend.blocks.smart_decision_maker.llm.llm_call", new_callable=AsyncMock
+        "backend.blocks.orchestrator.llm.llm_call", new_callable=AsyncMock
     ) as mock_llm:
         mock_llm.side_effect = mock_llm_call

@@ -565,7 +565,7 @@ async def test_validation_errors_dont_pollute_conversation():

         # Mock the database manager to avoid HTTP calls during tool execution
         with patch(
-            "backend.blocks.smart_decision_maker.get_database_manager_async_client"
+            "backend.blocks.orchestrator.get_database_manager_async_client"
         ) as mock_db_manager:
             # Set up the mock database manager for agent mode
             mock_db_client = AsyncMock()
@@ -1,6 +1,6 @@
-"""Tests for SmartDecisionMakerBlock compatibility with the OpenAI Responses API.
+"""Tests for OrchestratorBlock compatibility with the OpenAI Responses API.

-The SmartDecisionMakerBlock manages conversation history in the Chat Completions
+The OrchestratorBlock manages conversation history in the Chat Completions
 format, but OpenAI models now use the Responses API which has a fundamentally
 different conversation structure. These tests document:

@@ -27,8 +27,8 @@ from unittest.mock import AsyncMock, MagicMock, patch

 import pytest

-from backend.blocks.smart_decision_maker import (
-    SmartDecisionMakerBlock,
+from backend.blocks.orchestrator import (
+    OrchestratorBlock,
     _combine_tool_responses,
     _convert_raw_response_to_dict,
     _create_tool_response,
@@ -733,7 +733,7 @@ class TestUpdateConversation:

     def test_dict_raw_response_no_reasoning_no_tools(self):
         """Dict raw_response, no reasoning → appends assistant dict."""
-        block = SmartDecisionMakerBlock()
+        block = OrchestratorBlock()
         prompt: list[dict] = []
         resp = self._make_response({"role": "assistant", "content": "hi"})
         block._update_conversation(prompt, resp)
@@ -741,7 +741,7 @@ class TestUpdateConversation:

     def test_dict_raw_response_with_reasoning_no_tool_calls(self):
         """Reasoning present, no tool calls → reasoning prepended."""
-        block = SmartDecisionMakerBlock()
+        block = OrchestratorBlock()
         prompt: list[dict] = []
         resp = self._make_response(
             {"role": "assistant", "content": "answer"},
@@ -757,7 +757,7 @@ class TestUpdateConversation:

     def test_dict_raw_response_with_reasoning_and_anthropic_tool_calls(self):
         """Reasoning + Anthropic tool_use in content → reasoning skipped."""
-        block = SmartDecisionMakerBlock()
+        block = OrchestratorBlock()
         prompt: list[dict] = []
         raw = {
             "role": "assistant",
@@ -772,7 +772,7 @@ class TestUpdateConversation:

     def test_with_tool_outputs(self):
         """Tool outputs → extended onto prompt."""
-        block = SmartDecisionMakerBlock()
+        block = OrchestratorBlock()
         prompt: list[dict] = []
         resp = self._make_response({"role": "assistant", "content": None})
         outputs = [{"role": "tool", "tool_call_id": "call_1", "content": "r"}]
@@ -782,7 +782,7 @@ class TestUpdateConversation:

     def test_without_tool_outputs(self):
         """No tool outputs → only assistant message appended."""
-        block = SmartDecisionMakerBlock()
+        block = OrchestratorBlock()
         prompt: list[dict] = []
         resp = self._make_response({"role": "assistant", "content": "done"})
         block._update_conversation(prompt, resp, None)
@@ -790,7 +790,7 @@ class TestUpdateConversation:

     def test_string_raw_response(self):
         """Ollama string → wrapped as assistant dict."""
-        block = SmartDecisionMakerBlock()
+        block = OrchestratorBlock()
         prompt: list[dict] = []
         resp = self._make_response("hello from ollama")
         block._update_conversation(prompt, resp)
@@ -800,7 +800,7 @@ class TestUpdateConversation:

     def test_responses_api_text_response_produces_valid_items(self):
         """Responses API text response → conversation items must have valid role."""
-        block = SmartDecisionMakerBlock()
+        block = OrchestratorBlock()
         prompt: list[dict] = [
             {"role": "system", "content": "sys"},
             {"role": "user", "content": "user"},
@@ -820,7 +820,7 @@ class TestUpdateConversation:

     def test_responses_api_function_call_produces_valid_items(self):
         """Responses API function_call → conversation items must have valid type."""
-        block = SmartDecisionMakerBlock()
+        block = OrchestratorBlock()
         prompt: list[dict] = []
         resp = self._make_response(
             _MockResponse(output=[_MockFunctionCall("tool", "{}", call_id="call_1")])
@@ -856,7 +856,7 @@ async def test_agent_mode_conversation_valid_for_responses_api():
     """
     import backend.blocks.llm as llm_module

-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     # First response: tool call
     mock_tc = MagicMock()
@@ -936,7 +936,7 @@ async def test_agent_mode_conversation_valid_for_responses_api():
     with patch("backend.blocks.llm.llm_call", llm_mock), patch.object(
         block, "_create_tool_node_signatures", return_value=tool_sigs
     ), patch(
-        "backend.blocks.smart_decision_maker.get_database_manager_async_client",
+        "backend.blocks.orchestrator.get_database_manager_async_client",
         return_value=mock_db,
     ), patch(
         "backend.executor.manager.async_update_node_execution_status",
@@ -945,7 +945,7 @@ async def test_agent_mode_conversation_valid_for_responses_api():
         "backend.integrations.creds_manager.IntegrationCredentialsManager"
     ):

-        inp = SmartDecisionMakerBlock.Input(
+        inp = OrchestratorBlock.Input(
             prompt="Improve this",
             model=llm_module.DEFAULT_LLM_MODEL,
             credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -992,7 +992,7 @@ async def test_traditional_mode_conversation_valid_for_responses_api():
     """Traditional mode: the yielded conversation must contain only valid items."""
     import backend.blocks.llm as llm_module

-    block = SmartDecisionMakerBlock()
+    block = OrchestratorBlock()

     mock_tc = MagicMock()
     mock_tc.function.name = "my_tool"
@@ -1028,7 +1028,7 @@ async def test_traditional_mode_conversation_valid_for_responses_api():
         "backend.blocks.llm.llm_call", new_callable=AsyncMock, return_value=resp
     ), patch.object(block, "_create_tool_node_signatures", return_value=tool_sigs):

-        inp = SmartDecisionMakerBlock.Input(
+        inp = OrchestratorBlock.Input(
             prompt="Do it",
             model=llm_module.DEFAULT_LLM_MODEL,
             credentials=llm_module.TEST_CREDENTIALS_INPUT,  # type: ignore
@@ -17,6 +17,9 @@ from backend.util.workspace import WorkspaceManager
 if TYPE_CHECKING:
     from e2b import AsyncSandbox

+    from backend.copilot.permissions import CopilotPermissions
+
+
 # Allowed base directory for the Read tool. Public so service.py can use it
 # for sweep operations without depending on a private implementation detail.
 # Respects CLAUDE_CONFIG_DIR env var, consistent with transcript.py's
@@ -43,6 +46,12 @@ _current_sandbox: ContextVar["AsyncSandbox | None"] = ContextVar(
 )
 _current_sdk_cwd: ContextVar[str] = ContextVar("_current_sdk_cwd", default="")

+# Current execution's capability filter. None means "no restrictions".
+# Set by set_execution_context(); read by run_block and service.py.
+_current_permissions: "ContextVar[CopilotPermissions | None]" = ContextVar(
+    "_current_permissions", default=None
+)
+

 def encode_cwd_for_cli(cwd: str) -> str:
     """Encode a working directory path the same way the Claude CLI does.
@@ -63,6 +72,7 @@ def set_execution_context(
     session: ChatSession,
     sandbox: "AsyncSandbox | None" = None,
     sdk_cwd: str | None = None,
+    permissions: "CopilotPermissions | None" = None,
 ) -> None:
     """Set per-turn context variables used by file-resolution tool handlers."""
     _current_user_id.set(user_id)
@@ -70,6 +80,7 @@ def set_execution_context(
     _current_sandbox.set(sandbox)
     _current_sdk_cwd.set(sdk_cwd or "")
     _current_project_dir.set(_encode_cwd_for_cli(sdk_cwd) if sdk_cwd else "")
+    _current_permissions.set(permissions)


 def get_execution_context() -> tuple[str | None, ChatSession | None]:
@@ -77,6 +88,11 @@ def get_execution_context() -> tuple[str | None, ChatSession | None]:
     return _current_user_id.get(), _current_session.get()


+def get_current_permissions() -> "CopilotPermissions | None":
+    """Return the capability filter for the current execution, or None if unrestricted."""
+    return _current_permissions.get()
+
+
 def get_current_sandbox() -> "AsyncSandbox | None":
     """Return the E2B sandbox for the current session, or None if not active."""
     return _current_sandbox.get()

@@ -11,6 +11,7 @@ import pytest
 from backend.copilot.context import (
     SDK_PROJECTS_DIR,
     _current_project_dir,
+    get_current_permissions,
     get_current_sandbox,
     get_execution_context,
     get_sdk_cwd,
@@ -18,6 +19,7 @@ from backend.copilot.context import (
     resolve_sandbox_path,
     set_execution_context,
 )
+from backend.copilot.permissions import CopilotPermissions


 def _make_session() -> MagicMock:
@@ -61,6 +63,19 @@ def test_get_current_sandbox_returns_set_value():
     assert get_current_sandbox() is mock_sandbox


+def test_set_and_get_current_permissions():
+    """set_execution_context stores permissions; get_current_permissions returns it."""
+    perms = CopilotPermissions(tools=["run_block"], tools_exclude=False)
+    set_execution_context("u1", _make_session(), permissions=perms)
+    assert get_current_permissions() is perms
+
+
+def test_get_current_permissions_defaults_to_none():
+    """get_current_permissions returns None when no permissions have been set."""
+    set_execution_context("u1", _make_session())
+    assert get_current_permissions() is None
+
+
 def test_get_sdk_cwd_empty_when_not_set():
     """get_sdk_cwd returns empty string when sdk_cwd is not set."""
     set_execution_context("u1", _make_session(), sdk_cwd=None)

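The hunks above thread a per-execution capability filter through a `ContextVar`, with `None` meaning "no restrictions". As a rough illustration of that pattern only, here is a minimal sketch using a simplified stand-in: `Permissions`, `set_permissions`, `get_permissions`, and `demo` are hypothetical names for this example and are not part of the project's API.

```python
from contextvars import ContextVar, copy_context
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Permissions:
    """Hypothetical stand-in for CopilotPermissions (illustration only)."""

    tools: tuple = ()
    tools_exclude: bool = True


# None means "no restrictions", mirroring the default used in the diff.
_current_permissions: "ContextVar[Optional[Permissions]]" = ContextVar(
    "_current_permissions", default=None
)


def set_permissions(perms: Optional[Permissions]) -> None:
    """Store the filter for the current context (hypothetical helper)."""
    _current_permissions.set(perms)


def get_permissions() -> Optional[Permissions]:
    """Read the filter for the current context, or None if unrestricted."""
    return _current_permissions.get()


def demo():
    """Set the filter inside a copied context and read it inside and outside."""
    perms = Permissions(tools=("run_block",), tools_exclude=False)
    ctx = copy_context()
    ctx.run(set_permissions, perms)
    inside = ctx.run(get_permissions)
    outside = get_permissions()
    return perms, inside, outside
```

In this sketch the value set inside the copied context is visible there but not in the enclosing one, which is the isolation property that makes a context variable a reasonable carrier for per-execution state.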
430
autogpt_platform/backend/backend/copilot/permissions.py
Normal file
@@ -0,0 +1,430 @@
+"""Copilot execution permissions — tool and block allow/deny filtering.
+
+:class:`CopilotPermissions` is the single model used everywhere:
+
+- ``AutoPilotBlock`` reads four block-input fields and builds one instance.
+- ``stream_chat_completion_sdk`` applies it when constructing
+  ``ClaudeAgentOptions.allowed_tools`` / ``disallowed_tools``.
+- ``run_block`` reads it from the contextvar to gate block execution.
+- Recursive (sub-agent) invocations merge parent and child so children
+  can only be *more* restrictive, never more permissive.
+
+Tool names
+----------
+Users specify the **short name** as it appears in ``TOOL_REGISTRY`` (e.g.
+``run_block``, ``web_fetch``) or as an SDK built-in (e.g. ``Read``,
+``Task``, ``WebSearch``). Internally these are mapped to the full SDK
+format (``mcp__copilot__run_block``, ``Read``, …) by
+:func:`apply_tool_permissions`.
+
+Block identifiers
+-----------------
+Each entry in ``blocks`` may be one of:
+
+- A **full UUID** (``c069dc6b-c3ed-4c12-b6e5-d47361e64ce6``)
+- A **partial UUID** — the first 8-character hex segment (``c069dc6b``)
+- A **block name** (case-insensitive, e.g. ``"HTTP Request"``)
+
+:func:`validate_block_identifiers` resolves all entries against the live
+block registry and returns any that could not be matched.
+
+Semantics
+---------
+``tools_exclude=True`` (default) — ``tools`` is a **blacklist**; listed
+tools are denied and everything else is allowed. An empty list means
+"allow all" (no filtering).
+
+``tools_exclude=False`` — ``tools`` is a **whitelist**; only listed tools
+are allowed.
+
+``blocks_exclude`` follows the same pattern for ``blocks``.
+
+Recursion inheritance
+---------------------
+:meth:`CopilotPermissions.merged_with_parent` produces a new instance that
+is at most as permissive as the parent:
+
+- Tools: effective-allowed sets are intersected then stored as a whitelist.
+- Blocks: the parent is stored in ``_parent`` and consulted during every
+  :meth:`is_block_allowed` call so both constraints must pass.
+"""
+
+from __future__ import annotations
+
+import re
+from typing import Literal, get_args
+
+from pydantic import BaseModel, PrivateAttr
+
+# ---------------------------------------------------------------------------
+# Constants — single source of truth for all accepted tool names
+# ---------------------------------------------------------------------------
+
+# Literal type combining all valid tool names — used by AutoPilotBlock.Input
+# so the frontend renders a multi-select dropdown.
+# This is the SINGLE SOURCE OF TRUTH. All other name sets are derived from it.
+ToolName = Literal[
+    # Platform tools (must match keys in TOOL_REGISTRY)
+    "add_understanding",
+    "bash_exec",
+    "browser_act",
+    "browser_navigate",
+    "browser_screenshot",
+    "connect_integration",
+    "continue_run_block",
+    "create_agent",
+    "create_feature_request",
+    "create_folder",
+    "customize_agent",
+    "delete_folder",
+    "delete_workspace_file",
+    "edit_agent",
+    "find_agent",
+    "find_block",
+    "find_library_agent",
+    "fix_agent_graph",
+    "get_agent_building_guide",
+    "get_doc_page",
+    "get_mcp_guide",
+    "list_folders",
+    "list_workspace_files",
+    "move_agents_to_folder",
+    "move_folder",
+    "read_workspace_file",
+    "run_agent",
+    "run_block",
+    "run_mcp_tool",
+    "search_docs",
+    "search_feature_requests",
+    "update_folder",
+    "validate_agent_graph",
+    "view_agent_output",
+    "web_fetch",
+    "write_workspace_file",
+    # SDK built-ins
+    "Edit",
+    "Glob",
+    "Grep",
+    "Read",
+    "Task",
+    "TodoWrite",
+    "WebSearch",
+    "Write",
+]
+
+# Frozen set of all valid tool names — derived from the Literal.
+ALL_TOOL_NAMES: frozenset[str] = frozenset(get_args(ToolName))
+
+# SDK built-in tool names — uppercase-initial names are SDK built-ins.
+SDK_BUILTIN_TOOL_NAMES: frozenset[str] = frozenset(
+    n for n in ALL_TOOL_NAMES if n[0].isupper()
+)
+
+# Platform tool names — everything that isn't an SDK built-in.
+PLATFORM_TOOL_NAMES: frozenset[str] = ALL_TOOL_NAMES - SDK_BUILTIN_TOOL_NAMES
+
+# Compiled regex patterns for block identifier classification.
+_FULL_UUID_RE = re.compile(
+    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
+    re.IGNORECASE,
+)
+_PARTIAL_UUID_RE = re.compile(r"^[0-9a-f]{8}$", re.IGNORECASE)
+
+
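The constants above derive every name set from a single `Literal`. A minimal sketch of that derivation, assuming a shortened and purely illustrative subset of the real list (`DemoToolName`, `ALL_NAMES`, `SDK_BUILTINS`, and `PLATFORM` are hypothetical names for this example):

```python
from typing import Literal, get_args

# Shortened, hypothetical subset of the ToolName Literal in the diff.
DemoToolName = Literal["run_block", "web_fetch", "Read", "Task"]

# get_args() returns the Literal's members as a tuple of strings.
ALL_NAMES = frozenset(get_args(DemoToolName))

# Same convention as the diff: an uppercase initial marks an SDK built-in.
SDK_BUILTINS = frozenset(n for n in ALL_NAMES if n[0].isupper())

# Everything else is treated as a platform tool.
PLATFORM = ALL_NAMES - SDK_BUILTINS
```

The split depends entirely on the naming convention: under this scheme a platform tool with a capitalized name, or an SDK built-in with a lowercase one, would land in the wrong set.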
+# ---------------------------------------------------------------------------
+# Helper — block identifier matching
+# ---------------------------------------------------------------------------
+
+
+def _block_matches(identifier: str, block_id: str, block_name: str) -> bool:
+    """Return True if *identifier* resolves to the given block.
+
+    Resolution order:
+    1. Full UUID — exact case-insensitive match against *block_id*.
+    2. Partial UUID (8 hex chars, first segment) — prefix match.
+    3. Name — case-insensitive equality against *block_name*.
+    """
+    ident = identifier.strip()
+    if _FULL_UUID_RE.match(ident):
+        return ident.lower() == block_id.lower()
+    if _PARTIAL_UUID_RE.match(ident):
+        return block_id.lower().startswith(ident.lower())
+    return ident.lower() == block_name.lower()
+
+
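To make the resolution order in `_block_matches` concrete, the sketch below re-implements the same three-step logic under a public name and runs it on a few identifiers. The pairing of the example UUID with the name "HTTP Request" is an assumption made for illustration; `block_matches`, `BLOCK_ID`, `BLOCK_NAME`, and `results` are hypothetical names.

```python
import re

_FULL = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
_PARTIAL = re.compile(r"^[0-9a-f]{8}$", re.IGNORECASE)


def block_matches(identifier: str, block_id: str, block_name: str) -> bool:
    """Mirror of the diff's resolution order: full UUID, partial UUID, name."""
    ident = identifier.strip()
    if _FULL.match(ident):
        return ident.lower() == block_id.lower()
    if _PARTIAL.match(ident):
        return block_id.lower().startswith(ident.lower())
    return ident.lower() == block_name.lower()


# Assumed pairing of ID and name, for illustration only.
BLOCK_ID = "c069dc6b-c3ed-4c12-b6e5-d47361e64ce6"
BLOCK_NAME = "HTTP Request"

results = {
    "full_uuid": block_matches(BLOCK_ID.upper(), BLOCK_ID, BLOCK_NAME),
    "partial_uuid": block_matches("c069dc6b", BLOCK_ID, BLOCK_NAME),
    "name": block_matches("  http request ", BLOCK_ID, BLOCK_NAME),
    # An 8-hex-character *name* is classified as a partial UUID first,
    # so it never reaches the name comparison.
    "hex_like_name": block_matches("deadbeef", BLOCK_ID, "deadbeef"),
}
```

The last entry shows a consequence of the ordering: a block whose name happens to be eight hex characters cannot be selected by name with this logic, only by its UUID.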
+# ---------------------------------------------------------------------------
+# Model
+# ---------------------------------------------------------------------------
+
+
+class CopilotPermissions(BaseModel):
+    """Capability filter for a single copilot execution.
+
+    Attributes:
+        tools: Tool names to filter (short names, e.g. ``run_block``).
+        tools_exclude: When True (default) ``tools`` is a blacklist;
+            when False it is a whitelist. Ignored when *tools* is empty.
+        blocks: Block identifiers (name, full UUID, or 8-char partial UUID).
+        blocks_exclude: Same semantics as *tools_exclude* but for blocks.
+    """
+
+    tools: list[str] = []
+    tools_exclude: bool = True
+    blocks: list[str] = []
+    blocks_exclude: bool = True
+
+    # Private: parent permissions for recursion inheritance.
+    # Set only by merged_with_parent(); never exposed in block input schema.
+    _parent: CopilotPermissions | None = PrivateAttr(default=None)
+
+    # ------------------------------------------------------------------
+    # Tool helpers
+    # ------------------------------------------------------------------
+
+    def effective_allowed_tools(self, all_tools: frozenset[str]) -> frozenset[str]:
+        """Compute the set of short tool names that are permitted.
+
+        Args:
+            all_tools: Universe of valid short tool names.
+
+        Returns:
+            Subset of *all_tools* that pass the filter.
+        """
+        if not self.tools:
+            return frozenset(all_tools)
+        tool_set = frozenset(self.tools)
+        if self.tools_exclude:
+            return all_tools - tool_set
+        return all_tools & tool_set
+
+    # ------------------------------------------------------------------
+    # Block helpers
+    # ------------------------------------------------------------------
+
+    def is_block_allowed(self, block_id: str, block_name: str) -> bool:
+        """Return True if the block may be executed under these permissions.
+
+        Checks this instance first, then consults the parent (if any) so
+        the entire inheritance chain is respected.
+        """
+        if not self._check_block_locally(block_id, block_name):
+            return False
+        if self._parent is not None:
+            return self._parent.is_block_allowed(block_id, block_name)
+        return True
+
+    def _check_block_locally(self, block_id: str, block_name: str) -> bool:
+        """Check *only* this instance's block filter (ignores parent)."""
+        if not self.blocks:
+            return True  # No filter → allow all
+        matched = any(
+            _block_matches(identifier, block_id, block_name)
+            for identifier in self.blocks
+        )
+        return not matched if self.blocks_exclude else matched
+
+    # ------------------------------------------------------------------
+    # Recursion / merging
+    # ------------------------------------------------------------------
+
+    def merged_with_parent(
+        self,
+        parent: CopilotPermissions,
+        all_tools: frozenset[str],
+    ) -> CopilotPermissions:
+        """Return a new instance that is at most as permissive as *parent*.
+
+        - Tools: intersection of effective-allowed sets, stored as a whitelist.
+        - Blocks: parent is stored internally; both constraints are applied
+          during :meth:`is_block_allowed`.
+        """
+        merged_tools = self.effective_allowed_tools(
+            all_tools
+        ) & parent.effective_allowed_tools(all_tools)
+        result = CopilotPermissions(
+            tools=sorted(merged_tools),
+            tools_exclude=False,
+            blocks=self.blocks,
+            blocks_exclude=self.blocks_exclude,
+        )
+        result._parent = parent
+        return result
+
+    # ------------------------------------------------------------------
+    # Convenience
+    # ------------------------------------------------------------------
+
+    def is_empty(self) -> bool:
+        """Return True when no filtering is configured (allow-all passthrough)."""
+        return not self.tools and not self.blocks and self._parent is None
+
+
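The tool-filter and merge rules in the class above reduce to set arithmetic. The sketch below works through them with plain functions instead of the pydantic model; `effective_allowed`, `UNIVERSE`, and the example tool lists are hypothetical, chosen only to make the intersection visible.

```python
def effective_allowed(tools, exclude, universe):
    """Same rule as effective_allowed_tools: an empty list means no filter."""
    if not tools:
        return frozenset(universe)
    chosen = frozenset(tools)
    return universe - chosen if exclude else universe & chosen


# Hypothetical, shortened universe of short tool names.
UNIVERSE = frozenset({"run_block", "web_fetch", "bash_exec", "Read"})

# Parent denies bash_exec; child allows only run_block and bash_exec.
parent_allowed = effective_allowed(["bash_exec"], True, UNIVERSE)
child_allowed = effective_allowed(["run_block", "bash_exec"], False, UNIVERSE)

# merged_with_parent intersects the two sets and stores the result as an allow-list.
merged = parent_allowed & child_allowed

# Edge case: disjoint parent and child produce an empty intersection.
disjoint = effective_allowed(["Read"], False, UNIVERSE) & effective_allowed(
    ["Read"], True, UNIVERSE
)

# Feeding that empty list back through the same rule reads as "no filter".
reexpanded = effective_allowed(sorted(disjoint), False, UNIVERSE)
```

Here the child ends up with `run_block` only, since `bash_exec` is denied by the parent. The edge case is worth noting when reading the diff: as `merged_with_parent` is written, an empty intersection is stored as `tools=[]`, which `effective_allowed_tools` treats as no filtering. Whether a caller guards against that is not visible in this excerpt.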
+# ---------------------------------------------------------------------------
+# Validation helpers
+# ---------------------------------------------------------------------------
+
+
+def all_known_tool_names() -> frozenset[str]:
+    """Return all short tool names accepted in *tools*.
+
+    Returns the pre-computed ``ALL_TOOL_NAMES`` set (derived from the
+    ``ToolName`` Literal). On first call, also verifies consistency with
+    the live ``TOOL_REGISTRY``.
+    """
+    _assert_tool_names_consistent()
+    return ALL_TOOL_NAMES
+
+
+def validate_tool_names(tools: list[str]) -> list[str]:
+    """Return entries in *tools* that are not valid tool names.
+
+    Args:
+        tools: List of short tool name strings to validate.
+
+    Returns:
+        List of invalid names (empty if all are valid).
+    """
+    return [t for t in tools if t not in ALL_TOOL_NAMES]
+
+
+_tool_names_checked = False
+
+
+def _assert_tool_names_consistent() -> None:
+    """Verify that ``PLATFORM_TOOL_NAMES`` matches ``TOOL_REGISTRY`` keys.
+
+    Called once lazily (TOOL_REGISTRY has heavy imports). Raises
+    ``AssertionError`` with a helpful diff if they diverge.
+    """
+    global _tool_names_checked
+    if _tool_names_checked:
+        return
+    _tool_names_checked = True
+
+    from backend.copilot.tools import TOOL_REGISTRY
+
+    registry_keys: frozenset[str] = frozenset(TOOL_REGISTRY.keys())
+    declared: frozenset[str] = PLATFORM_TOOL_NAMES
+    if registry_keys != declared:
+        missing = registry_keys - declared
+        extra = declared - registry_keys
+        parts: list[str] = [
+            "PLATFORM_TOOL_NAMES in permissions.py is out of sync with TOOL_REGISTRY."
+        ]
+        if missing:
+            parts.append(f" Missing from PLATFORM_TOOL_NAMES: {sorted(missing)}")
+        if extra:
+            parts.append(f" Extra in PLATFORM_TOOL_NAMES: {sorted(extra)}")
+        parts.append(" Update the ToolName Literal to match.")
+        raise AssertionError("\n".join(parts))
+
+
+async def validate_block_identifiers(
+    identifiers: list[str],
+) -> list[str]:
+    """Resolve each block identifier and return those that could not be matched.
+
+    Args:
+        identifiers: List of block identifiers (name, full UUID, or partial UUID).
+
+    Returns:
+        List of identifiers that matched no known block.
+    """
+    from backend.blocks import get_blocks
+
+    # get_blocks() returns dict[block_id_str, BlockClass]; instantiate once to get names.
+    block_registry = get_blocks()
+    block_info = {bid: cls().name for bid, cls in block_registry.items()}
+    invalid: list[str] = []
+    for ident in identifiers:
+        matched = any(
+            _block_matches(ident, bid, bname) for bid, bname in block_info.items()
+        )
+        if not matched:
+            invalid.append(ident)
+    return invalid
+
+
# ---------------------------------------------------------------------------
|
||||
# SDK tool-list application
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def apply_tool_permissions(
|
||||
permissions: CopilotPermissions,
|
||||
*,
|
||||
use_e2b: bool = False,
|
||||
) -> tuple[list[str], list[str]]:
|
||||
"""Compute (allowed_tools, extra_disallowed) for :class:`ClaudeAgentOptions`.
|
||||
|
||||
Takes the base allowed/disallowed lists from
|
||||
:func:`~backend.copilot.sdk.tool_adapter.get_copilot_tool_names` /
|
||||
:func:`~backend.copilot.sdk.tool_adapter.get_sdk_disallowed_tools` and
|
||||
applies *permissions* on top.
|
||||
|
||||
Returns:
|
||||
``(allowed_tools, extra_disallowed)`` where *allowed_tools* is the
|
||||
possibly-narrowed list to pass to ``ClaudeAgentOptions.allowed_tools``
|
||||
and *extra_disallowed* is the list to pass to
|
||||
``ClaudeAgentOptions.disallowed_tools``.
|
||||
"""
|
||||
from backend.copilot.sdk.tool_adapter import (
|
||||
_READ_TOOL_NAME,
|
||||
MCP_TOOL_PREFIX,
|
||||
get_copilot_tool_names,
|
||||
get_sdk_disallowed_tools,
|
||||
)
|
||||
from backend.copilot.tools import TOOL_REGISTRY
|
||||
|
||||
base_allowed = get_copilot_tool_names(use_e2b=use_e2b)
|
||||
base_disallowed = get_sdk_disallowed_tools(use_e2b=use_e2b)
|
||||
|
||||
if permissions.is_empty():
|
||||
return base_allowed, base_disallowed
|
||||
|
||||
all_tools = all_known_tool_names()
|
||||
effective = permissions.effective_allowed_tools(all_tools)
|
||||
|
||||
# In E2B mode, SDK built-in file tools (Read, Write, Edit, Glob, Grep)
|
||||
# are replaced by MCP equivalents (read_file, write_file, ...).
|
||||
# Map each SDK built-in name to its E2B MCP name so users can use the
|
||||
# familiar names in their permissions and the E2B tools are included.
|
||||
_SDK_TO_E2B: dict[str, str] = {}
|
||||
if use_e2b:
|
||||
from backend.copilot.sdk.e2b_file_tools import E2B_FILE_TOOL_NAMES
|
||||
|
||||
_SDK_TO_E2B = dict(
|
||||
zip(
|
||||
["Read", "Write", "Edit", "Glob", "Grep"],
|
||||
E2B_FILE_TOOL_NAMES,
|
||||
strict=False,
|
||||
)
|
||||
)
|
||||
|
||||
# Build an updated allowed list by mapping short names → SDK names and
|
||||
# keeping only those present in the original base_allowed list.
|
||||
def to_sdk_names(short: str) -> list[str]:
|
||||
names: list[str] = []
|
||||
if short in TOOL_REGISTRY:
|
||||
names.append(f"{MCP_TOOL_PREFIX}{short}")
|
||||
elif short in _SDK_TO_E2B:
|
||||
# E2B mode: map SDK built-in file tool to its MCP equivalent.
|
||||
names.append(f"{MCP_TOOL_PREFIX}{_SDK_TO_E2B[short]}")
|
||||
else:
|
||||
names.append(short) # SDK built-in — used as-is
|
||||
return names
|
||||
|
||||
# short names permitted by permissions
|
||||
permitted_sdk: set[str] = set()
|
||||
for s in effective:
|
||||
permitted_sdk.update(to_sdk_names(s))
|
||||
# Always include the internal Read tool (used by SDK for large/truncated outputs)
|
||||
permitted_sdk.add(f"{MCP_TOOL_PREFIX}{_READ_TOOL_NAME}")
|
||||
|
||||
filtered_allowed = [t for t in base_allowed if t in permitted_sdk]
|
||||
|
||||
# Extra disallowed = tools that were in base_allowed but are now removed
|
||||
removed = set(base_allowed) - set(filtered_allowed)
|
||||
extra_disallowed = list(set(base_disallowed) | removed)
|
||||
|
||||
return filtered_allowed, extra_disallowed
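The filtering step in `apply_tool_permissions` can be tried in isolation. What follows is a minimal, self-contained sketch and not the module's code: `MCP_PREFIX`, `effective_allowed`, and `filter_sdk_tools` are hypothetical names, the prefix value is borrowed from the test fixtures, and the E2B name mapping is left out.

```python
# Hypothetical stand-in for MCP_TOOL_PREFIX; the value mirrors the test fixtures.
MCP_PREFIX = "mcp__copilot__"


def effective_allowed(tools, exclude, all_tools):
    """Resolve a whitelist/blacklist into the set of permitted short names."""
    universe = frozenset(all_tools)
    if not tools:
        # An empty list means "no filtering", whichever mode is selected.
        return universe
    listed = frozenset(tools)
    return universe - listed if exclude else universe & listed


def filter_sdk_tools(base_allowed, base_disallowed, effective, registry, always_keep):
    """Narrow base_allowed to the permitted tools and report what was removed."""
    # Platform tools carry the MCP prefix; SDK built-ins are used as-is.
    permitted = {f"{MCP_PREFIX}{s}" if s in registry else s for s in effective}
    permitted.add(always_keep)  # e.g. the internal Read tool
    filtered = [t for t in base_allowed if t in permitted]
    removed = set(base_allowed) - set(filtered)
    # Sorted here only to make the sketch deterministic (an assumption of this
    # example; the original returns an unordered list).
    return filtered, sorted(set(base_disallowed) | removed)


base = [
    "mcp__copilot__run_block",
    "mcp__copilot__web_fetch",
    "mcp__copilot__Read",
    "Task",
]
effective = effective_allowed(["run_block"], False, {"run_block", "web_fetch", "Task"})
allowed, disallowed = filter_sdk_tools(
    base, ["Bash"], effective, {"run_block", "web_fetch"}, "mcp__copilot__Read"
)
print(allowed)
print(disallowed)
```

Filtering against `base_allowed` rather than building a fresh list keeps the original ordering and guarantees that permissions can only narrow the base set, never add to it.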
|
||||
579
autogpt_platform/backend/backend/copilot/permissions_test.py
Normal file
@@ -0,0 +1,579 @@
"""Tests for CopilotPermissions — tool/block capability filtering."""

from __future__ import annotations

import pytest

from backend.copilot.permissions import (
    ALL_TOOL_NAMES,
    PLATFORM_TOOL_NAMES,
    SDK_BUILTIN_TOOL_NAMES,
    CopilotPermissions,
    _block_matches,
    all_known_tool_names,
    apply_tool_permissions,
    validate_block_identifiers,
    validate_tool_names,
)
from backend.copilot.tools import TOOL_REGISTRY

# ---------------------------------------------------------------------------
# _block_matches
# ---------------------------------------------------------------------------


class TestBlockMatches:
    BLOCK_ID = "c069dc6b-c3ed-4c12-b6e5-d47361e64ce6"
    BLOCK_NAME = "HTTP Request"

    def test_full_uuid_match(self):
        assert _block_matches(self.BLOCK_ID, self.BLOCK_ID, self.BLOCK_NAME)

    def test_full_uuid_case_insensitive(self):
        assert _block_matches(self.BLOCK_ID.upper(), self.BLOCK_ID, self.BLOCK_NAME)

    def test_full_uuid_no_match(self):
        other = "aaaaaaaa-0000-0000-0000-000000000000"
        assert not _block_matches(other, self.BLOCK_ID, self.BLOCK_NAME)

    def test_partial_uuid_match(self):
        assert _block_matches("c069dc6b", self.BLOCK_ID, self.BLOCK_NAME)

    def test_partial_uuid_case_insensitive(self):
        assert _block_matches("C069DC6B", self.BLOCK_ID, self.BLOCK_NAME)

    def test_partial_uuid_no_match(self):
        assert not _block_matches("deadbeef", self.BLOCK_ID, self.BLOCK_NAME)

    def test_name_match(self):
        assert _block_matches("HTTP Request", self.BLOCK_ID, self.BLOCK_NAME)

    def test_name_case_insensitive(self):
        assert _block_matches("http request", self.BLOCK_ID, self.BLOCK_NAME)
        assert _block_matches("HTTP REQUEST", self.BLOCK_ID, self.BLOCK_NAME)

    def test_name_no_match(self):
        assert not _block_matches("Unknown Block", self.BLOCK_ID, self.BLOCK_NAME)

    def test_partial_uuid_not_matching_as_name(self):
        # "c069dc6b" is 8 hex chars → treated as partial UUID, NOT name match
        assert not _block_matches(
            "c069dc6b", "ffffffff-0000-0000-0000-000000000000", "c069dc6b"
        )


# ---------------------------------------------------------------------------
# CopilotPermissions.effective_allowed_tools
# ---------------------------------------------------------------------------


ALL_TOOLS = frozenset(
    ["run_block", "web_fetch", "bash_exec", "find_agent", "Task", "Read"]
)


class TestEffectiveAllowedTools:
    def test_empty_list_allows_all(self):
        perms = CopilotPermissions(tools=[], tools_exclude=True)
        assert perms.effective_allowed_tools(ALL_TOOLS) == ALL_TOOLS

    def test_empty_whitelist_allows_all(self):
        # edge: tools_exclude=False but empty list → allow all
        perms = CopilotPermissions(tools=[], tools_exclude=False)
        assert perms.effective_allowed_tools(ALL_TOOLS) == ALL_TOOLS

    def test_blacklist_removes_listed(self):
        perms = CopilotPermissions(tools=["bash_exec", "web_fetch"], tools_exclude=True)
        result = perms.effective_allowed_tools(ALL_TOOLS)
        assert "bash_exec" not in result
        assert "web_fetch" not in result
        assert "run_block" in result
        assert "Task" in result

    def test_whitelist_keeps_only_listed(self):
        perms = CopilotPermissions(tools=["run_block", "Task"], tools_exclude=False)
        result = perms.effective_allowed_tools(ALL_TOOLS)
        assert result == frozenset(["run_block", "Task"])

    def test_whitelist_unknown_tool_yields_empty(self):
        perms = CopilotPermissions(tools=["nonexistent"], tools_exclude=False)
        result = perms.effective_allowed_tools(ALL_TOOLS)
        assert result == frozenset()

    def test_blacklist_unknown_tool_ignored(self):
        perms = CopilotPermissions(tools=["nonexistent"], tools_exclude=True)
        result = perms.effective_allowed_tools(ALL_TOOLS)
        assert result == ALL_TOOLS


# ---------------------------------------------------------------------------
# CopilotPermissions.is_block_allowed
# ---------------------------------------------------------------------------


BLOCK_ID = "c069dc6b-c3ed-4c12-b6e5-d47361e64ce6"
BLOCK_NAME = "HTTP Request"


class TestIsBlockAllowed:
    def test_empty_allows_everything(self):
        perms = CopilotPermissions(blocks=[], blocks_exclude=True)
        assert perms.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_blacklist_blocks_listed(self):
        perms = CopilotPermissions(blocks=["HTTP Request"], blocks_exclude=True)
        assert not perms.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_blacklist_allows_unlisted(self):
        perms = CopilotPermissions(blocks=["Other Block"], blocks_exclude=True)
        assert perms.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_whitelist_allows_listed(self):
        perms = CopilotPermissions(blocks=["HTTP Request"], blocks_exclude=False)
        assert perms.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_whitelist_blocks_unlisted(self):
        perms = CopilotPermissions(blocks=["Other Block"], blocks_exclude=False)
        assert not perms.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_partial_uuid_blacklist(self):
        perms = CopilotPermissions(blocks=["c069dc6b"], blocks_exclude=True)
        assert not perms.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_full_uuid_whitelist(self):
        perms = CopilotPermissions(blocks=[BLOCK_ID], blocks_exclude=False)
        assert perms.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_parent_blocks_when_child_allows(self):
        parent = CopilotPermissions(blocks=["HTTP Request"], blocks_exclude=True)
        child = CopilotPermissions(blocks=[], blocks_exclude=True)
        child._parent = parent
        assert not child.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_parent_allows_when_child_blocks(self):
        parent = CopilotPermissions(blocks=[], blocks_exclude=True)
        child = CopilotPermissions(blocks=["HTTP Request"], blocks_exclude=True)
        child._parent = parent
        assert not child.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_both_must_allow(self):
        parent = CopilotPermissions(blocks=["HTTP Request"], blocks_exclude=False)
        child = CopilotPermissions(blocks=["HTTP Request"], blocks_exclude=False)
        child._parent = parent
        assert child.is_block_allowed(BLOCK_ID, BLOCK_NAME)

    def test_grandparent_blocks_propagate(self):
        grandparent = CopilotPermissions(blocks=["HTTP Request"], blocks_exclude=True)
        parent = CopilotPermissions(blocks=[], blocks_exclude=True)
        parent._parent = grandparent
        child = CopilotPermissions(blocks=[], blocks_exclude=True)
        child._parent = parent
        assert not child.is_block_allowed(BLOCK_ID, BLOCK_NAME)


# ---------------------------------------------------------------------------
# CopilotPermissions.merged_with_parent
# ---------------------------------------------------------------------------


class TestMergedWithParent:
    def test_tool_intersection(self):
        all_t = frozenset(["run_block", "web_fetch", "bash_exec"])
        parent = CopilotPermissions(tools=["bash_exec"], tools_exclude=True)
        child = CopilotPermissions(tools=["web_fetch"], tools_exclude=True)
        merged = child.merged_with_parent(parent, all_t)
        effective = merged.effective_allowed_tools(all_t)
        assert "bash_exec" not in effective
        assert "web_fetch" not in effective
        assert "run_block" in effective

    def test_parent_whitelist_narrows_child(self):
        all_t = frozenset(["run_block", "web_fetch", "bash_exec"])
        parent = CopilotPermissions(tools=["run_block"], tools_exclude=False)
        child = CopilotPermissions(tools=[], tools_exclude=True)  # allow all
        merged = child.merged_with_parent(parent, all_t)
        effective = merged.effective_allowed_tools(all_t)
        assert effective == frozenset(["run_block"])

    def test_child_cannot_expand_parent_whitelist(self):
        all_t = frozenset(["run_block", "web_fetch", "bash_exec"])
        parent = CopilotPermissions(tools=["run_block"], tools_exclude=False)
        child = CopilotPermissions(
            tools=["run_block", "bash_exec"], tools_exclude=False
        )
        merged = child.merged_with_parent(parent, all_t)
        effective = merged.effective_allowed_tools(all_t)
        # bash_exec was not in parent's whitelist → must not appear
        assert "bash_exec" not in effective
        assert "run_block" in effective

    def test_merged_stored_as_whitelist(self):
        all_t = frozenset(["run_block", "web_fetch"])
        parent = CopilotPermissions(tools=[], tools_exclude=True)
        child = CopilotPermissions(tools=[], tools_exclude=True)
        merged = child.merged_with_parent(parent, all_t)
        assert not merged.tools_exclude  # stored as whitelist
        assert set(merged.tools) == {"run_block", "web_fetch"}

    def test_block_parent_stored(self):
        all_t = frozenset(["run_block"])
        parent = CopilotPermissions(blocks=["HTTP Request"], blocks_exclude=True)
        child = CopilotPermissions(blocks=[], blocks_exclude=True)
        merged = child.merged_with_parent(parent, all_t)
        # Parent restriction is preserved via _parent
        assert not merged.is_block_allowed(BLOCK_ID, BLOCK_NAME)


# ---------------------------------------------------------------------------
# CopilotPermissions.is_empty
# ---------------------------------------------------------------------------


class TestIsEmpty:
    def test_default_is_empty(self):
        assert CopilotPermissions().is_empty()

    def test_with_tools_not_empty(self):
        assert not CopilotPermissions(tools=["bash_exec"]).is_empty()

    def test_with_blocks_not_empty(self):
        assert not CopilotPermissions(blocks=["HTTP Request"]).is_empty()

    def test_with_parent_not_empty(self):
        perms = CopilotPermissions()
        perms._parent = CopilotPermissions(tools=["bash_exec"])
        assert not perms.is_empty()


# ---------------------------------------------------------------------------
# validate_tool_names
# ---------------------------------------------------------------------------


class TestValidateToolNames:
    def test_valid_registry_tool(self):
        assert validate_tool_names(["run_block", "web_fetch"]) == []

    def test_valid_sdk_builtin(self):
        assert validate_tool_names(["Read", "Task", "WebSearch"]) == []

    def test_invalid_tool(self):
        result = validate_tool_names(["nonexistent_tool"])
        assert "nonexistent_tool" in result

    def test_mixed(self):
        result = validate_tool_names(["run_block", "fake_tool"])
        assert "fake_tool" in result
        assert "run_block" not in result

    def test_empty_list(self):
        assert validate_tool_names([]) == []


# ---------------------------------------------------------------------------
# validate_block_identifiers (async)
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
class TestValidateBlockIdentifiers:
    async def test_empty_list(self):
        result = await validate_block_identifiers([])
        assert result == []

    async def test_valid_full_uuid(self, mocker):
        mock_block = mocker.MagicMock()
        mock_block.return_value.name = "HTTP Request"
        mocker.patch(
            "backend.blocks.get_blocks",
            return_value={"c069dc6b-c3ed-4c12-b6e5-d47361e64ce6": mock_block},
        )
        result = await validate_block_identifiers(
            ["c069dc6b-c3ed-4c12-b6e5-d47361e64ce6"]
        )
        assert result == []

    async def test_invalid_identifier(self, mocker):
        mock_block = mocker.MagicMock()
        mock_block.return_value.name = "HTTP Request"
        mocker.patch(
            "backend.blocks.get_blocks",
            return_value={"c069dc6b-c3ed-4c12-b6e5-d47361e64ce6": mock_block},
        )
        result = await validate_block_identifiers(["totally_unknown"])
        assert "totally_unknown" in result

    async def test_partial_uuid_match(self, mocker):
        mock_block = mocker.MagicMock()
        mock_block.return_value.name = "HTTP Request"
        mocker.patch(
            "backend.blocks.get_blocks",
            return_value={"c069dc6b-c3ed-4c12-b6e5-d47361e64ce6": mock_block},
        )
        result = await validate_block_identifiers(["c069dc6b"])
        assert result == []

    async def test_name_match(self, mocker):
        mock_block = mocker.MagicMock()
        mock_block.return_value.name = "HTTP Request"
        mocker.patch(
            "backend.blocks.get_blocks",
            return_value={"c069dc6b-c3ed-4c12-b6e5-d47361e64ce6": mock_block},
        )
        result = await validate_block_identifiers(["http request"])
        assert result == []


# ---------------------------------------------------------------------------
# apply_tool_permissions
# ---------------------------------------------------------------------------


class TestApplyToolPermissions:
    def test_empty_permissions_returns_base_unchanged(self, mocker):
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_copilot_tool_names",
            return_value=["mcp__copilot__run_block", "mcp__copilot__web_fetch", "Task"],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_sdk_disallowed_tools",
            return_value=["Bash"],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
            {"run_block": object(), "web_fetch": object()},
        )
        perms = CopilotPermissions()
        allowed, disallowed = apply_tool_permissions(perms, use_e2b=False)
        assert "mcp__copilot__run_block" in allowed
        assert "mcp__copilot__web_fetch" in allowed

    def test_blacklist_removes_tool(self, mocker):
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_copilot_tool_names",
            return_value=[
                "mcp__copilot__run_block",
                "mcp__copilot__web_fetch",
                "mcp__copilot__bash_exec",
                "Task",
            ],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_sdk_disallowed_tools",
            return_value=["Bash"],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
            {
                "run_block": object(),
                "web_fetch": object(),
                "bash_exec": object(),
            },
        )
        mocker.patch(
            "backend.copilot.permissions.all_known_tool_names",
            return_value=frozenset(["run_block", "web_fetch", "bash_exec", "Task"]),
        )
        perms = CopilotPermissions(tools=["bash_exec"], tools_exclude=True)
        allowed, _ = apply_tool_permissions(perms, use_e2b=False)
        assert "mcp__copilot__bash_exec" not in allowed
        assert "mcp__copilot__run_block" in allowed

    def test_whitelist_keeps_only_listed(self, mocker):
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_copilot_tool_names",
            return_value=[
                "mcp__copilot__run_block",
                "mcp__copilot__web_fetch",
                "Task",
                "WebSearch",
            ],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_sdk_disallowed_tools",
            return_value=["Bash"],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
            {"run_block": object(), "web_fetch": object()},
        )
        mocker.patch(
            "backend.copilot.permissions.all_known_tool_names",
            return_value=frozenset(["run_block", "web_fetch", "Task", "WebSearch"]),
        )
        perms = CopilotPermissions(tools=["run_block"], tools_exclude=False)
        allowed, _ = apply_tool_permissions(perms, use_e2b=False)
        assert "mcp__copilot__run_block" in allowed
        assert "mcp__copilot__web_fetch" not in allowed
        assert "Task" not in allowed

    def test_read_tool_always_included_even_when_blacklisted(self, mocker):
        """mcp__copilot__Read must stay in allowed even if Read is explicitly blacklisted."""
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_copilot_tool_names",
            return_value=[
                "mcp__copilot__run_block",
                "mcp__copilot__Read",
                "Task",
            ],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_sdk_disallowed_tools",
            return_value=[],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
            {"run_block": object()},
        )
        mocker.patch(
            "backend.copilot.permissions.all_known_tool_names",
            return_value=frozenset(["run_block", "Read", "Task"]),
        )
        # Explicitly blacklist Read
        perms = CopilotPermissions(tools=["Read"], tools_exclude=True)
        allowed, _ = apply_tool_permissions(perms, use_e2b=False)
        assert "mcp__copilot__Read" in allowed  # always preserved for SDK internals
        assert "mcp__copilot__run_block" in allowed
        assert "Task" in allowed

    def test_read_tool_always_included_with_narrow_whitelist(self, mocker):
        """mcp__copilot__Read must stay in allowed even when not in a whitelist."""
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_copilot_tool_names",
            return_value=[
                "mcp__copilot__run_block",
                "mcp__copilot__Read",
                "Task",
            ],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_sdk_disallowed_tools",
            return_value=[],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
            {"run_block": object()},
        )
        mocker.patch(
            "backend.copilot.permissions.all_known_tool_names",
            return_value=frozenset(["run_block", "Read", "Task"]),
        )
        # Whitelist only run_block — Read not listed
        perms = CopilotPermissions(tools=["run_block"], tools_exclude=False)
        allowed, _ = apply_tool_permissions(perms, use_e2b=False)
        assert "mcp__copilot__Read" in allowed  # always preserved for SDK internals
        assert "mcp__copilot__run_block" in allowed

    def test_e2b_file_tools_included_when_sdk_builtin_whitelisted(self, mocker):
        """In E2B mode, whitelisting 'Read' must include mcp__copilot__read_file."""
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_copilot_tool_names",
            return_value=[
                "mcp__copilot__run_block",
                "mcp__copilot__Read",
                "mcp__copilot__read_file",
                "mcp__copilot__write_file",
                "Task",
            ],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_sdk_disallowed_tools",
            return_value=["Bash", "Read", "Write", "Edit", "Glob", "Grep"],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
            {"run_block": object()},
        )
        mocker.patch(
            "backend.copilot.permissions.all_known_tool_names",
            return_value=frozenset(["run_block", "Read", "Write", "Task"]),
        )
        mocker.patch(
            "backend.copilot.sdk.e2b_file_tools.E2B_FILE_TOOL_NAMES",
            ["read_file", "write_file", "edit_file", "glob", "grep"],
        )
        # Whitelist Read and run_block — E2B read_file should be included
        perms = CopilotPermissions(tools=["Read", "run_block"], tools_exclude=False)
        allowed, _ = apply_tool_permissions(perms, use_e2b=True)
        assert "mcp__copilot__read_file" in allowed
        assert "mcp__copilot__run_block" in allowed
        # Write not whitelisted — write_file should NOT be included
        assert "mcp__copilot__write_file" not in allowed

    def test_e2b_file_tools_excluded_when_sdk_builtin_blacklisted(self, mocker):
        """In E2B mode, blacklisting 'Read' must also remove mcp__copilot__read_file."""
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_copilot_tool_names",
            return_value=[
                "mcp__copilot__run_block",
                "mcp__copilot__Read",
                "mcp__copilot__read_file",
                "Task",
            ],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.get_sdk_disallowed_tools",
            return_value=["Bash", "Read", "Write", "Edit", "Glob", "Grep"],
        )
        mocker.patch(
            "backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
            {"run_block": object()},
        )
        mocker.patch(
            "backend.copilot.permissions.all_known_tool_names",
            return_value=frozenset(["run_block", "Read", "Task"]),
        )
        mocker.patch(
            "backend.copilot.sdk.e2b_file_tools.E2B_FILE_TOOL_NAMES",
            ["read_file", "write_file", "edit_file", "glob", "grep"],
        )
        # Blacklist Read — E2B read_file should also be removed
        perms = CopilotPermissions(tools=["Read"], tools_exclude=True)
        allowed, _ = apply_tool_permissions(perms, use_e2b=True)
        assert "mcp__copilot__read_file" not in allowed
        assert "mcp__copilot__run_block" in allowed
        # mcp__copilot__Read is always preserved for SDK internals
        assert "mcp__copilot__Read" in allowed


# ---------------------------------------------------------------------------
# SDK_BUILTIN_TOOL_NAMES sanity check
# ---------------------------------------------------------------------------


class TestSdkBuiltinToolNames:
    def test_expected_builtins_present(self):
        expected = {
            "Read",
            "Write",
            "Edit",
            "Glob",
            "Grep",
            "Task",
            "WebSearch",
            "TodoWrite",
        }
        assert expected.issubset(SDK_BUILTIN_TOOL_NAMES)

    def test_platform_names_match_tool_registry(self):
        """PLATFORM_TOOL_NAMES (derived from ToolName Literal) must match TOOL_REGISTRY keys."""
        registry_keys = frozenset(TOOL_REGISTRY.keys())
        assert PLATFORM_TOOL_NAMES == registry_keys, (
            f"ToolName Literal is out of sync with TOOL_REGISTRY. "
            f"Missing: {registry_keys - PLATFORM_TOOL_NAMES}, "
            f"Extra: {PLATFORM_TOOL_NAMES - registry_keys}"
        )

    def test_all_tool_names_is_union(self):
        """ALL_TOOL_NAMES must equal PLATFORM_TOOL_NAMES | SDK_BUILTIN_TOOL_NAMES."""
        assert ALL_TOOL_NAMES == PLATFORM_TOOL_NAMES | SDK_BUILTIN_TOOL_NAMES

    def test_no_overlap_between_platform_and_sdk(self):
        """Platform and SDK built-in names must not overlap."""
        assert PLATFORM_TOOL_NAMES.isdisjoint(SDK_BUILTIN_TOOL_NAMES)

    def test_known_tools_includes_registry_and_builtins(self):
        known = all_known_tool_names()
        assert "run_block" in known
        assert "Read" in known
        assert "Task" in known
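The block-matching and parent-chain behaviour that these tests pin down can be summarised in a short standalone sketch. It is inferred from the test expectations only and is not the module's implementation: `block_matches` and `BlockPolicy` are hypothetical names, and treating an 8-hex-character identifier as a prefix of the block ID is an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

_FULL_UUID = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")
_PARTIAL_UUID = re.compile(r"[0-9a-f]{8}")


def block_matches(identifier: str, block_id: str, block_name: str) -> bool:
    """Match by full UUID, 8-hex partial UUID (assumed prefix), or name."""
    ident = identifier.strip().lower()
    if _FULL_UUID.fullmatch(ident):
        return ident == block_id.lower()
    if _PARTIAL_UUID.fullmatch(ident):
        # An 8-hex identifier is never compared against the block name.
        return block_id.lower().startswith(ident)
    return ident == block_name.lower()


@dataclass
class BlockPolicy:
    """Hypothetical stand-in for the block half of CopilotPermissions."""

    blocks: list = field(default_factory=list)
    exclude: bool = True  # True = blacklist, False = whitelist
    parent: Optional["BlockPolicy"] = None

    def is_allowed(self, block_id: str, block_name: str) -> bool:
        if self.blocks:
            listed = any(block_matches(i, block_id, block_name) for i in self.blocks)
            # Blacklist denies listed blocks; whitelist denies unlisted ones.
            if listed == self.exclude:
                return False
        # Every ancestor must also allow the block.
        return self.parent is None or self.parent.is_allowed(block_id, block_name)


BLOCK_ID = "c069dc6b-c3ed-4c12-b6e5-d47361e64ce6"
grandparent = BlockPolicy(blocks=["HTTP Request"], exclude=True)
child = BlockPolicy(parent=BlockPolicy(parent=grandparent))
print(child.is_allowed(BLOCK_ID, "HTTP Request"))
```

Walking the parent chain at check time, rather than flattening it on merge, is one way to reproduce the "grandparent restrictions propagate" behaviour the tests describe.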
|
||||
@@ -12,34 +12,18 @@ from backend.copilot.tools import TOOL_REGISTRY
# Shared technical notes that apply to both SDK and baseline modes
_SHARED_TOOL_NOTES = f"""\

### Sharing files with the user
After saving a file to the persistent workspace with `write_workspace_file`,
share it with the user by embedding the `download_url` from the response in
your message as a Markdown link or image:
### Sharing files
After `write_workspace_file`, embed the `download_url` in Markdown:
- File: `[report.csv](workspace://file_id#text/csv)`
- Image: ``
- Video: ``

- **Any file** — shows as a clickable download link:
`[report.csv](workspace://file_id#text/csv)`
- **Image** — renders inline in chat:
``
- **Video** — renders inline in chat with player controls:
``

The `download_url` field in the `write_workspace_file` response is already
in the correct format — paste it directly after the `(` in the Markdown.

### Passing file content to tools — @@agptfile: references
Instead of copying large file contents into a tool argument, pass a file
reference and the platform will load the content for you.

Syntax: `@@agptfile:<uri>[<start>-<end>]`

- `<uri>` **must** start with `workspace://` or `/` (absolute path):
- `workspace://<file_id>` — workspace file by ID
- `workspace:///<path>` — workspace file by virtual path
- `/absolute/local/path` — ephemeral or sdk_cwd file
- E2B sandbox absolute path (e.g. `/home/user/script.py`)
- `[<start>-<end>]` is an optional 1-indexed inclusive line range.
- URIs that do not start with `workspace://` or `/` are **not** expanded.
### File references — @@agptfile:
Pass large file content to tools by reference: `@@agptfile:<uri>[<start>-<end>]`
- `workspace://<file_id>` or `workspace:///<path>` — workspace files
- `/absolute/path` — local/sandbox files
- `[start-end]` — optional 1-indexed line range
- Multiple refs per argument supported. Only `workspace://` and absolute paths are expanded.

Examples:
```
@@ -50,21 +34,9 @@ Examples:
@@agptfile:/home/user/script.py
```

You can embed a reference inside any string argument, or use it as the entire
value. Multiple references in one argument are all expanded.
**Structured data**: When the entire argument is a single file reference, the platform auto-parses by extension/MIME. Supported: JSON, JSONL, CSV, TSV, YAML, TOML, Parquet, Excel (.xlsx only; legacy `.xls` is NOT supported). Unrecognised formats return plain string.

**Structured data**: When the **entire** argument value is a single file
reference (no surrounding text), the platform automatically parses the file
content based on its extension or MIME type. Supported formats: JSON, JSONL,
CSV, TSV, YAML, TOML, Parquet, and Excel (.xlsx — first sheet only).
For example, pass `@@agptfile:workspace://<id>` where the file is a `.csv` and
the rows will be parsed into `list[list[str]]` automatically. If the format is
unrecognised or parsing fails, the content is returned as a plain string.
Legacy `.xls` files are **not** supported — only the modern `.xlsx` format.

**Type coercion**: The platform also coerces expanded values to match the
block's expected input types. For example, if a block expects `list[list[str]]`
and the expanded value is a JSON string, it will be parsed into the correct type.
**Type coercion**: The platform auto-coerces expanded string values to match block input types (e.g. JSON string → `list[list[str]]`).
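
The reference syntax described above can be illustrated with a small parser. This is only a sketch of the documented grammar, not the platform's expansion code: `FileRef`, `find_refs`, and `slice_lines` are hypothetical names, and the regular expression assumes a URI ends at whitespace or a bracket.

```python
import re
from typing import List, NamedTuple, Optional


class FileRef(NamedTuple):
    uri: str
    start: Optional[int]  # 1-indexed, inclusive
    end: Optional[int]    # 1-indexed, inclusive


# Only URIs beginning with workspace:// or / are recognised; the
# [<start>-<end>] line range is optional.
_REF = re.compile(r"@@agptfile:((?:workspace://|/)[^\s\[\]]+)(?:\[(\d+)-(\d+)\])?")


def find_refs(text: str) -> List[FileRef]:
    """Return every file reference embedded in a string argument."""
    refs = []
    for m in _REF.finditer(text):
        uri, start, end = m.groups()
        refs.append(FileRef(uri, int(start) if start else None, int(end) if end else None))
    return refs


def slice_lines(content: str, start: Optional[int], end: Optional[int]) -> str:
    """Apply a 1-indexed inclusive line range; no range returns the content whole."""
    if start is None or end is None:
        return content
    lines = content.splitlines(keepends=True)
    return "".join(lines[start - 1:end])


refs = find_refs("diff @@agptfile:workspace://abc123[1-3] with @@agptfile:/home/user/script.py")
print(refs)
print(find_refs("@@agptfile:http://example.com/x"))
```

The second call prints an empty list, matching the rule that URIs outside `workspace://` and absolute paths are left unexpanded.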
|
||||
|
||||
### Media file inputs (format: "file")
|
||||
Some block inputs accept media files — their schema shows `"format": "file"`.
|
||||
@@ -91,6 +63,50 @@ Example — committing an image file to GitHub:
|
||||
}}
|
||||
```
|
||||
|
||||
### Writing large files — CRITICAL
|
||||
**Never write an entire large document in a single tool call.** When the
|
||||
content you want to write exceeds ~2000 words, the tool call's output token
|
||||
limit will silently truncate the arguments, producing an empty `{{}}` input
|
||||
that fails repeatedly.
|
||||
|
||||
**Preferred: compose from file references.** If the data is already in
|
||||
files (tool outputs, workspace files), compose the report in one call
|
||||
using `@@agptfile:` references — the system expands them inline:
|
||||
|
||||
```bash
|
||||
cat > report.md << 'EOF'
|
||||
# Research Report
|
||||
## Data from web research
|
||||
@@agptfile:/home/user/web_results.txt
|
||||
## Block execution output
|
||||
@@agptfile:workspace://<file_id>
|
||||
## Conclusion
|
||||
<brief synthesis>
|
||||
EOF
|
||||
```
|
||||
|
||||
**Fallback: write section-by-section.** When you must generate content
|
||||
from conversation context (no files to reference), split into multiple
|
||||
`bash_exec` calls — one section per call:
|
||||
|
||||
```bash
|
||||
cat > report.md << 'EOF'
|
||||
# Section 1
|
||||
<content from your earlier tool call results>
|
||||
EOF
|
||||
```
|
||||
```bash
|
||||
cat >> report.md << 'EOF'
|
||||
# Section 2
|
||||
<content from your earlier tool call results>
|
||||
EOF
|
||||
```
|
||||
Use `cat >` for the first chunk and `cat >>` to append subsequent chunks.
|
||||
Do not re-fetch or re-generate data you already have from prior tool calls.
|
||||
|
||||
After building the file, reference it with `@@agptfile:` in other tools:
|
||||
`@@agptfile:/home/user/report.md`
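
The same truncate-then-append pattern can be sketched in Python for readers who want to see it outside the shell. This is only an illustration: `write_in_chunks` is a hypothetical helper, not something the platform provides.

```python
from pathlib import Path


def write_in_chunks(path: Path, sections: list) -> None:
    """Write *sections* to *path* one at a time.

    The first section truncates the file (like ``cat >``); every later
    section is appended (like ``cat >>``).
    """
    for i, section in enumerate(sections):
        mode = "w" if i == 0 else "a"
        with path.open(mode, encoding="utf-8") as fh:
            fh.write(section)
```

Each section stays small enough to fit in one call, and any stale content in the target file is discarded by the first write.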
|
||||
|
||||
### Sub-agent tasks
|
||||
- When using the Task tool, NEVER set `run_in_background` to true.
|
||||
All tasks must run in the foreground.
|
||||
@@ -166,17 +182,12 @@ def _build_storage_supplement(
|
||||
|
||||
## Tool notes
|
||||
|
||||
### Shell commands
|
||||
- The SDK built-in Bash tool is NOT available. Use the `bash_exec` MCP tool
|
||||
for shell commands — it runs {sandbox_type}.
|
||||
|
||||
### Working directory
|
||||
- Your working directory is: `{working_dir}`
|
||||
- All SDK file tools AND `bash_exec` operate on the same filesystem
|
||||
- Use relative paths or absolute paths under `{working_dir}` for all file operations
|
||||
### Shell & filesystem
|
||||
- The SDK built-in Bash tool is NOT available. Use `bash_exec` for shell commands ({sandbox_type}). Working dir: `{working_dir}`
|
||||
- SDK file tools (Read/Write/Edit/Glob/Grep) and `bash_exec` share one filesystem — use relative or absolute paths under this dir.
|
||||
- `read_workspace_file`/`write_workspace_file` operate on **persistent cloud workspace storage** (separate from the working dir).
|
||||
|
||||
### Two storage systems — CRITICAL to understand
|
||||
|
||||
1. **{storage_system_1_name}** (`{working_dir}`):
|
||||
{characteristics}
|
||||
{persistence}
|
||||
|
||||
@@ -143,11 +143,11 @@ To use an MCP (Model Context Protocol) tool as a node in the agent:
|
||||
tool_arguments.
|
||||
6. Output: `result` (the tool's return value) and `error` (error message)
|
||||
|
||||
### Using SmartDecisionMakerBlock (AI Orchestrator with Agent Mode)
|
||||
### Using OrchestratorBlock (AI Orchestrator with Agent Mode)
|
||||
|
||||
To create an agent where AI autonomously decides which tools or sub-agents to
|
||||
call in a loop until the task is complete:
|
||||
1. Create a `SmartDecisionMakerBlock` node
|
||||
1. Create an `OrchestratorBlock` node
|
||||
(ID: `3b191d9f-356f-482d-8238-ba04b6d18381`)
|
||||
2. Set `input_default`:
|
||||
- `agent_mode_max_iterations`: Choose based on task complexity:
|
||||
@@ -169,8 +169,8 @@ call in a loop until the task is complete:
|
||||
3. Wire the `prompt` input from an `AgentInputBlock` (the user's task)
|
||||
4. Create downstream tool blocks — regular blocks **or** `AgentExecutorBlock`
|
||||
nodes that call sub-agents
|
||||
5. Link each tool to the SmartDecisionMaker: set `source_name: "tools"` on
|
||||
the SmartDecisionMaker side and `sink_name: <input_field>` on each tool
|
||||
5. Link each tool to the Orchestrator: set `source_name: "tools"` on
|
||||
the Orchestrator side and `sink_name: <input_field>` on each tool
|
||||
block's input. Create one link per input field the tool needs.
|
||||
6. Wire the `finished` output to an `AgentOutputBlock` for the final result
|
||||
7. Credentials (LLM API key) are configured by the user in the platform UI
|
||||
@@ -178,35 +178,35 @@ call in a loop until the task is complete:
|
||||
|
||||
**Example — Orchestrator calling two sub-agents:**
|
||||
- Node 1: `AgentInputBlock` (input_default: `{"name": "task"}`)
|
||||
- Node 2: `SmartDecisionMakerBlock` (input_default:
|
||||
- Node 2: `OrchestratorBlock` (input_default:
|
||||
`{"agent_mode_max_iterations": 10, "conversation_compaction": true}`)
|
||||
- Node 3: `AgentExecutorBlock` (sub-agent A — set `graph_id`, `graph_version`,
|
||||
`input_schema`, `output_schema` from library agent)
|
||||
- Node 4: `AgentExecutorBlock` (sub-agent B — same pattern)
|
||||
- Node 5: `AgentOutputBlock` (input_default: `{"name": "result"}`)
|
||||
- Links:
|
||||
- Input→SDM: `source_name: "result"`, `sink_name: "prompt"`
|
||||
- SDM→Agent A (per input field): `source_name: "tools"`,
|
||||
- Input→Orchestrator: `source_name: "result"`, `sink_name: "prompt"`
|
||||
- Orchestrator→Agent A (per input field): `source_name: "tools"`,
|
||||
`sink_name: "<agent_a_input_field>"`
|
||||
- SDM→Agent B (per input field): `source_name: "tools"`,
|
||||
- Orchestrator→Agent B (per input field): `source_name: "tools"`,
|
||||
`sink_name: "<agent_b_input_field>"`
|
||||
- SDM→Output: `source_name: "finished"`, `sink_name: "value"`
|
||||
- Orchestrator→Output: `source_name: "finished"`, `sink_name: "value"`
|
||||
|
||||
**Example — Orchestrator calling regular blocks as tools:**
|
||||
- Node 1: `AgentInputBlock` (input_default: `{"name": "task"}`)
|
||||
- Node 2: `SmartDecisionMakerBlock` (input_default:
|
||||
- Node 2: `OrchestratorBlock` (input_default:
|
||||
`{"agent_mode_max_iterations": 5, "conversation_compaction": true}`)
|
||||
- Node 3: `GetWebpageBlock` (regular block — the AI calls it as a tool)
|
||||
- Node 4: `AITextGeneratorBlock` (another regular block as a tool)
|
||||
- Node 5: `AgentOutputBlock` (input_default: `{"name": "result"}`)
|
||||
- Links:
|
||||
- Input→SDM: `source_name: "result"`, `sink_name: "prompt"`
|
||||
- SDM→GetWebpage: `source_name: "tools"`, `sink_name: "url"`
|
||||
- SDM→AITextGenerator: `source_name: "tools"`, `sink_name: "prompt"`
|
||||
- SDM→Output: `source_name: "finished"`, `sink_name: "value"`
|
||||
- Input→Orchestrator: `source_name: "result"`, `sink_name: "prompt"`
|
||||
- Orchestrator→GetWebpage: `source_name: "tools"`, `sink_name: "url"`
|
||||
- Orchestrator→AITextGenerator: `source_name: "tools"`, `sink_name: "prompt"`
|
||||
- Orchestrator→Output: `source_name: "finished"`, `sink_name: "value"`
|
||||
|
||||
Regular blocks work exactly like sub-agents as tools — wire each input
|
||||
field from `source_name: "tools"` on the SmartDecisionMaker side.
|
||||
field from `source_name: "tools"` on the Orchestrator side.
|
||||
|
||||
### Example: Simple AI Text Processor
|
||||
|
||||
|
||||
@@ -7,7 +7,20 @@ without implementing their own event loop.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any
|
||||
from typing import TYPE_CHECKING, Any
|
||||
|
||||
from backend.copilot.response_model import (
|
||||
StreamError,
|
||||
StreamTextDelta,
|
||||
StreamToolInputAvailable,
|
||||
StreamToolOutputAvailable,
|
||||
StreamUsage,
|
||||
)
|
||||
|
||||
from .service import stream_chat_completion_sdk
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from backend.copilot.permissions import CopilotPermissions
|
||||
|
||||
|
||||
class CopilotResult:
|
||||
@@ -39,6 +52,7 @@ async def collect_copilot_response(
|
||||
message: str,
|
||||
user_id: str,
|
||||
is_user_message: bool = True,
|
||||
permissions: "CopilotPermissions | None" = None,
|
||||
) -> CopilotResult:
|
||||
"""Consume :func:`stream_chat_completion_sdk` and return aggregated results.
|
||||
|
||||
@@ -53,6 +67,8 @@ async def collect_copilot_response(
|
||||
message: The user message / prompt.
|
||||
user_id: Authenticated user ID.
|
||||
is_user_message: Whether this is a user-initiated message.
|
||||
permissions: Optional capability filter. When provided, restricts
|
||||
which tools and blocks the copilot may use during this execution.
|
||||
|
||||
Returns:
|
||||
A :class:`CopilotResult` with the aggregated response text,
|
||||
@@ -61,16 +77,6 @@ async def collect_copilot_response(
|
||||
Raises:
|
||||
RuntimeError: If the stream yields a ``StreamError`` event.
|
||||
"""
|
||||
from backend.copilot.response_model import (
|
||||
StreamError,
|
||||
StreamTextDelta,
|
||||
StreamToolInputAvailable,
|
||||
StreamToolOutputAvailable,
|
||||
StreamUsage,
|
||||
)
|
||||
|
||||
from .service import stream_chat_completion_sdk
|
||||
|
||||
result = CopilotResult()
|
||||
response_parts: list[str] = []
|
||||
tool_calls_by_id: dict[str, dict[str, Any]] = {}
|
||||
@@ -80,6 +86,7 @@ async def collect_copilot_response(
|
||||
message=message,
|
||||
is_user_message=is_user_message,
|
||||
user_id=user_id,
|
||||
permissions=permissions,
|
||||
):
|
||||
if isinstance(event, StreamTextDelta):
|
||||
response_parts.append(event.delta)
|
||||
|
||||
@@ -2,19 +2,20 @@
|
||||
|
||||
import asyncio
|
||||
import base64
|
||||
import functools
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import shutil
|
||||
import subprocess
|
||||
import sys
|
||||
import time
|
||||
import uuid
|
||||
from collections.abc import AsyncGenerator, AsyncIterator
|
||||
from dataclasses import dataclass
|
||||
from typing import Any, NamedTuple, cast
|
||||
from typing import TYPE_CHECKING, Any, NamedTuple, cast
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from backend.copilot.permissions import CopilotPermissions
|
||||
|
||||
from claude_agent_sdk import (
|
||||
AssistantMessage,
|
||||
@@ -31,6 +32,7 @@ from langsmith.integrations.claude_agent_sdk import configure_claude_agent_sdk
|
||||
from pydantic import BaseModel
|
||||
|
||||
from backend.copilot.context import get_workspace_manager
|
||||
from backend.copilot.permissions import apply_tool_permissions
|
||||
from backend.data.redis_client import get_redis_async
|
||||
from backend.executor.cluster_lock import AsyncClusterLock
|
||||
from backend.util.exceptions import NotFoundError
|
||||
@@ -77,10 +79,15 @@ from ..tracking import track_user_message
|
||||
from .compaction import CompactionTracker, filter_compaction_messages
|
||||
from .response_adapter import SDKResponseAdapter
|
||||
from .security_hooks import create_security_hooks
|
||||
from .subscription import validate_subscription as _validate_claude_code_subscription
|
||||
from .tool_adapter import (
|
||||
cancel_pending_tool_tasks,
|
||||
create_copilot_mcp_server,
|
||||
get_copilot_tool_names,
|
||||
get_sdk_disallowed_tools,
|
||||
pre_launch_tool_call,
|
||||
reset_stash_event,
|
||||
reset_tool_failure_counters,
|
||||
set_execution_context,
|
||||
wait_for_stash,
|
||||
)
|
||||
@@ -106,6 +113,20 @@ config = ChatConfig()
|
||||
# Non-context errors (network, auth, rate-limit) are NOT retried.
|
||||
_MAX_STREAM_ATTEMPTS = 3
|
||||
|
||||
# Hard circuit breaker: abort the stream if the model sends this many
|
||||
# consecutive tool calls with empty parameters (a sign of context
|
||||
# saturation or serialization failure). Empty input ({}) is never
|
||||
# legitimate — even one is suspicious, three is conclusive.
|
||||
_EMPTY_TOOL_CALL_LIMIT = 3
|
||||
|
||||
# User-facing error shown when the empty-tool-call circuit breaker trips.
|
||||
_CIRCUIT_BREAKER_ERROR_MSG = (
|
||||
"AutoPilot was unable to complete the tool call "
|
||||
"— this usually happens when the response is "
|
||||
"too large to fit in a single tool call. "
|
||||
"Try breaking your request into smaller parts."
|
||||
)
|
||||
|
||||
# Patterns that indicate the prompt/request exceeds the model's context limit.
|
||||
# Matched case-insensitively against the full exception chain.
|
||||
_PROMPT_TOO_LONG_PATTERNS: tuple[str, ...] = (
|
||||
@@ -164,6 +185,19 @@ def _is_prompt_too_long(err: BaseException) -> bool:
|
||||
return False
|
||||
|
||||
|
||||
def _is_tool_only_message(sdk_msg: object) -> bool:
|
||||
"""Return True if *sdk_msg* is an AssistantMessage containing only ToolUseBlocks.
|
||||
|
||||
Such a message represents a parallel tool-call batch (no text output yet).
|
||||
The ``bool(…content)`` guard prevents vacuous-truth evaluation on an empty list.
|
||||
"""
|
||||
return (
|
||||
isinstance(sdk_msg, AssistantMessage)
|
||||
and bool(sdk_msg.content)
|
||||
and all(isinstance(b, ToolUseBlock) for b in sdk_msg.content)
|
||||
)
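
The vacuous-truth guard mentioned in the docstring can be demonstrated in isolation. The sketch below uses a stand-in class instead of the SDK's `ToolUseBlock`; `is_tool_only` and `FakeToolUse` are hypothetical names for this example only.

```python
def is_tool_only(blocks: list, tool_type: type) -> bool:
    """True only for a non-empty list whose items are all *tool_type*."""
    return bool(blocks) and all(isinstance(b, tool_type) for b in blocks)


class FakeToolUse:
    """Stand-in for the SDK's ToolUseBlock."""


# all() over an empty iterable is True, which is why the guard is needed.
assert all(isinstance(b, FakeToolUse) for b in []) is True
assert is_tool_only([], FakeToolUse) is False
assert is_tool_only([FakeToolUse()], FakeToolUse) is True
assert is_tool_only([FakeToolUse(), "text"], FakeToolUse) is False
```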
|
||||
|
||||
|
||||
class ReducedContext(NamedTuple):
|
||||
builder: TranscriptBuilder
|
||||
use_resume: bool
|
||||
@@ -458,37 +492,6 @@ def _resolve_sdk_model() -> str | None:
|
||||
return model
|
||||
|
||||
|
||||
@functools.cache
|
||||
def _validate_claude_code_subscription() -> None:
|
||||
"""Validate Claude CLI is installed and responds to `--version`.
|
||||
|
||||
Cached so the blocking subprocess check runs at most once per process
|
||||
lifetime. A failure (CLI not installed) is a config error that requires
|
||||
a process restart anyway.
|
||||
"""
|
||||
claude_path = shutil.which("claude")
|
||||
if not claude_path:
|
||||
raise RuntimeError(
|
||||
"Claude Code CLI not found. Install it with: "
|
||||
"npm install -g @anthropic-ai/claude-code"
|
||||
)
|
||||
result = subprocess.run(
|
||||
[claude_path, "--version"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=10,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
raise RuntimeError(
|
||||
f"Claude CLI check failed (exit {result.returncode}): "
|
||||
f"{result.stderr.strip()}"
|
||||
)
|
||||
logger.info(
|
||||
"Claude Code subscription mode: CLI version %s",
|
||||
result.stdout.strip(),
|
||||
)
|
||||
|
||||
|
||||
def _build_sdk_env(
|
||||
session_id: str | None = None,
|
||||
user_id: str | None = None,
|
||||
@@ -1028,15 +1031,122 @@ def _dispatch_response(
|
||||
return response
|
||||
|
||||
|
||||
class _TransientErrorHandled(Exception):
|
||||
class _HandledStreamError(Exception):
|
||||
"""Raised by `_run_stream_attempt` after it has already yielded a
|
||||
`StreamError` for a transient API error.
|
||||
`StreamError` to the client (e.g. transient API error, circuit breaker).
|
||||
|
||||
This signals the outer retry loop that the attempt failed so it can
|
||||
perform session-message rollback and set the `ended_with_stream_error`
|
||||
flag, **without** yielding a duplicate `StreamError` to the client.
|
||||
|
||||
Attributes:
|
||||
error_msg: The user-facing error message to persist.
|
||||
code: Machine-readable error code (e.g. ``circuit_breaker_empty_tool_calls``).
|
||||
retryable: Whether the frontend should offer a retry button.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
message: str,
|
||||
error_msg: str | None = None,
|
||||
code: str | None = None,
|
||||
retryable: bool = True,
|
||||
):
|
||||
super().__init__(message)
|
||||
self.error_msg = error_msg
|
||||
self.code = code
|
||||
self.retryable = retryable
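
The control flow this exception supports can be sketched without the SDK: the inner generator emits the error event itself, then raises so the caller can record the failure without emitting a second error. `HandledError`, `attempt`, and `run` are hypothetical simplifications, synchronous rather than async, and not the code in this diff.

```python
from collections.abc import Iterator
from typing import Optional


class HandledError(Exception):
    """Raised after the error event has already been yielded."""

    def __init__(self, message: str, error_msg: Optional[str] = None,
                 code: Optional[str] = None):
        super().__init__(message)
        self.error_msg = error_msg
        self.code = code


def attempt() -> Iterator[str]:
    yield "text"
    yield "error: too large"  # the client sees the error event here
    raise HandledError("handled", error_msg="too large", code="demo_code")


def run() -> tuple:
    """Consume one attempt; return (events, persisted error message)."""
    events = []
    persisted = None
    try:
        for event in attempt():
            events.append(event)
    except HandledError as exc:
        # Record the message for persistence, but emit no second error.
        persisted = exc.error_msg
    return events, persisted
```

Carrying `error_msg` and `code` on the exception lets the outer loop persist the specific message rather than falling back to a generic one.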
|
||||
|
||||
|
||||
@dataclass
|
||||
class _EmptyToolBreakResult:
|
||||
"""Result of checking for empty tool calls in a single AssistantMessage."""
|
||||
|
||||
count: int # Updated consecutive counter
|
||||
tripped: bool # Whether the circuit breaker fired
|
||||
error: StreamError | None # StreamError to yield (if tripped)
|
||||
error_msg: str | None # Error message (if tripped)
|
||||
error_code: str | None # Error code (if tripped)
|
||||
|
||||
|
||||
def _check_empty_tool_breaker(
|
||||
sdk_msg: object,
|
||||
consecutive: int,
|
||||
ctx: _StreamContext,
|
||||
state: _RetryState,
|
||||
) -> _EmptyToolBreakResult:
|
||||
"""Detect consecutive empty tool calls and trip the circuit breaker.
|
||||
|
||||
Returns an ``_EmptyToolBreakResult`` with the updated counter and, if the
|
||||
breaker tripped, the ``StreamError`` to yield plus the error metadata.
|
||||
"""
|
||||
if not isinstance(sdk_msg, AssistantMessage):
|
||||
return _EmptyToolBreakResult(consecutive, False, None, None, None)
|
||||
|
||||
empty_tools = [
|
||||
b.name for b in sdk_msg.content if isinstance(b, ToolUseBlock) and not b.input
|
||||
]
|
||||
if not empty_tools:
|
||||
# Reset on any AssistantMessage without empty tool calls (including
|
||||
# text-only messages, where the list comprehension yields nothing).
|
||||
return _EmptyToolBreakResult(0, False, None, None, None)
|
||||
|
||||
consecutive += 1
|
||||
|
||||
# Log full diagnostics on first occurrence only; subsequent hits just
|
||||
# log the counter to reduce noise.
|
||||
if consecutive == 1:
|
||||
logger.warning(
|
||||
"%s Empty tool call detected (%d/%d): "
|
||||
"tools=%s, model=%s, error=%s, "
|
||||
"block_types=%s, cumulative_usage=%s",
|
||||
ctx.log_prefix,
|
||||
consecutive,
|
||||
_EMPTY_TOOL_CALL_LIMIT,
|
||||
empty_tools,
|
||||
sdk_msg.model,
|
||||
sdk_msg.error,
|
||||
[type(b).__name__ for b in sdk_msg.content],
|
||||
{
|
||||
"prompt": state.usage.prompt_tokens,
|
||||
"completion": state.usage.completion_tokens,
|
||||
"cache_read": state.usage.cache_read_tokens,
|
||||
},
|
||||
)
|
||||
else:
|
||||
logger.warning(
|
||||
"%s Empty tool call detected (%d/%d): tools=%s",
|
||||
ctx.log_prefix,
|
||||
consecutive,
|
||||
_EMPTY_TOOL_CALL_LIMIT,
|
||||
empty_tools,
|
||||
)
|
||||
|
||||
if consecutive < _EMPTY_TOOL_CALL_LIMIT:
|
||||
return _EmptyToolBreakResult(consecutive, False, None, None, None)
|
||||
|
||||
logger.error(
|
||||
"%s Circuit breaker: aborting stream after %d "
|
||||
"consecutive empty tool calls. "
|
||||
"This is likely caused by the model attempting "
|
||||
"to write content too large for a single tool "
|
||||
"call's output token limit. The model should "
|
||||
"write large files in chunks using bash_exec "
|
||||
"with cat >> (append).",
|
||||
ctx.log_prefix,
|
||||
consecutive,
|
||||
)
|
||||
error_msg = _CIRCUIT_BREAKER_ERROR_MSG
|
||||
error_code = "circuit_breaker_empty_tool_calls"
|
||||
_append_error_marker(ctx.session, error_msg, retryable=True)
|
||||
return _EmptyToolBreakResult(
|
||||
count=consecutive,
|
||||
tripped=True,
|
||||
error=StreamError(errorText=error_msg, code=error_code),
|
||||
error_msg=error_msg,
|
||||
error_code=error_code,
|
||||
)
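
Stripped of logging and session bookkeeping, the counter logic above reduces to a few lines. The sketch below is a simplified model under that assumption; `BreakerState`, `step`, and `LIMIT` are hypothetical names, and tool inputs are modelled as plain dicts.

```python
from dataclasses import dataclass

LIMIT = 3  # mirrors _EMPTY_TOOL_CALL_LIMIT in the diff


@dataclass
class BreakerState:
    count: int
    tripped: bool


def step(count: int, tool_inputs: list) -> BreakerState:
    """Advance the consecutive-empty counter for one assistant message.

    *tool_inputs* holds the input dict of each tool call in the message;
    a text-only message is represented by an empty list.
    """
    if not any(not inp for inp in tool_inputs):
        # No empty tool call in this message: reset the streak.
        return BreakerState(0, False)
    count += 1
    return BreakerState(count, count >= LIMIT)
```

A single message with valid tool input, or a text-only message, resets the streak, so only uninterrupted runs of empty calls trip the breaker.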
|
||||
|
||||
|
||||
async def _run_stream_attempt(
|
||||
ctx: _StreamContext,
|
||||
@@ -1071,6 +1181,12 @@ async def _run_stream_attempt(
|
||||
accumulated_tool_calls=[],
|
||||
)
|
||||
ended_with_stream_error = False
|
||||
# Stores the error message used by _append_error_marker so the outer
|
||||
# retry loop can re-append the correct message after session rollback.
|
||||
stream_error_msg: str | None = None
|
||||
stream_error_code: str | None = None
|
||||
|
||||
consecutive_empty_tool_calls = 0
|
||||
|
||||
async with ClaudeSDKClient(options=state.options) as client:
|
||||
logger.info(
|
||||
@@ -1161,18 +1277,43 @@ async def _run_stream_attempt(
|
||||
"suppressing raw error text",
|
||||
ctx.log_prefix,
|
||||
)
|
||||
stream_error_msg = FRIENDLY_TRANSIENT_MSG
|
||||
stream_error_code = "transient_api_error"
|
||||
_append_error_marker(
|
||||
ctx.session,
|
||||
FRIENDLY_TRANSIENT_MSG,
|
||||
stream_error_msg,
|
||||
retryable=True,
|
||||
)
|
||||
yield StreamError(
|
||||
errorText=FRIENDLY_TRANSIENT_MSG,
|
||||
code="transient_api_error",
|
||||
errorText=stream_error_msg,
|
||||
code=stream_error_code,
|
||||
)
|
||||
ended_with_stream_error = True
|
||||
break
|
||||
|
||||
# Parallel tool execution: pre-launch every ToolUseBlock as an
|
||||
# asyncio.Task the moment its AssistantMessage arrives. The SDK
|
||||
# sends one AssistantMessage per tool call when issuing parallel
|
||||
# calls, so each message is pre-launched independently. The MCP
|
||||
# handlers will await the already-running task instead of executing
|
||||
# fresh, making all concurrent tool calls run in parallel.
|
||||
#
|
||||
# Also determine if the message is a tool-only batch (all content
|
||||
# items are ToolUseBlocks) — such messages have no text output yet,
|
||||
# so we skip the wait_for_stash flush below.
|
||||
is_tool_only = False
|
||||
if isinstance(sdk_msg, AssistantMessage) and sdk_msg.content:
|
||||
is_tool_only = True
|
||||
# NOTE: Pre-launches are sequential (each await completes
|
||||
# file-ref expansion before the next starts). This is fine
|
||||
# since expansion is typically sub-ms; a future optimisation
|
||||
# could gather all pre-launches concurrently.
|
||||
for tool_use in sdk_msg.content:
|
||||
if isinstance(tool_use, ToolUseBlock):
|
||||
await pre_launch_tool_call(tool_use.name, tool_use.input)
|
||||
else:
|
||||
is_tool_only = False
|
||||
|
||||
# Race-condition fix: SDK hooks (PostToolUse) are
|
||||
# executed asynchronously via start_soon() — the next
|
||||
# message can arrive before the hook stashes output.
|
||||
@@ -1186,15 +1327,12 @@ async def _run_stream_attempt(
|
||||
# AssistantMessages (each containing only
|
||||
# ToolUseBlocks), we must NOT wait/flush — the prior
|
||||
# tools are still executing concurrently.
|
||||
is_parallel_continuation = isinstance(sdk_msg, AssistantMessage) and all(
|
||||
isinstance(b, ToolUseBlock) for b in sdk_msg.content
|
||||
)
|
||||
if (
|
||||
state.adapter.has_unresolved_tool_calls
|
||||
and isinstance(sdk_msg, (AssistantMessage, ResultMessage))
|
||||
and not is_parallel_continuation
|
||||
and not is_tool_only
|
||||
):
|
||||
if await wait_for_stash(timeout=0.5):
|
||||
if await wait_for_stash():
|
||||
await asyncio.sleep(0)
|
||||
else:
|
||||
logger.warning(
|
||||
@@ -1209,13 +1347,17 @@ async def _run_stream_attempt(
|
||||
if isinstance(sdk_msg, ResultMessage):
|
||||
logger.info(
|
||||
"%s Received: ResultMessage %s "
|
||||
"(unresolved=%d, current=%d, resolved=%d)",
|
||||
"(unresolved=%d, current=%d, resolved=%d, "
|
||||
"num_turns=%d, cost_usd=%s, result=%s)",
|
||||
ctx.log_prefix,
|
||||
sdk_msg.subtype,
|
||||
len(state.adapter.current_tool_calls)
|
||||
- len(state.adapter.resolved_tool_calls),
|
||||
len(state.adapter.current_tool_calls),
|
||||
len(state.adapter.resolved_tool_calls),
|
||||
sdk_msg.num_turns,
|
||||
sdk_msg.total_cost_usd,
|
||||
(sdk_msg.result or "")[:200],
|
||||
)
|
||||
if sdk_msg.subtype in (
|
||||
"error",
|
||||
@@ -1272,6 +1414,18 @@ async def _run_stream_attempt(
|
||||
)
|
||||
entries_replaced = True
|
||||
|
||||
# --- Hard circuit breaker for empty tool calls ---
|
||||
breaker = _check_empty_tool_breaker(
|
||||
sdk_msg, consecutive_empty_tool_calls, ctx, state
|
||||
)
|
||||
consecutive_empty_tool_calls = breaker.count
|
||||
if breaker.tripped and breaker.error is not None:
|
||||
stream_error_msg = breaker.error_msg
|
||||
stream_error_code = breaker.error_code
|
||||
yield breaker.error
|
||||
ended_with_stream_error = True
|
||||
break
|
||||
|
||||
# --- Dispatch adapter responses ---
|
||||
for response in state.adapter.convert_message(sdk_msg):
|
||||
dispatched = _dispatch_response(
|
||||
@@ -1352,8 +1506,10 @@ async def _run_stream_attempt(
|
||||
# to the client (StreamError yielded above), raise so the outer retry
|
||||
# loop can rollback session messages and set its error flags properly.
|
||||
if ended_with_stream_error:
|
||||
raise _TransientErrorHandled(
|
||||
"Transient API error handled — StreamError already yielded"
|
||||
raise _HandledStreamError(
|
||||
"Stream error handled — StreamError already yielded",
|
||||
error_msg=stream_error_msg,
|
||||
code=stream_error_code,
|
||||
)
|
||||
|
||||
|
||||
@@ -1364,6 +1520,7 @@ async def stream_chat_completion_sdk(
|
||||
user_id: str | None = None,
|
||||
session: ChatSession | None = None,
|
||||
file_ids: list[str] | None = None,
|
||||
permissions: "CopilotPermissions | None" = None,
|
||||
**_kwargs: Any,
|
||||
) -> AsyncIterator[StreamBaseResponse]:
|
||||
"""Stream chat completion using Claude Agent SDK.
|
||||
@@ -1609,7 +1766,13 @@ async def stream_chat_completion_sdk(
|
||||
|
||||
yield StreamStart(messageId=message_id, sessionId=session_id)
|
||||
|
||||
set_execution_context(user_id, session, sandbox=e2b_sandbox, sdk_cwd=sdk_cwd)
|
||||
set_execution_context(
|
||||
user_id,
|
||||
session,
|
||||
sandbox=e2b_sandbox,
|
||||
sdk_cwd=sdk_cwd,
|
||||
permissions=permissions,
|
||||
)
|
||||
|
||||
# Fail fast when no API credentials are available at all.
|
||||
sdk_env = _build_sdk_env(session_id=session_id, user_id=user_id)
|
||||
@@ -1635,8 +1798,11 @@ async def stream_chat_completion_sdk(
|
||||
on_compact=compaction.on_compact,
|
||||
)
|
||||
|
||||
allowed = get_copilot_tool_names(use_e2b=use_e2b)
|
||||
disallowed = get_sdk_disallowed_tools(use_e2b=use_e2b)
|
||||
if permissions is not None:
|
||||
allowed, disallowed = apply_tool_permissions(permissions, use_e2b=use_e2b)
|
||||
else:
|
||||
allowed = get_copilot_tool_names(use_e2b=use_e2b)
|
||||
disallowed = get_sdk_disallowed_tools(use_e2b=use_e2b)
|
||||
|
||||
def _on_stderr(line: str) -> None:
|
||||
"""Log a stderr line emitted by the Claude CLI subprocess."""
|
||||
@@ -1746,6 +1912,12 @@ async def stream_chat_completion_sdk(
|
||||
)
|
||||
|
||||
for attempt in range(_MAX_STREAM_ATTEMPTS):
|
||||
# Clear any stale stash signal from the previous attempt so
|
||||
# wait_for_stash() doesn't fire prematurely on a leftover event.
|
||||
reset_stash_event()
|
||||
# Reset tool-level circuit breaker so failures from a previous
|
||||
# (rolled-back) attempt don't carry over to the fresh attempt.
|
||||
reset_tool_failure_counters()
|
||||
if attempt > 0:
|
||||
logger.info(
|
||||
"%s Retrying with reduced context (%d/%d)",
|
||||
@@ -1801,6 +1973,10 @@ async def stream_chat_completion_sdk(
|
||||
if not isinstance(event, StreamHeartbeat):
|
||||
events_yielded += 1
|
||||
yield event
|
||||
# Cancel any pre-launched tasks that were never dispatched
|
||||
# by the SDK (e.g. edge-case SDK behaviour changes). Symmetric
|
||||
# with the three error-path await cancel_pending_tool_tasks() calls.
|
||||
await cancel_pending_tool_tasks()
|
||||
break # Stream completed — exit retry loop
|
||||
except asyncio.CancelledError:
|
||||
logger.warning(
|
||||
@@ -1809,26 +1985,42 @@ async def stream_chat_completion_sdk(
|
||||
attempt + 1,
|
||||
_MAX_STREAM_ATTEMPTS,
|
||||
)
|
||||
# Cancel any pre-launched tasks so they don't continue executing
|
||||
# against a rolled-back or abandoned session.
|
||||
await cancel_pending_tool_tasks()
|
||||
raise
|
||||
except _TransientErrorHandled:
|
||||
except _HandledStreamError as exc:
|
||||
# _run_stream_attempt already yielded a StreamError and
|
||||
# appended an error marker. We only need to rollback
|
||||
# session messages and set the error flag — do NOT set
|
||||
# stream_err so the post-loop code won't emit a
|
||||
# duplicate StreamError.
|
||||
logger.warning(
|
||||
"%s Transient error handled in stream attempt "
|
||||
"(attempt %d/%d, events_yielded=%d)",
|
||||
"%s Stream error handled in attempt "
|
||||
"(attempt %d/%d, code=%s, events_yielded=%d)",
|
||||
log_prefix,
|
||||
attempt + 1,
|
||||
_MAX_STREAM_ATTEMPTS,
|
||||
exc.code or "transient",
|
||||
events_yielded,
|
||||
)
|
||||
session.messages = session.messages[:pre_attempt_msg_count]
|
||||
# transcript_builder still contains entries from the aborted
|
||||
# attempt that no longer match session.messages. Skip upload
|
||||
# so a future --resume doesn't replay rolled-back content.
|
||||
skip_transcript_upload = True
|
||||
# Re-append the error marker so it survives the rollback
|
||||
# and is persisted by the finally block (see #2947655365).
|
||||
_append_error_marker(session, FRIENDLY_TRANSIENT_MSG, retryable=True)
|
||||
# Use the specific error message from the attempt (e.g.
|
||||
# circuit breaker msg) rather than always the generic one.
|
||||
_append_error_marker(
|
||||
session,
|
||||
exc.error_msg or FRIENDLY_TRANSIENT_MSG,
|
||||
retryable=True,
|
||||
)
|
||||
ended_with_stream_error = True
|
||||
# Cancel any pre-launched tasks from the failed attempt.
|
||||
await cancel_pending_tool_tasks()
|
||||
break
|
||||
except Exception as e:
|
||||
stream_err = e
|
||||
@@ -1845,6 +2037,9 @@ async def stream_chat_completion_sdk(
|
||||
exc_info=True,
|
||||
)
|
||||
session.messages = session.messages[:pre_attempt_msg_count]
|
||||
# Cancel any pre-launched tasks from the failed attempt so they
|
||||
# don't continue executing against the rolled-back session.
|
||||
await cancel_pending_tool_tasks()
|
||||
if events_yielded > 0:
|
||||
# Events were already sent to the frontend and cannot be
|
||||
# unsent. Retrying would produce duplicate/inconsistent
|
||||
@@ -1854,11 +2049,13 @@ async def stream_chat_completion_sdk(
|
||||
log_prefix,
|
||||
events_yielded,
|
||||
)
|
||||
skip_transcript_upload = True
|
||||
ended_with_stream_error = True
|
||||
break
|
||||
if not is_context_error:
|
||||
# Non-context errors (network, auth, rate-limit) should
|
||||
# not trigger compaction — surface the error immediately.
|
||||
skip_transcript_upload = True
|
||||
ended_with_stream_error = True
|
||||
break
|
||||
continue
|
||||
|
||||
@@ -1,21 +1,23 @@
|
||||
"""Unit tests for extracted service helpers.
|
||||
|
||||
Covers ``_is_prompt_too_long``, ``_reduce_context``, ``_iter_sdk_messages``,
|
||||
and the ``ReducedContext`` named tuple.
|
||||
``ReducedContext``, and the ``is_parallel_continuation`` logic.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
from collections.abc import AsyncGenerator
|
||||
from unittest.mock import AsyncMock, patch
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
from claude_agent_sdk import AssistantMessage, TextBlock, ToolUseBlock
|
||||
|
||||
from .conftest import build_test_transcript as _build_transcript
|
||||
from .service import (
|
||||
ReducedContext,
|
||||
_is_prompt_too_long,
|
||||
_is_tool_only_message,
|
||||
_iter_sdk_messages,
|
||||
_reduce_context,
|
||||
)
|
||||
@@ -281,3 +283,55 @@ class TestIterSdkMessages:
|
||||
first = await gen.__anext__()
|
||||
assert first == "first"
|
||||
await gen.aclose() # should cancel pending task cleanly
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# is_parallel_continuation logic
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestIsParallelContinuation:
|
||||
"""Unit tests for the is_parallel_continuation expression in the streaming loop.
|
||||
|
||||
Verifies the vacuous-truth guard (empty content must return False) and the
|
||||
boundary cases for mixed TextBlock+ToolUseBlock messages.
|
||||
"""
|
||||
|
||||
def _make_tool_block(self) -> MagicMock:
|
||||
block = MagicMock(spec=ToolUseBlock)
|
||||
return block
|
||||
|
||||
def test_all_tool_use_blocks_is_parallel(self):
|
||||
"""AssistantMessage with only ToolUseBlocks is a parallel continuation."""
|
||||
msg = MagicMock(spec=AssistantMessage)
|
||||
msg.content = [self._make_tool_block(), self._make_tool_block()]
|
||||
assert _is_tool_only_message(msg) is True
|
||||
|
||||
def test_empty_content_is_not_parallel(self):
|
||||
"""AssistantMessage with empty content must NOT be treated as parallel.
|
||||
|
||||
Without the bool(sdk_msg.content) guard, all() on an empty iterable
|
||||
returns True via vacuous truth — this test ensures the guard is present.
|
||||
"""
|
||||
msg = MagicMock(spec=AssistantMessage)
|
||||
msg.content = []
|
||||
assert _is_tool_only_message(msg) is False
|
||||
|
||||
def test_mixed_text_and_tool_blocks_not_parallel(self):
|
||||
"""AssistantMessage with text + tool blocks is NOT a parallel continuation."""
|
||||
msg = MagicMock(spec=AssistantMessage)
|
||||
text_block = MagicMock(spec=TextBlock)
|
||||
msg.content = [text_block, self._make_tool_block()]
|
||||
assert _is_tool_only_message(msg) is False
|
||||
|
||||
def test_non_assistant_message_not_parallel(self):
|
||||
"""Non-AssistantMessage types are never parallel continuations."""
|
||||
assert _is_tool_only_message("not a message") is False
|
||||
assert _is_tool_only_message(None) is False
|
||||
assert _is_tool_only_message(42) is False
|
||||
|
||||
def test_single_tool_block_is_parallel(self):
|
||||
"""Single ToolUseBlock AssistantMessage is a parallel continuation."""
|
||||
msg = MagicMock(spec=AssistantMessage)
|
||||
msg.content = [self._make_tool_block()]
|
||||
assert _is_tool_only_message(msg) is True
|
||||
|
||||
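The vacuous-truth guard exercised by `TestIsParallelContinuation` can be sketched in isolation. This is a minimal illustration, not the code in `service.py`: the classes `ToolUse`, `Text`, and `Assistant` and the function `is_tool_only` are hypothetical stand-ins for the SDK types and for `_is_tool_only_message`.

```python
from dataclasses import dataclass, field


@dataclass
class ToolUse:
    """Hypothetical stand-in for the SDK's ToolUseBlock."""
    name: str


@dataclass
class Text:
    """Hypothetical stand-in for the SDK's TextBlock."""
    text: str


@dataclass
class Assistant:
    """Hypothetical stand-in for the SDK's AssistantMessage."""
    content: list = field(default_factory=list)


def is_tool_only(msg: object) -> bool:
    """True only for an Assistant whose content is non-empty and all ToolUse."""
    return (
        isinstance(msg, Assistant)
        and bool(msg.content)  # guard: all() over an empty list is True
        and all(isinstance(block, ToolUse) for block in msg.content)
    )
```

Without the `bool(msg.content)` term, an empty message would be classified as tool-only, because `all([])` evaluates to `True`.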
144
autogpt_platform/backend/backend/copilot/sdk/subscription.py
Normal file
@@ -0,0 +1,144 @@
|
||||
"""Claude Code subscription auth helpers.
|
||||
|
||||
Handles locating the SDK-bundled CLI binary, provisioning credentials from
|
||||
environment variables, and validating that subscription auth is functional.
|
||||
"""
|
||||
|
||||
import functools
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import shutil
|
||||
import subprocess
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def find_bundled_cli() -> str:
|
||||
"""Locate the Claude CLI binary bundled inside ``claude_agent_sdk``.
|
||||
|
||||
Falls back to ``shutil.which("claude")`` if the SDK bundle is absent.
|
||||
"""
|
||||
try:
|
||||
from claude_agent_sdk._internal.transport.subprocess_cli import (
|
||||
SubprocessCLITransport,
|
||||
)
|
||||
|
||||
path = SubprocessCLITransport._find_bundled_cli(None) # type: ignore[arg-type]
|
||||
if path:
|
||||
return str(path)
|
||||
except Exception:
|
||||
pass
|
||||
system_path = shutil.which("claude")
|
||||
if system_path:
|
||||
return system_path
|
||||
raise RuntimeError(
|
||||
"Claude CLI not found — neither the SDK-bundled binary nor a "
|
||||
"system-installed `claude` could be located."
|
||||
)
|
||||
|
||||
|
||||
def provision_credentials_file() -> None:
|
||||
"""Write ``~/.claude/.credentials.json`` from env when running headless.
|
||||
|
||||
If ``CLAUDE_CODE_OAUTH_TOKEN`` is set (an OAuth *access* token obtained
|
||||
from ``claude auth status`` or extracted from the macOS keychain), this
|
||||
helper writes a minimal credentials file so the bundled CLI can
|
||||
authenticate without an interactive ``claude login``.
|
||||
|
||||
A ``CLAUDE_CODE_REFRESH_TOKEN`` env var is optional but recommended —
|
||||
it lets the CLI silently refresh an expired access token.
|
||||
"""
|
||||
access_token = os.environ.get("CLAUDE_CODE_OAUTH_TOKEN", "").strip()
|
||||
if not access_token:
|
||||
return
|
||||
|
||||
creds_dir = os.path.expanduser("~/.claude")
|
||||
creds_path = os.path.join(creds_dir, ".credentials.json")
|
||||
|
||||
# Don't overwrite an existing credentials file (e.g. from a volume mount).
|
||||
if os.path.exists(creds_path):
|
||||
logger.debug("Credentials file already exists at %s — skipping", creds_path)
|
||||
return
|
||||
|
||||
os.makedirs(creds_dir, exist_ok=True)
|
||||
|
||||
creds = {
|
||||
"claudeAiOauth": {
|
||||
"accessToken": access_token,
|
||||
"refreshToken": os.environ.get("CLAUDE_CODE_REFRESH_TOKEN", "").strip(),
|
||||
"expiresAt": 0,
|
||||
"scopes": [
|
||||
"user:inference",
|
||||
"user:profile",
|
||||
"user:sessions:claude_code",
|
||||
],
|
||||
}
|
||||
}
|
||||
with open(creds_path, "w") as f:
|
||||
json.dump(creds, f)
|
||||
logger.info("Provisioned Claude credentials file at %s", creds_path)
|
||||
|
||||
|
||||
@functools.cache
|
||||
def validate_subscription() -> None:
|
||||
"""Validate the bundled Claude CLI is reachable and authenticated.
|
||||
|
||||
Cached so the blocking subprocess check runs at most once per process
after it first succeeds (``functools.cache`` does not cache raised
exceptions, so a failed check is retried on the next call). On first call,
also provisions ``~/.claude/.credentials.json`` from the
``CLAUDE_CODE_OAUTH_TOKEN`` env var when available.
|
||||
"""
|
||||
provision_credentials_file()
|
||||
|
||||
cli = find_bundled_cli()
|
||||
result = subprocess.run(
|
||||
[cli, "--version"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=10,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
raise RuntimeError(
|
||||
f"Claude CLI check failed (exit {result.returncode}): "
|
||||
f"{result.stderr.strip()}"
|
||||
)
|
||||
logger.info(
|
||||
"Claude Code subscription mode: CLI version %s",
|
||||
result.stdout.strip(),
|
||||
)
|
||||
|
||||
# Verify the CLI is actually authenticated.
|
||||
auth_result = subprocess.run(
|
||||
[cli, "auth", "status"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=10,
|
||||
env={
|
||||
**os.environ,
|
||||
"ANTHROPIC_API_KEY": "",
|
||||
"ANTHROPIC_AUTH_TOKEN": "",
|
||||
"ANTHROPIC_BASE_URL": "",
|
||||
},
|
||||
)
|
||||
if auth_result.returncode != 0:
|
||||
raise RuntimeError(
|
||||
"Claude CLI is not authenticated. Either:\n"
|
||||
" • Set CLAUDE_CODE_OAUTH_TOKEN env var (from `claude auth status` "
|
||||
"or macOS keychain), or\n"
|
||||
" • Mount ~/.claude/.credentials.json into the container, or\n"
|
||||
" • Run `claude login` inside the container."
|
||||
)
|
||||
try:
|
||||
status = json.loads(auth_result.stdout)
|
||||
if not status.get("loggedIn"):
|
||||
raise RuntimeError(
|
||||
"Claude CLI reports loggedIn=false. Set CLAUDE_CODE_OAUTH_TOKEN "
|
||||
"or run `claude login`."
|
||||
)
|
||||
logger.info(
|
||||
"Claude subscription auth: method=%s, email=%s",
|
||||
status.get("authMethod"),
|
||||
status.get("email"),
|
||||
)
|
||||
except json.JSONDecodeError:
|
||||
logger.warning("Could not parse `claude auth status` output")
|
||||
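The decision logic at the end of `validate_subscription` (exit code first, then the `loggedIn` field of the JSON output) can be sketched as a pure function that is easy to test without spawning the CLI. This is an illustrative sketch only: `interpret_auth_status` is a hypothetical helper, and the JSON field names are simply the ones the code above reads (`loggedIn`, `authMethod`), not a documented CLI schema.

```python
import json


def interpret_auth_status(returncode: int, stdout: str) -> tuple[bool, str]:
    """Return (authenticated, reason) from an auth-status exit code and output."""
    if returncode != 0:
        return False, "not authenticated"
    try:
        status = json.loads(stdout)
    except json.JSONDecodeError:
        # Mirrors the code above: unparseable output is a warning, not a failure.
        return True, "unparseable output"
    if not isinstance(status, dict):
        return True, "unparseable output"
    if not status.get("loggedIn"):
        return False, "loggedIn=false"
    return True, f"method={status.get('authMethod')}"
```

Keeping the interpretation separate from the `subprocess.run` call means the failure branches can be covered with plain string inputs.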
@@ -0,0 +1,96 @@
|
||||
"""Tests for the tool call circuit breaker in tool_adapter.py."""
|
||||
|
||||
import pytest
|
||||
|
||||
from backend.copilot.sdk.tool_adapter import (
|
||||
_MAX_CONSECUTIVE_TOOL_FAILURES,
|
||||
_check_circuit_breaker,
|
||||
_clear_tool_failures,
|
||||
_consecutive_tool_failures,
|
||||
_record_tool_failure,
|
||||
)
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
def _reset_tracker():
    """Reset the circuit breaker tracker for each test."""
    token = _consecutive_tool_failures.set({})
    yield
    _consecutive_tool_failures.reset(token)
|
||||
|
||||
|
||||
class TestCircuitBreaker:
|
||||
def test_no_trip_below_threshold(self):
|
||||
"""Circuit breaker should not trip before reaching the limit."""
|
||||
args = {"file_path": "/tmp/test.txt"}
|
||||
for _ in range(_MAX_CONSECUTIVE_TOOL_FAILURES - 1):
|
||||
assert _check_circuit_breaker("write_file", args) is None
|
||||
_record_tool_failure("write_file", args)
|
||||
# Still under the limit
|
||||
assert _check_circuit_breaker("write_file", args) is None
|
||||
|
||||
def test_trips_at_threshold(self):
|
||||
"""Circuit breaker should trip after reaching the failure limit."""
|
||||
args = {"file_path": "/tmp/test.txt"}
|
||||
for _ in range(_MAX_CONSECUTIVE_TOOL_FAILURES):
|
||||
assert _check_circuit_breaker("write_file", args) is None
|
||||
_record_tool_failure("write_file", args)
|
||||
# Now it should trip
|
||||
result = _check_circuit_breaker("write_file", args)
|
||||
assert result is not None
|
||||
assert "STOP" in result
|
||||
assert "write_file" in result
|
||||
|
||||
def test_different_args_tracked_separately(self):
|
||||
"""Different args should have separate failure counters."""
|
||||
args_a = {"file_path": "/tmp/a.txt"}
|
||||
args_b = {"file_path": "/tmp/b.txt"}
|
||||
for _ in range(_MAX_CONSECUTIVE_TOOL_FAILURES):
|
||||
_record_tool_failure("write_file", args_a)
|
||||
# args_a should trip
|
||||
assert _check_circuit_breaker("write_file", args_a) is not None
|
||||
# args_b should NOT trip
|
||||
assert _check_circuit_breaker("write_file", args_b) is None
|
||||
|
||||
def test_different_tools_tracked_separately(self):
|
||||
"""Different tools should have separate failure counters."""
|
||||
args = {"file_path": "/tmp/test.txt"}
|
||||
for _ in range(_MAX_CONSECUTIVE_TOOL_FAILURES):
|
||||
_record_tool_failure("tool_a", args)
|
||||
# tool_a should trip
|
||||
assert _check_circuit_breaker("tool_a", args) is not None
|
||||
# tool_b with same args should NOT trip
|
||||
assert _check_circuit_breaker("tool_b", args) is None
|
||||
|
||||
def test_empty_args_tracked(self):
|
||||
"""Empty args ({}) — the exact failure pattern from the bug — should be tracked."""
|
||||
args = {}
|
||||
for _ in range(_MAX_CONSECUTIVE_TOOL_FAILURES):
|
||||
_record_tool_failure("write_file", args)
|
||||
assert _check_circuit_breaker("write_file", args) is not None
|
||||
|
||||
def test_clear_resets_counter(self):
|
||||
"""Clearing failures should reset the counter."""
|
||||
args = {}
|
||||
for _ in range(_MAX_CONSECUTIVE_TOOL_FAILURES):
|
||||
_record_tool_failure("write_file", args)
|
||||
_clear_tool_failures("write_file")
|
||||
assert _check_circuit_breaker("write_file", args) is None
|
||||
|
||||
def test_success_clears_failures(self):
|
||||
"""A successful call should reset the failure counter."""
|
||||
args = {}
|
||||
for _ in range(_MAX_CONSECUTIVE_TOOL_FAILURES - 1):
|
||||
_record_tool_failure("write_file", args)
|
||||
# Success clears failures
|
||||
_clear_tool_failures("write_file")
|
||||
# Should be able to fail again without tripping
|
||||
for _ in range(_MAX_CONSECUTIVE_TOOL_FAILURES - 1):
|
||||
_record_tool_failure("write_file", args)
|
||||
assert _check_circuit_breaker("write_file", args) is None
|
||||
|
||||
def test_no_tracker_returns_none(self):
|
||||
"""If tracker is not initialized, circuit breaker should not trip."""
|
||||
_consecutive_tool_failures.set(None) # type: ignore[arg-type]
|
||||
_record_tool_failure("write_file", {}) # should not raise
|
||||
assert _check_circuit_breaker("write_file", {}) is None
|
||||
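The behaviour these tests pin down (per-fingerprint counters, a trip at the threshold, and a per-tool reset on success) can be sketched with a plain dict. This is a simplified illustration, not the `tool_adapter.py` implementation: the `Breaker` class, `failure_key`, and `MAX_FAILURES` are hypothetical names, and the threshold of 3 is assumed to mirror `_MAX_CONSECUTIVE_TOOL_FAILURES`.

```python
import json
from typing import Any

MAX_FAILURES = 3  # assumed threshold for this sketch


def failure_key(tool_name: str, args: dict[str, Any]) -> str:
    """Stable fingerprint: sorted keys make equal dicts serialise identically."""
    return f"{tool_name}:{json.dumps(args, sort_keys=True, default=str)}"


class Breaker:
    """Counts consecutive failures per (tool, args) fingerprint."""

    def __init__(self) -> None:
        self.counts: dict[str, int] = {}

    def tripped(self, tool_name: str, args: dict[str, Any]) -> bool:
        return self.counts.get(failure_key(tool_name, args), 0) >= MAX_FAILURES

    def record_failure(self, tool_name: str, args: dict[str, Any]) -> None:
        key = failure_key(tool_name, args)
        self.counts[key] = self.counts.get(key, 0) + 1

    def clear(self, tool_name: str) -> None:
        """On success, drop every args variant recorded for this tool."""
        prefix = f"{tool_name}:"
        for key in [k for k in self.counts if k.startswith(prefix)]:
            del self.counts[key]
```

Keying on the serialised arguments is what lets a retry with different arguments proceed while an identical retry is stopped.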
@@ -16,6 +16,7 @@ from typing import TYPE_CHECKING, Any
|
||||
from claude_agent_sdk import create_sdk_mcp_server, tool
|
||||
|
||||
from backend.copilot.context import (
|
||||
_current_permissions,
|
||||
_current_project_dir,
|
||||
_current_sandbox,
|
||||
_current_sdk_cwd,
|
||||
@@ -41,6 +42,8 @@ from .e2b_file_tools import E2B_FILE_TOOL_NAMES, E2B_FILE_TOOLS
|
||||
if TYPE_CHECKING:
|
||||
from e2b import AsyncSandbox
|
||||
|
||||
from backend.copilot.permissions import CopilotPermissions
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Max MCP response size in chars — keeps tool output under the SDK's 10 MB JSON buffer.
|
||||
@@ -50,6 +53,14 @@ _MCP_MAX_CHARS = 500_000
|
||||
MCP_SERVER_NAME = "copilot"
|
||||
MCP_TOOL_PREFIX = f"mcp__{MCP_SERVER_NAME}__"
|
||||
|
||||
# Map from tool_name -> Queue of pre-launched (task, args) pairs.
|
||||
# Initialised per-session in set_execution_context() so concurrent sessions
|
||||
# never share the same dict.
|
||||
_TaskQueueItem = tuple[asyncio.Task[dict[str, Any]], dict[str, Any]]
|
||||
_tool_task_queues: ContextVar[dict[str, asyncio.Queue[_TaskQueueItem]] | None] = (
|
||||
ContextVar("_tool_task_queues", default=None)
|
||||
)
|
||||
|
||||
# Stash for MCP tool outputs before the SDK potentially truncates them.
|
||||
# Keyed by tool_name → full output string. Consumed (popped) by the
|
||||
# response adapter when it builds StreamToolOutputAvailable.
|
||||
@@ -66,12 +77,23 @@ _stash_event: ContextVar[asyncio.Event | None] = ContextVar(
|
||||
"_stash_event", default=None
|
||||
)
|
||||
|
||||
# Circuit breaker: tracks consecutive tool failures to detect infinite retry loops.
|
||||
# When a tool is called repeatedly with empty/identical args and keeps failing,
|
||||
# this counter is incremented. After _MAX_CONSECUTIVE_TOOL_FAILURES identical
|
||||
# failures the tool handler returns a hard-stop message instead of the raw error.
|
||||
_MAX_CONSECUTIVE_TOOL_FAILURES = 3
|
||||
_consecutive_tool_failures: ContextVar[dict[str, int] | None] = ContextVar(
    "_consecutive_tool_failures",
    default=None,
)
|
||||
|
||||
|
||||
def set_execution_context(
|
||||
user_id: str | None,
|
||||
session: ChatSession,
|
||||
sandbox: "AsyncSandbox | None" = None,
|
||||
sdk_cwd: str | None = None,
|
||||
permissions: "CopilotPermissions | None" = None,
|
||||
) -> None:
|
||||
"""Set the execution context for tool calls.
|
||||
|
||||
@@ -83,14 +105,83 @@ def set_execution_context(
|
||||
session: Current chat session.
|
||||
sandbox: Optional E2B sandbox; when set, bash_exec routes commands there.
|
||||
sdk_cwd: SDK working directory; used to scope tool-results reads.
|
||||
permissions: Optional capability filter restricting tools/blocks.
|
||||
"""
|
||||
_current_user_id.set(user_id)
|
||||
_current_session.set(session)
|
||||
_current_sandbox.set(sandbox)
|
||||
_current_sdk_cwd.set(sdk_cwd or "")
|
||||
_current_project_dir.set(_encode_cwd_for_cli(sdk_cwd) if sdk_cwd else "")
|
||||
_current_permissions.set(permissions)
|
||||
_pending_tool_outputs.set({})
|
||||
_stash_event.set(asyncio.Event())
|
||||
_tool_task_queues.set({})
|
||||
_consecutive_tool_failures.set({})
|
||||
|
||||
|
||||
def reset_stash_event() -> None:
    """Clear any stale stash signal left over from a previous stream attempt.

    ``_stash_event`` is set once per session in ``set_execution_context`` and
    reused across retry attempts. A PostToolUse hook from a failed attempt may
    leave the event set; calling this at the start of each retry prevents
    ``wait_for_stash`` from returning prematurely on a stale signal.
    """
    event = _stash_event.get(None)
    if event is not None:
        event.clear()
|
||||
|
||||
|
||||
async def cancel_pending_tool_tasks() -> None:
|
||||
"""Cancel all queued pre-launched tasks for the current execution context.
|
||||
|
||||
Call this when a stream attempt aborts (error, cancellation) to prevent
|
||||
pre-launched tasks from continuing to execute against a rolled-back session.
|
||||
Tasks that are already done are skipped; in-flight tasks are cancelled and
|
||||
awaited so that any cleanup (``finally`` blocks, DB rollbacks) completes
|
||||
before the next retry starts.
|
||||
"""
|
||||
queues = _tool_task_queues.get()
|
||||
if not queues:
|
||||
return
|
||||
cancelled_tasks: list[asyncio.Task] = []
|
||||
for tool_name, queue in list(queues.items()):
|
||||
cancelled = 0
|
||||
while not queue.empty():
|
||||
task, _args = queue.get_nowait()
|
||||
if not task.done():
|
||||
task.cancel()
|
||||
cancelled_tasks.append(task)
|
||||
cancelled += 1
|
||||
if cancelled:
|
||||
logger.debug(
|
||||
"Cancelled %d pre-launched task(s) for tool '%s'", cancelled, tool_name
|
||||
)
|
||||
queues.clear()
|
||||
# Await all cancelled tasks so their cleanup (finally blocks, DB rollbacks)
|
||||
# completes before the next retry attempt starts new pre-launches.
|
||||
# Use a timeout to prevent hanging indefinitely if a task's cleanup is stuck.
|
||||
if cancelled_tasks:
|
||||
try:
|
||||
await asyncio.wait_for(
|
||||
asyncio.gather(*cancelled_tasks, return_exceptions=True),
|
||||
timeout=5.0,
|
||||
)
|
||||
except TimeoutError:
|
||||
logger.warning(
|
||||
"Timed out waiting for %d cancelled task(s) to clean up",
|
||||
len(cancelled_tasks),
|
||||
)
|
||||
|
||||
|
||||
def reset_tool_failure_counters() -> None:
    """Reset all tool-level circuit breaker counters.

    Called at the start of each SDK retry attempt so that failure counts
    from a previous (rolled-back) attempt do not carry over and prematurely
    trip the breaker on a fresh attempt with different context.
    """
    _consecutive_tool_failures.set({})
|
||||
|
||||
|
||||
def pop_pending_tool_output(tool_name: str) -> str | None:
|
||||
@@ -155,12 +246,13 @@ async def wait_for_stash(timeout: float = 2.0) -> bool:
|
||||
by waiting on the ``_stash_event``, which is signaled by
|
||||
:func:`stash_pending_tool_output`.
|
||||
|
||||
-Returns ``True`` if a stash signal was received, ``False`` on timeout.
+Uses ``asyncio.Event.wait()`` so it returns the instant the hook signals;
+the timeout is purely a safety net for the case where the hook never fires.
+Returns ``True`` if the stash signal was received, ``False`` on timeout.

-The 2.0 s default was chosen based on production metrics: the original
-0.5 s caused frequent timeouts under load (parallel tool calls, large
-outputs). 2.0 s gives a comfortable margin while still failing fast
-when the hook genuinely will not fire.
+The 2.0 s default was chosen to accommodate slower tool startup in cloud
+sandboxes while still failing fast when the hook genuinely will not fire.
+With the parallel pre-launch path, hooks typically fire well under 1 ms.
|
||||
"""
|
||||
event = _stash_event.get(None)
|
||||
if event is None:
|
||||
@@ -169,7 +261,7 @@ async def wait_for_stash(timeout: float = 2.0) -> bool:
|
||||
if event.is_set():
|
||||
event.clear()
|
||||
return True
|
||||
-# Slow path: wait for the hook to signal.
+# Slow path: block until the hook signals or the safety timeout expires.
|
||||
try:
|
||||
async with asyncio.timeout(timeout):
|
||||
await event.wait()
|
||||
@@ -179,6 +271,82 @@ async def wait_for_stash(timeout: float = 2.0) -> bool:
|
||||
return False
|
||||
|
||||
|
||||
async def pre_launch_tool_call(tool_name: str, args: dict[str, Any]) -> None:
|
||||
"""Pre-launch a tool as a background task so parallel calls run concurrently.
|
||||
|
||||
Called when an AssistantMessage with ToolUseBlocks is received, before the
|
||||
SDK dispatches the MCP tool/call requests. The tool_handler will await the
|
||||
pre-launched task instead of executing fresh.
|
||||
|
||||
The tool_name may include an MCP prefix (e.g. ``mcp__copilot__run_block``);
|
||||
the prefix is stripped automatically before looking up the tool.
|
||||
|
||||
Ordering guarantee: the Claude Agent SDK dispatches MCP ``tools/call`` requests
|
||||
in the same order as the ToolUseBlocks appear in the AssistantMessage.
|
||||
Pre-launched tasks are queued FIFO per tool name, so the N-th handler for a
|
||||
given tool name dequeues the N-th pre-launched task — result and args always
|
||||
correspond when the SDK preserves order (which it does in the current SDK).
|
||||
"""
|
||||
queues = _tool_task_queues.get()
|
||||
if queues is None:
|
||||
return
|
||||
|
||||
# Strip the MCP server prefix (e.g. "mcp__copilot__") to get the bare tool name.
|
||||
# Use removeprefix so tool names that themselves contain "__" are handled correctly.
|
||||
bare_name = tool_name.removeprefix(MCP_TOOL_PREFIX)
|
||||
|
||||
base_tool = TOOL_REGISTRY.get(bare_name)
|
||||
if base_tool is None:
|
||||
return
|
||||
|
||||
user_id, session = get_execution_context()
|
||||
if session is None:
|
||||
return
|
||||
|
||||
# Expand @@agptfile: references before launching the task.
|
||||
# The _truncating wrapper (which normally handles expansion) runs AFTER
|
||||
# pre_launch_tool_call — the pre-launched task would otherwise receive raw
|
||||
# @@agptfile: tokens and fail to resolve them inside _execute_tool_sync.
|
||||
# Use _build_input_schema (same path as _truncating) for schema-aware expansion.
|
||||
input_schema: dict[str, Any] | None
|
||||
try:
|
||||
input_schema = _build_input_schema(base_tool)
|
||||
except Exception:
|
||||
input_schema = None # schema unavailable — skip schema-aware expansion
|
||||
try:
|
||||
args = await expand_file_refs_in_args(
|
||||
args, user_id, session, input_schema=input_schema
|
||||
)
|
||||
except FileRefExpansionError as exc:
|
||||
logger.warning(
|
||||
"pre_launch_tool_call: @@agptfile expansion failed for %s: %s — skipping pre-launch",
|
||||
bare_name,
|
||||
exc,
|
||||
)
|
||||
return
|
||||
|
||||
task = asyncio.create_task(_execute_tool_sync(base_tool, user_id, session, args))
|
||||
# Log unhandled exceptions so "Task exception was never retrieved" warnings
|
||||
# do not pollute stderr when a task is pre-launched but never dequeued.
|
||||
task.add_done_callback(
|
||||
lambda t, name=bare_name: (
|
||||
logger.warning(
|
||||
"Pre-launched task for %s raised unhandled: %s",
|
||||
name,
|
||||
t.exception(),
|
||||
)
|
||||
if not t.cancelled() and t.exception()
|
||||
else None
|
||||
)
|
||||
)
|
||||
|
||||
if bare_name not in queues:
|
||||
queues[bare_name] = asyncio.Queue[_TaskQueueItem]()
|
||||
# Store (task, args) so the handler can log a warning if the SDK dispatches
|
||||
# calls in a different order than the ToolUseBlocks appeared in the message.
|
||||
queues[bare_name].put_nowait((task, args))
|
||||
|
||||
|
||||
async def _execute_tool_sync(
|
||||
base_tool: BaseTool,
|
||||
user_id: str | None,
|
||||
@@ -187,8 +355,10 @@ async def _execute_tool_sync(
|
||||
) -> dict[str, Any]:
|
||||
"""Execute a tool synchronously and return MCP-formatted response.
|
||||
|
||||
-Note: ``@@agptfile:`` expansion is handled upstream in the ``_truncating`` wrapper
-so all registered handlers (BaseTool, E2B, Read) expand uniformly.
+Note: ``@@agptfile:`` expansion should be performed by the caller before
+invoking this function. For the normal (non-parallel) path it is handled
+by the ``_truncating`` wrapper; for the pre-launched parallel path it is
+handled in :func:`pre_launch_tool_call` before the task is created.
|
||||
"""
|
||||
effective_id = f"sdk-{uuid.uuid4().hex[:12]}"
|
||||
result = await base_tool.execute(
|
||||
@@ -217,6 +387,66 @@ def _mcp_error(message: str) -> dict[str, Any]:
|
||||
}
|
||||
|
||||
|
||||
def _failure_key(tool_name: str, args: dict[str, Any]) -> str:
    """Compute a stable fingerprint for (tool_name, args) used by the circuit breaker."""
    args_key = json.dumps(args, sort_keys=True, default=str)
    return f"{tool_name}:{args_key}"


def _check_circuit_breaker(tool_name: str, args: dict[str, Any]) -> str | None:
    """Check if a tool has hit the consecutive failure limit.

    Tracks failures keyed by (tool_name, args_fingerprint). Returns an error
    message if the circuit breaker has tripped, or None if the call should proceed.
    """
    tracker = _consecutive_tool_failures.get(None)
    if tracker is None:
        return None

    key = _failure_key(tool_name, args)
    count = tracker.get(key, 0)
    if count >= _MAX_CONSECUTIVE_TOOL_FAILURES:
        logger.warning(
            "Circuit breaker tripped for tool %s after %d consecutive "
            "identical failures (args=%s)",
            tool_name,
            count,
            key[len(tool_name) + 1 :][:200],
        )
        return (
            f"STOP: Tool '{tool_name}' has failed {count} consecutive times with "
            "the same arguments. Do NOT retry this tool call. "
            "If you were trying to write content to a file, instead respond with "
            "the content directly as a text message to the user."
        )
    return None


def _record_tool_failure(tool_name: str, args: dict[str, Any]) -> None:
    """Record a tool failure for circuit breaker tracking."""
    tracker = _consecutive_tool_failures.get(None)
    if tracker is None:
        return
    key = _failure_key(tool_name, args)
    tracker[key] = tracker.get(key, 0) + 1


def _clear_tool_failures(tool_name: str) -> None:
    """Clear failure tracking for a tool on success.

    Clears ALL args variants for the tool, not just the successful call's args.
    This gives the tool a "fresh start" on any success, which is appropriate for
    the primary use case (detecting infinite loops with identical failing args).
    """
    tracker = _consecutive_tool_failures.get(None)
    if tracker is None:
        return
    # Clear all entries for this tool name
    keys_to_remove = [k for k in tracker if k.startswith(f"{tool_name}:")]
    for k in keys_to_remove:
        del tracker[k]
|
||||
|
||||
|
||||
def create_tool_handler(base_tool: BaseTool):
|
||||
"""Create an async handler function for a BaseTool.
|
||||
|
||||
@@ -225,7 +455,83 @@ def create_tool_handler(base_tool: BaseTool):
|
||||
"""
|
||||
|
||||
async def tool_handler(args: dict[str, Any]) -> dict[str, Any]:
|
-"""Execute the wrapped tool and return MCP-formatted response."""
+"""Execute the wrapped tool and return MCP-formatted response.
|
||||
|
||||
If a pre-launched task exists (from parallel tool pre-launch in the
|
||||
message loop), await it instead of executing fresh.
|
||||
"""
|
||||
queues = _tool_task_queues.get()
|
||||
if queues and base_tool.name in queues:
|
||||
queue = queues[base_tool.name]
|
||||
if not queue.empty():
|
||||
task, launch_args = queue.get_nowait()
|
||||
# Sanity-check: warn if the args don't match — this can happen
|
||||
# if the SDK dispatches tool calls in a different order than the
|
||||
# ToolUseBlocks appeared in the AssistantMessage (unlikely but
|
||||
# could occur in future SDK versions or with SDK bugs).
|
||||
# We compare full values (not just keys) so that two run_block
|
||||
# calls with different block_id values are caught even though
|
||||
# both have the same key set.
|
||||
if launch_args != args:
|
||||
logger.warning(
|
||||
"Pre-launched task for %s: arg mismatch "
|
||||
"(launch_keys=%s, call_keys=%s) — cancelling "
|
||||
"pre-launched task and falling back to direct execution",
|
||||
base_tool.name,
|
||||
(
|
||||
sorted(launch_args.keys())
|
||||
if isinstance(launch_args, dict)
|
||||
else type(launch_args).__name__
|
||||
),
|
||||
(
|
||||
sorted(args.keys())
|
||||
if isinstance(args, dict)
|
||||
else type(args).__name__
|
||||
),
|
||||
)
|
||||
if not task.done():
|
||||
task.cancel()
|
||||
# Await cancellation to prevent duplicate concurrent
|
||||
# execution for blocks with side effects.
|
||||
try:
|
||||
await task
|
||||
except (asyncio.CancelledError, Exception):
|
||||
pass
|
||||
# Fall through to the direct-execution path below.
|
||||
else:
|
||||
# Args match — await the pre-launched task.
|
||||
try:
|
||||
result = await task
|
||||
except asyncio.CancelledError:
|
||||
# Re-raise: CancelledError may be propagating from the
|
||||
# outer streaming loop being cancelled — swallowing it
|
||||
# would mask the cancellation and prevent proper cleanup.
|
||||
logger.warning(
|
||||
"Pre-launched tool %s was cancelled — re-raising",
|
||||
base_tool.name,
|
||||
)
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(
|
||||
"Pre-launched tool %s failed: %s",
|
||||
base_tool.name,
|
||||
e,
|
||||
exc_info=True,
|
||||
)
|
||||
return _mcp_error(
|
||||
f"Failed to execute {base_tool.name}. "
|
||||
"Check server logs for details."
|
||||
)
|
||||
|
||||
# Pre-truncate the result so the _truncating wrapper (which
|
||||
# wraps this handler) receives an already-within-budget
|
||||
# value. _truncating handles stashing — we must NOT stash
|
||||
# here or the output will be appended twice to the FIFO
|
||||
# queue and pop_pending_tool_output would return a duplicate
|
||||
# entry on the second call for the same tool.
|
||||
return truncate(result, _MCP_MAX_CHARS)
|
||||
|
||||
# No pre-launched task — execute directly (fallback for non-parallel calls).
|
||||
user_id, session = get_execution_context()
|
||||
|
||||
if session is None:
|
||||
@@ -234,8 +540,12 @@ def create_tool_handler(base_tool: BaseTool):
|
||||
try:
|
||||
return await _execute_tool_sync(base_tool, user_id, session, args)
|
||||
except Exception as e:
|
||||
-logger.error(f"Error executing tool {base_tool.name}: {e}", exc_info=True)
-return _mcp_error(f"Failed to execute {base_tool.name}: {e}")
+logger.error(
+    "Error executing tool %s: %s", base_tool.name, e, exc_info=True
+)
+return _mcp_error(
+    f"Failed to execute {base_tool.name}. Check server logs for details."
+)
|
||||
|
||||
return tool_handler
|
||||
|
||||
@@ -358,6 +668,15 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):
|
||||
Applied once to every registered tool."""
|
||||
|
||||
async def wrapper(args: dict[str, Any]) -> dict[str, Any]:
|
||||
# Circuit breaker: stop infinite retry loops with identical args.
|
||||
# Use the original (pre-expansion) args for fingerprinting so
|
||||
# check and record always use the same key — @@agptfile:
|
||||
# expansion mutates args, which would cause a key mismatch.
|
||||
original_args = args
|
||||
stop_msg = _check_circuit_breaker(tool_name, original_args)
|
||||
if stop_msg:
|
||||
return _mcp_error(stop_msg)
|
||||
|
||||
user_id, session = get_execution_context()
|
||||
if session is not None:
|
||||
try:
|
||||
@@ -365,6 +684,7 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):
|
||||
args, user_id, session, input_schema=input_schema
|
||||
)
|
||||
except FileRefExpansionError as exc:
|
||||
_record_tool_failure(tool_name, original_args)
|
||||
return _mcp_error(
|
||||
f"@@agptfile: reference could not be resolved: {exc}. "
|
||||
"Ensure the file exists before referencing it. "
|
||||
@@ -374,6 +694,12 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):
|
||||
result = await fn(args)
|
||||
truncated = truncate(result, _MCP_MAX_CHARS)
|
||||
|
||||
# Track consecutive failures for circuit breaker
|
||||
if truncated.get("isError"):
|
||||
_record_tool_failure(tool_name, original_args)
|
||||
else:
|
||||
_clear_tool_failures(tool_name)
|
||||
|
||||
# Stash the text so the response adapter can forward our
|
||||
# middle-out truncated version to the frontend instead of the
|
||||
# SDK's head-truncated version (for outputs >~100 KB the SDK
|
||||
|
||||
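The pre-launch mechanism in this file (start each tool as a task when the message arrives, queue it FIFO per tool name, then have the handler await the matching task or fall back to direct execution) can be sketched as follows. This is a reduced illustration under stated assumptions, not the adapter's code: `fake_tool`, `pre_launch`, `handle`, and `demo` are hypothetical names, and the queues are passed explicitly where the real module uses a `ContextVar`.

```python
import asyncio
from typing import Any

Queues = dict[str, asyncio.Queue]


async def fake_tool(args: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for a real tool; it just echoes its arguments."""
    await asyncio.sleep(0.01)
    return {"echo": args}


def pre_launch(queues: Queues, name: str, args: dict[str, Any]) -> None:
    """Start the tool now and queue (task, args) FIFO under the tool name."""
    task = asyncio.create_task(fake_tool(args))
    queues.setdefault(name, asyncio.Queue()).put_nowait((task, args))


async def handle(queues: Queues, name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Await the matching pre-launched task, or fall back to direct execution."""
    queue = queues.get(name)
    if queue is not None and not queue.empty():
        task, launch_args = queue.get_nowait()
        if launch_args == args:
            return await task
        # Mismatch: cancel the stale task and wait for it to finish unwinding.
        task.cancel()
        try:
            await task
        except (asyncio.CancelledError, Exception):
            pass
    return await fake_tool(args)


async def demo() -> list[dict[str, Any]]:
    queues: Queues = {}
    pre_launch(queues, "run_block", {"id": 1})
    pre_launch(queues, "run_block", {"id": 2})
    pre_launch(queues, "other", {"id": 3})
    return [
        await handle(queues, "run_block", {"id": 1}),
        await handle(queues, "run_block", {"id": 2}),
        await handle(queues, "other", {"id": 9}),  # mismatch, runs directly
    ]
```

Calling `asyncio.run(demo())` runs the two `run_block` calls concurrently through their pre-launched tasks, while the mismatched `other` call discards its pre-launched task and executes with the arguments it was actually given.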
@@ -1,16 +1,26 @@
|
-"""Tests for tool_adapter helpers: truncation, stash, context vars."""
+"""Tests for tool_adapter helpers: truncation, stash, context vars, parallel pre-launch."""
|
||||
|
||||
import asyncio
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from backend.copilot.context import get_sdk_cwd
|
||||
from backend.copilot.response_model import StreamToolOutputAvailable
|
||||
from backend.copilot.sdk.file_ref import FileRefExpansionError
|
||||
from backend.util.truncate import truncate
|
||||
|
||||
from .tool_adapter import (
|
||||
_MCP_MAX_CHARS,
|
||||
_text_from_mcp_result,
|
||||
cancel_pending_tool_tasks,
|
||||
create_tool_handler,
|
||||
pop_pending_tool_output,
|
||||
pre_launch_tool_call,
|
||||
reset_stash_event,
|
||||
set_execution_context,
|
||||
stash_pending_tool_output,
|
||||
wait_for_stash,
|
||||
)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -120,6 +130,69 @@ class TestToolOutputStash:
|
||||
assert pop_pending_tool_output("a") == "alpha"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# reset_stash_event / wait_for_stash
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestResetStashEvent:
|
||||
"""Tests for reset_stash_event — the stale-signal fix for retry attempts."""
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _init_context(self):
|
||||
set_execution_context(
|
||||
user_id="test",
|
||||
session=None, # type: ignore[arg-type]
|
||||
sandbox=None,
|
||||
)
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_reset_clears_stale_signal(self):
|
||||
"""After reset, wait_for_stash does NOT return immediately (blocks until timeout)."""
|
||||
# Simulate a stale signal left by a failed attempt's PostToolUse hook.
|
||||
stash_pending_tool_output("some_tool", "stale output")
|
||||
# The stash_pending_tool_output call sets the event.
|
||||
# Now reset it — simulating start of a new retry attempt.
|
||||
reset_stash_event()
|
||||
# wait_for_stash should block and time out since the event was cleared.
|
||||
result = await wait_for_stash(timeout=0.05)
|
||||
assert result is False, (
|
||||
"wait_for_stash should have timed out after reset_stash_event, "
|
||||
"but it returned True — stale signal was not cleared"
|
||||
)
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_wait_returns_true_when_signaled_after_reset(self):
|
||||
"""After reset, a new stash signal is correctly detected."""
|
||||
reset_stash_event()
|
||||
|
||||
async def _signal_after_delay():
|
||||
await asyncio.sleep(0.01)
|
||||
stash_pending_tool_output("tool", "fresh output")
|
||||
|
||||
asyncio.create_task(_signal_after_delay())
|
||||
result = await wait_for_stash(timeout=1.0)
|
||||
assert result is True
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_retry_scenario_stale_event_does_not_fire_prematurely(self):
|
||||
"""Simulates: attempt 1 leaves event set → reset → attempt 2 waits correctly."""
|
||||
# Attempt 1: hook fires and sets the event
|
||||
stash_pending_tool_output("t", "attempt-1-output")
|
||||
# Pop it so the stash is empty (simulating normal consumption)
|
||||
pop_pending_tool_output("t")
|
||||
|
||||
# Between attempts: reset (as service.py does before each retry)
|
||||
reset_stash_event()
|
||||
|
||||
# Attempt 2: wait_for_stash should NOT return True immediately
|
||||
result = await wait_for_stash(timeout=0.05)
|
||||
assert result is False, (
|
||||
"Stale event from attempt 1 caused wait_for_stash to return "
|
||||
"prematurely in attempt 2"
|
||||
)
|
||||
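The stale-signal problem these tests guard against can be reproduced with a bare `asyncio.Event`. The sketch below is illustrative only: `wait_for_signal` and `demo_stale_signal` are hypothetical names, and it assumes (without confirmation from this diff) that `wait_for_stash` is built on an event plus a timeout.

```python
import asyncio


async def wait_for_signal(event: asyncio.Event, timeout: float) -> bool:
    """Return True if the event is set within `timeout` seconds, else False."""
    try:
        await asyncio.wait_for(event.wait(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def demo_stale_signal() -> tuple[bool, bool]:
    event = asyncio.Event()
    event.set()  # attempt 1 leaves the event set
    stale = await wait_for_signal(event, timeout=0.05)  # returns immediately
    event.clear()  # the reset step before attempt 2
    fresh = await wait_for_signal(event, timeout=0.05)  # now blocks, then times out
    return stale, fresh
```

Without the `clear()` call, the second wait would also return immediately, which is the premature wake-up the retry test describes.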
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _truncating wrapper (integration via create_copilot_mcp_server)
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -168,3 +241,534 @@ class TestTruncationAndStashIntegration:
|
||||
text = _text_from_mcp_result(truncated)
|
||||
assert len(text) < len(big_text)
|
||||
assert len(str(truncated)) <= _MCP_MAX_CHARS
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Parallel pre-launch infrastructure
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def _make_mock_tool(name: str, output: str = "result") -> MagicMock:
|
||||
"""Return a BaseTool mock that returns a successful StreamToolOutputAvailable."""
|
||||
tool = MagicMock()
|
||||
tool.name = name
|
||||
tool.parameters = {"properties": {}, "required": []}
|
||||
tool.execute = AsyncMock(
|
||||
return_value=StreamToolOutputAvailable(
|
||||
toolCallId="test-id",
|
||||
output=output,
|
||||
toolName=name,
|
||||
success=True,
|
||||
)
|
||||
)
|
||||
return tool
|
||||
|
||||
|
||||
def _make_mock_session() -> MagicMock:
|
||||
"""Return a minimal ChatSession mock."""
|
||||
return MagicMock()
|
||||
|
||||
|
||||
def _init_ctx(session=None):
|
||||
set_execution_context(
|
||||
user_id="user-1",
|
||||
session=session, # type: ignore[arg-type]
|
||||
sandbox=None,
|
||||
)
|
||||
|
||||
|
||||
class TestPreLaunchToolCall:
|
||||
"""Tests for pre_launch_tool_call and the queue-based parallel dispatch."""
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _init(self):
|
||||
_init_ctx(session=_make_mock_session())
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_unknown_tool_is_silently_ignored(self):
|
||||
"""pre_launch_tool_call does nothing for tools not in TOOL_REGISTRY."""
|
||||
# Should not raise even if the tool name is completely unknown
|
||||
await pre_launch_tool_call("nonexistent_tool", {})
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_mcp_prefix_stripped_before_registry_lookup(self):
|
||||
"""mcp__copilot__run_block is looked up as 'run_block'."""
|
||||
mock_tool = _make_mock_tool("run_block")
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("mcp__copilot__run_block", {"block_id": "b1"})
|
||||
|
||||
# The task was enqueued — mock_tool.execute should be called once
|
||||
# (may not complete immediately but should start)
|
||||
await asyncio.sleep(0) # yield to event loop
|
||||
mock_tool.execute.assert_awaited_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_bare_tool_name_without_prefix(self):
|
||||
"""Tool names without __ separator are looked up as-is."""
|
||||
mock_tool = _make_mock_tool("run_block")
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("run_block", {"block_id": "b1"})
|
||||
|
||||
await asyncio.sleep(0)
|
||||
mock_tool.execute.assert_awaited_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_task_enqueued_fifo_for_same_tool(self):
|
||||
"""Two pre-launched calls for the same tool name are both enqueued and executed."""
|
||||
results = []
|
||||
|
||||
async def slow_execute(*args, **kwargs):
|
||||
results.append(len(results))
|
||||
return StreamToolOutputAvailable(
|
||||
toolCallId="id",
|
||||
output=str(len(results) - 1),
|
||||
toolName="t",
|
||||
success=True,
|
||||
)
|
||||
|
||||
mock_tool = _make_mock_tool("t")
|
||||
mock_tool.execute = AsyncMock(side_effect=slow_execute)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"t": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("t", {"n": 1})
|
||||
await pre_launch_tool_call("t", {"n": 2})
|
||||
await asyncio.sleep(0)
|
||||
|
||||
assert mock_tool.execute.await_count == 2
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_file_ref_expansion_failure_skips_pre_launch(self):
|
||||
"""When @@agptfile: expansion fails, pre_launch_tool_call skips the task.
|
||||
|
||||
The handler should then fall back to direct execution (which will also
|
||||
fail with a proper MCP error via _truncating's own expansion).
|
||||
"""
|
||||
mock_tool = _make_mock_tool("run_block", output="should-not-execute")
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
),
|
||||
patch(
|
||||
"backend.copilot.sdk.tool_adapter.expand_file_refs_in_args",
|
||||
AsyncMock(side_effect=FileRefExpansionError("@@agptfile:missing.txt")),
|
||||
),
|
||||
):
|
||||
# Should not raise — expansion failure is handled gracefully
|
||||
await pre_launch_tool_call("run_block", {"text": "@@agptfile:missing.txt"})
|
||||
await asyncio.sleep(0)
|
||||
|
||||
# No task was pre-launched — execute was not called
|
||||
mock_tool.execute.assert_not_awaited()
|
||||
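The two name-lookup tests above describe a normalisation step: an MCP-prefixed name such as `mcp__copilot__run_block` resolves to `run_block`, while a bare name is used as-is. One way this could work is sketched below; `bare_tool_name` is a hypothetical helper, not the function in `tool_adapter`, and it assumes registry names never contain `__` themselves.

```python
def bare_tool_name(name: str) -> str:
    """Hypothetical helper: keep only the part after the last '__' separator."""
    # rsplit returns a one-element list when no separator is present,
    # so bare names pass through unchanged.
    return name.rsplit("__", 1)[-1]
```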
|
||||
|
||||
class TestCreateToolHandlerParallel:
|
||||
"""Tests for create_tool_handler using pre-launched tasks."""
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _init(self):
|
||||
_init_ctx(session=_make_mock_session())
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_handler_uses_prelaunched_task(self):
|
||||
"""Handler pops and awaits the pre-launched task rather than re-executing."""
|
||||
mock_tool = _make_mock_tool("run_block", output="pre-launched result")
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("run_block", {"block_id": "b1"})
|
||||
await asyncio.sleep(0) # let task start
|
||||
|
||||
handler = create_tool_handler(mock_tool)
|
||||
result = await handler({"block_id": "b1"})
|
||||
|
||||
assert result["isError"] is False
|
||||
text = result["content"][0]["text"]
|
||||
assert "pre-launched result" in text
|
||||
# Should only have been called once (the pre-launched task), not twice
|
||||
mock_tool.execute.assert_awaited_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_handler_does_not_double_stash_for_prelaunched_task(self):
|
||||
"""Pre-launched task result must NOT be stashed by tool_handler directly.
|
||||
|
||||
The _truncating wrapper wraps tool_handler and handles stashing after
|
||||
tool_handler returns. If tool_handler also stashed, the output would be
|
||||
appended twice to the FIFO queue and pop_pending_tool_output would return
|
||||
a duplicate on the second call.
|
||||
|
||||
This test calls tool_handler directly (without _truncating) and asserts
|
||||
that nothing was stashed — confirming stashing is deferred to _truncating.
|
||||
"""
|
||||
mock_tool = _make_mock_tool("run_block", output="stash-me")
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("run_block", {"block_id": "b1"})
|
||||
await asyncio.sleep(0)
|
||||
|
||||
handler = create_tool_handler(mock_tool)
|
||||
result = await handler({"block_id": "b1"})
|
||||
|
||||
assert result["isError"] is False
|
||||
assert "stash-me" in result["content"][0]["text"]
|
||||
# tool_handler must NOT stash — _truncating (which wraps handler) does it.
|
||||
# Calling pop here (without going through _truncating) should return None.
|
||||
not_stashed = pop_pending_tool_output("run_block")
|
||||
assert not_stashed is None, (
|
||||
"tool_handler must not stash directly — _truncating handles stashing "
|
||||
"to prevent double-stash in the FIFO queue"
|
||||
)
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_handler_falls_back_when_queue_empty(self):
|
||||
"""When no pre-launched task exists, handler executes directly."""
|
||||
mock_tool = _make_mock_tool("run_block", output="direct result")
|
||||
|
||||
# Don't call pre_launch_tool_call — queue is empty
|
||||
handler = create_tool_handler(mock_tool)
|
||||
result = await handler({"block_id": "b1"})
|
||||
|
||||
assert result["isError"] is False
|
||||
text = result["content"][0]["text"]
|
||||
assert "direct result" in text
|
||||
mock_tool.execute.assert_awaited_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_handler_cancelled_error_propagates(self):
|
||||
"""CancelledError from a pre-launched task is re-raised to preserve cancellation semantics."""
|
||||
mock_tool = _make_mock_tool("run_block")
|
||||
mock_tool.execute = AsyncMock(side_effect=asyncio.CancelledError())
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("run_block", {"block_id": "b1"})
|
||||
await asyncio.sleep(0)
|
||||
|
||||
handler = create_tool_handler(mock_tool)
|
||||
with pytest.raises(asyncio.CancelledError):
|
||||
await handler({"block_id": "b1"})
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_handler_exception_returns_mcp_error(self):
|
||||
"""Exception from a pre-launched task is caught and returned as MCP error."""
|
||||
mock_tool = _make_mock_tool("run_block")
|
||||
mock_tool.execute = AsyncMock(side_effect=RuntimeError("block exploded"))
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("run_block", {"block_id": "b1"})
|
||||
await asyncio.sleep(0)
|
||||
|
||||
handler = create_tool_handler(mock_tool)
|
||||
result = await handler({"block_id": "b1"})
|
||||
|
||||
assert result["isError"] is True
|
||||
assert "Failed to execute run_block" in result["content"][0]["text"]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_two_same_tool_calls_dispatched_in_order(self):
|
||||
"""Two pre-launched tasks for the same tool are consumed in FIFO order."""
|
||||
call_order = []
|
||||
|
||||
async def execute_with_tag(*args, **kwargs):
|
||||
tag = kwargs.get("block_id", "?")
|
||||
call_order.append(tag)
|
||||
return StreamToolOutputAvailable(
|
||||
toolCallId="id", output=f"out-{tag}", toolName="run_block", success=True
|
||||
)
|
||||
|
||||
mock_tool = _make_mock_tool("run_block")
|
||||
mock_tool.execute = AsyncMock(side_effect=execute_with_tag)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("run_block", {"block_id": "first"})
|
||||
await pre_launch_tool_call("run_block", {"block_id": "second"})
|
||||
await asyncio.sleep(0)
|
||||
|
||||
handler = create_tool_handler(mock_tool)
|
||||
r1 = await handler({"block_id": "first"})
|
||||
r2 = await handler({"block_id": "second"})
|
||||
|
||||
assert "out-first" in r1["content"][0]["text"]
|
||||
assert "out-second" in r2["content"][0]["text"]
|
||||
assert call_order == [
|
||||
"first",
|
||||
"second",
|
||||
], f"Expected FIFO dispatch order but got {call_order}"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_arg_mismatch_falls_back_to_direct_execution(self):
|
||||
"""When pre-launched args differ from SDK args, handler cancels pre-launched
|
||||
task and falls back to direct execution with the correct args."""
|
||||
mock_tool = _make_mock_tool("run_block", output="direct-result")
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
# Pre-launch with args {"block_id": "wrong"}
|
||||
await pre_launch_tool_call("run_block", {"block_id": "wrong"})
|
||||
await asyncio.sleep(0)
|
||||
|
||||
# SDK dispatches with different args
|
||||
handler = create_tool_handler(mock_tool)
|
||||
result = await handler({"block_id": "correct"})
|
||||
|
||||
assert result["isError"] is False
|
||||
# The tool was called twice: once by pre-launch (wrong args), once by
|
||||
# direct fallback (correct args). The result should come from the
|
||||
# direct execution path.
|
||||
assert mock_tool.execute.await_count == 2
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_no_session_falls_back_gracefully(self):
|
||||
"""When session is None and no pre-launched task, handler returns MCP error."""
|
||||
mock_tool = _make_mock_tool("run_block")
|
||||
# session=None means get_execution_context returns (user_id, None)
|
||||
set_execution_context(user_id="u", session=None, sandbox=None) # type: ignore[arg-type]
|
||||
|
||||
handler = create_tool_handler(mock_tool)
|
||||
result = await handler({"block_id": "b1"})
|
||||
|
||||
assert result["isError"] is True
|
||||
assert "session" in result["content"][0]["text"].lower()
|
||||
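The handler behaviour exercised above (pop the oldest pre-launched task, reuse it when the arguments match, otherwise cancel it and execute directly) can be sketched with a per-tool deque. This is a minimal illustration under stated assumptions, not the `tool_adapter` implementation: `_pending`, `run_tool`, `pre_launch`, `handle` and `demo` are all hypothetical names.

```python
import asyncio
from collections import defaultdict, deque
from typing import Any

# Hypothetical per-tool FIFO of (args, task) pairs.
_pending: dict[str, deque[tuple[dict[str, Any], asyncio.Task]]] = defaultdict(deque)
calls: list[dict[str, Any]] = []


async def run_tool(args: dict[str, Any]) -> str:
    """Stand-in for a tool's execute(); records each invocation."""
    calls.append(args)
    await asyncio.sleep(0)
    return f"out-{args['block_id']}"


def pre_launch(name: str, args: dict[str, Any]) -> None:
    """Start the tool speculatively and remember the args it was started with."""
    _pending[name].append((args, asyncio.create_task(run_tool(args))))


async def handle(name: str, args: dict[str, Any]) -> str:
    queue = _pending[name]
    if queue:
        launched_args, task = queue.popleft()
        if launched_args == args:
            return await task  # reuse the pre-launched result
        task.cancel()  # speculative call used different args
    return await run_tool(args)  # fall back to direct execution


async def demo() -> list[str]:
    pre_launch("run_block", {"block_id": "first"})
    pre_launch("run_block", {"block_id": "wrong"})
    r1 = await handle("run_block", {"block_id": "first"})
    r2 = await handle("run_block", {"block_id": "second"})
    return [r1, r2]
```

In this sketch the mismatched speculative call may already have run by the time it is discarded, which mirrors the arg-mismatch test's expectation of two `execute` calls.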
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# cancel_pending_tool_tasks
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestCancelPendingToolTasks:
|
||||
"""Tests for cancel_pending_tool_tasks — the stream-abort cleanup helper."""
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _init(self):
|
||||
_init_ctx(session=_make_mock_session())
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cancels_queued_tasks(self):
|
||||
"""Queued tasks are cancelled and the queue is cleared."""
|
||||
ran = False
|
||||
|
||||
async def never_run(*_args, **_kwargs):
|
||||
nonlocal ran
|
||||
await asyncio.sleep(10) # long enough to still be pending
|
||||
ran = True
|
||||
|
||||
mock_tool = _make_mock_tool("run_block")
|
||||
mock_tool.execute = AsyncMock(side_effect=never_run)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("run_block", {"block_id": "b1"})
|
||||
await asyncio.sleep(0) # let task start
|
||||
await cancel_pending_tool_tasks()
|
||||
await asyncio.sleep(0) # let cancellation propagate
|
||||
|
||||
assert not ran, "Task should have been cancelled before completing"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_noop_when_no_tasks_queued(self):
|
||||
"""cancel_pending_tool_tasks does not raise when queues are empty."""
|
||||
await cancel_pending_tool_tasks() # should not raise
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_handler_does_not_find_cancelled_task(self):
|
||||
"""After cancel, tool_handler falls back to direct execution."""
|
||||
mock_tool = _make_mock_tool("run_block", output="direct-fallback")
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"run_block": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("run_block", {"block_id": "b1"})
|
||||
await asyncio.sleep(0)
|
||||
await cancel_pending_tool_tasks()
|
||||
|
||||
# Queue is now empty — handler should execute directly
|
||||
handler = create_tool_handler(mock_tool)
|
||||
result = await handler({"block_id": "b1"})
|
||||
|
||||
assert result["isError"] is False
|
||||
assert "direct-fallback" in result["content"][0]["text"]
|
||||
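A stream-abort cleanup of the kind tested above usually cancels every outstanding task and then waits for the cancellations to land. The sketch below shows that pattern with plain asyncio; `cancel_all` and `demo_cancel` are hypothetical names, and the real `cancel_pending_tool_tasks` may differ.

```python
import asyncio


async def cancel_all(tasks: list[asyncio.Task]) -> None:
    """Cancel every task, wait for cancellation to finish, then clear the list."""
    for task in tasks:
        task.cancel()
    # return_exceptions=True collects CancelledError instead of raising it.
    await asyncio.gather(*tasks, return_exceptions=True)
    tasks.clear()


async def demo_cancel() -> tuple[bool, int]:
    completed = False

    async def long_running() -> None:
        nonlocal completed
        await asyncio.sleep(10)  # long enough to still be pending
        completed = True

    tasks = [asyncio.create_task(long_running())]
    await asyncio.sleep(0)  # let the task start
    await cancel_all(tasks)
    return completed, len(tasks)
```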
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Concurrent / parallel pre-launch scenarios
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestAllParallelToolsPrelaunchedIndependently:
|
||||
"""Simulate SDK sending N separate AssistantMessages for the same tool concurrently."""
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _init(self):
|
||||
_init_ctx(session=_make_mock_session())
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_all_parallel_tools_prelaunched_independently(self):
|
||||
"""5 pre-launches for the same tool all enqueue independently and run concurrently.
|
||||
|
||||
Each task sleeps for PER_TASK_S seconds. If they ran sequentially the total
|
||||
wall time would be ~5*PER_TASK_S. Running concurrently it should finish in
|
||||
roughly PER_TASK_S (plus scheduling overhead).
|
||||
"""
|
||||
PER_TASK_S = 0.05
|
||||
N = 5
|
||||
started: list[int] = []
|
||||
finished: list[int] = []
|
||||
|
||||
async def slow_execute(*args, **kwargs):
|
||||
idx = len(started)
|
||||
started.append(idx)
|
||||
await asyncio.sleep(PER_TASK_S)
|
||||
finished.append(idx)
|
||||
return StreamToolOutputAvailable(
|
||||
toolCallId=f"id-{idx}",
|
||||
output=f"result-{idx}",
|
||||
toolName="bash_exec",
|
||||
success=True,
|
||||
)
|
||||
|
||||
mock_tool = _make_mock_tool("bash_exec")
|
||||
mock_tool.execute = AsyncMock(side_effect=slow_execute)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"bash_exec": mock_tool},
|
||||
):
|
||||
for i in range(N):
|
||||
await pre_launch_tool_call("bash_exec", {"cmd": f"echo {i}"})
|
||||
|
||||
# Measure only the concurrent execution window, not pre-launch overhead.
|
||||
# Starting the timer here avoids false failures on slow CI runners where
|
||||
# the pre_launch_tool_call setup takes longer than the concurrent sleep.
|
||||
t0 = asyncio.get_running_loop().time()
|
||||
await asyncio.sleep(PER_TASK_S * 2)
|
||||
elapsed = asyncio.get_running_loop().time() - t0
|
||||
|
||||
assert mock_tool.execute.await_count == N
|
||||
assert len(finished) == N
|
||||
# Wall time of the sleep window should be well under N * PER_TASK_S
|
||||
# (sequential would be ~0.25s; the measured window is the 2 * PER_TASK_S = 0.10s sleep)
|
||||
assert elapsed < N * PER_TASK_S, (
|
||||
f"Expected concurrent execution (<{N * PER_TASK_S:.2f}s) "
|
||||
f"but sleep window took {elapsed:.2f}s"
|
||||
)
|
||||
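The test above measures a fixed sleep window and then checks that all tasks finished inside it. An alternative, shown here only as a sketch, is to time the tasks themselves with `asyncio.gather`; `timed_gather` is a hypothetical helper and not part of this PR.

```python
import asyncio
import time


async def timed_gather(n: int, per_task_s: float) -> tuple[int, float]:
    """Run n sleeping tasks concurrently; return (results count, wall time)."""

    async def work(i: int) -> int:
        await asyncio.sleep(per_task_s)
        return i

    t0 = time.perf_counter()
    results = await asyncio.gather(*(work(i) for i in range(n)))
    return len(results), time.perf_counter() - t0
```

With five tasks of 0.05 s each, sequential execution would take about 0.25 s, so a wall time well below that suggests the tasks overlapped.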
|
||||
|
||||
class TestHandlerReturnsResultFromCorrectPrelaunchedTask:
|
||||
"""Pop pre-launched tasks in order and verify each returns its own result."""
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _init(self):
|
||||
_init_ctx(session=_make_mock_session())
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_handler_returns_result_from_correct_prelaunched_task(self):
|
||||
"""Two pre-launches for the same tool: first handler gets first result, second gets second."""
|
||||
|
||||
async def execute_with_cmd(*args, **kwargs):
|
||||
cmd = kwargs.get("cmd", "?")
|
||||
return StreamToolOutputAvailable(
|
||||
toolCallId="id",
|
||||
output=f"output-for-{cmd}",
|
||||
toolName="bash_exec",
|
||||
success=True,
|
||||
)
|
||||
|
||||
mock_tool = _make_mock_tool("bash_exec")
|
||||
mock_tool.execute = AsyncMock(side_effect=execute_with_cmd)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"bash_exec": mock_tool},
|
||||
):
|
||||
await pre_launch_tool_call("bash_exec", {"cmd": "alpha"})
|
||||
await pre_launch_tool_call("bash_exec", {"cmd": "beta"})
|
||||
await asyncio.sleep(0) # let both tasks start
|
||||
|
||||
handler = create_tool_handler(mock_tool)
|
||||
r1 = await handler({"cmd": "alpha"})
|
||||
r2 = await handler({"cmd": "beta"})
|
||||
|
||||
text1 = r1["content"][0]["text"]
|
||||
text2 = r2["content"][0]["text"]
|
||||
assert "output-for-alpha" in text1, f"Expected alpha result, got: {text1}"
|
||||
assert "output-for-beta" in text2, f"Expected beta result, got: {text2}"
|
||||
assert mock_tool.execute.await_count == 2
|
||||
|
||||
|
||||
class TestFiveConcurrentPrelaunchAllComplete:
|
||||
"""Pre-launch 5 tasks; consume all 5 via handlers; assert all succeed."""
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _init(self):
|
||||
_init_ctx(session=_make_mock_session())
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_five_concurrent_prelaunch_all_complete(self):
|
||||
"""All 5 pre-launched tasks complete and return successful results."""
|
||||
N = 5
|
||||
call_count = 0
|
||||
|
||||
async def counting_execute(*args, **kwargs):
|
||||
nonlocal call_count
|
||||
call_count += 1
|
||||
n = call_count
|
||||
return StreamToolOutputAvailable(
|
||||
toolCallId=f"id-{n}",
|
||||
output=f"done-{n}",
|
||||
toolName="bash_exec",
|
||||
success=True,
|
||||
)
|
||||
|
||||
mock_tool = _make_mock_tool("bash_exec")
|
||||
mock_tool.execute = AsyncMock(side_effect=counting_execute)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.sdk.tool_adapter.TOOL_REGISTRY",
|
||||
{"bash_exec": mock_tool},
|
||||
):
|
||||
for i in range(N):
|
||||
await pre_launch_tool_call("bash_exec", {"cmd": f"task-{i}"})
|
||||
|
||||
await asyncio.sleep(0) # let all tasks start
|
||||
|
||||
handler = create_tool_handler(mock_tool)
|
||||
results = []
|
||||
for i in range(N):
|
||||
results.append(await handler({"cmd": f"task-{i}"}))
|
||||
|
||||
assert (
|
||||
mock_tool.execute.await_count == N
|
||||
), f"Expected {N} execute calls, got {mock_tool.execute.await_count}"
|
||||
for i, result in enumerate(results):
|
||||
assert result["isError"] is False, f"Result {i} should not be an error"
|
||||
text = result["content"][0]["text"]
|
||||
assert "done-" in text, f"Result {i} missing expected output: {text}"
|
||||
|
||||
@@ -22,13 +22,12 @@ class AddUnderstandingTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return """Capture and store information about the user's business context,
|
||||
workflows, pain points, and automation goals. Call this tool whenever the user
|
||||
shares information about their business. Each call incrementally adds to the
|
||||
existing understanding - you don't need to provide all fields at once.
|
||||
|
||||
Use this to build a comprehensive profile that helps recommend better agents
|
||||
and automations for the user's specific needs."""
|
||||
return (
|
||||
"Store user's business context, workflows, pain points, and automation goals. "
|
||||
"Call whenever the user shares business info. Each call incrementally merges "
|
||||
"with existing data — provide only the fields you have. "
|
||||
"Builds a profile that helps recommend better agents for the user's needs."
|
||||
)
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
|
||||
@@ -20,9 +20,9 @@ SSRF protection:
|
||||
|
||||
Requires:
|
||||
npm install -g agent-browser
|
||||
agent-browser install (downloads Chromium, one-time — skipped in Docker
|
||||
where system chromium is pre-installed and
|
||||
AGENT_BROWSER_EXECUTABLE_PATH is set)
|
||||
In Docker: system chromium package with AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
|
||||
(set automatically — no `agent-browser install` needed).
|
||||
Locally: run `agent-browser install` to download Chromium.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
@@ -410,18 +410,11 @@ class BrowserNavigateTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Navigate to a URL using a real browser. Returns an accessibility "
|
||||
"tree snapshot listing the page's interactive elements with @ref IDs "
|
||||
"(e.g. @e3) that can be used with browser_act. "
|
||||
"Session persists — cookies and login state carry over between calls. "
|
||||
"Use this (with browser_act) for multi-step interaction: login flows, "
|
||||
"form filling, button clicks, or anything requiring page interaction. "
|
||||
"For plain static pages, prefer web_fetch — no browser overhead. "
|
||||
"For authenticated pages: navigate to the login page first, use browser_act "
|
||||
"to fill credentials and submit, then navigate to the target page. "
|
||||
"Note: for slow SPAs, the returned snapshot may reflect a partially-loaded "
|
||||
"state. If elements seem missing, use browser_act with action='wait' and a "
|
||||
"CSS selector or millisecond delay, then take a browser_screenshot to verify."
|
||||
"Navigate to a URL in a real browser. Returns accessibility tree with @ref IDs "
|
||||
"for browser_act. Session persists (cookies/auth carry over). "
|
||||
"For static pages, prefer web_fetch. "
|
||||
"For SPAs, elements may load late — use browser_act with wait + browser_screenshot to verify. "
|
||||
"For auth: navigate to login, fill creds and submit with browser_act, then navigate to target."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -431,13 +424,13 @@ class BrowserNavigateTool(BaseTool):
|
||||
"properties": {
|
||||
"url": {
|
||||
"type": "string",
|
||||
"description": "The HTTP/HTTPS URL to navigate to.",
|
||||
"description": "HTTP/HTTPS URL to navigate to.",
|
||||
},
|
||||
"wait_for": {
|
||||
"type": "string",
|
||||
"enum": ["networkidle", "load", "domcontentloaded"],
|
||||
"default": "networkidle",
|
||||
"description": "When to consider navigation complete. Use 'networkidle' for SPAs (default).",
|
||||
"description": "Navigation completion strategy (default: networkidle).",
|
||||
},
|
||||
},
|
||||
"required": ["url"],
|
||||
@@ -556,14 +549,12 @@ class BrowserActTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Interact with the current browser page. Use @ref IDs from the "
|
||||
"snapshot (e.g. '@e3') to target elements. Returns an updated snapshot. "
|
||||
"Supported actions: click, dblclick, fill, type, scroll, hover, press, "
|
||||
"Interact with the current browser page using @ref IDs from the snapshot. "
|
||||
"Actions: click, dblclick, fill, type, scroll, hover, press, "
|
||||
"check, uncheck, select, wait, back, forward, reload. "
|
||||
"fill clears the field before typing; type appends without clearing. "
|
||||
"wait accepts a CSS selector (waits for element) or milliseconds string (e.g. '1000'). "
|
||||
"Example login flow: fill @e1 with email → fill @e2 with password → "
|
||||
"click @e3 (submit) → browser_navigate to the target page."
|
||||
"fill clears field first; type appends. "
|
||||
"wait accepts CSS selector or milliseconds (e.g. '1000'). "
|
||||
"Returns updated snapshot."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -589,30 +580,21 @@ class BrowserActTool(BaseTool):
|
||||
"forward",
|
||||
"reload",
|
||||
],
|
||||
"description": "The action to perform.",
|
||||
"description": "Action to perform.",
|
||||
},
|
||||
"target": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Element to target. Use @ref from snapshot (e.g. '@e3'), "
|
||||
"a CSS selector, or a text description. "
|
||||
"Required for: click, dblclick, fill, type, hover, check, uncheck, select. "
|
||||
"For wait: a CSS selector to wait for, or milliseconds as a string (e.g. '1000')."
|
||||
),
|
||||
"description": "@ref ID (e.g. '@e3'), CSS selector, or text. Required for: click, dblclick, fill, type, hover, check, uncheck, select. For wait: CSS selector or milliseconds string (e.g. '1000').",
|
||||
},
|
||||
"value": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"For fill/type: the text to enter. "
|
||||
"For press: key name (e.g. 'Enter', 'Tab', 'Control+a'). "
|
||||
"For select: the option value to select."
|
||||
),
|
||||
"description": "Text for fill/type, key for press (e.g. 'Enter'), option for select.",
|
||||
},
|
||||
"direction": {
|
||||
"type": "string",
|
||||
"enum": ["up", "down", "left", "right"],
|
||||
"default": "down",
|
||||
"description": "For scroll: direction to scroll.",
|
||||
"description": "Scroll direction (default: down).",
|
||||
},
|
||||
},
|
||||
"required": ["action"],
|
||||
@@ -759,12 +741,10 @@ class BrowserScreenshotTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Take a screenshot of the current browser page and save it to the workspace. "
|
||||
"IMPORTANT: After calling this tool, immediately call read_workspace_file "
|
||||
"with the returned file_id to display the image inline to the user — "
|
||||
"the screenshot is not visible until you do this. "
|
||||
"With annotate=true (default), @ref labels are overlaid on interactive "
|
||||
"elements, making it easy to see which @ref ID maps to which element on screen."
|
||||
"Screenshot the current browser page and save to workspace. "
|
||||
"annotate=true overlays @ref labels on elements. "
|
||||
"IMPORTANT: After calling, you MUST immediately call read_workspace_file with the "
|
||||
"returned file_id to display the image inline."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -775,12 +755,12 @@ class BrowserScreenshotTool(BaseTool):
|
||||
"annotate": {
|
||||
"type": "boolean",
|
||||
"default": True,
|
||||
"description": "Overlay @ref labels on interactive elements (default: true).",
|
||||
"description": "Overlay @ref labels (default: true).",
|
||||
},
|
||||
"filename": {
|
||||
"type": "string",
|
||||
"default": "screenshot.png",
|
||||
"description": "Filename to save in the workspace.",
|
||||
"description": "Workspace filename (default: screenshot.png).",
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
@@ -0,0 +1,351 @@
|
||||
"""Integration tests for agent-browser + system chromium.
|
||||
|
||||
These tests actually invoke the agent-browser binary via subprocess and require:
|
||||
- agent-browser installed (npm install -g agent-browser)
|
||||
- AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium (set in Docker)
|
||||
|
||||
Run with:
|
||||
poetry run test
|
||||
|
||||
Or to run only this file:
|
||||
poetry run pytest backend/copilot/tools/agent_browser_integration_test.py -v -p no:autogpt_platform
|
||||
|
||||
Skipped automatically when agent-browser binary is not found.
|
||||
Tests that hit external sites are marked ``integration`` and skipped by default
|
||||
in CI (use ``-m integration`` to include them).
|
||||
|
||||
Two test tiers:
|
||||
- CLI tests: call agent-browser subprocess directly (no backend imports needed)
|
||||
- Tool class tests: call BrowserNavigateTool/BrowserActTool._execute() directly
|
||||
with user_id=None (skips workspace/DB interactions — no Postgres/RabbitMQ needed)
|
||||
"""
|
||||
|
||||
import concurrent.futures
|
||||
import os
|
||||
import shutil
|
||||
import subprocess
|
||||
import tempfile
|
||||
from datetime import datetime, timezone
|
||||
from urllib.parse import urlparse
|
||||
|
||||
import pytest
|
||||
|
||||
from backend.copilot.model import ChatSession
|
||||
from backend.copilot.tools.agent_browser import BrowserActTool, BrowserNavigateTool
|
||||
from backend.copilot.tools.models import (
|
||||
BrowserActResponse,
|
||||
BrowserNavigateResponse,
|
||||
ErrorResponse,
|
||||
)
|
||||
|
||||
pytestmark = pytest.mark.skipif(
|
||||
shutil.which("agent-browser") is None,
|
||||
reason="agent-browser binary not found",
|
||||
)
|
||||
|
||||
_SESSION = "integration-test-session"
|
||||
|
||||
|
||||
def _agent_browser(
|
||||
*args: str, session: str = _SESSION, timeout: int = 30
|
||||
) -> tuple[int, str, str]:
|
||||
"""Run agent-browser for the given session, return (rc, stdout, stderr)."""
|
||||
result = subprocess.run(
|
||||
["agent-browser", "--session", session, "--session-name", session, *args],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=timeout,
|
||||
)
|
||||
return result.returncode, result.stdout, result.stderr
|
||||
|
||||
|
||||
def _close_session(session: str, timeout: int = 5) -> None:
|
||||
"""Best-effort close for a browser session; never raises on failure."""
|
||||
try:
|
||||
subprocess.run(
|
||||
["agent-browser", "--session", session, "--session-name", session, "close"],
|
||||
capture_output=True,
|
||||
timeout=timeout,
|
||||
)
|
||||
except (subprocess.TimeoutExpired, OSError):
|
||||
pass
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _teardown():
|
||||
"""Close the shared test session after each test (best-effort)."""
|
||||
yield
|
||||
_close_session(_SESSION)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_chromium_executable_env_is_set():
|
||||
"""AGENT_BROWSER_EXECUTABLE_PATH must be set and point to an executable binary."""
|
||||
exe = os.environ.get("AGENT_BROWSER_EXECUTABLE_PATH", "")
|
||||
assert exe, "AGENT_BROWSER_EXECUTABLE_PATH is not set"
|
||||
assert os.path.isfile(exe), f"Chromium binary not found at {exe}"
|
||||
assert os.access(exe, os.X_OK), f"Chromium binary at {exe} is not executable"
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_navigate_returns_success():
|
||||
"""agent-browser can open a public URL using system chromium."""
|
||||
rc, _, stderr = _agent_browser("open", "https://example.com")
|
||||
assert rc == 0, f"open failed (rc={rc}): {stderr}"
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_get_title_after_navigate():
|
||||
"""get title returns the page title after navigation."""
|
||||
rc, _, _ = _agent_browser("open", "https://example.com")
|
||||
assert rc == 0
|
||||
|
||||
rc, stdout, stderr = _agent_browser("get", "title", timeout=10)
|
||||
assert rc == 0, f"get title failed: {stderr}"
|
||||
assert "example" in stdout.lower()
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_get_url_after_navigate():
|
||||
"""get url returns the navigated URL."""
|
||||
rc, _, _ = _agent_browser("open", "https://example.com")
|
||||
assert rc == 0
|
||||
|
||||
rc, stdout, stderr = _agent_browser("get", "url", timeout=10)
|
||||
assert rc == 0, f"get url failed: {stderr}"
|
||||
assert urlparse(stdout.strip()).netloc == "example.com"
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_snapshot_returns_interactive_elements():
|
||||
"""snapshot -i -c lists interactive elements on the page."""
|
||||
rc, _, _ = _agent_browser("open", "https://example.com")
|
||||
assert rc == 0
|
||||
|
||||
rc, stdout, stderr = _agent_browser("snapshot", "-i", "-c", timeout=15)
|
||||
assert rc == 0, f"snapshot failed: {stderr}"
|
||||
assert len(stdout.strip()) > 0, "snapshot returned empty output"
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_screenshot_produces_valid_png():
|
||||
"""screenshot saves a non-empty, valid PNG file."""
|
||||
rc, _, _ = _agent_browser("open", "https://example.com")
|
||||
assert rc == 0
|
||||
|
||||
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as f:
|
||||
tmp = f.name
|
||||
try:
|
||||
rc, _, stderr = _agent_browser("screenshot", tmp, timeout=15)
|
||||
assert rc == 0, f"screenshot failed: {stderr}"
|
||||
size = os.path.getsize(tmp)
|
||||
assert size > 1000, f"PNG too small ({size} bytes) — likely blank or corrupt"
|
||||
with open(tmp, "rb") as f:
|
||||
assert f.read(4) == b"\x89PNG", "Output is not a valid PNG"
|
||||
finally:
|
||||
os.unlink(tmp)
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_scroll_down():
|
||||
"""scroll down succeeds without error."""
|
||||
rc, _, _ = _agent_browser("open", "https://example.com")
|
||||
assert rc == 0
|
||||
|
||||
rc, _, stderr = _agent_browser("scroll", "down", timeout=10)
|
||||
assert rc == 0, f"scroll failed: {stderr}"
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_fill_form_field():
|
||||
"""fill writes text into an input field."""
|
||||
rc, _, _ = _agent_browser("open", "https://httpbin.org/forms/post")
|
||||
assert rc == 0
|
||||
|
||||
rc, _, stderr = _agent_browser(
|
||||
"fill", "input[name=custname]", "IntegrationTestUser", timeout=10
|
||||
)
|
||||
assert rc == 0, f"fill failed: {stderr}"
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_concurrent_independent_sessions():
|
||||
"""Two independent sessions can navigate in parallel without interference."""
|
||||
session_a = "integration-concurrent-a"
|
||||
session_b = "integration-concurrent-b"
|
||||
|
||||
try:
|
||||
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
|
||||
fut_a = pool.submit(
|
||||
_agent_browser, "open", "https://example.com", session=session_a
|
||||
)
|
||||
fut_b = pool.submit(
|
||||
_agent_browser, "open", "https://httpbin.org/html", session=session_b
|
||||
)
|
||||
rc_a, _, err_a = fut_a.result(timeout=40)
|
||||
rc_b, _, err_b = fut_b.result(timeout=40)
|
||||
assert rc_a == 0, f"session_a open failed: {err_a}"
|
||||
assert rc_b == 0, f"session_b open failed: {err_b}"
|
||||
|
||||
rc_ua, url_a, err_ua = _agent_browser(
|
||||
"get", "url", session=session_a, timeout=10
|
||||
)
|
||||
rc_ub, url_b, err_ub = _agent_browser(
|
||||
"get", "url", session=session_b, timeout=10
|
||||
)
|
||||
assert rc_ua == 0, f"session_a get url failed: {err_ua}"
|
||||
assert rc_ub == 0, f"session_b get url failed: {err_ub}"
|
||||
assert urlparse(url_a.strip()).netloc == "example.com"
|
||||
assert urlparse(url_b.strip()).netloc == "httpbin.org"
|
||||
finally:
|
||||
_close_session(session_a)
|
||||
_close_session(session_b)
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_close_session():
|
||||
"""close shuts down the browser daemon cleanly."""
|
||||
rc, _, _ = _agent_browser("open", "https://example.com")
|
||||
assert rc == 0
|
||||
|
||||
rc, _, stderr = _agent_browser("close", timeout=10)
|
||||
assert rc == 0, f"close failed: {stderr}"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Python tool class integration tests
|
||||
#
|
||||
# These tests exercise the actual BrowserNavigateTool / BrowserActTool Python
|
||||
# classes (not just the CLI binary) to verify the full call path — URL
|
||||
# validation, subprocess dispatch, response parsing — works with system
|
||||
# chromium. user_id=None skips workspace/DB interactions so no Postgres or
|
||||
# RabbitMQ is needed.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_TOOL_SESSION_ID = "integration-tool-test-session"
|
||||
_TEST_SESSION = ChatSession(
|
||||
session_id=_TOOL_SESSION_ID,
|
||||
user_id="test-user",
|
||||
messages=[],
|
||||
usage=[],
|
||||
started_at=datetime.now(timezone.utc),
|
||||
updated_at=datetime.now(timezone.utc),
|
||||
)
|
||||
|
||||
|
||||
@pytest.fixture(autouse=False)
|
||||
def _close_tool_session():
|
||||
"""Tear down the tool-test browser session after each tool test."""
|
||||
yield
|
||||
_close_session(_TOOL_SESSION_ID)
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
@pytest.mark.asyncio
|
||||
async def test_tool_navigate_returns_response(_close_tool_session):
|
||||
"""BrowserNavigateTool._execute returns a BrowserNavigateResponse with real content."""
|
||||
tool = BrowserNavigateTool()
|
||||
resp = await tool._execute(
|
||||
user_id=None, session=_TEST_SESSION, url="https://example.com"
|
||||
)
|
||||
assert isinstance(
|
||||
resp, BrowserNavigateResponse
|
||||
), f"Expected BrowserNavigateResponse, got: {resp}"
|
||||
assert urlparse(resp.url).netloc == "example.com"
|
||||
assert resp.title, "Expected non-empty page title"
|
||||
assert resp.snapshot, "Expected non-empty accessibility snapshot"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
@pytest.mark.parametrize(
|
||||
"ssrf_url",
|
||||
[
|
||||
"http://169.254.169.254/", # AWS/GCP/Azure metadata endpoint
|
||||
"http://127.0.0.1/", # IPv4 loopback
|
||||
"http://10.0.0.1/", # RFC-1918 private range
|
||||
"http://[::1]/", # IPv6 loopback
|
||||
"http://0.0.0.0/", # Wildcard / INADDR_ANY
|
||||
],
|
||||
)
|
||||
async def test_tool_navigate_blocked_url(ssrf_url: str, _close_tool_session):
|
||||
"""BrowserNavigateTool._execute rejects internal/private URLs (SSRF guard)."""
|
||||
tool = BrowserNavigateTool()
|
||||
resp = await tool._execute(user_id=None, session=_TEST_SESSION, url=ssrf_url)
|
||||
assert isinstance(
|
||||
resp, ErrorResponse
|
||||
), f"Expected ErrorResponse for SSRF URL {ssrf_url!r}, got: {resp}"
|
||||
assert resp.error == "blocked_url"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_tool_navigate_missing_url(_close_tool_session):
|
||||
"""BrowserNavigateTool._execute returns an error when url is empty."""
|
||||
tool = BrowserNavigateTool()
|
||||
resp = await tool._execute(user_id=None, session=_TEST_SESSION, url="")
|
||||
assert isinstance(resp, ErrorResponse)
|
||||
assert resp.error == "missing_url"
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
@pytest.mark.asyncio
|
||||
async def test_tool_act_scroll(_close_tool_session):
|
||||
"""BrowserActTool._execute can scroll after a navigate."""
|
||||
nav = BrowserNavigateTool()
|
||||
nav_resp = await nav._execute(
|
||||
user_id=None, session=_TEST_SESSION, url="https://example.com"
|
||||
)
|
||||
assert isinstance(nav_resp, BrowserNavigateResponse)
|
||||
|
||||
act = BrowserActTool()
|
||||
resp = await act._execute(
|
||||
user_id=None, session=_TEST_SESSION, action="scroll", direction="down"
|
||||
)
|
||||
assert isinstance(
|
||||
resp, BrowserActResponse
|
||||
), f"Expected BrowserActResponse, got: {resp}"
|
||||
assert resp.action == "scroll"
|
||||
|
||||
|
||||
@pytest.mark.integration
|
||||
@pytest.mark.asyncio
|
||||
async def test_tool_act_fill_and_click(_close_tool_session):
|
||||
"""BrowserActTool._execute can fill a form field."""
|
||||
nav = BrowserNavigateTool()
|
||||
nav_resp = await nav._execute(
|
||||
user_id=None, session=_TEST_SESSION, url="https://httpbin.org/forms/post"
|
||||
)
|
||||
assert isinstance(nav_resp, BrowserNavigateResponse)
|
||||
|
||||
act = BrowserActTool()
|
||||
resp = await act._execute(
|
||||
user_id=None,
|
||||
session=_TEST_SESSION,
|
||||
action="fill",
|
||||
target="input[name=custname]",
|
||||
value="ToolIntegrationTest",
|
||||
)
|
||||
assert isinstance(resp, BrowserActResponse), f"fill failed: {resp}"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_tool_act_missing_action(_close_tool_session):
|
||||
"""BrowserActTool._execute returns an error when action is missing."""
|
||||
act = BrowserActTool()
|
||||
resp = await act._execute(user_id=None, session=_TEST_SESSION, action="")
|
||||
assert isinstance(resp, ErrorResponse)
|
||||
assert resp.error == "missing_action"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_tool_act_missing_target(_close_tool_session):
|
||||
"""BrowserActTool._execute returns an error when click target is missing."""
|
||||
act = BrowserActTool()
|
||||
resp = await act._execute(
|
||||
user_id=None, session=_TEST_SESSION, action="click", target=""
|
||||
)
|
||||
assert isinstance(resp, ErrorResponse)
|
||||
assert resp.error == "missing_target"
|
||||
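The SSRF cases parametrized in `test_tool_navigate_blocked_url` can be illustrated with a minimal sketch of an address check. This is not the repository's guard: `is_blocked_url` is a hypothetical helper, it only inspects literal IP hosts, and a real guard would presumably also resolve hostnames before deciding.

```python
import ipaddress
from urllib.parse import urlparse


def is_blocked_url(url: str) -> bool:
    """Hypothetical check: True when the URL's host is a literal internal IP."""
    host = urlparse(url).hostname
    if not host:
        # No host at all (for example an empty string): treat as unusable.
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP. Assumption: hostnames are handled elsewhere
        # (a production guard would resolve DNS and check every address).
        return False
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_unspecified
    )


# The five URLs from the parametrized test above.
SSRF_URLS = [
    "http://169.254.169.254/",
    "http://127.0.0.1/",
    "http://10.0.0.1/",
    "http://[::1]/",
    "http://0.0.0.0/",
]
```

Each entry in `SSRF_URLS` falls into at least one of the four `ipaddress` categories, which is why a single predicate covers the metadata endpoint, both loopbacks, the RFC 1918 range, and the wildcard address.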
@@ -7,7 +7,7 @@ from typing import Any
|
||||
from .helpers import (
|
||||
AGENT_EXECUTOR_BLOCK_ID,
|
||||
MCP_TOOL_BLOCK_ID,
|
||||
SMART_DECISION_MAKER_BLOCK_ID,
|
||||
TOOL_ORCHESTRATOR_BLOCK_ID,
|
||||
AgentDict,
|
||||
are_types_compatible,
|
||||
generate_uuid,
|
||||
@@ -31,7 +31,7 @@ _GET_CURRENT_DATE_BLOCK_ID = "b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1"
|
||||
_GMAIL_SEND_BLOCK_ID = "6c27abc2-e51d-499e-a85f-5a0041ba94f0"
|
||||
_TEXT_REPLACE_BLOCK_ID = "7e7c87ab-3469-4bcc-9abe-67705091b713"
|
||||
|
||||
# Defaults applied to SmartDecisionMakerBlock nodes by the fixer.
|
||||
# Defaults applied to OrchestratorBlock nodes by the fixer.
|
||||
_SDM_DEFAULTS: dict[str, int | bool] = {
|
||||
"agent_mode_max_iterations": 10,
|
||||
"conversation_compaction": True,
|
||||
@@ -1639,8 +1639,8 @@ class AgentFixer:
|
||||
|
||||
return agent
|
||||
|
||||
def fix_smart_decision_maker_blocks(self, agent: AgentDict) -> AgentDict:
|
||||
"""Fix SmartDecisionMakerBlock nodes to ensure agent-mode defaults.
|
||||
def fix_orchestrator_blocks(self, agent: AgentDict) -> AgentDict:
|
||||
"""Fix OrchestratorBlock nodes to ensure agent-mode defaults.
|
||||
|
||||
Ensures:
|
||||
1. ``agent_mode_max_iterations`` defaults to ``10`` (bounded agent mode)
|
||||
@@ -1657,7 +1657,7 @@ class AgentFixer:
|
||||
nodes = agent.get("nodes", [])
|
||||
|
||||
for node in nodes:
|
||||
if node.get("block_id") != SMART_DECISION_MAKER_BLOCK_ID:
|
||||
if node.get("block_id") != TOOL_ORCHESTRATOR_BLOCK_ID:
|
||||
continue
|
||||
|
||||
node_id = node.get("id", "unknown")
|
||||
@@ -1670,7 +1670,7 @@ class AgentFixer:
|
||||
if field not in input_default or input_default[field] is None:
|
||||
input_default[field] = default_value
|
||||
self.add_fix_log(
|
||||
f"SmartDecisionMakerBlock {node_id}: "
|
||||
f"OrchestratorBlock {node_id}: "
|
||||
f"Set {field}={default_value!r}"
|
||||
)
|
||||
|
||||
@@ -1763,8 +1763,8 @@ class AgentFixer:
|
||||
# Apply fixes for MCPToolBlock nodes
|
||||
agent = self.fix_mcp_tool_blocks(agent)
|
||||
|
||||
# Apply fixes for SmartDecisionMakerBlock nodes (agent-mode defaults)
|
||||
agent = self.fix_smart_decision_maker_blocks(agent)
|
||||
# Apply fixes for OrchestratorBlock nodes (agent-mode defaults)
|
||||
agent = self.fix_orchestrator_blocks(agent)
|
||||
|
||||
# Apply fixes for AgentExecutorBlock nodes (sub-agents)
|
||||
if library_agents:
|
||||
|
||||
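The default-filling rule in the fixer hunk above (set a field only when it is absent or `None`) can be sketched in isolation. `apply_defaults` and `DEFAULTS` are hypothetical names, and the mapping contains only the two defaults visible in the hunk; the real `_SDM_DEFAULTS` may hold more.

```python
from typing import Any

# Assumption: only the two entries visible in the diff hunk.
DEFAULTS: dict[str, Any] = {
    "agent_mode_max_iterations": 10,
    "conversation_compaction": True,
}


def apply_defaults(input_default: dict[str, Any], defaults: dict[str, Any]) -> list[str]:
    """Fill missing or None fields in place and return one log line per fix."""
    log: list[str] = []
    for field, default_value in defaults.items():
        # Explicit values, including False and 0, are left untouched.
        if field not in input_default or input_default[field] is None:
            input_default[field] = default_value
            log.append(f"Set {field}={default_value!r}")
    return log
```

Checking `is None` rather than truthiness matters here: a user who explicitly sets `conversation_compaction` to `False` keeps that choice.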
@@ -12,7 +12,7 @@ __all__ = [
|
||||
"AGENT_OUTPUT_BLOCK_ID",
|
||||
"AgentDict",
|
||||
"MCP_TOOL_BLOCK_ID",
|
||||
"SMART_DECISION_MAKER_BLOCK_ID",
|
||||
"TOOL_ORCHESTRATOR_BLOCK_ID",
|
||||
"UUID_REGEX",
|
||||
"are_types_compatible",
|
||||
"generate_uuid",
|
||||
@@ -34,7 +34,7 @@ UUID_REGEX = re.compile(r"^" + UUID_RE_STR + r"$")
|
||||
|
||||
AGENT_EXECUTOR_BLOCK_ID = "e189baac-8c20-45a1-94a7-55177ea42565"
|
||||
MCP_TOOL_BLOCK_ID = "a0a4b1c2-d3e4-4f56-a7b8-c9d0e1f2a3b4"
|
||||
SMART_DECISION_MAKER_BLOCK_ID = "3b191d9f-356f-482d-8238-ba04b6d18381"
|
||||
TOOL_ORCHESTRATOR_BLOCK_ID = "3b191d9f-356f-482d-8238-ba04b6d18381"
|
||||
AGENT_INPUT_BLOCK_ID = "c0a8e994-ebf1-4a9c-a4d8-89d09c86741b"
|
||||
AGENT_OUTPUT_BLOCK_ID = "363ae599-353e-4804-937e-b2ee3cef3da4"
|
||||
|
||||
|
||||
@@ -10,7 +10,7 @@ from .helpers import (
|
||||
AGENT_INPUT_BLOCK_ID,
|
||||
AGENT_OUTPUT_BLOCK_ID,
|
||||
MCP_TOOL_BLOCK_ID,
|
||||
SMART_DECISION_MAKER_BLOCK_ID,
|
||||
TOOL_ORCHESTRATOR_BLOCK_ID,
|
||||
AgentDict,
|
||||
are_types_compatible,
|
||||
get_defined_property_type,
|
||||
@@ -827,18 +827,18 @@ class AgentValidator:
|
||||
|
||||
return valid
|
||||
|
||||
def validate_smart_decision_maker_blocks(
|
||||
def validate_orchestrator_blocks(
|
||||
self,
|
||||
agent: AgentDict,
|
||||
node_lookup: dict[str, dict[str, Any]] | None = None,
|
||||
) -> bool:
|
||||
"""Validate that SmartDecisionMakerBlock nodes have downstream tools.
|
||||
"""Validate that OrchestratorBlock nodes have downstream tools.
|
||||
|
||||
Checks that each SmartDecisionMakerBlock node has at least one link
|
||||
Checks that each OrchestratorBlock node has at least one link
|
||||
with ``source_name == "tools"`` connecting to a downstream block.
|
||||
Without tools, the block has nothing to call and will error at runtime.
|
||||
|
||||
Returns True if all SmartDecisionMakerBlock nodes are valid.
|
||||
Returns True if all OrchestratorBlock nodes are valid.
|
||||
"""
|
||||
valid = True
|
||||
nodes = agent.get("nodes", [])
|
||||
@@ -848,7 +848,7 @@ class AgentValidator:
|
||||
non_tool_block_ids = {AGENT_INPUT_BLOCK_ID, AGENT_OUTPUT_BLOCK_ID}
|
||||
|
||||
for node in nodes:
|
||||
if node.get("block_id") != SMART_DECISION_MAKER_BLOCK_ID:
|
||||
if node.get("block_id") != TOOL_ORCHESTRATOR_BLOCK_ID:
|
||||
continue
|
||||
|
||||
node_id = node.get("id", "unknown")
|
||||
@@ -863,7 +863,7 @@ class AgentValidator:
|
||||
max_iter = input_default.get("agent_mode_max_iterations")
|
||||
if max_iter is not None and not isinstance(max_iter, int):
|
||||
self.add_error(
|
||||
f"SmartDecisionMakerBlock node '{customized_name}' "
|
||||
f"OrchestratorBlock node '{customized_name}' "
|
||||
f"({node_id}) has non-integer "
|
||||
f"agent_mode_max_iterations={max_iter!r}. "
|
||||
f"This field must be an integer."
|
||||
@@ -871,7 +871,7 @@ class AgentValidator:
|
||||
valid = False
|
||||
elif isinstance(max_iter, int) and max_iter < -1:
|
||||
self.add_error(
|
||||
f"SmartDecisionMakerBlock node '{customized_name}' "
|
||||
f"OrchestratorBlock node '{customized_name}' "
|
||||
f"({node_id}) has invalid "
|
||||
f"agent_mode_max_iterations={max_iter}. "
|
||||
f"Use -1 for infinite or a positive number for "
|
||||
@@ -880,7 +880,7 @@ class AgentValidator:
|
||||
valid = False
|
||||
elif isinstance(max_iter, int) and max_iter > 100:
|
||||
self.add_error(
|
||||
f"SmartDecisionMakerBlock node '{customized_name}' "
|
||||
f"OrchestratorBlock node '{customized_name}' "
|
||||
f"({node_id}) has agent_mode_max_iterations="
|
||||
f"{max_iter} which is unusually high. Values above "
|
||||
f"100 risk excessive cost and long execution times. "
|
||||
@@ -890,7 +890,7 @@ class AgentValidator:
|
||||
valid = False
|
||||
elif max_iter == 0:
|
||||
self.add_error(
|
||||
f"SmartDecisionMakerBlock node '{customized_name}' "
|
||||
f"OrchestratorBlock node '{customized_name}' "
|
||||
f"({node_id}) has agent_mode_max_iterations=0 "
|
||||
f"(traditional mode). The agent generator only supports "
|
||||
f"agent mode (set to -1 for infinite or a positive "
|
||||
@@ -908,7 +908,7 @@ class AgentValidator:
|
||||
|
||||
if not has_tools:
|
||||
self.add_error(
|
||||
f"SmartDecisionMakerBlock node '{customized_name}' "
|
||||
f"OrchestratorBlock node '{customized_name}' "
|
||||
f"({node_id}) has no downstream tool blocks connected. "
|
||||
f"Connect at least one block to its 'tools' output so "
|
||||
f"the AI has tools to call."
|
||||
@@ -1025,8 +1025,8 @@ class AgentValidator:
|
||||
self.validate_mcp_tool_blocks(agent),
|
||||
),
|
||||
(
|
||||
"SmartDecisionMaker blocks",
|
||||
self.validate_smart_decision_maker_blocks(agent, node_lookup),
|
||||
"Orchestrator blocks",
|
||||
self.validate_orchestrator_blocks(agent, node_lookup),
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
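The `agent_mode_max_iterations` rules checked in the validator hunks above can be summarised in a standalone sketch. `check_max_iterations` is a hypothetical function and its messages are paraphrased; only the branch order is taken from the diff.

```python
from typing import Any, Optional


def check_max_iterations(max_iter: Any) -> Optional[str]:
    """Return an error message for an invalid value, or None when acceptable."""
    if max_iter is not None and not isinstance(max_iter, int):
        return "must be an integer"
    if isinstance(max_iter, int) and max_iter < -1:
        return "use -1 for infinite or a positive number"
    if isinstance(max_iter, int) and max_iter > 100:
        return "values above 100 risk excessive cost"
    if max_iter == 0:
        return "traditional mode (0) is not supported"
    # None (unset), -1 (infinite) and 1..100 all pass.
    return None
```

Because `bool` is a subclass of `int` in Python, a value of `True` passes the `isinstance(..., int)` test in this sketch, just as it would in the branches shown in the diff.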
@@ -108,22 +108,12 @@ class AgentOutputTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return """Retrieve execution outputs from agents in the user's library.
|
||||
|
||||
Identify the agent using one of:
|
||||
- agent_name: Fuzzy search in user's library
|
||||
- library_agent_id: Exact library agent ID
|
||||
- store_slug: Marketplace format 'username/agent-name'
|
||||
|
||||
Select which run to retrieve using:
|
||||
- execution_id: Specific execution ID
|
||||
- run_time: 'latest' (default), 'yesterday', 'last week', or ISO date 'YYYY-MM-DD'
|
||||
|
||||
Wait for completion (optional):
|
||||
- wait_if_running: Max seconds to wait if execution is still running (0-300).
|
||||
If the execution is running/queued, waits up to this many seconds for completion.
|
||||
Returns current status on timeout. If already finished, returns immediately.
|
||||
"""
|
||||
return (
|
||||
"Retrieve execution outputs from a library agent. "
|
||||
"Identify by agent_name, library_agent_id, or store_slug. "
|
||||
"Filter by execution_id or run_time. "
|
||||
"Optionally wait for running executions."
|
||||
)
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
@@ -132,32 +122,29 @@ class AgentOutputTool(BaseTool):
|
||||
"properties": {
|
||||
"agent_name": {
|
||||
"type": "string",
|
||||
"description": "Agent name to search for in user's library (fuzzy match)",
|
||||
"description": "Agent name (fuzzy match).",
|
||||
},
|
||||
"library_agent_id": {
|
||||
"type": "string",
|
||||
"description": "Exact library agent ID",
|
||||
"description": "Library agent ID.",
|
||||
},
|
||||
"store_slug": {
|
||||
"type": "string",
|
||||
"description": "Marketplace identifier: 'username/agent-slug'",
|
||||
"description": "Marketplace 'username/agent-name'.",
|
||||
},
|
||||
"execution_id": {
|
||||
"type": "string",
|
||||
"description": "Specific execution ID to retrieve",
|
||||
"description": "Specific execution ID.",
|
||||
},
|
||||
"run_time": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Time filter: 'latest', 'yesterday', 'last week', or 'YYYY-MM-DD'"
|
||||
),
|
||||
"description": "Time filter: 'latest', 'today', 'yesterday', 'last week', 'last 7 days', 'last month', 'last 30 days', 'YYYY-MM-DD', or ISO datetime.",
|
||||
},
|
||||
"wait_if_running": {
|
||||
"type": "integer",
|
||||
"description": (
|
||||
"Max seconds to wait if execution is still running (0-300). "
|
||||
"If running, waits for completion. Returns current state on timeout."
|
||||
),
|
||||
"description": "Max seconds to wait if still running (0-300). Returns current state on timeout.",
|
||||
"minimum": 0,
|
||||
"maximum": 300,
|
||||
},
|
||||
},
|
||||
"required": [],
|
||||
|
||||
@@ -42,15 +42,9 @@ class BashExecTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Execute a Bash command or script. "
|
||||
"Full Bash scripting is supported (loops, conditionals, pipes, "
|
||||
"functions, etc.). "
|
||||
"The working directory is shared with the SDK Read/Write/Edit/Glob/Grep "
|
||||
"tools — files created by either are immediately visible to both. "
|
||||
"Execution is killed after the timeout (default 30s, max 120s). "
|
||||
"Returns stdout and stderr. "
|
||||
"Useful for file manipulation, data processing, running scripts, "
|
||||
"and installing packages."
|
||||
"Execute a Bash command or script. Shares filesystem with SDK file tools. "
|
||||
"Useful for scripts, data processing, and package installation. "
|
||||
"Killed after timeout (default 30s, max 120s)."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -60,13 +54,11 @@ class BashExecTool(BaseTool):
|
||||
"properties": {
|
||||
"command": {
|
||||
"type": "string",
|
||||
"description": "Bash command or script to execute.",
|
||||
"description": "Bash command or script.",
|
||||
},
|
||||
"timeout": {
|
||||
"type": "integer",
|
||||
"description": (
|
||||
"Max execution time in seconds (default 30, max 120)."
|
||||
),
|
||||
"description": "Max seconds (default 30, max 120).",
|
||||
"default": 30,
|
||||
},
|
||||
},
|
||||
|
||||
20
autogpt_platform/backend/backend/copilot/tools/conftest.py
Normal file
@@ -0,0 +1,20 @@
|
||||
"""Local conftest for copilot/tools tests.
|
||||
|
||||
Overrides the session-scoped `server` and `graph_cleanup` autouse fixtures from
|
||||
backend/conftest.py so that integration tests in this directory do not trigger
|
||||
the full SpinTestServer startup (which requires Postgres + RabbitMQ).
|
||||
"""
|
||||
|
||||
import pytest_asyncio
|
||||
|
||||
|
||||
@pytest_asyncio.fixture(scope="session", loop_scope="session")
|
||||
async def server(): # type: ignore[override]
|
||||
"""No-op server stub — tools tests don't need the full backend."""
|
||||
return None
|
||||
|
||||
|
||||
@pytest_asyncio.fixture(scope="session", loop_scope="session", autouse=True)
|
||||
async def graph_cleanup(): # type: ignore[override]
|
||||
"""No-op graph cleanup stub."""
|
||||
yield
|
||||
@@ -30,12 +30,7 @@ class ContinueRunBlockTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Continue executing a block after human review approval. "
|
||||
"Use this after a run_block call returned review_required. "
|
||||
"Pass the review_id from the review_required response. "
|
||||
"The block will execute with the original pre-approved input data."
|
||||
)
|
||||
return "Resume block execution after a run_block call returned review_required. Pass the review_id."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
@@ -44,10 +39,7 @@ class ContinueRunBlockTool(BaseTool):
|
||||
"properties": {
|
||||
"review_id": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"The review_id from a previous review_required response. "
|
||||
"This resumes execution with the pre-approved input data."
|
||||
),
|
||||
"description": "review_id from the review_required response.",
|
||||
},
|
||||
},
|
||||
"required": ["review_id"],
|
||||
|
||||
@@ -23,12 +23,8 @@ class CreateAgentTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Create a new agent workflow. Pass `agent_json` with the complete "
|
||||
"agent graph JSON you generated using block schemas from find_block. "
|
||||
"The tool validates, auto-fixes, and saves.\n\n"
|
||||
"IMPORTANT: Before calling this tool, search for relevant existing agents "
|
||||
"using find_library_agent that could be used as building blocks. "
|
||||
"Pass their IDs in the library_agent_ids parameter."
|
||||
"Create a new agent from JSON (nodes + links). Validates, auto-fixes, and saves. "
|
||||
"Before calling, search for existing agents with find_library_agent."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -42,34 +38,21 @@ class CreateAgentTool(BaseTool):
|
||||
"properties": {
|
||||
"agent_json": {
|
||||
"type": "object",
|
||||
"description": (
|
||||
"The agent JSON to validate and save. "
|
||||
"Must contain 'nodes' and 'links' arrays, and optionally "
|
||||
"'name' and 'description'."
|
||||
),
|
||||
"description": "Agent graph with 'nodes' and 'links' arrays.",
|
||||
},
|
||||
"library_agent_ids": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": (
|
||||
"List of library agent IDs to use as building blocks."
|
||||
),
|
||||
"description": "Library agent IDs as building blocks.",
|
||||
},
|
||||
"save": {
|
||||
"type": "boolean",
|
||||
"description": (
|
||||
"Whether to save the agent. Default is true. "
|
||||
"Set to false for preview only."
|
||||
),
|
||||
"description": "Save the agent (default: true). False for preview.",
|
||||
"default": True,
|
||||
},
|
||||
"folder_id": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Optional folder ID to save the agent into. "
|
||||
"If not provided, the agent is saved at root level. "
|
||||
"Use list_folders to find available folders."
|
||||
),
|
||||
"description": "Folder ID to save into (default: root).",
|
||||
},
|
||||
},
|
||||
"required": ["agent_json"],
|
||||
|
||||
@@ -23,9 +23,7 @@ class CustomizeAgentTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Customize a marketplace or template agent. Pass `agent_json` "
|
||||
"with the complete customized agent JSON. The tool validates, "
|
||||
"auto-fixes, and saves."
|
||||
"Customize a marketplace/template agent. Validates, auto-fixes, and saves."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -39,32 +37,21 @@ class CustomizeAgentTool(BaseTool):
|
||||
"properties": {
|
||||
"agent_json": {
|
||||
"type": "object",
|
||||
"description": (
|
||||
"Complete customized agent JSON to validate and save. "
|
||||
"Optionally include 'name' and 'description'."
|
||||
),
|
||||
"description": "Customized agent JSON with nodes and links.",
|
||||
},
|
||||
"library_agent_ids": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": (
|
||||
"List of library agent IDs to use as building blocks."
|
||||
),
|
||||
"description": "Library agent IDs as building blocks.",
|
||||
},
|
||||
"save": {
|
||||
"type": "boolean",
|
||||
"description": (
|
||||
"Whether to save the customized agent. Default is true."
|
||||
),
|
||||
"description": "Save the agent (default: true). False for preview.",
|
||||
"default": True,
|
||||
},
|
||||
"folder_id": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Optional folder ID to save the agent into. "
|
||||
"If not provided, the agent is saved at root level. "
|
||||
"Use list_folders to find available folders."
|
||||
),
|
||||
"description": "Folder ID to save into (default: root).",
|
||||
},
|
||||
},
|
||||
"required": ["agent_json"],
|
||||
|
||||
@@ -23,12 +23,8 @@ class EditAgentTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Edit an existing agent. Pass `agent_json` with the complete "
|
||||
"updated agent JSON you generated. The tool validates, auto-fixes, "
|
||||
"and saves.\n\n"
|
||||
"IMPORTANT: Before calling this tool, if the changes involve adding new "
|
||||
"functionality, search for relevant existing agents using find_library_agent "
|
||||
"that could be used as building blocks."
|
||||
"Edit an existing agent. Validates, auto-fixes, and saves. "
|
||||
"Before calling, search for existing agents with find_library_agent."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -42,33 +38,20 @@ class EditAgentTool(BaseTool):
|
||||
"properties": {
|
||||
"agent_id": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"The ID of the agent to edit. "
|
||||
"Can be a graph ID or library agent ID."
|
||||
),
|
||||
"description": "Graph ID or library agent ID to edit.",
|
||||
},
|
||||
"agent_json": {
|
||||
"type": "object",
|
||||
"description": (
|
||||
"Complete updated agent JSON to validate and save. "
|
||||
"Must contain 'nodes' and 'links'. "
|
||||
"Include 'name' and/or 'description' if they need "
|
||||
"to be updated."
|
||||
),
|
||||
"description": "Updated agent JSON with nodes and links.",
|
||||
},
|
||||
"library_agent_ids": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": (
|
||||
"List of library agent IDs to use as building blocks for the changes."
|
||||
),
|
||||
"description": "Library agent IDs as building blocks.",
|
||||
},
|
||||
"save": {
|
||||
"type": "boolean",
|
||||
"description": (
|
||||
"Whether to save the changes. "
|
||||
"Default is true. Set to false for preview only."
|
||||
),
|
||||
"description": "Save changes (default: true). False for preview.",
|
||||
"default": True,
|
||||
},
|
||||
},
|
||||
|
||||
@@ -134,11 +134,7 @@ class SearchFeatureRequestsTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Search existing feature requests to check if a similar request "
|
||||
"already exists before creating a new one. Returns matching feature "
|
||||
"requests with their ID, title, and description."
|
||||
)
|
||||
return "Search existing feature requests. Check before creating a new one."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
@@ -234,14 +230,9 @@ class CreateFeatureRequestTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Create a new feature request or add a customer need to an existing one. "
|
||||
"Always search first with search_feature_requests to avoid duplicates. "
|
||||
"If a matching request exists, pass its ID as existing_issue_id to add "
|
||||
"the user's need to it instead of creating a duplicate. "
|
||||
"IMPORTANT: Never include personally identifiable information (PII) in "
|
||||
"the title or description — no names, emails, phone numbers, company "
|
||||
"names, or other identifying details. Write titles and descriptions in "
|
||||
"generic, feature-focused language."
|
||||
"Create a feature request or add need to existing one. "
|
||||
"Search first to avoid duplicates. Pass existing_issue_id to add to existing. "
|
||||
"Never include PII (names, emails, phone numbers, company names) in title/description."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -251,28 +242,15 @@ class CreateFeatureRequestTool(BaseTool):
|
||||
"properties": {
|
||||
"title": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Title for the feature request. Must be generic and "
|
||||
"feature-focused — do not include any user names, emails, "
|
||||
"company names, or other PII."
|
||||
),
|
||||
"description": "Feature request title. No names, emails, or company info.",
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Detailed description of what the user wants and why. "
|
||||
"Must not contain any personally identifiable information "
|
||||
"(PII) — describe the feature need generically without "
|
||||
"referencing specific users, companies, or contact details."
|
||||
),
|
||||
"description": "What the user wants and why. No names, emails, or company info.",
|
||||
},
|
||||
"existing_issue_id": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"If adding a need to an existing feature request, "
|
||||
"provide its Linear issue ID (from search results). "
|
||||
"Omit to create a new feature request."
|
||||
),
|
||||
"description": "Linear issue ID to add need to (from search results).",
|
||||
},
|
||||
},
|
||||
"required": ["title", "description"],
|
||||
|
||||
@@ -18,10 +18,7 @@ class FindAgentTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Discover agents from the marketplace based on capabilities and "
|
||||
"user needs, or look up a specific agent by its creator/slug ID."
|
||||
)
|
||||
return "Search marketplace agents by capability, or look up by slug ('username/agent-name')."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
@@ -30,7 +27,7 @@ class FindAgentTool(BaseTool):
|
||||
"properties": {
|
||||
"query": {
|
||||
"type": "string",
|
||||
"description": "Search query describing what the user wants to accomplish, or a creator/slug ID (e.g. 'username/agent-name') for direct lookup. Use single keywords for best results.",
|
||||
"description": "Search keywords, or 'username/agent-name' for direct slug lookup.",
|
||||
},
|
||||
},
|
||||
"required": ["query"],
|
||||
|
||||
@@ -5,6 +5,7 @@ from prisma.enums import ContentType
|
||||
|
||||
from backend.blocks import get_block
|
||||
from backend.blocks._base import BlockType
|
||||
from backend.copilot.context import get_current_permissions
|
||||
from backend.copilot.model import ChatSession
|
||||
from backend.data.db_accessors import search
|
||||
|
||||
@@ -38,7 +39,7 @@ COPILOT_EXCLUDED_BLOCK_TYPES = {
|
||||
|
||||
# Specific block IDs excluded from CoPilot (STANDARD type but still require graph context)
|
||||
COPILOT_EXCLUDED_BLOCK_IDS = {
|
||||
# SmartDecisionMakerBlock - dynamically discovers downstream blocks via graph topology;
|
||||
# OrchestratorBlock - dynamically discovers downstream blocks via graph topology;
|
||||
# usable in agent graphs (guide hardcodes its ID) but cannot run standalone.
|
||||
"3b191d9f-356f-482d-8238-ba04b6d18381",
|
||||
}
|
||||
@@ -54,13 +55,9 @@ class FindBlockTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Search for available blocks by name or description, or look up a "
|
||||
"specific block by its ID. "
|
||||
"Blocks are reusable components that perform specific tasks like "
|
||||
"sending emails, making API calls, processing text, etc. "
|
||||
"IMPORTANT: Use this tool FIRST to get the block's 'id' before calling run_block. "
|
||||
"The response includes each block's id, name, and description. "
|
||||
"Call run_block with the block's id **with no inputs** to see detailed inputs/outputs and execute it."
|
||||
"Search blocks by name or description. Returns block IDs for run_block. "
|
||||
"Always call this FIRST to get block IDs before using run_block. "
|
||||
"Then call run_block with the block's id and empty input_data to see its detailed schema."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -70,19 +67,11 @@ class FindBlockTool(BaseTool):
|
||||
"properties": {
|
||||
"query": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Search query to find blocks by name or description, "
|
||||
"or a block ID (UUID) for direct lookup. "
|
||||
"Use keywords like 'email', 'http', 'text', 'ai', etc."
|
||||
),
|
||||
"description": "Search keywords (e.g. 'email', 'http', 'ai').",
|
||||
},
|
||||
"include_schemas": {
|
||||
"type": "boolean",
|
||||
"description": (
|
||||
"If true, include full input_schema and output_schema "
|
||||
"for each block. Use when generating agent JSON that "
|
||||
"needs block schemas. Default is false."
|
||||
),
|
||||
"description": "Include full input/output schemas (for agent JSON generation).",
|
||||
"default": False,
|
||||
},
|
||||
},
|
||||
@@ -161,6 +150,19 @@ class FindBlockTool(BaseTool):
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
# Check block-level permissions — hide denied blocks entirely
|
||||
perms = get_current_permissions()
|
||||
if perms is not None and not perms.is_block_allowed(
|
||||
block.id, block.name
|
||||
):
|
||||
return NoResultsResponse(
|
||||
message=f"No blocks found for '{query}'",
|
||||
suggestions=[
|
||||
"Search for an alternative block by name",
|
||||
],
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
summary = BlockInfoSummary(
|
||||
id=block.id,
|
||||
name=block.name,
|
||||
@@ -207,6 +209,7 @@ class FindBlockTool(BaseTool):
|
||||
)
|
||||
|
||||
# Enrich results with block information
|
||||
perms = get_current_permissions()
|
||||
blocks: list[BlockInfoSummary] = []
|
||||
for result in results:
|
||||
block_id = result["content_id"]
|
||||
@@ -223,6 +226,12 @@ class FindBlockTool(BaseTool):
|
||||
):
|
||||
continue
|
||||
|
||||
# Skip blocks denied by execution permissions
|
||||
if perms is not None and not perms.is_block_allowed(
|
||||
block.id, block.name
|
||||
):
|
||||
continue
|
||||
|
||||
summary = BlockInfoSummary(
|
||||
id=block_id,
|
||||
name=block.name,
|
||||
|
||||
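The two permission hunks above apply one rule in both lookup paths: a block that the current permissions deny is treated as if it did not exist. A minimal sketch of that rule follows. `ToyBlock`, `ToyPermissions`, and `visible_blocks` are hypothetical names used only for illustration; the deny-list behaviour is an assumption. Only the `is_block_allowed(block_id, block_name)` call shape and the `perms is not None` guard are taken from the diff.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlock:
    # Hypothetical stand-in for a block; only id and name matter here.
    id: str
    name: str


@dataclass
class ToyPermissions:
    # Assumption: a simple deny-list keyed by block ID or block name.
    denied: set[str] = field(default_factory=set)

    def is_block_allowed(self, block_id: str, block_name: str) -> bool:
        return block_id not in self.denied and block_name not in self.denied


def visible_blocks(blocks, perms):
    # perms is None means "no restrictions", mirroring the guard in the diff.
    return [
        b
        for b in blocks
        if perms is None or perms.is_block_allowed(b.id, b.name)
    ]
```

Filtering denied blocks out silently, rather than returning an explicit "forbidden" error, keeps the search response from revealing that a hidden block exists.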
@@ -69,8 +69,8 @@ class TestFindBlockFiltering:
|
||||
assert BlockType.HUMAN_IN_THE_LOOP in COPILOT_EXCLUDED_BLOCK_TYPES
|
||||
assert BlockType.AGENT in COPILOT_EXCLUDED_BLOCK_TYPES
|
||||
|
||||
def test_excluded_block_ids_contains_smart_decision_maker(self):
|
||||
"""Verify SmartDecisionMakerBlock is in COPILOT_EXCLUDED_BLOCK_IDS."""
|
||||
def test_excluded_block_ids_contains_orchestrator(self):
|
||||
"""Verify OrchestratorBlock is in COPILOT_EXCLUDED_BLOCK_IDS."""
|
||||
assert "3b191d9f-356f-482d-8238-ba04b6d18381" in COPILOT_EXCLUDED_BLOCK_IDS
|
||||
|
||||
@pytest.mark.asyncio(loop_scope="session")
|
||||
@@ -120,18 +120,18 @@ class TestFindBlockFiltering:
|
||||
|
||||
@pytest.mark.asyncio(loop_scope="session")
|
||||
async def test_excluded_block_id_filtered_from_results(self):
|
||||
"""Verify SmartDecisionMakerBlock is filtered from search results."""
|
||||
"""Verify OrchestratorBlock is filtered from search results."""
|
||||
session = make_session(user_id=_TEST_USER_ID)
|
||||
|
||||
smart_decision_id = "3b191d9f-356f-482d-8238-ba04b6d18381"
|
||||
orchestrator_id = "3b191d9f-356f-482d-8238-ba04b6d18381"
|
||||
search_results = [
|
||||
{"content_id": smart_decision_id, "score": 0.9},
|
||||
{"content_id": orchestrator_id, "score": 0.9},
|
||||
{"content_id": "normal-block-id", "score": 0.8},
|
||||
]
|
||||
|
||||
# SmartDecisionMakerBlock has STANDARD type but is excluded by ID
|
||||
# OrchestratorBlock has STANDARD type but is excluded by ID
|
||||
smart_block = make_mock_block(
|
||||
smart_decision_id, "Smart Decision Maker", BlockType.STANDARD
|
||||
orchestrator_id, "Orchestrator", BlockType.STANDARD
|
||||
)
|
||||
normal_block = make_mock_block(
|
||||
"normal-block-id", "Normal Block", BlockType.STANDARD
|
||||
@@ -139,7 +139,7 @@ class TestFindBlockFiltering:
|
||||
|
||||
def mock_get_block(block_id):
|
||||
return {
|
||||
smart_decision_id: smart_block,
|
||||
orchestrator_id: smart_block,
|
||||
"normal-block-id": normal_block,
|
||||
}.get(block_id)
|
||||
|
||||
@@ -161,7 +161,7 @@ class TestFindBlockFiltering:
|
||||
user_id=_TEST_USER_ID, session=session, query="decision"
|
||||
)
|
||||
|
||||
# Should only return normal block, not SmartDecisionMakerBlock
|
||||
# Should only return normal block, not OrchestratorBlock
|
||||
assert isinstance(response, BlockListResponse)
|
||||
assert len(response.blocks) == 1
|
||||
assert response.blocks[0].id == "normal-block-id"
|
||||
@@ -601,10 +601,8 @@ class TestFindBlockDirectLookup:
|
||||
async def test_uuid_lookup_excluded_block_id(self):
|
||||
"""UUID matching an excluded block ID returns NoResultsResponse."""
|
||||
session = make_session(user_id=_TEST_USER_ID)
|
||||
smart_decision_id = "3b191d9f-356f-482d-8238-ba04b6d18381"
|
||||
block = make_mock_block(
|
||||
smart_decision_id, "Smart Decision Maker", BlockType.STANDARD
|
||||
)
|
||||
orchestrator_id = "3b191d9f-356f-482d-8238-ba04b6d18381"
|
||||
block = make_mock_block(orchestrator_id, "Orchestrator", BlockType.STANDARD)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.tools.find_block.get_block",
|
||||
@@ -612,7 +610,7 @@ class TestFindBlockDirectLookup:
|
||||
):
|
||||
tool = FindBlockTool()
|
||||
response = await tool._execute(
|
||||
user_id=_TEST_USER_ID, session=session, query=smart_decision_id
|
||||
user_id=_TEST_USER_ID, session=session, query=orchestrator_id
|
||||
)
|
||||
|
||||
from .models import NoResultsResponse
|
||||
|
||||
@@ -19,13 +19,8 @@ class FindLibraryAgentTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Search for or list agents in the user's library. Use this to find "
|
||||
"agents the user has already added to their library, including agents "
|
||||
"they created or added from the marketplace. "
|
||||
"When creating agents with sub-agent composition, use this to get "
|
||||
"the agent's graph_id, graph_version, input_schema, and output_schema "
|
||||
"needed for AgentExecutorBlock nodes. "
|
||||
"Omit the query to list all agents."
|
||||
"Search user's library agents. Returns graph_id, schemas for sub-agent composition. "
|
||||
"Omit query to list all."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -35,10 +30,7 @@ class FindLibraryAgentTool(BaseTool):
|
||||
"properties": {
|
||||
"query": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Search query to find agents by name or description. "
|
||||
"Omit to list all agents in the library."
|
||||
),
|
||||
"description": "Search by name/description. Omit to list all.",
|
||||
},
|
||||
},
|
||||
"required": [],
|
||||
|
||||
@@ -22,20 +22,10 @@ class FixAgentGraphTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Auto-fix common issues in an agent JSON graph. Applies fixes for:\n"
|
||||
"- Missing or invalid UUIDs on nodes and links\n"
|
||||
"- StoreValueBlock prerequisites for ConditionBlock\n"
|
||||
"- Double curly brace escaping in prompt templates\n"
|
||||
"- AddToList/AddToDictionary prerequisite blocks\n"
|
||||
"- CodeExecutionBlock output field naming\n"
|
||||
"- Missing credentials configuration\n"
|
||||
"- Node X coordinate spacing (800+ units apart)\n"
|
||||
"- AI model default parameters\n"
|
||||
"- Link static properties based on input schema\n"
|
||||
"- Type mismatches (inserts conversion blocks)\n\n"
|
||||
"Returns the fixed agent JSON plus a list of fixes applied. "
|
||||
"After fixing, the agent is re-validated. If still invalid, "
|
||||
"the remaining errors are included in the response."
|
||||
"Auto-fix common agent JSON issues: missing/invalid UUIDs, StoreValueBlock prerequisites, "
|
||||
"double curly brace escaping, AddToList/AddToDictionary prerequisites, credentials, "
|
||||
"node spacing, AI model defaults, link static properties, and type mismatches. "
|
||||
"Returns fixed JSON and list of fixes applied."
|
||||
)
|
||||
|
||||
@property
|
||||
|
||||
@@ -42,12 +42,7 @@ class GetAgentBuildingGuideTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Returns the complete guide for building agent JSON graphs, including "
|
||||
"block IDs, link structure, AgentInputBlock, AgentOutputBlock, "
|
||||
"AgentExecutorBlock (for sub-agent composition), and MCPToolBlock usage. "
|
||||
"Call this before generating agent JSON to ensure correct structure."
|
||||
)
|
||||
return "Get the agent JSON building guide (nodes, links, AgentExecutorBlock, MCPToolBlock usage). Call before generating agent JSON."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
|
||||
@@ -25,8 +25,7 @@ class GetDocPageTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Get the full content of a documentation page by its path. "
|
||||
"Use this after search_docs to read the complete content of a relevant page."
|
||||
"Read full documentation page content by path (from search_docs results)."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -36,10 +35,7 @@ class GetDocPageTool(BaseTool):
|
||||
"properties": {
|
||||
"path": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"The path to the documentation file, as returned by search_docs. "
|
||||
"Example: 'platform/block-sdk-guide.md'"
|
||||
),
|
||||
"description": "Doc file path (e.g. 'platform/block-sdk-guide.md').",
|
||||
},
|
||||
},
|
||||
"required": ["path"],
|
||||
|
||||
@@ -38,11 +38,7 @@ class GetMCPGuideTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Returns the MCP tool guide: known hosted server URLs (Notion, Linear, "
|
||||
"Stripe, Intercom, Cloudflare, Atlassian) and authentication workflow. "
|
||||
"Call before using run_mcp_tool if you need a server URL or auth info."
|
||||
)
|
||||
return "Get MCP server URLs and auth guide. Call before run_mcp_tool if you need a server URL or auth info."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
|
||||
@@ -1,15 +1,24 @@
|
||||
"""Shared helpers for chat tools."""
|
||||
|
||||
import logging
|
||||
import uuid
|
||||
from collections import defaultdict
|
||||
from dataclasses import dataclass
|
||||
from typing import Any
|
||||
|
||||
from pydantic_core import PydanticUndefined
|
||||
|
||||
from backend.blocks import BlockType, get_block
|
||||
from backend.blocks._base import AnyBlockSchema
|
||||
from backend.copilot.constants import COPILOT_NODE_PREFIX, COPILOT_SESSION_PREFIX
|
||||
from backend.copilot.constants import (
|
||||
COPILOT_NODE_EXEC_ID_SEPARATOR,
|
||||
COPILOT_NODE_PREFIX,
|
||||
COPILOT_SESSION_PREFIX,
|
||||
)
|
||||
from backend.copilot.model import ChatSession
|
||||
from backend.copilot.sdk.file_ref import FileRefExpansionError, expand_file_refs_in_args
|
||||
from backend.data.credit import UsageTransactionMetadata
|
||||
from backend.data.db_accessors import credit_db, workspace_db
|
||||
from backend.data.db_accessors import credit_db, review_db, workspace_db
|
||||
from backend.data.execution import ExecutionContext
|
||||
from backend.data.model import CredentialsFieldInfo, CredentialsMetaInput
|
||||
from backend.executor.utils import block_usage_cost
|
||||
@@ -17,8 +26,20 @@ from backend.integrations.creds_manager import IntegrationCredentialsManager
|
||||
from backend.util.exceptions import BlockError, InsufficientBalanceError
|
||||
from backend.util.type import coerce_inputs_to_schema
|
||||
|
||||
from .models import BlockOutputResponse, ErrorResponse, ToolResponseBase
|
||||
from .utils import match_credentials_to_requirements
|
||||
from .models import (
|
||||
BlockOutputResponse,
|
||||
ErrorResponse,
|
||||
InputValidationErrorResponse,
|
||||
ReviewRequiredResponse,
|
||||
SetupInfo,
|
||||
SetupRequirementsResponse,
|
||||
ToolResponseBase,
|
||||
UserReadiness,
|
||||
)
|
||||
from .utils import (
|
||||
build_missing_credentials_from_field_info,
|
||||
match_credentials_to_requirements,
|
||||
)
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -231,6 +252,286 @@ async def resolve_block_credentials(
|
||||
return await match_credentials_to_requirements(user_id, requirements)
|
||||
|
||||
|
||||
@dataclass
|
||||
class BlockPreparation:
|
||||
"""Result of successful block validation, ready for execution or task creation.
|
||||
|
||||
Attributes:
|
||||
block: The resolved block instance (schema definition + execute method).
|
||||
block_id: UUID of the block being prepared.
|
||||
input_data: User-supplied input values after file-ref expansion.
|
||||
matched_credentials: Credential field name -> resolved credential metadata.
|
||||
input_schema: JSON Schema for the block's input, with credential
|
||||
discriminators resolved for the user's available providers.
|
||||
credentials_fields: Set of field names in the schema that are credential
|
||||
inputs (e.g. ``{"credentials", "api_key"}``).
|
||||
required_non_credential_keys: Schema-required fields minus credential
|
||||
fields — the fields the user must supply directly.
|
||||
provided_input_keys: Keys the user actually provided in ``input_data``.
|
||||
synthetic_graph_id: Auto-generated graph UUID used for CoPilot
|
||||
single-block executions (no real graph exists in the DB).
|
||||
synthetic_node_id: Auto-generated node UUID paired with
|
||||
``synthetic_graph_id`` to form the execution context for the block.
|
||||
"""
|
||||
|
||||
block: AnyBlockSchema
|
||||
block_id: str
|
||||
input_data: dict[str, Any]
|
||||
matched_credentials: dict[str, CredentialsMetaInput]
|
||||
input_schema: dict[str, Any]
|
||||
credentials_fields: set[str]
|
||||
required_non_credential_keys: set[str]
|
||||
provided_input_keys: set[str]
|
||||
synthetic_graph_id: str
|
||||
synthetic_node_id: str
|
||||
|
||||
|
||||
async def prepare_block_for_execution(
|
||||
block_id: str,
|
||||
input_data: dict[str, Any],
|
||||
user_id: str,
|
||||
session: ChatSession,
|
||||
session_id: str,
|
||||
) -> "BlockPreparation | ToolResponseBase":
|
||||
"""Validate and prepare a block for execution.
|
||||
|
||||
Performs: block lookup, disabled/excluded-type checks, credential resolution,
|
||||
input schema generation, file-ref expansion, missing-credentials check, and
|
||||
unrecognized-field validation.
|
||||
|
||||
Does NOT check for missing required fields (tools differ: run_block shows a
|
||||
schema preview) and does NOT run the HITL review check (use check_hitl_review
|
||||
separately).
|
||||
|
||||
Args:
|
||||
block_id: Block UUID to prepare.
|
||||
input_data: Input values provided by the caller.
|
||||
user_id: Authenticated user ID.
|
||||
session: Current chat session (needed for file-ref expansion).
|
||||
session_id: Chat session ID (used in error responses).
|
||||
|
||||
Returns:
|
||||
BlockPreparation on success, or a ToolResponseBase error/setup response.
|
||||
"""
|
||||
# Lazy import: find_block imports from .base and .models (siblings), not
|
||||
# from helpers — no actual circular dependency exists today. Kept lazy as a
|
||||
# precaution since find_block is the block-registry module and future changes
|
||||
# could introduce a cycle.
|
||||
from .find_block import COPILOT_EXCLUDED_BLOCK_IDS, COPILOT_EXCLUDED_BLOCK_TYPES
|
||||
|
||||
block = get_block(block_id)
|
||||
if not block:
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block_id}' not found", session_id=session_id
|
||||
)
|
||||
if block.disabled:
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block_id}' is disabled", session_id=session_id
|
||||
)
|
||||
|
||||
if (
|
||||
block.block_type in COPILOT_EXCLUDED_BLOCK_TYPES
|
||||
or block.id in COPILOT_EXCLUDED_BLOCK_IDS
|
||||
):
|
||||
if block.block_type == BlockType.MCP_TOOL:
|
||||
hint = (
|
||||
" Use the `run_mcp_tool` tool instead — it handles "
|
||||
"MCP server discovery, authentication, and execution."
|
||||
)
|
||||
elif block.block_type == BlockType.AGENT:
|
||||
hint = " Use the `run_agent` tool instead."
|
||||
else:
|
||||
hint = " This block is designed for use within graphs only."
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block.name}' cannot be run directly.{hint}",
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
matched_credentials, missing_credentials = await resolve_block_credentials(
|
||||
user_id, block, input_data
|
||||
)
|
||||
|
||||
try:
|
||||
input_schema: dict[str, Any] = block.input_schema.jsonschema()
|
||||
except Exception as e:
|
||||
logger.warning("Failed to generate input schema for block %s: %s", block_id, e)
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block.name}' has an invalid input schema",
|
||||
error=str(e),
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
# Expand @@agptfile: refs using the block's input schema so string/list
|
||||
# fields get the correct deserialization.
|
||||
if input_data:
|
||||
try:
|
||||
input_data = await expand_file_refs_in_args(
|
||||
input_data, user_id, session, input_schema=input_schema
|
||||
)
|
||||
except FileRefExpansionError as exc:
|
||||
return ErrorResponse(
|
||||
message=(
|
||||
f"Failed to resolve file reference: {exc}. "
|
||||
"Ensure the file exists before referencing it."
|
||||
),
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
credentials_fields = set(block.input_schema.get_credentials_fields().keys())
|
||||
|
||||
if missing_credentials:
|
||||
credentials_fields_info = _resolve_discriminated_credentials(block, input_data)
|
||||
missing_creds_dict = build_missing_credentials_from_field_info(
|
||||
credentials_fields_info, set(matched_credentials.keys())
|
||||
)
|
||||
missing_creds_list = list(missing_creds_dict.values())
|
||||
return SetupRequirementsResponse(
|
||||
message=(
|
||||
f"Block '{block.name}' requires credentials that are not configured. "
|
||||
"Please set up the required credentials before running this block."
|
||||
),
|
||||
session_id=session_id,
|
||||
setup_info=SetupInfo(
|
||||
agent_id=block_id,
|
||||
agent_name=block.name,
|
||||
user_readiness=UserReadiness(
|
||||
has_all_credentials=False,
|
||||
missing_credentials=missing_creds_dict,
|
||||
ready_to_run=False,
|
||||
),
|
||||
requirements={
|
||||
"credentials": missing_creds_list,
|
||||
"inputs": get_inputs_from_schema(
|
||||
input_schema, exclude_fields=credentials_fields
|
||||
),
|
||||
"execution_modes": ["immediate"],
|
||||
},
|
||||
),
|
||||
graph_id=None,
|
||||
graph_version=None,
|
||||
)
|
||||
required_keys = set(input_schema.get("required", []))
|
||||
required_non_credential_keys = required_keys - credentials_fields
|
||||
provided_input_keys = set(input_data.keys()) - credentials_fields
|
||||
|
||||
valid_fields = set(input_schema.get("properties", {}).keys()) - credentials_fields
|
||||
unrecognized_fields = provided_input_keys - valid_fields
|
||||
if unrecognized_fields:
|
||||
return InputValidationErrorResponse(
|
||||
message=(
|
||||
f"Unknown input field(s) provided: {', '.join(sorted(unrecognized_fields))}. "
|
||||
"Block was not executed. Please use the correct field names from the schema."
|
||||
),
|
||||
session_id=session_id,
|
||||
unrecognized_fields=sorted(unrecognized_fields),
|
||||
inputs=input_schema,
|
||||
)
|
||||
|
||||
synthetic_graph_id = f"{COPILOT_SESSION_PREFIX}{session_id}"
|
||||
synthetic_node_id = f"{COPILOT_NODE_PREFIX}{block_id}"
|
||||
|
||||
return BlockPreparation(
|
||||
block=block,
|
||||
block_id=block_id,
|
||||
input_data=input_data,
|
||||
matched_credentials=matched_credentials,
|
||||
input_schema=input_schema,
|
||||
credentials_fields=credentials_fields,
|
||||
required_non_credential_keys=required_non_credential_keys,
|
||||
provided_input_keys=provided_input_keys,
|
||||
synthetic_graph_id=synthetic_graph_id,
|
||||
synthetic_node_id=synthetic_node_id,
|
||||
)
|
||||
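The field checks at the end of `prepare_block_for_execution` reduce to set arithmetic over the JSON schema. The sketch below isolates that step. `classify_input_keys` is a hypothetical helper name, not part of the diff; the set expressions mirror the ones in the function above.

```python
def classify_input_keys(input_schema, input_data, credentials_fields):
    """Split schema and input keys the way the preparation step does.

    Hypothetical helper for illustration. Credential fields are removed
    from every set because they are resolved separately.
    """
    required = set(input_schema.get("required", []))
    required_non_credential = required - credentials_fields
    provided = set(input_data.keys()) - credentials_fields
    valid = set(input_schema.get("properties", {}).keys()) - credentials_fields
    # Anything the caller sent that the schema does not declare.
    unrecognized = provided - valid
    return required_non_credential, provided, sorted(unrecognized)
```

Calling it with a schema that declares `text` and `credentials`, and input that contains `text` plus an undeclared `unknown_field`, yields `unknown_field` as the only unrecognized key. That is the case the `InputValidationErrorResponse` branch reports.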
|
||||
|
||||
async def check_hitl_review(
|
||||
prep: BlockPreparation,
|
||||
user_id: str,
|
||||
session_id: str,
|
||||
) -> "tuple[str, dict[str, Any]] | ToolResponseBase":
|
||||
"""Check for an existing or new HITL review requirement.
|
||||
|
||||
If a review is needed, stores the review record and returns a
|
||||
ReviewRequiredResponse. Otherwise returns
|
||||
``(synthetic_node_exec_id, input_data)`` ready for execute_block.
|
||||
"""
|
||||
block = prep.block
|
||||
block_id = prep.block_id
|
||||
synthetic_graph_id = prep.synthetic_graph_id
|
||||
synthetic_node_id = prep.synthetic_node_id
|
||||
input_data = prep.input_data
|
||||
|
||||
# Reuse an existing WAITING review for identical input (LLM retry guard)
|
||||
existing_reviews = await review_db().get_pending_reviews_for_execution(
|
||||
synthetic_graph_id, user_id
|
||||
)
|
||||
existing_review = next(
|
||||
(
|
||||
r
|
||||
for r in existing_reviews
|
||||
if r.node_id == synthetic_node_id
|
||||
and r.status.value == "WAITING"
|
||||
and r.payload == input_data
|
||||
),
|
||||
None,
|
||||
)
|
||||
if existing_review:
|
||||
return ReviewRequiredResponse(
|
||||
message=(
|
||||
f"Block '{block.name}' requires human review. "
|
||||
f"After the user approves, call continue_run_block with "
|
||||
f"review_id='{existing_review.node_exec_id}' to execute."
|
||||
),
|
||||
session_id=session_id,
|
||||
block_id=block_id,
|
||||
block_name=block.name,
|
||||
review_id=existing_review.node_exec_id,
|
||||
graph_exec_id=synthetic_graph_id,
|
||||
input_data=input_data,
|
||||
)
|
||||
|
||||
synthetic_node_exec_id = (
|
||||
f"{synthetic_node_id}{COPILOT_NODE_EXEC_ID_SEPARATOR}" f"{uuid.uuid4().hex[:8]}"
|
||||
)
|
||||
|
||||
review_context = ExecutionContext(
|
||||
user_id=user_id,
|
||||
graph_id=synthetic_graph_id,
|
||||
graph_exec_id=synthetic_graph_id,
|
||||
graph_version=1,
|
||||
node_id=synthetic_node_id,
|
||||
node_exec_id=synthetic_node_exec_id,
|
||||
sensitive_action_safe_mode=True,
|
||||
)
|
||||
should_pause, input_data = await block.is_block_exec_need_review(
|
||||
input_data,
|
||||
user_id=user_id,
|
||||
node_id=synthetic_node_id,
|
||||
node_exec_id=synthetic_node_exec_id,
|
||||
graph_exec_id=synthetic_graph_id,
|
||||
graph_id=synthetic_graph_id,
|
||||
graph_version=1,
|
||||
execution_context=review_context,
|
||||
is_graph_execution=False,
|
||||
)
|
||||
if should_pause:
|
||||
return ReviewRequiredResponse(
|
||||
message=(
|
||||
f"Block '{block.name}' requires human review. "
|
||||
f"After the user approves, call continue_run_block with "
|
||||
f"review_id='{synthetic_node_exec_id}' to execute."
|
||||
),
|
||||
session_id=session_id,
|
||||
block_id=block_id,
|
||||
block_name=block.name,
|
||||
review_id=synthetic_node_exec_id,
|
||||
graph_exec_id=synthetic_graph_id,
|
||||
input_data=input_data,
|
||||
)
|
||||
|
||||
return synthetic_node_exec_id, input_data
|
||||
|
||||
|
||||
def _resolve_discriminated_credentials(
|
||||
block: AnyBlockSchema,
|
||||
input_data: dict[str, Any],
|
||||
|
||||
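`check_hitl_review` above does two things before asking the block whether a review is needed: it looks for a pending review with the same node and identical input, and otherwise mints a fresh node execution ID. A rough sketch of both steps is shown below. `ToyReview`, `find_reusable_review`, `new_node_exec_id`, and the `SEPARATOR` value are assumptions for illustration; the real `COPILOT_NODE_EXEC_ID_SEPARATOR` value is not shown in this diff, and the real code compares `r.status.value` on an enum rather than a plain string.

```python
import uuid
from dataclasses import dataclass
from typing import Any

# Assumed value; the actual separator constant is defined elsewhere.
SEPARATOR = ":"


@dataclass
class ToyReview:
    # Hypothetical stand-in for a stored review record.
    node_id: str
    node_exec_id: str
    status: str
    payload: dict[str, Any]


def find_reusable_review(reviews, node_id, input_data):
    # Reuse only a WAITING review for the same node with identical input,
    # so a retried tool call does not create a duplicate review.
    return next(
        (
            r
            for r in reviews
            if r.node_id == node_id
            and r.status == "WAITING"
            and r.payload == input_data
        ),
        None,
    )


def new_node_exec_id(node_id):
    # Node ID plus a short random suffix, as in the diff.
    return f"{node_id}{SEPARATOR}{uuid.uuid4().hex[:8]}"
```

Matching on the payload as well as the node means a call with changed input gets its own review instead of inheriting approval state from an earlier one.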
@@ -1,4 +1,4 @@
|
||||
"""Tests for execute_block — credit charging and type coercion."""
|
||||
"""Tests for execute_block, prepare_block_for_execution, and check_hitl_review."""
|
||||
|
||||
from collections.abc import AsyncIterator
|
||||
from typing import Any
|
||||
@@ -7,8 +7,20 @@ from unittest.mock import AsyncMock, MagicMock, patch
|
||||
import pytest
|
||||
|
||||
from backend.blocks._base import BlockType
|
||||
from backend.copilot.tools.helpers import execute_block
|
||||
from backend.copilot.tools.models import BlockOutputResponse, ErrorResponse
|
||||
from backend.copilot.constants import COPILOT_NODE_PREFIX, COPILOT_SESSION_PREFIX
|
||||
from backend.copilot.tools.helpers import (
|
||||
BlockPreparation,
|
||||
check_hitl_review,
|
||||
execute_block,
|
||||
prepare_block_for_execution,
|
||||
)
|
||||
from backend.copilot.tools.models import (
|
||||
BlockOutputResponse,
|
||||
ErrorResponse,
|
||||
InputValidationErrorResponse,
|
||||
ReviewRequiredResponse,
|
||||
SetupRequirementsResponse,
|
||||
)
|
||||
|
||||
_USER = "test-user-helpers"
|
||||
_SESSION = "test-session-helpers"
|
||||
@@ -510,3 +522,341 @@ async def test_coerce_inner_elements_of_generic():
|
||||
# Inner elements should be coerced from int to str
|
||||
assert block._captured_inputs["values"] == ["1", "2", "3"]
|
||||
assert all(isinstance(v, str) for v in block._captured_inputs["values"])
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# prepare_block_for_execution tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_PREP_USER = "prep-user"
|
||||
_PREP_SESSION = "prep-session"
|
||||
|
||||
|
||||
def _make_prep_session(session_id: str = _PREP_SESSION) -> MagicMock:
|
||||
session = MagicMock()
|
||||
session.session_id = session_id
|
||||
return session
|
||||
|
||||
|
||||
def _make_simple_block(
|
||||
block_id: str = "blk-1",
|
||||
name: str = "Simple Block",
|
||||
disabled: bool = False,
|
||||
required: list[str] | None = None,
|
||||
properties: dict[str, Any] | None = None,
|
||||
) -> MagicMock:
|
||||
block = MagicMock()
|
||||
block.id = block_id
|
||||
block.name = name
|
||||
block.disabled = disabled
|
||||
block.description = ""
|
||||
block.block_type = MagicMock()
|
||||
|
||||
schema = {
|
||||
"type": "object",
|
||||
"properties": properties or {"text": {"type": "string"}},
|
||||
"required": required or [],
|
||||
}
|
||||
block.input_schema.jsonschema.return_value = schema
|
||||
block.input_schema.get_credentials_fields.return_value = {}
|
||||
block.input_schema.get_credentials_fields_info.return_value = {}
|
||||
return block
|
||||
|
||||
|
||||
def _patch_excluded(block_ids: set | None = None, block_types: set | None = None):
|
||||
return (
|
||||
patch(
|
||||
"backend.copilot.tools.find_block.COPILOT_EXCLUDED_BLOCK_IDS",
|
||||
new=block_ids or set(),
|
||||
create=True,
|
||||
),
|
||||
patch(
|
||||
"backend.copilot.tools.find_block.COPILOT_EXCLUDED_BLOCK_TYPES",
|
||||
new=block_types or set(),
|
||||
create=True,
|
||||
),
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_prepare_block_not_found() -> None:
|
||||
excl_ids, excl_types = _patch_excluded()
|
||||
with (
|
||||
patch("backend.copilot.tools.helpers.get_block", return_value=None),
|
||||
excl_ids,
|
||||
excl_types,
|
||||
):
|
||||
result = await prepare_block_for_execution(
|
||||
block_id="missing",
|
||||
input_data={},
|
||||
user_id=_PREP_USER,
|
||||
session=_make_prep_session(),
|
||||
session_id=_PREP_SESSION,
|
||||
)
|
||||
assert isinstance(result, ErrorResponse)
|
||||
assert "not found" in result.message
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_prepare_block_disabled() -> None:
|
||||
block = _make_simple_block(disabled=True)
|
||||
excl_ids, excl_types = _patch_excluded()
|
||||
with (
|
||||
patch("backend.copilot.tools.helpers.get_block", return_value=block),
|
||||
excl_ids,
|
||||
excl_types,
|
||||
):
|
||||
result = await prepare_block_for_execution(
|
||||
block_id="blk-1",
|
||||
input_data={},
|
||||
user_id=_PREP_USER,
|
||||
session=_make_prep_session(),
|
||||
session_id=_PREP_SESSION,
|
||||
)
|
||||
assert isinstance(result, ErrorResponse)
|
||||
assert "disabled" in result.message
|
||||
|
||||
|
||||
@pytest.mark.asyncio
async def test_prepare_block_unrecognized_fields() -> None:
    block = _make_simple_block(properties={"text": {"type": "string"}})
    excl_ids, excl_types = _patch_excluded()
    with (
        patch("backend.copilot.tools.helpers.get_block", return_value=block),
        excl_ids,
        excl_types,
        patch(
            "backend.copilot.tools.helpers.resolve_block_credentials",
            AsyncMock(return_value=({}, [])),
        ),
        patch(
            "backend.copilot.tools.helpers.expand_file_refs_in_args",
            AsyncMock(side_effect=lambda d, *a, **kw: d),
        ),
    ):
        result = await prepare_block_for_execution(
            block_id="blk-1",
            input_data={"text": "hi", "unknown_field": "oops"},
            user_id=_PREP_USER,
            session=_make_prep_session(),
            session_id=_PREP_SESSION,
        )
    assert isinstance(result, InputValidationErrorResponse)
    assert "unknown_field" in result.unrecognized_fields


@pytest.mark.asyncio
async def test_prepare_block_missing_credentials() -> None:
    block = _make_simple_block()
    mock_cred = MagicMock()
    excl_ids, excl_types = _patch_excluded()
    with (
        patch("backend.copilot.tools.helpers.get_block", return_value=block),
        excl_ids,
        excl_types,
        patch(
            "backend.copilot.tools.helpers.resolve_block_credentials",
            AsyncMock(return_value=({}, [mock_cred])),
        ),
        patch(
            "backend.copilot.tools.helpers.build_missing_credentials_from_field_info",
            return_value={"cred_key": mock_cred},
        ),
    ):
        result = await prepare_block_for_execution(
            block_id="blk-1",
            input_data={},
            user_id=_PREP_USER,
            session=_make_prep_session(),
            session_id=_PREP_SESSION,
        )
    assert isinstance(result, SetupRequirementsResponse)


@pytest.mark.asyncio
async def test_prepare_block_success_returns_preparation() -> None:
    block = _make_simple_block(
        required=["text"], properties={"text": {"type": "string"}}
    )
    excl_ids, excl_types = _patch_excluded()
    with (
        patch("backend.copilot.tools.helpers.get_block", return_value=block),
        excl_ids,
        excl_types,
        patch(
            "backend.copilot.tools.helpers.resolve_block_credentials",
            AsyncMock(return_value=({}, [])),
        ),
        patch(
            "backend.copilot.tools.helpers.expand_file_refs_in_args",
            AsyncMock(side_effect=lambda d, *a, **kw: d),
        ),
    ):
        result = await prepare_block_for_execution(
            block_id="blk-1",
            input_data={"text": "hello"},
            user_id=_PREP_USER,
            session=_make_prep_session(),
            session_id=_PREP_SESSION,
        )
    assert isinstance(result, BlockPreparation)
    assert result.required_non_credential_keys == {"text"}
    assert result.provided_input_keys == {"text"}


# ---------------------------------------------------------------------------
# check_hitl_review tests
# ---------------------------------------------------------------------------


def _make_hitl_prep(
    block_id: str = "blk-hitl",
    input_data: dict | None = None,
    session_id: str = "hitl-sess",
    needs_review: bool = False,
) -> BlockPreparation:
    block = MagicMock()
    block.id = block_id
    block.name = "HITL Block"
    data = input_data if input_data is not None else {"action": "delete"}
    block.is_block_exec_need_review = AsyncMock(return_value=(needs_review, data))
    return BlockPreparation(
        block=block,
        block_id=block_id,
        input_data=data,
        matched_credentials={},
        input_schema={},
        credentials_fields=set(),
        required_non_credential_keys=set(),
        provided_input_keys=set(),
        synthetic_graph_id=f"{COPILOT_SESSION_PREFIX}{session_id}",
        synthetic_node_id=f"{COPILOT_NODE_PREFIX}{block_id}",
    )


@pytest.mark.asyncio
async def test_check_hitl_no_review_needed() -> None:
    prep = _make_hitl_prep(input_data={"action": "read"}, needs_review=False)
    mock_rdb = MagicMock()
    mock_rdb.get_pending_reviews_for_execution = AsyncMock(return_value=[])

    with patch("backend.copilot.tools.helpers.review_db", return_value=mock_rdb):
        result = await check_hitl_review(prep, "user1", "hitl-sess")

    assert isinstance(result, tuple)
    node_exec_id, returned_data = result
    assert node_exec_id.startswith(f"{COPILOT_NODE_PREFIX}blk-hitl")
    assert returned_data == {"action": "read"}


@pytest.mark.asyncio
async def test_check_hitl_review_required() -> None:
    prep = _make_hitl_prep(input_data={"action": "delete"}, needs_review=True)
    mock_rdb = MagicMock()
    mock_rdb.get_pending_reviews_for_execution = AsyncMock(return_value=[])

    with patch("backend.copilot.tools.helpers.review_db", return_value=mock_rdb):
        result = await check_hitl_review(prep, "user1", "hitl-sess")

    assert isinstance(result, ReviewRequiredResponse)
    assert result.block_id == "blk-hitl"


@pytest.mark.asyncio
async def test_check_hitl_reuses_existing_waiting_review() -> None:
    prep = _make_hitl_prep(input_data={"action": "delete"}, needs_review=False)

    existing = MagicMock()
    existing.node_id = prep.synthetic_node_id
    existing.status.value = "WAITING"
    existing.payload = {"action": "delete"}
    existing.node_exec_id = "existing-review-42"

    mock_rdb = MagicMock()
    mock_rdb.get_pending_reviews_for_execution = AsyncMock(return_value=[existing])

    with patch("backend.copilot.tools.helpers.review_db", return_value=mock_rdb):
        result = await check_hitl_review(prep, "user1", "hitl-sess")

    assert isinstance(result, ReviewRequiredResponse)
    assert result.review_id == "existing-review-42"


@pytest.mark.asyncio
async def test_prepare_block_excluded_by_type() -> None:
    """prepare_block_for_execution returns ErrorResponse for excluded block types."""
    from backend.blocks import BlockType

    block = _make_simple_block()
    block.block_type = BlockType.AGENT

    excl_ids, excl_types = _patch_excluded(block_types={BlockType.AGENT})
    with (
        patch("backend.copilot.tools.helpers.get_block", return_value=block),
        excl_ids,
        excl_types,
    ):
        result = await prepare_block_for_execution(
            block_id="blk-agent",
            input_data={},
            user_id=_PREP_USER,
            session=_make_prep_session(),
            session_id=_PREP_SESSION,
        )
    assert isinstance(result, ErrorResponse)
    assert "cannot be run directly" in result.message


@pytest.mark.asyncio
async def test_prepare_block_excluded_by_id() -> None:
    """prepare_block_for_execution returns ErrorResponse for excluded block IDs."""
    block = _make_simple_block(block_id="blk-excluded")

    excl_ids, excl_types = _patch_excluded(block_ids={"blk-excluded"})
    with (
        patch("backend.copilot.tools.helpers.get_block", return_value=block),
        excl_ids,
        excl_types,
    ):
        result = await prepare_block_for_execution(
            block_id="blk-excluded",
            input_data={},
            user_id=_PREP_USER,
            session=_make_prep_session(),
            session_id=_PREP_SESSION,
        )
    assert isinstance(result, ErrorResponse)
    assert "cannot be run directly" in result.message


@pytest.mark.asyncio
async def test_prepare_block_file_ref_expansion_error() -> None:
    """prepare_block_for_execution returns ErrorResponse when file-ref expansion fails."""
    from backend.copilot.sdk.file_ref import FileRefExpansionError

    block = _make_simple_block(properties={"text": {"type": "string"}})
    excl_ids, excl_types = _patch_excluded()
    with (
        patch("backend.copilot.tools.helpers.get_block", return_value=block),
        excl_ids,
        excl_types,
        patch(
            "backend.copilot.tools.helpers.resolve_block_credentials",
            AsyncMock(return_value=({}, [])),
        ),
        patch(
            "backend.copilot.tools.helpers.expand_file_refs_in_args",
            AsyncMock(
                side_effect=FileRefExpansionError("@@agptfile:missing.txt not found")
            ),
        ),
    ):
        result = await prepare_block_for_execution(
            block_id="blk-1",
            input_data={"text": "@@agptfile:missing.txt"},
            user_id=_PREP_USER,
            session=_make_prep_session(),
            session_id=_PREP_SESSION,
        )
    assert isinstance(result, ErrorResponse)
    assert "file reference" in result.message.lower()
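The tests above build their `patch` objects in a helper and only activate them when they are entered in a `with` statement alongside the other patches. The following is a minimal, self-contained sketch of that pattern, not code from the repository: the `settings` namespace, its two attributes, and the `patch_excluded` / `is_excluded` helpers are hypothetical stand-ins. Note that a patch is only visible to code that looks the name up on the patched object at call time.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for a module that holds the exclusion sets.
settings = SimpleNamespace(EXCLUDED_IDS=set(), EXCLUDED_TYPES=set())


def patch_excluded(ids=None, types=None):
    """Return two un-started patchers; they take effect only inside `with`."""
    return (
        patch.object(settings, "EXCLUDED_IDS", new=ids or set()),
        patch.object(settings, "EXCLUDED_TYPES", new=types or set()),
    )


def is_excluded(block_id, block_type):
    # Attributes are read at call time, so an active patch is visible here.
    return block_id in settings.EXCLUDED_IDS or block_type in settings.EXCLUDED_TYPES


ids_patch, types_patch = patch_excluded(ids={"blk-excluded"})
with ids_patch, types_patch:
    inside = is_excluded("blk-excluded", "STANDARD")
# Both attributes are restored once the block exits.
outside = is_excluded("blk-excluded", "STANDARD")
```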
@@ -88,10 +88,7 @@ class CreateFolderTool(BaseTool):

     @property
     def description(self) -> str:
-        return (
-            "Create a new folder in the user's library to organize agents. "
-            "Optionally nest it inside an existing folder using parent_id."
-        )
+        return "Create a library folder. Use parent_id to nest inside another folder."

     @property
     def requires_auth(self) -> bool:
@@ -104,22 +101,19 @@ class CreateFolderTool(BaseTool):
             "properties": {
                 "name": {
                     "type": "string",
-                    "description": "Name for the new folder (max 100 chars).",
+                    "description": "Folder name (max 100 chars).",
                 },
                 "parent_id": {
                     "type": "string",
-                    "description": (
-                        "ID of the parent folder to nest inside. "
-                        "Omit to create at root level."
-                    ),
+                    "description": "Parent folder ID (omit for root).",
                 },
                 "icon": {
                     "type": "string",
-                    "description": "Optional icon identifier for the folder.",
+                    "description": "Icon identifier.",
                 },
                 "color": {
                     "type": "string",
-                    "description": "Optional hex color code (#RRGGBB).",
+                    "description": "Hex color (#RRGGBB).",
                 },
             },
             "required": ["name"],
@@ -175,13 +169,9 @@ class ListFoldersTool(BaseTool):
     @property
     def description(self) -> str:
         return (
-            "List the user's library folders. "
-            "Omit parent_id to get the full folder tree. "
-            "Provide parent_id to list only direct children of that folder. "
-            "Set include_agents=true to also return the agents inside each folder "
-            "and root-level agents not in any folder. Always set include_agents=true "
-            "when the user asks about agents, wants to see what's in their folders, "
-            "or mentions agents alongside folders."
+            "List library folders. Omit parent_id for full tree. "
+            "Set include_agents=true when user asks about agents, wants to see "
+            "what's in their folders, or mentions agents alongside folders."
         )

     @property
@@ -195,17 +185,11 @@ class ListFoldersTool(BaseTool):
             "properties": {
                 "parent_id": {
                     "type": "string",
-                    "description": (
-                        "List children of this folder. "
-                        "Omit to get the full folder tree."
-                    ),
+                    "description": "List children of this folder (omit for full tree).",
                 },
                 "include_agents": {
                     "type": "boolean",
-                    "description": (
-                        "Whether to include the list of agents inside each folder. "
-                        "Defaults to false."
-                    ),
+                    "description": "Include agents in each folder (default: false).",
                 },
             },
             "required": [],
@@ -357,10 +341,7 @@ class MoveFolderTool(BaseTool):

     @property
     def description(self) -> str:
-        return (
-            "Move a folder to a different parent folder. "
-            "Set target_parent_id to null to move to root level."
-        )
+        return "Move a folder. Set target_parent_id to null for root."

     @property
     def requires_auth(self) -> bool:
@@ -373,14 +354,11 @@ class MoveFolderTool(BaseTool):
             "properties": {
                 "folder_id": {
                     "type": "string",
-                    "description": "ID of the folder to move.",
+                    "description": "Folder ID.",
                 },
                 "target_parent_id": {
                     "type": ["string", "null"],
-                    "description": (
-                        "ID of the new parent folder. "
-                        "Use null to move to root level."
-                    ),
+                    "description": "New parent folder ID (null for root).",
                 },
             },
             "required": ["folder_id"],
@@ -433,10 +411,7 @@ class DeleteFolderTool(BaseTool):

     @property
     def description(self) -> str:
-        return (
-            "Delete a folder from the user's library. "
-            "Agents inside the folder are moved to root level (not deleted)."
-        )
+        return "Delete a folder. Agents inside move to root (not deleted)."

     @property
     def requires_auth(self) -> bool:
@@ -499,10 +474,7 @@ class MoveAgentsToFolderTool(BaseTool):

     @property
     def description(self) -> str:
-        return (
-            "Move one or more agents to a folder. "
-            "Set folder_id to null to move agents to root level."
-        )
+        return "Move agents to a folder. Set folder_id to null for root."

     @property
     def requires_auth(self) -> bool:
@@ -516,13 +488,11 @@ class MoveAgentsToFolderTool(BaseTool):
                 "agent_ids": {
                     "type": "array",
                     "items": {"type": "string"},
-                    "description": "List of library agent IDs to move.",
+                    "description": "Library agent IDs to move.",
                 },
                 "folder_id": {
                     "type": ["string", "null"],
-                    "description": (
-                        "Target folder ID. Use null to move to root level."
-                    ),
+                    "description": "Target folder ID (null for root).",
                 },
             },
             "required": ["agent_ids"],
|
||||
@@ -104,19 +104,11 @@ class RunAgentTool(BaseTool):

     @property
     def description(self) -> str:
-        return """Run or schedule an agent from the marketplace or user's library.
-
-        The tool automatically handles the setup flow:
-        - Returns missing inputs if required fields are not provided
-        - Returns missing credentials if user needs to configure them
-        - Executes immediately if all requirements are met
-        - Schedules execution if cron expression is provided
-
-        Identify the agent using either:
-        - username_agent_slug: Marketplace format 'username/agent-name'
-        - library_agent_id: ID of an agent in the user's library
-
-        For scheduled execution, provide: schedule_name, cron, and optionally timezone."""
+        return (
+            "Run or schedule an agent. Automatically checks inputs and credentials. "
+            "Identify by username_agent_slug ('user/agent') or library_agent_id. "
+            "For scheduling, provide schedule_name + cron."
+        )

     @property
     def parameters(self) -> dict[str, Any]:
@@ -125,40 +117,38 @@ class RunAgentTool(BaseTool):
|
||||
"properties": {
|
||||
"username_agent_slug": {
|
||||
"type": "string",
|
||||
"description": "Agent identifier in format 'username/agent-name'",
|
||||
"description": "Marketplace format 'username/agent-name'.",
|
||||
},
|
||||
"library_agent_id": {
|
||||
"type": "string",
|
||||
"description": "Library agent ID from user's library",
|
||||
"description": "Library agent ID.",
|
||||
},
|
||||
"inputs": {
|
||||
"type": "object",
|
||||
"description": "Input values for the agent",
|
||||
"description": "Input values for the agent.",
|
||||
"additionalProperties": True,
|
||||
},
|
||||
"use_defaults": {
|
||||
"type": "boolean",
|
||||
"description": "Set to true to run with default values (user must confirm)",
|
||||
"description": "Run with default values (confirm with user first).",
|
||||
},
|
||||
"schedule_name": {
|
||||
"type": "string",
|
||||
"description": "Name for scheduled execution (triggers scheduling mode)",
|
||||
"description": "Name for scheduled execution. Providing this triggers scheduling mode (also requires cron).",
|
||||
},
|
||||
"cron": {
|
||||
"type": "string",
|
||||
"description": "Cron expression (5 fields: min hour day month weekday)",
|
||||
"description": "Cron expression (min hour day month weekday).",
|
||||
},
|
||||
"timezone": {
|
||||
"type": "string",
|
||||
"description": "IANA timezone for schedule (default: UTC)",
|
||||
"description": "IANA timezone (default: UTC).",
|
||||
},
|
||||
"wait_for_result": {
|
||||
"type": "integer",
|
||||
"description": (
|
||||
"Max seconds to wait for execution to complete (0-300). "
|
||||
"If >0, blocks until the execution finishes or times out. "
|
||||
"Returns execution outputs when complete."
|
||||
),
|
||||
"description": "Max seconds to wait for completion (0-300).",
|
||||
"minimum": 0,
|
||||
"maximum": 300,
|
||||
},
|
||||
},
|
||||
"required": [],
|
||||
|
||||
@@ -1,36 +1,19 @@
|
||||
"""Tool for executing blocks directly."""
|
||||
|
||||
import logging
|
||||
import uuid
|
||||
from typing import Any
|
||||
|
||||
from backend.blocks import BlockType, get_block
|
||||
from backend.blocks._base import AnyBlockSchema
|
||||
from backend.copilot.constants import (
|
||||
COPILOT_NODE_EXEC_ID_SEPARATOR,
|
||||
COPILOT_NODE_PREFIX,
|
||||
COPILOT_SESSION_PREFIX,
|
||||
)
|
||||
from backend.copilot.context import get_current_permissions
|
||||
from backend.copilot.model import ChatSession
|
||||
from backend.copilot.sdk.file_ref import FileRefExpansionError, expand_file_refs_in_args
|
||||
from backend.data.db_accessors import review_db
|
||||
from backend.data.execution import ExecutionContext
|
||||
|
||||
from .base import BaseTool
|
||||
from .find_block import COPILOT_EXCLUDED_BLOCK_IDS, COPILOT_EXCLUDED_BLOCK_TYPES
|
||||
from .helpers import execute_block, get_inputs_from_schema, resolve_block_credentials
|
||||
from .models import (
|
||||
BlockDetails,
|
||||
BlockDetailsResponse,
|
||||
ErrorResponse,
|
||||
InputValidationErrorResponse,
|
||||
ReviewRequiredResponse,
|
||||
SetupInfo,
|
||||
SetupRequirementsResponse,
|
||||
ToolResponseBase,
|
||||
UserReadiness,
|
||||
from .helpers import (
|
||||
BlockPreparation,
|
||||
check_hitl_review,
|
||||
execute_block,
|
||||
prepare_block_for_execution,
|
||||
)
|
||||
from .utils import build_missing_credentials_from_field_info
|
||||
from .models import BlockDetails, BlockDetailsResponse, ErrorResponse, ToolResponseBase
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -45,13 +28,10 @@ class RunBlockTool(BaseTool):
     @property
     def description(self) -> str:
         return (
-            "Execute a specific block with the provided input data. "
-            "IMPORTANT: You MUST call find_block first to get the block's 'id' - "
-            "do NOT guess or make up block IDs. "
-            "On first attempt (without input_data), returns detailed schema showing "
-            "required inputs and outputs. Then call again with proper input_data to execute. "
-            "If a block requires human review, use continue_run_block with the "
-            "review_id after the user approves."
+            "Execute a block. IMPORTANT: Always get block_id from find_block first "
+            "— do NOT guess or fabricate IDs. "
+            "Call with empty input_data to see schema, then with data to execute. "
+            "If review_required, use continue_run_block."
         )

     @property
@@ -61,28 +41,14 @@ class RunBlockTool(BaseTool):
             "properties": {
                 "block_id": {
                     "type": "string",
-                    "description": (
-                        "The block's 'id' field from find_block results. "
-                        "NEVER guess this - always get it from find_block first."
-                    ),
-                },
-                "block_name": {
-                    "type": "string",
-                    "description": (
-                        "The block's human-readable name from find_block results. "
-                        "Used for display purposes in the UI."
-                    ),
+                    "description": "Block ID from find_block results.",
                 },
                 "input_data": {
                     "type": "object",
-                    "description": (
-                        "Input values for the block. "
-                        "First call with empty {} to see the block's schema, "
-                        "then call again with proper values to execute."
-                    ),
+                    "description": "Input values. Use {} first to see schema.",
                 },
             },
-            "required": ["block_id", "block_name", "input_data"],
+            "required": ["block_id", "input_data"],
         }

     @property
@@ -130,267 +96,85 @@ class RunBlockTool(BaseTool):
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
# Get the block
|
||||
block = get_block(block_id)
|
||||
if not block:
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block_id}' not found",
|
||||
session_id=session_id,
|
||||
)
|
||||
if block.disabled:
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block_id}' is disabled",
|
||||
session_id=session_id,
|
||||
)
|
||||
logger.info("Preparing block %s for user %s", block_id, user_id)
|
||||
|
||||
# Check if block is excluded from CoPilot (graph-only blocks)
|
||||
if (
|
||||
block.block_type in COPILOT_EXCLUDED_BLOCK_TYPES
|
||||
or block.id in COPILOT_EXCLUDED_BLOCK_IDS
|
||||
):
|
||||
# Provide actionable guidance for blocks with dedicated tools
|
||||
if block.block_type == BlockType.MCP_TOOL:
|
||||
hint = (
|
||||
" Use the `run_mcp_tool` tool instead — it handles "
|
||||
"MCP server discovery, authentication, and execution."
|
||||
prep_or_err = await prepare_block_for_execution(
|
||||
block_id=block_id,
|
||||
input_data=input_data,
|
||||
user_id=user_id,
|
||||
session=session,
|
||||
session_id=session_id,
|
||||
)
|
||||
if isinstance(prep_or_err, ToolResponseBase):
|
||||
return prep_or_err
|
||||
prep: BlockPreparation = prep_or_err
|
||||
|
||||
# Check block-level permissions before execution.
|
||||
perms = get_current_permissions()
|
||||
if perms is not None and not perms.is_block_allowed(block_id, prep.block.name):
|
||||
available_hint = (
|
||||
f"Allowed identifiers: {perms.blocks!r}. "
|
||||
if not perms.blocks_exclude and perms.blocks
|
||||
else (
|
||||
f"Blocked identifiers: {perms.blocks!r}. "
|
||||
if perms.blocks_exclude and perms.blocks
|
||||
else ""
|
||||
)
|
||||
elif block.block_type == BlockType.AGENT:
|
||||
hint = " Use the `run_agent` tool instead."
|
||||
else:
|
||||
hint = " This block is designed for use within graphs only."
|
||||
)
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block.name}' cannot be run directly.{hint}",
|
||||
message=(
|
||||
f"Block '{prep.block.name}' ({block_id}) is not permitted "
|
||||
f"by the current execution permissions. {available_hint}"
|
||||
"Use find_block to discover blocks that are allowed."
|
||||
),
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
logger.info(f"Executing block {block.name} ({block_id}) for user {user_id}")
|
||||
|
||||
(
|
||||
matched_credentials,
|
||||
missing_credentials,
|
||||
) = await resolve_block_credentials(user_id, block, input_data)
|
||||
|
||||
# Get block schemas for details/validation
|
||||
try:
|
||||
input_schema: dict[str, Any] = block.input_schema.jsonschema()
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
"Failed to generate input schema for block %s: %s",
|
||||
block_id,
|
||||
e,
|
||||
)
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block.name}' has an invalid input schema",
|
||||
error=str(e),
|
||||
session_id=session_id,
|
||||
)
|
||||
try:
|
||||
output_schema: dict[str, Any] = block.output_schema.jsonschema()
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
"Failed to generate output schema for block %s: %s",
|
||||
block_id,
|
||||
e,
|
||||
)
|
||||
return ErrorResponse(
|
||||
message=f"Block '{block.name}' has an invalid output schema",
|
||||
error=str(e),
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
# Expand @@agptfile: refs in input_data with the block's input
|
||||
# schema. The generic _truncating wrapper skips opaque object
|
||||
# properties (input_data has no declared inner properties in the
|
||||
# tool schema), so file ref tokens are still intact here.
|
||||
# Using the block's schema lets us return raw text for string-typed
|
||||
# fields and parsed structures for list/dict-typed fields.
|
||||
if input_data:
|
||||
# Show block details when required inputs are not yet provided.
|
||||
# This is run_block's two-step UX: first call returns the schema,
|
||||
# second call (with inputs) actually executes.
|
||||
if not (prep.required_non_credential_keys <= prep.provided_input_keys):
|
||||
try:
|
||||
input_data = await expand_file_refs_in_args(
|
||||
input_data,
|
||||
user_id,
|
||||
session,
|
||||
input_schema=input_schema,
|
||||
output_schema: dict[str, Any] = prep.block.output_schema.jsonschema()
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
"Failed to generate output schema for block %s: %s", block_id, e
|
||||
)
|
||||
except FileRefExpansionError as exc:
|
||||
return ErrorResponse(
|
||||
message=(
|
||||
f"Failed to resolve file reference: {exc}. "
|
||||
"Ensure the file exists before referencing it."
|
||||
),
|
||||
message=f"Block '{prep.block.name}' has an invalid output schema",
|
||||
error=str(e),
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
if missing_credentials:
|
||||
# Return setup requirements response with missing credentials
|
||||
credentials_fields_info = block.input_schema.get_credentials_fields_info()
|
||||
missing_creds_dict = build_missing_credentials_from_field_info(
|
||||
credentials_fields_info, set(matched_credentials.keys())
|
||||
)
|
||||
missing_creds_list = list(missing_creds_dict.values())
|
||||
|
||||
return SetupRequirementsResponse(
|
||||
message=(
|
||||
f"Block '{block.name}' requires credentials that are not configured. "
|
||||
"Please set up the required credentials before running this block."
|
||||
),
|
||||
session_id=session_id,
|
||||
setup_info=SetupInfo(
|
||||
agent_id=block_id,
|
||||
agent_name=block.name,
|
||||
user_readiness=UserReadiness(
|
||||
has_all_credentials=False,
|
||||
missing_credentials=missing_creds_dict,
|
||||
ready_to_run=False,
|
||||
),
|
||||
requirements={
|
||||
"credentials": missing_creds_list,
|
||||
"inputs": self._get_inputs_list(block),
|
||||
"execution_modes": ["immediate"],
|
||||
},
|
||||
),
|
||||
graph_id=None,
|
||||
graph_version=None,
|
||||
)
|
||||
|
||||
# Check if this is a first attempt (required inputs missing)
|
||||
# Return block details so user can see what inputs are needed
|
||||
credentials_fields = set(block.input_schema.get_credentials_fields().keys())
|
||||
required_keys = set(input_schema.get("required", []))
|
||||
required_non_credential_keys = required_keys - credentials_fields
|
||||
provided_input_keys = set(input_data.keys()) - credentials_fields
|
||||
|
||||
# Check for unknown input fields
|
||||
valid_fields = (
|
||||
set(input_schema.get("properties", {}).keys()) - credentials_fields
|
||||
)
|
||||
unrecognized_fields = provided_input_keys - valid_fields
|
||||
if unrecognized_fields:
|
||||
return InputValidationErrorResponse(
|
||||
message=(
|
||||
f"Unknown input field(s) provided: {', '.join(sorted(unrecognized_fields))}. "
|
||||
f"Block was not executed. Please use the correct field names from the schema."
|
||||
),
|
||||
session_id=session_id,
|
||||
unrecognized_fields=sorted(unrecognized_fields),
|
||||
inputs=input_schema,
|
||||
)
|
||||
|
||||
# Show details when not all required non-credential inputs are provided
|
||||
if not (required_non_credential_keys <= provided_input_keys):
|
||||
# Get credentials info for the response
|
||||
credentials_meta = []
|
||||
for field_name, cred_meta in matched_credentials.items():
|
||||
credentials_meta.append(cred_meta)
|
||||
|
||||
credentials_meta = list(prep.matched_credentials.values())
|
||||
return BlockDetailsResponse(
|
||||
message=(
|
||||
f"Block '{block.name}' details. "
|
||||
f"Block '{prep.block.name}' details. "
|
||||
"Provide input_data matching the inputs schema to execute the block."
|
||||
),
|
||||
session_id=session_id,
|
||||
block=BlockDetails(
|
||||
id=block_id,
|
||||
name=block.name,
|
||||
description=block.description or "",
|
||||
inputs=input_schema,
|
||||
name=prep.block.name,
|
||||
description=prep.block.description or "",
|
||||
inputs=prep.input_schema,
|
||||
outputs=output_schema,
|
||||
credentials=credentials_meta,
|
||||
),
|
||||
user_authenticated=True,
|
||||
)
|
||||
|
||||
# Generate synthetic IDs for CoPilot context.
|
||||
# Encode node_id in node_exec_id so it can be extracted later
|
||||
# (e.g. for auto-approve, where we need node_id but have no NodeExecution row).
|
||||
synthetic_graph_id = f"{COPILOT_SESSION_PREFIX}{session.session_id}"
|
||||
synthetic_node_id = f"{COPILOT_NODE_PREFIX}{block_id}"
|
||||
|
||||
# Check for an existing WAITING review for this block with the same input.
|
||||
# If the LLM retries run_block with identical input, we reuse the existing
|
||||
# review instead of creating duplicates. Different inputs = new execution.
|
||||
existing_reviews = await review_db().get_pending_reviews_for_execution(
|
||||
synthetic_graph_id, user_id
|
||||
)
|
||||
existing_review = next(
|
||||
(
|
||||
r
|
||||
for r in existing_reviews
|
||||
if r.node_id == synthetic_node_id
|
||||
and r.status.value == "WAITING"
|
||||
and r.payload == input_data
|
||||
),
|
||||
None,
|
||||
)
|
||||
if existing_review:
|
||||
return ReviewRequiredResponse(
|
||||
message=(
|
||||
f"Block '{block.name}' requires human review. "
|
||||
f"After the user approves, call continue_run_block with "
|
||||
f"review_id='{existing_review.node_exec_id}' to execute."
|
||||
),
|
||||
session_id=session_id,
|
||||
block_id=block_id,
|
||||
block_name=block.name,
|
||||
review_id=existing_review.node_exec_id,
|
||||
graph_exec_id=synthetic_graph_id,
|
||||
input_data=input_data,
|
||||
)
|
||||
|
||||
synthetic_node_exec_id = (
|
||||
f"{synthetic_node_id}{COPILOT_NODE_EXEC_ID_SEPARATOR}"
|
||||
f"{uuid.uuid4().hex[:8]}"
|
||||
)
|
||||
|
||||
# Check for HITL review before execution.
|
||||
# This creates the review record in the DB for CoPilot flows.
|
||||
review_context = ExecutionContext(
|
||||
user_id=user_id,
|
||||
graph_id=synthetic_graph_id,
|
||||
graph_exec_id=synthetic_graph_id,
|
||||
graph_version=1,
|
||||
node_id=synthetic_node_id,
|
||||
node_exec_id=synthetic_node_exec_id,
|
||||
sensitive_action_safe_mode=True,
|
||||
)
|
||||
should_pause, input_data = await block.is_block_exec_need_review(
|
||||
input_data,
|
||||
user_id=user_id,
|
||||
node_id=synthetic_node_id,
|
||||
node_exec_id=synthetic_node_exec_id,
|
||||
graph_exec_id=synthetic_graph_id,
|
||||
graph_id=synthetic_graph_id,
|
||||
graph_version=1,
|
||||
execution_context=review_context,
|
||||
is_graph_execution=False,
|
||||
)
|
||||
if should_pause:
|
||||
return ReviewRequiredResponse(
|
||||
message=(
|
||||
f"Block '{block.name}' requires human review. "
|
||||
f"After the user approves, call continue_run_block with "
|
||||
f"review_id='{synthetic_node_exec_id}' to execute."
|
||||
),
|
||||
session_id=session_id,
|
||||
block_id=block_id,
|
||||
block_name=block.name,
|
||||
review_id=synthetic_node_exec_id,
|
||||
graph_exec_id=synthetic_graph_id,
|
||||
input_data=input_data,
|
||||
)
|
||||
hitl_or_err = await check_hitl_review(prep, user_id, session_id)
|
||||
if isinstance(hitl_or_err, ToolResponseBase):
|
||||
return hitl_or_err
|
||||
synthetic_node_exec_id, input_data = hitl_or_err
|
||||
|
||||
return await execute_block(
|
||||
block=block,
|
||||
block=prep.block,
|
||||
block_id=block_id,
|
||||
input_data=input_data,
|
||||
user_id=user_id,
|
||||
session_id=session_id,
|
||||
node_exec_id=synthetic_node_exec_id,
|
||||
matched_credentials=matched_credentials,
|
||||
matched_credentials=prep.matched_credentials,
|
||||
)
|
||||
|
||||
def _get_inputs_list(self, block: AnyBlockSchema) -> list[dict[str, Any]]:
|
||||
"""Extract non-credential inputs from block schema."""
|
||||
schema = block.input_schema.jsonschema()
|
||||
credentials_fields = set(block.input_schema.get_credentials_fields().keys())
|
||||
return get_inputs_from_schema(schema, exclude_fields=credentials_fields)
|
||||
|
||||
@@ -5,6 +5,8 @@ from unittest.mock import AsyncMock, MagicMock, patch
|
||||
import pytest
|
||||
|
||||
from backend.blocks._base import BlockType
|
||||
from backend.copilot.context import _current_permissions
|
||||
from backend.copilot.permissions import CopilotPermissions
|
||||
|
||||
from ._test_data import make_session
|
||||
from .models import (
|
||||
@@ -92,7 +94,7 @@ class TestRunBlockFiltering:
|
||||
input_block = make_mock_block("input-block-id", "Input Block", BlockType.INPUT)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=input_block,
|
||||
):
|
||||
tool = RunBlockTool()
|
||||
@@ -109,29 +111,92 @@ class TestRunBlockFiltering:
|
||||
|
||||
@pytest.mark.asyncio(loop_scope="session")
|
||||
async def test_excluded_block_id_returns_error(self):
|
||||
"""Attempting to execute SmartDecisionMakerBlock returns error."""
|
||||
"""Attempting to execute OrchestratorBlock returns error."""
|
||||
session = make_session(user_id=_TEST_USER_ID)
|
||||
|
||||
smart_decision_id = "3b191d9f-356f-482d-8238-ba04b6d18381"
|
||||
orchestrator_id = "3b191d9f-356f-482d-8238-ba04b6d18381"
|
||||
smart_block = make_mock_block(
|
||||
smart_decision_id, "Smart Decision Maker", BlockType.STANDARD
|
||||
orchestrator_id, "Orchestrator", BlockType.STANDARD
|
||||
)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=smart_block,
|
||||
):
|
||||
tool = RunBlockTool()
|
||||
response = await tool._execute(
|
||||
user_id=_TEST_USER_ID,
|
||||
session=session,
|
||||
block_id=smart_decision_id,
|
||||
block_id=orchestrator_id,
|
||||
input_data={},
|
||||
)
|
||||
|
||||
assert isinstance(response, ErrorResponse)
|
||||
assert "cannot be run directly" in response.message
|
||||
|
||||
@pytest.mark.asyncio(loop_scope="session")
|
||||
async def test_block_denied_by_permissions_returns_error(self):
|
||||
"""A block denied by CopilotPermissions returns an ErrorResponse."""
|
||||
session = make_session(user_id=_TEST_USER_ID)
|
||||
block_id = "c069dc6b-c3ed-4c12-b6e5-d47361e64ce6"
|
||||
standard_block = make_mock_block(block_id, "HTTP Request", BlockType.STANDARD)
|
||||
|
||||
perms = CopilotPermissions(blocks=[block_id], blocks_exclude=True)
|
||||
token = _current_permissions.set(perms)
|
||||
try:
|
||||
with patch(
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=standard_block,
|
||||
):
|
||||
tool = RunBlockTool()
|
||||
response = await tool._execute(
|
||||
user_id=_TEST_USER_ID,
|
||||
session=session,
|
||||
block_id=block_id,
|
||||
input_data={},
|
||||
)
|
||||
finally:
|
||||
_current_permissions.reset(token)
|
||||
|
||||
assert isinstance(response, ErrorResponse)
|
||||
assert "not permitted" in response.message
|
||||
|
||||
@pytest.mark.asyncio(loop_scope="session")
|
||||
async def test_allowed_by_permissions_passes_guard(self):
|
||||
"""A block explicitly allowed by a whitelist CopilotPermissions passes the guard."""
|
||||
session = make_session(user_id=_TEST_USER_ID)
|
||||
block_id = "c069dc6b-c3ed-4c12-b6e5-d47361e64ce6"
|
||||
standard_block = make_mock_block(block_id, "HTTP Request", BlockType.STANDARD)
|
||||
|
||||
perms = CopilotPermissions(blocks=[block_id], blocks_exclude=False)
|
||||
token = _current_permissions.set(perms)
|
||||
try:
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=standard_block,
|
||||
),
|
||||
patch(
|
||||
"backend.copilot.tools.helpers.match_credentials_to_requirements",
|
||||
return_value=({}, []),
|
||||
),
|
||||
):
|
||||
tool = RunBlockTool()
|
||||
response = await tool._execute(
|
||||
user_id=_TEST_USER_ID,
|
||||
session=session,
|
||||
block_id=block_id,
|
||||
input_data={},
|
||||
)
|
||||
finally:
|
||||
_current_permissions.reset(token)
|
||||
|
||||
# Must NOT be blocked by permissions — assert it's not a permission error
|
||||
assert (
|
||||
not isinstance(response, ErrorResponse)
|
||||
or "not permitted" not in response.message
|
||||
)
|
||||
|
||||
@pytest.mark.asyncio(loop_scope="session")
|
||||
async def test_non_excluded_block_passes_guard(self):
|
||||
"""Non-excluded blocks pass the filtering guard (may fail later for other reasons)."""
|
||||
@@ -143,7 +208,7 @@ class TestRunBlockFiltering:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=standard_block,
|
||||
),
|
||||
patch(
|
||||
@@ -200,7 +265,7 @@ class TestRunBlockInputValidation:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock_block,
|
||||
),
|
||||
patch(
|
||||
@@ -243,7 +308,7 @@ class TestRunBlockInputValidation:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock_block,
|
||||
),
|
||||
patch(
|
||||
@@ -289,7 +354,7 @@ class TestRunBlockInputValidation:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock_block,
|
||||
),
|
||||
patch(
|
||||
@@ -337,7 +402,7 @@ class TestRunBlockInputValidation:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock_block,
|
||||
),
|
||||
patch(
|
||||
@@ -381,7 +446,7 @@ class TestRunBlockInputValidation:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock_block,
|
||||
),
|
||||
patch(
|
||||
@@ -435,7 +500,7 @@ class TestRunBlockSensitiveAction:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock_block,
|
||||
),
|
||||
patch(
|
||||
@@ -491,7 +556,7 @@ class TestRunBlockSensitiveAction:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock_block,
|
||||
),
|
||||
patch(
|
||||
@@ -545,7 +610,7 @@ class TestRunBlockSensitiveAction:
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock_block,
|
||||
),
|
||||
patch(
|
||||
|
||||
@@ -57,10 +57,9 @@ class RunMCPToolTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Connect to an MCP (Model Context Protocol) server to discover and execute its tools. "
|
||||
"Two-step: (1) call with server_url to list available tools, "
|
||||
"(2) call again with server_url + tool_name + tool_arguments to execute. "
|
||||
"Call get_mcp_guide for known server URLs and auth details."
|
||||
"Discover and execute MCP server tools. "
|
||||
"Call with server_url only to list tools, then with tool_name + tool_arguments to execute. "
|
||||
"Call get_mcp_guide first for server URLs and auth."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -70,24 +69,15 @@ class RunMCPToolTool(BaseTool):
|
||||
"properties": {
|
||||
"server_url": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"URL of the MCP server (Streamable HTTP endpoint), "
|
||||
"e.g. https://mcp.example.com/mcp"
|
||||
),
|
||||
"description": "MCP server URL (Streamable HTTP endpoint).",
|
||||
},
|
||||
"tool_name": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Name of the MCP tool to execute. "
|
||||
"Omit on first call to discover available tools."
|
||||
),
|
||||
"description": "Tool to execute. Omit to discover available tools.",
|
||||
},
|
||||
"tool_arguments": {
|
||||
"type": "object",
|
||||
"description": (
|
||||
"Arguments to pass to the selected tool. "
|
||||
"Must match the tool's input schema returned during discovery."
|
||||
),
|
||||
"description": "Arguments matching the tool's input schema.",
|
||||
},
|
||||
},
|
||||
"required": ["server_url"],
|
||||
|
||||
@@ -38,11 +38,7 @@ class SearchDocsTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Search the AutoGPT platform documentation for information about "
|
||||
"how to use the platform, build agents, configure blocks, and more. "
|
||||
"Returns relevant documentation sections. Use get_doc_page to read full content."
|
||||
)
|
||||
return "Search platform documentation by keyword. Use get_doc_page to read full results."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
@@ -51,10 +47,7 @@ class SearchDocsTool(BaseTool):
|
||||
"properties": {
|
||||
"query": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Search query to find relevant documentation. "
|
||||
"Use natural language to describe what you're looking for."
|
||||
),
|
||||
"description": "Documentation search query.",
|
||||
},
|
||||
},
|
||||
"required": ["query"],
|
||||
|
||||
@@ -61,12 +61,12 @@ async def test_run_block_returns_details_when_no_input_provided():
|
||||
)
|
||||
|
||||
with patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=http_block,
|
||||
):
|
||||
# Mock credentials check to return no missing credentials
|
||||
with patch(
|
||||
"backend.copilot.tools.run_block.resolve_block_credentials",
|
||||
"backend.copilot.tools.helpers.resolve_block_credentials",
|
||||
new_callable=AsyncMock,
|
||||
return_value=({}, []), # (matched_credentials, missing_credentials)
|
||||
):
|
||||
@@ -119,11 +119,11 @@ async def test_run_block_returns_details_when_only_credentials_provided():
|
||||
}
|
||||
|
||||
with patch(
|
||||
"backend.copilot.tools.run_block.get_block",
|
||||
"backend.copilot.tools.helpers.get_block",
|
||||
return_value=mock,
|
||||
):
|
||||
with patch(
|
||||
"backend.copilot.tools.run_block.resolve_block_credentials",
|
||||
"backend.copilot.tools.helpers.resolve_block_credentials",
|
||||
new_callable=AsyncMock,
|
||||
return_value=(
|
||||
{
|
||||
|
||||
@@ -0,0 +1,119 @@
"""Schema regression tests for all registered CoPilot tools.

Validates that every tool in TOOL_REGISTRY produces a well-formed schema:
- description is non-empty
- all `required` fields exist in `properties`
- every property has a `type` and `description`
- total schema character budget does not regress past threshold
"""

import json
from typing import Any, cast

import pytest

from backend.copilot.tools import TOOL_REGISTRY

# Character budget (~4 chars/token heuristic, targeting ~8000 tokens)
_CHAR_BUDGET = 32_000


@pytest.fixture(scope="module")
def all_tool_schemas() -> list[tuple[str, Any]]:
    """Return (tool_name, openai_schema) pairs for every registered tool."""
    return [(name, tool.as_openai_tool()) for name, tool in TOOL_REGISTRY.items()]


def _get_parametrize_data() -> list[tuple[str, object]]:
    """Build parametrize data at collection time."""
    return [(name, tool.as_openai_tool()) for name, tool in TOOL_REGISTRY.items()]


@pytest.mark.parametrize(
    "tool_name,schema",
    _get_parametrize_data(),
    ids=[name for name, _ in _get_parametrize_data()],
)
class TestToolSchema:
    """Validate schema invariants for every registered tool."""

    def test_description_non_empty(self, tool_name: str, schema: dict) -> None:
        desc = schema["function"].get("description", "")
        assert desc, f"Tool '{tool_name}' has an empty description"

    def test_required_fields_exist_in_properties(
        self, tool_name: str, schema: dict
    ) -> None:
        params = schema["function"].get("parameters", {})
        properties = params.get("properties", {})
        required = params.get("required", [])
        for field in required:
            assert field in properties, (
                f"Tool '{tool_name}': required field '{field}' "
                f"not found in properties {list(properties.keys())}"
            )

    def test_every_property_has_type_and_description(
        self, tool_name: str, schema: dict
    ) -> None:
        params = schema["function"].get("parameters", {})
        properties = params.get("properties", {})
        for prop_name, prop_def in properties.items():
            assert (
                "type" in prop_def
            ), f"Tool '{tool_name}', property '{prop_name}' is missing 'type'"
            assert (
                "description" in prop_def
            ), f"Tool '{tool_name}', property '{prop_name}' is missing 'description'"


def test_browser_act_action_enum_complete() -> None:
    """Assert browser_act action enum still contains all 14 supported actions.

    This prevents future PRs from accidentally dropping actions during description
    trimming. The enum is the authoritative list — this locks it at 14 values.
    """
    tool = TOOL_REGISTRY["browser_act"]
    schema = tool.as_openai_tool()
    fn_def = schema["function"]
    params = cast(dict[str, Any], fn_def.get("parameters", {}))
    actions = params["properties"]["action"]["enum"]
    expected = {
        "click",
        "dblclick",
        "fill",
        "type",
        "scroll",
        "hover",
        "press",
        "check",
        "uncheck",
        "select",
        "wait",
        "back",
        "forward",
        "reload",
    }
    assert set(actions) == expected, (
        f"browser_act action enum changed. Got {set(actions)}, expected {expected}. "
        "If you added/removed an action, update this test intentionally."
    )


def test_total_schema_char_budget() -> None:
    """Assert total tool schema size stays under the character budget.

    This locks in the 34% token reduction from #12398 and prevents future
    description bloat from eroding the gains. Uses character count with a
    ~4 chars/token heuristic (budget of 32000 chars ≈ 8000 tokens).
    Character count is tokenizer-agnostic — no dependency on GPT or Claude
    tokenizers — while still providing a stable regression gate.
    """
    schemas = [tool.as_openai_tool() for tool in TOOL_REGISTRY.values()]
    serialized = json.dumps(schemas)
    total_chars = len(serialized)
    assert total_chars < _CHAR_BUDGET, (
        f"Tool schemas use {total_chars} chars (~{total_chars // 4} tokens), "
        f"exceeding budget of {_CHAR_BUDGET} chars (~{_CHAR_BUDGET // 4} tokens). "
        f"Description bloat detected — trim descriptions or raise the budget intentionally."
    )
|
||||
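Reviewer note: the character-budget gate in `test_total_schema_char_budget` can be tried outside the repository with a rough sketch like the one below. `SAMPLE_SCHEMAS` and `estimate_schema_budget` are hypothetical stand-ins invented for illustration; the real test serializes `TOOL_REGISTRY` instead, and the 4 chars/token ratio is only the heuristic named in the docstring, not a tokenizer measurement.

```python
import json

# Hypothetical stand-in for the registry output; the real test builds this
# list from backend.copilot.tools.TOOL_REGISTRY via as_openai_tool().
SAMPLE_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "search_docs",
            "description": "Search platform documentation by keyword.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Documentation search query.",
                    }
                },
                "required": ["query"],
            },
        },
    }
]

# Rough heuristic taken from the test docstring; not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_schema_budget(schemas, char_budget):
    """Return (total_chars, approx_tokens, within_budget) for serialized schemas."""
    total_chars = len(json.dumps(schemas))
    approx_tokens = total_chars // CHARS_PER_TOKEN
    return total_chars, approx_tokens, total_chars < char_budget


if __name__ == "__main__":
    chars, tokens, ok = estimate_schema_budget(SAMPLE_SCHEMAS, 32_000)
    print(f"{chars} chars, ~{tokens} tokens, within budget: {ok}")
```

Counting serialized characters keeps the gate independent of any particular tokenizer, at the cost of being only an approximation of the real prompt size.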
@@ -22,17 +22,9 @@ class ValidateAgentGraphTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Validate an agent JSON graph for correctness. Checks:\n"
|
||||
"- All block_ids reference real blocks\n"
|
||||
"- All links reference valid source/sink nodes and fields\n"
|
||||
"- Required input fields are wired or have defaults\n"
|
||||
"- Data types are compatible across links\n"
|
||||
"- Nested sink links use correct notation\n"
|
||||
"- Prompt templates use proper curly brace escaping\n"
|
||||
"- AgentExecutorBlock configurations are valid\n\n"
|
||||
"Call this after generating agent JSON to verify correctness. "
|
||||
"If validation fails, either fix issues manually based on the error "
|
||||
"descriptions, or call fix_agent_graph to auto-fix common problems."
|
||||
"Validate agent JSON for correctness: block_ids, links, required fields, "
|
||||
"type compatibility, nested sink notation, prompt brace escaping, "
|
||||
"and AgentExecutorBlock configs. On failure, use fix_agent_graph to auto-fix."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -46,11 +38,7 @@ class ValidateAgentGraphTool(BaseTool):
|
||||
"properties": {
|
||||
"agent_json": {
|
||||
"type": "object",
|
||||
"description": (
|
||||
"The agent JSON to validate. Must contain 'nodes' and 'links' arrays. "
|
||||
"Each node needs: id (UUID), block_id, input_default, metadata. "
|
||||
"Each link needs: id (UUID), source_id, source_name, sink_id, sink_name."
|
||||
),
|
||||
"description": "Agent JSON with 'nodes' and 'links' arrays.",
|
||||
},
|
||||
},
|
||||
"required": ["agent_json"],
|
||||
|
||||
@@ -59,13 +59,7 @@ class WebFetchTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Fetch the content of a public web page by URL. "
|
||||
"Returns readable text extracted from HTML by default. "
|
||||
"Useful for reading documentation, articles, and API responses. "
|
||||
"Only supports HTTP/HTTPS GET requests to public URLs "
|
||||
"(private/internal network addresses are blocked)."
|
||||
)
|
||||
return "Fetch a public web page. Public URLs only — internal addresses blocked. Returns readable text from HTML by default."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
@@ -74,14 +68,11 @@ class WebFetchTool(BaseTool):
|
||||
"properties": {
|
||||
"url": {
|
||||
"type": "string",
|
||||
"description": "The public HTTP/HTTPS URL to fetch.",
|
||||
"description": "Public HTTP/HTTPS URL.",
|
||||
},
|
||||
"extract_text": {
|
||||
"type": "boolean",
|
||||
"description": (
|
||||
"If true (default), extract readable text from HTML. "
|
||||
"If false, return raw content."
|
||||
),
|
||||
"description": "Extract text from HTML (default: true).",
|
||||
"default": True,
|
||||
},
|
||||
},
|
||||
|
||||
@@ -27,6 +27,8 @@ from .models import ErrorResponse, ResponseType, ToolResponseBase
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
_MAX_FILE_SIZE_MB = Config().max_file_size_mb
|
||||
|
||||
# Sentinel file_id used when a tool-result file is read directly from the local
|
||||
# host filesystem (rather than from workspace storage).
|
||||
_LOCAL_TOOL_RESULT_FILE_ID = "local"
|
||||
@@ -415,13 +417,7 @@ class ListWorkspaceFilesTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"List files in the user's persistent workspace (cloud storage). "
|
||||
"These files survive across sessions. "
|
||||
"For ephemeral session files, use the SDK Read/Glob tools instead. "
|
||||
"Returns file names, paths, sizes, and metadata. "
|
||||
"Optionally filter by path prefix."
|
||||
)
|
||||
return "List persistent workspace files. For ephemeral session files, use SDK Glob/Read instead. Optionally filter by path prefix."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
@@ -430,24 +426,17 @@ class ListWorkspaceFilesTool(BaseTool):
|
||||
"properties": {
|
||||
"path_prefix": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Optional path prefix to filter files "
|
||||
"(e.g., '/documents/' to list only files in documents folder). "
|
||||
"By default, only files from the current session are listed."
|
||||
),
|
||||
"description": "Filter by path prefix (e.g. '/documents/').",
|
||||
},
|
||||
"limit": {
|
||||
"type": "integer",
|
||||
"description": "Maximum number of files to return (default 50, max 100)",
|
||||
"description": "Max files to return (default 50, max 100).",
|
||||
"minimum": 1,
|
||||
"maximum": 100,
|
||||
},
|
||||
"include_all_sessions": {
|
||||
"type": "boolean",
|
||||
"description": (
|
||||
"If true, list files from all sessions. "
|
||||
"Default is false (only current session's files)."
|
||||
),
|
||||
"description": "Include files from all sessions (default: false).",
|
||||
},
|
||||
},
|
||||
"required": [],
|
||||
@@ -530,18 +519,11 @@ class ReadWorkspaceFileTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Read a file from the user's persistent workspace (cloud storage). "
|
||||
"These files survive across sessions. "
|
||||
"For ephemeral session files, use the SDK Read tool instead. "
|
||||
"Specify either file_id or path to identify the file. "
|
||||
"For small text files, returns content directly. "
|
||||
"For large or binary files, returns metadata and a download URL. "
|
||||
"Use 'save_to_path' to copy the file to the working directory "
|
||||
"(sandbox or ephemeral) for processing with bash_exec or file tools. "
|
||||
"Use 'offset' and 'length' for paginated reads of large files "
|
||||
"(e.g., persisted tool outputs). "
|
||||
"Paths are scoped to the current session by default. "
|
||||
"Use /sessions/<session_id>/... for cross-session access."
|
||||
"Read a file from persistent workspace. Specify file_id or path. "
|
||||
"Small text/image files return inline; large/binary return metadata+URL. "
|
||||
"Use save_to_path to copy to working dir for processing. "
|
||||
"Use offset/length for paginated reads. "
|
||||
"Paths scoped to current session; use /sessions/<id>/... for cross-session access."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -551,48 +533,30 @@ class ReadWorkspaceFileTool(BaseTool):
|
||||
"properties": {
|
||||
"file_id": {
|
||||
"type": "string",
|
||||
"description": "The file's unique ID (from list_workspace_files)",
|
||||
"description": "File ID from list_workspace_files.",
|
||||
},
|
||||
"path": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"The virtual file path (e.g., '/documents/report.pdf'). "
|
||||
"Scoped to current session by default."
|
||||
),
|
||||
"description": "Virtual file path (e.g. '/documents/report.pdf').",
|
||||
},
|
||||
"save_to_path": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"If provided, save the file to this path in the working "
|
||||
"directory (cloud sandbox when E2B is active, or "
|
||||
"ephemeral dir otherwise) so it can be processed with "
|
||||
"bash_exec or file tools. "
|
||||
"The file content is still returned in the response."
|
||||
),
|
||||
"description": "Copy file to this working directory path for processing.",
|
||||
},
|
||||
"force_download_url": {
|
||||
"type": "boolean",
|
||||
"description": (
|
||||
"If true, always return metadata+URL instead of inline content. "
|
||||
"Default is false (auto-selects based on file size/type)."
|
||||
),
|
||||
"description": "Always return metadata+URL instead of inline content.",
|
||||
},
|
||||
"offset": {
|
||||
"type": "integer",
|
||||
"description": (
|
||||
"Character offset to start reading from (0-based). "
|
||||
"Use with 'length' for paginated reads of large files."
|
||||
),
|
||||
"description": "Character offset for paginated reads (0-based).",
|
||||
},
|
||||
"length": {
|
||||
"type": "integer",
|
||||
"description": (
|
||||
"Maximum number of characters to return. "
|
||||
"Defaults to full file. Use with 'offset' for paginated reads."
|
||||
),
|
||||
"description": "Max characters to return for paginated reads.",
|
||||
},
|
||||
},
|
||||
"required": [], # At least one must be provided
|
||||
"required": [], # At least one of file_id or path must be provided
|
||||
}
|
||||
|
||||
@property
|
||||
@@ -755,15 +719,10 @@ class WriteWorkspaceFileTool(BaseTool):
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Write or create a file in the user's persistent workspace (cloud storage). "
|
||||
"These files survive across sessions. "
|
||||
"For ephemeral session files, use the SDK Write tool instead. "
|
||||
"Provide content as plain text via 'content', OR base64-encoded via "
|
||||
"'content_base64', OR copy a file from the ephemeral working directory "
|
||||
"via 'source_path'. Exactly one of these three is required. "
|
||||
f"Maximum file size is {Config().max_file_size_mb}MB. "
|
||||
"Files are saved to the current session's folder by default. "
|
||||
"Use /sessions/<session_id>/... for cross-session access."
|
||||
"Write a file to persistent workspace (survives across sessions). "
|
||||
"Provide exactly one of: content (text), content_base64 (binary), "
|
||||
f"or source_path (copy from working dir). Max {_MAX_FILE_SIZE_MB}MB. "
|
||||
"Paths scoped to current session; use /sessions/<id>/... for cross-session access."
|
||||
)
|
||||
|
||||
@property
|
||||
@@ -773,51 +732,31 @@ class WriteWorkspaceFileTool(BaseTool):
|
||||
"properties": {
|
||||
"filename": {
|
||||
"type": "string",
|
||||
"description": "Name for the file (e.g., 'report.pdf')",
|
||||
"description": "Filename (e.g. 'report.pdf').",
|
||||
},
|
||||
"content": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Plain text content to write. Use this for text files "
|
||||
"(code, configs, documents, etc.). "
|
||||
"Mutually exclusive with content_base64 and source_path."
|
||||
),
|
||||
"description": "Plain text content. Mutually exclusive with content_base64/source_path.",
|
||||
},
|
||||
"content_base64": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Base64-encoded file content. Use this for binary files "
|
||||
"(images, PDFs, etc.). "
|
||||
"Mutually exclusive with content and source_path."
|
||||
),
|
||||
"description": "Base64-encoded binary content. Mutually exclusive with content/source_path.",
|
||||
},
|
||||
"source_path": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Path to a file in the ephemeral working directory to "
|
||||
"copy to workspace (e.g., '/tmp/copilot-.../output.csv'). "
|
||||
"Use this to persist files created by bash_exec or SDK Write. "
|
||||
"Mutually exclusive with content and content_base64."
|
||||
),
|
||||
"description": "Working directory path to copy to workspace. Mutually exclusive with content/content_base64.",
|
||||
},
|
||||
"path": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Optional virtual path where to save the file "
|
||||
"(e.g., '/documents/report.pdf'). "
|
||||
"Defaults to '/{filename}'. Scoped to current session."
|
||||
),
|
||||
"description": "Virtual path (e.g. '/documents/report.pdf'). Defaults to '/{filename}'.",
|
||||
},
|
||||
"mime_type": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Optional MIME type of the file. "
|
||||
"Auto-detected from filename if not provided."
|
||||
),
|
||||
"description": "MIME type. Auto-detected from filename if omitted.",
|
||||
},
|
||||
"overwrite": {
|
||||
"type": "boolean",
|
||||
"description": "Whether to overwrite if file exists at path (default: false)",
|
||||
"description": "Overwrite if file exists (default: false).",
|
||||
},
|
||||
},
|
||||
"required": ["filename"],
|
||||
@@ -859,10 +798,10 @@ class WriteWorkspaceFileTool(BaseTool):
|
||||
return resolved
|
||||
content: bytes = resolved
|
||||
|
||||
max_size = Config().max_file_size_mb * 1024 * 1024
|
||||
max_size = _MAX_FILE_SIZE_MB * 1024 * 1024
|
||||
if len(content) > max_size:
|
||||
return ErrorResponse(
|
||||
message=f"File too large. Maximum size is {Config().max_file_size_mb}MB",
|
||||
message=f"File too large. Maximum size is {_MAX_FILE_SIZE_MB}MB",
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
@@ -944,12 +883,7 @@ class DeleteWorkspaceFileTool(BaseTool):
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return (
|
||||
"Delete a file from the user's persistent workspace (cloud storage). "
|
||||
"Specify either file_id or path to identify the file. "
|
||||
"Paths are scoped to the current session by default. "
|
||||
"Use /sessions/<session_id>/... for cross-session access."
|
||||
)
|
||||
return "Delete a file from persistent workspace. Specify file_id or path. Paths scoped to current session; use /sessions/<id>/... for cross-session access."
|
||||
|
||||
@property
|
||||
def parameters(self) -> dict[str, Any]:
|
||||
@@ -958,17 +892,14 @@ class DeleteWorkspaceFileTool(BaseTool):
|
||||
"properties": {
|
||||
"file_id": {
|
||||
"type": "string",
|
||||
"description": "The file's unique ID (from list_workspace_files)",
|
||||
"description": "File ID from list_workspace_files.",
|
||||
},
|
||||
"path": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"The virtual file path (e.g., '/documents/report.pdf'). "
|
||||
"Scoped to current session by default."
|
||||
),
|
||||
"description": "Virtual file path.",
|
||||
},
|
||||
},
|
||||
"required": [], # At least one must be provided
|
||||
"required": [], # At least one of file_id or path must be provided
|
||||
}
|
||||
|
||||
@property
|
||||
|
||||
@@ -32,9 +32,9 @@ from backend.blocks.llm import (
|
||||
AITextSummarizerBlock,
|
||||
LlmModel,
|
||||
)
|
||||
from backend.blocks.orchestrator import OrchestratorBlock
|
||||
from backend.blocks.replicate.flux_advanced import ReplicateFluxAdvancedModelBlock
|
||||
from backend.blocks.replicate.replicate_block import ReplicateModelBlock
|
||||
from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
|
||||
from backend.blocks.talking_head import CreateTalkingAvatarVideoBlock
|
||||
from backend.blocks.text_to_speech_block import UnrealTextToSpeechBlock
|
||||
from backend.blocks.video.narration import VideoNarrationBlock
|
||||
@@ -548,7 +548,6 @@ BLOCK_COSTS: dict[Type[Block], list[BlockCost]] = {
|
||||
},
|
||||
)
|
||||
],
|
||||
SmartDecisionMakerBlock: LLM_COST,
|
||||
SearchOrganizationsBlock: [
|
||||
BlockCost(
|
||||
cost_amount=2,
|
||||
@@ -700,6 +699,7 @@ BLOCK_COSTS: dict[Type[Block], list[BlockCost]] = {
|
||||
},
|
||||
),
|
||||
],
|
||||
OrchestratorBlock: LLM_COST,
|
||||
VideoNarrationBlock: [
|
||||
BlockCost(
|
||||
cost_amount=5, # ElevenLabs TTS cost
|
||||
|
||||
@@ -38,6 +38,10 @@ POOL_TIMEOUT = os.getenv("DB_POOL_TIMEOUT")
if POOL_TIMEOUT:
    DATABASE_URL = add_param(DATABASE_URL, "pool_timeout", POOL_TIMEOUT)

STMT_CACHE_SIZE = os.getenv("DB_STATEMENT_CACHE_SIZE")
if STMT_CACHE_SIZE:
    DATABASE_URL = add_param(DATABASE_URL, "statement_cache_size", STMT_CACHE_SIZE)

HTTP_TIMEOUT = int(POOL_TIMEOUT) if POOL_TIMEOUT else None

prisma = Prisma(
|
||||
|
||||
@@ -7,7 +7,7 @@ the function returns plain values instead of lists, it causes:
|
||||
1 validation error for dict[str,list[any]] response
|
||||
Input should be a valid list [type=list_type, input_value='', input_type=str]
|
||||
|
||||
This breaks SmartDecisionMakerBlock agent mode tool execution.
|
||||
This breaks OrchestratorBlock agent mode tool execution.
|
||||
"""
|
||||
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
@@ -737,7 +737,7 @@ class GraphModel(Graph, GraphMeta):
|
||||
# Collect errors per node
|
||||
node_errors: dict[str, dict[str, str]] = defaultdict(dict)
|
||||
|
||||
# Validate smart decision maker nodes
|
||||
# Validate tool orchestrator nodes
|
||||
nodes_block = {
|
||||
node.id: block
|
||||
for node in graph.nodes
|
||||
@@ -1207,13 +1207,9 @@ async def get_graph_as_admin(
|
||||
order={"version": "desc"},
|
||||
)
|
||||
|
||||
# For access, the graph must be owned by the user or listed in the store
|
||||
if graph is None or (
|
||||
graph.userId != user_id
|
||||
and not await is_graph_published_in_marketplace(
|
||||
graph_id, version or graph.version
|
||||
)
|
||||
):
|
||||
# Admin access bypasses ownership and marketplace checks — route-level
|
||||
# auth already ensures only admins can call this function.
|
||||
if graph is None:
|
||||
return None
|
||||
|
||||
if for_export:
|
||||
|
||||
@@ -1,6 +1,6 @@
 import json
 from typing import Any
-from unittest.mock import AsyncMock, patch
+from unittest.mock import AsyncMock, MagicMock, patch
 from uuid import UUID
 
 import fastapi.exceptions
@@ -13,7 +13,7 @@ from backend.api.model import CreateGraph
 from backend.blocks._base import BlockSchema, BlockSchemaInput
 from backend.blocks.basic import StoreValueBlock
 from backend.blocks.io import AgentInputBlock, AgentOutputBlock
-from backend.data.graph import Graph, Link, Node
+from backend.data.graph import Graph, Link, Node, get_graph
 from backend.data.model import SchemaField
 from backend.data.user import DEFAULT_USER_ID
 from backend.usecases.sample import create_test_user
@@ -595,3 +595,82 @@ def test_mcp_credential_combine_no_discriminator_values():
         f"Expected 1 credential entry for MCP blocks without discriminator_values, "
         f"got {len(combined)}: {list(combined.keys())}"
     )
+
+
+# --------------- get_graph access-control regression tests --------------- #
+# These protect the behavior introduced in PR #11323 (Reinier, 2025-11-05):
+# non-owners can access APPROVED marketplace agents but NOT pending ones.
+
+
+def _make_mock_db_graph(user_id: str = "owner-user-id") -> MagicMock:
+    graph = MagicMock()
+    graph.userId = user_id
+    graph.id = "graph-id"
+    graph.version = 1
+    graph.Nodes = []
+    return graph
+
+
+@pytest.mark.asyncio
+async def test_get_graph_non_owner_approved_marketplace_agent() -> None:
+    """A non-owner should be able to access a graph that has an APPROVED
+    marketplace listing. This is the normal marketplace download flow."""
+    owner_id = "owner-user-id"
+    requester_id = "different-user-id"
+    graph_id = "graph-id"
+    mock_graph = _make_mock_db_graph(owner_id)
+    mock_graph_model = MagicMock(name="GraphModel")
+
+    mock_listing = MagicMock()
+    mock_listing.AgentGraph = mock_graph
+
+    with (
+        patch("backend.data.graph.AgentGraph.prisma") as mock_ag_prisma,
+        patch(
+            "backend.data.graph.StoreListingVersion.prisma",
+        ) as mock_slv_prisma,
+        patch(
+            "backend.data.graph.GraphModel.from_db",
+            return_value=mock_graph_model,
+        ),
+    ):
+        # First lookup (owned graph) returns None — requester != owner
+        mock_ag_prisma.return_value.find_first = AsyncMock(return_value=None)
+        # Marketplace fallback finds an APPROVED listing
+        mock_slv_prisma.return_value.find_first = AsyncMock(return_value=mock_listing)
+
+        result = await get_graph(
+            graph_id=graph_id,
+            version=1,
+            user_id=requester_id,
+        )
+
+    assert result is not None, "Non-owner should access APPROVED marketplace agent"
+
+
+@pytest.mark.asyncio
+async def test_get_graph_non_owner_pending_marketplace_agent_denied() -> None:
+    """A non-owner must NOT be able to access a graph that only has a PENDING
+    (not APPROVED) marketplace listing. The marketplace fallback filters on
+    submissionStatus=APPROVED, so pending agents should be invisible."""
+    requester_id = "different-user-id"
+    graph_id = "graph-id"
+
+    with (
+        patch("backend.data.graph.AgentGraph.prisma") as mock_ag_prisma,
+        patch(
+            "backend.data.graph.StoreListingVersion.prisma",
+        ) as mock_slv_prisma,
+    ):
+        # First lookup (owned graph) returns None
+        mock_ag_prisma.return_value.find_first = AsyncMock(return_value=None)
+        # Marketplace fallback finds nothing (not APPROVED)
+        mock_slv_prisma.return_value.find_first = AsyncMock(return_value=None)
+
+        result = await get_graph(
+            graph_id=graph_id,
+            version=1,
+            user_id=requester_id,
+        )
+
+    assert result is None, "Non-owner must not access a pending marketplace agent"
|
||||
@@ -23,11 +23,29 @@ def _cache_key(user_id: str) -> str:
 
 
 def _json_to_list(value: Any) -> list[str]:
-    """Convert Json field to list[str], handling None."""
+    """Convert Json field to list[str], handling None.
+
+    Also handles legacy dict-format rows (e.g. ``{"Learn": [...], "Create": [...]}``
+    from the reverted themed-prompts feature) by flattening all values into a single
+    list so existing personalised data isn't silently lost.
+    """
     if value is None:
         return []
     if isinstance(value, list):
         return cast(list[str], value)
+    if isinstance(value, dict):
+        # Legacy themed-prompt format: flatten all string values from all categories.
+        logger.debug(
+            "_json_to_list: flattening legacy dict-format value (keys=%s)",
+            list(value.keys()),
+        )
+        return [
+            item
+            for vals in value.values()
+            if isinstance(vals, list)
+            for item in vals
+            if isinstance(item, str)
+        ]
     return []
 
 
|
||||
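To make the flattening behavior of `_json_to_list` concrete, here is a small standalone sketch that mirrors the logic in the hunk above without the logging call. The function name `json_to_list` and the sample prompts are illustrative assumptions, not data from the repository.

```python
from typing import Any, cast


def json_to_list(value: Any) -> list[str]:
    """Standalone copy of the diff's logic, minus logging (illustrative only)."""
    if value is None:
        return []
    if isinstance(value, list):
        return cast(list[str], value)
    if isinstance(value, dict):
        # Legacy dict format: keep only strings found inside list values.
        return [
            item
            for vals in value.values()
            if isinstance(vals, list)
            for item in vals
            if isinstance(item, str)
        ]
    return []


# Made-up legacy row: non-list values and non-string items are dropped.
legacy = {"Learn": ["a", "b"], "Create": ["c", 1], "Other": "not-a-list"}
flattened = json_to_list(legacy)
```

Because dicts preserve insertion order, the flattened list keeps the category order of the stored row. Note that the plain-list branch is passed through unchanged and is not filtered for non-string items.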
@@ -224,7 +224,7 @@ async def execute_node(
     # Sanity check: validate the execution input.
     input_data, error = validate_exec(node, data.inputs, resolve_input=False)
     if input_data is None:
-        log_metadata.error(f"Skip execution, input validation error: {error}")
+        log_metadata.warning(f"Skip execution, input validation error: {error}")
         yield "error", error
         return
 
|
||||
@@ -612,7 +612,7 @@ class TestEnsureToolPairsIntact:
     # ---- Mixed/Edge Case Tests ----
 
     def test_anthropic_with_type_message_field(self):
-        """Test Anthropic format with 'type': 'message' field (smart_decision_maker style)."""
+        """Test Anthropic format with 'type': 'message' field (orchestrator style)."""
         all_msgs = [
             {"role": "system", "content": "You are helpful."},
             {
@@ -628,7 +628,7 @@ class TestEnsureToolPairsIntact:
             },
             {
                 "role": "user",
-                "type": "message",  # Extra field from smart_decision_maker
+                "type": "message",  # Extra field from orchestrator
                 "content": [
                     {
                         "type": "tool_result",
|
||||
@@ -704,8 +704,19 @@ def get_service_client(
             return kwargs
 
         def _get_return(self, expected_return: TypeAdapter | None, result: Any) -> Any:
+            """Validate and coerce the RPC result to the expected return type.
+
+            Falls back to the raw result with a warning if validation fails.
+            """
             if expected_return:
-                return expected_return.validate_python(result)
+                try:
+                    return expected_return.validate_python(result)
+                except Exception as e:
+                    logger.warning(
+                        "RPC return type validation failed, using raw result: %s",
+                        type(e).__name__,
+                    )
+                    return result
             return result
 
         def __getattr__(self, name: str) -> Callable[..., Any]:
|
||||
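The `_get_return` change follows a "validate, else fall back to the raw value" pattern. The sketch below shows that pattern in isolation. To stay dependency-free it takes a plain callable in place of the `TypeAdapter.validate_python` method used in the diff, and the name `coerce_or_raw` is a hypothetical choice for illustration.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


def coerce_or_raw(validate: Optional[Callable[[Any], Any]], result: Any) -> Any:
    """Try to coerce `result` with `validate`; return it unchanged on failure.

    `validate` stands in for `expected_return.validate_python` in the diff.
    """
    if validate is None:
        return result
    try:
        return validate(result)
    except Exception as e:
        # Log only the exception type, as the diff does, to avoid
        # echoing potentially large or sensitive payloads.
        logger.warning(
            "Return type validation failed, using raw result: %s",
            type(e).__name__,
        )
        return result


ok = coerce_or_raw(int, "42")        # coerced successfully
fallback = coerce_or_raw(int, "abc")  # validation fails, raw value kept
```

One consequence of this design is that callers may now receive a value that does not match the declared return type, so the warning is the only signal that coercion was skipped.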
@@ -302,7 +302,14 @@ def _value_satisfies_type(value: Any, target: Any) -> bool:
 
     # Simple type (e.g. str, int)
     if isinstance(target, type):
-        return isinstance(value, target)
+        try:
+            return isinstance(value, target)
+        except TypeError:
+            # TypedDict and some typing constructs don't support isinstance checks.
+            # For TypedDict, check if value is a dict with the required keys.
+            if isinstance(value, dict) and hasattr(target, "__required_keys__"):
+                return all(k in value for k in target.__required_keys__)
+            return False
 
     return False
 
|
||||
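The `TypeError` guard in `_value_satisfies_type` exists because a `TypedDict` class is itself a `type`, yet `isinstance(value, SomeTypedDict)` raises instead of returning a bool. The sketch below reproduces only the simple-type branch from the hunk. The `Point` class and the function name `satisfies_simple_type` are made up for illustration.

```python
from typing import Any, TypedDict


class Point(TypedDict):
    # Hypothetical TypedDict used only for this example.
    x: int
    y: int


def satisfies_simple_type(value: Any, target: Any) -> bool:
    """Mirror of the simple-type branch shown in the diff."""
    if not isinstance(target, type):
        return False
    try:
        return isinstance(value, target)
    except TypeError:
        # TypedDict classes reject isinstance checks, so fall back to
        # checking that every required key is present in a dict value.
        if isinstance(value, dict) and hasattr(target, "__required_keys__"):
            return all(k in value for k in target.__required_keys__)
        return False


complete = satisfies_simple_type({"x": 1, "y": 2}, Point)
missing_key = satisfies_simple_type({"x": 1}, Point)
```

The fallback checks key presence only. It does not verify that the values match the annotated field types, so `{"x": "a", "y": None}` would also pass.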
378
autogpt_platform/backend/poetry.lock
generated
@@ -594,26 +594,6 @@ files = [
|
||||
{file = "bracex-2.6.tar.gz", hash = "sha256:98f1347cd77e22ee8d967a30ad4e310b233f7754dbf31ff3fceb76145ba47dc7"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "browserbase"
|
||||
version = "1.4.0"
|
||||
description = "The official Python library for the Browserbase API"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "browserbase-1.4.0-py3-none-any.whl", hash = "sha256:ea9f1fb4a88921975b8b9606835c441a59d8ce82ce00313a6d48bbe8e30f79fb"},
|
||||
{file = "browserbase-1.4.0.tar.gz", hash = "sha256:e2ed36f513c8630b94b826042c4bb9f497c333f3bd28e5b76cb708c65b4318a0"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
anyio = ">=3.5.0,<5"
|
||||
distro = ">=1.7.0,<2"
|
||||
httpx = ">=0.23.0,<1"
|
||||
pydantic = ">=1.9.0,<3"
|
||||
sniffio = "*"
|
||||
typing-extensions = ">=4.10,<5"
|
||||
|
||||
[[package]]
|
||||
name = "build"
|
||||
version = "1.4.0"
|
||||
@@ -1488,94 +1468,6 @@ files = [
|
||||
[package.extras]
|
||||
devel = ["colorama", "json-spec", "jsonschema", "pylint", "pytest", "pytest-benchmark", "pytest-cache", "validictory"]
|
||||
|
||||
[[package]]
|
||||
name = "fastuuid"
|
||||
version = "0.14.0"
|
||||
description = "Python bindings to Rust's UUID library."
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:6e6243d40f6c793c3e2ee14c13769e341b90be5ef0c23c82fa6515a96145181a"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:13ec4f2c3b04271f62be2e1ce7e95ad2dd1cf97e94503a3760db739afbd48f00"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:b2fdd48b5e4236df145a149d7125badb28e0a383372add3fbaac9a6b7a394470"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f74631b8322d2780ebcf2d2d75d58045c3e9378625ec51865fe0b5620800c39d"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:83cffc144dc93eb604b87b179837f2ce2af44871a7b323f2bfed40e8acb40ba8"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:1a771f135ab4523eb786e95493803942a5d1fc1610915f131b363f55af53b219"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:4edc56b877d960b4eda2c4232f953a61490c3134da94f3c28af129fb9c62a4f6"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:bcc96ee819c282e7c09b2eed2b9bd13084e3b749fdb2faf58c318d498df2efbe"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:7a3c0bca61eacc1843ea97b288d6789fbad7400d16db24e36a66c28c268cfe3d"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-win32.whl", hash = "sha256:7f2f3efade4937fae4e77efae1af571902263de7b78a0aee1a1653795a093b2a"},
|
||||
{file = "fastuuid-0.14.0-cp310-cp310-win_amd64.whl", hash = "sha256:ae64ba730d179f439b0736208b4c279b8bc9c089b102aec23f86512ea458c8a4"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:73946cb950c8caf65127d4e9a325e2b6be0442a224fd51ba3b6ac44e1912ce34"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:12ac85024637586a5b69645e7ed986f7535106ed3013640a393a03e461740cb7"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:05a8dde1f395e0c9b4be515b7a521403d1e8349443e7641761af07c7ad1624b1"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:09378a05020e3e4883dfdab438926f31fea15fd17604908f3d39cbeb22a0b4dc"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bbb0c4b15d66b435d2538f3827f05e44e2baafcc003dd7d8472dc67807ab8fd8"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:cd5a7f648d4365b41dbf0e38fe8da4884e57bed4e77c83598e076ac0c93995e7"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:c0a94245afae4d7af8c43b3159d5e3934c53f47140be0be624b96acd672ceb73"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:2b29e23c97e77c3a9514d70ce343571e469098ac7f5a269320a0f0b3e193ab36"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:1e690d48f923c253f28151b3a6b4e335f2b06bf669c68a02665bc150b7839e94"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-win32.whl", hash = "sha256:a6f46790d59ab38c6aa0e35c681c0484b50dc0acf9e2679c005d61e019313c24"},
|
||||
{file = "fastuuid-0.14.0-cp311-cp311-win_amd64.whl", hash = "sha256:e150eab56c95dc9e3fefc234a0eedb342fac433dacc273cd4d150a5b0871e1fa"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:77e94728324b63660ebf8adb27055e92d2e4611645bf12ed9d88d30486471d0a"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:caa1f14d2102cb8d353096bc6ef6c13b2c81f347e6ab9d6fbd48b9dea41c153d"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:d23ef06f9e67163be38cece704170486715b177f6baae338110983f99a72c070"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0c9ec605ace243b6dbe3bd27ebdd5d33b00d8d1d3f580b39fdd15cd96fd71796"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:808527f2407f58a76c916d6aa15d58692a4a019fdf8d4c32ac7ff303b7d7af09"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:2fb3c0d7fef6674bbeacdd6dbd386924a7b60b26de849266d1ff6602937675c8"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:ab3f5d36e4393e628a4df337c2c039069344db5f4b9d2a3c9cea48284f1dd741"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:b9a0ca4f03b7e0b01425281ffd44e99d360e15c895f1907ca105854ed85e2057"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:3acdf655684cc09e60fb7e4cf524e8f42ea760031945aa8086c7eae2eeeabeb8"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-win32.whl", hash = "sha256:9579618be6280700ae36ac42c3efd157049fe4dd40ca49b021280481c78c3176"},
|
||||
{file = "fastuuid-0.14.0-cp312-cp312-win_amd64.whl", hash = "sha256:d9e4332dc4ba054434a9594cbfaf7823b57993d7d8e7267831c3e059857cf397"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:77a09cb7427e7af74c594e409f7731a0cf887221de2f698e1ca0ebf0f3139021"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:9bd57289daf7b153bfa3e8013446aa144ce5e8c825e9e366d455155ede5ea2dc"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:ac60fc860cdf3c3f327374db87ab8e064c86566ca8c49d2e30df15eda1b0c2d5"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ab32f74bd56565b186f036e33129da77db8be09178cd2f5206a5d4035fb2a23f"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:33e678459cf4addaedd9936bbb038e35b3f6b2061330fd8f2f6a1d80414c0f87"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:1e3cc56742f76cd25ecb98e4b82a25f978ccffba02e4bdce8aba857b6d85d87b"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:cb9a030f609194b679e1660f7e32733b7a0f332d519c5d5a6a0a580991290022"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-musllinux_1_1_i686.whl", hash = "sha256:09098762aad4f8da3a888eb9ae01c84430c907a297b97166b8abc07b640f2995"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:1383fff584fa249b16329a059c68ad45d030d5a4b70fb7c73a08d98fd53bcdab"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-win32.whl", hash = "sha256:a0809f8cc5731c066c909047f9a314d5f536c871a7a22e815cc4967c110ac9ad"},
|
||||
{file = "fastuuid-0.14.0-cp313-cp313-win_amd64.whl", hash = "sha256:0df14e92e7ad3276327631c9e7cec09e32572ce82089c55cb1bb8df71cf394ed"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:b852a870a61cfc26c884af205d502881a2e59cc07076b60ab4a951cc0c94d1ad"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:c7502d6f54cd08024c3ea9b3514e2d6f190feb2f46e6dbcd3747882264bb5f7b"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:1ca61b592120cf314cfd66e662a5b54a578c5a15b26305e1b8b618a6f22df714"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:aa75b6657ec129d0abded3bec745e6f7ab642e6dba3a5272a68247e85f5f316f"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a8a0dfea3972200f72d4c7df02c8ac70bad1bb4c58d7e0ec1e6f341679073a7f"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:1bf539a7a95f35b419f9ad105d5a8a35036df35fdafae48fb2fd2e5f318f0d75"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:9a133bf9cc78fdbd1179cb58a59ad0100aa32d8675508150f3658814aeefeaa4"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-musllinux_1_1_i686.whl", hash = "sha256:f54d5b36c56a2d5e1a31e73b950b28a0d83eb0c37b91d10408875a5a29494bad"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:ec27778c6ca3393ef662e2762dba8af13f4ec1aaa32d08d77f71f2a70ae9feb8"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-win32.whl", hash = "sha256:e23fc6a83f112de4be0cc1990e5b127c27663ae43f866353166f87df58e73d06"},
|
||||
{file = "fastuuid-0.14.0-cp314-cp314-win_amd64.whl", hash = "sha256:df61342889d0f5e7a32f7284e55ef95103f2110fee433c2ae7c2c0956d76ac8a"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:47c821f2dfe95909ead0085d4cb18d5149bca704a2b03e03fb3f81a5202d8cea"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-macosx_10_12_x86_64.whl", hash = "sha256:3964bab460c528692c70ab6b2e469dd7a7b152fbe8c18616c58d34c93a6cf8d4"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:c501561e025b7aea3508719c5801c360c711d5218fc4ad5d77bf1c37c1a75779"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2dce5d0756f046fa792a40763f36accd7e466525c5710d2195a038f93ff96346"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:193ca10ff553cf3cc461572da83b5780fc0e3eea28659c16f89ae5202f3958d4"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:0737606764b29785566f968bd8005eace73d3666bd0862f33a760796e26d1ede"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:e0976c0dff7e222513d206e06341503f07423aceb1db0b83ff6851c008ceee06"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:6fbc49a86173e7f074b1a9ec8cf12ca0d54d8070a85a06ebf0e76c309b84f0d0"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:de01280eabcd82f7542828ecd67ebf1551d37203ecdfd7ab1f2e534edb78d505"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-win32.whl", hash = "sha256:af5967c666b7d6a377098849b07f83462c4fedbafcf8eb8bc8ff05dcbe8aa209"},
|
||||
{file = "fastuuid-0.14.0-cp38-cp38-win_amd64.whl", hash = "sha256:c3091e63acf42f56a6f74dc65cfdb6f99bfc79b5913c8a9ac498eb7ca09770a8"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:2ec3d94e13712a133137b2805073b65ecef4a47217d5bac15d8ac62376cefdb4"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-macosx_10_12_x86_64.whl", hash = "sha256:139d7ff12bb400b4a0c76be64c28cbe2e2edf60b09826cbfd85f33ed3d0bbe8b"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:d55b7e96531216fc4f071909e33e35e5bfa47962ae67d9e84b00a04d6e8b7173"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c0eb25f0fd935e376ac4334927a59e7c823b36062080e2e13acbaf2af15db836"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:089c18018fdbdda88a6dafd7d139f8703a1e7c799618e33ea25eb52503d28a11"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:2fc37479517d4d70c08696960fad85494a8a7a0af4e93e9a00af04d74c59f9e3"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:73657c9f778aba530bc96a943d30e1a7c80edb8278df77894fe9457540df4f85"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:d31f8c257046b5617fc6af9c69be066d2412bdef1edaa4bdf6a214cf57806105"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:5816d41f81782b209843e52fdef757a361b448d782452d96abedc53d545da722"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-win32.whl", hash = "sha256:448aa6833f7a84bfe37dd47e33df83250f404d591eb83527fa2cac8d1e57d7f3"},
|
||||
{file = "fastuuid-0.14.0-cp39-cp39-win_amd64.whl", hash = "sha256:84b0779c5abbdec2a9511d5ffbfcd2e53079bf889824b32be170c0d8ef5fc74c"},
|
||||
{file = "fastuuid-0.14.0.tar.gz", hash = "sha256:178947fc2f995b38497a74172adee64fdeb8b7ec18f2a5934d037641ba265d26"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "feedparser"
|
||||
version = "6.0.12"
|
||||
@@ -2038,7 +1930,6 @@ files = [
|
||||
[package.dependencies]
|
||||
cryptography = ">=38.0.3"
|
||||
pyasn1-modules = ">=0.2.1"
|
||||
requests = {version = ">=2.20.0,<3.0.0", optional = true, markers = "extra == \"requests\""}
|
||||
rsa = ">=3.1.4,<5"
|
||||
|
||||
[package.extras]
|
||||
@@ -2240,34 +2131,6 @@ files = [
|
||||
{file = "google_crc32c-1.8.0.tar.gz", hash = "sha256:a428e25fb7691024de47fecfbff7ff957214da51eddded0da0ae0e0f03a2cf79"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "google-genai"
|
||||
version = "1.62.0"
|
||||
description = "GenAI Python SDK"
|
||||
optional = false
|
||||
python-versions = ">=3.10"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "google_genai-1.62.0-py3-none-any.whl", hash = "sha256:4c3daeff3d05fafee4b9a1a31f9c07f01bc22051081aa58b4d61f58d16d1bcc0"},
|
||||
{file = "google_genai-1.62.0.tar.gz", hash = "sha256:709468a14c739a080bc240a4f3191df597bf64485b1ca3728e0fb67517774c18"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
anyio = ">=4.8.0,<5.0.0"
|
||||
distro = ">=1.7.0,<2"
|
||||
google-auth = {version = ">=2.47.0,<3.0.0", extras = ["requests"]}
|
||||
httpx = ">=0.28.1,<1.0.0"
|
||||
pydantic = ">=2.9.0,<3.0.0"
|
||||
requests = ">=2.28.1,<3.0.0"
|
||||
sniffio = "*"
|
||||
tenacity = ">=8.2.3,<9.2.0"
|
||||
typing-extensions = ">=4.11.0,<5.0.0"
|
||||
websockets = ">=13.0.0,<15.1.0"
|
||||
|
||||
[package.extras]
|
||||
aiohttp = ["aiohttp (<3.13.3)"]
|
||||
local-tokenizer = ["protobuf", "sentencepiece (>=0.2.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "google-resumable-media"
|
||||
version = "2.8.0"
|
||||
@@ -2360,6 +2223,7 @@ description = "Lightweight in-process concurrent programming"
|
||||
optional = false
|
||||
python-versions = ">=3.10"
|
||||
groups = ["main"]
|
||||
markers = "platform_machine == \"aarch64\" or platform_machine == \"ppc64le\" or platform_machine == \"x86_64\" or platform_machine == \"amd64\" or platform_machine == \"AMD64\" or platform_machine == \"win32\" or platform_machine == \"WIN32\""
|
||||
files = [
|
||||
{file = "greenlet-3.3.1-cp310-cp310-macosx_11_0_universal2.whl", hash = "sha256:04bee4775f40ecefcdaa9d115ab44736cd4b9c5fba733575bfe9379419582e13"},
|
||||
{file = "greenlet-3.3.1-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:50e1457f4fed12a50e427988a07f0f9df53cf0ee8da23fab16e6732c2ec909d4"},
|
||||
@@ -2582,42 +2446,6 @@ files = [
|
||||
hpack = ">=4.1,<5"
|
||||
hyperframe = ">=6.1,<7"
|
||||
|
||||
[[package]]
|
||||
name = "hf-xet"
|
||||
version = "1.2.0"
|
||||
description = "Fast transfer of large files with the Hugging Face Hub."
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
groups = ["main"]
|
||||
markers = "platform_machine == \"x86_64\" or platform_machine == \"amd64\" or platform_machine == \"AMD64\" or platform_machine == \"arm64\" or platform_machine == \"aarch64\""
|
||||
files = [
|
||||
{file = "hf_xet-1.2.0-cp313-cp313t-macosx_10_12_x86_64.whl", hash = "sha256:ceeefcd1b7aed4956ae8499e2199607765fbd1c60510752003b6cc0b8413b649"},
|
||||
{file = "hf_xet-1.2.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:b70218dd548e9840224df5638fdc94bd033552963cfa97f9170829381179c813"},
|
||||
{file = "hf_xet-1.2.0-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7d40b18769bb9a8bc82a9ede575ce1a44c75eb80e7375a01d76259089529b5dc"},
|
||||
{file = "hf_xet-1.2.0-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:cd3a6027d59cfb60177c12d6424e31f4b5ff13d8e3a1247b3a584bf8977e6df5"},
|
||||
{file = "hf_xet-1.2.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:6de1fc44f58f6dd937956c8d304d8c2dea264c80680bcfa61ca4a15e7b76780f"},
|
||||
{file = "hf_xet-1.2.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:f182f264ed2acd566c514e45da9f2119110e48a87a327ca271027904c70c5832"},
|
||||
{file = "hf_xet-1.2.0-cp313-cp313t-win_amd64.whl", hash = "sha256:293a7a3787e5c95d7be1857358a9130694a9c6021de3f27fa233f37267174382"},
|
||||
{file = "hf_xet-1.2.0-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:10bfab528b968c70e062607f663e21e34e2bba349e8038db546646875495179e"},
|
||||
{file = "hf_xet-1.2.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:2a212e842647b02eb6a911187dc878e79c4aa0aa397e88dd3b26761676e8c1f8"},
|
||||
{file = "hf_xet-1.2.0-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:30e06daccb3a7d4c065f34fc26c14c74f4653069bb2b194e7f18f17cbe9939c0"},
|
||||
{file = "hf_xet-1.2.0-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:29c8fc913a529ec0a91867ce3d119ac1aac966e098cf49501800c870328cc090"},
|
||||
{file = "hf_xet-1.2.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:66e159cbfcfbb29f920db2c09ed8b660eb894640d284f102ada929b6e3dc410a"},
|
||||
{file = "hf_xet-1.2.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:9c91d5ae931510107f148874e9e2de8a16052b6f1b3ca3c1b12f15ccb491390f"},
|
||||
{file = "hf_xet-1.2.0-cp314-cp314t-win_amd64.whl", hash = "sha256:210d577732b519ac6ede149d2f2f34049d44e8622bf14eb3d63bbcd2d4b332dc"},
|
||||
{file = "hf_xet-1.2.0-cp37-abi3-macosx_10_12_x86_64.whl", hash = "sha256:46740d4ac024a7ca9b22bebf77460ff43332868b661186a8e46c227fdae01848"},
|
||||
{file = "hf_xet-1.2.0-cp37-abi3-macosx_11_0_arm64.whl", hash = "sha256:27df617a076420d8845bea087f59303da8be17ed7ec0cd7ee3b9b9f579dff0e4"},
|
||||
{file = "hf_xet-1.2.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3651fd5bfe0281951b988c0facbe726aa5e347b103a675f49a3fa8144c7968fd"},
|
||||
{file = "hf_xet-1.2.0-cp37-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:d06fa97c8562fb3ee7a378dd9b51e343bc5bc8190254202c9771029152f5e08c"},
|
||||
{file = "hf_xet-1.2.0-cp37-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:4c1428c9ae73ec0939410ec73023c4f842927f39db09b063b9482dac5a3bb737"},
|
||||
{file = "hf_xet-1.2.0-cp37-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:a55558084c16b09b5ed32ab9ed38421e2d87cf3f1f89815764d1177081b99865"},
|
||||
{file = "hf_xet-1.2.0-cp37-abi3-win_amd64.whl", hash = "sha256:e6584a52253f72c9f52f9e549d5895ca7a471608495c4ecaa6cc73dba2b24d69"},
|
||||
{file = "hf_xet-1.2.0.tar.gz", hash = "sha256:a8c27070ca547293b6890c4bf389f713f80e8c478631432962bb7f4bc0bd7d7f"},
|
||||
]
|
||||
|
||||
[package.extras]
|
||||
tests = ["pytest"]
|
||||
|
||||
[[package]]
|
||||
name = "hpack"
|
||||
version = "4.1.0"
|
||||
@@ -2769,42 +2597,6 @@ files = [
|
||||
{file = "httpx_sse-0.4.3.tar.gz", hash = "sha256:9b1ed0127459a66014aec3c56bebd93da3c1bc8bb6618c8082039a44889a755d"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "huggingface-hub"
|
||||
version = "1.4.1"
|
||||
description = "Client library to download and publish models, datasets and other repos on the huggingface.co hub"
|
||||
optional = false
|
||||
python-versions = ">=3.9.0"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "huggingface_hub-1.4.1-py3-none-any.whl", hash = "sha256:9931d075fb7a79af5abc487106414ec5fba2c0ae86104c0c62fd6cae38873d18"},
|
||||
{file = "huggingface_hub-1.4.1.tar.gz", hash = "sha256:b41131ec35e631e7383ab26d6146b8d8972abc8b6309b963b306fbcca87f5ed5"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
filelock = "*"
|
||||
fsspec = ">=2023.5.0"
|
||||
hf-xet = {version = ">=1.2.0,<2.0.0", markers = "platform_machine == \"x86_64\" or platform_machine == \"amd64\" or platform_machine == \"AMD64\" or platform_machine == \"arm64\" or platform_machine == \"aarch64\""}
|
||||
httpx = ">=0.23.0,<1"
|
||||
packaging = ">=20.9"
|
||||
pyyaml = ">=5.1"
|
||||
shellingham = "*"
|
||||
tqdm = ">=4.42.1"
|
||||
typer-slim = "*"
|
||||
typing-extensions = ">=4.1.0"
|
||||
|
||||
[package.extras]
|
||||
all = ["Jinja2", "Pillow", "authlib (>=1.3.2)", "fastapi", "fastapi", "httpx", "itsdangerous", "jedi", "libcst (>=1.4.0)", "mypy (==1.15.0)", "numpy", "pytest (>=8.4.2)", "pytest-asyncio", "pytest-cov", "pytest-env", "pytest-mock", "pytest-rerunfailures (<16.0)", "pytest-vcr", "pytest-xdist", "ruff (>=0.9.0)", "soundfile", "ty", "types-PyYAML", "types-simplejson", "types-toml", "types-tqdm", "types-urllib3", "typing-extensions (>=4.8.0)", "urllib3 (<2.0)"]
|
||||
dev = ["Jinja2", "Pillow", "authlib (>=1.3.2)", "fastapi", "fastapi", "httpx", "itsdangerous", "jedi", "libcst (>=1.4.0)", "mypy (==1.15.0)", "numpy", "pytest (>=8.4.2)", "pytest-asyncio", "pytest-cov", "pytest-env", "pytest-mock", "pytest-rerunfailures (<16.0)", "pytest-vcr", "pytest-xdist", "ruff (>=0.9.0)", "soundfile", "ty", "types-PyYAML", "types-simplejson", "types-toml", "types-tqdm", "types-urllib3", "typing-extensions (>=4.8.0)", "urllib3 (<2.0)"]
|
||||
fastai = ["fastai (>=2.4)", "fastcore (>=1.3.27)", "toml"]
|
||||
hf-xet = ["hf-xet (>=1.2.0,<2.0.0)"]
|
||||
mcp = ["mcp (>=1.8.0)"]
|
||||
oauth = ["authlib (>=1.3.2)", "fastapi", "httpx", "itsdangerous"]
|
||||
quality = ["libcst (>=1.4.0)", "mypy (==1.15.0)", "ruff (>=0.9.0)", "ty"]
|
||||
testing = ["Jinja2", "Pillow", "authlib (>=1.3.2)", "fastapi", "fastapi", "httpx", "itsdangerous", "jedi", "numpy", "pytest (>=8.4.2)", "pytest-asyncio", "pytest-cov", "pytest-env", "pytest-mock", "pytest-rerunfailures (<16.0)", "pytest-vcr", "pytest-xdist", "soundfile", "urllib3 (<2.0)"]
|
||||
torch = ["safetensors[torch]", "torch"]
|
||||
typing = ["types-PyYAML", "types-simplejson", "types-toml", "types-tqdm", "types-urllib3", "typing-extensions (>=4.8.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "hyperframe"
|
||||
version = "6.1.0"
|
||||
@@ -3350,40 +3142,6 @@ dynamodb = ["boto3 (>=1.9.71)"]
|
||||
redis = ["redis (>=2.10.5)"]
|
||||
test-filesource = ["pyyaml (>=5.3.1)", "watchdog (>=3.0.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "litellm"
|
||||
version = "1.80.0"
|
||||
description = "Library to easily interface with LLM API providers"
|
||||
optional = false
|
||||
python-versions = "!=2.7.*,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,!=3.7.*,>=3.8"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "litellm-1.80.0-py3-none-any.whl", hash = "sha256:fd0009758f4772257048d74bf79bb64318859adb4ea49a8b66fdbc718cd80b6e"},
|
||||
{file = "litellm-1.80.0.tar.gz", hash = "sha256:eeac733eb6b226f9e5fb020f72fe13a32b3354b001dc62bcf1bc4d9b526d6231"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
aiohttp = ">=3.10"
|
||||
click = "*"
|
||||
fastuuid = ">=0.13.0"
|
||||
httpx = ">=0.23.0"
|
||||
importlib-metadata = ">=6.8.0"
|
||||
jinja2 = ">=3.1.2,<4.0.0"
|
||||
jsonschema = ">=4.22.0,<5.0.0"
|
||||
openai = ">=1.99.5"
|
||||
pydantic = ">=2.5.0,<3.0.0"
|
||||
python-dotenv = ">=0.2.0"
|
||||
tiktoken = ">=0.7.0"
|
||||
tokenizers = "*"
|
||||
|
||||
[package.extras]
|
||||
caching = ["diskcache (>=5.6.1,<6.0.0)"]
|
||||
extra-proxy = ["azure-identity (>=1.15.0,<2.0.0)", "azure-keyvault-secrets (>=4.8.0,<5.0.0)", "google-cloud-iam (>=2.19.1,<3.0.0)", "google-cloud-kms (>=2.21.3,<3.0.0)", "prisma (==0.11.0)", "redisvl (>=0.4.1,<0.5.0) ; python_version >= \"3.9\" and python_version < \"3.14\"", "resend (>=0.8.0,<0.9.0)"]
|
||||
mlflow = ["mlflow (>3.1.4) ; python_version >= \"3.10\""]
|
||||
proxy = ["PyJWT (>=2.8.0,<3.0.0)", "apscheduler (>=3.10.4,<4.0.0)", "azure-identity (>=1.15.0,<2.0.0)", "azure-storage-blob (>=12.25.1,<13.0.0)", "backoff", "boto3 (==1.36.0)", "cryptography", "fastapi (>=0.120.1)", "fastapi-sso (>=0.16.0,<0.17.0)", "gunicorn (>=23.0.0,<24.0.0)", "litellm-enterprise (==0.1.21)", "litellm-proxy-extras (==0.4.5)", "mcp (>=1.10.0,<2.0.0) ; python_version >= \"3.10\"", "orjson (>=3.9.7,<4.0.0)", "polars (>=1.31.0,<2.0.0) ; python_version >= \"3.10\"", "pynacl (>=1.5.0,<2.0.0)", "python-multipart (>=0.0.18,<0.0.19)", "pyyaml (>=6.0.1,<7.0.0)", "rich (==13.7.1)", "rq", "soundfile (>=0.12.1,<0.13.0)", "uvicorn (>=0.29.0,<0.30.0)", "uvloop (>=0.21.0,<0.22.0) ; sys_platform != \"win32\"", "websockets (>=13.1.0,<14.0.0)"]
|
||||
semantic-router = ["semantic-router ; python_version >= \"3.9\""]
|
||||
utils = ["numpydoc"]
|
||||
|
||||
[[package]]
|
||||
name = "markdown-it-py"
|
||||
version = "4.0.0"
|
||||
@@ -4857,28 +4615,6 @@ docs = ["furo (>=2025.9.25)", "proselint (>=0.14)", "sphinx (>=8.2.3)", "sphinx-
|
||||
test = ["appdirs (==1.4.4)", "covdefaults (>=2.3)", "pytest (>=8.4.2)", "pytest-cov (>=7)", "pytest-mock (>=3.15.1)"]
|
||||
type = ["mypy (>=1.18.2)"]
|
||||
|
||||
[[package]]
|
||||
name = "playwright"
|
||||
version = "1.58.0"
|
||||
description = "A high-level API to automate web browsers"
|
||||
optional = false
|
||||
python-versions = ">=3.9"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "playwright-1.58.0-py3-none-macosx_10_13_x86_64.whl", hash = "sha256:96e3204aac292ee639edbfdef6298b4be2ea0a55a16b7068df91adac077cc606"},
|
||||
{file = "playwright-1.58.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:70c763694739d28df71ed578b9c8202bb83e8fe8fb9268c04dd13afe36301f71"},
|
||||
{file = "playwright-1.58.0-py3-none-macosx_11_0_universal2.whl", hash = "sha256:185e0132578733d02802dfddfbbc35f42be23a45ff49ccae5081f25952238117"},
|
||||
{file = "playwright-1.58.0-py3-none-manylinux1_x86_64.whl", hash = "sha256:c95568ba1eda83812598c1dc9be60b4406dffd60b149bc1536180ad108723d6b"},
|
||||
{file = "playwright-1.58.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8f9999948f1ab541d98812de25e3a8c410776aa516d948807140aff797b4bffa"},
|
||||
{file = "playwright-1.58.0-py3-none-win32.whl", hash = "sha256:1e03be090e75a0fabbdaeab65ce17c308c425d879fa48bb1d7986f96bfad0b99"},
|
||||
{file = "playwright-1.58.0-py3-none-win_amd64.whl", hash = "sha256:a2bf639d0ce33b3ba38de777e08697b0d8f3dc07ab6802e4ac53fb65e3907af8"},
|
||||
{file = "playwright-1.58.0-py3-none-win_arm64.whl", hash = "sha256:32ffe5c303901a13a0ecab91d1c3f74baf73b84f4bedbb6b935f5bc11cc98e1b"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
greenlet = ">=3.1.1,<4.0.0"
|
||||
pyee = ">=13,<14"
|
||||
|
||||
[[package]]
|
||||
name = "pluggy"
|
||||
version = "1.6.0"
|
||||
@@ -5865,24 +5601,6 @@ gcp-secret-manager = ["google-cloud-secret-manager (>=2.23.1)"]
|
||||
toml = ["tomli (>=2.0.1)"]
|
||||
yaml = ["pyyaml (>=6.0.1)"]
|
||||
|
||||
[[package]]
|
||||
name = "pyee"
|
||||
version = "13.0.0"
|
||||
description = "A rough port of Node.js's EventEmitter to Python with a few tricks of its own"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "pyee-13.0.0-py3-none-any.whl", hash = "sha256:48195a3cddb3b1515ce0695ed76036b5ccc2ef3a9f963ff9f77aec0139845498"},
|
||||
{file = "pyee-13.0.0.tar.gz", hash = "sha256:b391e3c5a434d1f5118a25615001dbc8f669cf410ab67d04c4d4e07c55481c37"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
typing-extensions = "*"
|
||||
|
||||
[package.extras]
|
||||
dev = ["black", "build", "flake8", "flake8-black", "isort", "jupyter-console", "mkdocs", "mkdocs-include-markdown-plugin", "mkdocstrings[python]", "mypy", "pytest", "pytest-asyncio ; python_version >= \"3.4\"", "pytest-trio ; python_version >= \"3.7\"", "sphinx", "toml", "tox", "trio", "trio ; python_version > \"3.6\"", "trio-typing ; python_version > \"3.6\"", "twine", "twisted", "validate-pyproject[all]"]
|
||||
|
||||
[[package]]
|
||||
name = "pyflakes"
|
||||
version = "3.4.0"
|
||||
@@ -7315,32 +7033,29 @@ uvicorn = ["uvicorn (>=0.34.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "stagehand"
|
||||
version = "0.5.9"
|
||||
description = "Python SDK for Stagehand"
|
||||
version = "3.7.0"
|
||||
description = "The official Python library for the stagehand API"
|
||||
optional = false
|
||||
python-versions = ">=3.9"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "stagehand-0.5.9-py3-none-any.whl", hash = "sha256:cc8d2a114799ea1c3d6f199e86abd6479a8b338a101fffa6824d85b542ed9071"},
|
||||
{file = "stagehand-0.5.9.tar.gz", hash = "sha256:068a2825b02fbc949ab9d1cf59b80d2c17caba0259e759d807f38d0e9ab236b0"},
|
||||
{file = "stagehand-3.7.0-py3-none-macosx_10_9_x86_64.whl", hash = "sha256:4918068e6c02717c09766f1df41d5a41ac2ad9b610a30bb584a7d5d359f8d654"},
|
||||
{file = "stagehand-3.7.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:cedb940ebbd47930227f5ef82077080aeb1f77480913382183187aba98e3cca5"},
|
||||
{file = "stagehand-3.7.0-py3-none-manylinux2014_x86_64.whl", hash = "sha256:87df69bca9a611c4acae7383333f1e0cf67cc5b92be91639c65772aa59f8e6ea"},
|
||||
{file = "stagehand-3.7.0-py3-none-win_amd64.whl", hash = "sha256:09d809f3b35389b2ed0b879e8909a8ed01e1ba9330f39c08cfcefe1699197585"},
|
||||
{file = "stagehand-3.7.0.tar.gz", hash = "sha256:53cdd79111147a4c6fedcf17ef92427472beaf11ad3fcd800736ae3475a5cc54"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
anthropic = ">=0.51.0"
|
||||
browserbase = ">=1.4.0"
|
||||
google-genai = ">=1.40.0"
|
||||
httpx = ">=0.24.0"
|
||||
litellm = ">=1.72.0,<=1.80.0"
|
||||
nest-asyncio = ">=1.6.0"
|
||||
openai = ">=1.99.6"
|
||||
playwright = ">=1.42.1"
|
||||
pydantic = ">=1.10.0"
|
||||
python-dotenv = ">=1.0.0"
|
||||
requests = ">=2.31.0"
|
||||
rich = ">=13.7.0"
|
||||
anyio = ">=3.5.0,<5"
|
||||
distro = ">=1.7.0,<2"
|
||||
httpx = ">=0.23.0,<1"
|
||||
pydantic = ">=1.9.0,<3"
|
||||
sniffio = "*"
|
||||
typing-extensions = ">=4.14,<5"
|
||||
|
||||
[package.extras]
|
||||
dev = ["black (>=23.3.0)", "isort (>=5.12.0)", "mypy (>=1.3.0)", "psutil (>=5.9.0)", "pytest (>=7.3.1)", "pytest-asyncio (>=0.21.0)", "pytest-cov (>=4.1.0)", "pytest-mock (>=3.10.0)", "ruff"]
|
||||
aiohttp = ["aiohttp", "httpx-aiohttp (>=0.1.9)"]
|
||||
|
||||
[[package]]
|
||||
name = "starlette"
|
||||
@@ -7607,48 +7322,6 @@ files = [
|
||||
[package.dependencies]
|
||||
requests = ">=2.32.3,<3.0.0"
|
||||
|
||||
[[package]]
|
||||
name = "tokenizers"
|
||||
version = "0.22.2"
|
||||
description = ""
|
||||
optional = false
|
||||
python-versions = ">=3.9"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-macosx_10_12_x86_64.whl", hash = "sha256:544dd704ae7238755d790de45ba8da072e9af3eea688f698b137915ae959281c"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:1e418a55456beedca4621dbab65a318981467a2b188e982a23e117f115ce5001"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2249487018adec45d6e3554c71d46eb39fa8ea67156c640f7513eb26f318cec7"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:25b85325d0815e86e0bac263506dd114578953b7b53d7de09a6485e4a160a7dd"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:bfb88f22a209ff7b40a576d5324bf8286b519d7358663db21d6246fb17eea2d5"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1c774b1276f71e1ef716e5486f21e76333464f47bece56bbd554485982a9e03e"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:df6c4265b289083bf710dff49bc51ef252f9d5be33a45ee2bed151114a56207b"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:369cc9fc8cc10cb24143873a0d95438bb8ee257bb80c71989e3ee290e8d72c67"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:29c30b83d8dcd061078b05ae0cb94d3c710555fbb44861139f9f83dcca3dc3e4"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:37ae80a28c1d3265bb1f22464c856bd23c02a05bb211e56d0c5301a435be6c1a"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-musllinux_1_2_i686.whl", hash = "sha256:791135ee325f2336f498590eb2f11dc5c295232f288e75c99a36c5dbce63088a"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:38337540fbbddff8e999d59970f3c6f35a82de10053206a7562f1ea02d046fa5"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-win32.whl", hash = "sha256:a6bf3f88c554a2b653af81f3204491c818ae2ac6fbc09e76ef4773351292bc92"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-win_amd64.whl", hash = "sha256:c9ea31edff2968b44a88f97d784c2f16dc0729b8b143ed004699ebca91f05c48"},
|
||||
{file = "tokenizers-0.22.2-cp39-abi3-win_arm64.whl", hash = "sha256:9ce725d22864a1e965217204946f830c37876eee3b2ba6fc6255e8e903d5fcbc"},
|
||||
{file = "tokenizers-0.22.2-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:753d47ebd4542742ef9261d9da92cd545b2cacbb48349a1225466745bb866ec4"},
|
||||
{file = "tokenizers-0.22.2-pp310-pypy310_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e10bf9113d209be7cd046d40fbabbaf3278ff6d18eb4da4c500443185dc1896c"},
|
||||
{file = "tokenizers-0.22.2-pp310-pypy310_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:64d94e84f6660764e64e7e0b22baa72f6cd942279fdbb21d46abd70d179f0195"},
|
||||
{file = "tokenizers-0.22.2-pp310-pypy310_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f01a9c019878532f98927d2bacb79bbb404b43d3437455522a00a30718cdedb5"},
|
||||
{file = "tokenizers-0.22.2-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:319f659ee992222f04e58f84cbf407cfa66a65fe3a8de44e8ad2bc53e7d99012"},
|
||||
{file = "tokenizers-0.22.2-pp39-pypy39_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:1e50f8554d504f617d9e9d6e4c2c2884a12b388a97c5c77f0bc6cf4cd032feee"},
|
||||
{file = "tokenizers-0.22.2-pp39-pypy39_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1a62ba2c5faa2dd175aaeed7b15abf18d20266189fb3406c5d0550dd34dd5f37"},
|
||||
{file = "tokenizers-0.22.2-pp39-pypy39_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:143b999bdc46d10febb15cbffb4207ddd1f410e2c755857b5a0797961bbdc113"},
|
||||
{file = "tokenizers-0.22.2.tar.gz", hash = "sha256:473b83b915e547aa366d1eee11806deaf419e17be16310ac0a14077f1e28f917"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
huggingface-hub = ">=0.16.4,<2.0"
|
||||
|
||||
[package.extras]
|
||||
dev = ["tokenizers[testing]"]
|
||||
docs = ["setuptools-rust", "sphinx", "sphinx-rtd-theme"]
|
||||
testing = ["datasets", "numpy", "pytest", "pytest-asyncio", "requests", "ruff", "ty"]
|
||||
|
||||
[[package]]
|
||||
name = "tomli"
|
||||
version = "2.4.0"
|
||||
@@ -7775,25 +7448,6 @@ async = ["aiohttp (>=3.7.3,<4)", "async-lru (>=1.0.3,<3)"]
|
||||
dev = ["coverage (>=4.4.2)", "coveralls (>=2.1.0)", "tox (>=3.21.0)"]
|
||||
test = ["urllib3 (<2)", "vcrpy (>=1.10.3)"]
|
||||
|
||||
[[package]]
|
||||
name = "typer-slim"
|
||||
version = "0.21.1"
|
||||
description = "Typer, build great CLIs. Easy to code. Based on Python type hints."
|
||||
optional = false
|
||||
python-versions = ">=3.9"
|
||||
groups = ["main"]
|
||||
files = [
|
||||
{file = "typer_slim-0.21.1-py3-none-any.whl", hash = "sha256:6e6c31047f171ac93cc5a973c9e617dbc5ab2bddc4d0a3135dc161b4e2020e0d"},
|
||||
{file = "typer_slim-0.21.1.tar.gz", hash = "sha256:73495dd08c2d0940d611c5a8c04e91c2a0a98600cbd4ee19192255a233b6dbfd"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
click = ">=8.0.0"
|
||||
typing-extensions = ">=3.7.4.3"
|
||||
|
||||
[package.extras]
|
||||
standard = ["rich (>=10.11.0)", "shellingham (>=1.3.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "typing-extensions"
|
||||
version = "4.15.0"
|
||||
@@ -8976,4 +8630,4 @@ cffi = ["cffi (>=1.17,<2.0) ; platform_python_implementation != \"PyPy\" and pyt
|
||||
[metadata]
|
||||
lock-version = "2.1"
|
||||
python-versions = ">=3.10,<3.14"
|
||||
content-hash = "938e93b7de4005bdd60ce5fb542a63df79115f9e21b1cb9940a19605f00d354a"
|
||||
content-hash = "1dd10577184ebff0d10997f4c6ba49484de79b7fa090946e8e5ce5c5bac3cdeb"
|
||||
|
||||
@@ -88,7 +88,7 @@ pandas = "^2.3.1"
|
||||
firecrawl-py = "^4.3.6"
|
||||
exa-py = "^1.14.20"
|
||||
croniter = "^6.0.0"
|
||||
stagehand = "^0.5.1"
|
||||
stagehand = "^3.4.0"
|
||||
gravitas-md2gdocs = "^0.1.0"
|
||||
posthog = "^7.6.0"
|
||||
fpdf2 = "^2.8.6"
|
||||
|
||||
123
autogpt_platform/backend/scripts/refresh_claude_token.sh
Executable file
@@ -0,0 +1,123 @@
|
||||
#!/usr/bin/env bash
|
||||
# refresh_claude_token.sh — Extract Claude OAuth tokens and update backend/.env
|
||||
#
|
||||
# Works on macOS (keychain), Linux (~/.claude/.credentials.json),
|
||||
# and Windows Git Bash/MSYS2/Cygwin (%APPDATA%/claude/.credentials.json, then ~/.claude/.credentials.json).
|
||||
#
|
||||
# Usage:
|
||||
# ./scripts/refresh_claude_token.sh # auto-detect OS
|
||||
# ./scripts/refresh_claude_token.sh --env-file /path/to/.env # custom .env path
|
||||
#
|
||||
# Prerequisite: You must have run `claude login` at least once on the host.
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
# --- Parse arguments ---
|
||||
ENV_FILE=""
|
||||
while [[ $# -gt 0 ]]; do
|
||||
case "$1" in
|
||||
--env-file) ENV_FILE="${2:?--env-file requires a path}"; shift 2 ;;
|
||||
*) echo "Unknown option: $1"; exit 1 ;;
|
||||
esac
|
||||
done
|
||||
|
||||
# Default .env path: relative to this script's location
|
||||
if [[ -z "$ENV_FILE" ]]; then
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
ENV_FILE="$SCRIPT_DIR/../.env"
|
||||
fi
|
||||
|
||||
# --- Extract tokens by platform ---
|
||||
ACCESS_TOKEN=""
|
||||
REFRESH_TOKEN=""
|
||||
|
||||
extract_from_credentials_file() {
|
||||
local creds_file="$1"
|
||||
if [[ -f "$creds_file" ]]; then
|
||||
ACCESS_TOKEN=$(jq -r '.claudeAiOauth.accessToken // ""' "$creds_file" 2>/dev/null || true)
|
||||
REFRESH_TOKEN=$(jq -r '.claudeAiOauth.refreshToken // ""' "$creds_file" 2>/dev/null || true)
|
||||
fi
|
||||
}
|
||||
|
||||
case "$(uname -s)" in
|
||||
Darwin)
|
||||
# macOS: extract from system keychain
|
||||
CREDS_JSON=$(security find-generic-password -s "Claude Code-credentials" -w 2>/dev/null || true)
|
||||
if [[ -n "$CREDS_JSON" ]]; then
|
||||
ACCESS_TOKEN=$(echo "$CREDS_JSON" | jq -r '.claudeAiOauth.accessToken // ""' 2>/dev/null || true)
|
||||
REFRESH_TOKEN=$(echo "$CREDS_JSON" | jq -r '.claudeAiOauth.refreshToken // ""' 2>/dev/null || true)
|
||||
else
|
||||
# Fallback to credentials file (e.g. if keychain access denied)
|
||||
extract_from_credentials_file "$HOME/.claude/.credentials.json"
|
||||
fi
|
||||
;;
|
||||
Linux)
|
||||
# Linux (including WSL): read from credentials file
|
||||
extract_from_credentials_file "$HOME/.claude/.credentials.json"
|
||||
;;
|
||||
MINGW*|MSYS*|CYGWIN*)
|
||||
# Windows Git Bash / MSYS2 / Cygwin
|
||||
APPDATA_PATH="${APPDATA:-$USERPROFILE/AppData/Roaming}"
|
||||
extract_from_credentials_file "$APPDATA_PATH/claude/.credentials.json"
|
||||
# Fallback to home dir
|
||||
if [[ -z "$ACCESS_TOKEN" ]]; then
|
||||
extract_from_credentials_file "$HOME/.claude/.credentials.json"
|
||||
fi
|
||||
;;
|
||||
*)
|
||||
echo "Unsupported platform: $(uname -s)"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
|
||||
# --- Validate ---
|
||||
if [[ -z "$ACCESS_TOKEN" ]]; then
|
||||
echo "ERROR: Could not extract Claude OAuth token."
|
||||
echo ""
|
||||
echo "Make sure you have run 'claude login' at least once."
|
||||
echo ""
|
||||
echo "Locations checked:"
|
||||
echo " macOS: Keychain ('Claude Code-credentials')"
|
||||
echo " Linux: ~/.claude/.credentials.json"
|
||||
echo " Windows: %APPDATA%/claude/.credentials.json"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Found Claude OAuth token: ${ACCESS_TOKEN:0:20}..."
|
||||
[[ -n "$REFRESH_TOKEN" ]] && echo "Found refresh token: ${REFRESH_TOKEN:0:20}..."
|
||||
|
||||
# --- Update .env file ---
|
||||
update_env_var() {
|
||||
local key="$1" value="$2" file="$3"
|
||||
if grep -q "^${key}=" "$file" 2>/dev/null; then
|
||||
# Replace existing value (works on both macOS and Linux sed)
|
||||
if [[ "$(uname -s)" == "Darwin" ]]; then
|
||||
sed -i '' "s|^${key}=.*|${key}=${value}|" "$file"
|
||||
else
|
||||
sed -i "s|^${key}=.*|${key}=${value}|" "$file"
|
||||
fi
|
||||
elif grep -q "^# *${key}=" "$file" 2>/dev/null; then
|
||||
# Uncomment and set
|
||||
if [[ "$(uname -s)" == "Darwin" ]]; then
|
||||
sed -i '' "s|^# *${key}=.*|${key}=${value}|" "$file"
|
||||
else
|
||||
sed -i "s|^# *${key}=.*|${key}=${value}|" "$file"
|
||||
fi
|
||||
else
|
||||
# Append
|
||||
echo "${key}=${value}" >> "$file"
|
||||
fi
|
||||
}
|
||||
|
||||
if [[ ! -f "$ENV_FILE" ]]; then
|
||||
echo "WARNING: $ENV_FILE does not exist, creating it."
|
||||
touch "$ENV_FILE"
|
||||
fi
|
||||
|
||||
update_env_var "CLAUDE_CODE_OAUTH_TOKEN" "$ACCESS_TOKEN" "$ENV_FILE"
|
||||
[[ -n "$REFRESH_TOKEN" ]] && update_env_var "CLAUDE_CODE_REFRESH_TOKEN" "$REFRESH_TOKEN" "$ENV_FILE"
|
||||
update_env_var "CHAT_USE_CLAUDE_CODE_SUBSCRIPTION" "true" "$ENV_FILE"
|
||||
|
||||
echo ""
|
||||
echo "Updated $ENV_FILE with Claude subscription tokens."
|
||||
echo "Run 'docker compose up -d copilot_executor' to apply."
|
||||
@@ -1,10 +1,10 @@
|
||||
"""
|
||||
Tests for SmartDecisionMakerBlock support in agent generator.
|
||||
Tests for OrchestratorBlock support in agent generator.
|
||||
|
||||
Covers:
|
||||
- AgentFixer.fix_smart_decision_maker_blocks()
|
||||
- AgentValidator.validate_smart_decision_maker_blocks()
|
||||
- End-to-end fix → validate → pipeline for SmartDecisionMaker agents
|
||||
- AgentFixer.fix_orchestrator_blocks()
|
||||
- AgentValidator.validate_orchestrator_blocks()
|
||||
- End-to-end fix → validate → pipeline for Orchestrator agents
|
||||
"""
|
||||
|
||||
import uuid
|
||||
@@ -14,7 +14,7 @@ from backend.copilot.tools.agent_generator.helpers import (
|
||||
AGENT_EXECUTOR_BLOCK_ID,
|
||||
AGENT_INPUT_BLOCK_ID,
|
||||
AGENT_OUTPUT_BLOCK_ID,
|
||||
SMART_DECISION_MAKER_BLOCK_ID,
|
||||
TOOL_ORCHESTRATOR_BLOCK_ID,
|
||||
)
|
||||
from backend.copilot.tools.agent_generator.validator import AgentValidator
|
||||
|
||||
@@ -28,10 +28,10 @@ def _make_sdm_node(
|
||||
input_default: dict | None = None,
|
||||
metadata: dict | None = None,
|
||||
) -> dict:
|
||||
"""Create a SmartDecisionMakerBlock node dict."""
|
||||
"""Create an OrchestratorBlock node dict."""
|
||||
return {
|
||||
"id": node_id or _uid(),
|
||||
"block_id": SMART_DECISION_MAKER_BLOCK_ID,
|
||||
"block_id": TOOL_ORCHESTRATOR_BLOCK_ID,
|
||||
"input_default": input_default or {},
|
||||
"metadata": metadata or {"position": {"x": 0, "y": 0}},
|
||||
}
|
||||
@@ -125,15 +125,15 @@ def _make_orchestrator_agent() -> dict:
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestFixSmartDecisionMakerBlocks:
|
||||
"""Tests for AgentFixer.fix_smart_decision_maker_blocks()."""
|
||||
class TestFixOrchestratorBlocks:
|
||||
"""Tests for AgentFixer.fix_orchestrator_blocks()."""
|
||||
|
||||
def test_fills_defaults_when_missing(self):
|
||||
"""All agent-mode defaults are populated for a bare SDM node."""
|
||||
fixer = AgentFixer()
|
||||
agent = {"nodes": [_make_sdm_node()], "links": []}
|
||||
|
||||
result = fixer.fix_smart_decision_maker_blocks(agent)
|
||||
result = fixer.fix_orchestrator_blocks(agent)
|
||||
|
||||
defaults = result["nodes"][0]["input_default"]
|
||||
assert defaults["agent_mode_max_iterations"] == 10
|
||||
@@ -159,7 +159,7 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
"links": [],
|
||||
}
|
||||
|
||||
result = fixer.fix_smart_decision_maker_blocks(agent)
|
||||
result = fixer.fix_orchestrator_blocks(agent)
|
||||
|
||||
defaults = result["nodes"][0]["input_default"]
|
||||
assert defaults["agent_mode_max_iterations"] == 5
|
||||
@@ -182,7 +182,7 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
"links": [],
|
||||
}
|
||||
|
||||
result = fixer.fix_smart_decision_maker_blocks(agent)
|
||||
result = fixer.fix_orchestrator_blocks(agent)
|
||||
|
||||
defaults = result["nodes"][0]["input_default"]
|
||||
assert defaults["agent_mode_max_iterations"] == 10 # kept
|
||||
@@ -192,7 +192,7 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
assert len(fixer.fixes_applied) == 3
|
||||
|
||||
def test_skips_non_sdm_nodes(self):
|
||||
"""Non-SmartDecisionMaker nodes are untouched."""
|
||||
"""Non-Orchestrator nodes are untouched."""
|
||||
fixer = AgentFixer()
|
||||
other_node = {
|
||||
"id": _uid(),
|
||||
@@ -202,7 +202,7 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
}
|
||||
agent = {"nodes": [other_node], "links": []}
|
||||
|
||||
result = fixer.fix_smart_decision_maker_blocks(agent)
|
||||
result = fixer.fix_orchestrator_blocks(agent)
|
||||
|
||||
assert "agent_mode_max_iterations" not in result["nodes"][0]["input_default"]
|
||||
assert len(fixer.fixes_applied) == 0
|
||||
@@ -212,12 +212,12 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
fixer = AgentFixer()
|
||||
node = {
|
||||
"id": _uid(),
|
||||
"block_id": SMART_DECISION_MAKER_BLOCK_ID,
|
||||
"block_id": TOOL_ORCHESTRATOR_BLOCK_ID,
|
||||
"metadata": {},
|
||||
}
|
||||
agent = {"nodes": [node], "links": []}
|
||||
|
||||
result = fixer.fix_smart_decision_maker_blocks(agent)
|
||||
result = fixer.fix_orchestrator_blocks(agent)
|
||||
|
||||
assert "input_default" in result["nodes"][0]
|
||||
assert result["nodes"][0]["input_default"]["agent_mode_max_iterations"] == 10
|
||||
@@ -227,13 +227,13 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
fixer = AgentFixer()
|
||||
node = {
|
||||
"id": _uid(),
|
||||
"block_id": SMART_DECISION_MAKER_BLOCK_ID,
|
||||
"block_id": TOOL_ORCHESTRATOR_BLOCK_ID,
|
||||
"input_default": None,
|
||||
"metadata": {},
|
||||
}
|
||||
agent = {"nodes": [node], "links": []}
|
||||
|
||||
result = fixer.fix_smart_decision_maker_blocks(agent)
|
||||
result = fixer.fix_orchestrator_blocks(agent)
|
||||
|
||||
assert isinstance(result["nodes"][0]["input_default"], dict)
|
||||
assert result["nodes"][0]["input_default"]["agent_mode_max_iterations"] == 10
|
||||
@@ -255,7 +255,7 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
"links": [],
|
||||
}
|
||||
|
||||
result = fixer.fix_smart_decision_maker_blocks(agent)
|
||||
result = fixer.fix_orchestrator_blocks(agent)
|
||||
|
||||
defaults = result["nodes"][0]["input_default"]
|
||||
assert defaults["agent_mode_max_iterations"] == 10 # None → default
|
||||
@@ -275,7 +275,7 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
"links": [],
|
||||
}
|
||||
|
||||
result = fixer.fix_smart_decision_maker_blocks(agent)
|
||||
result = fixer.fix_orchestrator_blocks(agent)
|
||||
|
||||
# First node: 3 defaults filled (agent_mode was already set)
|
||||
assert result["nodes"][0]["input_default"]["agent_mode_max_iterations"] == 3
|
||||
@@ -284,7 +284,7 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
assert len(fixer.fixes_applied) == 7 # 3 + 4
|
||||
|
||||
def test_registered_in_apply_all_fixes(self):
|
||||
"""fix_smart_decision_maker_blocks runs as part of apply_all_fixes."""
|
||||
"""fix_orchestrator_blocks runs as part of apply_all_fixes."""
|
||||
fixer = AgentFixer()
|
||||
agent = {
|
||||
"nodes": [_make_sdm_node()],
|
||||
@@ -295,7 +295,7 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
|
||||
defaults = result["nodes"][0]["input_default"]
|
||||
assert defaults["agent_mode_max_iterations"] == 10
|
||||
assert any("SmartDecisionMakerBlock" in fix for fix in fixer.fixes_applied)
|
||||
assert any("OrchestratorBlock" in fix for fix in fixer.fixes_applied)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -303,15 +303,15 @@ class TestFixSmartDecisionMakerBlocks:
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestValidateSmartDecisionMakerBlocks:
|
||||
"""Tests for AgentValidator.validate_smart_decision_maker_blocks()."""
|
||||
class TestValidateOrchestratorBlocks:
|
||||
"""Tests for AgentValidator.validate_orchestrator_blocks()."""
|
||||
|
||||
def test_valid_sdm_with_tools(self):
|
||||
"""SDM with downstream tool links passes validation."""
|
||||
validator = AgentValidator()
|
||||
agent = _make_orchestrator_agent()
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is True
|
||||
assert len(validator.errors) == 0
|
||||
@@ -325,7 +325,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
"links": [], # no tool links
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is False
|
||||
assert len(validator.errors) == 1
|
||||
@@ -344,20 +344,20 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is False
|
||||
assert len(validator.errors) == 1
|
||||
|
||||
def test_no_sdm_nodes_passes(self):
|
||||
"""Agent without SmartDecisionMaker nodes passes trivially."""
|
||||
"""Agent without Orchestrator nodes passes trivially."""
|
||||
validator = AgentValidator()
|
||||
agent = {
|
||||
"nodes": [_make_input_node(), _make_output_node()],
|
||||
"links": [],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is True
|
||||
assert len(validator.errors) == 0
|
||||
@@ -373,7 +373,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
)
|
||||
agent = {"nodes": [sdm], "links": []}
|
||||
|
||||
validator.validate_smart_decision_maker_blocks(agent)
|
||||
validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert "My Orchestrator" in validator.errors[0]
|
||||
|
||||
@@ -392,7 +392,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is False
|
||||
assert len(validator.errors) == 1
|
||||
@@ -408,7 +408,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
"links": [_link(sdm["id"], "tools", tool["id"], "query")],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is False
|
||||
assert any("agent_mode_max_iterations=0" in e for e in validator.errors)
|
||||
@@ -423,7 +423,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
"links": [_link(sdm["id"], "tools", tool["id"], "query")],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is True
|
||||
assert len(validator.errors) == 0
|
||||
@@ -438,7 +438,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
"links": [_link(sdm["id"], "tools", tool["id"], "query")],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is False
|
||||
assert any("unusually high" in e for e in validator.errors)
|
||||
@@ -453,7 +453,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
"links": [_link(sdm["id"], "tools", tool["id"], "query")],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is False
|
||||
assert any("non-integer" in e for e in validator.errors)
|
||||
@@ -468,7 +468,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
"links": [_link(sdm["id"], "tools", tool["id"], "query")],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is False
|
||||
assert any("invalid" in e and "-5" in e for e in validator.errors)
|
||||
@@ -488,14 +488,14 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
],
|
||||
}
|
||||
|
||||
result = validator.validate_smart_decision_maker_blocks(agent)
|
||||
result = validator.validate_orchestrator_blocks(agent)
|
||||
|
||||
assert result is False
|
||||
assert len(validator.errors) == 1
|
||||
assert "no downstream tool blocks" in validator.errors[0]
|
||||
|
||||
def test_registered_in_validate(self):
|
||||
"""validate_smart_decision_maker_blocks runs as part of validate()."""
|
||||
"""validate_orchestrator_blocks runs as part of validate()."""
|
||||
validator = AgentValidator()
|
||||
sdm = _make_sdm_node()
|
||||
agent = {
|
||||
@@ -511,8 +511,8 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
# Build a minimal blocks list with the SDM block info
|
||||
blocks = [
|
||||
{
|
||||
"id": SMART_DECISION_MAKER_BLOCK_ID,
|
||||
"name": "SmartDecisionMakerBlock",
|
||||
"id": TOOL_ORCHESTRATOR_BLOCK_ID,
|
||||
"name": "OrchestratorBlock",
|
||||
"inputSchema": {"properties": {"prompt": {"type": "string"}}},
|
||||
"outputSchema": {
|
||||
"properties": {
|
||||
@@ -557,7 +557,7 @@ class TestValidateSmartDecisionMakerBlocks:
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestSmartDecisionMakerE2EPipeline:
|
||||
class TestOrchestratorE2EPipeline:
|
||||
"""End-to-end tests: build agent JSON → fix → validate."""
|
||||
|
||||
def test_orchestrator_agent_fix_then_validate(self):
|
||||
@@ -570,7 +570,7 @@ class TestSmartDecisionMakerE2EPipeline:
|
||||
|
||||
# Verify defaults were applied
|
||||
sdm_nodes = [
|
||||
n for n in fixed["nodes"] if n["block_id"] == SMART_DECISION_MAKER_BLOCK_ID
|
||||
n for n in fixed["nodes"] if n["block_id"] == TOOL_ORCHESTRATOR_BLOCK_ID
|
||||
]
|
||||
assert len(sdm_nodes) == 1
|
||||
assert sdm_nodes[0]["input_default"]["agent_mode_max_iterations"] == 10
|
||||
@@ -578,7 +578,7 @@ class TestSmartDecisionMakerE2EPipeline:
|
||||
|
||||
# Validate (standalone SDM check)
|
||||
validator = AgentValidator()
|
||||
assert validator.validate_smart_decision_maker_blocks(fixed) is True
|
||||
assert validator.validate_orchestrator_blocks(fixed) is True
|
||||
|
||||
def test_bare_sdm_no_tools_fix_then_validate(self):
|
||||
"""SDM without tools: fixer fills defaults, validator catches error."""
|
||||
@@ -606,7 +606,7 @@ class TestSmartDecisionMakerE2EPipeline:
|
||||
|
||||
# Validate catches missing tools
|
||||
validator = AgentValidator()
|
||||
assert validator.validate_smart_decision_maker_blocks(fixed) is False
|
||||
assert validator.validate_orchestrator_blocks(fixed) is False
|
||||
assert any("no downstream tool blocks" in e for e in validator.errors)
|
||||
|
||||
def test_sdm_with_user_set_bounded_iterations(self):
|
||||
@@ -614,7 +614,7 @@ class TestSmartDecisionMakerE2EPipeline:
|
||||
agent = _make_orchestrator_agent()
|
||||
# Simulate user setting bounded iterations
|
||||
for node in agent["nodes"]:
|
||||
if node["block_id"] == SMART_DECISION_MAKER_BLOCK_ID:
|
||||
if node["block_id"] == TOOL_ORCHESTRATOR_BLOCK_ID:
|
||||
node["input_default"]["agent_mode_max_iterations"] = 5
|
||||
node["input_default"]["sys_prompt"] = "You are a helpful orchestrator"
|
||||
|
||||
@@ -622,7 +622,7 @@ class TestSmartDecisionMakerE2EPipeline:
|
||||
fixed = fixer.apply_all_fixes(agent)
|
||||
|
||||
sdm = next(
|
||||
n for n in fixed["nodes"] if n["block_id"] == SMART_DECISION_MAKER_BLOCK_ID
|
||||
n for n in fixed["nodes"] if n["block_id"] == TOOL_ORCHESTRATOR_BLOCK_ID
|
||||
)
|
||||
assert sdm["input_default"]["agent_mode_max_iterations"] == 5
|
||||
assert sdm["input_default"]["sys_prompt"] == "You are a helpful orchestrator"
|
||||
@@ -638,8 +638,8 @@ class TestSmartDecisionMakerE2EPipeline:
|
||||
|
||||
blocks = [
|
||||
{
|
||||
"id": SMART_DECISION_MAKER_BLOCK_ID,
|
||||
"name": "SmartDecisionMakerBlock",
|
||||
"id": TOOL_ORCHESTRATOR_BLOCK_ID,
|
||||
"name": "OrchestratorBlock",
|
||||
"inputSchema": {
|
||||
"properties": {
|
||||
"prompt": {"type": "string"},
|
||||
@@ -709,5 +709,5 @@ class TestSmartDecisionMakerE2EPipeline:
|
||||
assert is_valid, f"Validation failed: {error_msg}"
|
||||
|
||||
# SDM-specific validation should pass (has tool links)
|
||||
sdm_errors = [e for e in validator.errors if "SmartDecisionMakerBlock" in e]
|
||||
sdm_errors = [e for e in validator.errors if "OrchestratorBlock" in e]
|
||||
assert len(sdm_errors) == 0, f"Unexpected SDM errors: {sdm_errors}"
|
||||
@@ -66,6 +66,9 @@ services:
|
||||
container_name: supabase-kong
|
||||
image: kong:2.8.1
|
||||
restart: unless-stopped
|
||||
networks:
|
||||
- default
|
||||
- shared-network
|
||||
ports:
|
||||
- 8000:8000/tcp
|
||||
- 8443:8443/tcp
|
||||
@@ -407,6 +410,9 @@ services:
|
||||
container_name: supabase-db
|
||||
image: supabase/postgres:15.8.1.049
|
||||
restart: unless-stopped
|
||||
networks:
|
||||
- default
|
||||
- app-network
|
||||
volumes:
|
||||
- ./volumes/db/realtime.sql:/docker-entrypoint-initdb.d/migrations/99-realtime.sql:Z
|
||||
# Must be superuser to create event trigger
|
||||
@@ -538,5 +544,11 @@ services:
|
||||
"/app/bin/migrate && /app/bin/supavisor eval \"$$(cat /etc/pooler/pooler.exs)\" && /app/bin/server"
|
||||
]
|
||||
|
||||
networks:
|
||||
shared-network:
|
||||
name: shared-network
|
||||
app-network:
|
||||
name: app-network
|
||||
|
||||
volumes:
|
||||
supabase-config:
|
||||
|
||||
@@ -10,6 +10,12 @@ then
|
||||
fi
|
||||
|
||||
echo "Stopping and removing all containers..."
|
||||
# Use the platform compose to tear everything down so no orphan containers remain
|
||||
# (the platform compose manages supabase containers via `extends`, using the
|
||||
# standalone supabase compose here would leave orphans that conflict on next start)
|
||||
if [ -f "../../docker-compose.yml" ]; then
|
||||
docker compose -f ../../docker-compose.yml down -v --remove-orphans
|
||||
fi
|
||||
docker compose -f docker-compose.yml -f ./dev/docker-compose.dev.yml down -v --remove-orphans
|
||||
|
||||
echo "Cleaning up bind-mounted directories..."
|
||||
|
||||
@@ -114,6 +114,8 @@ services:
|
||||
<<: *backend-env
|
||||
ports:
|
||||
- "8006:8006"
|
||||
volumes:
|
||||
- workspace-data:/app/autogpt_platform/backend/workspaces
|
||||
networks:
|
||||
- app-network
|
||||
logging:
|
||||
@@ -185,6 +187,8 @@ services:
|
||||
PYTHONUNBUFFERED: "1"
|
||||
ports:
|
||||
- "8008:8008"
|
||||
volumes:
|
||||
- workspace-data:/app/autogpt_platform/backend/workspaces
|
||||
networks:
|
||||
- app-network
|
||||
logging:
|
||||
@@ -368,6 +372,9 @@ services:
|
||||
SUPABASE_URL: http://kong:8000
|
||||
AGPT_SERVER_URL: http://rest_server:8006/api
|
||||
AGPT_WS_SERVER_URL: ws://websocket_server:8001/ws
|
||||
volumes:
|
||||
workspace-data:
|
||||
|
||||
networks:
|
||||
app-network:
|
||||
driver: bridge
|
||||
|
||||
@@ -7,6 +7,7 @@ networks:
|
||||
volumes:
|
||||
supabase-config:
|
||||
clamav-data:
|
||||
workspace-data:
|
||||
|
||||
x-agpt-services:
|
||||
&agpt-services
|
||||
|
||||
@@ -73,7 +73,7 @@
|
||||
"@vercel/analytics": "1.5.0",
|
||||
"@vercel/speed-insights": "1.2.0",
|
||||
"@xyflow/react": "12.9.2",
|
||||
"ai": "6.0.59",
|
||||
"ai": "6.0.134",
|
||||
"boring-avatars": "1.11.2",
|
||||
"canvas-confetti": "1.9.4",
|
||||
"class-variance-authority": "0.7.1",
|
||||
|
||||
68
autogpt_platform/frontend/pnpm-lock.yaml
generated
@@ -142,8 +142,8 @@ importers:
|
||||
specifier: 12.9.2
|
||||
version: 12.9.2(@types/react@18.3.17)(immer@11.1.3)(react-dom@18.3.1(react@18.3.1))(react@18.3.1)
|
||||
ai:
|
||||
specifier: 6.0.59
|
||||
version: 6.0.59(zod@3.25.76)
|
||||
specifier: 6.0.134
|
||||
version: 6.0.134(zod@3.25.76)
|
||||
boring-avatars:
|
||||
specifier: 1.11.2
|
||||
version: 1.11.2
|
||||
@@ -448,16 +448,32 @@ packages:
|
||||
peerDependencies:
|
||||
zod: ^3.25.76 || ^4.1.8
|
||||
|
||||
'@ai-sdk/gateway@3.0.77':
|
||||
resolution: {integrity: sha512-UdwIG2H2YMuntJQ5L+EmED5XiwnlvDT3HOmKfVFxR4Nq/RSLFA/HcchhwfNXHZ5UJjyuL2VO0huLbWSZ9ijemQ==}
|
||||
engines: {node: '>=18'}
|
||||
peerDependencies:
|
||||
zod: ^3.25.76 || ^4.1.8
|
||||
|
||||
'@ai-sdk/provider-utils@4.0.10':
|
||||
resolution: {integrity: sha512-VeDAiCH+ZK8Xs4hb9Cw7pHlujWNL52RKe8TExOkrw6Ir1AmfajBZTb9XUdKOZO08RwQElIKA8+Ltm+Gqfo8djQ==}
|
||||
engines: {node: '>=18'}
|
||||
peerDependencies:
|
||||
zod: ^3.25.76 || ^4.1.8
|
||||
|
||||
'@ai-sdk/provider-utils@4.0.21':
|
||||
resolution: {integrity: sha512-MtFUYI1/8mgDvRmaBDjbLJPFFrMG777AvSgyIFQtZHIMzm88R/12vYBBpnk7pfiWLFE1DSZzY4WDYzGbKAcmiw==}
|
||||
engines: {node: '>=18'}
|
||||
peerDependencies:
|
||||
zod: ^3.25.76 || ^4.1.8
|
||||
|
||||
'@ai-sdk/provider@3.0.5':
|
||||
resolution: {integrity: sha512-2Xmoq6DBJqmSl80U6V9z5jJSJP7ehaJJQMy2iFUqTay06wdCqTnPVBBQbtEL8RCChenL+q5DC5H5WzU3vV3v8w==}
|
||||
engines: {node: '>=18'}
|
||||
|
||||
'@ai-sdk/provider@3.0.8':
|
||||
resolution: {integrity: sha512-oGMAgGoQdBXbZqNG0Ze56CHjDZ1IDYOwGYxYjO5KLSlz5HiNQ9udIXsPZ61VWaHGZ5XW/jyjmr6t2xz2jGVwbQ==}
|
||||
engines: {node: '>=18'}
|
||||
|
||||
'@ai-sdk/react@3.0.61':
|
||||
resolution: {integrity: sha512-vCjZBnY2+TawFBXamSKt6elAt9n1MXMfcjSd9DSgT9peCJN27qNGVSXgaGNh/B3cUgeOktFfhB2GVmIqOjvmLQ==}
|
||||
engines: {node: '>=18'}
|
||||
@@ -4053,6 +4069,12 @@ packages:
|
||||
resolution: {integrity: sha512-MnA+YT8fwfJPgBx3m60MNqakm30XOkyIoH1y6huTQvC0PwZG7ki8NacLBcrPbNoo8vEZy7Jpuk7+jMO+CUovTQ==}
|
||||
engines: {node: '>= 14'}
|
||||
|
||||
ai@6.0.134:
|
||||
resolution: {integrity: sha512-YalNEaavld/kE444gOcsMKXdVVRGEe0SK77fAFcWYcqLg+a7xKnEet8bdfrEAJTfnMjj01rhgrIL10903w1a5Q==}
|
||||
engines: {node: '>=18'}
|
||||
peerDependencies:
|
||||
zod: ^3.25.76 || ^4.1.8
|
||||
|
||||
ai@6.0.59:
|
||||
resolution: {integrity: sha512-9SfCvcr4kVk4t8ZzIuyHpuL1hFYKsYMQfBSbBq3dipXPa+MphARvI8wHEjNaRqYl3JOsJbWxEBIMqHL0L92mUA==}
|
||||
engines: {node: '>=18'}
|
||||
@@ -8718,6 +8740,13 @@ snapshots:
|
||||
'@vercel/oidc': 3.1.0
|
||||
zod: 3.25.76
|
||||
|
||||
'@ai-sdk/gateway@3.0.77(zod@3.25.76)':
|
||||
dependencies:
|
||||
'@ai-sdk/provider': 3.0.8
|
||||
'@ai-sdk/provider-utils': 4.0.21(zod@3.25.76)
|
||||
'@vercel/oidc': 3.1.0
|
||||
zod: 3.25.76
|
||||
|
||||
'@ai-sdk/provider-utils@4.0.10(zod@3.25.76)':
|
||||
dependencies:
|
||||
'@ai-sdk/provider': 3.0.5
|
||||
@@ -8725,10 +8754,21 @@ snapshots:
|
||||
eventsource-parser: 3.0.6
|
||||
zod: 3.25.76
|
||||
|
||||
'@ai-sdk/provider-utils@4.0.21(zod@3.25.76)':
|
||||
dependencies:
|
||||
'@ai-sdk/provider': 3.0.8
|
||||
'@standard-schema/spec': 1.1.0
|
||||
eventsource-parser: 3.0.6
|
||||
zod: 3.25.76
|
||||
|
||||
'@ai-sdk/provider@3.0.5':
|
||||
dependencies:
|
||||
json-schema: 0.4.0
|
||||
|
||||
'@ai-sdk/provider@3.0.8':
|
||||
dependencies:
|
||||
json-schema: 0.4.0
|
||||
|
||||
'@ai-sdk/react@3.0.61(react@18.3.1)(zod@3.25.76)':
|
||||
dependencies:
|
||||
'@ai-sdk/provider-utils': 4.0.10(zod@3.25.76)
|
||||
@@ -12798,6 +12838,14 @@ snapshots:
|
||||
agent-base@7.1.4:
|
||||
optional: true
|
||||
|
||||
ai@6.0.134(zod@3.25.76):
|
||||
dependencies:
|
||||
'@ai-sdk/gateway': 3.0.77(zod@3.25.76)
|
||||
'@ai-sdk/provider': 3.0.8
|
||||
'@ai-sdk/provider-utils': 4.0.21(zod@3.25.76)
|
||||
'@opentelemetry/api': 1.9.0
|
||||
zod: 3.25.76
|
||||
|
||||
ai@6.0.59(zod@3.25.76):
|
||||
dependencies:
|
||||
'@ai-sdk/gateway': 3.0.27(zod@3.25.76)
|
||||
@@ -14066,8 +14114,8 @@ snapshots:
|
||||
'@typescript-eslint/parser': 8.52.0(eslint@8.57.1)(typescript@5.9.3)
|
||||
eslint: 8.57.1
|
||||
eslint-import-resolver-node: 0.3.9
|
||||
eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0)(eslint@8.57.1)
|
||||
eslint-plugin-import: 2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1)(eslint@8.57.1)
|
||||
eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint@8.57.1))(eslint@8.57.1)
|
||||
eslint-plugin-import: 2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint@8.57.1))(eslint@8.57.1))(eslint@8.57.1)
|
||||
eslint-plugin-jsx-a11y: 6.10.2(eslint@8.57.1)
|
||||
eslint-plugin-react: 7.37.5(eslint@8.57.1)
|
||||
eslint-plugin-react-hooks: 5.2.0(eslint@8.57.1)
|
||||
@@ -14086,7 +14134,7 @@ snapshots:
|
||||
transitivePeerDependencies:
|
||||
- supports-color
|
||||
|
||||
eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0)(eslint@8.57.1):
|
||||
eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint@8.57.1))(eslint@8.57.1):
|
||||
dependencies:
|
||||
'@nolyfill/is-core-module': 1.0.39
|
||||
debug: 4.4.3
|
||||
@@ -14097,22 +14145,22 @@ snapshots:
|
||||
tinyglobby: 0.2.15
|
||||
unrs-resolver: 1.11.1
|
||||
optionalDependencies:
|
||||
eslint-plugin-import: 2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1)(eslint@8.57.1)
|
||||
eslint-plugin-import: 2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint@8.57.1))(eslint@8.57.1))(eslint@8.57.1)
|
||||
transitivePeerDependencies:
|
||||
- supports-color
|
||||
|
||||
eslint-module-utils@2.12.1(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-node@0.3.9)(eslint-import-resolver-typescript@3.10.1)(eslint@8.57.1):
|
||||
eslint-module-utils@2.12.1(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-node@0.3.9)(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint@8.57.1))(eslint@8.57.1))(eslint@8.57.1):
|
||||
dependencies:
|
||||
debug: 3.2.7
|
||||
optionalDependencies:
|
||||
'@typescript-eslint/parser': 8.52.0(eslint@8.57.1)(typescript@5.9.3)
|
||||
eslint: 8.57.1
|
||||
eslint-import-resolver-node: 0.3.9
|
||||
eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0)(eslint@8.57.1)
|
||||
eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint@8.57.1))(eslint@8.57.1)
|
||||
transitivePeerDependencies:
|
||||
- supports-color
|
||||
|
||||
eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1)(eslint@8.57.1):
|
||||
eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint@8.57.1))(eslint@8.57.1))(eslint@8.57.1):
|
||||
dependencies:
|
||||
'@rtsao/scc': 1.1.0
|
||||
array-includes: 3.1.9
|
||||
@@ -14123,7 +14171,7 @@ snapshots:
|
||||
doctrine: 2.1.0
|
||||
eslint: 8.57.1
|
||||
eslint-import-resolver-node: 0.3.9
|
||||
eslint-module-utils: 2.12.1(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-node@0.3.9)(eslint-import-resolver-typescript@3.10.1)(eslint@8.57.1)
|
||||
eslint-module-utils: 2.12.1(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint-import-resolver-node@0.3.9)(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.52.0(eslint@8.57.1)(typescript@5.9.3))(eslint@8.57.1))(eslint@8.57.1))(eslint@8.57.1)
|
||||
hasown: 2.0.2
|
||||
is-core-module: 2.16.1
|
||||
is-glob: 4.0.3
|
||||
|
||||
@@ -15,46 +15,11 @@ import { useCopilotUIStore } from "./store";
|
||||
import { useChatSession } from "./useChatSession";
|
||||
import { useCopilotNotifications } from "./useCopilotNotifications";
|
||||
import { useCopilotStream } from "./useCopilotStream";
|
||||
import { useWorkflowImportAutoSubmit } from "./useWorkflowImportAutoSubmit";
|
||||
|
||||
const TITLE_POLL_INTERVAL_MS = 2_000;
|
||||
const TITLE_POLL_MAX_ATTEMPTS = 5;
|
||||
|
||||
/**
|
||||
* Extract a prompt from the URL hash fragment.
|
||||
* Supports: /copilot#prompt=URL-encoded-text
|
||||
* Optionally auto-submits if ?autosubmit=true is in the query string.
|
||||
* Returns null if no prompt is present.
|
||||
*/
|
||||
function extractPromptFromUrl(): {
|
||||
prompt: string;
|
||||
autosubmit: boolean;
|
||||
} | null {
|
||||
if (typeof window === "undefined") return null;
|
||||
|
||||
const hash = window.location.hash;
|
||||
if (!hash) return null;
|
||||
|
||||
const hashParams = new URLSearchParams(hash.slice(1));
|
||||
const prompt = hashParams.get("prompt");
|
||||
|
||||
if (!prompt || !prompt.trim()) return null;
|
||||
|
||||
const searchParams = new URLSearchParams(window.location.search);
|
||||
const autosubmit = searchParams.get("autosubmit") === "true";
|
||||
|
||||
// Clean up hash + autosubmit param only (preserve other query params)
|
||||
const cleanURL = new URL(window.location.href);
|
||||
cleanURL.hash = "";
|
||||
cleanURL.searchParams.delete("autosubmit");
|
||||
window.history.replaceState(
|
||||
null,
|
||||
"",
|
||||
`${cleanURL.pathname}${cleanURL.search}`,
|
||||
);
|
||||
|
||||
return { prompt: prompt.trim(), autosubmit };
|
||||
}
|
||||
|
||||
interface UploadedFile {
|
||||
file_id: string;
|
||||
name: string;
|
||||
@@ -130,16 +95,23 @@ export function useCopilotPage() {
|
||||
breakpoint === "base" || breakpoint === "sm" || breakpoint === "md";
|
||||
|
||||
const pendingFilesRef = useRef<File[]>([]);
|
||||
// Pre-built file parts from workflow import (already uploaded, skip re-upload)
|
||||
const pendingFilePartsRef = useRef<FileUIPart[]>([]);
|
||||
|
||||
// --- Send pending message after session creation ---
|
||||
useEffect(() => {
|
||||
if (!sessionId || pendingMessage === null) return;
|
||||
const msg = pendingMessage;
|
||||
const files = pendingFilesRef.current;
|
||||
const prebuiltParts = pendingFilePartsRef.current;
|
||||
setPendingMessage(null);
|
||||
pendingFilesRef.current = [];
|
||||
pendingFilePartsRef.current = [];
|
||||
|
||||
if (files.length > 0) {
|
||||
if (prebuiltParts.length > 0) {
|
||||
// File already uploaded (e.g. workflow import) — send directly
|
||||
sendMessage({ text: msg, files: prebuiltParts });
|
||||
} else if (files.length > 0) {
|
||||
setIsUploadingFiles(true);
|
||||
void uploadFiles(files, sessionId)
|
||||
.then((uploaded) => {
|
||||
@@ -164,26 +136,11 @@ export function useCopilotPage() {
|
||||
}, [sessionId, pendingMessage, sendMessage]);
|
||||
|
||||
// --- Extract prompt from URL hash on mount (e.g. /copilot#prompt=Hello) ---
|
||||
const { setInitialPrompt } = useCopilotUIStore();
|
||||
const hasProcessedUrlPrompt = useRef(false);
|
||||
useEffect(() => {
|
||||
if (hasProcessedUrlPrompt.current) return;
|
||||
|
||||
const urlPrompt = extractPromptFromUrl();
|
||||
if (!urlPrompt) return;
|
||||
|
||||
hasProcessedUrlPrompt.current = true;
|
||||
|
||||
if (urlPrompt.autosubmit) {
|
||||
setPendingMessage(urlPrompt.prompt);
|
||||
void createSession().catch(() => {
|
||||
setPendingMessage(null);
|
||||
setInitialPrompt(urlPrompt.prompt);
|
||||
});
|
||||
} else {
|
||||
setInitialPrompt(urlPrompt.prompt);
|
||||
}
|
||||
}, [createSession, setInitialPrompt]);
|
||||
useWorkflowImportAutoSubmit({
|
||||
createSession,
|
||||
setPendingMessage,
|
||||
pendingFilePartsRef,
|
||||
});
|
||||
|
||||
async function uploadFiles(
|
||||
files: File[],
|
||||
|
||||
@@ -0,0 +1,122 @@
|
||||
import type { FileUIPart } from "ai";
|
||||
import { useEffect, useRef } from "react";
|
||||
import { useCopilotUIStore } from "./store";
|
||||
|
||||
/**
|
||||
* Extract a prompt from the URL hash fragment.
|
||||
* Supports: /copilot#prompt=URL-encoded-text
|
||||
* Optionally auto-submits if ?autosubmit=true is in the query string.
|
||||
* Returns null if no prompt is present.
|
||||
*/
|
||||
function extractPromptFromUrl(): {
|
||||
prompt: string;
|
||||
autosubmit: boolean;
|
||||
filePart?: FileUIPart;
|
||||
} | null {
|
||||
if (typeof window === "undefined") return null;
|
||||
|
||||
const searchParams = new URLSearchParams(window.location.search);
|
||||
const autosubmit = searchParams.get("autosubmit") === "true";
|
||||
|
||||
// Check sessionStorage first (used by workflow import for large prompts)
|
||||
const storedPrompt = sessionStorage.getItem("importWorkflowPrompt");
|
||||
if (storedPrompt) {
|
||||
sessionStorage.removeItem("importWorkflowPrompt");
|
||||
|
||||
// Check for a pre-uploaded workflow file attached to this import
|
||||
let filePart: FileUIPart | undefined;
|
||||
const storedFile = sessionStorage.getItem("importWorkflowFile");
|
||||
if (storedFile) {
|
||||
sessionStorage.removeItem("importWorkflowFile");
|
||||
try {
|
||||
const { fileId, fileName, mimeType } = JSON.parse(storedFile);
|
||||
// Validate fileId is a UUID to prevent path traversal
|
||||
const UUID_RE =
|
||||
/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
|
||||
if (typeof fileId === "string" && UUID_RE.test(fileId)) {
|
||||
filePart = {
|
||||
type: "file",
|
||||
mediaType: mimeType ?? "application/json",
|
||||
filename: fileName ?? "workflow.json",
|
||||
url: `/api/proxy/api/workspace/files/${fileId}/download`,
|
||||
};
|
||||
}
|
||||
} catch {
|
||||
// ignore malformed stored data
|
||||
}
|
||||
}
|
||||
|
||||
// Clean up query params
|
||||
const cleanURL = new URL(window.location.href);
|
||||
cleanURL.searchParams.delete("autosubmit");
|
||||
cleanURL.searchParams.delete("source");
|
||||
window.history.replaceState(
|
||||
null,
|
||||
"",
|
||||
`${cleanURL.pathname}${cleanURL.search}`,
|
||||
);
|
||||
return { prompt: storedPrompt.trim(), autosubmit, filePart };
|
||||
}
|
||||
|
||||
// Fall back to URL hash (e.g. /copilot#prompt=...)
|
||||
const hash = window.location.hash;
|
||||
if (!hash) return null;
|
||||
|
||||
const hashParams = new URLSearchParams(hash.slice(1));
|
||||
const prompt = hashParams.get("prompt");
|
||||
|
||||
if (!prompt || !prompt.trim()) return null;
|
||||
|
||||
// Clean up hash + autosubmit param only (preserve other query params)
|
||||
const cleanURL = new URL(window.location.href);
|
||||
cleanURL.hash = "";
|
||||
cleanURL.searchParams.delete("autosubmit");
|
||||
window.history.replaceState(
|
||||
null,
|
||||
"",
|
||||
`${cleanURL.pathname}${cleanURL.search}`,
|
||||
);
|
||||
|
||||
return { prompt: prompt.trim(), autosubmit };
|
||||
}
|
||||
|
||||
/**
|
||||
* Hook that checks for workflow import data in sessionStorage / URL on mount,
|
||||
* and auto-submits a new CoPilot session when `autosubmit=true`.
|
||||
*
|
||||
* Extracted from useCopilotPage to keep that hook focused on page-level concerns.
|
||||
*/
|
||||
export function useWorkflowImportAutoSubmit({
|
||||
createSession,
|
||||
setPendingMessage,
|
||||
pendingFilePartsRef,
|
||||
}: {
|
||||
createSession: () => Promise<string | undefined>;
|
||||
setPendingMessage: (msg: string | null) => void;
|
||||
pendingFilePartsRef: React.MutableRefObject<FileUIPart[]>;
|
||||
}) {
|
||||
const { setInitialPrompt } = useCopilotUIStore();
|
||||
const hasProcessedUrlPrompt = useRef(false);
|
||||
|
||||
useEffect(() => {
|
||||
if (hasProcessedUrlPrompt.current) return;
|
||||
|
||||
const urlPrompt = extractPromptFromUrl();
|
||||
if (!urlPrompt) return;
|
||||
|
||||
hasProcessedUrlPrompt.current = true;
|
||||
|
||||
if (urlPrompt.autosubmit) {
|
||||
if (urlPrompt.filePart) {
|
||||
pendingFilePartsRef.current = [urlPrompt.filePart];
|
||||
}
|
||||
setPendingMessage(urlPrompt.prompt);
|
||||
void createSession().catch(() => {
|
||||
setPendingMessage(null);
|
||||
setInitialPrompt(urlPrompt.prompt);
|
||||
});
|
||||
} else {
|
||||
setInitialPrompt(urlPrompt.prompt);
|
||||
}
|
||||
}, [createSession, setInitialPrompt, setPendingMessage, pendingFilePartsRef]);
|
||||
}
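Review note: the parsing rules in `extractPromptFromUrl` above can be exercised without a browser if they are pulled into pure functions. The sketch below is only an illustration under that assumption; `parsePromptFromHash`, `ParsedPrompt`, and `isUuid` are hypothetical names that do not exist in this diff. It mirrors the hash fallback (`/copilot#prompt=...` plus `?autosubmit=true`) and the UUID check applied to `fileId`, and it deliberately leaves out the `sessionStorage` lookup and the `history.replaceState` cleanup.

```typescript
// Hypothetical helpers, not part of the PR: the same parsing rules as
// extractPromptFromUrl, but taking the hash and query string as arguments
// instead of reading window.location.
interface ParsedPrompt {
  prompt: string;
  autosubmit: boolean;
}

function parsePromptFromHash(
  hash: string,
  search: string,
): ParsedPrompt | null {
  if (!hash) return null;

  // Drop the leading "#" and treat the rest as key=value pairs.
  const hashParams = new URLSearchParams(hash.slice(1));
  const prompt = hashParams.get("prompt");

  // A missing or whitespace-only prompt counts as "no prompt".
  if (!prompt || !prompt.trim()) return null;

  // Auto-submit is opt-in and only the literal string "true" enables it.
  const autosubmit = new URLSearchParams(search).get("autosubmit") === "true";

  return { prompt: prompt.trim(), autosubmit };
}

// Same pattern as UUID_RE in the diff: anchored at both ends, so a value
// such as "../../etc/passwd" can never reach the download URL template.
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isUuid(value: unknown): value is string {
  return typeof value === "string" && UUID_RE.test(value);
}
```

Keeping the parsing separate from the side effects (storage reads, URL rewriting) would let the rules be unit-tested directly; whether that split is worth it for this hook is a judgment call for the PR author.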
|
||||
@@ -169,7 +169,7 @@ function renderMarkdown(
|
||||
[remarkMath, { singleDollarTextMath: false }], // Math support for LaTeX
|
||||
]}
|
||||
rehypePlugins={[
|
||||
rehypeKatex, // Render math with KaTeX
|
||||
[rehypeKatex, { strict: false }], // Render math with KaTeX
|
||||
rehypeHighlight, // Syntax highlighting for code blocks
|
||||
rehypeSlug, // Add IDs to headings
|
||||
[rehypeAutolinkHeadings, { behavior: "wrap" }], // Make headings clickable
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
import LibraryImportDialog from "../LibraryImportDialog/LibraryImportDialog";
|
||||
import { LibrarySearchBar } from "../LibrarySearchBar/LibrarySearchBar";
|
||||
import LibraryUploadAgentDialog from "../LibraryUploadAgentDialog/LibraryUploadAgentDialog";
|
||||
|
||||
interface Props {
|
||||
setSearchTerm: (value: string) => void;
|
||||
@@ -10,13 +10,13 @@ export function LibraryActionHeader({ setSearchTerm }: Props) {
|
||||
<>
|
||||
<div className="mb-[32px] hidden items-center justify-center gap-4 md:flex">
|
||||
<LibrarySearchBar setSearchTerm={setSearchTerm} />
|
||||
<LibraryUploadAgentDialog />
|
||||
<LibraryImportDialog />
|
||||
</div>
|
||||
|
||||
{/* Mobile and tablet */}
|
||||
<div className="flex flex-col gap-4 p-4 pt-[52px] md:hidden">
|
||||
<div className="flex w-full justify-between">
|
||||
<LibraryUploadAgentDialog />
|
||||
<div className="flex w-full justify-between gap-2">
|
||||
<LibraryImportDialog />
|
||||
</div>
|
||||
|
||||
<div className="flex items-center justify-center">
|
||||
|
||||
@@ -0,0 +1,66 @@
|
||||
"use client";
|
||||
import { Button } from "@/components/atoms/Button/Button";
|
||||
import { Dialog } from "@/components/molecules/Dialog/Dialog";
|
||||
import {
|
||||
TabsLine,
|
||||
TabsLineList,
|
||||
TabsLineTrigger,
|
||||
} from "@/components/molecules/TabsLine/TabsLine";
|
||||
import { UploadSimpleIcon } from "@phosphor-icons/react";
|
||||
import { useState } from "react";
|
||||
import { useLibraryUploadAgentDialog } from "../LibraryUploadAgentDialog/useLibraryUploadAgentDialog";
|
||||
import AgentUploadTab from "./components/AgentUploadTab/AgentUploadTab";
|
||||
import ExternalWorkflowTab from "./components/ExternalWorkflowTab/ExternalWorkflowTab";
|
||||
import { useExternalWorkflowTab } from "./components/ExternalWorkflowTab/useExternalWorkflowTab";
|
||||
|
||||
export default function LibraryImportDialog() {
|
||||
const [isOpen, setIsOpen] = useState(false);
|
||||
|
||||
const importWorkflow = useExternalWorkflowTab();
|
||||
|
||||
function handleClose() {
|
||||
setIsOpen(false);
|
||||
importWorkflow.setFileValue("");
|
||||
importWorkflow.setUrlValue("");
|
||||
}
|
||||
|
||||
const upload = useLibraryUploadAgentDialog({ onSuccess: handleClose });
|
||||
|
||||
return (
|
||||
<Dialog
|
||||
title="Import"
|
||||
styling={{ maxWidth: "32rem" }}
|
||||
controlled={{
|
||||
isOpen,
|
||||
set: setIsOpen,
|
||||
}}
|
||||
onClose={handleClose}
|
||||
>
|
||||
<Dialog.Trigger>
|
||||
<Button
|
||||
data-testid="import-button"
|
||||
variant="primary"
|
||||
className="h-[2.78rem] w-full md:w-[10rem]"
|
||||
size="small"
|
||||
>
|
||||
<UploadSimpleIcon width={18} height={18} />
|
||||
<span>Import</span>
|
||||
</Button>
|
||||
</Dialog.Trigger>
|
||||
<Dialog.Content>
|
||||
<TabsLine defaultValue="agent">
|
||||
<TabsLineList>
|
||||
<TabsLineTrigger value="agent">AutoGPT agent</TabsLineTrigger>
|
||||
<TabsLineTrigger value="platform">Another platform</TabsLineTrigger>
|
||||
</TabsLineList>
|
||||
|
||||
{/* Tab: Import from any platform (file upload + n8n URL) */}
|
||||
<ExternalWorkflowTab importWorkflow={importWorkflow} />
|
||||
|
||||
{/* Tab: Upload AutoGPT agent JSON */}
|
||||
<AgentUploadTab upload={upload} />
|
||||
</TabsLine>
|
||||
</Dialog.Content>
|
||||
</Dialog>
|
||||
);
|
||||
}
|
||||
@@ -0,0 +1,105 @@
|
||||
"use client";
|
||||
import { Button } from "@/components/atoms/Button/Button";
|
||||
import { FileInput } from "@/components/atoms/FileInput/FileInput";
|
||||
import { Input } from "@/components/atoms/Input/Input";
|
||||
import { LoadingSpinner } from "@/components/atoms/LoadingSpinner/LoadingSpinner";
|
||||
import {
|
||||
Form,
|
||||
FormControl,
|
||||
FormField,
|
||||
FormItem,
|
||||
FormMessage,
|
||||
} from "@/components/molecules/Form/Form";
|
||||
import { TabsLineContent } from "@/components/molecules/TabsLine/TabsLine";
|
||||
import { useLibraryUploadAgentDialog } from "../../../LibraryUploadAgentDialog/useLibraryUploadAgentDialog";
|
||||
|
||||
type AgentUploadTabProps = {
|
||||
upload: ReturnType<typeof useLibraryUploadAgentDialog>;
|
||||
};
|
||||
|
||||
export default function AgentUploadTab({ upload }: AgentUploadTabProps) {
|
||||
return (
|
||||
<TabsLineContent value="agent">
|
||||
<p className="mb-4 text-sm text-neutral-500">
|
||||
Upload a previously exported AutoGPT agent file (.json).
|
||||
</p>
|
||||
<Form
|
||||
form={upload.form}
|
||||
onSubmit={upload.onSubmit}
|
||||
className="flex flex-col justify-center gap-0 px-1"
|
||||
>
|
||||
<FormField
|
||||
control={upload.form.control}
|
||||
name="agentName"
|
||||
render={({ field }) => (
|
||||
<FormItem>
|
||||
<FormControl>
|
||||
<Input
|
||||
{...field}
|
||||
id={field.name}
|
||||
label="Agent name"
|
||||
className="w-full rounded-[10px]"
|
||||
/>
|
||||
</FormControl>
|
||||
<FormMessage />
|
||||
</FormItem>
|
||||
)}
|
||||
/>
|
||||
<FormField
|
||||
control={upload.form.control}
|
||||
name="agentDescription"
|
||||
render={({ field }) => (
|
||||
<FormItem>
|
||||
<FormControl>
|
||||
<Input
|
||||
{...field}
|
||||
id={field.name}
|
||||
label="Agent description"
|
||||
type="textarea"
|
||||
className="w-full rounded-[10px]"
|
||||
/>
|
||||
</FormControl>
|
||||
<FormMessage />
|
||||
</FormItem>
|
||||
)}
|
||||
/>
|
||||
<FormField
|
||||
control={upload.form.control}
|
||||
name="agentFile"
|
||||
render={({ field }) => (
|
||||
<FormItem>
|
||||
<FormControl>
|
||||
<FileInput
|
||||
mode="base64"
|
||||
value={field.value}
|
||||
onChange={field.onChange}
|
||||
accept=".json,application/json"
|
||||
placeholder="Agent file"
|
||||
maxFileSize={10 * 1024 * 1024}
|
||||
showStorageNote={false}
|
||||
className="mb-8 mt-4"
|
||||
/>
|
||||
</FormControl>
|
||||
<FormMessage />
|
||||
</FormItem>
|
||||
)}
|
||||
/>
|
||||
<Button
|
||||
type="submit"
|
||||
variant="primary"
|
||||
className="w-full"
|
||||
disabled={!upload.agentObject || upload.isUploading}
|
||||
>
|
||||
{upload.isUploading ? (
|
||||
<div className="flex items-center gap-2">
|
||||
<LoadingSpinner size="small" className="text-white" />
|
||||
<span>Uploading...</span>
|
||||
</div>
|
||||
) : (
|
||||
"Upload"
|
||||
)}
|
||||
</Button>
|
||||
</Form>
|
||||
</TabsLineContent>
|
||||
);
|
||||
}
|
||||
@@ -0,0 +1,99 @@
|
||||
"use client";
|
||||
import { Button } from "@/components/atoms/Button/Button";
|
||||
import { FileInput } from "@/components/atoms/FileInput/FileInput";
|
||||
import { Input } from "@/components/atoms/Input/Input";
|
||||
import { LoadingSpinner } from "@/components/atoms/LoadingSpinner/LoadingSpinner";
|
||||
import { TabsLineContent } from "@/components/molecules/TabsLine/TabsLine";
|
||||
import { useExternalWorkflowTab } from "./useExternalWorkflowTab";
|
||||
|
||||
const N8N_EXAMPLES = [
|
||||
{ label: "Build Your First AI Agent", url: "https://n8n.io/workflows/6270" },
|
||||
{ label: "Interactive AI Chat Agent", url: "https://n8n.io/workflows/5819" },
|
||||
];
|
||||
|
||||
type ExternalWorkflowTabProps = {
|
||||
importWorkflow: ReturnType<typeof useExternalWorkflowTab>;
|
||||
};
|
||||
|
||||
export default function ExternalWorkflowTab({
|
||||
importWorkflow,
|
||||
}: ExternalWorkflowTabProps) {
|
||||
return (
|
||||
<TabsLineContent value="platform">
|
||||
<p className="mb-4 text-sm text-neutral-500">
|
||||
Upload a workflow exported from n8n, Make.com, Zapier, or any other
|
||||
platform. AutoPilot will convert it into an AutoGPT agent for you.
|
||||
</p>
|
||||
<FileInput
|
||||
mode="base64"
|
||||
value={importWorkflow.fileValue}
|
||||
onChange={importWorkflow.setFileValue}
|
||||
accept=".json,application/json"
|
||||
placeholder="Workflow file (n8n, Make.com, Zapier, ...)"
|
||||
maxFileSize={10 * 1024 * 1024}
|
||||
showStorageNote={false}
|
||||
className="mb-4 mt-2"
|
||||
/>
|
||||
<Button
|
||||
type="button"
|
||||
variant="primary"
|
||||
className="w-full"
|
||||
disabled={!importWorkflow.fileValue || importWorkflow.isSubmitting}
|
||||
onClick={() => importWorkflow.submitWithMode("file")}
|
||||
>
|
||||
{importWorkflow.submittingMode === "file" ? (
|
||||
<div className="flex items-center gap-2">
|
||||
<LoadingSpinner size="small" className="text-white" />
|
||||
<span>Importing...</span>
|
||||
</div>
|
||||
) : (
|
||||
"Import to AutoPilot"
|
||||
)}
|
||||
</Button>
|
||||
|
||||
<div className="my-5 flex items-center gap-3">
|
||||
<div className="h-px flex-1 bg-neutral-200" />
|
||||
<span className="text-xs text-neutral-400">or import from URL</span>
|
||||
<div className="h-px flex-1 bg-neutral-200" />
|
||||
</div>
|
||||
|
||||
<div className="mb-3 flex flex-wrap gap-2">
|
||||
{N8N_EXAMPLES.map((p) => (
|
||||
<button
|
||||
key={p.label}
|
||||
type="button"
|
||||
disabled={importWorkflow.isSubmitting}
|
||||
onClick={() => importWorkflow.setUrlValue(p.url)}
|
||||
className="rounded-full border border-neutral-200 px-3 py-1 text-xs text-neutral-600 hover:border-purple-400 hover:text-purple-600 disabled:opacity-50"
|
||||
>
|
||||
{p.label}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
<Input
|
||||
id="template-url"
|
||||
value={importWorkflow.urlValue}
|
||||
onChange={(e) => importWorkflow.setUrlValue(e.target.value)}
|
||||
label="Workflow URL"
|
||||
placeholder="https://n8n.io/workflows/1234"
|
||||
className="mb-4 w-full rounded-[10px]"
|
||||
/>
|
||||
<Button
|
||||
type="button"
|
||||
variant="primary"
|
||||
className="w-full"
|
||||
disabled={!importWorkflow.urlValue || importWorkflow.isSubmitting}
|
||||
onClick={() => importWorkflow.submitWithMode("url")}
|
||||
>
|
||||
{importWorkflow.submittingMode === "url" ? (
|
||||
<div className="flex items-center gap-2">
|
||||
<LoadingSpinner size="small" className="text-white" />
|
||||
<span>Importing...</span>
|
||||
</div>
|
||||
) : (
|
||||
"Import from URL"
|
||||
)}
|
||||
</Button>
|
||||
</TabsLineContent>
|
||||
);
|
||||
}
|
||||
@@ -0,0 +1,85 @@
"use server";

/**
 * Regex to extract the numeric template ID from various n8n URL formats:
 * - https://n8n.io/workflows/1234
 * - https://n8n.io/workflows/1234-some-slug
 * - https://api.n8n.io/api/templates/workflows/1234
 */
const N8N_TEMPLATE_ID_RE = /n8n\.io\/(?:api\/templates\/)?workflows\/(\d+)/i;

/** Hardcoded n8n templates API base — the only URL we ever fetch. */
const N8N_TEMPLATES_API = "https://api.n8n.io/api/templates/workflows";

/** Max response body size (10 MB) to prevent memory exhaustion. */
const MAX_RESPONSE_BYTES = 10 * 1024 * 1024;

export type FetchWorkflowResult =
  | { ok: true; json: string }
  | { ok: false; error: string };

/**
 * Server action that fetches a workflow JSON from an n8n template URL.
 * Runs server-side so there are no CORS restrictions.
 *
 * Returns a result object instead of throwing because Next.js
 * server actions do not propagate error messages to the client.
 *
 * Only n8n.io workflow URLs are accepted. The template ID is extracted
 * and used to call the hardcoded n8n API — the user-supplied URL is
 * never passed to fetch() directly (SSRF prevention).
 */
export async function fetchWorkflowFromUrl(
  url: string,
): Promise<FetchWorkflowResult> {
  const match = url.match(N8N_TEMPLATE_ID_RE);
  if (!match) {
    return {
      ok: false,
      error:
        "Invalid or unsupported URL. " +
        "URL import is supported for n8n.io workflow templates " +
        "(e.g. https://n8n.io/workflows/1234). " +
        "For other platforms, use file upload.",
    };
  }

  const templateId = match[1]; // purely numeric, safe to interpolate

  try {
    const json = await fetchN8nWorkflow(templateId);
    return { ok: true, json };
  } catch (err) {
    return {
      ok: false,
      error: err instanceof Error ? err.message : "Failed to fetch workflow.",
    };
  }
}

async function fetchN8nWorkflow(templateId: string): Promise<string> {
  // Only ever fetch from the hardcoded API base + numeric ID.
  // parseInt + toString round-trips to guarantee the value is purely numeric,
  // preventing any path-traversal or SSRF via the interpolated segment.
  const safeId = parseInt(templateId, 10);
  if (!Number.isFinite(safeId) || safeId <= 0) {
    throw new Error("Invalid template ID");
  }
  const res = await fetch(`${N8N_TEMPLATES_API}/${safeId.toString()}`);
  if (!res.ok) throw new Error(`n8n template not found (${res.status})`);

  const contentLength = res.headers.get("content-length");
  if (contentLength && parseInt(contentLength, 10) > MAX_RESPONSE_BYTES) {
    throw new Error("Response too large.");
  }

  const text = await res.text();
  if (text.length > MAX_RESPONSE_BYTES) throw new Error("Response too large.");

  const data = JSON.parse(text);
  const template = data?.workflow ?? data;
  const workflow = template?.workflow ?? template;
  if (!workflow?.nodes) throw new Error("Unexpected n8n API response format");
  if (!workflow.name) workflow.name = template?.name ?? data?.name ?? "";
  return JSON.stringify(workflow);
}
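The URL handling in the file above can be tried in isolation. The following is a minimal sketch, not the PR's code: `buildTemplateApiUrl` and `utf8ByteLength` are hypothetical helper names introduced here for illustration, while the regex and API base are copied from the diff. It shows that only the extracted numeric ID reaches the fixed API host, even when the regex (which is not anchored to the host) matches inside an unrelated URL, and that a string's `length` counts UTF-16 code units rather than bytes, so a byte limit checked with `text.length` is only approximate for non-ASCII payloads.

```typescript
// Regex and API base copied from the diff above.
const N8N_TEMPLATE_ID_RE = /n8n\.io\/(?:api\/templates\/)?workflows\/(\d+)/i;
const N8N_TEMPLATES_API = "https://api.n8n.io/api/templates/workflows";

/**
 * Hypothetical helper: maps a user-supplied URL to the fixed-host API URL,
 * or returns null when no positive numeric template ID can be extracted.
 */
function buildTemplateApiUrl(userUrl: string): string | null {
  const match = userUrl.match(N8N_TEMPLATE_ID_RE);
  if (!match) return null;
  const id = parseInt(match[1], 10);
  if (!Number.isFinite(id) || id <= 0) return null;
  // Only the parsed number is interpolated; the user's host is discarded.
  return `${N8N_TEMPLATES_API}/${id}`;
}

/** Hypothetical helper: size of a string in UTF-8 bytes. */
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).byteLength;
}

// A slugged n8n.io URL resolves to the API URL for template 1234.
console.log(buildTemplateApiUrl("https://n8n.io/workflows/1234-some-slug"));
// The unanchored regex also matches here, but the fetch target is still api.n8n.io.
console.log(buildTemplateApiUrl("https://evil.example/n8n.io/workflows/7"));
// No "n8n.io/" segment at all, so no ID is extracted.
console.log(buildTemplateApiUrl("https://example.com/workflows/7"));
// One UTF-16 code unit, two UTF-8 bytes.
console.log("\u00e9".length, utf8ByteLength("\u00e9"));
```

Whether an exact byte count matters for the 10 MB guard is a judgment call; the sketch only illustrates the difference between the two measures.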
@@ -0,0 +1,114 @@
import { useToast } from "@/components/molecules/Toast/use-toast";
import { uploadFileDirect } from "@/lib/direct-upload";
import { useRouter } from "next/navigation";
import { useState } from "react";
import { fetchWorkflowFromUrl } from "./fetchWorkflowFromUrl";

function decodeBase64Json(dataUrl: string): string {
  const match = dataUrl.match(/^data:[^;]+;base64,(.+)$/);
  if (!match) throw new Error("Could not read the uploaded file.");
  const binary = atob(match[1]);
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  const json = new TextDecoder().decode(bytes);
  JSON.parse(json); // validate — throws SyntaxError if invalid
  return json;
}

async function uploadJsonAsFile(
  jsonString: string,
): Promise<{ fileId: string; fileName: string; mimeType: string }> {
  const file = new File(
    [new Blob([jsonString], { type: "application/json" })],
    `workflow-${crypto.randomUUID()}.json`,
    { type: "application/json" },
  );
  const uploaded = await uploadFileDirect(file);
  return {
    fileId: uploaded.file_id,
    fileName: uploaded.name,
    mimeType: uploaded.mime_type,
  };
}

function storeAndRedirect(
  fileInfo: { fileId: string; fileName: string; mimeType: string },
  router: ReturnType<typeof useRouter>,
) {
  sessionStorage.setItem(
    "importWorkflowPrompt",
    "Import this workflow and recreate it as an AutoGPT agent",
  );
  sessionStorage.setItem("importWorkflowFile", JSON.stringify(fileInfo));
  router.push("/copilot?source=import&autosubmit=true");
}

export function useExternalWorkflowTab() {
  const { toast } = useToast();
  const router = useRouter();
  const [fileValue, setFileValue] = useState("");
  const [urlValue, setUrlValue] = useState("");
  const [submittingMode, setSubmittingMode] = useState<"url" | "file" | null>(
    null,
  );
  const isSubmitting = submittingMode !== null;

  async function submitWithMode(mode: "url" | "file") {
    setSubmittingMode(mode);
    try {
      const jsonString = await resolveJson(mode);
      if (!jsonString) return;
      storeAndRedirect(await uploadJsonAsFile(jsonString), router);
    } catch (err) {
      toast({
        title: "Upload failed",
        description:
          err instanceof Error ? err.message : "Could not upload the file.",
        variant: "destructive",
      });
    } finally {
      setSubmittingMode(null);
    }
  }

  async function resolveJson(mode: "url" | "file"): Promise<string | null> {
    if (mode === "url") {
      const result = await fetchWorkflowFromUrl(urlValue);
      if (!result.ok) {
        toast({
          title: "Could not fetch workflow",
          description: result.error,
          variant: "destructive",
        });
        return null;
      }
      setUrlValue("");
      return result.json;
    }

    try {
      const json = decodeBase64Json(fileValue);
      setFileValue("");
      return json;
    } catch (err) {
      const isParseError = err instanceof SyntaxError;
      toast({
        title: isParseError ? "Invalid JSON" : "Invalid file",
        description: isParseError
          ? "The uploaded file is not valid JSON."
          : "Could not read the uploaded file.",
        variant: "destructive",
      });
      return null;
    }
  }

  return {
    submitWithMode,
    fileValue,
    setFileValue,
    urlValue,
    setUrlValue,
    isSubmitting,
    submittingMode,
  };
}
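The file branch of the hook above depends on `decodeBase64Json` turning a base64 data URL back into UTF-8 text. The sketch below reproduces that function from the diff and pairs it with `toDataUrl`, a hypothetical helper that exists only to build test input; it assumes a runtime where `atob`, `btoa`, `TextEncoder`, and `TextDecoder` are globals (browsers and recent Node.js versions). It illustrates why the intermediate byte array is needed: `atob` yields one character per byte, so multi-byte UTF-8 sequences must be reassembled by `TextDecoder` before parsing.

```typescript
/** Copied from the diff: decodes a base64 data URL and validates it as JSON. */
function decodeBase64Json(dataUrl: string): string {
  const match = dataUrl.match(/^data:[^;]+;base64,(.+)$/);
  if (!match) throw new Error("Could not read the uploaded file.");
  const binary = atob(match[1]); // one character per byte
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  const json = new TextDecoder().decode(bytes); // reassemble UTF-8
  JSON.parse(json); // throws SyntaxError if the text is not JSON
  return json;
}

/** Hypothetical test helper: encodes text as a UTF-8 base64 data URL. */
function toDataUrl(text: string): string {
  let binary = "";
  new TextEncoder().encode(text).forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return `data:application/json;base64,${btoa(binary)}`;
}

// Round trip with a non-ASCII character in the payload.
const original = JSON.stringify({ name: "D\u00e9mo", nodes: [] });
const decoded = decodeBase64Json(toDataUrl(original));
console.log(decoded === original);

// Valid base64 but invalid JSON surfaces as a SyntaxError,
// which the hook reports as "Invalid JSON".
try {
  decodeBase64Json(toDataUrl("not json"));
} catch (err) {
  console.log(err instanceof SyntaxError);
}
```

Input that is not a base64 data URL at all fails the initial regex and raises a plain `Error`, which is the case the hook reports as "Invalid file".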
@@ -9,7 +9,9 @@ import { useForm } from "react-hook-form";
 import { z } from "zod";
 import { uploadAgentFormSchema } from "./LibraryUploadAgentDialog";

-export function useLibraryUploadAgentDialog() {
+export function useLibraryUploadAgentDialog(options?: {
+  onSuccess?: () => void;
+}) {
   const [isOpen, setIsOpen] = useState(false);
   const { toast } = useToast();
   const [agentObject, setAgentObject] = useState<Graph | null>(null);
@@ -19,6 +21,7 @@ export function useLibraryUploadAgentDialog() {
     mutation: {
       onSuccess: ({ data }) => {
         setIsOpen(false);
+        options?.onSuccess?.();
         toast({
           title: "Success",
           description: "Agent uploaded successfully",
@@ -114,7 +117,7 @@ export function useLibraryUploadAgentDialog() {
     }
   }, [agentFileValue, form, toast]);

-  const onSubmit = async (values: z.infer<typeof uploadAgentFormSchema>) => {
+  async function onSubmit(values: z.infer<typeof uploadAgentFormSchema>) {
     if (!agentObject) {
       form.setError("root", { message: "No Agent object to save" });
       return;
@@ -133,7 +136,7 @@ export function useLibraryUploadAgentDialog() {
         source: "upload",
       },
     });
-  };
+  }

   return {
     onSubmit,
Some files were not shown because too many files have changed in this diff.