Compare commits


52 Commits

Author SHA1 Message Date
Zamil Majdy
a8c1d07f09 fix(frontend): reset input value when auto-approve toggle changes
- Pass external data value from parent to child component
- Add useEffect to sync child's internal state with parent's reviewDataMap
- Fixes issue where toggling auto-approve didn't reset edited input values
- Input now correctly reverts to original payload when auto-approve is enabled
2026-01-23 20:09:52 -06:00
Zamil Majdy
69e6927209 fix(backend): resolve event loop conflicts in human review tests
- Use TYPE_CHECKING conditional imports to avoid event loop binding issues
- Keep local imports within functions to prevent async initialization conflicts
- All 25 human review tests now pass reliably (10 data layer + 15 API routes)
- Fixes node_id fetching for frontend review grouping
2026-01-23 19:29:20 -06:00
Zamil Majdy
4003d7abf6 Merge branch 'dev' into feat/sensitive-action-features 2026-01-23 19:28:17 -05:00
Zamil Majdy
0cfc3e395f feat(backend): add node_id to PendingHumanReviewModel for frontend grouping
Adds node_id field to PendingHumanReviewModel to enable frontend to:
- Group reviews from the same node together
- Show auto-approve toggle only on last review per node
- Apply auto-approval at node level (not per-review)

Changes:
- Fetch node_id from NodeExecution when loading reviews
- Add node_id parameter to PendingHumanReviewModel.from_db()
- Update test fixture with node_id
- Add temporary default value for backwards compatibility
2026-01-23 18:04:02 -06:00
Zamil Majdy
27e93a39c1 fix(test): add event loop cleanup to prevent CI test failures
Adds a 0.5s sleep after shutting down test services (ExecutionManager,
AgentServer, etc.) to ensure they fully clean up their event loops before
the next test starts. This prevents 'Event loop is closed' errors in CI
that were breaking the oauth_test.py fixture setup.

The issue occurred because:
1. SpinTestServer starts background services with their own event loops
2. When services shut down, event loops weren't fully cleaned up
3. Subsequent tests would encounter closed event loops

Fixes test isolation issue where review_routes_test.py would leave the
event loop in a bad state for tests running after it.
2026-01-23 17:13:26 -06:00
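
A minimal sketch of the teardown pattern this commit describes, using pytest-asyncio; the dummy class stands in for ExecutionManager/AgentServer-style services, whose real start/stop APIs are not shown here:

```python
import asyncio

import pytest_asyncio


class _DummyService:
    """Stand-in for ExecutionManager/AgentServer-style background services."""

    async def start(self) -> None:
        pass

    async def stop(self) -> None:
        pass


@pytest_asyncio.fixture
async def background_services():
    services = [_DummyService(), _DummyService()]
    for svc in services:
        await svc.start()
    try:
        yield services
    finally:
        for svc in reversed(services):
            await svc.stop()
        # Give the services time to close their event loops fully before
        # the next test starts; avoids sporadic "Event loop is closed"
        # errors in CI.
        await asyncio.sleep(0.5)
```
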
Zamil Majdy
e95a9273e6 fix(frontend): simplify review UI - remove exclude button and tone down styling
- Remove confusing Exclude/Include button functionality
- Remove blue border/background from auto-approve toggle for cleaner look
- Toggle now shows by default with subtle gray text
- Disable editing when auto-approve is enabled (not when excluded)
- Remove unused rejection message functionality
- Simpler UX: just approve/reject reviews directly
2026-01-22 23:28:43 -05:00
Zamil Majdy
abd4a2b0b3 fix(frontend): restore auto-approve toggle in review UI
The auto-approve toggle was hidden because the onToggleDisabled handler was not
passed to PendingReviewCard, which triggered showSimplified mode. This commit adds:
- disabledReviewsMap state to track excluded reviews
- handleToggleDisabled handler to toggle review exclusion
- Proper filtering in processReviews to handle disabled reviews
2026-01-22 23:24:27 -05:00
Zamil Majdy
3a96455db5 fix(backend): add ExecutionStatus import to utils_test.py
Fixes test failures caused by the race-condition fix. The status attribute
added to the mocks requires importing the ExecutionStatus enum.
2026-01-22 23:07:48 -05:00
Zamil Majdy
4c68b87e00 fix(backend): Add explicit None check for matching_review to satisfy Pyright
Add safety check before accessing matching_review.graph_exec_id to prevent
Pyright reportOptionalMemberAccess error. This check should never be triggered
in practice due to validation logic, but satisfies static type checking.

Fixes Pyright error: reportOptionalMemberAccess on line 164
2026-01-22 22:49:36 -05:00
Zamil Majdy
f4296f8764 fix(backend): Mock get_user_by_id in tests to prevent database access
Add get_user_by_id mocks to three review route tests that trigger the execution resume path.
Without these mocks, the tests attempt database access via Prisma when get_user_by_id is
called to fetch user.timezone, making tests non-deterministic and potentially causing failures.

Tests fixed:
- test_process_review_action_auto_approve_creates_auto_approval_records
- test_process_review_action_without_auto_approve_still_loads_settings
- test_process_review_action_auto_approve_only_applies_to_approved_reviews

Fixes CodeRabbit comment: 2719326546
2026-01-22 22:47:13 -05:00
Zamil Majdy
c0547bb1ed fix(backend): Prevent race condition in concurrent review completion
- Update execution status to QUEUED before publishing to RabbitMQ to prevent duplicate queueing
- Verify status update succeeded before publishing message to queue
- Return early if status update fails (execution already queued by another request)
- Ensures only one concurrent request can successfully queue an execution

This prevents the race condition where two concurrent requests processing the final reviews
for the same execution could both publish to the queue if the queue publish happens before
the status update.

Fixes CodeRabbit comment: 2719411493
2026-01-22 22:38:25 -05:00
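
The ordering guarantee can be sketched as a compare-and-set on the status followed by the publish; the `db` and `queue` helpers below are hypothetical stand-ins, not the project's actual API, and `db.execute` is assumed to return the number of affected rows:

```python
async def resume_execution(db, queue, graph_exec_id: str) -> bool:
    """Queue an execution exactly once, even under concurrent requests."""
    # Atomic compare-and-set: flips REVIEW -> QUEUED only if no other
    # request got there first.
    updated = await db.execute(
        'UPDATE "AgentGraphExecution" SET status = \'QUEUED\' '
        "WHERE id = $1 AND status = 'REVIEW'",
        graph_exec_id,
    )
    if updated == 0:
        # Another request already queued this execution; do not publish.
        return False
    # Publish only after the status update succeeded, so at most one
    # concurrent request can enqueue the same execution.
    await queue.publish(graph_exec_id)
    return True
```
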
Zamil Majdy
a396155d63 fix(backend): Address CodeRabbit review comments - query scoping, stale payload, and execution validation
- Change get_pending_review_by_node_exec_id to use find_first with userId filter in query (prevents cross-tenant existence leak)
- Add input_data parameter to check_approval() to use current data for auto-approvals instead of stale stored payload
- Validate all reviews in a request belong to the same execution to prevent cross-execution review processing
- Update HITLReviewHelper to pass input_data to check_approval()

Fixes CodeRabbit comments: 2719424510, 2719424508, 2719424506
2026-01-22 22:35:42 -05:00
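
The query-scoping change can be illustrated with a Prisma-style lookup in which the user filter is part of the query itself, so a row owned by another tenant simply comes back as None; model and field names follow the commit text, but the exact client call is an assumption:

```python
from prisma.models import PendingHumanReview


async def get_pending_review_by_node_exec_id(
    node_exec_id: str, user_id: str
) -> PendingHumanReview | None:
    # Filtering on userId inside the query prevents a cross-tenant
    # existence leak: a mismatched user gets None, not a distinguishable
    # "exists but not yours" error.
    return await PendingHumanReview.prisma().find_first(
        where={
            "nodeExecId": node_exec_id,
            "userId": user_id,
            "status": "WAITING",
        }
    )
```
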
Zamil Majdy
4c47b4df76 style: fix prettier formatting in PendingReviewsList 2026-01-22 22:14:26 -05:00
Zamil Majdy
c46e563e6a fix(frontend): remove unused imports in PendingReviewsList 2026-01-22 22:01:22 -05:00
Zamil Majdy
a80f452ffe style: fix linter formatting issues 2026-01-22 21:39:07 -05:00
Zamil Majdy
fd970c800c feat(frontend): implement per-review auto-approval toggles
- Replace global auto_approve_future_actions with per-review auto_approve_future
- Add individual toggle for each review in PendingReviewCard
- Track per-review auto-approval state in autoApproveFutureMap
- Send auto_approve_future field with each review item
- Update UI to show per-review toggle with explanation
- Automatically reset data to original when auto-approve is enabled per review
2026-01-22 21:38:17 -05:00
Zamil Majdy
5fc1ec0ece fix(backend): add user-scoped validation to cancel_pending_reviews_for_execution
- Add user_id parameter to validate ownership before cancelling reviews
- Update call site in executor/utils.py to pass user_id
- Update all test assertions to expect user_id parameter
- Prevents cross-tenant cancellation if graph_exec_id is misused
2026-01-22 21:29:00 -05:00
Zamil Majdy
9be3ec58ae fix(backend): fix pagination bug in review lookup and implement per-review auto-approval
- Add get_pending_review_by_node_exec_id() for direct review lookup
- Replace paginated search with direct lookup to avoid missing reviews beyond page 1
- Implement per-review auto_approve_future toggle for granular control
- Fix log deduplication for embedding generation warnings
- Remove unnecessary f-string prefixes per linter feedback
- Fix all test mocks to use correct functions (get_pending_reviews_for_user vs get_pending_review_by_node_exec_id)
- All 15 review route tests passing
2026-01-22 21:24:15 -05:00
Zamil Majdy
e6ca904326 style: remove unnecessary f-string prefixes
- Remove f-string prefix from strings without placeholders
- Fixes Ruff F541 linter warning
- Addresses CodeRabbit comment 2719299451
2026-01-22 21:13:24 -05:00
Zamil Majdy
dbb56fa7aa feat: add per-review auto-approval toggle for granular control
- Change from global auto_approve_future_actions to per-review auto_approve_future flag
- Each review item can now individually specify auto-approval
- Better UX: users can auto-approve some actions but not others
- Example: auto-approve file reads but not file writes
- Backward compatible: auto_approve_future defaults to False
- Add test for per-review granularity
- Update all existing tests to use new structure
2026-01-22 21:01:35 -05:00
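
One plausible shape of the request schema after this change, sketched with Pydantic; the field names other than auto_approve_future are assumptions:

```python
from pydantic import BaseModel


class ReviewItem(BaseModel):
    node_exec_id: str
    approved: bool
    payload: dict | None = None
    # Per-review flag replacing the global auto_approve_future_actions.
    # Defaults to False, so clients that omit it keep the old behavior.
    auto_approve_future: bool = False


class ReviewRequest(BaseModel):
    reviews: list[ReviewItem]
```
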
Zamil Majdy
0111820f61 feat: deduplicate embedding generation failure warnings
- Use log_once_per_task for embedding generation failures
- Prevents log spam when API key is missing
- Now shows single warning per task instead of per-file warnings
- Makes logs more readable and actionable
2026-01-22 20:51:23 -05:00
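
A sketch of what a task-scoped deduplication helper like log_once_per_task might look like, built on contextvars (as the later contextvars commit describes); the implementation details are assumptions:

```python
import contextvars
import logging

logger = logging.getLogger(__name__)

# Message keys already logged in the current context. The set is created
# lazily inside the running task, so deduplication is scoped to the task
# rather than the whole process.
_logged_keys: contextvars.ContextVar[set | None] = contextvars.ContextVar(
    "logged_keys", default=None
)


def log_once_per_task(key: str, message: str) -> None:
    """Emit `message` as a warning only once per task for a given key."""
    seen = _logged_keys.get()
    if seen is None:
        seen = set()
        _logged_keys.set(seen)
    if key not in seen:
        seen.add(key)
        logger.warning(message)
```

Calling this in a per-file loop then logs only the first occurrence, e.g. `log_once_per_task("no-embedding-key", "openai_internal_api_key not set")`.
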
Zamil Majdy
1a1b1aa26d fix: add user ownership validation to create_auto_approval_record
- Add defense-in-depth check that graph_exec_id belongs to user_id
- Validates ownership before creating auto-approval records
- Prevents potential misuse if function called from other contexts
- Addresses CodeRabbit security concern (comment 2718990979)
2026-01-22 20:43:47 -05:00
Zamil Majdy
614ed8cf82 fix(backend/hitl): preserve user timezone when resuming execution from review
- Add user_timezone to ExecutionContext when resuming after review approval
- Fetch user to get timezone preference, defaulting to UTC if not set
- Make error deduplication more general using contextvars
- Replace global flag with log_once_per_task() helper for task-scoped logging
- Prevents log spam when processing batches (embeddings, etc.)

Addresses CodeRabbit comment about ExecutionContext not being exhaustive.
2026-01-22 19:59:45 -05:00
Zamil Majdy
edd4c96aa6 fix: remove accidentally committed supabase submodule 2026-01-22 19:48:22 -05:00
Zamil Majdy
cd231e2d69 fix(backend/tests): fix event loop issues in review route tests
- Convert module-level TestClient to fixture to avoid event loop conflicts
- Add missing mock for get_pending_reviews_for_user in all tests
- Add client parameter to all test functions that use the test client
- Add missing mocks for get_graph_execution_meta in several tests
- Remove asyncio.gather to avoid event loop binding issues
- Process auto-approval creation sequentially with try/except for safety

All 14 review route tests now pass successfully.
2026-01-22 19:24:07 -05:00
Zamil Majdy
399c472623 fix(backend/store): deduplicate missing API key error logs
Only log "openai_internal_api_key not set" error once per process instead
of on every embedding generation attempt. Reduces log spam when processing
batch operations without an API key configured.
2026-01-22 19:09:35 -05:00
Zamil Majdy
554e2beddf fix(backend/hitl): address CodeRabbit review feedback
- Use return_exceptions=True in asyncio.gather for auto-approval creation
  to prevent endpoint failure when auto-approval fails (reviews already processed)
- Fix empty payload handling: use explicit None check instead of truthiness
- Distinguish auto-approvals from normal approvals: auto-approvals always
  use current input_data, normal approvals preserve explicitly empty payloads
2026-01-22 19:08:14 -05:00
Zamil Majdy
29fdda3fa8 test(backend/executor): add tests for stop_graph_execution with REVIEW status
- Test cancellation of pending reviews when stopping execution in REVIEW status
- Test database manager pattern when Prisma is disconnected
- Test cascading stop to children with pending reviews
- Fix mock to simulate status transition from RUNNING to TERMINATED

Covers the bug fixes in stop_graph_execution() that handle:
1. Immediate termination of REVIEW status executions
2. Cleanup of pending reviews when stopping
3. Recursive cleanup of subagent reviews via cascade
2026-01-22 18:59:20 -05:00
Zamil Majdy
67e6a8841c fix(executor): Handle REVIEW status when stopping graph executions
Critical bug fix: stopping a graph in REVIEW status caused timeouts and orphaned reviews.

## Bugs Fixed

### 1. REVIEW Status Not Handled
Before:
- stop_graph_execution() only handled QUEUED, INCOMPLETE, RUNNING, COMPLETED, FAILED
- REVIEW status → waited 15 seconds → TimeoutError
- Graph remained stuck in REVIEW status

After:
- REVIEW status treated like QUEUED/INCOMPLETE (terminate immediately)
- No need to wait for executor since execution is paused
- Clean termination without timeouts

### 2. Orphaned Pending Reviews
Before:
- Stopping graph → status = TERMINATED
- Pending reviews remained in WAITING status
- User saw reviews for terminated execution in UI
- Could not approve/reject (backend validation rejects)
- Reviews stuck until manual cleanup

After:
- When stopping REVIEW execution, clean up pending reviews
- Mark all WAITING reviews as REJECTED
- reviewMessage: 'Execution was stopped by user'
- processed: true, reviewedAt: now()
- No orphaned reviews in UI

### 3. Subagent Reviews
Before:
- Parent graph with child (subagent) executions
- Child paused for HITL review
- Stop parent → recursively stops child
- Child reviews orphaned (same bugs as above)

After:
- Cascade stop properly handles child REVIEW status
- All child reviews cleaned up recursively
- Clean shutdown of entire execution tree

## Implementation

Changes to stop_graph_execution():
1. Added ExecutionStatus.REVIEW to immediate termination list
2. Check if status == REVIEW before marking TERMINATED
3. Update all WAITING reviews to REJECTED with message
4. Log cleanup for debugging
5. Then terminate execution normally

Cascade behavior preserved:
- Still recursively stops all child executions
- Each child's reviews cleaned up individually
- Parent waits for all children to complete cleanup
2026-01-22 18:27:08 -05:00
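
A condensed sketch of the control flow described above; the db helper methods are hypothetical placeholders for the real data-layer calls:

```python
async def stop_graph_execution(db, graph_exec_id: str) -> None:
    execution = await db.get_graph_execution(graph_exec_id)

    if execution.status == "REVIEW":
        # Execution is paused, so there is no running executor to wait on.
        # Reject any WAITING reviews first so none are orphaned in the UI.
        await db.reject_waiting_reviews(
            graph_exec_id, message="Execution was stopped by user"
        )

    # REVIEW joins QUEUED/INCOMPLETE in the immediate-termination path.
    await db.set_execution_status(graph_exec_id, "TERMINATED")

    # Cascade: stop children the same way, cleaning up subagent reviews.
    for child_id in await db.get_child_execution_ids(graph_exec_id):
        await stop_graph_execution(db, child_id)
```
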
Zamil Majdy
aea97db485 feat(frontend): Hide pending reviews panel while execution is RUNNING/QUEUED
Defense in depth: prevent users from seeing/clicking review panel before
execution pauses for review.

Before:
- Reviews panel could show while execution is RUNNING
- User could click to open panel and see pending reviews
- Confusing UX: why are reviews shown if graph hasn't paused yet?
- Could lead to frustration when backend rejects the approval attempt

After:
- Panel hidden if execution status is RUNNING or QUEUED
- Panel only shows when status is REVIEW (paused for review)
- Clear UX: reviews appear only when execution needs user input

Benefits:
1. **Better UX**: No confusion about when to approve reviews
2. **Prevents invalid attempts**: User can't try to approve while running
3. **Works with backend validation**: Frontend hides, backend rejects
4. **Clear state**: Panel visibility directly matches execution state

Changes:
- Added status check: hide if RUNNING or QUEUED
- Panel shows only when execution has paused (REVIEW/INCOMPLETE)
- Existing polling logic still works for real-time updates
2026-01-22 18:22:33 -05:00
Zamil Majdy
71a6969bbd feat(hitl): Add backend validation to prevent review processing during RUNNING/QUEUED status
Defense in depth: validate execution status before processing reviews.

Before:
- Reviews could be processed regardless of execution status
- Could cause race conditions and deadlocks
- User confusion when reviews processed but execution still running

After:
- Reject review processing with 409 Conflict if status is not REVIEW/INCOMPLETE
- Only allow processing when execution is actually paused for review
- Clear error message explaining why the request was rejected

Benefits:
1. **Prevention over cure**: Stop invalid requests before processing
2. **Clear semantics**: Reviews can only be processed when execution paused
3. **Better UX**: User gets immediate feedback if they try to approve too early
4. **Simpler resume logic**: No need for complex status checks since we validate upfront

Changes:
- Fetch graph execution metadata early in the endpoint
- Validate status is REVIEW or INCOMPLETE before processing
- Removed redundant status checks in resume logic (already validated)
- Simplified resume flow: just check if pending reviews remain
- Fixed comment: 'all pending reviews' not 'some reviews'
2026-01-22 18:22:21 -05:00
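
A sketch of the upfront guard, assuming a FastAPI endpoint; the allowed statuses follow the commit text, the function name is invented:

```python
from fastapi import HTTPException

REVIEWABLE_STATUSES = {"REVIEW", "INCOMPLETE"}


def ensure_execution_paused_for_review(status: str) -> None:
    """Reject review processing unless the execution is paused for review."""
    if status not in REVIEWABLE_STATUSES:
        raise HTTPException(
            status_code=409,
            detail=(
                f"Cannot process reviews while execution status is {status}; "
                "wait until the execution pauses for review."
            ),
        )
```

The endpoint fetches the graph execution metadata first, calls this guard, and only then touches the reviews, which is what lets the resume logic drop its redundant status checks.
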
Zamil Majdy
e4c3f9995b feat(frontend): Change safety popup to per-agent instead of global
Changed AI_AGENT_SAFETY_POPUP_SHOWN from a boolean flag to an array of
agent IDs. This ensures users see the safety popup once per unique agent
instead of once globally.

Why this is better:
- Different agents have different capabilities (sensitive actions, HITL blocks)
- User should be aware of what THIS specific agent can do
- Not too annoying since it's still only once per agent, not every run
- Better safety awareness when switching between safe and risky agents

Changes:
- Store array of seen agent IDs in localStorage instead of single boolean
- Pass agentId to useAIAgentSafetyPopup hook and AIAgentSafetyPopup component
- Check if current agent ID is in the seen list before showing popup
- Add agent ID to list when user acknowledges popup

Testing:
- Clear localStorage or remove specific agent ID from array to re-trigger popup
- Each unique agent shows popup on first run only
2026-01-22 18:13:33 -05:00
Zamil Majdy
3b58684abc fix(hitl): Prevent review deadlock by resuming regardless of execution status
When users approve/reject reviews but the execution status is not REVIEW
(due to race conditions or bugs), the reviews get marked as processed but
execution never resumes, leaving the graph stuck forever.

This fix ensures that:
- If no pending reviews remain after processing, we ALWAYS attempt to resume
- Only skip if status is COMPLETED or FAILED (already finished)
- Log warning if status is unexpected (not REVIEW) but still resume to prevent deadlock
- Prevents scenario where user has nothing to do (reviews processed) but graph never completes

Example deadlock scenario (now prevented):
1. Graph creates review, sets status to REVIEW
2. User approves review → marked as APPROVED
3. Status check finds unexpected state (not REVIEW)
4. OLD: Return without resuming → graph stuck forever
5. NEW: Log warning and resume anyway → graph completes
2026-01-22 18:13:18 -05:00
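
The resume decision can be summarized in a few lines; the statuses come from the commit message, while the db/queue helpers are hypothetical:

```python
import logging

logger = logging.getLogger(__name__)

FINISHED = {"COMPLETED", "FAILED"}


async def resume_if_done_reviewing(db, queue, graph_exec_id: str) -> None:
    if await db.count_pending_reviews(graph_exec_id) > 0:
        return  # reviews still outstanding; nothing to resume yet

    status = await db.get_execution_status(graph_exec_id)
    if status in FINISHED:
        return  # already finished; resuming would be meaningless

    if status != "REVIEW":
        # Unexpected state: warn, but resume anyway so the graph cannot
        # deadlock with all reviews processed and nothing left to do.
        logger.warning(
            "Resuming execution %s despite unexpected status %s",
            graph_exec_id,
            status,
        )
    await queue.publish(graph_exec_id)
```
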
Zamil Majdy
e8d44a62fd refactor(hitl): Add user_id validation and code quality improvements
- Add user_id parameter to check_approval for data isolation consistency
- Fix message text: 'block' → 'node' in auto-approval message
- Use walrus operator for cleaner approval_result check
- Move imports to top-level in test file (avoid local imports)
- Remove obvious comments (Check if pending, Resume execution, Load settings)
2026-01-22 18:04:03 -05:00
Zamil Majdy
be024da2a8 fix(hitl): Prevent review race condition by checking execution status
Fixed race condition where user approves reviews while graph execution
is still RUNNING, which could queue the execution twice and cause
duplicate/conflicting execution instances.

Solution:
- Check graph execution status BEFORE resuming
- Only resume if status is REVIEW (execution paused for review)
- Skip resumption if RUNNING (will naturally pick up approved reviews)
- Skip if COMPLETED/other (already finished)

This ensures we never queue an execution that's already running,
while still allowing the running execution to pick up approved
reviews naturally.

Added tests:
- All review action tests now mock get_graph_execution_meta
- Tests verify execution only resumes when status is REVIEW
2026-01-22 17:48:24 -05:00
Zamil Majdy
0df917e243 fix(hitl): Expose check_approval through database manager client
Fixed "Client is not connected to the query engine" error when
check_approval is called from block execution context. The function
is now accessed through the database manager async client (RPC),
similar to other HITL methods like get_or_create_human_review.

Changes:
- Add check_approval to DatabaseManager and DatabaseManagerAsyncClient
- Update HITLReviewHelper to call check_approval via database client
- Remove direct import of check_approval in review.py
2026-01-22 17:33:52 -05:00
Zamil Majdy
8688805a8c refactor(hitl): Consolidate check_auto_approval into check_approval
Merge auto-approval check and normal approval check into a single
function using find_first with OR condition. This reduces database
queries by checking both the node_exec_id and auto_approve_key in
one query.
2026-01-22 16:55:12 -05:00
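
In Prisma-style syntax the consolidated lookup reads roughly as below; the where-clause shape is an assumption based on the commit description:

```python
from prisma.models import PendingHumanReview


async def check_approval(
    user_id: str, node_exec_id: str, auto_approve_key: str
) -> PendingHumanReview | None:
    # One find_first covers both cases that used to take separate queries:
    # a direct approval of this node execution, or a stored auto-approval
    # record keyed by the synthetic auto_approve_key.
    return await PendingHumanReview.prisma().find_first(
        where={
            "userId": user_id,
            "status": "APPROVED",
            "OR": [
                {"nodeExecId": node_exec_id},
                {"nodeExecId": auto_approve_key},
            ],
        }
    )
```
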
Zamil Majdy
9bdda7dab0 cleanup 2026-01-22 16:23:40 -05:00
Zamil Majdy
7d377aabaa fix(db): Remove useless prefix 2026-01-22 16:00:09 -05:00
Zamil Majdy
dfd7c64068 feat(backend): Implement node-specific auto-approval using key pattern
- Add auto-approval via special nodeExecId key pattern (auto_approve_{graph_exec_id}_{node_id})
- Create auto-approval records in PendingHumanReview when user approves with auto-approve flag
- Check for existing auto-approval before requiring human review
- Remove node_id parameter from get_or_create_human_review
- Load graph settings properly when resuming execution after review
2026-01-21 22:21:00 -05:00
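
The key pattern itself is simple enough to pin down in one helper (the format string comes straight from the commit message above):

```python
def make_auto_approve_key(graph_exec_id: str, node_id: str) -> str:
    """Synthetic nodeExecId under which an auto-approval record is stored.

    Scoped to one node within one execution run, so approving "all future
    actions" for a node never leaks into other nodes or other runs.
    """
    return f"auto_approve_{graph_exec_id}_{node_id}"
```
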
Zamil Majdy
02089bc047 fix(frontend): Add polling for pending reviews badge to update in real-time
- Add refetchInterval to execution details query to poll while running/review
- Add polling support to usePendingReviewsForExecution hook
- Poll pending reviews every 2 seconds when execution is in REVIEW status
- This ensures the "X Reviews Pending" badge updates without page refresh
2026-01-21 21:08:10 -05:00
Zamil Majdy
bed7b356bb fix(frontend): Reset card data when auto-approve toggle changes
Include autoApproveFuture in the key prop to force PendingReviewCard
to remount when the toggle changes, which resets its internal state
to the original payload data.
2026-01-21 21:04:56 -05:00
Zamil Majdy
4efc0ff502 fix(migration): Correct migration to only drop FK constraint, not non-existent column
The nodeId column was never added to PendingHumanReview. The migration
should only drop the foreign key constraint linking nodeExecId to
AgentNodeExecution, not try to drop a column that doesn't exist.
2026-01-21 20:13:41 -05:00
Zamil Majdy
4ad0528257 feat(hitl): Simplify auto-approval with toggle UX and remove node_id storage
- Remove nodeId column from PendingHumanReview schema (use in-memory tracking)
- Remove foreign key relation from PendingHumanReview to AgentNodeExecution
- Use ExecutionContext.auto_approved_node_ids for auto-approval tracking
- Add auto-approve toggle in frontend (default off)
- When toggle enabled: disable editing and use original data
- Backend looks up agentNodeId from AgentNodeExecution when auto-approving
- Update tests to reflect schema changes
2026-01-21 19:57:11 -05:00
Zamil Majdy
2f440ee80a Merge branch 'dev' into feat/sensitive-action-features 2026-01-21 19:08:32 -05:00
Zamil Majdy
2a55923ec0 Merge dev to get GraphSettings fix 2026-01-21 09:31:17 -05:00
Zamil Majdy
ad50f57a2b chore: add migration for nodeId field in PendingHumanReview
Adds a database migration for the nodeId column, which records the
graph-definition node ID used for auto-approval tracking.
2026-01-20 23:03:03 -05:00
Zamil Majdy
aebd961ef5 fix: implement node-specific auto-approval for human reviews
Instead of disabling all safe modes when approving all future actions,
now tracks specific node IDs that should be auto-approved. This means
clicking "Approve all future actions" will only auto-approve future
reviews from the same blocks, not all reviews.

Changes:
- Add nodeId field to PendingHumanReview schema
- Add auto_approved_node_ids set to ExecutionContext
- Update review helper to check auto_approved_node_ids
- Change API from disable_future_reviews to auto_approve_node_ids
- Update frontend to pass node_ids when bulk approving
- Address PR feedback: remove barrel file, JSDoc comments, and cleanup
2026-01-20 22:15:51 -05:00
Zamil Majdy
bcccaa16cc fix: remove unused props from AIAgentSafetyPopup component
Removes hasSensitiveAction and hasHumanInTheLoop props that were only
used by the hook, not the component itself, fixing an ESLint unused-vars error.
2026-01-20 21:05:39 -05:00
Zamil Majdy
d5ddc41b18 feat: add bulk approval option for human reviews
Add "Approve all future actions" button to the review UI that:
- Approves all current pending reviews
- Disables safe mode for the remainder of the execution run
- Shows helper text about turning auto-approval on/off in settings

Backend changes:
- Add disable_future_reviews flag to ReviewRequest model
- Pass ExecutionContext with disabled safe modes when resuming

Frontend changes:
- Add "Approve all future actions" button to PendingReviewsList
- Include helper text per PRD requirements

Implements SECRT-1795
2026-01-20 20:45:50 -05:00
Zamil Majdy
95eab5b7eb feat: add one-time safety popup for AI-generated agent runs
Show a one-time safety popup the first time a user runs an agent with
sensitive actions or human-in-the-loop blocks. The popup explains that
agents may take real-world actions and that safety checks are enabled.

- Add AI_AGENT_SAFETY_POPUP_SHOWN localStorage key
- Create AIAgentSafetyPopup component with hook
- Integrate popup into RunAgentModal before first run

Implements SECRT-1798
2026-01-20 20:40:18 -05:00
Zamil Majdy
832d6e1696 fix: correct safe mode checks for sensitive action blocks
- Add skip_safe_mode_check parameter to HITLReviewHelper to avoid
  checking the wrong safe mode flag for sensitive action blocks
- Simplify SafeModeToggle and FloatingSafeModeToggle by removing
  unnecessary intermediate variables and isHITLStateUndetermined checks
2026-01-20 20:33:55 -05:00
40 changed files with 3617 additions and 1301 deletions

View File

@@ -1,28 +1,29 @@
"""Agent generator package - Creates agents from natural language."""
from .core import (
AgentGeneratorNotConfiguredError,
apply_agent_patch,
decompose_goal,
generate_agent,
generate_agent_patch,
get_agent_as_json,
json_to_graph,
save_agent_to_library,
)
from .service import health_check as check_external_service_health
from .service import is_external_service_configured
from .fixer import apply_all_fixes
from .utils import get_blocks_info
from .validator import validate_agent
__all__ = [
# Core functions
"decompose_goal",
"generate_agent",
"generate_agent_patch",
"apply_agent_patch",
"save_agent_to_library",
"get_agent_as_json",
"json_to_graph",
# Exceptions
"AgentGeneratorNotConfiguredError",
# Service
"is_external_service_configured",
"check_external_service_health",
# Fixer
"apply_all_fixes",
# Validator
"validate_agent",
# Utils
"get_blocks_info",
]

View File

@@ -0,0 +1,25 @@
"""OpenRouter client configuration for agent generation."""
import os
from openai import AsyncOpenAI
# Configuration - use OPEN_ROUTER_API_KEY for consistency with chat/config.py
OPENROUTER_API_KEY = os.getenv("OPEN_ROUTER_API_KEY")
AGENT_GENERATOR_MODEL = os.getenv("AGENT_GENERATOR_MODEL", "anthropic/claude-opus-4.5")
# OpenRouter client (OpenAI-compatible API)
_client: AsyncOpenAI | None = None
def get_client() -> AsyncOpenAI:
"""Get or create the OpenRouter client."""
global _client
if _client is None:
if not OPENROUTER_API_KEY:
raise ValueError("OPENROUTER_API_KEY environment variable is required")
_client = AsyncOpenAI(
base_url="https://openrouter.ai/api/v1",
api_key=OPENROUTER_API_KEY,
)
return _client

View File

@@ -1,5 +1,7 @@
"""Core agent generation functions."""
import copy
import json
import logging
import uuid
from typing import Any
@@ -7,35 +9,13 @@ from typing import Any
from backend.api.features.library import db as library_db
from backend.data.graph import Graph, Link, Node, create_graph
from .service import (
decompose_goal_external,
generate_agent_external,
generate_agent_patch_external,
is_external_service_configured,
)
from .client import AGENT_GENERATOR_MODEL, get_client
from .prompts import DECOMPOSITION_PROMPT, GENERATION_PROMPT, PATCH_PROMPT
from .utils import get_block_summaries, parse_json_from_llm
logger = logging.getLogger(__name__)
class AgentGeneratorNotConfiguredError(Exception):
"""Raised when the external Agent Generator service is not configured."""
pass
def _check_service_configured() -> None:
"""Check if the external Agent Generator service is configured.
Raises:
AgentGeneratorNotConfiguredError: If the service is not configured.
"""
if not is_external_service_configured():
raise AgentGeneratorNotConfiguredError(
"Agent Generator service is not configured. "
"Set AGENTGENERATOR_HOST environment variable to enable agent generation."
)
async def decompose_goal(description: str, context: str = "") -> dict[str, Any] | None:
"""Break down a goal into steps or return clarifying questions.
@@ -48,13 +28,40 @@ async def decompose_goal(description: str, context: str = "") -> dict[str, Any]
- {"type": "clarifying_questions", "questions": [...]}
- {"type": "instructions", "steps": [...]}
Or None on error
Raises:
AgentGeneratorNotConfiguredError: If the external service is not configured.
"""
_check_service_configured()
logger.info("Calling external Agent Generator service for decompose_goal")
return await decompose_goal_external(description, context)
client = get_client()
prompt = DECOMPOSITION_PROMPT.format(block_summaries=get_block_summaries())
full_description = description
if context:
full_description = f"{description}\n\nAdditional context:\n{context}"
try:
response = await client.chat.completions.create(
model=AGENT_GENERATOR_MODEL,
messages=[
{"role": "system", "content": prompt},
{"role": "user", "content": full_description},
],
temperature=0,
)
content = response.choices[0].message.content
if content is None:
logger.error("LLM returned empty content for decomposition")
return None
result = parse_json_from_llm(content)
if result is None:
logger.error(f"Failed to parse decomposition response: {content[:200]}")
return None
return result
except Exception as e:
logger.error(f"Error decomposing goal: {e}")
return None
async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
@@ -65,14 +72,31 @@ async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
Returns:
Agent JSON dict or None on error
Raises:
AgentGeneratorNotConfiguredError: If the external service is not configured.
"""
_check_service_configured()
logger.info("Calling external Agent Generator service for generate_agent")
result = await generate_agent_external(instructions)
if result:
client = get_client()
prompt = GENERATION_PROMPT.format(block_summaries=get_block_summaries())
try:
response = await client.chat.completions.create(
model=AGENT_GENERATOR_MODEL,
messages=[
{"role": "system", "content": prompt},
{"role": "user", "content": json.dumps(instructions, indent=2)},
],
temperature=0,
)
content = response.choices[0].message.content
if content is None:
logger.error("LLM returned empty content for agent generation")
return None
result = parse_json_from_llm(content)
if result is None:
logger.error(f"Failed to parse agent JSON: {content[:200]}")
return None
# Ensure required fields
if "id" not in result:
result["id"] = str(uuid.uuid4())
@@ -80,7 +104,12 @@ async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
result["version"] = 1
if "is_active" not in result:
result["is_active"] = True
return result
return result
except Exception as e:
logger.error(f"Error generating agent: {e}")
return None
def json_to_graph(agent_json: dict[str, Any]) -> Graph:
@@ -255,23 +284,108 @@ async def get_agent_as_json(
async def generate_agent_patch(
update_request: str, current_agent: dict[str, Any]
) -> dict[str, Any] | None:
"""Update an existing agent using natural language.
The external Agent Generator service handles:
- Generating the patch
- Applying the patch
- Fixing and validating the result
"""Generate a patch to update an existing agent.
Args:
update_request: Natural language description of changes
current_agent: Current agent JSON
Returns:
Updated agent JSON, clarifying questions dict, or None on error
Raises:
AgentGeneratorNotConfiguredError: If the external service is not configured.
Patch dict or clarifying questions, or None on error
"""
_check_service_configured()
logger.info("Calling external Agent Generator service for generate_agent_patch")
return await generate_agent_patch_external(update_request, current_agent)
client = get_client()
prompt = PATCH_PROMPT.format(
current_agent=json.dumps(current_agent, indent=2),
block_summaries=get_block_summaries(),
)
try:
response = await client.chat.completions.create(
model=AGENT_GENERATOR_MODEL,
messages=[
{"role": "system", "content": prompt},
{"role": "user", "content": update_request},
],
temperature=0,
)
content = response.choices[0].message.content
if content is None:
logger.error("LLM returned empty content for patch generation")
return None
return parse_json_from_llm(content)
except Exception as e:
logger.error(f"Error generating patch: {e}")
return None
def apply_agent_patch(
current_agent: dict[str, Any], patch: dict[str, Any]
) -> dict[str, Any]:
"""Apply a patch to an existing agent.
Args:
current_agent: Current agent JSON
patch: Patch dict with operations
Returns:
Updated agent JSON
"""
agent = copy.deepcopy(current_agent)
patches = patch.get("patches", [])
for p in patches:
patch_type = p.get("type")
if patch_type == "modify":
node_id = p.get("node_id")
changes = p.get("changes", {})
for node in agent.get("nodes", []):
if node["id"] == node_id:
_deep_update(node, changes)
logger.debug(f"Modified node {node_id}")
break
elif patch_type == "add":
new_nodes = p.get("new_nodes", [])
new_links = p.get("new_links", [])
agent["nodes"] = agent.get("nodes", []) + new_nodes
agent["links"] = agent.get("links", []) + new_links
logger.debug(f"Added {len(new_nodes)} nodes, {len(new_links)} links")
elif patch_type == "remove":
node_ids_to_remove = set(p.get("node_ids", []))
link_ids_to_remove = set(p.get("link_ids", []))
# Remove nodes
agent["nodes"] = [
n for n in agent.get("nodes", []) if n["id"] not in node_ids_to_remove
]
# Remove links (both explicit and those referencing removed nodes)
agent["links"] = [
link
for link in agent.get("links", [])
if link["id"] not in link_ids_to_remove
and link["source_id"] not in node_ids_to_remove
and link["sink_id"] not in node_ids_to_remove
]
logger.debug(
f"Removed {len(node_ids_to_remove)} nodes, {len(link_ids_to_remove)} links"
)
return agent
def _deep_update(target: dict, source: dict) -> None:
"""Recursively update a dict with another dict."""
for key, value in source.items():
if key in target and isinstance(target[key], dict) and isinstance(value, dict):
_deep_update(target[key], value)
else:
target[key] = value

View File

@@ -0,0 +1,606 @@
"""Agent fixer - Fixes common LLM generation errors."""
import logging
import re
import uuid
from typing import Any
from .utils import (
ADDTODICTIONARY_BLOCK_ID,
ADDTOLIST_BLOCK_ID,
CODE_EXECUTION_BLOCK_ID,
CONDITION_BLOCK_ID,
CREATEDICT_BLOCK_ID,
CREATELIST_BLOCK_ID,
DATA_SAMPLING_BLOCK_ID,
DOUBLE_CURLY_BRACES_BLOCK_IDS,
GET_CURRENT_DATE_BLOCK_ID,
STORE_VALUE_BLOCK_ID,
UNIVERSAL_TYPE_CONVERTER_BLOCK_ID,
get_blocks_info,
is_valid_uuid,
)
logger = logging.getLogger(__name__)
def fix_agent_ids(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix invalid UUIDs in agent and link IDs."""
# Fix agent ID
if not is_valid_uuid(agent.get("id", "")):
agent["id"] = str(uuid.uuid4())
logger.debug(f"Fixed agent ID: {agent['id']}")
# Fix node IDs
id_mapping = {} # Old ID -> New ID
for node in agent.get("nodes", []):
if not is_valid_uuid(node.get("id", "")):
old_id = node.get("id", "")
new_id = str(uuid.uuid4())
id_mapping[old_id] = new_id
node["id"] = new_id
logger.debug(f"Fixed node ID: {old_id} -> {new_id}")
# Fix link IDs and update references
for link in agent.get("links", []):
if not is_valid_uuid(link.get("id", "")):
link["id"] = str(uuid.uuid4())
logger.debug(f"Fixed link ID: {link['id']}")
# Update source/sink IDs if they were remapped
if link.get("source_id") in id_mapping:
link["source_id"] = id_mapping[link["source_id"]]
if link.get("sink_id") in id_mapping:
link["sink_id"] = id_mapping[link["sink_id"]]
return agent
def fix_double_curly_braces(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix single curly braces to double in template blocks."""
for node in agent.get("nodes", []):
if node.get("block_id") not in DOUBLE_CURLY_BRACES_BLOCK_IDS:
continue
input_data = node.get("input_default", {})
for key in ("prompt", "format"):
if key in input_data and isinstance(input_data[key], str):
original = input_data[key]
# Fix simple variable references: {var} -> {{var}}
fixed = re.sub(
r"(?<!\{)\{([a-zA-Z_][a-zA-Z0-9_]*)\}(?!\})",
r"{{\1}}",
original,
)
if fixed != original:
input_data[key] = fixed
logger.debug(f"Fixed curly braces in {key}")
return agent
def fix_storevalue_before_condition(agent: dict[str, Any]) -> dict[str, Any]:
"""Add StoreValueBlock before ConditionBlock if needed for value2."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
# Find all ConditionBlock nodes
condition_node_ids = {
node["id"] for node in nodes if node.get("block_id") == CONDITION_BLOCK_ID
}
if not condition_node_ids:
return agent
new_nodes = []
new_links = []
processed_conditions = set()
for link in links:
sink_id = link.get("sink_id")
sink_name = link.get("sink_name")
# Check if this link goes to a ConditionBlock's value2
if sink_id in condition_node_ids and sink_name == "value2":
source_node = next(
(n for n in nodes if n["id"] == link.get("source_id")), None
)
# Skip if source is already a StoreValueBlock
if source_node and source_node.get("block_id") == STORE_VALUE_BLOCK_ID:
continue
# Skip if we already processed this condition
if sink_id in processed_conditions:
continue
processed_conditions.add(sink_id)
# Create StoreValueBlock
store_node_id = str(uuid.uuid4())
store_node = {
"id": store_node_id,
"block_id": STORE_VALUE_BLOCK_ID,
"input_default": {"data": None},
"metadata": {"position": {"x": 0, "y": -100}},
}
new_nodes.append(store_node)
# Create link: original source -> StoreValueBlock
new_links.append(
{
"id": str(uuid.uuid4()),
"source_id": link["source_id"],
"source_name": link["source_name"],
"sink_id": store_node_id,
"sink_name": "input",
"is_static": False,
}
)
# Update original link: StoreValueBlock -> ConditionBlock
link["source_id"] = store_node_id
link["source_name"] = "output"
logger.debug(f"Added StoreValueBlock before ConditionBlock {sink_id}")
if new_nodes:
agent["nodes"] = nodes + new_nodes
return agent
def fix_addtolist_blocks(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix AddToList blocks by adding prerequisite empty AddToList block.
When an AddToList block is found:
1. Checks if there's a CreateListBlock before it
2. Removes CreateListBlock if linked directly to AddToList
3. Adds an empty AddToList block before the original
4. Ensures the original has a self-referencing link
"""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
new_nodes = []
original_addtolist_ids = set()
nodes_to_remove = set()
links_to_remove = []
# First pass: identify CreateListBlock nodes to remove
for link in links:
source_node = next(
(n for n in nodes if n.get("id") == link.get("source_id")), None
)
sink_node = next((n for n in nodes if n.get("id") == link.get("sink_id")), None)
if (
source_node
and sink_node
and source_node.get("block_id") == CREATELIST_BLOCK_ID
and sink_node.get("block_id") == ADDTOLIST_BLOCK_ID
):
nodes_to_remove.add(source_node.get("id"))
links_to_remove.append(link)
logger.debug(f"Removing CreateListBlock {source_node.get('id')}")
# Second pass: process AddToList blocks
filtered_nodes = []
for node in nodes:
if node.get("id") in nodes_to_remove:
continue
if node.get("block_id") == ADDTOLIST_BLOCK_ID:
original_addtolist_ids.add(node.get("id"))
node_id = node.get("id")
pos = node.get("metadata", {}).get("position", {"x": 0, "y": 0})
# Check if already has prerequisite
has_prereq = any(
link.get("sink_id") == node_id
and link.get("sink_name") == "list"
and link.get("source_name") == "updated_list"
for link in links
)
if not has_prereq:
# Remove links to "list" input (except self-reference)
for link in links:
if (
link.get("sink_id") == node_id
and link.get("sink_name") == "list"
and link.get("source_id") != node_id
and link not in links_to_remove
):
links_to_remove.append(link)
# Create prerequisite AddToList block
prereq_id = str(uuid.uuid4())
prereq_node = {
"id": prereq_id,
"block_id": ADDTOLIST_BLOCK_ID,
"input_default": {"list": [], "entry": None, "entries": []},
"metadata": {
"position": {"x": pos.get("x", 0) - 800, "y": pos.get("y", 0)}
},
}
new_nodes.append(prereq_node)
# Link prerequisite to original
links.append(
{
"id": str(uuid.uuid4()),
"source_id": prereq_id,
"source_name": "updated_list",
"sink_id": node_id,
"sink_name": "list",
"is_static": False,
}
)
logger.debug(f"Added prerequisite AddToList block for {node_id}")
filtered_nodes.append(node)
# Remove marked links
filtered_links = [link for link in links if link not in links_to_remove]
# Add self-referencing links for original AddToList blocks
for node in filtered_nodes + new_nodes:
if (
node.get("block_id") == ADDTOLIST_BLOCK_ID
and node.get("id") in original_addtolist_ids
):
node_id = node.get("id")
has_self_ref = any(
link["source_id"] == node_id
and link["sink_id"] == node_id
and link["source_name"] == "updated_list"
and link["sink_name"] == "list"
for link in filtered_links
)
if not has_self_ref:
filtered_links.append(
{
"id": str(uuid.uuid4()),
"source_id": node_id,
"source_name": "updated_list",
"sink_id": node_id,
"sink_name": "list",
"is_static": False,
}
)
logger.debug(f"Added self-reference for AddToList {node_id}")
agent["nodes"] = filtered_nodes + new_nodes
agent["links"] = filtered_links
return agent
def fix_addtodictionary_blocks(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix AddToDictionary blocks by removing empty CreateDictionary nodes."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
nodes_to_remove = set()
links_to_remove = []
for link in links:
source_node = next(
(n for n in nodes if n.get("id") == link.get("source_id")), None
)
sink_node = next((n for n in nodes if n.get("id") == link.get("sink_id")), None)
if (
source_node
and sink_node
and source_node.get("block_id") == CREATEDICT_BLOCK_ID
and sink_node.get("block_id") == ADDTODICTIONARY_BLOCK_ID
):
nodes_to_remove.add(source_node.get("id"))
links_to_remove.append(link)
logger.debug(f"Removing CreateDictionary {source_node.get('id')}")
agent["nodes"] = [n for n in nodes if n.get("id") not in nodes_to_remove]
agent["links"] = [link for link in links if link not in links_to_remove]
return agent
def fix_code_execution_output(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix CodeExecutionBlock output: change 'response' to 'stdout_logs'."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
for link in links:
source_node = next(
(n for n in nodes if n.get("id") == link.get("source_id")), None
)
if (
source_node
and source_node.get("block_id") == CODE_EXECUTION_BLOCK_ID
and link.get("source_name") == "response"
):
link["source_name"] = "stdout_logs"
logger.debug("Fixed CodeExecutionBlock output: response -> stdout_logs")
return agent
def fix_data_sampling_sample_size(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix DataSamplingBlock by setting sample_size to 1 as default."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
links_to_remove = []
for node in nodes:
if node.get("block_id") == DATA_SAMPLING_BLOCK_ID:
node_id = node.get("id")
input_default = node.get("input_default", {})
# Remove links to sample_size
for link in links:
if (
link.get("sink_id") == node_id
and link.get("sink_name") == "sample_size"
):
links_to_remove.append(link)
# Set default
input_default["sample_size"] = 1
node["input_default"] = input_default
logger.debug(f"Fixed DataSamplingBlock {node_id} sample_size to 1")
if links_to_remove:
agent["links"] = [link for link in links if link not in links_to_remove]
return agent
def fix_node_x_coordinates(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix node x-coordinates to ensure 800+ unit spacing between linked nodes."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
node_lookup = {n.get("id"): n for n in nodes}
for link in links:
source_id = link.get("source_id")
sink_id = link.get("sink_id")
source_node = node_lookup.get(source_id)
sink_node = node_lookup.get(sink_id)
if not source_node or not sink_node:
continue
source_pos = source_node.get("metadata", {}).get("position", {})
sink_pos = sink_node.get("metadata", {}).get("position", {})
source_x = source_pos.get("x", 0)
sink_x = sink_pos.get("x", 0)
if abs(sink_x - source_x) < 800:
new_x = source_x + 800
if "metadata" not in sink_node:
sink_node["metadata"] = {}
if "position" not in sink_node["metadata"]:
sink_node["metadata"]["position"] = {}
sink_node["metadata"]["position"]["x"] = new_x
logger.debug(f"Fixed node {sink_id} x: {sink_x} -> {new_x}")
return agent
def fix_getcurrentdate_offset(agent: dict[str, Any]) -> dict[str, Any]:
"""Fix GetCurrentDateBlock offset to ensure it's positive."""
for node in agent.get("nodes", []):
if node.get("block_id") == GET_CURRENT_DATE_BLOCK_ID:
input_default = node.get("input_default", {})
if "offset" in input_default:
offset = input_default["offset"]
if isinstance(offset, (int, float)) and offset < 0:
input_default["offset"] = abs(offset)
logger.debug(f"Fixed offset: {offset} -> {abs(offset)}")
return agent
def fix_ai_model_parameter(
agent: dict[str, Any],
blocks_info: list[dict[str, Any]],
default_model: str = "gpt-4o",
) -> dict[str, Any]:
"""Add default model parameter to AI blocks if missing."""
block_map = {b.get("id"): b for b in blocks_info}
for node in agent.get("nodes", []):
block_id = node.get("block_id")
block = block_map.get(block_id)
if not block:
continue
# Check if block has AI category
categories = block.get("categories", [])
is_ai_block = any(
cat.get("category") == "AI" for cat in categories if isinstance(cat, dict)
)
if is_ai_block:
input_default = node.get("input_default", {})
if "model" not in input_default:
input_default["model"] = default_model
node["input_default"] = input_default
logger.debug(
f"Added model '{default_model}' to AI block {node.get('id')}"
)
return agent
def fix_link_static_properties(
agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> dict[str, Any]:
"""Fix is_static property based on source block's staticOutput."""
block_map = {b.get("id"): b for b in blocks_info}
node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
for link in agent.get("links", []):
source_node = node_lookup.get(link.get("source_id"))
if not source_node:
continue
source_block = block_map.get(source_node.get("block_id"))
if not source_block:
continue
static_output = source_block.get("staticOutput", False)
if link.get("is_static") != static_output:
link["is_static"] = static_output
logger.debug(f"Fixed link {link.get('id')} is_static to {static_output}")
return agent
def fix_data_type_mismatch(
agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> dict[str, Any]:
"""Fix data type mismatches by inserting UniversalTypeConverterBlock."""
nodes = agent.get("nodes", [])
links = agent.get("links", [])
block_map = {b.get("id"): b for b in blocks_info}
node_lookup = {n.get("id"): n for n in nodes}
def get_property_type(schema: dict, name: str) -> str | None:
if "_#_" in name:
parent, child = name.split("_#_", 1)
parent_schema = schema.get(parent, {})
if "properties" in parent_schema:
return parent_schema["properties"].get(child, {}).get("type")
return None
return schema.get(name, {}).get("type")
def are_types_compatible(src: str, sink: str) -> bool:
if {src, sink} <= {"integer", "number"}:
return True
return src == sink
type_mapping = {
"string": "string",
"text": "string",
"integer": "number",
"number": "number",
"float": "number",
"boolean": "boolean",
"bool": "boolean",
"array": "list",
"list": "list",
"object": "dictionary",
"dict": "dictionary",
"dictionary": "dictionary",
}
new_links = []
nodes_to_add = []
for link in links:
source_node = node_lookup.get(link.get("source_id"))
sink_node = node_lookup.get(link.get("sink_id"))
if not source_node or not sink_node:
new_links.append(link)
continue
source_block = block_map.get(source_node.get("block_id"))
sink_block = block_map.get(sink_node.get("block_id"))
if not source_block or not sink_block:
new_links.append(link)
continue
source_outputs = source_block.get("outputSchema", {}).get("properties", {})
sink_inputs = sink_block.get("inputSchema", {}).get("properties", {})
source_type = get_property_type(source_outputs, link.get("source_name", ""))
sink_type = get_property_type(sink_inputs, link.get("sink_name", ""))
if (
source_type
and sink_type
and not are_types_compatible(source_type, sink_type)
):
# Insert type converter
converter_id = str(uuid.uuid4())
target_type = type_mapping.get(sink_type, sink_type)
converter_node = {
"id": converter_id,
"block_id": UNIVERSAL_TYPE_CONVERTER_BLOCK_ID,
"input_default": {"type": target_type},
"metadata": {"position": {"x": 0, "y": 100}},
}
nodes_to_add.append(converter_node)
# source -> converter
new_links.append(
{
"id": str(uuid.uuid4()),
"source_id": link["source_id"],
"source_name": link["source_name"],
"sink_id": converter_id,
"sink_name": "value",
"is_static": False,
}
)
# converter -> sink
new_links.append(
{
"id": str(uuid.uuid4()),
"source_id": converter_id,
"source_name": "value",
"sink_id": link["sink_id"],
"sink_name": link["sink_name"],
"is_static": False,
}
)
logger.debug(f"Inserted type converter: {source_type} -> {target_type}")
else:
new_links.append(link)
if nodes_to_add:
agent["nodes"] = nodes + nodes_to_add
agent["links"] = new_links
return agent
def apply_all_fixes(
agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
) -> dict[str, Any]:
"""Apply all fixes to an agent JSON.
Args:
agent: Agent JSON dict
blocks_info: Optional list of block info dicts for advanced fixes
Returns:
Fixed agent JSON
"""
# Basic fixes (no block info needed)
agent = fix_agent_ids(agent)
agent = fix_double_curly_braces(agent)
agent = fix_storevalue_before_condition(agent)
agent = fix_addtolist_blocks(agent)
agent = fix_addtodictionary_blocks(agent)
agent = fix_code_execution_output(agent)
agent = fix_data_sampling_sample_size(agent)
agent = fix_node_x_coordinates(agent)
agent = fix_getcurrentdate_offset(agent)
# Advanced fixes (require block info)
if blocks_info is None:
blocks_info = get_blocks_info()
agent = fix_ai_model_parameter(agent, blocks_info)
agent = fix_link_static_properties(agent, blocks_info)
agent = fix_data_type_mismatch(agent, blocks_info)
return agent

View File

@@ -0,0 +1,225 @@
"""Prompt templates for agent generation."""
DECOMPOSITION_PROMPT = """
You are an expert AutoGPT Workflow Decomposer. Your task is to analyze a user's high-level goal and break it down into a clear, step-by-step plan using the available blocks.
Each step should represent a distinct, automatable action suitable for execution by an AI automation system.
---
FIRST: Analyze the user's goal and determine:
1) Design-time configuration (fixed settings that won't change per run)
2) Runtime inputs (values the agent's end-user will provide each time it runs)
For anything that can vary per run (email addresses, names, dates, search terms, etc.):
- DO NOT ask for the actual value
- Instead, define it as an Agent Input with a clear name, type, and description
Only ask clarifying questions about design-time config that affects how you build the workflow:
- Which external service to use (e.g., "Gmail vs Outlook", "Notion vs Google Docs")
- Required formats or structures (e.g., "CSV, JSON, or PDF output?")
- Business rules that must be hard-coded
IMPORTANT CLARIFICATIONS POLICY:
- Ask no more than five essential questions
- Do not ask for concrete values that can be provided at runtime as Agent Inputs
- Do not ask for API keys or credentials; the platform handles those directly
- If there is enough information to infer reasonable defaults, prefer to propose defaults
---
GUIDELINES:
1. List each step as a numbered item
2. Describe the action clearly and specify inputs/outputs
3. Ensure steps are in logical, sequential order
4. Mention block names naturally (e.g., "Use GetWeatherByLocationBlock to...")
5. Help the user reach their goal efficiently
---
RULES:
1. OUTPUT FORMAT: Only output either clarifying questions OR step-by-step instructions, not both
2. USE ONLY THE BLOCKS PROVIDED
3. ALL required_input fields must be provided
4. Data types of linked properties must match
5. Write expert-level prompts for AI-related blocks
---
CRITICAL BLOCK RESTRICTIONS:
1. AddToListBlock: Outputs updated list EVERY addition, not after all additions
2. SendEmailBlock: Draft the email for user review; set SMTP config based on email type
3. ConditionBlock: value2 is reference, value1 is contrast
4. CodeExecutionBlock: DO NOT USE - use AI blocks instead
5. ReadCsvBlock: Only use the 'rows' output, not 'row'
---
OUTPUT FORMAT:
If more information is needed:
```json
{{
  "type": "clarifying_questions",
  "questions": [
    {{
      "question": "Which email provider should be used? (Gmail, Outlook, custom SMTP)",
      "keyword": "email_provider",
      "example": "Gmail"
    }}
  ]
}}
```
If ready to proceed:
```json
{{
  "type": "instructions",
  "steps": [
    {{
      "step_number": 1,
      "block_name": "AgentShortTextInputBlock",
      "description": "Get the URL of the content to analyze.",
      "inputs": [{{"name": "name", "value": "URL"}}],
      "outputs": [{{"name": "result", "description": "The URL entered by user"}}]
    }}
  ]
}}
```
---
AVAILABLE BLOCKS:
{block_summaries}
"""
GENERATION_PROMPT = """
You are an expert AI workflow builder. Generate a valid agent JSON from the given instructions.
---
NODES:
Each node must include:
- `id`: Unique UUID v4 (e.g. `a8f5b1e2-c3d4-4e5f-8a9b-0c1d2e3f4a5b`)
- `block_id`: The block identifier (must match an Allowed Block)
- `input_default`: Dict of inputs (can be empty if no static inputs needed)
- `metadata`: Must contain:
  - `position`: {{"x": number, "y": number}} - adjacent nodes should differ by 800+ in X
  - `customized_name`: Clear name describing this block's purpose in the workflow
---
LINKS:
Each link connects a source node's output to a sink node's input:
- `id`: MUST be UUID v4 (NOT "link-1", "link-2", etc.)
- `source_id`: ID of the source node
- `source_name`: Output field name from the source block
- `sink_id`: ID of the sink node
- `sink_name`: Input field name on the sink block
- `is_static`: true only if source block has static_output: true
CRITICAL: All IDs must be valid UUID v4 format!
---
AGENT (GRAPH):
Wrap nodes and links in:
- `id`: UUID of the agent
- `name`: Short, generic name (avoid specific company names, URLs)
- `description`: Short, generic description
- `nodes`: List of all nodes
- `links`: List of all links
- `version`: 1
- `is_active`: true
---
TIPS:
- All required_input fields must be provided via input_default or a valid link
- Ensure consistent source_id and sink_id references
- Avoid dangling links
- Input/output pins must match block schemas
- Do not invent unknown block_ids
---
ALLOWED BLOCKS:
{block_summaries}
---
Generate the complete agent JSON. Output ONLY valid JSON, no explanation.
"""
PATCH_PROMPT = """
You are an expert at modifying AutoGPT agent workflows. Given the current agent and a modification request, generate a JSON patch to update the agent.
CURRENT AGENT:
{current_agent}
AVAILABLE BLOCKS:
{block_summaries}
---
PATCH FORMAT:
Return a JSON object with the following structure:
```json
{{
  "type": "patch",
  "intent": "Brief description of what the patch does",
  "patches": [
    {{
      "type": "modify",
      "node_id": "uuid-of-node-to-modify",
      "changes": {{
        "input_default": {{"field": "new_value"}},
        "metadata": {{"customized_name": "New Name"}}
      }}
    }},
    {{
      "type": "add",
      "new_nodes": [
        {{
          "id": "new-uuid",
          "block_id": "block-uuid",
          "input_default": {{}},
          "metadata": {{"position": {{"x": 0, "y": 0}}, "customized_name": "Name"}}
        }}
      ],
      "new_links": [
        {{
          "id": "link-uuid",
          "source_id": "source-node-id",
          "source_name": "output_field",
          "sink_id": "sink-node-id",
          "sink_name": "input_field"
        }}
      ]
    }},
    {{
      "type": "remove",
      "node_ids": ["uuid-of-node-to-remove"],
      "link_ids": ["uuid-of-link-to-remove"]
    }}
  ]
}}
```
If you need more information, return:
```json
{{
  "type": "clarifying_questions",
  "questions": [
    {{
      "question": "What specific change do you want?",
      "keyword": "change_type",
      "example": "Add error handling"
    }}
  ]
}}
```
Generate the minimal patch needed. Output ONLY valid JSON.
"""

View File

@@ -1,269 +0,0 @@
"""External Agent Generator service client.
This module provides a client for communicating with the external Agent Generator
microservice. When AGENTGENERATOR_HOST is configured, the agent generation functions
will delegate to the external service instead of using the built-in LLM-based implementation.
"""
import logging
from typing import Any
import httpx
from backend.util.settings import Settings
logger = logging.getLogger(__name__)
_client: httpx.AsyncClient | None = None
_settings: Settings | None = None
def _get_settings() -> Settings:
"""Get or create settings singleton."""
global _settings
if _settings is None:
_settings = Settings()
return _settings
def is_external_service_configured() -> bool:
"""Check if external Agent Generator service is configured."""
settings = _get_settings()
return bool(settings.config.agentgenerator_host)
def _get_base_url() -> str:
"""Get the base URL for the external service."""
settings = _get_settings()
host = settings.config.agentgenerator_host
port = settings.config.agentgenerator_port
return f"http://{host}:{port}"
def _get_client() -> httpx.AsyncClient:
"""Get or create the HTTP client for the external service."""
global _client
if _client is None:
settings = _get_settings()
_client = httpx.AsyncClient(
base_url=_get_base_url(),
timeout=httpx.Timeout(settings.config.agentgenerator_timeout),
)
return _client
async def decompose_goal_external(
description: str, context: str = ""
) -> dict[str, Any] | None:
"""Call the external service to decompose a goal.
Args:
description: Natural language goal description
context: Additional context (e.g., answers to previous questions)
Returns:
Dict with either:
- {"type": "clarifying_questions", "questions": [...]}
- {"type": "instructions", "steps": [...]}
- {"type": "unachievable_goal", ...}
- {"type": "vague_goal", ...}
Or None on error
"""
client = _get_client()
# Build the request payload
payload: dict[str, Any] = {"description": description}
if context:
# The external service uses user_instruction for additional context
payload["user_instruction"] = context
try:
response = await client.post("/api/decompose-description", json=payload)
response.raise_for_status()
data = response.json()
if not data.get("success"):
logger.error(f"External service returned error: {data.get('error')}")
return None
# Map the response to the expected format
response_type = data.get("type")
if response_type == "instructions":
return {"type": "instructions", "steps": data.get("steps", [])}
elif response_type == "clarifying_questions":
return {
"type": "clarifying_questions",
"questions": data.get("questions", []),
}
elif response_type == "unachievable_goal":
return {
"type": "unachievable_goal",
"reason": data.get("reason"),
"suggested_goal": data.get("suggested_goal"),
}
elif response_type == "vague_goal":
return {
"type": "vague_goal",
"suggested_goal": data.get("suggested_goal"),
}
else:
logger.error(
f"Unknown response type from external service: {response_type}"
)
return None
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
async def generate_agent_external(
instructions: dict[str, Any]
) -> dict[str, Any] | None:
"""Call the external service to generate an agent from instructions.
Args:
instructions: Structured instructions from decompose_goal
Returns:
Agent JSON dict or None on error
"""
client = _get_client()
try:
response = await client.post(
"/api/generate-agent", json={"instructions": instructions}
)
response.raise_for_status()
data = response.json()
if not data.get("success"):
logger.error(f"External service returned error: {data.get('error')}")
return None
return data.get("agent_json")
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
async def generate_agent_patch_external(
update_request: str, current_agent: dict[str, Any]
) -> dict[str, Any] | None:
"""Call the external service to generate a patch for an existing agent.
Args:
update_request: Natural language description of changes
current_agent: Current agent JSON
Returns:
Updated agent JSON, clarifying questions dict, or None on error
"""
client = _get_client()
try:
response = await client.post(
"/api/update-agent",
json={
"update_request": update_request,
"current_agent_json": current_agent,
},
)
response.raise_for_status()
data = response.json()
if not data.get("success"):
logger.error(f"External service returned error: {data.get('error')}")
return None
# Check if it's clarifying questions
if data.get("type") == "clarifying_questions":
return {
"type": "clarifying_questions",
"questions": data.get("questions", []),
}
# Otherwise return the updated agent JSON
return data.get("agent_json")
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
async def get_blocks_external() -> list[dict[str, Any]] | None:
"""Get available blocks from the external service.
Returns:
List of block info dicts or None on error
"""
client = _get_client()
try:
response = await client.get("/api/blocks")
response.raise_for_status()
data = response.json()
if not data.get("success"):
logger.error("External service returned error getting blocks")
return None
return data.get("blocks", [])
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error getting blocks from external service: {e}")
return None
except httpx.RequestError as e:
logger.error(f"Request error getting blocks from external service: {e}")
return None
except Exception as e:
logger.error(f"Unexpected error getting blocks from external service: {e}")
return None
async def health_check() -> bool:
"""Check if the external service is healthy.
Returns:
True if healthy, False otherwise
"""
if not is_external_service_configured():
return False
client = _get_client()
try:
response = await client.get("/health")
response.raise_for_status()
data = response.json()
return data.get("status") == "healthy" and data.get("blocks_loaded", False)
except Exception as e:
logger.warning(f"External agent generator health check failed: {e}")
return False
async def close_client() -> None:
"""Close the HTTP client."""
global _client
if _client is not None:
await _client.aclose()
_client = None
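For context on how this (now removed) client was meant to be driven, a hedged sketch; the function names are from the module above, while the surrounding wiring is assumed:
```python
import asyncio

async def main() -> None:
    # Only call out if AGENTGENERATOR_HOST is configured in settings.
    if not is_external_service_configured():
        print("External Agent Generator not configured; using built-in path")
        return
    if not await health_check():
        print("Service configured but unhealthy")
        return
    result = await decompose_goal_external("Summarize my RSS feeds daily")
    if result and result["type"] == "instructions":
        agent = await generate_agent_external(result)
        print(agent)
    await close_client()  # release the shared httpx client

asyncio.run(main())
```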

View File

@@ -0,0 +1,213 @@
"""Utilities for agent generation."""
import json
import re
from typing import Any
from backend.data.block import get_blocks
# UUID validation regex
UUID_REGEX = re.compile(
r"^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}$"
)
# Block IDs for various fixes
STORE_VALUE_BLOCK_ID = "1ff065e9-88e8-4358-9d82-8dc91f622ba9"
CONDITION_BLOCK_ID = "715696a0-e1da-45c8-b209-c2fa9c3b0be6"
ADDTOLIST_BLOCK_ID = "aeb08fc1-2fc1-4141-bc8e-f758f183a822"
ADDTODICTIONARY_BLOCK_ID = "31d1064e-7446-4693-a7d4-65e5ca1180d1"
CREATELIST_BLOCK_ID = "a912d5c7-6e00-4542-b2a9-8034136930e4"
CREATEDICT_BLOCK_ID = "b924ddf4-de4f-4b56-9a85-358930dcbc91"
CODE_EXECUTION_BLOCK_ID = "0b02b072-abe7-11ef-8372-fb5d162dd712"
DATA_SAMPLING_BLOCK_ID = "4a448883-71fa-49cf-91cf-70d793bd7d87"
UNIVERSAL_TYPE_CONVERTER_BLOCK_ID = "95d1b990-ce13-4d88-9737-ba5c2070c97b"
GET_CURRENT_DATE_BLOCK_ID = "b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1"
DOUBLE_CURLY_BRACES_BLOCK_IDS = [
"44f6c8ad-d75c-4ae1-8209-aad1c0326928", # FillTextTemplateBlock
"6ab085e2-20b3-4055-bc3e-08036e01eca6",
"90f8c45e-e983-4644-aa0b-b4ebe2f531bc",
"363ae599-353e-4804-937e-b2ee3cef3da4", # AgentOutputBlock
"3b191d9f-356f-482d-8238-ba04b6d18381",
"db7d8f02-2f44-4c55-ab7a-eae0941f0c30",
"3a7c4b8d-6e2f-4a5d-b9c1-f8d23c5a9b0e",
"ed1ae7a0-b770-4089-b520-1f0005fad19a",
"a892b8d9-3e4e-4e9c-9c1e-75f8efcf1bfa",
"b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1",
"716a67b3-6760-42e7-86dc-18645c6e00fc",
"530cf046-2ce0-4854-ae2c-659db17c7a46",
"ed55ac19-356e-4243-a6cb-bc599e9b716f",
"1f292d4a-41a4-4977-9684-7c8d560b9f91", # LLM blocks
"32a87eab-381e-4dd4-bdb8-4c47151be35a",
]
def is_valid_uuid(value: str) -> bool:
"""Check if a string is a valid UUID v4."""
return isinstance(value, str) and UUID_REGEX.match(value) is not None
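Note the regex accepts only lowercase hex and enforces the version-4 and variant nibbles, so for example:
```python
assert is_valid_uuid("a8f5b1e2-c3d4-4e5f-8a9b-0c1d2e3f4a5b")      # version nibble 4, variant 8
assert not is_valid_uuid("link-1")                                 # not a UUID at all
assert not is_valid_uuid("A8F5B1E2-C3D4-4E5F-8A9B-0C1D2E3F4A5B")   # uppercase is rejected
```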
def _compact_schema(schema: dict) -> dict[str, str]:
"""Extract compact type info from a JSON schema properties dict.
Returns a dict of {field_name: type_string} for essential info only.
"""
props = schema.get("properties", {})
result = {}
for name, prop in props.items():
# Skip internal/complex fields
if name.startswith("_"):
continue
# Get type string
type_str = prop.get("type", "any")
# Handle anyOf/oneOf (optional types)
if "anyOf" in prop:
types = [t.get("type", "?") for t in prop["anyOf"] if t.get("type")]
type_str = "|".join(types) if types else "any"
elif "allOf" in prop:
type_str = "object"
# Add array item type if present
if type_str == "array" and "items" in prop:
items = prop["items"]
if isinstance(items, dict):
item_type = items.get("type", "any")
type_str = f"array[{item_type}]"
result[name] = type_str
return result
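As a worked example (schema shape mirrors standard JSON Schema), `_compact_schema` flattens properties into `{name: type}` strings:
```python
schema = {
    "properties": {
        "prompt": {"type": "string"},
        "max_tokens": {"anyOf": [{"type": "integer"}, {"type": "null"}]},
        "tags": {"type": "array", "items": {"type": "string"}},
        "_internal": {"type": "object"},  # skipped: leading underscore
    }
}
assert _compact_schema(schema) == {
    "prompt": "string",
    "max_tokens": "integer|null",
    "tags": "array[string]",
}
```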
def get_block_summaries(include_schemas: bool = True) -> str:
"""Generate compact block summaries for prompts.
Args:
include_schemas: Whether to include input/output type info
Returns:
Formatted string of block summaries (compact format)
"""
blocks = get_blocks()
summaries = []
for block_id, block_cls in blocks.items():
block = block_cls()
name = block.name
desc = getattr(block, "description", "") or ""
# Truncate description
if len(desc) > 150:
desc = desc[:147] + "..."
if not include_schemas:
summaries.append(f"- {name} (id: {block_id}): {desc}")
else:
# Compact format with type info only
inputs = {}
outputs = {}
required = []
if hasattr(block, "input_schema"):
try:
schema = block.input_schema.jsonschema()
inputs = _compact_schema(schema)
required = schema.get("required", [])
except Exception:
pass
if hasattr(block, "output_schema"):
try:
schema = block.output_schema.jsonschema()
outputs = _compact_schema(schema)
except Exception:
pass
# Build compact line format
# Format: "- NAME (id): desc" followed by indented
#   "in: {field:type, ...} req=[...]" and "out: {field:type} [static]" lines
in_str = ", ".join(f"{k}:{v}" for k, v in inputs.items())
out_str = ", ".join(f"{k}:{v}" for k, v in outputs.items())
req_str = f" req=[{','.join(required)}]" if required else ""
static = " [static]" if getattr(block, "static_output", False) else ""
line = f"- {name} (id: {block_id}): {desc}"
if in_str:
line += f"\n in: {{{in_str}}}{req_str}"
if out_str:
line += f"\n out: {{{out_str}}}{static}"
summaries.append(line)
return "\n".join(summaries)
def get_blocks_info() -> list[dict[str, Any]]:
"""Get block information with schemas for validation and fixing."""
blocks = get_blocks()
blocks_info = []
for block_id, block_cls in blocks.items():
block = block_cls()
blocks_info.append(
{
"id": block_id,
"name": block.name,
"description": getattr(block, "description", ""),
"categories": getattr(block, "categories", []),
"staticOutput": getattr(block, "static_output", False),
"inputSchema": (
block.input_schema.jsonschema()
if hasattr(block, "input_schema")
else {}
),
"outputSchema": (
block.output_schema.jsonschema()
if hasattr(block, "output_schema")
else {}
),
}
)
return blocks_info
def parse_json_from_llm(text: str) -> dict[str, Any] | None:
"""Extract JSON from LLM response (handles markdown code blocks)."""
if not text:
return None
# Try fenced code block
match = re.search(r"```(?:json)?\s*([\s\S]*?)```", text, re.IGNORECASE)
if match:
try:
return json.loads(match.group(1).strip())
except json.JSONDecodeError:
pass
# Try raw text
try:
return json.loads(text.strip())
except json.JSONDecodeError:
pass
# Try finding {...} span
start = text.find("{")
end = text.rfind("}")
if start != -1 and end > start:
try:
return json.loads(text[start : end + 1])
except json.JSONDecodeError:
pass
# Try finding [...] span
start = text.find("[")
end = text.rfind("]")
if start != -1 and end > start:
try:
return json.loads(text[start : end + 1])
except json.JSONDecodeError:
pass
return None
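A quick worked example of the fallback chain (the fenced reply string is built programmatically here only to avoid nesting backtick fences):
```python
# Simulated LLM reply wrapping JSON in a markdown code fence.
fenced = "```json\n" + '{"type": "patch", "patches": []}' + "\n```"
reply = "Here is the patch:\n" + fenced + "\nHope that helps!"

assert parse_json_from_llm(reply) == {"type": "patch", "patches": []}
# Un-fenced replies fall through to json.loads / brace-span extraction:
assert parse_json_from_llm('Sure! {"ok": true} Done.') == {"ok": True}
```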

View File

@@ -0,0 +1,279 @@
"""Agent validator - Validates agent structure and connections."""
import logging
import re
from typing import Any
from .utils import get_blocks_info
logger = logging.getLogger(__name__)
class AgentValidator:
"""Validator for AutoGPT agents with detailed error reporting."""
def __init__(self):
self.errors: list[str] = []
def add_error(self, error: str) -> None:
"""Add an error message."""
self.errors.append(error)
def validate_block_existence(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> bool:
"""Validate all block IDs exist in the blocks library."""
valid = True
valid_block_ids = {b.get("id") for b in blocks_info if b.get("id")}
for node in agent.get("nodes", []):
block_id = node.get("block_id")
node_id = node.get("id")
if not block_id:
self.add_error(f"Node '{node_id}' is missing 'block_id' field.")
valid = False
continue
if block_id not in valid_block_ids:
self.add_error(
f"Node '{node_id}' references block_id '{block_id}' which does not exist."
)
valid = False
return valid
def validate_link_node_references(self, agent: dict[str, Any]) -> bool:
"""Validate all node IDs referenced in links exist."""
valid = True
valid_node_ids = {n.get("id") for n in agent.get("nodes", []) if n.get("id")}
for link in agent.get("links", []):
link_id = link.get("id", "Unknown")
source_id = link.get("source_id")
sink_id = link.get("sink_id")
if not source_id:
self.add_error(f"Link '{link_id}' is missing 'source_id'.")
valid = False
elif source_id not in valid_node_ids:
self.add_error(
f"Link '{link_id}' references non-existent source_id '{source_id}'."
)
valid = False
if not sink_id:
self.add_error(f"Link '{link_id}' is missing 'sink_id'.")
valid = False
elif sink_id not in valid_node_ids:
self.add_error(
f"Link '{link_id}' references non-existent sink_id '{sink_id}'."
)
valid = False
return valid
def validate_required_inputs(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> bool:
"""Validate required inputs are provided."""
valid = True
block_map = {b.get("id"): b for b in blocks_info}
for node in agent.get("nodes", []):
block_id = node.get("block_id")
block = block_map.get(block_id)
if not block:
continue
required_inputs = block.get("inputSchema", {}).get("required", [])
input_defaults = node.get("input_default", {})
node_id = node.get("id")
# Get linked inputs
linked_inputs = {
link["sink_name"]
for link in agent.get("links", [])
if link.get("sink_id") == node_id
}
for req_input in required_inputs:
if (
req_input not in input_defaults
and req_input not in linked_inputs
and req_input != "credentials"
):
block_name = block.get("name", "Unknown Block")
self.add_error(
f"Node '{node_id}' ({block_name}) is missing required input '{req_input}'."
)
valid = False
return valid
def validate_data_type_compatibility(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> bool:
"""Validate linked data types are compatible."""
valid = True
block_map = {b.get("id"): b for b in blocks_info}
node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
def get_type(schema: dict, name: str) -> str | None:
if "_#_" in name:
parent, child = name.split("_#_", 1)
parent_schema = schema.get(parent, {})
if "properties" in parent_schema:
return parent_schema["properties"].get(child, {}).get("type")
return None
return schema.get(name, {}).get("type")
def are_compatible(src: str, sink: str) -> bool:
if {src, sink} <= {"integer", "number"}:
return True
return src == sink
for link in agent.get("links", []):
source_node = node_lookup.get(link.get("source_id"))
sink_node = node_lookup.get(link.get("sink_id"))
if not source_node or not sink_node:
continue
source_block = block_map.get(source_node.get("block_id"))
sink_block = block_map.get(sink_node.get("block_id"))
if not source_block or not sink_block:
continue
source_outputs = source_block.get("outputSchema", {}).get("properties", {})
sink_inputs = sink_block.get("inputSchema", {}).get("properties", {})
source_type = get_type(source_outputs, link.get("source_name", ""))
sink_type = get_type(sink_inputs, link.get("sink_name", ""))
if source_type and sink_type and not are_compatible(source_type, sink_type):
self.add_error(
f"Type mismatch: {source_block.get('name')} output '{link['source_name']}' "
f"({source_type}) -> {sink_block.get('name')} input '{link['sink_name']}' ({sink_type})."
)
valid = False
return valid
def validate_nested_sink_links(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
) -> bool:
"""Validate nested sink links (with _#_ notation)."""
valid = True
block_map = {b.get("id"): b for b in blocks_info}
node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
for link in agent.get("links", []):
sink_name = link.get("sink_name", "")
if "_#_" in sink_name:
parent, child = sink_name.split("_#_", 1)
sink_node = node_lookup.get(link.get("sink_id"))
if not sink_node:
continue
block = block_map.get(sink_node.get("block_id"))
if not block:
continue
input_props = block.get("inputSchema", {}).get("properties", {})
parent_schema = input_props.get(parent)
if not parent_schema:
self.add_error(
f"Invalid nested link '{sink_name}': parent '{parent}' not found."
)
valid = False
continue
if not parent_schema.get("additionalProperties"):
if not (
isinstance(parent_schema, dict)
and "properties" in parent_schema
and child in parent_schema.get("properties", {})
):
self.add_error(
f"Invalid nested link '{sink_name}': child '{child}' not found in '{parent}'."
)
valid = False
return valid
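For orientation: the `_#_` notation addresses a key inside an object-typed input, so a link whose `sink_name` is `values_#_name` writes into the `name` field of the sink's `values` input. A hypothetical link this check would accept, assuming the sink block declares an object input `values` whose schema lists a `name` property (or allows `additionalProperties`):
```json
{
  "id": "c9d8e7f6-a5b4-4c3d-8e2f-1a0b9c8d7e6f",
  "source_id": "a8f5b1e2-c3d4-4e5f-8a9b-0c1d2e3f4a5b",
  "source_name": "output",
  "sink_id": "b7e6c2d3-d4e5-4f6a-9b0c-1d2e3f4a5b6c",
  "sink_name": "values_#_name"
}
```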
def validate_prompt_spaces(self, agent: dict[str, Any]) -> bool:
"""Validate prompts don't have spaces in template variables."""
valid = True
for node in agent.get("nodes", []):
input_default = node.get("input_default", {})
prompt = input_default.get("prompt", "")
if not isinstance(prompt, str):
continue
# Find {{...}} with spaces
matches = re.finditer(r"\{\{([^}]+)\}\}", prompt)
for match in matches:
content = match.group(1)
if " " in content:
self.add_error(
f"Node '{node.get('id')}' has spaces in template variable: "
f"'{{{{{content}}}}}' should be '{{{{{content.replace(' ', '_')}}}}}'."
)
valid = False
return valid
def validate(
self, agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
) -> tuple[bool, str | None]:
"""Run all validations.
Returns:
Tuple of (is_valid, error_message)
"""
self.errors = []
if blocks_info is None:
blocks_info = get_blocks_info()
checks = [
self.validate_block_existence(agent, blocks_info),
self.validate_link_node_references(agent),
self.validate_required_inputs(agent, blocks_info),
self.validate_data_type_compatibility(agent, blocks_info),
self.validate_nested_sink_links(agent, blocks_info),
self.validate_prompt_spaces(agent),
]
all_passed = all(checks)
if all_passed:
logger.info("Agent validation successful")
return True, None
error_message = "Agent validation failed:\n"
for i, error in enumerate(self.errors, 1):
error_message += f"{i}. {error}\n"
logger.warning(f"Agent validation failed with {len(self.errors)} errors")
return False, error_message
def validate_agent(
agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
) -> tuple[bool, str | None]:
"""Convenience function to validate an agent.
Returns:
Tuple of (is_valid, error_message)
"""
validator = AgentValidator()
return validator.validate(agent, blocks_info)
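Usage sketch (hedged: assumes the backend block library is importable so `get_blocks_info()` can load schemas): run the validator against an obviously broken agent and inspect the aggregated error string.
```python
broken_agent = {
    "nodes": [{"id": "not-a-uuid", "block_id": "does-not-exist"}],
    "links": [
        {
            "id": "also-bad",
            "source_id": "missing-node",
            "sink_id": "not-a-uuid",
            "source_name": "output",
            "sink_name": "input",
        }
    ],
}

is_valid, errors = validate_agent(broken_agent)  # blocks_info loaded internally
assert not is_valid
print(errors)
# -> "Agent validation failed:" followed by numbered errors such as the
#    unknown block_id and the dangling source_id reference.
```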

View File

@@ -8,10 +8,12 @@ from langfuse import observe
from backend.api.features.chat.model import ChatSession
from .agent_generator import (
AgentGeneratorNotConfiguredError,
apply_all_fixes,
decompose_goal,
generate_agent,
get_blocks_info,
save_agent_to_library,
validate_agent,
)
from .base import BaseTool
from .models import (
@@ -25,6 +27,9 @@ from .models import (
logger = logging.getLogger(__name__)
# Maximum retries for agent generation with validation feedback
MAX_GENERATION_RETRIES = 2
class CreateAgentTool(BaseTool):
"""Tool for creating agents from natural language descriptions."""
@@ -86,8 +91,9 @@ class CreateAgentTool(BaseTool):
Flow:
1. Decompose the description into steps (may return clarifying questions)
2. Generate agent JSON (external service handles fixing and validation)
3. Preview or save based on the save parameter
2. Generate agent JSON from the steps
3. Apply fixes to correct common LLM errors
4. Preview or save based on the save parameter
"""
description = kwargs.get("description", "").strip()
context = kwargs.get("context", "")
@@ -104,13 +110,11 @@ class CreateAgentTool(BaseTool):
# Step 1: Decompose goal into steps
try:
decomposition_result = await decompose_goal(description, context)
except AgentGeneratorNotConfiguredError:
except ValueError as e:
# Handle missing API key or configuration errors
return ErrorResponse(
message=(
"Agent generation is not available. "
"The Agent Generator service is not configured."
),
error="service_not_configured",
message=f"Agent generation is not configured: {str(e)}",
error="configuration_error",
session_id=session_id,
)
@@ -167,32 +171,72 @@ class CreateAgentTool(BaseTool):
session_id=session_id,
)
# Step 2: Generate agent JSON (external service handles fixing and validation)
try:
agent_json = await generate_agent(decomposition_result)
except AgentGeneratorNotConfiguredError:
return ErrorResponse(
message=(
"Agent generation is not available. "
"The Agent Generator service is not configured."
),
error="service_not_configured",
session_id=session_id,
# Step 2: Generate agent JSON with retry on validation failure
blocks_info = get_blocks_info()
agent_json = None
validation_errors = None
for attempt in range(MAX_GENERATION_RETRIES + 1):
# Generate agent (include validation errors from previous attempt)
if attempt == 0:
agent_json = await generate_agent(decomposition_result)
else:
# Retry with validation error feedback
logger.info(
f"Retry {attempt}/{MAX_GENERATION_RETRIES} with validation feedback"
)
retry_instructions = {
**decomposition_result,
"previous_errors": validation_errors,
"retry_instructions": (
"The previous generation had validation errors. "
"Please fix these issues in the new generation:\n"
f"{validation_errors}"
),
}
agent_json = await generate_agent(retry_instructions)
if agent_json is None:
if attempt == MAX_GENERATION_RETRIES:
return ErrorResponse(
message="Failed to generate the agent. Please try again.",
error="Generation failed",
session_id=session_id,
)
continue
# Step 3: Apply fixes to correct common errors
agent_json = apply_all_fixes(agent_json, blocks_info)
# Step 4: Validate the agent
is_valid, validation_errors = validate_agent(agent_json, blocks_info)
if is_valid:
logger.info(f"Agent generated successfully on attempt {attempt + 1}")
break
logger.warning(
f"Validation failed on attempt {attempt + 1}: {validation_errors}"
)
if agent_json is None:
return ErrorResponse(
message="Failed to generate the agent. Please try again.",
error="Generation failed",
session_id=session_id,
)
if attempt == MAX_GENERATION_RETRIES:
# Return error with validation details
return ErrorResponse(
message=(
f"Generated agent has validation errors after {MAX_GENERATION_RETRIES + 1} attempts. "
f"Please try rephrasing your request or simplify the workflow."
),
error="validation_failed",
details={"validation_errors": validation_errors},
session_id=session_id,
)
agent_name = agent_json.get("name", "Generated Agent")
agent_description = agent_json.get("description", "")
node_count = len(agent_json.get("nodes", []))
link_count = len(agent_json.get("links", []))
# Step 3: Preview or save
# Step 4: Preview or save
if not save:
return AgentPreviewResponse(
message=(

View File

@@ -8,10 +8,13 @@ from langfuse import observe
from backend.api.features.chat.model import ChatSession
from .agent_generator import (
AgentGeneratorNotConfiguredError,
apply_agent_patch,
apply_all_fixes,
generate_agent_patch,
get_agent_as_json,
get_blocks_info,
save_agent_to_library,
validate_agent,
)
from .base import BaseTool
from .models import (
@@ -25,6 +28,9 @@ from .models import (
logger = logging.getLogger(__name__)
# Maximum retries for patch generation with validation feedback
MAX_GENERATION_RETRIES = 2
class EditAgentTool(BaseTool):
"""Tool for editing existing agents using natural language."""
@@ -37,7 +43,7 @@ class EditAgentTool(BaseTool):
def description(self) -> str:
return (
"Edit an existing agent from the user's library using natural language. "
"Generates updates to the agent while preserving unchanged parts."
"Generates a patch to update the agent while preserving unchanged parts."
)
@property
@@ -92,8 +98,9 @@ class EditAgentTool(BaseTool):
Flow:
1. Fetch the current agent
2. Generate updated agent (external service handles fixing and validation)
3. Preview or save based on the save parameter
2. Generate a patch based on the requested changes
3. Apply the patch to create an updated agent
4. Preview or save based on the save parameter
"""
agent_id = kwargs.get("agent_id", "").strip()
changes = kwargs.get("changes", "").strip()
@@ -130,58 +137,121 @@ class EditAgentTool(BaseTool):
if context:
update_request = f"{changes}\n\nAdditional context:\n{context}"
# Step 2: Generate updated agent (external service handles fixing and validation)
try:
result = await generate_agent_patch(update_request, current_agent)
except AgentGeneratorNotConfiguredError:
return ErrorResponse(
message=(
"Agent editing is not available. "
"The Agent Generator service is not configured."
),
error="service_not_configured",
session_id=session_id,
)
# Step 2: Generate patch with retry on validation failure
blocks_info = get_blocks_info()
updated_agent = None
validation_errors = None
intent = "Applied requested changes"
if result is None:
return ErrorResponse(
message="Failed to generate changes. Please try rephrasing.",
error="Update generation failed",
session_id=session_id,
)
# Check if LLM returned clarifying questions
if result.get("type") == "clarifying_questions":
questions = result.get("questions", [])
return ClarificationNeededResponse(
message=(
"I need some more information about the changes. "
"Please answer the following questions:"
),
questions=[
ClarifyingQuestion(
question=q.get("question", ""),
keyword=q.get("keyword", ""),
example=q.get("example"),
for attempt in range(MAX_GENERATION_RETRIES + 1):
# Generate patch (include validation errors from previous attempt)
try:
if attempt == 0:
patch_result = await generate_agent_patch(
update_request, current_agent
)
for q in questions
],
session_id=session_id,
else:
# Retry with validation error feedback
logger.info(
f"Retry {attempt}/{MAX_GENERATION_RETRIES} with validation feedback"
)
retry_request = (
f"{update_request}\n\n"
f"IMPORTANT: The previous edit had validation errors. "
f"Please fix these issues:\n{validation_errors}"
)
patch_result = await generate_agent_patch(
retry_request, current_agent
)
except ValueError as e:
# Handle missing API key or configuration errors
return ErrorResponse(
message=f"Agent generation is not configured: {str(e)}",
error="configuration_error",
session_id=session_id,
)
if patch_result is None:
if attempt == MAX_GENERATION_RETRIES:
return ErrorResponse(
message="Failed to generate changes. Please try rephrasing.",
error="Patch generation failed",
session_id=session_id,
)
continue
# Check if LLM returned clarifying questions
if patch_result.get("type") == "clarifying_questions":
questions = patch_result.get("questions", [])
return ClarificationNeededResponse(
message=(
"I need some more information about the changes. "
"Please answer the following questions:"
),
questions=[
ClarifyingQuestion(
question=q.get("question", ""),
keyword=q.get("keyword", ""),
example=q.get("example"),
)
for q in questions
],
session_id=session_id,
)
# Step 3: Apply patch and fixes
try:
updated_agent = apply_agent_patch(current_agent, patch_result)
updated_agent = apply_all_fixes(updated_agent, blocks_info)
except Exception as e:
if attempt == MAX_GENERATION_RETRIES:
return ErrorResponse(
message=f"Failed to apply changes: {str(e)}",
error="patch_apply_failed",
details={"exception": str(e)},
session_id=session_id,
)
validation_errors = str(e)
continue
# Step 4: Validate the updated agent
is_valid, validation_errors = validate_agent(updated_agent, blocks_info)
if is_valid:
logger.info(f"Agent edited successfully on attempt {attempt + 1}")
intent = patch_result.get("intent", "Applied requested changes")
break
logger.warning(
f"Validation failed on attempt {attempt + 1}: {validation_errors}"
)
# Result is the updated agent JSON
updated_agent = result
if attempt == MAX_GENERATION_RETRIES:
# Return error with validation details
return ErrorResponse(
message=(
f"Updated agent has validation errors after "
f"{MAX_GENERATION_RETRIES + 1} attempts. "
f"Please try rephrasing your request or simplify the changes."
),
error="validation_failed",
details={"validation_errors": validation_errors},
session_id=session_id,
)
# At this point, updated_agent is guaranteed to be set (we return on all failure paths)
assert updated_agent is not None
agent_name = updated_agent.get("name", "Updated Agent")
agent_description = updated_agent.get("description", "")
node_count = len(updated_agent.get("nodes", []))
link_count = len(updated_agent.get("links", []))
# Step 3: Preview or save
# Step 5: Preview or save
if not save:
return AgentPreviewResponse(
message=(
f"I've updated the agent. "
f"I've updated the agent. Changes: {intent}. "
f"The agent now has {node_count} blocks. "
f"Review it and call edit_agent with save=true to save the changes."
),
@@ -207,7 +277,10 @@ class EditAgentTool(BaseTool):
)
return AgentSavedResponse(
message=f"Updated agent '{created_graph.name}' has been saved to your library!",
message=(
f"Updated agent '{created_graph.name}' has been saved to your library! "
f"Changes: {intent}"
),
agent_id=created_graph.id,
agent_name=created_graph.name,
library_agent_id=library_agent.id,

View File

@@ -23,6 +23,7 @@ class PendingHumanReviewModel(BaseModel):
id: Unique identifier for the review record
user_id: ID of the user who must perform the review
node_exec_id: ID of the node execution that created this review
node_id: ID of the node definition (for grouping reviews from same node)
graph_exec_id: ID of the graph execution containing the node
graph_id: ID of the graph template being executed
graph_version: Version number of the graph template
@@ -37,6 +38,10 @@ class PendingHumanReviewModel(BaseModel):
"""
node_exec_id: str = Field(description="Node execution ID (primary key)")
node_id: str = Field(
description="Node definition ID (for grouping)",
default="", # Temporary default for test compatibility
)
user_id: str = Field(description="User ID associated with the review")
graph_exec_id: str = Field(description="Graph execution ID")
graph_id: str = Field(description="Graph ID")
@@ -66,7 +71,9 @@ class PendingHumanReviewModel(BaseModel):
)
@classmethod
def from_db(cls, review: "PendingHumanReview") -> "PendingHumanReviewModel":
def from_db(
cls, review: "PendingHumanReview", node_id: str
) -> "PendingHumanReviewModel":
"""
Convert a database model to a response model.
@@ -74,9 +81,14 @@ class PendingHumanReviewModel(BaseModel):
payload, instructions, and editable flag.
Handles invalid data gracefully by using safe defaults.
Args:
review: Database review object
node_id: Node definition ID (fetched from NodeExecution)
"""
return cls(
node_exec_id=review.nodeExecId,
node_id=node_id,
user_id=review.userId,
graph_exec_id=review.graphExecId,
graph_id=review.graphId,
@@ -107,6 +119,13 @@ class ReviewItem(BaseModel):
reviewed_data: SafeJsonData | None = Field(
None, description="Optional edited data (ignored if approved=False)"
)
auto_approve_future: bool = Field(
default=False,
description=(
"If true and this review is approved, future executions of this same "
"block (node) will be automatically approved. This only affects approved reviews."
),
)
@field_validator("reviewed_data")
@classmethod
@@ -174,6 +193,9 @@ class ReviewRequest(BaseModel):
This request must include ALL pending reviews for a graph execution.
Each review will be either approved (with optional data modifications)
or rejected (data ignored). The execution will resume only after ALL reviews are processed.
Each review item can individually specify whether to auto-approve future executions
of the same block via the `auto_approve_future` field on ReviewItem.
"""
reviews: List[ReviewItem] = Field(
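A request exercising the per-item flag might look like the following (shape taken from the tests later in this diff):
```json
{
  "reviews": [
    {
      "node_exec_id": "test_node_123",
      "approved": true,
      "message": "Approved",
      "auto_approve_future": true
    },
    {
      "node_exec_id": "test_node_456",
      "approved": false,
      "message": "Not this one"
    }
  ]
}
```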

View File

@@ -8,6 +8,12 @@ from prisma.enums import ReviewStatus
from pytest_snapshot.plugin import Snapshot
from backend.api.rest_api import handle_internal_http_error
from backend.data.execution import (
ExecutionContext,
ExecutionStatus,
NodeExecutionResult,
)
from backend.data.graph import GraphSettings
from .model import PendingHumanReviewModel
from .routes import router
@@ -15,20 +21,24 @@ from .routes import router
# Using a fixed timestamp for reproducible tests
FIXED_NOW = datetime.datetime(2023, 1, 1, 0, 0, 0, tzinfo=datetime.timezone.utc)
app = fastapi.FastAPI()
app.include_router(router, prefix="/api/review")
app.add_exception_handler(ValueError, handle_internal_http_error(400))
client = fastapi.testclient.TestClient(app)
@pytest.fixture
def app():
"""Create FastAPI app for testing"""
test_app = fastapi.FastAPI()
test_app.include_router(router, prefix="/api/review")
test_app.add_exception_handler(ValueError, handle_internal_http_error(400))
return test_app
@pytest.fixture(autouse=True)
def setup_app_auth(mock_jwt_user):
"""Setup auth overrides for all tests in this module"""
@pytest.fixture
def client(app, mock_jwt_user):
"""Create test client with auth overrides"""
from autogpt_libs.auth.jwt_utils import get_jwt_payload
app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
yield
with fastapi.testclient.TestClient(app) as test_client:
yield test_client
app.dependency_overrides.clear()
@@ -37,6 +47,7 @@ def sample_pending_review(test_user_id: str) -> PendingHumanReviewModel:
"""Create a sample pending review for testing"""
return PendingHumanReviewModel(
node_exec_id="test_node_123",
node_id="test_node_def_456",
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
@@ -55,6 +66,7 @@ def sample_pending_review(test_user_id: str) -> PendingHumanReviewModel:
def test_get_pending_reviews_empty(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
snapshot: Snapshot,
test_user_id: str,
@@ -73,6 +85,7 @@ def test_get_pending_reviews_empty(
def test_get_pending_reviews_with_data(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
snapshot: Snapshot,
@@ -95,6 +108,7 @@ def test_get_pending_reviews_with_data(
def test_get_pending_reviews_for_execution_success(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
snapshot: Snapshot,
@@ -123,6 +137,7 @@ def test_get_pending_reviews_for_execution_success(
def test_get_pending_reviews_for_execution_not_available(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
) -> None:
"""Test access denied when user doesn't own the execution"""
@@ -138,6 +153,7 @@ def test_get_pending_reviews_for_execution_not_available(
def test_process_review_action_approve_success(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
@@ -145,6 +161,12 @@ def test_process_review_action_approve_success(
"""Test successful review approval"""
# Mock the route functions
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = sample_pending_review
mock_get_reviews_for_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
@@ -173,6 +195,14 @@ def test_process_review_action_approve_success(
)
mock_process_all_reviews.return_value = {"test_node_123": approved_review}
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
mock_has_pending = mocker.patch(
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
@@ -202,6 +232,7 @@ def test_process_review_action_approve_success(
def test_process_review_action_reject_success(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
@@ -209,6 +240,20 @@ def test_process_review_action_reject_success(
"""Test successful review rejection"""
# Mock the route functions
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = sample_pending_review
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
mock_get_reviews_for_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
@@ -262,6 +307,7 @@ def test_process_review_action_reject_success(
def test_process_review_action_mixed_success(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
@@ -288,6 +334,12 @@ def test_process_review_action_mixed_success(
# Mock the route functions
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = sample_pending_review
mock_get_reviews_for_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
@@ -337,6 +389,14 @@ def test_process_review_action_mixed_success(
"test_node_456": rejected_review,
}
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
mock_has_pending = mocker.patch(
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
@@ -369,6 +429,7 @@ def test_process_review_action_mixed_success(
def test_process_review_action_empty_request(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
test_user_id: str,
) -> None:
@@ -386,10 +447,45 @@ def test_process_review_action_empty_request(
def test_process_review_action_review_not_found(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
"""Test error when review is not found"""
# Create a review with the nonexistent_node ID so the route can find the graph_exec_id
nonexistent_review = PendingHumanReviewModel(
node_exec_id="nonexistent_node",
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
graph_version=1,
payload={"data": "test"},
instructions="Review",
editable=True,
status=ReviewStatus.WAITING,
review_message=None,
was_edited=None,
processed=False,
created_at=FIXED_NOW,
updated_at=None,
reviewed_at=None,
)
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = nonexistent_review
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
# Mock the functions that extract graph execution ID from the request
mock_get_reviews_for_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
@@ -422,11 +518,26 @@ def test_process_review_action_review_not_found(
def test_process_review_action_partial_failure(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
"""Test handling of partial failures in review processing"""
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = sample_pending_review
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
# Mock the route functions
mock_get_reviews_for_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
@@ -456,16 +567,50 @@ def test_process_review_action_partial_failure(
def test_process_review_action_invalid_node_exec_id(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
"""Test failure when trying to process review with invalid node execution ID"""
# Create a review with the invalid-node-format ID so the route can find the graph_exec_id
invalid_review = PendingHumanReviewModel(
node_exec_id="invalid-node-format",
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
graph_version=1,
payload={"data": "test"},
instructions="Review",
editable=True,
status=ReviewStatus.WAITING,
review_message=None,
was_edited=None,
processed=False,
created_at=FIXED_NOW,
updated_at=None,
reviewed_at=None,
)
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = invalid_review
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
# Mock the route functions
mock_get_reviews_for_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
mock_get_reviews_for_execution.return_value = [sample_pending_review]
mock_get_reviews_for_execution.return_value = [invalid_review]
# Mock validation failure - this should return 400, not 500
mock_process_all_reviews = mocker.patch(
@@ -490,3 +635,595 @@ def test_process_review_action_invalid_node_exec_id(
# Should be a 400 Bad Request, not 500 Internal Server Error
assert response.status_code == 400
assert "Invalid node execution ID format" in response.json()["detail"]
def test_process_review_action_auto_approve_creates_auto_approval_records(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
"""Test that auto_approve_future_actions flag creates auto-approval records"""
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = sample_pending_review
# Mock process_all_reviews
mock_process_all_reviews = mocker.patch(
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
approved_review = PendingHumanReviewModel(
node_exec_id="test_node_123",
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
graph_version=1,
payload={"data": "test payload"},
instructions="Please review",
editable=True,
status=ReviewStatus.APPROVED,
review_message="Approved",
was_edited=False,
processed=False,
created_at=FIXED_NOW,
updated_at=FIXED_NOW,
reviewed_at=FIXED_NOW,
)
mock_process_all_reviews.return_value = {"test_node_123": approved_review}
# Mock get_node_execution to return node_id
mock_get_node_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_node_execution"
)
mock_node_exec = mocker.Mock(spec=NodeExecutionResult)
mock_node_exec.node_id = "test_node_def_456"
mock_get_node_execution.return_value = mock_node_exec
# Mock create_auto_approval_record
mock_create_auto_approval = mocker.patch(
"backend.api.features.executions.review.routes.create_auto_approval_record"
)
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
# Mock has_pending_reviews_for_graph_exec
mock_has_pending = mocker.patch(
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
mock_has_pending.return_value = False
# Mock get_graph_settings to return custom settings
mock_get_settings = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_settings"
)
mock_get_settings.return_value = GraphSettings(
human_in_the_loop_safe_mode=True,
sensitive_action_safe_mode=True,
)
# Mock get_user_by_id to prevent database access
mock_get_user = mocker.patch(
"backend.api.features.executions.review.routes.get_user_by_id"
)
mock_user = mocker.Mock()
mock_user.timezone = "UTC"
mock_get_user.return_value = mock_user
# Mock add_graph_execution
mock_add_execution = mocker.patch(
"backend.api.features.executions.review.routes.add_graph_execution"
)
request_data = {
"reviews": [
{
"node_exec_id": "test_node_123",
"approved": True,
"message": "Approved",
"auto_approve_future": True,
}
],
}
response = client.post("/api/review/action", json=request_data)
assert response.status_code == 200
# Verify process_all_reviews_for_execution was called (without auto_approve param)
mock_process_all_reviews.assert_called_once()
# Verify create_auto_approval_record was called for the approved review
mock_create_auto_approval.assert_called_once_with(
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
graph_version=1,
node_id="test_node_def_456",
payload={"data": "test payload"},
)
# Verify get_graph_settings was called with correct parameters
mock_get_settings.assert_called_once_with(
user_id=test_user_id, graph_id="test_graph_789"
)
# Verify add_graph_execution was called with proper ExecutionContext
mock_add_execution.assert_called_once()
call_kwargs = mock_add_execution.call_args.kwargs
execution_context = call_kwargs["execution_context"]
assert isinstance(execution_context, ExecutionContext)
assert execution_context.human_in_the_loop_safe_mode is True
assert execution_context.sensitive_action_safe_mode is True
def test_process_review_action_without_auto_approve_still_loads_settings(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
"""Test that execution context is created with settings even without auto-approve"""
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = sample_pending_review
# Mock process_all_reviews
mock_process_all_reviews = mocker.patch(
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
approved_review = PendingHumanReviewModel(
node_exec_id="test_node_123",
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
graph_version=1,
payload={"data": "test payload"},
instructions="Please review",
editable=True,
status=ReviewStatus.APPROVED,
review_message="Approved",
was_edited=False,
processed=False,
created_at=FIXED_NOW,
updated_at=FIXED_NOW,
reviewed_at=FIXED_NOW,
)
mock_process_all_reviews.return_value = {"test_node_123": approved_review}
# Mock create_auto_approval_record - should NOT be called when auto_approve is False
mock_create_auto_approval = mocker.patch(
"backend.api.features.executions.review.routes.create_auto_approval_record"
)
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
# Mock has_pending_reviews_for_graph_exec
mock_has_pending = mocker.patch(
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
mock_has_pending.return_value = False
# Mock get_graph_settings with sensitive_action_safe_mode enabled
mock_get_settings = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_settings"
)
mock_get_settings.return_value = GraphSettings(
human_in_the_loop_safe_mode=False,
sensitive_action_safe_mode=True,
)
# Mock get_user_by_id to prevent database access
mock_get_user = mocker.patch(
"backend.api.features.executions.review.routes.get_user_by_id"
)
mock_user = mocker.Mock()
mock_user.timezone = "UTC"
mock_get_user.return_value = mock_user
# Mock add_graph_execution
mock_add_execution = mocker.patch(
"backend.api.features.executions.review.routes.add_graph_execution"
)
# Request WITHOUT auto_approve_future (defaults to False)
request_data = {
"reviews": [
{
"node_exec_id": "test_node_123",
"approved": True,
"message": "Approved",
# auto_approve_future defaults to False
}
],
}
response = client.post("/api/review/action", json=request_data)
assert response.status_code == 200
# Verify process_all_reviews_for_execution was called
mock_process_all_reviews.assert_called_once()
# Verify create_auto_approval_record was NOT called (auto_approve_future=False)
mock_create_auto_approval.assert_not_called()
# Verify settings were loaded
mock_get_settings.assert_called_once()
# Verify ExecutionContext has proper settings
mock_add_execution.assert_called_once()
call_kwargs = mock_add_execution.call_args.kwargs
execution_context = call_kwargs["execution_context"]
assert isinstance(execution_context, ExecutionContext)
assert execution_context.human_in_the_loop_safe_mode is False
assert execution_context.sensitive_action_safe_mode is True
def test_process_review_action_auto_approve_only_applies_to_approved_reviews(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
test_user_id: str,
) -> None:
"""Test that auto_approve record is created only for approved reviews"""
# Create two reviews - one approved, one rejected
approved_review = PendingHumanReviewModel(
node_exec_id="node_exec_approved",
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
graph_version=1,
payload={"data": "approved"},
instructions="Review",
editable=True,
status=ReviewStatus.APPROVED,
review_message=None,
was_edited=False,
processed=False,
created_at=FIXED_NOW,
updated_at=FIXED_NOW,
reviewed_at=FIXED_NOW,
)
rejected_review = PendingHumanReviewModel(
node_exec_id="node_exec_rejected",
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
graph_version=1,
payload={"data": "rejected"},
instructions="Review",
editable=True,
status=ReviewStatus.REJECTED,
review_message="Rejected",
was_edited=False,
processed=False,
created_at=FIXED_NOW,
updated_at=FIXED_NOW,
reviewed_at=FIXED_NOW,
)
# Mock get_pending_review_by_node_exec_id (called to find the graph_exec_id)
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
mock_get_reviews_for_user.return_value = approved_review
# Mock process_all_reviews
mock_process_all_reviews = mocker.patch(
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
mock_process_all_reviews.return_value = {
"node_exec_approved": approved_review,
"node_exec_rejected": rejected_review,
}
# Mock get_node_execution to return node_id (only called for approved review)
mock_get_node_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_node_execution"
)
mock_node_exec = mocker.Mock(spec=NodeExecutionResult)
mock_node_exec.node_id = "test_node_def_approved"
mock_get_node_execution.return_value = mock_node_exec
# Mock create_auto_approval_record
mock_create_auto_approval = mocker.patch(
"backend.api.features.executions.review.routes.create_auto_approval_record"
)
# Mock get_graph_execution_meta to return execution in REVIEW status
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
# Mock has_pending_reviews_for_graph_exec
mock_has_pending = mocker.patch(
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
mock_has_pending.return_value = False
# Mock get_graph_settings
mock_get_settings = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_settings"
)
mock_get_settings.return_value = GraphSettings()
# Mock get_user_by_id to prevent database access
mock_get_user = mocker.patch(
"backend.api.features.executions.review.routes.get_user_by_id"
)
mock_user = mocker.Mock()
mock_user.timezone = "UTC"
mock_get_user.return_value = mock_user
# Mock add_graph_execution
mock_add_execution = mocker.patch(
"backend.api.features.executions.review.routes.add_graph_execution"
)
request_data = {
"reviews": [
{
"node_exec_id": "node_exec_approved",
"approved": True,
"auto_approve_future": True,
},
{
"node_exec_id": "node_exec_rejected",
"approved": False,
"auto_approve_future": True, # Should be ignored since rejected
},
],
}
response = client.post("/api/review/action", json=request_data)
assert response.status_code == 200
# Verify process_all_reviews_for_execution was called
mock_process_all_reviews.assert_called_once()
# Verify create_auto_approval_record was called ONLY for the approved review
# (not for the rejected one)
mock_create_auto_approval.assert_called_once_with(
user_id=test_user_id,
graph_exec_id="test_graph_exec_456",
graph_id="test_graph_789",
graph_version=1,
node_id="test_node_def_approved",
payload={"data": "approved"},
)
# Verify get_node_execution was called only for approved review
mock_get_node_execution.assert_called_once_with("node_exec_approved")
# Verify ExecutionContext was created (auto-approval is now DB-based)
call_kwargs = mock_add_execution.call_args.kwargs
execution_context = call_kwargs["execution_context"]
assert isinstance(execution_context, ExecutionContext)
def test_process_review_action_per_review_auto_approve_granularity(
client: fastapi.testclient.TestClient,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
"""Test that auto-approval can be set per-review (granular control)"""
# Mock get_pending_review_by_node_exec_id - return different reviews based on node_exec_id
mock_get_reviews_for_user = mocker.patch(
"backend.api.features.executions.review.routes.get_pending_review_by_node_exec_id"
)
# Create a mapping of node_exec_id to review
review_map = {
"node_1_auto": PendingHumanReviewModel(
node_exec_id="node_1_auto",
user_id=test_user_id,
graph_exec_id="test_graph_exec",
graph_id="test_graph",
graph_version=1,
payload={"data": "node1"},
instructions="Review 1",
editable=True,
status=ReviewStatus.WAITING,
review_message=None,
was_edited=False,
processed=False,
created_at=FIXED_NOW,
),
"node_2_manual": PendingHumanReviewModel(
node_exec_id="node_2_manual",
user_id=test_user_id,
graph_exec_id="test_graph_exec",
graph_id="test_graph",
graph_version=1,
payload={"data": "node2"},
instructions="Review 2",
editable=True,
status=ReviewStatus.WAITING,
review_message=None,
was_edited=False,
processed=False,
created_at=FIXED_NOW,
),
"node_3_auto": PendingHumanReviewModel(
node_exec_id="node_3_auto",
user_id=test_user_id,
graph_exec_id="test_graph_exec",
graph_id="test_graph",
graph_version=1,
payload={"data": "node3"},
instructions="Review 3",
editable=True,
status=ReviewStatus.WAITING,
review_message=None,
was_edited=False,
processed=False,
created_at=FIXED_NOW,
),
}
# Use side_effect to return different reviews based on node_exec_id parameter
def mock_get_review_by_id(node_exec_id: str, _user_id: str):
return review_map.get(node_exec_id)
mock_get_reviews_for_user.side_effect = mock_get_review_by_id
# Mock process_all_reviews - return 3 approved reviews
mock_process_all_reviews = mocker.patch(
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
mock_process_all_reviews.return_value = {
"node_1_auto": PendingHumanReviewModel(
node_exec_id="node_1_auto",
user_id=test_user_id,
graph_exec_id="test_graph_exec",
graph_id="test_graph",
graph_version=1,
payload={"data": "node1"},
instructions="Review 1",
editable=True,
status=ReviewStatus.APPROVED,
review_message=None,
was_edited=False,
processed=False,
created_at=FIXED_NOW,
updated_at=FIXED_NOW,
reviewed_at=FIXED_NOW,
),
"node_2_manual": PendingHumanReviewModel(
node_exec_id="node_2_manual",
user_id=test_user_id,
graph_exec_id="test_graph_exec",
graph_id="test_graph",
graph_version=1,
payload={"data": "node2"},
instructions="Review 2",
editable=True,
status=ReviewStatus.APPROVED,
review_message=None,
was_edited=False,
processed=False,
created_at=FIXED_NOW,
updated_at=FIXED_NOW,
reviewed_at=FIXED_NOW,
),
"node_3_auto": PendingHumanReviewModel(
node_exec_id="node_3_auto",
user_id=test_user_id,
graph_exec_id="test_graph_exec",
graph_id="test_graph",
graph_version=1,
payload={"data": "node3"},
instructions="Review 3",
editable=True,
status=ReviewStatus.APPROVED,
review_message=None,
was_edited=False,
processed=False,
created_at=FIXED_NOW,
updated_at=FIXED_NOW,
reviewed_at=FIXED_NOW,
),
}
# Mock get_node_execution
mock_get_node_execution = mocker.patch(
"backend.api.features.executions.review.routes.get_node_execution"
)
def mock_get_node(node_exec_id: str):
mock_node = mocker.Mock(spec=NodeExecutionResult)
mock_node.node_id = f"node_def_{node_exec_id}"
return mock_node
mock_get_node_execution.side_effect = mock_get_node
# Mock create_auto_approval_record
mock_create_auto_approval = mocker.patch(
"backend.api.features.executions.review.routes.create_auto_approval_record"
)
# Mock get_graph_execution_meta
mock_get_graph_exec = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_graph_exec_meta = mocker.Mock()
mock_graph_exec_meta.status = ExecutionStatus.REVIEW
mock_get_graph_exec.return_value = mock_graph_exec_meta
# Mock has_pending_reviews_for_graph_exec
mock_has_pending = mocker.patch(
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
mock_has_pending.return_value = False
# Mock settings and execution
mock_get_settings = mocker.patch(
"backend.api.features.executions.review.routes.get_graph_settings"
)
mock_get_settings.return_value = GraphSettings(
human_in_the_loop_safe_mode=False, sensitive_action_safe_mode=False
)
mocker.patch("backend.api.features.executions.review.routes.add_graph_execution")
mocker.patch("backend.api.features.executions.review.routes.get_user_by_id")
# Request with granular auto-approval:
# - node_1_auto: auto_approve_future=True
# - node_2_manual: auto_approve_future=False (explicit)
# - node_3_auto: auto_approve_future=True
request_data = {
"reviews": [
{
"node_exec_id": "node_1_auto",
"approved": True,
"auto_approve_future": True,
},
{
"node_exec_id": "node_2_manual",
"approved": True,
"auto_approve_future": False, # Don't auto-approve this one
},
{
"node_exec_id": "node_3_auto",
"approved": True,
"auto_approve_future": True,
},
],
}
response = client.post("/api/review/action", json=request_data)
assert response.status_code == 200
# Verify create_auto_approval_record was called ONLY for reviews with auto_approve_future=True
assert mock_create_auto_approval.call_count == 2
# Check that it was called for node_1 and node_3, but NOT node_2
call_args_list = [call.kwargs for call in mock_create_auto_approval.call_args_list]
node_ids_with_auto_approval = [args["node_id"] for args in call_args_list]
assert "node_def_node_1_auto" in node_ids_with_auto_approval
assert "node_def_node_3_auto" in node_ids_with_auto_approval
assert "node_def_node_2_manual" not in node_ids_with_auto_approval

View File

@@ -5,13 +5,23 @@ import autogpt_libs.auth as autogpt_auth_lib
from fastapi import APIRouter, HTTPException, Query, Security, status
from prisma.enums import ReviewStatus
from backend.data.execution import get_graph_execution_meta
from backend.data.execution import (
ExecutionContext,
ExecutionStatus,
get_graph_execution_meta,
get_node_execution,
)
from backend.data.graph import get_graph_settings
from backend.data.human_review import (
create_auto_approval_record,
get_pending_review_by_node_exec_id,
get_pending_reviews_for_execution,
get_pending_reviews_for_user,
has_pending_reviews_for_graph_exec,
process_all_reviews_for_execution,
)
from backend.data.model import USER_TIMEZONE_NOT_SET
from backend.data.user import get_user_by_id
from backend.executor.utils import add_graph_execution
from .model import PendingHumanReviewModel, ReviewRequest, ReviewResponse
@@ -127,17 +137,80 @@ async def process_review_action(
detail="At least one review must be provided",
)
# Build review decisions map
# Resolve the graph execution ID by looking up each requested review directly.
# Direct lookups avoid pagination issues (a paginated listing could miss
# reviews beyond the first page) and let us validate that every review
# belongs to the same execution.
matching_review = None
graph_exec_ids: set[str] = set()
for node_exec_id in all_request_node_ids:
review = await get_pending_review_by_node_exec_id(node_exec_id, user_id)
if not review:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=f"No pending review found for node execution {node_exec_id}",
)
if matching_review is None:
matching_review = review
graph_exec_ids.add(review.graph_exec_id)
# Ensure all reviews belong to the same execution
if len(graph_exec_ids) > 1:
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail="All reviews in a single request must belong to the same execution.",
)
# Safety check (matching_review should never be None here due to validation above)
if matching_review is None:
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail="Internal error: No matching review found despite validation",
)
graph_exec_id = matching_review.graph_exec_id
# Validate execution status before processing reviews
graph_exec_meta = await get_graph_execution_meta(
user_id=user_id, execution_id=graph_exec_id
)
if not graph_exec_meta:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=f"Graph execution #{graph_exec_id} not found",
)
# Only allow processing reviews if execution is paused for review
# or incomplete (partial execution with some reviews already processed)
if graph_exec_meta.status not in (
ExecutionStatus.REVIEW,
ExecutionStatus.INCOMPLETE,
):
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}. "
f"Reviews can only be processed when execution is paused (REVIEW status). "
f"Current status: {graph_exec_meta.status}",
)
# Build review decisions map and track which reviews requested auto-approval
# Auto-approved reviews use original data (no modifications allowed)
review_decisions = {}
auto_approve_requests = {} # Map node_exec_id -> auto_approve_future flag
for review in request.reviews:
review_status = (
ReviewStatus.APPROVED if review.approved else ReviewStatus.REJECTED
)
# If this review requested auto-approval, don't allow data modifications
reviewed_data = None if review.auto_approve_future else review.reviewed_data
review_decisions[review.node_exec_id] = (
review_status,
review.reviewed_data,
reviewed_data,
review.message,
)
auto_approve_requests[review.node_exec_id] = review.auto_approve_future
# Process all reviews
updated_reviews = await process_all_reviews_for_execution(
@@ -145,6 +218,32 @@ async def process_review_action(
review_decisions=review_decisions,
)
# Create auto-approval records for approved reviews that requested it
# Note: Processing sequentially to avoid event loop issues in tests
for node_exec_id, review_result in updated_reviews.items():
# Only create auto-approval if:
# 1. This review was approved
# 2. The review requested auto-approval
if review_result.status == ReviewStatus.APPROVED and auto_approve_requests.get(
node_exec_id, False
):
try:
node_exec = await get_node_execution(node_exec_id)
if node_exec:
await create_auto_approval_record(
user_id=user_id,
graph_exec_id=review_result.graph_exec_id,
graph_id=review_result.graph_id,
graph_version=review_result.graph_version,
node_id=node_exec.node_id,
payload=review_result.payload,
)
except Exception as e:
logger.error(
f"Failed to create auto-approval record for {node_exec_id}",
exc_info=e,
)
# Count results
approved_count = sum(
1
@@ -157,22 +256,37 @@ async def process_review_action(
if review.status == ReviewStatus.REJECTED
)
# Resume execution if we processed some reviews
# Resume execution only if ALL pending reviews for this execution have been processed
if updated_reviews:
# Get graph execution ID from any processed review
first_review = next(iter(updated_reviews.values()))
graph_exec_id = first_review.graph_exec_id
# Check if any pending reviews remain for this execution
still_has_pending = await has_pending_reviews_for_graph_exec(graph_exec_id)
if not still_has_pending:
# Resume execution
# Get the graph_id from any processed review
first_review = next(iter(updated_reviews.values()))
try:
# Fetch user and settings to build complete execution context
user = await get_user_by_id(user_id)
settings = await get_graph_settings(
user_id=user_id, graph_id=first_review.graph_id
)
# Preserve user's timezone preference when resuming execution
user_timezone = (
user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
)
execution_context = ExecutionContext(
human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
user_timezone=user_timezone,
)
await add_graph_execution(
graph_id=first_review.graph_id,
user_id=user_id,
graph_exec_id=graph_exec_id,
execution_context=execution_context,
)
logger.info(f"Resumed execution {graph_exec_id}")
except Exception as e:

View File

@@ -6,6 +6,7 @@ Handles generation and storage of OpenAI embeddings for all content types
"""
import asyncio
import contextvars
import logging
import time
from typing import Any
@@ -21,6 +22,11 @@ from backend.util.json import dumps
logger = logging.getLogger(__name__)
# Context variable to track errors logged in the current task/operation
# This prevents spamming the same error multiple times when processing batches
_logged_errors: contextvars.ContextVar[set[str]] = contextvars.ContextVar(
"_logged_errors"
)
# OpenAI embedding model configuration
EMBEDDING_MODEL = "text-embedding-3-small"
@@ -31,6 +37,42 @@ EMBEDDING_DIM = 1536
EMBEDDING_MAX_TOKENS = 8191
def log_once_per_task(error_key: str, log_fn, message: str, **kwargs) -> bool:
"""
Log an error/warning only once per task/operation to avoid log spam.
Uses contextvars to track what has been logged in the current async context.
Useful when processing batches where the same error might occur for many items.
Args:
error_key: Unique identifier for this error type
log_fn: Logger function to call (e.g., logger.error, logger.warning)
message: Message to log
**kwargs: Additional arguments to pass to log_fn
Returns:
True if the message was logged, False if it was suppressed (already logged)
Example:
log_once_per_task("missing_api_key", logger.error, "API key not set")
"""
# Get current logged errors, or create a new set if this is the first call in this context
logged = _logged_errors.get(None)
if logged is None:
logged = set()
_logged_errors.set(logged)
if error_key in logged:
return False
# Log the message with a note that it will only appear once
log_fn(f"{message} (This message will only be shown once per task.)", **kwargs)
# Mark as logged
logged.add(error_key)
return True
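# Usage sketch (hypothetical: `items` is an illustrative batch, not part of
# this module). Within one async context, only the first failure logs:
#
#     for item in items:
#         if await generate_embedding(item) is None:
#             log_once_per_task(
#                 "openai_api_key_missing", logger.error, "API key not set"
#             )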
def build_searchable_text(
name: str,
description: str,
@@ -73,7 +115,11 @@ async def generate_embedding(text: str) -> list[float] | None:
try:
client = get_openai_client()
if not client:
logger.error("openai_internal_api_key not set, cannot generate embedding")
log_once_per_task(
"openai_api_key_missing",
logger.error,
"openai_internal_api_key not set, cannot generate embeddings",
)
return None
# Truncate text to token limit using tiktoken
@@ -290,7 +336,12 @@ async def ensure_embedding(
# Generate new embedding
embedding = await generate_embedding(searchable_text)
if embedding is None:
logger.warning(f"Could not generate embedding for version {version_id}")
log_once_per_task(
"embedding_generation_failed",
logger.warning,
"Could not generate embeddings (missing API key or service unavailable). "
"Embedding generation is disabled for this task.",
)
return False
# Store the embedding with metadata using new function
@@ -609,8 +660,11 @@ async def ensure_content_embedding(
# Generate new embedding
embedding = await generate_embedding(searchable_text)
if embedding is None:
logger.warning(
f"Could not generate embedding for {content_type}:{content_id}"
log_once_per_task(
"embedding_generation_failed",
logger.warning,
"Could not generate embeddings (missing API key or service unavailable). "
"Embedding generation is disabled for this task.",
)
return False

View File

@@ -116,6 +116,7 @@ class PrintToConsoleBlock(Block):
input_schema=PrintToConsoleBlock.Input,
output_schema=PrintToConsoleBlock.Output,
test_input={"text": "Hello, World!"},
is_sensitive_action=True,
test_output=[
("output", "Hello, World!"),
("status", "printed"),

View File

@@ -9,7 +9,7 @@ from typing import Any, Optional
from prisma.enums import ReviewStatus
from pydantic import BaseModel
from backend.data.execution import ExecutionContext, ExecutionStatus
from backend.data.execution import ExecutionStatus
from backend.data.human_review import ReviewResult
from backend.executor.manager import async_update_node_execution_status
from backend.util.clients import get_database_manager_async_client
@@ -28,6 +28,11 @@ class ReviewDecision(BaseModel):
class HITLReviewHelper:
"""Helper class for Human-In-The-Loop review operations."""
@staticmethod
async def check_approval(**kwargs) -> Optional[ReviewResult]:
"""Check if there's an existing approval for this node execution."""
return await get_database_manager_async_client().check_approval(**kwargs)
@staticmethod
async def get_or_create_human_review(**kwargs) -> Optional[ReviewResult]:
"""Create or retrieve a human review from the database."""
@@ -55,11 +60,11 @@ class HITLReviewHelper:
async def _handle_review_request(
input_data: Any,
user_id: str,
node_id: str,
node_exec_id: str,
graph_exec_id: str,
graph_id: str,
graph_version: int,
execution_context: ExecutionContext,
block_name: str = "Block",
editable: bool = False,
) -> Optional[ReviewResult]:
@@ -69,11 +74,11 @@ class HITLReviewHelper:
Args:
input_data: The input data to be reviewed
user_id: ID of the user requesting the review
node_id: ID of the node in the graph definition
node_exec_id: ID of the node execution
graph_exec_id: ID of the graph execution
graph_id: ID of the graph
graph_version: Version of the graph
execution_context: Current execution context
block_name: Name of the block requesting review
editable: Whether the reviewer can edit the data
@@ -83,15 +88,41 @@ class HITLReviewHelper:
Raises:
Exception: If review creation or status update fails
"""
# Skip review if safe mode is disabled - return auto-approved result
if not execution_context.human_in_the_loop_safe_mode:
# Note: Safe mode checks (human_in_the_loop_safe_mode, sensitive_action_safe_mode)
# are handled by the caller:
# - HITL blocks check human_in_the_loop_safe_mode in their run() method
# - Sensitive action blocks check sensitive_action_safe_mode in is_block_exec_need_review()
# This function only handles checking for existing approvals.
# Check if this node has already been approved (normal or auto-approval)
if approval_result := await HITLReviewHelper.check_approval(
node_exec_id=node_exec_id,
graph_exec_id=graph_exec_id,
node_id=node_id,
user_id=user_id,
input_data=input_data,
):
logger.info(
f"Block {block_name} skipping review for node {node_exec_id} - safe mode disabled"
f"Block {block_name} skipping review for node {node_exec_id} - "
f"found existing approval"
)
# Return a new ReviewResult with the current node_exec_id but approved status
# For auto-approvals, always use current input_data
# For normal approvals, use approval_result.data unless it's None
is_auto_approval = approval_result.node_exec_id != node_exec_id
approved_data = (
input_data
if is_auto_approval
else (
approval_result.data
if approval_result.data is not None
else input_data
)
)
return ReviewResult(
data=input_data,
data=approved_data,
status=ReviewStatus.APPROVED,
message="Auto-approved (safe mode disabled)",
message=approval_result.message,
processed=True,
node_exec_id=node_exec_id,
)
@@ -129,11 +160,11 @@ class HITLReviewHelper:
async def handle_review_decision(
input_data: Any,
user_id: str,
node_id: str,
node_exec_id: str,
graph_exec_id: str,
graph_id: str,
graph_version: int,
execution_context: ExecutionContext,
block_name: str = "Block",
editable: bool = False,
) -> Optional[ReviewDecision]:
@@ -143,11 +174,11 @@ class HITLReviewHelper:
Args:
input_data: The input data to be reviewed
user_id: ID of the user requesting the review
node_id: ID of the node in the graph definition
node_exec_id: ID of the node execution
graph_exec_id: ID of the graph execution
graph_id: ID of the graph
graph_version: Version of the graph
execution_context: Current execution context
block_name: Name of the block requesting review
editable: Whether the reviewer can edit the data
@@ -158,11 +189,11 @@ class HITLReviewHelper:
review_result = await HITLReviewHelper._handle_review_request(
input_data=input_data,
user_id=user_id,
node_id=node_id,
node_exec_id=node_exec_id,
graph_exec_id=graph_exec_id,
graph_id=graph_id,
graph_version=graph_version,
execution_context=execution_context,
block_name=block_name,
editable=editable,
)

View File

@@ -97,6 +97,7 @@ class HumanInTheLoopBlock(Block):
input_data: Input,
*,
user_id: str,
node_id: str,
node_exec_id: str,
graph_exec_id: str,
graph_id: str,
@@ -115,11 +116,11 @@ class HumanInTheLoopBlock(Block):
decision = await self.handle_review_decision(
input_data=input_data.data,
user_id=user_id,
node_id=node_id,
node_exec_id=node_exec_id,
graph_exec_id=graph_exec_id,
graph_id=graph_id,
graph_version=graph_version,
execution_context=execution_context,
block_name=self.name,
editable=input_data.editable,
)

View File

@@ -441,6 +441,7 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
static_output: bool = False,
block_type: BlockType = BlockType.STANDARD,
webhook_config: Optional[BlockWebhookConfig | BlockManualWebhookConfig] = None,
is_sensitive_action: bool = False,
):
"""
Initialize the block with the given schema.
@@ -473,8 +474,8 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
self.static_output = static_output
self.block_type = block_type
self.webhook_config = webhook_config
self.is_sensitive_action = is_sensitive_action
self.execution_stats: NodeExecutionStats = NodeExecutionStats()
self.is_sensitive_action: bool = False
if self.webhook_config:
if isinstance(self.webhook_config, BlockWebhookConfig):
@@ -622,6 +623,7 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
input_data: BlockInput,
*,
user_id: str,
node_id: str,
node_exec_id: str,
graph_exec_id: str,
graph_id: str,
@@ -648,11 +650,11 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
decision = await HITLReviewHelper.handle_review_decision(
input_data=input_data,
user_id=user_id,
node_id=node_id,
node_exec_id=node_exec_id,
graph_exec_id=graph_exec_id,
graph_id=graph_id,
graph_version=graph_version,
execution_context=execution_context,
block_name=self.name,
editable=True,
)
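With the new constructor flag, a block opts into the sensitive-action review flow declaratively. A minimal sketch (block name, ID, and schemas are hypothetical; cf. the PrintToConsoleBlock diff above):

class DeleteRepoBlock(Block):
    def __init__(self):
        super().__init__(
            id="00000000-0000-0000-0000-000000000000",  # hypothetical ID
            input_schema=DeleteRepoBlock.Input,  # schemas elided in this sketch
            output_schema=DeleteRepoBlock.Output,
            is_sensitive_action=True,  # reviewed when sensitive_action_safe_mode is on
        )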

View File

@@ -6,7 +6,7 @@ Handles all database operations for pending human reviews.
import asyncio
import logging
from datetime import datetime, timezone
from typing import Optional
from typing import TYPE_CHECKING, Optional
from prisma.enums import ReviewStatus
from prisma.models import PendingHumanReview
@@ -17,8 +17,12 @@ from backend.api.features.executions.review.model import (
PendingHumanReviewModel,
SafeJsonData,
)
from backend.data.execution import get_graph_execution_meta
from backend.util.json import SafeJson
if TYPE_CHECKING:
pass
logger = logging.getLogger(__name__)
@@ -32,6 +36,125 @@ class ReviewResult(BaseModel):
node_exec_id: str
def get_auto_approve_key(graph_exec_id: str, node_id: str) -> str:
"""Generate the special nodeExecId key for auto-approval records."""
return f"auto_approve_{graph_exec_id}_{node_id}"
async def check_approval(
node_exec_id: str,
graph_exec_id: str,
node_id: str,
user_id: str,
input_data: SafeJsonData | None = None,
) -> Optional[ReviewResult]:
"""
Check if there's an existing approval for this node execution.
Checks both:
1. Normal approval by node_exec_id (previous run of the same node execution)
2. Auto-approval by special key pattern "auto_approve_{graph_exec_id}_{node_id}"
Args:
node_exec_id: ID of the node execution
graph_exec_id: ID of the graph execution
node_id: ID of the node definition (not execution)
user_id: ID of the user (for data isolation)
input_data: Current input data (used for auto-approvals to avoid stale data)
Returns:
ReviewResult if approval found (either normal or auto), None otherwise
"""
auto_approve_key = get_auto_approve_key(graph_exec_id, node_id)
# Check for either normal approval or auto-approval in a single query
existing_review = await PendingHumanReview.prisma().find_first(
where={
"OR": [
{"nodeExecId": node_exec_id},
{"nodeExecId": auto_approve_key},
],
"status": ReviewStatus.APPROVED,
"userId": user_id,
},
)
if existing_review:
is_auto_approval = existing_review.nodeExecId == auto_approve_key
logger.info(
f"Found {'auto-' if is_auto_approval else ''}approval for node {node_id} "
f"(exec: {node_exec_id}) in execution {graph_exec_id}"
)
# For auto-approvals, use current input_data to avoid replaying stale payload
# For normal approvals, use the stored payload (which may have been edited)
return ReviewResult(
data=(
input_data
if is_auto_approval and input_data is not None
else existing_review.payload
),
status=ReviewStatus.APPROVED,
message=(
"Auto-approved (user approved all future actions for this node)"
if is_auto_approval
else existing_review.reviewMessage or ""
),
processed=True,
node_exec_id=existing_review.nodeExecId,
)
return None
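# Usage sketch (hypothetical IDs), mirroring HITLReviewHelper.check_approval:
#
#     approval = await check_approval(
#         node_exec_id="ne-1",
#         graph_exec_id="ge-456",
#         node_id="n-123",
#         user_id="user-1",
#         input_data={"text": "hello"},
#     )
#     if approval:
#         # Approved earlier, directly or via an auto-approval record;
#         # no new pending review is needed.
#         ...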
async def create_auto_approval_record(
user_id: str,
graph_exec_id: str,
graph_id: str,
graph_version: int,
node_id: str,
payload: SafeJsonData,
) -> None:
"""
Create an auto-approval record for a node in this execution.
This is stored as a PendingHumanReview with a special nodeExecId pattern
and status=APPROVED, so future executions of the same node can skip review.
Raises:
ValueError: If the graph execution doesn't belong to the user
"""
# Validate that the graph execution belongs to this user (defense in depth)
graph_exec = await get_graph_execution_meta(
user_id=user_id, execution_id=graph_exec_id
)
if not graph_exec:
raise ValueError(
f"Graph execution {graph_exec_id} not found or doesn't belong to user {user_id}"
)
auto_approve_key = get_auto_approve_key(graph_exec_id, node_id)
await PendingHumanReview.prisma().upsert(
where={"nodeExecId": auto_approve_key},
data={
"create": {
"nodeExecId": auto_approve_key,
"userId": user_id,
"graphExecId": graph_exec_id,
"graphId": graph_id,
"graphVersion": graph_version,
"payload": SafeJson(payload),
"instructions": "Auto-approval record",
"editable": False,
"status": ReviewStatus.APPROVED,
"processed": True,
"reviewedAt": datetime.now(timezone.utc),
},
"update": {}, # Already exists, no update needed
},
)
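# The stored row (abridged, hypothetical IDs) looks like:
#   nodeExecId: "auto_approve_ge-456_n-123"
#   status: APPROVED, processed: True, editable: False
# so future runs of node "n-123" in execution "ge-456" match it in
# check_approval() and skip the review prompt.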
async def get_or_create_human_review(
user_id: str,
node_exec_id: str,
@@ -108,6 +231,38 @@ async def get_or_create_human_review(
)
async def get_pending_review_by_node_exec_id(
node_exec_id: str, user_id: str
) -> Optional["PendingHumanReviewModel"]:
"""
Get a pending review by its node execution ID.
Args:
node_exec_id: The node execution ID to look up
user_id: User ID for authorization (only returns if review belongs to this user)
Returns:
The pending review if found and belongs to user, None otherwise
"""
review = await PendingHumanReview.prisma().find_first(
where={
"nodeExecId": node_exec_id,
"userId": user_id,
"status": ReviewStatus.WAITING,
}
)
if not review:
return None
# Local import to avoid event loop conflicts in tests
from backend.data.execution import get_node_execution
node_exec = await get_node_execution(review.nodeExecId)
node_id = node_exec.node_id if node_exec else review.nodeExecId
return PendingHumanReviewModel.from_db(review, node_id=node_id)
async def has_pending_reviews_for_graph_exec(graph_exec_id: str) -> bool:
"""
Check if a graph execution has any pending reviews.
@@ -137,8 +292,11 @@ async def get_pending_reviews_for_user(
page_size: Number of reviews per page
Returns:
List of pending review models
List of pending review models with node_id included
"""
# Local import to avoid event loop conflicts in tests
from backend.data.execution import get_node_execution
# Calculate offset for pagination
offset = (page - 1) * page_size
@@ -149,7 +307,14 @@ async def get_pending_reviews_for_user(
take=page_size,
)
return [PendingHumanReviewModel.from_db(review) for review in reviews]
# Fetch node_id for each review from NodeExecution
result = []
for review in reviews:
node_exec = await get_node_execution(review.nodeExecId)
node_id = node_exec.node_id if node_exec else review.nodeExecId
result.append(PendingHumanReviewModel.from_db(review, node_id=node_id))
return result
async def get_pending_reviews_for_execution(
@@ -163,8 +328,11 @@ async def get_pending_reviews_for_execution(
user_id: User ID for security validation
Returns:
List of pending review models
List of pending review models with node_id included
"""
# Local import to avoid event loop conflicts in tests
from backend.data.execution import get_node_execution
reviews = await PendingHumanReview.prisma().find_many(
where={
"userId": user_id,
@@ -174,7 +342,14 @@ async def get_pending_reviews_for_execution(
order={"createdAt": "asc"},
)
return [PendingHumanReviewModel.from_db(review) for review in reviews]
# Fetch node_id for each review from NodeExecution
result = []
for review in reviews:
node_exec = await get_node_execution(review.nodeExecId)
node_id = node_exec.node_id if node_exec else review.nodeExecId
result.append(PendingHumanReviewModel.from_db(review, node_id=node_id))
return result
async def process_all_reviews_for_execution(
@@ -244,11 +419,19 @@ async def process_all_reviews_for_execution(
# Note: Execution resumption is now handled at the API layer after ALL reviews
# for an execution are processed (both approved and rejected)
# Return as dict for easy access
return {
review.nodeExecId: PendingHumanReviewModel.from_db(review)
for review in updated_reviews
}
# Fetch node_id for each review and return as dict for easy access
# Local import to avoid event loop conflicts in tests
from backend.data.execution import get_node_execution
result = {}
for review in updated_reviews:
node_exec = await get_node_execution(review.nodeExecId)
node_id = node_exec.node_id if node_exec else review.nodeExecId
result[review.nodeExecId] = PendingHumanReviewModel.from_db(
review, node_id=node_id
)
return result
async def update_review_processed_status(node_exec_id: str, processed: bool) -> None:
@@ -256,3 +439,44 @@ async def update_review_processed_status(node_exec_id: str, processed: bool) ->
await PendingHumanReview.prisma().update(
where={"nodeExecId": node_exec_id}, data={"processed": processed}
)
async def cancel_pending_reviews_for_execution(graph_exec_id: str, user_id: str) -> int:
"""
Cancel all pending reviews for a graph execution (e.g., when execution is stopped).
Marks all WAITING reviews as REJECTED with a message indicating the execution was stopped.
Args:
graph_exec_id: The graph execution ID
user_id: User ID who owns the execution (for security validation)
Returns:
Number of reviews cancelled
Raises:
ValueError: If the graph execution doesn't belong to the user
"""
# Validate user ownership before cancelling reviews
graph_exec = await get_graph_execution_meta(
user_id=user_id, execution_id=graph_exec_id
)
if not graph_exec:
raise ValueError(
f"Graph execution {graph_exec_id} not found or doesn't belong to user {user_id}"
)
result = await PendingHumanReview.prisma().update_many(
where={
"graphExecId": graph_exec_id,
"userId": user_id,
"status": ReviewStatus.WAITING,
},
data={
"status": ReviewStatus.REJECTED,
"reviewMessage": "Execution was stopped by user",
"processed": True,
"reviewedAt": datetime.now(timezone.utc),
},
)
return result
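# Usage sketch (hypothetical IDs): stop_graph_execution() calls this when
# terminating an execution that is paused in REVIEW status.
#
#     cancelled = await cancel_pending_reviews_for_execution("ge-456", "user-1")
#     logger.info(f"Cancelled {cancelled} pending review(s)")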

View File

@@ -46,8 +46,8 @@ async def test_get_or_create_human_review_new(
sample_db_review.status = ReviewStatus.WAITING
sample_db_review.processed = False
mock_upsert = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_upsert.return_value.upsert = AsyncMock(return_value=sample_db_review)
mock_prisma = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_prisma.return_value.upsert = AsyncMock(return_value=sample_db_review)
result = await get_or_create_human_review(
user_id="test-user-123",
@@ -75,8 +75,8 @@ async def test_get_or_create_human_review_approved(
sample_db_review.processed = False
sample_db_review.reviewMessage = "Looks good"
mock_upsert = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_upsert.return_value.upsert = AsyncMock(return_value=sample_db_review)
mock_prisma = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_prisma.return_value.upsert = AsyncMock(return_value=sample_db_review)
result = await get_or_create_human_review(
user_id="test-user-123",
@@ -131,10 +131,19 @@ async def test_get_pending_reviews_for_user(
mock_find_many = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_find_many.return_value.find_many = AsyncMock(return_value=[sample_db_review])
# Mock get_node_execution to return node with node_id (async function)
mock_node_exec = Mock()
mock_node_exec.node_id = "test_node_def_789"
mocker.patch(
"backend.data.execution.get_node_execution",
new=AsyncMock(return_value=mock_node_exec),
)
result = await get_pending_reviews_for_user("test_user", page=2, page_size=10)
assert len(result) == 1
assert result[0].node_exec_id == "test_node_123"
assert result[0].node_id == "test_node_def_789"
# Verify pagination parameters
call_args = mock_find_many.return_value.find_many.call_args
@@ -151,12 +160,21 @@ async def test_get_pending_reviews_for_execution(
mock_find_many = mocker.patch("backend.data.human_review.PendingHumanReview.prisma")
mock_find_many.return_value.find_many = AsyncMock(return_value=[sample_db_review])
# Mock get_node_execution to return node with node_id (async function)
mock_node_exec = Mock()
mock_node_exec.node_id = "test_node_def_789"
mocker.patch(
"backend.data.execution.get_node_execution",
new=AsyncMock(return_value=mock_node_exec),
)
result = await get_pending_reviews_for_execution(
"test_graph_exec_456", "test-user-123"
)
assert len(result) == 1
assert result[0].graph_exec_id == "test_graph_exec_456"
assert result[0].node_id == "test_node_def_789"
# Verify it filters by execution and user
call_args = mock_find_many.return_value.find_many.call_args
@@ -201,6 +219,14 @@ async def test_process_all_reviews_for_execution_success(
new=AsyncMock(return_value=[updated_review]),
)
# Mock get_node_execution to return node with node_id (async function)
mock_node_exec = Mock()
mock_node_exec.node_id = "test_node_def_789"
mocker.patch(
"backend.data.execution.get_node_execution",
new=AsyncMock(return_value=mock_node_exec),
)
result = await process_all_reviews_for_execution(
user_id="test-user-123",
review_decisions={
@@ -211,6 +237,7 @@ async def test_process_all_reviews_for_execution_success(
assert len(result) == 1
assert "test_node_123" in result
assert result["test_node_123"].status == ReviewStatus.APPROVED
assert result["test_node_123"].node_id == "test_node_def_789"
@pytest.mark.asyncio
@@ -329,6 +356,14 @@ async def test_process_all_reviews_mixed_approval_rejection(
new=AsyncMock(return_value=[approved_review, rejected_review]),
)
# Mock get_node_execution to return node with node_id (async function)
mock_node_exec = Mock()
mock_node_exec.node_id = "test_node_def_789"
mocker.patch(
"backend.data.execution.get_node_execution",
new=AsyncMock(return_value=mock_node_exec),
)
result = await process_all_reviews_for_execution(
user_id="test-user-123",
review_decisions={
@@ -340,3 +375,5 @@ async def test_process_all_reviews_mixed_approval_rejection(
assert len(result) == 2
assert "test_node_123" in result
assert "test_node_456" in result
assert result["test_node_123"].node_id == "test_node_def_789"
assert result["test_node_456"].node_id == "test_node_def_789"

View File

@@ -50,6 +50,8 @@ from backend.data.graph import (
validate_graph_execution_permissions,
)
from backend.data.human_review import (
cancel_pending_reviews_for_execution,
check_approval,
get_or_create_human_review,
has_pending_reviews_for_graph_exec,
update_review_processed_status,
@@ -190,6 +192,8 @@ class DatabaseManager(AppService):
get_user_notification_preference = _(get_user_notification_preference)
# Human In The Loop
cancel_pending_reviews_for_execution = _(cancel_pending_reviews_for_execution)
check_approval = _(check_approval)
get_or_create_human_review = _(get_or_create_human_review)
has_pending_reviews_for_graph_exec = _(has_pending_reviews_for_graph_exec)
update_review_processed_status = _(update_review_processed_status)
@@ -313,6 +317,8 @@ class DatabaseManagerAsyncClient(AppServiceClient):
set_execution_kv_data = d.set_execution_kv_data
# Human In The Loop
cancel_pending_reviews_for_execution = d.cancel_pending_reviews_for_execution
check_approval = d.check_approval
get_or_create_human_review = d.get_or_create_human_review
update_review_processed_status = d.update_review_processed_status

View File

@@ -10,6 +10,7 @@ from pydantic import BaseModel, JsonValue, ValidationError
from backend.data import execution as execution_db
from backend.data import graph as graph_db
from backend.data import human_review as human_review_db
from backend.data import onboarding as onboarding_db
from backend.data import user as user_db
from backend.data.block import (
@@ -749,9 +750,27 @@ async def stop_graph_execution(
if graph_exec.status in [
ExecutionStatus.QUEUED,
ExecutionStatus.INCOMPLETE,
ExecutionStatus.REVIEW,
]:
# If the graph is still on the queue, we can prevent them from being executed
# by setting the status to TERMINATED.
# If the graph is queued/incomplete/paused for review, terminate immediately
# No need to wait for executor since it's not actively running
# If graph is in REVIEW status, clean up pending reviews before terminating
if graph_exec.status == ExecutionStatus.REVIEW:
# Use human_review_db if Prisma connected, else database manager
review_db = (
human_review_db
if prisma.is_connected()
else get_database_manager_async_client()
)
# Mark all pending reviews as rejected/cancelled
cancelled_count = await review_db.cancel_pending_reviews_for_execution(
graph_exec_id, user_id
)
logger.info(
f"Cancelled {cancelled_count} pending review(s) for stopped execution {graph_exec_id}"
)
graph_exec.status = ExecutionStatus.TERMINATED
await asyncio.gather(
@@ -887,9 +906,28 @@ async def add_graph_execution(
nodes_to_skip=nodes_to_skip,
execution_context=execution_context,
)
logger.info(f"Publishing execution {graph_exec.id} to execution queue")
logger.info(f"Queueing execution {graph_exec.id}")
# Update execution status to QUEUED BEFORE publishing to prevent race condition
# where two concurrent requests could both publish the same execution
updated_exec = await edb.update_graph_execution_stats(
graph_exec_id=graph_exec.id,
status=ExecutionStatus.QUEUED,
)
# Verify the status update succeeded (prevents duplicate queueing in race conditions).
# If another request already transitioned the status, the returned record will
# not be QUEUED and we skip publishing.
if not updated_exec or updated_exec.status != ExecutionStatus.QUEUED:
logger.warning(
f"Skipping queue publish for execution {graph_exec.id} - "
f"status update failed or execution already queued by another request"
)
return graph_exec
graph_exec.status = ExecutionStatus.QUEUED
# Publish to execution queue for executor to pick up
# This happens AFTER status update to ensure only one request publishes
exec_queue = await get_async_execution_queue()
await exec_queue.publish_message(
routing_key=GRAPH_EXECUTION_ROUTING_KEY,
@@ -897,13 +935,6 @@ async def add_graph_execution(
exchange=GRAPH_EXECUTION_EXCHANGE,
)
logger.info(f"Published execution {graph_exec.id} to RabbitMQ queue")
# Update execution status to QUEUED
graph_exec.status = ExecutionStatus.QUEUED
await edb.update_graph_execution_stats(
graph_exec_id=graph_exec.id,
status=graph_exec.status,
)
except BaseException as e:
err = str(e) or type(e).__name__
if not graph_exec:
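
The status-before-publish ordering above makes the database update act as a compare-and-set guard. A minimal sketch of the pattern (independent of this codebase; helper names are illustrative):

# Two concurrent requests reach this point with the same execution ID.
updated = await update_execution_status(exec_id, status="QUEUED")
if not updated or updated.status != "QUEUED":
    return  # the other request won the race and will publish
await queue.publish(exec_id)  # exactly one request gets here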

View File

@@ -4,6 +4,7 @@ import pytest
from pytest_mock import MockerFixture
from backend.data.dynamic_fields import merge_execution_input, parse_execution_output
from backend.data.execution import ExecutionStatus
from backend.util.mock import MockObject
@@ -346,6 +347,7 @@ async def test_add_graph_execution_is_repeatable(mocker: MockerFixture):
mock_graph_exec = mocker.MagicMock(spec=GraphExecutionWithNodes)
mock_graph_exec.id = "execution-id-123"
mock_graph_exec.node_executions = [] # Add this to avoid AttributeError
mock_graph_exec.status = ExecutionStatus.QUEUED # Required for race condition check
mock_graph_exec.to_graph_execution_entry.return_value = mocker.MagicMock()
# Mock the queue and event bus
@@ -611,6 +613,7 @@ async def test_add_graph_execution_with_nodes_to_skip(mocker: MockerFixture):
mock_graph_exec = mocker.MagicMock(spec=GraphExecutionWithNodes)
mock_graph_exec.id = "execution-id-123"
mock_graph_exec.node_executions = []
mock_graph_exec.status = ExecutionStatus.QUEUED # Required for race condition check
# Track what's passed to to_graph_execution_entry
captured_kwargs = {}
@@ -670,3 +673,232 @@ async def test_add_graph_execution_with_nodes_to_skip(mocker: MockerFixture):
# Verify nodes_to_skip was passed to to_graph_execution_entry
assert "nodes_to_skip" in captured_kwargs
assert captured_kwargs["nodes_to_skip"] == nodes_to_skip
@pytest.mark.asyncio
async def test_stop_graph_execution_in_review_status_cancels_pending_reviews(
mocker: MockerFixture,
):
"""Test that stopping an execution in REVIEW status cancels pending reviews."""
from backend.data.execution import ExecutionStatus, GraphExecutionMeta
from backend.executor.utils import stop_graph_execution
user_id = "test-user"
graph_exec_id = "test-exec-123"
# Mock graph execution in REVIEW status
mock_graph_exec = mocker.MagicMock(spec=GraphExecutionMeta)
mock_graph_exec.id = graph_exec_id
mock_graph_exec.status = ExecutionStatus.REVIEW
# Mock dependencies
mock_get_queue = mocker.patch("backend.executor.utils.get_async_execution_queue")
mock_queue_client = mocker.AsyncMock()
mock_get_queue.return_value = mock_queue_client
mock_prisma = mocker.patch("backend.executor.utils.prisma")
mock_prisma.is_connected.return_value = True
mock_human_review_db = mocker.patch("backend.executor.utils.human_review_db")
mock_human_review_db.cancel_pending_reviews_for_execution = mocker.AsyncMock(
return_value=2 # 2 reviews cancelled
)
mock_execution_db = mocker.patch("backend.executor.utils.execution_db")
mock_execution_db.get_graph_execution_meta = mocker.AsyncMock(
return_value=mock_graph_exec
)
mock_execution_db.update_graph_execution_stats = mocker.AsyncMock()
mock_get_event_bus = mocker.patch(
"backend.executor.utils.get_async_execution_event_bus"
)
mock_event_bus = mocker.MagicMock()
mock_event_bus.publish = mocker.AsyncMock()
mock_get_event_bus.return_value = mock_event_bus
mock_get_child_executions = mocker.patch(
"backend.executor.utils._get_child_executions"
)
mock_get_child_executions.return_value = [] # No children
# Call stop_graph_execution with timeout to allow status check
await stop_graph_execution(
user_id=user_id,
graph_exec_id=graph_exec_id,
wait_timeout=1.0, # Wait to allow status check
cascade=True,
)
# Verify pending reviews were cancelled
mock_human_review_db.cancel_pending_reviews_for_execution.assert_called_once_with(
graph_exec_id, user_id
)
# Verify execution status was updated to TERMINATED
mock_execution_db.update_graph_execution_stats.assert_called_once()
call_kwargs = mock_execution_db.update_graph_execution_stats.call_args[1]
assert call_kwargs["graph_exec_id"] == graph_exec_id
assert call_kwargs["status"] == ExecutionStatus.TERMINATED
@pytest.mark.asyncio
async def test_stop_graph_execution_with_database_manager_when_prisma_disconnected(
mocker: MockerFixture,
):
"""Test that stop uses database manager when Prisma is not connected."""
from backend.data.execution import ExecutionStatus, GraphExecutionMeta
from backend.executor.utils import stop_graph_execution
user_id = "test-user"
graph_exec_id = "test-exec-456"
# Mock graph execution in REVIEW status
mock_graph_exec = mocker.MagicMock(spec=GraphExecutionMeta)
mock_graph_exec.id = graph_exec_id
mock_graph_exec.status = ExecutionStatus.REVIEW
# Mock dependencies
mock_get_queue = mocker.patch("backend.executor.utils.get_async_execution_queue")
mock_queue_client = mocker.AsyncMock()
mock_get_queue.return_value = mock_queue_client
# Prisma is NOT connected
mock_prisma = mocker.patch("backend.executor.utils.prisma")
mock_prisma.is_connected.return_value = False
# Mock database manager client
mock_get_db_manager = mocker.patch(
"backend.executor.utils.get_database_manager_async_client"
)
mock_db_manager = mocker.AsyncMock()
mock_db_manager.get_graph_execution_meta = mocker.AsyncMock(
return_value=mock_graph_exec
)
mock_db_manager.cancel_pending_reviews_for_execution = mocker.AsyncMock(
return_value=3 # 3 reviews cancelled
)
mock_db_manager.update_graph_execution_stats = mocker.AsyncMock()
mock_get_db_manager.return_value = mock_db_manager
mock_get_event_bus = mocker.patch(
"backend.executor.utils.get_async_execution_event_bus"
)
mock_event_bus = mocker.MagicMock()
mock_event_bus.publish = mocker.AsyncMock()
mock_get_event_bus.return_value = mock_event_bus
mock_get_child_executions = mocker.patch(
"backend.executor.utils._get_child_executions"
)
mock_get_child_executions.return_value = [] # No children
# Call stop_graph_execution with timeout
await stop_graph_execution(
user_id=user_id,
graph_exec_id=graph_exec_id,
wait_timeout=1.0,
cascade=True,
)
# Verify database manager was used for cancel_pending_reviews
mock_db_manager.cancel_pending_reviews_for_execution.assert_called_once_with(
graph_exec_id, user_id
)
# Verify execution status was updated via database manager
mock_db_manager.update_graph_execution_stats.assert_called_once()
@pytest.mark.asyncio
async def test_stop_graph_execution_cascades_to_child_with_reviews(
mocker: MockerFixture,
):
"""Test that stopping parent execution cascades to children and cancels their reviews."""
from backend.data.execution import ExecutionStatus, GraphExecutionMeta
from backend.executor.utils import stop_graph_execution
user_id = "test-user"
parent_exec_id = "parent-exec"
child_exec_id = "child-exec"
# Mock parent execution in RUNNING status
mock_parent_exec = mocker.MagicMock(spec=GraphExecutionMeta)
mock_parent_exec.id = parent_exec_id
mock_parent_exec.status = ExecutionStatus.RUNNING
# Mock child execution in REVIEW status
mock_child_exec = mocker.MagicMock(spec=GraphExecutionMeta)
mock_child_exec.id = child_exec_id
mock_child_exec.status = ExecutionStatus.REVIEW
# Mock dependencies
mock_get_queue = mocker.patch("backend.executor.utils.get_async_execution_queue")
mock_queue_client = mocker.AsyncMock()
mock_get_queue.return_value = mock_queue_client
mock_prisma = mocker.patch("backend.executor.utils.prisma")
mock_prisma.is_connected.return_value = True
mock_human_review_db = mocker.patch("backend.executor.utils.human_review_db")
mock_human_review_db.cancel_pending_reviews_for_execution = mocker.AsyncMock(
return_value=1 # 1 child review cancelled
)
# Mock execution_db to return different status based on which execution is queried
mock_execution_db = mocker.patch("backend.executor.utils.execution_db")
# Track call count to simulate status transition
call_count = {"count": 0}
async def get_exec_meta_side_effect(execution_id, user_id):
call_count["count"] += 1
if execution_id == parent_exec_id:
# After a few calls (child processing happens), transition parent to TERMINATED
# This simulates the executor service processing the stop request
if call_count["count"] > 3:
mock_parent_exec.status = ExecutionStatus.TERMINATED
return mock_parent_exec
elif execution_id == child_exec_id:
return mock_child_exec
return None
mock_execution_db.get_graph_execution_meta = mocker.AsyncMock(
side_effect=get_exec_meta_side_effect
)
mock_execution_db.update_graph_execution_stats = mocker.AsyncMock()
mock_get_event_bus = mocker.patch(
"backend.executor.utils.get_async_execution_event_bus"
)
mock_event_bus = mocker.MagicMock()
mock_event_bus.publish = mocker.AsyncMock()
mock_get_event_bus.return_value = mock_event_bus
# Mock _get_child_executions to return the child
mock_get_child_executions = mocker.patch(
"backend.executor.utils._get_child_executions"
)
def get_children_side_effect(parent_id):
if parent_id == parent_exec_id:
return [mock_child_exec]
return []
mock_get_child_executions.side_effect = get_children_side_effect
# Call stop_graph_execution on parent with cascade=True
await stop_graph_execution(
user_id=user_id,
graph_exec_id=parent_exec_id,
wait_timeout=1.0,
cascade=True,
)
# Verify child reviews were cancelled
mock_human_review_db.cancel_pending_reviews_for_execution.assert_called_once_with(
child_exec_id, user_id
)
# Verify both parent and child status updates
assert mock_execution_db.update_graph_execution_stats.call_count >= 1

View File

@@ -350,19 +350,6 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
description="Whether to mark failed scans as clean or not",
)
agentgenerator_host: str = Field(
default="",
description="The host for the Agent Generator service (empty to use built-in)",
)
agentgenerator_port: int = Field(
default=8000,
description="The port for the Agent Generator service",
)
agentgenerator_timeout: int = Field(
default=120,
description="The timeout in seconds for Agent Generator service requests",
)
enable_example_blocks: bool = Field(
default=False,
description="Whether to enable example blocks in production",

View File

@@ -1,3 +1,4 @@
import asyncio
import inspect
import logging
import time
@@ -58,6 +59,11 @@ class SpinTestServer:
self.db_api.__exit__(exc_type, exc_val, exc_tb)
self.notif_manager.__exit__(exc_type, exc_val, exc_tb)
# Give services time to fully shut down
# This prevents event loop issues where services haven't fully cleaned up
# before the next test starts
await asyncio.sleep(0.5)
def setup_dependency_overrides(self):
# Override get_user_id for testing
self.agent_server.set_test_dependency_overrides(

View File

@@ -0,0 +1,7 @@
-- Remove NodeExecution foreign key from PendingHumanReview
-- The nodeExecId column remains as the primary key, but we remove the FK constraint
-- to AgentNodeExecution since PendingHumanReview records can persist after node
-- execution records are deleted.
-- Drop foreign key constraint that linked PendingHumanReview.nodeExecId to AgentNodeExecution.id
ALTER TABLE "PendingHumanReview" DROP CONSTRAINT IF EXISTS "PendingHumanReview_nodeExecId_fkey";

View File

@@ -517,8 +517,6 @@ model AgentNodeExecution {
stats Json?
PendingHumanReview PendingHumanReview?
@@index([agentGraphExecutionId, agentNodeId, executionStatus])
@@index([agentNodeId, executionStatus])
@@index([addedTime, queuedTime])
@@ -567,6 +565,7 @@ enum ReviewStatus {
}
// Pending human reviews for Human-in-the-loop blocks
// Also stores auto-approval records with special nodeExecId patterns (e.g., "auto_approve_{graph_exec_id}_{node_id}")
model PendingHumanReview {
nodeExecId String @id
userId String
@@ -585,7 +584,6 @@ model PendingHumanReview {
reviewedAt DateTime?
User User @relation(fields: [userId], references: [id], onDelete: Cascade)
NodeExecution AgentNodeExecution @relation(fields: [nodeExecId], references: [id], onDelete: Cascade)
GraphExecution AgentGraphExecution @relation(fields: [graphExecId], references: [id], onDelete: Cascade)
@@unique([nodeExecId]) // One pending review per node execution

View File

@@ -1 +0,0 @@
"""Tests for agent generator module."""

View File

@@ -1,273 +0,0 @@
"""
Tests for the Agent Generator core module.
This test suite verifies that the core functions correctly delegate to
the external Agent Generator service.
"""
from unittest.mock import AsyncMock, patch
import pytest
from backend.api.features.chat.tools.agent_generator import core
from backend.api.features.chat.tools.agent_generator.core import (
AgentGeneratorNotConfiguredError,
)
class TestServiceNotConfigured:
"""Test that functions raise AgentGeneratorNotConfiguredError when service is not configured."""
@pytest.mark.asyncio
async def test_decompose_goal_raises_when_not_configured(self):
"""Test that decompose_goal raises error when service not configured."""
with patch.object(core, "is_external_service_configured", return_value=False):
with pytest.raises(AgentGeneratorNotConfiguredError):
await core.decompose_goal("Build a chatbot")
@pytest.mark.asyncio
async def test_generate_agent_raises_when_not_configured(self):
"""Test that generate_agent raises error when service not configured."""
with patch.object(core, "is_external_service_configured", return_value=False):
with pytest.raises(AgentGeneratorNotConfiguredError):
await core.generate_agent({"steps": []})
@pytest.mark.asyncio
async def test_generate_agent_patch_raises_when_not_configured(self):
"""Test that generate_agent_patch raises error when service not configured."""
with patch.object(core, "is_external_service_configured", return_value=False):
with pytest.raises(AgentGeneratorNotConfiguredError):
await core.generate_agent_patch("Add a node", {"nodes": []})
class TestDecomposeGoal:
"""Test decompose_goal function service delegation."""
@pytest.mark.asyncio
async def test_calls_external_service(self):
"""Test that decompose_goal calls the external service."""
expected_result = {"type": "instructions", "steps": ["Step 1"]}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "decompose_goal_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
result = await core.decompose_goal("Build a chatbot")
mock_external.assert_called_once_with("Build a chatbot", "")
assert result == expected_result
@pytest.mark.asyncio
async def test_passes_context_to_external_service(self):
"""Test that decompose_goal passes context to external service."""
expected_result = {"type": "instructions", "steps": ["Step 1"]}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "decompose_goal_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
await core.decompose_goal("Build a chatbot", "Use Python")
mock_external.assert_called_once_with("Build a chatbot", "Use Python")
@pytest.mark.asyncio
async def test_returns_none_on_service_failure(self):
"""Test that decompose_goal returns None when external service fails."""
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "decompose_goal_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = None
result = await core.decompose_goal("Build a chatbot")
assert result is None
class TestGenerateAgent:
"""Test generate_agent function service delegation."""
@pytest.mark.asyncio
async def test_calls_external_service(self):
"""Test that generate_agent calls the external service."""
expected_result = {"name": "Test Agent", "nodes": [], "links": []}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
instructions = {"type": "instructions", "steps": ["Step 1"]}
result = await core.generate_agent(instructions)
mock_external.assert_called_once_with(instructions)
# Result should have id, version, is_active added if not present
assert result is not None
assert result["name"] == "Test Agent"
assert "id" in result
assert result["version"] == 1
assert result["is_active"] is True
@pytest.mark.asyncio
async def test_preserves_existing_id_and_version(self):
"""Test that external service result preserves existing id and version."""
expected_result = {
"id": "existing-id",
"version": 3,
"is_active": False,
"name": "Test Agent",
}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result.copy()
result = await core.generate_agent({"steps": []})
assert result is not None
assert result["id"] == "existing-id"
assert result["version"] == 3
assert result["is_active"] is False
@pytest.mark.asyncio
async def test_returns_none_when_external_service_fails(self):
"""Test that generate_agent returns None when external service fails."""
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = None
result = await core.generate_agent({"steps": []})
assert result is None
class TestGenerateAgentPatch:
"""Test generate_agent_patch function service delegation."""
@pytest.mark.asyncio
async def test_calls_external_service(self):
"""Test that generate_agent_patch calls the external service."""
expected_result = {"name": "Updated Agent", "nodes": [], "links": []}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_patch_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
current_agent = {"nodes": [], "links": []}
result = await core.generate_agent_patch("Add a node", current_agent)
mock_external.assert_called_once_with("Add a node", current_agent)
assert result == expected_result
@pytest.mark.asyncio
async def test_returns_clarifying_questions(self):
"""Test that generate_agent_patch returns clarifying questions."""
expected_result = {
"type": "clarifying_questions",
"questions": [{"question": "What type of node?"}],
}
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_patch_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = expected_result
result = await core.generate_agent_patch("Add a node", {"nodes": []})
assert result == expected_result
@pytest.mark.asyncio
async def test_returns_none_when_external_service_fails(self):
"""Test that generate_agent_patch returns None when service fails."""
with patch.object(
core, "is_external_service_configured", return_value=True
), patch.object(
core, "generate_agent_patch_external", new_callable=AsyncMock
) as mock_external:
mock_external.return_value = None
result = await core.generate_agent_patch("Add a node", {"nodes": []})
assert result is None
class TestJsonToGraph:
"""Test json_to_graph function."""
def test_converts_agent_json_to_graph(self):
"""Test conversion of agent JSON to Graph model."""
agent_json = {
"id": "test-id",
"version": 2,
"is_active": True,
"name": "Test Agent",
"description": "A test agent",
"nodes": [
{
"id": "node1",
"block_id": "block1",
"input_default": {"key": "value"},
"metadata": {"x": 100},
}
],
"links": [
{
"id": "link1",
"source_id": "node1",
"sink_id": "output",
"source_name": "result",
"sink_name": "input",
"is_static": False,
}
],
}
graph = core.json_to_graph(agent_json)
assert graph.id == "test-id"
assert graph.version == 2
assert graph.is_active is True
assert graph.name == "Test Agent"
assert graph.description == "A test agent"
assert len(graph.nodes) == 1
assert graph.nodes[0].id == "node1"
assert graph.nodes[0].block_id == "block1"
assert len(graph.links) == 1
assert graph.links[0].source_id == "node1"
def test_generates_ids_if_missing(self):
"""Test that missing IDs are generated."""
agent_json = {
"name": "Test Agent",
"nodes": [{"block_id": "block1"}],
"links": [],
}
graph = core.json_to_graph(agent_json)
assert graph.id is not None
assert graph.nodes[0].id is not None
if __name__ == "__main__":
pytest.main([__file__, "-v"])
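
For reference, the agent JSON shape that json_to_graph consumes, as implied by the tests above, can be sketched as a TypeScript type. Field optionality here is inferred from test_generates_ids_if_missing and is an assumption, not the backend's authoritative schema.

// Agent JSON shape inferred from the json_to_graph tests above.
// Optionality of the id fields follows test_generates_ids_if_missing;
// optionality of the remaining fields is an assumption.
type AgentJson = {
  id?: string; // generated when missing
  version?: number;
  is_active?: boolean;
  name: string;
  description?: string;
  nodes: Array<{
    id?: string; // generated when missing
    block_id: string;
    input_default?: Record<string, unknown>;
    metadata?: Record<string, unknown>;
  }>;
  links: Array<{
    id?: string;
    source_id: string;
    sink_id: string;
    source_name: string;
    sink_name: string;
    is_static: boolean;
  }>;
};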

View File

@@ -1,422 +0,0 @@
"""
Tests for the Agent Generator external service client.
This test suite verifies the external Agent Generator service integration,
including service detection, API calls, and error handling.
"""
from unittest.mock import AsyncMock, MagicMock, patch
import httpx
import pytest
from backend.api.features.chat.tools.agent_generator import service
class TestServiceConfiguration:
"""Test service configuration detection."""
def setup_method(self):
"""Reset settings singleton before each test."""
service._settings = None
service._client = None
def test_external_service_not_configured_when_host_empty(self):
"""Test that external service is not configured when host is empty."""
mock_settings = MagicMock()
mock_settings.config.agentgenerator_host = ""
with patch.object(service, "_get_settings", return_value=mock_settings):
assert service.is_external_service_configured() is False
def test_external_service_configured_when_host_set(self):
"""Test that external service is configured when host is set."""
mock_settings = MagicMock()
mock_settings.config.agentgenerator_host = "agent-generator.local"
with patch.object(service, "_get_settings", return_value=mock_settings):
assert service.is_external_service_configured() is True
def test_get_base_url(self):
"""Test base URL construction."""
mock_settings = MagicMock()
mock_settings.config.agentgenerator_host = "agent-generator.local"
mock_settings.config.agentgenerator_port = 8000
with patch.object(service, "_get_settings", return_value=mock_settings):
url = service._get_base_url()
assert url == "http://agent-generator.local:8000"
class TestDecomposeGoalExternal:
"""Test decompose_goal_external function."""
def setup_method(self):
"""Reset client singleton before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_decompose_goal_returns_instructions(self):
"""Test successful decomposition returning instructions."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "instructions",
"steps": ["Step 1", "Step 2"],
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build a chatbot")
assert result == {"type": "instructions", "steps": ["Step 1", "Step 2"]}
mock_client.post.assert_called_once_with(
"/api/decompose-description", json={"description": "Build a chatbot"}
)
@pytest.mark.asyncio
async def test_decompose_goal_returns_clarifying_questions(self):
"""Test decomposition returning clarifying questions."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "clarifying_questions",
"questions": ["What platform?", "What language?"],
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build something")
assert result == {
"type": "clarifying_questions",
"questions": ["What platform?", "What language?"],
}
@pytest.mark.asyncio
async def test_decompose_goal_with_context(self):
"""Test decomposition with additional context."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "instructions",
"steps": ["Step 1"],
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
await service.decompose_goal_external(
"Build a chatbot", context="Use Python"
)
mock_client.post.assert_called_once_with(
"/api/decompose-description",
json={"description": "Build a chatbot", "user_instruction": "Use Python"},
)
@pytest.mark.asyncio
async def test_decompose_goal_returns_unachievable_goal(self):
"""Test decomposition returning unachievable goal response."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "unachievable_goal",
"reason": "Cannot do X",
"suggested_goal": "Try Y instead",
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Do something impossible")
assert result == {
"type": "unachievable_goal",
"reason": "Cannot do X",
"suggested_goal": "Try Y instead",
}
@pytest.mark.asyncio
async def test_decompose_goal_handles_http_error(self):
"""Test decomposition handles HTTP errors gracefully."""
mock_client = AsyncMock()
mock_client.post.side_effect = httpx.HTTPStatusError(
"Server error", request=MagicMock(), response=MagicMock()
)
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build a chatbot")
assert result is None
@pytest.mark.asyncio
async def test_decompose_goal_handles_request_error(self):
"""Test decomposition handles request errors gracefully."""
mock_client = AsyncMock()
mock_client.post.side_effect = httpx.RequestError("Connection failed")
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build a chatbot")
assert result is None
@pytest.mark.asyncio
async def test_decompose_goal_handles_service_error(self):
"""Test decomposition handles service returning error."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": False,
"error": "Internal error",
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.decompose_goal_external("Build a chatbot")
assert result is None
class TestGenerateAgentExternal:
"""Test generate_agent_external function."""
def setup_method(self):
"""Reset client singleton before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_generate_agent_success(self):
"""Test successful agent generation."""
agent_json = {
"name": "Test Agent",
"nodes": [],
"links": [],
}
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"agent_json": agent_json,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
instructions = {"type": "instructions", "steps": ["Step 1"]}
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.generate_agent_external(instructions)
assert result == agent_json
mock_client.post.assert_called_once_with(
"/api/generate-agent", json={"instructions": instructions}
)
@pytest.mark.asyncio
async def test_generate_agent_handles_error(self):
"""Test agent generation handles errors gracefully."""
mock_client = AsyncMock()
mock_client.post.side_effect = httpx.RequestError("Connection failed")
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.generate_agent_external({"steps": []})
assert result is None
class TestGenerateAgentPatchExternal:
"""Test generate_agent_patch_external function."""
def setup_method(self):
"""Reset client singleton before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_generate_patch_returns_updated_agent(self):
"""Test successful patch generation returning updated agent."""
updated_agent = {
"name": "Updated Agent",
"nodes": [{"id": "1", "block_id": "test"}],
"links": [],
}
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"agent_json": updated_agent,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
current_agent = {"name": "Old Agent", "nodes": [], "links": []}
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.generate_agent_patch_external(
"Add a new node", current_agent
)
assert result == updated_agent
mock_client.post.assert_called_once_with(
"/api/update-agent",
json={
"update_request": "Add a new node",
"current_agent_json": current_agent,
},
)
@pytest.mark.asyncio
async def test_generate_patch_returns_clarifying_questions(self):
"""Test patch generation returning clarifying questions."""
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"type": "clarifying_questions",
"questions": ["What type of node?"],
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.generate_agent_patch_external(
"Add something", {"nodes": []}
)
assert result == {
"type": "clarifying_questions",
"questions": ["What type of node?"],
}
class TestHealthCheck:
"""Test health_check function."""
def setup_method(self):
"""Reset singletons before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_health_check_returns_false_when_not_configured(self):
"""Test health check returns False when service not configured."""
with patch.object(
service, "is_external_service_configured", return_value=False
):
result = await service.health_check()
assert result is False
@pytest.mark.asyncio
async def test_health_check_returns_true_when_healthy(self):
"""Test health check returns True when service is healthy."""
mock_response = MagicMock()
mock_response.json.return_value = {
"status": "healthy",
"blocks_loaded": True,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.get.return_value = mock_response
with patch.object(service, "is_external_service_configured", return_value=True):
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.health_check()
assert result is True
mock_client.get.assert_called_once_with("/health")
@pytest.mark.asyncio
async def test_health_check_returns_false_when_not_healthy(self):
"""Test health check returns False when service is not healthy."""
mock_response = MagicMock()
mock_response.json.return_value = {
"status": "unhealthy",
"blocks_loaded": False,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.get.return_value = mock_response
with patch.object(service, "is_external_service_configured", return_value=True):
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.health_check()
assert result is False
@pytest.mark.asyncio
async def test_health_check_returns_false_on_error(self):
"""Test health check returns False on connection error."""
mock_client = AsyncMock()
mock_client.get.side_effect = httpx.RequestError("Connection failed")
with patch.object(service, "is_external_service_configured", return_value=True):
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.health_check()
assert result is False
class TestGetBlocksExternal:
"""Test get_blocks_external function."""
def setup_method(self):
"""Reset client singleton before each test."""
service._settings = None
service._client = None
@pytest.mark.asyncio
async def test_get_blocks_success(self):
"""Test successful blocks retrieval."""
blocks = [
{"id": "block1", "name": "Block 1"},
{"id": "block2", "name": "Block 2"},
]
mock_response = MagicMock()
mock_response.json.return_value = {
"success": True,
"blocks": blocks,
}
mock_response.raise_for_status = MagicMock()
mock_client = AsyncMock()
mock_client.get.return_value = mock_response
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.get_blocks_external()
assert result == blocks
mock_client.get.assert_called_once_with("/api/blocks")
@pytest.mark.asyncio
async def test_get_blocks_handles_error(self):
"""Test blocks retrieval handles errors gracefully."""
mock_client = AsyncMock()
mock_client.get.side_effect = httpx.RequestError("Connection failed")
with patch.object(service, "_get_client", return_value=mock_client):
result = await service.get_blocks_external()
assert result is None
if __name__ == "__main__":
pytest.main([__file__, "-v"])
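
The deleted suite above doubles as documentation for the agent-generator HTTP contract: POST /api/decompose-description, /api/generate-agent, and /api/update-agent, plus GET /health and /api/blocks. A minimal TypeScript sketch of one call, assuming the base URL shown in the tests and a plain fetch client; this is an illustration of the contract, not the service's actual client code.

// Hedged sketch of the decompose endpoint exercised above. The base URL
// and the error behavior mirror the tests: all failures resolve to null.
const BASE_URL = "http://agent-generator.local:8000";

async function decomposeGoal(
  description: string,
  userInstruction?: string,
): Promise<unknown | null> {
  try {
    const res = await fetch(`${BASE_URL}/api/decompose-description`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(
        userInstruction
          ? { description, user_instruction: userInstruction }
          : { description },
      ),
    });
    if (!res.ok) return null; // HTTP errors return null, as in the suite
    const body = await res.json();
    return body.success ? body : null; // service-level errors also null
  } catch {
    return null; // connection errors return null
  }
}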

View File

@@ -86,7 +86,6 @@ export function FloatingSafeModeToggle({
const {
currentHITLSafeMode,
showHITLToggle,
isHITLStateUndetermined,
handleHITLToggle,
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
@@ -99,16 +98,9 @@ export function FloatingSafeModeToggle({
return null;
}
const showHITL = showHITLToggle && !isHITLStateUndetermined;
const showSensitive = showSensitiveActionToggle;
if (!showHITL && !showSensitive) {
return null;
}
return (
<div className={cn("fixed z-50 flex flex-col gap-2", className)}>
{showHITL && (
{showHITLToggle && (
<SafeModeButton
isEnabled={currentHITLSafeMode}
label="Human in the loop block approval"
@@ -119,7 +111,7 @@ export function FloatingSafeModeToggle({
fullWidth={fullWidth}
/>
)}
{showSensitive && (
{showSensitiveActionToggle && (
<SafeModeButton
isEnabled={currentSensitiveActionSafeMode}
label="Sensitive actions blocks approval"

View File

@@ -14,6 +14,10 @@ import {
import { Dialog } from "@/components/molecules/Dialog/Dialog";
import { useEffect, useRef, useState } from "react";
import { ScheduleAgentModal } from "../ScheduleAgentModal/ScheduleAgentModal";
import {
AIAgentSafetyPopup,
useAIAgentSafetyPopup,
} from "./components/AIAgentSafetyPopup/AIAgentSafetyPopup";
import { ModalHeader } from "./components/ModalHeader/ModalHeader";
import { ModalRunSection } from "./components/ModalRunSection/ModalRunSection";
import { RunActions } from "./components/RunActions/RunActions";
@@ -83,8 +87,18 @@ export function RunAgentModal({
const [isScheduleModalOpen, setIsScheduleModalOpen] = useState(false);
const [hasOverflow, setHasOverflow] = useState(false);
const [isSafetyPopupOpen, setIsSafetyPopupOpen] = useState(false);
const [pendingRunAction, setPendingRunAction] = useState<(() => void) | null>(
null,
);
const contentRef = useRef<HTMLDivElement>(null);
const { shouldShowPopup, dismissPopup } = useAIAgentSafetyPopup(
agent.id,
agent.has_sensitive_action,
agent.has_human_in_the_loop,
);
const hasAnySetupFields =
Object.keys(agentInputFields || {}).length > 0 ||
Object.keys(agentCredentialsInputFields || {}).length > 0;
@@ -165,6 +179,24 @@ export function RunAgentModal({
onScheduleCreated?.(schedule);
}
function handleRunWithSafetyCheck() {
if (shouldShowPopup) {
setPendingRunAction(() => handleRun);
setIsSafetyPopupOpen(true);
} else {
handleRun();
}
}
function handleSafetyPopupAcknowledge() {
setIsSafetyPopupOpen(false);
dismissPopup();
if (pendingRunAction) {
pendingRunAction();
setPendingRunAction(null);
}
}
return (
<>
<Dialog
@@ -248,7 +280,7 @@ export function RunAgentModal({
)}
<RunActions
defaultRunType={defaultRunType}
onRun={handleRun}
onRun={handleRunWithSafetyCheck}
isExecuting={isExecuting}
isSettingUpTrigger={isSettingUpTrigger}
isRunReady={allRequiredInputsAreSet}
@@ -266,6 +298,12 @@ export function RunAgentModal({
</div>
</Dialog.Content>
</Dialog>
<AIAgentSafetyPopup
agentId={agent.id}
isOpen={isSafetyPopupOpen}
onAcknowledge={handleSafetyPopupAcknowledge}
/>
</>
);
}
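
One subtlety in the diff above: setPendingRunAction(() => handleRun) deliberately wraps the callback, because React's setState treats a bare function argument as an updater and would invoke it immediately instead of storing it. A minimal sketch of this deferred-action pattern; the hook and its names are illustrative, not part of the codebase.

import { useState } from "react";

// Illustrative deferred-action hook mirroring the pendingRunAction logic.
function useDeferredAction() {
  const [pending, setPending] = useState<(() => void) | null>(null);

  function defer(action: () => void) {
    // setPending(action) would run `action` as a state updater; wrap it.
    setPending(() => action);
  }

  function flush() {
    if (pending) {
      pending(); // run the stored action once
      setPending(null);
    }
  }

  return { defer, flush };
}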

View File

@@ -0,0 +1,108 @@
"use client";
import { Button } from "@/components/atoms/Button/Button";
import { Text } from "@/components/atoms/Text/Text";
import { Dialog } from "@/components/molecules/Dialog/Dialog";
import { Key, storage } from "@/services/storage/local-storage";
import { ShieldCheckIcon } from "@phosphor-icons/react";
import { useCallback, useEffect, useState } from "react";
interface Props {
agentId: string;
onAcknowledge: () => void;
isOpen: boolean;
}
export function AIAgentSafetyPopup({ agentId, onAcknowledge, isOpen }: Props) {
function handleAcknowledge() {
// Add this agent to the list of agents for which the popup has been shown
const seenAgentsJson = storage.get(Key.AI_AGENT_SAFETY_POPUP_SHOWN);
const seenAgents: string[] = seenAgentsJson
? JSON.parse(seenAgentsJson)
: [];
if (!seenAgents.includes(agentId)) {
seenAgents.push(agentId);
storage.set(Key.AI_AGENT_SAFETY_POPUP_SHOWN, JSON.stringify(seenAgents));
}
onAcknowledge();
}
if (!isOpen) return null;
return (
<Dialog
controlled={{ isOpen, set: () => {} }}
styling={{ maxWidth: "480px" }}
>
<Dialog.Content>
<div className="flex flex-col items-center p-6 text-center">
<div className="mb-6 flex h-16 w-16 items-center justify-center rounded-full bg-blue-50">
<ShieldCheckIcon
weight="fill"
size={32}
className="text-blue-600"
/>
</div>
<Text variant="h3" className="mb-4">
Safety Checks Enabled
</Text>
<Text variant="body" className="mb-2 text-zinc-700">
AI-generated agents may take actions that affect your data or
external systems.
</Text>
<Text variant="body" className="mb-8 text-zinc-700">
AutoGPT includes safety checks so you&apos;ll always have the
opportunity to review and approve sensitive actions before they
happen.
</Text>
<Button
variant="primary"
size="large"
className="w-full"
onClick={handleAcknowledge}
>
Got it
</Button>
</div>
</Dialog.Content>
</Dialog>
);
}
export function useAIAgentSafetyPopup(
agentId: string,
hasSensitiveAction: boolean,
hasHumanInTheLoop: boolean,
) {
const [shouldShowPopup, setShouldShowPopup] = useState(false);
const [hasChecked, setHasChecked] = useState(false);
useEffect(() => {
if (hasChecked) return;
const seenAgentsJson = storage.get(Key.AI_AGENT_SAFETY_POPUP_SHOWN);
const seenAgents: string[] = seenAgentsJson
? JSON.parse(seenAgentsJson)
: [];
const hasSeenPopupForThisAgent = seenAgents.includes(agentId);
const isRelevantAgent = hasSensitiveAction || hasHumanInTheLoop;
setShouldShowPopup(!hasSeenPopupForThisAgent && isRelevantAgent);
setHasChecked(true);
}, [agentId, hasSensitiveAction, hasHumanInTheLoop, hasChecked]);
const dismissPopup = useCallback(() => {
setShouldShowPopup(false);
}, []);
return {
shouldShowPopup,
dismissPopup,
};
}
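
The popup's once-per-agent behavior rests on a JSON-encoded string array stored under a single local-storage key. A condensed sketch of that bookkeeping, reusing the storage wrapper imported above; the helper names are illustrative.

import { Key, storage } from "@/services/storage/local-storage";

// Read the list of agent IDs that have already seen the popup.
function getSeenAgents(): string[] {
  const json = storage.get(Key.AI_AGENT_SAFETY_POPUP_SHOWN);
  return json ? JSON.parse(json) : [];
}

// Record an agent ID, deduplicating before writing back.
function markSeen(agentId: string): void {
  const seen = getSeenAgents();
  if (!seen.includes(agentId)) {
    seen.push(agentId);
    storage.set(Key.AI_AGENT_SAFETY_POPUP_SHOWN, JSON.stringify(seen));
  }
}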

View File

@@ -69,7 +69,6 @@ export function SafeModeToggle({ graph, className }: Props) {
const {
currentHITLSafeMode,
showHITLToggle,
isHITLStateUndetermined,
handleHITLToggle,
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
@@ -78,20 +77,13 @@ export function SafeModeToggle({ graph, className }: Props) {
shouldShowToggle,
} = useAgentSafeMode(graph);
if (!shouldShowToggle || isHITLStateUndetermined) {
return null;
}
const showHITL = showHITLToggle && !isHITLStateUndetermined;
const showSensitive = showSensitiveActionToggle;
if (!showHITL && !showSensitive) {
if (!shouldShowToggle) {
return null;
}
return (
<div className={cn("flex gap-1", className)}>
{showHITL && (
{showHITLToggle && (
<SafeModeIconButton
isEnabled={currentHITLSafeMode}
label="Human-in-the-loop"
@@ -101,7 +93,7 @@ export function SafeModeToggle({ graph, className }: Props) {
isPending={isPending}
/>
)}
{showSensitive && (
{showSensitiveActionToggle && (
<SafeModeIconButton
isEnabled={currentSensitiveActionSafeMode}
label="Sensitive actions"

View File

@@ -8809,6 +8809,12 @@
"title": "Node Exec Id",
"description": "Node execution ID (primary key)"
},
"node_id": {
"type": "string",
"title": "Node Id",
"description": "Node definition ID (for grouping)",
"default": ""
},
"user_id": {
"type": "string",
"title": "User Id",
@@ -8908,7 +8914,7 @@
"created_at"
],
"title": "PendingHumanReviewModel",
"description": "Response model for pending human review data.\n\nRepresents a human review request that is awaiting user action.\nContains all necessary information for a user to review and approve\nor reject data from a Human-in-the-Loop block execution.\n\nAttributes:\n id: Unique identifier for the review record\n user_id: ID of the user who must perform the review\n node_exec_id: ID of the node execution that created this review\n graph_exec_id: ID of the graph execution containing the node\n graph_id: ID of the graph template being executed\n graph_version: Version number of the graph template\n payload: The actual data payload awaiting review\n instructions: Instructions or message for the reviewer\n editable: Whether the reviewer can edit the data\n status: Current review status (WAITING, APPROVED, or REJECTED)\n review_message: Optional message from the reviewer\n created_at: Timestamp when review was created\n updated_at: Timestamp when review was last modified\n reviewed_at: Timestamp when review was completed (if applicable)"
"description": "Response model for pending human review data.\n\nRepresents a human review request that is awaiting user action.\nContains all necessary information for a user to review and approve\nor reject data from a Human-in-the-Loop block execution.\n\nAttributes:\n id: Unique identifier for the review record\n user_id: ID of the user who must perform the review\n node_exec_id: ID of the node execution that created this review\n node_id: ID of the node definition (for grouping reviews from same node)\n graph_exec_id: ID of the graph execution containing the node\n graph_id: ID of the graph template being executed\n graph_version: Version number of the graph template\n payload: The actual data payload awaiting review\n instructions: Instructions or message for the reviewer\n editable: Whether the reviewer can edit the data\n status: Current review status (WAITING, APPROVED, or REJECTED)\n review_message: Optional message from the reviewer\n created_at: Timestamp when review was created\n updated_at: Timestamp when review was last modified\n reviewed_at: Timestamp when review was completed (if applicable)"
},
"PostmarkBounceEnum": {
"type": "integer",
@@ -9411,6 +9417,12 @@
],
"title": "Reviewed Data",
"description": "Optional edited data (ignored if approved=False)"
},
"auto_approve_future": {
"type": "boolean",
"title": "Auto Approve Future",
"description": "If true and this review is approved, future executions of this same block (node) will be automatically approved. This only affects approved reviews.",
"default": false
}
},
"type": "object",
@@ -9430,7 +9442,7 @@
"type": "object",
"required": ["reviews"],
"title": "ReviewRequest",
"description": "Request model for processing ALL pending reviews for an execution.\n\nThis request must include ALL pending reviews for a graph execution.\nEach review will be either approved (with optional data modifications)\nor rejected (data ignored). The execution will resume only after ALL reviews are processed."
"description": "Request model for processing ALL pending reviews for an execution.\n\nThis request must include ALL pending reviews for a graph execution.\nEach review will be either approved (with optional data modifications)\nor rejected (data ignored). The execution will resume only after ALL reviews are processed.\n\nEach review item can individually specify whether to auto-approve future executions\nof the same block via the `auto_approve_future` field on ReviewItem."
},
"ReviewResponse": {
"properties": {

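Putting the schema additions together, a ReviewRequest that approves one review and opts into auto-approval might look as follows; the ID and values are illustrative only.

// Illustrative request body per the schema above. Omitting reviewed_data
// makes the backend fall back to the payload stored in the database.
const reviewRequest = {
  reviews: [
    {
      node_exec_id: "example-node-exec-id",
      approved: true,
      auto_approve_future: true, // future runs of this node auto-approve
    },
  ],
};
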
View File

@@ -31,6 +31,29 @@ export function FloatingReviewsPanel({
query: {
enabled: !!(graphId && executionId),
select: okData,
// Poll while execution is in progress to detect status changes
refetchInterval: (q) => {
// Note: the refetchInterval callback receives the raw data, before the select transform
const rawData = q.state.data as
| { status: number; data?: { status?: string } }
| undefined;
if (rawData?.status !== 200) return false;
const status = rawData?.data?.status;
if (!status) return false;
// Poll every 2 seconds while running or in review
if (
status === AgentExecutionStatus.RUNNING ||
status === AgentExecutionStatus.QUEUED ||
status === AgentExecutionStatus.INCOMPLETE ||
status === AgentExecutionStatus.REVIEW
) {
return 2000;
}
return false;
},
refetchIntervalInBackground: true,
},
},
);
@@ -40,28 +63,47 @@ export function FloatingReviewsPanel({
useShallow((state) => state.graphExecutionStatus),
);
// Determine if we should poll for pending reviews
const isInReviewStatus =
executionDetails?.status === AgentExecutionStatus.REVIEW ||
graphExecutionStatus === AgentExecutionStatus.REVIEW;
const { pendingReviews, isLoading, refetch } = usePendingReviewsForExecution(
executionId || "",
{
enabled: !!executionId,
// Poll every 2 seconds when in REVIEW status to catch new reviews
refetchInterval: isInReviewStatus ? 2000 : false,
},
);
// Refetch pending reviews when execution status changes
useEffect(() => {
if (executionId) {
if (executionId && executionDetails?.status) {
refetch();
}
}, [executionDetails?.status, executionId, refetch]);
// Refetch when graph execution status changes to REVIEW
useEffect(() => {
if (graphExecutionStatus === AgentExecutionStatus.REVIEW && executionId) {
refetch();
}
}, [graphExecutionStatus, executionId, refetch]);
// Hide panel if:
// 1. No execution ID
// 2. No pending reviews and not in REVIEW status
// 3. Execution is RUNNING or QUEUED (hasn't paused for review yet)
if (!executionId) {
return null;
}
if (
!executionId ||
(!isLoading &&
pendingReviews.length === 0 &&
executionDetails?.status !== AgentExecutionStatus.REVIEW)
!isLoading &&
pendingReviews.length === 0 &&
executionDetails?.status !== AgentExecutionStatus.REVIEW
) {
return null;
}
// Don't show panel while execution is still running/queued (not paused for review)
if (
executionDetails?.status === AgentExecutionStatus.RUNNING ||
executionDetails?.status === AgentExecutionStatus.QUEUED
) {
return null;
}
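
As the comment in the diff notes, TanStack Query passes the raw, pre-select response to the refetchInterval callback. That predicate can be factored out as below; the envelope type is an assumption based on the cast used in the diff, and the status strings mirror the states polled above.

// Polling predicate matching the logic above: poll active executions at
// 2s, stop otherwise. RawEnvelope mirrors the cast used in the diff.
type RawEnvelope = { status: number; data?: { status?: string } };

const ACTIVE_STATUSES = ["RUNNING", "QUEUED", "INCOMPLETE", "REVIEW"];

function pollInterval(raw: RawEnvelope | undefined): number | false {
  if (raw?.status !== 200) return false; // only poll successful responses
  const execStatus = raw.data?.status;
  if (!execStatus) return false;
  return ACTIVE_STATUSES.includes(execStatus) ? 2000 : false;
}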

View File

@@ -1,10 +1,8 @@
import { PendingHumanReviewModel } from "@/app/api/__generated__/models/pendingHumanReviewModel";
import { Text } from "@/components/atoms/Text/Text";
import { Button } from "@/components/atoms/Button/Button";
import { Input } from "@/components/atoms/Input/Input";
import { Switch } from "@/components/atoms/Switch/Switch";
import { TrashIcon, EyeSlashIcon } from "@phosphor-icons/react";
import { useState } from "react";
import { useEffect, useState } from "react";
interface StructuredReviewPayload {
data: unknown;
@@ -40,37 +38,40 @@ function extractReviewData(payload: unknown): {
interface PendingReviewCardProps {
review: PendingHumanReviewModel;
onReviewDataChange: (nodeExecId: string, data: string) => void;
reviewMessage?: string;
onReviewMessageChange?: (nodeExecId: string, message: string) => void;
isDisabled?: boolean;
onToggleDisabled?: (nodeExecId: string) => void;
autoApproveFuture?: boolean;
onAutoApproveFutureChange?: (nodeExecId: string, enabled: boolean) => void;
externalDataValue?: string;
}
export function PendingReviewCard({
review,
onReviewDataChange,
reviewMessage = "",
onReviewMessageChange,
isDisabled = false,
onToggleDisabled,
autoApproveFuture = false,
onAutoApproveFutureChange,
externalDataValue,
}: PendingReviewCardProps) {
const extractedData = extractReviewData(review.payload);
const isDataEditable = review.editable;
const instructions = extractedData.instructions || review.instructions;
const [currentData, setCurrentData] = useState(extractedData.data);
// Sync with external data value when auto-approve is toggled
useEffect(() => {
if (externalDataValue !== undefined) {
try {
const parsedData = JSON.parse(externalDataValue);
setCurrentData(parsedData);
} catch {
// If parsing fails, keep current data
}
}
}, [externalDataValue]);
const handleDataChange = (newValue: unknown) => {
setCurrentData(newValue);
onReviewDataChange(review.node_exec_id, JSON.stringify(newValue, null, 2));
};
const handleMessageChange = (newMessage: string) => {
onReviewMessageChange?.(review.node_exec_id, newMessage);
};
// Show the simplified view when no toggle functionality is provided
const showSimplified = !onToggleDisabled;
const renderDataInput = () => {
const data = currentData;
@@ -147,35 +148,13 @@ export function PendingReviewCard({
// Use the existing HITL review interface
return (
<div className="space-y-4">
{!showSimplified && (
<div className="flex items-start justify-between">
<div className="flex-1">
{isDisabled && (
<Text variant="small" className="text-muted-foreground">
This item will be rejected
</Text>
)}
</div>
<Button
onClick={() => onToggleDisabled!(review.node_exec_id)}
variant={isDisabled ? "primary" : "secondary"}
size="small"
leftIcon={
isDisabled ? <EyeSlashIcon size={14} /> : <TrashIcon size={14} />
}
>
{isDisabled ? "Include" : "Exclude"}
</Button>
</div>
)}
{/* Show instructions as field label */}
{instructions && (
<div className="space-y-3">
<Text variant="body" className="font-semibold text-gray-900">
{getFieldLabel(instructions)}
</Text>
{isDataEditable && !isDisabled ? (
{isDataEditable && !autoApproveFuture ? (
renderDataInput()
) : (
<div className="rounded-lg border border-gray-200 bg-white p-3">
@@ -198,7 +177,7 @@ export function PendingReviewCard({
</span>
)}
</Text>
{isDataEditable && !isDisabled ? (
{isDataEditable && !autoApproveFuture ? (
renderDataInput()
) : (
<div className="rounded-lg border border-gray-200 bg-white p-3">
@@ -210,22 +189,26 @@ export function PendingReviewCard({
</div>
)}
{!showSimplified && isDisabled && (
<div>
<Text variant="body" className="mb-2 font-semibold">
Rejection Reason (Optional):
</Text>
<Input
id="rejection-reason"
label="Rejection Reason"
hideLabel
size="small"
type="textarea"
rows={3}
value={reviewMessage}
onChange={(e) => handleMessageChange(e.target.value)}
placeholder="Add any notes about why you're rejecting this..."
/>
{/* Auto-approve toggle for this review */}
{onAutoApproveFutureChange && (
<div className="space-y-2 pt-2">
<div className="flex items-center gap-3">
<Switch
checked={autoApproveFuture}
onCheckedChange={(enabled: boolean) =>
onAutoApproveFutureChange(review.node_exec_id, enabled)
}
/>
<Text variant="small" className="text-gray-700">
Auto-approve future executions of this block
</Text>
</div>
{autoApproveFuture && (
<Text variant="small" className="pl-11 text-gray-500">
Original data will be used for this and all future reviews from
this block.
</Text>
)}
</div>
)}
</div>

View File

@@ -32,14 +32,15 @@ export function PendingReviewsList({
},
);
const [reviewMessageMap, setReviewMessageMap] = useState<
Record<string, string>
>({});
const [pendingAction, setPendingAction] = useState<
"approve" | "reject" | null
>(null);
// Track per-review auto-approval state
const [autoApproveFutureMap, setAutoApproveFutureMap] = useState<
Record<string, boolean>
>({});
const { toast } = useToast();
const reviewActionMutation = usePostV2ProcessReviewAction({
@@ -88,8 +89,23 @@ export function PendingReviewsList({
setReviewDataMap((prev) => ({ ...prev, [nodeExecId]: data }));
}
function handleReviewMessageChange(nodeExecId: string, message: string) {
setReviewMessageMap((prev) => ({ ...prev, [nodeExecId]: message }));
// Handle per-review auto-approval toggle
function handleAutoApproveFutureToggle(nodeExecId: string, enabled: boolean) {
setAutoApproveFutureMap((prev) => ({
...prev,
[nodeExecId]: enabled,
}));
if (enabled) {
// Reset this review's data to its original value
const review = reviews.find((r) => r.node_exec_id === nodeExecId);
if (review) {
setReviewDataMap((prev) => ({
...prev,
[nodeExecId]: JSON.stringify(review.payload, null, 2),
}));
}
}
}
function processReviews(approved: boolean) {
@@ -107,30 +123,39 @@ export function PendingReviewsList({
for (const review of reviews) {
const reviewData = reviewDataMap[review.node_exec_id];
const reviewMessage = reviewMessageMap[review.node_exec_id];
const autoApproveThisReview = autoApproveFutureMap[review.node_exec_id];
let parsedData: any = review.payload; // Default to original payload
// When auto-approving future actions for this review, send undefined (use original data)
// Otherwise, parse and send the edited data if available
let parsedData: any = undefined;
// Parse edited data if available and editable
if (review.editable && reviewData) {
try {
parsedData = JSON.parse(reviewData);
} catch (error) {
toast({
title: "Invalid JSON",
description: `Please fix the JSON format in review for node ${review.node_exec_id}: ${error instanceof Error ? error.message : "Invalid syntax"}`,
variant: "destructive",
});
setPendingAction(null);
return;
if (!autoApproveThisReview) {
// For regular approve/reject, use edited data if available
if (review.editable && reviewData) {
try {
parsedData = JSON.parse(reviewData);
} catch (error) {
toast({
title: "Invalid JSON",
description: `Please fix the JSON format in review for node ${review.node_exec_id}: ${error instanceof Error ? error.message : "Invalid syntax"}`,
variant: "destructive",
});
setPendingAction(null);
return;
}
} else {
// No edits, use original payload
parsedData = review.payload;
}
}
// When autoApproveThisReview is true, parsedData stays undefined
// Backend will use the original payload stored in the database
reviewItems.push({
node_exec_id: review.node_exec_id,
approved,
reviewed_data: parsedData,
message: reviewMessage || undefined,
auto_approve_future: autoApproveThisReview && approved,
});
}
@@ -182,21 +207,20 @@ export function PendingReviewsList({
<div className="space-y-7">
{reviews.map((review) => (
<PendingReviewCard
key={review.node_exec_id}
key={`${review.node_exec_id}`}
review={review}
onReviewDataChange={handleReviewDataChange}
onReviewMessageChange={handleReviewMessageChange}
reviewMessage={reviewMessageMap[review.node_exec_id] || ""}
autoApproveFuture={
autoApproveFutureMap[review.node_exec_id] || false
}
onAutoApproveFutureChange={handleAutoApproveFutureToggle}
externalDataValue={reviewDataMap[review.node_exec_id]}
/>
))}
</div>
<div className="space-y-7">
<Text variant="body" className="text-textGrey">
Note: Changes you make here apply only to this task
</Text>
<div className="flex gap-2">
<div className="space-y-4">
<div className="flex flex-wrap gap-2">
<Button
onClick={() => processReviews(true)}
disabled={reviewActionMutation.isPending || reviews.length === 0}
@@ -220,6 +244,11 @@ export function PendingReviewsList({
Reject
</Button>
</div>
<Text variant="small" className="text-textGrey">
You can turn auto-approval on or off anytime in this agent&apos;s
settings.
</Text>
</div>
</div>
);
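
The branching inside processReviews reduces to a three-way choice for reviewed_data. A standalone sketch of that decision; the function name and inputs are illustrative, and JSON.parse errors are surfaced via toast in the real component rather than thrown.

// How reviewed_data is resolved per review, per the logic above.
function resolveReviewedData(
  review: { editable: boolean; payload: unknown },
  editedJson: string | undefined,
  autoApprove: boolean,
): unknown {
  if (autoApprove) return undefined; // backend reuses the stored payload
  if (review.editable && editedJson) return JSON.parse(editedJson); // may throw
  return review.payload; // no edits: send the original payload
}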

View File

@@ -15,8 +15,22 @@ export function usePendingReviews() {
};
}
export function usePendingReviewsForExecution(graphExecId: string) {
const query = useGetV2GetPendingReviewsForExecution(graphExecId);
interface UsePendingReviewsForExecutionOptions {
enabled?: boolean;
refetchInterval?: number | false;
}
export function usePendingReviewsForExecution(
graphExecId: string,
options?: UsePendingReviewsForExecutionOptions,
) {
const query = useGetV2GetPendingReviewsForExecution(graphExecId, {
query: {
enabled: options?.enabled ?? !!graphExecId,
refetchInterval: options?.refetchInterval,
refetchIntervalInBackground: !!options?.refetchInterval,
},
});
return {
pendingReviews: okData(query.data) || [],

View File

@@ -10,6 +10,7 @@ export enum Key {
LIBRARY_AGENTS_CACHE = "library-agents-cache",
CHAT_SESSION_ID = "chat_session_id",
COOKIE_CONSENT = "autogpt_cookie_consent",
AI_AGENT_SAFETY_POPUP_SHOWN = "ai-agent-safety-popup-shown",
}
function get(key: Key) {