Commit Graph

7831 Commits

Author SHA1 Message Date
Zamil Majdy
c1bb452743 refactor(backend): remove unused log_once_per_task function
The function was defined but never actually used in the codebase.
With the logging improvements from other PRs, this is no longer needed.
2026-01-24 11:15:25 -06:00
Zamil Majdy
ceba0500ef perf(backend): batch database queries in review processing
- Add get_pending_reviews_by_node_exec_ids() for batch fetching reviews
- Replace N+1 query loop with single batch query using find_many
- Execute auto-approval record creation in parallel with asyncio.gather()
- Simplify graph_exec_id extraction logic

This reduces database round-trips from N queries to 1 query when processing
multiple reviews, significantly improving performance for bulk operations.
2026-01-24 11:14:12 -06:00
Zamil Majdy
666f6d28f3 refactor(frontend): remove backward compatibility regex and console.log
- Remove regex matching for old 'Review required for X execution' format
- Remove console.log debug statement
- Simplify code by removing unnecessary string parsing
2026-01-24 11:09:35 -06:00
Zamil Majdy
435e1d4975 refactor(frontend): simplify HITL review UI logic and remove fallbacks
- Remove fallback to displayName/nodeNameMap, use instructions directly
- Remove graph fetch and nodeNameMap logic (unused)
- Simplify PendingReviewCard to use review.instructions only
- Remove all obvious comments
- Clean up unused imports
2026-01-24 11:04:16 -06:00
Zamil Majdy
17f7f0171c fix(frontend): fix HITL instruction label, header text, and default collapse state
- Show HITL instruction label by checking for user-provided names (not technical block names)
- Always show 'Review required for X' in header (not just the name)
- Default single reviews to expanded, multiple reviews to collapsed
- Detect HITL blocks by checking if name doesn't contain 'Block' suffix
2026-01-24 10:43:50 -06:00
Zamil Majdy
385536c2aa fix(frontend): show instruction label only for HITL blocks, clean up messaging
- Only show instruction label when it's in payload (HITL blocks)
- Don't show label for sensitive action blocks (uses review.instructions)
- Add backward compatibility for old format reviews
- Simplify auto-approve message (remove review count)
- Remove obvious code comments
2026-01-24 09:33:37 -06:00
Zamil Majdy
99210549c7 fix(frontend): restore HITL instruction label above input field
- Add back instructions display as label for HITL blocks
- Show user-provided name field (e.g., 'User profile data') above the input
- Only show label when instructions exist (HITL blocks)
- Other review types show no label, just the data
2026-01-24 09:30:44 -06:00
Zamil Majdy
164a5dc612 fix(frontend/backend): improve HITL review UX with proper naming and styling
Backend changes:
- Use user-provided name directly as review message (not wrapped in 'Review required for X execution')
- Pass input_data.name from HITL block instead of generic block type name

Frontend changes:
- Show HITL block name in bold in collapse header
- Use HITL name (not node UUID) in auto-approve text
- Default all review groups to collapsed state
- Remove redundant instructions display from individual cards
- Fix lint error: remove unused instructions variable

This ensures HITL blocks show meaningful names like 'User profile data'
instead of UUIDs or generic 'HumanInTheLoopBlock' in the UI.
2026-01-24 09:28:59 -06:00
Zamil Majdy
b67164e66e fix(frontend): improve review grouping UX and auto-approve messaging
- Use 'Node #shortened-id' format in collapse header for consistency
- Remove special background colors and styling from collapse button
- Only show collapse header when there are multiple reviews in a group
- Don't show Node #id on individual cards when collapse header is visible (avoid redundancy)
- Show actual node name in auto-approve text instead of generic 'this node'
  (e.g., 'Auto-approve future executions of Email Trigger')
2026-01-24 09:17:17 -06:00
Zamil Majdy
071049afdd fix(frontend): simplify auto-approve toggle styling and use 'node' terminology
- Remove border and gray background from auto-approve toggle section
- Change 'this block' to 'this node' for consistency
2026-01-24 09:11:45 -06:00
Zamil Majdy
2dcb975773 feat(frontend): improve auto-approve UX by grouping at node level
- Move auto-approve toggle from individual review cards to node group level
- Change auto-approve tracking from node_exec_id to node_id for consistency
- When auto-approve is enabled, reset ALL review items in that node group to original values
- Add node identifier display to each review card (shortened format: Node #abc1...xyz9)
- Show single toggle per node with clear messaging about how many reviews it affects

This prevents confusion when multiple review items come from the same node,
making it clear that auto-approve applies to the entire node, not individual items.
2026-01-24 09:09:54 -06:00
Zamil Majdy
dd57629598 Merge branch 'dev' into feat/sensitive-action-features
Resolved conflicts in embeddings.py by accepting dev's fail-fast error handling approach for missing OpenAI API keys.
2026-01-24 09:01:24 -06:00
Zamil Majdy
40ceca3803 fix(backend): convert review_routes tests to async to prevent event loop closure
- Convert review_routes_test.py from sync TestClient to async httpx.AsyncClient
- Use shared server fixture instead of creating new FastAPI app
- Add @pytest.mark.asyncio(loop_scope="session") to all test functions
- Convert all HTTP calls from sync to async (add await)

This fixes the 'RuntimeError: Event loop is closed' error that occurred
when OAuth tests ran after review_routes tests. The sync TestClient was
closing the session event loop, contaminating subsequent async tests.
2026-01-24 08:58:42 -06:00
Zamil Majdy
595f3508c1 refactor(backend): consolidate embedding error logging to prevent Sentry spam (#11832)
## Summary

Refactors error handling in the embedding service to prevent Sentry
alert spam. Previously, batch operations would log one error per failed
file, causing hundreds of duplicate alerts. Now, exceptions bubble up
from individual functions and are aggregated at the batch level,
producing a single log entry showing all unique error types with counts.

## Changes

### Removed Error Swallowing
- Removed try/except blocks from `generate_embedding()`,
`store_content_embedding()`, `ensure_content_embedding()`,
`get_content_embedding()`, and `ensure_embedding()`
- These functions now raise exceptions instead of returning None/False
on failure
- Added docstring notes: "Raises exceptions on failure - caller should
handle"

### Improved Batch Error Aggregation
- Updated `backfill_all_content_types()` to aggregate unique errors
- Collects all exceptions from batch results
- Groups by error type and message, shows counts
- Single log entry per content type instead of per-file

### Example Output
Before: 50 separate error logs for same issue
After: `BLOCK: 50/100 embeddings failed. Errors: PrismaError: type
vector does not exist (50x)`

## Motivation

This was triggered by the AUTOGPT-SERVER-7D2 Sentry issue where pgvector
errors created hundreds of duplicate alerts. Even after the root cause
was fixed (stale database connections), the error logging pattern would
create spam for any future issues.

## Impact

-  Reduces Sentry noise - single alert per batch instead of per-file
-  Better diagnostics - shows all unique error types with counts
-  Cleaner code - removed ~24 lines of unnecessary error swallowing
-  Proper exception propagation follows Python best practices

## Testing

- Existing tests should pass (error handling moved to batch level)
- Error aggregation logic tested via
asyncio.gather(return_exceptions=True)

## Related Issues

- Fixes Sentry alert spam from AUTOGPT-SERVER-7D2
2026-01-24 21:49:32 +07:00
Zamil Majdy
e07d9bef99 fix(backend): fix deprecated pytest-asyncio syntax in run_agent_test.py
Change deprecated @pytest.mark.asyncio(scope="session") to the correct
@pytest.mark.asyncio(loop_scope="session") syntax to eliminate deprecation
warnings in the test suite.

This affects 11 test functions in run_agent_test.py that were using the
old 'scope' parameter instead of the new 'loop_scope' parameter.
2026-01-23 23:51:26 -06:00
Zamil Majdy
3c67a1745c fix(backend): use function-scoped event loops in human review tests
Add explicit loop_scope='function' to all async tests in human_review_test.py to prevent event loop conflicts with other tests in the suite.

The pytest config has asyncio_default_fixture_loop_scope='session' which causes all async tests to share the same event loop by default. This was causing the OAuth test to fail with 'Event loop is closed' when running the full test suite.

By explicitly using function scope for our async tests, each test gets its own isolated event loop, preventing contamination between tests.
2026-01-23 21:56:46 -06:00
Zamil Majdy
3e38e7f8c6 feat(frontend): add review grouping by node_id with block names
Group pending reviews by node_id and display the block names from the graph metadata. Reviews from the same HITL block are now grouped together with a collapsible header.

Features:
- Fetch graph structure to get node metadata
- Group reviews by node_id
- Display customized block name or fall back to block_id
- Collapsible groups with caret icons
- Show review count per group
- Only show group headers when there are multiple groups

This improves UX when multiple reviews are from the same block, making it easier to understand context and batch-process related reviews.
2026-01-23 21:35:11 -06:00
Zamil Majdy
a8c1d07f09 fix(frontend): reset input value when auto-approve toggle changes
- Pass external data value from parent to child component
- Add useEffect to sync child's internal state with parent's reviewDataMap
- Fixes issue where toggling auto-approve didn't reset edited input values
- Input now correctly reverts to original payload when auto-approve is enabled
2026-01-23 20:09:52 -06:00
Zamil Majdy
69e6927209 fix(backend): resolve event loop conflicts in human review tests
- Use TYPE_CHECKING conditional imports to avoid event loop binding issues
- Keep local imports within functions to prevent async initialization conflicts
- All 25 human review tests now pass reliably (10 data layer + 15 API routes)
- Fixes node_id fetching for frontend review grouping
2026-01-23 19:29:20 -06:00
Zamil Majdy
4003d7abf6 Merge branch 'dev' into feat/sensitive-action-features 2026-01-23 19:28:17 -05:00
Zamil Majdy
0cfc3e395f feat(backend): add node_id to PendingHumanReviewModel for frontend grouping
Adds node_id field to PendingHumanReviewModel to enable frontend to:
- Group reviews from the same node together
- Show auto-approve toggle only on last review per node
- Apply auto-approval at node level (not per-review)

Changes:
- Fetch node_id from NodeExecution when loading reviews
- Add node_id parameter to PendingHumanReviewModel.from_db()
- Update test fixture with node_id
- Add temporary default value for backwards compatibility
2026-01-23 18:04:02 -06:00
Zamil Majdy
27e93a39c1 fix(test): add event loop cleanup to prevent CI test failures
Adds a 0.5s sleep after shutting down test services (ExecutionManager,
AgentServer, etc.) to ensure they fully clean up their event loops before
the next test starts. This prevents 'Event loop is closed' errors in CI
that were breaking the oauth_test.py fixture setup.

The issue occurred because:
1. SpinTestServer starts background services with their own event loops
2. When services shut down, event loops weren't fully cleaned up
3. Subsequent tests would encounter closed event loops

Fixes test isolation issue where review_routes_test.py would leave the
event loop in a bad state for tests running after it.
2026-01-23 17:13:26 -06:00
Ubbe
7892590b12 feat(frontend): refine copilot loading states (#11827)
## Changes 🏗️

- Make the loading UX better when switching between chats or loading a
new chat
- Make session/chat management logic more manageable
- Improving "Deep thinking" loading states
- Fix bug that happened when returning to chat after navigating away

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and test the above
2026-01-23 18:25:45 +07:00
Bently
82d7134fc6 feat(blocks): Add ClaudeCodeBlock for executing tasks via Claude Code in E2B sandbox (#11761)
Introduces a new ClaudeCodeBlock that enables execution of coding tasks
using Anthropic's Claude Code in an E2B sandbox. This block unlocks
powerful agentic coding capabilities - Claude Code can autonomously
create files, install packages, run commands, and build complete
applications within a secure sandboxed environment.

Changes 🏗️

- New file backend/blocks/claude_code.py:
  - ClaudeCodeBlock - Execute tasks using Claude Code in an E2B sandbox
- Dual credential support: E2B API key (sandbox) + Anthropic API key
(Claude Code)
- Session continuation support via session_id, sandbox_id, and
conversation_history
- Automatic file extraction with path, relative_path, name, and content
fields
  - Configurable timeout, setup commands, and working directory
- dispose_sandbox option to keep sandbox alive for multi-turn
conversations

Checklist 📋

For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create and execute ClaudeCodeBlock with a simple prompt ("Create a
hello world HTML file")
- [x] Verify files output includes correct path, relative_path, name,
and content
- [x] Test session continuation by passing session_id and sandbox_id
back
- [x] Build "Any API → Instant App" demo agent combining Firecrawl +
ClaudeCodeBlock + GitHub blocks
- [x] Verify generated files are pushed to GitHub with correct folder
structure using relative_path

Here are two example agents i made that can be used to test this agent,
they require github, anthropic and e2b access via api keys that are set
via the user/on the platform is testing on dev

The first agent is my

Any API → Instant App
"Transform any API documentation into a fully functional web
application. Just provide a docs URL and get a complete, ready-to-deploy
app pushed to a new GitHub repository."

[Any API → Instant
App_v36.json](https://github.com/user-attachments/files/24600326/Any.API.Instant.App_v36.json)


The second agent is my
Idea to project
"Simply enter your coding project's idea and this agent will make all of
the base initial code needed for you to start working on that project
and place it on github for you!"

[Idea to
project_v11.json](https://github.com/user-attachments/files/24600346/Idea.to.project_v11.json)

If you have any questions or issues let me know.

References
https://e2b.dev/blog/python-guide-run-claude-code-in-an-e2b-sandbox

https://github.com/e2b-dev/e2b-cookbook/tree/main/examples/anthropic-claude-code-in-sandbox-python
https://code.claude.com/docs/en/cli-reference

I tried to use E2b's "anthropic-claude-code" template but it kept
complaining it was out of date, so I make it manually spin up a E2b
instance and make it install the latest claude code and it uses that
2026-01-23 10:05:32 +00:00
Nicholas Tindle
90466908a8 refactor(docs): restructure platform docs for GitBook and remove MkDo… (#11825)
<!-- Clearly explain the need for these changes: -->
we met some reality when merging into the docs site but this fixes it
### Changes 🏗️
updates paths, adds some guides
<!-- Concisely describe all of the changes made in this pull request:
-->
update to match reality
### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] deploy it and validate

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Aligns block integrations documentation with GitBook.
> 
> - Changes generator default output to
`docs/integrations/block-integrations` and writes overview `README.md`
and `SUMMARY.md` at `docs/integrations/`
> - Adds GitBook frontmatter and hint syntax to overview; prefixes block
links with `block-integrations/`
> - Introduces `generate_summary_md` to build GitBook navigation
(including optional `guides/`)
> - Preserves per-block manual sections and adds optional `extras` +
file-level `additional_content`
> - Updates sync checker to validate parent `README.md` and `SUMMARY.md`
> - Rewrites `docs/integrations/README.md` with GitBook frontmatter and
updated links; adds `docs/integrations/SUMMARY.md`
> - Adds new guides: `guides/llm-providers.md`,
`guides/voice-providers.md`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
fdb7ff8111. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: bobby.gaffin <bobby.gaffin@agpt.co>
2026-01-23 06:18:16 +00:00
Zamil Majdy
e95a9273e6 fix(frontend): simplify review UI - remove exclude button and tone down styling
- Remove confusing Exclude/Include button functionality
- Remove blue border/background from auto-approve toggle for cleaner look
- Toggle now shows by default with subtle gray text
- Disable editing when auto-approve is enabled (not when excluded)
- Remove unused rejection message functionality
- Simpler UX: just approve/reject reviews directly
2026-01-22 23:28:43 -05:00
Zamil Majdy
abd4a2b0b3 fix(frontend): restore auto-approve toggle in review UI
The auto-approve toggle was hidden because onToggleDisabled handler was not
passed to PendingReviewCard, causing showSimplified mode. This adds:
- disabledReviewsMap state to track excluded reviews
- handleToggleDisabled handler to toggle review exclusion
- Proper filtering in processReviews to handle disabled reviews
2026-01-22 23:24:27 -05:00
Zamil Majdy
3a96455db5 fix(backend): add ExecutionStatus import to utils_test.py
Fixes test failures caused by race condition fix. The status attribute
added to mocks requires importing ExecutionStatus enum.
2026-01-22 23:07:48 -05:00
Zamil Majdy
4c68b87e00 fix(backend): Add explicit None check for matching_review to satisfy Pyright
Add safety check before accessing matching_review.graph_exec_id to prevent
Pyright reportOptionalMemberAccess error. This check should never be triggered
in practice due to validation logic, but satisfies static type checking.

Fixes Pyright error: reportOptionalMemberAccess on line 164
2026-01-22 22:49:36 -05:00
Zamil Majdy
f4296f8764 fix(backend): Mock get_user_by_id in tests to prevent database access
Add get_user_by_id mocks to three review route tests that trigger the execution resume path.
Without these mocks, the tests attempt database access via Prisma when get_user_by_id is
called to fetch user.timezone, making tests non-deterministic and potentially causing failures.

Tests fixed:
- test_process_review_action_auto_approve_creates_auto_approval_records
- test_process_review_action_without_auto_approve_still_loads_settings
- test_process_review_action_auto_approve_only_applies_to_approved_reviews

Fixes CodeRabbit comment: 2719326546
2026-01-22 22:47:13 -05:00
Zamil Majdy
c0547bb1ed fix(backend): Prevent race condition in concurrent review completion
- Update execution status to QUEUED before publishing to RabbitMQ to prevent duplicate queueing
- Verify status update succeeded before publishing message to queue
- Return early if status update fails (execution already queued by another request)
- Ensures only one concurrent request can successfully queue an execution

This prevents the race condition where two concurrent requests processing the final reviews
for the same execution could both publish to the queue if the queue publish happens before
the status update.

Fixes CodeRabbit comment: 2719411493
2026-01-22 22:38:25 -05:00
Zamil Majdy
a396155d63 fix(backend): Address CodeRabbit review comments - query scoping, stale payload, and execution validation
- Change get_pending_review_by_node_exec_id to use find_first with userId filter in query (prevents cross-tenant existence leak)
- Add input_data parameter to check_approval() to use current data for auto-approvals instead of stale stored payload
- Validate all reviews in a request belong to the same execution to prevent cross-execution review processing
- Update HITLReviewHelper to pass input_data to check_approval()

Fixes CodeRabbit comments: 2719424510, 2719424508, 2719424506
2026-01-22 22:35:42 -05:00
Zamil Majdy
4c47b4df76 style: fix prettier formatting in PendingReviewsList 2026-01-22 22:14:26 -05:00
Zamil Majdy
c46e563e6a fix(frontend): remove unused imports in PendingReviewsList 2026-01-22 22:01:22 -05:00
Zamil Majdy
a80f452ffe style: fix linter formatting issues 2026-01-22 21:39:07 -05:00
Zamil Majdy
fd970c800c feat(frontend): implement per-review auto-approval toggles
- Replace global auto_approve_future_actions with per-review auto_approve_future
- Add individual toggle for each review in PendingReviewCard
- Track per-review auto-approval state in autoApproveFutureMap
- Send auto_approve_future field with each review item
- Update UI to show per-review toggle with explanation
- Automatically reset data to original when auto-approve is enabled per review
2026-01-22 21:38:17 -05:00
Zamil Majdy
5fc1ec0ece fix(backend): add user-scoped validation to cancel_pending_reviews_for_execution
- Add user_id parameter to validate ownership before cancelling reviews
- Update call site in executor/utils.py to pass user_id
- Update all test assertions to expect user_id parameter
- Prevents cross-tenant cancellation if graph_exec_id is misused
2026-01-22 21:29:00 -05:00
Zamil Majdy
9be3ec58ae fix(backend): fix pagination bug in review lookup and implement per-review auto-approval
- Add get_pending_review_by_node_exec_id() for direct review lookup
- Replace paginated search with direct lookup to avoid missing reviews beyond page 1
- Implement per-review auto_approve_future toggle for granular control
- Fix log deduplication for embedding generation warnings
- Remove unnecessary f-string prefixes per linter feedback
- Fix all test mocks to use correct functions (get_pending_reviews_for_user vs get_pending_review_by_node_exec_id)
- All 15 review route tests passing
2026-01-22 21:24:15 -05:00
Zamil Majdy
e6ca904326 style: remove unnecessary f-string prefixes
- Remove f-string prefix from strings without placeholders
- Fixes Ruff F541 linter warning
- Addresses CodeRabbit comment 2719299451
2026-01-22 21:13:24 -05:00
Zamil Majdy
dbb56fa7aa feat: add per-review auto-approval toggle for granular control
- Change from global auto_approve_future_actions to per-review auto_approve_future flag
- Each review item can now individually specify auto-approval
- Better UX: users can auto-approve some actions but not others
- Example: auto-approve file reads but not file writes
- Backward compatible: auto_approve_future defaults to False
- Add test for per-review granularity
- Update all existing tests to use new structure
2026-01-22 21:01:35 -05:00
Zamil Majdy
0111820f61 feat: deduplicate embedding generation failure warnings
- Use log_once_per_task for embedding generation failures
- Prevents log spam when API key is missing
- Now shows single warning per task instead of per-file warnings
- Makes logs more readable and actionable
2026-01-22 20:51:23 -05:00
Zamil Majdy
1a1b1aa26d fix: add user ownership validation to create_auto_approval_record
- Add defense-in-depth check that graph_exec_id belongs to user_id
- Validates ownership before creating auto-approval records
- Prevents potential misuse if function called from other contexts
- Addresses CodeRabbit security concern (comment 2718990979)
2026-01-22 20:43:47 -05:00
Zamil Majdy
614ed8cf82 fix(backend/hitl): preserve user timezone when resuming execution from review
- Add user_timezone to ExecutionContext when resuming after review approval
- Fetch user to get timezone preference, defaulting to UTC if not set
- Make error deduplication more general using contextvars
- Replace global flag with log_once_per_task() helper for task-scoped logging
- Prevents log spam when processing batches (embeddings, etc.)

Addresses CodeRabbit comment about ExecutionContext not being exhaustive.
2026-01-22 19:59:45 -05:00
Zamil Majdy
edd4c96aa6 fix: remove accidentally committed supabase submodule 2026-01-22 19:48:22 -05:00
Zamil Majdy
cd231e2d69 fix(backend/tests): fix event loop issues in review route tests
- Convert module-level TestClient to fixture to avoid event loop conflicts
- Add missing mock for get_pending_reviews_for_user in all tests
- Add client parameter to all test functions that use the test client
- Add missing mocks for get_graph_execution_meta in several tests
- Remove asyncio.gather to avoid event loop binding issues
- Process auto-approval creation sequentially with try/except for safety

All 14 review route tests now pass successfully.
2026-01-22 19:24:07 -05:00
Zamil Majdy
399c472623 fix(backend/store): deduplicate missing API key error logs
Only log "openai_internal_api_key not set" error once per process instead
of on every embedding generation attempt. Reduces log spam when processing
batch operations without an API key configured.
2026-01-22 19:09:35 -05:00
Zamil Majdy
554e2beddf fix(backend/hitl): address CodeRabbit review feedback
- Use return_exceptions=True in asyncio.gather for auto-approval creation
  to prevent endpoint failure when auto-approval fails (reviews already processed)
- Fix empty payload handling: use explicit None check instead of truthiness
- Distinguish auto-approvals from normal approvals: auto-approvals always
  use current input_data, normal approvals preserve explicitly empty payloads
2026-01-22 19:08:14 -05:00
Zamil Majdy
29fdda3fa8 test(backend/executor): add tests for stop_graph_execution with REVIEW status
- Test cancellation of pending reviews when stopping execution in REVIEW status
- Test database manager pattern when Prisma is disconnected
- Test cascading stop to children with pending reviews
- Fix mock to simulate status transition from RUNNING to TERMINATED

Covers the bug fixes in stop_graph_execution() that handle:
1. Immediate termination of REVIEW status executions
2. Cleanup of pending reviews when stopping
3. Recursive cleanup of subagent reviews via cascade
2026-01-22 18:59:20 -05:00
Zamil Majdy
67e6a8841c fix(executor): Handle REVIEW status when stopping graph executions
Critical bug fix: stopping a graph in REVIEW status caused timeouts and orphaned reviews.

## Bugs Fixed

### 1. REVIEW Status Not Handled
Before:
- stop_graph_execution() only handled QUEUED, INCOMPLETE, RUNNING, COMPLETED, FAILED
- REVIEW status → waited 15 seconds → TimeoutError
- Graph remained stuck in REVIEW status

After:
- REVIEW status treated like QUEUED/INCOMPLETE (terminate immediately)
- No need to wait for executor since execution is paused
- Clean termination without timeouts

### 2. Orphaned Pending Reviews
Before:
- Stopping graph → status = TERMINATED
- Pending reviews remained in WAITING status
- User saw reviews for terminated execution in UI
- Could not approve/reject (backend validation rejects)
- Reviews stuck until manual cleanup

After:
- When stopping REVIEW execution, clean up pending reviews
- Mark all WAITING reviews as REJECTED
- reviewMessage: 'Execution was stopped by user'
- processed: true, reviewedAt: now()
- No orphaned reviews in UI

### 3. Subagent Reviews
Before:
- Parent graph with child (subagent) executions
- Child paused for HITL review
- Stop parent → recursively stops child
- Child reviews orphaned (same bugs as above)

After:
- Cascade stop properly handles child REVIEW status
- All child reviews cleaned up recursively
- Clean shutdown of entire execution tree

## Implementation

Changes to stop_graph_execution():
1. Added ExecutionStatus.REVIEW to immediate termination list
2. Check if status == REVIEW before marking TERMINATED
3. Update all WAITING reviews to REJECTED with message
4. Log cleanup for debugging
5. Then terminate execution normally

Cascade behavior preserved:
- Still recursively stops all child executions
- Each child's reviews cleaned up individually
- Parent waits for all children to complete cleanup
2026-01-22 18:27:08 -05:00
Zamil Majdy
aea97db485 feat(frontend): Hide pending reviews panel while execution is RUNNING/QUEUED
Defense in depth: prevent users from seeing/clicking review panel before
execution pauses for review.

Before:
- Reviews panel could show while execution is RUNNING
- User could click to open panel and see pending reviews
- Confusing UX: why are reviews shown if graph hasn't paused yet?
- Could lead to frustration when backend rejects the approval attempt

After:
- Panel hidden if execution status is RUNNING or QUEUED
- Panel only shows when status is REVIEW (paused for review)
- Clear UX: reviews appear only when execution needs user input

Benefits:
1. **Better UX**: No confusion about when to approve reviews
2. **Prevents invalid attempts**: User can't try to approve while running
3. **Works with backend validation**: Frontend hides, backend rejects
4. **Clear state**: Panel visibility directly matches execution state

Changes:
- Added status check: hide if RUNNING or QUEUED
- Panel shows only when execution has paused (REVIEW/INCOMPLETE)
- Existing polling logic still works for real-time updates
2026-01-22 18:22:33 -05:00