AutoGPT

mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-02-18 02:32:04 -05:00

Author	SHA1	Message	Date
Nicholas Tindle	7668c17d9c	feat(platform): add User Workspace for persistent CoPilot file storage (#11867 ) Implements persistent User Workspace storage for CoPilot, enabling blocks to save and retrieve files across sessions. Files are stored in session-scoped virtual paths (`/sessions/{session_id}/`). Fixes SECRT-1833 ### Changes 🏗️ Database & Storage: - Add `UserWorkspace` and `UserWorkspaceFile` Prisma models - Implement `WorkspaceStorageBackend` abstraction (GCS for cloud, local filesystem for self-hosted) - Add `workspace_id` and `session_id` fields to `ExecutionContext` Backend API: - Add REST endpoints: `GET/POST /api/workspace/files`, `GET/DELETE /api/workspace/files/{id}`, `GET /api/workspace/files/{id}/download` - Add CoPilot tools: `list_workspace_files`, `read_workspace_file`, `write_workspace_file` - Integrate workspace storage into `store_media_file()` - returns `workspace://file-id` references Block Updates: - Refactor all file-handling blocks to use unified `ExecutionContext` parameter - Update media-generating blocks to persist outputs to workspace (AIImageGenerator, AIImageCustomizer, FluxKontext, TalkingHead, FAL video, Bannerbear, etc.) 
Frontend: - Render `workspace://` image references in chat via proxy endpoint - Add "AI cannot see this image" overlay indicator CoPilot Context Mapping: - Session = Agent (graph_id) = Run (graph_exec_id) - Files scoped to `/sessions/{session_id}/` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [ ] I have tested my changes according to the test plan: - [ ] Create CoPilot session, generate image with AIImageGeneratorBlock - [ ] Verify image returns `workspace://file-id` (not base64) - [ ] Verify image renders in chat with visibility indicator - [ ] Verify workspace files persist across sessions - [ ] Test list/read/write workspace files via CoPilot tools - [ ] Test local storage backend for self-hosted deployments #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) 🤖 Generated with [Claude Code](https://claude.ai/code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Introduces a new persistent file-storage surface area (DB tables, storage backends, download API, and chat tools) and rewires `store_media_file()`/block execution context across many blocks, so regressions could impact file handling, access control, or storage costs. > > Overview > Adds a persistent per-user Workspace (new `UserWorkspace`/`UserWorkspaceFile` models plus `WorkspaceManager` + `WorkspaceStorageBackend` with GCS/local implementations) and wires it into the API via a new `/api/workspace/files/{file_id}/download` route (including header-sanitized `Content-Disposition`) and shutdown lifecycle hooks. 
> > Extends `ExecutionContext` to carry execution identity + `workspace_id`/`session_id`, updates executor tooling to clone node-specific contexts, and updates `run_block` (CoPilot) to create a session-scoped workspace and synthetic graph/run/node IDs. > > Refactors `store_media_file()` to require `execution_context` + `return_format` and to support `workspace://` references; migrates many media/file-handling blocks and related tests to the new API and to persist generated media as `workspace://...` (or fall back to data URIs outside CoPilot), and adds CoPilot chat tools for listing/reading/writing/deleting workspace files with safeguards against context bloat. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `6abc70f793`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Reinier van der Leer <pwuts@agpt.co>	2026-01-29 05:49:47 +00:00
Zamil Majdy	fb58827c61	feat(backend;frontend): Implement node-specific auto-approval, safety popup, and race condition fixes (#11810 ) ## Summary This PR implements comprehensive improvements to the human-in-the-loop (HITL) review system, including safety features, architectural changes, and bug fixes: ### Key Features - SECRT-1798: One-time safety popup - Shows informational popup before first run of AI-generated agents with sensitive actions/HITL blocks - SECRT-1795: Auto-approval toggle UX - Toggle in pending reviews panel to auto-approve future actions from the same node - Node-specific auto-approval - Changed from execution-specific to node-specific using special key pattern `auto_approve_{graph_exec_id}_{node_id}` - Consolidated approval checking - Merged `check_auto_approval` into `check_approval` using single OR query for better performance - Race condition prevention - Added execution status check before resuming to prevent duplicate execution when approving while graph is running - Parallel auto-approval creation - Uses `asyncio.gather` for better performance when creating multiple auto-approval records ## Changes ### Backend Architecture - `human_review.py`: - Added `check_approval()` function that checks both normal and auto-approval in single query - Added `create_auto_approval_record()` for node-specific auto-approval using special key pattern - Added `get_auto_approve_key()` helper to generate consistent auto-approval keys - `review/routes.py`: - Added execution status check before resuming to prevent race conditions - Refactored auto-approval record creation to use parallel execution with `asyncio.gather` - Removed obvious comments for cleaner code - `review/model.py`: Added `auto_approve_future_actions` field to `ReviewRequest` - `blocks/helpers/review.py`: Updated to use consolidated `check_approval` via database manager client - `executor/database.py`: Exposed `check_approval` through DatabaseManager RPC for block execution context - 
`data/block.py`: Fixed safe mode checks for sensitive action blocks ### Frontend - New `AIAgentSafetyPopup` component with localStorage-based one-time display - `PendingReviewsList`: - Replaced "Approve all future actions" button with toggle - Toggle resets data to original values and disables editing when enabled - Shows warning message explaining auto-approval behavior - `RunAgentModal`: Integrated safety popup before first run - `usePendingReviews`: Added polling for real-time badge updates - `FloatingSafeModeToggle` & `SafeModeToggle`: Simplified visibility logic - `local-storage.ts`: Added localStorage key for popup state tracking ### Bug Fixes - Fixed "Client is not connected to query engine" error by using database manager client pattern - Fixed race condition where approving reviews while graph is RUNNING could queue execution twice - Fixed migration to only drop FK constraint, not non-existent column - Fixed card data reset when auto-approve toggle changes ### Code Quality - Removed duplicate/obvious comments - Moved imports to top-level instead of local scope in tests - Used walrus operator for cleaner conditional assignments - Parallel execution for auto-approval record creation ## Test plan - [ ] Create an AI-generated agent with sensitive actions (e.g., email sending) - [ ] First run should show the safety popup before starting - [ ] Subsequent runs should not show the popup - [ ] Clear localStorage (`AI_AGENT_SAFETY_POPUP_SHOWN`) to verify popup shows again - [ ] Create an agent with human-in-the-loop blocks - [ ] Run it and verify the pending reviews panel appears - [ ] Enable the "Auto-approve all future actions" toggle - [ ] Verify editing is disabled and shows warning message - [ ] Click "Approve" and verify subsequent blocks from same node auto-approve - [ ] Verify auto-approval persists across multiple executions of same graph - [ ] Disable toggle and verify editing works normally - [ ] Verify "Reject" button still works regardless of toggle 
state - [ ] Test race condition: Approve reviews while graph is RUNNING (should skip resume) - [ ] Test race condition: Approve reviews while graph is REVIEW (should resume) - [ ] Verify pending reviews badge updates in real-time when new reviews are created	2026-01-25 04:05:25 +07:00
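The node-specific key pattern and the parallel record creation described in #11810 can be sketched roughly as follows. This is a minimal illustration, not the code from `human_review.py`: the function names come from the commit message, but their signatures and the in-memory `store` dict (a stand-in for the database) are assumptions.

```python
import asyncio


def get_auto_approve_key(graph_exec_id: str, node_id: str) -> str:
    """Build the key pattern quoted in the commit message."""
    return f"auto_approve_{graph_exec_id}_{node_id}"


async def create_auto_approval_record(
    store: dict, graph_exec_id: str, node_id: str
) -> str:
    # Hypothetical stand-in for a database write; `store` is an assumption.
    key = get_auto_approve_key(graph_exec_id, node_id)
    await asyncio.sleep(0)  # simulate an I/O round trip
    store[key] = True
    return key


async def create_auto_approvals(
    store: dict, graph_exec_id: str, node_ids: list
) -> list:
    # asyncio.gather runs the writes concurrently and returns the results
    # in the same order as the awaitables passed in.
    return await asyncio.gather(
        *(create_auto_approval_record(store, graph_exec_id, n) for n in node_ids)
    )


if __name__ == "__main__":
    records = {}
    print(asyncio.run(create_auto_approvals(records, "exec-1", ["node-a", "node-b"])))
```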
Zamil Majdy	8b25e62959	feat(backend,frontend): add explicit safe mode toggles for HITL and sensitive actions (#11756 ) ## Summary This PR introduces two explicit safe mode toggles for controlling agent execution behavior, providing clearer and more granular control over when agents should pause for human review. ### Key Changes New Safe Mode Settings: - `human_in_the_loop_safe_mode` (bool, default `true`) - Controls whether human-in-the-loop (HITL) blocks pause for review - `sensitive_action_safe_mode` (bool, default `false`) - Controls whether sensitive action blocks pause for review New Computed Properties on LibraryAgent: - `has_human_in_the_loop` - Indicates if agent contains HITL blocks - `has_sensitive_action` - Indicates if agent contains sensitive action blocks Block Changes: - Renamed `requires_human_review` to `is_sensitive_action` on blocks for clarity - Blocks marked as `is_sensitive_action=True` pause only when `sensitive_action_safe_mode=True` - HITL blocks pause when `human_in_the_loop_safe_mode=True` Frontend Changes: - Two separate toggles in Agent Settings based on block types present - Toggle visibility based on `has_human_in_the_loop` and `has_sensitive_action` computed properties - Settings cog hidden if neither toggle applies - Proper state management for both toggles with defaults AI-Generated Agent Behavior: - AI-generated agents set `sensitive_action_safe_mode=True` by default - This ensures sensitive actions are reviewed for AI-generated content ## Changes Backend: - `backend/data/graph.py` - Updated `GraphSettings` with two boolean toggles (non-optional with defaults), added `has_sensitive_action` computed property - `backend/data/block.py` - Renamed `requires_human_review` to `is_sensitive_action`, updated review logic - `backend/data/execution.py` - Updated `ExecutionContext` with both safe mode fields - `backend/api/features/library/model.py` - Added `has_human_in_the_loop` and `has_sensitive_action` to `LibraryAgent` - 
`backend/api/features/library/db.py` - Updated to use `sensitive_action_safe_mode` parameter - `backend/executor/utils.py` - Simplified execution context creation Frontend: - `useAgentSafeMode.ts` - Rewritten to support two independent toggles - `AgentSettingsModal.tsx` - Shows two separate toggles - `SelectedSettingsView.tsx` - Shows two separate toggles - Regenerated API types with new schema ## Test Plan - [x] All backend tests pass (Python 3.11, 3.12, 3.13) - [x] All frontend tests pass - [x] Backend format and lint pass - [x] Frontend format and lint pass - [x] Pre-commit hooks pass --------- Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>	2026-01-21 00:56:02 +00:00
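As a rough sketch of how the two toggles from #11756 might combine, assuming each block is classified as HITL, sensitive, both, or neither. The field names and defaults are taken from the commit message; `SafeModeSettings` and `should_pause_for_review` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SafeModeSettings:
    # Defaults as stated in the commit message.
    human_in_the_loop_safe_mode: bool = True
    sensitive_action_safe_mode: bool = False


def should_pause_for_review(
    settings: SafeModeSettings,
    *,
    is_hitl_block: bool,
    is_sensitive_action: bool,
) -> bool:
    """Return True when a block should pause for human review (hypothetical helper)."""
    if is_hitl_block and settings.human_in_the_loop_safe_mode:
        return True
    if is_sensitive_action and settings.sensitive_action_safe_mode:
        return True
    return False


if __name__ == "__main__":
    defaults = SafeModeSettings()
    # With default settings, HITL blocks pause but sensitive-action blocks do not.
    print(should_pause_for_review(defaults, is_hitl_block=True, is_sensitive_action=False))
    print(should_pause_for_review(defaults, is_hitl_block=False, is_sensitive_action=True))
```

Under this reading, an AI-generated agent would simply be created with `sensitive_action_safe_mode=True`, so both kinds of block pause.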
Swifty	e55f05c7a8	feat(backend): add chat search tools and BM25 reranking (#11782 ) This PR adds new chat tools for searching blocks and documentation, along with BM25 reranking for improved search relevance. ### Changes 🏗️ New Chat Tools: - `find_block` - Search for available blocks by name/description using hybrid search - `run_block` - Execute a block directly with provided inputs and credentials - `search_docs` - Search documentation with section-level granularity - `get_doc_page` - Retrieve full documentation page content Search Improvements: - Added BM25 reranking to hybrid search for better lexical relevance - Documentation handler now chunks markdown by headings (##) for finer-grained embeddings - Section-based content IDs (`doc_path::section_index`) for precise doc retrieval - Startup embedding backfill in scheduler for immediate searchability Other Changes: - New response models for block and documentation search results - Updated orphan cleanup to handle section-based doc embeddings - Added `rank-bm25` dependency for BM25 scoring - Removed max message limit check in chat service ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run find_block tool to search for blocks (e.g., "current time") - [x] Run run_block tool to execute a found block - [x] Run search_docs tool to search documentation - [x] Run get_doc_page tool to retrieve full doc content - [x] Verify BM25 reranking improves search relevance for exact term matches - [x] Verify documentation sections are properly chunked and embedded #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) Dependencies added: `rank-bm25` for BM25 scoring 
algorithm	2026-01-16 16:18:10 +01:00
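To illustrate why BM25 reranking helps exact term matches, here is a from-scratch sketch of Okapi BM25 scoring. It does not use the `rank-bm25` package that #11782 adds, and how the PR blends BM25 with the hybrid score is not stated in the commit message; the whitespace tokenizer, the `k1`/`b` defaults, and the function names are all assumptions.

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def bm25_scores(query: str, docs: list, k1: float = 1.5, b: float = 0.75) -> list:
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


def rerank(query: str, docs: list) -> list:
    """Return docs ordered by descending BM25 score."""
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order]


if __name__ == "__main__":
    candidates = ["send email block", "get current time block", "weather lookup"]
    print(rerank("current time", candidates))
```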
Zamil Majdy	8b83bb8647	feat(backend): unified hybrid search with embedding backfill for all content types (#11767 ) ## Summary This PR extends the embedding system to support blocks and documentation content types in addition to store agents, and introduces unified hybrid search across all content types using a single `UnifiedContentEmbedding` table. ### Key Changes 1. Unified Hybrid Search Architecture - Added `search` tsvector column to `UnifiedContentEmbedding` table - New `unified_hybrid_search()` function searches across all content types (agents, blocks, docs) - Updated `hybrid_search()` for store agents to use `UnifiedContentEmbedding.search` - Removed deprecated `search` column from `StoreListingVersion` table 2. Pluggable Content Handler Architecture - Created abstract `ContentHandler` base class for extensibility - Implemented handlers: `StoreAgentHandler`, `BlockHandler`, `DocumentationHandler` - Registry pattern for easy addition of new content types 3. Block Embeddings - Discovers all blocks using `get_blocks()` - Extracts searchable text from: name, description, categories, input/output schemas 4. Documentation Embeddings - Scans `/docs/` directory for `.md` and `.mdx` files - Extracts title from first `#` heading or uses filename as fallback 5. Hybrid Search Graceful Degradation - Falls back to lexical-only search if query embedding generation fails - Redistributes semantic weight proportionally to other components - Logs warning instead of throwing error 6. Database Migrations - `20260115200000_add_unified_search_tsvector`: Adds search column to UnifiedContentEmbedding with auto-update trigger - `20260115210000_remove_storelistingversion_search`: Removes deprecated search column and updates StoreAgent view 7. 
Orphan Cleanup - `cleanup_orphaned_embeddings()` removes embeddings for deleted content - Always runs after backfill, even at 100% coverage ### Review Comments Addressed - ✅ SQL parameter index bug when user_id provided (embeddings.py) - ✅ Early return skipping cleanup at 100% coverage (scheduler.py) - ✅ Inconsistent return structure across code paths (scheduler.py) - ✅ SQL UNION syntax error - added parentheses for ORDER BY/LIMIT (hybrid_search.py) - ✅ Version numeric ordering in aggregations (migration) - ✅ Embedding dimension uses EMBEDDING_DIM constant ### Files Changed - `backend/api/features/store/content_handlers.py` (NEW): Handler architecture - `backend/api/features/store/embeddings.py`: Refactored to use handlers - `backend/api/features/store/hybrid_search.py`: Unified search + graceful degradation - `backend/executor/scheduler.py`: Process all content types, consistent returns - `migrations/20260115200000_add_unified_search_tsvector/`: Add tsvector to unified table - `migrations/20260115210000_remove_storelistingversion_search/`: Remove old search column - `schema.prisma`: Updated UnifiedContentEmbedding and StoreListingVersion models - `*_test.py`: Added tests for unified_hybrid_search ## Test Plan 1. ✅ All tests passing on Python 3.11, 3.12, 3.13 2. ✅ Types check passing 3. ✅ CodeRabbit and Sentry reviews addressed 4. Deploy to staging and verify: - Backfill job processes all content types - Search results include blocks and docs - Search works without OpenAI API (graceful degradation) 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Swifty <craigswift13@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-16 09:47:19 +01:00
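The graceful-degradation step in #11767 ("redistributes semantic weight proportionally to other components") could look something like the sketch below. The component names and weight values are hypothetical, and the real `hybrid_search.py` may compute this differently.

```python
def redistribute_weights(weights: dict, unavailable: str) -> dict:
    """Drop one component and scale the rest so the total weight is unchanged.

    Each remaining component keeps its share relative to the others,
    which is what "proportional" redistribution means here.
    """
    original_total = sum(weights.values())
    remaining = {name: w for name, w in weights.items() if name != unavailable}
    remaining_total = sum(remaining.values())
    if remaining_total <= 0:
        raise ValueError("no remaining components to carry the weight")
    scale = original_total / remaining_total
    return {name: w * scale for name, w in remaining.items()}


if __name__ == "__main__":
    # Hypothetical weights; the actual values are not given in the commit message.
    weights = {"semantic": 0.5, "lexical": 0.3, "popularity": 0.2}
    # If query embedding generation fails, fall back to the non-semantic signals.
    print(redistribute_weights(weights, "semantic"))
```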
Swifty	5ac941fe2f	feat(backend): add hybrid search for store listings, docs and blocks (#11721 ) This PR adds hybrid search functionality combining semantic embeddings with traditional text search for improved store listing discovery. ### Changes 🏗️ - Add `embeddings.py` - OpenAI-based embedding generation and similarity search - Add `hybrid_search.py` - Combines vector similarity with text matching for better search results - Add `backfill_embeddings.py` - Script to generate embeddings for existing store listings - Update `db.py` - Integrate hybrid search into store database queries - Update `schema.prisma` - Add embedding storage fields and indexes - Add migrations for embedding columns and HNSW index for vector search ### Architecture Decisions 🏛️ Fail-Fast Approach (No Silent Fallbacks) We explicitly chose NOT to implement graceful degradation when hybrid search fails. Here's why: ✅ Benefits: - Errors surface immediately → faster fixes - Tests verify hybrid search actually works (not just fallback) - Consistent search quality for all users - Forces proper infrastructure setup (API keys, database) ❌ Why Not Fallback: - Silent degradation hides production issues - Users get inconsistent results without knowing why - Tests can pass even when hybrid search is broken - Reduces operational visibility How We Prevent Failures: 1. Embedding generation in approval flow (db.py:1545) 2. Error logging with `logger.error` (not warning) 3. Clear error messages (ValueError explains what's wrong) 4. Comprehensive test coverage (9/9 tests passing) If embeddings fail, it indicates a real infrastructure issue (missing API key, OpenAI down, database issues) that needs immediate attention, not silent degradation. 
### Test Coverage ✅ All tests passing (1625 total): - 9/9 hybrid_search tests (including fail-fast validation) - 3/3 db search integration tests - Full schema compatibility (public/platform schemas) - Error handling verification ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Test hybrid search returns relevant results - [x] Test embedding generation for new listings - [x] Test backfill script on existing data - [x] Verify search performance with embeddings - [x] Test fail-fast behavior when embeddings unavailable #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] Configuration: Requires `openai_internal_api_key` in secrets --------- Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-15 04:17:03 +00:00
Reinier van der Leer	b01ea3fcbd	fix(backend/executor): Centralize `increment_runs` calls & make `add_graph_execution` more robust (#11764)

[OPEN-2946: \[Scheduler\] Error executing graph <graph_id> after 19.83s: ClientNotConnectedError: Client is not connected to the query engine, you must call `connect()` before attempting to query data.](https://linear.app/autogpt/issue/OPEN-2946)

- Follow-up to #11375 <sub>(broken `increment_runs` call)</sub>
- Follow-up to #11380 <sub>(direct `get_graph_execution` call)</sub>

### Changes 🏗️

- Move `increment_runs` call from `scheduler._execute_graph` to `executor.utils.add_graph_execution` so it can be made through `DatabaseManager`
- Add `increment_onboarding_runs` to `DatabaseManager`
- Remove now-redundant `increment_onboarding_runs` calls in other places
- Make `add_graph_execution` more resilient
- Split up large try/except block
- Fix direct `get_graph_execution` call

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI + a thorough review	2026-01-15 04:08:19 +00:00
Nicholas Tindle	47a3a5ef41	feat(backend,frontend): optional credentials flag for blocks at agent level (#11716 ) This feature allows agent makers to mark credential fields as optional. When credentials are not configured for an optional block, the block will be skipped during execution rather than causing a validation error. Use case: An agent with multiple notification channels (Discord, Twilio, Slack) where the user only needs to configure one - unconfigured channels are simply skipped. ### Changes 🏗️ #### Backend Data Model Changes: - `backend/data/graph.py`: Added `credentials_optional` property to `Node` model that reads from node metadata - `backend/data/execution.py`: Added `nodes_to_skip` field to `GraphExecutionEntry` model to track nodes that should be skipped Validation Changes: - `backend/executor/utils.py`: - Updated `_validate_node_input_credentials()` to return a tuple of `(credential_errors, nodes_to_skip)` - Nodes with `credentials_optional=True` and missing credentials are added to `nodes_to_skip` instead of raising validation errors - Updated `validate_graph_with_credentials()` to propagate `nodes_to_skip` set - Updated `validate_and_construct_node_execution_input()` to return `nodes_to_skip` - Updated `add_graph_execution()` to pass `nodes_to_skip` to execution entry Execution Changes: - `backend/executor/manager.py`: - Added skip logic in `_on_graph_execution()` dispatch loop - When a node is in `nodes_to_skip`, it is marked as `COMPLETED` without execution - No outputs are produced, so downstream nodes won't trigger #### Frontend Node Store: - `frontend/src/app/(platform)/build/stores/nodeStore.ts`: - Added `credentials_optional` to node metadata serialization in `convertCustomNodeToBackendNode()` - Added `getCredentialsOptional()` and `setCredentialsOptional()` helper methods Credential Field Component: - `frontend/src/components/renderers/input-renderer/fields/CredentialField/CredentialField.tsx`: - Added "Optional - skip block if not 
configured" switch toggle - Switch controls the `credentials_optional` metadata flag - Placeholder text updates based on optional state Credential Field Hook: - `frontend/src/components/renderers/input-renderer/fields/CredentialField/useCredentialField.ts`: - Added `disableAutoSelect` parameter - When credentials are optional, auto-selection of credentials is disabled Feature Flags: - `frontend/src/services/feature-flags/use-get-flag.ts`: Minor refactor (condition ordering) ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Build an agent using smart decision maker and down stream blocks to test this <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Introduces optional credentials across graph execution and UI, allowing nodes to be skipped (no outputs, no downstream triggers) when their credentials are not configured. > > - Backend > - Adds `Node.credentials_optional` (from node `metadata`) and computes required credential fields in `Graph.credentials_input_schema` based on usage. > - Validates credentials with `_validate_node_input_credentials` → returns `(errors, nodes_to_skip)`; plumbs `nodes_to_skip` through `validate_graph_with_credentials`, `_construct_starting_node_execution_input`, `validate_and_construct_node_execution_input`, and `add_graph_execution` into `GraphExecutionEntry`. > - Executor: dispatch loop skips nodes in `nodes_to_skip` (marks `COMPLETED`); `execute_node`/`on_node_execution` accept `nodes_to_skip`; `SmartDecisionMakerBlock.run` filters tool functions whose `_sink_node_id` is in `nodes_to_skip` and errors only if all tools are filtered. > - Models: `GraphExecutionEntry` gains `nodes_to_skip` field. Tests and snapshots updated accordingly. 
> > - Frontend > - Builder: credential field uses `custom/credential_field` with an "Optional – skip block if not configured" toggle; `nodeStore` persists `credentials_optional` and history; UI hides optional toggle in run dialogs. > - Run dialogs: compute required credentials from `credentials_input_schema.required`; allow selecting "None"; avoid auto-select for optional; filter out incomplete creds before execute. > - Minor schema/UI wiring updates (`uiSchema`, form context flags). > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `5e01fd6a3e`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>	2026-01-09 14:11:35 +00:00
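The validation change in #11716 returns `(credential_errors, nodes_to_skip)` instead of failing on every missing credential. A simplified sketch of that shape is shown here; the `Node` class, its `required_credentials` field, and `validate_node_credentials` are hypothetical stand-ins for the real models in `backend/data/graph.py` and `backend/executor/utils.py`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Simplified, hypothetical node model for illustration only.
    id: str
    required_credentials: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    @property
    def credentials_optional(self) -> bool:
        # The commit message says this flag is read from node metadata.
        return bool(self.metadata.get("credentials_optional", False))


def validate_node_credentials(nodes: list, configured: set) -> tuple:
    """Return (errors, nodes_to_skip) for the given nodes."""
    errors = {}
    nodes_to_skip = set()
    for node in nodes:
        missing = [c for c in node.required_credentials if c not in configured]
        if not missing:
            continue
        if node.credentials_optional:
            # Optional and unconfigured: skip the node instead of failing.
            nodes_to_skip.add(node.id)
        else:
            errors[node.id] = missing
    return errors, nodes_to_skip


if __name__ == "__main__":
    nodes = [
        Node("discord", ["discord_bot"], {"credentials_optional": True}),
        Node("slack", ["slack_token"]),
        Node("twilio", ["twilio_key"]),
    ]
    print(validate_node_credentials(nodes, configured={"slack_token"}))
```

The executor can then treat any node ID in `nodes_to_skip` as completed without producing outputs, which is why downstream nodes are not triggered.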
Nicholas Tindle	79d45a15d0	feat(platform): Deduplicate insufficient funds Discord + email notifications (#11672 ) Add Redis-based deduplication for insufficient funds notifications (both Discord alerts and user emails) when users run out of credits. This prevents spamming users and the PRODUCT Discord channel with repeated alerts for the same user+agent combination. ### Changes 🏗️ - Redis-based deduplication (`backend/executor/manager.py`): - Add `INSUFFICIENT_FUNDS_NOTIFIED_PREFIX` constant for Redis key prefix - Add `INSUFFICIENT_FUNDS_NOTIFIED_TTL_SECONDS` (30 days) as fallback cleanup - Implement deduplication in `_handle_insufficient_funds_notif` using Redis `SET NX` - Skip both email (`ZERO_BALANCE`) and Discord notifications for duplicate alerts per user+agent - Add `clear_insufficient_funds_notifications(user_id)` function to remove all notification flags for a user - Clear flags on credit top-up (`backend/data/credit.py`): - Call `clear_insufficient_funds_notifications` in `_top_up_credits` after successful auto-charge - Call `clear_insufficient_funds_notifications` in `fulfill_checkout` after successful manual top-up - This allows users to receive notifications again if they run out of funds in the future - Comprehensive test coverage (`backend/executor/manager_insufficient_funds_test.py`): - Test first-time notification sends both email and Discord alert - Test duplicate notifications are skipped for same user+agent - Test different agents for same user get separate alerts - Test clearing notifications removes all keys for a user - Test handling when no notification keys exist - Test notifications still sent when Redis fails (graceful degradation) ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] First insufficient funds alert sends both email and Discord notification - [x] Duplicate alerts for same user+agent 
are skipped - [x] Different agents for same user each get their own notification - [x] Topping up credits clears notification flags - [x] Redis failure gracefully falls back to sending notifications - [x] 30-day TTL provides automatic cleanup as fallback - [x] Manually test this works with scheduled agents <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Introduces Redis-backed deduplication for insufficient-funds alerts and resets flags on successful credit additions. > > - Dedup insufficient-funds alerts in `executor/manager.py` using Redis `SET NX` with `INSUFFICIENT_FUNDS_NOTIFIED_PREFIX` and 30‑day TTL; skips duplicate ZERO_BALANCE email + Discord alerts per `user_id`+`graph_id`, with graceful fallback if Redis fails. > - Reset notification flags on credit increases by adding `clear_insufficient_funds_notifications(user_id)` and invoking it when enabling/adding positive `GRANT`/`TOP_UP` transactions in `data/credit.py`. > - Tests (`executor/manager_insufficient_funds_test.py`): first-time vs duplicate behavior, per-agent separation, clearing keys (including no-key and Redis-error cases), and clearing on `_add_transaction`/`_enable_transaction`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `1a4413b3a1`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Ubbe <hi@ubbe.dev> Co-authored-by: Claude <noreply@anthropic.com>	2025-12-30 18:10:30 +00:00
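The deduplication in #11672 relies on Redis `SET NX`, which only writes a key if it does not already exist. A minimal sketch of that idea follows, using an in-memory stand-in so it runs without a Redis server. `FakeRedis`, `should_notify`, the key layout, and the prefix value are assumptions; only the 30-day TTL and the fall-back-to-sending behavior come from the commit message.

```python
INSUFFICIENT_FUNDS_NOTIFIED_PREFIX = "insufficient_funds_notified"  # value assumed
INSUFFICIENT_FUNDS_NOTIFIED_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days


class FakeRedis:
    """In-memory stand-in mirroring redis-py's `set(..., nx=True)` return values."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ex=None, nx=False):
        # TTL expiry (`ex`) is accepted but not simulated in this sketch.
        if nx and key in self._data:
            return None  # key already exists, nothing written
        self._data[key] = value
        return True


def should_notify(redis_client, user_id: str, graph_id: str) -> bool:
    """Return True only for the first alert per user+agent pair."""
    key = f"{INSUFFICIENT_FUNDS_NOTIFIED_PREFIX}:{user_id}:{graph_id}"
    try:
        was_set = redis_client.set(
            key, "1", ex=INSUFFICIENT_FUNDS_NOTIFIED_TTL_SECONDS, nx=True
        )
        return bool(was_set)
    except Exception:
        # Graceful degradation: if Redis fails, send the notification anyway.
        return True


if __name__ == "__main__":
    client = FakeRedis()
    print(should_notify(client, "user-1", "agent-a"))  # first alert
    print(should_notify(client, "user-1", "agent-a"))  # duplicate
    print(should_notify(client, "user-1", "agent-b"))  # different agent
```

Clearing the flags on top-up would then amount to deleting every key under the user's prefix, so the next shortfall triggers a fresh alert.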
Reinier van der Leer	de78d062a9	refactor(backend/api): Clean up API file structure (#11629)

We'll soon be needing a more feature-complete external API. To make way for this, I'm moving some files around so:

- We can more easily create new versions of our external API
- The file structure of our internal API is more homogeneous

These changes are quite opinionated, but IMO in any case they're better than the chaotic structure we have now.

### Changes 🏗️

- Move `backend/server` -> `backend/api`
- Move `backend/server/routers` + `backend/server/v2` -> `backend/api/features`
- Change absolute sibling imports to relative imports
- Move `backend/server/v2/AutoMod` -> `backend/executor/automod`
- Combine `backend/server/routers/analytics_*test.py` -> `backend/api/features/analytics_test.py`
- Sort OpenAPI spec file

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI tests
  - [x] Clicking around in the app -> no obvious breakage	2025-12-20 20:33:10 +00:00
Reinier van der Leer	3dbc03e488	feat(platform): OAuth API & Single Sign-On (#11617 ) We want to provide Single Sign-On for multiple AutoGPT apps that use the Platform as their backend. ### Changes 🏗️ Backend: - DB + logic + API for OAuth flow (w/ tests) - DB schema additions for OAuth apps, codes, and tokens - Token creation/validation/management logic - OAuth flow endpoints (app info, authorize, token exchange, introspect, revoke) - E2E OAuth API integration tests - Other OAuth-related endpoints (upload app logo, list owned apps, external `/me` endpoint) - App logo asset management - Adjust external API middleware to support auth with access token - Expired token clean-up job - Add `OAUTH_TOKEN_CLEANUP_INTERVAL_HOURS` setting (optional) - `poetry run oauth-tool`: dev tool to test the OAuth flows and register new OAuth apps - `poetry run export-api-schema`: dev tool to quickly export the OpenAPI schema (much quicker than spinning up the backend) Frontend: - Frontend UI for app authorization (`/auth/authorize`) - Re-redirect after login/signup - Frontend flow to batch-auth integrations on request of the client app (`/auth/integrations/setup-wizard`) - Debug `CredentialInputs` component - Add `/profile/oauth-apps` management page - Add `isOurProblem` flag to `ErrorCard` to hide action buttons when the error isn't our fault - Add `showTitle` flag to `CredentialsInput` to hide built-in title for layout reasons DX: - Add [API guide](https://github.com/Significant-Gravitas/AutoGPT/blob/pwuts/sso/docs/content/platform/integrating/api-guide.md) and [OAuth guide](https://github.com/Significant-Gravitas/AutoGPT/blob/pwuts/sso/docs/content/platform/integrating/oauth-guide.md) ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Manually verify test coverage of OAuth API tests - Test `/auth/authorize` using `poetry run oauth-tool 
test-server` - [x] Works - [x] Looks okay - Test `/auth/integrations/setup-wizard` using `poetry run oauth-tool test-server` - [x] Works - [x] Looks okay - Test `/profile/oauth-apps` page - [x] All owned OAuth apps show up - [x] Enabling/disabling apps works - [ ] ~~Uploading logos works~~ can only test this once deployed to dev #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes)	2025-12-19 21:05:16 +01:00
Zamil Majdy	71157bddd7	feat(backend): add agent mode support to SmartDecisionMakerBlock with autonomous tool execution loops (#11547 ) ## Summary <img width="2072" height="1836" alt="image" src="https://github.com/user-attachments/assets/9d231a77-6309-46b9-bc11-befb5d8e9fcc" /> 🚀 Major Feature: Agent Mode Support Adds autonomous agent mode to SmartDecisionMakerBlock, enabling it to execute tools directly in loops until tasks are completed, rather than just yielding tool calls for external execution. ## ⭐ Key New Features ### 🤖 Agent Mode with Tool Execution Loops - New `agent_mode_max_iterations` parameter controls execution behavior: - `0` = Traditional mode (single LLM call, yield tool calls) - `1+` = Agent mode with iteration limit - `-1` = Infinite agent mode (loop until finished) ### 🔄 Autonomous Tool Execution - Direct tool execution instead of yielding for external handling - Multi-iteration loops with conversation state management - Automatic completion detection when LLM stops making tool calls - Iteration limit handling with graceful completion messages ### 🏗️ Proper Database Operations - Replace manual execution ID generation with proper `upsert_execution_input`/`upsert_execution_output` - Real NodeExecutionEntry objects from database results - Proper execution status management: QUEUED → RUNNING → COMPLETED/FAILED ### 🔧 Enhanced Type Safety - Pydantic models replace TypedDict: `ToolInfo` and `ExecutionParams` - Runtime validation with better error messages - Improved developer experience with IDE support ## 🔧 Technical Implementation ### Agent Mode Flow: ```python # Agent mode enabled with iterations if input_data.agent_mode_max_iterations != 0: async for result in self._execute_tools_agent_mode(...): yield result # "conversations", "finished" return # Traditional mode (existing behavior) # Single LLM call + yield tool calls for external execution ``` ### Tool Execution with Database Operations: ```python # Before: Manual execution IDs tool_exec_id = 
f"{node_exec_id}_tool_{sink_node_id}_{len(input_data)}" # After: Proper database operations node_exec_result, final_input_data = await db_client.upsert_execution_input( node_id=sink_node_id, graph_exec_id=execution_params.graph_exec_id, input_name=input_name, input_data=input_value, ) ``` ### Type Safety with Pydantic: ```python # Before: Dict access prone to errors execution_params["user_id"] # After: Validated model access execution_params.user_id # Runtime validation + IDE support ``` ## 🧪 Comprehensive Test Coverage - Agent mode execution tests with multi-iteration scenarios - Database operation verification - Type safety validation - Backward compatibility for traditional mode - Enhanced dynamic fields tests ## 📊 Usage Examples ### Traditional Mode (Existing Behavior): ```python SmartDecisionMakerBlock.Input( prompt="Search for keywords", agent_mode_max_iterations=0 # Default ) # → Yields tool calls for external execution ``` ### Agent Mode (New Feature): ```python SmartDecisionMakerBlock.Input( prompt="Complete this task using available tools", agent_mode_max_iterations=5 # Max 5 iterations ) # → Executes tools directly until task completion or iteration limit ``` ### Infinite Agent Mode: ```python SmartDecisionMakerBlock.Input( prompt="Analyze and process this data thoroughly", agent_mode_max_iterations=-1 # No limit, run until finished ) # → Executes tools autonomously until LLM indicates completion ``` ## ✅ Backward Compatibility - Zero breaking changes to existing functionality - Traditional mode remains default (`agent_mode_max_iterations=0`) - All existing tests pass - Same API for tool definitions and execution This transforms the SmartDecisionMakerBlock from a simple tool call generator into a powerful autonomous agent capable of complex multi-step task execution! 🎯 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-12-12 09:58:06 +00:00
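The three `agent_mode_max_iterations` settings described in the commit above (`0`, `1+`, `-1`) can be sketched as a small loop. This is an illustrative model only, not the block's actual code: `run_agent_loop` and `llm_step` are hypothetical names, and the LLM is replaced by a stub callback that returns the tool calls requested in each iteration.

```python
from typing import Callable, Iterator


def run_agent_loop(
    max_iterations: int,
    llm_step: Callable[[int], list[str]],
) -> Iterator[tuple[str, object]]:
    """Toy model of the iteration semantics; llm_step(i) is a hypothetical
    stand-in for an LLM call that returns the tool calls it wants to make."""
    if max_iterations == 0:
        # Traditional mode: one LLM call, tool calls are yielded for
        # external execution.
        yield ("tool_calls", llm_step(0))
        return

    iteration = 0
    # A negative limit means "no limit": loop until the LLM stops
    # requesting tools.
    while max_iterations < 0 or iteration < max_iterations:
        calls = llm_step(iteration)
        if not calls:
            # Completion is detected when no tool calls are requested.
            yield ("finished", "completed")
            return
        # In this sketch, "executing" a tool just records the call.
        yield ("conversations", calls)
        iteration += 1

    yield ("finished", f"iteration limit ({max_iterations}) reached")


def stub_llm(i: int) -> list[str]:
    # Requests one tool call in each of the first two iterations, then stops.
    return [f"search_{i}"] if i < 2 else []
```

With the stub, `list(run_agent_loop(5, stub_llm))` yields two `"conversations"` entries followed by `"finished"`, while a limit of `1` stops after one tool iteration with the limit message.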
Zamil Majdy	c1e21d07e6	feat(platform): add execution accuracy alert system (#11562 ) ## Summary <img width="1263" height="883" alt="image" src="https://github.com/user-attachments/assets/98d4f449-1897-4019-a599-846c27df4191" /> <img width="398" height="190" alt="image" src="https://github.com/user-attachments/assets/0138ac02-420d-4f96-b980-74eb41e3c968" /> - Add execution accuracy monitoring with moving averages and Discord alerts - Dashboard visualization for accuracy trends and alert detection - Hourly monitoring for marketplace agents (≥10 executions in 30 days) - Generated API client integration with type-safe models ## Features - Moving Average Analysis: 3-day vs 7-day comparison with configurable thresholds - Discord Notifications: Hourly alerts for accuracy drops ≥10% - Dashboard UI: Real-time trends visualization with alert status - Type Safety: Generated API hooks and models throughout - Error Handling: Graceful OpenAI configuration handling - PostgreSQL Optimization: Window functions for efficient trend queries ## Test plan - [x] Backend accuracy monitoring logic tested with sample data - [x] Frontend components using generated API hooks (no manual fetch) - [x] Discord notification integration working - [x] Admin authentication and authorization working - [x] All formatting and linting checks passing - [x] Error handling for missing OpenAI configuration - [x] Test data available with `test-accuracy-agent-001` 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-12-08 19:28:57 +00:00
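The 3-day versus 7-day comparison described in the commit above might look roughly like the following. It is a sketch under stated assumptions: the function name `accuracy_drop_alert` is hypothetical, scores are one value per day with the most recent last, and the drop is measured relative to the 7-day average (the commit does not say whether the 10% threshold is relative or absolute).

```python
from statistics import fmean
from typing import Optional


def accuracy_drop_alert(
    daily_scores: list[float],
    short_window: int = 3,
    long_window: int = 7,
    drop_threshold_pct: float = 10.0,
) -> Optional[float]:
    """Return the percentage drop if it reaches the threshold, else None.

    Assumption: the drop is relative to the long-window average.
    """
    if len(daily_scores) < long_window:
        return None  # Not enough history to compare the two windows.

    short_avg = fmean(daily_scores[-short_window:])
    long_avg = fmean(daily_scores[-long_window:])
    if long_avg == 0:
        return None  # Avoid dividing by zero when there is no baseline.

    drop_pct = (long_avg - short_avg) / long_avg * 100
    return drop_pct if drop_pct >= drop_threshold_pct else None
```

For example, four days at 0.9 followed by three days at 0.6 gives a 7-day average of about 0.77 and a 3-day average of 0.6, a relative drop of roughly 22%, which would trigger an alert under these assumptions.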
Krzysztof Czerwinski	c880db439d	feat(platform): Backend completion of Onboarding tasks (#11375 ) Make onboarding task completion backend-authoritative, which prevents cheating (previously users could mark all tasks as completed instantly and get rewards) and makes task completion more reliable. Task completion is moved to the backend, with the exception of introductory onboarding tasks and visit-page type tasks. ### Changes 🏗️ - Move run counter incrementing to the backend and make webhook-triggered and scheduled task executions count as well - Use user timezone for calculating run streak - Frontend task completion is moved from the onboarding state update to a separate endpoint and guarded so only frontend tasks can be completed - Graph creation, execution, and adding a marketplace agent to the library accept `source`, so the appropriate tasks can be completed - Replace `client.ts` API calls with orval-generated ones and remove no-longer-used functions from `client.ts` - Add `resolveResponse` helper function that unwraps an orval-generated call result to a 2xx response Small changes & bug fixes: - Make Redis notification bus serialize all payload fields - Fix confetti when group is finished - Collapse finished group when opening Wallet - Play confetti only for tasks that are listed in the Wallet UI ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Onboarding can be finished - [x] All tasks can be finished and work properly - [x] Confetti works properly	2025-12-05 02:32:28 +00:00
Nicholas Tindle	113df689dc	feat(platform): Improve Google Sheets/Drive integration with unified credentials (#11520 ) Simplifies and improves the Google Sheets/Drive integration by merging credentials with the file picker and using narrower OAuth scopes. ### Changes 🏗️ - Merge Google credentials and file picker into a single unified input field for better UX - Create spreadsheets using Drive API instead of Sheets API for proper scope support - Simplify Google Drive OAuth scope to only use `drive.file` (narrowest permission needed) - Clean up unused imports (NormalizedPickedFile) ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Test creating a new Google Spreadsheet with GoogleSheetsCreateSpreadsheetBlock - [x] Test reading from existing spreadsheets with GoogleSheetsReadBlock - [x] Test writing to spreadsheets with GoogleSheetsWriteBlock - [x] Verify OAuth flow works with simplified scopes - [x] Verify file picker works with merged credentials field #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Unifies Google Drive picker and credentials with auto-credentials across backend and frontend, updates all Sheets blocks and execution to use it, and adds Drive-based spreadsheet creation plus supporting tests and UI fixes. > > - Backend: > - Google Drive model/field: Introduce `GoogleDriveFile` (with `_credentials_id`) and `GoogleDriveFileField()` for unified auth+picker (`backend/blocks/google/_drive.py`). 
> - Sheets blocks: Replace `GoogleDrivePickerField` and explicit credentials with `GoogleDriveFileField` across all Sheets blocks; preserve and emit credentials for chaining; add Drive service; create spreadsheets via Drive API then manage via Sheets API. > - IO block: Add `AgentGoogleDriveFileInputBlock` providing a Drive picker input. > - Execution: Support auto-generated credentials via `BlockSchema.get_auto_credentials_fields()`; acquire/release multiple credential locks; pass creds by `credentials_kwarg` (`executor/manager.py`, `data/block.py`, `util/test.py`). > - Tests: Add validation tests for duplicate/unique `auto_credentials.kwarg_name` and defaults. > - Frontend: > - Picker: Enhance Google Drive picker to require/use saved platform credentials, pass `_credentials_id`, validate scopes, and manage dialog z-index/interaction; expose `requirePlatformCredentials`. > - UI: Update dialogs/CSS to keep Google picker on top and prevent overlay interactions. > - Types: Extend `GoogleDrivePickerConfig` with `auto_credentials` and related typings. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `7d25534def`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>	2025-12-04 14:40:30 +00:00
Zamil Majdy	7b951c977e	feat(platform): implement graph-level Safe Mode toggle for HITL blocks (#11455 ) ## Summary This PR implements a graph-level Safe Mode toggle system for Human-in-the-Loop (HITL) blocks. When Safe Mode is ON (default), HITL blocks require manual review before proceeding. When OFF, they execute automatically. ## 🔧 Backend Changes - Database: Added `metadata` JSON column to `AgentGraph` table with migration - API: Updated `execute_graph` endpoint to accept `safe_mode` parameter - Execution: Enhanced execution context to use graph metadata as default with API override capability - Auto-detection: Automatically populate `has_human_in_the_loop` for graphs containing HITL blocks - Block Detection: HITL block ID: `8b2a7b3c-6e9d-4a5f-8c1b-2e3f4a5b6c7d` ## 🎨 Frontend Changes - Component: New `FloatingSafeModeToggle` with dual variants: - White variant: For library pages, integrates with action buttons - Black variant: For builders, floating positioned - Integration: Added toggles to both new/legacy builders and library pages - API Integration: Direct graph metadata updates via `usePutV1UpdateGraphVersion` - Query Management: React Query cache invalidation for consistent UI updates - Conditional Display: Toggle only appears when graph contains HITL blocks ## 🛠 Technical Implementation - Safe Mode ON (default): HITL blocks require manual review before proceeding - Safe Mode OFF: HITL blocks execute automatically without intervention - Priority: Backend API `safe_mode` parameter takes precedence over graph metadata - Detection: Auto-populates `has_human_in_the_loop` metadata field - Positioning: Proper z-index and responsive positioning for floating elements ## 🚧 Known Issues (Work in Progress) ### High Priority - [ ] Toggle state persistence: Always shows "ON" regardless of actual state - query invalidation issue - [ ] LibraryAgent metadata: Missing metadata field causing TypeScript errors - [ ] Tooltip z-index: Still covered by some UI elements despite 
high z-index ### Medium Priority - [ ] HITL detection: Logic needs improvement for reliable block detection - [ ] Error handling: Removing HITL blocks from graph causes save errors - [ ] TypeScript: Fix type mismatches between GraphModel and LibraryAgent ### Low Priority - [ ] Frontend API: Add `safe_mode` parameter to execution calls once OpenAPI is regenerated - [ ] Performance: Consider debouncing rapid toggle clicks ## 🧪 Test Plan - [ ] Verify toggle appears only when graph has HITL blocks - [ ] Test toggle persistence across page refreshes - [ ] Confirm API calls update graph metadata correctly - [ ] Validate execution behavior respects safe mode setting - [ ] Check styling consistency across builder and library contexts ## 🔗 Related - Addresses requirements for graph-level HITL configuration - Builds on existing FloatingReviewsPanel infrastructure - Integrates with existing graph metadata system 🤖 Generated with [Claude Code](https://claude.ai/code)	2025-12-02 09:55:55 +00:00
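The precedence rule stated in the commit above (the API `safe_mode` parameter wins over graph metadata, and Safe Mode is ON by default) can be illustrated with a short helper. This is a minimal sketch: `resolve_safe_mode` is a hypothetical name, and storing the flag under a `"safe_mode"` key in the graph metadata is an assumption, not something the commit confirms.

```python
from typing import Any, Optional


def resolve_safe_mode(
    api_safe_mode: Optional[bool],
    graph_metadata: Optional[dict[str, Any]],
) -> bool:
    """Pick the effective Safe Mode setting for one execution."""
    # 1. An explicit API parameter takes precedence.
    if api_safe_mode is not None:
        return api_safe_mode

    # 2. Otherwise fall back to the graph-level setting, if present.
    #    The "safe_mode" key name is assumed for illustration.
    if graph_metadata is not None:
        stored = graph_metadata.get("safe_mode")
        if isinstance(stored, bool):
            return stored

    # 3. Default: Safe Mode ON, so HITL blocks require manual review.
    return True
```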
Zamil Majdy	3d08c22dd5	feat(platform): add Human In The Loop block with review workflow (#11380 ) ## Summary This PR implements a comprehensive Human In The Loop (HITL) block that allows agents to pause execution and wait for human approval/modification of data before continuing. https://github.com/user-attachments/assets/c027d731-17d3-494c-85ca-97c3bf33329c ## Key Features - Added WAITING_FOR_REVIEW status to AgentExecutionStatus enum - Created PendingHumanReview database table for storing review requests - Implemented HumanInTheLoopBlock that extracts input data and creates review entries - Added API endpoints at /api/executions/review for fetching and reviewing pending data - Updated execution manager to properly handle waiting status and resume after approval ## Frontend Components - PendingReviewCard for individual review handling - PendingReviewsList for multiple reviews - FloatingReviewsPanel for graph builder integration - Integrated review UI into 3 locations: legacy library, new library, and graph builder ## Technical Implementation - Added proper type safety throughout with SafeJson handling - Optimized database queries using count functions instead of full data fetching - Fixed imports to be top-level instead of local - All formatters and linters pass ## Test plan - [ ] Test Human In The Loop block creation in graph builder - [ ] Test block execution pauses and creates pending review - [ ] Test review UI appears in all 3 locations - [ ] Test data modification and approval workflow - [ ] Test rejection workflow - [ ] Test execution resumes after approval 🤖 Generated with [Claude Code](https://claude.ai/code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Added Human-In-The-Loop review workflows to pause executions for human validation. * Users can approve or reject pending tasks, optionally editing submitted data and adding a message. 
* New "Waiting for Review" execution status with UI indicators across run lists, badges, and activity views. * Review management UI: pending review cards, list view, and a floating reviews panel for quick access. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-11-27 12:07:46 +07:00
Zamil Majdy	901bb31e14	feat(backend): parameterize activity status generation with customizable prompts (#11407 ) ## Summary Implement comprehensive parameterization of the activity status generation system to enable custom prompts for admin analytics dashboard. ## Changes Made ### Core Function Enhancement (`activity_status_generator.py`) - Extract hardcoded prompts to constants: `DEFAULT_SYSTEM_PROMPT` and `DEFAULT_USER_PROMPT` - Add prompt parameters: `system_prompt`, `user_prompt` with defaults to maintain backward compatibility - Template substitution system: User prompt supports `{{GRAPH_NAME}}` and `{{EXECUTION_DATA}}` placeholders - Skip existing flag: `skip_existing` parameter allows admin to force regeneration of existing data - Maintain manager compatibility: All existing calls continue to work with default parameters ### Admin API Enhancement (`execution_analytics_routes.py`) - Custom prompt fields: `system_prompt` and `user_prompt` optional fields in `ExecutionAnalyticsRequest` - Skip existing control: `skip_existing` boolean flag for admin regeneration option - Template documentation: Clear documentation of placeholder system in field descriptions - Backward compatibility: All existing API calls work unchanged ### Template System Design - Simple placeholder replacement: `{{GRAPH_NAME}}` → actual graph name, `{{EXECUTION_DATA}}` → JSON execution data - No dependencies: Uses simple `string.replace()` for maximum compatibility - JSON safety: Execution data properly serialized as indented JSON - Validation tested: Template substitution verified to work correctly ## Key Features ### For Regular Users (Manager Integration) - No changes required: Existing manager.py calls work unchanged - Default behavior preserved: Same prompts and logic as before - Feature flag compatibility: LaunchDarkly integration unchanged ### For Admin Analytics Dashboard - Custom system prompts: Admins can override the AI evaluation criteria - Custom user prompts: Admins can modify 
the analysis instructions with execution data templates - Force regeneration: `skip_existing=False` allows reprocessing existing executions with new prompts - Complete model list: Access to all LLM models from `llm.py` (70+ models including GPT, Claude, Gemini, etc.) ## Technical Validation - ✅ Template substitution tested and working - ✅ Default behavior preserved for existing code - ✅ Admin API parameter validation working - ✅ All imports and function signatures correct - ✅ Backward compatibility maintained ## Use Cases Enabled - A/B testing: Compare different prompt strategies on same execution data - Custom evaluation: Tailor success criteria for specific graph types - Prompt optimization: Iterate on prompt design based on admin feedback - Bulk reprocessing: Regenerate activity status with improved prompts ## Testing - Template substitution functionality verified - Function signatures and imports validated - Code formatting and linting passed - Backward compatibility confirmed ## Breaking Changes None - all existing functionality preserved with default parameters. ## Related Issues Resolves the requirement to expose prompt customization on the frontend execution analytics dashboard. --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-11-19 13:38:08 +00:00
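The placeholder system described in the commit above can be sketched in a few lines. The function name `render_user_prompt` is hypothetical; only the `{{GRAPH_NAME}}` and `{{EXECUTION_DATA}}` placeholders, the plain string replacement, and the indented JSON serialization come from the commit message.

```python
import json
from typing import Any


def render_user_prompt(
    template: str,
    graph_name: str,
    execution_data: dict[str, Any],
) -> str:
    """Fill the two documented placeholders using plain str.replace()."""
    # Serialize execution data as indented JSON, as the commit describes.
    data_json = json.dumps(execution_data, indent=2)
    return (
        template
        .replace("{{GRAPH_NAME}}", graph_name)
        .replace("{{EXECUTION_DATA}}", data_json)
    )


prompt = render_user_prompt(
    "Evaluate the run of {{GRAPH_NAME}}:\n{{EXECUTION_DATA}}",
    "Blog Writer",
    {"status": "COMPLETED"},
)
```

One consequence of chained replacement is worth noting: a graph name that itself contains the text `{{EXECUTION_DATA}}` would be expanded by the second call, so a stricter implementation might substitute both placeholders in a single pass.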
Swifty	9438817702	fix(platform): Capture Sentry Block Errors Correctly (#11404 ) Currently we capture block errors via the scope only; this change captures the error directly. ### Changes 🏗️ - Capture the error as well as the scope in the executor manager - Update the block error message to include additional details - Remove the `__str__` function from blockerror as it is no longer needed ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Checked that errors are still captured in dev	2025-11-19 12:21:47 +01:00
Zamil Majdy	02757d68f3	fix(backend): resolve marketplace agent access in get_graph_execution endpoint (#11396 ) ## Summary Fixes critical issue where `GET /graphs/{graph_id}/executions/{graph_exec_id}` failed for marketplace agents with "Graph not found" errors due to incorrect version access checking. ## Root Cause The endpoint was checking access to the latest version of a graph instead of the specific version used in the execution. This broke marketplace agents when: 1. User executes a marketplace agent (e.g., v3) 2. Graph owner later publishes a new version (e.g., v4) 3. User tries to view execution details 4. BUG: Code checked access to latest version (v4) instead of execution version (v3) 5. If v4 wasn't published to marketplace → access denied → "Graph not found" ## Original Problematic Code ```python # routers/v1.py - get_graph_execution (WRONG ORDER) graph = await graph_db.get_graph(graph_id=graph_id, user_id=user_id) # ❌ Uses LATEST version if not graph: raise HTTPException(404, f"Graph #{graph_id} not found") result = await execution_db.get_graph_execution(...) # Gets execution data ``` ## Solution Reordered operations to check access against the execution's specific version: ```python # NEW CODE (CORRECT ORDER) result = await execution_db.get_graph_execution(...) # ✅ Get execution FIRST if not await graph_db.get_graph( graph_id=result.graph_id, version=result.graph_version, # ✅ Use execution's version, not latest! user_id=user_id, ): raise HTTPException(404, f"Graph #{graph_id} not found") ``` ### Key Changes Made 1. Fixed version access logic (routers/v1.py:1075-1095): - Reordered operations to get execution data first - Check access using `result.graph_version` instead of latest version - Applied same fix to external API routes 2. 
Enhanced `get_graph()` marketplace fallback (data/graph.py:919-935): - Added proper marketplace lookup when user doesn't own the graph - Supports version-specific marketplace access checking - Maintains security by only allowing approved, non-deleted listings 3. Activity status generator fix (activity_status_generator.py:139-144): - Use `skip_access_check=True` for internal system operations 4. Missing block handling (data/graph.py:94-103): - Added `_UnknownBlockBase` placeholder for graceful handling of deleted blocks ## Example Scenario Fixed 1. User: Installs marketplace agent "Blog Writer" v3 2. Owner: Later publishes v4 (not to marketplace yet) 3. User: Runs the agent (executes v3) 4. Before: Viewing execution details fails because code checked v4 access 5. After: ✅ Viewing execution details works because code checks v3 access ## Impact - ✅ Marketplace agents work correctly: Users can view execution details for any marketplace agent version they've used - ✅ Backward compatibility: Existing owned graphs continue working - ✅ Security maintained: Only allows access to versions user legitimately executed - ✅ Version-aware access control: Proper access checking for specific versions, not just latest ## Testing - [x] Marketplace agents: Execution details now accessible for all executed versions - [x] Owned graphs: Continue working as before - [x] Version scenarios: Access control works correctly for specific versions - [x] Missing blocks: Graceful handling without errors Root issue resolved: Version mismatch between execution version and access check version that was breaking marketplace agent execution viewing. --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-11-18 15:44:10 +00:00
Swifty	a66219fc1f	fix(platform): Remove un-runnable agents from schedule (#11374 ) Currently, when an agent fails validation during a scheduled run, we raise an error and then try again, regardless of why it failed. This change removes the agent's schedule and notifies the user. ### Changes 🏗️ - Add `schedule_id` to the `GraphExecutionJobArgs` - Add `agent_name` to the `GraphExecutionJobArgs` - Delete the schedule on `GraphValidationError` - Notify the user with a message that includes the agent name ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] I have ensured the scheduler tests work with these changes	2025-11-17 15:24:40 +00:00
Reinier van der Leer	536e2a5ec8	fix(blocks): Make Smart Decision Maker tool pin handling consistent and reliable (#11363 ) - Resolves #11345 ### Changes 🏗️ - Move tool use routing logic from frontend to backend: routing info was being baked into graph links by the frontend, inconsistently, causing issues - Rework tool use routing to use target node ID instead of target block name - Add a bit of magic to `NodeOutputs` component to show tool node title instead of ID DX: - Removed `build` from `.prettierignore` -> re-enable formatting for builder components ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Use SDM block in a graph; verify it works - [x] Use SDM block with agent executor block as tool; verify it works - Tests for `parse_execution_output` pass (checked by CI)	2025-11-12 18:55:38 +01:00
Zamil Majdy	d6ee402483	feat(platform): Add execution analytics admin endpoint with feature flag bypass (#11327 ) This PR adds a comprehensive execution analytics admin endpoint that generates AI-powered activity summaries and correctness scores for graph executions, with proper feature flag bypass for admin use. ### Changes 🏗️ Backend Changes: - Added admin endpoint: `/api/executions/admin/execution_analytics` - Implemented feature flag bypass with `skip_feature_flag=True` parameter for admin operations - Fixed async database client usage (`get_db_async_client`) to resolve async/await errors - Added batch processing with configurable size limits to handle large datasets - Comprehensive error handling and logging for troubleshooting - Renamed entire feature from "Activity Backfill" to "Execution Analytics" for clarity Frontend Changes: - Created clean admin UI for execution analytics generation at `/admin/execution-analytics` - Built form with graph ID input, model selection dropdown, and optional filters - Implemented results table with status badges and detailed execution information - Added CSV export functionality for analytics results - Integrated with generated TypeScript API client for proper authentication - Added proper error handling with toast notifications and loading states Database & API: - Fixed critical async/await issue by switching from sync to async database client - Updated router configuration and endpoint naming for consistency - Generated proper TypeScript types and API client integration - Applied feature flag filtering at API level while bypassing for admin operations ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: Test Plan: - [x] Admin can access execution analytics page at `/admin/execution-analytics` - [x] Form validation works correctly (requires graph ID, validates inputs) - [x] API endpoint 
`/api/executions/admin/execution_analytics` responds correctly - [x] Authentication works properly through generated API client - [x] Analytics generation works with different LLM models (gpt-4o-mini, gpt-4o, etc.) - [x] Results display correctly with appropriate status badges (success/failed/skipped) - [x] CSV export functionality downloads correct data - [x] Error handling displays appropriate toast messages - [x] Feature flag bypass works for admin users (generates analytics regardless of user flags) - [x] Batch processing handles multiple executions correctly - [x] Loading states show proper feedback during processing #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] No configuration changes required for this feature Related to: PR #11325 (base correctness score functionality) 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Zamil Majdy <majdyz@users.noreply.github.com>	2025-11-10 10:27:44 +00:00
Zamil Majdy	6037f80502	feat(backend): Add correctness score to execution activity generation (#11325 ) ## Summary Add AI-generated correctness score field to execution activity status generation to provide quantitative assessment of how well executions achieved their intended purpose. New page: <img width="1000" height="229" alt="image" src="https://github.com/user-attachments/assets/5cb907cf-5bc7-4b96-8128-8eecccde9960" /> Old page: <img width="1000" alt="image" src="https://github.com/user-attachments/assets/ece0dfab-1e50-4121-9985-d585f7fcd4d2" /> ## What Changed - Added `correctness_score` field (float 0.0-1.0) to `GraphExecutionStats` model - REFACTORED: Removed duplicate `llm_utils.py` and reused existing `AIStructuredResponseGeneratorBlock` logic - Updated activity status generator to use structured responses instead of plain text - Modified prompts to include correctness assessment with 5-tier scoring system: - 0.0-0.2: Failure - 0.2-0.4: Poor - 0.4-0.6: Partial Success - 0.6-0.8: Mostly Successful - 0.8-1.0: Success - Updated manager.py to extract and set both activity_status and correctness_score - Fixed tests to work with existing structured response interface ## Technical Details - Code Reuse: Eliminated duplication by using existing `AIStructuredResponseGeneratorBlock` instead of creating new LLM utilities - Added JSON validation with retry logic for malformed responses - Maintained backward compatibility for existing activity status functionality - Score is clamped to valid 0.0-1.0 range and validated - All type errors resolved and linting passes ## Test Plan - [x] All existing tests pass with refactored structure - [x] Structured LLM call functionality tested with success and error cases - [x] Activity status generation tested with various execution scenarios - [x] Integration tests verify both fields are properly set in execution stats - [x] No code duplication - reuses existing block logic 🤖 Generated with [Claude Code](https://claude.ai/code) 
--------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Zamil Majdy <majdyz@users.noreply.github.com>	2025-11-06 04:42:13 +00:00
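The clamping and the five-tier scale from the commit above can be sketched as follows. The helper names are hypothetical, and because the listed ranges share their endpoints (for example, 0.2 appears in both "Failure" and "Poor"), treating each lower bound as inclusive is an assumption made here for illustration.

```python
def clamp_score(raw: float) -> float:
    """Clamp a model-provided score into the valid 0.0-1.0 range."""
    return max(0.0, min(1.0, float(raw)))


def score_tier(raw: float) -> str:
    """Map a score to one of the five tiers named in the commit.

    Assumption: a boundary value belongs to the higher tier.
    """
    score = clamp_score(raw)
    if score < 0.2:
        return "Failure"
    if score < 0.4:
        return "Poor"
    if score < 0.6:
        return "Partial Success"
    if score < 0.8:
        return "Mostly Successful"
    return "Success"
```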
Reinier van der Leer	de7c5b5c31	Merge branch 'master' into dev	2025-11-05 20:17:27 +01:00
Reinier van der Leer	d68dceb9c1	fix(backend/executor): Improve graph execution permission check (#11323 ) - Resolves #11316 - Durable fix to replace #11318 ### Changes 🏗️ - Expand graph execution permissions check - Don't require library membership for execution as sub-graph ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Can run sub-agent with non-latest graph version - [x] Can run sub-agent that is available in Marketplace but not added to Library	2025-11-05 17:13:41 +00:00
Zamil Majdy	193866232c	hotfix(backend): fix rate-limited messages blocking queue by republishing to back (#11326 ) ## Summary Fix critical queue blocking issue where rate-limited user messages prevent other users' executions from being processed, causing the 135 late executions reported in production. ## Root Cause Analysis When a user exceeds `max_concurrent_graph_executions_per_user` (25), the executor uses `basic_nack(requeue=True)` which sends the message to the FRONT of the RabbitMQ queue. This creates an infinite blocking loop where: 1. Rate-limited message goes to front of queue 2. Gets processed, hits rate limit again 3. Goes back to front of queue 4. Blocks all other users' messages indefinitely ## Solution Implementation ### 🔧 Core Changes - New setting: `requeue_by_republishing` (default: `True`) in `backend/util/settings.py` - Smart `_ack_message`: Automatically uses republishing when `requeue=True` and setting enabled - Efficient implementation: Uses existing `self.run_client` connection instead of creating new ones - Integration test: Real RabbitMQ test validates queue ordering behavior ### 🔄 Technical Implementation Before (blocking): ```python basic_nack(delivery_tag, requeue=True) # Goes to FRONT of queue ❌ ``` After (non-blocking): ```python if requeue and self.config.requeue_by_republishing: # First: Republish to BACK of queue self.run_client.publish_message(...) 
# Then: Reject without requeue basic_nack(delivery_tag, requeue=False) ``` ### 📊 Impact - ✅ Other users' executions no longer blocked by rate-limited users - ✅ Fair queue processing - FIFO behavior maintained for all users - ✅ Rate limiting still works - just doesn't block others - ✅ Configurable - can revert to old behavior with `requeue_by_republishing=False` - ✅ Zero performance impact - uses existing connections ## Test Plan - Integration test: `test_requeue_integration.py` validates real RabbitMQ queue ordering - Scenario testing: Confirms rate-limited messages go to back of queue - Cross-user validation: Verifies other users' messages process correctly - Setting test: Confirms configuration loads with correct defaults ## Deployment Strategy This is a hotfix that can be deployed immediately: - Backward compatible: Old behavior available via config - Safe default: New behavior is safer than current state - No breaking changes: All existing functionality preserved - Immediate relief: Resolves production queue blocking ## Files Modified - `backend/executor/manager.py`: Enhanced `_ack_message` logic and `_requeue_message_to_back` method - `backend/util/settings.py`: Added `requeue_by_republishing` configuration field - `test_requeue_integration.py`: Integration test for queue ordering validation ## Related Issues Fixes the 135 late executions issue where messages were stuck in QUEUED state despite available executor capacity (583m/600m utilization). 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-11-05 16:24:07 +00:00
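The ordering difference described in the commit above can be illustrated with a toy model. This is a minimal sketch that uses a `collections.deque` in place of RabbitMQ and follows the commit's own description of where a requeued message lands; `drain`, `is_rate_limited` and `max_steps` are hypothetical names, not part of the executor.

```python
from collections import deque


def drain(queue, is_rate_limited, requeue_by_republishing, max_steps=10):
    """Consume messages for a bounded number of steps; return those executed, in order."""
    processed = []
    for _ in range(max_steps):
        if not queue:
            break
        msg = queue.popleft()
        if is_rate_limited(msg):
            if requeue_by_republishing:
                queue.append(msg)      # republish: message goes to the back
            else:
                queue.appendleft(msg)  # plain requeue: message returns to the front
            continue
        processed.append(msg)
    return processed


def limited(msg):
    # Assumption for the demo: every message from user "a" is rate-limited.
    return msg.startswith("a")


blocked = drain(deque(["a1", "b1", "b2"]), limited, requeue_by_republishing=False)
fair = drain(deque(["a1", "b1", "b2"]), limited, requeue_by_republishing=True)
```

With the front-of-queue requeue, the rate-limited message is the only one ever consumed. With republishing, the other user's messages get through while the rate-limited one keeps cycling at the back.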
Zamil Majdy	910fd2640d	hotfix(backend): Temporarily disable library existence check for graph execution (#11318 ) ### Changes 🏗️ `add_store_agent_to_library` does not add sub-agents to the user library, so this check can cause issues. ### Checklist 📋 #### For code changes: - [ ] I have clearly listed my changes in the PR description - [ ] I have made a test plan - [ ] I have tested my changes according to the test plan #### For configuration changes: - [ ] `.env.default` is updated or already compatible with my changes - [ ] `docker-compose.yml` is updated or already compatible with my changes - [ ] I have included a list of my configuration changes in the PR description (under Changes)	2025-11-04 13:54:48 +00:00
Zamil Majdy	5506d59da1	fix(backend/executor): make graph execution permission check version-agnostic (#11283 ) ## Summary Fix critical issue where pre-execution permission validation broke execution of graphs that reference older versions of sub-graphs. ## Problem The `validate_graph_execution_permissions` function was checking for the specific version of a graph in the user's library. This caused failures when: 1. A parent graph references an older version of a sub-graph 2. The user updates the sub-graph to a newer version 3. The older version is no longer in their library 4. Execution of the parent graph fails with `GraphNotInLibraryError` ## Root Cause In `backend/executor/utils.py` line 523, the function was checking for the exact version, but sub-graphs legitimately reference older versions that may no longer be in the library. ## Solution ### 1. Remove Version-Specific Check (backend/executor/utils.py) - Remove `graph_version=graph.version` parameter from validation call - Add explanatory comment about version-agnostic behavior - Now only checks that the graph ID exists in user's library (any version) ### 2. Enhance Documentation (backend/data/graph.py) - Update function docstring to explain version-agnostic behavior - Document that `None` (now default) allows execution of any version - Clarify this is important for sub-graph version compatibility ## Technical Details The `validate_graph_execution_permissions` function was already designed to handle version-agnostic checks when `graph_version=None`. 
By omitting the version parameter, we skip the version check and only verify: - Graph exists in user's library - Graph is not deleted/archived - User has execution permissions ## Impact - ✅ Parent graphs can execute even when they reference older sub-graph versions - ✅ Sub-graph updates don't break existing parent graphs - ✅ Maintains security: still checks library membership and permissions - ✅ No breaking changes: version-specific validation still available when needed ## Example Scenario Fixed 1. User creates parent graph that uses sub-graph v1 2. User updates sub-graph to v2 (v1 removed from library) 3. Parent graph still references sub-graph v1 4. Before: Execution fails with `GraphNotInLibraryError` 5. After: Execution succeeds (version-agnostic permission check) ## Testing - [x] Code formatting and linting passes - [x] Type checking passes - [x] No breaking changes to existing functionality - [x] Security still maintained through library membership checks ## Files Changed - `backend/executor/utils.py`: Remove version-specific permission check - `backend/data/graph.py`: Enhanced documentation for version-agnostic behavior Closes #[issue-number-if-applicable] Co-authored-by: Claude <noreply@anthropic.com>	2025-10-29 14:13:23 +00:00
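The version-agnostic check in the commit above can be sketched against an in-memory library. This is an illustration only: `LibraryEntry`, `is_graph_in_library` and the sample data are hypothetical stand-ins for the Prisma-backed lookup, chosen to mirror the "Example Scenario Fixed" steps.

```python
from typing import Iterable, NamedTuple, Optional


class LibraryEntry(NamedTuple):
    user_id: str
    graph_id: str
    graph_version: int
    is_deleted: bool = False
    is_archived: bool = False


def is_graph_in_library(
    entries: Iterable[LibraryEntry],
    user_id: str,
    graph_id: str,
    graph_version: Optional[int] = None,
) -> bool:
    """Return True if the user has an active entry for the graph.

    With graph_version=None, any version of the graph satisfies the check.
    """
    for entry in entries:
        if entry.user_id != user_id or entry.graph_id != graph_id:
            continue
        if entry.is_deleted or entry.is_archived:
            continue
        if graph_version is not None and entry.graph_version != graph_version:
            continue
        return True
    return False


# The user updated the sub-graph to v2, so v1 is no longer in the library.
LIBRARY = [LibraryEntry("user-1", "sub-graph", 2)]
```

A parent graph that still references v1 fails the version-specific check but passes the version-agnostic one, while a different user is rejected either way.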
Zamil Majdy	4922f88851	feat(backend/executor): Implement cascading stop for nested graph executions (#11277 ) ## Summary Fixes critical issue where child executions spawned by `AgentExecutorBlock` continue running after parent execution is stopped. Implements parent-child execution tracking and recursive cascading stop logic to ensure entire execution trees are terminated together. ## Background When a parent graph execution containing `AgentExecutorBlock` nodes is stopped, only the parent was terminated. Child executions continued running, leading to: - ❌ Orphaned child executions consuming credits - ❌ No user control over execution trees - ❌ Race conditions where children start after parent stops - ❌ Resource leaks from abandoned executions ## Core Changes ### 1. Database Schema (`schema.prisma` + migration) ```sql -- Add nullable parent tracking field ALTER TABLE "AgentGraphExecution" ADD COLUMN "parentGraphExecutionId" TEXT; -- Add self-referential foreign key with graceful deletion ALTER TABLE "AgentGraphExecution" ADD CONSTRAINT "AgentGraphExecution_parentGraphExecutionId_fkey" FOREIGN KEY ("parentGraphExecutionId") REFERENCES "AgentGraphExecution"("id") ON DELETE SET NULL ON UPDATE CASCADE; -- Add index for efficient child queries CREATE INDEX "AgentGraphExecution_parentGraphExecutionId_idx" ON "AgentGraphExecution"("parentGraphExecutionId"); ``` ### 2. Parent ID Propagation (`backend/blocks/agent.py`) ```python # Extract current graph execution ID and pass as parent to child execution = add_graph_execution( # ... other params parent_graph_exec_id=graph_exec_id, # NEW: Track parent relationship ) ``` ### 3. 
Data Layer (`backend/data/execution.py`) ```python async def get_child_graph_executions(parent_exec_id: str) -> list[GraphExecution]: """Get all child executions of a parent execution.""" children = await AgentGraphExecution.prisma().find_many( where={"parentGraphExecutionId": parent_exec_id, "isDeleted": False} ) return [GraphExecution.from_db(child) for child in children] ``` ### 4. Cascading Stop Logic (`backend/executor/utils.py`) ```python async def stop_graph_execution( user_id: str, graph_exec_id: str, wait_timeout: float = 15.0, cascade: bool = True, # NEW parameter ): # 1. Find all child executions if cascade: children = await _get_child_executions(graph_exec_id) # 2. Stop all children recursively in parallel if children: await asyncio.gather( *[stop_graph_execution(user_id, child.id, wait_timeout, True) for child in children], return_exceptions=True, # Don't fail parent if child fails ) # 3. Stop the parent execution # ... existing stop logic ``` ### 5. Race Condition Prevention (`backend/executor/manager.py`) ```python # Before executing queued child, check if parent was terminated if parent_graph_exec_id: parent_exec = get_db_client().get_graph_execution_meta(parent_graph_exec_id, user_id) if parent_exec and parent_exec.status == ExecutionStatus.TERMINATED: # Skip execution, mark child as terminated get_db_client().update_graph_execution_stats( graph_exec_id=graph_exec_id, status=ExecutionStatus.TERMINATED, ) return # Don't start orphaned child ``` ## How It Works ### Before (Broken) ``` User stops parent execution ↓ Parent terminates ✓ ↓ Child executions keep running ✗ ↓ User cannot stop children ✗ ``` ### After (Fixed) ``` User stops parent execution ↓ Query database for all children ↓ Recursively stop all children in parallel ↓ Wait for children to terminate ↓ Stop parent execution ↓ All executions in tree stopped ✓ ``` ### Race Prevention ``` Child in QUEUED status ↓ Parent stopped ↓ Child picked up by executor ↓ Pre-flight check: parent TERMINATED? 
↓ Yes → Skip execution, mark child TERMINATED ↓ Child never runs ✓ ``` ## Edge Cases Handled ✅ Deep nesting - Recursive cascading handles multi-level trees ✅ Queued children - Pre-flight check prevents execution ✅ Race conditions - Child spawned during stop operation ✅ Partial failures - `return_exceptions=True` continues on error ✅ Multiple children - Parallel stop via `asyncio.gather()` ✅ No parent - Backward compatible (nullable field) ✅ Already completed - Existing status check handles it ## Performance Impact - Stop operation: O(depth) with parallel execution vs O(1) before - Memory: +36 bytes per execution (one UUID reference) - Database: +1 query per tree level, indexed for efficiency ## API Changes (Backward Compatible) ### `stop_graph_execution()` - New Optional Parameter ```python # Before async def stop_graph_execution(user_id: str, graph_exec_id: str, wait_timeout: float = 15.0) # After async def stop_graph_execution(user_id: str, graph_exec_id: str, wait_timeout: float = 15.0, cascade: bool = True) ``` Default `cascade=True` means existing callers get the new behavior automatically. 
### `add_graph_execution()` - New Optional Parameter ```python async def add_graph_execution(..., parent_graph_exec_id: Optional[str] = None) ``` ## Security & Safety - ✅ User verification - Users can only stop their own executions (parent + children) - ✅ No cycles - Self-referential FK prevents infinite loops - ✅ Graceful degradation - Errors in child stops don't block parent stop - ✅ Rate limits - Existing execution rate limits still apply ## Testing Checklist ### Database Migration - [x] Migration runs successfully - [x] Prisma client regenerates without errors - [x] Existing tests pass ### Core Functionality - [ ] Manual test: Stop parent with running child → child stops - [ ] Manual test: Stop parent with queued child → child never starts - [ ] Unit test: Cascading stop with multiple children - [ ] Unit test: Deep nesting (3+ levels) - [ ] Integration test: Race condition prevention ## Breaking Changes None - All changes are backward compatible with existing code. ## Rollback Plan If issues arise: 1. Code rollback: Revert PR, redeploy 2. Database rollback: Drop column and constraints (non-destructive) --- Note: This branch contains additional unrelated changes from merging with `dev`. The core cascading stop feature involves only: - `schema.prisma` + migration - `backend/data/execution.py` - `backend/executor/utils.py` - `backend/blocks/agent.py` - `backend/executor/manager.py` All other file changes are from dev branch updates and not part of this feature. 
🤖 Generated with [Claude Code](https://claude.ai/code) ## Summary by CodeRabbit * New Features * Nested graph executions: parent-child tracking and retrieval of child executions * Improvements * Cascading stop: stopping a parent optionally terminates child executions * Parent execution IDs propagated through runs and surfaced in logs * Per-user/graph concurrent execution limits enforced * Bug Fixes * Skip enqueuing children if parent is terminated; robust handling when parent-status checks fail * Tests * Updated tests to cover parent linkage in graph creation --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-29 11:11:22 +00:00
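The recursive cascading stop from the commit above can be sketched without a database. This is a minimal illustration, assuming an in-memory parent-to-children map; `CHILDREN`, `STATUS` and `stop_execution` are hypothetical names and the real stop logic (cancel signals, wait timeouts) is omitted.

```python
import asyncio

# Hypothetical in-memory stand-ins for the execution table.
CHILDREN = {"root": ["c1", "c2"], "c1": ["g1"]}
STATUS = {exec_id: "RUNNING" for exec_id in ("root", "c1", "c2", "g1")}


async def stop_execution(exec_id: str, cascade: bool = True) -> None:
    """Stop all descendants first (in parallel per level), then the execution itself."""
    if cascade:
        children = CHILDREN.get(exec_id, [])
        if children:
            # gather() takes awaitables as positional arguments, hence the unpacking.
            # return_exceptions=True keeps one failing child from aborting the rest.
            await asyncio.gather(
                *(stop_execution(child, cascade=True) for child in children),
                return_exceptions=True,
            )
    STATUS[exec_id] = "TERMINATED"


asyncio.run(stop_execution("root"))
```

Stopping `root` terminates the grandchild `g1` as well, because each level awaits its own children before marking itself terminated.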
Zamil Majdy	5fb142c656	fix(backend/executor): ensure cluster lock release on all execution submission failures (#11281 ) ## Root Cause During rolling deployment, execution `97058338-052a-4528-87f4-98c88416bb7f` got stuck in QUEUED state because: 1. Pod acquired cluster lock successfully during shutdown 2. Subsequent setup operations failed (ThreadPoolExecutor shutdown, resource exhaustion, etc.) 3. No error handling existed around the critical section after lock acquisition 4. Cluster lock remained stuck in Redis for 5 minutes (TTL timeout) 5. Other pods couldn't acquire the lock, leaving execution permanently queued ## The Fix ### Problem: Critical Section Not Protected The original code had no error handling for the entire critical section after successful lock acquisition: ```python # Original code - no error handling after lock acquired current_owner = cluster_lock.try_acquire() if current_owner != self.executor_id: return # didn't get lock # CRITICAL SECTION - any failure here leaves lock stuck self._execution_locks[graph_exec_id] = cluster_lock # Could fail: memory logger.info("Acquired cluster lock...") # Could fail: logging cancel_event = threading.Event() # Could fail: resources future = self.executor.submit(...) # Could fail: shutdown self.active_graph_runs[...] = (future, cancel_event) # Could fail: memory ``` ### Solution: Wrap Entire Critical Section Protect ALL operations after successful lock acquisition: ```python # Fixed code - comprehensive error handling current_owner = cluster_lock.try_acquire() if current_owner != self.executor_id: return # didn't get lock # Wrap ENTIRE critical section after successful acquisition try: self._execution_locks[graph_exec_id] = cluster_lock logger.info("Acquired cluster lock...") cancel_event = threading.Event() future = self.executor.submit(...) self.active_graph_runs[...] 
= (future, cancel_event) except Exception as e: # Release cluster lock before requeue cluster_lock.release() del self._execution_locks[graph_exec_id] _ack_message(reject=True, requeue=True) return ``` ### Why This Comprehensive Approach Works - Complete protection: Any failure in critical section → lock released - Proper cleanup order: Lock released → message requeued → another pod can try - Uses existing infrastructure: Leverages established `_ack_message()` requeue logic - Handles all scenarios: ThreadPoolExecutor shutdown, resource exhaustion, memory issues, logging failures ## Protected Failure Scenarios 1. Memory exhaustion: `_execution_locks` assignment or `active_graph_runs` assignment 2. Resource exhaustion: `threading.Event()` creation fails 3. ThreadPoolExecutor shutdown: `executor.submit()` with "cannot schedule new futures after shutdown" 4. Logging system failures: `logger.info()` calls fail 5. Any unexpected exceptions: Network issues, disk problems, etc. ## Validation - ✅ All existing tests pass - ✅ Maintains exact same success path behavior - ✅ Comprehensive error handling for all failure points - ✅ Minimal code change with maximum protection ## Impact - Eliminates stuck executions during pod lifecycle events (rolling deployments, scaling, crashes) - Faster recovery: Immediate requeue vs 5-minute Redis TTL wait - Higher reliability: Handles ANY failure in the critical section - Production-ready: Comprehensive solution for distributed lock management This prevents the exact race condition that caused execution `97058338-052a-4528-87f4-98c88416bb7f` to be stuck for >300 seconds, plus many other potential failure scenarios. --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-29 08:56:24 +00:00
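The "release the lock on any failure after acquisition" pattern from the commit above can be exercised with a shut-down thread pool, which is one of the failure scenarios it lists. This is a simplified sketch: `FakeClusterLock` and `submit_with_lock` are hypothetical, and `try_acquire` returns a bool here rather than an owner ID.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeClusterLock:
    """Hypothetical stand-in for the Redis-backed cluster lock."""

    def __init__(self):
        self.held = False

    def try_acquire(self) -> bool:
        self.held = True
        return True

    def release(self) -> None:
        self.held = False


def submit_with_lock(lock, executor, locks, exec_id, fn):
    """Submit fn under the lock; undo all bookkeeping if anything fails after acquisition."""
    if not lock.try_acquire():
        return None
    try:
        locks[exec_id] = lock
        return executor.submit(fn)  # raises RuntimeError once the pool is shut down
    except Exception:
        # pop() with a default cannot raise, even if the assignment above never happened.
        locks.pop(exec_id, None)
        lock.release()
        return None


executor = ThreadPoolExecutor(max_workers=1)
executor.shutdown()
lock, locks = FakeClusterLock(), {}
future = submit_with_lock(lock, executor, locks, "exec-1", lambda: None)
```

In this sketch the cleanup uses `pop(..., None)` so that the error handler itself cannot fail; the caller would then requeue the message so another pod can pick it up.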
Zamil Majdy	de70ede54a	fix(backend): prevent execution of deleted agents and cleanup orphaned resources (#11243 ) ## Summary Fix critical bug where deleted agents continue running scheduled and triggered executions indefinitely, consuming credits without user control. ## Problem When agents are deleted from user libraries, their schedules and webhook triggers remain active, leading to: - ❌ Uncontrolled resource consumption - ❌ "Unknown agent" executions that charge credits - ❌ No way for users to stop orphaned executions - ❌ Accumulation of orphaned database records ## Solution ### 1. Prevention: Library Validation Before Execution - Add `is_graph_in_user_library()` function with efficient database queries - Validate graph accessibility before all executions in `validate_and_construct_node_execution_input()` - Use specific `GraphNotInLibraryError` for clear error handling ### 2. Cleanup: Remove Schedules & Webhooks on Deletion - Enhanced `delete_library_agent()` to clean up associated schedules and webhooks - Comprehensive cleanup functions for both scheduled and triggered executions - Proper database transaction handling ### 3. Error-Based Cleanup: Handle Existing Orphaned Resources - Catch `GraphNotInLibraryError` in scheduler and webhook handlers - Automatically clean up orphaned resources when execution fails - Graceful degradation without breaking existing workflows ### 4. 
Migration: Clean Up Historical Orphans - SQL migration to remove existing orphaned schedules and webhooks - Performance index for faster cleanup queries - Proper logging and error handling ## Key Changes ### Core Library Validation ```python # backend/data/graph.py - Single source of truth async def is_graph_in_user_library(graph_id: str, user_id: str, graph_version: Optional[int] = None) -> bool: where_clause = {"userId": user_id, "agentGraphId": graph_id, "isDeleted": False, "isArchived": False} if graph_version is not None: where_clause["agentGraphVersion"] = graph_version count = await LibraryAgent.prisma().count(where=where_clause) return count > 0 ``` ### Enhanced Agent Deletion ```python # backend/server/v2/library/db.py async def delete_library_agent(library_agent_id: str, user_id: str, soft_delete: bool = True) -> None: # ... existing deletion logic ... await _cleanup_schedules_for_graph(graph_id=graph_id, user_id=user_id) await _cleanup_webhooks_for_graph(graph_id=graph_id, user_id=user_id) ``` ### Execution Prevention ```python # backend/executor/utils.py if not await gdb.is_graph_in_user_library(graph_id=graph_id, user_id=user_id, graph_version=graph.version): raise GraphNotInLibraryError(f"Graph #{graph_id} is not accessible in your library") ``` ### Error-Based Cleanup ```python # backend/executor/scheduler.py & backend/server/integrations/router.py except GraphNotInLibraryError as e: logger.warning(f"Execution blocked for deleted/archived graph {graph_id}") await _cleanup_orphaned_resources_for_graph(graph_id, user_id) ``` ## Technical Implementation ### Database Efficiency - Use `count()` instead of `find_first()` for faster queries - Add performance index: `idx_library_agent_user_graph_active` - Follow existing `prisma.is_connected()` patterns ### Error Handling Hierarchy - `GraphNotInLibraryError`: Specific exception for deleted/archived graphs - `NotAuthorizedError`: Generic authorization errors (preserved for user ID mismatches) - Clear error 
messages for better debugging ### Code Organization - Single source of truth for library validation in `backend/data/graph.py` - Import from centralized location to avoid duplication - Top-level imports following codebase conventions ## Testing & Validation ### Functional Testing - ✅ Library validation prevents execution of deleted agents - ✅ Cleanup functions remove schedules and webhooks properly - ✅ Error-based cleanup handles orphaned resources gracefully - ✅ Migration removes existing orphaned records ### Integration Testing - ✅ All existing tests pass (including `test_store_listing_graph`) - ✅ No breaking changes to existing functionality - ✅ Proper error propagation and handling ### Performance Testing - ✅ Efficient database queries with proper indexing - ✅ Minimal overhead for normal execution flows - ✅ Cleanup operations don't impact performance ## Impact ### User Experience - 🎯 Immediate: Deleted agents stop running automatically - 🎯 Ongoing: No more unexpected credit charges from orphaned executions - 🎯 Cleanup: Historical orphaned resources are removed ### System Reliability - 🔒 Security: Users can only execute agents they have access to - 🧹 Cleanup: Automatic removal of orphaned database records - 📈 Performance: Efficient validation with minimal overhead ### Developer Experience - 🎯 Clear Errors: Specific exception types for better debugging - 🔧 Maintainable: Centralized library validation logic - 📚 Documented: Comprehensive error handling patterns ## Files Modified - `backend/data/graph.py` - Library validation function - `backend/server/v2/library/db.py` - Enhanced agent deletion with cleanup - `backend/executor/utils.py` - Execution validation and prevention - `backend/executor/scheduler.py` - Error-based cleanup for schedules - `backend/server/integrations/router.py` - Error-based cleanup for webhooks - `backend/util/exceptions.py` - Specific error type for deleted graphs - 
`migrations/20251023000000_cleanup_orphaned_schedules_and_webhooks/migration.sql` - Historical cleanup ## Breaking Changes None. All changes are backward compatible and preserve existing functionality. ## Follow-up Tasks - [ ] Monitor cleanup effectiveness in production - [ ] Consider adding metrics for orphaned resource detection - [ ] Potential optimization of cleanup batch operations 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-28 23:48:35 +00:00
Reinier van der Leer	5e5f45a713	fix(backend): Fix various warnings (#11252 ) - Resolves #11251 This fixes all the warnings mentioned in #11251, reducing noise and making our logs and error alerts more useful :) ### Changes 🏗️ - Remove "Block {block_name} has multiple credential inputs" warning (not actually an issue) - Rename `json` attribute of `MainCodeExecutionResult` to `json_data`; retain serialized name through a field alias - Replace `Path(regex=...)` with `Path(pattern=...)` in `get_shared_execution` endpoint parameter config - Change Uvicorn's WebSocket module to new Sans-I/O implementation for WS server - Disable Uvicorn's WebSocket module for REST server - Remove deprecated `enable_cleanup_closed=True` argument in `CloudStorageHandler` implementation - Replace Prisma transaction timeout `int` argument with a `timedelta` value - Update Sentry SDK to latest version (v2.42.1) - Broaden filter for cleanup warnings from indirect dependency `litellm` - Fix handling of `MissingConfigError` in REST server endpoints ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - Check that the warnings are actually gone - [x] Deploy to dev environment and run a graph; check for any warnings - Test WebSocket server - [x] Run an agent in the Builder; make sure real-time execution updates still work	2025-10-28 13:18:45 +00:00
Reinier van der Leer	e06e7ff33f	fix(backend): Implement graceful shutdown in `AppService` to prevent RPC errors (#11240 ) We're currently seeing errors in the `DatabaseManager` while it's shutting down, like: ``` WARNING [DatabaseManager] Termination request: SystemExit; 0 executing cleanup. INFO [DatabaseManager] ⏳ Disconnecting Database... INFO [PID-1|THREAD-29|DatabaseManager|Prisma-82fb1994-4b87-40c1-8869-fbd97bd33fc8] Releasing connection started... INFO [PID-1|THREAD-29|DatabaseManager|Prisma-82fb1994-4b87-40c1-8869-fbd97bd33fc8] Releasing connection completed successfully. INFO [DatabaseManager] Terminated. ERROR POST /create_or_add_to_user_notification_batch failed: Failed to create or add to notification batch for user {user_id} and type AGENT_RUN: NoneType: None ``` This indicates two issues: - The service doesn't wait for pending RPC calls to finish before terminating - We're using `logger.exception` outside an error handling context, which prints the confusing and unhelpful `NoneType: None` instead of the error info ### Changes 🏗️ - Implement graceful shutdown in `AppService` so in-flight RPC calls can finish - Add tests for graceful shutdown - Prevent `AppService` accepting new requests during shutdown - Rework `AppService` lifecycle management; add support for async `lifespan` - Fix `AppService` endpoint error logging - Improve logging in `AppProcess` and `AppService` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - Deploy to Dev cluster, then `kubectl rollout restart` the different services a few times - [x] -> `DatabaseManager` doesn't break on re-deployment - [x] -> `Scheduler` doesn't break on re-deployment - [x] -> `NotificationManager` doesn't break on re-deployment	2025-10-25 14:47:19 +00:00
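The two shutdown behaviours named in the commit above (let in-flight calls finish, refuse new ones) can be sketched with plain `asyncio`. This is not the `AppService` implementation; `GracefulService`, `handle` and `shutdown` are hypothetical names used only to show the idea.

```python
import asyncio


class GracefulService:
    """Toy service that tracks in-flight requests and drains them on shutdown."""

    def __init__(self):
        self._shutting_down = False
        self._in_flight = set()

    async def handle(self, coro_fn):
        if self._shutting_down:
            raise RuntimeError("service is shutting down")
        task = asyncio.ensure_future(coro_fn())
        self._in_flight.add(task)
        task.add_done_callback(self._in_flight.discard)
        return await task

    async def shutdown(self, timeout: float = 5.0) -> None:
        self._shutting_down = True  # stop accepting new requests
        if self._in_flight:
            # Give pending requests a bounded amount of time to complete.
            await asyncio.wait(self._in_flight, timeout=timeout)


async def demo():
    service = GracefulService()

    async def slow_call():
        await asyncio.sleep(0.05)
        return "done"

    pending = asyncio.create_task(service.handle(slow_call))
    await asyncio.sleep(0)  # let the request start before shutting down
    await service.shutdown()
    try:
        await service.handle(slow_call)
        rejected = False
    except RuntimeError:
        rejected = True
    return await pending, rejected


RESULT = asyncio.run(demo())
```

The request that started before shutdown still returns its result, while the request issued afterwards is refused.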
Reinier van der Leer	04df981115	fix(backend): Fix structured logging for cloud environments (#11227 ) - Resolves #11226 ### Changes 🏗️ - Drop use of `CloudLoggingHandler` which docs state isn't for use in GKE - For cloud logging, output only structured log entries to `stdout` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Test deploy to dev and check logs	2025-10-21 12:48:41 +00:00
Zamil Majdy	11d55f6055	fix(backend/executor): Avoid running direct query in executor (#11224 ) ## Summary - Fixes database connection warnings in executor logs: "Client is not connected to the query engine, you must call `connect()` before attempting to query data" - Implements resilient database client pattern already used elsewhere in the codebase - Adds caching to reduce database load for user context lookups ## Changes - Updated `get_user_context()` to check `prisma.is_connected()` and fall back to database manager client - Added `@cached(maxsize=1000, ttl_seconds=3600)` decorator for performance optimization - Updated database manager to expose `get_user_by_id` method ## Test plan - [x] Verify executor pods no longer show Prisma connection warnings - [x] Confirm user timezone is still correctly retrieved - [x] Test fallback behavior when Prisma is disconnected 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-21 08:46:40 +00:00
Zamil Majdy	0bb2b87c32	fix(backend): resolve UserBalance migration issues and credit spending bug (#11192 ) ## Summary Fix critical UserBalance migration and spending issues affecting users with credits from transaction history but no UserBalance records. ## Root Issues Fixed ### Issue 1: UserBalance Migration Complexity - Problem: Complex data migration with timestamp logic issues and potential race conditions - Solution: Simplified to idempotent table creation only, application handles auto-population ### Issue 2: Credit Spending Bug - Problem: Users with $10.0 from transaction history couldn't spend $0.16 - Root Cause: `_add_transaction` and `_enable_transaction` only checked UserBalance table, returning 0 balance for users without records - Solution: Enhanced both methods with transaction history fallback logic ### Issue 3: Exception Handling Inconsistency - Problem: Raw SQL unique violations raised different exception types than Prisma ORM - Solution: Convert raw SQL unique violations to `UniqueViolationError` at source ## Changes Made ### Migration Cleanup - Idempotent operations: Use `CREATE TABLE IF NOT EXISTS`, `CREATE INDEX IF NOT EXISTS` - Inline foreign key: Define constraint within `CREATE TABLE` instead of separate `ALTER TABLE` - Removed data migration: Application creates UserBalance records on-demand - Safe to re-run: No errors if table/index/constraint already exists ### Credit Logic Fixes - Enhanced `_add_transaction`: Added transaction history fallback in `user_balance_lock` CTE - Enhanced `_enable_transaction`: Added same fallback logic for payment fulfillment - Exception normalization: Convert raw SQL unique violations to `UniqueViolationError` - Simplified `onboarding_reward`: Use standardized `UniqueViolationError` catching ### SQL Fallback Pattern ```sql COALESCE( (SELECT balance FROM UserBalance WHERE userId = ? 
FOR UPDATE), -- Fallback: compute from transaction history if UserBalance doesn't exist (SELECT COALESCE(ct.runningBalance, 0) FROM CreditTransaction ct WHERE ct.userId = ? AND ct.isActive = true AND ct.runningBalance IS NOT NULL ORDER BY ct.createdAt DESC LIMIT 1), 0 ) as balance ``` ## Impact ### Before - ❌ Users with transaction history but no UserBalance couldn't spend credits - ❌ Migration had complex timestamp logic with potential bugs - ❌ Raw SQL and Prisma exceptions handled differently - ❌ Error: "Insufficient balance of $10.0, where this will cost $0.16" ### After - ✅ Seamless spending for all users regardless of UserBalance record existence - ✅ Simple, idempotent migration that's safe to re-run - ✅ Consistent exception handling across all credit operations - ✅ Automatic UserBalance record creation during first transaction - ✅ Backward compatible - existing users unaffected ## Business Value - Eliminates user frustration: Users can spend their credits immediately - Smooth migration path: From old User.balance to new UserBalance table - Better reliability: Atomic operations with proper error handling - Maintainable code: Consistent patterns across credit operations ## Test Plan - [ ] Manual testing with users who have transaction history but no UserBalance records - [ ] Verify migration can be run multiple times safely - [ ] Test spending credits works for all user scenarios - [ ] Verify payment fulfillment (`_enable_transaction`) works correctly - [ ] Add comprehensive test coverage for this scenario 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-17 19:46:13 +07:00
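The `COALESCE` fallback from the commit above can be tried out against an in-memory SQLite database. This is an approximation under stated assumptions: SQLite has no `FOR UPDATE`, so row locking is omitted, the column types are guesses, and `get_balance` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE UserBalance (userId TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE CreditTransaction (
        userId TEXT, runningBalance INTEGER, isActive INTEGER, createdAt INTEGER
    );
    -- u1 has transaction history but no UserBalance row; u2 has a UserBalance row.
    INSERT INTO CreditTransaction VALUES ('u1', 500, 1, 1), ('u1', 1000, 1, 2);
    INSERT INTO UserBalance VALUES ('u2', 42);
    """
)

BALANCE_SQL = """
SELECT COALESCE(
    (SELECT balance FROM UserBalance WHERE userId = :uid),
    -- Fallback: latest running balance from the transaction history
    (SELECT ct.runningBalance FROM CreditTransaction ct
      WHERE ct.userId = :uid AND ct.isActive = 1 AND ct.runningBalance IS NOT NULL
      ORDER BY ct.createdAt DESC LIMIT 1),
    0
) AS balance
"""


def get_balance(user_id: str) -> int:
    """Resolve a balance: UserBalance row first, then history, then zero."""
    return conn.execute(BALANCE_SQL, {"uid": user_id}).fetchone()[0]
```

A user with only transaction history resolves to their latest running balance, a user with a `UserBalance` row resolves to that row, and an unknown user resolves to zero.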
Zamil Majdy	dfdd632161	fix(backend/util): handle nested Pydantic models in SafeJson (#11188 ) ## Summary Fixes a critical serialization bug introduced in PR #11187 where `SafeJson` failed to serialize dictionaries containing Pydantic models, causing 500 Internal Server Errors in the executor service. ## Problem The error manifested as: ``` CRITICAL: Operation Approaching Failure Threshold: Service communication: '_call_method_async' Current attempt: 50/50 Error: HTTPServerError: HTTP 500: Server error '500 Internal Server Error' for url 'http://autogpt-database-manager.prod-agpt.svc.cluster.local:8005/create_graph_execution' ``` Root cause in `create_graph_execution` (backend/data/execution.py:656-657): ```python "credentialInputs": SafeJson(credential_inputs) if credential_inputs else Json({}) ``` Where `credential_inputs: Mapping[str, CredentialsMetaInput]` is a dict containing Pydantic models. After PR #11187's refactor, `_sanitize_value()` only converted top-level BaseModel instances to dicts, but didn't handle BaseModel instances nested inside dicts/lists/tuples. This caused Prisma's JSON serializer to fail with: ``` TypeError: Type <class 'backend.data.model.CredentialsMetaInput'> not serializable ``` ## Solution Added BaseModel handling to `_sanitize_value()` to recursively convert Pydantic models to dicts before sanitizing: ```python elif isinstance(value, BaseModel): # Convert Pydantic models to dict and recursively sanitize return _sanitize_value(value.model_dump(exclude_none=True)) ``` This ensures all nested Pydantic models are properly serialized regardless of nesting depth. 
## Changes - backend/util/json.py: Added BaseModel check to `_sanitize_value()` function - backend/util/test_json.py: Added 6 comprehensive tests covering: - Dict containing Pydantic models - Deeply nested Pydantic models - Lists of Pydantic models in dicts - The exact CredentialsMetaInput scenario - Complex mixed structures - Models with control characters ## Testing ✅ All new tests pass ✅ Verified fix resolves the production 500 error ✅ Code formatted with `poetry run format` ## Related - Fixes issues introduced in PR #11187 - Related to executor service 500 errors in production 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Bentlybro <Github@bentlybro.com> Co-authored-by: Claude <noreply@anthropic.com>	2025-10-17 09:27:09 +00:00
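The recursive conversion described in the SafeJson fix above can be sketched as follows. This is a minimal illustration rather than the actual `backend/util/json.py` code: `FakeModel` is a hypothetical stand-in that exposes only a `model_dump` method, so the example runs without Pydantic installed, and `sanitize_value` is an assumed name for the helper.

```python
from typing import Any


class FakeModel:
    """Hypothetical stand-in for a Pydantic model (only model_dump is provided)."""

    def __init__(self, **fields: Any) -> None:
        self._fields = fields

    def model_dump(self, exclude_none: bool = False) -> dict:
        if exclude_none:
            return {k: v for k, v in self._fields.items() if v is not None}
        return dict(self._fields)


def sanitize_value(value: Any) -> Any:
    """Recursively turn model-like objects and containers into plain JSON data."""
    if hasattr(value, "model_dump"):
        # Convert the model to a dict first, then sanitize the result, so that
        # models nested at any depth are handled by the same code path.
        return sanitize_value(value.model_dump(exclude_none=True))
    if isinstance(value, dict):
        return {str(k): sanitize_value(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize_value(v) for v in value]
    return value
```

With this shape, a mapping such as `{"github": FakeModel(id="abc")}` becomes a plain nested dict before it reaches a JSON serializer.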
Zamil Majdy	374f35874c	feat(platform): Add LaunchDarkly flag for platform payment system (#11181 ) ## Summary Implement selective rollout of payment functionality using LaunchDarkly feature flags to enable gradual deployment to pilot users. - Add `ENABLE_PLATFORM_PAYMENT` flag to control credit system behavior - Update `get_user_credit_model` to use user-specific flag evaluation - Replace hardcoded `NEXT_PUBLIC_SHOW_BILLING_PAGE` with LaunchDarkly flag - Enable payment UI components only for flagged users - Maintain backward compatibility with existing beta credit system - Default to beta monthly credits when flag is disabled - Fix tests to work with new async credit model function ## Key Changes ### Backend - Credit Model Selection: The `get_user_credit_model()` function now takes a `user_id` parameter and uses LaunchDarkly to determine which credit model to return: - Flag enabled → `UserCredit` (payment system enabled, no monthly refills) - Flag disabled → `BetaUserCredit` (current behavior with monthly refills) - Flag Integration: Added `ENABLE_PLATFORM_PAYMENT` flag and integrated LaunchDarkly evaluation throughout the credit system - API Updates: All credit-related endpoints now use the user-specific credit model instead of a global instance ### Frontend - Dynamic UI: Payment-related components (billing page, wallet refill) now show/hide based on the LaunchDarkly flag - Removed Environment Variable: Replaced `NEXT_PUBLIC_SHOW_BILLING_PAGE` with runtime flag evaluation ### Testing - Test Fixes: Updated all tests that referenced the removed global `_user_credit_model` to use proper mocking of the new async function ## Deployment Strategy This implementation enables a controlled rollout: 1. Deploy with flag disabled (default) - no behavior change for existing users 2. Enable flag for pilot/beta users via LaunchDarkly dashboard 3. Monitor usage and feedback from pilot users 4. Gradually expand to more users 5. 
Eventually enable for all users once validated ## Test Plan - [x] Unit tests pass for credit system components - [x] Payment UI components show/hide correctly based on flag - [x] Default behavior (flag disabled) maintains current functionality - [x] Flag enabled users get payment system without monthly refills - [x] Admin credit operations work correctly - [x] Backward compatibility maintained 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-17 06:11:39 +00:00
Nicholas Tindle	b230b1b5cf	feat(backend): Add Sentry user and tag tracking to node execution (#11170) Integrates Sentry SDK to set user and contextual tags during node execution for improved error tracking and user count analytics. Ensures Sentry context is properly set and restored, and exceptions are captured with relevant context before scope restoration. <!-- Clearly explain the need for these changes: --> ### Changes 🏗️ Adds Sentry tracking to block failures <!-- Concisely describe all of the changes made in this pull request: --> ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: <!-- Put your test plan here: --> - [x] Test to make sure the user ID and block details show up in Sentry - [x] Make sure other errors aren't contaminated <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Added conditional support for feature flags when configured, enabling targeted rollouts and experiments without impacting unconfigured environments. - Chores - Enhanced error monitoring with richer contextual data during node execution to improve stability and diagnostics. - Updated metrics initialization to dynamically include feature flag integrations when available, without altering behavior for unconfigured setups. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-10-15 14:33:08 +00:00
Zamil Majdy	934cb3a9c7	feat(backend): Make execution limit per user per graph and reduce to 25 (#11169 ) ## Summary - Changed max_concurrent_graph_executions_per_user from 50 to 25 concurrent executions - Updated the limit to be per user per graph instead of globally per user - Users can now run different graphs concurrently without being limited by executions of other graphs - Enhanced database query to filter by both user_id and graph_id ## Changes Made - Settings: Reduced default limit from 50 to 25 and updated description to clarify per-graph scope - Database Layer: Modified `get_graph_executions_count` to accept optional `graph_id` parameter - Executor Manager: Updated rate limiting logic to check per-user-per-graph instead of per-user globally - Logging: Enhanced warning messages to include graph_id context ## Test plan - [ ] Verify that users can run up to 25 concurrent executions of the same graph - [ ] Verify that users can run different graphs concurrently without interference - [ ] Test rate limiting behavior when limit is exceeded for a specific graph - [ ] Confirm logging shows correct graph_id context in rate limit messages ## Impact This change improves the user experience by allowing concurrent execution of different graphs while still preventing resource exhaustion from running too many instances of the same graph. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-15 00:02:55 +00:00
seer-by-sentry[bot]	20acd8b51d	fix(backend): Improve Postmark error handling and logging for notification delivery (#11052 ) <!-- Clearly explain the need for these changes: --> Fixes [AUTOGPT-SERVER-5K6](https://sentry.io/organizations/significant-gravitas/issues/6887660207/). The issue was that: Batch sending fails due to malformed data (422) and inactive recipients (406); the 406 error is misclassified as a size limit failure. - Implements more robust error handling for Postmark API failures during notification sending. - Specifically handles inactive recipients (HTTP 406), malformed data (HTTP 422), and oversized notifications. - Adds detailed logging for each error case, including the notification index and error message. - Skips individual notifications that fail due to these errors, preventing the entire batch from failing. - Improves error handling for ValueErrors during send_templated calls, specifically addressing oversized notifications. This fix was generated by Seer in Sentry, triggered by Nicholas Tindle. 👁️ Run ID: 1675950 Not quite right? [Click here to continue debugging with Seer.](https://sentry.io/organizations/significant-gravitas/issues/6887660207/?seerDrawer=true) ### Changes 🏗️ <!-- Concisely describe all of the changes made in this pull request: --> - Implements more robust error handling for Postmark API failures during notification sending. - Specifically handles inactive recipients (HTTP 406), malformed data (HTTP 422), and oversized notifications. - Adds detailed logging for each error case, including the notification index and error message. - Skips individual notifications that fail due to these errors, preventing the entire batch from failing. - Improves error handling for ValueErrors during send_templated calls, specifically addressing oversized notifications. 
- Also disables this in prod to prevent scaling issues until we work out some of the more critical issues ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: <!-- Put your test plan here: --> - [x] Test sending notifications with invalid email addresses to ensure 406 errors are handled correctly. - [x] Test sending notifications with malformed data to ensure 422 errors are handled correctly. - [x] Test sending oversized notifications to ensure they are skipped and logged correctly. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - None - Bug Fixes - Individual email failures no longer abort a batch; processing continues after per-recipient errors. - Specific handling for inactive recipients and malformed messages to prevent repeated delivery attempts. - Chores - Improved error logging and diagnostics for email delivery scenarios. - Tests - Added tests covering email-sending error cases, user-deactivation on inactive addresses, and batch-continuation behavior. - Documentation - None <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: seer-by-sentry[bot] <157164994+seer-by-sentry[bot]@users.noreply.github.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com> Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>	2025-10-13 07:16:48 +00:00
Zamil Majdy	05a72f4185	feat(backend): implement user rate limiting for concurrent graph executions (#11128 ) ## Summary Add configurable rate limiting to prevent users from exceeding the maximum number of concurrent graph executions, defaulting to 50 per user. ## Changes Made ### Configuration (`backend/util/settings.py`) - Add `max_concurrent_graph_executions_per_user` setting (default: 50, range: 1-1000) - Configurable via environment variables or settings file ### Database Query Function (`backend/data/execution.py`) - Add `get_graph_executions_count()` function for efficient count queries - Supports filtering by user_id, statuses, and time ranges - Used to check current RUNNING/QUEUED executions per user ### Database Manager Integration (`backend/executor/database.py`) - Expose `get_graph_executions_count` through DatabaseManager RPC interface - Follows existing patterns for database operations - Enables proper service-to-service communication ### Rate Limiting Logic (`backend/executor/manager.py`) - Inline rate limit check in `_handle_run_message()` before cluster lock - Use existing `db_client` pattern for consistency - Reject and requeue executions when limit exceeded - Graceful error handling - proceed if rate limit check fails - Enhanced logging with user_id and current/max execution counts ## Technical Implementation - Database approach: Query actual execution statuses for accuracy - RPC pattern: Use DatabaseManager client following existing codebase patterns - Fail-safe design: Proceed with execution if rate limit check fails - Requeue on limit: Rejected executions are requeued for later processing - Early rejection: Check rate limit before expensive cluster lock operations ## Rate Limiting Flow 1. Parse incoming graph execution request 2. Query database via RPC for user's current RUNNING/QUEUED execution count 3. Compare against configured limit (default: 50) 4. If limit exceeded: reject and requeue message 5. 
If within limit: proceed with normal execution flow ## Configuration Example ```env MAX_CONCURRENT_GRAPH_EXECUTIONS_PER_USER=25 # Reduce to 25 for stricter limits ``` ## Test plan - [x] Basic functionality tested - settings load correctly, database function works - [x] ExecutionManager imports and initializes without errors - [x] Database manager exposes the new function through RPC - [x] Code follows existing patterns and conventions - [ ] Integration testing with actual rate limiting scenarios - [ ] Performance testing to ensure minimal impact on execution pipeline 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-11 08:02:34 +07:00
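The rate limiting flow above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in `backend/executor/manager.py`: `should_requeue` and the `count_active` callable are hypothetical names standing in for the RPC count query, and treating "limit exceeded" as `current >= limit` is an assumption.

```python
from typing import Callable

# Default taken from the commit message; configurable in the real settings.
MAX_CONCURRENT_GRAPH_EXECUTIONS_PER_USER = 50


def should_requeue(
    user_id: str,
    count_active: Callable[[str], int],
    limit: int = MAX_CONCURRENT_GRAPH_EXECUTIONS_PER_USER,
) -> bool:
    """Return True when the incoming run should be rejected and requeued."""
    try:
        # Hypothetical stand-in for the RUNNING/QUEUED count query.
        current = count_active(user_id)
    except Exception:
        # Fail-safe design: if the check itself fails, proceed with execution.
        return False
    return current >= limit
```

Running the check before any expensive locking keeps rejected messages cheap, which matches the "early rejection" point in the description.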
Bently	4856bd1f3a	fix(backend): prevent sub-agent execution visibility across users (#11132) Fixes an issue where sub-agent executions triggered by one user were visible in the original agent author's execution library. ## Solution Fixed the user_id attribution in `autogpt_platform/backend/backend/executor/manager.py` by ensuring that sub-agent executions always use the actual executor's user_id rather than the agent author's user_id stored in node defaults. ### Changes - Added user_id override in `execute_node()` function when preparing AgentExecutorBlock input (line 194) - Ensures sub-agent executions are correctly attributed to the user running them, not the agent author - Maintains proper privacy isolation between users in marketplace agent scenarios ### Security Impact - Before: When User B downloaded and ran a marketplace agent containing sub-agents owned by User A, the sub-agent executions appeared in User A's library - After: Sub-agent executions now only appear in the library of the user who actually ran them - Prevents unauthorized access to execution data and user privacy violations #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: <!-- Test plan: --> - [x] Create an agent with sub-agents as User A - [x] Publish agent to marketplace - [x] Run the agent as User B - [x] Verify User A cannot see User B's sub-agent executions in their library - [x] Verify User B can see their own sub-agent executions - [x] Verify primary agent executions remain correctly filtered	2025-10-09 11:17:26 +00:00
Zamil Majdy	59c27fe248	feat(backend): implement comprehensive rate-limited Discord alerting system (#11106 ) ## Summary Implement comprehensive Discord alerting system with intelligent rate limiting to prevent spam and provide proper visibility into system failures across retry mechanisms and execution errors. ## Key Features ### 🚨 Rate-Limited Discord Alerting Infrastructure - Reusable rate-limited alerts: `send_rate_limited_discord_alert()` function for any Discord alerts - 5-minute rate limiting: Prevents spam for identical error signatures (function+error+context) - Thread-safe: Proper locking for concurrent alert attempts - Configurable channels: Support custom Discord channels or default to PLATFORM - Graceful failure handling: Alert failures don't break main application flow ### 🔄 Enhanced Retry Alert System - Unified threshold alerting: Both general retries and infrastructure retries alert at EXCESSIVE_RETRY_THRESHOLD (50 attempts) - Critical retry alerts: Early warning when operations approach failure threshold - Infrastructure monitoring: Dedicated alerts for database, Redis, RabbitMQ connection issues - Rate limited: All retry alerts use rate limiting to prevent overwhelming Discord channels ### 📊 Unknown Execution Error Alerts - Automatic error detection: Alert for unexpected graph execution failures - Rich context: Include user ID, graph ID, execution ID, error type and message - Filtered alerts: Skip known errors (InsufficientBalanceError, ModerationError) - Proper error tracking: Ensure execution_stats.error is set for all error types ## Technical Implementation ### Rate Limiting Strategy ```python # Create unique signatures based on function+error+context error_signature = f"{context}:{func_name}:{type(exception).__name__}:{str(exception)[:100]}" ``` - 5-minute windows: ALERT_RATE_LIMIT_SECONDS = 300 prevents duplicate alerts - Memory efficient: Only store last alert timestamp per unique error signature - Context awareness: Same error in different 
contexts can send separate alerts ### Alerting Hierarchy 1. 50 attempts: Critical alert warning about approaching failure (EXCESSIVE_RETRY_THRESHOLD) 2. 100 attempts: Final infrastructure failure (conn_retry max_retry) 3. Unknown execution errors: Immediate rate-limited alerts for unexpected failures ## Files Modified ### Core Implementation - `backend/executor/manager.py`: Unknown execution error alerts with rate limiting - `backend/util/retry.py`: Comprehensive rate-limited alerting infrastructure - `backend/util/retry_test.py`: Full test coverage for rate limiting functionality (14 tests) ### Code Quality Improvements - Inlined alert messages: Eliminated unnecessary temporary variables - Simplified logic: Removed excessive comments and redundant alerts - Consistent patterns: All alert functions follow same clean code style - DRY principle: Reusable rate-limited alert system for future monitoring needs ## Benefits ### 🛡️ Prevents Alert Spam - Rate limiting: No more overwhelming Discord channels with duplicate alerts - Intelligent deduplication: Same errors rate limited while different errors get through - Thread safety: Concurrent operations handled correctly ### 🔍 Better System Visibility - Unknown errors: Issues that need investigation are properly surfaced - Infrastructure monitoring: Early warning for database/Redis/RabbitMQ issues - Rich context: All necessary debugging information included in alerts ### 🧹 Maintainable Codebase - Reusable infrastructure: `send_rate_limited_discord_alert()` for future monitoring - Clean, consistent code: Inlined messages, simplified logic, proper abstractions - Comprehensive testing: Rate limiting edge cases and real-world scenarios covered ## Validation Results - ✅ All 14 retry tests pass including comprehensive rate limiting coverage - ✅ Manager execution tests pass validating integration with execution flow - ✅ Thread safety validated with concurrent alert attempt tests - ✅ Real-world scenarios tested including the 
specific spend_credits spam issue that motivated this work - ✅ Code formatting, linting, and type checking all pass ## Before/After Comparison ### Before - No rate limiting → Discord spam for repeated errors - Unknown execution errors not monitored → Issues went unnoticed - Inconsistent alerting thresholds → Confusing monitoring - Verbose code with temporary variables → Harder to maintain ### After - ✅ Rate-limited intelligent alerting prevents spam - ✅ Unknown execution errors properly monitored with context - ✅ Unified 50-attempt threshold for consistent monitoring - ✅ Clean, maintainable code with reusable infrastructure 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-09 08:22:15 +07:00
Zamil Majdy	4e1557e498	fix(backend): Add dynamic input pin support for Smart Decision Maker Block (#11082 ) ## Summary - Centralize dynamic field delimiters and helpers in backend/data/dynamic_fields.py. - Refactor SmartDecisionMaker: build function signatures with dynamic-field mapping and re-map tool outputs back to original dynamic names. - Deterministic retry loop with retry-only feedback to avoid polluting final conversation history. - Update executor/utils.py and data/graph.py to use centralized utilities. - Update and extend tests: dynamic-field E2E flow, mapping verification, output yielding, and retry validation; switch mocked llm_call to AsyncMock; align tool-name expectations. - Add a single-tool fallback in schema lookup to support mocked scenarios. ## Validation - Full backend test suite: 1125 passed, 88 skipped, 53 warnings (local). - Backend lint/format pass. ## Scope - Minimal and localized to SmartDecisionMaker and dynamic-field utilities; unrelated pyright warnings remain unchanged. ## Risks/Mitigations - Behavior is backward-compatible; dynamic-field constants are centralized and reused. - Output re-mapping only affects SmartDecisionMaker tool outputs and matches existing link naming conventions. ## Checklist - [x] Formatted and linted - [x] All updated tests pass locally - [x] No secrets introduced --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-04 14:23:13 +00:00
Zamil Majdy	57a06f7088	fix(blocks, security): Fixes for various DoS vulnerabilities (#10798 ) This PR addresses multiple critical and medium security vulnerabilities that could lead to Denial of Service (DoS) attacks. All fixes implement defense-in-depth strategies with comprehensive testing. ### Changes 🏗️ #### Critical Security Fixes: 1. GHSA-m2wr-7m3r-p52c - ReDoS in CodeExtractionBlock - Fixed catastrophic backtracking in regex patterns `\s+[\s\S]?` and `\s+(.?)` - Replaced with safer patterns: `[ \t]\n([^\s\S]?)` - Files: `backend/blocks/code_extraction_block.py` 2. GHSA-955p-gpfx-r66j - AITextSummarizerBlock Memory Amplification - Added 1MB text size limit and 100 chunk maximum - Prevents 10K input → 50G memory amplification attacks - Files: `backend/blocks/llm.py` 3. GHSA-5cqw-g779-9f9x - RSS Feed XML Bomb DoS - Added 10MB feed size limit and 30s timeout - Prevents deep XML parsing memory exhaustion - Files: `backend/blocks/rss.py` 4. GHSA-7g34-7fvq-xxq6 - File Storage Disk Exhaustion - Added 100MB per file and 1GB per execution directory limits - Prevents disk space exhaustion from file uploads - Files: `backend/util/file.py` 5. GHSA-pppq-xx2w-7jpq - ExtractTextInformationBlock ReDoS - Added 1MB text limit, 1000 match limit, and 5s timeout protection - Prevents lookahead pattern memory exhaustion - Files: `backend/blocks/text.py` 6. GHSA-vw3v-whvp-33v5 - Docker Logging Disk Exhaustion - Added log rotation limits at Docker (10MB × 3 files) and application levels - Prevents unbounded log growth causing disk exhaustion - Files: `docker-compose.platform.yml`, `autogpt_libs/autogpt_libs/logging/config.py` #### Additional Security Improvements: 7. StepThroughItemsBlock DoS Prevention - Added 10,000 item limit and 1MB input size limit - Prevents large iteration DoS attacks - Files: `backend/blocks/iteration.py` 8. 
XMLParserBlock XML Bomb Prevention - Added 10MB XML input size limit - Files: `backend/blocks/xml_parser.py` #### Code Quality: - Fixed Python 3.10 typing compatibility issues - Added comprehensive security test suite - All code formatted and linted ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Created comprehensive security test suite covering all vulnerabilities - [x] Verified ReDoS patterns are fixed and don't cause timeouts - [x] Confirmed memory limits prevent amplification attacks - [x] Tested file size limits prevent disk exhaustion - [x] Validated log rotation prevents unbounded growth - [x] Ensured backward compatibility for normal usage #### For configuration changes: - [x] `docker-compose.yml` is updated with logging limits - [x] I have included a list of my configuration changes in the PR description (under Changes) ### Test Plan 🧪 Security Tests: 1. ReDoS Protection: Tested with malicious regex inputs (large spaces) - completes without hanging 2. Memory Limits: Verified 2MB text input gets truncated to 1MB, chunk limits enforced 3. File Size Limits: Confirmed 200MB files rejected, directory size limits enforced 4. Iteration Limits: Tested 20K item arrays rejected, large JSON strings rejected 5. 
Timeout Protection: Dangerous regex patterns timeout after 5s instead of hanging Compatibility Tests: - Normal functionality preserved for all blocks - Existing tests pass with new security limits - Performance impact minimal for typical usage ### Security Impact 🛡️ Before: Multiple attack vectors could cause: - CPU exhaustion (ReDoS attacks) - Memory exhaustion (amplification attacks) - Disk exhaustion (file/log bombs) - Service unavailability After: All attack vectors mitigated with: - Input validation and size limits - Timeout protections - Resource quotas - Defense-in-depth approach All fixes maintain backward compatibility while preventing DoS attacks. 🤖 Generated with [Claude Code](https://claude.ai/code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Adds robust DoS protections across blocks (regex, memory, iteration, XML/RSS, file I/O) and enables app/Docker log rotation with comprehensive tests. > > - Security hardening: > - Replace unsafe regex in `backend/blocks/code_extraction_block.py` to prevent ReDoS; add safer extraction/removal patterns. > - Constrain LLM summarizer chunking in `backend/blocks/llm.py` (1MB cap, chunk/overlap validation, chunk count limit). > - Limit RSS fetching in `backend/blocks/rss.py` (scheme validation, 10MB cap, timeout, bounded read) and return empty on failure. > - Impose XML size limit (10MB) in `backend/blocks/xml_parser.py`. > - Add file upload/download limits in `backend/util/file.py` (100MB/file, 1GB dir quota) and enforce scanning before write. > - Enable rotating file logs in `autogpt_libs/logging/config.py` (size + backups) and Docker json-file log rotation in `docker-compose.platform.yml`. > - Iteration block: > - Add item count/string size limits; fix yielded key for dicts; cap iterations in `backend/blocks/iteration.py`. > - Tests: > - New `backend/blocks/test/test_security_fixes.py` covering ReDoS, timeouts, memory/size and iteration limits, XML/file constraints. 
> - Misc: > - Typing fallback for `NotRequired` in `activity_status_generator.py`. > - Dependency updates in `backend/poetry.lock`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `500e1578b1`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com> Co-authored-by: Zamil Majdy <majdyz@users.noreply.github.com> Co-authored-by: Reinier van der Leer <Pwuts@users.noreply.github.com> Co-authored-by: Reinier van der Leer <pwuts@agpt.co>	2025-10-02 12:55:55 +00:00
Zamil Majdy	258bf0b1a5	fix(backend): improve activity status generation accuracy and handle missing blocks gracefully (#11039 ) ## Summary Fix critical issues where activity status generator incorrectly reported failed executions as successful, and enhance AI evaluation logic to be more accurate about actual task accomplishment. ## Changes Made ### 1. Missing Block Handling (`backend/data/graph.py`) - Replace ValueError with graceful degradation: When blocks are deleted/missing, return `_UnknownBlock` placeholder instead of crashing - Comprehensive interface implementation: `_UnknownBlock` implements all expected Block methods to prevent type errors - Warning logging: Log missing blocks for debugging without breaking execution flow - Removed unnecessary caching: Direct constructor calls instead of cached wrapper functions ### 2. Enhanced Activity Status AI Evaluation (`backend/executor/activity_status_generator.py`) #### Intention-Based Success Evaluation - Graph description analysis: AI now reads graph description FIRST to understand intended purpose - Purpose-driven evaluation: Success is measured against what the graph was designed to accomplish - Critical output analysis: Enhanced detection of missing outputs from key blocks (Output, Post, Create, Send, Publish, Generate) - Sub-agent failure detection: Better identification when AgentExecutorBlock produces no outputs #### Improved Prompting - Intent-specific examples: 'blog writing' → check for blog content, 'email automation' → check for sent emails - Primary evaluation criteria: 'Did this execution accomplish what the graph was designed to do?' - Enhanced checklist: 7-point analysis including graph description matching - Technical vs. goal completion: Distinguish between workflow steps completing vs. 
actual user goals achieved #### Removed Database Error Handling - Eliminated try-catch blocks: No longer needed around `get_graph_metadata` and `get_graph` calls - Direct database calls: Simplified error handling after fixing missing block root cause - Cleaner code flow: More predictable execution path without redundant error handling ## Problem Solved - False success reports: AI previously marked executions as 'successful' when critical output blocks produced no results - Missing block crashes: System would fail when trying to analyze executions with deleted/missing blocks - Intent-blind evaluation: AI evaluated technical completion instead of actual goal achievement - Database service errors: 500 errors when missing blocks caused graph loading failures ## Business Impact - More accurate user feedback: Users get honest assessment of whether their automations actually worked - Better task completion detection: Clear distinction between 'workflow completed' vs. 'goal achieved' - Improved reliability: System handles edge cases gracefully without crashing - Enhanced user trust: Truthful reporting builds confidence in the platform ## Testing - ✅ Tested with problematic executions that previously showed false successes - ✅ Confirmed missing block handling works without warnings - ✅ Verified enhanced prompt correctly identifies failures - ✅ Database calls work without try-catch protection ## Example Before/After Before (False Success): ``` Graph: "Automated SEO Blog Writer" Status: "✅ I successfully completed your blog writing task!" Reality: No blog content was actually created (critical output blocks had no outputs) ``` After (Accurate Failure Detection): ``` Graph: "Automated SEO Blog Writer" Status: "❌ The task failed because the blog post creation step didn't produce any output." 
Reality: Correctly identifies that the intended blog writing goal was not achieved ``` ## Files Modified - `backend/data/graph.py`: Missing block graceful handling with complete interface - `backend/executor/activity_status_generator.py`: Enhanced AI evaluation with intention-based analysis ## Type of Change - [x] Bug fix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [ ] This change requires a documentation update ## Checklist - [x] My code follows the style guidelines of this project - [x] I have performed a self-review of my own code - [x] I have commented my code, particularly in hard-to-understand areas - [x] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective or that my feature works - [x] New and existing unit tests pass locally with my changes - [x] Any dependent changes have been merged and published in downstream modules --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 12:28:57 +00:00
Zamil Majdy	f314fbf14f	fix(backend): resolve two critical long-running agent execution failures (#11011 ) ## Summary Fix two production issues causing agent execution failures that occurred this morning: 1. AsyncRedisLock Release Error (ExecutionID: 08b2c251-ee27-45de-b88d-1792823ca3ee) - Error: "Cannot release a lock that's no longer owned" - Root cause: Race condition where lock expires during long database operations - Location: backend/executor/manager.py synchronized context manager 2. Tool Call Parameter Validation (ExecutionID: 766fd9a0-5f22-4a77-96e8-14c9d02f3292) - Issue: LLM used typo'd parameter 'maximum_keyword_difficulty' instead of 'max_keyword_difficulty' - SmartDecisionMakerBlock silently accepted typo, setting correct parameter to null - Result: Downstream blocks received null values causing execution failures ## Changes Made ### AsyncRedisLock Error Handling - Add try-catch blocks around AsyncRedisLock.release() calls in ExecutionManager and OAuth refresh - Prevent crashes when locks expire between ownership check and release - Log warnings instead of crashing execution ### Tool Call Parameter Validation - Reject unknown parameters: Raise ValueError for typo'd parameter names with detailed error messages - Allow optional parameters: Only validate missing REQUIRED parameters - Safe parameter access: Use .get() to handle optional parameters with defaults - Clean code: Extract parameters object once to eliminate duplication ## Technical Implementation Lock Release Protection: ```python if await lock.locked() and await lock.owned(): try: await lock.release() except Exception as e: logger.warning(f"Failed to release lock for key {key}: {e}") ``` Parameter Validation Logic: ```python # Get parameters schema from tool definition if tool_def and "function" in tool_def and "parameters" in tool_def["function"]: parameters = tool_def["function"]["parameters"] expected_args = parameters.get("properties", {}) required_params = set(parameters.get("required", 
[])) # Detect parameter typos and missing required params unexpected_args = provided_args - expected_args_set missing_required_args = required_params - provided_args if unexpected_args or missing_required_args: raise ValueError(error_msg) # Detailed error explaining the problem ``` ## Testing - [x] All existing tests pass - [x] Lock error handling prevents execution crashes - [x] Tool validation catches typos while allowing optional parameters - [x] Maintains backward compatibility with existing workflows ## Impact - ✅ No more "Cannot release a lock" crashes during long database operations - ✅ Tool calls with typo'd parameters are rejected with clear error messages - ✅ Optional parameters work correctly with default values - ✅ Production stability improved with graceful error handling ## Files Modified - `backend/executor/manager.py` - AsyncRedisLock error handling in synchronized context - `backend/integrations/creds_manager.py` - OAuth refresh lock error handling - `backend/blocks/smart_decision_maker.py` - Tool call parameter validation with typo detection Fixes two critical production failures that were causing 2/5 agent runs to fail this morning. --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-09-29 15:34:20 +00:00
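The two snippets in the commit message above are abbreviated (for example, `provided_args` and `expected_args_set` are never defined there). A minimal runnable sketch of both ideas follows. It is not the code from `smart_decision_maker.py` or `manager.py`: the helper names `validate_tool_args` and `release_quietly`, the `ExpiredLock` stand-in, and the sample `TOOL_DEF` schema are assumptions made for illustration.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


def validate_tool_args(tool_def: dict, provided: dict) -> None:
    """Raise ValueError on unknown or missing-required tool arguments."""
    parameters = tool_def.get("function", {}).get("parameters")
    if parameters is None:
        return  # No schema to validate against.
    expected = set(parameters.get("properties", {}))
    required = set(parameters.get("required", []))
    provided_names = set(provided)

    unexpected = provided_names - expected  # typo'd or unknown names
    missing = required - provided_names     # optional params may be absent
    if unexpected or missing:
        raise ValueError(
            f"Invalid tool call: unexpected={sorted(unexpected)}, "
            f"missing_required={sorted(missing)}, expected={sorted(expected)}"
        )


async def release_quietly(lock, key: str) -> bool:
    """Release `lock` if still owned; log a warning instead of raising."""
    if await lock.locked() and await lock.owned():
        try:
            await lock.release()
            return True
        except Exception as e:
            # The lock can expire between the ownership check and release().
            logger.warning("Failed to release lock for key %s: %s", key, e)
    return False


class ExpiredLock:
    """Hypothetical stand-in: passes the ownership checks, fails on release."""

    async def locked(self) -> bool:
        return True

    async def owned(self) -> bool:
        return True

    async def release(self) -> None:
        raise RuntimeError("Cannot release a lock that's no longer owned")


# Hypothetical tool schema mirroring the parameter names from the incident.
TOOL_DEF = {
    "function": {
        "parameters": {
            "properties": {"keyword": {}, "max_keyword_difficulty": {}},
            "required": ["keyword"],
        }
    }
}
```

With this sketch, a call that passes `maximum_keyword_difficulty` raises a `ValueError` naming the unknown parameter, while a call that omits the optional `max_keyword_difficulty` passes. `release_quietly(ExpiredLock(), "k")` returns `False` and logs a warning rather than propagating the release error.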
Zamil Majdy	a97ff641c3	feat(backend): optimize FastAPI endpoints performance and alert system (#11000 ) ## Summary Comprehensive performance optimization fixing event loop binding issues and addressing all PR feedback. ### Original Performance Issues Fixed Event Loop Binding Problems: - JWT authentication dependencies were synchronous, causing thread pool bottlenecks under high concurrency - FastAPI's default thread pool (40 threads) was insufficient for high-load scenarios - Backend services lacked proper event loop configuration Security & Performance Improvements: - Security middleware converted from BaseHTTPMiddleware to pure ASGI for better performance - Added blocks endpoint to cacheable paths for improved response times - Cross-platform uvloop detection with Windows compatibility ### Key Changes Made #### 1. JWT Authentication Async Conversion - Files: `autogpt_libs/auth/dependencies.py`, `autogpt_libs/auth/jwt_utils.py` - Change: Convert all JWT functions to async (`requires_user`, `requires_admin_user`, `get_user_id`, `get_jwt_payload`) - Impact: Eliminates thread pool blocking, improves concurrency handling - Tests: All 25+ authentication tests updated to async patterns #### 2. FastAPI Thread Pool Optimization - File: `backend/server/rest_api.py:82-93` - Change: Configure thread pool size via `config.fastapi_thread_pool_size` - Default: Increased from 40 to higher limit for sync operations - Impact: Better handling of remaining sync dependencies #### 3. Performance-Optimized Security Middleware - File: `backend/server/middleware/security.py` - Change: Pure ASGI implementation replacing BaseHTTPMiddleware - Headers: HTTP spec compliant capitalization (X-Content-Type-Options, X-Frame-Options, etc.) - Caching: Added `/api/blocks` and `/api/v1/blocks` to cacheable paths - Impact: Reduced middleware overhead, improved header compliance #### 4. 
Cross-Platform Event Loop Configuration - File: `backend/server/rest_api.py:311-312` - Change: Platform-aware uvloop detection: `'uvloop' if platform.system() != 'Windows' else 'auto'` - Impact: Windows compatibility while maintaining Unix performance benefits - Verified: 'auto' is valid uvicorn default parameter #### 5. Enhanced Caching Infrastructure - File: `autogpt_libs/utils/cache.py:118-132` - Change: Per-event-loop asyncio.Lock instances prevent cross-loop deadlocks - Impact: Thread-safe caching across multiple event loops #### 6. Database Query Limits & Performance - Files: Multiple data layer files - Change: Added configurable limits to prevent unbounded queries - Constants: `MAX_GRAPH_VERSIONS_FETCH=50`, `MAX_USER_API_KEYS_FETCH=500`, etc. - Impact: Consistent performance regardless of data volume #### 7. OpenAPI Documentation Improvements - File: `backend/server/routers/v1.py:68-85` - Change: Added proper response model and schema for blocks endpoint - Impact: Better API documentation and type safety #### 8. 
Error Handling & Retry Logic Fixes - File: `backend/util/retry.py:63` - Change: Accurate retry threshold comments referencing EXCESSIVE_RETRY_THRESHOLD - Impact: Clear documentation for debugging retry scenarios ### ntindle Feedback Addressed ✅ HTTP Header Capitalization: All headers now use proper HTTP spec capitalization ✅ Windows uvloop Compatibility: Clean platform detection with inline conditional ✅ OpenAPI Response Model: Blocks endpoint properly documented in schema ✅ Retry Comment Accuracy: References actual threshold constants instead of hardcoded numbers ✅ Code Cleanliness: Inline conditionals preferred over verbose if statements ### Performance Testing Results Before Optimization: - High latency under concurrent load - Thread pool exhaustion at ~40 concurrent requests - Event loop binding issues causing timeouts After Optimization: - Improved concurrency handling with async JWT pipeline - Configurable thread pool scaling - Cross-platform event loop optimization - Reduced middleware overhead ### Backward Compatibility ✅ All existing functionality preserved ✅ No breaking API changes ✅ Enhanced test coverage with async patterns ✅ Windows and Unix compatibility maintained ### Files Modified Core Authentication & Performance: - `autogpt_libs/auth/dependencies.py` - Async JWT dependencies - `autogpt_libs/auth/jwt_utils.py` - Async JWT utilities - `backend/server/rest_api.py` - Thread pool config + uvloop detection - `backend/server/middleware/security.py` - ASGI security middleware Database & Limits: - `backend/data/includes.py` - Performance constants and configurable includes - `backend/data/api_key.py`, `backend/data/credit.py`, `backend/data/graph.py`, `backend/data/integrations.py` - Query limits Caching & Infrastructure: - `autogpt_libs/utils/cache.py` - Per-event-loop lock safety - `backend/server/routers/v1.py` - OpenAPI improvements - `backend/util/retry.py` - Comment accuracy Testing: - `autogpt_libs/auth/dependencies_test.py` - 25+ async test 
conversions - `autogpt_libs/auth/jwt_utils_test.py` - Async JWT test patterns Ready for review and production deployment. 🚀 --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-09-29 05:32:48 +00:00


232 Commits