Mirror of https://github.com/Significant-Gravitas/AutoGPT.git (synced 2026-02-09 14:25:25 -05:00)

Latest commit: f999c8ccdf3bee5d3f733fceed651aab4e5329d4

950 Commits

f999c8ccdf | Merge branch 'dev' into toran/open-2856-handle-failed-replicate-predictions-with-retries-in-all

3b24884fd7 | refactor(backend/blocks): implement run_replicate_with_retry helper function
### Changes 🏗️
- Introduced a new helper function `run_replicate_with_retry` to handle retries for model execution across multiple blocks, improving error handling and reducing code duplication.
- Updated `AIImageCustomizerBlock`, `AIImageGeneratorBlock`, `AIMusicGeneratorBlock`, `AIImageEditorBlock`, `ReplicateFluxAdvancedModelBlock`, and `ReplicateModelBlock` to utilize the new helper function for running models.
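
The helper's source isn't shown in this log, but a retry wrapper of roughly this shape illustrates the pattern (a minimal sketch; the signature, backoff policy, and logging are assumptions, not the actual implementation):

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable

logger = logging.getLogger(__name__)

async def run_replicate_with_retry(
    run: Callable[[], Awaitable[Any]],  # e.g. lambda: client.async_run(model, input=...)
    max_attempts: int = 3,
    base_delay: float = 1.0,
) -> Any:
    """Retry a Replicate prediction with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await run()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "Replicate prediction failed (attempt %d/%d): %s; retrying in %.1fs",
                attempt, max_attempts, exc, delay,
            )
            await asyncio.sleep(delay)
```

Centralizing this in one helper is what lets all six blocks listed above share the same failure-handling behavior.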

3d08c22dd5 | feat(platform): add Human In The Loop block with review workflow (#11380)

## Summary
This PR implements a comprehensive Human In The Loop (HITL) block that allows agents to pause execution and wait for human approval/modification of data before continuing.

Demo: https://github.com/user-attachments/assets/c027d731-17d3-494c-85ca-97c3bf33329c

## Key Features
- Added WAITING_FOR_REVIEW status to AgentExecutionStatus enum
- Created PendingHumanReview database table for storing review requests
- Implemented HumanInTheLoopBlock that extracts input data and creates review entries
- Added API endpoints at /api/executions/review for fetching and reviewing pending data
- Updated execution manager to properly handle waiting status and resume after approval

## Frontend Components
- PendingReviewCard for individual review handling
- PendingReviewsList for multiple reviews
- FloatingReviewsPanel for graph builder integration
- Integrated review UI into 3 locations: legacy library, new library, and graph builder

## Technical Implementation
- Added proper type safety throughout with SafeJson handling
- Optimized database queries using count functions instead of full data fetching
- Fixed imports to be top-level instead of local
- All formatters and linters pass

## Test plan
- [ ] Test Human In The Loop block creation in graph builder
- [ ] Test block execution pauses and creates pending review
- [ ] Test review UI appears in all 3 locations
- [ ] Test data modification and approval workflow
- [ ] Test rejection workflow
- [ ] Test execution resumes after approval

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>

ff5dd7a5b4 | fix(backend): migrate all query_raw calls to query_raw_with_schema for proper schema handling (#11462)

## Summary
Complete migration of all non-test `query_raw` calls to use `query_raw_with_schema` for proper PostgreSQL schema context handling. This resolves the marketplace API failures where queries were looking for unqualified table names.

## Root Cause
Prisma's `query_raw()` doesn't respect the `schema` parameter in `DATABASE_URL` (`?schema=platform`) for raw SQL queries, causing queries to fail when looking for unqualified table names in multi-schema environments.

## Changes Made

### Files Updated
- ✅ **backend/server/v2/store/db.py**: Already updated in previous commit
- ✅ **backend/server/v2/builder/db.py**: Updated `get_suggested_blocks` query at line 343
- ✅ **backend/check_store_data.py**: Updated all 4 `query_raw` calls to use schema-aware queries
- ✅ **backend/check_db.py**: Updated all `query_raw` calls (import already existed)

### Technical Implementation
- Add import: `from backend.data.db import query_raw_with_schema`
- Replace `prisma.get_client().query_raw()` with `query_raw_with_schema()` (see the sketch after this entry)
- Add `{schema_prefix}` placeholder to table references in SQL queries
- Fix f-string template conflicts by using double braces `{{schema_prefix}}`

### Query Examples
**Before:**
```sql
FROM "StoreAgent"
FROM "AgentNodeExecution" execution
```
**After:**
```sql
FROM {schema_prefix}"StoreAgent"
FROM {schema_prefix}"AgentNodeExecution" execution
```

## Impact
- ✅ All raw SQL queries now properly respect platform schema context
- ✅ Fixes "relation does not exist" errors in multi-schema environments
- ✅ Maintains backward compatibility with public schema deployments
- ✅ Code formatting passes with `poetry run format`

## Testing
- All `query_raw` usages in non-test code successfully migrated
- `query_raw_with_schema` automatically handles schema prefix injection
- Existing query logic unchanged, only schema awareness added

## Before/After
**Before:** GET /api/store/agents → "relation 'StoreAgent' does not exist"
**After:** GET /api/store/agents → ✅ Returns store agents correctly

Resolves the marketplace API failures and ensures consistent schema handling across all raw SQL operations.

Co-authored-by: Claude <noreply@anthropic.com>
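
For intuition, a schema-prefix wrapper could look roughly like this (a hedged sketch; the real `query_raw_with_schema` in `backend/data/db.py` may read the schema from settings and differ in signature):

```python
import os
from prisma import Prisma

async def query_raw_with_schema(db: Prisma, query_template: str, *args):
    """Inject the configured schema prefix into raw SQL before running it,
    since Prisma's query_raw() ignores ?schema= in DATABASE_URL."""
    schema = os.getenv("DATABASE_SCHEMA", "platform")  # assumed config source
    schema_prefix = f'"{schema}".' if schema else ""
    query = query_template.format(schema_prefix=schema_prefix)
    return await db.query_raw(query, *args)
```

With that shape, call sites become e.g. `await query_raw_with_schema(db, 'SELECT * FROM {schema_prefix}"StoreAgent"')`.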

02f8a69c6a | feat(platform): add Google Drive Picker field type for enhanced file selection (#11311)

### 🏗️ Changes
This PR adds a Google Drive Picker field type to enhance the user experience of existing Google blocks, replacing manual file ID entry with a visual file picker.

#### Backend Changes
- **Added new picker field types** in the backend:
  - Configurable picker field with OAuth scope management
  - Support for multiselect, folder selection, and MIME type filtering
  - Proper access token handling for file downloads
- **Enhanced Gmail blocks**: Updated attachment fields to use Google Drive Picker for better UX
- **Enhanced Google Sheets blocks**: Updated spreadsheet selection to use the picker instead of manual ID entry
- **Added download utility**: Async file download with virus scanning and a 100MB size limit

#### Frontend Changes
- **Enhanced GoogleDrivePicker component**: Improved UI with folder icon and multiselect messaging
- **Integrated picker in form renderers**: Auto-renders for fields with the picker format
- **Added shared GoogleDrivePickerInput component**: Eliminates code duplication between NodeInputs and RunAgentInputs
- **Added type definitions**: Complete TypeScript support for picker schemas and responses

#### Key Features
- 🎯 **Visual file selection**: Replace manual Google Drive file ID entry with an intuitive picker
- 📁 **Flexible configuration**: Support for documents, spreadsheets, folders, and custom MIME types
- 🔒 **Minimal OAuth scopes**: Uses a narrowly scoped permission for security (only access to user-selected files)
- ⚡ **Enhanced UX**: Seamless integration in both block configuration and agent run modals
- 🛡️ **Security**: Virus scanning and file size limits for downloaded attachments

#### Migration Impact
- **Backward compatible**: Existing blocks continue to work with manual ID entry
- **Progressive enhancement**: New picker fields provide better UX for the same functionality
- **No breaking changes**: All existing blocks should be unaffected

This enhancement improves the user experience of Google blocks without introducing new systems or breaking existing functionality.

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Tested multiple of the new blocks (note: the create spreadsheet block should not be used for now, as it uses the API rather than the Drive picker)
  - [x] Chained the blocks together and passed values between them

Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Claude <noreply@anthropic.com>

e983d5c49a | fix(backend): Implement passed uploaded media support for AI image customizer block (#11441)

- Added `store_media_file` utility to convert local file paths to Data URIs for image processing
- Updated `AIImageCustomizerBlock` to utilize processed images in model execution, improving compatibility with the Replicate API
- Added optional aspect ratio input to AIImageCustomizerBlock

This change enhances the image handling capabilities of the AI image customizer, ensuring that images are properly formatted for external processing.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Created agent using AI Image Customizer block attached to agent
file input
- [x] Run agent, confirmed block is working
- [x] Confirm block is still working in original direct file upload
setup.
### Testing Results
#### Before (dev cloud):
![Before screenshot](https://github.com/user-attachments/assets/88c75668-c5c9-44bb-bec5-6554088a0cb7)

#### After (local):
![After screenshot](https://github.com/user-attachments/assets/04fea431-70a5-4173-bc84-d354c03d7174)
---
> [!NOTE]
> Preprocesses input images to data URIs and adds an `aspect_ratio` option, wiring both through to Replicate in `AIImageCustomizerBlock`.
>
> - **Backend**
>   - **`backend/blocks/ai_image_customizer.py`**:
>     - Preprocesses input images via `store_media_file(..., return_content=True)` to Data URIs before invoking Replicate.
>     - Adds `AspectRatio` enum and `aspect_ratio` input; passed through `run_model` and included in Replicate input.
>     - Updates block test input accordingly.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit</sup>
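
The data-URI step described in the note is conceptually simple; a minimal sketch (the real `store_media_file` also handles storage, size limits, and more options):

```python
import base64
import mimetypes

def file_to_data_uri(path: str) -> str:
    """Read a local media file and encode it as a data URI for an external API."""
    mime, _ = mimetypes.guess_type(path)
    mime = mime or "application/octet-stream"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```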

64a775dfa7 | feat(backend/blocks): Add GPT-5.1 and GPT-5.1-codex (#11406)

This PR adds the latest gpt-5.1 and gpt-5.1-codex LLMs from OpenAI, and updates the price of the gpt-5-chat model.
- https://platform.openai.com/docs/models/gpt-5.1
- https://platform.openai.com/docs/models/gpt-5.1-codex

I also had to add a new codex block, as it uses a different OpenAI API and has options the main LLMs don't use.

![Model list screenshot](https://github.com/user-attachments/assets/a4056633-7b0f-446f-ae86-d7755c5b88ec)

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test the latest gpt-5.1 llm
  - [x] Test the latest gpt-5.1-codex block

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Claude <noreply@anthropic.com>

5d97706bb8 | feat(backend/blocks): Add claude opus 4.5 (#11446)

This PR adds the latest [Claude Opus 4.5](https://www.anthropic.com/news/claude-opus-4-5) model to the platform.

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test and use the llm to make sure it works

244f3c7c71 | chore(backend/deps-dev): bump faker from 37.8.0 to 38.2.0 in /autogpt_platform/backend (#11435)

Bumps [faker](https://github.com/joke2k/faker) from 37.8.0 to 38.2.0.

Changelog highlights (from [faker's CHANGELOG.md](https://github.com/joke2k/faker/blob/master/CHANGELOG.md)):
- v38.2.0 (2025-11-19): Add localized UniqueProxy. Thanks @azmeuk.
- v38.1.0 (2025-11-19): Add `person` provider for `ar_DZ` locale; add `person`, `phone_number`, `date_time` for `fr_DZ` locale. Thanks @othmane099.
- v38.0.0 (2025-11-11): Drop support for Python 3.9; add support for Python 3.14.
- v37.12.0 (2025-10-07): Add French VAT number. Thanks @fabien-michel.
- v37.11.0 (2025-10-07): Add French company APE code. Thanks @fabien-michel.
- v37.9.0 (2025-10-07): Add names generation to `en_KE` locale. Thanks @titustum.

126d5838a0 | feat(backend/blocks): add latest grok models (#11422)

This PR adds some of the latest Grok models to the platform: `x-ai/grok-4-fast`, `x-ai/grok-4.1-fast`, and `ai/grok-code-fast-1`.

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test all of the latest grok models to make sure they work, and they do!

![Test screenshot](https://github.com/user-attachments/assets/0d1e3984-69e8-432b-982a-b04c16bc4f41)

643aea849b | feat(backend/blocks): Add google banana pro (#11425)

This PR adds the latest Google Banana Pro image generator and editor to the platform and fixes up some of the prices for the image generation models.

I asked for "Generate an image of a dog on a skateboard" and this is what I got:

![Generated image](https://github.com/user-attachments/assets/9b6c16d8-df8f-4fb6-a009-d6d342f9beb7)

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test the image generator and image editor block using the latest google banana pro model, and it works

Co-authored-by: Abhimanyu Yadav <122007096+Abhi1992002@users.noreply.github.com>

3b092f34d8 | feat(platform): Add Get Linear Issues Block (#11415)

Added the ability to get all issues for a given project.

### Changes 🏗️
- Added API query
- Added new models
- Added new block that gets all issues for a given project

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] I have ensured the new block works in dev
  - [x] I have ensured the other linear blocks still work

0921d23628 | fix(block): Improve error handling of SendEmailBlock (#11420)

Currently, if the SMTP server is not configured, it results in a platform error. This PR simplifies the error handling.

### Changes 🏗️
- Removed the default value for the SMTP server host
- Capture common errors and yield them as block errors (see the sketch after this entry)

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Checked all tests still pass
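
A rough sketch of the capture-and-yield pattern (the function shape and pin names are assumptions, not the actual SendEmailBlock code):

```python
import smtplib

def send_email(host: str, port: int, user: str, password: str, message: str):
    """Try to send mail; on common SMTP failures, yield an error output
    instead of letting the exception escalate to a platform error."""
    try:
        with smtplib.SMTP(host, port, timeout=30) as server:
            server.starttls()
            server.login(user, password)
            server.sendmail(user, [user], message)
        yield "status", "sent"
    except (smtplib.SMTPException, OSError) as exc:
        yield "error", f"Failed to send email: {exc}"
```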

06d20e7e4c | chore(backend/deps-dev): bump the development-dependencies group across 1 directory with 3 updates (#11411)

Bumps the development-dependencies group with 3 updates in the /autogpt_platform/backend directory: [pre-commit](https://github.com/pre-commit/pre-commit), [pyright](https://github.com/RobertCraigie/pyright-python) and [ruff](https://github.com/astral-sh/ruff).

Updates `pre-commit` from 4.3.0 to 4.4.0. Highlights from the 4.4.0 release notes:
- Features:
  - Add `--fail-fast` option to `pre-commit run` (#3528, @JulianMaurin)
  - Upgrade `ruby-build` / `rbenv` (#3566, @asottile; #3565, @MRigal)
  - Add `language: unsupported` / `language: unsupported_script` as aliases for `language: system` / `language: script`, which will eventually be deprecated (#3577, @asottile)
  - Add docker-in-docker detection support for cgroups v2 (#3535, @br-rhrbacek; #3360, @JasonAlt)
- Fixes:
  - Handle when docker gives `SecurityOptions: null` (#3537, @asottile; #3514, @jenstroeger)
  - Fix error context for invalid `stages` in `.pre-commit-config.yaml` (#3576, @asottile)

07b5fe859a | feat(platform/backend): add gemini-3-pro-preview (#11413)

This adds gemini-3-pro-preview from OpenRouter: https://openrouter.ai/google/gemini-3-pro-preview

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test the gemini 3 model in the llm blocks, and it works

901bb31e14 | feat(backend): parameterize activity status generation with customizable prompts (#11407)
## Summary
Implement comprehensive parameterization of the activity status
generation system to enable custom prompts for admin analytics
dashboard.
## Changes Made
### Core Function Enhancement (`activity_status_generator.py`)
- **Extract hardcoded prompts to constants**: `DEFAULT_SYSTEM_PROMPT`
and `DEFAULT_USER_PROMPT`
- **Add prompt parameters**: `system_prompt`, `user_prompt` with
defaults to maintain backward compatibility
- **Template substitution system**: User prompt supports
`{{GRAPH_NAME}}` and `{{EXECUTION_DATA}}` placeholders
- **Skip existing flag**: `skip_existing` parameter allows admin to
force regeneration of existing data
- **Maintain manager compatibility**: All existing calls continue to
work with default parameters
### Admin API Enhancement (`execution_analytics_routes.py`)
- **Custom prompt fields**: `system_prompt` and `user_prompt` optional
fields in `ExecutionAnalyticsRequest`
- **Skip existing control**: `skip_existing` boolean flag for admin
regeneration option
- **Template documentation**: Clear documentation of placeholder system
in field descriptions
- **Backward compatibility**: All existing API calls work unchanged
### Template System Design
- **Simple placeholder replacement**: `{{GRAPH_NAME}}` → actual graph name, `{{EXECUTION_DATA}}` → JSON execution data (see the sketch after this list)
- **No dependencies**: Uses simple `string.replace()` for maximum
compatibility
- **JSON safety**: Execution data properly serialized as indented JSON
- **Validation tested**: Template substitution verified to work
correctly
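
For illustration, the substitution described above can be as small as this (a sketch; the actual constants and function live in `activity_status_generator.py` and may differ):

```python
import json

def render_user_prompt(template: str, graph_name: str, execution_data: dict) -> str:
    """Fill the two supported placeholders with plain str.replace()."""
    return (
        template.replace("{{GRAPH_NAME}}", graph_name)
                .replace("{{EXECUTION_DATA}}", json.dumps(execution_data, indent=2))
    )
```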
## Key Features
### For Regular Users (Manager Integration)
- **No changes required**: Existing manager.py calls work unchanged
- **Default behavior preserved**: Same prompts and logic as before
- **Feature flag compatibility**: LaunchDarkly integration unchanged
### For Admin Analytics Dashboard
- **Custom system prompts**: Admins can override the AI evaluation
criteria
- **Custom user prompts**: Admins can modify the analysis instructions
with execution data templates
- **Force regeneration**: `skip_existing=False` allows reprocessing
existing executions with new prompts
- **Complete model list**: Access to all LLM models from `llm.py` (70+
models including GPT, Claude, Gemini, etc.)
## Technical Validation
- ✅ Template substitution tested and working
- ✅ Default behavior preserved for existing code
- ✅ Admin API parameter validation working
- ✅ All imports and function signatures correct
- ✅ Backward compatibility maintained
## Use Cases Enabled
- **A/B testing**: Compare different prompt strategies on same execution
data
- **Custom evaluation**: Tailor success criteria for specific graph
types
- **Prompt optimization**: Iterate on prompt design based on admin
feedback
- **Bulk reprocessing**: Regenerate activity status with improved
prompts
## Testing
- Template substitution functionality verified
- Function signatures and imports validated
- Code formatting and linting passed
- Backward compatibility confirmed
## Breaking Changes
None - all existing functionality preserved with default parameters.
## Related Issues
Resolves the requirement to expose prompt customization on the frontend
execution analytics dashboard.
---------
Co-authored-by: Claude <noreply@anthropic.com>

9438817702 | fix(platform): Capture Sentry Block Errors Correctly (#11404)

Currently we are capturing block errors via the scope only; this change captures the error directly.

### Changes 🏗️
- Capture the error as well as the scope in the executor manager (see the sketch after this entry)
- Update the block error message to include additional details
- Remove the `__str__` function from BlockError as it is no longer needed

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Checked that errors are still captured in dev
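
With the Sentry SDK, the capture-with-context pattern described above typically looks like this (a generic sketch, not the executor manager's actual code; the tag and extra names are illustrative):

```python
import sentry_sdk

def report_block_error(error: Exception, block_name: str, node_id: str) -> None:
    """Attach block context on an isolated scope, then capture the exception
    itself so Sentry records the error, not just scope metadata."""
    with sentry_sdk.new_scope() as scope:
        scope.set_tag("block", block_name)
        scope.set_extra("node_id", node_id)
        sentry_sdk.capture_exception(error)
```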

73c93cf554 | fix(backend): resolve production failures with comprehensive token handling and conversation safety fixes (#11394)

## Summary
Resolves multiple production failures including execution **6239b448-0434-4687-a42b-9ff0ddf01c1d** where AI Text Generator failed with `'NoneType' object is not iterable`. This implements comprehensive fixes addressing both the root cause (unrealistic token limits) and masking issues (Sentry SDK bug + conversation history null safety).

## Root Cause Analysis
Three interconnected issues caused production failures:

### 1. Unrealistic Perplexity Token Limits ❌
- **PERPLEXITY_SONAR**: 127,000 max_output_tokens (equivalent to ~95,000 words!)
- **PERPLEXITY_SONAR_DEEP_RESEARCH**: 128,000 max_output_tokens
- **Problem**: Newsletter generation defaulted to 127K output tokens
- **Result**: Exceeded OpenRouter's 128K total limit, causing API failures

### 2. Sentry SDK OpenAI Integration Bug 🐛
- **Location**: `sentry_sdk/integrations/openai.py:157`
- **Bug**: `for choice in response.choices:` failed when `choices=None`
- **Impact**: Masked real token limit errors with confusing TypeError

### 3. Conversation History Null Safety Issues ⚠️
- **Problem**: `get_pending_tool_calls()` expected non-null conversation_history
- **Impact**: SmartDecisionMaker crashes when conversation_history is None
- **Pattern**: Common in various LLM block scenarios

## Changes Made

### ✅ Fix 1: Realistic Perplexity Token Limits (`backend/blocks/llm.py`)
```python
# Before (PROBLEMATIC)
LlmModel.PERPLEXITY_SONAR: ModelMetadata("open_router", 127000, 127000)
LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: ModelMetadata("open_router", 128000, 128000)

# After (FIXED)
LlmModel.PERPLEXITY_SONAR: ModelMetadata("open_router", 127000, 8000)
LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: ModelMetadata("open_router", 128000, 16000)
```
**Rationale:**
- **8K tokens** (SONAR): Matches industry standard, sufficient for long content (6K words)
- **16K tokens** (DEEP_RESEARCH): Higher limit for research, supports very long content (12K words)
- **Industry pattern**: 3-4% of context window (consistent with other OpenRouter models)

### ✅ Fix 2: Sentry SDK Upgrade (`pyproject.toml`)
- **Upgrade**: `^2.33.2` → `^2.44.0`
- **Result**: OpenAI integration bug fixed in SDK (no code changes needed)

### ✅ Fix 3: Conversation History Null Safety (`backend/blocks/smart_decision_maker.py`)
```python
# Before
def get_pending_tool_calls(conversation_history: list[Any]) -> dict[str, int]:

# After
def get_pending_tool_calls(conversation_history: list[Any] | None) -> dict[str, int]:
    if not conversation_history:
        return {}
```
- **Added**: Proper null checking for conversation_history parameter
- **Prevents**: `'NoneType' object is not iterable` errors
- **Impact**: Improves SmartDecisionMaker reliability across all scenarios

## Impact & Benefits

### 🎯 Production Reliability
- ✅ **Prevents token limit errors** for realistic content generation
- ✅ **Clear error handling** without masked Sentry TypeError crashes
- ✅ **Better conversation safety** with proper null checking
- ✅ **Multiple failure scenarios resolved** comprehensively

### 📈 User Experience
- ✅ **Faster responses** (reasonable output lengths)
- ✅ **Lower costs** (more focused content generation)
- ✅ **More stable workflows** with better error handling
- ✅ **Maintains flexibility**: users can override with explicit `max_tokens`

### 🔧 Technical Improvements
- ✅ **Follows industry standards**: aligns with other OpenRouter models
- ✅ **Breaking change risk: LOW**: users can override if needed
- ✅ **Root cause resolution**: fixes the error chain at its source
- ✅ **Defensive programming**: better null safety patterns

## Validation

### Industry Analysis ✅
- Large context models typically use 8K-16K output limits (not 127K)
- Newsletter generation typically needs 650-10K tokens, not 127K tokens
- Pattern analysis of 13 OpenRouter models confirms the 3-4% context ratio

### Production Testing ✅
- **Before**: Newsletter generation → 127K tokens → API failure → Sentry crash
- **After**: Newsletter generation → 8K tokens → successful completion
- **Error handling**: Clear token limit errors instead of confusing TypeErrors
- **Null safety**: Conversation history None/undefined handled gracefully

### Dependencies ✅
- **Sentry SDK**: Confirmed 2.44.0 fixes OpenAI integration crashes
- **Poetry lock**: All dependencies updated successfully
- **Backward compatibility**: Maintained for existing workflows

## Related Issues
- Fixes flowExecutionID **6239b448-0434-4687-a42b-9ff0ddf01c1d**
- Resolves AI Text Generator reliability issues
- Improves overall platform token handling and conversation safety
- Addresses multiple production failure patterns comprehensively

## Breaking Changes Assessment
**Risk Level**: 🟡 **LOW-MEDIUM**
- **Perplexity limits**: Users relying on 127K+ output would be limited (likely unintentional usage)
- **Override available**: Users can explicitly set `max_tokens` for custom limits
- **Conversation safety**: Only improves reliability, no breaking changes
- **Most use cases**: Unaffected or improved by realistic defaults

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>

02757d68f3 | fix(backend): resolve marketplace agent access in get_graph_execution endpoint (#11396)
## Summary
Fixes critical issue where `GET
/graphs/{graph_id}/executions/{graph_exec_id}` failed for marketplace
agents with "Graph not found" errors due to incorrect version access
checking.
## Root Cause
The endpoint was checking access to the **latest version** of a graph
instead of the **specific version used in the execution**. This broke
marketplace agents when:
1. User executes a marketplace agent (e.g., v3)
2. Graph owner later publishes a new version (e.g., v4)
3. User tries to view execution details
4. **BUG**: Code checked access to latest version (v4) instead of
execution version (v3)
5. If v4 wasn't published to marketplace → access denied → "Graph not
found"
## Original Problematic Code
```python
# routers/v1.py - get_graph_execution (WRONG ORDER)
graph = await graph_db.get_graph(graph_id=graph_id, user_id=user_id) # ❌ Uses LATEST version
if not graph:
    raise HTTPException(404, f"Graph #{graph_id} not found")
result = await execution_db.get_graph_execution(...) # Gets execution data
```
## Solution
**Reordered operations** to check access against the **execution's
specific version**:
```python
# NEW CODE (CORRECT ORDER)
result = await execution_db.get_graph_execution(...) # ✅ Get execution FIRST
if not await graph_db.get_graph(
graph_id=result.graph_id,
version=result.graph_version, # ✅ Use execution's version, not latest!
user_id=user_id,
):
raise HTTPException(404, f"Graph #{graph_id} not found")
```
### Key Changes Made
1. **Fixed version access logic** (routers/v1.py:1075-1095):
- Reordered operations to get execution data first
- Check access using `result.graph_version` instead of latest version
- Applied same fix to external API routes
2. **Enhanced `get_graph()` marketplace fallback**
(data/graph.py:919-935):
- Added proper marketplace lookup when user doesn't own the graph
- Supports version-specific marketplace access checking
- Maintains security by only allowing approved, non-deleted listings
3. **Activity status generator fix**
(activity_status_generator.py:139-144):
- Use `skip_access_check=True` for internal system operations
4. **Missing block handling** (data/graph.py:94-103):
- Added `_UnknownBlockBase` placeholder for graceful handling of deleted
blocks
## Example Scenario Fixed
1. **User**: Installs marketplace agent "Blog Writer" v3
2. **Owner**: Later publishes v4 (not to marketplace yet)
3. **User**: Runs the agent (executes v3)
4. **Before**: Viewing execution details fails because code checked v4
access
5. **After**: ✅ Viewing execution details works because code checks v3
access
## Impact
- ✅ **Marketplace agents work correctly**: Users can view execution
details for any marketplace agent version they've used
- ✅ **Backward compatibility**: Existing owned graphs continue working
- ✅ **Security maintained**: Only allows access to versions user
legitimately executed
- ✅ **Version-aware access control**: Proper access checking for
specific versions, not just latest
## Testing
- [x] Marketplace agents: Execution details now accessible for all
executed versions
- [x] Owned graphs: Continue working as before
- [x] Version scenarios: Access control works correctly for specific
versions
- [x] Missing blocks: Graceful handling without errors
**Root issue resolved**: Version mismatch between execution version and
access check version that was breaking marketplace agent execution
viewing.
---------
Co-authored-by: Claude <noreply@anthropic.com>

a66219fc1f | fix(platform): Remove un-runnable agents from schedule (#11374)

Currently, when an agent fails validation during a scheduled run, we raise an error and then try again, regardless of why it failed. This change removes the agent's schedule and notifies the user.

### Changes 🏗️
- Add schedule_id to the GraphExecutionJobArgs
- Add agent_name to the GraphExecutionJobArgs
- Delete schedule on GraphValidationError
- Notify the user with a message that includes the agent name

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] I have ensured the scheduler tests work with these changes

8b3a741f60 | refactor(turnstile): Remove turnstile (#11387)

This PR removes Turnstile from the platform.

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test to make sure that Turnstile is gone, and it is
  - [x] Test logging in without Turnstile to make sure it still works
  - [x] Test registering a new account without Turnstile, and it works

749be06599 | fix(blocks/ai): Make AI List Generator block more reliable (#11317)

- Resolves #11305

### Changes 🏗️
Make `AIListGeneratorBlock` more reliable:
- Leverage `AIStructuredResponseGenerator`'s robust prompt/retry/validate logic
- Use JSON format instead of Python list format
- Add `force_json_output` toggle
- Fix output instructions in prompt (only string values allowed)

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Works without `force_json_output`
  - [x] Works with `force_json_output`
  - [x] Retry mechanism works as intended

27d886f05c | feat(platform): WebSocket Onboarding notifications (#11335)

Use WebSocket notifications from the backend to display confetti.

### Changes 🏗️
- Send WebSocket notifications to the browser when new onboarding steps are completed
- Handle WebSocket notification events in the Wallet and use them instead of frontend-based logic to play confetti (fixes confetti appearing on every refresh)
- Scroll to newly completed tasks when the wallet opens, just before confetti plays
- Fix: make the `Run again` button complete the `RE_RUN_AGENT` task

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Confetti is displayed when previously uncompleted tasks are completed
  - [x] Confetti does not appear on page refresh
  - [x] Wallet scrolls on open before confetti is displayed
  - [x] `Run again` button completes the `RE_RUN_AGENT` task

536e2a5ec8 | fix(blocks): Make Smart Decision Maker tool pin handling consistent and reliable (#11363)

- Resolves #11345

### Changes 🏗️
- Move tool use routing logic from frontend to backend: routing info was being baked into graph links by the frontend, inconsistently, causing issues
- Rework tool use routing to use target node ID instead of target block name
- Add a bit of magic to the `NodeOutputs` component to show the tool node title instead of its ID

DX:
- Removed `build` from `.prettierignore` to re-enable formatting for builder components

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Use SDM block in a graph; verify it works
  - [x] Use SDM block with agent executor block as tool; verify it works
  - Tests for `parse_execution_output` pass (checked by CI)

d674cb80e2 | refactor(backend): Improve Block Error Handling (#11366)

We need a way to differentiate between serious errors that cause on-call alerts and block errors. This PR addresses this need by ensuring all errors that occur during execution of a block are of subtype BlockError.

### Changes 🏗️
- Introduced BlockError and its subtypes
- Updated current errors that are emitted by blocks to use BlockError
- Updated the executor manager to convert errors emitted while running a block that are not of type BlockError into BlockUnknownError

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Checked tests still work
  - [x] Ensured block error message is readable and useful

d6ee402483 | feat(platform): Add execution analytics admin endpoint with feature flag bypass (#11327)

This PR adds a comprehensive execution analytics admin endpoint that generates AI-powered activity summaries and correctness scores for graph executions, with proper feature flag bypass for admin use.

### Changes 🏗️
**Backend Changes:**
- Added admin endpoint: `/api/executions/admin/execution_analytics`
- Implemented feature flag bypass with `skip_feature_flag=True` parameter for admin operations
- Fixed async database client usage (`get_db_async_client`) to resolve async/await errors
- Added batch processing with configurable size limits to handle large datasets
- Comprehensive error handling and logging for troubleshooting
- Renamed entire feature from "Activity Backfill" to "Execution Analytics" for clarity

**Frontend Changes:**
- Created clean admin UI for execution analytics generation at `/admin/execution-analytics`
- Built form with graph ID input, model selection dropdown, and optional filters
- Implemented results table with status badges and detailed execution information
- Added CSV export functionality for analytics results
- Integrated with generated TypeScript API client for proper authentication
- Added proper error handling with toast notifications and loading states

**Database & API:**
- Fixed critical async/await issue by switching from sync to async database client
- Updated router configuration and endpoint naming for consistency
- Generated proper TypeScript types and API client integration
- Applied feature flag filtering at API level while bypassing for admin operations

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:

**Test Plan:**
- [x] Admin can access execution analytics page at `/admin/execution-analytics`
- [x] Form validation works correctly (requires graph ID, validates inputs)
- [x] API endpoint `/api/executions/admin/execution_analytics` responds correctly
- [x] Authentication works properly through generated API client
- [x] Analytics generation works with different LLM models (gpt-4o-mini, gpt-4o, etc.)
- [x] Results display correctly with appropriate status badges (success/failed/skipped)
- [x] CSV export functionality downloads correct data
- [x] Error handling displays appropriate toast messages
- [x] Feature flag bypass works for admin users (generates analytics regardless of user flags)
- [x] Batch processing handles multiple executions correctly
- [x] Loading states show proper feedback during processing

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my changes
- [x] No configuration changes required for this feature

**Related to:** PR #11325 (base correctness score functionality)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Zamil Majdy <majdyz@users.noreply.github.com>

18bb78d93e | feat(platform): WebSocket-based notifications (#11297)

This enables real-time notifications from backend to browser via WebSocket, using a Redis bus to move notifications from the REST process to the WebSocket process. This is needed for (follow-up) backend-completion of onboarding tasks with instant notifications.

### Changes 🏗️
- Add new `AsyncRedisNotificationEventBus` to enable publishing notifications to the Redis event bus (see the sketch after this entry)
- Consume notifications in `ws_api.py` similarly to execution events and send them via WebSocket
- Store WebSocket user connections in `ConnectionManager`
- Add relevant tests and types

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Notifications are sent to the frontend
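
The REST-to-WebSocket bridge described above amounts to Redis pub/sub; a minimal sketch (the channel name, payload shape, and connection registry are assumptions):

```python
import json
import redis.asyncio as redis

CHANNEL = "user_notifications"  # assumed channel name

async def publish_notification(r: redis.Redis, user_id: str, event: dict) -> None:
    """REST process: push a notification onto the Redis bus."""
    await r.publish(CHANNEL, json.dumps({"user_id": user_id, **event}))

async def forward_notifications(r: redis.Redis, connections: dict) -> None:
    """WebSocket process: relay bus messages to the matching user's socket."""
    pubsub = r.pubsub()
    await pubsub.subscribe(CHANNEL)
    async for message in pubsub.listen():
        if message["type"] != "message":
            continue
        payload = json.loads(message["data"])
        ws = connections.get(payload["user_id"])  # e.g. from ConnectionManager
        if ws is not None:
            await ws.send_text(json.dumps(payload))
```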

8058b9487b | feat(platform): Chat UI refinements with simplified tool status indicators (#11337)

e68896a25a | feat(backend): allow regex on CORS allowed origins (#11336)

## Changes 🏗️
Allow dynamic URLs in the CORS config, matched via regex. This helps because we currently have isolated front-end preview deployments (nice, since they don't pollute or override other domains) like:
```
https://autogpt-git-{branch_name}-{commit}-significant-gravitas.vercel.app
```
The front-end builds and works there, but as soon as you log in, any API requests to endpoints that need auth will fail due to CORS, since our current CORS config does not support dynamically generated domains.

### Changes
After these changes we can specify dynamic domains to be allowed under CORS. I also disabled `localhost` when the API is in production, for safety.

### Before
```yml
cors:
  allowOrigin: "https://dev-builder.agpt.co" # could only specify full URL strings, not dynamic ones
```

### After
```yml
cors:
  allowOrigins:
    - "https://dev-builder.agpt.co"
    - "regex:https://autogpt-git-[a-z0-9-]+\\.vercel\\.app" # dynamic domains supported via regex
```

### Files
- Add `build_cors_params` utility to parse literal/regex origins and block localhost in production (`backend/server/utils/cors.py`; sketched after this entry)
- Apply the helper in both `AgentServer` and `WebsocketServer` so CORS logic and validations remain consistent
- Add reusable `override_config` testing helper and update existing WebSocket tests to cover the shared CORS behavior
- Introduce targeted unit tests for the new CORS helper (`backend/server/utils/cors_test.py`)

## Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] We will know once we made the origin config changes on infra and test with this...
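
A helper of the shape described above might split literal and regex origins like this (a sketch under assumptions; the real `build_cors_params` may differ). Starlette's `CORSMiddleware` accepts both a literal `allow_origins` list and a single `allow_origin_regex`:

```python
REGEX_PREFIX = "regex:"

def build_cors_params(allow_origins: list[str], is_production: bool) -> dict:
    """Split configured origins into literals and a combined origin regex."""
    literals: list[str] = []
    patterns: list[str] = []
    for origin in allow_origins:
        if origin.startswith(REGEX_PREFIX):
            patterns.append(origin[len(REGEX_PREFIX):])
        elif is_production and "localhost" in origin:
            continue  # block localhost origins in production
        else:
            literals.append(origin)
    params: dict = {"allow_origins": literals}
    if patterns:
        # CORSMiddleware takes one regex, so OR the configured patterns together
        params["allow_origin_regex"] = "|".join(f"(?:{p})" for p in patterns)
    return params
```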

5559d978d7 | fix(platform): chat duplicate messages (#11332)

dcecb17bd1 | feat(backend): Remove deprecated LLM models and add migration script (#11331)

These models have become deprecated:
- deepseek-r1-distill-llama-70b
- gemma2-9b-it
- llama3-70b-8192
- llama3-8b-8192
- google/gemini-flash-1.5

I have removed them and set up a migration that converts all the old versions of the model to new versions. The model changes will happen like so (see the mapping sketch after this entry):
- llama3-70b-8192 → llama-3.3-70b-versatile
- llama3-8b-8192 → llama-3.1-8b-instant
- google/gemini-flash-1.5 → google/gemini-2.5-flash
- deepseek-r1-distill-llama-70b → gpt-5-chat-latest
- gemma2-9b-it → gpt-5-chat-latest

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Check to see if old models were removed
  - [x] Check to see if migration worked and converted old models to new ones in graphs
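
Conceptually, the remap the migration performs is a lookup table over the list above (an illustrative sketch, not the actual migration script):

```python
# Deprecated model IDs and their replacements, as listed above.
MODEL_MIGRATION_MAP = {
    "llama3-70b-8192": "llama-3.3-70b-versatile",
    "llama3-8b-8192": "llama-3.1-8b-instant",
    "google/gemini-flash-1.5": "google/gemini-2.5-flash",
    "deepseek-r1-distill-llama-70b": "gpt-5-chat-latest",
    "gemma2-9b-it": "gpt-5-chat-latest",
}

def migrate_model_id(model_id: str) -> str:
    """Map deprecated model IDs to their replacements; pass others through."""
    return MODEL_MIGRATION_MAP.get(model_id, model_id)
```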

a056d9e71a | feature(backend): Limit Chat to Auth Users, Limit Agent Runs Per Chat (#11330)

6037f80502 | feat(backend): Add correctness score to execution activity generation (#11325)

## Summary
Add an AI-generated correctness score field to execution activity status generation to provide a quantitative assessment of how well executions achieved their intended purpose.

New page: ![New page screenshot](https://github.com/user-attachments/assets/5cb907cf-5bc7-4b96-8128-8eecccde9960)
Old page: ![Old page screenshot](https://github.com/user-attachments/assets/ece0dfab-1e50-4121-9985-d585f7fcd4d2)

## What Changed
- Added `correctness_score` field (float 0.0-1.0) to `GraphExecutionStats` model
- **REFACTORED**: Removed duplicate `llm_utils.py` and reused existing `AIStructuredResponseGeneratorBlock` logic
- Updated activity status generator to use structured responses instead of plain text
- Modified prompts to include correctness assessment with a 5-tier scoring system:
  - 0.0-0.2: Failure
  - 0.2-0.4: Poor
  - 0.4-0.6: Partial Success
  - 0.6-0.8: Mostly Successful
  - 0.8-1.0: Success
- Updated manager.py to extract and set both activity_status and correctness_score
- Fixed tests to work with existing structured response interface

## Technical Details
- **Code Reuse**: Eliminated duplication by using existing `AIStructuredResponseGeneratorBlock` instead of creating new LLM utilities
- Added JSON validation with retry logic for malformed responses
- Maintained backward compatibility for existing activity status functionality
- Score is clamped to the valid 0.0-1.0 range and validated
- All type errors resolved and linting passes

## Test Plan
- [x] All existing tests pass with refactored structure
- [x] Structured LLM call functionality tested with success and error cases
- [x] Activity status generation tested with various execution scenarios
- [x] Integration tests verify both fields are properly set in execution stats
- [x] No code duplication: reuses existing block logic

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Zamil Majdy <majdyz@users.noreply.github.com>

37b3e4e82e | feat(blocks)!: Update Exa search block to match latest API specification (#11185)

BREAKING CHANGE: Removed deprecated use_auto_prompt field from Input schema. Existing workflows using this field will need to be updated to use the type field set to "auto" instead.
## Summary of Changes 📝
This PR comprehensively updates all Exa search blocks to match the
latest Exa API specification and adds significant new functionality
through the Websets API integration.
### Core API Updates 🔄
- **Migration to Exa SDK**: Replaced manual API calls with the official
`exa_py` AsyncExa SDK across all blocks for better reliability and
maintainability
- **Removed deprecated fields**: Eliminated
`use_auto_prompt`/`useAutoprompt` field (breaking change)
- **Fixed incomplete field definitions**: Corrected `user_location`
field definition
- **Added new input fields**: Added `moderation` and `context` fields
for enhanced content filtering
### Enhanced Content Settings 🛠️
- **Text field improvements**: Support both boolean and advanced object
configurations
- **New content options**:
- Added `livecrawl` settings (never, fallback, always, preferred)
- Added `subpages` support for deeper content retrieval
- Added `extras` settings for links and images
- Added `context` settings for additional contextual information
- **Updated settings**: Enhanced `highlight` and `summary`
configurations with new query and schema options
### Comprehensive Cost Tracking 💰
- Added detailed cost tracking models:
- `CostDollars` for monetary costs
- `CostCredits` for API credit tracking
- `CostDuration` for time-based costs
- New output fields: `request_id`, `resolved_search_type`,
`cost_dollars`
- Improved response handling to conditionally yield fields based on
availability
### New Websets API Integration 🚀
Added eight new specialized blocks for Exa's Websets API:
- **`websets.py`**: Core webset management (create, get, list, delete)
- **`websets_search.py`**: Search operations within websets
- **`websets_items.py`**: Individual item management (add, get, update,
delete)
- **`websets_enrichment.py`**: Data enrichment operations
- **`websets_import_export.py`**: Bulk import/export functionality
- **`websets_monitor.py`**: Monitor and track webset changes
- **`websets_polling.py`**: Poll for updates and changes
### New Special-Purpose Blocks 🎯
- **`code_context.py`**: Code search capabilities for finding relevant
code snippets from open source repositories, documentation, and Stack
Overflow
- **`research.py`**: Asynchronous research capabilities that explore the
web, gather sources, synthesize findings, and return structured results
with citations
### Code Organization Improvements 📁
- **Removed legacy code**: Deleted `model.py` file containing deprecated
API models
- **Centralized helpers**: Consolidated shared models and utilities in
`helpers.py`
- **Improved modularity**: Each webset operation is now in its own
dedicated file
### Other Changes 🔧
- Updated `.gitignore` for better development workflow
- Updated `CLAUDE.md` with project-specific instructions
- Updated documentation in `docs/content/platform/new_blocks.md` with
error handling, data models, and file input guidelines
- Improved webhook block implementations with SDK integration
### Files Changed 📂
- **Modified (11 files)**:
- `.gitignore`
- `autogpt_platform/CLAUDE.md`
- `autogpt_platform/backend/backend/blocks/exa/answers.py`
- `autogpt_platform/backend/backend/blocks/exa/contents.py`
- `autogpt_platform/backend/backend/blocks/exa/helpers.py`
- `autogpt_platform/backend/backend/blocks/exa/search.py`
- `autogpt_platform/backend/backend/blocks/exa/similar.py`
- `autogpt_platform/backend/backend/blocks/exa/webhook_blocks.py`
- `autogpt_platform/backend/backend/blocks/exa/websets.py`
- `docs/content/platform/new_blocks.md`
- **Added (8 files)**:
- `autogpt_platform/backend/backend/blocks/exa/code_context.py`
- `autogpt_platform/backend/backend/blocks/exa/research.py`
- `autogpt_platform/backend/backend/blocks/exa/websets_enrichment.py`
- `autogpt_platform/backend/backend/blocks/exa/websets_import_export.py`
- `autogpt_platform/backend/backend/blocks/exa/websets_items.py`
- `autogpt_platform/backend/backend/blocks/exa/websets_monitor.py`
- `autogpt_platform/backend/backend/blocks/exa/websets_polling.py`
- `autogpt_platform/backend/backend/blocks/exa/websets_search.py`
- **Deleted (1 file)**:
- `autogpt_platform/backend/backend/blocks/exa/model.py`
### Migration Guide 🚦
For users with existing workflows using the deprecated `use_auto_prompt`
field:
1. Remove the `use_auto_prompt` field from your input configuration
2. Set the `type` field to `ExaSearchTypes.AUTO` (or "auto" in JSON) to achieve the same behavior (see the sketch after this list)
3. Review any custom content settings, as the structure has been enhanced
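
To make the migration concrete, a minimal before/after of a block input (the query value and dict shape are illustrative, not the exact schema):

```python
# Before (deprecated field, now removed)
old_input = {"query": "latest AI agent frameworks", "use_auto_prompt": True}

# After: request the same behavior via the type field
new_input = {"query": "latest AI agent frameworks", "type": "auto"}
```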
### Testing Recommendations ✅
- Test existing workflows to ensure they handle the breaking change
- Verify cost tracking fields are properly returned
- Test new content settings options (livecrawl, subpages, extras,
context)
- Validate websets functionality if using the new Websets API blocks
🤖 Generated with [Claude Code](https://claude.com/claude-code)
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] made + ran a test agent for the blocks and flows between them
[Exa
Tests_v44.json](https://github.com/user-attachments/files/23226143/Exa.Tests_v44.json)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Migrates Exa blocks to AsyncExa SDK, adds comprehensive Websets/research/code-context blocks, updates existing search/content/answers/similar, deletes legacy models, adjusts tests/docs; breaking: remove `use_auto_prompt` in favor of `type="auto"`.
>
> - **Backend — Exa integration (SDK migration & BREAKING)**:
>   - Replace manual HTTP calls with `exa_py.AsyncExa` across `search`, `similar`, `contents`, `answers`, and webhooks; richer outputs (citations, context, costs, resolved search type).
>   - BREAKING: remove `Input.use_auto_prompt`; use `type = "auto"`.
>   - Centralize models/utilities in `exa/helpers.py` (content settings, cost models, result mappers).
> - **New Blocks**:
>   - **Websets**: management (`websets.py`), searches, items, enrichments, imports/exports, monitors, polling (new files under `exa/websets_*`).
>   - **Research**: async research task create/get/wait/list (`exa/research.py`).
>   - **Code Context**: code snippet/context retrieval (`exa/code_context.py`).
> - **Removals**:
>   - Delete deprecated `exa/model.py`.
> - **Docs & DX**:
>   - Update `docs/new_blocks.md` (error handling, models, file input) and `CLAUDE.md`; ignore backend logs in `.gitignore`.
> - **Frontend Tests**:
>   - Split/extend “e” block tests and improve block add robustness in Playwright (`build.spec.ts`, `build.page.ts`).
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit</sup> |
||
|
|
de7c5b5c31 | Merge branch 'master' into dev | ||
|
|
d68dceb9c1 |
fix(backend/executor): Improve graph execution permission check (#11323)
- Resolves #11316
- Durable fix to replace #11318

### Changes 🏗️
- Expand graph execution permissions check
- Don't require library membership for execution as sub-graph

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Can run sub-agent with non-latest graph version
  - [x] Can run sub-agent that is available in Marketplace but not added to Library
||
|
|
193866232c |
hotfix(backend): fix rate-limited messages blocking queue by republishing to back (#11326)
## Summary
Fix critical queue blocking issue where rate-limited user messages prevent other users' executions from being processed, causing the 135 late executions reported in production.

## Root Cause Analysis
When a user exceeds `max_concurrent_graph_executions_per_user` (25), the executor uses `basic_nack(requeue=True)`, which sends the message to the **FRONT** of the RabbitMQ queue. This creates an infinite blocking loop where:
1. Rate-limited message goes to front of queue
2. Gets processed, hits rate limit again
3. Goes back to front of queue
4. Blocks all other users' messages indefinitely

## Solution Implementation
### 🔧 Core Changes
- **New setting**: `requeue_by_republishing` (default: `True`) in `backend/util/settings.py`
- **Smart `_ack_message`**: Automatically uses republishing when `requeue=True` and the setting is enabled
- **Efficient implementation**: Uses the existing `self.run_client` connection instead of creating new ones
- **Integration test**: Real RabbitMQ test validates queue ordering behavior

### 🔄 Technical Implementation
**Before (blocking):**
```python
basic_nack(delivery_tag, requeue=True)  # Goes to FRONT of queue ❌
```
**After (non-blocking):**
```python
if requeue and self.config.requeue_by_republishing:
    # First: Republish to BACK of queue
    self.run_client.publish_message(...)
    # Then: Reject without requeue
    basic_nack(delivery_tag, requeue=False)
```

### 📊 Impact
- ✅ **Other users' executions no longer blocked** by rate-limited users
- ✅ **Fair queue processing** - FIFO behavior maintained for all users
- ✅ **Rate limiting still works** - it just doesn't block others
- ✅ **Configurable** - can revert to old behavior with `requeue_by_republishing=False`
- ✅ **Zero performance impact** - uses existing connections

## Test Plan
- **Integration test**: `test_requeue_integration.py` validates real RabbitMQ queue ordering
- **Scenario testing**: Confirms rate-limited messages go to the back of the queue
- **Cross-user validation**: Verifies other users' messages process correctly
- **Setting test**: Confirms configuration loads with correct defaults

## Deployment Strategy
This is a **hotfix** that can be deployed immediately:
- **Backward compatible**: Old behavior available via config
- **Safe default**: New behavior is safer than the current state
- **No breaking changes**: All existing functionality preserved
- **Immediate relief**: Resolves production queue blocking

## Files Modified
- `backend/executor/manager.py`: Enhanced `_ack_message` logic and `_requeue_message_to_back` method
- `backend/util/settings.py`: Added `requeue_by_republishing` configuration field
- `test_requeue_integration.py`: Integration test for queue ordering validation

## Related Issues
Fixes the 135 late executions issue where messages were stuck in QUEUED state despite available executor capacity (583m/600m utilization).

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
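A minimal sketch of the republish-then-reject pattern using pika's blocking API; the queue name is illustrative, and the actual implementation reuses the executor's existing `run_client` publish helper rather than a raw channel:

```python
from pika.adapters.blocking_connection import BlockingChannel
from pika.spec import Basic

def requeue_to_back(channel: BlockingChannel, method: Basic.Deliver, body: bytes) -> None:
    # Publishing a fresh copy appends the message to the TAIL of the queue...
    channel.basic_publish(exchange="", routing_key="graph_executions", body=body)
    # ...then reject the original WITHOUT requeue, so it is not pushed back
    # to the head of the queue where it would block other users' messages.
    channel.basic_nack(delivery_tag=method.delivery_tag, requeue=False)
```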
||
|
|
2f87e13d17 |
feat(platform): Chat system backend (#11230)
Implements foundational backend infrastructure for a chat-based agent interaction system. Users will be able to discover, configure, and run marketplace agents through conversational AI.

**Note:** Chat routes are behind a feature flag

### Changes 🏗️
**Core Chat System:**
- Chat service with LLM orchestration (Claude 3.5 Sonnet, Haiku, GPT-4)
- REST API routes for sessions and messages
- Database layer for chat persistence
- System prompts and configuration

**5 Conversational Tools:**
1. `find_agent` - Search marketplace by keywords
2. `get_agent_details` - Fetch agent info, inputs, credentials
3. `get_required_setup_info` - Check user readiness, missing credentials
4. `run_agent` - Execute agents immediately
5. `setup_agent` - Configure scheduled execution with cron

**Testing:**
- 28 tests across chat tools (23 passing, 5 skipped for scheduler)
- Test fixtures for simple, LLM, and Firecrawl agents
- Service and data layer tests

**Bug Fixes:**
- Fixed `setup_agent.py` to create schedules instead of immediate execution
- Fixed graph lookup to use UUID instead of username/slug
- Fixed credential matching by provider/type instead of ID
- Fixed internal tool calls to use `._execute()` instead of `.execute()`

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All 28 chat tool tests pass (23 pass, 5 skip - require scheduler)
  - [x] Code formatting and linting pass
  - [x] Tool execution flow validated through unit tests
  - [x] Agent discovery, details, and execution tested
  - [x] Credential parsing and matching tested

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my changes
- [x] I have included a list of my configuration changes in the PR description (under **Changes**)

No configuration changes required - all existing settings compatible.
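A hedged sketch of how the five tools might be declared for the LLM in the function-calling style; the tool names come from the PR, but the schemas and structure here are assumptions, not the service's actual definitions:

```python
# Hypothetical tool declarations; the real schemas live in the chat service.
CHAT_TOOLS = [
    {
        "name": "find_agent",
        "description": "Search marketplace agents by keywords.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "run_agent",
        "description": "Execute a configured agent immediately.",
        "input_schema": {
            "type": "object",
            "properties": {
                "graph_id": {"type": "string"},  # UUID, per the graph-lookup fix
                "inputs": {"type": "object"},
            },
            "required": ["graph_id"],
        },
    },
]
```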
||
|
|
910fd2640d |
hotfix(backend): Temporarily disable library existence check for graph execution (#11318)
### Changes 🏗️
`add_store_agent_to_library` does not add subagents to the user library, so this check can cause issues.

### Checklist 📋
#### For code changes:
- [ ] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  - [ ] ...

<details>
<summary>Example test plan</summary>

- [ ] Create from scratch and execute an agent with at least 3 blocks
- [ ] Import an agent from file upload, and confirm it executes correctly
- [ ] Upload agent to marketplace
- [ ] Import an agent from marketplace and confirm it executes correctly
- [ ] Edit an agent from monitor, and confirm it executes correctly
</details>

#### For configuration changes:
- [ ] `.env.default` is updated or already compatible with my changes
- [ ] `docker-compose.yml` is updated or already compatible with my changes
- [ ] I have included a list of my configuration changes in the PR description (under **Changes**)

<details>
<summary>Examples of configuration changes</summary>

- Changing ports
- Adding new services that need to communicate with each other
- Secrets or environment variable changes
- New or infrastructure changes such as databases
</details>
||
|
|
f97e19f418 |
hotfix: Patch onboarding (#11299)
### Changes 🏗️
- Prevent removing progress of user onboarding tasks by merging arrays on the backend instead of replacing them
- New endpoint for onboarding reset

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Tasks are not being reset
  - [x] `/onboarding/reset` works
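A minimal sketch of the merge-instead-of-replace idea, assuming completed tasks are stored as an array of task IDs (function and field names are illustrative):

```python
def merge_completed_tasks(existing: list[str], incoming: list[str]) -> list[str]:
    # Union the two lists, preserving first-seen order: a stale or partial
    # client payload can add new tasks but can never erase tasks the user
    # already completed, which is what a plain replace allowed.
    return list(dict.fromkeys([*existing, *incoming]))

assert merge_completed_tasks(["WELCOME"], ["FIRST_RUN"]) == ["WELCOME", "FIRST_RUN"]
```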
||
|
|
42b9facd4a |
hotfix(backend/scheduler): Bump apscheduler to DST-fixed version 3.11.1 (#11294)
- #11273
- Bump `apscheduler` to v3.11.1, which contains a fix for the issue

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] "It's a rather ugly solution but the test proves that it works." ~the maintainer
  - [x] CI passes
||
|
|
834617d221 | hotfix(backend): Clarify prompt requirements for list generation for our friend claude (#11293) | ||
|
|
e6fb649ced | Merge 'master' into 'dev' | ||
|
|
2f8cdf62ba |
feat(backend): Standardize error handling with BlockSchemaInput & BlockSchemaOutput base class (#11257)
This PR addresses the need for consistent error handling across all blocks in the AutoGPT platform. Previously, each block had to manually define an `error` field in its output schema, leading to code duplication and potential inconsistencies. Some blocks might forget to include the error field, making error handling unpredictable.

### Changes 🏗️
- **Created `BlockSchemaOutput` base class**: New base class that extends `BlockSchema` with a standardized `error` field
- **Created `BlockSchemaInput` base class**: Added for consistency and future extensibility
- **Updated 140+ block implementations**: Changed all block `Output` classes from `class Output(BlockSchema):` to `class Output(BlockSchemaOutput):`
- **Removed manual error field definitions**: Eliminated hundreds of duplicate `error: str = SchemaField(...)` definitions
- **Updated type annotations**: Changed `Block[BlockSchema, BlockSchema]` to `Block[BlockSchemaInput, BlockSchemaOutput]` throughout the codebase
- **Fixed imports**: Added `BlockSchemaInput` and `BlockSchemaOutput` imports to all relevant files
- **Maintained backward compatibility**: Updated `EmptySchema` to inherit from `BlockSchemaOutput`

**Key Benefits:**
- Consistent error handling across all blocks
- Reduced code duplication (removed ~200 lines of repetitive error field definitions)
- Type safety improvements with distinct input/output schema types
- Blocks can still override the error field with more specific descriptions when needed

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified `poetry run format` passes (all linting, formatting, and type checking)
  - [x] Tested block instantiation works correctly (MediaDurationBlock, UnrealTextToSpeechBlock)
  - [x] Confirmed error fields are automatically present in all updated blocks
  - [x] Verified the block loading system works (successfully loads 353+ blocks)
  - [x] Tested backward compatibility with EmptySchema
  - [x] Confirmed blocks can still override the error field with custom descriptions
  - [x] Validated core schema inheritance chain works correctly

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my changes
- [x] I have included a list of my configuration changes in the PR description (under **Changes**)

*Note: No configuration changes were needed for this refactoring.*

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Ubbe <hi@ubbe.dev>
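A hedged sketch of the inheritance pattern described above; the import paths and the exact `SchemaField` arguments are assumptions, not the repository's verbatim definitions:

```python
from backend.data.block import BlockSchema  # import paths assumed
from backend.data.model import SchemaField

class BlockSchemaInput(BlockSchema):
    """Base for input schemas; exists for symmetry and future shared fields."""

class BlockSchemaOutput(BlockSchema):
    """Base for output schemas; every block inherits a standard error field."""
    error: str = SchemaField(description="Error message if the block execution failed.")

# A block now declares only its own outputs; `error` comes from the base class
# but can still be overridden with a more specific description:
class Output(BlockSchemaOutput):
    result: str = SchemaField(description="Primary result of the block.")
```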
||
|
|
3dc5208f71 |
feat(backend): Increase max_field_size in aiohttp requests (#11261)
### Changes 🏗️
- Increased `max_field_size` in `aiohttp.ClientSession` to 16KB to handle servers with large headers (e.g., long CSP headers).

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Added a unit test that checks it can now parse headers over 8k in size

---------

Co-authored-by: seer-by-sentry[bot] <157164994+seer-by-sentry[bot]@users.noreply.github.com>
Co-authored-by: Swifty <craigswift13@gmail.com>
Co-authored-by: Ubbe <hi@ubbe.dev>
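A minimal sketch of the setting in isolation, assuming a recent aiohttp release where `max_field_size` is accepted by `ClientSession`:

```python
import aiohttp

async def fetch_text(url: str) -> str:
    # aiohttp's default max_field_size is 8190 bytes; a single oversized
    # header (e.g. a very long Content-Security-Policy) aborts the response
    # parse. Raising the limit to 16KB accepts such servers.
    async with aiohttp.ClientSession(max_field_size=16 * 1024) as session:
        async with session.get(url) as resp:
            return await resp.text()
```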
||
|
|
4140331731 |
fix(blocks/llm): Validate LLM summary responses are strings (#11275)
### Changes 🏗️
- Added validation to ensure that the `summary` and `final_summary` returned by the LLM are strings.
- Raises a `ValueError` if the LLM returns a list or other non-string type, providing a descriptive error message to aid debugging.

Fixes [AUTOGPT-SERVER-6M4](https://sentry.io/organizations/significant-gravitas/issues/6978480131/). The issue was that the LLM returned a list of strings instead of a single string summary, causing `_combine_summaries` to fail on `join`.

This fix was generated by Seer in Sentry, triggered by Craig Swift.
👁️ Run ID: 2230933

Not quite right? [Click here to continue debugging with Seer.](https://sentry.io/organizations/significant-gravitas/issues/6978480131/?seerDrawer=true)

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Added a unit test to verify that a ValueError is raised when the LLM returns a list instead of a string for summary or final_summary.

---------

Co-authored-by: seer-by-sentry[bot] <157164994+seer-by-sentry[bot]@users.noreply.github.com>
Co-authored-by: Swifty <craigswift13@gmail.com>
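A minimal sketch of the guard described above; the function name and message wording are illustrative:

```python
def ensure_string_summary(value: object, field_name: str = "summary") -> str:
    # Fail fast with a descriptive error instead of letting a later
    # "\n\n".join(...) blow up on a list or dict returned by the LLM.
    if not isinstance(value, str):
        raise ValueError(
            f"LLM returned {type(value).__name__} for '{field_name}', "
            f"expected a string: {value!r}"
        )
    return value
```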
||
|
|
594b1adcf7 |
fix(frontend): Fix marketplace sort by (#11284)
Marketplace sort-by functionality was not working on the frontend. This PR fixes it.

### Changes 🏗️
- Add type hints for sort by
- Fix marketplace sort-by dropdowns

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Tested locally
||
|
|
a1ac109356 |
fix(backend): Further enhance sanitization of SQL raw queries (#11279)
### Changes 🏗️
Enhanced SQL query security in the store search functionality by implementing proper parameterization to prevent SQL injection vulnerabilities.

**Security Improvements:**
- Replaced string interpolation with PostgreSQL positional parameters (`$1`, `$2`, etc.) for all user inputs
- Added ORDER BY whitelist validation to prevent injection via the `sorted_by` parameter
- Parameterized search term, creators array, category, and pagination values
- Fixed variable naming conflict (`sql_where_clause` vs `where_clause`)

**Testing:**
- Added 4 comprehensive tests validating SQL injection prevention across different attack vectors
- Tests verify that malicious input in search queries, filters, sorting, and categories is safely handled
- All 10 tests in db_test.py pass successfully

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All existing tests pass (10/10 tests passing)
  - [x] New security tests validate SQL injection prevention
  - [x] Verified parameterized queries handle malicious input safely
  - [x] Code formatting passes (`poetry run format`)

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my changes
- [x] I have included a list of my configuration changes in the PR description (under **Changes**)

*Note: No configuration changes required for this security fix.*
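A hedged sketch combining the two techniques named above: positional parameters for values, a whitelist for the ORDER BY column (identifiers cannot be bound as parameters). Table and column names are illustrative, not the store's actual schema:

```python
ALLOWED_ORDER_BY = {"createdAt", "updatedAt", "runs", "rating"}  # illustrative

def build_store_search(search_term: str, category: str, sorted_by: str, limit: int):
    # Identifiers like the ORDER BY column cannot be parameterized, so they
    # are validated against a whitelist; all user-supplied VALUES are bound
    # as PostgreSQL positional parameters instead of interpolated.
    if sorted_by not in ALLOWED_ORDER_BY:
        raise ValueError(f"Unsupported sort column: {sorted_by!r}")
    sql = (
        'SELECT * FROM {schema_prefix}"StoreAgent" '
        "WHERE name ILIKE $1 AND $2 = ANY(categories) "
        f'ORDER BY "{sorted_by}" DESC LIMIT $3'
    )
    return sql, [f"%{search_term}%", category, limit]
```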
||
|
|
5506d59da1 |
fix(backend/executor): make graph execution permission check version-agnostic (#11283)
## Summary
Fix critical issue where pre-execution permission validation broke execution of graphs that reference older versions of sub-graphs.

## Problem
The `validate_graph_execution_permissions` function was checking for the specific version of a graph in the user's library. This caused failures when:
1. A parent graph references an older version of a sub-graph
2. The user updates the sub-graph to a newer version
3. The older version is no longer in their library
4. Execution of the parent graph fails with `GraphNotInLibraryError`

## Root Cause
In `backend/executor/utils.py` line 523, the function was checking for the exact version, but sub-graphs legitimately reference older versions that may no longer be in the library.

## Solution
### 1. Remove Version-Specific Check (backend/executor/utils.py)
- Remove `graph_version=graph.version` parameter from validation call
- Add explanatory comment about version-agnostic behavior
- Now only checks that the graph ID exists in user's library (any version)

### 2. Enhance Documentation (backend/data/graph.py)
- Update function docstring to explain version-agnostic behavior
- Document that `None` (now default) allows execution of any version
- Clarify this is important for sub-graph version compatibility

## Technical Details
The `validate_graph_execution_permissions` function was already designed to handle version-agnostic checks when `graph_version=None`. By omitting the version parameter, we skip the version check and only verify:
- Graph exists in user's library
- Graph is not deleted/archived
- User has execution permissions

## Impact
- ✅ Parent graphs can execute even when they reference older sub-graph versions
- ✅ Sub-graph updates don't break existing parent graphs
- ✅ Maintains security: still checks library membership and permissions
- ✅ No breaking changes: version-specific validation still available when needed

## Example Scenario Fixed
1. User creates parent graph that uses sub-graph v1
2. User updates sub-graph to v2 (v1 removed from library)
3. Parent graph still references sub-graph v1
4. **Before**: Execution fails with `GraphNotInLibraryError`
5. **After**: Execution succeeds (version-agnostic permission check)

## Testing
- [x] Code formatting and linting passes
- [x] Type checking passes
- [x] No breaking changes to existing functionality
- [x] Security still maintained through library membership checks

## Files Changed
- `backend/executor/utils.py`: Remove version-specific permission check
- `backend/data/graph.py`: Enhanced documentation for version-agnostic behavior

Closes #[issue-number-if-applicable]

Co-authored-by: Claude <noreply@anthropic.com>
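A sketch of the version-agnostic call described above; the parameter names follow the PR text, but the exact signature of `validate_graph_execution_permissions` is an assumption:

```python
async def check_can_execute(user_id: str, graph) -> None:
    # Omitting graph_version (defaults to None) makes the check
    # version-agnostic: only library membership, deletion state, and
    # execution permissions are verified, so a parent graph can still
    # run a sub-graph pinned to an older version.
    await validate_graph_execution_permissions(user_id=user_id, graph_id=graph.id)
```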
||
|
|
4922f88851 |
feat(backend/executor): Implement cascading stop for nested graph executions (#11277)
## Summary
Fixes critical issue where child executions spawned by `AgentExecutorBlock` continue running after the parent execution is stopped. Implements parent-child execution tracking and recursive cascading stop logic to ensure entire execution trees are terminated together.

## Background
When a parent graph execution containing `AgentExecutorBlock` nodes is stopped, only the parent was terminated. Child executions continued running, leading to:
- ❌ Orphaned child executions consuming credits
- ❌ No user control over execution trees
- ❌ Race conditions where children start after parent stops
- ❌ Resource leaks from abandoned executions

## Core Changes
### 1. Database Schema (`schema.prisma` + migration)
```sql
-- Add nullable parent tracking field
ALTER TABLE "AgentGraphExecution" ADD COLUMN "parentGraphExecutionId" TEXT;

-- Add self-referential foreign key with graceful deletion
ALTER TABLE "AgentGraphExecution"
ADD CONSTRAINT "AgentGraphExecution_parentGraphExecutionId_fkey"
FOREIGN KEY ("parentGraphExecutionId") REFERENCES "AgentGraphExecution"("id")
ON DELETE SET NULL ON UPDATE CASCADE;

-- Add index for efficient child queries
CREATE INDEX "AgentGraphExecution_parentGraphExecutionId_idx"
ON "AgentGraphExecution"("parentGraphExecutionId");
```

### 2. Parent ID Propagation (`backend/blocks/agent.py`)
```python
# Extract current graph execution ID and pass as parent to child
execution = add_graph_execution(
    # ... other params
    parent_graph_exec_id=graph_exec_id,  # NEW: Track parent relationship
)
```

### 3. Data Layer (`backend/data/execution.py`)
```python
async def get_child_graph_executions(parent_exec_id: str) -> list[GraphExecution]:
    """Get all child executions of a parent execution."""
    children = await AgentGraphExecution.prisma().find_many(
        where={"parentGraphExecutionId": parent_exec_id, "isDeleted": False}
    )
    return [GraphExecution.from_db(child) for child in children]
```

### 4. Cascading Stop Logic (`backend/executor/utils.py`)
```python
async def stop_graph_execution(
    user_id: str,
    graph_exec_id: str,
    wait_timeout: float = 15.0,
    cascade: bool = True,  # NEW parameter
):
    # 1. Find all child executions
    if cascade:
        children = await _get_child_executions(graph_exec_id)
        # 2. Stop all children recursively in parallel
        if children:
            await asyncio.gather(
                *[
                    stop_graph_execution(user_id, child.id, wait_timeout, True)
                    for child in children
                ],
                return_exceptions=True,  # Don't fail parent if child fails
            )
    # 3. Stop the parent execution
    # ... existing stop logic
```

### 5. Race Condition Prevention (`backend/executor/manager.py`)
```python
# Before executing queued child, check if parent was terminated
if parent_graph_exec_id:
    parent_exec = get_db_client().get_graph_execution_meta(
        parent_graph_exec_id, user_id
    )
    if parent_exec and parent_exec.status == ExecutionStatus.TERMINATED:
        # Skip execution, mark child as terminated
        get_db_client().update_graph_execution_stats(
            graph_exec_id=graph_exec_id,
            status=ExecutionStatus.TERMINATED,
        )
        return  # Don't start orphaned child
```

## How It Works
### Before (Broken)
```
User stops parent execution
↓ Parent terminates ✓
↓ Child executions keep running ✗
↓ User cannot stop children ✗
```

### After (Fixed)
```
User stops parent execution
↓ Query database for all children
↓ Recursively stop all children in parallel
↓ Wait for children to terminate
↓ Stop parent execution
↓ All executions in tree stopped ✓
```

### Race Prevention
```
Child in QUEUED status
↓ Parent stopped
↓ Child picked up by executor
↓ Pre-flight check: parent TERMINATED?
↓ Yes → Skip execution, mark child TERMINATED
↓ Child never runs ✓
```

## Edge Cases Handled
- ✅ **Deep nesting** - Recursive cascading handles multi-level trees
- ✅ **Queued children** - Pre-flight check prevents execution
- ✅ **Race conditions** - Child spawned during stop operation
- ✅ **Partial failures** - `return_exceptions=True` continues on error
- ✅ **Multiple children** - Parallel stop via `asyncio.gather()`
- ✅ **No parent** - Backward compatible (nullable field)
- ✅ **Already completed** - Existing status check handles it

## Performance Impact
- **Stop operation**: O(depth) with parallel execution vs O(1) before
- **Memory**: +36 bytes per execution (one UUID reference)
- **Database**: +1 query per tree level, indexed for efficiency

## API Changes (Backward Compatible)
### `stop_graph_execution()` - New Optional Parameter
```python
# Before
async def stop_graph_execution(user_id: str, graph_exec_id: str, wait_timeout: float = 15.0)

# After
async def stop_graph_execution(user_id: str, graph_exec_id: str, wait_timeout: float = 15.0, cascade: bool = True)
```
**Default `cascade=True`** means existing callers get the new behavior automatically.

### `add_graph_execution()` - New Optional Parameter
```python
async def add_graph_execution(..., parent_graph_exec_id: Optional[str] = None)
```

## Security & Safety
- ✅ **User verification** - Users can only stop their own executions (parent + children)
- ✅ **No cycles** - Self-referential FK prevents infinite loops
- ✅ **Graceful degradation** - Errors in child stops don't block parent stop
- ✅ **Rate limits** - Existing execution rate limits still apply

## Testing Checklist
### Database Migration
- [x] Migration runs successfully
- [x] Prisma client regenerates without errors
- [x] Existing tests pass

### Core Functionality
- [ ] Manual test: Stop parent with running child → child stops
- [ ] Manual test: Stop parent with queued child → child never starts
- [ ] Unit test: Cascading stop with multiple children
- [ ] Unit test: Deep nesting (3+ levels)
- [ ] Integration test: Race condition prevention

## Breaking Changes
**None** - All changes are backward compatible with existing code.

## Rollback Plan
If issues arise:
1. **Code rollback**: Revert PR, redeploy
2. **Database rollback**: Drop column and constraints (non-destructive)

---

**Note**: This branch contains additional unrelated changes from merging with `dev`. The core cascading stop feature involves only:
- `schema.prisma` + migration
- `backend/data/execution.py`
- `backend/executor/utils.py`
- `backend/blocks/agent.py`
- `backend/executor/manager.py`

All other file changes are from dev branch updates and not part of this feature.

🤖 Generated with [Claude Code](https://claude.ai/code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
* **New Features**
  * Nested graph executions: parent-child tracking and retrieval of child executions
* **Improvements**
  * Cascading stop: stopping a parent optionally terminates child executions
  * Parent execution IDs propagated through runs and surfaced in logs
  * Per-user/graph concurrent execution limits enforced
* **Bug Fixes**
  * Skip enqueuing children if parent is terminated; robust handling when parent-status checks fail
* **Tests**
  * Updated tests to cover parent linkage in graph creation
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude <noreply@anthropic.com>