Compare commits

...

61 Commits

Author SHA1 Message Date
Swifty
2e36eab6d9 add run block tool 2026-01-15 13:01:11 +01:00
Ubbe
375d33cca9 fix(frontend): agent credentials improvements (#11763)
## Changes 🏗️

### System credentials in Run Modal

We had the issue that "system" credentials were mixed with "user"
credentials in the run agent modal:

#### Before

<img width="400" height="466" alt="Screenshot 2026-01-14 at 19 05 56"
src="https://github.com/user-attachments/assets/9d1ee766-5004-491f-ae14-a0cf89a9118e"
/>

This created confusion among the users. This "system" credentials are
supplied by AutoGPT ( _most of the time_ ) and a user running an agent
should not bother with them ( _unless they want to change them_ ). For
example in this case, the credential that matters is the **Google** one
🙇🏽

### After

<img width="400" height="350" alt="Screenshot 2026-01-14 at 19 04 12"
src="https://github.com/user-attachments/assets/e2bbc015-ce4c-496c-a76f-293c01a11c6f"
/>

<img width="400" height="672" alt="Screenshot 2026-01-14 at 19 04 19"
src="https://github.com/user-attachments/assets/d704dae2-ecb2-4306-bd04-3d812fed4401"
/>

"System" credentials are collapsed by default, reducing noise in the
Task Credentials section. The user can still see and change them by
expanding the accordion.

<img width="400" height="190" alt="Screenshot 2026-01-14 at 19 04 27"
src="https://github.com/user-attachments/assets/edc69612-4588-48e4-981a-f59c26cfa390"
/>

If some "system" credentials are missing, there is a red label
indicating so, it wasn't that obvious with the previous implementation,

<img width="400" height="309" alt="Screenshot 2026-01-14 at 19 04 30"
src="https://github.com/user-attachments/assets/f27081c7-40ad-4757-97b3-f29636616fc2"
/>

### New endpoint

There is a new REST endpoint, `GET /providers/system`, to list system
credential providers so it is easy to access in the Front-end to group
them together vs user ones.

### Other improvements

#### `<CredentialsInput />` refinements

<img width="715" height="200" alt="Screenshot 2026-01-14 at 19 09 31"
src="https://github.com/user-attachments/assets/01b39b16-25f3-428d-a6c8-da608038a38b"
/>

Use a normal browser `<select>` for the Credentials Dropdown ( _when you
have more than 1 for a provider_ ). This simplifies the UI shennagians a
lot and provides a better UX in 📱 ( _eventually we should move all our
selects to the native ones as they are much better for mobile and touch
screens and less code to maintain our end_ ).

I also renamed some files for clarity and tidied up some of the existing
logic.

#### Other

- Fix **Open telemetry** warnings on the server console by making the
packages external
- Fix `require-in-the-middle` console warnings
- Prettier tidy ups

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and test the above
2026-01-15 17:44:44 +07:00
Swifty
3b1b2fe30c feat(backend): Extract backend copilot/chat enhancements from hackathon (#11719)
This PR extracts backend changes from the hackathon/copilot branch,
adding enhanced chat capabilities, agent management tools, store
embeddings, and hybrid search functionality.

### Changes 🏗️

**Chat Features:**
- Added chat database layer (`db.py`) for conversation and message
persistence
- Extended chat models with new types and response structures
- New onboarding system prompt for guided user experiences
- Enhanced chat routes with additional endpoints
- Expanded chat service with more capabilities

**Chat Agent Tools:**
- `agent_output.py` - Handle agent execution outputs
- `create_agent.py` - Tool for creating new agents via chat
- `edit_agent.py` - Tool for modifying existing agents
- `find_library_agent.py` - Search and discover library agents
- Enhanced `run_agent.py` with additional functionality
- New `models.py` for shared tool types

**Store Enhancements:**
- `embeddings.py` - Vector embeddings support for semantic search
- `hybrid_search.py` - Combined keyword and semantic search
- `backfill_embeddings.py` - Utility for backfilling existing data
- Updated store database operations

**Admin:**
- Enhanced store admin routes

**Data Layer:**
- New `understanding.py` module for agent understanding/context

**Database Migrations:**
- `add_chat_tables` - Chat conversation and message tables
- `add_store_embeddings` - Embeddings storage for store items
- `enhance_search` - Search index improvements

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Chat endpoints respond correctly
  - [x] Agent tools (create/edit/find/run) function properly
  - [x] Store embeddings and hybrid search work
  - [x] Database migrations apply cleanly

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

---------

Co-authored-by: Torantulino <40276179@live.napier.ac.uk>
2026-01-15 11:11:36 +01:00
Abhimanyu Yadav
af63b3678e feat(frontend): hide children of connected array and object fields
(#11770)

### Changes 🏗️

- Added conditional rendering for array and object field children based
on connection status
- Implemented `shouldShowChildren` logic in `ArrayFieldTemplate` and
`ObjectFieldTemplate` components
- Modified the `shouldShowChildren` condition in `FieldTemplate` to
handle different schema types
- Imported and utilized `cleanUpHandleId` and `useEdgeStore` to check if
inputs are connected
- Added connection status checks to hide form fields when their inputs
are connected to other nodes

![Screenshot 2026-01-15 at
12.55.32 PM.png](https://app.graphite.com/user-attachments/assets/d3fffade-872e-4fd8-a347-28d1bae3072e.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified that object and array fields hide their children when
connected to other nodes
- [x] Confirmed that unconnected fields display their children properly
- [x] Tested with various schema types to ensure correct rendering
behavior
- [x] Checked that the connection status is properly detected and
applied
2026-01-15 08:10:52 +00:00
Abhimanyu Yadav
631f1bd50a feat(frontend): add interactive tutorial for the new builder interface (#11458)
### Changes 🏗️

This PR adds a comprehensive interactive tutorial for the new Builder UI
to help users learn how to create agents. Key changes include:

- Added a tutorial button to the canvas controls that launches a
step-by-step guide
- Created a Shepherd.js-based tutorial with multiple steps covering:
    - Adding blocks from the Block Menu
    - Understanding input and output handles
    - Configuring block values
    - Connecting blocks together
    - Saving and running agents
- Added data-id attributes to key UI elements for tutorial targeting
- Implemented tutorial state management with a new tutorialStore
- Added helper functions for tutorial navigation and block manipulation
- Created CSS styles for tutorial tooltips and highlights
- Integrated with the Run Input dialog to support tutorial flow
- Added prefetching of tutorial blocks for better performance


https://github.com/user-attachments/assets/3db964b3-855c-4fcc-aa5f-6cd74ab33d7d


### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
    - [x] Complete the tutorial from start to finish
    - [x] Test tutorial on different screen sizes
    - [x] Verify all tutorial steps work correctly
    - [x] Ensure tutorial can be canceled and restarted
- [x] Check that tutorial doesn't interfere with normal builder
functionality
2026-01-15 07:47:27 +00:00
Swifty
5ac941fe2f feat(backend): add hybrid search for store listings, docs and blocks (#11721)
This PR adds hybrid search functionality combining semantic embeddings
with traditional text search for improved store listing discovery.

### Changes 🏗️

- Add `embeddings.py` - OpenAI-based embedding generation and similarity
search
- Add `hybrid_search.py` - Combines vector similarity with text matching
for better search results
- Add `backfill_embeddings.py` - Script to generate embeddings for
existing store listings
- Update `db.py` - Integrate hybrid search into store database queries
- Update `schema.prisma` - Add embedding storage fields and indexes
- Add migrations for embedding columns and HNSW index for vector search

### Architecture Decisions 🏛️

**Fail-Fast Approach (No Silent Fallbacks)**

We explicitly chose NOT to implement graceful degradation when hybrid
search fails. Here's why:

 **Benefits:**
- Errors surface immediately → faster fixes
- Tests verify hybrid search actually works (not just fallback)
- Consistent search quality for all users
- Forces proper infrastructure setup (API keys, database)

 **Why Not Fallback:**
- Silent degradation hides production issues
- Users get inconsistent results without knowing why
- Tests can pass even when hybrid search is broken
- Reduces operational visibility

**How We Prevent Failures:**
1. Embedding generation in approval flow (db.py:1545)
2. Error logging with `logger.error` (not warning)
3. Clear error messages (ValueError explains what's wrong)
4. Comprehensive test coverage (9/9 tests passing)

If embeddings fail, it indicates a real infrastructure issue (missing
API key, OpenAI down, database issues) that needs immediate attention,
not silent degradation.

### Test Coverage 

**All tests passing (1625 total):**
- 9/9 hybrid_search tests (including fail-fast validation)
- 3/3 db search integration tests
- Full schema compatibility (public/platform schemas)
- Error handling verification

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test hybrid search returns relevant results
  - [x] Test embedding generation for new listings
  - [x] Test backfill script on existing data
  - [x] Verify search performance with embeddings
  - [x] Test fail-fast behavior when embeddings unavailable

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] Configuration: Requires `openai_internal_api_key` in secrets

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 04:17:03 +00:00
Reinier van der Leer
b01ea3fcbd fix(backend/executor): Centralize increment_runs calls & make add_graph_execution more robust (#11764)
[OPEN-2946: \[Scheduler\] Error executing graph <graph_id> after 19.83s:
ClientNotConnectedError: Client is not connected to the query engine,
you must call `connect()` before attempting to query
data.](https://linear.app/autogpt/issue/OPEN-2946)

- Follow-up to #11375
  <sub>(broken `increment_runs` call)</sub>
- Follow-up to #11380
  <sub>(direct `get_graph_execution` call)</sub>

### Changes 🏗️

- Move `increment_runs` call from `scheduler._execute_graph` to
`executor.utils.add_graph_execution` so it can be made through
`DatabaseManager`
  - Add `increment_onboarding_runs` to `DatabaseManager`
- Remove now-redundant `increment_onboarding_runs` calls in other places
- Make `add_graph_execution` more resilient
  - Split up large try/except block
  - Fix direct `get_graph_execution` call

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI + a thorough review
2026-01-15 04:08:19 +00:00
Reinier van der Leer
3b09a94e3f feat(frontend/builder): Add sub-graph update UX (#11631)
[OPEN-2743: Ability to Update Sub-Agents in Graph (Without
Re-Adding)](https://linear.app/autogpt/issue/OPEN-2743/ability-to-update-sub-agents-in-graph-without-re-adding)

Updating sub-graphs is a cumbersome experience at the moment, this
should help. :)

Demo in Builder v2:


https://github.com/user-attachments/assets/df564f32-4d1d-432c-bb91-fe9065068360


https://github.com/user-attachments/assets/f169471a-1f22-46e9-a958-ddb72d3f65af


### Changes 🏗️

- Add sub-graph update banner with I/O incompatibility notification and
resolution mode
  - Red visual indicators for broken inputs/outputs and edges
  - Update bars and tooltips show compatibility details
- Sub-agent update UI with compatibility checks, incompatibility dialog,
and guided resolution workflow
- Resolution mode banner guiding users to remove incompatible
connections
- Visual controls to stage/apply updates and auto-apply when broken
connections are fixed
  
  Technical:
- Builder v1: Add `CustomNode` > `IncompatibilityDialog` +
`SubAgentUpdateBar` sub-components
- Builder v2: Add `SubAgentUpdateFeature` + `ResolutionModeBar` +
`IncompatibleUpdateDialog` + `useSubAgentUpdateState` sub-components
  - Add `useSubAgentUpdate` hook

- Related fixes in Builder v1:
  - Fix static edges not rendering as such
  - Fix edge styling not applying
- Related fixes in Builder v2:
  - Fix excess spacing for nested node input fields

Other:
- "Retry" button in error view now reloads the page instead of
navigating to `/marketplace`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI for existing frontend UX flows
- [x] Updating to a new sub-agent version with compatibility issues: UX
flow works
- [x] Updating to a new sub-agent version with *no* compatibility
issues: works
  - [x] Designer approves of the look

---------

Co-authored-by: abhi1992002 <abhimanyu1992002@gmail.com>
Co-authored-by: Abhimanyu Yadav <122007096+Abhi1992002@users.noreply.github.com>
2026-01-14 13:25:20 +00:00
Zamil Majdy
61efee4139 fix(frontend): Remove hardcoded bypass of billing feature flag (#11762)
## Summary

Fixes a critical security issue where the billing button in the settings
sidebar was always visible to all users, bypassing the
`ENABLE_PLATFORM_PAYMENT` feature flag.

## Changes 🏗️

- Removed hardcoded `|| true` condition in
`frontend/src/app/(platform)/profile/(user)/layout.tsx:32` that was
bypassing the feature flag check
- The billing button is now properly gated by the
`ENABLE_PLATFORM_PAYMENT` feature flag as intended

## Root Cause

The `|| true` was accidentally left in commit
3dbc03e488 (PR #11617 - OAuth API & Single
Sign-On) from December 19, 2025. It was likely added temporarily during
development/testing to always show the billing button, but was not
removed before merging.

## Test Plan

1. Verify feature flag is set to disabled in LaunchDarkly
2. Navigate to settings page (`/profile/settings`)
3. Confirm billing button is NOT visible in the sidebar
4. Enable feature flag in LaunchDarkly
5. Refresh page and confirm billing button IS now visible
6. Verify billing page (`/profile/credits`) is still accessible via
direct URL when feature flag is disabled

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan

Fixes SECRT-1791

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* The Billing link in the profile sidebar now respects the payment
feature flag configuration and will only display when payment
functionality is enabled.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-14 03:28:36 +00:00
Bently
e539280e98 fix(blocks): set User-Agent header and URL-encode topic in GetWikipediaSummaryBlock (#11754)
The GetWikipediaSummaryBlock was returning HTTP 403 errors from
Wikipedia's API because it wasn't explicitly setting a User-Agent header
that complies with https://wikitech.wikimedia.org/wiki/Robot_policy.
Additionally, topics with spaces or special characters would cause
malformed URLs.

Fixes: OPEN-2889

Changes 🏗️

- URL-encode the topic parameter using urllib.parse.quote() to handle
spaces and special characters
- Explicitly set required headers per Wikimedia robot policy:
- User-Agent: Platform default user agent (includes app name, URL, and
contact email)
- Accept-Encoding: gzip, deflate: Recommended by Wikimedia to reduce
bandwidth
- Updated test mock to match the new function signature

Checklist 📋

For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verify code passes syntax check
  - [x] Verify code passes ruff linting
- [x] Create an agent using GetWikipediaSummaryBlock with a topic
containing spaces (e.g., "Artificial Intelligence")
  - [x] Verify the block returns a Wikipedia summary without 403 errors

For configuration changes:

- .env.default is updated or already compatible with my changes
- docker-compose.yml is updated or already compatible with my changes
- I have included a list of my configuration changes in the PR
description (under Changes)
.
N/A - No configuration changes required.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Improved Wikipedia API requests by adding compatible request headers
(including a proper user agent and encoding acceptance) for more
reliable responses.
* Enhanced handling of search topics by URL-encoding terms so queries
with spaces or special characters return correct results.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-13 12:24:51 +00:00
Toran Bruce Richards
db8b43bb3d feat(blocks): Add WordPress Get All Posts block and Publish Post draft toggle (#11003)
**Implements issue #11002**

This PR adds WordPress post management functionality and improves error
handling in DataForSEO blocks.

### Changes 🏗️

1. **New WordPress Blocks:**
- Added `WordPressGetAllPostsBlock` - Fetches posts from WordPress sites
with filtering and pagination support
- Enhanced `WordPressCreatePostBlock` with `publish_as_draft` toggle to
control post publication status

2. **WordPress API Enhancements:**
- Added `get_posts()` function in `_api.py` to retrieve posts with
filtering by status
- Added `PostsResponse` model for handling WordPress posts list API
responses
- Support for pagination with `number` and `offset` parameters (max 100
posts per request)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  
  **Test Plan:**
- [x] Test `WordPressGetAllPostsBlock` with valid WordPress credentials
  - [x] Verify filtering posts by status (publish, draft, pending, etc.)
  - [x] Test pagination with different number and offset values
- [x] Test `WordPressCreatePostBlock` with publish_as_draft=True to
create draft posts
- [x] Test `WordPressCreatePostBlock` with publish_as_draft=False to
publish posts publicly

#### For configuration changes:

- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

**Note:** No configuration changes were required for this PR.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added a WordPress “Get All Posts” block to fetch posts with optional
status filtering and pagination; returns total found and post details.
* **Enhancements**
* WordPress “Create Post” block now supports a “Publish as draft”
option, allowing posts to be created as drafts or published immediately.
* WordPress blocks are now surfaced consistently in the block catalog
for easier use.
* **Error Handling**
* Clearer error messages when fetching posts fails, aiding
troubleshooting.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces WordPress post listing and improves post creation and API
robustness.
> 
> - Adds `WordPressGetAllPostsBlock` to fetch posts with optional
`status` filter and pagination (`number`, `offset`); outputs `found`,
`posts`, and streams each `post`
> - Enhances `WordPressCreatePostBlock` with `publish_as_draft` input
and adds `site` to outputs; sets `status` accordingly
> - WordPress API updates in `_api.py`: new `get_posts`, `Post`,
`PostsResponse`, and `normalize_site`; apply
`Requests(raise_for_status=False)` across OAuth/token/info and post
creation; better error propagation
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
10be1c4709. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Toran Bruce Richards <Torantulino@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-12 19:57:47 +00:00
Abhimanyu Yadav
923d8baedc feat(frontend): add JsonTextField component for complex nested form data (#11752)
### Changes 🏗️

- Added a new `JsonTextField` component to handle complex nested JSON
types (objects/arrays inside other objects/arrays)
- Created helper functions for JSON parsing, validation, and formatting
- Implemented `useJsonTextField` hook to manage state and validation
- Enhanced `generateUiSchemaForCustomFields` to detect nested complex
types and render them as JSON text fields
- Updated `TextInputExpanderModal` to support JSON-specific styling
- Added `JSON_TEXT_FIELD_ID` constant to custom registry for field
identification

This change improves the user experience by preventing deeply nested
form UIs. Instead, complex nested structures are presented as editable
JSON text fields with proper validation and formatting.

### Before

![Screenshot 2026-01-12 at
1.07.54 PM.png](https://app.graphite.com/user-attachments/assets/dc2b96cc-562a-4e6b-8278-76de941e3bd9.png)

### After

![Screenshot 2026-01-12 at
12.35.19 PM.png](https://app.graphite.com/user-attachments/assets/ea0028a5-c119-43c3-8100-b103484e0b54.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test with simple JSON objects in forms
  - [x] Test with nested arrays and objects
  - [x] Test with anyOf/oneOf schemas containing complex types
  - [x] Test the expander modal with JSON content

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* New JSON text field with expandable modal editor, inline validation,
and helpful placeholders.
* Complex nested objects/arrays now render as JSON fields to simplify
editing.
* Modal editor uses monospace, smaller text when editing JSON for
improved readability.

* **Chores**
* Added a non-functional runtime debug log (no user-facing behavior
changes).

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 12:22:41 +00:00
Abhimanyu Yadav
a55b2e02dc feat(frontend): enhance CredentialsInput and CredentialRow components with variant support (#11753)
### Changes 🏗️

- Added a new `variant` prop to `CredentialsInput` component with
options "default" or "node"
- Implemented compact styling for the "node" variant in `CredentialRow`
component
- Modified layout and overflow handling for credential display in node
context
- Added conditional rendering of masked key display based on variant
- Passed the variant prop through the component hierarchy
- Applied the "node" variant to the `CredentialsField` component with
appropriate styling

Before

![Screenshot 2026-01-12 at
4.39.35 PM.png](https://app.graphite.com/user-attachments/assets/2b605b2d-7abf-4e8a-adc5-6a6e8b712ef7.png)

After

![Screenshot 2026-01-12 at
4.55.39 PM.png](https://app.graphite.com/user-attachments/assets/20bb1452-870a-4111-a246-c4e3a3b456ea.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified credential selection works correctly in node context
  - [x] Confirmed compact styling is applied properly in node variant
  - [x] Tested overflow handling for long credential names
  - [x] Verified both default and node variants display correctly

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Credential input and selection components now support multiple
configurable visual variants, enabling better text display handling,
optimized layouts, and improved visual consistency across different
application contexts and specific use cases.

* **Style**
* Credential field displays now feature enhanced text truncation and
overflow management for a more polished and consistent user interface
experience.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 12:22:20 +00:00
Abhimanyu Yadav
6b6648b290 feat(frontend): add Table component with TableField renderer for tabular data input (#11751)
### Changes 🏗️

- Added a new `Table` component for handling tabular data input
- Created supporting hooks and helper functions for the Table component
- Added Storybook stories to showcase different Table configurations
- Implemented a custom `TableField` renderer for JSON Schema forms
- Updated type display info to support the new "table" format
- Added schema matcher to detect and render table fields appropriately

![Screenshot 2026-01-12 at
11.29.04 AM.png](https://app.graphite.com/user-attachments/assets/71469d59-469f-4cb0-882b-a49791fe948d.png)

![Screenshot 2026-01-12 at
11.28.54 AM.png](https://app.graphite.com/user-attachments/assets/81193f32-0e16-435e-bb66-5d2aea98266a.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified Table component renders correctly with various
configurations
  - [x] Tested adding and removing rows in the Table
- [x] Confirmed data changes are properly tracked and reported via
onChange
  - [x] Verified TableField renderer works with JSON Schema forms
  - [x] Checked that table format is properly detected in the schema

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added a Table component for displaying and editing tabular data with
support for adding/deleting rows, read-only mode, and customizable
labels.
* Added support for rendering array fields as tables in form inputs with
configurable columns and values.

* **Tests**
* Added comprehensive Storybook stories demonstrating various Table
configurations and behaviors.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 10:32:14 +00:00
Abhimanyu Yadav
c0a9c0410b feat(frontend): add MultiSelectField component and improve node title cursor styling (#11744)
## Changes 🏗️

- Added a new `MultiSelectField` component for handling multiple boolean
selections in a dropdown format
- Implemented `useMultiSelectField` hook to manage the state and logic
of the multi-select component
- Added support for custom fields in `AnyOfField` by checking if the
option schema matches a custom field
- Added `isMultiSelectSchema` utility function to detect schemas
suitable for the multi-select component
- Added hover cursor styling to node headers to indicate text
editability

![Screenshot 2026-01-10 at
11.15.12 AM.png](https://app.graphite.com/user-attachments/assets/8254497b-604f-4ccc-a40b-eb8994c073b4.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified that multi-select fields render correctly in the UI
  - [x] Confirmed that selecting multiple options works as expected
  - [x] Tested that the node header shows the text cursor on hover
- [x] Verified that AnyOf fields correctly use custom field renderers
when applicable

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a multi-select field allowing selection of multiple options with
improved selection UI.
* AnyOf options can now resolve and render custom field types, improving
form composition when schemas map to custom controls.

* **Style**
  * Tooltip header cursor updated for clearer hover feedback.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 09:48:58 +00:00
Abhimanyu Yadav
17a77b02c7 fix(frontend): exclude schemas with enum from anyOf detection (#11743)
### Changes 🏗️

Fixed the `isAnyOfSchema` function in schema-utils.ts to exclude schemas
that have an `enum` property. This prevents incorrect schema processing
for enums that also have anyOf definitions. Added a console.log
statement in FormRenderer.tsx to help debug schema preprocessing.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified that forms with enum values render correctly
- [x] Confirmed that anyOf schemas are properly identified and processed
- [x] Tested with various schema combinations to ensure the fix doesn't
break existing functionality

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Bug Fixes
* Improved validation logic for form field schemas to correctly handle
edge cases when multiple constraint types are defined.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 09:48:47 +00:00
Zamil Majdy
701fce83ca fix(backend): add missing metadata attribute to mock nodes in SmartDecisionMaker tests (#11750)
This PR fixes failing SmartDecisionMaker tests by adding missing
`metadata` attribute to mock nodes.

### Changes 🏗️

Mock nodes in SmartDecisionMaker tests were missing the `metadata = {}`
attribute, which was introduced in commit 4a52b7eca for the
customized_name feature. This caused tests to fail with:

```
TypeError: expected string or bytes-like object, got 'Mock'
```

**Files fixed**:
- `backend/blocks/test/test_smart_decision_maker_dict.py`: Added
`metadata = {}` to mock nodes in all 3 tests
- `backend/blocks/test/test_smart_decision_maker_dynamic_fields.py`:
Added `metadata = {}` to mock nodes in all 8 tests

**Root cause**: The `_create_block_function_signature` method calls
`sink_node.metadata.get("customized_name")`, but mock nodes in tests
didn't have the metadata attribute initialized.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run `poetry run pytest
backend/blocks/test/test_smart_decision_maker_dict.py -xvs` - 3 passed
- [x] Run `poetry run pytest
backend/blocks/test/test_smart_decision_maker_dynamic_fields.py -xvs` -
8 passed
  - [x] All tests pass successfully

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Tests**
* Updated test infrastructure to enhance mock object configuration for
improved test reliability and consistency across test suites.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-11 17:00:36 -06:00
Zamil Majdy
78d89d0faf Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-01-11 13:09:23 -06:00
Zamil Majdy
f482eb668b hotfix(backend): resolve tool pin name mismatch in SmartDecisionMakerBlock (#11749)
## Root Cause

Execution a40bdb4a-964d-4684-94e8-b148eb6bcfc2 and all similar
executions have been failing since Nov 12, 2025 when tool pin routing
was refactored to use node IDs. The SmartDecisionMakerBlock was
double-sanitizing field names when emitting tool call outputs:

```python
# Original field name from link: "Max Keyword Difficulty"
original_field_name = field_mapping.get(clean_arg_name)  #  Retrieved correctly
sanitized_arg_name = self.cleanup(original_field_name)   #  Sanitized AGAIN!
emit_key = f"tools_^_{node_id}_~_{sanitized_arg_name}"   # Emits "max_keyword_difficulty"
```

But the parser expected original names from graph links:
```python
# Parser expects: "Max Keyword Difficulty" (from link.sink_name)
# Emit provides: "max_keyword_difficulty" (sanitized)
# Result: Mismatch → Tool never executes
```

### Changes 🏗️

**1. Fixed Emit Logic** (`smart_decision_maker.py` line 1135)
- Removed double sanitization: `sanitized_arg_name =
self.cleanup(original_field_name)`
- Now emits with original field names: `emit_key =
f"tools_^_{node_id}_~_{original_field_name}"`

**2. Made Agent Nodes Consistent** (`smart_decision_maker.py` lines
497-530)
- Added `field_mapping` to agent function signatures (was missing)
- Agent signatures now sanitize property keys for Anthropic API (like
block signatures)
- Stores field_mapping for use during emit

### Impact

**Fixes:**
-  All graphs with multi-word field names (e.g., "Max Keyword
Difficulty", "Minimum Volume")
-  All graphs with special characters in field names (e.g., "API-Key")
-  Both block nodes AND agent nodes now work consistently

**Unaffected:**
- Single-word lowercase field names (e.g., "keyword", "url") - these
were already working

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified parse_execution_output handles exact match correctly
  - [x] Verified emit uses original field names
  - [x] Verified field_mapping works for both block and agent nodes
- [x] Re-run execution a40bdb4a-964d-4684-94e8-b148eb6bcfc2 after
deployment to verify fix

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
(no changes)
- [x] `docker-compose.yml` is updated or already compatible with my
changes (no changes)
- [x] No configuration changes in this PR

### Test Plan

1. **Unit test validation** (completed):
- Field name cleanup: "Max Keyword Difficulty" →
"max_keyword_difficulty" 
   - Parse with exact match: Success 
   - Parse with mismatch: Returns None 

2. **Production validation** (to be done after deployment):
   - Re-run execution a40bdb4a-964d-4684-94e8-b148eb6bcfc2
- Verify AgentExecutor (node 767682f5-694f-4b2a-bf52-fbdcad6a4a4f)
executes successfully
   - Verify execution completes with high correctness score (not 0.20)
   - Monitor for any regressions in existing graphs

### Files Changed

- `backend/blocks/smart_decision_maker.py`: Remove double sanitization,
add agent field_mapping

### Related Issues

- Resolves execution failure a40bdb4a-964d-4684-94e8-b148eb6bcfc2
- Fixes bug introduced in commit 536e2a5ec (Nov 12, 2025)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Improved field name mapping consistency in the SmartDecisionMaker
block to ensure proper handling of field names throughout function
signatures and tool execution workflows.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-12 02:08:12 +07:00
Nicholas Tindle
4a52b7eca0 fix(backend): use customized block names in smart decision maker
The SmartDecisionMakerBlock now respects the customized_name field from
node metadata when generating tool function signatures for the LLM.

Previously, the block always used the static block.name from the block
class definition, ignoring any custom names users set in the builder UI.

Changes:
- _create_block_function_signature: Check sink_node.metadata for
  customized_name before falling back to block.name
- _create_agent_function_signature: Check sink_node.metadata for
  customized_name before falling back to sink_graph_meta.name
- Added 4 unit tests for the customized_name feature

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 16:51:39 -07:00
Zamil Majdy
97847f59f7 feat(backend): add human-in-the-loop review system for blocks requiring approval (#11732)
## Summary
Introduces a comprehensive Human-In-The-Loop (HITL) review system that
allows any block to require human approval before execution. This
extends the existing HITL infrastructure to support automatic review
requests for potentially dangerous operations.

## 🚀 Key Features

### **Automatic HITL for Any Block**
- **Simple opt-in**: Set `self.requires_human_review = True` in any
block constructor
- **Safe mode integration**: Only activates when
`execution_context.safe_mode = True`
- **Seamless workflow**: Blocks pause execution → Human reviews via
existing UI → Execution continues or stops

### **Unified Review Infrastructure**
- **Shared HITLReviewHelper**: Clean, reusable helper class for all
review operations
- **Single API**: `handle_review_decision()` method with structured
return type
- **Type-safe**: Proper typing with non-nullable
`ReviewDecision.review_result`

### **Smart Graph Detection** 
- **Updated `has_human_in_the_loop`**: Now detects both dedicated HITL
blocks and blocks with `requires_human_review = True`
- **Frontend awareness**: UI can properly indicate graphs requiring
human intervention

## 🏗️ Implementation

### **Block Usage**
```python
class MyBlock(Block):
    def __init__(self):
        super().__init__(...)
        self.requires_human_review = True  # Enable automatic HITL
        
    async def run(self, input_data, **kwargs):
        # If we reach here, either safe mode is off OR human approved
        # No additional HITL code needed - handled automatically by base class
        yield "result", "Operation completed"
```

### **Review Workflow**
1. **Block execution starts** → Base class checks
`requires_human_review` flag
2. **Safe mode enabled** → Creates review entry, pauses execution 
3. **Human reviews** → Uses existing review UI to approve/reject
4. **Execution resumes** → Continues if approved, raises error if
rejected
5. **Safe mode disabled** → Executes normally without review

## 🔧 Technical Improvements

### **Code Quality Enhancements**
- **Better naming**: `risky_block` → `requires_human_review` (clearer
intent)
- **Type safety**: Non-nullable `ReviewDecision.review_result`
(eliminates Optional checks)
- **Exhaustive handling**: Proper error handling for unexpected review
statuses
- **Clean exception handling**: Removed redundant try-catch-log-reraise
patterns

### **Architecture Fixes**
- **Circular import resolution**: Fixed `ExecutionContext` import issues
breaking 444+ block tests
- **Early returns**: Cleaner control flow without nested conditionals
- **Defensive programming**: Handles edge cases with clear error
messages

## 📊 Changes Made

### **Core Files**
- **`Block.requires_human_review`**: New flag for marking blocks
requiring approval
- **`HITLReviewHelper`**: Shared helper class with clean, testable API
- **`HumanInTheLoopBlock`**: Refactored to use shared infrastructure
- **`Graph.has_human_in_the_loop`**: Updated to include review-requiring
blocks

### **Quality Improvements**
- **Type hints**: Proper typing throughout with runtime compatibility
- **Error handling**: Exhaustive status handling with descriptive errors
- **Code reduction**: -16 lines through removal of redundant exception
handling
- **Test compatibility**: All 444/445 block tests pass

##  Testing & Validation

- **All tests pass**: 444/445 block tests passing 
- **Type checking**: All pyright/mypy checks pass   
- **Formatting**: All linting and formatting checks pass 
- **Circular imports**: Resolved import issues that were breaking tests

- **Backward compatibility**: Existing HITL functionality unchanged 

## 🎯 Use Cases

This enables automatic human oversight for blocks performing:
- **File operations**: Deletion, modification, system access
- **External API calls**: Payments, data modifications, destructive
operations
- **System commands**: Shell execution, configuration changes
- **Data processing**: Sensitive data handling, compliance-required
operations

## 🔄 Migration Path

**Existing code**: No changes required - fully backward compatible
**New blocks**: Simply set `self.requires_human_review = True` to enable
automatic HITL
**Safe mode**: Controls whether review requests are created (production
vs development)

---

This creates a robust, type-safe foundation for human oversight in
automated workflows while maintaining the existing HITL user experience
and API compatibility.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Human-in-the-loop review support so executions can pause for human
review and resume based on decisions.

* **Improvements**
* Blocks can opt into requiring human review and will use reviewed input
when proceeding.
* Unified review decision flow with clearer approved/rejected outcomes
and messaging.
* Graph detection expanded to recognize nodes that require human review.

* **Chores**
  * Test config adjusted to avoid pytest plugin conflicts.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 21:14:37 +00:00
Zamil Majdy
22ca8955c5 fix(backend): library agent creation and version update improvements (#11731)
## Summary
Fixes library agent creation and version update logic to properly handle
both user-created and marketplace agents.

## Changes
- **Remove useGraphIsActiveVersion filter** from
`update_agent_version_in_library` to allow both manual and auto updates
- **Set useGraphIsActiveVersion correctly**:
- `False` for marketplace agents (require manual updates to avoid
breaking workflows)
- `True` for user-created agents (can safely auto-update since user
controls source)
- Update function documentation to reflect new behavior

## Problem Solved
- Marketplace agents can now be updated manually via API
- User-created agents maintain auto-update capability  
- Resolves Sentry error AUTOGPT-SERVER-722 about "Expected a record,
found none"
- Fixes store submission modal issues

## Test Plan
- [x] Verify marketplace agents are created with
`useGraphIsActiveVersion: False`
- [x] Verify user agents are created with `useGraphIsActiveVersion:
True`
- [x] Confirm `update_agent_version_in_library` works for both types
- [x] Test store submission flow works without modal issues

## Review Notes
This change ensures proper separation between user-controlled agents
(auto-update) and marketplace agents (manual update), while allowing the
API to service both use cases.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **New Features**
* Enhanced agent publishing workflow with improved version tracking and
change detection for marketplace updates

* **Bug Fixes**
  * Improved error handling when updating agent versions in the library
  * Better detection of unpublished changes before publishing agents

* **Improvements**
* Changes Summary field now supports longer descriptions (up to 500
characters) with multi-line editing capability

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 21:14:05 +00:00
Nicholas Tindle
43cbe2e011 feat!(blocks): Add Reddit OAuth2 integration and advanced Reddit blocks (#11623)
Replaces user/password Reddit credentials with OAuth2, adds
RedditOAuthHandler, and updates Reddit blocks to support OAuth2
authentication. Introduces new blocks for creating posts, fetching post
details, searching, editing posts, and retrieving subreddit info.
Updates test credentials and input handling to use OAuth2 tokens.

<!-- Clearly explain the need for these changes: -->

### Changes 🏗️
Rebuild the reddit blocks to support oauth2 rather than requiring users
to provide their password and username.
This is done via a swap from script based to web based authentication on
the reddit side faciliatated by the approval of an oauth app by reddit
on the account `ntindle`
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Build a super agent
  - [x] Upload the super agent and a video of it working

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces full Reddit OAuth2 support and substantially expands Reddit
capabilities across the platform.
> 
> - Adds `RedditOAuthHandler` with token exchange, refresh, revoke;
registers handler in `integrations/oauth/__init__.py`
> - Refactors Reddit blocks to use `OAuth2Credentials` and `praw` via
refresh tokens; updates models (e.g., `post_id`, richer outputs) and
adds `strip_reddit_prefix`
> - New blocks: create/edit/delete posts, post/get/delete comments,
reply to comments, get post details, user posts (self/others), search,
inbox, subreddit info/rules/flairs, send messages
> - Updates default `settings.config.reddit_user_agent` and test
credentials; minor `.branchlet.json` addition
> - Docs: clarifies block error-handling with
`BlockInputError`/`BlockExecutionError` guidance
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
4f1f26c7e7. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added OAuth2-based authentication for Reddit integration, replacing
legacy credential methods
* Expanded Reddit capabilities with new blocks for creating posts,
retrieving post details, managing comments, accessing inbox, and
fetching user/subreddit information
* Enhanced data models to support richer Reddit interactions and
chainable workflows

* **Documentation**
* Updated error handling guidance to distinguish between validation
errors and runtime errors with improved exception patterns

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-01-09 20:53:03 +00:00
Nicholas Tindle
a318832414 feat(docs): update dev from gitbook changes (#11740)
<!-- Clearly explain the need for these changes: -->
gitbook branch has changes that need synced to dev
### Changes 🏗️
Pull changes from gitbook into dev
<!-- Concisely describe all of the changes made in this pull request:
-->

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Migrates documentation to GitBook and removes the old MkDocs setup.
> 
> - Removes MkDocs configuration and infra: `docs/mkdocs.yml`,
`docs/netlify.toml`, `docs/overrides/main.html`,
`docs/requirements.txt`, and JS assets (`_javascript/mathjax.js`,
`_javascript/tablesort.js`)
> - Updates `docs/content/contribute/index.md` to describe GitBook
workflow (gitbook branch, editing, previews, and `SUMMARY.md`)
> - Adds GitBook navigation file `docs/platform/SUMMARY.md` and a new
platform overview page `docs/platform/what-is-autogpt-platform.md`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e7e118b5a8. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Updated contribution guide for new documentation platform and workflow
  * Added new platform overview and navigation documentation

* **Chores**
  * Removed MkDocs configuration and related dependencies
  * Removed deprecated JavaScript integrations and deployment overrides

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 19:22:05 +00:00
Swifty
843c487500 feat(backend): add prisma types stub generator for pyright compatibility (#11736)
Prisma's generated `types.py` file is 57,000+ lines with complex
recursive TypedDict definitions that exhaust Pyright's type inference
budget. This causes random type errors and makes the type checker
unreliable.

### Changes 🏗️

- Add `gen_prisma_types_stub.py` script that generates a lightweight
`.pyi` stub file
- The stub preserves safe types (Literal, TypeVar) while collapsing
complex TypedDicts to `dict[str, Any]`
- Integrate stub generation into all workflows that run `prisma
generate`:
  - `platform-backend-ci.yml`
  - `claude.yml`
  - `claude-dependabot.yml`
  - `copilot-setup-steps.yml`
  - `docker-compose.platform.yml`
  - `Dockerfile`
  - `Makefile` (migrate & reset-db targets)
  - `linter.py` (lint & format commands)
- Add `gen-prisma-stub` poetry script entry
- Fix two pre-existing type errors that were previously masked:
- `store/db.py`: Replace private type
`_StoreListingVersion_version_OrderByInput` with dict literal
  - `airtable/_webhook.py`: Add cast for `Serializable` type

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run `poetry run format` - passes with 0 errors (down from 57+)
  - [x] Run `poetry run lint` - passes with 0 errors
  - [x] Run `poetry run gen-prisma-stub` - generates stub successfully
- [x] Verify stub file is created at correct location with proper
content

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Added a lightweight Prisma type-stub generator and integrated it into
build, lint, CI/CD, and container workflows.
* Build, migration, formatting, and lint steps now generate these stubs
to improve type-checking performance and reduce overhead during builds
and deployments.
  * Exposed a project command to run stub generation manually.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 16:31:10 +01:00
Nicholas Tindle
47a3a5ef41 feat(backend,frontend): optional credentials flag for blocks at agent level (#11716)
This feature allows agent makers to mark credential fields as optional.
When credentials are not configured for an optional block, the block
will be skipped during execution rather than causing a validation error.

**Use case:** An agent with multiple notification channels (Discord,
Twilio, Slack) where the user only needs to configure one - unconfigured
channels are simply skipped.

### Changes 🏗️

#### Backend

**Data Model Changes:**
- `backend/data/graph.py`: Added `credentials_optional` property to
`Node` model that reads from node metadata
- `backend/data/execution.py`: Added `nodes_to_skip` field to
`GraphExecutionEntry` model to track nodes that should be skipped

**Validation Changes:**
- `backend/executor/utils.py`:
- Updated `_validate_node_input_credentials()` to return a tuple of
`(credential_errors, nodes_to_skip)`
- Nodes with `credentials_optional=True` and missing credentials are
added to `nodes_to_skip` instead of raising validation errors
- Updated `validate_graph_with_credentials()` to propagate
`nodes_to_skip` set
- Updated `validate_and_construct_node_execution_input()` to return
`nodes_to_skip`
- Updated `add_graph_execution()` to pass `nodes_to_skip` to execution
entry

**Execution Changes:**
- `backend/executor/manager.py`:
  - Added skip logic in `_on_graph_execution()` dispatch loop
- When a node is in `nodes_to_skip`, it is marked as `COMPLETED` without
execution
  - No outputs are produced, so downstream nodes won't trigger

#### Frontend

**Node Store:**
- `frontend/src/app/(platform)/build/stores/nodeStore.ts`:
- Added `credentials_optional` to node metadata serialization in
`convertCustomNodeToBackendNode()`
- Added `getCredentialsOptional()` and `setCredentialsOptional()` helper
methods

**Credential Field Component:**
-
`frontend/src/components/renderers/input-renderer/fields/CredentialField/CredentialField.tsx`:
  - Added "Optional - skip block if not configured" switch toggle
  - Switch controls the `credentials_optional` metadata flag
  - Placeholder text updates based on optional state

**Credential Field Hook:**
-
`frontend/src/components/renderers/input-renderer/fields/CredentialField/useCredentialField.ts`:
  - Added `disableAutoSelect` parameter
- When credentials are optional, auto-selection of credentials is
disabled

**Feature Flags:**
- `frontend/src/services/feature-flags/use-get-flag.ts`: Minor refactor
(condition ordering)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Build an agent using smart decision maker and down stream blocks
to test this

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces optional credentials across graph execution and UI,
allowing nodes to be skipped (no outputs, no downstream triggers) when
their credentials are not configured.
> 
> - Backend
> - Adds `Node.credentials_optional` (from node `metadata`) and computes
required credential fields in `Graph.credentials_input_schema` based on
usage.
> - Validates credentials with `_validate_node_input_credentials` →
returns `(errors, nodes_to_skip)`; plumbs `nodes_to_skip` through
`validate_graph_with_credentials`,
`_construct_starting_node_execution_input`,
`validate_and_construct_node_execution_input`, and `add_graph_execution`
into `GraphExecutionEntry`.
> - Executor: dispatch loop skips nodes in `nodes_to_skip` (marks
`COMPLETED`); `execute_node`/`on_node_execution` accept `nodes_to_skip`;
`SmartDecisionMakerBlock.run` filters tool functions whose
`_sink_node_id` is in `nodes_to_skip` and errors only if all tools are
filtered.
> - Models: `GraphExecutionEntry` gains `nodes_to_skip` field. Tests and
snapshots updated accordingly.
> 
> - Frontend
> - Builder: credential field uses `custom/credential_field` with an
"Optional – skip block if not configured" toggle; `nodeStore` persists
`credentials_optional` and history; UI hides optional toggle in run
dialogs.
> - Run dialogs: compute required credentials from
`credentials_input_schema.required`; allow selecting "None"; avoid
auto-select for optional; filter out incomplete creds before execute.
>   - Minor schema/UI wiring updates (`uiSchema`, form context flags).
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
5e01fd6a3e. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-09 14:11:35 +00:00
Ubbe
ec00aa951a fix(frontend): agent favorites layout (#11733)
## Changes 🏗️

<img width="800" height="744" alt="Screenshot 2026-01-09 at 16 07 08"
src="https://github.com/user-attachments/assets/034c97e2-18f3-441c-a13d-71f668ad672f"
/>

- Remove feature flag for agent favourites ( _keep it always visible_ )
- Fix the layout on the card so the ❤️ icon appears next to the `...`
menu
- Remove icons on toasts

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run the app locally and check the above


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Favorites now respond to the current search term and are available to
all users (no feature-flag).

* **UI/UX Improvements**
* Redesigned Favorites section with simplified header, inline agent
counts, updated spacing/dividers, and removal of skeleton placeholders.
  * Favorite button repositioned and visually simplified on agent cards.
* Toast visuals simplified by removing per-type icons and adjusting
close-button positioning.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 18:52:07 +07:00
Zamil Majdy
36fb1ea004 fix(platform): store submission validation and marketplace improvements (#11706)
## Summary

Major improvements to AutoGPT Platform store submission deletion,
creator detection, and marketplace functionality. This PR addresses
critical issues with submission management and significantly improves
performance.

### 🔧 **Store Submission Deletion Issues Fixed**

**Problems Solved**:
-  **Wrong deletion granularity**: Deleting entire `StoreListing` (all
versions) when users expected to delete individual submissions
-  **"Graph not found" errors**: Cascade deletion removing AgentGraphs
that were still referenced
-  **Multiple submissions deleted**: When removing one submission, all
submissions for that agent were removed
-  **Deletion of approved content**: Users could accidentally remove
live store content

**Solutions Implemented**:
-  **Granular deletion**: Now deletes individual `StoreListingVersion`
records instead of entire listings
-  **Protected approved content**: Prevents deletion of approved
submissions to keep store content safe
-  **Automatic cleanup**: Empty listings are automatically removed when
last version is deleted
-  **Simplified logic**: Reduced deletion function from 85 lines to 32
lines for better maintainability

### 🔧 **Creator Detection Performance Issues Fixed**

**Problems Solved**:
-  **Inefficient API calls**: Fetching ALL user submissions just to
check if they own one specific agent
-  **Complex logic**: Convoluted creator detection requiring multiple
database queries
-  **Performance impact**: Especially bad for non-creators who would
never need this data

**Solutions Implemented**:
-  **Added `owner_user_id` field**: Direct ownership reference in
`LibraryAgent` model
-  **Simple ownership check**: `owner_user_id === user.id` instead of
complex submission fetching
-  **90%+ performance improvement**: Massive reduction in unnecessary
API calls for non-creators
-  **Optimized data fetching**: Only fetch submissions when user is
creator AND has marketplace listing

### 🔧 **Original Store Submission Validation Issues (BUILDER-59F)**
Fixes "Agent not found for this user. User ID: ..., Agent ID: , Version:
0" errors:

- **Backend validation**: Added Pydantic validation for `agent_id`
(min_length=1) and `agent_version` (>0)
- **Frontend validation**: Pre-submission validation with user-friendly
error messages
- **Agent selection flow**: Fixed `agentId` not being set from
`selectedAgentId`
- **State management**: Prevented state reset conflicts clearing
selected agent

### 🔧 **Marketplace Display Improvements**
Enhanced version history and changelog display:

- Updated title from "Changelog" to "Version history"
- Added "Last updated X ago" with proper relative time formatting  
- Display version numbers as "Version X.0" format
- Replaced all hardcoded values with dynamic API data
- Improved text sizes and layout structure

### 📁 **Files Changed**

**Backend Changes**:
- `backend/api/features/store/db.py` - Simplified deletion logic, added
approval protection
- `backend/api/features/store/model.py` - Added `listing_id` field,
Pydantic validation
- `backend/api/features/library/model.py` - Added `owner_user_id` field
for efficient creator detection
- All test files - Updated with new required fields

**Frontend Changes**:
- `useMarketplaceUpdate.ts` - Optimized creator detection logic 
- `MainDashboardPage.tsx` - Added `listing_id` mapping for proper type
safety
- `useAgentTableRow.ts` - Updated deletion logic to use
`store_listing_version_id`
- `usePublishAgentModal.ts` - Fixed state reset conflicts
- Marketplace components - Enhanced version history display

###  **Benefits**

**Performance**:
- 🚀 **90%+ reduction** in unnecessary API calls for creator detection
- 🚀 **Instant ownership checks** (no database queries needed)
- 🚀 **Optimized submissions fetching** (only when needed)

**User Experience**: 
-  **Granular submission control** (delete individual versions, not
entire listings)
-  **Protected approved content** (prevents accidental store content
removal)
-  **Better error prevention** (no more "Graph not found" errors)
-  **Clear validation messages** (user-friendly error feedback)

**Code Quality**:
-  **Simplified deletion logic** (85 lines → 32 lines)
-  **Better type safety** (proper `listing_id` field usage)  
-  **Cleaner creator detection** (explicit ownership vs inferred)
-  **Automatic cleanup** (empty listings removed automatically)

### 🧪 **Testing**
- [x] Backend validation rejects empty agent_id and zero agent_version
- [x] Frontend TypeScript compilation passes
- [x] Store submission works from both creator dashboard and "become a
creator" flows
- [x] Granular submission deletion works correctly
- [x] Approved submissions are protected from deletion
- [x] Creator detection is fast and accurate
- [x] Marketplace displays version history correctly

**Breaking Changes**: None - All changes are additive and backwards
compatible.

Fixes critical submission deletion issues, improves performance
significantly, and enhances user experience across the platform.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Agent ownership is now tracked and exposed across the platform.
* Store submissions and versions now include a required listing_id to
preserve listing linkage.

* **Bug Fixes**
* Prevent deletion of APPROVED submissions; remove empty listings after
deletions.
* Edits restricted to PENDING submissions with clearer invalid-operation
messages.

* **Improvements**
* Stronger publish validation and UX guards; deduplicated images and
modal open/reset refinements.
* Version history shows relative "Last updated" times and version
badges.

* **Tests**
* E2E tests updated to target pending-submission flows for edit/delete.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-01-08 19:11:38 +00:00
Abhimanyu Yadav
a81ac150da fix(frontend): add word wrapping to CodeRenderer and improve output actions visibility (#11724)
## Changes 🏗️
- Updated the `CodeRenderer` component to add `whitespace-pre-wrap` and
`break-words` CSS classes to the `<code>` element
- This enables proper wrapping of long code lines while preserving
whitespace formatting

Before


![image.png](https://app.graphite.com/user-attachments/assets/aca769cc-0f6f-4e25-8cdd-c491fcbf21bb.png)

After

![Screenshot 2026-01-08 at
3.02.53 PM.png](https://app.graphite.com/user-attachments/assets/99e23efa-be2a-441b-b0d6-50fa2a08cdb0.png)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verified code with long lines wraps correctly
  - [x] Confirmed whitespace and indentation are preserved
  - [x] Tested code display in various viewport sizes

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Code blocks now preserve whitespace and wrap long lines for improved
readability.
* Output action controls are hidden when there is only a single output
item, reducing unnecessary UI elements.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-08 11:13:47 +00:00
Abhimanyu Yadav
49ee087496 feat(frontend): add new integration images for Webshare and WordPress (#11725)
### Changes 🏗️

Added two new integration icons to the frontend:
- `webshare_proxy.png` - Icon for WebShare Proxy integration
- `wordpress.png` - Icon for WordPress integration

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified both icons display correctly in the integrations section
  - [x] Confirmed icons render properly at different screen sizes
  - [x] Checked that the icons maintain quality when scaled

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
2026-01-08 11:13:34 +00:00
Ubbe
fc25e008b3 feat(frontend): update library agent cards to use DS (#11720)
## Changes 🏗️

<img width="700" height="838" alt="Screenshot 2026-01-07 at 16 11 04"
src="https://github.com/user-attachments/assets/0b38d2e1-d4a8-4036-862c-b35c82c496c2"
/>

- Update the agent library cards to new designs
- Update page to use Design System components
- Allow to edit/delete/duplicate agents on the library list page
- Add missing actions on library agent detail page

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Marketplace info shown on agent cards and improved favoriting with
optimistic UI and feedback.
  * Delete agent and delete schedule flows with confirmation dialogs.

* **Refactor**
* New composable form system, modernized upload dialog, streamlined
search bar, and multiple library components converted to named exports
with layout tweaks.
  * New agent card menu and favorite button UI.

* **Chores**
  * Removed notification UI and dropped a drag-drop dependency.

* **Tests**
  * Increased timeouts and stabilized upload/pagination flows.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-08 18:28:27 +07:00
Ubbe
b0855e8cf2 feat(frontend): context menu right click new builder (#11703)
## Changes 🏗️

<img width="250" height="504" alt="Screenshot 2026-01-06 at 17 53 26"
src="https://github.com/user-attachments/assets/52013448-f49c-46b6-b86a-39f98270cbc3"
/>

<img width="300" height="544" alt="Screenshot 2026-01-06 at 17 53 29"
src="https://github.com/user-attachments/assets/e6334034-68e4-4346-9092-3774ab3e8445"
/>

On the **New Builder**:
- right-click on a node menu make it show the context menu
- use the same menu for right-click and when clicking on `...`

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a custom right-click context menu for nodes with Copy, Open
agent (when available), and Delete actions; browser default menu is
suppressed while preserving zoom/drag/wiring.
* Introduced reusable SecondaryMenu primitives for context and dropdown
menus.

* **Documentation**
* Added Storybook examples demonstrating the context menu and dropdown
menu usage.

* **Style**
* Updated menu styling and icons with improved consistency and dark-mode
support.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-08 17:35:49 +07:00
Abhimanyu Yadav
5e2146dd76 feat(frontend): add CustomSchemaField wrapper for dynamic form field routing
(#11722)

### Changes 🏗️

This PR introduces automatic UI schema generation for custom form
fields, eliminating manual field mapping.

#### 1. **generateUiSchemaForCustomFields Utility**
(`generate-ui-schema.ts`) - New File
   - Auto-generates `ui:field` settings for custom fields
   - Detects custom fields using `findCustomFieldId()` matcher
   - Handles nested objects and array items recursively
   - Merges with existing UI schema without overwriting

#### 2. **FormRenderer Integration** (`FormRenderer.tsx`)
   - Imports and uses `generateUiSchemaForCustomFields`
   - Creates merged UI schema with `useMemo`
   - Passes merged schema to Form component
   - Enables automatic custom field detection

#### 3. **Preprocessor Cleanup** (`input-schema-pre-processor.ts`)
   - Removed manual `$id` assignment for custom fields
   - Removed unused `findCustomFieldId` import
   - Simplified to focus only on type validation

### Why these changes?

- Custom fields now auto-detect without manual `ui:field` configuration
- Uses standard RJSF approach (UI schema) for field routing
- Centralized custom field detection logic improves maintainability

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Verify custom fields render correctly when present in schema
- [x] Verify standard fields continue to render with default SchemaField
- [x] Verify multiple instances of same custom field type have unique
IDs
  - [x] Test form submission with custom fields

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Improved custom field rendering in forms by optimizing the UI schema
generation process.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-08 08:47:52 +00:00
Abhimanyu Yadav
103a62c9da feat(frontend/builder): add filters to blocks menu (#11654)
### Changes 🏗️

This PR adds filtering functionality to the new blocks menu, allowing
users to filter search results by category and creator.

**New Components:**
- `BlockMenuFilters`: Main filter component displaying active filters
and filter chips
- `FilterSheet`: Slide-out panel for selecting filters with categories
and creators
- `BlockMenuSearchContent`: Refactored search results display component

**Features Added:**
- Filter by categories: Blocks, Integrations, Marketplace agents, My
agents
- Filter by creator: Shows all available creators from search results
- Category counts: Display number of results per category
- Interactive filter chips with animations (using framer-motion)
- Hover states showing result counts on filter chips
- "All filters" sheet with apply/clear functionality

**State Management:**
- Extended `blockMenuStore` with filter state management
- Added `filters`, `creators`, `creators_list`, and `categoryCounts` to
store
- Integrated filters with search API (`filter` and `by_creator`
parameters)

**Refactoring:**
- Moved search logic from `BlockMenuSearch` to `BlockMenuSearchContent`
- Renamed `useBlockMenuSearch` to `useBlockMenuSearchContent`
- Moved helper functions to `BlockMenuSearchContent` directory

**API Changes:**
- Updated `custom-mutator.ts` to properly handle query parameter
encoding


### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Search for blocks and verify filter chips appear
- [x] Click "All filters" and verify filter sheet opens with categories
- [x] Select/deselect category filters and verify results update
accordingly
  - [x] Filter by creator and verify only blocks from that creator show
  - [x] Clear all filters and verify reset to default state
  - [x] Verify filter counts display correctly
  - [x] Test filter chip hover animations
2026-01-08 08:02:21 +00:00
Bentlybro
fc8434fb30 Merge branch 'master' into dev 2026-01-07 12:02:15 +00:00
Ubbe
3ae08cd48e feat(frontend): use Google Drive Picker on new builder (#11702)
## Changes 🏗️

<img width="600" height="960" alt="Screenshot 2026-01-06 at 17 40 23"
src="https://github.com/user-attachments/assets/61085ec5-a367-45c7-acaa-e3fc0f0af647"
/>

- So when using Google Blocks on the new builder, it shows Google Drive
Picket 🏁

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
  - [x] Run app locally and test the above


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a Google Drive picker field and widget for forms with an
always-visible remove button and improved single/multi selection
handling.

* **Bug Fixes**
* Better validation and normalization of selected files and consolidated
error messaging.
* Adjusted layout spacing around the picker and selected files for
clearer display.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-07 17:07:09 +07:00
Swifty
4db13837b9 Revert "extracted frontend changes out of the hackathon/copilot branch"
This reverts commit df87867625.
2026-01-07 09:27:25 +01:00
Swifty
df87867625 extracted frontend changes out of the hackathon/copilot branch 2026-01-07 09:25:10 +01:00
Abhimanyu Yadav
e503126170 feat(frontend): upgrade RJSF to v6 and implement new FormRenderer system
(#11677)

Fixes #11686

### Changes 🏗️

This PR upgrades the React JSON Schema Form (RJSF) library from v5 to v6
and introduces a complete rewrite of the form rendering system with
improved architecture and new features.

#### Core Library Updates
- Upgraded `@rjsf/core` from 5.24.13 to 6.1.2
- Upgraded `@rjsf/utils` from 5.24.13 to 6.1.2
- Added `@radix-ui/react-slider` 1.3.6 for new slider components

#### New Form Renderer Architecture
- **Base Templates**: Created modular base templates for arrays,
objects, and standard fields
- **AnyOf Support**: Implemented `AnyOfField` component with type
selector for union types
- **Array Fields**: New `ArrayFieldTemplate`, `ArrayFieldItemTemplate`,
and `ArraySchemaField` with context provider
- **Object Fields**: Enhanced `ObjectFieldTemplate` with better support
for additional properties via `WrapIfAdditionalTemplate`
- **Field Templates**: New `TitleField`, `DescriptionField`, and
`FieldTemplate` with improved styling
- **Custom Widgets**: Implemented TextWidget, SelectWidget,
CheckboxWidget, FileWidget, DateWidget, TimeWidget, and DateTimeWidget
- **Button Components**: Custom AddButton, RemoveButton, and CopyButton
components

#### Node Handle System Refactor
- Split `NodeHandle` into `InputNodeHandle` and `OutputNodeHandle` for
better separation of concerns
- Refactored handle ID generation logic in `helpers.ts` with new
`generateHandleIdFromTitleId` function
- Improved handle connection detection using edge store
- Added support for nested output handles (objects within outputs)

#### Edge Store Improvements
- Added `removeEdgesByHandlePrefix` method for bulk edge removal
- Improved `isInputConnected` with handle ID cleanup
- Optimized `updateEdgeBeads` to only update when changes occur
- Better edge management with `applyEdgeChanges`

#### Node Store Enhancements
- Added `syncHardcodedValuesWithHandleIds` method to maintain
consistency between form data and handle connections
- Better handling of additional properties in objects
- Improved path parsing with `parseHandleIdToPath` and
`ensurePathExists`

#### Draft Recovery Improvements
- Added diff calculation with `calculateDraftDiff` to show what changed
- New `formatDiffSummary` to display changes in a readable format (e.g.,
"+2/-1 blocks, +3 connections")
- Better visual feedback for draft changes

#### UI/UX Enhancements
- Fixed node container width to 350px for consistency
- Improved field error display with inline error messages
- Better spacing and styling throughout forms
- Enhanced tooltip support for field descriptions
- Improved array item controls with better button placement
- Context-aware field sizing (small/large)

#### Output Handler Updates
- Recursive rendering of nested output properties
- Better type display with color coding
- Improved handle connections for complex output schemas

#### Migration & Cleanup
- Updated `RunInputDialog` to use new FormRenderer
- Updated `FormCreator` to use new FormRenderer
- Moved OAuth callback types to separate file
- Updated import paths from `input-renderer` to `InputRenderer`
- Removed unused console.log statements
- Added `type="button"` to buttons to prevent form submission

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Test form rendering with various field types (text, number,
boolean, arrays, objects)
  - [x] Test anyOf field type selector functionality
  - [x] Test array item addition/removal
  - [x] Test nested object fields with additional properties
  - [x] Test input/output node handle connections
  - [x] Test draft recovery with diff display
  - [x] Verify backward compatibility with existing agents
  - [x] Test field validation and error display
  - [x] Verify handle ID generation for complex schemas

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Improved form field rendering with enhanced support for optional
types, arrays, and nested objects.
* Enhanced draft recovery display showing detailed difference tracking
(added, removed, modified items).
  * Better OAuth popup callback handling with structured message types.

* **Bug Fixes**
  * Improved node handle ID normalization and synchronization.
  * Enhanced edge management for complex field changes.
  * Fixed styling consistency across form components.

* **Dependencies**
  * Updated React JSON Schema Form library to version 6.1.2.
  * Added Radix UI slider component support.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-07 05:06:34 +00:00
Zamil Majdy
7ee28197a3 docs(gitbook): sync documentation updates with dev branch (#11709)
## Summary

Sync GitBook documentation changes from the gitbook branch to dev. This
PR contains comprehensive documentation updates including new assets,
content restructuring, and infrastructure improvements.

## Changes 🏗️

### Documentation Updates
- **New GitBook Assets**: Added 9 new documentation images and
screenshots
  - Platform overview images (AGPT_Platform.png, Banner_image.png)
- Feature illustrations (Contribute.png, Integrations.png, hosted.jpg,
no-code.jpg, api-reference.jpg)
  - Screenshots and examples for better user guidance
- **Content Updates**: Enhanced README.md and SUMMARY.md with improved
structure and navigation
- **Visual Documentation**: Added comprehensive visual guides for
platform features

### Infrastructure 
- **Cloudflare Worker**: Added redirect handler for docs.agpt.co →
agpt.co/docs migration
  - Complete URL mapping for 71+ redirect patterns
  - Handles platform blocks restructuring and edge cases
  - Ready for deployment to Cloudflare Workers

### Merge Conflict Resolution
- **Clean merge from dev**: Successfully merged dev's major backend
restructuring (server/ → api/)
- **File resurrection fix**: Removed files that were accidentally
resurrected during merge conflict resolution
  - Cleaned up BuilderActionButton.tsx (deleted in dev)
  - Cleaned up old PreviewBanner.tsx location (moved in dev)
  - Synced pnpm-lock.yaml and layout.tsx with dev's current state

## Technical Details

This PR represents a careful synchronization that:
1. **Preserves all GitBook documentation work** while staying current
with dev
2. **Maintains clean diff**: Only documentation-related changes remain
after merge cleanup
3. **Resolves merge conflicts**: Handled major backend API restructuring
without breaking docs
4. **Infrastructure ready**: Cloudflare Worker ready for docs migration
deployment

## Files Changed
- `docs/`: GitBook documentation assets and content
- `autogpt_platform/cloudflare_worker.js`: Docs infrastructure for URL
redirects

## Validation
-  All TypeScript compilation errors resolved
-  Pre-commit hooks passing (Prettier, TypeCheck)
-  Only documentation changes remain in diff vs dev
-  Cloudflare Worker tested with comprehensive URL mapping
-  No non-documentation code changes after cleanup

## Deployment Notes
The Cloudflare Worker can be deployed via:
```bash
# Cloudflare Dashboard → Workers → Create → Paste code → Add route docs.agpt.co/*
```

This completes the GitBook synchronization and prepares for docs site
migration.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: bobby.gaffin <bobby.gaffin@agpt.co>
Co-authored-by: Bently <Github@bentlybro.com>
Co-authored-by: Abhimanyu Yadav <122007096+Abhi1992002@users.noreply.github.com>
Co-authored-by: Swifty <craigswift13@gmail.com>
Co-authored-by: Ubbe <hi@ubbe.dev>
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lluis Agusti <hi@llu.lu>
2026-01-07 02:11:11 +00:00
Nicholas Tindle
818de26d24 fix(platform/blocks): XMLParserBlock list object error (#11517)
<!-- Clearly explain the need for these changes: -->

### Need for these changes 💡

The `XMLParserBlock` was susceptible to crashing with an
`AttributeError: 'List' object has no attribute 'add_text'` when
processing malformed XML inputs, such as documents with multiple root
elements or stray text outside the root. This PR introduces robust
validation to prevent these crashes and provide clear, actionable error
messages to users.

### Changes 🏗️

<!-- Concisely describe all of the changes made in this pull request:
-->

- Added a `_validate_tokens` static method to `XMLParserBlock` to
perform pre-parsing validation on the token stream. This method ensures
the XML input has a single root element and no text content outside of
it.
- Modified the `XMLParserBlock.run` method to call `_validate_tokens`
immediately after tokenization and before passing the tokens to
`gravitasml.Parser`.
- Introduced a new test case, `test_rejects_text_outside_root`, in
`test_blocks_dos_vulnerability.py` to verify that the `XMLParserBlock`
correctly raises a `ValueError` when encountering XML with text outside
the root element.
- Imported `Token` for type hinting in `xml_parser.py`.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
- [x] Confirm that the `test_rejects_text_outside_root` test passes,
asserting that `ValueError` is raised for invalid XML.
  - [x] Confirm that other relevant XML parsing tests continue to pass.


---
Linear Issue:
[OPEN-2835](https://linear.app/autogpt/issue/OPEN-2835/blockunknownerror-raised-by-xmlparserblock-with-message-list-object)

<a
href="https://cursor.com/background-agent?bcId=bc-4495ea93-6836-412c-b2e3-0adb31113169"><picture><source
media="(prefers-color-scheme: dark)"
srcset="https://cursor.com/open-in-cursor-dark.svg"><source
media="(prefers-color-scheme: light)"
srcset="https://cursor.com/open-in-cursor-light.svg"><img alt="Open in
Cursor"
src="https://cursor.com/open-in-cursor.svg"></picture></a>&nbsp;<a
href="https://cursor.com/agents?id=bc-4495ea93-6836-412c-b2e3-0adb31113169"><picture><source
media="(prefers-color-scheme: dark)"
srcset="https://cursor.com/open-in-web-dark.svg"><source
media="(prefers-color-scheme: light)"
srcset="https://cursor.com/open-in-web-light.svg"><img alt="Open in Web"
src="https://cursor.com/open-in-web.svg"></picture></a>


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Strengthens XML parsing robustness and error clarity.
> 
> - Adds `_validate_tokens` in `XMLParserBlock` to ensure a single root
element, balanced tags, and no text outside the root before parsing
> - Updates `run` to `list(tokenize(...))` and validate tokens prior to
`Parser.parse()`; maintains 10MB input size guard
> - Introduces `test_rejects_text_outside_root` asserting a readable
`ValueError` for trailing text
> - Bumps `gravitasml` to `0.1.4` in `pyproject.toml` and lockfile
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
22cc5149c5. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Improved XML parsing validation with stricter enforcement of
single-root elements and prevention of trailing text, providing clearer
error messages for invalid XML input.

* **Tests**
* Added test coverage for XML parser validation of invalid root text
scenarios.

* **Chores**
  * Updated GravitasML dependency to latest compatible version.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-06 20:02:53 +00:00
Nicholas Tindle
cb08def96c feat(blocks): Add Google Docs integration blocks (#11608)
Introduces a new module with blocks for Google Docs operations,
including reading, creating, appending, inserting, formatting,
exporting, sharing, and managing public access for Google Docs. Updates
dependencies in pyproject.toml and poetry.lock to support these
features.



https://github.com/user-attachments/assets/3597366b-a9eb-4f8e-8a0a-5a0bc8ebc09b



<!-- Clearly explain the need for these changes: -->

### Changes 🏗️
Adds lots of basic docs tools + a dependency to use them with markdown

Block | Description | Key Features
-- | -- | --
Read & Create |   |  
GoogleDocsReadBlock | Read content from a Google Doc | Returns text
content, title, revision ID
GoogleDocsCreateBlock | Create a new Google Doc | Title, optional
initial content
GoogleDocsGetMetadataBlock | Get document metadata | Title, revision ID,
locale, suggested modes
GoogleDocsGetStructureBlock | Get document structure with indexes | Flat
segments or detailed hierarchy; shows start/end indexes
Plain Text Operations |   |  
GoogleDocsAppendPlainTextBlock | Append plain text to end | No
formatting applied
GoogleDocsInsertPlainTextBlock | Insert plain text at position |
Requires index; no formatting
GoogleDocsFindReplacePlainTextBlock | Find and replace plain text |
Case-sensitive option; no formatting on replacement
Markdown Operations | (ideal for LLM/AI output) |  
GoogleDocsAppendMarkdownBlock | Append Markdown to end | Full formatting
via gravitas-md2gdocs
GoogleDocsInsertMarkdownAtBlock | Insert Markdown at position | Requires
index
GoogleDocsReplaceAllWithMarkdownBlock | Replace entire doc with Markdown
| Clears and rewrites
GoogleDocsReplaceRangeWithMarkdownBlock | Replace index range with
Markdown | Requires start/end index
GoogleDocsReplaceContentWithMarkdownBlock | Find text and replace with
Markdown | Text-based search; great for templates
Structural Operations |   |  
GoogleDocsInsertTableBlock | Insert a table | Rows/columns OR content
array; optional Markdown in cells
GoogleDocsInsertPageBreakBlock | Insert a page break | Position index (0
= end)
GoogleDocsDeleteContentBlock | Delete content range | Requires start/end
index
GoogleDocsFormatTextBlock | Apply formatting to text range | Bold,
italic, underline, font size/color, etc.
Export & Sharing |   |  
GoogleDocsExportBlock | Export to different formats | PDF, DOCX, TXT,
HTML, RTF, ODT, EPUB
GoogleDocsShareBlock | Share with specific users | Reader, commenter,
writer, owner roles
GoogleDocsSetPublicAccessBlock | Set public access level | Private,
anyone with link (view/comment/edit)


<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Build, run, verify, and upload a block super test
- [x] [Google Docs Super
Agent_v8.json](https://github.com/user-attachments/files/24134215/Google.Docs.Super.Agent_v8.json)
works


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Chores**
  * Updated backend dependencies.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Adds end-to-end Google Docs capabilities under
`backend/blocks/google/docs.py`, including rich Markdown support.
> 
> - New blocks: read/create docs; plain-text
`append`/`insert`/`find_replace`/`delete`; text `format`;
`insert_table`; `insert_page_break`; `get_metadata`; `get_structure`
> - Markdown-powered blocks (via `gravitas_md2gdocs.to_requests`):
`append_markdown`, `insert_markdown_at`, `replace_all_with_markdown`,
`replace_range_with_markdown`, `replace_content_with_markdown`
> - Export and sharing: `export` (PDF/DOCX/TXT/HTML/RTF/ODT/EPUB),
`share` (user roles), `set_public_access`
> - Dependency updates: add `gravitas-md2gdocs` to `pyproject.toml` and
update `poetry.lock`
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
73512a95b2. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
2026-01-05 18:36:56 +00:00
Krzysztof Czerwinski
ac2daee5f8 feat(backend): Add GPT-5.2 and update default models (#11652)
### Changes 🏗️

- Add OpenAI `GPT-5.2` with metadata&cost
- Add const `DEFAULT_LLM_MODEL` (set to GPT-5.2) and use it instead of
hardcoded model across llm blocks and tests

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] GPT-5.2 is set as default and works on llm blocks
2026-01-05 16:13:35 +00:00
lif
266e0d79d4 fix(blocks): add YouTube Shorts URL support (#11659)
## Summary
Added support for parsing YouTube Shorts URLs (`youtube.com/shorts/...`)
in the TranscribeYoutubeVideoBlock to extract video IDs correctly.

## Changes
- Modified `_extract_video_id` method in `youtube.py` to handle Shorts
URL format
- Added test cases for YouTube Shorts URL extraction

## Related Issue
Fixes #11500

## Test Plan
- [x] Added unit tests for YouTube Shorts URL extraction
- [x] Verified existing YouTube URL formats still work
- [x] CI should pass all existing tests

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
2026-01-05 16:11:45 +00:00
lif
01f443190e fix(frontend): allow empty values in number inputs and fix AnyOfField toggle (#11661)
<!-- ⚠️ Reminder: Think about your Changeset/Docs changes! -->
<!-- If you are introducing new blocks or features, document them for
users. -->
<!-- Reference:
https://github.com/Significant-Gravitas/AutoGPT/blob/dev/CONTRIBUTING.md
-->

## Summary

This PR fixes two related issues with number/integer inputs in the
frontend:

1. **HTMLType typo fix**: INTEGER input type was incorrectly mapped to
`htmlType: 'account'` (which is not a valid HTML input type) instead of
`htmlType: 'number'`.

2. **AnyOfField toggle fix**: When a user cleared a number input field,
the input would disappear because `useAnyOfField` checked for both
`null` AND `undefined` in `isEnabled`. This PR changes it to only check
for explicit `null` (set by toggle off), allowing `undefined` (empty
input) to keep the field visible.

### Root cause analysis

When a user cleared a number input:
1. `handleChange` returned `undefined` (because `v === "" ? undefined :
Number(v)`)
2. In `useAnyOfField`, `isEnabled = formData !== null && formData !==
undefined` became `false`
3. The input field disappeared

### Fix

Changed `useAnyOfField.tsx` line 67:
```diff
- const isEnabled = formData !== null && formData !== undefined;
+ const isEnabled = formData !== null;
```

This way:
- When toggle is OFF → `formData` is `null` → `isEnabled` is `false`
(input hidden) ✓
- When toggle is ON but input is cleared → `formData` is `undefined` →
`isEnabled` is `true` (input visible) ✓

## Test plan

- [x] Verified INTEGER inputs now render correctly with `type="number"`
- [x] Verified clearing a number input keeps the field visible
- [x] Verified toggling the nullable switch still works correctly

Fixes #11594

🤖 AI-assisted development disclaimer: This PR was developed with
assistance from Claude Code.

---------

Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Abhimanyu Yadav <122007096+Abhi1992002@users.noreply.github.com>
2026-01-05 16:10:47 +00:00
Ubbe
bdba0033de refactor(frontend): move NodeInput files (#11695)
## Changes 🏗️

Move the `<NodeInput />` component next to the old builder code where it
is used.

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run app locally and click around, E2E is fine
2026-01-05 10:29:12 +00:00
Abhimanyu Yadav
b87c64ce38 feat(frontend): Add delete key bindings to ReactFlow editor
(#11693)

Issues fixed by this PR
- https://github.com/Significant-Gravitas/AutoGPT/issues/11688
- https://github.com/Significant-Gravitas/AutoGPT/issues/11687

### **Changes 🏗️**

Added keyboard delete functionality to the ReactFlow editor by enabling
the `deleteKeyCode` prop with both "Backspace" and "Delete" keys. This
allows users to delete selected nodes and edges using standard keyboard
shortcuts, improving the editing experience.

**Modified:**

- `Flow.tsx`: Added `deleteKeyCode={["Backspace", "Delete"]}` prop to
the ReactFlow component to enable deletion of selected elements via
keyboard

### **Checklist 📋**

#### **For code changes:**

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Select a node in the flow editor and press Delete key to confirm
it deletes
- [x] Select a node in the flow editor and press Backspace key to
confirm it deletes
    - [x] Verify deletion works for multiple selected elements
2026-01-05 10:28:57 +00:00
Ubbe
003affca43 refactor(frontend): fix new builder buttons (#11696)
## Changes 🏗️

<img width="800" height="964" alt="Screenshot 2026-01-05 at 15 26 21"
src="https://github.com/user-attachments/assets/f8c7fc47-894a-4db2-b2f1-62b4d70e8453"
/>

- Adjust the new builder to use the Design System components
- Re-structure imports to match formatting rules
- Small improvement on `use-get-flag`
- Move file which is the main hook

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and check the new buttons look good
2026-01-05 09:09:47 +00:00
Abhimanyu Yadav
290d0d9a9b feat(frontend): add auto-save Draft Recovery feature with IndexedDB persistence
(#11658)

## Summary
Implements an auto-save draft recovery system that persists unsaved flow
builder state across browser sessions, tab closures, and refreshes. When
users return to a flow with unsaved changes, they can choose to restore
or discard the draft via an intuitive recovery popup.



https://github.com/user-attachments/assets/0f77173b-7834-48d2-b7aa-73c6cd2eaff6



## Changes 🏗️

### Core Features
- **Draft Recovery Popup** (`DraftRecoveryPopup.tsx`)
  - Displays amber-themed notification with unsaved changes metadata
  - Shows node count, edge count, and relative time since last save
  - Provides restore and discard actions with tooltips
  - Auto-dismisses on click outside or ESC key

- **Auto-Save System** (`useDraftManager.ts`)
  - Automatically saves draft state every 15 seconds
  - Saves on browser tab close/refresh via `beforeunload`
  - Tracks nodes, edges, graph schemas, node counter, and flow version
  - Smart dirty checking - only saves when actual changes detected
  - Cleans up expired drafts (24-hour TTL)

- **IndexedDB Persistence** (`db.ts`, `draft-service.ts`)
  - Uses Dexie library for reliable client-side storage
- Handles both existing flows (by flowID) and new flows (via temp
session IDs)
- Compares draft state with current state to determine if recovery
needed
  - Automatically clears drafts after successful save

### Integration Changes
- **Flow Editor** (`Flow.tsx`)
  - Integrated `DraftRecoveryPopup` component
  - Passes `isInitialLoadComplete` state for proper timing

- **useFlow Hook** (`useFlow.ts`)
  - Added `isInitialLoadComplete` state to track when flow is ready
  - Ensures draft check happens after initial graph load
  - Resets state on flow/version changes

- **useCopyPaste Hook** (`useCopyPaste.ts`)
  - Refactored to manage keyboard event listeners internally
  - Simplified integration by removing external event handler setup

- **useSaveGraph Hook** (`useSaveGraph.ts`)
  - Clears draft after successful save (both create and update)
  - Removes temp flow ID from session storage on first save

### Dependencies
- Added `dexie@4.2.1` - Modern IndexedDB wrapper for reliable
client-side storage

## Technical Details

**Auto-Save Flow:**
1. User makes changes to nodes/edges
2. Change triggers 15-second debounced save
3. Draft saved to IndexedDB with timestamp
4. On save, current state compared with last saved state
5. Only saves if meaningful changes detected

**Recovery Flow:**
1. User loads flow/refreshes page
2. After initial load completes, check for existing draft
3. Compare draft with current state
4. If different and non-empty, show recovery popup
5. User chooses to restore or discard
6. Draft cleared after either action

**Session Management:**
- Existing flows: Use actual flowID for draft key

### Test Plan 🧪

- [x] Create a new flow with 3+ blocks and connections, wait 15+
seconds, then refresh the page - verify recovery popup appears with
correct counts and restoring works
- [x] Create a flow with blocks, refresh, then click "Discard" button on
recovery popup - verify popup disappears and draft is deleted
- [x] Add blocks to a flow, save successfully - verify draft is cleared
from IndexedDB (check DevTools > Application > IndexedDB)
- [x] Make changes to an existing flow, refresh page - verify recovery
popup shows and restoring preserves all changes correctly
- [x] Verify empty flows (0 nodes) don't trigger recovery popup or save
drafts
2025-12-31 14:49:53 +00:00
Abhimanyu Yadav
fba61c72ed feat(frontend): fix duplicate publish button and improve BuilderActionButton styling
(#11669)

Fixes duplicate "Publish to Marketplace" buttons in the builder by
adding a `showTrigger` prop to control modal trigger visibility.

<img width="296" height="99" alt="Screenshot 2025-12-23 at 8 18 58 AM"
src="https://github.com/user-attachments/assets/d5dbfba8-e854-4c0c-a6b7-da47133ec815"
/>


### Changes 🏗️

**BuilderActionButton.tsx**

- Removed borders on hover and active states for a cleaner visual
appearance
- Added `hover:border-none` and `active:border-none` to maintain
consistent styling during interactions

**PublishToMarketplace.tsx**

- Pass `showTrigger={false}` to `PublishAgentModal` to hide the default
trigger button
- This prevents duplicate buttons when a custom trigger is already
rendered

**PublishAgentModal.tsx**

- Added `showTrigger` prop (defaults to `true`) to conditionally render
the modal trigger
- Allows parent components to control whether the built-in trigger
button should be displayed
- Maintains backward compatibility with existing usage

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verify only one "Publish to Marketplace" button appears in the
builder
- [x] Confirm button hover/active states display correctly without
border artifacts
- [x] Verify modal can still be triggered programmatically without the
trigger button
2025-12-31 09:46:12 +00:00
Nicholas Tindle
79d45a15d0 feat(platform): Deduplicate insufficient funds Discord + email notifications (#11672)
Add Redis-based deduplication for insufficient funds notifications (both
Discord alerts and user emails) when users run out of credits. This
prevents spamming users and the PRODUCT Discord channel with repeated
alerts for the same user+agent combination.

### Changes 🏗️

- **Redis-based deduplication** (`backend/executor/manager.py`):
- Add `INSUFFICIENT_FUNDS_NOTIFIED_PREFIX` constant for Redis key prefix
- Add `INSUFFICIENT_FUNDS_NOTIFIED_TTL_SECONDS` (30 days) as fallback
cleanup
- Implement deduplication in `_handle_insufficient_funds_notif` using
Redis `SET NX`
- Skip both email (`ZERO_BALANCE`) and Discord notifications for
duplicate alerts per user+agent
- Add `clear_insufficient_funds_notifications(user_id)` function to
remove all notification flags for a user

- **Clear flags on credit top-up** (`backend/data/credit.py`):
- Call `clear_insufficient_funds_notifications` in `_top_up_credits`
after successful auto-charge
- Call `clear_insufficient_funds_notifications` in `fulfill_checkout`
after successful manual top-up
- This allows users to receive notifications again if they run out of
funds in the future

- **Comprehensive test coverage**
(`backend/executor/manager_insufficient_funds_test.py`):
  - Test first-time notification sends both email and Discord alert
  - Test duplicate notifications are skipped for same user+agent
  - Test different agents for same user get separate alerts
  - Test clearing notifications removes all keys for a user
  - Test handling when no notification keys exist
- Test notifications still sent when Redis fails (graceful degradation)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] First insufficient funds alert sends both email and Discord
notification
  - [x] Duplicate alerts for same user+agent are skipped
  - [x] Different agents for same user each get their own notification
  - [x] Topping up credits clears notification flags
  - [x] Redis failure gracefully falls back to sending notifications
  - [x] 30-day TTL provides automatic cleanup as fallback
  - [x] Manually test this works with scheduled agents
 

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces Redis-backed deduplication for insufficient-funds alerts
and resets flags on successful credit additions.
> 
> - **Dedup insufficient-funds alerts** in `executor/manager.py` using
Redis `SET NX` with `INSUFFICIENT_FUNDS_NOTIFIED_PREFIX` and 30‑day TTL;
skips duplicate ZERO_BALANCE email + Discord alerts per
`user_id`+`graph_id`, with graceful fallback if Redis fails.
> - **Reset notification flags on credit increases** by adding
`clear_insufficient_funds_notifications(user_id)` and invoking it when
enabling/adding positive `GRANT`/`TOP_UP` transactions in
`data/credit.py`.
> - **Tests** (`executor/manager_insufficient_funds_test.py`):
first-time vs duplicate behavior, per-agent separation, clearing keys
(including no-key and Redis-error cases), and clearing on
`_add_transaction`/`_enable_transaction`.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
1a4413b3a1. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Ubbe <hi@ubbe.dev>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-30 18:10:30 +00:00
Ubbe
66f0d97ca2 fix(frontend): hide better chat link if not enabled (#11648)
## Changes 🏗️

- Make `<Navbar />` a client component so its rendering is more
predictable
- Remove the `useMemo()` for the chat link to prevent the flash...
- Make sure chat is added to the navbar links only after checking the
flag is enabled
- Improve logout with `useTransition`
- Simplify feature flags setup

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Ensures the `Chat` nav item is hidden when the feature flag is off
across desktop and mobile nav.
> 
> - Inline-filters `loggedInLinks` to skip `Chat` when `Flag.CHAT` is
false for both `NavbarLink` rendering and `MobileNavBar` menu items
> - Removes `useMemo`/`linksWithChat` helper; maps directly over
`loggedInLinks` and filters nulls in mobile, keeping icon mapping intact
> - Cleans up unused `useMemo` import
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
79c42d87b4. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2025-12-30 13:21:53 +00:00
Ubbe
5894a8fcdf fix(frontend): use DS Dialog on old builder (#11643)
## Changes 🏗️

Use the Design System `<Dialog />` on the old builder, which supports
long content scrolling ( the current one does not, causing issues in
graphs with many run inputs )...

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added Enhanced Rendering toggle for improved output handling and
display (controlled via feature flag)

* **Improvements**
  * Refined dialog layouts and user interactions
* Enhanced copy-to-clipboard functionality with toast notifications upon
copying

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-30 20:22:57 +07:00
Ubbe
dff8efa35d fix(frontend): favico colour override issue (#11681)
## Changes 🏗️

Sometimes, on Dev, when navigating between pages, the Favico colour
would revert from Green 🟢 (Dev) to Purple 🟣(Default). That's because the
`/marketplace` page had custom code overriding it that I didn't notice
earlier...

I also made it use the Next.js metadata API, so it handles the favicon
correctly across navigations.

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally and test the above
2025-12-30 20:22:32 +07:00
seer-by-sentry[bot]
e26822998f fix: Handle missing or null 'items' key in DataForSEO Related Keywords block (#10989)
### Changes 🏗️

- Modified the DataForSEO Related Keywords block to handle cases where
the 'items' key is missing or has a null value in the API response.
- Ensures that the code gracefully handles these scenarios by defaulting
to an empty list, preventing potential errors. Fixes
[AUTOGPT-SERVER-66D](https://sentry.io/organizations/significant-gravitas/issues/6902944636/).

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
- [x] The DataForSEO API now returns an empty list when there are no
results, preventing the code from attempting to iterate on a null value.

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Strengthens parsing of DataForSEO Labs response to avoid errors when
`items` is missing or null.
> 
> - In `backend/blocks/dataforseo/related_keywords.py` `run()`, sets
`items = first_result.get("items") or []` when `first_result` is a
`dict`, otherwise `[]`, ensuring safe iteration
> - Prevents exceptions and yields empty results when no items are
returned
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
cc465ddbf2. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: seer-by-sentry[bot] <157164994+seer-by-sentry[bot]@users.noreply.github.com>
Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2025-12-26 16:17:24 +00:00
Zamil Majdy
88731b1f76 feat(platform): marketplace update notifications with enhanced publishing workflow (#11630)
## Summary
This PR implements a comprehensive marketplace update notification
system that allows users to discover and update to newer agent versions,
along with enhanced publishing workflows and UI improvements.

<img width="1500" height="533" alt="image"
src="https://github.com/user-attachments/assets/ee331838-d712-4718-b231-1f9ec21bcd8e"
/>

<img width="600" height="610" alt="image"
src="https://github.com/user-attachments/assets/b881a7b8-91a5-460d-a159-f64765b339f1"
/>

<img width="1500" height="416" alt="image"
src="https://github.com/user-attachments/assets/a2d61904-2673-4e44-bcc5-c47d36af7a38"
/>

<img width="1500" height="1015" alt="image"
src="https://github.com/user-attachments/assets/2dd978c7-20cc-4230-977e-9c62157b9f23"
/>


## Core Features

### 🔔 Marketplace Update Notifications
- **Update detection**: Automatically detects when marketplace has newer
agent versions than user's local copy
- **Creator notifications**: Shows banners for creators with unpublished
changes ready to publish
- **Non-creator support**: Enables regular users to discover and update
to newer marketplace versions
- **Version comparison**: Intelligent logic comparing `graph_version` vs
marketplace listing versions

### 📋 Enhanced Publishing Workflow  
- **Builder integration**: Added "Publish to Marketplace" button
directly in the builder actions
- **Unified banner system**: Consistent `MarketplaceBanners` component
across library and marketplace pages
- **Streamlined UX**: Fixed layout issues, improved button placement and
styling
- **Modal improvements**: Fixed thumbnail loading race conditions and
infinite loop bugs

### 📚 Version History & Changelog
- **Inline version history**: Added version changelog directly to
marketplace agent pages
- **Version comparison**: Clear display of available versions with
current version highlighting
- **Update mechanism**: Direct updates using `graph_version` parameter
for accuracy

## Technical Implementation

### Backend Changes
- **Database schema**: Added `agentGraphVersions` and `agentGraphId`
fields to `StoreAgent` model
- **API enhancement**: Updated store endpoints to expose graph version
data for version comparison
- **Data migration**: Fixed agent version field naming from `version` to
`agentGraphVersions`
- **Model updates**: Enhanced `LibraryAgentUpdateRequest` with
`graph_version` field

### Frontend Architecture
- **`useMarketplaceUpdate` hook**: Centralized marketplace update
detection and creator identification
- **`MarketplaceBanners` component**: Unified banner system with proper
vertical layout and styling
- **`AgentVersionChangelog` component**: Version history display for
marketplace pages
- **`PublishToMarketplace` component**: Builder integration with modal
workflow

### Key Bug Fixes
- **Thumbnail loading**: Fixed race condition where images wouldn't load
on first modal open
- **Infinite loops**: Used refs to prevent circular dependencies in
`useThumbnailImages` hook
- **Layout issues**: Fixed banner placement, removed duplicate
breadcrumbs, corrected vertical layout
- **Field naming**: Fixed `agent_version` vs `version` field
inconsistencies across APIs

## Files Changed

### Backend
- `autogpt_platform/backend/backend/server/v2/store/` - Enhanced store
API with graph version data
- `autogpt_platform/backend/backend/server/v2/library/` - Updated
library API models
- `autogpt_platform/backend/migrations/` - Database migrations for
version fields
- `autogpt_platform/backend/schema.prisma` - Schema updates for graph
versions

### Frontend
- `src/app/(platform)/components/MarketplaceBanners/` - New unified
banner component
- `src/app/(platform)/library/agents/[id]/components/` - Enhanced
library views with banners
- `src/app/(platform)/build/components/BuilderActions/` - Added
marketplace publish button
- `src/app/(platform)/marketplace/components/AgentInfo/` - Added inline
version history
- `src/components/contextual/PublishAgentModal/` - Fixed thumbnail
loading and modal workflow

## User Experience Impact
- **Better discovery**: Users automatically notified of newer agent
versions
- **Streamlined publishing**: Direct publish access from builder
interface
- **Reduced friction**: Fixed UI bugs, improved loading states,
consistent design
- **Enhanced transparency**: Inline version history on marketplace pages
- **Creator workflow**: Better notifications for creators with
unpublished changes

## Testing
-  Update banners appear correctly when marketplace has newer versions
-  Creator banners show for users with unpublished changes  
-  Version comparison logic works with graph_version vs marketplace
versions
-  Publish button in builder opens modal correctly with pre-populated
data
-  Thumbnail images load properly on first modal open without infinite
loops
-  Database migrations completed successfully with version field fixes
-  All existing tests updated and passing with new schema changes

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Ubbe <hi@ubbe.dev>
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2025-12-22 11:13:06 +00:00
Abhimanyu Yadav
c3e407ef09 feat(frontend): add hover state to edge delete button in FlowEditor (#11601)
<!-- Clearly explain the need for these changes: -->

The delete button on flow editor edges is always visible, which creates
visual clutter. This change makes the button only appear on hover,
improving the UI while keeping it accessible.

### Changes 🏗️

- Added hover state management using `useState` to track when the edge
delete button is hovered
- Applied opacity transition to the delete button (fades in on hover,
fades out when not hovered)
- Added `onMouseEnter` and `onMouseLeave` handlers to the button to
control hover state
- Used `cn` utility for conditional className management
- Button remains interactive even when `opacity-0` (still clickable for
better UX)

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Hover over an edge in the flow editor and verify the delete button
fades in smoothly
- [x] Move mouse away from edge and verify the delete button fades out
smoothly
- [x] Click the delete button while hovered to verify it still removes
the edge connection
- [x] Test with multiple edges to ensure hover state is independent per
edge
2025-12-22 01:30:58 +00:00
Reinier van der Leer
08a60dcb9b refactor(frontend): Clean up React Query-related code (#11604)
- #11603

### Changes 🏗️

Frontend:
- Make `okData` infer the response data type instead of casting
- Generalize infinite query utilities from `SidebarRunsList/helpers.ts`
  - Move to `@/app/api/helpers` and use wherever possible
- Simplify/replace boilerplate checks and conditions with `okData` in
many places
- Add `useUserTimezone` hook to replace all the boilerplate timezone
queries

Backend:
- Fix response type annotation of `GET
/api/store/graph/{store_listing_version_id}` endpoint
- Fix documentation and error behavior of `GET
/api/review/execution/{graph_exec_id}` endpoint

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI passes
  - [x] Clicking around the app manually -> no obvious issues
  - [x] Test Onboarding step 5 (run)
  - [x] Library runs list loads normally
2025-12-20 22:46:24 +01:00
Reinier van der Leer
de78d062a9 refactor(backend/api): Clean up API file structure (#11629)
We'll soon be needing a more feature-complete external API. To make way
for this, I'm moving some files around so:
- We can more easily create new versions of our external API
- The file structure of our internal API is more homogeneous

These changes are quite opinionated, but IMO in any case they're better
than the chaotic structure we have now.

### Changes 🏗️

- Move `backend/server` -> `backend/api`
- Move `backend/server/routers` + `backend/server/v2` ->
`backend/api/features`
  - Change absolute sibling imports to relative imports
- Move `backend/server/v2/AutoMod` -> `backend/executor/automod`
- Combine `backend/server/routers/analytics_*test.py` ->
`backend/api/features/analytics_test.py`
- Sort OpenAPI spec file

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI tests
  - [x] Clicking around in the app -> no obvious breakage
2025-12-20 20:33:10 +00:00
Zamil Majdy
217e3718d7 feat(platform): implement HITL UI redesign with improved review flow (#11529)
## Summary

• Redesigned Human-in-the-Loop review interface with yellow warning
scheme
• Implemented separate approved_data/rejected_data output pins for
human_in_the_loop block
• Added real-time execution status tracking to legacy flow for review
detection
• Fixed button loading states and improved UI consistency across flows
• Standardized Tailwind CSS usage removing custom values

<img width="1500" alt="image"
src="https://github.com/user-attachments/assets/4ca6dd98-f3c4-41c0-a06b-92b3bca22490"
/>
<img width="1500" alt="image"
src="https://github.com/user-attachments/assets/0afae211-09f0-465e-b477-c3949f13c876"
/>
<img width="1500" alt="image"
src="https://github.com/user-attachments/assets/05d9d1ed-cd40-4c73-92b8-0dab21713ca9"
/>



## Changes Made

### Backend Changes
- Modified `human_in_the_loop.py` block to output separate
`approved_data` and `rejected_data` pins instead of single reviewed_data
with status
- Updated block output schema to support better data flow in graph
builder

### Frontend UI Changes
- Redesigned PendingReviewsList with yellow warning color scheme
(replacing orange)
- Fixed button loading states to show spinner only on clicked button 
- Improved FloatingReviewsPanel layout removing redundant headers
- Added real-time status tracking to legacy flow using useFlowRealtime
hook
- Fixed AgentActivityDropdown text overflow and layout issues
- Enhanced Safe Mode toggle positioning and toast timing
- Standardized all custom Tailwind values to use standard classes

### Design System Updates
- Added yellow design tokens (25, 150, 600) for warning states
- Unified REVIEW status handling across all components
- Improved component composition patterns

## Test Plan
- [x] Verify HITL blocks create separate output pins for
approved/rejected data
- [x] Test review flow works in both new and legacy flow builders
- [x] Confirm button loading states work correctly (only clicked button
shows spinner)
- [x] Validate AgentActivityDropdown properly displays review status
- [x] Check Safe Mode toggle positioning matches old flow
- [x] Ensure real-time status updates work in legacy flow
- [x] Verify yellow warning colors are consistent throughout

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
2025-12-20 15:52:51 +00:00
Ubbe
4a7bc006a8 hotfix(frontend): chat should be disabled by default (#11639)
### Changes 🏗️

Chat should be disabled by default; otherwise, it flashes, and if Launch
Darkly fails to fail, it is dangerous.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run locally with Launch Darkly disabled and test the above
2025-12-18 19:04:13 +01:00
708 changed files with 40478 additions and 17890 deletions

37
.branchlet.json Normal file
View File

@@ -0,0 +1,37 @@
{
"worktreeCopyPatterns": [
".env*",
".vscode/**",
".auth/**",
".claude/**",
"autogpt_platform/.env*",
"autogpt_platform/backend/.env*",
"autogpt_platform/frontend/.env*",
"autogpt_platform/frontend/.auth/**",
"autogpt_platform/db/docker/.env*"
],
"worktreeCopyIgnores": [
"**/node_modules/**",
"**/dist/**",
"**/.git/**",
"**/Thumbs.db",
"**/.DS_Store",
"**/.next/**",
"**/__pycache__/**",
"**/.ruff_cache/**",
"**/.pytest_cache/**",
"**/*.pyc",
"**/playwright-report/**",
"**/logs/**",
"**/site/**"
],
"worktreePathTemplate": "$BASE_PATH.worktree",
"postCreateCmd": [
"cd autogpt_platform/autogpt_libs && poetry install",
"cd autogpt_platform/backend && poetry install && poetry run prisma generate",
"cd autogpt_platform/frontend && pnpm install",
"cd docs && pip install -r requirements.txt"
],
"terminalCommand": "code .",
"deleteBranchWithWorktree": false
}

View File

@@ -16,6 +16,7 @@
!autogpt_platform/backend/poetry.lock
!autogpt_platform/backend/README.md
!autogpt_platform/backend/.env
!autogpt_platform/backend/gen_prisma_types_stub.py
# Platform - Market
!autogpt_platform/market/market/

View File

@@ -74,7 +74,7 @@ jobs:
- name: Generate Prisma Client
working-directory: autogpt_platform/backend
run: poetry run prisma generate
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
- name: Set up Node.js

View File

@@ -90,7 +90,7 @@ jobs:
- name: Generate Prisma Client
working-directory: autogpt_platform/backend
run: poetry run prisma generate
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
- name: Set up Node.js

View File

@@ -72,7 +72,7 @@ jobs:
- name: Generate Prisma Client
working-directory: autogpt_platform/backend
run: poetry run prisma generate
run: poetry run prisma generate && poetry run gen-prisma-stub
# Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
- name: Set up Node.js
@@ -108,6 +108,16 @@ jobs:
# run: pnpm playwright install --with-deps chromium
# Docker setup for development environment
- name: Free up disk space
run: |
# Remove large unused tools to free disk space for Docker builds
sudo rm -rf /usr/share/dotnet
sudo rm -rf /usr/local/lib/android
sudo rm -rf /opt/ghc
sudo rm -rf /opt/hostedtoolcache/CodeQL
sudo docker system prune -af
df -h
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3

View File

@@ -134,7 +134,7 @@ jobs:
run: poetry install
- name: Generate Prisma Client
run: poetry run prisma generate
run: poetry run prisma generate && poetry run gen-prisma-stub
- id: supabase
name: Start Supabase
@@ -176,7 +176,7 @@ jobs:
}
- name: Run Database Migrations
run: poetry run prisma migrate dev --name updates
run: poetry run prisma migrate deploy
env:
DATABASE_URL: ${{ steps.supabase.outputs.DB_URL }}
DIRECT_URL: ${{ steps.supabase.outputs.DB_URL }}

View File

@@ -11,6 +11,7 @@ on:
- ".github/workflows/platform-frontend-ci.yml"
- "autogpt_platform/frontend/**"
merge_group:
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event_name == 'merge_group' && format('merge-queue-{0}', github.ref) || format('{0}-{1}', github.ref, github.event.pull_request.number || github.sha) }}
@@ -151,6 +152,14 @@ jobs:
run: |
cp ../.env.default ../.env
- name: Copy backend .env and set OpenAI API key
run: |
cp ../backend/.env.default ../backend/.env
echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
env:
# Used by E2E test data script to generate embeddings for approved store agents
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
@@ -226,13 +235,25 @@ jobs:
- name: Run Playwright tests
run: pnpm test:no-build
continue-on-error: false
- name: Upload Playwright artifacts
if: failure()
- name: Upload Playwright report
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-report
path: playwright-report
if-no-files-found: ignore
retention-days: 3
- name: Upload Playwright test results
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-test-results
path: test-results
if-no-files-found: ignore
retention-days: 3
- name: Print Final Docker Compose logs
if: always()

View File

@@ -6,12 +6,14 @@ start-core:
# Stop core services
stop-core:
docker compose stop deps
docker compose stop
reset-db:
docker compose stop db
rm -rf db/docker/volumes/db/data
cd backend && poetry run prisma migrate deploy
cd backend && poetry run prisma generate
cd backend && poetry run gen-prisma-stub
# View logs for core services
logs-core:
@@ -33,6 +35,7 @@ init-env:
migrate:
cd backend && poetry run prisma migrate deploy
cd backend && poetry run prisma generate
cd backend && poetry run gen-prisma-stub
run-backend:
cd backend && poetry run app
@@ -58,4 +61,4 @@ help:
@echo " run-backend - Run the backend FastAPI server"
@echo " run-frontend - Run the frontend Next.js development server"
@echo " test-data - Run the test data creator"
@echo " load-store-agents - Load store agents from agents/ folder into test database"
@echo " load-store-agents - Load store agents from agents/ folder into test database"

View File

@@ -1,29 +1,25 @@
from fastapi import FastAPI
from fastapi.openapi.utils import get_openapi
from .jwt_utils import bearer_jwt_auth
def add_auth_responses_to_openapi(app: FastAPI) -> None:
"""
Set up custom OpenAPI schema generation that adds 401 responses
Patch a FastAPI instance's `openapi()` method to add 401 responses
to all authenticated endpoints.
This is needed when using HTTPBearer with auto_error=False to get proper
401 responses instead of 403, but FastAPI only automatically adds security
responses when auto_error=True.
"""
# Wrap current method to allow stacking OpenAPI schema modifiers like this
wrapped_openapi = app.openapi
def custom_openapi():
if app.openapi_schema:
return app.openapi_schema
openapi_schema = get_openapi(
title=app.title,
version=app.version,
description=app.description,
routes=app.routes,
)
openapi_schema = wrapped_openapi()
# Add 401 response to all endpoints that have security requirements
for path, methods in openapi_schema["paths"].items():

View File

@@ -58,6 +58,13 @@ V0_API_KEY=
OPEN_ROUTER_API_KEY=
NVIDIA_API_KEY=
# Langfuse Prompt Management
# Used for managing the CoPilot system prompt externally
# Get credentials from https://cloud.langfuse.com or your self-hosted instance
LANGFUSE_PUBLIC_KEY=
LANGFUSE_SECRET_KEY=
LANGFUSE_HOST=https://cloud.langfuse.com
# OAuth Credentials
# For the OAuth callback URL, use <your_frontend_url>/auth/integrations/oauth_callback,
# e.g. http://localhost:3000/auth/integrations/oauth_callback

View File

@@ -18,3 +18,4 @@ load-tests/results/
load-tests/*.json
load-tests/*.log
load-tests/node_modules/*
migrations/*/rollback*.sql

View File

@@ -48,7 +48,8 @@ RUN poetry install --no-ansi --no-root
# Generate Prisma client
COPY autogpt_platform/backend/schema.prisma ./
COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/partial_types.py
RUN poetry run prisma generate
COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
RUN poetry run prisma generate && poetry run gen-prisma-stub
FROM debian:13-slim AS server_dependencies

View File

@@ -108,7 +108,7 @@ import fastapi.testclient
import pytest
from pytest_snapshot.plugin import Snapshot
from backend.server.v2.myroute import router
from backend.api.features.myroute import router
app = fastapi.FastAPI()
app.include_router(router)
@@ -149,7 +149,7 @@ These provide the easiest way to set up authentication mocking in test modules:
import fastapi
import fastapi.testclient
import pytest
from backend.server.v2.myroute import router
from backend.api.features.myroute import router
app = fastapi.FastAPI()
app.include_router(router)

View File

@@ -3,12 +3,12 @@ from typing import Dict, Set
from fastapi import WebSocket
from backend.api.model import NotificationPayload, WSMessage, WSMethod
from backend.data.execution import (
ExecutionEventType,
GraphExecutionEvent,
NodeExecutionEvent,
)
from backend.server.model import NotificationPayload, WSMessage, WSMethod
_EVENT_TYPE_TO_METHOD_MAP: dict[ExecutionEventType, WSMethod] = {
ExecutionEventType.GRAPH_EXEC_UPDATE: WSMethod.GRAPH_EXECUTION_EVENT,

View File

@@ -4,13 +4,13 @@ from unittest.mock import AsyncMock
import pytest
from fastapi import WebSocket
from backend.api.conn_manager import ConnectionManager
from backend.api.model import NotificationPayload, WSMessage, WSMethod
from backend.data.execution import (
ExecutionStatus,
GraphExecutionEvent,
NodeExecutionEvent,
)
from backend.server.conn_manager import ConnectionManager
from backend.server.model import NotificationPayload, WSMessage, WSMethod
@pytest.fixture

View File

@@ -0,0 +1,25 @@
from fastapi import FastAPI
from backend.api.middleware.security import SecurityHeadersMiddleware
from backend.monitoring.instrumentation import instrument_fastapi
from .v1.routes import v1_router
external_api = FastAPI(
title="AutoGPT External API",
description="External API for AutoGPT integrations",
docs_url="/docs",
version="1.0",
)
external_api.add_middleware(SecurityHeadersMiddleware)
external_api.include_router(v1_router, prefix="/v1")
# Add Prometheus instrumentation
instrument_fastapi(
external_api,
service_name="external-api",
expose_endpoint=True,
endpoint="/metrics",
include_in_schema=True,
)

View File

@@ -16,6 +16,8 @@ from fastapi import APIRouter, Body, HTTPException, Path, Security, status
from prisma.enums import APIKeyPermission
from pydantic import BaseModel, Field, SecretStr
from backend.api.external.middleware import require_permission
from backend.api.features.integrations.models import get_all_provider_names
from backend.data.auth.base import APIAuthorizationInfo
from backend.data.model import (
APIKeyCredentials,
@@ -28,8 +30,6 @@ from backend.data.model import (
from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.integrations.oauth import CREDENTIALS_BY_PROVIDER, HANDLERS_BY_NAME
from backend.integrations.providers import ProviderName
from backend.server.external.middleware import require_permission
from backend.server.integrations.models import get_all_provider_names
from backend.util.settings import Settings
if TYPE_CHECKING:

View File

@@ -8,23 +8,29 @@ from prisma.enums import AgentExecutionStatus, APIKeyPermission
from pydantic import BaseModel, Field
from typing_extensions import TypedDict
import backend.api.features.store.cache as store_cache
import backend.api.features.store.model as store_model
import backend.data.block
import backend.server.v2.store.cache as store_cache
import backend.server.v2.store.model as store_model
from backend.api.external.middleware import require_permission
from backend.data import execution as execution_db
from backend.data import graph as graph_db
from backend.data import user as user_db
from backend.data.auth.base import APIAuthorizationInfo
from backend.data.block import BlockInput, CompletedBlockOutput
from backend.executor.utils import add_graph_execution
from backend.server.external.middleware import require_permission
from backend.util.settings import Settings
from .integrations import integrations_router
from .tools import tools_router
settings = Settings()
logger = logging.getLogger(__name__)
v1_router = APIRouter()
v1_router.include_router(integrations_router)
v1_router.include_router(tools_router)
class UserInfoResponse(BaseModel):
id: str

View File

@@ -14,11 +14,11 @@ from fastapi import APIRouter, Security
from prisma.enums import APIKeyPermission
from pydantic import BaseModel, Field
from backend.api.external.middleware import require_permission
from backend.api.features.chat.model import ChatSession
from backend.api.features.chat.tools import find_agent_tool, run_agent_tool
from backend.api.features.chat.tools.models import ToolResponseBase
from backend.data.auth.base import APIAuthorizationInfo
from backend.server.external.middleware import require_permission
from backend.server.v2.chat.model import ChatSession
from backend.server.v2.chat.tools import find_agent_tool, run_agent_tool
from backend.server.v2.chat.tools.models import ToolResponseBase
logger = logging.getLogger(__name__)
@@ -70,7 +70,7 @@ class RunAgentRequest(BaseModel):
)
def _create_ephemeral_session(user_id: str | None) -> ChatSession:
def _create_ephemeral_session(user_id: str) -> ChatSession:
"""Create an ephemeral session for stateless API requests."""
return ChatSession.new(user_id)

View File

@@ -6,9 +6,10 @@ from fastapi import APIRouter, Body, Security
from prisma.enums import CreditTransactionType
from backend.data.credit import admin_get_user_history, get_user_credit_model
from backend.server.v2.admin.model import AddUserCreditsResponse, UserHistoryResponse
from backend.util.json import SafeJson
from .model import AddUserCreditsResponse, UserHistoryResponse
logger = logging.getLogger(__name__)

View File

@@ -9,14 +9,15 @@ import pytest_mock
from autogpt_libs.auth.jwt_utils import get_jwt_payload
from pytest_snapshot.plugin import Snapshot
import backend.server.v2.admin.credit_admin_routes as credit_admin_routes
import backend.server.v2.admin.model as admin_model
from backend.data.model import UserTransaction
from backend.util.json import SafeJson
from backend.util.models import Pagination
from .credit_admin_routes import router as credit_admin_router
from .model import UserHistoryResponse
app = fastapi.FastAPI()
app.include_router(credit_admin_routes.router)
app.include_router(credit_admin_router)
client = fastapi.testclient.TestClient(app)
@@ -30,7 +31,7 @@ def setup_app_admin_auth(mock_jwt_admin):
def test_add_user_credits_success(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
configured_snapshot: Snapshot,
admin_user_id: str,
target_user_id: str,
@@ -42,7 +43,7 @@ def test_add_user_credits_success(
return_value=(1500, "transaction-123-uuid")
)
mocker.patch(
"backend.server.v2.admin.credit_admin_routes.get_user_credit_model",
"backend.api.features.admin.credit_admin_routes.get_user_credit_model",
return_value=mock_credit_model,
)
@@ -84,7 +85,7 @@ def test_add_user_credits_success(
def test_add_user_credits_negative_amount(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
snapshot: Snapshot,
) -> None:
"""Test credit deduction by admin (negative amount)"""
@@ -94,7 +95,7 @@ def test_add_user_credits_negative_amount(
return_value=(200, "transaction-456-uuid")
)
mocker.patch(
"backend.server.v2.admin.credit_admin_routes.get_user_credit_model",
"backend.api.features.admin.credit_admin_routes.get_user_credit_model",
return_value=mock_credit_model,
)
@@ -119,12 +120,12 @@ def test_add_user_credits_negative_amount(
def test_get_user_history_success(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
snapshot: Snapshot,
) -> None:
"""Test successful retrieval of user credit history"""
# Mock the admin_get_user_history function
mock_history_response = admin_model.UserHistoryResponse(
mock_history_response = UserHistoryResponse(
history=[
UserTransaction(
user_id="user-1",
@@ -150,7 +151,7 @@ def test_get_user_history_success(
)
mocker.patch(
"backend.server.v2.admin.credit_admin_routes.admin_get_user_history",
"backend.api.features.admin.credit_admin_routes.admin_get_user_history",
return_value=mock_history_response,
)
@@ -170,12 +171,12 @@ def test_get_user_history_success(
def test_get_user_history_with_filters(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
snapshot: Snapshot,
) -> None:
"""Test user credit history with search and filter parameters"""
# Mock the admin_get_user_history function
mock_history_response = admin_model.UserHistoryResponse(
mock_history_response = UserHistoryResponse(
history=[
UserTransaction(
user_id="user-3",
@@ -194,7 +195,7 @@ def test_get_user_history_with_filters(
)
mock_get_history = mocker.patch(
"backend.server.v2.admin.credit_admin_routes.admin_get_user_history",
"backend.api.features.admin.credit_admin_routes.admin_get_user_history",
return_value=mock_history_response,
)
@@ -230,12 +231,12 @@ def test_get_user_history_with_filters(
def test_get_user_history_empty_results(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
snapshot: Snapshot,
) -> None:
"""Test user credit history with no results"""
# Mock empty history response
mock_history_response = admin_model.UserHistoryResponse(
mock_history_response = UserHistoryResponse(
history=[],
pagination=Pagination(
total_items=0,
@@ -246,7 +247,7 @@ def test_get_user_history_empty_results(
)
mocker.patch(
"backend.server.v2.admin.credit_admin_routes.admin_get_user_history",
"backend.api.features.admin.credit_admin_routes.admin_get_user_history",
return_value=mock_history_response,
)

View File

@@ -7,9 +7,9 @@ import fastapi
import fastapi.responses
import prisma.enums
import backend.server.v2.store.cache as store_cache
import backend.server.v2.store.db
import backend.server.v2.store.model
import backend.api.features.store.cache as store_cache
import backend.api.features.store.db as store_db
import backend.api.features.store.model as store_model
import backend.util.json
logger = logging.getLogger(__name__)
@@ -24,7 +24,7 @@ router = fastapi.APIRouter(
@router.get(
"/listings",
summary="Get Admin Listings History",
response_model=backend.server.v2.store.model.StoreListingsWithVersionsResponse,
response_model=store_model.StoreListingsWithVersionsResponse,
)
async def get_admin_listings_with_versions(
status: typing.Optional[prisma.enums.SubmissionStatus] = None,
@@ -48,7 +48,7 @@ async def get_admin_listings_with_versions(
StoreListingsWithVersionsResponse with listings and their versions
"""
try:
listings = await backend.server.v2.store.db.get_admin_listings_with_versions(
listings = await store_db.get_admin_listings_with_versions(
status=status,
search_query=search,
page=page,
@@ -68,11 +68,11 @@ async def get_admin_listings_with_versions(
@router.post(
"/submissions/{store_listing_version_id}/review",
summary="Review Store Submission",
response_model=backend.server.v2.store.model.StoreSubmission,
response_model=store_model.StoreSubmission,
)
async def review_submission(
store_listing_version_id: str,
request: backend.server.v2.store.model.ReviewSubmissionRequest,
request: store_model.ReviewSubmissionRequest,
user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
):
"""
@@ -87,12 +87,10 @@ async def review_submission(
StoreSubmission with updated review information
"""
try:
already_approved = (
await backend.server.v2.store.db.check_submission_already_approved(
store_listing_version_id=store_listing_version_id,
)
already_approved = await store_db.check_submission_already_approved(
store_listing_version_id=store_listing_version_id,
)
submission = await backend.server.v2.store.db.review_store_submission(
submission = await store_db.review_store_submission(
store_listing_version_id=store_listing_version_id,
is_approved=request.is_approved,
external_comments=request.comments,
@@ -136,7 +134,7 @@ async def admin_download_agent_file(
Raises:
HTTPException: If the agent is not found or an unexpected error occurs.
"""
graph_data = await backend.server.v2.store.db.get_agent_as_admin(
graph_data = await store_db.get_agent_as_admin(
user_id=user_id,
store_listing_version_id=store_listing_version_id,
)

View File

@@ -6,10 +6,11 @@ from typing import Annotated
import fastapi
import pydantic
from autogpt_libs.auth import get_user_id
from autogpt_libs.auth.dependencies import requires_user
import backend.data.analytics
router = fastapi.APIRouter()
router = fastapi.APIRouter(dependencies=[fastapi.Security(requires_user)])
logger = logging.getLogger(__name__)

View File

@@ -0,0 +1,340 @@
"""Tests for analytics API endpoints."""
import json
from unittest.mock import AsyncMock, Mock
import fastapi
import fastapi.testclient
import pytest
import pytest_mock
from pytest_snapshot.plugin import Snapshot
from .analytics import router as analytics_router
app = fastapi.FastAPI()
app.include_router(analytics_router)
client = fastapi.testclient.TestClient(app)
@pytest.fixture(autouse=True)
def setup_app_auth(mock_jwt_user):
"""Setup auth overrides for all tests in this module."""
from autogpt_libs.auth.jwt_utils import get_jwt_payload
app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
yield
app.dependency_overrides.clear()
# =============================================================================
# /log_raw_metric endpoint tests
# =============================================================================
def test_log_raw_metric_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
test_user_id: str,
) -> None:
"""Test successful raw metric logging."""
mock_result = Mock(id="metric-123-uuid")
mock_log_metric = mocker.patch(
"backend.data.analytics.log_raw_metric",
new_callable=AsyncMock,
return_value=mock_result,
)
request_data = {
"metric_name": "page_load_time",
"metric_value": 2.5,
"data_string": "/dashboard",
}
response = client.post("/log_raw_metric", json=request_data)
assert response.status_code == 200, f"Unexpected response: {response.text}"
assert response.json() == "metric-123-uuid"
mock_log_metric.assert_called_once_with(
user_id=test_user_id,
metric_name="page_load_time",
metric_value=2.5,
data_string="/dashboard",
)
configured_snapshot.assert_match(
json.dumps({"metric_id": response.json()}, indent=2, sort_keys=True),
"analytics_log_metric_success",
)
@pytest.mark.parametrize(
"metric_value,metric_name,data_string,test_id",
[
(100, "api_calls_count", "external_api", "integer_value"),
(0, "error_count", "no_errors", "zero_value"),
(-5.2, "temperature_delta", "cooling", "negative_value"),
(1.23456789, "precision_test", "float_precision", "float_precision"),
(999999999, "large_number", "max_value", "large_number"),
(0.0000001, "tiny_number", "min_value", "tiny_number"),
],
)
def test_log_raw_metric_various_values(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
metric_value: float,
metric_name: str,
data_string: str,
test_id: str,
) -> None:
"""Test raw metric logging with various metric values."""
mock_result = Mock(id=f"metric-{test_id}-uuid")
mocker.patch(
"backend.data.analytics.log_raw_metric",
new_callable=AsyncMock,
return_value=mock_result,
)
request_data = {
"metric_name": metric_name,
"metric_value": metric_value,
"data_string": data_string,
}
response = client.post("/log_raw_metric", json=request_data)
assert response.status_code == 200, f"Failed for {test_id}: {response.text}"
configured_snapshot.assert_match(
json.dumps(
{"metric_id": response.json(), "test_case": test_id},
indent=2,
sort_keys=True,
),
f"analytics_metric_{test_id}",
)
@pytest.mark.parametrize(
"invalid_data,expected_error",
[
({}, "Field required"),
({"metric_name": "test"}, "Field required"),
(
{"metric_name": "test", "metric_value": "not_a_number", "data_string": "x"},
"Input should be a valid number",
),
(
{"metric_name": "", "metric_value": 1.0, "data_string": "test"},
"String should have at least 1 character",
),
(
{"metric_name": "test", "metric_value": 1.0, "data_string": ""},
"String should have at least 1 character",
),
],
ids=[
"empty_request",
"missing_metric_value_and_data_string",
"invalid_metric_value_type",
"empty_metric_name",
"empty_data_string",
],
)
def test_log_raw_metric_validation_errors(
invalid_data: dict,
expected_error: str,
) -> None:
"""Test validation errors for invalid metric requests."""
response = client.post("/log_raw_metric", json=invalid_data)
assert response.status_code == 422
error_detail = response.json()
assert "detail" in error_detail, f"Missing 'detail' in error: {error_detail}"
error_text = json.dumps(error_detail)
assert (
expected_error in error_text
), f"Expected '{expected_error}' in error response: {error_text}"
def test_log_raw_metric_service_error(
mocker: pytest_mock.MockFixture,
test_user_id: str,
) -> None:
"""Test error handling when analytics service fails."""
mocker.patch(
"backend.data.analytics.log_raw_metric",
new_callable=AsyncMock,
side_effect=Exception("Database connection failed"),
)
request_data = {
"metric_name": "test_metric",
"metric_value": 1.0,
"data_string": "test",
}
response = client.post("/log_raw_metric", json=request_data)
assert response.status_code == 500
error_detail = response.json()["detail"]
assert "Database connection failed" in error_detail["message"]
assert "hint" in error_detail
# =============================================================================
# /log_raw_analytics endpoint tests
# =============================================================================
def test_log_raw_analytics_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
test_user_id: str,
) -> None:
"""Test successful raw analytics logging."""
mock_result = Mock(id="analytics-789-uuid")
mock_log_analytics = mocker.patch(
"backend.data.analytics.log_raw_analytics",
new_callable=AsyncMock,
return_value=mock_result,
)
request_data = {
"type": "user_action",
"data": {
"action": "button_click",
"button_id": "submit_form",
"timestamp": "2023-01-01T00:00:00Z",
"metadata": {"form_type": "registration", "fields_filled": 5},
},
"data_index": "button_click_submit_form",
}
response = client.post("/log_raw_analytics", json=request_data)
assert response.status_code == 200, f"Unexpected response: {response.text}"
assert response.json() == "analytics-789-uuid"
mock_log_analytics.assert_called_once_with(
test_user_id,
"user_action",
request_data["data"],
"button_click_submit_form",
)
configured_snapshot.assert_match(
json.dumps({"analytics_id": response.json()}, indent=2, sort_keys=True),
"analytics_log_analytics_success",
)
def test_log_raw_analytics_complex_data(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
) -> None:
"""Test raw analytics logging with complex nested data structures."""
mock_result = Mock(id="analytics-complex-uuid")
mocker.patch(
"backend.data.analytics.log_raw_analytics",
new_callable=AsyncMock,
return_value=mock_result,
)
request_data = {
"type": "agent_execution",
"data": {
"agent_id": "agent_123",
"execution_id": "exec_456",
"status": "completed",
"duration_ms": 3500,
"nodes_executed": 15,
"blocks_used": [
{"block_id": "llm_block", "count": 3},
{"block_id": "http_block", "count": 5},
{"block_id": "code_block", "count": 2},
],
"errors": [],
"metadata": {
"trigger": "manual",
"user_tier": "premium",
"environment": "production",
},
},
"data_index": "agent_123_exec_456",
}
response = client.post("/log_raw_analytics", json=request_data)
assert response.status_code == 200
configured_snapshot.assert_match(
json.dumps(
{"analytics_id": response.json(), "logged_data": request_data["data"]},
indent=2,
sort_keys=True,
),
"analytics_log_analytics_complex_data",
)
@pytest.mark.parametrize(
"invalid_data,expected_error",
[
({}, "Field required"),
({"type": "test"}, "Field required"),
(
{"type": "test", "data": "not_a_dict", "data_index": "test"},
"Input should be a valid dictionary",
),
({"type": "test", "data": {"key": "value"}}, "Field required"),
],
ids=[
"empty_request",
"missing_data_and_data_index",
"invalid_data_type",
"missing_data_index",
],
)
def test_log_raw_analytics_validation_errors(
invalid_data: dict,
expected_error: str,
) -> None:
"""Test validation errors for invalid analytics requests."""
response = client.post("/log_raw_analytics", json=invalid_data)
assert response.status_code == 422
error_detail = response.json()
assert "detail" in error_detail, f"Missing 'detail' in error: {error_detail}"
error_text = json.dumps(error_detail)
assert (
expected_error in error_text
), f"Expected '{expected_error}' in error response: {error_text}"
def test_log_raw_analytics_service_error(
mocker: pytest_mock.MockFixture,
test_user_id: str,
) -> None:
"""Test error handling when analytics service fails."""
mocker.patch(
"backend.data.analytics.log_raw_analytics",
new_callable=AsyncMock,
side_effect=Exception("Analytics DB unreachable"),
)
request_data = {
"type": "test_event",
"data": {"key": "value"},
"data_index": "test_index",
}
response = client.post("/log_raw_analytics", json=request_data)
assert response.status_code == 500
error_detail = response.json()["detail"]
assert "Analytics DB unreachable" in error_detail["message"]
assert "hint" in error_detail

View File

@@ -6,17 +6,20 @@ from typing import Sequence
import prisma
import backend.api.features.library.db as library_db
import backend.api.features.library.model as library_model
import backend.api.features.store.db as store_db
import backend.api.features.store.model as store_model
import backend.data.block
import backend.server.v2.library.db as library_db
import backend.server.v2.library.model as library_model
import backend.server.v2.store.db as store_db
import backend.server.v2.store.model as store_model
from backend.blocks import load_all_blocks
from backend.blocks.llm import LlmModel
from backend.data.block import AnyBlockSchema, BlockCategory, BlockInfo, BlockSchema
from backend.data.db import query_raw_with_schema
from backend.integrations.providers import ProviderName
from backend.server.v2.builder.model import (
from backend.util.cache import cached
from backend.util.models import Pagination
from .model import (
BlockCategoryResponse,
BlockResponse,
BlockType,
@@ -26,8 +29,6 @@ from backend.server.v2.builder.model import (
ProviderResponse,
SearchEntry,
)
from backend.util.cache import cached
from backend.util.models import Pagination
logger = logging.getLogger(__name__)
llm_models = [name.name.lower().replace("_", " ") for name in LlmModel]

View File

@@ -2,8 +2,8 @@ from typing import Literal
from pydantic import BaseModel
import backend.server.v2.library.model as library_model
import backend.server.v2.store.model as store_model
import backend.api.features.library.model as library_model
import backend.api.features.store.model as store_model
from backend.data.block import BlockInfo
from backend.integrations.providers import ProviderName
from backend.util.models import Pagination

View File

@@ -4,11 +4,12 @@ from typing import Annotated, Sequence
import fastapi
from autogpt_libs.auth.dependencies import get_user_id, requires_user
import backend.server.v2.builder.db as builder_db
import backend.server.v2.builder.model as builder_model
from backend.integrations.providers import ProviderName
from backend.util.models import Pagination
from . import db as builder_db
from . import model as builder_model
logger = logging.getLogger(__name__)
router = fastapi.APIRouter(

View File

@@ -1,7 +1,6 @@
"""Configuration management for chat system."""
import os
from pathlib import Path
from pydantic import Field, field_validator
from pydantic_settings import BaseSettings
@@ -12,7 +11,11 @@ class ChatConfig(BaseSettings):
# OpenAI API Configuration
model: str = Field(
default="qwen/qwen3-235b-a22b-2507", description="Default model to use"
default="anthropic/claude-opus-4.5", description="Default model to use"
)
title_model: str = Field(
default="openai/gpt-4o-mini",
description="Model to use for generating session titles (should be fast/cheap)",
)
api_key: str | None = Field(default=None, description="OpenAI API key")
base_url: str | None = Field(
@@ -23,12 +26,6 @@ class ChatConfig(BaseSettings):
# Session TTL Configuration - 12 hours
session_ttl: int = Field(default=43200, description="Session TTL in seconds")
# System Prompt Configuration
system_prompt_path: str = Field(
default="prompts/chat_system.md",
description="Path to system prompt file relative to chat module",
)
# Streaming Configuration
max_context_messages: int = Field(
default=50, ge=1, le=200, description="Maximum context messages"
@@ -41,6 +38,13 @@ class ChatConfig(BaseSettings):
default=3, description="Maximum number of agent schedules"
)
# Langfuse Prompt Management Configuration
# Note: Langfuse credentials are in Settings().secrets (settings.py)
langfuse_prompt_name: str = Field(
default="CoPilot Prompt",
description="Name of the prompt in Langfuse to fetch",
)
@field_validator("api_key", mode="before")
@classmethod
def get_api_key(cls, v):
@@ -72,43 +76,11 @@ class ChatConfig(BaseSettings):
v = "https://openrouter.ai/api/v1"
return v
def get_system_prompt(self, **template_vars) -> str:
"""Load and render the system prompt from file.
Args:
**template_vars: Variables to substitute in the template
Returns:
Rendered system prompt string
"""
# Get the path relative to this module
module_dir = Path(__file__).parent
prompt_path = module_dir / self.system_prompt_path
# Check for .j2 extension first (Jinja2 template)
j2_path = Path(str(prompt_path) + ".j2")
if j2_path.exists():
try:
from jinja2 import Template
template = Template(j2_path.read_text())
return template.render(**template_vars)
except ImportError:
# Jinja2 not installed, fall back to reading as plain text
return j2_path.read_text()
# Check for markdown file
if prompt_path.exists():
content = prompt_path.read_text()
# Simple variable substitution if Jinja2 is not available
for key, value in template_vars.items():
placeholder = f"{{{key}}}"
content = content.replace(placeholder, str(value))
return content
raise FileNotFoundError(f"System prompt file not found: {prompt_path}")
# Prompt paths for different contexts
PROMPT_PATHS: dict[str, str] = {
"default": "prompts/chat_system.md",
"onboarding": "prompts/onboarding_system.md",
}
class Config:
"""Pydantic config."""

View File

@@ -0,0 +1,249 @@
"""Database operations for chat sessions."""
import asyncio
import logging
from datetime import UTC, datetime
from typing import Any, cast
from prisma.models import ChatMessage as PrismaChatMessage
from prisma.models import ChatSession as PrismaChatSession
from prisma.types import (
ChatMessageCreateInput,
ChatSessionCreateInput,
ChatSessionUpdateInput,
ChatSessionWhereInput,
)
from backend.data.db import transaction
from backend.util.json import SafeJson
logger = logging.getLogger(__name__)
async def get_chat_session(session_id: str) -> PrismaChatSession | None:
"""Get a chat session by ID from the database."""
session = await PrismaChatSession.prisma().find_unique(
where={"id": session_id},
include={"Messages": True},
)
if session and session.Messages:
# Sort messages by sequence in Python - Prisma Python client doesn't support
# order_by in include clauses (unlike Prisma JS), so we sort after fetching
session.Messages.sort(key=lambda m: m.sequence)
return session
async def create_chat_session(
session_id: str,
user_id: str,
) -> PrismaChatSession:
"""Create a new chat session in the database."""
data = ChatSessionCreateInput(
id=session_id,
userId=user_id,
credentials=SafeJson({}),
successfulAgentRuns=SafeJson({}),
successfulAgentSchedules=SafeJson({}),
)
return await PrismaChatSession.prisma().create(
data=data,
include={"Messages": True},
)
async def update_chat_session(
session_id: str,
credentials: dict[str, Any] | None = None,
successful_agent_runs: dict[str, Any] | None = None,
successful_agent_schedules: dict[str, Any] | None = None,
total_prompt_tokens: int | None = None,
total_completion_tokens: int | None = None,
title: str | None = None,
) -> PrismaChatSession | None:
"""Update a chat session's metadata."""
data: ChatSessionUpdateInput = {"updatedAt": datetime.now(UTC)}
if credentials is not None:
data["credentials"] = SafeJson(credentials)
if successful_agent_runs is not None:
data["successfulAgentRuns"] = SafeJson(successful_agent_runs)
if successful_agent_schedules is not None:
data["successfulAgentSchedules"] = SafeJson(successful_agent_schedules)
if total_prompt_tokens is not None:
data["totalPromptTokens"] = total_prompt_tokens
if total_completion_tokens is not None:
data["totalCompletionTokens"] = total_completion_tokens
if title is not None:
data["title"] = title
session = await PrismaChatSession.prisma().update(
where={"id": session_id},
data=data,
include={"Messages": True},
)
if session and session.Messages:
# Sort in Python - Prisma Python doesn't support order_by in include clauses
session.Messages.sort(key=lambda m: m.sequence)
return session
async def add_chat_message(
session_id: str,
role: str,
sequence: int,
content: str | None = None,
name: str | None = None,
tool_call_id: str | None = None,
refusal: str | None = None,
tool_calls: list[dict[str, Any]] | None = None,
function_call: dict[str, Any] | None = None,
) -> PrismaChatMessage:
"""Add a message to a chat session."""
# Build input dict dynamically rather than using ChatMessageCreateInput directly
# because Prisma's TypedDict validation rejects optional fields set to None.
# We only include fields that have values, then cast at the end.
data: dict[str, Any] = {
"Session": {"connect": {"id": session_id}},
"role": role,
"sequence": sequence,
}
# Add optional string fields
if content is not None:
data["content"] = content
if name is not None:
data["name"] = name
if tool_call_id is not None:
data["toolCallId"] = tool_call_id
if refusal is not None:
data["refusal"] = refusal
# Add optional JSON fields only when they have values
if tool_calls is not None:
data["toolCalls"] = SafeJson(tool_calls)
if function_call is not None:
data["functionCall"] = SafeJson(function_call)
# Run message create and session timestamp update in parallel for lower latency
_, message = await asyncio.gather(
PrismaChatSession.prisma().update(
where={"id": session_id},
data={"updatedAt": datetime.now(UTC)},
),
PrismaChatMessage.prisma().create(data=cast(ChatMessageCreateInput, data)),
)
return message
async def add_chat_messages_batch(
session_id: str,
messages: list[dict[str, Any]],
start_sequence: int,
) -> list[PrismaChatMessage]:
"""Add multiple messages to a chat session in a batch.
Uses a transaction for atomicity - if any message creation fails,
the entire batch is rolled back.
"""
if not messages:
return []
created_messages = []
async with transaction() as tx:
for i, msg in enumerate(messages):
# Build input dict dynamically rather than using ChatMessageCreateInput
# directly because Prisma's TypedDict validation rejects optional fields
# set to None. We only include fields that have values, then cast.
data: dict[str, Any] = {
"Session": {"connect": {"id": session_id}},
"role": msg["role"],
"sequence": start_sequence + i,
}
# Add optional string fields
if msg.get("content") is not None:
data["content"] = msg["content"]
if msg.get("name") is not None:
data["name"] = msg["name"]
if msg.get("tool_call_id") is not None:
data["toolCallId"] = msg["tool_call_id"]
if msg.get("refusal") is not None:
data["refusal"] = msg["refusal"]
# Add optional JSON fields only when they have values
if msg.get("tool_calls") is not None:
data["toolCalls"] = SafeJson(msg["tool_calls"])
if msg.get("function_call") is not None:
data["functionCall"] = SafeJson(msg["function_call"])
created = await PrismaChatMessage.prisma(tx).create(
data=cast(ChatMessageCreateInput, data)
)
created_messages.append(created)
# Update session's updatedAt timestamp within the same transaction.
# Note: Token usage (total_prompt_tokens, total_completion_tokens) is updated
# separately via update_chat_session() after streaming completes.
await PrismaChatSession.prisma(tx).update(
where={"id": session_id},
data={"updatedAt": datetime.now(UTC)},
)
return created_messages
async def get_user_chat_sessions(
user_id: str,
limit: int = 50,
offset: int = 0,
) -> list[PrismaChatSession]:
"""Get chat sessions for a user, ordered by most recent."""
return await PrismaChatSession.prisma().find_many(
where={"userId": user_id},
order={"updatedAt": "desc"},
take=limit,
skip=offset,
)
async def get_user_session_count(user_id: str) -> int:
"""Get the total number of chat sessions for a user."""
return await PrismaChatSession.prisma().count(where={"userId": user_id})
async def delete_chat_session(session_id: str, user_id: str | None = None) -> bool:
"""Delete a chat session and all its messages.
Args:
session_id: The session ID to delete.
user_id: If provided, validates that the session belongs to this user
before deletion. This prevents unauthorized deletion of other
users' sessions.
Returns:
True if deleted successfully, False otherwise.
"""
try:
# Build typed where clause with optional user_id validation
where_clause: ChatSessionWhereInput = {"id": session_id}
if user_id is not None:
where_clause["userId"] = user_id
result = await PrismaChatSession.prisma().delete_many(where=where_clause)
if result == 0:
logger.warning(
f"No session deleted for {session_id} "
f"(user_id validation: {user_id is not None})"
)
return False
return True
except Exception as e:
logger.error(f"Failed to delete chat session {session_id}: {e}")
return False
async def get_chat_session_message_count(session_id: str) -> int:
"""Get the number of messages in a chat session."""
count = await PrismaChatMessage.prisma().count(where={"sessionId": session_id})
return count

View File

@@ -0,0 +1,597 @@
import asyncio
import logging
import uuid
from datetime import UTC, datetime
from typing import Any
from weakref import WeakValueDictionary
from openai.types.chat import (
ChatCompletionAssistantMessageParam,
ChatCompletionDeveloperMessageParam,
ChatCompletionFunctionMessageParam,
ChatCompletionMessageParam,
ChatCompletionSystemMessageParam,
ChatCompletionToolMessageParam,
ChatCompletionUserMessageParam,
)
from openai.types.chat.chat_completion_assistant_message_param import FunctionCall
from openai.types.chat.chat_completion_message_tool_call_param import (
ChatCompletionMessageToolCallParam,
Function,
)
from prisma.models import ChatMessage as PrismaChatMessage
from prisma.models import ChatSession as PrismaChatSession
from pydantic import BaseModel
from backend.data.redis_client import get_redis_async
from backend.util import json
from backend.util.exceptions import DatabaseError, RedisError
from . import db as chat_db
from .config import ChatConfig
logger = logging.getLogger(__name__)
config = ChatConfig()
def _parse_json_field(value: str | dict | list | None, default: Any = None) -> Any:
"""Parse a JSON field that may be stored as string or already parsed."""
if value is None:
return default
if isinstance(value, str):
return json.loads(value)
return value
# Redis cache key prefix for chat sessions
CHAT_SESSION_CACHE_PREFIX = "chat:session:"
def _get_session_cache_key(session_id: str) -> str:
"""Get the Redis cache key for a chat session."""
return f"{CHAT_SESSION_CACHE_PREFIX}{session_id}"
# Session-level locks to prevent race conditions during concurrent upserts.
# Uses WeakValueDictionary to automatically garbage collect locks when no longer referenced,
# preventing unbounded memory growth while maintaining lock semantics for active sessions.
# Invalidation: Locks are auto-removed by GC when no coroutine holds a reference (after
# async with lock: completes). Explicit cleanup also occurs in delete_chat_session().
_session_locks: WeakValueDictionary[str, asyncio.Lock] = WeakValueDictionary()
_session_locks_mutex = asyncio.Lock()
async def _get_session_lock(session_id: str) -> asyncio.Lock:
"""Get or create a lock for a specific session to prevent concurrent upserts.
Uses WeakValueDictionary for automatic cleanup: locks are garbage collected
when no coroutine holds a reference to them, preventing memory leaks from
unbounded growth of session locks.
"""
async with _session_locks_mutex:
lock = _session_locks.get(session_id)
if lock is None:
lock = asyncio.Lock()
_session_locks[session_id] = lock
return lock
class ChatMessage(BaseModel):
role: str
content: str | None = None
name: str | None = None
tool_call_id: str | None = None
refusal: str | None = None
tool_calls: list[dict] | None = None
function_call: dict | None = None
class Usage(BaseModel):
prompt_tokens: int
completion_tokens: int
total_tokens: int
class ChatSession(BaseModel):
session_id: str
user_id: str
title: str | None = None
messages: list[ChatMessage]
usage: list[Usage]
credentials: dict[str, dict] = {} # Map of provider -> credential metadata
started_at: datetime
updated_at: datetime
successful_agent_runs: dict[str, int] = {}
successful_agent_schedules: dict[str, int] = {}
@staticmethod
def new(user_id: str) -> "ChatSession":
return ChatSession(
session_id=str(uuid.uuid4()),
user_id=user_id,
title=None,
messages=[],
usage=[],
credentials={},
started_at=datetime.now(UTC),
updated_at=datetime.now(UTC),
)
@staticmethod
def from_db(
prisma_session: PrismaChatSession,
prisma_messages: list[PrismaChatMessage] | None = None,
) -> "ChatSession":
"""Convert Prisma models to Pydantic ChatSession."""
messages = []
if prisma_messages:
for msg in prisma_messages:
messages.append(
ChatMessage(
role=msg.role,
content=msg.content,
name=msg.name,
tool_call_id=msg.toolCallId,
refusal=msg.refusal,
tool_calls=_parse_json_field(msg.toolCalls),
function_call=_parse_json_field(msg.functionCall),
)
)
# Parse JSON fields from Prisma
credentials = _parse_json_field(prisma_session.credentials, default={})
successful_agent_runs = _parse_json_field(
prisma_session.successfulAgentRuns, default={}
)
successful_agent_schedules = _parse_json_field(
prisma_session.successfulAgentSchedules, default={}
)
# Calculate usage from token counts
usage = []
if prisma_session.totalPromptTokens or prisma_session.totalCompletionTokens:
usage.append(
Usage(
prompt_tokens=prisma_session.totalPromptTokens or 0,
completion_tokens=prisma_session.totalCompletionTokens or 0,
total_tokens=(prisma_session.totalPromptTokens or 0)
+ (prisma_session.totalCompletionTokens or 0),
)
)
return ChatSession(
session_id=prisma_session.id,
user_id=prisma_session.userId,
title=prisma_session.title,
messages=messages,
usage=usage,
credentials=credentials,
started_at=prisma_session.createdAt,
updated_at=prisma_session.updatedAt,
successful_agent_runs=successful_agent_runs,
successful_agent_schedules=successful_agent_schedules,
)
def to_openai_messages(self) -> list[ChatCompletionMessageParam]:
messages = []
for message in self.messages:
if message.role == "developer":
m = ChatCompletionDeveloperMessageParam(
role="developer",
content=message.content or "",
)
if message.name:
m["name"] = message.name
messages.append(m)
elif message.role == "system":
m = ChatCompletionSystemMessageParam(
role="system",
content=message.content or "",
)
if message.name:
m["name"] = message.name
messages.append(m)
elif message.role == "user":
m = ChatCompletionUserMessageParam(
role="user",
content=message.content or "",
)
if message.name:
m["name"] = message.name
messages.append(m)
elif message.role == "assistant":
m = ChatCompletionAssistantMessageParam(
role="assistant",
content=message.content or "",
)
if message.function_call:
m["function_call"] = FunctionCall(
arguments=message.function_call["arguments"],
name=message.function_call["name"],
)
if message.refusal:
m["refusal"] = message.refusal
if message.tool_calls:
t: list[ChatCompletionMessageToolCallParam] = []
for tool_call in message.tool_calls:
# Tool calls are stored with nested structure: {id, type, function: {name, arguments}}
function_data = tool_call.get("function", {})
# Skip tool calls that are missing required fields
if "id" not in tool_call or "name" not in function_data:
logger.warning(
f"Skipping invalid tool call: missing required fields. "
f"Got: {tool_call.keys()}, function keys: {function_data.keys()}"
)
continue
# Arguments are stored as a JSON string
arguments_str = function_data.get("arguments", "{}")
t.append(
ChatCompletionMessageToolCallParam(
id=tool_call["id"],
type="function",
function=Function(
arguments=arguments_str,
name=function_data["name"],
),
)
)
m["tool_calls"] = t
if message.name:
m["name"] = message.name
messages.append(m)
elif message.role == "tool":
messages.append(
ChatCompletionToolMessageParam(
role="tool",
content=message.content or "",
tool_call_id=message.tool_call_id or "",
)
)
elif message.role == "function":
messages.append(
ChatCompletionFunctionMessageParam(
role="function",
content=message.content,
name=message.name or "",
)
)
return messages
async def _get_session_from_cache(session_id: str) -> ChatSession | None:
"""Get a chat session from Redis cache."""
redis_key = _get_session_cache_key(session_id)
async_redis = await get_redis_async()
raw_session: bytes | None = await async_redis.get(redis_key)
if raw_session is None:
return None
try:
session = ChatSession.model_validate_json(raw_session)
logger.info(
f"Loading session {session_id} from cache: "
f"message_count={len(session.messages)}, "
f"roles={[m.role for m in session.messages]}"
)
return session
except Exception as e:
logger.error(f"Failed to deserialize session {session_id}: {e}", exc_info=True)
raise RedisError(f"Corrupted session data for {session_id}") from e
async def _cache_session(session: ChatSession) -> None:
"""Cache a chat session in Redis."""
redis_key = _get_session_cache_key(session.session_id)
async_redis = await get_redis_async()
await async_redis.setex(redis_key, config.session_ttl, session.model_dump_json())
async def _get_session_from_db(session_id: str) -> ChatSession | None:
"""Get a chat session from the database."""
prisma_session = await chat_db.get_chat_session(session_id)
if not prisma_session:
return None
messages = prisma_session.Messages
logger.info(
f"Loading session {session_id} from DB: "
f"has_messages={messages is not None}, "
f"message_count={len(messages) if messages else 0}, "
f"roles={[m.role for m in messages] if messages else []}"
)
return ChatSession.from_db(prisma_session, messages)
async def _save_session_to_db(
session: ChatSession, existing_message_count: int
) -> None:
"""Save or update a chat session in the database."""
# Check if session exists in DB
existing = await chat_db.get_chat_session(session.session_id)
if not existing:
# Create new session
await chat_db.create_chat_session(
session_id=session.session_id,
user_id=session.user_id,
)
existing_message_count = 0
# Calculate total tokens from usage
total_prompt = sum(u.prompt_tokens for u in session.usage)
total_completion = sum(u.completion_tokens for u in session.usage)
# Update session metadata
await chat_db.update_chat_session(
session_id=session.session_id,
credentials=session.credentials,
successful_agent_runs=session.successful_agent_runs,
successful_agent_schedules=session.successful_agent_schedules,
total_prompt_tokens=total_prompt,
total_completion_tokens=total_completion,
)
# Add new messages (only those after existing count)
new_messages = session.messages[existing_message_count:]
if new_messages:
messages_data = []
for msg in new_messages:
messages_data.append(
{
"role": msg.role,
"content": msg.content,
"name": msg.name,
"tool_call_id": msg.tool_call_id,
"refusal": msg.refusal,
"tool_calls": msg.tool_calls,
"function_call": msg.function_call,
}
)
logger.info(
f"Saving {len(new_messages)} new messages to DB for session {session.session_id}: "
f"roles={[m['role'] for m in messages_data]}, "
f"start_sequence={existing_message_count}"
)
await chat_db.add_chat_messages_batch(
session_id=session.session_id,
messages=messages_data,
start_sequence=existing_message_count,
)
async def get_chat_session(
session_id: str,
user_id: str | None = None,
) -> ChatSession | None:
"""Get a chat session by ID.
Checks Redis cache first, falls back to database if not found.
Caches database results back to Redis.
Args:
session_id: The session ID to fetch.
user_id: If provided, validates that the session belongs to this user.
If None, ownership is not validated (admin/system access).
"""
# Try cache first
try:
session = await _get_session_from_cache(session_id)
if session:
# Verify user ownership if user_id was provided for validation
if user_id is not None and session.user_id != user_id:
logger.warning(
f"Session {session_id} user id mismatch: {session.user_id} != {user_id}"
)
return None
return session
except RedisError:
logger.warning(f"Cache error for session {session_id}, trying database")
except Exception as e:
logger.warning(f"Unexpected cache error for session {session_id}: {e}")
# Fall back to database
logger.info(f"Session {session_id} not in cache, checking database")
session = await _get_session_from_db(session_id)
if session is None:
logger.warning(f"Session {session_id} not found in cache or database")
return None
# Verify user ownership if user_id was provided for validation
if user_id is not None and session.user_id != user_id:
logger.warning(
f"Session {session_id} user id mismatch: {session.user_id} != {user_id}"
)
return None
# Cache the session from DB
try:
await _cache_session(session)
logger.info(f"Cached session {session_id} from database")
except Exception as e:
logger.warning(f"Failed to cache session {session_id}: {e}")
return session
async def upsert_chat_session(
session: ChatSession,
) -> ChatSession:
"""Update a chat session in both cache and database.
Uses session-level locking to prevent race conditions when concurrent
operations (e.g., background title update and main stream handler)
attempt to upsert the same session simultaneously.
Raises:
DatabaseError: If the database write fails. The cache is still updated
as a best-effort optimization, but the error is propagated to ensure
callers are aware of the persistence failure.
RedisError: If the cache write fails (after successful DB write).
"""
# Acquire session-specific lock to prevent concurrent upserts
lock = await _get_session_lock(session.session_id)
async with lock:
# Get existing message count from DB for incremental saves
existing_message_count = await chat_db.get_chat_session_message_count(
session.session_id
)
db_error: Exception | None = None
# Save to database (primary storage)
try:
await _save_session_to_db(session, existing_message_count)
except Exception as e:
logger.error(
f"Failed to save session {session.session_id} to database: {e}"
)
db_error = e
# Save to cache (best-effort, even if DB failed)
try:
await _cache_session(session)
except Exception as e:
# If DB succeeded but cache failed, raise cache error
if db_error is None:
raise RedisError(
f"Failed to persist chat session {session.session_id} to Redis: {e}"
) from e
# If both failed, log cache error but raise DB error (more critical)
logger.warning(
f"Cache write also failed for session {session.session_id}: {e}"
)
# Propagate DB error after attempting cache (prevents data loss)
if db_error is not None:
raise DatabaseError(
f"Failed to persist chat session {session.session_id} to database"
) from db_error
return session
async def create_chat_session(user_id: str) -> ChatSession:
"""Create a new chat session and persist it.
Raises:
DatabaseError: If the database write fails. We fail fast to ensure
callers never receive a non-persisted session that only exists
in cache (which would be lost when the cache expires).
"""
session = ChatSession.new(user_id)
# Create in database first - fail fast if this fails
try:
await chat_db.create_chat_session(
session_id=session.session_id,
user_id=user_id,
)
except Exception as e:
logger.error(f"Failed to create session {session.session_id} in database: {e}")
raise DatabaseError(
f"Failed to create chat session {session.session_id} in database"
) from e
# Cache the session (best-effort optimization, DB is source of truth)
try:
await _cache_session(session)
except Exception as e:
logger.warning(f"Failed to cache new session {session.session_id}: {e}")
return session
async def get_user_sessions(
user_id: str,
limit: int = 50,
offset: int = 0,
) -> tuple[list[ChatSession], int]:
"""Get chat sessions for a user from the database with total count.
Returns:
A tuple of (sessions, total_count) where total_count is the overall
number of sessions for the user (not just the current page).
"""
prisma_sessions = await chat_db.get_user_chat_sessions(user_id, limit, offset)
total_count = await chat_db.get_user_session_count(user_id)
sessions = []
for prisma_session in prisma_sessions:
# Convert without messages for listing (lighter weight)
sessions.append(ChatSession.from_db(prisma_session, None))
return sessions, total_count
async def delete_chat_session(session_id: str, user_id: str | None = None) -> bool:
"""Delete a chat session from both cache and database.
Args:
session_id: The session ID to delete.
user_id: If provided, validates that the session belongs to this user
before deletion. This prevents unauthorized deletion.
Returns:
True if deleted successfully, False otherwise.
"""
# Delete from database first (with optional user_id validation)
# This confirms ownership before invalidating cache
deleted = await chat_db.delete_chat_session(session_id, user_id)
if not deleted:
return False
# Only invalidate cache and clean up lock after DB confirms deletion
try:
redis_key = _get_session_cache_key(session_id)
async_redis = await get_redis_async()
await async_redis.delete(redis_key)
except Exception as e:
logger.warning(f"Failed to delete session {session_id} from cache: {e}")
# Clean up session lock (belt-and-suspenders with WeakValueDictionary)
async with _session_locks_mutex:
_session_locks.pop(session_id, None)
return True
async def update_session_title(session_id: str, title: str) -> bool:
"""Update only the title of a chat session.
This is a lightweight operation that doesn't touch messages, avoiding
race conditions with concurrent message updates. Use this for background
title generation instead of upsert_chat_session.
Args:
session_id: The session ID to update.
title: The new title to set.
Returns:
True if updated successfully, False otherwise.
"""
try:
result = await chat_db.update_chat_session(session_id=session_id, title=title)
if result is None:
logger.warning(f"Session {session_id} not found for title update")
return False
# Invalidate cache so next fetch gets updated title
try:
redis_key = _get_session_cache_key(session_id)
async_redis = await get_redis_async()
await async_redis.delete(redis_key)
except Exception as e:
logger.warning(f"Failed to invalidate cache for session {session_id}: {e}")
return True
except Exception as e:
logger.error(f"Failed to update title for session {session_id}: {e}")
return False

View File

@@ -0,0 +1,119 @@
import pytest
from .model import (
ChatMessage,
ChatSession,
Usage,
get_chat_session,
upsert_chat_session,
)
messages = [
ChatMessage(content="Hello, how are you?", role="user"),
ChatMessage(
content="I'm fine, thank you!",
role="assistant",
tool_calls=[
{
"id": "t123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": '{"city": "New York"}',
},
}
],
),
ChatMessage(
content="I'm using the tool to get the weather",
role="tool",
tool_call_id="t123",
),
]
@pytest.mark.asyncio(loop_scope="session")
async def test_chatsession_serialization_deserialization():
s = ChatSession.new(user_id="abc123")
s.messages = messages
s.usage = [Usage(prompt_tokens=100, completion_tokens=200, total_tokens=300)]
serialized = s.model_dump_json()
s2 = ChatSession.model_validate_json(serialized)
assert s2.model_dump() == s.model_dump()
@pytest.mark.asyncio(loop_scope="session")
async def test_chatsession_redis_storage(setup_test_user, test_user_id):
s = ChatSession.new(user_id=test_user_id)
s.messages = messages
s = await upsert_chat_session(s)
s2 = await get_chat_session(
session_id=s.session_id,
user_id=s.user_id,
)
assert s2 == s
@pytest.mark.asyncio(loop_scope="session")
async def test_chatsession_redis_storage_user_id_mismatch(
setup_test_user, test_user_id
):
s = ChatSession.new(user_id=test_user_id)
s.messages = messages
s = await upsert_chat_session(s)
s2 = await get_chat_session(s.session_id, "different_user_id")
assert s2 is None
@pytest.mark.asyncio(loop_scope="session")
async def test_chatsession_db_storage(setup_test_user, test_user_id):
"""Test that messages are correctly saved to and loaded from DB (not cache)."""
from backend.data.redis_client import get_redis_async
# Create session with messages including assistant message
s = ChatSession.new(user_id=test_user_id)
s.messages = messages # Contains user, assistant, and tool messages
assert s.session_id is not None, "Session id is not set"
# Upsert to save to both cache and DB
s = await upsert_chat_session(s)
# Clear the Redis cache to force DB load
redis_key = f"chat:session:{s.session_id}"
async_redis = await get_redis_async()
await async_redis.delete(redis_key)
# Load from DB (cache was cleared)
s2 = await get_chat_session(
session_id=s.session_id,
user_id=s.user_id,
)
assert s2 is not None, "Session not found after loading from DB"
assert len(s2.messages) == len(
s.messages
), f"Message count mismatch: expected {len(s.messages)}, got {len(s2.messages)}"
# Verify all roles are present
roles = [m.role for m in s2.messages]
assert "user" in roles, f"User message missing. Roles found: {roles}"
assert "assistant" in roles, f"Assistant message missing. Roles found: {roles}"
assert "tool" in roles, f"Tool message missing. Roles found: {roles}"
# Verify message content
for orig, loaded in zip(s.messages, s2.messages):
assert orig.role == loaded.role, f"Role mismatch: {orig.role} != {loaded.role}"
assert (
orig.content == loaded.content
), f"Content mismatch for {orig.role}: {orig.content} != {loaded.content}"
if orig.tool_calls:
assert (
loaded.tool_calls is not None
), f"Tool calls missing for {orig.role} message"
assert len(orig.tool_calls) == len(loaded.tool_calls)

View File

@@ -0,0 +1,144 @@
"""
Response models for Vercel AI SDK UI Stream Protocol.
This module implements the AI SDK UI Stream Protocol (v1) for streaming chat responses.
See: https://ai-sdk.dev/docs/ai-sdk-ui/stream-protocol
"""
from enum import Enum
from typing import Any
from pydantic import BaseModel, Field
class ResponseType(str, Enum):
"""Types of streaming responses following AI SDK protocol."""
# Message lifecycle
START = "start"
FINISH = "finish"
# Text streaming
TEXT_START = "text-start"
TEXT_DELTA = "text-delta"
TEXT_END = "text-end"
# Tool interaction
TOOL_INPUT_START = "tool-input-start"
TOOL_INPUT_AVAILABLE = "tool-input-available"
TOOL_OUTPUT_AVAILABLE = "tool-output-available"
# Other
ERROR = "error"
USAGE = "usage"
class StreamBaseResponse(BaseModel):
"""Base response model for all streaming responses."""
type: ResponseType
def to_sse(self) -> str:
"""Convert to SSE format."""
return f"data: {self.model_dump_json()}\n\n"
# ========== Message Lifecycle ==========
class StreamStart(StreamBaseResponse):
"""Start of a new message."""
type: ResponseType = ResponseType.START
messageId: str = Field(..., description="Unique message ID")
class StreamFinish(StreamBaseResponse):
"""End of message/stream."""
type: ResponseType = ResponseType.FINISH
# ========== Text Streaming ==========
class StreamTextStart(StreamBaseResponse):
"""Start of a text block."""
type: ResponseType = ResponseType.TEXT_START
id: str = Field(..., description="Text block ID")
class StreamTextDelta(StreamBaseResponse):
"""Streaming text content delta."""
type: ResponseType = ResponseType.TEXT_DELTA
id: str = Field(..., description="Text block ID")
delta: str = Field(..., description="Text content delta")
class StreamTextEnd(StreamBaseResponse):
"""End of a text block."""
type: ResponseType = ResponseType.TEXT_END
id: str = Field(..., description="Text block ID")
# ========== Tool Interaction ==========
class StreamToolInputStart(StreamBaseResponse):
"""Tool call started notification."""
type: ResponseType = ResponseType.TOOL_INPUT_START
toolCallId: str = Field(..., description="Unique tool call ID")
toolName: str = Field(..., description="Name of the tool being called")
class StreamToolInputAvailable(StreamBaseResponse):
"""Tool input is ready for execution."""
type: ResponseType = ResponseType.TOOL_INPUT_AVAILABLE
toolCallId: str = Field(..., description="Unique tool call ID")
toolName: str = Field(..., description="Name of the tool being called")
input: dict[str, Any] = Field(
default_factory=dict, description="Tool input arguments"
)
class StreamToolOutputAvailable(StreamBaseResponse):
"""Tool execution result."""
type: ResponseType = ResponseType.TOOL_OUTPUT_AVAILABLE
toolCallId: str = Field(..., description="Tool call ID this responds to")
output: str | dict[str, Any] = Field(..., description="Tool execution output")
# Additional fields for internal use (not part of AI SDK spec but useful)
toolName: str | None = Field(
default=None, description="Name of the tool that was executed"
)
success: bool = Field(
default=True, description="Whether the tool execution succeeded"
)
# ========== Other ==========
class StreamUsage(StreamBaseResponse):
"""Token usage statistics."""
type: ResponseType = ResponseType.USAGE
promptTokens: int = Field(..., description="Number of prompt tokens")
completionTokens: int = Field(..., description="Number of completion tokens")
totalTokens: int = Field(..., description="Total number of tokens")
class StreamError(StreamBaseResponse):
"""Error response."""
type: ResponseType = ResponseType.ERROR
errorText: str = Field(..., description="Error message text")
code: str | None = Field(default=None, description="Error code")
details: dict[str, Any] | None = Field(
default=None, description="Additional error details"
)

View File

@@ -0,0 +1,362 @@
"""Chat API routes for chat session management and streaming via SSE."""
import logging
from collections.abc import AsyncGenerator
from typing import Annotated
from autogpt_libs import auth
from fastapi import APIRouter, Depends, Query, Security
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from backend.util.exceptions import NotFoundError
from . import service as chat_service
from .config import ChatConfig
from .model import ChatSession, create_chat_session, get_chat_session, get_user_sessions
config = ChatConfig()
logger = logging.getLogger(__name__)
async def _validate_and_get_session(
session_id: str,
user_id: str | None,
) -> ChatSession:
"""Validate session exists and belongs to user."""
session = await get_chat_session(session_id, user_id)
if not session:
raise NotFoundError(f"Session {session_id} not found.")
return session
router = APIRouter(
tags=["chat"],
)
# ========== Request/Response Models ==========
class StreamChatRequest(BaseModel):
"""Request model for streaming chat with optional context."""
message: str
is_user_message: bool = True
context: dict[str, str] | None = None # {url: str, content: str}
class CreateSessionResponse(BaseModel):
"""Response model containing information on a newly created chat session."""
id: str
created_at: str
user_id: str | None
class SessionDetailResponse(BaseModel):
"""Response model providing complete details for a chat session, including messages."""
id: str
created_at: str
updated_at: str
user_id: str | None
messages: list[dict]
class SessionSummaryResponse(BaseModel):
"""Response model for a session summary (without messages)."""
id: str
created_at: str
updated_at: str
title: str | None = None
class ListSessionsResponse(BaseModel):
"""Response model for listing chat sessions."""
sessions: list[SessionSummaryResponse]
total: int
# ========== Routes ==========
@router.get(
"/sessions",
dependencies=[Security(auth.requires_user)],
)
async def list_sessions(
user_id: Annotated[str, Security(auth.get_user_id)],
limit: int = Query(default=50, ge=1, le=100),
offset: int = Query(default=0, ge=0),
) -> ListSessionsResponse:
"""
List chat sessions for the authenticated user.
Returns a paginated list of chat sessions belonging to the current user,
ordered by most recently updated.
Args:
user_id: The authenticated user's ID.
limit: Maximum number of sessions to return (1-100).
offset: Number of sessions to skip for pagination.
Returns:
ListSessionsResponse: List of session summaries and total count.
"""
sessions, total_count = await get_user_sessions(user_id, limit, offset)
return ListSessionsResponse(
sessions=[
SessionSummaryResponse(
id=session.session_id,
created_at=session.started_at.isoformat(),
updated_at=session.updated_at.isoformat(),
title=session.title,
)
for session in sessions
],
total=total_count,
)
@router.post(
"/sessions",
)
async def create_session(
user_id: Annotated[str, Depends(auth.get_user_id)],
) -> CreateSessionResponse:
"""
Create a new chat session.
Initiates a new chat session for the authenticated user.
Args:
user_id: The authenticated user ID parsed from the JWT (required).
Returns:
CreateSessionResponse: Details of the created session.
"""
logger.info(
f"Creating session with user_id: "
f"...{user_id[-8:] if len(user_id) > 8 else '<redacted>'}"
)
session = await create_chat_session(user_id)
return CreateSessionResponse(
id=session.session_id,
created_at=session.started_at.isoformat(),
user_id=session.user_id,
)
@router.get(
"/sessions/{session_id}",
)
async def get_session(
session_id: str,
user_id: Annotated[str | None, Depends(auth.get_user_id)],
) -> SessionDetailResponse:
"""
Retrieve the details of a specific chat session.
Looks up a chat session by ID for the given user (if authenticated) and returns all session data including messages.
Args:
session_id: The unique identifier for the desired chat session.
user_id: The optional authenticated user ID, or None for anonymous access.
Returns:
SessionDetailResponse: Details for the requested session; raises NotFoundError if not found.
"""
session = await get_chat_session(session_id, user_id)
if not session:
raise NotFoundError(f"Session {session_id} not found")
messages = [message.model_dump() for message in session.messages]
logger.info(
f"Returning session {session_id}: "
f"message_count={len(messages)}, "
f"roles={[m.get('role') for m in messages]}"
)
return SessionDetailResponse(
id=session.session_id,
created_at=session.started_at.isoformat(),
updated_at=session.updated_at.isoformat(),
user_id=session.user_id or None,
messages=messages,
)
@router.post(
"/sessions/{session_id}/stream",
)
async def stream_chat_post(
session_id: str,
request: StreamChatRequest,
user_id: str | None = Depends(auth.get_user_id),
):
"""
Stream chat responses for a session (POST with context support).
Streams the AI/completion responses in real time over Server-Sent Events (SSE), including:
- Text fragments as they are generated
- Tool call UI elements (if invoked)
- Tool execution results
Args:
session_id: The chat session identifier to associate with the streamed messages.
request: Request body containing message, is_user_message, and optional context.
user_id: Optional authenticated user ID.
Returns:
StreamingResponse: SSE-formatted response chunks.
"""
session = await _validate_and_get_session(session_id, user_id)
async def event_generator() -> AsyncGenerator[str, None]:
async for chunk in chat_service.stream_chat_completion(
session_id,
request.message,
is_user_message=request.is_user_message,
user_id=user_id,
session=session, # Pass pre-fetched session to avoid double-fetch
context=request.context,
):
yield chunk.to_sse()
# AI SDK protocol termination
yield "data: [DONE]\n\n"
return StreamingResponse(
event_generator(),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"X-Accel-Buffering": "no", # Disable nginx buffering
"x-vercel-ai-ui-message-stream": "v1", # AI SDK protocol header
},
)
@router.get(
"/sessions/{session_id}/stream",
)
async def stream_chat_get(
session_id: str,
message: Annotated[str, Query(min_length=1, max_length=10000)],
user_id: str | None = Depends(auth.get_user_id),
is_user_message: bool = Query(default=True),
):
"""
Stream chat responses for a session (GET - legacy endpoint).
Streams the AI/completion responses in real time over Server-Sent Events (SSE), including:
- Text fragments as they are generated
- Tool call UI elements (if invoked)
- Tool execution results
Args:
session_id: The chat session identifier to associate with the streamed messages.
message: The user's new message to process.
user_id: Optional authenticated user ID.
is_user_message: Whether the message is a user message.
Returns:
StreamingResponse: SSE-formatted response chunks.
"""
session = await _validate_and_get_session(session_id, user_id)
async def event_generator() -> AsyncGenerator[str, None]:
async for chunk in chat_service.stream_chat_completion(
session_id,
message,
is_user_message=is_user_message,
user_id=user_id,
session=session, # Pass pre-fetched session to avoid double-fetch
):
yield chunk.to_sse()
# AI SDK protocol termination
yield "data: [DONE]\n\n"
return StreamingResponse(
event_generator(),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"X-Accel-Buffering": "no", # Disable nginx buffering
"x-vercel-ai-ui-message-stream": "v1", # AI SDK protocol header
},
)
@router.patch(
"/sessions/{session_id}/assign-user",
dependencies=[Security(auth.requires_user)],
status_code=200,
)
async def session_assign_user(
session_id: str,
user_id: Annotated[str, Security(auth.get_user_id)],
) -> dict:
"""
Assign an authenticated user to a chat session.
Used (typically post-login) to claim an existing anonymous session as the current authenticated user.
Args:
session_id: The identifier for the (previously anonymous) session.
user_id: The authenticated user's ID to associate with the session.
Returns:
dict: Status of the assignment.
"""
await chat_service.assign_user_to_session(session_id, user_id)
return {"status": "ok"}
# ========== Health Check ==========
@router.get("/health", status_code=200)
async def health_check() -> dict:
"""
Health check endpoint for the chat service.
Performs a full cycle test of session creation and retrieval. Should always return healthy
if the service and data layer are operational.
Returns:
dict: A status dictionary indicating health, service name, and API version.
"""
from backend.data.user import get_or_create_user
# Ensure health check user exists (required for FK constraint)
health_check_user_id = "health-check-user"
await get_or_create_user(
{
"sub": health_check_user_id,
"email": "health-check@system.local",
"user_metadata": {"name": "Health Check User"},
}
)
# Create and retrieve session to verify full data layer
session = await create_chat_session(health_check_user_id)
await get_chat_session(session.session_id, health_check_user_id)
return {
"status": "healthy",
"service": "chat",
"version": "0.1.0",
}

View File

@@ -0,0 +1,907 @@
import asyncio
import logging
from collections.abc import AsyncGenerator
from typing import Any
import orjson
from langfuse import Langfuse
from openai import (
APIConnectionError,
APIError,
APIStatusError,
AsyncOpenAI,
RateLimitError,
)
from openai.types.chat import ChatCompletionChunk, ChatCompletionToolParam
from backend.data.understanding import (
format_understanding_for_prompt,
get_business_understanding,
)
from backend.util.exceptions import NotFoundError
from backend.util.settings import Settings
from . import db as chat_db
from .config import ChatConfig
from .model import (
ChatMessage,
ChatSession,
Usage,
get_chat_session,
update_session_title,
upsert_chat_session,
)
from .response_model import (
StreamBaseResponse,
StreamError,
StreamFinish,
StreamStart,
StreamTextDelta,
StreamTextEnd,
StreamTextStart,
StreamToolInputAvailable,
StreamToolInputStart,
StreamToolOutputAvailable,
StreamUsage,
)
from .tools import execute_tool, tools
logger = logging.getLogger(__name__)
config = ChatConfig()
settings = Settings()
client = AsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
# Langfuse client (lazy initialization)
_langfuse_client: Langfuse | None = None
class LangfuseNotConfiguredError(Exception):
"""Raised when Langfuse is required but not configured."""
pass
def _is_langfuse_configured() -> bool:
"""Check if Langfuse credentials are configured."""
return bool(
settings.secrets.langfuse_public_key and settings.secrets.langfuse_secret_key
)
def _get_langfuse_client() -> Langfuse:
"""Get or create the Langfuse client for prompt management and tracing."""
global _langfuse_client
if _langfuse_client is None:
if not _is_langfuse_configured():
raise LangfuseNotConfiguredError(
"Langfuse is not configured. The chat feature requires Langfuse for prompt management. "
"Please set the LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables."
)
_langfuse_client = Langfuse(
public_key=settings.secrets.langfuse_public_key,
secret_key=settings.secrets.langfuse_secret_key,
host=settings.secrets.langfuse_host or "https://cloud.langfuse.com",
)
return _langfuse_client
def _get_environment() -> str:
"""Get the current environment name for Langfuse tagging."""
return settings.config.app_env.value
def _get_langfuse_prompt() -> str:
"""Fetch the latest production prompt from Langfuse.
Returns:
The compiled prompt text from Langfuse.
Raises:
Exception: If Langfuse is unavailable or prompt fetch fails.
"""
try:
langfuse = _get_langfuse_client()
# cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
prompt = langfuse.get_prompt(config.langfuse_prompt_name, cache_ttl_seconds=0)
compiled = prompt.compile()
logger.info(
f"Fetched prompt '{config.langfuse_prompt_name}' from Langfuse "
f"(version: {prompt.version})"
)
return compiled
except Exception as e:
logger.error(f"Failed to fetch prompt from Langfuse: {e}")
raise
async def _is_first_session(user_id: str) -> bool:
"""Check if this is the user's first chat session.
Returns True if the user has 1 or fewer sessions (meaning this is their first).
"""
try:
session_count = await chat_db.get_user_session_count(user_id)
return session_count <= 1
except Exception as e:
logger.warning(f"Failed to check session count for user {user_id}: {e}")
return False # Default to non-onboarding if we can't check
async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
"""Build the full system prompt including business understanding if available.
Args:
user_id: The user ID for fetching business understanding
If "default" and this is the user's first session, will use "onboarding" instead.
Returns:
Tuple of (compiled prompt string, Langfuse prompt object for tracing)
"""
langfuse = _get_langfuse_client()
# cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
prompt = langfuse.get_prompt(config.langfuse_prompt_name, cache_ttl_seconds=0)
# If user is authenticated, try to fetch their business understanding
understanding = None
if user_id:
try:
understanding = await get_business_understanding(user_id)
except Exception as e:
logger.warning(f"Failed to fetch business understanding: {e}")
understanding = None
if understanding:
context = format_understanding_for_prompt(understanding)
else:
context = "This is the first time you are meeting the user. Greet them and introduce them to the platform"
compiled = prompt.compile(users_information=context)
return compiled, prompt
async def _generate_session_title(message: str) -> str | None:
"""Generate a concise title for a chat session based on the first message.
Args:
message: The first user message in the session
Returns:
A short title (3-6 words) or None if generation fails
"""
try:
response = await client.chat.completions.create(
model=config.title_model,
messages=[
{
"role": "system",
"content": (
"Generate a very short title (3-6 words) for a chat conversation "
"based on the user's first message. The title should capture the "
"main topic or intent. Return ONLY the title, no quotes or punctuation."
),
},
{"role": "user", "content": message[:500]}, # Limit input length
],
max_tokens=20,
)
title = response.choices[0].message.content
if title:
# Clean up the title
title = title.strip().strip("\"'")
# Limit length
if len(title) > 50:
title = title[:47] + "..."
return title
return None
except Exception as e:
logger.warning(f"Failed to generate session title: {e}")
return None
async def assign_user_to_session(
session_id: str,
user_id: str,
) -> ChatSession:
"""
Assign a user to a chat session.
"""
session = await get_chat_session(session_id, None)
if not session:
raise NotFoundError(f"Session {session_id} not found")
session.user_id = user_id
return await upsert_chat_session(session)
async def stream_chat_completion(
session_id: str,
message: str | None = None,
is_user_message: bool = True,
user_id: str | None = None,
retry_count: int = 0,
session: ChatSession | None = None,
context: dict[str, str] | None = None, # {url: str, content: str}
) -> AsyncGenerator[StreamBaseResponse, None]:
"""Main entry point for streaming chat completions with database handling.
This function handles all database operations and delegates streaming
to the internal _stream_chat_chunks function.
Args:
session_id: Chat session ID
user_message: User's input message
user_id: User ID for authentication (None for anonymous)
session: Optional pre-loaded session object (for recursive calls to avoid Redis refetch)
Yields:
StreamBaseResponse objects formatted as SSE
Raises:
NotFoundError: If session_id is invalid
ValueError: If max_context_messages is exceeded
"""
logger.info(
f"Streaming chat completion for session {session_id} for message {message} and user id {user_id}. Message is user message: {is_user_message}"
)
# Check if Langfuse is configured - required for chat functionality
if not _is_langfuse_configured():
logger.error("Chat request failed: Langfuse is not configured")
yield StreamError(
errorText="Chat service is not available. Langfuse must be configured "
"with LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables."
)
yield StreamFinish()
return
# Langfuse observations will be created after session is loaded (need messages for input)
# Initialize to None so finally block can safely check and end them
trace = None
generation = None
# Only fetch from Redis if session not provided (initial call)
if session is None:
session = await get_chat_session(session_id, user_id)
logger.info(
f"Fetched session from Redis: {session.session_id if session else 'None'}, "
f"message_count={len(session.messages) if session else 0}"
)
else:
logger.info(
f"Using provided session object: {session.session_id}, "
f"message_count={len(session.messages)}"
)
if not session:
raise NotFoundError(
f"Session {session_id} not found. Please create a new session first."
)
if message:
# Build message content with context if provided
message_content = message
if context and context.get("url") and context.get("content"):
context_text = f"Page URL: {context['url']}\n\nPage Content:\n{context['content']}\n\n---\n\nUser Message: {message}"
message_content = context_text
logger.info(
f"Including page context: URL={context['url']}, content_length={len(context['content'])}"
)
session.messages.append(
ChatMessage(
role="user" if is_user_message else "assistant", content=message_content
)
)
logger.info(
f"Appended message (role={'user' if is_user_message else 'assistant'}), "
f"new message_count={len(session.messages)}"
)
if len(session.messages) > config.max_context_messages:
raise ValueError(f"Max messages exceeded: {config.max_context_messages}")
logger.info(
f"Upserting session: {session.session_id} with user id {session.user_id}, "
f"message_count={len(session.messages)}"
)
session = await upsert_chat_session(session)
assert session, "Session not found"
# Generate title for new sessions on first user message (non-blocking)
# Check: is_user_message, no title yet, and this is the first user message
if is_user_message and message and not session.title:
user_messages = [m for m in session.messages if m.role == "user"]
if len(user_messages) == 1:
# First user message - generate title in background
import asyncio
# Capture only the values we need (not the session object) to avoid
# stale data issues when the main flow modifies the session
captured_session_id = session_id
captured_message = message
async def _update_title():
try:
title = await _generate_session_title(captured_message)
if title:
# Use dedicated title update function that doesn't
# touch messages, avoiding race conditions
await update_session_title(captured_session_id, title)
logger.info(
f"Generated title for session {captured_session_id}: {title}"
)
except Exception as e:
logger.warning(f"Failed to update session title: {e}")
# Fire and forget - don't block the chat response
asyncio.create_task(_update_title())
# Build system prompt with business understanding
system_prompt, langfuse_prompt = await _build_system_prompt(user_id)
# Build input messages including system prompt for complete Langfuse logging
trace_input_messages = [{"role": "system", "content": system_prompt}] + [
m.model_dump() for m in session.messages
]
# Create Langfuse trace for this LLM call (each call gets its own trace, grouped by session_id)
# Using v3 SDK: start_observation creates a root span, update_trace sets trace-level attributes
try:
langfuse = _get_langfuse_client()
env = _get_environment()
trace = langfuse.start_observation(
name="chat_completion",
input={"messages": trace_input_messages},
metadata={
"environment": env,
"model": config.model,
"message_count": len(session.messages),
"prompt_name": langfuse_prompt.name if langfuse_prompt else None,
"prompt_version": langfuse_prompt.version if langfuse_prompt else None,
},
)
# Set trace-level attributes (session_id, user_id, tags)
trace.update_trace(
session_id=session_id,
user_id=user_id,
tags=[env, "copilot"],
)
except Exception as e:
logger.warning(f"Failed to create Langfuse trace: {e}")
# Initialize variables that will be used in finally block (must be defined before try)
assistant_response = ChatMessage(
role="assistant",
content="",
)
accumulated_tool_calls: list[dict[str, Any]] = []
# Wrap main logic in try/finally to ensure Langfuse observations are always ended
try:
has_yielded_end = False
has_yielded_error = False
has_done_tool_call = False
has_received_text = False
text_streaming_ended = False
tool_response_messages: list[ChatMessage] = []
should_retry = False
# Generate unique IDs for AI SDK protocol
import uuid as uuid_module
message_id = str(uuid_module.uuid4())
text_block_id = str(uuid_module.uuid4())
# Yield message start
yield StreamStart(messageId=message_id)
# Create Langfuse generation for each LLM call, linked to the prompt
# Using v3 SDK: start_observation with as_type="generation"
generation = (
trace.start_observation(
as_type="generation",
name="llm_call",
model=config.model,
input={"messages": trace_input_messages},
prompt=langfuse_prompt,
)
if trace
else None
)
try:
async for chunk in _stream_chat_chunks(
session=session,
tools=tools,
system_prompt=system_prompt,
text_block_id=text_block_id,
):
if isinstance(chunk, StreamTextStart):
# Emit text-start before first text delta
if not has_received_text:
yield chunk
elif isinstance(chunk, StreamTextDelta):
delta = chunk.delta or ""
assert assistant_response.content is not None
assistant_response.content += delta
has_received_text = True
yield chunk
elif isinstance(chunk, StreamTextEnd):
# Emit text-end after text completes
if has_received_text and not text_streaming_ended:
text_streaming_ended = True
yield chunk
elif isinstance(chunk, StreamToolInputStart):
# Emit text-end before first tool call, but only if we've received text
if has_received_text and not text_streaming_ended:
yield StreamTextEnd(id=text_block_id)
text_streaming_ended = True
yield chunk
elif isinstance(chunk, StreamToolInputAvailable):
# Accumulate tool calls in OpenAI format
accumulated_tool_calls.append(
{
"id": chunk.toolCallId,
"type": "function",
"function": {
"name": chunk.toolName,
"arguments": orjson.dumps(chunk.input).decode("utf-8"),
},
}
)
elif isinstance(chunk, StreamToolOutputAvailable):
result_content = (
chunk.output
if isinstance(chunk.output, str)
else orjson.dumps(chunk.output).decode("utf-8")
)
tool_response_messages.append(
ChatMessage(
role="tool",
content=result_content,
tool_call_id=chunk.toolCallId,
)
)
has_done_tool_call = True
# Track if any tool execution failed
if not chunk.success:
logger.warning(
f"Tool {chunk.toolName} (ID: {chunk.toolCallId}) execution failed"
)
yield chunk
elif isinstance(chunk, StreamFinish):
if not has_done_tool_call:
# Emit text-end before finish if we received text but haven't closed it
if has_received_text and not text_streaming_ended:
yield StreamTextEnd(id=text_block_id)
text_streaming_ended = True
has_yielded_end = True
yield chunk
elif isinstance(chunk, StreamError):
has_yielded_error = True
elif isinstance(chunk, StreamUsage):
session.usage.append(
Usage(
prompt_tokens=chunk.promptTokens,
completion_tokens=chunk.completionTokens,
total_tokens=chunk.totalTokens,
)
)
else:
logger.error(f"Unknown chunk type: {type(chunk)}", exc_info=True)
except Exception as e:
logger.error(f"Error during stream: {e!s}", exc_info=True)
# Check if this is a retryable error (JSON parsing, incomplete tool calls, etc.)
is_retryable = isinstance(e, (orjson.JSONDecodeError, KeyError, TypeError))
if is_retryable and retry_count < config.max_retries:
logger.info(
f"Retryable error encountered. Attempt {retry_count + 1}/{config.max_retries}"
)
should_retry = True
else:
# Non-retryable error or max retries exceeded
# Save any partial progress before reporting error
messages_to_save: list[ChatMessage] = []
# Add assistant message if it has content or tool calls
if accumulated_tool_calls:
assistant_response.tool_calls = accumulated_tool_calls
if assistant_response.content or assistant_response.tool_calls:
messages_to_save.append(assistant_response)
# Add tool response messages after assistant message
messages_to_save.extend(tool_response_messages)
session.messages.extend(messages_to_save)
await upsert_chat_session(session)
if not has_yielded_error:
error_message = str(e)
if not is_retryable:
error_message = f"Non-retryable error: {error_message}"
elif retry_count >= config.max_retries:
error_message = f"Max retries ({config.max_retries}) exceeded: {error_message}"
error_response = StreamError(errorText=error_message)
yield error_response
if not has_yielded_end:
yield StreamFinish()
return
# Handle retry outside of exception handler to avoid nesting
if should_retry and retry_count < config.max_retries:
logger.info(
f"Retrying stream_chat_completion for session {session_id}, attempt {retry_count + 1}"
)
async for chunk in stream_chat_completion(
session_id=session.session_id,
user_id=user_id,
retry_count=retry_count + 1,
session=session,
context=context,
):
yield chunk
return # Exit after retry to avoid double-saving in finally block
# Normal completion path - save session and handle tool call continuation
logger.info(
f"Normal completion path: session={session.session_id}, "
f"current message_count={len(session.messages)}"
)
# Build the messages list in the correct order
messages_to_save: list[ChatMessage] = []
# Add assistant message with tool_calls if any
if accumulated_tool_calls:
assistant_response.tool_calls = accumulated_tool_calls
logger.info(
f"Added {len(accumulated_tool_calls)} tool calls to assistant message"
)
if assistant_response.content or assistant_response.tool_calls:
messages_to_save.append(assistant_response)
logger.info(
f"Saving assistant message with content_len={len(assistant_response.content or '')}, tool_calls={len(assistant_response.tool_calls or [])}"
)
# Add tool response messages after assistant message
messages_to_save.extend(tool_response_messages)
logger.info(
f"Saving {len(tool_response_messages)} tool response messages, "
f"total_to_save={len(messages_to_save)}"
)
session.messages.extend(messages_to_save)
logger.info(
f"Extended session messages, new message_count={len(session.messages)}"
)
await upsert_chat_session(session)
# If we did a tool call, stream the chat completion again to get the next response
if has_done_tool_call:
logger.info(
"Tool call executed, streaming chat completion again to get assistant response"
)
async for chunk in stream_chat_completion(
session_id=session.session_id,
user_id=user_id,
session=session, # Pass session object to avoid Redis refetch
context=context,
):
yield chunk
finally:
# Always end Langfuse observations to prevent resource leaks
# Guard against None and catch errors to avoid masking original exceptions
if generation is not None:
try:
latest_usage = session.usage[-1] if session.usage else None
generation.update(
model=config.model,
output={
"content": assistant_response.content,
"tool_calls": accumulated_tool_calls or None,
},
usage_details=(
{
"input": latest_usage.prompt_tokens,
"output": latest_usage.completion_tokens,
"total": latest_usage.total_tokens,
}
if latest_usage
else None
),
)
generation.end()
except Exception as e:
logger.warning(f"Failed to end Langfuse generation: {e}")
if trace is not None:
try:
if accumulated_tool_calls:
trace.update_trace(output={"tool_calls": accumulated_tool_calls})
else:
trace.update_trace(output={"response": assistant_response.content})
trace.end()
except Exception as e:
logger.warning(f"Failed to end Langfuse trace: {e}")
# Retry configuration for OpenAI API calls
MAX_RETRIES = 3
BASE_DELAY_SECONDS = 1.0
MAX_DELAY_SECONDS = 30.0
def _is_retryable_error(error: Exception) -> bool:
"""Determine if an error is retryable."""
if isinstance(error, RateLimitError):
return True
if isinstance(error, APIConnectionError):
return True
if isinstance(error, APIStatusError):
# APIStatusError has a response with status_code
# Retry on 5xx status codes (server errors)
if error.response.status_code >= 500:
return True
if isinstance(error, APIError):
# Retry on overloaded errors or 500 errors (may not have status code)
error_message = str(error).lower()
if "overloaded" in error_message or "internal server error" in error_message:
return True
return False
async def _stream_chat_chunks(
session: ChatSession,
tools: list[ChatCompletionToolParam],
system_prompt: str | None = None,
text_block_id: str | None = None,
) -> AsyncGenerator[StreamBaseResponse, None]:
"""
Pure streaming function for OpenAI chat completions with tool calling.
This function is database-agnostic and focuses only on streaming logic.
Implements exponential backoff retry for transient API errors.
Args:
session: Chat session with conversation history
tools: Available tools for the model
system_prompt: System prompt to prepend to messages
Yields:
SSE formatted JSON response objects
"""
model = config.model
logger.info("Starting pure chat stream")
# Build messages with system prompt prepended
messages = session.to_openai_messages()
if system_prompt:
from openai.types.chat import ChatCompletionSystemMessageParam
system_message = ChatCompletionSystemMessageParam(
role="system",
content=system_prompt,
)
messages = [system_message] + messages
# Loop to handle tool calls and continue conversation
while True:
retry_count = 0
last_error: Exception | None = None
while retry_count <= MAX_RETRIES:
try:
logger.info(
f"Creating OpenAI chat completion stream..."
f"{f' (retry {retry_count}/{MAX_RETRIES})' if retry_count > 0 else ''}"
)
# Create the stream with proper types
stream = await client.chat.completions.create(
model=model,
messages=messages,
tools=tools,
tool_choice="auto",
stream=True,
stream_options={"include_usage": True},
)
# Variables to accumulate tool calls
tool_calls: list[dict[str, Any]] = []
active_tool_call_idx: int | None = None
finish_reason: str | None = None
# Track which tool call indices have had their start event emitted
emitted_start_for_idx: set[int] = set()
# Track if we've started the text block
text_started = False
# Process the stream
chunk: ChatCompletionChunk
async for chunk in stream:
if chunk.usage:
yield StreamUsage(
promptTokens=chunk.usage.prompt_tokens,
completionTokens=chunk.usage.completion_tokens,
totalTokens=chunk.usage.total_tokens,
)
if chunk.choices:
choice = chunk.choices[0]
delta = choice.delta
# Capture finish reason
if choice.finish_reason:
finish_reason = choice.finish_reason
logger.info(f"Finish reason: {finish_reason}")
# Handle content streaming
if delta.content:
# Emit text-start on first text content
if not text_started and text_block_id:
yield StreamTextStart(id=text_block_id)
text_started = True
# Stream the text delta
text_response = StreamTextDelta(
id=text_block_id or "",
delta=delta.content,
)
yield text_response
# Handle tool calls
if delta.tool_calls:
for tc_chunk in delta.tool_calls:
idx = tc_chunk.index
# Update active tool call index if needed
if (
active_tool_call_idx is None
or active_tool_call_idx != idx
):
active_tool_call_idx = idx
# Ensure we have a tool call object at this index
while len(tool_calls) <= idx:
tool_calls.append(
{
"id": "",
"type": "function",
"function": {
"name": "",
"arguments": "",
},
},
)
# Accumulate the tool call data
if tc_chunk.id:
tool_calls[idx]["id"] = tc_chunk.id
if tc_chunk.function:
if tc_chunk.function.name:
tool_calls[idx]["function"][
"name"
] = tc_chunk.function.name
if tc_chunk.function.arguments:
tool_calls[idx]["function"][
"arguments"
] += tc_chunk.function.arguments
# Emit StreamToolInputStart only after we have the tool call ID
if (
idx not in emitted_start_for_idx
and tool_calls[idx]["id"]
and tool_calls[idx]["function"]["name"]
):
yield StreamToolInputStart(
toolCallId=tool_calls[idx]["id"],
toolName=tool_calls[idx]["function"]["name"],
)
emitted_start_for_idx.add(idx)
logger.info(f"Stream complete. Finish reason: {finish_reason}")
# Yield all accumulated tool calls after the stream is complete
# This ensures all tool call arguments have been fully received
for idx, tool_call in enumerate(tool_calls):
try:
async for tc in _yield_tool_call(tool_calls, idx, session):
yield tc
except (orjson.JSONDecodeError, KeyError, TypeError) as e:
logger.error(
f"Failed to parse tool call {idx}: {e}",
exc_info=True,
extra={"tool_call": tool_call},
)
yield StreamError(
errorText=f"Invalid tool call arguments for tool {tool_call.get('function', {}).get('name', 'unknown')}: {e}",
)
# Re-raise to trigger retry logic in the parent function
raise
yield StreamFinish()
return
except Exception as e:
last_error = e
if _is_retryable_error(e) and retry_count < MAX_RETRIES:
retry_count += 1
# Calculate delay with exponential backoff
delay = min(
BASE_DELAY_SECONDS * (2 ** (retry_count - 1)),
MAX_DELAY_SECONDS,
)
logger.warning(
f"Retryable error in stream: {e!s}. "
f"Retrying in {delay:.1f}s (attempt {retry_count}/{MAX_RETRIES})"
)
await asyncio.sleep(delay)
continue # Retry the stream
else:
# Non-retryable error or max retries exceeded
logger.error(
f"Error in stream (not retrying): {e!s}",
exc_info=True,
)
error_response = StreamError(errorText=str(e))
yield error_response
yield StreamFinish()
return
# If we exit the retry loop without returning, it means we exhausted retries
if last_error:
logger.error(
f"Max retries ({MAX_RETRIES}) exceeded. Last error: {last_error!s}",
exc_info=True,
)
yield StreamError(errorText=f"Max retries exceeded: {last_error!s}")
yield StreamFinish()
return
async def _yield_tool_call(
tool_calls: list[dict[str, Any]],
yield_idx: int,
session: ChatSession,
) -> AsyncGenerator[StreamBaseResponse, None]:
"""
Yield a tool call and its execution result.
Raises:
orjson.JSONDecodeError: If tool call arguments cannot be parsed as JSON
KeyError: If expected tool call fields are missing
TypeError: If tool call structure is invalid
"""
tool_name = tool_calls[yield_idx]["function"]["name"]
tool_call_id = tool_calls[yield_idx]["id"]
logger.info(f"Yielding tool call: {tool_calls[yield_idx]}")
# Parse tool call arguments - handle empty arguments gracefully
raw_arguments = tool_calls[yield_idx]["function"]["arguments"]
if raw_arguments:
arguments = orjson.loads(raw_arguments)
else:
arguments = {}
yield StreamToolInputAvailable(
toolCallId=tool_call_id,
toolName=tool_name,
input=arguments,
)
tool_execution_response: StreamToolOutputAvailable = await execute_tool(
tool_name=tool_name,
parameters=arguments,
tool_call_id=tool_call_id,
user_id=session.user_id,
session=session,
)
logger.info(f"Yielding Tool execution response: {tool_execution_response}")
yield tool_execution_response

View File

@@ -3,19 +3,20 @@ from os import getenv
import pytest
import backend.server.v2.chat.service as chat_service
from backend.server.v2.chat.response_model import (
StreamEnd,
from . import service as chat_service
from .model import create_chat_session, get_chat_session, upsert_chat_session
from .response_model import (
StreamError,
StreamTextChunk,
StreamToolExecutionResult,
StreamFinish,
StreamTextDelta,
StreamToolOutputAvailable,
)
logger = logging.getLogger(__name__)
@pytest.mark.asyncio(loop_scope="session")
async def test_stream_chat_completion():
async def test_stream_chat_completion(setup_test_user, test_user_id):
"""
Test the stream_chat_completion function.
"""
@@ -23,7 +24,7 @@ async def test_stream_chat_completion():
if not api_key:
return pytest.skip("OPEN_ROUTER_API_KEY is not set, skipping test")
session = await chat_service.create_chat_session()
session = await create_chat_session(test_user_id)
has_errors = False
has_ended = False
@@ -34,9 +35,9 @@ async def test_stream_chat_completion():
logger.info(chunk)
if isinstance(chunk, StreamError):
has_errors = True
if isinstance(chunk, StreamTextChunk):
assistant_message += chunk.content
if isinstance(chunk, StreamEnd):
if isinstance(chunk, StreamTextDelta):
assistant_message += chunk.delta
if isinstance(chunk, StreamFinish):
has_ended = True
assert has_ended, "Chat completion did not end"
@@ -45,7 +46,7 @@ async def test_stream_chat_completion():
@pytest.mark.asyncio(loop_scope="session")
async def test_stream_chat_completion_with_tool_calls():
async def test_stream_chat_completion_with_tool_calls(setup_test_user, test_user_id):
"""
Test the stream_chat_completion function.
"""
@@ -53,8 +54,8 @@ async def test_stream_chat_completion_with_tool_calls():
if not api_key:
return pytest.skip("OPEN_ROUTER_API_KEY is not set, skipping test")
session = await chat_service.create_chat_session()
session = await chat_service.upsert_chat_session(session)
session = await create_chat_session(test_user_id)
session = await upsert_chat_session(session)
has_errors = False
has_ended = False
@@ -68,14 +69,14 @@ async def test_stream_chat_completion_with_tool_calls():
if isinstance(chunk, StreamError):
has_errors = True
if isinstance(chunk, StreamEnd):
if isinstance(chunk, StreamFinish):
has_ended = True
if isinstance(chunk, StreamToolExecutionResult):
if isinstance(chunk, StreamToolOutputAvailable):
had_tool_calls = True
assert has_ended, "Chat completion did not end"
assert not has_errors, "Error occurred while streaming chat completion"
assert had_tool_calls, "Tool calls did not occur"
session = await chat_service.get_session(session.session_id)
session = await get_chat_session(session.session_id)
assert session, "Session not found"
assert session.usage, "Usage is empty"

View File

@@ -0,0 +1,49 @@
from typing import TYPE_CHECKING, Any
from openai.types.chat import ChatCompletionToolParam
from backend.api.features.chat.model import ChatSession
from .add_understanding import AddUnderstandingTool
from .agent_output import AgentOutputTool
from .base import BaseTool
from .find_agent import FindAgentTool
from .find_library_agent import FindLibraryAgentTool
from .run_block import RunBlockTool
from .run_agent import RunAgentTool
if TYPE_CHECKING:
from backend.api.features.chat.response_model import StreamToolOutputAvailable
# Single source of truth for all tools
TOOL_REGISTRY: dict[str, BaseTool] = {
"add_understanding": AddUnderstandingTool(),
"find_agent": FindAgentTool(),
"find_library_agent": FindLibraryAgentTool(),
"run_agent": RunAgentTool(),
"agent_output": AgentOutputTool(),
"run_block": RunBlockTool(),
}
# Export individual tool instances for backwards compatibility
find_agent_tool = TOOL_REGISTRY["find_agent"]
run_agent_tool = TOOL_REGISTRY["run_agent"]
# Generated from registry for OpenAI API
tools: list[ChatCompletionToolParam] = [
tool.as_openai_tool() for tool in TOOL_REGISTRY.values()
]
async def execute_tool(
tool_name: str,
parameters: dict[str, Any],
user_id: str | None,
session: ChatSession,
tool_call_id: str,
) -> "StreamToolOutputAvailable":
"""Execute a tool by name."""
tool = TOOL_REGISTRY.get(tool_name)
if not tool:
raise ValueError(f"Tool {tool_name} not found")
return await tool.execute(user_id, session, tool_call_id, **parameters)

View File

@@ -3,8 +3,11 @@ from datetime import UTC, datetime
from os import getenv
import pytest
from prisma.types import ProfileCreateInput
from pydantic import SecretStr
from backend.api.features.chat.model import ChatSession
from backend.api.features.store import db as store_db
from backend.blocks.firecrawl.scrape import FirecrawlScrapeBlock
from backend.blocks.io import AgentInputBlock, AgentOutputBlock
from backend.blocks.llm import AITextGeneratorBlock
@@ -13,11 +16,9 @@ from backend.data.graph import Graph, Link, Node, create_graph
from backend.data.model import APIKeyCredentials
from backend.data.user import get_or_create_user
from backend.integrations.credentials_store import IntegrationCredentialsStore
from backend.server.v2.chat.model import ChatSession
from backend.server.v2.store import db as store_db
def make_session(user_id: str | None = None):
def make_session(user_id: str):
return ChatSession(
session_id=str(uuid.uuid4()),
user_id=user_id,
@@ -49,13 +50,13 @@ async def setup_test_data():
# 1b. Create a profile with username for the user (required for store agent lookup)
username = user.email.split("@")[0]
await prisma.profile.create(
data={
"userId": user.id,
"username": username,
"name": f"Test User {username}",
"description": "Test user profile",
"links": [], # Required field - empty array for test profiles
}
data=ProfileCreateInput(
userId=user.id,
username=username,
name=f"Test User {username}",
description="Test user profile",
links=[], # Required field - empty array for test profiles
)
)
# 2. Create a test graph with agent input -> agent output
@@ -172,13 +173,13 @@ async def setup_llm_test_data():
# 1b. Create a profile with username for the user (required for store agent lookup)
username = user.email.split("@")[0]
await prisma.profile.create(
data={
"userId": user.id,
"username": username,
"name": f"Test User {username}",
"description": "Test user profile for LLM tests",
"links": [], # Required field - empty array for test profiles
}
data=ProfileCreateInput(
userId=user.id,
username=username,
name=f"Test User {username}",
description="Test user profile for LLM tests",
links=[], # Required field - empty array for test profiles
)
)
# 2. Create test OpenAI credentials for the user
@@ -332,13 +333,13 @@ async def setup_firecrawl_test_data():
# 1b. Create a profile with username for the user (required for store agent lookup)
username = user.email.split("@")[0]
await prisma.profile.create(
data={
"userId": user.id,
"username": username,
"name": f"Test User {username}",
"description": "Test user profile for Firecrawl tests",
"links": [], # Required field - empty array for test profiles
}
data=ProfileCreateInput(
userId=user.id,
username=username,
name=f"Test User {username}",
description="Test user profile for Firecrawl tests",
links=[], # Required field - empty array for test profiles
)
)
# NOTE: We deliberately do NOT create Firecrawl credentials for this user

View File

@@ -0,0 +1,119 @@
"""Tool for capturing user business understanding incrementally."""
import logging
from typing import Any
from backend.api.features.chat.model import ChatSession
from backend.data.understanding import (
BusinessUnderstandingInput,
upsert_business_understanding,
)
from .base import BaseTool
from .models import ErrorResponse, ToolResponseBase, UnderstandingUpdatedResponse
logger = logging.getLogger(__name__)
class AddUnderstandingTool(BaseTool):
"""Tool for capturing user's business understanding incrementally."""
@property
def name(self) -> str:
return "add_understanding"
@property
def description(self) -> str:
return """Capture and store information about the user's business context,
workflows, pain points, and automation goals. Call this tool whenever the user
shares information about their business. Each call incrementally adds to the
existing understanding - you don't need to provide all fields at once.
Use this to build a comprehensive profile that helps recommend better agents
and automations for the user's specific needs."""
@property
def parameters(self) -> dict[str, Any]:
# Auto-generate from Pydantic model schema
schema = BusinessUnderstandingInput.model_json_schema()
properties = {}
for field_name, field_schema in schema.get("properties", {}).items():
prop: dict[str, Any] = {"description": field_schema.get("description", "")}
# Handle anyOf for Optional types
if "anyOf" in field_schema:
for option in field_schema["anyOf"]:
if option.get("type") != "null":
prop["type"] = option.get("type", "string")
if "items" in option:
prop["items"] = option["items"]
break
else:
prop["type"] = field_schema.get("type", "string")
if "items" in field_schema:
prop["items"] = field_schema["items"]
properties[field_name] = prop
return {"type": "object", "properties": properties, "required": []}
@property
def requires_auth(self) -> bool:
"""Requires authentication to store user-specific data."""
return True
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
"""
Capture and store business understanding incrementally.
Each call merges new data with existing understanding:
- String fields are overwritten if provided
- List fields are appended (with deduplication)
"""
session_id = session.session_id
if not user_id:
return ErrorResponse(
message="Authentication required to save business understanding.",
session_id=session_id,
)
# Check if any data was provided
if not any(v is not None for v in kwargs.values()):
return ErrorResponse(
message="Please provide at least one field to update.",
session_id=session_id,
)
# Build input model from kwargs (only include fields defined in the model)
valid_fields = set(BusinessUnderstandingInput.model_fields.keys())
input_data = BusinessUnderstandingInput(
**{k: v for k, v in kwargs.items() if k in valid_fields}
)
# Track which fields were updated
updated_fields = [
k for k, v in kwargs.items() if k in valid_fields and v is not None
]
# Upsert with merge
understanding = await upsert_business_understanding(user_id, input_data)
# Build current understanding summary (filter out empty values)
current_understanding = {
k: v
for k, v in understanding.model_dump(
exclude={"id", "user_id", "created_at", "updated_at"}
).items()
if v is not None and v != [] and v != ""
}
return UnderstandingUpdatedResponse(
message=f"Updated understanding with: {', '.join(updated_fields)}. "
"I now have a better picture of your business context.",
session_id=session_id,
updated_fields=updated_fields,
current_understanding=current_understanding,
)

View File

@@ -0,0 +1,446 @@
"""Tool for retrieving agent execution outputs from user's library."""
import logging
import re
from datetime import datetime, timedelta, timezone
from typing import Any
from pydantic import BaseModel, field_validator
from backend.api.features.chat.model import ChatSession
from backend.api.features.library import db as library_db
from backend.api.features.library.model import LibraryAgent
from backend.data import execution as execution_db
from backend.data.execution import ExecutionStatus, GraphExecution, GraphExecutionMeta
from .base import BaseTool
from .models import (
AgentOutputResponse,
ErrorResponse,
ExecutionOutputInfo,
NoResultsResponse,
ToolResponseBase,
)
from .utils import fetch_graph_from_store_slug
logger = logging.getLogger(__name__)
class AgentOutputInput(BaseModel):
"""Input parameters for the agent_output tool."""
agent_name: str = ""
library_agent_id: str = ""
store_slug: str = ""
execution_id: str = ""
run_time: str = "latest"
@field_validator(
"agent_name",
"library_agent_id",
"store_slug",
"execution_id",
"run_time",
mode="before",
)
@classmethod
def strip_strings(cls, v: Any) -> Any:
"""Strip whitespace from string fields."""
return v.strip() if isinstance(v, str) else v
def parse_time_expression(
time_expr: str | None,
) -> tuple[datetime | None, datetime | None]:
"""
Parse time expression into datetime range (start, end).
Supports: "latest", "yesterday", "today", "last week", "last 7 days",
"last month", "last 30 days", ISO date "YYYY-MM-DD", ISO datetime.
"""
if not time_expr or time_expr.lower() == "latest":
return None, None
now = datetime.now(timezone.utc)
today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
expr = time_expr.lower().strip()
# Relative time expressions lookup
relative_times: dict[str, tuple[datetime, datetime]] = {
"yesterday": (today_start - timedelta(days=1), today_start),
"today": (today_start, now),
"last week": (now - timedelta(days=7), now),
"last 7 days": (now - timedelta(days=7), now),
"last month": (now - timedelta(days=30), now),
"last 30 days": (now - timedelta(days=30), now),
}
if expr in relative_times:
return relative_times[expr]
# Try ISO date format (YYYY-MM-DD)
date_match = re.match(r"^(\d{4})-(\d{2})-(\d{2})$", expr)
if date_match:
try:
year, month, day = map(int, date_match.groups())
start = datetime(year, month, day, 0, 0, 0, tzinfo=timezone.utc)
return start, start + timedelta(days=1)
except ValueError:
# Invalid date components (e.g., month=13, day=32)
pass
# Try ISO datetime
try:
parsed = datetime.fromisoformat(expr.replace("Z", "+00:00"))
if parsed.tzinfo is None:
parsed = parsed.replace(tzinfo=timezone.utc)
return parsed - timedelta(hours=1), parsed + timedelta(hours=1)
except ValueError:
return None, None
class AgentOutputTool(BaseTool):
"""Tool for retrieving execution outputs from user's library agents."""
@property
def name(self) -> str:
return "agent_output"
@property
def description(self) -> str:
return """Retrieve execution outputs from agents in the user's library.
Identify the agent using one of:
- agent_name: Fuzzy search in user's library
- library_agent_id: Exact library agent ID
- store_slug: Marketplace format 'username/agent-name'
Select which run to retrieve using:
- execution_id: Specific execution ID
- run_time: 'latest' (default), 'yesterday', 'last week', or ISO date 'YYYY-MM-DD'
"""
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"agent_name": {
"type": "string",
"description": "Agent name to search for in user's library (fuzzy match)",
},
"library_agent_id": {
"type": "string",
"description": "Exact library agent ID",
},
"store_slug": {
"type": "string",
"description": "Marketplace identifier: 'username/agent-slug'",
},
"execution_id": {
"type": "string",
"description": "Specific execution ID to retrieve",
},
"run_time": {
"type": "string",
"description": (
"Time filter: 'latest', 'yesterday', 'last week', or 'YYYY-MM-DD'"
),
},
},
"required": [],
}
@property
def requires_auth(self) -> bool:
return True
async def _resolve_agent(
self,
user_id: str,
agent_name: str | None,
library_agent_id: str | None,
store_slug: str | None,
) -> tuple[LibraryAgent | None, str | None]:
"""
Resolve agent from provided identifiers.
Returns (library_agent, error_message).
"""
# Priority 1: Exact library agent ID
if library_agent_id:
try:
agent = await library_db.get_library_agent(library_agent_id, user_id)
return agent, None
except Exception as e:
logger.warning(f"Failed to get library agent by ID: {e}")
return None, f"Library agent '{library_agent_id}' not found"
# Priority 2: Store slug (username/agent-name)
if store_slug and "/" in store_slug:
username, agent_slug = store_slug.split("/", 1)
graph, _ = await fetch_graph_from_store_slug(username, agent_slug)
if not graph:
return None, f"Agent '{store_slug}' not found in marketplace"
# Find in user's library by graph_id
agent = await library_db.get_library_agent_by_graph_id(user_id, graph.id)
if not agent:
return (
None,
f"Agent '{store_slug}' is not in your library. "
"Add it first to see outputs.",
)
return agent, None
# Priority 3: Fuzzy name search in library
if agent_name:
try:
response = await library_db.list_library_agents(
user_id=user_id,
search_term=agent_name,
page_size=5,
)
if not response.agents:
return (
None,
f"No agents matching '{agent_name}' found in your library",
)
# Return best match (first result from search)
return response.agents[0], None
except Exception as e:
logger.error(f"Error searching library agents: {e}")
return None, f"Error searching for agent: {e}"
return (
None,
"Please specify an agent name, library_agent_id, or store_slug",
)
async def _get_execution(
self,
user_id: str,
graph_id: str,
execution_id: str | None,
time_start: datetime | None,
time_end: datetime | None,
) -> tuple[GraphExecution | None, list[GraphExecutionMeta], str | None]:
"""
Fetch execution(s) based on filters.
Returns (single_execution, available_executions_meta, error_message).
"""
# If specific execution_id provided, fetch it directly
if execution_id:
execution = await execution_db.get_graph_execution(
user_id=user_id,
execution_id=execution_id,
include_node_executions=False,
)
if not execution:
return None, [], f"Execution '{execution_id}' not found"
return execution, [], None
# Get completed executions with time filters
executions = await execution_db.get_graph_executions(
graph_id=graph_id,
user_id=user_id,
statuses=[ExecutionStatus.COMPLETED],
created_time_gte=time_start,
created_time_lte=time_end,
limit=10,
)
if not executions:
return None, [], None # No error, just no executions
# If only one execution, fetch full details
if len(executions) == 1:
full_execution = await execution_db.get_graph_execution(
user_id=user_id,
execution_id=executions[0].id,
include_node_executions=False,
)
return full_execution, [], None
# Multiple executions - return latest with full details, plus list of available
full_execution = await execution_db.get_graph_execution(
user_id=user_id,
execution_id=executions[0].id,
include_node_executions=False,
)
return full_execution, executions, None
def _build_response(
self,
agent: LibraryAgent,
execution: GraphExecution | None,
available_executions: list[GraphExecutionMeta],
session_id: str | None,
) -> AgentOutputResponse:
"""Build the response based on execution data."""
library_agent_link = f"/library/agents/{agent.id}"
if not execution:
return AgentOutputResponse(
message=f"No completed executions found for agent '{agent.name}'",
session_id=session_id,
agent_name=agent.name,
agent_id=agent.graph_id,
library_agent_id=agent.id,
library_agent_link=library_agent_link,
total_executions=0,
)
execution_info = ExecutionOutputInfo(
execution_id=execution.id,
status=execution.status.value,
started_at=execution.started_at,
ended_at=execution.ended_at,
outputs=dict(execution.outputs),
inputs_summary=execution.inputs if execution.inputs else None,
)
available_list = None
if len(available_executions) > 1:
available_list = [
{
"id": e.id,
"status": e.status.value,
"started_at": e.started_at.isoformat() if e.started_at else None,
}
for e in available_executions[:5]
]
message = f"Found execution outputs for agent '{agent.name}'"
if len(available_executions) > 1:
message += (
f". Showing latest of {len(available_executions)} matching executions."
)
return AgentOutputResponse(
message=message,
session_id=session_id,
agent_name=agent.name,
agent_id=agent.graph_id,
library_agent_id=agent.id,
library_agent_link=library_agent_link,
execution=execution_info,
available_executions=available_list,
total_executions=len(available_executions) if available_executions else 1,
)
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
"""Execute the agent_output tool."""
session_id = session.session_id
# Parse and validate input
try:
input_data = AgentOutputInput(**kwargs)
except Exception as e:
logger.error(f"Invalid input: {e}")
return ErrorResponse(
message="Invalid input parameters",
error=str(e),
session_id=session_id,
)
# Ensure user_id is present (should be guaranteed by requires_auth)
if not user_id:
return ErrorResponse(
message="User authentication required",
session_id=session_id,
)
# Check if at least one identifier is provided
if not any(
[
input_data.agent_name,
input_data.library_agent_id,
input_data.store_slug,
input_data.execution_id,
]
):
return ErrorResponse(
message=(
"Please specify at least one of: agent_name, "
"library_agent_id, store_slug, or execution_id"
),
session_id=session_id,
)
# If only execution_id provided, we need to find the agent differently
if (
input_data.execution_id
and not input_data.agent_name
and not input_data.library_agent_id
and not input_data.store_slug
):
# Fetch execution directly to get graph_id
execution = await execution_db.get_graph_execution(
user_id=user_id,
execution_id=input_data.execution_id,
include_node_executions=False,
)
if not execution:
return ErrorResponse(
message=f"Execution '{input_data.execution_id}' not found",
session_id=session_id,
)
# Find library agent by graph_id
agent = await library_db.get_library_agent_by_graph_id(
user_id, execution.graph_id
)
if not agent:
return NoResultsResponse(
message=(
f"Execution found but agent not in your library. "
f"Graph ID: {execution.graph_id}"
),
session_id=session_id,
suggestions=["Add the agent to your library to see more details"],
)
return self._build_response(agent, execution, [], session_id)
# Resolve agent from identifiers
agent, error = await self._resolve_agent(
user_id=user_id,
agent_name=input_data.agent_name or None,
library_agent_id=input_data.library_agent_id or None,
store_slug=input_data.store_slug or None,
)
if error or not agent:
return NoResultsResponse(
message=error or "Agent not found",
session_id=session_id,
suggestions=[
"Check the agent name or ID",
"Make sure the agent is in your library",
],
)
# Parse time expression
time_start, time_end = parse_time_expression(input_data.run_time)
# Fetch execution(s)
execution, available_executions, exec_error = await self._get_execution(
user_id=user_id,
graph_id=agent.graph_id,
execution_id=input_data.execution_id or None,
time_start=time_start,
time_end=time_end,
)
if exec_error:
return ErrorResponse(
message=exec_error,
session_id=session_id,
)
return self._build_response(agent, execution, available_executions, session_id)

View File

@@ -0,0 +1,151 @@
"""Shared agent search functionality for find_agent and find_library_agent tools."""
import logging
from typing import Literal
from backend.api.features.library import db as library_db
from backend.api.features.store import db as store_db
from backend.util.exceptions import DatabaseError, NotFoundError
from .models import (
AgentInfo,
AgentsFoundResponse,
ErrorResponse,
NoResultsResponse,
ToolResponseBase,
)
logger = logging.getLogger(__name__)
SearchSource = Literal["marketplace", "library"]
async def search_agents(
query: str,
source: SearchSource,
session_id: str | None,
user_id: str | None = None,
) -> ToolResponseBase:
"""
Search for agents in marketplace or user library.
Args:
query: Search query string
source: "marketplace" or "library"
session_id: Chat session ID
user_id: User ID (required for library search)
Returns:
AgentsFoundResponse, NoResultsResponse, or ErrorResponse
"""
if not query:
return ErrorResponse(
message="Please provide a search query", session_id=session_id
)
if source == "library" and not user_id:
return ErrorResponse(
message="User authentication required to search library",
session_id=session_id,
)
agents: list[AgentInfo] = []
try:
if source == "marketplace":
logger.info(f"Searching marketplace for: {query}")
results = await store_db.get_store_agents(search_query=query, page_size=5)
for agent in results.agents:
agents.append(
AgentInfo(
id=f"{agent.creator}/{agent.slug}",
name=agent.agent_name,
description=agent.description or "",
source="marketplace",
in_library=False,
creator=agent.creator,
category="general",
rating=agent.rating,
runs=agent.runs,
is_featured=False,
)
)
else: # library
logger.info(f"Searching user library for: {query}")
results = await library_db.list_library_agents(
user_id=user_id, # type: ignore[arg-type]
search_term=query,
page_size=10,
)
for agent in results.agents:
agents.append(
AgentInfo(
id=agent.id,
name=agent.name,
description=agent.description or "",
source="library",
in_library=True,
creator=agent.creator_name,
status=agent.status.value,
can_access_graph=agent.can_access_graph,
has_external_trigger=agent.has_external_trigger,
new_output=agent.new_output,
graph_id=agent.graph_id,
)
)
logger.info(f"Found {len(agents)} agents in {source}")
except NotFoundError:
pass
except DatabaseError as e:
logger.error(f"Error searching {source}: {e}", exc_info=True)
return ErrorResponse(
message=f"Failed to search {source}. Please try again.",
error=str(e),
session_id=session_id,
)
if not agents:
suggestions = (
[
"Try more general terms",
"Browse categories in the marketplace",
"Check spelling",
]
if source == "marketplace"
else [
"Try different keywords",
"Use find_agent to search the marketplace",
"Check your library at /library",
]
)
no_results_msg = (
f"No agents found matching '{query}'. Try different keywords or browse the marketplace."
if source == "marketplace"
else f"No agents matching '{query}' found in your library."
)
return NoResultsResponse(
message=no_results_msg, session_id=session_id, suggestions=suggestions
)
title = f"Found {len(agents)} agent{'s' if len(agents) != 1 else ''} "
title += (
f"for '{query}'"
if source == "marketplace"
else f"in your library for '{query}'"
)
message = (
"Now you have found some options for the user to choose from. "
"You can add a link to a recommended agent at: /marketplace/agent/agent_id "
"Please ask the user if they would like to use any of these agents."
if source == "marketplace"
else "Found agents in the user's library. You can provide a link to view an agent at: "
"/library/agents/{agent_id}. Use agent_output to get execution results, or run_agent to execute."
)
return AgentsFoundResponse(
message=message,
title=title,
agents=agents,
count=len(agents),
session_id=session_id,
)

View File

@@ -5,8 +5,8 @@ from typing import Any
from openai.types.chat import ChatCompletionToolParam
from backend.server.v2.chat.model import ChatSession
from backend.server.v2.chat.response_model import StreamToolExecutionResult
from backend.api.features.chat.model import ChatSession
from backend.api.features.chat.response_model import StreamToolOutputAvailable
from .models import ErrorResponse, NeedLoginResponse, ToolResponseBase
@@ -53,7 +53,7 @@ class BaseTool:
session: ChatSession,
tool_call_id: str,
**kwargs,
) -> StreamToolExecutionResult:
) -> StreamToolOutputAvailable:
"""Execute the tool with authentication check.
Args:
@@ -69,10 +69,10 @@ class BaseTool:
logger.error(
f"Attempted tool call for {self.name} but user not authenticated"
)
return StreamToolExecutionResult(
tool_id=tool_call_id,
tool_name=self.name,
result=NeedLoginResponse(
return StreamToolOutputAvailable(
toolCallId=tool_call_id,
toolName=self.name,
output=NeedLoginResponse(
message=f"Please sign in to use {self.name}",
session_id=session.session_id,
).model_dump_json(),
@@ -81,17 +81,17 @@ class BaseTool:
try:
result = await self._execute(user_id, session, **kwargs)
return StreamToolExecutionResult(
tool_id=tool_call_id,
tool_name=self.name,
result=result.model_dump_json(),
return StreamToolOutputAvailable(
toolCallId=tool_call_id,
toolName=self.name,
output=result.model_dump_json(),
)
except Exception as e:
logger.error(f"Error in {self.name}: {e}", exc_info=True)
return StreamToolExecutionResult(
tool_id=tool_call_id,
tool_name=self.name,
result=ErrorResponse(
return StreamToolOutputAvailable(
toolCallId=tool_call_id,
toolName=self.name,
output=ErrorResponse(
message=f"An error occurred while executing {self.name}",
error=str(e),
session_id=session.session_id,

View File

@@ -0,0 +1,46 @@
"""Tool for discovering agents from marketplace."""
from typing import Any
from backend.api.features.chat.model import ChatSession
from .agent_search import search_agents
from .base import BaseTool
from .models import ToolResponseBase
class FindAgentTool(BaseTool):
"""Tool for discovering agents from the marketplace."""
@property
def name(self) -> str:
return "find_agent"
@property
def description(self) -> str:
return (
"Discover agents from the marketplace based on capabilities and user needs."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search query describing what the user wants to accomplish. Use single keywords for best results.",
},
},
"required": ["query"],
}
async def _execute(
self, user_id: str | None, session: ChatSession, **kwargs
) -> ToolResponseBase:
return await search_agents(
query=kwargs.get("query", "").strip(),
source="marketplace",
session_id=session.session_id,
user_id=user_id,
)

View File

@@ -0,0 +1,52 @@
"""Tool for searching agents in the user's library."""
from typing import Any
from backend.api.features.chat.model import ChatSession
from .agent_search import search_agents
from .base import BaseTool
from .models import ToolResponseBase
class FindLibraryAgentTool(BaseTool):
"""Tool for searching agents in the user's library."""
@property
def name(self) -> str:
return "find_library_agent"
@property
def description(self) -> str:
return (
"Search for agents in the user's library. Use this to find agents "
"the user has already added to their library, including agents they "
"created or added from the marketplace."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search query to find agents by name or description.",
},
},
"required": ["query"],
}
@property
def requires_auth(self) -> bool:
return True
async def _execute(
self, user_id: str | None, session: ChatSession, **kwargs
) -> ToolResponseBase:
return await search_agents(
query=kwargs.get("query", "").strip(),
source="library",
session_id=session.session_id,
user_id=user_id,
)

View File

@@ -1,24 +1,28 @@
"""Pydantic models for tool responses."""
from datetime import datetime
from enum import Enum
from typing import Any
from pydantic import BaseModel, Field
from backend.data import block
from backend.data.model import CredentialsMetaInput
class ResponseType(str, Enum):
"""Types of tool responses."""
AGENT_CAROUSEL = "agent_carousel"
AGENTS_FOUND = "agents_found"
AGENT_DETAILS = "agent_details"
BLOCK_OUTPUT = "block_output"
SETUP_REQUIREMENTS = "setup_requirements"
EXECUTION_STARTED = "execution_started"
NEED_LOGIN = "need_login"
ERROR = "error"
NO_RESULTS = "no_results"
SUCCESS = "success"
AGENT_OUTPUT = "agent_output"
UNDERSTANDING_UPDATED = "understanding_updated"
# Base response model
@@ -51,15 +55,22 @@ class AgentInfo(BaseModel):
graph_id: str | None = None
class AgentCarouselResponse(ToolResponseBase):
class AgentsFoundResponse(ToolResponseBase):
"""Response for find_agent tool."""
type: ResponseType = ResponseType.AGENT_CAROUSEL
type: ResponseType = ResponseType.AGENTS_FOUND
title: str = "Available Agents"
agents: list[AgentInfo]
count: int
name: str = "agent_carousel"
name: str = "agents_found"
class BlockOutputResponse(ToolResponseBase):
"""Response for find_block tool"""
type: ResponseType = ResponseType.BLOCK_OUTPUT
block_id: str
block_name: str
outputs: dict[str, list[Any]]
success: bool = True
class NoResultsResponse(ToolResponseBase):
"""Response when no agents found."""
@@ -173,3 +184,37 @@ class ErrorResponse(ToolResponseBase):
type: ResponseType = ResponseType.ERROR
error: str | None = None
details: dict[str, Any] | None = None
# Agent output models
class ExecutionOutputInfo(BaseModel):
"""Summary of a single execution's outputs."""
execution_id: str
status: str
started_at: datetime | None = None
ended_at: datetime | None = None
outputs: dict[str, list[Any]]
inputs_summary: dict[str, Any] | None = None
class AgentOutputResponse(ToolResponseBase):
"""Response for agent_output tool."""
type: ResponseType = ResponseType.AGENT_OUTPUT
agent_name: str
agent_id: str
library_agent_id: str | None = None
library_agent_link: str | None = None
execution: ExecutionOutputInfo | None = None
available_executions: list[dict[str, Any]] | None = None
total_executions: int = 0
# Business understanding models
class UnderstandingUpdatedResponse(ToolResponseBase):
"""Response for add_understanding tool."""
type: ResponseType = ResponseType.UNDERSTANDING_UPDATED
updated_fields: list[str] = Field(default_factory=list)
current_understanding: dict[str, Any] = Field(default_factory=dict)

View File

@@ -5,14 +5,22 @@ from typing import Any
from pydantic import BaseModel, Field, field_validator
from backend.api.features.chat.config import ChatConfig
from backend.api.features.chat.model import ChatSession
from backend.api.features.library import db as library_db
from backend.data.graph import GraphModel
from backend.data.model import CredentialsMetaInput
from backend.data.user import get_user_by_id
from backend.executor import utils as execution_utils
from backend.server.v2.chat.config import ChatConfig
from backend.server.v2.chat.model import ChatSession
from backend.server.v2.chat.tools.base import BaseTool
from backend.server.v2.chat.tools.models import (
from backend.util.clients import get_scheduler_client
from backend.util.exceptions import DatabaseError, NotFoundError
from backend.util.timezone_utils import (
convert_utc_time_to_user_timezone,
get_user_timezone_or_utc,
)
from .base import BaseTool
from .models import (
AgentDetails,
AgentDetailsResponse,
ErrorResponse,
@@ -23,19 +31,13 @@ from backend.server.v2.chat.tools.models import (
ToolResponseBase,
UserReadiness,
)
from backend.server.v2.chat.tools.utils import (
from .utils import (
check_user_has_required_credentials,
extract_credentials_from_schema,
fetch_graph_from_store_slug,
get_or_create_library_agent,
match_user_credentials_to_graph,
)
from backend.util.clients import get_scheduler_client
from backend.util.exceptions import DatabaseError, NotFoundError
from backend.util.timezone_utils import (
convert_utc_time_to_user_timezone,
get_user_timezone_or_utc,
)
logger = logging.getLogger(__name__)
config = ChatConfig()
@@ -56,6 +58,7 @@ class RunAgentInput(BaseModel):
"""Input parameters for the run_agent tool."""
username_agent_slug: str = ""
library_agent_id: str = ""
inputs: dict[str, Any] = Field(default_factory=dict)
use_defaults: bool = False
schedule_name: str = ""
@@ -63,7 +66,12 @@ class RunAgentInput(BaseModel):
timezone: str = "UTC"
@field_validator(
"username_agent_slug", "schedule_name", "cron", "timezone", mode="before"
"username_agent_slug",
"library_agent_id",
"schedule_name",
"cron",
"timezone",
mode="before",
)
@classmethod
def strip_strings(cls, v: Any) -> Any:
@@ -89,7 +97,7 @@ class RunAgentTool(BaseTool):
@property
def description(self) -> str:
return """Run or schedule an agent from the marketplace.
return """Run or schedule an agent from the marketplace or user's library.
The tool automatically handles the setup flow:
- Returns missing inputs if required fields are not provided
@@ -97,6 +105,10 @@ class RunAgentTool(BaseTool):
- Executes immediately if all requirements are met
- Schedules execution if cron expression is provided
Identify the agent using either:
- username_agent_slug: Marketplace format 'username/agent-name'
- library_agent_id: ID of an agent in the user's library
For scheduled execution, provide: schedule_name, cron, and optionally timezone."""
@property
@@ -108,6 +120,10 @@ class RunAgentTool(BaseTool):
"type": "string",
"description": "Agent identifier in format 'username/agent-name'",
},
"library_agent_id": {
"type": "string",
"description": "Library agent ID from user's library",
},
"inputs": {
"type": "object",
"description": "Input values for the agent",
@@ -130,7 +146,7 @@ class RunAgentTool(BaseTool):
"description": "IANA timezone for schedule (default: UTC)",
},
},
"required": ["username_agent_slug"],
"required": [],
}
@property
@@ -148,10 +164,16 @@ class RunAgentTool(BaseTool):
params = RunAgentInput(**kwargs)
session_id = session.session_id
# Validate agent slug format
if not params.username_agent_slug or "/" not in params.username_agent_slug:
# Validate at least one identifier is provided
has_slug = params.username_agent_slug and "/" in params.username_agent_slug
has_library_id = bool(params.library_agent_id)
if not has_slug and not has_library_id:
return ErrorResponse(
message="Please provide an agent slug in format 'username/agent-name'",
message=(
"Please provide either a username_agent_slug "
"(format 'username/agent-name') or a library_agent_id"
),
session_id=session_id,
)
@@ -166,13 +188,41 @@ class RunAgentTool(BaseTool):
is_schedule = bool(params.schedule_name or params.cron)
try:
# Step 1: Fetch agent details (always happens first)
username, agent_name = params.username_agent_slug.split("/", 1)
graph, store_agent = await fetch_graph_from_store_slug(username, agent_name)
# Step 1: Fetch agent details
graph: GraphModel | None = None
library_agent = None
# Priority: library_agent_id if provided
if has_library_id:
library_agent = await library_db.get_library_agent(
params.library_agent_id, user_id
)
if not library_agent:
return ErrorResponse(
message=f"Library agent '{params.library_agent_id}' not found",
session_id=session_id,
)
# Get the graph from the library agent
from backend.data.graph import get_graph
graph = await get_graph(
library_agent.graph_id,
library_agent.graph_version,
user_id=user_id,
)
else:
# Fetch from marketplace slug
username, agent_name = params.username_agent_slug.split("/", 1)
graph, _ = await fetch_graph_from_store_slug(username, agent_name)
if not graph:
identifier = (
params.library_agent_id
if has_library_id
else params.username_agent_slug
)
return ErrorResponse(
message=f"Agent '{params.username_agent_slug}' not found in marketplace",
message=f"Agent '{identifier}' not found",
session_id=session_id,
)

View File

@@ -1,15 +1,16 @@
import uuid
from unittest.mock import AsyncMock, patch
import orjson
import pytest
from backend.server.v2.chat.tools._test_data import (
from ._test_data import (
make_session,
setup_firecrawl_test_data,
setup_llm_test_data,
setup_test_data,
)
from backend.server.v2.chat.tools.run_agent import RunAgentTool
from .run_agent import RunAgentTool
# This is so the formatter doesn't remove the fixture imports
setup_llm_test_data = setup_llm_test_data
@@ -17,6 +18,17 @@ setup_test_data = setup_test_data
setup_firecrawl_test_data = setup_firecrawl_test_data
@pytest.fixture(scope="session", autouse=True)
def mock_embedding_functions():
"""Mock embedding functions for all tests to avoid database/API dependencies."""
with patch(
"backend.api.features.store.db.ensure_embedding",
new_callable=AsyncMock,
return_value=True,
):
yield
@pytest.mark.asyncio(scope="session")
async def test_run_agent(setup_test_data):
"""Test that the run_agent tool successfully executes an approved agent"""
@@ -46,11 +58,11 @@ async def test_run_agent(setup_test_data):
# Verify the response
assert response is not None
assert hasattr(response, "result")
assert hasattr(response, "output")
# Parse the result JSON to verify the execution started
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
assert "execution_id" in result_data
assert "graph_id" in result_data
assert result_data["graph_id"] == graph.id
@@ -86,11 +98,11 @@ async def test_run_agent_missing_inputs(setup_test_data):
# Verify that we get an error response
assert response is not None
assert hasattr(response, "result")
assert hasattr(response, "output")
# The tool should return an ErrorResponse when setup info indicates not ready
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
assert "message" in result_data
@@ -118,10 +130,10 @@ async def test_run_agent_invalid_agent_id(setup_test_data):
# Verify that we get an error response
assert response is not None
assert hasattr(response, "result")
assert hasattr(response, "output")
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
assert "message" in result_data
# Should get an error about failed setup or not found
assert any(
@@ -158,12 +170,12 @@ async def test_run_agent_with_llm_credentials(setup_llm_test_data):
# Verify the response
assert response is not None
assert hasattr(response, "result")
assert hasattr(response, "output")
# Parse the result JSON to verify the execution started
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
# Should successfully start execution since credentials are available
assert "execution_id" in result_data
@@ -195,9 +207,9 @@ async def test_run_agent_shows_available_inputs_when_none_provided(setup_test_da
)
assert response is not None
assert hasattr(response, "result")
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert hasattr(response, "output")
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
# Should return agent_details type showing available inputs
assert result_data.get("type") == "agent_details"
@@ -230,9 +242,9 @@ async def test_run_agent_with_use_defaults(setup_test_data):
)
assert response is not None
assert hasattr(response, "result")
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert hasattr(response, "output")
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
# Should execute successfully
assert "execution_id" in result_data
@@ -260,9 +272,9 @@ async def test_run_agent_missing_credentials(setup_firecrawl_test_data):
)
assert response is not None
assert hasattr(response, "result")
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert hasattr(response, "output")
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
# Should return setup_requirements type with missing credentials
assert result_data.get("type") == "setup_requirements"
@@ -292,9 +304,9 @@ async def test_run_agent_invalid_slug_format(setup_test_data):
)
assert response is not None
assert hasattr(response, "result")
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert hasattr(response, "output")
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
# Should return error
assert result_data.get("type") == "error"
@@ -305,9 +317,10 @@ async def test_run_agent_invalid_slug_format(setup_test_data):
async def test_run_agent_unauthenticated():
"""Test that run_agent returns need_login for unauthenticated users."""
tool = RunAgentTool()
session = make_session(user_id=None)
# Session has a user_id (session owner), but we test tool execution without user_id
session = make_session(user_id="test-session-owner")
# Execute without user_id
# Execute without user_id to test unauthenticated behavior
response = await tool.execute(
user_id=None,
session_id=str(uuid.uuid4()),
@@ -318,9 +331,9 @@ async def test_run_agent_unauthenticated():
)
assert response is not None
assert hasattr(response, "result")
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert hasattr(response, "output")
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
# Base tool returns need_login type for unauthenticated users
assert result_data.get("type") == "need_login"
@@ -350,9 +363,9 @@ async def test_run_agent_schedule_without_cron(setup_test_data):
)
assert response is not None
assert hasattr(response, "result")
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert hasattr(response, "output")
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
# Should return error about missing cron
assert result_data.get("type") == "error"
@@ -382,9 +395,9 @@ async def test_run_agent_schedule_without_name(setup_test_data):
)
assert response is not None
assert hasattr(response, "result")
assert isinstance(response.result, str)
result_data = orjson.loads(response.result)
assert hasattr(response, "output")
assert isinstance(response.output, str)
result_data = orjson.loads(response.output)
# Should return error about missing schedule_name
assert result_data.get("type") == "error"

View File

@@ -0,0 +1,287 @@
"""Tool for executing blocks directly."""
import logging
from collections import defaultdict
from typing import Any
from backend.api.features.chat.model import ChatSession
from backend.data.block import get_block
from backend.data.model import CredentialsMetaInput
from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.util.exceptions import BlockError
from .base import BaseTool
from .models import (
BlockOutputResponse,
ErrorResponse,
SetupInfo,
SetupRequirementsResponse,
ToolResponseBase,
UserReadiness,
)
logger = logging.getLogger(__name__)
class RunBlockTool(BaseTool):
"""Tool for executing a block and returning its outputs."""
@property
def name(self) -> str:
return "run_block"
@property
def description(self) -> str:
return (
"Execute a specific block with the provided input data. "
"Use find_block to discover available blocks and their input schemas. "
"The block will run and return its outputs once complete."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"block_id": {
"type": "string",
"description": "The UUID of the block to execute",
},
"input_data": {
"type": "object",
"description": (
"Input values for the block. Must match the block's input schema. "
"Check the block's input_schema from find_block for required fields."
),
},
},
"required": ["block_id", "input_data"],
}
@property
def requires_auth(self) -> bool:
return True
async def _check_block_credentials(
self,
user_id: str,
block: Any,
) -> tuple[dict[str, CredentialsMetaInput], list[CredentialsMetaInput]]:
"""
Check if user has required credentials for a block.
Returns:
tuple[matched_credentials, missing_credentials]
"""
matched_credentials: dict[str, CredentialsMetaInput] = {}
missing_credentials: list[CredentialsMetaInput] = []
# Get credential field info from block's input schema
credentials_fields_info = block.input_schema.get_credentials_fields_info()
if not credentials_fields_info:
return matched_credentials, missing_credentials
# Get user's available credentials
creds_manager = IntegrationCredentialsManager()
available_creds = await creds_manager.store.get_all_creds(user_id)
for field_name, field_info in credentials_fields_info.items():
# field_info.provider is a frozenset of acceptable providers
# field_info.supported_types is a frozenset of acceptable types
matching_cred = next(
(
cred
for cred in available_creds
if cred.provider in field_info.provider
and cred.type in field_info.supported_types
),
None,
)
if matching_cred:
matched_credentials[field_name] = CredentialsMetaInput(
id=matching_cred.id,
provider=matching_cred.provider, # type: ignore
type=matching_cred.type,
title=matching_cred.title,
)
else:
# Create a placeholder for the missing credential
provider = next(iter(field_info.provider), "unknown")
cred_type = next(iter(field_info.supported_types), "api_key")
missing_credentials.append(
CredentialsMetaInput(
id=field_name,
provider=provider, # type: ignore
type=cred_type, # type: ignore
title=field_name.replace("_", " ").title(),
)
)
return matched_credentials, missing_credentials
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
"""Execute a block with the given input data.
Args:
user_id: User ID (required)
session: Chat session
block_id: Block UUID to execute
input_data: Input values for the block
Returns:
BlockOutputResponse: Block execution outputs
SetupRequirementsResponse: Missing credentials
ErrorResponse: Error message
"""
block_id = kwargs.get("block_id", "").strip()
input_data = kwargs.get("input_data", {})
session_id = session.session_id
if not block_id:
return ErrorResponse(
message="Please provide a block_id",
session_id=session_id,
)
if not isinstance(input_data, dict):
return ErrorResponse(
message="input_data must be an object",
session_id=session_id,
)
if not user_id:
return ErrorResponse(
message="Authentication required",
session_id=session_id,
)
# Get the block
block = get_block(block_id)
if not block:
return ErrorResponse(
message=f"Block '{block_id}' not found",
session_id=session_id,
)
logger.info(f"Executing block {block.name} ({block_id}) for user {user_id}")
# Check credentials
creds_manager = IntegrationCredentialsManager()
matched_credentials, missing_credentials = await self._check_block_credentials(
user_id, block
)
if missing_credentials:
# Return setup requirements response with missing credentials
missing_creds_dict = {c.id: c.model_dump() for c in missing_credentials}
return SetupRequirementsResponse(
message=(
f"Block '{block.name}' requires credentials that are not configured. "
"Please set up the required credentials before running this block."
),
session_id=session_id,
setup_info=SetupInfo(
agent_id=block_id,
agent_name=block.name,
user_readiness=UserReadiness(
has_all_credentials=False,
missing_credentials=missing_creds_dict,
ready_to_run=False,
),
requirements={
"credentials": [c.model_dump() for c in missing_credentials],
"inputs": self._get_inputs_list(block),
"execution_modes": ["immediate"],
},
),
graph_id=None,
graph_version=None,
)
try:
# Fetch actual credentials and prepare kwargs for block execution
exec_kwargs: dict[str, Any] = {"user_id": user_id}
for field_name, cred_meta in matched_credentials.items():
# Inject metadata into input_data (for validation)
if field_name not in input_data:
input_data[field_name] = cred_meta.model_dump()
# Fetch actual credentials and pass as kwargs (for execution)
actual_credentials = await creds_manager.get(
user_id, cred_meta.id, lock=False
)
if actual_credentials:
exec_kwargs[field_name] = actual_credentials
else:
return ErrorResponse(
message=f"Failed to retrieve credentials for {field_name}",
session_id=session_id,
)
# Execute the block and collect outputs
outputs: dict[str, list[Any]] = defaultdict(list)
async for output_name, output_data in block.execute(
input_data,
**exec_kwargs,
):
outputs[output_name].append(output_data)
return BlockOutputResponse(
message=f"Block '{block.name}' executed successfully",
block_id=block_id,
block_name=block.name,
outputs=dict(outputs),
success=True,
session_id=session_id,
)
except BlockError as e:
logger.warning(f"Block execution failed: {e}")
return ErrorResponse(
message=f"Block execution failed: {e}",
error=str(e),
session_id=session_id,
)
except Exception as e:
logger.error(f"Unexpected error executing block: {e}", exc_info=True)
return ErrorResponse(
message=f"Failed to execute block: {str(e)}",
error=str(e),
session_id=session_id,
)
def _get_inputs_list(self, block: Any) -> list[dict[str, Any]]:
"""Extract non-credential inputs from block schema."""
inputs_list = []
schema = block.input_schema.jsonschema()
properties = schema.get("properties", {})
required_fields = set(schema.get("required", []))
# Get credential field names to exclude
credentials_fields = set(block.input_schema.get_credentials_fields().keys())
for field_name, field_schema in properties.items():
# Skip credential fields
if field_name in credentials_fields:
continue
inputs_list.append(
{
"name": field_name,
"title": field_schema.get("title", field_name),
"type": field_schema.get("type", "string"),
"description": field_schema.get("description", ""),
"required": field_name in required_fields,
}
)
return inputs_list

View File

@@ -3,13 +3,13 @@
import logging
from typing import Any
from backend.api.features.library import db as library_db
from backend.api.features.library import model as library_model
from backend.api.features.store import db as store_db
from backend.data import graph as graph_db
from backend.data.graph import GraphModel
from backend.data.model import CredentialsMetaInput
from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.server.v2.library import db as library_db
from backend.server.v2.library import model as library_model
from backend.server.v2.store import db as store_db
from backend.util.exceptions import NotFoundError
logger = logging.getLogger(__name__)

View File

@@ -7,9 +7,10 @@ import pytest_mock
from prisma.enums import ReviewStatus
from pytest_snapshot.plugin import Snapshot
from backend.server.rest_api import handle_internal_http_error
from backend.server.v2.executions.review.model import PendingHumanReviewModel
from backend.server.v2.executions.review.routes import router
from backend.api.rest_api import handle_internal_http_error
from .model import PendingHumanReviewModel
from .routes import router
# Using a fixed timestamp for reproducible tests
FIXED_NOW = datetime.datetime(2023, 1, 1, 0, 0, 0, tzinfo=datetime.timezone.utc)
@@ -54,13 +55,13 @@ def sample_pending_review(test_user_id: str) -> PendingHumanReviewModel:
def test_get_pending_reviews_empty(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
snapshot: Snapshot,
test_user_id: str,
) -> None:
"""Test getting pending reviews when none exist"""
mock_get_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_user"
"backend.api.features.executions.review.routes.get_pending_reviews_for_user"
)
mock_get_reviews.return_value = []
@@ -72,14 +73,14 @@ def test_get_pending_reviews_empty(
def test_get_pending_reviews_with_data(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
snapshot: Snapshot,
test_user_id: str,
) -> None:
"""Test getting pending reviews with data"""
mock_get_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_user"
"backend.api.features.executions.review.routes.get_pending_reviews_for_user"
)
mock_get_reviews.return_value = [sample_pending_review]
@@ -94,14 +95,14 @@ def test_get_pending_reviews_with_data(
def test_get_pending_reviews_for_execution_success(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
snapshot: Snapshot,
test_user_id: str,
) -> None:
"""Test getting pending reviews for specific execution"""
mock_get_graph_execution = mocker.patch(
"backend.server.v2.executions.review.routes.get_graph_execution_meta"
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_get_graph_execution.return_value = {
"id": "test_graph_exec_456",
@@ -109,7 +110,7 @@ def test_get_pending_reviews_for_execution_success(
}
mock_get_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_execution"
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
mock_get_reviews.return_value = [sample_pending_review]
@@ -121,24 +122,23 @@ def test_get_pending_reviews_for_execution_success(
assert data[0]["graph_exec_id"] == "test_graph_exec_456"
def test_get_pending_reviews_for_execution_access_denied(
mocker: pytest_mock.MockFixture,
test_user_id: str,
def test_get_pending_reviews_for_execution_not_available(
mocker: pytest_mock.MockerFixture,
) -> None:
"""Test access denied when user doesn't own the execution"""
mock_get_graph_execution = mocker.patch(
"backend.server.v2.executions.review.routes.get_graph_execution_meta"
"backend.api.features.executions.review.routes.get_graph_execution_meta"
)
mock_get_graph_execution.return_value = None
response = client.get("/api/review/execution/test_graph_exec_456")
assert response.status_code == 403
assert "Access denied" in response.json()["detail"]
assert response.status_code == 404
assert "not found" in response.json()["detail"]
def test_process_review_action_approve_success(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
@@ -146,12 +146,12 @@ def test_process_review_action_approve_success(
# Mock the route functions
mock_get_reviews_for_execution = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_execution"
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
mock_get_reviews_for_execution.return_value = [sample_pending_review]
mock_process_all_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.process_all_reviews_for_execution"
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
# Create approved review for return
approved_review = PendingHumanReviewModel(
@@ -174,11 +174,11 @@ def test_process_review_action_approve_success(
mock_process_all_reviews.return_value = {"test_node_123": approved_review}
mock_has_pending = mocker.patch(
"backend.server.v2.executions.review.routes.has_pending_reviews_for_graph_exec"
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
mock_has_pending.return_value = False
mocker.patch("backend.server.v2.executions.review.routes.add_graph_execution")
mocker.patch("backend.api.features.executions.review.routes.add_graph_execution")
request_data = {
"reviews": [
@@ -202,7 +202,7 @@ def test_process_review_action_approve_success(
def test_process_review_action_reject_success(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
@@ -210,12 +210,12 @@ def test_process_review_action_reject_success(
# Mock the route functions
mock_get_reviews_for_execution = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_execution"
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
mock_get_reviews_for_execution.return_value = [sample_pending_review]
mock_process_all_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.process_all_reviews_for_execution"
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
rejected_review = PendingHumanReviewModel(
node_exec_id="test_node_123",
@@ -237,7 +237,7 @@ def test_process_review_action_reject_success(
mock_process_all_reviews.return_value = {"test_node_123": rejected_review}
mock_has_pending = mocker.patch(
"backend.server.v2.executions.review.routes.has_pending_reviews_for_graph_exec"
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
mock_has_pending.return_value = False
@@ -262,7 +262,7 @@ def test_process_review_action_reject_success(
def test_process_review_action_mixed_success(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
@@ -289,12 +289,12 @@ def test_process_review_action_mixed_success(
# Mock the route functions
mock_get_reviews_for_execution = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_execution"
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
mock_get_reviews_for_execution.return_value = [sample_pending_review, second_review]
mock_process_all_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.process_all_reviews_for_execution"
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
# Create approved version of first review
approved_review = PendingHumanReviewModel(
@@ -338,7 +338,7 @@ def test_process_review_action_mixed_success(
}
mock_has_pending = mocker.patch(
"backend.server.v2.executions.review.routes.has_pending_reviews_for_graph_exec"
"backend.api.features.executions.review.routes.has_pending_reviews_for_graph_exec"
)
mock_has_pending.return_value = False
@@ -369,7 +369,7 @@ def test_process_review_action_mixed_success(
def test_process_review_action_empty_request(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
test_user_id: str,
) -> None:
"""Test error when no reviews provided"""
@@ -386,19 +386,19 @@ def test_process_review_action_empty_request(
def test_process_review_action_review_not_found(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
test_user_id: str,
) -> None:
"""Test error when review is not found"""
# Mock the functions that extract graph execution ID from the request
mock_get_reviews_for_execution = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_execution"
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
mock_get_reviews_for_execution.return_value = [] # No reviews found
# Mock process_all_reviews to simulate not finding reviews
mock_process_all_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.process_all_reviews_for_execution"
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
# This should raise a ValueError with "Reviews not found" message based on the data/human_review.py logic
mock_process_all_reviews.side_effect = ValueError(
@@ -422,20 +422,20 @@ def test_process_review_action_review_not_found(
def test_process_review_action_partial_failure(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
"""Test handling of partial failures in review processing"""
# Mock the route functions
mock_get_reviews_for_execution = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_execution"
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
mock_get_reviews_for_execution.return_value = [sample_pending_review]
# Mock partial failure in processing
mock_process_all_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.process_all_reviews_for_execution"
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
mock_process_all_reviews.side_effect = ValueError("Some reviews failed validation")
@@ -456,20 +456,20 @@ def test_process_review_action_partial_failure(
def test_process_review_action_invalid_node_exec_id(
mocker: pytest_mock.MockFixture,
mocker: pytest_mock.MockerFixture,
sample_pending_review: PendingHumanReviewModel,
test_user_id: str,
) -> None:
"""Test failure when trying to process review with invalid node execution ID"""
# Mock the route functions
mock_get_reviews_for_execution = mocker.patch(
"backend.server.v2.executions.review.routes.get_pending_reviews_for_execution"
"backend.api.features.executions.review.routes.get_pending_reviews_for_execution"
)
mock_get_reviews_for_execution.return_value = [sample_pending_review]
# Mock validation failure - this should return 400, not 500
mock_process_all_reviews = mocker.patch(
"backend.server.v2.executions.review.routes.process_all_reviews_for_execution"
"backend.api.features.executions.review.routes.process_all_reviews_for_execution"
)
mock_process_all_reviews.side_effect = ValueError(
"Invalid node execution ID format"

View File

@@ -13,11 +13,8 @@ from backend.data.human_review import (
process_all_reviews_for_execution,
)
from backend.executor.utils import add_graph_execution
from backend.server.v2.executions.review.model import (
PendingHumanReviewModel,
ReviewRequest,
ReviewResponse,
)
from .model import PendingHumanReviewModel, ReviewRequest, ReviewResponse
logger = logging.getLogger(__name__)
@@ -70,8 +67,7 @@ async def list_pending_reviews(
response_model=List[PendingHumanReviewModel],
responses={
200: {"description": "List of pending reviews for the execution"},
400: {"description": "Invalid graph execution ID"},
403: {"description": "Access denied to graph execution"},
404: {"description": "Graph execution not found"},
500: {"description": "Server error", "content": {"application/json": {}}},
},
)
@@ -94,7 +90,7 @@ async def list_pending_reviews_for_execution(
Raises:
HTTPException:
- 403: If user doesn't own the graph execution
- 404: If the graph execution doesn't exist or isn't owned by this user
- 500: If authentication fails or database error occurs
Note:
@@ -108,8 +104,8 @@ async def list_pending_reviews_for_execution(
)
if not graph_exec:
raise HTTPException(
status_code=status.HTTP_403_FORBIDDEN,
detail="Access denied to graph execution",
status_code=status.HTTP_404_NOT_FOUND,
detail=f"Graph execution #{graph_exec_id} not found",
)
return await get_pending_reviews_for_execution(graph_exec_id, user_id)

View File

@@ -17,6 +17,8 @@ from fastapi import (
from pydantic import BaseModel, Field, SecretStr
from starlette.status import HTTP_500_INTERNAL_SERVER_ERROR, HTTP_502_BAD_GATEWAY
from backend.api.features.library.db import set_preset_webhook, update_preset
from backend.api.features.library.model import LibraryAgentPreset
from backend.data.graph import NodeModel, get_graph, set_node_webhook
from backend.data.integrations import (
WebhookEvent,
@@ -33,11 +35,7 @@ from backend.data.model import (
OAuth2Credentials,
UserIntegrations,
)
from backend.data.onboarding import (
OnboardingStep,
complete_onboarding_step,
increment_runs,
)
from backend.data.onboarding import OnboardingStep, complete_onboarding_step
from backend.data.user import get_user_integrations
from backend.executor.utils import add_graph_execution
from backend.integrations.ayrshare import AyrshareClient, SocialPlatform
@@ -45,13 +43,6 @@ from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.integrations.oauth import CREDENTIALS_BY_PROVIDER, HANDLERS_BY_NAME
from backend.integrations.providers import ProviderName
from backend.integrations.webhooks import get_webhook_manager
from backend.server.integrations.models import (
ProviderConstants,
ProviderNamesResponse,
get_all_provider_names,
)
from backend.server.v2.library.db import set_preset_webhook, update_preset
from backend.server.v2.library.model import LibraryAgentPreset
from backend.util.exceptions import (
GraphNotInLibraryError,
MissingConfigError,
@@ -60,6 +51,8 @@ from backend.util.exceptions import (
)
from backend.util.settings import Settings
from .models import ProviderConstants, ProviderNamesResponse, get_all_provider_names
if TYPE_CHECKING:
from backend.integrations.oauth import BaseOAuthHandler
@@ -178,6 +171,7 @@ async def callback(
f"Successfully processed OAuth callback for user {user_id} "
f"and provider {provider.value}"
)
return CredentialsMetaResponse(
id=credentials.id,
provider=credentials.provider,
@@ -196,6 +190,7 @@ async def list_credentials(
user_id: Annotated[str, Security(get_user_id)],
) -> list[CredentialsMetaResponse]:
credentials = await creds_manager.store.get_all_creds(user_id)
return [
CredentialsMetaResponse(
id=cred.id,
@@ -218,6 +213,7 @@ async def list_credentials_by_provider(
user_id: Annotated[str, Security(get_user_id)],
) -> list[CredentialsMetaResponse]:
credentials = await creds_manager.store.get_creds_by_provider(user_id, provider)
return [
CredentialsMetaResponse(
id=cred.id,
@@ -381,7 +377,6 @@ async def webhook_ingress_generic(
return
await complete_onboarding_step(user_id, OnboardingStep.TRIGGER_WEBHOOK)
await increment_runs(user_id)
# Execute all triggers concurrently for better performance
tasks = []
@@ -834,6 +829,18 @@ async def list_providers() -> List[str]:
return all_providers
@router.get("/providers/system", response_model=List[str])
async def list_system_providers() -> List[str]:
"""
Get a list of providers that have platform credits (system credentials) available.
These providers can be used without the user providing their own API keys.
"""
from backend.integrations.credentials_store import SYSTEM_PROVIDERS
return list(SYSTEM_PROVIDERS)
@router.get("/providers/names", response_model=ProviderNamesResponse)
async def get_provider_names() -> ProviderNamesResponse:
"""

View File

@@ -4,16 +4,14 @@ from typing import Literal, Optional
import fastapi
import prisma.errors
import prisma.fields
import prisma.models
import prisma.types
import backend.api.features.store.exceptions as store_exceptions
import backend.api.features.store.image_gen as store_image_gen
import backend.api.features.store.media as store_media
import backend.data.graph as graph_db
import backend.data.integrations as integrations_db
import backend.server.v2.library.model as library_model
import backend.server.v2.store.exceptions as store_exceptions
import backend.server.v2.store.image_gen as store_image_gen
import backend.server.v2.store.media as store_media
from backend.data.block import BlockInput
from backend.data.db import transaction
from backend.data.execution import get_graph_execution
@@ -28,6 +26,8 @@ from backend.util.json import SafeJson
from backend.util.models import Pagination
from backend.util.settings import Config
from . import model as library_model
logger = logging.getLogger(__name__)
config = Config()
integration_creds_manager = IntegrationCredentialsManager()
@@ -489,7 +489,7 @@ async def update_agent_version_in_library(
agent_graph_version: int,
) -> library_model.LibraryAgent:
"""
Updates the agent version in the library if useGraphIsActiveVersion is True.
Updates the agent version in the library for any agent owned by the user.
Args:
user_id: Owner of the LibraryAgent.
@@ -498,20 +498,31 @@ async def update_agent_version_in_library(
Raises:
DatabaseError: If there's an error with the update.
NotFoundError: If no library agent is found for this user and agent.
"""
logger.debug(
f"Updating agent version in library for user #{user_id}, "
f"agent #{agent_graph_id} v{agent_graph_version}"
)
try:
library_agent = await prisma.models.LibraryAgent.prisma().find_first_or_raise(
async with transaction() as tx:
library_agent = await prisma.models.LibraryAgent.prisma(tx).find_first_or_raise(
where={
"userId": user_id,
"agentGraphId": agent_graph_id,
"useGraphIsActiveVersion": True,
},
)
lib = await prisma.models.LibraryAgent.prisma().update(
# Delete any conflicting LibraryAgent for the target version
await prisma.models.LibraryAgent.prisma(tx).delete_many(
where={
"userId": user_id,
"agentGraphId": agent_graph_id,
"agentGraphVersion": agent_graph_version,
"id": {"not": library_agent.id},
}
)
lib = await prisma.models.LibraryAgent.prisma(tx).update(
where={"id": library_agent.id},
data={
"AgentGraph": {
@@ -525,19 +536,20 @@ async def update_agent_version_in_library(
},
include={"AgentGraph": True},
)
if lib is None:
raise NotFoundError(f"Library agent {library_agent.id} not found")
return library_model.LibraryAgent.from_db(lib)
except prisma.errors.PrismaError as e:
logger.error(f"Database error updating agent version in library: {e}")
raise DatabaseError("Failed to update agent version in library") from e
if lib is None:
raise NotFoundError(
f"Failed to update library agent for {agent_graph_id} v{agent_graph_version}"
)
return library_model.LibraryAgent.from_db(lib)
async def update_library_agent(
library_agent_id: str,
user_id: str,
auto_update_version: Optional[bool] = None,
graph_version: Optional[int] = None,
is_favorite: Optional[bool] = None,
is_archived: Optional[bool] = None,
is_deleted: Optional[Literal[False]] = None,
@@ -550,6 +562,7 @@ async def update_library_agent(
library_agent_id: The ID of the LibraryAgent to update.
user_id: The owner of this LibraryAgent.
auto_update_version: Whether the agent should auto-update to active version.
graph_version: Specific graph version to update to.
is_favorite: Whether this agent is marked as a favorite.
is_archived: Whether this agent is archived.
settings: User-specific settings for this library agent.
@@ -563,8 +576,8 @@ async def update_library_agent(
"""
logger.debug(
f"Updating library agent {library_agent_id} for user {user_id} with "
f"auto_update_version={auto_update_version}, is_favorite={is_favorite}, "
f"is_archived={is_archived}, settings={settings}"
f"auto_update_version={auto_update_version}, graph_version={graph_version}, "
f"is_favorite={is_favorite}, is_archived={is_archived}, settings={settings}"
)
update_fields: prisma.types.LibraryAgentUpdateManyMutationInput = {}
if auto_update_version is not None:
@@ -581,10 +594,23 @@ async def update_library_agent(
update_fields["isDeleted"] = is_deleted
if settings is not None:
update_fields["settings"] = SafeJson(settings.model_dump())
if not update_fields:
raise ValueError("No values were passed to update")
try:
# If graph_version is provided, update to that specific version
if graph_version is not None:
# Get the current agent to find its graph_id
agent = await get_library_agent(id=library_agent_id, user_id=user_id)
# Update to the specified version using existing function
return await update_agent_version_in_library(
user_id=user_id,
agent_graph_id=agent.graph_id,
agent_graph_version=graph_version,
)
# Otherwise, just update the simple fields
if not update_fields:
raise ValueError("No values were passed to update")
n_updated = await prisma.models.LibraryAgent.prisma().update_many(
where={"id": library_agent_id, "userId": user_id},
data=update_fields,
@@ -810,6 +836,7 @@ async def add_store_agent_to_library(
}
},
"isCreatedByUser": False,
"useGraphIsActiveVersion": False,
"settings": SafeJson(
_initialize_graph_settings(graph_model).model_dump()
),

View File

@@ -1,16 +1,15 @@
from datetime import datetime
import prisma.enums
import prisma.errors
import prisma.models
import prisma.types
import pytest
import backend.server.v2.library.db as db
import backend.server.v2.store.exceptions
import backend.api.features.store.exceptions
from backend.data.db import connect
from backend.data.includes import library_agent_include
from . import db
@pytest.mark.asyncio
async def test_get_library_agents(mocker):
@@ -88,7 +87,7 @@ async def test_add_agent_to_library(mocker):
await connect()
# Mock the transaction context
mock_transaction = mocker.patch("backend.server.v2.library.db.transaction")
mock_transaction = mocker.patch("backend.api.features.library.db.transaction")
mock_transaction.return_value.__aenter__ = mocker.AsyncMock(return_value=None)
mock_transaction.return_value.__aexit__ = mocker.AsyncMock(return_value=None)
# Mock data
@@ -151,7 +150,7 @@ async def test_add_agent_to_library(mocker):
)
# Mock graph_db.get_graph function that's called to check for HITL blocks
mock_graph_db = mocker.patch("backend.server.v2.library.db.graph_db")
mock_graph_db = mocker.patch("backend.api.features.library.db.graph_db")
mock_graph_model = mocker.Mock()
mock_graph_model.nodes = (
[]
@@ -159,7 +158,9 @@ async def test_add_agent_to_library(mocker):
mock_graph_db.get_graph = mocker.AsyncMock(return_value=mock_graph_model)
# Mock the model conversion
mock_from_db = mocker.patch("backend.server.v2.library.model.LibraryAgent.from_db")
mock_from_db = mocker.patch(
"backend.api.features.library.model.LibraryAgent.from_db"
)
mock_from_db.return_value = mocker.Mock()
# Call function
@@ -217,7 +218,7 @@ async def test_add_agent_to_library_not_found(mocker):
)
# Call function and verify exception
with pytest.raises(backend.server.v2.store.exceptions.AgentNotFoundError):
with pytest.raises(backend.api.features.store.exceptions.AgentNotFoundError):
await db.add_store_agent_to_library("version123", "test-user")
# Verify mock called correctly

View File

@@ -48,6 +48,7 @@ class LibraryAgent(pydantic.BaseModel):
id: str
graph_id: str
graph_version: int
owner_user_id: str # ID of user who owns/created this agent graph
image_url: str | None
@@ -163,6 +164,7 @@ class LibraryAgent(pydantic.BaseModel):
id=agent.id,
graph_id=agent.agentGraphId,
graph_version=agent.agentGraphVersion,
owner_user_id=agent.userId,
image_url=agent.imageUrl,
creator_name=creator_name,
creator_image_url=creator_image_url,
@@ -385,6 +387,9 @@ class LibraryAgentUpdateRequest(pydantic.BaseModel):
auto_update_version: Optional[bool] = pydantic.Field(
default=None, description="Auto-update the agent version"
)
graph_version: Optional[int] = pydantic.Field(
default=None, description="Specific graph version to update to"
)
is_favorite: Optional[bool] = pydantic.Field(
default=None, description="Mark the agent as a favorite"
)

View File

@@ -3,7 +3,7 @@ import datetime
import prisma.models
import pytest
import backend.server.v2.library.model as library_model
from . import model as library_model
@pytest.mark.asyncio

View File

@@ -6,12 +6,13 @@ from fastapi import APIRouter, Body, HTTPException, Query, Security, status
from fastapi.responses import Response
from prisma.enums import OnboardingStep
import backend.server.v2.library.db as library_db
import backend.server.v2.library.model as library_model
import backend.server.v2.store.exceptions as store_exceptions
import backend.api.features.store.exceptions as store_exceptions
from backend.data.onboarding import complete_onboarding_step
from backend.util.exceptions import DatabaseError, NotFoundError
from .. import db as library_db
from .. import model as library_model
logger = logging.getLogger(__name__)
router = APIRouter(
@@ -284,6 +285,7 @@ async def update_library_agent(
library_agent_id=library_agent_id,
user_id=user_id,
auto_update_version=payload.auto_update_version,
graph_version=payload.graph_version,
is_favorite=payload.is_favorite,
is_archived=payload.is_archived,
settings=payload.settings,

View File

@@ -4,19 +4,19 @@ from typing import Any, Optional
import autogpt_libs.auth as autogpt_auth_lib
from fastapi import APIRouter, Body, HTTPException, Query, Security, status
import backend.server.v2.library.db as db
import backend.server.v2.library.model as models
from backend.data.execution import GraphExecutionMeta
from backend.data.graph import get_graph
from backend.data.integrations import get_webhook
from backend.data.model import CredentialsMetaInput
from backend.data.onboarding import increment_runs
from backend.executor.utils import add_graph_execution, make_node_credentials_input_map
from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.integrations.webhooks import get_webhook_manager
from backend.integrations.webhooks.utils import setup_webhook_for_block
from backend.util.exceptions import NotFoundError
from .. import db
from .. import model as models
logger = logging.getLogger(__name__)
credentials_manager = IntegrationCredentialsManager()
@@ -402,8 +402,6 @@ async def execute_preset(
merged_node_input = preset.inputs | inputs
merged_credential_inputs = preset.credentials | credential_inputs
await increment_runs(user_id)
return await add_graph_execution(
user_id=user_id,
graph_id=preset.graph_id,

View File

@@ -7,10 +7,11 @@ import pytest
import pytest_mock
from pytest_snapshot.plugin import Snapshot
import backend.server.v2.library.model as library_model
from backend.server.v2.library.routes import router as library_router
from backend.util.models import Pagination
from . import model as library_model
from .routes import router as library_router
app = fastapi.FastAPI()
app.include_router(library_router)
@@ -41,6 +42,7 @@ async def test_get_library_agents_success(
id="test-agent-1",
graph_id="test-agent-1",
graph_version=1,
owner_user_id=test_user_id,
name="Test Agent 1",
description="Test Description 1",
image_url=None,
@@ -63,6 +65,7 @@ async def test_get_library_agents_success(
id="test-agent-2",
graph_id="test-agent-2",
graph_version=1,
owner_user_id=test_user_id,
name="Test Agent 2",
description="Test Description 2",
image_url=None,
@@ -86,7 +89,7 @@ async def test_get_library_agents_success(
total_items=2, total_pages=1, current_page=1, page_size=50
),
)
mock_db_call = mocker.patch("backend.server.v2.library.db.list_library_agents")
mock_db_call = mocker.patch("backend.api.features.library.db.list_library_agents")
mock_db_call.return_value = mocked_value
response = client.get("/agents?search_term=test")
@@ -112,7 +115,7 @@ async def test_get_library_agents_success(
def test_get_library_agents_error(mocker: pytest_mock.MockFixture, test_user_id: str):
mock_db_call = mocker.patch("backend.server.v2.library.db.list_library_agents")
mock_db_call = mocker.patch("backend.api.features.library.db.list_library_agents")
mock_db_call.side_effect = Exception("Test error")
response = client.get("/agents?search_term=test")
@@ -137,6 +140,7 @@ async def test_get_favorite_library_agents_success(
id="test-agent-1",
graph_id="test-agent-1",
graph_version=1,
owner_user_id=test_user_id,
name="Favorite Agent 1",
description="Test Favorite Description 1",
image_url=None,
@@ -161,7 +165,7 @@ async def test_get_favorite_library_agents_success(
),
)
mock_db_call = mocker.patch(
"backend.server.v2.library.db.list_favorite_library_agents"
"backend.api.features.library.db.list_favorite_library_agents"
)
mock_db_call.return_value = mocked_value
@@ -184,7 +188,7 @@ def test_get_favorite_library_agents_error(
mocker: pytest_mock.MockFixture, test_user_id: str
):
mock_db_call = mocker.patch(
"backend.server.v2.library.db.list_favorite_library_agents"
"backend.api.features.library.db.list_favorite_library_agents"
)
mock_db_call.side_effect = Exception("Test error")
@@ -204,6 +208,7 @@ def test_add_agent_to_library_success(
id="test-library-agent-id",
graph_id="test-agent-1",
graph_version=1,
owner_user_id=test_user_id,
name="Test Agent 1",
description="Test Description 1",
image_url=None,
@@ -223,11 +228,11 @@ def test_add_agent_to_library_success(
)
mock_db_call = mocker.patch(
"backend.server.v2.library.db.add_store_agent_to_library"
"backend.api.features.library.db.add_store_agent_to_library"
)
mock_db_call.return_value = mock_library_agent
mock_complete_onboarding = mocker.patch(
"backend.server.v2.library.routes.agents.complete_onboarding_step",
"backend.api.features.library.routes.agents.complete_onboarding_step",
new_callable=AsyncMock,
)
@@ -249,7 +254,7 @@ def test_add_agent_to_library_success(
def test_add_agent_to_library_error(mocker: pytest_mock.MockFixture, test_user_id: str):
mock_db_call = mocker.patch(
"backend.server.v2.library.db.add_store_agent_to_library"
"backend.api.features.library.db.add_store_agent_to_library"
)
mock_db_call.side_effect = Exception("Test error")

View File

@@ -5,11 +5,11 @@ Implements OAuth 2.0 Authorization Code flow with PKCE support.
Flow:
1. User clicks "Login with AutoGPT" in 3rd party app
2. App redirects user to /oauth/authorize with client_id, redirect_uri, scope, state
2. App redirects user to /auth/authorize with client_id, redirect_uri, scope, state
3. User sees consent screen (if not already logged in, redirects to login first)
4. User approves backend creates authorization code
5. User redirected back to app with code
6. App exchanges code for access/refresh tokens at /oauth/token
6. App exchanges code for access/refresh tokens at /api/oauth/token
7. App uses access token to call external API endpoints
"""

View File

@@ -28,7 +28,7 @@ from prisma.models import OAuthAuthorizationCode as PrismaOAuthAuthorizationCode
from prisma.models import OAuthRefreshToken as PrismaOAuthRefreshToken
from prisma.models import User as PrismaUser
from backend.server.rest_api import app
from backend.api.rest_api import app
keysmith = APIKeySmith()

View File

@@ -6,9 +6,9 @@ import pytest
import pytest_mock
from pytest_snapshot.plugin import Snapshot
import backend.server.v2.otto.models as otto_models
import backend.server.v2.otto.routes as otto_routes
from backend.server.v2.otto.service import OttoService
from . import models as otto_models
from . import routes as otto_routes
from .service import OttoService
app = fastapi.FastAPI()
app.include_router(otto_routes.router)

View File

@@ -4,12 +4,15 @@ from typing import Annotated
from fastapi import APIRouter, Body, HTTPException, Query, Security
from fastapi.responses import JSONResponse
from backend.api.utils.api_key_auth import APIKeyAuthenticator
from backend.data.user import (
get_user_by_email,
set_user_email_verification,
unsubscribe_user_by_token,
)
from backend.server.routers.postmark.models import (
from backend.util.settings import Settings
from .models import (
PostmarkBounceEnum,
PostmarkBounceWebhook,
PostmarkClickWebhook,
@@ -19,8 +22,6 @@ from backend.server.routers.postmark.models import (
PostmarkSubscriptionChangeWebhook,
PostmarkWebhook,
)
from backend.server.utils.api_key_auth import APIKeyAuthenticator
from backend.util.settings import Settings
logger = logging.getLogger(__name__)
settings = Settings()

View File

@@ -1,8 +1,9 @@
from typing import Literal
import backend.server.v2.store.db
from backend.util.cache import cached
from . import db as store_db
##############################################
############### Caches #######################
##############################################
@@ -29,7 +30,7 @@ async def _get_cached_store_agents(
page_size: int,
):
"""Cached helper to get store agents."""
return await backend.server.v2.store.db.get_store_agents(
return await store_db.get_store_agents(
featured=featured,
creators=[creator] if creator else None,
sorted_by=sorted_by,
@@ -42,10 +43,12 @@ async def _get_cached_store_agents(
# Cache individual agent details for 15 minutes
@cached(maxsize=200, ttl_seconds=300, shared_cache=True)
async def _get_cached_agent_details(username: str, agent_name: str):
async def _get_cached_agent_details(
username: str, agent_name: str, include_changelog: bool = False
):
"""Cached helper to get agent details."""
return await backend.server.v2.store.db.get_store_agent_details(
username=username, agent_name=agent_name
return await store_db.get_store_agent_details(
username=username, agent_name=agent_name, include_changelog=include_changelog
)
@@ -59,7 +62,7 @@ async def _get_cached_store_creators(
page_size: int,
):
"""Cached helper to get store creators."""
return await backend.server.v2.store.db.get_store_creators(
return await store_db.get_store_creators(
featured=featured,
search_query=search_query,
sorted_by=sorted_by,
@@ -72,6 +75,4 @@ async def _get_cached_store_creators(
@cached(maxsize=100, ttl_seconds=300, shared_cache=True)
async def _get_cached_creator_details(username: str):
"""Cached helper to get creator details."""
return await backend.server.v2.store.db.get_store_creator_details(
username=username.lower()
)
return await store_db.get_store_creator_details(username=username.lower())

View File

@@ -6,8 +6,8 @@ import prisma.models
import pytest
from prisma import Prisma
import backend.server.v2.store.db as db
from backend.server.v2.store.model import Profile
from . import db
from .model import Profile
@pytest.fixture(autouse=True)
@@ -40,6 +40,8 @@ async def test_get_store_agents(mocker):
runs=10,
rating=4.5,
versions=["1.0"],
agentGraphVersions=["1"],
agentGraphId="test-graph-id",
updated_at=datetime.now(),
is_available=False,
useForOnboarding=False,
@@ -83,6 +85,8 @@ async def test_get_store_agent_details(mocker):
runs=10,
rating=4.5,
versions=["1.0"],
agentGraphVersions=["1"],
agentGraphId="test-graph-id",
updated_at=datetime.now(),
is_available=False,
useForOnboarding=False,
@@ -105,6 +109,8 @@ async def test_get_store_agent_details(mocker):
runs=15,
rating=4.8,
versions=["1.0", "2.0"],
agentGraphVersions=["1", "2"],
agentGraphId="test-graph-id-active",
updated_at=datetime.now(),
is_available=True,
useForOnboarding=False,

View File

@@ -0,0 +1,568 @@
"""
Unified Content Embeddings Service
Handles generation and storage of OpenAI embeddings for all content types
(store listings, blocks, documentation, library agents) to enable semantic/hybrid search.
"""
import asyncio
import logging
import time
from typing import Any
import prisma
from prisma.enums import ContentType
from tiktoken import encoding_for_model
from backend.data.db import execute_raw_with_schema, query_raw_with_schema
from backend.util.clients import get_openai_client
from backend.util.json import dumps
logger = logging.getLogger(__name__)
# OpenAI embedding model configuration
EMBEDDING_MODEL = "text-embedding-3-small"
# OpenAI embedding token limit (8,191 with 1 token buffer for safety)
EMBEDDING_MAX_TOKENS = 8191
def build_searchable_text(
name: str,
description: str,
sub_heading: str,
categories: list[str],
) -> str:
"""
Build searchable text from listing version fields.
Combines relevant fields into a single string for embedding.
"""
parts = []
# Name is important - include it
if name:
parts.append(name)
# Sub-heading provides context
if sub_heading:
parts.append(sub_heading)
# Description is the main content
if description:
parts.append(description)
# Categories help with semantic matching
if categories:
parts.append(" ".join(categories))
return " ".join(parts)
async def generate_embedding(text: str) -> list[float] | None:
"""
Generate embedding for text using OpenAI API.
Returns None if embedding generation fails.
Fail-fast: no retries to maintain consistency with approval flow.
"""
try:
client = get_openai_client()
if not client:
logger.error("openai_internal_api_key not set, cannot generate embedding")
return None
# Truncate text to token limit using tiktoken
# Character-based truncation is insufficient because token ratios vary by content type
enc = encoding_for_model(EMBEDDING_MODEL)
tokens = enc.encode(text)
if len(tokens) > EMBEDDING_MAX_TOKENS:
tokens = tokens[:EMBEDDING_MAX_TOKENS]
truncated_text = enc.decode(tokens)
logger.info(
f"Truncated text from {len(enc.encode(text))} to {len(tokens)} tokens"
)
else:
truncated_text = text
start_time = time.time()
response = await client.embeddings.create(
model=EMBEDDING_MODEL,
input=truncated_text,
)
latency_ms = (time.time() - start_time) * 1000
embedding = response.data[0].embedding
logger.info(
f"Generated embedding: {len(embedding)} dims, "
f"{len(tokens)} tokens, {latency_ms:.0f}ms"
)
return embedding
except Exception as e:
logger.error(f"Failed to generate embedding: {e}")
return None
async def store_embedding(
version_id: str,
embedding: list[float],
tx: prisma.Prisma | None = None,
) -> bool:
"""
Store embedding in the database.
BACKWARD COMPATIBILITY: Maintained for existing store listing usage.
DEPRECATED: Use ensure_embedding() instead (includes searchable_text).
"""
return await store_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id=version_id,
embedding=embedding,
searchable_text="", # Empty for backward compat; ensure_embedding() populates this
metadata=None,
user_id=None, # Store agents are public
tx=tx,
)
async def store_content_embedding(
content_type: ContentType,
content_id: str,
embedding: list[float],
searchable_text: str,
metadata: dict | None = None,
user_id: str | None = None,
tx: prisma.Prisma | None = None,
) -> bool:
"""
Store embedding in the unified content embeddings table.
New function for unified content embedding storage.
Uses raw SQL since Prisma doesn't natively support pgvector.
"""
try:
client = tx if tx else prisma.get_client()
# Convert embedding to PostgreSQL vector format
embedding_str = embedding_to_vector_string(embedding)
metadata_json = dumps(metadata or {})
# Upsert the embedding
# WHERE clause in DO UPDATE prevents PostgreSQL 15 bug with NULLS NOT DISTINCT
await execute_raw_with_schema(
"""
INSERT INTO {schema_prefix}"UnifiedContentEmbedding" (
"id", "contentType", "contentId", "userId", "embedding", "searchableText", "metadata", "createdAt", "updatedAt"
)
VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::vector, $5, $6::jsonb, NOW(), NOW())
ON CONFLICT ("contentType", "contentId", "userId")
DO UPDATE SET
"embedding" = $4::vector,
"searchableText" = $5,
"metadata" = $6::jsonb,
"updatedAt" = NOW()
WHERE {schema_prefix}"UnifiedContentEmbedding"."contentType" = $1::{schema_prefix}"ContentType"
AND {schema_prefix}"UnifiedContentEmbedding"."contentId" = $2
AND ({schema_prefix}"UnifiedContentEmbedding"."userId" = $3 OR ($3 IS NULL AND {schema_prefix}"UnifiedContentEmbedding"."userId" IS NULL))
""",
content_type,
content_id,
user_id,
embedding_str,
searchable_text,
metadata_json,
client=client,
set_public_search_path=True,
)
logger.info(f"Stored embedding for {content_type}:{content_id}")
return True
except Exception as e:
logger.error(f"Failed to store embedding for {content_type}:{content_id}: {e}")
return False
async def get_embedding(version_id: str) -> dict[str, Any] | None:
"""
Retrieve embedding record for a listing version.
BACKWARD COMPATIBILITY: Maintained for existing store listing usage.
Returns dict with storeListingVersionId, embedding, timestamps or None if not found.
"""
result = await get_content_embedding(
ContentType.STORE_AGENT, version_id, user_id=None
)
if result:
# Transform to old format for backward compatibility
return {
"storeListingVersionId": result["contentId"],
"embedding": result["embedding"],
"createdAt": result["createdAt"],
"updatedAt": result["updatedAt"],
}
return None
async def get_content_embedding(
content_type: ContentType, content_id: str, user_id: str | None = None
) -> dict[str, Any] | None:
"""
Retrieve embedding record for any content type.
New function for unified content embedding retrieval.
Returns dict with contentType, contentId, embedding, timestamps or None if not found.
"""
try:
result = await query_raw_with_schema(
"""
SELECT
"contentType",
"contentId",
"userId",
"embedding"::text as "embedding",
"searchableText",
"metadata",
"createdAt",
"updatedAt"
FROM {schema_prefix}"UnifiedContentEmbedding"
WHERE "contentType" = $1::{schema_prefix}"ContentType" AND "contentId" = $2 AND ("userId" = $3 OR ($3 IS NULL AND "userId" IS NULL))
""",
content_type,
content_id,
user_id,
set_public_search_path=True,
)
if result and len(result) > 0:
return result[0]
return None
except Exception as e:
logger.error(f"Failed to get embedding for {content_type}:{content_id}: {e}")
return None
async def ensure_embedding(
version_id: str,
name: str,
description: str,
sub_heading: str,
categories: list[str],
force: bool = False,
tx: prisma.Prisma | None = None,
) -> bool:
"""
Ensure an embedding exists for the listing version.
Creates embedding if missing. Use force=True to regenerate.
Backward-compatible wrapper for store listings.
Args:
version_id: The StoreListingVersion ID
name: Agent name
description: Agent description
sub_heading: Agent sub-heading
categories: Agent categories
force: Force regeneration even if embedding exists
tx: Optional transaction client
Returns:
True if embedding exists/was created, False on failure
"""
try:
# Check if embedding already exists
if not force:
existing = await get_embedding(version_id)
if existing and existing.get("embedding"):
logger.debug(f"Embedding for version {version_id} already exists")
return True
# Build searchable text for embedding
searchable_text = build_searchable_text(
name, description, sub_heading, categories
)
# Generate new embedding
embedding = await generate_embedding(searchable_text)
if embedding is None:
logger.warning(f"Could not generate embedding for version {version_id}")
return False
# Store the embedding with metadata using new function
metadata = {
"name": name,
"subHeading": sub_heading,
"categories": categories,
}
return await store_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id=version_id,
embedding=embedding,
searchable_text=searchable_text,
metadata=metadata,
user_id=None, # Store agents are public
tx=tx,
)
except Exception as e:
logger.error(f"Failed to ensure embedding for version {version_id}: {e}")
return False
async def delete_embedding(version_id: str) -> bool:
"""
Delete embedding for a listing version.
BACKWARD COMPATIBILITY: Maintained for existing store listing usage.
Note: This is usually handled automatically by CASCADE delete,
but provided for manual cleanup if needed.
"""
return await delete_content_embedding(ContentType.STORE_AGENT, version_id)
async def delete_content_embedding(
content_type: ContentType, content_id: str, user_id: str | None = None
) -> bool:
"""
Delete embedding for any content type.
New function for unified content embedding deletion.
Note: This is usually handled automatically by CASCADE delete,
but provided for manual cleanup if needed.
Args:
content_type: The type of content (STORE_AGENT, LIBRARY_AGENT, etc.)
content_id: The unique identifier for the content
user_id: Optional user ID. For public content (STORE_AGENT, BLOCK), pass None.
For user-scoped content (LIBRARY_AGENT), pass the user's ID to avoid
deleting embeddings belonging to other users.
Returns:
True if deletion succeeded, False otherwise
"""
try:
client = prisma.get_client()
await execute_raw_with_schema(
"""
DELETE FROM {schema_prefix}"UnifiedContentEmbedding"
WHERE "contentType" = $1::{schema_prefix}"ContentType"
AND "contentId" = $2
AND ("userId" = $3 OR ($3 IS NULL AND "userId" IS NULL))
""",
content_type,
content_id,
user_id,
client=client,
)
user_str = f" (user: {user_id})" if user_id else ""
logger.info(f"Deleted embedding for {content_type}:{content_id}{user_str}")
return True
except Exception as e:
logger.error(f"Failed to delete embedding for {content_type}:{content_id}: {e}")
return False
async def get_embedding_stats() -> dict[str, Any]:
"""
Get statistics about embedding coverage.
Returns counts of:
- Total approved listing versions
- Versions with embeddings
- Versions without embeddings
"""
try:
# Count approved versions
approved_result = await query_raw_with_schema(
"""
SELECT COUNT(*) as count
FROM {schema_prefix}"StoreListingVersion"
WHERE "submissionStatus" = 'APPROVED'
AND "isDeleted" = false
"""
)
total_approved = approved_result[0]["count"] if approved_result else 0
# Count versions with embeddings
embedded_result = await query_raw_with_schema(
"""
SELECT COUNT(*) as count
FROM {schema_prefix}"StoreListingVersion" slv
JOIN {schema_prefix}"UnifiedContentEmbedding" uce ON slv.id = uce."contentId" AND uce."contentType" = 'STORE_AGENT'::{schema_prefix}"ContentType"
WHERE slv."submissionStatus" = 'APPROVED'
AND slv."isDeleted" = false
"""
)
with_embeddings = embedded_result[0]["count"] if embedded_result else 0
return {
"total_approved": total_approved,
"with_embeddings": with_embeddings,
"without_embeddings": total_approved - with_embeddings,
"coverage_percent": (
round(with_embeddings / total_approved * 100, 1)
if total_approved > 0
else 0
),
}
except Exception as e:
logger.error(f"Failed to get embedding stats: {e}")
return {
"total_approved": 0,
"with_embeddings": 0,
"without_embeddings": 0,
"coverage_percent": 0,
"error": str(e),
}
async def backfill_missing_embeddings(batch_size: int = 10) -> dict[str, Any]:
"""
Generate embeddings for approved listings that don't have them.
Args:
batch_size: Number of embeddings to generate in one call
Returns:
Dict with success/failure counts
"""
try:
# Find approved versions without embeddings
missing = await query_raw_with_schema(
"""
SELECT
slv.id,
slv.name,
slv.description,
slv."subHeading",
slv.categories
FROM {schema_prefix}"StoreListingVersion" slv
LEFT JOIN {schema_prefix}"UnifiedContentEmbedding" uce
ON slv.id = uce."contentId" AND uce."contentType" = 'STORE_AGENT'::{schema_prefix}"ContentType"
WHERE slv."submissionStatus" = 'APPROVED'
AND slv."isDeleted" = false
AND uce."contentId" IS NULL
LIMIT $1
""",
batch_size,
)
if not missing:
return {
"processed": 0,
"success": 0,
"failed": 0,
"message": "No missing embeddings",
}
# Process embeddings concurrently for better performance
embedding_tasks = [
ensure_embedding(
version_id=row["id"],
name=row["name"],
description=row["description"],
sub_heading=row["subHeading"],
categories=row["categories"] or [],
)
for row in missing
]
results = await asyncio.gather(*embedding_tasks, return_exceptions=True)
success = sum(1 for result in results if result is True)
failed = len(results) - success
return {
"processed": len(missing),
"success": success,
"failed": failed,
"message": f"Backfilled {success} embeddings, {failed} failed",
}
except Exception as e:
logger.error(f"Failed to backfill embeddings: {e}")
return {
"processed": 0,
"success": 0,
"failed": 0,
"error": str(e),
}
async def embed_query(query: str) -> list[float] | None:
"""
Generate embedding for a search query.
Same as generate_embedding but with clearer intent.
"""
return await generate_embedding(query)
def embedding_to_vector_string(embedding: list[float]) -> str:
"""Convert embedding list to PostgreSQL vector string format."""
return "[" + ",".join(str(x) for x in embedding) + "]"
async def ensure_content_embedding(
content_type: ContentType,
content_id: str,
searchable_text: str,
metadata: dict | None = None,
user_id: str | None = None,
force: bool = False,
tx: prisma.Prisma | None = None,
) -> bool:
"""
Ensure an embedding exists for any content type.
Generic function for creating embeddings for store agents, blocks, docs, etc.
Args:
content_type: ContentType enum value (STORE_AGENT, BLOCK, etc.)
content_id: Unique identifier for the content
searchable_text: Combined text for embedding generation
metadata: Optional metadata to store with embedding
force: Force regeneration even if embedding exists
tx: Optional transaction client
Returns:
True if embedding exists/was created, False on failure
"""
try:
# Check if embedding already exists
if not force:
existing = await get_content_embedding(content_type, content_id, user_id)
if existing and existing.get("embedding"):
logger.debug(
f"Embedding for {content_type}:{content_id} already exists"
)
return True
# Generate new embedding
embedding = await generate_embedding(searchable_text)
if embedding is None:
logger.warning(
f"Could not generate embedding for {content_type}:{content_id}"
)
return False
# Store the embedding
return await store_content_embedding(
content_type=content_type,
content_id=content_id,
embedding=embedding,
searchable_text=searchable_text,
metadata=metadata or {},
user_id=user_id,
tx=tx,
)
except Exception as e:
logger.error(f"Failed to ensure embedding for {content_type}:{content_id}: {e}")
return False

View File

@@ -0,0 +1,329 @@
"""
Integration tests for embeddings with schema handling.
These tests verify that embeddings operations work correctly across different database schemas.
"""
from unittest.mock import AsyncMock, patch
import pytest
from prisma.enums import ContentType
from backend.api.features.store import embeddings
# Schema prefix tests removed - functionality moved to db.raw_with_schema() helper
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_store_content_embedding_with_schema():
"""Test storing embeddings with proper schema handling."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "platform"
with patch("prisma.get_client") as mock_get_client:
mock_client = AsyncMock()
mock_get_client.return_value = mock_client
result = await embeddings.store_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id="test-id",
embedding=[0.1] * 1536,
searchable_text="test text",
metadata={"test": "data"},
user_id=None,
)
# Verify the query was called
assert mock_client.execute_raw.called
# Get the SQL query that was executed
call_args = mock_client.execute_raw.call_args
sql_query = call_args[0][0]
# Verify schema prefix is in the query
assert '"platform"."UnifiedContentEmbedding"' in sql_query
# Verify result
assert result is True
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_get_content_embedding_with_schema():
"""Test retrieving embeddings with proper schema handling."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "platform"
with patch("prisma.get_client") as mock_get_client:
mock_client = AsyncMock()
mock_client.query_raw.return_value = [
{
"contentType": "STORE_AGENT",
"contentId": "test-id",
"userId": None,
"embedding": "[0.1, 0.2]",
"searchableText": "test",
"metadata": {},
"createdAt": "2024-01-01",
"updatedAt": "2024-01-01",
}
]
mock_get_client.return_value = mock_client
result = await embeddings.get_content_embedding(
ContentType.STORE_AGENT,
"test-id",
user_id=None,
)
# Verify the query was called
assert mock_client.query_raw.called
# Get the SQL query that was executed
call_args = mock_client.query_raw.call_args
sql_query = call_args[0][0]
# Verify schema prefix is in the query
assert '"platform"."UnifiedContentEmbedding"' in sql_query
# Verify result
assert result is not None
assert result["contentId"] == "test-id"
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_delete_content_embedding_with_schema():
"""Test deleting embeddings with proper schema handling."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "platform"
with patch("prisma.get_client") as mock_get_client:
mock_client = AsyncMock()
mock_get_client.return_value = mock_client
result = await embeddings.delete_content_embedding(
ContentType.STORE_AGENT,
"test-id",
)
# Verify the query was called
assert mock_client.execute_raw.called
# Get the SQL query that was executed
call_args = mock_client.execute_raw.call_args
sql_query = call_args[0][0]
# Verify schema prefix is in the query
assert '"platform"."UnifiedContentEmbedding"' in sql_query
# Verify result
assert result is True
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_get_embedding_stats_with_schema():
"""Test embedding statistics with proper schema handling."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "platform"
with patch("prisma.get_client") as mock_get_client:
mock_client = AsyncMock()
# Mock both query results
mock_client.query_raw.side_effect = [
[{"count": 100}], # total_approved
[{"count": 80}], # with_embeddings
]
mock_get_client.return_value = mock_client
result = await embeddings.get_embedding_stats()
# Verify both queries were called
assert mock_client.query_raw.call_count == 2
# Get both SQL queries
first_call = mock_client.query_raw.call_args_list[0]
second_call = mock_client.query_raw.call_args_list[1]
first_sql = first_call[0][0]
second_sql = second_call[0][0]
# Verify schema prefix in both queries
assert '"platform"."StoreListingVersion"' in first_sql
assert '"platform"."StoreListingVersion"' in second_sql
assert '"platform"."UnifiedContentEmbedding"' in second_sql
# Verify results
assert result["total_approved"] == 100
assert result["with_embeddings"] == 80
assert result["without_embeddings"] == 20
assert result["coverage_percent"] == 80.0
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_backfill_missing_embeddings_with_schema():
"""Test backfilling embeddings with proper schema handling."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "platform"
with patch("prisma.get_client") as mock_get_client:
mock_client = AsyncMock()
# Mock missing embeddings query
mock_client.query_raw.return_value = [
{
"id": "version-1",
"name": "Test Agent",
"description": "Test description",
"subHeading": "Test heading",
"categories": ["test"],
}
]
mock_get_client.return_value = mock_client
with patch(
"backend.api.features.store.embeddings.ensure_embedding"
) as mock_ensure:
mock_ensure.return_value = True
result = await embeddings.backfill_missing_embeddings(batch_size=10)
# Verify the query was called
assert mock_client.query_raw.called
# Get the SQL query
call_args = mock_client.query_raw.call_args
sql_query = call_args[0][0]
# Verify schema prefix in query
assert '"platform"."StoreListingVersion"' in sql_query
assert '"platform"."UnifiedContentEmbedding"' in sql_query
# Verify ensure_embedding was called
assert mock_ensure.called
# Verify results
assert result["processed"] == 1
assert result["success"] == 1
assert result["failed"] == 0
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_ensure_content_embedding_with_schema():
"""Test ensuring embeddings exist with proper schema handling."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "platform"
with patch(
"backend.api.features.store.embeddings.get_content_embedding"
) as mock_get:
# Simulate no existing embedding
mock_get.return_value = None
with patch(
"backend.api.features.store.embeddings.generate_embedding"
) as mock_generate:
mock_generate.return_value = [0.1] * 1536
with patch(
"backend.api.features.store.embeddings.store_content_embedding"
) as mock_store:
mock_store.return_value = True
result = await embeddings.ensure_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id="test-id",
searchable_text="test text",
metadata={"test": "data"},
user_id=None,
force=False,
)
# Verify the flow
assert mock_get.called
assert mock_generate.called
assert mock_store.called
assert result is True
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_backward_compatibility_store_embedding():
"""Test backward compatibility wrapper for store_embedding."""
with patch(
"backend.api.features.store.embeddings.store_content_embedding"
) as mock_store:
mock_store.return_value = True
result = await embeddings.store_embedding(
version_id="test-version-id",
embedding=[0.1] * 1536,
tx=None,
)
# Verify it calls the new function with correct parameters
assert mock_store.called
call_args = mock_store.call_args
assert call_args[1]["content_type"] == ContentType.STORE_AGENT
assert call_args[1]["content_id"] == "test-version-id"
assert call_args[1]["user_id"] is None
assert result is True
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_backward_compatibility_get_embedding():
"""Test backward compatibility wrapper for get_embedding."""
with patch(
"backend.api.features.store.embeddings.get_content_embedding"
) as mock_get:
mock_get.return_value = {
"contentType": "STORE_AGENT",
"contentId": "test-version-id",
"embedding": "[0.1, 0.2]",
"createdAt": "2024-01-01",
"updatedAt": "2024-01-01",
}
result = await embeddings.get_embedding("test-version-id")
# Verify it calls the new function
assert mock_get.called
# Verify it transforms to old format
assert result is not None
assert result["storeListingVersionId"] == "test-version-id"
assert "embedding" in result
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_schema_handling_error_cases():
"""Test error handling in schema-aware operations."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "platform"
with patch("prisma.get_client") as mock_get_client:
mock_client = AsyncMock()
mock_client.execute_raw.side_effect = Exception("Database error")
mock_get_client.return_value = mock_client
result = await embeddings.store_content_embedding(
content_type=ContentType.STORE_AGENT,
content_id="test-id",
embedding=[0.1] * 1536,
searchable_text="test",
metadata=None,
user_id=None,
)
# Should return False on error, not raise
assert result is False
if __name__ == "__main__":
pytest.main([__file__, "-v", "-s"])

View File

@@ -0,0 +1,387 @@
from unittest.mock import AsyncMock, MagicMock, patch
import prisma
import pytest
from prisma import Prisma
from prisma.enums import ContentType
from backend.api.features.store import embeddings
@pytest.fixture(autouse=True)
async def setup_prisma():
"""Setup Prisma client for tests."""
try:
Prisma()
except prisma.errors.ClientAlreadyRegisteredError:
pass
yield
@pytest.mark.asyncio(loop_scope="session")
async def test_build_searchable_text():
"""Test searchable text building from listing fields."""
result = embeddings.build_searchable_text(
name="AI Assistant",
description="A helpful AI assistant for productivity",
sub_heading="Boost your productivity",
categories=["AI", "Productivity"],
)
expected = "AI Assistant Boost your productivity A helpful AI assistant for productivity AI Productivity"
assert result == expected
@pytest.mark.asyncio(loop_scope="session")
async def test_build_searchable_text_empty_fields():
"""Test searchable text building with empty fields."""
result = embeddings.build_searchable_text(
name="", description="Test description", sub_heading="", categories=[]
)
assert result == "Test description"
@pytest.mark.asyncio(loop_scope="session")
async def test_generate_embedding_success():
"""Test successful embedding generation."""
# Mock OpenAI response
mock_client = MagicMock()
mock_response = MagicMock()
mock_response.data = [MagicMock()]
mock_response.data[0].embedding = [0.1, 0.2, 0.3] * 512 # 1536 dimensions
# Use AsyncMock for async embeddings.create method
mock_client.embeddings.create = AsyncMock(return_value=mock_response)
# Patch at the point of use in embeddings.py
with patch(
"backend.api.features.store.embeddings.get_openai_client"
) as mock_get_client:
mock_get_client.return_value = mock_client
result = await embeddings.generate_embedding("test text")
assert result is not None
assert len(result) == 1536
assert result[0] == 0.1
mock_client.embeddings.create.assert_called_once_with(
model="text-embedding-3-small", input="test text"
)
@pytest.mark.asyncio(loop_scope="session")
async def test_generate_embedding_no_api_key():
"""Test embedding generation without API key."""
# Patch at the point of use in embeddings.py
with patch(
"backend.api.features.store.embeddings.get_openai_client"
) as mock_get_client:
mock_get_client.return_value = None
result = await embeddings.generate_embedding("test text")
assert result is None
@pytest.mark.asyncio(loop_scope="session")
async def test_generate_embedding_api_error():
"""Test embedding generation with API error."""
mock_client = MagicMock()
mock_client.embeddings.create = AsyncMock(side_effect=Exception("API Error"))
# Patch at the point of use in embeddings.py
with patch(
"backend.api.features.store.embeddings.get_openai_client"
) as mock_get_client:
mock_get_client.return_value = mock_client
result = await embeddings.generate_embedding("test text")
assert result is None
@pytest.mark.asyncio(loop_scope="session")
async def test_generate_embedding_text_truncation():
"""Test that long text is properly truncated using tiktoken."""
from tiktoken import encoding_for_model
mock_client = MagicMock()
mock_response = MagicMock()
mock_response.data = [MagicMock()]
mock_response.data[0].embedding = [0.1] * 1536
# Use AsyncMock for async embeddings.create method
mock_client.embeddings.create = AsyncMock(return_value=mock_response)
# Patch at the point of use in embeddings.py
with patch(
"backend.api.features.store.embeddings.get_openai_client"
) as mock_get_client:
mock_get_client.return_value = mock_client
# Create text that will exceed 8191 tokens
# Use varied characters to ensure token-heavy text: each word is ~1 token
words = [f"word{i}" for i in range(10000)]
long_text = " ".join(words) # ~10000 tokens
await embeddings.generate_embedding(long_text)
# Verify text was truncated to 8191 tokens
call_args = mock_client.embeddings.create.call_args
truncated_text = call_args.kwargs["input"]
# Count actual tokens in truncated text
enc = encoding_for_model("text-embedding-3-small")
actual_tokens = len(enc.encode(truncated_text))
# Should be at or just under 8191 tokens
assert actual_tokens <= 8191
# Should be close to the limit (not over-truncated)
assert actual_tokens >= 8100
@pytest.mark.asyncio(loop_scope="session")
async def test_store_embedding_success(mocker):
"""Test successful embedding storage."""
mock_client = mocker.AsyncMock()
mock_client.execute_raw = mocker.AsyncMock()
embedding = [0.1, 0.2, 0.3]
result = await embeddings.store_embedding(
version_id="test-version-id", embedding=embedding, tx=mock_client
)
assert result is True
# execute_raw is called twice: once for SET search_path, once for INSERT
assert mock_client.execute_raw.call_count == 2
# First call: SET search_path
first_call_args = mock_client.execute_raw.call_args_list[0][0]
assert "SET search_path" in first_call_args[0]
# Second call: INSERT query with the actual data
second_call_args = mock_client.execute_raw.call_args_list[1][0]
assert "test-version-id" in second_call_args
assert "[0.1,0.2,0.3]" in second_call_args
assert None in second_call_args # userId should be None for store agents
@pytest.mark.asyncio(loop_scope="session")
async def test_store_embedding_database_error(mocker):
"""Test embedding storage with database error."""
mock_client = mocker.AsyncMock()
mock_client.execute_raw.side_effect = Exception("Database error")
embedding = [0.1, 0.2, 0.3]
result = await embeddings.store_embedding(
version_id="test-version-id", embedding=embedding, tx=mock_client
)
assert result is False
@pytest.mark.asyncio(loop_scope="session")
async def test_get_embedding_success():
"""Test successful embedding retrieval."""
mock_result = [
{
"contentType": "STORE_AGENT",
"contentId": "test-version-id",
"userId": None,
"embedding": "[0.1,0.2,0.3]",
"searchableText": "Test text",
"metadata": {},
"createdAt": "2024-01-01T00:00:00Z",
"updatedAt": "2024-01-01T00:00:00Z",
}
]
with patch(
"backend.api.features.store.embeddings.query_raw_with_schema",
return_value=mock_result,
):
result = await embeddings.get_embedding("test-version-id")
assert result is not None
assert result["storeListingVersionId"] == "test-version-id"
assert result["embedding"] == "[0.1,0.2,0.3]"
@pytest.mark.asyncio(loop_scope="session")
async def test_get_embedding_not_found():
"""Test embedding retrieval when not found."""
with patch(
"backend.api.features.store.embeddings.query_raw_with_schema",
return_value=[],
):
result = await embeddings.get_embedding("test-version-id")
assert result is None
@pytest.mark.asyncio(loop_scope="session")
@patch("backend.api.features.store.embeddings.generate_embedding")
@patch("backend.api.features.store.embeddings.store_embedding")
@patch("backend.api.features.store.embeddings.get_embedding")
async def test_ensure_embedding_already_exists(mock_get, mock_store, mock_generate):
"""Test ensure_embedding when embedding already exists."""
mock_get.return_value = {"embedding": "[0.1,0.2,0.3]"}
result = await embeddings.ensure_embedding(
version_id="test-id",
name="Test",
description="Test description",
sub_heading="Test heading",
categories=["test"],
)
assert result is True
mock_generate.assert_not_called()
mock_store.assert_not_called()
@pytest.mark.asyncio(loop_scope="session")
@patch("backend.api.features.store.embeddings.generate_embedding")
@patch("backend.api.features.store.embeddings.store_content_embedding")
@patch("backend.api.features.store.embeddings.get_embedding")
async def test_ensure_embedding_create_new(mock_get, mock_store, mock_generate):
"""Test ensure_embedding creating new embedding."""
mock_get.return_value = None
mock_generate.return_value = [0.1, 0.2, 0.3]
mock_store.return_value = True
result = await embeddings.ensure_embedding(
version_id="test-id",
name="Test",
description="Test description",
sub_heading="Test heading",
categories=["test"],
)
assert result is True
mock_generate.assert_called_once_with("Test Test heading Test description test")
mock_store.assert_called_once_with(
content_type=ContentType.STORE_AGENT,
content_id="test-id",
embedding=[0.1, 0.2, 0.3],
searchable_text="Test Test heading Test description test",
metadata={"name": "Test", "subHeading": "Test heading", "categories": ["test"]},
user_id=None,
tx=None,
)
@pytest.mark.asyncio(loop_scope="session")
@patch("backend.api.features.store.embeddings.generate_embedding")
@patch("backend.api.features.store.embeddings.get_embedding")
async def test_ensure_embedding_generation_fails(mock_get, mock_generate):
"""Test ensure_embedding when generation fails."""
mock_get.return_value = None
mock_generate.return_value = None
result = await embeddings.ensure_embedding(
version_id="test-id",
name="Test",
description="Test description",
sub_heading="Test heading",
categories=["test"],
)
assert result is False
@pytest.mark.asyncio(loop_scope="session")
async def test_get_embedding_stats():
"""Test embedding statistics retrieval."""
# Mock approved count query and embedded count query
mock_approved_result = [{"count": 100}]
mock_embedded_result = [{"count": 75}]
with patch(
"backend.api.features.store.embeddings.query_raw_with_schema",
side_effect=[mock_approved_result, mock_embedded_result],
):
result = await embeddings.get_embedding_stats()
assert result["total_approved"] == 100
assert result["with_embeddings"] == 75
assert result["without_embeddings"] == 25
assert result["coverage_percent"] == 75.0
@pytest.mark.asyncio(loop_scope="session")
@patch("backend.api.features.store.embeddings.ensure_embedding")
async def test_backfill_missing_embeddings_success(mock_ensure):
"""Test backfill with successful embedding generation."""
# Mock missing embeddings query
mock_missing = [
{
"id": "version-1",
"name": "Agent 1",
"description": "Description 1",
"subHeading": "Heading 1",
"categories": ["AI"],
},
{
"id": "version-2",
"name": "Agent 2",
"description": "Description 2",
"subHeading": "Heading 2",
"categories": ["Productivity"],
},
]
# Mock ensure_embedding to succeed for first, fail for second
mock_ensure.side_effect = [True, False]
with patch(
"backend.api.features.store.embeddings.query_raw_with_schema",
return_value=mock_missing,
):
result = await embeddings.backfill_missing_embeddings(batch_size=5)
assert result["processed"] == 2
assert result["success"] == 1
assert result["failed"] == 1
assert mock_ensure.call_count == 2
@pytest.mark.asyncio(loop_scope="session")
async def test_backfill_missing_embeddings_no_missing():
"""Test backfill when no embeddings are missing."""
with patch(
"backend.api.features.store.embeddings.query_raw_with_schema",
return_value=[],
):
result = await embeddings.backfill_missing_embeddings(batch_size=5)
assert result["processed"] == 0
assert result["success"] == 0
assert result["failed"] == 0
assert result["message"] == "No missing embeddings"
@pytest.mark.asyncio(loop_scope="session")
async def test_embedding_to_vector_string():
"""Test embedding to PostgreSQL vector string conversion."""
embedding = [0.1, 0.2, 0.3, -0.4]
result = embeddings.embedding_to_vector_string(embedding)
assert result == "[0.1,0.2,0.3,-0.4]"
@pytest.mark.asyncio(loop_scope="session")
async def test_embed_query():
"""Test embed_query function (alias for generate_embedding)."""
with patch(
"backend.api.features.store.embeddings.generate_embedding"
) as mock_generate:
mock_generate.return_value = [0.1, 0.2, 0.3]
result = await embeddings.embed_query("test query")
assert result == [0.1, 0.2, 0.3]
mock_generate.assert_called_once_with("test query")

View File

@@ -0,0 +1,393 @@
"""
Hybrid Search for Store Agents
Combines semantic (embedding) search with lexical (tsvector) search
for improved relevance in marketplace agent discovery.
"""
import logging
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Literal
from backend.api.features.store.embeddings import (
embed_query,
embedding_to_vector_string,
)
from backend.data.db import query_raw_with_schema
logger = logging.getLogger(__name__)
@dataclass
class HybridSearchWeights:
"""Weights for combining search signals."""
semantic: float = 0.30 # Embedding cosine similarity
lexical: float = 0.30 # tsvector ts_rank_cd score
category: float = 0.20 # Category match boost
recency: float = 0.10 # Newer agents ranked higher
popularity: float = 0.10 # Agent usage/runs (PageRank-like)
def __post_init__(self):
"""Validate weights are non-negative and sum to approximately 1.0."""
total = (
self.semantic
+ self.lexical
+ self.category
+ self.recency
+ self.popularity
)
if any(
w < 0
for w in [
self.semantic,
self.lexical,
self.category,
self.recency,
self.popularity,
]
):
raise ValueError("All weights must be non-negative")
if not (0.99 <= total <= 1.01):
raise ValueError(f"Weights must sum to ~1.0, got {total:.3f}")
DEFAULT_WEIGHTS = HybridSearchWeights()
# Minimum relevance score threshold - agents below this are filtered out
# With weights (0.30 semantic + 0.30 lexical + 0.20 category + 0.10 recency + 0.10 popularity):
# - 0.20 means at least ~60% semantic match OR strong lexical match required
# - Ensures only genuinely relevant results are returned
# - Recency/popularity alone (0.10 each) won't pass the threshold
DEFAULT_MIN_SCORE = 0.20
@dataclass
class HybridSearchResult:
"""A single search result with score breakdown."""
slug: str
agent_name: str
agent_image: str
creator_username: str
creator_avatar: str
sub_heading: str
description: str
runs: int
rating: float
categories: list[str]
featured: bool
is_available: bool
updated_at: datetime
# Score breakdown (for debugging/tuning)
combined_score: float
semantic_score: float = 0.0
lexical_score: float = 0.0
category_score: float = 0.0
recency_score: float = 0.0
popularity_score: float = 0.0
async def hybrid_search(
query: str,
featured: bool = False,
creators: list[str] | None = None,
category: str | None = None,
sorted_by: (
Literal["relevance", "rating", "runs", "name", "updated_at"] | None
) = None,
page: int = 1,
page_size: int = 20,
weights: HybridSearchWeights | None = None,
min_score: float | None = None,
) -> tuple[list[dict[str, Any]], int]:
"""
Perform hybrid search combining semantic and lexical signals.
Args:
query: Search query string
featured: Filter for featured agents only
creators: Filter by creator usernames
category: Filter by category
sorted_by: Sort order (relevance uses hybrid scoring)
page: Page number (1-indexed)
page_size: Results per page
weights: Custom weights for search signals
min_score: Minimum relevance score threshold (0-1). Results below
this score are filtered out. Defaults to DEFAULT_MIN_SCORE.
Returns:
Tuple of (results list, total count). Returns empty list if no
results meet the minimum relevance threshold.
"""
# Validate inputs
query = query.strip()
if not query:
return [], 0 # Empty query returns no results
if page < 1:
page = 1
if page_size < 1:
page_size = 1
if page_size > 100: # Cap at reasonable limit to prevent performance issues
page_size = 100
if weights is None:
weights = DEFAULT_WEIGHTS
if min_score is None:
min_score = DEFAULT_MIN_SCORE
offset = (page - 1) * page_size
# Generate query embedding
query_embedding = await embed_query(query)
# Build WHERE clause conditions
where_parts: list[str] = ["sa.is_available = true"]
params: list[Any] = []
param_index = 1
# Add search query for lexical matching
params.append(query)
query_param = f"${param_index}"
param_index += 1
# Add lowercased query for category matching
params.append(query.lower())
query_lower_param = f"${param_index}"
param_index += 1
if featured:
where_parts.append("sa.featured = true")
if creators:
where_parts.append(f"sa.creator_username = ANY(${param_index})")
params.append(creators)
param_index += 1
if category:
where_parts.append(f"${param_index} = ANY(sa.categories)")
params.append(category)
param_index += 1
# Safe: where_parts only contains hardcoded strings with $N parameter placeholders
# No user input is concatenated directly into the SQL string
where_clause = " AND ".join(where_parts)
# Embedding is required for hybrid search - fail fast if unavailable
if query_embedding is None or not query_embedding:
# Log detailed error server-side
logger.error(
"Failed to generate query embedding. "
"Check that openai_internal_api_key is configured and OpenAI API is accessible."
)
# Raise generic error to client
raise ValueError("Search service temporarily unavailable")
# Add embedding parameter
embedding_str = embedding_to_vector_string(query_embedding)
params.append(embedding_str)
embedding_param = f"${param_index}"
param_index += 1
# Add weight parameters for SQL calculation
params.append(weights.semantic)
weight_semantic_param = f"${param_index}"
param_index += 1
params.append(weights.lexical)
weight_lexical_param = f"${param_index}"
param_index += 1
params.append(weights.category)
weight_category_param = f"${param_index}"
param_index += 1
params.append(weights.recency)
weight_recency_param = f"${param_index}"
param_index += 1
params.append(weights.popularity)
weight_popularity_param = f"${param_index}"
param_index += 1
# Add min_score parameter
params.append(min_score)
min_score_param = f"${param_index}"
param_index += 1
# Optimized hybrid search query:
# 1. Direct join to UnifiedContentEmbedding via contentId=storeListingVersionId (no redundant JOINs)
# 2. UNION approach (deduplicates agents matching both branches)
# 3. COUNT(*) OVER() to get total count in single query
# 4. Optimized category matching with EXISTS + unnest
# 5. Pre-calculated max values for lexical and popularity normalization
# 6. Simplified recency calculation with linear decay
# 7. Logarithmic popularity scaling to prevent viral agents from dominating
sql_query = f"""
WITH candidates AS (
-- Lexical matches (uses GIN index on search column)
SELECT sa."storeListingVersionId"
FROM {{schema_prefix}}"StoreAgent" sa
WHERE {where_clause}
AND sa.search @@ plainto_tsquery('english', {query_param})
UNION
-- Semantic matches (uses HNSW index on embedding with KNN)
SELECT "storeListingVersionId"
FROM (
SELECT sa."storeListingVersionId", uce.embedding
FROM {{schema_prefix}}"StoreAgent" sa
INNER JOIN {{schema_prefix}}"UnifiedContentEmbedding" uce
ON sa."storeListingVersionId" = uce."contentId" AND uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
WHERE {where_clause}
ORDER BY uce.embedding <=> {embedding_param}::vector
LIMIT 200
) semantic_results
),
search_scores AS (
SELECT
sa.slug,
sa.agent_name,
sa.agent_image,
sa.creator_username,
sa.creator_avatar,
sa.sub_heading,
sa.description,
sa.runs,
sa.rating,
sa.categories,
sa.featured,
sa.is_available,
sa.updated_at,
-- Semantic score: cosine similarity (1 - distance)
COALESCE(1 - (uce.embedding <=> {embedding_param}::vector), 0) as semantic_score,
-- Lexical score: ts_rank_cd (will be normalized later)
COALESCE(ts_rank_cd(sa.search, plainto_tsquery('english', {query_param})), 0) as lexical_raw,
-- Category match: optimized with unnest for better performance
CASE
WHEN EXISTS (
SELECT 1 FROM unnest(sa.categories) cat
WHERE LOWER(cat) LIKE '%' || {query_lower_param} || '%'
)
THEN 1.0
ELSE 0.0
END as category_score,
-- Recency score: linear decay over 90 days (simpler than exponential)
GREATEST(0, 1 - EXTRACT(EPOCH FROM (NOW() - sa.updated_at)) / (90 * 24 * 3600)) as recency_score,
-- Popularity raw: agent runs count (will be normalized with log scaling)
sa.runs as popularity_raw
FROM candidates c
INNER JOIN {{schema_prefix}}"StoreAgent" sa
ON c."storeListingVersionId" = sa."storeListingVersionId"
LEFT JOIN {{schema_prefix}}"UnifiedContentEmbedding" uce
ON sa."storeListingVersionId" = uce."contentId" AND uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
),
max_lexical AS (
SELECT MAX(lexical_raw) as max_val FROM search_scores
),
max_popularity AS (
SELECT MAX(popularity_raw) as max_val FROM search_scores
),
normalized AS (
SELECT
ss.*,
-- Normalize lexical score by pre-calculated max
CASE
WHEN ml.max_val > 0
THEN ss.lexical_raw / ml.max_val
ELSE 0
END as lexical_score,
-- Normalize popularity with logarithmic scaling to prevent viral agents from dominating
-- LOG(1 + runs) / LOG(1 + max_runs) ensures score is 0-1 range
CASE
WHEN mp.max_val > 0 AND ss.popularity_raw > 0
THEN LN(1 + ss.popularity_raw) / LN(1 + mp.max_val)
ELSE 0
END as popularity_score
FROM search_scores ss
CROSS JOIN max_lexical ml
CROSS JOIN max_popularity mp
),
scored AS (
SELECT
slug,
agent_name,
agent_image,
creator_username,
creator_avatar,
sub_heading,
description,
runs,
rating,
categories,
featured,
is_available,
updated_at,
semantic_score,
lexical_score,
category_score,
recency_score,
popularity_score,
(
{weight_semantic_param} * semantic_score +
{weight_lexical_param} * lexical_score +
{weight_category_param} * category_score +
{weight_recency_param} * recency_score +
{weight_popularity_param} * popularity_score
) as combined_score
FROM normalized
),
filtered AS (
SELECT
*,
COUNT(*) OVER () as total_count
FROM scored
WHERE combined_score >= {min_score_param}
)
SELECT * FROM filtered
ORDER BY combined_score DESC
LIMIT ${param_index} OFFSET ${param_index + 1}
"""
# Add pagination params
params.extend([page_size, offset])
# Execute search query - includes total_count via window function
results = await query_raw_with_schema(
sql_query, *params, set_public_search_path=True
)
# Extract total count from first result (all rows have same count)
total = results[0]["total_count"] if results else 0
# Remove total_count from results before returning
for result in results:
result.pop("total_count", None)
# Log without sensitive query content
logger.info(f"Hybrid search: {len(results)} results, {total} total")
return results, total
async def hybrid_search_simple(
query: str,
page: int = 1,
page_size: int = 20,
) -> tuple[list[dict[str, Any]], int]:
"""
Simplified hybrid search for common use cases.
Uses default weights and no filters.
"""
return await hybrid_search(
query=query,
page=page,
page_size=page_size,
)

View File

@@ -0,0 +1,334 @@
"""
Integration tests for hybrid search with schema handling.
These tests verify that hybrid search works correctly across different database schemas.
"""
from unittest.mock import patch
import pytest
from backend.api.features.store.hybrid_search import HybridSearchWeights, hybrid_search
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_with_schema_handling():
"""Test that hybrid search correctly handles database schema prefixes."""
# Test with a mock query to ensure schema handling works
query = "test agent"
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
# Mock the query result
mock_query.return_value = [
{
"slug": "test/agent",
"agent_name": "Test Agent",
"agent_image": "test.png",
"creator_username": "test",
"creator_avatar": "avatar.png",
"sub_heading": "Test sub-heading",
"description": "Test description",
"runs": 10,
"rating": 4.5,
"categories": ["test"],
"featured": False,
"is_available": True,
"updated_at": "2024-01-01T00:00:00Z",
"combined_score": 0.8,
"semantic_score": 0.7,
"lexical_score": 0.6,
"category_score": 0.5,
"recency_score": 0.4,
"total_count": 1,
}
]
with patch(
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_embed.return_value = [0.1] * 1536 # Mock embedding
results, total = await hybrid_search(
query=query,
page=1,
page_size=20,
)
# Verify the query was called
assert mock_query.called
# Verify the SQL template uses schema_prefix placeholder
call_args = mock_query.call_args
sql_template = call_args[0][0]
assert "{schema_prefix}" in sql_template
# Verify results
assert len(results) == 1
assert total == 1
assert results[0]["slug"] == "test/agent"
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_with_public_schema():
"""Test hybrid search when using public schema (no prefix needed)."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "public"
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
mock_query.return_value = []
with patch(
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_embed.return_value = [0.1] * 1536
results, total = await hybrid_search(
query="test",
page=1,
page_size=20,
)
# Verify the mock was set up correctly
assert mock_schema.return_value == "public"
# Results should work even with empty results
assert results == []
assert total == 0
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_with_custom_schema():
"""Test hybrid search when using custom schema (e.g., 'platform')."""
with patch("backend.data.db.get_database_schema") as mock_schema:
mock_schema.return_value = "platform"
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
mock_query.return_value = []
with patch(
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_embed.return_value = [0.1] * 1536
results, total = await hybrid_search(
query="test",
page=1,
page_size=20,
)
# Verify the mock was set up correctly
assert mock_schema.return_value == "platform"
assert results == []
assert total == 0
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_without_embeddings():
"""Test hybrid search fails fast when embeddings are unavailable."""
# Patch where the function is used, not where it's defined
with patch("backend.api.features.store.hybrid_search.embed_query") as mock_embed:
# Simulate embedding failure
mock_embed.return_value = None
# Should raise ValueError with helpful message
with pytest.raises(ValueError) as exc_info:
await hybrid_search(
query="test",
page=1,
page_size=20,
)
# Verify error message is generic (doesn't leak implementation details)
assert "Search service temporarily unavailable" in str(exc_info.value)
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_with_filters():
"""Test hybrid search with various filters."""
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
mock_query.return_value = []
with patch(
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_embed.return_value = [0.1] * 1536
# Test with featured filter
results, total = await hybrid_search(
query="test",
featured=True,
creators=["user1", "user2"],
category="productivity",
page=1,
page_size=10,
)
# Verify filters were applied in the query
call_args = mock_query.call_args
params = call_args[0][1:] # Skip SQL template
# Should have query, query_lower, creators array, category
assert len(params) >= 4
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_weights():
"""Test hybrid search with custom weights."""
custom_weights = HybridSearchWeights(
semantic=0.5,
lexical=0.3,
category=0.1,
recency=0.1,
popularity=0.0,
)
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
mock_query.return_value = []
with patch(
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_embed.return_value = [0.1] * 1536
results, total = await hybrid_search(
query="test",
weights=custom_weights,
page=1,
page_size=20,
)
# Verify custom weights were used in the query
call_args = mock_query.call_args
sql_template = call_args[0][0]
params = call_args[0][1:] # Get all parameters passed
# Check that SQL uses parameterized weights (not f-string interpolation)
assert "$" in sql_template # Verify parameterization is used
# Check that custom weights are in the params
assert 0.5 in params # semantic weight
assert 0.3 in params # lexical weight
assert 0.1 in params # category and recency weights
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_min_score_filtering():
"""Test hybrid search minimum score threshold."""
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
# Return results with varying scores
mock_query.return_value = [
{
"slug": "high-score/agent",
"agent_name": "High Score Agent",
"combined_score": 0.8,
"total_count": 1,
# ... other fields
}
]
with patch(
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_embed.return_value = [0.1] * 1536
# Test with custom min_score
results, total = await hybrid_search(
query="test",
min_score=0.5, # High threshold
page=1,
page_size=20,
)
# Verify min_score was applied in query
call_args = mock_query.call_args
sql_template = call_args[0][0]
params = call_args[0][1:] # Get all parameters
# Check that SQL uses parameterized min_score
assert "combined_score >=" in sql_template
assert "$" in sql_template # Verify parameterization
# Check that custom min_score is in the params
assert 0.5 in params
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_pagination():
"""Test hybrid search pagination."""
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
mock_query.return_value = []
with patch(
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_embed.return_value = [0.1] * 1536
# Test page 2 with page_size 10
results, total = await hybrid_search(
query="test",
page=2,
page_size=10,
)
# Verify pagination parameters
call_args = mock_query.call_args
params = call_args[0]
# Last two params should be LIMIT and OFFSET
limit = params[-2]
offset = params[-1]
assert limit == 10 # page_size
assert offset == 10 # (page - 1) * page_size = (2 - 1) * 10
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
async def test_hybrid_search_error_handling():
"""Test hybrid search error handling."""
with patch(
"backend.api.features.store.hybrid_search.query_raw_with_schema"
) as mock_query:
# Simulate database error
mock_query.side_effect = Exception("Database connection error")
with patch(
"backend.api.features.store.hybrid_search.embed_query"
) as mock_embed:
mock_embed.return_value = [0.1] * 1536
# Should raise exception
with pytest.raises(Exception) as exc_info:
await hybrid_search(
query="test",
page=1,
page_size=20,
)
assert "Database connection error" in str(exc_info.value)
if __name__ == "__main__":
pytest.main([__file__, "-v", "-s"])

View File

@@ -5,11 +5,12 @@ import uuid
import fastapi
from gcloud.aio import storage as async_storage
import backend.server.v2.store.exceptions
from backend.util.exceptions import MissingConfigError
from backend.util.settings import Settings
from backend.util.virus_scanner import scan_content_safe
from . import exceptions as store_exceptions
logger = logging.getLogger(__name__)
ALLOWED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}
@@ -68,61 +69,55 @@ async def upload_media(
await file.seek(0) # Reset file pointer
except Exception as e:
logger.error(f"Error reading file content: {str(e)}")
raise backend.server.v2.store.exceptions.FileReadError(
"Failed to read file content"
) from e
raise store_exceptions.FileReadError("Failed to read file content") from e
# Validate file signature/magic bytes
if file.content_type in ALLOWED_IMAGE_TYPES:
# Check image file signatures
if content.startswith(b"\xff\xd8\xff"): # JPEG
if file.content_type != "image/jpeg":
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
raise store_exceptions.InvalidFileTypeError(
"File signature does not match content type"
)
elif content.startswith(b"\x89PNG\r\n\x1a\n"): # PNG
if file.content_type != "image/png":
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
raise store_exceptions.InvalidFileTypeError(
"File signature does not match content type"
)
elif content.startswith(b"GIF87a") or content.startswith(b"GIF89a"): # GIF
if file.content_type != "image/gif":
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
raise store_exceptions.InvalidFileTypeError(
"File signature does not match content type"
)
elif content.startswith(b"RIFF") and content[8:12] == b"WEBP": # WebP
if file.content_type != "image/webp":
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
raise store_exceptions.InvalidFileTypeError(
"File signature does not match content type"
)
else:
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
"Invalid image file signature"
)
raise store_exceptions.InvalidFileTypeError("Invalid image file signature")
elif file.content_type in ALLOWED_VIDEO_TYPES:
# Check video file signatures
if content.startswith(b"\x00\x00\x00") and (content[4:8] == b"ftyp"): # MP4
if file.content_type != "video/mp4":
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
raise store_exceptions.InvalidFileTypeError(
"File signature does not match content type"
)
elif content.startswith(b"\x1a\x45\xdf\xa3"): # WebM
if file.content_type != "video/webm":
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
raise store_exceptions.InvalidFileTypeError(
"File signature does not match content type"
)
else:
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
"Invalid video file signature"
)
raise store_exceptions.InvalidFileTypeError("Invalid video file signature")
settings = Settings()
# Check required settings first before doing any file processing
if not settings.config.media_gcs_bucket_name:
logger.error("Missing GCS bucket name setting")
raise backend.server.v2.store.exceptions.StorageConfigError(
raise store_exceptions.StorageConfigError(
"Missing storage bucket configuration"
)
@@ -137,7 +132,7 @@ async def upload_media(
and content_type not in ALLOWED_VIDEO_TYPES
):
logger.warning(f"Invalid file type attempted: {content_type}")
raise backend.server.v2.store.exceptions.InvalidFileTypeError(
raise store_exceptions.InvalidFileTypeError(
f"File type not supported. Must be jpeg, png, gif, webp, mp4 or webm. Content type: {content_type}"
)
@@ -150,16 +145,14 @@ async def upload_media(
file_size += len(chunk)
if file_size > MAX_FILE_SIZE:
logger.warning(f"File size too large: {file_size} bytes")
raise backend.server.v2.store.exceptions.FileSizeTooLargeError(
raise store_exceptions.FileSizeTooLargeError(
"File too large. Maximum size is 50MB"
)
except backend.server.v2.store.exceptions.FileSizeTooLargeError:
except store_exceptions.FileSizeTooLargeError:
raise
except Exception as e:
logger.error(f"Error reading file chunks: {str(e)}")
raise backend.server.v2.store.exceptions.FileReadError(
"Failed to read uploaded file"
) from e
raise store_exceptions.FileReadError("Failed to read uploaded file") from e
# Reset file pointer
await file.seek(0)
@@ -198,14 +191,14 @@ async def upload_media(
except Exception as e:
logger.error(f"GCS storage error: {str(e)}")
raise backend.server.v2.store.exceptions.StorageUploadError(
raise store_exceptions.StorageUploadError(
"Failed to upload file to storage"
) from e
except backend.server.v2.store.exceptions.MediaUploadError:
except store_exceptions.MediaUploadError:
raise
except Exception as e:
logger.exception("Unexpected error in upload_media")
raise backend.server.v2.store.exceptions.MediaUploadError(
raise store_exceptions.MediaUploadError(
"Unexpected error during media upload"
) from e

View File

@@ -6,17 +6,18 @@ import fastapi
import pytest
import starlette.datastructures
import backend.server.v2.store.exceptions
import backend.server.v2.store.media
from backend.util.settings import Settings
from . import exceptions as store_exceptions
from . import media as store_media
@pytest.fixture
def mock_settings(monkeypatch):
settings = Settings()
settings.config.media_gcs_bucket_name = "test-bucket"
settings.config.google_application_credentials = "test-credentials"
monkeypatch.setattr("backend.server.v2.store.media.Settings", lambda: settings)
monkeypatch.setattr("backend.api.features.store.media.Settings", lambda: settings)
return settings
@@ -32,12 +33,13 @@ def mock_storage_client(mocker):
# Mock the constructor to return our mock client
mocker.patch(
"backend.server.v2.store.media.async_storage.Storage", return_value=mock_client
"backend.api.features.store.media.async_storage.Storage",
return_value=mock_client,
)
# Mock virus scanner to avoid actual scanning
mocker.patch(
"backend.server.v2.store.media.scan_content_safe", new_callable=AsyncMock
"backend.api.features.store.media.scan_content_safe", new_callable=AsyncMock
)
return mock_client
@@ -53,7 +55,7 @@ async def test_upload_media_success(mock_settings, mock_storage_client):
headers=starlette.datastructures.Headers({"content-type": "image/jpeg"}),
)
result = await backend.server.v2.store.media.upload_media("test-user", test_file)
result = await store_media.upload_media("test-user", test_file)
assert result.startswith(
"https://storage.googleapis.com/test-bucket/users/test-user/images/"
@@ -69,8 +71,8 @@ async def test_upload_media_invalid_type(mock_settings, mock_storage_client):
headers=starlette.datastructures.Headers({"content-type": "text/plain"}),
)
with pytest.raises(backend.server.v2.store.exceptions.InvalidFileTypeError):
await backend.server.v2.store.media.upload_media("test-user", test_file)
with pytest.raises(store_exceptions.InvalidFileTypeError):
await store_media.upload_media("test-user", test_file)
mock_storage_client.upload.assert_not_called()
@@ -79,7 +81,7 @@ async def test_upload_media_missing_credentials(monkeypatch):
settings = Settings()
settings.config.media_gcs_bucket_name = ""
settings.config.google_application_credentials = ""
monkeypatch.setattr("backend.server.v2.store.media.Settings", lambda: settings)
monkeypatch.setattr("backend.api.features.store.media.Settings", lambda: settings)
test_file = fastapi.UploadFile(
filename="laptop.jpeg",
@@ -87,8 +89,8 @@ async def test_upload_media_missing_credentials(monkeypatch):
headers=starlette.datastructures.Headers({"content-type": "image/jpeg"}),
)
with pytest.raises(backend.server.v2.store.exceptions.StorageConfigError):
await backend.server.v2.store.media.upload_media("test-user", test_file)
with pytest.raises(store_exceptions.StorageConfigError):
await store_media.upload_media("test-user", test_file)
async def test_upload_media_video_type(mock_settings, mock_storage_client):
@@ -98,7 +100,7 @@ async def test_upload_media_video_type(mock_settings, mock_storage_client):
headers=starlette.datastructures.Headers({"content-type": "video/mp4"}),
)
result = await backend.server.v2.store.media.upload_media("test-user", test_file)
result = await store_media.upload_media("test-user", test_file)
assert result.startswith(
"https://storage.googleapis.com/test-bucket/users/test-user/videos/"
@@ -117,8 +119,8 @@ async def test_upload_media_file_too_large(mock_settings, mock_storage_client):
headers=starlette.datastructures.Headers({"content-type": "image/jpeg"}),
)
with pytest.raises(backend.server.v2.store.exceptions.FileSizeTooLargeError):
await backend.server.v2.store.media.upload_media("test-user", test_file)
with pytest.raises(store_exceptions.FileSizeTooLargeError):
await store_media.upload_media("test-user", test_file)
async def test_upload_media_file_read_error(mock_settings, mock_storage_client):
@@ -129,8 +131,8 @@ async def test_upload_media_file_read_error(mock_settings, mock_storage_client):
)
test_file.read = unittest.mock.AsyncMock(side_effect=Exception("Read error"))
with pytest.raises(backend.server.v2.store.exceptions.FileReadError):
await backend.server.v2.store.media.upload_media("test-user", test_file)
with pytest.raises(store_exceptions.FileReadError):
await store_media.upload_media("test-user", test_file)
async def test_upload_media_png_success(mock_settings, mock_storage_client):
@@ -140,7 +142,7 @@ async def test_upload_media_png_success(mock_settings, mock_storage_client):
headers=starlette.datastructures.Headers({"content-type": "image/png"}),
)
result = await backend.server.v2.store.media.upload_media("test-user", test_file)
result = await store_media.upload_media("test-user", test_file)
assert result.startswith(
"https://storage.googleapis.com/test-bucket/users/test-user/images/"
)
@@ -154,7 +156,7 @@ async def test_upload_media_gif_success(mock_settings, mock_storage_client):
headers=starlette.datastructures.Headers({"content-type": "image/gif"}),
)
result = await backend.server.v2.store.media.upload_media("test-user", test_file)
result = await store_media.upload_media("test-user", test_file)
assert result.startswith(
"https://storage.googleapis.com/test-bucket/users/test-user/images/"
)
@@ -168,7 +170,7 @@ async def test_upload_media_webp_success(mock_settings, mock_storage_client):
headers=starlette.datastructures.Headers({"content-type": "image/webp"}),
)
result = await backend.server.v2.store.media.upload_media("test-user", test_file)
result = await store_media.upload_media("test-user", test_file)
assert result.startswith(
"https://storage.googleapis.com/test-bucket/users/test-user/images/"
)
@@ -182,7 +184,7 @@ async def test_upload_media_webm_success(mock_settings, mock_storage_client):
headers=starlette.datastructures.Headers({"content-type": "video/webm"}),
)
result = await backend.server.v2.store.media.upload_media("test-user", test_file)
result = await store_media.upload_media("test-user", test_file)
assert result.startswith(
"https://storage.googleapis.com/test-bucket/users/test-user/videos/"
)
@@ -196,8 +198,8 @@ async def test_upload_media_mismatched_signature(mock_settings, mock_storage_cli
headers=starlette.datastructures.Headers({"content-type": "image/jpeg"}),
)
with pytest.raises(backend.server.v2.store.exceptions.InvalidFileTypeError):
await backend.server.v2.store.media.upload_media("test-user", test_file)
with pytest.raises(store_exceptions.InvalidFileTypeError):
await store_media.upload_media("test-user", test_file)
async def test_upload_media_invalid_signature(mock_settings, mock_storage_client):
@@ -207,5 +209,5 @@ async def test_upload_media_invalid_signature(mock_settings, mock_storage_client
headers=starlette.datastructures.Headers({"content-type": "image/jpeg"}),
)
with pytest.raises(backend.server.v2.store.exceptions.InvalidFileTypeError):
await backend.server.v2.store.media.upload_media("test-user", test_file)
with pytest.raises(store_exceptions.InvalidFileTypeError):
await store_media.upload_media("test-user", test_file)

View File

@@ -7,6 +7,12 @@ import pydantic
from backend.util.models import Pagination
class ChangelogEntry(pydantic.BaseModel):
version: str
changes_summary: str
date: datetime.datetime
class MyAgent(pydantic.BaseModel):
agent_id: str
agent_version: int
@@ -55,12 +61,17 @@ class StoreAgentDetails(pydantic.BaseModel):
runs: int
rating: float
versions: list[str]
agentGraphVersions: list[str]
agentGraphId: str
last_updated: datetime.datetime
recommended_schedule_cron: str | None = None
active_version_id: str | None = None
has_approved_version: bool = False
# Optional changelog data when include_changelog=True
changelog: list[ChangelogEntry] | None = None
class Creator(pydantic.BaseModel):
name: str
@@ -99,6 +110,7 @@ class Profile(pydantic.BaseModel):
class StoreSubmission(pydantic.BaseModel):
listing_id: str
agent_id: str
agent_version: int
name: str
@@ -153,8 +165,12 @@ class StoreListingsWithVersionsResponse(pydantic.BaseModel):
class StoreSubmissionRequest(pydantic.BaseModel):
agent_id: str
agent_version: int
agent_id: str = pydantic.Field(
..., min_length=1, description="Agent ID cannot be empty"
)
agent_version: int = pydantic.Field(
..., gt=0, description="Agent version must be greater than 0"
)
slug: str
name: str
sub_heading: str

Some files were not shown because too many files have changed in this diff Show More