- Move imports to top level
- Use tempfile for secure temp paths
- Add exception chaining (from e)
- Close AudioFileClip in finally block
- Document that ducking = reduced volume mix
- Extract helper method for test mocking
- Proper resource cleanup
- Move imports to top level
- Use tempfile for secure temp paths
- Add exception chaining (from e)
- Constrain output_format to enum
- Add ge=0.0 to transition_duration
- Extract helper method for test mocking
- Proper resource cleanup in finally
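Taken together, several of these items look roughly like the following in moviepy-based block code. This is a hedged sketch, not the actual diff; `VideoProcessingError` and the file paths are illustrative:

```python
import tempfile
from moviepy.editor import AudioFileClip  # imports at top level, per the fixes above

class VideoProcessingError(Exception):
    pass

def audio_duration(path: str) -> float:
    clip = None
    try:
        clip = AudioFileClip(path)
        return clip.duration
    except Exception as e:
        # Exception chaining: "from e" preserves the original traceback as __cause__
        raise VideoProcessingError(f"Failed to read audio from {path!r}") from e
    finally:
        if clip is not None:
            clip.close()  # close AudioFileClip in finally, per the fixes above

# Secure temp paths: tempfile instead of hand-built /tmp names
with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as tmp:
    output_path = tmp.name
```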
### Changes 🏗️
This PR includes two database migration fixes:
#### 1. Remove redundant Supabase extensions migration
Removes the `20260112173500_add_supabase_extensions_to_platform_schema`
migration which was attempting to manage Supabase-provided extensions
and schemas.
**What was removed:**
- Migration that created extensions (pgcrypto, uuid-ossp,
pg_stat_statements, pg_net, pgjwt, pg_graphql, pgsodium, supabase_vault)
- Schema creation for these extensions
**Why it was removed:**
- These extensions and schemas are pre-installed and managed by Supabase
automatically
- The migration was redundant and could cause schema drift warnings
- Attempting to manage Supabase-owned resources in our migrations is an
anti-pattern
#### 2. Fix pgvector extension schema handling
Improves the `20260109181714_add_docs_embedding` migration to handle
cases where pgvector exists in the wrong schema.
**Problem:**
- If pgvector was previously installed in `public` schema, `CREATE
EXTENSION IF NOT EXISTS` would succeed but not actually install it in
the `platform` schema
- This causes `type "vector" does not exist` errors because the type
isn't in the search_path
**Solution:**
- Detect if vector extension exists in a different schema than the
current one
- Drop it with CASCADE and reinstall in the correct schema (platform)
- Use dynamic SQL with `EXECUTE format()` to explicitly specify the
target schema
- Split exception handling: catch errors during removal, but let
installation fail naturally with clear PostgreSQL errors
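A sketch of that detect-and-reinstall logic, shown as a Python string for reference only; the exact SQL is assumed from the description above, not copied from the migration:

```python
MIGRATION_SQL = """
DO $$
BEGIN
    IF EXISTS (
        SELECT 1 FROM pg_extension e
        JOIN pg_namespace n ON n.oid = e.extnamespace
        WHERE e.extname = 'vector' AND n.nspname <> current_schema()
    ) THEN
        BEGIN
            DROP EXTENSION vector CASCADE;
        EXCEPTION WHEN OTHERS THEN
            RAISE NOTICE 'Could not drop existing vector extension: %', SQLERRM;
        END;
    END IF;
    -- Let installation fail naturally with a clear PostgreSQL error
    EXECUTE format('CREATE EXTENSION IF NOT EXISTS vector SCHEMA %I', current_schema());
END $$;
"""
```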
**Impact:**
- No functional changes - Supabase continues to provide extensions as
before
- pgvector now correctly installs in the platform schema
- Cleaner migration history
- Prevents schema-related errors
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified migrations run successfully without the redundant file
- [x] Confirmed Supabase extensions are still available
- [x] Tested pgvector migration handles wrong-schema scenario
- [x] No schema drift warnings
#### For configuration changes:
- [x] .env.default is updated or already compatible with my changes
- [x] docker-compose.yml is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
- N/A - No configuration changes required
### Changes 🏗️
- Renamed the `test` job to `e2e_test` in the CI workflow for better
clarity
- Added a new `integration_test` job to the CI workflow that runs unit
tests using `pnpm test:unit`
- Created a basic integration test for the MainMarketplacePage component
to verify CI functionality
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified the CI workflow runs both e2e and integration tests
- [x] Confirmed the integration test for MainMarketplacePage passes
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
### Changes 🏗️
- Added Vitest and React Testing Library for frontend unit testing
- Configured MSW (Mock Service Worker) for API mocking in tests
- Created test utilities and setup files for integration tests
- Added comprehensive testing documentation in `AGENTS.md`
- Updated Orval configuration to generate MSW mock handlers
- Added mock server and browser implementations for development testing
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run `pnpm test:unit` to verify tests pass
- [x] Verify MSW mock handlers are generated correctly
- [x] Check that test utilities work with sample component tests
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
### Changes 🏗️
<img width="1920" height="998" alt="Screenshot 2026-01-19 at 22 14 51"
src="https://github.com/user-attachments/assets/ecd1c241-6f77-4702-9774-5e58806b0b64"
/>
This PR lays the groundwork for the new UX of AutoGPT Copilot.
- Moves the Copilot to its own route `/copilot`
- Makes the Copilot the homepage when enabled
- Updates the labelling of the homepage icons
- Makes the Library the homepage when Copilot is disabled
- Improves Copilot's:
- session handling
- styles and UX
- message parsing
### Other improvements
- Improves the log-out UX by adding a new `/logout` page and using a redirect
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run locally and test the above
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Launches the new Copilot experience and aligns API behavior with the
UI.
>
> - **Routing/Home**: Add `/copilot` with `CopilotShell` (desktop
sidebar + mobile drawer), make homepage route flag-driven; update
login/signup/error redirects and root page to use `getHomepageRoute`.
> - **Chat UX**: Replace legacy chat with `components/contextual/Chat/*`
(new message list, bubbles, tool call/response formatting, stop button,
initial-prompt handling, refined streaming/error handling); remove old
platform chat components.
> - **Sessions**: Add paginated session list (infinite load),
auto-select/create logic, mobile/desktop navigation, and improved
session fetching/claiming guards.
> - **Auth/Logout**: New `/logout` flow with delayed redirect; gate
various queries on auth state and logout-in-progress.
> - **Backend**: `GET /api/chat/sessions/{id}` returns `null` instead of
404; service saves assistant message on `StreamFinish` to avoid loss and
prevents duplicate saves; OpenAPI updated accordingly.
> - **Misc**: Minor UI polish in library modals, loader styling, docs
(CONTRIBUTING) additions, and small formatting fixes in block docs
generator.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
1b4776dcf5. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
## Summary
- Remove explicit schema qualification (`{schema}.vector` and
`OPERATOR({schema}.<=>)`) from pgvector queries in `embeddings.py` and
`hybrid_search.py`
- Use unqualified `::vector` type cast and `<=>` operator which work
because pgvector is in the search_path on all environments
## Problem
The previous approach tried to explicitly qualify the vector type with
schema names, but this failed because:
- **CI environment**: pgvector is in `public` schema → `platform.vector`
doesn't exist
- **Dev (Supabase)**: pgvector is in `platform` schema → `public.vector`
doesn't exist
## Solution
Use unqualified `::vector` and `<=>` operator. PostgreSQL resolves these
via `search_path`, which includes the schema where pgvector is installed
on all environments.
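As a rough sketch of the resulting query shape (table and column names here are illustrative, not taken from `embeddings.py`):

```python
# Sketch only: unqualified cast and operator, resolved through search_path
SIMILARITY_QUERY = """
SELECT "contentId",
       1 - (embedding <=> $1::vector) AS similarity
FROM   "UnifiedContentEmbedding"
ORDER  BY embedding <=> $1::vector
LIMIT  $2;
"""
```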
Tested on both local and dev environments with a test script that
verified:
- ✅ Unqualified `::vector` type cast
- ✅ Unqualified `<=>` operator in ORDER BY
- ✅ Unqualified `<=>` in SELECT (similarity calculation)
- ✅ Combined query patterns matching actual usage
## Test plan
- [ ] CI tests pass
- [ ] Marketplace approval works on dev after deployment
Fixes: AUTOGPT-SERVER-763, AUTOGPT-SERVER-764, AUTOGPT-SERVER-76B
## Summary
Adds graceful error handling to AsyncRedisEventBus and RedisEventBus so
that connection failures log exceptions with full traceback while
remaining non-breaking. This allows DatabaseManager to operate without
Redis connectivity.
## Problem
DatabaseManager was failing with "Authentication required" when trying
to publish notifications via AsyncRedisNotificationEventBus. The service
has no Redis credentials configured, causing `increment_onboarding_runs`
to fail.
## Root Cause
When `increment_onboarding_runs` publishes a notification:
1. Calls `AsyncRedisNotificationEventBus().publish()`
2. Attempts to connect to Redis via `get_redis_async()`
3. Connection fails due to missing credentials
4. Exception propagates, failing the entire DB operation
Previous fix (#11775) made the cache module lazy, but didn't address the
notification bus which also requires Redis.
## Solution
Wrap Redis operations in try-except blocks:
- `publish_event`: Logs exception with traceback, continues without
publishing
- `listen_events`: Logs exception with traceback, returns empty
generator
- `wait_for_event`: Returns None on connection failure
Using `logger.exception()` instead of `logger.warning()` ensures full
stack traces are captured for debugging while keeping operations
non-breaking.
This allows services to operate without Redis when only using event bus
for non-critical notifications.
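A minimal sketch of the pattern, with class and helper names simplified; the real implementation lives in `backend/data/event_bus.py`:

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)

class AsyncRedisEventBus:
    async def _connect(self) -> Any:
        # Stand-in for get_redis_async(); raises when credentials are missing
        raise ConnectionError("Authentication required")

    async def publish_event(self, event: Any, channel: str) -> None:
        try:
            redis = await self._connect()
            await redis.publish(channel, str(event))
        except Exception:
            # Full traceback is captured, but the caller's DB operation proceeds
            logger.exception("Failed to publish event; continuing without Redis")

    async def listen_events(self, channel: str):
        try:
            redis = await self._connect()
        except Exception:
            # Early return turns this into an empty async generator
            logger.exception("Failed to listen for events; yielding nothing")
            return
        pubsub = redis.pubsub()
        await pubsub.subscribe(channel)
        async for message in pubsub.listen():
            yield message
```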
## Changes
- Modified `backend/data/event_bus.py`:
- Added graceful error handling to `RedisEventBus` and
`AsyncRedisEventBus`
- All Redis operations now catch exceptions and log with
`logger.exception()`
- Added `backend/data/event_bus_test.py`:
- Tests verify graceful degradation when Redis is unavailable
- Tests verify normal operation when Redis is available
## Test Plan
- [x] New tests verify graceful degradation when Redis unavailable
- [x] Existing notification tests still pass
- [x] DatabaseManager can increment onboarding runs without Redis
## Related Issues
Fixes https://significant-gravitas.sentry.io/issues/7205834440/
(AUTOGPT-SERVER-76D)
## Changes 🏗️
On the **Old Builder**, when running an agent...
### Before
<img width="800" height="614" alt="Screenshot 2026-01-21 at 21 27 05"
src="https://github.com/user-attachments/assets/a3b2ec17-597f-44d2-9130-9e7931599c38"
/>
Credentials are there, but they are not recognised; you need to click on
them manually for them to be selected
### After
<img width="1029" height="728" alt="Screenshot 2026-01-21 at 21 26 47"
src="https://github.com/user-attachments/assets/c6e83846-6048-439e-919d-6807674f2d5a"
/>
It uses the new credentials UI and correctly auto-selects existing ones.
### Other
Fixed a small timezone display glitch on the new library view.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run agent in old builder
- [x] Credentials are auto-selected and using the new collapsed system
credentials UI
## Summary
- Fixes AUTOGPT-SERVER-76H - Error parsing LibraryAgent from database
due to null values in GraphSettings fields
- When parsing LibraryAgent settings from the database, null values for
`human_in_the_loop_safe_mode` and `sensitive_action_safe_mode` were
causing Pydantic validation errors
- Adds `BeforeValidator` annotations to coerce null values to their
defaults (True and False respectively)
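A minimal sketch of the validator wiring, assuming Pydantic v2; the defaults follow the description above:

```python
from typing import Annotated, Any
from pydantic import BaseModel, BeforeValidator

def _none_to(default: Any):
    # Coerce DB nulls to the documented default before validation runs
    return lambda v: default if v is None else v

class GraphSettings(BaseModel):
    human_in_the_loop_safe_mode: Annotated[bool, BeforeValidator(_none_to(True))] = True
    sensitive_action_safe_mode: Annotated[bool, BeforeValidator(_none_to(False))] = False

# A DB row with nulls now validates instead of raising
print(GraphSettings.model_validate(
    {"human_in_the_loop_safe_mode": None, "sensitive_action_safe_mode": None}
))
# -> human_in_the_loop_safe_mode=True sensitive_action_safe_mode=False
```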
## Test plan
- [x] Verified with unit tests that GraphSettings can now handle
None/null values
- [x] Backend tests pass
- [x] Manually tested with all scenarios (None, empty dict, explicit
values)
Add new LLM Picker for the new Builder.
### Changes 🏗️
- Enrich `LlmModelMeta` (in `llm.py`) with human-readable model, creator,
and provider names plus a price tier (note: this is a temporary measure;
all `LlmModelMeta` will be removed completely once the LLM Registry is ready)
- Add provider icons
- Add custom input field `LlmModelField` and its components & helpers
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] LLM model picker works correctly in the new Builder
- [x] Legacy LLM model picker works in the old Builder
## Summary
This PR introduces two explicit safe mode toggles for controlling agent
execution behavior, providing clearer and more granular control over
when agents should pause for human review.
### Key Changes
**New Safe Mode Settings:**
- **`human_in_the_loop_safe_mode`** (bool, default `true`) - Controls
whether human-in-the-loop (HITL) blocks pause for review
- **`sensitive_action_safe_mode`** (bool, default `false`) - Controls
whether sensitive action blocks pause for review
**New Computed Properties on LibraryAgent:**
- `has_human_in_the_loop` - Indicates if agent contains HITL blocks
- `has_sensitive_action` - Indicates if agent contains sensitive action
blocks
**Block Changes:**
- Renamed `requires_human_review` to `is_sensitive_action` on blocks for
clarity
- Blocks marked as `is_sensitive_action=True` pause only when
`sensitive_action_safe_mode=True`
- HITL blocks pause when `human_in_the_loop_safe_mode=True`
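The gating logic this implies, as a hypothetical sketch; attribute names other than `is_sensitive_action` are illustrative:

```python
def should_pause_for_review(block, settings) -> bool:
    # HITL blocks pause whenever human_in_the_loop_safe_mode is on
    if block.is_human_in_the_loop:
        return settings.human_in_the_loop_safe_mode
    # Sensitive-action blocks pause only when sensitive_action_safe_mode is on
    if block.is_sensitive_action:
        return settings.sensitive_action_safe_mode
    return False
```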
**Frontend Changes:**
- Two separate toggles in Agent Settings based on block types present
- Toggle visibility based on `has_human_in_the_loop` and
`has_sensitive_action` computed properties
- Settings cog hidden if neither toggle applies
- Proper state management for both toggles with defaults
**AI-Generated Agent Behavior:**
- AI-generated agents set `sensitive_action_safe_mode=True` by default
- This ensures sensitive actions are reviewed for AI-generated content
## Changes
**Backend:**
- `backend/data/graph.py` - Updated `GraphSettings` with two boolean
toggles (non-optional with defaults), added `has_sensitive_action`
computed property
- `backend/data/block.py` - Renamed `requires_human_review` to
`is_sensitive_action`, updated review logic
- `backend/data/execution.py` - Updated `ExecutionContext` with both
safe mode fields
- `backend/api/features/library/model.py` - Added
`has_human_in_the_loop` and `has_sensitive_action` to `LibraryAgent`
- `backend/api/features/library/db.py` - Updated to use
`sensitive_action_safe_mode` parameter
- `backend/executor/utils.py` - Simplified execution context creation
**Frontend:**
- `useAgentSafeMode.ts` - Rewritten to support two independent toggles
- `AgentSettingsModal.tsx` - Shows two separate toggles
- `SelectedSettingsView.tsx` - Shows two separate toggles
- Regenerated API types with new schema
## Test Plan
- [x] All backend tests pass (Python 3.11, 3.12, 3.13)
- [x] All frontend tests pass
- [x] Backend format and lint pass
- [x] Frontend format and lint pass
- [x] Pre-commit hooks pass
---------
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
## Summary
- Fix intermittent "type 'vector' does not exist" errors when using
PgBouncer in transaction mode
- The issue was that `SET search_path` and the actual query could run on
different backend connections
- Use explicit schema qualification (`{schema}.vector`,
`OPERATOR({schema}.<=>)`) instead of relying on search_path
## Test plan
- [x] Tested vector type cast on local: `'[1,2,3]'::platform.vector`
works
- [x] Tested OPERATOR syntax on local: `OPERATOR(platform.<=>)` works
- [x] Tested on dev via kubectl exec: both work correctly
- [ ] Deploy to dev and verify backfill_missing_embeddings endpoint no
longer errors
## Related Issues
Fixes: AUTOGPT-SERVER-763, AUTOGPT-SERVER-764
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
# feat(backend/blocks): add ConcatenateListsBlock
## Description
This PR implements a new block `ConcatenateListsBlock` that concatenates
multiple lists into a single list. This addresses the "good first issue"
for implementing a list concatenation block in the platform/blocks area.
The block takes a list of lists as input and combines all elements in
order into a single concatenated list. This is useful for workflows that
need to merge data from multiple sources or combine results from
different operations.
### Changes 🏗️
- **Added `ConcatenateListsBlock` class** in
`autogpt_platform/backend/backend/blocks/data_manipulation.py`
- Input: `lists: List[List[Any]]` - accepts a list of lists to
concatenate
- Output: `concatenated_list: List[Any]` - returns a single concatenated
list
- Error output: `error: str` - provides clear error messages for invalid
input types
- Block ID: `3cf9298b-5817-4141-9d80-7c2cc5199c8e`
- Category: `BlockCategory.BASIC` (consistent with other list
manipulation blocks)
- **Added comprehensive test suite** in
`autogpt_platform/backend/test/blocks/test_concatenate_lists.py`
- Tests using built-in `test_input`/`test_output` validation
- Manual test cases covering edge cases (empty lists, single list, empty
input)
- Error handling tests for invalid input types
- Category consistency verification
- All tests passing
- **Implementation details** (see the sketch after this list):
- Uses `extend()` method for efficient list concatenation
- Preserves element order from all input lists
- **Runtime type validation**: Explicitly checks `isinstance(lst, list)`
before calling `extend()` to prevent:
- Strings being iterated character-by-character (e.g., `extend("abc")` →
`['a', 'b', 'c']`)
- Non-iterable types causing `TypeError` (e.g., `extend(1)`)
- Clear error messages indicating which index has invalid input
- Handles edge cases: empty lists, empty input, single list, None values
- Follows existing block patterns and conventions
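For reference, the core validate-then-extend loop as a standalone sketch; the actual block wraps this in the standard Block API and emits the `error` output instead of raising:

```python
from typing import Any, List

def concatenate_lists(lists: List[List[Any]]) -> List[Any]:
    result: List[Any] = []
    for index, item in enumerate(lists):
        if not isinstance(item, list):
            # Without this check, a string would be spread character-by-character
            # and a non-iterable such as an int would raise TypeError in extend()
            raise TypeError(f"Input at index {index} is not a list: {type(item).__name__}")
        result.extend(item)
    return result

assert concatenate_lists([[1, 2, 3], [4, 5, 6]]) == [1, 2, 3, 4, 5, 6]
assert concatenate_lists([]) == []
```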
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run `poetry run pytest test/blocks/test_concatenate_lists.py -v` -
all tests pass
- [x] Verified block can be imported and instantiated
- [x] Tested with built-in test cases (4 test scenarios)
- [x] Tested manual edge cases (empty lists, single list, empty input)
- [x] Tested error handling for invalid input types
- [x] Verified category is `BASIC` for consistency
- [x] Verified no linting errors
- [x] Confirmed block follows same patterns as other blocks in
`data_manipulation.py`
#### Code Quality:
- [x] Code follows existing patterns and conventions
- [x] Type hints are properly used
- [x] Documentation strings are clear and descriptive
- [x] Runtime type validation implemented
- [x] Error handling with clear error messages
- [x] No linting errors
- [x] Prisma client generated successfully
### Testing
**Test Results:**
```
test/blocks/test_concatenate_lists.py::test_concatenate_lists_block_builtin_tests PASSED
test/blocks/test_concatenate_lists.py::test_concatenate_lists_manual PASSED
============================== 2 passed in 8.35s ==============================
```
**Test Coverage:**
- Basic concatenation: `[[1, 2, 3], [4, 5, 6]]` → `[1, 2, 3, 4, 5, 6]`
- Mixed types: `[["a", "b"], ["c"], ["d", "e", "f"]]` → `["a", "b", "c",
"d", "e", "f"]`
- Empty list handling: `[[1, 2], []]` → `[1, 2]`
- Empty input: `[]` → `[]`
- Single list: `[[1, 2, 3]]` → `[1, 2, 3]`
- Error handling: Invalid input types (strings, non-lists) produce clear
error messages
- Category verification: Confirmed `BlockCategory.BASIC` for consistency
### Review Feedback Addressed
- **Category Consistency**: Changed from `BlockCategory.DATA` to
`BlockCategory.BASIC` to match other list manipulation blocks
(`AddToListBlock`, `FindInListBlock`, etc.)
- **Type Robustness**: Added explicit runtime validation with
`isinstance(lst, list)` check before calling `extend()` to prevent:
- Strings being iterated character-by-character
- Non-iterable types causing `TypeError`
- **Error Handling**: Added `error` output field with clear, descriptive
error messages indicating which index has invalid input
- **Test Coverage**: Added test case for error handling with invalid
input types
### Related Issues
- Addresses: "Implement block to concatenate lists" (good first issue,
platform/blocks, hacktoberfest)
### Notes
- This is a straightforward data manipulation block that doesn't require
external dependencies
- The block will be automatically discovered by the block loading system
- No database or configuration changes required
- Compatible with existing workflow system
- All review feedback has been addressed and incorporated
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Adds a new list utility and updates docs.
>
> - **New block**: `ConcatenateListsBlock` in
`backend/blocks/data_manipulation.py`
> - Input `lists: List[List[Any]]`; outputs `concatenated_list` or
`error`
> - Skips `None` entries; emits error for non-list items; preserves
order
> - **Docs**: Adds "Concatenate Lists" section to
`docs/integrations/basic.md` and links it in
`docs/integrations/README.md`
> - **Contributor guide**: New `docs/CLAUDE.md` with manual doc section
guidelines
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
4f56dd86c2. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Background
When using chat to run blocks/agents that support multiple credential
types (e.g., GitHub blocks support both `api_key` and `oauth2`), users
reported that the credentials setup UI would randomly show either "Add
API key" or "Connect account (OAuth)" - seemingly at random between
requests or server restarts.
## Root Cause
The bug was in how the backend selected which credential type to return
when building the missing credentials response:
```python
cred_type = next(iter(field_info.supported_types), "api_key")
```
The problem is that `supported_types` is a **frozenset**. When you call
`iter()` on a frozenset and take `next()`, the iteration order is
**non-deterministic** due to Python's hash randomization. This means:
- `frozenset({'api_key', 'oauth2'})` could iterate as either
`['api_key', 'oauth2']` or `['oauth2', 'api_key']`
- The order varies between Python process restarts and sometimes between
requests
- This caused the UI to randomly show different credential options
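The non-determinism is easy to demonstrate; run this snippet under different `PYTHONHASHSEED` values:

```python
supported_types = frozenset({"api_key", "oauth2"})

# Set iteration order depends on hash randomization, so this can differ
# between interpreter runs:
print(next(iter(supported_types), "api_key"))

# The fix: sorted() gives a stable, deterministic first element
print(sorted(supported_types)[0])  # always "api_key"
```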
### Changes 🏗️
**Backend (`utils.py`, `run_block.py`, `run_agent.py`):**
- Added `_serialize_missing_credential()` helper that uses `sorted()`
for deterministic ordering
- Added `build_missing_credentials_from_graph()` and
`build_missing_credentials_from_field_info()` utilities
- Now returns both `type` (first sorted type, for backwards compat) and
`types` (full array with ALL supported types)
**Frontend (`helpers.ts`, `ChatCredentialsSetup.tsx`,
`useChatMessage.ts`):**
- Updated to read the `types` array from backend response
- Changed `credentialType` (single) to `credentialTypes` (array)
throughout the chat credentials flow
- Passes all supported types to `CredentialsInput` via
`credentials_types` schema field
### Result
Now `useCredentials.ts` correctly sets both `supportsApiKey=true` AND
`supportsOAuth2=true` when both are supported, ensuring:
1. **Deterministic behavior** - no more random type selection
2. **All saved credentials shown** - credentials of any supported type
appear in the selection list
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified GitHub block shows consistent credential options across
page reloads
- [x] Verified both OAuth and API key credentials appear in selection
when user has both saved
- [x] Verified backend returns `types: ["api_key", "oauth2"]` array
(checked via Python REPL)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Ensures deterministic credential type selection and surfaces all
supported types end-to-end.
>
> - Backend: add `_serialize_missing_credential`,
`build_missing_credentials_from_graph/field_info`;
`run_agent`/`run_block` now return missing credentials with stable
ordering and both `type` (first) and `types` (all).
> - Frontend: chat helpers and UI (`helpers.ts`,
`ChatCredentialsSetup.tsx`, `useChatMessage.ts`) now read `types`,
switch from single `credentialType` to `credentialTypes`, and pass all
supported `credentials_types` in schemas.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
7d80f4f0e0. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
### Changes 🏗️
- Enhanced UI for the Run Graph button with improved loading states and
animations
- Added color-coded edges in the flow editor based on output data types
- Improved the layout of the Run Input Dialog with a two-column grid
design
- Refined the styling of flow editor controls with consistent icon sizes
and colors
- Updated tutorial icons with better color and size customization
- Fixed credential field display to show provider name with "credential"
suffix
- Optimized draft saving by excluding node position changes to prevent
excessive saves when dragging nodes
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified that the Run Graph button shows proper loading states
- [x] Confirmed that edges display correct colors based on data types
- [x] Tested the Run Input Dialog layout with various input
configurations
- [x] Checked that flow editor controls display consistently
- [x] Verified that tutorial icons render properly
- [x] Confirmed credential fields show proper provider names
- [x] Tested that dragging nodes doesn't trigger unnecessary draft saves
### Changes 🏗️
- Refactored the credentials input handling in the RunInputDialog to use
the shared CredentialsGroupedView component
- Moved CredentialsGroupedView from agent library to a shared component
location for reuse
- Fixed source name handling in edge creation to properly handle tool
source names
- Improved node output UI by replacing custom expand/collapse with
Accordion component
- Fixed timing of hardcoded values synchronization with handle IDs to
ensure proper loading
- Enabled NEW_FLOW_EDITOR and BUILDER_VIEW_SWITCH feature flags by
default
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified credentials input works in both agent run dialog and
builder run dialog
- [x] Confirmed node output accordion works correctly
- [x] Tested flow editor with tools to ensure source name handling works
properly
- [x] Verified hardcoded values sync correctly with handle IDs
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
This PR improves the Langfuse tracing implementation in the chat feature
by adopting the v3 SDK patterns, resulting in cleaner code and better
observability.
### Changes 🏗️
- **Simplified Langfuse client usage**: Replace manual client
initialization with `langfuse.get_client()` global singleton
- **Use v3 context managers**: Switch to
`start_as_current_observation()` and `propagate_attributes()` for
automatic trace propagation
- **Auto-instrument OpenAI calls**: Use `langfuse.openai` wrapper for
automatic LLM call tracing instead of manual generation tracking
- **Add `@observe` decorators**: All chat tools now have
`@observe(as_type="tool")` decorators for automatic tool execution
tracing:
- `add_understanding`
- `view_agent_output` (renamed from `agent_output`)
- `create_agent`
- `edit_agent`
- `find_agent`
- `find_block`
- `find_library_agent`
- `get_doc_page`
- `run_agent`
- `run_block`
- `search_docs`
- **Remove manual trace lifecycle**: Eliminated the verbose `finally`
block that manually ended traces/generations
- **Rename tool**: `agent_output` → `view_agent_output` for clarity
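A sketch of the resulting shape, using the names listed above; treat the exact signatures as approximate:

```python
from langfuse import get_client, observe

langfuse = get_client()  # v3 global singleton, replacing manual client setup

@observe(as_type="tool")
async def search_docs(query: str) -> list[str]:
    # Tool execution is traced automatically as a nested observation
    ...
```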
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified chat feature works with Langfuse tracing enabled
- [x] Confirmed traces appear correctly in Langfuse dashboard with tool
spans
- [x] Tested tool execution flows show up as nested observations
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
No configuration changes required - uses existing Langfuse environment
variables.
- Add generate_block_docs.py script that introspects block code to
generate markdown
- Support manual content preservation via <!-- MANUAL: --> markers
- Add migrate_block_docs.py to preserve existing manual content from git
HEAD
- Add CI workflow (docs-block-sync.yml) to fail if docs drift from code
- Add Claude PR review workflow (docs-claude-review.yml) for doc changes
- Add manual LLM enhancement workflow (docs-enhance.yml)
- Add GitBook configuration (.gitbook.yaml, SUMMARY.md)
- Fix non-deterministic category ordering (categories is a set)
- Add comprehensive test suite (32 tests)
- Generate docs for 444 blocks with 66 preserved manual sections
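A rough sketch of the marker-preservation idea; the closing-marker syntax here is assumed, since only the opening `<!-- MANUAL: -->` form appears in this description:

```python
import re

# Closing form "<!-- /MANUAL -->" is an assumption for this sketch
MANUAL_BLOCK = re.compile(
    r"<!--\s*MANUAL:\s*(?P<name>[^>]*?)\s*-->.*?<!--\s*/MANUAL\s*-->",
    re.DOTALL,
)

def preserve_manual_sections(previous_doc: str, regenerated_doc: str) -> str:
    # Carry hand-written sections from the old page into the regenerated one
    saved = {m.group("name"): m.group(0) for m in MANUAL_BLOCK.finditer(previous_doc)}
    return MANUAL_BLOCK.sub(
        lambda m: saved.get(m.group("name"), m.group(0)), regenerated_doc
    )
```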
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Extensively test code generation for the docs pages
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces an automated documentation pipeline for blocks and
integrates it into CI.
>
> - Adds `scripts/generate_block_docs.py` (+ tests) to introspect blocks
and generate `docs/integrations/**`, preserving `<!-- MANUAL: -->`
sections
> - New CI workflows: **docs-block-sync** (fails if docs drift),
**docs-claude-review** (AI review for block/docs PRs), and
**docs-enhance** (optional LLM improvements)
> - Updates existing Claude workflows to use `CLAUDE_CODE_OAUTH_TOKEN`
instead of `ANTHROPIC_API_KEY`
> - Improves numerous block descriptions/typos and links across backend
blocks to standardize docs output
> - Commits initial generated docs including
`docs/integrations/README.md` and many provider/category pages
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
631e53e0f6. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- Add start_time < end_time validation in VideoClipBlock and VideoTextOverlayBlock
- Fix resource leaks: close AudioFileClip in narration.py, TextClip in text_overlay.py
- Fix concat.py: proper resource cleanup in finally block, load clips individually
- Implement proper crossfade using crossfadein/crossfadeout
- Implement ducking mode with stronger attenuation (0.3x original_volume)
- Remove unused start_time/end_time params from VideoDownloadBlock
- Fix None handling for duration/title in download.py (use 'or' instead of 'get' default)
- Add exception chaining with 'from e' in all blocks
- Add minimum clips validation in VideoConcatBlock
- Sort __all__ in __init__.py
- Increase ElevenLabs API timeout to 120s for longer scripts
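The ducking and cleanup changes combined, as a hedged sketch against the moviepy 1.x API; the function and file paths are illustrative:

```python
from moviepy.editor import AudioFileClip, CompositeAudioClip, VideoFileClip

def add_narration_with_ducking(video_path: str, narration_path: str,
                               output_path: str, original_volume: float = 1.0) -> None:
    video = VideoFileClip(video_path)
    narration = AudioFileClip(narration_path)
    try:
        # "Ducking" = keep the original audio playing, attenuated to 0.3x its
        # volume, mixed underneath the narration track
        ducked = video.audio.volumex(0.3 * original_volume)
        video = video.set_audio(CompositeAudioClip([ducked, narration]))
        video.write_videofile(output_path)
    finally:
        # Close clips in finally so file handles are released even on error
        narration.close()
        video.close()
```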
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
## Summary
This PR adds proper execution end time tracking and fixes timestamp
handling throughout the execution analytics system.
### Key Changes
1. **Added `endedAt` field to database schema** - Executions now have a
dedicated field for tracking when they finish
2. **Fixed timestamp nullable handling** - `started_at` and `ended_at`
are now properly nullable in types
3. **Fixed chart aggregation** - Reduced threshold from ≥3 to ≥1
executions per day
4. **Improved timestamp display** - Moved timestamps to expandable
details section in analytics table
5. **Fixed nullable timestamp bugs** - Updated all frontend code to
handle null timestamps correctly
## Problem Statement
### Issue 1: Missing Execution End Times
Previously, executions used `updatedAt` (last DB update) as a proxy for
"end time". This broke when adding correctness scores retroactively -
the end time would change to whenever the score was added, not when the
execution actually finished.
### Issue 2: Chart Shows Only One Data Point
The accuracy trends chart showed only one data point despite having
executions across multiple days. Root cause: aggregation required ≥3
executions per day.
### Issue 3: Incorrect Type Definitions
Manually maintained types defined `started_at` and `ended_at` as
non-nullable `Date`, contradicting reality where QUEUED executions
haven't started yet.
## Solution
### Database Schema (`schema.prisma`)
```prisma
model AgentGraphExecution {
// ...
startedAt DateTime?
endedAt DateTime? // NEW FIELD
// ...
}
```
### Execution Lifecycle
- **QUEUED**: `startedAt = null`, `endedAt = null` (not started)
- **RUNNING**: `startedAt = set`, `endedAt = null` (in progress)
- **COMPLETED/FAILED/TERMINATED**: `startedAt = set`, `endedAt = set`
(finished)
### Migration Strategy
```sql
-- Add endedAt column
ALTER TABLE "AgentGraphExecution" ADD COLUMN "endedAt" TIMESTAMP(3);
-- Backfill ONLY terminal executions (prevents marking RUNNING executions as ended)
UPDATE "AgentGraphExecution"
SET "endedAt" = "updatedAt"
WHERE "endedAt" IS NULL
AND "executionStatus" IN ('COMPLETED', 'FAILED', 'TERMINATED');
```
## Changes by Component
### Backend
**`schema.prisma`**
- Added `endedAt` field to `AgentGraphExecution`
**`execution.py`**
- Made `started_at` and `ended_at` optional with Field descriptions
- Updated `from_db()` to use `endedAt` instead of `updatedAt`
- `update_graph_execution_stats()` sets `endedAt` when status becomes
terminal
**`execution_analytics_routes.py`**
- Removed `created_at`/`updated_at` from `ExecutionAnalyticsResult` (DB
metadata, not execution data)
- Kept only `started_at`/`ended_at` (actual execution runtime)
- Made settings global (avoid recreation)
- Moved OpenAI key validation to `_process_batch` (only check when LLM
actually runs)
**`analytics.py`**
- Fixed aggregation: `COUNT(*) >= 1` (was 3) - include all days with ≥1
execution
- Uses `createdAt` for chart grouping (when execution was queued)
**`late_execution_monitor.py`**
- Handle optional `started_at` with fallback to `datetime.min` for
sorting
- Display "Not started" when `started_at` is null
### Frontend
**Type Definitions**
- Fixed manually maintained `types.ts`: `started_at: Date | null` (was
non-nullable)
- Generated types were already correct
**Analytics Components**
- `AnalyticsResultsTable.tsx`: Show only `started_at`/`ended_at` in
2-column expandable grid
- `ExecutionAnalyticsForm.tsx`: Added filter explanation UI
**Monitoring Components** - Fixed null handling bugs:
- `OldAgentLibraryView.tsx`: Handle null in reduce function
- `agent-runs-selector-list.tsx`: Safe sorting with `?.getTime() ?? 0`
- `AgentFlowList.tsx`: Filter/sort with null checks
- `FlowRunsStatus.tsx`: Filter null timestamps
- `FlowRunsTimeline.tsx`: Filter executions with null timestamps before
rendering
- `monitoring/page.tsx`: Safe sorting
- `ActivityItem.tsx`: Fallback to "recently" for null timestamps
## Benefits
✅ **Accurate End Times**: `endedAt` is frozen when execution finishes,
not updated later
✅ **Type Safety**: Nullable types match reality, exposing real bugs
✅ **Better UX**: Chart shows all days with data (not just days with ≥3
executions)
✅ **Bug Fixes**: 7+ frontend components now handle null timestamps
correctly
✅ **Documentation**: Field descriptions explain when timestamps are null
## Testing
### Backend
```bash
cd autogpt_platform/backend
poetry run format # ✅ All checks passed
poetry run lint # ✅ All checks passed
```
### Frontend
```bash
cd autogpt_platform/frontend
pnpm format # ✅ All checks passed
pnpm lint # ✅ All checks passed
pnpm types # ✅ All type errors fixed
```
### Test Data Generation
Created script to generate 35 test executions across 7 days with
correctness scores:
```bash
poetry run python scripts/generate_test_analytics_data.py
```
## Migration Notes
⚠️ **Important**: The migration only backfills `endedAt` for executions
with terminal status (COMPLETED, FAILED, TERMINATED). Active executions
(QUEUED, RUNNING) correctly keep `endedAt = null`.
## Breaking Changes
None - this is backward compatible:
- `endedAt` is nullable, existing code that doesn't use it is unaffected
- Frontend already used generated types which were correct
- Migration safely backfills historical data
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces explicit execution end-time tracking and normalizes
timestamp handling across backend and frontend.
>
> - Adds `endedAt` to `AgentGraphExecution` (schema + migration);
backfills terminal executions; sets `endedAt` on terminal status updates
> - Makes `GraphExecutionMeta.started_at/ended_at` optional; updates
`from_db()` to use DB `endedAt`; exposes timestamps in
`ExecutionAnalyticsResult`
> - Moves OpenAI key validation into batch processing; instantiates
`Settings` once
> - Accuracy trends: reduce daily aggregation threshold to `>= 1`;
optional historical series
> - Monitoring/analytics UI: results table shows/export
`started_at`/`ended_at`; adds chart filter explainer
> - Frontend null-safety: update types (`Date | null`) and fix
sorting/filtering/rendering for nullable timestamps across monitoring
and library views
> - Late execution monitor: safe sorting/display when `started_at` is
null
> - OpenAPI specs updated for new/nullable fields
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
1d987ca6e5. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
### Changes 🏗️
**Fixed missing default credentials and provider name mismatch in the
credentials store:**
1. **Provider name correction** (`credentials_store.py:97-103`)
- Changed `provider="unreal"` → `provider="unreal_speech"` to match the
existing `unreal_speech_api_key` setting and block usage
- Updated title from "Use Credits for Unreal" → "Use Credits for Unreal
Speech" for clarity
2. **Added missing OpenWeatherMap credentials**
(`credentials_store.py:219-226`)
- New `openweathermap_credentials` definition with `APIKeyCredentials`
- Uses existing `settings.secrets.openweathermap_api_key` setting that
was previously defined but had no credential object
- Added to `DEFAULT_CREDENTIALS` list
3. **Fixed credentials not exposed in `get_all_creds()`**
(`credentials_store.py:343-354`)
- Added `llama_api_credentials` conditional append (was defined but not
returned to users)
- Added `v0_credentials` conditional append (was defined but not
returned to users)
- Added `openweathermap_credentials` conditional append
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified provider name `unreal_speech` matches block usage in
`text_to_speech_block.py`
- [x] Confirmed `openweathermap_api_key` setting exists in secrets
- [x] Confirmed `llama_api_key` and `v0_api_key` settings exist in
secrets
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Aligns backend credential definitions and exposes missing system
creds; updates frontend to hide new built-ins.
>
> - Backend `credentials_store.py`:
> - Corrects `provider` to `unreal_speech` and updates title
> - Adds `openweathermap_credentials`; includes in `DEFAULT_CREDENTIALS`
and `get_all_creds()` when key present
> - Ensures `llama_api_credentials` and `v0_credentials` are returned by
`get_all_creds()`
> - Frontend `integrations/page.tsx`:
> - Extends `hiddenCredentials` with IDs for `v0`, `webshare_proxy`, and
`openweathermap`
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e7d46b76c6. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
This PR fixes flaky agent-activity Playwright tests that were failing
intermittently in CI.
Closes #11789
### Changes 🏗️
- **Navigate to specific agent by name**: Replace
`LibraryPage.clickFirstAgent(page)` with
`LibraryPage.navigateToAgentByName(page, "Test Agent")` to ensure we're
testing the correct agent rather than relying on the first agent in the
list
- **Add retry mechanism for async data loading**: Replace direct
visibility check with `expect(...).toPass({ timeout: 15000 })` pattern
to properly handle asynchronous agent data fetching
- **Increase timeout**: Extended timeout from 8000ms to 15000ms to
accommodate slower CI environments
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified the test file syntax is correct
- [x] Changes target the correct file
(`autogpt_platform/frontend/src/tests/agent-activity.spec.ts`)
- [x] The retry mechanism follows Playwright best practices using
`toPass()`
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
(N/A - no config changes)
- [x] `docker-compose.yml` is updated or already compatible with my
changes (N/A - no config changes)
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**) (N/A - no config changes)
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>
This PR adds the ability to create and edit agents from natural language
descriptions in the chat copilot.
### Changes 🏗️
- Added `agent_generator/` module with:
- LLM client for OpenAI API calls
- Core generation logic for decomposing goals and generating agent JSON
- Fixer module to correct common LLM generation errors
- Validator to ensure generated agents are structurally valid
- Prompts for goal decomposition and agent generation
- Utility functions for blocks info and agent saving
- Added `CreateAgentTool` - creates new agents from natural language
descriptions
- Added `EditAgentTool` - edits existing agents using natural language
patches
- Added response models: `AgentPreviewResponse`, `AgentSavedResponse`,
`ClarificationNeededResponse`
- Registered new tools in the tools registry
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run `poetry run format` to ensure code passes linting
- [x] Test creating an agent via chat with a natural language
description
- [x] Test editing an existing agent via chat
This PR adds new chat tools for searching blocks and documentation,
along with BM25 reranking for improved search relevance.
### Changes 🏗️
**New Chat Tools:**
- `find_block` - Search for available blocks by name/description using
hybrid search
- `run_block` - Execute a block directly with provided inputs and
credentials
- `search_docs` - Search documentation with section-level granularity
- `get_doc_page` - Retrieve full documentation page content
**Search Improvements:**
- Added BM25 reranking to hybrid search for better lexical relevance
- Documentation handler now chunks markdown by headings (##) for
finer-grained embeddings
- Section-based content IDs (`doc_path::section_index`) for precise doc
retrieval
- Startup embedding backfill in scheduler for immediate searchability
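Roughly how the BM25 reranking step works, using the `rank-bm25` dependency added in this PR; the corpus and query here are illustrative:

```python
from rank_bm25 import BM25Okapi

candidates = [
    "Get Current Time: outputs the current time",
    "Send Email: sends an email via SMTP",
    "Countdown Timer: waits for a given duration",
]
bm25 = BM25Okapi([doc.lower().split() for doc in candidates])
scores = bm25.get_scores("current time".lower().split())

# Reorder hybrid-search candidates by lexical BM25 score, highest first
reranked = [doc for _, doc in sorted(zip(scores, candidates),
                                     key=lambda pair: pair[0], reverse=True)]
print(reranked[0])  # -> "Get Current Time: outputs the current time"
```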
**Other Changes:**
- New response models for block and documentation search results
- Updated orphan cleanup to handle section-based doc embeddings
- Added `rank-bm25` dependency for BM25 scoring
- Removed max message limit check in chat service
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run find_block tool to search for blocks (e.g., "current time")
- [x] Run run_block tool to execute a found block
- [x] Run search_docs tool to search documentation
- [x] Run get_doc_page tool to retrieve full doc content
- [x] Verify BM25 reranking improves search relevance for exact term
matches
- [x] Verify documentation sections are properly chunked and embedded
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
**Dependencies added:** `rank-bm25` for BM25 scoring algorithm
Frontend changes extracted from the hackathon/copilot branch for the
copilot feature development.
### Changes 🏗️
- New Chat system with contextual components (`Chat`, `ChatDrawer`,
`ChatContainer`, `ChatMessage`, etc.)
- Form renderer system with RJSF v6 integration and new input renderers
- Enhanced credentials management with improved OAuth flow and
credential selection
- New output renderers for various content types (Code, Image, JSON,
Markdown, Text, Video)
- Scrollable tabs component for better UI organization
- Marketplace update notifications and publishing workflow improvements
- Draft recovery feature with IndexedDB persistence
- Safe mode toggle functionality
- Various UI/UX improvements across the platform
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [ ] Test new Chat components functionality
- [ ] Verify form renderer with various input types
- [ ] Test credential management flows
- [ ] Verify output renderers display correctly
- [ ] Test draft recovery feature
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
---------
Co-authored-by: Lluis Agusti <hi@llu.lu>
## Summary
- Makes Redis connection lazy in the cache module - connection is only
established when `shared_cache=True` is actually used
- Fixes DatabaseManager failing to start because it imports
`onboarding.py` which imports `cache.py`, triggering Redis connection at
module load time even though it only uses in-memory caching
## Root Cause
Commit `b01ea3fcb` (merged today) added `increment_onboarding_runs` to
DatabaseManager, which imports from `onboarding.py`. That module imports
`@cached` decorator from `cache.py`, which was creating a Redis
connection at module import time:
```python
# Old code - ran at import time!
redis = Redis(connection_pool=_get_cache_pool())
```
Since `onboarding.py` only uses `@cached(shared_cache=False)` (in-memory
caching), it doesn't actually need Redis. But the import triggered the
connection attempt.
## Changes
- Wrapped Redis connection in a singleton class with lazy initialization
- Connection is only established when `_get_redis()` is first called
(i.e., when `shared_cache=True` is used)
- Services using only in-memory caching can now import `cache.py`
without Redis configuration
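The lazy-initialization pattern, sketched; names follow the description above, with details simplified:

```python
from typing import Optional
import redis

_client: Optional[redis.Redis] = None

def _get_redis() -> redis.Redis:
    """Connect on first shared-cache use instead of at module import."""
    global _client
    if _client is None:
        # Stand-in for Redis(connection_pool=_get_cache_pool())
        _client = redis.Redis()
    return _client

# Importing this module no longer touches Redis; only a call like
# _get_redis().get("key") (i.e. shared_cache=True) establishes the connection.
```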
## Test plan
- [ ] Services using `shared_cache=False` work without Redis configured
- [ ] Services using `shared_cache=True` still work correctly with Redis
- [ ] Existing cache tests pass
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
### Changes 🏗️
- **Improved Error Handling**: Enhanced error handling in
`useRunInputDialog.ts` to properly handle cases where node errors are
empty or undefined
- **Fixed Node Collision Resolution**: Updated `Flow.tsx` to use the
current state from the store instead of stale props
- **Enhanced History Management**:
- Added proper state tracking for edge removal operations
- Improved undo/redo functionality to prevent duplicate states
- Fixed edge case where history wasn't properly tracked during node
dragging
- **UI Improvements**:
- Fixed potential null reference in NodeHeader when accessing agent_name
- Added placeholder for GoogleDrivePicker in INPUT mode
- Fixed spacing in ArrayFieldTemplate
- **Bug Fixes**:
- Added proper state tracking before modifying nodes/edges
- Fixed history tracking to avoid redundant states
- Improved collision detection and resolution
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Test undo/redo functionality after adding, removing, and moving
nodes
- [x] Test edge creation and deletion with history tracking
- [x] Verify error handling when graph validation fails
- [x] Test Google Drive picker in different UI modes
- [x] Verify node collision resolution works correctly
### Changes 🏗️
- Added a placeholder UI for Google Drive Picker in INPUT block type
- Improved detection of Google Drive file objects in schema validation
- Extracted `isGoogleDrivePickerSchema` function for better code
organization
- Added spacing between array field elements with a gap-2 class
- Added debug logging for preprocessed schema in FormRenderer
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified Google Drive Picker shows placeholder in INPUT blocks
- [x] Confirmed array field elements have proper spacing
- [x] Tested that Google Drive file objects are properly detected
### Changes 🏗️
- Enhanced error handling for graph validation failures with detailed
user feedback
- Added automatic viewport navigation to the first node with errors when
validation fails
- Improved node title display to prioritize agent_name from hardcoded
values
- Removed console.log debugging statement from OutputHandler
- Added ApiError import and improved error type handling
- Reorganized imports for better code organization
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create a graph with intentional validation errors and verify error
messages display correctly
- [x] Verify the viewport automatically navigates to the first node with
errors
- [x] Check that node titles correctly display customized names or agent
names
- [x] Test error recovery by fixing validation errors and successfully
running the graph
### Changes 🏗️
Added support for Note blocks in the FieldTemplate component by:
- Importing the BlockUIType enum from the build components types
- Extracting the uiType from the registry.formContext
- Adding a conditional rendering check that returns children directly
when the uiType is BlockUIType.NOTE
This change allows Note blocks to render without the standard field
template wrapper, providing a cleaner display for note-type content.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Created a Note block and verified it renders correctly without
field template wrapper
- [x] Confirmed other block types still render with proper field
template
- [x] Verified that Note blocks maintain proper functionality in the
node graph
## Summary
This PR extends the embedding system to support **blocks** and
**documentation** content types in addition to store agents, and
introduces **unified hybrid search** across all content types using a
single `UnifiedContentEmbedding` table.
### Key Changes
1. **Unified Hybrid Search Architecture**
- Added `search` tsvector column to `UnifiedContentEmbedding` table
- New `unified_hybrid_search()` function searches across all content
types (agents, blocks, docs)
- Updated `hybrid_search()` for store agents to use
`UnifiedContentEmbedding.search`
- Removed deprecated `search` column from `StoreListingVersion` table
2. **Pluggable Content Handler Architecture**
- Created abstract `ContentHandler` base class for extensibility
- Implemented handlers: `StoreAgentHandler`, `BlockHandler`,
`DocumentationHandler`
- Registry pattern for easy addition of new content types
3. **Block Embeddings**
- Discovers all blocks using `get_blocks()`
- Extracts searchable text from: name, description, categories,
input/output schemas
4. **Documentation Embeddings**
- Scans `/docs/` directory for `.md` and `.mdx` files
- Extracts title from first `#` heading or uses filename as fallback
5. **Hybrid Search Graceful Degradation**
- Falls back to lexical-only search if query embedding generation fails
- Redistributes the semantic weight proportionally to the other
components (see the sketch after this list)
- Logs a warning instead of throwing an error
6. **Database Migrations**
- `20260115200000_add_unified_search_tsvector`: Adds search column to
UnifiedContentEmbedding with auto-update trigger
- `20260115210000_remove_storelistingversion_search`: Removes deprecated
search column and updates StoreAgent view
7. **Orphan Cleanup**
- `cleanup_orphaned_embeddings()` removes embeddings for deleted content
- Always runs after backfill, even at 100% coverage
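The proportional redistribution in item 5, as a small sketch; the weight names and values are illustrative:

```python
def redistribute_weights(weights: dict[str, float], failed: str) -> dict[str, float]:
    # Drop the failed component and scale the rest back up to sum to 1.0
    remaining = {name: w for name, w in weights.items() if name != failed}
    total = sum(remaining.values())
    return {name: w / total for name, w in remaining.items()}

weights = {"semantic": 0.5, "lexical": 0.3, "category": 0.2}
print(redistribute_weights(weights, "semantic"))
# -> {'lexical': 0.6, 'category': 0.4}
```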
### Review Comments Addressed
- ✅ SQL parameter index bug when user_id provided (embeddings.py)
- ✅ Early return skipping cleanup at 100% coverage (scheduler.py)
- ✅ Inconsistent return structure across code paths (scheduler.py)
- ✅ SQL UNION syntax error - added parentheses for ORDER BY/LIMIT
(hybrid_search.py)
- ✅ Version numeric ordering in aggregations (migration)
- ✅ Embedding dimension uses EMBEDDING_DIM constant
### Files Changed
- `backend/api/features/store/content_handlers.py` (NEW): Handler
architecture
- `backend/api/features/store/embeddings.py`: Refactored to use handlers
- `backend/api/features/store/hybrid_search.py`: Unified search +
graceful degradation
- `backend/executor/scheduler.py`: Process all content types, consistent
returns
- `migrations/20260115200000_add_unified_search_tsvector/`: Add tsvector
to unified table
- `migrations/20260115210000_remove_storelistingversion_search/`: Remove
old search column
- `schema.prisma`: Updated UnifiedContentEmbedding and
StoreListingVersion models
- `*_test.py`: Added tests for unified_hybrid_search
## Test Plan
1. ✅ All tests passing on Python 3.11, 3.12, 3.13
2. ✅ Types check passing
3. ✅ CodeRabbit and Sentry reviews addressed
4. Deploy to staging and verify:
- Backfill job processes all content types
- Search results include blocks and docs
- Search works without OpenAI API (graceful degradation)
🤖 Generated with [Claude Code](https://claude.ai/code)
---------
Co-authored-by: Swifty <craigswift13@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
GitBook changes made via the UI.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Docs sync from GitBook**
>
> - Updates `docs/home/README.md` with a new Developer Platform landing
page (cards, links to Platform, Integrations, Contribute, Discord,
GitHub) and metadata/cover settings
> - Adds `docs/home/SUMMARY.md` defining the table of contents linking
to `README.md`
> - No application/runtime code changes
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
## Changes 🏗️
### System credentials in Run Modal
Previously, "system" credentials were mixed in with "user" credentials
in the Run Agent modal:
#### Before
<img width="400" height="466" alt="Screenshot 2026-01-14 at 19 05 56"
src="https://github.com/user-attachments/assets/9d1ee766-5004-491f-ae14-a0cf89a9118e"
/>
This created confusion among users. These "system" credentials are
supplied by AutoGPT ( _most of the time_ ), and a user running an agent
should not need to deal with them ( _unless they want to change them_ ).
In this example, the credential that actually matters is the **Google**
one 🙇🏽
#### After
<img width="400" height="350" alt="Screenshot 2026-01-14 at 19 04 12"
src="https://github.com/user-attachments/assets/e2bbc015-ce4c-496c-a76f-293c01a11c6f"
/>
<img width="400" height="672" alt="Screenshot 2026-01-14 at 19 04 19"
src="https://github.com/user-attachments/assets/d704dae2-ecb2-4306-bd04-3d812fed4401"
/>
"System" credentials are collapsed by default, reducing noise in the
Task Credentials section. The user can still see and change them by
expanding the accordion.
<img width="400" height="190" alt="Screenshot 2026-01-14 at 19 04 27"
src="https://github.com/user-attachments/assets/edc69612-4588-48e4-981a-f59c26cfa390"
/>
If any "system" credentials are missing, a red label now makes that
clear; this wasn't obvious in the previous implementation:
<img width="400" height="309" alt="Screenshot 2026-01-14 at 19 04 30"
src="https://github.com/user-attachments/assets/f27081c7-40ad-4757-97b3-f29636616fc2"
/>
### New endpoint
A new REST endpoint, `GET /providers/system`, lists the system
credential providers so the frontend can easily group them separately
from user ones.
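A minimal sketch of what such a route could look like, assuming a FastAPI backend; the response model and the provider names below are hypothetical, not the actual implementation:

```python
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter()

class SystemProvidersResponse(BaseModel):
    providers: list[str]

# Hypothetical set of providers whose credentials the platform supplies.
SYSTEM_CREDENTIAL_PROVIDERS = {"openai", "anthropic", "google_maps"}

@router.get("/providers/system", response_model=SystemProvidersResponse)
async def list_system_credential_providers() -> SystemProvidersResponse:
    """List providers whose credentials are platform-supplied, so the
    frontend can collapse them separately from user credentials."""
    return SystemProvidersResponse(providers=sorted(SYSTEM_CREDENTIAL_PROVIDERS))
```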
### Other improvements
#### `<CredentialsInput />` refinements
<img width="715" height="200" alt="Screenshot 2026-01-14 at 19 09 31"
src="https://github.com/user-attachments/assets/01b39b16-25f3-428d-a6c8-da608038a38b"
/>
The Credentials dropdown now uses a regular browser `<select>` ( _when
you have more than one credential for a provider_ ). This cuts out a lot
of UI shenanigans and gives a better UX on 📱 ( _eventually we should
move all our selects to native ones, as they are much better for mobile
and touch screens and mean less code to maintain on our end_ ).
I also renamed some files for clarity and tidied up some of the existing
logic.
#### Other
- Fix **OpenTelemetry** warnings on the server console by making the
packages external
- Fix `require-in-the-middle` console warnings
- Prettier tidy ups
## Checklist 📋
### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run the app locally and test the above
This PR extracts backend changes from the hackathon/copilot branch,
adding enhanced chat capabilities, agent management tools, store
embeddings, and hybrid search functionality.
### Changes 🏗️
**Chat Features:**
- Added chat database layer (`db.py`) for conversation and message
persistence
- Extended chat models with new types and response structures
- New onboarding system prompt for guided user experiences
- Enhanced chat routes with additional endpoints
- Expanded chat service with more capabilities
**Chat Agent Tools:**
- `agent_output.py` - Handle agent execution outputs
- `create_agent.py` - Tool for creating new agents via chat
- `edit_agent.py` - Tool for modifying existing agents
- `find_library_agent.py` - Search and discover library agents
- Enhanced `run_agent.py` with additional functionality
- New `models.py` for shared tool types
**Store Enhancements:**
- `embeddings.py` - Vector embeddings support for semantic search
- `hybrid_search.py` - Combined keyword and semantic search
- `backfill_embeddings.py` - Utility for backfilling existing data
- Updated store database operations
**Admin:**
- Enhanced store admin routes
**Data Layer:**
- New `understanding.py` module for agent understanding/context
**Database Migrations:**
- `add_chat_tables` - Chat conversation and message tables
- `add_store_embeddings` - Embeddings storage for store items
- `enhance_search` - Search index improvements
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Chat endpoints respond correctly
- [x] Agent tools (create/edit/find/run) function properly
- [x] Store embeddings and hybrid search work
- [x] Database migrations apply cleanly
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
---------
Co-authored-by: Torantulino <40276179@live.napier.ac.uk>
(#11770)
### Changes 🏗️
- Added conditional rendering for array and object field children based
on connection status
- Implemented `shouldShowChildren` logic in `ArrayFieldTemplate` and
`ObjectFieldTemplate` components
- Modified the `shouldShowChildren` condition in `FieldTemplate` to
handle different schema types
- Imported and utilized `cleanUpHandleId` and `useEdgeStore` to check if
inputs are connected
- Added connection status checks to hide form fields when their inputs
are connected to other nodes

### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified that object and array fields hide their children when
connected to other nodes
- [x] Confirmed that unconnected fields display their children properly
- [x] Tested with various schema types to ensure correct rendering
behavior
- [x] Checked that the connection status is properly detected and
applied
### Changes 🏗️
This PR adds a comprehensive interactive tutorial for the new Builder UI
to help users learn how to create agents. Key changes include:
- Added a tutorial button to the canvas controls that launches a
step-by-step guide
- Created a Shepherd.js-based tutorial with multiple steps covering:
- Adding blocks from the Block Menu
- Understanding input and output handles
- Configuring block values
- Connecting blocks together
- Saving and running agents
- Added data-id attributes to key UI elements for tutorial targeting
- Implemented tutorial state management with a new tutorialStore
- Added helper functions for tutorial navigation and block manipulation
- Created CSS styles for tutorial tooltips and highlights
- Integrated with the Run Input dialog to support tutorial flow
- Added prefetching of tutorial blocks for better performance
https://github.com/user-attachments/assets/3db964b3-855c-4fcc-aa5f-6cd74ab33d7d
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Complete the tutorial from start to finish
- [x] Test tutorial on different screen sizes
- [x] Verify all tutorial steps work correctly
- [x] Ensure tutorial can be canceled and restarted
- [x] Check that tutorial doesn't interfere with normal builder
functionality
This PR adds hybrid search functionality combining semantic embeddings
with traditional text search for improved store listing discovery.
### Changes 🏗️
- Add `embeddings.py` - OpenAI-based embedding generation and similarity
search
- Add `hybrid_search.py` - Combines vector similarity with text matching
for better search results (see the sketch after this list)
- Add `backfill_embeddings.py` - Script to generate embeddings for
existing store listings
- Update `db.py` - Integrate hybrid search into store database queries
- Update `schema.prisma` - Add embedding storage fields and indexes
- Add migrations for embedding columns and HNSW index for vector search
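To illustrate the core combination idea (a sketch of the technique, not the code in `hybrid_search.py`; the names and the 0.7 weight are assumptions):

```python
# Hybrid scoring sketch: a convex combination of vector similarity and
# lexical rank. Inputs are assumed normalized to [0, 1], so the blended
# score stays in [0, 1] too.
def hybrid_score(
    semantic_similarity: float,   # e.g. 1 - cosine distance from the HNSW index
    text_rank: float,             # e.g. Postgres ts_rank scaled to [0, 1]
    semantic_weight: float = 0.7, # illustrative weight, not the real value
) -> float:
    return semantic_weight * semantic_similarity + (1.0 - semantic_weight) * text_rank
```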
### Architecture Decisions 🏛️
**Fail-Fast Approach (No Silent Fallbacks)**
We explicitly chose NOT to implement graceful degradation when hybrid
search fails. Here's why:
✅ **Benefits:**
- Errors surface immediately → faster fixes
- Tests verify hybrid search actually works (not just fallback)
- Consistent search quality for all users
- Forces proper infrastructure setup (API keys, database)
❌ **Why Not Fallback:**
- Silent degradation hides production issues
- Users get inconsistent results without knowing why
- Tests can pass even when hybrid search is broken
- Reduces operational visibility
**How We Prevent Failures:**
1. Embedding generation in approval flow (db.py:1545)
2. Error logging with `logger.error` (not warning)
3. Clear error messages (ValueError explains what's wrong)
4. Comprehensive test coverage (9/9 tests passing)
If embeddings fail, it indicates a real infrastructure issue (missing
API key, OpenAI down, database issues) that needs immediate attention,
not silent degradation.
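As a sketch of that fail-fast path (assuming the OpenAI Python client; the function name, model choice, and error text are illustrative, not the actual `embeddings.py` code):

```python
import logging

from openai import AsyncOpenAI

logger = logging.getLogger(__name__)

async def generate_embedding(text: str, api_key: str | None) -> list[float]:
    """Generate an embedding, raising loudly instead of degrading silently."""
    if not api_key:
        # Clear error message: configuration problems surface immediately.
        raise ValueError(
            "openai_internal_api_key is not configured; "
            "hybrid search requires embedding generation"
        )
    try:
        client = AsyncOpenAI(api_key=api_key)
        response = await client.embeddings.create(
            model="text-embedding-3-small", input=text
        )
        return response.data[0].embedding
    except Exception as e:
        logger.error("Embedding generation failed: %s", e)  # error, not warning
        raise  # no silent fallback; let callers see the infrastructure issue
```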
### Test Coverage ✅
**All tests passing (1625 total):**
- 9/9 hybrid_search tests (including fail-fast validation)
- 3/3 db search integration tests
- Full schema compatibility (public/platform schemas)
- Error handling verification
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Test hybrid search returns relevant results
- [x] Test embedding generation for new listings
- [x] Test backfill script on existing data
- [x] Verify search performance with embeddings
- [x] Test fail-fast behavior when embeddings unavailable
#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] Configuration: Requires `openai_internal_api_key` in secrets
---------
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>