AutoGPT

mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-04-08 03:00:28 -04:00

Author	SHA1	Message	Date
Zamil Majdy	edd4c96aa6	fix: remove accidentally committed supabase submodule	2026-01-22 19:48:22 -05:00
Zamil Majdy	cd231e2d69	fix(backend/tests): fix event loop issues in review route tests - Convert module-level TestClient to fixture to avoid event loop conflicts - Add missing mock for get_pending_reviews_for_user in all tests - Add client parameter to all test functions that use the test client - Add missing mocks for get_graph_execution_meta in several tests - Remove asyncio.gather to avoid event loop binding issues - Process auto-approval creation sequentially with try/except for safety All 14 review route tests now pass successfully.	2026-01-22 19:24:07 -05:00
Zamil Majdy	399c472623	fix(backend/store): deduplicate missing API key error logs Only log "openai_internal_api_key not set" error once per process instead of on every embedding generation attempt. Reduces log spam when processing batch operations without an API key configured.	2026-01-22 19:09:35 -05:00
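The "log once per process" behaviour described in 399c472623 can be sketched with a module-level flag. This is a minimal illustration and not the repository's code: the logger name and the function `warn_missing_api_key_once` are hypothetical.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("embeddings_sketch")

# Module-level flag: there is one copy per Python process.
_missing_key_logged = False


def warn_missing_api_key_once() -> bool:
    """Log the missing-key error on the first call only.

    Returns True if this call emitted the log line, False if it was suppressed.
    """
    global _missing_key_logged
    if _missing_key_logged:
        return False
    _missing_key_logged = True
    logger.error("openai_internal_api_key not set")
    return True
```

A batch job that calls this for every embedding attempt would emit one error line instead of one per item.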
Zamil Majdy	554e2beddf	fix(backend/hitl): address CodeRabbit review feedback - Use return_exceptions=True in asyncio.gather for auto-approval creation to prevent endpoint failure when auto-approval fails (reviews already processed) - Fix empty payload handling: use explicit None check instead of truthiness - Distinguish auto-approvals from normal approvals: auto-approvals always use current input_data, normal approvals preserve explicitly empty payloads	2026-01-22 19:08:14 -05:00
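A small sketch of the two techniques named in 554e2beddf: `asyncio.gather(..., return_exceptions=True)` so one failed auto-approval does not fail the whole batch, and an explicit `None` check so an intentionally empty payload is preserved. The helpers `create_auto_approval`, `create_all`, and `resolve_payload` are hypothetical stand-ins, not functions from the codebase.

```python
import asyncio


async def create_auto_approval(node_id: str) -> str:
    # Hypothetical stand-in for the real auto-approval write.
    if node_id == "bad":
        raise RuntimeError("auto-approval failed")
    return f"auto_approve:{node_id}"


async def create_all(node_ids: list[str]) -> tuple[list[str], list[BaseException]]:
    # return_exceptions=True turns raised exceptions into result values,
    # so one failure does not propagate out of gather().
    results = await asyncio.gather(
        *(create_auto_approval(n) for n in node_ids),
        return_exceptions=True,
    )
    created = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return created, failed


def resolve_payload(reviewed_data, input_data):
    # Explicit None check: an empty dict or list is a deliberate choice by the
    # reviewer, while None means "no edit was supplied".
    return input_data if reviewed_data is None else reviewed_data
```

With a truthiness check (`reviewed_data or input_data`) an explicitly empty `{}` would be silently replaced by the original input, which is the case the commit message calls out.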
Zamil Majdy	29fdda3fa8	test(backend/executor): add tests for stop_graph_execution with REVIEW status - Test cancellation of pending reviews when stopping execution in REVIEW status - Test database manager pattern when Prisma is disconnected - Test cascading stop to children with pending reviews - Fix mock to simulate status transition from RUNNING to TERMINATED Covers the bug fixes in stop_graph_execution() that handle: 1. Immediate termination of REVIEW status executions 2. Cleanup of pending reviews when stopping 3. Recursive cleanup of subagent reviews via cascade	2026-01-22 18:59:20 -05:00
Zamil Majdy	67e6a8841c	fix(executor): Handle REVIEW status when stopping graph executions Critical bug fix: stopping a graph in REVIEW status caused timeouts and orphaned reviews. ## Bugs Fixed ### 1. REVIEW Status Not Handled Before: - stop_graph_execution() only handled QUEUED, INCOMPLETE, RUNNING, COMPLETED, FAILED - REVIEW status → waited 15 seconds → TimeoutError - Graph remained stuck in REVIEW status After: - REVIEW status treated like QUEUED/INCOMPLETE (terminate immediately) - No need to wait for executor since execution is paused - Clean termination without timeouts ### 2. Orphaned Pending Reviews Before: - Stopping graph → status = TERMINATED - Pending reviews remained in WAITING status - User saw reviews for terminated execution in UI - Could not approve/reject (backend validation rejects) - Reviews stuck until manual cleanup After: - When stopping REVIEW execution, clean up pending reviews - Mark all WAITING reviews as REJECTED - reviewMessage: 'Execution was stopped by user' - processed: true, reviewedAt: now() - No orphaned reviews in UI ### 3. Subagent Reviews Before: - Parent graph with child (subagent) executions - Child paused for HITL review - Stop parent → recursively stops child - Child reviews orphaned (same bugs as above) After: - Cascade stop properly handles child REVIEW status - All child reviews cleaned up recursively - Clean shutdown of entire execution tree ## Implementation Changes to stop_graph_execution(): 1. Added ExecutionStatus.REVIEW to immediate termination list 2. Check if status == REVIEW before marking TERMINATED 3. Update all WAITING reviews to REJECTED with message 4. Log cleanup for debugging 5. Then terminate execution normally Cascade behavior preserved: - Still recursively stops all child executions - Each child's reviews cleaned up individually - Parent waits for all children to complete cleanup	2026-01-22 18:27:08 -05:00
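The review cleanup described in 67e6a8841c (mark every WAITING review as REJECTED, set the message, `processed`, and `reviewedAt`) might look roughly like the following. It is an in-memory sketch under stated assumptions: the `Review` dataclass and `reject_pending_reviews` are hypothetical, and the real change operates on database rows.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Review:
    # Hypothetical in-memory model of a pending human review row.
    status: str = "WAITING"
    review_message: Optional[str] = None
    processed: bool = False
    reviewed_at: Optional[datetime] = None


def reject_pending_reviews(reviews: list[Review]) -> int:
    """Mark every WAITING review as REJECTED; return how many were changed."""
    now = datetime.now(timezone.utc)
    changed = 0
    for review in reviews:
        if review.status != "WAITING":
            continue  # already approved or rejected; leave untouched
        review.status = "REJECTED"
        review.review_message = "Execution was stopped by user"
        review.processed = True
        review.reviewed_at = now
        changed += 1
    return changed
```

Running the same cleanup for each child execution before terminating it gives the cascade behaviour the message describes.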
Zamil Majdy	aea97db485	feat(frontend): Hide pending reviews panel while execution is RUNNING/QUEUED Defense in depth: prevent users from seeing/clicking review panel before execution pauses for review. Before: - Reviews panel could show while execution is RUNNING - User could click to open panel and see pending reviews - Confusing UX: why are reviews shown if graph hasn't paused yet? - Could lead to frustration when backend rejects the approval attempt After: - Panel hidden if execution status is RUNNING or QUEUED - Panel only shows when status is REVIEW (paused for review) - Clear UX: reviews appear only when execution needs user input Benefits: 1. Better UX: No confusion about when to approve reviews 2. Prevents invalid attempts: User can't try to approve while running 3. Works with backend validation: Frontend hides, backend rejects 4. Clear state: Panel visibility directly matches execution state Changes: - Added status check: hide if RUNNING or QUEUED - Panel shows only when execution has paused (REVIEW/INCOMPLETE) - Existing polling logic still works for real-time updates	2026-01-22 18:22:33 -05:00
Zamil Majdy	71a6969bbd	feat(hitl): Add backend validation to prevent review processing during RUNNING/QUEUED status Defense in depth: validate execution status before processing reviews. Before: - Reviews could be processed regardless of execution status - Could cause race conditions and deadlocks - User confusion when reviews processed but execution still running After: - Reject review processing with 409 Conflict if status is not REVIEW/INCOMPLETE - Only allow processing when execution is actually paused for review - Clear error message explaining why the request was rejected Benefits: 1. Prevention over cure: Stop invalid requests before processing 2. Clear semantics: Reviews can only be processed when execution paused 3. Better UX: User gets immediate feedback if they try to approve too early 4. Simpler resume logic: No need for complex status checks since we validate upfront Changes: - Fetch graph execution metadata early in the endpoint - Validate status is REVIEW or INCOMPLETE before processing - Removed redundant status checks in resume logic (already validated) - Simplified resume flow: just check if pending reviews remain - Fixed comment: 'all pending reviews' not 'some reviews'	2026-01-22 18:22:21 -05:00
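The up-front status validation in 71a6969bbd can be sketched as a guard that runs before any review is processed. The enum below lists only statuses named in the commit messages, and `ReviewConflict` and `ensure_reviewable` are hypothetical names; an API layer would be expected to map the error to HTTP 409.

```python
from enum import Enum


class ExecutionStatus(str, Enum):
    # Subset of statuses mentioned in the commit messages.
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    REVIEW = "REVIEW"
    INCOMPLETE = "INCOMPLETE"
    COMPLETED = "COMPLETED"


class ReviewConflict(Exception):
    """Hypothetical error that an API layer would translate to 409 Conflict."""

    status_code = 409


# Reviews may only be processed while the execution is paused.
REVIEWABLE = frozenset({ExecutionStatus.REVIEW, ExecutionStatus.INCOMPLETE})


def ensure_reviewable(status: ExecutionStatus) -> None:
    """Raise ReviewConflict unless the execution is paused for review."""
    if status not in REVIEWABLE:
        raise ReviewConflict(
            f"Cannot process reviews while execution is {status.value}"
        )
```

Validating first means the resume logic afterwards only has to ask whether any pending reviews remain.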
Zamil Majdy	e4c3f9995b	feat(frontend): Change safety popup to per-agent instead of global Changed AI_AGENT_SAFETY_POPUP_SHOWN from a boolean flag to an array of agent IDs. This ensures users see the safety popup once per unique agent instead of once globally. Why this is better: - Different agents have different capabilities (sensitive actions, HITL blocks) - User should be aware of what THIS specific agent can do - Not too annoying since it's still only once per agent, not every run - Better safety awareness when switching between safe and risky agents Changes: - Store array of seen agent IDs in localStorage instead of single boolean - Pass agentId to useAIAgentSafetyPopup hook and AIAgentSafetyPopup component - Check if current agent ID is in the seen list before showing popup - Add agent ID to list when user acknowledges popup Testing: - Clear localStorage or remove specific agent ID from array to re-trigger popup - Each unique agent shows popup on first run only	2026-01-22 18:13:33 -05:00
Zamil Majdy	3b58684abc	fix(hitl): Prevent review deadlock by resuming regardless of execution status When users approve/reject reviews but the execution status is not REVIEW (due to race conditions or bugs), the reviews get marked as processed but execution never resumes, leaving the graph stuck forever. This fix ensures that: - If no pending reviews remain after processing, we ALWAYS attempt to resume - Only skip if status is COMPLETED or FAILED (already finished) - Log warning if status is unexpected (not REVIEW) but still resume to prevent deadlock - Prevents scenario where user has nothing to do (reviews processed) but graph never completes Example deadlock scenario (now prevented): 1. Graph creates review, sets status to REVIEW 2. User approves review → marked as APPROVED 3. Status check finds unexpected state (not REVIEW) 4. OLD: Return without resuming → graph stuck forever 5. NEW: Log warning and resume anyway → graph completes	2026-01-22 18:13:18 -05:00
Zamil Majdy	e8d44a62fd	refactor(hitl): Add user_id validation and code quality improvements - Add user_id parameter to check_approval for data isolation consistency - Fix message text: 'block' → 'node' in auto-approval message - Use walrus operator for cleaner approval_result check - Move imports to top-level in test file (avoid local imports) - Remove obvious comments (Check if pending, Resume execution, Load settings)	2026-01-22 18:04:03 -05:00
Zamil Majdy	be024da2a8	fix(hitl): Prevent review race condition by checking execution status Fixed race condition where user approves reviews while graph execution is still RUNNING, which could queue the execution twice and cause duplicate/conflicting execution instances. Solution: - Check graph execution status BEFORE resuming - Only resume if status is REVIEW (execution paused for review) - Skip resumption if RUNNING (will naturally pick up approved reviews) - Skip if COMPLETED/other (already finished) This ensures we never queue an execution that's already running, while still allowing the running execution to pick up approved reviews naturally. Added tests: - All review action tests now mock get_graph_execution_meta - Tests verify execution only resumes when status is REVIEW	2026-01-22 17:48:24 -05:00
Zamil Majdy	0df917e243	fix(hitl): Expose check_approval through database manager client Fixed "Client is not connected to the query engine" error when check_approval is called from block execution context. The function is now accessed through the database manager async client (RPC), similar to other HITL methods like get_or_create_human_review. Changes: - Add check_approval to DatabaseManager and DatabaseManagerAsyncClient - Update HITLReviewHelper to call check_approval via database client - Remove direct import of check_approval in review.py	2026-01-22 17:33:52 -05:00
Zamil Majdy	8688805a8c	refactor(hitl): Consolidate check_auto_approval into check_approval Merge auto-approval check and normal approval check into a single function using find_first with OR condition. This reduces database queries by checking both the node_exec_id and auto_approve_key in one query.	2026-01-22 16:55:12 -05:00
Zamil Majdy	9bdda7dab0	cleanup	2026-01-22 16:23:40 -05:00
Zamil Majdy	7d377aabaa	fix(db): Remove useless prefix	2026-01-22 16:00:09 -05:00
Zamil Majdy	dfd7c64068	feat(backend): Implement node-specific auto-approval using key pattern - Add auto-approval via special nodeExecId key pattern (auto_approve_{graph_exec_id}_{node_id}) - Create auto-approval records in PendingHumanReview when user approves with auto-approve flag - Check for existing auto-approval before requiring human review - Remove node_id parameter from get_or_create_human_review - Load graph settings properly when resuming execution after review	2026-01-21 22:21:00 -05:00
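The key pattern from dfd7c64068, `auto_approve_{graph_exec_id}_{node_id}`, can be illustrated with a small in-memory store. Only the key format comes from the commit message; `auto_approve_key` and the `ReviewStore` class are hypothetical and stand in for the PendingHumanReview records.

```python
def auto_approve_key(graph_exec_id: str, node_id: str) -> str:
    # Key format taken from the commit message.
    return f"auto_approve_{graph_exec_id}_{node_id}"


class ReviewStore:
    """Hypothetical in-memory stand-in for stored auto-approval records."""

    def __init__(self) -> None:
        self._auto_approved: set[str] = set()

    def record_auto_approval(self, graph_exec_id: str, node_id: str) -> None:
        # Called when the user approves with the auto-approve flag set.
        self._auto_approved.add(auto_approve_key(graph_exec_id, node_id))

    def needs_human_review(self, graph_exec_id: str, node_id: str) -> bool:
        # Check for an existing auto-approval before pausing for a human.
        return auto_approve_key(graph_exec_id, node_id) not in self._auto_approved
```

Because the key includes both IDs, an approval for one node in one execution does not leak to other nodes or other runs.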
Zamil Majdy	02089bc047	fix(frontend): Add polling for pending reviews badge to update in real-time - Add refetchInterval to execution details query to poll while running/review - Add polling support to usePendingReviewsForExecution hook - Poll pending reviews every 2 seconds when execution is in REVIEW status - This ensures the "X Reviews Pending" badge updates without page refresh	2026-01-21 21:08:10 -05:00
Zamil Majdy	bed7b356bb	fix(frontend): Reset card data when auto-approve toggle changes Include autoApproveFuture in the key prop to force PendingReviewCard to remount when the toggle changes, which resets its internal state to the original payload data.	2026-01-21 21:04:56 -05:00
Zamil Majdy	4efc0ff502	fix(migration): Correct migration to only drop FK constraint, not non-existent column The nodeId column was never added to PendingHumanReview. The migration should only drop the foreign key constraint linking nodeExecId to AgentNodeExecution, not try to drop a column that doesn't exist.	2026-01-21 20:13:41 -05:00
Zamil Majdy	4ad0528257	feat(hitl): Simplify auto-approval with toggle UX and remove node_id storage - Remove nodeId column from PendingHumanReview schema (use in-memory tracking) - Remove foreign key relation from PendingHumanReview to AgentNodeExecution - Use ExecutionContext.auto_approved_node_ids for auto-approval tracking - Add auto-approve toggle in frontend (default off) - When toggle enabled: disable editing and use original data - Backend looks up agentNodeId from AgentNodeExecution when auto-approving - Update tests to reflect schema changes	2026-01-21 19:57:11 -05:00
Zamil Majdy	2f440ee80a	Merge branch 'dev' into feat/sensitive-action-features	2026-01-21 19:08:32 -05:00
Zamil Majdy	5d0cd88d98	fix(backend): Use unqualified vector type for pgvector queries (#11818) ## Summary - Remove explicit schema qualification (`{schema}.vector` and `OPERATOR({schema}.<=>)`) from pgvector queries in `embeddings.py` and `hybrid_search.py` - Use unqualified `::vector` type cast and `<=>` operator which work because pgvector is in the search_path on all environments ## Problem The previous approach tried to explicitly qualify the vector type with schema names, but this failed because: - CI environment: pgvector is in `public` schema → `platform.vector` doesn't exist - Dev (Supabase): pgvector is in `platform` schema → `public.vector` doesn't exist ## Solution Use unqualified `::vector` and `<=>` operator. PostgreSQL resolves these via `search_path`, which includes the schema where pgvector is installed on all environments. Tested on both local and dev environments with a test script that verified: - ✅ Unqualified `::vector` type cast - ✅ Unqualified `<=>` operator in ORDER BY - ✅ Unqualified `<=>` in SELECT (similarity calculation) - ✅ Combined query patterns matching actual usage ## Test plan - [ ] CI tests pass - [ ] Marketplace approval works on dev after deployment Fixes: AUTOGPT-SERVER-763, AUTOGPT-SERVER-764, AUTOGPT-SERVER-76B autogpt-platform-beta-v0.6.43	2026-01-21 18:11:58 +00:00
Zamil Majdy	033f58c075	fix(backend): Make Redis event bus gracefully handle connection failures (#11817) ## Summary Adds graceful error handling to AsyncRedisEventBus and RedisEventBus so that connection failures log exceptions with full traceback while remaining non-breaking. This allows DatabaseManager to operate without Redis connectivity. ## Problem DatabaseManager was failing with "Authentication required" when trying to publish notifications via AsyncRedisNotificationEventBus. The service has no Redis credentials configured, causing `increment_onboarding_runs` to fail. ## Root Cause When `increment_onboarding_runs` publishes a notification: 1. Calls `AsyncRedisNotificationEventBus().publish()` 2. Attempts to connect to Redis via `get_redis_async()` 3. Connection fails due to missing credentials 4. Exception propagates, failing the entire DB operation Previous fix (#11775) made the cache module lazy, but didn't address the notification bus which also requires Redis. ## Solution Wrap Redis operations in try-except blocks: - `publish_event`: Logs exception with traceback, continues without publishing - `listen_events`: Logs exception with traceback, returns empty generator - `wait_for_event`: Returns None on connection failure Using `logger.exception()` instead of `logger.warning()` ensures full stack traces are captured for debugging while keeping operations non-breaking. This allows services to operate without Redis when only using event bus for non-critical notifications. 
## Changes - Modified `backend/data/event_bus.py`: - Added graceful error handling to `RedisEventBus` and `AsyncRedisEventBus` - All Redis operations now catch exceptions and log with `logger.exception()` - Added `backend/data/event_bus_test.py`: - Tests verify graceful degradation when Redis is unavailable - Tests verify normal operation when Redis is available ## Test Plan - [x] New tests verify graceful degradation when Redis unavailable - [x] Existing notification tests still pass - [x] DatabaseManager can increment onboarding runs without Redis ## Related Issues Fixes https://significant-gravitas.sentry.io/issues/7205834440/ (AUTOGPT-SERVER-76D)	2026-01-21 15:51:26 +00:00
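The graceful-degradation pattern described in 033f58c075 (catch the connection failure, log it with `logger.exception()`, carry on) can be sketched without Redis. Everything here is an assumption made for illustration: the two connection classes are fakes, and this `publish_event` is a simplified free function rather than the method on `RedisEventBus`.

```python
import logging

logger = logging.getLogger("event_bus_sketch")  # hypothetical logger name


class BrokenConnection:
    """Fake connection that fails the way an unauthenticated client might."""

    def publish(self, channel: str, message: str) -> None:
        raise ConnectionError("Authentication required")


class MemoryConnection:
    """Fake connection that records what was published."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def publish(self, channel: str, message: str) -> None:
        self.sent.append((channel, message))


def publish_event(connection, channel: str, message: str) -> bool:
    """Publish if possible; on failure log the traceback and continue."""
    try:
        connection.publish(channel, message)
    except Exception:
        # logger.exception records the full stack trace at ERROR level.
        logger.exception("Failed to publish event to %s", channel)
        return False
    return True
```

The caller's main operation (for example a database update) proceeds whether or not the notification was delivered.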
Ubbe	40ef2d511f	fix(frontend): auto-select credentials correctly in old builder (#11815) ## Changes 🏗️ On the Old Builder, when running an agent... ### Before Credentials are there, but they are not recognised; you need to click on them to select them. ### After It uses the new credentials UI and correctly auto-selects existing ones. ### Other Fixed a small timezone display glitch on the new library view. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run agent in old builder - [x] Credentials are auto-selected and using the new collapsed system credentials UI	2026-01-21 14:55:49 +00:00
Zamil Majdy	2a55923ec0	Merge dev to get GraphSettings fix	2026-01-21 09:31:17 -05:00
Zamil Majdy	b714c0c221	fix(backend): handle null values in GraphSettings validation (#11812) ## Summary - Fixes AUTOGPT-SERVER-76H - Error parsing LibraryAgent from database due to null values in GraphSettings fields - When parsing LibraryAgent settings from the database, null values for `human_in_the_loop_safe_mode` and `sensitive_action_safe_mode` were causing Pydantic validation errors - Adds `BeforeValidator` annotations to coerce null values to their defaults (True and False respectively) ## Test plan - [x] Verified with unit tests that GraphSettings can now handle None/null values - [x] Backend tests pass - [x] Manually tested with all scenarios (None, empty dict, explicit values)	2026-01-21 08:40:38 -05:00
Krzysztof Czerwinski	ebabc4287e	feat(platform): New LLM Picker UI (#11726) Add new LLM Picker for the new Builder. ### Changes 🏗️ - Enrich `LlmModelMeta` (in `llm.py`) with human-readable model, creator and provider names and price tier (note: this is a temporary measure and all LlmModelMeta will be removed completely once LLM Registry is ready) - Add provider icons - Add custom input field `LlmModelField` and its components and helpers ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] LLM model picker works correctly in the new Builder - [x] Legacy LLM model picker works in the old Builder	2026-01-21 10:52:55 +00:00
Zamil Majdy	ad50f57a2b	chore: add migration for nodeId field in PendingHumanReview Adds database migration to add the nodeId column which tracks the node ID in the graph definition for auto-approval tracking.	2026-01-20 23:03:03 -05:00
Zamil Majdy	aebd961ef5	fix: implement node-specific auto-approval for human reviews Instead of disabling all safe modes when approving all future actions, now tracks specific node IDs that should be auto-approved. This means clicking "Approve all future actions" will only auto-approve future reviews from the same blocks, not all reviews. Changes: - Add nodeId field to PendingHumanReview schema - Add auto_approved_node_ids set to ExecutionContext - Update review helper to check auto_approved_node_ids - Change API from disable_future_reviews to auto_approve_node_ids - Update frontend to pass node_ids when bulk approving - Address PR feedback: remove barrel file, JSDoc comments, and cleanup	2026-01-20 22:15:51 -05:00
Zamil Majdy	bcccaa16cc	fix: remove unused props from AIAgentSafetyPopup component Removes hasSensitiveAction and hasHumanInTheLoop props that were only used by the hook, not the component itself, fixing ESLint unused vars error.	2026-01-20 21:05:39 -05:00
Zamil Majdy	d5ddc41b18	feat: add bulk approval option for human reviews Add "Approve all future actions" button to the review UI that: - Approves all current pending reviews - Disables safe mode for the remainder of the execution run - Shows helper text about turning auto-approval on/off in settings Backend changes: - Add disable_future_reviews flag to ReviewRequest model - Pass ExecutionContext with disabled safe modes when resuming Frontend changes: - Add "Approve all future actions" button to PendingReviewsList - Include helper text per PRD requirements Implements SECRT-1795	2026-01-20 20:45:50 -05:00
Zamil Majdy	95eab5b7eb	feat: add one-time safety popup for AI-generated agent runs Show a one-time safety popup the first time a user runs an agent with sensitive actions or human-in-the-loop blocks. The popup explains that agents may take real-world actions and that safety checks are enabled. - Add AI_AGENT_SAFETY_POPUP_SHOWN localStorage key - Create AIAgentSafetyPopup component with hook - Integrate popup into RunAgentModal before first run Implements SECRT-1798	2026-01-20 20:40:18 -05:00
Zamil Majdy	832d6e1696	fix: correct safe mode checks for sensitive action blocks - Add skip_safe_mode_check parameter to HITLReviewHelper to avoid checking the wrong safe mode flag for sensitive action blocks - Simplify SafeModeToggle and FloatingSafeModeToggle by removing unnecessary intermediate variables and isHITLStateUndetermined checks	2026-01-20 20:33:55 -05:00
Zamil Majdy	8b25e62959	feat(backend,frontend): add explicit safe mode toggles for HITL and sensitive actions (#11756) ## Summary This PR introduces two explicit safe mode toggles for controlling agent execution behavior, providing clearer and more granular control over when agents should pause for human review. ### Key Changes New Safe Mode Settings: - `human_in_the_loop_safe_mode` (bool, default `true`) - Controls whether human-in-the-loop (HITL) blocks pause for review - `sensitive_action_safe_mode` (bool, default `false`) - Controls whether sensitive action blocks pause for review New Computed Properties on LibraryAgent: - `has_human_in_the_loop` - Indicates if agent contains HITL blocks - `has_sensitive_action` - Indicates if agent contains sensitive action blocks Block Changes: - Renamed `requires_human_review` to `is_sensitive_action` on blocks for clarity - Blocks marked as `is_sensitive_action=True` pause only when `sensitive_action_safe_mode=True` - HITL blocks pause when `human_in_the_loop_safe_mode=True` Frontend Changes: - Two separate toggles in Agent Settings based on block types present - Toggle visibility based on `has_human_in_the_loop` and `has_sensitive_action` computed properties - Settings cog hidden if neither toggle applies - Proper state management for both toggles with defaults AI-Generated Agent Behavior: - AI-generated agents set `sensitive_action_safe_mode=True` by default - This ensures sensitive actions are reviewed for AI-generated content ## Changes Backend: - `backend/data/graph.py` - Updated `GraphSettings` with two boolean toggles (non-optional with defaults), added `has_sensitive_action` computed property - `backend/data/block.py` - Renamed `requires_human_review` to `is_sensitive_action`, updated review logic - `backend/data/execution.py` - Updated `ExecutionContext` with both safe mode fields - `backend/api/features/library/model.py` - Added `has_human_in_the_loop` and `has_sensitive_action` to `LibraryAgent` - 
`backend/api/features/library/db.py` - Updated to use `sensitive_action_safe_mode` parameter - `backend/executor/utils.py` - Simplified execution context creation Frontend: - `useAgentSafeMode.ts` - Rewritten to support two independent toggles - `AgentSettingsModal.tsx` - Shows two separate toggles - `SelectedSettingsView.tsx` - Shows two separate toggles - Regenerated API types with new schema ## Test Plan - [x] All backend tests pass (Python 3.11, 3.12, 3.13) - [x] All frontend tests pass - [x] Backend format and lint pass - [x] Frontend format and lint pass - [x] Pre-commit hooks pass --------- Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>	2026-01-21 00:56:02 +00:00
Zamil Majdy	35a13e3df5	fix(backend): Use explicit schema qualification for pgvector types (#11805) ## Summary - Fix intermittent "type 'vector' does not exist" errors when using PgBouncer in transaction mode - The issue was that `SET search_path` and the actual query could run on different backend connections - Use explicit schema qualification (`{schema}.vector`, `OPERATOR({schema}.<=>)`) instead of relying on search_path ## Test plan - [x] Tested vector type cast on local: `'[1,2,3]'::platform.vector` works - [x] Tested OPERATOR syntax on local: `OPERATOR(platform.<=>)` works - [x] Tested on dev via kubectl exec: both work correctly - [ ] Deploy to dev and verify backfill_missing_embeddings endpoint no longer errors ## Related Issues Fixes: AUTOGPT-SERVER-763, AUTOGPT-SERVER-764 --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-20 22:18:16 +00:00
Mewael Tsegay Desta	2169b433c9	feat(backend/blocks): add ConcatenateListsBlock (#11567) ## Description This PR implements a new block `ConcatenateListsBlock` that concatenates multiple lists into a single list. This addresses the "good first issue" for implementing a list concatenation block in the platform/blocks area. The block takes a list of lists as input and combines all elements in order into a single concatenated list. This is useful for workflows that need to merge data from multiple sources or combine results from different operations. ### Changes 🏗️ - Added `ConcatenateListsBlock` class in `autogpt_platform/backend/backend/blocks/data_manipulation.py` - Input: `lists: List[List[Any]]` - accepts a list of lists to concatenate - Output: `concatenated_list: List[Any]` - returns a single concatenated list - Error output: `error: str` - provides clear error messages for invalid input types - Block ID: `3cf9298b-5817-4141-9d80-7c2cc5199c8e` - Category: `BlockCategory.BASIC` (consistent with other list manipulation blocks) - Added comprehensive test suite in `autogpt_platform/backend/test/blocks/test_concatenate_lists.py` - Tests using built-in `test_input`/`test_output` validation - Manual test cases covering edge cases (empty lists, single list, empty input) - Error handling tests for invalid input types - Category consistency verification - All tests passing - Implementation details: - Uses `extend()` method for efficient list concatenation - Preserves element order from all input lists - Runtime type validation: Explicitly checks `isinstance(lst, list)` before calling `extend()` to prevent: - Strings being iterated character-by-character (e.g., `extend("abc")` → `['a', 'b', 'c']`) - Non-iterable types causing `TypeError` (e.g., `extend(1)`) - Clear error messages indicating which index has invalid input - Handles edge cases: empty lists, empty input, single list, None values - Follows existing block 
patterns and conventions ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run `poetry run pytest test/blocks/test_concatenate_lists.py -v` - all tests pass - [x] Verified block can be imported and instantiated - [x] Tested with built-in test cases (4 test scenarios) - [x] Tested manual edge cases (empty lists, single list, empty input) - [x] Tested error handling for invalid input types - [x] Verified category is `BASIC` for consistency - [x] Verified no linting errors - [x] Confirmed block follows same patterns as other blocks in `data_manipulation.py` #### Code Quality: - [x] Code follows existing patterns and conventions - [x] Type hints are properly used - [x] Documentation strings are clear and descriptive - [x] Runtime type validation implemented - [x] Error handling with clear error messages - [x] No linting errors - [x] Prisma client generated successfully ### Testing Test Results: ``` test/blocks/test_concatenate_lists.py::test_concatenate_lists_block_builtin_tests PASSED test/blocks/test_concatenate_lists.py::test_concatenate_lists_manual PASSED ============================== 2 passed in 8.35s ============================== ``` Test Coverage: - Basic concatenation: `[[1, 2, 3], [4, 5, 6]]` → `[1, 2, 3, 4, 5, 6]` - Mixed types: `[["a", "b"], ["c"], ["d", "e", "f"]]` → `["a", "b", "c", "d", "e", "f"]` - Empty list handling: `[[1, 2], []]` → `[1, 2]` - Empty input: `[]` → `[]` - Single list: `[[1, 2, 3]]` → `[1, 2, 3]` - Error handling: Invalid input types (strings, non-lists) produce clear error messages - Category verification: Confirmed `BlockCategory.BASIC` for consistency ### Review Feedback Addressed - Category Consistency: Changed from `BlockCategory.DATA` to `BlockCategory.BASIC` to match other list manipulation blocks (`AddToListBlock`, `FindInListBlock`, etc.) 
- Type Robustness: Added explicit runtime validation with `isinstance(lst, list)` check before calling `extend()` to prevent: - Strings being iterated character-by-character - Non-iterable types causing `TypeError` - Error Handling: Added `error` output field with clear, descriptive error messages indicating which index has invalid input - Test Coverage: Added test case for error handling with invalid input types ### Related Issues - Addresses: "Implement block to concatenate lists" (good first issue, platform/blocks, hacktoberfest) ### Notes - This is a straightforward data manipulation block that doesn't require external dependencies - The block will be automatically discovered by the block loading system - No database or configuration changes required - Compatible with existing workflow system - All review feedback has been addressed and incorporated <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Adds a new list utility and updates docs. > > - New block: `ConcatenateListsBlock` in `backend/blocks/data_manipulation.py` > - Input `lists: List[List[Any]]`; outputs `concatenated_list` or `error` > - Skips `None` entries; emits error for non-list items; preserves order > - Docs: Adds "Concatenate Lists" section to `docs/integrations/basic.md` and links it in `docs/integrations/README.md` > - Contributor guide: New `docs/CLAUDE.md` with manual doc section guidelines > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `4f56dd86c2`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-20 18:04:12 +00:00
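The concatenation logic described for `ConcatenateListsBlock` (use `extend()`, check `isinstance(lst, list)` first, skip `None` entries, report the offending index) can be sketched as a plain function. This is not the block's actual code: `concatenate_lists` is a hypothetical helper, and it raises `TypeError` where the real block is described as emitting an `error` output.

```python
from typing import Any, Optional


def concatenate_lists(lists: list[Optional[list[Any]]]) -> list[Any]:
    """Concatenate a list of lists in order, validating each entry."""
    result: list[Any] = []
    for index, lst in enumerate(lists):
        if lst is None:
            continue  # None entries are skipped, per the PR summary
        if not isinstance(lst, list):
            # Without this check, extend("abc") would add 'a', 'b', 'c'
            # and extend(1) would raise an unhelpful TypeError.
            raise TypeError(
                f"Invalid input at index {index}: expected a list, "
                f"got {type(lst).__name__}"
            )
        result.extend(lst)
    return result
```

The cases listed under Test Coverage in the PR description map directly onto calls to this function, for example `concatenate_lists([[1, 2], []])` returning `[1, 2]`.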
Nicholas Tindle	fa0b7029dd	fix(platform): make chat credentials type selection deterministic (#11795) ## Background When using chat to run blocks/agents that support multiple credential types (e.g., GitHub blocks support both `api_key` and `oauth2`), users reported that the credentials setup UI would show either "Add API key" or "Connect account (OAuth)", seemingly at random between requests or server restarts. ## Root Cause The bug was in how the backend selected which credential type to return when building the missing credentials response: ```python cred_type = next(iter(field_info.supported_types), "api_key") ``` The problem is that `supported_types` is a frozenset. When you call `iter()` on a frozenset and take `next()`, the iteration order is non-deterministic due to Python's hash randomization. This means: - `frozenset({'api_key', 'oauth2'})` could iterate as either `['api_key', 'oauth2']` or `['oauth2', 'api_key']` - The order varies between Python process restarts and sometimes between requests - This caused the UI to randomly show different credential options ### Changes 🏗️ Backend (`utils.py`, `run_block.py`, `run_agent.py`): - Added `_serialize_missing_credential()` helper that uses `sorted()` for deterministic ordering - Added `build_missing_credentials_from_graph()` and `build_missing_credentials_from_field_info()` utilities - Now returns both `type` (first sorted type, for backwards compat) and `types` (full array with ALL supported types) Frontend (`helpers.ts`, `ChatCredentialsSetup.tsx`, `useChatMessage.ts`): - Updated to read the `types` array from backend response - Changed `credentialType` (single) to `credentialTypes` (array) throughout the chat credentials flow - Passes all supported types to `CredentialsInput` via `credentials_types` schema field ### Result Now `useCredentials.ts` correctly sets both `supportsApiKey=true` AND `supportsOAuth2=true` when both are supported, ensuring: 1. 
Deterministic behavior - no more random type selection 2. All saved credentials shown - credentials of any supported type appear in the selection list ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified GitHub block shows consistent credential options across page reloads - [x] Verified both OAuth and API key credentials appear in selection when user has both saved - [x] Verified backend returns `types: ["api_key", "oauth2"]` array (checked via Python REPL) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Ensures deterministic credential type selection and surfaces all supported types end-to-end. > > - Backend: add `_serialize_missing_credential`, `build_missing_credentials_from_graph/field_info`; `run_agent`/`run_block` now return missing credentials with stable ordering and both `type` (first) and `types` (all). > - Frontend: chat helpers and UI (`helpers.ts`, `ChatCredentialsSetup.tsx`, `useChatMessage.ts`) now read `types`, switch from single `credentialType` to `credentialTypes`, and pass all supported `credentials_types` in schemas. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `7d80f4f0e0`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>	2026-01-20 16:19:57 +00:00
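The root cause and fix in the entry above can be sketched as follows. The helper name `serialize_missing_credential` and the `id` key are assumptions loosely modelled on the description; only the use of `sorted()` and the `type`/`types` pair come from the commit message.

```python
from typing import Any, Dict, FrozenSet


def serialize_missing_credential(
    field_key: str, supported_types: FrozenSet[str]
) -> Dict[str, Any]:
    """Hypothetical helper: build a stable payload from an unordered set."""
    # sorted() gives the same order in every process, unlike iterating the
    # frozenset directly, whose order depends on string hashing.
    types = sorted(supported_types)
    return {
        "id": field_key,  # assumed key name, for illustration only
        # First sorted type, kept for backwards compatibility; falls back
        # to "api_key" like the original next(iter(...), "api_key") call.
        "type": types[0] if types else "api_key",
        "types": types,
    }


if __name__ == "__main__":
    print(serialize_missing_credential("github", frozenset({"oauth2", "api_key"})))
```

Sorting alphabetically is an arbitrary but stable choice; any total order would remove the randomness, and the full `types` list lets the frontend decide which options to show.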
Abhimanyu Yadav	c20ca47bb0	feat(frontend): enhance RunGraph and RunInputDialog components with loading states and improved UI (#11808 ) ### Changes 🏗️ - Enhanced UI for the Run Graph button with improved loading states and animations - Added color-coded edges in the flow editor based on output data types - Improved the layout of the Run Input Dialog with a two-column grid design - Refined the styling of flow editor controls with consistent icon sizes and colors - Updated tutorial icons with better color and size customization - Fixed credential field display to show provider name with "credential" suffix - Optimized draft saving by excluding node position changes to prevent excessive saves when dragging nodes ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified that the Run Graph button shows proper loading states - [x] Confirmed that edges display correct colors based on data types - [x] Tested the Run Input Dialog layout with various input configurations - [x] Checked that flow editor controls display consistently - [x] Verified that tutorial icons render properly - [x] Confirmed credential fields show proper provider names - [x] Tested that dragging nodes doesn't trigger unnecessary draft saves	2026-01-20 15:50:23 +00:00
Abhimanyu Yadav	7756e2d12d	refactor(frontend): refactor credentials input with unified CredentialsGroupedView component (#11801 ) ### Changes 🏗️ - Refactored the credentials input handling in the RunInputDialog to use the shared CredentialsGroupedView component - Moved CredentialsGroupedView from agent library to a shared component location for reuse - Fixed source name handling in edge creation to properly handle tool source names - Improved node output UI by replacing custom expand/collapse with Accordion component - Fixed timing of hardcoded values synchronization with handle IDs to ensure proper loading - Enabled NEW_FLOW_EDITOR and BUILDER_VIEW_SWITCH feature flags by default ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified credentials input works in both agent run dialog and builder run dialog - [x] Confirmed node output accordion works correctly - [x] Tested flow editor with tools to ensure source name handling works properly - [x] Verified hardcoded values sync correctly with handle IDs #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes)	2026-01-20 12:20:25 +00:00
Swifty	bc75d70e7d	refactor(backend): Improve Langfuse tracing with v3 SDK patterns and @observe decorators (#11803 ) <!-- Clearly explain the need for these changes: --> This PR improves the Langfuse tracing implementation in the chat feature by adopting the v3 SDK patterns, resulting in cleaner code and better observability. ### Changes 🏗️ - Simplified Langfuse client usage: Replace manual client initialization with `langfuse.get_client()` global singleton - Use v3 context managers: Switch to `start_as_current_observation()` and `propagate_attributes()` for automatic trace propagation - Auto-instrument OpenAI calls: Use `langfuse.openai` wrapper for automatic LLM call tracing instead of manual generation tracking - Add `@observe` decorators: All chat tools now have `@observe(as_type="tool")` decorators for automatic tool execution tracing: - `add_understanding` - `view_agent_output` (renamed from `agent_output`) - `create_agent` - `edit_agent` - `find_agent` - `find_block` - `find_library_agent` - `get_doc_page` - `run_agent` - `run_block` - `search_docs` - Remove manual trace lifecycle: Eliminated the verbose `finally` block that manually ended traces/generations - Rename tool: `agent_output` → `view_agent_output` for clarity ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified chat feature works with Langfuse tracing enabled - [x] Confirmed traces appear correctly in Langfuse dashboard with tool spans - [x] Tested tool execution flows show up as nested observations #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) No configuration changes required - uses existing Langfuse environment variables.	2026-01-19 20:56:51 +00:00
Nicholas Tindle	c1a1767034	feat(docs): Add block documentation auto-generation system (#11707 ) - Add generate_block_docs.py script that introspects block code to generate markdown - Support manual content preservation via <!-- MANUAL: --> markers - Add migrate_block_docs.py to preserve existing manual content from git HEAD - Add CI workflow (docs-block-sync.yml) to fail if docs drift from code - Add Claude PR review workflow (docs-claude-review.yml) for doc changes - Add manual LLM enhancement workflow (docs-enhance.yml) - Add GitBook configuration (.gitbook.yaml, SUMMARY.md) - Fix non-deterministic category ordering (categories is a set) - Add comprehensive test suite (32 tests) - Generate docs for 444 blocks with 66 preserved manual sections 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> <!-- Clearly explain the need for these changes: --> ### Changes 🏗️ <!-- Concisely describe all of the changes made in this pull request: --> ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: <!-- Put your test plan here: --> - [x] Extensively test code generation for the docs pages <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Introduces an automated documentation pipeline for blocks and integrates it into CI. 
> > - Adds `scripts/generate_block_docs.py` (+ tests) to introspect blocks and generate `docs/integrations/`, preserving `<!-- MANUAL: -->` sections > - New CI workflows: docs-block-sync (fails if docs drift), docs-claude-review (AI review for block/docs PRs), and docs-enhance (optional LLM improvements) > - Updates existing Claude workflows to use `CLAUDE_CODE_OAUTH_TOKEN` instead of `ANTHROPIC_API_KEY` > - Improves numerous block descriptions/typos and links across backend blocks to standardize docs output > - Commits initial generated docs including `docs/integrations/README.md` and many provider/category pages > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `631e53e0f6`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 07:03:19 +00:00
Nicholas Tindle	1b56ff13d9	test	2026-01-18 15:32:10 -06:00
Zamil Majdy	f31c160043	feat(platform): add endedAt field and fix execution analytics timestamps (#11759 ) ## Summary This PR adds proper execution end time tracking and fixes timestamp handling throughout the execution analytics system. ### Key Changes 1. Added `endedAt` field to database schema - Executions now have a dedicated field for tracking when they finish 2. Fixed timestamp nullable handling - `started_at` and `ended_at` are now properly nullable in types 3. Fixed chart aggregation - Reduced threshold from ≥3 to ≥1 executions per day 4. Improved timestamp display - Moved timestamps to expandable details section in analytics table 5. Fixed nullable timestamp bugs - Updated all frontend code to handle null timestamps correctly ## Problem Statement ### Issue 1: Missing Execution End Times Previously, executions used `updatedAt` (last DB update) as a proxy for "end time". This broke when adding correctness scores retroactively - the end time would change to whenever the score was added, not when the execution actually finished. ### Issue 2: Chart Shows Only One Data Point The accuracy trends chart showed only one data point despite having executions across multiple days. Root cause: aggregation required ≥3 executions per day. ### Issue 3: Incorrect Type Definitions Manually maintained types defined `started_at` and `ended_at` as non-nullable `Date`, contradicting reality where QUEUED executions haven't started yet. ## Solution ### Database Schema (`schema.prisma`) ```prisma model AgentGraphExecution { // ... startedAt DateTime? endedAt DateTime? // NEW FIELD // ... 
} ``` ### Execution Lifecycle - QUEUED: `startedAt = null`, `endedAt = null` (not started) - RUNNING: `startedAt = set`, `endedAt = null` (in progress) - COMPLETED/FAILED/TERMINATED: `startedAt = set`, `endedAt = set` (finished) ### Migration Strategy ```sql -- Add endedAt column ALTER TABLE "AgentGraphExecution" ADD COLUMN "endedAt" TIMESTAMP(3); -- Backfill ONLY terminal executions (prevents marking RUNNING executions as ended) UPDATE "AgentGraphExecution" SET "endedAt" = "updatedAt" WHERE "endedAt" IS NULL AND "executionStatus" IN ('COMPLETED', 'FAILED', 'TERMINATED'); ``` ## Changes by Component ### Backend `schema.prisma` - Added `endedAt` field to `AgentGraphExecution` `execution.py` - Made `started_at` and `ended_at` optional with Field descriptions - Updated `from_db()` to use `endedAt` instead of `updatedAt` - `update_graph_execution_stats()` sets `endedAt` when status becomes terminal `execution_analytics_routes.py` - Removed `created_at`/`updated_at` from `ExecutionAnalyticsResult` (DB metadata, not execution data) - Kept only `started_at`/`ended_at` (actual execution runtime) - Made settings global (avoid recreation) - Moved OpenAI key validation to `_process_batch` (only check when LLM actually runs) `analytics.py` - Fixed aggregation: `COUNT() >= 1` (was 3) - include all days with ≥1 execution - Uses `createdAt` for chart grouping (when execution was queued) `late_execution_monitor.py`* - Handle optional `started_at` with fallback to `datetime.min` for sorting - Display "Not started" when `started_at` is null ### Frontend Type Definitions - Fixed manually maintained `types.ts`: `started_at: Date \| null` (was non-nullable) - Generated types were already correct Analytics Components - `AnalyticsResultsTable.tsx`: Show only `started_at`/`ended_at` in 2-column expandable grid - `ExecutionAnalyticsForm.tsx`: Added filter explanation UI Monitoring Components - Fixed null handling bugs: - `OldAgentLibraryView.tsx`: Handle null in reduce function - 
`agent-runs-selector-list.tsx`: Safe sorting with `?.getTime() ?? 0` - `AgentFlowList.tsx`: Filter/sort with null checks - `FlowRunsStatus.tsx`: Filter null timestamps - `FlowRunsTimeline.tsx`: Filter executions with null timestamps before rendering - `monitoring/page.tsx`: Safe sorting - `ActivityItem.tsx`: Fallback to "recently" for null timestamps ## Benefits ✅ Accurate End Times: `endedAt` is frozen when execution finishes, not updated later ✅ Type Safety: Nullable types match reality, exposing real bugs ✅ Better UX: Chart shows all days with data (not just days with ≥3 executions) ✅ Bug Fixes: 7+ frontend components now handle null timestamps correctly ✅ Documentation: Field descriptions explain when timestamps are null ## Testing ### Backend ```bash cd autogpt_platform/backend poetry run format # ✅ All checks passed poetry run lint # ✅ All checks passed ``` ### Frontend ```bash cd autogpt_platform/frontend pnpm format # ✅ All checks passed pnpm lint # ✅ All checks passed pnpm types # ✅ All type errors fixed ``` ### Test Data Generation Created script to generate 35 test executions across 7 days with correctness scores: ```bash poetry run python scripts/generate_test_analytics_data.py ``` ## Migration Notes ⚠️ Important: The migration only backfills `endedAt` for executions with terminal status (COMPLETED, FAILED, TERMINATED). Active executions (QUEUED, RUNNING) correctly keep `endedAt = null`. ## Breaking Changes None - this is backward compatible: - `endedAt` is nullable, existing code that doesn't use it is unaffected - Frontend already used generated types which were correct - Migration safely backfills historical data <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Introduces explicit execution end-time tracking and normalizes timestamp handling across backend and frontend. 
> > - Adds `endedAt` to `AgentGraphExecution` (schema + migration); backfills terminal executions; sets `endedAt` on terminal status updates > - Makes `GraphExecutionMeta.started_at/ended_at` optional; updates `from_db()` to use DB `endedAt`; exposes timestamps in `ExecutionAnalyticsResult` > - Moves OpenAI key validation into batch processing; instantiates `Settings` once > - Accuracy trends: reduce daily aggregation threshold to `>= 1`; optional historical series > - Monitoring/analytics UI: results table shows/export `started_at`/`ended_at`; adds chart filter explainer > - Frontend null-safety: update types (`Date \| null`) and fix sorting/filtering/rendering for nullable timestamps across monitoring and library views > - Late execution monitor: safe sorting/display when `started_at` is null > - OpenAPI specs updated for new/nullable fields > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `1d987ca6e5`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>	2026-01-16 21:44:24 +00:00
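The execution lifecycle described in the entry above (QUEUED, RUNNING, then a terminal status that freezes the end time) might look roughly like this in plain Python. The `Execution` dataclass, `update_status`, and `sort_key` are hypothetical names for illustration, not the code in `execution.py`; the timezone-aware `datetime.min` fallback is also an assumption, chosen so that the fallback stays comparable with aware timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Status(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    TERMINATED = "TERMINATED"


TERMINAL = {Status.COMPLETED, Status.FAILED, Status.TERMINATED}


@dataclass
class Execution:
    status: Status = Status.QUEUED
    started_at: Optional[datetime] = None  # null until the run starts
    ended_at: Optional[datetime] = None    # null until a terminal status


def update_status(
    execution: Execution, new_status: Status, now: Optional[datetime] = None
) -> Execution:
    """Set timestamps once; later updates never move ended_at."""
    now = now or datetime.now(timezone.utc)
    if new_status == Status.RUNNING and execution.started_at is None:
        execution.started_at = now
    if new_status in TERMINAL and execution.ended_at is None:
        execution.ended_at = now
    execution.status = new_status
    return execution


def sort_key(execution: Execution) -> datetime:
    """Sort not-yet-started executions first instead of failing on None."""
    return execution.started_at or datetime.min.replace(tzinfo=timezone.utc)
```

Guarding on `ended_at is None` is what keeps the end time stable when metadata such as a correctness score is written after the run has finished, which is the failure mode the `updatedAt` proxy had.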
Nicholas Tindle	06550a87eb	feat(backend): add missed default credentials (#11760 ) ### Changes 🏗️ Fixed missing default credentials and provider name mismatch in the credentials store: 1. Provider name correction (`credentials_store.py:97-103`) - Changed `provider="unreal"` → `provider="unreal_speech"` to match the existing `unreal_speech_api_key` setting and block usage - Updated title from "Use Credits for Unreal" → "Use Credits for Unreal Speech" for clarity 2. Added missing OpenWeatherMap credentials (`credentials_store.py:219-226`) - New `openweathermap_credentials` definition with `APIKeyCredentials` - Uses existing `settings.secrets.openweathermap_api_key` setting that was previously defined but had no credential object - Added to `DEFAULT_CREDENTIALS` list 3. Fixed credentials not exposed in `get_all_creds()` (`credentials_store.py:343-354`) - Added `llama_api_credentials` conditional append (was defined but not returned to users) - Added `v0_credentials` conditional append (was defined but not returned to users) - Added `openweathermap_credentials` conditional append ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified provider name `unreal_speech` matches block usage in `text_to_speech_block.py` - [x] Confirmed `openweathermap_api_key` setting exists in secrets - [x] Confirmed `llama_api_key` and `v0_api_key` settings exist in secrets <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Aligns backend credential definitions and exposes missing system creds; updates frontend to hide new built-ins. 
> > - Backend `credentials_store.py`: > - Corrects `provider` to `unreal_speech` and updates title > - Adds `openweathermap_credentials`; includes in `DEFAULT_CREDENTIALS` and `get_all_creds()` when key present > - Ensures `llama_api_credentials` and `v0_credentials` are returned by `get_all_creds()` > - Frontend `integrations/page.tsx`: > - Extends `hiddenCredentials` with IDs for `v0`, `webshare_proxy`, and `openweathermap` > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `e7d46b76c6`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>	2026-01-16 21:18:12 +00:00
Nicholas Tindle	088b9998dc	fix(frontend): Fix flaky agent-activity tests by targeting correct agent (#11790 ) This PR fixes flaky agent-activity Playwright tests that were failing intermittently in CI. Closes #11789 ### Changes 🏗️ - Navigate to specific agent by name: Replace `LibraryPage.clickFirstAgent(page)` with `LibraryPage.navigateToAgentByName(page, "Test Agent")` to ensure we're testing the correct agent rather than relying on the first agent in the list - Add retry mechanism for async data loading: Replace direct visibility check with `expect(...).toPass({ timeout: 15000 })` pattern to properly handle asynchronous agent data fetching - Increase timeout: Extended timeout from 8000ms to 15000ms to accommodate slower CI environments ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified the test file syntax is correct - [x] Changes target the correct file (`autogpt_platform/frontend/src/tests/agent-activity.spec.ts`) - [x] The retry mechanism follows Playwright best practices using `toPass()` #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes (N/A - no config changes) - [x] `docker-compose.yml` is updated or already compatible with my changes (N/A - no config changes) - [x] I have included a list of my configuration changes in the PR description (under Changes) (N/A - no config changes) --------- Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>	2026-01-16 20:33:47 +00:00
Nicholas Tindle	05c89fa5c0	feat(claude): add vercel-react-best-practices skill (#11777 )	2026-01-16 09:40:58 -07:00
Swifty	8cc8295f14	feat(backend): add agent generator tools for chat copilot (#11781 ) This PR adds the ability to create and edit agents from natural language descriptions in the chat copilot. ### Changes 🏗️ - Added `agent_generator/` module with: - LLM client for OpenAI API calls - Core generation logic for decomposing goals and generating agent JSON - Fixer module to correct common LLM generation errors - Validator to ensure generated agents are structurally valid - Prompts for goal decomposition and agent generation - Utility functions for blocks info and agent saving - Added `CreateAgentTool` - creates new agents from natural language descriptions - Added `EditAgentTool` - edits existing agents using natural language patches - Added response models: `AgentPreviewResponse`, `AgentSavedResponse`, `ClarificationNeededResponse` - Registered new tools in the tools registry ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run `poetry run format` to ensure code passes linting - [x] Test creating an agent via chat with a natural language description - [x] Test editing an existing agent via chat	2026-01-16 17:11:57 +01:00
Swifty	e55f05c7a8	feat(backend): add chat search tools and BM25 reranking (#11782 ) This PR adds new chat tools for searching blocks and documentation, along with BM25 reranking for improved search relevance. ### Changes 🏗️ New Chat Tools: - `find_block` - Search for available blocks by name/description using hybrid search - `run_block` - Execute a block directly with provided inputs and credentials - `search_docs` - Search documentation with section-level granularity - `get_doc_page` - Retrieve full documentation page content Search Improvements: - Added BM25 reranking to hybrid search for better lexical relevance - Documentation handler now chunks markdown by headings (##) for finer-grained embeddings - Section-based content IDs (`doc_path::section_index`) for precise doc retrieval - Startup embedding backfill in scheduler for immediate searchability Other Changes: - New response models for block and documentation search results - Updated orphan cleanup to handle section-based doc embeddings - Added `rank-bm25` dependency for BM25 scoring - Removed max message limit check in chat service ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run find_block tool to search for blocks (e.g., "current time") - [x] Run run_block tool to execute a found block - [x] Run search_docs tool to search documentation - [x] Run get_doc_page tool to retrieve full doc content - [x] Verify BM25 reranking improves search relevance for exact term matches - [x] Verify documentation sections are properly chunked and embedded #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) Dependencies added: `rank-bm25` for BM25 scoring algorithm	2026-01-16 16:18:10 +01:00
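To illustrate what BM25 reranking on top of hybrid search results can mean, here is a dependency-free sketch. The PR itself uses the `rank-bm25` package; the scoring function below is a hand-written Okapi BM25 variant, and the whitespace tokenizer, the `bm25_weight` blend, and the `(text, hybrid_score)` candidate shape are all assumptions rather than the backend's actual design.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def bm25_scores(
    query_tokens: Sequence[str],
    corpus_tokens: Sequence[Sequence[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> List[float]:
    """Okapi BM25 score of each document against the query."""
    n_docs = len(corpus_tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df: Counter = Counter()
    for doc in corpus_tokens:
        df.update(set(doc))
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rerank(
    query: str, candidates: List[Tuple[str, float]], bm25_weight: float = 0.5
) -> List[Tuple[str, float]]:
    """Blend each candidate's hybrid score with its normalized BM25 score."""
    tokenized = [text.lower().split() for text, _ in candidates]
    lexical = bm25_scores(query.lower().split(), tokenized)
    top = max(lexical, default=0.0)
    combined = []
    for (text, base), lex in zip(candidates, lexical):
        lex_norm = lex / top if top > 0 else 0.0
        combined.append((text, (1 - bm25_weight) * base + bm25_weight * lex_norm))
    return sorted(combined, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    hits = [("Get weather information", 0.60), ("Get current time", 0.55)]
    for text, score in rerank("current time", hits):
        print(f"{score:.3f}  {text}")
```

In this example the candidate with the exact query terms moves ahead of one with a slightly higher hybrid score, which is the "better lexical relevance for exact term matches" effect the test plan checks for.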
Swifty	4a9b13acb6	feat(frontend): extract frontend changes from hackathon/copilot branch (#11717 ) Frontend changes extracted from the hackathon/copilot branch for the copilot feature development. ### Changes 🏗️ - New Chat system with contextual components (`Chat`, `ChatDrawer`, `ChatContainer`, `ChatMessage`, etc.) - Form renderer system with RJSF v6 integration and new input renderers - Enhanced credentials management with improved OAuth flow and credential selection - New output renderers for various content types (Code, Image, JSON, Markdown, Text, Video) - Scrollable tabs component for better UI organization - Marketplace update notifications and publishing workflow improvements - Draft recovery feature with IndexedDB persistence - Safe mode toggle functionality - Various UI/UX improvements across the platform ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [ ] Test new Chat components functionality - [ ] Verify form renderer with various input types - [ ] Test credential management flows - [ ] Verify output renderers display correctly - [ ] Test draft recovery feature #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) --------- Co-authored-by: Lluis Agusti <hi@llu.lu>	2026-01-16 22:15:39 +07:00


7784 Commits