AutoGPT

mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-04-30 03:00:41 -04:00

Author	SHA1	Message	Date
Pratyush Singh	e14594ff4a	fix: handle oversized notifications by sending summary email (#11119 ) (#11130 ) 📨 Fix: Handle Oversized Notification Emails Summary This PR adds logic to detect and handle oversized notification emails exceeding Postmark’s 5 MB limit. Instead of retrying indefinitely, the system now sends a lightweight summary email with key stats and a dashboard link. Changes Added size check in EmailSender.send_templated() Sends summary email when payload > ~4.5 MB Prevents infinite retries and queue clogging Added logs for oversized detection Fixes #11119 --------- Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>	2025-10-29 00:57:13 +00:00
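The size check described in the commit above can be sketched roughly as follows. This is an illustration only, not the code from the PR: `choose_email_body` and `MAX_EMAIL_BYTES` are hypothetical names, and only the ~4.5 MB threshold is taken from the commit message.

```python
# Hypothetical sketch; not the actual EmailSender.send_templated() implementation.
MAX_EMAIL_BYTES = int(4.5 * 1024 * 1024)  # ~4.5 MB, kept below the 5 MB provider limit


def choose_email_body(full_body: str, summary_body: str,
                      limit: int = MAX_EMAIL_BYTES) -> tuple[str, bool]:
    """Return (body_to_send, was_oversized)."""
    # Measure encoded bytes, not characters: multi-byte characters count more.
    size = len(full_body.encode("utf-8"))
    if size > limit:
        # Fall back to the lightweight summary instead of retrying the send.
        return summary_body, True
    return full_body, False
```

Returning the oversized flag lets the caller log the detection, which matches the "added logs for oversized detection" item in the commit.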
Zamil Majdy	de70ede54a	fix(backend): prevent execution of deleted agents and cleanup orphaned resources (#11243 ) ## Summary Fix critical bug where deleted agents continue running scheduled and triggered executions indefinitely, consuming credits without user control. ## Problem When agents are deleted from user libraries, their schedules and webhook triggers remain active, leading to: - ❌ Uncontrolled resource consumption - ❌ "Unknown agent" executions that charge credits - ❌ No way for users to stop orphaned executions - ❌ Accumulation of orphaned database records ## Solution ### 1. Prevention: Library Validation Before Execution - Add `is_graph_in_user_library()` function with efficient database queries - Validate graph accessibility before all executions in `validate_and_construct_node_execution_input()` - Use specific `GraphNotInLibraryError` for clear error handling ### 2. Cleanup: Remove Schedules & Webhooks on Deletion - Enhanced `delete_library_agent()` to clean up associated schedules and webhooks - Comprehensive cleanup functions for both scheduled and triggered executions - Proper database transaction handling ### 3. Error-Based Cleanup: Handle Existing Orphaned Resources - Catch `GraphNotInLibraryError` in scheduler and webhook handlers - Automatically clean up orphaned resources when execution fails - Graceful degradation without breaking existing workflows ### 4. 
Migration: Clean Up Historical Orphans - SQL migration to remove existing orphaned schedules and webhooks - Performance index for faster cleanup queries - Proper logging and error handling ## Key Changes ### Core Library Validation ```python # backend/data/graph.py - Single source of truth async def is_graph_in_user_library(graph_id: str, user_id: str, graph_version: Optional[int] = None) -> bool: where_clause = {"userId": user_id, "agentGraphId": graph_id, "isDeleted": False, "isArchived": False} if graph_version is not None: where_clause["agentGraphVersion"] = graph_version count = await LibraryAgent.prisma().count(where=where_clause) return count > 0 ``` ### Enhanced Agent Deletion ```python # backend/server/v2/library/db.py async def delete_library_agent(library_agent_id: str, user_id: str, soft_delete: bool = True) -> None: # ... existing deletion logic ... await _cleanup_schedules_for_graph(graph_id=graph_id, user_id=user_id) await _cleanup_webhooks_for_graph(graph_id=graph_id, user_id=user_id) ``` ### Execution Prevention ```python # backend/executor/utils.py if not await gdb.is_graph_in_user_library(graph_id=graph_id, user_id=user_id, graph_version=graph.version): raise GraphNotInLibraryError(f"Graph #{graph_id} is not accessible in your library") ``` ### Error-Based Cleanup ```python # backend/executor/scheduler.py & backend/server/integrations/router.py except GraphNotInLibraryError as e: logger.warning(f"Execution blocked for deleted/archived graph {graph_id}") await _cleanup_orphaned_resources_for_graph(graph_id, user_id) ``` ## Technical Implementation ### Database Efficiency - Use `count()` instead of `find_first()` for faster queries - Add performance index: `idx_library_agent_user_graph_active` - Follow existing `prisma.is_connected()` patterns ### Error Handling Hierarchy - `GraphNotInLibraryError`: Specific exception for deleted/archived graphs - `NotAuthorizedError`: Generic authorization errors (preserved for user ID mismatches) - Clear error 
messages for better debugging ### Code Organization - Single source of truth for library validation in `backend/data/graph.py` - Import from centralized location to avoid duplication - Top-level imports following codebase conventions ## Testing & Validation ### Functional Testing - ✅ Library validation prevents execution of deleted agents - ✅ Cleanup functions remove schedules and webhooks properly - ✅ Error-based cleanup handles orphaned resources gracefully - ✅ Migration removes existing orphaned records ### Integration Testing - ✅ All existing tests pass (including `test_store_listing_graph`) - ✅ No breaking changes to existing functionality - ✅ Proper error propagation and handling ### Performance Testing - ✅ Efficient database queries with proper indexing - ✅ Minimal overhead for normal execution flows - ✅ Cleanup operations don't impact performance ## Impact ### User Experience - 🎯 Immediate: Deleted agents stop running automatically - 🎯 Ongoing: No more unexpected credit charges from orphaned executions - 🎯 Cleanup: Historical orphaned resources are removed ### System Reliability - 🔒 Security: Users can only execute agents they have access to - 🧹 Cleanup: Automatic removal of orphaned database records - 📈 Performance: Efficient validation with minimal overhead ### Developer Experience - 🎯 Clear Errors: Specific exception types for better debugging - 🔧 Maintainable: Centralized library validation logic - 📚 Documented: Comprehensive error handling patterns ## Files Modified - `backend/data/graph.py` - Library validation function - `backend/server/v2/library/db.py` - Enhanced agent deletion with cleanup - `backend/executor/utils.py` - Execution validation and prevention - `backend/executor/scheduler.py` - Error-based cleanup for schedules - `backend/server/integrations/router.py` - Error-based cleanup for webhooks - `backend/util/exceptions.py` - Specific error type for deleted graphs - 
`migrations/20251023000000_cleanup_orphaned_schedules_and_webhooks/migration.sql` - Historical cleanup ## Breaking Changes None. All changes are backward compatible and preserve existing functionality. ## Follow-up Tasks - [ ] Monitor cleanup effectiveness in production - [ ] Consider adding metrics for orphaned resource detection - [ ] Potential optimization of cleanup batch operations 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-28 23:48:35 +00:00
Reinier van der Leer	5e5f45a713	fix(backend): Fix various warnings (#11252 ) - Resolves #11251 This fixes all the warnings mentioned in #11251, reducing noise and making our logs and error alerts more useful :) ### Changes 🏗️ - Remove "Block {block_name} has multiple credential inputs" warning (not actually an issue) - Rename `json` attribute of `MainCodeExecutionResult` to `json_data`; retain serialized name through a field alias - Replace `Path(regex=...)` with `Path(pattern=...)` in `get_shared_execution` endpoint parameter config - Change Uvicorn's WebSocket module to new Sans-I/O implementation for WS server - Disable Uvicorn's WebSocket module for REST server - Remove deprecated `enable_cleanup_closed=True` argument in `CloudStorageHandler` implementation - Replace Prisma transaction timeout `int` argument with a `timedelta` value - Update Sentry SDK to latest version (v2.42.1) - Broaden filter for cleanup warnings from indirect dependency `litellm` - Fix handling of `MissingConfigError` in REST server endpoints ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - Check that the warnings are actually gone - [x] Deploy to dev environment and run a graph; check for any warnings - Test WebSocket server - [x] Run an agent in the Builder; make sure real-time execution updates still work	2025-10-28 13:18:45 +00:00
seer-by-sentry[bot]	377657f8a1	fix(backend): Extract response from LLM response dictionary (#11262 ) ### Changes 🏗️ - Modifies the LLM block to extract the actual response from the dictionary returned by the LLM, instead of yielding the entire dictionary. This addresses [AUTOGPT-SERVER-6EY](https://sentry.io/organizations/significant-gravitas/issues/6950850822/). ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: <!-- Put your test plan here: --> - [x] After applying the fix, I ran the agent that triggered the Sentry error and confirmed that it now completes successfully without errors. --------- Co-authored-by: seer-by-sentry[bot] <157164994+seer-by-sentry[bot]@users.noreply.github.com> Co-authored-by: Swifty <craigswift13@gmail.com>	2025-10-28 08:43:29 +00:00
seer-by-sentry[bot]	ff71c940c9	fix(backend): Properly encode hostname in URL validation (#11259 ) Fixes [AUTOGPT-SERVER-6KZ](https://sentry.io/organizations/significant-gravitas/issues/6976926125/). The issue was that: Redirect handling strips the URL scheme, causing subsequent requests to fail validation and hit a 404. - Ensures the hostname in the URL is properly IDNA-encoded after validation. - Reconstructs the netloc with the encoded hostname and preserves the port if it exists. This fix was generated by Seer in Sentry, triggered by Craig Swift. 👁️ Run ID: 2204774 Not quite right? [Click here to continue debugging with Seer.](https://sentry.io/organizations/significant-gravitas/issues/6976926125/?seerDrawer=true) ### Changes 🏗️ backend/util/request.py: - Fixed URL validation to properly preserve port numbers when reconstructing netloc - Ensures IDNA-encoded hostname is combined with port (if present) before URL reconstruction Test Results: - ✅ Tested request to https://www.target.com/ (original failing URL from Sentry issue) - ✅ Status: 200, Content retrieved successfully (339,846 bytes) - ✅ Port preservation verified for URLs with explicit ports ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Tested request to https://www.target.com/ (original failing URL) - [x] Verified status code 200 and successful content retrieval - [x] Verified port preservation in URL validation <details> <summary>Example test plan</summary> - [ ] Create from scratch and execute an agent with at least 3 blocks - [ ] Import an agent from file upload, and confirm it executes correctly - [ ] Upload agent to marketplace - [ ] Import an agent from marketplace and confirm it executes correctly - [ ] Edit an agent from monitor, and confirm it executes correctly </details> #### For configuration changes: - [x] `.env.default` is updated or already 
compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) Co-authored-by: seer-by-sentry[bot] <157164994+seer-by-sentry[bot]@users.noreply.github.com> Co-authored-by: Swifty <craigswift13@gmail.com>	2025-10-28 08:43:14 +00:00
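The hostname fix in the commit above (IDNA-encode the host, then rebuild the netloc with the port if one exists) can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the code in `backend/util/request.py`: `encode_hostname` is a hypothetical helper, and it deliberately ignores userinfo and IPv6 literals.

```python
from urllib.parse import urlsplit, urlunsplit


def encode_hostname(url: str) -> str:
    """Return the URL with an IDNA-encoded hostname, keeping any explicit port."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"URL has no hostname: {url!r}")
    # Python's built-in "idna" codec converts each label to its ASCII form.
    host = parts.hostname.encode("idna").decode("ascii")
    # Re-attach the port; dropping it here would silently change the target.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Plain ASCII hosts pass through unchanged, so the helper is safe to apply to every validated URL.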
Bently	9db443960a	feat(blocks/claude): Remove Claude 3.5 Sonnet and Haiku model (#11260 ) Removes CLAUDE_3_5_SONNET and CLAUDE_3_5_HAIKU from LlmModel enum, model metadata, and cost configuration since they are deprecated ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verify the models are gone from the llm blocks	2025-10-27 16:49:02 +00:00
Swifty	7cbb1ed859	fix(backend/store): Sanitize all sql terms (#11228 ) Categories and Creators were not sanitized in the full text search ### Changes 🏗️ - apply sanitization to categories and creators ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] run tests to check it still works	2025-10-27 12:59:05 +01:00
Reinier van der Leer	e06e7ff33f	fix(backend): Implement graceful shutdown in `AppService` to prevent RPC errors (#11240 ) We're currently seeing errors in the `DatabaseManager` while it's shutting down, like: ``` WARNING [DatabaseManager] Termination request: SystemExit; 0 executing cleanup. INFO [DatabaseManager] ⏳ Disconnecting Database... INFO [PID-1|THREAD-29|DatabaseManager|Prisma-82fb1994-4b87-40c1-8869-fbd97bd33fc8] Releasing connection started... INFO [PID-1|THREAD-29|DatabaseManager|Prisma-82fb1994-4b87-40c1-8869-fbd97bd33fc8] Releasing connection completed successfully. INFO [DatabaseManager] Terminated. ERROR POST /create_or_add_to_user_notification_batch failed: Failed to create or add to notification batch for user {user_id} and type AGENT_RUN: NoneType: None ``` This indicates two issues: - The service doesn't wait for pending RPC calls to finish before terminating - We're using `logger.exception` outside an error handling context, causing the confusing and unhelpful `NoneType: None` to be printed instead of error info ### Changes 🏗️ - Implement graceful shutdown in `AppService` so in-flight RPC calls can finish - Add tests for graceful shutdown - Prevent `AppService` accepting new requests during shutdown - Rework `AppService` lifecycle management; add support for async `lifespan` - Fix `AppService` endpoint error logging - Improve logging in `AppProcess` and `AppService` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - Deploy to Dev cluster, then `kubectl rollout restart` the different services a few times - [x] -> `DatabaseManager` doesn't break on re-deployment - [x] -> `Scheduler` doesn't break on re-deployment - [x] -> `NotificationManager` doesn't break on re-deployment	2025-10-25 14:47:19 +00:00
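The two shutdown behaviours named in the commit above (let in-flight calls finish, reject new ones) can be sketched with `asyncio`. This is a generic illustration, not the `AppService` implementation: `GracefulService` and its methods are hypothetical names.

```python
import asyncio


class GracefulService:
    """Hypothetical stand-in for a service that drains in-flight calls on shutdown."""

    def __init__(self) -> None:
        self._in_flight = 0
        self._shutting_down = False
        self._idle = asyncio.Event()
        self._idle.set()  # nothing is running yet

    async def handle(self, work):
        """Run one request; `work` is a zero-argument coroutine function."""
        if self._shutting_down:
            raise RuntimeError("service is shutting down")
        self._in_flight += 1
        self._idle.clear()
        try:
            return await work()
        finally:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._idle.set()

    async def shutdown(self, timeout: float = 5.0) -> None:
        """Stop accepting new requests, then wait for pending ones to finish."""
        self._shutting_down = True
        await asyncio.wait_for(self._idle.wait(), timeout)
```

The timeout is a design choice in this sketch: without it, a single hung request would block termination indefinitely.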
Bently	48ff225837	feat(blocks/revid): Add cost configs for revid video blocks (#11242 ) Updated block costs in `backend/backend/data/block_cost_config.py`: - AIShortformVideoCreatorBlock: Updated from 50 credits to 307 - AIAdMakerVideoCreatorBlock: Added cost of 714 credits - AIScreenshotToVideoAdBlock: Added cost of 612 credits ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verify AIShortformVideoCreatorBlock costs 307 credits when executed - [x] Verify AIAdMakerVideoCreatorBlock costs 714 credits when executed - [x] Verify AIScreenshotToVideoAdBlock costs 612 credits when executed	2025-10-23 09:46:22 +00:00
Bently	a6a2f71458	Merge commit from fork * Replace urllib with Requests in RSS block to prevent SSRF * Format	2025-10-22 14:18:34 +01:00
Bently	788b861bb7	Merge commit from fork	2025-10-22 14:17:26 +01:00
Zamil Majdy	bb0b45d7f7	fix(backend): Make Jinja Error on TextFormatter as value error (#11236 ) <!-- Clearly explain the need for these changes: --> This PR converts Jinja2 TemplateError exceptions to ValueError in the TextFormatter class to ensure proper error handling and HTTP status code responses (400 instead of 500). ### Changes 🏗️ <!-- Concisely describe all of the changes made in this pull request: --> - Added import for `jinja2.exceptions.TemplateError` in `backend/util/text.py:6` - Wrapped template rendering in try-catch block in `format_string` method (`backend/util/text.py:105-109`) - Convert `TemplateError` to `ValueError` to ensure proper 400 HTTP status code for client errors - Added warning logging for template rendering errors before re-raising as ValueError ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: <!-- Put your test plan: --> - [x] Verified that invalid Jinja2 templates now raise ValueError instead of TemplateError - [x] Confirmed that valid templates continue to work correctly - [x] Checked that warning logs are generated for template errors - [x] Validated that the exception chain is preserved with `from e` #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes)	2025-10-22 09:38:02 +00:00
Reinier van der Leer	04df981115	fix(backend): Fix structured logging for cloud environments (#11227 ) - Resolves #11226 ### Changes 🏗️ - Drop use of `CloudLoggingHandler` which docs state isn't for use in GKE - For cloud logging, output only structured log entries to `stdout` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Test deploy to dev and check logs	2025-10-21 12:48:41 +00:00
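Writing only structured entries to `stdout`, as the commit above describes, can be sketched with the standard `logging` module. This is not the project's logging setup: `JsonFormatter` and `build_logger` are hypothetical names, and the `severity`/`message` field names are an assumption about what the cloud log collector parses.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "severity": record.levelname,    # assumed field name
            "message": record.getMessage(),  # applies %-style arguments
            "logger": record.name,
        })


def build_logger(name: str) -> logging.Logger:
    """Create a logger whose only handler writes JSON lines to stdout."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate plain-text output via the root logger
    return logger
```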
Swifty	d25997b4f2	Revert "Merge branch 'swiftyos/secrt-1709-store-provider-names-and-en… (#11225 ) Changes to providers blocks to store in db ### Changes 🏗️ - revert change ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] I have reverted the merge	2025-10-21 09:12:00 +00:00
Zamil Majdy	11d55f6055	fix(backend/executor): Avoid running direct query in executor (#11224 ) ## Summary - Fixes database connection warnings in executor logs: "Client is not connected to the query engine, you must call `connect()` before attempting to query data" - Implements resilient database client pattern already used elsewhere in the codebase - Adds caching to reduce database load for user context lookups ## Changes - Updated `get_user_context()` to check `prisma.is_connected()` and fall back to database manager client - Added `@cached(maxsize=1000, ttl_seconds=3600)` decorator for performance optimization - Updated database manager to expose `get_user_by_id` method ## Test plan - [x] Verify executor pods no longer show Prisma connection warnings - [x] Confirm user timezone is still correctly retrieved - [x] Test fallback behavior when Prisma is disconnected 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-21 08:46:40 +00:00
Reinier van der Leer	3da595f599	fix(backend): Only try to initialize LaunchDarkly once (#11222 ) We currently try to re-init the LaunchDarkly client every time a feature flag is checked. This causes 5 second extra latency on the flag check when LD is down, such as now. Since flag checks are performed on every block execution, this currently cripples the platform's executors. - Follow-up to #11221 ### Changes 🏗️ - Only try to init LaunchDarkly once - Improve surrounding log statements in the `feature_flag` module ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - This is a critical hotfix; we'll see its effect once deployed	2025-10-21 08:46:07 +02:00
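The "only try to init once" behaviour from the commit above can be sketched as a guarded one-shot initializer. This is a generic pattern, not the `feature_flag` module: `get_client` and `factory` are hypothetical names, and no LaunchDarkly SDK calls are shown.

```python
import threading

_init_lock = threading.Lock()
_init_attempted = False
_client = None


def get_client(factory):
    """Build the client at most once; return None if that single attempt failed."""
    global _init_attempted, _client
    with _init_lock:
        if not _init_attempted:
            # Record the attempt before calling out, so a failure is not retried
            # (and its timeout not paid again) on every later flag check.
            _init_attempted = True
            try:
                _client = factory()
            except Exception:
                _client = None
    return _client
```

Callers treat `None` as "flags unavailable" and fall back to default values, so an outage costs one slow call instead of one per block execution.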
Reinier van der Leer	e5e60921a3	fix(backend): Handle LaunchDarkly init failure (#11221 ) LaunchDarkly is currently down and it's keeping our executor pods from spinning up. ### Changes 🏗️ - Wrap `LaunchDarklyIntegration` init in a try/except ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - We'll see if it works once it deploys	2025-10-21 07:53:40 +02:00
Copilot	90af8f8e1a	feat(backend): Add language fallback for YouTube transcription block (#11057 ) ## Problem The YouTube transcription block would fail when attempting to transcribe videos that only had transcripts available in non-English languages. Even when usable transcripts existed in other languages, the block would raise a `NoTranscriptFound` error because it only requested English transcripts. Example video that would fail: https://www.youtube.com/watch?v=3AMl5d2NKpQ (only has Hungarian transcripts) Error message: ``` Could not retrieve a transcript for the video https://www.youtube.com/watch?v=3AMl5d2NKpQ! No transcripts were found for any of the requested language codes: ('en',) For this video (3AMl5d2NKpQ) transcripts are available in the following languages: (GENERATED) - hu ("Hungarian (auto-generated)") ``` ## Solution Implemented intelligent language fallback in the `TranscribeYoutubeVideoBlock.get_transcript()` method: 1. First, tries to fetch English transcript (maintains backward compatibility) 2. If English unavailable, lists all available transcripts and selects the first one using this priority: - Manually created transcripts (any language) - Auto-generated transcripts (any language) 3. 
Only fails if no transcripts exist at all Example behavior: ```python # Before: Video with only Hungarian transcript get_transcript("3AMl5d2NKpQ") # ❌ Raises NoTranscriptFound # After: Video with only Hungarian transcript get_transcript("3AMl5d2NKpQ") # ✅ Returns Hungarian transcript ``` ## Changes - Modified `backend/blocks/youtube.py`: Added try-catch logic to fallback to any available language when English is not found - Added `test/blocks/test_youtube.py`: Comprehensive test suite covering URL extraction, language fallback, transcript preferences, and error handling (7 tests) - Updated `docs/content/platform/blocks/youtube.md`: Documented the language fallback behavior and transcript priority order ## Testing - ✅ All 7 new unit tests pass - ✅ Block integration test passes - ✅ Full test suite: 621 passed, 0 failed (no regressions) - ✅ Code formatting and linting pass ## Impact This fix enables the YouTube transcription block to work with international content while maintaining full backward compatibility: - ✅ Videos in any language can now be transcribed - ✅ English is still preferred when available - ✅ No breaking changes to existing functionality - ✅ Graceful degradation to available languages Fixes #10637 Fixes https://linear.app/autogpt/issue/OPEN-2626 --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: ntindle <8845353+ntindle@users.noreply.github.com> Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>	2025-10-21 02:31:33 +00:00
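The fallback order described in the commit above (English first, then any manually created transcript, then any auto-generated one) can be sketched as a pure selection function. This is an illustration, not the code in `backend/blocks/youtube.py`: `TranscriptInfo` and `pick_transcript` are hypothetical names and no transcript library is called.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TranscriptInfo:
    language_code: str
    is_generated: bool  # True for auto-generated captions


def pick_transcript(available: Sequence[TranscriptInfo],
                    preferred: str = "en") -> Optional[TranscriptInfo]:
    """Choose a transcript using the priority order from the commit message."""
    # 1. The preferred language keeps the previous behaviour.
    for t in available:
        if t.language_code == preferred:
            return t
    # 2. Otherwise the first manually created transcript, in any language.
    for t in available:
        if not t.is_generated:
            return t
    # 3. Otherwise the first auto-generated one; None means nothing exists at all.
    return available[0] if available else None
```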
Nicholas Tindle	eba67e0a4b	fix(platform/blocks): update linear oauth to use refresh tokens (#10998 ) <!-- Clearly explain the need for these changes: --> ### Need 💡 This PR addresses Linear issue SECRT-1665, which mandates an update to Linear's OAuth2 implementation. Linear is transitioning from long-lived access tokens to short-lived access tokens with refresh tokens, with a deadline of April 1, 2026. This change is crucial to ensure continued integration with Linear and to support their new token management system, including a migration path for existing long-lived tokens. ### Changes 🏗️ - `autogpt_platform/backend/backend/blocks/linear/_oauth.py`: - Implemented full support for refresh tokens, including HTTP Basic Authentication for token refresh requests. - Added `migrate_old_token()` method to exchange old long-lived access tokens for new short-lived tokens with refresh tokens using Linear's `/oauth/migrate_old_token` endpoint. - Enhanced `get_access_token()` to automatically detect and attempt migration for old tokens, and to refresh short-lived tokens when they expire. - Improved error handling and token expiration management. - Updated `_request_tokens` to handle both authorization code and refresh token flows, supporting Linear's recommended authentication methods. - `autogpt_platform/backend/backend/blocks/linear/_config.py`: - Updated `TEST_CREDENTIALS_OAUTH` mock data to include realistic `access_token_expires_at` and `refresh_token` for testing the new token lifecycle. - `LINEAR_OAUTH_IMPLEMENTATION.md`: - Added documentation detailing the new Linear OAuth refresh token implementation, including technical details, migration strategy, and testing notes. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified OAuth URL generation and parameter encoding. 
- [x] Confirmed HTTP Basic Authentication header creation for refresh requests. - [x] Tested token expiration logic with a 5-minute buffer. - [x] Validated migration detection for old vs. new token types. - [x] Checked code syntax and import compatibility. #### For configuration changes: - [ ] `.env.default` is updated or already compatible with my changes - [ ] `docker-compose.yml` is updated or already compatible with my changes - [ ] I have included a list of my configuration changes in the PR description (under Changes) --- Linear Issue: [SECRT-1665](https://linear.app/autogpt/issue/SECRT-1665) --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com> Co-authored-by: Bentlybro <Github@bentlybro.com>	2025-10-20 20:44:58 +00:00
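Two items from the test plan above, the expiry check with a 5-minute buffer and the HTTP Basic Authentication header for refresh requests, can be sketched with the standard library. This is not the code in `_oauth.py`: `needs_refresh` and `basic_auth_header` are hypothetical helpers, and expiry is assumed to be a Unix timestamp in seconds.

```python
import base64
import time
from typing import Optional

EXPIRY_BUFFER_SECONDS = 5 * 60  # refresh slightly early to avoid mid-request expiry


def needs_refresh(expires_at: int, now: Optional[float] = None) -> bool:
    """True when the token expires within the buffer window (or already has)."""
    current = time.time() if now is None else now
    return expires_at - current <= EXPIRY_BUFFER_SECONDS


def basic_auth_header(client_id: str, client_secret: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header from client credentials."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```

Passing `now` explicitly keeps the expiry check deterministic in tests.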
Swifty	3988057032	Merge branch 'swiftyos/secrt-1712-remove-error-handling-form-store-routes' into dev	2025-10-18 12:28:25 +02:00
Swifty	a6c6e48f00	Merge branch 'swiftyos/open-2791-featplatform-add-easy-test-data-creation' into dev	2025-10-18 12:28:17 +02:00
Swifty	e72ce2f9e7	Merge branch 'swiftyos/secrt-1709-store-provider-names-and-env-vars-in-db' into dev	2025-10-18 12:27:58 +02:00
Swifty	bd7a79a920	Merge branch 'swiftyos/secrt-1706-improve-store-search' into dev	2025-10-18 12:27:31 +02:00
Swifty	d9035a233c	Merge branch 'swiftyos/secrt-1709-store-provider-names-and-env-vars-in-db' of github.com:Significant-Gravitas/AutoGPT into swiftyos/secrt-1709-store-provider-names-and-env-vars-in-db	2025-10-17 17:20:27 +02:00
Swifty	972cbfc3de	fix tests	2025-10-17 17:20:05 +02:00
Swifty	8f861b1bb2	removed error handling from routes	2025-10-17 17:08:17 +02:00
Swifty	fa2731bb8b	Merge branch 'dev' into swiftyos/secrt-1709-store-provider-names-and-env-vars-in-db	2025-10-17 17:06:09 +02:00
Swifty	2dc0c97a52	Add block registry and updated	2025-10-17 16:49:04 +02:00
Zamil Majdy	0bb2b87c32	fix(backend): resolve UserBalance migration issues and credit spending bug (#11192 ) ## Summary Fix critical UserBalance migration and spending issues affecting users with credits from transaction history but no UserBalance records. ## Root Issues Fixed ### Issue 1: UserBalance Migration Complexity - Problem: Complex data migration with timestamp logic issues and potential race conditions - Solution: Simplified to idempotent table creation only, application handles auto-population ### Issue 2: Credit Spending Bug - Problem: Users with $10.0 from transaction history couldn't spend $0.16 - Root Cause: `_add_transaction` and `_enable_transaction` only checked UserBalance table, returning 0 balance for users without records - Solution: Enhanced both methods with transaction history fallback logic ### Issue 3: Exception Handling Inconsistency - Problem: Raw SQL unique violations raised different exception types than Prisma ORM - Solution: Convert raw SQL unique violations to `UniqueViolationError` at source ## Changes Made ### Migration Cleanup - Idempotent operations: Use `CREATE TABLE IF NOT EXISTS`, `CREATE INDEX IF NOT EXISTS` - Inline foreign key: Define constraint within `CREATE TABLE` instead of separate `ALTER TABLE` - Removed data migration: Application creates UserBalance records on-demand - Safe to re-run: No errors if table/index/constraint already exists ### Credit Logic Fixes - Enhanced `_add_transaction`: Added transaction history fallback in `user_balance_lock` CTE - Enhanced `_enable_transaction`: Added same fallback logic for payment fulfillment - Exception normalization: Convert raw SQL unique violations to `UniqueViolationError` - Simplified `onboarding_reward`: Use standardized `UniqueViolationError` catching ### SQL Fallback Pattern ```sql COALESCE( (SELECT balance FROM UserBalance WHERE userId = ? 
FOR UPDATE), -- Fallback: compute from transaction history if UserBalance doesn't exist (SELECT COALESCE(ct.runningBalance, 0) FROM CreditTransaction ct WHERE ct.userId = ? AND ct.isActive = true AND ct.runningBalance IS NOT NULL ORDER BY ct.createdAt DESC LIMIT 1), 0 ) as balance ``` ## Impact ### Before - ❌ Users with transaction history but no UserBalance couldn't spend credits - ❌ Migration had complex timestamp logic with potential bugs - ❌ Raw SQL and Prisma exceptions handled differently - ❌ Error: "Insufficient balance of $10.0, where this will cost $0.16" ### After - ✅ Seamless spending for all users regardless of UserBalance record existence - ✅ Simple, idempotent migration that's safe to re-run - ✅ Consistent exception handling across all credit operations - ✅ Automatic UserBalance record creation during first transaction - ✅ Backward compatible - existing users unaffected ## Business Value - Eliminates user frustration: Users can spend their credits immediately - Smooth migration path: From old User.balance to new UserBalance table - Better reliability: Atomic operations with proper error handling - Maintainable code: Consistent patterns across credit operations ## Test Plan - [ ] Manual testing with users who have transaction history but no UserBalance records - [ ] Verify migration can be run multiple times safely - [ ] Test spending credits works for all user scenarios - [ ] Verify payment fulfillment (`_enable_transaction`) works correctly - [ ] Add comprehensive test coverage for this scenario 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-17 19:46:13 +07:00
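The fallback pattern in the commit above can be tried out in isolation. The following is a minimal sketch, not the project's query: it assumes a simplified, hypothetical schema, uses SQLite as a stand-in for PostgreSQL, and omits `FOR UPDATE` because SQLite has no row locks. The helper names `make_balance_db` and `current_balance` are illustrative only.

```python
import sqlite3


def make_balance_db():
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE UserBalance (userId TEXT PRIMARY KEY, balance INTEGER NOT NULL);
        CREATE TABLE CreditTransaction (
            userId TEXT, runningBalance INTEGER, isActive INTEGER, createdAt INTEGER
        );
        """
    )
    return conn


def current_balance(conn, user_id):
    """Prefer the UserBalance row; otherwise fall back to the newest active
    transaction's running balance; otherwise report 0."""
    row = conn.execute(
        """
        SELECT COALESCE(
            (SELECT balance FROM UserBalance WHERE userId = ?),
            (SELECT ct.runningBalance FROM CreditTransaction ct
              WHERE ct.userId = ? AND ct.isActive = 1
                AND ct.runningBalance IS NOT NULL
              ORDER BY ct.createdAt DESC LIMIT 1),
            0
        )
        """,
        (user_id, user_id),
    ).fetchone()
    return row[0]


conn = make_balance_db()
# A user with transaction history but no UserBalance row.
conn.execute("INSERT INTO CreditTransaction VALUES ('u1', 1000, 1, 1)")
print(current_balance(conn, "u1"))
```

A scalar subquery that matches no rows evaluates to NULL, which is what lets `COALESCE` move on to the next source.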
Swifty	a1d9b45238	updated openapi spec	2025-10-17 14:01:37 +02:00
Swifty	29895c290f	store providers in db	2025-10-17 13:34:35 +02:00
Zamil Majdy	73c0b6899a	fix(backend): Remove advisory locks for atomic credit operations (#11143 ) ## Problem High QPS failures on `spend_credits` operations due to lock contention from `pg_advisory_xact_lock` causing serialization and seconds of wait time. ## Solution Replace PostgreSQL advisory locks with atomic database operations using CTEs (Common Table Expressions). ### Key Changes - Add persistent balance column to User table for O(1) balance lookups - Atomic CTE-based operations for all credit transactions using UPDATE...RETURNING pattern - Comprehensive concurrency tests with 7 test scenarios including stress testing - Remove all advisory lock usage from the credit system ### Implementation Details 1. Migration: Adds balance column with backfill from transaction history 2. Atomic Operations: All credit operations now use single atomic CTEs that update balance and create transaction in one query 3. Race Condition Prevention: WHERE clauses in UPDATE statements ensure balance never goes negative 4. 
BetaUserCredit Compatibility: Preserved monthly refill logic with updated `_add_transaction` signature ### Performance Impact - ✅ Eliminated lock contention bottlenecks - ✅ O(1) balance lookups instead of O(n) transaction aggregation - ✅ Atomic operations prevent race conditions without locks - ✅ Supports high QPS without serialization delays ### Testing - All existing tests pass - New concurrency test suite (`credit_concurrency_test.py`) with: - Concurrent spends from same user - Insufficient balance handling - Mixed operations (spends, top-ups, balance checks) - Race condition prevention - Integer overflow protection - Stress testing with 100 concurrent operations ### Breaking Changes None - all existing APIs maintain compatibility 🤖 Generated with [Claude Code](https://claude.ai/code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Enhanced top‑up flows with top‑up types, clearer credit→dollar formatting, and idempotent onboarding rewards. * Bug Fixes * Fixed race conditions for concurrent spends/top‑ups, added integer‑overflow and underflow protection, stronger input validation, and improved refund/dispute handling. * Refactor * Persisted per‑user balance with atomic updates for reliable balances; admin history now prefetches balances. * Tests * Added extensive concurrency, refund, ceiling/underflow and migration test suites. * Chores * Database migration to add persisted user balance; APIKey status extended (SUSPENDED). <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Swifty <craigswift13@gmail.com>	2025-10-17 17:05:05 +07:00
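The "WHERE clause keeps the balance non-negative" idea from the commit above can be illustrated without advisory locks. This is a sketch under stated assumptions: it swaps the single-CTE `UPDATE...RETURNING` query for a conditional `UPDATE` plus an `INSERT` inside one SQLite transaction, and the `Account` and `Ledger` tables and the function names are hypothetical.

```python
import sqlite3


def make_ledger_db():
    """In-memory database with hypothetical Account and Ledger tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Account (userId TEXT PRIMARY KEY, balance INTEGER NOT NULL);
        CREATE TABLE Ledger (userId TEXT, amount INTEGER);
        """
    )
    return conn


def spend_credits(conn, user_id, cost):
    """Deduct `cost` only if the balance covers it. Returns True on success."""
    with conn:  # one transaction: both statements commit or roll back together
        cur = conn.execute(
            "UPDATE Account SET balance = balance - ? "
            "WHERE userId = ? AND balance >= ?",
            (cost, user_id, cost),
        )
        if cur.rowcount == 0:
            # No row matched: unknown user or insufficient balance.
            return False
        conn.execute(
            "INSERT INTO Ledger (userId, amount) VALUES (?, ?)",
            (user_id, -cost),
        )
    return True
```

Because the balance check and the decrement happen in the same statement, two concurrent spenders cannot both pass a stale check and drive the balance below zero.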
Zamil Majdy	dfdd632161	fix(backend/util): handle nested Pydantic models in SafeJson (#11188 ) ## Summary Fixes a critical serialization bug introduced in PR #11187 where `SafeJson` failed to serialize dictionaries containing Pydantic models, causing 500 Internal Server Errors in the executor service. ## Problem The error manifested as: ``` CRITICAL: Operation Approaching Failure Threshold: Service communication: '_call_method_async' Current attempt: 50/50 Error: HTTPServerError: HTTP 500: Server error '500 Internal Server Error' for url 'http://autogpt-database-manager.prod-agpt.svc.cluster.local:8005/create_graph_execution' ``` Root cause in `create_graph_execution` (backend/data/execution.py:656-657): ```python "credentialInputs": SafeJson(credential_inputs) if credential_inputs else Json({}) ``` Where `credential_inputs: Mapping[str, CredentialsMetaInput]` is a dict containing Pydantic models. After PR #11187's refactor, `_sanitize_value()` only converted top-level BaseModel instances to dicts, but didn't handle BaseModel instances nested inside dicts/lists/tuples. This caused Prisma's JSON serializer to fail with: ``` TypeError: Type <class 'backend.data.model.CredentialsMetaInput'> not serializable ``` ## Solution Added BaseModel handling to `_sanitize_value()` to recursively convert Pydantic models to dicts before sanitizing: ```python elif isinstance(value, BaseModel): # Convert Pydantic models to dict and recursively sanitize return _sanitize_value(value.model_dump(exclude_none=True)) ``` This ensures all nested Pydantic models are properly serialized regardless of nesting depth. 
## Changes - backend/util/json.py: Added BaseModel check to `_sanitize_value()` function - backend/util/test_json.py: Added 6 comprehensive tests covering: - Dict containing Pydantic models - Deeply nested Pydantic models - Lists of Pydantic models in dicts - The exact CredentialsMetaInput scenario - Complex mixed structures - Models with control characters ## Testing ✅ All new tests pass ✅ Verified fix resolves the production 500 error ✅ Code formatted with `poetry run format` ## Related - Fixes issues introduced in PR #11187 - Related to executor service 500 errors in production 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Bentlybro <Github@bentlybro.com> Co-authored-by: Claude <noreply@anthropic.com>	2025-10-17 09:27:09 +00:00
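The recursive conversion described in the commit above can be sketched with the standard library alone. This is not the code in `backend/util/json.py`: `sanitize_value` and `FakeModel` are hypothetical names, `FakeModel` only imitates a Pydantic model by exposing `model_dump()`, and the control-character set is assumed from the ranges listed in #11174.

```python
import re

# Assumed set: control characters except tab (\x09), newline (\x0A) and
# carriage return (\x0D), plus DEL (\x7F).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]")


class FakeModel:
    """Stand-in for a Pydantic model: anything exposing model_dump()."""

    def __init__(self, name):
        self.name = name

    def model_dump(self):
        return {"name": self.name}


def sanitize_value(value):
    """Recursively convert model-like objects to dicts and strip control
    characters from every string, at any nesting depth."""
    if isinstance(value, str):
        return _CONTROL_CHARS.sub("", value)
    if isinstance(value, dict):
        return {sanitize_value(k): sanitize_value(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        # JSON has no tuple type, so tuples come back as lists.
        return [sanitize_value(v) for v in value]
    if hasattr(value, "model_dump"):
        return sanitize_value(value.model_dump())
    return value


print(sanitize_value({"creds": [FakeModel("a\x00b")]}))
```

Checking for the model at every level, rather than only at the top, is what covers the dict-of-models case that triggered the 500 error.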
Swifty	1ed224d481	simplify test and add reset-db make command	2025-10-17 11:12:00 +02:00
Swifty	3b5d919399	fix formatting	2025-10-17 10:56:45 +02:00
Swifty	3c16de22ef	add test data creation to makefile and test it	2025-10-17 10:51:58 +02:00
Swifty	2c6d85d15e	feat(platform): Shared cache (#11150 ) ### Problem When running multiple backend pods in production, requests can be routed to different pods causing inconsistent cache states. Additionally, the current cache implementation in `autogpt_libs` doesn't support shared caching across processes, leading to data inconsistency and redundant cache misses. ### Changes 🏗️ - Moved cache implementation from autogpt_libs to backend (`/backend/backend/util/cache.py`) - Removed `/autogpt_libs/autogpt_libs/utils/cache.py` - Centralized cache utilities within the backend module - Updated all import statements across the codebase - Implemented Redis-based shared caching - Added `shared_cache` parameter to `@cached` decorator for cross-process caching - Implemented Redis connection pooling for efficient cache operations - Added support for cache key pattern matching and bulk deletion - Added TTL refresh on cache access with `refresh_ttl_on_get` option - Enhanced cache functionality - Added thundering herd protection with double-checked locking - Implemented thread-local caching with `@thread_cached` decorator - Added cache management methods: `cache_clear()`, `cache_info()`, `cache_delete()` - Added support for both sync and async functions - Updated store caching (`/backend/server/v2/store/cache.py`) - Enabled shared caching for all store-related cache functions - Set appropriate TTL values (5-15 minutes) for different cache types - Added `clear_all_caches()` function for cache invalidation - Added Redis configuration - Added Redis connection settings to backend settings - Configured dedicated connection pool for cache operations - Set up binary mode for pickle serialization ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verify Redis connection and cache operations work correctly - [x] Test shared cache across multiple 
backend instances - [x] Verify cache invalidation with `clear_all_caches()` - [x] Run cache tests: `poetry run pytest backend/backend/util/cache_test.py` - [x] Test thundering herd protection under concurrent load - [x] Verify TTL refresh functionality with `refresh_ttl_on_get=True` - [x] Test thread-local caching for request-scoped data - [x] Ensure no performance regression vs in-memory cache #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes (Redis already configured) - [x] I have included a list of my configuration changes in the PR description (under Changes) - Redis cache configuration uses existing Redis service settings (REDIS_HOST, REDIS_PORT, REDIS_PASSWORD) - No new environment variables required	2025-10-17 07:56:01 +00:00
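The "thundering herd protection with double-checked locking" mentioned in the commit above can be shown with an in-process cache. This is a simplified sketch, not the `@cached` decorator from `backend/util/cache.py`: it has no Redis backend, accepts only hashable positional arguments, and uses one lock for all keys. The `ttl_seconds` parameter name is an assumption.

```python
import threading
import time
from functools import wraps


def cached(ttl_seconds):
    """Cache results in memory for `ttl_seconds`, computing each entry once
    even when many threads miss at the same moment."""

    def decorator(fn):
        store = {}  # args tuple -> (expires_at, value)
        lock = threading.Lock()

        @wraps(fn)
        def wrapper(*args):
            entry = store.get(args)
            if entry is not None and entry[0] > time.monotonic():
                return entry[1]  # first check: fast path without the lock
            with lock:
                entry = store.get(args)
                if entry is not None and entry[0] > time.monotonic():
                    return entry[1]  # second check: another thread filled it
                value = fn(*args)
                store[args] = (time.monotonic() + ttl_seconds, value)
                return value

        wrapper.cache_clear = store.clear
        return wrapper

    return decorator
```

The second lookup inside the lock is the important part: threads that queued up behind the first caller find the fresh entry and skip the expensive call.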
Zamil Majdy	374f35874c	feat(platform): Add LaunchDarkly flag for platform payment system (#11181 ) ## Summary Implement selective rollout of payment functionality using LaunchDarkly feature flags to enable gradual deployment to pilot users. - Add `ENABLE_PLATFORM_PAYMENT` flag to control credit system behavior - Update `get_user_credit_model` to use user-specific flag evaluation - Replace hardcoded `NEXT_PUBLIC_SHOW_BILLING_PAGE` with LaunchDarkly flag - Enable payment UI components only for flagged users - Maintain backward compatibility with existing beta credit system - Default to beta monthly credits when flag is disabled - Fix tests to work with new async credit model function ## Key Changes ### Backend - Credit Model Selection: The `get_user_credit_model()` function now takes a `user_id` parameter and uses LaunchDarkly to determine which credit model to return: - Flag enabled → `UserCredit` (payment system enabled, no monthly refills) - Flag disabled → `BetaUserCredit` (current behavior with monthly refills) - Flag Integration: Added `ENABLE_PLATFORM_PAYMENT` flag and integrated LaunchDarkly evaluation throughout the credit system - API Updates: All credit-related endpoints now use the user-specific credit model instead of a global instance ### Frontend - Dynamic UI: Payment-related components (billing page, wallet refill) now show/hide based on the LaunchDarkly flag - Removed Environment Variable: Replaced `NEXT_PUBLIC_SHOW_BILLING_PAGE` with runtime flag evaluation ### Testing - Test Fixes: Updated all tests that referenced the removed global `_user_credit_model` to use proper mocking of the new async function ## Deployment Strategy This implementation enables a controlled rollout: 1. Deploy with flag disabled (default) - no behavior change for existing users 2. Enable flag for pilot/beta users via LaunchDarkly dashboard 3. Monitor usage and feedback from pilot users 4. Gradually expand to more users 5. 
Eventually enable for all users once validated ## Test Plan - [x] Unit tests pass for credit system components - [x] Payment UI components show/hide correctly based on flag - [x] Default behavior (flag disabled) maintains current functionality - [x] Flag enabled users get payment system without monthly refills - [x] Admin credit operations work correctly - [x] Backward compatibility maintained 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-17 06:11:39 +00:00
Zamil Majdy	e62a56e8ba	fix(backend/util): rewrite SafeJson to prevent Invalid \escape errors (#11187 ) ## Summary Fixes the `Invalid \escape` error occurring in `/upsert_execution_output` endpoint by completely rewriting the SafeJson implementation. ## Problem - Error: `POST /upsert_execution_output failed: Invalid \escape: line 1 column 36404 (char 36403)` - Caused by data containing literal backslash-u sequences (e.g., `\u0000` as text, not actual null characters) - Previous implementation tried to remove problematic escape sequences from JSON strings - This created invalid JSON when it removed `\\u0000` and left invalid sequences like `\w` ## Solution Completely rewrote SafeJson to work on Python data structures instead of JSON strings: 1. Direct data sanitization: Recursively walks through dicts, lists, and tuples to remove control characters directly from strings 2. No JSON string manipulation: Avoids all escape sequence parsing issues 3. More efficient: Eliminates the serialize → sanitize → deserialize cycle 4. Preserves valid content: Backslashes, paths, and literal text are correctly preserved ## Changes - Removed `POSTGRES_JSON_ESCAPES` regex (no longer needed) - Added `_sanitize_value()` helper function for recursive sanitization - Simplified `SafeJson()` to convert Pydantic models and sanitize data structures - Added `import json # noqa: F401` for backwards compatibility ## Testing - ✅ Verified fix resolves the `Invalid \escape` error - ✅ All existing SafeJson unit tests pass - ✅ Problematic data with literal escape sequences no longer causes errors - ✅ Code formatted with `poetry run format` ## Technical Details Before (JSON string approach): ```python # Serialize to JSON string json_string = dumps(data) # Remove escape sequences from string (BREAKS!) 
sanitized = regex.sub("", json_string) # Parse back (FAILS with Invalid \escape) return Json(json.loads(sanitized)) ``` After (data structure approach): ```python # Convert Pydantic to dict data = model.model_dump() if isinstance(data, BaseModel) else data # Recursively sanitize strings in data structure sanitized = _sanitize_value(data) # Return as Json (no parsing needed) return Json(sanitized) ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-17 05:56:08 +00:00
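The failure mode described in the commit above can be reproduced with the standard library. This is a hypothetical reconstruction: `NAIVE_ESCAPE` is a deliberately naive pattern chosen for illustration, not the removed `POSTGRES_JSON_ESCAPES` regex, and `roundtrip_breaks` is an illustrative helper.

```python
import json
import re

# Naive pattern (assumption): delete every backslash-u0000 sequence found in
# the serialized JSON text.
NAIVE_ESCAPE = re.compile(r"\\u0000")


def roundtrip_breaks(data):
    """Serialize, strip escapes from the JSON text, parse again.
    Returns True when the stripped text is no longer valid JSON."""
    encoded = json.dumps(data)
    stripped = NAIVE_ESCAPE.sub("", encoded)
    try:
        json.loads(stripped)
    except json.JSONDecodeError:
        return True
    return False


# The value holds a literal backslash followed by "u0000work" as plain text.
# json.dumps escapes the backslash, the regex then eats the second backslash
# together with "u0000", and a dangling "\w" is left behind.
print(roundtrip_breaks({"path": "C:\\u0000work"}))
```

Sanitizing the Python values before serialization avoids this class of bug, because no escape sequences exist yet at that stage.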
Swifty	3ed1c93ec0	Merge branch 'dev' into swiftyos/secrt-1706-improve-store-search	2025-10-16 15:10:01 +02:00
Swifty	773f545cfd	update existing rows when migration is run	2025-10-16 13:38:01 +02:00
Swifty	84ad4a9f95	updated migration and query	2025-10-16 13:06:47 +02:00
Swifty	8610118ddc	ai sucks - fixing	2025-10-16 12:14:26 +02:00
Bently	9469b9e2eb	feat(platform/backend): Add Claude Haiku 4.5 model support (#11179 ) ### Changes 🏗️ - Added Claude Haiku 4.5 model support (`claude-haiku-4-5-20251001`) - Added model to `LlmModel` enum in `autogpt_platform/backend/backend/blocks/llm.py` - Configured model metadata with 200k context window and 64k max output tokens - Set pricing to 4 credits per million tokens in `backend/data/block_cost_config.py` - Classic Forge Integration - Added `CLAUDE4_5_HAIKU_v1` to Anthropic provider in `classic/forge/forge/llm/providers/anthropic.py` - Configured with $1/1M prompt tokens and $5/1M completion tokens pricing - Enabled function call API support ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: Test Plan: - [x] Verify Claude Haiku 4.5 model appears in the LLM block model selection dropdown - [x] Test basic text generation using Claude Haiku 4.5 in an agent workflow #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) <details> <summary>Configuration changes</summary> - No environment variable changes required - No docker-compose changes needed - Model configuration is handled through existing Anthropic API integration </details> https://github.com/user-attachments/assets/bbc42c47-0e7c-4772-852e-55aa91f4d253 --------- Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Bently <Bentlybro@users.noreply.github.com>	2025-10-16 10:11:38 +00:00
Swifty	ebb4ebb025	include partial types in second place	2025-10-16 12:10:38 +02:00
Swifty	cb532e1c4d	update docker file to include partial types	2025-10-16 12:08:04 +02:00
Zamil Majdy	b7ae2c2fd2	fix(backend): move DatabaseError to backend.util.exceptions for better layer separation (#11177 ) ## Summary Move DatabaseError from store-specific exceptions to generic backend exceptions for proper layer separation, while also fixing store exception inheritance to ensure proper HTTP status codes. ## Problem 1. Poor Layer Separation: DatabaseError was defined in store-specific exceptions but represents infrastructure concerns that affect the entire backend 2. Incorrect HTTP Status Codes: Store exceptions inherited from Exception instead of ValueError, causing 500 responses for client errors 3. Reusability Issues: Other backend modules couldn't use DatabaseError for DB operations 4. Blanket Catch Issues: Store-specific catches were affecting generic database operations ## Solution ### Move DatabaseError to Generic Location - Move from backend.server.v2.store.exceptions to backend.util.exceptions - Update all 23 references in backend/server/v2/store/db.py to use new location - Remove from StoreError inheritance hierarchy ### Fix Complete Store Exception Hierarchy - MediaUploadError: Changed from Exception to ValueError inheritance (client errors → 400) - StoreError: Changed from Exception to ValueError inheritance (business logic errors → 400) - Store NotFound exceptions: Changed to inherit from NotFoundError (→ 404) - DatabaseError: Now properly inherits from Exception (infrastructure errors → 500) ## Benefits ### ✅ Proper Layer Separation - Database errors are infrastructure concerns, not store-specific business logic - Store exceptions focus on business validation and client errors - Clean separation between infrastructure and business logic layers ### ✅ Correct HTTP Status Codes - DatabaseError: 500 (server infrastructure errors) - Store NotFound errors: 404 (via existing NotFoundError handler) - Store validation errors: 400 (via existing ValueError handler) - Media upload errors: 400 (client validation errors) ### ✅ Architectural 
Improvements - DatabaseError now reusable across entire backend - Eliminates blanket catch issues affecting generic DB operations - All store exceptions use global exception handlers properly - Future store exceptions automatically get proper status codes ## Files Changed - backend/util/exceptions.py: Add DatabaseError class - backend/server/v2/store/exceptions.py: Remove DatabaseError, fix inheritance hierarchy - backend/server/v2/store/db.py: Update all DatabaseError references to new location ## Result - ✅ No more stack trace spam: Expected business logic errors handled properly - ✅ Proper HTTP semantics: 500 for infrastructure, 400/404 for client errors - ✅ Better architecture: Clean layer separation and reusable components - ✅ Fixes original issue: AgentNotFoundError now returns 404 instead of 500 This addresses the logging issue mentioned by @zamilmajdy while also implementing the architectural improvements suggested by @Pwuts.	2025-10-16 09:51:58 +00:00
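The inheritance-to-status mapping described in the commit above can be sketched as plain classes plus a lookup. This is an assumption-laden illustration: the class bodies are empty placeholders, `status_for` is a hypothetical stand-in for the global exception handlers, and whether `AgentNotFoundError` has further base classes in the real code is not stated in the commit.

```python
class NotFoundError(ValueError):
    """Generic not-found error; the commit log says it inherits from ValueError."""


class StoreError(ValueError):
    """Business-logic error in the store."""


class MediaUploadError(ValueError):
    """Client-side upload validation error."""


class AgentNotFoundError(NotFoundError):
    """A store not-found error."""


class DatabaseError(Exception):
    """Infrastructure failure, not a client error."""


def status_for(exc):
    """Map an exception to an HTTP status, most specific handler first."""
    if isinstance(exc, NotFoundError):
        return 404
    if isinstance(exc, ValueError):
        return 400
    return 500


for exc in (AgentNotFoundError(), StoreError(), DatabaseError()):
    print(type(exc).__name__, status_for(exc))
```

Because dispatch relies on `isinstance`, any future exception that subclasses one of these bases picks up the matching status without a new handler.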
Swifty	794aee25ab	add full text search	2025-10-16 11:49:36 +02:00
Zamil Majdy	12b1067017	fix(backend/store): improve store exception hierarchy for proper HTTP status codes (#11176 ) ## Summary Fix store exception hierarchy to prevent ERROR level stack trace spam for expected business logic errors and ensure proper HTTP status codes. ## Problem The original error from production logs showed AgentNotFoundError for non-existent agents like autogpt/domain-drop-catcher was: - Returning 500 status codes instead of 404 - Generating ERROR level stack traces in logs for expected not found scenarios - Bypassing global exception handlers due to improper inheritance ## Root Cause Store exceptions inherited from Exception instead of ValueError, causing them to bypass the global ValueError handler (400) and fall through to the generic Exception handler (500) with full stack traces. ## Solution Create proper exception hierarchy for ALL store-related errors by making: - MediaUploadError inherit from ValueError instead of Exception - StoreError inherit from ValueError instead of Exception - Store NotFound exceptions inherit from NotFoundError (which inherits from ValueError) ## Changes Made 1. MediaUploadError: Changed from Exception to ValueError inheritance 2. StoreError: Changed from Exception to ValueError inheritance 3. Store NotFound exceptions: Changed to inherit from NotFoundError ## Benefits - Correct HTTP status codes: Not found errors return 404, validation errors return 400 - No more 500 stack trace spam for expected business logic errors - Clean consistent error handling using existing global handlers - Future-proof: Any new store exceptions automatically get proper status codes ## Result - AgentNotFoundError for autogpt/domain-drop-catcher now returns 404 instead of 500 - InvalidFileTypeError, VirusDetectedError, etc. now return 400 instead of 500 - No ERROR level stack traces for expected client errors - Proper HTTP semantics throughout the store API	2025-10-16 04:36:49 +00:00
Zamil Majdy	ba53cb78dc	fix(backend/util): comprehensive SafeJson sanitization to prevent PostgreSQL null character errors (#11174 ) ## Summary Fix critical SafeJson function to properly sanitize JSON-encoded Unicode escape sequences that were causing PostgreSQL 22P05 errors when null characters in web scraped content were stored as "\u0000" in the database. ## Root Cause Analysis The existing SafeJson function in backend/util/json.py: 1. Only removed raw control characters (\x00-\x08, etc.) using POSTGRES_CONTROL_CHARS regex 2. Failed to handle JSON-encoded Unicode escape sequences (\u0000, \u0001, etc.) 3. When web scraping returned content with null bytes, these were JSON-encoded as "\u0000" strings 4. PostgreSQL rejected these Unicode escape sequences, causing 22P05 errors ## Changes Made ### Enhanced SafeJson Function (backend/util/json.py:147-153) - Add POSTGRES_JSON_ESCAPES regex: Comprehensive pattern targeting all PostgreSQL-incompatible Unicode and single-char JSON escape sequences - Unicode escapes: \u0000-\u0008, \u000B-\u000C, \u000E-\u001F, \u007F (preserves \u0009=tab, \u000A=newline, \u000D=carriage return) - Single-char escapes: \b (backspace), \f (form feed) with negative lookbehind/lookahead to preserve file paths like "C:\\file.txt" - Two-pass sanitization: Remove JSON escape sequences first, then raw characters as fallback ### Comprehensive Test Coverage (backend/util/test_json.py:219-414) Added 11 new test methods covering: - Control character sanitization: Verify dangerous characters (\x00, \x07, \x0C, etc.) 
are removed while preserving safe whitespace (\t, \n, \r) - Web scraping content: Simulate SearchTheWebBlock scenarios with null bytes and control characters - Code preservation: Ensure legitimate file paths, JSON strings, regex patterns, and programming code are preserved - Unicode escape handling: Test both \u0000-style and \b/\f-style escape sequences - Edge case protection: Prevent over-matching of legitimate sequences like "mybfile.txt" or "\\u0040" - Mixed content scenarios: Verify only problematic sequences are removed while preserving legitimate content ## Validation Results - ✅ All 24 SafeJson tests pass including 11 new comprehensive sanitization tests - ✅ Control characters properly removed: \x00, \x01, \x08, \x0C, \x1F, \x7F - ✅ Safe characters preserved: \t (tab), \n (newline), \r (carriage return) - ✅ File paths preserved: "C:\\Users\\file.txt", "\\\\server\\share" - ✅ Programming code preserved: regex patterns, JSON strings, SQL escapes - ✅ Unicode escapes sanitized: \u0000 → removed, \u0048 ("H") → preserved if valid - ✅ No false positives: Legitimate sequences not accidentally removed - ✅ poetry run format succeeds without errors ## Impact - Prevents PostgreSQL 22P05 errors: No more null character database rejections from web scraping - Maintains data integrity: Legitimate content preserved while dangerous characters removed - Comprehensive protection: Handles both raw bytes and JSON-encoded escape sequences - Web scraping reliability: SearchTheWebBlock and similar blocks now store content safely - Backward compatibility: Existing SafeJson behavior unchanged for legitimate content ## Test Plan - [x] All existing SafeJson tests pass (24/24) - [x] New comprehensive sanitization tests pass (11/11) - [x] Control character removal verified - [x] Legitimate content preservation verified - [x] Web scraping scenarios tested - [x] Code formatting and type checking passes 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude 
<noreply@anthropic.com>	2025-10-15 21:25:30 +00:00

896 Commits