AutoGPT

mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-01-10 07:38:04 -05:00

Author	SHA1	Message	Date
Zamil Majdy	a28b2cf04f	fix(backend/scheduler): Reconfigure scheduling setting & Add more logging on execution scheduling logic autogpt-platform-beta-v0.6.20	2025-08-08 19:27:30 +07:00
Zamil Majdy	de7b6b503f	fix(backend): Add timeout on stopping message consumer on manager	2025-08-08 18:04:10 +07:00
Zamil Majdy	5338ab5b80	feat(backend): standardize service health checks with UnhealthyServiceError (#10584)	2025-08-08 17:23:36 +07:00
Zamil Majdy	e8f897ead1	feat(backend): standardize service health checks with UnhealthyServiceError (#10584 ) This PR standardizes health check error handling across all services by introducing and using a consistent `UnhealthyServiceError` exception type. This improves monitoring, debugging, and service reliability by providing uniform error reporting when services are unhealthy. ### Changes 🏗️ - Added `UnhealthyServiceError` class in `backend/util/service.py`: - Custom exception for unhealthy service states - Includes service name in error message - Added to `EXCEPTION_MAPPING` for proper serialization - Updated health checks across services to use `UnhealthyServiceError`: - Database service (`backend/executor/database.py`): Replace `RuntimeError` with `UnhealthyServiceError` for database connection failures - Scheduler service (`backend/executor/scheduler.py`): Replace `RuntimeError` with `UnhealthyServiceError` for scheduler initialization and running state checks - Notification service (`backend/notifications/notifications.py`): - Replace `RuntimeError` with `UnhealthyServiceError` for RabbitMQ configuration issues - Added new `health_check()` method to verify RabbitMQ readiness - REST API (`backend/server/rest_api.py`): Replace `RuntimeError` with `UnhealthyServiceError` for database health checks - Updated imports across all affected files to include `UnhealthyServiceError` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified health check endpoints return appropriate errors when services are unhealthy - [x] Confirmed services start up properly and health checks pass when healthy - [x] Tested error serialization through API responses - [x] Verified no breaking changes to existing functionality #### For configuration changes: - [x] `.env.example` is updated or already compatible with my changes - [x] 
`docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) No configuration changes were made in this PR - only code changes to improve error handling consistency. --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-08 10:00:59 +00:00
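The health check pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in `backend/util/service.py`: the constructor signature, the `reason` parameter, the contents of `EXCEPTION_MAPPING`, and the `health_check` helper are all assumptions made for the example.

```python
class UnhealthyServiceError(Exception):
    """Raised by a health check when a service cannot do its job."""

    def __init__(self, service_name: str, reason: str = "") -> None:
        # The commit says the service name is included in the error message;
        # the exact wording here is an assumption.
        self.service_name = service_name
        message = f"Service '{service_name}' is unhealthy"
        if reason:
            message += f": {reason}"
        super().__init__(message)


# Hypothetical name-to-class registry, so that a serialized error name can be
# mapped back to an exception type on the receiving side.
EXCEPTION_MAPPING = {
    cls.__name__: cls for cls in (ValueError, RuntimeError, UnhealthyServiceError)
}


def health_check(service_name: str, is_connected: bool) -> str:
    """Return "OK" when healthy, otherwise raise the uniform error type."""
    if not is_connected:
        raise UnhealthyServiceError(service_name, "connection is not ready")
    return "OK"
```

Raising one dedicated exception type instead of `RuntimeError` lets monitoring code tell "service is unhealthy" apart from unrelated runtime failures.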
Zamil Majdy	fbe432919d	fix(backend/scheduler): Add more robust health check mechanism for scheduler service	2025-08-08 14:53:56 +07:00
Abhimanyu Yadav	4f208d262e	test(frontend): add e2e tests for agent dashboard page (#10572) I have added e2e tests for the agent dashboard page. They include tests like: - dashboard page loads successfully - submit agent button works correctly - agent table displays data correctly - agent table actions work correctly I’ve also updated the e2e test script to include some static agent submissions, so I can test whether they load on the frontend. #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] All tests are working perfectly locally	2025-08-08 07:29:11 +00:00
Zamil Majdy	ac9265c40d	Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev	2025-08-08 14:08:37 +07:00
Zamil Majdy	e60deba05f	refactor(backend): separate notification service from scheduler (#10579 ) ## Summary - Create dedicated notification service entry point (backend.notification:main) - Remove NotificationManager from scheduler service for better separation of concerns - Update docker-compose to run notification service on dedicated port 8007 - Configure all services to communicate with separate notification service This refactoring separates the notification service from the scheduler service, allowing them to run as independent microservices instead of two processes in the same pod. ## Changes Made - New notification service entry point: Created `backend/backend/notification.py` with dedicated main function - Updated pyproject.toml: Added notification service entry point registration - Modified scheduler service: Removed NotificationManager from `backend/backend/scheduler.py` - Docker Compose updates: Added notification_server service on port 8007, updated NOTIFICATIONMANAGER_HOST references ## Test plan - [x] Verify notification service starts correctly with new entry point - [x] Confirm scheduler service runs without notification manager - [x] Test docker-compose configuration with separate services - [x] Validate service discovery between microservices - [x] Run linting and type checking 🤖 Generated with [Claude Code](https://claude.ai/code)	2025-08-08 14:07:41 +07:00
Zamil Majdy	3131e2e856	fix(backend): resolve unclosed HTTP client session errors (#10566 ) ## Summary This PR resolves unclosed HTTP client session errors that were occurring in the backend, particularly during file uploads and service-to-service communication. ### Key Changes - Fixed GCS storage operations: Convert `gcloud.aio.storage.Storage()` to use async context managers in `media.py` and `cloud_storage.py` - Enhanced service client cleanup: Added proper cleanup methods to `DynamicClient` class in `service.py` with `__del__` fallback and context manager support - Application shutdown cleanup: Added cloud storage handler cleanup to FastAPI application lifespan - Updated test mocks: Fixed test fixtures to properly mock async context manager behavior ### Root Cause Analysis The "Unclosed client session" and "Unclosed connector" errors were caused by: 1. GCS storage clients not using context managers (agent image uploads) 2. Service HTTP clients (`httpx.Client`/`AsyncClient`) not being properly cleaned up in the `DynamicClient` class ### Technical Details - All `gcloud.aio.storage.Storage()` instances now use `async with` context managers - `DynamicClient` class now has proper cleanup methods and context manager support - Application shutdown hook ensures cloud storage handlers are properly closed - Test fixtures updated to mock async context manager protocol ### Testing - ✅ All media upload tests pass - ✅ Service client tests pass - ✅ Linting and formatting pass ## Test plan - [ ] Deploy to staging environment - [ ] Monitor logs for "Unclosed client session" errors (should be eliminated) - [ ] Verify file upload functionality works correctly - [ ] Check service-to-service communication operates normally 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-08 05:41:41 +00:00
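The async context manager pattern this commit relies on can be illustrated with a stand-in client. `FakeStorageClient` and `upload_image` are hypothetical names for this sketch; the real change applies the same `async with` shape to `gcloud.aio.storage.Storage()`, whose API is not reproduced here.

```python
import asyncio


class FakeStorageClient:
    """Hypothetical stand-in for a storage client that owns an HTTP session."""

    def __init__(self) -> None:
        self.closed = False

    async def __aenter__(self) -> "FakeStorageClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and on exceptions, so the session cannot leak.
        await self.close()

    async def close(self) -> None:
        self.closed = True

    async def upload(self, name: str, data: bytes) -> int:
        return len(data)


async def upload_image(data: bytes) -> tuple[int, bool]:
    # The client is closed as soon as the block exits, even if upload() raises.
    async with FakeStorageClient() as client:
        size = await client.upload("agent.png", data)
    return size, client.closed
```

Without the `async with`, the caller has to remember to call `close()` on every code path, which is the kind of omission that produces "Unclosed client session" warnings.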
Zamil Majdy	378d256b58	fix(backend): add graph validation before scheduling recurring jobs (#10568 ) ## Summary This PR addresses the recurring job validation failures by adding graph validation before scheduling jobs. Previously, validation errors only occurred at runtime during job execution, making it difficult to communicate errors to users for scheduled recurring jobs. ### Changes 🏗️ - Extract validation logic: Created `validate_and_construct_node_execution_input` wrapper function that centralizes graph fetching, credential mapping, and validation logic - Add pre-scheduling validation: Modified `add_graph_execution_schedule` to validate graphs before creating scheduled jobs - Make construct function private: Renamed `construct_node_execution_input` to `_construct_node_execution_input` to prevent direct usage and encourage use of the wrapper - Reduce code duplication: Eliminated duplicate validation logic between scheduler and execution paths - Improve scheduler lifecycle management: - Enhanced cleanup process with proper event loop shutdown sequence - Added graceful event loop thread termination with timeout - Fixed thread lifecycle management to prevent resource leaks - Add helper utilities: - Created `run_async` helper to reduce `asyncio.run_coroutine_threadsafe` boilerplate - Added `SCHEDULER_OPERATION_TIMEOUT_SECONDS` constant for consistent timeout handling across all scheduler operations ### Technical Details Validation Flow: The validation now happens in `add_graph_execution_schedule` before calling `scheduler.add_job()`, ensuring that: 1. Graph exists and is accessible to the user 2. All credentials are valid and available 3. Graph structure and node configurations are valid 4. Starting nodes are present and properly configured This uses the same validation logic as runtime execution, guaranteeing consistency. 
Scheduler Lifecycle Improvements: - Proper cleanup sequence: Event loop is stopped before thread termination - Thread management: Added global tracking of event loop thread for proper cleanup - Timeout consistency: All scheduler operations now use the same 300-second timeout - Resource management: Prevents potential memory leaks from unclosed event loops Code Quality Improvements: - DRY principle: `run_async` helper eliminates repeated `asyncio.run_coroutine_threadsafe` patterns - Single source of truth: All timeout values use `SCHEDULER_OPERATION_TIMEOUT_SECONDS` constant - Cleaner abstractions: Direct utility function calls instead of unnecessary wrapper methods ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified imports work correctly for both scheduler and utils modules - [x] Confirmed code passes all linting and type checking - [x] Validated that existing functionality remains intact - [x] Tested that validation logic is properly extracted and reused - [x] Verified scheduler cleanup process works correctly - [x] Confirmed thread lifecycle management improvements #### For configuration changes: - [x] `.env.example` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) Note: No configuration changes were required for this fix. 
## Impact - Prevents runtime failures: Invalid graphs are caught before scheduling instead of failing silently during execution - Better error communication: Validation errors surface immediately when scheduling - Improved resource management: Proper event loop and thread cleanup prevents memory leaks - Enhanced maintainability: Single source of truth for validation logic and consistent timeout handling - Reduced code duplication: Eliminated ~30+ lines of duplicate code across validation and async execution patterns - Better developer experience: Cleaner code with helper functions and consistent patterns Resolves the TODO comment: "We need to communicate this error to the user somehow" in scheduler.py:107 Co-authored-by: Claude <noreply@anthropic.com>	2025-08-08 05:40:20 +00:00
Abhimanyu Yadav	3c52b75278	fix(frontend): marketplace top agents section (#10571 ) Currently, we’re only seeing the top 20 agents, but we need to display all of them until we see more call-to-action buttons. #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] All tests are working perfectly - [x] It's working manually as well	2025-08-08 04:52:51 +00:00
Zamil Majdy	40601f1616	fix(backend): Fix executor running RabbitMQ operations on closed/closing connection (#10578) The RabbitMQ connection is unreliable (fixing it is a separate issue) and sometimes gets restarted. The scope of this PR is to keep operations from breaking because they run on a stale, broken connection. ### Changes 🏗️ Fix executor running RabbitMQ operations on closed/closing connection ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Manually kill rabbitmq and see how it goes while executing an agent	2025-08-07 23:53:52 +00:00
Nicholas Tindle	178c91d6b9	ref(backend): time/date blocks to support ISO 8601 and custom formats (#10576 ) Introduces discriminated unions for time, date, and date-time format selection, supporting both strftime and ISO 8601 (with timezone and microsecond options). Updates schemas, test cases, and block logic to handle the new format types, improving flexibility and standards compliance for time and date outputs. <!-- Clearly explain the need for these changes: --> ### Why these changes are needed Users need to output timestamps in ISO 8601/RFC 3339 format for API integrations and standardized data exchange. The previous implementation only supported strftime formatting, which made it difficult to generate properly formatted timestamps with timezone information. This change enables: - Standards compliance: ISO 8601 and RFC 3339 compliant timestamps - Timezone support: 38 timezone options covering all UTC offsets globally - API compatibility: Many APIs require RFC 3339 timestamps (e.g., "2011-06-03T10:00:00-07:00") - Backward compatibility: Existing workflows continue to work with default strftime format ### Changes 🏗️ <!-- Concisely describe all of the changes made in this pull request: --> - Added discriminated union format types for all time/date blocks: - `GetCurrentTimeBlock`: Now supports `TimeStrftimeFormat` and `TimeISO8601Format` - `GetCurrentDateBlock`: Now supports `DateStrftimeFormat` and `DateISO8601Format` - `GetCurrentDateAndTimeBlock`: Now supports `StrftimeFormat` and `ISO8601Format` - Implemented shared timezone support: - Created `TimezoneLiteral` type with 38 timezone options (all UTC offsets) - Supports fractional offsets (e.g., India UTC+05:30, Nepal UTC+05:45) - Deduplicated timezone lists across all format classes - Added ISO 8601 format features: - Timezone-aware timestamps with proper offset formatting - Optional microseconds inclusion - RFC 3339 compliance (subset of ISO 8601 with mandatory timezone) - Updated test cases for all three 
blocks to verify: - Default behavior unchanged (backward compatibility) - Custom strftime formats still work - ISO 8601 format produces correct output ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: <!-- Put your test plan here: --> - [x] Verified backward compatibility - default strftime format unchanged - [x] Tested ISO 8601 format with UTC timezone - [x] Tested ISO 8601 format with various timezones (India, New York, etc.) - [x] Tested microseconds option for ISO formats - [x] Verified all existing tests pass for GetCurrentTimeBlock - [x] Verified all existing tests pass for GetCurrentDateBlock - [x] Verified all existing tests pass for GetCurrentDateAndTimeBlock - [x] Manually tested each block with different format configurations - [x] Confirmed RFC 3339 compliance for timestamps with mandatory timezone --------- Co-authored-by: Claude <claude@users.noreply.github.com>	2025-08-07 22:34:31 +00:00
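To make the ISO 8601 option concrete, here is a small sketch using only the standard library. The function name `format_iso8601` and its parameters are assumptions for this example and are not the block's actual schema; the offsets are given in minutes so that fractional-hour zones such as UTC+05:30 can be expressed.

```python
from datetime import datetime, timedelta, timezone


def format_iso8601(
    moment: datetime,
    offset_minutes: int = 0,
    include_microseconds: bool = False,
) -> str:
    """Render an aware datetime as an RFC 3339 style timestamp with an explicit offset."""
    tz = timezone(timedelta(minutes=offset_minutes))
    local = moment.astimezone(tz)
    timespec = "microseconds" if include_microseconds else "seconds"
    return local.isoformat(timespec=timespec)


moment = datetime(2011, 6, 3, 17, 0, 0, tzinfo=timezone.utc)
print(format_iso8601(moment, offset_minutes=-420))  # 2011-06-03T10:00:00-07:00
print(format_iso8601(moment, offset_minutes=330))   # 2011-06-03T22:30:00+05:30
```

The first call reproduces the example timestamp quoted in the commit message; the second shows a fractional offset.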
Nicholas Tindle	c972f34713	Revert "feat(docker): add frontend service to docker-compose with env config improvements" (#10577 ) Reverts Significant-Gravitas/AutoGPT#10536 to bring platform back up due to this error: ``` │ Error creating Supabase client Error: @supabase/ssr: Your project's URL and API key are required to create a Supabase client! │ │ │ │ Check your Supabase project's API settings to find these values │ │ │ │ https://supabase.com/dashboard/project/_/settings/api │ │ at <unknown> (https://supabase.com/dashboard/project/_/settings/api) │ │ at bX (.next/server/chunks/3873.js:6:90688) │ │ at <unknown> (.next/server/chunks/150.js:6:13460) │ │ at n (.next/server/chunks/150.js:6:13419) │ │ at o (.next/server/chunks/150.js:6:14187) │ │ ⨯ Error: Your project's URL and Key are required to create a Supabase client! │ │ │ │ Check your Supabase project's API settings to find these values │ │ │ │ https://supabase.com/dashboard/project/_/settings/api │ │ at <unknown> (https://supabase.com/dashboard/project/_/settings/api) │ │ at bY (.next/server/chunks/3006.js:10:486) │ │ at g (.next/server/app/(platform)/auth/callback/route.js:1:5890) │ │ at async e (.next/server/chunks/9836.js:1:101814) │ │ at async k (.next/server/chunks/9836.js:1:15611) │ │ at async l (.next/server/chunks/9836.js:1:15817) { │ │ digest: '424987633' │ │ } │ │ Error creating Supabase client Error: @supabase/ssr: Your project's URL and API key are required to create a Supabase client! 
│ │ │ │ Check your Supabase project's API settings to find these values │ │ │ │ https://supabase.com/dashboard/project/_/settings/api │ │ at <unknown> (https://supabase.com/dashboard/project/_/settings/api) │ │ at bX (.next/server/chunks/3873.js:6:90688) │ │ at <unknown> (.next/server/chunks/150.js:6:13460) │ │ at n (.next/server/chunks/150.js:6:13419) │ │ at j (.next/server/chunks/150.js:6:7482) │ │ Error creating Supabase client Error: @supabase/ssr: Your project's URL and API key are required to create a Supabase client! │ │ │ │ Check your Supabase project's API settings to find these values │ │ │ │ https://supabase.com/dashboard/project/_/settings/api │ │ at <unknown> (https://supabase.com/dashboard/project/_/settings/api) │ │ at bX (.next/server/chunks/3873.js:6:90688) │ │ at <unknown> (.next/server/chunks/150.js:6:13460) │ │ at n (.next/server/chunks/150.js:6:13419) │ │ at h (.next/server/chunks/150.js:6:10561) │ │ Error creating Supabase client Error: @supabase/ssr: Your project's URL and API key are required to create a Supabase client! │ │ │ │ Check your Supabase project's API settings to find these values │ │ │ │ https://supabase.com/dashboard/project/_/settings/api │ │ at <unknown> (https://supabase.com/dashboard/project/_/settings/api) │ │ at bX (.next/server/chunks/3873.js:6:90688) │ │ at <unknown> (.next/server/chunks/150.js:6:13460) │ │ at n (.next/server/chunks/150.js:6:13419) ```	2025-08-07 20:00:45 +00:00
Bently	7b3ee66247	feat(blocks): Add Anthropics new Claude Opus 4.1 model (#10575) This adds the latest Claude Opus 4.1 model to the platform. It adds the following model: - claude-opus-4-1-20250805 ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Test Claude Opus 4.1 to make sure it works	2025-08-07 17:40:04 +00:00
Bently	2d10ac92b5	feat(blocks): Add GPT-5 models to the platform (#10574) This adds the latest ChatGPT models, GPT-5, to the platform ahead of their release. The prices and context limits are still to be properly set; for now I set them to be the same as GPT-4.1, and the price is set at 5 until we know more. This adds the following models: - gpt-5 - gpt-5-mini - gpt-5-nano - gpt-5-chat ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Test all of the models to make sure they work	2025-08-07 17:19:23 +00:00
Swifty	377b5ef01c	fix id not preserved through airtable oauth refresh (#10573)	2025-08-07 16:44:36 +02:00
Zamil Majdy	7922e4add4	fix(backend): fix lack of event loop on notification manager	2025-08-07 16:15:32 +07:00
Zamil Majdy	f172b314a4	feat(docker): add frontend service to docker-compose with env config improvements (#10536 ) ## Summary This PR adds the frontend service to the Docker Compose configuration, enabling `docker compose up` to run the complete stack including the frontend. It also implements comprehensive environment variable improvements and fixes Docker networking issues. ## Key Changes ### 🐳 Docker Compose Improvements - Added frontend service to `docker-compose.yml` and `docker-compose.platform.yml` - Production build: Uses `pnpm build + serve` instead of dev server for better stability and lower memory usage - Service dependencies: Frontend now waits for backend services (`rest_server`, `websocket_server`) to be ready - YAML anchors: Implemented DRY configuration to avoid duplicating environment values ### 🔧 Environment Variable Architecture - Dual environment strategy: - Server-side code uses Docker service names (`http://rest_server:8006/api`) - Client-side code uses localhost URLs (`http://localhost:8006/api`) - Comprehensive config: Added build args and runtime environment variables - Network compatibility: Fixes connection issues between frontend and backend containers ### 🛠️ Code Improvements - Centralized env-config helper (`/frontend/src/lib/env-config.ts`) with server-side priority - Updated all frontend code to use shared environment helpers instead of direct `process.env` access - Consistent API: All environment variable access now goes through helper functions ### 🔗 Files Changed - `docker-compose.yml` & `docker-compose.platform.yml` - Added frontend service - `frontend/Dockerfile` - Added build args for environment variables - `frontend/src/lib/env-config.ts` - New centralized environment configuration - Multiple frontend files - Updated to use env helpers ## Benefits - ✅ Single command deployment: `docker compose up` now runs everything - ✅ Better reliability: Production build reduces memory usage and crashes - ✅ Network compatibility: Proper 
container-to-container communication - ✅ Maintainable config: Centralized environment variable management - ✅ Development friendly: Works in both Docker and local development ## Testing - ✅ Verified Docker service communication works correctly - ✅ Frontend responds and serves content properly - ✅ Environment variables are correctly resolved in both server and client contexts - ✅ No connection errors after implementing service dependencies 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-07 08:26:28 +00:00
Zamil Majdy	a21711a7ff	feat(backend): migrate AgentExecutor from ProcessPoolExecutor to ThreadPoolExecutor (#10540 ) ## Summary - Migrate execution manager from ProcessPoolExecutor to ThreadPoolExecutor for improved performance and resource efficiency - Rename `Executor` class to `ExecutionProcessor` for better clarity - Convert classmethods to instance methods following proper OOP design patterns - Implement thread-local storage using `threading.local()` for thread-safe execution ## Technical Changes - Executor Pattern: Replace process-based execution with thread-based execution using `ThreadPoolExecutor` - Thread-Local Storage: Use `threading.local()` to bind `ExecutionProcessor` instances to worker threads - Initialization: Add `init_worker()` function called once per thread via `initializer` parameter - Event Handling: Replace `multiprocessing.Manager().Event()` with `threading.Event()` - Tracking: Update from PID to TID (`threading.get_ident()`) for thread identification - Method Conversion: Convert all classmethods to instance methods (`cls` → `self`) - Signal Handling: Remove signal handling code that doesn't work in worker threads ## Benefits - Performance: Reduced overhead compared to process creation/destruction - Resource Efficiency: Lower memory footprint and faster startup - Simplicity: Cleaner implementation using thread-local storage pattern - Thread Safety: Maintained through isolated ExecutionProcessor instances per thread ## Test Plan - [x] Code passes all linting and formatting - [x] All executor tests pass (23/23) - [x] Graph execution test passes successfully - [x] Thread-local storage implementation verified - [x] Signal handling compatibility fixed for worker threads 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-07 08:25:22 +00:00
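The thread-local pattern described in this commit can be sketched as below. `ExecutionProcessor`, `init_worker`, `threading.local()`, and the `initializer` parameter are named in the commit message; the method bodies, `execute_node`, and `run_all` are assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_tls = threading.local()  # each worker thread sees its own attributes


class ExecutionProcessor:
    """Per-thread worker state (hypothetical body)."""

    def __init__(self) -> None:
        self.tid = threading.get_ident()  # thread ID replaces the old PID tracking

    def execute(self, node_id: str) -> tuple[str, bool]:
        # Report whether this call runs on the thread that owns the instance.
        return node_id, self.tid == threading.get_ident()


def init_worker() -> None:
    # Called once per worker thread by the pool's `initializer` hook.
    _tls.processor = ExecutionProcessor()


def execute_node(node_id: str) -> tuple[str, bool]:
    return _tls.processor.execute(node_id)


def run_all(node_ids: list[str], max_workers: int = 4) -> list[tuple[str, bool]]:
    with ThreadPoolExecutor(max_workers=max_workers, initializer=init_worker) as pool:
        return list(pool.map(execute_node, node_ids))
```

Each worker only ever touches the `ExecutionProcessor` it created in `init_worker`, which is how isolation is kept without separate processes.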
Zamil Majdy	e2af2f454d	fix(backend): migrate notification service to fully async to resolve RabbitMQ connection issues (#10564 ) ## Summary - Remove background_executor from NotificationManager to eliminate event loop conflicts that were causing RabbitMQ "Connection reset by peer" errors - Convert all notification processing to fully async using async database clients - Optimize Settings instantiation to prevent file descriptor leaks by moving to module level - Fix scheduler event loop management to use single shared loop instead of thread-cached approach ## Changes 🏗️ ### 1. Remove ProcessPoolExecutor from NotificationManager - Eliminated `background_executor` entirely from notification service - Converted `queue_weekly_summary()` and `process_existing_batches()` from sync to async - Fixed the root cause: `asyncio.run()` was creating new event loops, conflicting with existing RabbitMQ connections ### 2. Full Async Conversion - Updated `_consume_queue` to only accept async functions: `Callable[[str], Awaitable[bool]]` - Replaced sync `DatabaseManagerClient` with `DatabaseManagerAsyncClient` throughout notification service - Added missing async methods to `DatabaseManagerAsyncClient`: - `get_active_user_ids_in_timerange` - `get_user_email_by_id` - `get_user_email_verification` - `get_user_notification_preference` - `create_or_add_to_user_notification_batch` - `empty_user_notification_batch` - `get_all_batches_by_type` ### 3. Settings Optimization - Moved `Settings()` instantiation to module level in: - `backend/util/metrics.py` - `backend/blocks/google_calendar.py` - `backend/blocks/gmail.py` - `backend/blocks/slant3d.py` - `backend/blocks/user.py` - Prevents multiple file descriptor reads per process, reducing resource usage ### 4. 
Scheduler Event Loop Fix - Simplified event loop initialization in `Scheduler.run_service()` to create single shared loop - Removed complex thread caching and locking that could create multiple connections - Fixed daemon thread lifecycle by using non-daemon thread with proper cleanup - Event loop runs in dedicated background thread with graceful shutdown handling ## Root Cause Analysis The RabbitMQ "Connection reset by peer" errors were caused by: 1. Event Loop Conflicts: `asyncio.run()` in `queue_weekly_summary` created new event loops, disrupting existing RabbitMQ heartbeat connections 2. Thread Resource Waste: Thread-cached event loops in scheduler created unnecessary connections 3. File Descriptor Leaks: Multiple Settings instantiations per process increased resource pressure ## Why This Fixes the Issue 1. Eliminates Event Loop Creation: By using `asyncio.create_task()` instead of `asyncio.run()`, we reuse the existing event loop 2. Maintains Heartbeat Connections: Async RabbitMQ connections remain stable without event loop disruption 3. Reduces Resource Pressure: Settings optimization and simplified scheduler reduce file descriptor usage 4. Ensures Connection Stability: Single shared event loop prevents connection multiplexing issues ## Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified RabbitMQ connection stability by checking heartbeat logs - [x] Confirmed async conversion maintains all notification functionality - [x] Tested scheduler job execution with simplified event loop - [x] Validated Settings optimization reduces file descriptor usage - [x] Ensured notification processing works end-to-end 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-07 08:25:09 +00:00
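The difference the root cause analysis points at can be shown in miniature: work scheduled with `asyncio.create_task()` runs on the loop that is already running, so nothing owned by that loop is disturbed. The function bodies below are hypothetical and only record which loop they run on; `consumer` is an assumed name.

```python
import asyncio


async def queue_weekly_summary(seen_loops: list) -> None:
    # Hypothetical body: record the loop this coroutine actually runs on.
    seen_loops.append(id(asyncio.get_running_loop()))


async def consumer() -> bool:
    seen_loops = [id(asyncio.get_running_loop())]
    # Schedule on the current loop instead of starting a new one elsewhere.
    task = asyncio.create_task(queue_weekly_summary(seen_loops))
    await task
    # True when the task shared the consumer's event loop.
    return seen_loops[0] == seen_loops[1]
```

By contrast, calling `asyncio.run()` from a separate executor thread starts a second, independent loop, which is the situation the commit message describes as conflicting with the existing RabbitMQ connections.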
Zamil Majdy	59cc3266e0	Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev	2025-08-07 06:28:56 +07:00
Zamil Majdy	c9360555b2	fix(backend): Persist any non interruption error on node execution as output (#10562) Some non-node execution errors and system failures (like credentials not found, or database failure) are not logged or exposed to the user. This makes the node execution look like it has failed without an error message. ### Changes 🏗️ Yield all non-interruption errors as node execution error output. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] CI	2025-08-07 06:28:24 +07:00
Bently	4a63fbc006	feat(blocks): Add OpenAI's new opensource models (#10559) This adds the latest open-source models from OpenAI to the platform. We are using OpenRouter to provide API access to them. I added: - openai/gpt-oss-20b - openai/gpt-oss-120b ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Test both of the latest models from OpenAI, openai/gpt-oss-20b and openai/gpt-oss-120b; they should work	2025-08-06 11:43:49 +00:00
Abhimanyu Yadav	9848266474	test(frontend): e2e tests for library page (#10355) In this PR, I’ve added library page tests. ### Changes I’ve added 9 tests: 8 for normal flows and 1 for checking edge cases. Test names are something like: - Library navigation is accessible from the navbar. - The library page loads successfully. - Agents are visible, and cards work correctly. - Pagination works correctly. - Sorting works correctly. - Searching works correctly. - Pagination while searching works correctly. - Uploading an agent works correctly. - Edge case: Search edge cases and error handling behave correctly. Other than that, I’ve added a new utility that uses the build page to help us create users at the start, which we could use to test the library page. - All tests are passing locally ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] All library tests are working locally and on CI perfectly. autogpt-platform-beta-v0.6.19	2025-08-06 08:00:04 +00:00
Zamil Majdy	3fe88b6106	refactor(backend): Refactor log client and resource cleanup (#10558 ) ## Summary - Created centralized service client helpers with thread caching in `util/clients.py` - Refactored service client management to eliminate health checks and improve performance - Enhanced logging in process cleanup to include error details - Improved retry mechanisms and resource cleanup across the platform - Updated multiple services to use new centralized client patterns ## Key Changes ### New Centralized Client Factory (`util/clients.py`) - Added thread-cached factory functions for all major service clients: - Database managers (sync and async) - Scheduler client - Notification manager - Execution event bus (Redis-based) - RabbitMQ execution queue (sync and async) - Integration credentials store - All clients use `@thread_cached` decorator for performance optimization ### Service Client Improvements - Removed health checks: Eliminated unnecessary health check calls from `get_service_client()` to reduce startup overhead - Enhanced retry support: Database manager clients now use request retry by default - Better error handling: Improved error propagation and logging ### Enhanced Logging and Cleanup - Process termination logs: Added error details to termination messages in `util/process.py` - Retry mechanism updates: Improved retry logic with better error handling in `util/retry.py` - Resource cleanup: Better resource management across executors and monitoring services ### Updated Service Usage - Refactored 21+ files to use new centralized client patterns - Updated all executor, monitoring, and notification services - Maintained backward compatibility while improving performance ## Files Changed - Created: `backend/util/clients.py` - Centralized client factory with thread caching - Modified: 21 files across blocks, executor, monitoring, and utility modules - Key areas: Service client initialization, resource cleanup, retry mechanisms ## Test Plan - [x] Verify all existing tests pass - [x] Validate service startup and client initialization - [x] Test resource cleanup on process termination - [x] Confirm retry mechanisms work correctly - [x] Validate thread caching performance improvements - [x] Ensure no breaking changes to existing functionality ## Breaking Changes None - all changes maintain backward compatibility. ## Additional Notes This refactoring centralizes client management patterns that were scattered across the codebase, making them more consistent and performant through thread caching. The removal of health checks reduces startup time while maintaining reliability through improved retry mechanisms. 🤖 Generated with [Claude Code](https://claude.ai/code)	2025-08-06 13:53:01 +07:00
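The commit above (3fe88b6106) describes factory functions wrapped in a `@thread_cached` decorator. As a rough illustration of the idea only, here is a minimal sketch of per-thread caching built on `threading.local`. The decorator body, `FakeClient`, and `get_client` are hypothetical stand-ins and are not taken from `backend/util/clients.py`.

```python
import functools
import threading


def thread_cached(func):
    """Cache func's result per thread and per argument set (illustrative sketch)."""
    local = threading.local()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Each thread sees its own `cache` attribute on the thread-local object.
        cache = getattr(local, "cache", None)
        if cache is None:
            cache = local.cache = {}
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


class FakeClient:
    """Hypothetical stand-in for a service client."""


@thread_cached
def get_client() -> FakeClient:
    # The factory body runs once per thread; later calls reuse the instance.
    return FakeClient()
```

Repeated calls to `get_client()` on one thread return the same object, while a different thread gets its own instance, so clients that are not thread-safe are never shared across threads.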
Reinier van der Leer	fa2d968458	fix(builder): Defer graph validation to backend (#10556 ) - Resolves #10553 ### Changes 🏗️ - Remove frontend graph validation in `useAgentGraph:saveAndRun(..)` - Remove now unused `ajv` dependency - Implement graph validation error propagation (backend->frontend) - Add `GraphValidationError` type in frontend and backend - Add `GraphModel.validate_graph_get_errors(..)` method - Fix error handling & propagation in frontend API request logic ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Saving & running a graph with missing required inputs gives a node-specific error - [x] Saving & running a graph with missing node credential inputs succeeds with passed-in credentials	2025-08-05 23:43:34 +00:00
Zamil Majdy	b935638240	Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev	2025-08-06 05:56:00 +07:00
Zamil Majdy	f9b255fb7a	feat(backend/executor): Avoid executor premature termination on inflight agent execution (#10552 ) There is no 100% accurate way of retrying an agent that has been terminated. The safest way to avoid executing an agent incorrectly is to minimize the chance of an agent execution being terminated. A whole set of mechanisms to make sure the agent is retried on failure is still in place and has been improved; it serves as our best-effort reliability mechanism. ### Changes 🏗️ * Cap SIGINT & SIGTERM to be raised at most once, so the executor can gracefully handle the stopping. * SIGINT & SIGTERM will stop the execution request message consumption, but not agent execution. * Executor process will only stop if all the in-flight agent executions are completed or terminated. * Avoid retrying the agent stop command on AgentExecutorBlock on timeout. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run agent, send SIGTERM to the executor pod, execution should not be interrupted. - [x] Run agent, send SIGKILL to the executor pod, execution should be transferred to another pod.	2025-08-06 05:55:30 +07:00
Nicholas Tindle	cc6697e46d	fix(backend): clean up parsing a bit for gmail read (#10555 ) Toran hit an error caused by a snippet being read incorrectly. ### Changes 🏗️ Uses fallback dictionary lookups when building email objects. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [ ] I have tested my changes according to the test plan: - [ ] Deploy to dev and have Toran test against his inbox --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-05 18:30:34 +00:00
Swifty	5a5f7f0f9e	fix(backend): Fix Airtable API boolean type casting for string values (#10551 ) ### Changes 🏗️ - Added `_convert_bools()` function to recursively convert string boolean values ("true"/"false") to actual Python booleans - Applied boolean conversion to all Airtable API endpoints that send JSON data to ensure proper type casting - Fixed parameters that were incorrectly converted to strings (e.g., `typecast`, `returnFieldsByFieldId`) to maintain their boolean types This fix addresses an issue where the Airtable API was not properly handling boolean values passed as strings, which could cause API calls to fail or behave unexpectedly. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Tested boolean field updates with string values "true" and "false" - [x] Verified that boolean parameters like `typecast` and `returnFieldsByFieldId` are properly handled - [x] Confirmed that nested boolean values in records are correctly converted - [x] Tested that non-boolean values remain unchanged [Working Airtable Example_v56.json](https://github.com/user-attachments/files/21594436/Working.Airtable.Example_v56.json)	2025-08-05 13:22:55 +00:00
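The commit above (5a5f7f0f9e) describes `_convert_bools()` as recursively turning the strings "true" and "false" into Python booleans. A minimal sketch of that kind of conversion follows. It is an assumption about the general shape, not the code from the Airtable block: the name `convert_bools`, the exact-match rule, and the sample payload are illustrative.

```python
from typing import Any


def convert_bools(value: Any) -> Any:
    """Recursively replace the strings "true"/"false" with real booleans.

    Assumption: only exact lowercase matches are converted; everything
    else (numbers, other strings, None) is returned unchanged.
    """
    if isinstance(value, str):
        if value == "true":
            return True
        if value == "false":
            return False
        return value
    if isinstance(value, dict):
        return {key: convert_bools(item) for key, item in value.items()}
    if isinstance(value, list):
        return [convert_bools(item) for item in value]
    return value


# Hypothetical request payload with string booleans at several depths.
payload = {
    "typecast": "true",
    "records": [{"fields": {"Done": "false", "Name": "Task"}}],
}
converted = convert_bools(payload)
```

Because the function builds new dicts and lists rather than mutating its input, the original payload stays untouched, which keeps the conversion safe to apply just before serializing a request body.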
Bently	05d4d21d98	feat(frontend): Show CAPTCHA only in cloud environments (#10543 ) Updated login and signup pages to display the Turnstile CAPTCHA and require verification only when running in a cloud environment. This prevents unnecessary CAPTCHA prompts in local or non-cloud deployments. ### Changes 🏗️ Locally, when you try to log in with the wrong password, then correct it and log in again, you get an incorrect warning about the CAPTCHA. This fix shows the CAPTCHA only when running in a cloud environment. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Try to log in with the wrong password, get "Invalid login credentials" and try to log in again; you should keep getting "Invalid login credentials" and it should not mention the CAPTCHA	2025-08-04 16:37:20 +00:00
Zamil Majdy	1e6bd8d2a6	fix(backend/executor): Avoid stopping agent node evaluation when stopping graph (#10542 ) Graph evaluation should stop naturally once all node executions have stopped. ### Changes 🏗️ Avoid stopping agent node evaluation when stopping graph ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] CI	2025-08-04 23:10:42 +07:00
Zamil Majdy	6f8d0bfdf2	fix(backend/executor): Fix node execution status and output persistence ordering (#10541 ) The node execution status can be marked as completed before the output is persisted, so the output ends up being persisted after the node execution is already reported as completed. ### Changes 🏗️ * Re-order the node execution status & output persistence logic. * Make agent.py avoid yielding the same node_exec_id twice (that can be caused by the above issue). ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Existing CI	2025-08-04 22:17:30 +07:00
Zamil Majdy	b85e8204df	refactor(platform): remove unused functions and imports (#10535 ) ## Summary - Removed unused metadata functions from user.py (get_user_metadata, update_user_metadata) - Removed unused execution and database functions from database.py and related imports - Added NodeExecutionStats validation in execution.py - Updated CLAUDE.md with PR and commit conventions ## Changes Made ### `/backend/backend/data/user.py` - Removed `get_user_metadata()` function (unused) - Removed `update_user_metadata()` function (unused) - Removed unused import `UserMetadataRaw` ### `/backend/backend/data/execution.py` - Added `NodeExecutionStats` validation in `from_db()` method ### `/backend/backend/executor/database.py` - Removed unused imports and function exposures - Cleaned up DatabaseManagerClient to remove unused client methods ### `/CLAUDE.md` - Added documentation for creating pull requests - Added conventional commit types and scopes guide ## Testing - Existing tests should pass as removed functions were not being used - No new functionality added ## Checklist - [x] Code follows the project's style guidelines - [x] Self-review completed - [x] Changes are backward compatible - [x] No new warnings introduced 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-04 21:34:33 +07:00
Zamil Majdy	e5d3ebac08	feat(backend): Make Graph & Node Execution Stats Update Durable (#10529 ) Graph and node execution can fail for many reasons, and this sometimes corrupts the stats tracking, giving inaccurate results. The scope of this PR is to minimize such issues. ### Changes 🏗️ * Catch BaseException on time_measured decorator to catch asyncio.CancelledError * Make sure node & graph stats updates are executed on cancellation & exception. * Protect graph execution stats update under the thread lock to avoid race condition. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Existing automated tests. --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-04 21:33:52 +07:00
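The commit above (e5d3ebac08) mentions catching `BaseException` in a `time_measured` decorator so that `asyncio.CancelledError` is also covered. Since Python 3.8, `CancelledError` derives from `BaseException`, so a plain `except Exception` misses it. The sketch below shows the pattern under assumptions: the decorator signature, the `stats` dict, and `demo` are hypothetical and do not reproduce the project's actual decorator.

```python
import asyncio
import functools
import time


def time_measured(stats: dict):
    """Record wall time and error type into `stats`, even on cancellation."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return await func(*args, **kwargs)
            except BaseException as exc:
                # BaseException also covers asyncio.CancelledError and SystemExit.
                stats["error"] = type(exc).__name__
                raise  # never swallow the interruption, only observe it
            finally:
                stats["walltime"] = time.perf_counter() - start

        return wrapper

    return decorator


async def demo() -> dict:
    stats: dict = {}

    @time_measured(stats)
    async def slow_node():
        await asyncio.sleep(10)

    task = asyncio.create_task(slow_node())
    await asyncio.sleep(0.01)  # let the task start before cancelling it
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return stats
```

Running `asyncio.run(demo())` cancels the node mid-sleep, yet the returned dict still carries both the elapsed time and the error type, which is the durability property the commit is after.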
Swifty	2182c7ba9e	refactor(backend): remove Ayrshare from execution & credentials manager (#10538 ) This PR refactors the Ayrshare integration to remove the centralized `get_ayrshare_profile_key` function from the credentials store and instead retrieve the profile key directly within each Ayrshare block. This change improves code organization by keeping Ayrshare-specific logic within the Ayrshare module. ### Changes 🏗️ - Refactored Ayrshare profile key retrieval: Moved profile key fetching logic from the credentials store into the Ayrshare blocks - Added `get_profile_key` helper function in `autogpt_platform/backend/backend/blocks/ayrshare/_util.py` to fetch the profile key from user integrations - Updated all 15 Ayrshare social media blocks to use `user_id` instead of `profile_key` parameter and fetch the profile key internally - Removed `get_ayrshare_profile_key` method from `autogpt_platform/backend/backend/integrations/credentials_store.py` - Removed Ayrshare-specific logic from `autogpt_platform/backend/backend/executor/manager.py` that was passing profile keys to blocks - Updated router in `autogpt_platform/backend/backend/server/integrations/router.py` to directly fetch user integrations instead of using the removed method ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Test posting to X/Twitter to check credentials flow - [x] Verify profile key retrieval works correctly for authenticated users - [x] Test Ayrshare SSO URL generation flow --------- Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>	2025-08-04 13:36:54 +00:00
Abhimanyu Yadav	e043e4989b	fix(frontend) : Update server-side mutator to bypass proxy (#10523 ) This PR helps us bypass the proxy server in server-side requests, allowing us to directly send requests to the backend and reduce latency. ### Changes 🏗️ - Introduced server-side detection to dynamically set the base URL for API requests. - Added error handling for server-side requests to log failures and throw errors appropriately. - Updated header management to include authentication tokens when applicable. ### Checklist 📋 - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] All E2E tests are working. - [x] I have manually checked the server-side and client-side components, and both are working perfectly.	2025-08-04 11:36:25 +00:00
Abhimanyu Yadav	5dbc3a7d39	feat(frontend): Add marketplace agent page tests (#10434 ) - Resolves - https://github.com/Significant-Gravitas/AutoGPT/issues/10433 - Depends on - https://github.com/Significant-Gravitas/AutoGPT/pull/10427 - Need to review this pr, once this issue is fixed - https://github.com/Significant-Gravitas/AutoGPT/issues/10404 I’ve created additional tests for the agents marketplace page Tests that I have added - Add to library button works and agent appears in library. - Download button functionality works. - Agent page details are visible. - User can access agent page when logged in. - User can access agent page when logged out #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] I have done all the tests and they are working perfectly --------- Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> Co-authored-by: Lluis Agusti <hi@llu.lu> Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> Co-authored-by: Ubbe <hi@ubbe.dev>	2025-08-04 05:53:53 +00:00
Abhimanyu Yadav	0978f406bc	feat(frontend): Add reusable infinite scroll component for consistent pagination across frontend (#10530 ) We currently use infinite scroll pagination in multiple places, but our strategies vary across these locations. This repetitive code is not ideal, and our current methods are also complex. We’re not utilising React Query’s `useInfiniteQuery` hooks effectively. To address these issues, we’re introducing a new component called `InfiniteScroll` that handles pagination independently. ### How to use it? - Use React Query’s `useInfiniteQuery` hook to return multiple data points. For pagination, we only need `fetchNextPage`, `hasNextPage`, and `isFetchingNextPage`. ```ts const { data: agents, fetchNextPage, hasNextPage, isFetchingNextPage, isLoading: agentLoading, } = useGetV2ListLibraryAgentsInfinite( { page: 1, page_size: 8, search_term: searchTerm || undefined, sort_by: librarySort, }, ); ``` - Simply pass these three data points and the current data length to the `InfiniteScroll` component. That's it. ```tsx <InfiniteScroll dataLength={agents.length} isFetchingNextPage={isFetchingNextPage} fetchNextPage={fetchNextPage} hasNextPage={hasNextPage} loader={<LoadingSpinner />} > ... ``` ### Changes - Add the `InfiniteScroll.tsx` component for consistency and simplicity in pagination across the frontend. - Update the current library page to use the `InfiniteScroll` component. ### Checklist 📋 - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] I’ve tested everything locally, and it’s working perfectly fine.	2025-08-04 05:48:54 +00:00
Zamil Majdy	1c3fa804d4	feat(backend): add timeout guard for locked_transaction used for credit transactions (#10528 ) ## Summary This PR adds a timeout guard to the `locked_transaction` function used for credit transactions to prevent indefinite blocking and improve reliability. ## Changes - Modified `locked_transaction` in `/backend/backend/data/db.py` to add proper timeout handling - Set `lock_timeout` and `statement_timeout` to prevent indefinite blocking - Updated function signature to use default timeout parameter - Added comprehensive docstring explaining the locking mechanism ## Motivation The previous implementation could potentially block indefinitely if a lock couldn't be acquired, which could cause issues in production environments, especially for critical credit transactions. ## Testing - Existing tests pass - The timeout mechanism ensures transactions won't hang indefinitely - Advisory locks are properly released on commit/rollback 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-02 15:32:20 +00:00
Zamil Majdy	69d873debc	fix(backend): improve executor reliability and error handling (#10526 ) This PR improves the reliability of the executor system by addressing several race conditions and improving error handling throughout the execution pipeline. ### Changes 🏗️ - Consolidated exception handling: Now using `BaseException` to properly catch all types of interruptions including `CancelledError` and `SystemExit` - Atomic stats updates: Moved node execution stats updates to be atomic with graph stats updates to prevent race conditions - Improved cleanup handling: Added proper timeout handling (3600s) for stuck executions during cleanup - Fixed concurrent update race conditions: Node execution updates are now properly synchronized with graph execution updates - Better error propagation: Improved error type preservation and status management throughout the execution chain - Graph resumption support: Added proper handling for resuming terminated and failed graph executions - Removed deprecated methods: Removed `update_node_execution_stats` in favor of atomic updates ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Execute a graph with multiple nodes and verify stats are updated correctly - [x] Cancel a running graph execution and verify proper cleanup - [x] Simulate node failures and verify error propagation - [x] Test graph resumption after termination/failure - [x] Verify no race conditions in concurrent node execution updates #### For configuration changes: - [x] `.env.example` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-02 17:41:59 +07:00
Zamil Majdy	4283798dc2	feat: Avoid Rest & DatabaseManager service serving traffic when the db is not yet connected (#10522 ) Sometimes the service starts receiving traffic before it is connected to the DB, which makes those requests fail. ### Changes 🏗️ Make the `/health_check` endpoint also check the database connection. ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Existing CI, manual test	2025-08-01 23:17:43 +00:00
Abhimanyu Yadav	326c4a9e0c	feat(frontend): Add marketplace creator page tests (#10429 ) - Resolves - https://github.com/Significant-Gravitas/AutoGPT/issues/10428 - Depends on - https://github.com/Significant-Gravitas/AutoGPT/pull/10427 - Need to review this pr, once this issue is fixed - https://github.com/Significant-Gravitas/AutoGPT/issues/10404 I’ve created additional tests for the creators marketplace page Tests that I have added - User can access creator's page when logged out. - User can access creator's page when logged in. - Creator page details are visible. - Agents in agent by sections navigation works. #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] I have done all the tests and they are working perfectly	2025-08-01 15:28:02 +00:00
Abhimanyu Yadav	7705cf243c	refactor(frontend): Update data fetching strategy in marketplace main page (#10520 ) With this PR, we’re changing the data fetching strategy on the marketplace page. We’re now using autogenerated React queries. ### Changes - Splits separate render logic and hook logic. - Update the data fetching strategy. - Currently, we’re seeing agents in the featured section and creators in the featured creators section, even if they’re not set to “isFeatured” true. I’ve fixed that also. ### Checklist 📋 - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] All marketplace E2E tests are working. - [x] I’ve tested all the links and checked if everything renders perfectly on the marketplace page.	2025-08-01 15:27:48 +00:00
Zamil Majdy	8331dabf6a	feat(backend): Make agent graph execution retriable and its failure visible (#10518 ) Make agent graph execution durable by making it retriable. If it still fails after retrying, the error should be made visible in the UI. ### Changes 🏗️ * Make _on_graph_execution retriable * Increase retry count for failing db-manager RPC * Add test coverage for RPC failure retry ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Allow graph execution retry	2025-08-01 11:44:43 +00:00
Zamil Majdy	e632549175	feat(backend): Add AI-generated activity status for agent executions (#10487 ) ## Summary - Adds AI-generated activity status summaries for agent execution results - Provides users with conversational, non-technical summaries of what their agents accomplished - Includes comprehensive execution data analysis with honest failure reporting ## Changes Made - Backend: Added `ActivityStatusGenerator` module with async LLM integration - Database: Extended `GraphExecutionStats` and `Stats` models with `activity_status` field - Frontend: Added "Smart Agent Execution Summary" display with disclaimer tooltip - Settings: Added `execution_enable_ai_activity_status` toggle (disabled by default) - Testing: Comprehensive test suite with 12 test cases covering all scenarios ## Key Features - Collects execution data including graph structure, node relations, errors, and I/O samples - Generates user-friendly summaries from first-person perspective - Honest reporting of failures and invalid inputs (no sugar-coating) - Payload optimization for LLM context limits - Full async implementation with proper error handling ## Test Plan - [x] All existing tests pass - [x] New comprehensive test suite covers success/failure scenarios - [x] Feature toggle testing (enabled/disabled states) - [x] Frontend integration displays correctly - [x] Error handling and edge cases covered 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-01 11:37:49 +00:00
Abhimanyu Yadav	878f61aaf4	fix(test): Enhance E2E test data script to include featured creators and agents (#10517 ) This PR updates the existing E2E test data script to support the creation of featured creators and featured agents. Previously, these entities were not included, which limited our ability to fully test certain flows during Playwright E2E testing. ### Changes - Added logic to create featured creators - Added logic to create featured agents ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] All tests are passing locally after updating the data script.	2025-08-01 11:09:39 +00:00
Abhimanyu Yadav	e371ef853a	feat(frontend): Add main marketplace page tests and page object structure (#10427 ) - Resolves - https://github.com/Significant-Gravitas/AutoGPT/issues/10426 - Need to review this pr, once this issue is fixed - https://github.com/Significant-Gravitas/AutoGPT/issues/10404 I’ve created additional tests for the main page, divided into two parts: one for basic functionality and the other for edge cases. Basic functionality: - Users can access the marketplace page when logged out. - Users can access the marketplace page when logged in. - Featured agents, top agents, and featured creators are visible. - Users can navigate and interact with marketplace elements. - The complete search flow works correctly. Edge cases: - Searching for a non-existent item shows no results. ### Changes - Introduced a new test suite for the marketplace, covering basic functionality and edge cases. - Implemented the MarketplacePage class to encapsulate interactions with the marketplace page. - Added utility functions for assertions, including visibility checks and URL matching. - Enhanced the LoginPage class with a goto method for navigation. - Established a comprehensive search flow test to validate search functionality. #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] I have done all the tests and they are working perfectly --------- Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> Co-authored-by: Lluis Agusti <hi@llu.lu> Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> Co-authored-by: Ubbe <hi@ubbe.dev>	2025-08-01 06:43:53 +00:00
Nicholas Tindle	d323dc2821	feat(backend): optimize processing of queues in notif service (#10513 ) The notification service was running an inefficient polling loop that constantly checked each queue sequentially with 1-second timeouts, even when queues were empty. This caused: - High CPU usage from continuous polling - Sequential processing that blocked queues from being processed in parallel - Unnecessary delays from timeout-based polling instead of event-driven consumption - Poor throughput (500-2,000 messages/second) compared to potential (8,000-12,000 messages/second) ## Changes 🏗️ - Replaced polling-based _run_queue() with event-driven _consume_queue() using async iterators - Implemented concurrent queue consumption using asyncio.gather() instead of sequential processing - Added QoS settings (prefetch_count=10) to control memory usage - Improved error handling with message.process() context manager for automatic ack/nack - Added graceful shutdown that properly cancels all consumer tasks - Removed unused QueueEmpty import ## Checklist 📋 ### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [ ] I have tested my changes according to the test plan: - [ ] Deploy to test environment and monitor CPU usage - [ ] Verify all queue types (immediate, admin, batch, summary) process messages correctly - [ ] Test graceful shutdown with messages in flight - [ ] Monitor that database management service remains stable - [ ] Check logs for proper consumer startup messages - [ ] Verify messages are properly acked/nacked on success/failure --------- Co-authored-by: Claude <claude@users.noreply.github.com> Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>	2025-07-31 17:17:05 +00:00
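The commit above (d323dc2821) replaces a sequential polling loop with event-driven consumers that run concurrently via `asyncio.gather()`. The sketch below illustrates that shape using in-memory `asyncio.Queue` objects in place of RabbitMQ queues. That substitution is an assumption made to keep the example self-contained, and `consume_queue`, `run_consumers`, and `demo` are hypothetical names rather than the notification service's code.

```python
import asyncio


async def consume_queue(name: str, queue: asyncio.Queue, handled: list) -> None:
    """Block on queue.get() until a message arrives; no timeout-based polling."""
    while True:
        message = await queue.get()
        try:
            handled.append((name, message))  # stand-in for real message handling
        finally:
            queue.task_done()  # rough analogue of acknowledging the message


async def run_consumers(queues: dict) -> list:
    handled: list = []
    # One consumer task per queue, so a slow queue cannot block the others.
    tasks = [
        asyncio.create_task(consume_queue(name, queue, handled))
        for name, queue in queues.items()
    ]
    # Wait until every message already enqueued has been processed.
    await asyncio.gather(*(queue.join() for queue in queues.values()))
    # Graceful shutdown: cancel the consumers and wait for them to finish.
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return handled


async def demo() -> list:
    queues = {"immediate": asyncio.Queue(), "batch": asyncio.Queue()}
    await queues["immediate"].put("welcome email")
    await queues["batch"].put("daily digest")
    await queues["batch"].put("weekly digest")
    return await run_consumers(queues)
```

An idle consumer here is suspended on `await queue.get()` and uses no CPU, in contrast to a loop that wakes up every second to check each queue in turn.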


7088 Commits