mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-02-06 21:05:13 -05:00
## Summary

Comprehensive performance optimization fixing event loop binding issues and addressing all PR feedback.

### Original Performance Issues Fixed

**Event Loop Binding Problems:**

- JWT authentication dependencies were synchronous, causing thread pool bottlenecks under high concurrency
- FastAPI's default thread pool (40 threads) was insufficient for high-load scenarios
- Backend services lacked proper event loop configuration

**Security & Performance Improvements:**

- Security middleware converted from BaseHTTPMiddleware to pure ASGI for better performance
- Added blocks endpoint to cacheable paths for improved response times
- Cross-platform uvloop detection with Windows compatibility

### Key Changes Made

#### 1. JWT Authentication Async Conversion

- **Files**: `autogpt_libs/auth/dependencies.py`, `autogpt_libs/auth/jwt_utils.py`
- **Change**: Convert all JWT functions to async (`requires_user`, `requires_admin_user`, `get_user_id`, `get_jwt_payload`)
- **Impact**: Eliminates thread pool blocking, improves concurrency handling
- **Tests**: All 25+ authentication tests updated to async patterns

#### 2. FastAPI Thread Pool Optimization

- **File**: `backend/server/rest_api.py:82-93`
- **Change**: Configure thread pool size via `config.fastapi_thread_pool_size`
- **Default**: Increased from 40 to higher limit for sync operations
- **Impact**: Better handling of remaining sync dependencies

#### 3. Performance-Optimized Security Middleware

- **File**: `backend/server/middleware/security.py`
- **Change**: Pure ASGI implementation replacing BaseHTTPMiddleware
- **Headers**: HTTP spec compliant capitalization (X-Content-Type-Options, X-Frame-Options, etc.)
- **Caching**: Added `/api/blocks` and `/api/v1/blocks` to cacheable paths
- **Impact**: Reduced middleware overhead, improved header compliance

#### 4. Cross-Platform Event Loop Configuration

- **File**: `backend/server/rest_api.py:311-312`
- **Change**: Platform-aware uvloop detection: `'uvloop' if platform.system() != 'Windows' else 'auto'`
- **Impact**: Windows compatibility while maintaining Unix performance benefits
- **Verified**: 'auto' is valid uvicorn default parameter

#### 5. Enhanced Caching Infrastructure

- **File**: `autogpt_libs/utils/cache.py:118-132`
- **Change**: Per-event-loop asyncio.Lock instances prevent cross-loop deadlocks
- **Impact**: Thread-safe caching across multiple event loops

#### 6. Database Query Limits & Performance

- **Files**: Multiple data layer files
- **Change**: Added configurable limits to prevent unbounded queries
- **Constants**: `MAX_GRAPH_VERSIONS_FETCH=50`, `MAX_USER_API_KEYS_FETCH=500`, etc.
- **Impact**: Consistent performance regardless of data volume

#### 7. OpenAPI Documentation Improvements

- **File**: `backend/server/routers/v1.py:68-85`
- **Change**: Added proper response model and schema for blocks endpoint
- **Impact**: Better API documentation and type safety

#### 8. Error Handling & Retry Logic Fixes

- **File**: `backend/util/retry.py:63`
- **Change**: Accurate retry threshold comments referencing EXCESSIVE_RETRY_THRESHOLD
- **Impact**: Clear documentation for debugging retry scenarios

### ntindle Feedback Addressed

✅ **HTTP Header Capitalization**: All headers now use proper HTTP spec capitalization
✅ **Windows uvloop Compatibility**: Clean platform detection with inline conditional
✅ **OpenAPI Response Model**: Blocks endpoint properly documented in schema
✅ **Retry Comment Accuracy**: References actual threshold constants instead of hardcoded numbers
✅ **Code Cleanliness**: Inline conditionals preferred over verbose if statements

### Performance Testing Results

**Before Optimization:**

- High latency under concurrent load
- Thread pool exhaustion at ~40 concurrent requests
- Event loop binding issues causing timeouts

**After Optimization:**

- Improved concurrency handling with async JWT pipeline
- Configurable thread pool scaling
- Cross-platform event loop optimization
- Reduced middleware overhead

### Backward Compatibility

✅ **All existing functionality preserved**
✅ **No breaking API changes**
✅ **Enhanced test coverage with async patterns**
✅ **Windows and Unix compatibility maintained**

### Files Modified

**Core Authentication & Performance:**

- `autogpt_libs/auth/dependencies.py` - Async JWT dependencies
- `autogpt_libs/auth/jwt_utils.py` - Async JWT utilities
- `backend/server/rest_api.py` - Thread pool config + uvloop detection
- `backend/server/middleware/security.py` - ASGI security middleware

**Database & Limits:**

- `backend/data/includes.py` - Performance constants and configurable includes
- `backend/data/api_key.py`, `backend/data/credit.py`, `backend/data/graph.py`, `backend/data/integrations.py` - Query limits

**Caching & Infrastructure:**

- `autogpt_libs/utils/cache.py` - Per-event-loop lock safety
- `backend/server/routers/v1.py` - OpenAPI improvements
- `backend/util/retry.py` - Comment accuracy

**Testing:**

- `autogpt_libs/auth/dependencies_test.py` - 25+ async test conversions
- `autogpt_libs/auth/jwt_utils_test.py` - Async JWT test patterns

Ready for review and production deployment. 🚀

---------

Co-authored-by: Claude <noreply@anthropic.com>
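
The per-event-loop locking mentioned under "Enhanced Caching Infrastructure" can be illustrated with a minimal sketch. This is not the code in `autogpt_libs/utils/cache.py`; the names `get_loop_lock` and `_locks` are hypothetical, and the sketch only shows the general idea that an `asyncio.Lock` should not be shared between event loops, so each loop gets its own instance.

```python
import asyncio

# Hypothetical registry mapping each event loop to its own lock.
# Entries are never pruned in this sketch; a real implementation would
# need to clean up after closed loops and consider thread safety.
_locks: dict = {}


def get_loop_lock() -> asyncio.Lock:
    """Return the lock that belongs to the currently running event loop."""
    loop = asyncio.get_running_loop()
    lock = _locks.get(loop)
    if lock is None:
        lock = asyncio.Lock()
        _locks[loop] = lock
    return lock


async def _same_lock_twice() -> bool:
    # Two lookups inside one loop should yield the same lock object.
    return get_loop_lock() is get_loop_lock()


async def _current_lock() -> asyncio.Lock:
    return get_loop_lock()


# Each asyncio.run() call creates a fresh event loop.
same_within_loop = asyncio.run(_same_lock_twice())
lock_a = asyncio.run(_current_lock())
lock_b = asyncio.run(_current_lock())
```

Here `same_within_loop` is true because both lookups happen in one loop, while `lock_a` and `lock_b` are distinct objects because they were created under different loops.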
135 lines
4.4 KiB
Python
from typing import Sequence, cast

import prisma.enums
import prisma.types

AGENT_NODE_INCLUDE: prisma.types.AgentNodeInclude = {
    "Input": True,
    "Output": True,
    "Webhook": True,
    "AgentBlock": True,
}

AGENT_GRAPH_INCLUDE: prisma.types.AgentGraphInclude = {
    "Nodes": {"include": AGENT_NODE_INCLUDE}
}


EXECUTION_RESULT_ORDER: list[prisma.types.AgentNodeExecutionOrderByInput] = [
    {"queuedTime": "desc"},
    # Fallback: incomplete executions have no queuedTime.
    {"addedTime": "desc"},
]

EXECUTION_RESULT_INCLUDE: prisma.types.AgentNodeExecutionInclude = {
    "Input": {"order_by": {"time": "asc"}},
    "Output": {"order_by": {"time": "asc"}},
    "Node": True,
    "GraphExecution": True,
}

MAX_NODE_EXECUTIONS_FETCH = 1000
MAX_LIBRARY_AGENT_EXECUTIONS_FETCH = 10

# Default limits for potentially large result sets
MAX_CREDIT_REFUND_REQUESTS_FETCH = 100
MAX_INTEGRATION_WEBHOOKS_FETCH = 100
MAX_USER_API_KEYS_FETCH = 500
MAX_GRAPH_VERSIONS_FETCH = 50

GRAPH_EXECUTION_INCLUDE_WITH_NODES: prisma.types.AgentGraphExecutionInclude = {
    "NodeExecutions": {
        "include": EXECUTION_RESULT_INCLUDE,
        "order_by": EXECUTION_RESULT_ORDER,
        "take": MAX_NODE_EXECUTIONS_FETCH,  # Avoid loading excessive node executions.
    }
}


def graph_execution_include(
    include_block_ids: Sequence[str],
) -> prisma.types.AgentGraphExecutionInclude:
    return {
        "NodeExecutions": {
            **cast(
                prisma.types.FindManyAgentNodeExecutionArgsFromAgentGraphExecution,
                GRAPH_EXECUTION_INCLUDE_WITH_NODES["NodeExecutions"],  # type: ignore
            ),
            "where": {
                "Node": {
                    "is": {"AgentBlock": {"is": {"id": {"in": include_block_ids}}}}
                },
                "NOT": [
                    {"executionStatus": prisma.enums.AgentExecutionStatus.INCOMPLETE}
                ],
            },
        }
    }


AGENT_PRESET_INCLUDE: prisma.types.AgentPresetInclude = {
    "InputPresets": True,
    "Webhook": True,
}


INTEGRATION_WEBHOOK_INCLUDE: prisma.types.IntegrationWebhookInclude = {
    "AgentNodes": {"include": AGENT_NODE_INCLUDE},
    "AgentPresets": {"include": AGENT_PRESET_INCLUDE},
}


def library_agent_include(
    user_id: str,
    include_nodes: bool = True,
    include_executions: bool = True,
    execution_limit: int = MAX_LIBRARY_AGENT_EXECUTIONS_FETCH,
) -> prisma.types.LibraryAgentInclude:
    """
    Fully configurable includes for library agent queries with performance optimization.

    Args:
        user_id: User ID for filtering user-specific data
        include_nodes: Whether to include graph nodes (default: True, needed for get_sub_graphs)
        include_executions: Whether to include executions (default: True, safe with execution_limit)
        execution_limit: Limit on executions to fetch (default: MAX_LIBRARY_AGENT_EXECUTIONS_FETCH)

    The defaults maintain backward compatibility and safety: they include everything
    needed for all functionality. For performance optimization, explicitly set
    include_nodes=False and include_executions=False for listing views where the
    frontend fetches data separately.

    Performance impact:
    - Default (full nodes + limited executions): Original performance, works everywhere
    - Listing optimization (no nodes/executions): ~2s for 15 agents vs potential timeouts
    - Unlimited executions: varies by user (thousands of executions = timeouts)
    """
    result: prisma.types.LibraryAgentInclude = {
        "Creator": True,  # Always needed for creator info
    }

    # Build AgentGraph include based on requested options
    if include_nodes or include_executions:
        agent_graph_include = {}

        # Add nodes if requested (always full nodes)
        if include_nodes:
            agent_graph_include.update(AGENT_GRAPH_INCLUDE)  # Full nodes

        # Add executions if requested
        if include_executions:
            agent_graph_include["Executions"] = {
                "where": {"userId": user_id},
                "order_by": {"createdAt": "desc"},
                "take": execution_limit,
            }

        result["AgentGraph"] = cast(
            prisma.types.AgentGraphArgsFromLibraryAgent,
            {"include": agent_graph_include},
        )
    else:
        # Neither nodes nor executions requested: basic metadata only (fastest option)
        result["AgentGraph"] = True  # Basic graph metadata (name, description, id)

    return result
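
To make the configurations described in the `library_agent_include` docstring concrete, the following dependency-free sketch mirrors its branching with plain dicts instead of `prisma.types` annotations. `sketch_library_agent_include` is a hypothetical stand-in written for illustration only, and the user ID `"user-123"` is made up; the real function should be used for actual queries.

```python
from typing import Any

MAX_LIBRARY_AGENT_EXECUTIONS_FETCH = 10  # same value as the module constant

# Plain-dict copy of the node include used for "full nodes".
NODE_INCLUDE = {"Input": True, "Output": True, "Webhook": True, "AgentBlock": True}


def sketch_library_agent_include(
    user_id: str,
    include_nodes: bool = True,
    include_executions: bool = True,
    execution_limit: int = MAX_LIBRARY_AGENT_EXECUTIONS_FETCH,
) -> dict[str, Any]:
    """Illustrative stand-in that returns the same dict shapes without prisma."""
    result: dict[str, Any] = {"Creator": True}

    if not (include_nodes or include_executions):
        # Listing view: only basic graph metadata is fetched.
        result["AgentGraph"] = True
        return result

    graph_include: dict[str, Any] = {}
    if include_nodes:
        graph_include["Nodes"] = {"include": NODE_INCLUDE}
    if include_executions:
        graph_include["Executions"] = {
            "where": {"userId": user_id},
            "order_by": {"createdAt": "desc"},
            "take": execution_limit,  # caps how many executions are loaded
        }
    result["AgentGraph"] = {"include": graph_include}
    return result


# Listing optimization: no nodes, no executions.
listing = sketch_library_agent_include(
    "user-123", include_nodes=False, include_executions=False
)

# Defaults: full nodes plus the ten most recent executions for this user.
full = sketch_library_agent_include("user-123")
```

With both flags disabled, `listing` contains only `"Creator"` and a bare `"AgentGraph": True`; with the defaults, `full` nests both `"Nodes"` and a capped `"Executions"` entry under `"AgentGraph"`.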