Compare commits

..

16 Commits

Author SHA1 Message Date
openhands 4d97ae6c5d Remove AuthenticationError handling from integration managers
The AuthenticationError exception occurs when we don't have access to a repo
(e.g., user token lacks permissions, repo moved to different org). In this
situation, we cannot reliably send a message back to the user because:

1. The error happens when trying to fetch issue/PR data using the user's token
2. Sending messages requires the GitHub App installation token for the repo
3. If we don't have access to the repo, the installation token likely won't
   work either (especially if repo moved to different org)
4. Setting a helpful message that we can't actually deliver is deceptive

Instead, let AuthenticationError fall through to the generic exception handler,
which will:
- Log the full error with stack trace for debugging
- Attempt to send the generic 'Uh oh!' error message (which may also fail)
- At least be honest that we can't help the user in this scenario

This reverts the AuthenticationError handling from GitHub, GitLab, and Slack
integration managers.

Addresses review feedback from @neubig on PR #11473.
2025-12-03 04:26:51 +00:00
Graham Neubig eeb81ecc49 Merge branch 'main' into openhands/fix-github-resolver-auth-error 2025-12-02 23:20:25 -05:00
Hiep Le eaea8b3ce1 fix(frontend): buying credits does not work on staging (#11873) 2025-12-03 10:07:01 +07:00
Tim O'Farrell 72555e0f1c APP-193: add X-Access-Token header support to get_api_key_from_header (#11872)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-02 17:01:09 -07:00
Hiep Le fd13c91387 fix(backend): apply user-defined condenser_max_size in new v1 conversations (#11862) 2025-12-03 00:24:25 +07:00
Hiep Le 6139e39449 fix(backend): git settings not applying in v1 conversations (#11866) 2025-12-02 21:34:37 +07:00
Hiep Le f76ac242f0 fix(backend): conversation statistics are currently not being persisted to the database (V1). (#11837) 2025-12-02 21:22:02 +07:00
Hiep Le 1f9350320f refactor(frontend): hide agent dropdown when v1 is enabled (#11860) 2025-12-02 20:22:40 +07:00
Hiep Le 1a3460ba06 fix(frontend): image attachments not working in v1 conversations (#11864) 2025-12-02 20:22:14 +07:00
Tim O'Farrell 8f361b3698 Fix git checkout error in workspace setup (#11855)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-01 23:01:30 +00:00
Tim O'Farrell fd6e0cab3f Fix V1 MCP services (Fix tavily search) (#11840)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-01 21:19:19 +00:00
Hiep Le 33eec7cb09 feat(frontend): automatically scroll to bottom of container on plan content update (#11808)
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2025-12-01 16:23:48 +00:00
Hiep Le 6c2862ae08 feat(frontend): add handler for 'create a plan' button click (#11806) 2025-12-01 11:08:00 -05:00
openhands 4d0f2e7a6d Remove AuthenticationError handling from Jira and Linear integrations
- Reverted changes to Linear, Jira, and Jira DC managers
- Keep AuthenticationError handling only in GitHub, GitLab, and Slack managers

Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-21 23:37:08 +00:00
openhands 2c6d1e97e8 Extend authentication error handling to all integrations
- Updated GitHub manager to use @username format in error message
- Added AuthenticationError handling to GitLab, Linear, Jira, Jira DC, and Slack managers
- All integrations now display user-friendly message directing users to add repo at app.all-hands.dev
- Consistent error handling pattern across all integration managers

Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-21 23:29:09 +00:00
openhands 180557265f Fix GitHub resolver authentication error handling
- Add AuthenticationError catch block in github_manager.py start_job method
- Display user-friendly message when GitHub API returns 401 Unauthorized
- Provide clear instructions to users about adding the repo to OpenHands app
- Prevents opaque 'Uh oh!' error message when authentication fails

Fixes #11472
2025-10-21 23:13:04 +00:00
30 changed files with 2781 additions and 1939 deletions
-836
View File
@@ -1,836 +0,0 @@
# Custom Agent Packages with Custom Runtime Images (Scenario 1)
## 1. Introduction
### 1.1 Problem Statement
OpenHands currently supports agent customization through the software-agent-sdk, but users who need custom system dependencies, specialized tools, or non-Python runtime environments cannot easily deploy their agents. The current V1 architecture uses a fixed agent server image (`ghcr.io/openhands/agent-server:5f62cee-python`) that may not contain the required dependencies for specialized agents.
Users building agents that require:
- Custom system packages (e.g., specialized compilers, databases, ML frameworks)
- Non-Python tools and runtimes (e.g., Node.js, Go, Rust toolchains)
- Custom Docker base images with specific OS configurations
- Proprietary or licensed software installations
Currently have no supported path to deploy their agents to OpenHands Enterprise.
### 1.2 Proposed Solution
We propose extending the existing **Sandbox Specification System** to support custom agent runtime images with proper permissions and security controls. This approach builds directly on OpenHands' current sandbox infrastructure rather than creating parallel systems.
Users will be able to:
1. Create custom Docker images containing their agent code and dependencies
2. Register these images as enhanced sandbox specifications with rich metadata
3. Deploy conversations using their custom sandbox specs (with proper permissions)
4. Maintain full compatibility with existing sandbox management and API infrastructure
The solution extends the current `SandboxSpecService` with:
- **Permission-based access control** to limit custom specs to authorized users
- **Enhanced sandbox specifications** that include agent-specific metadata and requirements
- **Secure image management** with validation and approval workflows
- **Integrated deployment** through existing conversation creation APIs
**Trade-offs**: This approach requires users to build and maintain Docker images, increasing complexity compared to simple Python package deployment. However, it provides the necessary isolation and dependency management for complex agent requirements while leveraging proven sandbox infrastructure.
## 2. User Interface
### 2.1 Custom Agent Image Creation
Users create a custom agent image by extending the base agent server image:
```dockerfile
# Dockerfile for custom agent
FROM ghcr.io/openhands/agent-server:5f62cee-python
# Install custom system dependencies
RUN apt-get update && apt-get install -y \
nodejs \
npm \
golang-go \
&& rm -rf /var/lib/apt/lists/*
# Install custom Python packages
COPY requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt
# Copy custom agent code
COPY my_custom_agent/ /app/my_custom_agent/
COPY agent_config.json /app/config/
# Set custom agent as default
ENV CUSTOM_AGENT_MODULE=my_custom_agent
ENV CUSTOM_AGENT_CLASS=MySpecializedAgent
```
### 2.2 Enhanced Sandbox Spec Registration
Users register their custom agent image as an enhanced sandbox specification:
```yaml
# enhanced-sandbox-spec.yaml
apiVersion: openhands.ai/v1
kind: SandboxSpec
metadata:
name: specialized-ml-agent
version: "1.0.0"
owner: user@company.com
permissions:
users: ["user@company.com", "team-lead@company.com"]
groups: ["ml-team", "data-science"]
spec:
image: "myregistry/specialized-ml-agent:v1.0.0"
description: "ML agent with TensorFlow and custom data processing tools"
# Agent-specific metadata
agent:
capabilities:
- machine_learning
- data_analysis
- custom_visualization
type: "custom"
module: "agents.specialized_ml_agent"
class: "SpecializedMLAgent"
requirements:
memory: "4Gi"
cpu: "2"
environment:
TENSORFLOW_VERSION: "2.15.0"
CUSTOM_MODEL_PATH: "/app/models"
# Agent server configuration
CUSTOM_AGENT_MODULE: "agents.specialized_ml_agent"
CUSTOM_AGENT_CLASS: "SpecializedMLAgent"
ports:
- name: agent-server
port: 8000
- name: tensorboard
port: 6006
```
### 2.3 Conversation Creation with Custom Sandbox Spec
Users create conversations using their custom sandbox specs through the existing API:
```bash
# Create conversation with custom sandbox spec
curl -X POST "https://api.openhands.ai/api/conversations" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"sandbox_spec_id": "specialized-ml-agent:v1.0.0",
"initial_message": "Analyze this dataset and create a predictive model",
"workspace": {
"type": "local",
"working_dir": "/workspace/ml-project"
}
}'
```
### 2.4 Image Management Workflows
#### 2.4.1 Pre-built Image Approach
For organizations that want to manage custom agent images centrally:
```bash
# Admin registers pre-built image as sandbox spec
curl -X POST "https://api.openhands.ai/api/sandbox-specs" \
-H "Authorization: Bearer $ADMIN_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "company-ml-agent",
"version": "1.0.0",
"image": "company-registry/ml-agent:v1.0.0",
"permissions": {
"groups": ["ml-team", "data-science"]
},
"agent": {
"type": "custom",
"capabilities": ["machine_learning", "data_analysis"]
}
}'
```
#### 2.4.2 User Upload Approach
For users who want to upload their own custom images:
```bash
# User uploads custom image (with security validation)
curl -X POST "https://api.openhands.ai/api/sandbox-specs/upload" \
-H "Authorization: Bearer $API_KEY" \
-F "dockerfile=@Dockerfile" \
-F "context=@agent-context.tar.gz" \
-F "spec=@sandbox-spec.yaml"
```
## 3. Other Context
### 3.1 Current Sandbox Specification System
OpenHands V1 uses a sandbox specification system to manage container deployments:
- **Single Default Spec**: Currently only one sandbox spec exists, shared by all users
- **SandboxSpecService**: Manages sandbox specifications and container creation
- **SandboxSpecInfo**: Contains image, environment, and resource configuration
- **No Permissions**: Current system lacks user-based access control
The existing system provides the foundation but needs enhancement for custom agents:
- **Permission Layer**: Required to control access to custom specs
- **Rich Metadata**: Need agent-specific information beyond basic container config
- **Image Management**: Need secure workflows for custom image registration
### 3.2 Enhanced Sandbox Specification Architecture
Our proposal extends the existing system with:
#### 3.2.1 Permission-Based Access Control
- **User Permissions**: Individual user access to specific sandbox specs
- **Group Permissions**: Team-based access control for organizational specs
- **Owner Management**: Spec ownership and delegation capabilities
- **Admin Override**: Administrative access for spec management
#### 3.2.2 Agent-Specific Metadata
- **Agent Configuration**: Module, class, and capability information
- **Resource Requirements**: Memory, CPU, and storage specifications
- **Environment Variables**: Agent-specific configuration and secrets
- **Port Mappings**: Additional ports for agent services (e.g., TensorBoard)
#### 3.2.3 Image Management Integration
- **Registry Support**: Integration with Docker registries for image storage
- **Security Validation**: Image scanning and approval workflows
- **Version Management**: Support for multiple versions of custom specs
- **Build Integration**: Optional image building from Dockerfile uploads
### 3.3 Existing Container Orchestration Integration
The enhanced system leverages existing OpenHands infrastructure:
- **Sandbox Service**: Extended to support permission checks and enhanced specs
- **Container Management**: Same lifecycle management with additional metadata
- **Network Isolation**: Maintains existing security boundaries
- **Resource Enforcement**: Enhanced with custom resource requirements
- **Health Monitoring**: Extended to track custom agent-specific metrics
## 4. Technical Design
### 4.1 Enhanced Sandbox Specification Model
#### 4.1.1 Extended SandboxSpecInfo Structure
The existing `SandboxSpecInfo` model is enhanced to support custom agents:
```python
# openhands/app_server/sandbox/sandbox_spec_models.py (enhanced)
from pydantic import BaseModel, Field
from typing import Dict, List, Optional
class AgentMetadata(BaseModel):
"""Agent-specific metadata for custom agents."""
type: str = Field(default="default", description="Agent type (default|custom)")
capabilities: List[str] = Field(default_factory=list, description="Agent capabilities")
module: Optional[str] = Field(description="Python module containing agent class")
class_name: Optional[str] = Field(description="Agent class name")
class PermissionSpec(BaseModel):
"""Permission specification for sandbox spec access."""
users: List[str] = Field(default_factory=list, description="Authorized user emails")
groups: List[str] = Field(default_factory=list, description="Authorized group names")
owner: Optional[str] = Field(description="Spec owner")
class EnhancedSandboxSpecInfo(BaseModel):
"""Enhanced sandbox specification with agent metadata and permissions."""
# Existing fields from SandboxSpecInfo
id: str = Field(description="Docker image identifier")
command: List[str] = Field(default_factory=lambda: ['--port', '8000'])
initial_env: Dict[str, str] = Field(default_factory=dict)
working_dir: str = Field(default="/workspace/project")
# Enhanced fields
name: str = Field(description="Human-readable spec name")
version: str = Field(description="Spec version")
description: Optional[str] = Field(description="Spec description")
# Agent-specific metadata
agent: AgentMetadata = Field(default_factory=AgentMetadata)
# Permission and access control
permissions: PermissionSpec = Field(default_factory=PermissionSpec)
# Resource requirements
memory_limit: Optional[str] = Field(description="Memory limit (e.g., '4Gi')")
cpu_limit: Optional[str] = Field(description="CPU limit (e.g., '2')")
# Additional ports for custom services
ports: List[Dict[str, any]] = Field(
default_factory=lambda: [{"name": "agent-server", "port": 8000}]
)
```
#### 4.1.2 Custom Agent Image Structure
Custom agent images extend the base agent server with this structure:
```
/app/
├── config/
│ ├── agent_config.json # Agent configuration
│ └── tool_registry.json # Custom tool definitions (optional)
├── agents/
│ └── custom_agent.py # Agent implementation
├── tools/ # Custom tools (optional)
│ ├── __init__.py
│ └── custom_tools.py
└── startup/
└── init_agent.py # Agent initialization script
```
### 4.2 Agent Implementation Interface
#### 4.2.1 Custom Agent Base Class
```python
# agents/custom_agent.py
from openhands.sdk.agent.base import AgentBase
from openhands.sdk.llm import LLM
from openhands.sdk.tool import Tool
from typing import List, Dict, Any
class SpecializedMLAgent(AgentBase):
"""Custom ML agent with TensorFlow capabilities."""
def __init__(
self,
llm: LLM,
tools: List[Tool],
config: Dict[str, Any] = None
):
super().__init__(llm=llm, tools=tools)
self.config = config or {}
self.model_cache = self.config.get('MODEL_CACHE_DIR', '/app/models')
async def initialize(self) -> None:
"""Initialize custom agent resources."""
# Load pre-trained models
await self._load_models()
# Initialize custom tools
await self._setup_custom_tools()
async def _load_models(self) -> None:
"""Load TensorFlow models from cache."""
import tensorflow as tf
# Custom model loading logic
pass
async def _setup_custom_tools(self) -> None:
"""Initialize custom tools with agent context."""
# Custom tool setup logic
pass
```
#### 4.2.2 Custom Tool Implementation
```python
# tools/custom_tools.py
from openhands.sdk.tool import Tool, ToolExecutor, register_tool
from openhands.sdk import Action, Observation
from pydantic import Field
import tensorflow as tf
class TensorFlowAnalysisAction(Action):
dataset_path: str = Field(description="Path to dataset file")
model_type: str = Field(description="Type of ML model to create")
target_column: str = Field(description="Target column for prediction")
class TensorFlowAnalysisObservation(Observation):
model_accuracy: float = Field(description="Model accuracy score")
feature_importance: Dict[str, float] = Field(description="Feature importance scores")
model_path: str = Field(description="Path to saved model")
class TensorFlowToolExecutor(ToolExecutor[TensorFlowAnalysisAction, TensorFlowAnalysisObservation]):
def __call__(self, action: TensorFlowAnalysisAction, conversation=None) -> TensorFlowAnalysisObservation:
# Custom TensorFlow analysis logic
model = self._create_model(action.dataset_path, action.model_type, action.target_column)
accuracy = self._evaluate_model(model)
importance = self._get_feature_importance(model)
model_path = self._save_model(model)
return TensorFlowAnalysisObservation(
model_accuracy=accuracy,
feature_importance=importance,
model_path=model_path
)
# Register the custom tool
register_tool(
Tool(
name="TensorFlowTool",
executor=TensorFlowToolExecutor(),
definition=ToolDefinition(
name="tensorflow_analysis",
description="Perform machine learning analysis using TensorFlow",
parameters=TensorFlowAnalysisAction.model_json_schema()
)
)
)
```
### 4.3 Runtime Integration
#### 4.3.1 Custom Agent Loader
```python
# startup/init_agent.py
import json
import importlib
from pathlib import Path
from openhands.sdk.agent.base import AgentBase
from openhands.sdk.llm import LLM
from openhands.sdk.tool import Tool, resolve_tool
class CustomAgentLoader:
"""Loads custom agents from configuration."""
def __init__(self, config_path: str = "/app/config/agent_config.json"):
self.config_path = Path(config_path)
self.config = self._load_config()
def _load_config(self) -> dict:
"""Load agent configuration from JSON file."""
with open(self.config_path) as f:
return json.load(f)
def create_agent(self, llm: LLM) -> AgentBase:
"""Create custom agent instance."""
agent_config = self.config["agent"]
# Import custom agent class
module = importlib.import_module(agent_config["module"])
agent_class = getattr(module, agent_config["class"])
# Load custom tools
tools = self._load_tools()
# Create agent instance
agent = agent_class(
llm=llm,
tools=tools,
config=self.config.get("environment", {})
)
return agent
def _load_tools(self) -> List[Tool]:
"""Load and resolve custom tools."""
tools = []
for tool_config in self.config.get("tools", []):
if "module" in tool_config:
# Import custom tool module to register it
importlib.import_module(tool_config["module"])
tool = resolve_tool(tool_config["name"])
tools.append(tool)
return tools
```
#### 4.3.2 Agent Server Startup Integration
```python
# Modified agent server startup in software-agent-sdk
import os
from openhands.agent_server.api import app
from openhands.agent_server.conversation_service import ConversationService
from startup.init_agent import CustomAgentLoader
@app.on_event("startup")
async def startup_event():
"""Initialize custom agent during server startup."""
# Check for custom agent configuration
custom_agent_module = os.getenv('CUSTOM_AGENT_MODULE')
custom_agent_class = os.getenv('CUSTOM_AGENT_CLASS')
if custom_agent_module and custom_agent_class:
# Load custom agent
loader = CustomAgentLoader()
app.state.agent_factory = loader.create_agent
print(f"Loaded custom agent: {custom_agent_class}")
else:
# Use default agent
from openhands.sdk.agent import Agent
app.state.agent_factory = lambda llm: Agent(llm=llm, tools=get_default_tools())
print("Using default OpenHands agent")
```
### 4.4 Enhanced Sandbox Service Integration
#### 4.4.1 Permission-Aware Sandbox Service
```python
# openhands/app_server/sandbox/enhanced_sandbox_spec_service.py
from openhands.app_server.sandbox.sandbox_spec_service import SandboxSpecService
from openhands.app_server.sandbox.sandbox_spec_models import SandboxSpecInfo, EnhancedSandboxSpecInfo
from typing import Dict, List, Optional
class EnhancedSandboxSpecService(SandboxSpecService):
"""Enhanced sandbox service with permissions and custom agent support."""
def __init__(self, spec_registry: Dict[str, EnhancedSandboxSpecInfo]):
super().__init__()
self.spec_registry = spec_registry
def get_available_sandbox_specs(self, user_email: str, user_groups: List[str]) -> List[str]:
"""Get sandbox specs available to the user based on permissions."""
available_specs = []
for spec_key, spec in self.spec_registry.items():
if self._has_permission(spec, user_email, user_groups):
available_specs.append(spec_key)
return available_specs
def get_sandbox_spec_by_id(
self,
spec_id: str,
user_email: str,
user_groups: List[str]
) -> SandboxSpecInfo:
"""Get sandbox spec by ID with permission check."""
if spec_id not in self.spec_registry:
# Fall back to default specs for backward compatibility
return super().get_default_sandbox_specs()[0]
enhanced_spec = self.spec_registry[spec_id]
# Check permissions
if not self._has_permission(enhanced_spec, user_email, user_groups):
raise PermissionError(f"User {user_email} does not have access to spec {spec_id}")
# Convert to SandboxSpecInfo for existing infrastructure
return self._convert_to_sandbox_spec_info(enhanced_spec)
def _has_permission(
self,
spec: EnhancedSandboxSpecInfo,
user_email: str,
user_groups: List[str]
) -> bool:
"""Check if user has permission to use the sandbox spec."""
# Owner always has access
if spec.permissions.owner == user_email:
return True
# Check user permissions
if user_email in spec.permissions.users:
return True
# Check group permissions
for group in user_groups:
if group in spec.permissions.groups:
return True
return False
def _convert_to_sandbox_spec_info(self, enhanced_spec: EnhancedSandboxSpecInfo) -> SandboxSpecInfo:
"""Convert enhanced spec to standard SandboxSpecInfo."""
# Build environment variables including agent configuration
env_vars = {
'OPENVSCODE_SERVER_ROOT': '/openhands/.openvscode-server',
'OH_ENABLE_VNC': '0',
'LOG_JSON': 'true',
'OH_CONVERSATIONS_PATH': '/workspace/conversations',
'OH_BASH_EVENTS_DIR': '/workspace/bash_events',
'PYTHONUNBUFFERED': '1',
'ENV_LOG_LEVEL': '20',
**enhanced_spec.initial_env
}
# Add custom agent configuration if specified
if enhanced_spec.agent.type == "custom":
env_vars.update({
'CUSTOM_AGENT_MODULE': enhanced_spec.agent.module,
'CUSTOM_AGENT_CLASS': enhanced_spec.agent.class_name,
})
return SandboxSpecInfo(
id=enhanced_spec.id,
command=enhanced_spec.command,
initial_env=env_vars,
working_dir=enhanced_spec.working_dir,
)
def register_sandbox_spec(
self,
spec: EnhancedSandboxSpecInfo,
admin_user: str
) -> str:
"""Register a new sandbox spec (admin only)."""
spec_key = f"{spec.name}:{spec.version}"
# Validate spec
self._validate_sandbox_spec(spec)
# Store in registry
self.spec_registry[spec_key] = spec
return spec_key
def _validate_sandbox_spec(self, spec: EnhancedSandboxSpecInfo) -> None:
"""Validate sandbox spec for security and correctness."""
# Image validation
if not spec.id or not spec.id.strip():
raise ValueError("Image ID cannot be empty")
# Permission validation
if not spec.permissions.owner:
raise ValueError("Sandbox spec must have an owner")
# Agent validation for custom agents
if spec.agent.type == "custom":
if not spec.agent.module or not spec.agent.class_name:
raise ValueError("Custom agents must specify module and class_name")
```
### 4.5 Enhanced API Integration
#### 4.5.1 Enhanced Conversation Creation
```python
# openhands/server/routes/conversation_routes.py (enhanced)
from fastapi import APIRouter, HTTPException, Depends
from pydantic import BaseModel
from typing import Optional, Dict, Any, List
from uuid import UUID
from openhands.app_server.sandbox.enhanced_sandbox_spec_service import EnhancedSandboxSpecService
from openhands.server.session.agent_session import AgentSession
from openhands.server.auth import get_current_user, get_user_groups
# Enhanced conversation creation request
class CreateConversationRequest(BaseModel):
initial_message: str
workspace_config: Optional[Dict[str, Any]] = None
# New field for custom sandbox spec
sandbox_spec_id: Optional[str] = None
@router.post("/conversations")
async def create_conversation(
request: CreateConversationRequest,
current_user: str = Depends(get_current_user),
user_groups: List[str] = Depends(get_user_groups),
sandbox_service: EnhancedSandboxSpecService = Depends(get_enhanced_sandbox_service)
) -> ConversationResponse:
"""Create conversation with optional custom sandbox spec."""
try:
if request.sandbox_spec_id:
# Use custom sandbox spec with permission check
sandbox_spec = sandbox_service.get_sandbox_spec_by_id(
request.sandbox_spec_id,
current_user,
user_groups
)
else:
# Use default sandbox spec
sandbox_spec = sandbox_service.get_default_sandbox_specs()[0]
# Create sandbox and conversation
sandbox = await sandbox_service.create_sandbox(sandbox_spec)
await wait_for_agent_server_ready(sandbox)
conversation = await create_conversation_with_sandbox(
sandbox=sandbox,
initial_message=request.initial_message,
workspace_config=request.workspace_config
)
return ConversationResponse(
conversation_id=conversation.id,
status="created",
sandbox_spec_id=request.sandbox_spec_id or "default"
)
except PermissionError as e:
raise HTTPException(status_code=403, detail=str(e))
except ValueError as e:
raise HTTPException(status_code=404, detail=str(e))
```
#### 4.5.2 Sandbox Spec Management API
```python
# openhands/server/routes/sandbox_spec_routes.py (new)
from fastapi import APIRouter, HTTPException, Depends, UploadFile, File
from pydantic import BaseModel
from typing import List, Optional
import yaml
from openhands.app_server.sandbox.enhanced_sandbox_spec_service import EnhancedSandboxSpecService
from openhands.app_server.sandbox.sandbox_spec_models import EnhancedSandboxSpecInfo
from openhands.server.auth import get_current_user, get_user_groups, require_admin
router = APIRouter(prefix="/api/sandbox-specs", tags=["Sandbox Specs"])
@router.get("/")
async def list_available_sandbox_specs(
current_user: str = Depends(get_current_user),
user_groups: List[str] = Depends(get_user_groups),
sandbox_service: EnhancedSandboxSpecService = Depends(get_enhanced_sandbox_service)
) -> List[str]:
"""List sandbox specs available to the current user."""
return sandbox_service.get_available_sandbox_specs(current_user, user_groups)
@router.post("/")
async def register_sandbox_spec(
spec_data: EnhancedSandboxSpecInfo,
current_user: str = Depends(require_admin),
sandbox_service: EnhancedSandboxSpecService = Depends(get_enhanced_sandbox_service)
) -> Dict[str, str]:
"""Register a new sandbox spec (admin only)."""
try:
spec_key = sandbox_service.register_sandbox_spec(spec_data, current_user)
return {"spec_id": spec_key, "status": "registered"}
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
@router.post("/upload")
async def upload_custom_image(
dockerfile: UploadFile = File(...),
context: UploadFile = File(...),
spec: UploadFile = File(...),
current_user: str = Depends(get_current_user),
sandbox_service: EnhancedSandboxSpecService = Depends(get_enhanced_sandbox_service)
) -> Dict[str, str]:
"""Upload custom image with Dockerfile and context (with security validation)."""
try:
# Parse spec file
spec_content = await spec.read()
spec_data = yaml.safe_load(spec_content)
# Validate user has permission to create specs
if not _can_user_create_specs(current_user):
raise HTTPException(status_code=403, detail="User not authorized to create custom specs")
# Security validation of Dockerfile
dockerfile_content = await dockerfile.read()
_validate_dockerfile_security(dockerfile_content)
# Build image (implementation depends on build system)
image_id = await _build_custom_image(dockerfile_content, context, current_user)
# Create enhanced spec
enhanced_spec = EnhancedSandboxSpecInfo(**spec_data)
enhanced_spec.id = image_id
enhanced_spec.permissions.owner = current_user
# Register the spec
spec_key = sandbox_service.register_sandbox_spec(enhanced_spec, current_user)
return {"spec_id": spec_key, "image_id": image_id, "status": "uploaded"}
except Exception as e:
raise HTTPException(status_code=400, detail=f"Upload failed: {str(e)}")
```
## 5. Implementation Plan
All implementation must pass existing lints and tests. New functionality requires comprehensive test coverage including unit tests, integration tests, and end-to-end scenarios.
### 5.1 Enhanced Sandbox Models and Permissions (M1)
#### 5.1.1 Enhanced Sandbox Specification Models
* `openhands/app_server/sandbox/sandbox_spec_models.py` (enhanced)
* `tests/unit/app_server/sandbox/test_enhanced_sandbox_spec_models.py`
Extend existing `SandboxSpecInfo` with `EnhancedSandboxSpecInfo` including agent metadata, permissions, and resource requirements. This is the **core requirement** identified by the engineer.
#### 5.1.2 Permission System Foundation
* `openhands/server/auth/permissions.py`
* `tests/unit/server/auth/test_permissions.py`
Implement user and group-based permission system for sandbox spec access control. This addresses the **security concerns** from V0 mentioned by the engineer.
**Demo**: Create enhanced sandbox specs with permission restrictions and verify access control works correctly.
### 5.2 Enhanced Sandbox Service (M2)
#### 5.2.1 Permission-Aware Sandbox Service
* `openhands/app_server/sandbox/enhanced_sandbox_spec_service.py`
* `tests/unit/app_server/sandbox/test_enhanced_sandbox_spec_service.py`
Extend existing `SandboxSpecService` with permission checks and enhanced spec management. This **builds on existing infrastructure** as the engineer suggested.
#### 5.2.2 Agent Server Startup Integration
* `openhands-agent-server/openhands/agent_server/custom_agent_loader.py`
* `tests/unit/agent_server/test_custom_agent_loader.py`
Implement custom agent loading mechanism in agent server startup process with configuration-driven agent instantiation.
**Demo**: Deploy custom agents using enhanced sandbox specs and verify permission-based access control works end-to-end.
### 5.3 Image Management and API Integration (M3)
#### 5.3.1 Secure Image Management
* `openhands/app_server/sandbox/image_builder.py`
* `openhands/app_server/security/dockerfile_validator.py`
* `tests/unit/app_server/sandbox/test_image_builder.py`
* `tests/unit/app_server/security/test_dockerfile_validator.py`
Implement both **pre-built image registration** and **secure user upload** workflows as identified by the engineer. This addresses the security issues from V0.
#### 5.3.2 Enhanced Conversation API
* `openhands/server/routes/conversation_routes.py` (enhanced)
* `openhands/server/routes/sandbox_spec_routes.py` (new)
* `tests/unit/server/routes/test_enhanced_conversation_routes.py`
* `tests/unit/server/routes/test_sandbox_spec_routes.py`
Enhance existing conversation creation API to support `sandbox_spec_id` parameter and add new sandbox spec management endpoints.
**Demo**: Create conversations with custom sandbox specs through existing API endpoints and demonstrate both pre-built and user-uploaded image workflows.
### 5.4 Advanced Security and Management (M4)
#### 5.4.1 Image Security Validation
* `openhands/app_server/security/image_scanner.py`
* `openhands/app_server/security/security_policies.py`
* `tests/unit/app_server/security/test_image_scanner.py`
Implement comprehensive security validation including image vulnerability scanning, Dockerfile analysis, and approval workflows.
#### 5.4.2 Spec Registry and Lifecycle Management
* `openhands/app_server/sandbox/spec_registry.py`
* `openhands/app_server/sandbox/spec_lifecycle.py`
* `tests/unit/app_server/sandbox/test_spec_registry.py`
Add persistent storage for enhanced sandbox specs, version management, and lifecycle policies (deprecation, cleanup).
**Demo**: Deploy multiple custom agents with different permission levels, demonstrate security validation workflows, and show proper spec lifecycle management.
---
## Key Alignment with Engineer's Approach
This revised implementation plan directly addresses the engineer's requirements:
1. **✅ Uses existing sandbox specs system** - Enhanced rather than replaced
2. **✅ Permissions as core requirement** - Moved to M1 instead of M4
3. **✅ Two image management approaches** - Pre-built registration and secure user uploads
4. **✅ Security-first design** - Addresses V0 security issues with comprehensive validation
5. **✅ Minimal infrastructure changes** - Builds on existing `SandboxSpecService` and conversation APIs
-934
View File
@@ -1,934 +0,0 @@
# Dynamic Custom Agent Package Loading (Scenario 2)
## 1. Introduction
### 1.1 Problem Statement
OpenHands V1 architecture uses a fixed agent server image (`ghcr.io/openhands/agent-server:5f62cee-python`) that contains the default agent implementation. Users who want to customize agent behavior with pure Python packages that don't require additional system dependencies currently have no supported mechanism to deploy their custom agents without building entirely new Docker images.
This creates unnecessary complexity for the common use case where users simply want to:
- Customize agent prompts and reasoning logic
- Add new Python-based tools and capabilities
- Integrate with Python APIs and libraries already available in the base image
- Deploy agents with different LLM configurations or specialized workflows
The current approach forces all customization through the heavyweight Scenario 1 path (custom Docker images), even when the base agent server image already contains all necessary dependencies.
### 1.2 Proposed Solution
We propose implementing **Dynamic Custom Agent Package Loading** within the existing V1 agent server container. This allows users to deploy custom agents by providing Python packages that are downloaded, installed, and instantiated at runtime without requiring custom Docker images.
Users will be able to:
1. Package their custom agents as standard Python packages (pip-installable)
2. Specify agent package URLs (Git repositories, PyPI packages, or ZIP archives) in conversation creation
3. Have the agent server dynamically download and install the package at startup
4. Instantiate their custom agent within the existing container environment
5. Maintain full compatibility with the existing HTTP API (`/ask_agent` endpoint)
The solution leverages the existing V1 architecture's agent server container but extends the startup process to support dynamic agent loading based on environment configuration.
**Trade-offs**: This approach is limited to Python packages that can run within the existing agent server environment. Users needing custom system dependencies, non-Python tools, or different base images must use Scenario 1. However, this covers the majority of agent customization use cases with significantly reduced complexity.
## 2. User Interface
### 2.1 Custom Agent Package Structure
Users create a standard Python package with the required interface:
```python
# my_custom_agent/
setup.py
requirements.txt # Optional additional dependencies
my_custom_agent/
__init__.py
agent.py # Main agent implementation
tools.py # Custom tools (optional)
config.py # Agent configuration
```
### 2.2 Agent Implementation
```python
# my_custom_agent/agent.py
from openhands.sdk.agent.base import AgentBase
from openhands.sdk.llm import LLM
from openhands.sdk.tool import Tool
from typing import List, Dict, Any
class MyCustomAgent(AgentBase):
"""Custom agent with specialized behavior."""
def __init__(self, llm: LLM, tools: List[Tool], config: Dict[str, Any] = None):
super().__init__(llm=llm, tools=tools)
self.config = config or {}
async def initialize(self) -> None:
"""Initialize custom agent resources."""
# Custom initialization logic
pass
# Factory function for agent creation
def create_agent(llm: LLM, tools: List[Tool], config: Dict[str, Any] = None) -> AgentBase:
"""Factory function to create the custom agent."""
return MyCustomAgent(llm=llm, tools=tools, config=config)
```
### 2.3 Package Entry Point
```python
# my_custom_agent/__init__.py
from .agent import create_agent, MyCustomAgent
__all__ = ['create_agent', 'MyCustomAgent']
```
### 2.4 Setup Configuration
```python
# setup.py
from setuptools import setup, find_packages
setup(
name="my-custom-agent",
version="1.0.0",
packages=find_packages(),
install_requires=[
# Only additional dependencies beyond base image
"requests>=2.25.0",
"beautifulsoup4>=4.9.0",
],
entry_points={
'openhands.agents': [
'my_custom_agent = my_custom_agent:create_agent',
],
},
)
```
### 2.5 Conversation Creation with Dynamic Agent Loading
Users create conversations by specifying the agent package URL:
```bash
# Create conversation with custom agent package
curl -X POST "https://api.openhands.ai/api/conversations" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"agent_package_url": "git+https://github.com/user/my-custom-agent.git",
"initial_message": "Help me analyze this codebase",
"workspace": {
"type": "local",
"working_dir": "/workspace/project"
}
}'
```
Alternative package sources:
```bash
# PyPI package
"agent_package_url": "my-custom-agent==1.0.0"
# ZIP archive
"agent_package_url": "https://example.com/agents/my-custom-agent.zip"
# Private Git repository
"agent_package_url": "git+https://token@github.com/private/agent.git"
```
## 3. Other Context
### 3.1 Current V1 Architecture Overview
The OpenHands V1 architecture follows a distributed service model with clear separation between the main application server and agent execution environment. Understanding this architecture is crucial for implementing dynamic agent loading.
#### 3.1.1 Service Separation
The V1 system consists of three primary components:
1. **Main Server** (`openhands/app_server/`): Handles user requests, conversation management, and sandbox orchestration
2. **Agent Server** (`software-agent-sdk/openhands-agent-server/`): Executes agent logic and manages conversation state
3. **Action Execution Server**: Handles tool execution (bash commands, file operations) within sandboxed environments
#### 3.1.2 Communication Flow
The current communication pattern follows this sequence:
```
User Request → Main Server → HTTP API → Agent Server → Agent Instance → Tools → Response
```
This separation allows for:
- **Isolation**: Agent execution is isolated from the main application
- **Scalability**: Multiple agent servers can be spawned for different conversations
- **Security**: Sandboxed execution prevents agent actions from affecting the host system
- **Flexibility**: Different agent configurations can be deployed without affecting the main server
#### 3.1.3 Container Orchestration
The main server uses `DockerSandboxSpecService` to create and manage agent server containers:
- **Image Selection**: Currently hardcoded to `ghcr.io/openhands/agent-server:5f62cee-python`
- **Environment Configuration**: Passed via `initial_env` in `SandboxSpecInfo`
- **Network Isolation**: Each conversation gets its own container instance
- **Resource Management**: Memory and CPU limits enforced at container level
### 3.2 Agent Server Internal Architecture
#### 3.2.1 FastAPI Application Structure
The agent server is built as a FastAPI application with these key components:
- **Conversation Router** (`conversation_router.py`): Handles HTTP endpoints for agent interaction
- **Conversation Service** (`conversation_service.py`): Manages conversation lifecycle and state
- **Event Service** (`event_service.py`): Processes agent actions and observations
- **Dependencies** (`dependencies.py`): Provides dependency injection for services
#### 3.2.2 Agent Instantiation Pattern
Currently, agents are instantiated during server startup using a fixed pattern:
```python
# Simplified current pattern
agent = Agent(
llm=LLM(model="default-model", api_key="..."),
tools=[TerminalTool(), FileEditorTool(), ...]
)
```
This creates a single agent instance that serves all requests to that container.
#### 3.2.3 Request Processing Flow
When the `/ask_agent` endpoint receives a request:
1. **Request Validation**: `AskAgentRequest` is validated and parsed
2. **Conversation Lookup**: Conversation state is retrieved or created
3. **Agent Invocation**: The fixed agent instance processes the question
4. **Response Formatting**: Result is wrapped in `AskAgentResponse`
5. **HTTP Response**: JSON response sent back to main server
### 3.3 Software Agent SDK Integration Points
#### 3.3.1 Agent Interface Requirements
The `software-agent-sdk` defines the contract that all agents must follow:
- **AgentBase**: Abstract base class requiring `llm` and `tools` parameters
- **Tool Integration**: Agents must work with the standardized tool system
- **Event Handling**: Agents process events through the conversation framework
- **State Management**: Agents maintain conversation context through event streams
#### 3.3.2 Tool System Architecture
The tool system provides the foundation for agent capabilities:
- **Tool Registration**: Tools are registered globally and resolved by name
- **Execution Framework**: `ToolExecutor` classes handle action execution
- **Built-in Tools**: Standard tools (Terminal, FileEditor, Browser) are always available
- **Custom Tools**: Additional tools can be registered through the plugin system
#### 3.3.3 LLM Integration
Agents interact with language models through the SDK's LLM abstraction:
- **Provider Agnostic**: Supports multiple LLM providers through unified interface
- **Configuration**: LLM settings (model, API keys, parameters) are configurable
- **Response Processing**: Structured handling of LLM responses and tool calls
### 3.4 Dynamic Loading Technical Foundation
#### 3.4.1 Python Package Management
Our dynamic loading approach leverages Python's built-in package management:
- **pip install**: Supports Git repositories, PyPI packages, and archive files
- **importlib**: Enables runtime module importing and class instantiation
- **entry_points**: Provides standardized plugin discovery mechanism
- **sys.path**: Allows dynamic modification of Python module search paths
#### 3.4.2 Container Environment Considerations
The agent server container provides a controlled environment for dynamic loading:
- **Python Runtime**: Pre-installed Python 3.x with pip and common libraries
- **Network Access**: Required for downloading packages from external sources
- **File System**: Writable areas for package installation and caching
- **Security Context**: Isolated from host system with appropriate permissions
#### 3.4.3 State Management Implications
Dynamic agent loading affects conversation state management:
- **Agent Persistence**: Custom agents must maintain state across requests
- **Configuration Isolation**: Different conversations can use different agent configurations
- **Resource Cleanup**: Proper cleanup of agent resources when conversations end
- **Error Recovery**: Fallback mechanisms when custom agents fail to load or execute
## 4. Technical Design
### 4.1 Current V1 Agent Instantiation Flow
To understand how our proposal integrates with the existing system, it's important to first examine how agents are currently instantiated and executed in the V1 architecture.
#### 4.1.1 Current Agent Server Startup Process
In the current V1 flow, agent instantiation follows this sequence:
1. **Main Server Request**: When a user creates a conversation, the main server (`openhands/app_server`) creates a sandbox specification via `DockerSandboxSpecService.get_default_sandbox_specs()`
2. **Container Launch**: The sandbox service launches the agent server container using the hardcoded image `ghcr.io/openhands/agent-server:5f62cee-python`
3. **Agent Server Initialization**: The agent server container starts with the command `['--port', '8000']` and initializes a FastAPI application
4. **Default Agent Creation**: During startup, the agent server creates a default agent instance (typically from the software-agent-sdk) with standard tools and configuration
5. **HTTP API Ready**: The agent server exposes the `/api/conversations/{id}/ask_agent` endpoint, routing requests to the default agent instance
#### 4.1.2 Current Agent Execution Flow
When a user sends a message through the V1 API:
1. **HTTP Request**: Main server makes POST request to `http://agent-server:8000/api/conversations/{id}/ask_agent`
2. **Agent Router**: `conversation_router.py` receives the request and extracts the `AskAgentRequest`
3. **Conversation Service**: `ConversationService.ask_agent()` method is called with the user's question
4. **Event Service**: The request is forwarded to `EventService.ask_agent()` which manages the conversation state
5. **Agent Execution**: The default agent processes the question using its configured LLM and tools
6. **Response Return**: The agent's response is returned through the same HTTP chain back to the main server
#### 4.1.3 Limitations of Current Approach
The current system has several limitations for custom agent deployment:
- **Fixed Agent Implementation**: The agent server container contains a single, hardcoded agent implementation
- **Static Configuration**: Agent behavior cannot be modified without rebuilding the entire container
- **No Runtime Customization**: Users cannot specify different agent types or configurations per conversation
- **Deployment Complexity**: Any agent customization requires building and maintaining custom Docker images
### 4.2 Proposed Dynamic Agent Loading Architecture
Our proposal extends the current V1 flow by introducing dynamic agent loading capabilities while maintaining full backward compatibility with existing APIs and infrastructure.
#### 4.2.1 Enhanced Agent Server Startup Process
The modified startup process introduces agent selection based on environment configuration:
1. **Environment Detection**: During agent server startup, check for `CUSTOM_AGENT_PACKAGE_URL` environment variable
2. **Conditional Loading**: If custom agent URL is present, download and install the package; otherwise use default agent
3. **Agent Factory Creation**: Create an agent factory function that can instantiate either custom or default agents
4. **HTTP API Registration**: Register the same `/ask_agent` endpoint, but route to the dynamically selected agent
#### 4.2.2 Dynamic Package Installation Process
When a custom agent package URL is detected, the system performs these steps:
1. **Package Download**: Use `pip install` to download the package from Git, PyPI, or ZIP sources
2. **Dependency Resolution**: Install any additional Python dependencies specified in the package
3. **Module Import**: Use `importlib` to dynamically import the custom agent module
4. **Agent Instantiation**: Call the package's `create_agent()` factory function with LLM and tools
5. **Initialization**: Execute any custom initialization logic defined by the agent
6. **Caching**: Cache the agent instance for reuse across multiple requests
#### 4.2.3 Modified Execution Flow
The execution flow remains largely unchanged from the user's perspective, but internally:
1. **Same HTTP API**: The `/ask_agent` endpoint signature and behavior remain identical
2. **Dynamic Routing**: Requests are routed to either custom or default agent based on startup configuration
3. **Transparent Operation**: The main server is unaware of whether it's communicating with a custom or default agent
4. **Consistent Response Format**: All agents return responses in the same `AskAgentResponse` format
### 4.3 Integration Points and Modifications
#### 4.3.1 Sandbox Service Modifications
The main server's sandbox service requires minimal changes to support dynamic agent loading:
```python
# Current: Fixed environment for all conversations
def get_default_sandbox_specs():
return [SandboxSpecInfo(
id=AGENT_SERVER_IMAGE,
command=['--port', '8000'],
initial_env={...} # Standard environment
)]
# Enhanced: Dynamic environment based on conversation requirements
def create_dynamic_agent_sandbox_spec(agent_package_url: str):
return SandboxSpecInfo(
id=AGENT_SERVER_IMAGE, # Same base image
command=['--port', '8000'],
initial_env={
...standard_env,
'CUSTOM_AGENT_PACKAGE_URL': agent_package_url # New variable
}
)
```
#### 4.3.2 Agent Server Startup Modifications
The agent server startup process is enhanced to detect and load custom agents:
```python
# Current: Fixed agent creation
@app.on_event("startup")
async def startup_event():
app.state.agent = DefaultAgent(llm=default_llm, tools=default_tools)
# Enhanced: Dynamic agent creation
@app.on_event("startup")
async def startup_event():
custom_agent_url = os.getenv('CUSTOM_AGENT_PACKAGE_URL')
if custom_agent_url:
loader = DynamicAgentLoader()
app.state.agent = await loader.load_agent_from_url(custom_agent_url, ...)
else:
app.state.agent = DefaultAgent(llm=default_llm, tools=default_tools)
```
#### 4.3.3 Conversation Service Integration
The conversation service routing logic is updated to use the dynamically loaded agent:
```python
# Current: Direct agent usage
async def ask_agent(self, conversation_id: UUID, question: str) -> str:
event_service = self.event_services[conversation_id]
return await event_service.ask_agent(question)
# Enhanced: Dynamic agent resolution
async def ask_agent(self, conversation_id: UUID, question: str) -> str:
event_service = self.event_services[conversation_id]
# Agent is now dynamically determined at startup
return await event_service.ask_agent(question)
```
### 4.4 Dynamic Agent Loading Implementation
#### 4.4.1 Agent Package Loader
```python
# openhands/agent_server/dynamic_agent_loader.py
import subprocess
import importlib
import tempfile
import os
import sys
from typing import Dict, Any, Optional
from urllib.parse import urlparse
from pathlib import Path
from openhands.sdk.agent.base import AgentBase
from openhands.sdk.llm import LLM
from openhands.sdk.tool import Tool
from openhands.sdk.logger import get_logger
logger = get_logger(__name__)
class DynamicAgentLoader:
"""Loads custom agents from package URLs at runtime."""
def __init__(self):
self.installed_packages: Dict[str, str] = {}
async def load_agent_from_url(
self,
package_url: str,
llm: LLM,
tools: list[Tool],
config: Optional[Dict[str, Any]] = None
) -> AgentBase:
"""Load and instantiate agent from package URL."""
# Check if already installed
if package_url in self.installed_packages:
package_name = self.installed_packages[package_url]
return await self._create_agent_instance(package_name, llm, tools, config)
# Install the package
package_name = await self._install_package(package_url)
self.installed_packages[package_url] = package_name
# Create agent instance
return await self._create_agent_instance(package_name, llm, tools, config)
async def _install_package(self, package_url: str) -> str:
"""Install package from URL and return package name."""
logger.info(f"Installing custom agent package: {package_url}")
try:
# Install package using pip
result = subprocess.run([
sys.executable, "-m", "pip", "install", package_url
], capture_output=True, text=True, check=True)
logger.info(f"Package installation successful: {result.stdout}")
# Extract package name from URL
package_name = self._extract_package_name(package_url)
return package_name
except subprocess.CalledProcessError as e:
logger.error(f"Failed to install package {package_url}: {e.stderr}")
raise RuntimeError(f"Package installation failed: {e.stderr}")
def _extract_package_name(self, package_url: str) -> str:
"""Extract package name from various URL formats."""
if package_url.startswith('git+'):
# Git URL: extract repo name
url = package_url.replace('git+', '')
return Path(urlparse(url).path).stem
elif '==' in package_url:
# PyPI with version: extract package name
return package_url.split('==')[0]
elif package_url.endswith('.zip'):
# ZIP file: extract filename
return Path(urlparse(package_url).path).stem
else:
# Assume it's a simple package name
return package_url
async def _create_agent_instance(
self,
package_name: str,
llm: LLM,
tools: list[Tool],
config: Optional[Dict[str, Any]] = None
) -> AgentBase:
"""Create agent instance from installed package."""
try:
# Import the package
module = importlib.import_module(package_name)
# Look for create_agent function
if hasattr(module, 'create_agent'):
create_agent_func = getattr(module, 'create_agent')
agent = create_agent_func(llm=llm, tools=tools, config=config)
else:
# Fallback: look for agent class
agent_classes = [
attr for attr in dir(module)
if (isinstance(getattr(module, attr), type) and
issubclass(getattr(module, attr), AgentBase) and
getattr(module, attr) != AgentBase)
]
if not agent_classes:
raise RuntimeError(f"No agent class found in package {package_name}")
agent_class = getattr(module, agent_classes[0])
agent = agent_class(llm=llm, tools=tools, config=config)
# Initialize the agent
if hasattr(agent, 'initialize'):
await agent.initialize()
logger.info(f"Successfully created agent from package: {package_name}")
return agent
except Exception as e:
logger.error(f"Failed to create agent from package {package_name}: {e}")
raise RuntimeError(f"Agent instantiation failed: {e}")
```
#### 4.1.2 Agent Server Integration
```python
# Modified openhands/agent_server/conversation_service.py
import os
from typing import Optional
from openhands.agent_server.dynamic_agent_loader import DynamicAgentLoader
from openhands.sdk.agent.base import AgentBase
from openhands.sdk.agent import Agent # Default agent
class ConversationService:
"""Enhanced conversation service with dynamic agent loading."""
def __init__(self, config: Config):
self.config = config
self.agent_loader = DynamicAgentLoader()
self._default_agent_factory = None
self._custom_agent_cache: Dict[str, AgentBase] = {}
async def _get_or_create_agent(
self,
conversation_id: UUID,
llm: LLM,
tools: list[Tool]
) -> AgentBase:
"""Get or create agent for conversation."""
# Check for custom agent package URL in environment
custom_agent_url = os.getenv('CUSTOM_AGENT_PACKAGE_URL')
if custom_agent_url:
# Use custom agent
if custom_agent_url not in self._custom_agent_cache:
agent = await self.agent_loader.load_agent_from_url(
package_url=custom_agent_url,
llm=llm,
tools=tools,
config=self._get_agent_config()
)
self._custom_agent_cache[custom_agent_url] = agent
return self._custom_agent_cache[custom_agent_url]
else:
# Use default agent
if not self._default_agent_factory:
self._default_agent_factory = Agent(llm=llm, tools=tools)
return self._default_agent_factory
def _get_agent_config(self) -> Dict[str, Any]:
"""Extract agent configuration from environment."""
config = {}
# Parse JSON config if provided
config_json = os.getenv('CUSTOM_AGENT_CONFIG')
if config_json:
import json
config.update(json.loads(config_json))
return config
```
### 4.2 Sandbox Service Integration
#### 4.2.1 Enhanced Sandbox Specification
```python
# openhands/app_server/sandbox/docker_sandbox_spec_service.py
from typing import Optional
class DockerSandboxSpecService(SandboxSpecService):
"""Enhanced sandbox service supporting dynamic agent loading."""
def create_dynamic_agent_sandbox_spec(
self,
agent_package_url: str,
agent_config: Optional[Dict[str, Any]] = None
) -> SandboxSpecInfo:
"""Create sandbox spec with dynamic agent loading configuration."""
# Base environment from existing implementation
base_env = {
'OPENVSCODE_SERVER_ROOT': '/openhands/.openvscode-server',
'OH_ENABLE_VNC': '0',
'LOG_JSON': 'true',
'OH_CONVERSATIONS_PATH': '/workspace/conversations',
'OH_BASH_EVENTS_DIR': '/workspace/bash_events',
'PYTHONUNBUFFERED': '1',
'ENV_LOG_LEVEL': '20',
}
# Add dynamic agent configuration
dynamic_env = {
**base_env,
'CUSTOM_AGENT_PACKAGE_URL': agent_package_url,
}
# Add agent configuration as JSON if provided
if agent_config:
import json
dynamic_env['CUSTOM_AGENT_CONFIG'] = json.dumps(agent_config)
return SandboxSpecInfo(
id=AGENT_SERVER_IMAGE, # Same base image
command=['--port', '8000'],
initial_env=dynamic_env,
working_dir='/workspace/project',
)
```
#### 4.2.2 Conversation Creation API Enhancement
```python
# openhands/server/routes/conversation_routes.py
from pydantic import BaseModel
from typing import Optional, Dict, Any
class CreateConversationRequest(BaseModel):
"""Enhanced conversation creation request."""
initial_message: str
workspace_config: Optional[Dict[str, Any]] = None
# New field for dynamic agent loading
agent_package_url: Optional[str] = None
agent_config: Optional[Dict[str, Any]] = None
@router.post("/conversations")
async def create_conversation(
request: CreateConversationRequest,
sandbox_service: DockerSandboxSpecService = Depends(get_sandbox_service)
) -> ConversationResponse:
"""Create conversation with optional dynamic agent loading."""
if request.agent_package_url:
# Create sandbox with dynamic agent loading
sandbox_spec = sandbox_service.create_dynamic_agent_sandbox_spec(
agent_package_url=request.agent_package_url,
agent_config=request.agent_config
)
else:
# Use default sandbox specification
sandbox_spec = sandbox_service.get_default_sandbox_specs()[0]
# Create sandbox and conversation
sandbox = await sandbox_service.create_sandbox(sandbox_spec)
await wait_for_agent_server_ready(sandbox)
conversation = await create_conversation_with_sandbox(
sandbox=sandbox,
initial_message=request.initial_message,
workspace_config=request.workspace_config
)
return ConversationResponse(
conversation_id=conversation.id,
status="created",
agent_type="custom" if request.agent_package_url else "default"
)
```
### 4.3 Agent Server Startup Process
#### 4.3.1 Enhanced Agent Server Initialization
```python
# openhands/agent_server/api.py startup modification
import os
from openhands.agent_server.dynamic_agent_loader import DynamicAgentLoader
@app.on_event("startup")
async def startup_event():
"""Enhanced startup with dynamic agent loading support."""
# Initialize dynamic agent loader
app.state.agent_loader = DynamicAgentLoader()
# Check for custom agent package URL
custom_agent_url = os.getenv('CUSTOM_AGENT_PACKAGE_URL')
if custom_agent_url:
logger.info(f"Dynamic agent loading enabled: {custom_agent_url}")
# Pre-validate package URL (optional)
try:
await app.state.agent_loader._install_package(custom_agent_url)
logger.info("Custom agent package pre-installed successfully")
except Exception as e:
logger.warning(f"Custom agent pre-installation failed: {e}")
# Continue with startup - will retry on first conversation
else:
logger.info("Using default agent configuration")
```
### 4.4 Error Handling and Fallback
#### 4.4.1 Robust Error Handling
```python
# openhands/agent_server/dynamic_agent_loader.py (enhanced)
class DynamicAgentLoader:
"""Enhanced loader with comprehensive error handling."""
async def load_agent_with_fallback(
self,
package_url: str,
llm: LLM,
tools: list[Tool],
config: Optional[Dict[str, Any]] = None
) -> AgentBase:
"""Load custom agent with fallback to default agent."""
try:
return await self.load_agent_from_url(package_url, llm, tools, config)
except Exception as e:
logger.error(f"Custom agent loading failed: {e}")
logger.info("Falling back to default agent")
# Import default agent
from openhands.sdk.agent import Agent
return Agent(llm=llm, tools=tools)
async def _validate_package_url(self, package_url: str) -> bool:
"""Validate package URL accessibility."""
try:
if package_url.startswith('git+'):
# Validate Git repository access
import subprocess
result = subprocess.run([
'git', 'ls-remote', package_url.replace('git+', '')
], capture_output=True, timeout=30)
return result.returncode == 0
elif package_url.startswith('http'):
# Validate HTTP URL accessibility
import httpx
async with httpx.AsyncClient() as client:
response = await client.head(package_url, timeout=30)
return response.status_code == 200
else:
# Assume PyPI package - always return True
return True
except Exception:
return False
```
### 4.5 Security and Isolation
#### 4.5.1 Package Security Validation
```python
# openhands/agent_server/security/package_validator.py
import re
from typing import List, Set
from urllib.parse import urlparse
class PackageSecurityValidator:
"""Validates custom agent packages for security compliance."""
ALLOWED_DOMAINS: Set[str] = {
'github.com',
'gitlab.com',
'bitbucket.org',
'pypi.org',
'files.pythonhosted.org'
}
BLOCKED_PACKAGES: Set[str] = {
# Add known malicious packages
}
def validate_package_url(self, package_url: str) -> bool:
"""Validate package URL against security policies."""
# Check blocked packages
if self._is_blocked_package(package_url):
return False
# Validate domain for HTTP/Git URLs
if package_url.startswith(('http', 'git+')):
parsed = urlparse(package_url.replace('git+', ''))
if parsed.hostname not in self.ALLOWED_DOMAINS:
return False
# Additional security checks
return self._validate_package_name(package_url)
def _is_blocked_package(self, package_url: str) -> bool:
"""Check if package is in blocklist."""
for blocked in self.BLOCKED_PACKAGES:
if blocked in package_url.lower():
return True
return False
def _validate_package_name(self, package_url: str) -> bool:
"""Validate package name format."""
# Basic validation for malicious patterns
malicious_patterns = [
r'\.\./', # Path traversal
r'[;&|`$]', # Command injection
r'<script', # XSS attempts
]
for pattern in malicious_patterns:
if re.search(pattern, package_url, re.IGNORECASE):
return False
return True
```
## 5. Implementation Plan
All implementation must pass existing lints and tests. New functionality requires comprehensive test coverage including unit tests, integration tests, and end-to-end scenarios.
### 5.1 Dynamic Agent Loading Foundation (M1)
#### 5.1.1 Dynamic Agent Loader Implementation
* `openhands/agent_server/dynamic_agent_loader.py`
* `tests/unit/agent_server/test_dynamic_agent_loader.py`
Implement core dynamic agent loading functionality with package installation, module importing, and agent instantiation.
#### 5.1.2 Package Security Validation
* `openhands/agent_server/security/package_validator.py`
* `tests/unit/agent_server/security/test_package_validator.py`
Add security validation for custom agent packages including domain allowlists and malicious pattern detection.
**Demo**: Load a simple custom agent from a Git repository and verify it responds to basic queries through the existing `/ask_agent` HTTP API.
### 5.2 Sandbox Service Integration (M2)
#### 5.2.1 Enhanced Sandbox Specification
* `openhands/app_server/sandbox/docker_sandbox_spec_service.py` (modifications)
* `tests/unit/app_server/sandbox/test_docker_sandbox_spec_service.py` (enhancements)
Extend existing sandbox service to support dynamic agent loading configuration through environment variables.
#### 5.2.2 Agent Server Startup Integration
* `openhands/agent_server/conversation_service.py` (modifications)
* `openhands/agent_server/api.py` (startup enhancements)
* `tests/unit/agent_server/test_conversation_service.py` (enhancements)
Integrate dynamic agent loading into agent server startup and conversation management processes.
**Demo**: Create conversations with custom agents specified via environment variables and demonstrate proper agent instantiation and tool execution.
### 5.3 API Integration (M3)
#### 5.3.1 Enhanced Conversation Creation API
* `openhands/server/routes/conversation_routes.py` (modifications)
* `tests/unit/server/routes/test_conversation_routes.py` (enhancements)
Extend conversation creation API to accept agent package URLs and configuration parameters.
#### 5.3.2 Error Handling and Fallback
* `openhands/agent_server/dynamic_agent_loader.py` (enhancements)
* `tests/unit/agent_server/test_dynamic_agent_fallback.py`
Implement comprehensive error handling with fallback to default agents when custom agent loading fails.
**Demo**: Create conversations through API endpoints with various package URL formats (Git, PyPI, ZIP) and demonstrate proper error handling and fallback behavior.
### 5.4 Advanced Features and Optimization (M4)
#### 5.4.1 Agent Caching and Performance
* `openhands/agent_server/agent_cache.py`
* `tests/unit/agent_server/test_agent_cache.py`
Implement agent instance caching to avoid repeated package installation and improve performance for multiple conversations with the same custom agent.
#### 5.4.2 Package Management and Cleanup
* `openhands/agent_server/package_manager.py`
* `tests/unit/agent_server/test_package_manager.py`
Add package lifecycle management including cleanup of unused packages and version management for package updates.
**Demo**: Deploy multiple conversations with different custom agents simultaneously and demonstrate proper resource management, caching, and cleanup behavior.
---
## References
This design document is based on analysis of the following source materials:
1. **OpenHands V1 Architecture**: Analysis of `openhands/app_server/sandbox/docker_sandbox_spec_service.py` and `openhands/app_server/event_callback/github_v1_callback_processor.py` for understanding the V1 flow and agent server integration.
2. **Software Agent SDK**: Analysis of the `software-agent-sdk` repository, specifically:
- `openhands-agent-server/openhands/agent_server/conversation_router.py` for HTTP API patterns
- `openhands-sdk/openhands/sdk/agent/base.py` for agent interface requirements
- `examples/01_standalone_sdk/02_custom_tools.py` for custom agent implementation patterns
3. **Agent Server Models**: Analysis of `openhands.agent_server.models` imports in the main OpenHands codebase for understanding the API contract between main server and agent server.
4. **Container Architecture**: Analysis of `AGENT_SERVER_IMAGE` constant usage in `openhands/app_server/sandbox/sandbox_spec_service.py` for understanding the current container deployment model.
All technical specifications and implementation details are derived from examination of the existing codebase and established patterns within the OpenHands ecosystem.
@@ -97,6 +97,9 @@ class GithubUserContext(UserContext):
user_secrets = await self.secrets_store.load()
return dict(user_secrets.custom_secrets) if user_secrets else {}
async def get_mcp_api_key(self) -> str | None:
raise NotImplementedError()
async def get_user_proactive_conversation_setting(user_id: str | None) -> bool:
"""Get the user's proactive conversation setting.
+15 -1
View File
@@ -203,6 +203,15 @@ class SaasUserAuth(UserAuth):
self.settings_store = settings_store
return settings_store
async def get_mcp_api_key(self) -> str:
api_key_store = ApiKeyStore.get_instance()
mcp_api_key = api_key_store.retrieve_mcp_api_key(self.user_id)
if not mcp_api_key:
mcp_api_key = api_key_store.create_api_key(
self.user_id, 'MCP_API_KEY', None
)
return mcp_api_key
@classmethod
async def get_instance(cls, request: Request) -> UserAuth:
logger.debug('saas_user_auth_get_instance')
@@ -243,7 +252,12 @@ def get_api_key_from_header(request: Request):
# This is a temp hack
# Streamable HTTP MCP Client works via redirect requests, but drops the Authorization header for reason
# We include `X-Session-API-Key` header by default due to nested runtimes, so it used as a drop in replacement here
return request.headers.get('X-Session-API-Key')
session_api_key = request.headers.get('X-Session-API-Key')
if session_api_key:
return session_api_key
# Fallback to X-Access-Token header as an additional option
return request.headers.get('X-Access-Token')
async def saas_user_auth_from_bearer(request: Request) -> SaasUserAuth | None:
@@ -535,3 +535,115 @@ def test_get_api_key_from_header_with_invalid_authorization_format():
# Assert that None was returned
assert api_key is None
def test_get_api_key_from_header_with_x_access_token():
"""Test that get_api_key_from_header extracts API key from X-Access-Token header."""
# Create a mock request with X-Access-Token header
mock_request = MagicMock(spec=Request)
mock_request.headers = {'X-Access-Token': 'access_token_key'}
# Call the function
api_key = get_api_key_from_header(mock_request)
# Assert that the API key was correctly extracted
assert api_key == 'access_token_key'
def test_get_api_key_from_header_priority_authorization_over_x_access_token():
"""Test that Authorization header takes priority over X-Access-Token header."""
# Create a mock request with both headers
mock_request = MagicMock(spec=Request)
mock_request.headers = {
'Authorization': 'Bearer auth_api_key',
'X-Access-Token': 'access_token_key',
}
# Call the function
api_key = get_api_key_from_header(mock_request)
# Assert that the API key from Authorization header was used
assert api_key == 'auth_api_key'
def test_get_api_key_from_header_priority_x_session_over_x_access_token():
"""Test that X-Session-API-Key header takes priority over X-Access-Token header."""
# Create a mock request with both headers
mock_request = MagicMock(spec=Request)
mock_request.headers = {
'X-Session-API-Key': 'session_api_key',
'X-Access-Token': 'access_token_key',
}
# Call the function
api_key = get_api_key_from_header(mock_request)
# Assert that the API key from X-Session-API-Key header was used
assert api_key == 'session_api_key'
def test_get_api_key_from_header_all_three_headers():
"""Test header priority when all three headers are present."""
# Create a mock request with all three headers
mock_request = MagicMock(spec=Request)
mock_request.headers = {
'Authorization': 'Bearer auth_api_key',
'X-Session-API-Key': 'session_api_key',
'X-Access-Token': 'access_token_key',
}
# Call the function
api_key = get_api_key_from_header(mock_request)
# Assert that the API key from Authorization header was used (highest priority)
assert api_key == 'auth_api_key'
def test_get_api_key_from_header_invalid_authorization_fallback_to_x_access_token():
"""Test that invalid Authorization header falls back to X-Access-Token."""
# Create a mock request with invalid Authorization header and X-Access-Token
mock_request = MagicMock(spec=Request)
mock_request.headers = {
'Authorization': 'InvalidFormat api_key',
'X-Access-Token': 'access_token_key',
}
# Call the function
api_key = get_api_key_from_header(mock_request)
# Assert that the API key from X-Access-Token header was used
assert api_key == 'access_token_key'
def test_get_api_key_from_header_empty_headers():
"""Test that empty header values are handled correctly."""
# Create a mock request with empty header values
mock_request = MagicMock(spec=Request)
mock_request.headers = {
'Authorization': '',
'X-Session-API-Key': '',
'X-Access-Token': 'access_token_key',
}
# Call the function
api_key = get_api_key_from_header(mock_request)
# Assert that the API key from X-Access-Token header was used
assert api_key == 'access_token_key'
def test_get_api_key_from_header_bearer_with_empty_token():
"""Test that Bearer header with empty token falls back to other headers."""
# Create a mock request with Bearer header with empty token
mock_request = MagicMock(spec=Request)
mock_request.headers = {
'Authorization': 'Bearer ',
'X-Access-Token': 'access_token_key',
}
# Call the function
api_key = get_api_key_from_header(mock_request)
# Assert that empty string from Bearer is returned (current behavior)
# This tests the current implementation behavior
assert api_key == ''
@@ -3,15 +3,19 @@ import { Provider } from "#/types/settings";
import { V1SandboxStatus } from "../sandbox-service/sandbox-service.types";
// V1 API Types for requests
// Note: This represents the serialized API format, not the internal TextContent/ImageContent types
export interface V1MessageContent {
type: "text" | "image_url";
text?: string;
image_url?: {
url: string;
};
// These types match the SDK's TextContent and ImageContent formats
export interface V1TextContent {
type: "text";
text: string;
}
export interface V1ImageContent {
type: "image";
image_urls: string[];
}
export type V1MessageContent = V1TextContent | V1ImageContent;
type V1Role = "user" | "system" | "assistant" | "tool";
export interface V1SendMessageRequest {
@@ -12,20 +12,15 @@ import { USE_PLANNING_AGENT } from "#/utils/feature-flags";
import { useAgentState } from "#/hooks/use-agent-state";
import { AgentState } from "#/types/agent-state";
import { useActiveConversation } from "#/hooks/query/use-active-conversation";
import { useCreateConversation } from "#/hooks/mutation/use-create-conversation";
import { displaySuccessToast } from "#/utils/custom-toast-handlers";
import { useUnifiedWebSocketStatus } from "#/hooks/use-unified-websocket-status";
import { useSubConversationTaskPolling } from "#/hooks/query/use-sub-conversation-task-polling";
import { useHandlePlanClick } from "#/hooks/use-handle-plan-click";
export function ChangeAgentButton() {
const [contextMenuOpen, setContextMenuOpen] = useState<boolean>(false);
const {
conversationMode,
setConversationMode,
setSubConversationTaskId,
subConversationTaskId,
} = useConversationStore();
const { conversationMode, setConversationMode, subConversationTaskId } =
useConversationStore();
const webSocketStatus = useUnifiedWebSocketStatus();
@@ -40,8 +35,6 @@ export function ChangeAgentButton() {
const isAgentRunning = curAgentState === AgentState.RUNNING;
const { data: conversation } = useActiveConversation();
const { mutate: createConversation, isPending: isCreatingConversation } =
useCreateConversation();
// Poll sub-conversation task and invalidate parent conversation when ready
useSubConversationTaskPolling(
@@ -49,6 +42,9 @@ export function ChangeAgentButton() {
conversation?.conversation_id || null,
);
// Get handlePlanClick and isCreatingConversation from custom hook
const { handlePlanClick, isCreatingConversation } = useHandlePlanClick();
// Close context menu when agent starts running
useEffect(() => {
if ((isAgentRunning || !isWebSocketConnected) && contextMenuOpen) {
@@ -56,45 +52,6 @@ export function ChangeAgentButton() {
}
}, [isAgentRunning, contextMenuOpen, isWebSocketConnected]);
const handlePlanClick = (
event: React.MouseEvent<HTMLButtonElement> | KeyboardEvent,
) => {
event.preventDefault();
event.stopPropagation();
// Set conversation mode to "plan" immediately
setConversationMode("plan");
// Check if sub_conversation_ids is not empty
if (
(conversation?.sub_conversation_ids &&
conversation.sub_conversation_ids.length > 0) ||
!conversation?.conversation_id
) {
// Do nothing if both conditions are true
return;
}
// Create a new sub-conversation if we have a current conversation ID
createConversation(
{
parentConversationId: conversation.conversation_id,
agentType: "plan",
},
{
onSuccess: (data) => {
displaySuccessToast(
t(I18nKey.PLANNING_AGENTT$PLANNING_AGENT_INITIALIZED),
);
// Track the task ID to poll for sub-conversation creation
if (data.v1_task_id) {
setSubConversationTaskId(data.v1_task_id);
}
},
},
);
};
const isButtonDisabled =
isAgentRunning ||
isCreatingConversation ||
@@ -0,0 +1,71 @@
import { useCallback } from "react";
import { useTranslation } from "react-i18next";
import { I18nKey } from "#/i18n/declaration";
import { useConversationStore } from "#/state/conversation-store";
import { useActiveConversation } from "#/hooks/query/use-active-conversation";
import { useCreateConversation } from "#/hooks/mutation/use-create-conversation";
import { displaySuccessToast } from "#/utils/custom-toast-handlers";
/**
* Custom hook that encapsulates the logic for handling plan creation.
* Returns a function that can be called to create a plan conversation and
* the pending state of the conversation creation.
*
* @returns An object containing handlePlanClick function and isCreatingConversation boolean
*/
export const useHandlePlanClick = () => {
const { t } = useTranslation();
const { setConversationMode, setSubConversationTaskId } =
useConversationStore();
const { data: conversation } = useActiveConversation();
const { mutate: createConversation, isPending: isCreatingConversation } =
useCreateConversation();
const handlePlanClick = useCallback(
(event?: React.MouseEvent<HTMLButtonElement> | KeyboardEvent) => {
event?.preventDefault();
event?.stopPropagation();
// Set conversation mode to "plan" immediately
setConversationMode("plan");
// Check if sub_conversation_ids is not empty
if (
(conversation?.sub_conversation_ids &&
conversation.sub_conversation_ids.length > 0) ||
!conversation?.conversation_id
) {
// Do nothing if both conditions are true
return;
}
// Create a new sub-conversation if we have a current conversation ID
createConversation(
{
parentConversationId: conversation.conversation_id,
agentType: "plan",
},
{
onSuccess: (data) => {
displaySuccessToast(
t(I18nKey.PLANNING_AGENTT$PLANNING_AGENT_INITIALIZED),
);
// Track the task ID to poll for sub-conversation creation
if (data.v1_task_id) {
setSubConversationTaskId(data.v1_task_id);
}
},
},
);
},
[
conversation,
createConversation,
setConversationMode,
setSubConversationTaskId,
t,
],
);
return { handlePlanClick, isCreatingConversation };
};
+4 -6
View File
@@ -41,13 +41,11 @@ export function useSendMessage() {
},
];
// Add images if present
// Add images if present - using SDK's ImageContent format
if (args.image_urls && args.image_urls.length > 0) {
args.image_urls.forEach((url) => {
content.push({
type: "image_url",
image_url: { url },
});
content.push({
type: "image",
image_urls: args.image_urls,
});
}
+3 -2
View File
@@ -30,11 +30,12 @@ function BillingSettingsScreen() {
}
displaySuccessToast(t(I18nKey.PAYMENT$SUCCESS));
setSearchParams({});
} else if (checkoutStatus === "cancel") {
displayErrorToast(t(I18nKey.PAYMENT$CANCELLED));
setSearchParams({});
}
setSearchParams({});
}, [checkoutStatus, searchParams, setSearchParams, t, trackCreditsPurchased]);
return <PaymentForm />;
+21 -15
View File
@@ -28,6 +28,7 @@ import { KeyStatusIcon } from "#/components/features/settings/key-status-icon";
import { DEFAULT_SETTINGS } from "#/services/settings";
import { getProviderId } from "#/utils/map-provider";
import { DEFAULT_OPENHANDS_MODEL } from "#/utils/verified-models";
import { USE_V1_CONVERSATION_API } from "#/utils/feature-flags";
interface OpenHandsApiKeyHelpProps {
testId: string;
@@ -118,6 +119,9 @@ function LlmSettingsScreen() {
const isSaasMode = config?.APP_MODE === "saas";
const shouldUseOpenHandsKey = isOpenHandsProvider && isSaasMode;
// Determine if we should hide the agent dropdown when V1 conversation API feature flag is enabled
const isV1Enabled = USE_V1_CONVERSATION_API();
React.useEffect(() => {
const determineWhetherToToggleAdvancedSettings = () => {
if (resources && settings) {
@@ -612,21 +616,23 @@ function LlmSettingsScreen() {
href="https://tavily.com/"
/>
<SettingsDropdownInput
testId="agent-input"
name="agent-input"
label={t(I18nKey.SETTINGS$AGENT)}
items={
resources?.agents.map((agent) => ({
key: agent,
label: agent, // TODO: Add i18n support for agent names
})) || []
}
defaultSelectedKey={settings.AGENT}
isClearable={false}
onInputChange={handleAgentIsDirty}
wrapperClassName="w-full max-w-[680px]"
/>
{!isV1Enabled && (
<SettingsDropdownInput
testId="agent-input"
name="agent-input"
label={t(I18nKey.SETTINGS$AGENT)}
items={
resources?.agents.map((agent) => ({
key: agent,
label: agent, // TODO: Add i18n support for agent names
})) || []
}
defaultSelectedKey={settings.AGENT}
isClearable={false}
onInputChange={handleAgentIsDirty}
wrapperClassName="w-full max-w-[680px]"
/>
)}
</>
)}
+14 -3
View File
@@ -1,17 +1,28 @@
import React from "react";
import { useTranslation } from "react-i18next";
import { I18nKey } from "#/i18n/declaration";
import LessonPlanIcon from "#/icons/lesson-plan.svg?react";
import { useConversationStore } from "#/state/conversation-store";
import { useScrollToBottom } from "#/hooks/use-scroll-to-bottom";
import { MarkdownRenderer } from "#/components/features/markdown/markdown-renderer";
import { useHandlePlanClick } from "#/hooks/use-handle-plan-click";
function PlannerTab() {
const { t } = useTranslation();
const { scrollRef: scrollContainerRef, onChatBodyScroll } = useScrollToBottom(
React.useRef<HTMLDivElement>(null),
);
const { planContent, setConversationMode } = useConversationStore();
const { planContent } = useConversationStore();
const { handlePlanClick } = useHandlePlanClick();
if (planContent !== null && planContent !== undefined) {
return (
<div className="flex flex-col w-full h-full p-4 overflow-auto">
<div
ref={scrollContainerRef}
onScroll={(e) => onChatBodyScroll(e.currentTarget)}
className="flex flex-col w-full h-full p-4 overflow-auto"
>
<MarkdownRenderer includeStandard includeHeadings>
{planContent}
</MarkdownRenderer>
@@ -27,7 +38,7 @@ function PlannerTab() {
</span>
<button
type="button"
onClick={() => setConversationMode("plan")}
onClick={handlePlanClick}
className="flex w-[164px] h-[40px] p-2 justify-center items-center shrink-0 rounded-lg bg-white overflow-hidden text-black text-ellipsis font-sans text-[16px] not-italic font-normal leading-[20px] hover:cursor-pointer hover:opacity-80"
>
{t(I18nKey.COMMON$CREATE_A_PLAN)}
@@ -9,6 +9,7 @@ from openhands.app_server.app_conversation.app_conversation_models import (
AppConversationSortOrder,
)
from openhands.app_server.services.injector import Injector
from openhands.sdk.event import ConversationStateUpdateEvent
from openhands.sdk.utils.models import DiscriminatedUnionMixin
@@ -92,6 +93,19 @@ class AppConversationInfoService(ABC):
Return the stored info
"""
@abstractmethod
async def process_stats_event(
self,
event: ConversationStateUpdateEvent,
conversation_id: UUID,
) -> None:
"""Process a stats event and update conversation statistics.
Args:
event: The ConversationStateUpdateEvent with key='stats'
conversation_id: The ID of the conversation to update
"""
class AppConversationInfoServiceInjector(
DiscriminatedUnionMixin, Injector[AppConversationInfoService], ABC
@@ -9,6 +9,7 @@ from typing import AsyncGenerator
import base62
from openhands.app_server.app_conversation.app_conversation_models import (
AgentType,
AppConversationStartTask,
AppConversationStartTaskStatus,
)
@@ -25,7 +26,9 @@ from openhands.app_server.sandbox.sandbox_models import SandboxInfo
from openhands.app_server.user.user_context import UserContext
from openhands.sdk import Agent
from openhands.sdk.context.agent_context import AgentContext
from openhands.sdk.context.condenser import LLMSummarizingCondenser
from openhands.sdk.context.skills import load_user_skills
from openhands.sdk.llm import LLM
from openhands.sdk.workspace.remote.async_remote_workspace import AsyncRemoteWorkspace
_logger = logging.getLogger(__name__)
@@ -182,6 +185,43 @@ class AppConversationServiceBase(AppConversationService, ABC):
workspace.working_dir,
)
async def _configure_git_user_settings(
self,
workspace: AsyncRemoteWorkspace,
) -> None:
"""Configure git global user settings from user preferences.
Reads git_user_name and git_user_email from user settings and
configures them as git global settings in the workspace.
Args:
workspace: The remote workspace to configure git settings in.
"""
try:
user_info = await self.user_context.get_user_info()
if user_info.git_user_name:
cmd = f'git config --global user.name "{user_info.git_user_name}"'
result = await workspace.execute_command(cmd, workspace.working_dir)
if result.exit_code:
_logger.warning(f'Git config user.name failed: {result.stderr}')
else:
_logger.info(
f'Git configured with user.name={user_info.git_user_name}'
)
if user_info.git_user_email:
cmd = f'git config --global user.email "{user_info.git_user_email}"'
result = await workspace.execute_command(cmd, workspace.working_dir)
if result.exit_code:
_logger.warning(f'Git config user.email failed: {result.stderr}')
else:
_logger.info(
f'Git configured with user.email={user_info.git_user_email}'
)
except Exception as e:
_logger.warning(f'Failed to configure git user settings: {e}')
async def clone_or_init_git_repo(
self,
task: AppConversationStartTask,
@@ -197,6 +237,9 @@ class AppConversationServiceBase(AppConversationService, ABC):
if result.exit_code:
_logger.warning(f'mkdir failed: {result.stderr}')
# Configure git user settings from user preferences
await self._configure_git_user_settings(workspace)
if not request.selected_repository:
if self.init_git_in_empty_workspace:
_logger.debug('Initializing a new git repository in the workspace.')
@@ -221,7 +264,9 @@ class AppConversationServiceBase(AppConversationService, ABC):
# Clone the repo - this is the slow part!
clone_command = f'git clone {remote_repo_url} {dir_name}'
result = await workspace.execute_command(clone_command, workspace.working_dir)
result = await workspace.execute_command(
clone_command, workspace.working_dir, 120
)
if result.exit_code:
_logger.warning(f'Git clone failed: {result.stderr}')
@@ -233,7 +278,10 @@ class AppConversationServiceBase(AppConversationService, ABC):
random_str = base62.encodebytes(os.urandom(16))
openhands_workspace_branch = f'openhands-workspace-{random_str}'
checkout_command = f'git checkout -b {openhands_workspace_branch}'
await workspace.execute_command(checkout_command, workspace.working_dir)
git_dir = Path(workspace.working_dir) / dir_name
result = await workspace.execute_command(checkout_command, git_dir)
if result.exit_code:
_logger.warning(f'Git checkout failed: {result.stderr}')
async def maybe_run_setup_script(
self,
@@ -295,3 +343,39 @@ class AppConversationServiceBase(AppConversationService, ABC):
return
_logger.info('Git pre-commit hook installed successfully')
def _create_condenser(
self,
llm: LLM,
agent_type: AgentType,
condenser_max_size: int | None,
) -> LLMSummarizingCondenser:
"""Create a condenser based on user settings and agent type.
Args:
llm: The LLM instance to use for condensation
agent_type: Type of agent (PLAN or DEFAULT)
condenser_max_size: condenser_max_size setting
Returns:
Configured LLMSummarizingCondenser instance
"""
# LLMSummarizingCondenser has defaults: max_size=120, keep_first=4
condenser_kwargs = {
'llm': llm.model_copy(
update={
'usage_id': (
'condenser'
if agent_type == AgentType.DEFAULT
else 'planning_condenser'
)
}
),
}
# Only override max_size if user has a custom value
if condenser_max_size is not None:
condenser_kwargs['max_size'] = condenser_max_size
condenser = LLMSummarizingCondenser(**condenser_kwargs)
return condenser
@@ -4,12 +4,12 @@ from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from time import time
from typing import AsyncGenerator, Sequence
from typing import Any, AsyncGenerator, Sequence
from uuid import UUID, uuid4
import httpx
from fastapi import Request
from pydantic import Field, TypeAdapter
from pydantic import Field, SecretStr, TypeAdapter
from openhands.agent_server.models import (
ConversationInfo,
@@ -63,19 +63,25 @@ from openhands.app_server.sandbox.sandbox_spec_service import SandboxSpecService
from openhands.app_server.services.injector import InjectorState
from openhands.app_server.services.jwt_service import JwtService
from openhands.app_server.user.user_context import UserContext
from openhands.app_server.user.user_models import UserInfo
from openhands.app_server.utils.docker_utils import (
replace_localhost_hostname_for_docker,
)
from openhands.experiments.experiment_manager import ExperimentManagerImpl
from openhands.integrations.provider import ProviderType
from openhands.sdk import AgentContext, LocalWorkspace
from openhands.sdk import Agent, AgentContext, LocalWorkspace
from openhands.sdk.conversation.secret_source import LookupSecret, StaticSecret
from openhands.sdk.llm import LLM
from openhands.sdk.security.confirmation_policy import AlwaysConfirm
from openhands.sdk.workspace.remote.async_remote_workspace import AsyncRemoteWorkspace
from openhands.server.types import AppMode
from openhands.tools.preset.default import get_default_agent
from openhands.tools.preset.planning import get_planning_agent
from openhands.tools.preset.default import (
get_default_tools,
)
from openhands.tools.preset.planning import (
format_plan_structure,
get_planning_tools,
)
_conversation_info_type_adapter = TypeAdapter(list[ConversationInfo | None])
_logger = logging.getLogger(__name__)
@@ -99,6 +105,7 @@ class LiveStatusAppConversationService(AppConversationServiceBase):
access_token_hard_timeout: timedelta | None
app_mode: str | None = None
keycloak_auth_cookie: str | None = None
tavily_api_key: str | None = None
async def search_app_conversations(
self,
@@ -519,6 +526,224 @@ class LiveStatusAppConversationService(AppConversationServiceBase):
if not request.llm_model and parent_info.llm_model:
request.llm_model = parent_info.llm_model
async def _setup_secrets_for_git_provider(
self, git_provider: ProviderType | None, user: UserInfo
) -> dict:
"""Set up secrets for git provider authentication.
Args:
git_provider: The git provider type (GitHub, GitLab, etc.)
user: User information containing authentication details
Returns:
Dictionary of secrets for the conversation
"""
secrets = await self.user_context.get_secrets()
if not git_provider:
return secrets
secret_name = f'{git_provider.name}_TOKEN'
if self.web_url:
# Create an access token for web-based authentication
access_token = self.jwt_service.create_jws_token(
payload={
'user_id': user.id,
'provider_type': git_provider.value,
},
expires_in=self.access_token_hard_timeout,
)
headers = {'X-Access-Token': access_token}
# Include keycloak_auth cookie in headers if app_mode is SaaS
if self.app_mode == 'saas' and self.keycloak_auth_cookie:
headers['Cookie'] = f'keycloak_auth={self.keycloak_auth_cookie}'
secrets[secret_name] = LookupSecret(
url=self.web_url + '/api/v1/webhooks/secrets',
headers=headers,
)
else:
# Use static token for environments without web URL access
static_token = await self.user_context.get_latest_token(git_provider)
if static_token:
secrets[secret_name] = StaticSecret(value=static_token)
return secrets
async def _configure_llm_and_mcp(
self, user: UserInfo, llm_model: str | None
) -> tuple[LLM, dict]:
"""Configure LLM and MCP (Model Context Protocol) settings.
Args:
user: User information containing LLM preferences
llm_model: Optional specific model to use, falls back to user default
Returns:
Tuple of (configured LLM instance, MCP config dictionary)
"""
# Configure LLM
model = llm_model or user.llm_model
llm = LLM(
model=model,
base_url=user.llm_base_url,
api_key=user.llm_api_key,
usage_id='agent',
)
# Configure MCP
mcp_config: dict[str, Any] = {}
if self.web_url:
mcp_url = f'{self.web_url}/mcp/mcp'
mcp_config = {
'default': {
'url': mcp_url,
}
}
# Add API key if available
mcp_api_key = await self.user_context.get_mcp_api_key()
if mcp_api_key:
mcp_config['default']['headers'] = {
'X-Session-API-Key': mcp_api_key,
}
# Get the actual API key values, prioritizing user's key over service key
user_search_key = None
if user.search_api_key:
key_value = user.search_api_key.get_secret_value()
if key_value and key_value.strip():
user_search_key = key_value
service_tavily_key = None
if self.tavily_api_key:
# tavily_api_key is already a string (extracted in the factory method)
if self.tavily_api_key.strip():
service_tavily_key = self.tavily_api_key
tavily_api_key = user_search_key or service_tavily_key
if tavily_api_key:
_logger.info('Adding search engine to MCP config')
mcp_config['tavily'] = {
'url': f'https://mcp.tavily.com/mcp/?tavilyApiKey={tavily_api_key}'
}
else:
_logger.info('No search engine API key found, skipping search engine')
return llm, mcp_config
def _create_agent_with_context(
self,
llm: LLM,
agent_type: AgentType,
system_message_suffix: str | None,
mcp_config: dict,
condenser_max_size: int | None,
) -> Agent:
"""Create an agent with appropriate tools and context based on agent type.
Args:
llm: Configured LLM instance
agent_type: Type of agent to create (PLAN or DEFAULT)
system_message_suffix: Optional suffix for system messages
mcp_config: MCP configuration dictionary
condenser_max_size: condenser_max_size setting
Returns:
Configured Agent instance with context
"""
# Create condenser with user's settings
condenser = self._create_condenser(llm, agent_type, condenser_max_size)
# Create agent based on type
if agent_type == AgentType.PLAN:
agent = Agent(
llm=llm,
tools=get_planning_tools(),
system_prompt_filename='system_prompt_planning.j2',
system_prompt_kwargs={'plan_structure': format_plan_structure()},
condenser=condenser,
security_analyzer=None,
mcp_config=mcp_config,
)
else:
agent = Agent(
llm=llm,
tools=get_default_tools(enable_browser=True),
system_prompt_kwargs={'cli_mode': False},
condenser=condenser,
mcp_config=mcp_config,
)
# Add agent context
agent_context = AgentContext(system_message_suffix=system_message_suffix)
agent = agent.model_copy(update={'agent_context': agent_context})
return agent
async def _finalize_conversation_request(
self,
agent: Agent,
conversation_id: UUID | None,
user: UserInfo,
workspace: LocalWorkspace,
initial_message: SendMessageRequest | None,
secrets: dict,
sandbox: SandboxInfo,
remote_workspace: AsyncRemoteWorkspace | None,
selected_repository: str | None,
working_dir: str,
) -> StartConversationRequest:
"""Finalize the conversation request with experiment variants and skills.
Args:
agent: The configured agent
conversation_id: Optional conversation ID, generates new one if None
user: User information
workspace: Local workspace instance
initial_message: Optional initial message for the conversation
secrets: Dictionary of secrets for authentication
sandbox: Sandbox information
remote_workspace: Optional remote workspace for skills loading
selected_repository: Optional repository name
working_dir: Working directory path
Returns:
Complete StartConversationRequest ready for use
"""
# Generate conversation ID if not provided
conversation_id = conversation_id or uuid4()
# Apply experiment variants
agent = ExperimentManagerImpl.run_agent_variant_tests__v1(
user.id, conversation_id, agent
)
# Load and merge skills if remote workspace is available
if remote_workspace:
try:
agent = await self._load_skills_and_update_agent(
sandbox, agent, remote_workspace, selected_repository, working_dir
)
except Exception as e:
_logger.warning(f'Failed to load skills: {e}', exc_info=True)
# Continue without skills - don't fail conversation startup
# Create and return the final request
return StartConversationRequest(
conversation_id=conversation_id,
agent=agent,
workspace=workspace,
confirmation_policy=(
AlwaysConfirm() if user.confirmation_mode else NeverConfirm()
),
initial_message=initial_message,
secrets=secrets,
)
async def _build_start_conversation_request_for_user(
self,
sandbox: SandboxInfo,
@@ -532,87 +757,41 @@ class LiveStatusAppConversationService(AppConversationServiceBase):
remote_workspace: AsyncRemoteWorkspace | None = None,
selected_repository: str | None = None,
) -> StartConversationRequest:
"""Build a complete conversation request for a user.
This method orchestrates the creation of a conversation request by:
1. Setting up git provider secrets
2. Configuring LLM and MCP settings
3. Creating an agent with appropriate context
4. Finalizing the request with skills and experiment variants
"""
user = await self.user_context.get_user_info()
# Set up a secret for the git token
secrets = await self.user_context.get_secrets()
if git_provider:
secret_name = f'{git_provider.name}_TOKEN'
if self.web_url:
# If there is a web url, then we create an access token to access it.
# For security reasons, we are explicit here - only this user, and
# only this provider, with a timeout
access_token = self.jwt_service.create_jws_token(
payload={
'user_id': user.id,
'provider_type': git_provider.value,
},
expires_in=self.access_token_hard_timeout,
)
headers = {'X-Access-Token': access_token}
# Include keycloak_auth cookie in headers if app_mode is SaaS
if self.app_mode == 'saas' and self.keycloak_auth_cookie:
headers['Cookie'] = f'keycloak_auth={self.keycloak_auth_cookie}'
secrets[secret_name] = LookupSecret(
url=self.web_url + '/api/v1/webhooks/secrets',
headers=headers,
)
else:
# If there is no URL specified where the sandbox can access the app server
# then we supply a static secret with the most recent value. Depending
# on the type, this may eventually expire.
static_token = await self.user_context.get_latest_token(git_provider)
if static_token:
secrets[secret_name] = StaticSecret(value=static_token)
workspace = LocalWorkspace(working_dir=working_dir)
# Use provided llm_model if available, otherwise fall back to user's default
model = llm_model or user.llm_model
llm = LLM(
model=model,
base_url=user.llm_base_url,
api_key=user.llm_api_key,
usage_id='agent',
)
# The agent gets passed initial instructions
# Select agent based on agent_type
if agent_type == AgentType.PLAN:
agent = get_planning_agent(llm=llm)
else:
agent = get_default_agent(llm=llm)
# Set up secrets for git provider
secrets = await self._setup_secrets_for_git_provider(git_provider, user)
agent_context = AgentContext(system_message_suffix=system_message_suffix)
agent = agent.model_copy(update={'agent_context': agent_context})
# Configure LLM and MCP
llm, mcp_config = await self._configure_llm_and_mcp(user, llm_model)
conversation_id = conversation_id or uuid4()
agent = ExperimentManagerImpl.run_agent_variant_tests__v1(
user.id, conversation_id, agent
# Create agent with context
agent = self._create_agent_with_context(
llm, agent_type, system_message_suffix, mcp_config, user.condenser_max_size
)
# Load and merge all skills if remote_workspace is available
if remote_workspace:
try:
agent = await self._load_skills_and_update_agent(
sandbox, agent, remote_workspace, selected_repository, working_dir
)
except Exception as e:
_logger.warning(f'Failed to load skills: {e}', exc_info=True)
# Continue without skills - don't fail conversation startup
start_conversation_request = StartConversationRequest(
conversation_id=conversation_id,
agent=agent,
workspace=workspace,
confirmation_policy=(
AlwaysConfirm() if user.confirmation_mode else NeverConfirm()
),
initial_message=initial_message,
secrets=secrets,
# Finalize and return the conversation request
return await self._finalize_conversation_request(
agent,
conversation_id,
user,
workspace,
initial_message,
secrets,
sandbox,
remote_workspace,
selected_repository,
working_dir,
)
return start_conversation_request
async def update_agent_server_conversation_title(
self,
@@ -817,6 +996,10 @@ class LiveStatusAppConversationServiceInjector(AppConversationServiceInjector):
'be retrieved by a sandboxed conversation.'
),
)
tavily_api_key: SecretStr | None = Field(
default=None,
description='The Tavily Search API key to add to MCP integration',
)
async def inject(
self, state: InjectorState, request: Request | None = None
@@ -874,6 +1057,14 @@ class LiveStatusAppConversationServiceInjector(AppConversationServiceInjector):
# If server_config is not available (e.g., in tests), continue without it
pass
# We supply the global tavily key only if the app mode is not SAAS, where
# currently the search endpoints are patched into the app server instead
# so the tavily key does not need to be shared
if self.tavily_api_key and app_mode != AppMode.SAAS:
tavily_api_key = self.tavily_api_key.get_secret_value()
else:
tavily_api_key = None
yield LiveStatusAppConversationService(
init_git_in_empty_workspace=self.init_git_in_empty_workspace,
user_context=user_context,
@@ -890,4 +1081,5 @@ class LiveStatusAppConversationServiceInjector(AppConversationServiceInjector):
access_token_hard_timeout=access_token_hard_timeout,
app_mode=app_mode,
keycloak_auth_cookie=keycloak_auth_cookie,
tavily_api_key=tavily_api_key,
)
@@ -45,6 +45,8 @@ from openhands.app_server.utils.sql_utils import (
create_json_type_decorator,
)
from openhands.integrations.provider import ProviderType
from openhands.sdk.conversation.conversation_stats import ConversationStats
from openhands.sdk.event import ConversationStateUpdateEvent
from openhands.sdk.llm import MetricsSnapshot
from openhands.sdk.llm.utils.metrics import TokenUsage
from openhands.storage.data_models.conversation_metadata import ConversationTrigger
@@ -354,6 +356,130 @@ class SQLAppConversationInfoService(AppConversationInfoService):
await self.db_session.commit()
return info
async def update_conversation_statistics(
self, conversation_id: UUID, stats: ConversationStats
) -> None:
"""Update conversation statistics from stats event data.
Args:
conversation_id: The ID of the conversation to update
stats: ConversationStats object containing usage_to_metrics data from stats event
"""
# Extract agent metrics from usage_to_metrics
usage_to_metrics = stats.usage_to_metrics
agent_metrics = usage_to_metrics.get('agent')
if not agent_metrics:
logger.debug(
'No agent metrics found in stats for conversation %s', conversation_id
)
return
# Query existing record using secure select (filters for V1 and user if available)
query = await self._secure_select()
query = query.where(
StoredConversationMetadata.conversation_id == str(conversation_id)
)
result = await self.db_session.execute(query)
stored = result.scalar_one_or_none()
if not stored:
logger.debug(
'Conversation %s not found or not accessible, skipping statistics update',
conversation_id,
)
return
# Extract accumulated_cost and max_budget_per_task from Metrics object
accumulated_cost = agent_metrics.accumulated_cost
max_budget_per_task = agent_metrics.max_budget_per_task
# Extract accumulated_token_usage from Metrics object
accumulated_token_usage = agent_metrics.accumulated_token_usage
if accumulated_token_usage:
prompt_tokens = accumulated_token_usage.prompt_tokens
completion_tokens = accumulated_token_usage.completion_tokens
cache_read_tokens = accumulated_token_usage.cache_read_tokens
cache_write_tokens = accumulated_token_usage.cache_write_tokens
reasoning_tokens = accumulated_token_usage.reasoning_tokens
context_window = accumulated_token_usage.context_window
per_turn_token = accumulated_token_usage.per_turn_token
else:
prompt_tokens = None
completion_tokens = None
cache_read_tokens = None
cache_write_tokens = None
reasoning_tokens = None
context_window = None
per_turn_token = None
# Update fields only if values are provided (not None)
if accumulated_cost is not None:
stored.accumulated_cost = accumulated_cost
if max_budget_per_task is not None:
stored.max_budget_per_task = max_budget_per_task
if prompt_tokens is not None:
stored.prompt_tokens = prompt_tokens
if completion_tokens is not None:
stored.completion_tokens = completion_tokens
if cache_read_tokens is not None:
stored.cache_read_tokens = cache_read_tokens
if cache_write_tokens is not None:
stored.cache_write_tokens = cache_write_tokens
if reasoning_tokens is not None:
stored.reasoning_tokens = reasoning_tokens
if context_window is not None:
stored.context_window = context_window
if per_turn_token is not None:
stored.per_turn_token = per_turn_token
# Update last_updated_at timestamp
stored.last_updated_at = utc_now()
await self.db_session.commit()
async def process_stats_event(
self,
event: ConversationStateUpdateEvent,
conversation_id: UUID,
) -> None:
"""Process a stats event and update conversation statistics.
Args:
event: The ConversationStateUpdateEvent with key='stats'
conversation_id: The ID of the conversation to update
"""
try:
# Parse event value into ConversationStats model for type safety
# event.value can be a dict (from JSON deserialization) or a ConversationStats object
event_value = event.value
conversation_stats: ConversationStats | None = None
if isinstance(event_value, ConversationStats):
# Already a ConversationStats object
conversation_stats = event_value
elif isinstance(event_value, dict):
# Parse dict into ConversationStats model
# This validates the structure and ensures type safety
conversation_stats = ConversationStats.model_validate(event_value)
elif hasattr(event_value, 'usage_to_metrics'):
# Handle objects with usage_to_metrics attribute (e.g., from tests)
# Convert to dict first, then validate
stats_dict = {'usage_to_metrics': event_value.usage_to_metrics}
conversation_stats = ConversationStats.model_validate(stats_dict)
if conversation_stats and conversation_stats.usage_to_metrics:
# Pass ConversationStats object directly for type safety
await self.update_conversation_statistics(
conversation_id, conversation_stats
)
except Exception:
logger.exception(
'Error updating conversation statistics for conversation %s',
conversation_id,
stack_info=True,
)
async def _secure_select(self):
query = select(StoredConversationMetadata).where(
StoredConversationMetadata.conversation_version == 'V1'
+8 -2
View File
@@ -6,7 +6,7 @@ from typing import AsyncContextManager
import httpx
from fastapi import Depends, Request
from pydantic import Field
from pydantic import Field, SecretStr
from sqlalchemy.ext.asyncio import AsyncSession
# Import the event_callback module to ensure all processors are registered
@@ -185,7 +185,13 @@ def config_from_env() -> AppServerConfig:
)
if config.app_conversation is None:
config.app_conversation = LiveStatusAppConversationServiceInjector()
tavily_api_key = None
tavily_api_key_str = os.getenv('TAVILY_API_KEY') or os.getenv('SEARCH_API_KEY')
if tavily_api_key_str:
tavily_api_key = SecretStr(tavily_api_key_str)
config.app_conversation = LiveStatusAppConversationServiceInjector(
tavily_api_key=tavily_api_key
)
if config.user is None:
config.user = AuthUserContextInjector()
@@ -6,7 +6,6 @@ from __future__ import annotations
import asyncio
import logging
from dataclasses import dataclass
from datetime import datetime
from typing import AsyncGenerator
from uuid import UUID
@@ -15,6 +14,7 @@ from sqlalchemy import UUID as SQLUUID
from sqlalchemy import Column, Enum, String, and_, func, or_, select
from sqlalchemy.ext.asyncio import AsyncSession
from openhands.agent_server.utils import utc_now
from openhands.app_server.event_callback.event_callback_models import (
CreateEventCallbackRequest,
EventCallback,
@@ -177,7 +177,7 @@ class SQLEventCallbackService(EventCallbackService):
return EventCallbackPage(items=callbacks, next_page_id=next_page_id)
async def save_event_callback(self, event_callback: EventCallback) -> EventCallback:
event_callback.updated_at = datetime.now()
event_callback.updated_at = utc_now()
stored_callback = StoredEventCallback(**event_callback.model_dump())
await self.db_session.merge(stored_callback)
return event_callback
@@ -43,6 +43,7 @@ from openhands.app_server.user.specifiy_user_context import (
from openhands.app_server.user.user_context import UserContext
from openhands.integrations.provider import ProviderType
from openhands.sdk import Event
from openhands.sdk.event import ConversationStateUpdateEvent
from openhands.server.user_auth.default_user_auth import DefaultUserAuth
from openhands.server.user_auth.user_auth import (
get_for_user as get_user_auth_for_user,
@@ -144,6 +145,13 @@ async def on_event(
*[event_service.save_event(conversation_id, event) for event in events]
)
# Process stats events for V1 conversations
for event in events:
if isinstance(event, ConversationStateUpdateEvent) and event.key == 'stats':
await app_conversation_info_service.process_stats_event(
event, conversation_id
)
asyncio.create_task(
_run_callbacks_in_bg_and_close(
conversation_id, app_conversation_info.created_by_user_id, events
@@ -78,6 +78,10 @@ class AuthUserContext(UserContext):
return results
async def get_mcp_api_key(self) -> str | None:
mcp_api_key = await self.user_auth.get_mcp_api_key()
return mcp_api_key
USER_ID_ATTR = 'user_id'
@@ -30,6 +30,9 @@ class SpecifyUserContext(UserContext):
async def get_secrets(self) -> dict[str, SecretSource]:
raise NotImplementedError()
async def get_mcp_api_key(self) -> str | None:
raise NotImplementedError()
USER_CONTEXT_ATTR = 'user_context'
ADMIN = SpecifyUserContext(user_id=None)
@@ -34,6 +34,10 @@ class UserContext(ABC):
async def get_secrets(self) -> dict[str, SecretSource]:
"""Get custom secrets and github provider secrets for the conversation."""
@abstractmethod
async def get_mcp_api_key(self) -> str | None:
"""Get an MCP API Key."""
class UserContextInjector(DiscriminatedUnionMixin, Injector[UserContext], ABC):
"""Injector for user contexts."""
@@ -88,6 +88,9 @@ class DefaultUserAuth(UserAuth):
return None
return user_secrets.provider_tokens
async def get_mcp_api_key(self) -> str | None:
return None
@classmethod
async def get_instance(cls, request: Request) -> UserAuth:
user_auth = DefaultUserAuth()
+4
View File
@@ -75,6 +75,10 @@ class UserAuth(ABC):
def get_auth_type(self) -> AuthType | None:
return None
@abstractmethod
async def get_mcp_api_key(self) -> str | None:
"""Get an mcp api key for the user"""
@classmethod
@abstractmethod
async def get_instance(cls, request: Request) -> UserAuth:
@@ -0,0 +1,628 @@
"""Unit tests for git functionality in AppConversationServiceBase.
This module tests the git-related functionality, specifically the clone_or_init_git_repo method
and the recent bug fixes for git checkout operations.
"""
import subprocess
from unittest.mock import AsyncMock, MagicMock, Mock, patch
import pytest
from openhands.app_server.app_conversation.app_conversation_models import AgentType
from openhands.app_server.app_conversation.app_conversation_service_base import (
AppConversationServiceBase,
)
from openhands.app_server.user.user_context import UserContext
class MockUserInfo:
"""Mock class for UserInfo to simulate user settings."""
def __init__(
self, git_user_name: str | None = None, git_user_email: str | None = None
):
self.git_user_name = git_user_name
self.git_user_email = git_user_email
class MockCommandResult:
"""Mock class for command execution result."""
def __init__(self, exit_code: int = 0, stderr: str = ''):
self.exit_code = exit_code
self.stderr = stderr
class MockWorkspace:
"""Mock class for AsyncRemoteWorkspace."""
def __init__(self, working_dir: str = '/workspace'):
self.working_dir = working_dir
self.execute_command = AsyncMock(return_value=MockCommandResult())
class MockAppConversationServiceBase:
"""Mock class to test git functionality without complex dependencies."""
def __init__(self):
self.logger = MagicMock()
async def clone_or_init_git_repo(
self,
workspace_path: str,
repo_url: str,
branch: str = 'main',
timeout: int = 300,
) -> bool:
"""Clone or initialize a git repository.
This is a simplified version of the actual method for testing purposes.
"""
try:
# Try to clone the repository
clone_result = subprocess.run(
['git', 'clone', '--branch', branch, repo_url, workspace_path],
capture_output=True,
text=True,
timeout=timeout,
)
if clone_result.returncode == 0:
self.logger.info(
f'Successfully cloned repository {repo_url} to {workspace_path}'
)
return True
# If clone fails, try to checkout the branch
checkout_result = subprocess.run(
['git', 'checkout', branch],
cwd=workspace_path,
capture_output=True,
text=True,
timeout=timeout,
)
if checkout_result.returncode == 0:
self.logger.info(f'Successfully checked out branch {branch}')
return True
else:
self.logger.error(
f'Failed to checkout branch {branch}: {checkout_result.stderr}'
)
return False
except subprocess.TimeoutExpired:
self.logger.error(f'Git operation timed out after {timeout} seconds')
return False
except Exception as e:
self.logger.error(f'Git operation failed: {str(e)}')
return False
@pytest.fixture
def service():
"""Create a mock service instance for testing."""
return MockAppConversationServiceBase()
@pytest.mark.asyncio
async def test_clone_or_init_git_repo_successful_clone(service):
"""Test successful git clone operation."""
with patch('subprocess.run') as mock_run:
# Mock successful clone
mock_run.return_value = MagicMock(returncode=0, stderr='', stdout='Cloning...')
result = await service.clone_or_init_git_repo(
workspace_path='/tmp/test_repo',
repo_url='https://github.com/test/repo.git',
branch='main',
timeout=300,
)
assert result is True
mock_run.assert_called_once_with(
[
'git',
'clone',
'--branch',
'main',
'https://github.com/test/repo.git',
'/tmp/test_repo',
],
capture_output=True,
text=True,
timeout=300,
)
service.logger.info.assert_called_with(
'Successfully cloned repository https://github.com/test/repo.git to /tmp/test_repo'
)
@pytest.mark.asyncio
async def test_clone_or_init_git_repo_clone_fails_checkout_succeeds(service):
"""Test git clone fails but checkout succeeds."""
with patch('subprocess.run') as mock_run:
# Mock clone failure, then checkout success
mock_run.side_effect = [
MagicMock(returncode=1, stderr='Clone failed', stdout=''), # Clone fails
MagicMock(
returncode=0, stderr='', stdout='Switched to branch'
), # Checkout succeeds
]
result = await service.clone_or_init_git_repo(
workspace_path='/tmp/test_repo',
repo_url='https://github.com/test/repo.git',
branch='feature-branch',
timeout=300,
)
assert result is True
assert mock_run.call_count == 2
# Check clone call
mock_run.assert_any_call(
[
'git',
'clone',
'--branch',
'feature-branch',
'https://github.com/test/repo.git',
'/tmp/test_repo',
],
capture_output=True,
text=True,
timeout=300,
)
# Check checkout call
mock_run.assert_any_call(
['git', 'checkout', 'feature-branch'],
cwd='/tmp/test_repo',
capture_output=True,
text=True,
timeout=300,
)
service.logger.info.assert_called_with(
'Successfully checked out branch feature-branch'
)
@pytest.mark.asyncio
async def test_clone_or_init_git_repo_both_operations_fail(service):
"""Test both git clone and checkout operations fail."""
with patch('subprocess.run') as mock_run:
# Mock both operations failing
mock_run.side_effect = [
MagicMock(returncode=1, stderr='Clone failed', stdout=''), # Clone fails
MagicMock(
returncode=1, stderr='Checkout failed', stdout=''
), # Checkout fails
]
result = await service.clone_or_init_git_repo(
workspace_path='/tmp/test_repo',
repo_url='https://github.com/test/repo.git',
branch='nonexistent-branch',
timeout=300,
)
assert result is False
assert mock_run.call_count == 2
service.logger.error.assert_called_with(
'Failed to checkout branch nonexistent-branch: Checkout failed'
)
@pytest.mark.asyncio
async def test_clone_or_init_git_repo_timeout(service):
"""Test git operation timeout."""
with patch('subprocess.run') as mock_run:
# Mock timeout exception
mock_run.side_effect = subprocess.TimeoutExpired(
cmd=['git', 'clone'], timeout=300
)
result = await service.clone_or_init_git_repo(
workspace_path='/tmp/test_repo',
repo_url='https://github.com/test/repo.git',
branch='main',
timeout=300,
)
assert result is False
service.logger.error.assert_called_with(
'Git operation timed out after 300 seconds'
)
@pytest.mark.asyncio
async def test_clone_or_init_git_repo_exception(service):
"""Test git operation with unexpected exception."""
with patch('subprocess.run') as mock_run:
# Mock unexpected exception
mock_run.side_effect = Exception('Unexpected error')
result = await service.clone_or_init_git_repo(
workspace_path='/tmp/test_repo',
repo_url='https://github.com/test/repo.git',
branch='main',
timeout=300,
)
assert result is False
service.logger.error.assert_called_with(
'Git operation failed: Unexpected error'
)
@pytest.mark.asyncio
async def test_clone_or_init_git_repo_custom_timeout(service):
"""Test git operation with custom timeout."""
with patch('subprocess.run') as mock_run:
# Mock successful clone with custom timeout
mock_run.return_value = MagicMock(returncode=0, stderr='', stdout='Cloning...')
result = await service.clone_or_init_git_repo(
workspace_path='/tmp/test_repo',
repo_url='https://github.com/test/repo.git',
branch='main',
timeout=600, # Custom timeout
)
assert result is True
mock_run.assert_called_once_with(
[
'git',
'clone',
'--branch',
'main',
'https://github.com/test/repo.git',
'/tmp/test_repo',
],
capture_output=True,
text=True,
timeout=600, # Verify custom timeout is used
)
@patch(
'openhands.app_server.app_conversation.app_conversation_service_base.LLMSummarizingCondenser'
)
def test_create_condenser_default_agent_with_none_max_size(mock_condenser_class):
"""Test _create_condenser for DEFAULT agent with condenser_max_size = None uses default."""
# Arrange
mock_user_context = Mock(spec=UserContext)
with patch.object(
AppConversationServiceBase,
'__abstractmethods__',
set(),
):
service = AppConversationServiceBase(
init_git_in_empty_workspace=True,
user_context=mock_user_context,
)
mock_llm = MagicMock()
mock_llm_copy = MagicMock()
mock_llm_copy.usage_id = 'condenser'
mock_llm.model_copy.return_value = mock_llm_copy
mock_condenser_instance = MagicMock()
mock_condenser_class.return_value = mock_condenser_instance
# Act
service._create_condenser(mock_llm, AgentType.DEFAULT, None)
# Assert
mock_condenser_class.assert_called_once()
call_kwargs = mock_condenser_class.call_args[1]
# When condenser_max_size is None, max_size should not be passed (uses SDK default of 120)
assert 'max_size' not in call_kwargs
# keep_first is never passed (uses SDK default of 4)
assert 'keep_first' not in call_kwargs
assert call_kwargs['llm'].usage_id == 'condenser'
mock_llm.model_copy.assert_called_once()
@patch(
'openhands.app_server.app_conversation.app_conversation_service_base.LLMSummarizingCondenser'
)
def test_create_condenser_default_agent_with_custom_max_size(mock_condenser_class):
"""Test _create_condenser for DEFAULT agent with custom condenser_max_size."""
# Arrange
mock_user_context = Mock(spec=UserContext)
with patch.object(
AppConversationServiceBase,
'__abstractmethods__',
set(),
):
service = AppConversationServiceBase(
init_git_in_empty_workspace=True,
user_context=mock_user_context,
)
mock_llm = MagicMock()
mock_llm_copy = MagicMock()
mock_llm_copy.usage_id = 'condenser'
mock_llm.model_copy.return_value = mock_llm_copy
mock_condenser_instance = MagicMock()
mock_condenser_class.return_value = mock_condenser_instance
# Act
service._create_condenser(mock_llm, AgentType.DEFAULT, 150)
# Assert
mock_condenser_class.assert_called_once()
call_kwargs = mock_condenser_class.call_args[1]
assert call_kwargs['max_size'] == 150 # Custom value should be used
# keep_first is never passed (uses SDK default of 4)
assert 'keep_first' not in call_kwargs
assert call_kwargs['llm'].usage_id == 'condenser'
mock_llm.model_copy.assert_called_once()
@patch(
'openhands.app_server.app_conversation.app_conversation_service_base.LLMSummarizingCondenser'
)
def test_create_condenser_plan_agent_with_none_max_size(mock_condenser_class):
"""Test _create_condenser for PLAN agent with condenser_max_size = None uses default."""
# Arrange
mock_user_context = Mock(spec=UserContext)
with patch.object(
AppConversationServiceBase,
'__abstractmethods__',
set(),
):
service = AppConversationServiceBase(
init_git_in_empty_workspace=True,
user_context=mock_user_context,
)
mock_llm = MagicMock()
mock_llm_copy = MagicMock()
mock_llm_copy.usage_id = 'planning_condenser'
mock_llm.model_copy.return_value = mock_llm_copy
mock_condenser_instance = MagicMock()
mock_condenser_class.return_value = mock_condenser_instance
# Act
service._create_condenser(mock_llm, AgentType.PLAN, None)
# Assert
mock_condenser_class.assert_called_once()
call_kwargs = mock_condenser_class.call_args[1]
# When condenser_max_size is None, max_size should not be passed (uses SDK default of 120)
assert 'max_size' not in call_kwargs
# keep_first is never passed (uses SDK default of 4)
assert 'keep_first' not in call_kwargs
assert call_kwargs['llm'].usage_id == 'planning_condenser'
mock_llm.model_copy.assert_called_once()
@patch(
'openhands.app_server.app_conversation.app_conversation_service_base.LLMSummarizingCondenser'
)
def test_create_condenser_plan_agent_with_custom_max_size(mock_condenser_class):
"""Test _create_condenser for PLAN agent with custom condenser_max_size."""
# Arrange
mock_user_context = Mock(spec=UserContext)
with patch.object(
AppConversationServiceBase,
'__abstractmethods__',
set(),
):
service = AppConversationServiceBase(
init_git_in_empty_workspace=True,
user_context=mock_user_context,
)
mock_llm = MagicMock()
mock_llm_copy = MagicMock()
mock_llm_copy.usage_id = 'planning_condenser'
mock_llm.model_copy.return_value = mock_llm_copy
mock_condenser_instance = MagicMock()
mock_condenser_class.return_value = mock_condenser_instance
# Act
service._create_condenser(mock_llm, AgentType.PLAN, 200)
# Assert
mock_condenser_class.assert_called_once()
call_kwargs = mock_condenser_class.call_args[1]
assert call_kwargs['max_size'] == 200 # Custom value should be used
# keep_first is never passed (uses SDK default of 4)
assert 'keep_first' not in call_kwargs
assert call_kwargs['llm'].usage_id == 'planning_condenser'
mock_llm.model_copy.assert_called_once()
# =============================================================================
# Tests for _configure_git_user_settings
# =============================================================================
def _create_service_with_mock_user_context(user_info: MockUserInfo) -> tuple:
"""Create a mock service with the actual _configure_git_user_settings method.
Uses MagicMock for the service but binds the real method for testing.
Returns a tuple of (service, mock_user_context) for testing.
"""
mock_user_context = MagicMock()
mock_user_context.get_user_info = AsyncMock(return_value=user_info)
# Create a simple mock service and set required attribute
service = MagicMock()
service.user_context = mock_user_context
# Bind the actual method from the real class to test real implementation
service._configure_git_user_settings = (
lambda workspace: AppConversationServiceBase._configure_git_user_settings(
service, workspace
)
)
return service, mock_user_context
@pytest.fixture
def mock_workspace():
"""Create a mock workspace instance for testing."""
return MockWorkspace(working_dir='/workspace/project')
@pytest.mark.asyncio
async def test_configure_git_user_settings_both_name_and_email(mock_workspace):
"""Test configuring both git user name and email."""
user_info = MockUserInfo(
git_user_name='Test User', git_user_email='test@example.com'
)
service, mock_user_context = _create_service_with_mock_user_context(user_info)
await service._configure_git_user_settings(mock_workspace)
# Verify get_user_info was called
mock_user_context.get_user_info.assert_called_once()
# Verify both git config commands were executed
assert mock_workspace.execute_command.call_count == 2
# Check git config user.name call
mock_workspace.execute_command.assert_any_call(
'git config --global user.name "Test User"', '/workspace/project'
)
# Check git config user.email call
mock_workspace.execute_command.assert_any_call(
'git config --global user.email "test@example.com"', '/workspace/project'
)
@pytest.mark.asyncio
async def test_configure_git_user_settings_only_name(mock_workspace):
"""Test configuring only git user name."""
user_info = MockUserInfo(git_user_name='Test User', git_user_email=None)
service, _ = _create_service_with_mock_user_context(user_info)
await service._configure_git_user_settings(mock_workspace)
# Verify only user.name was configured
assert mock_workspace.execute_command.call_count == 1
mock_workspace.execute_command.assert_called_once_with(
'git config --global user.name "Test User"', '/workspace/project'
)
@pytest.mark.asyncio
async def test_configure_git_user_settings_only_email(mock_workspace):
"""Test configuring only git user email."""
user_info = MockUserInfo(git_user_name=None, git_user_email='test@example.com')
service, _ = _create_service_with_mock_user_context(user_info)
await service._configure_git_user_settings(mock_workspace)
# Verify only user.email was configured
assert mock_workspace.execute_command.call_count == 1
mock_workspace.execute_command.assert_called_once_with(
'git config --global user.email "test@example.com"', '/workspace/project'
)
@pytest.mark.asyncio
async def test_configure_git_user_settings_neither_set(mock_workspace):
"""Test when neither git user name nor email is set."""
user_info = MockUserInfo(git_user_name=None, git_user_email=None)
service, _ = _create_service_with_mock_user_context(user_info)
await service._configure_git_user_settings(mock_workspace)
# Verify no git config commands were executed
mock_workspace.execute_command.assert_not_called()
@pytest.mark.asyncio
async def test_configure_git_user_settings_empty_strings(mock_workspace):
"""Test when git user name and email are empty strings."""
user_info = MockUserInfo(git_user_name='', git_user_email='')
service, _ = _create_service_with_mock_user_context(user_info)
await service._configure_git_user_settings(mock_workspace)
# Empty strings are falsy, so no commands should be executed
mock_workspace.execute_command.assert_not_called()
@pytest.mark.asyncio
async def test_configure_git_user_settings_get_user_info_fails(mock_workspace):
"""Test handling of exception when get_user_info fails."""
user_info = MockUserInfo()
service, mock_user_context = _create_service_with_mock_user_context(user_info)
mock_user_context.get_user_info = AsyncMock(
side_effect=Exception('User info error')
)
# Should not raise exception, just log warning
await service._configure_git_user_settings(mock_workspace)
# Verify no git config commands were executed
mock_workspace.execute_command.assert_not_called()
@pytest.mark.asyncio
async def test_configure_git_user_settings_name_command_fails(mock_workspace):
"""Test handling when git config user.name command fails."""
user_info = MockUserInfo(
git_user_name='Test User', git_user_email='test@example.com'
)
service, _ = _create_service_with_mock_user_context(user_info)
# Make the first command fail (user.name), second succeed (user.email)
mock_workspace.execute_command = AsyncMock(
side_effect=[
MockCommandResult(exit_code=1, stderr='Permission denied'),
MockCommandResult(exit_code=0),
]
)
# Should not raise exception
await service._configure_git_user_settings(mock_workspace)
# Verify both commands were still attempted
assert mock_workspace.execute_command.call_count == 2
@pytest.mark.asyncio
async def test_configure_git_user_settings_email_command_fails(mock_workspace):
"""Test handling when git config user.email command fails."""
user_info = MockUserInfo(
git_user_name='Test User', git_user_email='test@example.com'
)
service, _ = _create_service_with_mock_user_context(user_info)
# Make the first command succeed (user.name), second fail (user.email)
mock_workspace.execute_command = AsyncMock(
side_effect=[
MockCommandResult(exit_code=0),
MockCommandResult(exit_code=1, stderr='Permission denied'),
]
)
# Should not raise exception
await service._configure_git_user_settings(mock_workspace)
# Verify both commands were still attempted
assert mock_workspace.execute_command.call_count == 2
@pytest.mark.asyncio
async def test_configure_git_user_settings_special_characters_in_name(mock_workspace):
"""Test git user name with special characters."""
user_info = MockUserInfo(
git_user_name="Test O'Brien", git_user_email='test@example.com'
)
service, _ = _create_service_with_mock_user_context(user_info)
await service._configure_git_user_settings(mock_workspace)
# Verify the name is passed with special characters
mock_workspace.execute_command.assert_any_call(
'git config --global user.name "Test O\'Brien"', '/workspace/project'
)
@@ -0,0 +1,721 @@
"""Unit tests for the methods in LiveStatusAppConversationService."""
from unittest.mock import AsyncMock, Mock, patch
from uuid import UUID, uuid4
import pytest
from openhands.agent_server.models import SendMessageRequest, StartConversationRequest
from openhands.app_server.app_conversation.app_conversation_models import AgentType
from openhands.app_server.app_conversation.live_status_app_conversation_service import (
LiveStatusAppConversationService,
)
from openhands.app_server.sandbox.sandbox_models import SandboxInfo, SandboxStatus
from openhands.app_server.user.user_context import UserContext
from openhands.integrations.provider import ProviderType
from openhands.sdk import Agent
from openhands.sdk.conversation.secret_source import LookupSecret, StaticSecret
from openhands.sdk.llm import LLM
from openhands.sdk.workspace import LocalWorkspace
from openhands.sdk.workspace.remote.async_remote_workspace import AsyncRemoteWorkspace
from openhands.server.types import AppMode
class TestLiveStatusAppConversationService:
"""Test cases for the methods in LiveStatusAppConversationService."""
def setup_method(self):
"""Set up test fixtures."""
# Create mock dependencies
self.mock_user_context = Mock(spec=UserContext)
self.mock_jwt_service = Mock()
self.mock_sandbox_service = Mock()
self.mock_sandbox_spec_service = Mock()
self.mock_app_conversation_info_service = Mock()
self.mock_app_conversation_start_task_service = Mock()
self.mock_event_callback_service = Mock()
self.mock_httpx_client = Mock()
# Create service instance
self.service = LiveStatusAppConversationService(
init_git_in_empty_workspace=True,
user_context=self.mock_user_context,
app_conversation_info_service=self.mock_app_conversation_info_service,
app_conversation_start_task_service=self.mock_app_conversation_start_task_service,
event_callback_service=self.mock_event_callback_service,
sandbox_service=self.mock_sandbox_service,
sandbox_spec_service=self.mock_sandbox_spec_service,
jwt_service=self.mock_jwt_service,
sandbox_startup_timeout=30,
sandbox_startup_poll_frequency=1,
httpx_client=self.mock_httpx_client,
web_url='https://test.example.com',
access_token_hard_timeout=None,
app_mode='test',
keycloak_auth_cookie=None,
)
# Mock user info
self.mock_user = Mock()
self.mock_user.id = 'test_user_123'
self.mock_user.llm_model = 'gpt-4'
self.mock_user.llm_base_url = 'https://api.openai.com/v1'
self.mock_user.llm_api_key = 'test_api_key'
self.mock_user.confirmation_mode = False
self.mock_user.search_api_key = None # Default to None
self.mock_user.condenser_max_size = None # Default to None
# Mock sandbox
self.mock_sandbox = Mock(spec=SandboxInfo)
self.mock_sandbox.id = uuid4()
self.mock_sandbox.status = SandboxStatus.RUNNING
@pytest.mark.asyncio
async def test_setup_secrets_for_git_provider_no_provider(self):
"""Test _setup_secrets_for_git_provider with no git provider."""
# Arrange
base_secrets = {'existing': 'secret'}
self.mock_user_context.get_secrets.return_value = base_secrets
# Act
result = await self.service._setup_secrets_for_git_provider(
None, self.mock_user
)
# Assert
assert result == base_secrets
self.mock_user_context.get_secrets.assert_called_once()
@pytest.mark.asyncio
async def test_setup_secrets_for_git_provider_with_web_url(self):
"""Test _setup_secrets_for_git_provider with web URL (creates access token)."""
# Arrange
base_secrets = {}
self.mock_user_context.get_secrets.return_value = base_secrets
self.mock_jwt_service.create_jws_token.return_value = 'test_access_token'
git_provider = ProviderType.GITHUB
# Act
result = await self.service._setup_secrets_for_git_provider(
git_provider, self.mock_user
)
# Assert
assert 'GITHUB_TOKEN' in result
assert isinstance(result['GITHUB_TOKEN'], LookupSecret)
assert (
result['GITHUB_TOKEN'].url
== 'https://test.example.com/api/v1/webhooks/secrets'
)
assert result['GITHUB_TOKEN'].headers['X-Access-Token'] == 'test_access_token'
self.mock_jwt_service.create_jws_token.assert_called_once_with(
payload={
'user_id': self.mock_user.id,
'provider_type': git_provider.value,
},
expires_in=None,
)
@pytest.mark.asyncio
async def test_setup_secrets_for_git_provider_with_saas_mode(self):
"""Test _setup_secrets_for_git_provider with SaaS mode (includes keycloak cookie)."""
# Arrange
self.service.app_mode = 'saas'
self.service.keycloak_auth_cookie = 'test_cookie'
base_secrets = {}
self.mock_user_context.get_secrets.return_value = base_secrets
self.mock_jwt_service.create_jws_token.return_value = 'test_access_token'
git_provider = ProviderType.GITLAB
# Act
result = await self.service._setup_secrets_for_git_provider(
git_provider, self.mock_user
)
# Assert
assert 'GITLAB_TOKEN' in result
lookup_secret = result['GITLAB_TOKEN']
assert isinstance(lookup_secret, LookupSecret)
assert 'Cookie' in lookup_secret.headers
assert lookup_secret.headers['Cookie'] == 'keycloak_auth=test_cookie'
@pytest.mark.asyncio
async def test_setup_secrets_for_git_provider_without_web_url(self):
"""Test _setup_secrets_for_git_provider without web URL (uses static token)."""
# Arrange
self.service.web_url = None
base_secrets = {}
self.mock_user_context.get_secrets.return_value = base_secrets
self.mock_user_context.get_latest_token.return_value = 'static_token_value'
git_provider = ProviderType.GITHUB
# Act
result = await self.service._setup_secrets_for_git_provider(
git_provider, self.mock_user
)
# Assert
assert 'GITHUB_TOKEN' in result
assert isinstance(result['GITHUB_TOKEN'], StaticSecret)
assert result['GITHUB_TOKEN'].value.get_secret_value() == 'static_token_value'
self.mock_user_context.get_latest_token.assert_called_once_with(git_provider)
@pytest.mark.asyncio
async def test_setup_secrets_for_git_provider_no_static_token(self):
"""Test _setup_secrets_for_git_provider when no static token is available."""
# Arrange
self.service.web_url = None
base_secrets = {}
self.mock_user_context.get_secrets.return_value = base_secrets
self.mock_user_context.get_latest_token.return_value = None
git_provider = ProviderType.GITHUB
# Act
result = await self.service._setup_secrets_for_git_provider(
git_provider, self.mock_user
)
# Assert
assert 'GITHUB_TOKEN' not in result
assert result == base_secrets
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_with_custom_model(self):
"""Test _configure_llm_and_mcp with custom LLM model."""
# Arrange
custom_model = 'gpt-3.5-turbo'
self.mock_user_context.get_mcp_api_key.return_value = 'mcp_api_key'
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, custom_model
)
# Assert
assert isinstance(llm, LLM)
assert llm.model == custom_model
assert llm.base_url == self.mock_user.llm_base_url
assert llm.api_key.get_secret_value() == self.mock_user.llm_api_key
assert llm.usage_id == 'agent'
assert 'default' in mcp_config
assert mcp_config['default']['url'] == 'https://test.example.com/mcp/mcp'
assert mcp_config['default']['headers']['X-Session-API-Key'] == 'mcp_api_key'
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_with_user_default_model(self):
"""Test _configure_llm_and_mcp using user's default model."""
# Arrange
self.mock_user_context.get_mcp_api_key.return_value = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert llm.model == self.mock_user.llm_model
assert 'default' in mcp_config
assert 'headers' not in mcp_config['default']
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_without_web_url(self):
"""Test _configure_llm_and_mcp without web URL (no MCP config)."""
# Arrange
self.service.web_url = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert mcp_config == {}
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_tavily_with_user_search_api_key(self):
"""Test _configure_llm_and_mcp adds tavily when user has search_api_key."""
# Arrange
from pydantic import SecretStr
self.mock_user.search_api_key = SecretStr('user_search_key')
self.mock_user_context.get_mcp_api_key.return_value = 'mcp_api_key'
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert 'default' in mcp_config
assert 'tavily' in mcp_config
assert (
mcp_config['tavily']['url']
== 'https://mcp.tavily.com/mcp/?tavilyApiKey=user_search_key'
)
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_tavily_with_env_tavily_key(self):
"""Test _configure_llm_and_mcp adds tavily when service has tavily_api_key."""
# Arrange
self.service.tavily_api_key = 'env_tavily_key'
self.mock_user_context.get_mcp_api_key.return_value = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert 'default' in mcp_config
assert 'tavily' in mcp_config
assert (
mcp_config['tavily']['url']
== 'https://mcp.tavily.com/mcp/?tavilyApiKey=env_tavily_key'
)
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_tavily_user_key_takes_precedence(self):
"""Test _configure_llm_and_mcp user search_api_key takes precedence over env key."""
# Arrange
from pydantic import SecretStr
self.mock_user.search_api_key = SecretStr('user_search_key')
self.service.tavily_api_key = 'env_tavily_key'
self.mock_user_context.get_mcp_api_key.return_value = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert 'tavily' in mcp_config
assert (
mcp_config['tavily']['url']
== 'https://mcp.tavily.com/mcp/?tavilyApiKey=user_search_key'
)
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_no_tavily_without_keys(self):
"""Test _configure_llm_and_mcp does not add tavily when no keys are available."""
# Arrange
self.mock_user.search_api_key = None
self.service.tavily_api_key = None
self.mock_user_context.get_mcp_api_key.return_value = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert 'default' in mcp_config
assert 'tavily' not in mcp_config
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_saas_mode_no_tavily_without_user_key(self):
"""Test _configure_llm_and_mcp does not add tavily in SAAS mode without user search_api_key.
In SAAS mode, the global tavily_api_key should not be passed to the service instance,
so tavily should only be added if the user has their own search_api_key.
"""
# Arrange - simulate SAAS mode where no global tavily key is available
self.service.app_mode = AppMode.SAAS.value
self.service.tavily_api_key = None # In SAAS mode, this should be None
self.mock_user.search_api_key = None
self.mock_user_context.get_mcp_api_key.return_value = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert 'default' in mcp_config
assert 'tavily' not in mcp_config
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_saas_mode_with_user_search_key(self):
"""Test _configure_llm_and_mcp adds tavily in SAAS mode when user has search_api_key.
Even in SAAS mode, if the user has their own search_api_key, tavily should be added.
"""
# Arrange - simulate SAAS mode with user having their own search key
from pydantic import SecretStr
self.service.app_mode = AppMode.SAAS.value
self.service.tavily_api_key = None # In SAAS mode, this should be None
self.mock_user.search_api_key = SecretStr('user_search_key')
self.mock_user_context.get_mcp_api_key.return_value = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert 'default' in mcp_config
assert 'tavily' in mcp_config
assert (
mcp_config['tavily']['url']
== 'https://mcp.tavily.com/mcp/?tavilyApiKey=user_search_key'
)
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_tavily_with_empty_user_search_key(self):
"""Test _configure_llm_and_mcp handles empty user search_api_key correctly."""
# Arrange
from pydantic import SecretStr
self.mock_user.search_api_key = SecretStr('') # Empty string
self.service.tavily_api_key = 'env_tavily_key'
self.mock_user_context.get_mcp_api_key.return_value = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert 'tavily' in mcp_config
# Should fall back to env key since user key is empty
assert (
mcp_config['tavily']['url']
== 'https://mcp.tavily.com/mcp/?tavilyApiKey=env_tavily_key'
)
@pytest.mark.asyncio
async def test_configure_llm_and_mcp_tavily_with_whitespace_user_search_key(self):
"""Test _configure_llm_and_mcp handles whitespace-only user search_api_key correctly."""
# Arrange
from pydantic import SecretStr
self.mock_user.search_api_key = SecretStr(' ') # Whitespace only
self.service.tavily_api_key = 'env_tavily_key'
self.mock_user_context.get_mcp_api_key.return_value = None
# Act
llm, mcp_config = await self.service._configure_llm_and_mcp(
self.mock_user, None
)
# Assert
assert isinstance(llm, LLM)
assert 'tavily' in mcp_config
# Should fall back to env key since user key is whitespace only
assert (
mcp_config['tavily']['url']
== 'https://mcp.tavily.com/mcp/?tavilyApiKey=env_tavily_key'
)
@patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.get_planning_tools'
)
@patch(
'openhands.app_server.app_conversation.app_conversation_service_base.AppConversationServiceBase._create_condenser'
)
@patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.format_plan_structure'
)
def test_create_agent_with_context_planning_agent(
self, mock_format_plan, mock_create_condenser, mock_get_tools
):
"""Test _create_agent_with_context for planning agent type."""
# Arrange
mock_llm = Mock(spec=LLM)
mock_llm.model_copy.return_value = mock_llm
mock_get_tools.return_value = []
mock_condenser = Mock()
mock_create_condenser.return_value = mock_condenser
mock_format_plan.return_value = 'test_plan_structure'
mcp_config = {'default': {'url': 'test'}}
system_message_suffix = 'Test suffix'
# Act
with patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.Agent'
) as mock_agent_class:
mock_agent_instance = Mock()
mock_agent_instance.model_copy.return_value = mock_agent_instance
mock_agent_class.return_value = mock_agent_instance
self.service._create_agent_with_context(
mock_llm,
AgentType.PLAN,
system_message_suffix,
mcp_config,
self.mock_user.condenser_max_size,
)
# Assert
mock_agent_class.assert_called_once()
call_kwargs = mock_agent_class.call_args[1]
assert call_kwargs['llm'] == mock_llm
assert call_kwargs['system_prompt_filename'] == 'system_prompt_planning.j2'
assert (
call_kwargs['system_prompt_kwargs']['plan_structure']
== 'test_plan_structure'
)
assert call_kwargs['mcp_config'] == mcp_config
assert call_kwargs['security_analyzer'] is None
assert call_kwargs['condenser'] == mock_condenser
mock_create_condenser.assert_called_once_with(
mock_llm, AgentType.PLAN, self.mock_user.condenser_max_size
)
@patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.get_default_tools'
)
@patch(
'openhands.app_server.app_conversation.app_conversation_service_base.AppConversationServiceBase._create_condenser'
)
def test_create_agent_with_context_default_agent(
self, mock_create_condenser, mock_get_tools
):
"""Test _create_agent_with_context for default agent type."""
# Arrange
mock_llm = Mock(spec=LLM)
mock_llm.model_copy.return_value = mock_llm
mock_get_tools.return_value = []
mock_condenser = Mock()
mock_create_condenser.return_value = mock_condenser
mcp_config = {'default': {'url': 'test'}}
# Act
with patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.Agent'
) as mock_agent_class:
mock_agent_instance = Mock()
mock_agent_instance.model_copy.return_value = mock_agent_instance
mock_agent_class.return_value = mock_agent_instance
self.service._create_agent_with_context(
mock_llm,
AgentType.DEFAULT,
None,
mcp_config,
self.mock_user.condenser_max_size,
)
# Assert
mock_agent_class.assert_called_once()
call_kwargs = mock_agent_class.call_args[1]
assert call_kwargs['llm'] == mock_llm
assert call_kwargs['system_prompt_kwargs']['cli_mode'] is False
assert call_kwargs['mcp_config'] == mcp_config
assert call_kwargs['condenser'] == mock_condenser
mock_get_tools.assert_called_once_with(enable_browser=True)
mock_create_condenser.assert_called_once_with(
mock_llm, AgentType.DEFAULT, self.mock_user.condenser_max_size
)
@pytest.mark.asyncio
@patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.ExperimentManagerImpl'
)
async def test_finalize_conversation_request_with_skills(
self, mock_experiment_manager
):
"""Test _finalize_conversation_request with skills loading."""
# Arrange
mock_agent = Mock(spec=Agent)
mock_updated_agent = Mock(spec=Agent)
mock_experiment_manager.run_agent_variant_tests__v1.return_value = (
mock_updated_agent
)
conversation_id = uuid4()
workspace = LocalWorkspace(working_dir='/test')
initial_message = Mock(spec=SendMessageRequest)
secrets = {'test': StaticSecret(value='secret')}
remote_workspace = Mock(spec=AsyncRemoteWorkspace)
# Mock the skills loading method
self.service._load_skills_and_update_agent = AsyncMock(
return_value=mock_updated_agent
)
# Act
result = await self.service._finalize_conversation_request(
mock_agent,
conversation_id,
self.mock_user,
workspace,
initial_message,
secrets,
self.mock_sandbox,
remote_workspace,
'test_repo',
'/test/dir',
)
# Assert
assert isinstance(result, StartConversationRequest)
assert result.conversation_id == conversation_id
assert result.agent == mock_updated_agent
assert result.workspace == workspace
assert result.initial_message == initial_message
assert result.secrets == secrets
mock_experiment_manager.run_agent_variant_tests__v1.assert_called_once_with(
self.mock_user.id, conversation_id, mock_agent
)
self.service._load_skills_and_update_agent.assert_called_once_with(
self.mock_sandbox,
mock_updated_agent,
remote_workspace,
'test_repo',
'/test/dir',
)
@pytest.mark.asyncio
@patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.ExperimentManagerImpl'
)
async def test_finalize_conversation_request_without_skills(
self, mock_experiment_manager
):
"""Test _finalize_conversation_request without remote workspace (no skills)."""
# Arrange
mock_agent = Mock(spec=Agent)
mock_updated_agent = Mock(spec=Agent)
mock_experiment_manager.run_agent_variant_tests__v1.return_value = (
mock_updated_agent
)
workspace = LocalWorkspace(working_dir='/test')
secrets = {'test': StaticSecret(value='secret')}
# Act
result = await self.service._finalize_conversation_request(
mock_agent,
None,
self.mock_user,
workspace,
None,
secrets,
self.mock_sandbox,
None,
None,
'/test/dir',
)
# Assert
assert isinstance(result, StartConversationRequest)
assert isinstance(result.conversation_id, UUID)
assert result.agent == mock_updated_agent
mock_experiment_manager.run_agent_variant_tests__v1.assert_called_once()
@pytest.mark.asyncio
@patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.ExperimentManagerImpl'
)
async def test_finalize_conversation_request_skills_loading_fails(
self, mock_experiment_manager
):
"""Test _finalize_conversation_request when skills loading fails."""
# Arrange
mock_agent = Mock(spec=Agent)
mock_updated_agent = Mock(spec=Agent)
mock_experiment_manager.run_agent_variant_tests__v1.return_value = (
mock_updated_agent
)
workspace = LocalWorkspace(working_dir='/test')
secrets = {'test': StaticSecret(value='secret')}
remote_workspace = Mock(spec=AsyncRemoteWorkspace)
# Mock skills loading to raise an exception
self.service._load_skills_and_update_agent = AsyncMock(
side_effect=Exception('Skills loading failed')
)
# Act
with patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service._logger'
) as mock_logger:
result = await self.service._finalize_conversation_request(
mock_agent,
None,
self.mock_user,
workspace,
None,
secrets,
self.mock_sandbox,
remote_workspace,
'test_repo',
'/test/dir',
)
# Assert
assert isinstance(result, StartConversationRequest)
assert (
result.agent == mock_updated_agent
) # Should still use the experiment-modified agent
mock_logger.warning.assert_called_once()
@pytest.mark.asyncio
async def test_build_start_conversation_request_for_user_integration(self):
"""Test the main _build_start_conversation_request_for_user method integration."""
# Arrange
self.mock_user_context.get_user_info.return_value = self.mock_user
# Mock all the helper methods
mock_secrets = {'GITHUB_TOKEN': Mock()}
mock_llm = Mock(spec=LLM)
mock_mcp_config = {'default': {'url': 'test'}}
mock_agent = Mock(spec=Agent)
mock_final_request = Mock(spec=StartConversationRequest)
self.service._setup_secrets_for_git_provider = AsyncMock(
return_value=mock_secrets
)
self.service._configure_llm_and_mcp = AsyncMock(
return_value=(mock_llm, mock_mcp_config)
)
self.service._create_agent_with_context = Mock(return_value=mock_agent)
self.service._finalize_conversation_request = AsyncMock(
return_value=mock_final_request
)
# Act
result = await self.service._build_start_conversation_request_for_user(
sandbox=self.mock_sandbox,
initial_message=None,
system_message_suffix='Test suffix',
git_provider=ProviderType.GITHUB,
working_dir='/test/dir',
agent_type=AgentType.DEFAULT,
llm_model='gpt-4',
conversation_id=None,
remote_workspace=None,
selected_repository='test/repo',
)
# Assert
assert result == mock_final_request
self.service._setup_secrets_for_git_provider.assert_called_once_with(
ProviderType.GITHUB, self.mock_user
)
self.service._configure_llm_and_mcp.assert_called_once_with(
self.mock_user, 'gpt-4'
)
self.service._create_agent_with_context.assert_called_once_with(
mock_llm,
AgentType.DEFAULT,
'Test suffix',
mock_mcp_config,
self.mock_user.condenser_max_size,
)
self.service._finalize_conversation_request.assert_called_once()
@@ -0,0 +1,615 @@
"""Tests for stats event processing in webhook_router.
This module tests the stats event processing functionality introduced for
updating conversation statistics from ConversationStateUpdateEvent events.
"""
from datetime import datetime, timezone
from typing import AsyncGenerator
from unittest.mock import AsyncMock, MagicMock, patch
from uuid import uuid4
import pytest
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
from sqlalchemy.pool import StaticPool
from openhands.app_server.app_conversation.app_conversation_models import (
AppConversationInfo,
)
from openhands.app_server.app_conversation.sql_app_conversation_info_service import (
SQLAppConversationInfoService,
StoredConversationMetadata,
)
from openhands.app_server.user.specifiy_user_context import SpecifyUserContext
from openhands.app_server.utils.sql_utils import Base
from openhands.sdk.conversation.conversation_stats import ConversationStats
from openhands.sdk.event import ConversationStateUpdateEvent
from openhands.sdk.llm.utils.metrics import Metrics, TokenUsage
# ---------------------------------------------------------------------------
# Fixtures
# ---------------------------------------------------------------------------
@pytest.fixture
async def async_engine():
"""Create an async SQLite engine for testing."""
engine = create_async_engine(
'sqlite+aiosqlite:///:memory:',
poolclass=StaticPool,
connect_args={'check_same_thread': False},
echo=False,
)
# Create all tables
async with engine.begin() as conn:
await conn.run_sync(Base.metadata.create_all)
yield engine
await engine.dispose()
@pytest.fixture
async def async_session(async_engine) -> AsyncGenerator[AsyncSession, None]:
"""Create an async session for testing."""
async_session_maker = async_sessionmaker(
async_engine, class_=AsyncSession, expire_on_commit=False
)
async with async_session_maker() as db_session:
yield db_session
@pytest.fixture
def service(async_session) -> SQLAppConversationInfoService:
"""Create a SQLAppConversationInfoService instance for testing."""
return SQLAppConversationInfoService(
db_session=async_session, user_context=SpecifyUserContext(user_id=None)
)
@pytest.fixture
async def v1_conversation_metadata(async_session, service):
"""Create a V1 conversation metadata record for testing."""
conversation_id = uuid4()
stored = StoredConversationMetadata(
conversation_id=str(conversation_id),
user_id='test_user_123',
sandbox_id='sandbox_123',
conversation_version='V1',
title='Test Conversation',
accumulated_cost=0.0,
prompt_tokens=0,
completion_tokens=0,
cache_read_tokens=0,
cache_write_tokens=0,
reasoning_tokens=0,
context_window=0,
per_turn_token=0,
created_at=datetime.now(timezone.utc),
last_updated_at=datetime.now(timezone.utc),
)
async_session.add(stored)
await async_session.commit()
return conversation_id, stored
@pytest.fixture
def stats_event_with_dict_value():
"""Create a ConversationStateUpdateEvent with dict value."""
event_value = {
'usage_to_metrics': {
'agent': {
'accumulated_cost': 0.03411525,
'max_budget_per_task': None,
'accumulated_token_usage': {
'prompt_tokens': 8770,
'completion_tokens': 82,
'cache_read_tokens': 0,
'cache_write_tokens': 8767,
'reasoning_tokens': 0,
'context_window': 0,
'per_turn_token': 8852,
},
},
'condenser': {
'accumulated_cost': 0.0,
'accumulated_token_usage': {
'prompt_tokens': 0,
'completion_tokens': 0,
},
},
}
}
return ConversationStateUpdateEvent(key='stats', value=event_value)
@pytest.fixture
def stats_event_with_object_value():
"""Create a ConversationStateUpdateEvent with object value."""
event_value = MagicMock()
event_value.usage_to_metrics = {
'agent': {
'accumulated_cost': 0.05,
'accumulated_token_usage': {
'prompt_tokens': 1000,
'completion_tokens': 100,
},
}
}
return ConversationStateUpdateEvent(key='stats', value=event_value)
@pytest.fixture
def stats_event_no_usage_to_metrics():
"""Create a ConversationStateUpdateEvent without usage_to_metrics."""
event_value = {'some_other_key': 'value'}
return ConversationStateUpdateEvent(key='stats', value=event_value)
# ---------------------------------------------------------------------------
# Tests for update_conversation_statistics
# ---------------------------------------------------------------------------
class TestUpdateConversationStatistics:
"""Test the update_conversation_statistics method."""
@pytest.mark.asyncio
async def test_update_statistics_success(
self, service, async_session, v1_conversation_metadata
):
"""Test successfully updating conversation statistics."""
conversation_id, stored = v1_conversation_metadata
agent_metrics = Metrics(
model_name='test-model',
accumulated_cost=0.03411525,
max_budget_per_task=10.0,
accumulated_token_usage=TokenUsage(
model='test-model',
prompt_tokens=8770,
completion_tokens=82,
cache_read_tokens=0,
cache_write_tokens=8767,
reasoning_tokens=0,
context_window=0,
per_turn_token=8852,
),
)
stats = ConversationStats(usage_to_metrics={'agent': agent_metrics})
await service.update_conversation_statistics(conversation_id, stats)
# Verify the update
await async_session.refresh(stored)
assert stored.accumulated_cost == 0.03411525
assert stored.max_budget_per_task == 10.0
assert stored.prompt_tokens == 8770
assert stored.completion_tokens == 82
assert stored.cache_read_tokens == 0
assert stored.cache_write_tokens == 8767
assert stored.reasoning_tokens == 0
assert stored.context_window == 0
assert stored.per_turn_token == 8852
assert stored.last_updated_at is not None
@pytest.mark.asyncio
async def test_update_statistics_partial_update(
self, service, async_session, v1_conversation_metadata
):
"""Test updating only some statistics fields."""
conversation_id, stored = v1_conversation_metadata
# Set initial values
stored.accumulated_cost = 0.01
stored.prompt_tokens = 100
await async_session.commit()
agent_metrics = Metrics(
model_name='test-model',
accumulated_cost=0.05,
accumulated_token_usage=TokenUsage(
model='test-model',
prompt_tokens=200,
completion_tokens=0, # Default value
),
)
stats = ConversationStats(usage_to_metrics={'agent': agent_metrics})
await service.update_conversation_statistics(conversation_id, stats)
# Verify updated fields
await async_session.refresh(stored)
assert stored.accumulated_cost == 0.05
assert stored.prompt_tokens == 200
# completion_tokens should remain unchanged (not None in stats)
assert stored.completion_tokens == 0
@pytest.mark.asyncio
async def test_update_statistics_no_agent_metrics(
self, service, v1_conversation_metadata
):
"""Test that update is skipped when no agent metrics are present."""
conversation_id, stored = v1_conversation_metadata
original_cost = stored.accumulated_cost
condenser_metrics = Metrics(
model_name='test-model',
accumulated_cost=0.1,
)
stats = ConversationStats(usage_to_metrics={'condenser': condenser_metrics})
await service.update_conversation_statistics(conversation_id, stats)
# Verify no update occurred
assert stored.accumulated_cost == original_cost
@pytest.mark.asyncio
async def test_update_statistics_conversation_not_found(self, service):
"""Test that update is skipped when conversation doesn't exist."""
nonexistent_id = uuid4()
agent_metrics = Metrics(
model_name='test-model',
accumulated_cost=0.1,
)
stats = ConversationStats(usage_to_metrics={'agent': agent_metrics})
# Should not raise an exception
await service.update_conversation_statistics(nonexistent_id, stats)
@pytest.mark.asyncio
async def test_update_statistics_v0_conversation_skipped(
self, service, async_session
):
"""Test that V0 conversations are skipped."""
conversation_id = uuid4()
stored = StoredConversationMetadata(
conversation_id=str(conversation_id),
user_id='test_user_123',
sandbox_id='sandbox_123',
conversation_version='V0', # V0 conversation
title='V0 Conversation',
accumulated_cost=0.0,
created_at=datetime.now(timezone.utc),
last_updated_at=datetime.now(timezone.utc),
)
async_session.add(stored)
await async_session.commit()
original_cost = stored.accumulated_cost
agent_metrics = Metrics(
model_name='test-model',
accumulated_cost=0.1,
)
stats = ConversationStats(usage_to_metrics={'agent': agent_metrics})
await service.update_conversation_statistics(conversation_id, stats)
# Verify no update occurred
await async_session.refresh(stored)
assert stored.accumulated_cost == original_cost
@pytest.mark.asyncio
async def test_update_statistics_with_none_values(
self, service, async_session, v1_conversation_metadata
):
"""Test that None values in stats don't overwrite existing values."""
conversation_id, stored = v1_conversation_metadata
# Set initial values
stored.accumulated_cost = 0.01
stored.max_budget_per_task = 5.0
stored.prompt_tokens = 100
await async_session.commit()
agent_metrics = Metrics(
model_name='test-model',
accumulated_cost=0.05,
max_budget_per_task=None, # None value
accumulated_token_usage=TokenUsage(
model='test-model',
prompt_tokens=200,
completion_tokens=0, # Default value (None is not valid for int)
),
)
stats = ConversationStats(usage_to_metrics={'agent': agent_metrics})
await service.update_conversation_statistics(conversation_id, stats)
# Verify updated fields and that None values didn't overwrite
await async_session.refresh(stored)
assert stored.accumulated_cost == 0.05
assert stored.max_budget_per_task == 5.0 # Should remain unchanged
assert stored.prompt_tokens == 200
assert (
stored.completion_tokens == 0
) # Should remain unchanged (was 0, None doesn't update)
# ---------------------------------------------------------------------------
# Tests for process_stats_event
# ---------------------------------------------------------------------------
class TestProcessStatsEvent:
"""Test the process_stats_event method."""
@pytest.mark.asyncio
async def test_process_stats_event_with_dict_value(
self,
service,
async_session,
stats_event_with_dict_value,
v1_conversation_metadata,
):
"""Test processing stats event with dict value."""
conversation_id, stored = v1_conversation_metadata
await service.process_stats_event(stats_event_with_dict_value, conversation_id)
# Verify the update occurred
await async_session.refresh(stored)
assert stored.accumulated_cost == 0.03411525
assert stored.prompt_tokens == 8770
assert stored.completion_tokens == 82
@pytest.mark.asyncio
async def test_process_stats_event_with_object_value(
self,
service,
async_session,
stats_event_with_object_value,
v1_conversation_metadata,
):
"""Test processing stats event with object value."""
conversation_id, stored = v1_conversation_metadata
await service.process_stats_event(
stats_event_with_object_value, conversation_id
)
# Verify the update occurred
await async_session.refresh(stored)
assert stored.accumulated_cost == 0.05
assert stored.prompt_tokens == 1000
assert stored.completion_tokens == 100
@pytest.mark.asyncio
async def test_process_stats_event_no_usage_to_metrics(
self,
service,
async_session,
stats_event_no_usage_to_metrics,
v1_conversation_metadata,
):
"""Test processing stats event without usage_to_metrics."""
conversation_id, stored = v1_conversation_metadata
original_cost = stored.accumulated_cost
await service.process_stats_event(
stats_event_no_usage_to_metrics, conversation_id
)
# Verify update_conversation_statistics was NOT called
await async_session.refresh(stored)
assert stored.accumulated_cost == original_cost
@pytest.mark.asyncio
async def test_process_stats_event_service_error_handled(
self, service, stats_event_with_dict_value
):
"""Test that errors from service are caught and logged."""
conversation_id = uuid4()
# Should not raise an exception
with (
patch.object(
service,
'update_conversation_statistics',
side_effect=Exception('Database error'),
),
patch(
'openhands.app_server.app_conversation.sql_app_conversation_info_service.logger'
) as mock_logger,
):
await service.process_stats_event(
stats_event_with_dict_value, conversation_id
)
# Verify error was logged
mock_logger.exception.assert_called_once()
@pytest.mark.asyncio
async def test_process_stats_event_empty_usage_to_metrics(
self, service, async_session, v1_conversation_metadata
):
"""Test processing stats event with empty usage_to_metrics."""
conversation_id, stored = v1_conversation_metadata
original_cost = stored.accumulated_cost
# Create event with empty usage_to_metrics
event = ConversationStateUpdateEvent(
key='stats', value={'usage_to_metrics': {}}
)
await service.process_stats_event(event, conversation_id)
# Empty dict is falsy, so update_conversation_statistics should NOT be called
await async_session.refresh(stored)
assert stored.accumulated_cost == original_cost
# ---------------------------------------------------------------------------
# Integration tests for on_event endpoint
# ---------------------------------------------------------------------------
class TestOnEventStatsProcessing:
"""Test stats event processing in the on_event endpoint."""
@pytest.mark.asyncio
async def test_on_event_processes_stats_events(self):
"""Test that on_event processes stats events."""
from openhands.app_server.event_callback.webhook_router import on_event
from openhands.app_server.sandbox.sandbox_models import (
SandboxInfo,
SandboxStatus,
)
conversation_id = uuid4()
sandbox_id = 'sandbox_123'
# Create stats event
stats_event = ConversationStateUpdateEvent(
key='stats',
value={
'usage_to_metrics': {
'agent': {
'accumulated_cost': 0.1,
'accumulated_token_usage': {
'prompt_tokens': 1000,
},
}
}
},
)
# Create non-stats event
other_event = ConversationStateUpdateEvent(
key='execution_status', value='running'
)
events = [stats_event, other_event]
# Mock dependencies
mock_sandbox = SandboxInfo(
id=sandbox_id,
status=SandboxStatus.RUNNING,
session_api_key='test_key',
created_by_user_id='user_123',
sandbox_spec_id='spec_123',
)
mock_app_conversation_info = AppConversationInfo(
id=conversation_id,
sandbox_id=sandbox_id,
created_by_user_id='user_123',
)
mock_event_service = AsyncMock()
mock_app_conversation_info_service = AsyncMock()
mock_app_conversation_info_service.get_app_conversation_info.return_value = (
mock_app_conversation_info
)
# Set up process_stats_event to call update_conversation_statistics
async def process_stats_event_side_effect(event, conversation_id):
# Simulate what process_stats_event does - call update_conversation_statistics
from openhands.sdk.conversation.conversation_stats import ConversationStats
if isinstance(event.value, dict):
stats = ConversationStats.model_validate(event.value)
if stats and stats.usage_to_metrics:
await mock_app_conversation_info_service.update_conversation_statistics(
conversation_id, stats
)
mock_app_conversation_info_service.process_stats_event.side_effect = (
process_stats_event_side_effect
)
with (
patch(
'openhands.app_server.event_callback.webhook_router.valid_sandbox',
return_value=mock_sandbox,
),
patch(
'openhands.app_server.event_callback.webhook_router.valid_conversation',
return_value=mock_app_conversation_info,
),
patch(
'openhands.app_server.event_callback.webhook_router._run_callbacks_in_bg_and_close'
) as mock_callbacks,
):
await on_event(
events=events,
conversation_id=conversation_id,
sandbox_info=mock_sandbox,
app_conversation_info_service=mock_app_conversation_info_service,
event_service=mock_event_service,
)
# Verify events were saved
assert mock_event_service.save_event.call_count == 2
# Verify stats event was processed
mock_app_conversation_info_service.update_conversation_statistics.assert_called_once()
# Verify callbacks were scheduled
mock_callbacks.assert_called_once()
@pytest.mark.asyncio
async def test_on_event_skips_non_stats_events(self):
"""Test that on_event skips non-stats events."""
from openhands.app_server.event_callback.webhook_router import on_event
from openhands.app_server.sandbox.sandbox_models import (
SandboxInfo,
SandboxStatus,
)
from openhands.events.action.message import MessageAction
conversation_id = uuid4()
sandbox_id = 'sandbox_123'
# Create non-stats events
events = [
ConversationStateUpdateEvent(key='execution_status', value='running'),
MessageAction(content='test'),
]
mock_sandbox = SandboxInfo(
id=sandbox_id,
status=SandboxStatus.RUNNING,
session_api_key='test_key',
created_by_user_id='user_123',
sandbox_spec_id='spec_123',
)
mock_app_conversation_info = AppConversationInfo(
id=conversation_id,
sandbox_id=sandbox_id,
created_by_user_id='user_123',
)
mock_event_service = AsyncMock()
mock_app_conversation_info_service = AsyncMock()
mock_app_conversation_info_service.get_app_conversation_info.return_value = (
mock_app_conversation_info
)
with (
patch(
'openhands.app_server.event_callback.webhook_router.valid_sandbox',
return_value=mock_sandbox,
),
patch(
'openhands.app_server.event_callback.webhook_router.valid_conversation',
return_value=mock_app_conversation_info,
),
patch(
'openhands.app_server.event_callback.webhook_router._run_callbacks_in_bg_and_close'
),
):
await on_event(
events=events,
conversation_id=conversation_id,
sandbox_info=mock_sandbox,
app_conversation_info_service=mock_app_conversation_info_service,
event_service=mock_event_service,
)
# Verify stats update was NOT called
mock_app_conversation_info_service.update_conversation_statistics.assert_not_called()
@@ -152,6 +152,7 @@ class TestExperimentManagerIntegration:
llm_base_url=None,
llm_api_key=None,
confirmation_mode=False,
condenser_max_size=None,
)
async def get_secrets(self):
@@ -200,8 +201,24 @@ class TestExperimentManagerIntegration:
# Patch the pieces invoked by the service
with (
patch(
'openhands.app_server.app_conversation.live_status_app_conversation_service.get_default_agent',
patch.object(
service,
'_setup_secrets_for_git_provider',
return_value={},
),
patch.object(
service,
'_configure_llm_and_mcp',
return_value=(mock_llm, {}),
),
patch.object(
service,
'_create_agent_with_context',
return_value=mock_agent,
),
patch.object(
service,
'_load_skills_and_update_agent',
return_value=mock_agent,
),
patch(
@@ -46,6 +46,9 @@ class MockUserAuth(UserAuth):
async def get_secrets(self) -> Secrets | None:
return None
async def get_mcp_api_key(self) -> str | None:
return None
@classmethod
async def get_instance(cls, request: Request) -> UserAuth:
return MockUserAuth()
@@ -46,6 +46,9 @@ class MockUserAuth(UserAuth):
async def get_secrets(self) -> Secrets | None:
return None
async def get_mcp_api_key(self) -> str | None:
return None
@classmethod
async def get_instance(cls, request: Request) -> UserAuth:
return MockUserAuth()