Compare commits

..

3 Commits

Author SHA1 Message Date
enyst 8316ec9eb3 Integrate AuthSystem docs into documentation navigation
- Add AuthSystem design documents to docs.json navigation under Architecture section
- Convert .md files to .mdx format for proper documentation system integration
- Add frontmatter with appropriate titles for both documents
- Documents now accessible in deployed documentation under OpenHands Developers > Architecture
2025-09-04 07:43:29 +00:00
enyst 059e5b7af6 Clean up AuthSystem docs: remove emojis, fix token references, focus on architectural benefits
- Remove all emojis for professional presentation
- Fix provider_tokens references (was dependency injection, not route parameters)
- Streamline benefits section to focus on architectural/codebase/extensibility improvements
- Remove promotional language and focus on technical merits
- Maintain clear documentation of the design and implementation approach
2025-09-04 07:06:12 +00:00
enyst d858882c3d Add AuthSystem design documentation
- Comprehensive AuthSystem design supporting None/SU/MU strategies
- Based on GitHub issues #10751 and #10730 analysis
- Includes before/after code examples and Mermaid diagrams
- Provides clean abstraction boundaries and migration plan
- Extension points for custom builds with multi-user support

Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-04 06:54:53 +00:00
38 changed files with 2252 additions and 566 deletions
-13
View File
@@ -1,13 +0,0 @@
# Task List
1. Verify PR#10432 item 11: serialization reasoning_content precedence matches reviewer intent
- id: 11-verify-serialization-reasoning-precedence
- File: openhands/events/serialization/action.py
- Reviewer asked to prefer top-level rc; current implementation may differ.
- Status: todo
2. Verify PR#10432 item 12: conversation_memory uses structured thought without legacy hasattr/getattr checks
- id: 12-verify-memory-use-structured-thought
- File: openhands/memory/conversation_memory.py
- Reviewer asked to directly use action.thought; code may retain legacy guards.
- Status: todo
+3 -1
View File
@@ -153,7 +153,9 @@
"group": "Architecture",
"pages": [
"usage/architecture/backend",
"usage/architecture/runtime"
"usage/architecture/runtime",
"usage/architecture/auth-system-summary",
"usage/architecture/auth-system-design"
]
},
"usage/how-to/debugging",
@@ -0,0 +1,856 @@
# OpenHands AuthSystem Design
## Executive Summary
This document proposes a comprehensive AuthSystem design for OpenHands that supports three authentication strategies: **None** (current behavior), **Single User (SU)** with GitHub OAuth, and **Multi User (MU)** (for custom builds). The design introduces clean abstraction boundaries, eliminates scattered `user_id` threading, and provides a foundation for future authentication enhancements.
## Problem Statement
### Current Issues
1. **No Auth Strategy Abstraction**: OpenHands currently has a monolithic `DefaultUserAuth` that always returns `None` for `user_id`, with no clear path to support different authentication modes.
2. **Scattered user_id Threading**: 339+ occurrences of `user_id` across 68 files, with complex threading through:
- Storage partitioning (`users/{user_id}/` paths)
- Conversation/session scoping
- API route dependencies
- Provider token resolution
- Data model fields
3. **Provider Token Pollution**: Routes accept `provider_tokens` parameters and thread them through `ProviderHandler`, creating security risks and complex signatures.
4. **No Single User Support**: No way to enable GitHub OAuth for personal/single-user deployments while maintaining the simplicity of the current "None" mode.
5. **Boundary Violations**: Auth concerns are mixed with business logic throughout the codebase, making it difficult to switch between authentication modes.
### Requirements from GitHub Issues
From **Issue #10751** (user_id audit):
- Support None, SU, and MU modes
- Introduce `UserContext` and `StorageNamespace` abstractions
- Remove redundant `if user_id` guards (7 identified)
- Clean up storage path helpers
From **Issue #10730** (token provider):
- Remove `provider_tokens` dependency injection
- Introduce `TokenProvider` boundary abstraction
- Support backend-only credential resolution
- Enable custom builds with token refresh/rotation patterns
## Solution Architecture
### Core Components
#### 1. AuthStrategy Interface
```python
# openhands/auth/strategies/base.py
from abc import ABC, abstractmethod
from typing import Optional
from fastapi import Request
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider
class AuthStrategy(ABC):
"""Base class for authentication strategies"""
@abstractmethod
def get_name(self) -> str:
"""Return strategy name for logging/debugging"""
@abstractmethod
def requires_auth(self) -> bool:
"""Whether this strategy requires user authentication"""
@abstractmethod
async def authenticate(self, request: Request) -> Optional[UserContext]:
"""Authenticate request and return UserContext or None"""
@abstractmethod
async def get_token_provider(self, request: Request) -> TokenProvider:
"""Get token provider for this request"""
@abstractmethod
def get_login_url(self) -> Optional[str]:
"""Get login URL for frontend, None if no auth required"""
```
#### 2. UserContext
```python
# openhands/auth/user_context.py
from dataclasses import dataclass
from typing import Optional
from datetime import datetime
@dataclass(frozen=True)
class UserContext:
"""Immutable user context for authenticated requests"""
user_id: str
email: Optional[str] = None
username: Optional[str] = None
github_id: Optional[int] = None
github_username: Optional[str] = None
is_admin: bool = False
created_at: Optional[datetime] = None
last_login: Optional[datetime] = None
@property
def storage_namespace(self) -> str:
"""Get storage namespace for this user"""
return self.user_id
```
#### 3. TokenProvider Interface
```python
# openhands/auth/token_provider.py
from abc import ABC, abstractmethod
from typing import Optional, Mapping
from openhands.integrations.service_types import ProviderType
from openhands.integrations.provider import ProviderToken
class TokenProvider(ABC):
"""Abstract token provider for git integrations"""
@abstractmethod
async def get_token(self, provider: ProviderType) -> Optional[ProviderToken]:
"""Get token for specific provider"""
@abstractmethod
async def get_all_tokens(self) -> Mapping[ProviderType, ProviderToken]:
"""Get all available provider tokens"""
```
#### 4. StorageNamespace
```python
# openhands/auth/storage_namespace.py
from dataclasses import dataclass
from typing import Optional
@dataclass(frozen=True)
class StorageNamespace:
"""Encapsulates storage path logic for user data"""
namespace: Optional[str]
def get_conversation_dir(self, sid: str) -> str:
if self.namespace:
return f'users/{self.namespace}/conversations/{sid}/'
return f'sessions/{sid}/'
def get_conversation_events_dir(self, sid: str) -> str:
return f'{self.get_conversation_dir(sid)}events/'
def get_conversation_metadata_filename(self, sid: str) -> str:
return f'{self.get_conversation_dir(sid)}metadata.json'
# ... other path methods
```
### Authentication Strategies
#### 1. None Strategy (Current Behavior)
```python
# openhands/auth/strategies/none_strategy.py
from typing import Optional
from fastapi import Request
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider, DefaultTokenProvider
class NoneStrategy(AuthStrategy):
"""No authentication - current OpenHands behavior"""
def get_name(self) -> str:
return "none"
def requires_auth(self) -> bool:
return False
async def authenticate(self, request: Request) -> Optional[UserContext]:
return None # No user context
async def get_token_provider(self, request: Request) -> TokenProvider:
return DefaultTokenProvider() # Uses secrets.json
def get_login_url(self) -> Optional[str]:
return None
```
#### 2. Single User Strategy
```python
# openhands/auth/strategies/single_user_strategy.py
from typing import Optional
from fastapi import Request, HTTPException
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider, SingleUserTokenProvider
from openhands.server.shared import server_config
class SingleUserStrategy(AuthStrategy):
"""Single user with GitHub OAuth"""
def get_name(self) -> str:
return "single_user"
def requires_auth(self) -> bool:
return server_config.enable_su_auth
async def authenticate(self, request: Request) -> Optional[UserContext]:
if not self.requires_auth():
# SU mode without auth - create virtual user
return UserContext(
user_id="local",
username="local_user",
is_admin=True
)
# Extract JWT token from cookie/header
token = self._extract_token(request)
if not token:
return None
# Validate JWT and extract user info
user_data = self._validate_jwt(token)
if not user_data:
return None
# Verify user is allowed (if configured)
if (server_config.su_github_username and
user_data.get('github_username') != server_config.su_github_username):
raise HTTPException(403, "Access denied")
return UserContext(
user_id=user_data['github_username'],
email=user_data.get('email'),
username=user_data['github_username'],
github_id=user_data.get('github_id'),
github_username=user_data['github_username'],
is_admin=True # SU user is always admin
)
async def get_token_provider(self, request: Request) -> TokenProvider:
user_context = await self.authenticate(request)
return SingleUserTokenProvider(user_context)
def get_login_url(self) -> Optional[str]:
if not self.requires_auth():
return None
return f"/api/auth/github/login"
```
#### 3. Multi User Strategy (Custom Build Extension Point)
```python
# openhands/auth/strategies/multi_user_strategy.py
from typing import Optional
from fastapi import Request
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider
class MultiUserStrategy(AuthStrategy):
"""Multi-user strategy - extension point for custom builds"""
def get_name(self) -> str:
return "multi_user"
def requires_auth(self) -> bool:
return True
async def authenticate(self, request: Request) -> Optional[UserContext]:
# This would be implemented by custom builds/applications built on OH
raise NotImplementedError("Multi-user strategy not available in base OpenHands")
async def get_token_provider(self, request: Request) -> TokenProvider:
raise NotImplementedError("Multi-user strategy not available in base OpenHands")
def get_login_url(self) -> Optional[str]:
return "/api/auth/login"
```
### Integration Points
#### 1. Updated UserAuth
```python
# openhands/server/user_auth/strategy_user_auth.py
from fastapi import Request
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.storage_namespace import StorageNamespace
from openhands.server.user_auth.user_auth import UserAuth
from openhands.storage.settings.settings_store import SettingsStore
from openhands.storage.secrets.secrets_store import SecretsStore
class StrategyUserAuth(UserAuth):
"""UserAuth implementation using AuthStrategy pattern"""
def __init__(self, strategy: AuthStrategy, user_context: Optional[UserContext]):
self.strategy = strategy
self.user_context = user_context
self._storage_namespace = StorageNamespace(
user_context.storage_namespace if user_context else None
)
async def get_user_id(self) -> str | None:
return self.user_context.user_id if self.user_context else None
async def get_user_email(self) -> str | None:
return self.user_context.email if self.user_context else None
# ... other methods using storage_namespace
```
#### 2. FastAPI Dependencies
```python
# openhands/server/dependencies/auth.py
from fastapi import Depends, Request
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider
from openhands.server.shared import get_auth_strategy
async def get_current_user(
request: Request,
strategy: AuthStrategy = Depends(get_auth_strategy)
) -> Optional[UserContext]:
"""Get current user context"""
return await strategy.authenticate(request)
async def get_token_provider(
request: Request,
strategy: AuthStrategy = Depends(get_auth_strategy)
) -> TokenProvider:
"""Get token provider for current request"""
return await strategy.get_token_provider(request)
async def require_auth(
user: Optional[UserContext] = Depends(get_current_user)
) -> UserContext:
"""Require authentication"""
if not user:
raise HTTPException(401, "Authentication required")
return user
```
#### 3. Updated Routes
```python
# openhands/server/routes/git.py (AFTER)
from fastapi import APIRouter, Depends
from openhands.auth.token_provider import TokenProvider
from openhands.auth.user_context import UserContext
from openhands.server.dependencies.auth import get_token_provider, get_current_user
from openhands.integrations.provider import ProviderHandler
app = APIRouter(prefix='/api/user')
@app.get('/repositories')
async def get_user_repositories(
sort: str = "pushed",
selected_provider: ProviderType | None = None,
token_provider: TokenProvider = Depends(get_token_provider),
user: Optional[UserContext] = Depends(get_current_user)
):
"""Get user repositories - no provider_tokens parameter!"""
client = ProviderHandler(token_provider=token_provider)
return await client.get_repositories(sort, selected_provider)
```
## Before/After Code Comparison
### Before: Current Implementation
```python
# BEFORE: openhands/server/routes/git.py
@app.get('/repositories', response_model=list[Repository])
async def get_user_repositories(
sort: str = Query(default='pushed'),
selected_provider: ProviderType | None = Query(default=None),
page: int | None = Query(default=None),
per_page: int | None = Query(default=None),
installation_id: str | None = Query(default=None),
provider_tokens: PROVIDER_TOKEN_TYPE | None = Depends(get_provider_tokens),
access_token: SecretStr | None = Depends(get_access_token),
user_id: str | None = Depends(get_user_id),
):
if provider_tokens:
client = ProviderHandler(
provider_tokens=provider_tokens,
external_auth_token=access_token,
external_auth_id=user_id,
)
# ... complex logic
```
```python
# BEFORE: openhands/storage/locations.py
def get_conversation_dir(sid: str, user_id: str | None = None) -> str:
if user_id:
return f'users/{user_id}/conversations/{sid}/'
else:
return f'sessions/{sid}/'
```
```python
# BEFORE: openhands/server/user_auth/default_user_auth.py
class DefaultUserAuth(UserAuth):
async def get_user_id(self) -> str | None:
return None # Always None - no multi-tenancy support
async def get_provider_tokens(self) -> PROVIDER_TOKEN_TYPE | None:
user_secrets = await self.get_user_secrets()
if user_secrets is None:
return None
return user_secrets.provider_tokens
```
### After: Proposed Implementation
```python
# AFTER: openhands/server/routes/git.py
@app.get('/repositories', response_model=list[Repository])
async def get_user_repositories(
sort: str = Query(default='pushed'),
selected_provider: ProviderType | None = Query(default=None),
page: int | None = Query(default=None),
per_page: int | None = Query(default=None),
installation_id: str | None = Query(default=None),
token_provider: TokenProvider = Depends(get_token_provider),
user: Optional[UserContext] = Depends(get_current_user),
):
client = ProviderHandler(token_provider=token_provider)
return await client.get_repositories(
sort, server_config.app_mode, selected_provider, page, per_page, installation_id
)
```
```python
# AFTER: openhands/auth/storage_namespace.py
@dataclass(frozen=True)
class StorageNamespace:
namespace: Optional[str]
def get_conversation_dir(self, sid: str) -> str:
if self.namespace:
return f'users/{self.namespace}/conversations/{sid}/'
return f'sessions/{sid}/'
```
```python
# AFTER: openhands/server/user_auth/strategy_user_auth.py
class StrategyUserAuth(UserAuth):
def __init__(self, strategy: AuthStrategy, user_context: Optional[UserContext]):
self.strategy = strategy
self.user_context = user_context
self.storage_namespace = StorageNamespace(
user_context.storage_namespace if user_context else None
)
async def get_user_id(self) -> str | None:
return self.user_context.user_id if self.user_context else None
```
## Configuration
### Environment Variables
```bash
# Authentication Strategy
OH_AUTH_STRATEGY=none # Options: none, single_user, multi_user
# Single User Mode Settings
OH_ENABLE_SU_AUTH=false # Enable GitHub OAuth in SU mode
OH_SU_GITHUB_USERNAME=your_username # Restrict access to specific user
OH_GITHUB_CLIENT_ID=your_client_id
OH_GITHUB_CLIENT_SECRET=your_client_secret
# Multi User Mode (custom build extension point)
OH_MU_ADMIN_USERNAME=admin_user
```
### Configuration Modes
#### 1. None Mode (Current Default)
```bash
OH_AUTH_STRATEGY=none
# No additional config needed
```
#### 2. Single User - No Auth
```bash
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=false
```
#### 3. Single User - GitHub Auth
```bash
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=true
OH_SU_GITHUB_USERNAME=your_username
OH_GITHUB_CLIENT_ID=your_client_id
OH_GITHUB_CLIENT_SECRET=your_client_secret
```
## Implementation Benefits
### 1. Clean Separation of Concerns
- Auth logic isolated in strategy classes
- Business logic doesn't need to know about user_id
- Clear boundaries between auth and core functionality
### 2. Reduced Complexity
- Eliminates 7 redundant `if user_id` guards
- Removes provider_tokens dependency injection
- Simplifies method signatures throughout codebase
### 3. Forward Compatibility
- custom builds can extend with custom strategies
- Token refresh/rotation support built-in
- Multi-tenancy ready without core changes
### 4. Security Improvements
- Tokens never exposed in route parameters
- Centralized token management
- Immutable user context prevents tampering
### 5. Developer Experience
- Clear configuration options
- Easy mode switching
- Consistent patterns across codebase
## Migration Strategy
### Phase 1: Foundation
1. Introduce auth strategy interfaces
2. Add UserContext and StorageNamespace
3. Create TokenProvider abstraction
4. Update core dependencies
### Phase 2: Strategy Implementation
1. Implement NoneStrategy (backward compatible)
2. Implement SingleUserStrategy
3. Add configuration support
4. Update UserAuth integration
### Phase 3: Route Migration
1. Update FastAPI dependencies
2. Remove provider_tokens dependency injection
3. Update ProviderHandler integration
4. Clean up redundant if-guards
### Phase 4: Storage Migration
1. Replace storage path helpers
2. Update conversation managers
3. Migrate event stores
4. Clean up legacy code
## Testing Strategy
### Unit Tests
- Strategy implementations
- UserContext immutability
- StorageNamespace path generation
- TokenProvider implementations
### Integration Tests
- End-to-end auth flows
- Route authentication
- Storage partitioning
- Configuration switching
### Migration Tests
- Backward compatibility
- Data migration paths
- Configuration validation
## Future Extensions
### custom builds Integration Points
```python
# custom builds can provide their own strategies
class custom buildsMultiUserStrategy(AuthStrategy):
async def authenticate(self, request: Request) -> Optional[UserContext]:
# Custom custom builds authentication logic
pass
async def get_token_provider(self, request: Request) -> TokenProvider:
# custom builds token refresh/rotation
return CustomBuildTokenProvider(request)
```
### Additional Auth Methods
- SAML/OIDC strategies
- API key authentication
- Custom JWT providers
- Enterprise SSO integration
## Architecture Diagrams
### 1. Overall Auth System Architecture
```mermaid
graph TB
subgraph "FastAPI Application"
Routes[API Routes]
Deps[FastAPI Dependencies]
end
subgraph "Auth Layer"
AuthStrategy[AuthStrategy Interface]
NoneStrategy[NoneStrategy]
SUStrategy[SingleUserStrategy]
MUStrategy[MultiUserStrategy]
end
subgraph "Core Abstractions"
UserContext[UserContext]
TokenProvider[TokenProvider Interface]
StorageNamespace[StorageNamespace]
end
subgraph "Token Providers"
DefaultTP[DefaultTokenProvider]
SingleUserTP[SingleUserTokenProvider]
custom buildsTP[CustomBuildTokenProvider]
end
subgraph "Storage Layer"
SecretsStore[SecretsStore]
SettingsStore[SettingsStore]
ConversationStore[ConversationStore]
end
Routes --> Deps
Deps --> AuthStrategy
AuthStrategy --> UserContext
AuthStrategy --> TokenProvider
UserContext --> StorageNamespace
NoneStrategy --> DefaultTP
SUStrategy --> SingleUserTP
MUStrategy --> custom buildsTP
TokenProvider --> SecretsStore
StorageNamespace --> ConversationStore
StorageNamespace --> SettingsStore
```
### 2. Authentication Flow - None Strategy
```mermaid
sequenceDiagram
participant Client
participant Route
participant NoneStrategy
participant DefaultTP
participant SecretsStore
Client->>Route: API Request
Route->>NoneStrategy: authenticate(request)
NoneStrategy->>Route: None (no user context)
Route->>NoneStrategy: get_token_provider(request)
NoneStrategy->>DefaultTP: create()
DefaultTP->>SecretsStore: load secrets.json
SecretsStore->>DefaultTP: provider tokens
DefaultTP->>Route: token provider
Route->>Client: API Response
```
### 3. Authentication Flow - Single User Strategy (No Auth)
```mermaid
sequenceDiagram
participant Client
participant Route
participant SUStrategy
participant UserContext
participant SingleUserTP
participant SecretsStore
Client->>Route: API Request
Route->>SUStrategy: authenticate(request)
SUStrategy->>UserContext: create virtual user (local)
UserContext->>SUStrategy: user context
SUStrategy->>Route: UserContext(user_id="local")
Route->>SUStrategy: get_token_provider(request)
SUStrategy->>SingleUserTP: create(user_context)
SingleUserTP->>SecretsStore: load user secrets
SecretsStore->>SingleUserTP: provider tokens
SingleUserTP->>Route: token provider
Route->>Client: API Response
```
### 4. Authentication Flow - Single User Strategy (GitHub Auth)
```mermaid
sequenceDiagram
participant Client
participant Route
participant SUStrategy
participant GitHub
participant UserContext
participant SingleUserTP
participant Database
Client->>Route: API Request with JWT cookie
Route->>SUStrategy: authenticate(request)
SUStrategy->>SUStrategy: extract JWT token
SUStrategy->>SUStrategy: validate JWT
SUStrategy->>SUStrategy: check allowed user
SUStrategy->>UserContext: create from JWT data
UserContext->>SUStrategy: user context
SUStrategy->>Route: UserContext
Route->>SUStrategy: get_token_provider(request)
SUStrategy->>SingleUserTP: create(user_context)
SingleUserTP->>Database: load encrypted tokens
Database->>SingleUserTP: encrypted provider tokens
SingleUserTP->>Route: token provider
Route->>Client: API Response
```
### 5. GitHub OAuth Flow - Single User Strategy
```mermaid
sequenceDiagram
participant Client
participant AuthRoute
participant GitHub
participant SUStrategy
participant Database
participant UserContext
Client->>AuthRoute: GET /auth/login
AuthRoute->>Client: GitHub OAuth URL
Client->>GitHub: OAuth authorization
GitHub->>AuthRoute: GET /auth/callback?code=xxx
AuthRoute->>GitHub: exchange code for token
GitHub->>AuthRoute: access token + user info
AuthRoute->>SUStrategy: validate user allowed
SUStrategy->>AuthRoute: user authorized
AuthRoute->>Database: create/update user record
Database->>AuthRoute: user saved
AuthRoute->>AuthRoute: create JWT token
AuthRoute->>Client: Set JWT cookie + redirect
```
### 6. Storage Namespace Architecture
```mermaid
graph TB
subgraph "User Context"
UC[UserContext]
UC --> SN[StorageNamespace]
end
subgraph "Storage Paths"
SN --> ConvDir[get_conversation_dir]
SN --> EventsDir[get_events_dir]
SN --> MetaFile[get_metadata_file]
SN --> StateFile[get_state_file]
end
subgraph "Path Examples"
ConvDir --> NonePath[sessions/sid/]
ConvDir --> UserPath[users/user_id/conversations/sid/]
EventsDir --> NoneEvents[sessions/sid/events/]
EventsDir --> UserEvents[users/user_id/conversations/sid/events/]
end
subgraph "Strategy Impact"
NoneStrategy2[NoneStrategy] --> NonePath
SUStrategy2[SingleUserStrategy] --> UserPath
MUStrategy2[MultiUserStrategy] --> UserPath
end
```
### 7. Token Provider Architecture
```mermaid
graph TB
subgraph "Token Provider Interface"
TP[TokenProvider]
TP --> GetToken[get_token(provider)]
TP --> GetAllTokens[get_all_tokens()]
end
subgraph "Implementations"
DefaultTP2[DefaultTokenProvider]
SingleUserTP2[SingleUserTokenProvider]
custom buildsTP2[CustomBuildTokenProvider]
end
subgraph "Token Sources"
SecretsJSON[secrets.json]
UserDB[User Database]
Custom Build API[custom builds Token API]
end
subgraph "Provider Integration"
ProviderHandler[ProviderHandler]
GitHubService[GitHubService]
GitLabService[GitLabService]
BitBucketService[BitBucketService]
end
DefaultTP2 --> SecretsJSON
SingleUserTP2 --> UserDB
custom buildsTP2 --> Custom Build API
TP --> ProviderHandler
ProviderHandler --> GitHubService
ProviderHandler --> GitLabService
ProviderHandler --> BitBucketService
```
### 8. Configuration-Driven Strategy Selection
```mermaid
graph TB
subgraph "Configuration"
Config[OH_AUTH_STRATEGY]
Config --> None[none]
Config --> SU[single_user]
Config --> MU[multi_user]
end
subgraph "Strategy Factory"
Factory[AuthStrategyFactory]
Factory --> CreateNone[create NoneStrategy]
Factory --> CreateSU[create SingleUserStrategy]
Factory --> CreateMU[create MultiUserStrategy]
end
subgraph "Additional Config"
SUConfig[OH_ENABLE_SU_AUTH<br/>OH_SU_GITHUB_USERNAME<br/>OH_GITHUB_CLIENT_ID]
MUConfig[OH_MU_ADMIN_USERNAME<br/>Database Config]
end
None --> CreateNone
SU --> CreateSU
MU --> CreateMU
CreateSU --> SUConfig
CreateMU --> MUConfig
```
## Conclusion
This AuthSystem design provides OpenHands with a robust, extensible authentication foundation that:
1. **Maintains backward compatibility** with the current "None" mode
2. **Enables Single User mode** with optional GitHub OAuth
3. **Provides extension points** for custom builds with multi-user implementations
4. **Cleans up the codebase** by removing scattered user_id threading
5. **Improves security** by centralizing token management
6. **Simplifies development** with clear abstractions and patterns
The design is ready for implementation and will significantly improve OpenHands' authentication capabilities while maintaining its current simplicity for users who don't need authentication.
@@ -0,0 +1,860 @@
---
title: AuthSystem Design - Complete Specification
---
# OpenHands AuthSystem Design
## Executive Summary
This document proposes a comprehensive AuthSystem design for OpenHands that supports three authentication strategies: **None** (current behavior), **Single User (SU)** with GitHub OAuth, and **Multi User (MU)** (for custom builds). The design introduces clean abstraction boundaries, eliminates scattered `user_id` threading, and provides a foundation for future authentication enhancements.
## Problem Statement
### Current Issues
1. **No Auth Strategy Abstraction**: OpenHands currently has a monolithic `DefaultUserAuth` that always returns `None` for `user_id`, with no clear path to support different authentication modes.
2. **Scattered user_id Threading**: 339+ occurrences of `user_id` across 68 files, with complex threading through:
- Storage partitioning (`users/{user_id}/` paths)
- Conversation/session scoping
- API route dependencies
- Provider token resolution
- Data model fields
3. **Provider Token Pollution**: Routes accept `provider_tokens` parameters and thread them through `ProviderHandler`, creating security risks and complex signatures.
4. **No Single User Support**: No way to enable GitHub OAuth for personal/single-user deployments while maintaining the simplicity of the current "None" mode.
5. **Boundary Violations**: Auth concerns are mixed with business logic throughout the codebase, making it difficult to switch between authentication modes.
### Requirements from GitHub Issues
From **Issue #10751** (user_id audit):
- Support None, SU, and MU modes
- Introduce `UserContext` and `StorageNamespace` abstractions
- Remove redundant `if user_id` guards (7 identified)
- Clean up storage path helpers
From **Issue #10730** (token provider):
- Remove `provider_tokens` dependency injection
- Introduce `TokenProvider` boundary abstraction
- Support backend-only credential resolution
- Enable custom builds with token refresh/rotation patterns
## Solution Architecture
### Core Components
#### 1. AuthStrategy Interface
```python
# openhands/auth/strategies/base.py
from abc import ABC, abstractmethod
from typing import Optional
from fastapi import Request
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider
class AuthStrategy(ABC):
"""Base class for authentication strategies"""
@abstractmethod
def get_name(self) -> str:
"""Return strategy name for logging/debugging"""
@abstractmethod
def requires_auth(self) -> bool:
"""Whether this strategy requires user authentication"""
@abstractmethod
async def authenticate(self, request: Request) -> Optional[UserContext]:
"""Authenticate request and return UserContext or None"""
@abstractmethod
async def get_token_provider(self, request: Request) -> TokenProvider:
"""Get token provider for this request"""
@abstractmethod
def get_login_url(self) -> Optional[str]:
"""Get login URL for frontend, None if no auth required"""
```
#### 2. UserContext
```python
# openhands/auth/user_context.py
from dataclasses import dataclass
from typing import Optional
from datetime import datetime
@dataclass(frozen=True)
class UserContext:
"""Immutable user context for authenticated requests"""
user_id: str
email: Optional[str] = None
username: Optional[str] = None
github_id: Optional[int] = None
github_username: Optional[str] = None
is_admin: bool = False
created_at: Optional[datetime] = None
last_login: Optional[datetime] = None
@property
def storage_namespace(self) -> str:
"""Get storage namespace for this user"""
return self.user_id
```
#### 3. TokenProvider Interface
```python
# openhands/auth/token_provider.py
from abc import ABC, abstractmethod
from typing import Optional, Mapping
from openhands.integrations.service_types import ProviderType
from openhands.integrations.provider import ProviderToken
class TokenProvider(ABC):
"""Abstract token provider for git integrations"""
@abstractmethod
async def get_token(self, provider: ProviderType) -> Optional[ProviderToken]:
"""Get token for specific provider"""
@abstractmethod
async def get_all_tokens(self) -> Mapping[ProviderType, ProviderToken]:
"""Get all available provider tokens"""
```
#### 4. StorageNamespace
```python
# openhands/auth/storage_namespace.py
from dataclasses import dataclass
from typing import Optional
@dataclass(frozen=True)
class StorageNamespace:
"""Encapsulates storage path logic for user data"""
namespace: Optional[str]
def get_conversation_dir(self, sid: str) -> str:
if self.namespace:
return f'users/{self.namespace}/conversations/{sid}/'
return f'sessions/{sid}/'
def get_conversation_events_dir(self, sid: str) -> str:
return f'{self.get_conversation_dir(sid)}events/'
def get_conversation_metadata_filename(self, sid: str) -> str:
return f'{self.get_conversation_dir(sid)}metadata.json'
# ... other path methods
```
### Authentication Strategies
#### 1. None Strategy (Current Behavior)
```python
# openhands/auth/strategies/none_strategy.py
from typing import Optional
from fastapi import Request
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider, DefaultTokenProvider
class NoneStrategy(AuthStrategy):
"""No authentication - current OpenHands behavior"""
def get_name(self) -> str:
return "none"
def requires_auth(self) -> bool:
return False
async def authenticate(self, request: Request) -> Optional[UserContext]:
return None # No user context
async def get_token_provider(self, request: Request) -> TokenProvider:
return DefaultTokenProvider() # Uses secrets.json
def get_login_url(self) -> Optional[str]:
return None
```
#### 2. Single User Strategy
```python
# openhands/auth/strategies/single_user_strategy.py
from typing import Optional
from fastapi import Request, HTTPException
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider, SingleUserTokenProvider
from openhands.server.shared import server_config
class SingleUserStrategy(AuthStrategy):
"""Single user with GitHub OAuth"""
def get_name(self) -> str:
return "single_user"
def requires_auth(self) -> bool:
return server_config.enable_su_auth
async def authenticate(self, request: Request) -> Optional[UserContext]:
if not self.requires_auth():
# SU mode without auth - create virtual user
return UserContext(
user_id="local",
username="local_user",
is_admin=True
)
# Extract JWT token from cookie/header
token = self._extract_token(request)
if not token:
return None
# Validate JWT and extract user info
user_data = self._validate_jwt(token)
if not user_data:
return None
# Verify user is allowed (if configured)
if (server_config.su_github_username and
user_data.get('github_username') != server_config.su_github_username):
raise HTTPException(403, "Access denied")
return UserContext(
user_id=user_data['github_username'],
email=user_data.get('email'),
username=user_data['github_username'],
github_id=user_data.get('github_id'),
github_username=user_data['github_username'],
is_admin=True # SU user is always admin
)
async def get_token_provider(self, request: Request) -> TokenProvider:
user_context = await self.authenticate(request)
return SingleUserTokenProvider(user_context)
def get_login_url(self) -> Optional[str]:
if not self.requires_auth():
return None
return f"/api/auth/github/login"
```
#### 3. Multi User Strategy (Custom Build Extension Point)
```python
# openhands/auth/strategies/multi_user_strategy.py
from typing import Optional
from fastapi import Request
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider
class MultiUserStrategy(AuthStrategy):
"""Multi-user strategy - extension point for custom builds"""
def get_name(self) -> str:
return "multi_user"
def requires_auth(self) -> bool:
return True
async def authenticate(self, request: Request) -> Optional[UserContext]:
# This would be implemented by custom builds/applications built on OH
raise NotImplementedError("Multi-user strategy not available in base OpenHands")
async def get_token_provider(self, request: Request) -> TokenProvider:
raise NotImplementedError("Multi-user strategy not available in base OpenHands")
def get_login_url(self) -> Optional[str]:
return "/api/auth/login"
```
### Integration Points
#### 1. Updated UserAuth
```python
# openhands/server/user_auth/strategy_user_auth.py
from fastapi import Request
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.storage_namespace import StorageNamespace
from openhands.server.user_auth.user_auth import UserAuth
from openhands.storage.settings.settings_store import SettingsStore
from openhands.storage.secrets.secrets_store import SecretsStore
class StrategyUserAuth(UserAuth):
"""UserAuth implementation using AuthStrategy pattern"""
def __init__(self, strategy: AuthStrategy, user_context: Optional[UserContext]):
self.strategy = strategy
self.user_context = user_context
self._storage_namespace = StorageNamespace(
user_context.storage_namespace if user_context else None
)
async def get_user_id(self) -> str | None:
return self.user_context.user_id if self.user_context else None
async def get_user_email(self) -> str | None:
return self.user_context.email if self.user_context else None
# ... other methods using storage_namespace
```
#### 2. FastAPI Dependencies
```python
# openhands/server/dependencies/auth.py
from fastapi import Depends, Request
from openhands.auth.strategies.base import AuthStrategy
from openhands.auth.user_context import UserContext
from openhands.auth.token_provider import TokenProvider
from openhands.server.shared import get_auth_strategy
async def get_current_user(
request: Request,
strategy: AuthStrategy = Depends(get_auth_strategy)
) -> Optional[UserContext]:
"""Get current user context"""
return await strategy.authenticate(request)
async def get_token_provider(
request: Request,
strategy: AuthStrategy = Depends(get_auth_strategy)
) -> TokenProvider:
"""Get token provider for current request"""
return await strategy.get_token_provider(request)
async def require_auth(
user: Optional[UserContext] = Depends(get_current_user)
) -> UserContext:
"""Require authentication"""
if not user:
raise HTTPException(401, "Authentication required")
return user
```
#### 3. Updated Routes
```python
# openhands/server/routes/git.py (AFTER)
from fastapi import APIRouter, Depends
from openhands.auth.token_provider import TokenProvider
from openhands.auth.user_context import UserContext
from openhands.server.dependencies.auth import get_token_provider, get_current_user
from openhands.integrations.provider import ProviderHandler
app = APIRouter(prefix='/api/user')
@app.get('/repositories')
async def get_user_repositories(
sort: str = "pushed",
selected_provider: ProviderType | None = None,
token_provider: TokenProvider = Depends(get_token_provider),
user: Optional[UserContext] = Depends(get_current_user)
):
"""Get user repositories - no provider_tokens parameter!"""
client = ProviderHandler(token_provider=token_provider)
return await client.get_repositories(sort, selected_provider)
```
## Before/After Code Comparison
### Before: Current Implementation
```python
# BEFORE: openhands/server/routes/git.py
@app.get('/repositories', response_model=list[Repository])
async def get_user_repositories(
sort: str = Query(default='pushed'),
selected_provider: ProviderType | None = Query(default=None),
page: int | None = Query(default=None),
per_page: int | None = Query(default=None),
installation_id: str | None = Query(default=None),
provider_tokens: PROVIDER_TOKEN_TYPE | None = Depends(get_provider_tokens),
access_token: SecretStr | None = Depends(get_access_token),
user_id: str | None = Depends(get_user_id),
):
if provider_tokens:
client = ProviderHandler(
provider_tokens=provider_tokens,
external_auth_token=access_token,
external_auth_id=user_id,
)
# ... complex logic
```
```python
# BEFORE: openhands/storage/locations.py
def get_conversation_dir(sid: str, user_id: str | None = None) -> str:
if user_id:
return f'users/{user_id}/conversations/{sid}/'
else:
return f'sessions/{sid}/'
```
```python
# BEFORE: openhands/server/user_auth/default_user_auth.py
class DefaultUserAuth(UserAuth):
async def get_user_id(self) -> str | None:
return None # Always None - no multi-tenancy support
async def get_provider_tokens(self) -> PROVIDER_TOKEN_TYPE | None:
user_secrets = await self.get_user_secrets()
if user_secrets is None:
return None
return user_secrets.provider_tokens
```
### After: Proposed Implementation
```python
# AFTER: openhands/server/routes/git.py
@app.get('/repositories', response_model=list[Repository])
async def get_user_repositories(
sort: str = Query(default='pushed'),
selected_provider: ProviderType | None = Query(default=None),
page: int | None = Query(default=None),
per_page: int | None = Query(default=None),
installation_id: str | None = Query(default=None),
token_provider: TokenProvider = Depends(get_token_provider),
user: Optional[UserContext] = Depends(get_current_user),
):
client = ProviderHandler(token_provider=token_provider)
return await client.get_repositories(
sort, server_config.app_mode, selected_provider, page, per_page, installation_id
)
```
```python
# AFTER: openhands/auth/storage_namespace.py
@dataclass(frozen=True)
class StorageNamespace:
namespace: Optional[str]
def get_conversation_dir(self, sid: str) -> str:
if self.namespace:
return f'users/{self.namespace}/conversations/{sid}/'
return f'sessions/{sid}/'
```
```python
# AFTER: openhands/server/user_auth/strategy_user_auth.py
class StrategyUserAuth(UserAuth):
def __init__(self, strategy: AuthStrategy, user_context: Optional[UserContext]):
self.strategy = strategy
self.user_context = user_context
self.storage_namespace = StorageNamespace(
user_context.storage_namespace if user_context else None
)
async def get_user_id(self) -> str | None:
return self.user_context.user_id if self.user_context else None
```
## Configuration
### Environment Variables
```bash
# Authentication Strategy
OH_AUTH_STRATEGY=none # Options: none, single_user, multi_user
# Single User Mode Settings
OH_ENABLE_SU_AUTH=false # Enable GitHub OAuth in SU mode
OH_SU_GITHUB_USERNAME=your_username # Restrict access to specific user
OH_GITHUB_CLIENT_ID=your_client_id
OH_GITHUB_CLIENT_SECRET=your_client_secret
# Multi User Mode (custom build extension point)
OH_MU_ADMIN_USERNAME=admin_user
```
### Configuration Modes
#### 1. None Mode (Current Default)
```bash
OH_AUTH_STRATEGY=none
# No additional config needed
```
#### 2. Single User - No Auth
```bash
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=false
```
#### 3. Single User - GitHub Auth
```bash
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=true
OH_SU_GITHUB_USERNAME=your_username
OH_GITHUB_CLIENT_ID=your_client_id
OH_GITHUB_CLIENT_SECRET=your_client_secret
```
## Implementation Benefits
### 1. Clean Separation of Concerns
- Auth logic isolated in strategy classes
- Business logic doesn't need to know about user_id
- Clear boundaries between auth and core functionality
### 2. Reduced Complexity
- Eliminates 7 redundant `if user_id` guards
- Removes provider_tokens dependency injection
- Simplifies method signatures throughout codebase
### 3. Forward Compatibility
- custom builds can extend with custom strategies
- Token refresh/rotation support built-in
- Multi-tenancy ready without core changes
### 4. Security Improvements
- Tokens never exposed in route parameters
- Centralized token management
- Immutable user context prevents tampering
### 5. Developer Experience
- Clear configuration options
- Easy mode switching
- Consistent patterns across codebase
## Migration Strategy
### Phase 1: Foundation
1. Introduce auth strategy interfaces
2. Add UserContext and StorageNamespace
3. Create TokenProvider abstraction
4. Update core dependencies
### Phase 2: Strategy Implementation
1. Implement NoneStrategy (backward compatible)
2. Implement SingleUserStrategy
3. Add configuration support
4. Update UserAuth integration
### Phase 3: Route Migration
1. Update FastAPI dependencies
2. Remove provider_tokens dependency injection
3. Update ProviderHandler integration
4. Clean up redundant if-guards
### Phase 4: Storage Migration
1. Replace storage path helpers
2. Update conversation managers
3. Migrate event stores
4. Clean up legacy code
## Testing Strategy
### Unit Tests
- Strategy implementations
- UserContext immutability
- StorageNamespace path generation
- TokenProvider implementations
### Integration Tests
- End-to-end auth flows
- Route authentication
- Storage partitioning
- Configuration switching
### Migration Tests
- Backward compatibility
- Data migration paths
- Configuration validation
## Future Extensions
### custom builds Integration Points
```python
# custom builds can provide their own strategies
class custom buildsMultiUserStrategy(AuthStrategy):
async def authenticate(self, request: Request) -> Optional[UserContext]:
# Custom custom builds authentication logic
pass
async def get_token_provider(self, request: Request) -> TokenProvider:
# custom builds token refresh/rotation
return CustomBuildTokenProvider(request)
```
### Additional Auth Methods
- SAML/OIDC strategies
- API key authentication
- Custom JWT providers
- Enterprise SSO integration
## Architecture Diagrams
### 1. Overall Auth System Architecture
```mermaid
graph TB
subgraph "FastAPI Application"
Routes[API Routes]
Deps[FastAPI Dependencies]
end
subgraph "Auth Layer"
AuthStrategy[AuthStrategy Interface]
NoneStrategy[NoneStrategy]
SUStrategy[SingleUserStrategy]
MUStrategy[MultiUserStrategy]
end
subgraph "Core Abstractions"
UserContext[UserContext]
TokenProvider[TokenProvider Interface]
StorageNamespace[StorageNamespace]
end
subgraph "Token Providers"
DefaultTP[DefaultTokenProvider]
SingleUserTP[SingleUserTokenProvider]
custom buildsTP[CustomBuildTokenProvider]
end
subgraph "Storage Layer"
SecretsStore[SecretsStore]
SettingsStore[SettingsStore]
ConversationStore[ConversationStore]
end
Routes --> Deps
Deps --> AuthStrategy
AuthStrategy --> UserContext
AuthStrategy --> TokenProvider
UserContext --> StorageNamespace
NoneStrategy --> DefaultTP
SUStrategy --> SingleUserTP
MUStrategy --> custom buildsTP
TokenProvider --> SecretsStore
StorageNamespace --> ConversationStore
StorageNamespace --> SettingsStore
```
### 2. Authentication Flow - None Strategy
```mermaid
sequenceDiagram
participant Client
participant Route
participant NoneStrategy
participant DefaultTP
participant SecretsStore
Client->>Route: API Request
Route->>NoneStrategy: authenticate(request)
NoneStrategy->>Route: None (no user context)
Route->>NoneStrategy: get_token_provider(request)
NoneStrategy->>DefaultTP: create()
DefaultTP->>SecretsStore: load secrets.json
SecretsStore->>DefaultTP: provider tokens
DefaultTP->>Route: token provider
Route->>Client: API Response
```
### 3. Authentication Flow - Single User Strategy (No Auth)
```mermaid
sequenceDiagram
participant Client
participant Route
participant SUStrategy
participant UserContext
participant SingleUserTP
participant SecretsStore
Client->>Route: API Request
Route->>SUStrategy: authenticate(request)
SUStrategy->>UserContext: create virtual user (local)
UserContext->>SUStrategy: user context
SUStrategy->>Route: UserContext(user_id="local")
Route->>SUStrategy: get_token_provider(request)
SUStrategy->>SingleUserTP: create(user_context)
SingleUserTP->>SecretsStore: load user secrets
SecretsStore->>SingleUserTP: provider tokens
SingleUserTP->>Route: token provider
Route->>Client: API Response
```
### 4. Authentication Flow - Single User Strategy (GitHub Auth)
```mermaid
sequenceDiagram
participant Client
participant Route
participant SUStrategy
participant GitHub
participant UserContext
participant SingleUserTP
participant Database
Client->>Route: API Request with JWT cookie
Route->>SUStrategy: authenticate(request)
SUStrategy->>SUStrategy: extract JWT token
SUStrategy->>SUStrategy: validate JWT
SUStrategy->>SUStrategy: check allowed user
SUStrategy->>UserContext: create from JWT data
UserContext->>SUStrategy: user context
SUStrategy->>Route: UserContext
Route->>SUStrategy: get_token_provider(request)
SUStrategy->>SingleUserTP: create(user_context)
SingleUserTP->>Database: load encrypted tokens
Database->>SingleUserTP: encrypted provider tokens
SingleUserTP->>Route: token provider
Route->>Client: API Response
```
### 5. GitHub OAuth Flow - Single User Strategy
```mermaid
sequenceDiagram
participant Client
participant AuthRoute
participant GitHub
participant SUStrategy
participant Database
participant UserContext
Client->>AuthRoute: GET /auth/login
AuthRoute->>Client: GitHub OAuth URL
Client->>GitHub: OAuth authorization
GitHub->>AuthRoute: GET /auth/callback?code=xxx
AuthRoute->>GitHub: exchange code for token
GitHub->>AuthRoute: access token + user info
AuthRoute->>SUStrategy: validate user allowed
SUStrategy->>AuthRoute: user authorized
AuthRoute->>Database: create/update user record
Database->>AuthRoute: user saved
AuthRoute->>AuthRoute: create JWT token
AuthRoute->>Client: Set JWT cookie + redirect
```
### 6. Storage Namespace Architecture
```mermaid
graph TB
subgraph "User Context"
UC[UserContext]
UC --> SN[StorageNamespace]
end
subgraph "Storage Paths"
SN --> ConvDir[get_conversation_dir]
SN --> EventsDir[get_events_dir]
SN --> MetaFile[get_metadata_file]
SN --> StateFile[get_state_file]
end
subgraph "Path Examples"
ConvDir --> NonePath[sessions/sid/]
ConvDir --> UserPath[users/user_id/conversations/sid/]
EventsDir --> NoneEvents[sessions/sid/events/]
EventsDir --> UserEvents[users/user_id/conversations/sid/events/]
end
subgraph "Strategy Impact"
NoneStrategy2[NoneStrategy] --> NonePath
SUStrategy2[SingleUserStrategy] --> UserPath
MUStrategy2[MultiUserStrategy] --> UserPath
end
```
### 7. Token Provider Architecture
```mermaid
graph TB
subgraph "Token Provider Interface"
TP[TokenProvider]
TP --> GetToken[get_token(provider)]
TP --> GetAllTokens[get_all_tokens()]
end
subgraph "Implementations"
DefaultTP2[DefaultTokenProvider]
SingleUserTP2[SingleUserTokenProvider]
custom buildsTP2[CustomBuildTokenProvider]
end
subgraph "Token Sources"
SecretsJSON[secrets.json]
UserDB[User Database]
Custom Build API[custom builds Token API]
end
subgraph "Provider Integration"
ProviderHandler[ProviderHandler]
GitHubService[GitHubService]
GitLabService[GitLabService]
BitBucketService[BitBucketService]
end
DefaultTP2 --> SecretsJSON
SingleUserTP2 --> UserDB
custom buildsTP2 --> Custom Build API
TP --> ProviderHandler
ProviderHandler --> GitHubService
ProviderHandler --> GitLabService
ProviderHandler --> BitBucketService
```
### 8. Configuration-Driven Strategy Selection
```mermaid
graph TB
subgraph "Configuration"
Config[OH_AUTH_STRATEGY]
Config --> None[none]
Config --> SU[single_user]
Config --> MU[multi_user]
end
subgraph "Strategy Factory"
Factory[AuthStrategyFactory]
Factory --> CreateNone[create NoneStrategy]
Factory --> CreateSU[create SingleUserStrategy]
Factory --> CreateMU[create MultiUserStrategy]
end
subgraph "Additional Config"
SUConfig[OH_ENABLE_SU_AUTH<br/>OH_SU_GITHUB_USERNAME<br/>OH_GITHUB_CLIENT_ID]
MUConfig[OH_MU_ADMIN_USERNAME<br/>Database Config]
end
None --> CreateNone
SU --> CreateSU
MU --> CreateMU
CreateSU --> SUConfig
CreateMU --> MUConfig
```
## Conclusion
This AuthSystem design provides OpenHands with a robust, extensible authentication foundation that:
1. **Maintains backward compatibility** with the current "None" mode
2. **Enables Single User mode** with optional GitHub OAuth
3. **Provides extension points** for custom builds with multi-user implementations
4. **Cleans up the codebase** by removing scattered user_id threading
5. **Improves security** by centralizing token management
6. **Simplifies development** with clear abstractions and patterns
The design is ready for implementation and will significantly improve OpenHands' authentication capabilities while maintaining its current simplicity for users who don't need authentication.
@@ -0,0 +1,210 @@
# OpenHands AuthSystem Design - Executive Summary
## Goal
Design a flexible authentication system for OpenHands that supports three strategies:
- **None**: Current behavior (no auth, optional GitHub token)
- **SU (Single User)**: GitHub OAuth for personal use
- **MU (Multi User)**: Extension point for custom builds (not in base OH)
## Current Problems
- 339+ `user_id` occurrences scattered across 68 files
- No auth strategy abstraction
- `provider_tokens` dependency injection complexity
- No single-user GitHub OAuth support
- Mixed auth/business logic concerns
## Solution Architecture
### Core Components
1. **AuthStrategy Interface** - Pluggable auth strategies
2. **UserContext** - Immutable user data container
3. **TokenProvider** - Centralized token management
4. **StorageNamespace** - Clean storage path abstraction
### Auth Strategies
```python
# None Strategy (current behavior)
OH_AUTH_STRATEGY=none
# Single User - No Auth (virtual user)
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=false
# Single User - GitHub OAuth
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=true
OH_SU_GITHUB_USERNAME=your_username
OH_GITHUB_CLIENT_ID=your_client_id
OH_GITHUB_CLIENT_SECRET=your_client_secret
```
## 🔄 Key Changes
### Before (Current)
```python
# Route with complex dependencies
async def get_repositories(
provider_tokens: PROVIDER_TOKEN_TYPE = Depends(get_provider_tokens),
user_id: str | None = Depends(get_user_id),
):
client = ProviderHandler(
provider_tokens=provider_tokens,
external_auth_id=user_id,
)
# Scattered path logic
def get_conversation_dir(sid: str, user_id: str | None = None) -> str:
if user_id:
return f'users/{user_id}/conversations/{sid}/'
else:
return f'sessions/{sid}/'
```
### After (Proposed)
```python
# Clean route signature
async def get_repositories(
token_provider: TokenProvider = Depends(get_token_provider),
user: Optional[UserContext] = Depends(get_current_user),
):
client = ProviderHandler(token_provider=token_provider)
# Encapsulated storage logic
@dataclass(frozen=True)
class StorageNamespace:
namespace: Optional[str]
def get_conversation_dir(self, sid: str) -> str:
if self.namespace:
return f'users/{self.namespace}/conversations/{sid}/'
return f'sessions/{sid}/'
```
## Architectural Benefits
### Codebase Cleanup
- Removes 7 redundant `if user_id` guards across the codebase
- Eliminates `provider_tokens` dependency injection complexity
- Reduces method signature complexity throughout the system
- Centralizes storage path logic in dedicated abstractions
### Extensibility
- Strategy pattern enables custom build extension points
- Token refresh/rotation patterns built-in
- Multi-tenancy ready without core changes
- Additional auth methods can be added without refactoring
### Code Organization
- Clear separation of auth and business logic
- Consistent patterns across all authentication modes
- Centralized token and credential management
- Immutable user context prevents state corruption
## Implementation Plan
### Phase 1: Foundation
- [ ] Auth strategy interfaces
- [ ] UserContext & StorageNamespace
- [ ] TokenProvider abstraction
- [ ] Core dependencies
### Phase 2: Strategies
- [ ] NoneStrategy (backward compatible)
- [ ] SingleUserStrategy
- [ ] Configuration support
- [ ] UserAuth integration
### Phase 3: Routes
- [ ] Update FastAPI dependencies
- [ ] Remove provider_tokens
- [ ] Update ProviderHandler
- [ ] Clean redundant guards
### Phase 4: Storage
- [ ] Replace path helpers
- [ ] Update conversation managers
- [ ] Migrate event stores
- [ ] Legacy cleanup
## Architecture Highlights
### Strategy Pattern
```python
class AuthStrategy(ABC):
@abstractmethod
async def authenticate(self, request: Request) -> Optional[UserContext]:
pass
@abstractmethod
async def get_token_provider(self, request: Request) -> TokenProvider:
pass
```
### Immutable User Context
```python
@dataclass(frozen=True)
class UserContext:
user_id: str
email: Optional[str] = None
username: Optional[str] = None
is_admin: bool = False
```
### Token Provider Interface
```python
class TokenProvider(ABC):
@abstractmethod
async def get_token(self, provider: ProviderType) -> Optional[ProviderToken]:
pass
```
## 🔧 Configuration Examples
### Current Default (None)
```bash
# No configuration needed - maintains current behavior
```
### Personal Use (SU without auth)
```bash
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=false
# Creates virtual "local" user, uses secrets.json
```
### Personal Use (SU with GitHub)
```bash
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=true
OH_SU_GITHUB_USERNAME=myusername
OH_GITHUB_CLIENT_ID=abc123
OH_GITHUB_CLIENT_SECRET=secret456
# Requires GitHub OAuth, restricts to specific user
```
## Implementation Readiness
### Backward Compatibility
- None strategy maintains exact current behavior
- No breaking changes for existing users
- Gradual migration path available
### Code Quality Improvements
- Reduces complexity from 339 to ~50 user_id references
- Introduces clear abstractions and boundaries
- Enables better testing and maintainability
### Extensibility Foundation
- Custom builds can add authentication strategies
- Token refresh/rotation patterns built-in
- Multi-tenancy foundation without core changes
## Summary
This design provides a clean authentication architecture for OpenHands with three key outcomes:
1. **Maintains simplicity** - Current users see no changes
2. **Enables extension** - Custom builds can add authentication features
3. **Improves codebase** - Reduces scattered auth logic and complexity
The architecture is well-defined with a clear migration path.
@@ -0,0 +1,214 @@
---
title: AuthSystem Design - Executive Summary
---
# OpenHands AuthSystem Design - Executive Summary
## Goal
Design a flexible authentication system for OpenHands that supports three strategies:
- **None**: Current behavior (no auth, optional GitHub token)
- **SU (Single User)**: GitHub OAuth for personal use
- **MU (Multi User)**: Extension point for custom builds (not in base OH)
## Current Problems
- 339+ `user_id` occurrences scattered across 68 files
- No auth strategy abstraction
- `provider_tokens` dependency injection complexity
- No single-user GitHub OAuth support
- Mixed auth/business logic concerns
## Solution Architecture
### Core Components
1. **AuthStrategy Interface** - Pluggable auth strategies
2. **UserContext** - Immutable user data container
3. **TokenProvider** - Centralized token management
4. **StorageNamespace** - Clean storage path abstraction
### Auth Strategies
```python
# None Strategy (current behavior)
OH_AUTH_STRATEGY=none
# Single User - No Auth (virtual user)
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=false
# Single User - GitHub OAuth
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=true
OH_SU_GITHUB_USERNAME=your_username
OH_GITHUB_CLIENT_ID=your_client_id
OH_GITHUB_CLIENT_SECRET=your_client_secret
```
## 🔄 Key Changes
### Before (Current)
```python
# Route with complex dependencies
async def get_repositories(
provider_tokens: PROVIDER_TOKEN_TYPE = Depends(get_provider_tokens),
user_id: str | None = Depends(get_user_id),
):
client = ProviderHandler(
provider_tokens=provider_tokens,
external_auth_id=user_id,
)
# Scattered path logic
def get_conversation_dir(sid: str, user_id: str | None = None) -> str:
if user_id:
return f'users/{user_id}/conversations/{sid}/'
else:
return f'sessions/{sid}/'
```
### After (Proposed)
```python
# Clean route signature
async def get_repositories(
token_provider: TokenProvider = Depends(get_token_provider),
user: Optional[UserContext] = Depends(get_current_user),
):
client = ProviderHandler(token_provider=token_provider)
# Encapsulated storage logic
@dataclass(frozen=True)
class StorageNamespace:
namespace: Optional[str]
def get_conversation_dir(self, sid: str) -> str:
if self.namespace:
return f'users/{self.namespace}/conversations/{sid}/'
return f'sessions/{sid}/'
```
## Architectural Benefits
### Codebase Cleanup
- Removes 7 redundant `if user_id` guards across the codebase
- Eliminates `provider_tokens` dependency injection complexity
- Reduces method signature complexity throughout the system
- Centralizes storage path logic in dedicated abstractions
### Extensibility
- Strategy pattern enables custom build extension points
- Token refresh/rotation patterns built-in
- Multi-tenancy ready without core changes
- Additional auth methods can be added without refactoring
### Code Organization
- Clear separation of auth and business logic
- Consistent patterns across all authentication modes
- Centralized token and credential management
- Immutable user context prevents state corruption
## Implementation Plan
### Phase 1: Foundation
- [ ] Auth strategy interfaces
- [ ] UserContext & StorageNamespace
- [ ] TokenProvider abstraction
- [ ] Core dependencies
### Phase 2: Strategies
- [ ] NoneStrategy (backward compatible)
- [ ] SingleUserStrategy
- [ ] Configuration support
- [ ] UserAuth integration
### Phase 3: Routes
- [ ] Update FastAPI dependencies
- [ ] Remove provider_tokens
- [ ] Update ProviderHandler
- [ ] Clean redundant guards
### Phase 4: Storage
- [ ] Replace path helpers
- [ ] Update conversation managers
- [ ] Migrate event stores
- [ ] Legacy cleanup
## Architecture Highlights
### Strategy Pattern
```python
class AuthStrategy(ABC):
@abstractmethod
async def authenticate(self, request: Request) -> Optional[UserContext]:
pass
@abstractmethod
async def get_token_provider(self, request: Request) -> TokenProvider:
pass
```
### Immutable User Context
```python
@dataclass(frozen=True)
class UserContext:
user_id: str
email: Optional[str] = None
username: Optional[str] = None
is_admin: bool = False
```
### Token Provider Interface
```python
class TokenProvider(ABC):
@abstractmethod
async def get_token(self, provider: ProviderType) -> Optional[ProviderToken]:
pass
```
## 🔧 Configuration Examples
### Current Default (None)
```bash
# No configuration needed - maintains current behavior
```
### Personal Use (SU without auth)
```bash
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=false
# Creates virtual "local" user, uses secrets.json
```
### Personal Use (SU with GitHub)
```bash
OH_AUTH_STRATEGY=single_user
OH_ENABLE_SU_AUTH=true
OH_SU_GITHUB_USERNAME=myusername
OH_GITHUB_CLIENT_ID=abc123
OH_GITHUB_CLIENT_SECRET=secret456
# Requires GitHub OAuth, restricts to specific user
```
## Implementation Readiness
### Backward Compatibility
- None strategy maintains exact current behavior
- No breaking changes for existing users
- Gradual migration path available
### Code Quality Improvements
- Reduces complexity from 339 to ~50 user_id references
- Introduces clear abstractions and boundaries
- Enables better testing and maintainability
### Extensibility Foundation
- Custom builds can add authentication strategies
- Token refresh/rotation patterns built-in
- Multi-tenancy foundation without core changes
## Summary
This design provides a clean authentication architecture for OpenHands with three key outcomes:
1. **Maintains simplicity** - Current users see no changes
2. **Enables extension** - Custom builds can add authentication features
3. **Improves codebase** - Reduces scattered auth logic and complexity
The architecture is well-defined with a clear migration path.
@@ -29,7 +29,7 @@ describe("EventMessage", () => {
args: {
final_thought: "Task completed successfully",
outputs: {},
thought: { text: "Task completed successfully" },
thought: "Task completed successfully",
},
message: "Task completed successfully",
timestamp: new Date().toISOString(),
@@ -55,7 +55,7 @@ describe("EventMessage", () => {
source: "agent" as const,
action: "message" as const,
args: {
thought: { text: "I need more information to proceed." },
thought: "I need more information to proceed.",
image_urls: null,
file_urls: [],
wait_for_response: true,
@@ -114,7 +114,7 @@ describe("EventMessage", () => {
args: {
final_thought: "Task completed successfully",
outputs: {},
thought: { text: "Task completed successfully" },
thought: "Task completed successfully",
},
message: "Task completed successfully",
timestamp: new Date().toISOString(),
@@ -58,7 +58,7 @@ describe("Messages", () => {
args: {
image_urls: [],
file_urls: [],
thought: { text: "" },
thought: "",
wait_for_response: false,
},
};
@@ -67,14 +67,16 @@ const getMcpActionContent = (event: MCPAction): string => {
const name = event.args.name || "";
const args = event.args.arguments || {};
let details = `**MCP Tool Call:** ${name}\n\n`;
details += `**Arguments:**\n\`\`\`json\n${JSON.stringify(args, null, 2)}\n\`\`\``;
// Include thought if available
if (event.args.thought) {
details += `\n\n**Thought:**\n${event.args.thought}`;
}
details += `\n\n**Arguments:**\n\`\`\`json\n${JSON.stringify(args, null, 2)}\n\`\`\``;
return details;
};
const getThinkActionContent = (event: ThinkAction): string => {
const t = event.args.thought;
return t.reasoning_content ? `${t.reasoning_content}\n\n${t.text}` : t.text;
};
const getThinkActionContent = (event: ThinkAction): string =>
event.args.thought;
const getFinishActionContent = (event: FinishAction): string =>
event.args.final_thought.trim();
@@ -33,20 +33,7 @@ import { useFeedbackExists } from "#/hooks/query/use-feedback-exists";
const hasThoughtProperty = (
obj: Record<string, unknown>,
): obj is {
thought?: { text?: string; reasoning_content?: string | null };
} => {
const { thought } = obj;
if (!thought || typeof thought !== "object") return false;
const { text = "", reasoning_content: rc } = thought as {
text?: string;
reasoning_content?: string | null;
};
return (
(typeof text === "string" && text.length > 0) ||
(typeof rc === "string" && rc.length > 0)
);
};
): obj is { thought: string } => "thought" in obj && !!obj.thought;
interface EventMessageProps {
event: OpenHandsAction | OpenHandsObservation;
@@ -134,20 +121,11 @@ export function EventMessage({
if (hasThoughtProperty(event.args) && event.action !== "think") {
return (
<div>
{event.args.thought?.reasoning_content && (
<GenericEventMessage
title={t("ACTION_MESSAGE$REASONING")}
details={event.args.thought.reasoning_content}
initiallyExpanded={false}
/>
)}
{(event.args.thought?.text || "") !== "" && (
<ChatMessage
type="agent"
message={event.args.thought?.text || ""}
actions={actions}
/>
)}
<ChatMessage
type="agent"
message={event.args.thought}
actions={actions}
/>
{microagentStatus && actions && (
<MicroagentStatusIndicator
status={microagentStatus}
@@ -170,13 +148,6 @@ export function EventMessage({
if (isFinishAction(event)) {
return (
<>
{event.args.thought?.reasoning_content && (
<GenericEventMessage
title="Reasoning"
details={event.args.thought.reasoning_content}
initiallyExpanded={false}
/>
)}
<ChatMessage
type="agent"
message={getEventContent(event).details}
@@ -276,21 +247,7 @@ export function EventMessage({
{isOpenHandsAction(event) &&
hasThoughtProperty(event.args) &&
event.action !== "think" && (
<>
{event.args.thought?.reasoning_content && (
<GenericEventMessage
title={t("ACTION_MESSAGE$REASONING")}
details={event.args.thought.reasoning_content}
initiallyExpanded={false}
/>
)}
{(event.args.thought?.text || "") !== "" && (
<ChatMessage
type="agent"
message={event.args.thought?.text || ""}
/>
)}
</>
<ChatMessage type="agent" message={event.args.thought} />
)}
<GenericEventMessage
-1
View File
@@ -816,7 +816,6 @@ export enum I18nKey {
MICROAGENT_MANAGEMENT$PR_READY_FOR_REVIEW = "MICROAGENT_MANAGEMENT$PR_READY_FOR_REVIEW",
MICROAGENT_MANAGEMENT$PR_NOT_CREATED = "MICROAGENT_MANAGEMENT$PR_NOT_CREATED",
MICROAGENT_MANAGEMENT$ERROR_CREATING_MICROAGENT = "MICROAGENT_MANAGEMENT$ERROR_CREATING_MICROAGENT",
ACTION_MESSAGE$REASONING = "ACTION_MESSAGE$REASONING",
MICROAGENT$STATUS_WAITING = "MICROAGENT$STATUS_WAITING",
MICROAGENT$UNKNOWN_ERROR = "MICROAGENT$UNKNOWN_ERROR",
MICROAGENT$CONVERSATION_STARTING = "MICROAGENT$CONVERSATION_STARTING",
-16
View File
@@ -13055,22 +13055,6 @@
"de": "Etwas ist schiefgelaufen. Versuchen Sie, den Microagenten erneut zu starten.",
"uk": "Щось пішло не так. Спробуйте ініціювати мікроагента ще раз."
},
"ACTION_MESSAGE$REASONING": {
"en": "Reasoning",
"ja": "推論",
"zh-CN": "推理",
"zh-TW": "推理",
"ko-KR": "추론",
"no": "Resonnement",
"ar": "التفكير",
"de": "Begründung",
"fr": "Raisonnement",
"it": "Ragionamento",
"pt": "Raciocínio",
"es": "Razonamiento",
"tr": "Akıl Yürütme",
"uk": "Міркування"
},
"MICROAGENT$STATUS_WAITING": {
"en": "Waiting for runtime to start...",
"ja": "ランタイムの開始を待機中...",
+1 -1
View File
@@ -29,7 +29,7 @@ export const generateAssistantMessageAction = (
timestamp: new Date().toISOString(),
action: "message",
args: {
thought: { text: message },
thought: message,
image_urls: [],
file_urls: [],
wait_for_response: false,
+15 -16
View File
@@ -1,6 +1,5 @@
import { OpenHandsActionEvent } from "./base";
import { ActionSecurityRisk } from "#/state/security-analyzer-slice";
import { Thought } from "./thought";
export interface UserMessageAction extends OpenHandsActionEvent<"message"> {
source: "user";
@@ -27,7 +26,7 @@ export interface CommandAction extends OpenHandsActionEvent<"run"> {
command: string;
security_risk: ActionSecurityRisk;
confirmation_state: "confirmed" | "rejected" | "awaiting_confirmation";
thought: Thought;
thought: string;
hidden?: boolean;
};
}
@@ -36,7 +35,7 @@ export interface AssistantMessageAction
extends OpenHandsActionEvent<"message"> {
source: "agent";
args: {
thought: Thought;
thought: string;
image_urls: string[] | null;
file_urls: string[];
wait_for_response: boolean;
@@ -50,14 +49,14 @@ export interface IPythonAction extends OpenHandsActionEvent<"run_ipython"> {
security_risk: ActionSecurityRisk;
confirmation_state: "confirmed" | "rejected" | "awaiting_confirmation";
kernel_init_code: string;
thought: Thought;
thought: string;
};
}
export interface ThinkAction extends OpenHandsActionEvent<"think"> {
source: "agent";
args: {
thought: Thought;
thought: string;
};
}
@@ -66,7 +65,7 @@ export interface FinishAction extends OpenHandsActionEvent<"finish"> {
args: {
final_thought: string;
outputs: Record<string, unknown>;
thought: Thought;
thought: string;
};
}
@@ -76,7 +75,7 @@ export interface DelegateAction extends OpenHandsActionEvent<"delegate"> {
args: {
agent: "BrowsingAgent";
inputs: Record<string, string>;
thought: Thought;
thought: string;
};
}
@@ -84,7 +83,7 @@ export interface BrowseAction extends OpenHandsActionEvent<"browse"> {
source: "agent";
args: {
url: string;
thought: Thought;
thought: string;
};
}
@@ -94,7 +93,7 @@ export interface BrowseInteractiveAction
timeout: number;
args: {
browser_actions: string;
thought: Thought | null;
thought: string | null;
browsergym_send_msg_to_user: string;
};
}
@@ -103,7 +102,7 @@ export interface FileReadAction extends OpenHandsActionEvent<"read"> {
source: "agent";
args: {
path: string;
thought: Thought;
thought: string;
security_risk: ActionSecurityRisk | null;
impl_source?: string;
view_range?: number[] | null;
@@ -115,7 +114,7 @@ export interface FileWriteAction extends OpenHandsActionEvent<"write"> {
args: {
path: string;
content: string;
thought: Thought;
thought: string;
};
}
@@ -132,7 +131,7 @@ export interface FileEditAction extends OpenHandsActionEvent<"edit"> {
content?: string;
start?: number;
end?: number;
thought: Thought;
thought: string;
security_risk: ActionSecurityRisk | null;
impl_source?: string;
};
@@ -141,7 +140,7 @@ export interface FileEditAction extends OpenHandsActionEvent<"edit"> {
export interface RejectAction extends OpenHandsActionEvent<"reject"> {
source: "agent";
args: {
thought: Thought;
thought: string;
};
}
@@ -150,7 +149,7 @@ export interface RecallAction extends OpenHandsActionEvent<"recall"> {
args: {
recall_type: "workspace_context" | "knowledge";
query: string;
thought: Thought;
thought: string;
};
}
@@ -159,7 +158,7 @@ export interface MCPAction extends OpenHandsActionEvent<"call_tool_mcp"> {
args: {
name: string;
arguments: Record<string, unknown>;
thought?: Thought;
thought?: string;
};
}
@@ -174,7 +173,7 @@ export interface TaskTrackingAction
status: "todo" | "in_progress" | "done";
notes?: string;
}>;
thought: Thought;
thought: string;
};
}
-4
View File
@@ -1,4 +0,0 @@
export interface Thought {
text: string;
reasoning_content?: string | null;
}
@@ -6,7 +6,6 @@ from openhands.core.logger import openhands_logger as logger
from openhands.events.action import (
Action,
BrowseInteractiveAction,
Thought,
)
@@ -63,10 +62,9 @@ class BrowsingActionParserMessage(ActionParser):
def parse(self, action_str: str) -> Action:
msg = f'send_msg_to_user("""{action_str}""")'
return BrowseInteractiveAction(
browser_actions=msg,
thought=Thought(text=action_str),
thought=action_str,
browsergym_send_msg_to_user=action_str,
)
@@ -123,6 +121,6 @@ class BrowsingActionParserBrowseInteractive(ActionParser):
return BrowseInteractiveAction(
browser_actions=browser_actions,
thought=Thought(text=thought),
thought=thought,
browsergym_send_msg_to_user=msg_content,
)
@@ -38,7 +38,6 @@ from openhands.events.action import (
IPythonRunCellAction,
MessageAction,
TaskTrackingAction,
Thought,
)
from openhands.events.action.agent import CondensationRequestAction
from openhands.events.action.mcp import MCPAction
@@ -47,24 +46,13 @@ from openhands.events.tool import ToolCallMetadata
from openhands.llm.tool_names import TASK_TRACKER_TOOL_NAME
def combine_thought(
action: Action, thought: str, reasoning_content: str | None = None
) -> Action:
def combine_thought(action: Action, thought: str) -> Action:
if not hasattr(action, 'thought'):
return action
current_thought = action.thought
# Always normalize to Thought for downstream code
if not isinstance(current_thought, Thought):
current_thought = Thought(text=str(current_thought) if current_thought else '')
action.thought = current_thought
# We have a Thought, so we can update it
cur_text = current_thought.text or ''
if thought:
current_thought.text = f'{thought}\n{cur_text}' if cur_text else thought
if reasoning_content is not None:
current_thought.reasoning_content = reasoning_content
if thought and action.thought:
action.thought = f'{thought}\n{action.thought}'
elif thought:
action.thought = thought
return action
@@ -92,26 +80,12 @@ def response_to_actions(
if hasattr(assistant_msg, 'tool_calls') and assistant_msg.tool_calls:
# Check if there's assistant_msg.content. If so, add it to the thought
thought = ''
reasoning_content: str | None = None
if isinstance(assistant_msg.content, str):
thought = assistant_msg.content
elif isinstance(assistant_msg.content, list):
for msg in assistant_msg.content:
if msg['type'] == 'text':
thought += msg['text']
# Capture optional reasoning content if provided by the model
if msg.get('type') in {'reasoning', 'thinking'} and 'text' in msg:
reasoning_content = (
reasoning_content + '\n' if reasoning_content else ''
) + msg['text']
# Also try direct attributes from LiteLLM message wrapper
for attr in ('reasoning_content', 'reasoning', 'thinking'):
rc = getattr(assistant_msg, attr, None)
if isinstance(rc, str) and rc.strip():
reasoning_content = (
rc if not reasoning_content else reasoning_content + '\n' + rc
)
# Process each tool call to OpenHands action
for i, tool_call in enumerate(assistant_msg.tool_calls):
@@ -257,9 +231,7 @@ def response_to_actions(
# AgentThinkAction
# ================================================
elif tool_call.function.name == ThinkTool['function']['name']:
action = AgentThinkAction(
thought=Thought(text=arguments.get('thought', ''))
)
action = AgentThinkAction(thought=arguments.get('thought', ''))
# ================================================
# CondensationRequestAction
@@ -338,7 +310,7 @@ def response_to_actions(
# We only add thought to the first action
if i == 0:
action = combine_thought(action, thought, reasoning_content)
action = combine_thought(action, thought)
# Add metadata for tool calling
action.tool_call_metadata = ToolCallMetadata(
tool_call_id=tool_call.id,
+1 -2
View File
@@ -12,7 +12,6 @@ from openhands.events.action import (
FileReadAction,
FileWriteAction,
MessageAction,
Thought,
)
from openhands.events.observation import (
AgentStateChangedObservation,
@@ -92,7 +91,7 @@ class DummyAgent(Agent):
},
{
'action': AgentFinishAction(
outputs={}, thought=Thought(text='Task completed'), action='finish'
outputs={}, thought='Task completed', action='finish'
),
'observations': [AgentStateChangedObservation('', AgentState.FINISHED)],
},
@@ -42,23 +42,12 @@ def response_to_actions(
if hasattr(assistant_msg, 'tool_calls') and assistant_msg.tool_calls:
# Check if there's assistant_msg.content. If so, add it to the thought
thought = ''
reasoning_content: str | None = None
if isinstance(assistant_msg.content, str):
thought = assistant_msg.content
elif isinstance(assistant_msg.content, list):
for msg in assistant_msg.content:
if msg['type'] == 'text':
thought += msg['text']
if msg.get('type') in {'reasoning', 'thinking'} and 'text' in msg:
reasoning_content = (
reasoning_content + '\n' if reasoning_content else ''
) + msg['text']
for attr in ('reasoning_content', 'reasoning', 'thinking'):
rc = getattr(assistant_msg, attr, None)
if isinstance(rc, str) and rc.strip():
reasoning_content = (
rc if not reasoning_content else reasoning_content + '\n' + rc
)
# Process each tool call to OpenHands action
for i, tool_call in enumerate(assistant_msg.tool_calls):
@@ -100,7 +89,7 @@ def response_to_actions(
# We only add thought to the first action
if i == 0:
action = combine_thought(action, thought, reasoning_content)
action = combine_thought(action, thought)
# Add metadata for tool calling
action.tool_call_metadata = ToolCallMetadata(
tool_call_id=tool_call.id,
@@ -36,7 +36,6 @@ from openhands.events.action import (
FileReadAction,
MCPAction,
MessageAction,
Thought,
)
from openhands.events.event import FileReadSource
from openhands.events.tool import ToolCallMetadata
@@ -118,23 +117,12 @@ def response_to_actions(
if hasattr(assistant_msg, 'tool_calls') and assistant_msg.tool_calls:
# Check if there's assistant_msg.content. If so, add it to the thought
thought = ''
reasoning_content: str | None = None
if isinstance(assistant_msg.content, str):
thought = assistant_msg.content
elif isinstance(assistant_msg.content, list):
for msg in assistant_msg.content:
if msg['type'] == 'text':
thought += msg['text']
if msg.get('type') in {'reasoning', 'thinking'} and 'text' in msg:
reasoning_content = (
reasoning_content + '\n' if reasoning_content else ''
) + msg['text']
for attr in ('reasoning_content', 'reasoning', 'thinking'):
rc = getattr(assistant_msg, attr, None)
if isinstance(rc, str) and rc.strip():
reasoning_content = (
rc if not reasoning_content else reasoning_content + '\n' + rc
)
# Process each tool call to OpenHands action
for i, tool_call in enumerate(assistant_msg.tool_calls):
@@ -173,9 +161,7 @@ def response_to_actions(
# AgentThinkAction
# ================================================
elif tool_call.function.name == ThinkTool['function']['name']:
action = AgentThinkAction(
thought=Thought(text=arguments.get('thought', ''))
)
action = AgentThinkAction(thought=arguments.get('thought', ''))
# ================================================
# GrepTool (file content search)
@@ -224,7 +210,7 @@ def response_to_actions(
# We only add thought to the first action
if i == 0:
action = combine_thought(action, thought, reasoning_content)
action = combine_thought(action, thought)
# Add metadata for tool calling
action.tool_call_metadata = ToolCallMetadata(
tool_call_id=tool_call.id,
+3 -3
View File
@@ -263,7 +263,7 @@ def display_event(event: Event, config: OpenHandsConfig) -> None:
if isinstance(event, CmdRunAction):
# For CmdRunAction, display thought first, then command
if hasattr(event, 'thought') and event.thought:
display_thought_if_new(str(event.thought))
display_thought_if_new(event.thought)
# Only display the command if it's not already confirmed
# Commands are always shown when AWAITING_CONFIRMATION, so we don't need to show them again when CONFIRMED
@@ -279,7 +279,7 @@ def display_event(event: Event, config: OpenHandsConfig) -> None:
elif isinstance(event, Action):
# For other actions, display thoughts normally
if hasattr(event, 'thought') and event.thought:
display_thought_if_new(str(event.thought))
display_thought_if_new(event.thought)
if hasattr(event, 'final_thought') and event.final_thought:
# Display final thoughts with agent styling
display_message(event.final_thought, is_agent_message=True)
@@ -522,7 +522,7 @@ def display_task_tracking_action(event: TaskTrackingAction) -> None:
"""Display a TaskTracking action in the CLI."""
# Display thought first if present
if hasattr(event, 'thought') and event.thought:
display_thought_if_new(str(event.thought))
display_thought_if_new(event.thought)
# Format the command and task list for display
display_text = f'Command: {event.command}'
+1 -2
View File
@@ -659,8 +659,7 @@ class AgentController:
new_state in (AgentState.USER_CONFIRMED, AgentState.USER_REJECTED)
):
if hasattr(self._pending_action, 'thought'):
# Clear the thought text for confirmed/rejected actions
self._pending_action.thought.text = '' # type: ignore[attr-defined]
self._pending_action.thought = '' # type: ignore[union-attr]
if new_state == AgentState.USER_CONFIRMED:
confirmation_state = ActionConfirmationStatus.CONFIRMED
else:
-2
View File
@@ -2,7 +2,6 @@ from openhands.events.action.action import (
Action,
ActionConfirmationStatus,
ActionSecurityRisk,
Thought,
)
from openhands.events.action.agent import (
AgentDelegateAction,
@@ -45,6 +44,5 @@ __all__ = [
'RecallAction',
'MCPAction',
'TaskTrackingAction',
'Thought',
'ActionSecurityRisk',
]
-36
View File
@@ -21,39 +21,3 @@ class ActionSecurityRisk(int, Enum):
@dataclass
class Action(Event):
runnable: ClassVar[bool] = False
@dataclass
class Thought:
"""Container for agent reasoning.
Attributes:
text: The visible plain thought string used throughout the UI/logs.
reasoning_content: Optional provider-native reasoning content (e.g., OpenAI reasoning).
"""
text: str = ''
reasoning_content: str | None = None
def __bool__(self) -> bool:
return bool(self.text or self.reasoning_content)
def __str__(self) -> str:
# Concatenate provider-native reasoning content and visible text for display.
# Do not rely on this for content sent to the LLM; conversation_memory must use .text only.
if self.reasoning_content and self.text:
return f'{self.reasoning_content}\n\n{self.text}'
if self.reasoning_content:
return self.reasoning_content
return self.text
def __eq__(self, other: object) -> bool: # type: ignore[override]
# Allow comparing Thought to plain strings for backward compatibility in tests/UI code
if isinstance(other, Thought):
return (
self.text == other.text
and self.reasoning_content == other.reasoning_content
)
if isinstance(other, str):
return self.text == other
return NotImplemented # type: ignore[return-value]
+10 -10
View File
@@ -2,7 +2,7 @@ from dataclasses import dataclass, field
from typing import Any
from openhands.core.schema import ActionType
from openhands.events.action import Action, Thought
from openhands.events.action.action import Action
from openhands.events.event import RecallType
@@ -11,7 +11,7 @@ class ChangeAgentStateAction(Action):
"""Fake action, just to notify the client that a task state has changed."""
agent_state: str
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.CHANGE_AGENT_STATE
@property
@@ -32,13 +32,13 @@ class AgentFinishAction(Action):
final_thought: str = ''
outputs: dict[str, Any] = field(default_factory=dict)
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.FINISH
@property
def message(self) -> str:
if self.thought and str(self.thought) != '':
return str(self.thought)
if self.thought != '':
return self.thought
return "All done! What's next on the agenda?"
@@ -51,7 +51,7 @@ class AgentThinkAction(Action):
action (str): The action type, namely ActionType.THINK.
"""
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.THINK
@property
@@ -62,7 +62,7 @@ class AgentThinkAction(Action):
@dataclass
class AgentRejectAction(Action):
outputs: dict = field(default_factory=dict)
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.REJECT
@property
@@ -77,7 +77,7 @@ class AgentRejectAction(Action):
class AgentDelegateAction(Action):
agent: str
inputs: dict
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.DELEGATE
@property
@@ -91,7 +91,7 @@ class RecallAction(Action):
recall_type: RecallType
query: str = ''
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.RECALL
@property
@@ -214,7 +214,7 @@ class TaskTrackingAction(Action):
command: str = 'view'
task_list: list[dict[str, Any]] = field(default_factory=list)
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.TASK_TRACKING
@property
+4 -4
View File
@@ -1,14 +1,14 @@
from dataclasses import dataclass, field
from dataclasses import dataclass
from typing import ClassVar
from openhands.core.schema import ActionType
from openhands.events.action import Action, ActionSecurityRisk, Thought
from openhands.events.action.action import Action, ActionSecurityRisk
@dataclass
class BrowseURLAction(Action):
url: str
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.BROWSE
runnable: ClassVar[bool] = True
security_risk: ActionSecurityRisk = ActionSecurityRisk.UNKNOWN
@@ -29,7 +29,7 @@ class BrowseURLAction(Action):
@dataclass
class BrowseInteractiveAction(Action):
browser_actions: str
thought: Thought = field(default_factory=Thought)
thought: str = ''
browsergym_send_msg_to_user: str = ''
action: str = ActionType.BROWSE_INTERACTIVE
runnable: ClassVar[bool] = True
+3 -4
View File
@@ -1,4 +1,4 @@
from dataclasses import dataclass, field
from dataclasses import dataclass
from typing import ClassVar
from openhands.core.schema import ActionType
@@ -6,7 +6,6 @@ from openhands.events.action.action import (
Action,
ActionConfirmationStatus,
ActionSecurityRisk,
Thought,
)
@@ -16,7 +15,7 @@ class CmdRunAction(Action):
str # When `command` is empty, it will be used to print the current tmux window
)
is_input: bool = False # if True, the command is an input to the running process
thought: Thought = field(default_factory=Thought)
thought: str = ''
blocking: bool = False # if True, the command will be run in a blocking manner, but a timeout must be set through _set_hard_timeout
is_static: bool = False # if True, runs the command in a separate process
cwd: str | None = None # current working directory, only used if is_static is True
@@ -43,7 +42,7 @@ class CmdRunAction(Action):
@dataclass
class IPythonRunCellAction(Action):
code: str
thought: Thought = field(default_factory=Thought)
thought: str = ''
include_extra: bool = (
True # whether to include CWD & Python interpreter in the output
)
+5 -5
View File
@@ -1,8 +1,8 @@
from dataclasses import dataclass, field
from dataclasses import dataclass
from typing import ClassVar
from openhands.core.schema import ActionType
from openhands.events.action import Action, ActionSecurityRisk, Thought
from openhands.events.action.action import Action, ActionSecurityRisk
from openhands.events.event import FileEditSource, FileReadSource
@@ -16,7 +16,7 @@ class FileReadAction(Action):
path: str
start: int = 0
end: int = -1
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.READ
runnable: ClassVar[bool] = True
security_risk: ActionSecurityRisk = ActionSecurityRisk.UNKNOWN
@@ -39,7 +39,7 @@ class FileWriteAction(Action):
content: str
start: int = 0
end: int = -1
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.WRITE
runnable: ClassVar[bool] = True
security_risk: ActionSecurityRisk = ActionSecurityRisk.UNKNOWN
@@ -108,7 +108,7 @@ class FileEditAction(Action):
end: int = -1
# Shared arguments
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.EDIT
runnable: ClassVar[bool] = True
security_risk: ActionSecurityRisk = ActionSecurityRisk.UNKNOWN
+2 -2
View File
@@ -2,14 +2,14 @@ from dataclasses import dataclass, field
from typing import Any, ClassVar
from openhands.core.schema import ActionType
from openhands.events.action import Action, ActionSecurityRisk, Thought
from openhands.events.action.action import Action, ActionSecurityRisk
@dataclass
class MCPAction(Action):
name: str
arguments: dict[str, Any] = field(default_factory=dict)
thought: Thought = field(default_factory=Thought)
thought: str = ''
action: str = ActionType.MCP
runnable: ClassVar[bool] = True
security_risk: ActionSecurityRisk = ActionSecurityRisk.UNKNOWN
+2 -21
View File
@@ -1,7 +1,7 @@
from typing import Any
from openhands.core.exceptions import LLMMalformedActionError
from openhands.events.action import Action, ActionSecurityRisk, Thought
from openhands.events.action.action import Action, ActionSecurityRisk
from openhands.events.action.agent import (
AgentDelegateAction,
AgentFinishAction,
@@ -110,8 +110,7 @@ def action_from_dict(action: dict) -> Action:
raise LLMMalformedActionError(
f"'{action['action']=}' is not defined. Available actions: {ACTION_TYPE_TO_CLASS.keys()}"
)
# Work on a copy of args to avoid mutating the caller's dictionary
args = dict(action.get('args', {}))
args = action.get('args', {})
# Remove timestamp from args if present
timestamp = args.pop('timestamp', None)
@@ -125,24 +124,6 @@ def action_from_dict(action: dict) -> Action:
if 'images_urls' in args:
args['image_urls'] = args.pop('images_urls')
# Convert thought arg from legacy formats and capture optional reasoning_content
rc = args.pop('reasoning_content', None)
if 'thought' in args:
t = args['thought']
if isinstance(t, dict):
# Accept either {'text': '...', 'reasoning_content': '...'} or legacy {'thought': '...'}
text = t.get('text') or t.get('thought') or ''
reasoning_content = t.get('reasoning_content') or rc
args['thought'] = Thought(text=text, reasoning_content=reasoning_content)
elif isinstance(t, str):
args['thought'] = Thought(text=t, reasoning_content=rc)
# Inputs to action_from_dict come from wire (JSON→dict), so t will be dict or str.
# Thought instances should not appear here; if they do, they are out-of-band.
# We intentionally do not handle object instances to keep deserialization strict.
elif rc is not None:
# No text thought provided, but reasoning content exists
args['thought'] = Thought(text='', reasoning_content=rc)
# Handle security_risk deserialization
if 'security_risk' in args and args['security_risk'] is not None:
try:
-16
View File
@@ -99,7 +99,6 @@ def _convert_pydantic_to_dict(obj: BaseModel | dict) -> dict:
def event_to_dict(event: 'Event') -> dict:
props = asdict(event)
d = {}
for key in TOP_KEYS:
if hasattr(event, key) and getattr(event, key) is not None:
@@ -127,22 +126,7 @@ def event_to_dict(event: 'Event') -> dict:
# Remove task_completed from serialization when it's None (backward compatibility)
if 'task_completed' in props and props['task_completed'] is None:
props.pop('task_completed')
if 'action' in d:
# Normalize Thought representation strictly at the action args boundary
# Always emit a dict-shaped thought: {"text": str, "reasoning_content": str|null}
t = props.get('thought', None)
if t is not None:
if isinstance(t, dict):
text = t.get('text') or t.get('thought') or ''
rc = t.get('reasoning_content')
props['thought'] = {'text': text, 'reasoning_content': rc}
elif isinstance(t, str):
props['thought'] = {'text': t, 'reasoning_content': None}
else:
# Any other legacy/unknown shape: coerce to safe string
props['thought'] = {'text': str(t), 'reasoning_content': None}
# Handle security_risk for actions - include it in args
if 'security_risk' in props:
props['security_risk'] = props['security_risk'].value
+1 -8
View File
@@ -1,6 +1,4 @@
import json
from dataclasses import asdict as dataclass_asdict
from dataclasses import is_dataclass
from datetime import datetime
from json_repair import repair_json
@@ -14,18 +12,13 @@ from openhands.llm.metrics import Metrics
class OpenHandsJSONEncoder(json.JSONEncoder):
"""Custom JSON encoder that handles datetime, event objects, and nested dataclasses"""
"""Custom JSON encoder that handles datetime and event objects"""
def default(self, obj):
if isinstance(obj, datetime):
return obj.isoformat()
# Important: handle Event before generic dataclass handling
if isinstance(obj, Event):
return event_to_dict(obj)
# Fallback: serialize any dataclass (e.g., Thought) to a dict
# Guard against dataclass classes (types) which also return True for is_dataclass
if is_dataclass(obj) and not isinstance(obj, type):
return dataclass_asdict(obj) # type: ignore[arg-type]
if isinstance(obj, Metrics):
return obj.get()
if isinstance(obj, ModelResponse):
+8 -12
View File
@@ -19,7 +19,6 @@ from openhands.events.action import (
IPythonRunCellAction,
MessageAction,
TaskTrackingAction,
Thought,
)
from openhands.events.action.mcp import MCPAction
from openhands.events.action.message import SystemMessageAction
@@ -283,24 +282,21 @@ class ConversationMemory:
)
content = assistant_msg.content or ''
# Update the Thought text with assistant content when present
cur_text = action.thought.text
if cur_text != content:
action.thought.text = (
(cur_text + '\n' + content) if cur_text else content
)
# save content if any, to thought
if action.thought:
if action.thought != content:
action.thought += '\n' + content
else:
action.thought = content
# remove the tool call metadata
if hasattr(action, '_tool_call_metadata'):
delattr(action, '_tool_call_metadata')
action.tool_call_metadata = None
if role not in ('user', 'system', 'assistant', 'tool'):
raise ValueError(f'Invalid role: {role}')
# Only send plain thought text to the LLM
thought_text = action.thought.text
return [
Message(
role=role, # type: ignore[arg-type]
content=[TextContent(text=thought_text)],
content=[TextContent(text=action.thought)],
)
]
elif isinstance(action, MessageAction):
+1 -6
View File
@@ -6,9 +6,7 @@ import docker
from openhands.core.logger import openhands_logger as logger
from openhands.events.action.action import Action, ActionSecurityRisk
# Delay import to avoid circular import with openhands.runtime package
# from openhands.runtime.utils import find_available_tcp_port
from openhands.runtime.utils import find_available_tcp_port
from openhands.security.analyzer import SecurityAnalyzer
from openhands.security.invariant.client import InvariantClient
from openhands.security.invariant.parser import TraceElement, parse_element
@@ -55,9 +53,6 @@ class InvariantAnalyzer(SecurityAnalyzer):
self.container = all_containers[0]
all_containers[0].start()
else:
# Local import here to avoid circular import during module initialization
from openhands.runtime.utils import find_available_tcp_port
self.api_port = find_available_tcp_port()
self.container = self.docker_client.containers.run(
self.image_name,
+1 -7
View File
@@ -6,7 +6,6 @@ from openhands.events.action import (
ChangeAgentStateAction,
MessageAction,
NullAction,
Thought,
)
from openhands.events.event import EventSource
from openhands.events.observation import (
@@ -54,12 +53,7 @@ def parse_action(trace: list[TraceElement], action: Action) -> list[TraceElement
function = Function(name=action.action, arguments=args)
if thought is not None:
# We assume Thought is a Thought instance here
if isinstance(thought, Thought):
inv_trace.append(Message(role='assistant', content=thought.text))
else:
# If for some reason it's not Thought (shouldn't happen here), emit empty
inv_trace.append(Message(role='assistant', content=''))
inv_trace.append(Message(role='assistant', content=thought))
inv_trace.append(ToolCall(id=next_id, type='function', function=function))
else:
logger.error(f'Unknown action type: {type(action)}')
+21 -30
View File
@@ -76,7 +76,7 @@ def test_agent_finish_action_serialization_deserialization():
'action': 'finish',
'args': {
'outputs': {},
'thought': {'text': '', 'reasoning_content': None},
'thought': '',
'final_thought': '',
},
}
@@ -89,7 +89,7 @@ def test_agent_finish_action_legacy_task_completed_serialization():
'action': 'finish',
'args': {
'outputs': {},
'thought': {'text': '', 'reasoning_content': None},
'thought': '',
'final_thought': 'Task completed',
'task_completed': 'true', # This should be ignored during deserialization
},
@@ -110,7 +110,7 @@ def test_agent_finish_action_legacy_task_completed_serialization():
def test_agent_reject_action_serialization_deserialization():
original_action_dict = {
'action': 'reject',
'args': {'outputs': {}, 'thought': {'text': '', 'reasoning_content': None}},
'args': {'outputs': {}, 'thought': ''},
}
serialization_deserialization(original_action_dict, AgentRejectAction)
@@ -122,7 +122,7 @@ def test_cmd_run_action_serialization_deserialization():
'blocking': False,
'command': 'echo "Hello world"',
'is_input': False,
'thought': {'text': '', 'reasoning_content': None},
'thought': '',
'hidden': False,
'confirmation_state': ActionConfirmationStatus.CONFIRMED,
'is_static': False,
@@ -137,7 +137,7 @@ def test_browse_url_action_serialization_deserialization():
original_action_dict = {
'action': 'browse',
'args': {
'thought': {'text': '', 'reasoning_content': None},
'thought': '',
'url': 'https://www.example.com',
'return_axtree': False,
'security_risk': -1,
@@ -150,7 +150,7 @@ def test_browse_interactive_action_serialization_deserialization():
original_action_dict = {
'action': 'browse_interactive',
'args': {
'thought': {'text': '', 'reasoning_content': None},
'thought': '',
'browser_actions': 'goto("https://www.example.com")',
'browsergym_send_msg_to_user': '',
'return_axtree': False,
@@ -167,7 +167,7 @@ def test_file_read_action_serialization_deserialization():
'path': '/path/to/file.txt',
'start': 0,
'end': -1,
'thought': {'text': 'None', 'reasoning_content': None},
'thought': 'None',
'impl_source': 'default',
'view_range': None,
'security_risk': -1,
@@ -184,7 +184,7 @@ def test_file_write_action_serialization_deserialization():
'content': 'Hello world',
'start': 0,
'end': 1,
'thought': {'text': 'None', 'reasoning_content': None},
'thought': 'None',
'security_risk': -1,
},
}
@@ -204,7 +204,7 @@ def test_file_edit_action_aci_serialization_deserialization():
'content': '',
'start': 1,
'end': -1,
'thought': {'text': 'Replacing text', 'reasoning_content': None},
'thought': 'Replacing text',
'impl_source': 'oh_aci',
'security_risk': -1,
},
@@ -225,7 +225,7 @@ def test_file_edit_action_llm_serialization_deserialization():
'content': 'Updated content',
'start': 1,
'end': 10,
'thought': {'text': 'Updating file content', 'reasoning_content': None},
'thought': 'Updating file content',
'impl_source': 'llm_based_edit',
'security_risk': -1,
},
@@ -239,7 +239,7 @@ def test_cmd_run_action_legacy_serialization():
'args': {
'blocking': False,
'command': 'echo "Hello world"',
'thought': {'text': '', 'reasoning_content': None},
'thought': '',
'hidden': False,
'confirmation_state': ActionConfirmationStatus.CONFIRMED,
'keep_prompt': False, # will be treated as no-op
@@ -259,7 +259,7 @@ def test_cmd_run_action_legacy_serialization():
)
assert event_dict['args']['blocking'] is False
assert event_dict['args']['command'] == 'echo "Hello world"'
assert event_dict['args']['thought'] == {'text': '', 'reasoning_content': None}
assert event_dict['args']['thought'] == ''
assert event_dict['args']['is_input'] is False
@@ -271,7 +271,7 @@ def test_file_llm_based_edit_action_legacy_serialization():
'content': 'dummy content',
'start': 1,
'end': -1,
'thought': {'text': 'Replacing text', 'reasoning_content': None},
'thought': 'Replacing text',
'impl_source': 'oh_aci',
'translated_ipython_code': None,
},
@@ -304,10 +304,7 @@ def test_file_llm_based_edit_action_legacy_serialization():
# Common arguments
assert event_dict['args']['path'] == '/path/to/file.txt'
assert event_dict['args']['impl_source'] == 'oh_aci'
assert event_dict['args']['thought'] == {
'text': 'Replacing text',
'reasoning_content': None,
}
assert event_dict['args']['thought'] == 'Replacing text'
# OH_ACI arguments
assert event_dict['args']['command'] == ''
@@ -366,10 +363,10 @@ def test_file_ohaci_edit_action_legacy_serialization():
# Common arguments
assert event_dict['args']['path'] == '/workspace/game_2048.py'
assert event_dict['args']['impl_source'] == 'oh_aci'
assert event_dict['args']['thought'] == {
'text': "I'll help you create a simple 2048 game in Python. I'll use the str_replace_editor to create the file.",
'reasoning_content': None,
}
assert (
event_dict['args']['thought']
== "I'll help you create a simple 2048 game in Python. I'll use the str_replace_editor to create the file."
)
# OH_ACI arguments
assert event_dict['args']['command'] == 'create'
@@ -389,10 +386,7 @@ def test_agent_microagent_action_serialization_deserialization():
'action': 'recall',
'args': {
'query': 'What is the capital of France?',
'thought': {
'text': 'I need to find information about France',
'reasoning_content': None,
},
'thought': 'I need to find information about France',
'recall_type': 'knowledge',
},
}
@@ -406,7 +400,7 @@ def test_file_read_action_legacy_serialization():
'path': '/workspace/test.txt',
'start': 0,
'end': -1,
'thought': {'text': 'Reading the file contents', 'reasoning_content': None},
'thought': 'Reading the file contents',
'impl_source': 'oh_aci',
'translated_ipython_code': "print(file_editor(**{'command': 'view', 'path': '/workspace/test.txt'}))",
},
@@ -438,10 +432,7 @@ def test_file_read_action_legacy_serialization():
# Common arguments in serialized form
assert event_dict['args']['path'] == '/workspace/test.txt'
assert event_dict['args']['impl_source'] == 'oh_aci'
assert event_dict['args']['thought'] == {
'text': 'Reading the file contents',
'reasoning_content': None,
}
assert event_dict['args']['thought'] == 'Reading the file contents'
# Read-specific arguments in serialized form
assert event_dict['args']['start'] == 0
@@ -1,173 +0,0 @@
import os
import sys
import pytest
# Ensure this repo takes precedence over any installed openhands package
sys.path.insert(
0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..', '..'))
)
from openhands.events.action import (
AgentDelegateAction,
AgentFinishAction,
AgentRejectAction,
ChangeAgentStateAction,
CmdRunAction,
FileEditAction,
FileReadAction,
FileWriteAction,
IPythonRunCellAction,
RecallAction,
TaskTrackingAction,
Thought,
)
from openhands.events.event import RecallType
from openhands.events.serialization.event import event_from_dict, event_to_dict
from openhands.io import json as oh_json
# ---------------------------
# event_to_dict normalization
# ---------------------------
def test_thought_serialization_flatten_with_reasoning():
a = CmdRunAction(command='echo 1', thought=Thought(text='t', reasoning_content='r'))
d = event_to_dict(a)
assert d['action'] == a.action
assert 'args' in d
assert isinstance(d['args']['thought'], dict)
assert d['args']['thought']['text'] == 't'
assert d['args']['thought']['reasoning_content'] == 'r'
# Round-trip back
a2 = event_from_dict(d)
assert isinstance(a2.thought, Thought)
assert a2.thought.text == 't'
assert a2.thought.reasoning_content == 'r'
# ---------------------------
# action_from_dict handling
# ---------------------------
def test_thought_deserialization_from_string_plus_rc():
d = {
'action': 'run',
'args': {'command': 'echo 1', 'thought': 'hello', 'reasoning_content': 'why'},
}
a = event_from_dict(d)
assert isinstance(a.thought, Thought)
assert a.thought.text == 'hello'
assert a.thought.reasoning_content == 'why'
def test_thought_deserialization_from_dict_text_key():
d = {
'action': 'run',
'args': {
'command': 'echo 1',
'thought': {'text': 'hi', 'reasoning_content': 'rc'},
},
}
a = event_from_dict(d)
assert isinstance(a.thought, Thought)
assert a.thought.text == 'hi'
assert a.thought.reasoning_content == 'rc'
def test_thought_deserialization_from_dict_legacy_thought_key():
d = {
'action': 'run',
'args': {'command': 'echo 1', 'thought': {'thought': 'legacy'}},
}
a = event_from_dict(d)
assert isinstance(a.thought, Thought)
assert a.thought.text == 'legacy'
assert a.thought.reasoning_content is None
def test_thought_deserialization_without_thought_but_with_top_level_rc():
d = {
'action': 'run',
'args': {'command': 'echo 1', 'reasoning_content': 'only-rc'},
}
a = event_from_dict(d)
assert isinstance(a.thought, Thought)
assert a.thought.text == ''
assert a.thought.reasoning_content == 'only-rc'
def test_thought_backwards_compat_direct_init_with_str():
# Direct construction with a string should still work; serializer coerces to dict on wire
a = CmdRunAction(command='echo 1', thought='plain') # type: ignore[arg-type]
d = event_to_dict(a)
assert d['args']['thought'] == {'text': 'plain', 'reasoning_content': None}
# When it comes back from wire, it becomes Thought
a2 = event_from_dict(d)
assert isinstance(a2.thought, Thought)
assert a2.thought.text == 'plain'
# ---------------------------
# Round-trip across action types
# ---------------------------
@pytest.mark.parametrize(
'action',
[
CmdRunAction(
command='echo 1', thought=Thought(text='t', reasoning_content='r')
),
IPythonRunCellAction(
code='x=1', thought=Thought(text='t', reasoning_content='r')
),
FileReadAction(path='/tmp/a', thought=Thought(text='t', reasoning_content='r')),
FileWriteAction(
path='/tmp/a', content='c', thought=Thought(text='t', reasoning_content='r')
),
FileEditAction(
path='/tmp/a',
command='view',
thought=Thought(text='t', reasoning_content='r'),
),
AgentFinishAction(
final_thought='done', thought=Thought(text='t', reasoning_content='r')
),
AgentRejectAction(thought=Thought(text='t', reasoning_content='r')),
AgentDelegateAction(
agent='helper', inputs={}, thought=Thought(text='t', reasoning_content='r')
),
ChangeAgentStateAction(
agent_state='running', thought=Thought(text='t', reasoning_content='r')
),
RecallAction(
recall_type=RecallType.WORKSPACE_CONTEXT,
thought=Thought(text='t', reasoning_content='r'),
),
TaskTrackingAction(
task_list=[{'id': 1, 'title': 'a'}],
thought=Thought(text='t', reasoning_content='r'),
),
],
)
def test_thought_serializes_round_trip(action):
d = event_to_dict(action)
assert d['action'] == action.action
assert 'args' in d
assert isinstance(d['args'].get('thought'), dict)
assert d['args']['thought']['text'] == 't'
assert d['args']['thought']['reasoning_content'] == 'r'
# json encoder should handle dicts produced by serializer
s = oh_json.dumps(d)
assert isinstance(s, str) and s
# round-trip back to object
a2 = event_from_dict(d)
assert isinstance(a2.thought, Thought)
assert a2.thought.text == 't'
assert a2.thought.reasoning_content == 'r'
@@ -1,45 +0,0 @@
import os
import sys
from unittest.mock import MagicMock
from openhands.core.config.agent_config import AgentConfig
# Ensure this repo takes precedence over any installed openhands package
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')))
from openhands.events.action import Thought
from openhands.events.action.agent import AgentFinishAction
from openhands.events.action.message import MessageAction
from openhands.memory.conversation_memory import ConversationMemory
from openhands.utils.prompt import PromptManager
def test_llm_receives_only_thought_text():
# Setup
agent_config = AgentConfig()
prompt_manager = MagicMock(spec=PromptManager)
prompt_manager.get_system_message.return_value = 'System message'
cm = ConversationMemory(agent_config, prompt_manager)
user_msg = MessageAction(content='hi')
finish = AgentFinishAction(
final_thought='done',
thought=Thought(text='visible', reasoning_content='secret'),
)
messages = cm.process_events(
condensed_history=[finish],
initial_user_action=user_msg,
max_message_chars=None,
vision_is_active=False,
)
# Find the assistant message produced from AgentFinishAction
assistant_texts = []
for m in messages:
if m.role == 'assistant':
for c in m.content:
if hasattr(c, 'text'):
assistant_texts.append(c.text)
combined = '\n'.join(assistant_texts)
assert 'visible' in combined
assert 'secret' not in combined