Compare commits

29 Commits

Author SHA1 Message Date
openhands 6903223ef7 Merge main into feature/agent-mode-toggle and resolve conflicts 2025-05-17 06:26:56 +00:00
Justin Coffi bf383b4881 Add SSH Microagent (#8436)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyaoww@gmail.com>
2025-05-17 05:56:51 +00:00
OpenHands c17b0ebfc6 Fix issue #8304: [Bug]: Non-native tool use converter fails when builtin tools are disabled (#8310)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-05-17 06:37:45 +02:00
Xingyao Wang 1f390430e5 feat(MCP, microagent): MCP-support for Repo Microagent & add fetch as default tool (#8360)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-05-16 23:32:38 +00:00
Robert Brennan 819bad0777 Fix: Only show login modal for genuine 401 errors, not connection issues (#8540)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 13:33:06 -04:00
dependabot[bot] 2faed14139 chore(deps): bump the version-all group with 5 updates (#8541)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 17:17:38 +00:00
tofarr 4733270e3c Add event search endpoints with filtering and pagination (#8538)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 09:51:40 -06:00
Robert Brennan 21dd91de63 Add info logging for 401 Unauthorized responses (#8527)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 11:46:15 -04:00
Graham Neubig 25619c5a93 Fix #8510: Improve error messages for invalid microagent format (#8511)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 15:01:39 +00:00
Robert Brennan 15f71e7ed6 Update microagent message from "Activated" to "ready" (#8536)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 10:44:37 -04:00
mamoodi 5b583e5f27 Organization and user microagents (#8506) 2025-05-16 10:42:32 -04:00
Rohit Malhotra c191a17afb [Fix]: don't access secrets if doesn't exist (#8535)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 14:30:16 +00:00
Robert Brennan 8ec5d0e043 Add WebSocket connection documentation (#8404)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 10:12:05 -04:00
Robert Brennan f3f038bb60 Fix mypy error for pkg_resources import (#8537)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 14:06:41 +00:00
Robert Brennan b8d3027cfe Always enable logout button regardless of GitHub connection status (#8529)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-16 09:09:33 -04:00
Rohit Malhotra feb04dc65f Plumb custom secrets to runtime (#8330)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-15 20:06:30 -04:00
Engel Nyst 1f827170f4 Fix resolver test (#8530) 2025-05-15 21:57:02 +00:00
Engel Nyst f7cb2d0f64 Restore previous conversation in CLI (#8431) 2025-05-15 23:47:41 +02:00
tofarr 033788c2d0 API Updates to facilitate nested runtimes. (#8525) 2025-05-15 15:38:09 -06:00
dependabot[bot] 21d0990be4 chore(deps): bump the version-all group with 7 updates (#8522)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-05-15 20:17:51 +00:00
tofarr 6227073cff Add missing Ukrainian translations for secrets-related keys (#8526)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-15 19:25:06 +00:00
Graham Neubig 4c38113cb7 Fix CI to check for missing translations (#8486)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-05-15 13:31:44 -04:00
Emmanuel Ferdman fb516dfa0f Remove obsolete task.py file (#8517)
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-05-15 18:54:25 +02:00
openhands 22420ce6a7 Add backend implementation for agent mode toggle feature 2025-04-25 21:23:55 +00:00
Xingyao Wang ef360c1f67 Merge branch 'main' into feature/agent-mode-toggle 2025-04-25 17:18:38 -04:00
openhands b202407a3d Fix TypeScript errors in test files by adding required agent state properties 2025-04-23 01:54:23 +00:00
openhands 52cb8341fb Update agent mode toggle design document with implementation status 2025-04-23 01:40:13 +00:00
openhands d37dfc49c0 Add agent mode toggle functionality with notifications 2025-04-23 01:39:27 +00:00
openhands 01b4729095 Add design document for agent mode toggle feature 2025-04-23 01:32:13 +00:00
96 changed files with 5106 additions and 3227 deletions
+4
@@ -13,6 +13,10 @@ updates:
browsergym:
patterns:
- "browsergym*"
mcp-packages:
patterns:
- "mcp"
- "mcpm"
security-all:
applies-to: "security-updates"
patterns:
+2 -1
@@ -30,11 +30,12 @@ jobs:
run: |
cd frontend
npm install --frozen-lockfile
- name: Lint and TypeScript compilation
- name: Lint, TypeScript compilation, and translation checks
run: |
cd frontend
npm run lint
npm run make-i18n && tsc
npm run check-translation-completeness
# Run lint on the python code
lint-python:
+9 -8
@@ -154,20 +154,21 @@ install-python-dependencies:
fi
@if [ "${INSTALL_PLAYWRIGHT}" != "false" ] && [ "${INSTALL_PLAYWRIGHT}" != "0" ]; then \
if [ -f "/etc/manjaro-release" ]; then \
echo "$(BLUE)Detected Manjaro Linux. Installing Patchright dependencies...$(RESET)"; \
poetry run patchright install chromium; \
echo "$(BLUE)Detected Manjaro Linux. Installing Playwright dependencies...$(RESET)"; \
poetry run pip install playwright; \
poetry run playwright install chromium; \
else \
if [ ! -f cache/patchright_chromium_is_installed.txt ]; then \
echo "Installing patchright chromium..."; \
poetry run patchright install chromium; \
if [ ! -f cache/playwright_chromium_is_installed.txt ]; then \
echo "Running playwright install --with-deps chromium..."; \
poetry run playwright install --with-deps chromium; \
mkdir -p cache; \
touch cache/patchright_chromium_is_installed.txt; \
touch cache/playwright_chromium_is_installed.txt; \
else \
echo "Setup already done. Skipping patchright installation."; \
echo "Setup already done. Skipping playwright installation."; \
fi \
fi \
else \
echo "Skipping Patchright installation (INSTALL_PLAYWRIGHT=${INSTALL_PLAYWRIGHT})."; \
echo "Skipping Playwright installation (INSTALL_PLAYWRIGHT=${INSTALL_PLAYWRIGHT})."; \
fi
@echo "$(GREEN)Python dependencies installed successfully.$(RESET)"
+249
@@ -0,0 +1,249 @@
# Agent Mode Toggle Design Document
## Overview
This document outlines the design for implementing a toggle switch between "Read-only mode" and "Execute mode" in the OpenHands application. This feature will allow users to switch between a restricted ReadOnlyAgent that can only explore and analyze code, and the fully capable CodeActAgent that can modify code and execute commands.
## Motivation
Users often want to explore a codebase and discuss implementation details with the agent before making any changes. The ability to switch between read-only and execute modes provides several benefits:
1. **Safety**: Users can ensure no changes are made during the exploration phase
2. **Clarity**: Clear indication of the agent's current capabilities
3. **Control**: Users decide when to transition from planning to execution
4. **Workflow**: Supports a natural workflow of exploration → planning → implementation
## Architecture
The implementation will leverage the existing agent delegation mechanism in OpenHands. When a user toggles the switch:
1. In **Execute Mode** (default): The application uses the standard CodeActAgent
2. In **Read-only Mode**: The application delegates to a ReadOnlyAgent
### Key Components
#### Frontend
1. **Toggle Switch Component**:
- UI element that shows the current mode and allows switching
- Sends appropriate actions to the event stream when toggled
2. **Agent State Tracking**:
- Redux state to track current agent type and delegation status
- Event listeners to update state based on event stream
3. **Visual Indicators**:
- Mode indicator showing current agent mode
- Visual styling differences between modes
#### Backend
1. **Agent Delegation**:
- Uses existing delegation mechanism to switch to ReadOnlyAgent
- User-initiated FinishAction to end delegation and return to CodeActAgent
2. **Event Stream Integration**:
- AgentDelegateAction to start read-only mode
- AgentFinishAction to end read-only mode
- System messages to indicate mode changes
## Implementation Details
### Frontend Implementation
#### Redux State Extension
```typescript
// Named AgentSliceState here so the interface does not shadow the AgentState enum used for curAgentState.
interface AgentSliceState {
  curAgentState: AgentState;
  currentAgentType: string; // Track the agent type
  isDelegated: boolean; // Track if we're in a delegation
  // other existing fields...
}
const initialState: AgentSliceState = {
  curAgentState: AgentState.IDLE,
  currentAgentType: "CodeActAgent", // Default agent type
  isDelegated: false,
  // other initial values...
};
```
#### Action Generators
```typescript
export const generateDelegateToReadOnlyAction = () => ({
action: ActionType.DELEGATE,
args: {
agent: "ReadOnlyAgent",
inputs: {
task: "Continue the conversation in READ-ONLY MODE. You can explore and analyze code but cannot make changes."
},
thought: "Switching to read-only mode at user's request"
}
});
export const generateFinishDelegationAction = () => ({
action: ActionType.FINISH,
args: {
message: "Switching back to EXECUTE MODE. You now have full capabilities to modify code and execute commands.",
task_completed: "true",
outputs: {
mode_switch: true
}
}
});
```
#### Toggle Switch Component
```tsx
function AgentModeToggle() {
const { t } = useTranslation();
const dispatch = useDispatch();
const { send } = useWsClient();
// Get agent type from Redux
const { currentAgentType, isDelegated } = useSelector((state: RootState) => state.agent);
// Compute if we're in read-only mode
const isReadOnly = currentAgentType === "ReadOnlyAgent";
const handleToggle = () => {
if (isReadOnly) {
// Currently in read-only mode, switch back to execute mode
send(generateFinishDelegationAction());
} else {
// Currently in execute mode, switch to read-only mode
send(generateDelegateToReadOnlyAction());
}
};
return (
<div className="flex items-center gap-2">
<span className="text-sm font-medium">
{isReadOnly ? "Read-Only Mode" : "Execute Mode"}
</span>
<Switch
checked={isReadOnly}
onChange={handleToggle}
className={`${isReadOnly ? 'bg-amber-600' : 'bg-blue-600'} relative inline-flex h-6 w-11 items-center rounded-full`}
>
<span className="sr-only">Toggle agent mode</span>
<span
className={`${isReadOnly ? 'translate-x-6' : 'translate-x-1'} inline-block h-4 w-4 transform rounded-full bg-white transition`}
/>
</Switch>
</div>
);
}
```
#### Event Listener for State Updates
```typescript
function handleEvent(event) {
// Handle agent delegation events
if (event.action === ActionType.DELEGATE) {
// A delegation is starting
dispatch(setDelegationState(true));
dispatch(setAgentType(event.args.agent));
}
// Handle agent delegate observation (delegation ended)
else if (event.observation === "delegate") {
// Delegation has ended, returning to parent agent
dispatch(setDelegationState(false));
dispatch(setAgentType("CodeActAgent")); // Reset to default agent
}
// Handle other events...
}
```
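The state transitions handled by the listener above can also be sketched as a pure function, which makes them easy to test in isolation. This is a minimal sketch, not the project's reducer: the `DelegationEvent` and `DelegationState` types and the `reduceDelegation` name are hypothetical, and the string `"delegate"` is assumed to be the value of `ActionType.DELEGATE`.

```typescript
// Hypothetical event shape: only the fields the listener above inspects.
type DelegationEvent = {
  action?: string;
  observation?: string;
  args?: { agent?: string };
};

// Hypothetical subset of the agent slice that tracks delegation.
interface DelegationState {
  currentAgentType: string;
  isDelegated: boolean;
}

const DEFAULT_AGENT = "CodeActAgent";

// Returns the next state for a single event and never mutates the input.
function reduceDelegation(
  state: DelegationState,
  event: DelegationEvent,
): DelegationState {
  // Assumption: ActionType.DELEGATE serializes to "delegate".
  if (event.action === "delegate") {
    return {
      currentAgentType: event.args?.agent ?? state.currentAgentType,
      isDelegated: true,
    };
  }
  // A delegate observation means the delegate finished and the parent resumes.
  if (event.observation === "delegate") {
    return { currentAgentType: DEFAULT_AGENT, isDelegated: false };
  }
  // Unrelated events leave the delegation state unchanged.
  return state;
}
```

Keeping the transition logic free of `dispatch` calls means the same function could back a Redux reducer or be exercised directly in unit tests.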
### Backend Considerations
The backend implementation will leverage the existing agent delegation mechanism:
1. When the user toggles to read-only mode:
- An AgentDelegateAction is sent to the event stream
- The AgentController creates a ReadOnlyAgent delegate
- All subsequent events are handled by the delegate
2. When the user toggles back to execute mode:
- An AgentFinishAction is sent to the event stream
- The delegate agent finishes its task
- The parent AgentController resumes normal operation
No backend code changes are required as we're using the existing delegation mechanism.
## User Experience
1. **Initial State**: The application starts in Execute Mode with CodeActAgent
2. **Mode Switching**:
- User clicks the toggle switch to enter Read-only Mode
- System message indicates the mode change
- Agent capabilities are restricted to read-only tools
- UI shows visual indicators of the current mode
- User clicks the toggle switch again to return to Execute Mode
- System message indicates the return to full capabilities
3. **Visual Indicators**:
- Toggle switch position (left/right)
- Color coding (amber for read-only, blue for execute)
- Mode label text
- System messages in the conversation
## Future Enhancements
1. **Persistent Mode Preference**: Remember the user's preferred starting mode
2. **Context Preservation**: Improve context retention when switching modes
3. **Custom Tool Sets**: Allow users to customize which tools are available in each mode
4. **Mode-specific Prompts**: Optimize agent prompts for each mode
## Implementation Plan
1. **Frontend Implementation**:
- Add Redux state for agent type tracking ✅
- Create toggle switch component ✅
- Implement event listeners for state updates ✅
- Add visual indicators for current mode ✅
- Add notifications for mode changes ✅
2. **Testing**:
- Test mode switching with various conversation states
- Verify proper tool restrictions in read-only mode
- Test persistence across page refreshes
3. **Documentation**:
- Update user documentation to explain the mode toggle feature
- Add developer documentation for the implementation details ✅
## Implementation Status
The agent mode toggle feature has been implemented with the following components:
1. **Redux State**:
- Added `currentAgentType` and `isDelegated` properties to the agent slice
- Default agent type is set to "CodeActAgent"
2. **Agent Mode Service**:
- Created `agent-mode-service.ts` with action generators for delegation
- Implemented `generateDelegateToReadOnlyAction()` and `generateFinishDelegationAction()`
3. **UI Components**:
- Created `AgentModeToggle` component with toggle switch UI
- Integrated toggle into the agent control bar
- Updated agent status bar to display current mode
- Added color coding (amber for read-only, blue for execute)
4. **Event Handling**:
- Updated `use-handle-ws-events.ts` to process agent delegation events
- Added state updates when delegation starts/ends
- Added notifications to inform users of mode changes
5. **Internationalization**:
- Added translations for all UI elements
- Supported multiple languages through i18n
The implementation is complete and ready for testing. The feature allows users to seamlessly switch between read-only and execute modes during a conversation, with clear visual indicators and notifications of the current mode.
@@ -0,0 +1,181 @@
---
sidebar_position: 9
---
# Connecting to the WebSocket
This guide explains how to connect to the OpenHands WebSocket API to receive real-time events and send actions to the agent.
## Overview
OpenHands uses [Socket.IO](https://socket.io/) for WebSocket communication between the client and server. The WebSocket connection allows you to:
1. Receive real-time events from the agent
2. Send user actions to the agent
3. Maintain a persistent connection for ongoing conversations
## Connecting to the WebSocket
### Connection Parameters
When connecting to the WebSocket, you need to provide the following query parameters:
- `conversation_id`: The ID of the conversation you want to join
- `latest_event_id`: The ID of the latest event you've received (use `-1` for a new connection)
- `providers_set`: (Optional) A comma-separated list of provider types
### Connection Example
Here's a basic example of connecting to the WebSocket using JavaScript:
```javascript
import { io } from "socket.io-client";
const socket = io("http://localhost:3000", {
transports: ["websocket"],
query: {
conversation_id: "your-conversation-id",
latest_event_id: -1,
providers_set: "github,gitlab" // Optional
}
});
socket.on("connect", () => {
console.log("Connected to OpenHands WebSocket");
});
socket.on("oh_event", (event) => {
console.log("Received event:", event);
});
socket.on("connect_error", (error) => {
console.error("Connection error:", error);
});
socket.on("disconnect", (reason) => {
console.log("Disconnected:", reason);
});
```
## Sending Actions to the Agent
To send an action to the agent, use the `oh_user_action` event:
```javascript
// Send a user message to the agent
socket.emit("oh_user_action", {
type: "message",
source: "user",
message: "Hello, can you help me with my project?"
});
```
## Receiving Events from the Agent
The server emits events using the `oh_event` event type. Here are some common event types you might receive:
- User messages (`source: "user", type: "message"`)
- Agent messages (`source: "agent", type: "message"`)
- File edits (`action: "edit"`)
- File writes (`action: "write"`)
- Command executions (`action: "run"`)
Example event handler:
```javascript
socket.on("oh_event", (event) => {
if (event.source === "agent" && event.type === "message") {
console.log("Agent says:", event.message);
} else if (event.action === "run") {
console.log("Command executed:", event.args.command);
console.log("Result:", event.result);
}
});
```
## Using Websocat for Testing
[Websocat](https://github.com/vi/websocat) is a command-line tool for interacting with WebSockets. It's useful for testing your WebSocket connection without writing a full client application.
### Installation
```bash
# On macOS
brew install websocat
# On Linux
curl -L https://github.com/vi/websocat/releases/download/v1.11.0/websocat.x86_64-unknown-linux-musl > websocat
chmod +x websocat
sudo mv websocat /usr/local/bin/
```
### Connecting to the WebSocket
```bash
# Connect to the WebSocket and print all received messages
echo "40{}" | \
websocat "ws://localhost:3000/socket.io/?EIO=4&transport=websocket&conversation_id=your-conversation-id&latest_event_id=-1"
```
### Sending a Message
```bash
# Send a message to the agent
echo '42["oh_user_action",{"type":"message","source":"user","message":"Hello, agent!"}]' | \
websocat "ws://localhost:3000/socket.io/?EIO=4&transport=websocket&conversation_id=your-conversation-id&latest_event_id=-1"
```
### Complete Example with Websocat
Here's a complete example of connecting to the WebSocket, sending a message, and receiving events:
```bash
# Start a persistent connection
websocat -v "ws://localhost:3000/socket.io/?EIO=4&transport=websocket&conversation_id=your-conversation-id&latest_event_id=-1"
# In another terminal, send a message
echo '42["oh_user_action",{"type":"message","source":"user","message":"Can you help me with my project?"}]' | \
websocat "ws://localhost:3000/socket.io/?EIO=4&transport=websocket&conversation_id=your-conversation-id&latest_event_id=-1"
```
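The `40` and `42` prefixes in the Websocat commands come from the Socket.IO wire format: Engine.IO packet type `4` (message) followed by Socket.IO packet type `0` (CONNECT) or `2` (EVENT). The sketch below builds those frames for the default namespace only; the helper names `encodeConnect` and `encodeEvent` are hypothetical and not part of OpenHands or the Socket.IO client.

```typescript
// Engine.IO "4" = message; Socket.IO "0" = CONNECT, "2" = EVENT.
const SOCKET_IO_CONNECT = "40";
const SOCKET_IO_EVENT = "42";

// Builds the CONNECT frame for the default namespace, with an optional auth payload.
function encodeConnect(auth: Record<string, unknown> = {}): string {
  return SOCKET_IO_CONNECT + JSON.stringify(auth);
}

// Builds an EVENT frame: a JSON array of the event name followed by its payload.
function encodeEvent(name: string, payload: unknown): string {
  return SOCKET_IO_EVENT + JSON.stringify([name, payload]);
}

const frame = encodeEvent("oh_user_action", {
  type: "message",
  source: "user",
  message: "Hello, agent!",
});
console.log(frame);
```

A raw WebSocket client generally has to send the CONNECT frame before any EVENT frame on the same connection, which is something the Socket.IO client library otherwise handles for you.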
## Event Structure
Events sent and received through the WebSocket follow a specific structure:
```typescript
interface OpenHandsEvent {
id: string; // Unique event ID
source: string; // "user" or "agent"
timestamp: string; // ISO timestamp
message?: string; // For message events
type?: string; // Event type (e.g., "message")
action?: string; // Action type (e.g., "run", "edit", "write")
args?: any; // Action arguments
result?: any; // Action result
}
```
## Best Practices
1. **Handle Reconnection**: Implement reconnection logic in your client to handle network interruptions.
2. **Track Event IDs**: Store the latest event ID you've received and use it when reconnecting to avoid duplicate events.
3. **Error Handling**: Implement proper error handling for connection errors and failed actions.
4. **Rate Limiting**: Avoid sending too many actions in a short period to prevent overloading the server.
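
The "Track Event IDs" practice can be sketched as a small cursor that remembers the highest ID seen and supplies it as `latest_event_id` on reconnect. This is only a sketch under an assumption: event IDs are treated as integers that increase over time (the `-1` sentinel suggests this, although the `OpenHandsEvent` interface above types `id` as a string). The `EventCursor` class is hypothetical.

```typescript
// Hypothetical helper: tracks the newest event ID and filters replays.
class EventCursor {
  // -1 matches the value the guide uses for a brand-new connection.
  private latest = -1;

  // Returns true if the event is new, false if it was already seen.
  observe(eventId: number): boolean {
    if (eventId <= this.latest) return false;
    this.latest = eventId;
    return true;
  }

  // Query parameters to pass when (re)connecting to the conversation.
  query(conversationId: string): { conversation_id: string; latest_event_id: number } {
    return { conversation_id: conversationId, latest_event_id: this.latest };
  }
}
```

In a client, `observe` would be called inside the `oh_event` handler (after converting the ID to a number if needed) and `query` would feed the `query` option when creating the socket again.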
## Troubleshooting
### Connection Issues
- Verify that the OpenHands server is running and accessible
- Check that you're providing the correct conversation ID
- Ensure your WebSocket URL is correctly formatted
### Authentication Issues
- Make sure you have the necessary authentication cookies if required
- Verify that you have permission to access the specified conversation
### Event Handling Issues
- Check that you're correctly parsing the event data
- Verify that your event handlers are properly registered
@@ -0,0 +1,23 @@
# Organization and User Microagents
## Purpose
Organizations and users can define microagents that apply to all repositories belonging to the organization or user.
## Usage
These microagents can be [any type of microagent](./microagents-overview#microagent-types) and will be loaded
accordingly. However, they are applied to all repositories belonging to the organization or user.
Add a `.openhands` repository under the organization or user, create a `microagents` directory in it, and place the
microagents in that directory.
## Example
General microagent file example for organization `Great-Co` located inside the `.openhands` repository:
`microagents/org-microagent.md`:
```
* Use type hints and error boundaries; validate inputs at system boundaries and fail with meaningful error messages.
* Document interfaces and public APIs; use implementation comments only for non-obvious logic.
* Follow the same naming convention for variables, classes, constants, etc. already used in each repository.
```
@@ -7,7 +7,7 @@ They provide expert guidance, automate common tasks, and ensure consistent pract
Currently OpenHands supports the following types of microagents:
- [General Repository Microagents](./microagents-repo): General guidelines for OpenHands about the repository.
- [General Microagents](./microagents-repo): General guidelines for OpenHands about the repository.
- [Keyword-Triggered Microagents](./microagents-keyword): Guidelines activated by specific keywords in prompts.
To customize OpenHands' behavior, create a .openhands/microagents/ directory in the root of your repository and
@@ -24,7 +24,7 @@ Example repository structure:
some-repository/
└── .openhands/
└── microagents/
└── repo.md # General repository guidelines
└── repo.md # General guidelines
└── trigger_this.md # Microagent triggered by specific keywords
└── trigger_that.md # Microagent triggered by specific keywords
```
@@ -34,7 +34,7 @@ some-repository/
Each microagent file may include frontmatter that provides additional information. In some cases, this frontmatter
is required:
| Microagent Type | Required |
|----------------------------------|----------|
| `General Repository Microagents` | No |
| `Keyword-Triggered Microagents` | Yes |
| Microagent Type | Required |
|---------------------------------|----------|
| `General Microagents` | No |
| `Keyword-Triggered Microagents` | Yes |
@@ -1,4 +1,4 @@
# General Repository Microagents
# General Microagents
## Purpose
@@ -20,7 +20,7 @@ Frontmatter should be enclosed in triple dashes (---) and may include the follow
## Example
General repository microagent file example located at `.openhands/microagents/repo.md`:
General microagent file example located at `.openhands/microagents/repo.md`:
```
This project is a TODO application that allows users to track TODO items.
@@ -28,4 +28,4 @@ To set it up, you can run `npm run build`.
Always make sure the tests are passing before committing changes. You can run the tests by running `npm run test`.
```
[See more examples of general repository microagents here.](https://github.com/All-Hands-AI/OpenHands/tree/main/.openhands/microagents)
[See more examples of general microagents here.](https://github.com/All-Hands-AI/OpenHands/tree/main/.openhands/microagents)
+11 -1
@@ -70,7 +70,7 @@ const sidebars: SidebarsConfig = {
},
{
type: 'doc',
label: 'General Repository Microagents',
label: 'General Microagents',
id: 'usage/prompting/microagents-repo',
},
{
@@ -78,6 +78,11 @@ const sidebars: SidebarsConfig = {
label: 'Keyword-Triggered Microagents',
id: 'usage/prompting/microagents-keyword',
},
{
type: 'doc',
label: 'Organization and User Microagents',
id: 'usage/prompting/microagents-org',
},
{
type: 'doc',
label: 'Global Microagents',
@@ -267,6 +272,11 @@ const sidebars: SidebarsConfig = {
label: 'Evaluation',
id: 'usage/how-to/evaluation-harness',
},
{
type: 'doc',
label: 'WebSocket Connection',
id: 'usage/how-to/websocket-connection',
},
],
},
{
+55
@@ -0,0 +1,55 @@
# Agent Mode Toggle
The Agent Mode Toggle feature allows you to switch between two different agent modes:
1. **Execute Mode** (default): Full capabilities with the CodeActAgent, which can modify code and execute commands
2. **Read-only Mode**: Restricted capabilities with the ReadOnlyAgent, which can only explore and analyze code
## Why Use Different Modes?
- **Safety**: Ensure no changes are made during the exploration phase
- **Clarity**: Clear indication of the agent's current capabilities
- **Control**: Decide when to transition from planning to execution
- **Workflow**: Support a natural workflow of exploration → planning → implementation
## How to Use
1. **Toggle Switch**: Click the toggle switch in the agent control bar to switch between modes
- Blue toggle: Execute Mode (default)
- Amber toggle: Read-only Mode
2. **Mode Indicators**:
- The current mode is displayed in the agent status bar
- System messages indicate when the mode changes
## Available Tools in Each Mode
### Execute Mode (CodeActAgent)
All tools are available, including:
- File editing (`str_replace_editor`)
- Command execution (`execute_bash`)
- Python code execution (`execute_ipython_cell`)
- Web browsing (`browser`, `web_read`)
- Thinking and finishing (`think`, `finish`)
### Read-only Mode (ReadOnlyAgent)
Only non-destructive tools are available:
- File viewing (`view`)
- File searching (`grep`, `glob`)
- Web reading (`web_read`)
- Thinking and finishing (`think`, `finish`)
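
The two tool lists above can be read as a per-mode allow-list. The sketch below encodes them as a lookup table purely for illustration: the `AgentMode` type, `TOOLS_BY_MODE` table, and `isToolAvailable` function are hypothetical, and the Execute Mode entry only contains the tools named above even though that mode is described as having all tools available.

```typescript
type AgentMode = "execute" | "read-only";

// Tool names copied from the lists above; the execute list is not exhaustive.
const TOOLS_BY_MODE: Record<AgentMode, ReadonlySet<string>> = {
  execute: new Set([
    "str_replace_editor",
    "execute_bash",
    "execute_ipython_cell",
    "browser",
    "web_read",
    "think",
    "finish",
  ]),
  "read-only": new Set(["view", "grep", "glob", "web_read", "think", "finish"]),
};

// Returns true when the named tool appears in the list for the given mode.
function isToolAvailable(mode: AgentMode, tool: string): boolean {
  return TOOLS_BY_MODE[mode].has(tool);
}
```

For example, `isToolAvailable("read-only", "execute_bash")` is `false`, which reflects the restriction that Read-only Mode cannot run commands.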
## Best Practices
1. **Start in Read-only Mode** for new codebases to safely explore without making changes
2. **Switch to Execute Mode** when you're ready to implement changes
3. **Return to Read-only Mode** when you want to explore different parts of the codebase
## Technical Details
The agent mode toggle uses OpenHands' agent delegation mechanism:
- When toggling to Read-only Mode, the system delegates to a ReadOnlyAgent
- When toggling back to Execute Mode, the delegation ends and returns to the CodeActAgent
- Context is preserved between mode switches
@@ -20,7 +20,6 @@ describe("AccountSettingsContextMenu", () => {
<AccountSettingsContextMenu
onLogout={onLogoutMock}
onClose={onCloseMock}
isLoggedIn
/>,
);
@@ -35,7 +34,6 @@ describe("AccountSettingsContextMenu", () => {
<AccountSettingsContextMenu
onLogout={onLogoutMock}
onClose={onCloseMock}
isLoggedIn
/>,
);
@@ -45,19 +43,18 @@ describe("AccountSettingsContextMenu", () => {
expect(onLogoutMock).toHaveBeenCalledOnce();
});
test("onLogout should be disabled if the user is not logged in", async () => {
test("logout button is always enabled", async () => {
render(
<AccountSettingsContextMenu
onLogout={onLogoutMock}
onClose={onCloseMock}
isLoggedIn={false}
/>,
);
const logoutOption = screen.getByText("ACCOUNT_SETTINGS$LOGOUT");
await user.click(logoutOption);
expect(onLogoutMock).not.toHaveBeenCalled();
expect(onLogoutMock).toHaveBeenCalledOnce();
});
it("should call onClose when clicking outside of the element", async () => {
@@ -65,7 +62,6 @@ describe("AccountSettingsContextMenu", () => {
<AccountSettingsContextMenu
onLogout={onLogoutMock}
onClose={onCloseMock}
isLoggedIn
/>,
);
@@ -57,7 +57,7 @@ describe("UserActions", () => {
).not.toBeInTheDocument();
});
test("onLogout should not be called when the user is not logged in", async () => {
test("logout button is always enabled", async () => {
render(<UserActions onLogout={onLogoutMock} />);
const userAvatar = screen.getByTestId("user-avatar");
@@ -66,6 +66,6 @@ describe("UserActions", () => {
const logoutOption = screen.getByText("ACCOUNT_SETTINGS$LOGOUT");
await user.click(logoutOption);
expect(onLogoutMock).not.toHaveBeenCalled();
expect(onLogoutMock).toHaveBeenCalledOnce();
});
});
+15 -3
@@ -59,7 +59,11 @@ describe("useTerminal", () => {
it("should render", () => {
renderWithProviders(<TestTerminalComponent commands={[]} />, {
preloadedState: {
agent: { curAgentState: AgentState.RUNNING },
agent: {
curAgentState: AgentState.RUNNING,
currentAgentType: "CodeActAgent",
isDelegated: false
},
cmd: { commands: [] },
},
});
@@ -73,7 +77,11 @@ describe("useTerminal", () => {
renderWithProviders(<TestTerminalComponent commands={commands} />, {
preloadedState: {
agent: { curAgentState: AgentState.RUNNING },
agent: {
curAgentState: AgentState.RUNNING,
currentAgentType: "CodeActAgent",
isDelegated: false
},
cmd: { commands },
},
});
@@ -100,7 +108,11 @@ describe("useTerminal", () => {
/>,
{
preloadedState: {
agent: { curAgentState: AgentState.RUNNING },
agent: {
curAgentState: AgentState.RUNNING,
currentAgentType: "CodeActAgent",
isDelegated: false
},
cmd: { commands },
},
},
@@ -11,7 +11,6 @@ describe("Translations", () => {
<AccountSettingsContextMenu
onLogout={() => {}}
onClose={() => {}}
isLoggedIn
/>,
);
expect(
+3 -3
@@ -76,9 +76,9 @@ class OpenHands {
): Promise<boolean> {
if (appMode === "oss") return true;
const response =
await openHands.post<AuthenticateResponse>("/api/authenticate");
return response.status === 200;
// Just make the request, if it succeeds (no exception thrown), return true
await openHands.post<AuthenticateResponse>("/api/authenticate");
return true;
}
/**
@@ -7,13 +7,11 @@ import { I18nKey } from "#/i18n/declaration";
interface AccountSettingsContextMenuProps {
onLogout: () => void;
onClose: () => void;
isLoggedIn: boolean;
}
export function AccountSettingsContextMenu({
onLogout,
onClose,
isLoggedIn,
}: AccountSettingsContextMenuProps) {
const ref = useClickOutsideElement<HTMLUListElement>(onClose);
const { t } = useTranslation();
@@ -24,7 +22,7 @@ export function AccountSettingsContextMenu({
ref={ref}
className="absolute right-full md:left-full -top-1 z-10 w-fit"
>
<ContextMenuListItem onClick={onLogout} isDisabled={!isLoggedIn}>
<ContextMenuListItem onClick={onLogout}>
{t(I18nKey.ACCOUNT_SETTINGS$LOGOUT)}
</ContextMenuListItem>
</ContextMenu>
@@ -9,6 +9,7 @@ import { AgentState } from "#/types/agent-state";
import { useWsClient } from "#/context/ws-client-provider";
import { IGNORE_TASK_STATE_MAP } from "#/ignore-task-state-map.constant";
import { ActionButton } from "#/components/shared/buttons/action-button";
import { AgentModeToggle } from "./agent-mode-toggle";
export function AgentControlBar() {
const { t } = useTranslation();
@@ -23,25 +24,29 @@ export function AgentControlBar() {
return (
<div className="flex justify-between items-center gap-20">
<ActionButton
isDisabled={
curAgentState !== AgentState.RUNNING &&
curAgentState !== AgentState.PAUSED
}
content={
curAgentState === AgentState.PAUSED
? t(I18nKey.AGENT$RESUME_TASK)
: t(I18nKey.AGENT$PAUSE_TASK)
}
action={
curAgentState === AgentState.PAUSED
? AgentState.RUNNING
: AgentState.PAUSED
}
handleAction={handleAction}
>
{curAgentState === AgentState.PAUSED ? <PlayIcon /> : <PauseIcon />}
</ActionButton>
<div className="flex items-center gap-4">
<ActionButton
isDisabled={
curAgentState !== AgentState.RUNNING &&
curAgentState !== AgentState.PAUSED
}
content={
curAgentState === AgentState.PAUSED
? t(I18nKey.AGENT$RESUME_TASK)
: t(I18nKey.AGENT$PAUSE_TASK)
}
action={
curAgentState === AgentState.PAUSED
? AgentState.RUNNING
: AgentState.PAUSED
}
handleAction={handleAction}
>
{curAgentState === AgentState.PAUSED ? <PlayIcon /> : <PauseIcon />}
</ActionButton>
<AgentModeToggle />
</div>
</div>
);
}
@@ -0,0 +1,72 @@
import { useSelector } from "react-redux";
import { useTranslation } from "react-i18next";
import { Switch } from "@heroui/react";
import { useWsClient } from "#/context/ws-client-provider";
import { RootState } from "#/store";
import { cn } from "#/utils/utils";
import {
generateDelegateToReadOnlyAction,
generateFinishDelegationAction,
} from "#/services/agent-mode-service";
import { AgentState } from "#/types/agent-state";
import { I18nKey } from "#/i18n/declaration";
export function AgentModeToggle() {
const { t } = useTranslation();
const { send } = useWsClient();
// Get agent type and state from Redux
const { currentAgentType, curAgentState } = useSelector(
(state: RootState) => state.agent,
);
// Compute if we're in read-only mode
const isReadOnly = currentAgentType === "ReadOnlyAgent";
// Check if toggle is disabled (should be disabled during certain agent states)
const isDisabled = [
AgentState.LOADING,
AgentState.INIT,
AgentState.ERROR,
AgentState.RATE_LIMITED,
].includes(curAgentState);
const handleToggle = () => {
if (isReadOnly) {
// Currently in read-only mode, switch back to execute mode
send(generateFinishDelegationAction());
} else {
// Currently in execute mode, switch to read-only mode
send(generateDelegateToReadOnlyAction());
}
};
return (
<div className="flex items-center gap-2">
<Switch
isDisabled={isDisabled}
name="agent-mode"
isSelected={isReadOnly}
onValueChange={handleToggle}
classNames={{
thumb: cn("bg-white w-3 h-3"),
wrapper: cn(
"border border-[#D4D4D4] bg-white px-[6px] w-12 h-6",
"group-data-[selected=true]:border-transparent",
isReadOnly
? "group-data-[selected=true]:bg-amber-600"
: "group-data-[selected=true]:bg-blue-600",
),
label: "text-[#A3A3A3] text-xs",
}}
>
<span className="sr-only">{t(I18nKey.AGENT$MODE_TOGGLE_LABEL)}</span>
<span className="text-sm font-medium ml-2">
{isReadOnly
? t(I18nKey.AGENT$MODE_READ_ONLY)
: t(I18nKey.AGENT$MODE_EXECUTE)}
</span>
</Switch>
</div>
);
}
@@ -24,7 +24,9 @@ const notificationStates = [
export function AgentStatusBar() {
const { t, i18n } = useTranslation();
const { curAgentState } = useSelector((state: RootState) => state.agent);
const { curAgentState, currentAgentType } = useSelector(
(state: RootState) => state.agent,
);
const { curStatusMessage } = useSelector((state: RootState) => state.status);
const { status } = useWsClient();
const { notify } = useNotification();
@@ -99,6 +101,10 @@ export function AgentStatusBar() {
}
}, [curAgentState, status, notify, t]);
// Determine agent mode badge color
const agentModeBadgeColor =
currentAgentType === "ReadOnlyAgent" ? "bg-amber-600" : "bg-blue-600";
return (
<div className="flex flex-col items-center">
<div className="flex items-center bg-base-secondary px-2 py-1 text-gray-400 rounded-[100px] text-sm gap-[6px]">
@@ -106,6 +112,15 @@ export function AgentStatusBar() {
className={`w-2 h-2 rounded-full animate-pulse ${indicatorColor}`}
/>
<span className="text-sm text-stone-400">{t(statusMessage)}</span>
{/* Agent Mode Badge */}
<div
className={`ml-2 px-2 py-0.5 rounded-full text-xs text-white ${agentModeBadgeColor}`}
>
{currentAgentType === "ReadOnlyAgent"
? t(I18nKey.AGENT$MODE_READ_ONLY)
: t(I18nKey.AGENT$MODE_EXECUTE)}
</div>
</div>
</div>
);
@@ -35,7 +35,6 @@ export function UserActions({ onLogout, user, isLoading }: UserActionsProps) {
{accountContextMenuIsVisible && (
<AccountSettingsContextMenu
isLoggedIn={!!user}
onLogout={handleLogout}
onClose={closeAccountMenu}
/>
+18 -1
@@ -1,4 +1,5 @@
import { useQuery } from "@tanstack/react-query";
import axios, { AxiosError } from "axios";
import OpenHands from "#/api/open-hands";
import { useConfig } from "./use-config";
import { useIsOnTosPage } from "#/hooks/use-is-on-tos-page";
@@ -11,7 +12,23 @@ export const useIsAuthed = () => {
return useQuery({
queryKey: ["user", "authenticated", appMode],
queryFn: () => OpenHands.authenticate(appMode!),
queryFn: async () => {
try {
// If in OSS mode or authentication succeeds, return true
await OpenHands.authenticate(appMode!);
return true;
} catch (error) {
// If it's a 401 error, return false (not authenticated)
if (axios.isAxiosError(error)) {
const axiosError = error as AxiosError;
if (axiosError.response?.status === 401) {
return false;
}
}
// For any other error, throw it to put the query in error state
throw error;
}
},
enabled: !!appMode && !isOnTosPage,
staleTime: 1000 * 60 * 5, // 5 minutes
gcTime: 1000 * 60 * 15, // 15 minutes
@@ -0,0 +1,47 @@
import { useEffect } from "react";
import { useDispatch } from "react-redux";
import { setAgentType, setDelegationState } from "#/state/agent-slice";
import ActionType from "#/types/action-type";
/**
* Hook to handle agent mode changes based on WebSocket events
*/
export function useAgentModeHandler(events: Record<string, unknown>[]) {
const dispatch = useDispatch();
useEffect(() => {
// Process only the latest event
if (events.length === 0) return;
const latestEvent = events[events.length - 1];
// Handle agent delegation events
if (
"action" in latestEvent &&
latestEvent.action === ActionType.DELEGATE &&
"args" in latestEvent &&
typeof latestEvent.args === "object" &&
latestEvent.args !== null &&
"agent" in latestEvent.args
) {
// A delegation is starting
dispatch(setDelegationState(true));
dispatch(setAgentType(latestEvent.args.agent as string));
}
// Handle agent delegate observation (delegation ended)
else if (
"observation" in latestEvent &&
latestEvent.observation === "delegate" &&
"data" in latestEvent &&
typeof latestEvent.data === "object" &&
latestEvent.data !== null &&
"status" in latestEvent.data &&
latestEvent.data.status === "finished"
) {
// Delegation has ended, returning to parent agent
dispatch(setDelegationState(false));
dispatch(setAgentType("CodeActAgent")); // Reset to default agent
}
}, [events, dispatch]);
}
+49 -2
@@ -1,11 +1,19 @@
import React from "react";
import { useDispatch } from "react-redux";
import { useTranslation } from "react-i18next";
import { useWsClient } from "#/context/ws-client-provider";
import { generateAgentStateChangeEvent } from "#/services/agent-state-service";
import { addErrorMessage } from "#/state/chat-slice";
import { AgentState } from "#/types/agent-state";
import { ErrorObservation } from "#/types/core/observations";
import { displayErrorToast } from "#/utils/custom-toast-handlers";
import { useEndSession } from "./use-end-session";
import {
displayErrorToast,
displaySuccessToast,
} from "#/utils/custom-toast-handlers";
import { setAgentType, setDelegationState } from "#/state/agent-slice";
import ActionType from "#/types/action-type";
import { I18nKey } from "#/i18n/declaration";
interface ServerError {
error: boolean | string;
@@ -21,6 +29,7 @@ const isErrorObservation = (data: object): data is ErrorObservation =>
export const useHandleWSEvents = () => {
const { events, send } = useWsClient();
const dispatch = useDispatch();
const { t } = useTranslation();
React.useEffect(() => {
if (!events.length) {
@@ -58,5 +67,43 @@ export const useHandleWSEvents = () => {
}),
);
}
}, [events.length]);
// Handle agent mode changes
// Handle agent delegation events
if (
"action" in event &&
event.action === ActionType.DELEGATE &&
"args" in event &&
typeof event.args === "object" &&
event.args !== null &&
"agent" in event.args
) {
// A delegation is starting
const agentType = event.args.agent as string;
dispatch(setDelegationState(true));
dispatch(setAgentType(agentType));
// Show notification
if (agentType === "ReadOnlyAgent") {
displaySuccessToast(t(I18nKey.AGENT$MODE_READ_ONLY));
}
}
// Handle agent delegate observation (delegation ended)
else if (
"observation" in event &&
event.observation === "delegate" &&
"data" in event &&
typeof event.data === "object" &&
event.data !== null &&
"status" in event.data &&
event.data.status === "finished"
) {
// Delegation has ended, returning to parent agent
dispatch(setDelegationState(false));
dispatch(setAgentType("CodeActAgent")); // Reset to default agent
// Show notification
displaySuccessToast(t(I18nKey.AGENT$MODE_EXECUTE));
}
}, [events.length, dispatch, send, t]);
};
+3
@@ -1,5 +1,8 @@
// this file is generated by a script; don't modify it manually!!!
export enum I18nKey {
AGENT$MODE_READ_ONLY = "AGENT$MODE_READ_ONLY",
AGENT$MODE_EXECUTE = "AGENT$MODE_EXECUTE",
AGENT$MODE_TOGGLE_LABEL = "AGENT$MODE_TOGGLE_LABEL",
SECRETS$SECRET_VALUE_REQUIRED = "SECRETS$SECRET_VALUE_REQUIRED",
SECRETS$ADD_SECRET = "SECRETS$ADD_SECRET",
SECRETS$EDIT_SECRET = "SECRETS$EDIT_SECRET",
+73 -21
@@ -1,4 +1,49 @@
{
"AGENT$MODE_READ_ONLY": {
"en": "Read-Only Mode",
"ja": "読み取り専用モード",
"zh-CN": "只读模式",
"zh-TW": "唯讀模式",
"ko-KR": "읽기 전용 모드",
"no": "Skrivebeskyttet modus",
"it": "Modalità di sola lettura",
"pt": "Modo somente leitura",
"es": "Modo de solo lectura",
"ar": "وضع القراءة فقط",
"fr": "Mode lecture seule",
"tr": "Salt okunur mod",
"de": "Nur-Lese-Modus"
},
"AGENT$MODE_EXECUTE": {
"en": "Execute Mode",
"ja": "実行モード",
"zh-CN": "执行模式",
"zh-TW": "執行模式",
"ko-KR": "실행 모드",
"no": "Utførelsesmodus",
"it": "Modalità di esecuzione",
"pt": "Modo de execução",
"es": "Modo de ejecución",
"ar": "وضع التنفيذ",
"fr": "Mode d'exécution",
"tr": "Yürütme modu",
"de": "Ausführungsmodus"
},
"AGENT$MODE_TOGGLE_LABEL": {
"en": "Toggle agent mode",
"ja": "エージェントモードを切り替える",
"zh-CN": "切换代理模式",
"zh-TW": "切換代理模式",
"ko-KR": "에이전트 모드 전환",
"no": "Bytt agentmodus",
"it": "Cambia modalità agente",
"pt": "Alternar modo do agente",
"es": "Cambiar modo del agente",
"ar": "تبديل وضع الوكيل",
"fr": "Basculer le mode de l'agent",
"tr": "Ajan modunu değiştir",
"de": "Agentenmodus umschalten"
},
"SECRETS$SECRET_VALUE_REQUIRED": {
"en": "Secret value is required",
"ja": "シークレット値は必須です",
@@ -12,7 +57,8 @@
"ar": "قيمة السر مطلوبة",
"fr": "La valeur du secret est requise",
"tr": "Gizli değer gereklidir",
"de": "Geheimer Wert ist erforderlich"
"de": "Geheimer Wert ist erforderlich",
"uk": "Значення секрету є обов'язковим"
},
"SECRETS$ADD_SECRET": {
"en": "Add secret",
@@ -27,7 +73,8 @@
"ar": "إضافة سر",
"fr": "Ajouter un secret",
"tr": "Gizli ekle",
"de": "Geheimnis hinzufügen"
"de": "Geheimnis hinzufügen",
"uk": "Додати секрет"
},
"SECRETS$EDIT_SECRET": {
"en": "Edit secret",
@@ -42,7 +89,8 @@
"ar": "تعديل السر",
"fr": "Modifier le secret",
"tr": "Gizliyi düzenle",
"de": "Geheimnis bearbeiten"
"de": "Geheimnis bearbeiten",
"uk": "Редагувати секрет"
},
"SECRETS$NO_SECRETS_FOUND": {
"en": "No secrets found",
@@ -57,7 +105,8 @@
"ar": "لم يتم العثور على أسرار",
"fr": "Aucun secret trouvé",
"tr": "Gizli bulunamadı",
"de": "Keine Geheimnisse gefunden"
"de": "Keine Geheimnisse gefunden",
"uk": "Секретів не знайдено"
},
"SECRETS$ADD_NEW_SECRET": {
"en": "Add a new secret",
@@ -72,7 +121,8 @@
"ar": "إضافة سر جديد",
"fr": "Ajouter un nouveau secret",
"tr": "Yeni bir gizli ekle",
"de": "Neues Geheimnis hinzufügen"
"de": "Neues Geheimnis hinzufügen",
"uk": "Додати новий секрет"
},
"SECRETS$CONFIRM_DELETE_KEY": {
"en": "Are you sure you want to delete this key?",
@@ -87,7 +137,8 @@
"ar": "هل أنت متأكد أنك تريد حذف هذا المفتاح؟",
"fr": "Êtes-vous sûr de vouloir supprimer cette clé ?",
"tr": "Bu anahtarı silmek istediğinizden emin misiniz?",
"de": "Sind Sie sicher, dass Sie diesen Schlüssel löschen möchten?"
"de": "Sind Sie sicher, dass Sie diesen Schlüssel löschen möchten?",
"uk": "Ви впевнені, що хочете видалити цей ключ?"
},
"SETTINGS$MCP_TITLE": {
"en": "Model Context Protocol (MCP)",
@@ -1062,7 +1113,8 @@
"ar": "أسرار",
"fr": "Secrets",
"tr": "Sırları",
"de": "Geheimnisse"
"de": "Geheimnisse",
"uk": "Секрети"
},
"SETTINGS$NAV_API_KEYS": {
"en": "API Keys",
@@ -6649,20 +6701,20 @@
"uk": "Результат інструменту MCP: {{action.payload.args.name}}"
},
"OBSERVATION_MESSAGE$RECALL": {
"en": "Microagent Activated",
"ja": "マイクロエージェントが有効化されました",
"zh-CN": "微代理已激活",
"zh-TW": "微代理已啟動",
"ko-KR": "마이크로에이전트 활성화됨",
"no": "MikroAgent aktivert",
"it": "Microagent attivato",
"pt": "Microagent ativado",
"es": "Microagent activado",
"ar": "تم تنشيط الوكيل المصغر",
"fr": "Microagent activé",
"tr": "MikroAjan Etkinleştirildi",
"de": "Microagent aktiviert",
"uk": "Мікроагент активований"
"en": "Microagent ready",
"ja": "マイクロエージェントの準備完了",
"zh-CN": "微代理已就绪",
"zh-TW": "微代理已就緒",
"ko-KR": "마이크로에이전트 준비됨",
"no": "MikroAgent klar",
"it": "Microagent pronto",
"pt": "Microagent pronto",
"es": "Microagent listo",
"ar": "الوكيل المصغر جاهز",
"fr": "Microagent prêt",
"tr": "MikroAjan hazır",
"de": "Microagent bereit",
"uk": "Мікроагент готовий"
},
"EXPANDABLE_MESSAGE$SHOW_DETAILS": {
"en": "Show details",
+14 -20
@@ -58,7 +58,7 @@ export function ErrorBoundary() {
export default function MainApp() {
const navigate = useNavigate();
const { pathname } = useLocation();
const tosPageStatus = useIsOnTosPage();
const isOnTosPage = useIsOnTosPage();
const { data: settings } = useSettings();
const { error } = useBalance();
const { migrateUserConsent } = useMigrateUserConsent();
@@ -68,7 +68,7 @@ export default function MainApp() {
const {
data: isAuthed,
isFetching: isFetchingAuth,
isError: authError,
isError: isAuthError,
} = useIsAuthed();
// Always call the hook, but we'll only use the result when not on TOS page
@@ -78,30 +78,30 @@ export default function MainApp() {
});
// When on TOS page, we don't use the GitHub auth URL
const effectiveGitHubAuthUrl = tosPageStatus ? null : gitHubAuthUrl;
const effectiveGitHubAuthUrl = isOnTosPage ? null : gitHubAuthUrl;
const [consentFormIsOpen, setConsentFormIsOpen] = React.useState(false);
React.useEffect(() => {
// Don't change language when on TOS page
if (!tosPageStatus && settings?.LANGUAGE) {
if (!isOnTosPage && settings?.LANGUAGE) {
i18n.changeLanguage(settings.LANGUAGE);
}
}, [settings?.LANGUAGE, tosPageStatus]);
}, [settings?.LANGUAGE, isOnTosPage]);
React.useEffect(() => {
// Don't show consent form when on TOS page
if (!tosPageStatus) {
if (!isOnTosPage) {
const consentFormModalIsOpen =
settings?.USER_CONSENTS_TO_ANALYTICS === null;
setConsentFormIsOpen(consentFormModalIsOpen);
}
}, [settings, tosPageStatus]);
}, [settings, isOnTosPage]);
React.useEffect(() => {
// Don't migrate user consent when on TOS page
if (!tosPageStatus) {
if (!isOnTosPage) {
// Migrate user consent to the server if it was previously stored in localStorage
migrateUserConsent({
handleAnalyticsWasPresentInLocalStorage: () => {
@@ -109,7 +109,7 @@ export default function MainApp() {
},
});
}
}, [tosPageStatus]);
}, [isOnTosPage]);
React.useEffect(() => {
if (settings?.IS_NEW_USER && config.data?.APP_MODE === "saas") {
@@ -120,22 +120,16 @@ export default function MainApp() {
React.useEffect(() => {
// Don't do any redirects when on TOS page
// Don't allow users to use the app if it 402s
if (!tosPageStatus && error?.status === 402 && pathname !== "/") {
if (!isOnTosPage && error?.status === 402 && pathname !== "/") {
navigate("/");
}
}, [error?.status, pathname, tosPageStatus]);
}, [error?.status, pathname, isOnTosPage]);
// When on TOS page, we don't make any API calls, so we need to handle this case
const userIsAuthed = tosPageStatus ? false : !!isAuthed && !authError;
// Only show the auth modal if:
// 1. User is not authenticated
// 2. We're not currently on the TOS page
// 3. We're in SaaS mode
const renderAuthModal =
!isAuthed &&
!isAuthError &&
!isFetchingAuth &&
!userIsAuthed &&
!tosPageStatus &&
!isOnTosPage &&
config.data?.APP_MODE === "saas";
return (
@@ -0,0 +1,24 @@
import ActionType from "#/types/action-type";
export const generateDelegateToReadOnlyAction = () => ({
action: ActionType.DELEGATE,
args: {
agent: "ReadOnlyAgent",
inputs: {
task: "Continue the conversation in READ-ONLY MODE. You can explore and analyze code but cannot make changes.",
},
thought: "Switching to read-only mode at user's request",
},
});
export const generateFinishDelegationAction = () => ({
action: ActionType.FINISH,
args: {
message:
"Switching back to EXECUTE MODE. You now have full capabilities to modify code and execute commands.",
task_completed: "true",
outputs: {
mode_switch: true,
},
},
});
+10 -1
@@ -5,14 +5,23 @@ export const agentSlice = createSlice({
name: "agent",
initialState: {
curAgentState: AgentState.LOADING,
currentAgentType: "CodeActAgent", // Default agent type
isDelegated: false, // Track if we're in a delegation
},
reducers: {
setCurrentAgentState: (state, action) => {
state.curAgentState = action.payload;
},
setAgentType: (state, action) => {
state.currentAgentType = action.payload;
},
setDelegationState: (state, action) => {
state.isDelegated = action.payload;
},
},
});
export const { setCurrentAgentState } = agentSlice.actions;
export const { setCurrentAgentState, setAgentType, setDelegationState } =
agentSlice.actions;
export default agentSlice.reducer;
+11
@@ -212,6 +212,17 @@ export const chatSlice = createSlice({
content += `\n\n- ${host} (port ${port})`;
}
}
if (
recallObs.extras.custom_secrets_descriptions &&
Object.keys(recallObs.extras.custom_secrets_descriptions).length > 0
) {
content += `\n\n**Custom Secrets**`;
for (const [name, description] of Object.entries(
recallObs.extras.custom_secrets_descriptions,
)) {
content += `\n\n- $${name}: ${description}`;
}
}
if (recallObs.extras.repo_instructions) {
content += `\n\n**Repository Instructions:**\n\n${recallObs.extras.repo_instructions}`;
}
+1
@@ -123,6 +123,7 @@ export interface RecallObservation extends OpenHandsObservationEvent<"recall"> {
repo_directory?: string;
repo_instructions?: string;
runtime_hosts?: Record<string, number>;
custom_secrets_descriptions?: Record<string, string>;
additional_agent_instructions?: string;
date?: string;
microagent_knowledge?: MicroagentKnowledge[];
+15
@@ -0,0 +1,15 @@
---
# This is a repo microagent that is always activated
# to include necessary default tools implemented with MCP
name: default-tools
type: repo
version: 1.0.0
agent: CodeActAgent
mcp_tools:
stdio_servers:
- name: "fetch"
command: "uvx"
args: ["mcp-server-fetch"]
# We leave the body empty because MCP tools will automatically add the
# tool description for LLMs in tool calls, so there's no need to add extra descriptions.
---
+137
@@ -0,0 +1,137 @@
---
name: SSH Microagent
type: knowledge
version: 1.0.0
agent: CodeActAgent
triggers:
- ssh
- remote server
- remote machine
- remote host
- remote connection
- secure shell
- ssh keys
---
# SSH Microagent
This microagent provides capabilities for establishing and managing SSH connections to remote machines.
## Capabilities
- Establish SSH connections using password or key-based authentication
- Generate and manage SSH key pairs
- Configure SSH for easier connections
- Execute commands on remote machines
- Transfer files between local and remote machines
- Manage SSH configurations and known hosts
## Authentication Methods
### Password Authentication
```bash
ssh username@hostname
```
When prompted, you should ask the user for their password or a private key.
### Key-Based Authentication
Generate a new SSH key pair:
```bash
ssh-keygen -t ed25519 -f ~/.ssh/key_name -C "comment" -N ""
```
Copy the public key to the remote server:
```bash
ssh-copy-id -i ~/.ssh/key_name.pub username@hostname
```
Connect using the private key:
```bash
ssh -i ~/.ssh/key_name username@hostname
```
## SSH Configuration
Create or edit the SSH config file for easier connections:
```bash
mkdir -p ~/.ssh
cat > ~/.ssh/config << 'EOF'
Host alias
HostName hostname_or_ip
User username
IdentityFile ~/.ssh/key_name
Port 22
ServerAliveInterval 60
EOF
chmod 600 ~/.ssh/config
```
Then connect using the alias:
```bash
ssh alias
```
## Common SSH Options
- `-p PORT`: Connect to a specific port
- `-X`: Enable X11 forwarding
- `-L local_port:remote_host:remote_port`: Set up local port forwarding
- `-R remote_port:local_host:local_port`: Set up remote port forwarding
- `-N`: Do not execute a remote command (useful for port forwarding)
- `-f`: Run in background
- `-v`: Verbose mode (add more v's for increased verbosity)
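As a rough illustration of how these options combine, the sketch below assembles a background local port-forwarding command. The `build_tunnel_cmd` helper, the port numbers, and `username@hostname` are hypothetical placeholders for this example, not part of OpenSSH; the function only prints the command so it can be reviewed before being run.

```bash
# Hypothetical helper (an assumption for illustration, not an OpenSSH tool):
# print an ssh command that forwards a local port to a port reachable from
# the remote server, without executing a remote command.
# Arguments: local_port remote_host remote_port target
build_tunnel_cmd() {
  # -N: no remote command, -f: run in background, -L: local port forwarding
  printf 'ssh -N -f -L %s:%s:%s %s\n' "$1" "$2" "$3" "$4"
}

# Example: expose port 5432 on the server as local port 15432.
build_tunnel_cmd 15432 localhost 5432 username@hostname
# prints: ssh -N -f -L 15432:localhost:5432 username@hostname
```

Printing the command first, rather than executing it directly, leaves room to confirm the target host and ports with the user before a tunnel is opened.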
## File Transfer with SCP
Copy a file to the remote server:
```bash
scp /path/to/local/file username@hostname:/path/to/remote/directory/
```
Copy a file from the remote server:
```bash
scp username@hostname:/path/to/remote/file /path/to/local/directory/
```
Copy a directory recursively:
```bash
scp -r /path/to/local/directory username@hostname:/path/to/remote/directory/
```
## SSH Agent
Start the SSH agent:
```bash
eval "$(ssh-agent -s)"
```
Add a key to the agent:
```bash
ssh-add ~/.ssh/key_name
```
## Troubleshooting
- Check SSH service status on remote: `systemctl status sshd`
- Verify SSH port is open: `nc -zv hostname 22`
- Debug connection issues: `ssh -vvv username@hostname`
- Check permissions: SSH private keys should have 600 permissions (`chmod 600 ~/.ssh/key_name`)
- Verify known_hosts: If host key changed, remove the old entry with `ssh-keygen -R hostname`
## Secure SSH Key Management
### Local Storage with Proper Permissions
The most basic approach is to ensure proper file permissions:
```bash
# Set correct permissions for private keys
chmod 600 ~/.ssh/id_ed25519
# Set correct permissions for public keys
chmod 644 ~/.ssh/id_ed25519.pub
# Set correct permissions for SSH directory
chmod 700 ~/.ssh
```
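The same permissions can be applied to every key in a directory at once. The following is a minimal sketch, assuming private keys are named `id_*` and public keys end in `.pub`; the `secure_ssh_dir` function name is hypothetical and chosen only for this example.

```bash
# Hypothetical helper (an assumption for illustration): apply the permissions
# listed above to an SSH directory and the id_* keys inside it.
# Argument: directory to secure (defaults to ~/.ssh)
secure_ssh_dir() {
  dir="${1:-$HOME/.ssh}"
  chmod 700 "$dir"                  # directory: owner-only access
  for key in "$dir"/id_*; do
    [ -e "$key" ] || continue       # skip when the glob matches nothing
    case "$key" in
      *.pub) chmod 644 "$key" ;;    # public keys may be world-readable
      *)     chmod 600 "$key" ;;    # private keys: owner read/write only
    esac
  done
}
```

It can then be called as `secure_ssh_dir ~/.ssh`. Keys with other naming schemes (for example `~/.ssh/key_name`) would not be matched by the `id_*` pattern and would still need an explicit `chmod 600`.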
+1 -1
@@ -22,7 +22,7 @@ def get_version():
pass
try:
from pkg_resources import DistributionNotFound, get_distribution
from pkg_resources import DistributionNotFound, get_distribution # type: ignore
return get_distribution(__package_name__).version
except (ImportError, DistributionNotFound):
+1 -1
@@ -10,7 +10,7 @@ The CodeAct agent operates through a function calling interface. At each turn, t
2. **CodeAct**: Execute actions through a set of well-defined tools:
- Execute Linux `bash` commands with `execute_bash`
- Run Python code in an [IPython](https://ipython.org/) environment with `execute_ipython_cell`
- Interact with web browsers using `browser` and `web_read`
- Interact with web browsers using `browser` and `fetch`
- Edit files using `str_replace_editor` or `edit_file`
![image](https://github.com/All-Hands-AI/OpenHands/assets/38853559/92b622e3-72ad-4a61-8f41-8c040b6d5fb3)
@@ -20,7 +20,6 @@ from openhands.agenthub.codeact_agent.tools.str_replace_editor import (
create_str_replace_editor_tool,
)
from openhands.agenthub.codeact_agent.tools.think import ThinkTool
from openhands.agenthub.codeact_agent.tools.web_read import WebReadTool
from openhands.controller.agent import Agent
from openhands.controller.state.state import State
from openhands.core.config import AgentConfig
@@ -123,7 +122,6 @@ class CodeActAgent(Agent):
if sys.platform == 'win32':
logger.warning('Windows runtime does not support browsing yet')
else:
tools.append(WebReadTool)
tools.append(BrowserTool)
if self.config.enable_jupyter:
tools.append(IPythonTool)
@@ -15,7 +15,6 @@ from openhands.agenthub.codeact_agent.tools import (
IPythonTool,
LLMBasedFileEditTool,
ThinkTool,
WebReadTool,
create_cmd_run_tool,
create_str_replace_editor_tool,
)
@@ -212,16 +211,6 @@ def response_to_actions(
)
action = BrowseInteractiveAction(browser_actions=arguments['code'])
# ================================================
# WebReadTool (simplified browsing)
# ================================================
elif tool_call.function.name == WebReadTool['function']['name']:
if 'url' not in arguments:
raise FunctionCallValidationError(
f'Missing required argument "url" in tool call {tool_call.function.name}'
)
action = BrowseURLAction(url=arguments['url'])
# ================================================
# MCPAction (MCP)
# ================================================
@@ -8,7 +8,7 @@ At the user's request, repository {{ repository_info.repo_name }} has been clone
{{ repository_instructions }}
</REPOSITORY_INSTRUCTIONS>
{% endif %}
{% if runtime_info and (runtime_info.available_hosts or runtime_info.additional_agent_instructions) -%}
{% if runtime_info -%}
<RUNTIME_INFORMATION>
{% if runtime_info.available_hosts %}
The user has access to the following hosts for accessing a web application,
@@ -24,6 +24,14 @@ For example, if you are using vite.config.js, you should set server.host and ser
{% if runtime_info.additional_agent_instructions %}
{{ runtime_info.additional_agent_instructions }}
{% endif %}
{% if runtime_info.custom_secrets_descriptions %}
<CUSTOM_SECRETS>
You have access to the following environment variables
{% for secret_name, secret_description in runtime_info.custom_secrets_descriptions.items() %}
* $**{{ secret_name }}**: {{ secret_description }}
{% endfor %}
</CUSTOM_SECRETS>
{% endif %}
{% if runtime_info.date %}
Today's date is {{ runtime_info.date }} (UTC).
{% endif %}
@@ -5,7 +5,6 @@ from .ipython import IPythonTool
from .llm_based_edit import LLMBasedFileEditTool
from .str_replace_editor import create_str_replace_editor_tool
from .think import ThinkTool
from .web_read import WebReadTool
__all__ = [
'BrowserTool',
@@ -14,6 +13,5 @@ __all__ = [
'IPythonTool',
'LLMBasedFileEditTool',
'create_str_replace_editor_tool',
'WebReadTool',
'ThinkTool',
]
@@ -2,6 +2,8 @@ import sys
from litellm import ChatCompletionToolParam, ChatCompletionToolParamFunctionChunk
from openhands.llm.tool_names import EXECUTE_BASH_TOOL_NAME
_DETAILED_BASH_DESCRIPTION = """Execute a bash command in the terminal within a persistent shell session.
### Command Execution
@@ -45,7 +47,7 @@ def create_cmd_run_tool(
return ChatCompletionToolParam(
type='function',
function=ChatCompletionToolParamFunctionChunk(
name=refine_prompt('execute_bash'),
name=EXECUTE_BASH_TOOL_NAME,
description=refine_prompt(description),
parameters={
'type': 'object',
@@ -1,6 +1,8 @@
from browsergym.core.action.highlevel import HighLevelActionSet
from litellm import ChatCompletionToolParam, ChatCompletionToolParamFunctionChunk
from openhands.llm.tool_names import BROWSER_TOOL_NAME
# from browsergym/core/action/highlevel.py
_browser_action_space = HighLevelActionSet(
subsets=['bid', 'nav'],
@@ -141,7 +143,7 @@ for _, action in _browser_action_space.action_set.items():
BrowserTool = ChatCompletionToolParam(
type='function',
function=ChatCompletionToolParamFunctionChunk(
name='browser',
name=BROWSER_TOOL_NAME,
description=_BROWSER_DESCRIPTION,
parameters={
'type': 'object',
@@ -1,5 +1,7 @@
from litellm import ChatCompletionToolParam, ChatCompletionToolParamFunctionChunk
from openhands.llm.tool_names import FINISH_TOOL_NAME
_FINISH_DESCRIPTION = """Signals the completion of the current task or conversation.
Use this tool when:
@@ -18,7 +20,7 @@ The task_completed field should be set to True if you believed you have complete
FinishTool = ChatCompletionToolParam(
type='function',
function=ChatCompletionToolParamFunctionChunk(
name='finish',
name=FINISH_TOOL_NAME,
description=_FINISH_DESCRIPTION,
parameters={
'type': 'object',
@@ -1,5 +1,7 @@
from litellm import ChatCompletionToolParam, ChatCompletionToolParamFunctionChunk
from openhands.llm.tool_names import STR_REPLACE_EDITOR_TOOL_NAME
_DETAILED_STR_REPLACE_EDITOR_DESCRIPTION = """Custom editing tool for viewing, creating and editing files in plain-text format
* State is persistent across command calls and discussions with the user
* If `path` is a file, `view` displays the result of applying `cat -n`. If `path` is a directory, `view` lists non-hidden files and directories up to 2 levels deep
@@ -55,7 +57,7 @@ def create_str_replace_editor_tool(
return ChatCompletionToolParam(
type='function',
function=ChatCompletionToolParamFunctionChunk(
name='str_replace_editor',
name=STR_REPLACE_EDITOR_TOOL_NAME,
description=description,
parameters={
'type': 'object',
@@ -1,26 +0,0 @@
from litellm import ChatCompletionToolParam, ChatCompletionToolParamFunctionChunk
_WEB_DESCRIPTION = """Read (convert to markdown) content from a webpage. You should prefer using the `web_read` tool over the `browser` tool, but do use the `browser` tool if you need to interact with a webpage (e.g., click a button, fill out a form, etc.) OR read a webpage that contains images.
You may use the `web_read` tool to read text content from a webpage, and even search the webpage content using a Google search query (e.g., url=`https://www.google.com/search?q=YOUR_QUERY`).
Only the most recently read webpage will be available to read. This means you should not follow a link to a new page until you are done with the information on the current page.
"""
WebReadTool = ChatCompletionToolParam(
type='function',
function=ChatCompletionToolParamFunctionChunk(
name='web_read',
description=_WEB_DESCRIPTION,
parameters={
'type': 'object',
'properties': {
'url': {
'type': 'string',
'description': 'The URL of the webpage to read. You can also use a Google search query here (e.g., `https://www.google.com/search?q=YOUR_QUERY`).',
}
},
'required': ['url'],
},
),
)
@@ -17,7 +17,6 @@ from openhands.agenthub.codeact_agent.function_calling import (
from openhands.agenthub.codeact_agent.tools import (
FinishTool,
ThinkTool,
WebReadTool,
)
from openhands.agenthub.readonly_agent.tools import (
GlobTool,
@@ -191,16 +190,6 @@ def response_to_actions(
glob_cmd = glob_to_cmdrun(pattern, path)
action = CmdRunAction(command=glob_cmd, is_input=False)
# ================================================
# WebReadTool (simplified browsing)
# ================================================
elif tool_call.function.name == WebReadTool['function']['name']:
if 'url' not in arguments:
raise FunctionCallValidationError(
f'Missing required argument "url" in tool call {tool_call.function.name}'
)
action = BrowseURLAction(url=arguments['url'])
# ================================================
# MCPAction (MCP)
# ================================================
@@ -249,7 +238,6 @@ def get_tools() -> list[ChatCompletionToolParam]:
return [
ThinkTool,
FinishTool,
WebReadTool,
GrepTool,
GlobTool,
ViewTool,
+55 -16
@@ -1,7 +1,6 @@
import asyncio
import logging
import sys
from uuid import uuid4
from prompt_toolkit.shortcuts import clear
@@ -42,6 +41,7 @@ from openhands.core.setup import (
create_controller,
create_memory,
create_runtime,
generate_sid,
initialize_repository_for_runtime,
)
from openhands.events import EventSource, EventStreamSubscriber
@@ -81,6 +81,16 @@ async def cleanup_session(
if pending:
await asyncio.wait(pending, timeout=5.0)
event_stream = runtime.event_stream
# Save the final state
end_state = controller.get_state()
end_state.save_to_session(
event_stream.sid,
event_stream.file_store,
event_stream.user_id,
)
# Reset agent, close runtime and controller
agent.reset()
runtime.close()
@@ -94,12 +104,13 @@ async def run_session(
config: AppConfig,
settings_store: FileSettingsStore,
current_dir: str,
initial_user_action: str | None = None,
task_content: str | None = None,
session_name: str | None = None,
) -> bool:
reload_microagents = False
new_session_requested = False
sid = str(uuid4())
sid = generate_sid(config, session_name)
is_loaded = asyncio.Event()
is_paused = asyncio.Event() # Event to track agent pause requests
always_confirm_mode = False # Flag to enable always confirm mode
@@ -120,7 +131,7 @@ async def run_session(
agent=agent,
)
controller, _ = create_controller(agent, runtime, config)
controller, initial_state = create_controller(agent, runtime, config)
event_stream = runtime.event_stream
@@ -218,10 +229,9 @@ async def run_session(
def on_event(event: Event) -> None:
loop.create_task(on_event_async(event))
event_stream.subscribe(EventStreamSubscriber.MAIN, on_event, str(uuid4()))
event_stream.subscribe(EventStreamSubscriber.MAIN, on_event, sid)
await runtime.connect()
await add_mcp_tools_to_agent(agent, runtime, config.mcp)
# Initialize repository if needed
repo_directory = None
@@ -240,6 +250,9 @@ async def run_session(
repo_directory=repo_directory,
)
# Add MCP tools to the agent
await add_mcp_tools_to_agent(agent, runtime, memory, config.mcp)
# Clear loading animation
is_loaded.set()
@@ -249,17 +262,38 @@ async def run_session(
# Show OpenHands banner and session ID
display_banner(session_id=sid)
# Show OpenHands welcome
display_welcome_message()
welcome_message = 'What do you want to build?' # from the application
initial_message = '' # from the user
if initial_user_action:
# If there's an initial user action, enqueue it and do not prompt again
display_initial_user_prompt(initial_user_action)
event_stream.add_event(
MessageAction(content=initial_user_action), EventSource.USER
)
if task_content:
initial_message = task_content
# If we loaded a state, we are resuming a previous session
if initial_state is not None:
logger.info(f'Resuming session: {sid}')
if initial_state.last_error:
# If the last session ended in an error, provide a message.
initial_message = (
'NOTE: the last session ended with an error. '
"Let's get back on track. Do NOT resume your task. Ask me about it."
)
else:
# If we are resuming, we already have a task
initial_message = ''
welcome_message += '\nLoading previous conversation.'
# Show OpenHands welcome
display_welcome_message(welcome_message)
# The prompt_for_next_task will be triggered if the agent enters AWAITING_USER_INPUT.
# If the restored state is already AWAITING_USER_INPUT, on_event_async will handle it.
if initial_message:
display_initial_user_prompt(initial_message)
event_stream.add_event(MessageAction(content=initial_message), EventSource.USER)
else:
# Otherwise prompt for the user's first message right away
# No session restored, no initial action: prompt for the user's first message
asyncio.create_task(prompt_for_next_task(''))
await run_agent_until_done(
@@ -334,7 +368,12 @@ async def main(loop: asyncio.AbstractEventLoop) -> None:
# Run the first session
new_session_requested = await run_session(
loop, config, settings_store, current_dir, task_str
loop,
config,
settings_store,
current_dir,
task_str,
session_name=args.name,
)
# If a new session was requested, run it
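Review note: the resume logic in `run_session` reduces to a small decision about which message, if any, is enqueued on behalf of the user. The following is a minimal sketch of that decision, assuming a hypothetical helper `choose_initial_message` and a plain `resumed` flag in place of the real `initial_state` object; it is an illustration, not code from this change.

```python
from __future__ import annotations

# Text taken from the diff, with a space between the two sentences.
ERROR_NOTE = (
    'NOTE: the last session ended with an error. '
    "Let's get back on track. Do NOT resume your task. Ask me about it."
)


def choose_initial_message(
    task_content: str | None, resumed: bool, last_error: str = ''
) -> str:
    """Return the message to enqueue for the user, or '' to prompt instead."""
    # A fresh session starts from the task content, if any was given.
    message = task_content or ''
    if resumed:
        # A resumed session already has a task, so only an error note is sent.
        message = ERROR_NOTE if last_error else ''
    return message
```

An empty return value corresponds to the branch that calls `prompt_for_next_task('')` rather than adding a `MessageAction` to the event stream.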
+11 -5
@@ -145,14 +145,20 @@ def display_banner(session_id: str) -> None:
print_formatted_text('')
def display_welcome_message() -> None:
def display_welcome_message(message: str = '') -> None:
print_formatted_text(
HTML("<gold>Let's start building!</gold>\n"), style=DEFAULT_STYLE
)
print_formatted_text(
HTML('What do you want to build? <grey>Type /help for help</grey>'),
style=DEFAULT_STYLE,
)
if message:
print_formatted_text(
HTML(f'{message} <grey>Type /help for help</grey>'),
style=DEFAULT_STYLE,
)
else:
print_formatted_text(
HTML('What do you want to build? <grey>Type /help for help</grey>'),
style=DEFAULT_STYLE,
)
def display_initial_user_prompt(prompt: str) -> None:
-2
@@ -8,7 +8,6 @@ from enum import Enum
from typing import Any
import openhands
from openhands.controller.state.task import RootTask
from openhands.core.logger import openhands_logger as logger
from openhands.core.schema import AgentState
from openhands.events.action import (
@@ -75,7 +74,6 @@ class State:
- additional task-specific data
"""
root_task: RootTask = field(default_factory=RootTask)
session_id: str = ''
# global iteration for the current task
iteration: int = 0
-234
@@ -1,234 +0,0 @@
from __future__ import annotations
from openhands.core.exceptions import (
LLMMalformedActionError,
TaskInvalidStateError,
)
from openhands.core.logger import openhands_logger as logger
OPEN_STATE = 'open'
COMPLETED_STATE = 'completed'
ABANDONED_STATE = 'abandoned'
IN_PROGRESS_STATE = 'in_progress'
VERIFIED_STATE = 'verified'
STATES = [
OPEN_STATE,
COMPLETED_STATE,
ABANDONED_STATE,
IN_PROGRESS_STATE,
VERIFIED_STATE,
]
class Task:
id: str
goal: str
parent: 'Task' | None
subtasks: list['Task']
def __init__(
self,
parent: 'Task',
goal: str,
state: str = OPEN_STATE,
subtasks: list[dict | 'Task'] | None = None, # noqa: B006
) -> None:
"""Initializes a new instance of the Task class.
Args:
parent: The parent task, or None if it is the root task.
goal: The goal of the task.
state: The initial state of the task.
subtasks: A list of subtasks associated with this task.
"""
if subtasks is None:
subtasks = []
if parent.id:
self.id = parent.id + '.' + str(len(parent.subtasks))
else:
self.id = str(len(parent.subtasks))
self.parent = parent
self.goal = goal
logger.debug(f'Creating task {self.id} with parent={parent.id}, goal={goal}')
self.subtasks = []
for subtask in subtasks or []:
if isinstance(subtask, Task):
self.subtasks.append(subtask)
else:
goal = str(subtask.get('goal', ''))
state = str(subtask.get('state', OPEN_STATE))
subtasks = subtask.get('subtasks')
logger.debug(f'Reading: {goal}, {state}, {subtasks}')
self.subtasks.append(Task(self, goal, state, subtasks))
self.state = OPEN_STATE
def to_string(self, indent: str = '') -> str:
"""Returns a string representation of the task and its subtasks.
Args:
indent: The indentation string for formatting the output.
Returns:
A string representation of the task and its subtasks.
"""
emoji = ''
if self.state == VERIFIED_STATE:
emoji = ''
elif self.state == COMPLETED_STATE:
emoji = '🟢'
elif self.state == ABANDONED_STATE:
emoji = ''
elif self.state == IN_PROGRESS_STATE:
emoji = '💪'
elif self.state == OPEN_STATE:
emoji = '🔵'
result = indent + emoji + ' ' + self.id + ' ' + self.goal + '\n'
for subtask in self.subtasks:
result += subtask.to_string(indent + ' ')
return result
def to_dict(self) -> dict:
"""Returns a dictionary representation of the task.
Returns:
A dictionary containing the task's attributes.
"""
return {
'id': self.id,
'goal': self.goal,
'state': self.state,
'subtasks': [t.to_dict() for t in self.subtasks],
}
def set_state(self, state: str) -> None:
"""Sets the state of the task and its subtasks.
Args:
state: The new state of the task.
Raises:
TaskInvalidStateError: If the provided state is invalid.
"""
if state not in STATES:
logger.error('Invalid state: %s', state)
raise TaskInvalidStateError(state)
self.state = state
if (
state == COMPLETED_STATE
or state == ABANDONED_STATE
or state == VERIFIED_STATE
):
for subtask in self.subtasks:
if subtask.state != ABANDONED_STATE:
subtask.set_state(state)
elif state == IN_PROGRESS_STATE:
if self.parent is not None:
self.parent.set_state(state)
def get_current_task(self) -> 'Task' | None:
"""Retrieves the current task in progress.
Returns:
The current task in progress, or None if no task is in progress.
"""
for subtask in self.subtasks:
if subtask.state == IN_PROGRESS_STATE:
return subtask.get_current_task()
if self.state == IN_PROGRESS_STATE:
return self
return None
class RootTask(Task):
"""Serves as the root node in a tree of tasks.
Because we want the top-level of the root_task to be a list of tasks (1, 2, 3, etc.),
the "root node" of the data structure is kind of invisible--it just
holds references to the top-level tasks.
Attributes:
id: Kept blank for root_task
goal: Kept blank for root_task
parent: None for root_task
subtasks: The top-level list of tasks associated with the root_task.
state: The state of the root_task.
"""
id: str = ''
goal: str = ''
parent: None = None
def __init__(self) -> None:
self.subtasks = []
self.state = OPEN_STATE
def __str__(self) -> str:
"""Returns a string representation of the root_task.
Returns:
A string representation of the root_task.
"""
return self.to_string()
def get_task_by_id(self, id: str) -> Task:
"""Retrieves a task by its ID.
Args:
id: The ID of the task.
Returns:
The task with the specified ID.
Raises:
AgentMalformedActionError: If the provided task ID is invalid or does not exist.
"""
if id == '':
return self
if len(self.subtasks) == 0:
raise LLMMalformedActionError('Task does not exist:' + id)
try:
parts = [int(p) for p in id.split('.')]
except ValueError:
raise LLMMalformedActionError('Invalid task id:' + id)
task: Task = self
for part in parts:
if part >= len(task.subtasks):
raise LLMMalformedActionError('Task does not exist:' + id)
task = task.subtasks[part]
return task
def add_subtask(
self,
parent_id: str,
goal: str,
subtasks: list[dict | Task] | None = None,
) -> None:
"""Adds a subtask to a parent task.
Args:
parent_id: The ID of the parent task.
goal: The goal of the subtask.
subtasks: A list of subtasks associated with the new subtask.
"""
subtasks = subtasks or []
parent = self.get_task_by_id(parent_id)
child = Task(parent=parent, goal=goal, subtasks=subtasks)
parent.subtasks.append(child)
def set_subtask_state(self, id: str, state: str) -> None:
"""Sets the state of a subtask.
Args:
id: The ID of the subtask.
state: The new state of the subtask.
"""
task = self.get_task_by_id(id)
logger.debug(f'Setting task {task.id} from state {task.state} to {state}')
task.set_state(state)
unfinished_tasks = [
t
for t in self.subtasks
if t.state not in [COMPLETED_STATE, VERIFIED_STATE, ABANDONED_STATE]
]
if len(unfinished_tasks) == 0:
self.set_state(COMPLETED_STATE)
+3 -2
@@ -116,8 +116,6 @@ async def run_controller(
selected_repository=config.sandbox.selected_repo,
)
await add_mcp_tools_to_agent(agent, runtime, config.mcp)
event_stream = runtime.event_stream
# when memory is created, it will load the microagents from the selected repository
@@ -130,6 +128,9 @@ async def run_controller(
repo_directory=repo_directory,
)
# Add MCP tools to the agent
await add_mcp_tools_to_agent(agent, runtime, memory, config.mcp)
replay_events: list[Event] | None = None
if config.replay_trajectory_path:
logger.info('Trajectory replay is enabled')
+1 -1
@@ -154,7 +154,7 @@ def create_memory(
if runtime:
# sets available hosts
memory.set_runtime_info(runtime)
memory.set_runtime_info(runtime, {})
# loads microagents from repo/.openhands/microagents
microagents: list[BaseMicroagent] = runtime.get_microagents_from_selected_repo(
+5
@@ -78,6 +78,7 @@ class EventStore(EventStoreABC):
end_id: int | None = None,
reverse: bool = False,
filter: EventFilter | None = None,
limit: int | None = None,
) -> Iterable[Event]:
"""
Retrieve events from the event stream, optionally filtering out events of a given type
@@ -107,6 +108,7 @@ class EventStore(EventStoreABC):
step = 1
cache_page = _DUMMY_PAGE
num_results = 0
for index in range(start_id, end_id, step):
if not should_continue():
return
@@ -121,6 +123,9 @@ class EventStore(EventStoreABC):
if event:
if not filter or filter.include(event):
yield event
num_results += 1
if limit and limit <= num_results:
return
def get_event(self, id: int) -> Event:
filename = self._get_filename_for_id(id, self.user_id)
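Review note: the new `limit` parameter stops the generator once enough matching events have been yielded. A minimal sketch of the same pattern over plain integers follows; `take_filtered` and its `include` callback are hypothetical stand-ins for the event store and `EventFilter`, not part of this change.

```python
from typing import Callable, Iterable, Iterator, Optional


def take_filtered(
    items: Iterable[int],
    include: Optional[Callable[[int], bool]] = None,
    limit: Optional[int] = None,
) -> Iterator[int]:
    """Yield items that pass the filter, stopping after `limit` results."""
    num_results = 0
    for item in items:
        if include is None or include(item):
            yield item
            num_results += 1
            # Same check as in the diff: a limit of None or 0 means "no limit".
            if limit and limit <= num_results:
                return
```

Because the check uses truthiness, `limit=0` behaves like `limit=None` and does not return an empty result; callers that want zero events would need to handle that case themselves.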
+1
@@ -23,6 +23,7 @@ class EventStoreABC:
end_id: int | None = None,
reverse: bool = False,
filter: EventFilter | None = None,
limit: int | None = None,
) -> Iterable[Event]:
"""
Retrieve events from the event stream, optionally excluding events using a filter
+3 -1
@@ -74,6 +74,7 @@ class RecallObservation(Observation):
runtime_hosts: dict[str, int] = field(default_factory=dict)
additional_agent_instructions: str = ''
date: str = ''
custom_secrets_descriptions: dict[str, str] = field(default_factory=dict)
# knowledge
microagent_knowledge: list[MicroagentKnowledge] = field(default_factory=list)
@@ -114,7 +115,8 @@ class RecallObservation(Observation):
f'repo_instructions={self.repo_instructions[:20]}...',
f'runtime_hosts={self.runtime_hosts}',
f'additional_agent_instructions={self.additional_agent_instructions[:20]}...',
f'date={self.date}',
f'date={self.date}',
f'custom_secrets_descriptions={self.custom_secrets_descriptions}',
]
)
else:
+196 -115
@@ -18,6 +18,12 @@ from openhands.core.exceptions import (
FunctionCallConversionError,
FunctionCallValidationError,
)
from openhands.llm.tool_names import (
BROWSER_TOOL_NAME,
EXECUTE_BASH_TOOL_NAME,
FINISH_TOOL_NAME,
STR_REPLACE_EDITOR_TOOL_NAME,
)
# Inspired by: https://docs.together.ai/docs/llama-3-function-calling#function-calling-w-llama-31-70b
SYSTEM_PROMPT_SUFFIX_TEMPLATE = """
@@ -55,14 +61,12 @@ def refine_prompt(prompt: str) -> str:
return prompt
# NOTE: we need to make sure this example is always in-sync with the tool interface designed in openhands/agenthub/codeact_agent/function_calling.py
IN_CONTEXT_LEARNING_EXAMPLE_PREFIX = refine_prompt("""
Here's a running example of how to perform a task with the provided tools.
--------------------- START OF EXAMPLE ---------------------
USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
# NOTE: we need to make sure these examples are always in-sync with the tool interface designed in openhands/agenthub/codeact_agent/function_calling.py
# Example snippets for each tool
TOOL_EXAMPLES = {
'execute_bash': {
'check_dir': """
ASSISTANT: Sure! Let me first check the current directory:
<function=execute_bash>
<parameter=command>
@@ -73,29 +77,10 @@ pwd && ls
USER: EXECUTION RESULT of [execute_bash]:
/workspace
openhands@runtime:~/workspace$
ASSISTANT: There is no `app.py` file in the current directory. Let me create a Python file `app.py`:
<function=str_replace_editor>
<parameter=command>create</parameter>
<parameter=path>/workspace/app.py</parameter>
<parameter=file_text>
from flask import Flask
app = Flask(__name__)
@app.route('/')
def index() -> str:
numbers = list(range(1, 11))
return str(numbers)
if __name__ == '__main__':
app.run(port=5000)
</parameter>
</function>
USER: EXECUTION RESULT of [str_replace_editor]:
File created successfully at: /workspace/app.py
ASSISTANT: I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
""",
'run_server': """
ASSISTANT:
Let me run the Python file for you:
<function=execute_bash>
<parameter=command>
python3 app.py > server.log 2>&1 &
@@ -106,7 +91,8 @@ USER: EXECUTION RESULT of [execute_bash]:
[1] 121
[1]+ Exit 1 python3 app.py > server.log 2>&1
ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
ASSISTANT:
Looks like the server was running with PID 121 then crashed. Let me check the server log:
<function=execute_bash>
<parameter=command>
cat server.log
@@ -119,7 +105,8 @@ Traceback (most recent call last):
from flask import Flask
ModuleNotFoundError: No module named 'flask'
ASSISTANT: Looks like the server crashed because the `flask` module is not installed. Let me install the `flask` module for you:
ASSISTANT:
Looks like the server crashed because the `flask` module is not installed. Let me install the `flask` module for you:
<function=execute_bash>
<parameter=command>
pip3 install flask
@@ -143,7 +130,8 @@ Requirement already satisfied: MarkupSafe>=2.0 in /home/openhands/.local/lib/pyt
Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
ASSISTANT: The `flask` module is installed successfully. Let me run the Python file again:
ASSISTANT:
The `flask` module is installed successfully. Let me run the Python file again:
<function=execute_bash>
<parameter=command>
python3 app.py > server.log 2>&1 &
@@ -170,7 +158,8 @@ Press CTRL+C to quit
ASSISTANT:
The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
""",
'kill_server': """
USER: Now kill the server, make it display the numbers in a table format.
ASSISTANT:
@@ -183,7 +172,48 @@ kill 124
USER: EXECUTION RESULT of [execute_bash]:
[1]+ Terminated python3 app.py > server.log 2>&1
""",
'run_server_again': """
ASSISTANT:
Running the updated file:
<function=execute_bash>
<parameter=command>
python3 app.py > server.log 2>&1 &
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
[1] 126
ASSISTANT:
The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000.
""",
},
'str_replace_editor': {
'create_file': """
ASSISTANT:
There is no `app.py` file in the current directory. Let me create a Python file `app.py`:
<function=str_replace_editor>
<parameter=command>create</parameter>
<parameter=path>/workspace/app.py</parameter>
<parameter=file_text>
from flask import Flask
app = Flask(__name__)
@app.route('/')
def index():
numbers = list(range(1, 11))
return str(numbers)
if __name__ == '__main__':
app.run(port=5000)
</parameter>
</function>
USER: EXECUTION RESULT of [str_replace_editor]:
File created successfully at: /workspace/app.py
""",
'edit_file': """
ASSISTANT:
Now let me display the numbers in a table format:
<function=str_replace_editor>
@@ -205,28 +235,100 @@ The file /workspace/app.py has been edited. Here's the result of running `cat -n
10 app.run(port=5000)
Review the changes and make sure they are as expected. Edit the file again if necessary.
""",
},
'browser': {
'view_page': """
ASSISTANT:
Running the updated file:
<function=execute_bash>
<parameter=command>
python3 app.py > server.log 2>&1 &
Let me check how the page looks in the browser:
<function=browser>
<parameter=code>
goto('http://127.0.0.1:5000')
noop(1000) # Wait for page to load
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
[1] 126
USER: EXECUTION RESULT of [browser]:
[Browser shows the numbers in a table format]
"""
},
'finish': {
'task_completed': """
ASSISTANT:
The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
<function=finish>
<parameter=message>The task has been completed. The web server is running and displaying numbers 1-10 in a table format at http://127.0.0.1:5000.</parameter>
<parameter=task_completed>true</parameter>
</function>
"""
},
}
def get_example_for_tools(tools: list[dict]) -> str:
"""Generate an in-context learning example based on available tools."""
available_tools = set()
for tool in tools:
if tool['type'] == 'function':
name = tool['function']['name']
if name == EXECUTE_BASH_TOOL_NAME:
available_tools.add('execute_bash')
elif name == STR_REPLACE_EDITOR_TOOL_NAME:
available_tools.add('str_replace_editor')
elif name == BROWSER_TOOL_NAME:
available_tools.add('browser')
elif name == FINISH_TOOL_NAME:
available_tools.add('finish')
if not available_tools:
return ''
example = """Here's a running example of how to perform a task with the provided tools.
--------------------- START OF EXAMPLE ---------------------
USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
"""
# Build example based on available tools
if 'execute_bash' in available_tools:
example += TOOL_EXAMPLES['execute_bash']['check_dir']
if 'str_replace_editor' in available_tools:
example += TOOL_EXAMPLES['str_replace_editor']['create_file']
if 'execute_bash' in available_tools:
example += TOOL_EXAMPLES['execute_bash']['run_server']
if 'browser' in available_tools:
example += TOOL_EXAMPLES['browser']['view_page']
if 'execute_bash' in available_tools:
example += TOOL_EXAMPLES['execute_bash']['kill_server']
if 'str_replace_editor' in available_tools:
example += TOOL_EXAMPLES['str_replace_editor']['edit_file']
if 'execute_bash' in available_tools:
example += TOOL_EXAMPLES['execute_bash']['run_server_again']
if 'finish' in available_tools:
example += TOOL_EXAMPLES['finish']['task_completed']
example += """
--------------------- END OF EXAMPLE ---------------------
Do NOT assume the environment is the same as in the example above.
--------------------- NEW TASK DESCRIPTION ---------------------
""").lstrip()
"""
example = example.lstrip()
return example
IN_CONTEXT_LEARNING_EXAMPLE_PREFIX = get_example_for_tools
IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX = """
--------------------- END OF NEW TASK DESCRIPTION ---------------------
@@ -352,74 +454,42 @@ def convert_fncall_messages_to_non_fncall_messages(
# Add in-context learning example for the first user message
if not first_user_message_encountered and add_in_context_learning_example:
first_user_message_encountered = True
# Check tools
if not (
tools
and len(tools) > 0
and any(
(
tool['type'] == 'function'
and tool['function']['name']
== refine_prompt('execute_bash')
and 'command'
in tool['function']['parameters']['properties']
)
for tool in tools
)
and any(
(
tool['type'] == 'function'
and tool['function']['name'] == 'str_replace_editor'
and 'path' in tool['function']['parameters']['properties']
and 'file_text'
in tool['function']['parameters']['properties']
and 'old_str'
in tool['function']['parameters']['properties']
and 'new_str'
in tool['function']['parameters']['properties']
)
for tool in tools
)
):
raise FunctionCallConversionError(
'The currently provided tool set are NOT compatible with the in-context learning example for FnCall to Non-FnCall conversion. '
'Please update your tool set OR the in-context learning example in openhands/llm/fn_call_converter.py'
)
# add in-context learning example
if isinstance(content, str):
content = (
IN_CONTEXT_LEARNING_EXAMPLE_PREFIX
+ content
+ IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX
)
elif isinstance(content, list):
if content and content[0]['type'] == 'text':
content[0]['text'] = (
IN_CONTEXT_LEARNING_EXAMPLE_PREFIX
+ content[0]['text']
+ IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX
)
# Generate example based on available tools
example = IN_CONTEXT_LEARNING_EXAMPLE_PREFIX(tools)
# Add example if we have any tools
if example:
# add in-context learning example
if isinstance(content, str):
content = example + content + IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX
elif isinstance(content, list):
if content and content[0]['type'] == 'text':
content[0]['text'] = (
example
+ content[0]['text']
+ IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX
)
else:
content = (
[
{
'type': 'text',
'text': example,
}
]
+ content
+ [
{
'type': 'text',
'text': IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX,
}
]
)
else:
content = (
[
{
'type': 'text',
'text': IN_CONTEXT_LEARNING_EXAMPLE_PREFIX,
}
]
+ content
+ [
{
'type': 'text',
'text': IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX,
}
]
raise FunctionCallConversionError(
f'Unexpected content type {type(content)}. Expected str or list. Content: {content}'
)
else:
raise FunctionCallConversionError(
f'Unexpected content type {type(content)}. Expected str or list. Content: {content}'
)
converted_messages.append(
{
'role': 'user',
@@ -604,17 +674,28 @@ def convert_non_fncall_messages_to_fncall_messages(
if not first_user_message_encountered:
first_user_message_encountered = True
if isinstance(content, str):
content = content.replace(IN_CONTEXT_LEARNING_EXAMPLE_PREFIX, '')
content = content.replace(IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX, '')
# Remove any existing example
if content.startswith(IN_CONTEXT_LEARNING_EXAMPLE_PREFIX(tools)):
content = content.replace(
IN_CONTEXT_LEARNING_EXAMPLE_PREFIX(tools), '', 1
)
if content.endswith(IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX):
content = content.replace(
IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX, '', 1
)
elif isinstance(content, list):
for item in content:
if item['type'] == 'text':
item['text'] = item['text'].replace(
IN_CONTEXT_LEARNING_EXAMPLE_PREFIX, ''
)
item['text'] = item['text'].replace(
IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX, ''
)
# Remove any existing example
example = IN_CONTEXT_LEARNING_EXAMPLE_PREFIX(tools)
if item['text'].startswith(example):
item['text'] = item['text'].replace(example, '', 1)
if item['text'].endswith(
IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX
):
item['text'] = item['text'].replace(
IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX, '', 1
)
else:
raise FunctionCallConversionError(
f'Unexpected content type {type(content)}. Expected str or list. Content: {content}'
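Review note: in the reverse conversion, `content.replace(IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX, '', 1)` removes the first occurrence of the suffix, which is not necessarily the trailing one that `endswith` just matched. A possible alternative, sketched here with a hypothetical helper `strip_example`, slices the matched prefix and suffix off directly; it is a suggestion, not what the diff does.

```python
def strip_example(text: str, prefix: str, suffix: str) -> str:
    """Remove `prefix` from the start and `suffix` from the end, if present."""
    # Guard against empty strings so the slices below are always meaningful.
    if prefix and text.startswith(prefix):
        text = text[len(prefix):]
    if suffix and text.endswith(suffix):
        text = text[: -len(suffix)]
    return text
```

On Python 3.9 and later, `str.removeprefix` and `str.removesuffix` express the same operation.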
+6
@@ -0,0 +1,6 @@
"""Constants for tool names used in function calling."""
EXECUTE_BASH_TOOL_NAME = 'execute_bash'
STR_REPLACE_EDITOR_TOOL_NAME = 'str_replace_editor'
BROWSER_TOOL_NAME = 'browser'
FINISH_TOOL_NAME = 'finish'
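Review note: `get_example_for_tools` uses these constants to decide which snippets go into the in-context example, always in the same order. A table-driven sketch of that selection is shown below; `KNOWN_TOOL_NAMES`, `EXAMPLE_ORDER`, and `ordered_snippet_keys` are hypothetical names used for illustration, and the sketch returns snippet keys rather than the snippet text.

```python
KNOWN_TOOL_NAMES = {'execute_bash', 'str_replace_editor', 'browser', 'finish'}

# (tool, snippet key) pairs in the order the diff concatenates them.
EXAMPLE_ORDER = [
    ('execute_bash', 'check_dir'),
    ('str_replace_editor', 'create_file'),
    ('execute_bash', 'run_server'),
    ('browser', 'view_page'),
    ('execute_bash', 'kill_server'),
    ('str_replace_editor', 'edit_file'),
    ('execute_bash', 'run_server_again'),
    ('finish', 'task_completed'),
]


def available_tool_names(tools: list) -> set:
    """Collect the names of known function tools from a tool list."""
    return {
        tool['function']['name']
        for tool in tools
        if tool.get('type') == 'function'
        and tool['function']['name'] in KNOWN_TOOL_NAMES
    }


def ordered_snippet_keys(tools: list) -> list:
    """Return the snippet keys to include, preserving the example's order."""
    available = available_tool_names(tools)
    return [pair for pair in EXAMPLE_ORDER if pair[0] in available]
```

Keeping the order in one table makes it easier to keep the example in sync with the tool interface than a chain of repeated `if` checks.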
+20 -2
@@ -10,6 +10,7 @@ from openhands.events.action.mcp import MCPAction
from openhands.events.observation.mcp import MCPObservation
from openhands.events.observation.observation import Observation
from openhands.mcp.client import MCPClient
from openhands.memory.memory import Memory
from openhands.runtime.base import Runtime
@@ -149,7 +150,7 @@ async def call_tool_mcp(mcp_clients: list[MCPClient], action: MCPAction) -> Obse
async def add_mcp_tools_to_agent(
agent: 'Agent', runtime: Runtime, mcp_config: MCPConfig
agent: 'Agent', runtime: Runtime, memory: 'Memory', mcp_config: MCPConfig
):
"""
Add MCP tools to an agent.
@@ -165,8 +166,25 @@ async def add_mcp_tools_to_agent(
'Runtime must be initialized before adding MCP tools'
)
# Add microagent MCP tools if available
microagent_mcp_configs = memory.get_microagent_mcp_tools()
extra_stdio_servers = []
for mcp_config in microagent_mcp_configs:
if mcp_config.sse_servers:
logger.warning(
'Microagent MCP config contains SSE servers, which are not yet supported.'
)
if mcp_config.stdio_servers:
for stdio_server in mcp_config.stdio_servers:
# Check if this stdio server is already in the config
if stdio_server not in extra_stdio_servers:
extra_stdio_servers.append(stdio_server)
logger.info(f'Added microagent stdio server: {stdio_server.name}')
# Add the runtime as another MCP server
updated_mcp_config = runtime.get_updated_mcp_config()
updated_mcp_config = runtime.get_updated_mcp_config(extra_stdio_servers)
# Fetch the MCP tools
mcp_tools = await fetch_mcp_tools_from_config(updated_mcp_config)
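Review note: the loop above gathers stdio servers from every microagent MCP config and skips duplicates by equality. A minimal, self-contained sketch of that collection step follows; `StdioServer` and `McpConfig` are hypothetical dataclass stand-ins for `MCPStdioServerConfig` and `MCPConfig`, whose real fields may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StdioServer:
    """Hypothetical stand-in for MCPStdioServerConfig."""

    name: str
    command: str


@dataclass
class McpConfig:
    """Hypothetical stand-in for MCPConfig."""

    sse_servers: list = field(default_factory=list)
    stdio_servers: list = field(default_factory=list)


def collect_stdio_servers(configs: list) -> list:
    """Gather stdio servers from all configs, dropping exact duplicates."""
    extra: list = []
    for config in configs:
        for server in config.stdio_servers:
            # Equality-based check, as in the diff; first occurrence wins.
            if server not in extra:
                extra.append(server)
    return extra
```

Deduplication here compares whole server definitions, so two servers with the same name but different commands are both kept.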
+5 -1
@@ -451,9 +451,13 @@ class ConversationMemory:
available_hosts=obs.runtime_hosts,
additional_agent_instructions=obs.additional_agent_instructions,
date=date,
custom_secrets_descriptions=obs.custom_secrets_descriptions,
)
else:
runtime_info = RuntimeInfo(date=date)
runtime_info = RuntimeInfo(
date=date,
custom_secrets_descriptions=obs.custom_secrets_descriptions,
)
repo_instructions = (
obs.repo_instructions if obs.repo_instructions else ''
+30 -2
@@ -5,6 +5,7 @@ from datetime import datetime, timezone
from typing import Callable
import openhands
from openhands.core.config.mcp_config import MCPConfig
from openhands.core.logger import openhands_logger as logger
from openhands.events.action.agent import RecallAction
from openhands.events.event import Event, EventSource, RecallType
@@ -176,6 +177,9 @@ class Memory:
microagent_knowledge=microagent_knowledge,
content='Added workspace context',
date=self.runtime_info.date if self.runtime_info is not None else '',
custom_secrets_descriptions=self.runtime_info.custom_secrets_descriptions
if self.runtime_info is not None
else {},
)
return obs
return None
@@ -259,6 +263,25 @@ class Memory:
if isinstance(agent, RepoMicroagent):
self.repo_microagents[name] = agent
def get_microagent_mcp_tools(self) -> list[MCPConfig]:
"""
Get MCP tools from all repo microagents (always active)
Returns:
A list of MCP tools configurations from microagents
"""
mcp_configs: list[MCPConfig] = []
# Check all repo microagents for MCP tools (always active)
for agent in self.repo_microagents.values():
if agent.metadata.mcp_tools:
mcp_configs.append(agent.metadata.mcp_tools)
logger.debug(
f'Found MCP tools in repo microagent {agent.name}: {agent.metadata.mcp_tools}'
)
return mcp_configs
def set_repository_info(self, repo_name: str, repo_directory: str) -> None:
"""Store repository info so we can reference it in an observation."""
if repo_name or repo_directory:
@@ -266,7 +289,9 @@ class Memory:
else:
self.repository_info = None
def set_runtime_info(self, runtime: Runtime) -> None:
def set_runtime_info(
self, runtime: Runtime, custom_secrets_descriptions: dict[str, str]
) -> None:
"""Store runtime info (web hosts, ports, etc.)."""
# e.g. { '127.0.0.1': 8080 }
utc_now = datetime.now(timezone.utc)
@@ -277,9 +302,12 @@ class Memory:
available_hosts=runtime.web_hosts,
additional_agent_instructions=runtime.additional_agent_instructions,
date=date,
custom_secrets_descriptions=custom_secrets_descriptions,
)
else:
self.runtime_info = RuntimeInfo(date=date)
self.runtime_info = RuntimeInfo(
date=date, custom_secrets_descriptions=custom_secrets_descriptions
)
def send_error_message(self, message_id: str, message: str):
"""Sends an error message if the callback function was provided."""
+35 -6
@@ -64,8 +64,28 @@ class BaseMicroagent(BaseModel):
try:
metadata = MicroagentMetadata(**metadata_dict)
# Validate MCP tools configuration if present
if metadata.mcp_tools:
if metadata.mcp_tools.sse_servers:
logger.warning(
f'Microagent {metadata.name} has SSE servers. Only stdio servers are currently supported.'
)
if not metadata.mcp_tools.stdio_servers:
raise MicroagentValidationError(
f'Microagent {metadata.name} has MCP tools configuration but no stdio servers. '
'Only stdio servers are currently supported.'
)
except Exception as e:
raise MicroagentValidationError(f'Error loading metadata: {e}') from e
# Provide more detailed error message for validation errors
error_msg = f'Error validating microagent metadata in {path.name}: {str(e)}'
if 'type' in metadata_dict and metadata_dict['type'] not in [
t.value for t in MicroagentType
]:
valid_types = ', '.join([f'"{t.value}"' for t in MicroagentType])
error_msg += f'. Invalid "type" value: "{metadata_dict["type"]}". Valid types are: {valid_types}'
raise MicroagentValidationError(error_msg) from e
# Create appropriate subclass based on type
subclass_map = {
@@ -74,13 +94,13 @@ class BaseMicroagent(BaseModel):
}
# Infer the agent type:
# 1. If triggers exist -> KNOWLEDGE
# 2. Else (no triggers) -> REPO
# 1. If triggers exist -> KNOWLEDGE (optional)
# 2. Else (no triggers) -> REPO (always active)
inferred_type: MicroagentType
if metadata.triggers:
inferred_type = MicroagentType.KNOWLEDGE
else:
# No triggers, default to REPO unless metadata explicitly says otherwise (which it shouldn't for REPO)
# No triggers, default to REPO
# This handles cases where 'type' might be missing or defaulted by Pydantic
inferred_type = MicroagentType.REPO_KNOWLEDGE
@@ -123,6 +143,7 @@ class KnowledgeMicroagent(BaseMicroagent):
for trigger in self.triggers:
if trigger.lower() in message:
return trigger
return None
@property
@@ -183,8 +204,16 @@ def load_microagents_from_dir(
repo_agents[agent.name] = agent
elif isinstance(agent, KnowledgeMicroagent):
knowledge_agents[agent.name] = agent
logger.debug(f'Loaded agent {agent.name} from {file}')
logger.debug(
f'Loaded agent {agent.name} from {file}. Type: {type(agent)}'
)
except MicroagentValidationError as e:
# For validation errors, include the original exception
error_msg = f'Error loading microagent from {file}: {str(e)}'
raise MicroagentValidationError(error_msg) from e
except Exception as e:
raise ValueError(f'Error loading agent from {file}: {e}')
# For other errors, wrap in a ValueError with detailed message
error_msg = f'Error loading microagent from {file}: {str(e)}'
raise ValueError(error_msg) from e
return repo_agents, knowledge_agents
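Review note: the improved error path appends the list of valid `type` values when the metadata carries an unknown one. The sketch below isolates that message-building step; `describe_type_error` is a hypothetical helper, and the enum is redeclared only to keep the example self-contained.

```python
from enum import Enum


class MicroagentType(str, Enum):
    """Mirror of the enum in the diff, redeclared for a standalone example."""

    KNOWLEDGE = 'knowledge'
    REPO_KNOWLEDGE = 'repo'


def describe_type_error(metadata: dict, base_msg: str) -> str:
    """Extend an error message when `type` is not a valid MicroagentType."""
    valid_values = [t.value for t in MicroagentType]
    if 'type' in metadata and metadata['type'] not in valid_values:
        valid_types = ', '.join(f'"{v}"' for v in valid_values)
        base_msg += (
            f'. Invalid "type" value: "{metadata["type"]}". '
            f'Valid types are: {valid_types}'
        )
    return base_msg
```

When the `type` field is missing or valid, the base message is returned unchanged, so other validation failures keep their original wording.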
+9 -2
@@ -2,12 +2,16 @@ from enum import Enum
from pydantic import BaseModel, Field
from openhands.core.config.mcp_config import (
MCPConfig,
)
class MicroagentType(str, Enum):
"""Type of microagent."""
KNOWLEDGE = 'knowledge'
REPO_KNOWLEDGE = 'repo'
KNOWLEDGE = 'knowledge' # Optional microagent, triggered by keywords
REPO_KNOWLEDGE = 'repo' # Always active microagent
class MicroagentMetadata(BaseModel):
@@ -18,3 +22,6 @@ class MicroagentMetadata(BaseModel):
version: str = Field(default='1.0.0')
agent: str = Field(default='CodeActAgent')
triggers: list[str] = [] # optional, only exists for knowledge microagents
mcp_tools: MCPConfig | None = (
None # optional, for microagents that provide additional MCP tools
)
-4
@@ -4,10 +4,6 @@ import multiprocessing
import time
import uuid
from openhands.utils.playwright_patchright_util import use_patchright # noqa F401
use_patchright() # noqa F401
# NOTE: this overrides the playwright import in browsergym
import browsergym.core # noqa F401 (we register the openended task as a gym environment)
import gymnasium as gym
import html2text
@@ -10,7 +10,7 @@ import httpx
from tenacity import retry, retry_if_exception, stop_after_attempt, wait_exponential
from openhands.core.config import AppConfig
from openhands.core.config.mcp_config import MCPConfig, MCPSSEServerConfig
from openhands.core.config.mcp_config import MCPConfig, MCPStdioServerConfig, MCPSSEServerConfig
from openhands.core.exceptions import (
AgentRuntimeTimeoutError,
)
@@ -351,7 +351,9 @@ class ActionExecutionClient(Runtime):
def browse_interactive(self, action: BrowseInteractiveAction) -> Observation:
return self.send_action_for_execution(action)
def get_updated_mcp_config(self) -> MCPConfig:
def get_updated_mcp_config(
self, extra_stdio_servers: list[MCPStdioServerConfig] | None = None
) -> MCPConfig:
# Add the runtime as another MCP server
updated_mcp_config = self.config.mcp.model_copy()
# Send a request to the action execution server to update the MCP config
@@ -359,6 +361,10 @@ class ActionExecutionClient(Runtime):
server.model_dump(mode='json')
for server in updated_mcp_config.stdio_servers
]
if extra_stdio_servers:
stdio_tools.extend(
[server.model_dump(mode='json') for server in extra_stdio_servers]
)
if len(stdio_tools) > 0:
self.log('debug', f'Updating MCP server to: {stdio_tools}')
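Reviewer sketch of the merge this hunk adds: the configured stdio servers are serialized first, then any extras (for example, ones contributed by a microagent) are appended. `StdioServer` and `collect_stdio_tools` are hypothetical stand-ins, and `dataclasses.asdict` is used here in place of pydantic's `model_dump(mode='json')` so the example needs only the standard library.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class StdioServer:
    """Hypothetical stand-in for MCPStdioServerConfig."""

    name: str
    command: str
    args: list[str] = field(default_factory=list)


def collect_stdio_tools(
    configured: list[StdioServer],
    extra: Optional[list[StdioServer]] = None,
) -> list[dict]:
    """Serialize configured servers, then append any extra ones."""
    tools = [asdict(server) for server in configured]
    if extra:
        # Extras are appended after the configured servers, order preserved.
        tools.extend(asdict(server) for server in extra)
    return tools


base = [StdioServer('fetch', 'uvx', ['mcp-server-fetch'])]
extras = [StdioServer('repo-tool', 'python', ['-m', 'repo_tool'])]
merged = collect_stdio_tools(base, extras)
```

Passing `None` or an empty list for `extra` leaves the configured servers unchanged, which matches the `if extra_stdio_servers:` guard in the diff.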
@@ -8,6 +8,8 @@ from openhands.core.config import AppConfig
from openhands.events.action import MessageAction
from openhands.events.event_store import EventStore
from openhands.server.config.server_config import ServerConfig
from openhands.server.data_models.agent_loop_info import AgentLoopInfo
from openhands.server.data_models.conversation_info import ConversationInfo
from openhands.server.monitoring import MonitoringListener
from openhands.server.session.conversation import Conversation
from openhands.storage.conversation.conversation_store import ConversationStore
@@ -53,7 +55,7 @@ class ConversationManager(ABC):
connection_id: str,
settings: Settings,
user_id: str | None,
) -> EventStore | None:
) -> AgentLoopInfo | None:
"""Join a conversation and return its agent loop info, including the event store."""
async def is_agent_loop_running(self, sid: str) -> bool:
@@ -81,7 +83,7 @@ class ConversationManager(ABC):
user_id: str | None,
initial_user_msg: MessageAction | None = None,
replay_json: str | None = None,
) -> EventStore:
) -> AgentLoopInfo:
"""Start an agent loop if one is not already running."""
@abstractmethod
@@ -96,6 +98,12 @@ class ConversationManager(ABC):
async def close_session(self, sid: str):
"""Close a session."""
@abstractmethod
async def get_agent_loop_info(
self, user_id: str | None = None, filter_to_sids: set[str] | None = None
) -> list[AgentLoopInfo]:
"""Get the AgentLoopInfo for conversations."""
@classmethod
@abstractmethod
def get_instance(
@@ -11,9 +11,9 @@ from openhands.core.exceptions import AgentRuntimeUnavailableError
from openhands.core.logger import openhands_logger as logger
from openhands.core.schema.agent import AgentState
from openhands.events.action import MessageAction
from openhands.events.event_store import EventStore
from openhands.events.stream import EventStreamSubscriber, session_exists
from openhands.server.config.server_config import ServerConfig
from openhands.server.data_models.agent_loop_info import AgentLoopInfo
from openhands.server.monitoring import MonitoringListener
from openhands.server.session.agent_session import WAIT_TIME_BEFORE_CLOSE
from openhands.server.session.conversation import Conversation
@@ -119,21 +119,15 @@ class StandaloneConversationManager(ConversationManager):
connection_id: str,
settings: Settings,
user_id: str | None,
) -> EventStore:
) -> AgentLoopInfo:
logger.info(
f'join_conversation:{sid}:{connection_id}',
extra={'session_id': sid, 'user_id': user_id},
)
await self.sio.enter_room(connection_id, ROOM_KEY.format(sid=sid))
self._local_connection_id_to_session_id[connection_id] = sid
event_stream = await self.maybe_start_agent_loop(sid, settings, user_id)
if not event_stream:
logger.error(
f'No event stream after joining conversation: {sid}',
extra={'session_id': sid},
)
raise RuntimeError(f'no_event_stream:{sid}')
return event_stream
agent_loop_info = await self.maybe_start_agent_loop(sid, settings, user_id)
return agent_loop_info
async def detach_from_conversation(self, conversation: Conversation):
sid = conversation.sid
@@ -251,21 +245,14 @@ class StandaloneConversationManager(ConversationManager):
user_id: str | None,
initial_user_msg: MessageAction | None = None,
replay_json: str | None = None,
) -> EventStore:
) -> AgentLoopInfo:
logger.info(f'maybe_start_agent_loop:{sid}', extra={'session_id': sid})
if not await self.is_agent_loop_running(sid):
await self._start_agent_loop(
session = self._local_agent_loops_by_sid.get(sid)
if not session:
session = await self._start_agent_loop(
sid, settings, user_id, initial_user_msg, replay_json
)
event_store = await self._get_event_store(sid, user_id)
if not event_store:
logger.error(
f'No event stream after starting agent loop: {sid}',
extra={'session_id': sid},
)
raise RuntimeError(f'no_event_stream:{sid}')
return event_store
return self._agent_loop_info_from_session(session)
async def _start_agent_loop(
self,
@@ -330,22 +317,6 @@ class StandaloneConversationManager(ConversationManager):
pass # Already subscribed - take no action
return session
async def _get_event_store(
self, sid: str, user_id: str | None
) -> EventStore | None:
logger.info(f'_get_event_store:{sid}', extra={'session_id': sid})
session = self._local_agent_loops_by_sid.get(sid)
if session:
logger.info(f'found_local_agent_loop:{sid}', extra={'session_id': sid})
event_stream = session.agent_session.event_stream
return EventStore(
event_stream.sid,
event_stream.file_store,
event_stream.user_id,
event_stream.cur_id,
)
return None
async def send_to_event_stream(self, connection_id: str, data: dict):
# If there is a local session running, send to that
sid = self._local_connection_id_to_session_id.get(connection_id)
@@ -493,6 +464,29 @@ class StandaloneConversationManager(ConversationManager):
await conversation_store.save_metadata(conversation)
async def get_agent_loop_info(
self, user_id: str | None = None, filter_to_sids: set[str] | None = None
):
results = []
for session in self._local_agent_loops_by_sid.values():
if user_id and session.user_id != user_id:
continue
if filter_to_sids and session.sid not in filter_to_sids:
continue
results.append(self._agent_loop_info_from_session(session))
return results
def _agent_loop_info_from_session(self, session: Session):
return AgentLoopInfo(
conversation_id=session.sid,
url=self._get_conversation_url(session.sid),
api_key=None,
event_store=session.agent_session.event_stream,
)
def _get_conversation_url(self, conversation_id: str):
return f"/conversations/{conversation_id}"
def _last_updated_at_key(conversation: ConversationMetadata) -> float:
last_updated_at = conversation.last_updated_at
@@ -0,0 +1,14 @@
from dataclasses import dataclass
from openhands.events.event_store_abc import EventStoreABC
@dataclass
class AgentLoopInfo:
"""
Information about an agent loop - the URL on which to locate it and the event store
"""
conversation_id: str
url: str | None
api_key: str | None
event_store: EventStoreABC
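Reviewer sketch of how `get_agent_loop_info` filters sessions into `AgentLoopInfo` records. `LoopInfo`, `FakeSession`, and `loop_info_for` are hypothetical stand-ins; the `event_store` field is left out so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LoopInfo:
    """Hypothetical stand-in for AgentLoopInfo (event_store omitted)."""

    conversation_id: str
    url: Optional[str]
    api_key: Optional[str] = None


@dataclass
class FakeSession:
    """Hypothetical stand-in for a running session."""

    sid: str
    user_id: Optional[str]


def loop_info_for(
    sessions: Iterable[FakeSession],
    user_id: Optional[str] = None,
    filter_to_sids: Optional[set[str]] = None,
) -> list[LoopInfo]:
    """Mirror the filtering in get_agent_loop_info from the diff."""
    results = []
    for session in sessions:
        if user_id and session.user_id != user_id:
            continue
        if filter_to_sids and session.sid not in filter_to_sids:
            continue
        results.append(LoopInfo(session.sid, f'/conversations/{session.sid}'))
    return results


sessions = [FakeSession('a', 'u1'), FakeSession('b', 'u2'), FakeSession('c', 'u1')]
mine = loop_info_for(sessions, user_id='u1')
only_c = loop_info_for(sessions, filter_to_sids={'c'})
```

Because both guards use truthiness, an empty `filter_to_sids` set behaves like `None` and returns every session. Callers that mean "no conversations" may want to short-circuit before calling.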
@@ -19,4 +19,6 @@ class ConversationInfo:
selected_repository: str | None = None
trigger: ConversationTrigger | None = None
num_connections: int = 0
url: str | None = None
api_key: str | None = None
created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
+7 -3
@@ -100,10 +100,12 @@ async def connect(connection_id: str, environ: dict) -> None:
git_provider_tokens = user_secrets.provider_tokens
session_init_args['git_provider_tokens'] = git_provider_tokens
if user_secrets:
session_init_args['custom_secrets'] = user_secrets.custom_secrets
conversation_init_data = ConversationInitData(**session_init_args)
event_stream = await conversation_manager.join_conversation(
agent_loop_info = await conversation_manager.join_conversation(
conversation_id,
connection_id,
conversation_init_data,
@@ -113,9 +115,11 @@ async def connect(connection_id: str, environ: dict) -> None:
f'Connected to conversation {conversation_id} with connection_id {connection_id}. Replaying event stream...'
)
agent_state_changed = None
if event_stream is None:
if agent_loop_info is None:
raise ConnectionRefusedError('Failed to join conversation')
async_store = AsyncEventStoreWrapper(event_stream, latest_event_id + 1)
async_store = AsyncEventStoreWrapper(
agent_loop_info.event_store, latest_event_id + 1
)
async for event in async_store:
logger.debug(f'oh_event: {event.__class__.__name__}')
if isinstance(
+67 -1
@@ -1,8 +1,11 @@
from fastapi import APIRouter, Request, status
from fastapi import APIRouter, HTTPException, Request, status
from fastapi.responses import JSONResponse
from openhands.core.logger import openhands_logger as logger
from openhands.events.event_filter import EventFilter
from openhands.events.serialization.event import event_to_dict
from openhands.runtime.base import Runtime
from openhands.server.shared import conversation_manager
app = APIRouter(prefix='/api/conversations/{conversation_id}')
@@ -91,3 +94,66 @@ async def get_hosts(request: Request) -> JSONResponse:
'error': f'Error getting runtime hosts: {e}',
},
)
@app.get('/events')
async def search_events(
request: Request,
start_id: int = 0,
end_id: int | None = None,
reverse: bool = False,
filter: EventFilter | None = None,
limit: int = 20
):
"""Search through the event stream with filtering and pagination.
Args:
request: The incoming request object
start_id: Starting ID in the event stream. Defaults to 0
end_id: Ending ID in the event stream
reverse: Whether to retrieve events in reverse order. Defaults to False.
filter: Filter for events
limit: Maximum number of events to return. Must be between 1 and 100. Defaults to 20
Returns:
dict: Dictionary containing:
- events: List of matching events
- has_more: Whether there are more matching events after this batch
Raises:
HTTPException: If the conversation is not found (404)
HTTPException: If limit is less than 1 or greater than 100 (400)
"""
if not request.state.conversation:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND, detail='Conversation not found'
)
if limit < 1 or limit > 100:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST, detail='Invalid limit'
)
# Get matching events from the stream
event_stream = request.state.conversation.event_stream
events = list(event_stream.search_events(
start_id=start_id,
end_id=end_id,
reverse=reverse,
filter=filter,
limit=limit + 1,
))
# Check if there are more events
has_more = len(events) > limit
if has_more:
events = events[:limit] # Remove the extra event
events = [event_to_dict(event) for event in events]
return {
'events': events,
'has_more': has_more,
}
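Reviewer sketch of the "fetch one extra" pagination used by `search_events`: request `limit + 1` items, and treat the presence of the extra item as proof that another page exists. `take_page` is a hypothetical helper, not part of the codebase, and it raises `ValueError` where the endpoint would return an HTTP 400.

```python
from itertools import islice
from typing import Iterable, TypeVar

T = TypeVar('T')


def take_page(matches: Iterable[T], limit: int) -> tuple[list[T], bool]:
    """Return up to `limit` items and whether more items remain."""
    if limit < 1 or limit > 100:
        raise ValueError('limit must be between 1 and 100')
    # Pull one item beyond the page size; it is never returned to the caller.
    batch = list(islice(matches, limit + 1))
    has_more = len(batch) > limit
    return batch[:limit], has_more


page, has_more = take_page(range(45), limit=20)
last_page, last_has_more = take_page(range(40, 45), limit=20)
```

This avoids a separate count query: the caller learns whether to request the next page from a single scan of at most `limit + 1` items.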
@app.post('/events')
async def add_event(request: Request):
data = await request.json()
await conversation_manager.send_to_event_stream(request.state.sid, data)
return JSONResponse({"success": True})
+25
@@ -2,6 +2,7 @@ from fastapi import APIRouter, Depends, status
from fastapi.responses import JSONResponse
from pydantic import SecretStr
from openhands.core.logger import openhands_logger as logger
from openhands.integrations.provider import (
PROVIDER_TOKEN_TYPE,
ProviderHandler,
@@ -42,6 +43,9 @@ async def get_user_repositories(
return await client.get_repositories(sort, server_config.app_mode)
except AuthenticationError as e:
logger.info(
f'Returning 401 Unauthorized - Authentication error for user_id: {user_id}, error: {str(e)}'
)
return JSONResponse(
content=str(e),
status_code=status.HTTP_401_UNAUTHORIZED,
@@ -53,6 +57,9 @@ async def get_user_repositories(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
logger.info(
f'Returning 401 Unauthorized - Git provider token required for user_id: {user_id}'
)
return JSONResponse(
content='Git provider token required. (such as GitHub).',
status_code=status.HTTP_401_UNAUTHORIZED,
@@ -63,6 +70,7 @@ async def get_user_repositories(
async def get_user(
provider_tokens: PROVIDER_TOKEN_TYPE | None = Depends(get_provider_tokens),
access_token: SecretStr | None = Depends(get_access_token),
user_id: str | None = Depends(get_user_id),
) -> User | JSONResponse:
if provider_tokens:
client = ProviderHandler(
@@ -74,6 +82,9 @@ async def get_user(
return user
except AuthenticationError as e:
logger.info(
f'Returning 401 Unauthorized - Authentication error for user_id: {user_id}, error: {str(e)}'
)
return JSONResponse(
content=str(e),
status_code=status.HTTP_401_UNAUTHORIZED,
@@ -85,6 +96,9 @@ async def get_user(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
logger.info(
f'Returning 401 Unauthorized - Git provider token required for user_id: {user_id}'
)
return JSONResponse(
content='Git provider token required. (such as GitHub).',
status_code=status.HTTP_401_UNAUTHORIZED,
@@ -99,6 +113,7 @@ async def search_repositories(
order: str = 'desc',
provider_tokens: PROVIDER_TOKEN_TYPE | None = Depends(get_provider_tokens),
access_token: SecretStr | None = Depends(get_access_token),
user_id: str | None = Depends(get_user_id),
) -> list[Repository] | JSONResponse:
if provider_tokens:
client = ProviderHandler(
@@ -122,6 +137,9 @@ async def search_repositories(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
logger.info(
f'Returning 401 Unauthorized - GitHub token required for user_id: {user_id}'
)
return JSONResponse(
content='GitHub token required.',
status_code=status.HTTP_401_UNAUTHORIZED,
@@ -132,6 +150,7 @@ async def search_repositories(
async def get_suggested_tasks(
provider_tokens: PROVIDER_TOKEN_TYPE | None = Depends(get_provider_tokens),
access_token: SecretStr | None = Depends(get_access_token),
user_id: str | None = Depends(get_user_id),
) -> list[SuggestedTask] | JSONResponse:
"""Get suggested tasks for the authenticated user across their most recently pushed repositories.
@@ -158,6 +177,7 @@ async def get_suggested_tasks(
content=str(e),
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
logger.info(f'Returning 401 Unauthorized - No providers set for user_id: {user_id}')
return JSONResponse(
content='No providers set.',
@@ -170,6 +190,7 @@ async def get_repository_branches(
repository: str,
provider_tokens: PROVIDER_TOKEN_TYPE | None = Depends(get_provider_tokens),
access_token: SecretStr | None = Depends(get_access_token),
user_id: str | None = Depends(get_user_id),
) -> list[Branch] | JSONResponse:
"""Get branches for a repository.
@@ -199,6 +220,10 @@ async def get_repository_branches(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
logger.info(
f'Returning 401 Unauthorized - Git provider token required for user_id: {user_id}'
)
return JSONResponse(
content='Git provider token required. (such as GitHub).',
status_code=status.HTTP_401_UNAUTHORIZED,
+38 -12
@@ -9,6 +9,7 @@ from pydantic import BaseModel
from openhands.core.logger import openhands_logger as logger
from openhands.events.action.message import MessageAction
from openhands.integrations.provider import (
CUSTOM_SECRETS_TYPE_WITH_JSON_SCHEMA,
PROVIDER_TOKEN_TYPE,
ProviderHandler,
)
@@ -18,6 +19,7 @@ from openhands.integrations.service_types import (
SuggestedTask,
)
from openhands.runtime import get_runtime_cls
from openhands.server.data_models.agent_loop_info import AgentLoopInfo
from openhands.server.data_models.conversation_info import ConversationInfo
from openhands.server.data_models.conversation_info_result_set import (
ConversationInfoResultSet,
@@ -34,6 +36,7 @@ from openhands.server.user_auth import (
get_auth_type,
get_provider_tokens,
get_user_id,
get_user_secrets,
)
from openhands.server.user_auth.user_auth import AuthType
from openhands.server.utils import get_conversation_store
@@ -43,6 +46,7 @@ from openhands.storage.data_models.conversation_metadata import (
ConversationTrigger,
)
from openhands.storage.data_models.conversation_status import ConversationStatus
from openhands.storage.data_models.user_secrets import UserSecrets
from openhands.utils.async_utils import wait_all
from openhands.utils.conversation_summary import get_default_conversation_title
@@ -61,9 +65,18 @@ class InitSessionRequest(BaseModel):
model_config = {'extra': 'forbid'}
class InitSessionResponse(BaseModel):
status: str
conversation_id: str
conversation_url: str
api_key: str | None
message: str | None = None
async def _create_new_conversation(
user_id: str | None,
git_provider_tokens: PROVIDER_TOKEN_TYPE | None,
custom_secrets: CUSTOM_SECRETS_TYPE_WITH_JSON_SCHEMA | None,
selected_repository: str | None,
selected_branch: str | None,
initial_user_msg: str | None,
@@ -71,7 +84,7 @@ async def _create_new_conversation(
replay_json: str | None,
conversation_trigger: ConversationTrigger = ConversationTrigger.GUI,
attach_convo_id: bool = False,
) -> str:
) -> AgentLoopInfo:
logger.info(
'Creating conversation',
extra={
@@ -105,6 +118,7 @@ async def _create_new_conversation(
session_init_args['git_provider_tokens'] = git_provider_tokens
session_init_args['selected_repository'] = selected_repository
session_init_args['custom_secrets'] = custom_secrets
session_init_args['selected_branch'] = selected_branch
conversation_init_data = ConversationInitData(**session_init_args)
logger.info('Loading conversation store')
@@ -149,15 +163,15 @@ async def _create_new_conversation(
content=user_msg or '',
image_urls=image_urls or [],
)
await conversation_manager.maybe_start_agent_loop(
agent_loop_info = await conversation_manager.maybe_start_agent_loop(
conversation_id,
conversation_init_data,
user_id,
initial_user_msg=initial_message_action,
replay_json=replay_json,
)
logger.info(f'Finished initializing conversation {conversation_id}')
return conversation_id
logger.info(f'Finished initializing conversation {agent_loop_info.conversation_id}')
return agent_loop_info
@app.post('/conversations')
@@ -165,8 +179,9 @@ async def new_conversation(
data: InitSessionRequest,
user_id: str = Depends(get_user_id),
provider_tokens: PROVIDER_TOKEN_TYPE = Depends(get_provider_tokens),
user_secrets: UserSecrets = Depends(get_user_secrets),
auth_type: AuthType | None = Depends(get_auth_type),
) -> JSONResponse:
) -> InitSessionResponse:
"""Initialize a new session or join an existing one.
After successful initialization, the client should connect to the WebSocket
@@ -197,9 +212,10 @@ async def new_conversation(
await provider_handler.verify_repo_provider(repository, git_provider)
# Create conversation with initial message
conversation_id = await _create_new_conversation(
agent_loop_info = await _create_new_conversation(
user_id=user_id,
git_provider_tokens=provider_tokens,
custom_secrets=user_secrets.custom_secrets if user_secrets else None,
selected_repository=repository,
selected_branch=selected_branch,
initial_user_msg=initial_user_msg,
@@ -208,8 +224,11 @@ async def new_conversation(
conversation_trigger=conversation_trigger,
)
return JSONResponse(
content={'status': 'ok', 'conversation_id': conversation_id}
return InitSessionResponse(
status='ok',
conversation_id=agent_loop_info.conversation_id,
conversation_url=agent_loop_info.url,
api_key=agent_loop_info.api_key,
)
except MissingSettingsError as e:
return JSONResponse(
@@ -269,6 +288,8 @@ async def search_conversations(
user_id, conversation_ids
)
connection_ids_to_conversation_ids = await conversation_manager.get_connections(filter_to_sids=conversation_ids)
agent_loop_info = await conversation_manager.get_agent_loop_info(filter_to_sids=conversation_ids)
urls_by_conversation_id = {info.conversation_id: info.url for info in agent_loop_info}
result = ConversationInfoResultSet(
results=await wait_all(
_get_conversation_info(
@@ -277,7 +298,8 @@ async def search_conversations(
num_connections=sum(
1 for conversation_id in connection_ids_to_conversation_ids.values()
if conversation_id == conversation.conversation_id
)
),
url=urls_by_conversation_id.get(conversation.conversation_id),
)
for conversation in filtered_results
),
@@ -295,7 +317,9 @@ async def get_conversation(
metadata = await conversation_store.get_metadata(conversation_id)
is_running = await conversation_manager.is_agent_loop_running(conversation_id)
num_connections = len(await conversation_manager.get_connections(filter_to_sids={conversation_id}))
conversation_info = await _get_conversation_info(metadata, is_running, num_connections)
agent_loop_info = await conversation_manager.get_agent_loop_info(filter_to_sids={conversation_id})
url = agent_loop_info[0].url if agent_loop_info else None
conversation_info = await _get_conversation_info(metadata, is_running, num_connections, url)
return conversation_info
except FileNotFoundError:
return None
@@ -323,7 +347,8 @@ async def delete_conversation(
async def _get_conversation_info(
conversation: ConversationMetadata,
is_running: bool,
num_connections: int
num_connections: int,
url: str | None,
) -> ConversationInfo | None:
try:
title = conversation.title
@@ -339,7 +364,8 @@ async def _get_conversation_info(
status=(
ConversationStatus.RUNNING if is_running else ConversationStatus.STOPPED
),
num_connections=num_connections
num_connections=num_connections,
url=url,
)
except Exception as e:
logger.error(
+6 -2
@@ -2,8 +2,7 @@ from fastapi import APIRouter, Depends, status
from fastapi.responses import JSONResponse
from openhands.core.logger import openhands_logger as logger
from openhands.integrations.provider import CustomSecret
from openhands.integrations.provider import PROVIDER_TOKEN_TYPE
from openhands.integrations.provider import PROVIDER_TOKEN_TYPE, CustomSecret
from openhands.integrations.service_types import ProviderType
from openhands.integrations.utils import validate_provider_token
from openhands.server.settings import (
@@ -110,6 +109,10 @@ async def store_provider_tokens(
) -> JSONResponse:
provider_err_msg = await check_provider_tokens(provider_info, provider_tokens)
if provider_err_msg:
# We don't have direct access to user_id here, but we can log the provider error message
logger.info(
f'Returning 401 Unauthorized - Provider token error: {provider_err_msg}'
)
return JSONResponse(
status_code=status.HTTP_401_UNAUTHORIZED,
content={'error': provider_err_msg},
@@ -203,6 +206,7 @@ async def load_custom_secrets_names(
except Exception as e:
logger.warning(f'Failed to load secret names: {e}')
logger.info('Returning 401 Unauthorized - Failed to get secret names')
return JSONResponse(
status_code=status.HTTP_401_UNAUTHORIZED,
content={'error': 'Failed to get secret names'},
+5
@@ -71,6 +71,11 @@ async def load_settings(
return settings_with_token_data
except Exception as e:
logger.warning(f'Invalid token: {e}')
# Get user_id from settings if available
user_id = getattr(settings, 'user_id', 'unknown') if settings else 'unknown'
logger.info(
f'Returning 401 Unauthorized - Invalid token for user_id: {user_id}'
)
return JSONResponse(
status_code=status.HTTP_401_UNAUTHORIZED,
content={'error': 'Invalid token'},
+35 -20
@@ -16,7 +16,7 @@ from openhands.core.schema.agent import AgentState
from openhands.events.action import ChangeAgentStateAction, MessageAction
from openhands.events.event import Event, EventSource
from openhands.events.stream import EventStream
from openhands.integrations.provider import PROVIDER_TOKEN_TYPE, ProviderHandler
from openhands.integrations.provider import CUSTOM_SECRETS_TYPE, PROVIDER_TOKEN_TYPE, ProviderHandler
from openhands.mcp import add_mcp_tools_to_agent
from openhands.memory.memory import Memory
from openhands.microagent.microagent import BaseMicroagent
@@ -24,6 +24,7 @@ from openhands.runtime import get_runtime_cls
from openhands.runtime.base import Runtime
from openhands.runtime.impl.remote.remote_runtime import RemoteRuntime
from openhands.security import SecurityAnalyzer, options
from openhands.storage.data_models.user_secrets import UserSecrets
from openhands.storage.files import FileStore
from openhands.utils.async_utils import EXECUTOR, call_sync_from_async
from openhands.utils.shutdown_listener import should_continue
@@ -82,6 +83,7 @@ class AgentSession:
agent: Agent,
max_iterations: int,
git_provider_tokens: PROVIDER_TOKEN_TYPE | None = None,
custom_secrets: CUSTOM_SECRETS_TYPE | None = None,
max_budget_per_task: float | None = None,
agent_to_llm_config: dict[str, LLMConfig] | None = None,
agent_configs: dict[str, AgentConfig] | None = None,
@@ -113,6 +115,9 @@ class AgentSession:
self._started_at = started_at
finished = False # For monitoring
runtime_connected = False
custom_secrets_handler = UserSecrets(custom_secrets=custom_secrets if custom_secrets else {})
try:
self._create_security_analyzer(config.security.security_analyzer)
runtime_connected = await self._create_runtime(
@@ -120,14 +125,32 @@ class AgentSession:
config=config,
agent=agent,
git_provider_tokens=git_provider_tokens,
custom_secrets=custom_secrets,
selected_repository=selected_repository,
selected_branch=selected_branch,
)
repo_directory = None
if self.runtime and runtime_connected and selected_repository:
repo_directory = selected_repository.split('/')[-1]
if git_provider_tokens:
provider_handler = ProviderHandler(provider_tokens=git_provider_tokens)
await provider_handler.set_event_stream_secrets(self.event_stream)
if custom_secrets:
custom_secrets_handler.set_event_stream_secrets(self.event_stream)
self.memory = await self._create_memory(
selected_repository=selected_repository,
repo_directory=repo_directory,
custom_secrets_descriptions=custom_secrets_handler.get_custom_secrets_descriptions()
)
# NOTE: this needs to happen before controller is created
# so MCP tools can be included into the SystemMessageAction
if self.runtime and runtime_connected:
await add_mcp_tools_to_agent(agent, self.runtime, config.mcp)
await add_mcp_tools_to_agent(agent, self.runtime, self.memory, config.mcp)
if replay_json:
initial_message = self._run_replay(
@@ -150,19 +173,6 @@ class AgentSession:
agent_configs=agent_configs,
)
repo_directory = None
if self.runtime and runtime_connected and selected_repository:
repo_directory = selected_repository.split('/')[-1]
self.memory = await self._create_memory(
selected_repository=selected_repository,
repo_directory=repo_directory,
)
if git_provider_tokens:
provider_handler = ProviderHandler(provider_tokens=git_provider_tokens)
await provider_handler.set_event_stream_secrets(self.event_stream)
if not self._closed:
if initial_message:
self.event_stream.add_event(initial_message, EventSource.USER)
@@ -264,6 +274,7 @@ class AgentSession:
config: AppConfig,
agent: Agent,
git_provider_tokens: PROVIDER_TOKEN_TYPE | None = None,
custom_secrets: CUSTOM_SECRETS_TYPE | None = None,
selected_repository: str | None = None,
selected_branch: str | None = None,
) -> bool:
@@ -281,9 +292,11 @@ class AgentSession:
if self.runtime is not None:
raise RuntimeError('Runtime already created')
custom_secrets_handler = UserSecrets(custom_secrets=custom_secrets or {})
env_vars = custom_secrets_handler.get_env_vars()
self.logger.debug(f'Initializing runtime `{runtime_name}` now...')
runtime_cls = get_runtime_cls(runtime_name)
if runtime_cls == RemoteRuntime:
self.runtime = runtime_cls(
config=config,
@@ -294,6 +307,7 @@ class AgentSession:
headless_mode=False,
attach_to_existing=False,
git_provider_tokens=git_provider_tokens,
env_vars=env_vars,
user_id=self.user_id,
)
else:
@@ -301,8 +315,9 @@ class AgentSession:
provider_tokens=git_provider_tokens
or cast(PROVIDER_TOKEN_TYPE, MappingProxyType({}))
)
env_vars = await provider_handler.get_env_vars(expose_secrets=True)
# Merge git provider tokens with custom secrets before passing over to runtime
env_vars.update(await provider_handler.get_env_vars(expose_secrets=True))
self.runtime = runtime_cls(
config=config,
event_stream=self.event_stream,
@@ -400,7 +415,7 @@ class AgentSession:
return controller
async def _create_memory(
self, selected_repository: str | None, repo_directory: str | None
self, selected_repository: str | None, repo_directory: str | None, custom_secrets_descriptions: dict[str, str]
) -> Memory:
memory = Memory(
event_stream=self.event_stream,
@@ -410,7 +425,7 @@ class AgentSession:
if self.runtime:
# sets available hosts and other runtime info
memory.set_runtime_info(self.runtime)
memory.set_runtime_info(self.runtime, custom_secrets_descriptions)
# loads microagents from repo/.openhands/microagents
microagents: list[BaseMicroagent] = await call_sync_from_async(
@@ -1,6 +1,6 @@
from pydantic import Field
from openhands.integrations.provider import PROVIDER_TOKEN_TYPE
from openhands.integrations.provider import CUSTOM_SECRETS_TYPE, PROVIDER_TOKEN_TYPE
from openhands.storage.data_models.settings import Settings
@@ -10,6 +10,7 @@ class ConversationInitData(Settings):
"""
git_provider_tokens: PROVIDER_TOKEN_TYPE | None = Field(default=None, frozen=True)
custom_secrets: CUSTOM_SECRETS_TYPE | None = Field(default=None, frozen=True)
selected_repository: str | None = Field(default=None)
replay_json: str | None = Field(default=None)
selected_branch: str | None = Field(default=None)
+26 -5
@@ -12,6 +12,7 @@ from openhands.core.config.condenser_config import (
CondenserPipelineConfig,
LLMSummarizingCondenserConfig,
)
from openhands.core.exceptions import MicroagentValidationError
from openhands.core.logger import OpenHandsLoggerAdapter
from openhands.core.schema import AgentState
from openhands.events.action import MessageAction, NullAction
@@ -153,10 +154,12 @@ class Session:
git_provider_tokens = None
selected_repository = None
selected_branch = None
custom_secrets = None
if isinstance(settings, ConversationInitData):
git_provider_tokens = settings.git_provider_tokens
selected_repository = settings.selected_repository
selected_branch = settings.selected_branch
custom_secrets = settings.custom_secrets
try:
await self.agent_session.start(
@@ -168,21 +171,39 @@ class Session:
agent_to_llm_config=self.config.get_agent_to_llm_config_map(),
agent_configs=self.config.get_agent_configs(),
git_provider_tokens=git_provider_tokens,
custom_secrets=custom_secrets,
selected_repository=selected_repository,
selected_branch=selected_branch,
initial_message=initial_message,
replay_json=replay_json,
)
except MicroagentValidationError as e:
self.logger.exception(f'Error creating agent_session: {e}')
# For microagent validation errors, provide more helpful information
await self.send_error(f'Failed to create agent session: {str(e)}')
return
except ValueError as e:
self.logger.exception(f'Error creating agent_session: {e}')
error_message = str(e)
# For ValueError related to microagents, provide more helpful information
if 'microagent' in error_message.lower():
await self.send_error(
f'Failed to create agent session: {error_message}'
)
else:
# For other ValueErrors, just show the error class
await self.send_error('Failed to create agent session: ValueError')
return
except Exception as e:
self.logger.exception(f'Error creating agent_session: {e}')
err_class = e.__class__.__name__
await self.send_error(f'Failed to create agent session: {err_class}')
# For other errors, just show the error class to avoid exposing sensitive information
await self.send_error(
f'Failed to create agent session: {e.__class__.__name__}'
)
return
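Reviewer sketch of the error-message policy in the `except` chain above: microagent validation problems are shown in full, while other exceptions are reduced to their class name so that potentially sensitive details are not sent to the client. `session_error_message` is a hypothetical helper, and the local `MicroagentValidationError` is a stand-in for the class imported from `openhands.core.exceptions`.

```python
class MicroagentValidationError(Exception):
    """Stand-in for openhands.core.exceptions.MicroagentValidationError."""


def session_error_message(exc: Exception) -> str:
    """Choose the user-facing message for a failed agent session start."""
    prefix = 'Failed to create agent session: '
    if isinstance(exc, MicroagentValidationError):
        # Validation errors are user-actionable, so show the full text.
        return prefix + str(exc)
    if isinstance(exc, ValueError):
        message = str(exc)
        if 'microagent' in message.lower():
            return prefix + message
        return prefix + 'ValueError'
    # Anything else: expose only the class name.
    return prefix + exc.__class__.__name__


shown = session_error_message(ValueError('Invalid microagent frontmatter'))
hidden = session_error_message(ValueError('token=abc123 rejected'))
```

The substring check on `'microagent'` is a heuristic: any `ValueError` whose text happens to contain that word will be shown verbatim.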
def _create_llm(self, agent_cls: str | None) -> LLM:
"""
Initialize LLM, extracted for testing.
"""
"""Initialize LLM, extracted for testing."""
agent_name = agent_cls if agent_cls is not None else 'agent'
return LLM(
config=self.config.get_llm_config_from_agent(agent_name),
@@ -10,6 +10,7 @@ from pydantic import (
)
from pydantic.json import pydantic_encoder
from openhands.events.stream import EventStream
from openhands.integrations.provider import (
CUSTOM_SECRETS_TYPE,
CUSTOM_SECRETS_TYPE_WITH_JSON_SCHEMA,
@@ -136,3 +137,31 @@ class UserSecrets(BaseModel):
new_data['custom_secrets'] = secrets
return new_data
def set_event_stream_secrets(self, event_stream: EventStream) -> None:
"""
This ensures that custom secrets are masked from the event stream
Args:
event_stream: Agent session's event stream
"""
secrets = self.get_env_vars()
event_stream.set_secrets(secrets)
def get_env_vars(self) -> dict[str, str]:
secret_store = self.model_dump(context={'expose_secrets': True})
custom_secrets = secret_store.get('custom_secrets', {})
secrets = {}
for secret_name, value in custom_secrets.items():
secrets[secret_name] = value['secret']
return secrets
def get_custom_secrets_descriptions(self) -> dict[str, str]:
secrets = {}
for secret_name, secret in self.custom_secrets.items():
secrets[secret_name] = secret.description
return secrets
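Reviewer sketch of the two views `UserSecrets` derives from one mapping: secret values (exported as environment variables and registered for masking) and descriptions (passed to memory so the agent knows which secrets exist without seeing their values). `Secret`, `env_vars`, and `descriptions` are hypothetical stand-ins for `CustomSecret` and the methods above, using plain strings rather than pydantic `SecretStr`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Secret:
    """Hypothetical stand-in for CustomSecret."""

    secret: str
    description: str


def env_vars(custom_secrets: dict[str, Secret]) -> dict[str, str]:
    """Map each secret name to its raw value, as get_env_vars does."""
    return {name: item.secret for name, item in custom_secrets.items()}


def descriptions(custom_secrets: dict[str, Secret]) -> dict[str, str]:
    """Map each secret name to its description only; values never leave."""
    return {name: item.description for name, item in custom_secrets.items()}


store = {
    'DEPLOY_KEY': Secret('s3cr3t', 'Key used by the deploy script'),
    'NPM_TOKEN': Secret('tok-123', 'Token for the private npm registry'),
}
values = env_vars(store)
visible = descriptions(store)
```

Keeping the two mappings separate means the prompt-facing side only ever receives names and descriptions.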
@@ -1,164 +0,0 @@
#!/usr/bin/env python3
"""Utility to use patchright as a drop-in replacement for playwright.
This module provides a function to patch all imports of playwright to use patchright instead.
It uses Python's import system to intercept imports of playwright modules and redirect them
to the corresponding patchright modules.
Usage:
from openhands.utils.playwright_patchright_util import use_patchright
use_patchright() # Call this before any imports of browsergym or playwright
"""
import importlib
import importlib.abc
import importlib.machinery
import importlib.util
import logging
import sys
import types
from typing import Optional, Sequence
logger = logging.getLogger(__name__)
class PlaywrightToPatchrightLoader(importlib.abc.Loader):
"""Custom loader that loads a patchright module but presents it as a playwright module."""
def __init__(
self, fullname: str, patchright_name: str, original_loader: importlib.abc.Loader
):
self.fullname = fullname
self.patchright_name = patchright_name
self.original_loader = original_loader
def create_module(
self, spec: importlib.machinery.ModuleSpec
) -> Optional[types.ModuleType]:
"""Create a module object for the patchright module."""
try:
# Import the patchright module and return it directly
return importlib.import_module(self.patchright_name)
except ImportError as e:
logger.warning(f'Failed to import {self.patchright_name}: {e}')
return None
def exec_module(self, module: types.ModuleType) -> None:
"""Execute the module (nothing to do here as we already set up the module)."""
pass
class PlaywrightToPatchrightFinder(importlib.abc.MetaPathFinder):
"""Custom finder that intercepts imports of playwright modules and redirects them to patchright."""
def find_spec(
self,
fullname: str,
path: Optional[Sequence[str]] = None,
target: Optional[types.ModuleType] = None,
) -> Optional[importlib.machinery.ModuleSpec]:
"""Find the module spec for the given module name."""
# Only handle playwright modules
if not fullname.startswith('playwright'):
return None
# Replace playwright prefix with patchright
patchright_name = 'patchright' + fullname[len('playwright') :]
try:
# Try to find the spec for the patchright module
spec = importlib.util.find_spec(patchright_name)
if spec is None:
return None
# Create a loader that will load the patchright module
loader = PlaywrightToPatchrightLoader(
fullname, patchright_name, spec.loader or importlib.abc.Loader()
)
# Create a new spec with the original name but using our custom loader
new_spec = importlib.machinery.ModuleSpec(
name=fullname,
loader=loader,
origin=spec.origin,
loader_state=spec.loader_state,
is_package=spec.submodule_search_locations is not None,
)
# Set submodule_search_locations if it's a package
if spec.submodule_search_locations is not None:
new_spec.submodule_search_locations = spec.submodule_search_locations
return new_spec
except (ImportError, AttributeError) as e:
logger.warning(f'Failed to find spec for {patchright_name}: {e}')
return None
def use_patchright():
"""Configure the system to use patchright as a drop-in replacement for playwright.
This function:
1. Checks if patchright is installed
2. Removes any existing playwright modules from sys.modules
3. Installs a meta path finder to redirect imports
Call this function before importing any modules that use playwright.
"""
# Check if patchright is installed
try:
importlib.import_module('patchright')
except ImportError:
logger.error(
"Patchright is not installed. Please install it with 'pip install patchright'."
)
raise ImportError(
"Patchright is not installed. Please install it with 'pip install patchright'."
)
# Remove any existing playwright modules from sys.modules
playwright_modules = [
name
for name in list(sys.modules.keys())
if name == 'playwright' or name.startswith('playwright.')
]
for name in playwright_modules:
del sys.modules[name]
# Install our custom finder at the beginning of sys.meta_path
for i, finder in enumerate(sys.meta_path):
if isinstance(finder, PlaywrightToPatchrightFinder):
# Already installed
return
# Add our finder to the beginning of sys.meta_path
sys.meta_path.insert(0, PlaywrightToPatchrightFinder())
logger.info('Patchright will be used as a drop-in replacement for playwright.')
if __name__ == '__main__':
# Configure logging
logging.basicConfig(level=logging.INFO)
# Test the utility
use_patchright()
# Try importing playwright
import playwright
print(f'Imported module: {playwright.__name__}')
print(f'Module file: {playwright.__file__}')
# Try importing sync_api
from playwright.sync_api import sync_playwright
print(f'sync_playwright function: {sync_playwright}')
# Use playwright
with sync_playwright() as p:
browser = p.chromium.launch(headless=True)
page = browser.new_page()
page.goto('https://example.com')
print(f'Page title: {page.title()}')
browser.close()
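The redirect technique used by the removed module (a meta path finder that answers for one module name and hands back another module) can be reduced to a small sketch. This is not the removed utility itself: `AliasLoader`, `AliasFinder`, and the alias name `colorsys_alias_demo` are hypothetical, and the standard-library `colorsys` module stands in for patchright so the sketch runs without extra packages.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys


class AliasLoader(importlib.abc.Loader):
    """Hypothetical loader that returns an already importable target module."""

    def __init__(self, target_name: str):
        self.target_name = target_name

    def create_module(self, spec):
        # Import the real module and present it under the alias name.
        return importlib.import_module(self.target_name)

    def exec_module(self, module):
        # Nothing to execute: the target module is already initialised.
        pass


class AliasFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder that only answers for a single alias name."""

    def __init__(self, alias: str, target: str):
        self.alias = alias
        self.target = target

    def find_spec(self, fullname, path=None, target=None):
        if fullname != self.alias:
            return None  # Let the normal import machinery handle everything else.
        return importlib.machinery.ModuleSpec(fullname, AliasLoader(self.target))


# Install the finder first so it is consulted before the default finders.
sys.meta_path.insert(0, AliasFinder('colorsys_alias_demo', 'colorsys'))

import colorsys
import colorsys_alias_demo  # Resolved by AliasFinder, not by a file on disk.

same_module = colorsys_alias_demo is colorsys
```

The removed module applied the same idea to a whole package prefix, which is why it also had to carry over `submodule_search_locations` for packages.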
+1
@@ -14,6 +14,7 @@ class RuntimeInfo:
date: str
available_hosts: dict[str, int] = field(default_factory=dict)
additional_agent_instructions: str = ''
custom_secrets_descriptions: dict[str, str] = field(default_factory=dict)
@dataclass
+2126 -2240
Generated file; diff suppressed because it is too large
+5 -5
@@ -35,7 +35,6 @@ types-toml = "*"
numpy = "*"
json-repair = "*"
browsergym-core = "0.13.3" # integrate browsergym-core as the browsing interface
patchright = "*" # undetected version of playwright for browser automation
html2text = "*"
e2b = ">=1.0.5,<1.4.0"
pexpect = "*"
@@ -79,19 +78,20 @@ ipywidgets = "^8.1.5"
qtconsole = "^5.6.1"
memory-profiler = "^0.61.0"
daytona-sdk = "0.16.1"
mcp = "1.7.1"
python-json-logger = "^3.2.1"
prompt-toolkit = "^3.0.50"
mcpm = "1.9.0"
poetry = "^2.1.2"
anyio = "4.9.0"
pythonnet = "*"
mcp = "1.7.1"
mcpm = "1.9.0"
[tool.poetry.group.dev.dependencies]
ruff = "0.11.9"
ruff = "0.11.10"
mypy = "1.15.0"
pre-commit = "4.2.0"
build = "*"
types-setuptools = "*"
[tool.poetry.group.test.dependencies]
pytest = "*"
@@ -133,7 +133,7 @@ pyarrow = "20.0.0" # transiti
datasets = "*"
[tool.poetry.scripts]
openhands = "openhands.core.cli:main"
openhands = "openhands.cli.main:main"
[tool.poetry.group.testgeneval.dependencies]
fuzzywuzzy = "^0.18.0"
+94
@@ -1,12 +1,18 @@
"""Tests for microagent loading in runtime."""
import os
from pathlib import Path
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from conftest import (
_close_test_runtime,
_load_runtime,
)
from openhands.core.config import MCPConfig
from openhands.core.config.mcp_config import MCPStdioServerConfig
from openhands.mcp.utils import add_mcp_tools_to_agent
from openhands.microagent import KnowledgeMicroagent, RepoMicroagent
@@ -165,3 +171,91 @@ Repository-specific test instructions.
finally:
_close_test_runtime(runtime)
def test_default_tools_microagent_exists():
"""Test that the default-tools microagent exists in the global microagents directory."""
# Get the path to the global microagents directory
import openhands
project_root = os.path.dirname(openhands.__file__)
parent_dir = os.path.dirname(project_root)
microagents_dir = os.path.join(parent_dir, 'microagents')
# Check that the default-tools.md file exists
default_tools_path = os.path.join(microagents_dir, 'default-tools.md')
assert os.path.exists(default_tools_path), (
f'default-tools.md not found at {default_tools_path}'
)
# Read the file and check its content
with open(default_tools_path, 'r') as f:
content = f.read()
# Verify it's a repo microagent (always activated)
assert 'type: repo' in content, 'default-tools.md should be a repo microagent'
# Verify it has the fetch tool configured
assert 'name: "fetch"' in content, 'default-tools.md should have a fetch tool'
assert 'command: "uvx"' in content, 'default-tools.md should use uvx command'
assert 'args: ["mcp-server-fetch"]' in content, (
'default-tools.md should use mcp-server-fetch'
)
@pytest.mark.asyncio
async def test_add_mcp_tools_from_microagents():
"""Test that add_mcp_tools_to_agent adds tools from microagents."""
# Import ActionExecutionClient for mocking
from openhands.runtime.impl.action_execution.action_execution_client import (
ActionExecutionClient,
)
# Create mock objects
mock_agent = MagicMock()
mock_runtime = MagicMock(spec=ActionExecutionClient)
mock_memory = MagicMock()
mock_mcp_config = MCPConfig()
# Configure the mock memory to return a microagent MCP config
mock_stdio_server = MCPStdioServerConfig(
name='test-tool', command='test-command', args=['test-arg1', 'test-arg2']
)
mock_microagent_mcp_config = MCPConfig(stdio_servers=[mock_stdio_server])
mock_memory.get_microagent_mcp_tools.return_value = [mock_microagent_mcp_config]
# Configure the mock runtime
mock_runtime.runtime_initialized = True
mock_runtime.get_updated_mcp_config.return_value = mock_microagent_mcp_config
# Mock the fetch_mcp_tools_from_config function to return a mock tool
mock_tool = {
'type': 'function',
'function': {
'name': 'test-tool',
'description': 'Test tool description',
'parameters': {},
},
}
with patch(
'openhands.mcp.utils.fetch_mcp_tools_from_config',
new=AsyncMock(return_value=[mock_tool]),
):
# Call the function
await add_mcp_tools_to_agent(
mock_agent, mock_runtime, mock_memory, mock_mcp_config
)
# Verify that the memory's get_microagent_mcp_tools was called
mock_memory.get_microagent_mcp_tools.assert_called_once()
# Verify that the runtime's get_updated_mcp_config was called with the extra stdio servers
mock_runtime.get_updated_mcp_config.assert_called_once()
args, kwargs = mock_runtime.get_updated_mcp_config.call_args
assert len(args) == 1
assert len(args[0]) == 1
assert args[0][0].name == 'test-tool'
# Verify that the agent's set_mcp_tools was called with the mock tool
mock_agent.set_mcp_tools.assert_called_once_with([mock_tool])
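The mocking pattern in the test above (an `AsyncMock` for the awaited fetch, then inspection of `call_args` on a synchronous mock) can be tried in isolation. The sketch below is an assumption-laden stand-in: `add_tools`, `get_updated_config`, and `set_tools` are hypothetical names, not the real `add_mcp_tools_to_agent` API.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def add_tools(agent, runtime, fetch_tools):
    # Hypothetical stand-in for the function under test.
    config = runtime.get_updated_config(['extra-server'])
    tools = await fetch_tools(config)
    agent.set_tools(tools)


agent = MagicMock()
runtime = MagicMock()
runtime.get_updated_config.return_value = {'servers': ['extra-server']}

# AsyncMock makes the call awaitable and records how it was awaited.
fetch_tools = AsyncMock(return_value=[{'name': 'test-tool'}])

asyncio.run(add_tools(agent, runtime, fetch_tools))

# call_args unpacks into positional and keyword arguments of the last call.
args, kwargs = runtime.get_updated_config.call_args
```

`assert_awaited_once_with` on the `AsyncMock` checks the awaited arguments directly, which can be a little tighter than counting positional arguments by hand.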
@@ -0,0 +1,33 @@
"""Test that the translation completeness check works correctly."""
import os
import subprocess
import unittest
class TestTranslationCompleteness(unittest.TestCase):
"""Test that the translation completeness check works correctly."""
def test_translation_completeness_check_runs(self):
"""Test that the translation completeness check script can be executed."""
frontend_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(__file__)))), "frontend")
script_path = os.path.join(frontend_dir, "scripts", "check-translation-completeness.cjs")
# Verify the script exists
self.assertTrue(os.path.exists(script_path), f"Script not found at {script_path}")
# Verify the script is executable
self.assertTrue(os.access(script_path, os.X_OK), f"Script at {script_path} is not executable")
# Run the script (it may fail due to missing translations, but we just want to verify it runs)
try:
subprocess.run(
["node", script_path],
cwd=frontend_dir,
check=False,
capture_output=True,
text=True
)
# We don't assert on the return code because it might fail due to missing translations
except Exception as e:
self.fail(f"Failed to run translation completeness check: {e}")
+45 -41
@@ -10,9 +10,8 @@ import openhands
def assert_sandbox_config(
config: SandboxConfig,
base_container_image = SandboxConfig.model_fields["base_container_image"].default,
runtime_container_image = "ghcr.io/all-hands-ai/runtime:0.38.0-nikolaik",
runtime_container_image = 'ghcr.io/all-hands-ai/runtime:mock-nikolaik', # Default to mock version
local_runtime_url = SandboxConfig.model_fields["local_runtime_url"].default,
user_id = SandboxConfig.model_fields["user_id"].default,
):
"""Helper function to assert the properties of the SandboxConfig object."""
assert isinstance(config, SandboxConfig)
@@ -22,17 +21,21 @@ def assert_sandbox_config(
assert config.use_host_network is False
assert config.timeout == 300
assert config.local_runtime_url == local_runtime_url
assert config.user_id == user_id
def test_setup_sandbox_config_default():
"""Test default configuration when no images provided and not experimental"""
config = IssueResolver._setup_sandbox_config(
base_container_image=None,
runtime_container_image=None,
is_experimental=False,
)
with mock.patch('openhands.__version__', 'mock'):
config = IssueResolver._setup_sandbox_config(
base_container_image=None,
runtime_container_image=None,
is_experimental=False,
)
assert_sandbox_config(
config,
runtime_container_image='ghcr.io/all-hands-ai/runtime:mock-nikolaik'
)
assert_sandbox_config(config)
def test_setup_sandbox_config_both_images():
"""Test that providing both container images raises ValueError"""
@@ -74,46 +77,47 @@ def test_setup_sandbox_config_runtime_only():
def test_setup_sandbox_config_experimental():
"""Test configuration when experimental mode is enabled"""
config = IssueResolver._setup_sandbox_config(
base_container_image=None,
runtime_container_image=None,
is_experimental=True,
)
with mock.patch('openhands.__version__', 'mock'):
config = IssueResolver._setup_sandbox_config(
base_container_image=None,
runtime_container_image=None,
is_experimental=True,
)
assert_sandbox_config(
config,
runtime_container_image=None
)
assert_sandbox_config(
config,
runtime_container_image=None
)
@mock.patch("openhands.resolver.resolve_issue.os.getuid", return_value=0)
@mock.patch("openhands.resolver.resolve_issue.get_unique_uid", return_value=1001)
def test_setup_sandbox_config_gitlab_ci(mock_get_unique_uid, mock_getuid):
"""Test GitLab CI specific configuration when running as root"""
with mock.patch.object(IssueResolver, "GITLAB_CI", True):
config = IssueResolver._setup_sandbox_config(
base_container_image=None,
runtime_container_image=None,
is_experimental=False,
)
assert_sandbox_config(
config,
local_runtime_url="http://localhost",
user_id=1001
)
with mock.patch('openhands.__version__', 'mock'):
with mock.patch.object(IssueResolver, "GITLAB_CI", True):
config = IssueResolver._setup_sandbox_config(
base_container_image=None,
runtime_container_image=None,
is_experimental=False,
)
assert_sandbox_config(
config,
local_runtime_url="http://localhost"
)
@mock.patch("openhands.resolver.resolve_issue.os.getuid", return_value=1000)
def test_setup_sandbox_config_gitlab_ci_non_root(mock_getuid):
"""Test GitLab CI configuration when not running as root"""
with mock.patch.object(IssueResolver, "GITLAB_CI", True):
config = IssueResolver._setup_sandbox_config(
base_container_image=None,
runtime_container_image=None,
is_experimental=False,
)
with mock.patch('openhands.__version__', 'mock'):
with mock.patch.object(IssueResolver, "GITLAB_CI", True):
config = IssueResolver._setup_sandbox_config(
base_container_image=None,
runtime_container_image=None,
is_experimental=False,
)
assert_sandbox_config(
config,
local_runtime_url="http://localhost",
user_id=1000
)
assert_sandbox_config(
config,
local_runtime_url="http://localhost"
)
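The tests above pin the version string so the expected image tag does not change with every release. A minimal sketch of that pattern follows, assuming the default tag is derived from a package-level `__version__`; `demo_pkg` and `default_runtime_image` are hypothetical and only imitate the shape of the image name seen in the diff.

```python
import sys
import types
from unittest import mock

# Hypothetical package standing in for the real one.
demo_pkg = types.ModuleType('demo_pkg')
demo_pkg.__version__ = '0.38.0'
sys.modules['demo_pkg'] = demo_pkg


def default_runtime_image() -> str:
    # Assumed behaviour: the tag is built from the package version at call time.
    import demo_pkg

    return f'ghcr.io/all-hands-ai/runtime:{demo_pkg.__version__}-nikolaik'


# Inside the context manager the attribute is replaced; it is restored on exit.
with mock.patch('demo_pkg.__version__', 'mock'):
    patched = default_runtime_image()

restored = default_runtime_image()
```

Because the attribute is read at call time, the patch only has an effect while the function runs inside the `with` block.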
+3 -1
@@ -103,6 +103,8 @@ def mock_memory() -> Memory:
spec=Memory,
event_stream=test_event_stream,
)
# Add the get_microagent_mcp_tools method to the mock
memory.get_microagent_mcp_tools.return_value = []
return memory
@@ -740,7 +742,7 @@ async def test_notify_on_llm_retry(mock_agent, mock_event_stream, mock_status_ca
@pytest.mark.asyncio
async def test_context_window_exceeded_error_handling(
mock_agent, mock_runtime, test_event_stream
mock_agent, mock_runtime, test_event_stream, mock_memory
):
"""Test that context window exceeded errors are handled correctly by the controller, providing a smaller view but keeping the history intact."""
max_iterations = 5
+2 -1
@@ -233,6 +233,7 @@ async def test_delegate_step_different_states(
else:
assert controller.delegate is None
assert controller.state.iteration == 5
mock_delegate.close.assert_called_once()
# The close method is called once in end_delegate
assert mock_delegate.close.call_count == 1
await controller.close()
+162
@@ -0,0 +1,162 @@
import asyncio
from unittest.mock import MagicMock, Mock
from uuid import uuid4
import pytest
from openhands.agenthub.codeact_agent.codeact_agent import CodeActAgent
from openhands.agenthub.readonly_agent.readonly_agent import ReadOnlyAgent
from openhands.controller.agent import Agent
from openhands.controller.agent_controller import AgentController
from openhands.controller.state.state import State
from openhands.core.config import AgentConfig, LLMConfig
from openhands.core.schema import AgentState
from openhands.events import EventSource, EventStream
from openhands.events.action import (
AgentDelegateAction,
AgentFinishAction,
MessageAction,
)
from openhands.events.observation import AgentDelegateObservation
from openhands.llm.llm import LLM
from openhands.llm.metrics import Metrics
from openhands.storage.memory import InMemoryFileStore
@pytest.fixture
def mock_event_stream():
"""Creates an event stream in memory."""
sid = f'test-{uuid4()}'
file_store = InMemoryFileStore({})
return EventStream(sid=sid, file_store=file_store)
@pytest.fixture
def mock_codeact_agent():
"""Creates a mock CodeActAgent for testing."""
agent = MagicMock(spec=CodeActAgent)
agent.name = 'CodeActAgent'
agent.llm = MagicMock(spec=LLM)
agent.llm.metrics = Metrics()
agent.llm.config = LLMConfig()
agent.config = AgentConfig()
# Add a proper system message mock
from openhands.events.action.message import SystemMessageAction
system_message = SystemMessageAction(content='Test system message for CodeActAgent')
system_message._source = EventSource.AGENT
system_message._id = -1 # Set invalid ID to avoid the ID check
agent.get_system_message.return_value = system_message
return agent
@pytest.fixture
def mock_readonly_agent():
"""Creates a mock ReadOnlyAgent for testing."""
agent = MagicMock(spec=ReadOnlyAgent)
agent.name = 'ReadOnlyAgent'
agent.llm = MagicMock(spec=LLM)
agent.llm.metrics = Metrics()
agent.llm.config = LLMConfig()
agent.config = AgentConfig()
# Add a proper system message mock
from openhands.events.action.message import SystemMessageAction
system_message = SystemMessageAction(content='Test system message for ReadOnlyAgent')
system_message._source = EventSource.AGENT
system_message._id = -1 # Set invalid ID to avoid the ID check
agent.get_system_message.return_value = system_message
return agent
@pytest.mark.asyncio
async def test_agent_mode_toggle(mock_codeact_agent, mock_readonly_agent, mock_event_stream):
"""
Test that the agent mode toggle works correctly:
1. Start with CodeActAgent
2. Toggle to ReadOnlyAgent
3. Toggle back to CodeActAgent
"""
# Mock the agent class resolution so that AgentController can instantiate mock_readonly_agent
original_get_cls = Agent.get_cls
def mock_get_cls(agent_name):
if agent_name == 'ReadOnlyAgent':
return lambda llm, config: mock_readonly_agent
return original_get_cls(agent_name)
Agent.get_cls = Mock(side_effect=mock_get_cls)
# Create parent controller with CodeActAgent
parent_state = State(max_iterations=10)
parent_controller = AgentController(
agent=mock_codeact_agent,
event_stream=mock_event_stream,
max_iterations=10,
sid='parent',
confirmation_mode=False,
headless_mode=True,
initial_state=parent_state,
)
# Verify we're starting with CodeActAgent
assert parent_controller.agent.name == 'CodeActAgent'
assert parent_controller.delegate is None
# Create a delegate action to switch to ReadOnlyAgent
delegate_action = AgentDelegateAction(
agent='ReadOnlyAgent',
inputs={
'task': 'Continue the conversation in READ-ONLY MODE. You can explore and analyze code but cannot make changes.'
},
thought='Switching to read-only mode at user\'s request'
)
# Simulate the delegate action
await parent_controller._on_event(delegate_action)
# Give time for the async step() to execute
await asyncio.sleep(0.5)
# Verify that we've delegated to ReadOnlyAgent
assert parent_controller.delegate is not None
assert parent_controller.delegate.agent.name == 'ReadOnlyAgent'
# Simulate a user message to the ReadOnlyAgent
message_action = MessageAction(content='Show me the files in this directory')
message_action._source = EventSource.USER
await parent_controller.delegate._on_event(message_action)
# Give time for the async step() to execute
await asyncio.sleep(0.5)
# Now simulate switching back to CodeActAgent with a finish action
finish_action = AgentFinishAction(
final_thought='Switching back to EXECUTE MODE. You now have full capabilities to modify code and execute commands.',
task_completed=True,
outputs={'mode_switch': True}
)
# Send the finish action to the delegate
await parent_controller.delegate._on_event(finish_action)
# Give time for the async step() to execute
await asyncio.sleep(0.5)
# Verify that we're back to the parent CodeActAgent
assert parent_controller.delegate is None
assert parent_controller.agent.name == 'CodeActAgent'
# Verify that a delegate observation was added to the event stream
events = list(mock_event_stream.get_events())
assert any(isinstance(event, AgentDelegateObservation) for event in events)
# Cleanup
await parent_controller.close()
# Restore the original get_cls method
Agent.get_cls = original_get_cls
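The test above replaces `Agent.get_cls` by assignment and restores it on the last line, so a failing assertion would leave the replacement in place for later tests. One possible alternative, sketched here with a hypothetical `Registry` class rather than the real `Agent`, is `patch.object` as a context manager, which restores the original even when the body raises.

```python
from unittest.mock import patch


class Registry:
    # Hypothetical stand-in for a class with a name-to-class lookup.
    @classmethod
    def get_cls(cls, name: str) -> str:
        return f'real:{name}'


original_get_cls = Registry.get_cls


def fake_get_cls(name: str) -> str:
    # Redirect one name, delegate the rest to the original lookup.
    if name == 'ReadOnlyAgent':
        return 'mock:ReadOnlyAgent'
    return original_get_cls(name)


with patch.object(Registry, 'get_cls', side_effect=fake_get_cls):
    inside = (Registry.get_cls('ReadOnlyAgent'), Registry.get_cls('CodeActAgent'))

# The original classmethod is back once the block exits.
after = Registry.get_cls('ReadOnlyAgent')
```

A pytest `monkeypatch` fixture would give the same restore-on-teardown guarantee; which one fits better depends on the conventions of the surrounding test suite.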
-9
@@ -13,7 +13,6 @@ from openhands.agenthub.codeact_agent.tools import (
IPythonTool,
LLMBasedFileEditTool,
ThinkTool,
WebReadTool,
create_cmd_run_tool,
create_str_replace_editor_tool,
)
@@ -79,7 +78,6 @@ def test_agent_with_default_config_has_default_tools():
'finish',
'str_replace_editor',
'think',
'web_read',
}.issubset(default_tool_names)
@@ -179,13 +177,6 @@ def test_str_replace_editor_tool():
]
def test_web_read_tool():
assert WebReadTool['type'] == 'function'
assert WebReadTool['function']['name'] == 'web_read'
assert 'url' in WebReadTool['function']['parameters']['properties']
assert WebReadTool['function']['parameters']['required'] == ['url']
def test_browser_tool():
assert BrowserTool['type'] == 'function'
assert BrowserTool['function']['name'] == 'browser'
+199 -7
@@ -5,6 +5,7 @@ import pytest
import pytest_asyncio
from openhands.cli import main as cli
from openhands.controller.state.state import State
from openhands.events import EventSource
from openhands.events.action import MessageAction
@@ -28,6 +29,11 @@ def mock_runtime():
def mock_controller():
controller = AsyncMock()
controller.close = AsyncMock()
# Setup for get_state() and the returned state's save_to_session()
mock_state = MagicMock()
mock_state.save_to_session = MagicMock()
controller.get_state = MagicMock(return_value=mock_state)
return controller
@@ -126,7 +132,9 @@ def mock_settings_store():
@patch('openhands.cli.main.add_mcp_tools_to_agent')
@patch('openhands.cli.main.create_runtime')
@patch('openhands.cli.main.create_controller')
@patch('openhands.cli.main.create_memory')
@patch(
'openhands.cli.main.create_memory',
)
@patch('openhands.cli.main.run_agent_until_done')
@patch('openhands.cli.main.cleanup_session')
@patch('openhands.cli.main.initialize_repository_for_runtime')
@@ -162,7 +170,8 @@ async def test_run_session_without_initial_action(
mock_controller_task = MagicMock()
mock_create_controller.return_value = (mock_controller, mock_controller_task)
mock_memory = AsyncMock()
# Create a regular MagicMock for memory to avoid coroutine issues
mock_memory = MagicMock()
mock_create_memory.return_value = mock_memory
with patch(
@@ -191,7 +200,7 @@ async def test_run_session_without_initial_action(
mock_display_animation.assert_called_once()
mock_create_agent.assert_called_once_with(mock_config)
mock_add_mcp_tools.assert_called_once_with(
mock_agent, mock_runtime, mock_config.mcp
mock_agent, mock_runtime, mock_memory, mock_config.mcp
)
mock_create_runtime.assert_called_once()
mock_create_controller.assert_called_once()
@@ -214,7 +223,7 @@ async def test_run_session_without_initial_action(
@patch('openhands.cli.main.add_mcp_tools_to_agent')
@patch('openhands.cli.main.create_runtime')
@patch('openhands.cli.main.create_controller')
@patch('openhands.cli.main.create_memory')
@patch('openhands.cli.main.create_memory', new_callable=AsyncMock)
@patch('openhands.cli.main.run_agent_until_done')
@patch('openhands.cli.main.cleanup_session')
@patch('openhands.cli.main.initialize_repository_for_runtime')
@@ -247,8 +256,10 @@ async def test_run_session_with_initial_action(
mock_create_runtime.return_value = mock_runtime
mock_controller = AsyncMock()
mock_controller_task = MagicMock()
mock_create_controller.return_value = (mock_controller, mock_controller_task)
mock_create_controller.return_value = (
mock_controller,
None,
) # Ensure initial_state is None for this test
mock_memory = AsyncMock()
mock_create_memory.return_value = mock_memory
@@ -326,6 +337,7 @@ async def test_main_without_task(
mock_args = MagicMock()
mock_args.agent_cls = None
mock_args.llm_config = None
mock_args.name = None
mock_parse_args.return_value = mock_args
# Mock config
@@ -372,7 +384,7 @@ async def test_main_without_task(
# Check that run_session was called with expected arguments
mock_run_session.assert_called_once_with(
loop, mock_config, mock_settings_store, '/test/dir', None
loop, mock_config, mock_settings_store, '/test/dir', None, session_name=None
)
@@ -470,6 +482,186 @@ async def test_main_with_task(
assert second_call_args[4] is None
@pytest.mark.asyncio
@patch('openhands.cli.main.parse_arguments')
@patch('openhands.cli.main.setup_config_from_args')
@patch('openhands.cli.main.FileSettingsStore.get_instance')
@patch('openhands.cli.main.check_folder_security_agreement')
@patch('openhands.cli.main.read_task')
@patch('openhands.cli.main.run_session')
@patch('openhands.cli.main.LLMSummarizingCondenserConfig')
@patch('openhands.cli.main.NoOpCondenserConfig')
async def test_main_with_session_name_passes_name_to_run_session(
mock_noop_condenser,
mock_llm_condenser,
mock_run_session,
mock_read_task,
mock_check_security,
mock_get_settings_store,
mock_setup_config,
mock_parse_args,
):
"""Test main function with a session name passes it to run_session."""
loop = asyncio.get_running_loop()
test_session_name = 'my_named_session'
# Mock arguments
mock_args = MagicMock()
mock_args.agent_cls = None
mock_args.llm_config = None
mock_args.name = test_session_name # Set the session name
mock_parse_args.return_value = mock_args
# Mock config
mock_config = MagicMock()
mock_config.workspace_base = '/test/dir'
mock_config.cli_multiline_input = False
mock_setup_config.return_value = mock_config
# Mock settings store
mock_settings_store = AsyncMock()
mock_settings = MagicMock()
mock_settings.agent = 'test-agent'
mock_settings.llm_model = 'test-model' # Copied from test_main_without_task
mock_settings.llm_api_key = 'test-api-key' # Copied from test_main_without_task
mock_settings.llm_base_url = 'test-base-url' # Copied from test_main_without_task
mock_settings.confirmation_mode = True # Copied from test_main_without_task
mock_settings.enable_default_condenser = True # Copied from test_main_without_task
mock_settings_store.load.return_value = mock_settings
mock_get_settings_store.return_value = mock_settings_store
# Mock condenser config (as in test_main_without_task)
mock_llm_condenser_instance = MagicMock()
mock_llm_condenser.return_value = mock_llm_condenser_instance
# Mock security check
mock_check_security.return_value = True
# Mock read_task to return no task
mock_read_task.return_value = None
# Mock run_session to return False (no new session requested)
mock_run_session.return_value = False
# Run the function
await cli.main(loop)
# Assertions
mock_parse_args.assert_called_once()
mock_setup_config.assert_called_once_with(mock_args)
mock_get_settings_store.assert_called_once()
mock_settings_store.load.assert_called_once()
mock_check_security.assert_called_once_with(mock_config, '/test/dir')
mock_read_task.assert_called_once()
# Check that run_session was called with the correct session_name
mock_run_session.assert_called_once_with(
loop,
mock_config,
mock_settings_store,
'/test/dir',
None,
session_name=test_session_name,
)
@pytest.mark.asyncio
@patch('openhands.cli.main.generate_sid')
@patch('openhands.cli.main.create_agent')
@patch('openhands.cli.main.create_runtime') # Returns mock_runtime
@patch('openhands.cli.main.create_memory')
@patch('openhands.cli.main.add_mcp_tools_to_agent')
@patch('openhands.cli.main.run_agent_until_done')
@patch('openhands.cli.main.cleanup_session')
@patch(
'openhands.cli.main.read_prompt_input', new_callable=AsyncMock
) # For REPL control
@patch('openhands.cli.main.handle_commands', new_callable=AsyncMock) # For REPL control
@patch('openhands.core.setup.State.restore_from_session') # Key mock
@patch('openhands.controller.AgentController.__init__') # To check initial_state
@patch('openhands.cli.main.display_runtime_initialization_message') # Cosmetic
@patch('openhands.cli.main.display_initialization_animation') # Cosmetic
@patch('openhands.cli.main.initialize_repository_for_runtime') # Cosmetic / setup
@patch('openhands.cli.main.display_initial_user_prompt') # Cosmetic
async def test_run_session_with_name_attempts_state_restore(
mock_display_initial_user_prompt,
mock_initialize_repo,
mock_display_init_anim,
mock_display_runtime_init,
mock_agent_controller_init,
mock_restore_from_session,
mock_handle_commands,
mock_read_prompt_input,
mock_cleanup_session,
mock_run_agent_until_done,
mock_add_mcp_tools,
mock_create_memory,
mock_create_runtime,
mock_create_agent,
mock_generate_sid,
mock_config, # Fixture
mock_settings_store, # Fixture
):
"""Test run_session with a session_name attempts to restore state and passes it to AgentController."""
loop = asyncio.get_running_loop()
test_session_name = 'my_restore_test_session'
expected_sid = f'sid_for_{test_session_name}'
mock_generate_sid.return_value = expected_sid
mock_agent = AsyncMock()
mock_create_agent.return_value = mock_agent
mock_runtime = AsyncMock()
mock_runtime.event_stream = MagicMock() # This is the EventStream instance
mock_runtime.event_stream.sid = expected_sid
mock_runtime.event_stream.file_store = (
MagicMock()
) # Mock the file_store attribute on the EventStream
mock_create_runtime.return_value = mock_runtime
# This is what State.restore_from_session will return
mock_loaded_state = MagicMock(spec=State)
mock_restore_from_session.return_value = mock_loaded_state
# AgentController.__init__ should not return a value (it's __init__)
mock_agent_controller_init.return_value = None
# To make run_session exit cleanly after one loop
mock_read_prompt_input.return_value = '/exit'
mock_handle_commands.return_value = (
True,
False,
False,
) # close_repl, reload_microagents, new_session_requested
# Mock other functions called by run_session to avoid side effects
mock_initialize_repo.return_value = '/mocked/repo/dir'
mock_create_memory.return_value = AsyncMock() # Memory instance
await cli.run_session(
loop,
mock_config,
mock_settings_store, # This is FileSettingsStore, not directly used for restore in this path
'/test/dir',
task_content=None,
session_name=test_session_name,
)
mock_generate_sid.assert_called_once_with(mock_config, test_session_name)
# State.restore_from_session is called from within core.setup.create_controller,
# which receives the runtime object (and thus its event_stream with sid and file_store).
mock_restore_from_session.assert_called_once_with(
expected_sid, mock_runtime.event_stream.file_store
)
# Check that AgentController was initialized with the loaded state
mock_agent_controller_init.assert_called_once()
args, kwargs = mock_agent_controller_init.call_args
assert kwargs.get('initial_state') == mock_loaded_state
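With this many stacked `@patch` decorators, the order of the mock parameters matters: decorators are applied bottom-up, so the patch closest to the function supplies the first argument. A small sketch using two standard-library functions (chosen only for illustration) shows the mapping.

```python
import os
from unittest.mock import patch


@patch('os.getcwd')  # Outermost decorator: supplies the last mock argument.
@patch('os.getpid')  # Innermost decorator: supplies the first mock argument.
def check(mock_getpid, mock_getcwd):
    mock_getpid.return_value = 1234
    mock_getcwd.return_value = '/mocked'
    return os.getpid(), os.getcwd()


result = check()
```

The same rule explains why `mock_display_initial_user_prompt` comes first in the signature above while `mock_generate_sid` comes last among the patched arguments, followed by the pytest fixtures.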
@pytest.mark.asyncio
@patch('openhands.cli.main.parse_arguments')
@patch('openhands.cli.main.setup_config_from_args')
+45 -20
@@ -21,6 +21,7 @@ from openhands.server.data_models.conversation_info_result_set import (
)
from openhands.server.routes.manage_conversations import (
InitSessionRequest,
InitSessionResponse,
delete_conversation,
get_conversation,
new_conversation,
@@ -76,10 +77,15 @@ def test_client():
def create_new_test_conversation(
test_request: InitSessionRequest, auth_type: AuthType | None = None
):
# Create a mock UserSecrets object with the required custom_secrets attribute
mock_user_secrets = MagicMock()
mock_user_secrets.custom_secrets = MappingProxyType({})
return new_conversation(
data=test_request,
user_id='test_user',
provider_tokens=MappingProxyType({'github': 'token123'}),
user_secrets=mock_user_secrets,
auth_type=auth_type,
)
@@ -112,8 +118,12 @@ async def test_search_conversations():
async def mock_get_connections(*args, **kwargs):
return {}
async def get_agent_loop_info(*args, **kwargs):
return []
mock_manager.get_running_agent_loops = mock_get_running_agent_loops
mock_manager.get_connections = mock_get_connections
mock_manager.get_agent_loop_info = get_agent_loop_info
with patch(
'openhands.server.routes.manage_conversations.datetime'
) as mock_datetime:
@@ -165,6 +175,7 @@ async def test_search_conversations():
status=ConversationStatus.STOPPED,
selected_repository='foobar',
num_connections=0,
url=None,
)
]
)
@@ -193,6 +204,7 @@ async def test_get_conversation():
) as mock_manager:
mock_manager.is_agent_loop_running = AsyncMock(return_value=False)
mock_manager.get_connections = AsyncMock(return_value={})
mock_manager.get_agent_loop_info = AsyncMock(return_value=[])
conversation = await get_conversation(
'some_conversation_id', conversation_store=mock_store
@@ -206,6 +218,7 @@ async def test_get_conversation():
status=ConversationStatus.STOPPED,
selected_repository='foobar',
num_connections=0,
url=None,
)
assert conversation == expected
@@ -234,7 +247,11 @@ async def test_new_conversation_success(provider_handler_mock):
'openhands.server.routes.manage_conversations._create_new_conversation'
) as mock_create_conversation:
# Set up the mock to return a conversation ID
-mock_create_conversation.return_value = 'test_conversation_id'
+mock_create_conversation.return_value = MagicMock(
+conversation_id='test_conversation_id',
+url='https://my-conversation.com',
+api_key=None,
+)
test_request = InitSessionRequest(
repository='test/repo',
@@ -247,12 +264,10 @@ async def test_new_conversation_success(provider_handler_mock):
response = await create_new_test_conversation(test_request)
# Verify the response
-assert isinstance(response, JSONResponse)
-assert response.status_code == 200
-assert (
-response.body.decode('utf-8')
-== '{"status":"ok","conversation_id":"test_conversation_id"}'
-)
+assert isinstance(response, InitSessionResponse)
+assert response.status == 'ok'
+assert response.conversation_id == 'test_conversation_id'
+assert response.conversation_url == 'https://my-conversation.com'
# Verify that _create_new_conversation was called with the correct arguments
mock_create_conversation.assert_called_once()
@@ -274,7 +289,11 @@ async def test_new_conversation_with_suggested_task(provider_handler_mock):
'openhands.server.routes.manage_conversations._create_new_conversation'
) as mock_create_conversation:
# Set up the mock to return a conversation ID
-mock_create_conversation.return_value = 'test_conversation_id'
+mock_create_conversation.return_value = MagicMock(
+conversation_id='test_conversation_id',
+url='https://my-conversation.com',
+api_key=None,
+)
# Mock SuggestedTask.get_prompt_for_task
with patch(
@@ -302,12 +321,10 @@ async def test_new_conversation_with_suggested_task(provider_handler_mock):
response = await create_new_test_conversation(test_request)
# Verify the response
-assert isinstance(response, JSONResponse)
-assert response.status_code == 200
-assert (
-response.body.decode('utf-8')
-== '{"status":"ok","conversation_id":"test_conversation_id"}'
-)
+assert isinstance(response, InitSessionResponse)
+assert response.status == 'ok'
+assert response.conversation_id == 'test_conversation_id'
+assert response.conversation_url == 'https://my-conversation.com'
# Verify that _create_new_conversation was called with the correct arguments
mock_create_conversation.assert_called_once()
@@ -457,7 +474,11 @@ async def test_new_conversation_with_bearer_auth(provider_handler_mock):
'openhands.server.routes.manage_conversations._create_new_conversation'
) as mock_create_conversation:
# Set up the mock to return a conversation ID
-mock_create_conversation.return_value = 'test_conversation_id'
+mock_create_conversation.return_value = MagicMock(
+conversation_id='test_conversation_id',
+url='https://my-conversation.com',
+api_key=None,
+)
# Create the request object
test_request = InitSessionRequest(
@@ -470,8 +491,8 @@ async def test_new_conversation_with_bearer_auth(provider_handler_mock):
response = await create_new_test_conversation(test_request, AuthType.BEARER)
# Verify the response
-assert isinstance(response, JSONResponse)
-assert response.status_code == 200
+assert isinstance(response, InitSessionResponse)
+assert response.status == 'ok'
# Verify that _create_new_conversation was called with REMOTE_API_KEY trigger
mock_create_conversation.assert_called_once()
@@ -490,7 +511,11 @@ async def test_new_conversation_with_null_repository():
'openhands.server.routes.manage_conversations._create_new_conversation'
) as mock_create_conversation:
# Set up the mock to return a conversation ID
-mock_create_conversation.return_value = 'test_conversation_id'
+mock_create_conversation.return_value = MagicMock(
+conversation_id='test_conversation_id',
+url='https://my-conversation.com',
+api_key=None,
+)
# Create the request object with null repository
test_request = InitSessionRequest(
@@ -503,8 +528,8 @@ async def test_new_conversation_with_null_repository():
response = await create_new_test_conversation(test_request)
# Verify the response
-assert isinstance(response, JSONResponse)
-assert response.status_code == 200
+assert isinstance(response, InitSessionResponse)
+assert response.status == 'ok'
# Verify that _create_new_conversation was called with None repository
mock_create_conversation.assert_called_once()
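The assertion changes in this file replace a comparison against a serialized JSON body with checks on the fields of a typed response object. A minimal sketch of why the field-level form is less brittle, assuming a hypothetical dataclass `InitSessionResponseSketch` that stands in for the real `InitSessionResponse` (its actual definition is not shown in this diff, and `build_response` is likewise an illustrative helper):

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class InitSessionResponseSketch:
    """Hypothetical stand-in; field names are taken from the test assertions."""

    status: str
    conversation_id: str
    conversation_url: Optional[str] = None


def build_response(conversation_id: str, url: Optional[str]) -> InitSessionResponseSketch:
    # Hypothetical helper mirroring what the tests expect the route to return.
    return InitSessionResponseSketch(
        status='ok', conversation_id=conversation_id, conversation_url=url
    )


response = build_response('test_conversation_id', 'https://my-conversation.com')

# Field-level assertions keep passing when optional fields are added later.
assert response.status == 'ok'
assert response.conversation_id == 'test_conversation_id'

# The old exact-string comparison breaks as soon as the payload gains a key.
old_expected = '{"status":"ok","conversation_id":"test_conversation_id"}'
serialized = json.dumps(asdict(response), separators=(',', ':'))
assert serialized != old_expected
```

Checking individual fields also lets tests such as the bearer-auth and null-repository cases assert only on `status` without caring about the URL.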
+141
@@ -441,6 +441,147 @@ def test_cache_page_performance(temp_dir: str):
# In real-world scenarios with many more events, the performance difference would be more significant.
def test_search_events_limit(temp_dir: str):
"""Test that the search_events method correctly applies the limit parameter."""
file_store = get_file_store('local', temp_dir)
event_stream = EventStream('abc', file_store)
# Add 10 events
for i in range(10):
event_stream.add_event(NullObservation(f'test{i}'), EventSource.AGENT)
# Test with no limit (should return all events)
events = list(event_stream.search_events())
assert len(events) == 10
# Test with limit=5 (should return first 5 events)
events = list(event_stream.search_events(limit=5))
assert len(events) == 5
assert all(isinstance(e, NullObservation) for e in events)
assert [e.content for e in events] == ['test0', 'test1', 'test2', 'test3', 'test4']
# Test with limit=3 and start_id=5 (should return 3 events starting from ID 5)
events = list(event_stream.search_events(start_id=5, limit=3))
assert len(events) == 3
assert [e.content for e in events] == ['test5', 'test6', 'test7']
# Test with limit and reverse=True (should return events in reverse order)
events = list(event_stream.search_events(reverse=True, limit=4))
assert len(events) == 4
assert [e.content for e in events] == ['test9', 'test8', 'test7', 'test6']
# Test with limit and filter (should apply limit after filtering)
# Add some events with different content for filtering
event_stream.add_event(NullObservation('filter_me'), EventSource.AGENT)
event_stream.add_event(NullObservation('filter_me_too'), EventSource.AGENT)
events = list(
event_stream.search_events(filter=EventFilter(query='filter'), limit=1)
)
assert len(events) == 1
assert events[0].content == 'filter_me'
def test_search_events_limit_with_complex_filters(temp_dir: str):
"""Test the interaction between limit and various filter combinations in search_events."""
file_store = get_file_store('local', temp_dir)
event_stream = EventStream('abc', file_store)
# Add events with different sources and types
event_stream.add_event(NullAction(), EventSource.AGENT) # id 0
event_stream.add_event(NullObservation('test1'), EventSource.AGENT) # id 1
event_stream.add_event(MessageAction(content='hello'), EventSource.USER) # id 2
event_stream.add_event(NullObservation('test2'), EventSource.ENVIRONMENT) # id 3
event_stream.add_event(NullAction(), EventSource.AGENT) # id 4
event_stream.add_event(MessageAction(content='world'), EventSource.USER) # id 5
event_stream.add_event(NullObservation('hello world'), EventSource.AGENT) # id 6
# Test limit with type filter
events = list(
event_stream.search_events(
filter=EventFilter(include_types=(NullAction,)), limit=1
)
)
assert len(events) == 1
assert isinstance(events[0], NullAction)
assert events[0].id == 0
# Test limit with source filter
events = list(
event_stream.search_events(filter=EventFilter(source='user'), limit=1)
)
assert len(events) == 1
assert events[0].source == EventSource.USER
assert events[0].id == 2
# Test limit with query filter
events = list(
event_stream.search_events(filter=EventFilter(query='hello'), limit=2)
)
assert len(events) == 2
assert [e.id for e in events] == [2, 6]
# Test limit with combined filters
events = list(
event_stream.search_events(
filter=EventFilter(source='agent', include_types=(NullObservation,)),
limit=1,
)
)
assert len(events) == 1
assert isinstance(events[0], NullObservation)
assert events[0].source == EventSource.AGENT
assert events[0].id == 1
# Test limit with reverse and filter
events = list(
event_stream.search_events(
filter=EventFilter(source='agent'), reverse=True, limit=2
)
)
assert len(events) == 2
assert [e.id for e in events] == [6, 4]
def test_search_events_limit_edge_cases(temp_dir: str):
"""Test edge cases for the limit parameter in search_events."""
file_store = get_file_store('local', temp_dir)
event_stream = EventStream('abc', file_store)
# Add some events
for i in range(5):
event_stream.add_event(NullObservation(f'test{i}'), EventSource.AGENT)
# Test with limit=None (should return all events)
events = list(event_stream.search_events(limit=None))
assert len(events) == 5
# Test with limit larger than number of events
events = list(event_stream.search_events(limit=10))
assert len(events) == 5
# Test with limit=0: the behavior is implementation-defined, so accept either outcome
events = list(event_stream.search_events(limit=0))
# 0 events if limit=0 is treated as a literal limit; 5 events if it is treated as "no limit"
assert len(events) in [0, 5]
# Test with negative limit (implementation returns only first event)
events = list(event_stream.search_events(limit=-1))
assert len(events) == 1
# Test with empty result set and limit
events = list(
event_stream.search_events(filter=EventFilter(query='nonexistent'), limit=5)
)
assert len(events) == 0
# Test with start_id beyond available events
events = list(event_stream.search_events(start_id=10, limit=5))
assert len(events) == 0
def test_callback_dictionary_modification(temp_dir: str):
"""Test that the event stream can handle dictionary modification during iteration.
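The edge-case assertions in the `search_events` tests above (`limit=0` returning either nothing or everything, `limit=-1` returning exactly one event, and the limit being applied after filtering) are consistent with a truthiness check on `limit` combined with a count comparison after each yielded event. A minimal sketch under that assumption; `search` is a hypothetical stand-in operating on plain strings, not the `EventStream.search_events` implementation:

```python
from typing import Callable, Iterable, Iterator, Optional


def search(
    events: Iterable[str],
    predicate: Callable[[str], bool] = lambda _e: True,
    reverse: bool = False,
    limit: Optional[int] = None,
) -> Iterator[str]:
    """Yield matching events, stopping once `limit` matches have been produced."""
    ordered = reversed(list(events)) if reverse else events
    count = 0
    for event in ordered:
        if not predicate(event):
            continue  # filtered-out events do not count toward the limit
        yield event
        count += 1
        # Truthiness check: limit=None and limit=0 both mean "no limit" here,
        # while a negative limit stops right after the first match.
        if limit and count >= limit:
            return


events = [f'test{i}' for i in range(5)]
first_two = list(search(events, limit=2))
last_two = list(search(events, reverse=True, limit=2))
filtered = list(search(events, predicate=lambda e: e.endswith(('3', '4')), limit=1))
```

Under this reading, `limit=0` falls into the "all events" branch that the test tolerates; an implementation that compared `limit is not None` instead would land in the other branch.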
-18
@@ -10,7 +10,6 @@ from openhands.agenthub.codeact_agent.function_calling import response_to_action
from openhands.core.exceptions import FunctionCallValidationError
from openhands.events.action import (
BrowseInteractiveAction,
BrowseURLAction,
CmdRunAction,
FileEditAction,
FileReadAction,
@@ -189,23 +188,6 @@ def test_browser_missing_code():
assert 'Missing required argument "code"' in str(exc_info.value)
def test_web_read_valid():
"""Test web_read with valid arguments."""
response = create_mock_response('web_read', {'url': 'https://example.com'})
actions = response_to_actions(response)
assert len(actions) == 1
assert isinstance(actions[0], BrowseURLAction)
assert actions[0].url == 'https://example.com'
def test_web_read_missing_url():
"""Test web_read with missing url argument."""
response = create_mock_response('web_read', {})
with pytest.raises(FunctionCallValidationError) as exc_info:
response_to_actions(response)
assert 'Missing required argument "url"' in str(exc_info.value)
def test_invalid_json_arguments():
"""Test handling of invalid JSON in arguments."""
response = ModelResponse(
+261 -1
@@ -9,12 +9,14 @@ from litellm import ChatCompletionToolParam
from openhands.llm.fn_call_converter import (
IN_CONTEXT_LEARNING_EXAMPLE_PREFIX,
IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX,
TOOL_EXAMPLES,
FunctionCallConversionError,
convert_fncall_messages_to_non_fncall_messages,
convert_from_multiple_tool_calls_to_single_tool_call_messages,
convert_non_fncall_messages_to_fncall_messages,
convert_tool_call_to_string,
convert_tools_to_description,
get_example_for_tools,
)
FNCALL_TOOLS: list[ChatCompletionToolParam] = [
@@ -140,6 +142,264 @@ Allowed values: [`view`, `create`, `str_replace`, `insert`, `undo_edit`]
)
def test_get_example_for_tools_no_tools():
"""Test that get_example_for_tools returns empty string when no tools are available."""
tools = []
example = get_example_for_tools(tools)
assert example == ''
def test_get_example_for_tools_single_tool():
"""Test that get_example_for_tools generates correct example with a single tool."""
tools = [
{
'type': 'function',
'function': {
'name': 'execute_bash',
'description': 'Execute a bash command in the terminal.',
'parameters': {
'type': 'object',
'properties': {
'command': {
'type': 'string',
'description': 'The bash command to execute.',
}
},
'required': ['command'],
},
},
}
]
example = get_example_for_tools(tools)
assert example.startswith(
"Here's a running example of how to perform a task with the provided tools."
)
assert (
'USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.'
in example
)
assert TOOL_EXAMPLES['execute_bash']['check_dir'] in example
assert TOOL_EXAMPLES['execute_bash']['run_server'] in example
assert TOOL_EXAMPLES['execute_bash']['kill_server'] in example
assert TOOL_EXAMPLES['str_replace_editor']['create_file'] not in example
assert TOOL_EXAMPLES['browser']['view_page'] not in example
assert TOOL_EXAMPLES['finish']['task_completed'] not in example
def test_get_example_for_tools_single_tool_is_finish():
"""Test get_example_for_tools with only the finish tool."""
tools = [
{
'type': 'function',
'function': {
'name': 'finish',
'description': 'Finish the interaction when the task is complete.',
},
}
]
example = get_example_for_tools(tools)
assert example.startswith(
"Here's a running example of how to perform a task with the provided tools."
)
assert (
'USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.'
in example
)
assert TOOL_EXAMPLES['finish']['task_completed'] in example
assert TOOL_EXAMPLES['execute_bash']['check_dir'] not in example
assert TOOL_EXAMPLES['str_replace_editor']['create_file'] not in example
assert TOOL_EXAMPLES['browser']['view_page'] not in example
def test_get_example_for_tools_multiple_tools():
"""Test that get_example_for_tools generates correct example with multiple tools."""
tools = [
{
'type': 'function',
'function': {
'name': 'execute_bash',
'description': 'Execute a bash command in the terminal.',
'parameters': {
'type': 'object',
'properties': {
'command': {
'type': 'string',
'description': 'The bash command to execute.',
}
},
'required': ['command'],
},
},
},
{
'type': 'function',
'function': {
'name': 'str_replace_editor',
'description': 'Custom editing tool for viewing, creating and editing files.',
'parameters': {
'type': 'object',
'properties': {
'command': {
'type': 'string',
'description': 'The commands to run.',
'enum': [
'view',
'create',
'str_replace',
'insert',
'undo_edit',
],
},
'path': {
'type': 'string',
'description': 'Absolute path to file or directory.',
},
},
'required': ['command', 'path'],
},
},
},
]
example = get_example_for_tools(tools)
assert example.startswith(
"Here's a running example of how to perform a task with the provided tools."
)
assert (
'USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.'
in example
)
assert TOOL_EXAMPLES['execute_bash']['check_dir'] in example
assert TOOL_EXAMPLES['execute_bash']['run_server'] in example
assert TOOL_EXAMPLES['execute_bash']['kill_server'] in example
assert TOOL_EXAMPLES['str_replace_editor']['create_file'] in example
assert TOOL_EXAMPLES['str_replace_editor']['edit_file'] in example
assert TOOL_EXAMPLES['browser']['view_page'] not in example
assert TOOL_EXAMPLES['finish']['task_completed'] not in example
def test_get_example_for_tools_multiple_tools_with_finish():
"""Test get_example_for_tools with multiple tools including finish."""
# Uses execute_bash, str_replace_editor, browser, and finish tools
tools = [
{
'type': 'function',
'function': {
'name': 'execute_bash',
'description': 'Execute a bash command in the terminal.',
'parameters': { # Params added for completeness, not strictly needed by get_example_for_tools
'type': 'object',
'properties': {
'command': {
'type': 'string',
'description': 'The bash command to execute.',
}
},
'required': ['command'],
},
},
},
{
'type': 'function',
'function': {
'name': 'str_replace_editor',
'description': 'Custom editing tool for viewing, creating and editing files.',
'parameters': {
'type': 'object',
'properties': {
'command': {
'type': 'string',
'description': 'The commands to run.',
'enum': [
'view',
'create',
'str_replace',
'insert',
'undo_edit',
],
},
'path': {
'type': 'string',
'description': 'Absolute path to file or directory.',
},
},
'required': ['command', 'path'],
},
},
},
{
'type': 'function',
'function': {
'name': 'browser',
'description': 'Interact with the browser.',
'parameters': {
'type': 'object',
'properties': {
'code': {
'type': 'string',
'description': 'The Python code that interacts with the browser.',
}
},
'required': ['code'],
},
},
},
{
'type': 'function',
'function': {
'name': 'finish',
'description': 'Finish the interaction.',
},
},
]
example = get_example_for_tools(tools)
assert example.startswith(
"Here's a running example of how to perform a task with the provided tools."
)
assert (
'USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.'
in example
)
# Check for execute_bash parts (order matters for get_example_for_tools)
assert TOOL_EXAMPLES['execute_bash']['check_dir'].strip() in example
assert TOOL_EXAMPLES['execute_bash']['run_server'].strip() in example
assert TOOL_EXAMPLES['execute_bash']['kill_server'].strip() in example
assert TOOL_EXAMPLES['execute_bash']['run_server_again'].strip() in example
# Check for str_replace_editor parts
assert TOOL_EXAMPLES['str_replace_editor']['create_file'] in example
assert TOOL_EXAMPLES['str_replace_editor']['edit_file'] in example
# Check for browser part
assert TOOL_EXAMPLES['browser']['view_page'] in example
# Check for finish part
assert TOOL_EXAMPLES['finish']['task_completed'] in example
def test_get_example_for_tools_all_tools():
"""Test that get_example_for_tools generates correct example with all tools."""
tools = FNCALL_TOOLS # FNCALL_TOOLS already includes 'finish'
example = get_example_for_tools(tools)
assert example.startswith(
"Here's a running example of how to perform a task with the provided tools."
)
assert (
'USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.'
in example
)
assert TOOL_EXAMPLES['execute_bash']['check_dir'] in example
assert TOOL_EXAMPLES['execute_bash']['run_server'] in example
assert TOOL_EXAMPLES['execute_bash']['kill_server'] in example
assert TOOL_EXAMPLES['str_replace_editor']['create_file'] in example
assert TOOL_EXAMPLES['str_replace_editor']['edit_file'] in example
assert TOOL_EXAMPLES['finish']['task_completed'] in example
# These are not in global FNCALL_TOOLS
# assert TOOL_EXAMPLES['web_read']['read_docs'] not in example # web_read is removed
assert TOOL_EXAMPLES['browser']['view_page'] not in example
FNCALL_MESSAGES = [
{
'content': [
@@ -270,7 +530,7 @@ NON_FNCALL_MESSAGES = [
'content': [
{
'type': 'text',
-'text': IN_CONTEXT_LEARNING_EXAMPLE_PREFIX
+'text': IN_CONTEXT_LEARNING_EXAMPLE_PREFIX(FNCALL_TOOLS)
+ "<uploaded_files>\n/workspace/astropy__astropy__5.1\n</uploaded_files>\nI've uploaded a python code repository in the directory astropy__astropy__5.1. LONG DESCRIPTION:\n\n"
+ IN_CONTEXT_LEARNING_EXAMPLE_SUFFIX,
}
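The `get_example_for_tools` tests in this file pin down two behaviors: an empty tool list yields an empty string, and only the snippets belonging to tools that are actually present appear in the generated example. A minimal sketch of that selection logic, assuming a hypothetical `SNIPPETS` table and `build_example` function; the real `TOOL_EXAMPLES` contents and the order in which the converter interleaves them are not shown in this diff:

```python
PREFIX = "Here's a running example of how to perform a task with the provided tools.\n"

# Hypothetical snippet table; the real TOOL_EXAMPLES holds full transcripts.
SNIPPETS: dict[str, list[str]] = {
    'execute_bash': ['<bash: check_dir>', '<bash: run_server>'],
    'str_replace_editor': ['<editor: create_file>'],
    'finish': ['<finish: task_completed>'],
}


def build_example(tools: list[dict]) -> str:
    """Return '' when no listed tool has snippets, else the prefix plus those snippets."""
    available = {
        tool['function']['name'] for tool in tools if tool.get('type') == 'function'
    }
    parts = [
        snippet
        for name, snippets in SNIPPETS.items()
        if name in available
        for snippet in snippets
    ]
    if not parts:
        return ''
    return PREFIX + '\n'.join(parts)


bash_only = build_example([{'type': 'function', 'function': {'name': 'execute_bash'}}])
```

Building the example from the available tools, rather than from a fixed string, is what allows the prefix to become a function of the tool list, as the `IN_CONTEXT_LEARNING_EXAMPLE_PREFIX(FNCALL_TOOLS)` change in this file suggests.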
+164 -6
@@ -17,6 +17,7 @@ from openhands.events.observation.agent import (
RecallObservation,
RecallType,
)
from openhands.events.serialization.observation import observation_from_dict
from openhands.events.stream import EventStream
from openhands.llm import LLM
from openhands.llm.metrics import Metrics
@@ -25,6 +26,7 @@ from openhands.runtime.impl.action_execution.action_execution_client import (
ActionExecutionClient,
)
from openhands.storage.memory import InMemoryFileStore
from openhands.utils.prompt import PromptManager, RepositoryInfo, RuntimeInfo
@pytest.fixture
@@ -189,11 +191,23 @@ async def test_memory_with_microagents():
assert isinstance(observation, RecallObservation)
assert source == EventSource.ENVIRONMENT
assert observation.recall_type == RecallType.KNOWLEDGE
# We should have at least one microagent: flarglebargle (triggered by keyword)
# Note: The default-tools microagent might not be loaded in tests
assert len(observation.microagent_knowledge) == 1
# Find the flarglebargle microagent in the list
flarglebargle_knowledge = None
for knowledge in observation.microagent_knowledge:
if knowledge.name == derived_name:
flarglebargle_knowledge = knowledge
break
# Check against the derived name
-assert observation.microagent_knowledge[0].name == derived_name
-assert observation.microagent_knowledge[0].trigger == 'flarglebargle'
-assert 'magic word' in observation.microagent_knowledge[0].content
+assert flarglebargle_knowledge is not None
+assert flarglebargle_knowledge.name == derived_name
+assert flarglebargle_knowledge.trigger == 'flarglebargle'
+assert 'magic word' in flarglebargle_knowledge.content
def test_memory_repository_info(prompt_dir, file_store):
@@ -319,11 +333,155 @@ async def test_memory_with_agent_microagents():
assert isinstance(observation, RecallObservation)
assert source == EventSource.ENVIRONMENT
assert observation.recall_type == RecallType.KNOWLEDGE
# We should have at least one microagent: flarglebargle (triggered by keyword)
# Note: The default-tools microagent might not be loaded in tests
assert len(observation.microagent_knowledge) == 1
# Find the flarglebargle microagent in the list
flarglebargle_knowledge = None
for knowledge in observation.microagent_knowledge:
if knowledge.name == derived_name:
flarglebargle_knowledge = knowledge
break
# Check against the derived name
-assert observation.microagent_knowledge[0].name == derived_name
-assert observation.microagent_knowledge[0].trigger == 'flarglebargle'
-assert 'magic word' in observation.microagent_knowledge[0].content
+assert flarglebargle_knowledge is not None
+assert flarglebargle_knowledge.name == derived_name
+assert flarglebargle_knowledge.trigger == 'flarglebargle'
+assert 'magic word' in flarglebargle_knowledge.content
@pytest.mark.asyncio
async def test_custom_secrets_descriptions():
"""Test that custom_secrets_descriptions are properly stored in memory and included in RecallObservation."""
# Create a mock event stream
event_stream = MagicMock(spec=EventStream)
# Initialize Memory
memory = Memory(
event_stream=event_stream,
sid='test-session',
)
# Create a mock runtime with custom secrets descriptions
mock_runtime = MagicMock()
mock_runtime.web_hosts = {'test-host.example.com': 8080}
mock_runtime.additional_agent_instructions = 'Test instructions'
# Define custom secrets descriptions
custom_secrets = {
'API_KEY': 'API key for external service',
'DATABASE_URL': 'Connection string for the database',
'SECRET_TOKEN': 'Authentication token for secure operations',
}
# Set runtime info with custom secrets
memory.set_runtime_info(mock_runtime, custom_secrets)
# Set repository info
memory.set_repository_info('test-owner/test-repo', '/workspace/test-repo')
# Create a workspace context recall action
recall_action = RecallAction(
query='Initial message', recall_type=RecallType.WORKSPACE_CONTEXT
)
recall_action._source = EventSource.USER # type: ignore[attr-defined]
# Mock the event_stream.add_event method
added_events = []
def mock_add_event(event, source):
added_events.append((event, source))
event_stream.add_event = mock_add_event
# Process the recall action
await memory._on_event(recall_action)
# Verify a RecallObservation was added to the event stream
assert len(added_events) == 1
observation, source = added_events[0]
# Verify the observation is a RecallObservation
assert isinstance(observation, RecallObservation)
assert source == EventSource.ENVIRONMENT
assert observation.recall_type == RecallType.WORKSPACE_CONTEXT
# Verify custom_secrets_descriptions are included in the observation
assert observation.custom_secrets_descriptions == custom_secrets
# Verify repository info is included
assert observation.repo_name == 'test-owner/test-repo'
assert observation.repo_directory == '/workspace/test-repo'
# Verify runtime info is included
assert observation.runtime_hosts == {'test-host.example.com': 8080}
assert observation.additional_agent_instructions == 'Test instructions'
def test_custom_secrets_descriptions_serialization(prompt_dir):
"""Test that custom_secrets_descriptions are properly serialized in the message for the LLM."""
# Create a PromptManager with the test prompt directory
prompt_manager = PromptManager(prompt_dir)
# Create a RuntimeInfo with custom_secrets_descriptions
custom_secrets = {
'API_KEY': 'API key for external service',
'DATABASE_URL': 'Connection string for the database',
'SECRET_TOKEN': 'Authentication token for secure operations',
}
runtime_info = RuntimeInfo(
date='2025-05-15',
available_hosts={'test-host.example.com': 8080},
additional_agent_instructions='Test instructions',
custom_secrets_descriptions=custom_secrets,
)
# Create a RepositoryInfo
repository_info = RepositoryInfo(
repo_name='test-owner/test-repo', repo_directory='/workspace/test-repo'
)
# Build the workspace context message
workspace_context = prompt_manager.build_workspace_context(
repository_info=repository_info,
runtime_info=runtime_info,
repo_instructions='Test repository instructions',
)
# Verify that the workspace context includes the custom_secrets_descriptions
assert '<CUSTOM_SECRETS>' in workspace_context
for secret_name, secret_description in custom_secrets.items():
assert f'$**{secret_name}**' in workspace_context
assert secret_description in workspace_context
def test_serialization_deserialization_with_custom_secrets():
"""Test that RecallObservation can be serialized and deserialized with custom_secrets_descriptions."""
# This simulates an older version of the RecallObservation
legacy_observation = {
'message': 'Added workspace context',
'observation': 'recall',
'content': 'Test content',
'extras': {
'recall_type': 'workspace_context',
'repo_name': 'test-owner/test-repo',
'repo_directory': '/workspace/test-repo',
'repo_instructions': 'Test repository instructions',
'runtime_hosts': {'test-host.example.com': 8080},
'additional_agent_instructions': 'Test instructions',
'date': '2025-05-15',
'microagent_knowledge': [], # Intentionally omitting custom_secrets_descriptions
},
}
legacy_observation = observation_from_dict(legacy_observation)
# Verify that the observation was created successfully
assert legacy_observation.recall_type == RecallType.WORKSPACE_CONTEXT
assert legacy_observation.repo_name == 'test-owner/test-repo'
assert legacy_observation.repo_directory == '/workspace/test-repo'
def test_memory_multiple_repo_microagents(prompt_dir, file_store):
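The legacy-payload test above depends on deserialization tolerating a missing `custom_secrets_descriptions` key. A minimal sketch of one way to get that tolerance, assuming a hypothetical `RecallExtras` dataclass with a defaulted field and a hypothetical `extras_from_dict` loader; this is not the `observation_from_dict` implementation:

```python
from dataclasses import dataclass, field, fields
from typing import Any


@dataclass
class RecallExtras:
    """Hypothetical subset of the observation's extras."""

    repo_name: str = ''
    repo_directory: str = ''
    # Newer field: the default keeps payloads written before it existed loadable.
    custom_secrets_descriptions: dict[str, str] = field(default_factory=dict)


def extras_from_dict(data: dict[str, Any]) -> RecallExtras:
    # Keep only keys the dataclass knows about; absent keys fall back to defaults.
    known = {f.name for f in fields(RecallExtras)}
    return RecallExtras(**{k: v for k, v in data.items() if k in known})


legacy = extras_from_dict(
    {'repo_name': 'test-owner/test-repo', 'repo_directory': '/workspace/test-repo'}
)
```

Using `default_factory=dict` rather than a shared `{}` default means each loaded object gets its own empty mapping.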
+34
@@ -166,3 +166,37 @@ Testing loading with trailing slashes.
assert isinstance(agent_t, KnowledgeMicroagent)
assert agent_t.type == MicroagentType.KNOWLEDGE # Check inferred type
assert 'trailing' in agent_t.triggers
def test_invalid_microagent_type(temp_microagents_dir):
"""Test loading a microagent with an invalid type."""
# Create a microagent with an invalid type
invalid_agent = """---
name: invalid_type_agent
type: task
version: 1.0.0
agent: CodeActAgent
triggers:
- test
---
# Invalid Type Test
This microagent has an invalid type.
"""
invalid_file = temp_microagents_dir / 'invalid_type.md'
invalid_file.write_text(invalid_agent)
# Attempt to load the microagent should raise a MicroagentValidationError
from openhands.core.exceptions import MicroagentValidationError
with pytest.raises(MicroagentValidationError) as excinfo:
load_microagents_from_dir(temp_microagents_dir)
# Check that the error message contains helpful information
error_msg = str(excinfo.value)
assert 'invalid_type.md' in error_msg
assert 'Invalid "type" value: "task"' in error_msg
assert 'Valid types are:' in error_msg
assert '"knowledge"' in error_msg
assert '"repo"' in error_msg
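The assertions above describe the shape of the expected error message: it names the file, quotes the offending value, and lists the valid types. A minimal sketch that produces a message of that shape, assuming a hypothetical `validate_type` helper and a plain `ValueError` in place of `MicroagentValidationError`; the wording outside the asserted fragments is invented for illustration:

```python
VALID_TYPES = ('knowledge', 'repo')  # assumed from the test's assertions


def validate_type(path: str, type_value: str) -> str:
    """Return the type unchanged, or raise with a message naming the file and valid types."""
    if type_value not in VALID_TYPES:
        valid = ', '.join(f'"{t}"' for t in VALID_TYPES)
        raise ValueError(
            f'Error validating microagent {path}: '
            f'Invalid "type" value: "{type_value}". Valid types are: {valid}'
        )
    return type_value


try:
    validate_type('invalid_type.md', 'task')
    error_msg = ''
except ValueError as exc:
    error_msg = str(exc)
```

Including the file name in the message matters when a whole directory is loaded at once, since the caller otherwise cannot tell which file failed.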
@@ -245,6 +245,7 @@ def test_microagent_observation_serialization():
'runtime_hosts': {'host1': 8080, 'host2': 8081},
'repo_instructions': 'complex_repo_instructions',
'additional_agent_instructions': 'You know it all about this runtime',
'custom_secrets_descriptions': {'SECRET': 'CUSTOM'},
'date': '04/12/1023',
'microagent_knowledge': [],
},
@@ -264,6 +265,7 @@ def test_microagent_observation_microagent_knowledge_serialization():
'repo_instructions': '',
'runtime_hosts': {},
'additional_agent_instructions': '',
'custom_secrets_descriptions': {},
'date': '',
'microagent_knowledge': [
{
@@ -1,72 +0,0 @@
#!/usr/bin/env python3
"""Test script for the playwright_patchright_util module."""
import logging
import sys
from openhands.utils.playwright_patchright_util import use_patchright
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def test_patchright_import():
"""Test that patchright can be imported and used as a replacement for playwright."""
# Use patchright as a replacement for playwright
use_patchright()
# Now import playwright - this must be imported after use_patchright() is called
import playwright.sync_api # noqa: F401
# Check that the import worked
assert 'playwright.sync_api' in sys.modules
# Check that the actual module is patchright
playwright_modules = [
name
for name in sys.modules.keys()
if name == 'playwright' or name.startswith('playwright.')
]
assert len(playwright_modules) > 0
# Check that patchright modules are loaded
patchright_modules = [
name
for name in sys.modules.keys()
if name == 'patchright' or name.startswith('patchright.')
]
assert len(patchright_modules) > 0
def test_patchright_functionality():
"""Test that patchright functionality works through the playwright import."""
# Use patchright as a replacement for playwright
use_patchright()
# Import playwright - this must be imported after use_patchright() is called
import playwright
from playwright.sync_api import sync_playwright
# print the actual package name and file
print(f'Actual playwright package name: {playwright.__name__}')
print(f'Actual playwright package file: {playwright.__file__}')
assert 'patchright' in playwright.__file__
# Use playwright (which is actually patchright)
with sync_playwright() as p:
# Launch a browser
browser = p.chromium.launch(headless=True)
# Create a new page
page = browser.new_page()
# Navigate to a URL
page.goto('https://example.com')
# Check that we can get the title
title = page.title()
assert 'Example' in title
# Close the browser
browser.close()