Compare commits

24 Commits

Author SHA1 Message Date
Engel Nyst 70ad153c2c Add comprehensive unit tests for Ctrl+C behavior
- test_single_ctrl_c_stops_agent: Verifies first Ctrl+C stops agent gracefully with helpful message
- test_double_ctrl_c_raises_keyboard_interrupt: Verifies second Ctrl+C within 2 seconds raises KeyboardInterrupt for CLI cleanup
- test_ctrl_p_pauses_agent: Verifies Ctrl+P still pauses agent as expected

Tests mock prompt_toolkit's create_input, along with the raw_mode and attach context managers.
All tests pass and validate the improved Ctrl+C behavior implementation.

Co-authored-by: OpenHands-Claude <openhands-claude@all-hands.dev>
2025-06-28 16:17:48 +02:00
Engel Nyst bded599449 Fix Ctrl+C behavior: use KeyboardInterrupt instead of signals
The previous approach using os.kill(os.getpid(), signal.SIGTERM) was too
aggressive and caused runtime crashes. The proper solution is to raise
KeyboardInterrupt and let the CLI main function handle it gracefully.

Key insights:
- CLI main function already has proper KeyboardInterrupt handling
- shutdown_listener is designed primarily for server mode (uvicorn)
- Raw input mode intercepts Ctrl+C before it becomes SIGINT
- Raising KeyboardInterrupt allows normal CLI shutdown flow

This approach:
- First Ctrl+C: stops agent gracefully with helpful message
- Second Ctrl+C: raises KeyboardInterrupt for clean application exit
- No more runtime crashes or 'system crashed and restarted' errors

Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
2025-06-28 16:02:22 +02:00
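The two-stage Ctrl+C behavior described in bded599449 can be sketched as a small state machine. This is a minimal sketch, not the code from the commit: `CtrlCTracker`, `on_ctrl_c`, and the injectable `clock` parameter are hypothetical names introduced for illustration. Only the 2-second window and the "stop first, raise KeyboardInterrupt second" rule come from the commit message.

```python
import time
from typing import Callable, Optional


class CtrlCTracker:
    """Tracks Ctrl+C presses (hypothetical helper, not part of OpenHands)."""

    def __init__(
        self, window: float = 2.0, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.window = window  # seconds within which a second press force-quits
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_press: Optional[float] = None

    def on_ctrl_c(self) -> str:
        """Return 'stop' for a first press; raise KeyboardInterrupt for a quick second press."""
        now = self.clock()
        if self.last_press is not None and (now - self.last_press) < self.window:
            # Second press inside the window: let the caller's KeyboardInterrupt
            # handling perform the normal shutdown.
            raise KeyboardInterrupt()
        # First press, or a press after the window expired: request a graceful stop.
        self.last_press = now
        return 'stop'
```

Comparing against `None` rather than relying on truthiness keeps the check correct even when the clock legitimately returns `0.0`, which can happen with an injected test clock.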
Engel Nyst 0bb43193d0 Use proper signal mechanism for double Ctrl+C shutdown
Instead of directly setting shutdown_listener._should_exit = True,
use os.kill(os.getpid(), signal.SIGTERM) to trigger the proper
shutdown signal handler.

This follows the established pattern where:
- shutdown_listener registers signal handlers for SIGINT/SIGTERM
- Signal handler sets _should_exit = True and calls shutdown listeners
- Components check should_continue()/should_exit() for coordinated shutdown

Benefits:
- Follows OpenHands' established shutdown architecture
- Proper signal handling instead of direct flag manipulation
- Consistent with how other shutdown scenarios work
- Cleaner separation of concerns

Co-authored-by: OpenHands-Claude <openhands-claude@all-hands.dev>
2025-06-28 15:41:02 +02:00
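The shutdown pattern that 0bb43193d0 describes (a signal handler sets a flag and notifies listeners, and components poll the flag) might look roughly like the sketch below. `ShutdownCoordinator` and its method names are assumptions made for illustration; the real `shutdown_listener` module is not shown in this diff and may be organized differently.

```python
import signal
from typing import Callable, List


class ShutdownCoordinator:
    """Hypothetical stand-in for a shutdown_listener-style module."""

    def __init__(self) -> None:
        self._should_exit = False
        self._listeners: List[Callable[[], None]] = []

    def add_listener(self, callback: Callable[[], None]) -> None:
        """Register a callback to run once shutdown is requested."""
        self._listeners.append(callback)

    def should_continue(self) -> bool:
        """Polled by long-running components to decide whether to keep working."""
        return not self._should_exit

    def handle_signal(self, signum: int, frame: object) -> None:
        """Signal handler: set the flag, then notify every listener."""
        self._should_exit = True
        for callback in list(self._listeners):
            callback()

    def install(self) -> None:
        """Register the handler for SIGINT and SIGTERM (main thread only)."""
        for sig in (signal.SIGINT, signal.SIGTERM):
            signal.signal(sig, self.handle_signal)
```

With such a handler installed, `os.kill(os.getpid(), signal.SIGTERM)` would trigger it for the whole process, which is the behavior the later commit bded599449 reports as too aggressive for the CLI.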
Engel Nyst 72b1aa6154 Fix double Ctrl+C to use global shutdown mechanism
Instead of raising KeyboardInterrupt, which interrupts pending actions
and causes 'runtime system crashed' errors, use the existing global
shutdown_listener mechanism, which gracefully shuts down all components.

This prevents:
- [Errno 21] Is a directory errors
- 'runtime system crashed and restarted' messages
- Delayed action execution after restart
- Missing 'Force quitting...' message

The shutdown_listener._should_exit flag is checked by EventStream
and other components for clean shutdown coordination.
2025-06-28 15:24:03 +02:00
Engel Nyst db5f7a5744 Implement double Ctrl+C behavior for graceful vs force quit
Changed to a more user-friendly approach:
- First Ctrl+C: Stops agent gracefully (sets STOPPED state)
- Second Ctrl+C within 2 seconds: Force quits application

This provides better UX by allowing users to:
1. Stop the current agent task without quitting the CLI
2. Force quit if they really want to exit the application

Messages shown:
- First: 'Stopping agent... (press Ctrl+C again within 2 seconds to force quit)'
- Second: 'Force quitting...'

Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
2025-06-28 14:46:54 +02:00
Engel Nyst fbe253f9e9 Fix Ctrl+C to cleanly stop agent instead of hard interrupt
Changed approach from raising KeyboardInterrupt to setting agent state
to STOPPED, which allows for clean shutdown without race conditions.

Changes:
- Ctrl+C now sets AgentState.STOPPED and signals done event
- Shows 'Keyboard interrupt, shutting down...' message
- Avoids 'cannot schedule new futures after interpreter shutdown' error
- Ctrl+P and Ctrl+D continue to pause the agent as before

This approach prevents the race condition where background tasks try to
schedule futures after the interpreter begins shutdown.

Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
2025-06-28 14:30:01 +02:00
Engel Nyst f70f07e19a Fix Ctrl+C to raise KeyboardInterrupt instead of ignoring it
The previous fix removed Ctrl+C handling entirely, which caused it to be
consumed by the input handler but do nothing. This fix makes Ctrl+C
explicitly raise KeyboardInterrupt, which will properly terminate the
application.

Changes:
- Ctrl+C now raises KeyboardInterrupt in process_agent_pause
- Ctrl+P and Ctrl+D continue to pause the agent as before
- Application will properly terminate when Ctrl+C is pressed

Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
2025-06-28 14:21:48 +02:00
Engel Nyst c9a2dec194 Fix Ctrl+C behavior in CLI to properly interrupt and stop application
Previously, Ctrl+C was intercepted by the process_agent_pause function
and treated the same as Ctrl+P (pause agent). This prevented normal
KeyboardInterrupt handling and made it impossible to stop the application
with Ctrl+C.

Changes:
- Remove Keys.ControlC from process_agent_pause function
- Now only Ctrl+P and Ctrl+D pause the agent
- Ctrl+C properly propagates as KeyboardInterrupt to main function
- Application can now be terminated normally with Ctrl+C

Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
2025-06-28 14:08:01 +02:00
Graham Neubig 2c2a721937 Fix unit tests to be environment-independent for cloud deployment (#9425)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-06-27 20:43:09 -04:00
AutoLTX 7abad5844a [Feature] Support .cursorrules (#9327)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-06-28 02:33:19 +02:00
dependabot[bot] 4781e9a424 chore(deps): bump the version-all group across 1 directory with 20 updates (#9421)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-27 20:32:51 -04:00
llamantino a24d7e636e fix(cli): avoid race condition from multiple process_agent_pause tasks (#9423) 2025-06-27 23:22:43 +00:00
Peter Hamilton 66b95adbc9 Fix: Retry on Bedrock ServiceUnavailableError (#9419)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-06-27 22:17:50 +02:00
mamoodi d617d6842a Release 0.47.0 (#9405)
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-06-27 13:59:36 -04:00
Xingyao Wang 0eb7f956a9 fix(CLI): Reduce severity of pending action timeout messages (#9415)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-06-27 16:28:31 +00:00
Graham Neubig d3154c4bae Fix CLI import error with broken third-party runtime dependencies (#9413)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-06-27 12:00:38 -04:00
Calvin Smith 04a15b1467 Condensation request signal in event stream (#9097)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2025-06-27 09:57:39 -06:00
Xingyao Wang b74da7d4c3 feat(CLI): Enhance --file option to prompt agent to read and understand file first (#9398)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-06-27 15:57:29 +00:00
Graham Neubig 70ad469fb2 Fix typing 2025-06-26 23:47:54 -04:00
Graham Neubig a85f6af9c2 Fix typing in memory module 2025-06-26 23:46:37 -04:00
Graham Neubig 5e213963dc Fix typing 2025-06-26 23:43:13 -04:00
openhands 051c579855 Fix mypy type error in memory.py with reference to GitHub issue #18440 2025-06-27 03:38:50 +00:00
openhands 6d66b8503c Fix mypy type error in memory.py by adding type ignore annotations 2025-06-27 03:20:20 +00:00
Engel Nyst 0fb1a712d5 feat: Add user directory support for microagents (#9333)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-06-26 22:31:59 -04:00
34 changed files with 2150 additions and 1683 deletions
+3
@@ -260,6 +260,9 @@ enable_finish = true
# length limit
enable_history_truncation = true
# Whether the condensation request tool is enabled
enable_condensation_request = false
[agent.RepoExplorerAgent]
# Example: use a cheaper model for RepoExplorerAgent to reduce cost, especially
# useful when an agent doesn't demand high quality but uses a lot of tokens
+749 -1078
File diff suppressed because it is too large
+18 -18
@@ -7,33 +7,33 @@
"node": ">=20.0.0"
},
"dependencies": {
"@heroui/react": "^2.8.0-beta.9",
"@heroui/react": "^2.8.0-beta.10",
"@microlink/react-json-view": "^1.26.2",
"@monaco-editor/react": "^4.7.0-rc.0",
"@react-router/node": "^7.6.2",
"@react-router/serve": "^7.6.2",
"@react-router/node": "^7.6.3",
"@react-router/serve": "^7.6.3",
"@react-types/shared": "^3.29.1",
"@reduxjs/toolkit": "^2.8.2",
"@stripe/react-stripe-js": "^3.7.0",
"@stripe/stripe-js": "^7.3.1",
"@tailwindcss/postcss": "^4.1.10",
"@tailwindcss/vite": "^4.1.10",
"@tanstack/react-query": "^5.80.10",
"@vitejs/plugin-react": "^4.5.2",
"@stripe/stripe-js": "^7.4.0",
"@tailwindcss/postcss": "^4.1.11",
"@tailwindcss/vite": "^4.1.11",
"@tanstack/react-query": "^5.81.4",
"@vitejs/plugin-react": "^4.6.0",
"@xterm/addon-fit": "^0.10.0",
"@xterm/xterm": "^5.4.0",
"axios": "^1.10.0",
"clsx": "^2.1.1",
"eslint-config-airbnb-typescript": "^18.0.0",
"framer-motion": "^12.18.1",
"framer-motion": "^12.19.2",
"i18next": "^25.2.1",
"i18next-browser-languagedetector": "^8.2.0",
"i18next-http-backend": "^3.0.2",
"isbot": "^5.1.28",
"jose": "^6.0.11",
"lucide-react": "^0.519.0",
"lucide-react": "^0.525.0",
"monaco-editor": "^0.52.2",
"posthog-js": "^1.255.0",
"posthog-js": "^1.255.1",
"react": "^19.1.0",
"react-dom": "^19.1.0",
"react-highlight": "^0.15.0",
@@ -42,14 +42,14 @@
"react-icons": "^5.5.0",
"react-markdown": "^10.1.0",
"react-redux": "^9.2.0",
"react-router": "^7.6.2",
"react-router": "^7.6.3",
"react-syntax-highlighter": "^15.6.1",
"react-textarea-autosize": "^8.5.9",
"remark-gfm": "^4.0.1",
"sirv-cli": "^3.0.1",
"socket.io-client": "^4.8.1",
"tailwind-merge": "^3.3.1",
"vite": "^6.3.5",
"vite": "^7.0.0",
"web-vitals": "^5.0.3",
"ws": "^8.18.2"
},
@@ -80,19 +80,19 @@
]
},
"devDependencies": {
"@babel/parser": "^7.27.1",
"@babel/traverse": "^7.27.1",
"@babel/parser": "^7.27.7",
"@babel/traverse": "^7.27.7",
"@babel/types": "^7.27.0",
"@mswjs/socket.io-binding": "^0.2.0",
"@playwright/test": "^1.53.1",
"@react-router/dev": "^7.6.2",
"@react-router/dev": "^7.6.3",
"@tailwindcss/typography": "^0.5.16",
"@tanstack/eslint-plugin-query": "^5.81.2",
"@testing-library/dom": "^10.4.0",
"@testing-library/jest-dom": "^6.6.1",
"@testing-library/react": "^16.3.0",
"@testing-library/user-event": "^14.6.1",
"@types/node": "^24.0.3",
"@types/node": "^24.0.5",
"@types/react": "^19.1.8",
"@types/react-dom": "^19.1.6",
"@types/react-highlight": "^0.12.8",
@@ -117,7 +117,7 @@
"jsdom": "^26.1.0",
"lint-staged": "^16.1.2",
"msw": "^2.6.6",
"prettier": "^3.5.3",
"prettier": "^3.6.2",
"stripe": "^18.2.1",
"tailwindcss": "^4.1.8",
"typescript": "^5.8.3",
@@ -12,6 +12,9 @@ if TYPE_CHECKING:
import openhands.agenthub.codeact_agent.function_calling as codeact_function_calling
from openhands.agenthub.codeact_agent.tools.bash import create_cmd_run_tool
from openhands.agenthub.codeact_agent.tools.browser import BrowserTool
from openhands.agenthub.codeact_agent.tools.condensation_request import (
CondensationRequestTool,
)
from openhands.agenthub.codeact_agent.tools.finish import FinishTool
from openhands.agenthub.codeact_agent.tools.ipython import IPythonTool
from openhands.agenthub.codeact_agent.tools.llm_based_edit import LLMBasedFileEditTool
@@ -119,6 +122,8 @@ class CodeActAgent(Agent):
tools.append(ThinkTool)
if self.config.enable_finish:
tools.append(FinishTool)
if self.config.enable_condensation_request:
tools.append(CondensationRequestTool)
if self.config.enable_browsing:
if sys.platform == 'win32':
logger.warning('Windows runtime does not support browsing yet')
@@ -11,6 +11,7 @@ from litellm import (
from openhands.agenthub.codeact_agent.tools import (
BrowserTool,
CondensationRequestTool,
FinishTool,
IPythonTool,
LLMBasedFileEditTool,
@@ -35,6 +36,7 @@ from openhands.events.action import (
IPythonRunCellAction,
MessageAction,
)
from openhands.events.action.agent import CondensationRequestAction
from openhands.events.action.mcp import MCPAction
from openhands.events.event import FileEditSource, FileReadSource
from openhands.events.tool import ToolCallMetadata
@@ -203,6 +205,12 @@ def response_to_actions(
elif tool_call.function.name == ThinkTool['function']['name']:
action = AgentThinkAction(thought=arguments.get('thought', ''))
# ================================================
# CondensationRequestAction
# ================================================
elif tool_call.function.name == CondensationRequestTool['function']['name']:
action = CondensationRequestAction()
# ================================================
# BrowserTool
# ================================================
@@ -1,5 +1,6 @@
from .bash import create_cmd_run_tool
from .browser import BrowserTool
from .condensation_request import CondensationRequestTool
from .finish import FinishTool
from .ipython import IPythonTool
from .llm_based_edit import LLMBasedFileEditTool
@@ -8,6 +9,7 @@ from .think import ThinkTool
__all__ = [
'BrowserTool',
'CondensationRequestTool',
'create_cmd_run_tool',
'FinishTool',
'IPythonTool',
@@ -0,0 +1,16 @@
from litellm import ChatCompletionToolParam, ChatCompletionToolParamFunctionChunk
_CONDENSATION_REQUEST_DESCRIPTION = 'Request a condensation of the conversation history when the context becomes too long or when you need to focus on the most relevant information.'
CondensationRequestTool = ChatCompletionToolParam(
type='function',
function=ChatCompletionToolParamFunctionChunk(
name='request_condensation',
description=_CONDENSATION_REQUEST_DESCRIPTION,
parameters={
'type': 'object',
'properties': {},
'required': [],
},
),
)
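The new tool takes no parameters, so dispatching it only requires matching on the tool-call name. The sketch below uses plain dicts and a stand-in dataclass instead of the litellm types and the real `CondensationRequestAction`, so that it runs without third-party packages; `action_for_tool_call` is a hypothetical helper, not a function from this diff.

```python
from dataclasses import dataclass
from typing import Optional

# Plain-dict stand-in for the litellm ChatCompletionToolParam defined above.
CONDENSATION_REQUEST_TOOL = {
    'type': 'function',
    'function': {
        'name': 'request_condensation',
        'description': 'Request a condensation of the conversation history.',
        # No arguments: the request itself is the whole message.
        'parameters': {'type': 'object', 'properties': {}, 'required': []},
    },
}


@dataclass
class CondensationRequestAction:
    """Stand-in for the real action class; it carries no arguments."""

    action: str = 'condensation_request'


def action_for_tool_call(name: str) -> Optional[CondensationRequestAction]:
    """Map a tool-call name to an action, or None if the name is not ours."""
    if name == CONDENSATION_REQUEST_TOOL['function']['name']:
        return CondensationRequestAction()
    return None
```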
+23 -5
@@ -77,7 +77,6 @@ async def cleanup_session(
controller: AgentController,
) -> None:
"""Clean up all resources from the current session."""
event_stream = runtime.event_stream
end_state = controller.get_state()
end_state.save_to_session(
@@ -121,6 +120,7 @@ async def run_session(
sid = generate_sid(config, session_name)
is_loaded = asyncio.Event()
is_paused = asyncio.Event() # Event to track agent pause requests
pause_task: asyncio.Task | None = None # No more than one pause task
always_confirm_mode = False # Flag to enable always confirm mode
# Show runtime initialization message
@@ -236,9 +236,11 @@ async def run_session(
if event.agent_state == AgentState.RUNNING:
display_agent_running_message()
loop.create_task(
process_agent_pause(is_paused, event_stream)
) # Create a task to track agent pause requests from the user
nonlocal pause_task
if pause_task is None or pause_task.done():
pause_task = loop.create_task(
process_agent_pause(is_paused, event_stream)
) # Create a task to track agent pause requests from the user
def on_event(event: Event) -> None:
loop.create_task(on_event_async(event))
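The `pause_task` guard in the hunk above follows a common asyncio pattern: keep a reference to the background task and start a new one only when none is active. A minimal, self-contained sketch of that pattern follows; `SingleTaskGuard` and `ensure_running` are hypothetical names, not part of this change.

```python
import asyncio
from typing import Any, Callable, Coroutine, Optional


class SingleTaskGuard:
    """Ensures at most one background task is active (hypothetical helper)."""

    def __init__(self) -> None:
        self.task: Optional[asyncio.Task] = None

    def ensure_running(self, factory: Callable[[], Coroutine[Any, Any, None]]) -> bool:
        """Start a task only if none is active; return True if one was created."""
        if self.task is None or self.task.done():
            # Must be called from inside a running event loop.
            self.task = asyncio.get_running_loop().create_task(factory())
            return True
        return False
```

Holding the reference also keeps the task from being garbage-collected while it is still pending.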
@@ -434,7 +436,23 @@ async def main_with_loop(loop: asyncio.AbstractEventLoop) -> None:
return
# Read task from file, CLI args, or stdin
task_str = read_task(args, config.cli_multiline_input)
if args.file:
# For CLI usage, we want to enhance the file content with a prompt
# that instructs the agent to read and understand the file first
with open(args.file, 'r', encoding='utf-8') as file:
file_content = file.read()
# Create a prompt that instructs the agent to read and understand the file first
task_str = f"""The user has tagged a file '{args.file}'.
Please read and understand the following file content first:
```
{file_content}
```
After reviewing the file, please ask the user what they would like to do with it."""
else:
task_str = read_task(args, config.cli_multiline_input)
# Run the first session
new_session_requested = await run_session(
+28 -5
@@ -586,15 +586,38 @@ async def read_confirmation_input(config: OpenHandsConfig) -> str:
async def process_agent_pause(done: asyncio.Event, event_stream: EventStream) -> None:
import time
input = create_input()
ctrl_c_pressed_time = None
def keys_ready() -> None:
nonlocal ctrl_c_pressed_time
for key_press in input.read_keys():
if (
key_press.key == Keys.ControlP
or key_press.key == Keys.ControlC
or key_press.key == Keys.ControlD
):
if key_press.key == Keys.ControlC:
current_time = time.time()
if ctrl_c_pressed_time and (current_time - ctrl_c_pressed_time) < 2.0:
# Double Ctrl+C within 2 seconds - force quit
print_formatted_text('')
print_formatted_text(HTML('<red>Force quitting...</red>'))
# Let the CLI main function handle the KeyboardInterrupt properly
raise KeyboardInterrupt()
else:
# First Ctrl+C - stop agent gracefully
ctrl_c_pressed_time = current_time
print_formatted_text('')
print_formatted_text(
HTML(
'<yellow>Stopping agent... (press Ctrl+C again within 2 seconds to force quit)</yellow>'
)
)
event_stream.add_event(
ChangeAgentStateAction(AgentState.STOPPED),
EventSource.USER,
)
done.set()
elif key_press.key == Keys.ControlP or key_press.key == Keys.ControlD:
print_formatted_text('')
print_formatted_text(HTML('<gold>Pausing the agent...</gold>'))
event_stream.add_event(
+11 -178
@@ -59,7 +59,11 @@ from openhands.events.action import (
NullAction,
SystemMessageAction,
)
from openhands.events.action.agent import CondensationAction, RecallAction
from openhands.events.action.agent import (
CondensationAction,
CondensationRequestAction,
RecallAction,
)
from openhands.events.event import Event
from openhands.events.observation import (
AgentDelegateObservation,
@@ -71,7 +75,6 @@ from openhands.events.observation import (
from openhands.events.serialization.event import truncate_content
from openhands.llm.llm import LLM
from openhands.llm.metrics import Metrics
from openhands.memory.view import View
from openhands.storage.files import FileStore
# note: RESUME is only available on web GUI
@@ -336,6 +339,8 @@ class AgentController:
return True
if isinstance(event, CondensationAction):
return True
if isinstance(event, CondensationRequestAction):
return True
return False
if isinstance(event, Observation):
if (
@@ -829,7 +834,9 @@ class AgentController:
or isinstance(e, ContextWindowExceededError)
):
if self.agent.config.enable_history_truncation:
self._handle_long_context_error()
self.event_stream.add_event(
CondensationRequestAction(), EventSource.AGENT
)
return
else:
raise LLMContextWindowExceedError()
@@ -880,7 +887,7 @@ class AgentController:
action_id = getattr(action, 'id', 'unknown')
action_type = type(action).__name__
self.log(
'warning',
'info',
f'Pending action active for {elapsed_time:.2f}s: {action_type} (id={action_id})',
extra={'msg_type': 'PENDING_ACTION_TIMEOUT'},
)
@@ -949,180 +956,6 @@ class AgentController:
assert self._closed
return self.state_tracker.get_trajectory(include_screenshots)
def _handle_long_context_error(self) -> None:
# When context window is exceeded, keep roughly half of agent interactions
current_view = View.from_events(self.state.history)
kept_events = self._apply_conversation_window(current_view.events)
kept_event_ids = {e.id for e in kept_events}
self.log(
'info',
f'Context window exceeded. Keeping events with IDs: {kept_event_ids}',
)
# The events to forget are those that are not in the kept set
forgotten_event_ids = {e.id for e in self.state.history} - kept_event_ids
if len(kept_event_ids) == 0:
self.log(
'warning',
'No events kept after applying conversation window. This should not happen.',
)
# verify that the first event id in kept_event_ids is the same as the start_id
if len(kept_event_ids) > 0 and self.state.history[0].id not in kept_event_ids:
self.log(
'warning',
f'First event after applying conversation window was not kept: {self.state.history[0].id} not in {kept_event_ids}',
)
# Add a CondensationAction to the event stream to trigger another step by the agent
self.event_stream.add_event(
CondensationAction(
forgotten_events_start_id=min(forgotten_event_ids)
if forgotten_event_ids
else 0,
forgotten_events_end_id=max(forgotten_event_ids)
if forgotten_event_ids
else 0,
),
EventSource.AGENT,
)
def _apply_conversation_window(self, history: list[Event]) -> list[Event]:
"""Cuts history roughly in half when context window is exceeded.
It preserves action-observation pairs and ensures that the system message,
the first user message, and its associated recall observation are always included
at the beginning of the context window.
The algorithm:
1. Identify essential initial events: System Message, First User Message, Recall Observation.
2. Determine the slice of recent events to potentially keep.
3. Validate the start of the recent slice for dangling observations.
4. Combine essential events and validated recent events, ensuring essentials come first.
Args:
history: List of events to filter
Returns:
Filtered list of events keeping newest half while preserving pairs and essential initial events.
"""
# Handle empty history
if not history:
return []
# 1. Identify essential initial events
system_message: SystemMessageAction | None = None
first_user_msg: MessageAction | None = None
recall_action: RecallAction | None = None
recall_observation: Observation | None = None
# Find System Message (should be the first event, if it exists)
system_message = next(
(e for e in history if isinstance(e, SystemMessageAction)), None
)
assert (
system_message is None
or isinstance(system_message, SystemMessageAction)
and system_message.id == history[0].id
)
# Find First User Message in the history, which MUST exist
first_user_msg = self._first_user_message(history)
if first_user_msg is None:
# If not found in history, try the event stream
first_user_msg = self._first_user_message()
if first_user_msg is None:
raise RuntimeError('No first user message found in the event stream.')
self.log(
'warning',
'First user message not found in history. Using cached version from event stream.',
)
# Find the first user message index in the history
first_user_msg_index = -1
for i, event in enumerate(history):
if isinstance(event, MessageAction) and event.source == EventSource.USER:
first_user_msg_index = i
break
# Find Recall Action and Observation related to the First User Message
# Look for RecallAction after the first user message
for i in range(first_user_msg_index + 1, len(history)):
event = history[i]
if (
isinstance(event, RecallAction)
and event.query == first_user_msg.content
):
# Found RecallAction, now look for its Observation
recall_action = event
for j in range(i + 1, len(history)):
obs_event = history[j]
# Check for Observation caused by this RecallAction
if (
isinstance(obs_event, Observation)
and obs_event.cause == recall_action.id
):
recall_observation = obs_event
break # Found the observation, stop inner loop
break # Found the recall action (and maybe obs), stop outer loop
essential_events: list[Event] = []
if system_message:
essential_events.append(system_message)
# Only include first user message if history is not empty
if history:
essential_events.append(first_user_msg)
# Include recall action and observation if both exist
if recall_action and recall_observation:
essential_events.append(recall_action)
essential_events.append(recall_observation)
# Include recall action without observation for backward compatibility
elif recall_action:
essential_events.append(recall_action)
# 2. Determine the slice of recent events to potentially keep
num_non_essential_events = len(history) - len(essential_events)
# Keep roughly half of the non-essential events, minimum 1
num_recent_to_keep = max(1, num_non_essential_events // 2)
# Calculate the starting index for the recent slice
slice_start_index = len(history) - num_recent_to_keep
slice_start_index = max(0, slice_start_index) # Ensure index is not negative
recent_events_slice = history[slice_start_index:]
# 3. Validate the start of the recent slice for dangling observations
# IMPORTANT: Most observations in history are tool call results, which cannot be without their action, or we get an LLM API error
first_valid_event_index = 0
for i, event in enumerate(recent_events_slice):
if isinstance(event, Observation):
first_valid_event_index += 1
else:
break
# If all events in the slice are dangling observations, we need to keep at least one
if first_valid_event_index == len(recent_events_slice):
self.log(
'warning',
'All recent events are dangling observations, which we truncate. This means the agent has only the essential first events. This should not happen.',
)
# Adjust the recent_events_slice if dangling observations were found at the start
if first_valid_event_index < len(recent_events_slice):
validated_recent_events = recent_events_slice[first_valid_event_index:]
if first_valid_event_index > 0:
self.log(
'debug',
f'Removed {first_valid_event_index} dangling observation(s) from the start of recent event slice.',
)
else:
validated_recent_events = []
# 4. Combine essential events and validated recent events
events_to_keep: list[Event] = essential_events + validated_recent_events
self.log('debug', f'History truncated. Kept {len(events_to_keep)} events.')
return events_to_keep
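The windowing steps in the removed `_apply_conversation_window` (keep the essential initial events, keep roughly the newest half of the rest, and drop observations left dangling at the start of that slice) can be condensed into the sketch below. `Ev` and `apply_window` are simplified stand-ins introduced for illustration: the caller passes the essential events in directly, whereas the removed method locates the system message, first user message, and recall pair itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Ev:
    """Simplified event: an id plus a flag marking observations."""

    id: int
    is_observation: bool = False


def apply_window(history: List[Ev], essentials: List[Ev]) -> List[Ev]:
    """Keep the essentials plus roughly the newest half of the remaining events."""
    if not history:
        return []
    num_non_essential = len(history) - len(essentials)
    num_recent = max(1, num_non_essential // 2)  # keep at least one recent event
    recent = history[max(0, len(history) - num_recent):]
    # Skip observations at the start of the slice: their actions were cut off.
    start = 0
    while start < len(recent) and recent[start].is_observation:
        start += 1
    return essentials + recent[start:]
```

Dropping leading observations matters because, as the removed code's comment notes, most observations are tool-call results and cannot be sent to the LLM without their originating action.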
def _is_stuck(self) -> bool:
"""Checks if the agent or its delegate is stuck in a loop.
+3 -3
@@ -31,6 +31,8 @@ class AgentConfig(BaseModel):
"""Whether to enable think tool"""
enable_finish: bool = Field(default=True)
"""Whether to enable finish tool"""
enable_condensation_request: bool = Field(default=False)
"""Whether to enable condensation request tool"""
enable_prompt_extensions: bool = Field(default=True)
"""Whether to enable prompt extensions"""
enable_mcp: bool = Field(default=True)
@@ -51,8 +53,7 @@ class AgentConfig(BaseModel):
@classmethod
def from_toml_section(cls, data: dict) -> dict[str, AgentConfig]:
"""
Create a mapping of AgentConfig instances from a toml dictionary representing the [agent] section.
"""Create a mapping of AgentConfig instances from a toml dictionary representing the [agent] section.
The default configuration is built from all non-dict keys in data.
Then, each key with a dict value is treated as a custom agent configuration, and its values override
@@ -70,7 +71,6 @@ class AgentConfig(BaseModel):
dict[str, AgentConfig]: A mapping where the key "agent" corresponds to the default configuration
and additional keys represent custom configurations.
"""
# Initialize the result mapping
agent_mapping: dict[str, AgentConfig] = {}
+27 -18
@@ -11,7 +11,7 @@ from openhands.core.config.llm_config import LLMConfig
class NoOpCondenserConfig(BaseModel):
"""Configuration for NoOpCondenser."""
type: Literal['noop'] = 'noop'
type: Literal['noop'] = Field(default='noop')
model_config = ConfigDict(extra='forbid')
@@ -19,7 +19,7 @@ class NoOpCondenserConfig(BaseModel):
class ObservationMaskingCondenserConfig(BaseModel):
"""Configuration for ObservationMaskingCondenser."""
type: Literal['observation_masking'] = 'observation_masking'
type: Literal['observation_masking'] = Field(default='observation_masking')
attention_window: int = Field(
default=100,
description='The number of most-recent events where observations will not be masked.',
@@ -32,7 +32,7 @@ class ObservationMaskingCondenserConfig(BaseModel):
class BrowserOutputCondenserConfig(BaseModel):
"""Configuration for the BrowserOutputCondenser."""
type: Literal['browser_output_masking'] = 'browser_output_masking'
type: Literal['browser_output_masking'] = Field(default='browser_output_masking')
attention_window: int = Field(
default=1,
description='The number of most recent browser output observations that will not be masked.',
@@ -43,7 +43,7 @@ class BrowserOutputCondenserConfig(BaseModel):
class RecentEventsCondenserConfig(BaseModel):
"""Configuration for RecentEventsCondenser."""
type: Literal['recent'] = 'recent'
type: Literal['recent'] = Field(default='recent')
# at least one event by default, because the best guess is that it is the user task
keep_first: int = Field(
@@ -61,7 +61,7 @@ class RecentEventsCondenserConfig(BaseModel):
class LLMSummarizingCondenserConfig(BaseModel):
"""Configuration for LLMSummarizingCondenser."""
type: Literal['llm'] = 'llm'
type: Literal['llm'] = Field(default='llm')
llm_config: LLMConfig = Field(
..., description='Configuration for the LLM to use for condensing.'
)
@@ -88,7 +88,7 @@ class LLMSummarizingCondenserConfig(BaseModel):
class AmortizedForgettingCondenserConfig(BaseModel):
"""Configuration for AmortizedForgettingCondenser."""
type: Literal['amortized'] = 'amortized'
type: Literal['amortized'] = Field(default='amortized')
max_size: int = Field(
default=100,
description='Maximum size of the condensed history before triggering forgetting.',
@@ -108,7 +108,7 @@ class AmortizedForgettingCondenserConfig(BaseModel):
class LLMAttentionCondenserConfig(BaseModel):
"""Configuration for LLMAttentionCondenser."""
type: Literal['llm_attention'] = 'llm_attention'
type: Literal['llm_attention'] = Field(default='llm_attention')
llm_config: LLMConfig = Field(
..., description='Configuration for the LLM to use for attention.'
)
@@ -131,7 +131,7 @@ class LLMAttentionCondenserConfig(BaseModel):
class StructuredSummaryCondenserConfig(BaseModel):
"""Configuration for StructuredSummaryCondenser instances."""
type: Literal['structured'] = 'structured'
type: Literal['structured'] = Field(default='structured')
llm_config: LLMConfig = Field(
..., description='Configuration for the LLM to use for condensing.'
)
@@ -156,12 +156,9 @@ class StructuredSummaryCondenserConfig(BaseModel):
class CondenserPipelineConfig(BaseModel):
"""Configuration for the CondenserPipeline.
"""Configuration for the CondenserPipeline."""
Not currently supported by the TOML or ENV_VAR configuration strategies.
"""
type: Literal['pipeline'] = 'pipeline'
type: Literal['pipeline'] = Field(default='pipeline')
condensers: list[CondenserConfig] = Field(
default_factory=list,
description='List of condenser configurations to be used in the pipeline.',
@@ -170,6 +167,17 @@ class CondenserPipelineConfig(BaseModel):
model_config = ConfigDict(extra='forbid')
class ConversationWindowCondenserConfig(BaseModel):
"""Configuration for ConversationWindowCondenser.
Not currently supported by the TOML or ENV_VAR configuration strategies.
"""
type: Literal['conversation_window'] = Field(default='conversation_window')
model_config = ConfigDict(extra='forbid')
# Type alias for convenience
CondenserConfig = (
NoOpCondenserConfig
@@ -181,14 +189,14 @@ CondenserConfig = (
| LLMAttentionCondenserConfig
| StructuredSummaryCondenserConfig
| CondenserPipelineConfig
| ConversationWindowCondenserConfig
)
def condenser_config_from_toml_section(
data: dict, llm_configs: dict | None = None
) -> dict[str, CondenserConfig]:
"""
Create a CondenserConfig instance from a toml dictionary representing the [condenser] section.
"""Create a CondenserConfig instance from a toml dictionary representing the [condenser] section.
For CondenserConfig, the handling is different since it's a union type. The type of condenser
is determined by the 'type' field in the section.
@@ -210,7 +218,6 @@ def condenser_config_from_toml_section(
Returns:
dict[str, CondenserConfig]: A mapping where the key "condenser" corresponds to the configuration.
"""
# Initialize the result mapping
condenser_mapping: dict[str, CondenserConfig] = {}
@@ -261,8 +268,7 @@ from_toml_section = condenser_config_from_toml_section
def create_condenser_config(condenser_type: str, data: dict) -> CondenserConfig:
"""
Create a CondenserConfig instance based on the specified type.
"""Create a CondenserConfig instance based on the specified type.
Args:
condenser_type: The type of condenser to create.
@@ -284,6 +290,9 @@ def create_condenser_config(condenser_type: str, data: dict) -> CondenserConfig:
'amortized': AmortizedForgettingCondenserConfig,
'llm_attention': LLMAttentionCondenserConfig,
'structured': StructuredSummaryCondenserConfig,
'pipeline': CondenserPipelineConfig,
'conversation_window': ConversationWindowCondenserConfig,
'browser_output_masking': BrowserOutputCondenserConfig,
}
if condenser_type not in condenser_classes:
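For readers skimming the hunk above, the type-keyed lookup in `create_condenser_config` can be sketched roughly as follows. This is a minimal illustration, not the project's code: `WindowConfig`, `NoOpConfig`, `CONFIG_CLASSES`, `create_config` and the `'noop'` key are hypothetical stand-ins, and the `ValueError` for unknown types is an assumption, since the diff is cut off before showing what happens in that branch.

```python
from dataclasses import dataclass


@dataclass
class WindowConfig:
    """Hypothetical stand-in for ConversationWindowCondenserConfig."""

    type: str = 'conversation_window'


@dataclass
class NoOpConfig:
    """Hypothetical stand-in for a no-op condenser config."""

    type: str = 'noop'


# Maps the 'type' string found in a config section to the class to build.
CONFIG_CLASSES = {'noop': NoOpConfig, 'conversation_window': WindowConfig}


def create_config(condenser_type: str, data: dict):
    """Look up the config class by type string and build it from data."""
    if condenser_type not in CONFIG_CLASSES:
        # Assumption: unknown types are rejected; the diff does not show how.
        raise ValueError(f'Unknown condenser type: {condenser_type}')
    return CONFIG_CLASSES[condenser_type](**data)
```

Registering a new condenser type then amounts to adding one entry to the mapping, which is what the `'conversation_window'` line in the hunk does.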
+3
@@ -91,3 +91,6 @@ class ActionType(str, Enum):
CONDENSATION = 'condensation'
"""Condenses a list of events into a summary."""
CONDENSATION_REQUEST = 'condensation_request'
"""Request for condensation of a list of events."""
+15
@@ -195,3 +195,18 @@ class CondensationAction(Action):
if self.summary:
return f'Summary: {self.summary}'
return f'Condenser is dropping the events: {self.forgotten}.'
@dataclass
class CondensationRequestAction(Action):
"""This action is used to request a condensation of the conversation history.
Attributes:
action (str): The action type, namely ActionType.CONDENSATION_REQUEST.
"""
action: str = ActionType.CONDENSATION_REQUEST
@property
def message(self) -> str:
return 'Requesting a condensation of the conversation history.'
+2
@@ -9,6 +9,7 @@ from openhands.events.action.agent import (
AgentThinkAction,
ChangeAgentStateAction,
CondensationAction,
CondensationRequestAction,
RecallAction,
)
from openhands.events.action.browse import BrowseInteractiveAction, BrowseURLAction
@@ -43,6 +44,7 @@ actions = (
MessageAction,
SystemMessageAction,
CondensationAction,
CondensationRequestAction,
MCPAction,
)
+2
@@ -19,6 +19,7 @@ from litellm import completion as litellm_completion
from litellm import completion_cost as litellm_completion_cost
from litellm.exceptions import (
RateLimitError,
ServiceUnavailableError,
)
from litellm.types.utils import CostPerToken, ModelResponse, Usage
from litellm.utils import create_pretrained_tokenizer
@@ -40,6 +41,7 @@ __all__ = ['LLM']
# tuple of exceptions to retry on
LLM_RETRY_EXCEPTIONS: tuple[type[Exception], ...] = (
RateLimitError,
ServiceUnavailableError,
litellm.Timeout,
litellm.InternalServerError,
LLMNoResponseError,
@@ -4,6 +4,9 @@ from openhands.memory.condenser.impl.amortized_forgetting_condenser import (
from openhands.memory.condenser.impl.browser_output_condenser import (
BrowserOutputCondenser,
)
from openhands.memory.condenser.impl.conversation_window_condenser import (
ConversationWindowCondenser,
)
from openhands.memory.condenser.impl.llm_attention_condenser import (
ImportantEventSelection,
LLMAttentionCondenser,
@@ -34,4 +37,5 @@ __all__ = [
'RecentEventsCondenser',
'StructuredSummaryCondenser',
'CondenserPipeline',
'ConversationWindowCondenser',
]
@@ -0,0 +1,185 @@
from __future__ import annotations
from openhands.core.config.condenser_config import ConversationWindowCondenserConfig
from openhands.core.logger import openhands_logger as logger
from openhands.events.action.agent import (
CondensationAction,
RecallAction,
)
from openhands.events.action.message import MessageAction, SystemMessageAction
from openhands.events.event import EventSource
from openhands.events.observation import Observation
from openhands.memory.condenser.condenser import Condensation, RollingCondenser, View
class ConversationWindowCondenser(RollingCondenser):
def __init__(self) -> None:
super().__init__()
def get_condensation(self, view: View) -> Condensation:
"""Apply conversation window truncation similar to _apply_conversation_window.
This method:
1. Identifies essential initial events (System Message, First User Message, Recall Observation)
2. Keeps roughly half of the history
3. Ensures action-observation pairs are preserved
4. Returns a CondensationAction specifying which events to forget
"""
events = view.events
# Handle empty history
if not events:
# No events to condense
action = CondensationAction(forgotten_event_ids=[])
return Condensation(action=action)
# 1. Identify essential initial events
system_message: SystemMessageAction | None = None
first_user_msg: MessageAction | None = None
recall_action: RecallAction | None = None
recall_observation: Observation | None = None
# Find System Message (should be the first event, if it exists)
system_message = next(
(e for e in events if isinstance(e, SystemMessageAction)), None
)
# Find First User Message
first_user_msg = next(
(
e
for e in events
if isinstance(e, MessageAction) and e.source == EventSource.USER
),
None,
)
if first_user_msg is None:
logger.warning(
'No first user message found in history during condensation.'
)
# Return empty condensation if no user message
action = CondensationAction(forgotten_event_ids=[])
return Condensation(action=action)
# Find the first user message index
first_user_msg_index = -1
for i, event in enumerate(events):
if isinstance(event, MessageAction) and event.source == EventSource.USER:
first_user_msg_index = i
break
# Find Recall Action and Observation related to the First User Message
for i in range(first_user_msg_index + 1, len(events)):
event = events[i]
if (
isinstance(event, RecallAction)
and event.query == first_user_msg.content
):
recall_action = event
# Look for its observation
for j in range(i + 1, len(events)):
obs_event = events[j]
if (
isinstance(obs_event, Observation)
and obs_event.cause == recall_action.id
):
recall_observation = obs_event
break
break
# Collect essential events
essential_events: list[int] = [] # Store event IDs
if system_message:
essential_events.append(system_message.id)
essential_events.append(first_user_msg.id)
if recall_action:
essential_events.append(recall_action.id)
if recall_observation:
essential_events.append(recall_observation.id)
# 2. Determine which events to keep
num_essential_events = len(essential_events)
total_events = len(events)
num_non_essential_events = total_events - num_essential_events
# Keep roughly half of the non-essential events
num_recent_to_keep = max(1, num_non_essential_events // 2)
# Calculate the starting index for recent events to keep
slice_start_index = total_events - num_recent_to_keep
slice_start_index = max(0, slice_start_index)
# 3. Handle dangling observations at the start of the slice
# Find the first non-observation event in the slice
recent_events_slice = events[slice_start_index:]
first_valid_event_index_in_slice = 0
for i, event in enumerate(recent_events_slice):
if not isinstance(event, Observation):
first_valid_event_index_in_slice = i
break
else:
# All events in the slice are observations
first_valid_event_index_in_slice = len(recent_events_slice)
# Check if all events in the recent slice are dangling observations
if first_valid_event_index_in_slice == len(recent_events_slice):
logger.warning(
'All recent events are dangling observations, which we truncate. This means the agent has only the essential first events. This should not happen.'
)
# Calculate the actual index in the full events list
first_valid_event_index = slice_start_index + first_valid_event_index_in_slice
if first_valid_event_index_in_slice > 0:
logger.debug(
f'Removed {first_valid_event_index_in_slice} dangling observation(s) '
f'from the start of recent event slice.'
)
# 4. Determine which events to keep and which to forget
events_to_keep: set[int] = set(essential_events)
# Add recent events starting from first_valid_event_index
for i in range(first_valid_event_index, total_events):
events_to_keep.add(events[i].id)
# Calculate which events to forget
all_event_ids = {e.id for e in events}
forgotten_event_ids = sorted(all_event_ids - events_to_keep)
logger.info(
f'ConversationWindowCondenser: Keeping {len(events_to_keep)} events, '
f'forgetting {len(forgotten_event_ids)} events.'
)
# Create the condensation action
if forgotten_event_ids:
# Use range if the forgotten events are contiguous
if (
len(forgotten_event_ids) > 1
and forgotten_event_ids[-1] - forgotten_event_ids[0]
== len(forgotten_event_ids) - 1
):
action = CondensationAction(
forgotten_events_start_id=forgotten_event_ids[0],
forgotten_events_end_id=forgotten_event_ids[-1],
)
else:
action = CondensationAction(forgotten_event_ids=forgotten_event_ids)
else:
action = CondensationAction(forgotten_event_ids=[])
return Condensation(action=action)
def should_condense(self, view: View) -> bool:
return view.unhandled_condensation_request
@classmethod
def from_config(
cls, _config: ConversationWindowCondenserConfig
) -> ConversationWindowCondenser:
return ConversationWindowCondenser()
ConversationWindowCondenser.register_config(ConversationWindowCondenserConfig)
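The windowing steps in `get_condensation` above can be condensed into a small standalone sketch. It makes simplifying assumptions: events are modelled by a hypothetical `Ev` dataclass with a coarse `kind` string instead of the real event classes, and the recall action/observation pair is left out of the essential set for brevity. The helper names `ids_to_forget` and `is_contiguous` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Ev:
    """Hypothetical stand-in for an event: an id plus a coarse kind."""

    id: int
    kind: str  # 'system', 'user', 'action', or 'observation'


def ids_to_forget(events: list[Ev]) -> list[int]:
    """Return the sorted ids a window-style condenser would drop."""
    if not events:
        return []
    first_user = next((e for e in events if e.kind == 'user'), None)
    if first_user is None:
        # Mirrors the diff: without a user message nothing is forgotten.
        return []
    essential = [e.id for e in events if e.kind == 'system'][:1]
    essential.append(first_user.id)

    # Keep roughly half of the non-essential events, but at least one.
    num_recent = max(1, (len(events) - len(essential)) // 2)
    start = max(0, len(events) - num_recent)

    # Skip observations at the head of the slice: their actions were cut.
    while start < len(events) and events[start].kind == 'observation':
        start += 1

    keep = set(essential) | {e.id for e in events[start:]}
    return sorted({e.id for e in events} - keep)


def is_contiguous(ids: list[int]) -> bool:
    """True when sorted ids form an unbroken run of more than one id."""
    return len(ids) > 1 and ids[-1] - ids[0] == len(ids) - 1
```

When `is_contiguous` holds, the forgotten events can be described by a start and end id rather than an explicit list, which is the distinction the diff draws when building the `CondensationAction`.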
+34 -6
@@ -2,6 +2,7 @@ import asyncio
import os
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable
import openhands
@@ -33,6 +34,8 @@ GLOBAL_MICROAGENTS_DIR = os.path.join(
'microagents',
)
USER_MICROAGENTS_DIR = Path.home() / '.openhands' / 'microagents'
class Memory:
"""
@@ -77,6 +80,9 @@ class Memory:
# from typically OpenHands/microagents (i.e., the PUBLIC microagents)
self._load_global_microagents()
# Load user microagents from ~/.openhands/microagents/
self._load_user_microagents()
def on_event(self, event: Event):
"""Handle an event from the event stream."""
asyncio.get_event_loop().run_until_complete(self._on_event(event))
@@ -267,12 +273,34 @@ class Memory:
repo_agents, knowledge_agents = load_microagents_from_dir(
GLOBAL_MICROAGENTS_DIR
)
for name, k_agent in knowledge_agents.items():
if isinstance(k_agent, KnowledgeMicroagent):
self.knowledge_microagents[name] = k_agent
for name, r_agent in repo_agents.items():
if isinstance(r_agent, RepoMicroagent):
self.repo_microagents[name] = r_agent
for name, agent_knowledge in knowledge_agents.items():
self.knowledge_microagents[name] = agent_knowledge
for name, agent_repo in repo_agents.items():
self.repo_microagents[name] = agent_repo
def _load_user_microagents(self) -> None:
"""
Loads microagents from the user's home directory (~/.openhands/microagents/)
Creates the directory if it doesn't exist.
"""
try:
# Create the user microagents directory if it doesn't exist
os.makedirs(USER_MICROAGENTS_DIR, exist_ok=True)
# Load microagents from user directory
repo_agents, knowledge_agents = load_microagents_from_dir(
USER_MICROAGENTS_DIR
)
for name, agent_knowledge in knowledge_agents.items():
self.knowledge_microagents[name] = agent_knowledge
for name, agent_repo in repo_agents.items():
self.repo_microagents[name] = agent_repo
except Exception as e:
logger.warning(
f'Failed to load user microagents from {USER_MICROAGENTS_DIR}: {str(e)}'
)
def get_microagent_mcp_tools(self) -> list[MCPConfig]:
"""
+18 -2
@@ -5,7 +5,7 @@ from typing import overload
from pydantic import BaseModel
from openhands.core.logger import openhands_logger as logger
from openhands.events.action.agent import CondensationAction
from openhands.events.action.agent import CondensationAction, CondensationRequestAction
from openhands.events.event import Event
from openhands.events.observation.agent import AgentCondensationObservation
@@ -17,6 +17,7 @@ class View(BaseModel):
"""
events: list[Event]
unhandled_condensation_request: bool = False
def __len__(self) -> int:
return len(self.events)
@@ -52,6 +53,8 @@ class View(BaseModel):
forgotten_event_ids.update(event.forgotten)
# Make sure we also forget the condensation action itself
forgotten_event_ids.add(event.id)
if isinstance(event, CondensationRequestAction):
forgotten_event_ids.add(event.id)
kept_events = [event for event in events if event.id not in forgotten_event_ids]
@@ -74,4 +77,17 @@ class View(BaseModel):
summary_offset, AgentCondensationObservation(content=summary)
)
return View(events=kept_events)
# Check for an unhandled condensation request -- these are events closer to the
# end of the list than any condensation action.
unhandled_condensation_request = False
for event in reversed(events):
if isinstance(event, CondensationAction):
break
if isinstance(event, CondensationRequestAction):
unhandled_condensation_request = True
break
return View(
events=kept_events,
unhandled_condensation_request=unhandled_condensation_request,
)
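The reverse scan that sets `unhandled_condensation_request` can be illustrated in isolation. The sketch below is a simplification under stated assumptions: events are reduced to plain strings, and `has_unhandled_request` is a hypothetical helper name, not part of the `View` class.

```python
def has_unhandled_request(kinds: list[str]) -> bool:
    """Scan newest-to-oldest; whichever marker appears first decides."""
    for kind in reversed(kinds):
        if kind == 'condensation':
            # A condensation already followed any earlier request.
            return False
        if kind == 'condensation_request':
            # A request with no later condensation is still pending.
            return True
    return False
```

Scanning from the end means only the most recent marker matters, so a request that was already answered by a later condensation is not reported again.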
+25 -5
@@ -1,5 +1,6 @@
import io
import re
from itertools import chain
from pathlib import Path
from typing import Union
@@ -39,7 +40,11 @@ class BaseMicroagent(BaseModel):
# Otherwise, we will rely on the name from metadata later
derived_name = None
if microagent_dir is not None:
derived_name = str(path.relative_to(microagent_dir).with_suffix(''))
# Special handling for .cursorrules files which are not in microagent_dir
if path.name == '.cursorrules':
derived_name = 'cursorrules'
else:
derived_name = str(path.relative_to(microagent_dir).with_suffix(''))
# Only load directly from path if file_content is not provided
if file_content is None:
@@ -56,6 +61,16 @@ class BaseMicroagent(BaseModel):
type=MicroagentType.REPO_KNOWLEDGE,
)
# Handle .cursorrules files
if path.name == '.cursorrules':
return RepoMicroagent(
name='cursorrules',
content=file_content,
metadata=MicroagentMetadata(name='cursorrules'),
source=str(path),
type=MicroagentType.REPO_KNOWLEDGE,
)
file_io = io.StringIO(file_content)
loaded = frontmatter.load(file_io)
content = loaded.content
@@ -258,10 +273,15 @@ def load_microagents_from_dir(
# Load all agents from microagents directory
logger.debug(f'Loading agents from {microagent_dir}')
if microagent_dir.exists():
for file in microagent_dir.rglob('*.md'):
# skip README.md
if file.name == 'README.md':
continue
# Collect .cursorrules file from repo root and .md files from microagents dir
cursorrules_files = []
if (microagent_dir.parent.parent / '.cursorrules').exists():
cursorrules_files = [microagent_dir.parent.parent / '.cursorrules']
md_files = [f for f in microagent_dir.rglob('*.md') if f.name != 'README.md']
# Process all files in one loop
for file in chain(cursorrules_files, md_files):
try:
agent = BaseMicroagent.load(file, microagent_dir)
if isinstance(agent, RepoMicroagent):
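The file collection in the hunk above can be tried out on its own with a short sketch. `collect_microagent_files` is a hypothetical helper name, and the `sorted` call is an addition for deterministic ordering that the diff does not have; the lookup two levels above the microagents directory follows the `microagent_dir.parent.parent` expression in the diff.

```python
from itertools import chain
from pathlib import Path


def collect_microagent_files(microagent_dir: Path) -> list[Path]:
    """List an optional .cursorrules file, then all non-README .md files."""
    if not microagent_dir.exists():
        return []
    # As in the diff, .cursorrules is looked up two levels above the
    # microagents directory, which is assumed to be the repo root.
    cursorrules = microagent_dir.parent.parent / '.cursorrules'
    cursorrules_files = [cursorrules] if cursorrules.exists() else []
    # sorted() is not in the diff; it only makes the order reproducible.
    md_files = sorted(
        f for f in microagent_dir.rglob('*.md') if f.name != 'README.md'
    )
    return list(chain(cursorrules_files, md_files))
```

Chaining the two lists lets a single loop handle both kinds of file, which is the restructuring the diff makes in `load_microagents_from_dir`.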
+11 -5
@@ -10,6 +10,7 @@ from openhands.core.config import OpenHandsConfig
from openhands.core.config.condenser_config import (
BrowserOutputCondenserConfig,
CondenserPipelineConfig,
ConversationWindowCondenserConfig,
LLMSummarizingCondenserConfig,
)
from openhands.core.config.mcp_config import MCPConfig, OpenHandsMCPConfigImpl
@@ -156,13 +157,18 @@ class Session:
agent_config = self.config.get_agent_config(agent_cls)
if settings.enable_default_condenser:
# Default condenser chains a condenser that limits browser the total
# size of browser observations with a condenser that limits the size
# of the view given to the LLM. The order matters: with the browser
# output first, the summarizer will only see the most recent browser
# output, which should keep the summarization cost down.
# Default condenser chains three condensers together:
# 1. a conversation window condenser that handles explicit
# condensation requests,
# 2. a condenser that limits the total size of browser observations,
# and
# 3. a condenser that limits the size of the view given to the LLM.
# The order matters: with the browser output first, the summarizer
# will only see the most recent browser output, which should keep
# the summarization cost down.
default_condenser_config = CondenserPipelineConfig(
condensers=[
ConversationWindowCondenserConfig(),
BrowserOutputCondenserConfig(attention_window=2),
LLMSummarizingCondenserConfig(
llm_config=llm.config, keep_first=4, max_size=120
+4 -1
@@ -454,7 +454,10 @@ def test_cmd_run(temp_dir, runtime_cls, run_as_openhands):
):
assert 'openhands' in obs.content
elif runtime_cls == LocalRuntime or runtime_cls == CLIRuntime:
assert 'root' not in obs.content and 'openhands' not in obs.content
# For CLI and Local runtimes, the user depends on the actual environment
# In CI it might be a non-root user, in cloud environments it might be root
# We just check that the command succeeded and the directory was created
pass # Skip user-specific assertions for environment independence
else:
assert 'root' in obs.content
assert 'test' in obs.content
+41 -212
@@ -34,7 +34,12 @@ from openhands.events.observation.empty import NullObservation
from openhands.events.serialization import event_to_dict
from openhands.llm import LLM
from openhands.llm.metrics import Metrics, TokenUsage
from openhands.memory.condenser.condenser import Condensation
from openhands.memory.condenser.impl.conversation_window_condenser import (
ConversationWindowCondenser,
)
from openhands.memory.memory import Memory
from openhands.memory.view import View
from openhands.runtime.base import Runtime
from openhands.runtime.impl.action_execution.action_execution_client import (
ActionExecutionClient,
@@ -835,8 +840,23 @@ async def test_notify_on_llm_retry(mock_agent, mock_event_stream, mock_status_ca
@pytest.mark.asyncio
@pytest.mark.parametrize(
'context_window_error',
[
ContextWindowExceededError(
message='prompt is too long: 233885 tokens > 200000 maximum',
model='',
llm_provider='',
),
BadRequestError(
message='litellm.BadRequestError: OpenrouterException - This endpoint\'s maximum context length is 40960 tokens. However, you requested about 42988 tokens (38892 of text input, 4096 in the output). Please reduce the length of either one, or use the "middle-out" transform to compress your prompt automatically.',
model='openrouter/qwen/qwen3-30b-a3b',
llm_provider='openrouter',
),
],
)
async def test_context_window_exceeded_error_handling(
mock_agent, mock_runtime, test_event_stream, mock_memory
context_window_error, mock_agent, mock_runtime, test_event_stream, mock_memory
):
"""Test that context window exceeded errors are handled correctly by the controller, providing a smaller view but keeping the history intact."""
max_iterations = 5
@@ -847,9 +867,15 @@ async def test_context_window_exceeded_error_handling(
self.has_errored = False
self.index = 0
self.views = []
self.condenser = ConversationWindowCondenser()
def step(self, state: State):
self.views.append(state.view)
match self.condenser.condense(state.view):
case View() as view:
self.views.append(view)
case Condensation(action=action):
return action
# Wait until the right step to throw the error, and make sure we
# only throw it once.
@@ -857,13 +883,13 @@ async def test_context_window_exceeded_error_handling(
self.index += 1
return MessageAction(content=f'Test message {self.index}')
error = ContextWindowExceededError(
ContextWindowExceededError(
message='prompt is too long: 233885 tokens > 200000 maximum',
model='',
llm_provider='',
)
self.has_errored = True
raise error
raise context_window_error
step_state = StepState()
mock_agent.step = step_state.step
@@ -881,7 +907,7 @@ async def test_context_window_exceeded_error_handling(
content='Test microagent content',
recall_type=RecallType.KNOWLEDGE,
)
microagent_obs._cause = event.id
microagent_obs._cause = event.id # type: ignore
test_event_stream.add_event(microagent_obs, EventSource.ENVIRONMENT)
test_event_stream.subscribe(
@@ -911,7 +937,7 @@ async def test_context_window_exceeded_error_handling(
# Check that the context window exception was thrown and the controller
# called the agent's `step` function the right number of times.
assert step_state.has_errored
assert len(step_state.views) == max_iterations
assert len(step_state.views) == max_iterations - 1
print('step_state.views: ', step_state.views)
# Look at pre/post-step views. Normally, these should always increase in
@@ -936,7 +962,7 @@ async def test_context_window_exceeded_error_handling(
assert len(first_view) < len(second_view)
# The final state's history should contain:
# - max_iterations number of message actions,
# - (max_iterations - 1) number of message actions (one iteration taken up with the condensation request)
# - 1 recall actions,
# - 1 recall observations,
# - 1 condensation action.
@@ -944,7 +970,7 @@ async def test_context_window_exceeded_error_handling(
len(
[event for event in final_state.history if isinstance(event, MessageAction)]
)
== max_iterations
== max_iterations - 1
)
assert (
len(
@@ -955,7 +981,7 @@ async def test_context_window_exceeded_error_handling(
and event.source == EventSource.AGENT
]
)
== max_iterations - 1
== max_iterations - 2
)
assert (
len([event for event in final_state.history if isinstance(event, RecallAction)])
@@ -1001,8 +1027,14 @@ async def test_run_controller_with_context_window_exceeded_with_truncation(
class StepState:
def __init__(self):
self.has_errored = False
self.condenser = ConversationWindowCondenser()
def step(self, state: State):
match self.condenser.condense(state.view):
case Condensation(action=action):
return action
case _:
pass
# If the state has more than one message and we haven't errored yet,
# throw the context window exceeded error
if len(state.history) > 5 and not self.has_errored:
@@ -1614,206 +1646,3 @@ def test_system_message_in_event_stream(mock_agent, test_event_stream):
assert isinstance(events[0], SystemMessageAction)
assert events[0].content == 'Test system message'
assert events[0].tools == ['test_tool']
@pytest.mark.asyncio
async def test_openrouter_context_window_exceeded_error(
mock_agent, test_event_stream, mock_status_callback
):
"""Test that OpenRouter context window exceeded errors are properly detected and handled."""
max_iterations = 5
error_after = 2
class StepState:
def __init__(self):
self.has_errored = False
self.index = 0
self.views = []
def step(self, state: State):
self.views.append(state.view)
# Wait until the right step to throw the error, and make sure we
# only throw it once.
if self.index < error_after or self.has_errored:
self.index += 1
return MessageAction(content=f'Test message {self.index}')
# Create a BadRequestError with the OpenRouter context window exceeded message pattern
error = BadRequestError(
message='litellm.BadRequestError: OpenrouterException - This endpoint\'s maximum context length is 40960 tokens. However, you requested about 42988 tokens (38892 of text input, 4096 in the output). Please reduce the length of either one, or use the "middle-out" transform to compress your prompt automatically.',
model='openrouter/qwen/qwen3-30b-a3b',
llm_provider='openrouter',
)
self.has_errored = True
raise error
step_state = StepState()
mock_agent.step = step_state.step
mock_agent.config = AgentConfig(enable_history_truncation=True)
controller = AgentController(
agent=mock_agent,
event_stream=test_event_stream,
iteration_delta=max_iterations,
sid='test',
confirmation_mode=False,
headless_mode=True,
status_callback=mock_status_callback,
)
# Set the agent state to RUNNING
controller.state.agent_state = AgentState.RUNNING
# Run the controller until it hits the error
for _ in range(error_after + 2): # +2 to ensure we go past the error
await controller._step()
if step_state.has_errored:
break
# Verify that the error was handled as a context window exceeded error
# by checking that _handle_long_context_error was called (which adds a CondensationAction)
events = list(test_event_stream.get_events())
condensation_actions = [e for e in events if isinstance(e, CondensationAction)]
# There should be at least one CondensationAction if the error was handled correctly
assert len(condensation_actions) > 0, (
'OpenRouter context window exceeded error was not handled correctly'
)
await controller.close()
@pytest.mark.asyncio
async def test_sambanova_context_window_exceeded_error(
mock_agent, test_event_stream, mock_status_callback
):
"""Test that SambaNova context window exceeded errors are properly detected and handled."""
max_iterations = 5
error_after = 2
class StepState:
def __init__(self):
self.has_errored = False
self.index = 0
self.views = []
def step(self, state: State):
# Store the view for later inspection
self.views.append(state.view)
# only throw it once.
if self.index < error_after or self.has_errored:
self.index += 1
return MessageAction(content=f'Test message {self.index}')
# Create a BadRequestError with the SambaNova context window exceeded message pattern
error = BadRequestError(
message='litellm.BadRequestError: SambanovaException - The maximum context length of DeepSeek-V3-0324 is 32768. However, answering your request will take 39732 tokens. Please reduce the length of the messages or the specified max_completion_tokens value.',
model='sambanova/deepseek-v3-0324',
llm_provider='sambanova',
)
self.has_errored = True
raise error
step_state = StepState()
mock_agent.step = step_state.step
mock_agent.config = AgentConfig(enable_history_truncation=True)
controller = AgentController(
agent=mock_agent,
event_stream=test_event_stream,
iteration_delta=max_iterations,
sid='test',
confirmation_mode=False,
headless_mode=True,
status_callback=mock_status_callback,
)
# Set the agent state to RUNNING
controller.state.agent_state = AgentState.RUNNING
# Run the controller until it hits the error
for _ in range(error_after + 2): # +2 to ensure we go past the error
await controller._step()
if step_state.has_errored:
break
# Verify that the error was handled as a context window exceeded error
# by checking that _handle_long_context_error was called (which adds a CondensationAction)
events = list(test_event_stream.get_events())
condensation_actions = [e for e in events if isinstance(e, CondensationAction)]
# There should be at least one CondensationAction if the error was handled correctly
assert len(condensation_actions) > 0, (
'SambaNova context window exceeded error was not handled correctly'
)
await controller.close()
@pytest.mark.asyncio
async def test_sambanova_generic_exception_not_handled_as_context_error(
mock_agent, test_event_stream, mock_status_callback
):
"""Test that generic SambaNova exceptions (without context length pattern) are NOT handled as context window errors."""
max_iterations = 5
error_after = 2
class StepState:
def __init__(self):
self.has_errored = False
self.index = 0
self.views = []
def step(self, state: State):
# Store the view for later inspection
self.views.append(state.view)
# only throw it once.
if self.index < error_after or self.has_errored:
self.index += 1
return MessageAction(content=f'Test message {self.index}')
# Create a BadRequestError with a generic SambaNova error (no context length pattern)
error = BadRequestError(
message='litellm.BadRequestError: SambanovaException - Some other error occurred',
model='sambanova/deepseek-v3-0324',
llm_provider='sambanova',
)
self.has_errored = True
raise error
step_state = StepState()
mock_agent.step = step_state.step
mock_agent.config = AgentConfig(enable_history_truncation=True)
controller = AgentController(
agent=mock_agent,
event_stream=test_event_stream,
iteration_delta=max_iterations,
sid='test',
confirmation_mode=False,
headless_mode=True,
status_callback=mock_status_callback,
)
# Set the agent state to RUNNING
controller.state.agent_state = AgentState.RUNNING
# Run the controller until it hits the error
with pytest.raises(BadRequestError):
for _ in range(error_after + 2): # +2 to ensure we go past the error
await controller._step()
if step_state.has_errored:
break
# Verify that the error was NOT handled as a context window exceeded error
# by checking that _handle_long_context_error was NOT called (no CondensationAction should be added)
events = list(test_event_stream.get_events())
condensation_actions = [e for e in events if isinstance(e, CondensationAction)]
# There should be NO CondensationAction if the error was correctly NOT handled as context window error
assert len(condensation_actions) == 0, (
'Generic SambaNova exception was incorrectly handled as context window error'
)
await controller.close()
+93
@@ -345,6 +345,7 @@ async def test_main_without_task(
mock_args.agent_cls = None
mock_args.llm_config = None
mock_args.name = None
mock_args.file = None
mock_parse_args.return_value = mock_args
# Mock config
@@ -427,6 +428,7 @@ async def test_main_with_task(
mock_args = MagicMock()
mock_args.agent_cls = 'custom-agent'
mock_args.llm_config = 'custom-config'
mock_args.file = None
mock_parse_args.return_value = mock_args
# Mock config
@@ -523,6 +525,7 @@ async def test_main_with_session_name_passes_name_to_run_session(
mock_args.agent_cls = None
mock_args.llm_config = None
mock_args.name = test_session_name # Set the session name
mock_args.file = None
mock_parse_args.return_value = mock_args
# Mock config
@@ -831,3 +834,93 @@ async def test_config_loading_order(
# Verify that run_session was called with the correct arguments
mock_run_session.assert_called_once()
@pytest.mark.asyncio
@patch('openhands.cli.main.parse_arguments')
@patch('openhands.cli.main.setup_config_from_args')
@patch('openhands.cli.main.FileSettingsStore.get_instance')
@patch('openhands.cli.main.check_folder_security_agreement')
@patch('openhands.cli.main.run_session')
@patch('openhands.cli.main.LLMSummarizingCondenserConfig')
@patch('openhands.cli.main.NoOpCondenserConfig')
@patch('builtins.open', new_callable=MagicMock)
async def test_main_with_file_option(
mock_open,
mock_noop_condenser,
mock_llm_condenser,
mock_run_session,
mock_check_security,
mock_get_settings_store,
mock_setup_config,
mock_parse_args,
):
"""Test main function with a file option."""
loop = asyncio.get_running_loop()
# Mock arguments
mock_args = MagicMock()
mock_args.agent_cls = None
mock_args.llm_config = None
mock_args.name = None
mock_args.file = '/path/to/test/file.txt'
mock_args.task = None
mock_parse_args.return_value = mock_args
# Mock config
mock_config = MagicMock()
mock_config.workspace_base = '/test/dir'
mock_config.cli_multiline_input = False
mock_setup_config.return_value = mock_config
# Mock settings store
mock_settings_store = AsyncMock()
mock_settings = MagicMock()
mock_settings.agent = 'test-agent'
mock_settings.llm_model = 'test-model'
mock_settings.llm_api_key = 'test-api-key'
mock_settings.llm_base_url = 'test-base-url'
mock_settings.confirmation_mode = True
mock_settings.enable_default_condenser = True
mock_settings_store.load.return_value = mock_settings
mock_get_settings_store.return_value = mock_settings_store
# Mock condenser config to return a mock instead of validating
mock_llm_condenser_instance = MagicMock()
mock_llm_condenser.return_value = mock_llm_condenser_instance
# Mock security check
mock_check_security.return_value = True
# Mock file open
mock_file = MagicMock()
mock_file.__enter__.return_value.read.return_value = 'This is a test file content.'
mock_open.return_value = mock_file
# Mock run_session to return False (no new session requested)
mock_run_session.return_value = False
# Run the function
await cli.main_with_loop(loop)
# Assertions
mock_parse_args.assert_called_once()
mock_setup_config.assert_called_once_with(mock_args)
mock_get_settings_store.assert_called_once()
mock_settings_store.load.assert_called_once()
mock_check_security.assert_called_once_with(mock_config, '/test/dir')
# Verify file was opened
mock_open.assert_called_once_with('/path/to/test/file.txt', 'r', encoding='utf-8')
# Check that run_session was called with expected arguments
mock_run_session.assert_called_once()
# Extract the task_str from the call
task_str = mock_run_session.call_args[0][4]
assert "The user has tagged a file '/path/to/test/file.txt'" in task_str
assert 'Please read and understand the following file content first:' in task_str
assert 'This is a test file content.' in task_str
assert (
'After reviewing the file, please ask the user what they would like to do with it.'
in task_str
)
+198
@@ -16,6 +16,7 @@ from openhands.cli.tui import (
display_usage_metrics,
display_welcome_message,
get_session_duration,
process_agent_pause,
read_confirmation_input,
)
from openhands.core.config import OpenHandsConfig
@@ -385,3 +386,200 @@ class TestReadConfirmationInput:
result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
assert result == 'no'
class TestProcessAgentPause:
@pytest.mark.asyncio
@patch('openhands.cli.tui.create_input')
@patch('openhands.cli.tui.print_formatted_text')
async def test_single_ctrl_c_stops_agent(self, mock_print, mock_create_input):
"""Test that a single Ctrl+C stops the agent gracefully."""
import asyncio
from prompt_toolkit.keys import Keys
# Mock the input to simulate a single Ctrl+C
mock_input = Mock()
mock_key_press = Mock()
mock_key_press.key = Keys.ControlC
mock_input.read_keys.return_value = [mock_key_press]
# Mock the context managers and simulate immediate key press
mock_input.raw_mode.return_value.__enter__ = Mock(return_value=None)
mock_input.raw_mode.return_value.__exit__ = Mock(return_value=None)
# Mock attach to immediately call the keys_ready function
def mock_attach(keys_ready_func):
# Simulate the key press by calling the function immediately
keys_ready_func()
# Return a mock context manager
mock_context = Mock()
mock_context.__enter__ = Mock(return_value=None)
mock_context.__exit__ = Mock(return_value=None)
return mock_context
mock_input.attach.side_effect = mock_attach
mock_create_input.return_value = mock_input
# Mock event stream
mock_event_stream = Mock()
mock_event_stream.add_event = Mock()
# Create done event
done = asyncio.Event()
# Run the function
await process_agent_pause(done, mock_event_stream)
# Verify agent was stopped gracefully
mock_event_stream.add_event.assert_called_once()
call_args = mock_event_stream.add_event.call_args[0]
assert call_args[0].agent_state.value == 'stopped'
# Verify the helpful message was displayed
mock_print.assert_called()
print_calls = [call.args[0] for call in mock_print.call_args_list]
helpful_message_found = any(
'Stopping agent' in str(call) and 'press Ctrl+C again' in str(call)
for call in print_calls
)
assert helpful_message_found, (
f'Expected helpful message not found in: {print_calls}'
)
# Verify done event was set
assert done.is_set()
@pytest.mark.asyncio
@patch('openhands.cli.tui.create_input')
@patch('openhands.cli.tui.print_formatted_text')
async def test_double_ctrl_c_raises_keyboard_interrupt(
self, mock_print, mock_create_input
):
"""Test that double Ctrl+C within 2 seconds raises KeyboardInterrupt."""
import asyncio
from prompt_toolkit.keys import Keys
# Mock the input to simulate double Ctrl+C
mock_input = Mock()
mock_key_press = Mock()
mock_key_press.key = Keys.ControlC
# Simulate two Ctrl+C presses within 2 seconds
call_count = 0
def mock_read_keys():
nonlocal call_count
call_count += 1
if call_count == 1:
# First call returns first Ctrl+C
return [mock_key_press]
elif call_count == 2:
# Second call returns second Ctrl+C (within 2 seconds)
return [mock_key_press]
else:
# Subsequent calls return empty to avoid infinite loop
return []
mock_input.read_keys.side_effect = mock_read_keys
# Mock the context managers and simulate double key press
mock_input.raw_mode.return_value.__enter__ = Mock(return_value=None)
mock_input.raw_mode.return_value.__exit__ = Mock(return_value=None)
# Mock attach to call the keys_ready function twice (simulating double Ctrl+C)
def mock_attach(keys_ready_func):
# Simulate first Ctrl+C
keys_ready_func()
# Simulate second Ctrl+C immediately (within 2 seconds)
keys_ready_func()
# Return a mock context manager
mock_context = Mock()
mock_context.__enter__ = Mock(return_value=None)
mock_context.__exit__ = Mock(return_value=None)
return mock_context
mock_input.attach.side_effect = mock_attach
mock_create_input.return_value = mock_input
# Mock event stream
mock_event_stream = Mock()
mock_event_stream.add_event = Mock()
# Create done event
done = asyncio.Event()
# Run the function and expect KeyboardInterrupt
with pytest.raises(KeyboardInterrupt):
await process_agent_pause(done, mock_event_stream)
# Verify force quit message was displayed
mock_print.assert_called()
print_calls = [call.args[0] for call in mock_print.call_args_list]
force_quit_message_found = any(
'Force quitting' in str(call) for call in print_calls
)
assert force_quit_message_found, (
f'Expected force quit message not found in: {print_calls}'
)
@pytest.mark.asyncio
@patch('openhands.cli.tui.create_input')
@patch('openhands.cli.tui.print_formatted_text')
async def test_ctrl_p_pauses_agent(self, mock_print, mock_create_input):
"""Test that Ctrl+P pauses the agent."""
import asyncio
from prompt_toolkit.keys import Keys
# Mock the input to simulate Ctrl+P
mock_input = Mock()
mock_key_press = Mock()
mock_key_press.key = Keys.ControlP
mock_input.read_keys.return_value = [mock_key_press]
# Mock the context managers and simulate immediate key press
mock_input.raw_mode.return_value.__enter__ = Mock(return_value=None)
mock_input.raw_mode.return_value.__exit__ = Mock(return_value=None)
# Mock attach to immediately call the keys_ready function
def mock_attach(keys_ready_func):
# Simulate the key press by calling the function immediately
keys_ready_func()
# Return a mock context manager
mock_context = Mock()
mock_context.__enter__ = Mock(return_value=None)
mock_context.__exit__ = Mock(return_value=None)
return mock_context
mock_input.attach.side_effect = mock_attach
mock_create_input.return_value = mock_input
# Mock event stream
mock_event_stream = Mock()
mock_event_stream.add_event = Mock()
# Create done event
done = asyncio.Event()
# Run the function
await process_agent_pause(done, mock_event_stream)
# Verify agent was paused
mock_event_stream.add_event.assert_called_once()
call_args = mock_event_stream.add_event.call_args[0]
assert call_args[0].agent_state.value == 'paused'
# Verify the pause message was displayed
mock_print.assert_called()
print_calls = [call.args[0] for call in mock_print.call_args_list]
pause_message_found = any(
'Pausing the agent' in str(call) for call in print_calls
)
assert pause_message_found, (
f'Expected pause message not found in: {print_calls}'
)
# Verify done event was set
assert done.is_set()
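The three tests above pin down the behavior described in the commit messages: a first Ctrl+C stops the agent, and a second one within 2 seconds raises KeyboardInterrupt. The timing rule can be sketched as below. This is only an illustration, not the actual `process_agent_pause` implementation; `CtrlCTracker`, its `window_seconds` parameter, and the injectable `clock` are hypothetical names.

```python
import time
from typing import Callable, Optional


class CtrlCTracker:
    """Hypothetical helper: decides what a Ctrl+C key press should do."""

    def __init__(
        self,
        window_seconds: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._window = window_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._last_press: Optional[float] = None

    def press(self) -> str:
        """Return 'stop' on a first press; raise on a quick second press."""
        now = self._clock()
        if self._last_press is not None and now - self._last_press <= self._window:
            # Second press inside the window: let the CLI's normal
            # KeyboardInterrupt handling perform the shutdown.
            raise KeyboardInterrupt
        self._last_press = now
        return 'stop'
```

Injecting the clock keeps a test deterministic, in the same spirit as the mocked `attach` callback above, which fires the key handler twice in immediate succession.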
@@ -1,24 +1,34 @@
from unittest.mock import MagicMock, patch
"""
Unit tests for ConversationWindowCondenser.
These tests mirror the tests for `_apply_conversation_window` in the AgentController,
but adapted to test the condenser implementation. The ConversationWindowCondenser
copies the functionality of the `_apply_conversation_window` function as closely as possible.
The tests verify that the condenser:
1. Identifies essential initial events (System Message, First User Message, Recall Action/Observation)
2. Keeps roughly half of the non-essential events from recent history
3. Handles dangling observations properly
4. Returns appropriate CondensationAction objects specifying which events to forget
"""
from unittest.mock import patch
import pytest
from openhands.controller.agent import Agent
from openhands.controller.agent_controller import AgentController
from openhands.controller.state.state import State
from openhands.core.config import OpenHandsConfig
from openhands.events import EventSource
from openhands.events.action import CmdRunAction, MessageAction, RecallAction
from openhands.events.action.agent import CondensationAction
from openhands.events.action.message import SystemMessageAction
from openhands.events.event import RecallType
from openhands.events.observation import (
CmdOutputObservation,
Observation,
RecallObservation,
)
from openhands.events.stream import EventStream
from openhands.llm.llm import LLM
from openhands.llm.metrics import Metrics
from openhands.storage.memory import InMemoryFileStore
from openhands.memory.condenser.condenser import Condensation, View
from openhands.memory.condenser.impl.conversation_window_condenser import (
ConversationWindowCondenser,
)
# Helper function to create events with sequential IDs and causes
@@ -86,44 +96,20 @@ def create_events(event_data):
@pytest.fixture
def controller_fixture():
mock_agent = MagicMock(spec=Agent)
mock_agent.llm = MagicMock(spec=LLM)
mock_agent.llm.metrics = Metrics()
mock_agent.llm.config = OpenHandsConfig().get_llm_config()
mock_agent.config = OpenHandsConfig().get_agent_config('CodeActAgent')
mock_event_stream = MagicMock(spec=EventStream)
mock_event_stream.sid = 'test_sid'
mock_event_stream.file_store = InMemoryFileStore({})
# Ensure get_latest_event_id returns an integer
mock_event_stream.get_latest_event_id.return_value = -1
# Create a state with iteration_flag.max_value set to 10
state = State(inputs={}, session_id='test_sid')
state.iteration_flag.max_value = 10
controller = AgentController(
agent=mock_agent,
event_stream=mock_event_stream,
iteration_delta=1, # Add the required iteration_delta parameter
sid='test_sid',
initial_state=state,
)
# Don't mock _first_user_message anymore since we need it to work with history
return controller
def condenser_fixture():
condenser = ConversationWindowCondenser()
return condenser
# =============================================
# Test Cases for _apply_conversation_window
# Test Cases for ConversationWindowCondenser
# =============================================
def test_basic_truncation(controller_fixture):
controller = controller_fixture
def test_basic_truncation(condenser_fixture):
condenser = condenser_fixture
controller.state.history = create_events(
events = create_events(
[
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
{
@@ -156,6 +142,7 @@ def test_basic_truncation(controller_fixture):
}, # 10
]
)
view = View(events=events)
# Calculation (RecallAction now essential):
# History len = 10
@@ -167,21 +154,22 @@ def test_basic_truncation(controller_fixture):
# Validation: remove leading obs2(8). validated_slice = [cmd3(9), obs3(10)]
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4), cmd3(9), obs3(10)]
# Expected IDs: [1, 2, 3, 4, 9, 10]. Length 6.
truncated_events = controller._apply_conversation_window(controller.state.history)
# Forgotten IDs: [5, 6, 7, 8]
condensation = condenser.get_condensation(view)
assert len(truncated_events) == 6
expected_ids = [1, 2, 3, 4, 9, 10]
actual_ids = [e.id for e in truncated_events]
assert actual_ids == expected_ids
# Check no dangling observations at the start of the recent slice part
# The first event of the validated slice is cmd3(9)
assert not isinstance(truncated_events[4], Observation) # Index adjusted
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
# Check the forgotten event IDs
forgotten_ids = condensation.action.forgotten
expected_forgotten = [5, 6, 7, 8]
assert sorted(forgotten_ids) == expected_forgotten
def test_no_system_message(controller_fixture):
controller = controller_fixture
def test_no_system_message(condenser_fixture):
condenser = condenser_fixture
controller.state.history = create_events(
events = create_events(
[
{
'type': MessageAction,
@@ -213,7 +201,7 @@ def test_no_system_message(controller_fixture):
}, # 9
]
)
# No longer need to set mock ID
view = View(events=events)
# Calculation (RecallAction now essential):
# History len = 9
@@ -224,19 +212,22 @@ def test_no_system_message(controller_fixture):
# recent_events_slice = history[6:] = [obs2(7), cmd3(8), obs3(9)]
# Validation: remove leading obs2(7). validated_slice = [cmd3(8), obs3(9)]
# Final = essentials + validated_slice = [user(1), recall_act(2), recall_obs(3), cmd3(8), obs3(9)]
# Expected IDs: [1, 2, 3, 8, 9]. Length 5.
truncated_events = controller._apply_conversation_window(controller.state.history)
# Expected kept IDs: [1, 2, 3, 8, 9]. Length 5.
# Forgotten IDs: [4, 5, 6, 7]
condensation = condenser.get_condensation(view)
assert len(truncated_events) == 5
expected_ids = [1, 2, 3, 8, 9]
actual_ids = [e.id for e in truncated_events]
assert actual_ids == expected_ids
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
forgotten_ids = condensation.action.forgotten
expected_forgotten = [4, 5, 6, 7]
assert sorted(forgotten_ids) == expected_forgotten
def test_no_recall_observation(controller_fixture):
controller = controller_fixture
def test_no_recall_observation(condenser_fixture):
condenser = condenser_fixture
controller.state.history = create_events(
events = create_events(
[
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
{
@@ -269,29 +260,33 @@ def test_no_recall_observation(controller_fixture):
}, # 9
]
)
view = View(events=events)
# Calculation (RecallAction essential only if RecallObs exists):
# Calculation (RecallAction essential even without RecallObs in condenser):
# History len = 9
# Essentials = [sys(1), user(2)] (len=2) - RecallObs missing, so RecallAction not essential here
# Non-essential count = 9 - 2 = 7
# num_recent_to_keep = max(1, 7 // 2) = 3
# Essentials = [sys(1), user(2), recall_action(3)] (len=3)
# Non-essential count = 9 - 3 = 6
# num_recent_to_keep = max(1, 6 // 2) = 3
# slice_start_index = 9 - 3 = 6
# recent_events_slice = history[6:] = [obs2(7), cmd3(8), obs3(9)]
# Validation: remove leading obs2(7). validated_slice = [cmd3(8), obs3(9)]
# Final = essentials + validated_slice = [sys(1), user(2), recall_action(3), cmd_cat(8), obs_cat(9)]
# Expected IDs: [1, 2, 3, 8, 9]. Length 5.
truncated_events = controller._apply_conversation_window(controller.state.history)
# Expected kept IDs: [1, 2, 3, 8, 9]. Length 5.
# Forgotten IDs: [4, 5, 6, 7]
condensation = condenser.get_condensation(view)
assert len(truncated_events) == 5
expected_ids = [1, 2, 3, 8, 9]
actual_ids = [e.id for e in truncated_events]
assert actual_ids == expected_ids
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
forgotten_ids = condensation.action.forgotten
expected_forgotten = [4, 5, 6, 7]
assert sorted(forgotten_ids) == expected_forgotten
def test_short_history_no_truncation(controller_fixture):
controller = controller_fixture
def test_short_history_no_truncation(condenser_fixture):
condenser = condenser_fixture
history = create_events(
events = create_events(
[
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
{
@@ -310,7 +305,7 @@ def test_short_history_no_truncation(controller_fixture):
}, # 6
]
)
controller.state.history = history
view = View(events=events)
# Calculation (RecallAction now essential):
# History len = 6
@@ -321,19 +316,22 @@ def test_short_history_no_truncation(controller_fixture):
# recent_events_slice = history[5:] = [obs1(6)]
# Validation: remove leading obs1(6). validated_slice = []
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4)]
# Expected IDs: [1, 2, 3, 4]. Length 4.
truncated_events = controller._apply_conversation_window(controller.state.history)
# Expected kept IDs: [1, 2, 3, 4]. Length 4.
# Forgotten IDs: [5, 6]
condensation = condenser.get_condensation(view)
assert len(truncated_events) == 4
expected_ids = [1, 2, 3, 4]
actual_ids = [e.id for e in truncated_events]
assert actual_ids == expected_ids
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
forgotten_ids = condensation.action.forgotten
expected_forgotten = [5, 6]
assert sorted(forgotten_ids) == expected_forgotten
def test_only_essential_events(controller_fixture):
controller = controller_fixture
def test_only_essential_events(condenser_fixture):
condenser = condenser_fixture
history = create_events(
events = create_events(
[
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
{
@@ -345,7 +343,7 @@ def test_only_essential_events(controller_fixture):
{'type': RecallObservation, 'content': 'Recall result', 'cause_id': 3}, # 4
]
)
controller.state.history = history
view = View(events=events)
# Calculation (RecallAction now essential):
# History len = 4
@@ -356,19 +354,22 @@ def test_only_essential_events(controller_fixture):
# recent_events_slice = history[3:] = [recall_obs(4)]
# Validation: remove leading recall_obs(4). validated_slice = []
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4)]
# Expected IDs: [1, 2, 3, 4]. Length 4.
truncated_events = controller._apply_conversation_window(controller.state.history)
# Expected kept IDs: [1, 2, 3, 4]. Length 4.
# Forgotten IDs: []
condensation = condenser.get_condensation(view)
assert len(truncated_events) == 4
expected_ids = [1, 2, 3, 4]
actual_ids = [e.id for e in truncated_events]
assert actual_ids == expected_ids
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
forgotten_ids = condensation.action.forgotten
expected_forgotten = []
assert forgotten_ids == expected_forgotten
def test_dangling_observations_at_cut_point(controller_fixture):
controller = controller_fixture
def test_dangling_observations_at_cut_point(condenser_fixture):
condenser = condenser_fixture
history_forced_dangle = create_events(
events = create_events(
[
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
{
@@ -405,7 +406,7 @@ def test_dangling_observations_at_cut_point(controller_fixture):
}, # 10
]
) # 10 events total
controller.state.history = history_forced_dangle
view = View(events=events)
# Calculation (RecallAction now essential):
# History len = 10
@@ -416,20 +417,22 @@ def test_dangling_observations_at_cut_point(controller_fixture):
# recent_events_slice = history[7:] = [obs1(8), cmd2(9), obs2(10)]
# Validation: remove leading obs1(8). validated_slice = [cmd2(9), obs2(10)]
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4), cmd2(9), obs2(10)]
# Expected IDs: [1, 2, 3, 4, 9, 10]. Length 6.
truncated_events = controller._apply_conversation_window(controller.state.history)
# Expected kept IDs: [1, 2, 3, 4, 9, 10]. Length 6.
# Forgotten IDs: [5, 6, 7, 8]
condensation = condenser.get_condensation(view)
assert len(truncated_events) == 6
expected_ids = [1, 2, 3, 4, 9, 10]
actual_ids = [e.id for e in truncated_events]
assert actual_ids == expected_ids
# Verify dangling observations 5 and 6 were removed (implicitly by slice start and validation)
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
forgotten_ids = condensation.action.forgotten
expected_forgotten = [5, 6, 7, 8]
assert sorted(forgotten_ids) == expected_forgotten
def test_only_dangling_observations_in_recent_slice(controller_fixture):
controller = controller_fixture
def test_only_dangling_observations_in_recent_slice(condenser_fixture):
condenser = condenser_fixture
history = create_events(
events = create_events(
[
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
{
@@ -452,7 +455,7 @@ def test_only_dangling_observations_in_recent_slice(controller_fixture):
}, # 6 (Dangling)
]
) # 6 events total
controller.state.history = history
view = View(events=events)
# Calculation (RecallAction now essential):
# History len = 6
@@ -463,43 +466,44 @@ def test_only_dangling_observations_in_recent_slice(controller_fixture):
# recent_events_slice = history[5:] = [dangle2(6)]
# Validation: remove leading dangle2(6). validated_slice = [] (Corrected based on user feedback/bugfix)
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4)]
# Expected IDs: [1, 2, 3, 4]. Length 4.
# Expected kept IDs: [1, 2, 3, 4]. Length 4.
# Forgotten IDs: [5, 6]
with patch(
'openhands.controller.agent_controller.logger.warning'
'openhands.memory.condenser.impl.conversation_window_condenser.logger.warning'
) as mock_log_warning:
truncated_events = controller._apply_conversation_window(
controller.state.history
)
condensation = condenser.get_condensation(view)
assert len(truncated_events) == 4
expected_ids = [1, 2, 3, 4]
actual_ids = [e.id for e in truncated_events]
assert actual_ids == expected_ids
# Verify dangling observations 5 and 6 were removed
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
forgotten_ids = condensation.action.forgotten
expected_forgotten = [5, 6]
assert sorted(forgotten_ids) == expected_forgotten
# Check that the specific warning was logged exactly once
assert mock_log_warning.call_count == 1
# Check the essential parts of the arguments, allowing for variations like stacklevel
# Check the essential parts of the arguments
call_args, call_kwargs = mock_log_warning.call_args
expected_message_substring = 'All recent events are dangling observations, which we truncate. This means the agent has only the essential first events. This should not happen.'
assert expected_message_substring in call_args[0]
assert 'extra' in call_kwargs
assert call_kwargs['extra'].get('session_id') == 'test_sid'
def test_empty_history(controller_fixture):
controller = controller_fixture
controller.state.history = []
def test_empty_history(condenser_fixture):
condenser = condenser_fixture
view = View(events=[])
truncated_events = controller._apply_conversation_window(controller.state.history)
assert truncated_events == []
condensation = condenser.get_condensation(view)
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
assert condensation.action.forgotten == []
def test_multiple_user_messages(controller_fixture):
controller = controller_fixture
def test_multiple_user_messages(condenser_fixture):
condenser = condenser_fixture
history = create_events(
events = create_events(
[
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
{
@@ -540,7 +544,7 @@ def test_multiple_user_messages(controller_fixture):
}, # 11
]
) # 11 events total
controller.state.history = history
view = View(events=events)
# Calculation (RecallAction now essential):
# History len = 11
@@ -551,15 +555,18 @@ def test_multiple_user_messages(controller_fixture):
# recent_events_slice = history[8:] = [recall_obs2(9), cmd2(10), obs2(11)]
# Validation: remove leading recall_obs2(9). validated_slice = [cmd2(10), obs2(11)]
# Final = essentials + validated_slice = [sys(1), user1(2), recall_act1(3), recall_obs1(4)] + [cmd2(10), obs2(11)]
# Expected IDs: [1, 2, 3, 4, 10, 11]. Length 6.
truncated_events = controller._apply_conversation_window(controller.state.history)
# Expected kept IDs: [1, 2, 3, 4, 10, 11]. Length 6.
# Forgotten IDs: [5, 6, 7, 8, 9]
condensation = condenser.get_condensation(view)
assert len(truncated_events) == 6
expected_ids = [1, 2, 3, 4, 10, 11]
actual_ids = [e.id for e in truncated_events]
assert actual_ids == expected_ids
assert isinstance(condensation, Condensation)
assert isinstance(condensation.action, CondensationAction)
# Verify the second user message (ID 7) was NOT kept
assert not any(event.id == 7 for event in truncated_events)
# Verify the first user message (ID 2) is present
assert any(event.id == 2 for event in truncated_events)
forgotten_ids = condensation.action.forgotten
expected_forgotten = [5, 6, 7, 8, 9]
assert sorted(forgotten_ids) == expected_forgotten
# Additional validation: ensure that only the first user message is kept
kept_event_ids = set(range(1, 12)) - set(forgotten_ids)
assert 2 in kept_event_ids # First user message kept
assert 7 not in kept_event_ids # Second user message forgotten
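The "Calculation" comments in these tests all follow the same arithmetic: keep the essential events, keep roughly half of the remaining events from the end of the history, drop any observations that lead the recent slice, and forget everything else. A minimal sketch of that arithmetic follows, assuming a simplified event model. `Event`, `is_observation`, and `forgotten_ids` are hypothetical stand-ins, not the `ConversationWindowCondenser` API.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for an OpenHands event."""

    id: int
    is_observation: bool = False


def forgotten_ids(events: List[Event], essential_ids: Set[int]) -> List[int]:
    """Return the IDs dropped by the window rule described in the tests."""
    if not events:
        return []
    essentials = [e for e in events if e.id in essential_ids]
    non_essential_count = len(events) - len(essentials)
    num_recent_to_keep = max(1, non_essential_count // 2)
    slice_start = len(events) - num_recent_to_keep
    recent = events[slice_start:]
    # Validation: drop leading observations, whose actions were cut off.
    first_kept = 0
    while first_kept < len(recent) and recent[first_kept].is_observation:
        first_kept += 1
    kept = {e.id for e in essentials} | {e.id for e in recent[first_kept:]}
    return [e.id for e in events if e.id not in kept]
```

With the ten-event history of `test_basic_truncation` (essentials 1 to 4, then three command/observation pairs), the sketch keeps `[1, 2, 3, 4, 9, 10]` and forgets `[5, 6, 7, 8]`, matching the expected values in the test.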
+10 -6
@@ -6,7 +6,11 @@ from unittest.mock import patch
import pytest
from openhands.core.config import LLMConfig, OpenHandsConfig
from openhands.core.logger import OpenHandsLoggerAdapter, json_log_handler
from openhands.core.logger import (
LOG_JSON_LEVEL_KEY,
OpenHandsLoggerAdapter,
json_log_handler,
)
from openhands.core.logger import openhands_logger as openhands_logger
@@ -139,7 +143,7 @@ class TestJsonOutput:
output = json.loads(string_io.getvalue())
assert 'timestamp' in output
del output['timestamp']
assert output == {'message': 'Test message', 'level': 'INFO'}
assert output == {'message': 'Test message', LOG_JSON_LEVEL_KEY: 'INFO'}
def test_error(self, json_handler):
logger, string_io = json_handler
@@ -147,7 +151,7 @@ class TestJsonOutput:
logger.error('Test message')
output = json.loads(string_io.getvalue())
del output['timestamp']
assert output == {'message': 'Test message', 'level': 'ERROR'}
assert output == {'message': 'Test message', LOG_JSON_LEVEL_KEY: 'ERROR'}
def test_extra_fields(self, json_handler):
logger, string_io = json_handler
@@ -158,7 +162,7 @@ class TestJsonOutput:
assert output == {
'key': '..val..',
'message': 'Test message',
'level': 'INFO',
LOG_JSON_LEVEL_KEY: 'INFO',
}
def test_extra_fields_from_adapter(self, json_handler):
@@ -171,7 +175,7 @@ class TestJsonOutput:
'context_field': '..val..',
'log_fied': '..val..',
'message': 'Test message',
'level': 'INFO',
LOG_JSON_LEVEL_KEY: 'INFO',
}
def test_extra_fields_from_adapter_can_override(self, json_handler):
@@ -183,5 +187,5 @@ class TestJsonOutput:
assert output == {
'override': 'b',
'message': 'Test message',
'level': 'INFO',
LOG_JSON_LEVEL_KEY: 'INFO',
}
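The assertions above stop hard-coding `'level'` and use the imported `LOG_JSON_LEVEL_KEY` constant instead, so the tests keep passing if the key name is configured differently. A minimal standard-library sketch of a JSON formatter with a configurable level key is shown below. `JsonFormatter` and `make_logger` are hypothetical names, and how OpenHands actually derives `LOG_JSON_LEVEL_KEY` is not shown in this diff.

```python
import io
import json
import logging
from typing import Tuple


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that emits the level under a chosen key."""

    def __init__(self, level_key: str = 'level') -> None:
        super().__init__()
        self.level_key = level_key

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            'message': record.getMessage(),
            self.level_key: record.levelname,
        }
        return json.dumps(payload)


def make_logger(level_key: str) -> Tuple[logging.Logger, io.StringIO]:
    """Build an isolated logger that writes JSON lines to a StringIO."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(level_key))
    logger = logging.getLogger(f'sketch.{level_key}')
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output out of the root logger
    return logger, stream
```

A test written against the key variable rather than the literal string stays valid for any key name, which is the design choice this diff applies.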
+72
@@ -201,3 +201,75 @@ This microagent has an invalid type.
assert '"knowledge"' in error_msg
assert '"repo"' in error_msg
assert '"task"' in error_msg
def test_cursorrules_file_load():
"""Test loading .cursorrules file as a RepoMicroagent."""
cursorrules_content = """Always use Python for new files.
Follow the existing code style.
Add proper error handling."""
cursorrules_path = Path('.cursorrules')
# Test loading .cursorrules file directly
agent = BaseMicroagent.load(cursorrules_path, file_content=cursorrules_content)
# Verify it's loaded as a RepoMicroagent
assert isinstance(agent, RepoMicroagent)
assert agent.name == 'cursorrules'
assert agent.content == cursorrules_content
assert agent.type == MicroagentType.REPO_KNOWLEDGE
assert agent.metadata.name == 'cursorrules'
assert agent.source == str(cursorrules_path)
@pytest.fixture
def temp_microagents_dir_with_cursorrules():
"""Create a temporary directory with test microagents and .cursorrules file."""
with tempfile.TemporaryDirectory() as temp_dir:
root = Path(temp_dir)
# Create .openhands/microagents directory structure
microagents_dir = root / '.openhands' / 'microagents'
microagents_dir.mkdir(parents=True, exist_ok=True)
# Create .cursorrules file in repository root
cursorrules_content = """Always use TypeScript for new files.
Follow the existing code style."""
(root / '.cursorrules').write_text(cursorrules_content)
# Create test repo agent
repo_agent = """---
# type: repo
version: 1.0.0
agent: CodeActAgent
---
# Test Repository Agent
Repository-specific test instructions.
"""
(microagents_dir / 'repo.md').write_text(repo_agent)
yield root
def test_load_microagents_with_cursorrules(temp_microagents_dir_with_cursorrules):
"""Test loading microagents when .cursorrules file exists."""
microagents_dir = (
temp_microagents_dir_with_cursorrules / '.openhands' / 'microagents'
)
repo_agents, knowledge_agents = load_microagents_from_dir(microagents_dir)
# Verify that .cursorrules file was loaded as a RepoMicroagent
assert len(repo_agents) == 2 # repo.md + .cursorrules
assert 'repo' in repo_agents
assert 'cursorrules' in repo_agents
# Check .cursorrules agent
cursorrules_agent = repo_agents['cursorrules']
assert isinstance(cursorrules_agent, RepoMicroagent)
assert cursorrules_agent.name == 'cursorrules'
assert 'Always use TypeScript for new files' in cursorrules_agent.content
assert cursorrules_agent.type == MicroagentType.REPO_KNOWLEDGE
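In `test_load_microagents_with_cursorrules`, the loader is given `<root>/.openhands/microagents` while `.cursorrules` sits in `<root>`, so the lookup has to walk two levels up from the microagents directory. A minimal sketch of that lookup, assuming exactly this layout; `find_cursorrules` is a hypothetical helper, not a function from `openhands.microagent`.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_cursorrules(microagents_dir: Path) -> Optional[Path]:
    """Return <repo root>/.cursorrules if it exists, else None.

    Assumes microagents_dir is <repo root>/.openhands/microagents,
    as in the fixture above.
    """
    repo_root = microagents_dir.parent.parent
    candidate = repo_root / '.cursorrules'
    return candidate if candidate.is_file() else None


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as temp_dir:
        root = Path(temp_dir)
        microagents_dir = root / '.openhands' / 'microagents'
        microagents_dir.mkdir(parents=True)
        (root / '.cursorrules').write_text('Always use TypeScript for new files.')
        print(find_cursorrules(microagents_dir))
```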
+7 -1
@@ -1,6 +1,7 @@
"""Tests for the custom secrets API endpoints."""
# flake8: noqa: E501
import os
from unittest.mock import AsyncMock, patch
import pytest
@@ -24,7 +25,12 @@ def test_client():
"""Create a test client for the settings API."""
app = FastAPI()
app.include_router(secrets_app)
return TestClient(app)
# Blank out SESSION_API_KEY in the environment to disable authentication in tests
with patch.dict(os.environ, {'SESSION_API_KEY': ''}, clear=False):
# Also set the module-level _SESSION_API_KEY to None to disable the auth dependency
with patch('openhands.server.dependencies._SESSION_API_KEY', None):
yield TestClient(app)
@pytest.fixture
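The fixture change above swaps `return` for `yield` inside the `with` blocks, so both patches stay active while the test runs and are undone afterwards. A self-contained sketch of that pattern follows. It uses a `SimpleNamespace` as a hypothetical stand-in for `openhands.server.dependencies`, and `auth_enabled` is an invented check rather than the real dependency.

```python
import os
import types
from typing import Iterator
from unittest.mock import patch

# Hypothetical stand-in for a module holding a module-level API key.
deps = types.SimpleNamespace(_SESSION_API_KEY='secret')


def auth_enabled() -> bool:
    """Invented check: auth is on whenever a key is configured."""
    return deps._SESSION_API_KEY is not None


def client_fixture() -> Iterator[bool]:
    """Generator shaped like a pytest yield-fixture."""
    with patch.dict(os.environ, {'SESSION_API_KEY': ''}, clear=False):
        with patch.object(deps, '_SESSION_API_KEY', None):
            # Patches are active for as long as the consumer holds this value.
            yield auth_enabled()
    # Leaving the with blocks restores the environment and the attribute.
```

With `return` instead of `yield`, the context managers would exit before the caller ever used the returned client, and the patches would no longer apply during the test.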
+3
@@ -1,3 +1,4 @@
import os
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
@@ -54,6 +55,8 @@ class MockUserAuth(UserAuth):
def test_client():
# Create a test client
with (
patch.dict(os.environ, {'SESSION_API_KEY': ''}, clear=False),
patch('openhands.server.dependencies._SESSION_API_KEY', None),
patch(
'openhands.server.user_auth.user_auth.UserAuth.get_instance',
return_value=MockUserAuth(),
@@ -1,3 +1,4 @@
import os
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
@@ -28,6 +29,8 @@ async def get_settings_store(request):
def test_client():
# Create a test client
with (
patch.dict(os.environ, {'SESSION_API_KEY': ''}, clear=False),
patch('openhands.server.dependencies._SESSION_API_KEY', None),
patch(
'openhands.server.routes.secrets.check_provider_tokens',
AsyncMock(return_value=''),
+217
@@ -0,0 +1,217 @@
"""Tests for user directory microagent loading."""
import tempfile
from pathlib import Path
from unittest.mock import patch
import pytest
from openhands.events.stream import EventStream
from openhands.memory.memory import Memory
from openhands.microagent import KnowledgeMicroagent, MicroagentType, RepoMicroagent
from openhands.storage import get_file_store
@pytest.fixture
def temp_user_microagents_dir():
"""Create a temporary directory to simulate ~/.openhands/microagents/."""
with tempfile.TemporaryDirectory() as temp_dir:
user_dir = Path(temp_dir)
# Create test knowledge agent
knowledge_agent = """---
name: user_knowledge
version: 1.0.0
agent: CodeActAgent
triggers:
- user-test
- personal
---
# User Knowledge Agent
Personal knowledge and guidelines.
"""
(user_dir / 'user_knowledge.md').write_text(knowledge_agent)
# Create test repo agent
repo_agent = """---
name: user_repo
version: 1.0.0
agent: CodeActAgent
---
# User Repository Agent
Personal repository-specific instructions.
"""
(user_dir / 'user_repo.md').write_text(repo_agent)
yield user_dir
def test_user_microagents_loading(temp_user_microagents_dir):
"""Test that user microagents are loaded from ~/.openhands/microagents/."""
with patch(
'openhands.memory.memory.USER_MICROAGENTS_DIR', str(temp_user_microagents_dir)
):
with tempfile.TemporaryDirectory() as temp_dir:
# Create event stream and memory
file_store = get_file_store('local', temp_dir)
event_stream = EventStream('test', file_store)
memory = Memory(event_stream, 'test_sid')
# Check that user microagents were loaded
assert 'user_knowledge' in memory.knowledge_microagents
assert 'user_repo' in memory.repo_microagents
# Verify the loaded agents
user_knowledge = memory.knowledge_microagents['user_knowledge']
assert isinstance(user_knowledge, KnowledgeMicroagent)
assert user_knowledge.type == MicroagentType.KNOWLEDGE
assert 'user-test' in user_knowledge.triggers
assert 'personal' in user_knowledge.triggers
user_repo = memory.repo_microagents['user_repo']
assert isinstance(user_repo, RepoMicroagent)
assert user_repo.type == MicroagentType.REPO_KNOWLEDGE
def test_user_microagents_directory_creation():
"""Test that user microagents directory is created if it doesn't exist."""
with tempfile.TemporaryDirectory() as temp_dir:
non_existent_dir = Path(temp_dir) / 'non_existent' / 'microagents'
with patch(
'openhands.memory.memory.USER_MICROAGENTS_DIR', str(non_existent_dir)
):
with tempfile.TemporaryDirectory() as temp_store_dir:
# Create event stream and memory
file_store = get_file_store('local', temp_store_dir)
event_stream = EventStream('test', file_store)
Memory(event_stream, 'test_sid')
# Check that the directory was created
assert non_existent_dir.exists()
assert non_existent_dir.is_dir()
def test_user_microagents_override_global():
"""Test that user microagents can override global ones with the same name."""
with tempfile.TemporaryDirectory() as temp_dir:
user_dir = Path(temp_dir)
# Create a user microagent with the same name as a global one
# (assuming there's a global 'github' microagent)
github_agent = """---
name: github
version: 1.0.0
agent: CodeActAgent
triggers:
- github
- git
---
# Personal GitHub Agent
My personal GitHub workflow and preferences.
"""
(user_dir / 'github.md').write_text(github_agent)
with patch('openhands.memory.memory.USER_MICROAGENTS_DIR', str(user_dir)):
with tempfile.TemporaryDirectory() as temp_store_dir:
# Create event stream and memory
file_store = get_file_store('local', temp_store_dir)
event_stream = EventStream('test', file_store)
memory = Memory(event_stream, 'test_sid')
# Check that the user microagent is loaded
if 'github' in memory.knowledge_microagents:
github_microagent = memory.knowledge_microagents['github']
# The user version should contain our personal content
assert 'My personal GitHub workflow' in github_microagent.content


def test_user_microagents_loading_error_handling():
    """Test error handling when user microagents directory has issues."""
    with tempfile.TemporaryDirectory() as temp_dir:
        user_dir = Path(temp_dir)
        # Create an invalid microagent file
        invalid_agent = """---
name: invalid
type: invalid_type
---
# Invalid Agent
"""
        (user_dir / 'invalid.md').write_text(invalid_agent)
        with patch('openhands.memory.memory.USER_MICROAGENTS_DIR', str(user_dir)):
            with tempfile.TemporaryDirectory() as temp_store_dir:
                # Create event stream and memory - should not crash
                file_store = get_file_store('local', temp_store_dir)
                event_stream = EventStream('test', file_store)
                memory = Memory(event_stream, 'test_sid')
                # Memory should still be created despite the invalid microagent
                assert memory is not None
                # The invalid microagent should not be loaded
                assert 'invalid' not in memory.knowledge_microagents
                assert 'invalid' not in memory.repo_microagents


def test_user_microagents_empty_directory():
    """Test behavior when user microagents directory is empty."""
    with tempfile.TemporaryDirectory() as temp_dir:
        empty_dir = Path(temp_dir)
        with patch('openhands.memory.memory.USER_MICROAGENTS_DIR', str(empty_dir)):
            with tempfile.TemporaryDirectory() as temp_store_dir:
                # Create event stream and memory
                file_store = get_file_store('local', temp_store_dir)
                event_stream = EventStream('test', file_store)
                memory = Memory(event_stream, 'test_sid')
                # Memory should be created successfully
                assert memory is not None
                # No user microagents should be loaded, but global ones might be
                # (we can't assert the exact count since global microagents may exist)


def test_user_microagents_nested_directories(temp_user_microagents_dir):
    """Test loading user microagents from nested directories."""
    # Create nested microagent
    nested_dir = temp_user_microagents_dir / 'personal' / 'tools'
    nested_dir.mkdir(parents=True)
    nested_agent = """---
name: personal_tool
version: 1.0.0
agent: CodeActAgent
triggers:
- personal-tool
---
# Personal Tool Agent
My personal development tools and workflows.
"""
    (nested_dir / 'tool.md').write_text(nested_agent)
    with patch(
        'openhands.memory.memory.USER_MICROAGENTS_DIR', str(temp_user_microagents_dir)
    ):
        with tempfile.TemporaryDirectory() as temp_store_dir:
            # Create event stream and memory
            file_store = get_file_store('local', temp_store_dir)
            event_stream = EventStream('test', file_store)
            memory = Memory(event_stream, 'test_sid')
            # Check that nested microagent was loaded
            # The name should be derived from the relative path
            assert 'personal/tools/tool' in memory.knowledge_microagents
            nested_microagent = memory.knowledge_microagents['personal/tools/tool']
            assert isinstance(nested_microagent, KnowledgeMicroagent)
            assert 'personal-tool' in nested_microagent.triggers
+164 -1
@@ -1,4 +1,4 @@
-from openhands.events.action.agent import CondensationAction
+from openhands.events.action.agent import CondensationAction, CondensationRequestAction
from openhands.events.action.message import MessageAction
from openhands.events.event import Event
from openhands.events.observation.agent import AgentCondensationObservation
@@ -98,6 +98,169 @@ def test_no_condensation_action_in_view() -> None:
    assert len(view) == 3  # Event 1, Event 2, Event 3 (Event 0 was forgotten)


def test_unhandled_condensation_request_with_no_condensation() -> None:
    """Test that unhandled_condensation_request is True when there's a CondensationRequestAction but no CondensationAction."""
    events: list[Event] = [
        MessageAction(content='Event 0'),
        MessageAction(content='Event 1'),
        CondensationRequestAction(),
        MessageAction(content='Event 2'),
    ]
    set_ids(events)
    view = View.from_events(events)
    # Should be marked as having an unhandled condensation request
    assert view.unhandled_condensation_request is True
    # CondensationRequestAction should be removed from the view
    assert len(view) == 3  # Only the MessageActions remain
    for event in view:
        assert not isinstance(event, CondensationRequestAction)


def test_handled_condensation_request_with_condensation_action() -> None:
    """Test that unhandled_condensation_request is False when CondensationAction comes after CondensationRequestAction."""
    events: list[Event] = [
        MessageAction(content='Event 0'),
        MessageAction(content='Event 1'),
        CondensationRequestAction(),
        MessageAction(content='Event 2'),
        CondensationAction(forgotten_event_ids=[0, 1]),  # Handles the request
        MessageAction(content='Event 3'),
    ]
    set_ids(events)
    view = View.from_events(events)
    # Should NOT be marked as having an unhandled condensation request
    assert view.unhandled_condensation_request is False
    # Both CondensationRequestAction and CondensationAction should be removed from the view
    assert len(view) == 2  # Event 2 and Event 3 (Event 0, 1 forgotten)
    for event in view:
        assert not isinstance(event, CondensationRequestAction)
        assert not isinstance(event, CondensationAction)


def test_multiple_condensation_requests_pattern() -> None:
    """Test the pattern with multiple condensation requests and actions."""
    events: list[Event] = [
        MessageAction(content='Event 0'),
        CondensationRequestAction(),  # First request
        MessageAction(content='Event 1'),
        CondensationAction(forgotten_event_ids=[0]),  # Handles first request
        MessageAction(content='Event 2'),
        CondensationRequestAction(),  # Second request - should be unhandled
        MessageAction(content='Event 3'),
    ]
    set_ids(events)
    view = View.from_events(events)
    # Should be marked as having an unhandled condensation request (the second one)
    assert view.unhandled_condensation_request is True
    # Both CondensationRequestActions and CondensationAction should be removed from the view
    assert len(view) == 3  # Event 1, Event 2, Event 3 (Event 0 forgotten)
    for event in view:
        assert not isinstance(event, CondensationRequestAction)
        assert not isinstance(event, CondensationAction)


def test_condensation_action_before_request() -> None:
    """Test that CondensationAction before CondensationRequestAction doesn't affect the unhandled status."""
    events: list[Event] = [
        MessageAction(content='Event 0'),
        CondensationAction(
            forgotten_event_ids=[]
        ),  # This doesn't handle the later request
        MessageAction(content='Event 1'),
        CondensationRequestAction(),  # This should be unhandled
        MessageAction(content='Event 2'),
    ]
    set_ids(events)
    view = View.from_events(events)
    # Should be marked as having an unhandled condensation request
    assert view.unhandled_condensation_request is True
    # Both CondensationRequestAction and CondensationAction should be removed from the view
    assert len(view) == 3  # Event 0, Event 1, Event 2
    for event in view:
        assert not isinstance(event, CondensationRequestAction)
        assert not isinstance(event, CondensationAction)


def test_no_condensation_events() -> None:
    """Test that unhandled_condensation_request is False when there are no condensation events."""
    events: list[Event] = [
        MessageAction(content='Event 0'),
        MessageAction(content='Event 1'),
        MessageAction(content='Event 2'),
    ]
    set_ids(events)
    view = View.from_events(events)
    # Should NOT be marked as having an unhandled condensation request
    assert view.unhandled_condensation_request is False
    # All events should remain
    assert len(view) == 3
    assert view.events == events


def test_only_condensation_action() -> None:
    """Test behavior when there's only a CondensationAction (no request)."""
    events: list[Event] = [
        MessageAction(content='Event 0'),
        MessageAction(content='Event 1'),
        CondensationAction(forgotten_event_ids=[0]),
        MessageAction(content='Event 2'),
    ]
    set_ids(events)
    view = View.from_events(events)
    # Should NOT be marked as having an unhandled condensation request
    assert view.unhandled_condensation_request is False
    # CondensationAction should be removed, Event 0 should be forgotten
    assert len(view) == 2  # Event 1, Event 2
    for event in view:
        assert not isinstance(event, CondensationAction)


def test_condensation_request_always_removed_from_view() -> None:
    """Test that CondensationRequestAction is always removed from the view regardless of unhandled status."""
    # Test case 1: Unhandled request
    events_unhandled: list[Event] = [
        MessageAction(content='Event 0'),
        CondensationRequestAction(),
        MessageAction(content='Event 1'),
    ]
    set_ids(events_unhandled)
    view_unhandled = View.from_events(events_unhandled)
    assert view_unhandled.unhandled_condensation_request is True
    assert len(view_unhandled) == 2  # Only MessageActions
    for event in view_unhandled:
        assert not isinstance(event, CondensationRequestAction)
    # Test case 2: Handled request
    events_handled: list[Event] = [
        MessageAction(content='Event 0'),
        CondensationRequestAction(),
        MessageAction(content='Event 1'),
        CondensationAction(forgotten_event_ids=[]),
        MessageAction(content='Event 2'),
    ]
    set_ids(events_handled)
    view_handled = View.from_events(events_handled)
    assert view_handled.unhandled_condensation_request is False
    assert len(view_handled) == 3  # Only MessageActions
    for event in view_handled:
        assert not isinstance(event, CondensationRequestAction)
        assert not isinstance(event, CondensationAction)
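

# Illustrative sketch only, not part of this change set. The tests above are
# consistent with a rule where a condensation request counts as unhandled
# exactly when no CondensationAction appears after it. The helper name and the
# string labels ('message', 'request', 'condensation') are hypothetical
# stand-ins for the real event classes; the actual View.from_events logic may
# be implemented differently.
def _sketch_has_unhandled_request(kinds: list[str]) -> bool:
    # Walk backwards: the last condensation-related event decides the outcome.
    for kind in reversed(kinds):
        if kind == 'condensation':
            return False  # a later condensation handled any earlier request
        if kind == 'request':
            return True  # a request with no condensation after it
    # No condensation-related events at all.
    return False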


def set_ids(events: list[Event]) -> None:
    """Set the IDs of the events in the list to their index."""
    for i, e in enumerate(events):