Mirror of https://github.com/All-Hands-AI/OpenHands.git (synced 2026-04-29 03:00:45 -04:00)
Compare commits: resolver-r...0.47.0 (2 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | 3c39b93f7e | | |
| | 197e1a3612 | | |
156
REFACTOR_PLAN.md
@@ -1,156 +0,0 @@
# Resolver Runtime Refactoring Plan

## Task Overview
Refactor the resolver component to reuse setup.py functions for runtime initialization, connection, and completion instead of reinventing the wheel.

## Repository Cloning Patterns Analysis

### Repository Cloning Patterns Across OpenHands Entry Points

#### 1. **Resolver (issue_resolver.py)** - DIFFERENT PATTERN (Legacy)
```python
# Step 1: Clone to separate location
subprocess.check_output(['git', 'clone', url, f'{output_dir}/repo'])

# Step 2: Later, copy repo to workspace
shutil.copytree(os.path.join(self.output_dir, 'repo'), self.workspace_base)

# Step 3: Create and connect runtime
runtime = create_runtime(config)
await runtime.connect()

# Step 4: Initialize runtime (git config, setup scripts)
self.initialize_runtime(runtime)
```

#### 2. **Main.py** - STANDARD PATTERN
```python
# Step 1: Create and connect runtime
runtime = create_runtime(config)
await runtime.connect()

# Step 2: Clone directly into runtime workspace + setup
repo_directory = initialize_repository_for_runtime(runtime, selected_repository)
```

#### 3. **Server/Session** - STANDARD PATTERN
```python
# Step 1: Create and connect runtime
# Step 2: Clone directly into runtime workspace
await runtime.clone_or_init_repo(tokens, repo, branch)
# Step 3: Run setup scripts
await runtime.maybe_run_setup_script()
await runtime.maybe_setup_git_hooks()
```

#### 4. **Setup.py's initialize_repository_for_runtime()** - STANDARD PATTERN
```python
# Calls runtime.clone_or_init_repo() + setup scripts
repo_directory = runtime.clone_or_init_repo(tokens, repo, branch)
runtime.maybe_run_setup_script()
runtime.maybe_setup_git_hooks()
```

### The Issue
The **resolver is the odd one out**: it uses a two-step process (clone to a temporary location, then copy into the workspace) for **legacy reasons** (it was originally developed as a separate app built on OpenHands, not as a component of OpenHands). All other entry points use the standard pattern and clone directly into the runtime workspace.

## Current State Analysis

### ✅ What Resolver Already Does Right:
- [x] Uses `create_runtime()` from setup.py for runtime creation

### ❌ What Needs to be Fixed:
- [ ] **Resolver uses legacy 2-step cloning instead of standard runtime.clone_or_init_repo()**
- [ ] Resolver has custom `initialize_runtime()` method that duplicates setup.py logic
- [ ] Resolver has custom `complete_runtime()` method with no setup.py equivalent
- [ ] Resolver doesn't follow proper runtime cleanup patterns like main.py
- [ ] Runtime connection pattern is inconsistent across codebase

## Refactoring Steps

### Phase 1: Fix Repository Cloning Pattern (PRIORITY)
**Goal**: Make resolver use the same repository cloning pattern as all other OpenHands entry points.

- [ ] **Step 1.1**: Replace resolver's legacy 2-step cloning with standard pattern
  - Remove `subprocess.check_output(['git', 'clone', ...])` from `resolve_issue()`
  - Remove `shutil.copytree()` from `process_issue()`
  - Use `initialize_repository_for_runtime()` instead
  - This will clone directly into runtime workspace AND run setup scripts

- [ ] **Step 1.2**: Update resolver workflow to match standard pattern
  - Create and connect runtime first
  - Then call `initialize_repository_for_runtime()` for cloning + setup
  - Remove the manual repo copying step entirely
  - Ensure base_commit is still captured correctly
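
The reordered workflow from Steps 1.1 and 1.2 can be sketched as follows. This is a minimal illustration, not the resolver's actual code: `FakeRuntime`, `initialize_repository`, `prepare_issue_workspace`, `current_commit`, and the commit value are hypothetical stand-ins so the example runs without OpenHands installed, and the calls are synchronous for brevity. Only the ordering (connect first, clone into the workspace, run setup, then capture `base_commit`) is taken from the plan.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRuntime:
    """Hypothetical stand-in for an OpenHands runtime; records call order."""

    calls: list[str] = field(default_factory=list)
    connected: bool = False

    def connect(self) -> None:
        self.connected = True
        self.calls.append('connect')

    def clone_or_init_repo(self, repo: str) -> str:
        # Assumption: cloning into the workspace needs a connected runtime.
        if not self.connected:
            raise RuntimeError('runtime must be connected before cloning')
        self.calls.append('clone')
        return repo.split('/')[-1]

    def maybe_run_setup_script(self) -> None:
        self.calls.append('setup_script')

    def maybe_setup_git_hooks(self) -> None:
        self.calls.append('git_hooks')

    def current_commit(self) -> str:
        # Hypothetical helper; a real runtime would ask git for HEAD.
        self.calls.append('rev_parse')
        return 'deadbeef'


def initialize_repository(runtime: FakeRuntime, repo: str) -> str:
    """Clone directly into the workspace, then run setup (as the plan describes)."""
    repo_directory = runtime.clone_or_init_repo(repo)
    runtime.maybe_run_setup_script()
    runtime.maybe_setup_git_hooks()
    return repo_directory


def prepare_issue_workspace(runtime: FakeRuntime, repo: str) -> tuple[str, str]:
    """Connect first, then clone and set up, then record the base commit."""
    runtime.connect()
    repo_directory = initialize_repository(runtime, repo)
    base_commit = runtime.current_commit()
    return repo_directory, base_commit
```

Capturing `base_commit` only after the clone has landed in the workspace keeps it tied to the same checkout the agent will modify, which is the concern behind the last bullet of Step 1.2.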

### Phase 2: Refactor Runtime Initialization and Completion
**Goal**: Remove code duplication between resolver and setup.py for runtime operations.

- [ ] **Step 2.1**: Create missing functions in setup.py
  - Create `setup_runtime_environment()` for git config and platform-specific setup
  - Create `complete_runtime_session()` for git patch generation
  - Create `cleanup_runtime()` for proper resource cleanup

- [ ] **Step 2.2**: Replace resolver's `initialize_runtime()`
  - Use setup.py's `setup_runtime_environment()` instead
  - Remove duplicate git configuration code
  - Maintain platform-specific behavior (GitLab CI)

- [ ] **Step 2.3**: Replace resolver's `complete_runtime()`
  - Use setup.py's `complete_runtime_session()` instead
  - Move git patch generation logic to setup.py
  - Ensure return values match resolver's expectations

- [ ] **Step 2.4**: Add proper runtime cleanup to resolver
  - Use setup.py's `cleanup_runtime()` function
  - Ensure resources are properly released in try/finally blocks
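
The try/finally cleanup called for in Step 2.4 might look roughly like the sketch below. Everything here is an assumption made for illustration: `FakeRuntime`, its `close()` method, `resolve_with_cleanup`, and the body of `cleanup_runtime()` are hypothetical, since the plan only proposes that function and does not define it.

```python
import asyncio


class FakeRuntime:
    """Hypothetical runtime stub that records lifecycle events."""

    def __init__(self) -> None:
        self.events: list[str] = []

    async def connect(self) -> None:
        self.events.append('connect')

    def close(self) -> None:
        self.events.append('close')


def cleanup_runtime(runtime: FakeRuntime) -> None:
    """Placeholder for the cleanup function the plan proposes for setup.py."""
    runtime.close()


async def resolve_with_cleanup(runtime: FakeRuntime, fail: bool = False) -> str:
    # Connect before entering try: if connecting fails there is nothing to release.
    await runtime.connect()
    try:
        runtime.events.append('work')
        if fail:
            raise ValueError('simulated agent failure')
        return 'patch'
    finally:
        # Runs on success and on error, so the runtime is always released.
        cleanup_runtime(runtime)
```

With this shape, a failure during issue processing still reaches `cleanup_runtime()`, and the exception continues to propagate to the caller.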

### Phase 3: Testing and Validation
- [ ] **Step 3.1**: Test resolver functionality with refactored code
  - Verify git operations work correctly
  - Verify setup scripts are executed
  - Verify git hooks are set up

- [ ] **Step 3.2**: Test runtime lifecycle (create → connect → clone → initialize → complete → cleanup)
  - Ensure no resource leaks
  - Verify proper error handling

- [ ] **Step 3.3**: Verify resolver output remains consistent
  - Git patches are generated correctly
  - Issue resolution works as before
  - No regression in functionality

### Phase 4: Code Quality and Documentation
- [ ] **Step 4.1**: Add proper documentation to new setup.py functions
  - Document parameters and return values
  - Add usage examples
  - Document platform-specific behavior

- [ ] **Step 4.2**: Remove obsolete code from resolver
  - Delete old `initialize_runtime()` method
  - Delete old `complete_runtime()` method
  - Clean up imports and unused code

- [ ] **Step 4.3**: Update any other components that might benefit from these functions
  - Check if other entry points could use the same patterns
  - Ensure consistency across the codebase

## Success Criteria
- [ ] **Resolver uses standard repository cloning pattern (runtime.clone_or_init_repo)**
- [ ] Resolver uses setup.py functions for all runtime operations
- [ ] No code duplication between resolver and setup.py
- [ ] Proper runtime lifecycle management (connect → initialize → complete → cleanup)
- [ ] All existing resolver functionality preserved
- [ ] Consistent patterns across all OpenHands entry points
- [ ] Proper error handling and resource cleanup

## Files to Modify
1. `/openhands/core/setup.py` - Add new runtime management functions
2. `/openhands/resolver/issue_resolver.py` - Refactor to use setup.py functions
3. Any tests related to resolver functionality

## Risk Mitigation
- Maintain backward compatibility during refactoring
- Test thoroughly before removing old code
- Keep git patch generation logic identical to avoid breaking issue resolution
- Ensure platform-specific behavior (GitLab CI) is preserved
@@ -260,9 +260,6 @@ enable_finish = true
|
||||
# length limit
|
||||
enable_history_truncation = true
|
||||
|
||||
# Whether the condensation request tool is enabled
|
||||
enable_condensation_request = false
|
||||
|
||||
[agent.RepoExplorerAgent]
|
||||
# Example: use a cheaper model for RepoExplorerAgent to reduce cost, especially
|
||||
# useful when an agent doesn't demand high quality but uses a lot of tokens
|
||||
|
||||
1829
frontend/package-lock.json
generated
File diff suppressed because it is too large
@@ -7,33 +7,33 @@
|
||||
"node": ">=20.0.0"
|
||||
},
|
||||
"dependencies": {
|
||||
"@heroui/react": "^2.8.0-beta.10",
|
||||
"@heroui/react": "^2.8.0-beta.9",
|
||||
"@microlink/react-json-view": "^1.26.2",
|
||||
"@monaco-editor/react": "^4.7.0-rc.0",
|
||||
"@react-router/node": "^7.6.3",
|
||||
"@react-router/serve": "^7.6.3",
|
||||
"@react-router/node": "^7.6.2",
|
||||
"@react-router/serve": "^7.6.2",
|
||||
"@react-types/shared": "^3.29.1",
|
||||
"@reduxjs/toolkit": "^2.8.2",
|
||||
"@stripe/react-stripe-js": "^3.7.0",
|
||||
"@stripe/stripe-js": "^7.4.0",
|
||||
"@tailwindcss/postcss": "^4.1.11",
|
||||
"@tailwindcss/vite": "^4.1.11",
|
||||
"@tanstack/react-query": "^5.81.4",
|
||||
"@vitejs/plugin-react": "^4.6.0",
|
||||
"@stripe/stripe-js": "^7.3.1",
|
||||
"@tailwindcss/postcss": "^4.1.10",
|
||||
"@tailwindcss/vite": "^4.1.10",
|
||||
"@tanstack/react-query": "^5.80.10",
|
||||
"@vitejs/plugin-react": "^4.5.2",
|
||||
"@xterm/addon-fit": "^0.10.0",
|
||||
"@xterm/xterm": "^5.4.0",
|
||||
"axios": "^1.10.0",
|
||||
"clsx": "^2.1.1",
|
||||
"eslint-config-airbnb-typescript": "^18.0.0",
|
||||
"framer-motion": "^12.19.2",
|
||||
"framer-motion": "^12.18.1",
|
||||
"i18next": "^25.2.1",
|
||||
"i18next-browser-languagedetector": "^8.2.0",
|
||||
"i18next-http-backend": "^3.0.2",
|
||||
"isbot": "^5.1.28",
|
||||
"jose": "^6.0.11",
|
||||
"lucide-react": "^0.525.0",
|
||||
"lucide-react": "^0.519.0",
|
||||
"monaco-editor": "^0.52.2",
|
||||
"posthog-js": "^1.255.1",
|
||||
"posthog-js": "^1.255.0",
|
||||
"react": "^19.1.0",
|
||||
"react-dom": "^19.1.0",
|
||||
"react-highlight": "^0.15.0",
|
||||
@@ -42,14 +42,14 @@
|
||||
"react-icons": "^5.5.0",
|
||||
"react-markdown": "^10.1.0",
|
||||
"react-redux": "^9.2.0",
|
||||
"react-router": "^7.6.3",
|
||||
"react-router": "^7.6.2",
|
||||
"react-syntax-highlighter": "^15.6.1",
|
||||
"react-textarea-autosize": "^8.5.9",
|
||||
"remark-gfm": "^4.0.1",
|
||||
"sirv-cli": "^3.0.1",
|
||||
"socket.io-client": "^4.8.1",
|
||||
"tailwind-merge": "^3.3.1",
|
||||
"vite": "^7.0.0",
|
||||
"vite": "^6.3.5",
|
||||
"web-vitals": "^5.0.3",
|
||||
"ws": "^8.18.2"
|
||||
},
|
||||
@@ -80,19 +80,19 @@
|
||||
]
|
||||
},
|
||||
"devDependencies": {
|
||||
"@babel/parser": "^7.27.7",
|
||||
"@babel/traverse": "^7.27.7",
|
||||
"@babel/parser": "^7.27.1",
|
||||
"@babel/traverse": "^7.27.1",
|
||||
"@babel/types": "^7.27.0",
|
||||
"@mswjs/socket.io-binding": "^0.2.0",
|
||||
"@playwright/test": "^1.53.1",
|
||||
"@react-router/dev": "^7.6.3",
|
||||
"@react-router/dev": "^7.6.2",
|
||||
"@tailwindcss/typography": "^0.5.16",
|
||||
"@tanstack/eslint-plugin-query": "^5.81.2",
|
||||
"@testing-library/dom": "^10.4.0",
|
||||
"@testing-library/jest-dom": "^6.6.1",
|
||||
"@testing-library/react": "^16.3.0",
|
||||
"@testing-library/user-event": "^14.6.1",
|
||||
"@types/node": "^24.0.5",
|
||||
"@types/node": "^24.0.3",
|
||||
"@types/react": "^19.1.8",
|
||||
"@types/react-dom": "^19.1.6",
|
||||
"@types/react-highlight": "^0.12.8",
|
||||
@@ -117,7 +117,7 @@
|
||||
"jsdom": "^26.1.0",
|
||||
"lint-staged": "^16.1.2",
|
||||
"msw": "^2.6.6",
|
||||
"prettier": "^3.6.2",
|
||||
"prettier": "^3.5.3",
|
||||
"stripe": "^18.2.1",
|
||||
"tailwindcss": "^4.1.8",
|
||||
"typescript": "^5.8.3",
|
||||
|
||||
@@ -12,9 +12,6 @@ if TYPE_CHECKING:
|
||||
import openhands.agenthub.codeact_agent.function_calling as codeact_function_calling
|
||||
from openhands.agenthub.codeact_agent.tools.bash import create_cmd_run_tool
|
||||
from openhands.agenthub.codeact_agent.tools.browser import BrowserTool
|
||||
from openhands.agenthub.codeact_agent.tools.condensation_request import (
|
||||
CondensationRequestTool,
|
||||
)
|
||||
from openhands.agenthub.codeact_agent.tools.finish import FinishTool
|
||||
from openhands.agenthub.codeact_agent.tools.ipython import IPythonTool
|
||||
from openhands.agenthub.codeact_agent.tools.llm_based_edit import LLMBasedFileEditTool
|
||||
@@ -122,8 +119,6 @@ class CodeActAgent(Agent):
|
||||
tools.append(ThinkTool)
|
||||
if self.config.enable_finish:
|
||||
tools.append(FinishTool)
|
||||
if self.config.enable_condensation_request:
|
||||
tools.append(CondensationRequestTool)
|
||||
if self.config.enable_browsing:
|
||||
if sys.platform == 'win32':
|
||||
logger.warning('Windows runtime does not support browsing yet')
|
||||
|
||||
@@ -11,7 +11,6 @@ from litellm import (
|
||||
|
||||
from openhands.agenthub.codeact_agent.tools import (
|
||||
BrowserTool,
|
||||
CondensationRequestTool,
|
||||
FinishTool,
|
||||
IPythonTool,
|
||||
LLMBasedFileEditTool,
|
||||
@@ -36,7 +35,6 @@ from openhands.events.action import (
|
||||
IPythonRunCellAction,
|
||||
MessageAction,
|
||||
)
|
||||
from openhands.events.action.agent import CondensationRequestAction
|
||||
from openhands.events.action.mcp import MCPAction
|
||||
from openhands.events.event import FileEditSource, FileReadSource
|
||||
from openhands.events.tool import ToolCallMetadata
|
||||
@@ -205,12 +203,6 @@ def response_to_actions(
|
||||
elif tool_call.function.name == ThinkTool['function']['name']:
|
||||
action = AgentThinkAction(thought=arguments.get('thought', ''))
|
||||
|
||||
# ================================================
|
||||
# CondensationRequestAction
|
||||
# ================================================
|
||||
elif tool_call.function.name == CondensationRequestTool['function']['name']:
|
||||
action = CondensationRequestAction()
|
||||
|
||||
# ================================================
|
||||
# BrowserTool
|
||||
# ================================================
|
||||
|
||||
@@ -1,111 +0,0 @@
|
||||
You are OpenHands agent, a helpful AI assistant that can interact with a computer to solve tasks.
|
||||
|
||||
<ROLE>
|
||||
Your primary role is to assist users by executing commands, modifying code, and solving technical problems effectively. You should be thorough, methodical, and prioritize quality over speed.
|
||||
* If the user asks a question, like "why is X happening", don't try to fix the problem. Just give an answer to the question.
|
||||
</ROLE>
|
||||
|
||||
<EFFICIENCY>
|
||||
* Each action you take is somewhat expensive. Wherever possible, combine multiple actions into a single action, e.g. combine multiple bash commands into one, using sed and grep to edit/view multiple files at once.
|
||||
* When exploring the codebase, use efficient tools like find, grep, and git commands with appropriate filters to minimize unnecessary operations.
|
||||
</EFFICIENCY>
|
||||
|
||||
<FILE_SYSTEM_GUIDELINES>
|
||||
* When a user provides a file path, do NOT assume it's relative to the current working directory. First explore the file system to locate the file before working on it.
|
||||
* If asked to edit a file, edit the file directly, rather than creating a new file with a different filename.
|
||||
* For global search-and-replace operations, consider using `sed` instead of opening file editors multiple times.
|
||||
</FILE_SYSTEM_GUIDELINES>
|
||||
|
||||
<CODE_QUALITY>
|
||||
* Write clean, efficient code with minimal comments. Avoid redundancy in comments: Do not repeat information that can be easily inferred from the code itself.
|
||||
* When implementing solutions, focus on making the minimal changes needed to solve the problem.
|
||||
* Before implementing any changes, first thoroughly understand the codebase through exploration.
|
||||
* If you are adding a lot of code to a function or file, consider splitting the function or file into smaller pieces when appropriate.
|
||||
</CODE_QUALITY>
|
||||
|
||||
<VERSION_CONTROL>
|
||||
* When configuring git credentials, use "openhands" as the user.name and "openhands@all-hands.dev" as the user.email by default, unless explicitly instructed otherwise.
|
||||
* Exercise caution with git operations. Do NOT make potentially dangerous changes (e.g., pushing to main, deleting repositories) unless explicitly asked to do so.
|
||||
* When committing changes, use `git status` to see all modified files, and stage all files necessary for the commit. Use `git commit -a` whenever possible.
|
||||
* Do NOT commit files that typically shouldn't go into version control (e.g., node_modules/, .env files, build directories, cache files, large binaries) unless explicitly instructed by the user.
|
||||
* If unsure about committing certain files, check for the presence of .gitignore files or ask the user for clarification.
|
||||
</VERSION_CONTROL>
|
||||
|
||||
<PULL_REQUESTS>
|
||||
* When creating pull requests, create only ONE per session/issue unless explicitly instructed otherwise.
|
||||
* When working with an existing PR, update it with new commits rather than creating additional PRs for the same issue.
|
||||
* When updating a PR, preserve the original PR title and purpose, updating description only when necessary.
|
||||
</PULL_REQUESTS>
|
||||
|
||||
<PROBLEM_SOLVING_WORKFLOW>
|
||||
1. EXPLORATION: Thoroughly explore relevant files and understand the context before proposing solutions
|
||||
2. ANALYSIS: Consider multiple approaches and select the most promising one
|
||||
3. TESTING:
|
||||
* For bug fixes: Create tests to verify issues before implementing fixes
|
||||
* For new features: Consider test-driven development when appropriate
|
||||
* If the repository lacks testing infrastructure and implementing tests would require extensive setup, consult with the user before investing time in building testing infrastructure
|
||||
* If the environment is not set up to run tests, consult with the user first before investing time to install all dependencies
|
||||
4. IMPLEMENTATION: Make focused, minimal changes to address the problem
|
||||
5. VERIFICATION: If the environment is set up to run tests, test your implementation thoroughly, including edge cases. If the environment is not set up to run tests, consult with the user first before investing time to run tests.
|
||||
</PROBLEM_SOLVING_WORKFLOW>
|
||||
|
||||
<TASK_MANAGEMENT>
|
||||
* For complex, long-horizon tasks, create a TODO.md file to track progress:
|
||||
1. Start by creating a detailed plan in TODO.md with clear steps
|
||||
2. Check TODO.md before each new action to maintain context and track progress
|
||||
3. Update TODO.md as you complete steps or discover new requirements
|
||||
4. Mark completed items with ✓ or [x] to maintain a clear record of progress
|
||||
5. For each major step, add sub-tasks as needed to break down complex work
|
||||
6. If you discover the plan needs significant changes, propose updates and confirm with the user before proceeding and update TODO.md
|
||||
7. IMPORTANT: Do NOT add TODO.md to git commits or version control systems
|
||||
|
||||
* Example TODO.md format:
|
||||
```markdown
|
||||
# Task: [Brief description of the overall task]
|
||||
|
||||
## Plan
|
||||
- [ ] Step 1: [Description]
|
||||
- [ ] Sub-task 1.1
|
||||
- [ ] Sub-task 1.2
|
||||
- [ ] Step 2: [Description]
|
||||
- [x] Step 3: [Description] (Completed)
|
||||
|
||||
## Notes
|
||||
- Important discovery: [Details about something you learned]
|
||||
- Potential issue: [Description of a potential problem]
|
||||
```
|
||||
|
||||
* When working on a task:
|
||||
- Read the README to understand how the system works
|
||||
- Create TODO.md with every major step unchecked
|
||||
- Add TODO.md to .gitignore if it's not already ignored
|
||||
- Until every item in TODO.md is checked:
|
||||
a. Pick the next unchecked item and work on it
|
||||
b. Run appropriate tests to verify your work
|
||||
c. If issues arise, fix them until tests pass
|
||||
d. Once complete, check off the item in TODO.md
|
||||
e. Proceed to the next unchecked item
|
||||
</TASK_MANAGEMENT>
|
||||
|
||||
<SECURITY>
|
||||
* Only use GITHUB_TOKEN and other credentials in ways the user has explicitly requested and would expect.
|
||||
* Use APIs to work with GitHub or other platforms, unless the user asks otherwise or your task requires browsing.
|
||||
</SECURITY>
|
||||
|
||||
<ENVIRONMENT_SETUP>
|
||||
* When user asks you to run an application, don't stop if the application is not installed. Instead, please install the application and run the command again.
|
||||
* If you encounter missing dependencies:
|
||||
1. First, look around in the repository for existing dependency files (requirements.txt, pyproject.toml, package.json, Gemfile, etc.)
|
||||
2. If dependency files exist, use them to install all dependencies at once (e.g., `pip install -r requirements.txt`, `npm install`, etc.)
|
||||
3. Only install individual packages directly if no dependency files are found or if only specific packages are needed
|
||||
* Similarly, if you encounter missing dependencies for essential tools requested by the user, install them when possible.
|
||||
</ENVIRONMENT_SETUP>
|
||||
|
||||
<TROUBLESHOOTING>
|
||||
* If you've made repeated attempts to solve a problem but tests still fail or the user reports it's still broken:
|
||||
1. Step back and reflect on 5-7 different possible sources of the problem
|
||||
2. Assess the likelihood of each possible cause
|
||||
3. Methodically address the most likely causes, starting with the highest probability
|
||||
4. Document your reasoning process
|
||||
* When you run into any major issue while executing a plan from the user, please don't try to directly work around it. Instead, propose a new plan and confirm with the user before proceeding.
|
||||
</TROUBLESHOOTING>
|
||||
@@ -1,6 +1,5 @@
|
||||
from .bash import create_cmd_run_tool
|
||||
from .browser import BrowserTool
|
||||
from .condensation_request import CondensationRequestTool
|
||||
from .finish import FinishTool
|
||||
from .ipython import IPythonTool
|
||||
from .llm_based_edit import LLMBasedFileEditTool
|
||||
@@ -9,7 +8,6 @@ from .think import ThinkTool
|
||||
|
||||
__all__ = [
|
||||
'BrowserTool',
|
||||
'CondensationRequestTool',
|
||||
'create_cmd_run_tool',
|
||||
'FinishTool',
|
||||
'IPythonTool',
|
||||
|
||||
@@ -1,16 +0,0 @@
from litellm import ChatCompletionToolParam, ChatCompletionToolParamFunctionChunk

_CONDENSATION_REQUEST_DESCRIPTION = 'Request a condensation of the conversation history when the context becomes too long or when you need to focus on the most relevant information.'

CondensationRequestTool = ChatCompletionToolParam(
    type='function',
    function=ChatCompletionToolParamFunctionChunk(
        name='request_condensation',
        description=_CONDENSATION_REQUEST_DESCRIPTION,
        parameters={
            'type': 'object',
            'properties': {},
            'required': [],
        },
    ),
)
@@ -77,6 +77,7 @@ async def cleanup_session(
|
||||
controller: AgentController,
|
||||
) -> None:
|
||||
"""Clean up all resources from the current session."""
|
||||
|
||||
event_stream = runtime.event_stream
|
||||
end_state = controller.get_state()
|
||||
end_state.save_to_session(
|
||||
@@ -120,7 +121,6 @@ async def run_session(
|
||||
sid = generate_sid(config, session_name)
|
||||
is_loaded = asyncio.Event()
|
||||
is_paused = asyncio.Event() # Event to track agent pause requests
|
||||
pause_task: asyncio.Task | None = None # No more than one pause task
|
||||
always_confirm_mode = False # Flag to enable always confirm mode
|
||||
|
||||
# Show runtime initialization message
|
||||
@@ -236,11 +236,9 @@ async def run_session(
|
||||
|
||||
if event.agent_state == AgentState.RUNNING:
|
||||
display_agent_running_message()
|
||||
nonlocal pause_task
|
||||
if pause_task is None or pause_task.done():
|
||||
pause_task = loop.create_task(
|
||||
process_agent_pause(is_paused, event_stream)
|
||||
) # Create a task to track agent pause requests from the user
|
||||
loop.create_task(
|
||||
process_agent_pause(is_paused, event_stream)
|
||||
) # Create a task to track agent pause requests from the user
|
||||
|
||||
def on_event(event: Event) -> None:
|
||||
loop.create_task(on_event_async(event))
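
The guarded variant in this hunk (`if pause_task is None or pause_task.done()`) keeps at most one pause-watching task alive at a time, whereas the unguarded variant starts a new task on every RUNNING event. A minimal standalone sketch of that guard is shown below; `PauseTracker`, `ensure_task`, and `demo` are hypothetical names introduced for illustration and do not appear in the OpenHands code.

```python
import asyncio
from typing import Optional


class PauseTracker:
    """Hypothetical holder for the single pause-watching task."""

    def __init__(self) -> None:
        self.task: Optional[asyncio.Task] = None
        self.started = 0

    async def _watch(self, release: asyncio.Event) -> None:
        self.started += 1
        await release.wait()

    def ensure_task(self, release: asyncio.Event) -> asyncio.Task:
        # Create a new task only when none exists or the previous one finished.
        if self.task is None or self.task.done():
            loop = asyncio.get_running_loop()
            self.task = loop.create_task(self._watch(release))
        return self.task


async def demo() -> int:
    tracker = PauseTracker()
    release = asyncio.Event()
    first = tracker.ensure_task(release)
    second = tracker.ensure_task(release)  # reuses the pending task
    assert first is second
    release.set()
    await first
    third = tracker.ensure_task(release)  # previous task is done, so a new one starts
    await third
    return tracker.started
```

Without the guard, each repeated state change would add another watcher competing for the same pause request, which is the duplication the `pause_task` variable is there to prevent.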
|
||||
@@ -436,23 +434,7 @@ async def main_with_loop(loop: asyncio.AbstractEventLoop) -> None:
|
||||
return
|
||||
|
||||
# Read task from file, CLI args, or stdin
|
||||
if args.file:
|
||||
# For CLI usage, we want to enhance the file content with a prompt
|
||||
# that instructs the agent to read and understand the file first
|
||||
with open(args.file, 'r', encoding='utf-8') as file:
|
||||
file_content = file.read()
|
||||
|
||||
# Create a prompt that instructs the agent to read and understand the file first
|
||||
task_str = f"""The user has tagged a file '{args.file}'.
|
||||
Please read and understand the following file content first:
|
||||
|
||||
```
|
||||
{file_content}
|
||||
```
|
||||
|
||||
After reviewing the file, please ask the user what they would like to do with it."""
|
||||
else:
|
||||
task_str = read_task(args, config.cli_multiline_input)
|
||||
task_str = read_task(args, config.cli_multiline_input)
|
||||
|
||||
# Run the first session
|
||||
new_session_requested = await run_session(
|
||||
|
||||
@@ -59,11 +59,7 @@ from openhands.events.action import (
|
||||
NullAction,
|
||||
SystemMessageAction,
|
||||
)
|
||||
from openhands.events.action.agent import (
|
||||
CondensationAction,
|
||||
CondensationRequestAction,
|
||||
RecallAction,
|
||||
)
|
||||
from openhands.events.action.agent import CondensationAction, RecallAction
|
||||
from openhands.events.event import Event
|
||||
from openhands.events.observation import (
|
||||
AgentDelegateObservation,
|
||||
@@ -75,6 +71,7 @@ from openhands.events.observation import (
|
||||
from openhands.events.serialization.event import truncate_content
|
||||
from openhands.llm.llm import LLM
|
||||
from openhands.llm.metrics import Metrics
|
||||
from openhands.memory.view import View
|
||||
from openhands.storage.files import FileStore
|
||||
|
||||
# note: RESUME is only available on web GUI
|
||||
@@ -339,8 +336,6 @@ class AgentController:
|
||||
return True
|
||||
if isinstance(event, CondensationAction):
|
||||
return True
|
||||
if isinstance(event, CondensationRequestAction):
|
||||
return True
|
||||
return False
|
||||
if isinstance(event, Observation):
|
||||
if (
|
||||
@@ -834,9 +829,7 @@ class AgentController:
|
||||
or isinstance(e, ContextWindowExceededError)
|
||||
):
|
||||
if self.agent.config.enable_history_truncation:
|
||||
self.event_stream.add_event(
|
||||
CondensationRequestAction(), EventSource.AGENT
|
||||
)
|
||||
self._handle_long_context_error()
|
||||
return
|
||||
else:
|
||||
raise LLMContextWindowExceedError()
|
||||
@@ -887,7 +880,7 @@ class AgentController:
|
||||
action_id = getattr(action, 'id', 'unknown')
|
||||
action_type = type(action).__name__
|
||||
self.log(
|
||||
'info',
|
||||
'warning',
|
||||
f'Pending action active for {elapsed_time:.2f}s: {action_type} (id={action_id})',
|
||||
extra={'msg_type': 'PENDING_ACTION_TIMEOUT'},
|
||||
)
|
||||
@@ -956,6 +949,180 @@ class AgentController:
|
||||
assert self._closed
|
||||
return self.state_tracker.get_trajectory(include_screenshots)
|
||||
|
||||
def _handle_long_context_error(self) -> None:
|
||||
# When context window is exceeded, keep roughly half of agent interactions
|
||||
current_view = View.from_events(self.state.history)
|
||||
kept_events = self._apply_conversation_window(current_view.events)
|
||||
kept_event_ids = {e.id for e in kept_events}
|
||||
|
||||
self.log(
|
||||
'info',
|
||||
f'Context window exceeded. Keeping events with IDs: {kept_event_ids}',
|
||||
)
|
||||
|
||||
# The events to forget are those that are not in the kept set
|
||||
forgotten_event_ids = {e.id for e in self.state.history} - kept_event_ids
|
||||
|
||||
if len(kept_event_ids) == 0:
|
||||
self.log(
|
||||
'warning',
|
||||
'No events kept after applying conversation window. This should not happen.',
|
||||
)
|
||||
|
||||
# verify that the first event id in kept_event_ids is the same as the start_id
|
||||
if len(kept_event_ids) > 0 and self.state.history[0].id not in kept_event_ids:
|
||||
self.log(
|
||||
'warning',
|
||||
f'First event after applying conversation window was not kept: {self.state.history[0].id} not in {kept_event_ids}',
|
||||
)
|
||||
|
||||
# Add an error event to trigger another step by the agent
|
||||
self.event_stream.add_event(
|
||||
CondensationAction(
|
||||
forgotten_events_start_id=min(forgotten_event_ids)
|
||||
if forgotten_event_ids
|
||||
else 0,
|
||||
forgotten_events_end_id=max(forgotten_event_ids)
|
||||
if forgotten_event_ids
|
||||
else 0,
|
||||
),
|
||||
EventSource.AGENT,
|
||||
)
|
||||
|
||||
def _apply_conversation_window(self, history: list[Event]) -> list[Event]:
|
||||
"""Cuts history roughly in half when context window is exceeded.
|
||||
|
||||
It preserves action-observation pairs and ensures that the system message,
|
||||
the first user message, and its associated recall observation are always included
|
||||
at the beginning of the context window.
|
||||
|
||||
The algorithm:
|
||||
1. Identify essential initial events: System Message, First User Message, Recall Observation.
|
||||
2. Determine the slice of recent events to potentially keep.
|
||||
3. Validate the start of the recent slice for dangling observations.
|
||||
4. Combine essential events and validated recent events, ensuring essentials come first.
|
||||
|
||||
Args:
|
||||
events: List of events to filter
|
||||
|
||||
Returns:
|
||||
Filtered list of events keeping newest half while preserving pairs and essential initial events.
|
||||
"""
|
||||
# Handle empty history
|
||||
if not history:
|
||||
return []
|
||||
# 1. Identify essential initial events
|
||||
system_message: SystemMessageAction | None = None
|
||||
first_user_msg: MessageAction | None = None
|
||||
recall_action: RecallAction | None = None
|
||||
recall_observation: Observation | None = None
|
||||
|
||||
# Find System Message (should be the first event, if it exists)
|
||||
system_message = next(
|
||||
(e for e in history if isinstance(e, SystemMessageAction)), None
|
||||
)
|
||||
assert (
|
||||
system_message is None
|
||||
or isinstance(system_message, SystemMessageAction)
|
||||
and system_message.id == history[0].id
|
||||
)
|
||||
|
||||
# Find First User Message in the history, which MUST exist
|
||||
first_user_msg = self._first_user_message(history)
|
||||
if first_user_msg is None:
|
||||
# If not found in history, try the event stream
|
||||
first_user_msg = self._first_user_message()
|
||||
if first_user_msg is None:
|
||||
raise RuntimeError('No first user message found in the event stream.')
|
||||
self.log(
|
||||
'warning',
|
||||
'First user message not found in history. Using cached version from event stream.',
|
||||
)
|
||||
|
||||
# Find the first user message index in the history
|
||||
first_user_msg_index = -1
|
||||
for i, event in enumerate(history):
|
||||
if isinstance(event, MessageAction) and event.source == EventSource.USER:
|
||||
first_user_msg_index = i
|
||||
break
|
||||
|
||||
# Find Recall Action and Observation related to the First User Message
|
||||
# Look for RecallAction after the first user message
|
||||
for i in range(first_user_msg_index + 1, len(history)):
|
||||
event = history[i]
|
||||
if (
|
||||
isinstance(event, RecallAction)
|
||||
and event.query == first_user_msg.content
|
||||
):
|
||||
# Found RecallAction, now look for its Observation
|
||||
recall_action = event
|
||||
for j in range(i + 1, len(history)):
|
||||
obs_event = history[j]
|
||||
# Check for Observation caused by this RecallAction
|
||||
if (
|
||||
isinstance(obs_event, Observation)
|
||||
and obs_event.cause == recall_action.id
|
||||
):
|
||||
recall_observation = obs_event
|
||||
break # Found the observation, stop inner loop
|
||||
break # Found the recall action (and maybe obs), stop outer loop
|
||||
|
||||
essential_events: list[Event] = []
|
||||
if system_message:
|
||||
essential_events.append(system_message)
|
||||
# Only include first user message if history is not empty
|
||||
if history:
|
||||
essential_events.append(first_user_msg)
|
||||
# Include recall action and observation if both exist
|
||||
if recall_action and recall_observation:
|
||||
essential_events.append(recall_action)
|
||||
essential_events.append(recall_observation)
|
||||
# Include recall action without observation for backward compatibility
|
||||
elif recall_action:
|
||||
essential_events.append(recall_action)
|
||||
|
||||
# 2. Determine the slice of recent events to potentially keep
|
||||
num_non_essential_events = len(history) - len(essential_events)
|
||||
# Keep roughly half of the non-essential events, minimum 1
|
||||
num_recent_to_keep = max(1, num_non_essential_events // 2)
|
||||
|
||||
# Calculate the starting index for the recent slice
|
||||
slice_start_index = len(history) - num_recent_to_keep
|
||||
slice_start_index = max(0, slice_start_index) # Ensure index is not negative
|
||||
recent_events_slice = history[slice_start_index:]
|
||||
|
||||
# 3. Validate the start of the recent slice for dangling observations
|
||||
# IMPORTANT: Most observations in history are tool call results, which cannot appear without their action, or we get an LLM API error
|
||||
first_valid_event_index = 0
|
||||
for i, event in enumerate(recent_events_slice):
|
||||
if isinstance(event, Observation):
|
||||
first_valid_event_index += 1
|
||||
else:
|
||||
break
|
||||
# If all events in the slice are dangling observations, nothing from the slice is kept; log a warning
|
||||
if first_valid_event_index == len(recent_events_slice):
|
||||
self.log(
|
||||
'warning',
|
||||
'All recent events are dangling observations, which we truncate. This means the agent has only the essential first events. This should not happen.',
|
||||
)
|
||||
|
||||
# Adjust the recent_events_slice if dangling observations were found at the start
|
||||
if first_valid_event_index < len(recent_events_slice):
|
||||
validated_recent_events = recent_events_slice[first_valid_event_index:]
|
||||
if first_valid_event_index > 0:
|
||||
self.log(
|
||||
'debug',
|
||||
f'Removed {first_valid_event_index} dangling observation(s) from the start of recent event slice.',
|
||||
)
|
||||
else:
|
||||
validated_recent_events = []
|
||||
|
||||
# 4. Combine essential events and validated recent events
|
||||
events_to_keep: list[Event] = essential_events + validated_recent_events
|
||||
self.log('debug', f'History truncated. Kept {len(events_to_keep)} events.')
|
||||
|
||||
return events_to_keep
|
||||
|
||||
def _is_stuck(self) -> bool:
|
||||
"""Checks if the agent or its delegate is stuck in a loop.
|
||||
|
||||
|
||||
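The truncation logic in the hunk above (keep the essential events, keep roughly half of the rest, then drop observations dangling at the start of the recent slice) can be sketched in isolation. This is a minimal illustration, not the controller's code: the `Ev` dataclass, its `kind` field, and the `keep_recent_half` helper are hypothetical stand-ins for the real `Event`, `Observation`, and history types.

```python
from dataclasses import dataclass


@dataclass
class Ev:
    """Hypothetical stand-in for an event; kind is 'action' or 'observation'."""

    id: int
    kind: str


def keep_recent_half(history: list[Ev], essential_ids: set[int]) -> list[Ev]:
    """Keep essential events plus roughly half of the remaining recent events."""
    essential = [e for e in history if e.id in essential_ids]

    # Keep roughly half of the non-essential events, minimum 1.
    num_non_essential = len(history) - len(essential)
    num_recent_to_keep = max(1, num_non_essential // 2)
    start = max(0, len(history) - num_recent_to_keep)
    recent = history[start:]

    # Count observations at the start of the slice: their actions were cut off.
    first_valid = 0
    for e in recent:
        if e.kind == 'observation':
            first_valid += 1
        else:
            break

    return essential + recent[first_valid:]
```

With eleven events where the first two are essential, the slice starts on an observation, so that observation is dropped and only three recent events survive alongside the two essential ones. Like the code above, the sketch does not deduplicate an essential event that also falls inside the recent slice.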
@@ -31,8 +31,6 @@ class AgentConfig(BaseModel):
|
||||
"""Whether to enable think tool"""
|
||||
enable_finish: bool = Field(default=True)
|
||||
"""Whether to enable finish tool"""
|
||||
enable_condensation_request: bool = Field(default=False)
|
||||
"""Whether to enable condensation request tool"""
|
||||
enable_prompt_extensions: bool = Field(default=True)
|
||||
"""Whether to enable prompt extensions"""
|
||||
enable_mcp: bool = Field(default=True)
|
||||
@@ -53,7 +51,8 @@ class AgentConfig(BaseModel):
|
||||
|
||||
@classmethod
|
||||
def from_toml_section(cls, data: dict) -> dict[str, AgentConfig]:
|
||||
"""Create a mapping of AgentConfig instances from a toml dictionary representing the [agent] section.
|
||||
"""
|
||||
Create a mapping of AgentConfig instances from a toml dictionary representing the [agent] section.
|
||||
|
||||
The default configuration is built from all non-dict keys in data.
|
||||
Then, each key with a dict value is treated as a custom agent configuration, and its values override
|
||||
@@ -71,6 +70,7 @@ class AgentConfig(BaseModel):
|
||||
dict[str, AgentConfig]: A mapping where the key "agent" corresponds to the default configuration
|
||||
and additional keys represent custom configurations.
|
||||
"""
|
||||
|
||||
# Initialize the result mapping
|
||||
agent_mapping: dict[str, AgentConfig] = {}
|
||||
|
||||
|
||||
@@ -11,7 +11,7 @@ from openhands.core.config.llm_config import LLMConfig
|
||||
class NoOpCondenserConfig(BaseModel):
|
||||
"""Configuration for NoOpCondenser."""
|
||||
|
||||
type: Literal['noop'] = Field(default='noop')
|
||||
type: Literal['noop'] = 'noop'
|
||||
|
||||
model_config = ConfigDict(extra='forbid')
|
||||
|
||||
@@ -19,7 +19,7 @@ class NoOpCondenserConfig(BaseModel):
|
||||
class ObservationMaskingCondenserConfig(BaseModel):
|
||||
"""Configuration for ObservationMaskingCondenser."""
|
||||
|
||||
type: Literal['observation_masking'] = Field(default='observation_masking')
|
||||
type: Literal['observation_masking'] = 'observation_masking'
|
||||
attention_window: int = Field(
|
||||
default=100,
|
||||
description='The number of most-recent events where observations will not be masked.',
|
||||
@@ -32,7 +32,7 @@ class ObservationMaskingCondenserConfig(BaseModel):
|
||||
class BrowserOutputCondenserConfig(BaseModel):
|
||||
"""Configuration for the BrowserOutputCondenser."""
|
||||
|
||||
type: Literal['browser_output_masking'] = Field(default='browser_output_masking')
|
||||
type: Literal['browser_output_masking'] = 'browser_output_masking'
|
||||
attention_window: int = Field(
|
||||
default=1,
|
||||
description='The number of most recent browser output observations that will not be masked.',
|
||||
@@ -43,7 +43,7 @@ class BrowserOutputCondenserConfig(BaseModel):
|
||||
class RecentEventsCondenserConfig(BaseModel):
|
||||
"""Configuration for RecentEventsCondenser."""
|
||||
|
||||
type: Literal['recent'] = Field(default='recent')
|
||||
type: Literal['recent'] = 'recent'
|
||||
|
||||
# at least one event by default, because the best guess is that it is the user task
|
||||
keep_first: int = Field(
|
||||
@@ -61,7 +61,7 @@ class RecentEventsCondenserConfig(BaseModel):
|
||||
class LLMSummarizingCondenserConfig(BaseModel):
|
||||
"""Configuration for LLMCondenser."""
|
||||
|
||||
type: Literal['llm'] = Field(default='llm')
|
||||
type: Literal['llm'] = 'llm'
|
||||
llm_config: LLMConfig = Field(
|
||||
..., description='Configuration for the LLM to use for condensing.'
|
||||
)
|
||||
@@ -88,7 +88,7 @@ class LLMSummarizingCondenserConfig(BaseModel):
|
||||
class AmortizedForgettingCondenserConfig(BaseModel):
|
||||
"""Configuration for AmortizedForgettingCondenser."""
|
||||
|
||||
type: Literal['amortized'] = Field(default='amortized')
|
||||
type: Literal['amortized'] = 'amortized'
|
||||
max_size: int = Field(
|
||||
default=100,
|
||||
description='Maximum size of the condensed history before triggering forgetting.',
|
||||
@@ -108,7 +108,7 @@ class AmortizedForgettingCondenserConfig(BaseModel):
|
||||
class LLMAttentionCondenserConfig(BaseModel):
|
||||
"""Configuration for LLMAttentionCondenser."""
|
||||
|
||||
type: Literal['llm_attention'] = Field(default='llm_attention')
|
||||
type: Literal['llm_attention'] = 'llm_attention'
|
||||
llm_config: LLMConfig = Field(
|
||||
..., description='Configuration for the LLM to use for attention.'
|
||||
)
|
||||
@@ -131,7 +131,7 @@ class LLMAttentionCondenserConfig(BaseModel):
|
||||
class StructuredSummaryCondenserConfig(BaseModel):
|
||||
"""Configuration for StructuredSummaryCondenser instances."""
|
||||
|
||||
type: Literal['structured'] = Field(default='structured')
|
||||
type: Literal['structured'] = 'structured'
|
||||
llm_config: LLMConfig = Field(
|
||||
..., description='Configuration for the LLM to use for condensing.'
|
||||
)
|
||||
@@ -156,24 +156,16 @@ class StructuredSummaryCondenserConfig(BaseModel):
|
||||
|
||||
|
||||
class CondenserPipelineConfig(BaseModel):
|
||||
"""Configuration for the CondenserPipeline."""
|
||||
|
||||
type: Literal['pipeline'] = Field(default='pipeline')
|
||||
condensers: list[CondenserConfig] = Field(
|
||||
default_factory=list,
|
||||
description='List of condenser configurations to be used in the pipeline.',
|
||||
)
|
||||
|
||||
model_config = ConfigDict(extra='forbid')
|
||||
|
||||
|
||||
class ConversationWindowCondenserConfig(BaseModel):
|
||||
"""Configuration for ConversationWindowCondenser.
|
||||
"""Configuration for the CondenserPipeline.
|
||||
|
||||
Not currently supported by the TOML or ENV_VAR configuration strategies.
|
||||
"""
|
||||
|
||||
type: Literal['conversation_window'] = Field(default='conversation_window')
|
||||
type: Literal['pipeline'] = 'pipeline'
|
||||
condensers: list[CondenserConfig] = Field(
|
||||
default_factory=list,
|
||||
description='List of condenser configurations to be used in the pipeline.',
|
||||
)
|
||||
|
||||
model_config = ConfigDict(extra='forbid')
|
||||
|
||||
@@ -189,14 +181,14 @@ CondenserConfig = (
|
||||
| LLMAttentionCondenserConfig
|
||||
| StructuredSummaryCondenserConfig
|
||||
| CondenserPipelineConfig
|
||||
| ConversationWindowCondenserConfig
|
||||
)
|
||||
|
||||
|
||||
def condenser_config_from_toml_section(
|
||||
data: dict, llm_configs: dict | None = None
|
||||
) -> dict[str, CondenserConfig]:
|
||||
"""Create a CondenserConfig instance from a toml dictionary representing the [condenser] section.
|
||||
"""
|
||||
Create a CondenserConfig instance from a toml dictionary representing the [condenser] section.
|
||||
|
||||
For CondenserConfig, the handling is different since it's a union type. The type of condenser
|
||||
is determined by the 'type' field in the section.
|
||||
@@ -218,6 +210,7 @@ def condenser_config_from_toml_section(
|
||||
Returns:
|
||||
dict[str, CondenserConfig]: A mapping where the key "condenser" corresponds to the configuration.
|
||||
"""
|
||||
|
||||
# Initialize the result mapping
|
||||
condenser_mapping: dict[str, CondenserConfig] = {}
|
||||
|
||||
@@ -268,7 +261,8 @@ from_toml_section = condenser_config_from_toml_section
|
||||
|
||||
|
||||
def create_condenser_config(condenser_type: str, data: dict) -> CondenserConfig:
|
||||
"""Create a CondenserConfig instance based on the specified type.
|
||||
"""
|
||||
Create a CondenserConfig instance based on the specified type.
|
||||
|
||||
Args:
|
||||
condenser_type: The type of condenser to create.
|
||||
@@ -290,9 +284,6 @@ def create_condenser_config(condenser_type: str, data: dict) -> CondenserConfig:
|
||||
'amortized': AmortizedForgettingCondenserConfig,
|
||||
'llm_attention': LLMAttentionCondenserConfig,
|
||||
'structured': StructuredSummaryCondenserConfig,
|
||||
'pipeline': CondenserPipelineConfig,
|
||||
'conversation_window': ConversationWindowCondenserConfig,
|
||||
'browser_output_masking': BrowserOutputCondenserConfig,
|
||||
}
|
||||
|
||||
if condenser_type not in condenser_classes:
|
||||
|
||||
@@ -91,6 +91,3 @@ class ActionType(str, Enum):
|
||||
|
||||
CONDENSATION = 'condensation'
|
||||
"""Condenses a list of events into a summary."""
|
||||
|
||||
CONDENSATION_REQUEST = 'condensation_request'
|
||||
"""Request for condensation of a list of events."""
|
||||
|
||||
@@ -195,18 +195,3 @@ class CondensationAction(Action):
|
||||
if self.summary:
|
||||
return f'Summary: {self.summary}'
|
||||
return f'Condenser is dropping the events: {self.forgotten}.'
|
||||
|
||||
|
||||
@dataclass
|
||||
class CondensationRequestAction(Action):
|
||||
"""This action is used to request a condensation of the conversation history.
|
||||
|
||||
Attributes:
|
||||
action (str): The action type, namely ActionType.CONDENSATION_REQUEST.
|
||||
"""
|
||||
|
||||
action: str = ActionType.CONDENSATION_REQUEST
|
||||
|
||||
@property
|
||||
def message(self) -> str:
|
||||
return 'Requesting a condensation of the conversation history.'
|
||||
|
||||
@@ -9,7 +9,6 @@ from openhands.events.action.agent import (
|
||||
AgentThinkAction,
|
||||
ChangeAgentStateAction,
|
||||
CondensationAction,
|
||||
CondensationRequestAction,
|
||||
RecallAction,
|
||||
)
|
||||
from openhands.events.action.browse import BrowseInteractiveAction, BrowseURLAction
|
||||
@@ -44,7 +43,6 @@ actions = (
|
||||
MessageAction,
|
||||
SystemMessageAction,
|
||||
CondensationAction,
|
||||
CondensationRequestAction,
|
||||
MCPAction,
|
||||
)
|
||||
|
||||
|
||||
@@ -19,7 +19,6 @@ from litellm import completion as litellm_completion
|
||||
from litellm import completion_cost as litellm_completion_cost
|
||||
from litellm.exceptions import (
|
||||
RateLimitError,
|
||||
ServiceUnavailableError,
|
||||
)
|
||||
from litellm.types.utils import CostPerToken, ModelResponse, Usage
|
||||
from litellm.utils import create_pretrained_tokenizer
|
||||
@@ -41,7 +40,6 @@ __all__ = ['LLM']
|
||||
# tuple of exceptions to retry on
|
||||
LLM_RETRY_EXCEPTIONS: tuple[type[Exception], ...] = (
|
||||
RateLimitError,
|
||||
ServiceUnavailableError,
|
||||
litellm.Timeout,
|
||||
litellm.InternalServerError,
|
||||
LLMNoResponseError,
|
||||
|
||||
@@ -4,9 +4,6 @@ from openhands.memory.condenser.impl.amortized_forgetting_condenser import (
|
||||
from openhands.memory.condenser.impl.browser_output_condenser import (
|
||||
BrowserOutputCondenser,
|
||||
)
|
||||
from openhands.memory.condenser.impl.conversation_window_condenser import (
|
||||
ConversationWindowCondenser,
|
||||
)
|
||||
from openhands.memory.condenser.impl.llm_attention_condenser import (
|
||||
ImportantEventSelection,
|
||||
LLMAttentionCondenser,
|
||||
@@ -37,5 +34,4 @@ __all__ = [
|
||||
'RecentEventsCondenser',
|
||||
'StructuredSummaryCondenser',
|
||||
'CondenserPipeline',
|
||||
'ConversationWindowCondenser',
|
||||
]
|
||||
|
||||
@@ -1,185 +0,0 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from openhands.core.config.condenser_config import ConversationWindowCondenserConfig
|
||||
from openhands.core.logger import openhands_logger as logger
|
||||
from openhands.events.action.agent import (
|
||||
CondensationAction,
|
||||
RecallAction,
|
||||
)
|
||||
from openhands.events.action.message import MessageAction, SystemMessageAction
|
||||
from openhands.events.event import EventSource
|
||||
from openhands.events.observation import Observation
|
||||
from openhands.memory.condenser.condenser import Condensation, RollingCondenser, View
|
||||
|
||||
|
||||
class ConversationWindowCondenser(RollingCondenser):
|
||||
def __init__(self) -> None:
|
||||
super().__init__()
|
||||
|
||||
def get_condensation(self, view: View) -> Condensation:
|
||||
"""Apply conversation window truncation similar to _apply_conversation_window.
|
||||
|
||||
This method:
|
||||
1. Identifies essential initial events (System Message, First User Message, Recall Observation)
|
||||
2. Keeps roughly half of the history
|
||||
3. Ensures action-observation pairs are preserved
|
||||
4. Returns a CondensationAction specifying which events to forget
|
||||
"""
|
||||
events = view.events
|
||||
|
||||
# Handle empty history
|
||||
if not events:
|
||||
# No events to condense
|
||||
action = CondensationAction(forgotten_event_ids=[])
|
||||
return Condensation(action=action)
|
||||
|
||||
# 1. Identify essential initial events
|
||||
system_message: SystemMessageAction | None = None
|
||||
first_user_msg: MessageAction | None = None
|
||||
recall_action: RecallAction | None = None
|
||||
recall_observation: Observation | None = None
|
||||
|
||||
# Find System Message (should be the first event, if it exists)
|
||||
system_message = next(
|
||||
(e for e in events if isinstance(e, SystemMessageAction)), None
|
||||
)
|
||||
|
||||
# Find First User Message
|
||||
first_user_msg = next(
|
||||
(
|
||||
e
|
||||
for e in events
|
||||
if isinstance(e, MessageAction) and e.source == EventSource.USER
|
||||
),
|
||||
None,
|
||||
)
|
||||
|
||||
if first_user_msg is None:
|
||||
logger.warning(
|
||||
'No first user message found in history during condensation.'
|
||||
)
|
||||
# Return empty condensation if no user message
|
||||
action = CondensationAction(forgotten_event_ids=[])
|
||||
return Condensation(action=action)
|
||||
|
||||
# Find the first user message index
|
||||
first_user_msg_index = -1
|
||||
for i, event in enumerate(events):
|
||||
if isinstance(event, MessageAction) and event.source == EventSource.USER:
|
||||
first_user_msg_index = i
|
||||
break
|
||||
|
||||
# Find Recall Action and Observation related to the First User Message
|
||||
for i in range(first_user_msg_index + 1, len(events)):
|
||||
event = events[i]
|
||||
if (
|
||||
isinstance(event, RecallAction)
|
||||
and event.query == first_user_msg.content
|
||||
):
|
||||
recall_action = event
|
||||
# Look for its observation
|
||||
for j in range(i + 1, len(events)):
|
||||
obs_event = events[j]
|
||||
if (
|
||||
isinstance(obs_event, Observation)
|
||||
and obs_event.cause == recall_action.id
|
||||
):
|
||||
recall_observation = obs_event
|
||||
break
|
||||
break
|
||||
|
||||
# Collect essential events
|
||||
essential_events: list[int] = [] # Store event IDs
|
||||
if system_message:
|
||||
essential_events.append(system_message.id)
|
||||
essential_events.append(first_user_msg.id)
|
||||
if recall_action:
|
||||
essential_events.append(recall_action.id)
|
||||
if recall_observation:
|
||||
essential_events.append(recall_observation.id)
|
||||
|
||||
# 2. Determine which events to keep
|
||||
num_essential_events = len(essential_events)
|
||||
total_events = len(events)
|
||||
num_non_essential_events = total_events - num_essential_events
|
||||
|
||||
# Keep roughly half of the non-essential events
|
||||
num_recent_to_keep = max(1, num_non_essential_events // 2)
|
||||
|
||||
# Calculate the starting index for recent events to keep
|
||||
slice_start_index = total_events - num_recent_to_keep
|
||||
slice_start_index = max(0, slice_start_index)
|
||||
|
||||
# 3. Handle dangling observations at the start of the slice
|
||||
# Find the first non-observation event in the slice
|
||||
recent_events_slice = events[slice_start_index:]
|
||||
first_valid_event_index_in_slice = 0
|
||||
for i, event in enumerate(recent_events_slice):
|
||||
if not isinstance(event, Observation):
|
||||
first_valid_event_index_in_slice = i
|
||||
break
|
||||
else:
|
||||
# All events in the slice are observations
|
||||
first_valid_event_index_in_slice = len(recent_events_slice)
|
||||
|
||||
# Check if all events in the recent slice are dangling observations
|
||||
if first_valid_event_index_in_slice == len(recent_events_slice):
|
||||
logger.warning(
|
||||
'All recent events are dangling observations, which we truncate. This means the agent has only the essential first events. This should not happen.'
|
||||
)
|
||||
|
||||
# Calculate the actual index in the full events list
|
||||
first_valid_event_index = slice_start_index + first_valid_event_index_in_slice
|
||||
|
||||
if first_valid_event_index_in_slice > 0:
|
||||
logger.debug(
|
||||
f'Removed {first_valid_event_index_in_slice} dangling observation(s) '
|
||||
f'from the start of recent event slice.'
|
||||
)
|
||||
|
||||
# 4. Determine which events to keep and which to forget
|
||||
events_to_keep: set[int] = set(essential_events)
|
||||
|
||||
# Add recent events starting from first_valid_event_index
|
||||
for i in range(first_valid_event_index, total_events):
|
||||
events_to_keep.add(events[i].id)
|
||||
|
||||
# Calculate which events to forget
|
||||
all_event_ids = {e.id for e in events}
|
||||
forgotten_event_ids = sorted(all_event_ids - events_to_keep)
|
||||
|
||||
logger.info(
|
||||
f'ConversationWindowCondenser: Keeping {len(events_to_keep)} events, '
|
||||
f'forgetting {len(forgotten_event_ids)} events.'
|
||||
)
|
||||
|
||||
# Create the condensation action
|
||||
if forgotten_event_ids:
|
||||
# Use range if the forgotten events are contiguous
|
||||
if (
|
||||
len(forgotten_event_ids) > 1
|
||||
and forgotten_event_ids[-1] - forgotten_event_ids[0]
|
||||
== len(forgotten_event_ids) - 1
|
||||
):
|
||||
action = CondensationAction(
|
||||
forgotten_events_start_id=forgotten_event_ids[0],
|
||||
forgotten_events_end_id=forgotten_event_ids[-1],
|
||||
)
|
||||
else:
|
||||
action = CondensationAction(forgotten_event_ids=forgotten_event_ids)
|
||||
else:
|
||||
action = CondensationAction(forgotten_event_ids=[])
|
||||
|
||||
return Condensation(action=action)
|
||||
|
||||
def should_condense(self, view: View) -> bool:
|
||||
return view.unhandled_condensation_request
|
||||
|
||||
@classmethod
|
||||
def from_config(
|
||||
cls, _config: ConversationWindowCondenserConfig
|
||||
) -> ConversationWindowCondenser:
|
||||
return ConversationWindowCondenser()
|
||||
|
||||
|
||||
ConversationWindowCondenser.register_config(ConversationWindowCondenserConfig)
|
||||
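The last step of `get_condensation` above picks between a start/end ID range and an explicit ID list, depending on whether the forgotten IDs are contiguous. A minimal sketch of that check follows, assuming unique integer IDs. The `forgotten_spec` helper and its dictionary return shape are hypothetical and only stand in for the two `CondensationAction` constructor forms.

```python
def forgotten_spec(forgotten_ids: list[int]) -> dict:
    """Describe forgotten IDs as a range when contiguous, else as a list."""
    ids = sorted(forgotten_ids)

    # With unique IDs, the span equals the count minus one only when
    # every ID between the first and the last is present.
    if len(ids) > 1 and ids[-1] - ids[0] == len(ids) - 1:
        return {'start_id': ids[0], 'end_id': ids[-1]}

    # Gaps, a single ID, or nothing at all: fall back to the explicit list.
    return {'ids': ids}
```

For example, `[3, 4, 5]` collapses to a range while `[3, 5, 6]` stays a list. The check relies on IDs being unique; duplicates would make a gapped list look contiguous.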
@@ -2,7 +2,6 @@ import asyncio
|
||||
import os
|
||||
import uuid
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import Callable
|
||||
|
||||
import openhands
|
||||
@@ -34,8 +33,6 @@ GLOBAL_MICROAGENTS_DIR = os.path.join(
|
||||
'microagents',
|
||||
)
|
||||
|
||||
USER_MICROAGENTS_DIR = Path.home() / '.openhands' / 'microagents'
|
||||
|
||||
|
||||
class Memory:
|
||||
"""
|
||||
@@ -80,9 +77,6 @@ class Memory:
|
||||
# from typically OpenHands/microagents (i.e., the PUBLIC microagents)
|
||||
self._load_global_microagents()
|
||||
|
||||
# Load user microagents from ~/.openhands/microagents/
|
||||
self._load_user_microagents()
|
||||
|
||||
def on_event(self, event: Event):
|
||||
"""Handle an event from the event stream."""
|
||||
asyncio.get_event_loop().run_until_complete(self._on_event(event))
|
||||
@@ -273,34 +267,12 @@ class Memory:
|
||||
repo_agents, knowledge_agents = load_microagents_from_dir(
|
||||
GLOBAL_MICROAGENTS_DIR
|
||||
)
|
||||
for name, agent_knowledge in knowledge_agents.items():
|
||||
self.knowledge_microagents[name] = agent_knowledge
|
||||
for name, agent_repo in repo_agents.items():
|
||||
self.repo_microagents[name] = agent_repo
|
||||
|
||||
def _load_user_microagents(self) -> None:
|
||||
"""
|
||||
Loads microagents from the user's home directory (~/.openhands/microagents/)
|
||||
Creates the directory if it doesn't exist.
|
||||
"""
|
||||
try:
|
||||
# Create the user microagents directory if it doesn't exist
|
||||
os.makedirs(USER_MICROAGENTS_DIR, exist_ok=True)
|
||||
|
||||
# Load microagents from user directory
|
||||
repo_agents, knowledge_agents = load_microagents_from_dir(
|
||||
USER_MICROAGENTS_DIR
|
||||
)
|
||||
|
||||
for name, agent_knowledge in knowledge_agents.items():
|
||||
self.knowledge_microagents[name] = agent_knowledge
|
||||
for name, agent_repo in repo_agents.items():
|
||||
self.repo_microagents[name] = agent_repo
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
f'Failed to load user microagents from {USER_MICROAGENTS_DIR}: {str(e)}'
|
||||
)
|
||||
for name, k_agent in knowledge_agents.items():
|
||||
if isinstance(k_agent, KnowledgeMicroagent):
|
||||
self.knowledge_microagents[name] = k_agent
|
||||
for name, r_agent in repo_agents.items():
|
||||
if isinstance(r_agent, RepoMicroagent):
|
||||
self.repo_microagents[name] = r_agent
|
||||
|
||||
def get_microagent_mcp_tools(self) -> list[MCPConfig]:
|
||||
"""
|
||||
|
||||
@@ -5,7 +5,7 @@ from typing import overload
|
||||
from pydantic import BaseModel
|
||||
|
||||
from openhands.core.logger import openhands_logger as logger
|
||||
from openhands.events.action.agent import CondensationAction, CondensationRequestAction
|
||||
from openhands.events.action.agent import CondensationAction
|
||||
from openhands.events.event import Event
|
||||
from openhands.events.observation.agent import AgentCondensationObservation
|
||||
|
||||
@@ -17,7 +17,6 @@ class View(BaseModel):
|
||||
"""
|
||||
|
||||
events: list[Event]
|
||||
unhandled_condensation_request: bool = False
|
||||
|
||||
def __len__(self) -> int:
|
||||
return len(self.events)
|
||||
@@ -53,8 +52,6 @@ class View(BaseModel):
|
||||
forgotten_event_ids.update(event.forgotten)
|
||||
# Make sure we also forget the condensation action itself
|
||||
forgotten_event_ids.add(event.id)
|
||||
if isinstance(event, CondensationRequestAction):
|
||||
forgotten_event_ids.add(event.id)
|
||||
|
||||
kept_events = [event for event in events if event.id not in forgotten_event_ids]
|
||||
|
||||
@@ -77,17 +74,4 @@ class View(BaseModel):
|
||||
summary_offset, AgentCondensationObservation(content=summary)
|
||||
)
|
||||
|
||||
# Check for an unhandled condensation request -- these are events closer to the
|
||||
# end of the list than any condensation action.
|
||||
unhandled_condensation_request = False
|
||||
for event in reversed(events):
|
||||
if isinstance(event, CondensationAction):
|
||||
break
|
||||
if isinstance(event, CondensationRequestAction):
|
||||
unhandled_condensation_request = True
|
||||
break
|
||||
|
||||
return View(
|
||||
events=kept_events,
|
||||
unhandled_condensation_request=unhandled_condensation_request,
|
||||
)
|
||||
return View(events=kept_events)
|
||||
|
||||
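The `unhandled_condensation_request` flag in the hunk above comes from a reverse scan: a request counts as unhandled only if no condensation action appears after it. A minimal sketch of that scan, using plain strings in place of event classes; the `has_unhandled_request` name and the string labels are assumptions for illustration.

```python
def has_unhandled_request(kinds: list[str]) -> bool:
    """Return True if a request is closer to the end than any condensation."""
    for kind in reversed(kinds):
        if kind == 'condensation':
            # A condensation came last, so any earlier request was handled.
            return False
        if kind == 'condensation_request':
            return True
    # Neither kind appears in the history.
    return False
```

Scanning from the end means the loop stops at the first relevant event, so the cost depends on how far back the most recent request or condensation lies rather than on the total history length.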
@@ -1,6 +1,5 @@
|
||||
import io
|
||||
import re
|
||||
from itertools import chain
|
||||
from pathlib import Path
|
||||
from typing import Union
|
||||
|
||||
@@ -40,11 +39,7 @@ class BaseMicroagent(BaseModel):
|
||||
# Otherwise, we will rely on the name from metadata later
|
||||
derived_name = None
|
||||
if microagent_dir is not None:
|
||||
# Special handling for .cursorrules files which are not in microagent_dir
|
||||
if path.name == '.cursorrules':
|
||||
derived_name = 'cursorrules'
|
||||
else:
|
||||
derived_name = str(path.relative_to(microagent_dir).with_suffix(''))
|
||||
derived_name = str(path.relative_to(microagent_dir).with_suffix(''))
|
||||
|
||||
# Only load directly from path if file_content is not provided
|
||||
if file_content is None:
|
||||
@@ -61,16 +56,6 @@ class BaseMicroagent(BaseModel):
|
||||
type=MicroagentType.REPO_KNOWLEDGE,
|
||||
)
|
||||
|
||||
# Handle .cursorrules files
|
||||
if path.name == '.cursorrules':
|
||||
return RepoMicroagent(
|
||||
name='cursorrules',
|
||||
content=file_content,
|
||||
metadata=MicroagentMetadata(name='cursorrules'),
|
||||
source=str(path),
|
||||
type=MicroagentType.REPO_KNOWLEDGE,
|
||||
)
|
||||
|
||||
file_io = io.StringIO(file_content)
|
||||
loaded = frontmatter.load(file_io)
|
||||
content = loaded.content
|
||||
@@ -273,15 +258,10 @@ def load_microagents_from_dir(
|
||||
# Load all agents from microagents directory
|
||||
logger.debug(f'Loading agents from {microagent_dir}')
|
||||
if microagent_dir.exists():
|
||||
# Collect .cursorrules file from repo root and .md files from microagents dir
|
||||
cursorrules_files = []
|
||||
if (microagent_dir.parent.parent / '.cursorrules').exists():
|
||||
cursorrules_files = [microagent_dir.parent.parent / '.cursorrules']
|
||||
|
||||
md_files = [f for f in microagent_dir.rglob('*.md') if f.name != 'README.md']
|
||||
|
||||
# Process all files in one loop
|
||||
for file in chain(cursorrules_files, md_files):
|
||||
for file in microagent_dir.rglob('*.md'):
|
||||
# skip README.md
|
||||
if file.name == 'README.md':
|
||||
continue
|
||||
try:
|
||||
agent = BaseMicroagent.load(file, microagent_dir)
|
||||
if isinstance(agent, RepoMicroagent):
|
||||
|
||||
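The variant of `load_microagents_from_dir` in this hunk that handles `.cursorrules` gathers one optional file from two levels above the microagents directory and chains it with the `*.md` files, skipping `README.md`. The file-collection step alone can be sketched as below. The `collect_microagent_files` and `demo` helpers, the `.openhands/microagents` layout used in the demo, and the sorting of results are assumptions for illustration, not part of the diff.

```python
import tempfile
from itertools import chain
from pathlib import Path


def collect_microagent_files(microagent_dir: Path) -> list[Path]:
    """Return the optional .cursorrules file followed by non-README .md files."""
    cursorrules = microagent_dir.parent.parent / '.cursorrules'
    cursorrules_files = [cursorrules] if cursorrules.exists() else []

    # Sorted here only to make the order deterministic for the demo.
    md_files = sorted(
        f for f in microagent_dir.rglob('*.md') if f.name != 'README.md'
    )
    return list(chain(cursorrules_files, md_files))


def demo() -> list[str]:
    """Build a throwaway repo layout and list the collected file names."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        agents = root / '.openhands' / 'microagents'
        (agents / 'nested').mkdir(parents=True)
        (root / '.cursorrules').write_text('rules')
        (agents / 'a.md').write_text('a')
        (agents / 'README.md').write_text('readme')
        (agents / 'nested' / 'b.md').write_text('b')
        return [f.name for f in collect_microagent_files(agents)]
```

Filtering `README.md` in the comprehension replaces the `continue` inside the loop, which is the same trade the hunk makes when it processes both file sources in one pass.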
@@ -5,6 +5,7 @@ import dataclasses
|
||||
import json
|
||||
import os
|
||||
import pathlib
|
||||
import shutil
|
||||
import subprocess
|
||||
from argparse import Namespace
|
||||
from typing import Any
|
||||
@@ -393,7 +394,7 @@ class IssueResolver:
|
||||
async def process_issue(
|
||||
self,
|
||||
issue: Issue,
|
||||
branch_to_checkout: str | None,
|
||||
base_commit: str,
|
||||
issue_handler: ServiceContextIssue | ServiceContextPR,
|
||||
reset_logger: bool = False,
|
||||
) -> ResolverOutput:
|
||||
@@ -404,45 +405,14 @@ class IssueResolver:
|
||||
else:
|
||||
logger.info(f'Starting fixing issue {issue.number}.')
|
||||
|
||||
# create runtime and clone repo using standard pattern
|
||||
# write the repo to the workspace
|
||||
if os.path.exists(self.workspace_base):
|
||||
shutil.rmtree(self.workspace_base)
|
||||
shutil.copytree(os.path.join(self.output_dir, 'repo'), self.workspace_base)
|
||||
|
||||
runtime = create_runtime(self.app_config)
|
||||
await runtime.connect()
|
||||
|
||||
# clone repo directly into runtime workspace
|
||||
from openhands.core.setup import initialize_repository_for_runtime
|
||||
|
||||
initialize_repository_for_runtime(runtime, self.issue_handler.get_clone_url())
|
||||
|
||||
# checkout to PR branch if needed
|
||||
if branch_to_checkout:
|
||||
logger.info(f'Checking out to PR branch {branch_to_checkout}')
|
||||
# Fetch the branch first to ensure it exists locally
|
||||
fetch_cmd = ['git', 'fetch', 'origin', branch_to_checkout]
|
||||
subprocess.check_output(fetch_cmd, cwd=runtime.workspace_root) # noqa: ASYNC101
|
||||
|
||||
# Checkout the branch
|
||||
checkout_cmd = ['git', 'checkout', branch_to_checkout]
|
||||
subprocess.check_output(checkout_cmd, cwd=runtime.workspace_root) # noqa: ASYNC101
|
||||
|
||||
# get the commit id of current repo for reproducibility
|
||||
base_commit = (
|
||||
subprocess.check_output(
|
||||
['git', 'rev-parse', 'HEAD'], cwd=runtime.workspace_root
|
||||
) # noqa: ASYNC101
|
||||
.decode('utf-8')
|
||||
.strip()
|
||||
)
|
||||
logger.info(f'Base commit: {base_commit}')
|
||||
|
||||
# Check for .openhands_instructions file in the workspace directory
|
||||
if self.repo_instruction is None:
|
||||
openhands_instructions_path = os.path.join(
|
||||
runtime.workspace_root, '.openhands_instructions'
|
||||
)
|
||||
if os.path.exists(openhands_instructions_path):
|
||||
with open(openhands_instructions_path, 'r') as f: # noqa: ASYNC101
|
||||
self.repo_instruction = f.read()
|
||||
|
||||
def on_event(evt: Event) -> None:
|
||||
logger.info(evt)
|
||||
|
||||
@@ -595,10 +565,36 @@ class IssueResolver:
|
||||
)
|
||||
logger.info(f'Using output directory: {self.output_dir}')
|
||||
|
||||
# repo will be cloned later in process_issue using standard pattern
|
||||
# base_commit will be captured after cloning
|
||||
# checkout the repo
|
||||
repo_dir = os.path.join(self.output_dir, 'repo')
|
||||
if not os.path.exists(repo_dir):
|
||||
checkout_output = subprocess.check_output( # noqa: ASYNC101
|
||||
[
|
||||
'git',
|
||||
'clone',
|
||||
self.issue_handler.get_clone_url(),
|
||||
f'{self.output_dir}/repo',
|
||||
]
|
||||
).decode('utf-8')
|
||||
if 'fatal' in checkout_output:
|
||||
raise RuntimeError(f'Failed to clone repository: {checkout_output}')
|
||||
|
||||
# .openhands_instructions will be read after repo is cloned in process_issue
|
||||
# get the commit id of current repo for reproducibility
|
||||
base_commit = (
|
||||
subprocess.check_output(['git', 'rev-parse', 'HEAD'], cwd=repo_dir) # noqa: ASYNC101
|
||||
.decode('utf-8')
|
||||
.strip()
|
||||
)
|
||||
logger.info(f'Base commit: {base_commit}')
|
||||
|
||||
if self.repo_instruction is None:
|
||||
# Check for .openhands_instructions file in the workspace directory
|
||||
openhands_instructions_path = os.path.join(
|
||||
repo_dir, '.openhands_instructions'
|
||||
)
|
||||
if os.path.exists(openhands_instructions_path):
|
||||
with open(openhands_instructions_path, 'r') as f: # noqa: ASYNC101
|
||||
self.repo_instruction = f.read()
|
||||
|
||||
# OUTPUT FILE
|
||||
output_file = os.path.join(self.output_dir, 'output.jsonl')
|
||||
@@ -622,19 +618,39 @@ class IssueResolver:
|
||||
)
|
||||
|
||||
try:
|
||||
# determine branch to use for PR
|
||||
branch_to_use = None
|
||||
# checkout to pr branch if needed
|
||||
if self.issue_type == 'pr':
|
||||
branch_to_use = issue.head_branch
|
||||
logger.info(
|
||||
f'Will checkout to PR branch {branch_to_use} for issue {issue.number}'
|
||||
f'Checking out to PR branch {branch_to_use} for issue {issue.number}'
|
||||
)
|
||||
|
||||
if not branch_to_use:
|
||||
raise ValueError('Branch name cannot be None')
|
||||
|
||||
# Fetch the branch first to ensure it exists locally
|
||||
fetch_cmd = ['git', 'fetch', 'origin', branch_to_use]
|
||||
subprocess.check_output( # noqa: ASYNC101
|
||||
fetch_cmd,
|
||||
cwd=repo_dir,
|
||||
)
|
||||
|
||||
# Checkout the branch
|
||||
checkout_cmd = ['git', 'checkout', branch_to_use]
|
||||
subprocess.check_output( # noqa: ASYNC101
|
||||
checkout_cmd,
|
||||
cwd=repo_dir,
|
||||
)
|
||||
|
||||
base_commit = (
|
||||
subprocess.check_output(['git', 'rev-parse', 'HEAD'], cwd=repo_dir) # noqa: ASYNC101
|
||||
.decode('utf-8')
|
||||
.strip()
|
||||
)
|
||||
|
||||
output = await self.process_issue(
|
||||
issue,
|
||||
branch_to_use, # pass branch instead of base_commit
|
||||
base_commit,
|
||||
self.issue_handler,
|
||||
reset_logger,
|
||||
)
|
||||
|
||||
@@ -10,7 +10,6 @@ from openhands.core.config import OpenHandsConfig
|
||||
from openhands.core.config.condenser_config import (
|
||||
BrowserOutputCondenserConfig,
|
||||
CondenserPipelineConfig,
|
||||
ConversationWindowCondenserConfig,
|
||||
LLMSummarizingCondenserConfig,
|
||||
)
|
||||
from openhands.core.config.mcp_config import MCPConfig, OpenHandsMCPConfigImpl
|
||||
@@ -157,18 +156,13 @@ class Session:
|
||||
agent_config = self.config.get_agent_config(agent_cls)
|
||||
|
||||
if settings.enable_default_condenser:
|
||||
# Default condenser chains three condensers together:
|
||||
# 1. a conversation window condenser that handles explicit
|
||||
# condensation requests,
|
||||
# 2. a condenser that limits the total size of browser observations,
|
||||
# and
|
||||
# 3. a condenser that limits the size of the view given to the LLM.
|
||||
# The order matters: with the browser output first, the summarizer
|
||||
# will only see the most recent browser output, which should keep
|
||||
# the summarization cost down.
|
||||
# Default condenser chains a condenser that limits the total
|
||||
# size of browser observations with a condenser that limits the size
|
||||
# of the view given to the LLM. The order matters: with the browser
|
||||
# output first, the summarizer will only see the most recent browser
|
||||
# output, which should keep the summarization cost down.
|
||||
default_condenser_config = CondenserPipelineConfig(
|
||||
condensers=[
|
||||
ConversationWindowCondenserConfig(),
|
||||
BrowserOutputCondenserConfig(attention_window=2),
|
||||
LLMSummarizingCondenserConfig(
|
||||
llm_config=llm.config, keep_first=4, max_size=120
|
||||
|
||||
@@ -454,10 +454,7 @@ def test_cmd_run(temp_dir, runtime_cls, run_as_openhands):
|
||||
):
|
||||
assert 'openhands' in obs.content
|
||||
elif runtime_cls == LocalRuntime or runtime_cls == CLIRuntime:
|
||||
# For CLI and Local runtimes, the user depends on the actual environment
|
||||
# In CI it might be a non-root user, in cloud environments it might be root
|
||||
# We just check that the command succeeded and the directory was created
|
||||
pass # Skip user-specific assertions for environment independence
|
||||
assert 'root' not in obs.content and 'openhands' not in obs.content
|
||||
else:
|
||||
assert 'root' in obs.content
|
||||
assert 'test' in obs.content
|
||||
|
||||
@@ -34,12 +34,7 @@ from openhands.events.observation.empty import NullObservation
|
||||
from openhands.events.serialization import event_to_dict
|
||||
from openhands.llm import LLM
|
||||
from openhands.llm.metrics import Metrics, TokenUsage
|
||||
from openhands.memory.condenser.condenser import Condensation
|
||||
from openhands.memory.condenser.impl.conversation_window_condenser import (
|
||||
ConversationWindowCondenser,
|
||||
)
|
||||
from openhands.memory.memory import Memory
|
||||
from openhands.memory.view import View
|
||||
from openhands.runtime.base import Runtime
|
||||
from openhands.runtime.impl.action_execution.action_execution_client import (
|
||||
ActionExecutionClient,
|
||||
@@ -840,23 +835,8 @@ async def test_notify_on_llm_retry(mock_agent, mock_event_stream, mock_status_ca
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
@pytest.mark.parametrize(
|
||||
'context_window_error',
|
||||
[
|
||||
ContextWindowExceededError(
|
||||
message='prompt is too long: 233885 tokens > 200000 maximum',
|
||||
model='',
|
||||
llm_provider='',
|
||||
),
|
||||
BadRequestError(
|
||||
message='litellm.BadRequestError: OpenrouterException - This endpoint\'s maximum context length is 40960 tokens. However, you requested about 42988 tokens (38892 of text input, 4096 in the output). Please reduce the length of either one, or use the "middle-out" transform to compress your prompt automatically.',
|
||||
model='openrouter/qwen/qwen3-30b-a3b',
|
||||
llm_provider='openrouter',
|
||||
),
|
||||
],
|
||||
)
|
||||
async def test_context_window_exceeded_error_handling(
|
||||
context_window_error, mock_agent, mock_runtime, test_event_stream, mock_memory
|
||||
mock_agent, mock_runtime, test_event_stream, mock_memory
|
||||
):
|
||||
"""Test that context window exceeded errors are handled correctly by the controller, providing a smaller view but keeping the history intact."""
|
||||
max_iterations = 5
|
||||
@@ -867,15 +847,9 @@ async def test_context_window_exceeded_error_handling(
|
||||
self.has_errored = False
|
||||
self.index = 0
|
||||
self.views = []
|
||||
self.condenser = ConversationWindowCondenser()
|
||||
|
||||
def step(self, state: State):
|
||||
match self.condenser.condense(state.view):
|
||||
case View() as view:
|
||||
self.views.append(view)
|
||||
|
||||
case Condensation(action=action):
|
||||
return action
|
||||
self.views.append(state.view)
|
||||
|
||||
# Wait until the right step to throw the error, and make sure we
|
||||
# only throw it once.
|
||||
@@ -883,13 +857,13 @@ async def test_context_window_exceeded_error_handling(
|
||||
self.index += 1
|
||||
return MessageAction(content=f'Test message {self.index}')
|
||||
|
||||
ContextWindowExceededError(
|
||||
error = ContextWindowExceededError(
|
||||
message='prompt is too long: 233885 tokens > 200000 maximum',
|
||||
model='',
|
||||
llm_provider='',
|
||||
)
|
||||
self.has_errored = True
|
||||
raise context_window_error
|
||||
raise error
|
||||
|
||||
step_state = StepState()
|
||||
mock_agent.step = step_state.step
|
||||
@@ -907,7 +881,7 @@ async def test_context_window_exceeded_error_handling(
|
||||
content='Test microagent content',
|
||||
recall_type=RecallType.KNOWLEDGE,
|
||||
)
|
||||
microagent_obs._cause = event.id # type: ignore
|
||||
microagent_obs._cause = event.id
|
||||
test_event_stream.add_event(microagent_obs, EventSource.ENVIRONMENT)
|
||||
|
||||
test_event_stream.subscribe(
|
||||
@@ -937,7 +911,7 @@ async def test_context_window_exceeded_error_handling(
|
||||
# Check that the context window exception was thrown and the controller
|
||||
# called the agent's `step` function the right number of times.
|
||||
assert step_state.has_errored
|
||||
assert len(step_state.views) == max_iterations - 1
|
||||
assert len(step_state.views) == max_iterations
|
||||
print('step_state.views: ', step_state.views)
|
||||
|
||||
# Look at pre/post-step views. Normally, these should always increase in
|
||||
@@ -962,7 +936,7 @@ async def test_context_window_exceeded_error_handling(
|
||||
assert len(first_view) < len(second_view)
|
||||
|
||||
# The final state's history should contain:
|
||||
# - (max_iterations - 1) number of message actions (one iteration taken up with the condensation request)
|
||||
# - max_iterations number of message actions,
|
||||
# - 1 recall actions,
|
||||
# - 1 recall observations,
|
||||
# - 1 condensation action.
|
||||
@@ -970,7 +944,7 @@ async def test_context_window_exceeded_error_handling(
|
||||
len(
|
||||
[event for event in final_state.history if isinstance(event, MessageAction)]
|
||||
)
|
||||
== max_iterations - 1
|
||||
== max_iterations
|
||||
)
|
||||
assert (
|
||||
len(
|
||||
@@ -981,7 +955,7 @@ async def test_context_window_exceeded_error_handling(
|
||||
and event.source == EventSource.AGENT
|
||||
]
|
||||
)
|
||||
== max_iterations - 2
|
||||
== max_iterations - 1
|
||||
)
|
||||
assert (
|
||||
len([event for event in final_state.history if isinstance(event, RecallAction)])
|
||||
@@ -1027,14 +1001,8 @@ async def test_run_controller_with_context_window_exceeded_with_truncation(
|
||||
class StepState:
|
||||
def __init__(self):
|
||||
self.has_errored = False
|
||||
self.condenser = ConversationWindowCondenser()
|
||||
|
||||
def step(self, state: State):
|
||||
match self.condenser.condense(state.view):
|
||||
case Condensation(action=action):
|
||||
return action
|
||||
case _:
|
||||
pass
|
||||
# If the state has more than one message and we haven't errored yet,
|
||||
# throw the context window exceeded error
|
||||
if len(state.history) > 5 and not self.has_errored:
|
||||
@@ -1646,3 +1614,206 @@ def test_system_message_in_event_stream(mock_agent, test_event_stream):
|
||||
assert isinstance(events[0], SystemMessageAction)
|
||||
assert events[0].content == 'Test system message'
|
||||
assert events[0].tools == ['test_tool']
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_openrouter_context_window_exceeded_error(
|
||||
mock_agent, test_event_stream, mock_status_callback
|
||||
):
|
||||
"""Test that OpenRouter context window exceeded errors are properly detected and handled."""
|
||||
max_iterations = 5
|
||||
error_after = 2
|
||||
|
||||
class StepState:
|
||||
def __init__(self):
|
||||
self.has_errored = False
|
||||
self.index = 0
|
||||
self.views = []
|
||||
|
||||
def step(self, state: State):
|
||||
self.views.append(state.view)
|
||||
|
||||
# Wait until the right step to throw the error, and make sure we
|
||||
# only throw it once.
|
||||
if self.index < error_after or self.has_errored:
|
||||
self.index += 1
|
||||
return MessageAction(content=f'Test message {self.index}')
|
||||
|
||||
# Create a BadRequestError with the OpenRouter context window exceeded message pattern
|
||||
error = BadRequestError(
|
||||
message='litellm.BadRequestError: OpenrouterException - This endpoint\'s maximum context length is 40960 tokens. However, you requested about 42988 tokens (38892 of text input, 4096 in the output). Please reduce the length of either one, or use the "middle-out" transform to compress your prompt automatically.',
|
||||
model='openrouter/qwen/qwen3-30b-a3b',
|
||||
llm_provider='openrouter',
|
||||
)
|
||||
self.has_errored = True
|
||||
raise error
|
||||
|
||||
step_state = StepState()
|
||||
mock_agent.step = step_state.step
|
||||
mock_agent.config = AgentConfig(enable_history_truncation=True)
|
||||
|
||||
controller = AgentController(
|
||||
agent=mock_agent,
|
||||
event_stream=test_event_stream,
|
||||
iteration_delta=max_iterations,
|
||||
sid='test',
|
||||
confirmation_mode=False,
|
||||
headless_mode=True,
|
||||
status_callback=mock_status_callback,
|
||||
)
|
||||
|
||||
# Set the agent state to RUNNING
|
||||
controller.state.agent_state = AgentState.RUNNING
|
||||
|
||||
# Run the controller until it hits the error
|
||||
for _ in range(error_after + 2): # +2 to ensure we go past the error
|
||||
await controller._step()
|
||||
if step_state.has_errored:
|
||||
break
|
||||
|
||||
# Verify that the error was handled as a context window exceeded error
|
||||
# by checking that _handle_long_context_error was called (which adds a CondensationAction)
|
||||
events = list(test_event_stream.get_events())
|
||||
condensation_actions = [e for e in events if isinstance(e, CondensationAction)]
|
||||
|
||||
# There should be at least one CondensationAction if the error was handled correctly
|
||||
assert len(condensation_actions) > 0, (
|
||||
'OpenRouter context window exceeded error was not handled correctly'
|
||||
)
|
||||
|
||||
await controller.close()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_sambanova_context_window_exceeded_error(
|
||||
mock_agent, test_event_stream, mock_status_callback
|
||||
):
|
||||
"""Test that SambaNova context window exceeded errors are properly detected and handled."""
|
||||
max_iterations = 5
|
||||
error_after = 2
|
||||
|
||||
class StepState:
|
||||
def __init__(self):
|
||||
self.has_errored = False
|
||||
self.index = 0
|
||||
self.views = []
|
||||
|
||||
def step(self, state: State):
|
||||
# Store the view for later inspection
|
||||
self.views.append(state.view)
|
||||
# Wait until the right step to throw the error, and only throw it once.
|
||||
if self.index < error_after or self.has_errored:
|
||||
self.index += 1
|
||||
return MessageAction(content=f'Test message {self.index}')
|
||||
|
||||
# Create a BadRequestError with the SambaNova context window exceeded message pattern
|
||||
error = BadRequestError(
|
||||
message='litellm.BadRequestError: SambanovaException - The maximum context length of DeepSeek-V3-0324 is 32768. However, answering your request will take 39732 tokens. Please reduce the length of the messages or the specified max_completion_tokens value.',
|
||||
model='sambanova/deepseek-v3-0324',
|
||||
llm_provider='sambanova',
|
||||
)
|
||||
self.has_errored = True
|
||||
raise error
|
||||
|
||||
step_state = StepState()
|
||||
mock_agent.step = step_state.step
|
||||
mock_agent.config = AgentConfig(enable_history_truncation=True)
|
||||
|
||||
controller = AgentController(
|
||||
agent=mock_agent,
|
||||
event_stream=test_event_stream,
|
||||
iteration_delta=max_iterations,
|
||||
sid='test',
|
||||
confirmation_mode=False,
|
||||
headless_mode=True,
|
||||
status_callback=mock_status_callback,
|
||||
)
|
||||
|
||||
# Set the agent state to RUNNING
|
||||
controller.state.agent_state = AgentState.RUNNING
|
||||
|
||||
# Run the controller until it hits the error
|
||||
for _ in range(error_after + 2): # +2 to ensure we go past the error
|
||||
await controller._step()
|
||||
if step_state.has_errored:
|
||||
break
|
||||
|
||||
# Verify that the error was handled as a context window exceeded error
|
||||
# by checking that _handle_long_context_error was called (which adds a CondensationAction)
|
||||
events = list(test_event_stream.get_events())
|
||||
condensation_actions = [e for e in events if isinstance(e, CondensationAction)]
|
||||
|
||||
# There should be at least one CondensationAction if the error was handled correctly
|
||||
assert len(condensation_actions) > 0, (
|
||||
'SambaNova context window exceeded error was not handled correctly'
|
||||
)
|
||||
|
||||
await controller.close()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_sambanova_generic_exception_not_handled_as_context_error(
|
||||
mock_agent, test_event_stream, mock_status_callback
|
||||
):
|
||||
"""Test that generic SambaNova exceptions (without context length pattern) are NOT handled as context window errors."""
|
||||
max_iterations = 5
|
||||
error_after = 2
|
||||
|
||||
class StepState:
|
||||
def __init__(self):
|
||||
self.has_errored = False
|
||||
self.index = 0
|
||||
self.views = []
|
||||
|
||||
def step(self, state: State):
|
||||
# Store the view for later inspection
|
||||
self.views.append(state.view)
|
||||
# Wait until the right step to throw the error, and only throw it once.
|
||||
if self.index < error_after or self.has_errored:
|
||||
self.index += 1
|
||||
return MessageAction(content=f'Test message {self.index}')
|
||||
|
||||
# Create a BadRequestError with a generic SambaNova error (no context length pattern)
|
||||
error = BadRequestError(
|
||||
message='litellm.BadRequestError: SambanovaException - Some other error occurred',
|
||||
model='sambanova/deepseek-v3-0324',
|
||||
llm_provider='sambanova',
|
||||
)
|
||||
self.has_errored = True
|
||||
raise error
|
||||
|
||||
step_state = StepState()
|
||||
mock_agent.step = step_state.step
|
||||
mock_agent.config = AgentConfig(enable_history_truncation=True)
|
||||
|
||||
controller = AgentController(
|
||||
agent=mock_agent,
|
||||
event_stream=test_event_stream,
|
||||
iteration_delta=max_iterations,
|
||||
sid='test',
|
||||
confirmation_mode=False,
|
||||
headless_mode=True,
|
||||
status_callback=mock_status_callback,
|
||||
)
|
||||
|
||||
# Set the agent state to RUNNING
|
||||
controller.state.agent_state = AgentState.RUNNING
|
||||
|
||||
# Run the controller until it hits the error
|
||||
with pytest.raises(BadRequestError):
|
||||
for _ in range(error_after + 2): # +2 to ensure we go past the error
|
||||
await controller._step()
|
||||
if step_state.has_errored:
|
||||
break
|
||||
|
||||
# Verify that the error was NOT handled as a context window exceeded error
|
||||
# by checking that _handle_long_context_error was NOT called (no CondensationAction should be added)
|
||||
events = list(test_event_stream.get_events())
|
||||
condensation_actions = [e for e in events if isinstance(e, CondensationAction)]
|
||||
|
||||
# There should be NO CondensationAction if the error was correctly NOT handled as context window error
|
||||
assert len(condensation_actions) == 0, (
|
||||
'Generic SambaNova exception was incorrectly handled as context window error'
|
||||
)
|
||||
|
||||
await controller.close()
|
||||
|
||||
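The OpenRouter and SambaNova tests in this diff rely on the controller recognising a context-window overflow from the text of a `BadRequestError`, while a generic SambaNova error must still propagate. The controller's actual check is not part of this diff, so the following is only a minimal sketch of that kind of classification: the function name `looks_like_context_window_error` and the chosen substrings are assumptions, picked to match the error messages used in the tests.

```python
def looks_like_context_window_error(message: str) -> bool:
    """Hypothetical helper: guess whether an LLM error message reports a
    context-window overflow. The substrings are assumptions taken from the
    test messages in this diff, not from the controller's real code."""
    text = message.lower()
    # Message used with ContextWindowExceededError in the tests.
    if 'prompt is too long' in text:
        return True
    # OpenRouter-style wording.
    if 'please reduce the length of either one' in text:
        return True
    # SambaNova-style wording: require both the provider marker and the
    # context-length phrase, so generic SambaNova errors are not matched.
    return 'sambanovaexception' in text and 'maximum context length' in text


openrouter_msg = (
    "litellm.BadRequestError: OpenrouterException - This endpoint's maximum "
    'context length is 40960 tokens. However, you requested about 42988 tokens '
    '(38892 of text input, 4096 in the output). Please reduce the length of '
    'either one, or use the "middle-out" transform to compress your prompt '
    'automatically.'
)
sambanova_msg = (
    'litellm.BadRequestError: SambanovaException - The maximum context length '
    'of DeepSeek-V3-0324 is 32768. However, answering your request will take '
    '39732 tokens. Please reduce the length of the messages or the specified '
    'max_completion_tokens value.'
)
generic_msg = 'litellm.BadRequestError: SambanovaException - Some other error occurred'

for msg in (openrouter_msg, sambanova_msg, generic_msg):
    print(looks_like_context_window_error(msg))
```

Requiring two substrings for the SambaNova case mirrors the intent of `test_sambanova_generic_exception_not_handled_as_context_error`: the provider name alone should not trigger condensation.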
@@ -1,34 +1,24 @@
|
||||
"""
|
||||
Unit tests for ConversationWindowCondenser.
|
||||
|
||||
These tests mirror the tests for `_apply_conversation_window` in the AgentController,
|
||||
but adapted to test the condenser implementation. The ConversationWindowCondenser
|
||||
copies the functionality of the `_apply_conversation_window` function as closely as possible.
|
||||
|
||||
The tests verify that the condenser:
|
||||
1. Identifies essential initial events (System Message, First User Message, Recall Action/Observation)
|
||||
2. Keeps roughly half of the non-essential events from recent history
|
||||
3. Handles dangling observations properly
|
||||
4. Returns appropriate CondensationAction objects specifying which events to forget
|
||||
"""
|
||||
|
||||
from unittest.mock import patch
|
||||
from unittest.mock import MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from openhands.controller.agent import Agent
|
||||
from openhands.controller.agent_controller import AgentController
|
||||
from openhands.controller.state.state import State
|
||||
from openhands.core.config import OpenHandsConfig
|
||||
from openhands.events import EventSource
|
||||
from openhands.events.action import CmdRunAction, MessageAction, RecallAction
|
||||
from openhands.events.action.agent import CondensationAction
|
||||
from openhands.events.action.message import SystemMessageAction
|
||||
from openhands.events.event import RecallType
|
||||
from openhands.events.observation import (
|
||||
CmdOutputObservation,
|
||||
Observation,
|
||||
RecallObservation,
|
||||
)
|
||||
from openhands.memory.condenser.condenser import Condensation, View
|
||||
from openhands.memory.condenser.impl.conversation_window_condenser import (
|
||||
ConversationWindowCondenser,
|
||||
)
|
||||
from openhands.events.stream import EventStream
|
||||
from openhands.llm.llm import LLM
|
||||
from openhands.llm.metrics import Metrics
|
||||
from openhands.storage.memory import InMemoryFileStore
|
||||
|
||||
|
||||
# Helper function to create events with sequential IDs and causes
|
||||
@@ -96,20 +86,44 @@ def create_events(event_data):
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def condenser_fixture():
|
||||
condenser = ConversationWindowCondenser()
|
||||
return condenser
|
||||
def controller_fixture():
|
||||
mock_agent = MagicMock(spec=Agent)
|
||||
mock_agent.llm = MagicMock(spec=LLM)
|
||||
mock_agent.llm.metrics = Metrics()
|
||||
mock_agent.llm.config = OpenHandsConfig().get_llm_config()
|
||||
mock_agent.config = OpenHandsConfig().get_agent_config('CodeActAgent')
|
||||
|
||||
mock_event_stream = MagicMock(spec=EventStream)
|
||||
mock_event_stream.sid = 'test_sid'
|
||||
mock_event_stream.file_store = InMemoryFileStore({})
|
||||
# Ensure get_latest_event_id returns an integer
|
||||
mock_event_stream.get_latest_event_id.return_value = -1
|
||||
|
||||
# Create a state with iteration_flag.max_value set to 10
|
||||
state = State(inputs={}, session_id='test_sid')
|
||||
state.iteration_flag.max_value = 10
|
||||
|
||||
controller = AgentController(
|
||||
agent=mock_agent,
|
||||
event_stream=mock_event_stream,
|
||||
iteration_delta=1, # Add the required iteration_delta parameter
|
||||
sid='test_sid',
|
||||
initial_state=state,
|
||||
)
|
||||
|
||||
# Don't mock _first_user_message anymore since we need it to work with history
|
||||
return controller
|
||||
|
||||
|
||||
# =============================================
|
||||
# Test Cases for ConversationWindowCondenser
|
||||
# Test Cases for _apply_conversation_window
|
||||
# =============================================
|
||||
|
||||
|
||||
def test_basic_truncation(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
def test_basic_truncation(controller_fixture):
|
||||
controller = controller_fixture
|
||||
|
||||
events = create_events(
|
||||
controller.state.history = create_events(
|
||||
[
|
||||
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
|
||||
{
|
||||
@@ -142,7 +156,6 @@ def test_basic_truncation(condenser_fixture):
|
||||
}, # 10
|
||||
]
|
||||
)
|
||||
view = View(events=events)
|
||||
|
||||
# Calculation (RecallAction now essential):
|
||||
# History len = 10
|
||||
@@ -154,22 +167,21 @@ def test_basic_truncation(condenser_fixture):
|
||||
# Validation: remove leading obs2(8). validated_slice = [cmd3(9), obs3(10)]
|
||||
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4), cmd3(9), obs3(10)]
|
||||
# Expected IDs: [1, 2, 3, 4, 9, 10]. Length 6.
|
||||
# Forgotten IDs: [5, 6, 7, 8]
|
||||
condensation = condenser.get_condensation(view)
|
||||
truncated_events = controller._apply_conversation_window(controller.state.history)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
|
||||
# Check the forgotten event IDs
|
||||
forgotten_ids = condensation.action.forgotten
|
||||
expected_forgotten = [5, 6, 7, 8]
|
||||
assert sorted(forgotten_ids) == expected_forgotten
|
||||
assert len(truncated_events) == 6
|
||||
expected_ids = [1, 2, 3, 4, 9, 10]
|
||||
actual_ids = [e.id for e in truncated_events]
|
||||
assert actual_ids == expected_ids
|
||||
# Check no dangling observations at the start of the recent slice part
|
||||
# The first event of the validated slice is cmd3(9)
|
||||
assert not isinstance(truncated_events[4], Observation) # Index adjusted
|
||||
|
||||
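The "Calculation" comments in these tests all follow the same arithmetic: keep the essential events, keep roughly half of the non-essential events from the end of the history, then drop any observations left dangling at the start of that recent slice. The sketch below reproduces only that arithmetic on plain IDs. The `Ev` type and `apply_window` function are hypothetical stand-ins, not the condenser or controller implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ev:
    """Hypothetical stand-in for an event: an ID plus an observation flag."""
    id: int
    is_observation: bool = False


def apply_window(history, essential_ids):
    """Return (kept_ids, forgotten_ids) following the test comments' arithmetic."""
    essentials = [e for e in history if e.id in essential_ids]
    non_essential_count = len(history) - len(essentials)
    num_recent_to_keep = max(1, non_essential_count // 2)
    slice_start_index = len(history) - num_recent_to_keep
    recent = history[slice_start_index:]

    # Validation: skip leading observations whose actions were cut off.
    first_action = 0
    while first_action < len(recent) and recent[first_action].is_observation:
        first_action += 1
    validated = [e for e in recent[first_action:] if e.id not in essential_ids]

    kept = [e.id for e in essentials] + [e.id for e in validated]
    kept_set = set(kept)
    forgotten = [e.id for e in history if e.id not in kept_set]
    return kept, forgotten


# Shape of test_basic_truncation: 10 events, IDs 1-4 essential,
# IDs 4, 6, 8 and 10 are observations.
history = [Ev(i, is_observation=i in {4, 6, 8, 10}) for i in range(1, 11)]
kept, forgotten = apply_window(history, essential_ids={1, 2, 3, 4})
print(kept, forgotten)
```

With 6 non-essential events, 3 are kept from the tail (IDs 8, 9, 10), and the leading observation 8 is then dropped, which is why the test expects IDs 5 through 8 to be forgotten.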
|
||||
def test_no_system_message(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
def test_no_system_message(controller_fixture):
|
||||
controller = controller_fixture
|
||||
|
||||
events = create_events(
|
||||
controller.state.history = create_events(
|
||||
[
|
||||
{
|
||||
'type': MessageAction,
|
||||
@@ -201,7 +213,7 @@ def test_no_system_message(condenser_fixture):
|
||||
}, # 9
|
||||
]
|
||||
)
|
||||
view = View(events=events)
|
||||
# No longer need to set mock ID
|
||||
|
||||
# Calculation (RecallAction now essential):
|
||||
# History len = 9
|
||||
@@ -212,22 +224,19 @@ def test_no_system_message(condenser_fixture):
|
||||
# recent_events_slice = history[6:] = [obs2(7), cmd3(8), obs3(9)]
|
||||
# Validation: remove leading obs2(7). validated_slice = [cmd3(8), obs3(9)]
|
||||
# Final = essentials + validated_slice = [user(1), recall_act(2), recall_obs(3), cmd3(8), obs3(9)]
|
||||
# Expected kept IDs: [1, 2, 3, 8, 9]. Length 5.
|
||||
# Forgotten IDs: [4, 5, 6, 7]
|
||||
condensation = condenser.get_condensation(view)
|
||||
# Expected IDs: [1, 2, 3, 8, 9]. Length 5.
|
||||
truncated_events = controller._apply_conversation_window(controller.state.history)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
|
||||
forgotten_ids = condensation.action.forgotten
|
||||
expected_forgotten = [4, 5, 6, 7]
|
||||
assert sorted(forgotten_ids) == expected_forgotten
|
||||
assert len(truncated_events) == 5
|
||||
expected_ids = [1, 2, 3, 8, 9]
|
||||
actual_ids = [e.id for e in truncated_events]
|
||||
assert actual_ids == expected_ids
|
||||
|
||||
|
||||
def test_no_recall_observation(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
def test_no_recall_observation(controller_fixture):
|
||||
controller = controller_fixture
|
||||
|
||||
events = create_events(
|
||||
controller.state.history = create_events(
|
||||
[
|
||||
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
|
||||
{
|
||||
@@ -260,33 +269,29 @@ def test_no_recall_observation(condenser_fixture):
|
||||
}, # 9
|
||||
]
|
||||
)
|
||||
view = View(events=events)
|
||||
|
||||
# Calculation (RecallAction essential even without RecallObs in condenser):
|
||||
# Calculation (RecallAction essential only if RecallObs exists):
|
||||
# History len = 9
|
||||
# Essentials = [sys(1), user(2), recall_action(3)] (len=3)
|
||||
# Non-essential count = 9 - 3 = 6
|
||||
# num_recent_to_keep = max(1, 6 // 2) = 3
|
||||
# Essentials = [sys(1), user(2)] (len=2) - RecallObs missing, so RecallAction not essential here
|
||||
# Non-essential count = 9 - 2 = 7
|
||||
# num_recent_to_keep = max(1, 7 // 2) = 3
|
||||
# slice_start_index = 9 - 3 = 6
|
||||
# recent_events_slice = history[6:] = [obs2(7), cmd3(8), obs3(9)]
|
||||
# Validation: remove leading obs2(7). validated_slice = [cmd3(8), obs3(9)]
|
||||
# Final = essentials + validated_slice = [sys(1), user(2), recall_action(3), cmd_cat(8), obs_cat(9)]
|
||||
# Expected kept IDs: [1, 2, 3, 8, 9]. Length 5.
|
||||
# Forgotten IDs: [4, 5, 6, 7]
|
||||
condensation = condenser.get_condensation(view)
|
||||
# Expected IDs: [1, 2, 3, 8, 9]. Length 5.
|
||||
truncated_events = controller._apply_conversation_window(controller.state.history)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
|
||||
forgotten_ids = condensation.action.forgotten
|
||||
expected_forgotten = [4, 5, 6, 7]
|
||||
assert sorted(forgotten_ids) == expected_forgotten
|
||||
assert len(truncated_events) == 5
|
||||
expected_ids = [1, 2, 3, 8, 9]
|
||||
actual_ids = [e.id for e in truncated_events]
|
||||
assert actual_ids == expected_ids
|
||||
|
||||
|
||||
def test_short_history_no_truncation(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
def test_short_history_no_truncation(controller_fixture):
|
||||
controller = controller_fixture
|
||||
|
||||
events = create_events(
|
||||
history = create_events(
|
||||
[
|
||||
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
|
||||
{
|
||||
@@ -305,7 +310,7 @@ def test_short_history_no_truncation(condenser_fixture):
|
||||
}, # 6
|
||||
]
|
||||
)
|
||||
view = View(events=events)
|
||||
controller.state.history = history
|
||||
|
||||
# Calculation (RecallAction now essential):
|
||||
# History len = 6
|
||||
@@ -316,22 +321,19 @@ def test_short_history_no_truncation(condenser_fixture):
|
||||
# recent_events_slice = history[5:] = [obs1(6)]
|
||||
# Validation: remove leading obs1(6). validated_slice = []
|
||||
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4)]
|
||||
# Expected kept IDs: [1, 2, 3, 4]. Length 4.
|
||||
# Forgotten IDs: [5, 6]
|
||||
condensation = condenser.get_condensation(view)
|
||||
# Expected IDs: [1, 2, 3, 4]. Length 4.
|
||||
truncated_events = controller._apply_conversation_window(controller.state.history)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
|
||||
forgotten_ids = condensation.action.forgotten
|
||||
expected_forgotten = [5, 6]
|
||||
assert sorted(forgotten_ids) == expected_forgotten
|
||||
assert len(truncated_events) == 4
|
||||
expected_ids = [1, 2, 3, 4]
|
||||
actual_ids = [e.id for e in truncated_events]
|
||||
assert actual_ids == expected_ids
|
||||
|
||||
|
||||
def test_only_essential_events(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
def test_only_essential_events(controller_fixture):
|
||||
controller = controller_fixture
|
||||
|
||||
events = create_events(
|
||||
history = create_events(
|
||||
[
|
||||
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
|
||||
{
|
||||
@@ -343,7 +345,7 @@ def test_only_essential_events(condenser_fixture):
|
||||
{'type': RecallObservation, 'content': 'Recall result', 'cause_id': 3}, # 4
|
||||
]
|
||||
)
|
||||
view = View(events=events)
|
||||
controller.state.history = history
|
||||
|
||||
# Calculation (RecallAction now essential):
|
||||
# History len = 4
|
||||
@@ -354,22 +356,19 @@ def test_only_essential_events(condenser_fixture):
|
||||
# recent_events_slice = history[3:] = [recall_obs(4)]
|
||||
# Validation: remove leading recall_obs(4). validated_slice = []
|
||||
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4)]
|
||||
# Expected kept IDs: [1, 2, 3, 4]. Length 4.
|
||||
# Forgotten IDs: []
|
||||
condensation = condenser.get_condensation(view)
|
||||
# Expected IDs: [1, 2, 3, 4]. Length 4.
|
||||
truncated_events = controller._apply_conversation_window(controller.state.history)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
|
||||
forgotten_ids = condensation.action.forgotten
|
||||
expected_forgotten = []
|
||||
assert forgotten_ids == expected_forgotten
|
||||
assert len(truncated_events) == 4
|
||||
expected_ids = [1, 2, 3, 4]
|
||||
actual_ids = [e.id for e in truncated_events]
|
||||
assert actual_ids == expected_ids
|
||||
|
||||
|
||||
def test_dangling_observations_at_cut_point(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
def test_dangling_observations_at_cut_point(controller_fixture):
|
||||
controller = controller_fixture
|
||||
|
||||
events = create_events(
|
||||
history_forced_dangle = create_events(
|
||||
[
|
||||
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
|
||||
{
|
||||
@@ -406,7 +405,7 @@ def test_dangling_observations_at_cut_point(condenser_fixture):
|
||||
}, # 10
|
||||
]
|
||||
) # 10 events total
|
||||
view = View(events=events)
|
||||
controller.state.history = history_forced_dangle
|
||||
|
||||
# Calculation (RecallAction now essential):
|
||||
# History len = 10
|
||||
@@ -417,22 +416,20 @@ def test_dangling_observations_at_cut_point(condenser_fixture):
|
||||
# recent_events_slice = history[7:] = [obs1(8), cmd2(9), obs2(10)]
|
||||
# Validation: remove leading obs1(8). validated_slice = [cmd2(9), obs2(10)]
|
||||
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4), cmd2(9), obs2(10)]
|
||||
# Expected kept IDs: [1, 2, 3, 4, 9, 10]. Length 6.
|
||||
# Forgotten IDs: [5, 6, 7, 8]
|
||||
condensation = condenser.get_condensation(view)
|
||||
# Expected IDs: [1, 2, 3, 4, 9, 10]. Length 6.
|
||||
truncated_events = controller._apply_conversation_window(controller.state.history)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
|
||||
forgotten_ids = condensation.action.forgotten
|
||||
expected_forgotten = [5, 6, 7, 8]
|
||||
assert sorted(forgotten_ids) == expected_forgotten
|
||||
assert len(truncated_events) == 6
|
||||
expected_ids = [1, 2, 3, 4, 9, 10]
|
||||
actual_ids = [e.id for e in truncated_events]
|
||||
assert actual_ids == expected_ids
|
||||
# Verify dangling observations 5 and 6 were removed (implicitly by slice start and validation)
|
||||
|
||||
|
||||
def test_only_dangling_observations_in_recent_slice(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
def test_only_dangling_observations_in_recent_slice(controller_fixture):
|
||||
controller = controller_fixture
|
||||
|
||||
events = create_events(
|
||||
history = create_events(
|
||||
[
|
||||
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
|
||||
{
|
||||
@@ -455,7 +452,7 @@ def test_only_dangling_observations_in_recent_slice(condenser_fixture):
|
||||
}, # 6 (Dangling)
|
||||
]
|
||||
) # 6 events total
|
||||
view = View(events=events)
|
||||
controller.state.history = history
|
||||
|
||||
# Calculation (RecallAction now essential):
|
||||
# History len = 6
|
||||
@@ -466,44 +463,43 @@ def test_only_dangling_observations_in_recent_slice(condenser_fixture):
|
||||
# recent_events_slice = history[5:] = [dangle2(6)]
|
||||
# Validation: remove leading dangle2(6). validated_slice = [] (Corrected based on user feedback/bugfix)
|
||||
# Final = essentials + validated_slice = [sys(1), user(2), recall_act(3), recall_obs(4)]
|
||||
# Expected kept IDs: [1, 2, 3, 4]. Length 4.
|
||||
# Forgotten IDs: [5, 6]
|
||||
# Expected IDs: [1, 2, 3, 4]. Length 4.
|
||||
with patch(
|
||||
'openhands.memory.condenser.impl.conversation_window_condenser.logger.warning'
|
||||
'openhands.controller.agent_controller.logger.warning'
|
||||
) as mock_log_warning:
|
||||
condensation = condenser.get_condensation(view)
|
||||
truncated_events = controller._apply_conversation_window(
|
||||
controller.state.history
|
||||
)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
|
||||
forgotten_ids = condensation.action.forgotten
|
||||
expected_forgotten = [5, 6]
|
||||
assert sorted(forgotten_ids) == expected_forgotten
|
||||
assert len(truncated_events) == 4
|
||||
expected_ids = [1, 2, 3, 4]
|
||||
actual_ids = [e.id for e in truncated_events]
|
||||
assert actual_ids == expected_ids
|
||||
# Verify dangling observations 5 and 6 were removed
|
||||
|
||||
# Check that the specific warning was logged exactly once
|
||||
assert mock_log_warning.call_count == 1
|
||||
|
||||
# Check the essential parts of the arguments
|
||||
# Check the essential parts of the arguments, allowing for variations like stacklevel
|
||||
call_args, call_kwargs = mock_log_warning.call_args
|
||||
expected_message_substring = 'All recent events are dangling observations, which we truncate. This means the agent has only the essential first events. This should not happen.'
|
||||
assert expected_message_substring in call_args[0]
|
||||
assert 'extra' in call_kwargs
|
||||
assert call_kwargs['extra'].get('session_id') == 'test_sid'
|
||||
|
||||
|
||||
def test_empty_history(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
view = View(events=[])
|
||||
def test_empty_history(controller_fixture):
|
||||
controller = controller_fixture
|
||||
controller.state.history = []
|
||||
|
||||
condensation = condenser.get_condensation(view)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
assert condensation.action.forgotten == []
|
||||
truncated_events = controller._apply_conversation_window(controller.state.history)
|
||||
assert truncated_events == []
|
||||
|
||||
|
||||
def test_multiple_user_messages(condenser_fixture):
|
||||
condenser = condenser_fixture
|
||||
def test_multiple_user_messages(controller_fixture):
|
||||
controller = controller_fixture
|
||||
|
||||
events = create_events(
|
||||
history = create_events(
|
||||
[
|
||||
{'type': SystemMessageAction, 'content': 'System Prompt'}, # 1
|
||||
{
|
||||
@@ -544,7 +540,7 @@ def test_multiple_user_messages(condenser_fixture):
|
||||
}, # 11
|
||||
]
|
||||
) # 11 events total
|
||||
view = View(events=events)
|
||||
controller.state.history = history
|
||||
|
||||
# Calculation (RecallAction now essential):
|
||||
# History len = 11
|
||||
@@ -555,18 +551,15 @@ def test_multiple_user_messages(condenser_fixture):
|
||||
# recent_events_slice = history[8:] = [recall_obs2(9), cmd2(10), obs2(11)]
|
||||
# Validation: remove leading recall_obs2(9). validated_slice = [cmd2(10), obs2(11)]
|
||||
# Final = essentials + validated_slice = [sys(1), user1(2), recall_act1(3), recall_obs1(4)] + [cmd2(10), obs2(11)]
|
||||
# Expected kept IDs: [1, 2, 3, 4, 10, 11]. Length 6.
|
||||
# Forgotten IDs: [5, 6, 7, 8, 9]
|
||||
condensation = condenser.get_condensation(view)
|
||||
# Expected IDs: [1, 2, 3, 4, 10, 11]. Length 6.
|
||||
truncated_events = controller._apply_conversation_window(controller.state.history)
|
||||
|
||||
assert isinstance(condensation, Condensation)
|
||||
assert isinstance(condensation.action, CondensationAction)
|
||||
assert len(truncated_events) == 6
|
||||
expected_ids = [1, 2, 3, 4, 10, 11]
|
||||
actual_ids = [e.id for e in truncated_events]
|
||||
assert actual_ids == expected_ids
|
||||
|
||||
forgotten_ids = condensation.action.forgotten
|
||||
expected_forgotten = [5, 6, 7, 8, 9]
|
||||
assert sorted(forgotten_ids) == expected_forgotten
|
||||
|
||||
# Additional validation: ensure that only the first user message is kept
|
||||
kept_event_ids = set(range(1, 12)) - set(forgotten_ids)
|
||||
assert 2 in kept_event_ids # First user message kept
|
||||
assert 7 not in kept_event_ids # Second user message forgotten
|
||||
# Verify the second user message (ID 7) was NOT kept
|
||||
assert not any(event.id == 7 for event in truncated_events)
|
||||
# Verify the first user message (ID 2) is present
|
||||
assert any(event.id == 2 for event in truncated_events)
|
||||
@@ -345,7 +345,6 @@ async def test_main_without_task(
|
||||
mock_args.agent_cls = None
|
||||
mock_args.llm_config = None
|
||||
mock_args.name = None
|
||||
mock_args.file = None
|
||||
mock_parse_args.return_value = mock_args
|
||||
|
||||
# Mock config
|
||||
@@ -428,7 +427,6 @@ async def test_main_with_task(
|
||||
mock_args = MagicMock()
|
||||
mock_args.agent_cls = 'custom-agent'
|
||||
mock_args.llm_config = 'custom-config'
|
||||
mock_args.file = None
|
||||
mock_parse_args.return_value = mock_args
|
||||
|
||||
# Mock config
|
||||
@@ -525,7 +523,6 @@ async def test_main_with_session_name_passes_name_to_run_session(
|
||||
mock_args.agent_cls = None
|
||||
mock_args.llm_config = None
|
||||
mock_args.name = test_session_name # Set the session name
|
||||
mock_args.file = None
|
||||
mock_parse_args.return_value = mock_args
|
||||
|
||||
# Mock config
|
||||
@@ -834,93 +831,3 @@ async def test_config_loading_order(
|
||||
|
||||
# Verify that run_session was called with the correct arguments
|
||||
mock_run_session.assert_called_once()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
@patch('openhands.cli.main.parse_arguments')
|
||||
@patch('openhands.cli.main.setup_config_from_args')
|
||||
@patch('openhands.cli.main.FileSettingsStore.get_instance')
|
||||
@patch('openhands.cli.main.check_folder_security_agreement')
|
||||
@patch('openhands.cli.main.run_session')
|
||||
@patch('openhands.cli.main.LLMSummarizingCondenserConfig')
|
||||
@patch('openhands.cli.main.NoOpCondenserConfig')
|
||||
@patch('builtins.open', new_callable=MagicMock)
|
||||
async def test_main_with_file_option(
|
||||
mock_open,
|
||||
mock_noop_condenser,
|
||||
mock_llm_condenser,
|
||||
mock_run_session,
|
||||
mock_check_security,
|
||||
mock_get_settings_store,
|
||||
mock_setup_config,
|
||||
mock_parse_args,
|
||||
):
|
||||
"""Test main function with a file option."""
|
||||
loop = asyncio.get_running_loop()
|
||||
|
||||
# Mock arguments
|
||||
mock_args = MagicMock()
|
||||
mock_args.agent_cls = None
|
||||
mock_args.llm_config = None
|
||||
mock_args.name = None
|
||||
mock_args.file = '/path/to/test/file.txt'
|
||||
mock_args.task = None
|
||||
mock_parse_args.return_value = mock_args
|
||||
|
||||
# Mock config
|
||||
mock_config = MagicMock()
|
||||
mock_config.workspace_base = '/test/dir'
|
||||
mock_config.cli_multiline_input = False
|
||||
mock_setup_config.return_value = mock_config
|
||||
|
||||
# Mock settings store
|
||||
mock_settings_store = AsyncMock()
|
||||
mock_settings = MagicMock()
|
||||
mock_settings.agent = 'test-agent'
|
||||
mock_settings.llm_model = 'test-model'
|
||||
mock_settings.llm_api_key = 'test-api-key'
|
||||
mock_settings.llm_base_url = 'test-base-url'
|
||||
mock_settings.confirmation_mode = True
|
||||
mock_settings.enable_default_condenser = True
|
||||
mock_settings_store.load.return_value = mock_settings
|
||||
mock_get_settings_store.return_value = mock_settings_store
|
||||
|
||||
# Mock condenser config to return a mock instead of validating
|
||||
mock_llm_condenser_instance = MagicMock()
|
||||
mock_llm_condenser.return_value = mock_llm_condenser_instance
|
||||
|
||||
# Mock security check
|
||||
mock_check_security.return_value = True
|
||||
|
||||
# Mock file open
|
||||
mock_file = MagicMock()
|
||||
mock_file.__enter__.return_value.read.return_value = 'This is a test file content.'
|
||||
mock_open.return_value = mock_file
|
||||
|
||||
# Mock run_session to return False (no new session requested)
|
||||
mock_run_session.return_value = False
|
||||
|
||||
# Run the function
|
||||
await cli.main_with_loop(loop)
|
||||
|
||||
# Assertions
|
||||
mock_parse_args.assert_called_once()
|
||||
mock_setup_config.assert_called_once_with(mock_args)
|
||||
mock_get_settings_store.assert_called_once()
|
||||
mock_settings_store.load.assert_called_once()
|
||||
mock_check_security.assert_called_once_with(mock_config, '/test/dir')
|
||||
|
||||
# Verify file was opened
|
||||
mock_open.assert_called_once_with('/path/to/test/file.txt', 'r', encoding='utf-8')
|
||||
|
||||
# Check that run_session was called with expected arguments
|
||||
mock_run_session.assert_called_once()
|
||||
# Extract the task_str from the call
|
||||
task_str = mock_run_session.call_args[0][4]
|
||||
assert "The user has tagged a file '/path/to/test/file.txt'" in task_str
|
||||
assert 'Please read and understand the following file content first:' in task_str
|
||||
assert 'This is a test file content.' in task_str
|
||||
assert (
|
||||
'After reviewing the file, please ask the user what they would like to do with it.'
|
||||
in task_str
|
||||
)
|
||||
|
||||
@@ -6,11 +6,7 @@ from unittest.mock import patch
|
||||
import pytest
|
||||
|
||||
from openhands.core.config import LLMConfig, OpenHandsConfig
|
||||
from openhands.core.logger import (
|
||||
LOG_JSON_LEVEL_KEY,
|
||||
OpenHandsLoggerAdapter,
|
||||
json_log_handler,
|
||||
)
|
||||
from openhands.core.logger import OpenHandsLoggerAdapter, json_log_handler
|
||||
from openhands.core.logger import openhands_logger as openhands_logger
|
||||
|
||||
|
||||
@@ -143,7 +139,7 @@ class TestJsonOutput:
|
||||
output = json.loads(string_io.getvalue())
|
||||
assert 'timestamp' in output
|
||||
del output['timestamp']
|
||||
assert output == {'message': 'Test message', LOG_JSON_LEVEL_KEY: 'INFO'}
|
||||
assert output == {'message': 'Test message', 'level': 'INFO'}
|
||||
|
||||
def test_error(self, json_handler):
|
||||
logger, string_io = json_handler
|
||||
@@ -151,7 +147,7 @@ class TestJsonOutput:
|
||||
logger.error('Test message')
|
||||
output = json.loads(string_io.getvalue())
|
||||
del output['timestamp']
|
||||
assert output == {'message': 'Test message', LOG_JSON_LEVEL_KEY: 'ERROR'}
|
||||
assert output == {'message': 'Test message', 'level': 'ERROR'}
|
||||
|
||||
def test_extra_fields(self, json_handler):
|
||||
logger, string_io = json_handler
|
||||
@@ -162,7 +158,7 @@ class TestJsonOutput:
|
||||
assert output == {
|
||||
'key': '..val..',
|
||||
'message': 'Test message',
|
||||
LOG_JSON_LEVEL_KEY: 'INFO',
|
||||
'level': 'INFO',
|
||||
}
|
||||
|
||||
def test_extra_fields_from_adapter(self, json_handler):
|
||||
@@ -175,7 +171,7 @@ class TestJsonOutput:
|
||||
'context_field': '..val..',
|
||||
'log_fied': '..val..',
|
||||
'message': 'Test message',
|
||||
LOG_JSON_LEVEL_KEY: 'INFO',
|
||||
'level': 'INFO',
|
||||
}
|
||||
|
||||
def test_extra_fields_from_adapter_can_override(self, json_handler):
|
||||
@@ -187,5 +183,5 @@ class TestJsonOutput:
|
||||
assert output == {
|
||||
'override': 'b',
|
||||
'message': 'Test message',
|
||||
LOG_JSON_LEVEL_KEY: 'INFO',
|
||||
'level': 'INFO',
|
||||
}
|
||||
|
||||
@@ -201,75 +201,3 @@ This microagent has an invalid type.
|
||||
assert '"knowledge"' in error_msg
|
||||
assert '"repo"' in error_msg
|
||||
assert '"task"' in error_msg
|
||||
|
||||
|
||||
def test_cursorrules_file_load():
|
||||
"""Test loading .cursorrules file as a RepoMicroagent."""
|
||||
cursorrules_content = """Always use Python for new files.
|
||||
Follow the existing code style.
|
||||
Add proper error handling."""
|
||||
|
||||
cursorrules_path = Path('.cursorrules')
|
||||
|
||||
# Test loading .cursorrules file directly
|
||||
agent = BaseMicroagent.load(cursorrules_path, file_content=cursorrules_content)
|
||||
|
||||
# Verify it's loaded as a RepoMicroagent
|
||||
assert isinstance(agent, RepoMicroagent)
|
||||
assert agent.name == 'cursorrules'
|
||||
assert agent.content == cursorrules_content
|
||||
assert agent.type == MicroagentType.REPO_KNOWLEDGE
|
||||
assert agent.metadata.name == 'cursorrules'
|
||||
assert agent.source == str(cursorrules_path)
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def temp_microagents_dir_with_cursorrules():
|
||||
"""Create a temporary directory with test microagents and .cursorrules file."""
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
root = Path(temp_dir)
|
||||
|
||||
# Create .openhands/microagents directory structure
|
||||
microagents_dir = root / '.openhands' / 'microagents'
|
||||
microagents_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Create .cursorrules file in repository root
|
||||
cursorrules_content = """Always use TypeScript for new files.
|
||||
Follow the existing code style."""
|
||||
(root / '.cursorrules').write_text(cursorrules_content)
|
||||
|
||||
# Create test repo agent
|
||||
repo_agent = """---
|
||||
# type: repo
|
||||
version: 1.0.0
|
||||
agent: CodeActAgent
|
||||
---
|
||||
|
||||
# Test Repository Agent
|
||||
|
||||
Repository-specific test instructions.
|
||||
"""
|
||||
(microagents_dir / 'repo.md').write_text(repo_agent)
|
||||
|
||||
yield root
|
||||
|
||||
|
||||
def test_load_microagents_with_cursorrules(temp_microagents_dir_with_cursorrules):
|
||||
"""Test loading microagents when .cursorrules file exists."""
|
||||
microagents_dir = (
|
||||
temp_microagents_dir_with_cursorrules / '.openhands' / 'microagents'
|
||||
)
|
||||
|
||||
repo_agents, knowledge_agents = load_microagents_from_dir(microagents_dir)
|
||||
|
||||
# Verify that .cursorrules file was loaded as a RepoMicroagent
|
||||
assert len(repo_agents) == 2 # repo.md + .cursorrules
|
||||
assert 'repo' in repo_agents
|
||||
assert 'cursorrules' in repo_agents
|
||||
|
||||
# Check .cursorrules agent
|
||||
cursorrules_agent = repo_agents['cursorrules']
|
||||
assert isinstance(cursorrules_agent, RepoMicroagent)
|
||||
assert cursorrules_agent.name == 'cursorrules'
|
||||
assert 'Always use TypeScript for new files' in cursorrules_agent.content
|
||||
assert cursorrules_agent.type == MicroagentType.REPO_KNOWLEDGE
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
"""Tests for the custom secrets API endpoints."""
|
||||
# flake8: noqa: E501
|
||||
|
||||
import os
|
||||
from unittest.mock import AsyncMock, patch
|
||||
|
||||
import pytest
|
||||
@@ -25,12 +24,7 @@ def test_client():
|
||||
"""Create a test client for the settings API."""
|
||||
app = FastAPI()
|
||||
app.include_router(secrets_app)
|
||||
|
||||
# Mock SESSION_API_KEY to None to disable authentication in tests
|
||||
with patch.dict(os.environ, {'SESSION_API_KEY': ''}, clear=False):
|
||||
# Clear the SESSION_API_KEY to disable auth dependency
|
||||
with patch('openhands.server.dependencies._SESSION_API_KEY', None):
|
||||
yield TestClient(app)
|
||||
return TestClient(app)
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
import os
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
@@ -55,8 +54,6 @@ class MockUserAuth(UserAuth):
|
||||
def test_client():
|
||||
# Create a test client
|
||||
with (
|
||||
patch.dict(os.environ, {'SESSION_API_KEY': ''}, clear=False),
|
||||
patch('openhands.server.dependencies._SESSION_API_KEY', None),
|
||||
patch(
|
||||
'openhands.server.user_auth.user_auth.UserAuth.get_instance',
|
||||
return_value=MockUserAuth(),
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
import os
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
@@ -29,8 +28,6 @@ async def get_settings_store(request):
|
||||
def test_client():
|
||||
# Create a test client
|
||||
with (
|
||||
patch.dict(os.environ, {'SESSION_API_KEY': ''}, clear=False),
|
||||
patch('openhands.server.dependencies._SESSION_API_KEY', None),
|
||||
patch(
|
||||
'openhands.server.routes.secrets.check_provider_tokens',
|
||||
AsyncMock(return_value=''),
|
||||
|
||||
@@ -1,217 +0,0 @@
|
||||
"""Tests for user directory microagent loading."""
|
||||
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
from unittest.mock import patch
|
||||
|
||||
import pytest
|
||||
|
||||
from openhands.events.stream import EventStream
|
||||
from openhands.memory.memory import Memory
|
||||
from openhands.microagent import KnowledgeMicroagent, MicroagentType, RepoMicroagent
|
||||
from openhands.storage import get_file_store
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def temp_user_microagents_dir():
|
||||
"""Create a temporary directory to simulate ~/.openhands/microagents/."""
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
user_dir = Path(temp_dir)
|
||||
|
||||
# Create test knowledge agent
|
||||
knowledge_agent = """---
|
||||
name: user_knowledge
|
||||
version: 1.0.0
|
||||
agent: CodeActAgent
|
||||
triggers:
|
||||
- user-test
|
||||
- personal
|
||||
---
|
||||
|
||||
# User Knowledge Agent
|
||||
|
||||
Personal knowledge and guidelines.
|
||||
"""
|
||||
(user_dir / 'user_knowledge.md').write_text(knowledge_agent)
|
||||
|
||||
# Create test repo agent
|
||||
repo_agent = """---
|
||||
name: user_repo
|
||||
version: 1.0.0
|
||||
agent: CodeActAgent
|
||||
---
|
||||
|
||||
# User Repository Agent
|
||||
|
||||
Personal repository-specific instructions.
|
||||
"""
|
||||
(user_dir / 'user_repo.md').write_text(repo_agent)
|
||||
|
||||
yield user_dir
|
||||
|
||||
|
||||
def test_user_microagents_loading(temp_user_microagents_dir):
|
||||
"""Test that user microagents are loaded from ~/.openhands/microagents/."""
|
||||
with patch(
|
||||
'openhands.memory.memory.USER_MICROAGENTS_DIR', str(temp_user_microagents_dir)
|
||||
):
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
# Create event stream and memory
|
||||
file_store = get_file_store('local', temp_dir)
|
||||
event_stream = EventStream('test', file_store)
|
||||
memory = Memory(event_stream, 'test_sid')
|
||||
|
||||
# Check that user microagents were loaded
|
||||
assert 'user_knowledge' in memory.knowledge_microagents
|
||||
assert 'user_repo' in memory.repo_microagents
|
||||
|
||||
# Verify the loaded agents
|
||||
user_knowledge = memory.knowledge_microagents['user_knowledge']
|
||||
assert isinstance(user_knowledge, KnowledgeMicroagent)
|
||||
assert user_knowledge.type == MicroagentType.KNOWLEDGE
|
||||
assert 'user-test' in user_knowledge.triggers
|
||||
assert 'personal' in user_knowledge.triggers
|
||||
|
||||
user_repo = memory.repo_microagents['user_repo']
|
||||
assert isinstance(user_repo, RepoMicroagent)
|
||||
assert user_repo.type == MicroagentType.REPO_KNOWLEDGE
|
||||
|
||||
|
||||
def test_user_microagents_directory_creation():
|
||||
"""Test that user microagents directory is created if it doesn't exist."""
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
non_existent_dir = Path(temp_dir) / 'non_existent' / 'microagents'
|
||||
|
||||
with patch(
|
||||
'openhands.memory.memory.USER_MICROAGENTS_DIR', str(non_existent_dir)
|
||||
):
|
||||
with tempfile.TemporaryDirectory() as temp_store_dir:
|
||||
# Create event stream and memory
|
||||
file_store = get_file_store('local', temp_store_dir)
|
||||
event_stream = EventStream('test', file_store)
|
||||
Memory(event_stream, 'test_sid')
|
||||
|
||||
# Check that the directory was created
|
||||
assert non_existent_dir.exists()
|
||||
assert non_existent_dir.is_dir()
|
||||
|
||||
|
||||
def test_user_microagents_override_global():
|
||||
"""Test that user microagents can override global ones with the same name."""
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
user_dir = Path(temp_dir)
|
||||
|
||||
# Create a user microagent with the same name as a global one
|
||||
# (assuming there's a global 'github' microagent)
|
||||
github_agent = """---
|
||||
name: github
|
||||
version: 1.0.0
|
||||
agent: CodeActAgent
|
||||
triggers:
|
||||
- github
|
||||
- git
|
||||
---
|
||||
|
||||
# Personal GitHub Agent
|
||||
|
||||
My personal GitHub workflow and preferences.
|
||||
"""
|
||||
(user_dir / 'github.md').write_text(github_agent)
|
||||
|
||||
with patch('openhands.memory.memory.USER_MICROAGENTS_DIR', str(user_dir)):
|
||||
with tempfile.TemporaryDirectory() as temp_store_dir:
|
||||
# Create event stream and memory
|
||||
file_store = get_file_store('local', temp_store_dir)
|
||||
event_stream = EventStream('test', file_store)
|
||||
memory = Memory(event_stream, 'test_sid')
|
||||
|
||||
# Check that the user microagent is loaded
|
||||
if 'github' in memory.knowledge_microagents:
|
||||
github_microagent = memory.knowledge_microagents['github']
|
||||
# The user version should contain our personal content
|
||||
assert 'My personal GitHub workflow' in github_microagent.content
|
||||
|
||||
|
||||
def test_user_microagents_loading_error_handling():
|
||||
"""Test error handling when user microagents directory has issues."""
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
user_dir = Path(temp_dir)
|
||||
|
||||
# Create an invalid microagent file
|
||||
invalid_agent = """---
|
||||
name: invalid
|
||||
type: invalid_type
|
||||
---
|
||||
|
||||
# Invalid Agent
|
||||
"""
|
||||
(user_dir / 'invalid.md').write_text(invalid_agent)
|
||||
|
||||
with patch('openhands.memory.memory.USER_MICROAGENTS_DIR', str(user_dir)):
|
||||
with tempfile.TemporaryDirectory() as temp_store_dir:
|
||||
# Create event stream and memory - should not crash
|
||||
file_store = get_file_store('local', temp_store_dir)
|
||||
event_stream = EventStream('test', file_store)
|
||||
memory = Memory(event_stream, 'test_sid')
|
||||
|
||||
# Memory should still be created despite the invalid microagent
|
||||
assert memory is not None
|
||||
# The invalid microagent should not be loaded
|
||||
assert 'invalid' not in memory.knowledge_microagents
|
||||
assert 'invalid' not in memory.repo_microagents
|
||||
|
||||
|
||||
def test_user_microagents_empty_directory():
|
||||
"""Test behavior when user microagents directory is empty."""
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
empty_dir = Path(temp_dir)
|
||||
|
||||
with patch('openhands.memory.memory.USER_MICROAGENTS_DIR', str(empty_dir)):
|
||||
with tempfile.TemporaryDirectory() as temp_store_dir:
|
||||
# Create event stream and memory
|
||||
file_store = get_file_store('local', temp_store_dir)
|
||||
event_stream = EventStream('test', file_store)
|
||||
memory = Memory(event_stream, 'test_sid')
|
||||
|
||||
# Memory should be created successfully
|
||||
assert memory is not None
|
||||
# No user microagents should be loaded, but global ones might be
|
||||
# (we can't assert the exact count since global microagents may exist)
|
||||
|
||||
|
||||
def test_user_microagents_nested_directories(temp_user_microagents_dir):
|
||||
"""Test loading user microagents from nested directories."""
|
||||
# Create nested microagent
|
||||
nested_dir = temp_user_microagents_dir / 'personal' / 'tools'
|
||||
nested_dir.mkdir(parents=True)
|
||||
|
||||
nested_agent = """---
|
||||
name: personal_tool
|
||||
version: 1.0.0
|
||||
agent: CodeActAgent
|
||||
triggers:
|
||||
- personal-tool
|
||||
---
|
||||
|
||||
# Personal Tool Agent
|
||||
|
||||
My personal development tools and workflows.
|
||||
"""
|
||||
(nested_dir / 'tool.md').write_text(nested_agent)
|
||||
|
||||
with patch(
|
||||
'openhands.memory.memory.USER_MICROAGENTS_DIR', str(temp_user_microagents_dir)
|
||||
):
|
||||
with tempfile.TemporaryDirectory() as temp_store_dir:
|
||||
# Create event stream and memory
|
||||
file_store = get_file_store('local', temp_store_dir)
|
||||
event_stream = EventStream('test', file_store)
|
||||
memory = Memory(event_stream, 'test_sid')
|
||||
|
||||
# Check that nested microagent was loaded
|
||||
# The name should be derived from the relative path
|
||||
assert 'personal/tools/tool' in memory.knowledge_microagents
|
||||
|
||||
nested_microagent = memory.knowledge_microagents['personal/tools/tool']
|
||||
assert isinstance(nested_microagent, KnowledgeMicroagent)
|
||||
assert 'personal-tool' in nested_microagent.triggers
|
||||
@@ -1,4 +1,4 @@
|
||||
from openhands.events.action.agent import CondensationAction, CondensationRequestAction
|
||||
from openhands.events.action.agent import CondensationAction
|
||||
from openhands.events.action.message import MessageAction
|
||||
from openhands.events.event import Event
|
||||
from openhands.events.observation.agent import AgentCondensationObservation
|
||||
@@ -98,169 +98,6 @@ def test_no_condensation_action_in_view() -> None:
|
||||
assert len(view) == 3 # Event 1, Event 2, Event 3 (Event 0 was forgotten)
|
||||
|
||||
|
||||
def test_unhandled_condensation_request_with_no_condensation() -> None:
|
||||
"""Test that unhandled_condensation_request is True when there's a CondensationRequestAction but no CondensationAction."""
|
||||
events: list[Event] = [
|
||||
MessageAction(content='Event 0'),
|
||||
MessageAction(content='Event 1'),
|
||||
CondensationRequestAction(),
|
||||
MessageAction(content='Event 2'),
|
||||
]
|
||||
set_ids(events)
|
||||
view = View.from_events(events)
|
||||
|
||||
# Should be marked as having an unhandled condensation request
|
||||
assert view.unhandled_condensation_request is True
|
||||
|
||||
# CondensationRequestAction should be removed from the view
|
||||
assert len(view) == 3 # Only the MessageActions remain
|
||||
for event in view:
|
||||
assert not isinstance(event, CondensationRequestAction)
|
||||
|
||||
|
||||
def test_handled_condensation_request_with_condensation_action() -> None:
|
||||
"""Test that unhandled_condensation_request is False when CondensationAction comes after CondensationRequestAction."""
|
||||
events: list[Event] = [
|
||||
MessageAction(content='Event 0'),
|
||||
MessageAction(content='Event 1'),
|
||||
CondensationRequestAction(),
|
||||
MessageAction(content='Event 2'),
|
||||
CondensationAction(forgotten_event_ids=[0, 1]), # Handles the request
|
||||
MessageAction(content='Event 3'),
|
||||
]
|
||||
set_ids(events)
|
||||
view = View.from_events(events)
|
||||
|
||||
# Should NOT be marked as having an unhandled condensation request
|
||||
assert view.unhandled_condensation_request is False
|
||||
|
||||
# Both CondensationRequestAction and CondensationAction should be removed from the view
|
||||
assert len(view) == 2 # Event 2 and Event 3 (Event 0, 1 forgotten)
|
||||
for event in view:
|
||||
assert not isinstance(event, CondensationRequestAction)
|
||||
assert not isinstance(event, CondensationAction)
|
||||
|
||||
|
||||
def test_multiple_condensation_requests_pattern() -> None:
|
||||
"""Test the pattern with multiple condensation requests and actions."""
|
||||
events: list[Event] = [
|
||||
MessageAction(content='Event 0'),
|
||||
CondensationRequestAction(), # First request
|
||||
MessageAction(content='Event 1'),
|
||||
CondensationAction(forgotten_event_ids=[0]), # Handles first request
|
||||
MessageAction(content='Event 2'),
|
||||
CondensationRequestAction(), # Second request - should be unhandled
|
||||
MessageAction(content='Event 3'),
|
||||
]
|
||||
set_ids(events)
|
||||
view = View.from_events(events)
|
||||
|
||||
# Should be marked as having an unhandled condensation request (the second one)
|
||||
assert view.unhandled_condensation_request is True
|
||||
|
||||
# Both CondensationRequestActions and CondensationAction should be removed from the view
|
||||
assert len(view) == 3 # Event 1, Event 2, Event 3 (Event 0 forgotten)
|
||||
for event in view:
|
||||
assert not isinstance(event, CondensationRequestAction)
|
||||
assert not isinstance(event, CondensationAction)
|
||||
|
||||
|
||||
def test_condensation_action_before_request() -> None:
|
||||
"""Test that CondensationAction before CondensationRequestAction doesn't affect the unhandled status."""
|
||||
events: list[Event] = [
|
||||
MessageAction(content='Event 0'),
|
||||
CondensationAction(
|
||||
forgotten_event_ids=[]
|
||||
), # This doesn't handle the later request
|
||||
MessageAction(content='Event 1'),
|
||||
CondensationRequestAction(), # This should be unhandled
|
||||
MessageAction(content='Event 2'),
|
||||
]
|
||||
set_ids(events)
|
||||
view = View.from_events(events)
|
||||
|
||||
# Should be marked as having an unhandled condensation request
|
||||
assert view.unhandled_condensation_request is True
|
||||
|
||||
# Both CondensationRequestAction and CondensationAction should be removed from the view
|
||||
assert len(view) == 3 # Event 0, Event 1, Event 2
|
||||
for event in view:
|
||||
assert not isinstance(event, CondensationRequestAction)
|
||||
assert not isinstance(event, CondensationAction)
|
||||
|
||||
|
||||
def test_no_condensation_events() -> None:
|
||||
"""Test that unhandled_condensation_request is False when there are no condensation events."""
|
||||
events: list[Event] = [
|
||||
MessageAction(content='Event 0'),
|
||||
MessageAction(content='Event 1'),
|
||||
MessageAction(content='Event 2'),
|
||||
]
|
||||
set_ids(events)
|
||||
view = View.from_events(events)
|
||||
|
||||
# Should NOT be marked as having an unhandled condensation request
|
||||
assert view.unhandled_condensation_request is False
|
||||
|
||||
# All events should remain
|
||||
assert len(view) == 3
|
||||
assert view.events == events
|
||||
|
||||
|
||||
def test_only_condensation_action() -> None:
|
||||
"""Test behavior when there's only a CondensationAction (no request)."""
|
||||
events: list[Event] = [
|
||||
MessageAction(content='Event 0'),
|
||||
MessageAction(content='Event 1'),
|
||||
CondensationAction(forgotten_event_ids=[0]),
|
||||
MessageAction(content='Event 2'),
|
||||
]
|
||||
set_ids(events)
|
||||
view = View.from_events(events)
|
||||
|
||||
# Should NOT be marked as having an unhandled condensation request
|
||||
assert view.unhandled_condensation_request is False
|
||||
|
||||
# CondensationAction should be removed, Event 0 should be forgotten
|
||||
assert len(view) == 2 # Event 1, Event 2
|
||||
for event in view:
|
||||
assert not isinstance(event, CondensationAction)
|
||||
|
||||
|
||||
def test_condensation_request_always_removed_from_view() -> None:
|
||||
"""Test that CondensationRequestAction is always removed from the view regardless of unhandled status."""
|
||||
# Test case 1: Unhandled request
|
||||
events_unhandled: list[Event] = [
|
||||
MessageAction(content='Event 0'),
|
||||
CondensationRequestAction(),
|
||||
MessageAction(content='Event 1'),
|
||||
]
|
||||
set_ids(events_unhandled)
|
||||
view_unhandled = View.from_events(events_unhandled)
|
||||
|
||||
assert view_unhandled.unhandled_condensation_request is True
|
||||
assert len(view_unhandled) == 2 # Only MessageActions
|
||||
for event in view_unhandled:
|
||||
assert not isinstance(event, CondensationRequestAction)
|
||||
|
||||
# Test case 2: Handled request
|
||||
events_handled: list[Event] = [
|
||||
MessageAction(content='Event 0'),
|
||||
CondensationRequestAction(),
|
||||
MessageAction(content='Event 1'),
|
||||
CondensationAction(forgotten_event_ids=[]),
|
||||
MessageAction(content='Event 2'),
|
||||
]
|
||||
set_ids(events_handled)
|
||||
view_handled = View.from_events(events_handled)
|
||||
|
||||
assert view_handled.unhandled_condensation_request is False
|
||||
assert len(view_handled) == 3 # Only MessageActions
|
||||
for event in view_handled:
|
||||
assert not isinstance(event, CondensationRequestAction)
|
||||
assert not isinstance(event, CondensationAction)
|
||||
|
||||
|
||||
def set_ids(events: list[Event]) -> None:
|
||||
"""Set the IDs of the events in the list to their index."""
|
||||
for i, e in enumerate(events):
|
||||
|
||||