Mirror of https://github.com/All-Hands-AI/OpenHands.git, synced 2026-04-29 03:00:45 -04:00
Compare commits
16 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 0925798aee | | |
| | 8028e2c2dd | | |
| | ff9058e28a | | |
| | c45caaef1f | | |
| | a3c107daa4 | | |
| | 040839bdd1 | | |
| | aabbbb6c6a | | |
| | 9747c9e9f8 | | |
| | bb85542aca | | |
| | 6e4ff56934 | | |
| | 561f308401 | | |
| | 43d782da06 | | |
| | 4876d811a1 | | |
| | 0ab457f1d3 | | |
| | 70e29f9b75 | | |
| | 4cd1d80eea | | |

@@ -18,24 +18,24 @@ diverse, inclusive, and healthy community.
  Examples of behavior that contributes to a positive environment for our
  community include:

- * Demonstrating empathy and kindness toward other people
- * Being respectful of differing opinions, viewpoints, and experiences
- * Giving and gracefully accepting constructive feedback
+ * Demonstrating empathy and kindness toward other people.
+ * Being respectful of differing opinions, viewpoints, and experiences.
+ * Giving and gracefully accepting constructive feedback.
  * Accepting responsibility and apologizing to those affected by our mistakes,
- and learning from the experience
+ and learning from the experience.
  * Focusing on what is best not just for us as individuals, but for the overall
- community
+ community.

  Examples of unacceptable behavior include:

  * The use of sexualized language or imagery, and sexual attention or advances of
- any kind
- * Trolling, insulting or derogatory comments, and personal or political attacks
- * Public or private harassment
+ any kind.
+ * Trolling, insulting or derogatory comments, and personal or political attacks.
+ * Public or private harassment.
  * Publishing others' private information, such as a physical or email address,
- without their explicit permission
+ without their explicit permission.
  * Other conduct which could reasonably be considered inappropriate in a
- professional setting
+ professional setting.

  ## Enforcement Responsibilities

@@ -61,7 +61,7 @@ representative at an online or offline event.

  Instances of abusive, harassing, or otherwise unacceptable behavior may be
  reported to the community leaders responsible for enforcement at
- contact@all-hands.dev
+ contact@all-hands.dev.
  All complaints will be reviewed and investigated promptly and fairly.

  All community leaders are obligated to respect the privacy and security of the

@@ -11,11 +11,11 @@ To understand the codebase, please refer to the README in each module:
  - [agenthub](./openhands/agenthub/README.md)
  - [server](./openhands/server/README.md)

- ## Setting up your development environment
+ ## Setting up Your Development Environment

  We have a separate doc [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md) that tells you how to set up a development workflow.

- ## How can I contribute?
+ ## How Can I Contribute?

  There are many ways that you can contribute:

@@ -23,7 +23,7 @@ There are many ways that you can contribute:
  2. **Send feedback** after each session by [clicking the thumbs-up thumbs-down buttons](https://docs.all-hands.dev/modules/usage/feedback), so we can see where things are working and failing, and also build an open dataset for training code agents.
  3. **Improve the Codebase** by sending [PRs](#sending-pull-requests-to-openhands) (see details below). In particular, we have some [good first issues](https://github.com/All-Hands-AI/OpenHands/labels/good%20first%20issue) that may be ones to start on.

- ## What can I build?
+ ## What Can I Build?
  Here are a few ways you can help improve the codebase.

  #### UI/UX
@@ -35,7 +35,7 @@ of the application, please open an issue first, or better, join the #frontend ch
  to gather consensus from our design team first.

  #### Improving the agent
- Our main agent is the CodeAct agent. You can [see its prompts here](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/agenthub/codeact_agent)
+ Our main agent is the CodeAct agent. You can [see its prompts here](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/agenthub/codeact_agent).

  Changes to these prompts, and to the underlying behavior in Python, can have a huge impact on user experience.
  You can try modifying the prompts to see how they change the behavior of the agent as you use the app
@@ -63,7 +63,7 @@ At the moment, we have two kinds of tests: [`unit`](./tests/unit) and [`integrat
  ## Sending Pull Requests to OpenHands

  You'll need to fork our repository to send us a Pull Request. You can learn more
- about how to fork a GitHub repo and open a PR with your changes in [this article](https://medium.com/swlh/forks-and-pull-requests-how-to-contribute-to-github-repos-8843fac34ce8)
+ about how to fork a GitHub repo and open a PR with your changes in [this article](https://medium.com/swlh/forks-and-pull-requests-how-to-contribute-to-github-repos-8843fac34ce8).

  ### Pull Request title
  As described [here](https://github.com/commitizen/conventional-commit-types/blob/master/index.json), a valid PR title should begin with one of the following prefixes:
@@ -103,7 +103,7 @@ Further, if you see an issue you like, please leave a "thumbs-up" or a comment,

  ### Making Pull Requests

- We're generally happy to consider all [PRs](https://github.com/All-Hands-AI/OpenHands/pulls), with the evaluation process varying based on the type of change:
+ We're generally happy to consider all pull requests with the evaluation process varying based on the type of change:

  #### For Small Improvements

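The contributing-guide hunks above quote the rule that a valid PR title should begin with one of the prefixes from the linked commitizen `conventional-commit-types` list. As a rough illustration only, such a check could be sketched as follows. The prefix set is the one published in that list; the `type: summary` shape with an optional `(scope)` is an assumption borrowed from the Conventional Commits convention rather than something these hunks spell out, and `is_valid_pr_title` is a hypothetical helper, not part of OpenHands.

```python
import re

# Types listed in commitizen/conventional-commit-types (index.json).
CONVENTIONAL_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

# Assumed title shape: "<type>: summary" or "<type>(scope): summary".
_TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^)]+)\))?: (?P<summary>\S.*)$"
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title starts with a known conventional-commit type."""
    match = _TITLE_RE.match(title)
    return bool(match) and match.group("type") in CONVENTIONAL_TYPES
```

A title such as `fix: handle empty workspace path` passes, while `Update README` does not, because it lacks a recognized prefix.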
@@ -3,7 +3,7 @@ This guide is for people working on OpenHands and editing the source code.
  If you wish to contribute your changes, check out the [CONTRIBUTING.md](https://github.com/All-Hands-AI/OpenHands/blob/main/CONTRIBUTING.md) on how to clone and setup the project initially before moving on.
  Otherwise, you can clone the OpenHands project directly.

- ## Start the server for development
+ ## Start the Server for Development
  ### 1. Requirements
  * Linux, Mac OS, or [WSL on Windows](https://learn.microsoft.com/en-us/windows/wsl/install) [Ubuntu <= 22.04]
  * [Docker](https://docs.docker.com/engine/install/) (For those on MacOS, make sure to allow the default Docker socket to be used from advanced settings!)
@@ -58,7 +58,7 @@ See [our documentation](https://docs.all-hands.dev/modules/usage/llms) for recom

  ### 4. Running the application
  #### Option A: Run the Full Application
- Once the setup is complete, launching OpenHands is as simple as running a single command. This command starts both the backend and frontend servers seamlessly, allowing you to interact with OpenHands:
+ Once the setup is complete, this command starts both the backend and frontend servers, allowing you to interact with OpenHands:
  ```bash
  make run
  ```
@@ -75,11 +75,11 @@ make run
  ```

  ### 6. LLM Debugging
- If you encounter any issues with the Language Model (LM) or you're simply curious, you can inspect the actual LLM prompts and responses. To do so, export DEBUG=1 in the environment and restart the backend.
- OpenHands will then log the prompts and responses in the logs/llm/CURRENT_DATE directory, allowing you to identify the causes.
+ If you encounter any issues with the Language Model (LM) or you're simply curious, export DEBUG=1 in the environment and restart the backend.
+ OpenHands will log the prompts and responses in the logs/llm/CURRENT_DATE directory, allowing you to identify the causes.

  ### 7. Help
- Need assistance or information on available targets and commands? The help command provides all the necessary guidance to ensure a smooth experience with OpenHands.
+ Need help or info on available targets and commands? Use the help command for all the guidance you need with OpenHands.
  ```bash
  make help
  ```
@@ -93,14 +93,14 @@ poetry run pytest ./tests/unit/test_*.py
  ```

  ### 9. Add or update dependency
- 1. Add your dependency in `pyproject.toml` or use `poetry add xxx`
- 2. Update the poetry.lock file via `poetry lock --no-update`
+ 1. Add your dependency in `pyproject.toml` or use `poetry add xxx`.
+ 2. Update the poetry.lock file via `poetry lock --no-update`.

  ### 9. Use existing Docker image
  To reduce build time (e.g., if no changes were made to the client-runtime component), you can use an existing Docker container image by
  setting the SANDBOX_RUNTIME_CONTAINER_IMAGE environment variable to the desired Docker image.

- Example: `export SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.18-nikolaik`
+ Example: `export SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.19-nikolaik`

  ## Develop inside Docker container

@@ -110,7 +110,7 @@ TL;DR
  make docker-dev
  ```

- See more details [here](./containers/dev/README.md)
+ See more details [here](./containers/dev/README.md).

  If you are just interested in running `OpenHands` without installing all the required tools on your host.

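The LLM Debugging step in the hunks above says that, once `DEBUG=1` is exported and the backend restarted, prompts and responses are written under `logs/llm/CURRENT_DATE`. A minimal sketch for locating the most recent of those directories is given below. It assumes only that each run's logs live in a subdirectory of `logs/llm`; the date format of the directory name is not given in the hunk, so the sketch picks the newest subdirectory by modification time instead of parsing names. `latest_llm_log_dir` is a hypothetical helper, not an OpenHands function.

```python
from pathlib import Path
from typing import Optional


def latest_llm_log_dir(base: str = "logs/llm") -> Optional[Path]:
    """Return the most recently modified subdirectory of `base`, or None."""
    root = Path(base)
    if not root.is_dir():
        return None
    subdirs = [p for p in root.iterdir() if p.is_dir()]
    if not subdirs:
        return None
    # Newest by modification time; directory names are not interpreted.
    return max(subdirs, key=lambda p: p.stat().st_mtime)
```

If the function returns `None`, a likely cause is that `DEBUG=1` was not set in the environment the backend was started from.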
@@ -2,8 +2,8 @@
  These are the procedures and guidelines on how issues are triaged in this repo by the maintainers.

  ## General
- * Most issues must be tagged with **enhancement** or **bug**
- * Issues may be tagged with what it relates to (**backend**, **frontend**, **agent quality**, etc.)
+ * Most issues must be tagged with **enhancement** or **bug**.
+ * Issues may be tagged with what it relates to (**backend**, **frontend**, **agent quality**, etc.).

  ## Severity
  * **Low**: Minor issues or affecting single user.
@@ -11,10 +11,10 @@ These are the procedures and guidelines on how issues are triaged in this repo b
  * **Critical**: Affecting all users or potential security issues.

  ## Effort
- * Issues may be estimated with effort required (**small effort**, **medium effort**, **large effort**)
+ * Issues may be estimated with effort required (**small effort**, **medium effort**, **large effort**).

  ## Difficulty
- * Issues with low implementation difficulty may be tagged with **good first issue**
+ * Issues with low implementation difficulty may be tagged with **good first issue**.

  ## Not Enough Information
  * User is asked to provide more information (logs, how to reproduce, etc.) when the issue is not clear.

@@ -43,17 +43,17 @@ See the [Installation](https://docs.all-hands.dev/modules/usage/installation) gu
  system requirements and more information.

  ```bash
- docker pull docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik
+ docker pull docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik

  docker run -it --rm --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e LOG_ALL_EVENTS=true \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v ~/.openhands-state:/.openhands-state \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app \
- docker.all-hands.dev/all-hands-ai/openhands:0.18
+ docker.all-hands.dev/all-hands-ai/openhands:0.19
  ```

  You'll find OpenHands running at [http://localhost:3000](http://localhost:3000)!

@@ -7,7 +7,7 @@ services:
  image: openhands:latest
  container_name: openhands-app-${DATE:-}
  environment:
- - SANDBOX_RUNTIME_CONTAINER_IMAGE=${SANDBOX_RUNTIME_CONTAINER_IMAGE:-ghcr.io/all-hands-ai/runtime:0.18-nikolaik}
+ - SANDBOX_RUNTIME_CONTAINER_IMAGE=${SANDBOX_RUNTIME_CONTAINER_IMAGE:-ghcr.io/all-hands-ai/runtime:0.19-nikolaik}
  - SANDBOX_USER_ID=${SANDBOX_USER_ID:-1234}
  - WORKSPACE_MOUNT_PATH=${WORKSPACE_BASE:-$PWD/workspace}
  ports:

@@ -11,7 +11,7 @@ services:
  - BACKEND_HOST=${BACKEND_HOST:-"0.0.0.0"}
  - SANDBOX_API_HOSTNAME=host.docker.internal
  #
- - SANDBOX_RUNTIME_CONTAINER_IMAGE=${SANDBOX_RUNTIME_CONTAINER_IMAGE:-ghcr.io/all-hands-ai/runtime:0.18-nikolaik}
+ - SANDBOX_RUNTIME_CONTAINER_IMAGE=${SANDBOX_RUNTIME_CONTAINER_IMAGE:-ghcr.io/all-hands-ai/runtime:0.19-nikolaik}
  - SANDBOX_USER_ID=${SANDBOX_USER_ID:-1234}
  - WORKSPACE_MOUNT_PATH=${WORKSPACE_BASE:-$PWD/workspace}
  ports:

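Both compose hunks above rely on the `${NAME:-default}` form, which substitutes the default when the variable is unset or empty. The bumped `0.19-nikolaik` tag therefore only takes effect when `SANDBOX_RUNTIME_CONTAINER_IMAGE` is not already exported with another value. A small Python sketch of that resolution rule, for illustration (`resolve_with_default` is a hypothetical helper, not an OpenHands or Compose API):

```python
import os
from typing import Mapping, Optional

# Default taken from the "+" lines of the compose hunks.
DEFAULT_RUNTIME_IMAGE = "ghcr.io/all-hands-ai/runtime:0.19-nikolaik"


def resolve_with_default(
    name: str, default: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Mimic `${NAME:-default}`: use `default` when NAME is unset or empty."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    return value if value else default
```

Passing an explicit mapping as `env` keeps the function easy to exercise without touching the real process environment.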
@@ -52,7 +52,7 @@ LLM_API_KEY="sk_test_12345"
  ```bash
  docker run -it \
  --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -e LLM_API_KEY=$LLM_API_KEY \
@@ -61,7 +61,7 @@ docker run -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
- docker.all-hands.dev/all-hands-ai/openhands:0.18 \
+ docker.all-hands.dev/all-hands-ai/openhands:0.19 \
  python -m openhands.core.cli
  ```

@@ -46,7 +46,7 @@ LLM_API_KEY="sk_test_12345"
  ```bash
  docker run -it \
  --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -e LLM_API_KEY=$LLM_API_KEY \
@@ -56,6 +56,6 @@ docker run -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
- docker.all-hands.dev/all-hands-ai/openhands:0.18 \
+ docker.all-hands.dev/all-hands-ai/openhands:0.19 \
  python -m openhands.core.main -t "write a bash script that prints hi" --no-auto-continue
  ```
@@ -13,16 +13,16 @@
  La façon la plus simple d'exécuter OpenHands est avec Docker.

  ```bash
- docker pull docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik
+ docker pull docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik

  docker run -it --rm --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e LOG_ALL_EVENTS=true \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app \
- docker.all-hands.dev/all-hands-ai/openhands:0.18
+ docker.all-hands.dev/all-hands-ai/openhands:0.19
  ```

  Vous pouvez également exécuter OpenHands en mode [headless scriptable](https://docs.all-hands.dev/modules/usage/how-to/headless-mode), en tant que [CLI interactive](https://docs.all-hands.dev/modules/usage/how-to/cli-mode), ou en utilisant l'[Action GitHub OpenHands](https://docs.all-hands.dev/modules/usage/how-to/github-action).

@@ -13,7 +13,7 @@ C'est le Runtime par défaut qui est utilisé lorsque vous démarrez OpenHands.

  ```
  docker run # ...
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -v /var/run/docker.sock:/var/run/docker.sock \
  # ...
  ```
@@ -50,7 +50,7 @@ LLM_API_KEY="sk_test_12345"
  ```bash
  docker run -it \
  --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -e LLM_API_KEY=$LLM_API_KEY \
@@ -59,7 +59,7 @@ docker run -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
- docker.all-hands.dev/all-hands-ai/openhands:0.18 \
+ docker.all-hands.dev/all-hands-ai/openhands:0.19 \
  python -m openhands.core.cli
  ```

@@ -47,7 +47,7 @@ LLM_API_KEY="sk_test_12345"
  ```bash
  docker run -it \
  --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -e LLM_API_KEY=$LLM_API_KEY \
@@ -57,6 +57,6 @@ docker run -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
- docker.all-hands.dev/all-hands-ai/openhands:0.18 \
+ docker.all-hands.dev/all-hands-ai/openhands:0.19 \
  python -m openhands.core.main -t "write a bash script that prints hi" --no-auto-continue
  ```
@@ -11,16 +11,16 @@
  在 Docker 中运行 OpenHands 是最简单的方式。

  ```bash
- docker pull docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik
+ docker pull docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik

  docker run -it --rm --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e LOG_ALL_EVENTS=true \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app \
- docker.all-hands.dev/all-hands-ai/openhands:0.18
+ docker.all-hands.dev/all-hands-ai/openhands:0.19
  ```

  你也可以在可脚本化的[无头模式](https://docs.all-hands.dev/modules/usage/how-to/headless-mode)下运行 OpenHands,作为[交互式 CLI](https://docs.all-hands.dev/modules/usage/how-to/cli-mode),或使用 [OpenHands GitHub Action](https://docs.all-hands.dev/modules/usage/how-to/github-action)。

@@ -11,7 +11,7 @@

  ```
  docker run # ...
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -v /var/run/docker.sock:/var/run/docker.sock \
  # ...
  ```
@@ -35,7 +35,7 @@ To run OpenHands in CLI mode with Docker:
  ```bash
  docker run -it \
  --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -e LLM_API_KEY=$LLM_API_KEY \
@@ -45,7 +45,7 @@ docker run -it \
  -v ~/.openhands-state:/.openhands-state \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
- docker.all-hands.dev/all-hands-ai/openhands:0.18 \
+ docker.all-hands.dev/all-hands-ai/openhands:0.19 \
  python -m openhands.core.cli
  ```

@@ -32,7 +32,7 @@ To run OpenHands in Headless mode with Docker:
  ```bash
  docker run -it \
  --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -e LLM_API_KEY=$LLM_API_KEY \
@@ -43,7 +43,7 @@ docker run -it \
  -v ~/.openhands-state:/.openhands-state \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
- docker.all-hands.dev/all-hands-ai/openhands:0.18 \
+ docker.all-hands.dev/all-hands-ai/openhands:0.19 \
  python -m openhands.core.main -t "write a bash script that prints hi"
  ```

@@ -11,17 +11,17 @@
  The easiest way to run OpenHands is in Docker.

  ```bash
- docker pull docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik
+ docker pull docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik

  docker run -it --rm --pull=always \
- -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
+ -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
  -e LOG_ALL_EVENTS=true \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v ~/.openhands-state:/.openhands-state \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app \
- docker.all-hands.dev/all-hands-ai/openhands:0.18
+ docker.all-hands.dev/all-hands-ai/openhands:0.19
  ```

  You'll find OpenHands running at http://localhost:3000!

@@ -1,17 +1,31 @@
- # Micro-Agents
+ # Public Micro-Agents

- OpenHands uses specialized micro-agents to handle specific tasks and contexts efficiently. These micro-agents are small, focused components that provide specialized behavior and knowledge for particular scenarios.
+ OpenHands uses specialized micro-agents to handle specific tasks and contexts efficiently. These micro-agents are small,
+ focused components that provide specialized behavior and knowledge for particular scenarios.

  ## Overview

- Micro-agents are defined in markdown files under the `openhands/agenthub/codeact_agent/micro/` directory. Each micro-agent is configured with:
+ Public micro-agents are defined in markdown files under the
+ [`microagents/knowledge/`](https://github.com/All-Hands-AI/OpenHands/tree/main/microagents/knowledge) directory.
+ Each micro-agent is configured with:

  - A unique name.
  - The agent type (typically CodeActAgent).
  - Trigger keywords that activate the agent.
  - Specific instructions and capabilities.

- ## Available Micro-Agents
+ ### Integration
+
+ Public micro-agents are automatically integrated into OpenHands' workflow. They:
+ - Monitor incoming commands for their trigger words.
+ - Activate when relevant triggers are detected.
+ - Apply their specialized knowledge and capabilities.
+ - Follow their specific guidelines and restrictions.
+
+ ## Available Public Micro-Agents

+ For more information about specific micro-agents, refer to their individual documentation files in
+ the [`micro-agents`](https://github.com/All-Hands-AI/OpenHands/tree/main/microagents) directory.
+
  ### GitHub Agent
  **File**: `github.md`
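The Integration list in the hunk above describes micro-agents as watching incoming commands for trigger words and activating on a match. A minimal sketch of that idea follows. It is not the OpenHands implementation: the `MicroAgent` dataclass, the case-insensitive substring rule, and the sample agents are assumptions made for illustration. Only the `npm` trigger for `npm.md` appears in this comparison; the `docker` entry is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MicroAgent:
    name: str
    triggers: Tuple[str, ...]  # trigger keywords, e.g. ("npm",)
    instructions: str


def matching_agents(message: str, agents: List[MicroAgent]) -> List[MicroAgent]:
    """Return the agents with at least one trigger occurring in the message.

    Assumption: matching is a case-insensitive substring test.
    """
    text = message.lower()
    return [a for a in agents if any(t.lower() in text for t in a.triggers)]


# Sample data: the "npm" trigger comes from the diff; "docker" is hypothetical.
AGENTS = [
    MicroAgent("npm", ("npm",), "Pipe `yes` into npm commands that prompt."),
    MicroAgent("docker", ("docker", "container"), "Hypothetical Docker guidance."),
]
```

A substring test is the simplest possible rule; it would also fire on words that merely contain a trigger, so a real matcher may want word-boundary checks.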
@@ -29,6 +43,14 @@ Key features:
  - Git configuration management
  - API-first approach for GitHub operations

+ Usage Example:
+
+ ```bash
+ git checkout -b feature-branch
+ git commit -m "Add new feature"
+ git push origin feature-branch
+ ```
+
  ### NPM Agent
  **File**: `npm.md`
  **Triggers**: `npm`
@@ -38,9 +60,15 @@ Specializes in handling npm package management with specific focus on:
  - Automated confirmation handling using Unix 'yes' command.
  - Package installation automation.

- ### Custom Micro-Agents
+ Usage Example:

- You can create your own micro-agents by adding new markdown files to the micro-agents directory.
+ ```bash
+ yes | npm install package-name
+ ```
+
+ ### Custom Public Micro-Agents
+
+ You can create your own public micro-agents by adding new markdown files to the `microagents/knowledge/` directory.
  Each file should follow this structure:

  ```markdown
@@ -55,43 +83,29 @@ triggers:
  Instructions and capabilities for the micro-agent...
  ```

- ## Best Practices
+ ## Working With Public Micro-Agents

- When working with micro-agents:
+ When working with public micro-agents:
  - **Use Appropriate Triggers**: Ensure your commands include the relevant trigger words to activate the correct micro-agent.
  - **Follow Agent Guidelines**: Each agent has specific instructions and limitations. Respect these for optimal results.
  - **API-First Approach**: When available, use API endpoints rather than web interfaces.
  - **Automation Friendly**: Design commands that work well in non-interactive environments.

- ## Integration
+ ## Contributing a Public Micro-Agent

- Micro-agents are automatically integrated into OpenHands' workflow. They:
- - Monitor incoming commands for their trigger words.
- - Activate when relevant triggers are detected.
- - Apply their specialized knowledge and capabilities.
- - Follow their specific guidelines and restrictions.
+ Best practices for creating public micro-agents:

- ## Example Usage
+ - **Clear Scope**: Keep the micro-agent focused on a specific domain or task.
+ - **Explicit Instructions**: Provide clear, unambiguous guidelines.
+ - **Useful Examples**: Include practical examples of common use cases.
+ - **Safety First**: Include necessary warnings and constraints.
+ - **Integration Awareness**: Consider how the micro-agent interacts with other components.

- ```bash
- # GitHub agent example
- git checkout -b feature-branch
- git commit -m "Add new feature"
- git push origin feature-branch
+ To contribute a new micro-agent to OpenHands:

- # NPM agent example
- yes | npm install package-name
- ```
+ ### 1. Plan the Public Micro-Agent

- For more information about specific agents, refer to their individual documentation files in the micro-agents directory.
-
- ## Contributing a Micro-Agent
-
- To contribute a new micro-agent to OpenHands, follow these guidelines:
-
- ### 1. Planning Your Micro-Agent
-
- Before creating a micro-agent, consider:
+ Before creating a public micro-agent, consider:
  - What specific problem or use case will it address?
  - What unique capabilities or knowledge should it have?
  - What trigger words make sense for activating it?
@@ -99,11 +113,11 @@ Before creating a micro-agent, consider:

  ### 2. File Structure

- Create a new markdown file in `openhands/agenthub/codeact_agent/micro/` with a descriptive name (e.g., `docker.md` for a Docker-focused agent).
+ Create a new markdown file in `microagents/knowledge/` with a descriptive name (e.g., `docker.md` for a Docker-focused agent).

  ### 3. Required Components

- Your micro-agent file must include:
+ The micro-agent file must include:

  - **Front Matter**: YAML metadata at the start of the file:
  ```markdown
@@ -133,15 +147,7 @@ Examples of usage:
  [Example 2]
  ```

- ### 4. Best Practices for Micro-Agent Development
-
- - **Clear Scope**: Keep the agent focused on a specific domain or task.
- - **Explicit Instructions**: Provide clear, unambiguous guidelines.
- - **Useful Examples**: Include practical examples of common use cases.
- - **Safety First**: Include necessary warnings and constraints.
- - **Integration Awareness**: Consider how the agent interacts with other components.
-
- ### 5. Testing Your Micro-Agent
+ ### 4. Testing the Public Micro-Agent

  Before submitting:
  - Test the agent with various prompts.
@@ -149,7 +155,14 @@ Before submitting:
  - Ensure instructions are clear and comprehensive.
  - Check for potential conflicts with existing agents.

- ### 6. Example Implementation
+ ### 5. Submission Process
+
+ Submit a pull request with:
+ - The new micro-agent file.
+ - Updated documentation if needed.
+ - Description of the agent's purpose and capabilities.
+
+ ### Example Public Micro-Agent Implementation

  Here's a template for a new micro-agent:

@@ -197,14 +210,5 @@ Remember to:
  - Optimize for build time and image size
  ```

- ### 7. Submission Process
-
- 1. Create your micro-agent file in the correct directory.
- 2. Test thoroughly.
- 3. Submit a pull request with:
- - The new micro-agent file.
- - Updated documentation if needed.
- - Description of the agent's purpose and capabilities.
-
  Remember that micro-agents are a powerful way to extend OpenHands' capabilities in specific domains. Well-designed
  agents can significantly improve the system's ability to handle specialized tasks.
@@ -1,10 +1,12 @@
- # Customizing Agent Behavior
+ # Repository Micro-Agents

- OpenHands can be customized to work more effectively with specific repositories by providing repository-specific context and guidelines. This section explains how to optimize OpenHands for your project.
+ OpenHands can be customized to work more effectively with specific repositories by providing repository-specific context
+ and guidelines. This section explains how to optimize OpenHands for your project.

  ## Repository Configuration

- You can customize OpenHands' behavior for your repository by creating a `.openhands` directory in your repository's root. At minimum, it should contain the file
+ You can customize OpenHands' behavior for your repository by creating a `.openhands/microagents/` directory in your repository's root.
+ At minimum, it should contain the file
  `.openhands/microagents/repo.md`, which includes instructions that will
  be given to the agent every time it works with this repository.

@@ -39,7 +41,8 @@ Guidelines:

  ### Customizing Prompts

- When working with a repository:
+ You may also add customized prompts to the `.openhands/microagents/repo.md` file when working with a repository.
+ These could:

  - **Reference Project Standards**: Mention specific coding standards or patterns used in your project.
  - **Include Context**: Reference relevant documentation or existing implementations.
@@ -54,14 +57,10 @@ The component should use our shared styling from src/styles/components.

  ### Best Practices for Repository Customization

- - **Keep Instructions Updated**: Regularly update your `.openhands` directory as your project evolves.
+ - **Keep Instructions Updated**: Regularly update your `.openhands/microagents/` directory as your project evolves.
  - **Be Specific**: Include specific paths, patterns, and requirements unique to your project.
  - **Document Dependencies**: List all tools and dependencies required for development.
  - **Include Examples**: Provide examples of good code patterns from your project.
  - **Specify Conventions**: Document naming conventions, file organization, and code style preferences.

  By customizing OpenHands for your repository, you'll get more accurate and consistent results that align with your project's standards and requirements.
-
- ## Other Microagents
- You can create other instructions in the `.openhands/microagents/` directory
- that will be sent to the agent if a particular keyword is found, like `test`, `frontend`, or `migration`. See [Micro-Agents](microagents.md) for more information.
@@ -16,7 +16,7 @@ some flags being passed to `docker run` that make this possible:
|
||||
|
||||
```
|
||||
docker run # ...
|
||||
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.18-nikolaik \
|
||||
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.19-nikolaik \
|
||||
-v /var/run/docker.sock:/var/run/docker.sock \
|
||||
# ...
|
||||
```
|
||||
|
||||
1322
docs/package-lock.json
generated
File diff suppressed because it is too large
@@ -15,10 +15,10 @@
|
||||
"typecheck": "tsc"
|
||||
},
|
||||
"dependencies": {
|
||||
"@docusaurus/core": "^3.6.3",
|
||||
"@docusaurus/plugin-content-pages": "^3.6.3",
|
||||
"@docusaurus/preset-classic": "^3.6.3",
|
||||
"@docusaurus/theme-mermaid": "^3.6.3",
|
||||
"@docusaurus/core": "^3.7.0",
|
||||
"@docusaurus/plugin-content-pages": "^3.7.0",
|
||||
"@docusaurus/preset-classic": "^3.7.0",
|
||||
"@docusaurus/theme-mermaid": "^3.7.0",
|
||||
"@mdx-js/react": "^3.1.0",
|
||||
"clsx": "^2.0.0",
|
||||
"prism-react-renderer": "^2.4.1",
|
||||
@@ -29,7 +29,7 @@
|
||||
},
|
||||
"devDependencies": {
|
||||
"@docusaurus/module-type-aliases": "^3.5.1",
|
||||
"@docusaurus/tsconfig": "^3.6.3",
|
||||
"@docusaurus/tsconfig": "^3.7.0",
|
||||
"@docusaurus/types": "^3.5.1",
|
||||
"typescript": "~5.7.2"
|
||||
},
|
||||
|
||||
@@ -23,15 +23,21 @@ const sidebars: SidebarsConfig = {
|
||||
id: 'usage/prompting/prompting-best-practices',
|
||||
},
|
||||
{
|
||||
type: 'doc',
|
||||
label: 'Customization',
|
||||
id: 'usage/prompting/customization',
|
||||
},
|
||||
{
|
||||
type: 'doc',
|
||||
label: 'Microagents',
|
||||
id: 'usage/prompting/microagents',
|
||||
},
|
||||
type: 'category',
|
||||
label: 'Micro-Agents',
|
||||
items: [
|
||||
{
|
||||
type: 'doc',
|
||||
label: 'Public',
|
||||
id: 'usage/prompting/microagents-public',
|
||||
},
|
||||
{
|
||||
type: 'doc',
|
||||
label: 'Repository',
|
||||
id: 'usage/prompting/microagents-repo',
|
||||
},
|
||||
],
|
||||
}
|
||||
],
|
||||
},
|
||||
{
|
||||
|
||||
@@ -0,0 +1 @@
|
||||
{"agent_class": "CodeActAgent", "llm_config": {"model": "claude-3-5-sonnet-20241022", "api_key": null, "base_url": null, "api_version": null, "embedding_model": "local", "embedding_base_url": null, "embedding_deployment_name": null, "aws_access_key_id": null, "aws_secret_access_key": null, "aws_region_name": null, "openrouter_site_url": "https://docs.all-hands.dev/", "openrouter_app_name": "OpenHands", "num_retries": 8, "retry_multiplier": 2, "retry_min_wait": 15, "retry_max_wait": 120, "timeout": null, "max_message_chars": 30000, "temperature": 0.0, "top_p": 1.0, "custom_llm_provider": null, "max_input_tokens": null, "max_output_tokens": null, "input_cost_per_token": null, "output_cost_per_token": null, "ollama_base_url": null, "drop_params": true, "modify_params": true, "disable_vision": null, "caching_prompt": true, "log_completions": false, "log_completions_folder": "/workspace/OpenHands/logs/completions", "draft_editor": null, "custom_tokenizer": null, "native_tool_calling": null}, "max_iterations": 10, "eval_output_dir": "./dummy_eval_output_dir/dummy_dataset_descrption/CodeActAgent/claude-3-5-sonnet-20241022_maxiter_10_N_dummy_eval_note", "start_time": "2025-01-08 18:01:01", "git_commit": "007052c8aa15ea5149fff31583a3412ea7b8625a", "dataset": "dummy_dataset_descrption", "data_split": null, "details": {}, "condenser_config": {"type": "noop"}}
|
||||
@@ -15,6 +15,7 @@ from evaluation.utils.shared import (
|
||||
EvalOutput,
|
||||
assert_and_raise,
|
||||
codeact_user_response,
|
||||
get_metrics,
|
||||
is_fatal_evaluation_error,
|
||||
make_metadata,
|
||||
prepare_dataset,
|
||||
@@ -148,6 +149,7 @@ def get_config(
|
||||
codeact_enable_jupyter=False,
|
||||
codeact_enable_browsing=RUN_WITH_BROWSING,
|
||||
codeact_enable_llm_editor=False,
|
||||
condenser=metadata.condenser_config,
|
||||
)
|
||||
config.set_agent_config(agent_config)
|
||||
return config
|
||||
@@ -448,7 +450,7 @@ def process_instance(
|
||||
|
||||
# NOTE: this is NO LONGER the event stream, but an agent history that includes delegate agent's events
|
||||
histories = [event_to_dict(event) for event in state.history]
|
||||
metrics = state.metrics.get() if state.metrics else None
|
||||
metrics = get_metrics(state)
|
||||
|
||||
# Save the output
|
||||
output = EvalOutput(
|
||||
|
||||
@@ -17,6 +17,10 @@ from tqdm import tqdm
|
||||
|
||||
from openhands.controller.state.state import State
|
||||
from openhands.core.config import LLMConfig
|
||||
from openhands.core.config.condenser_config import (
|
||||
CondenserConfig,
|
||||
NoOpCondenserConfig,
|
||||
)
|
||||
from openhands.core.exceptions import (
|
||||
AgentRuntimeBuildError,
|
||||
AgentRuntimeDisconnectedError,
|
||||
@@ -33,6 +37,7 @@ from openhands.events.action.message import MessageAction
|
||||
from openhands.events.event import Event
|
||||
from openhands.events.serialization.event import event_to_dict
|
||||
from openhands.events.utils import get_pairs_from_events
|
||||
from openhands.memory.condenser import get_condensation_metadata
|
||||
|
||||
|
||||
class EvalMetadata(BaseModel):
|
||||
@@ -45,11 +50,17 @@ class EvalMetadata(BaseModel):
|
||||
dataset: str | None = None
|
||||
data_split: str | None = None
|
||||
details: dict[str, Any] | None = None
|
||||
condenser_config: CondenserConfig | None = None
|
||||
|
||||
def model_dump(self, *args, **kwargs):
|
||||
dumped_dict = super().model_dump(*args, **kwargs)
|
||||
# avoid leaking sensitive information
|
||||
dumped_dict['llm_config'] = self.llm_config.to_safe_dict()
|
||||
if hasattr(self.condenser_config, 'llm_config'):
|
||||
dumped_dict['condenser_config']['llm_config'] = (
|
||||
self.condenser_config.llm_config.to_safe_dict()
|
||||
)
|
||||
|
||||
return dumped_dict
|
||||
|
||||
def model_dump_json(self, *args, **kwargs):
|
||||
@@ -57,6 +68,11 @@ class EvalMetadata(BaseModel):
|
||||
dumped_dict = json.loads(dumped)
|
||||
# avoid leaking sensitive information
|
||||
dumped_dict['llm_config'] = self.llm_config.to_safe_dict()
|
||||
if hasattr(self.condenser_config, 'llm_config'):
|
||||
dumped_dict['condenser_config']['llm_config'] = (
|
||||
self.condenser_config.llm_config.to_safe_dict()
|
||||
)
|
||||
|
||||
logger.debug(f'Dumped metadata: {dumped_dict}')
|
||||
return json.dumps(dumped_dict)
|
||||
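The two hunks above replace the `llm_config` entries in the dumped metadata with the output of `to_safe_dict()`. A minimal sketch of that redaction pattern follows, using only the standard library. `DemoLLMConfig`, `DemoMetadata`, `SENSITIVE_FIELDS`, and the `'******'` mask are hypothetical stand-ins for illustration, not the real OpenHands classes or their exact masking behavior.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass

# Hypothetical list of fields that must never be written to disk or logs.
SENSITIVE_FIELDS = ('api_key',)


@dataclass
class DemoLLMConfig:
    """Stand-in for an LLM config that carries a secret."""

    model: str
    api_key: str | None = None

    def to_safe_dict(self) -> dict:
        # Copy the fields, then mask every sensitive value that is set.
        data = asdict(self)
        for name in SENSITIVE_FIELDS:
            if data.get(name):
                data[name] = '******'
        return data


@dataclass
class DemoMetadata:
    """Stand-in for evaluation metadata that embeds an LLM config."""

    agent_class: str
    llm_config: DemoLLMConfig

    def dump_json(self) -> str:
        dumped = asdict(self)
        # Overwrite the raw nested config with its redacted form.
        dumped['llm_config'] = self.llm_config.to_safe_dict()
        return json.dumps(dumped)
```

Calling `DemoMetadata('CodeActAgent', DemoLLMConfig('some-model', 'sk-secret')).dump_json()` yields JSON in which `api_key` is masked. The diff applies the same idea a second time to the LLM config nested inside `condenser_config`.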
|
||||
@@ -192,6 +208,7 @@ def make_metadata(
|
||||
eval_output_dir: str,
|
||||
data_split: str | None = None,
|
||||
details: dict[str, Any] | None = None,
|
||||
condenser_config: CondenserConfig | None = None,
|
||||
) -> EvalMetadata:
|
||||
model_name = llm_config.model.split('/')[-1]
|
||||
model_path = model_name.replace(':', '_').replace('@', '-')
|
||||
@@ -222,6 +239,9 @@ def make_metadata(
|
||||
dataset=dataset_name,
|
||||
data_split=data_split,
|
||||
details=details,
|
||||
condenser_config=condenser_config
|
||||
if condenser_config
|
||||
else NoOpCondenserConfig(),
|
||||
)
|
||||
metadata_json = metadata.model_dump_json()
|
||||
logger.info(f'Metadata: {metadata_json}')
|
||||
@@ -551,3 +571,10 @@ def is_fatal_evaluation_error(error: str | None) -> bool:
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
|
||||
def get_metrics(state: State) -> dict[str, Any]:
|
||||
"""Extract metrics from the state."""
|
||||
metrics = state.metrics.get() if state.metrics else {}
|
||||
metrics['condenser'] = get_condensation_metadata(state)
|
||||
return metrics
|
||||
|
||||
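The new `get_metrics` helper above can be exercised in isolation. The sketch below reproduces its logic against hypothetical stand-ins (`FakeMetrics`, `FakeState`) that expose only the attributes the helper reads. The real `State` and metrics classes have more fields, and the `accumulated_cost` key is an assumption chosen for the demo.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

CONDENSER_METADATA_KEY = 'condenser_meta'


@dataclass
class FakeMetrics:
    """Hypothetical stand-in for the metrics object attached to a state."""

    accumulated_cost: float = 0.0

    def get(self) -> dict[str, Any]:
        return {'accumulated_cost': self.accumulated_cost}


@dataclass
class FakeState:
    """Hypothetical stand-in exposing only `metrics` and `extra_data`."""

    metrics: FakeMetrics | None = None
    extra_data: dict[str, Any] = field(default_factory=dict)


def get_condensation_metadata(state: FakeState) -> list[dict[str, Any]]:
    # Same lookup as in the diff: missing key means no condensations yet.
    if CONDENSER_METADATA_KEY in state.extra_data:
        return state.extra_data[CONDENSER_METADATA_KEY]
    return []


def get_metrics(state: FakeState) -> dict[str, Any]:
    # Fall back to an empty dict so the 'condenser' key can always be set.
    metrics = state.metrics.get() if state.metrics else {}
    metrics['condenser'] = get_condensation_metadata(state)
    return metrics
```

The old inline expression returned `None` when `state.metrics` was unset. The helper returns a dict in every case, so downstream code can rely on the `'condenser'` key being present.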
30
frontend/__tests__/context/ws-client-provider.test.tsx
Normal file
@@ -0,0 +1,30 @@
|
||||
import { describe, it, expect, vi } from "vitest";
|
||||
import { render, screen } from "@testing-library/react";
|
||||
import * as ChatSlice from "#/state/chat-slice";
|
||||
import {
|
||||
updateStatusWhenErrorMessagePresent,
|
||||
} from "#/context/ws-client-provider";
|
||||
|
||||
describe("Propagate error message", () => {
|
||||
it("should do nothing when no message was passed from server", () => {
|
||||
const addErrorMessageSpy = vi.spyOn(ChatSlice, "addErrorMessage");
|
||||
updateStatusWhenErrorMessagePresent(null)
|
||||
updateStatusWhenErrorMessagePresent(undefined)
|
||||
updateStatusWhenErrorMessagePresent({})
|
||||
updateStatusWhenErrorMessagePresent({message: null})
|
||||
|
||||
expect(addErrorMessageSpy).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("should display error to user when present", () => {
|
||||
const message = "We have a problem!"
|
||||
const addErrorMessageSpy = vi.spyOn(ChatSlice, "addErrorMessage")
|
||||
updateStatusWhenErrorMessagePresent({message})
|
||||
|
||||
expect(addErrorMessageSpy).toHaveBeenCalledWith({
|
||||
message,
|
||||
status_update: true,
|
||||
type: 'error'
|
||||
});
|
||||
});
|
||||
});
|
||||
1033
frontend/package-lock.json
generated
File diff suppressed because it is too large
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "openhands-frontend",
|
||||
"version": "0.18.0",
|
||||
"version": "0.19.0",
|
||||
"private": true,
|
||||
"type": "module",
|
||||
"engines": {
|
||||
@@ -8,32 +8,32 @@
|
||||
},
|
||||
"dependencies": {
|
||||
"@monaco-editor/react": "^4.7.0-rc.0",
|
||||
"@nextui-org/react": "^2.6.10",
|
||||
"@nextui-org/react": "^2.6.11",
|
||||
"@react-router/node": "^7.1.1",
|
||||
"@react-router/serve": "^7.1.1",
|
||||
"@react-types/shared": "^3.25.0",
|
||||
"@reduxjs/toolkit": "^2.5.0",
|
||||
"@tanstack/react-query": "^5.62.12",
|
||||
"@tanstack/react-query": "^5.63.0",
|
||||
"@vitejs/plugin-react": "^4.3.2",
|
||||
"@xterm/addon-fit": "^0.10.0",
|
||||
"@xterm/xterm": "^5.4.0",
|
||||
"axios": "^1.7.9",
|
||||
"clsx": "^2.1.1",
|
||||
"eslint-config-airbnb-typescript": "^18.0.0",
|
||||
"i18next": "^24.2.0",
|
||||
"i18next": "^24.2.1",
|
||||
"i18next-browser-languagedetector": "^8.0.2",
|
||||
"i18next-http-backend": "^3.0.1",
|
||||
"isbot": "^5.1.19",
|
||||
"isbot": "^5.1.20",
|
||||
"jose": "^5.9.4",
|
||||
"monaco-editor": "^0.52.2",
|
||||
"posthog-js": "^1.203.3",
|
||||
"posthog-js": "^1.205.0",
|
||||
"react": "^19.0.0",
|
||||
"react-dom": "^19.0.0",
|
||||
"react-highlight": "^0.15.0",
|
||||
"react-hot-toast": "^2.5.1",
|
||||
"react-i18next": "^15.4.0",
|
||||
"react-icons": "^5.4.0",
|
||||
"react-markdown": "^9.0.1",
|
||||
"react-markdown": "^9.0.3",
|
||||
"react-redux": "^9.2.0",
|
||||
"react-router": "^7.1.1",
|
||||
"react-syntax-highlighter": "^15.6.1",
|
||||
@@ -78,13 +78,13 @@
|
||||
"@mswjs/socket.io-binding": "^0.1.1",
|
||||
"@playwright/test": "^1.49.1",
|
||||
"@react-router/dev": "^7.1.1",
|
||||
"@tailwindcss/typography": "^0.5.15",
|
||||
"@tailwindcss/typography": "^0.5.16",
|
||||
"@tanstack/eslint-plugin-query": "^5.62.16",
|
||||
"@testing-library/jest-dom": "^6.6.1",
|
||||
"@testing-library/react": "^16.1.0",
|
||||
"@testing-library/user-event": "^14.5.2",
|
||||
"@types/node": "^22.10.5",
|
||||
"@types/react": "^19.0.2",
|
||||
"@types/react": "^19.0.3",
|
||||
"@types/react-dom": "^19.0.2",
|
||||
"@types/react-highlight": "^0.12.8",
|
||||
"@types/react-syntax-highlighter": "^15.5.13",
|
||||
|
||||
@@ -2,7 +2,10 @@ import posthog from "posthog-js";
|
||||
import React from "react";
|
||||
import { io, Socket } from "socket.io-client";
|
||||
import EventLogger from "#/utils/event-logger";
|
||||
import { handleAssistantMessage } from "#/services/actions";
|
||||
import {
|
||||
handleAssistantMessage,
|
||||
handleStatusMessage,
|
||||
} from "#/services/actions";
|
||||
import { useRate } from "#/hooks/use-rate";
|
||||
import { OpenHandsParsedEvent } from "#/types/core";
|
||||
import {
|
||||
@@ -64,6 +67,21 @@ interface WsClientProviderProps {
|
||||
conversationId: string;
|
||||
}
|
||||
|
||||
export function updateStatusWhenErrorMessagePresent(data: unknown) {
|
||||
if (
|
||||
data &&
|
||||
typeof data === "object" &&
|
||||
"message" in data &&
|
||||
typeof data.message === "string"
|
||||
) {
|
||||
handleStatusMessage({
|
||||
type: "error",
|
||||
message: data.message,
|
||||
status_update: true,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
export function WsClientProvider({
|
||||
conversationId,
|
||||
children,
|
||||
@@ -101,7 +119,7 @@ export function WsClientProvider({
|
||||
handleAssistantMessage(event);
|
||||
}
|
||||
|
||||
function handleDisconnect() {
|
||||
function handleDisconnect(data: unknown) {
|
||||
setStatus(WsClientProviderStatus.DISCONNECTED);
|
||||
const sio = sioRef.current;
|
||||
if (!sio) {
|
||||
@@ -109,11 +127,13 @@ export function WsClientProvider({
|
||||
}
|
||||
sio.io.opts.query = sio.io.opts.query || {};
|
||||
sio.io.opts.query.latest_event_id = lastEventRef.current?.id;
|
||||
updateStatusWhenErrorMessagePresent(data);
|
||||
}
|
||||
|
||||
function handleError() {
|
||||
posthog.capture("socket_error");
|
||||
function handleError(data: unknown) {
|
||||
setStatus(WsClientProviderStatus.DISCONNECTED);
|
||||
updateStatusWhenErrorMessagePresent(data);
|
||||
posthog.capture("socket_error");
|
||||
}
|
||||
|
||||
React.useEffect(() => {
|
||||
|
||||
@@ -75,6 +75,7 @@ export function handleActionMessage(message: ActionMessage) {
|
||||
if (message.args && message.args.thought) {
|
||||
store.dispatch(addAssistantMessage(message.args.thought));
|
||||
}
|
||||
// Need to convert ActionMessage to RejectAction
|
||||
// @ts-expect-error TODO: fix
|
||||
store.dispatch(addAssistantAction(message));
|
||||
}
|
||||
|
||||
@@ -73,7 +73,7 @@ export const chatSlice = createSlice({
|
||||
state.messages.push(message);
|
||||
},
|
||||
|
||||
addAssistantMessage(state, action: PayloadAction<string>) {
|
||||
addAssistantMessage(state: SliceState, action: PayloadAction<string>) {
|
||||
const message: Message = {
|
||||
type: "thought",
|
||||
sender: "assistant",
|
||||
@@ -85,7 +85,10 @@ export const chatSlice = createSlice({
|
||||
state.messages.push(message);
|
||||
},
|
||||
|
||||
addAssistantAction(state, action: PayloadAction<OpenHandsAction>) {
|
||||
addAssistantAction(
|
||||
state: SliceState,
|
||||
action: PayloadAction<OpenHandsAction>,
|
||||
) {
|
||||
const actionID = action.payload.action;
|
||||
if (!HANDLED_ACTIONS.includes(actionID)) {
|
||||
return;
|
||||
@@ -125,7 +128,7 @@ export const chatSlice = createSlice({
|
||||
},
|
||||
|
||||
addAssistantObservation(
|
||||
state,
|
||||
state: SliceState,
|
||||
observation: PayloadAction<OpenHandsObservation>,
|
||||
) {
|
||||
const observationID = observation.payload.observation;
|
||||
@@ -179,7 +182,7 @@ export const chatSlice = createSlice({
|
||||
},
|
||||
|
||||
addErrorMessage(
|
||||
state,
|
||||
state: SliceState,
|
||||
action: PayloadAction<{ id?: string; message: string }>,
|
||||
) {
|
||||
const { id, message } = action.payload;
|
||||
@@ -192,7 +195,7 @@ export const chatSlice = createSlice({
|
||||
});
|
||||
},
|
||||
|
||||
clearMessages(state) {
|
||||
clearMessages(state: SliceState) {
|
||||
state.messages = [];
|
||||
},
|
||||
},
|
||||
|
||||
@@ -43,6 +43,6 @@ export interface ObservationMessage {
|
||||
export interface StatusMessage {
|
||||
status_update: true;
|
||||
type: string;
|
||||
id: string;
|
||||
id?: string;
|
||||
message: string;
|
||||
}
|
||||
|
||||
@@ -24,6 +24,7 @@ from openhands.events.action import (
|
||||
MessageAction,
|
||||
)
|
||||
from openhands.events.observation import (
|
||||
AgentCondensationObservation,
|
||||
AgentDelegateObservation,
|
||||
BrowserOutputObservation,
|
||||
CmdOutputObservation,
|
||||
@@ -36,6 +37,7 @@ from openhands.events.observation.error import ErrorObservation
|
||||
from openhands.events.observation.observation import Observation
|
||||
from openhands.events.serialization.event import truncate_content
|
||||
from openhands.llm.llm import LLM
|
||||
from openhands.memory.condenser import Condenser
|
||||
from openhands.runtime.plugins import (
|
||||
AgentSkillsRequirement,
|
||||
JupyterRequirement,
|
||||
@@ -115,6 +117,9 @@ class CodeActAgent(Agent):
|
||||
disabled_microagents=self.config.disabled_microagents,
|
||||
)
|
||||
|
||||
self.condenser = Condenser.from_config(self.config.condenser)
|
||||
logger.debug(f'Using condenser: {self.condenser}')
|
||||
|
||||
def get_action_message(
|
||||
self,
|
||||
action: Action,
|
||||
@@ -322,6 +327,9 @@ class CodeActAgent(Agent):
|
||||
text = 'OBSERVATION:\n' + truncate_content(obs.content, max_message_chars)
|
||||
text += '\n[Last action has been rejected by the user]'
|
||||
message = Message(role='user', content=[TextContent(text=text)])
|
||||
elif isinstance(obs, AgentCondensationObservation):
|
||||
text = truncate_content(obs.content, max_message_chars)
|
||||
message = Message(role='user', content=[TextContent(text=text)])
|
||||
else:
|
||||
# If an observation message is not returned, it will cause an error
|
||||
# when the LLM tries to return the next message
|
||||
@@ -442,7 +450,10 @@ class CodeActAgent(Agent):
|
||||
|
||||
pending_tool_call_action_messages: dict[str, Message] = {}
|
||||
tool_call_id_to_message: dict[str, Message] = {}
|
||||
events = list(state.history)
|
||||
|
||||
# Condense the events from the state.
|
||||
events = self.condenser.condensed_history(state)
|
||||
|
||||
for event in events:
|
||||
# create a regular message from an event
|
||||
if isinstance(event, Action):
|
||||
|
||||
@@ -4,6 +4,11 @@ You are OpenHands agent, a helpful AI assistant that can interact with a compute
|
||||
* When configuring git credentials, use "openhands" as the user.name and "openhands@all-hands.dev" as the user.email by default, unless explicitly instructed otherwise.
|
||||
* The assistant MUST NOT include comments in the code unless they are necessary to describe non-obvious behavior.
|
||||
</IMPORTANT>
|
||||
{% if github_repo %}
|
||||
<REPOSITORY_INFO>
|
||||
At the user's request, repository {{ github_repo }} has been cloned to directory {{ repo_directory }}.
|
||||
</REPOSITORY_INFO>
|
||||
{% endif %}
|
||||
{% if repo_instructions %}
|
||||
<REPOSITORY_INSTRUCTIONS>
|
||||
{{ repo_instructions }}
|
||||
|
||||
@@ -993,10 +993,12 @@ class AgentController:
|
||||
|
||||
def __repr__(self):
|
||||
return (
|
||||
f'AgentController(id={self.id}, agent={self.agent!r}, '
|
||||
f'event_stream={self.event_stream!r}, '
|
||||
f'state={self.state!r}, '
|
||||
f'delegate={self.delegate!r}, _pending_action={self._pending_action!r})'
|
||||
f'AgentController(id={getattr(self, "id", "<uninitialized>")}, '
|
||||
f'agent={getattr(self, "agent", "<uninitialized>")!r}, '
|
||||
f'event_stream={getattr(self, "event_stream", "<uninitialized>")!r}, '
|
||||
f'state={getattr(self, "state", "<uninitialized>")!r}, '
|
||||
f'delegate={getattr(self, "delegate", "<uninitialized>")!r}, '
|
||||
f'_pending_action={getattr(self, "_pending_action", "<uninitialized>")!r})'
|
||||
)
|
||||
|
||||
def _is_awaiting_observation(self):
|
||||
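The `__repr__` change above swaps direct attribute access for `getattr` with a default. One plausible motivation (an assumption, since the diff does not state it) is that `repr()` can be called on an object whose `__init__` has not finished, for example from a log statement or a debugger. A minimal sketch with a hypothetical `PartiallyBuilt` class:

```python
class PartiallyBuilt:
    """Hypothetical class whose attributes may not all be set yet."""

    def __init__(self) -> None:
        self.id = 'abc'
        self.state = 'running'

    def __repr__(self) -> str:
        # getattr with a default never raises AttributeError, so repr()
        # stays usable even before __init__ has assigned every field.
        return (
            f'PartiallyBuilt(id={getattr(self, "id", "<uninitialized>")}, '
            f'state={getattr(self, "state", "<uninitialized>")!r})'
        )


# object.__new__ allocates the instance without running __init__,
# which simulates an object that failed part-way through setup.
half_built = object.__new__(PartiallyBuilt)
half_built.id = 'abc'
print(repr(half_built))
```

With direct access (`self.state`), the final `print` would raise `AttributeError` instead of producing a readable placeholder.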
|
||||
@@ -1,5 +1,6 @@
|
||||
from dataclasses import dataclass, fields
|
||||
from dataclasses import dataclass, field, fields
|
||||
|
||||
from openhands.core.config.condenser_config import CondenserConfig, NoOpCondenserConfig
|
||||
from openhands.core.config.config_utils import get_field_info
|
||||
|
||||
|
||||
@@ -18,6 +19,7 @@ class AgentConfig:
|
||||
llm_config: The name of the llm config to use. If specified, this will override global llm config.
|
||||
use_microagents: Whether to use microagents at all. Default is True.
|
||||
disabled_microagents: A list of microagents to disable. Default is None.
|
||||
condenser: Configuration for the memory condenser. Default is NoOpCondenserConfig.
|
||||
"""
|
||||
|
||||
codeact_enable_browsing: bool = True
|
||||
@@ -29,6 +31,7 @@ class AgentConfig:
|
||||
llm_config: str | None = None
|
||||
use_microagents: bool = True
|
||||
disabled_microagents: list[str] | None = None
|
||||
condenser: CondenserConfig = field(default_factory=NoOpCondenserConfig) # type: ignore
|
||||
|
||||
def defaults_to_dict(self) -> dict:
|
||||
"""Serialize fields to a dict for the frontend, including type hints, defaults, and whether it's optional."""
|
||||
|
||||
90
openhands/core/config/condenser_config.py
Normal file
@@ -0,0 +1,90 @@
|
||||
from typing import Literal
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from openhands.core.config.llm_config import LLMConfig
|
||||
|
||||
|
||||
class NoOpCondenserConfig(BaseModel):
|
||||
"""Configuration for NoOpCondenser."""
|
||||
|
||||
type: Literal['noop'] = Field('noop')
|
||||
|
||||
|
||||
class ObservationMaskingCondenserConfig(BaseModel):
|
||||
"""Configuration for ObservationMaskingCondenser."""
|
||||
|
||||
type: Literal['observation_masking'] = Field('observation_masking')
|
||||
attention_window: int = Field(
|
||||
default=10,
|
||||
description='The number of most-recent events where observations will not be masked.',
|
||||
ge=1,
|
||||
)
|
||||
|
||||
|
||||
class RecentEventsCondenserConfig(BaseModel):
|
||||
"""Configuration for RecentEventsCondenser."""
|
||||
|
||||
type: Literal['recent'] = Field('recent')
|
||||
keep_first: int = Field(
|
||||
default=0,
|
||||
description='The number of initial events to always keep.',
|
||||
ge=0,
|
||||
)
|
||||
max_events: int = Field(
|
||||
default=10, description='Maximum number of events to keep.', ge=1
|
||||
)
|
||||
|
||||
|
||||
class LLMSummarizingCondenserConfig(BaseModel):
|
||||
"""Configuration for LLMSummarizingCondenser."""
|
||||
|
||||
type: Literal['llm'] = Field('llm')
|
||||
llm_config: LLMConfig = Field(
|
||||
..., description='Configuration for the LLM to use for condensing.'
|
||||
)
|
||||
|
||||
|
||||
class AmortizedForgettingCondenserConfig(BaseModel):
|
||||
"""Configuration for AmortizedForgettingCondenser."""
|
||||
|
||||
type: Literal['amortized'] = Field('amortized')
|
||||
max_size: int = Field(
|
||||
default=100,
|
||||
description='Maximum size of the condensed history before triggering forgetting.',
|
||||
ge=2,
|
||||
)
|
||||
keep_first: int = Field(
|
||||
default=0,
|
||||
description='Number of initial events to always keep in history.',
|
||||
ge=0,
|
||||
)
|
||||
|
||||
|
||||
class LLMAttentionCondenserConfig(BaseModel):
|
||||
"""Configuration for LLMAttentionCondenser."""
|
||||
|
||||
type: Literal['llm_attention'] = Field('llm_attention')
|
||||
llm_config: LLMConfig = Field(
|
||||
..., description='Configuration for the LLM to use for attention.'
|
||||
)
|
||||
max_size: int = Field(
|
||||
default=100,
|
||||
description='Maximum size of the condensed history before triggering forgetting.',
|
||||
ge=2,
|
||||
)
|
||||
keep_first: int = Field(
|
||||
default=0,
|
||||
description='Number of initial events to always keep in history.',
|
||||
ge=0,
|
||||
)
|
||||
|
||||
|
||||
CondenserConfig = (
|
||||
NoOpCondenserConfig
|
||||
| ObservationMaskingCondenserConfig
|
||||
| RecentEventsCondenserConfig
|
||||
| LLMSummarizingCondenserConfig
|
||||
| AmortizedForgettingCondenserConfig
|
||||
| LLMAttentionCondenserConfig
|
||||
)
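To illustrate how a union of config types like the one above can drive behavior, here is a standard-library sketch. It is not the OpenHands implementation: `NoOpConfig`, `RecentConfig`, and `condense` are hypothetical names, plain dataclasses replace the pydantic models, and the reading of `keep_first` and `max_events` (keep a head of the history plus the most recent tail) is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class NoOpConfig:
    """Hypothetical counterpart of a 'noop' config: keep everything."""

    type: str = 'noop'


@dataclass
class RecentConfig:
    """Hypothetical counterpart of a 'recent' config."""

    keep_first: int = 0
    max_events: int = 10
    type: str = 'recent'


def condense(events: list[str], config: NoOpConfig | RecentConfig) -> list[str]:
    """Return a possibly shorter copy of `events`, chosen by config type."""
    if isinstance(config, NoOpConfig):
        return list(events)
    if isinstance(config, RecentConfig):
        if len(events) <= config.max_events:
            # Nothing to drop; also avoids overlapping head and tail.
            return list(events)
        head = events[: min(config.keep_first, config.max_events)]
        tail_length = config.max_events - len(head)
        tail = events[len(events) - tail_length:] if tail_length > 0 else []
        return head + tail
    raise ValueError(f'Unknown condenser config: {config}')
```

For ten events `e0` to `e9`, `RecentConfig(keep_first=2, max_events=5)` keeps `e0`, `e1`, `e7`, `e8`, and `e9`. Each config class carries its own `type` tag, which lets a serialized config be mapped back to the right class.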
|
||||
@@ -37,8 +37,6 @@ class SandboxConfig:
|
||||
enable_gpu: Whether to enable GPU.
|
||||
docker_runtime_kwargs: Additional keyword arguments to pass to the Docker runtime when running containers.
|
||||
This should be a JSON string that will be parsed into a dictionary.
|
||||
Example in config.toml:
|
||||
docker_runtime_kwargs = '{"mem_limit": "4g", "cpu_quota": 100000}'
|
||||
"""
|
||||
|
||||
remote_runtime_api_url: str = 'http://localhost:8000'
|
||||
|
||||
@@ -44,5 +44,8 @@ class ObservationTypeSchema(BaseModel):
|
||||
|
||||
USER_REJECTED: str = Field(default='user_rejected')
|
||||
|
||||
CONDENSE: str = Field(default='condense')
|
||||
"""Result of a condensation operation."""
|
||||
|
||||
|
||||
ObservationType = ObservationTypeSchema()
|
||||
|
||||
@@ -1,4 +1,7 @@
|
||||
from openhands.events.observation.agent import AgentStateChangedObservation
|
||||
from openhands.events.observation.agent import (
|
||||
AgentCondensationObservation,
|
||||
AgentStateChangedObservation,
|
||||
)
|
||||
from openhands.events.observation.browse import BrowserOutputObservation
|
||||
from openhands.events.observation.commands import (
|
||||
CmdOutputMetadata,
|
||||
@@ -32,4 +35,5 @@ __all__ = [
|
||||
'AgentDelegateObservation',
|
||||
'SuccessObservation',
|
||||
'UserRejectObservation',
|
||||
'AgentCondensationObservation',
|
||||
]
|
||||
|
||||
@@ -14,3 +14,14 @@ class AgentStateChangedObservation(Observation):
|
||||
@property
|
||||
def message(self) -> str:
|
||||
return ''
|
||||
|
||||
|
||||
@dataclass
|
||||
class AgentCondensationObservation(Observation):
|
||||
"""The output of a condensation action."""
|
||||
|
||||
observation: str = ObservationType.CONDENSE
|
||||
|
||||
@property
|
||||
def message(self) -> str:
|
||||
return self.content
|
||||
|
||||
@@ -1,6 +1,9 @@
|
||||
import copy
|
||||
|
||||
from openhands.events.observation.agent import AgentStateChangedObservation
|
||||
from openhands.events.observation.agent import (
|
||||
AgentCondensationObservation,
|
||||
AgentStateChangedObservation,
|
||||
)
|
||||
from openhands.events.observation.browse import BrowserOutputObservation
|
||||
from openhands.events.observation.commands import (
|
||||
CmdOutputMetadata,
|
||||
@@ -32,6 +35,7 @@ observations = (
|
||||
ErrorObservation,
|
||||
AgentStateChangedObservation,
|
||||
UserRejectObservation,
|
||||
AgentCondensationObservation,
|
||||
)
|
||||
|
||||
OBSERVATION_TYPE_TO_CLASS = {
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
from openhands.memory.condenser import MemoryCondenser
|
||||
from openhands.memory.condenser import Condenser
|
||||
from openhands.memory.memory import LongTermMemory
|
||||
|
||||
__all__ = ['LongTermMemory', 'MemoryCondenser']
|
||||
__all__ = ['LongTermMemory', 'Condenser']
|
||||
|
||||
@@ -1,24 +1,409 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
from contextlib import contextmanager
|
||||
from typing import Any
|
||||
|
||||
from litellm import supports_response_schema
|
||||
from pydantic import BaseModel
|
||||
from typing_extensions import override
|
||||
|
||||
from openhands.controller.state.state import State
|
||||
from openhands.core.config.condenser_config import (
|
||||
AmortizedForgettingCondenserConfig,
|
||||
CondenserConfig,
|
||||
LLMAttentionCondenserConfig,
|
||||
LLMSummarizingCondenserConfig,
|
||||
NoOpCondenserConfig,
|
||||
ObservationMaskingCondenserConfig,
|
||||
RecentEventsCondenserConfig,
|
||||
)
|
||||
from openhands.core.logger import openhands_logger as logger
|
||||
from openhands.events.event import Event
|
||||
from openhands.events.observation import AgentCondensationObservation, Observation
|
||||
from openhands.llm.llm import LLM
|
||||
|
||||
CONDENSER_METADATA_KEY = 'condenser_meta'
|
||||
"""Key identifying where metadata is stored in a `State` object's `extra_data` field."""
|
||||
|
||||
class MemoryCondenser:
|
||||
def condense(self, summarize_prompt: str, llm: LLM):
|
||||
"""Attempts to condense the memory by using the llm
|
||||
|
||||
Parameters:
|
||||
- llm (LLM): llm to be used for summarization
|
||||
def get_condensation_metadata(state: State) -> list[dict[str, Any]]:
|
||||
"""Utility function to retrieve a list of metadata batches from a `State`.
|
||||
|
||||
Args:
|
||||
state: The state to retrieve metadata from.
|
||||
|
||||
Returns:
|
||||
list[dict[str, Any]]: A list of metadata batches, each representing a condensation.
|
||||
"""
|
||||
if CONDENSER_METADATA_KEY in state.extra_data:
|
||||
return state.extra_data[CONDENSER_METADATA_KEY]
|
||||
return []
|
||||
|
||||
|
||||
class Condenser(ABC):
|
||||
"""Abstract condenser interface.
|
||||
|
||||
Condensers take a list of `Event` objects and reduce them into a potentially smaller list.
|
||||
|
||||
Agents can use condensers to reduce the number of events they need to consider when deciding which action to take. To use a condenser, agents can call the `condensed_history` method on the current `State` being considered and use the results instead of the full history.
|
||||
|
||||
Example usage::
|
||||
|
||||
condenser = Condenser.from_config(condenser_config)
|
||||
events = condenser.condensed_history(state)
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
self._metadata_batch: dict[str, Any] = {}
|
||||
|
||||
def add_metadata(self, key: str, value: Any) -> None:
|
||||
"""Add information to the current metadata batch.
|
||||
|
||||
Any key/value pairs added to the metadata batch will be recorded in the `State` at the end of the current condensation.
|
||||
|
||||
Args:
|
||||
key: The key to store the metadata under.
|
||||
|
||||
value: The metadata to store.
|
||||
"""
|
||||
self._metadata_batch[key] = value
|
||||
|
||||
def write_metadata(self, state: State) -> None:
|
||||
"""Write the current batch of metadata to the `State`.
|
||||
|
||||
Resets the current metadata batch: any metadata added after this call will be stored in a new batch and written to the `State` at the end of the next condensation.
|
||||
"""
|
||||
if CONDENSER_METADATA_KEY not in state.extra_data:
|
||||
state.extra_data[CONDENSER_METADATA_KEY] = []
|
||||
if self._metadata_batch:
|
||||
state.extra_data[CONDENSER_METADATA_KEY].append(self._metadata_batch)
|
||||
|
||||
# Since the batch has been written, clear it for the next condensation
|
||||
self._metadata_batch = {}
|
||||
|
||||
@contextmanager
|
||||
def metadata_batch(self, state: State):
|
||||
"""Context manager to ensure batched metadata is always written to the `State`."""
|
||||
try:
|
||||
yield
|
||||
finally:
|
||||
self.write_metadata(state)
|
||||
|
||||
@abstractmethod
|
||||
def condense(self, events: list[Event]) -> list[Event]:
|
||||
"""Condense a sequence of events into a potentially smaller list.
|
||||
|
||||
New condenser strategies should override this method to implement their own condensation logic. Call `self.add_metadata` in the implementation to record any relevant per-condensation diagnostic information.
|
||||
|
||||
Args:
|
||||
events: A list of events representing the entire history of the agent.
|
||||
|
||||
Returns:
|
||||
list[Event]: An event sequence representing a condensed history of the agent.
|
||||
"""
|
||||
|
||||
def condensed_history(self, state: State) -> list[Event]:
|
||||
"""Condense the state's history."""
|
||||
with self.metadata_batch(state):
|
||||
return self.condense(state.history)
|
||||
|
||||
@classmethod
|
||||
def from_config(cls, config: CondenserConfig) -> Condenser:
|
||||
"""Create a condenser from a configuration object.
|
||||
|
||||
Args:
|
||||
config: Configuration for the condenser.
|
||||
|
||||
Returns:
|
||||
Condenser: A condenser instance.
|
||||
|
||||
Raises:
|
||||
- Exception: the same exception as it got from the llm or processing the response
|
||||
ValueError: If the condenser type is not recognized.
|
||||
"""
|
||||
match config:
|
||||
case NoOpCondenserConfig():
|
||||
return NoOpCondenser()
|
||||
|
||||
case ObservationMaskingCondenserConfig():
|
||||
return ObservationMaskingCondenser(
|
||||
**config.model_dump(exclude=['type'])
|
||||
)
|
||||
|
||||
case RecentEventsCondenserConfig():
|
||||
return RecentEventsCondenser(**config.model_dump(exclude=['type']))
|
||||
|
||||
case LLMSummarizingCondenserConfig(llm_config=llm_config):
|
||||
return LLMSummarizingCondenser(llm=LLM(config=llm_config))
|
||||
|
||||
case AmortizedForgettingCondenserConfig():
|
||||
return AmortizedForgettingCondenser(
|
||||
**config.model_dump(exclude=['type'])
|
||||
)
|
||||
|
||||
case LLMAttentionCondenserConfig(llm_config=llm_config):
|
||||
return LLMAttentionCondenser(
|
||||
llm=LLM(config=llm_config),
|
||||
**config.model_dump(exclude=['type', 'llm_config']),
|
||||
)
|
||||
|
||||
case _:
|
||||
raise ValueError(f'Unknown condenser config: {config}')
|
||||
|
||||
|
||||
class RollingCondenser(Condenser, ABC):
|
||||
"""Base class for a specialized condenser strategy that applies condensation to a rolling history.
|
||||
|
||||
The rolling history is computed by appending new events to the most recent condensation. For example, the sequence of calls::
|
||||
|
||||
assert state.history == [event1, event2, event3]
|
||||
condensation = condenser.condensed_history(state)
|
||||
|
||||
# ...new events are added to the state...
|
||||
|
||||
assert state.history == [event1, event2, event3, event4, event5]
|
||||
condenser.condensed_history(state)
|
||||
|
||||
will result in the second call to `condensed_history` passing `condensation + [event4, event5]` to the `condense` method.
|
||||
"""
|
||||
|
||||
def __init__(self) -> None:
|
||||
self._condensation: list[Event] = []
|
||||
self._last_history_length: int = 0
|
||||
|
||||
super().__init__()
|
||||
|
||||
@override
|
||||
def condensed_history(self, state: State) -> list[Event]:
|
||||
new_events = state.history[self._last_history_length :]
|
||||
|
||||
with self.metadata_batch(state):
|
||||
results = self.condense(self._condensation + new_events)
|
||||
|
||||
self._condensation = results
|
||||
self._last_history_length = len(state.history)
|
||||
|
||||
return results
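The rolling behaviour described in the `RollingCondenser` docstring can be traced with plain lists. In this sketch `RollingKeepLast` is a hypothetical class that keeps only the last `n` items and records what `condense` receives; it assumes `n` is positive and is not part of OpenHands.

```python
class RollingKeepLast:
    """Hypothetical rolling condenser over plain lists: keeps the last `n` items."""

    def __init__(self, n: int) -> None:
        self.n = n  # assumed to be positive
        self._condensation: list[str] = []
        self._last_history_length = 0
        self.seen_inputs: list[list[str]] = []  # inputs passed to condense()

    def condense(self, events: list[str]) -> list[str]:
        self.seen_inputs.append(list(events))
        return events[-self.n :]

    def condensed_history(self, history: list[str]) -> list[str]:
        # Only events added since the previous call are new.
        new_events = history[self._last_history_length :]
        results = self.condense(self._condensation + new_events)
        self._condensation = results
        self._last_history_length = len(history)
        return results


history = ['e1', 'e2', 'e3']
condenser = RollingKeepLast(n=2)
first = condenser.condensed_history(history)

history += ['e4', 'e5']
second = condenser.condensed_history(history)
```

On the second call `condense` sees the previous condensation plus the two new events, not the full five-event history.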
|
||||
|
||||
|
||||
class NoOpCondenser(Condenser):
|
||||
"""A condenser that does nothing to the event sequence."""
|
||||
|
||||
def condense(self, events: list[Event]) -> list[Event]:
|
||||
"""Returns the list of events unchanged."""
|
||||
return events
|
||||
|
||||
|
||||
class ObservationMaskingCondenser(Condenser):
|
||||
"""A condenser that masks the values of observations outside of a recent attention window."""
|
||||
|
||||
def __init__(self, attention_window: int = 5):
|
||||
self.attention_window = attention_window
|
||||
|
||||
super().__init__()
|
||||
|
||||
def condense(self, events: list[Event]) -> list[Event]:
|
||||
"""Replace the content of observations outside of the attention window with a placeholder."""
|
||||
results: list[Event] = []
|
||||
for i, event in enumerate(events):
|
||||
if (
|
||||
isinstance(event, Observation)
|
||||
and i < len(events) - self.attention_window
|
||||
):
|
||||
results.append(AgentCondensationObservation('<MASKED>'))
|
||||
else:
|
||||
results.append(event)
|
||||
|
||||
return results
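The masking rule can be sketched without the OpenHands event classes. Here `Action`, `Observation`, and `mask_old_observations` are hypothetical stand-ins: only observations whose index falls before the last `attention_window` events are replaced.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Hypothetical stand-in for an agent action."""

    text: str


@dataclass
class Observation:
    """Hypothetical stand-in for an observation."""

    text: str


def mask_old_observations(events: list, attention_window: int = 5) -> list:
    # Events at index >= cutoff are inside the attention window and kept as-is.
    cutoff = len(events) - attention_window
    return [
        Observation('<MASKED>')
        if isinstance(event, Observation) and i < cutoff
        else event
        for i, event in enumerate(events)
    ]


events = [Action('a0'), Observation('o1'), Action('a2'), Observation('o3')]
masked = mask_old_observations(events, attention_window=2)
```

Actions are never masked, so the sequence keeps its length and ordering; only the content of older observations is dropped.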
|
||||
|
||||
|
||||
class RecentEventsCondenser(Condenser):
|
||||
"""A condenser that only keeps a certain number of the most recent events."""
|
||||
|
||||
def __init__(self, keep_first: int = 0, max_events: int = 10):
|
||||
self.keep_first = keep_first
|
||||
self.max_events = max_events
|
||||
|
||||
super().__init__()
|
||||
|
||||
def condense(self, events: list[Event]) -> list[Event]:
|
||||
"""Keep only the most recent events (up to `max_events`)."""
|
||||
head = events[: self.keep_first]
|
||||
tail_length = max(0, self.max_events - len(head))
|
||||
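# events[-0:] would return the whole list, so handle an empty tail explicitly
tail = events[-tail_length:] if tail_length > 0 else []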
|
||||
return head + tail
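A head-plus-tail selection of this kind has two slicing edge cases worth spelling out: in Python `events[-0:]` is the whole list, and a tail slice taken from the full list can overlap the head when the list is short. The function below is a hypothetical standalone variant that guards against both; it is a sketch under those assumptions, not the code in this change.

```python
def keep_recent(events: list, keep_first: int = 0, max_events: int = 10) -> list:
    """Keep the first `keep_first` events plus the most recent ones, up to `max_events` total."""
    head = events[:keep_first]
    tail_length = max(0, max_events - len(head))
    # Take the tail only from events after the head, so nothing is selected twice.
    rest = events[len(head) :]
    # A slice like rest[-0:] would return everything, so handle zero explicitly.
    tail = rest[-tail_length:] if tail_length > 0 else []
    return head + tail


numbers = list(range(10))
trimmed = keep_recent(numbers, keep_first=2, max_events=5)
head_only = keep_recent(numbers, keep_first=3, max_events=3)
short = keep_recent([0, 1, 2], keep_first=1, max_events=10)
```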
|
||||
|
||||
|
||||
class LLMSummarizingCondenser(Condenser):
|
||||
"""A condenser that relies on a language model to summarize the event sequence as a single event."""
|
||||
|
||||
def __init__(self, llm: LLM):
|
||||
self.llm = llm
|
||||
|
||||
super().__init__()
|
||||
|
||||
def condense(self, events: list[Event]) -> list[Event]:
|
||||
"""Applies an LLM to summarize the list of events.
|
||||
|
||||
Raises:
|
||||
Exception: If the LLM is unable to summarize the event sequence.
|
||||
"""
|
||||
try:
|
||||
messages = [{'content': summarize_prompt, 'role': 'user'}]
|
||||
resp = llm.completion(messages=messages)
|
||||
summary_response = resp['choices'][0]['message']['content']
|
||||
return summary_response
|
||||
except Exception as e:
|
||||
logger.error('Error condensing thoughts: %s', str(e), exc_info=False)
|
||||
# Convert events to a format suitable for summarization
|
||||
events_text = '\n'.join(f'{e.timestamp}: {e.message}' for e in events)
|
||||
summarize_prompt = f'Please summarize these events:\n{events_text}'
|
||||
|
||||
# TODO If the llm fails with ContextWindowExceededError, we can try to condense the memory chunk by chunk
|
||||
raise
|
||||
resp = self.llm.completion(
|
||||
messages=[{'content': summarize_prompt, 'role': 'user'}]
|
||||
)
|
||||
summary_response = resp.choices[0].message.content
|
||||
|
||||
# Create a new summary event with the condensed content
|
||||
summary_event = AgentCondensationObservation(summary_response)
|
||||
|
||||
# Add metrics to state
|
||||
self.add_metadata('response', resp.model_dump())
|
||||
self.add_metadata('metrics', self.llm.metrics.get())
|
||||
|
||||
return [summary_event]
|
||||
|
||||
except Exception as e:
|
||||
logger.error('Error condensing events: %s', str(e), exc_info=False)
|
||||
raise e
|
||||
|
||||
|
||||
class AmortizedForgettingCondenser(RollingCondenser):
|
||||
"""A condenser that maintains a condensed history and forgets old events when it grows too large."""
|
||||
|
||||
def __init__(self, max_size: int = 100, keep_first: int = 0):
|
||||
"""Initialize the condenser.
|
||||
|
||||
Args:
|
||||
max_size: Maximum size of history before forgetting.
|
||||
keep_first: Number of initial events to always keep.
|
||||
|
||||
Raises:
|
||||
ValueError: If keep_first is not less than half of max_size, keep_first is negative, or max_size is non-positive.
|
||||
"""
|
||||
if keep_first >= max_size // 2:
|
||||
raise ValueError(
|
||||
f'keep_first ({keep_first}) must be less than half of max_size ({max_size})'
|
||||
)
|
||||
if keep_first < 0:
|
||||
raise ValueError(f'keep_first ({keep_first}) cannot be negative')
|
||||
if max_size < 1:
|
||||
raise ValueError(f'max_size ({max_size}) cannot be non-positive')
|
||||
|
||||
self.max_size = max_size
|
||||
self.keep_first = keep_first
|
||||
|
||||
super().__init__()
|
||||
|
||||
def condense(self, events: list[Event]) -> list[Event]:
|
||||
"""Apply the amortized forgetting strategy to the given list of events."""
|
||||
if len(events) <= self.max_size:
|
||||
return events
|
||||
|
||||
target_size = self.max_size // 2
|
||||
head = events[: self.keep_first]
|
||||
|
||||
events_from_tail = target_size - len(head)
|
||||
tail = events[-events_from_tail:]
|
||||
|
||||
return head + tail
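The forgetting strategy is easier to follow on integers. The function below is a minimal sketch that mirrors the logic of `condense` on a plain list; `amortized_forget` is a hypothetical name, and the validation is collapsed into a single check for brevity.

```python
def amortized_forget(events: list, max_size: int = 100, keep_first: int = 0) -> list:
    """Trim `events` to about half of `max_size` once it grows past `max_size`."""
    if keep_first < 0 or max_size < 1 or keep_first >= max_size // 2:
        raise ValueError('invalid keep_first / max_size combination')

    if len(events) <= max_size:
        return events

    target_size = max_size // 2
    head = events[:keep_first]
    # Positive because keep_first < max_size // 2 and the list is longer than max_size.
    events_from_tail = target_size - len(head)
    tail = events[-events_from_tail:]
    return head + tail


untouched = amortized_forget(list(range(10)), max_size=10, keep_first=2)
trimmed = amortized_forget(list(range(11)), max_size=10, keep_first=2)
```

Cutting back to half of `max_size`, rather than to `max_size` itself, means roughly `max_size / 2` further events can be appended before the next trim, which is where the "amortized" in the name comes from.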
|
||||
|
||||
|
||||
class ImportantEventSelection(BaseModel):
|
||||
"""Utility class for the `LLMAttentionCondenser` that forces the LLM to return a list of integers."""
|
||||
|
||||
ids: list[int]
|
||||
|
||||
|
||||
class LLMAttentionCondenser(RollingCondenser):
|
||||
"""Rolling condenser strategy that uses an LLM to select the most important events when condensing the history."""
|
||||
|
||||
def __init__(self, llm: LLM, max_size: int = 100, keep_first: int = 0):
|
||||
if keep_first >= max_size // 2:
|
||||
raise ValueError(
|
||||
f'keep_first ({keep_first}) must be less than half of max_size ({max_size})'
|
||||
)
|
||||
if keep_first < 0:
|
||||
raise ValueError(f'keep_first ({keep_first}) cannot be negative')
|
||||
if max_size < 1:
|
||||
raise ValueError(f'max_size ({max_size}) cannot be non-positive')
|
||||
|
||||
self.max_size = max_size
|
||||
self.keep_first = keep_first
|
||||
self.llm = llm
|
||||
|
||||
# This condenser relies on the `response_schema` feature, which is not supported by all LLMs
|
||||
if not supports_response_schema(
|
||||
model=self.llm.config.model,
|
||||
custom_llm_provider=self.llm.config.custom_llm_provider,
|
||||
):
|
||||
raise ValueError(
|
||||
"The LLM model must support the 'response_schema' parameter to use the LLMAttentionCondenser."
|
||||
)
|
||||
|
||||
super().__init__()
|
||||
|
||||
def condense(self, events: list[Event]) -> list[Event]:
|
||||
"""If the history is too long, use an LLM to select the most important events."""
|
||||
if len(events) <= self.max_size:
|
||||
return events
|
||||
|
||||
target_size = self.max_size // 2
|
||||
head = events[: self.keep_first]
|
||||
|
||||
events_from_tail = target_size - len(head)
|
||||
|
||||
message: str = """You will be given a list of actions, observations, and thoughts from a coding agent.
|
||||
Each item in the list has an identifier. Please sort the identifiers in order of how important the
|
||||
contents of the item are for the next step of the coding agent's task, from most important to least
|
||||
important."""
|
||||
|
||||
response = self.llm.completion(
|
||||
messages=[
|
||||
{'content': message, 'role': 'user'},
|
||||
*[
|
||||
{
|
||||
'content': f'<ID>{e.id}</ID>\n<CONTENT>{e.message}</CONTENT>',
|
||||
'role': 'user',
|
||||
}
|
||||
for e in events
|
||||
],
|
||||
],
|
||||
response_format={
|
||||
'type': 'json_schema',
|
||||
'json_schema': {
|
||||
'name': 'ImportantEventSelection',
|
||||
'schema': ImportantEventSelection.model_json_schema(),
|
||||
},
|
||||
},
|
||||
)
|
||||
|
||||
response_ids = ImportantEventSelection.model_validate_json(
|
||||
response.choices[0].message.content
|
||||
).ids
|
||||
|
||||
self.add_metadata('all_event_ids', [event.id for event in events])
|
||||
self.add_metadata('response_ids', response_ids)
|
||||
self.add_metadata('metrics', self.llm.metrics.get())
|
||||
|
||||
# Filter out any IDs from the head and trim the results down
|
||||
head_ids = [event.id for event in head]
|
||||
response_ids = [
|
||||
response_id for response_id in response_ids if response_id not in head_ids
|
||||
][:events_from_tail]
|
||||
|
||||
# If the response IDs aren't _long_ enough, iterate backwards through the events and add any unfound IDs to the list.
|
||||
for event in reversed(events):
|
||||
if len(response_ids) >= events_from_tail:
|
||||
break
|
||||
if event.id not in response_ids:
|
||||
response_ids.append(event.id)
|
||||
|
||||
# Grab the events associated with the response IDs
|
||||
tail = [event for event in events if event.id in response_ids]
|
||||
|
||||
return head + tail
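The post-processing of the LLM's ranking (drop head IDs, truncate, then backfill from the most recent events) does not need an LLM to demonstrate. In this sketch `select_tail_ids` is a hypothetical helper operating on integer IDs, and `ranked_ids` stands in for the parsed model response.

```python
def select_tail_ids(
    event_ids: list[int],
    ranked_ids: list[int],
    head_ids: list[int],
    events_from_tail: int,
) -> list[int]:
    """Pick which non-head event IDs to keep, given a (possibly short) ranking."""
    # Drop IDs already covered by the head, then keep the top-ranked ones.
    selected = [i for i in ranked_ids if i not in head_ids][:events_from_tail]

    # If the ranking was too short, backfill with the most recent events.
    for event_id in reversed(event_ids):
        if len(selected) >= events_from_tail:
            break
        if event_id not in selected:
            selected.append(event_id)
    return selected


event_ids = list(range(10))
head_ids = [0, 1]
ranked_ids = [1, 7, 3]  # stand-in for the model's answer; too short on purpose

selected = select_tail_ids(event_ids, ranked_ids, head_ids, events_from_tail=4)
# Restore chronological order, as the final list comprehension over events does.
tail = [i for i in event_ids if i in selected]
```

The ranking decides which events survive, but the surviving events are emitted in their original order.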
|
||||
|
||||
@@ -26,6 +26,17 @@ class DockerRuntimeBuilder(RuntimeBuilder):
|
||||
|
||||
self.rolling_logger = RollingLogger(max_lines=10)
|
||||
|
||||
@staticmethod
|
||||
def check_buildx():
|
||||
"""Check if Docker Buildx is available"""
|
||||
try:
|
||||
result = subprocess.run(
|
||||
['docker', 'buildx', 'version'], capture_output=True, text=True
|
||||
)
|
||||
return result.returncode == 0
|
||||
except FileNotFoundError:
|
||||
return False
|
||||
|
||||
def build(
|
||||
self,
|
||||
path: str,
|
||||
@@ -62,6 +73,38 @@ class DockerRuntimeBuilder(RuntimeBuilder):
|
||||
'Docker server version must be >= 18.09 to use BuildKit'
|
||||
)
|
||||
|
||||
if not DockerRuntimeBuilder.check_buildx():
|
||||
# When running OpenHands in a container, a "docker" binary may not be
# available, in which case we need to download and install one.
# Since the official OpenHands app image is built from Debian, we use
# the Debian procedure to install Docker.
|
||||
logger.info(
|
||||
'No docker binary available inside openhands-app container, trying to download online...'
|
||||
)
|
||||
commands = [
|
||||
'apt-get update',
|
||||
'apt-get install -y ca-certificates curl gnupg',
|
||||
'install -m 0755 -d /etc/apt/keyrings',
|
||||
'curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc',
|
||||
'chmod a+r /etc/apt/keyrings/docker.asc',
|
||||
'echo \
|
||||
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
|
||||
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
|
||||
tee /etc/apt/sources.list.d/docker.list > /dev/null',
|
||||
'apt-get update',
|
||||
'apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin',
|
||||
]
|
||||
for cmd in commands:
|
||||
try:
|
||||
subprocess.run(
|
||||
cmd, shell=True, check=True, stdout=subprocess.DEVNULL
|
||||
)
|
||||
except subprocess.CalledProcessError as e:
|
||||
logger.error(f'Docker installation failed:\n{e}')
|
||||
logger.error(f'Command output:\n{e.output}')
|
||||
raise
|
||||
logger.info('Downloaded and installed docker binary')
|
||||
|
||||
target_image_hash_name = tags[0]
|
||||
target_image_repo, target_image_source_tag = target_image_hash_name.split(':')
|
||||
target_image_tag = tags[1].split(':')[1] if len(tags) > 1 else None
|
||||
|
||||
@@ -204,8 +204,9 @@ class AgentSession:
|
||||
)
|
||||
return
|
||||
|
||||
repo_directory = None
|
||||
if selected_repository:
|
||||
await call_sync_from_async(
|
||||
repo_directory = await call_sync_from_async(
|
||||
self.runtime.clone_repo, github_token, selected_repository
|
||||
)
|
||||
if agent.prompt_manager:
|
||||
@@ -213,6 +214,10 @@ class AgentSession:
|
||||
self.runtime.get_microagents_from_selected_repo, selected_repository
|
||||
)
|
||||
agent.prompt_manager.load_microagents(microagents)
|
||||
# Pass GitHub repository information to the prompt manager
|
||||
agent.prompt_manager.set_repository_info(
|
||||
selected_repository, repo_directory
|
||||
)
|
||||
|
||||
logger.debug(
|
||||
f'Runtime initialized with plugins: {[plugin.name for plugin in self.runtime.plugins]}'
|
||||
@@ -274,27 +279,25 @@ class AgentSession:
|
||||
confirmation_mode=confirmation_mode,
|
||||
headless_mode=False,
|
||||
status_callback=self._status_callback,
|
||||
initial_state=self._maybe_restore_state(),
|
||||
)
|
||||
|
||||
# Note: We now attempt to restore the state from session here,
|
||||
# but if it fails, we fall back to None and still initialize the controller
|
||||
# with a fresh state. That way, the controller will always load events from the event stream
|
||||
# even if the state file was corrupt.
|
||||
return controller
|
||||
|
||||
def _maybe_restore_state(self) -> State | None:
|
||||
"""Helper method to handle state restore logic."""
|
||||
restored_state = None
|
||||
|
||||
# Attempt to restore the state from session.
|
||||
# Use a heuristic to figure out if we should have a state:
|
||||
# if we have events in the stream.
|
||||
try:
|
||||
restored_state = State.restore_from_session(self.sid, self.file_store)
|
||||
logger.debug(f'Restored state from session, sid: {self.sid}')
|
||||
except Exception as e:
|
||||
if self.event_stream.get_latest_event_id() > 0:
|
||||
# if we have events, we should have a state
|
||||
logger.warning(f'State could not be restored: {e}')
|
||||
|
||||
# Set the initial state through the controller.
|
||||
controller.set_initial_state(restored_state, max_iterations, confirmation_mode)
|
||||
if restored_state:
|
||||
logger.debug(f'Restored agent state from session, sid: {self.sid}')
|
||||
else:
|
||||
logger.debug('New session state created.')
|
||||
|
||||
logger.debug('Agent controller initialized.')
|
||||
return controller
|
||||
else:
|
||||
logger.debug('No events found, no state to restore')
|
||||
return restored_state
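The restore logic treats a failed restore as non-fatal and only varies the log level depending on whether events already exist. A minimal sketch of that heuristic, with a hypothetical `maybe_restore` function and a callable standing in for `State.restore_from_session`:

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger('restore_sketch')


def maybe_restore(restore: Callable[[], dict], latest_event_id: int) -> Optional[dict]:
    """Return the restored state, or None if restoring fails for any reason."""
    restored = None
    try:
        restored = restore()
    except Exception as e:
        if latest_event_id > 0:
            # Events exist, so a state was expected: worth a warning.
            logger.warning('State could not be restored: %s', e)
        else:
            logger.debug('No events found, no state to restore')
    return restored


def failing_restore() -> dict:
    raise FileNotFoundError('no state file')


fresh = maybe_restore(failing_restore, latest_event_id=0)
lost = maybe_restore(failing_restore, latest_event_id=5)
found = maybe_restore(lambda: {'iteration': 3}, latest_event_id=5)
```

Both failure cases return `None`, so the caller always proceeds with a fresh state; the event count only changes how loudly the failure is reported.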
|
||||
|
||||
@@ -21,7 +21,8 @@ class GoogleCloudFileStore(FileStore):
|
||||
|
||||
def write(self, path: str, contents: str | bytes) -> None:
|
||||
blob = self.bucket.blob(path)
|
||||
with blob.open('w') as f:
|
||||
mode = 'wb' if isinstance(contents, bytes) else 'w'
|
||||
with blob.open(mode) as f:
|
||||
f.write(contents)
|
||||
|
||||
def read(self, path: str) -> str:
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
import os
|
||||
from dataclasses import dataclass
|
||||
from itertools import islice
|
||||
|
||||
from jinja2 import Template
|
||||
@@ -13,6 +14,14 @@ from openhands.microagent import (
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class RepositoryInfo:
|
||||
"""Information about a GitHub repository that has been cloned."""
|
||||
|
||||
repo_name: str | None = None
|
||||
repo_directory: str | None = None
|
||||
|
||||
|
||||
class PromptManager:
|
||||
"""
|
||||
Manages prompt templates and micro-agents for AI interactions.
|
||||
@@ -32,9 +41,14 @@ class PromptManager:
|
||||
prompt_dir: str,
|
||||
microagent_dir: str | None = None,
|
||||
disabled_microagents: list[str] | None = None,
|
||||
github_repo: str | None = None,
|
||||
repo_directory: str | None = None,
|
||||
):
|
||||
self.disabled_microagents: list[str] = disabled_microagents or []
|
||||
self.prompt_dir: str = prompt_dir
|
||||
self.repository_info = RepositoryInfo()
|
||||
if github_repo:
|
||||
self.set_repository_info(github_repo, repo_directory)
|
||||
|
||||
self.system_template: Template = self._load_template('system_prompt')
|
||||
self.user_template: Template = self._load_template('user_prompt')
|
||||
@@ -91,7 +105,24 @@ class PromptManager:
|
||||
if repo_instructions:
|
||||
repo_instructions += '\n\n'
|
||||
repo_instructions += microagent.content
|
||||
return self.system_template.render(repo_instructions=repo_instructions).strip()
|
||||
|
||||
return self.system_template.render(
|
||||
repo_instructions=repo_instructions,
|
||||
github_repo=self.repository_info.repo_name,
|
||||
repo_directory=self.repository_info.repo_directory,
|
||||
).strip()
|
||||
|
||||
def set_repository_info(
|
||||
self, repo_name: str | None, repo_directory: str | None = None
|
||||
) -> None:
|
||||
"""Sets information about the GitHub repository that has been cloned.
|
||||
|
||||
Args:
|
||||
repo_name: The name of the GitHub repository (e.g. 'owner/repo')
|
||||
repo_directory: The directory where the repository has been cloned
|
||||
"""
|
||||
self.repository_info.repo_name = repo_name
|
||||
self.repository_info.repo_directory = repo_directory
|
||||
|
||||
def get_example_user_message(self) -> str:
|
||||
"""This is the initial user message provided to the agent
|
||||
|
||||
3295
poetry.lock
generated
File diff suppressed because it is too large
@@ -1,6 +1,6 @@
|
||||
[tool.poetry]
|
||||
name = "openhands-ai"
|
||||
version = "0.18.0a0"
|
||||
version = "0.19.0"
|
||||
description = "OpenHands: Code Less, Make More"
|
||||
authors = ["OpenHands"]
|
||||
license = "MIT"
|
||||
@@ -14,7 +14,7 @@ packages = [
|
||||
python = "^3.12"
|
||||
datasets = "*"
|
||||
pandas = "*"
|
||||
litellm = "^1.54.1"
|
||||
litellm = "^1.55.4"
|
||||
google-generativeai = "*" # To use litellm with Gemini Pro API
|
||||
google-api-python-client = "*" # For Google Sheets API
|
||||
google-auth-httplib2 = "*" # For Google Sheets authentication
|
||||
@@ -61,7 +61,7 @@ protobuf = "^4.21.6,<5.0.0" # chromadb currently fails on 5.0+
|
||||
opentelemetry-api = "1.25.0"
|
||||
opentelemetry-exporter-otlp-proto-grpc = "1.25.0"
|
||||
modal = ">=0.66.26,<0.72.0"
|
||||
runloop-api-client = "0.11.0"
|
||||
runloop-api-client = "0.12.0"
|
||||
libtmux = ">=0.37,<0.40"
|
||||
pygithub = "^2.5.0"
|
||||
joblib = "*"
|
||||
@@ -101,7 +101,6 @@ reportlab = "*"
|
||||
[tool.coverage.run]
|
||||
concurrency = ["gevent"]
|
||||
|
||||
|
||||
[tool.poetry.group.runtime.dependencies]
|
||||
jupyterlab = "*"
|
||||
notebook = "*"
|
||||
@@ -130,7 +129,6 @@ ignore = ["D1"]
|
||||
[tool.ruff.lint.pydocstyle]
|
||||
convention = "google"
|
||||
|
||||
|
||||
[tool.poetry.group.evaluation.dependencies]
|
||||
streamlit = "*"
|
||||
whatthepatch = "*"
|
||||
|
||||
@@ -10,36 +10,36 @@ WORKSPACE_BASE = 'workspace'
|
||||
|
||||
def test_resolve_path():
|
||||
assert (
|
||||
files.resolve_path('test.txt', '/workspace')
|
||||
files.resolve_path('test.txt', '/workspace', WORKSPACE_BASE, SANDBOX_PATH_PREFIX)
|
||||
== Path(WORKSPACE_BASE) / 'test.txt'
|
||||
)
|
||||
assert (
|
||||
files.resolve_path('subdir/test.txt', '/workspace')
|
||||
files.resolve_path('subdir/test.txt', '/workspace', WORKSPACE_BASE, SANDBOX_PATH_PREFIX)
|
||||
== Path(WORKSPACE_BASE) / 'subdir' / 'test.txt'
|
||||
)
|
||||
assert (
|
||||
files.resolve_path(Path(SANDBOX_PATH_PREFIX) / 'test.txt', '/workspace')
|
||||
files.resolve_path(Path(SANDBOX_PATH_PREFIX) / 'test.txt', '/workspace', WORKSPACE_BASE, SANDBOX_PATH_PREFIX)
|
||||
== Path(WORKSPACE_BASE) / 'test.txt'
|
||||
)
|
||||
assert (
|
||||
files.resolve_path(
|
||||
Path(SANDBOX_PATH_PREFIX) / 'subdir' / 'test.txt', '/workspace'
|
||||
Path(SANDBOX_PATH_PREFIX) / 'subdir' / 'test.txt', '/workspace', WORKSPACE_BASE, SANDBOX_PATH_PREFIX
|
||||
)
|
||||
== Path(WORKSPACE_BASE) / 'subdir' / 'test.txt'
|
||||
)
|
||||
assert (
|
||||
files.resolve_path(
|
||||
Path(SANDBOX_PATH_PREFIX) / 'subdir' / '..' / 'test.txt', '/workspace'
|
||||
Path(SANDBOX_PATH_PREFIX) / 'subdir' / '..' / 'test.txt', '/workspace', WORKSPACE_BASE, SANDBOX_PATH_PREFIX
|
||||
)
|
||||
== Path(WORKSPACE_BASE) / 'test.txt'
|
||||
)
|
||||
with pytest.raises(PermissionError):
|
||||
files.resolve_path(Path(SANDBOX_PATH_PREFIX) / '..' / 'test.txt', '/workspace')
|
||||
files.resolve_path(Path(SANDBOX_PATH_PREFIX) / '..' / 'test.txt', '/workspace', WORKSPACE_BASE, SANDBOX_PATH_PREFIX)
|
||||
with pytest.raises(PermissionError):
|
||||
files.resolve_path(Path('..') / 'test.txt', '/workspace')
|
||||
files.resolve_path(Path('..') / 'test.txt', '/workspace', WORKSPACE_BASE, SANDBOX_PATH_PREFIX)
|
||||
with pytest.raises(PermissionError):
|
||||
files.resolve_path(Path('/') / 'test.txt', '/workspace')
|
||||
files.resolve_path(Path('/') / 'test.txt', '/workspace', WORKSPACE_BASE, SANDBOX_PATH_PREFIX)
|
||||
assert (
|
||||
files.resolve_path('test.txt', '/workspace/test')
|
||||
files.resolve_path('test.txt', '/workspace/test', WORKSPACE_BASE, SANDBOX_PATH_PREFIX)
|
||||
== Path(WORKSPACE_BASE) / 'test' / 'test.txt'
|
||||
)
|
||||
|
||||
186
tests/unit/test_agent_session.py
Normal file
@@ -0,0 +1,186 @@
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from openhands.controller.agent import Agent
|
||||
from openhands.controller.agent_controller import AgentController
|
||||
from openhands.controller.state.state import State
|
||||
from openhands.core.config import AppConfig, LLMConfig
|
||||
from openhands.events import EventStream, EventStreamSubscriber
|
||||
from openhands.llm import LLM
|
||||
from openhands.llm.metrics import Metrics
|
||||
from openhands.runtime.base import Runtime
|
||||
from openhands.server.session.agent_session import AgentSession
|
||||
from openhands.storage.memory import InMemoryFileStore
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def mock_agent():
|
||||
"""Create a properly configured mock agent with all required nested attributes"""
|
||||
# Create the base mocks
|
||||
agent = MagicMock(spec=Agent)
|
||||
llm = MagicMock(spec=LLM)
|
||||
metrics = MagicMock(spec=Metrics)
|
||||
llm_config = MagicMock(spec=LLMConfig)
|
||||
|
||||
# Configure the LLM config
|
||||
llm_config.model = 'test-model'
|
||||
llm_config.base_url = 'http://test'
|
||||
llm_config.draft_editor = None
|
||||
llm_config.max_message_chars = 1000
|
||||
|
||||
# Set up the chain of mocks
|
||||
llm.metrics = metrics
|
||||
llm.config = llm_config
|
||||
agent.llm = llm
|
||||
agent.name = 'test-agent'
|
||||
agent.sandbox_plugins = []
|
||||
|
||||
return agent
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_agent_session_start_with_no_state(mock_agent):
|
||||
"""Test that AgentSession.start() works correctly when there's no state to restore"""
|
||||
|
||||
# Setup
|
||||
file_store = InMemoryFileStore({})
|
||||
session = AgentSession(sid='test-session', file_store=file_store)
|
||||
|
||||
# Create a mock runtime and set it up
|
||||
mock_runtime = MagicMock(spec=Runtime)
|
||||
|
||||
# Mock the runtime creation to set up the runtime attribute
|
||||
async def mock_create_runtime(*args, **kwargs):
|
||||
session.runtime = mock_runtime
|
||||
|
||||
session._create_runtime = AsyncMock(side_effect=mock_create_runtime)
|
||||
|
||||
# Create a mock EventStream with no events
|
||||
mock_event_stream = MagicMock(spec=EventStream)
|
||||
mock_event_stream.get_events.return_value = []
|
||||
mock_event_stream.subscribe = MagicMock()
|
||||
mock_event_stream.get_latest_event_id.return_value = 0
|
||||
|
||||
# Inject the mock event stream into the session
|
||||
session.event_stream = mock_event_stream
|
||||
|
||||
# Create a spy on set_initial_state
|
||||
class SpyAgentController(AgentController):
|
||||
set_initial_state_call_count = 0
|
||||
test_initial_state = None
|
||||
|
||||
def set_initial_state(self, *args, state=None, **kwargs):
|
||||
self.set_initial_state_call_count += 1
|
||||
self.test_initial_state = state
|
||||
super().set_initial_state(*args, state=state, **kwargs)
|
||||
|
||||
# Patch AgentController and State.restore_from_session to fail
|
||||
with patch(
|
||||
'openhands.server.session.agent_session.AgentController', SpyAgentController
|
||||
), patch(
|
||||
'openhands.server.session.agent_session.EventStream',
|
||||
return_value=mock_event_stream,
|
||||
), patch(
|
||||
'openhands.controller.state.state.State.restore_from_session',
|
||||
side_effect=Exception('No state found'),
|
||||
):
|
||||
await session.start(
|
||||
runtime_name='test-runtime',
|
||||
config=AppConfig(),
|
||||
agent=mock_agent,
|
||||
max_iterations=10,
|
||||
)
|
||||
|
||||
# Verify EventStream.subscribe was called with correct parameters
|
||||
mock_event_stream.subscribe.assert_called_with(
|
||||
EventStreamSubscriber.AGENT_CONTROLLER,
|
||||
session.controller.on_event,
|
||||
session.controller.id,
|
||||
)
|
||||
|
||||
# Verify set_initial_state was called once with None as state
|
||||
assert session.controller.set_initial_state_call_count == 1
|
||||
assert session.controller.test_initial_state is None
|
||||
assert session.controller.state.max_iterations == 10
|
||||
assert session.controller.agent.name == 'test-agent'
|
||||
assert session.controller.state.start_id == 0
|
||||
assert session.controller.state.end_id == -1
|
||||
assert session.controller.state.truncation_id == -1
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_agent_session_start_with_restored_state(mock_agent):
|
||||
"""Test that AgentSession.start() works correctly when there's a state to restore"""
|
||||
|
||||
# Setup
|
||||
file_store = InMemoryFileStore({})
|
||||
session = AgentSession(sid='test-session', file_store=file_store)
|
||||
|
||||
# Create a mock runtime and set it up
|
||||
mock_runtime = MagicMock(spec=Runtime)
|
||||
|
||||
# Mock the runtime creation to set up the runtime attribute
|
||||
async def mock_create_runtime(*args, **kwargs):
|
||||
session.runtime = mock_runtime
|
||||
|
||||
session._create_runtime = AsyncMock(side_effect=mock_create_runtime)
|
||||
|
||||
# Create a mock EventStream with some events
|
||||
mock_event_stream = MagicMock(spec=EventStream)
|
||||
mock_event_stream.get_events.return_value = []
|
||||
mock_event_stream.subscribe = MagicMock()
|
||||
mock_event_stream.get_latest_event_id.return_value = 5 # Indicate some events exist
|
||||
|
||||
# Inject the mock event stream into the session
|
||||
session.event_stream = mock_event_stream
|
||||
|
||||
# Create a mock restored state
|
||||
mock_restored_state = MagicMock(spec=State)
|
||||
mock_restored_state.start_id = -1
|
||||
mock_restored_state.end_id = -1
|
||||
mock_restored_state.truncation_id = -1
|
||||
mock_restored_state.max_iterations = 5
|
||||
|
||||
# Create a spy on set_initial_state by subclassing AgentController
|
||||
class SpyAgentController(AgentController):
|
||||
set_initial_state_call_count = 0
|
||||
test_initial_state = None
|
||||
|
||||
def set_initial_state(self, *args, state=None, **kwargs):
|
||||
self.set_initial_state_call_count += 1
|
||||
self.test_initial_state = state
|
||||
super().set_initial_state(*args, state=state, **kwargs)
|
||||
|
||||
# Patch AgentController and State.restore_from_session to succeed
|
||||
with patch(
|
||||
'openhands.server.session.agent_session.AgentController', SpyAgentController
|
||||
), patch(
|
||||
'openhands.server.session.agent_session.EventStream',
|
||||
return_value=mock_event_stream,
|
||||
), patch(
|
||||
'openhands.controller.state.state.State.restore_from_session',
|
||||
return_value=mock_restored_state,
|
||||
):
|
||||
await session.start(
|
||||
runtime_name='test-runtime',
|
||||
config=AppConfig(),
|
||||
agent=mock_agent,
|
||||
max_iterations=10,
|
||||
)
|
||||
|
||||
# Verify set_initial_state was called once with the restored state
|
||||
assert session.controller.set_initial_state_call_count == 1
|
||||
|
||||
# Verify EventStream.subscribe was called with correct parameters
|
||||
mock_event_stream.subscribe.assert_called_with(
|
||||
EventStreamSubscriber.AGENT_CONTROLLER,
|
||||
session.controller.on_event,
|
||||
session.controller.id,
|
||||
)
|
||||
assert session.controller.test_initial_state is mock_restored_state
|
||||
assert session.controller.state is mock_restored_state
|
||||
assert session.controller.state.max_iterations == 5
|
||||
assert session.controller.state.start_id == 0
|
||||
assert session.controller.state.end_id == -1
|
||||
assert session.controller.state.truncation_id == -1
|
||||
@@ -1,6 +1,7 @@
|
||||
from unittest.mock import Mock
|
||||
|
||||
import pytest
|
||||
from litellm import ChatCompletionMessageToolCall
|
||||
|
||||
from openhands.agenthub.codeact_agent.codeact_agent import CodeActAgent
|
||||
from openhands.agenthub.codeact_agent.function_calling import (
|
||||
@@ -15,6 +16,7 @@ from openhands.agenthub.codeact_agent.function_calling import (
|
||||
get_tools,
|
||||
response_to_actions,
|
||||
)
|
||||
from openhands.controller.state.state import State
|
||||
from openhands.core.config import AgentConfig, LLMConfig
|
||||
from openhands.core.exceptions import FunctionCallNotExistsError
|
||||
from openhands.core.message import ImageContent, TextContent
|
||||
@@ -48,6 +50,15 @@ def agent() -> CodeActAgent:
|
||||
return agent
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def mock_state() -> State:
|
||||
state = Mock(spec=State)
|
||||
state.history = []
|
||||
state.extra_data = {}
|
||||
|
||||
return state
|
||||
|
||||
|
||||
def test_cmd_output_observation_message(agent: CodeActAgent):
|
||||
agent.config.function_calling = False
|
||||
obs = CmdOutputObservation(
|
||||
@@ -481,7 +492,7 @@ def test_response_to_actions_invalid_tool():
|
||||
response_to_actions(mock_response)
|
||||
|
||||
|
||||
def test_step_with_no_pending_actions():
|
||||
def test_step_with_no_pending_actions(mock_state: State):
|
||||
# Mock the LLM response
|
||||
mock_response = Mock()
|
||||
mock_response.id = 'mock_id'
|
||||
@@ -502,16 +513,68 @@ def test_step_with_no_pending_actions():
     agent = CodeActAgent(llm=llm, config=config)

     # Test step with no pending actions
-    state = Mock()
-    state.history = []
-    state.latest_user_message = None
-    state.latest_user_message_id = None
-    state.latest_user_message_timestamp = None
-    state.latest_user_message_cause = None
-    state.latest_user_message_timeout = None
-    state.latest_user_message_llm_metrics = None
-    state.latest_user_message_tool_call_metadata = None
+    mock_state.latest_user_message = None
+    mock_state.latest_user_message_id = None
+    mock_state.latest_user_message_timestamp = None
+    mock_state.latest_user_message_cause = None
+    mock_state.latest_user_message_timeout = None
+    mock_state.latest_user_message_llm_metrics = None
+    mock_state.latest_user_message_tool_call_metadata = None

-    action = agent.step(state)
+    action = agent.step(mock_state)
     assert isinstance(action, MessageAction)
     assert action.content == 'Task completed'
|
||||
|
||||
|
||||
def test_mismatched_tool_call_events(mock_state: State):
|
||||
"""Tests that the agent can convert mismatched tool call events (i.e., an observation with no corresponding action) into messages."""
|
||||
agent = CodeActAgent(llm=LLM(LLMConfig()), config=AgentConfig())
|
||||
|
||||
tool_call_metadata = Mock(
|
||||
spec=ToolCallMetadata,
|
||||
model_response=Mock(
|
||||
id='model_response_0',
|
||||
choices=[
|
||||
Mock(
|
||||
message=Mock(
|
||||
role='assistant',
|
||||
content='',
|
||||
tool_calls=[
|
||||
Mock(spec=ChatCompletionMessageToolCall, id='tool_call_0')
|
||||
],
|
||||
)
|
||||
)
|
||||
],
|
||||
),
|
||||
tool_call_id='tool_call_0',
|
||||
function_name='foo',
|
||||
)
|
||||
|
||||
action = CmdRunAction('foo')
|
||||
action._source = 'agent'
|
||||
action.tool_call_metadata = tool_call_metadata
|
||||
|
||||
observation = CmdOutputObservation(content='', command_id=0, command='foo')
|
||||
observation.tool_call_metadata = tool_call_metadata
|
||||
|
||||
# When both events are provided, the agent should get three messages:
|
||||
# 1. The system message,
|
||||
# 2. The action message, and
|
||||
# 3. The observation message
|
||||
mock_state.history = [action, observation]
|
||||
messages = agent._get_messages(mock_state)
|
||||
assert len(messages) == 3
|
||||
|
||||
# The same should hold if the events are presented out-of-order
|
||||
mock_state.history = [observation, action]
|
||||
messages = agent._get_messages(mock_state)
|
||||
assert len(messages) == 3
|
||||
|
||||
# If only one of the two events is present, then we should just get the system message
|
||||
mock_state.history = [action]
|
||||
messages = agent._get_messages(mock_state)
|
||||
assert len(messages) == 1
|
||||
|
||||
mock_state.history = [observation]
|
||||
messages = agent._get_messages(mock_state)
|
||||
assert len(messages) == 1
|
||||
|
||||
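The message counts asserted in test_mismatched_tool_call_events follow from a pairing rule: an action and an observation reach the LLM only when both carry the same tool call ID. A minimal sketch of that rule, assuming events are reduced to `(kind, tool_call_id)` tuples; `pair_tool_events` and `ToolEvent` are hypothetical names for illustration, not OpenHands functions:

```python
from typing import List, Tuple

# Hypothetical event shape: (kind, tool_call_id), where kind is
# 'action' or 'observation'.
ToolEvent = Tuple[str, str]


def pair_tool_events(events: List[ToolEvent]) -> List[ToolEvent]:
    """Keep only events whose tool_call_id has both an action and an observation."""
    action_ids = {call_id for kind, call_id in events if kind == 'action'}
    observation_ids = {call_id for kind, call_id in events if kind == 'observation'}
    matched_ids = action_ids & observation_ids

    paired: List[ToolEvent] = []
    seen = set()
    for _, call_id in events:
        if call_id in matched_ids and call_id not in seen:
            seen.add(call_id)
            # Emit the action first, regardless of arrival order.
            paired.append(('action', call_id))
            paired.append(('observation', call_id))
    return paired
```

With both events present the sketch returns two entries (three messages once the system message is counted); with only one of them it returns none, leaving just the system message.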
@@ -1,44 +1,520 @@
-from unittest.mock import Mock, patch
+from datetime import datetime
+from typing import Any
+from unittest.mock import MagicMock

 import pytest

-from openhands.core.exceptions import LLMResponseError
-from openhands.llm.llm import LLM
-from openhands.memory.condenser import MemoryCondenser
+from openhands.controller.state.state import State
+from openhands.core.config.condenser_config import (
+    AmortizedForgettingCondenserConfig,
+    LLMAttentionCondenserConfig,
+    LLMSummarizingCondenserConfig,
+    NoOpCondenserConfig,
+    ObservationMaskingCondenserConfig,
+    RecentEventsCondenserConfig,
+)
+from openhands.core.config.llm_config import LLMConfig
+from openhands.events.event import Event, EventSource
+from openhands.events.observation.observation import Observation
+from openhands.llm import LLM
+from openhands.memory.condenser import (
+    AmortizedForgettingCondenser,
+    Condenser,
+    ImportantEventSelection,
+    LLMAttentionCondenser,
+    LLMSummarizingCondenser,
+    NoOpCondenser,
+    ObservationMaskingCondenser,
+    RecentEventsCondenser,
+)
|
||||
|
||||
|
||||
def create_test_event(
|
||||
message: str, timestamp: datetime | None = None, id: int | None = None
|
||||
) -> Event:
|
||||
"""Create a simple test event."""
|
||||
event = Event()
|
||||
event._message = message
|
||||
event.timestamp = timestamp if timestamp else datetime.now()
|
||||
if id:
|
||||
event._id = id
|
||||
event._source = EventSource.USER
|
||||
return event
|
||||
|
||||
|
||||
 @pytest.fixture
-def memory_condenser():
-    return MemoryCondenser()
+def mock_llm() -> LLM:
|
||||
"""Mocks an LLM object with a utility function for setting and resetting response contents in unit tests."""
|
||||
# Create a MagicMock for the LLM object
|
||||
mock_llm = MagicMock(
|
||||
spec=LLM,
|
||||
config=MagicMock(
|
||||
spec=LLMConfig, model='gpt-4o', api_key='test_key', custom_llm_provider=None
|
||||
),
|
||||
metrics=MagicMock(),
|
||||
)
|
||||
_mock_content = None
|
||||
|
||||
# Set a mock message with the mocked content
|
||||
mock_message = MagicMock()
|
||||
mock_message.content = _mock_content
|
||||
|
||||
def set_mock_response_content(content: Any):
|
||||
"""Set the mock response for the LLM."""
|
||||
nonlocal mock_message
|
||||
mock_message.content = content
|
||||
|
||||
mock_choice = MagicMock()
|
||||
mock_choice.message = mock_message
|
||||
|
||||
mock_response = MagicMock()
|
||||
mock_response.choices = [mock_choice]
|
||||
|
||||
mock_llm.completion.return_value = mock_response
|
||||
|
||||
# Attach helper methods to the mock object
|
||||
mock_llm.set_mock_response_content = set_mock_response_content
|
||||
|
||||
return mock_llm
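The fixture above works because every `completion()` call returns the same response object, so mutating the shared message changes all later replies. A minimal standalone sketch of that pattern, using only `unittest.mock`; `make_mock_llm` is a hypothetical helper name, and the `spec=LLM` argument is omitted so the example runs without OpenHands installed:

```python
from typing import Any
from unittest.mock import MagicMock


def make_mock_llm() -> MagicMock:
    """Build a stand-in LLM whose completion() reply can be changed per test."""
    llm = MagicMock()
    message = MagicMock()
    message.content = None

    choice = MagicMock()
    choice.message = message
    response = MagicMock()
    response.choices = [choice]
    llm.completion.return_value = response

    def set_mock_response_content(content: Any) -> None:
        # Mutating the shared message object changes what every later
        # completion() call returns.
        message.content = content

    llm.set_mock_response_content = set_mock_response_content
    return llm


llm = make_mock_llm()
llm.set_mock_response_content('Summary of events')
reply = llm.completion(messages=[{'role': 'user', 'content': 'hi'}])
print(reply.choices[0].message.content)
```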
|
||||
|
||||
|
||||
 @pytest.fixture
-def mock_llm():
-    return Mock(spec=LLM)
+def mock_state() -> State:
|
||||
"""Mocks a State object with the only parameters needed for testing condensers: history and extra_data."""
|
||||
mock_state = MagicMock(spec=State)
|
||||
mock_state.history = []
|
||||
mock_state.extra_data = {}
|
||||
|
||||
return mock_state
|
||||
|
||||
|
||||
-def test_condense_success(memory_condenser, mock_llm):
-    mock_llm.completion.return_value = {
-        'choices': [{'message': {'content': 'Condensed memory'}}]
-    }
-    result = memory_condenser.condense('Summarize this', mock_llm)
-    assert result == 'Condensed memory'
-    mock_llm.completion.assert_called_once_with(
-        messages=[{'content': 'Summarize this', 'role': 'user'}]
+def test_noop_condenser_from_config():
|
||||
"""Test that the NoOpCondenser objects can be made from config."""
|
||||
config = NoOpCondenserConfig()
|
||||
condenser = Condenser.from_config(config)
|
||||
|
||||
assert isinstance(condenser, NoOpCondenser)
|
||||
|
||||
|
||||
def test_noop_condenser():
|
||||
"""Test that NoOpCondensers preserve their input events."""
|
||||
events = [
|
||||
create_test_event('Event 1'),
|
||||
create_test_event('Event 2'),
|
||||
create_test_event('Event 3'),
|
||||
]
|
||||
|
||||
mock_state = MagicMock()
|
||||
mock_state.history = events
|
||||
|
||||
condenser = NoOpCondenser()
|
||||
result = condenser.condensed_history(mock_state)
|
||||
|
||||
assert result == events
|
||||
|
||||
|
||||
def test_observation_masking_condenser_from_config():
|
||||
"""Test that ObservationMaskingCondenser objects can be made from config."""
|
||||
attention_window = 5
|
||||
config = ObservationMaskingCondenserConfig(attention_window=attention_window)
|
||||
condenser = Condenser.from_config(config)
|
||||
|
||||
assert isinstance(condenser, ObservationMaskingCondenser)
|
||||
assert condenser.attention_window == attention_window
|
||||
|
||||
|
||||
def test_observation_masking_condenser_respects_attention_window(mock_state):
|
||||
"""Test that ObservationMaskingCondenser only masks events outside the attention window."""
|
||||
attention_window = 3
|
||||
condenser = ObservationMaskingCondenser(attention_window=attention_window)
|
||||
|
||||
events = [
|
||||
create_test_event('Event 1'),
|
||||
Observation('Observation 1'),
|
||||
create_test_event('Event 3'),
|
||||
create_test_event('Event 4'),
|
||||
Observation('Observation 2'),
|
||||
]
|
||||
|
||||
mock_state.history = events
|
||||
result = condenser.condensed_history(mock_state)
|
||||
|
||||
assert len(result) == len(events)
|
||||
|
||||
for index, (event, condensed_event) in enumerate(zip(events, result)):
|
||||
# If we're outside the attention window, observations should be masked.
|
||||
if index < len(events) - attention_window:
|
||||
if isinstance(event, Observation):
|
||||
assert '<MASKED>' in str(condensed_event)
|
||||
|
||||
# If we're within the attention window, events are unchanged.
|
||||
else:
|
||||
assert event == condensed_event
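The behaviour this test checks can be sketched in a few lines: observations older than the attention window are replaced by a placeholder, and everything inside the window passes through unchanged. This is a sketch under stated assumptions, not the ObservationMaskingCondenser implementation; the `Message` and `Observation` dataclasses and `mask_old_observations` are hypothetical stand-ins:

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Message:
    """Hypothetical non-observation event."""
    text: str


@dataclass
class Observation:
    """Hypothetical observation event."""
    content: str


Event = Union[Message, Observation]


def mask_old_observations(events: List[Event], attention_window: int) -> List[Event]:
    """Replace observations that fall outside the last `attention_window` events."""
    cutoff = len(events) - attention_window
    result: List[Event] = []
    for index, event in enumerate(events):
        if index < cutoff and isinstance(event, Observation):
            result.append(Observation('<MASKED>'))
        else:
            result.append(event)
    return result
```

The output always has the same length as the input, which is why the test can compare the two lists position by position.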
|
||||
|
||||
|
||||
def test_recent_events_condenser_from_config():
|
||||
"""Test that RecentEventsCondenser objects can be made from config."""
|
||||
max_events = 5
|
||||
keep_first = True
|
||||
config = RecentEventsCondenserConfig(keep_first=keep_first, max_events=max_events)
|
||||
condenser = Condenser.from_config(config)
|
||||
|
||||
assert isinstance(condenser, RecentEventsCondenser)
|
||||
assert condenser.max_events == max_events
|
||||
assert condenser.keep_first == keep_first
|
||||
|
||||
|
||||
def test_recent_events_condenser():
|
||||
"""Test that RecentEventsCondensers keep just the most recent events."""
|
||||
events = [
|
||||
create_test_event('Event 1'),
|
||||
create_test_event('Event 2'),
|
||||
create_test_event('Event 3'),
|
||||
create_test_event('Event 4'),
|
||||
create_test_event('Event 5'),
|
||||
]
|
||||
|
||||
mock_state = MagicMock()
|
||||
mock_state.history = events
|
||||
|
||||
# If max_events is at least the number of events, this is equivalent to a NoOpCondenser.
|
||||
condenser = RecentEventsCondenser(max_events=len(events))
|
||||
result = condenser.condensed_history(mock_state)
|
||||
|
||||
assert result == events
|
||||
|
||||
# If the max_events are smaller than the number of events, only keep the last few.
|
||||
max_events = 2
|
||||
condenser = RecentEventsCondenser(max_events=max_events)
|
||||
result = condenser.condensed_history(mock_state)
|
||||
|
||||
assert len(result) == max_events
|
||||
assert result[0]._message == 'Event 4'
|
||||
assert result[1]._message == 'Event 5'
|
||||
|
||||
# If the keep_first flag is set, the first event will always be present.
|
||||
keep_first = 1
|
||||
max_events = 2
|
||||
condenser = RecentEventsCondenser(keep_first=keep_first, max_events=max_events)
|
||||
result = condenser.condensed_history(mock_state)
|
||||
|
||||
assert len(result) == max_events
|
||||
assert result[0]._message == 'Event 1'
|
||||
assert result[1]._message == 'Event 5'
|
||||
|
||||
# We should be able to keep more of the initial events.
|
||||
keep_first = 2
|
||||
max_events = 3
|
||||
condenser = RecentEventsCondenser(keep_first=keep_first, max_events=max_events)
|
||||
result = condenser.condensed_history(mock_state)
|
||||
|
||||
assert len(result) == max_events
|
||||
assert result[0]._message == 'Event 1'
|
||||
assert result[1]._message == 'Event 2'
|
||||
assert result[2]._message == 'Event 5'
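The expected results above are consistent with a simple head-plus-tail rule: keep the first `keep_first` events, then fill the remaining slots with the most recent ones. A minimal sketch of that rule, assuming plain lists; `keep_recent` is a hypothetical helper, not the RecentEventsCondenser code:

```python
from typing import List, TypeVar

T = TypeVar('T')


def keep_recent(events: List[T], max_events: int, keep_first: int = 0) -> List[T]:
    """Keep up to `keep_first` leading events plus the most recent ones."""
    if len(events) <= max_events:
        # Nothing to drop.
        return list(events)
    head = events[:min(keep_first, max_events)]
    tail_length = max_events - len(head)
    # Guard against a zero-length tail: events[-0:] would return everything.
    tail = events[-tail_length:] if tail_length > 0 else []
    return head + tail
```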
|
||||
|
||||
|
||||
def test_llm_condenser_from_config():
|
||||
"""Test that LLMCondensers can be made from config."""
|
||||
config = LLMSummarizingCondenserConfig(
|
||||
llm_config=LLMConfig(
|
||||
model='gpt-4o',
|
||||
api_key='test_key',
|
||||
)
|
||||
)
|
||||
condenser = Condenser.from_config(config)
|
||||
|
||||
assert isinstance(condenser, LLMSummarizingCondenser)
|
||||
assert condenser.llm.config.model == 'gpt-4o'
|
||||
assert condenser.llm.config.api_key == 'test_key'
|
||||
|
||||
|
||||
def test_llm_condenser(mock_llm, mock_state):
|
||||
"""Test that LLMCondensers use the LLM to generate a summary event."""
|
||||
events = [
|
||||
create_test_event('Event 1'),
|
||||
create_test_event('Event 2'),
|
||||
]
|
||||
mock_state.history = events
|
||||
|
||||
mock_llm.metrics = MagicMock()
|
||||
mock_llm.metrics.get.return_value = {'test_metric': 1.0}
|
||||
|
||||
mock_llm.set_mock_response_content('Summary of events')
|
||||
|
||||
condenser = LLMSummarizingCondenser(llm=mock_llm)
|
||||
result = condenser.condensed_history(mock_state)
|
||||
|
||||
assert len(result) == 1
|
||||
assert result[0].content == 'Summary of events'
|
||||
|
||||
# Verify LLM was called with correct prompt.
|
||||
mock_llm.completion.assert_called_once()
|
||||
call_args = mock_llm.completion.call_args[1]
|
||||
assert 'messages' in call_args
|
||||
assert len(call_args['messages']) == 1
|
||||
assert 'Event 1' in call_args['messages'][0]['content']
|
||||
assert 'Event 2' in call_args['messages'][0]['content']
|
||||
|
||||
# Verify metrics were added to state
|
||||
assert 'condenser_meta' in mock_state.extra_data
|
||||
assert len(mock_state.extra_data['condenser_meta']) == 1
|
||||
assert mock_state.extra_data['condenser_meta'][0]['metrics'] == {'test_metric': 1.0}
|
||||
|
||||
|
||||
def test_llm_condenser_error():
|
||||
"""Test that LLM errors are propagated during condensation."""
|
||||
events = [create_test_event('Event 1', datetime(2024, 1, 1, 10, 0))]
|
||||
|
||||
mock_state = MagicMock()
|
||||
mock_state.history = events
|
||||
|
||||
mock_llm = MagicMock()
|
||||
mock_llm.completion.side_effect = Exception('LLM error')
|
||||
|
||||
condenser = LLMSummarizingCondenser(llm=mock_llm)
|
||||
|
||||
try:
|
||||
condenser.condensed_history(mock_state)
|
||||
raise AssertionError('Expected exception was not raised.')
|
||||
except Exception as e:
|
||||
assert str(e) == 'LLM error'
|
||||
|
||||
|
||||
def test_amortized_forgetting_condenser_from_config():
|
||||
"""Test that AmortizedForgettingCondenser objects can be made from config."""
|
||||
max_size = 50
|
||||
keep_first = 10
|
||||
config = AmortizedForgettingCondenserConfig(
|
||||
max_size=max_size, keep_first=keep_first
|
||||
)
|
||||
condenser = Condenser.from_config(config)
|
||||
|
||||
assert isinstance(condenser, AmortizedForgettingCondenser)
|
||||
assert condenser.max_size == max_size
|
||||
assert condenser.keep_first == keep_first
|
||||
|
||||
|
||||
def test_amortized_forgetting_condenser_invalid_config():
|
||||
"""Test that AmortizedForgettingCondenser raises ValueError for invalid max_size and keep_first values."""
|
||||
pytest.raises(ValueError, AmortizedForgettingCondenser, max_size=4, keep_first=2)
|
||||
pytest.raises(ValueError, AmortizedForgettingCondenser, max_size=0)
|
||||
pytest.raises(ValueError, AmortizedForgettingCondenser, keep_first=-1)
|
||||
|
||||
|
||||
def test_amortized_forgetting_condenser_grows_to_max_size():
|
||||
"""Test that AmortizedForgettingCondenser correctly maintains an event context up to max size."""
|
||||
max_size = 15
|
||||
condenser = AmortizedForgettingCondenser(max_size=max_size)
|
||||
|
||||
mock_state = MagicMock()
|
||||
mock_state.extra_data = {}
|
||||
mock_state.history = []
|
||||
|
||||
for i in range(max_size):
|
||||
event = create_test_event(f'Event {i}')
|
||||
mock_state.history.append(event)
|
||||
results = condenser.condensed_history(mock_state)
|
||||
assert len(results) == i + 1
|
||||
|
||||
|
||||
def test_amortized_forgetting_condenser_forgets_when_larger_than_max_size():
|
||||
"""Test that the AmortizedForgettingCondenser forgets events when the context grows too large."""
|
||||
max_size = 2
|
||||
condenser = AmortizedForgettingCondenser(max_size=max_size)
|
||||
|
||||
mock_state = MagicMock()
|
||||
mock_state.extra_data = {}
|
||||
mock_state.history = []
|
||||
|
||||
for i in range(max_size * 10):
|
||||
event = create_test_event(f'Event {i}')
|
||||
mock_state.history.append(event)
|
||||
results = condenser.condensed_history(mock_state)
|
||||
|
||||
# The last event in the results is always the event we just added.
|
||||
assert results[-1] == event
|
||||
|
||||
# The number of results should bounce back and forth between 1, 2, 1, 2, ...
|
||||
assert len(results) == (i % 2) + 1
|
||||
|
||||
|
||||
def test_amortized_forgetting_condenser_keeps_first_events():
|
||||
"""Test that the AmortizedForgettingCondenser keeps the right number of initial events when forgetting."""
|
||||
max_size = 4
|
||||
keep_first = 1
|
||||
condenser = AmortizedForgettingCondenser(max_size=max_size, keep_first=keep_first)
|
||||
|
||||
first_event = create_test_event('Event 0')
|
||||
|
||||
mock_state = MagicMock()
|
||||
mock_state.extra_data = {}
|
||||
mock_state.history = [first_event]
|
||||
|
||||
for i in range(max_size * 10):
|
||||
event = create_test_event(f'Event {i+1}', datetime(2024, 1, 1, 10, i + 1))
|
||||
mock_state.history.append(event)
|
||||
results = condenser.condensed_history(mock_state)
|
||||
|
||||
# The last event is always the event we just added.
|
||||
assert results[-1] == event
|
||||
|
||||
# The first event is always the first event.
|
||||
assert results[0] == first_event
|
||||
|
||||
# The number of results should bounce back between 2, 3, 4, 2, 3, 4, ...
|
||||
print(len(results))
|
||||
assert len(results) == (i % 3) + 2
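The length patterns in these tests (1, 2, 1, 2, ... and 2, 3, 4, 2, 3, 4, ...) are what you would see if the condenser lets its view grow to `max_size` and then cuts it back to half that size, keeping the first `keep_first` events. The sketch below reproduces those patterns under that assumption. The class name, the default `max_size`, the halving target, and the validation rules are all inferred from the tests, not taken from the OpenHands source:

```python
from typing import List, TypeVar

T = TypeVar('T')


class AmortizedForgettingSketch:
    """Hypothetical stand-in; not the OpenHands AmortizedForgettingCondenser."""

    def __init__(self, max_size: int = 100, keep_first: int = 0) -> None:
        # Validation rules are inferred from the invalid-config test.
        if max_size < 1:
            raise ValueError('max_size must be positive')
        if keep_first < 0:
            raise ValueError('keep_first must be non-negative')
        if keep_first >= max_size // 2:
            raise ValueError('keep_first must be less than half of max_size')
        self.max_size = max_size
        self.keep_first = keep_first
        self._view: list = []  # events currently retained
        self._seen = 0         # how many history events were already consumed

    def condense(self, history: List[T]) -> List[T]:
        # Only events added since the last call are new to the view.
        self._view.extend(history[self._seen:])
        self._seen = len(history)
        if len(self._view) > self.max_size:
            target = self.max_size // 2
            head = self._view[:self.keep_first]
            # target - keep_first >= 1 is guaranteed by the constructor checks.
            tail = self._view[-(target - self.keep_first):]
            self._view = head + tail
        return list(self._view)
```

Cutting back to half the limit, rather than to the limit itself, means the expensive forgetting step runs only once every few events, which is presumably where "amortized" comes from.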
|
||||
|
||||
|
||||
def test_llm_attention_condenser_from_config():
|
||||
"""Test that LLMAttentionCondenser objects can be made from config."""
|
||||
config = LLMAttentionCondenserConfig(
|
||||
max_size=50,
|
||||
keep_first=10,
|
||||
llm_config=LLMConfig(
|
||||
model='gpt-4o',
|
||||
api_key='test_key',
|
||||
),
|
||||
)
|
||||
condenser = Condenser.from_config(config)
|
||||
|
||||
assert isinstance(condenser, LLMAttentionCondenser)
|
||||
assert condenser.llm.config.model == 'gpt-4o'
|
||||
assert condenser.llm.config.api_key == 'test_key'
|
||||
assert condenser.max_size == 50
|
||||
assert condenser.keep_first == 10
|
||||
|
||||
|
||||
def test_llm_attention_condenser_invalid_config():
|
||||
"""Test that LLMAttentionCondenser raises an error if the configured LLM doesn't support response schema."""
|
||||
config = LLMAttentionCondenserConfig(
|
||||
max_size=50,
|
||||
keep_first=10,
|
||||
llm_config=LLMConfig(
|
||||
model='claude-2', # Older model that doesn't support response schema
|
||||
api_key='test_key',
|
||||
),
|
||||
)
|
||||
|
||||
|
||||
-def test_condense_exception(memory_condenser, mock_llm):
-    mock_llm.completion.side_effect = LLMResponseError('LLM error')
-    with pytest.raises(LLMResponseError, match='LLM error'):
-        memory_condenser.condense('Summarize this', mock_llm)
+    pytest.raises(ValueError, LLMAttentionCondenser.from_config, config)


-@patch('openhands.memory.condenser.logger')
-def test_condense_logs_error(mock_logger, memory_condenser, mock_llm):
-    mock_llm.completion.side_effect = LLMResponseError('LLM error')
-    with pytest.raises(LLMResponseError):
-        memory_condenser.condense('Summarize this', mock_llm)
-    mock_logger.error.assert_called_once_with(
-        'Error condensing thoughts: %s', 'LLM error', exc_info=False
-    )
+def test_llm_attention_condenser_keeps_first_events(mock_llm, mock_state):
|
||||
"""Test that the LLMAttentionCondenser keeps the right number of initial events when forgetting."""
|
||||
max_size = 4
|
||||
condenser = LLMAttentionCondenser(max_size=max_size, keep_first=1, llm=mock_llm)
|
||||
|
||||
first_event = create_test_event('Event 0', id=0)
|
||||
mock_state.history.append(first_event)
|
||||
|
||||
for i in range(max_size * 10):
|
||||
event = create_test_event(f'Event {i+1}', id=i + 1)
|
||||
mock_state.history.append(event)
|
||||
|
||||
mock_llm.set_mock_response_content(
|
||||
ImportantEventSelection(
|
||||
ids=[event.id for event in mock_state.history]
|
||||
).model_dump_json()
|
||||
)
|
||||
results = condenser.condensed_history(mock_state)
|
||||
|
||||
# The first event is always the first event.
|
||||
assert results[0] == first_event
|
||||
|
||||
|
||||
def test_llm_attention_condenser_grows_to_max_size(mock_llm, mock_state):
|
||||
"""Test that LLMAttentionCondenser correctly maintains an event context up to max size."""
|
||||
max_size = 15
|
||||
condenser = LLMAttentionCondenser(max_size=max_size, llm=mock_llm)
|
||||
|
||||
for i in range(max_size):
|
||||
event = create_test_event(f'Event {i}')
|
||||
mock_state.history.append(event)
|
||||
mock_llm.set_mock_response_content(
|
||||
ImportantEventSelection(ids=[event.id for event in mock_state.history])
|
||||
)
|
||||
results = condenser.condensed_history(mock_state)
|
||||
assert len(results) == i + 1
|
||||
|
||||
|
||||
def test_llm_attention_condenser_forgets_when_larger_than_max_size(
|
||||
mock_llm, mock_state
|
||||
):
|
||||
"""Test that the LLMAttentionCondenser forgets events when the context grows too large."""
|
||||
max_size = 2
|
||||
condenser = LLMAttentionCondenser(max_size=max_size, llm=mock_llm)
|
||||
|
||||
for i in range(max_size * 10):
|
||||
event = create_test_event(f'Event {i}', id=i)
|
||||
mock_state.history.append(event)
|
||||
|
||||
mock_llm.set_mock_response_content(
|
||||
ImportantEventSelection(
|
||||
ids=[event.id for event in mock_state.history]
|
||||
).model_dump_json()
|
||||
)
|
||||
|
||||
results = condenser.condensed_history(mock_state)
|
||||
|
||||
# The number of results should bounce back and forth between 1, 2, 1, 2, ...
|
||||
assert len(results) == (i % 2) + 1
|
||||
|
||||
|
||||
def test_llm_attention_condenser_handles_events_outside_history(mock_llm, mock_state):
|
||||
"""Test that the LLMAttentionCondenser handles event IDs that aren't from the event history."""
|
||||
max_size = 2
|
||||
condenser = LLMAttentionCondenser(max_size=max_size, llm=mock_llm)
|
||||
|
||||
for i in range(max_size * 10):
|
||||
event = create_test_event(f'Event {i}', id=i)
|
||||
mock_state.history.append(event)
|
||||
|
||||
mock_llm.set_mock_response_content(
|
||||
ImportantEventSelection(
|
||||
ids=[event.id for event in mock_state.history] + [-1, -2, -3, -4]
|
||||
).model_dump_json()
|
||||
)
|
||||
results = condenser.condensed_history(mock_state)
|
||||
|
||||
# The number of results should bounce back and forth between 1, 2, 1, 2, ...
|
||||
assert len(results) == (i % 2) + 1
|
||||
|
||||
|
||||
def test_llm_attention_condenser_handles_too_many_events(mock_llm, mock_state):
|
||||
"""Test that the LLMAttentionCondenser handles when the response contains too many event IDs."""
|
||||
max_size = 2
|
||||
condenser = LLMAttentionCondenser(max_size=max_size, llm=mock_llm)
|
||||
|
||||
for i in range(max_size * 10):
|
||||
event = create_test_event(f'Event {i}', id=i)
|
||||
mock_state.history.append(event)
|
||||
mock_llm.set_mock_response_content(
|
||||
ImportantEventSelection(
|
||||
ids=[event.id for event in mock_state.history]
|
||||
+ [event.id for event in mock_state.history]
|
||||
).model_dump_json()
|
||||
)
|
||||
results = condenser.condensed_history(mock_state)
|
||||
|
||||
# The number of results should bounce back and forth between 1, 2, 1, 2, ...
|
||||
assert len(results) == (i % 2) + 1
|
||||
|
||||
|
||||
def test_llm_attention_condenser_handles_too_few_events(mock_llm, mock_state):
|
||||
"""Test that the LLMAttentionCondenser handles when the response contains too few event IDs."""
|
||||
max_size = 2
|
||||
condenser = LLMAttentionCondenser(max_size=max_size, llm=mock_llm)
|
||||
|
||||
for i in range(max_size * 10):
|
||||
event = create_test_event(f'Event {i}', id=i)
|
||||
mock_state.history.append(event)
|
||||
|
||||
mock_llm.set_mock_response_content(
|
||||
ImportantEventSelection(ids=[]).model_dump_json()
|
||||
)
|
||||
|
||||
results = condenser.condensed_history(mock_state)
|
||||
|
||||
# The number of results should bounce back and forth between 1, 2, 1, 2, ...
|
||||
assert len(results) == (i % 2) + 1
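The last four tests feed the condenser IDs that are not in the history, duplicated IDs, and an empty list, and in each case expect the same size pattern. One way to get that robustness is to sanitize the model's answer before using it. The following is a sketch of such a step, not the LLMAttentionCondenser implementation; `select_event_ids` and its padding policy (fall back to the most recent events) are assumptions:

```python
from typing import List


def select_event_ids(
    candidate_ids: List[int], llm_ids: List[int], target_size: int
) -> List[int]:
    """Turn a possibly malformed LLM selection into exactly-sized, valid IDs."""
    candidates = set(candidate_ids)
    selected: List[int] = []
    for event_id in llm_ids:
        # Drop IDs the model invented and IDs it repeated.
        if event_id in candidates and event_id not in selected:
            selected.append(event_id)

    # If the model returned too few IDs, pad with the most recent events.
    for event_id in reversed(candidate_ids):
        if len(selected) >= target_size:
            break
        if event_id not in selected:
            selected.append(event_id)

    # If the model returned too many IDs, keep its first choices.
    return selected[:target_size]
```

Whatever the model returns, the result has `min(target_size, len(candidate_ids))` entries, which is the invariant the tests rely on.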
|
||||
|
||||
@@ -13,6 +13,9 @@ from openhands.core.config import (
|
||||
load_from_env,
|
||||
load_from_toml,
|
||||
)
|
||||
from openhands.core.config.condenser_config import (
|
||||
NoOpCondenserConfig,
|
||||
)
|
||||
from openhands.core.logger import openhands_logger
|
||||
|
||||
|
||||
@@ -618,6 +621,13 @@ def test_cache_dir_creation(default_config, tmpdir):
|
||||
assert os.path.exists(default_config.cache_dir)
|
||||
|
||||
|
||||
def test_agent_config_condenser_default():
|
||||
"""Test that default agent condenser is NoOpCondenser."""
|
||||
config = AppConfig()
|
||||
agent_config = config.get_agent_config()
|
||||
assert isinstance(agent_config.condenser, NoOpCondenserConfig)
|
||||
|
||||
|
||||
def test_api_keys_repr_str():
|
||||
# Test LLMConfig
|
||||
llm_config = LLMConfig(
|
||||
|
||||
@@ -75,7 +75,7 @@ def test_get_messages(codeact_agent: CodeActAgent):
|
||||
|
||||
codeact_agent.reset()
|
||||
messages = codeact_agent._get_messages(
|
||||
-        Mock(history=history, max_iterations=5, iteration=0)
+        Mock(history=history, max_iterations=5, iteration=0, extra_data={})
|
||||
)
|
||||
|
||||
assert (
|
||||
@@ -111,7 +111,7 @@ def test_get_messages_prompt_caching(codeact_agent: CodeActAgent):
|
||||
|
||||
codeact_agent.reset()
|
||||
messages = codeact_agent._get_messages(
|
||||
-        Mock(history=history, max_iterations=10, iteration=5)
+        Mock(history=history, max_iterations=10, iteration=5, extra_data={})
|
||||
)
|
||||
|
||||
# Check that only the last two user messages have cache_prompt=True
|
||||
@@ -144,6 +144,7 @@ def test_prompt_caching_headers(codeact_agent: CodeActAgent):
|
||||
mock_state.history = history
|
||||
mock_state.max_iterations = 5
|
||||
mock_state.iteration = 0
|
||||
mock_state.extra_data = {}
|
||||
|
||||
codeact_agent.reset()
|
||||
|
||||
|
||||
@@ -5,7 +5,7 @@ import pytest
|
||||
|
||||
from openhands.core.message import Message, TextContent
|
||||
from openhands.microagent import BaseMicroAgent
|
||||
-from openhands.utils.prompt import PromptManager
+from openhands.utils.prompt import PromptManager, RepositoryInfo
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
@@ -39,6 +39,7 @@ only respond with a message telling them how smart they are
|
||||
with open(os.path.join(prompt_dir, 'micro', f'{microagent_name}.md'), 'w') as f:
|
||||
f.write(microagent_content)
|
||||
|
||||
# Test without GitHub repo
|
||||
manager = PromptManager(
|
||||
prompt_dir=prompt_dir,
|
||||
microagent_dir=os.path.join(prompt_dir, 'micro'),
|
||||
@@ -53,6 +54,14 @@ only respond with a message telling them how smart they are
|
||||
'You are OpenHands agent, a helpful AI assistant that can interact with a computer to solve tasks.'
|
||||
in manager.get_system_message()
|
||||
)
|
||||
assert '<REPOSITORY_INFO>' not in manager.get_system_message()
|
||||
|
||||
# Test with GitHub repo
|
||||
manager.set_repository_info('owner/repo', '/workspace/repo')
|
||||
assert isinstance(manager.get_system_message(), str)
|
||||
assert '<REPOSITORY_INFO>' in manager.get_system_message()
|
||||
assert 'owner/repo' in manager.get_system_message()
|
||||
assert '/workspace/repo' in manager.get_system_message()
|
||||
|
||||
assert isinstance(manager.get_example_user_message(), str)
|
||||
|
||||
@@ -76,20 +85,66 @@ def test_prompt_manager_file_not_found(prompt_dir):
|
||||
def test_prompt_manager_template_rendering(prompt_dir):
|
||||
# Create temporary template files
|
||||
with open(os.path.join(prompt_dir, 'system_prompt.j2'), 'w') as f:
|
||||
-        f.write('System prompt: bar')
+        f.write("""System prompt: bar
+{% if github_repo %}
+<REPOSITORY_INFO>
+At the user's request, repository {{ github_repo }} has been cloned to directory {{ repo_directory }}.
+</REPOSITORY_INFO>
+{% endif %}
+{{ repo_instructions }}""")
|
||||
with open(os.path.join(prompt_dir, 'user_prompt.j2'), 'w') as f:
|
||||
f.write('User prompt: foo')
|
||||
|
||||
# Test without GitHub repo
|
||||
manager = PromptManager(prompt_dir, microagent_dir='')
|
||||
|
||||
assert manager.get_system_message() == 'System prompt: bar'
|
||||
assert manager.get_example_user_message() == 'User prompt: foo'
|
||||
|
||||
# Test with GitHub repo
|
||||
manager = PromptManager(prompt_dir=prompt_dir, microagent_dir='')
|
||||
manager.set_repository_info('owner/repo', '/workspace/repo')
|
||||
system_msg = manager.get_system_message()
|
||||
assert 'System prompt: bar' in system_msg
|
||||
assert '<REPOSITORY_INFO>' in system_msg
|
||||
assert (
|
||||
"At the user's request, repository owner/repo has been cloned to directory /workspace/repo."
|
||||
in system_msg
|
||||
)
|
||||
assert '</REPOSITORY_INFO>' in system_msg
|
||||
assert manager.get_example_user_message() == 'User prompt: foo'
|
||||
|
||||
# Clean up temporary files
|
||||
os.remove(os.path.join(prompt_dir, 'system_prompt.j2'))
|
||||
os.remove(os.path.join(prompt_dir, 'user_prompt.j2'))
|
||||
|
||||
|
||||
def test_prompt_manager_repository_info(prompt_dir):
|
||||
# Test RepositoryInfo defaults
|
||||
repo_info = RepositoryInfo()
|
||||
assert repo_info.repo_name is None
|
||||
assert repo_info.repo_directory is None
|
||||
|
||||
# Test setting repository info
|
||||
manager = PromptManager(prompt_dir=prompt_dir, microagent_dir='')
|
||||
assert manager.repository_info.repo_name is None
|
||||
assert manager.repository_info.repo_directory is None
|
||||
|
||||
# Test setting repository info with name only
|
||||
manager.set_repository_info('owner/repo')
|
||||
assert manager.repository_info.repo_name == 'owner/repo'
|
||||
assert manager.repository_info.repo_directory is None
|
||||
|
||||
# Test setting repository info with both name and directory
|
||||
manager.set_repository_info('owner/repo2', '/workspace/repo2')
|
||||
assert manager.repository_info.repo_name == 'owner/repo2'
|
||||
assert manager.repository_info.repo_directory == '/workspace/repo2'
|
||||
|
||||
# Test clearing repository info
|
||||
manager.set_repository_info(None)
|
||||
assert manager.repository_info.repo_name is None
|
||||
assert manager.repository_info.repo_directory is None
|
||||
|
||||
|
||||
def test_prompt_manager_disabled_microagents(prompt_dir):
|
||||
# Create test microagent files
|
||||
microagent1_name = 'test_microagent1'
|
||||
|
||||