Compare commits

...

29 Commits

Author SHA1 Message Date
openhands
7b54eb3ab8 Add comprehensive tests for conversation metrics API 2025-03-04 21:40:48 +00:00
openhands
c3cc4c71f9 Fix unreachable code in get_metrics method 2025-03-04 21:38:06 +00:00
dependabot[bot]
0f68a18cbb chore(deps): bump docker/setup-qemu-action from 3.4.0 to 3.6.0 (#7075)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-04 20:57:14 +00:00
Robert Brennan
c9ebabd82d Add contact link to runtime settings label (#6880)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-03-05 00:49:53 +04:00
mamoodi
ad932e45e8 Checkout HEAD instead of Merge Commit for builds (#7085) 2025-03-04 15:32:59 -05:00
sp.wack
3278caf3c2 Always enable GET /settings (#7101) 2025-03-04 14:54:26 -05:00
He Du
896d7b8b96 Openhands fix issue 7091 (#7092)
Co-authored-by: 杜贺 <duhe@duhedeMacBook-Pro-2.local>
2025-03-04 18:39:28 +01:00
Ryan H. Tran
cb61282c39 Improve error detection for read and edit observations (#7090) 2025-03-04 15:05:15 +01:00
Graham Neubig
7a235ce6ff Fix/mypy routes (#6900)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-03-04 03:43:09 +00:00
Rohit Malhotra
5ffb1ef704 Fix typing (#7083)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-03-03 20:41:11 +00:00
chuckbutkus
4e4f4d64f8 Fix runtime to call new token refresh (#7084) 2025-03-03 20:36:27 +00:00
Engel Nyst
3d38a105cf Add loading from toml for condensers (#6974)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Calvin Smith <email@cjsmith.io>
2025-03-03 20:32:46 +01:00
chuckbutkus
b1ab4d342e Add offline_access scope (#7059) 2025-03-03 19:06:08 +00:00
Rohit Malhotra
3e91899720 [Experimental]: Attach convo id to initial user instructions (#7062) 2025-03-03 13:46:09 -05:00
dependabot[bot]
959fa3ed64 chore(deps): bump the version-all group across 1 directory with 28 updates (#7077)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2025-03-03 18:32:27 +00:00
tofarr
c51f07bd1f Fixes for keycloak in localhost (#7079) 2025-03-03 10:36:57 -07:00
tofarr
b8ef68dc60 Upgrade default version of claude (#7072) 2025-03-03 11:31:12 -05:00
Ivan Dagelic
d21bd49f08 docs: daytona runtime configuration (#7073)
Signed-off-by: Ivan Dagelic <dagelic.ivan@gmail.com>
2025-03-03 11:30:58 -05:00
Engel Nyst
4c265515d2 (chore) Fix linting issues in openhands directory (#7068)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-03-03 16:52:25 +01:00
Engel Nyst
e4acfa68ec Fix #7060: Remove obsolete micro_agent_name attribute from test_long_term_memory.py (#7061)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-03-03 16:51:36 +01:00
mamoodi
d395b5e11f Add more information to the main docs page (#7074) 2025-03-03 10:18:20 -05:00
tawago
6d75647c40 [Bugfix] Add github_token verification in resolver utils (#7065) 2025-03-03 09:59:16 -05:00
Engel Nyst
285010b48f OpenAI models fixes (#7045) 2025-03-03 15:53:18 +01:00
Engel Nyst
395c1ea9e3 [Refactor] split runtime initialization (create, connect, init) in cli scripts (#7036) 2025-03-03 00:19:25 +01:00
Graham Neubig
91ad59dc24 More explicit feedback message about how to report errors to developers (#7063) 2025-03-02 22:21:07 +00:00
Engel Nyst
62750c07e5 Fix GitLab CI environment variable check (issue #7050) (#7052)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-03-02 21:33:07 +01:00
Ivan Dagelic
cf439fa89c chore: daytona readme quick start verbosity (#7056)
Signed-off-by: Ivan Dagelic <dagelic.ivan@gmail.com>
2025-03-02 20:17:35 +01:00
Ivan Dagelic
85c0864802 chore: update daytona readme (#7053)
Signed-off-by: Ivan Dagelic <dagelic.ivan@gmail.com>
2025-03-02 17:43:38 +01:00
mamoodi
ff5d8094de Updates to the ISSUE TRIAGE (#7043) 2025-03-02 10:35:47 -05:00
102 changed files with 3603 additions and 2998 deletions

@@ -41,8 +41,10 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3.4.0
uses: docker/setup-qemu-action@v3.6.0
with:
image: tonistiigi/binfmt:latest
- name: Login to GHCR
@@ -90,8 +92,10 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3.4.0
uses: docker/setup-qemu-action@v3.6.0
with:
image: tonistiigi/binfmt:latest
- name: Login to GHCR
@@ -154,6 +158,8 @@ jobs:
base_image: ['nikolaik']
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Cache Poetry dependencies
uses: actions/cache@v4
with:

@@ -2,12 +2,13 @@
These are the procedures and guidelines on how issues are triaged in this repo by the maintainers.
## General
* Most issues must be tagged with **enhancement** or **bug**.
* Issues may be tagged with what it relates to (**backend**, **frontend**, **agent quality**, etc.).
* All issues must be tagged with **enhancement**, **bug** or **troubleshooting/help**.
* Issues may be tagged with what it relates to (**agent quality**, **frontend**, **resolver**, etc.).
## Severity
* **Low**: Minor issues or affecting single user.
* **Medium**: Affecting multiple users.
* **High**: High visibility issues or affecting many users.
* **Critical**: Affecting all users or potential security issues.
## Effort
@@ -18,8 +19,14 @@ These are the procedures and guidelines on how issues are triaged in this repo b
## Not Enough Information
* User is asked to provide more information (logs, how to reproduce, etc.) when the issue is not clear.
* If an issue is unclear and the author does not provide more information or respond to a request, the issue may be closed as **not planned** (Usually after a week).
* If an issue is unclear and the author does not provide more information or respond to a request,
the issue may be closed as **not planned** (Usually after a week).
## Multiple Requests/Fixes in One Issue
* These issues will be narrowed down to one request/fix so the issue is more easily tracked and fixed.
* Issues may be broken down into multiple issues if required.
## Stale and Auto Closures
* In order to keep a maintainable backlog, issues that have no activity within 30 days are automatically marked as **Stale**.
* If issues marked as **Stale** continue to have no activity for 7 more days, they will automatically be closed as not planned.
* Issues may be reopened by maintainers if deemed important.
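
The stale and auto-closure rule added in this hunk can be sketched as a small function. This is only an illustration of the 30-day and 7-day thresholds stated above; the name `triage_state` and its return labels are hypothetical and are not part of the repository's automation.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=30)  # no activity for 30 days -> marked Stale
CLOSE_AFTER = timedelta(days=7)   # 7 more quiet days -> closed as not planned


def triage_state(last_activity: datetime, now: datetime) -> str:
    """Return 'active', 'stale', or 'closed' for an issue (hypothetical labels)."""
    idle = now - last_activity
    if idle >= STALE_AFTER + CLOSE_AFTER:
        return 'closed'
    if idle >= STALE_AFTER:
        return 'stale'
    return 'active'
```

The sketch assumes that marking an issue as Stale does not itself count as activity; a real bot may treat that differently.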

@@ -95,6 +95,11 @@ workspace_base = "./workspace"
# List of allowed file extensions for uploads
#file_uploads_allowed_extensions = [".*"]
# Whether to enable the default LLM summarizing condenser when no condenser is specified in config
# When true, a LLMSummarizingCondenserConfig will be used as the default condenser
# When false, a NoOpCondenserConfig (no summarization) will be used
#enable_default_condenser = true
#################################### LLM #####################################
# Configuration for LLM models (group name starts with 'llm')
# use 'llm' for the default LLM config
@@ -294,6 +299,69 @@ llm_config = 'gpt3'
# The security analyzer to use (For Headless / CLI only - In Web this is overridden by Session Init)
#security_analyzer = ""
#################################### Condenser #################################
# Condensers control how conversation history is managed and compressed when
# the context grows too large. Each agent uses one condenser configuration.
##############################################################################
[condenser]
# The type of condenser to use. Available options:
# - "noop": No condensing, keeps full history (default)
# - "observation_masking": Keeps full event structure but masks older observations
# - "recent": Keeps only recent events and discards older ones
# - "llm": Uses an LLM to summarize conversation history
# - "amortized": Intelligently forgets older events while preserving important context
# - "llm_attention": Uses an LLM to prioritize most relevant context
type = "noop"
# Examples for each condenser type (uncomment and modify as needed):
# 1. NoOp Condenser - No additional settings needed
#type = "noop"
# 2. Observation Masking Condenser
#type = "observation_masking"
# Number of most-recent events where observations will not be masked
#attention_window = 100
# 3. Recent Events Condenser
#type = "recent"
# Number of initial events to always keep (typically includes task description)
#keep_first = 1
# Maximum number of events to keep in history
#max_events = 100
# 4. LLM Summarizing Condenser
#type = "llm"
# Reference to an LLM config to use for summarization
#llm_config = "condenser"
# Number of initial events to always keep (typically includes task description)
#keep_first = 1
# Maximum size of history before triggering summarization
#max_size = 100
# 5. Amortized Forgetting Condenser
#type = "amortized"
# Number of initial events to always keep (typically includes task description)
#keep_first = 1
# Maximum size of history before triggering forgetting
#max_size = 100
# 6. LLM Attention Condenser
#type = "llm_attention"
# Reference to an LLM config to use for attention scoring
#llm_config = "condenser"
# Number of initial events to always keep (typically includes task description)
#keep_first = 1
# Maximum size of history before triggering attention mechanism
#max_size = 100
# Example of a custom LLM configuration for condensers that require an LLM
# If not provided, it falls back to the default LLM
#[llm.condenser]
#model = "gpt-4o"
#temperature = 0.1
#max_tokens = 1024
#################################### Eval ####################################
# Configuration for the evaluation, please refer to the specific evaluation
# plugin for the available options

@@ -84,3 +84,36 @@ docker run # ...
-e MODAL_API_TOKEN_ID="your-id" \
-e MODAL_API_TOKEN_SECRET="your-secret" \
```
## Daytona Runtime
Another option is using [Daytona](https://www.daytona.io/) as a runtime provider:
### Step 1: Retrieve Your Daytona API Key
1. Visit the [Daytona Dashboard](https://app.daytona.io/dashboard/keys).
2. Click **"Create Key"**.
3. Enter a name for your key and confirm the creation.
4. Once the key is generated, copy it.
### Step 2: Set Your API Key as an Environment Variable
Run the following command in your terminal, replacing `<your-api-key>` with the actual key you copied:
```bash
export DAYTONA_API_KEY="<your-api-key>"
```
This step ensures that OpenHands can authenticate with the Daytona platform when it runs.
### Step 3: Run OpenHands Locally Using Docker
To start the latest version of OpenHands on your machine, execute the following command in your terminal:
```bash
bash -i <(curl -sL https://get.daytona.io/openhands)
```
#### What This Command Does:
- Downloads the latest OpenHands release script.
- Runs the script in an interactive Bash session.
- Automatically pulls and runs the OpenHands container using Docker.
Once executed, OpenHands should be running locally and ready for use.
For more details and manual initialization, view the entire [README.md](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/impl/daytona/README.md)

@@ -5,9 +5,6 @@ export function Demo() {
const videoRef = React.useRef<HTMLVideoElement>(null);
return (
<div
style={{ paddingBottom: "10px", paddingTop: "10px", textAlign: "center" }}
>
<video
playsInline
autoPlay={true}
@@ -20,6 +17,5 @@ export function Demo() {
>
<source src="img/teaser.mp4" type="video/mp4"></source>
</video>
</div>
);
}

@@ -1,6 +1,5 @@
.demo {
width: 100%;
padding: 30px;
max-width: 800px;
text-align: center;
border-radius: 40px;

@@ -17,6 +17,29 @@ export function HomepageHeader() {
<p className="header-subtitle">{siteConfig.tagline}</p>
<div style={{
textAlign: 'center',
fontSize: '1.2rem',
maxWidth: '800px',
margin: '0 auto',
padding: '0rem 0rem 1rem'
}}>
<p style={{ margin: '0' }}>
Use AI to tackle the toil in your backlog. Our agents have all the same tools as a human developer: they can modify code, run commands, browse the web,
call APIs, and yes-even copy code snippets from StackOverflow.
<br/>
<Link to="https://docs.all-hands.dev/modules/usage/installation"
style={{
textDecoration: 'underline',
display: 'inline-block',
marginTop: '0.5rem'
}}
>
Get started with OpenHands.
</Link>
</p>
</div>
<div align="center" className="header-links">
<a href="https://github.com/All-Hands-AI/OpenHands/graphs/contributors"><img src="https://img.shields.io/github/contributors/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="Contributors" /></a>
<a href="https://github.com/All-Hands-AI/OpenHands/stargazers"><img src="https://img.shields.io/github/stars/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="Stargazers" /></a>
@@ -27,12 +50,9 @@ export function HomepageHeader() {
<a href="https://discord.gg/ESHStjSjD4"><img src="https://img.shields.io/badge/Discord-Join%20Us-purple?logo=discord&logoColor=white&style=for-the-badge" alt="Join our Discord community" /></a>
<a href="https://github.com/All-Hands-AI/OpenHands/blob/main/CREDITS.md"><img src="https://img.shields.io/badge/Project-Credits-blue?style=for-the-badge&color=FFE165&logo=github&logoColor=white" alt="Credits" /></a>
<br/>
<a href="https://docs.all-hands.dev/modules/usage/getting-started"><img src="https://img.shields.io/badge/Documentation-000?logo=googledocs&logoColor=FFE165&style=for-the-badge" alt="Check out the documentation" /></a>
<a href="https://arxiv.org/abs/2407.16741"><img src="https://img.shields.io/badge/Paper%20on%20Arxiv-000?logoColor=FFE165&logo=arxiv&style=for-the-badge" alt="Paper on Arxiv" /></a>
<a href="https://huggingface.co/spaces/OpenHands/evaluation"><img src="https://img.shields.io/badge/Benchmark%20score-000?logoColor=FFE165&logo=huggingface&style=for-the-badge" alt="Evaluation Benchmark Score" /></a>
</div>
<Demo />
</div>
</div>
);

@@ -1,14 +1,14 @@
/* homepageHeader.css */
.homepage-header {
height: 800px;
padding: 1rem 0;
}
.header-content {
display: flex;
flex-direction: column;
align-items: center;
padding: 2rem;
padding: 1rem;
font-weight: 300;
width: 100%;
}
@@ -25,6 +25,7 @@
.header-subtitle {
font-size: 1.5rem;
margin: 0.5rem 0;
}
.header-links {

@@ -2,15 +2,7 @@ import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import Layout from '@theme/Layout';
import { HomepageHeader } from '../components/HomepageHeader/HomepageHeader';
import { translate } from '@docusaurus/Translate';
export function Header({ title, summary }): JSX.Element {
return (
<div>
<h1>{title}</h1>
<h2 style={{ fontSize: '3rem' }}>{summary}</h2>
</div>
);
}
import { Demo } from "../components/Demo/Demo";
export default function Home(): JSX.Element {
const { siteConfig } = useDocusaurusContext();
@@ -23,11 +15,14 @@ export default function Home(): JSX.Element {
})}
>
<HomepageHeader />
<div style={{ textAlign: 'center', padding: '2rem' }}>
<br />
<div style={{ textAlign: 'center', padding: '1rem 0' }}>
<Demo />
</div>
<div style={{ textAlign: 'center', padding: '0.5rem 2rem 1.5rem' }}>
<h2>Most Popular Links</h2>
<ul style={{ listStyleType: 'none'}}>
<li><a href="/modules/usage/Installation">How to Run OpenHands</a></li>
<li><a href="/modules/usage/prompting/microagents-repo">Customizing OpenHands to a repository</a></li>
<li><a href="/modules/usage/how-to/github-action">Integrating OpenHands with Github</a></li>
<li><a href="/modules/usage/llms#model-recommendations">Recommended models to use</a></li>

@@ -24,6 +24,7 @@ from openhands.core.config import (
from openhands.core.logger import openhands_logger as logger
from openhands.core.main import create_runtime, run_controller
from openhands.events.action import MessageAction
from openhands.utils.async_utils import call_async_from_sync
game = None
@@ -121,6 +122,7 @@ def process_instance(
# Here's how you can run the agent (similar to the `main` function) and get the final task state
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
state: State | None = asyncio.run(
run_controller(
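
This hunk, like the evaluation scripts that follow, adds an explicit `call_async_from_sync(runtime.connect)` step after `create_runtime(config)`, in line with the create/connect/init split described in #7036. The sketch below illustrates the general idea of running a coroutine function from synchronous code. `run_coroutine_blocking` and `FakeRuntime` are hypothetical stand-ins, and the real `openhands.utils.async_utils.call_async_from_sync` may be implemented differently.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Coroutine


def run_coroutine_blocking(
    fn: Callable[..., Coroutine[Any, Any, Any]], *args: Any
) -> Any:
    """Run an async callable to completion from sync code (illustrative only).

    The coroutine runs in a fresh event loop on a worker thread, so this also
    works when the calling thread already has a running event loop.
    """
    def runner() -> Any:
        return asyncio.run(fn(*args))

    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(runner).result()


class FakeRuntime:
    """Hypothetical stand-in for a runtime object with an async connect()."""

    def __init__(self) -> None:
        self.connected = False

    async def connect(self) -> None:
        await asyncio.sleep(0)  # pretend to wait on network I/O
        self.connected = True


runtime = FakeRuntime()                  # create
run_coroutine_blocking(runtime.connect)  # connect: pass the method, do not call it
```

As in the diff, the bound method is passed uncalled, so the helper decides where and when the coroutine is created and awaited.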

@@ -34,6 +34,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import AgentFinishAction, CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
def get_config(
@@ -210,6 +211,7 @@ def process_instance(
# =============================================
runtime: Runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance=instance)

@@ -34,6 +34,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
# Configure visibility of unit tests to the Agent.
USE_UNIT_TESTS = os.environ.get('USE_UNIT_TESTS', 'false').lower() == 'true'
@@ -203,7 +204,7 @@ def process_instance(
# =============================================
runtime: Runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance=instance)
# Here's how you can run the agent (similar to the `main` function) and get the final task state

@@ -31,6 +31,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
AGENT_CLS_TO_FAKE_USER_RESPONSE_FN = {
'CodeActAgent': functools.partial(
@@ -274,6 +275,7 @@ def process_instance(
instruction += AGENT_CLS_TO_INST_SUFFIX[metadata.agent_class]
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance)
# Here's how you can run the agent (similar to the `main` function) and get the final task state

@@ -34,6 +34,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
def codeact_user_response(state: State) -> str:
@@ -399,6 +400,7 @@ def process_instance(
instruction += AGENT_CLS_TO_INST_SUFFIX[metadata.agent_class]
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance)
# Here's how you can run the agent (similar to the `main` function) and get the final task state

@@ -25,6 +25,7 @@ from openhands.core.config import (
from openhands.core.logger import openhands_logger as logger
from openhands.core.main import create_runtime, run_controller
from openhands.events.action import MessageAction
from openhands.utils.async_utils import call_async_from_sync
# Only CodeActAgent can delegate to BrowsingAgent
SUPPORTED_AGENT_CLS = {'CodeActAgent'}
@@ -74,6 +75,7 @@ def process_instance(
)
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
state: State | None = asyncio.run(
run_controller(

@@ -35,6 +35,7 @@ from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation, ErrorObservation
from openhands.events.serialization.event import event_to_dict
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
from openhands.utils.shutdown_listener import sleep_if_should_continue
USE_HINT_TEXT = os.environ.get('USE_HINT_TEXT', 'false').lower() == 'true'
@@ -394,6 +395,7 @@ def process_instance(
logger.info(f'Starting evaluation for instance {instance.instance_id}.')
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
try:
initialize_runtime(runtime, instance)

@@ -34,6 +34,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import AgentFinishAction, CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
EVALUATION_LLM = 'gpt-4-1106-preview'
@@ -281,6 +282,7 @@ def process_instance(
# Here's how you can run the agent (similar to the `main` function) and get the final task state
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance.data_files)
state: State | None = asyncio.run(

@@ -31,6 +31,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import AgentFinishAction, CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
DATASET_CACHE_DIR = os.path.join(os.path.dirname(__file__), 'data')
@@ -148,6 +149,7 @@ def process_instance(
logger.info(f'Instruction:\n{instruction}', extra={'msg_type': 'OBSERVATION'})
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance)
# Here's how you can run the agent (similar to the `main` function) and get the final task state

@@ -13,6 +13,7 @@
# limitations under the License.
# This file is modified from https://github.com/ShishirPatil/gorilla/blob/main/eval/eval-scripts/ast_eval_hf.py
import tree_sitter_python as tspython
from tree_sitter import Language, Parser
@@ -39,10 +40,9 @@ def get_all_sub_trees(root_node):
# Parse the program into AST trees
def ast_parse(candidate, lang='python'):
LANGUAGE = Language('evaluation/gorilla/my-languages.so', lang)
parser = Parser()
parser.set_language(LANGUAGE)
def ast_parse(candidate):
LANGUAGE = Language(tspython.language())
parser = Parser(LANGUAGE)
candidate_tree = parser.parse(bytes(candidate, 'utf8')).root_node
return candidate_tree

@@ -13,6 +13,7 @@
# limitations under the License.
# This file is modified from https://github.com/ShishirPatil/gorilla/blob/main/eval/eval-scripts/ast_eval_tf.py
import tree_sitter_python as tspython
from tree_sitter import Language, Parser
@@ -39,10 +40,9 @@ def get_all_sub_trees(root_node):
# Parse the program into AST trees
def ast_parse(candidate, lang='python'):
LANGUAGE = Language('evaluation/gorilla/my-languages.so', lang)
parser = Parser()
parser.set_language(LANGUAGE)
def ast_parse(candidate):
LANGUAGE = Language(tspython.language())
parser = Parser(LANGUAGE)
candidate_tree = parser.parse(bytes(candidate, 'utf8')).root_node
return candidate_tree

@@ -13,6 +13,7 @@
# limitations under the License.
# This file is modified from https://github.com/ShishirPatil/gorilla/blob/main/eval/eval-scripts/ast_eval_th.py
import tree_sitter_python as tspython
from tree_sitter import Language, Parser
@@ -39,10 +40,9 @@ def get_all_sub_trees(root_node):
# Parse the program into AST trees
def ast_parse(candidate, lang='python'):
LANGUAGE = Language('evaluation/gorilla/my-languages.so', lang)
parser = Parser()
parser.set_language(LANGUAGE)
def ast_parse(candidate):
LANGUAGE = Language(tspython.language())
parser = Parser(LANGUAGE)
candidate_tree = parser.parse(bytes(candidate, 'utf8')).root_node
return candidate_tree

@@ -26,6 +26,7 @@ from openhands.core.config import (
from openhands.core.logger import openhands_logger as logger
from openhands.core.main import create_runtime, run_controller
from openhands.events.action import MessageAction
from openhands.utils.async_utils import call_async_from_sync
AGENT_CLS_TO_FAKE_USER_RESPONSE_FN = {
'CodeActAgent': codeact_user_response,
@@ -82,6 +83,7 @@ def process_instance(
# Here's how you can run the agent (similar to the `main` function) and get the final task state
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
state: State | None = asyncio.run(
run_controller(
config=config,

@@ -71,19 +71,19 @@ def fetch_data(url, filename):
def get_data_for_hub(hub: str):
if hub == 'hf':
question_data = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/eval/eval-data/questions/huggingface/questions_huggingface_0_shot.jsonl'
api_dataset = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/data/api/huggingface_api.jsonl'
apibench = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/data/apibench/huggingface_eval.json'
question_data = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/eval/eval-data/questions/huggingface/questions_huggingface_0_shot.jsonl'
api_dataset = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/data/api/huggingface_api.jsonl'
apibench = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/data/apibench/huggingface_eval.json'
ast_eval = ast_eval_hf
elif hub == 'torch':
question_data = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/eval/eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl'
api_dataset = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/data/api/torchhub_api.jsonl'
apibench = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/data/apibench/torchhub_eval.json'
question_data = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/eval/eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl'
api_dataset = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/data/api/torchhub_api.jsonl'
apibench = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/data/apibench/torchhub_eval.json'
ast_eval = ast_eval_th
elif hub == 'tf':
question_data = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/eval/eval-data/questions/tensorflowhub/questions_tensorflowhub_0_shot.jsonl'
api_dataset = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/data/api/tensorflowhub_api.jsonl'
apibench = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/main/data/apibench/tensorflow_eval.json'
question_data = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/eval/eval-data/questions/tensorflowhub/questions_tensorflowhub_0_shot.jsonl'
api_dataset = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/data/api/tensorflowhub_api.jsonl'
apibench = 'https://raw.githubusercontent.com/ShishirPatil/gorilla/refs/tags/v1.2/data/apibench/tensorflow_eval.json'
ast_eval = ast_eval_tf
question_data = fetch_data(question_data, 'question_data.jsonl')
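
The hunk above replaces `main` with `refs/tags/v1.2` in every dataset URL, so the downloads no longer track a moving branch. One way to keep the pinned ref in a single place is sketched below; `GORILLA_REF`, `BASE`, and `raw_url` are hypothetical names and are not part of the script.

```python
GORILLA_REF = 'refs/tags/v1.2'  # pinned ref, taken from the URLs in the hunk above
BASE = 'https://raw.githubusercontent.com/ShishirPatil/gorilla'


def raw_url(path: str, ref: str = GORILLA_REF) -> str:
    """Build a raw.githubusercontent.com URL for a repository file at a fixed ref."""
    return f'{BASE}/{ref}/{path.lstrip("/")}'


api_dataset = raw_url('data/api/huggingface_api.jsonl')
print(api_dataset)
```

With a helper like this, moving to a newer tag would mean changing one constant instead of nine string literals.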

@@ -49,6 +49,7 @@ from openhands.events.action import (
MessageAction,
)
from openhands.events.observation import Observation
from openhands.utils.async_utils import call_async_from_sync
ACTION_FORMAT = """
<<FINAL_ANSWER||
@@ -214,6 +215,7 @@ Ok now its time to start solving the question. Good luck!
"""
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
state: State | None = asyncio.run(
run_controller(
config=config,

@@ -39,6 +39,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
IMPORT_HELPER = {
'python': [
@@ -232,6 +233,7 @@ def process_instance(
# Here's how you can run the agent (similar to the `main` function) and get the final task state
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance)
state: State | None = asyncio.run(
run_controller(

@@ -31,6 +31,7 @@ from openhands.events.action import (
)
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
AGENT_CLS_TO_FAKE_USER_RESPONSE_FN = {
'CodeActAgent': codeact_user_response,
@@ -206,6 +207,7 @@ def process_instance(
instruction += AGENT_CLS_TO_INST_SUFFIX[metadata.agent_class]
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance)
# Here's how you can run the agent (similar to the `main` function) and get the final task state

@@ -41,6 +41,7 @@ from openhands.runtime.browser.browser_env import (
BROWSER_EVAL_GET_GOAL_ACTION,
BROWSER_EVAL_GET_REWARDS_ACTION,
)
from openhands.utils.async_utils import call_async_from_sync
SUPPORTED_AGENT_CLS = {'BrowsingAgent', 'CodeActAgent'}
@@ -145,6 +146,7 @@ def process_instance(
logger.info(f'Starting evaluation for instance {env_id}.')
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
task_str, obs = initialize_runtime(runtime)
task_str += (

@@ -35,6 +35,7 @@ from openhands.events.action import (
)
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
def codeact_user_response_mint(state: State, task: Task, task_config: dict[str, int]):
@@ -184,6 +185,7 @@ def process_instance(
)
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime)
state: State | None = asyncio.run(

@@ -43,6 +43,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
config = load_app_config()
@@ -234,6 +235,7 @@ def process_instance(instance: Any, metadata: EvalMetadata, reset_logger: bool =
instruction += AGENT_CLS_TO_INST_SUFFIX[metadata.agent_class]
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance)
# Run the agent

@@ -29,6 +29,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
AGENT_CLS_TO_FAKE_USER_RESPONSE_FN = {
'CodeActAgent': codeact_user_response,
@@ -195,6 +196,7 @@ If the program uses some packages that are incompatible, please figure out alter
"""
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime, instance)
# Here's how you can run the agent (similar to the `main` function) and get the final task state

@@ -40,6 +40,7 @@ from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation, ErrorObservation
from openhands.events.serialization.event import event_to_dict
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
from openhands.utils.shutdown_listener import sleep_if_should_continue
USE_HINT_TEXT = os.environ.get('USE_HINT_TEXT', 'false').lower() == 'true'
@@ -464,6 +465,7 @@ def process_instance(
f'This is the {runtime_failure_count + 1}th attempt for instance {instance.instance_id}, setting resource factor to {config.sandbox.remote_runtime_resource_factor}'
)
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
try:
initialize_runtime(runtime, instance)

@@ -7,7 +7,7 @@ import os
import re
from dataclasses import dataclass
from enum import Enum, auto
from typing import Dict, List, Optional, Union
from typing import Dict, List, Union
from openhands.core.logger import openhands_logger as logger
from openhands.events.action import BrowseInteractiveAction
@@ -133,7 +133,7 @@ def parse_content_to_elements(content: str) -> Dict[str, str]:
return elements
def find_matching_anchor(content: str, selector: str) -> Optional[str]:
def find_matching_anchor(content: str, selector: str) -> str | None:
"""Find the anchor ID that matches the given selector description"""
elements = parse_content_to_elements(content)

@@ -28,6 +28,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import BrowserOutputObservation, CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
def get_config(
@@ -275,7 +276,7 @@ if __name__ == '__main__':
args.task_image_name, task_short_name, temp_dir, agent_llm_config, agent_config
)
runtime: Runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
init_task_env(runtime, args.server_hostname, env_llm_config)
dependencies = load_dependencies(runtime)

@@ -27,6 +27,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import CmdRunAction, MessageAction
from openhands.events.observation import CmdOutputObservation
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
AGENT_CLS_TO_FAKE_USER_RESPONSE_FN = {
'CodeActAgent': codeact_user_response,
@@ -104,6 +105,7 @@ def process_instance(instance: Any, metadata: EvalMetadata, reset_logger: bool =
logger.info(f'Instruction:\n{instruction}', extra={'msg_type': 'OBSERVATION'})
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
initialize_runtime(runtime)
# Here's how you can run the agent (similar to the `main` function) and get the final task state

@@ -37,6 +37,7 @@ from openhands.runtime.browser.browser_env import (
BROWSER_EVAL_GET_GOAL_ACTION,
BROWSER_EVAL_GET_REWARDS_ACTION,
)
from openhands.utils.async_utils import call_async_from_sync
SUPPORTED_AGENT_CLS = {'VisualBrowsingAgent'}
AGENT_CLS_TO_FAKE_USER_RESPONSE_FN = {
@@ -159,6 +160,8 @@ def process_instance(
logger.info(f'Starting evaluation for instance {env_id}.')
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
task_str, goal_image_urls = initialize_runtime(runtime)
initial_user_action = MessageAction(content=task_str, image_urls=goal_image_urls)
state: State | None = asyncio.run(

@@ -36,6 +36,7 @@ from openhands.runtime.browser.browser_env import (
BROWSER_EVAL_GET_GOAL_ACTION,
BROWSER_EVAL_GET_REWARDS_ACTION,
)
from openhands.utils.async_utils import call_async_from_sync
SUPPORTED_AGENT_CLS = {'BrowsingAgent'}
@@ -144,6 +145,7 @@ def process_instance(
logger.info(f'Starting evaluation for instance {env_id}.')
runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
task_str = initialize_runtime(runtime)
state: State | None = asyncio.run(

@@ -30,6 +30,7 @@ from openhands.core.main import create_runtime, run_controller
from openhands.events.action import MessageAction
from openhands.events.serialization.event import event_to_dict
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
FAKE_RESPONSES = {
'CodeActAgent': fake_user_response,
@@ -108,6 +109,7 @@ def process_instance(
# create sandbox and run the agent
# =============================================
runtime: Runtime = create_runtime(config)
call_async_from_sync(runtime.connect)
try:
test_class.initialize_runtime(runtime)

@@ -1,35 +0,0 @@
import { screen } from "@testing-library/react";
import { describe, it, expect } from "vitest";
import { renderWithProviders } from "test-utils";
import { RuntimeSizeSelector } from "#/components/shared/modals/settings/runtime-size-selector";
const renderRuntimeSizeSelector = () =>
renderWithProviders(<RuntimeSizeSelector isDisabled={false} />);
describe("RuntimeSizeSelector", () => {
it("should show both runtime size options", () => {
renderRuntimeSizeSelector();
// The options are in the hidden select element
const select = screen.getByRole("combobox", { hidden: true });
expect(select).toHaveValue("1");
expect(select).toHaveDisplayValue("1x (2 core, 8G)");
expect(select.children).toHaveLength(3); // Empty option + 2 size options
});
it("should show the full description text for disabled options", async () => {
renderRuntimeSizeSelector();
// Click the button to open the dropdown
const button = screen.getByRole("button", {
name: "1x (2 core, 8G) SETTINGS_FORM$RUNTIME_SIZE_LABEL",
});
button.click();
// Wait for the dropdown to open and find the description text
const description = await screen.findByText(
"Runtime sizes over 1 are disabled by default, please contact contact@all-hands.dev to get access to larger runtimes.",
);
expect(description).toBeInTheDocument();
expect(description).toHaveClass("whitespace-normal", "break-words");
});
});

@@ -7,47 +7,47 @@
"node": ">=20.0.0"
},
"dependencies": {
"@heroui/react": "2.6.14",
"@heroui/react": "2.7.4",
"@monaco-editor/react": "^4.7.0-rc.0",
"@react-router/node": "^7.1.5",
"@react-router/serve": "^7.1.5",
"@react-router/node": "^7.2.0",
"@react-router/serve": "^7.2.0",
"@react-types/shared": "^3.27.0",
"@reduxjs/toolkit": "^2.5.1",
"@reduxjs/toolkit": "^2.6.0",
"@stripe/react-stripe-js": "^3.1.1",
"@stripe/stripe-js": "^5.5.0",
"@tanstack/react-query": "^5.66.7",
"@stripe/stripe-js": "^5.7.0",
"@tanstack/react-query": "^5.66.11",
"@vitejs/plugin-react": "^4.3.2",
"@xterm/addon-fit": "^0.10.0",
"@xterm/xterm": "^5.4.0",
"axios": "^1.7.9",
"axios": "^1.8.1",
"clsx": "^2.1.1",
"eslint-config-airbnb-typescript": "^18.0.0",
"framer-motion": "^12.4.4",
"framer-motion": "^12.4.7",
"i18next": "^24.2.2",
"i18next-browser-languagedetector": "^8.0.3",
"i18next-browser-languagedetector": "^8.0.4",
"i18next-http-backend": "^3.0.2",
"isbot": "^5.1.22",
"jose": "^5.10.0",
"isbot": "^5.1.23",
"jose": "^6.0.8",
"monaco-editor": "^0.52.2",
"posthog-js": "^1.219.3",
"posthog-js": "^1.225.1",
"react": "^19.0.0",
"react-dom": "^19.0.0",
"react-highlight": "^0.15.0",
"react-hot-toast": "^2.5.1",
"react-i18next": "^15.4.1",
"react-icons": "^5.4.0",
"react-markdown": "^9.0.3",
"react-icons": "^5.5.0",
"react-markdown": "^10.0.1",
"react-redux": "^9.2.0",
"react-router": "^7.1.5",
"react-router": "^7.2.0",
"react-syntax-highlighter": "^15.6.1",
"react-textarea-autosize": "^8.5.7",
"remark-gfm": "^4.0.1",
"sirv-cli": "^3.0.1",
"socket.io-client": "^4.8.1",
"tailwind-merge": "^3.0.1",
"vite": "^6.1.0",
"tailwind-merge": "^3.0.2",
"vite": "^6.2.0",
"web-vitals": "^3.5.2",
"ws": "^8.18.0"
"ws": "^8.18.1"
},
"scripts": {
"dev": "npm run make-i18n && cross-env VITE_MOCK_API=false react-router dev",
@@ -81,14 +81,14 @@
"devDependencies": {
"@mswjs/socket.io-binding": "^0.1.1",
"@playwright/test": "^1.50.1",
"@react-router/dev": "^7.1.5",
"@react-router/dev": "^7.2.0",
"@tailwindcss/typography": "^0.5.16",
"@tanstack/eslint-plugin-query": "^5.66.1",
"@testing-library/dom": "^10.4.0",
"@testing-library/jest-dom": "^6.6.1",
"@testing-library/react": "^16.2.0",
"@testing-library/user-event": "^14.6.1",
"@types/node": "^22.13.4",
"@types/node": "^22.13.8",
"@types/react": "^19.0.8",
"@types/react-dom": "^19.0.3",
"@types/react-highlight": "^0.12.8",
@@ -96,13 +96,13 @@
"@types/ws": "^8.5.14",
"@typescript-eslint/eslint-plugin": "^7.18.0",
"@typescript-eslint/parser": "^7.18.0",
"@vitest/coverage-v8": "^3.0.6",
"@vitest/coverage-v8": "^3.0.7",
"autoprefixer": "^10.4.20",
"cross-env": "^7.0.3",
"eslint": "^8.57.0",
"eslint-config-airbnb": "^19.0.4",
"eslint-config-airbnb-typescript": "^18.0.0",
"eslint-config-prettier": "^10.0.1",
"eslint-config-prettier": "^10.0.2",
"eslint-plugin-import": "^2.29.1",
"eslint-plugin-jsx-a11y": "^6.10.2",
"eslint-plugin-prettier": "^5.2.3",
@@ -113,10 +113,10 @@
"lint-staged": "^15.4.3",
"msw": "^2.6.6",
"postcss": "^8.5.2",
"prettier": "^3.5.1",
"stripe": "^17.5.0",
"prettier": "^3.5.3",
"stripe": "^17.7.0",
"tailwindcss": "^3.4.17",
"typescript": "^5.7.3",
"typescript": "^5.8.2",
"vite-plugin-svgr": "^4.2.0",
"vite-tsconfig-paths": "^5.1.4",
"vitest": "^3.0.2"

@@ -57,18 +57,19 @@ export function ChatMessage({
onClick={handleCopyToClipboard}
mode={isCopy ? "copied" : "copy"}
/>
<Markdown
className="text-sm overflow-auto break-words"
components={{
code,
ul,
ol,
a: anchor,
}}
remarkPlugins={[remarkGfm]}
>
{message}
</Markdown>
<div className="text-sm overflow-auto break-words">
<Markdown
components={{
code,
ul,
ol,
a: anchor,
}}
remarkPlugins={[remarkGfm]}
>
{message}
</Markdown>
</div>
{children}
</article>
);

@@ -123,17 +123,18 @@ export function ExpandableMessage({
)}
</div>
{(!headline || showDetails) && (
<Markdown
className="text-sm overflow-auto"
components={{
code,
ul,
ol,
}}
remarkPlugins={[remarkGfm]}
>
{details}
</Markdown>
<div className="text-sm overflow-auto">
<Markdown
components={{
code,
ul,
ol,
}}
remarkPlugins={[remarkGfm]}
>
{details}
</Markdown>
</div>
)}
</div>
</div>

@@ -99,7 +99,6 @@ export function GitHubRepositorySelector({
<AutocompleteItem
data-testid="github-repo-item"
key={repo.id}
value={repo.id}
className="data-[selected=true]:bg-default-100"
textValue={repo.full_name}
>
@@ -114,7 +113,6 @@ export function GitHubRepositorySelector({
<AutocompleteItem
data-testid="github-repo-item"
key={repo.id}
value={repo.id}
className="data-[selected=true]:bg-default-100"
textValue={repo.full_name}
>

@@ -1,9 +1,10 @@
import { Autocomplete, AutocompleteItem } from "@heroui/react";
import { ReactNode } from "react";
import { OptionalTag } from "./optional-tag";
interface SettingsDropdownInputProps {
testId: string;
label: string;
label: ReactNode;
name: string;
items: { key: React.Key; label: string }[];
showOptionalTag?: boolean;
@@ -29,7 +30,7 @@ export function SettingsDropdownInput({
{showOptionalTag && <OptionalTag />}
</div>
<Autocomplete
aria-label={label}
aria-label={typeof label === "string" ? label : name}
data-testid={testId}
name={name}
defaultItems={items}

@@ -1,43 +0,0 @@
import { Autocomplete, AutocompleteItem } from "@heroui/react";
interface FormFieldsetProps {
id: string;
label: string;
items: { key: string; value: string }[];
defaultSelectedKey?: string;
isClearable?: boolean;
}
export function FormFieldset({
id,
label,
items,
defaultSelectedKey,
isClearable,
}: FormFieldsetProps) {
return (
<fieldset className="flex flex-col gap-2">
<label htmlFor={id} className="font-[500] text-[#A3A3A3] text-xs">
{label}
</label>
<Autocomplete
id={id}
name={id}
aria-label={label}
defaultSelectedKey={defaultSelectedKey}
isClearable={isClearable}
inputProps={{
classNames: {
inputWrapper: "bg-[#27272A] rounded-md text-sm px-3 py-[10px]",
},
}}
>
{items.map((item) => (
<AutocompleteItem key={item.key} value={item.key}>
{item.value}
</AutocompleteItem>
))}
</Autocomplete>
</fieldset>
);
}

@@ -1,46 +0,0 @@
import { Autocomplete, AutocompleteItem } from "@heroui/react";
import { useTranslation } from "react-i18next";
import { I18nKey } from "#/i18n/declaration";
interface AgentInputProps {
isDisabled: boolean;
defaultValue: string;
agents: string[];
}
export function AgentInput({
isDisabled,
defaultValue,
agents,
}: AgentInputProps) {
const { t } = useTranslation();
return (
<fieldset data-testid="agent-selector" className="flex flex-col gap-2">
<label htmlFor="agent" className="font-[500] text-[#A3A3A3] text-xs">
{t(I18nKey.SETTINGS_FORM$AGENT_LABEL)}
</label>
<Autocomplete
isDisabled={isDisabled}
isRequired
id="agent"
aria-label="Agent"
data-testid="agent-input"
name="agent"
defaultSelectedKey={defaultValue}
isClearable={false}
inputProps={{
classNames: {
inputWrapper: "bg-[#27272A] rounded-md text-sm px-3 py-[10px]",
},
}}
>
{agents.map((agent) => (
<AutocompleteItem key={agent} value={agent}>
{agent}
</AutocompleteItem>
))}
</Autocomplete>
</fieldset>
);
}

@@ -1,46 +0,0 @@
import { Autocomplete, AutocompleteItem } from "@heroui/react";
import { useTranslation } from "react-i18next";
import { I18nKey } from "#/i18n/declaration";
interface SecurityAnalyzerInputProps {
isDisabled: boolean;
defaultValue: string;
securityAnalyzers: string[];
}
export function SecurityAnalyzerInput({
isDisabled,
defaultValue,
securityAnalyzers,
}: SecurityAnalyzerInputProps) {
const { t } = useTranslation();
return (
<fieldset className="flex flex-col gap-2">
<label
htmlFor="security-analyzer"
className="font-[500] text-[#A3A3A3] text-xs"
>
{t(I18nKey.SETTINGS_FORM$SECURITY_ANALYZER_LABEL)}
</label>
<Autocomplete
isDisabled={isDisabled}
id="security-analyzer"
name="security-analyzer"
aria-label="Security Analyzer"
defaultSelectedKey={defaultValue}
inputProps={{
classNames: {
inputWrapper: "bg-[#27272A] rounded-md text-sm px-3 py-[10px]",
},
}}
>
{securityAnalyzers.map((analyzer) => (
<AutocompleteItem key={analyzer} value={analyzer}>
{analyzer}
</AutocompleteItem>
))}
</Autocomplete>
</fieldset>
);
}

@@ -100,7 +100,6 @@ export function ModelSelector({
<AutocompleteItem
data-testid={`provider-item-${provider}`}
key={provider}
value={provider}
>
{mapProvider(provider)}
</AutocompleteItem>
@@ -110,7 +109,7 @@ export function ModelSelector({
{Object.keys(models)
.filter((provider) => !VERIFIED_PROVIDERS.includes(provider))
.map((provider) => (
<AutocompleteItem key={provider} value={provider}>
<AutocompleteItem key={provider}>
{mapProvider(provider)}
</AutocompleteItem>
))}
@@ -148,9 +147,7 @@ export function ModelSelector({
{models[selectedProvider || ""]?.models
.filter((model) => VERIFIED_MODELS.includes(model))
.map((model) => (
<AutocompleteItem key={model} value={model}>
{model}
</AutocompleteItem>
<AutocompleteItem key={model}>{model}</AutocompleteItem>
))}
</AutocompleteSection>
<AutocompleteSection title="Others">
@@ -160,7 +157,6 @@ export function ModelSelector({
<AutocompleteItem
data-testid={`model-item-${model}`}
key={model}
value={model}
>
{model}
</AutocompleteItem>

@@ -1,57 +0,0 @@
import { useTranslation } from "react-i18next";
import { Select, SelectItem } from "@heroui/react";
import { I18nKey } from "#/i18n/declaration";
interface RuntimeSizeSelectorProps {
isDisabled: boolean;
defaultValue?: number;
}
export function RuntimeSizeSelector({
isDisabled,
defaultValue,
}: RuntimeSizeSelectorProps) {
const { t } = useTranslation();
return (
<fieldset className="flex flex-col gap-2">
<label
htmlFor="runtime-size"
className="font-[500] text-[#A3A3A3] text-xs"
>
{t(I18nKey.SETTINGS_FORM$RUNTIME_SIZE_LABEL)}
</label>
<Select
data-testid="runtime-size"
id="runtime-size"
name="runtime-size"
defaultSelectedKeys={[String(defaultValue || 1)]}
selectedKeys={[String(defaultValue || 1)]}
isDisabled={isDisabled}
selectionMode="single"
disallowEmptySelection
aria-label={t(I18nKey.SETTINGS_FORM$RUNTIME_SIZE_LABEL)}
classNames={{
trigger: "bg-[#27272A] rounded-md text-sm px-3 py-[10px]",
}}
>
<SelectItem key="1" value={1}>
1x (2 core, 8G)
</SelectItem>
<SelectItem
key="2"
value={2}
isDisabled
classNames={{
description:
"whitespace-normal break-words min-w-[300px] max-w-[300px]",
base: "min-w-[300px] max-w-[300px]",
}}
description="Runtime sizes over 1 are disabled by default, please contact contact@all-hands.dev to get access to larger runtimes."
>
2x (4 core, 16G)
</SelectItem>
</Select>
</fieldset>
);
}

@@ -43,15 +43,18 @@ export function SettingsForm({ settings, models, onClose }: SettingsFormProps) {
const handleFormSubmission = async (formData: FormData) => {
const newSettings = extractSettings(formData);
await saveUserSettings(newSettings);
onClose();
resetOngoingSession();
await saveUserSettings(newSettings, {
onSuccess: () => {
onClose();
resetOngoingSession();
posthog.capture("settings_saved", {
LLM_MODEL: newSettings.LLM_MODEL,
LLM_API_KEY: newSettings.LLM_API_KEY ? "SET" : "UNSET",
REMOTE_RUNTIME_RESOURCE_FACTOR:
newSettings.REMOTE_RUNTIME_RESOURCE_FACTOR,
posthog.capture("settings_saved", {
LLM_MODEL: newSettings.LLM_MODEL,
LLM_API_KEY: newSettings.LLM_API_KEY ? "SET" : "UNSET",
REMOTE_RUNTIME_RESOURCE_FACTOR:
newSettings.REMOTE_RUNTIME_RESOURCE_FACTOR,
});
},
});
};

@@ -3,7 +3,6 @@ import React from "react";
import posthog from "posthog-js";
import OpenHands from "#/api/open-hands";
import { useAuth } from "#/context/auth-context";
import { useConfig } from "#/hooks/query/use-config";
import { DEFAULT_SETTINGS } from "#/services/settings";
const getSettingsQueryFn = async () => {
@@ -27,12 +26,10 @@ const getSettingsQueryFn = async () => {
export const useSettings = () => {
const { setGitHubTokenIsSet, githubTokenIsSet } = useAuth();
const { data: config } = useConfig();
const query = useQuery({
queryKey: ["settings", githubTokenIsSet],
queryFn: getSettingsQueryFn,
enabled: config?.APP_MODE !== "saas" || githubTokenIsSet,
// Only retry if the error is not a 404 because we
// would want to show the modal immediately if the
// settings are not found

@@ -278,7 +278,15 @@ function AccountSettings() {
<SettingsDropdownInput
testId="runtime-settings-input"
name="runtime-settings-input"
label="Runtime Settings"
label={
<>
Runtime Settings (
<a href="mailto:contact@all-hands.dev">
get in touch for access
</a>
)
</>
}
items={REMOTE_RUNTIME_OPTIONS}
defaultSelectedKey={settings.REMOTE_RUNTIME_RESOURCE_FACTOR?.toString()}
isDisabled

@@ -95,6 +95,7 @@ export function handleObservationMessage(message: ObservationMessage) {
observation,
extras: {
path: String(message.extras.path || ""),
impl_source: String(message.extras.impl_source || ""),
},
}),
);
@@ -107,6 +108,7 @@ export function handleObservationMessage(message: ObservationMessage) {
extras: {
path: String(message.extras.path || ""),
diff: String(message.extras.diff || ""),
impl_source: String(message.extras.impl_source || ""),
},
}),
);

@@ -159,9 +159,16 @@ export const chatSlice = createSlice({
.includes("error:");
} else if (observationID === "read" || observationID === "edit") {
// For read/edit operations, we consider it successful if there's content and no error
causeMessage.success =
observation.payload.content.length > 0 &&
!observation.payload.content.toLowerCase().includes("error:");
if (observation.payload.extras.impl_source === "oh_aci") {
causeMessage.success =
observation.payload.content.length > 0 &&
!observation.payload.content.startsWith("ERROR:\n");
} else {
causeMessage.success =
observation.payload.content.length > 0 &&
!observation.payload.content.toLowerCase().includes("error:");
}
}
if (observationID === "run" || observationID === "run_ipython") {
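
The read/edit success check in the hunk above can be restated as a small Python sketch. This is only an illustration of the heuristic, not code from the repository: the function name `observation_succeeded` is hypothetical, and the only facts taken from the diff are the `oh_aci` value of `impl_source`, the leading `ERROR:\n` marker, and the case-insensitive `error:` substring fallback.

```python
def observation_succeeded(content: str, impl_source: str = "") -> bool:
    """Hypothetical Python mirror of the read/edit success heuristic."""
    if not content:
        # Empty content is never treated as a success
        return False
    if impl_source == "oh_aci":
        # oh_aci output is only a failure when it starts with the marker
        return not content.startswith("ERROR:\n")
    # Other sources: any case-insensitive "error:" substring means failure
    return "error:" not in content.lower()


file_text = "def f():\n    # error: handled upstream\n"
print(observation_succeeded(file_text, "oh_aci"))  # True
print(observation_succeeded(file_text))            # False
```

The example shows the motivation for the branch: file content that merely mentions `error:` no longer counts as a failed read when the observation comes from `oh_aci`.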

@@ -63,6 +63,7 @@ export interface ReadObservation extends OpenHandsObservationEvent<"read"> {
source: "agent";
extras: {
path: string;
impl_source: string;
};
}
@@ -71,6 +72,7 @@ export interface EditObservation extends OpenHandsObservationEvent<"edit"> {
extras: {
path: string;
diff: string;
impl_source: string;
};
}

@@ -6,12 +6,10 @@
*/
export const generateGitHubAuthUrl = (clientId: string, requestUrl: URL) => {
const redirectUri = `${requestUrl.origin}/oauth/keycloak/callback`;
const baseUrl = `${requestUrl.origin}`
.replace("https://", "")
.replace("http://", "");
const authUrl = baseUrl
const authUrl = requestUrl.hostname
.replace(/(^|\.)staging\.all-hands\.dev$/, "$1auth.staging.all-hands.dev")
.replace(/(^|\.)app\.all-hands\.dev$/, "auth.app.all-hands.dev");
const scope = "openid email profile";
.replace(/(^|\.)app\.all-hands\.dev$/, "auth.app.all-hands.dev")
.replace(/(^|\.)localhost$/, "auth.staging.all-hands.dev");
const scope = "openid email profile offline_access";
return `https://${authUrl}/realms/allhands/protocol/openid-connect/auth?client_id=github&response_type=code&redirect_uri=${encodeURIComponent(redirectUri)}&scope=${encodeURIComponent(scope)}`;
};
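
To make the hostname rewriting above easier to follow, here is a rough Python port of the three chained replacements. It is a sketch only: the function name `keycloak_auth_host` is hypothetical, and Python's `re.sub` stands in for the JavaScript `String.prototype.replace` calls. It also illustrates why the switch from `origin` to `hostname` matters: a hostname carries no port, so the `$`-anchored `localhost` pattern can match.

```python
import re
from urllib.parse import urlsplit


def keycloak_auth_host(hostname: str) -> str:
    """Hypothetical port of the hostname-to-auth-host mapping."""
    # Keep any subdomain prefix for staging hosts (group 1 is "" or ".")
    host = re.sub(
        r"(^|\.)staging\.all-hands\.dev$",
        r"\1auth.staging.all-hands.dev",
        hostname,
    )
    # Production app host maps to its auth host
    host = re.sub(r"(^|\.)app\.all-hands\.dev$", "auth.app.all-hands.dev", host)
    # Local development is pointed at the staging auth host
    host = re.sub(r"(^|\.)localhost$", "auth.staging.all-hands.dev", host)
    return host


# urlsplit(...).hostname drops the port, much like URL.hostname in the browser
local = urlsplit("http://localhost:3001/settings").hostname
print(keycloak_auth_host(local))                # auth.staging.all-hands.dev
print(keycloak_auth_host("app.all-hands.dev"))  # auth.app.all-hands.dev
```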

@@ -53,4 +53,3 @@ To verify Docker is working correctly, run the hello-world container:
```bash
sudo docker run hello-world
```

@@ -91,7 +91,7 @@ class CodeActAgent(Agent):
self.conversation_memory = ConversationMemory(self.prompt_manager)
self.condenser = Condenser.from_config(self.config.condenser)
logger.debug(f'Using condenser: {self.condenser}')
logger.debug(f'Using condenser: {type(self.condenser)}')
def reset(self) -> None:
"""Resets the CodeAct Agent."""

@@ -248,8 +248,9 @@ class AgentController:
)
reported = RuntimeError(
'There was an unexpected error while running the agent. Please '
f'report this error to the developers. Your session ID is {self.id}. '
f'Error type: {e.__class__.__name__}'
'report this error to the developers by opening an issue at '
'https://github.com/All-Hands-AI/OpenHands. Your session ID is '
f'{self.id}. Error type: {e.__class__.__name__}'
)
if (
isinstance(e, litellm.AuthenticationError)

@@ -14,7 +14,12 @@ from openhands.core.config import (
from openhands.core.logger import openhands_logger as logger
from openhands.core.loop import run_agent_until_done
from openhands.core.schema import AgentState
from openhands.core.setup import create_agent, create_controller, create_runtime
from openhands.core.setup import (
create_agent,
create_controller,
create_runtime,
initialize_repository_for_runtime,
)
from openhands.events import EventSource, EventStreamSubscriber
from openhands.events.action import (
Action,
@@ -109,7 +114,6 @@ async def main(loop: asyncio.AbstractEventLoop):
sid=sid,
headless_mode=True,
agent=agent,
selected_repository=config.sandbox.selected_repo,
)
controller, _ = create_controller(agent, runtime, config)
@@ -165,6 +169,14 @@ async def main(loop: asyncio.AbstractEventLoop):
await runtime.connect()
# Initialize repository if needed
if config.sandbox.selected_repo:
initialize_repository_for_runtime(
runtime,
agent=agent,
selected_repository=config.sandbox.selected_repo,
)
if initial_user_action:
# If there's an initial user action, enqueue it and do not prompt again
event_stream.add_event(initial_user_action, EventSource.USER)

@@ -82,6 +82,7 @@ class AppConfig(BaseModel):
daytona_target: str = Field(default='us')
cli_multiline_input: bool = Field(default=False)
conversation_max_age_seconds: int = Field(default=864000) # 10 days in seconds
enable_default_condenser: bool = Field(default=True)
defaults_dict: ClassVar[dict] = {}

@@ -1,7 +1,8 @@
from typing import Literal
from typing import Literal, cast
from pydantic import BaseModel, Field
from pydantic import BaseModel, Field, ValidationError
from openhands.core import logger
from openhands.core.config.llm_config import LLMConfig
@@ -10,17 +11,21 @@ class NoOpCondenserConfig(BaseModel):
type: Literal['noop'] = Field('noop')
model_config = {'extra': 'forbid'}
class ObservationMaskingCondenserConfig(BaseModel):
"""Configuration for ObservationMaskingCondenser."""
type: Literal['observation_masking'] = Field('observation_masking')
attention_window: int = Field(
default=10,
default=100,
description='The number of most-recent events where observations will not be masked.',
ge=1,
)
model_config = {'extra': 'forbid'}
class RecentEventsCondenserConfig(BaseModel):
"""Configuration for RecentEventsCondenser."""
@@ -34,9 +39,11 @@ class RecentEventsCondenserConfig(BaseModel):
ge=0,
)
max_events: int = Field(
default=10, description='Maximum number of events to keep.', ge=1
default=100, description='Maximum number of events to keep.', ge=1
)
model_config = {'extra': 'forbid'}
class LLMSummarizingCondenserConfig(BaseModel):
"""Configuration for LLMCondenser."""
@@ -49,13 +56,17 @@ class LLMSummarizingCondenserConfig(BaseModel):
# at least one event by default, because the best guess is that it's the user task
keep_first: int = Field(
default=1,
description='The number of initial events to condense.',
description='Number of initial events to always keep in history.',
ge=0,
)
max_size: int = Field(
default=10, description='Maximum number of events to keep.', ge=1
default=100,
description='Maximum size of the condensed history before triggering forgetting.',
ge=2,
)
model_config = {'extra': 'forbid'}
class AmortizedForgettingCondenserConfig(BaseModel):
"""Configuration for AmortizedForgettingCondenser."""
@@ -74,6 +85,8 @@ class AmortizedForgettingCondenserConfig(BaseModel):
ge=0,
)
model_config = {'extra': 'forbid'}
class LLMAttentionCondenserConfig(BaseModel):
"""Configuration for LLMAttentionCondenser."""
@@ -95,7 +108,10 @@ class LLMAttentionCondenserConfig(BaseModel):
ge=0,
)
model_config = {'extra': 'forbid'}
# Type alias for convenience
CondenserConfig = (
NoOpCondenserConfig
| ObservationMaskingCondenserConfig
@@ -104,3 +120,121 @@ CondenserConfig = (
| AmortizedForgettingCondenserConfig
| LLMAttentionCondenserConfig
)
def condenser_config_from_toml_section(
data: dict, llm_configs: dict | None = None
) -> dict[str, CondenserConfig]:
"""
Create a CondenserConfig instance from a toml dictionary representing the [condenser] section.
For CondenserConfig, the handling is different since it's a union type. The type of condenser
is determined by the 'type' field in the section.
Example:
Parse condenser config like:
[condenser]
type = "noop"
For condensers that require an LLM config, you can specify the name of an LLM config:
[condenser]
type = "llm"
llm_config = "my_llm" # References [llm.my_llm] section
Args:
data: The TOML dictionary representing the [condenser] section.
llm_configs: Optional dictionary of LLMConfig objects keyed by name.
Returns:
dict[str, CondenserConfig]: A mapping where the key "condenser" corresponds to the configuration.
"""
# Initialize the result mapping
condenser_mapping: dict[str, CondenserConfig] = {}
# Process config
try:
# Determine which condenser type to use based on 'type' field
condenser_type = data.get('type', 'noop')
# Handle LLM config reference if needed
if (
condenser_type in ('llm', 'llm_attention')
and 'llm_config' in data
and isinstance(data['llm_config'], str)
):
llm_config_name = data['llm_config']
if llm_configs and llm_config_name in llm_configs:
# Replace the string reference with the actual LLMConfig object
data_copy = data.copy()
data_copy['llm_config'] = llm_configs[llm_config_name]
config = create_condenser_config(condenser_type, data_copy)
else:
logger.openhands_logger.warning(
f"LLM config '{llm_config_name}' not found for condenser. Using default LLMConfig."
)
# Create a default LLMConfig if the referenced one doesn't exist
data_copy = data.copy()
# Try to use the fallback 'llm' config
if llm_configs is not None:
data_copy['llm_config'] = llm_configs.get('llm')
config = create_condenser_config(condenser_type, data_copy)
else:
config = create_condenser_config(condenser_type, data)
condenser_mapping['condenser'] = config
except (ValidationError, ValueError) as e:
logger.openhands_logger.warning(
f'Invalid condenser configuration: {e}. Using NoOpCondenserConfig.'
)
# Default to NoOpCondenserConfig if config fails
config = NoOpCondenserConfig()
condenser_mapping['condenser'] = config
return condenser_mapping
# For backward compatibility
from_toml_section = condenser_config_from_toml_section
def create_condenser_config(condenser_type: str, data: dict) -> CondenserConfig:
"""
Create a CondenserConfig instance based on the specified type.
Args:
condenser_type: The type of condenser to create.
data: The configuration data.
Returns:
A CondenserConfig instance.
Raises:
ValueError: If the condenser type is unknown.
ValidationError: If the provided data fails validation for the condenser type.
"""
# Mapping of condenser types to their config classes
condenser_classes = {
'noop': NoOpCondenserConfig,
'observation_masking': ObservationMaskingCondenserConfig,
'recent': RecentEventsCondenserConfig,
'llm': LLMSummarizingCondenserConfig,
'amortized': AmortizedForgettingCondenserConfig,
'llm_attention': LLMAttentionCondenserConfig,
}
if condenser_type not in condenser_classes:
raise ValueError(f'Unknown condenser type: {condenser_type}')
# Create and validate the config using direct instantiation
# Explicitly handle ValidationError to provide more context
try:
config_class = condenser_classes[condenser_type]
# Use type casting to help mypy understand the return type
return cast(CondenserConfig, config_class(**data))
except ValidationError as e:
# Just re-raise with a more descriptive message, but don't try to pass the errors
# which can cause compatibility issues with different pydantic versions
raise ValueError(
f"Validation failed for condenser type '{condenser_type}': {e}"
) from e
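
The type-keyed dispatch and no-op fallback added in this file can be sketched without pydantic. The following is a simplified stand-in, not the project's code: the `*Sketch` names are hypothetical, plain dataclasses replace the pydantic models, and the `TypeError` raised for an unknown keyword argument plays the role that `extra='forbid'` validation plays in the real configs.

```python
from dataclasses import dataclass


@dataclass
class NoOpSketch:
    type: str = "noop"


@dataclass
class RecentSketch:
    type: str = "recent"
    keep_first: int = 0
    max_events: int = 100


# Hypothetical registry, analogous to the condenser_classes mapping
SKETCH_CLASSES = {"noop": NoOpSketch, "recent": RecentSketch}


def create_sketch(condenser_type: str, data: dict):
    """Instantiate the class registered for condenser_type."""
    if condenser_type not in SKETCH_CLASSES:
        raise ValueError(f"Unknown condenser type: {condenser_type}")
    try:
        return SKETCH_CLASSES[condenser_type](**data)
    except TypeError as e:
        # Unknown keys are rejected, then surfaced as a ValueError
        raise ValueError(
            f"Validation failed for condenser type '{condenser_type}': {e}"
        ) from e


def from_section_sketch(data: dict) -> dict:
    """Parse a [condenser]-like dict, falling back to the no-op config."""
    try:
        config = create_sketch(data.get("type", "noop"), data)
    except ValueError:
        config = NoOpSketch()
    return {"condenser": config}


print(from_section_sketch({"type": "recent", "max_events": 5}))
print(from_section_sketch({"type": "bogus"}))
```

A missing `type` key and an unknown type both end up as the no-op config. The difference is that the unknown type gets there through the error path, which in the real function also logs a warning.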

@@ -48,7 +48,7 @@ class LLMConfig(BaseModel):
reasoning_effort: The effort to put into reasoning. This is a string that can be one of 'low', 'medium', 'high', or 'none'. Exclusive for o1 models.
"""
model: str = Field(default='claude-3-5-sonnet-20241022')
model: str = Field(default='claude-3-7-sonnet-20250219')
api_key: SecretStr | None = Field(default=None)
base_url: str | None = Field(default=None)
api_version: str | None = Field(default=None)

@@ -12,9 +12,11 @@ import toml
from dotenv import load_dotenv
from pydantic import BaseModel, SecretStr, ValidationError
from openhands import __version__
from openhands.core import logger
from openhands.core.config.agent_config import AgentConfig
from openhands.core.config.app_config import AppConfig
from openhands.core.config.condenser_config import condenser_config_from_toml_section
from openhands.core.config.config_utils import (
OH_DEFAULT_AGENT,
OH_MAX_ITERATIONS,
@@ -193,6 +195,44 @@ def load_from_toml(cfg: AppConfig, toml_file: str = 'config.toml') -> None:
# Re-raise ValueError from SandboxConfig.from_toml_section
raise ValueError('Error in [sandbox] section in config.toml')
# Process condenser section if present
if 'condenser' in toml_config:
try:
# Pass the LLM configs to the condenser config parser
condenser_mapping = condenser_config_from_toml_section(
toml_config['condenser'], cfg.llms
)
# Assign the default condenser configuration to the default agent configuration
if 'condenser' in condenser_mapping:
# Get the default agent config and assign the condenser config to it
default_agent_config = cfg.get_agent_config()
default_agent_config.condenser = condenser_mapping['condenser']
logger.openhands_logger.debug(
'Default condenser configuration loaded from config toml and assigned to default agent'
)
except (TypeError, KeyError, ValidationError) as e:
logger.openhands_logger.warning(
f'Cannot parse [condenser] config from toml, values have not been applied.\nError: {e}'
)
# If no condenser section is in toml but enable_default_condenser is True,
# set LLMSummarizingCondenserConfig as default
elif cfg.enable_default_condenser:
from openhands.core.config.condenser_config import LLMSummarizingCondenserConfig
# Get default agent config
default_agent_config = cfg.get_agent_config()
# Create default LLM summarizing condenser config
default_condenser = LLMSummarizingCondenserConfig(
llm_config=cfg.get_llm_config(), # Use default LLM config
)
# Set as default condenser
default_agent_config.condenser = default_condenser
logger.openhands_logger.debug(
'Default LLM summarizing condenser assigned to default agent (no condenser in config)'
)
# Process extended section if present
if 'extended' in toml_config:
try:
@@ -203,7 +243,15 @@ def load_from_toml(cfg: AppConfig, toml_file: str = 'config.toml') -> None:
)
# Check for unknown sections
known_sections = {'core', 'extended', 'agent', 'llm', 'security', 'sandbox'}
known_sections = {
'core',
'extended',
'agent',
'llm',
'security',
'sandbox',
'condenser',
}
for key in toml_config:
if key.lower() not in known_sections:
logger.openhands_logger.warning(f'Unknown section [{key}] in {toml_file}')
@@ -492,8 +540,6 @@ def parse_arguments() -> argparse.Namespace:
args = parser.parse_args()
if args.version:
from openhands import __version__
print(f'OpenHands version: {__version__}')
sys.exit(0)

@@ -20,6 +20,7 @@ from openhands.core.setup import (
create_controller,
create_runtime,
generate_sid,
initialize_repository_for_runtime,
)
from openhands.events import EventSource, EventStreamSubscriber
from openhands.events.action import MessageAction, NullAction
@@ -29,6 +30,7 @@ from openhands.events.observation import AgentStateChangedObservation
from openhands.events.serialization import event_from_dict
from openhands.io import read_input, read_task
from openhands.runtime.base import Runtime
from openhands.utils.async_utils import call_async_from_sync
class FakeUserResponseFunc(Protocol):
@@ -97,8 +99,17 @@ async def run_controller(
sid=sid,
headless_mode=headless_mode,
agent=agent,
selected_repository=config.sandbox.selected_repo,
)
# Connect to the runtime
call_async_from_sync(runtime.connect)
# Initialize repository if needed
if config.sandbox.selected_repo:
initialize_repository_for_runtime(
runtime,
agent=agent,
selected_repository=config.sandbox.selected_repo,
)
event_stream = runtime.event_stream

View File

@@ -7,6 +7,7 @@ OpenHands uses its own `Message` class (`openhands/core/message.py`) which provi
## Class Structure
Our `Message` class (`openhands/core/message.py`):
```python
class Message(BaseModel):
role: Literal['user', 'system', 'assistant', 'tool']
@@ -22,13 +23,14 @@ class Message(BaseModel):
```
litellm's `Message` class (`litellm/types/utils.py`):
```python
class Message(OpenAIObject):
content: Optional[str]
content: str | None
role: Literal["assistant", "user", "system", "tool", "function"]
tool_calls: Optional[List[ChatCompletionMessageToolCall]]
function_call: Optional[FunctionCall]
audio: Optional[ChatCompletionAudioResponse] = None
tool_calls: List[ChatCompletionMessageToolCall] | None
function_call: FunctionCall | None
audio: ChatCompletionAudioResponse | None = None
```
## How It Works
@@ -36,6 +38,7 @@ class Message(OpenAIObject):
1. **Message Creation**: Our `Message` class is a Pydantic model that supports rich content (text and images) through its `content` field.
2. **Serialization**: The class uses Pydantic's `@model_serializer` to convert messages into dictionaries that litellm can understand. We have two serialization methods:
```python
def _string_serializer(self) -> dict:
# convert content to a single string
@@ -55,6 +58,7 @@ class Message(OpenAIObject):
```
The appropriate serializer is chosen based on the message's capabilities:
```python
@model_serializer
def serialize_model(self) -> dict:
@@ -64,11 +68,13 @@ class Message(OpenAIObject):
```
3. **Tool Call Handling**: Tool calls require special attention in serialization because:
- They need to work with litellm's API calls (which accept both dicts and objects)
- They need to be properly serialized for token counting
- They need to maintain compatibility with different LLM providers' formats
4. **litellm Integration**: When we pass our messages to `litellm.completion()`, litellm doesn't care about the message class type - it works with the dictionary representation. This works because:
- litellm's transformation code (e.g., `litellm/llms/anthropic/chat/transformation.py`) processes messages based on their structure, not their type
- our serialization produces dictionaries that match litellm's expected format
- litellm handles rich content by looking at the message structure, supporting both simple string content and lists of content items
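
The serializer selection described in steps 1 and 2 can be sketched as follows. This is a simplified stand-in, not the actual `Message` implementation: it uses a plain dataclass instead of Pydantic, and the `SketchMessage` name, the `vision_enabled` flag, and the content-item shapes are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SketchMessage:
    """Simplified stand-in for the Message class (assumption: not the real model)."""

    role: str
    # Content items are assumed to look like {'type': 'text', 'text': '...'}.
    content: list[dict] = field(default_factory=list)
    vision_enabled: bool = False  # hypothetical capability flag

    def _string_serializer(self) -> dict:
        # Collapse all text items into one string for providers
        # that only accept plain-string content.
        text = '\n'.join(
            item['text'] for item in self.content if item.get('type') == 'text'
        )
        return {'role': self.role, 'content': text}

    def _list_serializer(self) -> dict:
        # Keep the content as a list of typed items (text, images, ...).
        return {'role': self.role, 'content': list(self.content)}

    def serialize(self) -> dict:
        # Choose the serializer based on what the target model supports.
        if self.vision_enabled:
            return self._list_serializer()
        return self._string_serializer()
```

Either way the result is a plain dictionary, which is the only thing the downstream call needs to see.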
@@ -78,6 +84,7 @@ class Message(OpenAIObject):
### Token Counting
To use litellm's token counter, we need to make sure that all message components (including tool calls) are properly serialized to dictionaries. This is because:
- litellm's token counter expects dictionary structures
- Tool calls need to be included in the token count
- Different providers may count tokens differently for structured content
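
A minimal sketch of the requirement above: every tool call is converted to a plain dictionary before the message reaches a token counter. `ToolCall`, `message_to_dict`, and `rough_token_count` are hypothetical names, and the counter is a whitespace-based stand-in rather than litellm's tokenizer.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolCall:
    """Hypothetical tool-call object used only for this sketch."""

    id: str
    name: str
    arguments: str  # JSON-encoded arguments

    def to_dict(self) -> dict:
        # OpenAI-style tool call layout.
        return {
            'id': self.id,
            'type': 'function',
            'function': {'name': self.name, 'arguments': self.arguments},
        }


def message_to_dict(role: str, content: str, tool_calls: list[ToolCall]) -> dict:
    # Serialize nested tool calls too, so the result contains only
    # plain dicts, lists, and strings.
    message: dict = {'role': role, 'content': content}
    if tool_calls:
        message['tool_calls'] = [call.to_dict() for call in tool_calls]
    return message


def rough_token_count(message: dict) -> int:
    # Stand-in counter (assumption): counts whitespace-separated pieces of the
    # JSON dump. json.dumps only succeeds because the message is fully serialized.
    return len(json.dumps(message).split())
```

If a `ToolCall` object were left in the message unserialized, `json.dumps` would raise a `TypeError`, which mirrors why a dictionary-based counter needs fully serialized input.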

View File

@@ -21,7 +21,6 @@ from openhands.runtime import get_runtime_cls
from openhands.runtime.base import Runtime
from openhands.security import SecurityAnalyzer, options
from openhands.storage import get_file_store
from openhands.utils.async_utils import call_async_from_sync
def create_runtime(
@@ -29,18 +28,19 @@ def create_runtime(
sid: str | None = None,
headless_mode: bool = True,
agent: Agent | None = None,
selected_repository: str | None = None,
github_token: SecretStr | None = None,
) -> Runtime:
"""Create a runtime for the agent to run on.
config: The app config.
sid: (optional) The session id. IMPORTANT: please don't set this unless you know what you're doing.
Setting it to an incompatible value will cause unexpected behavior on RemoteRuntime.
headless_mode: Whether the agent is run in headless mode. `create_runtime` is typically called within evaluation scripts,
where we don't want to have the VSCode UI open, so it defaults to True.
selected_repository: (optional) The GitHub repository to use.
github_token: (optional) The GitHub token to use.
Args:
config: The app config.
sid: (optional) The session id. IMPORTANT: please don't set this unless you know what you're doing.
Setting it to an incompatible value will cause unexpected behavior on RemoteRuntime.
headless_mode: Whether the agent is run in headless mode. `create_runtime` is typically called within evaluation scripts,
where we don't want to have the VSCode UI open, so it defaults to True.
agent: (optional) The agent instance to use for configuring the runtime.
Returns:
The created Runtime instance (not yet connected or initialized).
"""
# if sid is provided on the command line, use it as the name of the event stream
# otherwise generate it on the basis of the configured jwt_secret
@@ -74,8 +74,30 @@ def create_runtime(
headless_mode=headless_mode,
)
call_async_from_sync(runtime.connect)
logger.debug(
f'Runtime created with plugins: {[plugin.name for plugin in runtime.plugins]}'
)
return runtime
def initialize_repository_for_runtime(
runtime: Runtime,
agent: Agent | None = None,
selected_repository: str | None = None,
github_token: SecretStr | None = None,
) -> str | None:
"""Initialize the repository for the runtime.
Args:
runtime: The runtime to initialize the repository for.
agent: (optional) The agent to load microagents for.
selected_repository: (optional) The GitHub repository to use.
github_token: (optional) The GitHub token to use.
Returns:
The repository directory path if a repository was cloned, None otherwise.
"""
# clone selected repository if provided
repo_directory = None
github_token = (
@@ -98,11 +120,7 @@ def create_runtime(
agent.prompt_manager.load_microagents(microagents)
agent.prompt_manager.set_repository_info(selected_repository, repo_directory)
logger.debug(
f'Runtime initialized with plugins: {[plugin.name for plugin in runtime.plugins]}'
)
return runtime
return repo_directory
def create_agent(config: AppConfig) -> Agent:

View File

@@ -428,3 +428,41 @@ class EventStream:
break
return matching_events
def get_metrics(self):
"""Get the accumulated metrics from all events in the stream.
This method extracts metrics from events that contain them and returns
the aggregated metrics object.
Returns:
Metrics: The metrics object containing accumulated cost and token usage data.
Returns None if no metrics are found.
"""
from openhands.llm.metrics import Metrics
# Look for events with metrics
metrics = None
events_with_metrics = []
try:
# First collect all events with metrics
for event in self.get_events():
if hasattr(event, 'llm_metrics') and event.llm_metrics is not None:
events_with_metrics.append(event)
# Then merge them if any were found
if events_with_metrics:
# Get the first event with metrics to initialize our metrics object
first_event = events_with_metrics[0]
if first_event.llm_metrics is not None:
metrics = Metrics(model_name=first_event.llm_metrics.model_name)
# Merge metrics from all events
for event in events_with_metrics:
if event.llm_metrics is not None:
metrics.merge(event.llm_metrics)
except Exception as e:
logger.error(f'Error retrieving metrics from events: {e}')
return metrics

View File

@@ -51,6 +51,9 @@ class GitHubService:
async def get_latest_token(self) -> SecretStr:
return self.token
async def get_latest_provider_token(self) -> SecretStr:
return self.token
async def _fetch_data(
self, url: str, params: dict | None = None
) -> tuple[Any, dict]:

View File

@@ -76,6 +76,8 @@ REASONING_EFFORT_SUPPORTED_MODELS = [
MODELS_WITHOUT_STOP_WORDS = [
'o1-mini',
'o1-preview',
'o1',
'o1-2024-12-17',
]
@@ -217,9 +219,8 @@ class LLM(RetryMixin, DebugMixin):
kwargs['stop'] = STOP_WORDS
mock_fncall_tools = kwargs.pop('tools')
kwargs['tool_choice'] = (
'none' # force no tool calling because we're mocking it - without it, it will cause issue with sglang
)
# tool_choice should not be specified when mocking function calling
kwargs.pop('tool_choice', None)
# if we have no messages, something went very wrong
if not messages:

View File

@@ -1,4 +1,4 @@
import openhands.memory.condenser.impl # noqa F401 (we import this to get the condensers registered)
from openhands.memory.condenser.condenser import Condenser, get_condensation_metadata
__all__ = ['Condenser', 'get_condensation_metadata']
__all__ = ['Condenser', 'get_condensation_metadata', 'CONDENSER_REGISTRY']

View File

@@ -202,7 +202,7 @@ async def process_issue(
timeout=300,
)
if os.getenv('GITLAB_CI') == 'True':
if os.getenv('GITLAB_CI') == 'true':
sandbox_config.local_runtime_url = os.getenv(
'LOCAL_RUNTIME_URL', 'http://localhost'
)
@@ -651,7 +651,7 @@ def main() -> None:
if not token:
raise ValueError('Token is required.')
platform = identify_token(token)
platform = identify_token(token, repo)
if platform == Platform.INVALID:
raise ValueError('Token is invalid.')

View File

@@ -22,18 +22,37 @@ class Platform(Enum):
GITLAB = 2
def identify_token(token: str) -> Platform:
def identify_token(token: str, repo: str | None = None) -> Platform:
"""
Identifies whether a token belongs to GitHub or GitLab.
Parameters:
token (str): The personal access token to check.
repo (str | None): Optional repository in the format "owner/repo", used to validate GitHub Actions tokens.
Returns:
Platform: Platform.GITHUB if the token is valid for GitHub,
Platform.GITLAB if the token is valid for GitLab,
Platform.INVALID if the token is not recognized by either.
"""
# Try GitHub Actions token format (Bearer) with repo endpoint if repo is provided
if repo:
github_repo_url = f'https://api.github.com/repos/{repo}'
github_bearer_headers = {
'Authorization': f'Bearer {token}',
'Accept': 'application/vnd.github+json',
}
try:
github_repo_response = requests.get(
github_repo_url, headers=github_bearer_headers, timeout=5
)
if github_repo_response.status_code == 200:
return Platform.GITHUB
except requests.RequestException as e:
print(f'Error connecting to GitHub API (repo check): {e}')
# Try GitHub PAT format (token)
github_url = 'https://api.github.com/user'
github_headers = {'Authorization': f'token {token}'}
@@ -44,6 +63,7 @@ def identify_token(token: str) -> Platform:
except requests.RequestException as e:
print(f'Error connecting to GitHub API: {e}')
# Try GitLab token
gitlab_url = 'https://gitlab.com/api/v4/user'
gitlab_headers = {'Authorization': f'Bearer {token}'}

View File

@@ -222,7 +222,7 @@ class Runtime(FileEditRuntimeMixin):
if isinstance(event, CmdRunAction):
if self.github_user_id and '$GITHUB_TOKEN' in event.command:
gh_client = GithubServiceImpl(user_id=self.github_user_id)
token = await gh_client.get_latest_token()
token = await gh_client.get_latest_provider_token()
if token:
export_cmd = CmdRunAction(
f"export GITHUB_TOKEN='{token.get_secret_value()}'"

View File

@@ -19,16 +19,19 @@ class DockerRuntimeBuilder(RuntimeBuilder):
version_info = self.docker_client.version()
server_version = version_info.get('Version', '').replace('-', '.')
self.is_podman = version_info.get('Components')[0].get('Name').startswith('Podman')
if tuple(map(int, server_version.split('.')[:2])) < (18, 9) and not self.is_podman:
self.is_podman = (
version_info.get('Components')[0].get('Name').startswith('Podman')
)
if (
tuple(map(int, server_version.split('.')[:2])) < (18, 9)
and not self.is_podman
):
raise AgentRuntimeBuildError(
'Docker server version must be >= 18.09 to use BuildKit'
)
if self.is_podman and tuple(map(int, server_version.split('.')[:2])) < (4, 9):
raise AgentRuntimeBuildError(
'Podman server version must be >= 4.9.0'
)
raise AgentRuntimeBuildError('Podman server version must be >= 4.9.0')
self.rolling_logger = RollingLogger(max_lines=10)
@@ -37,7 +40,9 @@ class DockerRuntimeBuilder(RuntimeBuilder):
"""Check if Docker Buildx is available"""
try:
result = subprocess.run(
['docker' if not is_podman else 'podman', 'buildx', 'version'], capture_output=True, text=True
['docker' if not is_podman else 'podman', 'buildx', 'version'],
capture_output=True,
text=True,
)
return result.returncode == 0
except FileNotFoundError:
@@ -74,16 +79,16 @@ class DockerRuntimeBuilder(RuntimeBuilder):
self.docker_client = docker.from_env()
version_info = self.docker_client.version()
server_version = version_info.get('Version', '').split('+')[0].replace('-', '.')
self.is_podman = version_info.get('Components')[0].get('Name').startswith('Podman')
self.is_podman = (
version_info.get('Components')[0].get('Name').startswith('Podman')
)
if tuple(map(int, server_version.split('.'))) < (18, 9) and not self.is_podman:
raise AgentRuntimeBuildError(
'Docker server version must be >= 18.09 to use BuildKit'
)
if self.is_podman and tuple(map(int, server_version.split('.'))) < (4, 9):
raise AgentRuntimeBuildError(
'Podman server version must be >= 4.9.0'
)
raise AgentRuntimeBuildError('Podman server version must be >= 4.9.0')
if not DockerRuntimeBuilder.check_buildx(self.is_podman):
# when running openhands in a container, there might not be a "docker"

View File

@@ -2,22 +2,80 @@
[Daytona](https://www.daytona.io/) is a platform that provides secure, elastic infrastructure for running AI-generated code, with the features an AI agent needs to interact with a codebase. The Daytona SDK offers official Python and TypeScript interfaces for programmatically managing development environments and executing code.
## Getting started
## Quick Start
1. Sign in at https://app.daytona.io/
1. Generate and copy your API key
1. Set the following environment variables before running the OpenHands app on your local machine or via a `docker run` command:
### Step 1: Retrieve Your Daytona API Key
1. Visit the [Daytona Dashboard](https://app.daytona.io/dashboard/keys).
2. Click **"Create Key"**.
3. Enter a name for your key and confirm the creation.
4. Once the key is generated, copy it.
### Step 2: Set Your API Key as an Environment Variable
Run the following command in your terminal, replacing `<your-api-key>` with the actual key you copied:
```bash
RUNTIME="daytona"
DAYTONA_API_KEY="<your-api-key>"
export DAYTONA_API_KEY="<your-api-key>"
```
Optionally, if you don't want your sandboxes to default to the US region, set:
This step ensures that OpenHands can authenticate with the Daytona platform when it runs.
### Step 3: Run OpenHands Locally Using Docker
To start the latest version of OpenHands on your machine, execute the following command in your terminal:
```bash
bash -i <(curl -sL https://get.daytona.io/openhands)
```
#### What This Command Does
- Downloads the latest OpenHands release script.
- Runs the script in an interactive Bash session.
- Automatically pulls and runs the OpenHands container using Docker.
Once executed, OpenHands should be running locally and ready for use.
## Manual Initialization
### Step 1: Set the `OPENHANDS_VERSION` Environment Variable
Run the following command in your terminal, replacing `<openhands-release>` with the latest release version listed in the [main README.md file](https://github.com/All-Hands-AI/OpenHands?tab=readme-ov-file#-quick-start):
```bash
DAYTONA_TARGET="eu"
export OPENHANDS_VERSION="<openhands-release>" # e.g. 0.27
```
### Step 2: Retrieve Your Daytona API Key
1. Visit the [Daytona Dashboard](https://app.daytona.io/dashboard/keys).
2. Click **"Create Key"**.
3. Enter a name for your key and confirm the creation.
4. Once the key is generated, copy it.
### Step 3: Set Your API Key as an Environment Variable
Run the following command in your terminal, replacing `<your-api-key>` with the actual key you copied:
```bash
export DAYTONA_API_KEY="<your-api-key>"
```
### Step 4: Run the `docker` Command
This command pulls and runs the OpenHands container using Docker. Once executed, OpenHands should be running locally and ready for use.
```bash
docker run -it --rm --pull=always \
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:${OPENHANDS_VERSION}-nikolaik \
-e LOG_ALL_EVENTS=true \
-e RUNTIME=daytona \
-e DAYTONA_API_KEY=${DAYTONA_API_KEY} \
-v ~/.openhands-state:/.openhands-state \
-p 3000:3000 \
--name openhands-app \
docker.all-hands.dev/all-hands-ai/openhands:${OPENHANDS_VERSION}
```
> **Tip:** If you don't want your sandboxes to default to the US region, set the `DAYTONA_TARGET` environment variable to `eu`.
### Running OpenHands Locally Without Docker
Alternatively, if you want to run the OpenHands app on your local machine using `make run` without Docker, make sure to set the following environment variables first:
```bash
export RUNTIME="daytona"
export DAYTONA_API_KEY="<your-api-key>"
```
## Documentation

View File

@@ -333,7 +333,10 @@ class DockerRuntime(ActionExecutionClient):
if exposed_ports:
for exposed_port in exposed_ports.keys():
exposed_port = int(exposed_port.split('/tcp')[0])
if exposed_port != self._host_port and exposed_port != self._vscode_port:
if (
exposed_port != self._host_port
and exposed_port != self._vscode_port
):
self._app_ports.append(exposed_port)
self.api_url = f'{self.config.sandbox.local_runtime_url}:{self._container_port}'

View File

@@ -1,4 +1,4 @@
from typing import Callable, Optional
from typing import Callable
from openhands.core.config import AppConfig
from openhands.events.action import (
@@ -27,7 +27,7 @@ class E2BRuntime(Runtime):
sid: str = 'default',
plugins: list[PluginRequirement] | None = None,
sandbox: E2BSandbox | None = None,
status_callback: Optional[Callable] = None,
status_callback: Callable | None = None,
):
super().__init__(
config,

View File

@@ -7,7 +7,7 @@ import shutil
import subprocess
import tempfile
import threading
from typing import Callable, Optional
from typing import Callable
import requests
import tenacity
@@ -155,7 +155,7 @@ class LocalRuntime(ActionExecutionClient):
self.api_url = f'{self.config.sandbox.local_runtime_url}:{self._host_port}'
self.status_callback = status_callback
self.server_process: Optional[subprocess.Popen[str]] = None
self.server_process: subprocess.Popen[str] | None = None
self.action_semaphore = threading.Semaphore(1) # Ensure one action at a time
# Update env vars

View File

@@ -1,5 +1,5 @@
import os
from typing import Callable, Optional
from typing import Callable
from urllib.parse import urlparse
import requests
@@ -42,7 +42,7 @@ class RemoteRuntime(ActionExecutionClient):
sid: str = 'default',
plugins: list[PluginRequirement] | None = None,
env_vars: dict[str, str] | None = None,
status_callback: Optional[Callable] = None,
status_callback: Callable | None = None,
attach_to_existing: bool = False,
headless_mode: bool = True,
github_user_id: str | None = None,

View File

@@ -45,4 +45,4 @@ This extension is part of the OpenHands project. To modify or extend it:
## License
This extension is licensed under the MIT license.
This extension is licensed under the MIT license.

View File

@@ -4,30 +4,30 @@ const MemoryMonitor = require('./memory_monitor');
function activate(context) {
// Create memory monitor instance
const memoryMonitor = new MemoryMonitor();
// Store the context in the memory monitor
memoryMonitor.context = context;
// Register memory monitor start command
let startMonitorCommand = vscode.commands.registerCommand('openhands-memory-monitor.startMemoryMonitor', function () {
memoryMonitor.start();
});
// Register memory monitor stop command
let stopMonitorCommand = vscode.commands.registerCommand('openhands-memory-monitor.stopMemoryMonitor', function () {
memoryMonitor.stop();
});
// Register memory details command
let showMemoryDetailsCommand = vscode.commands.registerCommand('openhands-memory-monitor.showMemoryDetails', function () {
memoryMonitor.showDetails();
});
// Add all commands to subscriptions
context.subscriptions.push(startMonitorCommand);
context.subscriptions.push(stopMonitorCommand);
context.subscriptions.push(showMemoryDetailsCommand);
// Start memory monitoring by default
memoryMonitor.start();
}
@@ -39,4 +39,4 @@ function deactivate() {
module.exports = {
activate,
deactivate
}
}

View File

@@ -21,15 +21,15 @@ class MemoryMonitor {
this.isMonitoring = true;
this.statusBarItem.show();
// Initial update
this.updateMemoryInfo();
// Set interval for updates
this.intervalId = setInterval(() => {
this.updateMemoryInfo();
}, interval);
vscode.window.showInformationMessage('Memory monitoring started');
}
@@ -41,7 +41,7 @@ class MemoryMonitor {
this.isMonitoring = false;
clearInterval(this.intervalId);
this.statusBarItem.hide();
vscode.window.showInformationMessage('Memory monitoring stopped');
}
@@ -49,18 +49,18 @@ class MemoryMonitor {
const totalMem = os.totalmem();
const freeMem = os.freemem();
const usedMem = totalMem - freeMem;
// Calculate memory usage percentage
const memUsagePercent = Math.round((usedMem / totalMem) * 100);
// Format memory values to MB
const usedMemMB = Math.round(usedMem / (1024 * 1024));
const totalMemMB = Math.round(totalMem / (1024 * 1024));
// Update status bar
this.statusBarItem.text = `$(pulse) Mem: ${memUsagePercent}%`;
this.statusBarItem.tooltip = `Memory Usage: ${usedMemMB}MB / ${totalMemMB}MB`;
// Store memory data in history
this.memoryHistory.push({
timestamp: new Date(),
@@ -69,7 +69,7 @@ class MemoryMonitor {
memUsagePercent,
processMemory: process.memoryUsage()
});
// Limit history length
if (this.memoryHistory.length > this.maxHistoryLength) {
this.memoryHistory.shift();
@@ -86,7 +86,7 @@ class MemoryMonitor {
enableScripts: true
}
);
// Set up message handler for real-time updates
panel.webview.onDidReceiveMessage(
message => {
@@ -97,60 +97,60 @@ class MemoryMonitor {
undefined,
this.context ? this.context.subscriptions : []
);
// Initial update
this.updateWebviewContent(panel);
// Handle panel disposal
panel.onDidDispose(() => {
// Clean up any resources if needed
}, null, this.context ? this.context.subscriptions : []);
}
updateWebviewContent(panel) {
// Get system memory info
const totalMem = os.totalmem();
const freeMem = os.freemem();
const usedMem = totalMem - freeMem;
// Format memory values
const usedMemMB = Math.round(usedMem / (1024 * 1024));
const freeMemMB = Math.round(freeMem / (1024 * 1024));
const totalMemMB = Math.round(totalMem / (1024 * 1024));
// Get process memory usage
const processMemory = process.memoryUsage();
const rss = Math.round(processMemory.rss / (1024 * 1024));
const heapTotal = Math.round(processMemory.heapTotal / (1024 * 1024));
const heapUsed = Math.round(processMemory.heapUsed / (1024 * 1024));
// Get process information
this.processMonitor.getProcessInfo((error, processInfo) => {
if (error) {
console.error('Error getting process info:', error);
return;
}
// Create HTML content for the webview
const htmlContent = this.generateHtmlReport(
usedMemMB, freeMemMB, totalMemMB,
usedMemMB, freeMemMB, totalMemMB,
rss, heapTotal, heapUsed,
processInfo
);
// Set the webview's HTML content
panel.webview.html = htmlContent;
});
}
generateHtmlReport(usedMemMB, freeMemMB, totalMemMB, rss, heapTotal, heapUsed, processInfo) {
// Create memory usage history data for chart
const memoryLabels = this.memoryHistory.map((entry, index) => index);
const memoryData = this.memoryHistory.map(entry => entry.memUsagePercent);
const heapData = this.memoryHistory.map(entry =>
const heapData = this.memoryHistory.map(entry =>
Math.round(entry.processMemory.heapUsed / (1024 * 1024))
);
// Format process info table
let processTable = '';
if (processInfo && processInfo.processes) {
@@ -174,7 +174,7 @@ class MemoryMonitor {
</table>
`;
}
return `
<!DOCTYPE html>
<html lang="en">
@@ -237,7 +237,7 @@ class MemoryMonitor {
</head>
<body>
<h1>Memory Monitor</h1>
<div class="memory-card">
<h2>System Memory</h2>
<div class="memory-info">
@@ -259,7 +259,7 @@ class MemoryMonitor {
</div>
</div>
</div>
<div class="memory-card">
<h2>Process Memory (VSCode Extension Host)</h2>
<div class="memory-info">
@@ -277,18 +277,18 @@ class MemoryMonitor {
</div>
</div>
</div>
<div class="memory-card">
<h2>Memory Usage History</h2>
<div class="chart-container">
<canvas id="memoryChart"></canvas>
</div>
</div>
<div class="memory-card">
${processTable}
</div>
<script>
// Create memory usage chart
const ctx = document.getElementById('memoryChart').getContext('2d');
@@ -323,10 +323,10 @@ class MemoryMonitor {
}
}
});
// Set up real-time updates
const vscode = acquireVsCodeApi();
// Request updates every 5 seconds
setInterval(() => {
vscode.postMessage({
@@ -340,4 +340,4 @@ class MemoryMonitor {
}
}
module.exports = MemoryMonitor;
module.exports = MemoryMonitor;

View File

@@ -44,7 +44,7 @@ class ProcessMonitor {
const memPercent = parseFloat(parts[parts.length - 2]);
const cpuPercent = parseFloat(parts[parts.length - 1]);
const cmd = parts.slice(2, parts.length - 2).join(' ');
return {
pid,
ppid,
@@ -80,7 +80,7 @@ class ProcessMonitor {
const memPercent = parseFloat(parts[parts.length - 2]);
const cpuPercent = parseFloat(parts[parts.length - 1]);
const cmd = parts.slice(2, parts.length - 2).join(' ');
return {
pid,
ppid,
@@ -109,21 +109,21 @@ class ProcessMonitor {
// Parse the CSV output
const lines = stdout.trim().split('\n');
const header = "PID,PPID,Command,Memory (bytes)";
// Skip empty lines and the header
const dataLines = lines.filter(line => line.trim() !== '' && !line.includes('Node,'));
const processes = dataLines.map(line => {
const parts = line.split(',');
if (parts.length < 4) return null;
// Last part is the node name, then ProcessId, ParentProcessId, CommandLine, WorkingSetSize
const pid = parts[parts.length - 4];
const ppid = parts[parts.length - 3];
const cmd = parts[parts.length - 2];
const memBytes = parseInt(parts[parts.length - 1], 10);
const memPercent = (memBytes / os.totalmem() * 100).toFixed(1);
return {
pid,
ppid,
@@ -140,4 +140,4 @@ class ProcessMonitor {
}
}
module.exports = ProcessMonitor;
module.exports = ProcessMonitor;

View File

@@ -1,5 +1,5 @@
import json
from typing import Any, Literal, Optional
from typing import Any, Literal
import requests
from pydantic import BaseModel
@@ -15,7 +15,7 @@ class FeedbackDataModel(BaseModel):
'positive', 'negative'
] # TODO: remove this; it's here for backward compatibility
permissions: Literal['public', 'private']
trajectory: Optional[list[dict[str, Any]]]
trajectory: list[dict[str, Any]] | None
FEEDBACK_URL = 'https://share-od-trajectory-3u9bw9tx.uc.gateway.dev/share_od_trajectory'

View File

@@ -7,6 +7,70 @@ from openhands.runtime.base import Runtime
app = APIRouter(prefix='/api/conversations/{conversation_id}')
@app.get('/metrics')
async def get_conversation_metrics(request: Request):
"""Retrieve the conversation metrics.
This endpoint returns the accumulated cost and token usage metrics for the conversation.
Args:
request (Request): The incoming FastAPI request object.
Returns:
JSONResponse: A JSON response containing the metrics data.
"""
try:
if not hasattr(request.state, 'conversation'):
return JSONResponse(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
content={'error': 'No conversation found in request state'},
)
event_stream = request.state.conversation.event_stream
# Get metrics from the event stream
metrics = (
event_stream.get_metrics() if hasattr(event_stream, 'get_metrics') else None
)
if not metrics:
# Return empty metrics if not available
return JSONResponse(
status_code=status.HTTP_200_OK,
content={
'accumulated_cost': 0.0,
'total_prompt_tokens': 0,
'total_completion_tokens': 0,
'total_tokens': 0,
},
)
# Calculate total tokens
total_prompt_tokens = sum(usage.prompt_tokens for usage in metrics.token_usages)
total_completion_tokens = sum(
usage.completion_tokens for usage in metrics.token_usages
)
total_tokens = total_prompt_tokens + total_completion_tokens
return JSONResponse(
status_code=status.HTTP_200_OK,
content={
'accumulated_cost': metrics.accumulated_cost,
'total_prompt_tokens': total_prompt_tokens,
'total_completion_tokens': total_completion_tokens,
'total_tokens': total_tokens,
},
)
except Exception as e:
logger.error(f'Error getting conversation metrics: {e}')
return JSONResponse(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
content={
'error': f'Error getting conversation metrics: {e}',
},
)
@app.get('/config')
async def get_remote_runtime_config(request: Request):
"""Retrieve the runtime configuration.

View File

@@ -11,7 +11,7 @@ app = APIRouter(prefix='/api/conversations/{conversation_id}')
@app.post('/submit-feedback')
async def submit_feedback(request: Request, conversation_id: str):
async def submit_feedback(request: Request, conversation_id: str) -> JSONResponse:
"""Submit user feedback.
This function stores the provided feedback data.

View File

@@ -15,7 +15,7 @@ from openhands.server.auth import get_github_token, get_idp_token, get_user_id
app = APIRouter(prefix='/api/github')
@app.get('/repositories')
@app.get('/repositories', response_model=list[GitHubRepository])
async def get_github_repositories(
page: int = 1,
per_page: int = 10,
@@ -47,7 +47,7 @@ async def get_github_repositories(
)
@app.get('/user')
@app.get('/user', response_model=GitHubUser)
async def get_github_user(
github_user_id: str | None = Depends(get_user_id),
github_user_token: SecretStr | None = Depends(get_github_token),
@@ -73,7 +73,7 @@ async def get_github_user(
)
@app.get('/installations')
@app.get('/installations', response_model=list[int])
async def get_github_installation_ids(
github_user_id: str | None = Depends(get_user_id),
github_user_token: SecretStr | None = Depends(get_github_token),
@@ -99,7 +99,7 @@ async def get_github_installation_ids(
)
@app.get('/search/repositories')
@app.get('/search/repositories', response_model=list[GitHubRepository])
async def search_github_repositories(
query: str,
per_page: int = 5,
@@ -131,7 +131,7 @@ async def search_github_repositories(
)
@app.get('/suggested-tasks')
@app.get('/suggested-tasks', response_model=list[SuggestedTask])
async def get_suggested_tasks(
github_user_id: str | None = Depends(get_user_id),
github_user_token: SecretStr | None = Depends(get_github_token),

View File

@@ -51,6 +51,7 @@ async def _create_new_conversation(
selected_branch: str | None,
initial_user_msg: str | None,
image_urls: list[str] | None,
attach_convo_id: bool = False,
):
monitoring_listener.on_create_conversation()
logger.info('Loading settings')
@@ -109,8 +110,13 @@ async def _create_new_conversation(
logger.info(f'Starting agent loop for conversation {conversation_id}')
initial_message_action = None
if initial_user_msg or image_urls:
user_msg = (
initial_user_msg.format(conversation_id)
if attach_convo_id and initial_user_msg
else initial_user_msg
)
initial_message_action = MessageAction(
content=initial_user_msg or '',
content=user_msg or '',
image_urls=image_urls or [],
)
event_stream = await conversation_manager.maybe_start_agent_loop(

View File

@@ -1,6 +1,8 @@
import warnings
from typing import Any
import requests
from fastapi import APIRouter
from openhands.security.options import SecurityAnalyzers
@@ -8,10 +10,6 @@ with warnings.catch_warnings():
warnings.simplefilter('ignore')
import litellm
from fastapi import (
APIRouter,
)
from openhands.controller.agent import Agent
from openhands.core.config import LLMConfig
from openhands.core.logger import openhands_logger as logger
@@ -21,7 +19,7 @@ from openhands.server.shared import config, server_config
app = APIRouter(prefix='/api/options')
@app.get('/models')
@app.get('/models', response_model=list[str])
async def get_litellm_models() -> list[str]:
"""Get all models supported by LiteLLM.
@@ -34,7 +32,7 @@ async def get_litellm_models() -> list[str]:
```
Returns:
-list: A sorted list of unique model names.
+list[str]: A sorted list of unique model names.
"""
litellm_model_list = litellm.model_list + list(litellm.model_cost.keys())
litellm_model_list_without_bedrock = bedrock.remove_error_modelId(
@@ -74,8 +72,8 @@ async def get_litellm_models() -> list[str]:
return list(sorted(set(model_list)))
-@app.get('/agents')
-async def get_agents():
+@app.get('/agents', response_model=list[str])
+async def get_agents() -> list[str]:
"""Get all agents supported by LiteLLM.
To get the agents:
@@ -84,14 +82,13 @@ async def get_agents():
```
Returns:
-list: A sorted list of agent names.
+list[str]: A sorted list of agent names.
"""
-agents = sorted(Agent.list_agents())
-return agents
+return sorted(Agent.list_agents())
@app.get('/security-analyzers')
-async def get_security_analyzers():
+@app.get('/security-analyzers', response_model=list[str])
+async def get_security_analyzers() -> list[str]:
"""Get all supported security analyzers.
To get the security analyzers:
@@ -100,15 +97,16 @@ async def get_security_analyzers():
```
Returns:
-list: A sorted list of security analyzer names.
+list[str]: A sorted list of security analyzer names.
"""
return sorted(SecurityAnalyzers.keys())
-@app.get('/config')
-async def get_config():
-"""
-Get current config
-"""
+@app.get('/config', response_model=dict[str, Any])
+async def get_config() -> dict[str, Any]:
+"""Get current config.
+Returns:
+dict[str, Any]: The current server configuration.
+"""
return server_config.get_config()

View File

@@ -2,6 +2,7 @@ from fastapi import (
APIRouter,
HTTPException,
Request,
+Response,
status,
)
@@ -9,7 +10,7 @@ app = APIRouter(prefix='/api/conversations/{conversation_id}')
@app.route('/security/{path:path}', methods=['GET', 'POST', 'PUT', 'DELETE'])
-async def security_api(request: Request):
+async def security_api(request: Request) -> Response:
"""Catch-all route for security analyzer API requests.
Each request is handled directly to the security analyzer.
@@ -18,7 +19,7 @@ async def security_api(request: Request):
request (Request): The incoming FastAPI request object.
Returns:
-Any: The response from the security analyzer.
+Response: The response from the security analyzer.
Raises:
HTTPException: If the security analyzer is not initialized.

View File

@@ -11,8 +11,8 @@ from openhands.server.shared import SettingsStoreImpl, config
app = APIRouter(prefix='/api')
-@app.get('/settings')
-async def load_settings(request: Request) -> GETSettingsModel | None:
+@app.get('/settings', response_model=GETSettingsModel)
+async def load_settings(request: Request) -> GETSettingsModel | JSONResponse:
try:
user_id = get_user_id(request)
settings_store = await SettingsStoreImpl.get_instance(config, user_id)
@@ -40,13 +40,12 @@ async def load_settings(request: Request) -> GETSettingsModel | None:
)
-@app.post('/settings')
+@app.post('/settings', response_model=dict[str, str])
async def store_settings(
request: Request,
settings: POSTSettingsModel,
) -> JSONResponse:
# Check if token is valid
if settings.github_token:
try:
# We check if the token is valid by getting the user

View File

@@ -9,7 +9,7 @@ app = APIRouter(prefix='/api/conversations/{conversation_id}')
@app.get('/trajectory')
-async def get_trajectory(request: Request):
+async def get_trajectory(request: Request) -> JSONResponse:
"""Get trajectory.
This function retrieves the current trajectory and returns it.

View File

@@ -1,6 +1,6 @@
import asyncio
import time
-from typing import Callable, Optional
+from typing import Callable
from pydantic import SecretStr
@@ -52,7 +52,7 @@ class AgentSession:
sid: str,
file_store: FileStore,
monitoring_listener: MonitoringListener,
-status_callback: Optional[Callable] = None,
+status_callback: Callable | None = None,
github_user_id: str | None = None,
):
"""Initializes a new instance of the Session class

View File

@@ -1,5 +1,5 @@
import os
-from typing import List, Optional
+from typing import List
from google.api_core.exceptions import NotFound
from google.cloud import storage
@@ -8,7 +8,7 @@ from openhands.storage.files import FileStore
class GoogleCloudFileStore(FileStore):
-def __init__(self, bucket_name: Optional[str] = None) -> None:
+def __init__(self, bucket_name: str | None = None) -> None:
"""
Create a new FileStore. If GOOGLE_APPLICATION_CREDENTIALS is defined in the
environment it will be used for authentication. Otherwise access will be

View File

@@ -0,0 +1,120 @@
import pytest
from unittest.mock import MagicMock, patch
from openhands.events.event import Event
from openhands.events.stream import EventStream
from openhands.llm.metrics import Metrics
class TestEventStream:
def test_get_metrics_empty_stream(self):
"""Test that get_metrics returns None for an empty stream."""
sid = "test-stream-id"
file_store = MagicMock()
stream = EventStream(sid=sid, file_store=file_store)
assert stream.get_metrics() is None
def test_get_metrics_no_metrics_in_events(self):
"""Test that get_metrics returns None when no events have metrics."""
sid = "test-stream-id"
file_store = MagicMock()
stream = EventStream(sid=sid, file_store=file_store)
event = MagicMock(spec=Event)
event.llm_metrics = None
with patch.object(stream, 'get_events', return_value=[event]):
assert stream.get_metrics() is None
def test_get_metrics_with_metrics(self):
"""Test that get_metrics correctly aggregates metrics from events."""
sid = "test-stream-id"
file_store = MagicMock()
stream = EventStream(sid=sid, file_store=file_store)
# Create mock events with metrics
event1 = MagicMock(spec=Event)
metrics1 = Metrics(model_name="gpt-4")
metrics1.add_token_usage(
prompt_tokens=10,
completion_tokens=20,
cache_read_tokens=0,
cache_write_tokens=0,
response_id="resp1"
)
event1.llm_metrics = metrics1
event2 = MagicMock(spec=Event)
metrics2 = Metrics(model_name="gpt-4")
metrics2.add_token_usage(
prompt_tokens=15,
completion_tokens=25,
cache_read_tokens=0,
cache_write_tokens=0,
response_id="resp2"
)
event2.llm_metrics = metrics2
with patch.object(stream, 'get_events', return_value=[event1, event2]):
result = stream.get_metrics()
assert result is not None
assert result.model_name == "gpt-4"
# Check token usages are merged correctly
total_prompt_tokens = sum(usage.prompt_tokens for usage in result.token_usages)
total_completion_tokens = sum(usage.completion_tokens for usage in result.token_usages)
assert total_prompt_tokens == 25 # 10 + 15
assert total_completion_tokens == 45 # 20 + 25
assert len(result.token_usages) == 2
def test_get_metrics_with_exception(self):
"""Test that get_metrics handles exceptions gracefully."""
sid = "test-stream-id"
file_store = MagicMock()
stream = EventStream(sid=sid, file_store=file_store)
with patch.object(stream, 'get_events', side_effect=Exception("Test exception")):
assert stream.get_metrics() is None
def test_get_metrics_with_mixed_events(self):
"""Test that get_metrics correctly handles a mix of events with and without metrics."""
sid = "test-stream-id"
file_store = MagicMock()
stream = EventStream(sid=sid, file_store=file_store)
# Create mock events, some with metrics and some without
event1 = MagicMock(spec=Event)
metrics1 = Metrics(model_name="gpt-4")
metrics1.add_token_usage(
prompt_tokens=10,
completion_tokens=20,
cache_read_tokens=0,
cache_write_tokens=0,
response_id="resp1"
)
event1.llm_metrics = metrics1
event2 = MagicMock(spec=Event)
event2.llm_metrics = None
event3 = MagicMock(spec=Event)
metrics3 = Metrics(model_name="gpt-4")
metrics3.add_token_usage(
prompt_tokens=15,
completion_tokens=25,
cache_read_tokens=0,
cache_write_tokens=0,
response_id="resp3"
)
event3.llm_metrics = metrics3
with patch.object(stream, 'get_events', return_value=[event1, event2, event3]):
result = stream.get_metrics()
assert result is not None
assert result.model_name == "gpt-4"
# Check token usages are merged correctly
total_prompt_tokens = sum(usage.prompt_tokens for usage in result.token_usages)
total_completion_tokens = sum(usage.completion_tokens for usage in result.token_usages)
assert total_prompt_tokens == 25 # 10 + 15
assert total_completion_tokens == 45 # 20 + 25
assert len(result.token_usages) == 2
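
The tests above pin down three behaviours of `get_metrics`: it returns `None` when no event carries metrics, it skips events whose `llm_metrics` is `None`, and it merges token usages from the rest. A minimal stand-alone sketch of that aggregation, using hypothetical `UsageSketch` and `MetricsSketch` classes rather than the real `openhands.llm.metrics` types, whose internals are not shown in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UsageSketch:
    prompt_tokens: int
    completion_tokens: int


@dataclass
class MetricsSketch:
    model_name: str = 'default'
    token_usages: list[UsageSketch] = field(default_factory=list)

    def merge(self, other: MetricsSketch) -> None:
        # Append the other object's usage records to this one.
        self.token_usages.extend(other.token_usages)


def aggregate_metrics(
    per_event_metrics: list[MetricsSketch | None],
) -> MetricsSketch | None:
    """Merge per-event metrics; return None if no event had any."""
    merged: MetricsSketch | None = None
    for metrics in per_event_metrics:
        if metrics is None:
            continue  # event without LLM metrics
        if merged is None:
            merged = MetricsSketch(model_name=metrics.model_name)
        merged.merge(metrics)
    return merged


first = MetricsSketch('gpt-4', [UsageSketch(10, 20)])
second = MetricsSketch('gpt-4', [UsageSketch(15, 25)])
result = aggregate_metrics([first, None, second])
print(sum(u.prompt_tokens for u in result.token_usages))
```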

View File

@@ -0,0 +1,139 @@
import pytest
from unittest.mock import MagicMock, patch
from fastapi import Request, status
from fastapi.responses import JSONResponse
from openhands.server.routes.conversation import get_conversation_metrics
from openhands.llm.metrics import Metrics, TokenUsage
@pytest.fixture
def mock_request():
"""Create a mock request with a conversation."""
request = MagicMock(spec=Request)
request.state.conversation = MagicMock()
request.state.conversation.runtime = MagicMock()
request.state.conversation.event_stream = MagicMock()
return request
@pytest.mark.asyncio
async def test_get_conversation_metrics_success(mock_request):
"""Test successful retrieval of conversation metrics."""
# Setup mock metrics
metrics = Metrics()
metrics.token_usages = [
TokenUsage(
prompt_tokens=100,
completion_tokens=50,
model="test-model",
cache_read_tokens=0,
cache_write_tokens=0,
response_id="test-response-1"
),
TokenUsage(
prompt_tokens=200,
completion_tokens=150,
model="test-model",
cache_read_tokens=0,
cache_write_tokens=0,
response_id="test-response-2"
),
]
metrics.accumulated_cost = 0.25
# Configure mock to return metrics
mock_request.state.conversation.event_stream.get_metrics.return_value = metrics
# Call the endpoint
response = await get_conversation_metrics(mock_request)
# Verify response
assert isinstance(response, JSONResponse)
assert response.status_code == status.HTTP_200_OK
# Extract content from JSONResponse
content = response.body.decode('utf-8')
import json
content_dict = json.loads(content)
# Verify metrics
assert content_dict['accumulated_cost'] == 0.25
assert content_dict['total_prompt_tokens'] == 300
assert content_dict['total_completion_tokens'] == 200
assert content_dict['total_tokens'] == 500
@pytest.mark.asyncio
async def test_get_conversation_metrics_no_metrics(mock_request):
"""Test handling when no metrics are available."""
# Configure mock to return None for metrics
mock_request.state.conversation.event_stream.get_metrics.return_value = None
# Call the endpoint
response = await get_conversation_metrics(mock_request)
# Verify response
assert isinstance(response, JSONResponse)
assert response.status_code == status.HTTP_200_OK
# Extract content from JSONResponse
content = response.body.decode('utf-8')
import json
content_dict = json.loads(content)
# Verify default metrics
assert content_dict['accumulated_cost'] == 0.0
assert content_dict['total_prompt_tokens'] == 0
assert content_dict['total_completion_tokens'] == 0
assert content_dict['total_tokens'] == 0
@pytest.mark.asyncio
async def test_get_conversation_metrics_no_conversation():
"""Test handling when no conversation is found in request state."""
# Create a request without conversation
request = MagicMock(spec=Request)
request.state = MagicMock()
# Remove conversation attribute
delattr(request.state, 'conversation')
# Call the endpoint
response = await get_conversation_metrics(request)
# Verify response
assert isinstance(response, JSONResponse)
assert response.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
# Extract content from JSONResponse
content = response.body.decode('utf-8')
import json
content_dict = json.loads(content)
# Verify error message
assert 'error' in content_dict
assert 'No conversation found in request state' in content_dict['error']
@pytest.mark.asyncio
async def test_get_conversation_metrics_exception(mock_request):
"""Test handling when an exception occurs."""
# Configure mock to raise an exception
mock_request.state.conversation.event_stream.get_metrics.side_effect = Exception("Test exception")
# Call the endpoint
response = await get_conversation_metrics(mock_request)
# Verify response
assert isinstance(response, JSONResponse)
assert response.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
# Extract content from JSONResponse
content = response.body.decode('utf-8')
import json
content_dict = json.loads(content)
# Verify error message
assert 'error' in content_dict
assert 'Test exception' in content_dict['error']
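
The assertions above imply the shape of the metrics payload: an accumulated cost plus prompt, completion, and combined token totals, all zero when no metrics exist. A sketch of that computation under those assumptions; `metrics_payload` is a hypothetical function, and the real endpoint may build its response differently.

```python
from __future__ import annotations


def metrics_payload(
    accumulated_cost: float | None,
    usages: list[tuple[int, int]] | None,
) -> dict[str, float | int]:
    """Build the totals from (prompt_tokens, completion_tokens) pairs."""
    usages = usages or []
    total_prompt = sum(prompt for prompt, _ in usages)
    total_completion = sum(completion for _, completion in usages)
    return {
        'accumulated_cost': accumulated_cost or 0.0,
        'total_prompt_tokens': total_prompt,
        'total_completion_tokens': total_completion,
        'total_tokens': total_prompt + total_completion,
    }


print(metrics_payload(0.25, [(100, 50), (200, 150)]))
print(metrics_payload(None, None))
```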

View File

@@ -16,7 +16,9 @@ from openhands.core.config import (
load_from_toml,
)
from openhands.core.config.condenser_config import (
+LLMSummarizingCondenserConfig,
NoOpCondenserConfig,
+RecentEventsCondenserConfig,
)
from openhands.core.logger import openhands_logger
@@ -566,13 +568,243 @@ def test_cache_dir_creation(default_config, tmpdir):
assert os.path.exists(default_config.cache_dir)
-def test_agent_config_condenser_default():
-"""Test that default agent condenser is NoOpCondenser."""
-config = AppConfig()
+def test_agent_config_condenser_with_no_enabled():
+"""Test default agent condenser with enable_default_condenser=False."""
+config = AppConfig(enable_default_condenser=False)
agent_config = config.get_agent_config()
assert isinstance(agent_config.condenser, NoOpCondenserConfig)
def test_condenser_config_from_toml_basic(default_config, temp_toml_file):
"""Test loading basic condenser configuration from TOML."""
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write("""
[condenser]
type = "recent"
keep_first = 3
max_events = 15
""")
load_from_toml(default_config, temp_toml_file)
# Verify that the condenser config is correctly assigned to the default agent config
agent_config = default_config.get_agent_config()
assert isinstance(agent_config.condenser, RecentEventsCondenserConfig)
assert agent_config.condenser.keep_first == 3
assert agent_config.condenser.max_events == 15
# We can also verify the function works directly
from openhands.core.config.condenser_config import (
condenser_config_from_toml_section,
)
condenser_data = {'type': 'recent', 'keep_first': 3, 'max_events': 15}
condenser_mapping = condenser_config_from_toml_section(condenser_data)
assert 'condenser' in condenser_mapping
assert isinstance(condenser_mapping['condenser'], RecentEventsCondenserConfig)
assert condenser_mapping['condenser'].keep_first == 3
assert condenser_mapping['condenser'].max_events == 15
def test_condenser_config_from_toml_with_llm_reference(default_config, temp_toml_file):
"""Test loading condenser configuration with LLM reference from TOML."""
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write("""
[llm.condenser_llm]
model = "gpt-4"
api_key = "test-key"
[condenser]
type = "llm"
llm_config = "condenser_llm"
keep_first = 2
max_size = 50
""")
load_from_toml(default_config, temp_toml_file)
# Verify that the LLM config was loaded
assert 'condenser_llm' in default_config.llms
assert default_config.llms['condenser_llm'].model == 'gpt-4'
# Verify that the condenser config is correctly assigned to the default agent config
agent_config = default_config.get_agent_config()
assert isinstance(agent_config.condenser, LLMSummarizingCondenserConfig)
assert agent_config.condenser.keep_first == 2
assert agent_config.condenser.max_size == 50
assert agent_config.condenser.llm_config.model == 'gpt-4'
# Test the condenser config with the LLM reference
from openhands.core.config.condenser_config import (
condenser_config_from_toml_section,
)
condenser_data = {
'type': 'llm',
'llm_config': 'condenser_llm',
'keep_first': 2,
'max_size': 50,
}
condenser_mapping = condenser_config_from_toml_section(
condenser_data, default_config.llms
)
assert 'condenser' in condenser_mapping
assert isinstance(condenser_mapping['condenser'], LLMSummarizingCondenserConfig)
assert condenser_mapping['condenser'].keep_first == 2
assert condenser_mapping['condenser'].max_size == 50
assert condenser_mapping['condenser'].llm_config.model == 'gpt-4'
def test_condenser_config_from_toml_with_missing_llm_reference(
default_config, temp_toml_file
):
"""Test loading condenser configuration with missing LLM reference from TOML."""
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write("""
[condenser]
type = "llm"
llm_config = "missing_llm"
keep_first = 2
max_size = 50
""")
load_from_toml(default_config, temp_toml_file)
# Test the condenser config with a missing LLM reference
from openhands.core.config.condenser_config import (
condenser_config_from_toml_section,
)
condenser_data = {
'type': 'llm',
'llm_config': 'missing_llm',
'keep_first': 2,
'max_size': 50,
}
condenser_mapping = condenser_config_from_toml_section(
condenser_data, default_config.llms
)
assert 'condenser' in condenser_mapping
assert isinstance(condenser_mapping['condenser'], NoOpCondenserConfig)
# Should not have a default LLMConfig when the reference is missing
assert not hasattr(condenser_mapping['condenser'], 'llm_config')
def test_condenser_config_from_toml_with_invalid_config(default_config, temp_toml_file):
"""Test loading invalid condenser configuration from TOML."""
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write("""
[condenser]
type = "invalid_type"
""")
load_from_toml(default_config, temp_toml_file)
# Test the condenser config with an invalid type
from openhands.core.config.condenser_config import (
condenser_config_from_toml_section,
)
condenser_data = {'type': 'invalid_type'}
condenser_mapping = condenser_config_from_toml_section(condenser_data)
# Should default to NoOpCondenserConfig when the type is invalid
assert 'condenser' in condenser_mapping
assert isinstance(condenser_mapping['condenser'], NoOpCondenserConfig)
def test_condenser_config_from_toml_with_validation_error(
default_config, temp_toml_file
):
"""Test loading condenser configuration with validation error from TOML."""
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write("""
[condenser]
type = "recent"
keep_first = -1 # Invalid: must be >= 0
max_events = 0 # Invalid: must be >= 1
""")
load_from_toml(default_config, temp_toml_file)
# Test the condenser config with validation errors
from openhands.core.config.condenser_config import (
condenser_config_from_toml_section,
)
condenser_data = {'type': 'recent', 'keep_first': -1, 'max_events': 0}
condenser_mapping = condenser_config_from_toml_section(condenser_data)
# Should default to NoOpCondenserConfig when validation fails
assert 'condenser' in condenser_mapping
assert isinstance(condenser_mapping['condenser'], NoOpCondenserConfig)
def test_default_condenser_behavior_enabled(default_config, temp_toml_file):
"""Test the default condenser behavior when enable_default_condenser is True."""
# Create a minimal TOML file with no condenser section
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write("""
[core]
# Empty core section, no condenser section
""")
# Set enable_default_condenser to True
default_config.enable_default_condenser = True
load_from_toml(default_config, temp_toml_file)
# Verify the default agent config has LLMSummarizingCondenserConfig
agent_config = default_config.get_agent_config()
assert isinstance(agent_config.condenser, LLMSummarizingCondenserConfig)
assert agent_config.condenser.keep_first == 1
assert agent_config.condenser.max_size == 100
def test_default_condenser_behavior_disabled(default_config, temp_toml_file):
"""Test the default condenser behavior when enable_default_condenser is False."""
# Create a minimal TOML file with no condenser section
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write("""
[core]
# Empty core section, no condenser section
""")
# Set enable_default_condenser to False
default_config.enable_default_condenser = False
load_from_toml(default_config, temp_toml_file)
# Verify the agent config uses NoOpCondenserConfig
agent_config = default_config.get_agent_config()
assert isinstance(agent_config.condenser, NoOpCondenserConfig)
def test_default_condenser_explicit_toml_override(default_config, temp_toml_file):
"""Test that explicit condenser in TOML takes precedence over the default."""
# Set enable_default_condenser to True
default_config.enable_default_condenser = True
# Create a TOML file with an explicit condenser section
with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
toml_file.write("""
[condenser]
type = "recent"
keep_first = 3
max_events = 15
""")
# Load the config
load_from_toml(default_config, temp_toml_file)
# Verify the explicit condenser from TOML takes precedence
agent_config = default_config.get_agent_config()
assert isinstance(agent_config.condenser, RecentEventsCondenserConfig)
assert agent_config.condenser.keep_first == 3
assert agent_config.condenser.max_events == 15
def test_api_keys_repr_str():
# Test LLMConfig
llm_config = LLMConfig(
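
The condenser tests in this hunk describe a fallback rule: a valid `[condenser]` section yields the matching config, while an unknown `type` or a validation failure falls back to a no-op config. A simplified sketch of that rule with hypothetical dataclasses (`NoOpSketch`, `RecentSketch`) and a hypothetical `condenser_from_section`; it does not reproduce `condenser_config_from_toml_section` or its LLM-reference handling.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class NoOpSketch:
    type: str = 'noop'


@dataclass
class RecentSketch:
    keep_first: int = 0
    max_events: int = 10
    type: str = 'recent'

    def __post_init__(self) -> None:
        # Mirror the constraints noted in the test's TOML comments.
        if self.keep_first < 0:
            raise ValueError('keep_first must be >= 0')
        if self.max_events < 1:
            raise ValueError('max_events must be >= 1')


def condenser_from_section(section: dict) -> dict:
    """Map a parsed [condenser] section to {'condenser': config}."""
    data = dict(section)
    kind = data.pop('type', 'noop')
    try:
        if kind == 'recent':
            return {'condenser': RecentSketch(**data)}
        if kind == 'noop':
            return {'condenser': NoOpSketch()}
        raise ValueError(f'unknown condenser type: {kind}')
    except (TypeError, ValueError):
        # Invalid type, unexpected keys, or failed validation: fall back.
        return {'condenser': NoOpSketch()}


print(condenser_from_section({'type': 'recent', 'keep_first': 3, 'max_events': 15}))
print(condenser_from_section({'type': 'invalid_type'}))
```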

View File

@@ -0,0 +1,164 @@
import pytest
from unittest.mock import MagicMock, patch
from fastapi import Request, status
from fastapi.responses import JSONResponse
from openhands.llm.metrics import Metrics, TokenUsage
from openhands.server.routes.conversation import get_conversation_metrics
@pytest.fixture
def mock_metrics():
metrics = Metrics()
metrics.accumulated_cost = 0.25
metrics.token_usages = [
TokenUsage(
model="gpt-4",
prompt_tokens=100,
completion_tokens=50,
cache_read_tokens=0,
cache_write_tokens=0,
response_id="resp1"
),
TokenUsage(
model="gpt-4",
prompt_tokens=200,
completion_tokens=75,
cache_read_tokens=0,
cache_write_tokens=0,
response_id="resp2"
),
]
return metrics
@pytest.mark.asyncio
async def test_get_conversation_metrics_success(mock_metrics):
"""Test that the metrics endpoint returns the correct metrics data."""
# Create a mock request with a conversation that has metrics
mock_event_stream = MagicMock()
mock_event_stream.get_metrics.return_value = mock_metrics
mock_conversation = MagicMock()
mock_conversation.event_stream = mock_event_stream
mock_request = MagicMock()
mock_request.state.conversation = mock_conversation
# Call the endpoint function directly
response = await get_conversation_metrics(mock_request)
# Check the response
assert isinstance(response, JSONResponse)
assert response.status_code == status.HTTP_200_OK
# Extract the content from the response
content = response.body.decode('utf-8')
import json
data = json.loads(content)
# Verify the metrics data
assert data["accumulated_cost"] == 0.25
assert data["total_prompt_tokens"] == 300
assert data["total_completion_tokens"] == 125
assert data["total_tokens"] == 425
# Verify the get_metrics method was called
mock_conversation.event_stream.get_metrics.assert_called_once()
@pytest.mark.asyncio
async def test_get_conversation_metrics_no_metrics():
"""Test that the metrics endpoint handles the case where no metrics are available."""
# Create a mock request with a conversation that has no metrics
mock_event_stream = MagicMock()
mock_event_stream.get_metrics.return_value = None
mock_conversation = MagicMock()
mock_conversation.event_stream = mock_event_stream
mock_request = MagicMock()
mock_request.state.conversation = mock_conversation
# Call the endpoint function directly
response = await get_conversation_metrics(mock_request)
# Check the response
assert isinstance(response, JSONResponse)
assert response.status_code == status.HTTP_200_OK
# Extract the content from the response
content = response.body.decode('utf-8')
import json
data = json.loads(content)
# Verify the metrics data
assert data["accumulated_cost"] == 0.0
assert data["total_prompt_tokens"] == 0
assert data["total_completion_tokens"] == 0
assert data["total_tokens"] == 0
# Verify the get_metrics method was called
mock_conversation.event_stream.get_metrics.assert_called_once()
@pytest.mark.asyncio
async def test_get_conversation_metrics_no_conversation():
"""Test that the metrics endpoint handles the case where no conversation is found."""
# Create a mock request with no conversation attribute
mock_request = MagicMock()
mock_request.state = MagicMock()
# MagicMock creates attributes on first access, so delete `conversation` explicitly
del mock_request.state.conversation
# Call the endpoint function directly
response = await get_conversation_metrics(mock_request)
# Check the response
assert isinstance(response, JSONResponse)
assert response.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
# Extract the content from the response
content = response.body.decode('utf-8')
import json
data = json.loads(content)
# Verify the error message
assert "error" in data
assert "No conversation found" in data["error"] or "conversation" in data["error"].lower()
@pytest.mark.asyncio
async def test_get_conversation_metrics_exception():
"""Test that the metrics endpoint handles exceptions gracefully."""
# Create a mock request with a conversation that raises an exception
mock_event_stream = MagicMock()
mock_event_stream.get_metrics.side_effect = Exception("Test exception")
mock_conversation = MagicMock()
mock_conversation.event_stream = mock_event_stream
mock_request = MagicMock()
mock_request.state.conversation = mock_conversation
# Call the endpoint function directly
response = await get_conversation_metrics(mock_request)
# Check the response
assert isinstance(response, JSONResponse)
assert response.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
# Extract the content from the response
content = response.body.decode('utf-8')
import json
data = json.loads(content)
# Verify the error message
assert "error" in data
assert "Test exception" in data["error"]
# Verify the get_metrics method was called
mock_conversation.event_stream.get_metrics.assert_called_once()
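
A note on the "no conversation" case in these tests: a `MagicMock` creates any attribute on first access, so leaving an attribute unset does not make it missing. The sketch below, using only `unittest.mock` from the standard library, shows the difference between an unset attribute and one that was deleted.

```python
from unittest.mock import MagicMock

request = MagicMock()
request.state = MagicMock()

# Never assigned, yet present: MagicMock fabricates it on access.
assert hasattr(request.state, 'conversation')

# Deleting the attribute makes later access raise AttributeError.
del request.state.conversation
assert not hasattr(request.state, 'conversation')
assert getattr(request.state, 'conversation', None) is None
```

A handler that checks for the attribute with `hasattr` or `getattr(..., None)` will only see it as missing after the deletion step.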

Some files were not shown because too many files have changed in this diff