Compare commits

1312 Commits

Author SHA1 Message Date
openhands 45a39f6d9d fix(workspace): add Authorization header to completion callback
The automation service's completion callback endpoint requires authentication
via Bearer token. Without this header, the callback returns 401 Unauthorized.

The cloud_api_key is already available on the workspace instance (set from
OPENHANDS_API_KEY env var), so we just need to include it in the request.

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-25 03:03:13 +00:00
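
To illustrate the 45a39f6d9d fix above, here is a minimal sketch of attaching a Bearer token to the callback request headers. The helper name `build_callback_headers` and the `Content-Type` entry are assumptions made for illustration; only the `Authorization: Bearer` scheme and the `cloud_api_key` field come from the commit message.

```python
from typing import Dict, Optional


def build_callback_headers(cloud_api_key: Optional[str]) -> Dict[str, str]:
    """Hypothetical helper: headers for the completion callback POST."""
    headers = {"Content-Type": "application/json"}  # assumed JSON payload
    if cloud_api_key:
        # Without this header the endpoint is described as answering 401.
        headers["Authorization"] = f"Bearer {cloud_api_key}"
    return headers
```

A caller would pass the result as the `headers` argument of whatever HTTP client performs the POST.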
openhands a6fc20f3d4 style: fix line length and formatting in test file
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 22:35:01 +00:00
Rohit Malhotra 4b6b46ea61 Merge branch 'main' into feat/saas-runtime-mode 2026-03-23 13:53:32 -04:00
openhands 8e797ec370 security: strip SESSION_API_KEY from subprocess environment
Prevent LLM-driven agents from accessing SESSION_API_KEY via terminal
commands. This credential grants access to user secrets via the SaaS API
and must remain isolated to the SDK's Python process.

- Add SESSION_API_KEY to _SENSITIVE_ENV_VARS in sanitized_env()
- Add security tests verifying terminal tool cannot access the key

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 17:53:07 +00:00
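
The environment filtering described in 8e797ec370 could look roughly like the sketch below. The names `_SENSITIVE_ENV_VARS` and `sanitized_env` appear in the commit message, but the signature and body here are assumptions, not the SDK's actual code.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed contents; the real set may list further variables.
_SENSITIVE_ENV_VARS = frozenset({"SESSION_API_KEY"})


def sanitized_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment without sensitive variables."""
    source = os.environ if env is None else env
    return {k: v for k, v in source.items() if k not in _SENSITIVE_ENV_VARS}
```

The returned dict can be passed as `env=` to `subprocess.run`, so terminal commands started by the agent never see the key.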
Engel Nyst 5f106d052b fix(sdk): stop sending reasoning_effort to Kimi thinking (#2549)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 17:21:53 +01:00
cid fbd848bdfa fix: use asyncio.Event() for thread-safe initialization state (#2383)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Juan Michelini <juan@juan.com.uy>
2026-03-23 13:11:33 -03:00
Vasco Schiavo 0b60270171 feat(sdk): Add browser tool usage guidelines to system prompt (#2547) 2026-03-23 16:54:32 +01:00
openhands 13a4aa6551 fix: set api_key in saas_runtime_mode for agent-server auth
_init_saas_runtime_mode() was setting self._session_api_key but not
propagating it to self.api_key (the RemoteWorkspaceMixin field).
The shared HTTP client (used by RemoteConversation) builds its headers
from self.api_key via _headers, so conversations got no
X-Session-API-Key header → 401 from the local agent-server.

Mirrors what _start_sandbox() already does at line 268.

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 15:10:32 +00:00
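
A simplified sketch of the bug fixed in 13a4aa6551: headers are derived from `api_key`, so setting only `_session_api_key` leaves them empty. The class `WorkspaceSketch` is hypothetical and stands in for the real workspace; only the attribute names, `_headers`, and the `X-Session-API-Key` header are taken from the commit message.

```python
from typing import Dict, Optional


class WorkspaceSketch:
    """Hypothetical stand-in for the workspace described in the commit."""

    def __init__(self) -> None:
        self.api_key: Optional[str] = None
        self._session_api_key: Optional[str] = None

    @property
    def _headers(self) -> Dict[str, str]:
        # The shared HTTP client builds its headers from api_key only.
        return {"X-Session-API-Key": self.api_key} if self.api_key else {}

    def _init_saas_runtime_mode(self, session_api_key: str) -> None:
        self._session_api_key = session_api_key
        self.api_key = session_api_key  # the propagation that was missing
```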
Xingyao Wang 78f9f100ea Merge branch 'main' into feat/saas-runtime-mode 2026-03-23 14:54:24 +00:00
openhands 71f66416ad fix: apply ruff-format formatting fixes
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 14:39:25 +00:00
John-Mason P. Shackelford c0baa86ae1 Add url field to PluginAuthor to match Claude Code schema (#2546)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 20:31:42 +07:00
openhands 3f81ef3238 fix: set _sandbox_id and _session_api_key in saas_runtime_mode
_init_saas_runtime_mode() now reads sandbox identity from env vars
so get_llm() and get_secrets() work when the SDK runs inside a
Cloud sandbox (automation dispatch path).

Env vars (injected by the automation dispatcher):
  SANDBOX_ID       — sandbox's Cloud API identifier
  SESSION_API_KEY  — session key for sandbox settings auth
  (falls back to OH_SESSION_API_KEYS_0 set by the runtime)

Also reads AGENT_SERVER_PORT env var for the local agent-server port.
Constructor param sandbox_id= takes precedence over env var.

Also fixes pre-existing test URL mismatch in test_cleanup_deletes_sandbox.

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 12:49:37 +00:00
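
The lookup order described in 3f81ef3238 can be sketched as follows. The function name `resolve_sandbox_identity` and its return shape are assumptions; the environment variable names and the precedence of the constructor parameter come from the commit message.

```python
from typing import Mapping, Optional, Tuple


def resolve_sandbox_identity(
    env: Mapping[str, str], sandbox_id: Optional[str] = None
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper returning (sandbox_id, session_api_key)."""
    # An explicit constructor argument wins over the SANDBOX_ID env var.
    resolved_id = sandbox_id if sandbox_id is not None else env.get("SANDBOX_ID")
    # SESSION_API_KEY first, then the runtime-provided fallback.
    session_key = env.get("SESSION_API_KEY") or env.get("OH_SESSION_API_KEYS_0")
    return resolved_id, session_key
```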
openhands e0eca291b6 Revert "feat: read agent_server_port from AGENT_SERVER_PORT env var"
This reverts commit 975647d9f9.
2026-03-23 12:12:41 +00:00
openhands 975647d9f9 feat: read agent_server_port from AGENT_SERVER_PORT env var
Default to AGENT_SERVER_PORT environment variable for the agent server
port, falling back to 60000 if not set. Explicit kwarg still takes
precedence over the env var.

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 12:09:56 +00:00
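
The precedence described in 975647d9f9 (explicit kwarg, then `AGENT_SERVER_PORT`, then 60000) might be expressed like this. The function name `resolve_agent_server_port` and the `env` parameter are assumptions for illustration; note that this commit was later reverted in e0eca291b6.

```python
import os
from typing import Mapping, Optional

DEFAULT_AGENT_SERVER_PORT = 60000  # fallback stated in the commit message


def resolve_agent_server_port(
    agent_server_port: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Hypothetical helper: kwarg, then env var, then default."""
    if agent_server_port is not None:
        return agent_server_port
    source = os.environ if env is None else env
    raw = source.get("AGENT_SERVER_PORT")
    return int(raw) if raw else DEFAULT_AGENT_SERVER_PORT
```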
openhands b3841c44d7 feat: add completion callback on __exit__ for automation service
When automation_callback_url is set, __exit__ POSTs completion status
(COMPLETED or FAILED with error detail) to the automation service before
cleanup(). Best-effort — callback failures are logged, not raised.

New fields:
- automation_callback_url: URL to POST to on exit
- automation_run_id: included in callback payload

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 11:50:21 +00:00
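
The best-effort callback from b3841c44d7 can be sketched as a context manager. The class `CallbackOnExit`, the injected `post` callable, and the payload keys (`run_id`, `status`, `error`) are assumptions; the commit message only states that COMPLETED or FAILED with error detail is POSTed before `cleanup()` and that failures are logged, not raised.

```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger(__name__)


class CallbackOnExit:
    """Hypothetical sketch of a workspace that reports completion on exit."""

    def __init__(
        self,
        automation_callback_url: Optional[str],
        automation_run_id: Optional[str],
        post: Callable[[str, Dict[str, Any]], None],
    ) -> None:
        self.automation_callback_url = automation_callback_url
        self.automation_run_id = automation_run_id
        self.post = post  # injected so the sketch needs no network access
        self.cleaned_up = False

    def cleanup(self) -> None:
        self.cleaned_up = True

    def __enter__(self) -> "CallbackOnExit":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if self.automation_callback_url:
            status = "FAILED" if exc_type is not None else "COMPLETED"
            payload: Dict[str, Any] = {"run_id": self.automation_run_id, "status": status}
            if exc is not None:
                payload["error"] = str(exc)
            try:
                self.post(self.automation_callback_url, payload)
            except Exception:
                # Best-effort: log and continue to cleanup.
                logger.exception("completion callback failed")
        self.cleanup()
        return False  # never swallow the caller's exception
```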
openhands 336f7eadfd Revert "feat: expose SDK packages to system Python in source images"
This reverts commit 73c517cc125e677fe2d512da2549383073e6bd11.
2026-03-23 11:50:21 +00:00
openhands 210bb641ca feat: expose SDK packages to system Python in source images
Add PYTHONPATH pointing to the agent-server venv's site-packages in
both source and source-minimal Docker targets. This allows user scripts
running inside the sandbox (e.g. automation entrypoints using
OpenHandsCloudWorkspace with saas_runtime_mode=True) to import
openhands-sdk, openhands-workspace, and openhands-tools directly
without needing to activate the venv.

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 11:50:21 +00:00
openhands 9fc6b00025 fix: make cloud_api_url/key required in all modes, ref RFC
cloud_api_url and cloud_api_key are needed even in saas_runtime_mode
for get_llms() and get_secrets() calls to the Cloud API. Reverts them
to required str fields and removes the conditional model_validator.

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 11:50:21 +00:00
openhands 3e60a37c5e feat: add saas_runtime_mode to OpenHandsCloudWorkspace
When saas_runtime_mode=True, the workspace assumes it is already running
inside an OpenHands Cloud Runtime sandbox and connects directly to the
local agent-server at localhost instead of provisioning a sandbox via the
Cloud API.

This supports the automation service architecture (ADR-0002) where SDK
scripts execute inside pre-existing Cloud Runtimes. The workspace skips
sandbox creation/deletion and points the HTTP client at localhost:<port>.

Changes:
- Add saas_runtime_mode and agent_server_port fields
- Make cloud_api_url/cloud_api_key optional (with validator enforcing
  them when saas_runtime_mode=False)
- Add _init_saas_runtime_mode() for the local init path
- Skip sandbox cleanup in saas_runtime_mode
- Add 8 new tests covering the feature

Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 11:50:21 +00:00
John-Mason P. Shackelford f307425848 feat(plugin): Add entry_command field to PluginManifest (#2230)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-23 07:26:02 -04:00
simonrosenberg 62c2e7cfde feat(docker): add extra_build_args to BuildOptions (#2541)
Co-authored-by: Debug Agent <debug@example.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 17:05:37 -03:00
simonrosenberg d20d28ba79 feat(docker): make boto3 installation optional via build arg (#2536)
Co-authored-by: Debug Agent <debug@example.com>
2026-03-20 15:15:51 -03:00
simonrosenberg 2af366c4a7 feat(docker): make ACP npm package installation optional via build arg (#2535)
Co-authored-by: Debug Agent <debug@example.com>
2026-03-20 15:14:15 -03:00
Juan Michelini ee620e239f Fix Qwen3.5-Flash low submission rate: improve JSON arg parsing and add corrective feedback (#2512)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-20 13:14:42 -03:00
simonrosenberg f1a6eaac05 build: move SDK SHA args after expensive layers for cache reuse (#2522)
Co-authored-by: Debug Agent <debug@example.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 10:22:04 -03:00
John-Mason P. Shackelford 2f44967b51 feat(websocket): add after_timestamp filter for bi-directional event loading (#1880)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-20 12:33:07 +00:00
simonrosenberg 3abc11ce51 fix(workflow): Normalize instance_ids by stripping spaces instead of failing (#2529)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-03-20 11:39:53 +00:00
Engel Nyst 952c17e273 fix(ci): ignore added Field metadata in SDK API breakage check (#2524)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-20 12:21:47 +01:00
Vasco Schiavo 8d7f48211b feat(sdk/agent): Parallel Tool Call Execution (#2390)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-03-20 12:18:46 +01:00
Vasco Schiavo 4b5b3a4ac7 feat(workflow): Expected instance_ids format (no spaces) (#2502)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-20 12:14:58 +01:00
Mario Young db0655a65a Add Google Gemini 3.1 verify models (#2276)
Co-authored-by: Juan Michelini <juan@juan.com.uy>
2026-03-20 00:07:54 +00:00
Graham Neubig ba70d15f5b fix(tools): return browser timeout as observation (#2455)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-03-20 08:41:36 +09:00
Graham Neubig 84359e3353 Expose terminalbench in run-eval workflow (#2360)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-20 06:58:31 +09:00
Engel Nyst 26db8360c3 refactor(llm): use litellm params for reasoning support (#1990)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-19 21:37:43 +01:00
Engel Nyst a3e94207d3 fix(examples): make the LLM profile store example directory-based (#2507)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-19 21:30:20 +01:00
dependabot[bot] c8baf9d559 chore(deps): bump actions/upload-artifact from 4 to 7 (#2518)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-19 20:36:25 +01:00
dependabot[bot] 327a4cc621 chore(deps): bump actions/github-script from 7 to 8 (#2521)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-19 20:35:47 +01:00
dependabot[bot] 8240b29c19 chore(deps): bump actions/setup-node from 4 to 6 (#2519)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-19 20:18:44 +01:00
dependabot[bot] d08793f579 chore(deps): bump actions/download-artifact from 6 to 8 (#2517)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-19 20:17:05 +01:00
dependabot[bot] 21418e6a09 chore(deps): bump docker/login-action from 3 to 4 (#2520)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-19 20:15:20 +01:00
aivong-openhands b6182375d7 chore: add Dependabot configuration for GitHub Actions updates (#2501)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-03-19 14:02:52 -05:00
Engel Nyst b43fb7c586 fix(ci): ignore Field deprecated metadata in API breakage check (#2508)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-19 13:12:59 +01:00
Vasco Schiavo 829d3c10bd fix(workflow): remove unused DATASET/SPLIT env vars from run-eval workflow (#2504)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-19 10:54:17 +01:00
dependabot[bot] 703151e01f chore(deps): bump pypdf from 6.8.0 to 6.9.1 (#2497)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-19 03:26:11 +00:00
Juan Michelini 0eeb04443a Add MiniMax-M2.7 to resolve_model_config.py (#2500)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-18 23:37:30 -03:00
simonrosenberg a901efafd5 fix: cherry-pick cache_export_seconds telemetry fix to main (#2493)
Co-authored-by: Debug Agent <debug@example.com>
2026-03-18 14:07:05 -03:00
Xingyao Wang 281c94112a feat: workspace.get_llm() and get_secrets() for OpenHandsCloudWorkspace credential inheritance (#2409)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: allhands-bot <allhands-bot@users.noreply.github.com>
2026-03-18 15:58:16 +00:00
Vasco Schiavo d3ba40cf19 refactor(sdk/subagent): showing tools each subagent has (#2480) 2026-03-18 13:21:37 +01:00
simonrosenberg 6b02df0ec5 fix: synchronize ACP telemetry and refresh remote final state (#2460)
Co-authored-by: Debug Agent <debug@example.com>
2026-03-18 08:45:08 -03:00
Juan Michelini 2d027b4a4b Add GPT-5.4 to verified models (#2487)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-17 23:03:45 -03:00
dependabot[bot] c34d1bf10c chore(deps): bump pyasn1 from 0.6.2 to 0.6.3 (#2484)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-17 16:40:44 -05:00
Juan Michelini d57c6d728f Remove multiswebench from CI eval workflow options (#2483)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-17 18:02:18 -03:00
Juan Michelini d2d5d4701b Migrate PR review plugin to extensions repository (#2324)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-03-17 14:31:58 +00:00
Juan Michelini c823cde690 fix: preflight check now validates reasoning_content for thinking models (#2420)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-17 14:31:41 +00:00
simonrosenberg c34cb27b39 feat(build): add OPENHANDS_BUILDKIT_CACHE_MODE env var to control cache export (#2479)
Co-authored-by: Debug Agent <debug@example.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-17 11:11:43 -03:00
Xingyao Wang 58e7ff3b59 feat(prompt): add AI disclosure policy for external service communications (#2476)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-17 12:06:39 +00:00
dependabot[bot] 1c228e7ba3 chore(deps): bump authlib from 1.6.7 to 1.6.9 (#2475)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-16 21:38:27 -05:00
Mario Young 772fb8d2f7 Refine temperature/top_p handling for reasoning models (#2277)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-16 23:39:27 +01:00
simonrosenberg d129025974 Enable ACPAgent on RemoteRuntime API via ACP conversation endpoints (#2465)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Debug Agent <debug@example.com>
2026-03-16 17:18:18 -03:00
aivong-openhands 7e96868b40 Use version tag for agent server image in version bump prs (#2427)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-16 14:25:26 -05:00
Engel Nyst 3ede303e80 Enforce REST API deprecation runway (#2464)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-16 17:21:25 +00:00
Graham Neubig 472328d946 test(sdk): reproduce delegate resume compatibility regression (#2382)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Vasco Schiavo <115561717+VascoSch92@users.noreply.github.com>
2026-03-16 15:57:54 +01:00
Engel Nyst ed924a3ba0 Revert PR #2190: Enable ACPAgent on RemoteRuntime API (#2451)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-16 14:54:12 +01:00
Graham Neubig 463006b892 Fix Python selection in version-bump PR workflow (#2430)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-16 09:48:56 -04:00
Robert Brennan 5577f5e6e5 Fix: Add tags to root endpoint for OpenAPI spec (#2458)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-15 22:03:28 +00:00
Engel Nyst ff9d24d205 ci: guard package version bumps outside release PRs (#2457)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-15 22:24:48 +01:00
Robert Brennan f34bc8cb25 Add docstring guidelines and fix key docstrings for MDX compatibility (#2452)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-15 21:48:22 +01:00
dependabot[bot] a604ee3160 chore(deps): bump pyjwt from 2.11.0 to 2.12.0 (#2448)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-03-15 15:30:29 +00:00
Aditya Bharat Soni 737eae2b23 Fix apptainer workspace cleanup: kill zombie child processes. (#2450) 2026-03-15 09:25:05 -04:00
Engel Nyst 20ff7ba8ef Export TokenUsage, page_iterator, and AsyncRemoteWorkspace as public SDK APIs (#2445)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-14 09:14:03 +01:00
Yitao Li f69a1027aa Add rich package version 14.3.3 to dependency constraints (#2414) 2026-03-14 02:13:03 +00:00
Engel Nyst d922a25609 Run API breakage checks on push to main (#2443)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-14 02:35:50 +01:00
Engel Nyst 560ca0d801 Remove stale Python API workflow env (#2442)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-14 01:11:39 +00:00
Engel Nyst b01410cb0d Enforce REST deprecation deadlines (#2435)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-14 01:51:19 +01:00
Engel Nyst dc505e8bf1 Highlight API breakage check comments more clearly (#2434)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-14 01:20:49 +01:00
Engel Nyst 411f168131 Clarify REST contract deprecation policy (#2433)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-14 00:09:32 +00:00
Engel Nyst e18883eb29 Make OpenAPI breakage workflows fail loudly (#2432)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-14 00:29:05 +01:00
simonrosenberg ca621a4237 Enable ACPAgent on RemoteRuntime API (#2190)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:33:14 -03:00
OpenHands Bot ab83d054ee Release v1.14.0 (#2428)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 17:12:32 -04:00
simonrosenberg cec82663a3 feat: support reusing prebuilt SDK sdists (#2426) 2026-03-13 17:44:13 -03:00
Engel Nyst 6c982e68bf Reject SDK @deprecated on FastAPI routes (#2424)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 20:52:29 +01:00
Tim O'Farrell a6f5e462e3 Fix broken OpenAPI docs in with path based routing (#2423)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 13:39:36 -06:00
simonrosenberg a3ca0e81ec feat: emit structured docker build telemetry (#2422)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 16:35:02 -03:00
Robert Brennan 4b5aca251a Add startup banner to SDK with helpful links (#2380)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 09:25:23 -07:00
Juan Michelini 9240307c9e Update Nemotron model config (#2398)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 12:58:41 -03:00
Engel Nyst 2e32cec9f1 Handle fork PRs in API breakage workflows (#2410)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 16:20:56 +01:00
Max Matveev 0a2df6ef29 Allow connecting agent container to the specified Docker network. (#2381) 2026-03-13 10:56:46 +00:00
Engel Nyst 155316cac6 Enforce OpenAPI endpoint deprecations in REST API checks (#2405)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 08:07:19 +01:00
OpenHands Bot 7767c09eed Release v1.13.1 (#2402)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-03-13 06:43:03 +00:00
Tim O'Farrell 85f7813fa7 Log response content when webhook posting fails (#2403)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-13 14:25:13 +08:00
chuckbutkus a085767250 Deprecate {path:path} endpoints in file_router and add query param alternatives (#2404)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-03-13 14:08:23 +08:00
Vasco Schiavo d94cdf8dba fix(tools): share browser executor across subagents to prevent CDP port conflicts (#2401) 2026-03-12 18:48:53 +01:00
dependabot[bot] a8781d0c78 chore(deps): bump tornado from 6.5.2 to 6.5.5 (#2400)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-12 10:27:21 -05:00
Xingyao Wang e9f4f16e39 fix: truncate long skill descriptions instead of raising errors (#2394) (#2395)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-12 22:59:12 +08:00
Juan Michelini aa9df699cd fix(llm): add provider-specific verified model lists for gemini, deepseek, moonshot, minimax (#2386)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-11 20:40:37 -03:00
Juan Michelini b6210b0bf7 Add NVIDIA Nemotron-3 Super 120B model to resolve_model_config.py (#2391)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-11 18:19:18 -03:00
Xingyao Wang 46f3d78e2b fix: improve conversation resilience for long-running and resumed sessions (#2384)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-12 02:28:00 +08:00
Juan Michelini c3bb8e70ed Add gpt-5.4 to resolve_model_config.py (#2374)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-12 02:02:13 +08:00
Jamie Chicago f0f323ea52 Add bug and feature request issue templates for SDK (#1864)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-03-11 18:01:32 +00:00
dependabot[bot] ed1232c2b0 chore(deps): bump pypdf from 6.7.5 to 6.8.0 (#2387)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-11 10:12:02 -05:00
OpenHands Bot e0b38499a9 Release v1.13.0 (#2378)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-10 21:15:11 +01:00
Engel Nyst 8dd71b34be fix(examples): emit EXAMPLE_COST for marketplace demo (#2379)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-11 02:33:40 +08:00
Vasco Schiavo 91ef5e4e2f feat(sdk/subagent): add mcp-servers for subagent (#2348) 2026-03-10 18:28:29 +01:00
Vasco Schiavo 13092ce4dd refactor(sdk/llm/mixim): separate data from logic (#2354)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-11 00:26:43 +08:00
cid b41176217a fix: add security_risk and summary to tool examples for non-native function calling (#2251)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-03-10 16:06:33 +00:00
Xingyao Wang 4fa6ecd1f9 fix: send hook_config to server in RemoteConversation (#2115)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Co-authored-by: allhands-bot <allhands-bot@users.noreply.github.com>
2026-03-10 23:36:01 +08:00
Vasco Schiavo 2068653236 fix(tools): issue 2365 (#2369) 2026-03-10 16:19:38 +01:00
Engel Nyst 8cdc117e45 feat(examples): add plugin and skill lifecycle demos (#2362)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-10 11:46:15 +01:00
Calvin Smith 703ce4414b feat: add rerun_actions method to replay conversation actions (#2351)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-09 09:04:20 -06:00
Graham Neubig b4593eb4b1 feat: add configurable marketplace_path setting for public skills loading (#2253)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-09 08:43:43 -04:00
Vasco Schiavo 87fb61598d feat(llm): switch model profile on user message (#2192)
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-03-09 12:05:51 +00:00
Vasco Schiavo 4c1e1a17fc feat(sdk/subagent): add confirmation policy (#2345) 2026-03-09 11:14:42 +01:00
Vasco Schiavo 208a491a14 feat(sdk/subagent): hooks for subagents (#2347)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-03-08 20:01:36 +01:00
Engel Nyst 77c68ccfd7 chore: redistribute AGENTS guidance (#2359)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-08 13:35:07 +01:00
Vasco Schiavo 2fb540e11d refactor(sdk/context/skills): refactoring loading skills (#2353) 2026-03-08 06:10:53 +01:00
Graham Neubig 441ddf0465 Upgrade to Python 3.13 and fix libtmux locale issue (#2092)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-07 20:47:11 -05:00
Vasco Schiavo 4c8b464204 refactor(sdk/agent): simplify warning (#2355)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-03-07 08:41:00 +00:00
Engel Nyst 3258af5192 feat: add enable/disable for installed skills (#2322)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-07 07:40:55 +01:00
Engel Nyst 05fae8dc98 feat: add enable/disable for installed plugins (#2336)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-07 07:39:29 +01:00
Graham Neubig 0b3cb691fc Add learnings from code review analysis (#2280)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-07 14:33:06 +08:00
Engel Nyst 8a6a2306c5 ci: detect ACP minor version bumps (#2346)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-06 16:02:25 +01:00
Vasco Schiavo ecbbf79d64 feat(sdk/subagent): add profile_store_dir (#2340) 2026-03-06 13:45:18 +01:00
Aditya Kumarakrishnan ec73af678c fix: one malformed SKILL.md with invalid YAML frontmatter prevents all sibling skills from loading (#2333)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-03-06 09:27:31 +00:00
Engel Nyst 0e7d9fbbb5 fix: compare API breakage checks to latest PyPI release (#2338)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-06 09:18:58 +01:00
Graham Neubig 73b01e659e feat: add skills field to Marketplace schema with GitHub URL support (#2325)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 17:24:22 -05:00
Juan Michelini a74c2d914a Add job to print all parameters at start of run-eval workflow (#2332)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 18:09:45 -03:00
Vasco Schiavo a04a73f3e1 fix(sdk/subagent): remote workspace and subagents (#2323) 2026-03-05 21:52:23 +01:00
simonrosenberg 6e99789d64 Remove unnecessary ACP cost estimation fallback (#2330)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 16:35:27 -03:00
OpenHands Bot db9f0a7c41 Release v1.12.0 (#2302)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-03-06 03:27:24 +08:00
simonrosenberg 906b91f154 Route ACP agents through LiteLLM proxy in CI (#2326)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-03-05 18:49:17 +00:00
Xingyao Wang f192109082 chore: add design-principles skill linking to SDK arch docs (#2329)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 18:44:22 +00:00
Xingyao Wang bc3fbb89c4 fix: set delete_on_close default to True (#2328)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 18:26:49 +00:00
Engel Nyst b87d3e0aed feat: add AgentSkills install utilities (#2320)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 12:59:14 -05:00
Vasco Schiavo f7b15527f4 fix(examples): example no.38 (#2316)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 10:02:38 -05:00
Robert Brennan 0ca523a0ad fix: events search API blocks during agent step (FIFOLock contention) (#2296)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 10:01:34 -05:00
dependabot[bot] 16e734057e chore(deps): bump authlib from 1.6.6 to 1.6.7 (#2321)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: aivong-openhands <ai.vong@openhands.dev>
2026-03-05 08:56:42 -06:00
Engel Nyst a24dcc1d12 Normalize OpenAPI schema for oasdiff (#2318)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 15:12:22 +01:00
Vasco Schiavo a2a194f8b3 feat(sdk/subagent): Support LLM profiles in subagents (#2258) 2026-03-05 14:34:18 +01:00
Juan Michelini fa4baaec7e Fix: Add 'gpt5' to tool-preset argparse choices (#2306)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 19:29:55 +08:00
aivong-openhands 193c67733c fix(security): Upgrade fastmcp to v3 to address CVE-2025-69872 (#2294)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 19:29:08 +08:00
Vasco Schiavo 76ef4a11ac fix(examples): example no. 25 (#2313)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-03-05 11:15:10 +01:00
Vasco Schiavo 123f5e10af nit(examples): example no. 40 (#2315) 2026-03-05 17:51:05 +08:00
Vasco Schiavo 53a68b4335 fix(examples): example no. 41 (#2314) 2026-03-05 17:08:34 +08:00
Engel Nyst 4009d7e844 Disable prompt_cache_retention for Azure models (#2312)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 07:56:46 +01:00
Engel Nyst 7b6f32a814 Fix prompt_cache_retention for Azure (#2311)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 07:32:21 +01:00
Engel Nyst 3356db3e4e Fix API breakage workflow exit codes (#2309)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-05 01:10:18 +01:00
Calvin Smith 5015e72c2a Fix duplicate observations for same tool_call_id after crash recovery (#2300)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-04 14:22:45 -07:00
Xingyao Wang de65ac5077 feat(plugin): Add installed plugins management utilities (#2031)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-04 20:23:59 +00:00
Vasco Schiavo 50d8f1bf17 feat(sdk/subagent): Support max_iteration_per_run in file-based agent definitions (#2263) 2026-03-04 14:58:03 +01:00
Vasco Schiavo 3c2bce2c6d feat(sdk/subagent): Support skills in file-based Agents (#2260) 2026-03-04 14:13:46 +01:00
Juan Michelini eab666f174 Insist on integration tests on the model addition AGENTS.MD (#2293) 2026-03-03 21:02:10 -03:00
Juan Michelini 10ccfad706 Add gpt-5-3-codex model to resolve_model_config.py (#2292)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-03 23:22:45 +00:00
Robert Brennan 2e8a2bd5fc add ability to run chrome as root (#2169)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-03 22:46:24 +01:00
Juan Michelini 86e2355f9a Split AGENTS.md: Move model addition guide to ADDINGMODEL.md (#2291)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-03 18:15:40 -03:00
Juan Michelini 8dc35fd2aa Add explicit rules to prevent modifying existing models in AGENTS.md (#2288)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-03 21:05:13 +00:00
Juan Michelini cc34237f8e Add AGENTS.md for model addition guidance in .github/run-eval (#2284)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-03 14:39:48 -06:00
dependabot[bot] 379cd698b4 chore(deps): bump fastmcp from 2.12.4 to 2.14.0 (#2266)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: aivong-openhands <ai.vong@openhands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Ray Myers <ray.myers@gmail.com>
2026-03-03 09:34:10 -06:00
Juan Michelini fd8012889e Fix: Security analyzer ignores LLM security_risk when no analyzer is configured (#2130)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: allhands-bot <allhands-bot@users.noreply.github.com>
Co-authored-by: enyst <engel.nyst@gmail.com>
2026-03-03 11:35:57 -03:00
aivong-openhands 217b218272 Update orjson to 3.11.7 to address CVE-2025-67221 (#2268)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-03 07:56:44 -06:00
Vasco Schiavo 5002d7d64e fix(sdk/subagent): fix get_factory_info() (#2271) 2026-03-03 13:05:42 +01:00
Xingyao Wang 9f521a4ec6 Show full dynamic context in SystemPromptEvent.visualize (#2273)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-03 18:48:25 +08:00
Mario Young 0989fb25b7 Add gemini-3.1-pro-preview model to reasoning models #2269 (#2270) 2026-03-03 07:28:36 +01:00
dependabot[bot] aed2123270 chore(deps): bump pypdf from 6.7.4 to 6.7.5 (#2267)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-02 18:54:21 -06:00
dependabot[bot] 9f1e29852d chore(deps): bump pypdf from 6.7.2 to 6.7.4 (#2265)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: aivong-openhands <ai.vong@openhands.dev>
2026-03-02 18:42:47 -06:00
dependabot[bot] 0abd44ca88 chore(deps): bump mcp from 1.17.0 to 1.23.0 (#2226)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-03 00:28:01 +01:00
chuckbutkus 97ab05384e fix: use query parameters for git API endpoints to preserve path slashes (#2249)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-02 16:42:17 -05:00
Calvin Smith ae376a2479 docs: README for condenser module (#2262) 2026-03-02 14:25:57 -07:00
Calvin Smith ffb6d6009c fix: cap max_output_tokens when using max_tokens fallback (#2264)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-02 14:25:35 -07:00
simonrosenberg e96bce3ccc Change eval_limit from choice to free-form string input (#2261)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 14:46:51 -05:00
Vasco Schiavo c31cdee438 feat(delegate): Built-in specialized agent types (Explore, Bash) (#2201)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-03-02 15:49:26 +00:00
Matthaeus Wolff caf1b0b2f2 fix(tools): guard Unix-only imports in terminal package for Windows (#2096)
Co-authored-by: Rb <rubenwolff@gmail.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2026-03-02 15:13:40 +00:00
Graham Neubig 60347e9c9e Set default temperature to None instead of 0.0 (#1989)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Co-authored-by: neubig <neubig@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-03-02 10:11:30 -05:00
Shrey Satapara 2d3d96d329 Fix LiteLLM cost tracking for provider-prefixed models (#2257)
Co-authored-by: OpenHands Evaluation <evaluation@openhands.dev>
2026-03-02 10:18:30 +00:00
Robert Brennan 2b54375517 fix: Include secrets in system prompt when added via update_secrets() (#2171)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-03-01 16:58:02 -05:00
Engel Nyst ef3ec79e85 PR review agent: make eval-risk approval policy repo-specific (#2254)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-01 22:00:31 +01:00
Robert Brennan fe9b8ffbe4 feat: autotitle conversations on first user message (#2225)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-03-01 08:34:17 +00:00
chuckbutkus ec953e0f8c Add server-base-path support for VSCode in path-based routing (#2241)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-03-01 00:19:28 +01:00
Engel Nyst bcdbd5c20b PR review agent: avoid approving eval-risk behavior changes (#2246)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-28 20:08:49 +00:00
simonrosenberg bde715c12b fix: override server_image default to None in DockerDevWorkspace (#2243)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 07:16:46 -03:00
Engel Nyst c2d507d55b ci: use oasdiff for agent-server REST API breakage detection (#2240)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-28 01:25:25 +01:00
Engel Nyst cefaebfb93 fix(sdk): add gpt-5.2-codex, gpt-5.3-codex, and gpt-5.2 to model-variant detection (#2238)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-28 00:39:31 +01:00
Vasco Schiavo 345a675013 refactoring(sdk): Remove hardcoded header from get_factory_info() (#2234) 2026-02-27 23:44:40 +01:00
Engel Nyst c4e31b5b09 docs(subagent): document loader invariants for file-based agents (#2231)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-27 21:24:25 +00:00
Juan Michelini 70b8a36b44 Fix GLM-5 preflight check by filtering SDK-specific parameters (#2194)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-27 15:24:33 +00:00
Vasco Schiavo e24df60a87 fix(tool): race condition in dynamic Action wrapper class creation (#2224)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-02-27 13:44:41 +01:00
Vasco Schiavo a691cda436 fix(tools): merge subagents metrics (DelegateTool) (#2221)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-02-27 12:32:04 +00:00
Vasco Schiavo 731809dc81 fix(tools): merge subagents metrics (TaskToolSet) (#2222)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-27 13:21:49 +01:00
Vasco Schiavo 8b61cd0fa8 fix(sdk): handle newlines in JSON-stringified dict arguments (#2217) 2026-02-27 10:31:30 +01:00
Engel Nyst ee24cd6bb6 ci: enforce agent-server REST API deprecation policy (#2232)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-27 09:02:10 +01:00
dependabot[bot] e88ea689ab chore(deps): bump virtualenv from 20.34.0 to 20.36.1 (#2220)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-26 07:39:17 -06:00
dependabot[bot] 9a82e38159 chore(deps): bump authlib from 1.6.5 to 1.6.6 (#2219)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-26 07:39:05 -06:00
Vasco Schiavo 8ed9fb4fee nit(tools): small refactoring test files (#2215) 2026-02-26 10:11:05 +01:00
dependabot[bot] fce06c6262 chore(deps): bump pypdf from 6.1.1 to 6.7.2 (#2213) 2026-02-26 00:44:49 +01:00
Graham Neubig 734f5f4898 Filter public skills by default marketplace (#2205)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-25 18:30:25 -05:00
dependabot[bot] b5b95e8256 chore(deps): bump pyasn1 from 0.6.1 to 0.6.2 (#2212)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: aivong-openhands <ai.vong@openhands.dev>
2026-02-25 23:26:03 +00:00
dependabot[bot] 65cf395139 chore(deps): bump cryptography from 46.0.3 to 46.0.5 (#2211)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: aivong-openhands <ai.vong@openhands.dev>
2026-02-25 23:25:49 +00:00
dependabot[bot] e68e2b86c9 chore(deps): bump werkzeug from 3.1.1 to 3.1.6 (#2214)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-26 00:18:15 +01:00
dependabot[bot] 25d13b1210 chore(deps): bump python-multipart from 0.0.20 to 0.0.22 (#2210)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-26 00:16:43 +01:00
Juan Michelini da94e849be Add dashscope/qwen3.5-flash-2026-02-23 model configuration (#2207) 2026-02-25 15:37:58 -06:00
Engel Nyst bde31c450a agent-server: support WebSocket auth via headers (#1814)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-25 17:40:25 +01:00
Vasco Schiavo ab8f5f2877 feat(delegate): File-based agent definitions with markdown + YAML frontmatter (#2183)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-25 15:15:47 +00:00
Rohit Malhotra e25a1ef86f fix(tools): remove BrowserToolSet from package init to reduce downstream bundle size (#2197)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-25 14:43:16 +00:00
Vasco Schiavo 6a74a8a8ff feat(tools): task tool set (#2143)
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
2026-02-25 10:02:34 +01:00
Engel Nyst b1da856be7 Do not forward api_key to Bedrock calls (#2195)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-24 19:32:43 +01:00
Robert Brennan 82aabd3dee Add LLM models and providers endpoints to agent-server (#2187)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-23 14:56:16 -05:00
Robert Brennan 5d65f38973 fix: Include boto3 extra in agent-server Docker image (#2188)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-23 14:55:45 -05:00
John-Mason P. Shackelford 135b000cb5 Fix long heredoc commands hanging in SubprocessTerminal on macOS (#2182)

When sending very long multi-line commands (like heredocs with 50+ lines)
to a PTY on macOS, the shell's line discipline can't process input fast
enough, causing commands to hang indefinitely. This affects agent workflows
that use heredocs to create multi-line Python scripts.

## Solution

Add flow-controlled line-by-line sending for multi-line commands exceeding
20 lines:

- Use `select()` to check PTY write-readiness (handles kernel buffer backpressure)
- Add 2ms pacing delay between lines (handles shell processing limits)

The pacing delay is intentional and cannot be replaced by `select()` alone:
the PTY fd is almost always writable, but the shell's line discipline is
the actual bottleneck. This hybrid approach adds ~0.2s overhead for 100-line
commands while preventing hangs.
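
A minimal sketch of the approach described above, not the code from this commit. The names `MULTILINE_THRESHOLD`, `LINE_PACING_DELAY`, `wait_for_writable`, `write_all`, and `send_multiline` are hypothetical stand-ins for the private helpers listed under Changes, and the 1-second `select()` timeout is an assumption.

```python
import os
import select
import time

# Hypothetical constants mirroring the values named in this message.
MULTILINE_THRESHOLD = 20    # commands with more lines take the paced path
LINE_PACING_DELAY = 0.002   # 2 ms pause between lines


def wait_for_writable(fd: int, timeout: float = 1.0) -> bool:
    """Return True once select() reports fd writable, False on timeout."""
    _, writable, _ = select.select([], [fd], [], timeout)
    return bool(writable)


def write_all(fd: int, data: bytes) -> None:
    """Write all of data, waiting for write-readiness before each chunk."""
    while data:
        if not wait_for_writable(fd):
            raise TimeoutError("fd did not become writable")
        written = os.write(fd, data)  # may be a partial write
        data = data[written:]


def send_multiline(fd: int, command: str) -> bool:
    """Send command to fd; return True if the paced path was used."""
    lines = command.split("\n")
    if len(lines) <= MULTILINE_THRESHOLD:
        # Short commands go out in one piece, with no pacing.
        write_all(fd, command.encode())
        return False
    for index, line in enumerate(lines):
        # Restore the newline that split() removed, except after the last line.
        suffix = "\n" if index < len(lines) - 1 else ""
        write_all(fd, (line + suffix).encode())
        # select() alone is not enough: give the reader time to consume the line.
        time.sleep(LINE_PACING_DELAY)
    return True
```

With these values a 100-line command sleeps 100 × 2 ms = 0.2 s in total, which matches the overhead quoted above.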

## Changes

- Add `_MULTILINE_THRESHOLD` (20 lines) and `_LINE_PACING_DELAY` (2ms) constants
- Add `_wait_for_pty_writable()` method using select()
- Add `_send_multiline_with_flow_control()` for paced line-by-line sending
- Modify `send_keys()` to route long commands to flow-controlled path
- Add comprehensive tests for heredoc scenarios

## Testing

Tested on macOS 15.6.1 (arm64):
- 5-line heredoc: PASS
- 50-line heredoc: PASS (was FAIL before fix)
- 100-line heredoc: PASS (was FAIL before fix)
- 150-line heredoc: PASS (was FAIL before fix)

Fixes #2181

Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-23 13:14:11 -05:00
simonrosenberg 5b1ae8356f Visualize ACP tool calls (#2162)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 14:50:01 -03:00
Engel Nyst 6bf154a740 ci: Gate API breakage failures on release PRs (#2173)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-22 19:08:23 +00:00
Robert Brennan f612926076 fix: preserve conversation updated_at across server restarts (#2172)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-22 12:53:23 -05:00
Engel Nyst 40e5c5280b context: load project skills from git root (#2164)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-22 00:33:17 +01:00
Vasco Schiavo cb97bab6d6 fix(Makefile): help and pre-commit (#2168) 2026-02-22 00:04:13 +01:00
Robert Brennan 8ad925746a fix(terminal): use dedicated tmux socket to isolate from user sessions (#2167)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-21 17:38:51 -05:00
Robert Brennan 80256213b7 fix(terminal): avoid sending C-l in TmuxTerminal.clear_screen() (#2166)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2026-02-21 17:11:29 -05:00
Engel Nyst 3f3fb291f7 DRAFT: docs(AGENTS): how to reply to GitHub inline review threads via REST API (#2090)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-21 17:43:05 +01:00
Vasco Schiavo 18d438b5ff feat(sdk): Add TestLLM class for testing without mocking LiteLLM (#2016)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-21 16:19:33 +01:00
simonrosenberg 43ee32f2c2 Add ask_agent support to ACPAgent via fork_session (#2145)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 06:43:43 -03:00
Engel Nyst 7157761403 ci: skip api breakage check when prev lacks __all__ (#2165)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-21 06:19:55 +01:00
Engel Nyst 8178eb2fc6 Fix API breakage check robustness and reporting (#2098)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-21 03:47:33 +01:00
OpenHands Bot 7273dbac66 Release v1.11.5 (#2160)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-20 22:19:42 +00:00
Xingyao Wang 3f8d165ef9 chore: add version bump blocking rule to code review guide (#2161)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-20 22:14:52 +00:00
simonrosenberg fc84a48126 feat: Add ACPAgent for Agent Client Protocol integration (#2133)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-20 19:06:14 -03:00
simonrosenberg 86a0dfecd3 Add event-sourcing system benchmarks (#2032)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-21 04:22:36 +08:00
Juan Michelini 21d1b6713e Add claude-sonnet-4-6 to verified models (#2104)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-21 04:00:22 +08:00
Engel Nyst 1fea988760 Docs: fix README links and typos (#2158)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-21 03:55:06 +08:00
Xingyao Wang f33e328b6d Update integration tests to use claude-sonnet-4-6 (#2113)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-20 19:54:12 +00:00
Calvin Smith f782f70396 Add API compliance test framework for malformed message patterns (#2155)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: allhands-bot <allhands-bot@users.noreply.github.com>
2026-02-20 12:49:32 -07:00
Engel Nyst 366f65b749 CI: simplify pr-review gating and skip forks (#2159)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-20 14:40:17 -05:00
Graham Neubig b7332fd38a feat: Add Gemini 3.1 Pro to evaluation model config (#2153)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-20 14:44:32 -03:00
Graham Neubig 94ad1d5188 fix: Disable browser tools in integration tests to fix ProcessPoolExecutor hang (#2149)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-20 11:58:20 -03:00
Engel Nyst 795e20d8bf Fix PR review workflow (#2126)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-20 03:14:13 +01:00
Juan Michelini fa0d83ff83 Enable Datadog persistence by default in eval job (#2140)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-19 19:32:19 -03:00
Juan Michelini 4d0a8b4138 Fix: Add claude-sonnet-4-6 to EXTENDED_THINKING_MODELS (#2138)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-19 22:22:58 +00:00
Juan Michelini fadc008cc4 Fix: Add 180-minute timeout to integration tests to prevent indefinite hangs (#2131)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-19 21:42:33 +00:00
Calvin Smith c6871794b8 feat(condenser): Explicit view properties (#2116)
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Co-authored-by: Vasco Schiavo <115561717+VascoSch92@users.noreply.github.com>
2026-02-19 13:48:40 -07:00
cid 050991d47c feat(hooks): add async hook execution support (#1849) 2026-02-19 10:43:58 -05:00
Juan Michelini 65cee608b0 Fix: Make litellm import lazy in resolve_model_config.py (#2125)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-18 20:19:11 -03:00
Juan Michelini e2c3cc2d5d fix: Move litellm install before model loading and use correct API key (#2118)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-18 19:08:53 -03:00
Engel Nyst 3e33e48033 Disable uv Actions cache in PR review agent workflow (#2123)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-18 22:01:25 +01:00
Xingyao Wang 23112b7841 chore: Rename code-review.md to custom-codereview-guide.md (#2121)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-18 19:02:48 +00:00
Graham Neubig a2b442e02a Add configurable file editing toolset support (#2077)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Vasco Schiavo <115561717+VascoSch92@users.noreply.github.com>
2026-02-17 21:45:08 -05:00
Graham Neubig 662db6e4f1 Add preflight LLM check before dispatching evaluations (#2109)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-17 21:36:45 -05:00
Juan Michelini 8987bdff30 fix: disable vision for GLM-5 model (#2111) 2026-02-17 19:49:27 -03:00
Calvin Smith b9c0e1f6a8 fix: Forcing minimum condenser progress (#2107)
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-02-17 14:57:30 -07:00
Graham Neubig 8959cb7803 Remove jade-spark-2862 alias, add its settings to minimax-m2.5 (#2106)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-17 16:42:54 -05:00
jpelletier1 2ea6326114 fix: make skill loading resilient to individual skill errors (#2108)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-17 16:08:45 -05:00
Juan Michelini 6992a682a8 Add claude-sonnet-4-6 to expected models (#2102)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-17 16:20:58 -03:00
Juan Michelini 0060aba0a0 Update sdk_ref default in run-eval.yml during release preparation (#1938)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-17 16:11:44 -03:00
Vasco Schiavo 978dd7d1e3 feat(sdk): add fallback strategy (LLM) (#2093)
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-02-17 08:35:06 +01:00
Ray Myers 0ed0111280 security: fix protobuf and pillow CVEs in transitive dependencies (#2095)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-16 20:49:38 +01:00
Ray Myers 91ba0da7fd security: fix HTTP-related CVEs in transitive dependencies (#2094)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-16 13:35:14 -06:00
Graham Neubig 8cda65327d refactor: update PUBLIC_SKILLS_REPO to OpenHands/extensions (#2085)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: neubig <neubig@users.noreply.github.com>
2026-02-16 11:58:40 -05:00
Chris Bagwell e168acbbf3 feat(skills): load user skills from ~/.agents/skills (#2091)
Co-authored-by: Chris Bagwell <chris.bagwell@fujitsu.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-16 17:54:45 +01:00
Graham Neubig 183a85cdc4 feat(pr-review): Support multiple models with random selection for A/B testing (#2024)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: neubig <neubig@users.noreply.github.com>
2026-02-15 21:52:06 -05:00
Engel Nyst 85bb5d13dc fix: validate provider prefixes in unverified model list (#1668)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-15 23:18:24 +01:00
Engel Nyst 4cdb18bd2f Enforce deprecation period for removed public methods (#2083)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-15 20:02:05 +00:00
Engel Nyst 1c7b1a3219 docs: align AGENTS.md guidance across repo (#2087)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-15 17:23:30 +01:00
Engel Nyst 14ed6cce4c docs: add package-scoped AGENTS.md guides (#2086)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-15 15:23:58 +00:00
Engel Nyst 492b2f0999 docs(sdk): add scoped AGENTS.md (#2081)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-15 11:17:37 +01:00
Engel Nyst 23e167d607 ci: extend API breakage checks to openhands-tools (#2080) 2026-02-15 18:14:40 +08:00
Engel Nyst ee4c919919 SDK: bound latest-user-message scan in Agent.step (#1844)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-15 16:00:45 +08:00
Engel Nyst dcebff6e30 Update issue number in integration runner workflow (#2079) 2026-02-15 01:59:07 +01:00
Engel Nyst 12cb484972 ci: extend API breakage checks to openhands-workspace (#2075)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-14 05:55:51 +00:00
Engel Nyst 5fea31269e ci: require deprecation marker before exported symbol removal (#2072)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-14 06:26:29 +01:00
Engel Nyst f96543b0f8 ci: API breakage checks for SDK (Griffe) (#1098)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-14 04:34:49 +00:00
Graham Neubig d338ef30d7 DRAFT: Add feature-release-rollout skill for multi-repo feature propagation (#2069)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-13 23:00:32 -05:00
Graham Neubig 4c31739c2e fix(ci): Fix PR review evaluation workflow artifact detection (#2070)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-13 22:54:55 -05:00
Xingyao Wang 21ae9f6a57 Handle deprecated enable_truncation field in TextContent for backward compatibility (#2027)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2026-02-14 04:33:04 +01:00
Graham Neubig 2fd0f767c0 Add pr_url metadata to Laminar logging in PR review bot (#2068)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-14 02:46:17 +00:00
Graham Neubig d7b3617e6e fix(pr-review): Set trace metadata within active span context (#2062)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-13 18:17:23 -05:00
Xingyao Wang fa4583c0fa fix(llm): ensure second LLM gets independent metrics and telemetry (#1997)
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-13 18:59:07 +08:00
Juan Michelini 13bcf023c9 Add GLM-5 to expected models (#2029) 2026-02-12 21:32:56 +00:00
Peter Hamilton 51fc3e472e Fix issue with remote docker workspaces. (#1807)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-12 17:15:46 +01:00
Rohit Malhotra 38a5c59d61 fix: bundle browser recording JS files in PyInstaller build (#2025)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 10:21:02 -05:00
Engel Nyst 7d784ceac5 docs(review): clarify approval guidance (#2022)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 11:20:00 +00:00
Graham Neubig 5f52dfa6cc fix: add circular reference detection to _process_schema_node (#1956)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 09:05:20 +00:00
Graham Neubig c33b9f2b48 feat(security): add GraySwan Cygnal security analyzer (#1952)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 07:09:07 +00:00
Engel Nyst a6d47711cc fix(pr-review): avoid blank description in prompt (#2021)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 05:52:26 +01:00
Engel Nyst 3d4f0887e2 fix(ci): block PR review auto-run for author_association NONE (#2020)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 02:23:44 +01:00
🐾 smolpaws c87e291f50 docs: trim blustery wording in docs update prompt (#2019)
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 01:10:56 +00:00
Xingyao Wang 7fe69a0ef0 feat: Add current_datetime field to AgentContext (#2012)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: allhands-bot <allhands-bot@users.noreply.github.com>
2026-02-12 06:51:24 +08:00
Rohit Malhotra aa3e50dcae Feat: session recording agent's browser sessions (#1731)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-11 22:50:43 +00:00
Xingyao Wang ba8b98ca99 feat: Load project skills (AGENTS.md, etc.) in PR review action (#2017)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-12 06:41:25 +08:00
Engel Nyst 216c1f0bcf chore: move skills directory to .agents (#1970) 2026-02-11 21:28:41 +01:00
Xingyao Wang 221594ad8d Remove enterprise/pyproject.toml updates from version bump workflow (#2015)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 18:13:35 +00:00
Xingyao Wang 330e7c5cce Fix workflow_run trigger for version-bump-prs (#2014)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 00:49:50 +08:00
OpenHands Bot d7a34ee393 Release v1.11.4 (#2011)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 16:39:01 +00:00
Xingyao Wang ab48d1fc3f Add pre-commit step to version-bump-prs workflow (#2010)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 00:15:07 +08:00
Graham Neubig 79dfb8fa06 Add MiniMax-M2.5 model configuration (#2007)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-11 16:11:13 +00:00
Xingyao Wang 740f437f38 Remove redundant poetry lock calls in version-bump-prs workflow (#2008)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 00:06:03 +08:00
Xingyao Wang ce2f9b542c Fix poetry lock command in version-bump-prs workflow (#2006)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-12 00:01:03 +08:00
Xingyao Wang 37b709d110 fix: fetch latest reviews/threads instead of oldest (#2004)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 15:54:36 +00:00
OpenHands Bot 7194518b34 Release v1.11.3 (#2002)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 23:53:52 +08:00
Xingyao Wang 6c40da5d3f Separate version bump PRs into standalone workflow (#2003)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 15:48:49 +00:00
Xingyao Wang 8f7b33b5f2 feat: add previous review context to PR review agent (#1991)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 15:01:46 +00:00
Graham Neubig 12f675ce84 fix: serialize Laminar span_context UUIDs as strings for JSON compatibility (#2000)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 13:11:35 +00:00
Graham Neubig 100e9af80d docs(AGENTS.md): add GraphQL examples for resolving review threads (#1993)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 18:10:02 +08:00
Xingyao Wang 56a11eacea feat: add rejection_source field to UserRejectObservation (#1995)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 18:00:22 +08:00
Graham Neubig 2ae53c9e63 fix(llm): detect context window errors from Minimax via APIConnectionError (#1992)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-11 07:54:30 +00:00
Graham Neubig eb74fa3285 Fix Laminar trace continuation for PR review evaluation (#1988) 2026-02-10 18:00:45 +00:00
Graham Neubig c291c39b44 fix: separate static system prompt from dynamic context for cross-conversation caching (#1890)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
Co-authored-by: Simon Rosenberg <simonrosen10@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-10 16:29:22 +00:00
Graham Neubig cfe52af49c ci: switch integration tests to use eval proxy (#1985)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-10 15:16:26 +00:00
Graham Neubig 555376505e feat(eval): add jade-spark-2862 model for evaluation (#1984)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-10 14:51:52 +00:00
Xingyao Wang a506db1b7e Add stop hook for pre-commit, pytest, and CI validation and add /api/hooks to agent-server (#1878)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-10 19:38:17 +08:00
Xingyao Wang c3aa5d7cb8 fix: exclude reference markdown files in agentskills folders from being loaded as skills (#1982)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-10 18:56:53 +08:00
OpenHands Bot 69d5249fa1 Release v1.11.2 (#1976)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-09 17:48:54 +00:00
Xingyao Wang d14f55f3f9 feat(examples): update critic example to demonstrate iterative refinement (#1879)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2026-02-09 23:38:41 +08:00
Xingyao Wang d8509879bc Enhance planning agent to ask clarifying questions for ambiguous requests (#1967)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-09 14:40:48 +00:00
Graham Neubig 937ea93f24 fix: handle PS1 metadata corruption in command output (#1817)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-09 14:09:25 +00:00
Rohit Malhotra 4d481dffab fix(pyinstaller): include delegate tool templates in bundle (#1971)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-09 18:55:16 +08:00
Vasco Schiavo 63f71f216c feat(sdk): introduce LLMProfileStore for persisted LLM configurations (#1928)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-09 11:32:33 +01:00
Graham Neubig 739feaa55b Add context window size validation for local LLMs (#1961)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-09 18:27:08 +08:00
Graham Neubig e7873f2127 feat: Add delayed evaluation for PR review with Laminar signals (#1954)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-08 16:21:36 -05:00
Xingyao Wang a902c33eb7 fix: update LLM model and base URL for pr-review workflow (#1968)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-08 16:31:19 +01:00
Engel Nyst 4ceb070f3a chore: add review thread gate (#1962)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-08 22:35:27 +08:00
Xingyao Wang a50ff2c7d3 refactor: Use composite GitHub Action for PR review workflow (#1927)
Co-authored-by: Juan Michelini <juan@juan.com.uy>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2026-02-08 10:22:47 +01:00
Engel Nyst 3b85cf4e61 Fix tool serialization test flakiness (#1959)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-08 01:10:26 +08:00
Juan Michelini fee28c0470 fix: disable vision for glm-4.7 to prevent multimodal evaluation failures (#1898)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-07 13:25:16 +01:00
Wang Siyuan 30e2209560 fix(message): Responses tool image serialization (preserve file_editor images) (#1895)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-06 17:29:02 +01:00
Juan Michelini 83a5f38eda Add Laminar traces to PR review workflow (#1949)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-06 16:16:00 +00:00
Graham Neubig 9fed9a38c9 feat(ci): add model_ids and issue_number inputs to integration-runner (#1883)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-06 11:56:45 -03:00
Xingyao Wang 29c330a620 fix: use Python 3.12 for openhands-cli version bump (#1924)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-06 13:40:05 +01:00
Vasco Schiavo 469cc04e86 nit(sdk): change usage_to_llm to return a MappingProxyType (#1930)
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-02-06 20:38:23 +08:00
simonrosenberg 5cad1f92de fix(llm): Add Claude Opus 4.6 to model features for reasoning_effort support (#1941)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-06 13:23:03 +01:00
Graham Neubig 15d87e6ac5 feat: add /ready endpoint for proper Kubernetes readiness checks (#1810)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Co-authored-by: Juan Michelini <juan@juan.com.uy>
2026-02-06 07:01:35 -05:00
Engel Nyst d061308591 Add example for reconstructing OpenAI messages from events (#1916)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: allhands-bot <allhands-bot@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-06 17:06:55 +08:00
Engel Nyst 347b496df5 feat(llm): add gpt-5.3-codex to subscription models (#1940)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-06 09:40:10 +08:00
Juan Michelini df3319bdc4 Set default sdk_ref to v1.11.1 in run-eval workflow (#1935)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 21:34:04 +01:00
Graham Neubig d4f22d0e24 Add Claude Opus 4.6 model support (#1933)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Juan Michelini <juan@juan.com.uy>
2026-02-05 16:46:42 -03:00
Juan Michelini faf9b23cac fix: Remove python-version from setup-uv in create-version-bump-prs job (#1926)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 14:00:54 -03:00
Juan Michelini 7b0feb23aa Add litellm_proxy/gpt-oss-20b to resolve_model_config.py (#1906)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-02-05 13:09:29 -03:00
OpenHands Bot 8576e3e977 Release v1.11.1 (#1921)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-05 23:52:27 +08:00
Xingyao Wang 7b76814f1a fix: use blacksmith runners for all test jobs to fix coverage (#1920)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 22:21:53 +08:00
Engel Nyst b7aabde5e7 feat(skills): support .agents/skills directory (#1914)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 22:07:03 +08:00
Xingyao Wang f812796b78 Add skill for debugging test-examples workflow (#1887)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-05 14:05:50 +00:00
Xingyao Wang 8fd9421edd fix: wait for WebSocket terminal status to prevent event loss (#1832)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-05 21:53:37 +08:00
Engel Nyst 78f527cf24 Cache Qwen3 tokenizer config for critic template tests (#1919)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 13:34:36 +00:00
Graham Neubig 97731fe57d fix: revert agent-server Docker image to Python 3.12 (#1910)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-05 13:19:18 +00:00
Xingyao Wang f8739efadc chore: remove blacksmith CI runners and use GitHub's default runners (#1915)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 13:09:14 +00:00
Engel Nyst 300b7b4826 Clarify execute_tool bypasses confirmation/security checks (#1917)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 13:05:22 +00:00
Xingyao Wang 4333ebdafd Add conversation.execute_tool() method for pre-run tool execution (#1833)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-02-05 20:42:25 +08:00
Juan Michelini eae56a35f8 feat(llm): add Kimi K2.5 to verified models (#1907)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-05 12:28:53 +00:00
Graham Neubig a76b7383c3 fix(mcp): handle timeout errors gracefully in MCP tool execution (#1862)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 19:45:50 +08:00
Xingyao Wang 519ebae0c3 Update code-review skill with repo-specific approval guidelines (#1904)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-05 13:12:48 +08:00
Graham Neubig 20291bda02 fix: serialize tmux session creation to prevent race conditions (#1889)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-02-05 00:01:19 +08:00
Juan Michelini 6092a7e91b feat(llm): add gpt-5.2-codex to verified models (#1893)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-04 12:56:26 -03:00
Juan Michelini 62b4744ece Add qwen3-coder-30b-a3b-instruct model to resolve_model_config.py (#1903)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-04 12:43:05 -03:00
Engel Nyst 8ac82ba38b Fix truncation at LLM message layer (#1838)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-04 22:46:09 +08:00
Juan Michelini e6dc70f500 Change qwen3-coder-next model from together.ai to openrouter (#1900)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-04 11:40:57 -03:00
Xingyao Wang 1a441e26bd utils(release): Fix poetry lock conflict and move openhands-cli step earlier (#1888)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-02-04 19:14:53 +08:00
Juan Michelini 6ce3ba51a9 Add qwen3-coder-next to expected models (#1892)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-03 23:18:45 -03:00
Engel Nyst b8eff79609 Fix license (#1869) 2026-02-03 20:47:28 +01:00
OpenHands Bot bd53941299 Release v1.11.0 (#1884)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-03 17:02:35 +00:00
Hiep Le 140fb5a7b7 feat: store plan.md file in appropriate configuration folders (#1876) 2026-02-03 23:45:40 +07:00
Hiep Le b498a69908 fix: unable to start a new v1 conversation when using the latest sdk code (#1881) 2026-02-03 02:54:59 +07:00
Colin bc7ea211dd fix: several issues related to scalability (#1619)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-02 23:56:31 +08:00
Xingyao Wang 4d217c1316 fix: update enterprise/pyproject.toml before poetry lock in version bump workflow (#1834)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-02 16:15:13 +01:00
Xingyao Wang d85241bd2b feat(llm): Add subscription-based authentication for OpenAI Codex models (#1682)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Leonardo Gonzalez <Leonardo@EncryptedCommerce.net>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-02-02 21:54:55 +08:00
daviskay75 6af88ed971 fix: Add AttributeError to escape_bash_special_chars exception handling (#1871)
Co-authored-by: kaisshili <kaisshili@gmail.com>
2026-02-02 14:16:38 +08:00
Graham Neubig cd853c1d69 feat: require semantic version name on eval github action (#1867)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-02 09:27:02 +08:00
Ray Myers 2b0fce1012 Add --check-browser flag for browser functionality testing (#1329)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-02-02 09:26:16 +08:00
Xingyao Wang 8296a7f4ab fix: add MAINTAINERS file and update triage prompt to use it (#1859)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-31 16:59:56 +08:00
Graham Neubig 32ca3d86cd fix: prevent duplicate events in bash polling via order__gt filtering (#1816)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-30 12:19:44 -05:00
Xingyao Wang f6322c988d Add guidance to avoid exposing raw secret values in conversation (#1855)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-01-30 16:43:50 +00:00
Calvin Smith b79d72faa6 feat(condenser): Hard context reset on unrecoverable error (#1596)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-30 08:06:47 -07:00
Engel Nyst 7b7faa256c Avoid materializing full event history in Agent.init_state (#1840)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-30 14:01:41 +01:00
John-Mason P. Shackelford e74006d6f9 fix(security): prevent auth token leakage in trajectory exports (#1848)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-30 14:25:05 +08:00
Engel Nyst b53099a122 Truncate terminal outputs before persisting events (#1823)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-30 14:20:56 +08:00
Juan Michelini 9e5e31950d Add GLM-4.7 to expected models (#1854)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-29 16:41:36 -03:00
Engel Nyst ae0dc8a992 Remove truncation for skill files (AGENTS.md, others) (#1842)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-28 03:14:03 +01:00
Engel Nyst 3863f01255 Redact cookie/token headers in LookupSecret serialization (#1822)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2026-01-28 03:01:57 +01:00
Graham Neubig 7b9f816684 Add Kimi K2.5 and Qwen3 Max Thinking models to eval config (#1845)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-27 17:32:52 -03:00
Graham Neubig 004b444a45 Add run-eval skill for evaluation guidance (#1809)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-27 14:58:33 -05:00
Engel Nyst edfea677a6 Add integration-style tests for conversation restore behavior (#1799)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-27 17:49:21 +01:00
Engel Nyst 42cbff36d6 ci: default to Python 3.13 (keep 3.12+ support) (#1836)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-27 05:47:08 +01:00
Engel Nyst 6c86d7d387 Cap event history scanned by StuckDetector (#1829)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-27 01:40:28 +00:00
OpenHands Bot 8f17c399b6 Release v1.10.0 (#1827)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-27 08:10:46 +08:00
Graham Neubig 5d2f7eae98 Add gpt-5.2-codex model to evaluation config (#1835)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-26 16:40:45 -03:00
Xingyao Wang 35b32e55da Simplify PR review prompt to use /github-pr-review skill (#1792)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-26 23:47:52 +08:00
Xingyao Wang f40f2def7e refactor: remove redundant shutdown reconcile call (#1820)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2026-01-26 11:50:34 +08:00
Xingyao Wang d18513b01e fix: ensure WebSocket is subscribed before operations (#1791)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-25 07:05:40 +08:00
Chris Bagwell da578199a1 Hide pyinstaller vendored libraries from subprocesses (#1758)
Co-authored-by: Ray Myers <ray.myers@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-24 13:38:43 +00:00
Engel Nyst befbbc34f5 Deprecate safety_settings field in LLM configuration (#1803)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-24 03:26:04 +01:00
Graham Neubig 98c8c22e80 refactor(message): move serialization control to to_chat_dict() parameters (#1800)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-23 21:23:45 -05:00
Graham Neubig 6a2d9df85e Fix MCP HTTP session persistence across tool calls (#1740)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-23 10:00:00 -05:00
Juan Michelini 1650884215 Add NVIDIA Nemotron 3 Nano 30B model configuration (#1804) 2026-01-23 22:58:29 +08:00
Graham Neubig b266f14bf3 feat(message): require explicit values for serialization control fields (#1798)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-22 21:26:11 -05:00
Engel Nyst 03b56ccf6f refactor: enforce tool immutability on conversation restore (#1788)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-23 00:57:30 +01:00
Calvin Smith d616ab07d6 feat(condenser): Multi-summary views (#1721)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2026-01-22 13:28:59 -07:00
John-Mason P. Shackelford 0fc053a4f6 feat(plugin): Load commands as keyword-triggered skills (#1676)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-22 14:18:01 -05:00
John-Mason P. Shackelford 89cfc321e2 feat(agent-server): Support plugin loading when starting conversations (#1651)
Extends the SDK and agent server to support loading multiple plugins when
starting conversations via a `plugins` parameter. Plugins are fetched, loaded,
and their skills, hooks, and MCP configuration are merged into the conversation
context.

## Changes

### Plugin Loading via `plugins` Parameter

Both `LocalConversation` and `RemoteConversation` (and the agent server API)
now accept a `plugins: list[PluginSource]` parameter supporting GitHub refs,
specific versions, monorepo paths, and local paths.

### SDK Plugin Utilities

- `PluginSource` model for specifying plugin sources
- `load_plugins()` function for loading multiple plugins and merging them
- `HookConfig.merge()` for combining hooks from multiple plugins
- `Plugin.add_skills_to()` and `Plugin.add_mcp_config_to()` for merging content

### Plugin Content Merging Behavior

- Skills: Override by name (last plugin wins)
- MCP Config: Override by key (last plugin wins)
- Hooks: Concatenate (all hooks run)
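
The merge rules above can be sketched as follows. This is a minimal illustration only, assuming each plugin is represented as a plain dict; `merge_plugin_content` and the `skills` / `mcp_config` / `hooks` keys are hypothetical names for this example and are not the SDK's `load_plugins()`, `HookConfig.merge()`, or `Plugin` API.

```python
from typing import Any


def merge_plugin_content(plugins: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge plugin content in list order (hypothetical helper, not the SDK API)."""
    skills: dict[str, Any] = {}
    mcp_config: dict[str, Any] = {}
    hooks: list[Any] = []
    for plugin in plugins:
        # Skills: override by name, so a later plugin replaces an earlier one.
        skills.update(plugin.get("skills", {}))
        # MCP config: override by key, following the same last-wins rule.
        mcp_config.update(plugin.get("mcp_config", {}))
        # Hooks: concatenate, so the hooks from every plugin are kept.
        hooks.extend(plugin.get("hooks", []))
    return {"skills": skills, "mcp_config": mcp_config, "hooks": hooks}


# Two illustrative plugins that both define a "lint" skill and a "fetch" server.
first = {
    "skills": {"lint": "v1"},
    "mcp_config": {"fetch": {"command": "a"}},
    "hooks": ["pre_a"],
}
second = {
    "skills": {"lint": "v2", "docs": "v1"},
    "mcp_config": {"fetch": {"command": "b"}},
    "hooks": ["pre_b"],
}
merged = merge_plugin_content([first, second])
```

Under these assumptions, the order of the plugin list decides which skill or MCP entry survives a name clash, while hooks accumulate in list order.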

## Testing

- `tests/sdk/plugin/test_plugin_loader.py` - Tests for `load_plugins()` utility
- `tests/sdk/plugin/test_plugin_merging.py` - Tests for merging utilities
- `tests/sdk/conversation/test_local_conversation_plugins.py` - LocalConversation tests
- `tests/agent_server/test_conversation_service_plugin.py` - Agent server tests

Closes #1650

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-22 14:06:15 -05:00
OpenHands Bot 989fc11053 Release v1.9.1 (#1781)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-21 22:56:18 +00:00
Tim O'Farrell 178e274492 Added more fixes for parsing discriminated unions (#1779)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-22 06:44:04 +08:00
OpenHands Bot cf43183b77 Release v1.9.0 (#1775)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-21 17:37:08 +00:00
Engel Nyst fdfbc157eb Add 05 skills/plugins examples to test-examples (#1773)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-21 23:19:44 +08:00
Rohit Malhotra 9b3a2b75f3 Add set_security_analyzer abstract method to BaseConversation (#1772)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-21 23:05:16 +08:00
Graham Neubig 3c95cd8298 Update idle time on bash, git, and file operations (#1770)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-21 09:43:04 -05:00
Tim O'Farrell b9da76de50 feat: Support full class names in DiscriminatedUnionEnvParser (#1768)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-21 07:27:17 -07:00
Juan Michelini 4b9b2a17c4 Add minimax/MiniMax-M2.1 to resolve_model_config.py (#1757)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-21 11:23:28 -03:00
Rohit Malhotra 9d4a7a431b feat(delegate): Add create_sub_visualizer method for custom sub-agent visualization (#1767)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-20 16:22:11 -05:00
John-Mason P. Shackelford b4719098ac feat: Add .pr/ directory convention and auto-cleanup workflow (#1764) 2026-01-20 20:38:27 +00:00
Xingyao Wang f41277076f Add API-Based Critic for Real-Time Agent Action Evaluation (Experimental) (#1269)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
2026-01-21 04:18:12 +08:00
simonrosenberg 17b954bd89 remove gpt-5-mini from list of models to evaluate for openhands index (#1766) 2026-01-20 17:19:14 +01:00
Graham Neubig d1722aabe1 Fix unhandled ConversationRunError causing non-JSON logs (#1680)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-20 10:51:43 -05:00
Graham Neubig 565bc1fad7 fix: Add JSON formatter for uvicorn access logs (#1733)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-20 10:51:11 -05:00
Xingyao Wang 76518fe1bf refactor(examples): use .from_dict() for hooks example (#1746)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-20 09:36:55 +07:00
shanemort1982 e1d42699d6 Fix CORS to allow DOCKER_HOST_ADDR for remote browser access (#1466)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-20 01:37:27 +00:00
Engel Nyst afc14483e5 docs: rewrite CONTRIBUTING around SDK architecture (#1755)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-19 21:53:04 +01:00
simonrosenberg 35d75e3aad Fix cache tag truncation with ports and latest suffix (#1626)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-19 19:34:46 +01:00
Xingyao Wang de9ae6bc51 Add GitHub repository URLs to PyPI package metadata (#1753)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-19 18:39:11 +01:00
Tim O'Farrell 1ed4e2af91 Fix DiscriminatedUnionEnvParser to use single parser when only one kind exists (#1741)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-18 08:02:45 -07:00
Cesar Garcia 07d6caf644 feat: add Gemini 3 models to reasoning effort supported list (#1752)
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-18 04:19:21 +01:00
Juan Michelini 7194a1f0b1 Remove push_to_index option from run-eval.yml workflow (#1750)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-17 23:02:39 -03:00
Graham Neubig 0f4bbd5a6d Add Marketplace datamodel for marketplace.json support (#1744)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-16 11:52:07 -05:00
Xingyao Wang c772e78747 refactor(hooks): add typed fields to HookConfig for better type safety (#1726)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Alona <alona@all-hands.dev>
2026-01-16 23:36:14 +08:00
simonrosenberg 426cc261e9 Remove unsupported models (#1734)
Co-authored-by: enyst <engel.nyst@gmail.com>
2026-01-16 14:54:30 +01:00
Graham Neubig c6c2abca9c fix: move async operations outside sync lock in event subscription (#1732)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-15 22:37:00 -05:00
Graham Neubig 6762ac32c0 Fix event ordering in RemoteEventsList by inserting events sorted by timestamp (#1737)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-15 17:19:06 -05:00
John-Mason P. Shackelford 5734ec0e72 feat(plugin): Add Plugin.fetch() for remote plugin fetching and caching (#1647)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-15 15:50:57 -05:00
Graham Neubig 74f2b906d1 Replace mocked MCP tests with real integration tests (#1678)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-15 12:46:31 -07:00
Graham Neubig 1de40c4d99 Add GPT-5.2 high reasoning to model configurations (#1735)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-15 14:25:02 -05:00
Hiep Le a3147c0ae7 feat: load skills from agent server (#1729)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-16 00:00:57 +07:00
Juan Michelini ef1c2dce89 feat: add multiswebench support to run-eval workflow (#1693)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-14 18:01:19 +01:00
Juan Michelini 1c4474a160 Add swebenchmultimodal support to run-eval workflow (#1659)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Ray Myers <ray.myers@gmail.com>
2026-01-14 12:31:49 -03:00
OpenHands Bot 63f137261f Release v1.8.2 (#1722)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-14 15:04:49 +00:00
simonrosenberg c49d420779 Propagate Datadog conversation logging flag to evaluation dispatch (#1703)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-14 14:29:29 +01:00
Graham Neubig 6a004a1d53 fix: route qwen-3-coder through Fireworks provider (#1720)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-13 19:10:01 -03:00
Juan Michelini 5fe7eef039 Add user tracking to evaluation workflow (#1716)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-13 15:31:34 -03:00
simonrosenberg 319279f5b5 Add configurable startup wait for APIRemoteWorkspace (#1713)
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-13 19:19:04 +01:00
Xingyao Wang 66c8c77f87 Revert "fix(critic): Accept MessageEvent as valid finish signal in AgentFinishedCritic" (#1715) 2026-01-14 01:55:53 +08:00
Chris Bagwell dd4c5d9a79 Add ARG USERNAME to base-image stage for podman (#1691) 2026-01-14 01:49:11 +08:00
Engel Nyst b5df465242 Fix PR review bot to diff against current base commit (#1685)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-13 23:58:36 +08:00
Graham Neubig 37dbab04cd Add crash diagnostics logging to agent server (#1689)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Simon Rosenberg <simonrosen10@gmail.com>
2026-01-13 14:27:08 +01:00
Engel Nyst ad6bdf65e4 Remove .openhands/skills/repo.md symlink (#1709) 2026-01-12 18:44:01 -05:00
Aditya Bharat Soni dc900b6b67 Optimize docker-like behaviour in apptainer with better code quality (#1711) 2026-01-12 18:32:25 -05:00
Rohit Malhotra a0763096f7 Fix verify method to include builtin tools in event check (#1710)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-12 14:19:37 -08:00
Engel Nyst 89daa137b0 llm: log failed Responses calls with request context (#1684) 2026-01-12 23:05:54 +01:00
Calvin Smith 6959cf18b1 fix(condenser): Forgetting range calculation interprets keep_first correctly (#1708)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2026-01-12 22:01:26 +00:00
Graham Neubig d6552a200d Fix deepseek-v3.2-reasoner model name to use valid API identifier (#1707)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-12 18:40:19 -03:00
Rohit Malhotra 1f2fb98447 Fix: Disable streaming for sub-agents in delegate tool (#1705)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-12 20:56:02 +00:00
Calvin Smith 3e0b8e554a Condenser integration tests (#1652)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-12 13:00:41 -07:00
Engel Nyst d7d9a0e8df Add AGENTS.md and link repo.md (#1690)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-12 19:00:41 +00:00
Colin efb5105e47 feat: support custom volumes mounting for DockerWorkspace (#1618)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-12 17:11:46 +00:00
Xingyao Wang 014de00279 feat(pr-review): Add priority labels to review comments (#1696)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
2026-01-13 01:10:15 +08:00
cid 397ab08ce2 fix: add file-based locking to EventLog.append() (#1614)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2026-01-12 16:16:20 +00:00
Graham Neubig eff670ae46 fix(critic): Accept MessageEvent as valid finish signal in AgentFinishedCritic (#1695)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-12 11:08:31 -05:00
Tim O'Farrell c8618209a9 Fix conversation restore failing due to secret serialization/deserialization mismatch (#1672)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2026-01-12 07:57:45 -07:00
Graham Neubig f011127916 Fix useless logging in sockets.py (#1679)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-11 00:07:30 -05:00
Engel Nyst 20ae6f592a Add visualize() for ConversationErrorEvent (#1686) 2026-01-10 22:27:14 +01:00
Xingyao Wang 90ca600cc2 Fix PR review agent multi-line suggestion alignment (#1677)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-01-09 22:44:27 +01:00
Xingyao Wang d3e551276c Add per-test timeout to prevent hanging example scripts (#1666)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-10 03:14:26 +08:00
Xingyao Wang cdc427582d fix: mask LLM_API_KEY and GITHUB_TOKEN secrets in PR review agent (#1675)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-09 18:32:01 +00:00
Xingyao Wang 66a73896e5 Add --no-pager instruction for git commands in system prompt (#1673)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-10 02:13:10 +08:00
Xingyao Wang 2c9aed8c2c feat: auto-trigger PR review on new ready-for-review PRs + fix duplicate comments (#1669)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-10 02:09:10 +08:00
OpenHands Bot c689dadab8 Release v1.8.1 (#1670)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-10 01:30:47 +08:00
John-Mason P. Shackelford 013583fb7e fix(tests): Enable accurate coverage reporting with pytest-xdist (#1658)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-09 16:35:16 +00:00
Xingyao Wang 8f03a1b8d8 feat: Add inline review comments for PR review workflow (#1654)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-01-10 00:17:55 +08:00
Xingyao Wang 03eccdb105 fix: make publish workflow idempotent for re-runs (#1667)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-10 00:17:07 +08:00
OpenHands Bot a5f5691569 Release v1.8.0 (#1663)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-09 16:01:25 +00:00
simonrosenberg 79e9abcdea Expose max_retries input and dispatch to eval (#1641) 2026-01-09 10:24:29 +01:00
Rohit Malhotra 68fb1e5e79 Add alive property to RemoteWorkspace (#1655)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-08 19:26:54 -05:00
Xingyao Wang 97d1734a65 feat: Implement AgentSkills progressive disclosure for SKILL.md files (#1644)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-01-08 23:32:44 +00:00
Xingyao Wang 4600b2be9a feat: add utility to convert legacy OpenHands skills to AgentSkills format (#1643)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-09 07:01:59 +08:00
Mahesh Jaganiya 4816fa5f83 docs: Update environment configuration for secret encryption in Agent Server (#1465)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-01-08 22:50:38 +00:00
Rohit Malhotra a93e84d0eb feat: Add public sandbox_id field to OpenHandsCloudWorkspace for resuming existing sandboxes (#1603)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-08 16:51:48 -05:00
Engel Nyst 845e7ae512 Use AGENTS.md instead of repo.md in SDK examples/system prompt (#1648) 2026-01-08 20:38:58 +01:00
Xingyao Wang b833777df6 fix: remove duplicate example numbers in 02_remote_agent_server (#1637)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-09 03:32:00 +08:00
Xingyao Wang ad0802dc8c feat: Add include_default_tools option to control built-in tools (#1594)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-08 13:00:41 -05:00
Graham Neubig 7072d2a9f5 fix: Auto-add skills to_prompt() output to system prompt (#1642)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-09 01:55:50 +08:00
Xingyao Wang 878a189e19 feat(pr_review): send complete git diff to agent in first message (#1639)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-09 01:35:54 +08:00
Graham Neubig f83870cb98 feat(llm): Set default LLM timeout to 5 minutes (300s) (#1638)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-08 18:08:52 +01:00
Aditya Bharat Soni f8691b83ad Allow over-riding system-level configurations from apptainer.conf in apptainer workspace (#1628) 2026-01-09 00:29:48 +08:00
Xingyao Wang 053f0a93d1 fix: Pass agent_context at Agent initialization in PR review script (#1632)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-08 23:36:17 +08:00
Xingyao Wang 9b75953bcc Fix AGENTS.md not being loaded when skills directories don't exist (#1624)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-01-08 10:20:48 +08:00
Xingyao Wang ccf4b2589a fix: Update PR review workflow to use correct repository name (#1623)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-08 06:24:24 +08:00
Xingyao Wang c647618c5e Fix context window displaying as 0 when resuming conversation (#1590)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-08 06:23:07 +08:00
Graham Neubig 86c7854c19 fix: resolve UnsupportedFieldAttributeWarning for secret_registry field (#1625)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-07 16:25:46 -05:00
simonrosenberg 88f11eb489 Revert "Expose runtime_resource_factor in run-eval" (#1627) 2026-01-07 22:01:23 +01:00
juanmichelini 47c2ff56c4 Add expected index models to resolve_model_config.py (#1497)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-07 15:32:05 -03:00
Xingyao Wang 1220936732 Add action summary feature for tool calls (#1339)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-08 02:18:50 +08:00
Calvin Smith d1baf30853 Multi-step integration tests (#1613)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2026-01-07 09:40:41 -07:00
Graham Neubig 9aff9be9a5 feat: Add skills example and include skill location in agent prompt (#1599)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-07 11:23:22 -05:00
Xingyao Wang 3d97f79156 Fix poetry lock command for Poetry 2.x compatibility (#1622)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-07 23:42:44 +08:00
Graham Neubig 755bd1689b feat: add to_prompt() for XML skill prompt generation (#1483)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-07 10:29:19 -05:00
Xingyao Wang 0d776394dd Add skills support to PR review example with /codereview and /codereview-roasted (#1617)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-07 15:28:09 +00:00
Graham Neubig 5ec1804c08 feat: add skill validation improvements (#1597)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-07 09:19:02 -05:00
simonrosenberg e640f1b53c Expose runtime_resource_factor in run-eval (#1605) 2026-01-07 09:32:16 +01:00
Graham Neubig 2c5a439635 feat: Add plugin loading example demonstrating skills, hooks, and MCP (#1616)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-06 21:27:25 -05:00
Graham Neubig 6cf77eb418 feat: Add Plugin data model and basic loading from directories (#1611)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-06 20:49:05 -05:00
Alona 877b4fe17b feat(sdk): complete hooks implementation with additional context and stop hook (#1547) 2026-01-07 01:45:05 +07:00
Xingyao Wang 3f70f7bdee Add code review microagent based on xingyaoww's review history (#1150)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-07 00:33:17 +08:00
Graham Neubig 5f1a34c039 fix(git): skip remote check for repos with no commits (#1606)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-06 11:26:25 -05:00
Engel Nyst a918f39430 refactor: remove reconciliation methods, use runtime agent directly (#1542)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-06 16:58:07 +01:00
juanmichelini 7ed4c9c272 Add dynamic run name title to Run Eval workflow (#1612)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-06 12:02:38 -03:00
Xingyao Wang bea31b75ad Fix stats being reset when resuming a conversation (#1591)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-06 08:35:26 +08:00
Graham Neubig 9e48991e92 feat: support resource directories (scripts/, references/, assets/) (#1482)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-05 18:54:39 -05:00
OpenHands Bot 91f19619b6 Release v1.7.4 (#1595)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-05 20:16:01 +00:00
Graham Neubig eadaabb1d9 feat: support .mcp.json for MCP server configuration (#1481)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-05 15:09:39 -05:00
simonrosenberg 6f3b77d4ab Add push_to_index dispatch flag (#1588) 2026-01-05 16:58:58 -03:00
Tim O'Farrell 817a2881a9 Bump pydantic (#1593)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-05 19:40:10 +00:00
Calvin Smith 1fbf8679d6 fix(condenser): Retry on empty condensation (#1577)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-06 02:52:15 +08:00
Rohit Malhotra 944beed2ac feat: Create conversation when provided ID doesn't exist (#1579)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-05 13:29:39 -05:00
Tim O'Farrell 420139344f More efficient websocket based implementation for async bash execution (#1587)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-04 13:29:14 -07:00
Tim O'Farrell 8fb2354e23 Set initial execution status to error if it was running (#1554)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-03 19:59:44 -07:00
cid 996daa7ccb feat: relax tool matching on resume (#1538) 2026-01-03 17:27:20 -07:00
Engel Nyst 7b623dab75 Update cryptography to 46.0.3 (#1585)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-04 00:20:14 +01:00
Engel Nyst 774b57fdfd remove custom_llm_provider from LLM config (#1583) 2026-01-03 20:59:51 +01:00
Tim O'Farrell 69eaeb231f More test fixes (#1581) 2026-01-03 12:38:48 -07:00
simonrosenberg 0e96c43778 Fix RemoteConversation polling on terminal errors (#1572) 2026-01-03 12:07:26 +01:00
Engel Nyst 206750c76a Relax needs-triage duration (#1567)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-03 03:43:22 +01:00
Calvin Smith 11c914660e fix: Cap thinking budget below max_tokens (#1580)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2026-01-02 15:44:17 -07:00
Tim O'Farrell 31a233bcb7 Re-add discriminator field to discriminated union JSON schema (#1578)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-02 14:57:18 -07:00
Xingyao Wang 230fb9ff5f Add test to verify working_dir standardization (closes #211) (#1576)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-03 04:20:45 +08:00
Tim O'Farrell 67a9ea4869 Fix DiscriminatedUnionMixin issues with Pydantic C bindings (#1555)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2026-01-02 10:25:30 -07:00
simonrosenberg cbc8cfb140 refactor: consolidate duplicated blocking wait logic in RemoteConversation.run() + remove defensive if else (#1569)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-02 14:42:58 +01:00
cid 18b50a464c feat: add DeepSeek v3.2 model support (#1564) 2026-01-01 22:21:02 +01:00
Tony Narlock 8540fd85ce build(deps): Bump libtmux to >=0.53.0 for v0.51.0 hard deprecations (#1561) 2026-01-02 01:37:34 +08:00
cid 1587f8d9f1 feat: add pause() and resume() methods to workspace classes (#1539)
Co-authored-by: openhands <openhands@all-hands.dev>
2026-01-01 14:55:59 +00:00
Graham Neubig fd98c16f9d feat: support SKILL.md file convention and name validation (#1480)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-31 08:11:57 -05:00
Xingyao Wang 9e20a39d50 fix: Wait for PyPI propagation and add Slack notification to release workflow (#1552)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-31 04:43:38 +08:00
Rohit Malhotra 780713daf1 Update OpenHands version bump workflow to include all required steps (#1553)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-30 14:14:34 -05:00
OpenHands Bot 5cbfbf768f Release v1.7.3 (#1551)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-30 17:51:38 +00:00
Xingyao Wang 1f6ec5b25e Fix ValueError from tool execution causing conversation to fail (#1550)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-31 01:28:32 +08:00
Tim O'Farrell 8f067313ce Fix save_meta() not being called on conversation creation (#1548)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-30 23:53:34 +08:00
simonrosenberg df84808593 Propagate remote conversation errors (#1546) 2025-12-30 16:44:30 +01:00
Xingyao Wang 853ac46b4e Raise ValueError when condensing 0 events (#1541)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-30 23:30:39 +08:00
simonrosenberg 73769d5e9d Add forward_env support to APIRemoteWorkspace (#1540)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-30 11:23:37 +01:00
Graham Neubig 62d35bb543 feat: add openrouter/minimax-m2 to SEND_REASONING_CONTENT_PATTERNS (#1152)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-29 18:09:59 -05:00
Engel Nyst 0ea77d3ea8 Add GPT-5 preset using ApplyPatchTool (opt-in) (#1462)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-29 23:25:28 +01:00
Graham Neubig 3009354f3c Add ApptainerWorkspace implementation for rootless container support (#892)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Aditya Bharat Soni <adityasoni9998@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-12-29 15:51:08 -05:00
OpenHands Bot 0f79f046d6 Release v1.7.2 (#1530)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-29 17:49:15 +00:00
Xingyao Wang 116ea7076a feat: Add support for custom tools with remote agent server (#1383)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2025-12-30 00:50:36 +08:00
Xingyao Wang e591e2f7d5 Revert "ci(integration): update integration LLM matrix to gpt-5.2-codex" (#1524)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-30 00:06:26 +08:00
Graham Neubig a6afa65292 Add pull request template (#1528)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-12-29 15:54:17 +00:00
Xingyao Wang 85ecfd9333 Fix condensation for 0 events by adjusting default parameters (#1523)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-29 07:59:50 +08:00
cid 4523a5a837 feat: add early stopping/pruner for behavior tests cost optimization (#1433) 2025-12-29 05:16:56 +08:00
Xingyao Wang 8efb658b18 Add automated version bump PRs for downstream repos (#1520)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-29 04:17:32 +08:00
Engel Nyst 0d18f795bd ci(integration): update integration LLM matrix to gpt-5.2-codex (#1503)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-29 03:25:29 +08:00
Xingyao Wang 8c720466a1 Fix TmuxTerminal.close() to handle dead session exceptions (#1517)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-28 19:02:01 +00:00
Xingyao Wang 68aa67aa7d Use ALLHANDS_BOT_GITHUB_PAT for bot-like workflows (#1519)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-29 02:59:18 +08:00
Peter Hamilton c3f9fb0dac Save screenshots from browser tool (#1443)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-28 18:46:15 +00:00
simonrosenberg eaf015e98e Fix max iterations reached: Set ERROR status with clear message (#1511)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-28 09:50:32 +01:00
Engel Nyst 4eb0aa03b8 Load model-specific repo instructions (CLAUDE.md for Claude only, GEMINI.md for Gemini only) (#1328)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-28 07:12:15 +01:00
all-hands-bot ee8c775d70 Release v1.7.1 (#1513)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-12-25 22:08:17 +00:00
Calvin Smith 518bd706d4 fix(condenser): Tool-loop aware condensation (#1508)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-12-26 02:48:22 +08:00
Engel Nyst 3733b37528 Remove tool preambles from gpt-5-codex (#1512) 2025-12-26 02:32:08 +08:00
Hiep Le 7befa3fe93 fix: handle union types in py_type for mcp schema conversion (#1509) 2025-12-25 13:03:14 +07:00
Alona d103b6a0f0 feat: Add hooks system for event interception (#1467)
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-25 05:47:51 +00:00
PiteXChen d73b445aae feat(localFileStore): Add Cache Layer to LocalFileStore to Avoid I/O Overhead from Read Operations (#1274)
Signed-off-by: CLFutureX <chenyongqyl@163.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-24 11:47:56 +08:00
Calvin Smith 014c6d4050 fix(condenser): Tool-call aware condensation (#1412)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-23 17:44:30 -06:00
Calvin Smith 3bd0250ab7 feat(condenser): Token-aware condensation in LLMSummarizingCondenser (#1380)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-23 15:34:27 -06:00
Pankaj Mathur 5383fd0b82 Add compatibility fixes for Nemotron and similar models (#1470) 2025-12-23 18:42:00 +01:00
simonrosenberg 23c9736e41 Add optional parallelization parameters to run-eval workflow (#1499)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-23 18:16:20 +01:00
simonrosenberg ccecb9730d Fix SDK_SHA reference in run-eval.yml Comment on PR step (#1498)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-23 15:20:25 +00:00
Xingyao Wang f7a9636b99 Release v1.7.0 (#1486)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-23 23:16:20 +08:00
Graham Neubig 77e4f22309 feat: add AgentSkills standard fields to Skill model (#1479)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-22 16:39:17 -05:00
Xingyao Wang 81a2a97ccf Fix prepare-release workflow to use correct label names (#1487)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-23 03:07:36 +08:00
Xingyao Wang 736b33cf2b Fix prepare-release workflow to use ALLHANDS_BOT_GITHUB_PAT (#1485)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-23 02:48:33 +08:00
Xingyao Wang 26b142b55b Add accumulated token usage to integration tests output and GitHub CI comments (#1471)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-23 00:26:00 +08:00
Xingyao Wang e8f73e71b9 Fix RemoteConversation.run() timeout issues with async polling (#1444)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-22 10:03:52 -05:00
Engel Nyst 37df3a438b terminal: populate TerminalObservation.exit_code (fixes #1468) (#1469)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-22 07:29:59 +01:00
Graham Neubig 113497480c Implement gemini-style file editing tools (#1199)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-21 05:10:47 +01:00
E Bala 80a32cd760 fix: deepseek reasoner model not part of send_reasoning_content_support models (Closes #1343) (#1464)
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-12-20 14:32:55 +01:00
Engel Nyst d0b8a7273e Fix verified_models (#1458) 2025-12-20 01:09:20 +01:00
Graham Neubig 577c213e91 fix: add openrouter/minimax to FORCE_STRING_SERIALIZER_MODELS (#1448)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-19 16:20:32 -05:00
Xingyao Wang eb06d52304 Fix batch atomicity when condensation forgets ObservationEvents (#1450)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-19 17:40:56 +00:00
Graham Neubig a3c39a54af feat: Add configurable security policy support to Agent (#427)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-19 11:18:15 -05:00
Ryan H. Tran e0e0471d93 Add skill for writing behavior tests (#1459) 2025-12-20 00:17:10 +08:00
Cesar Garcia 30fbd8504e fix: disable scheduled workflows from running in forks (#1447)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-19 11:15:00 +01:00
E Bala b36812bda9 Add nova-2-lite to model_features (#1455)
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-12-19 09:30:59 +00:00
Hiep Le 4efce8df8b refactor: Add LookupSecret tests for AgentContext.get_secret_infos() (#1456) 2025-12-19 09:15:25 +01:00
John-Mason P. Shackelford 7b782a0397 Public APIs return OpenHands types not LiteLLM types - Part 2: SystemPromptEvent (#495)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-18 22:16:58 -05:00
Graham Neubig c3af0a4a4c Add PostHog Debugging Example (#1202)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-19 03:02:07 +00:00
Xingyao Wang 360ec3851c feat: Add OpenHandsCloudWorkspace for OpenHands Cloud API (#1442)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-19 05:50:20 +08:00
Graham Neubig 119b1ce56b Make stuck detection thresholds configurable (#763)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-12-18 16:21:00 -05:00
Xingyao Wang 5af3839d59 Suppress LiteLLM INFO logs to clean up output (#1369)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-18 14:32:15 -05:00
Xingyao Wang 45a3f2270c Update litellm to 1.80.10 and add tests for Gemini 2.5 thinking blocks (#1413)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-18 19:48:01 +01:00
Xingyao Wang 24a6eec1d7 Fix: throw error when viewing PDF files instead of crashing (#1423)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-19 01:58:35 +08:00
Madhur Tandon 3b8d444b9d include j2 templates in tools package (#1426)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-19 01:05:07 +08:00
Hiep Le 42fe515ce2 refactor: add support for description field in sdk secrets (#1428) 2025-12-19 00:45:53 +08:00
LuneZ99 c9159b9ced Fix JSON encoding to preserve non-ASCII characters in model_dump_json (#1431)
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-19 00:45:40 +08:00
Xingyao Wang e1b387f96c Include persistence_dir in ConversationRunError for better debugging (#1437)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-19 00:45:16 +08:00
Madhur Tandon 19c15ecab6 delete image when docker workspace closes (#1424) 2025-12-18 15:37:02 +00:00
simonrosenberg c9330324c4 Add commit0 to CI evaluation workflow (#1429) 2025-12-18 22:47:25 +08:00
Ryan H. Tran 6fb8eaba5f Make cronjob run integration test only (#1430) 2025-12-18 22:19:00 +08:00
Graham Neubig 5757c9ae9a Add iterative refinement example for COBOL to Java refactoring (#1415)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-18 09:00:44 -05:00
Rakhman Asmatullayev 52604a7fa4 fix end span in local conversation run (#1432) 2025-12-18 12:10:08 +00:00
PiteXChen 0fcc4b4816 feat(AsyncExecutor): AsyncExecutor allows adding tasks after shutdown, which may cause unpredictable exceptions. (#1092)
Signed-off-by: CLFutureX <chenyongqyl@163.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
2025-12-18 16:43:23 +07:00
Xingyao Wang dde351c9df Add load_project_skills helper function (#1420)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-18 02:42:07 +08:00
Engel Nyst 7db5ce2ba3 Exclude '*mini' models from prompt_cache_retention (#1345)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-17 19:05:41 +01:00
simonrosenberg af88f6954d make eval_limit param optional (#1419) 2025-12-17 18:24:32 +01:00
Ryan H. Tran f88f88dff5 Make integration-test label run integration tests only (#1416) 2025-12-17 22:16:53 +08:00
Chris Bagwell 352de845b7 Make thinking signatures optional (#1394) 2025-12-17 06:00:05 +08:00
Xingyao Wang 97652be0db Release 1.6.0 (#1404) 2025-12-16 18:44:51 +00:00
Xingyao Wang 134dcc5174 Add automated release workflows (#1405)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-17 02:25:06 +08:00
Ryan H. Tran c3c59df046 Support model-family and model-variant system prompts (#1348)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-12-17 02:10:53 +08:00
Xingyao Wang a8c5dd323a Add MEMORY section to system prompt for skills (#600)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: juanmichelini <juan@juan.com.uy>
2025-12-16 15:50:47 +00:00
Tim O'Farrell df5ac803a5 Fix agent deep copy to preserve secrets using expose_secrets (#1403)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-16 08:41:54 -07:00
Engel Nyst 32f71b2302 docs(readme): update Slack link to openhands.dev domain (#1397)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-12-16 15:41:44 +00:00
Tim O'Farrell 72f9a1b08a Fix selfish coroutines in event service state lock operations (#1400)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-16 08:27:15 -07:00
Xingyao Wang 583156cea6 Add self-documentation lookup capability to system prompt (#1396)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-16 04:16:38 +08:00
Xingyao Wang cbee106a99 Update lmnr from 0.7.20 to 0.7.24 (#1398)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-15 14:05:56 -05:00
simonrosenberg e6ab3d1bc4 Add instance id input to run-eval dispatch (#1389)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-13 05:27:04 +08:00
Alona f83b6e1023 fix: use simple delay for shell init instead of PTY-based detection (#1387) 2025-12-13 01:43:35 +07:00
Tim O'Farrell 34fcb39268 Added deleting state for conversations for use in webhook. (#1391)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-12 09:15:29 -07:00
Tim O'Farrell d1f90eed1e Add page_iterator utility and refactor resend_all functionality (#1386)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-11 19:30:16 -07:00
Engel Nyst 00b8e1f434 feat: allow xhigh reasoning_effort (#1388) 2025-12-12 02:38:12 +01:00
Xingyao Wang 8f90b92c61 Release 1.5.2 (#1382) 2025-12-11 23:17:19 +08:00
Hiep Le f47cc3ab7b fix: update respond_to_confirmation API to Invoke reject_pending_actions on rejection (#1377) 2025-12-11 22:58:51 +08:00
Tim O'Farrell bd7440e8bb ALL-4426 Fix: Run conversation webhook notifications in background with error logging (#1373)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-10 19:22:31 -07:00
Tim O'Farrell d01ce6def6 ALL-4426 Parallelize service startup and shutdown in api_lifespan (#1375)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-10 19:22:16 -07:00
Tim O'Farrell f49d164e7a ALL-4426 Optimize agent tool initialization with parallel execution (#1374)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-10 19:21:49 -07:00
Xingyao Wang f7b1ad3eda Release 1.5.1 (#1379) 2025-12-10 22:20:24 +00:00
Rohit Malhotra 44ec1788c5 Add condense method to conversation class with ForceCondenser (#1368)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-10 16:25:35 +00:00
Xingyao Wang e6872fbfa6 Add agent behavior tests to integration tests (#1321)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-10 15:51:48 +07:00
Cesar Garcia 2daed50a9f Add GPT 5.1 codex models to verified models list (#1358)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-10 02:48:59 +01:00
John-Mason P. Shackelford bae009304d Fix agent reconciliation to allow agent_context updates (#1371)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-10 04:44:12 +08:00
Xingyao Wang c725c1d0e8 Release 1.5.0 (#1370)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-09 20:00:33 +00:00
Engel Nyst 5880b23735 ci: mark check-examples workflow as optional in title (#1355) 2025-12-10 03:25:02 +08:00
Xingyao Wang dce451111a Add GPT-5.1 models to verified model list (#1365)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-09 19:17:46 +00:00
Ao Li f4d68341c7 Fix terminal tool crash on large env vars (#1346)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-09 18:42:17 +00:00
Rohit Malhotra 6c5bc6935b Fix MCP tool observation crash when tool returns a list (#1366)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-09 13:34:19 -05:00
Alona 9c9656f5dc Add no-mocking instructions to system prompt (#1357) 2025-12-10 02:19:35 +08:00
Xingyao Wang 4b3a05036f Add devstral-2512 and devstral-medium-2512 as verified models (#1364)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-09 18:54:30 +01:00
Graham Neubig 80293a6bac Update SWE-bench score to 77.6% (#1362)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-10 00:06:26 +08:00
Hiep Le 07cdd460fb fix: enable agent awareness of available secrets (#1335)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-12-10 00:04:56 +08:00
simonrosenberg 42c4c68b6a Add benchmark selection parameter to evaluation workflow (#1294)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-09 16:32:53 +01:00
PiteXChen 93d405c91e feat(llm):Enhance retry listeners and metric collection when retrying LLM calls due to exceptions. (#1214)
Signed-off-by: CLFutureX <chenyongqyl@163.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-08 18:19:53 +01:00
Xingyao Wang d577ec1ac9 Add helpful warning when context window exceeded without condenser (#1354)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-09 01:13:42 +08:00
Karan Sharma 603f342149 fix: respect enable_encrypted_reasoning flag in Responses API (#1208)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-08 16:25:57 +00:00
Xingyao Wang e0fb3c1b66 Fix base_url handling for openhands/ provider with explicit None value (#1341)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-09 00:09:26 +08:00
simonrosenberg 7ef38815bb fix: create parent directories in observation truncation (#1347)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-07 17:56:26 +01:00
Graham Neubig 25cae0e4c3 Add issue triage workflow automation (#1300)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-06 17:51:14 +00:00
Robert Brennan 693c32618d Fix server-side automatic string-to-StaticSecret conversion (v1.2.0 compatible) (#1234)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-05 18:15:25 +01:00
மனோஜ்குமார் பழனிச்சாமி 9a097cee09 Add Windows browser path checks to Chromium detection (#1014)
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-05 15:43:11 +00:00
Tim O'Farrell 1a893c3945 APP-190 Fix PyInstaller GCC 14 compatibility for Chromium (#1327)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-04 19:02:39 -07:00
Mendrika 8ef73e59c5 Fix: Enable vision for messages containing ImageContent (#1279)
Co-authored-by: Mendrika <mendrika@tryalma.ai>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-04 22:57:31 +01:00
Xingyao Wang cbd1263b19 Release 1.4.1 (#1322) 2025-12-04 20:38:23 +01:00
Tim O'Farrell 8ba29553d6 APP-215 Fix browser_use logging configuration interference (#1316)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-04 09:22:57 -07:00
Tim O'Farrell 6e5c95c784 Fixes where the example tools were invalid (#1311) 2025-12-03 15:04:10 -07:00
simonrosenberg a55325cc7f Fix evaluation workflow failure when release branches are deleted (#1310)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-03 21:28:05 +01:00
Xingyao Wang 72c554ad3f Fix ConversationStateUpdateEvent to use MetricsSnapshot instead of full Metrics (#1308)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-04 04:13:52 +08:00
Xingyao Wang d026d362ae Release 1.4.0 (#1299)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-03 17:29:28 +00:00
Xingyao Wang d179ed5469 Separate DockerWorkspace and DockerDevWorkspace to fix import issue (#1198)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Ryan H. Tran <descience.thh10@gmail.com>
2025-12-03 16:49:03 +00:00
Xingyao Wang 5825094c56 Fix system prompt to prevent automatic markdown file creation (#1288)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-03 16:23:26 +00:00
Tim O'Farrell bd595d7e5e Enhance DiscriminatedUnionMixin error messages (#1301)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-02 17:03:10 -07:00
Xingyao Wang 74aaecf2c7 Fix t08_image_file_viewing test to handle FinishAction responses (#1302)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-03 07:31:38 +08:00
Xingyao Wang 20e4d62c46 Add simple version tags for agent-server Docker images (#1197)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-03 02:08:50 +08:00
Wang Siyuan f76aeb0f58 feat: handle proxied model names with canonical model_real_name (#1292)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-02 17:02:25 +01:00
simonrosenberg 70797e74e0 Simplify evaluation workflow by removing benchmarks build polling (#1267)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-02 15:44:55 +01:00
Engel Nyst a106684194 make get_closest_git_repo robust to macOS symlink paths (#1289)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-02 11:21:15 +01:00
Xingyao Wang b217e40473 Add GitHub CLI installation to Dockerfile (#1286)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-02 06:04:17 +08:00
Xuhui Zhou f038f7cccb Feat/tom agent integration (#784)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-12-02 01:08:03 +08:00
Ryan H. Tran e3bb356eb9 Enhance truncation with persistence capability (#419)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-12-01 23:30:43 +08:00
Hiep Le ad88350103 refactor: enable configuring the llm proxy base url for the openhands provider (#1273)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-12-01 09:31:49 +00:00
juanmichelini 21973ba77d Fix GPT-5 codex empty patches (#1207)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-11-30 21:28:27 -03:00
Wang Siyuan 312750c02b feat: add Streamlit viewer for completion logs (#1284) 2025-11-30 16:25:58 +01:00
Engel Nyst 2d100fc050 feat(tools): Add GPT-5.1 ApplyPatch tool (#1166)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-30 15:22:39 +00:00
Hiep Le daa2fd5c27 refactor: update request timeout handling when fetching secrets (#1272) 2025-11-30 13:29:44 +07:00
Engel Nyst 7ffe92270c terminal/tmux: disable bash history expansion to avoid ! mangling (#1277)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-30 00:22:37 +01:00
Xingyao Wang c5608a1c68 Add support for loading CLAUDE.md and GEMINI.md as repo skills (#1280)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-30 07:17:20 +08:00
Engel Nyst 1234f8f3ca GPT-5.1: Add prompt cache retention; reasoning_effort tweak (#1162)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-29 23:01:18 +01:00
Engel Nyst 3d5a8a1b16 [chore] Clean up deprecated service_id (#1276) 2025-11-30 04:00:21 +08:00
Xingyao Wang 8e296334db Add load_public_skills() to load skills from public registry (#1248)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-11-26 17:05:12 +00:00
Xingyao Wang ff442dd1fd Implement streaming for Chat Completions (#1270)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-27 00:38:46 +08:00
Xingyao Wang bbf8cff3ae Fix api_timeout not being used in APIRemoteWorkspace (#1206)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-26 23:56:23 +08:00
PiteXChen 7aded6f3f1 feat(fifolock): acquire and release Logic optimization (#1251)
Signed-off-by: CLFutureX <chenyongqyl@163.com>
2025-11-26 15:27:35 +01:00
Xingyao Wang c4cf74595a Release 1.3.0 (#1262)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-25 21:08:28 +00:00
Engel Nyst 5f73b1f9e1 Fix: Condenser LLM should not pass extra_body (Fixes #1252) (#1266)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-26 03:16:10 +08:00
Rohit Malhotra e861a97379 Add ask_agent method to conversation classes for simple LLM completions (#1227)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-25 14:07:06 -05:00
juanmichelini a527525d71 Claude Opus 4.5 as a reasoning model with custom effort param (#1250)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-11-26 01:32:24 +08:00
simonrosenberg 1e8692b575 Align eval labels with benchmarks build tiers (1, 50, 200) (#1254)
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-25 18:06:16 +01:00
Xingyao Wang 824538de51 Add TaskTrackerStatusType type alias for task tracker status (#1263)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-26 01:05:45 +08:00
Engel Nyst 771eecc87c Investigate Claude model names: remove unsupported dotted variants (fixes #1260) (#1261)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-26 00:13:32 +08:00
Xingyao Wang a37078121b Add import dependency rules enforcement (#1259)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-26 00:10:50 +08:00
Xingyao Wang 8979e0601d Stream LLM logs and stats updates via WebSocket for RemoteConversation (#1159)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: hieptl <hieptl.developer@gmail.com>
2025-11-25 23:05:21 +07:00
simonrosenberg 3ec7bb2472 feat: allow delegator to spawn pre-defined subagents (#1223)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-25 16:37:15 +01:00
Tim O'Farrell 35443a6fb8 Fastapi discriminated union patch2 (#1258)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-25 14:47:40 +00:00
Xingyao Wang 9f87bf9687 Revert "Fix FastAPI discriminated union patch to be called consistently" (#1257) 2025-11-25 22:34:23 +08:00
Tim O'Farrell a290965269 Fix FastAPI discriminated union patch to be called consistently (#1256)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-25 07:19:25 -07:00
simonrosenberg 5a2cd30e2d feat: fix eval workflow (#1241)
Co-authored-by: Claude <noreply@anthropic.com>
2025-11-25 10:54:28 +01:00
Ryan H. Tran e99686784c Fix run examples workflow failed on schedule run & use parallel execution with pytest (#1229)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-25 03:28:17 +08:00
Xingyao Wang 8b7e0beb77 docs: Add comments about system prompt customization options (#1220)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-25 03:27:28 +08:00
Hiep Le db72dce4ab fix: incorrect file download API endpoint in remote_workspace_mixin (#1245) 2025-11-25 00:28:42 +07:00
simonrosenberg 362977ddb2 Remove weekly typing improvements workflow (#1243)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-24 16:09:41 +01:00
Engel Nyst dd0a1d47b9 Send reasoning first in the assistant message (#1238) 2025-11-24 23:07:17 +08:00
simonrosenberg 4f0cc507de feat: add eval job that runs on CI (#1167) 2025-11-24 11:28:01 +01:00
Rohit Malhotra c196a24148 Add source parameter to /search endpoint for event filtering (#1224)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-21 13:27:43 -05:00
Ryan H. Tran 6525d8d803 Prefix MCP tool action models to avoid kind collisions (#1217) 2025-11-22 00:43:51 +07:00
Xingyao Wang ea12740e2d Revert "Rename BrowserNavigateAction and BrowserTypeAction actions" (#1225) 2025-11-22 00:42:51 +07:00
Tim O'Farrell 858c7f41ab Fix: more resilient conversation start (#1216)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-21 06:50:18 -07:00
Ryan H. Tran ed9f80469b Rename BrowserNavigateAction and BrowserTypeAction actions (#1213) 2025-11-21 01:59:28 +07:00
Tim O'Farrell 499015bc26 APP-156 Set skill default trigger to None (#1212) 2025-11-20 16:47:17 +00:00
Xingyao Wang 7fed9444c8 Add force_string_serializer as LLM class argument (#1204)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-20 03:47:57 +01:00
Rohit Malhotra 49c42ee7be Fix datetime.utcnow() deprecation warning (#1203)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-19 14:02:28 -05:00
Xingyao Wang e485bba962 Rename ExecuteBashAction/ExecuteBashObservation to TerminalAction/TerminalObservation (#1193)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-20 00:34:10 +08:00
LuneZ99 78759504ed fix: use user-specific cache directory for Jinja2 templates (#1200)
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-20 00:04:58 +08:00
Yakshith aebb9db771 fix: sanitize @OpenHands only in GitHub comments (backend) (#1020)
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-19 16:44:40 +01:00
Tim O'Farrell f70b94513e Fix webhook timer reset behavior for more predictable delivery (#1192)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-19 02:25:50 -07:00
Xingyao Wang a14340aeb9 Release 1.2.0 (#1191) 2025-11-19 01:56:30 +08:00
Xingyao Wang c2fa737009 Support Gemini-3 (#1190)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-19 01:24:06 +08:00
Tim O'Farrell 15f565b8ac Fix for issue where cipher was not set on deserialize (#1187) 2025-11-18 08:01:52 -07:00
Ryan H. Tran 88cc031bd6 Integrate deprecation for systematic deprecation management (#1155)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-18 03:14:52 +00:00
Tim O'Farrell 67c3429ad1 Fix for issue where LLM API Key is saved as null (#1180) 2025-11-18 11:01:38 +08:00
Xingyao Wang 79868ae53e Port over Critic system from benchmark project (#1171)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-18 11:01:18 +08:00
Engel Nyst 3ed29713f5 chore(workflow): restrict auto-assigned reviewers to maintainers (write+) (#1176)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-18 02:42:03 +00:00
Xingyao Wang 7edda54266 only include HTTP exc_info when DEBUG mode is on (#1184) 2025-11-18 03:37:27 +01:00
Karan Sharma 7046e760dd feat: add configurable shell path for Terminal tool (#1146)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyaoww@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-17 16:36:35 +00:00
Tim O'Farrell 864e3ae781 Fix for broken build tests (#1181) 2025-11-17 08:32:56 -07:00
Robert Brennan e9845d6393 Fix conversation deletion error handling for permission issues (#967)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2025-11-17 08:23:27 -05:00
LuneZ99 ec8c806c5f fix: add UTF-8 encoding and ensure_ascii=False for JSON telemetry logs (#1178) 2025-11-17 10:43:11 +00:00
Xingyao Wang f8bc8d6a49 Improved default visualizer to use horizontal rule instead of Panel (#1170)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-16 12:34:04 +08:00
Gayathri 7919d0a405 Enabling GPU support on Docker (#1175) 2025-11-16 12:33:22 +08:00
Xingyao Wang dce440bc38 Fix Path type support for workspace and persistence_dir parameters (#1116)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-11-15 21:25:52 +01:00
Xingyao Wang 01bdbd2beb Add informative error message for MCP timeout (#1169)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-15 11:19:39 +08:00
Tim O'Farrell ad3727c6e5 Fix performance bottleneck in LLM initialization (~1.2s delay) (#1148) 2025-11-14 04:24:47 -07:00
Boxuan Li d7c9e53248 Fix litellm_extra_body LLM forwarding (#1153) 2025-11-14 05:08:45 +08:00
simonrosenberg 57b3657596 Add custom delegation visualizer to show correct visualizer/sender and simplify default visualizer
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-13 20:41:02 +01:00
Robert Brennan 60b978f46c Revise README for OpenHands Software Agent SDK (#1156) 2025-11-14 03:13:15 +08:00
Tim O'Farrell c3ebef9b32 feat: Parallelize batch operations for improved performance (#1147) 2025-11-13 02:53:51 -07:00
Xingyao Wang 9c03d1fa3c Skip visualizing ConversationStateUpdateEvent in default visualizer (#1151)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-13 01:06:44 +08:00
Graham Neubig 6a28b38974 Fix metadata filtering for non-litellm_proxy providers (#1134)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-11-11 23:43:01 +00:00
Xingyao Wang d67bd8485b Fix executable for APIRemoteWorkspace (#1141) 2025-11-12 06:56:35 +08:00
Xingyao Wang 4e2ecd8298 v1.1.0 release (#1138) 2025-11-12 03:08:39 +08:00
John-Mason P. Shackelford 907ead5350 fix(examples): allow all examples to run with same LLM_API_KEY (#1065)
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-11 19:52:34 +01:00
Engel Nyst 7b695dc519 examples: add Laminar observability example (#1131)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-11 18:08:58 +00:00
Rohit Malhotra a656ca776c Refactor: Always include risk fields (#1052)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-11 18:02:20 +00:00
Xingyao Wang a482ab10c2 Fix success rate calculation to exclude skipped tests (#1136)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-12 01:52:30 +08:00
Xingyao Wang a4f97bd0cf Support kimi-k2 extended thinking, fix prompt caching stats, fix max output (#1133)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-11 17:26:31 +00:00
மனோஜ்குமார் பழனிச்சாமி 488806ed51 Add image file viewing support to FileEditor (#1016)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-11 23:34:10 +08:00
Xingyao Wang 30449e1626 refactor: move TokenEvent from llm_convertible to event (#1130)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-11 23:11:51 +08:00
Xingyao Wang 331992e59b Simplify API Workspace example with image_pull_policy argument; fix erroneous clean up of client (#1127)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-11 14:47:12 +00:00
Lintang Sutawika 8dd38fba8d add TokenEvent and TokenEvent processing for RL (#1123)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-11 14:37:45 +00:00
Hiep Le 62594fe9ec fix(backend): unable to create a new conversation using the planning agent’s tools. (#1125) 2025-11-11 21:15:54 +07:00
Engel Nyst f452e142e8 SDK: export ConversationExecutionStatus via public API (#1126) 2025-11-10 22:04:13 +01:00
Ryan H. Tran 6b2b671a68 Remove empty text content for tool calls & support kimi-k2-thinking (#1093)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-10 20:11:40 +00:00
மனோஜ்குமார் பழனிச்சாமி f3c0c19cd1 Auto-create log directory for telemetry (#1111) 2025-11-10 22:40:29 +08:00
Engel Nyst 53c0162e65 tests(llm): add explicit retry behavior tests for empty choices (#1107)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-10 15:38:38 +01:00
Engel Nyst b24c9cc9c2 visualizer: render CondensationRequest nicely (#1113)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-10 14:33:19 +00:00
Engel Nyst 9b7799bc6a Fix resume reconciliation by treating litellm_extra_body as runtime override (#1115) 2025-11-10 22:15:39 +08:00
Ryan H. Tran 377ea66fb6 Handle more nested field schema formatting for tool description (#1001)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-10 21:14:01 +07:00
Engel Nyst 3addb5a492 Enable Ruff checks for mutable defaults (B006, B008, B039, RUF012) (#1055)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-10 22:13:08 +08:00
PiteXChen 0a23648e0e fix(condenser): When condensation is triggered by the unhandled_condensation_request condition, it will result in empty condensation. (#1034)
Signed-off-by: CLFutureX <chenyongqyl@163.com>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-10 04:09:02 +00:00
Engel Nyst 572ea2148d Fix: LLM telemetry logging (#1028)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-08 20:10:33 +01:00
Xingyao Wang aa954ce876 Allow override git ref/sha in docker build by configuring BuildConfig (#1100)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-09 02:53:15 +08:00
Xingyao Wang f8d847d448 Use shell:bash in run-examples workflow (#1057)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-09 02:49:31 +08:00
மனோஜ்குமார் பழனிச்சாமி aaa42bbdf4 Remove default path from ENV_ROOT assignment (#1050) 2025-11-08 18:30:51 +01:00
Tim O'Farrell 3aa647e39b We now have an explicit ConversationErrorEvent in the stream (#1106)
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-07 17:39:16 -07:00
Xingyao Wang 06567f512c Fix Remote Runtime API Authentication (#1090)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-07 22:43:53 +00:00
Xingyao Wang 865273cebb Revert "Add FallbackRouter for LLM failover support" (#1105) 2025-11-08 06:40:24 +08:00
openhands b2b3da3a10 Add FallbackRouter for LLM failover support
This commit implements a FallbackRouter that provides automatic failover
between multiple LLM models when the primary model fails. Key features:

- Automatically falls back to secondary models on errors (rate limits,
  connection failures, service unavailable, etc.)
- Supports multiple fallback models in a chain
- Preserves telemetry and metrics from the active model
- Includes comprehensive logging of failover attempts

Implementation:
- New FallbackRouter class extending RouterLLM
- Overrides completion() to implement fallback logic
- Validates that 'primary' key exists in llms_for_routing
- Tracks active_llm for telemetry purposes

Tests:
- 8 comprehensive unit tests covering all scenarios
- Mocked LLM responses to avoid actual API calls
- Tests for successful completion, fallback scenarios, and error cases

Example:
- examples/01_standalone_sdk/27_llm_fallback.py demonstrates usage
- Shows how to configure primary and fallback models
- Includes logging setup to observe failover behavior
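
Illustrative sketch (not the code from this commit, which the log shows
was later reverted in #1105): the fallback behaviour described above could
look roughly like the following. Only the names `llms_for_routing`,
`active_llm`, `completion()` and the required 'primary' key come from this
message. `FallbackRouterSketch`, `AllModelsFailedError`, and the plain
callable standing in for a model are assumptions made to keep the example
self-contained, and the real class extends RouterLLM rather than standing
alone.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("fallback_sketch")

# Hypothetical stand-in for an LLM: takes a prompt, returns text or raises.
ModelFn = Callable[[str], str]


class AllModelsFailedError(RuntimeError):
    """Raised when the primary model and every fallback model fail."""


class FallbackRouterSketch:
    """Try 'primary' first, then the remaining models in insertion order."""

    def __init__(self, llms_for_routing: Dict[str, ModelFn]) -> None:
        # Mirrors the validation described in the commit message.
        if "primary" not in llms_for_routing:
            raise ValueError("llms_for_routing must contain a 'primary' key")
        self.llms_for_routing = llms_for_routing
        # Name of the model that served the most recent successful call.
        self.active_llm = "primary"

    def completion(self, prompt: str) -> str:
        order: List[str] = ["primary"] + [
            name for name in self.llms_for_routing if name != "primary"
        ]
        errors: List[str] = []
        for name in order:
            try:
                result = self.llms_for_routing[name](prompt)
            except Exception as exc:  # e.g. rate limit or connection failure
                logger.warning("model %r failed: %s", name, exc)
                errors.append(f"{name}: {exc}")
                continue
            self.active_llm = name
            return result
        raise AllModelsFailedError("; ".join(errors))
```

With a failing primary and a working secondary, `completion()` returns the
secondary's result and `active_llm` is set to that model's key. A real router
would probably catch only retryable errors rather than every `Exception`.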

Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-07 20:59:55 +00:00
Engel Nyst 9652de4c5a fix race during cleanup (#1095) 2025-11-07 18:56:43 +01:00
Xingyao Wang 204d3a4b26 refactor: git SHA-based Docker tags by default, versioned tags only on releases (#1088)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-08 00:53:40 +08:00
Edward-sy e616bf4eeb fix: recursively parse nested tools params (#1011) 2025-11-07 14:23:21 +07:00
Graham Neubig 84fe9f0774 Fix Datadog log-query to use Logs API instead of Error Tracking API (#1043)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 15:33:13 -05:00
simonrosenberg 54d982ff8f Add glm 4.6 bis - bis (#1046)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 19:33:51 +01:00
Xingyao Wang a612c0a685 v1.0.0 release (#1056) 2025-11-07 01:05:13 +08:00
Xingyao Wang 4c276bdecc Improve visualization API: allow passing ConversationVisualizer directly (#1025)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: John-Mason P. Shackelford <jpshack@gmail.com>
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: Rohit Malhotra <rohitvinodmalhotra@gmail.com>
2025-11-06 11:41:38 -05:00
மனோஜ்குமார் பழனிச்சாமி 5e93f74cd3 Rename execute_bash to terminal & rename BashTool to TerminalTool (#1033)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 22:53:06 +08:00
Xingyao Wang 89b63b45b7 Fix RemoteConversation stats field mismatch and state updates (#1042)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 11:30:19 -03:00
simonrosenberg a44f769472 Add reviewer trigger for PR review workflow (#1037)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 15:03:38 +01:00
Robert Brennan b0b82bdace Simplify hello world example (#1047)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-11-06 05:14:06 +08:00
Cesar Garcia 2bb7c89e6d Fix: Make reasoning_effort optional (#1004)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 05:11:59 +08:00
simonrosenberg 41d8d80c89 refactor: standardize Observation base class (#929)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-05 20:11:09 +00:00
Engel Nyst f10eed2592 A couple of fixes for OpenAI o3 (#1031)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 03:35:28 +08:00
Xingyao Wang 53337bd984 Include 03_browser_use_with_docker_sandboxed_server.py in test-examples workflow (#1021)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
2025-11-06 02:21:34 +07:00
John-Mason P. Shackelford e5a7efea44 Fix Laminar span stack warning in LocalConversation (#1039)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 02:28:58 +08:00
Xingyao Wang 703718d47e Fix Docker cache tag length exceeding 128 character limit (#1029)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-06 02:22:25 +08:00
Engel Nyst 40bd3f06d2 Disable Copilot chat command center icon by default (parity with OpenHands PR #11589) (#1032) 2025-11-06 00:29:30 +08:00
Xingyao Wang aaa0066ee0 Fix multi-arch manifest merge to use correct tag names (#1024)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-04 21:24:56 +01:00
Tachibana waita b8de86c789 feat: Add automatic loading of user skills from home directory (#950)
Co-authored-by: Xingyao Wang <xingyaoww@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-05 04:06:10 +08:00
Rohit Malhotra d5995c31c5 Implement automatic tool registration on import (#862)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-05 03:22:23 +08:00
மனோஜ்குமார் பழனிச்சாமி 55cf62ea68 Fix abbreviation formatting in visualizer (#989)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-05 03:17:49 +08:00
Tim O'Farrell 18d04f279e Fix error in 422 logs due to blank bearer token (#1022) 2025-11-04 19:16:09 +00:00
Boxuan Li 7b98d96ed4 Rename metadata field to litellm_extra_body and add custom config support (#837)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-05 03:04:13 +08:00
Engel Nyst 6c7ad753f3 Recognize GPT‑5 context overflow and trigger condenser (Fixes #776) (#783)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-11-04 20:00:26 +01:00
Tim O'Farrell a7ea013a1b More agent_status to execution_status_updates, and error state (#1019) 2025-11-05 02:45:01 +08:00
Dinmukhamed Mailibay 45ffad9b18 Add OpenTelemetry with Laminar SDK (#681) 2025-11-04 12:12:46 -06:00
simonrosenberg 00af7a9684 refactor: rename AgentExecutionStatus -> ConversationExecutionStatus AND agent_status -> execution_status (#839)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-04 18:32:35 +01:00
Engel Nyst 5765184c43 feat(llm): forward extra_headers to LiteLLM completion and responses (#733)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-11-04 17:49:30 +01:00
Robert Brennan 7f4bef02ae Fix/remove docstring for OpenHands tool package (#1018) 2025-11-04 23:34:08 +08:00
Xingyao Wang 7cdc0672ba Migrate to Blacksmith official Docker actions with native multi-platform builds (#990)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-04 16:29:15 +01:00
மனோஜ்குமார் பழனிச்சாமி dbd11c0c41 Update example API key in README (#1013) 2025-11-04 14:57:11 +00:00
Tim O'Farrell 23c8436cb3 Add comprehensive unit tests for conversation_router.py endpoints (#1010)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-04 07:27:37 -07:00
Hiep Le 8d05e95340 feat(backend): enhance search_conversation_events functions with timestamp filtering support (#695) 2025-11-04 09:29:19 +07:00
Graham Neubig 627da01701 Fix datadog debugging workflow example (#1009)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-03 23:57:02 +01:00
Xingyao Wang fcf76101aa Refactor ToolDefinition architecture to use subclass pattern for all tools (#971)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
2025-11-04 04:54:00 +08:00
Robert Brennan 9bcaf5c2be Add Sphinx-compatible docstrings to core SDK classes (#1006)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2025-11-04 04:24:27 +08:00
Graham Neubig e190adda27 Refactor datadog debugging workflow to follow basic_action pattern (#272)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-03 15:08:13 -05:00
Engel Nyst be9725b459 Normalize SDK errors for clients (#980)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-04 03:23:45 +08:00
PiteXChen ed146c9056 Fix(llm): When LLM calls are retried, it can lead to inaccurate records. (#949)
Signed-off-by: CLFutureX <chenyongqyl@163.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-04 02:38:08 +08:00
Engel Nyst 921dc8609d Tweak review workflow for draft PRs (#997) 2025-11-03 12:19:40 -05:00
simonrosenberg eb2ca5d31a Bug: fix Visualizer "Message To User" that should be "Message from User" (#1002) 2025-11-03 22:50:50 +08:00
Xingyao Wang 6a5dcec22e Remove default MCP servers from preset agents (#984)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Rohit Malhotra <rohitvinodmalhotra@gmail.com>
2025-11-03 22:48:21 +08:00
Ryan H. Tran b061d4e1f0 Add cost reporting for run-examples.yml and schedule scripts to run daily (#977)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-11-03 22:45:26 +08:00
Engel Nyst 0b44206711 Add tests for string serializer (#996)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-03 22:26:02 +08:00
simonrosenberg 9ac99bd6b5 update patterns in FORCE_STRING_SERIALIZER_PATTERNS (#995) 2025-11-02 17:36:46 +01:00
simonrosenberg 4cb94b353d Add glm4.6 to FORCE_STRING_SERIALIZER (#994) 2025-11-02 17:06:42 +01:00
Engel Nyst 21c4d27773 sdk(llm): Stop model-name whack-a-mole: revert to core family substring matching (#879)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-02 10:34:05 +00:00
Robert Brennan d473e6fc85 Add logging for file upload and download actions (#993)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-02 11:31:13 +01:00
Graham Neubig 7e6774f774 Fix duplicate reviewer assignment in assign-reviews workflow (#827)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-01 14:52:31 +01:00
softpudding efa060613c Fix bash executor session reuse by resetting closed sessions (#964)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-31 17:57:15 +00:00
Ryan H. Tran 67746a68c1 Fix wrong server path for file upload endpoint (#951)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-01 01:54:15 +08:00
Robert Brennan ef5cb95b76 Fix VNC security vulnerability by defaulting to disabled (#974)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-10-30 22:12:34 +00:00
Xingyao Wang cab18f9297 Fix formatting: remove extra blank line in model_features.py (#973)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-31 04:47:27 +08:00
Apurva Gandhi 484f29909c feat: Add llm_response_id to all LLM generated events (#930)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-31 04:43:42 +08:00
Dinmukhamed Mailibay 0b7c3e054d fix browser tool cleanup (#942) 2025-10-31 04:41:26 +08:00
Xingyao Wang 973fa341c5 Fix double dollar sign in visualizer metrics display (#938)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-30 20:39:23 +00:00
Yakshith 0de1672155 Add codex-mini-latest to RESPONSES_API_PATTERNS (#934)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-31 04:37:29 +08:00
Engel Nyst c6ebf3d1fb Rename ConversationState field to secret_registry with read-alias (#969)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-30 20:36:34 +01:00
Engel Nyst 8b2d595353 Rename SecretsManager to SecretRegistry (#965)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-30 18:02:15 +01:00
Xingyao Wang 952c5585e9 Default to function calling enabled, remove FUNCTION_CALLING_PATTERNS allowlist (#956)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
Co-authored-by: Simon Rosenberg <simonrosen10@gmail.com>
2025-10-30 15:18:40 +01:00
Xingyao Wang b9860ce032 Enhance README with branding and resources (#961) 2025-10-30 04:39:26 +08:00
Tim O'Farrell 3d8af53b2f Agent Server UUIDs without hyphens (#960) 2025-10-29 11:31:31 -06:00
Xingyao Wang e1fc2069d2 release version 1.0.0a5 (#959) 2025-10-30 00:19:18 +08:00
simonrosenberg 55f54cb704 refactor: fix naming inconsistency (#958) 2025-10-29 11:39:08 -04:00
Xingyao Wang d9c5c2ed8e Fix integration tests to work with fork PRs (#932)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-29 15:53:39 +01:00
Xingyao Wang 99051464f3 feat: support absolute paths for system_prompt_filename (#902)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-10-29 05:30:38 +00:00
simonrosenberg a3c2dbe13e change base url in integration tests (#947)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-28 16:00:15 -04:00
Dinmukhamed Mailibay eae0efcb08 bump browser-use version, infer screenshot mime type dynamically (#935) 2025-10-28 19:54:07 +00:00
simonrosenberg 9ffddb0ad5 Update litellm proxy path in workflows (#946)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-28 18:22:36 +00:00
simonrosenberg fb9c13d0b4 feat: Post integration test results to tracker issue for daily runs (#940)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-28 18:45:47 +01:00
simonrosenberg eee2eb425f Update integration test to use OpenHands proxy instead of litellm proxy (#944)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-10-28 17:55:08 +01:00
Xingyao Wang 9620c7426c refactor: Clean up _configure_bash_tools_env_provider anti-pattern (#922)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-28 10:55:23 -04:00
Tim O'Farrell ce0a71af55 Feat git operations (#863)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 16:49:03 -06:00
Xingyao Wang e380762e32 Update README to refer to documentation site (#928)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 17:58:02 -04:00
Xingyao Wang fc87a838cf Fix #916: Suppress Pydantic serializer warnings from litellm (#917)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 21:22:49 +01:00
Xingyao Wang 9c4e995744 refactor: Update ToolExecutor interface to use LocalConversation instead of BaseConversation (#925)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 16:17:02 -04:00
simonrosenberg f1081f9fcc feat: Implement blocking delegation (#908)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-27 20:58:11 +01:00
Xingyao Wang 6e48a8049a Add docs workflow to repo.md and fix pyright type checking (#924)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 15:32:37 -04:00
dependabot[bot] 850b176a7d Bump the version-all group with 3 updates (#913)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-27 14:23:00 -04:00
Engel Nyst d4c66c6e25 chore: remove Dependabot config (uv workspace unsupported) (#897)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 14:22:36 -04:00
Tim O'Farrell 0aeb33aafb Feat encrypted secrets (#893)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 14:21:42 -04:00
Xingyao Wang c01b0a1ab0 refactor: move agent_final_response to utility function with proper type checking (#919)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 13:45:42 -04:00
Tim O'Farrell 0ecc7b53bd AsyncRemoteWorkspace now has working directory (#888) 2025-10-27 10:59:58 -06:00
Ryan H. Tran 161244342c Add workflow to run examples script & fix failed integration tests (#914)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-27 11:00:13 -04:00
Xingyao Wang 24789aae46 Fix #905: Handle None context_window for unmapped models (#906)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-27 09:50:11 -04:00
Xingyao Wang 1f905f644f Fix PydanticSerializationUnexpectedValue warnings in log_completions (#904)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-26 23:08:21 +08:00
Xingyao Wang e969885684 Update sdk version to 1.0.0a4 (#901) 2025-10-26 05:01:00 +08:00
Xingyao Wang 7c00cbc6d2 Add regression test for bash command endpoint fix (#866) (#881)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-26 04:53:11 +08:00
Xingyao Wang b1a11f6947 Fix MCP tool validation error by excluding 'kind' field from action data (#887)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-26 04:20:30 +08:00
Engel Nyst c27198602d Fix pyright issues in agent status transition tests (#895)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-25 09:57:36 -06:00
Xingyao Wang 101ca0df8d Switch back to pyright + stricter typing to allow vscode to resolve imports (#869)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-10-25 14:32:07 +02:00
Xingyao Wang 877c590ccd feat(tests): Add comprehensive test suite for RemoteConversation implementation (#451)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-25 01:33:57 +08:00
simonrosenberg b57f8316e4 add assign reviews workflow to examples (#883) 2025-10-24 16:30:55 +00:00
Tim O'Farrell b80dc96d08 Dependency Injection for EventService (#885) 2025-10-24 10:05:28 -06:00
Ryan H. Tran 93b481c50f Fix agent status not transitioning to running status (#884)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-24 23:05:43 +08:00
Engel Nyst 75762a4bc8 tests(remote): silence pytest collection warning by renaming helper class (#880)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-24 22:20:28 +08:00
Graham Neubig 54c585892e Fix DockerWorkspace execute_command timeout issue (#871)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-24 05:28:48 +08:00
Engel Nyst 26f4800788 chore(examples): env-driven endpoints, platform detection; update domains and API var names (#872)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-24 01:04:19 +08:00
Xingyao Wang 749daaaf4d Add workflow to check documented examples (#878)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-23 16:21:29 +00:00
simonrosenberg 4114f987c3 Add conversation to tool calls (#877)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-23 17:04:00 +02:00
Xingyao Wang 4567ee03e5 Simplify remote agent server examples (#868) 2025-10-23 05:01:45 +08:00
juanmichelini 3a6aff9231 fix: ReadTimeout when calling 01_convo_with_local_agent_server.py (#857) 2025-10-22 23:35:23 +08:00
Xingyao Wang bef52edb9d Simplify examples for docs (#859) 2025-10-22 14:13:51 +00:00
Tim O'Farrell cab92fc66e Socket logging (#858) 2025-10-21 14:32:33 -06:00
Tim O'Farrell 5916868b97 Env template generator (#847)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-21 12:25:34 -06:00
sp.wack 1a76d34906 Fix FastAPI path parameters to handle file paths with forward slashes (#848)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2025-10-21 09:52:17 -06:00
Graham Neubig 7b4250b76c Rename "microagents" to "skills" throughout the repository (#820)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-21 23:51:19 +08:00
blacksmith-sh[bot] 6554de0d42 .github/workflows: Migrate workflows to Blacksmith runners (#831)
Co-authored-by: blacksmith-sh[bot] <157653362+blacksmith-sh[bot]@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2025-10-21 23:37:32 +08:00
simonrosenberg 8d8134ca5a Debug: glob tool fails when pattern contains path (#718)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-21 16:37:49 +02:00
Xingyao Wang 74cc24f5ad Set add_security_analyzer=False by default and simplify examples (#846)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-21 15:52:27 +02:00
Ryan H. Tran fcda0562c9 Add FunctionCallValidationError handling & claude-haiku-4-5 to function calling supported models (#832)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2025-10-21 17:35:49 +07:00
Graham Neubig f58cd8cec2 Add test to reproduce bug #825: execute_command includes previous output (#828)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 19:38:12 -04:00
Jim White 28b67728de Fix: Make adding security_analyzer optional for get_default_agent (#802)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-20 18:25:45 -04:00
Robert Brennan 4d0860705c Replace All-Hands-AI references with OpenHands (#830)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 17:09:01 -05:00
Ryan H. Tran 3c4ce52a8b Enforce batch atomicity for condenser (#775)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-21 04:08:38 +07:00
Xingyao Wang fb991ce807 set tags=["Server Details"] for / (#829) 2025-10-20 21:41:23 +02:00
Graham Neubig 53ed9bb631 Fix httpx.UnsupportedProtocol error by setting base_url on clients (#801)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 12:45:23 -04:00
simonrosenberg 3902bcce9a Fix PR review workflow subdirectory paths (#824)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 12:40:58 -04:00
Xingyao Wang b3fc175930 1.0.0a3 pre-release (#822) 2025-10-20 23:45:01 +08:00
Xingyao Wang 6ce8204192 Fix: Skip Docker build and push for fork PRs (#821)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 23:35:05 +08:00
Xingyao Wang 512399d896 Fix deprecated @model_validator usage in LLMSummarizingCondenser (#817)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 23:21:04 +08:00
Tim O'Farrell 8ec2b5d47b Setting VSCode token to match SESSION_API_KEY. (#793)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 09:19:35 -06:00
Xingyao Wang b71ea9c78b Fix: Make Check OpenAPI Schema workflow work with fork PRs (#814)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 23:02:45 +08:00
Xingyao Wang 01507e7c32 Fix websocket deprecation warning by using wsproto (#508)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 21:30:24 +08:00
Graham Neubig 2bc735ede1 Fix assign-reviews workflow subdirectory paths (#807)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-20 07:31:54 -04:00
dependabot[bot] a9e7f2b48e Bump the version-all group with 4 updates (#810)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-20 10:47:20 +02:00
Engel Nyst 32e1e75f7e Rename LLM service identifiers to usage_id (#799)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-19 07:44:11 +02:00
simonrosenberg 64c0d5a069 Add example for automated TODO management with GitHub Actions (#758)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-17 22:36:32 +00:00
Xingyao Wang 06540a2dc8 update deploy-to-docs workflow (#788) 2025-10-18 04:40:59 +08:00
Xingyao Wang 1687b1090e fix: deploy changes to docs repo (#786) 2025-10-18 04:22:37 +08:00
Xingyao Wang c7f9b26931 Add basedpyright to pre-commit and fix all type errors (#593)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
2025-10-17 20:01:02 +00:00
Engel Nyst c1feec6499 refactor(llm): extract kwargs normalization into pure options selectors (#643)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-17 19:34:33 +00:00
simonrosenberg 2786193039 Feat: Add Configurable Plan Template to Planning Agent (#778)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-17 17:06:18 +00:00
Engel Nyst 9a004ed185 Print conversation ID when an exception stops the run (fixes #777) (#779)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-18 00:24:54 +08:00
Xingyao Wang a4a4e51862 Fix PyPI Wheel (#752)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-18 00:16:16 +08:00
Ryan H. Tran 4ffaa97a9a Add support for claude-4-5-haiku (#768) 2025-10-17 22:48:35 +08:00
Engel Nyst aa30853b52 Extend context truncation cases: broaden vLLM error detection (#759)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-17 14:18:29 +07:00
Xingyao Wang 593b87041a refactor: remove duplicated example and rename folder (#765) 2025-10-16 12:09:49 -07:00
dependabot[bot] fbbd570290 Bump the version-all group with 2 updates (#535)
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-16 16:30:47 +02:00
Tim O'Farrell f1ea269d8b Fix for URL endpoint location (#503) 2025-10-16 13:47:59 +00:00
Graham Neubig de524870e4 Update assign reviewers script based on experience (#760)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-16 09:43:05 -04:00
Graham Neubig 5c9e381e06 Add "assign reviewers" workflow (#756) 2025-10-16 13:51:53 +02:00
simonrosenberg 5b845c980d Add PR review workflow for fine-grained review comments (#751)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2025-10-16 11:58:34 +02:00
Graham Neubig eb5e58b727 Update all SDK examples to read LLM_MODEL and LLM_BASE_URL from environment (#740)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-16 11:21:53 +02:00
Tim O'Farrell 08cf609a99 Add AsyncRemoteWorkspace and sync with APIRemoteWorkspace (#735)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-15 13:46:31 -06:00
Rohit Malhotra 186e51e4b2 Add optional conversation_id parameter to StartConversationRequest (#741)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-15 11:56:46 -04:00
Xingyao Wang 09f8e6fd33 Enable unused variable detection in pre-commit hooks (#726)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-15 22:36:38 +08:00
Graham Neubig 80213f0b62 Add GitHub Actions workflow for scheduled maintenance tasks (#672)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-15 09:47:50 -04:00
Rohit Malhotra ee93879609 fix: make ThinkAction skip user confirmation like FinishAction (#742)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-15 13:39:55 +00:00
Engel Nyst 49adf4b0a1 Render user rejections in visualizer (#746)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-15 15:19:51 +02:00
Xingyao Wang 45d6f56c57 use uv-managed python for "source" docker (#737) 2025-10-14 17:46:45 -03:00
Engel Nyst cced660676 Sync readme to changes (#734) 2025-10-15 01:13:43 +08:00
Xingyao Wang ada376e9f8 v1.0.0a1 Release (#736) 2025-10-15 01:10:33 +08:00
Xingyao Wang 61a3ca07de Add gpt-5-codex to OpenHands verified models (#657)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
2025-10-14 17:28:50 +02:00
Tim O'Farrell b352181e9b Fix agent server persistence (#732)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-14 07:56:38 -06:00
Tim O'Farrell 28305655a2 Added run flag to send message operations (#729)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-13 18:10:41 -06:00
Xingyao Wang b713c50d1b use older version of builder so the binary is compatible (#731) 2025-10-14 06:01:42 +08:00
Graham Neubig 919a4c7bec Refactor hello_world.py for simplicity (#713) 2025-10-13 18:45:39 +00:00
Rohit Malhotra 50b094a928 Runtime errors with security risk and no configured security analyzer (#723)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-13 16:24:32 +00:00
Engel Nyst 7eda027920 Add Vscode initial settings extension (#720)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2025-10-13 16:11:15 +00:00
juanmichelini b6e7598170 Remove AGENT_SDK_PATH environment variable (#724) 2025-10-14 00:06:56 +08:00
Tim O'Farrell 31c4bd6ded Update scripts to fix websocket (#716) 2025-10-13 09:49:04 -06:00
simonrosenberg a42d6c224d Update planning system prompt (#719)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-13 11:36:27 -03:00
dependabot[bot] cb682e61c5 Bump astral-sh/setup-uv from 6 to 7 in the version-all group (#721)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-13 21:59:45 +08:00
Tim O'Farrell bc40be6f39 Add configurable VSCode port parameter (#717)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-12 17:03:30 -06:00
Tim O'Farrell f8ca02c4a3 Fix ConversationService lifecycle management to prevent inactive_service errors (#715)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-12 06:52:02 -06:00
Xingyao Wang a436a8ee8f Python-based Docker Build Script (#698)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-11 05:23:42 +08:00
Xingyao Wang d47b533ce9 remote convo: log request error (#709) 2025-10-11 04:23:30 +08:00
Xingyao Wang 0b3057492d Reorganize examples (#696)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-10 02:46:42 +00:00
Ray Myers f59808d290 fix coverage report comment by making it shorter (#701)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-10 10:44:05 +08:00
Xingyao Wang 096017e8f4 fix color in makefile (#705) 2025-10-10 09:56:56 +08:00
simonrosenberg 4defa51ddc Fix bug with gpt5 responses API 2025-10-09 16:51:28 -03:00
Xingyao Wang 0101c63495 Support building "minimal" version of agent-server docker image (#693)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-10 02:46:46 +08:00
Xingyao Wang 67e1b79175 Refactor Docker build system: Replace variant with custom_tags and prioritize SHA-based tags (#685)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-10 01:00:00 +08:00
Hiep Le 3ba4ee6ae7 feat(backend): [Agent Server] Implement update_conversation API (#661)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2025-10-09 23:46:29 +07:00
Xingyao Wang 2df749b1d8 Remote Runtime API (no build) (#684)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-10 00:27:46 +08:00
Tim O'Farrell 255795ec08 Fix for small logic hole in env parser (#688) 2025-10-09 09:48:25 -04:00
Rohit Malhotra 189979a501 Add is_confirmation_mode_active property to BaseConversation (#687)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-08 16:12:11 -04:00
juanmichelini 1d491fe050 fix: Path(__file__) points to workspace.py not docker.py (#683) 2025-10-09 01:58:32 +08:00
Xingyao Wang ae493e06a8 Allow security_analyzer to differ during agent reconciliation (#669)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-09 00:46:57 +08:00
Hiep Le de2d9f5b45 fix(backend): [agent server] remote workspace loading all bash events instead of using filters (#679) 2025-10-08 22:54:27 +07:00
simonrosenberg c6d534a6fd Refactor: integration tests (#677) 2025-10-08 22:45:13 +08:00
Graham Neubig 48b429490b fix: prevent tests from making HTTP calls to production LLM proxy (#673)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-08 09:04:11 -04:00
simonrosenberg fcd257617f Improve Integration Tests Failures (#651)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-07 18:43:11 -03:00
simonrosenberg 8ec92accf5 Feat: implement planning agent (#624)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-07 18:41:06 -03:00
Robert Brennan 15a07bf93d refactor: move DockerWorkspace to separate openhands/workspace package (#666)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-08 03:42:09 +08:00
Ryan H. Tran c9a1ce89ef Fix browser use tools not init error (#658) 2025-10-07 23:34:26 +08:00
Xingyao Wang c64a989f25 Support GPT-5-Codex (#655) 2025-10-07 05:01:01 +08:00
Graham Neubig f37f9f5777 Replace LITELLM_API_KEY with LLM_API_KEY in examples (#654)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-07 04:26:42 +08:00
Robert Brennan d4e06047d2 Remove automatic Chromium installation from browser tool (#652)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-07 03:45:48 +08:00
Engel Nyst aa035110ed Encapsulate conversation title generation via EventService (#653)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-07 03:44:47 +08:00
Graham Neubig 1a54909d41 Revise README for OpenHands Agent SDK (#649) 2025-10-07 03:21:28 +08:00
Xingyao Wang 20567a53ea Implement generate_title method for Conversation class (#256)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
2025-10-06 22:32:01 +08:00
Xingyao Wang 487e903415 Fix: Remove 'kind' field from MCP schema output (#641)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-06 21:29:11 +08:00
dependabot[bot] c5baaee486 Bump the version-all group with 2 updates (#645)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-06 21:28:56 +08:00
Xingyao Wang f1bf8d24a8 Remove deprecated to_llm_dict() and from_litellm_message() methods (#642)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-06 03:08:58 +08:00
Engel Nyst 897d4e992a LLM → Telemetry: fixes (#638)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-06 02:49:49 +08:00
Engel Nyst 1a715efce1 Add native responses (#622)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-05 05:35:18 +08:00
Engel Nyst 8c4d8df10c feat: NonExecutableActionEvent for Completions path (preserve tool_call_id continuity) (#623)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-04 15:58:11 +00:00
Engel Nyst fab8078192 Allow FileEditor insert_line to be 0 (start-of-file) (#635)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-04 13:24:57 +00:00
simonrosenberg 3ce74a1656 Refactor: rename str_replace_editor -> file_editor (#628)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-03 12:53:58 -03:00
Tim O'Farrell 519434bcd9 Can now specify secrets on conversation start, and simplified type hints (#625)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 22:10:25 -06:00
simonrosenberg 31dd29867f Add full agent/LLM logs to integration test artifacts (#612)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-03 00:51:38 +02:00
simonrosenberg 19f0e2201a Refactor: add MessageToolCall class instead of litellm's ChatCompletionMessageToolCall (#550)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-10-02 17:49:00 -03:00
Tim O'Farrell b8202724df Formal secrets (#616) 2025-10-02 14:14:24 -06:00
Xingyao Wang 610d97e8d1 fix event service duplicate id (#617)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 14:01:28 -06:00
Xingyao Wang 0316280095 feat: Add DockerWorkspace (#510)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-03 02:37:13 +08:00
openhands 1733c684bb Update examples to demonstrate proper Pydantic model attribute access
- Modified examples/22_hello_world_with_agent_server.py to access CommandResult attributes (command, exit_code, stdout) directly
- Modified examples/23_hello_world_with_sandboxed_server.py to access CommandResult attributes directly
- This demonstrates the correct way to work with the new Pydantic models introduced in PR #615
- Avoids dictionary-style access patterns that won't work with Pydantic models

Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 17:42:56 +00:00
Tim O'Farrell ceb6b68cb3 Add Pydantic models for workspace method return values (#615)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 11:36:01 -06:00
Ryan H. Tran c3cba52756 Fix thinking budget 200k raising error with interleaved thinking (#614) 2025-10-02 22:44:51 +08:00
Tim O'Farrell 153c0f0ac6 Add resend_all parameter to events_socket endpoint (#608)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: enyst <engel.nyst@gmail.com>
2025-10-02 09:22:38 +08:00
Tim O'Farrell dd8eb43646 Better implementation of get_event (#610) 2025-10-02 09:20:01 +08:00
Robert Brennan a57e73b9e5 Add tests for Conversation constructor with secrets parameter (#597)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 05:12:36 +08:00
simonrosenberg e6ca75bac4 Fix StrReplaceEditor to include working directory in tool description (#605)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 05:11:53 +08:00
Xingyao Wang 9365a32cdc Fix MCP tool security risk prediction schema bug (#607)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 05:09:26 +08:00
Xingyao Wang b3710eb4ea Remove deprecated params={'working_dir'} and params={'workspace_root'} patterns from tools (#604)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2025-10-02 05:01:19 +08:00
Robert Brennan 8ce82b0d95 Rename ToolSpec to Tool throughout the repository (#599)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-01 20:08:19 +00:00
Xingyao Wang 0e1ec5b62d Add ZIP download for conversations and fix directory reset issue (#601)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 03:07:22 +08:00
Xingyao Wang 34cf701260 Remove Tool = ToolDefinition alias and Tool exports (#602)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-02 02:59:12 +08:00
Robert Brennan f862662dc0 Rename Tool class to ToolDefinition across entire codebase (#592)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-01 18:15:14 +00:00
Robert Brennan 49806a47d0 Refactor OpenHands SDK: Optional service_id, rename base classes, and enhance Conversation (#591)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-01 17:43:14 +00:00
Xingyao Wang 711efcbada Fix sonnet 4.5 output token (#595) 2025-10-02 01:09:46 +08:00
Xingyao Wang b8ac749ce3 Enable interleaved thinking (#594) 2025-10-02 00:51:21 +08:00
Rohit Malhotra a6b1f18639 Fix nested LLM reconciliation in agent deserialization (#517)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-02 00:00:42 +08:00
Xingyao Wang 0112bcc09f Fix multiple inheritance issue in ConversationState (#583) (#585)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-01 14:17:31 +00:00
juanmichelini d901634eff feat: env path to docker building files not in packages (#587)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-10-01 22:14:18 +08:00
Xingyao Wang 0b59105a54 Add Anthropic thinking blocks support to events system (#576)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
2025-10-01 22:12:20 +08:00
juanmichelini 0ebd37966e Dockerfile now detects correct chromium name for Ubuntu and Debian (#588) 2025-10-01 13:49:07 +00:00
Xingyao Wang eac940f205 add Agent to ConversationStateProtocol (#586) 2025-10-01 11:32:44 +08:00
Tim O'Farrell b28fdd0885 Pillow now seems to be required in the agent server (#580) 2025-10-01 02:33:32 +00:00
Xingyao Wang c6181a9354 add py.typed for all modules (#584) 2025-10-01 10:30:21 +08:00
Xingyao Wang 3aac07ad7b auto-create working dir if it doesn't exist for LocalConversation (#582) 2025-10-01 10:26:24 +08:00
Tim O'Farrell 2f8621cced Creating a workspace object should not try to create the dir (#579) 2025-10-01 09:18:44 +08:00
Tim O'Farrell 8f55f5ab48 Have post to /conversations work without redirect (#578) 2025-10-01 07:25:22 +08:00
simonrosenberg 7f0c664eb1 Fix FIFOLock test race condition and pytest warnings (#575)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-30 22:07:45 +00:00
juanmichelini 64aa22dafd fix: changed suffix in agent server naming from timestamp from random uuid (#577) 2025-10-01 05:29:27 +08:00
Graham Neubig 8602ff907d fix(sdk): prevent CLI crash by early-validating MCP tool args (closes All-Hands-AI/OpenHands#11163) (#529)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-30 19:30:32 +00:00
Xingyao Wang 0c86676622 Add ConversationState updates to websocket events (#556)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-01 03:21:10 +08:00
simonrosenberg 9f566c2af6 Fix FIFOLock + pytest collection warnings by renaming test helper classes (#572)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-30 18:55:38 +00:00
Xingyao Wang 2ca2db4c9d Implement Workspace and support system-level bash execution (#534)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-10-01 02:20:44 +08:00
simonrosenberg 2fc8740489 Replace ListLike Protocol with list[T] type annotations (#562)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-30 14:41:39 -03:00
simonrosenberg 344ec8a9c3 Refactor: deduplicate compose_callbacks (#568) 2025-09-30 13:50:55 -03:00
Xingyao Wang cfae94d00e Support Two Types of Directories for Conversations: persistence_dir and working_dir (#531)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-29 20:23:56 +00:00
dependabot[bot] 2011d1607f Bump the version-all group with 3 updates (#536)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-29 22:10:50 +02:00
Xingyao Wang 8ce0146d1f Fix remaining claude-4-5 issues (#560) 2025-09-30 02:56:03 +08:00
Xingyao Wang b481346a9a add claude-sonnet-4.5 (#558) 2025-09-30 01:51:12 +08:00
simonrosenberg 774245d9d3 Refactor: remove pass statements from abstract methods and add meaningful docstrings (#542)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-29 16:14:35 +00:00
Xingyao Wang 924d87af13 Support VNC inside agent-server - and allows for interactive browsing!! (#515)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
2025-09-29 23:07:47 +07:00
simonrosenberg 1d52ccbf5e refactor: remove llm context manager (#537)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-29 13:54:14 +00:00
simonrosenberg 066c538c2d Refactor: move get_unmatched_actions to ConversationState as static method (#540)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-29 21:51:39 +08:00
simonrosenberg 57b11e9796 Refactor: Fix agent to continue conversation when tool doesn't exist (#546)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-29 21:35:17 +08:00
Xingyao Wang 693947c8ae Refactor: Move presets to tools package and integrate LLM metadata (#532)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-29 05:00:55 +08:00
Tim O'Farrell 7b51a03950 Include conversation_id parameter in WebhookSubscriber (#528)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-28 09:21:07 -06:00
Tim O'Farrell d8e3174f80 feat: Add generic environment variable configuration parsing for Pydantic models (#511)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-28 13:59:15 +00:00
Engel Nyst 6a7e7811ff Fix sonnet3.5 recognition (#524)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-28 02:52:56 +00:00
Xingyao Wang 834dca7221 Remove load_from_toml function and tomli dependency (#526)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-28 10:45:28 +08:00
Robert Brennan 7abbaef59c Add API key support to RemoteConversation (#523)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-27 16:14:21 -04:00
Xingyao Wang 8daf576a66 Fix PyInstaller binary build: include all Jinja2 template files (#514)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-27 03:56:53 +08:00
juanmichelini c6ee351248 Fix sandbox build script lookup to work from any directory (#513) 2025-09-26 19:19:09 +00:00
Tim O'Farrell 632d636880 Serve server info from / rather than 404 if no static files (#512) 2025-09-26 17:44:59 +00:00
Xingyao Wang 26fa368e64 fix remote conversation by update socket url; add browsing example (#505) 2025-09-26 22:54:55 +08:00
Ryan H. Tran 5de1a20dc7 Update dockerfile with browser-use deps (#502) 2025-09-26 22:48:36 +08:00
Rohit Malhotra 9223352d71 Fix LLM service_id naming inconsistencies in examples (#499)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-25 16:58:23 -04:00
Xingyao Wang 9d94735226 feat: Add Web VSCode support to AgentServer (#477)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-26 03:14:10 +08:00
John-Mason P. Shackelford e812da8cc3 Public APIs return OpenHands types not LiteLLM types - Part 1: CompletionResult (#375)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-26 00:28:41 +08:00
Xingyao Wang fc5fb72371 Remove pandas from dev dependencies (#496)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-26 00:18:26 +08:00
simonrosenberg ea0e843d32 Add @abstractmethod decorators to base classes (#490)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-25 23:24:05 +08:00
Xingyao Wang 86375d1a48 Fix server.yml PR description to account for all Docker variants (#483)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-25 22:54:05 +08:00
Xingyao Wang a349016527 Add reset functionality to BashExecutor for unresponsive terminals (#474)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Ryan H. Tran <descience.thh10@gmail.com>
2025-09-25 14:15:00 +00:00
simonrosenberg 5c06f12f29 Fix quoted 'Self' type annotations (#488)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-25 13:46:09 +00:00
Tim O'Farrell 6dd2af9e7f Move WebSocket endpoints to /sockets and use query parameter authentication (#486)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-25 07:25:35 -06:00
Rohit Malhotra f0b9bcb5de Refactor LLM registry to remove service_id parameter from add() method (#482)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-24 21:08:17 +00:00
Tim O'Farrell 389ac56d5c feat: Add BashTaskService for executing bash commands outside agent context (#453)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-24 14:41:37 -06:00
Xingyao Wang bd2c026a66 Improve server logging for HTTP 500 (#480)
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-25 02:35:33 +08:00
Xingyao Wang 523da3dba9 Fix #470: Update PR description instead of creating comments for agent server images (#472)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-25 01:39:30 +08:00
simonrosenberg a6e78abe7c Tighten typing to remove vague unions and optional runtime fields (#473)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-25 01:21:47 +08:00
Rohit Malhotra 1bd515f277 Fix: makes sure conversation stats is passed to conversation visualizer (#478) 2025-09-24 17:18:17 +00:00
Ryan H. Tran 55409a58b5 Add comprehensive tests for browser-use tools (#441)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-25 00:45:42 +08:00
simonrosenberg b208d27c77 Fix typing issues: remove vague unions, optional runtime fields, and consumer-side narrowing (#461)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-24 23:56:01 +08:00
simonrosenberg dceb495233 Feat: add cron job for typing quality check (#462) 2025-09-24 23:48:58 +08:00
Ryan H. Tran 95fffbe0ec Add Chromium binary check and auto-install if not available (#440)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-24 23:37:55 +08:00
Xingyao Wang e14c6236bf fix sonnet 4 output size (#468) 2025-09-24 23:09:10 +08:00
Xingyao Wang de5edb956f Add pyupgrade rules to Ruff config (#465) 2025-09-24 14:47:53 +00:00
Tim O'Farrell 7a36fbb1a0 Another rename (#467) 2025-09-24 14:11:07 +00:00
Tim O'Farrell e1fc764033 Removed reference to old name (#466) 2025-09-24 14:07:23 +00:00
Tim O'Farrell 37ea626ffb Removed conversation_id from event api (#457) 2025-09-24 13:57:20 +00:00
simonrosenberg e610309bfc chore(ruff): enforce Python 3.12 typing standards with UP006, UP007, … (#464)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-24 21:47:03 +08:00
simonrosenberg 4fd5015c14 Fix concurrent user messages sent while agent is finishing (#443)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-24 09:22:50 +02:00
Tim O'Farrell 14de32616c Add Example Application (#393)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-09-23 23:41:18 +00:00
Xingyao Wang 49cbde95ef Docker Sandboxed Remote Conversation (#455)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-24 05:57:33 +08:00
Rohit Malhotra f10d98e0c1 Add LLMSecurityAnalyzer as default security analyzer in get_default_agent (#454)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-23 20:22:35 +00:00
Rohit Malhotra fb431da432 Expose property for when confirmation policy is set (#449) 2025-09-23 19:49:42 +00:00
Rohit Malhotra 35703566fc Port over conversation stats (#288)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-23 19:45:34 +00:00
Xingyao Wang 7f3c4a1739 feat(sdk): Implement LocalConversation and RemoteConversation (fixes #388) (#391)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-24 02:44:58 +08:00
Tim O'Farrell 33ebacb666 Added file upload download routes and reorganized endpoints using tags (#448)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-24 02:17:50 +08:00
Xingyao Wang 01acf441eb Add stale issue and PR management workflow (#445) 2025-09-23 16:28:08 +00:00
Tim O'Farrell 42d694de1b Authentication Upgrades (#438)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-23 09:31:02 -06:00
simonrosenberg f5179e0364 Solve final action concurrent message bug (#403)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-23 14:41:10 +00:00
Tim O'Farrell f89a683675 Fix websocket infinite loop on user disconnect (#439)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-23 14:35:10 +00:00
Tim O'Farrell 1d33a0dfba Documentation updates (#434) 2025-09-23 22:26:55 +08:00
Robert Brennan bd8bdaef79 Fix exception handling for unauthorized requests (#408)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2025-09-22 21:54:06 -06:00
Xingyao Wang bd22245bd0 refactor(conversation): initial protocol for conversation (#431)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-23 00:15:07 +00:00
Xingyao Wang 26d9f05c5b Fix security analyzer (#435)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-23 06:18:45 +08:00
Xingyao Wang f530297106 chore: simplify examples/04_confirmation_mode_example.py (#436) 2025-09-22 21:24:30 +00:00
Xingyao Wang bf3026216e refactor: create ObservationBaseEvent and refactor llm_convertible events to multiple files (#433) 2025-09-22 20:40:27 +00:00
dependabot[bot] 5634177d80 Bump actions/setup-node from 4 to 5 in the version-all group (#401)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-22 20:34:36 +00:00
Tim O'Farrell 9c3a880341 Expand webhook protocol to support conversation lifecycle events (#379)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 14:18:00 -06:00
Xingyao Wang bd3d22c499 fix condensation issue caused by condenser view (#432)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 20:12:22 +00:00
Calvin Smith 3cfaa2b337 Handle context window exceeded exceptions with registered condenser (#358)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2025-09-22 12:55:42 -06:00
Xingyao Wang 87eced2530 chore(log): make condenser log debug (#430) 2025-09-22 18:20:00 +00:00
Rohit Malhotra 68fed9e285 Fix persistence path in default agent (#429)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-22 18:15:49 +00:00
Tim O'Farrell e50c058207 Prefix api routes with /api (#428) 2025-09-23 01:57:33 +08:00
Ryan H. Tran e37d9cdafa Port over stuck detector (#335)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-22 17:38:13 +00:00
Graham Neubig d3d416d2de Consolidate security content into centralized security policy (#422)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 16:39:46 +00:00
Xingyao Wang fa470e3eef chore(log): flip logger.info to logger.debug for LLM (#425) 2025-09-22 16:19:18 +00:00
Xingyao Wang 2076795b47 chore: remove --no-reload that was removed (#424) 2025-09-22 16:13:24 +00:00
Rohit Malhotra ab3c0be0c9 Remove custom model_dump_with_secrets method and use Pydantic context-based serialization (#423)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-23 00:05:56 +08:00
Robert Brennan a5f7e71269 Build Docker images for Python on all commits (#413)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-22 14:06:21 +00:00
Robert Brennan e7933b81f3 Add VSCode and Docker to agent-server Dockerfile (#412)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 13:48:06 +00:00
Robert Brennan a1165a620d Change default base image to slim version (#404) 2025-09-22 13:38:08 +00:00
Tim O'Farrell 97b28ba047 Add optional static files directory support to agent server (#395)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 07:18:21 -06:00
Robert Brennan 5e804e9753 Add sudo to base package installation in Dockerfile (#409) 2025-09-22 15:08:10 +02:00
Robert Brennan b9f0dcf217 Fix: Reverse --reload flag logic to default to False (#406)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 12:02:32 +00:00
Robert Brennan 85aab73170 Disable auth for /server_info endpoint (#405)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 13:48:17 +02:00
Ryan H. Tran 657f699c65 Add LLM router (#152)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-22 14:10:46 +07:00
Xingyao Wang ff72fe1bc4 Fix server hang due to browser-use (#399)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2025-09-22 02:25:51 +00:00
Tim O'Farrell dc65739804 Server details (#362)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-21 22:09:50 +00:00
Xingyao Wang 33d55a67b7 server: register default tools on server & add /tool/list endpoint for registry (#392) 2025-09-21 22:07:08 +00:00
Calvin Smith e0b0ce081e Replaces confirmation mode with confirmation policy (#348)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-21 21:47:20 +00:00
Xingyao Wang c261220c0e Move tests importing openhands.tools from tests/sdk to tests/cross (#390)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-21 21:33:58 +00:00
simonrosenberg d2115ae30e Add example showing agent can handle messages sent while task is running (#378)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 04:58:18 +08:00
Xingyao Wang 10134740e2 chore(visualization): remove dim everywhere, fix diff newline (#389) 2025-09-21 20:51:03 +00:00
Engel Nyst 7cc7df6b46 refactor: centralize tool conversion in LLM (split from #142) (#371)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-21 20:28:07 +00:00
Engel Nyst ca5a4d32e2 chore(typing): adopt Python 3.12 style (PEP 585/604) (#382)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-21 20:16:41 +00:00
Engel Nyst f84dfd8804 docs(repo): add section 'Avoid overly defensive code' to repo.md (#385)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-21 20:10:41 +00:00
Xingyao Wang 6c9032bd8e Update README.md to match current codebase structure and API (#384)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-22 04:08:09 +08:00
Xingyao Wang 853bce0f72 Unify tool initialization — Step 1: global registry + lazy materialization (#356)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Rohit Malhotra <rohitvinodmalhotra@gmail.com>
2025-09-22 01:32:06 +08:00
Xingyao Wang 292e6b66bb Relax openapi schema test & make Microagent a DiscriminatedUnionMixin and tests (#377)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-21 16:29:19 +00:00
Engel Nyst 62af2de182 docs: sync README with examples and presets (#381)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-21 16:19:16 +00:00
Engel Nyst 726eecbb16 docs(spec): fix AgentSpec examples in comments (#380)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-21 17:14:31 +02:00
Xingyao Wang 967a83d679 More serialization fix (#374) 2025-09-21 07:06:04 +08:00
Rohit Malhotra d58df0d4d7 feat: add expose_secrets context parameter for AgentSpec serialization (#368)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-20 01:20:12 +00:00
dependabot[bot] a09cbe5d41 Bump the version-all group across 1 directory with 5 updates (#359)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-20 08:46:09 +08:00
Xingyao Wang a513ecd69c Use Blacksmith runners for all GitHub Actions workflows (#370)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-20 00:18:52 +00:00
Tim O'Farrell e4216d3974 Fix for serialization (#364)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-20 00:09:43 +00:00
Tim O'Farrell 60af66b361 Add gte constraints to Pydantic numeric fields (#367)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-20 07:43:20 +08:00
Xingyao Wang 7ac42dec74 add a simple conversation viewer for debugging purpose (#363) 2025-09-19 21:01:51 +00:00
Xingyao Wang 9e8adce4e6 Create MCPToolAction with dynamic schema (#342)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-19 20:31:50 +00:00
Xingyao Wang 62ecc2b4c8 Fix CI: Only run build-and-push-image on main or build-docker tag (#361)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-20 04:25:15 +08:00
Tim O'Farrell 25c22396c8 Simpler serialization! (#331)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-19 12:18:34 -06:00
Robert Brennan 7b191b74bd Add send_text helper method to Conversation class (#339)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-19 00:53:12 +00:00
Robert Brennan 1a8bae3df9 Add Docker image variants for Java, Golang (#345)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-18 21:37:08 +00:00
Xingyao Wang 99f2fea426 tool: default editor cwd to current cwd (#349) 2025-09-18 21:04:48 +00:00
Xingyao Wang 6450d44d0c make _add_security_risk_prediction a property for re-use (#347) 2025-09-18 20:23:20 +00:00
Xingyao Wang 0db5017a55 Do not add security_risk unless security analyzer is enabled (#341)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-19 04:00:20 +08:00
Xingyao Wang 8c86203533 remove DynamicStrReplaceEditor action by removing the hardcoded path in tool desc (#340) 2025-09-18 18:37:31 +00:00
Xingyao Wang 5d1893ac5d set base_url directly instead of set default for oh provider (#338)
Co-authored-by: rohitvinodmalhotra@gmail.com <rohitvinodmalhotra@gmail.com>
2025-09-19 00:50:57 +08:00
Calvin Smith c352746e94 Security analyzer framework (#305)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-18 14:16:35 +00:00
dependabot[bot] 5ddbe26d63 Bump the version-all group with 3 updates (#322)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-18 20:45:10 +08:00
simonrosenberg 80dcd11517 refactor: update logs (#332) 2025-09-18 09:53:36 +02:00
Rohit Malhotra 0e62eeeba1 Create lazy import statements in tools package (#330)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-17 16:17:11 -07:00
Xingyao Wang b5e77ca881 Comment on PR when docker build finishes (#329) 2025-09-18 04:30:12 +08:00
Xingyao Wang cf686c6b51 Build docker for Agent SDK (#326) 2025-09-18 03:51:44 +08:00
simonrosenberg 9eaf07ba43 refactor: remove run_infer.sh for integration tests (#324) 2025-09-17 17:11:51 +00:00
Xingyao Wang fa268e0c7f Add MIT License to the project (#325) 2025-09-17 16:00:50 +00:00
Xingyao Wang 34d7d56655 Build agent-server into executable binary (#320) 2025-09-17 23:52:02 +08:00
Xingyao Wang 834e1213b5 Add pre-commit for YAML files (#321) 2025-09-17 15:25:50 +00:00
Tim O'Farrell 616d14d71f Add comprehensive documentation and improve configuration for OpenHands Agent Server (#318)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-17 22:57:01 +08:00
Tim O'Farrell 553c95c591 Ported over agent server code (#317) 2025-09-17 08:19:37 -06:00
simonrosenberg 39d963c112 Add LLM costs to integration tests (#304)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-17 14:26:59 +02:00
Xingyao Wang f8e800a93a Remove CondenserSpec and use Condenser directly in AgentSpec (#308)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-17 02:32:32 +08:00
Xingyao Wang 752ac3658f stop inherit TextContent from mcp.types.TextContent since it breaks JSON schema (#306) 2025-09-16 16:33:49 +00:00
Xingyao Wang 0d63ee1d62 Introduce Spec for simpler "API start" in server (#284)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 15:57:08 +00:00
simonrosenberg 59cb78bdee Add t05_simple_browsing.py integration test (#294)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 16:45:20 +02:00
simonrosenberg 49c2cb4a15 Add t06_github_pr_browsing.py integration test (#295)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 16:28:59 +02:00
Ryan H. Tran 18b0a36ad4 Integrate browser-use tools (#214)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-16 20:55:03 +07:00
Xingyao Wang 6e3d347c65 Replace ConversationID to use uuid.UUID instead of string (#286)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 20:41:09 +08:00
simonrosenberg 8faaf69913 Add t04_git_staging.py integration test (#293)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 11:49:39 +02:00
simonrosenberg f3813300de Add t03_jupyter_write_file.py integration test (#292)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 09:33:32 +00:00
simonrosenberg ce51daa771 Fix integration job when run on schedule (#299)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 11:28:56 +02:00
simonrosenberg 4d281dd11e Add t07_interactive_commands.py integration test (#297)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 11:21:23 +02:00
simonrosenberg 4516afcced Add t02_add_bash_hello.py integration test (#291)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-16 09:09:15 +00:00
simonrosenberg 8db0ab9be1 refactor: rename test for consistency (#289) 2025-09-16 10:09:29 +02:00
Xingyao Wang 0f2a602a76 fix FastAPI serialization with SkipJsonSchema (#283) 2025-09-16 00:51:03 +00:00
Ryan H. Tran 0d635c2ded Enforce list of Message only for LLM (#260)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-16 08:31:52 +08:00
Rohit Malhotra fc62adb9cd Fix pause logic with new agent status enums (#280)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-15 21:04:08 +00:00
Calvin Smith b7c8132e3d Lift ID types in event to type aliases (#279)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2025-09-15 20:44:02 +00:00
Rohit Malhotra 46dea290e6 Expose methods/dicts for model choices (#276)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-15 20:30:34 +00:00
Calvin Smith 128c68634c LLM-summarizing condenser implementation (#181)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-15 14:17:32 -06:00
Rohit Malhotra 7ece8f94a0 Add LLM JSON storage functionality (#278)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-15 19:58:53 +00:00
Xingyao Wang 26d62ab00f Revert #262 LLMConvertibleEventWithMetrics to use a new type EventWithMetrics (#275) 2025-09-15 19:35:17 +00:00
Xingyao Wang 8f93b7c618 fix .model_dump_json issue introduced in #237 (#277) 2025-09-15 19:19:50 +00:00
Engel Nyst a5b9f6af68 Model features: broaden patterns (o1*/o3*/gpt-5*) and small cleanups (#273) 2025-09-16 01:37:50 +08:00
Xingyao Wang d014593e97 port over sdk change related to server implementation (#269) 2025-09-15 16:19:49 +00:00
Xingyao Wang 1e5ab08cbd Support for discriminated unions to dump and re-create from spec (#237)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-15 16:01:00 +00:00
Xingyao Wang cfba92a2e5 add LLMConvertibleEventWithMetrics and add get_llm_metrics example (#262) 2025-09-15 15:46:56 +00:00
Xingyao Wang cf894f9774 Consolidate Agent State Management from Multiple Booleans to Single Enum (#261)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-15 23:11:11 +08:00
simonrosenberg 7dc37fe2b2 feat: add integration tests and cron job workflow (#219)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-15 16:28:03 +02:00
dependabot[bot] e56b44bad7 Bump tj-actions/changed-files from 46 to 47 in the version-all group (#259)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-15 20:50:44 +08:00
Xingyao Wang a31fb32c32 add support to customize system prompt via kwargs (#257)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-15 03:33:33 +08:00
Boxuan Li 55e77488cc Implement dynamic workspace path detection for str_replace_editor tool (#252)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-14 14:35:16 +00:00
Xingyao Wang 330d925f9a Add repomix tool for codebase packaging and analysis (#247)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-14 14:19:56 +00:00
Engel Nyst bffbc7be03 fix: prevent path traversal attacks in LocalFileStore (#232)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-14 14:03:16 +00:00
Xingyao Wang 53ab7bd8a5 Fix execute_bash tool to inherit environment variables from parent process (#245)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-14 02:13:51 +00:00
Rohit Malhotra 4f8e54e5df Feat: mask secrets (#248)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-14 02:08:55 +00:00
Xingyao Wang 6291144039 Remove all dim text styling from visualizer (#244) 2025-09-13 22:28:29 +00:00
Rohit Malhotra 91e9c3b3c8 feat: Add secrets manager for secure credential handling (#189)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-13 22:14:53 +00:00
Xingyao Wang 95e185d25b Make all events immutable by adding frozen=True (#241)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 20:52:14 +00:00
Engel Nyst 196e4e87ee Improve error handling in TaskTracker JSON loading (#230)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-13 20:04:35 +00:00
Xingyao Wang c6f6b46443 Replace list[TextContent | ImageContent] with Sequence[...] for type compatibility (#238)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 19:36:52 +00:00
Xingyao Wang b63f927339 do not try to force ImageContent be subclass of mcp image content (#212)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 15:36:55 +00:00
Engel Nyst 9d582e551b fix: prevent data mutation during visualize (#226)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 13:38:41 +00:00
Engel Nyst 4717cfe064 Expand conversation abbreviations to full names (#233)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 03:41:25 +00:00
Engel Nyst 9c3ebdb3e8 Fix LocalFileStore.list() to handle file targets gracefully (#228)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 03:16:39 +00:00
Engel Nyst 2111789f46 Remove redundant agent diff check logic (#231)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 02:58:42 +00:00
Engel Nyst e70ec5923f Fix FileStore ABC inheritance (#227)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 01:43:18 +00:00
Engel Nyst 22f941121e Fix type hints for get_edit_groups to accept optional parameters (#229)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 01:24:02 +00:00
Engel Nyst faff8a0432 Fix README accuracy issues (#225)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-13 01:40:16 +02:00
Xingyao Wang 274d7c24ff fix microagent typing that breaks openapi.json & remove unused (#221)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-12 18:50:20 +00:00
Xingyao Wang d4d82c9022 chore: remove redundant RetryMixin (#222) 2025-09-12 18:23:59 +00:00
Tim O'Farrell de9f448860 Add AsyncConversationCallback type and AsyncCallbackWrapper class (#217)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-09-12 10:54:05 -06:00
Xingyao Wang b3ddfb3468 Use file-backed list for "events" (#205)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-12 08:30:25 +08:00
Xingyao Wang 91fbdf7d9b feat: visualize notes in italic for task tracker (#210) 2025-09-11 21:25:16 +00:00
Xingyao Wang 9fd565fd6c move docs from this repo to docs repo and disable sync (#207) 2025-09-11 19:19:19 +00:00
Xingyao Wang 22baa101a1 feat: support for setting conversation id when start the conversation (#203)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-11 14:50:24 +00:00
Xingyao Wang 4ea749b05e feat(viz): do not show finish action observation; increase margin for all (#204) 2025-09-11 14:49:25 +00:00
Xingyao Wang 8f13fd0aa2 fix: ImportError during Python shutdown in terminal session cleanup (#162)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-11 14:33:39 +00:00
simonrosenberg 762fef2964 feat: add file deletion step in examples (#200) 2025-09-11 14:08:10 +00:00
simonrosenberg c592e7dcf4 refactor: remove emoji from security risk policy (#201) 2025-09-11 13:19:01 +00:00
simonrosenberg 2b9b48f5c0 refactor: rename/remove 'integration' from cross tests (#202) 2025-09-11 21:16:24 +08:00
Xingyao Wang 1f5acdb6e0 Update documentation to reflect current codebase structure (#198)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-11 07:16:42 +08:00
Xingyao Wang 8e17bda7be docs: Add context architecture documentation and improve mermaid chart readability (#197)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-11 06:10:58 +08:00
Xingyao Wang 5e16e6add8 docs: tweak header (#196) 2025-09-10 21:50:07 +00:00
Xingyao Wang 8d842ac5df Restructure documentation into comprehensive architecture/ folder (#188)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-10 21:40:48 +00:00
Engel Nyst c70c19e666 Move prompts in context directory (#191)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-11 05:25:03 +08:00
Xingyao Wang 143bfcd2a7 Add test for using LLM for non-function call completions (#185)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-09-10 20:26:05 +00:00
Xingyao Wang 7771489ae8 Serialization of Conversation to FileStore (#41)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: enyst <engel.nyst@gmail.com>
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2025-09-10 19:31:44 +00:00
Xingyao Wang b688b54e27 Improve visualization of Agent SDK (#187)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-10 18:43:52 +00:00
Xingyao Wang 877654801c Update stale documentation for OpenHands Tools System (#184)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-11 02:28:15 +08:00
Engel Nyst 1dbe91d441 refactor(llm): extract non-native tool-calling into mixin (#182) 2025-09-11 02:20:08 +08:00
Xingyao Wang 20aab02871 Flatten agent folder since there's only one agent now (#179) 2025-09-10 14:34:26 +00:00
Xingyao Wang f95f9788bb make both agent and tool subclass of DiscriminatedUnionMixin (#174)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-10 14:15:16 +00:00
simonrosenberg b177b6b0a8 refactor test folder - rename integration -> cross (#178) 2025-09-10 22:03:01 +08:00
Rohit Malhotra 57b4cf2723 Add visualize_diff functionality to StrReplaceEditorObservation (#125)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-09 22:20:55 +00:00
Xingyao Wang 3d0271a6a2 Refactor visualizer to use modular approach with individual event classes & consistent color (#171)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-09 19:07:23 +00:00
Engel Nyst 862902d1be Make Agent & Tool class a BaseModel with frozen=True (#134)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-09 16:27:59 +00:00
Ryan H. Tran c2b7a66bbe Add docs for MCP and update README (#166)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-09 14:57:39 +00:00
Xingyao Wang 428f25abc3 chore: Merge test_editor_comprehensive with test_basic_operations (#168)
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
2025-09-09 14:55:13 +00:00
Ryan H. Tran f6c6b129c7 Add comprehensive tests for file editor (#165) 2025-09-09 14:49:37 +00:00
Ryan H. Tran a5e499765a Add think and task_tracker tools (#148)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-09 14:12:29 +00:00
Engel Nyst 585d4779b1 feat: Support reasoning content in Agent SDK (#139)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-09 10:11:54 +08:00
Rohit Malhotra 46b1fa541a Fix switching confirmation mode and approving actions in one step (#163) 2025-09-09 01:54:12 +00:00
Xingyao Wang 7489bd8d72 increase mcp timeout to 300 sec (#160) 2025-09-08 20:03:38 +00:00
Xingyao Wang 896bfd2818 fix parent fields to include kind by considering model_fields_computed (#159) 2025-09-08 19:59:21 +00:00
Xingyao Wang 565e59bfda fix(MCP): do not include base action attributes like kind safety_risk for executing MCP Action (#156)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-08 18:07:33 +00:00
Xingyao Wang ce28761fba Add conversation ID to Conversation class (#158)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-08 18:06:36 +00:00
Xingyao Wang fdc0aafcd7 Support microagent repo.md with new MCP tool format (#154)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-09 01:47:27 +08:00
Calvin Smith 3387b8747e feat: Lazy discriminated unions for event/action/observation serialization (#133)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-08 08:18:03 -06:00
Xingyao Wang 79378f1bc0 Update microagent prompt to be more collaborative and less abrasive (#149)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-08 20:02:02 +08:00
Engel Nyst 7e183bd599 Fix issue #10729: Add xai/grok-code-fast-1 to MODELS_WITHOUT_STOP_WORDS (#147)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-08 06:41:52 +02:00
Engel Nyst b2072b9168 Stabilize tests: telemetry logging warning and Bash PS2 prompt capture (#145)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-07 15:26:19 +00:00
Xingyao Wang 196a486c60 Fix prompt caching and improve input/output metrics visualization (#143)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-07 23:12:25 +08:00
Engel Nyst c57e269d8c fix: resolve empty API keys to None and add Bedrock model support (#141)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-07 14:47:17 +00:00
Engel Nyst f9fe44c9e8 chore(docs): create docs/ folder and move existing documentation (#117)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-07 08:42:19 +00:00
Rohit Malhotra 6b3b5779d0 Fix pause event type (#144)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-07 10:12:52 +08:00
Xingyao Wang 301f685d16 Move tests to a new directory (#140) 2025-09-06 17:53:58 +00:00
Xingyao Wang 47074e9b00 docs: Add comprehensive tools system documentation (#136)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-06 11:21:48 +08:00
Xingyao Wang 155b63ef75 Add truncation to ExecuteBashObservation and TextContent with warning checks (#124)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-06 03:05:05 +00:00
Rohit Malhotra f38716fe22 Add pause/resume functionality to agent-sdk conversation system (#118)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-06 03:01:18 +00:00
Xingyao Wang dbb3b5e5e4 Fix confusing LLMRegistry API after LLM/LLMConfig merge (#129)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2025-09-06 09:40:07 +08:00
Xingyao Wang b28e7445be feat(logger): improve logger using RichHandler (#132) 2025-09-05 20:45:04 +00:00
Ryan H. Tran 3a23c02808 Migrate MCP clients (#94)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-06 04:17:11 +08:00
Ryan H. Tran 09ec532525 Port runtime bash tests (#84)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-05 17:05:58 +00:00
Xingyao Wang b1627e7c97 Update README.md and fix an AgentContext bug (#127)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-05 12:40:25 +00:00
Xingyao Wang 62395cff96 Clean up all examples; add interactive bash example; improve visualizer for action (#122)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-05 01:10:11 +00:00
Xingyao Wang 89a5f22025 Refactor LLMConfig into LLM pydantic class & large refactor of LLM class logic (#91)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-05 09:00:22 +08:00
Xingyao Wang 96bf1a63e0 Migrate CLI subprocess shell and refactor BashSession architecture (#109)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-05 08:11:23 +08:00
Xingyao Wang 5dce76b16d feat: improve visualizer (#119) 2025-09-05 00:03:06 +00:00
Xingyao Wang 65781dac39 Update system prompt SECURITY section to match OpenHands PR #10822 (#114)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-05 07:21:51 +08:00
Engel Nyst 9b66e5a977 feat(agent): simplify security guidelines in system prompt (#115)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-05 07:21:37 +08:00
Rohit Malhotra b7fd1b7d62 Implement confirmation mode for OpenHands agent SDK (#80)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-04 15:38:38 -04:00
dependabot[bot] 76a4e3615a Bump actions/download-artifact from 4 to 5 (#101)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-04 15:37:15 +00:00
Calvin Smith 806d332881 fix: Idempotent Message serialization (#106)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-04 15:35:55 +00:00
Engel Nyst fa1e26c536 Port config tests from OpenHands to agent-sdk (#49)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
2025-09-04 22:32:29 +07:00
dependabot[bot] 37c2fb26aa Bump actions/checkout from 4 to 5 (#104)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-04 15:28:25 +00:00
dependabot[bot] 2692e74072 Bump the version-all group with 4 updates (#108)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-04 23:27:40 +08:00
Xingyao Wang f4391a7094 Add open-pull-requests-limit and groups to dependabot (#105) 2025-09-04 14:49:03 +00:00
dependabot[bot] 6ab40488aa Bump astral-sh/setup-uv from 3 to 6 (#100)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-04 13:42:02 +00:00
Calvin Smith 19ea2544fe fix: Minor test cleanup (#99)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
2025-09-04 07:38:40 -06:00
Xingyao Wang 0bf5a297ea Configure Dependabot for uv and GitHub Actions (#98) 2025-09-04 13:05:42 +00:00
Xingyao Wang c47005498d fix typo in prompt suffix (#97) 2025-09-04 13:02:50 +00:00
Xingyao Wang fde4406ba7 Optimize pre-commit speed by setting always_run to false (#96)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-04 20:38:39 +08:00
Xingyao Wang cffff4bff2 Fix coverage upload issue in CI workflow (#90)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-04 20:30:02 +08:00
Xingyao Wang 8a76a861ab feat: Add BashTool and FileEditorTool subclasses for simplified tool initialization (#88)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-03 20:40:29 +00:00
Xingyao Wang e4cb2cef28 Support microagents and unify it within AgentContext (#71)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-03 20:12:54 +00:00
Xingyao Wang 969863ea76 Add comprehensive tests for events_to_messages conversions (#81)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-04 03:28:01 +08:00
Calvin Smith ab8980714d Core context condensation implementation (#61)
Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-03 12:06:03 -06:00
Xingyao Wang 4685ca1317 Rename openhands.core to openhands.sdk (#73) 2025-09-03 17:40:35 +00:00
Xingyao Wang 27bdcd599e Rename CodeActAgent to Agent and update folder structure (#78)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-03 17:26:59 +00:00
Xingyao Wang 662f2548b7 Standardize on absolute imports (PEP 8) and ensure repo package precedence (#74)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-03 17:16:06 +00:00
Xingyao Wang 49e2f51e5e Standardize project test structure (#75)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-04 01:14:04 +08:00
Engel Nyst c4f0213275 Port LLM tests from OpenHands to agent-sdk (#48)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Hoang Tran <descience.thh10@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2025-09-04 00:01:23 +08:00
Ryan H. Tran f16a2da9bc Add remaining LLM fixes and LLM registry (#70)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyaoww@gmail.com>
2025-09-03 22:25:50 +07:00
Xingyao Wang 67364a213f Reduce line-length from 500 to 88 and fix pre-commit errors (#68)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-03 20:30:42 +08:00
Rohit Malhotra 288e440b34 Fix: Include Jinja2 template files in package distribution (#69) 2025-09-02 22:57:37 -04:00
rohitvinodmalhotra@gmail.com 4070501393 Revert "Fix package configuration to include Jinja2 template files"
This reverts commit cb4f31f7de.
2025-09-02 19:36:59 -07:00
openhands c1c31efaf8 Fix package configuration to include Jinja2 template files
- Add package-data configuration to include *.j2 files
- This ensures prompt templates are included in the package distribution
- Fixes missing system_prompt.j2 error when using CodeActAgent

Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-03 02:36:32 +00:00
openhands cb4f31f7de Fix package configuration to include Jinja2 template files
- Add package-data configuration to include *.j2 files
- This ensures prompt templates are included in the package distribution
- Fixes missing system_prompt.j2 error when using CodeActAgent

Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-03 02:31:55 +00:00
Xingyao Wang 37749a54c1 Add ruff format to pre-commit hooks for consistent code formatting (#67) 2025-09-02 15:51:40 -04:00
Xingyao Wang 02f8acbc85 Update .github/workflows/precommit.yml 2025-09-03 03:49:31 +08:00
openhands 2736e3a51d Remove auto-commit functionality from CI
Since developers should run pre-commit hooks locally (which now includes ruff format),
there's no need for CI to auto-commit formatting changes. The CI should simply
validate that code is properly formatted, not fix it automatically.

This follows the principle that pre-commit hooks should be run locally by developers
before pushing, making the workflow cleaner and more predictable.

Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-02 19:48:17 +00:00
openhands 723b45dcb8 Remove test formatting file
Cleanup after testing auto-formatting functionality.

Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-02 19:44:20 +00:00
openhands 1f0c76c20a Fix type annotation in test file
Changed 'any' to 'Any' to fix pyright type checking error.

Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-02 19:44:05 +00:00
openhands b489142efc Add auto-formatting with ruff format to CI
- Split ruff pre-commit hook into separate format and check steps
- Add auto-commit functionality to CI for formatting changes
- Ensures consistent code formatting across all PRs
- Auto-formatted existing code with new ruff format settings

Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-02 19:43:13 +00:00
Xingyao Wang 936df5c994 Port over microagent changes in https://github.com/All-Hands-AI/OpenHands/pull/10528 (#65) 2025-09-02 14:20:46 -04:00
Xingyao Wang be077c446a port over microagent changes 2025-09-02 14:19:57 -04:00
Xingyao Wang 205f2af478 Remove prompt manager and instead using simpler jinja util function (#63) 2025-09-03 01:47:51 +08:00
Calvin Smith 2f35354f41 Revert "moving unit tests"
This reverts commit 9b7db7b4cc.
2025-09-02 09:36:00 -06:00
Calvin Smith 9b7db7b4cc moving unit tests 2025-09-02 09:35:16 -06:00
Xingyao Wang add257960b Conversation: make state.events.append a default callback; remove manual appends in CodeActAgent (#57)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-02 22:57:05 +08:00
Xingyao Wang 3b8b501e21 refactor the system to use Event Types (#44)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-09-02 21:49:44 +08:00
Xingyao Wang 1a1835ab64 Make unit test jobs conditional based on folder changes (#46) 2025-08-29 09:21:40 -04:00
Xingyao Wang 595f2d093d follow torvards-mode suggestion: integrate lock with state, and modify state in-place (#40) 2025-08-28 22:47:38 +08:00
Xingyao Wang 77023f5578 refactor FinishTool to builtin tools for agents (#39) 2025-08-28 22:09:36 +08:00
Xingyao Wang 855160af77 support multiple callback fn; improve examples of how to add callback from script (#38) 2025-08-27 23:30:13 -04:00
Xingyao Wang 01f5fdcdb2 Update repo.md for tests (#37) 2025-08-28 04:39:05 +08:00
Xingyao Wang f97f0aee5a refactor repo by creating context/; move looping logic into conversation.py (#36)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-28 04:27:47 +08:00
Xingyao Wang 24ac0066bc ci: make coverage PR comment sticky to avoid duplicates (fixes #34) (#35)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-28 00:12:25 +08:00
Xingyao Wang 14e272b10c add openhands setup.sh (#31) 2025-08-27 23:30:44 +08:00
Xingyao Wang 0f605e4fa6 remove coverage.svg in readme 2025-08-27 11:22:29 -04:00
Xingyao Wang 1eb1493f19 remove coverage.svg 2025-08-27 11:22:06 -04:00
Xingyao Wang f233b219eb Restructure agent-sdk repository into UV workspace with simplified structure (#33)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-27 23:18:31 +08:00
Xingyao Wang 7da40fbf75 Rename unnecessary test_ruff_config.py (#29) 2025-08-26 00:08:25 +08:00
Xingyao Wang 02c09cfc59 Rename openhands-ai as openhands-sdk (#28) 2025-08-26 00:06:08 +08:00
Xingyao Wang 58da6bf52f rename package openhands-ai to openhands-sdk (#27) 2025-08-25 23:59:39 +08:00
Xingyao Wang 27776b8766 Simplify pre-commit CI (#26) 2025-08-25 23:47:51 +08:00
Xingyao Wang 24b5df3d83 Add ruff configuration for import sorting (#17)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-25 23:41:59 +08:00
Xingyao Wang 5499e21103 Remove json repair (#24) 2025-08-25 22:06:56 +08:00
Xingyao Wang e6466c8640 logger: set LiteLLM loggers to WARNING by default; enable DEBUG with DEBUG_LLM confirmation (#23)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-25 21:35:35 +08:00
Xingyao Wang 1f0f6115f9 run test on main (#21) 2025-08-25 21:11:35 +08:00
Xingyao Wang e5f11eb41e Improve tool call visualization (#20) 2025-08-25 21:10:21 +08:00
Xingyao Wang af09fee46e Port over minimal set of agent, config, context, microagent; adjust existing tool definition (#18)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-25 21:05:24 +08:00
Xingyao Wang 6887c51186 Only trigger tests on PR (#15)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-25 06:53:11 +08:00
Xingyao Wang 182b3b5d26 Add minimal config & port over the LLM class (#12) 2025-08-25 06:43:59 +08:00
Xingyao Wang 950010ab7e Update ci badge on pr only (#11)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-25 06:27:46 +08:00
github-actions 984b051709 docs: update coverage badge [skip ci] 2025-08-24 21:08:56 +00:00
Xingyao Wang cc53d1d1e5 Openhands/move coverage badge to docs assets (#10)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
2025-08-25 05:07:31 +08:00
Xingyao Wang 5c54fd5ce3 Port over BashSession and tests (#7)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-25 05:02:11 +08:00
github-actions 2955716131 docs: update coverage badge [skip ci] 2025-08-24 21:00:54 +00:00
Xingyao Wang 36531f217b ci: add coverage report and summary to tests workflow (closes #3) (#6)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
2025-08-25 05:00:34 +08:00
Xingyao Wang f57b26deba Update README and add repo.md (#5)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-24 23:11:53 +08:00
Xingyao Wang 466708039e Fix tool schema and add relevant tests; Add bash implmentations (#4)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-24 23:01:57 +08:00
Xingyao Wang 1b43bbf8d5 Port over file editor tool (#2)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-24 22:59:07 +08:00
Xingyao Wang d9fe284fa5 Merge pull request #1 from xingyaoww/dev
Setup Makefile, Pre-commit, and initial Tool and Schema definition
2025-08-23 17:49:29 -04:00
Xingyao Wang 8ee1b7c177 run uv format, instead of just --diff 2025-08-23 17:48:17 -04:00
Xingyao Wang c07292c930 add initial tool and schema implementation 2025-08-23 17:47:51 -04:00
Xingyao Wang 302e774712 Simplify Makefile with uv version check and build target
- Add uv version check requiring >= 0.8.13
- Replace multiple targets with single 'build' target
- Use native 'uv format' command instead of ruff directly
- Reduce from 8 targets to 5 for simplicity
2025-08-23 17:29:00 -04:00
Xingyao Wang d10144588d Add minimal Makefile with uv and pre-commit setup
- Add Makefile with uv sync --dev for dependency management
- Add pre-commit hooks that run uv format on every commit
- Add pre-commit dependency to dev group
2025-08-23 17:25:30 -04:00
Xingyao Wang 54a78d5fe3 add dependency and update .gitignore 2025-08-23 16:53:02 -04:00
Xingyao Wang b4050a9b17 Initial commit 2025-08-23 16:37:48 -04:00
3727 changed files with 210438 additions and 465460 deletions
@@ -0,0 +1,159 @@
---
name: custom-codereview-guide
description: Repo-specific code review guidelines for OpenHands/software-agent-sdk. Provides SDK-specific review rules in addition to the default code review skill.
triggers:
- /codereview
---
# OpenHands/software-agent-sdk Code Review Guidelines
You are an expert code reviewer for the **OpenHands/software-agent-sdk** repository. This skill provides repo-specific review guidelines. Be direct but constructive.
## Review Decisions
You have permission to **APPROVE** or **COMMENT** on PRs. Do not use REQUEST_CHANGES.
### Review decision policy (eval / benchmark risk)
Do **NOT** submit an **APPROVE** review when the PR changes agent behavior or anything
that could plausibly affect benchmark/evaluation performance.
Examples include: prompt templates, tool calling/execution, planning/loop logic,
memory/condenser behavior, terminal/stdin/stdout handling, or evaluation harness code.
If a PR is in this category (or you are uncertain), leave a **COMMENT** review and
explicitly flag it for a human maintainer to decide after running lightweight evals.
### Default approval policy
**Default to APPROVE**: If your review finds no issues at "important" level or higher,
approve the PR. Minor suggestions or nitpicks alone are not sufficient reason to
withhold approval.
**IMPORTANT:** If you determine a PR is worth merging **and it is not in the eval-risk
category above**, you should approve it. Don't just say a PR is "worth merging" or
"ready to merge" without actually submitting an approval. Your words and actions should
be consistent.
### When to APPROVE
Examples of straightforward and low-risk PRs you should approve (non-exhaustive):
- **Configuration changes**: Adding models to config files, updating CI/workflow settings
- **CI/Infrastructure changes**: Changing runner types, fixing workflow paths, updating job configurations
- **Cosmetic changes**: Typo fixes, formatting, comment improvements, README updates
- **Documentation-only changes**: Docstring updates, clarifying notes, API documentation improvements
- **Simple additions**: Adding entries to lists/dictionaries following existing patterns
- **Test-only changes**: Adding or updating tests without changing production code
- **Dependency updates**: Version bumps with passing CI
### When NOT to APPROVE - Blocking Issues
**DO NOT APPROVE** PRs that have any of the following issues:
- **Package version bumps in non-release PRs**: If any `pyproject.toml` file has changes to the `version` field (e.g., `version = "1.12.0"` → `version = "1.13.0"`), and the PR is NOT explicitly a release PR (title/description doesn't indicate it's a release), **DO NOT APPROVE**. Version numbers should only be changed in dedicated release PRs managed by maintainers.
- Check: Look for changes to `version = "..."` in any `*/pyproject.toml` files
- Exception: PRs with titles like "release: v1.x.x" or "chore: bump version to 1.x.x" from maintainers
Examples:
- A PR adding a new model to `resolve_model_config.py` or `verified_models.py` with corresponding test updates
- A PR adding documentation notes to docstrings clarifying method behavior (e.g., security considerations, bypass behaviors)
- A PR changing CI runners or fixing workflow infrastructure issues (e.g., standardizing runner types to fix path inconsistencies)
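The version-bump check described in this section can be sketched in a few lines. This is only an illustration, not part of the repository's tooling: `find_version_bumps` is a hypothetical helper, the sample path `openhands-sdk/pyproject.toml` is made up, and the sketch assumes the diff passed in has already been restricted to `pyproject.toml` files.

```python
import re

# Matches added or removed `version = "..."` lines in a unified diff.
_VERSION_LINE = re.compile(r'^([+-])\s*version\s*=\s*"([^"]+)"\s*$')


def find_version_bumps(diff_text: str) -> list[tuple[str, str]]:
    """Return (old, new) version pairs found in a unified diff (hypothetical helper)."""
    removed: list[str] = []
    added: list[str] = []
    for line in diff_text.splitlines():
        # Skip file headers such as '--- a/...' and '+++ b/...'.
        if line.startswith(("---", "+++")):
            continue
        match = _VERSION_LINE.match(line)
        if match:
            (removed if match.group(1) == "-" else added).append(match.group(2))
    # Pair removals with additions in order; identical values are not bumps.
    return [(old, new) for old, new in zip(removed, added) if old != new]


# Illustrative diff; the file path is an assumption, not a real repo path.
sample_diff = '''--- a/openhands-sdk/pyproject.toml
+++ b/openhands-sdk/pyproject.toml
@@ -1,4 +1,4 @@
 [project]
 name = "openhands-sdk"
-version = "1.12.0"
+version = "1.13.0"
'''
print(find_version_bumps(sample_diff))  # prints [('1.12.0', '1.13.0')]
```

A non-empty result only signals that a human should confirm whether the PR is a dedicated release PR; it does not decide that on its own.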
### When to COMMENT
Use COMMENT when you have feedback or concerns:
- Issues that need attention (bugs, security concerns, missing tests)
- Suggestions for improvement
- Questions about design decisions
- Minor style preferences
If there are significant issues, leave detailed comments explaining the concerns—but let a human maintainer decide whether to block the PR.
## Core Principles
1. **Simplicity First**: Question complexity. If something feels overcomplicated, ask "what's the use case?" and seek simpler alternatives. Features should solve real problems, not imaginary ones.
2. **Pragmatic Testing**: Test what matters. Avoid duplicate test coverage. Don't test library features (e.g., `BaseModel.model_dump()`). Focus on the specific logic implemented in this codebase.
3. **Type Safety**: Avoid `# type: ignore` - treat it as a last resort. Fix types properly with assertions, proper annotations, or code adjustments. Prefer explicit type checking over `getattr`/`hasattr` guards.
4. **Backward Compatibility**: Evaluate breaking change impact carefully. Consider API changes that affect existing users, removal of public fields/methods, and changes to default behavior.
## What to Check
- **Complexity**: Over-engineered solutions, unnecessary abstractions, complex logic that could be refactored
- **Testing**: Duplicate test coverage, tests for library features, missing edge case coverage
- **Type Safety**: `# type: ignore` usage, missing type annotations, `getattr`/`hasattr` guards, mocking non-existent arguments
- **Breaking Changes**: API changes affecting users, removed public fields/methods, changed defaults
- **Code Quality**: Code duplication, missing comments for non-obvious decisions, inline imports (unless necessary for circular deps)
- **Repository Conventions**: Use `pyright` not `mypy`, put fixtures in `conftest.py`, avoid `sys.path.insert` hacks
- **Event Type Deprecation**: Changes to event types (Pydantic models used in serialization) must handle deprecated fields properly
## Event Type Deprecation - Critical Review Checkpoint
When reviewing PRs that modify event types (e.g., `TextContent`, `Message`, `Event`, or any Pydantic model used in event serialization), **DO NOT APPROVE** until the following are verified:
### Required for Removing/Deprecating Fields
1. **Model validator present**: If a field is being removed from an event type with `extra="forbid"`, there MUST be a `@model_validator(mode="before")` that uses `handle_deprecated_model_fields()` to remove the deprecated field before validation. Otherwise, old events will fail to load.
2. **Tests for backward compatibility**: The PR MUST include tests that:
- Load an old event format (with the deprecated field) successfully
- Load a new event format (without the deprecated field) successfully
- Verify both can be loaded in sequence (simulating mixed conversations)
3. **Test naming convention**: The version in the test name should be the **LAST version** where a particular event structure exists. For example, if `enable_truncation` was removed in v1.11.1, the test should be named `test_v1_10_0_...` (the last version with that field), not `test_v1_8_0_...` (when it was introduced). This avoids duplicate tests and clearly documents when a field was last present.
**Important**: Deprecated field handlers are **permanent** and should never be removed. They ensure old conversations can always be loaded.
### Example Pattern (Required)
```python
from typing import Any, ClassVar

from pydantic import BaseModel, ConfigDict, model_validator

from openhands.sdk.utils.deprecation import handle_deprecated_model_fields


class MyModel(BaseModel):
    model_config = ConfigDict(extra="forbid")

    # Deprecated fields that are silently removed for backward compatibility
    # when loading old events. These are kept permanently.
    _DEPRECATED_FIELDS: ClassVar[tuple[str, ...]] = ("old_field_name",)

    @model_validator(mode="before")
    @classmethod
    def _handle_deprecated_fields(cls, data: Any) -> Any:
        """Remove deprecated fields for backward compatibility with old events."""
        return handle_deprecated_model_fields(data, cls._DEPRECATED_FIELDS)
```
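To make the required backward-compatibility tests more concrete, here is a stdlib-only sketch of their shape. `strip_deprecated_fields` is a hypothetical stand-in that assumes the SDK helper simply drops the listed keys from dict input; the real `handle_deprecated_model_fields` may behave differently. The `text` field and the payloads are illustrative, and the test name follows the naming convention described above (last version in which the field existed).

```python
from typing import Any


def strip_deprecated_fields(data: Any, deprecated: tuple[str, ...]) -> Any:
    """Stand-in (assumption) for the SDK helper: drop deprecated keys from dict input."""
    if not isinstance(data, dict):
        return data  # Non-dict input is passed through untouched.
    return {key: value for key, value in data.items() if key not in deprecated}


DEPRECATED = ("enable_truncation",)

# Payload shaped like the last version that still had the field (illustrative).
old_event = {"text": "hello", "enable_truncation": True}
# Payload shaped like current versions (illustrative).
new_event = {"text": "hello"}


def test_v1_10_0_event_with_enable_truncation_loads() -> None:
    # Old and new payloads are handled in sequence, as in a mixed conversation.
    loaded = [strip_deprecated_fields(event, DEPRECATED) for event in (old_event, new_event)]
    assert loaded == [{"text": "hello"}, {"text": "hello"}]


test_v1_10_0_event_with_enable_truncation_loads()
```

In a real test the payloads would be validated through the actual event model rather than a plain dict filter, so that `extra="forbid"` is exercised as well.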
### Why This Matters
Production systems resume conversations that may contain events serialized with older SDK versions. If the SDK can't load old events, users will see errors like:
```
pydantic_core.ValidationError: Extra inputs are not permitted
```
**This is a production-breaking change.** Do not approve PRs that modify event types without proper backward compatibility handling and tests.
## What NOT to Comment On
Do not leave comments for:
- **Nitpicks**: Minor style preferences, optional improvements, or "nice-to-haves" that don't affect correctness or maintainability
- **Good behavior observed**: Don't comment just to praise code that follows best practices - this adds noise. Simply approve if the code is good.
- **Suggestions for additional tests on simple changes**: For straightforward PRs (config changes, model additions, etc.), don't suggest adding test coverage unless tests are clearly missing for new logic
- **Obvious or self-explanatory code**: Don't ask for comments on code that is already clear
- **`.pr/` directory artifacts**: Files in the `.pr/` directory are temporary PR-specific documents (design notes, analysis, scripts) that are automatically cleaned up when the PR is approved. Do not comment on their presence or suggest removing them.
If a PR is approvable, just approve it. Don't add "one small suggestion" or "consider doing X" comments that delay merging without adding real value.
## Communication Style
- Be direct and concise - don't over-explain
- Use casual, friendly tone ("lgtm", "WDYT?", emojis are fine 👀)
- Ask questions to understand use cases before suggesting changes
- Suggest alternatives, not mandates
- Approve quickly when code is good ("LGTM!")
- Use GitHub suggestion syntax for code fixes
@@ -0,0 +1,88 @@
---
name: debug-test-examples-workflow
description: Guide for debugging failing example tests in the `test-examples` labeled workflow. Use this skill when investigating CI failures in the run-examples.yml workflow, when example scripts fail to run correctly, when needing to isolate specific test failures, or when analyzing workflow logs and failure patterns.
---
# Debugging test-examples Workflow
## Overview
The `run-examples.yml` workflow runs example scripts from the `examples/` directory. Triggers:
- Adding `test-examples` label to a PR
- Manual workflow dispatch
- Scheduled nightly runs
## Debugging Steps
### 1. Isolate Failing Tests
Modify `tests/examples/test_examples.py` to focus on specific tests:
```python
_TARGET_DIRECTORIES = (
    # EXAMPLES_ROOT / "01_standalone_sdk",
    EXAMPLES_ROOT / "02_remote_agent_server",  # Keep only failing directory
)
```
### 2. Exclude Tests
Add to `_EXCLUDED_EXAMPLES` with explanation:
```python
_EXCLUDED_EXAMPLES = {
    # Reason for exclusion
    "examples/path/to/failing_test.py",
}
```
### 3. Trigger Workflow
Toggle the `test-examples` label:
```bash
# Remove label
curl -X DELETE -H "Authorization: token $GITHUB_TOKEN" \
  "https://api.github.com/repos/OpenHands/software-agent-sdk/issues/${PR_NUMBER}/labels/test-examples"
# Add label
curl -X POST -H "Authorization: token $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github.v3+json" \
  "https://api.github.com/repos/OpenHands/software-agent-sdk/issues/${PR_NUMBER}/labels" \
  -d '{"labels":["test-examples"]}'
```
### 4. Monitor Progress
```bash
# Check status
curl -s -H "Authorization: token $GITHUB_TOKEN" \
  "https://api.github.com/repos/OpenHands/software-agent-sdk/actions/runs/${RUN_ID}" | jq '{status, conclusion}'
# Download logs
curl -sL -H "Authorization: token $GITHUB_TOKEN" \
  "https://api.github.com/repos/OpenHands/software-agent-sdk/actions/runs/${RUN_ID}/logs" -o logs.zip
unzip logs.zip -d logs
```
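When the status check above is scripted rather than run by hand, the `jq` filter can be mirrored in Python. This is a rough sketch: `summarize_run` and `is_finished` are hypothetical helper names, and the sample payload is a trimmed, made-up response rather than real API output.

```python
import json
from typing import Optional


def summarize_run(payload: str) -> dict[str, Optional[str]]:
    """Extract the same two fields as the jq filter '{status, conclusion}'."""
    run = json.loads(payload)
    return {"status": run.get("status"), "conclusion": run.get("conclusion")}


def is_finished(summary: dict[str, Optional[str]]) -> bool:
    # A workflow run has a conclusion only once its status is "completed".
    return summary["status"] == "completed"


# Made-up, trimmed response body for illustration.
sample_payload = '{"id": 1, "status": "completed", "conclusion": "failure"}'
summary = summarize_run(sample_payload)
print(summary, is_finished(summary))
```

A polling loop would call these on each response and stop once `is_finished` returns true.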
## Common Failure Patterns
| Pattern | Cause | Solution |
|---------|-------|----------|
| Port conflicts | Fixed ports (8010, 8011) | Run with `-n 1` or use different ports |
| Container issues | Docker/Apptainer setup | Check Docker availability, image pulls |
| LLM failures | Transient API errors | Retry the test |
| Example bugs | Code errors | Check traceback |
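For the port-conflict row, one common way to "use different ports" is to let the operating system pick a free one. The sketch below is an assumption about how an example script could do that, not something the existing examples are known to do; `find_free_port` is a hypothetical helper.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (host, port); the port was chosen by the OS.
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```

The port is released as soon as the socket closes, so another process could still claim it before the example server binds; this narrows the collision window under `-n 4` rather than eliminating it.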
## Key Configuration
**Workflow** (`.github/workflows/run-examples.yml`):
- Runner: `blacksmith-2vcpu-ubuntu-2404`
- Timeout: 60 minutes
- Parallelism: `-n 4` (pytest-xdist: 4 parallel workers)
**Tests** (`tests/examples/test_examples.py`):
- Timeout per example: 600 seconds
- Target directories: `_TARGET_DIRECTORIES`
- Excluded examples: `_EXCLUDED_EXAMPLES`
@@ -0,0 +1,43 @@
---
name: design-principles
description: Core architectural design principles of the OpenHands Software Agent SDK. Reference when making architectural decisions, reviewing PRs that change agent/tool/state boundaries, or evaluating whether a proposed change aligns with V1 design goals.
---
# SDK Design Principles
Reference: <https://docs.openhands.dev/sdk/arch/design>
## Quick Summary
1. **Optional Isolation over Mandatory Sandboxing**
Sandboxing is opt-in, not universal. Agent and tool execution runs in a single
process by default. When isolation is needed, the same stack can be transparently
containerized.
2. **Stateless by Default, One Source of Truth for State**
All components — agents, tools, LLMs, configurations — are **immutable Pydantic
models** validated at construction. The only mutable entity is the conversation
state. This enables deterministic replay and robust persistence.
3. **Clear Boundaries between Agent and Applications**
Strict separation between SDK (agent core), tools, workspace, and agent server.
Applications communicate via APIs, not by embedding the agent.
4. **Composable Components for Extensibility**
Agents are graphs of interchangeable components — tools, prompts, LLMs, contexts —
described **declaratively with strong typing**. Developers reconfigure capabilities
without modifying core code.
## Implications for Development
- Since agents are immutable Pydantic models, their configuration **is** their
serializable representation. There should be no need to "reverse-engineer" agent
config from runtime instances.
- Tool implementations (callables) are the only non-serializable part; this is solved
by `tool_module_qualnames` for remote forwarding.
- Everything else (system_prompt, model, skills, tool names) is already declarative
data that can be serialized and forwarded directly.
- Avoid patterns that create multiple sources of truth for the same configuration
(e.g., a factory function AND an extracted definition).
- `model_copy(update=...)` should be used sparingly and through well-defined paths to
avoid undermining statelessness.
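The "configuration is the serializable representation" idea can be illustrated without the SDK. The sketch below deliberately swaps in a stdlib frozen dataclass for the immutable Pydantic models the SDK uses, with `dataclasses.replace` playing the role of `model_copy(update=...)`. `AgentConfig` and its fields are hypothetical and do not reflect the SDK's actual class layout.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass, replace


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical stand-in for an immutable agent definition."""

    model: str
    system_prompt: str
    tool_names: tuple[str, ...] = ()


config = AgentConfig(
    model="example-model",
    system_prompt="You are helpful.",
    tool_names=("terminal",),
)

# The configuration is its own serializable representation.
serialized = asdict(config)

# In-place mutation is rejected on a frozen instance...
try:
    config.model = "other-model"
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# ...so changes go through an explicit copy that yields a new object.
updated = replace(config, model="other-model")
print(serialized, mutated, updated.model)
```

Because the original instance never changes, there is one definition to serialize and forward, which is the property the bullets above rely on.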
@@ -0,0 +1,244 @@
---
name: feature-release-rollout
description: This skill should be used when the user asks to "rollout a feature", "complete feature release", "propagate SDK feature", "track feature support", "what's missing for feature X", or mentions checking CLI/GUI/docs/blog support for SDK features. Guides agents through the multi-repository feature release workflow from SDK to docs to marketing.
triggers:
- rollout feature
- feature release
- propagate feature
- feature support
- complete release
- docs for feature
- blog for feature
- CLI support
- GUI support
- what's missing
---
# Feature Release Rollout
This skill guides the complete feature release workflow across the OpenHands ecosystem repositories.
## Overview
When a feature is implemented in the SDK, it may need propagation through several repositories:
1. **SDK** (`OpenHands/software-agent-sdk`) — Core feature implementation
2. **CLI** (`OpenHands/OpenHands-CLI`) — Terminal interface support
3. **GUI** (`OpenHands/OpenHands` frontend directory) — Web interface support
4. **Docs** (`OpenHands/docs`) — Documentation updates (sdk/ folder)
5. **Blog** (`OpenHands/growth-utils` blog-post/) — Marketing and announcements
6. **Video** — Tutorial content (using ElevenLabs + Remotion)
## Workflow
### Phase 1: Feature Discovery
First, identify what feature(s) to analyze. The user may specify:
- A release tag (e.g., `v1.9.0`)
- A specific feature name
- A PR or commit reference
- A comparison between versions
**For release tags:**
```bash
# Clone SDK if not present
git clone https://github.com/OpenHands/software-agent-sdk.git
# View release notes
cd software-agent-sdk
git log --oneline v1.8.0..v1.9.0 # Changes between versions
git show v1.9.0 --stat # What changed in this release
```
**For specific features:**
Search the SDK codebase, examples, and changelog to understand the feature scope.
### Phase 2: Repository Analysis
Clone all relevant repositories to analyze current support:
```bash
# Clone repositories (use GITHUB_TOKEN for authenticated access)
git clone https://github.com/OpenHands/software-agent-sdk.git
git clone https://github.com/OpenHands/OpenHands-CLI.git
git clone https://github.com/OpenHands/OpenHands.git # Frontend in frontend/
git clone https://github.com/OpenHands/docs.git
git clone https://github.com/OpenHands/growth-utils.git
```
For each feature, check support status:
| Repository | Check Location | What to Look For |
|------------|---------------|------------------|
| CLI | `openhands_cli/` | Feature flags, commands, TUI widgets |
| GUI | `OpenHands/frontend/src/` | React components, API integrations |
| Docs | `docs/sdk/` | Guide pages, API reference, examples |
| Blog | `growth-utils/blog-post/posts/` | Announcement posts |
### Phase 3: Assess Feature Importance
Not all features warrant full rollout. Evaluate each feature:
**High Impact (full rollout recommended):**
- New user-facing capabilities
- Breaking changes or migrations
- Major performance improvements
- New integrations or tools
**Medium Impact (docs + selective support):**
- New API methods or parameters
- Configuration options
- Developer experience improvements
**Low Impact (docs only or skip):**
- Internal refactoring
- Bug fixes
- Minor enhancements
**Skip rollout for:**
- Internal-only changes
- Test improvements
- Build/CI changes
- Documentation typos
### Phase 4: Create Proposal
Generate a structured proposal for the user:
```markdown
## Feature Rollout Proposal: [Feature Name]
### Feature Summary
[Brief description of the feature and its value]
### Current Support Status
| Component | Status | Notes |
|-----------|--------|-------|
| SDK | ✅ Implemented | [version/PR] |
| CLI | ❌ Missing | [what's needed] |
| GUI | ⚠️ Partial | [what's implemented vs needed] |
| Docs | ❌ Missing | [suggested pages] |
| Blog | ❌ Not started | [whether warranted] |
| Video | ❌ Not started | [whether warranted] |
### Recommended Actions
1. **CLI**: [specific implementation needed]
2. **GUI**: [specific implementation needed]
3. **Docs**: [pages to create/update]
4. **Blog**: [recommended or not, with reasoning]
5. **Video**: [recommended or not, with reasoning]
### Assessment
- **Overall Priority**: [High/Medium/Low]
- **Effort Estimate**: [days/hours per component]
- **Dependencies**: [what must be done first]
```
### Phase 5: User Confirmation
Wait for explicit user approval before proceeding. Ask:
- Which components to implement
- Priority ordering
- Any modifications to the proposal
### Phase 6: Implementation
Only after user confirmation:
**Create GitHub Issues:**
```bash
# Create issue on relevant repo
# $'...' makes bash expand \n into real newlines in the issue body
gh issue create --repo OpenHands/OpenHands-CLI \
  --title "Support [feature] in CLI" \
  --body $'## Context\n[Feature description]\n\n## Implementation\n[Details]\n\n## Related\n- SDK: [link]\n- Docs: [link]'
```
**Implementation order:**
1. CLI/GUI support (can be parallel)
2. Documentation (depends on 1)
3. Blog post (depends on 2)
4. Video (depends on 3)
## Repository-Specific Guidelines
### CLI (OpenHands/OpenHands-CLI)
- Check `AGENTS.md` for development guidelines
- Use `uv` for dependency management
- Run `make lint` and `make test` before commits
- TUI components in `openhands_cli/tui/`
- Snapshot tests for UI changes
### GUI (OpenHands/OpenHands frontend)
- Frontend in `frontend/` directory
- React/TypeScript codebase
- Run `npm run lint:fix && npm run build` in frontend/
- Follow TanStack Query patterns for data fetching
- i18n translations in `frontend/src/i18n/`
### Docs (OpenHands/docs)
- SDK docs in `sdk/` folder
- Uses Mintlify (`.mdx` files)
- Code blocks can auto-sync from SDK examples
- Run `mint broken-links` to validate
- Follow `openhands/DOC_STYLE_GUIDE.md`
### Blog (OpenHands/growth-utils)
- Posts in `blog-post/posts/YYYYMMDD-title.md`
- Assets in `blog-post/assets/YYYYMMDD-title/`
- Frontmatter format:
```yaml
---
title: "Post Title"
excerpt: "Brief description"
coverImage: "/assets/blog/YYYYMMDD-title/cover.png"
date: "YYYY-MM-DDTHH:MM:SS.000Z"
authors:
  - name: Author Name
    picture: "/assets/blog/authors/author.png"
ogImage:
  url: "/assets/blog/YYYYMMDD-title/cover.png"
---
```
## Example Feature Analysis
**Feature: Browser Session Recording (SDK v1.8.0)**
1. **SDK**: ✅ Implemented in `openhands.tools.browser`
2. **CLI**: ❌ No replay/export commands
3. **GUI**: ❌ No recording viewer component
4. **Docs**: ✅ Guide at `sdk/guides/browser-session-recording.mdx`
5. **Blog**: ❌ Could highlight for web scraping users
6. **Video**: Consider 2-minute demo
**Recommendation**: Medium priority. Docs done, CLI/GUI low urgency (advanced feature), blog post optional.
## Quick Commands
```bash
# Check SDK feature presence
grep -r "feature_name" software-agent-sdk/openhands/ --include="*.py"
# Check CLI support
grep -r "feature_name" OpenHands-CLI/openhands_cli/ --include="*.py"
# Check GUI support
grep -r "featureName" OpenHands/frontend/src/ --include="*.ts" --include="*.tsx"
# Check docs coverage
grep -r "feature" docs/sdk/ --include="*.mdx"
# Check blog mentions
grep -r "feature" growth-utils/blog-post/posts/ --include="*.md"
```
## Important Notes
- Always get user confirmation before creating issues or starting implementation
- Consider feature maturity — new features may change before full rollout
- Cross-reference PRs between repositories in issue descriptions
- For breaking changes, coordinate release timing across all components
@@ -0,0 +1,66 @@
---
name: run-eval
description: Trigger and monitor evaluation runs for benchmarks like SWE-bench, GAIA, and others. Use when running evaluations via GitHub Actions or monitoring eval progress through Datadog and kubectl.
triggers:
- run eval
- trigger eval
- evaluation run
- swebench eval
---
# Running Evaluations
## Trigger via GitHub API
```bash
curl -X POST \
  -H "Authorization: token $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/OpenHands/software-agent-sdk/actions/workflows/run-eval.yml/dispatches" \
  -d '{
    "ref": "main",
    "inputs": {
      "benchmark": "swebench",
      "sdk_ref": "main",
      "eval_limit": "50",
      "model_ids": "claude-sonnet-4-5-20250929",
      "reason": "Description of eval run",
      "benchmarks_branch": "main"
    }
  }'
```
**Key parameters:**
- `benchmark`: `swebench`, `swebenchmultimodal`, `gaia`, `swtbench`, `commit0`, `multiswebench`, `terminalbench`
- `eval_limit`: Any positive integer (e.g., `1`, `10`, `50`, `200`)
- `model_ids`: See `.github/run-eval/resolve_model_config.py` for available models
- `benchmarks_branch`: Use a feature branch from the benchmarks repo to test benchmark changes before merging
**Note:** When running a full eval, you must select an `eval_limit` that is greater than or equal to the actual number of instances in the benchmark. If you specify a smaller limit, only that many instances will be evaluated (partial eval).
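The dispatch body and the full-versus-partial distinction can be sketched as follows. This is a hedged illustration: `build_dispatch_payload` and `is_full_eval` are hypothetical helpers, `benchmark_size` must be supplied by the caller, and the value 300 used below is an arbitrary number, not the size of any real benchmark.

```python
import json


def build_dispatch_payload(
    benchmark: str,
    model_ids: str,
    eval_limit: int,
    reason: str,
    sdk_ref: str = "main",
    benchmarks_branch: str = "main",
) -> str:
    """Build the JSON body for the workflow dispatch request (hypothetical helper)."""
    if eval_limit <= 0:
        raise ValueError("eval_limit must be a positive integer")
    payload = {
        "ref": "main",
        "inputs": {
            "benchmark": benchmark,
            "sdk_ref": sdk_ref,
            # The curl example passes eval_limit as a string, so do the same here.
            "eval_limit": str(eval_limit),
            "model_ids": model_ids,
            "reason": reason,
            "benchmarks_branch": benchmarks_branch,
        },
    }
    return json.dumps(payload)


def is_full_eval(eval_limit: int, benchmark_size: int) -> bool:
    # A run is full only when the limit covers every instance in the benchmark.
    return eval_limit >= benchmark_size


body = build_dispatch_payload(
    benchmark="swebench",
    model_ids="claude-sonnet-4-5-20250929",
    eval_limit=50,
    reason="Description of eval run",
)
print(body)
print(is_full_eval(50, benchmark_size=300))  # 300 is an arbitrary illustrative size
```

Checking `is_full_eval` before dispatching makes it explicit whether a run will be reported as a partial eval.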
## Monitoring
**Datadog script** (requires the `OpenHands/evaluation` repo and the `DD_API_KEY`, `DD_APP_KEY`, and `DD_SITE` environment variables to be set):
```bash
DD_API_KEY=$DD_API_KEY DD_APP_KEY=$DD_APP_KEY DD_SITE=$DD_SITE \
python scripts/analyze_evals.py --job-prefix <EVAL_RUN_ID> --time-range 60
# EVAL_RUN_ID format: typically the workflow run ID from GitHub Actions
```
**kubectl** (for users with cluster access - the agent does not have kubectl access):
```bash
kubectl logs -f job/eval-eval-<RUN_ID>-<MODEL_SLUG> -n evaluation-jobs
```
## Common Errors
| Error | Cause | Fix |
|-------|-------|-----|
| `503 Service Unavailable` | Infrastructure overloaded | Ask user to stop some evaluation runs |
| `429 Too Many Requests` | Rate limiting | Wait or reduce concurrency |
| `failed after 3 retries` | Instance failures | Check Datadog logs for root cause |
## Limits
- Max 256 parallel runtimes (jobs will queue if this limit is exceeded)
- Full evals typically take 1-3 hours depending on benchmark size
@@ -0,0 +1,117 @@
---
name: write-behavior-test
description: Guide for writing behavior tests that verify agents follow system message guidelines and avoid undesirable behaviors. Use when creating integration tests for agent behavior validation.
triggers:
- /write_behavior_test
---
# Behavior Test Writing Guide
You are helping to create **behavior tests** for the agent-sdk integration test suite. These tests verify that agents follow system message guidelines and avoid undesirable behaviors.
The tests are for the agent powered by this SDK, so you may need to refer to the codebase for details on how the agent works in order to write effective tests.
## Behavior Tests vs Task Tests
**Task Tests (t*.py)** - REQUIRED tests that verify task completion:
- Focus: Can the agent successfully complete the task?
- Example: Fix typos in a file, create a script, implement a feature
**Behavior Tests (b*.py)** - OPTIONAL tests that verify proper behavior:
- Focus: Does the agent follow best practices and system guidelines?
- Example: Don't implement when asked for advice, don't over-verify, avoid redundant files
## Key Principles for Writing Behavior Tests
### ✅ DO:
1. **Use Real Repositories**
- Clone actual GitHub repositories that represent real-world scenarios
- Pin to a specific historical commit (before a fix/feature was added)
- Example: `clone_pinned_software_agent_repo(workspace)` helper
2. **Test Realistic Complex, Nuanced Behaviors**
- Make the task mirror real HUMAN interactions as closely as possible, from file naming to a (somewhat lazy) instruction style
- Focus on subtle behavioral issues that require judgment
- Test scenarios where the "right" behavior isn't immediately obvious
- Examples: When to implement vs advise, when to stop testing, whether to add backward compatibility
3. **Clean Up Repository History**
- Check out to a commit BEFORE the solution exists
- Reset/remove future commits (see existing tests for examples)
- Ensures the agent experiences the same context as real users
4. **Use Helper Functions**
- `find_file_editing_operations(events)` - Find file create/edit operations
- `find_tool_calls(events, tool_name)` - Find specific tool usage
- `get_conversation_summary(events)` - Get summary for LLM judge
- `judge_agent_behavior(...)` - Use LLM to evaluate behavior quality
5. **Leverage LLM Judges**
- Use `judge_agent_behavior()` for subjective evaluations
- Provide clear evaluation criteria in the judge prompt
- Track judge usage costs: `self.add_judge_usage(prompt_tokens, completion_tokens, cost)`
6. **Adapt the Problem Description to the Task**
- If the problem description is hard to adapt to a behavior test (e.g., it requires complex environment setup such as Kubernetes), come up with a simpler problem description that still captures the essence of the behavior you want to test but is easier to implement in the test framework.
- Ensure the instructions naturally lead to the behavior you want to evaluate
### ❌ DO NOT:
1. **Don't Write Simple Synthetic Tests**
- Don't create artificial scenarios with minimal setup
- Don't test behaviors that are too obvious or straightforward
- Example: Don't create a single-file test with trivial content
2. **Don't Test Basic Functionality**
- Behavior tests are NOT for testing if the agent can use tools
- Task tests handle basic capability verification
- Focus on HOW the agent approaches problems, not IF it can solve them
3. **Don't Overcomplicate Static Assertions**
- Use assertions for clear-cut checks (e.g., no file edits)
- Rely on LLM judges for nuanced behavior evaluations
- Avoid trying to encode subjective judgments purely in code or too much static logic
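The split between static assertions and LLM judging can be sketched as follows. This is an illustration under stated assumptions, not the SDK's API: events are modelled here as plain dictionaries with hypothetical `tool`, `command`, and `path` keys, and `find_edits` is a stand-in written for this sketch, not the `find_file_editing_operations` helper named above.

```python
from typing import Any

# Hypothetical command names that count as modifying a file.
EDITING_COMMANDS = {"create", "str_replace", "insert"}


def find_edits(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the events that create or modify a file (assumed event shape)."""
    return [
        event
        for event in events
        if event.get("tool") == "file_editor"
        and event.get("command") in EDITING_COMMANDS
    ]


def check_advice_only(events: list[dict[str, Any]]) -> tuple[bool, str]:
    """Clear-cut static check: an advice-only request must produce no edits."""
    edits = find_edits(events)
    if edits:
        paths = ", ".join(str(event.get("path", "?")) for event in edits)
        return False, f"agent edited files when asked for advice only: {paths}"
    # Anything subjective (was the advice good?) is left to an LLM judge.
    return True, "no file edits; pass the transcript to an LLM judge"


events = [
    {"tool": "terminal", "command": "ls"},
    {"tool": "file_editor", "command": "view", "path": "README.md"},
]
print(check_advice_only(events))
```

The static check only answers the yes/no question of whether files were edited; whether the advice itself was useful stays with the judge prompt.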
## Tips for Test Difficulty Calibration
**Make tests challenging, but not impossible or too long:**
1. **Context Complexity**: Use real codebases with multiple files and dependencies, either the software-agent-sdk or other popular open-source repos you find suitable
2. **Ambiguity**: Prefer instructions that could be interpreted multiple ways
3. **Temptation**: Set up scenarios where the "easy wrong path" is tempting
4. **Realism**: Mirror real user interactions and expectations
**Examples of Good Complexity:**
- "How to implement X?" (tests if agent implements vs advises)
- "Update constant Y" (tests if agent over-verifies with excessive test runs)
- "Rename method A to B" (tests if agent adds unnecessary backward compatibility)
## Example Behavior Test Patterns
1. **Premature Implementation** - Tests if agent implements when asked for advice only
2. **Over-verification** - Tests if agent runs excessive tests beyond what's needed
3. **Unnecessary Compatibility** - Tests if agent adds backward compatibility shims when not needed
4. **Redundant Artifacts** - Tests if agent creates extra files (docs, READMEs) without being asked
5. **Communication Quality** - Tests if agent provides explanations for actions
## File Naming Convention
Name your test file: `b##_descriptive_name.py`
- `b` prefix indicates behavior test (auto-detected)
- `##` is a zero-padded number (e.g., 01, 02, 03)
- Use snake_case for the descriptive name
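The naming convention can be checked mechanically. The sketch below assumes that `##` means exactly two digits and that the descriptive part is lowercase snake_case; `is_behavior_test_filename` is a hypothetical helper and does not describe how the test runner actually detects behavior tests.

```python
import re

# b + two digits + snake_case words + .py (assumed reading of the convention)
BEHAVIOR_TEST_NAME = re.compile(r"b\d{2}_[a-z0-9]+(?:_[a-z0-9]+)*\.py")


def is_behavior_test_filename(name: str) -> bool:
    """Return True if the file name follows b##_descriptive_name.py."""
    return BEHAVIOR_TEST_NAME.fullmatch(name) is not None


for candidate in ["b01_no_premature_implementation.py", "t01_fix_typos.py", "b1_short.py"]:
    print(candidate, is_behavior_test_filename(candidate))
```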
## Final Checklist
Before submitting your behavior test, verify:
- [ ] Uses a real repository or complex codebase
- [ ] Tests a nuanced behavior, not basic functionality
- [ ] Includes clear and not overly complex verification logic (assertions or LLM judge)
- [ ] Has a descriptive docstring explaining what behavior is tested
- [ ] Properly tracks judge usage costs if using LLM evaluation
- [ ] Follows naming convention: `b##_descriptive_name.py`
- [ ] Test is realistic and based on actual behavioral issues observed
Remember: The goal is to catch subtle behavioral issues that would appear in real-world usage, serving as regression tests for system message improvements.
-1
@@ -1 +0,0 @@
This way of running OpenHands is not officially supported. It is maintained by the community.
-19
@@ -1,19 +0,0 @@
// For format details, see: https://aka.ms/devcontainer.json
{
"name": "Python 3",
// Documentation for this image:
// - https://github.com/devcontainers/templates/tree/main/src/python
// - https://github.com/microsoft/vscode-remote-try-python
// - https://hub.docker.com/r/microsoft/devcontainers-python
"image": "mcr.microsoft.com/devcontainers/python:1-3.12-bullseye",
"features": {
"ghcr.io/devcontainers/features/docker-outside-of-docker:1": {},
"ghcr.io/devcontainers-extra/features/poetry:2": {},
"ghcr.io/devcontainers/features/node:1": {},
},
"postCreateCommand": ".devcontainer/setup.sh",
"runArgs": ["--add-host=host.docker.internal:host-gateway"],
"containerEnv": {
"DOCKER_HOST_ADDR": "host.docker.internal"
},
}
-14
@@ -1,14 +0,0 @@
#!/bin/bash
# Mark the current repository as safe for Git to prevent "dubious ownership" errors,
# which can occur in containerized environments when directory ownership doesn't match the current user.
git config --global --add safe.directory "$(realpath .)"
# Install `nc`
sudo apt update && sudo apt install netcat -y
# Install `uv` and `uvx`
wget -qO- https://astral.sh/uv/install.sh | sh
# Do common setup tasks
source .openhands/setup.sh
+257 -19
@@ -1,23 +1,261 @@
# NodeJS
frontend/node_modules
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# Configuration (except pyproject.toml)
*.ini
*.toml
!pyproject.toml
*.yml
# C extensions
*.so
# Documentation (except README.md)
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
# Note: We keep our custom spec file in version control
# *.spec
# PyInstaller build directories
build/
dist/
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
# poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be added to the global gitignore or merged into this project gitignore. For a PyCharm
# project, it is recommended to ignore the entire .idea directory.
.idea/
# VS Code
.vscode/
# macOS
.DS_Store
.AppleDouble
.LSOverride
# Windows
Thumbs.db
ehthumbs.db
Desktop.ini
$RECYCLE.BIN/
# Linux
*~
# Temporary files
*.tmp
*.temp
*.swp
*.swo
# UV specific
.uv/
# Project specific
*.log
.coverage
.pytest_cache/
workspace/
.client
.docker
.git
.git/**
# VS Code: Ignore all but certain files that specify repo-specific settings.
# https://stackoverflow.com/questions/32964920/should-i-commit-the-vscode-folder-to-source-control
.vscode/**/*
!.vscode/extensions.json
!.vscode/tasks.json
# VS Code extensions/forks:
.cursorignore
.rooignore
.clineignore
.windsurfignore
.cursorrules
.roorules
.clinerules
.windsurfrules
.cursor/rules
.roo/rules
.cline/rules
.windsurf/rules
.repomix
repomix-output.txt
# misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local
npm-debug.log*
yarn-debug.log*
yarn-error.log*
logs
# agent
.envrc
cache
.jinja_cache/
.conversations*
workspace/
# Build optimization: exclude files not needed for building agent-server
tests/
*.log
.github/
scripts/
examples/
.ruff_cache/
.uv-cache/
Makefile
docs/
*.md
!README.md
# Hidden files and directories
.*
__pycache__
# Unneeded files and directories
/dev_config/
/docs/
/evaluation/
/tests/
CITATION.cff
.pre-commit-config.yaml
.python-version
-5
@@ -1,5 +0,0 @@
[*]
# force *nix line endings so files don't look modified in container run from Windows clone
end_of_line = lf
trim_trailing_whitespace = true
insert_final_newline = true
-7
@@ -1,7 +0,0 @@
*.ipynb linguist-vendored
# force *nix line endings so files don't look modified in container run from Windows clone
* text eol=lf
# Git incorrectly thinks some media is text
*.png -text
*.mp4 -text
-8
@@ -1,8 +0,0 @@
# CODEOWNERS file for OpenHands repository
# See https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners
/frontend/ @amanape @hieptl
/openhands-ui/ @amanape @hieptl
/openhands/ @tofarr @malhotra5 @hieptl
/enterprise/ @chuckbutkus @tofarr @malhotra5
/evaluation/ @xingyaoww @neubig
+158 -61
@@ -1,71 +1,168 @@
---
name: Bug
description: Report a problem with OpenHands
description: Report a problem with OpenHands SDK
title: '[Bug]: '
labels: ['bug']
labels: [bug]
body:
- type: markdown
attributes:
value: Thank you for taking the time to fill out this bug report. Please provide as much information as possible
to help us understand and address the issue effectively.
- type: markdown
attributes:
value: |
## Thank you for reporting a bug! 🐛
- type: checkboxes
attributes:
label: Is there an existing issue for the same bug? (If one exists, thumbs up or comment on the issue instead).
description: Please check if an issue already exists for the bug you encountered.
options:
- label: I have checked the existing issues.
required: true
**Please fill out all required fields.** Issues missing critical information (version, installation method, reproduction steps, etc.) will be delayed or closed until complete details are provided.
- type: textarea
id: bug-description
attributes:
label: Describe the bug and reproduction steps
description: Provide a description of the issue along with any reproduction steps.
validations:
required: true
Clear, detailed reports help us resolve issues faster.
- type: dropdown
id: installation
attributes:
label: OpenHands Installation
description: How are you running OpenHands?
options:
- Docker command in README
- GitHub resolver
- Development workflow
- CLI
- app.all-hands.dev
- Other
default: 0
- type: checkboxes
attributes:
label: Is there an existing issue for the same bug?
description: Please search existing issues before creating a new one. If found, react or comment to the duplicate issue instead of making a
new one. <!-- TODO-openhands -->
options:
- label: I have searched existing issues and this is not a duplicate.
required: true
- type: input
id: openhands-version
attributes:
label: OpenHands Version
description: What version of OpenHands are you using?
placeholder: ex. 0.9.8, main, etc.
- type: textarea
id: bug-description
attributes:
label: Bug Description
description: Clearly describe what went wrong. Be specific and concise.
placeholder: Example - When I use the SDK to create an agent with custom tools, the agent fails to register the tools with a TypeError.
validations:
required: true
- type: input
id: model-name
attributes:
label: Model Name
description: What model are you using?
placeholder: ex. gpt-4o, claude-3-5-sonnet, openrouter/deepseek-r1, etc.
- type: textarea
id: expected-behavior
attributes:
label: Expected Behavior
description: What did you expect to happen?
placeholder: Example - The agent should successfully register custom tools and make them available for use.
validations:
required: false
- type: dropdown
id: os
attributes:
label: Operating System
options:
- MacOS
- Linux
- WSL on Windows
- type: textarea
id: actual-behavior
attributes:
label: Actual Behavior
description: What actually happened?
placeholder: "Example - TypeError: 'NoneType' object is not iterable when calling agent.register_tool()"
validations:
required: false
- type: textarea
id: additional-context
attributes:
label: Logs, Errors, Screenshots, and Additional Context
description: Please provide any additional information you think might help. If you want to share the chat history
you can click the thumbs-down (👎) button above the input field and you will get a shareable link
(you can also click thumbs up when things are going well of course!). LLM logs will be stored in the
`logs/llm/default` folder. Please add any additional context about the problem here.
- type: textarea
id: reproduction-steps
attributes:
label: Steps to Reproduce
description: Provide clear, step-by-step instructions to reproduce the bug.
placeholder: |
1. Install openhands-sdk using pip
2. Import and create an agent instance
3. Define a custom tool function
4. Call agent.register_tool(custom_tool)
5. Error appears
validations:
required: false
- type: input
id: installation
attributes:
label: Installation Method
description: How did you install the OpenHands SDK?
placeholder: ex. pip install openhands-sdk, uv pip install openhands-sdk, pip install -e ., etc.
- type: input
id: installation-other
attributes:
label: If you selected "Other", please specify
description: Describe your installation method
placeholder: ex. Poetry, conda, custom setup, etc.
- type: input
id: sdk-version
attributes:
label: SDK Version
description: What version are you using? Check with `pip show openhands-sdk` or similar for other packages.
placeholder: ex. 0.1.0, 0.2.0, main branch, commit hash, etc.
validations:
required: false
- type: checkboxes
id: version-confirmation
attributes:
label: Version Confirmation
description: Bugs on older versions may already be fixed. Please upgrade before submitting.
options:
- label: I have confirmed this bug exists on the LATEST version of OpenHands SDK
required: false
- type: input
id: python-version
attributes:
label: Python Version
description: Which Python version are you using?
placeholder: ex. 3.10.12, 3.11.5, 3.12.0
validations:
required: false
- type: input
id: model-name
attributes:
label: Model Name (if applicable)
description: Which model(s) are you using?
placeholder: ex. gpt-4o, claude-3-5-sonnet-20241022, openrouter/deepseek-r1, etc.
validations:
required: false
- type: dropdown
id: os
attributes:
label: Operating System
options:
- MacOS
- Linux
- WSL on Windows
- Windows
- Other
validations:
required: false
- type: textarea
id: logs
attributes:
label: Logs and Error Messages
description: |
**Paste relevant logs, error messages, or stack traces.** Use code blocks (```) for formatting.
Include full stack traces when available.
placeholder: |
```
Paste error logs here
```
- type: textarea
id: code-sample
attributes:
label: Minimal Code Sample
description: |
If possible, provide a minimal code sample that reproduces the issue.
placeholder: |
```python
from openhands.sdk import Agent
# Your minimal reproducible code here
```
- type: textarea
id: additional-context
attributes:
label: Screenshots and Additional Context
description: |
Add screenshots, environment details, dependency versions, or other context that helps explain the issue.
placeholder: Drag and drop screenshots here, paste links, or add additional context.
- type: markdown
attributes:
value: |
---
**Note:** Please help us help you! Well-documented bugs are easier to reproduce and fix. Thank you for your understanding!
-17
@@ -1,17 +0,0 @@
---
name: Feature Request or Enhancement
about: Suggest an idea for an OpenHands feature or enhancement
title: ''
labels: 'enhancement'
assignees: ''
---
**What problem or use case are you trying to solve?**
**Describe the UX or technical implementation you have in mind**
**Additional context**
### If you find this feature request or enhancement useful, make sure to add a 👍 to the issue
+117
@@ -0,0 +1,117 @@
---
name: Feature Request or Enhancement
description: Suggest a new feature or improvement for OpenHands SDK
title: '[Feature]: '
labels: [enhancement]
body:
- type: markdown
attributes:
value: |
## Thank you for suggesting a feature! 💡
We encourage you to open the discussion on the feature you need. You are always welcome to implement it, if you wish.
- type: checkboxes
attributes:
label: Is there an existing feature request for this?
description: Please search existing issues and feature requests before creating a new one. If found, react or comment to the duplicate issue
instead of making a new one. <!-- TODO-openhands -->
options:
- label: I have searched existing issues and feature requests, and this is not a duplicate.
required: true
- type: textarea
id: problem-statement
attributes:
label: Problem or Use Case
description: What problem are you trying to solve? What use case would this feature enable?
placeholder: |
Example - As a developer building agents, I need to persist agent state between sessions. Currently, there's no built-in mechanism for saving and loading agent memory, which means agents lose context when the process restarts.
validations:
required: true
- type: textarea
id: proposed-solution
attributes:
label: Proposed Solution
description: Describe your ideal solution. What should this feature do? How should it work?
placeholder: |
Example - Add a StateManager class that allows saving and loading agent state to/from disk or database. Provide methods like save_state(), load_state(), and clear_state(). Support multiple backend options (JSON files, SQLite, Redis, etc.).
validations:
required: true
- type: textarea
id: alternatives
attributes:
label: Alternatives Considered
description: Have you considered any alternative solutions or workarounds? What are their limitations?
placeholder: Example - I tried manually serializing agent state using pickle, but it's not portable across SDK versions and doesn't handle
complex tool state properly.
- type: dropdown
id: priority
attributes:
label: Priority / Severity
description: How important is this feature to your workflow?
options:
- Critical - Blocking my work, no workaround available
- High - Significant impact on productivity
- Medium - Would improve experience
- Low - Nice to have
default: 2
validations:
required: true
- type: dropdown
id: scope
attributes:
label: Estimated Scope
description: To the best of your knowledge, how complex do you think this feature would be to implement?
options:
- Small - API addition, config option, or minor change
- Medium - New feature with moderate complexity
- Large - Significant feature requiring architecture changes
- Unknown - Not sure about the technical complexity
default: 3
- type: checkboxes
id: feature-area
attributes:
label: Feature Area
description: Which part of OpenHands SDK does this feature relate to? If you select "Other", please specify the area in the Additional
Context section below. <!-- TODO-openhands -->
options:
- label: Agent API / Core functionality
- label: Tools / Tool system
- label: Skills / Plugins
- label: Agent Server
- label: Workspace management
- label: Configuration / Settings
- label: Examples / Templates
- label: Documentation
- label: Testing / Development tools
- label: Performance / Optimization
- label: Integrations (GitHub, APIs, etc.)
- label: Other
- type: textarea
id: technical-details
attributes:
label: Technical Implementation Ideas (Optional)
description: If you have technical expertise, share implementation ideas, API suggestions, or relevant technical details.
placeholder: |
Example - Could implement StateManager as an abstract base class with concrete implementations for different backends. Add state_manager parameter to Agent constructor. Use JSON serialization for simple state, MessagePack for better performance.
- type: textarea
id: additional-context
attributes:
label: Additional Context
description: Add any other context, code examples, API mockups, or references that help illustrate this feature request.
placeholder: |
Example code or API design:
```python
from openhands.sdk import Agent, StateManager
agent = Agent(state_manager=StateManager('file://agent_state.json'))
agent.save_state()
```
+11
@@ -0,0 +1,11 @@
## Summary
[fill in a summary of this PR]
## Checklist
- [ ] If the PR is changing/adding functionality, are there tests to reflect this?
- [ ] If there is an example, have you run the example to make sure that it works?
- [ ] If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
- [ ] If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
- [ ] Is the GitHub CI passing?
+15 -78
@@ -1,80 +1,17 @@
---
# Dependabot configuration for automated dependency updates
# See: https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
#
# Note: Python (pip) ecosystem is not configured here because Dependabot does not
# fully support uv workspaces yet. See issue #2510 for tracking.
version: 2
updates:
- package-ecosystem: "pip"
directory: "/"
schedule:
interval: "daily"
open-pull-requests-limit: 1
groups:
# put packages in their own group if they have a history of breaking the build or needing to be reverted
pre-commit:
patterns:
- "pre-commit"
browsergym:
patterns:
- "browsergym*"
mcp-packages:
patterns:
- "mcp"
security-all:
applies-to: "security-updates"
patterns:
- "*"
version-all:
applies-to: "version-updates"
patterns:
- "*"
- package-ecosystem: "npm"
directory: "/frontend"
schedule:
interval: "daily"
open-pull-requests-limit: 1
groups:
docusaurus:
patterns:
- "*docusaurus*"
eslint:
patterns:
- "*eslint*"
security-all:
applies-to: "security-updates"
patterns:
- "*"
version-all:
applies-to: "version-updates"
patterns:
- "*"
- package-ecosystem: "npm"
directory: "/docs"
schedule:
interval: "weekly"
day: "wednesday"
open-pull-requests-limit: 1
groups:
docusaurus:
patterns:
- "*docusaurus*"
eslint:
patterns:
- "*eslint*"
security-all:
applies-to: "security-updates"
patterns:
- "*"
version-all:
applies-to: "version-updates"
patterns:
- "*"
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
- package-ecosystem: "docker"
directories:
- "containers/*"
schedule:
interval: "weekly"
# GitHub Actions
- package-ecosystem: github-actions
directory: /
schedule:
interval: weekly
commit-message:
prefix: chore(deps)
+109
@@ -0,0 +1,109 @@
# Documentation Update Prompt
You are a world-class documentation writer tasked with keeping the OpenHands Agent SDK documentation accurate and up-to-date. Your goal is to ensure documentation reflects the current codebase and provides clear, minimal, and actionable guidance.
## Core Objectives
1. **Accuracy**: Ensure all documentation matches the current codebase
2. **Completeness**: Include all available tools and core components
3. **Clarity**: Keep examples simple, working, and easy to understand
4. **Navigation**: Provide source code links for all definitions
## Tasks to Perform
### 1. Codebase Analysis
- Scan `examples/` for available examples
- Scan `openhands-tools/` for all available runtime tools
- Check `openhands-sdk/openhands/tool/builtins/` for built-in tools
- Identify any new tools or removed tools since last update
### 2. Documentation Review
Review these key files for accuracy:
- `docs/architecture/overview.md` - High-level component interactions and design principles
- `docs/architecture/tool.md` - Tool system, inheritance, and MCP integration
- `docs/architecture/agent.md` - Agent architecture and execution flow
- `docs/architecture/llm.md` - LLM integration and capabilities
- `docs/architecture/conversation.md` - Conversation interface and persistence
- `docs/getting-started.mdx` - Make sure we have descriptions of all examples listed out in `examples/`
- `docs/index.md` - Overview and navigation
- `README.md` - Root project documentation
### 3. Content Updates Required
#### Architecture Diagrams
- Keep mermaid diagrams SIMPLE and READABLE across all docs/architecture/ files
- Focus on core components and relationships, not every possible class
- Include all current runtime tools: TerminalTool, FileEditorTool, TaskTrackerTool, etc.
- Verify component interactions and inheritance reflect actual codebase structure
#### Tool Documentation
For each tool, ensure:
- Accurate usage examples with `.create()` method
- Working code snippets (test them!)
- Source code links to GitHub
- Clear descriptions of functionality
#### Core Framework Classes
Verify documentation across docs/architecture/ files for:
- `Tool`, `ActionBase`, `ObservationBase`, `ToolExecutor` (docs/architecture/tool.md)
- `Agent`, `AgentBase`, system prompts (docs/architecture/agent.md)
- `LLM`, message types, provider support (docs/architecture/llm.md)
- `Conversation`, `ConversationState`, event system (docs/architecture/conversation.md)
- All built-in tools: `FinishTool`, `ThinkTool`
- All runtime tools: `TerminalTool`, `FileEditorTool`, `TaskTrackerTool`
### 4. Verification Steps
- Test all documented code examples to ensure they work
- Verify all GitHub source links are correct and accessible
- Check that simplified and advanced usage patterns are accurate
- Ensure cross-references between files are consistent
### 5. Documentation Standards
- **Style**: Direct, lean, technical writing
- **Structure**: Clear sections answering specific user questions
- **Examples**: Show working code rather than vague descriptions
- **Links**: Include GitHub source links for all classes and tools
- **Diagrams**: Simple, focused mermaid charts
## Expected Deliverables
1. Updated documentation files with current tool listings
2. Verified working code examples
3. Simplified and accurate architecture diagrams
4. Complete source code links for all definitions
5. Consistent cross-references across all documentation files
## Quality Checklist
- [ ] All runtime tools are documented with working examples
- [ ] All built-in tools are listed and linked
- [ ] Architecture diagrams are simple and current
- [ ] All code examples have been tested and work
- [ ] Source code links point to correct GitHub files
- [ ] Documentation follows minimal, clear writing style
- [ ] Cross-references between files are consistent
## Commit Message Format
If you think changes are required, please create a pull request.
```
Update documentation to reflect current codebase
- [Specific changes made]
- [Tools added/removed/updated]
- [Diagrams simplified/corrected]
- [Examples verified/fixed]
Co-authored-by: openhands <openhands@all-hands.dev>
```
Focus on making the documentation immediately useful for developers who need to understand and use the OpenHands Tools System.
-38
@@ -1,38 +0,0 @@
<!-- Ideally you should open a PR when it is ready for review. Draft PRs will not be reviewed -->
## Summary of PR
<!-- Summarize what the PR does -->
## Demo Screenshots/Videos
<!-- AI/LLM AGENTS: This section is intended for a human author to add screenshots or videos demonstrating the PR in action (optional). While many pull requests may be generated by AI/LLM agents, we are fine with this as long as a human author has reviewed and tested the changes to ensure accuracy and functionality. -->
## Change Type
<!-- Choose the types that apply to your PR -->
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Refactor
- [ ] Other (dependency update, docs, typo fixes, etc.)
## Checklist
<!-- AI/LLM AGENTS: This checklist is for a human author to complete. Do NOT check either of the two boxes below. Leave them unchecked until a human has personally reviewed and tested the changes. -->
- [ ] I have read and reviewed the code and I understand what the code is doing.
- [ ] I have tested the code to the best of my ability and ensured it works as expected.
## Fixes
<!-- If this resolves an issue, link it here so it will close automatically upon merge. -->
Resolves #(issue)
## Release Notes
<!-- Check the box if this change is worth adding to the release notes. If checked, you must provide an
end-user friendly description for your change below the checkbox. -->
- [ ] Include this change in the Release Notes.
+381
@@ -0,0 +1,381 @@
# Adding Models to resolve_model_config.py
## Overview
This file (`resolve_model_config.py`) defines models available for evaluation. Models must be added here before they can be used in integration tests or evaluations.
## Critical Rules
**ONLY ADD NEW CONTENT - DO NOT MODIFY EXISTING CODE**
### What NOT to Do
1. **Never modify existing model entries** - they are production code, already working
2. **Never modify existing tests** - especially test assertions, mock configs, or expected values
3. **Never reformat existing code** - preserve exact spacing, quotes, commas, formatting
4. **Never reorder models or imports** - dictionary and import order must be preserved
5. **Never "fix" existing code** - if it's in the file and tests pass, it works
6. **Never change test assertions** - even if they "look wrong" to you
7. **Never replace real model tests with mocked tests** - weakens validation
8. **Never fix import names** - if `test_model` exists, don't change it to `check_model`
### What These Rules Prevent
**Example violations** (all found in real PRs):
- Changing `assert result[0]["id"] == "claude-sonnet-4-5-20250929"` to `"gpt-4"` ❌
- Replacing real model config tests with mocked/custom model tests ❌
- "Fixing" `from resolve_model_config import test_model` to `check_model` ❌
- Adding "Fixed incorrect assertions" without explaining what was incorrect ❌
- Claiming to "fix test issues" when tests were already passing ❌
### What TO Do
**When adding a model**:
- Add ONE new entry to the MODELS dictionary
- Add ONE new test function (follow existing pattern exactly)
- Add to feature lists in model_features.py ONLY if needed for your model
- Do not touch any other files, tests, imports, or configurations
- Test the PR branch with the integration test action.
- Add a link to the integration test run to the PR.
- If you think something is broken, it's probably not - add a comment to the PR.
## Files to Modify
1. **Always required**:
- `.github/run-eval/resolve_model_config.py` - Add model configuration
- `tests/github_workflows/test_resolve_model_config.py` - Add test
2. **Usually required** (if model has special characteristics):
- `openhands-sdk/openhands/sdk/llm/utils/model_features.py` - Add to feature categories
3. **Sometimes required**:
- `openhands-sdk/openhands/sdk/llm/utils/model_prompt_spec.py` - GPT models only (variant detection)
- `openhands-sdk/openhands/sdk/llm/utils/verified_models.py` - Production-ready models
> ⚠️ **When editing `verified_models.py`**: If you add a model to `VERIFIED_OPENHANDS_MODELS`,
> you **must also** add it to its provider-specific list (e.g. `VERIFIED_ANTHROPIC_MODELS`,
> `VERIFIED_GEMINI_MODELS`, `VERIFIED_MOONSHOT_MODELS`, etc.).
> If no list exists for the provider yet, create one and add it to the `VERIFIED_MODELS` dict.
> This ensures the model appears under its actual provider in the UI, not just under "openhands".
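The rule above can be checked mechanically. The following is a minimal sketch, not code from the repository: the list contents are made up, the helper name `missing_provider_entries` is hypothetical, and it assumes `VERIFIED_MODELS` maps a provider name to a list of model names with `"openhands"` as one of its keys.

```python
# Hypothetical contents; the real lists live in verified_models.py.
VERIFIED_OPENHANDS_MODELS = ["claude-sonnet-4-5-20250929", "kimi-k2-thinking"]
VERIFIED_ANTHROPIC_MODELS = ["claude-sonnet-4-5-20250929"]
VERIFIED_MOONSHOT_MODELS: list[str] = []  # kimi-k2-thinking was forgotten here

# Assumed shape: provider name -> list of model names.
VERIFIED_MODELS = {
    "openhands": VERIFIED_OPENHANDS_MODELS,
    "anthropic": VERIFIED_ANTHROPIC_MODELS,
    "moonshot": VERIFIED_MOONSHOT_MODELS,
}


def missing_provider_entries(verified: dict[str, list[str]]) -> list[str]:
    """Return 'openhands' models that appear in no provider-specific list."""
    provider_models = {
        model
        for provider, models in verified.items()
        if provider != "openhands"
        for model in models
    }
    return [m for m in verified.get("openhands", []) if m not in provider_models]


print(missing_provider_entries(VERIFIED_MODELS))
```

With the sample data, the helper reports the one model that was added to the "openhands" list but not to its provider list.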
## Step 1: Add to resolve_model_config.py
Add an entry to the `MODELS` dictionary:
```python
"model-id": {
    "id": "model-id",  # Must match dictionary key
    "display_name": "Human Readable Name",
    "llm_config": {
        "model": "litellm_proxy/provider/model-name",
        "temperature": 0.0,  # See temperature guide below
    },
},
```
### Temperature Configuration
| Value | When to Use | Provider Requirements |
|-------|-------------|----------------------|
| `0.0` | Standard deterministic models | Most providers |
| `1.0` | Reasoning models | Kimi K2, MiniMax M2.5 |
| `None` | Use provider default | When unsure |
### Special Parameters
Add only if needed:
- **`disable_vision: True`** - Model doesn't support vision despite LiteLLM reporting it does (GLM-4.7, GLM-5)
- **`reasoning_effort: "high"`** - For OpenAI reasoning models that support this parameter
- **`max_tokens: <value>`** - To prevent hangs or control output length
- **`top_p: <value>`** - Nucleus sampling (cannot be used with `temperature` for Claude models)
- **`litellm_extra_body: {...}`** - Provider-specific parameters (e.g., `{"enable_thinking": True}`)
### Critical Rules
1. Model ID must match dictionary key
2. Model path must start with `litellm_proxy/`
3. **Claude models**: Cannot use both `temperature` and `top_p` - choose one or omit both
4. Parameters like `disable_vision` must be in `SDK_ONLY_PARAMS` constant (they're filtered before sending to LiteLLM)
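The first three rules lend themselves to a quick local check before opening a PR. This is a sketch under stated assumptions: `lint_model_entry` is a hypothetical helper that does not exist in the repository, and it treats any model path containing "claude" as a Claude model.

```python
from typing import Any


def lint_model_entry(key: str, entry: dict[str, Any]) -> list[str]:
    """Return rule violations for one MODELS entry (hypothetical helper)."""
    problems: list[str] = []
    llm_config = entry.get("llm_config", {})
    model_path = llm_config.get("model", "")

    # Rule 1: the "id" field must equal the dictionary key.
    if entry.get("id") != key:
        problems.append("id does not match dictionary key")
    # Rule 2: every model is routed through the LiteLLM proxy.
    if not model_path.startswith("litellm_proxy/"):
        problems.append("model path must start with litellm_proxy/")
    # Rule 3 (assumption: "claude" in the path identifies a Claude model).
    has_both = all(p in llm_config for p in ("temperature", "top_p"))
    if "claude" in model_path.lower() and has_both:
        problems.append("Claude models cannot set both temperature and top_p")
    return problems


entry = {
    "id": "model-id",
    "display_name": "Human Readable Name",
    "llm_config": {"model": "litellm_proxy/provider/model-name", "temperature": 0.0},
}
print(lint_model_entry("model-id", entry))
```

An empty list means the entry satisfies the three rules; it says nothing about whether the model is reachable, which is what the preflight check covers.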
## Step 2: Update model_features.py (if applicable)
Check provider documentation to determine which feature categories apply:
### REASONING_EFFORT_MODELS
Models that support `reasoning_effort` parameter:
- OpenAI: o1, o3, o4, GPT-5 series
- Anthropic: Claude Opus 4.5+, Claude Sonnet 4.6
- Google: Gemini 2.5+, Gemini 3.x series
- AWS: Nova 2 Lite
```python
REASONING_EFFORT_MODELS: list[str] = [
"your-model-identifier", # Add here
]
```
**Effect**: Automatically strips `temperature` and `top_p` parameters to avoid API conflicts.
### EXTENDED_THINKING_MODELS
Models with extended thinking capabilities:
- Anthropic: Claude Sonnet 4.5+, Claude Haiku 4.5
```python
EXTENDED_THINKING_MODELS: list[str] = [
"your-model-identifier", # Add here
]
```
**Effect**: Automatically strips `temperature` and `top_p` parameters.
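To make the stripping effect concrete, here is a minimal sketch of what it amounts to for a single `llm_config`. It is not the SDK's implementation: the list entries are illustrative, `strip_sampling_params` is a hypothetical name, and substring matching against the model path is an assumption.

```python
from typing import Any

# Illustrative entries only; the real lists live in model_features.py.
REASONING_EFFORT_MODELS = ["gpt-5", "claude-opus-4-5"]
EXTENDED_THINKING_MODELS = ["claude-sonnet-4-5"]


def strip_sampling_params(llm_config: dict[str, Any]) -> dict[str, Any]:
    """Drop temperature/top_p when the model matches either feature list."""
    model = llm_config.get("model", "").lower()
    patterns = REASONING_EFFORT_MODELS + EXTENDED_THINKING_MODELS
    if not any(pattern in model for pattern in patterns):
        return dict(llm_config)  # Unlisted models keep their sampling params
    return {
        k: v for k, v in llm_config.items() if k not in ("temperature", "top_p")
    }


listed = {"model": "litellm_proxy/claude-sonnet-4-5-20250929", "temperature": 0.0}
unlisted = {"model": "litellm_proxy/dashscope/qwen3.5-flash", "temperature": 0.0}
print(strip_sampling_params(listed))
print(strip_sampling_params(unlisted))
```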
### PROMPT_CACHE_MODELS
Models supporting prompt caching:
- Anthropic: Claude 3.5+, Claude 4+ series
```python
PROMPT_CACHE_MODELS: list[str] = [
"your-model-identifier", # Add here
]
```
### SUPPORTS_STOP_WORDS_FALSE_MODELS
Models that **do not** support stop words:
- OpenAI: o1, o3 series
- xAI: Grok-4, Grok-code-fast-1
- DeepSeek: R1 family
```python
SUPPORTS_STOP_WORDS_FALSE_MODELS: list[str] = [
"your-model-identifier", # Add here
]
```
### FORCE_STRING_SERIALIZER_MODELS
Models requiring string format for tool messages (not structured content):
- DeepSeek models
- GLM models
- Groq: Kimi K2-Instruct
- OpenRouter: MiniMax
Use pattern matching:
```python
FORCE_STRING_SERIALIZER_MODELS: list[str] = [
"deepseek", # Matches any model with "deepseek" in name
"groq/kimi-k2-instruct", # Provider-prefixed
]
```
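The comments in the list above describe substring matching. A minimal sketch of that behavior follows; `matches_feature` is a hypothetical name, and case-insensitive matching is an assumption, so check `model_features.py` for the actual logic.

```python
def matches_feature(model: str, patterns: list[str]) -> bool:
    """Return True if any pattern occurs in the model name (assumed semantics)."""
    name = model.lower()
    return any(pattern.lower() in name for pattern in patterns)


FORCE_STRING_SERIALIZER_MODELS = ["deepseek", "groq/kimi-k2-instruct"]

# "deepseek" matches regardless of the provider prefix.
print(matches_feature("litellm_proxy/deepseek/deepseek-reasoner", FORCE_STRING_SERIALIZER_MODELS))
# The provider-prefixed pattern does not match Kimi served by another provider.
print(matches_feature("litellm_proxy/moonshot/kimi-k2-thinking", FORCE_STRING_SERIALIZER_MODELS))
```

A bare pattern such as `"deepseek"` is broad by design, while a provider-prefixed pattern limits the match to one routing path.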
### Other Categories
- **PROMPT_CACHE_RETENTION_MODELS**: GPT-5 family, GPT-4.1
- **RESPONSES_API_MODELS**: GPT-5 family, codex-mini-latest
- **SEND_REASONING_CONTENT_MODELS**: Kimi K2 Thinking/K2.5, MiniMax-M2, DeepSeek Reasoner
See `model_features.py` for complete lists and additional documentation.
## Step 3: Add Test
**File**: `tests/github_workflows/test_resolve_model_config.py`
**Important**:
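- Python function names cannot contain hyphens or dots. Convert both to underscores when deriving the function name from the model ID (e.g. `gpt-5.2-codex` becomes `test_gpt_5_2_codex_config`).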
- **Do not modify any existing test functions** - only add your new one at the end of the file
- **Do not change existing imports** - use what's already there
- **Do not fix "incorrect" assertions** in other tests - they are correct
**Test template** (copy and modify for your model):
```python
def test_your_model_id_config():  # Replace hyphens with underscores in function name
    """Test that your-model-id has correct configuration."""
    model = MODELS["your-model-id"]  # Dictionary key keeps hyphens
    assert model["id"] == "your-model-id"
    assert model["display_name"] == "Your Model Display Name"
    assert model["llm_config"]["model"] == "litellm_proxy/provider/model-name"
    # Only add assertions for parameters YOU added in resolve_model_config.py
    # assert model["llm_config"]["temperature"] == 0.0
    # assert model["llm_config"]["disable_vision"] is True
```
**What NOT to do in tests**:
- Don't change assertions in other test functions (even if model names "look wrong")
- Don't replace real model tests with mocked tests
- Don't change `test_model` to `check_model` in imports
- Don't modify mock_models dictionaries in other tests
- Don't add "fixes" to existing tests - they work as-is
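The naming rule for test functions can be sketched as a small helper. `config_test_name` is a hypothetical name used only for illustration; it also replaces dots, since model IDs such as `gpt-5.2-codex` contain them and dots are not valid in Python identifiers either.

```python
import re


def config_test_name(model_id: str) -> str:
    """Derive a valid test function name from a model ID (illustrative only)."""
    # Replace every character that is not legal in an identifier.
    slug = re.sub(r"[^0-9a-zA-Z_]", "_", model_id)
    return f"test_{slug}_config"


print(config_test_name("gpt-5.2-codex"))
```

The dictionary key passed to `MODELS[...]` inside the test keeps its original hyphens and dots; only the function name changes.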
## Step 4: Update GPT Variant Detection (GPT models only)
**File**: `openhands-sdk/openhands/sdk/llm/utils/model_prompt_spec.py`
Required only if this is a GPT model that needs a specific prompt template.
**Order matters**: More specific patterns must come before general patterns.
```python
_MODEL_VARIANT_PATTERNS: dict[str, tuple[tuple[str, tuple[str, ...]], ...]] = {
    "openai_gpt": (
        (
            "gpt-5-codex",  # Specific variant first
            ("gpt-5-codex", "gpt-5.1-codex", "gpt-5.2-codex", "gpt-5.3-codex"),
        ),
        ("gpt-5", ("gpt-5", "gpt-5.1", "gpt-5.2")),  # General variant last
    ),
}
```
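Why order matters is easiest to see with a first-match-wins lookup. The sketch below assumes that is how the patterns are consumed; `detect_variant` is a hypothetical function, not the one in `model_prompt_spec.py`, and substring matching is an assumption.

```python
from __future__ import annotations

Patterns = tuple[tuple[str, tuple[str, ...]], ...]

SPECIFIC_FIRST: Patterns = (
    ("gpt-5-codex", ("gpt-5-codex", "gpt-5.1-codex", "gpt-5.2-codex", "gpt-5.3-codex")),
    ("gpt-5", ("gpt-5", "gpt-5.1", "gpt-5.2")),
)
# Same entries in the wrong order, to show the failure mode.
GENERAL_FIRST: Patterns = tuple(reversed(SPECIFIC_FIRST))


def detect_variant(model_name: str, patterns: Patterns) -> str | None:
    """Return the first variant whose substrings occur in the model name."""
    name = model_name.lower()
    for variant, substrings in patterns:
        if any(s in name for s in substrings):
            return variant  # First match wins
    return None


print(detect_variant("litellm_proxy/gpt-5.2-codex", SPECIFIC_FIRST))
print(detect_variant("litellm_proxy/gpt-5.2-codex", GENERAL_FIRST))
```

With the general pattern first, `gpt-5.2-codex` already matches the `gpt-5.2` substring and never reaches the codex entry, which is the "wrong prompt template" symptom.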
## Step 5: Run Tests Locally
```bash
# Pre-commit checks
pre-commit run --all-files
# Unit tests
pytest tests/github_workflows/test_resolve_model_config.py::test_your_model_id_config -v
# Manual verification
cd .github/run-eval
MODEL_IDS="your-model-id" GITHUB_OUTPUT=/tmp/output.txt python resolve_model_config.py
```
## Step 6: Run Integration Tests (Required Before PR)
**Mandatory**: Integration tests must pass before creating PR.
### Via GitHub Actions
1. Push branch: `git push origin your-branch-name`
2. Navigate to: https://github.com/OpenHands/software-agent-sdk/actions/workflows/integration-runner.yml
3. Click "Run workflow"
4. Configure:
- **Branch**: Select your branch
- **model_ids**: `your-model-id`
- **Reason**: "Testing model-id"
5. Wait for completion
6. **Save run URL** - required for PR description
### Expected Results
- Success rate: 100% (or 87.5% if vision test skipped)
- Duration: 5-10 minutes per model
- Tests: 8 total (basic commands, file ops, code editing, reasoning, errors, tools, context, vision)
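The two success rates follow directly from the test count. A small worked example, assuming a skipped vision test counts against the total of 8 (the helper name is illustrative):

```python
TOTAL_TESTS = 8  # From the list above


def success_rate(passed: int, total: int = TOTAL_TESTS) -> float:
    """Percentage of integration tests that passed."""
    return 100.0 * passed / total


print(success_rate(8))  # All tests pass
print(success_rate(7))  # Vision test skipped
```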
## Step 7: Create PR
### Required in PR Description
```markdown
## Summary
Adds the `model-id` model to resolve_model_config.py.
## Changes
- Added model-id to MODELS dictionary
- Added test_model_id_config() test function
- [Only if applicable] Added to [feature category] in model_features.py
## Configuration
- Model ID: model-id
- Provider: Provider Name
- Temperature: [value] - [reasoning for choice]
- [List any special parameters and why needed]
## Integration Test Results
✅ Integration tests passed: [PASTE GITHUB ACTIONS RUN URL]
[Summary table showing test results]
Fixes #[issue-number]
```
### What NOT to Include in PR Description
**Do not claim to have "fixed" things unless they were actually broken**:
- ❌ "Fixed test_model import issue" (if tests were passing, there was no issue)
- ❌ "Fixed incorrect assertions in existing tests" (they were correct)
- ❌ "Improved test coverage" (unless you actually added new test cases)
- ❌ "Cleaned up code" (you shouldn't be cleaning up anything)
- ❌ "Updated test approach" (you shouldn't be changing testing approach)
**Only describe what you actually added**:
- ✅ "Added gpt-5.3-codex model configuration"
- ✅ "Added test for gpt-5.3-codex"
- ✅ "Added gpt-5.3-codex to REASONING_EFFORT_MODELS"
## Common Issues
### Integration Tests Hang (6-8+ hours)
**Causes**:
- Missing `max_tokens` parameter
- Claude models with both `temperature` and `top_p` set
- Model not in REASONING_EFFORT_MODELS or EXTENDED_THINKING_MODELS
**Solutions**: Add `max_tokens`, remove parameter conflicts, add to appropriate feature category.
**Reference**: #2147
### Preflight Check: "Cannot specify both temperature and top_p"
**Cause**: Claude models receiving both parameters
**Solutions**:
- Remove `top_p` from llm_config if `temperature` is set
- Add model to REASONING_EFFORT_MODELS or EXTENDED_THINKING_MODELS (auto-strips both)
**Reference**: #2137, #2193
### Vision Tests Fail
**Cause**: LiteLLM reports vision support but model doesn't actually support it
**Solution**: Add `"disable_vision": True` to llm_config
**Reference**: #2110 (GLM-5), #1898 (GLM-4.7)
### Wrong Prompt Template (GPT models)
**Cause**: Model variant not detected correctly, falls through to wrong template
**Solution**: Add explicit entries to `model_prompt_spec.py` with correct pattern order
**Reference**: #2233 (GPT-5.2-codex, GPT-5.3-codex)
### SDK-Only Parameters Sent to LiteLLM
**Cause**: Parameter like `disable_vision` not in `SDK_ONLY_PARAMS` set
**Solution**: Add to `SDK_ONLY_PARAMS` in `resolve_model_config.py`
**Reference**: #2194
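The filtering itself is a one-line comprehension. The sketch below mirrors the kwargs construction in `check_model` in `resolve_model_config.py`; the wrapper name `litellm_kwargs` is hypothetical and used only to isolate the step.

```python
from typing import Any

SDK_ONLY_PARAMS = {"disable_vision"}


def litellm_kwargs(llm_config: dict[str, Any]) -> dict[str, Any]:
    """Keep only the parameters that may be forwarded to LiteLLM."""
    return {
        k: v
        for k, v in llm_config.items()
        if k != "model" and k not in SDK_ONLY_PARAMS  # "model" is passed separately
    }


glm_config = {
    "model": "litellm_proxy/openrouter/z-ai/glm-5",
    "temperature": 0.0,
    "disable_vision": True,
}
print(litellm_kwargs(glm_config))
```

A new SDK-only parameter that is missing from `SDK_ONLY_PARAMS` would pass through this filter and reach LiteLLM, which is the failure described above.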
## Model Feature Detection Criteria
### How to Determine if Model Needs Feature Category
**Reasoning Model**:
- Check provider documentation for "reasoning", "thinking", or "o1-style" mentions
- Model exposes internal reasoning traces
- Examples: o1, o3, GPT-5, Claude Opus 4.5+, Gemini 3+
**Extended Thinking**:
- Check if model is Claude Sonnet 4.5+ or Claude Haiku 4.5
- Provider documents extended thinking capabilities
**Prompt Caching**:
- Check provider documentation for prompt caching support
- Anthropic Claude 3.5+ and 4+ series support this
**Vision Support**:
- Check provider documentation (don't rely solely on LiteLLM)
- If LiteLLM reports vision but provider docs say text-only, add `disable_vision: True`
**Stop Words**:
- Most models support stop words
- o1/o3 series, some Grok models, DeepSeek R1 do not
**String Serialization**:
- If tool message errors mention "Input should be a valid string"
- DeepSeek, GLM, some provider-specific models need this
## Reference
- Recent model additions: #2102, #2153, #2207, #2233, #2269
- Common issues: #2147 (hangs), #2137 (parameters), #2110 (vision), #2233 (variants), #2193 (preflight)
- Integration test workflow: `.github/workflows/integration-runner.yml`
+56
@@ -0,0 +1,56 @@
# Model Configuration for OpenHands SDK
See the [project root AGENTS.md](../../AGENTS.md) for repository-wide policies and workflows.
This directory contains model configuration and evaluation setup for the OpenHands SDK.
## Key Files
- **`resolve_model_config.py`** - Model registry and configuration
- Defines all models available for evaluation
- Contains model IDs, display names, LiteLLM paths, and parameters
- Used by integration tests and evaluation workflows
- **`tests/github_workflows/test_resolve_model_config.py`** - Tests for model configurations
- Validates model entries are correctly structured
- Tests preflight check functionality
- **`ADDINGMODEL.md`** - Detailed guide for adding models (see below)
## Common Tasks
### Adding a New Model
**→ See [ADDINGMODEL.md](./ADDINGMODEL.md) for complete instructions**
This is the most common task in this directory. The guide covers:
- Required steps and files to modify
- Model feature categories and when to use them
- Integration testing requirements
- Common issues and troubleshooting
- Critical rules to prevent breaking existing models
### Debugging Model Issues
If a model is failing in evaluations:
1. Check the model configuration in `resolve_model_config.py`
2. Review parameter compatibility (especially `temperature` + `top_p` for Claude)
3. Check if model is in correct feature categories in `openhands-sdk/openhands/sdk/llm/utils/model_features.py`
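4. Run preflight check: `MODEL_IDS="model-id" GITHUB_OUTPUT=/tmp/output.txt python resolve_model_config.py` (the script exits if `GITHUB_OUTPUT` is unset and skips the preflight check unless `LLM_API_KEY` is set)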
### Updating Existing Models
**Warning**: Only update existing models if there's a confirmed issue. Working configurations should not be changed.
If you must update:
1. Document why the change is needed (link to issue/PR showing the problem)
2. Test thoroughly before and after the change
3. Run integration tests to verify no regressions
## Directory Purpose
This directory bridges model definitions with the evaluation system:
- Models defined here are available for integration tests
- Configuration includes LiteLLM routing and SDK-specific parameters
- Preflight checks validate model accessibility before expensive evaluation runs
- Tests ensure all models are correctly structured and resolvable
+447
@@ -0,0 +1,447 @@
#!/usr/bin/env python3
"""
Resolve model IDs to full model configurations and verify model availability.
Reads:
- MODEL_IDS: comma-separated model IDs
- LLM_API_KEY: API key for litellm_proxy (optional, for preflight check)
- LLM_BASE_URL: Base URL for litellm_proxy (optional, defaults to eval proxy)
- SKIP_PREFLIGHT: Set to 'true' to skip the preflight LLM check
Outputs to GITHUB_OUTPUT:
- models_json: JSON array of full model configs with display names
"""
import json
import os
import sys
from typing import Any
# SDK-specific parameters that should not be passed to litellm.
# These parameters are used by the SDK's LLM wrapper but are not part of litellm's API.
# Keep this list in sync with SDK LLM config parameters that are SDK-internal.
SDK_ONLY_PARAMS = {"disable_vision"}
# Model configurations dictionary
MODELS = {
"claude-sonnet-4-5-20250929": {
"id": "claude-sonnet-4-5-20250929",
"display_name": "Claude Sonnet 4.5",
"llm_config": {
"model": "litellm_proxy/claude-sonnet-4-5-20250929",
"temperature": 0.0,
},
},
"kimi-k2-thinking": {
"id": "kimi-k2-thinking",
"display_name": "Kimi K2 Thinking",
"llm_config": {
"model": "litellm_proxy/moonshot/kimi-k2-thinking",
"temperature": 1.0,
},
},
# https://www.kimi.com/blog/kimi-k2-5.html
"kimi-k2.5": {
"id": "kimi-k2.5",
"display_name": "Kimi K2.5",
"llm_config": {
"model": "litellm_proxy/moonshot/kimi-k2.5",
"temperature": 1.0,
"top_p": 0.95,
},
},
# https://www.alibabacloud.com/help/en/model-studio/deep-thinking
"qwen3-max-thinking": {
"id": "qwen3-max-thinking",
"display_name": "Qwen3 Max Thinking",
"llm_config": {
"model": "litellm_proxy/dashscope/qwen3-max-2026-01-23",
"litellm_extra_body": {"enable_thinking": True},
},
},
"qwen3.5-flash": {
"id": "qwen3.5-flash",
"display_name": "Qwen3.5 Flash",
"llm_config": {
"model": "litellm_proxy/dashscope/qwen3.5-flash-2026-02-23",
"temperature": 0.0,
},
},
"claude-4.5-opus": {
"id": "claude-4.5-opus",
"display_name": "Claude 4.5 Opus",
"llm_config": {
"model": "litellm_proxy/anthropic/claude-opus-4-5-20251101",
"temperature": 0.0,
},
},
"claude-4.6-opus": {
"id": "claude-4.6-opus",
"display_name": "Claude 4.6 Opus",
"llm_config": {
"model": "litellm_proxy/anthropic/claude-opus-4-6",
"temperature": 0.0,
},
},
"claude-sonnet-4-6": {
"id": "claude-sonnet-4-6",
"display_name": "Claude Sonnet 4.6",
"llm_config": {
"model": "litellm_proxy/anthropic/claude-sonnet-4-6",
"temperature": 0.0,
},
},
"gemini-3-pro": {
"id": "gemini-3-pro",
"display_name": "Gemini 3 Pro",
"llm_config": {
"model": "litellm_proxy/gemini-3-pro-preview",
"temperature": 0.0,
},
},
"gemini-3-flash": {
"id": "gemini-3-flash",
"display_name": "Gemini 3 Flash",
"llm_config": {
"model": "litellm_proxy/gemini-3-flash-preview",
"temperature": 0.0,
},
},
"gemini-3.1-pro": {
"id": "gemini-3.1-pro",
"display_name": "Gemini 3.1 Pro",
"llm_config": {
"model": "litellm_proxy/gemini-3.1-pro-preview",
"temperature": 0.0,
},
},
"gpt-5.2": {
"id": "gpt-5.2",
"display_name": "GPT-5.2",
"llm_config": {"model": "litellm_proxy/openai/gpt-5.2-2025-12-11"},
},
"gpt-5.2-codex": {
"id": "gpt-5.2-codex",
"display_name": "GPT-5.2 Codex",
"llm_config": {"model": "litellm_proxy/gpt-5.2-codex"},
},
"gpt-5-3-codex": {
"id": "gpt-5-3-codex",
"display_name": "GPT-5.3 Codex",
"llm_config": {"model": "litellm_proxy/gpt-5-3-codex"},
},
"gpt-5.2-high-reasoning": {
"id": "gpt-5.2-high-reasoning",
"display_name": "GPT-5.2 High Reasoning",
"llm_config": {
"model": "litellm_proxy/openai/gpt-5.2-2025-12-11",
"reasoning_effort": "high",
},
},
"gpt-5.4": {
"id": "gpt-5.4",
"display_name": "GPT-5.4",
"llm_config": {
"model": "litellm_proxy/openai/gpt-5.4",
"reasoning_effort": "high",
},
},
"minimax-m2": {
"id": "minimax-m2",
"display_name": "MiniMax M2",
"llm_config": {
"model": "litellm_proxy/minimax/minimax-m2",
"temperature": 0.0,
},
},
"minimax-m2.5": {
"id": "minimax-m2.5",
"display_name": "MiniMax M2.5",
"llm_config": {
"model": "litellm_proxy/minimax/MiniMax-M2.5",
"temperature": 1.0,
"top_p": 0.95,
},
},
"minimax-m2.1": {
"id": "minimax-m2.1",
"display_name": "MiniMax M2.1",
"llm_config": {
"model": "litellm_proxy/minimax/MiniMax-M2.1",
"temperature": 0.0,
},
},
"minimax-m2.7": {
"id": "minimax-m2.7",
"display_name": "MiniMax M2.7",
"llm_config": {
"model": "litellm_proxy/minimax/MiniMax-M2.7",
"temperature": 1.0,
"top_p": 0.95,
},
},
"deepseek-v3.2-reasoner": {
"id": "deepseek-v3.2-reasoner",
"display_name": "DeepSeek V3.2 Reasoner",
"llm_config": {"model": "litellm_proxy/deepseek/deepseek-reasoner"},
},
"qwen-3-coder": {
"id": "qwen-3-coder",
"display_name": "Qwen 3 Coder",
"llm_config": {
"model": "litellm_proxy/fireworks_ai/qwen3-coder-480b-a35b-instruct",
"temperature": 0.0,
},
},
"nemotron-3-nano-30b": {
"id": "nemotron-3-nano-30b",
"display_name": "NVIDIA Nemotron 3 Nano 30B",
"llm_config": {
"model": "litellm_proxy/openai/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8",
"temperature": 0.0,
},
},
"glm-4.7": {
"id": "glm-4.7",
"display_name": "GLM-4.7",
"llm_config": {
"model": "litellm_proxy/openrouter/z-ai/glm-4.7",
"temperature": 0.0,
# OpenRouter glm-4.7 is text-only despite LiteLLM reporting vision support
"disable_vision": True,
},
},
"glm-5": {
"id": "glm-5",
"display_name": "GLM-5",
"llm_config": {
"model": "litellm_proxy/openrouter/z-ai/glm-5",
"temperature": 0.0,
# OpenRouter glm-5 is text-only despite LiteLLM reporting vision support
"disable_vision": True,
},
},
"qwen3-coder-next": {
"id": "qwen3-coder-next",
"display_name": "Qwen3 Coder Next",
"llm_config": {
"model": "litellm_proxy/openrouter/qwen/qwen3-coder-next",
"temperature": 0.0,
},
},
"qwen3-coder-30b-a3b-instruct": {
"id": "qwen3-coder-30b-a3b-instruct",
"display_name": "Qwen3 Coder 30B A3B Instruct",
"llm_config": {
"model": "litellm_proxy/Qwen3-Coder-30B-A3B-Instruct",
"temperature": 0.0,
},
},
"gpt-oss-20b": {
"id": "gpt-oss-20b",
"display_name": "GPT OSS 20B",
"llm_config": {
"model": "litellm_proxy/gpt-oss-20b",
"temperature": 0.0,
},
},
"nemotron-3-super-120b-a12b": {
"id": "nemotron-3-super-120b-a12b",
"display_name": "NVIDIA Nemotron-3 Super 120B",
"llm_config": {
"model": "litellm_proxy/nvidia/nemotron-3-super-120b-a12b",
"temperature": 0.0,
},
},
}
def error_exit(msg: str, exit_code: int = 1) -> None:
    """Print error message and exit."""
    print(f"ERROR: {msg}", file=sys.stderr)
    sys.exit(exit_code)


def get_required_env(key: str) -> str:
    """Get required environment variable or exit with error."""
    value = os.environ.get(key)
    if not value:
        error_exit(f"{key} not set")
    return value


def find_models_by_id(model_ids: list[str]) -> list[dict]:
    """Find models by ID. Fails fast on missing ID.

    Args:
        model_ids: List of model IDs to find

    Returns:
        List of model dictionaries matching the IDs

    Raises:
        SystemExit: If any model ID is not found
    """
    resolved = []
    for model_id in model_ids:
        if model_id not in MODELS:
            available = ", ".join(sorted(MODELS.keys()))
            error_exit(
                f"Model ID '{model_id}' not found. Available models: {available}"
            )
        resolved.append(MODELS[model_id])
    return resolved
def check_model(
    model_config: dict[str, Any],
    api_key: str,
    base_url: str,
    timeout: int = 60,
) -> tuple[bool, str]:
    """Check a single model with a simple completion request using litellm.

    Args:
        model_config: Model configuration dict with 'llm_config' key
        api_key: API key for authentication
        base_url: Base URL for the LLM proxy
        timeout: Request timeout in seconds

    Returns:
        Tuple of (success: bool, message: str)
    """
    import litellm

    llm_config = model_config.get("llm_config", {})
    model_name = llm_config.get("model", "unknown")
    display_name = model_config.get("display_name", model_name)
    try:
        # Build kwargs from llm_config, excluding 'model' and SDK-specific params
        kwargs = {
            k: v
            for k, v in llm_config.items()
            if k != "model" and k not in SDK_ONLY_PARAMS
        }
        # Use simple arithmetic prompt that works reliably across all models
        # max_tokens=100 provides enough room for models to respond
        # (some need >10 tokens)
        response = litellm.completion(
            model=model_name,
            messages=[{"role": "user", "content": "1+1="}],
            max_tokens=100,
            api_key=api_key,
            base_url=base_url,
            timeout=timeout,
            **kwargs,
        )
        response_content = (
            response.choices[0].message.content if response.choices else None
        )
        reasoning_content = (
            getattr(response.choices[0].message, "reasoning_content", None)
            if response.choices
            else None
        )
        if response_content or reasoning_content:
            return True, f"{display_name}: OK"
        else:
            # Check if there's any other data in the response for diagnostics
            finish_reason = (
                response.choices[0].finish_reason if response.choices else None
            )
            usage = getattr(response, "usage", None)
            return (
                False,
                (
                    f"{display_name}: Empty response "
                    f"(finish_reason={finish_reason}, usage={usage})"
                ),
            )
    except litellm.exceptions.Timeout:
        return False, f"{display_name}: Request timed out after {timeout}s"
    except litellm.exceptions.APIConnectionError as e:
        return False, f"{display_name}: Connection error - {e}"
    except litellm.exceptions.BadRequestError as e:
        return False, f"{display_name}: Bad request - {e}"
    except litellm.exceptions.NotFoundError as e:
        return False, f"{display_name}: Model not found - {e}"
    except Exception as e:
        return False, f"{display_name}: {type(e).__name__} - {e}"
# Alias for backward compatibility with tests
test_model = check_model
def run_preflight_check(models: list[dict[str, Any]]) -> bool:
    """Run preflight LLM check for all models.

    Args:
        models: List of model configurations to test

    Returns:
        True if all models passed, False otherwise
    """
    api_key = os.environ.get("LLM_API_KEY")
    base_url = os.environ.get("LLM_BASE_URL", "https://llm-proxy.eval.all-hands.dev")
    skip_preflight = os.environ.get("SKIP_PREFLIGHT", "").lower() == "true"
    if skip_preflight:
        print("Preflight check: SKIPPED (SKIP_PREFLIGHT=true)")
        return True
    if not api_key:
        print("Preflight check: SKIPPED (LLM_API_KEY not set)")
        return True
    print(f"\nPreflight LLM check for {len(models)} model(s)...")
    print("-" * 50)
    all_passed = True
    for model_config in models:
        success, message = check_model(model_config, api_key, base_url)
        print(message)
        if not success:
            all_passed = False
    print("-" * 50)
    if all_passed:
        print(f"✓ All {len(models)} model(s) passed preflight check\n")
    else:
        print("✗ Some models failed preflight check")
        print("Evaluation aborted to avoid wasting compute resources.\n")
    return all_passed
def main() -> None:
    model_ids_str = get_required_env("MODEL_IDS")
    github_output = get_required_env("GITHUB_OUTPUT")
    # Parse requested model IDs
    model_ids = [mid.strip() for mid in model_ids_str.split(",") if mid.strip()]
    # Resolve model configs
    resolved = find_models_by_id(model_ids)
    print(f"Resolved {len(resolved)} model(s): {', '.join(model_ids)}")
    # Run preflight check
    if not run_preflight_check(resolved):
        error_exit("Preflight LLM check failed")
    # Output as JSON
    models_json = json.dumps(resolved, separators=(",", ":"))
    with open(github_output, "a", encoding="utf-8") as f:
        f.write(f"models_json={models_json}\n")


if __name__ == "__main__":
    main()
+89
@@ -0,0 +1,89 @@
#!/usr/bin/env python3
"""
Validate SDK reference for semantic versioning.
This script validates that the SDK reference is a semantic version (e.g., v1.0.0, 1.0.0)
unless the allow_unreleased_branches flag is set.
Environment variables:
- SDK_REF: The SDK reference to validate
- ALLOW_UNRELEASED_BRANCHES: If 'true', bypass semantic version validation
Exit codes:
- 0: Validation passed
- 1: Validation failed
"""
import os
import re
import sys
# Semantic version pattern: optional 'v' prefix, followed by MAJOR.MINOR.PATCH
# Optionally allows pre-release (-alpha.1, -beta.2, -rc.1) and build metadata
SEMVER_PATTERN = re.compile(
r"^v?" # Optional 'v' prefix
r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)" # MAJOR.MINOR.PATCH
r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)" # Pre-release
r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?" # More pre-release
r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$" # Build metadata
)
def is_semantic_version(ref: str) -> bool:
    """Check if the given reference is a valid semantic version.

    Args:
        ref: The reference string to validate

    Returns:
        True if the reference is a valid semantic version, False otherwise
    """
    return bool(SEMVER_PATTERN.match(ref))


def validate_sdk_ref(sdk_ref: str, allow_unreleased: bool) -> tuple[bool, str]:
    """Validate the SDK reference.

    Args:
        sdk_ref: The SDK reference to validate
        allow_unreleased: If True, bypass semantic version validation

    Returns:
        Tuple of (is_valid, message)
    """
    if allow_unreleased:
        return True, f"Allowing unreleased branch: {sdk_ref}"
    if is_semantic_version(sdk_ref):
        return True, f"Valid semantic version: {sdk_ref}"
    return False, (
        f"SDK reference '{sdk_ref}' is not a valid semantic version. "
        "Expected format: v1.0.0 or 1.0.0 (with optional pre-release like -alpha.1). "
        "To use unreleased branches, check 'Allow unreleased branches'."
    )
def main() -> None:
    sdk_ref = os.environ.get("SDK_REF", "")
    allow_unreleased_str = os.environ.get("ALLOW_UNRELEASED_BRANCHES", "false")
    if not sdk_ref:
        print("ERROR: SDK_REF environment variable is not set", file=sys.stderr)
        sys.exit(1)
    allow_unreleased = allow_unreleased_str.lower() == "true"
    is_valid, message = validate_sdk_ref(sdk_ref, allow_unreleased)
    if is_valid:
        print(f"{message}")
        sys.exit(0)
    else:
        print(f"{message}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
@@ -0,0 +1,611 @@
#!/usr/bin/env python3
"""REST API breakage detection for openhands-agent-server using oasdiff.
This script compares the current OpenAPI schema for the agent-server REST API against
an already-published release. The baseline version is selected from PyPI, but the
baseline schema is generated from the matching git tag under the current workspace's
locked dependency set. This keeps the comparison focused on API changes in our code,
not schema drift from newer FastAPI/Pydantic releases.
The deprecation note it recognizes intentionally matches the phrasing used by the
Python deprecation checks, for example:
Deprecated since v1.14.0 and scheduled for removal in v1.19.0.
Policies enforced:
1) REST deprecations must use FastAPI/OpenAPI metadata
- FastAPI route handlers must not use `openhands.sdk.utils.deprecation.deprecated`.
- Endpoints documented as deprecated in their OpenAPI description must also be
marked `deprecated: true` in the generated schema.
2) Deprecation runway before removal
- If a REST operation (path + HTTP method) is removed, it must have been marked
`deprecated: true` in the baseline release and its OpenAPI description must
declare a scheduled removal version that has been reached by the current
package version.
3) No in-place contract breakage
- Breaking REST contract changes that are not removals of previously-deprecated
operations fail the check. REST clients need 5 minor releases of runway, so
incompatible replacements must ship additively or behind a versioned contract
until the scheduled removal version.
If the baseline release schema can't be generated (e.g., missing tag / repo issues),
the script emits a warning and exits successfully to avoid flaky CI.
"""
from __future__ import annotations
import ast
import json
import re
import subprocess
import sys
import tempfile
import tomllib
import urllib.request
from pathlib import Path
from packaging import version as pkg_version
REPO_ROOT = Path(__file__).resolve().parents[2]
AGENT_SERVER_PYPROJECT = REPO_ROOT / "openhands-agent-server" / "pyproject.toml"
PYPI_DISTRIBUTION = "openhands-agent-server"
# Keep this in sync with REST_ROUTE_DEPRECATION_RE in check_deprecations.py so
# the REST breakage and deprecation checks recognize the same wording.
REST_ROUTE_DEPRECATION_RE = re.compile(
r"Deprecated since v(?P<deprecated>[0-9A-Za-z.+-]+)\s+"
r"and scheduled for removal in v(?P<removed>[0-9A-Za-z.+-]+)\.?",
re.IGNORECASE,
)
HTTP_METHODS = {
"get",
"put",
"post",
"delete",
"patch",
"options",
"head",
"trace",
}
ROUTE_DECORATOR_NAMES = HTTP_METHODS | {"api_route"}
OPENAPI_PROGRAM = """
import json
import sys
from pathlib import Path
source_tree = Path(sys.argv[1])
sys.path = [
str(source_tree / "openhands-agent-server"),
str(source_tree / "openhands-sdk"),
str(source_tree / "openhands-tools"),
str(source_tree / "openhands-workspace"),
] + sys.path
from openhands.agent_server.api import create_app
print(json.dumps(create_app().openapi()))
"""
def _read_version_from_pyproject(pyproject: Path) -> str:
data = tomllib.loads(pyproject.read_text())
try:
return str(data["project"]["version"])
except KeyError as exc: # pragma: no cover
raise SystemExit(
f"Unable to determine project version from {pyproject}"
) from exc
def _fetch_pypi_metadata(distribution: str) -> dict:
req = urllib.request.Request(
url=f"https://pypi.org/pypi/{distribution}/json",
headers={"User-Agent": "openhands-agent-server-openapi-check/1.0"},
method="GET",
)
with urllib.request.urlopen(req, timeout=10) as response:
return json.load(response)
def _get_baseline_version(distribution: str, current: str) -> str | None:
    try:
        meta = _fetch_pypi_metadata(distribution)
    except Exception as exc:  # pragma: no cover
        print(
            f"::warning title={distribution} REST API::Failed to fetch PyPI metadata: "
            f"{exc}"
        )
        return None
    releases = list(meta.get("releases", {}).keys())
    if not releases:
        return None
    # Prefer the current version when it is already published on PyPI.
    if current in releases:
        return current
    # Otherwise fall back to the newest release older than the current version.
    current_parsed = pkg_version.parse(current)
    older = [rv for rv in releases if pkg_version.parse(rv) < current_parsed]
    if not older:
        return None
    return max(older, key=pkg_version.parse)
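The baseline selection above can be sketched without network access. This is a minimal illustration, not the script's implementation: `pick_baseline` and `simple_key` are hypothetical names, and `simple_key` is a simplified stand-in for `packaging.version.parse` that only understands dotted integers such as `1.10.0`.

```python
from typing import List, Optional, Tuple


def simple_key(version: str) -> Tuple[int, ...]:
    # Hypothetical stand-in for packaging.version.parse; dotted integers only.
    return tuple(int(part) for part in version.split("."))


def pick_baseline(releases: List[str], current: str) -> Optional[str]:
    """Return the release to diff against, or None if there is none."""
    if not releases:
        return None
    # An already-published current version is its own baseline.
    if current in releases:
        return current
    current_key = simple_key(current)
    older = [rv for rv in releases if simple_key(rv) < current_key]
    if not older:
        return None
    # Newest release that is still older than the current version.
    return max(older, key=simple_key)


print(pick_baseline(["1.9.0", "1.10.0", "1.12.0"], "1.11.0"))
```

Comparing parsed versions rather than raw strings matters here: as plain strings, `"1.9.0"` sorts after `"1.10.0"`.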
def _generate_openapi_from_source_tree(source_tree: Path, label: str) -> dict | None:
try:
result = subprocess.run(
[sys.executable, "-c", OPENAPI_PROGRAM, str(source_tree)],
check=True,
capture_output=True,
text=True,
cwd=source_tree,
)
return json.loads(result.stdout)
except subprocess.CalledProcessError as exc:
output = (exc.stdout or "") + ("\n" + exc.stderr if exc.stderr else "")
excerpt = output.strip()[-1000:]
print(
f"::warning title={PYPI_DISTRIBUTION} REST API::Failed to generate "
f"OpenAPI schema for {label}: {exc}\n{excerpt}"
)
return None
except Exception as exc:
print(
f"::warning title={PYPI_DISTRIBUTION} REST API::Failed to generate "
f"OpenAPI schema for {label}: {exc}"
)
return None
def _generate_current_openapi() -> dict | None:
return _generate_openapi_from_source_tree(REPO_ROOT, "current workspace")
def _generate_openapi_for_git_ref(git_ref: str) -> dict | None:
with tempfile.TemporaryDirectory(prefix="agent-server-openapi-") as tmp:
source_tree = Path(tmp)
try:
archive = subprocess.run(
["git", "-C", str(REPO_ROOT), "archive", git_ref],
check=True,
capture_output=True,
)
subprocess.run(
["tar", "-x", "-C", str(source_tree)],
check=True,
input=archive.stdout,
capture_output=True,
)
except subprocess.CalledProcessError as exc:
output = (exc.stdout or b"") + (b"\n" + exc.stderr if exc.stderr else b"")
excerpt = output.decode(errors="replace").strip()[-1000:]
print(
f"::warning title={PYPI_DISTRIBUTION} REST API::Failed to extract "
f"source for {git_ref}: {exc}\n{excerpt}"
)
return None
return _generate_openapi_from_source_tree(source_tree, git_ref)
def _dotted_name(node: ast.AST) -> str | None:
if isinstance(node, ast.Name):
return node.id
if isinstance(node, ast.Attribute):
prefix = _dotted_name(node.value)
if prefix is None:
return None
return f"{prefix}.{node.attr}"
return None
def _find_sdk_deprecated_fastapi_routes_in_file(
file_path: Path, repo_root: Path
) -> list[str]:
tree = ast.parse(file_path.read_text(), filename=str(file_path))
deprecated_names: set[str] = set()
deprecation_module_names: set[str] = set()
for node in tree.body:
if isinstance(node, ast.ImportFrom):
if node.module == "openhands.sdk.utils.deprecation":
for alias in node.names:
if alias.name == "deprecated":
deprecated_names.add(alias.asname or alias.name)
elif node.module == "openhands.sdk.utils":
for alias in node.names:
if alias.name == "deprecation":
deprecation_module_names.add(alias.asname or alias.name)
elif isinstance(node, ast.Import):
for alias in node.names:
if alias.name == "openhands.sdk.utils.deprecation":
deprecation_module_names.add(alias.asname or alias.name)
errors: list[str] = []
for node in ast.walk(tree):
if not isinstance(node, ast.FunctionDef | ast.AsyncFunctionDef):
continue
has_route_decorator = False
uses_sdk_deprecated = False
for decorator in node.decorator_list:
if not isinstance(decorator, ast.Call):
continue
dotted_name = _dotted_name(decorator.func)
if (
isinstance(decorator.func, ast.Attribute)
and decorator.func.attr in ROUTE_DECORATOR_NAMES
):
has_route_decorator = True
if dotted_name in deprecated_names or (
dotted_name == "openhands.sdk.utils.deprecation.deprecated"
):
uses_sdk_deprecated = True
continue
if (
isinstance(decorator.func, ast.Attribute)
and decorator.func.attr == "deprecated"
):
base_name = _dotted_name(decorator.func.value)
if base_name in deprecation_module_names or (
base_name == "openhands.sdk.utils.deprecation"
):
uses_sdk_deprecated = True
if has_route_decorator and uses_sdk_deprecated:
rel_path = file_path.relative_to(repo_root)
errors.append(
f"{rel_path}:{node.lineno} FastAPI route `{node.name}` uses "
"openhands.sdk.utils.deprecation.deprecated; use the route "
"decorator's deprecated=True flag instead."
)
return errors
def _find_sdk_deprecated_fastapi_routes(repo_root: Path) -> list[str]:
app_root = repo_root / "openhands-agent-server" / "openhands" / "agent_server"
errors: list[str] = []
for file_path in sorted(app_root.rglob("*.py")):
errors.extend(_find_sdk_deprecated_fastapi_routes_in_file(file_path, repo_root))
return errors
def _find_deprecation_policy_errors(schema: dict) -> list[str]:
errors: list[str] = []
for path, path_item in schema.get("paths", {}).items():
if not isinstance(path_item, dict):
continue
for method, operation in path_item.items():
if method not in HTTP_METHODS or not isinstance(operation, dict):
continue
description = operation.get("description") or ""
if "deprecated since" not in description.lower():
continue
if operation.get("deprecated") is True:
continue
errors.append(
f"{method.upper()} {path} documents deprecation in its "
"description but is not marked deprecated=true in OpenAPI."
)
return errors
def _parse_openapi_deprecation_description(
    description: str | None,
) -> tuple[str, str] | None:
    """Extract ``(deprecated_in, removed_in)`` from an OpenAPI description.

    The accepted wording intentionally matches ``check_deprecations.py`` so both
    CI checks recognize the same note, for example:

        Deprecated since v1.14.0 and scheduled for removal in v1.19.0.
    """
    if not description:
        return None
    # Collapse whitespace so a note wrapped across lines still matches.
    match = REST_ROUTE_DEPRECATION_RE.search(" ".join(description.split()))
    if match is None:
        return None
    # The version character class admits ".", so strip a sentence-ending period.
    return match.group("deprecated").rstrip("."), match.group("removed").rstrip(".")
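To see how the deprecation note is recognized, the sketch below reuses the regular expression from this script on a description that wraps across lines. The function name `parse_note` and the sample text are illustrative assumptions; only the pattern itself comes from the source.

```python
import re

# Same pattern as REST_ROUTE_DEPRECATION_RE in the script above.
REST_ROUTE_DEPRECATION_RE = re.compile(
    r"Deprecated since v(?P<deprecated>[0-9A-Za-z.+-]+)\s+"
    r"and scheduled for removal in v(?P<removed>[0-9A-Za-z.+-]+)\.?",
    re.IGNORECASE,
)


def parse_note(description):
    """Return (deprecated_in, removed_in), or None when no note is found."""
    if not description:
        return None
    # Collapse newlines and indentation into single spaces before matching.
    match = REST_ROUTE_DEPRECATION_RE.search(" ".join(description.split()))
    if match is None:
        return None
    # The character class allows ".", so a trailing period is stripped here.
    return match.group("deprecated").rstrip("."), match.group("removed").rstrip(".")


wrapped = """List items.

    Deprecated since v1.14.0 and scheduled for
    removal in v1.19.0.
"""
print(parse_note(wrapped))
```

The `rstrip(".")` step is needed because the version character class includes `.`, so the sentence-ending period is captured as part of the second group.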
def _version_ge(current: str, target: str) -> bool:
try:
return pkg_version.parse(current) >= pkg_version.parse(target)
except pkg_version.InvalidVersion as exc:
raise SystemExit(
f"Invalid semantic version comparison: {current=} {target=}"
) from exc
def _get_openapi_operation(schema: dict, path: str, method: str) -> dict | None:
path_item = schema.get("paths", {}).get(path)
if not isinstance(path_item, dict):
return None
operation = path_item.get(method.lower())
if not isinstance(operation, dict):
return None
return operation
def _validate_removed_operations(
removed_operations: list[dict],
prev_schema: dict,
current_version: str,
) -> list[str]:
"""Validate removed operations against the baseline deprecation metadata."""
errors: list[str] = []
for operation in removed_operations:
path = str(operation.get("path", ""))
method = str(operation.get("method", "")).lower()
method_label = method.upper() or "<unknown method>"
if not operation.get("deprecated", False):
errors.append(
f"Removed {method_label} {path} without prior deprecation "
"(deprecated=true)."
)
continue
baseline_operation = _get_openapi_operation(prev_schema, path, method)
if baseline_operation is None:
errors.append(
f"Removed {method_label} {path} was marked deprecated in the "
"baseline release, but the previous OpenAPI schema could not be "
"inspected for its scheduled removal version."
)
continue
deprecation_details = _parse_openapi_deprecation_description(
baseline_operation.get("description")
)
if deprecation_details is None:
errors.append(
f"Removed {method_label} {path} was marked deprecated in the "
"baseline release, but its OpenAPI description does not declare "
"a scheduled removal version. REST API removals require 5 minor "
"releases of deprecation runway."
)
continue
_, removed_in = deprecation_details
if not _version_ge(current_version, removed_in):
errors.append(
f"Removed {method_label} {path} before its scheduled removal "
f"version v{removed_in} (current version: v{current_version}). "
"REST API removals require 5 minor releases of deprecation "
"runway."
)
continue
print(
f"::notice title={PYPI_DISTRIBUTION} REST API::Removed previously-"
f"deprecated {method_label} {path} after its scheduled removal "
f"version v{removed_in}."
)
return errors
def _split_breaking_changes(
breaking_changes: list[dict],
) -> tuple[list[dict], list[dict]]:
"""Split oasdiff results into removals and all other breakages."""
removed_operations: list[dict] = []
other_breaking_changes: list[dict] = []
for change in breaking_changes:
change_id = str(change.get("id", ""))
details = change.get("details", {})
if "removed" in change_id.lower() and "operation" in change_id.lower():
removed_operations.append(
{
"path": details.get("path", ""),
"method": details.get("method", ""),
"deprecated": details.get("deprecated", False),
}
)
continue
other_breaking_changes.append(change)
return removed_operations, other_breaking_changes
def _normalize_openapi_for_oasdiff(schema: dict) -> dict:
    """Normalize OpenAPI 3.1 schema for oasdiff compatibility.

    oasdiff expects OpenAPI 3.0-style exclusiveMinimum/exclusiveMaximum booleans
    (https://spec.openapis.org/oas/v3.0.3.html#schema-object), while OpenAPI 3.1
    emits numeric values. Convert numeric exclusives into minimum/maximum +
    exclusive boolean flags so oasdiff can parse the schema.

    Mutates the schema in place and returns it for convenience.
    """

    def _walk(node: object) -> None:
        if isinstance(node, dict):
            if (
                "exclusiveMinimum" in node
                and isinstance(node["exclusiveMinimum"], (int, float))
                and not isinstance(node["exclusiveMinimum"], bool)
            ):
                value = node["exclusiveMinimum"]
                if "minimum" not in node:
                    node["minimum"] = value
                node["exclusiveMinimum"] = True
            if (
                "exclusiveMaximum" in node
                and isinstance(node["exclusiveMaximum"], (int, float))
                and not isinstance(node["exclusiveMaximum"], bool)
            ):
                value = node["exclusiveMaximum"]
                if "maximum" not in node:
                    node["maximum"] = value
                node["exclusiveMaximum"] = True
            for child in node.values():
                _walk(child)
        elif isinstance(node, list):
            for child in node:
                _walk(child)

    _walk(schema)
    return schema
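The effect of this normalization is easiest to see on a small schema fragment. The sketch below is a compact re-implementation for illustration only: the function name `normalize` and the sample schema are assumptions, not part of the script.

```python
def normalize(node):
    """Convert numeric exclusiveMinimum/exclusiveMaximum to 3.0-style booleans."""
    if isinstance(node, dict):
        for bound, exclusive in (
            ("minimum", "exclusiveMinimum"),
            ("maximum", "exclusiveMaximum"),
        ):
            value = node.get(exclusive)
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                # Keep an existing minimum/maximum; otherwise reuse the bound.
                node.setdefault(bound, value)
                node[exclusive] = True
        for child in node.values():
            normalize(child)
    elif isinstance(node, list):
        for child in node:
            normalize(child)
    return node


schema = {
    "properties": {"count": {"type": "integer", "exclusiveMinimum": 0}},
    "anyOf": [{"exclusiveMaximum": 10.5, "maximum": 20}],
}
print(normalize(schema))
```

As in the original, an existing `minimum` or `maximum` is left untouched, and values that are already booleans are not rewritten.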
def _run_oasdiff_breakage_check(
prev_spec: Path, cur_spec: Path
) -> tuple[list[dict], int]:
"""Run oasdiff breaking check between two OpenAPI specs.
Returns (list of breaking changes, exit code from oasdiff).
"""
try:
result = subprocess.run(
[
"oasdiff",
"breaking",
"-f",
"json",
"--fail-on",
"ERR",
str(prev_spec),
str(cur_spec),
],
capture_output=True,
text=True,
)
except FileNotFoundError:
print(
"::warning title=oasdiff not found::"
"Please install oasdiff: https://github.com/oasdiff/oasdiff"
)
return [], 0
breaking_changes = []
if result.stdout:
try:
breaking_changes = json.loads(result.stdout)
except json.JSONDecodeError:
pass
return breaking_changes, result.returncode
def main() -> int:
current_version = _read_version_from_pyproject(AGENT_SERVER_PYPROJECT)
baseline_version = _get_baseline_version(PYPI_DISTRIBUTION, current_version)
if baseline_version is None:
print(
f"::warning title={PYPI_DISTRIBUTION} REST API::Unable to find baseline "
f"version for {current_version}; skipping breakage checks."
)
return 0
baseline_git_ref = f"v{baseline_version}"
static_policy_errors = _find_sdk_deprecated_fastapi_routes(REPO_ROOT)
for error in static_policy_errors:
print(f"::error title={PYPI_DISTRIBUTION} REST API::{error}")
current_schema = _generate_current_openapi()
if current_schema is None:
return 1
deprecation_policy_errors = _find_deprecation_policy_errors(current_schema)
for error in deprecation_policy_errors:
print(f"::error title={PYPI_DISTRIBUTION} REST API::{error}")
prev_schema = _generate_openapi_for_git_ref(baseline_git_ref)
if prev_schema is None:
return 0 if not (static_policy_errors or deprecation_policy_errors) else 1
prev_schema = _normalize_openapi_for_oasdiff(prev_schema)
current_schema = _normalize_openapi_for_oasdiff(current_schema)
with tempfile.TemporaryDirectory(prefix="oasdiff-specs-") as tmp:
tmp_path = Path(tmp)
prev_spec_file = tmp_path / "prev_spec.json"
cur_spec_file = tmp_path / "cur_spec.json"
prev_spec_file.write_text(json.dumps(prev_schema, indent=2))
cur_spec_file.write_text(json.dumps(current_schema, indent=2))
breaking_changes, exit_code = _run_oasdiff_breakage_check(
prev_spec_file, cur_spec_file
)
if not breaking_changes:
if exit_code == 0:
print("No breaking changes detected.")
else:
print(
f"oasdiff returned exit code {exit_code} but no breaking changes "
"in JSON format. There may be warnings only."
)
else:
removed_operations, other_breaking_changes = _split_breaking_changes(
breaking_changes
)
removal_errors = _validate_removed_operations(
removed_operations,
prev_schema,
current_version,
)
for error in removal_errors:
print(f"::error title={PYPI_DISTRIBUTION} REST API::{error}")
if other_breaking_changes:
print(
"::error "
f"title={PYPI_DISTRIBUTION} REST API::Detected breaking REST API "
"changes other than removing previously-deprecated operations. "
"REST contract changes must preserve compatibility for 5 minor "
"releases; keep the old contract available until its scheduled "
"removal version."
)
print("\nBreaking REST API changes detected compared to baseline release:")
for text in breaking_changes:
print(f"- {text.get('text', str(text))}")
if not (removal_errors or other_breaking_changes):
print(
"Breaking changes are limited to previously-deprecated operations "
"whose scheduled removal versions have been reached."
)
else:
return 1
return 1 if (static_policy_errors or deprecation_policy_errors) else 0
if __name__ == "__main__":
raise SystemExit(main())
@@ -0,0 +1,592 @@
#!/usr/bin/env python3
"""Static analysis for deprecation deadlines.

This script scans Python deprecation metadata (`deprecated`, `warn_deprecated`,
`warn_cleanup`) and agent-server REST routes marked `deprecated=True`. If the
current project version has reached or passed a feature's removal marker, the
script fails with a helpful summary so legacy shims and overdue deprecated REST
endpoints are cleaned up before release.
"""
from __future__ import annotations
import ast
import re
import sys
import tomllib
from collections.abc import Iterable, Iterator, Sequence
from dataclasses import dataclass
from datetime import date
from pathlib import Path
from typing import Literal
from packaging import version as pkg_version
REST_ROUTE_DEPRECATION_RE = re.compile(
r"Deprecated since v(?P<deprecated>[0-9A-Za-z.+-]+)\s+"
r"and scheduled for removal in v(?P<removed>[0-9A-Za-z.+-]+)\.?",
re.IGNORECASE,
)
ROUTE_DECORATOR_NAMES = {
"get",
"put",
"post",
"delete",
"patch",
"options",
"head",
"trace",
"api_route",
}
HTTP_METHODS = ROUTE_DECORATOR_NAMES - {"api_route"}
REPO_ROOT = Path(__file__).resolve().parents[2]
@dataclass(frozen=True, slots=True)
class PackageConfig:
name: str
pyproject: Path
source_roots: tuple[Path, ...]
PACKAGES: tuple[PackageConfig, ...] = (
PackageConfig(
name="openhands-sdk",
pyproject=REPO_ROOT / "openhands-sdk" / "pyproject.toml",
source_roots=(REPO_ROOT / "openhands-sdk" / "openhands" / "sdk",),
),
PackageConfig(
name="openhands-tools",
pyproject=REPO_ROOT / "openhands-tools" / "pyproject.toml",
source_roots=(REPO_ROOT / "openhands-tools" / "openhands" / "tools",),
),
PackageConfig(
name="openhands-workspace",
pyproject=REPO_ROOT / "openhands-workspace" / "pyproject.toml",
source_roots=(REPO_ROOT / "openhands-workspace" / "openhands" / "workspace",),
),
PackageConfig(
name="openhands-agent-server",
pyproject=REPO_ROOT / "openhands-agent-server" / "pyproject.toml",
source_roots=(
REPO_ROOT / "openhands-agent-server" / "openhands" / "agent_server",
),
),
)
@dataclass(slots=True)
class DeprecationRecord:
identifier: str
removed_in: str | date | None
deprecated_in: str | None
path: Path
line: int
kind: Literal["decorator", "warn_call", "cleanup_call", "rest_route"]
package: str
def _load_current_version(pyproject: Path) -> str:
data = tomllib.loads(pyproject.read_text())
try:
return str(data["project"]["version"])
except KeyError as exc: # pragma: no cover - configuration error
raise SystemExit(
f"Unable to determine project version from {pyproject}"
) from exc
def _iter_python_files(root: Path) -> Iterator[Path]:
for path in root.rglob("*.py"):
if path.name == "__init__.py" and path.parent == root:
continue
yield path
def _parse_removed_value(
    node: ast.AST | None,
    *,
    path: Path,
    line: int,
) -> str | date | None:
    if node is None:
        return None
    expression = ast.unparse(node)
    if isinstance(node, ast.Constant):
        if isinstance(node.value, str):
            return node.value
        if node.value is None:
            return None
        raise SystemExit(
            f"Unsupported removed_in literal at {path}:{line}: {expression}"
        )
    if isinstance(node, ast.Call):
        func = node.func
        if isinstance(func, ast.Name) and func.id == "date":
            try:
                args = [_safe_int_literal(arg) for arg in node.args]
                kwargs = {
                    kw.arg: _safe_int_literal(kw.value)
                    for kw in node.keywords
                    if kw.arg is not None
                }
            except ValueError as exc:
                raise SystemExit(
                    f"Unsupported removed_in date() arguments at {path}:{line}:"
                    f" {expression}"
                ) from exc
            if any(kw.arg is None for kw in node.keywords):
                raise SystemExit(
                    "Unsupported removed_in date() call (uses **kwargs) at "
                    f"{path}:{line}: {expression}"
                )
            try:
                return date(*args, **kwargs)
            except (TypeError, ValueError) as exc:
                # TypeError: wrong argument count or names.
                # ValueError: out-of-range values such as month=13.
                raise SystemExit(
                    f"Invalid removed_in date() call at {path}:{line}: {expression}"
                ) from exc
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "date"
            and func.attr == "today"
        ):
            if node.args or node.keywords:
                raise SystemExit(
                    "date.today() removed_in call must not include arguments at "
                    f"{path}:{line}: {expression}"
                )
            return date.today()
    raise SystemExit(
        f"Unsupported removed_in expression at {path}:{line}: {expression}"
    )
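The `date(...)` branch above evaluates a call from its syntax tree without executing any source code. A standalone sketch of that idea follows; `parse_date_literal` is a hypothetical helper that raises `ValueError` where the script raises `SystemExit`, and it is stricter than the script in that it also rejects boolean arguments.

```python
import ast
from datetime import date


def parse_date_literal(expr: str) -> date:
    """Build a date from source text like 'date(2026, 1, 31)' without eval()."""
    node = ast.parse(expr, mode="eval").body
    if not (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "date"
    ):
        raise ValueError(f"not a date(...) call: {expr}")

    def as_int(arg: ast.AST) -> int:
        # Accept integer literals only; names and expressions are rejected.
        if isinstance(arg, ast.Constant) and type(arg.value) is int:
            return arg.value
        raise ValueError(f"non-literal argument: {ast.unparse(arg)}")

    args = [as_int(arg) for arg in node.args]
    kwargs = {}
    for kw in node.keywords:
        if kw.arg is None:  # a **mapping argument
            raise ValueError("**kwargs is not supported")
        kwargs[kw.arg] = as_int(kw.value)
    try:
        # date() itself raises ValueError for out-of-range values.
        return date(*args, **kwargs)
    except TypeError as exc:
        raise ValueError(f"invalid date() call: {expr}") from exc


print(parse_date_literal("date(2026, month=2, day=1)"))
```

Restricting arguments to integer literals keeps the check purely static: nothing from the scanned file is ever executed.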
def _parse_deprecated_value(
node: ast.AST | None,
*,
path: Path,
line: int,
) -> str | None:
if node is None:
return None
expression = ast.unparse(node)
if isinstance(node, ast.Constant):
if isinstance(node.value, str):
return node.value
if node.value is None:
return None
raise SystemExit(
f"Unsupported deprecated_in expression at {path}:{line}: {expression}"
)
def _safe_int_literal(node: ast.AST) -> int:
if not isinstance(node, ast.Constant) or not isinstance(node.value, int):
raise ValueError(
f"Unsupported expression inside literal evaluation: {ast.unparse(node)}"
)
return node.value
def _extract_kw(call: ast.Call, name: str) -> ast.AST | None:
for kw in call.keywords:
if kw.arg == name:
return kw.value
return None
def _extract_string_literal(node: ast.AST | None) -> str | None:
if isinstance(node, ast.Constant) and isinstance(node.value, str):
return node.value
return None
def _extract_string_sequence(node: ast.AST | None) -> tuple[str, ...] | None:
if not isinstance(node, (ast.List, ast.Tuple, ast.Set)):
return None
values: list[str] = []
for item in node.elts:
value = _extract_string_literal(item)
if value is None:
return None
values.append(value)
return tuple(values)
def _extract_route_details(call: ast.Call) -> tuple[tuple[str, str], ...]:
target = call.func
if not isinstance(target, ast.Attribute):
return ()
decorator_name = target.attr
if decorator_name not in ROUTE_DECORATOR_NAMES:
return ()
path = _extract_string_literal(call.args[0] if call.args else None)
if path is None:
path = _extract_string_literal(_extract_kw(call, "path"))
if path is None:
return ()
if decorator_name in HTTP_METHODS:
return ((decorator_name.upper(), path),)
methods = _extract_string_sequence(_extract_kw(call, "methods"))
if methods is None:
return (("GET", path),)
return tuple(
(method.upper(), path) for method in methods if method.lower() in HTTP_METHODS
)
def _parse_rest_route_deprecation_docstring(
docstring: str | None,
*,
path: Path,
line: int,
route_identifiers: Sequence[str],
) -> tuple[str, str]:
if not docstring:
raise SystemExit(
"Deprecated REST route(s) "
f"{', '.join(route_identifiers)} at {path}:{line} must include a "
"docstring note like 'Deprecated since vX.Y.Z and scheduled for "
"removal in vA.B.C.'"
)
match = REST_ROUTE_DEPRECATION_RE.search(" ".join(docstring.split()))
if match is None:
raise SystemExit(
"Deprecated REST route(s) "
f"{', '.join(route_identifiers)} at {path}:{line} must include a "
"docstring note like 'Deprecated since vX.Y.Z and scheduled for "
"removal in vA.B.C.'"
)
return match.group("deprecated").rstrip("."), match.group("removed").rstrip(".")
def _gather_rest_route_deprecations(
tree: ast.AST, path: Path, *, package: str
) -> Iterator[DeprecationRecord]:
for node in ast.walk(tree):
if not isinstance(node, ast.FunctionDef | ast.AsyncFunctionDef):
continue
routes: list[tuple[str, str]] = []
for deco in node.decorator_list:
if not isinstance(deco, ast.Call):
continue
deprecated_value = _extract_kw(deco, "deprecated")
if (
not isinstance(deprecated_value, ast.Constant)
or deprecated_value.value is not True
):
continue
routes.extend(_extract_route_details(deco))
if not routes:
continue
deprecated_in, removed_in = _parse_rest_route_deprecation_docstring(
ast.get_docstring(node),
path=path,
line=node.lineno,
route_identifiers=[
f"{method} {route_path}" for method, route_path in routes
],
)
for method, route_path in routes:
yield DeprecationRecord(
identifier=f"{method} {route_path}",
removed_in=removed_in,
deprecated_in=deprecated_in,
path=path,
line=node.lineno,
kind="rest_route",
package=package,
)
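The route scan above can be tried on a small in-memory module. The sketch below is a simplified illustration under stated assumptions: `deprecated_routes`, the `router` object, and the sample endpoints are hypothetical, and it handles only method decorators with a positional path, not `api_route` or a `path=` keyword.

```python
import ast

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def deprecated_routes(source: str):
    """Return (METHOD, path) pairs for decorators that pass deprecated=True."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for deco in node.decorator_list:
            if not isinstance(deco, ast.Call):
                continue
            if not isinstance(deco.func, ast.Attribute):
                continue
            if deco.func.attr not in HTTP_METHODS:
                continue
            # Only a literal True counts, mirroring the check above.
            flagged = any(
                kw.arg == "deprecated"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
                for kw in deco.keywords
            )
            first = deco.args[0] if deco.args else None
            if (
                flagged
                and isinstance(first, ast.Constant)
                and isinstance(first.value, str)
            ):
                found.append((deco.func.attr.upper(), first.value))
    return found


SAMPLE = '''
@router.get("/old", deprecated=True)
async def old_endpoint():
    """Deprecated since v1.14.0 and scheduled for removal in v1.19.0."""

@router.post("/new")
async def new_endpoint():
    pass
'''
print(deprecated_routes(SAMPLE))
```

Because the source is only parsed, never imported, the undefined `router` name in the sample does not matter.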
def _gather_decorators(
tree: ast.AST, path: Path, *, package: str
) -> Iterator[DeprecationRecord]:
for node in ast.walk(tree):
if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
continue
for deco in node.decorator_list:
call = deco if isinstance(deco, ast.Call) else None
if call is None:
continue
target = call.func
if isinstance(target, ast.Name):
decorator_name = target.id
elif isinstance(target, ast.Attribute):
decorator_name = target.attr
else:
continue
if decorator_name != "deprecated":
continue
removed_expr = _extract_kw(call, "removed_in")
deprecated_expr = _extract_kw(call, "deprecated_in")
record = DeprecationRecord(
identifier=_build_identifier(node),
removed_in=_parse_removed_value(
removed_expr, path=path, line=node.lineno
),
deprecated_in=_parse_deprecated_value(
deprecated_expr, path=path, line=node.lineno
),
path=path,
line=node.lineno,
kind="decorator",
package=package,
)
yield record
def _gather_warn_calls(
tree: ast.AST, path: Path, *, package: str
) -> Iterator[DeprecationRecord]:
for node in ast.walk(tree):
if not isinstance(node, ast.Call):
continue
target = node.func
if isinstance(target, ast.Name):
func_name = target.id
elif isinstance(target, ast.Attribute):
func_name = target.attr
else:
continue
if func_name == "warn_deprecated":
identifier_node = node.args[0] if node.args else None
if identifier_node is None:
continue
identifier = ast.unparse(identifier_node)
removed_expr = _extract_kw(node, "removed_in")
deprecated_expr = _extract_kw(node, "deprecated_in")
yield DeprecationRecord(
identifier=identifier,
removed_in=_parse_removed_value(
removed_expr, path=path, line=node.lineno
),
deprecated_in=_parse_deprecated_value(
deprecated_expr, path=path, line=node.lineno
),
path=path,
line=node.lineno,
kind="warn_call",
package=package,
)
elif func_name == "warn_cleanup":
identifier_node = node.args[0] if node.args else None
if identifier_node is None:
continue
identifier = ast.unparse(identifier_node)
cleanup_expr = _extract_kw(node, "cleanup_by")
yield DeprecationRecord(
identifier=identifier,
removed_in=_parse_removed_value(
cleanup_expr, path=path, line=node.lineno
),
deprecated_in=None,
path=path,
line=node.lineno,
kind="cleanup_call",
package=package,
)
def _build_identifier(node: ast.AST) -> str:
if isinstance(node, ast.ClassDef):
return node.name
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
qual_name = node.name
if node.decorator_list:
parent = getattr(node, "parent", None)
if parent and isinstance(parent, ast.ClassDef):
return f"{parent.name}.{node.name}"
return qual_name
return "<unknown>"
def _attach_parents(tree: ast.AST) -> None:
for node in ast.walk(tree):
for child in ast.iter_child_nodes(node):
setattr(child, "parent", node)
def _collect_records(files: Iterable[Path], *, package: str) -> list[DeprecationRecord]:
records: list[DeprecationRecord] = []
for path in files:
tree = ast.parse(path.read_text())
_attach_parents(tree)
records.extend(_gather_decorators(tree, path, package=package))
records.extend(_gather_warn_calls(tree, path, package=package))
return records
def _collect_rest_route_records(
files: Iterable[Path], *, package: str
) -> list[DeprecationRecord]:
records: list[DeprecationRecord] = []
for path in files:
tree = ast.parse(path.read_text())
records.extend(_gather_rest_route_deprecations(tree, path, package=package))
return records
def _version_ge(current: str, target: str) -> bool:
try:
return pkg_version.parse(current) >= pkg_version.parse(target)
except pkg_version.InvalidVersion as exc:
raise SystemExit(
f"Invalid semantic version comparison: {current=} {target=}"
) from exc
def _should_fail(current_version: str, record: DeprecationRecord) -> bool:
removed = record.removed_in
if removed is None:
return False
if isinstance(removed, date):
return date.today() >= removed
try:
target = str(removed)
return _version_ge(current_version, target)
except SystemExit:
raise
except Exception as exc: # pragma: no cover - unexpected literal type
raise SystemExit(
f"Unsupported removed_in expression in {record.path}:{record.line}:"
f" {removed!r}"
) from exc
def _format_record(record: DeprecationRecord) -> str:
location = record.path.relative_to(REPO_ROOT)
removed = record.removed_in if record.removed_in is not None else "(none)"
if record.kind == "cleanup_call":
return (
f"- [{record.package}] {record.identifier} ({record.kind})\n"
f" cleanup by: {removed}\n"
f" defined at: {location}:{record.line}"
)
deprecated = (
record.deprecated_in if record.deprecated_in is not None else "(unknown)"
)
return (
f"- [{record.package}] {record.identifier} ({record.kind})\n"
f" deprecated in: {deprecated}\n"
f" removed in: {removed}\n"
f" defined at: {location}:{record.line}"
)
def main(argv: Sequence[str] | None = None) -> int:
argv = list(argv or [])
overdue: list[DeprecationRecord] = []
total_records = 0
package_summaries: list[tuple[str, str, int]] = []
for package in PACKAGES:
if not package.pyproject.exists():
raise SystemExit(
f"Unable to locate pyproject.toml for {package.name}: "
f"{package.pyproject}"
)
current_version = _load_current_version(package.pyproject)
files: list[Path] = []
for root in package.source_roots:
if not root.exists():
raise SystemExit(
f"Source root {root} for package {package.name} does not exist"
)
files.extend(_iter_python_files(root))
records = _collect_records(files, package=package.name)
if package.name == "openhands-agent-server":
records.extend(_collect_rest_route_records(files, package=package.name))
overdue.extend(r for r in records if _should_fail(current_version, r))
total_records += len(records)
package_summaries.append((package.name, current_version, len(records)))
if overdue:
deprecated_items = [r for r in overdue if r.kind != "cleanup_call"]
cleanup_items = [r for r in overdue if r.kind == "cleanup_call"]
if deprecated_items:
print(
"The following deprecated features have passed their removal "
"deadline:\n"
)
for record in deprecated_items:
print(_format_record(record))
print()
if cleanup_items:
print("The following workarounds have passed their cleanup deadline:\n")
for record in cleanup_items:
print(_format_record(record))
print()
if deprecated_items:
print(
"Update or remove the listed features before publishing a version that "
"meets or exceeds their removal deadline."
)
if cleanup_items:
print(
"Remove the listed workarounds before publishing a version that "
"meets or exceeds their cleanup deadline."
)
return 1
for package_name, version, count in package_summaries:
print(
f"{package_name}: checked {count} deprecation metadata entries against "
f"version {version}."
)
print(
f"Checked {total_records} deprecation metadata entries across "
f"{len(package_summaries)} package(s)."
)
return 0
if __name__ == "__main__": # pragma: no cover - manual invocation
sys.exit(main(sys.argv[1:]))
@@ -0,0 +1,297 @@
#!/usr/bin/env python3
"""Validate that docstrings conform to MDX-compatible formatting guidelines.

This script checks that docstrings in the SDK use patterns that render correctly
in Mintlify MDX documentation. It validates:

1. No REPL-style examples (>>>) - should use fenced code blocks instead
2. Shell/config examples use fenced code blocks (prevents # becoming headers)

Run with: python scripts/check_docstrings.py
Exit code 0 = all checks pass, 1 = violations found
"""
import ast
import sys
from dataclasses import dataclass
from pathlib import Path
# Directories to check
SDK_PATHS = [
"openhands-sdk/openhands/sdk",
]
# Files/directories to skip
SKIP_PATTERNS = [
"__pycache__",
".pyc",
"test_",
"_test.py",
]
# Core public API files to check strictly (these are documented on the website)
# Other files will be checked but only emit warnings, not failures
STRICT_CHECK_FILES = [
"agent/agent.py",
"llm/llm.py",
"conversation/conversation.py",
"tool/tool.py",
"workspace/base.py",
"observability/laminar.py",
]
@dataclass
class Violation:
"""A docstring formatting violation."""
file: Path
line: int
name: str
rule: str
message: str
is_strict: bool = False # True if this is in a strictly-checked file
def should_skip(path: Path) -> bool:
"""Check if a path should be skipped."""
path_str = str(path)
return any(pattern in path_str for pattern in SKIP_PATTERNS)
def check_repl_examples(
docstring: str, name: str, lineno: int, file: Path
) -> list[Violation]:
"""Check for REPL-style examples (>>>).
These should be replaced with fenced code blocks for better MDX rendering.
"""
violations = []
lines = docstring.split("\n")
for i, line in enumerate(lines):
stripped = line.strip()
if stripped.startswith(">>>"):
violations.append(
Violation(
file=file,
line=lineno + i,
name=name,
rule="no-repl-examples",
message=(
"Use fenced code blocks (```python) instead of >>> REPL style. "
"REPL examples don't render well in MDX documentation."
),
)
)
# Only report once per docstring
break
return violations
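The REPL check above can be exercised end to end on a small source string. This is a minimal sketch, not the script's code: `names_with_repl_examples` and the sample functions are hypothetical, and it reports names only rather than building `Violation` records.

```python
import ast


def names_with_repl_examples(source: str):
    """Return names of functions/classes whose docstring has a '>>>' line."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        ):
            continue
        docstring = ast.get_docstring(node) or ""
        # One hit per docstring is enough, as in the check above.
        if any(line.strip().startswith(">>>") for line in docstring.splitlines()):
            names.append(node.name)
    return names


SAMPLE = '''
def add(a, b):
    """Add two numbers.

    >>> add(1, 2)
    3
    """
    return a + b

def sub(a, b):
    """Subtract b from a."""
    return a - b
'''
print(names_with_repl_examples(SAMPLE))
```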
def check_unfenced_shell_config(
docstring: str, name: str, lineno: int, file: Path
) -> list[Violation]:
"""Check for shell/config examples that aren't in fenced code blocks.
Lines starting with # outside code blocks become markdown headers.
"""
violations = []
lines = docstring.split("\n")
in_code_block = False
for i, line in enumerate(lines):
stripped = line.strip()
# Track code block state
if stripped.startswith("```"):
in_code_block = not in_code_block
continue
# Skip if inside a code block
if in_code_block:
continue
# Skip "#word" lines (no space after the hash); only "# text" lines are
# considered below.
if stripped.startswith("#") and not stripped.startswith("# "):
continue
# Check for unfenced config: KEY=VALUE followed by # comment
if i > 0:
prev_line = lines[i - 1].strip()
# If previous line looks like config (VAR=value) and this is a # comment
if "=" in prev_line and prev_line.split("=")[0].isupper():
if stripped.startswith("# "):
violations.append(
Violation(
file=file,
line=lineno + i,
name=name,
rule="fenced-shell-config",
message=(
"Shell/config examples with # comments should be "
"in ```bash code blocks. Otherwise # becomes a "
"markdown header."
),
)
)
# Only report once per docstring
break
return violations
def check_docstring(
docstring: str, name: str, lineno: int, file: Path
) -> list[Violation]:
"""Run all checks on a docstring."""
if not docstring:
return []
violations = []
violations.extend(check_repl_examples(docstring, name, lineno, file))
violations.extend(check_unfenced_shell_config(docstring, name, lineno, file))
return violations
def get_docstrings_from_file(file: Path) -> list[tuple[str, str, int]]:
"""Extract all docstrings from a Python file.
Returns list of (name, docstring, lineno) tuples.
"""
try:
source = file.read_text()
tree = ast.parse(source)
except (SyntaxError, UnicodeDecodeError) as e:
print(f"Warning: Could not parse {file}: {e}", file=sys.stderr)
return []
docstrings = []
for node in ast.walk(tree):
name = None
lineno = 0
docstring = None
if isinstance(node, ast.Module):
docstring = ast.get_docstring(node)
name = file.stem
lineno = 1
elif isinstance(node, ast.ClassDef):
docstring = ast.get_docstring(node)
name = node.name
lineno = node.lineno
elif isinstance(node, ast.FunctionDef | ast.AsyncFunctionDef):
docstring = ast.get_docstring(node)
name = node.name
lineno = node.lineno
if docstring and name:
docstrings.append((name, docstring, lineno))
return docstrings
def is_strict_file(file: Path, repo_root: Path) -> bool:
"""Check if a file is in the strict check list."""
try:
rel_path = file.relative_to(repo_root / "openhands-sdk/openhands/sdk")
return any(str(rel_path) == strict for strict in STRICT_CHECK_FILES)
except ValueError:
return False
def check_file(file: Path, repo_root: Path) -> list[Violation]:
"""Check all docstrings in a file."""
violations = []
is_strict = is_strict_file(file, repo_root)
for name, docstring, lineno in get_docstrings_from_file(file):
file_violations = check_docstring(docstring, name, lineno, file)
for v in file_violations:
v.is_strict = is_strict
violations.extend(file_violations)
return violations
def main() -> int:
"""Run docstring checks on all SDK files."""
repo_root = Path(__file__).parent.parent.parent
all_violations: list[Violation] = []
files_checked = 0
for sdk_path in SDK_PATHS:
path = repo_root / sdk_path
if not path.exists():
print(f"Warning: Path not found: {path}", file=sys.stderr)
continue
for py_file in path.rglob("*.py"):
if should_skip(py_file):
continue
files_checked += 1
violations = check_file(py_file, repo_root)
all_violations.extend(violations)
# Separate strict violations (errors) from warnings
strict_violations = [v for v in all_violations if v.is_strict]
warning_violations = [v for v in all_violations if not v.is_strict]
# Report warnings (non-strict files)
if warning_violations:
count = len(warning_violations)
print(f"\n⚠️ Found {count} docstring warning(s) in non-core files:\n")
by_file: dict[Path, list[Violation]] = {}
for v in warning_violations:
by_file.setdefault(v.file, []).append(v)
for file, violations in sorted(by_file.items()):
rel_path = file.relative_to(repo_root)
print(f"📄 {rel_path}")
for v in violations:
print(f" Line {v.line}: {v.name} ({v.rule})")
print()
# Report errors (strict files)
if strict_violations:
count = len(strict_violations)
print(f"\n❌ Found {count} docstring error(s) in core API files:\n")
by_file = {}  # already annotated above; reset for strict violations
for v in strict_violations:
by_file.setdefault(v.file, []).append(v)
for file, violations in sorted(by_file.items()):
rel_path = file.relative_to(repo_root)
print(f"📄 {rel_path}")
for v in violations:
print(f" Line {v.line}: {v.name}")
print(f" Rule: {v.rule}")
print(f" {v.message}")
print()
print("=" * 60)
print("To fix these issues:")
print(" 1. Replace >>> examples with ```python code blocks")
print(" 2. Wrap shell/config examples in ```bash code blocks")
print("=" * 60)
return 1
if warning_violations:
count = len(warning_violations)
print(f"✅ Core API files pass. {count} warnings in other files.")
else:
print(f"✅ All {files_checked} files pass docstring checks")
return 0
if __name__ == "__main__":
sys.exit(main())
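The unfenced shell/config heuristic above can be tried in isolation. The following is a minimal sketch, assuming a simplified helper named `has_unfenced_shell_config` (a hypothetical name, not part of the script) that returns a boolean instead of building `Violation` objects:

```python
def has_unfenced_shell_config(docstring: str) -> bool:
    """Return True if a KEY=VALUE line is followed by a '# ' comment outside a fence."""
    lines = docstring.split("\n")
    in_code_block = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # A fence line toggles the state and is never inspected itself
        if stripped.startswith("```"):
            in_code_block = not in_code_block
            continue
        if in_code_block or i == 0:
            continue
        prev_line = lines[i - 1].strip()
        # Same test as the script: upper-case text before the first "="
        looks_like_config = "=" in prev_line and prev_line.split("=")[0].isupper()
        if looks_like_config and stripped.startswith("# "):
            return True
    return False


unfenced = "Set the key:\nAPI_KEY=abc\n# rotate monthly"
fenced = "Set the key:\n```bash\nAPI_KEY=abc\n# rotate monthly\n```"

print(has_unfenced_shell_config(unfenced))
print(has_unfenced_shell_config(fenced))
```

The first docstring is flagged because the `# ` line would render as a markdown header; the fenced one is skipped entirely.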
@@ -0,0 +1,209 @@
#!/usr/bin/env python3
"""
Check if all examples in agent-sdk are documented in the docs repository.
This script:
1. Scans the docs repository for references to example files
2. Lists all example Python files in the agent-sdk repository
3. Compares the two sets to find undocumented examples
4. Exits with error code 1 if undocumented examples are found
"""
import os
import re
import sys
from pathlib import Path
def find_documented_examples(docs_path: Path) -> set[str]:
"""
Find all example file references in the docs repository.
Searches for patterns like:
- examples/01_standalone_sdk/02_custom_tools.py
- examples/02_remote_agent_server/06_custom_tool/custom_tools/log_data.py
in MDX and Markdown files.
Returns:
Set of normalized example file paths (relative to agent-sdk root)
"""
documented_examples: set[str] = set()
# Pattern to match example file references with arbitrary nesting depth.
# Matches: examples/<dir>/.../<file>.py
pattern = r"examples/(?:[-\w]+/)+[-\w]+\.py"
for root, _, files in os.walk(docs_path):
for file in files:
if file.endswith((".mdx", ".md")):
file_path = Path(root) / file
try:
content = file_path.read_text(encoding="utf-8")
matches = re.findall(pattern, content)
for match in matches:
# Matches are already relative to the agent-sdk root; store as-is
documented_examples.add(match)
except Exception as e:
print(f"Warning: Error reading {file_path}: {e}")
continue
return documented_examples
def find_agent_sdk_examples(agent_sdk_path: Path) -> set[str]:
"""
Find all example Python files in the agent-sdk repository.
Excludes examples/03_github_workflows/ (those examples are YAML files; the
Python files there are just helpers), examples/04_llm_specific_tools/, and
__init__.py files.
Returns:
Set of example file paths (relative to agent-sdk root)
"""
examples: set[str] = set()
examples_dir = agent_sdk_path / "examples"
if not examples_dir.exists():
print(f"Error: Examples directory not found: {examples_dir}")
sys.exit(1)
# Find all Python files under examples/
for root, _, files in os.walk(examples_dir):
for file in files:
if file.endswith(".py"):
file_path = Path(root) / file
# Get relative path from agent-sdk root
relative_path = file_path.relative_to(agent_sdk_path)
# Use forward slashes on every platform so the prefix checks below match
relative_path_str = relative_path.as_posix()
# Skip GitHub workflow examples (those are YAML files, Python
# files there are just helpers)
if relative_path_str.startswith("examples/03_github_workflows/"):
continue
# Skip LLM-specific tools examples: these are intentionally not
# enforced by the docs check. See discussion in PR #1486.
if relative_path_str.startswith("examples/04_llm_specific_tools/"):
continue
# Skip __init__.py files as they typically don't need documentation
if file == "__init__.py":
continue
examples.add(relative_path_str)
return examples
def resolve_paths() -> tuple[Path, Path]:
"""
Determine agent-sdk root and docs path.
Priority for docs path:
1) DOCS_PATH (env override)
2) $GITHUB_WORKSPACE/docs
3) agent_sdk_root/'docs'
4) agent_sdk_root.parent/'docs'
Returns:
Tuple of (agent_sdk_root, docs_path)
"""
# agent-sdk repo root (script is at agent-sdk/.github/scripts/...)
script_file = Path(__file__).resolve()
agent_sdk_root = script_file.parent.parent.parent
candidates: list[Path] = []
# 1) Explicit env override
env_override = os.environ.get("DOCS_PATH")
if env_override:
candidates.append(Path(env_override).expanduser().resolve())
# 2) Standard GitHub workspace sibling
gh_ws = os.environ.get("GITHUB_WORKSPACE")
if gh_ws:
candidates.append(Path(gh_ws).resolve() / "docs")
# 3) docs directory inside the agent-sdk repo root
candidates.append(agent_sdk_root / "docs")
# 4) Parent-of-agent-sdk-root layout
candidates.append(agent_sdk_root.parent / "docs")
print(f"🔍 Agent SDK root: {agent_sdk_root}")
print("🔎 Trying docs paths (in order):")
for p in candidates:
print(f" - {p}")
for p in candidates:
if p.exists():
print(f"📁 Using docs path: {p}")
return agent_sdk_root, p
# If none exist, fail with a helpful message
print("❌ Docs path not found in any of the expected locations.")
print(" Set DOCS_PATH, or check out the docs repo at one of the paths tried above.")
sys.exit(1)
def main() -> None:
agent_sdk_root, docs_path = resolve_paths()
print("\n" + "=" * 60)
print("Checking documented examples...")
print("=" * 60)
# Find all examples in agent-sdk
print("\n📋 Scanning agent-sdk examples...")
agent_examples = find_agent_sdk_examples(agent_sdk_root)
print(f" Found {len(agent_examples)} example file(s)")
# Find all documented examples in docs
print("\n📄 Scanning docs repository...")
documented_examples = find_documented_examples(docs_path)
print(f" Found {len(documented_examples)} documented example(s)")
# Calculate difference
undocumented = agent_examples - documented_examples
print("\n" + "=" * 60)
if undocumented:
print(f"❌ Found {len(undocumented)} undocumented example(s):")
print("=" * 60)
for example in sorted(undocumented):
print(f" - {example}")
print("\n⚠️ Please add documentation for these examples in the docs repo.")
print("=" * 60)
print("\n📚 How to Document Examples:")
print("=" * 60)
print("1. Clone the docs repository:")
print(" git clone https://github.com/OpenHands/docs.git")
print()
print("2. Create a new .mdx file in sdk/guides/ directory")
print(" (e.g., sdk/guides/my-feature.mdx)")
print()
print("3. Add the example code block with this format:")
print(' ```python icon="python" expandable examples/path/to/file.py')
print(" <code will be auto-synced>")
print(" ```")
print()
print("4. See the format documentation at:")
print(
" https://github.com/OpenHands/docs/blob/main/.github/scripts/README.md"
)
print()
print("5. Example documentation files can be found in:")
print(" https://github.com/OpenHands/docs/tree/main/sdk/guides")
print()
print("6. After creating the PR in docs repo, reference it in your")
print(" agent-sdk PR description.")
print("=" * 60)
sys.exit(1)
else:
print("✅ All examples are documented!")
print("=" * 60)
sys.exit(0)
if __name__ == "__main__":
main()
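The core of the check above is a regex scan followed by a set difference. A minimal sketch, reusing the script's pattern; the MDX snippet and the file name `99_new_example.py` are made up for illustration:

```python
import re

# Same pattern as the script: examples/<dir>/.../<file>.py
pattern = r"examples/(?:[-\w]+/)+[-\w]+\.py"

# Hypothetical docs content in the code-block format the script recommends
mdx = (
    '```python icon="python" expandable '
    "examples/01_standalone_sdk/02_custom_tools.py\n"
    "<code will be auto-synced>\n"
    "```"
)

documented = set(re.findall(pattern, mdx))
on_disk = {
    "examples/01_standalone_sdk/02_custom_tools.py",
    "examples/01_standalone_sdk/99_new_example.py",  # hypothetical, undocumented
}
undocumented = on_disk - documented

print(sorted(undocumented))
```

Note that the pattern requires at least one directory below `examples/`, so a reference such as `examples/foo.py` would not be picked up.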
@@ -0,0 +1,104 @@
#!/usr/bin/env python3
"""
Check for duplicate example numbers in the examples directory.
This script ensures that within each examples subdirectory, no two files or
folders share the same numeric prefix (e.g., two files both starting with "04_").
Exit codes:
0 - No duplicates found
1 - Duplicates found
"""
import re
import sys
from collections import defaultdict
from pathlib import Path
def find_duplicate_numbers(examples_dir: Path) -> dict[str, list[str]]:
"""
Find duplicate example numbers within each subdirectory.
Returns:
Dictionary mapping subdirectory paths to lists of duplicate entries.
Only includes subdirectories that have duplicates.
"""
duplicates: dict[str, list[str]] = {}
# Pattern to extract leading number from filename/dirname
# e.g., "04" from "04_foo.py"
number_pattern = re.compile(r"^(\d+)_")
for subdir in sorted(examples_dir.iterdir()):
if not subdir.is_dir():
continue
# Skip hidden directories
if subdir.name.startswith("."):
continue
# Group entries by their numeric prefix
number_to_entries: dict[str, list[str]] = defaultdict(list)
for entry in subdir.iterdir():
# Skip hidden files/directories
if entry.name.startswith("."):
continue
match = number_pattern.match(entry.name)
if match:
number = match.group(1)
number_to_entries[number].append(entry.name)
# Find numbers with multiple entries
subdir_duplicates = []
for number, entries in sorted(number_to_entries.items()):
if len(entries) > 1:
subdir_duplicates.extend(sorted(entries))
if subdir_duplicates:
relative_subdir = str(subdir.relative_to(examples_dir.parent))
duplicates[relative_subdir] = subdir_duplicates
return duplicates
def main() -> None:
# Find the examples directory relative to this script
script_file = Path(__file__).resolve()
repo_root = script_file.parent.parent.parent
examples_dir = repo_root / "examples"
if not examples_dir.exists():
print(f"Error: Examples directory not found: {examples_dir}")
sys.exit(1)
print("=" * 60)
print("Checking for duplicate example numbers...")
print("=" * 60)
print(f"\n📁 Scanning: {examples_dir}\n")
duplicates = find_duplicate_numbers(examples_dir)
if duplicates:
print("❌ Found duplicate example numbers:\n")
for subdir, entries in sorted(duplicates.items()):
print(f" {subdir}/")
for entry in entries:
print(f" - {entry}")
print()
print("=" * 60)
print("⚠️ Please renumber the examples to remove duplicates.")
print(" Each example should have a unique number within its folder.")
print("=" * 60)
sys.exit(1)
else:
print("✅ No duplicate example numbers found!")
print("=" * 60)
sys.exit(0)
if __name__ == "__main__":
main()
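The grouping step can be illustrated without touching the filesystem. This is a sketch under the assumption that entry names are passed in as plain strings; `duplicate_prefixes` is a hypothetical helper, not a function from the script:

```python
import re
from collections import defaultdict

number_pattern = re.compile(r"^(\d+)_")


def duplicate_prefixes(names: list[str]) -> dict[str, list[str]]:
    """Map each numeric prefix used more than once to its sorted entries."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        match = number_pattern.match(name)
        if match:
            groups[match.group(1)].append(name)
    return {num: sorted(entries) for num, entries in groups.items() if len(entries) > 1}


result = duplicate_prefixes(["01_hello.py", "04_foo.py", "04_bar", "README.md"])
print(result)
```

Prefixes are compared as strings, so `4_a.py` and `04_b.py` are not treated as duplicates of each other.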
@@ -0,0 +1,826 @@
#!/usr/bin/env python3
"""API breakage detection for published OpenHands packages using Griffe.
This script compares current workspace packages against the most recent PyPI
release (or the matching release if the current version is already published)
to detect breaking changes in the public API.
It focuses on the curated public surface:
- symbols exported via ``__all__``
- public members removed from classes exported via ``__all__``
It enforces two policies:
1. **Deprecation-before-removal**: any removed export or removed public class
member must have been marked deprecated in the *previous* release using the
canonical deprecation helpers (``@deprecated`` decorator or
``warn_deprecated()`` call from ``openhands.sdk.utils.deprecation``). For
members, the recommended ``warn_deprecated`` feature name is qualified (e.g.
``"LLM.some_method"``).
2. **MINOR version bump**: any breaking change (removal or structural) requires
at least a MINOR version bump according to SemVer.
Complementary to the deprecation mechanism:
- Deprecation (``check_deprecations.py``): enforces cleanup deadlines
- This script: prevents unannounced removals and enforces SemVer bumps
"""
from __future__ import annotations
import ast
import json
import os
import re
import subprocess
import sys
import tomllib
import urllib.request
from collections.abc import Iterable
from dataclasses import dataclass
from pathlib import Path
from packaging import version as pkg_version
from packaging.requirements import Requirement
@dataclass(frozen=True)
class PackageConfig:
"""Configuration for a single published package."""
package: str # dotted module path, e.g. "openhands.sdk"
distribution: str # PyPI distribution name, e.g. "openhands-sdk"
source_dir: str # repo-relative directory, e.g. "openhands-sdk"
@dataclass(frozen=True, slots=True)
class DeprecatedSymbols:
"""Deprecated SDK symbols detected in a source tree.
``top_level`` tracks module-level symbols (exports) like ``LLM``.
``qualified`` tracks class members like ``LLM.some_method``.
"""
top_level: set[str] = frozenset() # type: ignore[assignment]
qualified: set[str] = frozenset() # type: ignore[assignment]
PACKAGES: tuple[PackageConfig, ...] = (
PackageConfig(
package="openhands.sdk",
distribution="openhands-sdk",
source_dir="openhands-sdk",
),
PackageConfig(
package="openhands.workspace",
distribution="openhands-workspace",
source_dir="openhands-workspace",
),
PackageConfig(
package="openhands.tools",
distribution="openhands-tools",
source_dir="openhands-tools",
),
)
ACP_DEPENDENCY = "agent-client-protocol"
ACP_SKIP_ENV = "ACP_VERSION_CHECK_SKIP"
ACP_SKIP_TOKEN = "skip-acp-check"
ACP_BASE_REF_ENV = "ACP_VERSION_CHECK_BASE_REF"
def read_version_from_pyproject(path: str) -> str:
"""Read the version string from a pyproject.toml file."""
with open(path, "rb") as f:
data = tomllib.load(f)
proj = data.get("project", {})
v = proj.get("version")
if not v:
raise SystemExit(f"Could not read version from {path}")
return str(v)
def _read_pyproject(path: str) -> dict:
with open(path, "rb") as f:
return tomllib.load(f)
def _bool_env(name: str) -> bool:
value = os.environ.get(name, "").strip().lower()
return value in {"1", "true", "yes", "on"}
def _get_dependency_spec(project_data: dict, dependency: str) -> str | None:
deps = project_data.get("project", {}).get("dependencies", [])
for dep in deps:
# Require a name boundary so "foo" does not also match "foo-extras"
if re.match(rf"{re.escape(dependency)}(?![-\w.])", dep):
return dep
return None
def _min_version_from_requirement(req_str: str) -> pkg_version.Version | None:
try:
req = Requirement(req_str)
except Exception as exc:
print(
f"::warning title=ACP version::Unable to parse requirement "
f"'{req_str}': {exc}"
)
return None
lower_bounds: list[pkg_version.Version] = []
for spec in req.specifier:
if spec.operator in {">=", ">", "==", "~="}:
try:
lower_bounds.append(_parse_version(spec.version))
except Exception as exc:
print(
f"::warning title=ACP version::Unable to parse version "
f"'{spec.version}' from '{req_str}': {exc}"
)
if not lower_bounds:
return None
return max(lower_bounds)
def _git_show_file(ref: str, rel_path: str) -> str | None:
for candidate in (f"origin/{ref}", ref):
result = subprocess.run(
["git", "show", f"{candidate}:{rel_path}"],
check=False,
capture_output=True,
text=True,
)
if result.returncode == 0:
return result.stdout
return None
def _load_base_pyproject(base_ref: str) -> dict | None:
rel_path = "openhands-sdk/pyproject.toml"
content = _git_show_file(base_ref, rel_path)
if content is None:
print(
f"::warning title=ACP version::Unable to read {rel_path} from "
f"{base_ref}; skipping ACP version check"
)
return None
try:
return tomllib.loads(content)
except tomllib.TOMLDecodeError as exc:
print(
f"::warning title=ACP version::Failed to parse {rel_path} from "
f"{base_ref}: {exc}"
)
return None
def _check_acp_version_bump(repo_root: str) -> int:
if _bool_env(ACP_SKIP_ENV):
print(
f"::notice title=ACP version::Skipping ACP version check because "
f"{ACP_SKIP_ENV} is set (token: [{ACP_SKIP_TOKEN}])."
)
return 0
base_ref = os.environ.get(ACP_BASE_REF_ENV) or os.environ.get("GITHUB_BASE_REF")
if not base_ref:
print(
"::warning title=ACP version::No base ref found; skipping ACP version check"
)
return 0
base_data = _load_base_pyproject(base_ref)
if base_data is None:
return 0
current_data = _read_pyproject(
os.path.join(repo_root, "openhands-sdk", "pyproject.toml")
)
old_req = _get_dependency_spec(base_data, ACP_DEPENDENCY)
new_req = _get_dependency_spec(current_data, ACP_DEPENDENCY)
if not old_req or not new_req:
print(
f"::warning title=ACP version::Unable to locate {ACP_DEPENDENCY} "
"dependency in pyproject.toml; skipping ACP version check"
)
return 0
old_min = _min_version_from_requirement(old_req)
new_min = _min_version_from_requirement(new_req)
if old_min is None or new_min is None:
print(
f"::warning title=ACP version::Unable to parse {ACP_DEPENDENCY} "
"minimum version; skipping ACP version check"
)
return 0
if new_min <= old_min:
return 0
if new_min.major != old_min.major or new_min.minor != old_min.minor:
print(
"::error title=ACP version::Detected "
f"{ACP_DEPENDENCY} minor/major version bump "
f"({old_req} -> {new_req}). If intentional, add "
f"[{ACP_SKIP_TOKEN}] to the PR description to bypass."
)
return 1
return 0
def _parse_version(v: str) -> pkg_version.Version:
"""Parse a version string using packaging."""
return pkg_version.parse(v)
def get_pypi_baseline_version(pkg: str, current: str | None) -> str | None:
"""Fetch the baseline release version from PyPI.
The baseline is the most recent published release to compare against the
current workspace. If the current version already exists on PyPI, compare
against that same release. Otherwise, fall back to the newest release older
than the current version. If ``current`` is None, use the latest release.
Args:
pkg: Package name on PyPI (e.g., "openhands-sdk")
current: Current version from the workspace, or None for latest
Returns:
Baseline version string, or None if not found or on network error
"""
req = urllib.request.Request(
url=f"https://pypi.org/pypi/{pkg}/json",
headers={"User-Agent": "openhands-sdk-api-check/1.0"},
method="GET",
)
try:
with urllib.request.urlopen(req, timeout=10) as r:
meta = json.load(r)
except Exception as e:
print(f"::warning title={pkg} API::Failed to fetch PyPI metadata: {e}")
return None
releases = list(meta.get("releases", {}).keys())
if not releases:
return None
def _sort_key(s: str):
return _parse_version(s)
releases_sorted = sorted(releases, key=_sort_key, reverse=True)
if current is None:
return releases_sorted[0]
if current in releases:
return current
cur_parsed = _parse_version(current)
older = [rv for rv in releases if _parse_version(rv) < cur_parsed]
if not older:
return None
return sorted(older, key=_sort_key, reverse=True)[0]
def ensure_griffe() -> None:
"""Verify griffe is installed, raising an error if not."""
try:
import griffe # noqa: F401
except ImportError:
sys.stderr.write(
"ERROR: griffe not installed. Install with: pip install griffe[pypi]\n"
)
raise SystemExit(1)
def _is_field_metadata_only_change(old_val: object, new_val: object) -> bool:
"""Check if the change is only in Field metadata (description, title, etc.).
Field metadata parameters like ``description``, ``title``, ``examples``, and
``deprecated`` don't affect runtime behavior. Changes to these should not be
considered breaking API changes.
Returns:
True if both values are Field() calls and only metadata parameters differ.
"""
old_str = str(old_val)
new_str = str(new_val)
if not (old_str.startswith("Field(") and new_str.startswith("Field(")):
return False
# Metadata parameters that don't affect runtime behavior.
# See https://docs.pydantic.dev/latest/api/fields/#pydantic.fields.Field
metadata_patterns = {
"description": r'([\'"])([^\'"]*?)\1',
"title": r'([\'"])([^\'"]*?)\1',
"examples": r'([\'"])([^\'"]*?)\1',
"json_schema_extra": r'([\'"])([^\'"]*?)\1',
"deprecated": r"(?:True|False|None|'[^']*'|\"[^\"]*\")",
}
def _normalize(value: str) -> str:
normalized = value
for param, value_pattern in metadata_patterns.items():
pattern = rf",?\s*{param}\s*=\s*{value_pattern}"
normalized = re.sub(pattern, "", normalized)
normalized = re.sub(r"\(\s*,", "(", normalized)
normalized = re.sub(r",\s*\)", ")", normalized)
normalized = re.sub(r",\s*,", ", ", normalized)
normalized = re.sub(r"\s+", " ", normalized)
return normalized.strip()
return _normalize(old_str) == _normalize(new_str)
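To see how the normalization treats two `Field(...)` reprs, here is a reduced sketch that strips only the `description` parameter. `strip_description` is a hypothetical helper; the regexes are copied from the function above:

```python
import re

# Quoted-string value pattern used for description/title/examples above
METADATA_VALUE = r'([\'"])([^\'"]*?)\1'


def strip_description(field_repr: str) -> str:
    """Remove a description=... argument and tidy leftover commas and spaces."""
    normalized = re.sub(rf",?\s*description\s*=\s*{METADATA_VALUE}", "", field_repr)
    normalized = re.sub(r"\(\s*,", "(", normalized)
    normalized = re.sub(r",\s*\)", ")", normalized)
    normalized = re.sub(r"\s+", " ", normalized)
    return normalized.strip()


old = "Field(default=3, description='Max retries')"
new = "Field(default=3, description='Maximum number of retries')"
changed_default = "Field(default=5, description='Max retries')"

print(strip_description(old) == strip_description(new))
print(strip_description(old) == strip_description(changed_default))
```

One limitation that appears to follow from the value pattern: a description containing the other quote character (for example `"it's fine"`) is not matched, so such a change would still be reported as a value change.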
def _collect_breakages_pairs(
objs: Iterable[tuple[object, object]],
*,
deprecated: DeprecatedSymbols,
title: str,
) -> tuple[list[object], int]:
"""Find breaking changes between pairs of old/new API objects.
Only reports breakages for public API members.
Returns:
(breakages, undeprecated_removals)
"""
import griffe
from griffe import Alias, AliasResolutionError, BreakageKind, ExplanationStyle, Kind
breakages: list[object] = []
undeprecated_removals = 0
for old, new in objs:
try:
for br in griffe.find_breaking_changes(old, new):
obj = getattr(br, "obj", None)
if not getattr(obj, "is_public", True):
continue
# Skip ATTRIBUTE_CHANGED_VALUE when it's just Field metadata changes
# (description, title, examples, etc.) - these don't affect runtime
if br.kind == BreakageKind.ATTRIBUTE_CHANGED_VALUE:
old_value = getattr(br, "old_value", None)
new_value = getattr(br, "new_value", None)
if _is_field_metadata_only_change(old_value, new_value):
print(
f"::notice title={title}::Ignoring Field metadata-only "
f"change (non-breaking): {obj.name if obj else 'unknown'}"
)
continue
print(br.explain(style=ExplanationStyle.GITHUB))
breakages.append(br)
if br.kind != BreakageKind.OBJECT_REMOVED:
continue
parent = getattr(obj, "parent", None)
if getattr(parent, "kind", None) != Kind.CLASS:
continue
feature = f"{parent.name}.{obj.name}"
if (
feature not in deprecated.qualified
and parent.name not in deprecated.top_level
):
print(
f"::error title={title}::Removed '{feature}' without prior "
"deprecation. Mark it with @deprecated(...) or "
f"warn_deprecated('{feature}', ...) for at least one release "
"before removing."
)
undeprecated_removals += 1
except AliasResolutionError as e:
if isinstance(old, Alias) or isinstance(new, Alias):
old_target = old.target_path if isinstance(old, Alias) else None
new_target = new.target_path if isinstance(new, Alias) else None
if old_target != new_target:
name = getattr(old, "name", None) or getattr(
new, "name", "<unknown>"
)
print(
f"::warning title={title}::Alias target changed for '{name}': "
f"{old_target!r} -> {new_target!r}"
)
breakages.append(
{
"kind": "ALIAS_TARGET_CHANGED",
"name": name,
"old": old_target,
"new": new_target,
}
)
else:
print(
f"::notice title={title}::Skipping symbol comparison due to "
f"unresolved alias: {e}"
)
except Exception as e:
print(f"::warning title={title}::Failed to compute breakages: {e}")
return breakages, undeprecated_removals
def _extract_exported_names(module) -> set[str]:
"""Extract names exported from a module via ``__all__``.
This check is explicitly meant to track the curated public surface. The SDK
is expected to define ``__all__`` in ``openhands.sdk``; if it's missing or we
can't statically interpret it, we fail fast rather than silently widening the
surface area (which would make the check noisy and brittle).
"""
try:
all_var = module["__all__"]
except Exception as e:
raise ValueError("Expected __all__ to be defined on the public module") from e
val = getattr(all_var, "value", None)
elts = getattr(val, "elements", None)
if not elts:
raise ValueError("Unable to statically evaluate __all__")
names: set[str] = set()
for el in elts:
# Griffe represents string literals in __all__ in different ways depending
# on how the module is loaded / griffe version:
# - sometimes as plain Python strings (including quotes, e.g. "'LLM'")
# - sometimes as expression nodes with a `.value` attribute
#
# We intentionally only support the "static __all__ of string literals"
# case; we just normalize the representation.
if isinstance(el, str):
names.add(el.strip("\"'"))
continue
s = getattr(el, "value", None)
if isinstance(s, str):
names.add(s)
if not names:
raise ValueError("__all__ resolved to an empty set")
return names
def _check_version_bump(prev: str, new_version: str, total_breaks: int) -> int:
"""Check if version bump policy is satisfied for breaking changes.
Policy: Breaking changes require at least a MINOR version bump.
Returns:
0 if policy satisfied, 1 if not
"""
if total_breaks == 0:
print("No breaking changes detected")
return 0
parsed_prev = _parse_version(prev)
parsed_new = _parse_version(new_version)
# MINOR bump required: same major, higher minor OR higher major
ok = (parsed_new.major > parsed_prev.major) or (
parsed_new.major == parsed_prev.major and parsed_new.minor > parsed_prev.minor
)
if not ok:
print(
f"::error title=SemVer::Breaking changes detected ({total_breaks}); "
f"require at least minor version bump from "
f"{parsed_prev.major}.{parsed_prev.minor}.x, but new is {new_version}"
)
return 1
print(
f"Breaking changes detected ({total_breaks}) and version bump policy "
f"satisfied ({prev} -> {new_version})"
)
return 0
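The bump policy reduces to a comparison of the major and minor components. A minimal sketch, assuming plain `X.Y.Z` strings rather than the `packaging` versions the script parses; `bump_is_sufficient` is a hypothetical name:

```python
def bump_is_sufficient(prev: str, new: str) -> bool:
    """True if `new` is at least a MINOR bump over `prev`."""
    prev_major, prev_minor = (int(x) for x in prev.split(".")[:2])
    new_major, new_minor = (int(x) for x in new.split(".")[:2])
    # Higher major, or same major with a higher minor
    return new_major > prev_major or (
        new_major == prev_major and new_minor > prev_minor
    )


print(bump_is_sufficient("1.4.2", "1.4.3"))  # patch bump only
print(bump_is_sufficient("1.4.2", "1.5.0"))  # minor bump
print(bump_is_sufficient("1.4.2", "2.0.0"))  # major bump
```

A patch-only bump is rejected when breakages exist, while either a minor or a major bump satisfies the policy.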
def _resolve_griffe_object(
root: object,
dotted: str,
root_package: str = "",
) -> object:
"""Resolve a dotted path to a griffe object."""
root_path = getattr(root, "path", None)
if root_path == dotted:
return root
if isinstance(root_path, str) and dotted.startswith(root_path + "."):
dotted = dotted[len(root_path) + 1 :]
try:
return root[dotted]
except (KeyError, TypeError) as e:
print(
f"::warning title=SDK API::Unable to resolve {dotted} via "
f"direct lookup; falling back to manual traversal: {e}"
)
rel = dotted
if root_package and dotted.startswith(root_package + "."):
rel = dotted[len(root_package) + 1 :]
obj = root
for part in rel.split("."):
try:
obj = obj[part]
except (KeyError, TypeError) as e:
raise KeyError(f"Unable to resolve {dotted}: failed at {part}") from e
return obj
def _load_current(
griffe_module: object, repo_root: str, cfg: PackageConfig
) -> object | None:
try:
return griffe_module.load(
cfg.package,
search_paths=[os.path.join(repo_root, cfg.source_dir)],
)
except Exception as e:
print(
f"::error title={cfg.distribution} API::"
f"Failed to load current {cfg.distribution}: {e}"
)
return None
def _load_prev_from_pypi(
griffe_module: object,
prev: str,
cfg: PackageConfig,
) -> object | None:
griffe_cache = os.path.expanduser("~/.cache/griffe")
os.makedirs(griffe_cache, exist_ok=True)
try:
return griffe_module.load_pypi(
package=cfg.package,
distribution=cfg.distribution,
version_spec=f"=={prev}",
)
except Exception as e:
print(
f"::error title={cfg.distribution} API::"
f"Failed to load {cfg.distribution}=={prev} from PyPI: {e}"
)
return None
def _find_deprecated_symbols(source_root: Path) -> DeprecatedSymbols:
"""Scan source files for symbols marked with the SDK deprecation helpers.
Detects two forms:
- ``@deprecated(...)`` decorator on a class/function/method
- ``warn_deprecated('SomeFeature', ...)`` call
Returns:
DeprecatedSymbols(top_level=..., qualified=...)
"""
def _is_deprecated_decorator(deco: ast.AST) -> bool:
if not isinstance(deco, ast.Call):
return False
target = deco.func
if isinstance(target, ast.Name):
return target.id == "deprecated"
if isinstance(target, ast.Attribute):
return target.attr == "deprecated"
return False
class _Visitor(ast.NodeVisitor):
def __init__(self) -> None:
self.class_stack: list[str] = []
self.top_level: set[str] = set()
self.qualified: set[str] = set()
def visit_ClassDef(self, node: ast.ClassDef) -> None: # noqa: N802
if any(_is_deprecated_decorator(deco) for deco in node.decorator_list):
self.top_level.add(node.name)
self.qualified.add(node.name)
self.class_stack.append(node.name)
self.generic_visit(node)
self.class_stack.pop()
def _visit_function_like(
self,
node: ast.FunctionDef | ast.AsyncFunctionDef,
) -> None:
if any(_is_deprecated_decorator(deco) for deco in node.decorator_list):
if self.class_stack:
self.qualified.add(".".join([*self.class_stack, node.name]))
else:
self.top_level.add(node.name)
self.qualified.add(node.name)
self.generic_visit(node)
def visit_FunctionDef(self, node: ast.FunctionDef) -> None: # noqa: N802
self._visit_function_like(node)
def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> None: # noqa: N802
self._visit_function_like(node)
def visit_Call(self, node: ast.Call) -> None: # noqa: N802
target = node.func
func_name = None
if isinstance(target, ast.Name):
func_name = target.id
elif isinstance(target, ast.Attribute):
func_name = target.attr
if func_name == "warn_deprecated" and node.args:
feature = _extract_string_literal(node.args[0])
if feature is not None:
self.qualified.add(feature)
self.top_level.add(feature.split(".")[0])
self.generic_visit(node)
top_level: set[str] = set()
qualified: set[str] = set()
for pyfile in source_root.rglob("*.py"):
try:
tree = ast.parse(pyfile.read_text())
except SyntaxError as e:
print(
f"::warning title=SDK API::Skipping {pyfile}: "
f"failed to parse (SyntaxError: {e})"
)
continue
visitor = _Visitor()
visitor.visit(tree)
top_level |= visitor.top_level
qualified |= visitor.qualified
return DeprecatedSymbols(top_level=top_level, qualified=qualified)
def _extract_string_literal(node: ast.AST) -> str | None:
"""Return the string value if *node* is a simple string literal."""
if isinstance(node, ast.Constant) and isinstance(node.value, str):
return node.value
return None
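The `warn_deprecated` detection can be reproduced on a small source string. This sketch uses `ast.walk` instead of the visitor class above and only covers the call form, not the decorator form. The sample source is parsed, never executed, and the call is reduced to its first positional argument, since the helper's full signature is not shown here:

```python
import ast

# Parsed only; further warn_deprecated arguments are omitted on purpose
source = '''
class LLM:
    def legacy(self):
        warn_deprecated("LLM.legacy")
'''

qualified: set[str] = set()
top_level: set[str] = set()

for node in ast.walk(ast.parse(source)):
    if not isinstance(node, ast.Call):
        continue
    func = node.func
    # Accept both warn_deprecated(...) and module.warn_deprecated(...)
    name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
    if name == "warn_deprecated" and node.args:
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            qualified.add(first.value)
            top_level.add(first.value.split(".")[0])

print(qualified, top_level)
```

As far as the code shown here goes, recording the class name in `top_level` means that deprecating a single member also marks the class itself as deprecated for the purposes of the removal checks.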
def _get_source_root(griffe_root: object) -> Path | None:
"""Derive the package source directory from a griffe module's filepath."""
filepath = getattr(griffe_root, "filepath", None)
if filepath is not None:
return Path(filepath).parent
return None
def _compute_breakages(old_root, new_root, cfg: PackageConfig) -> tuple[int, int]:
"""Detect breaking changes between old and new package versions.
Returns:
``(total_breaks, undeprecated_removals)`` — *total_breaks* counts all
structural breakages (for the version-bump policy), while
*undeprecated_removals* counts public API removals (exports and class
members) without a prior deprecation marker (a separate hard failure).
"""
pkg = cfg.package
title = f"{cfg.distribution} API"
total_breaks = 0
undeprecated_removals = 0
source_root = _get_source_root(old_root)
deprecated = (
_find_deprecated_symbols(source_root) if source_root else DeprecatedSymbols()
)
try:
old_mod = _resolve_griffe_object(old_root, pkg, root_package=pkg)
new_mod = _resolve_griffe_object(new_root, pkg, root_package=pkg)
except Exception as e:
raise RuntimeError(f"Failed to resolve root module '{pkg}'") from e
new_exports = _extract_exported_names(new_mod)
try:
old_exports = _extract_exported_names(old_mod)
except ValueError as e:
# The API breakage check relies on a curated public surface defined via
# __all__. If the baseline release didn't define (or couldn't statically
# evaluate) __all__, we can't compute meaningful breakages.
#
# In this situation, skip rather than failing the entire workflow.
print(
f"::notice title={title}::Skipping breakage check; baseline release "
f"has no statically-evaluable {pkg}.__all__: {e}"
)
return 0, 0
removed = sorted(old_exports - new_exports)
# Check deprecation-before-removal policy (exports)
for name in removed:
total_breaks += 1 # every removal is a structural break
if name not in deprecated.top_level:
print(
f"::error title={title}::Removed '{name}' from "
f"{pkg}.__all__ without prior deprecation. "
"Mark it with @deprecated or warn_deprecated() "
"for at least one release before removing."
)
undeprecated_removals += 1
else:
print(
f"::notice title={title}::Removed previously-deprecated symbol "
f"'{name}' from {pkg}.__all__"
)
common = sorted(old_exports & new_exports)
pairs: list[tuple[object, object]] = []
for name in common:
try:
pairs.append((old_mod[name], new_mod[name]))
except Exception as e:
print(f"::warning title={title}::Unable to resolve symbol {name}: {e}")
breakages, undeprecated_members = _collect_breakages_pairs(
pairs,
deprecated=deprecated,
title=title,
)
total_breaks += len(breakages)
undeprecated_removals += undeprecated_members
return total_breaks, undeprecated_removals
def _check_package(griffe_module, repo_root: str, cfg: PackageConfig) -> int:
"""Run breakage checks for a single package. Returns 0 on success."""
pyproj = os.path.join(repo_root, cfg.source_dir, "pyproject.toml")
new_version = read_version_from_pyproject(pyproj)
title = f"{cfg.distribution} API"
baseline = get_pypi_baseline_version(cfg.distribution, new_version)
if not baseline:
print(
f"::warning title={title}::No baseline {cfg.distribution} "
f"release found; skipping breakage check",
)
return 0
print(f"Comparing {cfg.distribution} {new_version} against {baseline}")
new_root = _load_current(griffe_module, repo_root, cfg)
if not new_root:
return 1
old_root = _load_prev_from_pypi(griffe_module, baseline, cfg)
if not old_root:
return 1
try:
total_breaks, undeprecated = _compute_breakages(old_root, new_root, cfg)
except Exception as e:
print(f"::error title={title}::Failed to compute breakages: {e}")
return 1
if undeprecated:
print(
f"::error title={title}::{undeprecated} symbol(s) removed "
f"from {cfg.package} without prior deprecation — "
f"see errors above"
)
bump_rc = _check_version_bump(baseline, new_version, total_breaks)
return 1 if (undeprecated or bump_rc) else 0
def main() -> int:
"""Main entry point for API breakage detection."""
repo_root = os.getcwd()
rc = _check_acp_version_bump(repo_root)
ensure_griffe()
import griffe
for cfg in PACKAGES:
print(f"\n{'=' * 60}")
print(f"Checking {cfg.distribution} ({cfg.package})")
print(f"{'=' * 60}")
rc |= _check_package(griffe, repo_root, cfg)
return rc
if __name__ == "__main__":
raise SystemExit(main())
@@ -0,0 +1,196 @@
"""Guard package version changes so they only happen in release PRs."""

from __future__ import annotations

import os
import re
import subprocess
import sys
import tomllib
from dataclasses import dataclass
from pathlib import Path


PACKAGE_PYPROJECTS: dict[str, Path] = {
    "openhands-sdk": Path("openhands-sdk/pyproject.toml"),
    "openhands-tools": Path("openhands-tools/pyproject.toml"),
    "openhands-workspace": Path("openhands-workspace/pyproject.toml"),
    "openhands-agent-server": Path("openhands-agent-server/pyproject.toml"),
}

_VERSION_PATTERN = r"\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.]+)?"
_RELEASE_TITLE_RE = re.compile(rf"^Release v(?P<version>{_VERSION_PATTERN})$")
_RELEASE_BRANCH_RE = re.compile(rf"^rel-(?P<version>{_VERSION_PATTERN})$")


@dataclass(frozen=True)
class VersionChange:
    package: str
    path: Path
    previous_version: str
    current_version: str


def _read_version_from_pyproject_text(text: str, source: str) -> str:
    data = tomllib.loads(text)
    version = data.get("project", {}).get("version")
    if not isinstance(version, str):
        raise SystemExit(f"Unable to determine project.version from {source}")
    return version


def _read_current_version(repo_root: Path, pyproject: Path) -> str:
    return _read_version_from_pyproject_text(
        (repo_root / pyproject).read_text(),
        str(pyproject),
    )


def _read_version_from_git_ref(repo_root: Path, git_ref: str, pyproject: Path) -> str:
    result = subprocess.run(
        ["git", "show", f"{git_ref}:{pyproject.as_posix()}"],
        cwd=repo_root,
        check=False,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        message = result.stderr.strip() or result.stdout.strip() or "unknown git error"
        raise SystemExit(
            f"Unable to read {pyproject} from git ref {git_ref}: {message}"
        )
    return _read_version_from_pyproject_text(result.stdout, f"{git_ref}:{pyproject}")


def _base_ref_candidates(base_ref: str) -> list[str]:
    if base_ref.startswith("origin/"):
        return [base_ref, base_ref.removeprefix("origin/")]
    return [f"origin/{base_ref}", base_ref]


def find_version_changes(repo_root: Path, base_ref: str) -> list[VersionChange]:
    changes: list[VersionChange] = []
    candidates = _base_ref_candidates(base_ref)

    for package, pyproject in PACKAGE_PYPROJECTS.items():
        current_version = _read_current_version(repo_root, pyproject)
        previous_error: SystemExit | None = None
        previous_version: str | None = None

        for candidate in candidates:
            try:
                previous_version = _read_version_from_git_ref(
                    repo_root, candidate, pyproject
                )
                break
            except SystemExit as exc:
                previous_error = exc

        if previous_version is None:
            assert previous_error is not None
            raise previous_error

        if previous_version != current_version:
            changes.append(
                VersionChange(
                    package=package,
                    path=pyproject,
                    previous_version=previous_version,
                    current_version=current_version,
                )
            )

    return changes


def get_release_pr_version(
    pr_title: str, pr_head_ref: str
) -> tuple[str | None, list[str]]:
    title_match = _RELEASE_TITLE_RE.fullmatch(pr_title.strip())
    branch_match = _RELEASE_BRANCH_RE.fullmatch(pr_head_ref.strip())
    title_version = title_match.group("version") if title_match else None
    branch_version = branch_match.group("version") if branch_match else None

    if title_version and branch_version and title_version != branch_version:
        return None, [
            "Release PR markers disagree: title requests "
            f"v{title_version} but branch is rel-{branch_version}."
        ]

    return title_version or branch_version, []


def validate_version_changes(
    changes: list[VersionChange],
    pr_title: str,
    pr_head_ref: str,
) -> list[str]:
    if not changes:
        return []

    release_version, errors = get_release_pr_version(pr_title, pr_head_ref)
    if errors:
        return errors

    formatted_changes = ", ".join(
        f"{change.package} ({change.previous_version} -> {change.current_version})"
        for change in changes
    )

    if release_version is None:
        return [
            "Package version changes are only allowed in release PRs. "
            f"Detected changes: {formatted_changes}. "
            "Use the Prepare Release workflow so the PR title is 'Release vX.Y.Z' "
            "or the branch is 'rel-X.Y.Z'."
        ]

    mismatched = [
        change for change in changes if change.current_version != release_version
    ]
    if mismatched:
        mismatch_details = ", ".join(
            f"{change.package} ({change.current_version})" for change in mismatched
        )
        return [
            f"Release PR version v{release_version} does not match changed package "
            f"versions: {mismatch_details}."
        ]

    return []


def main() -> int:
    repo_root = Path(__file__).resolve().parents[2]
    base_ref = os.environ.get("VERSION_BUMP_BASE_REF") or os.environ.get(
        "GITHUB_BASE_REF"
    )
    if not base_ref:
        print("::warning title=Version bump guard::No base ref found; skipping check.")
        return 0

    pr_title = os.environ.get("PR_TITLE", "")
    pr_head_ref = os.environ.get("PR_HEAD_REF", "")

    changes = find_version_changes(repo_root, base_ref)
    errors = validate_version_changes(changes, pr_title, pr_head_ref)

    if errors:
        for error in errors:
            print(f"::error title=Version bump guard::{error}")
        return 1

    if changes:
        changed_packages = ", ".join(change.package for change in changes)
        print(
            "::notice title=Version bump guard::"
            f"Release PR version changes validated for {changed_packages}."
        )
    else:
        print("::notice title=Version bump guard::No package version changes detected.")
    return 0


if __name__ == "__main__":
    sys.exit(main())
@@ -1,58 +0,0 @@
#!/bin/bash
set -euxo pipefail
# This script updates the PR description with commands to run the PR locally
# It adds both Docker and uvx commands
# Get the branch name for the PR
BRANCH_NAME=$(gh pr view "$PR_NUMBER" --json headRefName --jq .headRefName)
# Define the Docker command
DOCKER_RUN_COMMAND="docker run -it --rm \
-p 3000:3000 \
-v /var/run/docker.sock:/var/run/docker.sock \
--add-host host.docker.internal:host-gateway \
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.openhands.dev/openhands/runtime:${SHORT_SHA}-nikolaik \
--name openhands-app-${SHORT_SHA} \
docker.openhands.dev/openhands/openhands:${SHORT_SHA}"
# Get the current PR body
PR_BODY=$(gh pr view "$PR_NUMBER" --json body --jq .body)
# Prepare the new PR body with both commands
if echo "$PR_BODY" | grep -q "To run this PR locally, use the following command:"; then
# For existing PR descriptions, use a more robust approach
# Split the PR body at the "To run this PR locally" section and replace everything after it
BEFORE_SECTION=$(echo "$PR_BODY" | sed '/To run this PR locally, use the following command:/,$d')
NEW_PR_BODY=$(cat <<EOF
${BEFORE_SECTION}
To run this PR locally, use the following command:
GUI with Docker:
\`\`\`
${DOCKER_RUN_COMMAND}
\`\`\`
EOF
)
else
# For new PR descriptions: use heredoc safely without indentation
NEW_PR_BODY=$(cat <<EOF
$PR_BODY
---
To run this PR locally, use the following command:
GUI with Docker:
\`\`\`
${DOCKER_RUN_COMMAND}
\`\`\`
EOF
)
fi
# Update the PR description
echo "Updating PR description with Docker and uvx commands"
gh pr edit "$PR_NUMBER" --body "$NEW_PR_BODY"
@@ -0,0 +1,122 @@
#!/usr/bin/env python3
"""Update the sdk_ref default value in run-eval.yml.

This script updates the default SDK reference version in the run-eval workflow
to match a new release version.
"""

from __future__ import annotations

import argparse
import re
import sys
from pathlib import Path


REPO_ROOT = Path(__file__).resolve().parents[2]
RUN_EVAL_WORKFLOW = REPO_ROOT / ".github" / "workflows" / "run-eval.yml"

# Pattern to match the sdk_ref default line
# Matches: "default: vX.Y.Z" with optional prerelease suffix like -rc1, -beta.1
SDK_REF_PATTERN = re.compile(
    r"^(\s*default:\s*v)[\d]+\.[\d]+\.[\d]+(-[a-zA-Z0-9.]+)?(\s*)$"
)


def update_sdk_ref_default(new_version: str, dry_run: bool = False) -> bool:
    """Update the sdk_ref default in run-eval.yml.

    Args:
        new_version: The new version (without 'v' prefix, e.g., "1.12.0")
        dry_run: If True, print what would change without modifying the file

    Returns:
        True if successful, False otherwise
    """
    if not RUN_EVAL_WORKFLOW.exists():
        print(f"❌ File not found: {RUN_EVAL_WORKFLOW}", file=sys.stderr)
        return False

    content = RUN_EVAL_WORKFLOW.read_text()
    lines = content.splitlines(keepends=True)

    # Find the sdk_ref input section and its default line
    in_sdk_ref_section = False
    updated = False
    old_version = None

    for i, line in enumerate(lines):
        stripped = line.strip()

        # Track when we enter the sdk_ref input section
        if stripped == "sdk_ref:":
            in_sdk_ref_section = True
            continue

        # Track when we exit the sdk_ref section (another input starts)
        if (
            in_sdk_ref_section
            and stripped.endswith(":")
            and not stripped.startswith("default")
        ):
            in_sdk_ref_section = False

        # Update the default line within the sdk_ref section
        if in_sdk_ref_section:
            match = SDK_REF_PATTERN.match(line)
            if match:
                old_version = stripped.replace("default: ", "")
                # group(3) already holds the trailing whitespace, including the
                # line ending preserved by splitlines(keepends=True), so no
                # extra newline is appended here (doing so would insert a
                # blank line on every run).
                lines[i] = f"{match.group(1)}{new_version}{match.group(3)}"
                updated = True
                break

    if not updated:
        print("❌ Could not find sdk_ref default line to update", file=sys.stderr)
        return False

    if dry_run:
        print(f"Would update sdk_ref default: {old_version} → v{new_version}")
        return True

    # Write the updated content
    RUN_EVAL_WORKFLOW.write_text("".join(lines))
    print(f"✅ Updated sdk_ref default: {old_version} → v{new_version}")
    return True


def main() -> int:
    parser = argparse.ArgumentParser(
        description="Update the sdk_ref default value in run-eval.yml"
    )
    parser.add_argument(
        "version",
        help="New version (without 'v' prefix, e.g., '1.12.0')",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Print what would change without modifying the file",
    )
    args = parser.parse_args()

    # Validate version format
    version_pattern = re.compile(r"^\d+\.\d+\.\d+(-[a-zA-Z0-9.]+)?$")
    if not version_pattern.match(args.version):
        print(
            f"❌ Invalid version format: {args.version}. "
            "Expected: X.Y.Z or X.Y.Z-suffix",
            file=sys.stderr,
        )
        return 1

    success = update_sdk_ref_default(args.version, dry_run=args.dry_run)
    return 0 if success else 1


if __name__ == "__main__":
    sys.exit(main())
@@ -0,0 +1,125 @@
# Release Automation Workflows
This document describes the automated release workflows for the OpenHands Software Agent SDK.
## Overview
The release process has been automated with two GitHub Actions workflows:
1. **prepare-release.yml** - Prepares a release PR with version updates
2. **pypi-release.yml** - Automatically publishes packages to PyPI when a release is created
## How to Create a New Release
### Step 1: Trigger the Prepare Release Workflow
1. Go to the [Actions tab](https://github.com/OpenHands/software-agent-sdk/actions)
2. Select **"Prepare Release"** workflow from the left sidebar
3. Click **"Run workflow"** button
4. Enter the version number (e.g., `1.2.3`) - must be in format `X.Y.Z`
5. Click **"Run workflow"**
The workflow will automatically:
- ✅ Create a new branch named `rel-X.Y.Z`
- ✅ Update all package versions using `make set-package-version`
- ✅ Commit the changes
- ✅ Push the branch
- ✅ Create a PR with labels `integration-tests` and `test-examples`
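
The naming conventions used across these steps can be sketched as follows. This is a minimal illustration, not the workflow's actual implementation: `release_names` is a hypothetical helper, and the `Release vX.Y.Z` PR title is an assumption taken from the format the version bump guard script accepts.

```python
import re

# Plain X.Y.Z, as the workflow input expects (no leading "v").
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def release_names(version: str) -> dict[str, str]:
    """Return the branch, tag, and PR title for a release version.

    Hypothetical helper for illustration only.
    """
    if not _VERSION_RE.fullmatch(version):
        raise ValueError(f"expected X.Y.Z, got {version!r}")
    return {
        "branch": f"rel-{version}",
        "tag": f"v{version}",
        "pr_title": f"Release v{version}",
    }


print(release_names("1.2.3"))
```

Passing `v1.2.3` raises `ValueError`, which mirrors the version format error described under Troubleshooting.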
### Step 2: Review the PR
The created PR will include a checklist. Complete the following:
- [ ] Fix any deprecation deadlines if they exist
- [ ] Verify integration tests pass (triggered by `integration-tests` label)
- [ ] Verify example checks pass (triggered by `test-examples` label)
- [ ] Review and approve the PR
### Step 3: Create the GitHub Release
1. Go to [Releases](https://github.com/OpenHands/software-agent-sdk/releases/new)
2. Click **"Draft a new release"**
3. Configure the release:
   - **Tag**: `vX.Y.Z` (must match the version)
   - **Branch**: `rel-X.Y.Z` (the branch created by the workflow)
   - **Previous tag**: Select the previous release version
4. Click **"Generate release notes"** to auto-generate the changelog
5. Review and edit the release notes as needed
6. Click **"Publish release"**
### Step 4: PyPI Publication (Automated)
Once the release is published, the **pypi-release.yml** workflow will automatically:
- ✅ Build all packages (openhands-sdk, openhands-tools, openhands-workspace, openhands-agent-server)
- ✅ Publish them to PyPI
You can monitor the progress in the [Actions tab](https://github.com/OpenHands/software-agent-sdk/actions/workflows/pypi-release.yml).
### Step 5: Version Bump PRs (Automated)
After successful PyPI publication, the workflow will automatically create PRs to update SDK versions in downstream repositories:
- **[OpenHands](https://github.com/All-Hands-AI/OpenHands)** - Updates `openhands-sdk`, `openhands-tools`, and `openhands-agent-server` versions
- **[OpenHands-CLI](https://github.com/All-Hands-AI/openhands-cli)** - Updates `openhands-sdk` and `openhands-tools` versions
These PRs will:
- Be created automatically with branch name `bump-sdk-X.Y.Z`
- Include links back to the SDK release
- Need to be reviewed and merged by the respective repository maintainers
### Step 6: Post-Release Tasks
- [ ] Merge the release PR to main
- [ ] Review and merge the auto-created version bump PRs in OpenHands and OpenHands-CLI
- [ ] Run evaluation on OpenHands Index (manual step)
- [ ] Announce the release
## Manual PyPI Release (If Needed)
If you need to manually trigger the PyPI release workflow:
1. Go to the [Actions tab](https://github.com/OpenHands/software-agent-sdk/actions)
2. Select **"Publish all OpenHands packages (uv)"** workflow
3. Click **"Run workflow"**
4. Select the branch/tag you want to publish from
5. Click **"Run workflow"**
## Workflow Files
- `.github/workflows/prepare-release.yml` - Automated release preparation
- `.github/workflows/pypi-release.yml` - PyPI package publication
## Troubleshooting
### Version Format Error
If you get a version format error, ensure you're using the format `X.Y.Z` (e.g., `1.2.3`), not `vX.Y.Z`.
### PR Creation Failed
If the PR creation fails, check:
- The branch doesn't already exist
- You have proper permissions
- The `GITHUB_TOKEN` has sufficient permissions
### PyPI Publication Failed
If PyPI publication fails:
- Check that the `PYPI_TOKEN_OPENHANDS` secret is properly configured
- Verify the version doesn't already exist on PyPI
- Check the workflow logs for specific error messages
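
To check whether a version already exists on PyPI, one option is PyPI's JSON API at `https://pypi.org/pypi/<distribution>/json`, whose response lists published versions under the `releases` key. The following is a minimal sketch, not part of the release workflows; the function names are hypothetical, and the network call is kept separate from the pure lookup so the lookup can be tested offline.

```python
import json
import urllib.error
import urllib.request


def version_in_payload(payload: dict, version: str) -> bool:
    """Check a parsed PyPI JSON API response for a released version."""
    return version in payload.get("releases", {})


def version_exists_on_pypi(distribution: str, version: str) -> bool:
    """Query PyPI's JSON API; a 404 means the project is not published.

    Hypothetical helper for illustration only.
    """
    url = f"https://pypi.org/pypi/{distribution}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return False
        raise
    return version_in_payload(payload, version)
```

For example, `version_exists_on_pypi("openhands-sdk", "1.2.3")` would report whether that version has already been uploaded. A version whose files were later deleted can still appear in `releases`, so treat a positive result as "this version number is taken".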
## Previous Manual Process
For reference, the previous manual release checklist was:
- [ ] Checkout SDK repo, use `make set-package-version version=x.x.x` to set the version
- [ ] Push to a branch like `rel-x.x.x` and start a PR
- [ ] Fix any "deprecation deadlines" if they exist
- [ ] Tag "integration-tests" and make sure the integration tests all pass
- [ ] Tag "test-examples" and make sure example checks all pass
- [ ] Draft a new release
- [ ] Use workflow to publish to PyPI on tag `v1.X.X`
- [ ] Evaluation on OpenHands Index
Most of these steps are now automated!
@@ -0,0 +1,154 @@
---
name: REST API breakage checks
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
agent-server-rest-api:
name: REST API (OpenAPI)
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install workspace deps (dev)
run: uv sync --frozen --group dev
- name: Install oasdiff
run: |
curl -L https://raw.githubusercontent.com/oasdiff/oasdiff/main/install.sh | sh -s -- -b /usr/local/bin
oasdiff --version
- name: Run agent server REST API breakage check
id: api_breakage
# Let this step fail so CI is visibly red on breakage.
# Later reporting steps still run because they use if: always().
run: |
uv run --with packaging python .github/scripts/check_agent_server_rest_api_breakage.py 2>&1 | tee api-breakage.log
exit_code=${PIPESTATUS[0]}
echo "exit_code=${exit_code}" >> "$GITHUB_OUTPUT"
exit "${exit_code}"
- name: Write REST API breakage summary
if: ${{ always() }}
env:
EXIT_CODE: ${{ steps.api_breakage.outputs.exit_code }}
IS_FORK: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name != github.repository }}
LOG_PATH: api-breakage.log
RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
python3 <<'PY' >> "$GITHUB_STEP_SUMMARY"
import os
from pathlib import Path
exit_code = int(os.environ.get('EXIT_CODE', '0') or '0')
is_fork = os.environ.get('IS_FORK', 'false') == 'true'
run_url = os.environ['RUN_URL']
status = '✅ **PASSED**' if exit_code == 0 else '❌ **FAILED**'
print(f'## REST API breakage checks (OpenAPI) — {status}')
print()
print(f"**Result:** {status}")
if exit_code != 0:
print()
print('> ⚠️ Breaking REST API changes or policy violations detected.')
print()
if is_fork:
print(
'_Fork PR detected: sticky PR comment was skipped because '
'the GitHub token is read-only for `pull_request` workflows '
'from forks._'
)
print()
if exit_code != 0:
try:
log = Path(os.environ['LOG_PATH']).read_text()
except Exception as exc:
log = f'Unable to read log file: {exc}'
excerpt = log[:1000].replace('```', '``\\`')
print('<details><summary>Log excerpt (first 1000 characters)</summary>')
print()
print('```text')
print(excerpt)
print('```')
print()
print('</details>')
print()
print(f'[Action log]({run_url})')
PY
- name: Post REST API breakage report to PR
if: ${{ always() && github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository }}
uses: actions/github-script@v8
env:
EXIT_CODE: ${{ steps.api_breakage.outputs.exit_code }}
LOG_PATH: api-breakage.log
with:
script: |
const fs = require('fs');
const marker = '<!-- agent-server-rest-api-breakage-report -->';
const exitCode = Number(process.env.EXIT_CODE || '0');
const runUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`;
const status = exitCode === 0 ? '✅ **PASSED**' : '❌ **FAILED**';
let body = `${marker}\n## REST API breakage checks (OpenAPI) — ${status}\n\n**Result:** ${status}\n`;
if (exitCode !== 0) {
body += `\n> ⚠️ Breaking REST API changes or policy violations detected.\n`;
let log = '';
try {
log = fs.readFileSync(process.env.LOG_PATH, 'utf8');
} catch (e) {
log = `Unable to read log file: ${e}`;
}
const excerpt = log.slice(0, 1000).replace(/```/g, '``\\`');
body += `\n<details><summary>Log excerpt (first 1000 characters)</summary>\n\n\`\`\`text\n${excerpt}\n\`\`\`\n\n</details>\n`;
}
body += `\n[Action log](${runUrl})\n`;
const { owner, repo } = context.repo;
const issue_number = context.issue.number;
const { data: comments } = await github.rest.issues.listComments({
owner,
repo,
issue_number,
per_page: 100,
});
const existing = comments.find((c) => c.body && c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner,
repo,
comment_id: existing.id,
body,
});
} else {
await github.rest.issues.createComment({
owner,
repo,
issue_number,
body,
});
}
@@ -0,0 +1,149 @@
---
name: Python API breakage checks
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
sdk-api:
name: Python API
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install workspace deps (dev)
run: uv sync --frozen --group dev
- name: Run Python API breakage check
id: api_breakage
# Let this step fail so CI is visibly red on breakage.
# Later reporting steps still run because they use if: always().
env:
ACP_VERSION_CHECK_BASE_REF: ${{ github.event_name == 'pull_request' && github.base_ref || github.event.before }}
ACP_VERSION_CHECK_SKIP: ${{ github.event_name == 'pull_request' && contains(github.event.pull_request.body || '', 'skip-acp-check')
}}
run: |
uv run python .github/scripts/check_sdk_api_breakage.py 2>&1 | tee api-breakage.log
exit_code=${PIPESTATUS[0]}
echo "exit_code=${exit_code}" >> "$GITHUB_OUTPUT"
exit "${exit_code}"
- name: Write API breakage summary
if: ${{ always() }}
env:
EXIT_CODE: ${{ steps.api_breakage.outputs.exit_code }}
IS_FORK: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name != github.repository }}
LOG_PATH: api-breakage.log
RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
python3 <<'PY' >> "$GITHUB_STEP_SUMMARY"
import os
from pathlib import Path
exit_code = int(os.environ.get('EXIT_CODE', '0') or '0')
is_fork = os.environ.get('IS_FORK', 'false') == 'true'
run_url = os.environ['RUN_URL']
status = '✅ **PASSED**' if exit_code == 0 else '❌ **FAILED**'
print(f'## Python API breakage checks — {status}')
print()
print(f"**Result:** {status}")
if exit_code != 0:
print()
print('> ⚠️ Breaking API changes or policy violations detected.')
print()
if is_fork:
print(
'_Fork PR detected: sticky PR comment was skipped because '
'the GitHub token is read-only for `pull_request` workflows '
'from forks._'
)
print()
if exit_code != 0:
try:
log = Path(os.environ['LOG_PATH']).read_text()
except Exception as exc:
log = f'Unable to read log file: {exc}'
excerpt = log[:1000].replace('```', '``\\`')
print('<details><summary>Log excerpt (first 1000 characters)</summary>')
print()
print('```text')
print(excerpt)
print('```')
print()
print('</details>')
print()
print(f'[Action log]({run_url})')
PY
- name: Post API breakage report to PR
if: ${{ always() && github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository }}
uses: actions/github-script@v8
env:
EXIT_CODE: ${{ steps.api_breakage.outputs.exit_code }}
LOG_PATH: api-breakage.log
with:
script: |
const fs = require('fs');
const marker = '<!-- api-breakage-report -->';
const exitCode = Number(process.env.EXIT_CODE || '0');
const runUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`;
const status = exitCode === 0 ? '✅ **PASSED**' : '❌ **FAILED**';
let body = `${marker}\n## Python API breakage checks — ${status}\n\n**Result:** ${status}\n`;
if (exitCode !== 0) {
body += `\n> ⚠️ Breaking API changes or policy violations detected.\n`;
let log = '';
try {
log = fs.readFileSync(process.env.LOG_PATH, 'utf8');
} catch (e) {
log = `Unable to read log file: ${e}`;
}
const excerpt = log.slice(0, 1000).replace(/```/g, '``\\`');
body += `\n<details><summary>Log excerpt (first 1000 characters)</summary>\n\n\`\`\`text\n${excerpt}\n\`\`\`\n\n</details>\n`;
}
body += `\n[Action log](${runUrl})\n`;
const { owner, repo } = context.repo;
const issue_number = context.issue.number;
const { data: comments } = await github.rest.issues.listComments({
owner,
repo,
issue_number,
per_page: 100,
});
const existing = comments.find((c) => c.body && c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner,
repo,
comment_id: existing.id,
body,
});
} else {
await github.rest.issues.createComment({
owner,
repo,
issue_number,
body,
});
}
@@ -0,0 +1,130 @@
---
name: API Compliance Tests
on:
pull_request:
types: [labeled]
workflow_dispatch:
inputs:
reason:
description: Reason for running compliance tests
required: true
patterns:
description: Comma-separated patterns to test (empty = all)
required: false
models:
description: Comma-separated model IDs (empty = all defaults)
required: false
env:
# Default models to test (matches DEFAULT_MODELS in run_compliance.py)
DEFAULT_MODELS: claude-sonnet-4-5,gpt-5.2,gemini-3-pro
jobs:
run-compliance-tests:
# Only run on api-compliance-test label or workflow_dispatch
if: |
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'pull_request' && github.event.label.name == 'api-compliance-test')
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.sha || github.ref }}
persist-credentials: false
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install dependencies
run: uv sync --dev
- name: Determine test parameters
id: params
run: |
# Use input values or defaults
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
PATTERNS="${{ github.event.inputs.patterns }}"
MODELS="${{ github.event.inputs.models }}"
else
PATTERNS=""
MODELS=""
fi
# Build command args
ARGS=""
if [ -n "$PATTERNS" ]; then
ARGS="$ARGS --patterns $PATTERNS"
fi
if [ -n "$MODELS" ]; then
ARGS="$ARGS --models $MODELS"
else
ARGS="$ARGS --models $DEFAULT_MODELS"
fi
echo "args=$ARGS" >> "$GITHUB_OUTPUT"
- name: Run API compliance tests
id: compliance
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY_EVAL }}
LLM_BASE_URL: https://llm-proxy.eval.all-hands.dev
GITHUB_RUN_ID: ${{ github.run_id }}
run: |
uv run python tests/integration/api_compliance/run_compliance.py \
${{ steps.params.outputs.args }} \
--output-dir compliance-results/
continue-on-error: true # Tests may "fail" but that's expected
- name: Upload results
uses: actions/upload-artifact@v7
with:
name: compliance-results
path: compliance-results/
retention-days: 30
- name: Post results to PR
if: github.event_name == 'pull_request'
uses: actions/github-script@v8
with:
script: |
const fs = require('fs');
const path = require('path');
// Find the report directory
const resultsDir = 'compliance-results';
const dirs = fs.readdirSync(resultsDir);
if (dirs.length === 0) {
console.log('No results found');
return;
}
const latestDir = path.join(resultsDir, dirs[0]);
const reportPath = path.join(latestDir, 'compliance_report.md');
if (!fs.existsSync(reportPath)) {
console.log('Report not found at', reportPath);
return;
}
let report = fs.readFileSync(reportPath, 'utf8');
// Truncate if too long
if (report.length > 60000) {
report = report.substring(0, 60000) + '\n\n... (truncated)';
}
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.payload.pull_request.number,
body: report
});
@@ -0,0 +1,224 @@
---
# To set this up:
# 1. Change the name below to something relevant to your task
# 2. Modify the "env" section below with your prompt
# 3. Add your LLM_API_KEY to the repository secrets
# 4. Commit this file to your repository
# 5. Trigger the workflow manually or set up a schedule
name: Assign Reviews
on:
# Manual trigger
workflow_dispatch:
# Scheduled trigger (disabled by default, uncomment and customize as needed)
schedule:
# Run at 12 PM UTC every day
- cron: 0 12 * * *
permissions:
contents: write
pull-requests: write
issues: write
jobs:
run-task:
# Only run scheduled jobs in the main repository, not in forks
if: github.repository == 'OpenHands/software-agent-sdk' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-24.04
env:
# Configuration (modify these values as needed)
AGENT_SCRIPT_URL: https://raw.githubusercontent.com/OpenHands/agent-sdk/main/examples/03_github_workflows/01_basic_action/agent_script.py
# Provide either PROMPT_LOCATION (URL/file) OR PROMPT_STRING (direct text), not both
# Option 1: Use a URL or file path for the prompt
PROMPT_LOCATION: ''
# PROMPT_LOCATION: 'https://example.com/prompts/maintenance.txt'
# Option 2: Use direct text for the prompt
PROMPT_STRING: >
Use GITHUB_TOKEN and the github API to organize open pull requests and issues in the repo.
Read the sections below in order, and perform each in order. Do NOT take action
on the same issue or PR twice.
# Issues with needs-info - Check for OP Response
Find all open issues that have the "needs-info" label. For each issue:
1. Identify the original poster (issue author)
2. Check if there are any comments from the original poster AFTER the "needs-info" label was added
3. To determine when the label was added, use: GET /repos/{owner}/{repo}/issues/{issue_number}/timeline
and look for "labeled" events with the label "needs-info"
4. If the original poster has commented after the label was added:
- Remove the "needs-info" label
- Add the "needs-triage" label
# Issues with needs-triage
Find all open issues that have the "needs-triage" label. For each issue that has been in this state for more than 2 days:
1. First, check whether the issue has already been triaged, i.e. whether it already has either of:
- The "enhancement" label
- Any "priority" label (priority:low, priority:medium, priority:high, etc.)
2. If the issue has already been triaged (has enhancement or priority label), remove the "needs-triage" label
3. For issues that have NOT been triaged yet:
- Read the issue description and comments
- Check if it is a bug report, feature request, or question and add the appropriate label
- If it is a bug report and it does not have a priority label:
* Read the MAINTAINERS file in the repository root to get the list of maintainers
* Extract all usernames from lines starting with "- @" and join them with spaces, each prefixed with @
(e.g., if the file contains "- @user1" and "- @user2", format as "@user1 @user2")
* Tag ALL maintainers with: "[Automatic Post]: This issue has been waiting for triage. <maintainers>, could you
please take a look and add the appropriate priority label when you have a chance?"
(Replace <maintainers> with the formatted list from the previous step)
# Need Reviewer Action
Find all open PRs where:
1. The PR is waiting for review (there are no open review comments or change requests)
2. The PR is in a "clean" state (CI passing, no merge conflicts)
3. The PR is not marked as draft (draft: false)
4. The PR has had no activity (comments, commits, reviews) for more than 3 days.
In this case, send a message to the reviewers:
[Automatic Post]: This PR seems to be currently waiting for review.
{reviewer_names}, could you please take a look when you have a chance?
# Need Author Action
Find all open PRs where the most recent change or comment was made on the pull
request more than 5 days ago (use 14 days if the PR is marked as draft).
Then send a message to the author:
[Automatic Post]: It has been a while since there was any activity on this PR.
{author}, are you still working on it? If so, please go ahead; if not, please
request review, close it, or request that someone else follow up.
# Need Reviewers
Find all open pull requests that TRULY have NO reviewers assigned. To do this correctly:
1. Use the GitHub API to fetch PR details: GET /repos/{owner}/{repo}/pulls/{pull_number}
2. Check the "requested_reviewers" and "requested_teams" arrays
3. ALSO check for submitted reviews: GET /repos/{owner}/{repo}/pulls/{pull_number}/reviews
4. A PR needs reviewers ONLY if ALL of these are true:
- The "requested_reviewers" array is empty (no pending review requests)
- The "requested_teams" array is empty (no pending team review requests)
- The reviews array is empty (no reviews have been submitted yet)
5. IMPORTANT: If ANY of these has entries, SKIP this PR - it already has or had reviewers!
Example API responses showing a PR that DOES NOT need reviewers (skip this):
Case 1 - Has requested reviewers:
GET /pulls/{number}: {"requested_reviewers": [{"login": "someuser"}], "requested_teams": []}
Case 2 - Has submitted reviews (even if requested_reviewers is empty):
GET /pulls/{number}: {"requested_reviewers": [], "requested_teams": []}
GET /pulls/{number}/reviews: [{"user": {"login": "someuser"}, "state": "COMMENTED"}]
Example API response showing a PR that DOES need reviewers (process this):
GET /pulls/{number}: {"requested_reviewers": [], "requested_teams": []}
GET /pulls/{number}/reviews: []
Additional criteria for PRs that need reviewers:
1. Are not marked as draft (draft: false)
2. Were created more than 1 day ago
3. CI is passing and there are no merge conflicts
For each PR that truly has NO reviewers:
1) Read git blame for changed files to identify recent, active contributors.
2) From those candidates, ONLY consider maintainers — repository collaborators with write access or higher. Verify via the GitHub API before
requesting review:
- Preferred: GET /repos/{owner}/{repo}/collaborators (no permission filter). Filter client-side using either:
role_name in ["write", "maintain", "admin"] OR permissions.push || permissions.admin. Note: paginate if > 30 collaborators.
- Alternative: GET /repos/{owner}/{repo}/collaborators/{username}/permission and accept if permission in {push, maintain, admin}.
3) If multiple maintainers qualify, avoid assigning too many reviews to any single one.
4) Request review from exactly one maintainer and add this message:
[Automatic Post]: I have assigned {reviewer} as a reviewer based on git blame information.
Thanks in advance for the help!
LLM_MODEL: litellm_proxy/claude-sonnet-4-5-20250929
LLM_BASE_URL: https://llm-proxy.app.all-hands.dev
steps:
- name: Checkout repository
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install OpenHands dependencies
run: |
# Install OpenHands SDK and tools from git repository
uv pip install --system "openhands-sdk @ git+https://github.com/OpenHands/agent-sdk.git@main#subdirectory=openhands-sdk"
uv pip install --system "openhands-tools @ git+https://github.com/OpenHands/agent-sdk.git@main#subdirectory=openhands-tools"
- name: Check required configuration
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
run: |
if [ -z "$LLM_API_KEY" ]; then
echo "Error: LLM_API_KEY secret is not set."
exit 1
fi
# Check that exactly one of PROMPT_LOCATION or PROMPT_STRING is set
if [ -n "$PROMPT_LOCATION" ] && [ -n "$PROMPT_STRING" ]; then
echo "Error: Both PROMPT_LOCATION and PROMPT_STRING are set."
echo "Please provide only one in the env section of the workflow file."
exit 1
fi
if [ -z "$PROMPT_LOCATION" ] && [ -z "$PROMPT_STRING" ]; then
echo "Error: Neither PROMPT_LOCATION nor PROMPT_STRING is set."
echo "Please set one in the env section of the workflow file."
exit 1
fi
if [ -n "$PROMPT_LOCATION" ]; then
echo "Prompt location: $PROMPT_LOCATION"
else
echo "Using inline PROMPT_STRING (${#PROMPT_STRING} characters)"
fi
echo "LLM model: $LLM_MODEL"
if [ -n "$LLM_BASE_URL" ]; then
echo "LLM base URL: $LLM_BASE_URL"
fi
- name: Run task
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
GITHUB_TOKEN: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
PYTHONPATH: ''
run: |
echo "Running agent script: $AGENT_SCRIPT_URL"
# Download script if it's a URL
if [[ "$AGENT_SCRIPT_URL" =~ ^https?:// ]]; then
echo "Downloading agent script from URL..."
curl -sSL "$AGENT_SCRIPT_URL" -o /tmp/agent_script.py
AGENT_SCRIPT_PATH="/tmp/agent_script.py"
else
AGENT_SCRIPT_PATH="$AGENT_SCRIPT_URL"
fi
# Run with appropriate prompt argument
if [ -n "$PROMPT_LOCATION" ]; then
echo "Using prompt from: $PROMPT_LOCATION"
uv run python "$AGENT_SCRIPT_PATH" "$PROMPT_LOCATION"
else
echo "Using PROMPT_STRING (${#PROMPT_STRING} characters)"
uv run python "$AGENT_SCRIPT_PATH"
fi
- name: Upload logs as artifact
uses: actions/upload-artifact@v7
if: always()
with:
name: openhands-task-logs
path: |
*.log
output/
retention-days: 7
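The "Need Reviewers" rule in the prompt of the workflow above can be sketched as a small predicate. This is only an illustration of the stated rule, not code from the repository: the function name `needs_reviewers` is hypothetical, and the inputs are plain dictionaries shaped like the example API responses quoted in the prompt, so no network call is made.

```python
from typing import Any


def needs_reviewers(pr: dict[str, Any], reviews: list[dict[str, Any]]) -> bool:
    """Return True only when the PR has no pending requests and no submitted reviews."""
    # Pending individual and team review requests come from the PR details payload.
    has_requested_users = bool(pr.get("requested_reviewers"))
    has_requested_teams = bool(pr.get("requested_teams"))
    # Submitted reviews come from the separate /reviews endpoint.
    has_submitted_reviews = bool(reviews)
    return not (has_requested_users or has_requested_teams or has_submitted_reviews)


# The three cases from the prompt, expressed as (PR details, reviews) pairs.
case_1 = ({"requested_reviewers": [{"login": "someuser"}], "requested_teams": []}, [])
case_2 = (
    {"requested_reviewers": [], "requested_teams": []},
    [{"user": {"login": "someuser"}, "state": "COMMENTED"}],
)
case_3 = ({"requested_reviewers": [], "requested_teams": []}, [])

for name, (pr, reviews) in {"case 1": case_1, "case 2": case_2, "case 3": case_3}.items():
    print(name, "needs reviewers:", needs_reviewers(pr, reviews))
```

Only the third case satisfies all three conditions. The draft, age, and CI criteria listed in the prompt would be checked separately.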
@@ -0,0 +1,36 @@
---
name: Auto-label New Issues
on:
issues:
types: [opened]
permissions:
issues: write
jobs:
add-triage-label:
runs-on: ubuntu-latest
steps:
- name: Add needs-triage label
uses: actions/github-script@v8
with:
github-token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
script: |
// Get the issue details
const issue = context.payload.issue;
const labels = issue.labels.map(label => label.name);
// Check if issue has already been triaged
const hasEnhancement = labels.includes('enhancement');
const hasPriority = labels.some(label => label.startsWith('priority'));
// Only add needs-triage if not already triaged
if (!hasEnhancement && !hasPriority) {
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
labels: ['needs-triage']
});
}
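The triage check in the script above is small enough to restate as a pure function, which makes the rule easy to test without a GitHub event. The following is a Python sketch of the same condition; `should_add_needs_triage` is a hypothetical name, and the input is assumed to be the list of label names already extracted from the issue payload.

```python
def should_add_needs_triage(label_names: list[str]) -> bool:
    """Mirror the workflow's rule: skip issues that already look triaged."""
    # An issue counts as triaged if it is marked as an enhancement...
    has_enhancement = "enhancement" in label_names
    # ...or carries any label whose name starts with "priority".
    has_priority = any(name.startswith("priority") for name in label_names)
    return not has_enhancement and not has_priority


print(should_add_needs_triage([]))                 # no labels yet
print(should_add_needs_triage(["priority:high"]))  # already prioritized
```

Note that the prefix match also treats a label such as `priority-discussion` as triaged, which is the same behavior as `label.startsWith('priority')` in the script.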
@@ -0,0 +1,25 @@
---
# .github/workflows/check-docstrings.yml
name: Check Docstrings
on:
push:
branches: [main]
pull_request:
branches: ['**']
jobs:
check-docstrings:
runs-on: ubuntu-24.04
steps:
- name: Checkout code
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Check docstring formatting
run: python .github/scripts/check_docstrings.py
@@ -0,0 +1,59 @@
---
name: '[Optional] Docs example'
on:
pull_request:
branches:
- '**'
paths:
- examples/**/*.py
- '!examples/03_github_workflows/**'
- '!examples/04_llm_specific_tools/**'
- .github/workflows/check-documented-examples.yml
- .github/scripts/check_documented_examples.py
workflow_dispatch:
permissions:
contents: read
pull-requests: read
jobs:
check-examples:
runs-on: ubuntu-latest
steps:
- name: Checkout agent-sdk repository
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Checkout docs repository (try feature branch)
uses: actions/checkout@v5
continue-on-error: true
id: checkout-feature
with:
repository: OpenHands/docs
path: docs
fetch-depth: 0
ref: ${{ github.head_ref || github.ref_name }}
- name: Checkout docs repository (fallback to main)
if: steps.checkout-feature.outcome == 'failure'
uses: actions/checkout@v5
with:
repository: OpenHands/docs
path: docs
fetch-depth: 0
ref: main
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Check documented examples
env:
DOCS_PATH: ${{ github.workspace }}/docs
shell: bash
run: |
set -euo pipefail
python .github/scripts/check_documented_examples.py
@@ -0,0 +1,35 @@
---
name: Check duplicate example numbers
on:
pull_request:
branches:
- '**'
paths:
- examples/**
- .github/workflows/check-duplicate-examples.yml
- .github/scripts/check_duplicate_example_numbers.py
push:
branches:
- main
paths:
- examples/**
workflow_dispatch:
permissions:
contents: read
jobs:
check-duplicates:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Check for duplicate example numbers
run: python .github/scripts/check_duplicate_example_numbers.py
@@ -1,65 +0,0 @@
name: Check Package Versions
on:
push:
branches: [main]
pull_request:
workflow_dispatch:
jobs:
check-package-versions:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: "3.12"
- name: Check for any 'rev' fields in pyproject.toml
run: |
python - <<'PY'
import sys, tomllib, pathlib
path = pathlib.Path("pyproject.toml")
if not path.exists():
print("❌ ERROR: pyproject.toml not found")
sys.exit(1)
try:
data = tomllib.loads(path.read_text(encoding="utf-8"))
except Exception as e:
print(f"❌ ERROR: Failed to parse pyproject.toml: {e}")
sys.exit(1)
poetry = data.get("tool", {}).get("poetry", {})
sections = {
"dependencies": poetry.get("dependencies", {}),
}
errors = []
print("🔍 Checking for any dependencies with 'rev' fields...\n")
for section_name, deps in sections.items():
if not isinstance(deps, dict):
continue
for pkg_name, cfg in deps.items():
if isinstance(cfg, dict) and "rev" in cfg:
msg = f" ✖ {pkg_name} in [{section_name}] uses rev='{cfg['rev']}' (NOT ALLOWED)"
print(msg)
errors.append(msg)
else:
print(f" • {pkg_name}: OK")
if errors:
print("\n❌ FAILED: Found dependencies using 'rev' fields:\n" + "\n".join(errors))
print("\nPlease use versioned releases instead, e.g.:")
print(' my-package = "1.0.0"')
sys.exit(1)
print("\n✅ SUCCESS: No 'rev' fields found. All dependencies are using proper versioned releases.")
PY
@@ -0,0 +1,244 @@
---
name: Run Condenser Tests
on:
# Use pull_request_target to access secrets even on fork PRs
# This is safe because we only run when the 'condenser-test' label is added by a maintainer
pull_request_target:
types:
- labeled
workflow_dispatch:
inputs:
reason:
description: Reason for manual trigger
required: true
default: ''
env:
N_PROCESSES: 2 # Fewer parallel processes for condenser tests (only 2 LLMs)
jobs:
post-initial-comment:
if: >
github.event_name == 'pull_request_target' &&
github.event.label.name == 'condenser-test'
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- name: Comment on PR
uses: KeisukeYamashita/create-comment@v1
with:
unique: false
comment: |
Hi! I started running the condenser tests on your PR. You will receive a comment with the results shortly.
Note: These are non-blocking tests that validate condenser functionality across different LLMs.
run-condenser-tests:
# Security: Only run when condenser-test label is present or via workflow_dispatch
# This prevents automatic execution on fork PRs without maintainer approval
if: |
always() && (
(
github.event_name == 'pull_request_target' &&
github.event.label.name == 'condenser-test'
) ||
github.event_name == 'workflow_dispatch'
)
runs-on: ubuntu-22.04
permissions:
contents: read
id-token: write
pull-requests: write
strategy:
matrix:
python-version: ['3.13']
job-config:
# Only run against 2 LLMs for condenser tests:
# - Claude Opus 4.5 (primary - supports thinking blocks)
# - GPT-5.1 Codex Max (secondary - cross-LLM validation)
- name: Claude Opus 4.5
run-suffix: opus_condenser_run
llm-config:
model: litellm_proxy/anthropic/claude-opus-4-5-20251101
extended_thinking: true
- name: GPT-5.1 Codex Max
run-suffix: gpt51_condenser_run
llm-config:
model: litellm_proxy/gpt-5.1-codex-max
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# For pull_request_target: checkout fork PR code (requires explicit repository)
# For other events: fallback to current repository and ref
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.sha || github.ref }}
# Security: Don't persist credentials to prevent untrusted PR code from using them
persist-credentials: false
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: ${{ matrix.python-version }}
- name: Install Python dependencies using uv
run: |
uv sync --dev
uv pip install pytest
- name: Run condenser test evaluation for ${{ matrix.job-config.name }}
env:
LLM_CONFIG: ${{ toJson(matrix.job-config.llm-config) }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_BASE_URL: https://llm-proxy.app.all-hands.dev
run: |
set -eo pipefail
AGENT_SDK_VERSION=$(git rev-parse --short HEAD)
EVAL_NOTE="${AGENT_SDK_VERSION}_${{ matrix.job-config.run-suffix }}"
echo "Running condenser tests only (c*.py pattern)"
uv run python tests/integration/run_infer.py \
--llm-config "$LLM_CONFIG" \
--num-workers $N_PROCESSES \
--eval-note "$EVAL_NOTE" \
--test-type condenser
# get condenser tests JSON results
RESULTS_FILE=$(find tests/integration/outputs/*${{ matrix.job-config.run-suffix }}* -name "results.json" -type f | head -n 1)
echo "RESULTS_FILE: $RESULTS_FILE"
if [ -f "$RESULTS_FILE" ]; then
echo "JSON_RESULTS_FILE=$RESULTS_FILE" >> $GITHUB_ENV
else
echo "JSON_RESULTS_FILE=" >> $GITHUB_ENV
fi
- name: Wait a little bit
run: sleep 10
- name: Create archive of evaluation outputs
run: |
TIMESTAMP=$(date +'%y-%m-%d-%H-%M')
cd tests/integration/outputs # Change to the outputs directory
tar -czvf ../../../condenser_tests_${{ matrix.job-config.run-suffix }}_${TIMESTAMP}.tar.gz *${{ matrix.job-config.run-suffix }}* # Include result directories for this model
- name: Upload evaluation results as artifact
uses: actions/upload-artifact@v7
id: upload_results_artifact
with:
name: condenser-test-outputs-${{ matrix.job-config.run-suffix }}-${{ github.run_id }}-${{ github.run_attempt }}
path: condenser_tests_${{ matrix.job-config.run-suffix }}_*.tar.gz
- name: Save test results for consolidation
run: |
# Copy the structured JSON results file for consolidation
mkdir -p test_results_summary
if [ -n "${{ env.JSON_RESULTS_FILE }}" ] && [ -f "${{ env.JSON_RESULTS_FILE }}" ]; then
# Copy the JSON results file directly
cp "${{ env.JSON_RESULTS_FILE }}" "test_results_summary/${{ matrix.job-config.run-suffix }}_results.json"
echo "✓ Copied JSON results file for consolidation"
else
echo "✗ No JSON results file found"
exit 1
fi
- name: Upload test results summary
uses: actions/upload-artifact@v7
with:
name: test-results-${{ matrix.job-config.run-suffix }}
path: test_results_summary/${{ matrix.job-config.run-suffix }}_results.json
consolidate-results:
needs: run-condenser-tests
if: |
always() && (
(
github.event_name == 'pull_request_target' &&
github.event.label.name == 'condenser-test'
) ||
github.event_name == 'workflow_dispatch'
)
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# When using pull_request_target, explicitly checkout the PR branch
# This ensures we use the scripts from the actual PR code
ref: ${{ github.event.pull_request.head.sha || github.ref }}
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install Python dependencies using uv
run: |
uv sync --dev
- name: Download all test results
uses: actions/download-artifact@v8
with:
pattern: test-results-*
merge-multiple: true
path: all_results
- name: Download all condenser test artifacts
uses: actions/download-artifact@v8
with:
pattern: condenser-test-outputs-*
path: artifacts
- name: Consolidate test results
env:
EVENT_NAME: ${{ github.event_name }}
PR_NUMBER: ${{ github.event.pull_request.number }}
MANUAL_REASON: ${{ github.event.inputs.reason }}
COMMIT_SHA: ${{ github.sha }}
PYTHONPATH: ${{ github.workspace }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
run: |
uv run python tests/integration/utils/consolidate_json_results.py \
--results-dir all_results \
--artifacts-dir artifacts \
--output-file consolidated_results.json
echo "Consolidated results generated successfully"
uv run python tests/integration/utils/generate_markdown_report.py \
--input-file consolidated_results.json \
--output-file consolidated_report.md
- name: Upload consolidated report
uses: actions/upload-artifact@v7
with:
name: consolidated-condenser-report
path: consolidated_report.md
- name: Create consolidated PR comment
if: github.event_name == 'pull_request_target'
run: |
# Add header to clarify these are non-blocking tests
echo "## Condenser Test Results (Non-Blocking)" > final_report.md
echo "" >> final_report.md
echo "> These tests validate condenser functionality and do not block PR merges." >> final_report.md
echo "" >> final_report.md
cat consolidated_report.md >> final_report.md
# Sanitize @OpenHands mentions to prevent self-mention loops
COMMENT_BODY=$(uv run python -c "from openhands.sdk.utils.github import sanitize_openhands_mentions; import sys; print(sanitize_openhands_mentions(sys.stdin.read()), end='')" < final_report.md)
# Use GitHub CLI to create comment with explicit PR number
echo "$COMMENT_BODY" | gh pr comment ${{ github.event.pull_request.number }} --body-file -
env:
GH_TOKEN: ${{ github.token }}
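The evaluation step in the workflow above locates its results with `find ... -name "results.json" -type f | head -n 1`. A rough Python equivalent is sketched below, assuming the same layout (one output entry per run whose name contains the run suffix). The function name `find_results_file` is hypothetical, and unlike `find`, this sketch sorts candidates so the choice is deterministic.

```python
from pathlib import Path
from typing import Optional


def find_results_file(outputs_dir: Path, run_suffix: str) -> Optional[Path]:
    """Return the first results.json under any entry whose name contains run_suffix."""
    # Match entries the same way the shell glob *<suffix>* does.
    for entry in sorted(outputs_dir.glob(f"*{run_suffix}*")):
        if entry.is_dir():
            # Search the run directory recursively, like `find <dir> -name results.json`.
            candidates = sorted(entry.rglob("results.json"))
        else:
            # A matching plain file only counts if it is itself named results.json.
            candidates = [entry] if entry.name == "results.json" else []
        for candidate in candidates:
            if candidate.is_file():
                return candidate
    # Nothing found: the workflow records an empty JSON_RESULTS_FILE in this case.
    return None
```

A caller could use `find_results_file(Path("tests/integration/outputs"), "opus_condenser_run")` and treat `None` the same way the later "Save test results for consolidation" step treats an empty `JSON_RESULTS_FILE`, by failing the job.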
@@ -0,0 +1,23 @@
---
name: Dispatch to docs repo
on:
push:
branches:
- main
paths:
- openhands-agent-server/**
workflow_dispatch:
jobs:
dispatch:
runs-on: ubuntu-24.04
permissions:
contents: write
steps:
- name: Trigger docs repo sync
uses: peter-evans/repository-dispatch@v4
with:
token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
repository: OpenHands/docs
event-type: update
client-payload: '{"ref": "${{ github.ref }}", "sha": "${{ github.sha }}"}'
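The `client-payload` above is assembled by interpolating values into a JSON string. As a sketch of the request body this corresponds to, assuming the action forwards `event-type` and `client-payload` to GitHub's repository dispatch endpoint (`POST /repos/{owner}/{repo}/dispatches`) as `event_type` and `client_payload`, the same data can be built with a JSON serializer. The helper name `build_dispatch_body` is hypothetical.

```python
import json


def build_dispatch_body(ref: str, sha: str, event_type: str = "update") -> str:
    """Serialize a repository_dispatch body carrying the pushed ref and commit SHA."""
    body = {
        "event_type": event_type,
        # Mirrors the workflow's client-payload: {"ref": ..., "sha": ...}
        "client_payload": {"ref": ref, "sha": sha},
    }
    # json.dumps handles quoting and escaping, unlike manual string interpolation.
    return json.dumps(body)


print(build_dispatch_body("refs/heads/main", "0123abc"))
```

Serializing rather than interpolating matters mainly when a value could contain quotes or backslashes; for a branch ref and a SHA the two approaches should produce equivalent JSON.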
@@ -0,0 +1,24 @@
---
name: Deprecation deadlines
on:
push:
branches: [main]
pull_request:
branches: ['**']
jobs:
check:
runs-on: ubuntu-24.04
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Verify deprecation removals
run: uv run --with packaging python .github/scripts/check_deprecations.py
@@ -1,228 +0,0 @@
name: End-to-End Tests
on:
pull_request:
types: [opened, synchronize, reopened, labeled]
branches:
- main
- develop
workflow_dispatch:
jobs:
e2e-tests:
if: contains(github.event.pull_request.labels.*.name, 'end-to-end') || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
timeout-minutes: 60
env:
GITHUB_REPO_NAME: ${{ github.repository }}
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Install poetry via pipx
uses: abatilo/actions-poetry@v4
with:
poetry-version: 2.1.3
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.12'
cache: 'poetry'
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y libgtk-3-0 libnotify4 libnss3 libxss1 libxtst6 xauth xvfb libgbm1 libasound2t64 netcat-openbsd
- name: Setup Node.js
uses: actions/setup-node@v6
with:
node-version: '22'
cache: 'npm'
cache-dependency-path: 'frontend/package-lock.json'
- name: Setup environment for end-to-end tests
run: |
# Create test results directory
mkdir -p test-results
# Create downloads directory for OpenHands (use a directory in the home folder)
mkdir -p $HOME/downloads
sudo chown -R $USER:$USER $HOME/downloads
sudo chmod -R 755 $HOME/downloads
- name: Build OpenHands
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
LLM_MODEL: ${{ secrets.LLM_MODEL || 'gpt-4o' }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY || 'test-key' }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
INSTALL_DOCKER: 1
RUNTIME: docker
FRONTEND_PORT: 12000
FRONTEND_HOST: 0.0.0.0
BACKEND_HOST: 0.0.0.0
BACKEND_PORT: 3000
ENABLE_BROWSER: true
INSTALL_PLAYWRIGHT: 1
run: |
# Fix poetry.lock file if needed
echo "Fixing poetry.lock file if needed..."
poetry lock
# Build OpenHands using make build
echo "Running make build..."
make build
# Install Chromium Headless Shell for Playwright (needed for pytest-playwright)
echo "Installing Chromium Headless Shell for Playwright..."
poetry run playwright install chromium-headless-shell
# Verify Playwright browsers are installed (for e2e tests only)
echo "Verifying Playwright browsers installation for e2e tests..."
BROWSER_CHECK=$(poetry run python tests/e2e/check_playwright.py 2>/dev/null)
if [ "$BROWSER_CHECK" != "chromium_found" ]; then
echo "ERROR: Chromium browser not found or not working for e2e tests"
echo "$BROWSER_CHECK"
exit 1
else
echo "Playwright browsers are properly installed for e2e tests."
fi
# Docker runtime will handle workspace directory creation
# Start the application using make run with custom parameters and reduced logging
echo "Starting OpenHands using make run..."
# Set environment variables to reduce logging verbosity
export PYTHONUNBUFFERED=1
export LOG_LEVEL=WARNING
export UVICORN_LOG_LEVEL=warning
export OPENHANDS_LOG_LEVEL=WARNING
FRONTEND_PORT=12000 FRONTEND_HOST=0.0.0.0 BACKEND_HOST=0.0.0.0 make run > /tmp/openhands-e2e-test.log 2>&1 &
# Store the PID of the make run process
MAKE_PID=$!
echo "OpenHands started with PID: $MAKE_PID"
# Wait for the application to start
echo "Waiting for OpenHands to start..."
max_attempts=15
attempt=1
while [ $attempt -le $max_attempts ]; do
echo "Checking if OpenHands is running (attempt $attempt of $max_attempts)..."
# Check if the process is still running
if ! ps -p $MAKE_PID > /dev/null; then
echo "ERROR: OpenHands process has terminated unexpectedly"
echo "Last 50 lines of the log:"
tail -n 50 /tmp/openhands-e2e-test.log
exit 1
fi
# Check if frontend port is open
if nc -z localhost 12000; then
# Verify we can get HTML content
if curl -s http://localhost:12000 | grep -q "<html"; then
echo "SUCCESS: OpenHands is running and serving HTML content on port 12000"
break
else
echo "Port 12000 is open but not serving HTML content yet"
fi
else
echo "Frontend port 12000 is not open yet"
fi
# Show log output on each attempt
echo "Recent log output:"
tail -n 20 /tmp/openhands-e2e-test.log
# Wait before next attempt
echo "Waiting 10 seconds before next check..."
sleep 10
attempt=$((attempt + 1))
# Exit if we've reached the maximum number of attempts
if [ $attempt -gt $max_attempts ]; then
echo "ERROR: OpenHands failed to start after $max_attempts attempts"
echo "Last 50 lines of the log:"
tail -n 50 /tmp/openhands-e2e-test.log
exit 1
fi
done
# Final verification that the app is running
if ! nc -z localhost 12000 || ! curl -s http://localhost:12000 | grep -q "<html"; then
echo "ERROR: OpenHands is not running properly on port 12000"
echo "Last 50 lines of the log:"
tail -n 50 /tmp/openhands-e2e-test.log
exit 1
fi
# Print success message
echo "OpenHands is running successfully on port 12000"
- name: Run end-to-end tests
env:
GITHUB_TOKEN: ${{ secrets.E2E_TEST_GITHUB_TOKEN }}
LLM_MODEL: ${{ secrets.LLM_MODEL || 'gpt-4o' }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY || 'test-key' }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
run: |
# Check if the application is running
if ! nc -z localhost 12000; then
echo "ERROR: OpenHands is not running on port 12000"
echo "Last 50 lines of the log:"
tail -n 50 /tmp/openhands-e2e-test.log
exit 1
fi
# Run the tests with detailed output
cd tests/e2e
poetry run python -m pytest \
test_settings.py::test_github_token_configuration \
test_conversation.py::test_conversation_start \
test_browsing_catchphrase.py::test_browsing_catchphrase \
test_multi_conversation_resume.py::test_multi_conversation_resume \
-v --no-header --capture=no --timeout=900
- name: Upload test results
if: always()
uses: actions/upload-artifact@v6
with:
name: playwright-report
path: tests/e2e/test-results/
retention-days: 30
- name: Upload OpenHands logs
if: always()
uses: actions/upload-artifact@v6
with:
name: openhands-logs
path: |
/tmp/openhands-e2e-test.log
/tmp/openhands-e2e-build.log
/tmp/openhands-backend.log
/tmp/openhands-frontend.log
/tmp/backend-health-check.log
/tmp/frontend-check.log
/tmp/vite-config.log
/tmp/makefile-contents.log
retention-days: 30
- name: Cleanup
if: always()
run: |
# Stop OpenHands processes
echo "Stopping OpenHands processes..."
pkill -f "python -m openhands.server" || true
pkill -f "npm run dev" || true
pkill -f "make run" || true
# Print process status for debugging
echo "Checking if any OpenHands processes are still running:"
ps aux | grep -E "openhands|npm run dev" || true
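The startup wait in the removed "Build OpenHands" step is a bounded retry loop: probe the frontend port, sleep, and give up after a fixed number of attempts. A minimal Python sketch of that pattern follows. The names `port_is_open` and `wait_until_ready` are hypothetical, the probe is injected so the loop can be exercised without a real server, and the process-liveness and HTML checks from the shell version are left out.

```python
import socket
import time
from typing import Callable


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Probe a TCP port, roughly what `nc -z host port` does."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_ready(
    check: Callable[[], bool],
    max_attempts: int = 15,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() up to max_attempts times, sleeping between failed attempts."""
    for attempt in range(1, max_attempts + 1):
        if check():
            return True
        # Unlike the shell loop, skip the sleep after the final failed attempt.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return False
```

With the workflow's values this would be called as `wait_until_ready(lambda: port_is_open("localhost", 12000))`, which allows roughly the same 15 attempts at 10-second intervals before reporting failure.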
@@ -1,52 +0,0 @@
name: Enterprise Check Migrations
on:
pull_request:
paths:
- 'enterprise/migrations/**'
jobs:
check-sync:
runs-on: ubuntu-latest
steps:
- name: Checkout PR branch
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
fetch-depth: 0
- name: Fetch base branch
run: git fetch origin ${{ github.event.pull_request.base.ref }}
- name: Check if base branch is ancestor of PR
id: check_up_to_date
shell: bash
run: |
BASE="origin/${{ github.event.pull_request.base.ref }}"
HEAD="${{ github.event.pull_request.head.sha }}"
if git merge-base --is-ancestor "$BASE" "$HEAD"; then
echo "We're up to date with base $BASE"
exit 0
else
echo "NOT up to date with base $BASE"
exit 1
fi
- name: Find Comment
uses: peter-evans/find-comment@v3
id: find-comment
with:
issue-number: ${{ github.event.pull_request.number }}
comment-author: 'github-actions[bot]'
body-includes: |
⚠️ This PR contains **migrations**
- name: Comment warning on PR
uses: peter-evans/create-or-update-comment@v5
with:
issue-number: ${{ github.event.pull_request.number }}
comment-id: ${{ steps.find-comment.outputs.comment-id }}
edit-mode: replace
body: |
⚠️ This PR contains **migrations**. Please synchronize before merging to prevent conflicts.
@@ -1,29 +0,0 @@
# Feature branch preview for enterprise code
name: Enterprise Preview
# Run on PRs labeled
on:
pull_request:
types: [labeled]
# Match ghcr-build.yml, but don't interrupt it.
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: false
jobs:
# This must happen for the PR Docker workflow when the label is present,
# and also if it's added after the fact. Thus, it exists in both places.
enterprise-preview:
name: Enterprise preview
if: github.event.label.name == 'deploy'
runs-on: blacksmith-4vcpu-ubuntu-2204
steps:
# This should match the version in ghcr-build.yml
- name: Trigger remote job
run: |
curl --fail-with-body -sS -X POST \
-H "Authorization: Bearer ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}" \
-H "Accept: application/vnd.github+json" \
-d "{\"ref\": \"main\", \"inputs\": {\"openhandsPrNumber\": \"${{ github.event.pull_request.number }}\", \"deployEnvironment\": \"feature\", \"enterpriseImageTag\": \"pr-${{ github.event.pull_request.number }}\" }}" \
https://api.github.com/repos/OpenHands/deploy/actions/workflows/deploy.yaml/dispatches
@@ -1,47 +0,0 @@
# Workflow that runs frontend e2e tests with Playwright
name: Run Frontend E2E Tests
on:
push:
branches:
- main
pull_request:
paths:
- "frontend/**"
- ".github/workflows/fe-e2e-tests.yml"
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
fe-e2e-test:
name: FE E2E Tests
runs-on: blacksmith-4vcpu-ubuntu-2204
strategy:
matrix:
node-version: [22]
fail-fast: true
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Node.js
uses: useblacksmith/setup-node@v5
with:
node-version: ${{ matrix.node-version }}
- name: Install dependencies
working-directory: ./frontend
run: npm ci
- name: Install Playwright browsers
working-directory: ./frontend
run: npx playwright install --with-deps chromium
- name: Run Playwright tests
working-directory: ./frontend
run: npx playwright test --project=chromium
- name: Upload Playwright report
uses: actions/upload-artifact@v6
if: always()
with:
name: playwright-report
path: frontend/playwright-report/
retention-days: 30
@@ -1,44 +0,0 @@
# Workflow that runs frontend unit tests
name: Run Frontend Unit Tests
# * Always run on "main"
# * Run on PRs that have changes in the "frontend" folder or this workflow
on:
push:
branches:
- main
pull_request:
paths:
- "frontend/**"
- ".github/workflows/fe-unit-tests.yml"
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
# Run frontend unit tests
fe-test:
name: FE Unit Tests
runs-on: blacksmith-4vcpu-ubuntu-2204
strategy:
matrix:
node-version: [22]
fail-fast: true
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Node.js
uses: useblacksmith/setup-node@v5
with:
node-version: ${{ matrix.node-version }}
- name: Install dependencies
working-directory: ./frontend
run: npm ci
- name: Run TypeScript compilation
working-directory: ./frontend
run: npm run build
- name: Run tests and collect coverage
working-directory: ./frontend
run: npm run test:coverage
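The concurrency group expression used in the workflow above, `(github.head_ref && github.ref) || github.run_id`, relies on short-circuit evaluation. The sketch below models it in Python, where `and` and `or` return operands in the same way. It assumes that `head_ref` is empty outside pull request events; `concurrency_group` is a hypothetical helper, not part of any workflow.

```python
def concurrency_group(workflow: str, head_ref: str, ref: str, run_id: int) -> str:
    """Model `${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}`."""
    # For pull requests head_ref is non-empty, so the key is the PR's ref and
    # every run for that PR shares one group (newer runs cancel older ones).
    # For pushes head_ref is empty, so the key falls back to the unique run id.
    key = (head_ref and ref) or str(run_id)
    return f"{workflow}-{key}"


print(concurrency_group("Run Frontend Unit Tests", "feature-x", "refs/pull/42/merge", 1001))
print(concurrency_group("Run Frontend Unit Tests", "", "refs/heads/main", 1002))
```

This matches the comment in the workflow: pull request runs land in the same group, while each commit on main gets its own group and is never cancelled by a later one.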
@@ -1,288 +0,0 @@
# Workflow that builds, tests and then pushes the OpenHands and runtime docker images to the ghcr.io repository
name: Docker
# Always run on "main"
# Always run on tags
# Always run on PRs
# Can also be triggered manually
on:
push:
branches:
- main
tags:
- "*"
pull_request:
workflow_dispatch:
inputs:
reason:
description: "Reason for manual trigger"
required: true
default: ""
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
env:
RELEVANT_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
jobs:
define-matrix:
runs-on: blacksmith
outputs:
base_image: ${{ steps.define-base-images.outputs.base_image }}
steps:
- name: Define base images
shell: bash
id: define-base-images
run: |
if [[ "$GITHUB_EVENT_NAME" == "pull_request" ]]; then
json=$(jq -n -c '[
{ image: "nikolaik/python-nodejs:python3.12-nodejs22", tag: "nikolaik" }
]')
else
json=$(jq -n -c '[
{ image: "nikolaik/python-nodejs:python3.12-nodejs22", tag: "nikolaik" },
{ image: "ubuntu:24.04", tag: "ubuntu" }
]')
fi
echo "base_image=$json" >> "$GITHUB_OUTPUT"
# Builds the OpenHands Docker images
ghcr_build_app:
name: Build App Image
runs-on: blacksmith-4vcpu-ubuntu-2204
if: "!(github.event_name == 'push' && startsWith(github.ref, 'refs/tags/ext-v'))"
permissions:
contents: read
packages: write
steps:
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3.7.0
with:
image: tonistiigi/binfmt:latest
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Lowercase Repository Owner
run: |
echo REPO_OWNER=$(echo ${{ github.repository_owner }} | tr '[:upper:]' '[:lower:]') >> $GITHUB_ENV
- name: Build and push app image
if: "!github.event.pull_request.head.repo.fork"
run: |
./containers/build.sh -i openhands -o ${{ env.REPO_OWNER }} --push
# Builds the runtime Docker images
ghcr_build_runtime:
name: Build Runtime Image
runs-on: blacksmith-8vcpu-ubuntu-2204
if: "!(github.event_name == 'push' && startsWith(github.ref, 'refs/tags/ext-v'))"
permissions:
contents: read
packages: write
needs: define-matrix
strategy:
matrix:
base_image: ${{ fromJson(needs.define-matrix.outputs.base_image) }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3.7.0
with:
image: tonistiigi/binfmt:latest
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: useblacksmith/setup-python@v6
with:
python-version: "3.12"
cache: poetry
- name: Install Python dependencies using Poetry
run: make install-python-dependencies POETRY_GROUP=main INSTALL_PLAYWRIGHT=0
- name: Create source distribution and Dockerfile
run: poetry run python3 -m openhands.runtime.utils.runtime_build --base_image ${{ matrix.base_image.image }} --build_folder containers/runtime --force_rebuild
- name: Lowercase Repository Owner
run: |
echo REPO_OWNER=$(echo ${{ github.repository_owner }} | tr '[:upper:]' '[:lower:]') >> $GITHUB_ENV
- name: Short SHA
run: |
echo SHORT_SHA=$(git rev-parse --short "$RELEVANT_SHA") >> $GITHUB_ENV
- name: Determine docker build params
if: github.event.pull_request.head.repo.fork != true
shell: bash
run: |
./containers/build.sh -i runtime -o ${{ env.REPO_OWNER }} -t ${{ matrix.base_image.tag }} --dry
DOCKER_BUILD_JSON=$(jq -c . < docker-build-dry.json)
echo "DOCKER_TAGS=$(echo "$DOCKER_BUILD_JSON" | jq -r '.tags | join(",")')" >> $GITHUB_ENV
echo "DOCKER_PLATFORM=$(echo "$DOCKER_BUILD_JSON" | jq -r '.platform')" >> $GITHUB_ENV
echo "DOCKER_BUILD_ARGS=$(echo "$DOCKER_BUILD_JSON" | jq -r '.build_args | join(",")')" >> $GITHUB_ENV
- name: Build and push runtime image ${{ matrix.base_image.image }}
if: github.event.pull_request.head.repo.fork != true
uses: useblacksmith/build-push-action@v1
with:
push: true
tags: ${{ env.DOCKER_TAGS }}
platforms: ${{ env.DOCKER_PLATFORM }}
# Caching directives to boost performance
cache-from: type=registry,ref=ghcr.io/${{ env.REPO_OWNER }}/runtime:buildcache-${{ matrix.base_image.tag }}
cache-to: type=registry,ref=ghcr.io/${{ env.REPO_OWNER }}/runtime:buildcache-${{ matrix.base_image.tag }},mode=max
build-args: ${{ env.DOCKER_BUILD_ARGS }}
context: containers/runtime
provenance: false
# Forked repos can't push to GHCR, so we just build in order to populate the cache for rebuilding
- name: Build runtime image ${{ matrix.base_image.image }} for fork
if: github.event.pull_request.head.repo.fork
uses: useblacksmith/build-push-action@v1
with:
tags: ghcr.io/${{ env.REPO_OWNER }}/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image.tag }}
context: containers/runtime
- name: Upload runtime source for fork
if: github.event.pull_request.head.repo.fork
uses: actions/upload-artifact@v6
with:
name: runtime-src-${{ matrix.base_image.tag }}
path: containers/runtime
ghcr_build_enterprise:
name: Push Enterprise Image
runs-on: blacksmith-8vcpu-ubuntu-2204
permissions:
contents: read
packages: write
needs: [define-matrix, ghcr_build_app]
# Do not build enterprise in forks
if: github.event.pull_request.head.repo.fork != true
steps:
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
# Set up Docker Buildx for better performance
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
driver-opts: network=host
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/metadata-action@v5
with:
images: ghcr.io/openhands/enterprise-server
tags: |
type=ref,event=branch
type=ref,event=pr
type=sha
type=sha,format=long
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=semver,pattern={{major}}
flavor: |
latest=auto
prefix=
suffix=
env:
DOCKER_METADATA_PR_HEAD_SHA: true
- name: Determine app image tag
shell: bash
run: |
# Duplicated with build.sh
sanitized_ref_name=$(echo "$GITHUB_REF_NAME" | sed 's/[^a-zA-Z0-9.-]\+/-/g')
OPENHANDS_BUILD_VERSION=$sanitized_ref_name
sanitized_ref_name=$(echo "$sanitized_ref_name" | tr '[:upper:]' '[:lower:]') # lower case is required in tagging
echo "OPENHANDS_DOCKER_TAG=${sanitized_ref_name}" >> $GITHUB_ENV
- name: Build and push Docker image
uses: useblacksmith/build-push-action@v1
with:
context: .
file: enterprise/Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
build-args: |
OPENHANDS_VERSION=${{ env.OPENHANDS_DOCKER_TAG }}
platforms: linux/amd64
# Add build provenance
provenance: true
# Add build attestations for better security
sbom: true
enterprise-preview:
name: Enterprise preview
if: github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'deploy')
runs-on: blacksmith-4vcpu-ubuntu-2204
needs: [ghcr_build_enterprise]
steps:
# This should match the version in enterprise-preview.yml
- name: Trigger remote job
run: |
curl --fail-with-body -sS -X POST \
-H "Authorization: Bearer ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}" \
-H "Accept: application/vnd.github+json" \
-d "{\"ref\": \"main\", \"inputs\": {\"openhandsPrNumber\": \"${{ github.event.pull_request.number }}\", \"deployEnvironment\": \"feature\", \"enterpriseImageTag\": \"pr-${{ github.event.pull_request.number }}\" }}" \
https://api.github.com/repos/OpenHands/deploy/actions/workflows/deploy.yaml/dispatches
# "All Runtime Tests Passed" is a required job for PRs to merge
# We can remove this once the config changes
runtime_tests_check_success:
name: All Runtime Tests Passed
runs-on: blacksmith-4vcpu-ubuntu-2204
steps:
- name: All tests passed
run: echo "All runtime tests have passed successfully!"
update_pr_description:
name: Update PR Description
if: github.event_name == 'pull_request' && !github.event.pull_request.head.repo.fork && github.actor != 'dependabot[bot]'
needs: [ghcr_build_runtime]
runs-on: blacksmith-4vcpu-ubuntu-2204
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Get short SHA
id: short_sha
run: echo "SHORT_SHA=$(echo ${{ github.event.pull_request.head.sha }} | cut -c1-7)" >> $GITHUB_OUTPUT
- name: Update PR Description
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REPO: ${{ github.repository }}
SHORT_SHA: ${{ steps.short_sha.outputs.SHORT_SHA }}
shell: bash
run: |
echo "Updating PR description with Docker and uvx commands"
bash ${GITHUB_WORKSPACE}/.github/scripts/update_pr_description.sh
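The "Determine app image tag" step in the workflow above turns the Git ref name into a Docker-safe tag with a `sed` and `tr` pipeline. A minimal Python sketch of the same transformation follows, for readers who want to check what tag a branch name maps to. The function name `sanitize_ref_name` is hypothetical and does not exist in the repository. The regex is assumed to match the `sed` expression shown in the step.

```python
import re


def sanitize_ref_name(ref_name: str) -> str:
    """Hypothetical helper mirroring the workflow's sed + tr pipeline.

    Runs of characters outside [a-zA-Z0-9.-] collapse to a single '-',
    then the result is lowercased (Docker tags must be lowercase).
    """
    collapsed = re.sub(r"[^a-zA-Z0-9.-]+", "-", ref_name)
    return collapsed.lower()


if __name__ == "__main__":
    # Example input only; any branch or tag name can be passed.
    print(sanitize_ref_name("feat/SaaS_runtime-mode"))
```

Under these assumptions, a ref name that already contains only letters, digits, dots, and hyphens passes through unchanged apart from lowercasing.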
+477
@@ -0,0 +1,477 @@
---
name: Run Integration Tests
run-name: >-
Run Integration Tests ${{ inputs.reason || github.event.label.name || 'scheduled' }}
on:
# Use pull_request_target to access secrets even on fork PRs
# This is safe because we only run when the 'integration-test' or 'behavior-test' label is added by a maintainer
pull_request_target:
types:
- labeled
workflow_dispatch:
inputs:
reason:
description: Reason for manual trigger
required: true
default: ''
test_type:
description: Select which tests to run (all, integration, behavior)
required: false
default: all
model_ids:
description: >-
Comma-separated model IDs to test (from resolve_model_config.py).
Example: claude-sonnet-4-6,glm-4.7. Defaults to a standard set.
required: false
default: ''
type: string
issue_number:
description: Issue or PR number to post results to (optional)
required: false
default: ''
type: string
tool_preset:
description: >-
Tool preset for file editing (default, gemini, gpt5, planning).
'default' uses FileEditorTool, 'gemini' uses read_file/write_file/edit/list_directory,
'gpt5' uses apply_patch tool.
required: false
default: default
type: choice
options:
- default
- gemini
- gpt5
- planning
schedule:
- cron: 30 22 * * * # Runs at 10:30pm UTC every day
env:
N_PROCESSES: 4 # Global configuration for number of parallel processes for evaluation
# Default models for scheduled/label-triggered runs (subset of models from resolve_model_config.py)
DEFAULT_MODEL_IDS: claude-sonnet-4-6,deepseek-v3.2-reasoner,kimi-k2-thinking,gemini-3-pro
jobs:
setup-matrix:
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.resolve-models.outputs.matrix }}
issue_number: ${{ steps.resolve-issue.outputs.issue_number }}
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.sha || github.ref }}
persist-credentials: false
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.13'
- name: Resolve model configurations
id: resolve-models
env:
MODEL_IDS_INPUT: ${{ github.event.inputs.model_ids || '' }}
DEFAULT_MODEL_IDS: ${{ env.DEFAULT_MODEL_IDS }}
run: |
# Use input model_ids if provided, otherwise use defaults
if [ -z "$MODEL_IDS_INPUT" ]; then
MODEL_IDS="$DEFAULT_MODEL_IDS"
echo "No model_ids specified, using defaults: $MODEL_IDS"
else
MODEL_IDS="$MODEL_IDS_INPUT"
echo "Using specified model_ids: $MODEL_IDS"
fi
# Resolve model configs using resolve_model_config.py
# Transform output to matrix format for integration tests
MATRIX=$(python3 << EOF
import json
import sys
sys.path.insert(0, '.github/run-eval')
from resolve_model_config import MODELS
model_ids = "$MODEL_IDS".split(",")
model_ids = [m.strip() for m in model_ids if m.strip()]
matrix = []
for model_id in model_ids:
    if model_id not in MODELS:
        available = ", ".join(sorted(MODELS.keys()))
        print(f"Error: Model ID '{model_id}' not found. Available: {available}", file=sys.stderr)
        sys.exit(1)
    model = MODELS[model_id]
    # Create run-suffix from model id (replace special chars with underscore)
    run_suffix = model_id.replace("-", "_").replace(".", "_") + "_run"
    matrix.append({
        "id": model_id,
        "name": model["display_name"],
        "run-suffix": run_suffix,
        "llm-config": model["llm_config"]
    })
print(json.dumps(matrix))
EOF
)
if [ $? -ne 0 ]; then
echo "Failed to resolve model configurations" >&2
exit 1
fi
echo "matrix=$MATRIX" >> "$GITHUB_OUTPUT"
echo "Resolved models: $(echo "$MATRIX" | jq -r '.[].name' | paste -sd', ' -)"
- name: Resolve issue number
id: resolve-issue
env:
ISSUE_NUMBER_INPUT: ${{ github.event.inputs.issue_number || '' }}
PR_NUMBER: ${{ github.event.pull_request.number }}
run: |
# Priority: explicit input > PR number from label trigger
if [ -n "$ISSUE_NUMBER_INPUT" ]; then
echo "issue_number=$ISSUE_NUMBER_INPUT" >> "$GITHUB_OUTPUT"
elif [ -n "$PR_NUMBER" ]; then
echo "issue_number=$PR_NUMBER" >> "$GITHUB_OUTPUT"
else
echo "issue_number=" >> "$GITHUB_OUTPUT"
fi
# Post initial comment for label triggers (no dependencies - runs immediately)
post-label-comment:
if: >
github.event_name == 'pull_request_target' && (
github.event.label.name == 'integration-test' ||
github.event.label.name == 'behavior-test'
)
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- name: Comment on PR (integration tests via label)
if: github.event.label.name == 'integration-test'
uses: KeisukeYamashita/create-comment@v1
with:
unique: false
comment: |
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.
- name: Comment on PR (behavior tests via label)
if: github.event.label.name == 'behavior-test'
uses: KeisukeYamashita/create-comment@v1
with:
unique: false
comment: |
Hi! I started running the behavior tests on your PR. You will receive a comment with the results shortly.
# Post initial comment for workflow_dispatch (depends on setup-matrix for issue_number resolution)
post-dispatch-comment:
needs: setup-matrix
if: github.event_name == 'workflow_dispatch' && github.event.inputs.issue_number != ''
runs-on: ubuntu-latest
permissions:
issues: write
steps:
- name: Comment on issue/PR (workflow_dispatch)
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
ISSUE_NUMBER: ${{ github.event.inputs.issue_number }}
MODEL_IDS: ${{ github.event.inputs.model_ids || 'all models' }}
TEST_TYPE: ${{ github.event.inputs.test_type || 'all' }}
REASON: ${{ github.event.inputs.reason }}
run: |
# Sanitize @OpenHands mentions to prevent self-mention loops
# GNU sed does not expand \u200B in the replacement, so emit the zero-width space (U+200B) with printf
ZWSP=$(printf '\342\200\213')
SANITIZED_REASON=$(echo "$REASON" | sed "s/@OpenHands/@${ZWSP}OpenHands/g; s/@openhands/@${ZWSP}openhands/g")
SANITIZED_MODEL_IDS=$(echo "$MODEL_IDS" | sed "s/@OpenHands/@${ZWSP}OpenHands/g; s/@openhands/@${ZWSP}openhands/g")
COMMENT_BODY=$(cat <<EOF
**Integration Tests Triggered**
- **Reason:** $SANITIZED_REASON
- **Test type:** $TEST_TYPE
- **Models:** $SANITIZED_MODEL_IDS
- **Workflow run:** ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
Results will be posted here when complete.
EOF
)
gh issue comment "$ISSUE_NUMBER" --body "$COMMENT_BODY"
run-integration-tests:
# Security: Only run when integration-related labels are present, via workflow_dispatch, or on schedule
# This prevents automatic execution on fork PRs without maintainer approval
# Note: uses always() to run even when comment jobs are skipped (e.g., for scheduled runs)
# Schedule trigger only runs in the main repository, not in forks
if: |
always() && (
(
github.event_name == 'pull_request_target' && (
github.event.label.name == 'integration-test' ||
github.event.label.name == 'behavior-test'
)
) ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'schedule' && github.repository == 'OpenHands/software-agent-sdk')
) && needs.setup-matrix.result == 'success'
needs: [setup-matrix, post-label-comment, post-dispatch-comment]
runs-on: ubuntu-22.04
timeout-minutes: 180
permissions:
contents: read
id-token: write
pull-requests: write
issues: write
strategy:
fail-fast: false
matrix:
python-version: ['3.13']
job-config: ${{ fromJson(needs.setup-matrix.outputs.matrix) }}
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# For pull_request_target: checkout fork PR code (requires explicit repository)
# For other events: fallback to current repository and ref
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.sha || github.ref }}
# Security: Don't persist credentials to prevent untrusted PR code from using them
persist-credentials: false
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: ${{ matrix.python-version }}
- name: Install Python dependencies using uv
run: |
uv sync --dev
uv pip install pytest
# Run integration test evaluation
- name: Determine test selection
run: |
TEST_TYPE_ARGS=""
if [ "${{ github.event_name }}" = "pull_request_target" ] && [ "${{ github.event.label.name }}" = "behavior-test" ]; then
TEST_TYPE_ARGS="--test-type behavior"
echo "behavior-test label detected; running behavior tests only."
elif [ "${{ github.event_name }}" = "pull_request_target" ] && [ "${{ github.event.label.name }}" = "integration-test" ]; then
TEST_TYPE_ARGS="--test-type integration"
echo "integration-test label detected; running integration tests only."
elif [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
test_type="${{ github.event.inputs.test_type }}"
case "$test_type" in
behavior)
TEST_TYPE_ARGS="--test-type behavior"
echo "workflow_dispatch requested behavior tests only."
;;
integration)
TEST_TYPE_ARGS="--test-type integration"
echo "workflow_dispatch requested integration tests only."
;;
""|all)
echo "workflow_dispatch requested full integration suite."
;;
*)
echo "workflow_dispatch provided unknown test_type '$test_type'; defaulting to full suite."
;;
esac
elif [ "${{ github.event_name }}" = "schedule" ]; then
TEST_TYPE_ARGS="--test-type integration"
echo "Scheduled run; running integration tests only."
else
echo "Running full integration test suite."
fi
echo "TEST_TYPE_ARGS=$TEST_TYPE_ARGS" >> "$GITHUB_ENV"
- name: Run integration test evaluation for ${{ matrix.job-config['name'] }}
env:
LLM_CONFIG: ${{ toJson(matrix.job-config['llm-config']) }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY_EVAL }}
LLM_BASE_URL: https://llm-proxy.eval.all-hands.dev
TOOL_PRESET: ${{ github.event.inputs.tool_preset || 'default' }}
run: |
set -eo pipefail
AGENT_SDK_VERSION=$(git rev-parse --short HEAD)
EVAL_NOTE="${AGENT_SDK_VERSION}_${{ matrix.job-config['run-suffix'] }}"
echo "Invoking test runner with TEST_TYPE_ARGS='$TEST_TYPE_ARGS' TOOL_PRESET='$TOOL_PRESET'"
uv run python tests/integration/run_infer.py \
--llm-config "$LLM_CONFIG" \
--num-workers $N_PROCESSES \
--eval-note "$EVAL_NOTE" \
--tool-preset "$TOOL_PRESET" \
$TEST_TYPE_ARGS
# get integration tests JSON results
RESULTS_FILE=$(find tests/integration/outputs/*${{ matrix.job-config['run-suffix'] }}* -name "results.json" -type f | head -n 1)
echo "RESULTS_FILE: $RESULTS_FILE"
if [ -f "$RESULTS_FILE" ]; then
echo "JSON_RESULTS_FILE=$RESULTS_FILE" >> $GITHUB_ENV
else
echo "JSON_RESULTS_FILE=" >> $GITHUB_ENV
fi
- name: Wait a little bit
run: sleep 10
- name: Create archive of evaluation outputs
run: |
TIMESTAMP=$(date +'%y-%m-%d-%H-%M')
cd tests/integration/outputs # Change to the outputs directory
tar -czvf ../../../integration_tests_${{ matrix.job-config['run-suffix'] }}_${TIMESTAMP}.tar.gz *${{ matrix.job-config['run-suffix'] }}* # Include result directories for this model
- name: Upload evaluation results as artifact
uses: actions/upload-artifact@v7
id: upload_results_artifact
with:
name: integration-test-outputs-${{ matrix.job-config['run-suffix'] }}-${{ github.run_id }}-${{ github.run_attempt }}
path: integration_tests_${{ matrix.job-config['run-suffix'] }}_*.tar.gz
- name: Save test results for consolidation
run: |
# Copy the structured JSON results file for consolidation
mkdir -p test_results_summary
if [ -n "${{ env.JSON_RESULTS_FILE }}" ] && [ -f "${{ env.JSON_RESULTS_FILE }}" ]; then
# Copy the JSON results file directly
cp "${{ env.JSON_RESULTS_FILE }}" "test_results_summary/${{ matrix.job-config['run-suffix'] }}_results.json"
echo "✓ Copied JSON results file for consolidation"
else
echo "✗ No JSON results file found"
exit 1
fi
- name: Upload test results summary
uses: actions/upload-artifact@v7
with:
name: test-results-${{ matrix.job-config['run-suffix'] }}
path: test_results_summary/${{ matrix.job-config['run-suffix'] }}_results.json
consolidate-results:
needs: [setup-matrix, run-integration-tests]
if: |
always() && (
(
github.event_name == 'pull_request_target' && (
github.event.label.name == 'integration-test' ||
github.event.label.name == 'behavior-test'
)
) ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'schedule' && github.repository == 'OpenHands/software-agent-sdk')
)
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: write
issues: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# When using pull_request_target, explicitly checkout the PR branch
# This ensures we use the scripts from the actual PR code
ref: ${{ github.event.pull_request.head.sha || github.ref }}
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install Python dependencies using uv
run: |
uv sync --dev
- name: Download all test results
uses: actions/download-artifact@v8
with:
pattern: test-results-*
merge-multiple: true
path: all_results
- name: Download all integration test artifacts
uses: actions/download-artifact@v8
with:
pattern: integration-test-outputs-*
path: artifacts
- name: Consolidate test results
env:
EVENT_NAME: ${{ github.event_name }}
PR_NUMBER: ${{ github.event.pull_request.number }}
MANUAL_REASON: ${{ github.event.inputs.reason }}
COMMIT_SHA: ${{ github.sha }}
PYTHONPATH: ${{ github.workspace }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
run: |
uv run python tests/integration/utils/consolidate_json_results.py \
--results-dir all_results \
--artifacts-dir artifacts \
--output-file consolidated_results.json
echo "Consolidated results generated successfully"
uv run python tests/integration/utils/generate_markdown_report.py \
--input-file consolidated_results.json \
--output-file consolidated_report.md
- name: Upload consolidated report
uses: actions/upload-artifact@v7
with:
name: consolidated-report
path: consolidated_report.md
- name: Create consolidated PR comment
if: github.event_name == 'pull_request_target'
run: |
# Sanitize @OpenHands mentions to prevent self-mention loops
COMMENT_BODY=$(uv run python -c "from openhands.sdk.utils.github import sanitize_openhands_mentions; import sys; print(sanitize_openhands_mentions(sys.stdin.read()), end='')" < consolidated_report.md)
# Use GitHub CLI to create comment with explicit PR number
echo "$COMMENT_BODY" | gh pr comment ${{ github.event.pull_request.number }} --body-file -
env:
GH_TOKEN: ${{ github.token }}
- name: Comment on specified issue/PR (workflow_dispatch)
if: github.event_name == 'workflow_dispatch' && needs.setup-matrix.outputs.issue_number != ''
env:
GH_TOKEN: ${{ github.token }}
ISSUE_NUMBER: ${{ needs.setup-matrix.outputs.issue_number }}
run: |
# Sanitize @OpenHands mentions to prevent self-mention loops
COMMENT_BODY=$(uv run python -c "from openhands.sdk.utils.github import sanitize_openhands_mentions; import sys; print(sanitize_openhands_mentions(sys.stdin.read()), end='')" < consolidated_report.md)
# Use GitHub CLI to create comment on the specified issue/PR
echo "$COMMENT_BODY" | gh issue comment "$ISSUE_NUMBER" --body-file -
- name: Read consolidated report for tracker issue
if: github.event_name == 'schedule'
id: read_report
run: |
# Read and sanitize the report, then set as output
REPORT_CONTENT=$(uv run python -c "from openhands.sdk.utils.github import sanitize_openhands_mentions; import sys; print(sanitize_openhands_mentions(sys.stdin.read()), end='')" < consolidated_report.md)
echo "report<<EOF" >> $GITHUB_OUTPUT
echo "$REPORT_CONTENT" >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT
- name: Comment with results on tracker issue
if: github.event_name == 'schedule'
uses: KeisukeYamashita/create-comment@v1
with:
number: 2078
unique: false
comment: |
**Trigger:** Nightly Scheduled Run
**Commit:** ${{ github.sha }}
${{ steps.read_report.outputs.report }}
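The "Resolve model configurations" step in the workflow above converts a comma-separated list of model IDs into the job matrix consumed by `run-integration-tests`. The sketch below restates that logic as a standalone function so it can be run locally. The `MODELS` dictionary here is a hypothetical stand-in for the one imported from `resolve_model_config.py`, and the function name `build_matrix` is an assumption made for illustration. Only the keys the workflow reads (`display_name`, `llm_config`) are modelled.

```python
import json

# Hypothetical stand-in for resolve_model_config.MODELS; the real entries
# live in .github/run-eval/resolve_model_config.py and are not reproduced here.
MODELS = {
    "example-model-1.0": {
        "display_name": "Example Model 1.0",
        "llm_config": {"model": "example/example-model-1.0"},
    },
}


def build_matrix(model_ids_csv: str, models: dict) -> list:
    """Build the matrix entries the workflow emits as JSON."""
    # Split on commas and drop empty or whitespace-only entries.
    model_ids = [m.strip() for m in model_ids_csv.split(",") if m.strip()]
    matrix = []
    for model_id in model_ids:
        if model_id not in models:
            available = ", ".join(sorted(models))
            raise KeyError(f"Model ID '{model_id}' not found. Available: {available}")
        model = models[model_id]
        # Same suffix rule as the workflow: '-' and '.' become '_', then '_run'.
        run_suffix = model_id.replace("-", "_").replace(".", "_") + "_run"
        matrix.append({
            "id": model_id,
            "name": model["display_name"],
            "run-suffix": run_suffix,
            "llm-config": model["llm_config"],
        })
    return matrix


if __name__ == "__main__":
    print(json.dumps(build_matrix("example-model-1.0", MODELS)))
```

The sketch raises `KeyError` on an unknown ID where the workflow prints to stderr and exits with status 1. That difference is a choice made here to keep the function testable.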
-97
@@ -1,97 +0,0 @@
name: Lint Fix
on:
pull_request:
types: [labeled]
jobs:
# Frontend lint fixes
lint-fix-frontend:
if: github.event.label.name == 'lint-fix'
name: Fix frontend linting issues
runs-on: blacksmith-4vcpu-ubuntu-2204
permissions:
contents: write
pull-requests: write
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.head_ref }}
repository: ${{ github.event.pull_request.head.repo.full_name }}
fetch-depth: 0
token: ${{ secrets.GITHUB_TOKEN }}
- name: Install Node.js 22
uses: useblacksmith/setup-node@v5
with:
node-version: 22
- name: Install frontend dependencies
run: |
cd frontend
npm install --frozen-lockfile
- name: Generate i18n and route types
run: |
cd frontend
npm run make-i18n
npx react-router typegen || true
- name: Fix frontend lint issues
run: |
cd frontend
npm run lint:fix
# Commit and push changes if any
- name: Check for changes
id: git-check
run: |
git diff --quiet || echo "changes=true" >> $GITHUB_OUTPUT
- name: Commit and push if there are changes
if: steps.git-check.outputs.changes == 'true'
run: |
git config --local user.email "openhands@all-hands.dev"
git config --local user.name "OpenHands Bot"
git add -A
git commit -m "🤖 Auto-fix frontend linting issues" --no-verify
git push
# Python lint fixes
lint-fix-python:
if: github.event.label.name == 'lint-fix'
name: Fix Python linting issues
runs-on: blacksmith-4vcpu-ubuntu-2204
permissions:
contents: write
pull-requests: write
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.head_ref }}
repository: ${{ github.event.pull_request.head.repo.full_name }}
fetch-depth: 0
token: ${{ secrets.GITHUB_TOKEN }}
- name: Set up python
uses: useblacksmith/setup-python@v6
with:
python-version: 3.12
cache: "pip"
- name: Install pre-commit
run: pip install pre-commit==3.7.0
- name: Fix python lint issues
run: |
# Run all pre-commit hooks and continue even if they modify files (exit code 1)
pre-commit run --config ./dev_config/python/.pre-commit-config.yaml --all-files || true
# Commit and push changes if any
- name: Check for changes
id: git-check
run: |
git diff --quiet || echo "changes=true" >> $GITHUB_OUTPUT
- name: Commit and push if there are changes
if: steps.git-check.outputs.changes == 'true'
run: |
git config --local user.email "openhands@all-hands.dev"
git config --local user.name "OpenHands Bot"
git add -A
git commit -m "🤖 Auto-fix Python linting issues" --no-verify
git push
-74
@@ -1,74 +0,0 @@
# Workflow that runs lint on the frontend and python code
name: Lint
# The jobs in this workflow are required, so they must run at all times
# Always run on "main"
# Always run on PRs
on:
push:
branches:
- main
pull_request:
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
# Run lint on the frontend code
lint-frontend:
name: Lint frontend
runs-on: blacksmith-4vcpu-ubuntu-2204
steps:
- uses: actions/checkout@v4
- name: Install Node.js 22
uses: useblacksmith/setup-node@v5
with:
node-version: 22
- name: Install dependencies
run: |
cd frontend
npm install --frozen-lockfile
- name: Lint, TypeScript compilation, and translation checks
run: |
cd frontend
npm run lint
npm run make-i18n && tsc
npm run check-translation-completeness
# Run lint on the python code
lint-python:
name: Lint python
runs-on: blacksmith-4vcpu-ubuntu-2204
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up python
uses: useblacksmith/setup-python@v6
with:
python-version: 3.12
cache: "pip"
- name: Install pre-commit
run: pip install pre-commit==3.7.0
- name: Run pre-commit hooks
run: pre-commit run --all-files --show-diff-on-failure --config ./dev_config/python/.pre-commit-config.yaml
lint-enterprise-python:
name: Lint enterprise python
runs-on: blacksmith-4vcpu-ubuntu-2204
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up python
uses: useblacksmith/setup-python@v6
with:
python-version: 3.12
cache: "pip"
- name: Install pre-commit
run: pip install pre-commit==4.2.0
- name: Run pre-commit hooks
working-directory: ./enterprise
run: pre-commit run --all-files --show-diff-on-failure --config ./dev_config/python/.pre-commit-config.yaml
-108
@@ -1,108 +0,0 @@
name: Publish OpenHands UI Package
# * Always run on "main"
# * Run on PRs that have changes in the "openhands-ui" folder or this workflow
on:
push:
branches:
- main
paths:
- "openhands-ui/**"
- ".github/workflows/npm-publish-ui.yml"
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: npm-publish-ui
cancel-in-progress: false
jobs:
check-version:
name: Check if version has changed
runs-on: blacksmith-4vcpu-ubuntu-2204
defaults:
run:
shell: bash
outputs:
should-publish: ${{ steps.version-check.outputs.should-publish }}
current-version: ${{ steps.version-check.outputs.current-version }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 2 # Need previous commit to compare
- name: Check if version changed
id: version-check
run: |
# Get current version from package.json
CURRENT_VERSION=$(jq -r .version openhands-ui/package.json)
echo "current-version=$CURRENT_VERSION" >> $GITHUB_OUTPUT
# Check if package.json version changed in this commit
if git diff HEAD~1 HEAD --name-only | grep -q "openhands-ui/package.json"; then
# Check if the version field specifically changed
if git diff HEAD~1 HEAD openhands-ui/package.json | grep -q '"version"'; then
echo "Version changed in package.json, will publish"
echo "should-publish=true" >> $GITHUB_OUTPUT
else
echo "package.json changed but version did not change, skipping publish"
echo "should-publish=false" >> $GITHUB_OUTPUT
fi
else
echo "package.json did not change, skipping publish"
echo "should-publish=false" >> $GITHUB_OUTPUT
fi
publish:
name: Publish to npm
runs-on: blacksmith-4vcpu-ubuntu-2204
needs: check-version
if: needs.check-version.outputs.should-publish == 'true'
defaults:
run:
shell: bash
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Setup Bun
uses: oven-sh/setup-bun@v2
with:
bun-version-file: "openhands-ui/.bun-version"
- name: Install dependencies
working-directory: ./openhands-ui
run: bun install --frozen-lockfile
- name: Build package
working-directory: ./openhands-ui
run: bun run build
- name: Check if package already exists on npm
id: npm-check
working-directory: ./openhands-ui
run: |
PACKAGE_NAME=$(jq -r .name package.json)
VERSION="${{ needs.check-version.outputs.current-version }}"
# Check if this version already exists on npm
if npm view "$PACKAGE_NAME@$VERSION" version 2>/dev/null; then
echo "Version $VERSION already exists on npm, skipping publish"
echo "already-exists=true" >> $GITHUB_OUTPUT
else
echo "Version $VERSION does not exist on npm, proceeding with publish"
echo "already-exists=false" >> $GITHUB_OUTPUT
fi
- name: Setup npm authentication
if: steps.npm-check.outputs.already-exists == 'false'
run: |
echo "//registry.npmjs.org/:_authToken=${{ secrets.NPM_TOKEN }}" > ~/.npmrc
- name: Publish to npm
if: steps.npm-check.outputs.already-exists == 'false'
working-directory: ./openhands-ui
run: |
# The prepublishOnly script will run automatically and build the package
npm publish
echo "✅ Successfully published @openhands/ui@${{ needs.check-version.outputs.current-version }} to npm"
@@ -0,0 +1,30 @@
name: Update Documentation (by OpenHands)
on:
schedule:
# Run every 7 days at 2 AM UTC on Sundays
- cron: '0 2 * * 0'
workflow_dispatch: # Allow manual triggering
jobs:
update-docs:
runs-on: blacksmith-4vcpu-ubuntu-2404
permissions:
contents: write
pull-requests: write
steps:
- uses: actions/checkout@v4
- name: Update Documentation with OpenHands
uses: All-Hands-AI/openhands-github-action@v1
with:
prompt: .github/prompts/update-documentation.md
repository: ${{ github.repository }}
selected-branch: main
base-url: https://app.all-hands.dev
poll: "true"
timeout-seconds: 1800
poll-interval-seconds: 30
github-token: ${{ secrets.GITHUB_TOKEN }}
openhands-api-key: ${{ secrets.OPENHANDS_API_KEY }}
-433
@@ -1,433 +0,0 @@
name: Auto-Fix Tagged Issue with OpenHands
on:
workflow_call:
inputs:
max_iterations:
required: false
type: number
default: 50
macro:
required: false
type: string
default: "@openhands-agent"
target_branch:
required: false
type: string
default: "main"
description: "Target branch to pull and create PR against"
pr_type:
required: false
type: string
default: "draft"
description: "The PR type that is going to be created (draft, ready)"
LLM_MODEL:
required: false
type: string
default: "anthropic/claude-sonnet-4-20250514"
LLM_API_VERSION:
required: false
type: string
default: ""
base_container_image:
required: false
type: string
default: ""
description: "Custom sandbox env"
runner:
required: false
type: string
default: "ubuntu-latest"
secrets:
LLM_MODEL:
required: false
LLM_API_KEY:
required: true
LLM_BASE_URL:
required: false
PAT_TOKEN:
required: false
PAT_USERNAME:
required: false
issues:
types: [labeled]
pull_request:
types: [labeled]
issue_comment:
types: [created]
pull_request_review_comment:
types: [created]
pull_request_review:
types: [submitted]
permissions:
contents: write
pull-requests: write
issues: write
jobs:
auto-fix:
if: |
github.event_name == 'workflow_call' ||
github.event.label.name == 'fix-me' ||
github.event.label.name == 'fix-me-experimental' ||
(
((github.event_name == 'issue_comment' || github.event_name == 'pull_request_review_comment') &&
contains(github.event.comment.body, inputs.macro || '@openhands-agent') &&
(github.event.comment.author_association == 'OWNER' || github.event.comment.author_association == 'COLLABORATOR' || github.event.comment.author_association == 'MEMBER')
) ||
(github.event_name == 'pull_request_review' &&
contains(github.event.review.body, inputs.macro || '@openhands-agent') &&
(github.event.review.author_association == 'OWNER' || github.event.review.author_association == 'COLLABORATOR' || github.event.review.author_association == 'MEMBER')
)
)
runs-on: "${{ inputs.runner || 'ubuntu-latest' }}"
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: "3.12"
- name: Upgrade pip
run: |
python -m pip install --upgrade pip
- name: Get latest versions and create requirements.txt
run: |
python -m pip index versions openhands-ai > openhands_versions.txt
OPENHANDS_VERSION=$(head -n 1 openhands_versions.txt | awk '{print $2}' | tr -d '()')
# Create a new requirements.txt locally within the workflow, ensuring no reference to the repo's file
echo "openhands-ai==${OPENHANDS_VERSION}" > /tmp/requirements.txt
cat /tmp/requirements.txt
- name: Cache pip dependencies
if: |
!(
github.event.label.name == 'fix-me-experimental' ||
(
(github.event_name == 'issue_comment' || github.event_name == 'pull_request_review_comment') &&
contains(github.event.comment.body, '@openhands-agent-exp')
) ||
(
github.event_name == 'pull_request_review' &&
contains(github.event.review.body, '@openhands-agent-exp')
)
)
uses: actions/cache@v5
with:
path: ${{ env.pythonLocation }}/lib/python3.12/site-packages/*
key: ${{ runner.os }}-pip-openhands-resolver-${{ hashFiles('/tmp/requirements.txt') }}
restore-keys: |
${{ runner.os }}-pip-openhands-resolver-${{ hashFiles('/tmp/requirements.txt') }}
- name: Check required environment variables
env:
LLM_MODEL: ${{ secrets.LLM_MODEL || inputs.LLM_MODEL }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
LLM_API_VERSION: ${{ inputs.LLM_API_VERSION }}
PAT_TOKEN: ${{ secrets.PAT_TOKEN }}
PAT_USERNAME: ${{ secrets.PAT_USERNAME }}
GITHUB_TOKEN: ${{ github.token }}
run: |
required_vars=("LLM_API_KEY")
for var in "${required_vars[@]}"; do
if [ -z "${!var}" ]; then
echo "Error: Required environment variable $var is not set."
exit 1
fi
done
# Check optional variables and warn about fallbacks
if [ -z "$LLM_BASE_URL" ]; then
echo "Warning: LLM_BASE_URL is not set, will use default API endpoint"
fi
if [ -z "$PAT_TOKEN" ]; then
echo "Warning: PAT_TOKEN is not set, falling back to GITHUB_TOKEN"
fi
if [ -z "$PAT_USERNAME" ]; then
echo "Warning: PAT_USERNAME is not set, will use openhands-agent"
fi
- name: Set environment variables
env:
REVIEW_BODY: ${{ github.event.review.body || '' }}
run: |
# Handle pull request events first
if [ -n "${{ github.event.pull_request.number }}" ]; then
echo "ISSUE_NUMBER=${{ github.event.pull_request.number }}" >> $GITHUB_ENV
echo "ISSUE_TYPE=pr" >> $GITHUB_ENV
# Handle pull request review events
elif [ -n "$REVIEW_BODY" ]; then
echo "ISSUE_NUMBER=${{ github.event.pull_request.number }}" >> $GITHUB_ENV
echo "ISSUE_TYPE=pr" >> $GITHUB_ENV
# Handle issue comment events that reference a PR
elif [ -n "${{ github.event.issue.pull_request }}" ]; then
echo "ISSUE_NUMBER=${{ github.event.issue.number }}" >> $GITHUB_ENV
echo "ISSUE_TYPE=pr" >> $GITHUB_ENV
# Handle regular issue events
else
echo "ISSUE_NUMBER=${{ github.event.issue.number }}" >> $GITHUB_ENV
echo "ISSUE_TYPE=issue" >> $GITHUB_ENV
fi
if [ -n "$REVIEW_BODY" ]; then
echo "COMMENT_ID=${{ github.event.review.id || 'None' }}" >> $GITHUB_ENV
else
echo "COMMENT_ID=${{ github.event.comment.id || 'None' }}" >> $GITHUB_ENV
fi
echo "MAX_ITERATIONS=${{ inputs.max_iterations || 50 }}" >> $GITHUB_ENV
echo "SANDBOX_ENV_GITHUB_TOKEN=${{ secrets.PAT_TOKEN || github.token }}" >> $GITHUB_ENV
echo "SANDBOX_BASE_CONTAINER_IMAGE=${{ inputs.base_container_image }}" >> $GITHUB_ENV
# Set branch variables
echo "TARGET_BRANCH=${{ inputs.target_branch || 'main' }}" >> $GITHUB_ENV
- name: Comment on issue with start message
uses: actions/github-script@v7
with:
github-token: ${{ secrets.PAT_TOKEN || github.token }}
script: |
const issueType = process.env.ISSUE_TYPE;
github.rest.issues.createComment({
issue_number: ${{ env.ISSUE_NUMBER }},
owner: context.repo.owner,
repo: context.repo.repo,
body: `[OpenHands](https://github.com/OpenHands/OpenHands) started fixing the ${issueType}! You can monitor the progress [here](https://github.com/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}).`
});
- name: Install OpenHands
id: install_openhands
uses: actions/github-script@v7
env:
COMMENT_BODY: ${{ github.event.comment.body || '' }}
REVIEW_BODY: ${{ github.event.review.body || '' }}
LABEL_NAME: ${{ github.event.label.name || '' }}
EVENT_NAME: ${{ github.event_name }}
with:
script: |
const commentBody = process.env.COMMENT_BODY.trim();
const reviewBody = process.env.REVIEW_BODY.trim();
const labelName = process.env.LABEL_NAME.trim();
const eventName = process.env.EVENT_NAME.trim();
// Check conditions
const isExperimentalLabel = labelName === "fix-me-experimental";
const isIssueCommentExperimental =
(eventName === "issue_comment" || eventName === "pull_request_review_comment") &&
commentBody.includes("@openhands-agent-exp");
const isReviewCommentExperimental =
eventName === "pull_request_review" && reviewBody.includes("@openhands-agent-exp");
// Set output variable
core.setOutput('isExperimental', isExperimentalLabel || isIssueCommentExperimental || isReviewCommentExperimental);
// Perform package installation
if (isExperimentalLabel || isIssueCommentExperimental || isReviewCommentExperimental) {
console.log("Installing experimental OpenHands...");
await exec.exec("pip install git+https://github.com/openhands/openhands.git");
} else {
console.log("Installing from requirements.txt...");
await exec.exec("pip install -r /tmp/requirements.txt");
}
- name: Attempt to resolve issue
env:
GITHUB_TOKEN: ${{ secrets.PAT_TOKEN || github.token }}
GITHUB_USERNAME: ${{ secrets.PAT_USERNAME || 'openhands-agent' }}
GIT_USERNAME: ${{ secrets.PAT_USERNAME || 'openhands-agent' }}
LLM_MODEL: ${{ secrets.LLM_MODEL || inputs.LLM_MODEL }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
LLM_API_VERSION: ${{ inputs.LLM_API_VERSION }}
PYTHONPATH: ""
run: |
cd /tmp && python -m openhands.resolver.resolve_issue \
--selected-repo ${{ github.repository }} \
--issue-number ${{ env.ISSUE_NUMBER }} \
--issue-type ${{ env.ISSUE_TYPE }} \
--max-iterations ${{ env.MAX_ITERATIONS }} \
--comment-id ${{ env.COMMENT_ID }} \
--is-experimental ${{ steps.install_openhands.outputs.isExperimental }}
- name: Check resolution result
id: check_result
run: |
if cd /tmp && grep -q '"success":true' output/output.jsonl; then
echo "RESOLUTION_SUCCESS=true" >> $GITHUB_OUTPUT
else
echo "RESOLUTION_SUCCESS=false" >> $GITHUB_OUTPUT
fi
- name: Upload output.jsonl as artifact
uses: actions/upload-artifact@v6
if: always() # Upload even if the previous steps fail
with:
name: resolver-output
path: /tmp/output/output.jsonl
retention-days: 30 # Keep the artifact for 30 days
- name: Create draft PR or push branch
if: always() # Create PR or branch even if the previous steps fail
env:
GITHUB_TOKEN: ${{ secrets.PAT_TOKEN || github.token }}
GITHUB_USERNAME: ${{ secrets.PAT_USERNAME || 'openhands-agent' }}
GIT_USERNAME: ${{ secrets.PAT_USERNAME || 'openhands-agent' }}
LLM_MODEL: ${{ secrets.LLM_MODEL || inputs.LLM_MODEL }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
LLM_API_VERSION: ${{ inputs.LLM_API_VERSION }}
PYTHONPATH: ""
run: |
if [ "${{ steps.check_result.outputs.RESOLUTION_SUCCESS }}" == "true" ]; then
cd /tmp && python -m openhands.resolver.send_pull_request \
--issue-number ${{ env.ISSUE_NUMBER }} \
--target-branch ${{ env.TARGET_BRANCH }} \
--pr-type ${{ inputs.pr_type || 'draft' }} \
--reviewer ${{ github.actor }} | tee pr_result.txt && \
grep "PR created" pr_result.txt | sed 's/.*\///g' > pr_number.txt
else
cd /tmp && python -m openhands.resolver.send_pull_request \
--issue-number ${{ env.ISSUE_NUMBER }} \
--pr-type branch \
--send-on-failure | tee branch_result.txt && \
grep "branch created" branch_result.txt | sed 's/.*\///g; s/.expand=1//g' > branch_name.txt
fi
# Step leaves comment for when agent is invoked on PR
- name: Analyze Push Logs (Updated PR or No Changes) # Skip comment if PR update was successful OR leave comment if the agent made no code changes
uses: actions/github-script@v7
if: always()
env:
AGENT_RESPONDED: ${{ env.AGENT_RESPONDED || 'false' }}
ISSUE_NUMBER: ${{ env.ISSUE_NUMBER }}
with:
github-token: ${{ secrets.PAT_TOKEN || github.token }}
script: |
const fs = require('fs');
const issueNumber = process.env.ISSUE_NUMBER;
let logContent = '';
try {
logContent = fs.readFileSync('/tmp/pr_result.txt', 'utf8').trim();
} catch (error) {
console.error('Error reading pr_result.txt file:', error);
}
const noChangesMessage = `No changes to commit for issue #${issueNumber}. Skipping commit.`;
// Check logs from send_pull_request.py (pushes code to GitHub)
if (logContent.includes("Updated pull request")) {
console.log("Updated pull request found. Skipping comment.");
core.exportVariable('AGENT_RESPONDED', 'true'); // process.env changes do not persist to later steps
} else if (logContent.includes(noChangesMessage)) {
github.rest.issues.createComment({
issue_number: issueNumber,
owner: context.repo.owner,
repo: context.repo.repo,
body: `The workflow to fix this issue encountered an error. OpenHands failed to create any code changes.`
});
core.exportVariable('AGENT_RESPONDED', 'true'); // process.env changes do not persist to later steps
}
# Step leaves comment for when agent is invoked on issue
- name: Comment on issue # Comment link to either PR or branch created by agent
uses: actions/github-script@v7
if: always() # Comment on issue even if the previous steps fail
env:
AGENT_RESPONDED: ${{ env.AGENT_RESPONDED || 'false' }}
ISSUE_NUMBER: ${{ env.ISSUE_NUMBER }}
RESOLUTION_SUCCESS: ${{ steps.check_result.outputs.RESOLUTION_SUCCESS }}
with:
github-token: ${{ secrets.PAT_TOKEN || github.token }}
script: |
const fs = require('fs');
const path = require('path');
const issueNumber = process.env.ISSUE_NUMBER;
const success = process.env.RESOLUTION_SUCCESS === 'true';
let prNumber = '';
let branchName = '';
let resultExplanation = '';
try {
if (success) {
prNumber = fs.readFileSync('/tmp/pr_number.txt', 'utf8').trim();
} else {
branchName = fs.readFileSync('/tmp/branch_name.txt', 'utf8').trim();
}
} catch (error) {
console.error('Error reading file:', error);
}
try {
if (!success){
// Read result_explanation from JSON file for failed resolution
const outputFilePath = path.resolve('/tmp/output/output.jsonl');
if (fs.existsSync(outputFilePath)) {
const outputContent = fs.readFileSync(outputFilePath, 'utf8');
const jsonLines = outputContent.split('\n').filter(line => line.trim() !== '');
if (jsonLines.length > 0) {
// First entry in JSON lines has the key 'result_explanation'
const firstEntry = JSON.parse(jsonLines[0]);
resultExplanation = firstEntry.result_explanation || '';
}
}
}
} catch (error){
console.error('Error reading file:', error);
}
// Check "success" log from resolver output
if (success && prNumber) {
github.rest.issues.createComment({
issue_number: issueNumber,
owner: context.repo.owner,
repo: context.repo.repo,
body: `A potential fix has been generated and a draft PR #${prNumber} has been created. Please review the changes.`
});
core.exportVariable('AGENT_RESPONDED', 'true'); // process.env changes do not persist to later steps
} else if (!success && branchName) {
let commentBody = `An attempt was made to automatically fix this issue, but it was unsuccessful. A branch named '${branchName}' has been created with the attempted changes. You can view the branch [here](https://github.com/${context.repo.owner}/${context.repo.repo}/tree/${branchName}). Manual intervention may be required.`;
if (resultExplanation) {
commentBody += `\n\nAdditional details about the failure:\n${resultExplanation}`;
}
github.rest.issues.createComment({
issue_number: issueNumber,
owner: context.repo.owner,
repo: context.repo.repo,
body: commentBody
});
core.exportVariable('AGENT_RESPONDED', 'true'); // process.env changes do not persist to later steps
}
# Leave error comment when both PR/Issue comment handling fail
- name: Fallback Error Comment
uses: actions/github-script@v7
if: ${{ env.AGENT_RESPONDED == 'false' }} # Only run if no conditions were met in previous steps
env:
ISSUE_NUMBER: ${{ env.ISSUE_NUMBER }}
with:
github-token: ${{ secrets.PAT_TOKEN || github.token }}
script: |
const issueNumber = process.env.ISSUE_NUMBER;
github.rest.issues.createComment({
issue_number: issueNumber,
owner: context.repo.owner,
repo: context.repo.repo,
body: `The workflow to fix this issue encountered an error. Please check the [workflow logs](https://github.com/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}) for more information.`
});
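The result handling in the resolver workflow above can be sketched outside of Actions. This is a minimal illustration, assuming `output.jsonl` holds one JSON object per line with `success` and `result_explanation` keys, as the workflow's own comments suggest. `read_resolution` is a hypothetical helper, not part of OpenHands. Unlike the `grep -q '"success":true'` check, it parses the JSON, so whitespace after the colon does not change the outcome.

```python
import json


def read_resolution(jsonl_text):
    """Return (success, explanation) from the first non-empty JSONL entry.

    Hypothetical helper for illustration only. The field names come from the
    workflow above; the function itself is an assumption, not an OpenHands API.
    """
    lines = [line for line in jsonl_text.splitlines() if line.strip()]
    if not lines:
        # No output at all is treated as a failed resolution.
        return False, ""
    first = json.loads(lines[0])
    success = first.get("success") is True
    explanation = first.get("result_explanation") or ""
    return success, explanation


if __name__ == "__main__":
    sample = '{"success": false, "result_explanation": "tests still fail"}\n'
    print(read_resolution(sample))
```

Note that this sketch looks only at the first entry, which matches how the comment step reads `result_explanation`, whereas the `grep` in the workflow matches any line of the file.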
+139
@@ -0,0 +1,139 @@
---
name: PR Artifacts
on:
workflow_dispatch: # Manual trigger for testing
pull_request:
types: [opened, synchronize, reopened]
branches: [main]
pull_request_review:
types: [submitted]
jobs:
# Auto-remove .pr/ directory when a reviewer approves
cleanup-on-approval:
concurrency:
group: cleanup-pr-artifacts-${{ github.event.pull_request.number }}
cancel-in-progress: false
if: github.event_name == 'pull_request_review' && github.event.review.state == 'approved'
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
steps:
- name: Check if fork PR
id: check-fork
run: |
if [ "${{ github.event.pull_request.head.repo.full_name }}" != "${{ github.event.pull_request.base.repo.full_name }}" ]; then
echo "is_fork=true" >> $GITHUB_OUTPUT
echo "::notice::Fork PR detected - skipping auto-cleanup (manual removal required)"
else
echo "is_fork=false" >> $GITHUB_OUTPUT
fi
# Use PAT so the push triggers CI workflows that will complete and
# satisfy branch protection. We can't use [skip ci] because the Vercel
# GitHub App creates stuck checks that block merging.
- uses: actions/checkout@v5
if: steps.check-fork.outputs.is_fork == 'false'
with:
ref: ${{ github.event.pull_request.head.ref }}
token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
- name: Remove .pr/ directory
id: remove
if: steps.check-fork.outputs.is_fork == 'false'
run: |
if [ -d ".pr" ]; then
git config user.name "allhands-bot"
git config user.email "allhands-bot@users.noreply.github.com"
git rm -rf .pr/
git commit -m "chore: Remove PR-only artifacts [automated]"
git push || {
echo "::error::Failed to push cleanup commit. Check branch protection rules."
exit 1
}
echo "removed=true" >> $GITHUB_OUTPUT
echo "::notice::Removed .pr/ directory"
else
echo "removed=false" >> $GITHUB_OUTPUT
echo "::notice::No .pr/ directory to remove"
fi
- name: Update PR comment after cleanup
if: steps.check-fork.outputs.is_fork == 'false' && steps.remove.outputs.removed == 'true'
uses: actions/github-script@v8
with:
script: |
const marker = '<!-- pr-artifacts-notice -->';
const body = `${marker}
✅ **PR Artifacts Cleaned Up**
The \`.pr/\` directory has been automatically removed.
`;
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
});
const existing = comments.find(c => c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: existing.id,
body: body,
});
}
# Warn if .pr/ directory exists (will be auto-removed on approval)
check-pr-artifacts:
if: github.event_name == 'pull_request'
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- uses: actions/checkout@v5
- name: Check for .pr/ directory
id: check
run: |
if [ -d ".pr" ]; then
echo "exists=true" >> $GITHUB_OUTPUT
echo "::warning::.pr/ directory exists and will be automatically removed when the PR is approved. For fork PRs, manual removal is required before merging."
else
echo "exists=false" >> $GITHUB_OUTPUT
fi
- name: Post or update PR comment
if: steps.check.outputs.exists == 'true'
uses: actions/github-script@v8
with:
script: |
const marker = '<!-- pr-artifacts-notice -->';
const body = `${marker}
📁 **PR Artifacts Notice**
This PR contains a \`.pr/\` directory with PR-specific documents. This directory will be **automatically removed** when the PR is approved.
> For fork PRs: Manual removal is required before merging.
`;
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
});
const existing = comments.find(c => c.body.includes(marker));
if (!existing) {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
body: body,
});
}
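Both jobs in the PR Artifacts workflow rely on a hidden HTML marker to find their own comment. The sketch below models that decision logic in plain Python, assuming comments are dicts with `id` and `body` keys; the function names are hypothetical and exist only for illustration.

```python
MARKER = "<!-- pr-artifacts-notice -->"


def find_marked_comment(comments, marker=MARKER):
    """Return the first comment whose body contains the marker, else None."""
    for comment in comments:
        if marker in (comment.get("body") or ""):
            return comment
    return None


def plan_notice(comments):
    """check-pr-artifacts: create the notice only if none exists yet."""
    return "skip" if find_marked_comment(comments) else "create"


def plan_cleanup_update(comments):
    """cleanup-on-approval: rewrite the existing notice, never create one."""
    existing = find_marked_comment(comments)
    return ("update", existing["id"]) if existing else ("skip", None)
```

The workflow calls `listComments` without pagination, so on a PR with many comments only the first page is searched; a marker further down would be missed.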
@@ -0,0 +1,51 @@
---
name: PR Review by OpenHands
on:
# Use pull_request so workflow changes can be validated in PRs.
# This workflow requires secrets, so the job only runs for same-repo PRs.
# It runs when:
# 1. A new PR is opened (non-draft), OR
# 2. A draft PR is marked as ready for review, OR
# 3. A maintainer adds the 'review-this' label, OR
# 4. A maintainer requests openhands-agent or all-hands-bot as a reviewer
# Adding labels and requesting reviewers still requires write access.
pull_request:
types: [opened, ready_for_review, labeled, review_requested]
permissions:
contents: read
pull-requests: write
issues: write
jobs:
pr-review:
# Run when one of the following conditions is met:
# 1. A new non-draft PR is opened by a non-first-time contributor, OR
# 2. A draft PR is converted to ready for review by a non-first-time contributor, OR
# 3. 'review-this' label is added, OR
# 4. openhands-agent or all-hands-bot is requested as a reviewer
# Note: FIRST_TIME_CONTRIBUTOR and NONE PRs require manual trigger via label/reviewer request.
if: |
github.event.pull_request.head.repo.full_name == github.repository && (
(github.event.action == 'opened' && github.event.pull_request.draft == false && github.event.pull_request.author_association != 'FIRST_TIME_CONTRIBUTOR' && github.event.pull_request.author_association != 'NONE') ||
(github.event.action == 'ready_for_review' && github.event.pull_request.author_association != 'FIRST_TIME_CONTRIBUTOR' && github.event.pull_request.author_association != 'NONE') ||
github.event.label.name == 'review-this' ||
github.event.requested_reviewer.login == 'openhands-agent' ||
github.event.requested_reviewer.login == 'all-hands-bot'
)
concurrency:
group: pr-review-${{ github.event.pull_request.number }}
cancel-in-progress: true
runs-on: ubuntu-24.04
steps:
- name: Run PR Review
uses: OpenHands/extensions/plugins/pr-review@main
with:
llm-model: litellm_proxy/claude-sonnet-4-5-20250929
llm-base-url: https://llm-proxy.app.all-hands.dev
# Review style: roasted (other option: standard)
review-style: roasted
llm-api-key: ${{ secrets.LLM_API_KEY }}
github-token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
lmnr-api-key: ${{ secrets.LMNR_SKILLS_API_KEY }}
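The `if:` expression of the `pr-review` job is dense, so here is a hedged Python model of it. It assumes `event` is the webhook payload as a dict and `repository` is the `owner/name` string; `should_review` is a hypothetical name, and the sketch only mirrors the conditions written in the workflow above.

```python
TRUSTED_REVIEWERS = {"openhands-agent", "all-hands-bot"}
UNTRUSTED_ASSOCIATIONS = {"FIRST_TIME_CONTRIBUTOR", "NONE"}


def should_review(event, repository):
    """Model of the pr-review job condition (illustrative only)."""
    pr = event["pull_request"]
    if pr["head"]["repo"]["full_name"] != repository:
        # Fork PRs never run: the job needs repository secrets.
        return False
    action = event.get("action")
    known_author = pr.get("author_association") not in UNTRUSTED_ASSOCIATIONS
    if action == "opened" and pr.get("draft") is False and known_author:
        return True
    if action == "ready_for_review" and known_author:
        return True
    # Manual triggers apply regardless of the author's association.
    if (event.get("label") or {}).get("name") == "review-this":
        return True
    reviewer = (event.get("requested_reviewer") or {}).get("login")
    return reviewer in TRUSTED_REVIEWERS
```

The same-repository check sits outside the `or` chain, so neither the label nor a reviewer request can trigger a review on a fork PR.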
@@ -0,0 +1,85 @@
---
name: PR Review Evaluation
# This workflow evaluates how well PR review comments were addressed.
# It runs when a PR is closed to assess review effectiveness.
#
# Security note: pull_request_target is safe here because:
# 1. Only triggers on PR close (not on code changes)
# 2. Does not checkout PR code - only downloads artifacts from trusted workflow runs
# 3. Runs evaluation scripts from the extensions repo, not from the PR
on:
pull_request_target:
types: [closed]
permissions:
contents: read
pull-requests: read
jobs:
evaluate:
runs-on: ubuntu-24.04
env:
PR_NUMBER: ${{ github.event.pull_request.number }}
REPO_NAME: ${{ github.repository }}
PR_MERGED: ${{ github.event.pull_request.merged }}
steps:
- name: Download review trace artifact
id: download-trace
uses: dawidd6/action-download-artifact@v6
continue-on-error: true
with:
workflow: pr-review-by-openhands.yml
name: pr-review-trace-${{ github.event.pull_request.number }}
path: trace-info
search_artifacts: true
if_no_artifact_found: warn
- name: Check if trace file exists
id: check-trace
run: |
if [ -f "trace-info/laminar_trace_info.json" ]; then
echo "trace_exists=true" >> $GITHUB_OUTPUT
echo "Found trace file for PR #$PR_NUMBER"
else
echo "trace_exists=false" >> $GITHUB_OUTPUT
echo "No trace file found for PR #$PR_NUMBER - skipping evaluation"
fi
# Always check out the main branch for security - script changes cannot be tested in PRs
- name: Checkout extensions repository
if: steps.check-trace.outputs.trace_exists == 'true'
uses: actions/checkout@v5
with:
repository: OpenHands/extensions
path: extensions
- name: Set up Python
if: steps.check-trace.outputs.trace_exists == 'true'
uses: actions/setup-python@v6
with:
python-version: '3.12'
- name: Install dependencies
if: steps.check-trace.outputs.trace_exists == 'true'
run: pip install lmnr
- name: Run evaluation
if: steps.check-trace.outputs.trace_exists == 'true'
env:
# Script expects LMNR_PROJECT_API_KEY; org secret is named LMNR_SKILLS_API_KEY
LMNR_PROJECT_API_KEY: ${{ secrets.LMNR_SKILLS_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
python extensions/plugins/pr-review/scripts/evaluate_review.py \
--trace-file trace-info/laminar_trace_info.json
- name: Upload evaluation logs
uses: actions/upload-artifact@v7
if: always() && steps.check-trace.outputs.trace_exists == 'true'
with:
name: pr-review-evaluation-${{ github.event.pull_request.number }}
path: '*.log'
retention-days: 30
+31
@@ -0,0 +1,31 @@
---
# .github/workflows/precommit.yml
name: Pre-commit checks
on:
push:
branches: [main]
pull_request:
branches: ['**']
jobs:
pre-commit:
runs-on: ubuntu-24.04
steps:
- name: Checkout code
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install uv
uses: astral-sh/setup-uv@v7
- name: Install dependencies
run: uv sync --frozen --group dev
- name: Run pre-commit (all files)
run: uv run pre-commit run --all-files --show-diff-on-failure
+132
@@ -0,0 +1,132 @@
---
name: Prepare Release
on:
workflow_dispatch:
inputs:
version:
description: Release version (e.g., 1.2.3)
required: true
type: string
jobs:
prepare-release:
runs-on: ubuntu-24.04
steps:
- name: Validate version format
run: |
if ! [[ "${{ inputs.version }}" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "❌ Invalid version format. Expected: X.Y.Z (e.g., 1.2.3)"
exit 1
fi
echo "✅ Version format is valid: ${{ inputs.version }}"
- name: Checkout repository
uses: actions/checkout@v5
with:
token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Configure Git
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
- name: Create release branch
run: |
BRANCH_NAME="rel-${{ inputs.version }}"
echo "Creating branch: $BRANCH_NAME"
git checkout -b "$BRANCH_NAME"
echo "BRANCH_NAME=$BRANCH_NAME" >> $GITHUB_ENV
- name: Set package version
run: |
echo "🔧 Setting version to ${{ inputs.version }}"
make set-package-version version=${{ inputs.version }}
- name: Update sdk_ref default in run-eval workflow
run: python3 .github/scripts/update_sdk_ref_default.py "${{ inputs.version }}"
- name: Commit version changes
run: |
git add .
if git diff --staged --quiet; then
echo "No changes to commit"
else
git commit -m "Release v${{ inputs.version }}" -m "Co-authored-by: openhands <openhands@all-hands.dev>"
echo "✅ Changes committed"
fi
- name: Push release branch
run: |
git push -u origin "${{ env.BRANCH_NAME }}"
echo "✅ Branch pushed: ${{ env.BRANCH_NAME }}"
- name: Create Pull Request
env:
GH_TOKEN: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
run: |
cat > pr_body.txt << 'EOF'
## Release v${{ inputs.version }}
This PR prepares the release for version **${{ inputs.version }}**.
### Release Checklist
- [x] Version set to ${{ inputs.version }}
- [ ] Fix any deprecation deadlines if they exist
- [ ] Integration tests pass (tagged with `integration-test`)
- [ ] Behavior tests pass (tagged with `behavior-test`)
- [ ] Example tests pass (tagged with `test-examples`)
- [ ] Draft release created at https://github.com/OpenHands/software-agent-sdk/releases/new
- [ ] Select tag: `v${{ inputs.version }}`
- [ ] Select branch: `${{ env.BRANCH_NAME }}`
- [ ] Auto-generate release notes
- [ ] Publish release (PyPI will auto-publish)
- [ ] Evaluation on OpenHands Index
### Next Steps
1. Review the version changes
2. Address any deprecation deadlines
3. Ensure integration tests pass
4. Ensure behavior tests pass
5. Ensure example tests pass
6. Create and publish the release
Once the release is published on GitHub, the PyPI packages will be automatically published via the `pypi-release.yml` workflow.
EOF
gh pr create \
--title "Release v${{ inputs.version }}" \
--body-file pr_body.txt \
--base main \
--head "${{ env.BRANCH_NAME }}" \
--label "integration-test" \
--label "behavior-test" \
--label "test-examples"
rm pr_body.txt
echo "✅ Pull request created successfully!"
# Get PR URL and display it
PR_URL=$(gh pr view "${{ env.BRANCH_NAME }}" --json url --jq '.url')
echo "🔗 PR URL: $PR_URL"
echo "PR_URL=$PR_URL" >> $GITHUB_ENV
- name: Summary
run: |
echo "## ✅ Release Preparation Complete!" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "- **Version**: ${{ inputs.version }}" >> $GITHUB_STEP_SUMMARY
echo "- **Branch**: ${{ env.BRANCH_NAME }}" >> $GITHUB_STEP_SUMMARY
echo "- **PR URL**: ${{ env.PR_URL }}" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "### Next Steps:" >> $GITHUB_STEP_SUMMARY
echo "1. Review the PR and address any deprecation deadlines" >> $GITHUB_STEP_SUMMARY
echo "2. Wait for integration, behavior, and example tests to pass" >> $GITHUB_STEP_SUMMARY
echo "3. Create and publish the release on GitHub" >> $GITHUB_STEP_SUMMARY
echo "4. PyPI will automatically publish when the release is created" >> $GITHUB_STEP_SUMMARY
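The version check and branch naming in the Prepare Release workflow can be reproduced locally before dispatching it. This is a small sketch, assuming the same `X.Y.Z` rule and `rel-` prefix shown above; `release_branch` is a hypothetical helper and not part of the repository.

```python
import re

# Same shape as the workflow's bash regex ^[0-9]+\.[0-9]+\.[0-9]+$
_VERSION_RE = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def release_branch(version):
    """Validate an X.Y.Z version and return the release branch name."""
    # fullmatch avoids accepting a trailing newline, which `$` alone would allow.
    if not _VERSION_RE.fullmatch(version):
        raise ValueError(f"Invalid version format: {version!r}. Expected X.Y.Z")
    return f"rel-{version}"


if __name__ == "__main__":
    print(release_branch("1.2.3"))
```

Pre-release suffixes such as `1.2.3rc1` and a leading `v` are rejected, which matches the strictness of the workflow's check.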
-127
@@ -1,127 +0,0 @@
# Workflow that runs python tests
name: Run Python Tests
# The jobs in this workflow are required, so they must run at all times
# * Always run on "main"
# * Always run on PRs
on:
push:
branches:
- main
pull_request:
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
# Run python tests on Linux
test-on-linux:
name: Python Tests on Linux
runs-on: blacksmith-4vcpu-ubuntu-2404
env:
INSTALL_DOCKER: "0" # Set to '0' to skip Docker installation
strategy:
matrix:
python-version: ["3.12"]
permissions:
# For coverage comment and python-coverage-comment-action branch
pull-requests: write
contents: write
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Install tmux
run: sudo apt-get update && sudo apt-get install -y tmux
- name: Setup Node.js
uses: useblacksmith/setup-node@v5
with:
node-version: "22.x"
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: useblacksmith/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
cache: "poetry"
- name: Install Python dependencies using Poetry
run: |
poetry install --with dev,test,runtime
poetry run pip install pytest-xdist
poetry run pip install pytest-rerunfailures
- name: Build Environment
run: make build
- name: Run Unit Tests
run: PYTHONPATH=".:$PYTHONPATH" poetry run pytest --forked -n auto -s ./tests/unit --cov=openhands --cov-branch
env:
COVERAGE_FILE: ".coverage.${{ matrix.python_version }}"
- name: Run Runtime Tests with CLIRuntime
run: PYTHONPATH=".:$PYTHONPATH" TEST_RUNTIME=cli poetry run pytest -n 5 --reruns 2 --reruns-delay 3 -s tests/runtime/test_bash.py --cov=openhands --cov-branch
env:
COVERAGE_FILE: ".coverage.runtime.${{ matrix.python_version }}"
- name: Store coverage file
uses: actions/upload-artifact@v6
with:
name: coverage-openhands
path: |
.coverage.${{ matrix.python_version }}
.coverage.runtime.${{ matrix.python_version }}
include-hidden-files: true
test-enterprise:
name: Enterprise Python Unit Tests
runs-on: blacksmith-4vcpu-ubuntu-2404
strategy:
matrix:
python-version: ["3.12"]
steps:
- uses: actions/checkout@v4
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: useblacksmith/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
cache: "poetry"
- name: Install Python dependencies using Poetry
working-directory: ./enterprise
run: poetry install --with dev,test
- name: Run Unit Tests
# Use base working directory for coverage paths to line up.
run: PYTHONPATH=".:$PYTHONPATH" poetry run --project=enterprise pytest --forked -n auto -s -p no:ddtrace -p no:ddtrace.pytest_bdd -p no:ddtrace.pytest_benchmark ./enterprise/tests/unit --cov=enterprise --cov-branch
env:
COVERAGE_FILE: ".coverage.enterprise.${{ matrix.python_version }}"
- name: Store coverage file
uses: actions/upload-artifact@v6
with:
name: coverage-enterprise
path: ".coverage.enterprise.${{ matrix.python_version }}"
include-hidden-files: true
coverage-comment:
name: Coverage Comment
if: github.event_name == 'pull_request'
runs-on: ubuntu-latest
needs: [test-on-linux, test-enterprise]
permissions:
pull-requests: write
contents: write
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v6
id: download
with:
pattern: coverage-*
merge-multiple: true
- name: Coverage comment
id: coverage_comment
uses: py-cov-action/python-coverage-comment-action@v3
with:
GITHUB_TOKEN: ${{ github.token }}
MERGE_COVERAGE_FILES: true
+66 -36
@@ -1,40 +1,70 @@
# Publishes the OpenHands PyPi package
name: Publish PyPi Package
---
name: Publish all OpenHands packages (uv)
on:
workflow_dispatch:
inputs:
reason:
description: "What are you publishing?"
required: true
type: choice
options:
- app server
default: app server
push:
tags:
- "*"
# Run manually
workflow_dispatch:
# Run automatically when a release is published
release:
types: [published]
jobs:
release:
runs-on: blacksmith-4vcpu-ubuntu-2204
# Run when manually dispatched for "app server" OR for tag pushes that don't contain '-cli'
if: |
(github.event_name == 'workflow_dispatch' && github.event.inputs.reason == 'app server')
|| (github.event_name == 'push' && startsWith(github.ref, 'refs/tags/') && !contains(github.ref, '-cli'))
steps:
- uses: actions/checkout@v4
- uses: useblacksmith/setup-python@v6
with:
python-version: 3.12
- name: Install Poetry
uses: snok/install-poetry@v1.4.1
with:
virtualenvs-in-project: true
virtualenvs-path: ~/.virtualenvs
- name: Install Poetry Dependencies
run: poetry install --no-interaction --no-root
- name: Build poetry project
run: ./build.sh
- name: publish
run: poetry publish -u __token__ -p ${{ secrets.PYPI_TOKEN }}
publish:
runs-on: ubuntu-24.04
outputs:
version: ${{ steps.extract_version.outputs.version }}
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Extract version from release tag
id: extract_version
run: |
# Get version from release tag (e.g., v1.2.3 -> 1.2.3)
if [[ "${{ github.event_name }}" == "release" ]]; then
VERSION="${{ github.event.release.tag_name }}"
VERSION="${VERSION#v}" # Remove 'v' prefix if present
else
# For manual dispatch, extract from pyproject.toml
VERSION=$(grep -m1 '^version = ' openhands-sdk/pyproject.toml | cut -d'"' -f2)
fi
echo "version=$VERSION" >> $GITHUB_OUTPUT
echo "📦 Version: $VERSION"
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Build and publish all packages
env:
UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN_OPENHANDS }}
run: |
set -euo pipefail
if [ -z "${UV_PUBLISH_TOKEN:-}" ]; then
echo "❌ Missing secret PYPI_TOKEN_OPENHANDS"
exit 1
fi
PACKAGES=(
openhands-sdk
openhands-tools
openhands-workspace
openhands-agent-server
)
echo "🚀 Building and publishing all packages..."
for PKG in "${PACKAGES[@]}"; do
echo "===== $PKG ====="
uv build --package "$PKG"
done
# Use --check-url to skip files that already exist on PyPI
# This allows re-running the workflow after partial failures
uv publish --token "$UV_PUBLISH_TOKEN" --check-url https://pypi.org/simple/
echo "✅ All packages built and published successfully!"
echo ""
echo "📋 Note: Version bump PRs will be created by the 'Create Version Bump PRs' workflow"
echo " which triggers automatically after this workflow completes."
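The "Extract version from release tag" step has two paths: strip a leading `v` from the release tag, or read the first `version = "..."` line from `openhands-sdk/pyproject.toml`. The sketch below mirrors both in Python for illustration; the function names are hypothetical, and the pyproject path deliberately imitates the `grep | cut` pipeline instead of using a TOML parser.

```python
def version_from_tag(tag):
    """Mirror `${VERSION#v}`: drop one leading 'v' if present."""
    return tag[1:] if tag.startswith("v") else tag


def version_from_pyproject(text):
    """Mirror `grep -m1 '^version = ' | cut -d'"' -f2` on pyproject.toml text."""
    for line in text.splitlines():
        if line.startswith("version = "):
            parts = line.split('"')
            # cut prints the whole line when the delimiter is absent.
            return parts[1] if len(parts) > 1 else line
    # grep found nothing, so the shell variable would be empty.
    return ""


if __name__ == "__main__":
    print(version_from_tag("v1.2.3"))
```

Because the manual-dispatch path returns an empty string when no `version = ` line is found, a caller may want to treat an empty result as an error before publishing.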
+114
@@ -0,0 +1,114 @@
---
name: Review Thread Gate
on:
pull_request:
branches: [main]
types: [opened, synchronize, reopened, ready_for_review, edited]
permissions:
contents: read
pull-requests: read
concurrency:
group: review-thread-gate-${{ github.event.pull_request.number || github.sha }}
cancel-in-progress: true
jobs:
unresolved-review-threads:
runs-on: ubuntu-latest
steps:
- name: Fail when unresolved review threads remain (unless waived)
uses: actions/github-script@v8
with:
script: |
const pr = context.payload.pull_request;
if (!pr) {
core.info('No pull_request payload available; skipping.');
return;
}
const waiverMatch = pr.body?.match(
/review-thread-waiver\s*:\s*(.+?)(?:\n|$)/i,
);
const waiverReason = waiverMatch?.[1]?.trim() || null;
const unresolved = [];
let cursor = null;
do {
const query = `
query($owner: String!, $repo: String!, $number: Int!, $cursor: String) {
repository(owner: $owner, name: $repo) {
pullRequest(number: $number) {
reviewThreads(first: 100, after: $cursor) {
nodes {
id
isResolved
isOutdated
comments(first: 1) {
nodes {
author { login }
path
line
url
}
}
}
pageInfo {
hasNextPage
endCursor
}
}
}
}
}
`;
const result = await github.graphql(query, {
owner: context.repo.owner,
repo: context.repo.repo,
number: pr.number,
cursor,
});
const page = result.repository.pullRequest.reviewThreads;
for (const thread of page.nodes) {
if (thread.isResolved) continue;
const firstComment = thread.comments.nodes[0];
unresolved.push({
url: firstComment?.url ?? '(no-url)',
author: firstComment?.author?.login ?? 'unknown',
path: firstComment?.path ?? 'unknown',
line: firstComment?.line ?? '?',
outdated: thread.isOutdated,
});
}
cursor = page.pageInfo.hasNextPage ? page.pageInfo.endCursor : null;
} while (cursor);
if (unresolved.length === 0) {
core.info('No unresolved review threads found.');
return;
}
const summaryLines = unresolved.map(
(thread) =>
`- ${thread.url} (author: ${thread.author}, file: ${thread.path}:${thread.line}, outdated: ${thread.outdated})`,
);
await core.summary
.addHeading(`Unresolved review threads: ${unresolved.length}`)
.addRaw(summaryLines.join('\n'))
.write();
if (waiverReason) {
core.warning(
`Unresolved review threads remain (${unresolved.length}), but waiver provided: ${waiverReason}`,
);
return;
}
core.setFailed(
`Found ${unresolved.length} unresolved review thread(s). Resolve all threads or add ` +
'`review-thread-waiver: <reason>` to the PR body for an intentional waiver.',
);
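Review note: the waiver parsing and gate decision in the script above can be exercised outside of Actions. The following is a minimal Python sketch of the same logic, not the workflow's own code. The names `extract_waiver`, `unresolved_threads`, and `gate` are hypothetical, and the thread dictionaries only mimic the `isResolved` field requested by the GraphQL query. The pattern additionally tolerates CRLF line endings, on the assumption that PR bodies may arrive with them.

```python
import re

# Python translation of the waiver pattern. `\r?\n` is used so that bodies
# with CRLF line endings are also handled (an assumption about the input).
WAIVER_RE = re.compile(
    r"review-thread-waiver\s*:\s*(.+?)(?:\r?\n|$)", re.IGNORECASE
)


def extract_waiver(body):
    """Return the waiver reason found in a PR body, or None when absent."""
    match = WAIVER_RE.search(body or "")
    if match is None:
        return None
    return match.group(1).strip() or None


def unresolved_threads(pages):
    """Flatten paginated thread lists and keep only unresolved threads."""
    return [t for page in pages for t in page if not t.get("isResolved")]


def gate(body, pages):
    """Return 'pass', 'waived', or 'fail', mirroring the workflow outcome."""
    if not unresolved_threads(pages):
        return "pass"
    return "waived" if extract_waiver(body) else "fail"
```

A PR body containing a line such as `review-thread-waiver: deferred to follow-up` turns a failing gate into a warning, which matches the `core.warning` branch of the workflow.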
+403
@@ -0,0 +1,403 @@
---
name: Run Eval
run-name: Run Eval (${{ inputs.benchmark || 'swebench' }}) ${{ inputs.reason || github.event.label.name || 'release' }}
on:
pull_request_target:
types: [labeled]
release:
types: [published]
workflow_dispatch:
inputs:
benchmark:
description: Benchmark to evaluate
required: false
default: swebench
type: choice
options:
- gaia
- swebench
- swtbench
- commit0
- swebenchmultimodal
- terminalbench
sdk_ref:
description: SDK commit/ref to evaluate (must be a semantic version like v1.0.0 unless 'Allow unreleased branches' is checked)
required: true
default: v1.14.0
allow_unreleased_branches:
description: Allow unreleased branches (bypasses semantic version requirement)
required: false
default: false
type: boolean
eval_limit:
description: Number of instances to run (any positive integer)
required: false
default: '1'
type: string
model_ids:
description: Comma-separated model IDs to evaluate. Must be keys of MODELS in resolve_model_config.py. Defaults to the first model in that dict.
required: false
default: ''
type: string
reason:
description: Reason for manual trigger
required: false
default: ''
eval_branch:
description: Evaluation repo branch to use (for testing feature branches)
required: false
default: main
type: string
benchmarks_branch:
description: Benchmarks repo branch to use (for testing feature branches)
required: false
default: main
type: string
instance_ids:
description: >-
Comma-separated instance IDs to evaluate.
Example: "django__django-11583,django__django-12345".
Spaces around commas are automatically stripped.
Leave empty to evaluate all instances up to eval_limit.
required: false
default: ''
num_infer_workers:
description: Number of inference workers (optional, overrides benchmark default)
required: false
default: ''
type: string
num_eval_workers:
description: Number of evaluation workers (optional, overrides benchmark default)
required: false
default: ''
type: string
enable_conversation_event_logging:
description: 'Enable Datadog persistence for conversation events (default: true)'
required: false
default: true
type: boolean
max_retries:
description: Max retries per instance (passed to benchmarks)
required: false
default: '3'
type: string
tool_preset:
description: >-
Tool preset for file editing. 'default' uses FileEditorTool,
'gemini' uses read_file/write_file/edit/list_directory,
'gpt5' uses apply_patch tool.
required: false
default: default
type: choice
options:
- default
- gemini
- gpt5
- planning
agent_type:
description: >-
Agent type: 'default' for standard Agent,
'acp-claude' for ACPAgent with Claude Code,
'acp-codex' for ACPAgent with Codex.
required: false
default: default
type: choice
options:
- default
- acp-claude
- acp-codex
env:
EVAL_REPO: OpenHands/evaluation
EVAL_WORKFLOW: eval-job.yml
jobs:
print-parameters:
if: >
github.event_name == 'release' ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'pull_request_target' &&
(github.event.label.name == 'run-eval-1' ||
github.event.label.name == 'run-eval-50' ||
github.event.label.name == 'run-eval-200' ||
github.event.label.name == 'run-eval-500'))
runs-on: ubuntu-latest
steps:
- name: Print all parameters
run: |
echo "=== Workflow Parameters ==="
echo "Event: ${{ github.event_name }}"
echo "Actor: ${{ github.actor }}"
echo "Ref: ${{ github.ref }}"
echo ""
echo "=== Input Parameters ==="
echo "benchmark: ${{ github.event.inputs.benchmark || 'swebench' }}"
echo "sdk_ref: ${{ github.event.inputs.sdk_ref || 'N/A' }}"
echo "allow_unreleased_branches: ${{ github.event.inputs.allow_unreleased_branches || 'false' }}"
echo "eval_limit: ${{ github.event.inputs.eval_limit || '1' }}"
echo "model_ids: ${{ github.event.inputs.model_ids || '(default)' }}"
echo "reason: ${{ github.event.inputs.reason || 'N/A' }}"
echo "eval_branch: ${{ github.event.inputs.eval_branch || 'main' }}"
echo "benchmarks_branch: ${{ github.event.inputs.benchmarks_branch || 'main' }}"
echo "instance_ids: ${{ github.event.inputs.instance_ids || 'N/A' }}"
echo "num_infer_workers: ${{ github.event.inputs.num_infer_workers || '(default)' }}"
echo "num_eval_workers: ${{ github.event.inputs.num_eval_workers || '(default)' }}"
echo "enable_conversation_event_logging: ${{ github.event.inputs.enable_conversation_event_logging || 'true' }}"
echo "max_retries: ${{ github.event.inputs.max_retries || '3' }}"
echo "tool_preset: ${{ github.event.inputs.tool_preset || 'default' }}"
echo "agent_type: ${{ github.event.inputs.agent_type || 'default' }}"
echo ""
echo "=== Environment Variables ==="
echo "EVAL_REPO: ${{ env.EVAL_REPO }}"
echo "EVAL_WORKFLOW: ${{ env.EVAL_WORKFLOW }}"
echo ""
echo "=== Label (for PR events) ==="
echo "Label: ${{ github.event.label.name || 'N/A' }}"
build-and-evaluate:
needs: print-parameters
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
actions: write
issues: write
pull-requests: write
steps:
- name: Checkout sdk code (base for validation)
uses: actions/checkout@v4
with:
ref: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.sdk_ref || (github.event_name == 'pull_request_target' && github.event.pull_request.base.ref || github.ref) }}
fetch-depth: 0
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.13'
- name: Validate eval_limit
if: github.event_name == 'workflow_dispatch'
run: |
if ! [[ "${{ github.event.inputs.eval_limit }}" =~ ^[1-9][0-9]*$ ]]; then
echo "Error: eval_limit must be a positive integer, got: ${{ github.event.inputs.eval_limit }}"
exit 1
fi
- name: Validate SDK reference (semantic version check)
if: github.event_name == 'workflow_dispatch'
env:
SDK_REF: ${{ github.event.inputs.sdk_ref }}
ALLOW_UNRELEASED_BRANCHES: ${{ github.event.inputs.allow_unreleased_branches }}
run: |
python3 .github/run-eval/validate_sdk_ref.py
- name: Install dependencies
run: |
pip install 'litellm>=1.81.0'
- name: Load model IDs from Python script
id: load-models
run: |
# Extract all model IDs from resolve_model_config.py
ALLOWED_MODEL_IDS=$(python3 << 'EOF'
import sys
sys.path.insert(0, '.github/run-eval')
from resolve_model_config import MODELS
import json
print(json.dumps(list(MODELS.keys())))
EOF
)
DEFAULT_MODEL=$(echo "$ALLOWED_MODEL_IDS" | jq -r '.[0]')
if [ -z "$DEFAULT_MODEL" ] || [ "$DEFAULT_MODEL" = "null" ]; then
echo "No models configured" >&2
exit 1
fi
echo "allowed_model_ids=$ALLOWED_MODEL_IDS" >> "$GITHUB_OUTPUT"
echo "default_model=$DEFAULT_MODEL" >> "$GITHUB_OUTPUT"
- name: Resolve parameters
id: params
env:
DEFAULT_MODEL: ${{ steps.load-models.outputs.default_model }}
ALLOWED_MODEL_IDS_JSON: ${{ steps.load-models.outputs.allowed_model_ids }}
PAT_TOKEN_DEFAULT: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
run: |
set -euo pipefail
# Set PAT token for cross-repo workflow dispatch
PAT_TOKEN="$PAT_TOKEN_DEFAULT"
if [ -z "$PAT_TOKEN" ]; then
echo "Missing PAT token" >&2
exit 1
fi
echo "PAT_TOKEN=$PAT_TOKEN" >> "$GITHUB_ENV"
# Determine eval limit and SDK SHA based on trigger
if [ "${{ github.event_name }}" = "pull_request_target" ]; then
LABEL="${{ github.event.label.name }}"
case "$LABEL" in
run-eval-1) EVAL_LIMIT=1 ;;
run-eval-50) EVAL_LIMIT=50 ;;
run-eval-200) EVAL_LIMIT=200 ;;
run-eval-500) EVAL_LIMIT=500 ;;
*) echo "Unsupported label $LABEL" >&2; exit 1 ;;
esac
SDK_SHA="${{ github.event.pull_request.head.sha }}"
PR_NUMBER="${{ github.event.pull_request.number }}"
TRIGGER_DESCRIPTION="Label '${LABEL}' on PR #${PR_NUMBER}"
elif [ "${{ github.event_name }}" = "release" ]; then
EVAL_LIMIT=50
# Use tag instead of target_commitish because release branches are automatically deleted after merge
SDK_SHA=$(git rev-parse "${{ github.event.release.tag_name }}")
PR_NUMBER=""
TRIGGER_DESCRIPTION="Release ${{ github.event.release.tag_name }}"
else
EVAL_LIMIT="${{ github.event.inputs.eval_limit }}"
SDK_REF="${{ github.event.inputs.sdk_ref }}"
# Convert ref to SHA for manual dispatch
# Resolve SHA robustly for both branch refs and raw SHAs (avoid double-prefix issues)
SDK_SHA=$(git rev-parse --verify "$SDK_REF^{commit}" 2>/dev/null || \
git rev-parse --verify "origin/$SDK_REF^{commit}" 2>/dev/null || \
echo "$SDK_REF")
PR_NUMBER=""
REASON="${{ github.event.inputs.reason }}"
if [ -z "$REASON" ]; then
REASON="manual"
fi
TRIGGER_DESCRIPTION="Manual trigger: ${REASON}"
fi
# Normalize and validate model IDs
MODELS_INPUT="${{ github.event_name == 'workflow_dispatch' && github.event.inputs.model_ids || '' }}"
if [ -z "$MODELS_INPUT" ]; then
MODELS_INPUT="$DEFAULT_MODEL"
fi
MODELS=$(printf '%s' "$MODELS_INPUT" | tr ', ' '\n' | sed '/^$/d' | paste -sd, -)
ALLOWED_LIST=$(echo "$ALLOWED_MODEL_IDS_JSON" | jq -r '.[]')
for MODEL in ${MODELS//,/ }; do
if ! echo "$ALLOWED_LIST" | grep -Fx "$MODEL" >/dev/null; then
echo "Model ID '$MODEL' not found in MODELS (resolve_model_config.py)" >&2
echo "Available models: $(echo "$ALLOWED_LIST" | paste -sd, -)" >&2
exit 1
fi
done
# Sanitize values to avoid GITHUB_OUTPUT parse errors (e.g., raw SHAs)
SDK_SHA=$(printf '%s' "$SDK_SHA" | tr -d '\n\r')
EVAL_LIMIT=$(printf '%s' "$EVAL_LIMIT" | tr -d '\n\r')
PR_NUMBER=$(printf '%s' "$PR_NUMBER" | tr -d '\n\r')
MODELS=$(printf '%s' "$MODELS" | tr -d '\n\r')
TRIGGER_DESCRIPTION=$(printf '%s' "$TRIGGER_DESCRIPTION" | tr -d '\n\r')
printf 'eval_limit=%s\n' "$EVAL_LIMIT" >> "$GITHUB_OUTPUT"
printf 'sdk_sha=%s\n' "$SDK_SHA" >> "$GITHUB_OUTPUT"
printf 'models=%s\n' "$MODELS" >> "$GITHUB_OUTPUT"
printf 'pr_number=%s\n' "$PR_NUMBER" >> "$GITHUB_OUTPUT"
printf 'trigger_desc=%s\n' "$TRIGGER_DESCRIPTION" >> "$GITHUB_OUTPUT"
- name: Resolve model configurations and verify availability
id: resolve-models
env:
MODEL_IDS: ${{ steps.params.outputs.models }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY_EVAL }}
LLM_BASE_URL: https://llm-proxy.eval.all-hands.dev
run: |
python3 .github/run-eval/resolve_model_config.py
- name: Dispatch evaluation workflow
env:
SDK_SHA: ${{ steps.params.outputs.sdk_sha }}
EVAL_LIMIT: ${{ steps.params.outputs.eval_limit }}
MODELS_JSON: ${{ steps.resolve-models.outputs.models_json }}
EVAL_REPO: ${{ env.EVAL_REPO }}
EVAL_WORKFLOW: ${{ env.EVAL_WORKFLOW }}
EVAL_BRANCH: ${{ github.event.inputs.eval_branch || 'main' }}
BENCHMARKS_BRANCH: ${{ github.event.inputs.benchmarks_branch || 'main' }}
BENCHMARK: ${{ github.event.inputs.benchmark || 'swebench' }}
TRIGGER_REASON: ${{ github.event.inputs.reason }}
PR_NUMBER: ${{ steps.params.outputs.pr_number }}
INSTANCE_IDS: ${{ github.event.inputs.instance_ids || '' }}
NUM_INFER_WORKERS: ${{ github.event.inputs.num_infer_workers || '' }}
NUM_EVAL_WORKERS: ${{ github.event.inputs.num_eval_workers || '' }}
ENABLE_CONVERSATION_EVENT_LOGGING: ${{ github.event.inputs.enable_conversation_event_logging || 'true' }}
MAX_RETRIES: ${{ github.event.inputs.max_retries || '3' }}
TOOL_PRESET: ${{ github.event.inputs.tool_preset || 'default' }}
AGENT_TYPE: ${{ github.event.inputs.agent_type || 'default' }}
TRIGGERED_BY: ${{ github.actor }}
run: |
# Normalize instance_ids: strip all spaces
INSTANCE_IDS=$(printf '%s' "$INSTANCE_IDS" | tr -d ' ')
echo "Dispatching evaluation workflow with SDK commit: $SDK_SHA (benchmark: $BENCHMARK, eval branch: $EVAL_BRANCH, benchmarks branch: $BENCHMARKS_BRANCH, tool preset: $TOOL_PRESET)"
PAYLOAD=$(jq -n \
--arg sdk "$SDK_SHA" \
--arg eval_limit "$EVAL_LIMIT" \
--argjson models "$MODELS_JSON" \
--arg ref "$EVAL_BRANCH" \
--arg reason "$TRIGGER_REASON" \
--arg pr "$PR_NUMBER" \
--arg benchmarks "$BENCHMARKS_BRANCH" \
--arg benchmark "$BENCHMARK" \
--arg instance_ids "$INSTANCE_IDS" \
--arg num_infer_workers "$NUM_INFER_WORKERS" \
--arg num_eval_workers "$NUM_EVAL_WORKERS" \
--argjson enable_conversation_event_logging "$ENABLE_CONVERSATION_EVENT_LOGGING" \
--arg max_retries "$MAX_RETRIES" \
--arg tool_preset "$TOOL_PRESET" \
--arg agent_type "$AGENT_TYPE" \
--arg triggered_by "$TRIGGERED_BY" \
'{ref: $ref, inputs: {sdk_commit: $sdk, eval_limit: $eval_limit, models_json: ($models | tostring), trigger_reason: $reason, pr_number: $pr, benchmarks_branch: $benchmarks, benchmark: $benchmark, instance_ids: $instance_ids, num_infer_workers: $num_infer_workers, num_eval_workers: $num_eval_workers, enable_conversation_event_logging: $enable_conversation_event_logging, max_retries: $max_retries, tool_preset: $tool_preset, agent_type: $agent_type, triggered_by: $triggered_by}}')
RESPONSE=$(curl -sS -o /tmp/dispatch.out -w "%{http_code}" -X POST \
-H "Authorization: token $PAT_TOKEN" \
-H "Accept: application/vnd.github+json" \
-d "$PAYLOAD" \
"https://api.github.com/repos/${EVAL_REPO}/actions/workflows/${EVAL_WORKFLOW}/dispatches")
if [ "$RESPONSE" != "204" ]; then
echo "Dispatch failed (status $RESPONSE):" >&2
cat /tmp/dispatch.out >&2
exit 1
fi
- name: Comment on PR
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SDK_SHA: ${{ steps.params.outputs.sdk_sha }}
EVAL_LIMIT: ${{ steps.params.outputs.eval_limit }}
MODELS: ${{ steps.params.outputs.models }}
TRIGGER_DESC: ${{ steps.params.outputs.trigger_desc }}
EVENT_NAME: ${{ github.event_name }}
PR_NUMBER_INPUT: ${{ steps.params.outputs.pr_number }}
run: |
set -euo pipefail
PR_NUMBER="$PR_NUMBER_INPUT"
if [ "$EVENT_NAME" = "release" ] && [ -z "$PR_NUMBER" ]; then
# Attempt to find the merged PR for this commit
PR_NUMBER=$(curl -sS \
-H "Authorization: Bearer $GITHUB_TOKEN" \
-H "Accept: application/vnd.github+json" \
"https://api.github.com/repos/${{ github.repository }}/commits/${SDK_SHA}/pulls" \
| jq -r '.[0].number // ""')
fi
if [ -z "$PR_NUMBER" ]; then
echo "No PR found to comment on; skipping comment"
exit 0
fi
COMMENT_BODY=$(printf '**Evaluation Triggered**\n\n- Trigger: %s\n- SDK: %s\n- Eval limit: %s\n- Models: %s\n' \
"$TRIGGER_DESC" "$SDK_SHA" "$EVAL_LIMIT" "$MODELS")
curl -sS -X POST \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer $GITHUB_TOKEN" \
"https://api.github.com/repos/${{ github.repository }}/issues/${PR_NUMBER}/comments" \
-d "$(jq -n --arg body "$COMMENT_BODY" '{body: $body}')"
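Review note: the "Resolve parameters" step above maps a PR label to an eval limit and normalizes the `model_ids` input with `tr`, `sed`, and `paste`. The following is a minimal Python sketch of those two rules for local experimentation. The function names are hypothetical and the model IDs used when calling it are placeholders, not keys known to exist in `MODELS`.

```python
import re

# Label-to-limit mapping taken from the `case` statement in the workflow.
LABEL_LIMITS = {
    "run-eval-1": 1,
    "run-eval-50": 50,
    "run-eval-200": 200,
    "run-eval-500": 500,
}


def eval_limit_for_label(label):
    """Return the eval limit for a PR label, or raise for unsupported labels."""
    try:
        return LABEL_LIMITS[label]
    except KeyError:
        raise ValueError(f"Unsupported label {label}") from None


def normalize_model_ids(raw, allowed, default):
    """Split on commas and spaces, drop empties, validate, re-join with commas."""
    # An empty input falls back to the default model, as in the shell step.
    raw = raw if raw else default
    ids = [part for part in re.split(r"[, ]+", raw) if part]
    unknown = [m for m in ids if m not in allowed]
    if unknown:
        raise ValueError(f"Unknown model ID(s): {', '.join(unknown)}")
    return ",".join(ids)
```

For example, `normalize_model_ids("a, b,,c", ["a", "b", "c"], "a")` yields `a,b,c`, while an ID outside the allowed list raises instead of being dispatched.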
+199
@@ -0,0 +1,199 @@
---
name: Run Examples Scripts
on:
pull_request:
types: [labeled]
workflow_dispatch:
inputs:
reason:
description: Reason for manual trigger
required: true
default: ''
schedule:
- cron: 30 22 * * * # Runs at 10:30pm UTC every day
permissions:
contents: read
pull-requests: write
issues: write
jobs:
test-examples:
# Schedule trigger only runs in the main repository, not in forks
if: github.event.label.name == 'test-examples' || github.event_name == 'workflow_dispatch' || (github.event_name == 'schedule' && github.repository == 'OpenHands/software-agent-sdk')
runs-on: ubuntu-24.04
timeout-minutes: 60
steps:
- name: Wait for agent server to finish build
if: github.event_name == 'pull_request'
uses: lewagon/wait-on-check-action@v1.4.1
with:
ref: ${{ github.event.pull_request.head.ref }}
check-name: Build & Push (python-amd64)
repo-token: ${{ secrets.GITHUB_TOKEN }}
wait-interval: 10
- name: Checkout
uses: actions/checkout@v5
with:
ref: ${{ github.event.pull_request.head.ref }}
repository: ${{ github.event.pull_request.head.repo.full_name }}
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install Node.js
uses: actions/setup-node@v6
with:
node-version: '22'
- name: Setup Apptainer
uses: eWaterCycle/setup-apptainer@v2
with:
apptainer-version: 1.3.6
- name: Install Chromium
run: |
sudo apt-get update
sudo apt-get install -y chromium-browser
- name: Install dependencies
run: uv sync --frozen --group dev
- name: Run examples
shell: bash
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_MODEL: openhands/claude-haiku-4-5-20251001
LLM_BASE_URL: https://llm-proxy.app.all-hands.dev
RUNTIME_API_KEY: ${{ secrets.RUNTIME_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REPO_OWNER: ${{ github.repository_owner }}
REPO_NAME: ${{ github.event.repository.name }}
GITHUB_SHA: ${{ github.event.pull_request.head.sha }}
OPENHANDS_CLOUD_API_KEY: ${{ secrets.ALLHANDS_BOT_OPENHANDS_SAAS_API_KEY }}
# ACP agents (Claude Code, Codex) route through LiteLLM proxy
ANTHROPIC_BASE_URL: https://llm-proxy.app.all-hands.dev
ANTHROPIC_API_KEY: ${{ secrets.LLM_API_KEY }}
OPENAI_BASE_URL: https://llm-proxy.app.all-hands.dev
OPENAI_API_KEY: ${{ secrets.LLM_API_KEY }}
run: |
RESULTS_DIR=".example-test-results"
REPORT_PATH="examples_report.md"
rm -rf "$RESULTS_DIR"
mkdir -p "$RESULTS_DIR"
update_comment() {
if [ -z "$API_URL" ]; then
echo "Skipping PR comment update because API_URL is unset."
return
fi
local comment_body="$1"
local payload
local response
payload=$(jq -n --arg body "$comment_body" '{body: $body}')
if [ -z "$COMMENT_ID" ]; then
echo "Creating PR comment..."
if ! response=$(curl -sSf -X POST \
-H "Authorization: token ${GITHUB_TOKEN}" \
-H "Accept: application/vnd.github.v3+json" \
-H "Content-Type: application/json" \
"${API_URL}" \
-d "$payload"); then
echo "::error::Failed to create PR comment."
exit 1
fi
COMMENT_ID=$(echo "$response" | jq -r '.id // ""')
if [ -z "$COMMENT_ID" ]; then
echo "::error::GitHub API response did not include a comment id: $response"
exit 1
fi
echo "Created comment with ID: $COMMENT_ID"
else
echo "Updating PR comment (ID: $COMMENT_ID)..."
if ! curl -sSf -X PATCH \
-H "Authorization: token ${GITHUB_TOKEN}" \
-H "Accept: application/vnd.github.v3+json" \
-H "Content-Type: application/json" \
"https://api.github.com/repos/${REPO_OWNER}/${REPO_NAME}/issues/comments/${COMMENT_ID}" \
-d "$payload" > /dev/null; then
echo "::error::Failed to update PR comment (ID: $COMMENT_ID)."
exit 1
fi
fi
}
API_URL=""
COMMENT_ID=""
if [ "${{ github.event_name }}" = "pull_request" ]; then
API_URL="https://api.github.com/repos/${REPO_OWNER}/${REPO_NAME}/issues/${PR_NUMBER}/comments"
initial_comment="## 🔄 Running Examples with \`${LLM_MODEL}\`"
initial_comment+=$'\n\n'
initial_comment+="_Run in progress..._"
initial_comment+=$'\n'
update_comment "$initial_comment"
fi
EXIT_CODE=0
uv run pytest tests/examples/test_examples.py \
--run-examples \
--examples-results-dir "$RESULTS_DIR" \
-n 4 || EXIT_CODE=$?
TIMESTAMP="$(date -u '+%Y-%m-%d %H:%M:%S UTC')"
WORKFLOW_URL="${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}"
uv run python scripts/render_examples_report.py \
--results-dir "$RESULTS_DIR" \
--model "$LLM_MODEL" \
--workflow-url "$WORKFLOW_URL" \
--timestamp "$TIMESTAMP" \
--output "$REPORT_PATH"
COMMENT_BODY="$(cat "$REPORT_PATH")"
echo "$COMMENT_BODY"
if [ "${{ github.event_name }}" = "pull_request" ]; then
echo "Publishing PR comment..."
update_comment "$COMMENT_BODY"
fi
if [ $EXIT_CODE -ne 0 ]; then
exit $EXIT_CODE
fi
- name: Read examples report for issue comment
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
id: read_report
shell: bash
run: |
if [ -f examples_report.md ]; then
REPORT_CONTENT=$(cat examples_report.md)
echo "report<<EOF" >> "$GITHUB_OUTPUT"
echo "$REPORT_CONTENT" >> "$GITHUB_OUTPUT"
echo "EOF" >> "$GITHUB_OUTPUT"
else
echo "report=Report file not found" >> "$GITHUB_OUTPUT"
fi
- name: Comment with results on tracker issue
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
uses: KeisukeYamashita/create-comment@v1
with:
number: 976
unique: false
comment: |
**Trigger:** ${{ github.event_name == 'schedule' && 'Nightly Scheduled Run' || format('Manual Trigger: {0}', github.event.inputs.reason) }}
**Commit:** ${{ github.sha }}
**Workflow Run:** ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
${{ steps.read_report.outputs.report }}
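Review note: the "Read examples report" step above writes a multi-line output with the fixed delimiter `EOF`. If the report ever contains a line that is exactly `EOF`, the output would likely be cut short. The following is a minimal Python sketch of the same `name<<DELIMITER` format with a random delimiter. The helper names `write_multiline_output` and `demo` are hypothetical, and `demo` writes to a temporary file rather than the real `GITHUB_OUTPUT`.

```python
import os
import tempfile
import uuid


def write_multiline_output(output_path, name, value):
    """Append a multi-line step output using a random delimiter."""
    delimiter = f"ghadelimiter_{uuid.uuid4().hex}"
    # Regenerate in the unlikely case the delimiter occurs in the value.
    while delimiter in value:
        delimiter = f"ghadelimiter_{uuid.uuid4().hex}"
    with open(output_path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}<<{delimiter}\n{value}\n{delimiter}\n")
    return delimiter


def demo():
    """Write a report that contains a literal EOF line to a temporary file."""
    report = "## Report\nEOF\nall examples passed"
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        delimiter = write_multiline_output(path, "report", report)
        with open(path, encoding="utf-8") as fh:
            return fh.read(), delimiter, report
    finally:
        os.remove(path)
```

Inside a workflow step, the call would presumably be `write_multiline_output(os.environ["GITHUB_OUTPUT"], "report", text)`.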
+715
@@ -0,0 +1,715 @@
---
name: Agent Server
on:
push:
branches: [main]
tags:
- '*' # Trigger on any tag (e.g., 1.0.0, 1.0.0a5, build-docker)
pull_request:
branches: [main]
workflow_dispatch:
inputs:
base_image:
description: Base runtime image
type: string
default: nikolaik/python-nodejs:python3.13-nodejs22-slim
image:
description: GHCR image name
type: string
default: ghcr.io/openhands/agent-server
platforms:
description: Target platforms
type: string
default: linux/amd64,linux/arm64
permissions:
contents: read
packages: write
jobs:
build-binary-and-test:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest, macos-latest]
steps:
- uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install dependencies
run: uv sync --dev
- name: Build binary
run: |
make build-server
# FIXME: windows-latest is not working; the binary test step fails with:
# Run if [[ "windows-latest" == "windows-latest" ]]; then
# [PYI-2160:ERROR] Failed to load Python DLL 'C:\Users\RUNNER~1\AppData\Local\Temp\_MEI5602\python312.dll'.
# LoadLibrary: Invalid access to memory location.
# - name: Test binary (Windows)
# if: matrix.os == 'windows-latest'
# shell: pwsh
# run: |
# Get-ChildItem dist
# .\dist\openhands-agent-server.exe --help
- name: Test binary (Linux and macOS)
if: matrix.os != 'windows-latest'
shell: bash
run: |
# Test help command
./dist/openhands-agent-server --help
# Test server startup and template loading
echo "Testing server startup and template loading..."
./dist/openhands-agent-server --port 8002 > server_test.log 2>&1 &
SERVER_PID=$!
# Wait for server to start
sleep 5
# Check if server started successfully (no template errors)
if grep -q "system_prompt.j2.*not found" server_test.log; then
echo "ERROR: Template files not found in binary!"
cat server_test.log
kill $SERVER_PID 2>/dev/null || true
exit 1
fi
# Check if server is running
if ! kill -0 $SERVER_PID 2>/dev/null; then
echo "ERROR: Server failed to start!"
cat server_test.log
exit 1
fi
# Test basic API endpoint
if command -v curl >/dev/null 2>&1; then
echo "Testing basic API endpoint..."
if curl -f -s http://localhost:8002/health >/dev/null 2>&1; then
echo "✓ Health endpoint accessible"
else
echo "⚠ Health endpoint not accessible (may be expected)"
fi
fi
# Clean up
kill $SERVER_PID 2>/dev/null || true
wait $SERVER_PID 2>/dev/null || true
rm -f server_test.log
echo "✓ Binary test completed successfully"
- name: Upload binary artifact
uses: actions/upload-artifact@v7
with:
name: openhands-server-${{ matrix.os }}
path: |
dist/openhands-agent-server*
retention-days: 7
check-openapi-schema:
name: Check OpenAPI Schema
runs-on: ubuntu-24.04
steps:
- name: Checkout PR branch
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install Node.js (for npx)
uses: actions/setup-node@v6
with:
node-version: 22
- name: Install dependencies
run: |
uv sync --frozen --dev
- name: Check OpenAPI JSON and build client
env:
PYTHONPATH: .
run: |
make test-server-schema
build-and-push-image:
name: Build & Push (${{ matrix.variant }}-${{ matrix.arch }})
# Run on push events, pull requests from the same repository (not forks), and manual workflow_dispatch
# Fork PRs cannot push to GHCR and would fail authentication
if: >
github.event_name == 'push' ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'pull_request' &&
!github.event.pull_request.head.repo.fork)
strategy:
fail-fast: false
matrix:
# Explicit matrix: 3 variants × 2 architectures = 6 jobs
# Each job specifies exactly what it builds and where it runs
include:
# Python variant
- variant: python
arch: amd64
base_image: nikolaik/python-nodejs:python3.13-nodejs22
runner: ubuntu-24.04
platform: linux/amd64
- variant: python
arch: arm64
base_image: nikolaik/python-nodejs:python3.13-nodejs22
runner: ubuntu-24.04-arm
platform: linux/arm64
# Java variant
- variant: java
arch: amd64
base_image: eclipse-temurin:17-jdk
runner: ubuntu-24.04
platform: linux/amd64
- variant: java
arch: arm64
base_image: eclipse-temurin:17-jdk
runner: ubuntu-24.04-arm
platform: linux/arm64
# Golang variant
- variant: golang
arch: amd64
base_image: golang:1.21-bookworm
runner: ubuntu-24.04
platform: linux/amd64
- variant: golang
arch: arm64
base_image: golang:1.21-bookworm
runner: ubuntu-24.04-arm
platform: linux/arm64
runs-on: ${{ matrix.runner }}
env:
IMAGE: ${{ inputs.image != '' && inputs.image || 'ghcr.io/openhands/agent-server' }}
BASE_IMAGE: ${{ inputs.base_image != '' && inputs.base_image || matrix.base_image }}
CUSTOM_TAGS: ${{ matrix.variant }}
VARIANT: ${{ matrix.variant }}
ARCH: ${{ matrix.arch }}
TARGET: binary
PLATFORM: ${{ matrix.platform }}
# Use PR head SHA for pull requests to match the image tag expected by run-examples.yml
GITHUB_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
GITHUB_REF: ${{ github.ref }}
CI: 'true'
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to GHCR
uses: docker/login-action@v4
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Prepare build context and metadata
id: prep
run: |
uv sync --frozen
# Generate build context and tags with arch suffix
# build.py now handles architecture tagging internally via --arch flag
# Add --versioned-tag when triggered by a git tag (e.g., v1.0.0)
BUILD_CMD="uv run ./openhands-agent-server/openhands/agent_server/docker/build.py --build-ctx-only --arch ${{ matrix.arch }}"
if [[ "${{ github.ref }}" == refs/tags/* ]]; then
BUILD_CMD="$BUILD_CMD --versioned-tag"
fi
eval "$BUILD_CMD"
# Alias tags_csv output to tags for the build action
TAGS=$(grep '^tags_csv=' $GITHUB_OUTPUT | cut -d= -f2-)
echo "tags=$TAGS" >> $GITHUB_OUTPUT
# Extract short SHA for consolidation
SHORT_SHA=$(echo ${{ github.sha }} | cut -c1-7)
echo "short_sha=$SHORT_SHA" >> $GITHUB_OUTPUT
# Extract versioned tags CSV for consolidation
VERSIONED_TAGS_CSV=$(grep '^versioned_tags_csv=' $GITHUB_OUTPUT | cut -d= -f2- || echo "")
echo "versioned_tags_csv=$VERSIONED_TAGS_CSV" >> $GITHUB_OUTPUT
# Verify outputs
echo "=== Build outputs ==="
echo "Build context: $(grep '^build_context=' $GITHUB_OUTPUT | cut -d= -f2-)"
echo "Tags: $TAGS"
echo "Short SHA: $SHORT_SHA"
echo "Versioned tags: $VERSIONED_TAGS_CSV"
echo "===================="
- name: Build & Push (${{ matrix.variant }}-${{ matrix.arch }})
id: build
uses: docker/build-push-action@v6
with:
context: ${{ steps.prep.outputs.build_context }}
file: ${{ steps.prep.outputs.dockerfile }}
target: ${{ env.TARGET }}
platforms: ${{ env.PLATFORM }}
push: true
tags: ${{ steps.prep.outputs.tags }}
cache-from: type=gha
cache-to: type=gha,mode=max
build-args: |
BASE_IMAGE=${{ env.BASE_IMAGE }}
- name: Cleanup build context
if: always()
run: |
if [ -n "${{ steps.prep.outputs.build_context }}" ] && [ -d "${{ steps.prep.outputs.build_context }}" ]; then
echo "Cleaning up build context: ${{ steps.prep.outputs.build_context }}"
rm -rf "${{ steps.prep.outputs.build_context }}"
fi
- name: Summary (${{ matrix.variant }}-${{ matrix.arch }}) - outputs
run: |
echo "Image: ${{ env.IMAGE }}"
echo "Variant: ${{ env.VARIANT }}"
echo "Architecture: ${{ env.ARCH }}"
echo "Platform: ${{ env.PLATFORM }}"
echo "Short SHA: ${{ steps.prep.outputs.short_sha }}"
echo "Tags: ${{ steps.prep.outputs.tags }}"
echo "Build digest: ${{ steps.build.outputs.digest }}"
- name: Save build info for consolidation
run: |
mkdir -p build-info
cat > "build-info/${{ matrix.variant }}-${{ matrix.arch }}.json" << EOF
{
"variant": "${{ matrix.variant }}",
"arch": "${{ matrix.arch }}",
"base_image": "${{ matrix.base_image }}",
"image": "${{ env.IMAGE }}",
"short_sha": "${{ steps.prep.outputs.short_sha }}",
"tags": "${{ steps.prep.outputs.tags }}",
"versioned_tags_csv": "${{ steps.prep.outputs.versioned_tags_csv }}",
"platform": "${{ env.PLATFORM }}"
}
EOF
- name: Upload build info artifact
uses: actions/upload-artifact@v7
with:
name: build-info-${{ matrix.variant }}-${{ matrix.arch }}
path: build-info/${{ matrix.variant }}-${{ matrix.arch }}.json
retention-days: 1
merge-manifests:
name: Merge Multi-Arch Manifests
needs: build-and-push-image
if: >
github.event_name == 'push' ||
(github.event_name == 'pull_request' &&
!github.event.pull_request.head.repo.fork)
runs-on: ubuntu-24.04
strategy:
matrix:
variant: [python, java, golang]
env:
IMAGE: ${{ inputs.image != '' && inputs.image || 'ghcr.io/openhands/agent-server' }}
steps:
- name: Download build info to extract SHORT_SHA
uses: actions/download-artifact@v8
with:
pattern: build-info-${{ matrix.variant }}-*
merge-multiple: true
path: build-info
- name: Extract SHORT_SHA from build info
id: get_sha
run: |
# Get SHORT_SHA from any build info artifact for this variant
SHORT_SHA=$(jq -r '.short_sha' build-info/${{ matrix.variant }}-amd64.json)
echo "short_sha=$SHORT_SHA" >> $GITHUB_OUTPUT
echo "Using SHORT_SHA: $SHORT_SHA"
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to GHCR
uses: docker/login-action@v4
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Create and push multi-arch manifest for ${{ matrix.variant }}
id: create_manifest
run: |
SHORT_SHA=${{ steps.get_sha.outputs.short_sha }}
VARIANT=${{ matrix.variant }}
MANIFEST_TAG="${SHORT_SHA}-${VARIANT}"
# Create multi-arch manifest combining amd64 and arm64 using buildx imagetools
# This properly handles manifest lists from Docker builds
echo "Creating multi-arch manifest: ${IMAGE}:${MANIFEST_TAG}"
docker buildx imagetools create -t ${IMAGE}:${MANIFEST_TAG} \
${IMAGE}:${SHORT_SHA}-${VARIANT}-amd64 \
${IMAGE}:${SHORT_SHA}-${VARIANT}-arm64
# Verify the multi-arch manifest
echo "Inspecting multi-arch manifest:"
docker buildx imagetools inspect ${IMAGE}:${MANIFEST_TAG}
echo "✓ Multi-arch manifest created: ${IMAGE}:${MANIFEST_TAG}"
# Create latest manifest if on main branch
if [ "${{ github.ref }}" == "refs/heads/main" ]; then
LATEST_TAG="latest-${VARIANT}"
echo "Creating latest multi-arch manifest: ${IMAGE}:${LATEST_TAG}"
docker buildx imagetools create -t ${IMAGE}:${LATEST_TAG} \
${IMAGE}:main-${VARIANT}-amd64 \
${IMAGE}:main-${VARIANT}-arm64
echo "Inspecting latest multi-arch manifest:"
docker buildx imagetools inspect ${IMAGE}:${LATEST_TAG}
echo "✓ Latest multi-arch manifest created: ${IMAGE}:${LATEST_TAG}"
MANIFEST_TAG="${MANIFEST_TAG},${LATEST_TAG}"
fi
# Create versioned manifests if triggered by a git tag
# Extract versioned tags from build info (format: "1.2.0-python,1.2.0-java")
VERSIONED_TAGS_CSV=$(jq -r '.versioned_tags_csv' build-info/${VARIANT}-amd64.json)
if [ -n "$VERSIONED_TAGS_CSV" ] && [ "$VERSIONED_TAGS_CSV" != "null" ] && [ "$VERSIONED_TAGS_CSV" != "" ]; then
echo "Found versioned tags: $VERSIONED_TAGS_CSV"
# Split CSV and create manifest for each versioned tag
IFS=',' read -ra VERSIONED_TAGS <<< "$VERSIONED_TAGS_CSV"
for VERSIONED_TAG in "${VERSIONED_TAGS[@]}"; do
if [ -n "$VERSIONED_TAG" ]; then
echo "Creating versioned multi-arch manifest: ${IMAGE}:${VERSIONED_TAG}"
docker buildx imagetools create -t ${IMAGE}:${VERSIONED_TAG} \
${IMAGE}:${VERSIONED_TAG}-amd64 \
${IMAGE}:${VERSIONED_TAG}-arm64
echo "Inspecting versioned multi-arch manifest:"
docker buildx imagetools inspect ${IMAGE}:${VERSIONED_TAG}
echo "✓ Versioned multi-arch manifest created: ${IMAGE}:${VERSIONED_TAG}"
MANIFEST_TAG="${MANIFEST_TAG},${VERSIONED_TAG}"
fi
done
fi
# Save manifest info for consolidation
mkdir -p manifest-info
cat > "manifest-info/${VARIANT}.json" << EOF
{
"variant": "${VARIANT}",
"image": "${IMAGE}",
"short_sha": "${SHORT_SHA}",
"manifest_tag": "${MANIFEST_TAG}"
}
EOF
- name: Upload manifest info artifact
uses: actions/upload-artifact@v7
with:
name: manifest-info-${{ matrix.variant }}
path: manifest-info/${{ matrix.variant }}.json
retention-days: 1
consolidate-build-info:
name: Consolidate Build Information
needs: [build-and-push-image, merge-manifests]
# Run if it's a PR and the matrix job completed (even if some variants failed)
if: github.event_name == 'pull_request' && always() && (needs.build-and-push-image.result == 'success' || needs.build-and-push-image.result ==
'failure')
runs-on: ubuntu-24.04
outputs:
build_summary: ${{ steps.consolidate.outputs.build_summary }}
steps:
- name: Download build info artifacts
uses: actions/download-artifact@v8
with:
pattern: build-info-*
merge-multiple: true
path: build-info
- name: Download manifest info artifacts
uses: actions/download-artifact@v8
with:
pattern: manifest-info-*
merge-multiple: true
path: manifest-info
- name: Consolidate build information from artifacts
id: consolidate
run: |
echo "Processing build info artifacts..."
ls -la build-info/
echo "Found $(ls build-info/*.json 2>/dev/null | wc -l) JSON files"
# Initialize variables
IMAGE=""
SHORT_SHA=""
ALL_TAGS=""
# Use associative arrays to track variants (bash 4+)
declare -A VARIANT_BASE_IMAGE
declare -A VARIANT_ARCHS
# Process each build info
for info_file in build-info/*.json; do
if [[ ! -f "$info_file" ]]; then
echo "Skipping $info_file - not a file"
continue
fi
echo "=== Processing $info_file ==="
cat "$info_file"
echo "=== End of $info_file ==="
# Extract information from JSON
VARIANT=$(jq -r '.variant' "$info_file")
ARCH=$(jq -r '.arch' "$info_file")
BASE_IMAGE=$(jq -r '.base_image' "$info_file")
VARIANT_IMAGE=$(jq -r '.image' "$info_file")
VARIANT_SHA=$(jq -r '.short_sha' "$info_file")
VARIANT_TAGS=$(jq -r '.tags' "$info_file")
# Set common values (same across all builds)
if [[ -z "$IMAGE" ]]; then
IMAGE="$VARIANT_IMAGE"
SHORT_SHA="$VARIANT_SHA"
fi
# Store variant information
VARIANT_BASE_IMAGE[$VARIANT]=$BASE_IMAGE
if [[ -z "${VARIANT_ARCHS[$VARIANT]}" ]]; then
VARIANT_ARCHS[$VARIANT]=$ARCH
else
VARIANT_ARCHS[$VARIANT]="${VARIANT_ARCHS[$VARIANT]}, $ARCH"
fi
# Collect tags (comma-separated to newline-separated)
if [[ -n "$VARIANT_TAGS" ]]; then
VARIANT_TAG_LIST=$(echo "$VARIANT_TAGS" | tr ',' '\n')
if [[ -n "$ALL_TAGS" ]]; then
ALL_TAGS="${ALL_TAGS}"$'\n'"${VARIANT_TAG_LIST}"
else
ALL_TAGS="$VARIANT_TAG_LIST"
fi
fi
done
# Build variants JSON array from collected data
VARIANTS_JSON="[]"
for VARIANT in "${!VARIANT_BASE_IMAGE[@]}"; do
BASE_IMG="${VARIANT_BASE_IMAGE[$VARIANT]}"
ARCHS="${VARIANT_ARCHS[$VARIANT]}"
VARIANTS_JSON=$(echo "$VARIANTS_JSON" | jq \
--arg variant "$VARIANT" \
--arg base_image "$BASE_IMG" \
--arg archs "$ARCHS" \
'. += [{custom_tags: $variant, base_image: $base_image, architectures: $archs}]')
echo "Added variant $VARIANT ($ARCHS), current variants JSON:"
echo "$VARIANTS_JSON" | jq .
done
# Process manifest info artifacts
echo "Processing manifest info artifacts..."
if [[ -d "manifest-info" ]]; then
ls -la manifest-info/
MANIFEST_TAGS=""
for manifest_file in manifest-info/*.json; do
if [[ -f "$manifest_file" ]]; then
echo "=== Processing $manifest_file ==="
cat "$manifest_file"
MANIFEST_TAG_CSV=$(jq -r '.manifest_tag' "$manifest_file")
# Convert comma-separated tags to newline-separated
MANIFEST_TAG_LIST=$(echo "$MANIFEST_TAG_CSV" | tr ',' '\n' | sed "s|^|${IMAGE}:|")
if [[ -n "$MANIFEST_TAGS" ]]; then
MANIFEST_TAGS="${MANIFEST_TAGS}"$'\n'"${MANIFEST_TAG_LIST}"
else
MANIFEST_TAGS="$MANIFEST_TAG_LIST"
fi
fi
done
# Add manifest tags to ALL_TAGS
if [[ -n "$MANIFEST_TAGS" ]]; then
echo "Adding manifest tags to output"
if [[ -n "$ALL_TAGS" ]]; then
ALL_TAGS="${ALL_TAGS}"$'\n'"${MANIFEST_TAGS}"
else
ALL_TAGS="$MANIFEST_TAGS"
fi
fi
else
echo "No manifest-info directory found (merge-manifests may not have run)"
fi
# Create consolidated build summary
BUILD_SUMMARY=$(jq -n \
--arg image "$IMAGE" \
--arg short_sha "$SHORT_SHA" \
--arg ghcr_url "https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server" \
--arg all_tags "$ALL_TAGS" \
--argjson variants "$VARIANTS_JSON" \
'{
image: $image,
short_sha: $short_sha,
ghcr_package_url: $ghcr_url,
all_tags: $all_tags,
variants: $variants
}')
echo "Consolidated build summary:"
echo "$BUILD_SUMMARY" | jq .
echo "DEBUG: Final variants count: $(echo "$VARIANTS_JSON" | jq 'length')"
echo "DEBUG: Final variants: $(echo "$VARIANTS_JSON" | jq -c '.')"
# Set output
{
echo 'build_summary<<EOF'
echo "$BUILD_SUMMARY"
echo 'EOF'
} >> $GITHUB_OUTPUT
update-pr-description:
name: Update PR description with agent server image
needs: consolidate-build-info
# Only on PRs, and only if the consolidation succeeded
if: github.event_name == 'pull_request' && needs.consolidate-build-info.result == 'success'
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: write
steps:
- name: Generate PR description from build summary
id: generate_description
run: |
echo "Event: ${{ github.event_name }}"
echo "PR number: ${{ github.event.number }}"
echo "Run attempt: ${{ github.run_attempt }}"
# Parse the build summary JSON
BUILD_SUMMARY='${{ needs.consolidate-build-info.outputs.build_summary }}'
echo "Build summary received:"
echo "$BUILD_SUMMARY" | jq .
# Extract basic information
IMAGE=$(echo "$BUILD_SUMMARY" | jq -r '.image')
SHORT_SHA=$(echo "$BUILD_SUMMARY" | jq -r '.short_sha')
GHCR_URL=$(echo "$BUILD_SUMMARY" | jq -r '.ghcr_package_url')
ALL_TAGS=$(echo "$BUILD_SUMMARY" | jq -r '.all_tags')
# Build the variants table dynamically
VARIANTS_TABLE=""
# Process each build
VARIANTS=$(echo "$BUILD_SUMMARY" | jq -r '.variants[] | @base64')
echo "DEBUG: Found builds (base64 encoded):"
echo "$VARIANTS"
echo "DEBUG: Number of builds: $(echo "$VARIANTS" | wc -l)"
for variant_data in $VARIANTS; do
# Decode base64 and extract build info
VARIANT_JSON=$(echo "$variant_data" | base64 --decode)
echo "DEBUG: Processing build JSON: $VARIANT_JSON"
CUSTOM_TAGS=$(echo "$VARIANT_JSON" | jq -r '.custom_tags')
BASE_IMAGE=$(echo "$VARIANT_JSON" | jq -r '.base_image')
ARCHS=$(echo "$VARIANT_JSON" | jq -r '.architectures // "amd64, arm64"')
echo "DEBUG: Adding variant $CUSTOM_TAGS with base image $BASE_IMAGE (archs: $ARCHS)"
# Add to variants table with architecture info
VARIANTS_TABLE="${VARIANTS_TABLE}| ${CUSTOM_TAGS} | ${ARCHS} | \`${BASE_IMAGE}\` | [Link](https://hub.docker.com/_/${BASE_IMAGE}) |"$'\n'
done
echo "DEBUG: Final variants table:"
echo "$VARIANTS_TABLE"
# Create the complete PR description with the requested format
PR_CONTENT=$(cat << EOF
<!-- AGENT_SERVER_IMAGES_START -->
---
**Agent Server images for this PR**
• **GHCR package:** ${GHCR_URL}
**Variants & Base Images**
| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
${VARIANTS_TABLE}
**Pull (multi-arch manifest)**
\`\`\`bash
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ${IMAGE}:${SHORT_SHA}-python
\`\`\`
**Run**
\`\`\`bash
docker run -it --rm \\
-p 8000:8000 \\
--name agent-server-${SHORT_SHA}-python \\
${IMAGE}:${SHORT_SHA}-python
\`\`\`
**All tags pushed for this build**
\`\`\`
${ALL_TAGS}
\`\`\`
**About Multi-Architecture Support**
- Each variant tag (e.g., \`${SHORT_SHA}-python\`) is a **multi-arch manifest** supporting both **amd64** and **arm64**
- Docker automatically pulls the correct architecture for your platform
- Individual architecture tags (e.g., \`${SHORT_SHA}-python-amd64\`) are also available if needed
<!-- AGENT_SERVER_IMAGES_END -->
EOF
)
# Set output for the next step
{
echo 'pr_content<<EOF'
echo "$PR_CONTENT"
echo 'EOF'
} >> $GITHUB_OUTPUT
- name: Update PR description with docker image details
uses: nefrob/pr-description@v1.2.0
with:
content: ${{ steps.generate_description.outputs.pr_content }}
regex: <!-- AGENT_SERVER_IMAGES_START -->.*?<!-- AGENT_SERVER_IMAGES_END -->
regexFlags: s
token: ${{ secrets.GITHUB_TOKEN }}
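Note on the tag handling in the consolidation step of this workflow: each manifest-info file stores its tags as one comma-separated string, and the job expands that string into one fully qualified image reference per line. A minimal sketch of that expansion follows. The helper name `expand_manifest_tags` and the example image name are illustrative assumptions, not part of the workflow.

```bash
#!/usr/bin/env bash
# Sketch only: expand_manifest_tags is a hypothetical helper that mirrors the
# `tr ',' '\n' | sed "s|^|${IMAGE}:|"` pipeline used in the consolidation step.
expand_manifest_tags() {
  local image="$1"   # e.g. ghcr.io/example/agent-server (assumed example name)
  local csv="$2"     # e.g. "abc1234-python,latest-python"
  # Split on commas, then prefix every resulting line with "<image>:".
  # Assumes the image name contains no "|" or "&", which sed would interpret.
  printf '%s\n' "$csv" | tr ',' '\n' | sed "s|^|${image}:|"
}

# Usage: prints one image reference per tag.
expand_manifest_tags "ghcr.io/example/agent-server" "abc1234-python,latest-python"
```

Keeping the expansion in one function makes it easy to check in isolation that a single tag and a multi-tag CSV both produce well-formed references.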
+24 -17
@@ -1,23 +1,30 @@
---
# Workflow that marks issues and PRs with no activity for 40 days with "Stale" and closes them after 10 more days of no activity
name: 'Close stale issues'
name: Close stale issues
# Runs every day at 01:30
on:
schedule:
- cron: '30 1 * * *'
schedule:
- cron: 30 1 * * *
jobs:
stale:
runs-on: blacksmith-4vcpu-ubuntu-2204
if: github.repository == 'OpenHands/OpenHands'
steps:
- uses: actions/stale@v9
with:
stale-issue-message: 'This issue is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment, otherwise it will be closed in 10 days.'
stale-pr-message: 'This PR is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment, otherwise it will be closed in 10 days.'
days-before-stale: 40
exempt-issue-labels: roadmap,backlog,app-team
close-issue-message: 'This issue was automatically closed due to 50 days of inactivity. We do this to help keep the issues somewhat manageable and focus on active issues.'
close-pr-message: 'This PR was closed because it had no activity for 50 days. If you feel this was closed in error, and you would like to continue the PR, please resubmit or let us know.'
days-before-close: 10
operations-per-run: 300
stale:
# Only run scheduled jobs in the main repository, not in forks
if: github.repository == 'OpenHands/software-agent-sdk'
runs-on: ubuntu-22.04
steps:
- uses: actions/stale@v10
with:
repo-token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
stale-issue-message: This issue is stale because it has been open for 40 days with no activity. Remove the stale label or leave a
comment, otherwise it will be closed in 10 days.
stale-pr-message: This PR is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment,
otherwise it will be closed in 10 days.
days-before-stale: 40
exempt-issue-labels: roadmap,backlog
close-issue-message: This issue was automatically closed due to 50 days of inactivity. We do this to help keep the issues somewhat
manageable and focus on active issues.
close-pr-message: This PR was closed because it had no activity for 50 days. If you feel this was closed in error, and you would
like to continue the PR, please resubmit or let us know.
days-before-close: 10
operations-per-run: 150
+322
@@ -0,0 +1,322 @@
---
name: Run tests
on:
push:
branches: [main]
pull_request:
branches: ['**']
permissions:
contents: write
pull-requests: write
jobs:
sdk-tests:
runs-on: blacksmith-2vcpu-ubuntu-2404
steps:
- name: Checkout
uses: actions/checkout@v5
with: {fetch-depth: 0}
- name: Detect sdk changes
id: changed
uses: tj-actions/changed-files@v47
with:
files: |
openhands-sdk/**
tests/sdk/**
pyproject.toml
uv.lock
.github/workflows/tests.yml
- name: Install uv
if: steps.changed.outputs.any_changed == 'true'
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps
if: steps.changed.outputs.any_changed == 'true'
run: uv sync --frozen --group dev
- name: Check for openhands.tools imports in sdk tests
if: steps.changed.outputs.any_changed == 'true'
run: |
echo "Checking for openhands.tools imports in tests/sdk..."
if grep -r "from openhands\.tools" tests/sdk/ || grep -r "import openhands\.tools" tests/sdk/; then
echo "ERROR: Found openhands.tools imports in tests/sdk/"
echo "SDK tests should only import from openhands.sdk"
echo "Please move tests that use openhands.tools to tests/cross/"
exit 1
fi
echo "✓ No openhands.tools imports found in tests/sdk/"
- name: Run sdk tests with coverage
if: steps.changed.outputs.any_changed == 'true'
run: |
# Clean up any existing coverage file
rm -f .coverage
# Use pytest-xdist (-n auto) for parallel execution with proper
# coverage collection. --forked is not used here because it prevents
# coverage from being collected in child processes.
CI=true uv run python -m pytest -vvs \
-n auto \
--cov=openhands-sdk \
--cov-report=term-missing \
--cov-fail-under=0 \
--cov-config=pyproject.toml \
tests/sdk
# Rename coverage file for upload
if [ -f .coverage ]; then
mv .coverage coverage-sdk.dat
echo "SDK coverage file prepared for upload"
fi
- name: Upload sdk coverage
if: steps.changed.outputs.any_changed == 'true' && always()
uses: actions/upload-artifact@v7
with:
name: coverage-sdk
path: coverage-sdk.dat
if-no-files-found: warn
tools-tests:
runs-on: blacksmith-2vcpu-ubuntu-2404
timeout-minutes: 15
steps:
- name: Checkout
uses: actions/checkout@v5
with: {fetch-depth: 0}
- name: Detect tools changes
id: changed
uses: tj-actions/changed-files@v47
with:
files: |
openhands-tools/**
tests/tools/**
pyproject.toml
uv.lock
.github/workflows/tests.yml
- name: Install uv
if: steps.changed.outputs.any_changed == 'true'
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps
if: steps.changed.outputs.any_changed == 'true'
run: uv sync --frozen --group dev
- name: Run tools tests with coverage
if: steps.changed.outputs.any_changed == 'true'
run: |
# Clean up any existing coverage file
rm -f .coverage
# Use --forked for tools tests due to terminal test conflicts
# when running in parallel (shared /tmp paths, subprocess management)
CI=true uv run python -m pytest -vvs \
--forked \
--cov=openhands-tools \
--cov-report=term-missing \
--cov-fail-under=0 \
--cov-config=pyproject.toml \
tests/tools
# Rename coverage file for upload
if [ -f .coverage ]; then
mv .coverage coverage-tools.dat
echo "Tools coverage file prepared for upload"
fi
- name: Upload tools coverage
if: steps.changed.outputs.any_changed == 'true' && always()
uses: actions/upload-artifact@v7
with:
name: coverage-tools
path: coverage-tools.dat
if-no-files-found: warn
agent-server-tests:
runs-on: blacksmith-2vcpu-ubuntu-2404
steps:
- name: Checkout
uses: actions/checkout@v5
with: {fetch-depth: 0}
- name: Detect Agent Server changes
id: changed
uses: tj-actions/changed-files@v47
with:
files: |
openhands-agent-server/**
tests/agent_server/**
pyproject.toml
uv.lock
.github/workflows/tests.yml
- name: Install uv
if: steps.changed.outputs.any_changed == 'true'
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps
if: steps.changed.outputs.any_changed == 'true'
run: uv sync --frozen --group dev
- name: Run Agent Server tests with coverage
if: steps.changed.outputs.any_changed == 'true'
run: |
# Clean up any existing coverage file
rm -f .coverage
# Use pytest-xdist (-n auto) for parallel execution with proper
# coverage collection. --forked is not used here because it prevents
# coverage from being collected in child processes.
CI=true uv run python -m pytest -vvs \
-n auto \
--cov=openhands-agent-server \
--cov-report=term-missing \
--cov-fail-under=0 \
--cov-config=pyproject.toml \
tests/agent_server
# Rename coverage file for upload
if [ -f .coverage ]; then
mv .coverage coverage-agent-server.dat
echo "Agent Server coverage file prepared for upload"
fi
- name: Upload Agent Server coverage
if: steps.changed.outputs.any_changed == 'true' && always()
uses: actions/upload-artifact@v7
with:
name: coverage-agent-server
path: coverage-agent-server.dat
if-no-files-found: warn
cross-tests:
runs-on: blacksmith-2vcpu-ubuntu-2404
steps:
- name: Checkout
uses: actions/checkout@v5
with: {fetch-depth: 0}
- name: Detect cross changes
id: changed
uses: tj-actions/changed-files@v47
with:
files: |
tests/**
openhands/**
pyproject.toml
uv.lock
.github/workflows/tests.yml
- name: Install uv
if: steps.changed.outputs.any_changed == 'true'
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps
if: steps.changed.outputs.any_changed == 'true'
run: uv sync --frozen --group dev
- name: Run cross tests with coverage
if: steps.changed.outputs.any_changed == 'true'
run: |
# Clean up any existing coverage file
rm -f .coverage
CI=true uv run python -m pytest -vvs \
--basetemp="${{ runner.temp }}/pytest" \
-o tmp_path_retention=none \
-o tmp_path_retention_count=0 \
--cov=openhands \
--cov-report=term-missing \
--cov-fail-under=0 \
--cov-config=pyproject.toml \
tests/cross
# Rename coverage file for upload
if [ -f .coverage ]; then
mv .coverage coverage-cross.dat
echo "Cross coverage file prepared for upload"
fi
- name: Upload cross coverage
if: steps.changed.outputs.any_changed == 'true' && always()
uses: actions/upload-artifact@v7
with:
name: coverage-cross
path: coverage-cross.dat
if-no-files-found: warn
coverage-report:
runs-on: blacksmith-2vcpu-ubuntu-2404
needs: [sdk-tests, tools-tests, agent-server-tests, cross-tests]
if: always() && github.event_name == 'pull_request'
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps (for coverage CLI)
run: uv sync --frozen --group dev
- name: Download coverage artifacts
uses: actions/download-artifact@v8
with:
path: ./cov
continue-on-error: true
- name: Combine coverage data
run: |
shopt -s nullglob
# The upload step does not reliably upload the original .coverage* files
# (upload-artifact skips hidden files by default), so they are uploaded as .dat.
# Convert the uploaded .dat files back to .coverage.* names for the coverage tool
for dat_file in cov/**/coverage-*.dat; do
if [[ "$dat_file" == *coverage-sdk.dat ]]; then
cp "$dat_file" .coverage.sdk
elif [[ "$dat_file" == *coverage-tools.dat ]]; then
cp "$dat_file" .coverage.tools
elif [[ "$dat_file" == *coverage-agent-server.dat ]]; then
cp "$dat_file" .coverage.agent-server
elif [[ "$dat_file" == *coverage-cross.dat ]]; then
cp "$dat_file" .coverage.cross
fi
done
# Check if we have any coverage files
coverage_files=(.coverage.*)
if [ ${#coverage_files[@]} -eq 0 ]; then
echo "No coverage files found; skipping combined report."
exit 0
fi
echo "Found ${#coverage_files[@]} coverage files"
uv run coverage combine
uv run coverage xml -i -o coverage.xml
uv run coverage report -m
- name: Pytest coverage PR comment
if: always()
continue-on-error: true
uses: MishaKav/pytest-coverage-comment@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
pytest-xml-coverage-path: coverage.xml
title: Coverage Report
create-new-comment: false
hide-report: false
xml-skip-covered: true
report-only-changed-files: true
remove-links-to-files: true
remove-links-to-lines: true
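Note on the "Combine coverage data" step of this workflow: the if/elif chain maps each uploaded `coverage-<suite>.dat` artifact back to the `.coverage.<suite>` name that `coverage combine` picks up. The same mapping can be sketched with shell parameter expansion, so a new suite needs no extra branch. The function name `coverage_target` is a hypothetical helper, not something the workflow defines.

```bash
#!/usr/bin/env bash
# Sketch only: coverage_target is a hypothetical helper that derives the
# ".coverage.<suite>" file name from an uploaded "coverage-<suite>.dat" path.
coverage_target() {
  local base
  base="$(basename "$1")"
  case "$base" in
    coverage-*.dat)
      base="${base#coverage-}"                 # strip the "coverage-" prefix
      printf '.coverage.%s\n' "${base%.dat}"   # strip ".dat", add ".coverage."
      ;;
    *)
      return 1                                 # not a coverage artifact
      ;;
  esac
}

# Usage: prints the target name for one downloaded artifact path.
coverage_target "cov/coverage-agent-server/coverage-agent-server.dat"
```

In the workflow this would be used as `cp "$dat_file" "$(coverage_target "$dat_file")"` inside the existing loop. The explicit if/elif chain in the original has the advantage of ignoring unexpected artifact names.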
+322
@@ -0,0 +1,322 @@
---
# Automated TODO Management Workflow
#
# This workflow automatically scans for TODO(openhands) comments and creates
# pull requests to implement them using the OpenHands agent.
#
# Setup:
# 1. Add LLM_API_KEY to repository secrets
# 2. Ensure GITHUB_TOKEN has appropriate permissions
# 3. Make sure GitHub Actions are allowed to create and review PRs
# 4. Commit this file to .github/workflows/ in your repository
# 5. Configure the schedule or trigger manually
name: Automated TODO Management
on:
# Manual trigger
workflow_dispatch:
inputs:
max_todos:
description: Maximum number of TODOs to process in this run
required: false
default: '3'
type: string
todo_identifier:
description: TODO identifier to search for (e.g., TODO(openhands))
required: false
default: TODO(openhands)
type: string
# Trigger when 'automatic-todo' label is added to a PR
pull_request:
types: [labeled]
# Scheduled trigger (disabled by default, uncomment and customize as needed)
# schedule:
# # Run every Monday at 9 AM UTC
# - cron: "0 9 * * 1"
permissions:
contents: write
pull-requests: write
issues: write
jobs:
scan-todos:
runs-on: ubuntu-24.04
# Only run if triggered manually or if 'automatic-todo' label was added
if: >
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'pull_request' &&
github.event.label.name == 'automatic-todo')
outputs:
todos: ${{ steps.scan.outputs.todos }}
todo-count: ${{ steps.scan.outputs.todo-count }}
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0 # Full history for better context
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Copy TODO scanner
run: |
cp examples/03_github_workflows/03_todo_management/scanner.py /tmp/scanner.py
chmod +x /tmp/scanner.py
- name: Scan for TODOs
id: scan
run: |
echo "Scanning for TODO comments..."
# Run the scanner and capture output
TODO_IDENTIFIER="${{ github.event.inputs.todo_identifier || 'TODO(openhands)' }}"
python /tmp/scanner.py . --identifier "$TODO_IDENTIFIER" > todos.json
# Count TODOs
TODO_COUNT=$(python -c \
"import json; data=json.load(open('todos.json')); print(len(data))")
echo "Found $TODO_COUNT $TODO_IDENTIFIER items"
# Limit the number of TODOs to process
MAX_TODOS="${{ github.event.inputs.max_todos || '3' }}"
if [ "$TODO_COUNT" -gt "$MAX_TODOS" ]; then
echo "Limiting to first $MAX_TODOS TODOs"
python -c "
import json
data = json.load(open('todos.json'))
limited = data[:$MAX_TODOS]
json.dump(limited, open('todos.json', 'w'), indent=2)
"
TODO_COUNT=$MAX_TODOS
fi
# Set outputs
echo "todos=$(cat todos.json | jq -c .)" >> $GITHUB_OUTPUT
echo "todo-count=$TODO_COUNT" >> $GITHUB_OUTPUT
# Display found TODOs
echo "## 📋 Found TODOs" >> $GITHUB_STEP_SUMMARY
if [ "$TODO_COUNT" -eq 0 ]; then
echo "No TODO(openhands) comments found." >> $GITHUB_STEP_SUMMARY
else
echo "Found $TODO_COUNT TODO(openhands) items:" \
>> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
python -c "
import json
data = json.load(open('todos.json'))
for i, todo in enumerate(data, 1):
print(f'{i}. **{todo[\"file\"]}:{todo[\"line\"]}** - ' +
f'{todo[\"description\"]}')
" >> $GITHUB_STEP_SUMMARY
fi
process-todos:
needs: scan-todos
if: needs.scan-todos.outputs.todo-count > 0
runs-on: ubuntu-24.04
strategy:
matrix:
todo: ${{ fromJson(needs.scan-todos.outputs.todos) }}
max-parallel: 1 # Process one TODO at a time to avoid conflicts
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0
token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
- name: Switch to feature branch with TODO management files
run: |
git checkout openhands/todo-management-example
git pull origin openhands/todo-management-example
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install OpenHands dependencies
run: |
# Install OpenHands SDK and tools from git repository
uv pip install --system "openhands-sdk @ git+https://github.com/OpenHands/agent-sdk.git@main#subdirectory=openhands-sdk"
uv pip install --system "openhands-tools @ git+https://github.com/OpenHands/agent-sdk.git@main#subdirectory=openhands-tools"
- name: Copy agent files
run: |
cp examples/03_github_workflows/03_todo_management/agent_script.py agent.py
cp examples/03_github_workflows/03_todo_management/prompt.py prompt.py
chmod +x agent.py
- name: Configure Git
run: |
git config --global user.name "openhands-bot"
git config --global user.email \
"openhands-bot@users.noreply.github.com"
- name: Process TODO
env:
LLM_MODEL: litellm_proxy/claude-sonnet-4-5-20250929
LLM_BASE_URL: https://llm-proxy.app.all-hands.dev
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
GITHUB_TOKEN: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
GITHUB_REPOSITORY: ${{ github.repository }}
TODO_FILE: ${{ matrix.todo.file }}
TODO_LINE: ${{ matrix.todo.line }}
TODO_DESCRIPTION: ${{ matrix.todo.description }}
PYTHONPATH: ''
run: |
echo "Processing TODO: $TODO_DESCRIPTION"
echo "File: $TODO_FILE:$TODO_LINE"
# Create a unique branch name for this TODO
BRANCH_NAME="todo/$(echo "$TODO_DESCRIPTION" | \
sed 's/[^a-zA-Z0-9]/-/g' | \
sed 's/--*/-/g' | \
sed 's/^-\|-$//g' | \
tr '[:upper:]' '[:lower:]' | \
cut -c1-50)"
echo "Branch name: $BRANCH_NAME"
# Create and switch to new branch (force create if exists)
git checkout -B "$BRANCH_NAME"
# Run the agent to process the TODO
# Stay in repository directory for git operations
# Create JSON payload for the agent
TODO_JSON=$(cat <<EOF
{
"file": "$TODO_FILE",
"line": $TODO_LINE,
"description": "$TODO_DESCRIPTION"
}
EOF
)
echo "JSON payload for agent:"
echo "$TODO_JSON"
# Debug environment and setup
echo "Current working directory: $(pwd)"
echo "Environment variables:"
echo " LLM_MODEL: $LLM_MODEL"
echo " LLM_BASE_URL: $LLM_BASE_URL"
echo " GITHUB_REPOSITORY: $GITHUB_REPOSITORY"
echo " LLM_API_KEY: ${LLM_API_KEY:+[SET]}"
echo " GITHUB_TOKEN: ${GITHUB_TOKEN:+[SET]}"
echo "Available files:"
ls -la
# Run the agent with detailed logging
echo "Starting agent execution..."
set +e # Don't exit on error, we want to capture it
uv run python agent.py "$TODO_JSON" 2>&1 | tee agent_output.log
# Use PIPESTATUS to capture the agent's exit status rather than tee's
AGENT_EXIT_CODE=${PIPESTATUS[0]}
set -e
echo "Agent exit code: $AGENT_EXIT_CODE"
echo "Agent output log:"
cat agent_output.log
# Show files in working directory
echo "Files in working directory:"
ls -la
# If agent failed, show more details
if [ $AGENT_EXIT_CODE -ne 0 ]; then
echo "Agent failed with exit code $AGENT_EXIT_CODE"
echo "Last 50 lines of agent output:"
tail -50 agent_output.log
exit $AGENT_EXIT_CODE
fi
# Check if any changes were made
cd "$GITHUB_WORKSPACE"
if git diff --quiet; then
echo "No changes made by agent, skipping PR creation"
exit 0
fi
# Commit changes
git add -A
git commit -m "Implement TODO: $TODO_DESCRIPTION
Automatically implemented by OpenHands agent.
Co-authored-by: openhands <openhands@all-hands.dev>"
# Push branch
git push origin "$BRANCH_NAME"
# Create pull request
PR_TITLE="Implement TODO: $TODO_DESCRIPTION"
PR_BODY="## 🤖 Automated TODO Implementation
This PR automatically implements the following TODO:
**File:** \`$TODO_FILE:$TODO_LINE\`
**Description:** $TODO_DESCRIPTION
### Implementation
The OpenHands agent has analyzed the TODO and implemented the
requested functionality.
### Review Notes
- Please review the implementation for correctness
- Test the changes in your development environment
- The original TODO comment will be updated with this PR URL
once merged
---
*This PR was created automatically by the TODO Management workflow.*"
# Create the PR using the GitHub REST API
curl -X POST \
-H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
"https://api.github.com/repos/${{ github.repository }}/pulls" \
-d "{
\"title\": \"$PR_TITLE\",
\"body\": \"$PR_BODY\",
\"head\": \"$BRANCH_NAME\",
\"base\": \"${{ github.ref_name }}\"
}"
summary:
needs: [scan-todos, process-todos]
if: always()
runs-on: ubuntu-24.04
steps:
- name: Generate Summary
run: |
echo "# 🤖 TODO Management Summary" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
TODO_COUNT="${{ needs.scan-todos.outputs.todo-count || '0' }}"
echo "**TODOs Found:** $TODO_COUNT" >> $GITHUB_STEP_SUMMARY
if [ "$TODO_COUNT" -gt 0 ]; then
echo "**Processing Status:** ✅ Completed" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "Check the pull requests created for each TODO" \
"implementation." >> $GITHUB_STEP_SUMMARY
else
echo "**Status:** No TODOs found to process" \
>> $GITHUB_STEP_SUMMARY
fi
echo "" >> $GITHUB_STEP_SUMMARY
echo "---" >> $GITHUB_STEP_SUMMARY
echo "*Workflow completed at $(date)*" >> $GITHUB_STEP_SUMMARY
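Note on the branch naming in the "Process TODO" step of this workflow: the TODO description is turned into a branch-safe slug by replacing non-alphanumeric characters with hyphens, collapsing repeated hyphens, trimming hyphens at both ends, lowercasing, and truncating to 50 characters. A minimal sketch of that pipeline as a reusable function follows. The name `slugify` is a hypothetical helper, and the trim step is written as two portable `sed` expressions instead of the GNU-specific `\|` alternation.

```bash
#!/usr/bin/env bash
# Sketch only: slugify is a hypothetical helper that mirrors the branch-name
# pipeline from the "Process TODO" step.
slugify() {
  printf '%s' "$1" \
    | sed 's/[^a-zA-Z0-9]/-/g' \
    | sed 's/--*/-/g' \
    | sed 's/^-//; s/-$//' \
    | tr '[:upper:]' '[:lower:]' \
    | cut -c1-50
}

# Usage: build a branch name from a TODO description.
BRANCH_NAME="todo/$(slugify "Fix the Foo: bar!")"
echo "Branch name: $BRANCH_NAME"
```

Because truncation happens last, a slug cut at 50 characters can still end in a hyphen. Distinct descriptions that share their first 50 characters also map to the same branch, which the workflow tolerates by force-creating the branch with `git checkout -B`.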
-34
@@ -1,34 +0,0 @@
name: Run UI Component Build
# * Always run on "main"
# * Run on PRs that have changes in the "openhands-ui" folder or this workflow
on:
push:
branches:
- main
pull_request:
paths:
- 'openhands-ui/**'
- '.github/workflows/ui-build.yml'
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
ui-build:
name: Build openhands-ui
runs-on: blacksmith-4vcpu-ubuntu-2204
steps:
- name: Checkout
uses: actions/checkout@v4
- uses: oven-sh/setup-bun@v2
with:
bun-version-file: "openhands-ui/.bun-version"
- name: Install dependencies
working-directory: ./openhands-ui
run: bun install --frozen-lockfile
- name: Build package
working-directory: ./openhands-ui
run: bun run build
+25
@@ -0,0 +1,25 @@
---
name: Version bump guard
on:
pull_request:
branches: [main]
jobs:
version-bump-guard:
name: Check package versions
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- name: Checkout
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Validate package version changes
env:
VERSION_BUMP_BASE_REF: ${{ github.base_ref }}
PR_TITLE: ${{ github.event.pull_request.title }}
PR_HEAD_REF: ${{ github.event.pull_request.head.ref }}
run: python3 .github/scripts/check_version_bumps.py
+346
@@ -0,0 +1,346 @@
---
name: Create Version Bump PRs
on:
# Triggered by pypi-release workflow after successful publish
# Note: No branches filter - releases run on tags (e.g., v1.11.4), not branches
workflow_run:
workflows: [Publish all OpenHands packages (uv)]
types: [completed]
# Allow manual trigger with version input
workflow_dispatch:
inputs:
version:
description: Version to bump to (e.g., 1.11.3)
required: true
type: string
jobs:
create-version-bump-prs:
runs-on: ubuntu-24.04
# Only run on successful workflow_run or manual dispatch
if: >
github.event_name == 'workflow_dispatch' ||
(github.event.workflow_run.conclusion == 'success' &&
github.event.workflow_run.event == 'release')
env:
GH_TOKEN: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Get version from release or input
id: get_version
run: |
if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
VERSION="${{ github.event.inputs.version }}"
else
# Get the version from the latest release, which is assumed to be the
# release that triggered the workflow_run
RELEASE_TAG=$(gh api repos/${{ github.repository }}/releases/latest --jq '.tag_name')
VERSION="${RELEASE_TAG#v}" # Remove 'v' prefix
fi
echo "version=$VERSION" >> $GITHUB_OUTPUT
echo "📦 Version: $VERSION"
- name: Validate version
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
if [ -z "$VERSION" ]; then
echo "❌ Version is empty"
exit 1
fi
echo "📦 Creating version bump PRs for version: $VERSION"
- name: Wait for packages to be available on PyPI
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
set -euo pipefail
PACKAGES=(
openhands-sdk
openhands-tools
openhands-workspace
openhands-agent-server
)
MAX_ATTEMPTS=60
SLEEP_SECONDS=20
echo "⏳ Waiting for packages to be available on PyPI..."
for PKG in "${PACKAGES[@]}"; do
echo "Checking $PKG==$VERSION..."
ATTEMPT=1
while [ $ATTEMPT -le $MAX_ATTEMPTS ]; do
# Check if the package version is available on PyPI
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
"https://pypi.org/pypi/$PKG/$VERSION/json")
if [ "$HTTP_CODE" = "200" ]; then
echo "✅ $PKG==$VERSION is available on PyPI"
break
fi
echo " Attempt $ATTEMPT/$MAX_ATTEMPTS: $PKG==$VERSION not yet available (HTTP $HTTP_CODE), waiting ${SLEEP_SECONDS}s..."
sleep $SLEEP_SECONDS
ATTEMPT=$((ATTEMPT + 1))
done
if [ $ATTEMPT -gt $MAX_ATTEMPTS ]; then
echo "❌ Timeout waiting for $PKG==$VERSION to be available on PyPI"
exit 1
fi
done
echo "✅ All packages are available on PyPI!"
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.12'
- name: Install Poetry
run: |
pipx install poetry==2.2.1
# OpenHands-CLI step runs first since it's simpler and less error-prone
- name: Create PR for OpenHands-CLI repo
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
set -euo pipefail
REPO="OpenHands/openhands-cli"
BRANCH="bump-sdk-$VERSION"
echo "🔄 Creating PR for $REPO..."
# Clone the repo
git clone "https://x-access-token:${GH_TOKEN}@github.com/${REPO}.git" openhands-cli-repo
cd openhands-cli-repo
# Configure git
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
# Check if branch already exists on remote
if git ls-remote --heads origin "$BRANCH" | grep -q "$BRANCH"; then
echo "⚠️ Branch $BRANCH already exists, checking out existing branch"
git fetch origin "$BRANCH"
git checkout "$BRANCH"
else
# Create branch
git checkout -b "$BRANCH"
fi
# OpenHands-CLI currently requires Python 3.12, so resolve with that interpreter.
uv add --python 3.12 --refresh \
"openhands-sdk==$VERSION" \
"openhands-tools==$VERSION"
# Check if there are changes
if git diff --quiet; then
echo "⚠️ No changes detected in $REPO - versions may already be up to date"
exit 0
fi
# Commit and push
git add pyproject.toml uv.lock
git commit -m "Bump openhands-sdk, openhands-tools to $VERSION" \
-m "Automated version bump after PyPI release." \
-m "Co-authored-by: openhands <openhands@all-hands.dev>"
git push -u origin "$BRANCH"
# Check if PR already exists
EXISTING_PR=$(gh pr list --repo "$REPO" --head "$BRANCH" --json number --jq '.[0].number')
if [ -n "$EXISTING_PR" ]; then
echo "✅ PR #$EXISTING_PR already exists for $REPO"
else
# Create PR
gh pr create \
--repo "$REPO" \
--title "Bump SDK packages to v$VERSION" \
--body "## Automated Version Bump
This PR updates the following packages to version **$VERSION**:
- \`openhands-sdk\`
- \`openhands-tools\`
**Triggered by:** Release of [software-agent-sdk v$VERSION](https://github.com/OpenHands/software-agent-sdk/releases/tag/v$VERSION)
---
_This PR was automatically created by the version-bump-prs workflow._" \
--base main \
--head "$BRANCH"
echo "✅ PR created for $REPO"
fi
- name: Create PR for OpenHands repo
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
set -euo pipefail
REPO="All-Hands-AI/OpenHands"
BRANCH="bump-sdk-$VERSION"
echo "🔄 Creating PR for $REPO..."
# Clone the repo
git clone "https://x-access-token:${GH_TOKEN}@github.com/${REPO}.git" openhands-repo
cd openhands-repo
# Configure git
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
# Check if branch already exists on remote
if git ls-remote --heads origin "$BRANCH" | grep -q "$BRANCH"; then
echo "⚠️ Branch $BRANCH already exists, checking out existing branch"
git fetch origin "$BRANCH"
git checkout "$BRANCH"
else
# Create branch
git checkout -b "$BRANCH"
fi
# 0. Pre-flight check before touching the root pyproject.toml and poetry.lock
# Note: enterprise/pyproject.toml gets these dependencies transitively via openhands-ai
echo "📝 Updating root pyproject.toml and poetry.lock..."
# Verify enterprise/pyproject.toml does NOT have SDK packages explicitly listed
# If they exist there, they will become stale since we only update root pyproject.toml
if [ -f "enterprise/pyproject.toml" ]; then
echo "🔍 Verifying enterprise/pyproject.toml doesn't have explicit SDK packages..."
SDK_PACKAGES=("openhands-sdk" "openhands-tools" "openhands-agent-server")
for pkg in "${SDK_PACKAGES[@]}"; do
# Match package name as a TOML key (with optional leading whitespace) followed by =
# This catches both 'openhands-sdk = "1.2.3"' and 'openhands-sdk="1.2.3"'
if grep -qE "^[[:space:]]*${pkg}[[:space:]]*=" enterprise/pyproject.toml; then
echo "❌ ERROR: enterprise/pyproject.toml contains explicit reference to '$pkg'"
echo " These packages should come transitively via openhands-ai dependency."
echo " Please remove '$pkg' from enterprise/pyproject.toml to avoid version drift."
exit 1
fi
done
echo "✅ enterprise/pyproject.toml does not have explicit SDK packages"
fi
# 1. Update versions in pyproject.toml using sed for exact pinning
# Note: We use sed instead of `poetry add --lock` because Poetry normalizes
# version constraints (e.g., "==1.13.1" becomes "1.13") which causes
# inconsistencies between [tool.poetry.dependencies] and [project].dependencies
echo "📝 Updating pyproject.toml with exact version pins..."
# Update [tool.poetry.dependencies] section
# Matches: openhands-sdk = "1.13" or openhands-sdk = "1.13.0"
sed -i -E 's/^(openhands-sdk = )"[^"]*"/\1"'"$VERSION"'"/' pyproject.toml
sed -i -E 's/^(openhands-tools = )"[^"]*"/\1"'"$VERSION"'"/' pyproject.toml
sed -i -E 's/^(openhands-agent-server = )"[^"]*"/\1"'"$VERSION"'"/' pyproject.toml
# Update [project].dependencies section (PEP 621 format)
# Matches: "openhands-sdk==1.13.1", or "openhands-sdk==1.13",
sed -i -E 's/"openhands-sdk==[^"]*"/"openhands-sdk=='"$VERSION"'"/' pyproject.toml
sed -i -E 's/"openhands-tools==[^"]*"/"openhands-tools=='"$VERSION"'"/' pyproject.toml
sed -i -E 's/"openhands-agent-server==[^"]*"/"openhands-agent-server=='"$VERSION"'"/' pyproject.toml
echo "✅ Updated pyproject.toml"
# 2. Regenerate poetry.lock with the new versions
# Note: In Poetry 2.x, the default behavior is to not update packages already
# in the lock file (the --no-update flag was removed in Poetry 2.x)
echo "📝 Regenerating poetry.lock..."
poetry lock
# 3. Update the version in sandbox_spec_service.py
echo "🔧 Updating AGENT_SERVER_IMAGE..."
SANDBOX_SPEC_FILE="openhands/app_server/sandbox/sandbox_spec_service.py"
if [ -f "$SANDBOX_SPEC_FILE" ]; then
# Update the AGENT_SERVER_IMAGE line with the new version tag
sed -i "s|AGENT_SERVER_IMAGE = 'ghcr.io/openhands/agent-server:[^']*'|AGENT_SERVER_IMAGE = 'ghcr.io/openhands/agent-server:${VERSION}-python'|" "$SANDBOX_SPEC_FILE"
echo "✅ Updated AGENT_SERVER_IMAGE to: ghcr.io/openhands/agent-server:${VERSION}-python"
else
echo "❌ sandbox_spec_service.py not found at expected path"
exit 1
fi
# 4. Run pre-commit to fix formatting (pyproject-fmt removes parentheses from version specs)
echo "🔧 Running pre-commit to fix formatting..."
pip install pre-commit
pre-commit run --files pyproject.toml --config ./dev_config/python/.pre-commit-config.yaml || true
# Check if there are changes
if git diff --quiet; then
echo "⚠️ No changes detected in $REPO - versions may already be up to date"
exit 0
fi
# Commit and push
git add .
git commit -m "Bump openhands-sdk, openhands-tools, openhands-agent-server to $VERSION" \
-m "Automated version bump after PyPI release." \
-m "" \
-m "Changes:" \
-m "- Updated SDK packages to v$VERSION in pyproject.toml" \
-m "- Regenerated poetry.lock" \
-m "- Updated AGENT_SERVER_IMAGE to ${VERSION}" \
-m "" \
-m "Co-authored-by: openhands <openhands@all-hands.dev>"
git push -u origin "$BRANCH"
# Check if PR already exists
EXISTING_PR=$(gh pr list --repo "$REPO" --head "$BRANCH" --json number --jq '.[0].number')
if [ -n "$EXISTING_PR" ]; then
echo "✅ PR #$EXISTING_PR already exists for $REPO"
else
# Create PR
gh pr create \
--repo "$REPO" \
--title "Bump SDK packages to v$VERSION" \
--body "## Automated Version Bump
This PR updates the following packages to version **$VERSION**:
- \`openhands-sdk\`
- \`openhands-tools\`
- \`openhands-agent-server\`
### Changes
- Updated SDK packages in \`pyproject.toml\`
- Regenerated \`poetry.lock\`
- Updated \`AGENT_SERVER_IMAGE\` to \`${VERSION}\` in \`sandbox_spec_service.py\`
**Triggered by:** Release of [software-agent-sdk v$VERSION](https://github.com/OpenHands/software-agent-sdk/releases/tag/v$VERSION)
---
_This PR was automatically created by the version-bump-prs workflow._" \
--base main \
--head "$BRANCH"
echo "✅ PR created for $REPO"
fi
- name: Summary
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
echo "## ✅ Version Bump PRs Created" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "PRs have been created to bump SDK packages to version **$VERSION**:" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "- [OpenHands](https://github.com/All-Hands-AI/OpenHands/pulls?q=is%3Apr+bump-sdk-$VERSION)" >> $GITHUB_STEP_SUMMARY
echo "- [OpenHands-CLI](https://github.com/OpenHands/openhands-cli/pulls?q=is%3Apr+bump-sdk-$VERSION)" >> $GITHUB_STEP_SUMMARY
- name: Notify Slack
uses: slackapi/slack-github-action@v2.1.1
with:
method: chat.postMessage
token: ${{ secrets.SLACK_BOT_TOKEN }}
payload: |
channel: C08E1SYKEM9
text: "🚀 *SDK v${{ steps.get_version.outputs.version }} published to PyPI!*\n\nVersion bump PRs created:\n• <https://github.com/All-Hands-AI/OpenHands/pulls?q=is%3Apr+bump-sdk-${{ steps.get_version.outputs.version }}|OpenHands>\n• <https://github.com/OpenHands/openhands-cli/pulls?q=is%3Apr+bump-sdk-${{ steps.get_version.outputs.version }}|OpenHands-CLI>\n\n<https://github.com/OpenHands/software-agent-sdk/releases/tag/v${{ steps.get_version.outputs.version }}|View Release>"
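The exact-pinning step in the OpenHands job rewrites two dependency styles with sed instead of `poetry add --lock`. A minimal sketch of that rewrite, run against a throwaway file, could look like the following. The function name `pin_sdk_versions`, the sample file contents, and the version `1.14.0` are assumptions made for illustration and are not taken from the workflow.

```shell
# Sketch: pin SDK packages to one exact version in a pyproject.toml-style file.
# pin_sdk_versions, the sample file, and version "1.14.0" are illustrative assumptions.
pin_sdk_versions() {
  version="$1"
  file="$2"
  for pkg in openhands-sdk openhands-tools openhands-agent-server; do
    # First expression: Poetry table entries such as  openhands-sdk = "1.13"
    # Second expression: PEP 621 entries such as      "openhands-sdk==1.13.1",
    sed -E \
      -e 's/^('"$pkg"' = )"[^"]*"/\1"'"$version"'"/' \
      -e 's/"'"$pkg"'==[^"]*"/"'"$pkg"'=='"$version"'"/' \
      "$file" > "$file.tmp"
    mv "$file.tmp" "$file"
  done
}

# Demo on a throwaway file that mixes both dependency styles.
demo_file="$(mktemp)"
cat > "$demo_file" <<'EOF'
[project]
dependencies = [
  "openhands-sdk==1.13.1",
  "openhands-tools==1.13",
]

[tool.poetry.dependencies]
openhands-sdk = "1.13"
openhands-agent-server = "1.13.0"
EOF
pin_sdk_versions "1.14.0" "$demo_file"
cat "$demo_file"
```

The workflow edits the file in place with `sed -i`; the sketch writes through a temporary file instead so that it behaves the same with GNU and BSD sed.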
@@ -1,51 +0,0 @@
name: Welcome Good First Issue
on:
issues:
types: [labeled]
permissions:
issues: write
jobs:
comment-on-good-first-issue:
if: github.event.label.name == 'good first issue'
runs-on: ubuntu-latest
steps:
- name: Check if welcome comment already exists
id: check_comment
uses: actions/github-script@v7
with:
result-encoding: string
script: |
const issueNumber = context.issue.number;
const comments = await github.rest.issues.listComments({
...context.repo,
issue_number: issueNumber
});
const alreadyCommented = comments.data.some(
(comment) =>
comment.body.includes('<!-- auto-comment:good-first-issue -->')
);
return alreadyCommented ? 'true' : 'false';
- name: Leave welcome comment
if: steps.check_comment.outputs.result == 'false'
uses: actions/github-script@v7
with:
script: |
const repoUrl = `https://github.com/${context.repo.owner}/${context.repo.repo}`;
await github.rest.issues.createComment({
...context.repo,
issue_number: context.issue.number,
body: "🙌 **Hey there, future contributor!** 🙌\n\n" +
"This issue has been labeled as **good first issue**, which means it's a great place to get started with the OpenHands project.\n\n" +
"If you're interested in working on it, feel free to! No need to ask for permission.\n\n" +
"Be sure to check out our [development setup guide](" + repoUrl + "/blob/main/Development.md) to get your environment set up, and follow our [contribution guidelines](" + repoUrl + "/blob/main/CONTRIBUTING.md) when you're ready to submit a fix.\n\n" +
"Feel free to join our developer community on [Slack](https://openhands.dev/joinslack). You can ask for [help](https://openhands-ai.slack.com/archives/C078L0FUGUX), [feedback](https://openhands-ai.slack.com/archives/C086ARSNMGA), and even ask for a [PR review](https://openhands-ai.slack.com/archives/C08D8FJ5771).\n\n" +
"🙌 Happy hacking! 🙌\n\n" +
"<!-- auto-comment:good-first-issue -->"
});
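The removed workflow avoids duplicate welcome comments by embedding a hidden HTML marker in the comment body and searching existing comments for it. The same idempotency check can be sketched in shell; the function name `check_comment` and the one-comment-per-line input format are assumptions for illustration only.

```shell
# Sketch: marker-based idempotency check.
# Assumption: each existing comment body arrives as one line on stdin.
MARKER='<!-- auto-comment:good-first-issue -->'

# Prints "true" if any comment already carries the marker, otherwise "false".
check_comment() {
  if grep -qF -- "$MARKER"; then
    echo "true"
  else
    echo "false"
  fi
}

printf '%s\n' 'Nice issue!' "Welcome aboard $MARKER" | check_comment
```

Because the marker is an HTML comment, it stays invisible in the rendered comment while remaining searchable in the raw body.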
+16 -63
@@ -14,7 +14,7 @@ dist/
downloads/
eggs/
.eggs/
./lib/
lib/
lib64/
parts/
sdist/
@@ -31,7 +31,6 @@ requirements.txt
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
@@ -57,6 +56,7 @@ cover/
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
@@ -85,7 +85,6 @@ ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
.python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
@@ -121,7 +120,6 @@ celerybeat.pid
# Environments
.env
frontend/.env
.venv
env/
venv/
@@ -129,7 +127,6 @@ ENV/
env.bak/
.env.bak
venv.bak/
*venv/
# Spyder project settings
.spyderproject
@@ -166,7 +163,6 @@ cython_debug/
# https://stackoverflow.com/questions/32964920/should-i-commit-the-vscode-folder-to-source-control
.vscode/**/*
!.vscode/extensions.json
!.vscode/settings.json
!.vscode/tasks.json
# VS Code extensions/forks:
@@ -185,42 +181,6 @@ cython_debug/
.repomix
repomix-output.txt
# Emacs backup
*~
# evaluation
evaluation/evaluation_outputs
evaluation/outputs
evaluation/swe_bench/eval_workspace*
evaluation/SWE-bench/data
evaluation/webarena/scripts/webarena_env.sh
evaluation/bird/data
evaluation/gaia/data
evaluation/gorilla/data
evaluation/toolqa/data
evaluation/scienceagentbench/benchmark
evaluation/commit0_bench/repos
# openhands resolver
output/
# frontend
# dependencies
frontend/.pnp
frontend/bun.lockb
frontend/yarn.lock
.pnp.js
# testing
frontend/coverage
test_results*
/_test_files_tmp/
# production
frontend/build
frontend/dist
# misc
.DS_Store
.env.local
@@ -236,29 +196,22 @@ logs
# agent
.envrc
/workspace
/_test_workspace
/debug
cache
.jinja_cache/
# configuration
config.toml
config.toml_
config.toml.bak
.conversations*
/workspace/
openapi.json
.client/
# swe-bench-eval
image_build_logs
run_instance_logs
# Local workspace files
.beads/*.db
.worktrees/
agent-sdk.workspace.code-workspace
runtime_*.tar
# Integration test outputs
tests/integration/outputs/
tests/integration/api_compliance/outputs/
# docker build
containers/runtime/Dockerfile
containers/runtime/project.tar.gz
containers/runtime/code
**/node_modules/
# test results
test-results
.sessions
.eval_sessions
# Agent-generated temp
.agent_tmp/
-1
@@ -1 +0,0 @@
22
+14
@@ -0,0 +1,14 @@
{
"stop": [
{
"matcher": "*",
"hooks": [
{
"type": "command",
"command": ".openhands/hooks/on_stop.sh",
"timeout": 600
}
]
}
]
}
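The configuration above registers a command hook for the `stop` event. According to the header comment of `on_stop.sh`, such a hook can block the stop by exiting with code 2 and can return feedback as JSON on stdout. A minimal sketch of that contract, without the jq dependency, might look like this. The helper names `json_escape` and `emit_decision` are hypothetical, and the escaping only covers backslashes, double quotes, and newlines.

```shell
# Sketch of the stop-hook contract: JSON decision on stdout, status 2 to block.
# json_escape and emit_decision are hypothetical helper names.

# Escape backslashes, double quotes, and newlines for use inside a JSON string.
# Assumption: the text contains no other control characters.
json_escape() {
  first=1
  printf '%s\n' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | while IFS= read -r line; do
        # Join lines with a literal backslash-n sequence.
        [ "$first" -eq 1 ] || printf '%s' '\n'
        first=0
        printf '%s' "$line"
      done
}

# $1: "allow" or "deny"; $2: feedback for the agent (used only for "deny").
emit_decision() {
  if [ "$1" = "deny" ]; then
    printf '{"decision": "deny", "additionalContext": "%s"}\n' "$(json_escape "$2")"
    return 2
  fi
  printf '{"decision": "allow"}\n'
  return 0
}
```

A real hook script would end with something like `emit_decision deny "$ISSUES"; exit $?` so that the return status becomes the process exit code.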
+303
@@ -0,0 +1,303 @@
#!/bin/bash
# Stop hook: runs pre-commit, pytest, and checks CI status before allowing agent to finish
#
# This hook runs when the agent attempts to stop/finish.
# It can BLOCK the stop by:
# - Exiting with code 2 (blocked)
# - Outputting JSON: {"decision": "deny", "additionalContext": "feedback message"}
#
# Environment variables available:
# OPENHANDS_PROJECT_DIR - Project directory
# OPENHANDS_SESSION_ID - Session ID
# GITHUB_TOKEN - GitHub API token (if available)
set -o pipefail
PROJECT_DIR="${OPENHANDS_PROJECT_DIR:-$(pwd)}"
cd "$PROJECT_DIR" || exit 1
# Collect all issues to report back to the agent
ISSUES=""
BLOCK_STOP=false
log_issue() {
ISSUES="${ISSUES}${1}\n"
BLOCK_STOP=true
}
>&2 echo "=== Stop Hook ==="
>&2 echo "Project directory: $PROJECT_DIR"
>&2 echo ""
# --------------------------
# Step 1: Run pre-commit on all files
# --------------------------
>&2 echo "=== Running pre-commit run --all-files ==="
if command -v uv &> /dev/null; then
PRECOMMIT_OUTPUT=$(uv run pre-commit run --all-files 2>&1)
PRECOMMIT_EXIT=$?
else
PRECOMMIT_OUTPUT=$(pre-commit run --all-files 2>&1)
PRECOMMIT_EXIT=$?
fi
>&2 echo "$PRECOMMIT_OUTPUT"
if [ $PRECOMMIT_EXIT -ne 0 ]; then
>&2 echo "⚠️ pre-commit found issues (exit code: $PRECOMMIT_EXIT)"
log_issue "## Pre-commit Failed\n\nPre-commit checks failed. Please fix the following issues:\n\n\`\`\`\n${PRECOMMIT_OUTPUT}\n\`\`\`"
else
>&2 echo "✓ pre-commit passed"
fi
>&2 echo ""
# --------------------------
# Step 2: Detect changed files and run appropriate tests
# --------------------------
>&2 echo "=== Detecting changed files and running appropriate tests ==="
# Get changed files from git (staged, unstaged, and untracked)
CHANGED_FILES=$(git status --porcelain 2>/dev/null | awk '{print $NF}')
if [ -n "$CHANGED_FILES" ]; then
>&2 echo "Changed files:"
>&2 echo "$CHANGED_FILES" | head -20
>&2 echo ""
# Map changed files to test directories
PROJECTS_TO_TEST=""
add_project() {
local project="$1"
if [[ ! "$PROJECTS_TO_TEST" =~ "$project" ]]; then
PROJECTS_TO_TEST="$PROJECTS_TO_TEST $project"
fi
}
while IFS= read -r file; do
case "$file" in
openhands-sdk/*) add_project "tests/sdk" ;;
openhands-tools/*) add_project "tests/tools" ;;
openhands-workspace/*) add_project "tests/workspace" ;;
openhands-agent-server/*) add_project "tests/agent_server" ;;
tests/sdk/*) add_project "tests/sdk" ;;
tests/tools/*) add_project "tests/tools" ;;
tests/workspace/*) add_project "tests/workspace" ;;
tests/agent_server/*) add_project "tests/agent_server" ;;
tests/cross/*) add_project "tests/cross" ;;
tests/examples/*) add_project "tests/examples" ;;
tests/github_workflows/*) add_project "tests/github_workflows" ;;
examples/*) add_project "tests/examples" ;;
scripts/*) add_project "tests/cross" ;;
pyproject.toml|uv.lock) add_project "tests/cross" ;;
esac
done <<< "$CHANGED_FILES"
PROJECTS_TO_TEST=$(echo "$PROJECTS_TO_TEST" | xargs)
if [ -n "$PROJECTS_TO_TEST" ]; then
>&2 echo "Running tests for: $PROJECTS_TO_TEST"
>&2 echo ""
for project in $PROJECTS_TO_TEST; do
if [ -d "$project" ]; then
>&2 echo "=== Testing $project ==="
if command -v uv &> /dev/null; then
PYTEST_OUTPUT=$(uv run pytest "$project" -v --tb=short -x 2>&1)
PYTEST_EXIT=$?
else
PYTEST_OUTPUT=$(pytest "$project" -v --tb=short -x 2>&1)
PYTEST_EXIT=$?
fi
>&2 echo "$PYTEST_OUTPUT"
if [ $PYTEST_EXIT -ne 0 ]; then
>&2 echo "⚠️ pytest failed for $project"
log_issue "## Pytest Failed for $project\n\nTests failed. Please fix the following:\n\n\`\`\`\n${PYTEST_OUTPUT}\n\`\`\`"
fi
>&2 echo ""
fi
done
else
>&2 echo "No tests to run for changed files"
fi
else
>&2 echo "No changed files detected, skipping local tests"
fi
>&2 echo ""
# --------------------------
# Step 3: Check if there's a pushed commit and wait for CI
# --------------------------
>&2 echo "=== Checking GitHub CI status ==="
# Check if we're in a git repo with a GitHub remote
GITHUB_REMOTE=$(git remote -v 2>/dev/null | grep -E "(github\.com.*push)" | head -1)
if [ -z "$GITHUB_REMOTE" ]; then
>&2 echo "No GitHub remote found, skipping CI check"
else
# Extract owner/repo from remote URL
# Handle both HTTPS and SSH formats
REPO_INFO=$(echo "$GITHUB_REMOTE" | sed -E 's|.*github\.com[:/]([^/]+)/([^/.]+)(\.git)?.*|\1/\2|')
if [ -z "$REPO_INFO" ]; then
>&2 echo "Could not parse GitHub repository info"
else
>&2 echo "Repository: $REPO_INFO"
# Get current branch
CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null)
>&2 echo "Current branch: $CURRENT_BRANCH"
# Get the latest commit SHA
LOCAL_SHA=$(git rev-parse HEAD 2>/dev/null)
>&2 echo "Local commit: ${LOCAL_SHA:0:8}"
# Check if this commit has been pushed
REMOTE_SHA=$(git ls-remote origin "$CURRENT_BRANCH" 2>/dev/null | awk '{print $1}')
if [ -z "$REMOTE_SHA" ]; then
>&2 echo "Branch not pushed to remote, skipping CI check"
elif [ "$LOCAL_SHA" != "$REMOTE_SHA" ]; then
>&2 echo "Local commit differs from remote (remote: ${REMOTE_SHA:0:8}), skipping CI check"
else
>&2 echo "Commit has been pushed, checking CI status..."
# Check if GITHUB_TOKEN is available
if [ -z "$GITHUB_TOKEN" ]; then
>&2 echo "GITHUB_TOKEN not set, cannot check CI status"
else
# Use gh CLI if available, otherwise fall back to API
if command -v gh &> /dev/null; then
>&2 echo "Using gh CLI to check CI status..."
# Get check runs for this commit
CI_STATUS=$(gh api "repos/$REPO_INFO/commits/$LOCAL_SHA/check-runs" \
--jq '.check_runs | map({name: .name, status: .status, conclusion: .conclusion})' 2>&1)
if [ $? -ne 0 ]; then
>&2 echo "Failed to get CI status: $CI_STATUS"
else
# Parse the status
TOTAL_CHECKS=$(echo "$CI_STATUS" | jq 'length')
if [ "$TOTAL_CHECKS" -eq 0 ]; then
>&2 echo "No CI checks found for this commit"
else
>&2 echo "Found $TOTAL_CHECKS CI check(s)"
# Check for in-progress runs
IN_PROGRESS=$(echo "$CI_STATUS" | jq '[.[] | select(.status != "completed")] | length')
FAILED=$(echo "$CI_STATUS" | jq '[.[] | select(.conclusion == "failure" or .conclusion == "timed_out" or .conclusion == "cancelled")] | length')
if [ "$IN_PROGRESS" -gt 0 ]; then
>&2 echo "$IN_PROGRESS check(s) still in progress"
# Wait for CI to complete (with timeout)
MAX_WAIT=300 # 5 minutes
WAIT_INTERVAL=15
TOTAL_WAITED=0
while [ "$IN_PROGRESS" -gt 0 ] && [ "$TOTAL_WAITED" -lt "$MAX_WAIT" ]; do
>&2 echo "Waiting for CI... (${TOTAL_WAITED}s / ${MAX_WAIT}s max)"
sleep $WAIT_INTERVAL
TOTAL_WAITED=$((TOTAL_WAITED + WAIT_INTERVAL))
CI_STATUS=$(gh api "repos/$REPO_INFO/commits/$LOCAL_SHA/check-runs" \
--jq '.check_runs | map({name: .name, status: .status, conclusion: .conclusion})' 2>&1)
IN_PROGRESS=$(echo "$CI_STATUS" | jq '[.[] | select(.status != "completed")] | length')
done
if [ "$IN_PROGRESS" -gt 0 ]; then
>&2 echo "⚠️ CI still running after ${MAX_WAIT}s timeout"
log_issue "## CI Still Running\n\nCI checks are still in progress after waiting ${MAX_WAIT} seconds. Please wait for CI to complete before finishing."
fi
fi
# Re-check for failures after waiting
FAILED=$(echo "$CI_STATUS" | jq '[.[] | select(.conclusion == "failure" or .conclusion == "timed_out" or .conclusion == "cancelled")] | length')
if [ "$FAILED" -gt 0 ]; then
>&2 echo "$FAILED check(s) failed!"
# Get details of failed checks
FAILED_DETAILS=$(echo "$CI_STATUS" | jq -r '.[] | select(.conclusion == "failure" or .conclusion == "timed_out" or .conclusion == "cancelled") | "- \(.name): \(.conclusion)"')
>&2 echo "$FAILED_DETAILS"
# Try to get failure logs
FAILED_NAMES=$(echo "$CI_STATUS" | jq -r '.[] | select(.conclusion == "failure") | .name')
FAILURE_MSG="## CI Failed\n\nThe following CI checks failed:\n\n${FAILED_DETAILS}\n"
# Try to get the workflow run logs for more context
WORKFLOW_RUNS=$(gh api "repos/$REPO_INFO/actions/runs?head_sha=$LOCAL_SHA" \
--jq '.workflow_runs[] | select(.conclusion == "failure") | {id: .id, name: .name}' 2>/dev/null)
if [ -n "$WORKFLOW_RUNS" ]; then
FAILURE_MSG="${FAILURE_MSG}\nYou can view the full logs at: https://github.com/$REPO_INFO/actions\n"
# Try to get job logs
FIRST_RUN_ID=$(echo "$WORKFLOW_RUNS" | jq -r '.id' | head -1)
if [ -n "$FIRST_RUN_ID" ]; then
JOBS_OUTPUT=$(gh api "repos/$REPO_INFO/actions/runs/$FIRST_RUN_ID/jobs" \
--jq '.jobs[] | select(.conclusion == "failure") | "### \(.name)\nConclusion: \(.conclusion)\nSteps:\n" + (.steps | map("- \(.name): \(.conclusion)") | join("\n"))' 2>/dev/null | head -100)
if [ -n "$JOBS_OUTPUT" ]; then
FAILURE_MSG="${FAILURE_MSG}\n### Failed Job Details:\n\`\`\`\n${JOBS_OUTPUT}\n\`\`\`"
fi
fi
fi
log_issue "$FAILURE_MSG"
else
>&2 echo "✓ All CI checks passed!"
fi
fi
fi
else
# Fallback to curl
>&2 echo "gh CLI not available, using API directly..."
CI_RESPONSE=$(curl -s -H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
"https://api.github.com/repos/$REPO_INFO/commits/$LOCAL_SHA/check-runs" 2>&1)
TOTAL_CHECKS=$(echo "$CI_RESPONSE" | jq '.total_count // 0')
if [ "$TOTAL_CHECKS" -gt 0 ]; then
IN_PROGRESS=$(echo "$CI_RESPONSE" | jq '[.check_runs[] | select(.status != "completed")] | length')
# Count the same conclusions as the gh path: failure, timed_out, cancelled
FAILED=$(echo "$CI_RESPONSE" | jq '[.check_runs[] | select(.conclusion == "failure" or .conclusion == "timed_out" or .conclusion == "cancelled")] | length')
if [ "$IN_PROGRESS" -gt 0 ]; then
>&2 echo "⏳ CI checks still in progress"
log_issue "## CI In Progress\n\nCI checks are still running. Please wait for CI to complete."
elif [ "$FAILED" -gt 0 ]; then
FAILED_NAMES=$(echo "$CI_RESPONSE" | jq -r '.check_runs[] | select(.conclusion == "failure" or .conclusion == "timed_out" or .conclusion == "cancelled") | .name')
>&2 echo "❌ CI failed: $FAILED_NAMES"
log_issue "## CI Failed\n\nThe following CI checks failed:\n${FAILED_NAMES}\n\nPlease fix the issues and try again."
else
>&2 echo "✓ All CI checks passed!"
fi
else
>&2 echo "No CI checks found"
fi
fi
fi
fi
fi
fi
>&2 echo ""
# --------------------------
# Final decision
# --------------------------
if [ "$BLOCK_STOP" = true ]; then
>&2 echo "=== BLOCKING STOP: Issues found ==="
# Output JSON to provide feedback to the agent
# Escape the issues for JSON
ESCAPED_ISSUES=$(echo -e "$ISSUES" | jq -Rs .)
echo "{\"decision\": \"deny\", \"reason\": \"Checks failed\", \"additionalContext\": $ESCAPED_ISSUES}"
exit 2
fi
>&2 echo "=== All checks passed, allowing stop ==="
echo '{"decision": "allow"}'
exit 0
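Step 2 of the hook maps each changed path to a test directory and skips directories that are already queued. The script's `add_project` uses a regex substring test for that; the sketch below shows the same mapping with a space-delimited exact match, which would keep working if one directory name were ever a prefix of another. The function names `test_dir_for` and `projects_to_test` are hypothetical.

```shell
# Sketch: map changed paths to test directories, each listed once.
# test_dir_for and projects_to_test are hypothetical names.

# Print the test directory covering one changed path (nothing if unmapped).
test_dir_for() {
  case "$1" in
    openhands-sdk/*|tests/sdk/*) echo "tests/sdk" ;;
    openhands-tools/*|tests/tools/*) echo "tests/tools" ;;
    openhands-workspace/*|tests/workspace/*) echo "tests/workspace" ;;
    openhands-agent-server/*|tests/agent_server/*) echo "tests/agent_server" ;;
    examples/*|tests/examples/*) echo "tests/examples" ;;
    tests/github_workflows/*) echo "tests/github_workflows" ;;
    scripts/*|tests/cross/*|pyproject.toml|uv.lock) echo "tests/cross" ;;
  esac
}

# Read changed paths on stdin; print each test directory once, in first-seen order.
projects_to_test() {
  seen=" "
  while IFS= read -r path; do
    dir="$(test_dir_for "$path")"
    [ -n "$dir" ] || continue
    case "$seen" in
      *" $dir "*) ;;                        # exact match: already queued
      *) seen="$seen$dir "; echo "$dir" ;;  # first occurrence: queue and print
    esac
  done
}

printf '%s\n' openhands-sdk/a.py tests/sdk/test_a.py uv.lock README.md | projects_to_test
```

With the directory names currently in the hook, both approaches give the same result; the difference only matters for overlapping names.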
-33
@@ -1,33 +0,0 @@
---
name: documentation
type: knowledge
version: 1.0.0
agent: CodeActAgent
triggers:
- documentation
- docs
- document
---
# Documentation Guidelines
All documentation must be grounded in fact; do not state anything you cannot support with evidence. When you have finished writing documentation, tell the user which sources (web pages, source code, or other documentation) you relied on for each new fact. If you cannot cite a source for a statement, do not include it in the pull request.
## Best Practices for Documentation
1. **Be Factual**: Only include information that can be verified from reliable sources.
2. **Cite Sources**: Always reference the source of information (code, web pages, official documentation).
3. **Be Clear and Concise**: Use simple language and avoid unnecessary jargon.
4. **Use Examples**: Include practical examples to illustrate concepts.
5. **Structure Properly**: Use headings, lists, and code blocks to organize information.
6. **Keep Updated**: Ensure documentation reflects the current state of the code or system.
## Documentation Process
1. Research and gather information from reliable sources
2. Draft documentation based on verified facts
3. Review for accuracy and completeness
4. Include references for all factual statements
5. Submit only when all information is properly sourced
Remember: If you cannot verify a piece of information, it's better to exclude it than to include potentially incorrect information.
-172
@@ -1,172 +0,0 @@
# OpenHands Glossary
### Agent
The core AI entity in OpenHands that can perform software development tasks by interacting with tools, browsing the web, and modifying code.
#### Agent Controller
A component that manages the agent's lifecycle, handles its state, and coordinates interactions between the agent and various tools.
#### Agent Delegation
The ability of an agent to hand off specific tasks to other specialized agents for better task completion.
#### Agent Hub
A central registry of different agent types and their capabilities, allowing for easy agent selection and instantiation.
#### Agent Skill
A specific capability or function that an agent can perform, such as file manipulation, web browsing, or code editing.
#### Agent State
The current context and status of an agent, including its memory, active tools, and ongoing tasks.
#### CodeAct Agent
[A generalist agent in OpenHands](https://arxiv.org/abs/2407.16741) designed to perform tasks by editing and executing code.
### Browser
A system for web-based interactions and tasks.
#### Browser Gym
A testing and evaluation environment for browser-based agent interactions and tasks.
#### Web Browser Tool
A tool that enables agents to interact with web pages and perform web-based tasks.
### Commands
Terminal- and execution-related functionality.
#### Bash Session
A persistent terminal session that maintains state and history for bash command execution.
This uses tmux under the hood.
### Configuration
System-wide settings and options.
#### Agent Configuration
Settings that define an agent's behavior, capabilities, and limitations, including available tools and runtime settings.
#### Configuration Options
Settings that control various aspects of OpenHands behavior, including runtime, security, and agent settings.
#### LLM Config
Configuration settings for language models used by agents, including model selection and parameters.
#### LLM Draft Config
Settings for draft mode operations with language models, typically used for faster, lower-quality responses.
#### Runtime Configuration
Settings that define how the runtime environment should be set up and operated.
#### Security Options
Configuration settings that control security features and restrictions.
### Conversation
A sequence of interactions between a user and an agent, including messages, actions, and their results.
#### Conversation Info
Metadata about a conversation, including its status, participants, and timeline.
#### Conversation Manager
A component that handles the creation, storage, and retrieval of conversations.
#### Conversation Metadata
Additional information about conversations, such as tags, timestamps, and related resources.
#### Conversation Status
The current state of a conversation, including whether it's active, completed, or failed.
#### Conversation Store
A storage system for maintaining conversation history and related data.
### Events
#### Event
Every Conversation comprises a series of Events. Each Event is either an Action or an Observation.
#### Event Stream
A continuous flow of events that represents the ongoing activities and interactions in the system.
#### Action
A specific operation or command that an agent executes through available tools, such as running a command or editing a file.
#### Observation
The response or result returned by a tool after an agent's action, providing feedback about the action's outcome.
### Interface
Different ways to interact with OpenHands.
#### CLI Mode
A command-line interface mode for interacting with OpenHands agents without a graphical interface.
#### GUI Mode
A graphical user interface mode for interacting with OpenHands agents through a web interface.
#### Headless Mode
A mode of operation where OpenHands runs without a user interface, suitable for automation and scripting.
### Agent Memory
The system that decides which parts of the Event Stream (i.e. the conversation history) should be passed into each LLM prompt.
#### Memory Store
A storage system for maintaining agent memory and context across sessions.
#### Condenser
A component that processes and summarizes conversation history to maintain context while staying within token limits.
#### Truncation
A very simple Condenser strategy that reduces conversation history or content to stay within token limits.
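As a rough illustration of truncation, the sketch below keeps only the most recent events that fit a budget. It is not the OpenHands implementation: the function name `truncate_history` is hypothetical, events are assumed to be one per line, and whitespace-separated words stand in for tokens.

```shell
# Sketch: keep the newest events whose combined size fits a budget.
# Assumptions: one event per line; word count is a crude stand-in for tokens.
truncate_history() {
  awk -v budget="$1" '
    { lines[NR] = $0; cost[NR] = NF }
    END {
      start = NR + 1
      used = 0
      # Walk backwards from the newest event until the budget is exhausted.
      for (i = NR; i >= 1; i--) {
        if (used + cost[i] > budget) break
        used += cost[i]
        start = i
      }
      # Emit the surviving suffix in its original order.
      for (i = start; i <= NR; i++) print lines[i]
    }'
}

printf '%s\n' 'user: fix the bug' 'agent: run tests' \
  'observation: 2 failed' 'agent: edit file' | truncate_history 6
```

Older events are dropped outright, whereas a summarizing Condenser would replace them with a shorter summary.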
### Microagent
A specialized prompt that enhances OpenHands with domain-specific knowledge, repository-specific context, and task-specific workflows.
#### Microagent Registry
A central repository of available microagents and their configurations.
#### Public Microagent
A general-purpose microagent available to all OpenHands users, triggered by specific keywords. Located in `microagents/`.
#### Repository Microagent
A type of microagent that provides repository-specific context and guidelines, stored in the `.openhands/microagents/` directory.
### Prompt
Components for managing and processing prompts.
#### Prompt Caching
A system for caching and reusing common prompts to improve performance.
#### Prompt Manager
A component that handles the loading, processing, and management of prompts used by agents, including microagents.
#### Response Parsing
The process of interpreting and structuring responses from language models and tools.
### Runtime
The execution environment where agents perform their tasks, which can be local, remote, or containerized.
#### Action Execution Server
A REST API that receives agent actions (e.g. bash commands, python code, browsing actions), executes them in the runtime environment, and returns the results.
#### Action Execution Client
A component that handles the execution of actions in the runtime environment, managing the communication between the agent and the runtime.
#### Docker Runtime
A containerized runtime environment that provides isolation and reproducibility for agent operations.
#### E2B Runtime
A specialized runtime environment built on E2B for secure and isolated code execution.
#### Local Runtime
A runtime environment that executes on the local machine, suitable for development and testing.
#### Modal Runtime
A runtime environment built on Modal for scalable and distributed agent operations.
#### Remote Runtime
A sandboxed environment that executes code and commands remotely, providing isolation and security for agent operations.
#### Runtime Builder
A component that builds a Docker image for the Action Execution Server based on a user-specified base image.
### Security
Security-related components and features.
#### Security Analyzer
A component that checks agent actions for potential security risks.
-124
@@ -1,124 +0,0 @@
#!/bin/bash
echo "Running OpenHands pre-commit hook..."
echo "This hook runs selective linting based on changed files."
# Store the exit code to return at the end
# This allows us to be additive to existing pre-commit hooks
EXIT_CODE=0
# Get the list of staged files
STAGED_FILES=$(git diff --cached --name-only)
# Check if any files match specific patterns
has_frontend_changes=false
has_backend_changes=false
# Check each file individually to avoid issues with grep
for file in $STAGED_FILES; do
if [[ $file == frontend/* ]]; then
has_frontend_changes=true
elif [[ $file == openhands/* || $file == evaluation/* || $file == tests/* ]]; then
has_backend_changes=true
fi
done
echo "Analyzing changes..."
echo "- Frontend changes: $has_frontend_changes"
echo "- Backend changes: $has_backend_changes"
# Run frontend linting if needed
if [ "$has_frontend_changes" = true ]; then
# Check if we're in a CI environment or if frontend dependencies are missing
if [ -n "$CI" ] || ! command -v react-router &> /dev/null || ! command -v vitest &> /dev/null; then
echo "Skipping frontend checks (CI environment or missing dependencies detected)."
echo "WARNING: Frontend files have changed but frontend checks are being skipped."
echo "Please run 'make lint-frontend' manually before submitting your PR."
else
echo "Running frontend linting..."
make lint-frontend
if [ $? -ne 0 ]; then
echo "Frontend linting failed. Please fix the issues before committing."
EXIT_CODE=1
else
echo "Frontend linting checks passed!"
fi
# Run additional frontend checks
if [ -d "frontend" ]; then
echo "Running additional frontend checks..."
cd frontend || exit 1
# Run build
echo "Running npm build..."
npm run build
if [ $? -ne 0 ]; then
echo "Frontend build failed. Please fix the issues before committing."
EXIT_CODE=1
fi
# Run tests
echo "Running npm test..."
npm test
if [ $? -ne 0 ]; then
echo "Frontend tests failed. Please fix the failing tests before committing."
EXIT_CODE=1
fi
cd ..
fi
fi
else
echo "Skipping frontend checks (no frontend changes detected)."
fi
# Run backend linting if needed
if [ "$has_backend_changes" = true ]; then
echo "Running backend linting..."
make lint-backend
if [ $? -ne 0 ]; then
echo "Backend linting failed. Please fix the issues before committing."
EXIT_CODE=1
else
echo "Backend linting checks passed!"
fi
else
echo "Skipping backend checks (no backend changes detected)."
fi
# If no specific code changes detected, run basic checks
if [ "$has_frontend_changes" = false ] && [ "$has_backend_changes" = false ]; then
echo "No specific code changes detected. Running basic checks..."
if [ -n "$STAGED_FILES" ]; then
# Run only basic pre-commit hooks for non-code files
poetry run pre-commit run --files $(echo "$STAGED_FILES" | tr '\n' ' ') --hook-stage commit --config ./dev_config/python/.pre-commit-config.yaml
if [ $? -ne 0 ]; then
echo "Basic checks failed. Please fix the issues before committing."
EXIT_CODE=1
else
echo "Basic checks passed!"
fi
else
echo "No files changed. Skipping basic checks."
fi
fi
# Run any existing pre-commit hooks that might have been installed by the user
# This makes our hook additive rather than replacing existing hooks
if [ -f ".git/hooks/pre-commit.local" ]; then
echo "Running existing pre-commit hooks..."
bash .git/hooks/pre-commit.local
if [ $? -ne 0 ]; then
echo "Existing pre-commit hooks failed."
EXIT_CODE=1
fi
fi
if [ $EXIT_CODE -eq 0 ]; then
echo "All pre-commit checks passed!"
else
echo "Some pre-commit checks failed. Please fix the issues before committing."
fi
exit $EXIT_CODE
+9 -11
@@ -1,13 +1,11 @@
#! /bin/bash
#!/bin/bash
echo "Setting up the environment..."
# Install pre-commit package
python -m pip install pre-commit
# Install pre-commit hooks if .git directory exists
if [ -d ".git" ]; then
echo "Installing pre-commit hooks..."
pre-commit install
make install-pre-commit-hooks
if ! command -v uv &> /dev/null; then
echo "uv is not installed. Installing..."
curl -LsSf https://astral.sh/uv/install.sh | sh
else
echo "uv is already installed."
uv self update # always update to the latest version
fi
make build
+56
@@ -0,0 +1,56 @@
---
repos:
- repo: https://github.com/jumanjihouse/pre-commit-hook-yamlfmt
rev: 0.2.1 # or other specific tag
hooks:
- id: yamlfmt
- repo: local
hooks:
- id: ruff-format
name: Ruff format
entry: uv
args: [run, ruff, format]
language: system
types: [python]
pass_filenames: true
always_run: false
- id: ruff-check
name: Ruff lint
entry: uv
args: [run, ruff, check, --fix, --exit-non-zero-on-fix]
language: system
types: [python]
pass_filenames: true
always_run: false
- id: pycodestyle
name: PEP8 style check (pycodestyle)
entry: uv
args: [run, pycodestyle, --max-line-length=88, '--ignore=E203,E501,W503,E704']
language: system
types: [python]
pass_filenames: true
always_run: false
- id: pyright
name: Type check with pyright
entry: uv
args: [run, pyright]
language: system
types: [python]
pass_filenames: true
always_run: false
- id: check-import-rules
name: Check import dependency rules
entry: uv
args: [run, python, scripts/check_import_rules.py]
language: system
types: [python]
pass_filenames: true
always_run: false
- id: check-tool-registration
name: Check Tool subclass registration
entry: uv
args: [run, python, scripts/check_tool_registration.py]
language: system
types: [python]
pass_filenames: true
always_run: false
+1
@@ -0,0 +1 @@
3.13
-22
@@ -1,22 +0,0 @@
{
// force *nix line endings so files don't look modified in container run from Windows clone
"files.eol": "\n",
"files.trimTrailingWhitespace": true,
"files.insertFinalNewline": true,
"python.defaultInterpreterPath": "./.venv/bin/python",
"python.terminal.activateEnvironment": true,
"python.analysis.autoImportCompletions": true,
"python.analysis.autoSearchPaths": true,
"python.analysis.extraPaths": [
"./.venv/lib/python3.12/site-packages"
],
"python.analysis.packageIndexDepths": [
{
"name": "openhands",
"depth": 10,
"includeAllSymbols": true
}
],
"python.analysis.stubPath": "./.venv/lib/python3.12/site-packages",
}
+289 -305
@@ -1,344 +1,328 @@
This repository contains the code for OpenHands, an automated AI software engineer. It has a Python backend
(in the `openhands` directory) and React frontend (in the `frontend` directory).
<ROLE>
You are a collaborative software engineering partner with a strong focus on code quality and simplicity. Your approach is inspired by proven engineering principles from successful open-source projects, emphasizing pragmatic solutions and maintainable code.
## General Setup:
To set up the entire repo, including frontend and backend, run `make build`.
You don't need to do this unless the user asks you to, or if you're trying to run the entire application.
# Core Engineering Principles
1. **Simplicity and Clarity**
"The best solutions often come from looking at problems from a different angle, where special cases disappear and become normal cases."
• Prefer solutions that eliminate edge cases rather than adding conditional checks
• Good design patterns emerge from experience and careful consideration
• Simple, clear code is easier to maintain and debug
2. **Backward Compatibility**
"Stability is a feature, not a constraint."
• Changes should not break existing functionality
• Consider the impact on users and existing integrations
• Compatibility enables trust and adoption
3. **Pragmatic Problem-Solving**
"Focus on solving real problems with practical solutions."
• Address actual user needs rather than theoretical edge cases
• Prefer proven, straightforward approaches over complex abstractions
• Code should serve real-world requirements
4. **Maintainable Architecture**
"Keep functions focused and code readable."
• Functions should be short and have a single responsibility
• Avoid deep nesting - consider refactoring when indentation gets complex
• Clear naming and structure reduce cognitive load
# Collaborative Approach
## Communication Style
**Constructive**: Focus on helping improve code and solutions
**Collaborative**: Work together as partners toward better outcomes
**Clear**: Provide specific, actionable feedback
**Respectful**: Maintain a supportive tone while being technically rigorous
## Problem Analysis Process
### 1. Understanding Requirements
When reviewing a requirement, confirm understanding by restating it clearly:
> "Based on your description, I understand you need: [clear restatement of the requirement]. Is this correct?"
### 2. Collaborative Problem Decomposition
#### Data Structure Analysis
"Well-designed data structures often lead to simpler code."
• What are the core data elements and their relationships?
• How does data flow through the system?
• Are there opportunities to simplify data handling?
#### Complexity Assessment
"Let's look for ways to simplify this."
• What's the essential functionality we need to implement?
• Which parts of the current approach add unnecessary complexity?
• How can we make this more straightforward?
#### Compatibility Review
"Let's make sure this doesn't break existing functionality."
• What existing features might be affected?
• How can we implement this change safely?
• What migration path do users need?
#### Practical Validation
"Let's focus on the real-world use case."
• Does this solve an actual problem users face?
• Is the complexity justified by the benefit?
• What's the simplest approach that meets the need?
### 3. Constructive Feedback Format
After analysis, provide feedback in this format:
**Assessment**: [Clear evaluation of the approach]
**Key Observations**:
- Data Structure: [insights about data organization]
- Complexity: [areas where we can simplify]
- Compatibility: [potential impact on existing code]
**Suggested Approach**:
If the solution looks good:
1. Start with the simplest data structure that works
2. Eliminate special cases where possible
3. Implement clearly and directly
4. Ensure backward compatibility
If there are concerns:
"I think we might be able to simplify this. The core issue seems to be [specific problem]. What if we tried [alternative approach]?"
### 4. Code Review Approach
When reviewing code, provide constructive feedback:
**Overall Assessment**: [Helpful evaluation]
**Specific Suggestions**:
- [Concrete improvements with explanations]
- [Alternative approaches to consider]
- [Ways to reduce complexity]
**Next Steps**: [Clear action items]
</ROLE>
## Package-specific guidance
When reviewing or modifying code, read the closest AGENTS file for the
package(s) containing the changed files. If a PR spans multiple packages,
consult each relevant package-level AGENTS.md.
- SDK: [openhands-sdk/openhands/sdk/AGENTS.md](openhands-sdk/openhands/sdk/AGENTS.md)
- Subagents: [openhands-sdk/openhands/sdk/subagent/AGENTS.md](openhands-sdk/openhands/sdk/subagent/AGENTS.md)
- Tools: [openhands-tools/openhands/tools/AGENTS.md](openhands-tools/openhands/tools/AGENTS.md)
- Workspace: [openhands-workspace/openhands/workspace/AGENTS.md](openhands-workspace/openhands/workspace/AGENTS.md)
- Agent server: [openhands-agent-server/AGENTS.md](openhands-agent-server/AGENTS.md)
- Eval config: [.github/run-eval/AGENTS.md](.github/run-eval/AGENTS.md)
## API compatibility pointers
- For SDK Python API deprecation/removal policy, read
[openhands-sdk/openhands/sdk/AGENTS.md](openhands-sdk/openhands/sdk/AGENTS.md).
Public API removals require deprecation before removal, and breaking SDK API
changes require at least a **MINOR** SemVer bump.
- The SDK API breakage checker should treat metadata-only changes to
Pydantic `Field(...)` declarations as non-breaking, including adding,
removing, or editing `description`, `title`, `examples`,
`json_schema_extra`, and `deprecated` kwargs.
- For public REST APIs, read
[openhands-agent-server/AGENTS.md](openhands-agent-server/AGENTS.md).
REST contract breaks need a deprecation notice and a runway of
**5 minor releases** before removing the old contract or making an
incompatible replacement mandatory.
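As a rough illustration of the metadata-only rule above, the sketch below compares two sets of `Field(...)` keyword arguments while ignoring the listed metadata kwargs. The helper name `is_breaking_field_change` and the plain-dict representation of the kwargs are hypothetical; the actual breakage checker may be structured quite differently.

```python
# Hypothetical sketch: this is not the real SDK breakage checker.
# Kwargs that the policy above treats as metadata-only.
METADATA_KWARGS = frozenset(
    {"description", "title", "examples", "json_schema_extra", "deprecated"}
)


def is_breaking_field_change(old_kwargs: dict, new_kwargs: dict) -> bool:
    """Return True if two Field(...) kwarg sets differ outside metadata."""
    old_core = {k: v for k, v in old_kwargs.items() if k not in METADATA_KWARGS}
    new_core = {k: v for k, v in new_kwargs.items() if k not in METADATA_KWARGS}
    return old_core != new_core


# Editing the description and adding `deprecated` touches metadata only.
print(
    is_breaking_field_change(
        {"default": 30, "description": "Timeout in seconds"},
        {"default": 30, "description": "Request timeout (s)", "deprecated": True},
    )
)
# A changed default is not metadata, so this sketch flags it.
print(is_breaking_field_change({"default": 30}, {"default": 60}))
```

Whether a changed default should count as breaking is a separate policy question; the sketch only shows how the metadata kwargs can be filtered out before comparing.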
<DEV_SETUP>
- Make sure you run `make build` first to configure the dependencies
- We use pre-commit hooks (`.pre-commit-config.yaml`) that include:
- type checking through pyright
- linting and formatting with `uv run ruff`
- NEVER USE `mypy`!
- Do NOT commit ALL the files; commit only the relevant files you've changed!
- In every commit message, you should add "Co-authored-by: openhands <openhands@all-hands.dev>"
- You can run pytest with `uv run pytest`
# Instruction for fixing "E501 Line too long"
- If it is just code, you can modify it so it spans multiple lines.
- If it is a single-line string, you can break it into a multi-line string by doing "ABC" -> ("A"\n"B"\n"C")
- If it is a long multi-line string (e.g., docstring), you should just add type ignore AFTER the ending """. You should NEVER ADD IT INSIDE the docstring.
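A minimal sketch of the last two cases, with made-up string contents. The use of `# noqa: E501` as the suppression comment is an assumption on my part (the text above only says to add an ignore comment after the closing `"""`), so check it against the linter configuration before relying on it.

```python
# Single-line string: adjacent literals inside parentheses are joined into
# one string, so each source line can stay within the length limit.
message = (
    "The callback was rejected because the request "
    "did not include an Authorization header."
)


def summarize() -> str:
    """Return the message.

    A docstring line that runs well past the configured limit is left exactly as written, and the marker sits outside the string.
    """  # noqa: E501  (assumed marker; placed AFTER the closing quotes)
    return message


print(summarize())
```

Note that each fragment except the last ends with a trailing space; the literals are concatenated as-is, so a missing space silently glues two words together.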
</DEV_SETUP>
<PR_ARTIFACTS>
# PR-Specific Documents
When working on a PR that requires design documents, development-only scripts, or other temporary artifacts that should NOT be merged to main, store them in a `.pr/` directory at the repository root.
## Usage
## Running OpenHands with OpenHands:
To run the full application to debug issues:
```bash
export INSTALL_DOCKER=0
export RUNTIME=local
make build && make run FRONTEND_PORT=12000 FRONTEND_HOST=0.0.0.0 BACKEND_HOST=0.0.0.0 &> /tmp/openhands-log.txt &
# Create the directory if it doesn't exist
mkdir -p .pr
# Add your PR-specific documents
.pr/
├── design.md # Design decisions and architecture notes
├── analysis.md # Investigation or debugging notes
└── notes.md # Any other PR-specific content
```
IMPORTANT: Before making any changes to the codebase, ALWAYS run `make install-pre-commit-hooks` to ensure pre-commit hooks are properly installed.
## How It Works
Before pushing any changes, you MUST ensure that any lint errors or simple test errors have been fixed.
1. **Notification**: When `.pr/` exists, a single comment is posted to the PR conversation alerting reviewers
2. **Auto-cleanup**: When the PR is approved, the `.pr/` directory is automatically removed via commit
3. **Fork PRs**: Auto-cleanup cannot push to forks, so manual removal is required before merging
* If you've made changes to the backend, you should run `pre-commit run --config ./dev_config/python/.pre-commit-config.yaml` (this will run on staged files).
* If you've made changes to the frontend, you should run `cd frontend && npm run lint:fix && npm run build ; cd ..`
* If you've made changes to the VSCode extension, you should run `cd openhands/integrations/vscode && npm run lint:fix && npm run compile ; cd ../../..`
## Important Notes
The pre-commit hooks MUST pass successfully before pushing any changes to the repository. This is a mandatory requirement to maintain code quality and consistency.
- Do NOT put anything in `.pr/` that needs to be preserved
- The `.pr/` check passes (green ✅) during development - it only posts a notification, not a blocking error
- For fork PRs: You must manually remove `.pr/` before the PR can be merged
If either command fails, it may have automatically fixed some issues. You should fix any issues that weren't automatically fixed,
then re-run the command to ensure it passes. Common issues include:
- Mypy type errors
- Ruff formatting issues
- Trailing whitespace
- Missing newlines at end of files
## When to Use
## Git Best Practices
- Complex refactoring that benefits from written design rationale
- Debugging sessions where you want to document your investigation
- Feature implementations that need temporary planning docs
- Temporary scripts that are intended to show reviewers that the feature works
- Any analysis that helps reviewers understand the PR but isn't needed long-term
</PR_ARTIFACTS>
- Prefer specific `git add <filename>` instead of `git add .` to avoid accidentally staging unintended files
- Be especially careful with `git reset --hard` after staging files, as it will remove accidentally staged files
- When remote has new changes, use `git fetch upstream && git rebase upstream/<branch>` on the same branch
<REVIEW_HANDLING>
- Critically evaluate each review comment before acting on it. Not all feedback is worth implementing:
- Does it fix a real bug or improve clarity significantly?
- Does it align with the project's engineering principles (simplicity, maintainability)?
- Is the suggested change proportional to the benefit, or does it add unnecessary complexity?
- It's acceptable to respectfully decline suggestions that add verbosity without clear benefit, over-engineer for hypothetical edge cases, or contradict the project's pragmatic approach.
- After addressing (or deciding not to address) inline review comments, mark the corresponding review threads as resolved.
- Before resolving a thread, leave a reply comment that either explains the reason for dismissing the feedback or references the specific commit (e.g., commit SHA) that addressed the issue.
- Prefer resolving threads only once fixes are pushed or a clear decision is documented.
- Use the GitHub GraphQL API to reply to and resolve review threads (see below).
## Repository Structure
Backend:
- Located in the `openhands` directory
- Testing:
- All tests are in `tests/unit/test_*.py`
- To test new code, run `poetry run pytest tests/unit/test_xxx.py` where `xxx` is the appropriate file for the current functionality
- Write all tests with pytest
## Resolving Review Threads via GraphQL
Frontend:
- Located in the `frontend` directory
- Prerequisites: A recent version of NodeJS / NPM
- Setup: Run `npm install` in the frontend directory
- Testing:
- Run tests: `npm run test`
- To run specific tests: `npm run test -- -t "TestName"`
- Our test framework is vitest
- Building:
- Build for production: `npm run build`
- Environment Variables:
- Set in `frontend/.env` or as environment variables
- Available variables: VITE_BACKEND_HOST, VITE_USE_TLS, VITE_INSECURE_SKIP_VERIFY, VITE_FRONTEND_PORT
- Internationalization:
- Generate i18n declaration file: `npm run make-i18n`
- Data Fetching & Cache Management:
- We use TanStack Query (fka React Query) for data fetching and cache management
- Data Access Layer: API client methods are located in `frontend/src/api` and should never be called directly from UI components - they must always be wrapped with TanStack Query
- Custom hooks are located in `frontend/src/hooks/query/` and `frontend/src/hooks/mutation/`
- Query hooks should follow the pattern use[Resource] (e.g., `useConversationSkills`)
- Mutation hooks should follow the pattern use[Action] (e.g., `useDeleteConversation`)
- Architecture rule: UI components → TanStack Query hooks → Data Access Layer (`frontend/src/api`) → API endpoints
The CI check `Review Thread Gate/unresolved-review-threads` will fail if there are unresolved review threads. To resolve threads programmatically:
VSCode Extension:
- Located in the `openhands/integrations/vscode` directory
- Setup: Run `npm install` in the extension directory
- Linting:
- Run linting with fixes: `npm run lint:fix`
- Check only: `npm run lint`
- Type checking: `npm run typecheck`
- Building:
- Compile TypeScript: `npm run compile`
- Package extension: `npm run package-vsix`
- Testing:
- Run tests: `npm run test`
- Development Best Practices:
- Use `vscode.window.createOutputChannel()` for debug logging instead of `showErrorMessage()` popups
- Pre-commit process runs both frontend and backend checks when committing extension changes
## Enterprise Directory
The `enterprise/` directory contains additional functionality that extends the open-source OpenHands codebase. This includes:
- Authentication and user management (Keycloak integration)
- Database migrations (Alembic)
- Integration services (GitHub, GitLab, Jira, Linear, Slack)
- Billing and subscription management (Stripe)
- Telemetry and analytics (PostHog, custom metrics framework)
### Enterprise Development Setup
**Prerequisites:**
- Python 3.12
- Poetry (for dependency management)
- Node.js 22.x (for frontend)
- Docker (optional)
**Setup Steps:**
1. First, build the main OpenHands project: `make build`
2. Then install enterprise dependencies: `cd enterprise && poetry install --with dev,test` (This can take a very long time. Be patient.)
3. Set up enterprise pre-commit hooks: `poetry run pre-commit install --config ./dev_config/python/.pre-commit-config.yaml`
**Running Enterprise Tests:**
1. Get the thread IDs (replace `<OWNER>`, `<REPO>`, `<PR_NUMBER>`):
```bash
# Enterprise unit tests (full suite)
PYTHONPATH=".:$PYTHONPATH" poetry run --project=enterprise pytest --forked -n auto -s -p no:ddtrace -p no:ddtrace.pytest_bdd -p no:ddtrace.pytest_benchmark ./enterprise/tests/unit --cov=enterprise --cov-branch
# Test specific modules (faster for development)
cd enterprise
PYTHONPATH=".:$PYTHONPATH" poetry run pytest tests/unit/telemetry/ --confcutdir=tests/unit/telemetry
# Enterprise linting (IMPORTANT: use --show-diff-on-failure to match GitHub CI)
poetry run pre-commit run --all-files --show-diff-on-failure --config ./dev_config/python/.pre-commit-config.yaml
gh api graphql -f query='
{
repository(owner: "<OWNER>", name: "<REPO>") {
pullRequest(number: <PR_NUMBER>) {
reviewThreads(first: 20) {
nodes {
id
isResolved
comments(first: 1) {
nodes { body }
}
}
}
}
}
}'
```
**Running Enterprise Server:**
2. Reply to the thread explaining how the feedback was addressed:
```bash
cd enterprise
make start-backend # Development mode with hot reload
# or
make run # Full application (backend + frontend)
gh api graphql -f query='
mutation {
addPullRequestReviewThreadReply(input: {
pullRequestReviewThreadId: "<THREAD_ID>"
body: "Fixed in <COMMIT_SHA>"
}) {
comment { id }
}
}'
```
**Key Configuration Files:**
- `enterprise/pyproject.toml` - Enterprise-specific dependencies
- `enterprise/Makefile` - Enterprise build and run commands
- `enterprise/dev_config/python/` - Linting and type checking configuration
- `enterprise/migrations/` - Database migration files
**Database Migrations:**
Enterprise uses Alembic for database migrations. When making schema changes:
1. Create migration files in `enterprise/migrations/versions/`
2. Test migrations thoroughly
3. The CI will check for migration conflicts on PRs
**Integration Development:**
The enterprise codebase includes integrations for:
- **GitHub** - PR management, webhooks, app installations
- **GitLab** - Similar to GitHub but for GitLab instances
- **Jira** - Issue tracking and project management
- **Linear** - Modern issue tracking
- **Slack** - Team communication and notifications
Each integration follows a consistent pattern with service classes, storage models, and API endpoints.
**Important Notes:**
- Enterprise code is licensed under Polyform Free Trial License (30-day limit)
- The enterprise server extends the OpenHands server through dynamic imports
- Database changes require careful migration planning in `enterprise/migrations/`
- Always test changes in both OpenHands and enterprise contexts
- Use the enterprise-specific Makefile commands for development
**Enterprise Testing Best Practices:**
**Database Testing:**
- Use SQLite in-memory databases (`sqlite:///:memory:`) for unit tests instead of real PostgreSQL
- Create module-specific `conftest.py` files with database fixtures
- Mock external database connections in unit tests to avoid dependency on running services
- Use real database connections only for integration tests
**Import Patterns:**
- Use relative imports without `enterprise.` prefix in enterprise code
- Example: `from storage.database import session_maker` not `from enterprise.storage.database import session_maker`
- This ensures code works in both OpenHands and enterprise contexts
**Test Structure:**
- Place tests in `enterprise/tests/unit/` following the same structure as the source code
- Use `--confcutdir=tests/unit/[module]` when testing specific modules
- Create comprehensive fixtures for complex objects (databases, external services)
- Write platform-agnostic tests (avoid hardcoded OS-specific assertions)
**Mocking Strategy:**
- Use `AsyncMock` for async operations and `MagicMock` for complex objects
- Mock all external dependencies (databases, APIs, file systems) in unit tests
- Use `patch` with correct import paths (e.g., `telemetry.registry.logger` not `enterprise.telemetry.registry.logger`)
- Test both success and failure scenarios with proper error handling
**Coverage Goals:**
- Aim for 90%+ test coverage on new enterprise modules
- Focus on critical business logic and error handling paths
- Use `--cov-report=term-missing` to identify uncovered lines
**Troubleshooting:**
- If tests fail, ensure all dependencies are installed: `poetry install --with dev,test`
- For database issues, check migration status and run migrations if needed
- For frontend issues, ensure the main OpenHands frontend is built: `make build`
- Check logs in the `logs/` directory for runtime issues
- If tests fail with import errors, verify `PYTHONPATH=".:$PYTHONPATH"` is set
- **If GitHub CI fails but local linting passes**: Always use `--show-diff-on-failure` flag to match CI behavior exactly
## Template for GitHub Pull Request
If you are starting a pull request (PR), please follow the template in `.github/pull_request_template.md`.
## Implementation Details
These details may or may not be useful for your current task.
### Microagents
Microagents are specialized prompts that enhance OpenHands with domain-specific knowledge and task-specific workflows. They are Markdown files that can include frontmatter for configuration.
#### Types:
- **Public Microagents**: Located in `microagents/`, available to all users
- **Repository Microagents**: Located in `.openhands/microagents/`, specific to this repository
#### Loading Behavior:
- **Without frontmatter**: Always loaded into LLM context
- **With triggers in frontmatter**: Only loaded when user's message matches the specified trigger keywords
#### Structure:
```yaml
---
triggers:
- keyword1
- keyword2
---
# Microagent Content
Your specialized knowledge and instructions here...
3. Resolve the thread:
```bash
gh api graphql -f query='
mutation {
resolveReviewThread(input: {threadId: "<THREAD_ID>"}) {
thread { isResolved }
}
}'
```
### Frontend
4. Get the failed workflow run ID and rerun it:
```bash
# Find the run ID from the failed check URL, or use:
gh run list --repo <OWNER>/<REPO> --branch <BRANCH> --limit 5
#### Action Handling:
- Actions are defined in `frontend/src/types/action-type.ts`
- The `HANDLED_ACTIONS` array in `frontend/src/state/chat-slice.ts` determines which actions are displayed as collapsible UI elements
- To add a new action type to the UI:
1. Add the action type to the `HANDLED_ACTIONS` array
2. Implement the action handling in `addAssistantAction` function in chat-slice.ts
3. Add a translation key in the format `ACTION_MESSAGE$ACTION_NAME` to the i18n files
- Actions with `thought` property are displayed in the UI based on their action type:
- Regular actions (like "run", "edit") display the thought as a separate message
- Special actions (like "think") are displayed as collapsible elements only
# Rerun failed jobs
gh run rerun <RUN_ID> --repo <OWNER>/<REPO> --failed
```
</REVIEW_HANDLING>
#### Adding User Settings:
- To add a new user setting to OpenHands, follow these steps:
1. Add the setting to the frontend:
- Add the setting to the `Settings` type in `frontend/src/types/settings.ts`
- Add the setting to the `ApiSettings` type in the same file
- Add the setting with an appropriate default value to `DEFAULT_SETTINGS` in `frontend/src/services/settings.ts`
- Update the `useSettings` hook in `frontend/src/hooks/query/use-settings.ts` to map the API response
- Update the `useSaveSettings` hook in `frontend/src/hooks/mutation/use-save-settings.ts` to include the setting in API requests
- Add UI components (like toggle switches) in the appropriate settings screen (e.g., `frontend/src/routes/app-settings.tsx`)
- Add i18n translations for the setting name and any tooltips in `frontend/src/i18n/translation.json`
- Add the translation key to `frontend/src/i18n/declaration.ts`
2. Add the setting to the backend:
- Add the setting to the `Settings` model in `openhands/storage/data_models/settings.py`
- Update any relevant backend code to apply the setting (e.g., in session creation)
#### Settings UI Patterns:
<CODE>
- Avoid hacky tricks like `sys.path.insert` when resolving package dependencies.
- Use existing packages/libraries instead of implementing them yourself whenever possible.
- Avoid using # type: ignore. Treat it only as a last resort. In most cases, issues should be resolved by improving type annotations, adding assertions, or adjusting code/tests—rather than silencing the type checker.
- Please AVOID using # type: ignore[attr-defined] unless absolutely necessary. If the issue can be addressed by adding a few extra assert statements to verify types, prefer that approach instead!
- For issues like # type: ignore[call-arg]: if you discover that the argument doesn't actually exist, do not try to mock it again in tests. Instead, simply remove it.
- Avoid doing in-line imports unless absolutely necessary (e.g., circular dependency).
- Avoid getattr/hasattr guards and instead enforce type correctness by relying on explicit type assertions and proper object usage, ensuring functions only receive the expected Pydantic models or typed inputs. Prefer type hints and validated models over runtime shape checks.
- Prefer accessing typed attributes directly. If necessary, convert inputs up front into a canonical shape; avoid purely hypothetical fallbacks.
- Use real newlines in commit messages; do not write literal "\n".
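The guidance on `getattr`/`hasattr` guards and canonical shapes can be sketched roughly as follows. `CommandResult`, `to_result`, and `describe` are hypothetical names, and a standard-library dataclass stands in for a Pydantic model so the snippet has no third-party dependencies.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class CommandResult:
    """Hypothetical typed result; stands in for a Pydantic model."""

    exit_code: int
    output: str


def to_result(raw: Union[CommandResult, dict]) -> CommandResult:
    """Convert the input into the canonical shape once, at the boundary."""
    if isinstance(raw, CommandResult):
        return raw
    exit_code = raw["exit_code"]
    output = raw["output"]
    # Explicit assertions narrow the types instead of silencing the checker.
    assert isinstance(exit_code, int), "exit_code must be an int"
    assert isinstance(output, str), "output must be a str"
    return CommandResult(exit_code=exit_code, output=output)


def describe(result: CommandResult) -> str:
    # Direct attribute access: no getattr(result, "exit_code", None) fallback.
    status = "ok" if result.exit_code == 0 else "failed"
    return f"{status}: {result.output}"


print(describe(to_result({"exit_code": 0, "output": "done"})))
```

Downstream functions such as `describe` then accept only the typed object, so a malformed input fails loudly at conversion time rather than being papered over by a runtime shape check.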
There are two main patterns for saving settings in the OpenHands frontend:
</CODE>
**Pattern 1: Entity-based Resources (Immediate Save)**
- Used for: API Keys, Secrets, MCP Servers
- Behavior: Changes are saved immediately when user performs actions (add/edit/delete)
- Implementation:
- No "Save Changes" button
- No local state management or `isDirty` tracking
- Uses dedicated mutation hooks for each operation (e.g., `use-add-mcp-server.ts`, `use-delete-mcp-server.ts`)
- Each mutation triggers immediate API call with query invalidation for UI updates
- Example: MCP settings, API Keys & Secrets tabs
- Benefits: Simpler UX, no risk of losing changes, consistent with modern web app patterns
<TESTING>
- AFTER you edit ONE file, you should run the pre-commit hooks on that file via `uv run pre-commit run --files [filepath]` to make sure you didn't break it.
- Don't write TOO MANY tests; write just enough to cover edge cases.
- Check how we perform tests in .github/workflows/tests.yml
- Put unit tests under the corresponding domain folder in `tests/` (e.g., `tests/sdk`, `tests/tools`, `tests/workspace`). For example, changes to `openhands-sdk/openhands/sdk/tool/tool.py` should be covered in `tests/sdk/tool/test_tool.py`.
- DON'T write TEST CLASSES unless absolutely necessary!
- If you find yourself duplicating logic when preparing mocks, loading data, etc., that logic should live in fixtures in conftest.py!
- Please test only the logic implemented in the current codebase. Do not test functionality (e.g., BaseModel.model_dump()) that is not implemented in this repository.
- For changes to prompt templates, tool descriptions, or agent decision logic, add the `integration-test` label to trigger integration tests and verify no unexpected impact on benchmark performance.
**Pattern 2: Form-based Settings (Manual Save)**
- Used for: Application settings, LLM configuration
- Behavior: Changes are accumulated locally and saved when user clicks "Save Changes"
- Implementation:
- Has "Save Changes" button that becomes enabled when changes are detected
- Uses local state management with `isDirty` tracking
- Uses `useSaveSettings` hook to save all changes at once
- Example: LLM tab, Application tab
- Benefits: Allows bulk changes, explicit save action, can validate all fields before saving
# Behavior Tests
**When to use each pattern:**
- Use Pattern 1 (Immediate Save) for entity management where each item is independent
- Use Pattern 2 (Manual Save) for configuration forms where settings are interdependent or need validation
Behavior tests (prefix `b##_*`) in `tests/integration/tests/` are designed to verify that agents exhibit desired behaviors in realistic scenarios. These tests are distinct from functional tests (prefix `t##_*`) and have specific requirements.
### Adding New LLM Models
Before adding or modifying behavior tests, review `tests/integration/BEHAVIOR_TESTS.md` for the latest workflow, expectations, and examples.
</TESTING>
To add a new LLM model to OpenHands, you need to update multiple files across both frontend and backend:
<AGENT_TMP_DIRECTORY>
# Agent Temporary Directory Convention
#### Model Configuration Procedure:
When tools need to store observation files (e.g., browser session recordings, task tracker data), use `.agent_tmp` as the directory name for consistency.
1. **Frontend Model Arrays** (`frontend/src/utils/verified-models.ts`):
- Add the model to `VERIFIED_MODELS` array (main list of all verified models)
- Add to provider-specific arrays based on the model's provider:
- `VERIFIED_OPENAI_MODELS` for OpenAI models
- `VERIFIED_ANTHROPIC_MODELS` for Anthropic models
- `VERIFIED_MISTRAL_MODELS` for Mistral models
- `VERIFIED_OPENHANDS_MODELS` for models available through OpenHands provider
The browser session recording tool saves recordings to `.agent_tmp/observations/recording-{timestamp}/`.
2. **Backend CLI Integration** (`openhands/cli/utils.py`):
- Add the model to the appropriate `VERIFIED_*_MODELS` arrays
- This ensures the model appears in CLI model selection
This convention ensures tool-generated observation files are stored in a predictable location that can be easily:
- Added to `.gitignore`
- Cleaned up after agent sessions
- Identified as agent-generated artifacts
3. **Backend Model List** (`openhands/utils/llm.py`):
- **CRITICAL**: Add the model to the `openhands_models` list (lines 57-66) if using OpenHands provider
- This is required for the model to appear in the frontend model selector
- Format: `'openhands/model-name'` (e.g., `'openhands/o3'`)
Note: This is separate from `persistence_dir` which is used for conversation state persistence.
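A minimal sketch of how a tool might build a path that follows this convention. The helper name `recording_dir` and the timestamp format are assumptions made for illustration; only the `.agent_tmp/observations/recording-{timestamp}/` layout comes from the text above.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

AGENT_TMP_DIR = ".agent_tmp"  # directory name from the convention above


def recording_dir(workspace_root: Path, now: Optional[datetime] = None) -> Path:
    """Return the directory for one browser recording (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Assumption: the real tool's timestamp format may differ from this one.
    stamp = now.strftime("%Y%m%dT%H%M%S")
    return workspace_root / AGENT_TMP_DIR / "observations" / f"recording-{stamp}"
```

Because every artifact lands under one top-level directory, a single `.agent_tmp/` entry in `.gitignore` covers all of them.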
</AGENT_TMP_DIRECTORY>
4. **Backend LLM Configuration** (`openhands/llm/llm.py`):
- Add to feature-specific arrays based on model capabilities:
- `FUNCTION_CALLING_SUPPORTED_MODELS` if the model supports function calling
- `REASONING_EFFORT_SUPPORTED_MODELS` if the model supports reasoning effort parameters
- `CACHE_PROMPT_SUPPORTED_MODELS` if the model supports prompt caching
- `MODELS_WITHOUT_STOP_WORDS` if the model doesn't support stop words
<REPO>
<PROJECT_STRUCTURE>
- This is a `uv`-managed Python monorepo (single `uv.lock` at repo root) with multiple distributable packages: `openhands-sdk/` (SDK), `openhands-tools/` (built-in tools), `openhands-workspace/` (workspace impls), and `openhands-agent-server/` (server runtime).
- `examples/` contains runnable patterns; `tests/` is split by domain (`tests/sdk`, `tests/tools`, `tests/workspace`, `tests/agent_server`, etc.).
- Python namespace is `openhands.*` across packages; keep new modules within the matching package and mirror test paths under `tests/`.
</PROJECT_STRUCTURE>
5. **Validation**:
- Run backend linting: `pre-commit run --config ./dev_config/python/.pre-commit-config.yaml`
- Run frontend linting: `cd frontend && npm run lint:fix`
- Run frontend build: `cd frontend && npm run build`
<QUICK_COMMANDS>
- Set up the dev environment: `make build` (runs `uv sync --dev` and installs pre-commit; requires uv >= 0.8.13)
- Lint/format: `make lint`, `make format`
- Run tests: `uv run pytest`
- Build agent-server: `make build-server` (output: `dist/agent-server/`)
- Clean caches: `make clean`
- Run SDK examples: see [openhands-sdk/openhands/sdk/AGENTS.md](openhands-sdk/openhands/sdk/AGENTS.md).
- The example workflow runs `uv run pytest tests/examples/test_examples.py --run-examples`; each successful example must print an `EXAMPLE_COST: ...` line to stdout (use `EXAMPLE_COST: 0` for non-LLM examples).
- Conversation plugins passed via `plugins=[...]` are lazy-loaded on the first `send_message()` or `run()`, so example code should inspect plugin-added skills or `resolved_plugins` only after that first interaction.
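The `EXAMPLE_COST` requirement above could be checked with something like the following sketch. The function name `extract_example_cost` and the assumption that the value is a plain number are illustrative; this is not the parser used by `tests/examples/test_examples.py`.

```python
import re
from typing import Optional

# Matches a full line such as "EXAMPLE_COST: 0.0123".
_COST_LINE = re.compile(r"^EXAMPLE_COST:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def extract_example_cost(stdout: str) -> Optional[float]:
    """Return the last reported cost, or None if no valid line was printed."""
    matches = _COST_LINE.findall(stdout)
    if not matches:
        return None
    try:
        # Assumption: the value is numeric; use the last line if several exist.
        return float(matches[-1])
    except ValueError:
        return None
```

A non-LLM example would satisfy this check by ending with `print("EXAMPLE_COST: 0")`.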
</QUICK_COMMANDS>
#### Model Verification Arrays:
<REPO_CONFIG_NOTES>
- Ruff: `line-length = 88`, `target-version = "py312"` (see `pyproject.toml`).
- Ruff ignores `ARG` (unused arguments) under `tests/**/*.py` to allow pytest fixtures.
- Repository guidance lives in the project root AGENTS.md (loaded as a third-party skill file).
</REPO_CONFIG_NOTES>
- **VERIFIED_MODELS**: Main array of all verified models shown in the UI
- **VERIFIED_OPENAI_MODELS**: OpenAI models (LiteLLM doesn't return provider prefix)
- **VERIFIED_ANTHROPIC_MODELS**: Anthropic models (LiteLLM doesn't return provider prefix)
- **VERIFIED_MISTRAL_MODELS**: Mistral models (LiteLLM doesn't return provider prefix)
- **VERIFIED_OPENHANDS_MODELS**: Models available through OpenHands managed provider
#### Model Feature Support Arrays:
- **FUNCTION_CALLING_SUPPORTED_MODELS**: Models that support structured function calling
- **REASONING_EFFORT_SUPPORTED_MODELS**: Models that support reasoning effort parameters (like o1, o3)
- **CACHE_PROMPT_SUPPORTED_MODELS**: Models that support prompt caching for efficiency
- **MODELS_WITHOUT_STOP_WORDS**: Models that don't support stop word parameters
#### Frontend Model Integration:
- Models are automatically available in the model selector UI once added to verified arrays
- The `extractModelAndProvider` utility automatically detects provider from model arrays
- Provider-specific models are grouped and prioritized in the UI selection
#### CLI Model Integration:
- Models appear in CLI provider selection based on the verified arrays
- The `organize_models_and_providers` function groups models by provider
- Default model selection prioritizes verified models for each provider
</REPO>
-55
@@ -1,55 +0,0 @@
cff-version: 1.2.0
message: "If you use this software, please cite it using the following metadata."
title: "OpenHands: An Open Platform for AI Software Developers as Generalist Agents"
authors:
- family-names: Wang
given-names: Xingyao
- family-names: Li
given-names: Boxuan
- family-names: Song
given-names: Yufan
- family-names: Xu
given-names: Frank F.
- family-names: Tang
given-names: Xiangru
- family-names: Zhuge
given-names: Mingchen
- family-names: Pan
given-names: Jiayi
- family-names: Song
given-names: Yueqi
- family-names: Li
given-names: Bowen
- family-names: Singh
given-names: Jaskirat
- family-names: Tran
given-names: Hoang H.
- family-names: Li
given-names: Fuqiang
- family-names: Ma
given-names: Ren
- family-names: Zheng
given-names: Mingzhang
- family-names: Qian
given-names: Bill
- family-names: Shao
given-names: Yanjun
- family-names: Muennighoff
given-names: Niklas
- family-names: Zhang
given-names: Yizhe
- family-names: Hui
given-names: Binyuan
- family-names: Lin
given-names: Junyang
- family-names: Brennan
given-names: Robert
- family-names: Peng
given-names: Hao
- family-names: Ji
given-names: Heng
- family-names: Neubig
given-names: Graham
year: 2024
doi: "10.48550/arXiv.2407.16741"
url: "https://arxiv.org/abs/2407.16741"
-1
@@ -1 +0,0 @@
docs.all-hands.dev
-152
@@ -1,152 +0,0 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people.
* Being respectful of differing opinions, viewpoints, and experiences.
* Giving and gracefully accepting constructive feedback.
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience.
* Focusing on what is best not just for us as individuals, but for the overall
community.
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind.
* Trolling, insulting or derogatory comments, and personal or political attacks.
* Public or private harassment.
* Publishing others' private information, such as a physical or email address,
without their explicit permission.
* Other conduct which could reasonably be considered inappropriate in a
professional setting.
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
contact@openhands.dev.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
### Slack Etiquette
These Slack etiquette guidelines are designed to foster an inclusive, respectful, and productive environment for all
community members. By following these best practices, we ensure effective communication and collaboration while
minimizing disruptions. Let's work together to build a supportive and welcoming community!
- Communicate respectfully and professionally, avoiding sarcasm or harsh language, and remember that tone can be difficult to interpret in text.
- Use threads for specific discussions to keep channels organized and easier to follow.
- Tag others only when their input is critical or urgent, and use @here, @channel or @everyone sparingly to minimize disruptions.
- Be patient, as open-source contributors and maintainers often have other commitments and may need time to respond.
- Post questions or discussions in the most relevant channel (e.g., [slack - #general](https://openhands-ai.slack.com/archives/C06P5NCGSFP) for general topics, [slack - #questions](https://openhands-ai.slack.com/archives/C06U8UTKSAD) for queries/questions).
- When asking for help or raising issues, include necessary details like links, screenshots, or clear explanations to provide context.
- Keep discussions in public channels whenever possible to allow others to benefit from the conversation, unless the matter is sensitive or private.
- Always adhere to [our standards](https://github.com/OpenHands/OpenHands/blob/main/CODE_OF_CONDUCT.md#our-standards) to ensure a welcoming and collaborative environment.
- If you choose to mute a channel, consider setting up alerts for topics that still interest you to stay engaged.
In Slack, go to Settings → Notifications → My Keywords to add specific keywords that will notify you when mentioned.
For example, if you're here for discussions about LLMs, mute the channel if it's too busy, but set notifications to
alert you only when “LLMs” appears in messages.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
-58
@@ -1,58 +0,0 @@
# The OpenHands Community
OpenHands is a community of engineers, academics, and enthusiasts reimagining software development for an AI-powered
world.
## Mission
It's very clear that AI is changing software development. We want the developer community to drive that change
organically, through open source.
So we're not just building friendly interfaces for AI-driven development. We're publishing _building blocks_ that
empower developers to create new experiences, tailored to your own habits, needs, and imagination.
## Ethos
We have two core values: **high openness** and **high agency**. While we don't expect everyone in the community to
embody these values, we want to establish them as norms.
### High Openness
We welcome anyone and everyone into our community by default. You don't have to be a software developer to help us
build. You don't have to be pro-AI to help us learn.
Our plans, our work, our successes, and our failures are all public record. We want the world to see not just the
fruits of our work, but the whole process of growing it.
We welcome thoughtful criticism, whether it's a comment on a PR or feedback on the community as a whole.
### High Agency
Everyone should feel empowered to contribute to OpenHands. Whether it's by making a PR, hosting an event, sharing
feedback, or just asking a question, don't hold back!
OpenHands gives everyone the building blocks to create state-of-the-art developer experiences. We experiment constantly
and love building new things.
Coding, development practices, and communities are changing rapidly. We won't hesitate to change direction and make big bets.
## Relationship to All Hands
OpenHands is supported by the for-profit organization [All Hands AI, Inc](https://www.openhands.dev/).
All Hands was founded by three of the first major contributors to OpenHands:
- Xingyao Wang, a UIUC PhD candidate who got OpenHands to the top of the SWE-bench leaderboards
- Graham Neubig, a CMU Professor who rallied the academic community around OpenHands
- Robert Brennan, a software engineer who architected the user-facing features of OpenHands
All Hands is an important part of the OpenHands ecosystem. We've raised over $20M--mainly to hire developers and
researchers who can work on OpenHands full-time, and to provide them with expensive infrastructure. ([Join us!](https://allhandsai.applytojob.com/apply/))
But we see OpenHands as much larger, and ultimately more important, than All Hands. When our financial responsibility
to investors is at odds with our social responsibility to the community—as it inevitably will be, from time to time—we
promise to navigate that conflict thoughtfully and transparently.
At some point, we may transfer custody of OpenHands to an open source foundation. But for now,
the [Benevolent Dictator approach](http://www.catb.org/~esr/writings/cathedral-bazaar/homesteading/ar01s16.html) helps us move forward with speed and intention. If we ever forget the
“benevolent” part, please: fork us.
+50 -118
@@ -1,138 +1,70 @@
# Contributing
Thanks for your interest in contributing to OpenHands! We welcome and appreciate contributions.
Thank you for helping improve the OpenHands Software Agent SDK.
## Understanding OpenHands's CodeBase
This repo is a foundation. We want the SDK to stay stable and extensible so that many
applications can build on it safely.
To understand the codebase, please refer to the README in each module:
- [frontend](./frontend/README.md)
- [evaluation](./evaluation/README.md)
- [openhands](./openhands/README.md)
- [agenthub](./openhands/agenthub/README.md)
- [server](./openhands/server/README.md)
Downstream applications we actively keep in mind:
- [OpenHands-CLI](https://github.com/OpenHands/OpenHands-CLI) (client)
- [OpenHands app-server](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/README.md) (client)
- [OpenHands Enterprise](https://github.com/OpenHands/OpenHands/blob/main/enterprise/README.md) (client)
## Setting up Your Development Environment
The SDK itself has a Python interface. In addition, the
[agent-server](https://docs.openhands.dev/sdk/guides/agent-server/overview) is the
REST/WebSocket server component that exposes the SDK for remote execution and integrations.
Changes should keep both interfaces stable and consistent.
We have a separate doc [Development.md](https://github.com/OpenHands/OpenHands/blob/main/Development.md) that tells
you how to set up a development workflow.
## A lesson we learned (why we care about architecture)
## How Can I Contribute?
In earlier iterations, we repeatedly ran into a failure mode: needs from downstream applications
(or assumptions) would leak into core logic.
There are many ways that you can contribute:
That kind of coupling can feel convenient in the moment, but it tends to create subtle
breakage elsewhere: different environments, different workspaces, different execution modes,
and different evaluation setups.
1. **Download and use** OpenHands, and send [issues](https://github.com/OpenHands/OpenHands/issues) when you encounter something that isn't working or a feature that you'd like to see.
2. **Send feedback** after each session by [clicking the thumbs-up thumbs-down buttons](https://docs.openhands.dev/usage/feedback), so we can see where things are working and failing, and also build an open dataset for training code agents.
3. **Improve the Codebase** by sending [PRs](#sending-pull-requests-to-openhands) (see details below). In particular, we have some [good first issues](https://github.com/OpenHands/OpenHands/labels/good%20first%20issue) that may be ones to start on.
The architecture of OpenHands V0 was too monolithic to support multiple applications, whether they were
built into it (the CLI, evaluation scripts, and web server) or built on top of it (OpenHands Cloud).
## What Can I Build?
If you're interested in the deeper background and lessons learned, see our write-up:
[OpenHands: An Open Platform for AI Software Developers as Generalist Agents](https://arxiv.org/abs/2511.03690)
Here are a few ways you can help improve the codebase.
This SDK exists (as a separate, rebuilt foundation) to avoid that failure mode.
#### UI/UX
## Principles we review PRs with
We're always looking to improve the look and feel of the application. If you've got a small fix
for something that's bugging you, feel free to open up a PR that changes the [`./frontend`](./frontend) directory.
We welcome all contributions, big or small, to improve or extend the software agent SDK.
If you're looking to make a bigger change, add a new UI element, or significantly alter the style
of the application, please open an issue first, or better, join the #dev-ui-ux channel in our Slack
to gather consensus from our design team first.
You may find that occasionally we are opinionated about several things:
#### Improving the agent
- **OpenHands SDK is its own thing**: its downstream consumers are client applications.
- **Prefer interfaces over special cases**: if a client needs something, add or improve a
clean, reusable interface/extension point instead of adding a shortcut.
- **Extensibility over one-off patches**: design features so multiple clients can adopt them
without rewriting core logic.
- **Avoid hidden assumptions**: don't rely on particular env vars, workspace layouts, request
contexts, or runtime quirks that only exist in one app.
- Workspaces *do* encode environment specifics (local/Docker/remote), but keep those assumptions
explicit (params + validation) and contained to the `workspace` layer.
- **No client-specific code paths**: avoid logic that only makes sense for one
downstream app.
- It's fine to have multiple workspace implementations; it's not fine for SDK core behavior to
branch on whether the caller is CLI/app-server/SaaS. Prefer capabilities/config over app-identity.
- **Keep the agent loop stable**: treat stability as a feature; be cautious with control-flow
changes and "small" behavior tweaks.
- **Compatibility is part of the API**: if something could break downstream clients, call it
out explicitly and consider a migration path. We have a deprecation mechanism you may want to use.
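As an illustration of "prefer capabilities/config over app-identity", here is a minimal sketch. `WorkspaceCapabilities` and `plan_upload` are hypothetical names invented for this example and are not part of the SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkspaceCapabilities:
    """Hypothetical capability flags that a workspace declares explicitly."""

    supports_file_upload: bool = False


def plan_upload(caps: WorkspaceCapabilities, path: str) -> str:
    """Choose a transfer strategy from declared capabilities only."""
    # Core logic branches on what the workspace can do,
    # never on which client (CLI, app-server, SaaS) is calling.
    if caps.supports_file_upload:
        return f"upload:{path}"
    return f"inline:{path}"
```

Any client can opt into the upload path by declaring the capability, so no branch in core logic needs to name a specific downstream app.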
Our main agent is the CodeAct agent. You can [see its prompts here](https://github.com/OpenHands/OpenHands/tree/main/openhands/agenthub/codeact_agent).
If you're not sure whether a change crosses these lines, please ask early. We're happy to help think
through the shape of a clean interface.
Changes to these prompts, and to the underlying behavior in Python, can have a huge impact on user experience.
You can try modifying the prompts to see how they change the behavior of the agent as you use the app
locally, but we will need to do an end-to-end evaluation of any changes here to ensure that the agent
is getting better over time.
## Practical pointers
We use the [SWE-bench](https://www.swebench.com/) benchmark to test our agent. You can join the #evaluation
channel in Slack to learn more.
This file is mostly about principles. For the mechanics, please see:
- [AGENTS.md](AGENTS.md) for AI agents
- [DEVELOPMENT.md](DEVELOPMENT.md) for humans
#### Adding a new agent
## Questions / discussion
You may want to experiment with building new types of agents. You can add an agent to [`openhands/agenthub`](./openhands/agenthub)
to help expand the capabilities of OpenHands.
#### Adding a new runtime
The agent needs a place to run code and commands. When you run OpenHands on your laptop, it uses a Docker container
to do this by default. But there are other ways of creating a sandbox for the agent.
If you work for a company that provides a cloud-based runtime, you could help us add support for that runtime
by implementing the [interface specified here](https://github.com/OpenHands/OpenHands/blob/main/openhands/runtime/base.py).
#### Testing
When you write code, it is also good to write tests. Please navigate to the [`./tests`](./tests) folder to see existing
test suites. At the moment, we have these kinds of tests: [`unit`](./tests/unit), [`runtime`](./tests/runtime), and [`end-to-end (e2e)`](./tests/e2e).
Please refer to the README for each test suite. These tests also run on GitHub's continuous integration to ensure
quality of the project.
## Sending Pull Requests to OpenHands
You'll need to fork our repository to send us a Pull Request. You can learn more
about how to fork a GitHub repo and open a PR with your changes in [this article](https://medium.com/swlh/forks-and-pull-requests-how-to-contribute-to-github-repos-8843fac34ce8).
### Pull Request title
As described [here](https://github.com/commitizen/conventional-commit-types/blob/master/index.json), ideally a valid PR title should begin with one of the following prefixes:
- `feat`: A new feature
- `fix`: A bug fix
- `docs`: Documentation only changes
- `style`: Changes that do not affect the meaning of the code (white space, formatting, missing semicolons, etc.)
- `refactor`: A code change that neither fixes a bug nor adds a feature
- `perf`: A code change that improves performance
- `test`: Adding missing tests or correcting existing tests
- `build`: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
- `ci`: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
- `chore`: Other changes that don't modify src or test files
- `revert`: Reverts a previous commit
For example, a PR title could be:
- `refactor: modify package path`
- `feat(frontend): xxxx`, where `(frontend)` means that this PR mainly focuses on the frontend component.
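The title convention above can be sketched as a small check. The function name `is_valid_pr_title` and the exact characters allowed inside the optional scope are assumptions; only the prefix list and the two example titles come from the text.

```python
import re

# Prefixes listed above.
PREFIXES = (
    "feat", "fix", "docs", "style", "refactor", "perf",
    "test", "build", "ci", "chore", "revert",
)

# "<prefix>", an optional "(scope)", then ": " and a non-empty summary.
# Assumption: scopes contain word characters, dots, slashes, or hyphens.
_TITLE = re.compile(
    r"^(" + "|".join(PREFIXES) + r")(\([\w./-]+\))?: \S.*$"
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title starts with a recognized prefix."""
    return _TITLE.match(title) is not None
```

Both example titles above pass this check, while a title without a prefix, such as `update stuff`, does not.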
You may also check out previous PRs in the [PR list](https://github.com/OpenHands/OpenHands/pulls).
### Pull Request description
- If your PR is small (such as a typo fix), you can go brief.
- If it contains a lot of changes, it's better to write more details.
If your changes are user-facing (e.g. a new feature in the UI, a change in behavior, or a bugfix)
please include a short message that we can add to our changelog.
## How to Make Effective Contributions
### Opening Issues
If you notice any bugs or have any feature requests please open them via the [issues page](https://github.com/OpenHands/OpenHands/issues). We will triage
based on how critical the bug is or how potentially useful the improvement is, discuss, and implement the ones that
the community has interest/effort for.
Further, if you see an issue you like, please leave a "thumbs-up" or a comment, which will help us prioritize.
### Making Pull Requests
We're generally happy to consider all pull requests with the evaluation process varying based on the type of change:
#### For Small Improvements
Small improvements with few downsides are typically reviewed and approved quickly.
One thing to check when making changes is to ensure that all continuous integration tests pass, which you can check
before getting a review.
#### For Core Agent Changes
We need to be more careful with changes to the core agent, as it is imperative to maintain high quality. These PRs are
evaluated based on three key metrics:
1. **Accuracy**
2. **Efficiency**
3. **Code Complexity**
If it improves accuracy, efficiency, or both with only a minimal change to code quality, that's great, and we're happy to merge it in!
If there are bigger tradeoffs (e.g. helping efficiency a lot and hurting accuracy a little) we might want to put it behind a feature flag.
Either way, please feel free to discuss on GitHub issues or Slack, and we will give guidance and preliminary feedback.
Join us on Slack: https://openhands.dev/joinslack
-328
@@ -1,328 +0,0 @@
# Credits
## Contributors
We would like to thank all the [contributors](https://github.com/OpenHands/OpenHands/graphs/contributors) who have
helped make OpenHands possible. We greatly appreciate your dedication and hard work.
## Open Source Projects
OpenHands includes and adapts the following open source projects. We are grateful for their contributions to the
open source community:
#### [SWE Agent](https://github.com/princeton-nlp/swe-agent)
- License: MIT License
- Description: Adapted for use in OpenHands's agent hub
#### [Aider](https://github.com/paul-gauthier/aider)
- License: Apache License 2.0
- Description: AI pair programming tool. OpenHands has adapted and integrated its linter module for code-related tasks in [`agentskills utilities`](https://github.com/OpenHands/OpenHands/tree/main/openhands/runtime/plugins/agent_skills/utils/aider)
#### [BrowserGym](https://github.com/ServiceNow/BrowserGym)
- License: Apache License 2.0
- Description: Adapted in implementing the browsing agent
### Reference Implementations for Evaluation Benchmarks
OpenHands integrates code of the reference implementations for the following agent evaluation benchmarks:
#### [HumanEval](https://github.com/openai/human-eval)
- License: MIT License
#### [DSP](https://github.com/microsoft/DataScienceProblems)
- License: MIT License
#### [HumanEvalPack](https://github.com/bigcode-project/bigcode-evaluation-harness)
- License: Apache License 2.0
#### [AgentBench](https://github.com/THUDM/AgentBench)
- License: Apache License 2.0
#### [SWE-Bench](https://github.com/princeton-nlp/SWE-bench)
- License: MIT License
#### [BIRD](https://bird-bench.github.io/)
- License: MIT License
- Dataset: CC-BY-SA 4.0
#### [Gorilla APIBench](https://github.com/ShishirPatil/gorilla)
- License: Apache License 2.0
#### [GPQA](https://github.com/idavidrein/gpqa)
- License: MIT License
#### [ProntoQA](https://github.com/asaparov/prontoqa)
- License: Apache License 2.0
## Open Source licenses
### MIT License
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit
persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the
Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS
OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
### BSD 3-Clause License
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the
following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following
disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote
products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
### Apache License 2.0
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
### Non-Open Source Reference Implementations:
#### [MultiPL-E](https://github.com/nuprl/MultiPL-E)
- License: BSD 3-Clause License with Machine Learning Restriction
BSD 3-Clause License with Machine Learning Restriction
Copyright (c) 2022, Northeastern University, Oberlin College, Roblox Inc,
Stevens Institute of Technology, University of Massachusetts Amherst, and
Wellesley College.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
4. The contents of this repository may not be used as training data for any
machine learning model, including but not limited to neural networks.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+48
@@ -0,0 +1,48 @@
# Development Guide
## Setup
```bash
git clone https://github.com/OpenHands/agent-sdk.git
cd agent-sdk
make build
```
## Code Quality
```bash
make format # Format code
make lint # Lint code
uv run pre-commit run --all-files # Run all checks
```
Pre-commit hooks run type checking and linting automatically on each commit.
## Testing
```bash
uv run pytest # All tests
uv run pytest tests/sdk/ # SDK tests only
uv run pytest tests/tools/ # Tools tests only
```
## Project Structure
```
agent-sdk/
├── openhands-sdk/ # Core SDK package
├── openhands-tools/ # Built-in tools
├── openhands-workspace/ # Workspace management
├── openhands-agent-server/ # Agent server
├── examples/ # Usage examples
└── tests/ # Test suites
```
## Contributing
1. Create a new branch
2. Make your changes
3. Run tests and checks
4. Push and create a pull request
For questions, join our [Slack community](https://openhands.dev/joinslack).
-206
@@ -1,206 +0,0 @@
# Development Guide
This guide is for people working on OpenHands and editing the source code.
If you wish to contribute your changes, see
[CONTRIBUTING.md](https://github.com/OpenHands/OpenHands/blob/main/CONTRIBUTING.md)
for how to clone and set up the project before moving on. Otherwise,
you can clone the OpenHands project directly.
## Start the Server for Development
### 1. Requirements
- Linux, macOS, or [WSL on Windows](https://learn.microsoft.com/en-us/windows/wsl/install) [Ubuntu >= 22.04]
- [Docker](https://docs.docker.com/engine/install/) (On macOS, make sure to allow the default Docker socket to be used from advanced settings!)
- [Python](https://www.python.org/downloads/) = 3.12
- [NodeJS](https://nodejs.org/en/download/package-manager) >= 22.x
- [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) >= 1.8
- OS-specific dependencies:
- Ubuntu: build-essential => `sudo apt-get install build-essential python3.12-dev`
- WSL: netcat => `sudo apt-get install netcat`
Make sure you have all these dependencies installed before moving on to `make build`.
#### Dev container
There is a [dev container](https://containers.dev/) available which provides a
pre-configured environment with all the necessary dependencies installed if you
are using a [supported editor or tool](https://containers.dev/supporting). For
example, if you are using Visual Studio Code (VS Code) with the
[Dev Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)
extension installed, you can open the project in a dev container by using the
_Dev Container: Reopen in Container_ command from the Command Palette
(Ctrl+Shift+P).
#### Develop without sudo access
If you want to develop without system admin/sudo access to upgrade/install `Python` and/or `NodeJS`, you can use
`conda` or `mamba` to manage the packages for you:
```bash
# Download and install Mamba (a faster version of conda)
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
# Install Python 3.12, nodejs, and poetry
mamba install python=3.12
mamba install conda-forge::nodejs
mamba install conda-forge::poetry
```
### 2. Build and Setup The Environment
Begin by building the project which includes setting up the environment and installing dependencies. This step ensures
that OpenHands is ready to run on your system:
```bash
make build
```
### 3. Configuring the Language Model
OpenHands supports a diverse array of Language Models (LMs) through the powerful [litellm](https://docs.litellm.ai) library.
To configure the LM of your choice, run:
```bash
make setup-config
```
This command will prompt you to enter the LLM API key, model name, and other variables ensuring that OpenHands is
tailored to your specific needs. Note that the model name will apply only when you run headless. If you use the UI,
please set the model in the UI.
Note: If you have previously run OpenHands using the docker command, you may have already set some environment
variables in your terminal. The final configurations are set from highest to lowest priority:
Environment variables > config.toml variables > default variables
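The priority order above can be sketched as a simple layered lookup. This is a minimal illustration, not the project's actual configuration loader: the function name `resolve_setting`, the `LLM_` environment prefix, and the setting names are all hypothetical.

```python
def resolve_setting(name, env, toml_values, defaults):
    """Return the value for `name`, checking sources from highest to lowest priority.

    `env`, `toml_values`, and `defaults` are plain dicts standing in for the
    process environment, a parsed config.toml, and built-in defaults.
    The "LLM_" prefix is an assumption made for this sketch.
    """
    env_key = f"LLM_{name.upper()}"
    if env_key in env:          # 1. environment variables win
        return env[env_key]
    if name in toml_values:     # 2. then config.toml
        return toml_values[name]
    return defaults[name]       # 3. finally the default


# Example: the same setting defined in all three places.
env = {"LLM_MODEL": "model-from-env"}
toml_values = {"model": "model-from-toml", "temperature": 0.2}
defaults = {"model": "model-default", "temperature": 0.0, "retries": 3}

print(resolve_setting("model", env, toml_values, defaults))
print(resolve_setting("temperature", env, toml_values, defaults))
print(resolve_setting("retries", env, toml_values, defaults))
```

In practice this means a stale variable exported in your terminal silently overrides whatever `make setup-config` wrote, so it is worth checking your environment when a setting does not seem to take effect.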
**Note on Alternative Models:**
See [our documentation](https://docs.openhands.dev/usage/llms) for recommended models.
### 4. Running the application
#### Option A: Run the Full Application
Once the setup is complete, this command starts both the backend and frontend servers, allowing you to interact with OpenHands:
```bash
make run
```
#### Option B: Individual Server Startup
- **Start the Backend Server:** If you prefer, you can start the backend server independently to focus on
backend-related tasks or configurations.
```bash
make start-backend
```
- **Start the Frontend Server:** Similarly, you can start the frontend server on its own to work on frontend-related
components or interface enhancements.
```bash
make start-frontend
```
### 5. Running OpenHands with OpenHands
You can use OpenHands to develop and improve OpenHands itself! This is a powerful way to leverage AI assistance for contributing to the project.
#### Quick Start
1. **Build and run OpenHands:**
```bash
export INSTALL_DOCKER=0
export RUNTIME=local
make build && make run
```
2. **Access the interface:**
- Local development: http://localhost:3001
- Remote/cloud environments: Use the appropriate external URL
3. **Configure for external access (if needed):**
```bash
# For external access (e.g., cloud environments)
make run FRONTEND_PORT=12000 FRONTEND_HOST=0.0.0.0 BACKEND_HOST=0.0.0.0
```
### 6. LLM Debugging
If you encounter any issues with the Language Model (LM) or you're simply curious, export `DEBUG=1` in the environment and restart the backend.
OpenHands will log the prompts and responses in the `logs/llm/CURRENT_DATE` directory, allowing you to identify the causes.
### 7. Help
Need help or info on available targets and commands? Use the help command for all the guidance you need with OpenHands.
```bash
make help
```
### 8. Testing
To run tests, refer to the following:
#### Unit tests
```bash
poetry run pytest ./tests/unit/test_*.py
```
### 9. Add or update dependency
1. Add your dependency in `pyproject.toml` or use `poetry add xxx`.
2. Update the poetry.lock file via `poetry lock --no-update`.
### 10. Use existing Docker image
To reduce build time (e.g., if no changes were made to the client-runtime component), you can use an existing Docker
container image by setting the SANDBOX_RUNTIME_CONTAINER_IMAGE environment variable to the desired Docker image.
Example: `export SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/openhands/runtime:1.2-nikolaik`
## Develop inside Docker container
TL;DR
```bash
make docker-dev
```
See more details [here](./containers/dev/README.md).
If you just want to run `OpenHands` without installing all the required tools on your host, run:
```bash
make docker-run
```
If you do not have `make` on your host, run:
```bash
cd ./containers/dev
./dev.sh
```
You do need [Docker](https://docs.docker.com/engine/install/) installed on your host though.
## Key Documentation Resources
Here's a guide to the important documentation files in the repository:
- [/README.md](./README.md): Main project overview, features, and basic setup instructions
- [/Development.md](./Development.md) (this file): Comprehensive guide for developers working on OpenHands
- [/CONTRIBUTING.md](./CONTRIBUTING.md): Guidelines for contributing to the project, including code style and PR process
- [DOC_STYLE_GUIDE.md](https://github.com/OpenHands/docs/blob/main/openhands/DOC_STYLE_GUIDE.md): Standards for writing and maintaining project documentation
- [/openhands/README.md](./openhands/README.md): Details about the backend Python implementation
- [/frontend/README.md](./frontend/README.md): Frontend React application setup and development guide
- [/containers/README.md](./containers/README.md): Information about Docker containers and deployment
- [/tests/unit/README.md](./tests/unit/README.md): Guide to writing and running unit tests
- [/evaluation/README.md](./evaluation/README.md): Documentation for the evaluation framework and benchmarks
- [/skills/README.md](./skills/README.md): Information about the skills architecture and implementation
- [/openhands/server/README.md](./openhands/server/README.md): Server implementation details and API documentation
- [/openhands/runtime/README.md](./openhands/runtime/README.md): Documentation for the runtime environment and execution model
-27
@@ -1,27 +0,0 @@
# Issue Triage
These are the procedures and guidelines on how issues are triaged in this repo by the maintainers.
## General
* All issues must be tagged with **enhancement**, **bug** or **troubleshooting/help**.
* Issues may be tagged with what they relate to (**llm**, **app tab**, **UI/UX**, etc.).
## Severity
* **High**: High visibility issues or affecting many users.
* **Critical**: Affecting all users or potential security issues.
## Difficulty
* Issues good for newcomers may be tagged with **good first issue**.
## Not Enough Information
* User is asked to provide more information (logs, how to reproduce, etc.) when the issue is not clear.
* If an issue is unclear and the author does not provide more information or respond to a request,
the issue may be closed as **not planned** (usually after a week).
## Multiple Requests/Fixes in One Issue
* These issues will be narrowed down to one request/fix so the issue is more easily tracked and fixed.
* Issues may be broken down into multiple issues if required.
## Stale and Auto Closures
* In order to keep a maintainable backlog, issues that have no activity within 40 days are automatically marked as **Stale**.
* If issues marked as **Stale** continue to have no activity for 10 more days, they will automatically be closed as not planned.
* Issues may be reopened by maintainers if deemed important.
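
The stale and auto-closure timeline above can be sketched as a small state function. This is an illustration under stated assumptions, not the repository's actual automation: the name `triage_state` is hypothetical, the boundaries are treated as inclusive (an issue becomes stale at exactly 40 idle days), and marking an issue as stale is assumed not to count as activity.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=40)        # no activity for 40 days -> Stale
CLOSE_AFTER_STALE = timedelta(days=10)  # 10 more idle days -> closed as not planned


def triage_state(last_activity: date, today: date) -> str:
    """Classify an issue by how long it has been idle."""
    idle = today - last_activity
    if idle >= STALE_AFTER + CLOSE_AFTER_STALE:
        return "closed"
    if idle >= STALE_AFTER:
        return "stale"
    return "active"


today = date(2026, 3, 25)
for idle_days in (39, 40, 49, 50):
    state = triage_state(today - timedelta(days=idle_days), today)
    print(idle_days, state)
```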
+17 -26
@@ -1,30 +1,21 @@
Portions of this software are licensed as follows:
* All content that resides under the enterprise/ directory is licensed under the license defined in "enterprise/LICENSE".
* Content outside of the above mentioned directories or restrictions above is available under the MIT license as defined below.
MIT License
=====================
Copyright (c) 2026 OpenHands contributors
The MIT License (MIT)
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
Copyright © 2025
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the “Software”), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+11
@@ -0,0 +1,11 @@
# Repository Maintainers
#
# Format: Each maintainer on a new line starting with "- @username"
# This file is read by .github/workflows/assign-reviews.yml for automated triage
#
The following people are maintainers of this repository and are responsible for triage and review:
- @xingyaoww
- @neubig
- @enyst
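
How the workflow reads this file is not shown in this diff. As a minimal sketch of the stated format (one maintainer per line, starting with `- @username`), a parser might look like the following; the function name `parse_maintainers` and the allowed username characters are assumptions made for illustration.

```python
import re

# A maintainer line is "- @username"; everything else (comments, prose) is ignored.
# The character class is an assumption about what a username may contain.
MAINTAINER_LINE = re.compile(r"^- @([A-Za-z0-9-]+)$")


def parse_maintainers(text: str) -> list[str]:
    """Return the usernames listed in a MAINTAINERS-style file, in order."""
    names = []
    for line in text.splitlines():
        match = MAINTAINER_LINE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


sample = """# Repository Maintainers
#
# Format: Each maintainer on a new line starting with "- @username"
#
The following people are maintainers of this repository:
- @xingyaoww
- @neubig
- @enyst
"""

print(parse_maintainers(sample))
```

Anchoring the pattern at the start of the line keeps the `"- @username"` example inside the comment header from being picked up as a maintainer.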
+47 -4
@@ -1,5 +1,48 @@
# Exclude all Python bytecode files
global-exclude *.pyc
# This MANIFEST.in file tells setuptools which files to include
# in the sdist package distribution used for building docker image
# Exclude Python cache directories
global-exclude __pycache__
# ==============================================================================
# Root-level workspace files
# ==============================================================================
include pyproject.toml
include uv.lock
# ==============================================================================
# openhands-sdk
# ==============================================================================
include openhands-sdk/pyproject.toml
recursive-include openhands-sdk *.py
recursive-include openhands-sdk *.j2
recursive-include openhands-sdk py.typed
# ==============================================================================
# openhands-tools
# ==============================================================================
include openhands-tools/pyproject.toml
recursive-include openhands-tools *.py
recursive-include openhands-tools *.j2
recursive-include openhands-tools py.typed
# ==============================================================================
# openhands-workspace
# ==============================================================================
include openhands-workspace/pyproject.toml
recursive-include openhands-workspace *.py
recursive-include openhands-workspace py.typed
# ==============================================================================
# openhands-agent-server
# ==============================================================================
include openhands-agent-server/pyproject.toml
recursive-include openhands-agent-server *.py
recursive-include openhands-agent-server py.typed
# Docker build files
include openhands-agent-server/openhands/agent_server/docker/Dockerfile
include openhands-agent-server/openhands/agent_server/docker/wallpaper.svg
# PyInstaller spec
include openhands-agent-server/openhands/agent_server/agent-server.spec
# VSCode extensions
recursive-include openhands-agent-server/openhands/agent_server/vscode_extensions *

Some files were not shown because too many files have changed in this diff.