Compare commits

..

2577 Commits

Author SHA1 Message Date
Robert Brennan 8ac8a35811 Merge branch 'main' into rb/github-patch 2024-11-11 18:35:00 -05:00
Robert Brennan 9d3c6d87fb Merge branch 'rb/fix-remote' into rb/github-patch 2024-11-11 18:30:24 -05:00
Robert Brennan 7df7f43e3c Revert "Add rate limiting to server endpoints" (#4910) 2024-11-11 23:26:49 +00:00
Robert Brennan 4c935a84e7 another attempt 2024-11-11 18:10:40 -05:00
tofarr 2ad0831560 Merge branch 'main' into revert-4867-feature/add-rate-limiting 2024-11-11 15:53:20 -07:00
Engel Nyst a45aba512a Tweak log levels (#4729) 2024-11-11 22:51:56 +00:00
Robert Brennan d865f1e4a7 Revert "Add rate limiting to server endpoints (#4867)"
This reverts commit 79492b6551.
2024-11-11 17:41:15 -05:00
tofarr a1a9d2f175 Refactor websocket (#4879)
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-11-11 22:36:07 +00:00
Robert Brennan 79492b6551 Add rate limiting to server endpoints (#4867)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-11-11 16:54:22 -05:00
sp.wack 80fdb9a2f4 feat(posthog): Emit user activated event (#4886) 2024-11-11 23:31:41 +02:00
Robert Brennan a38c45cf75 fix remote runtimes 2024-11-11 15:44:42 -05:00
Nafis Reza 975e75531d Move assets/icons to dedicated folder (#4850) 2024-11-11 20:17:04 +00:00
Robert Brennan 1b5f5bcdad fixes for upcoming changes to remote API (#4834) 2024-11-11 14:51:14 -05:00
Rohit Malhotra 8c00d96024 Support displaying images/videos/pdfs in the workspace (#4898) 2024-11-11 20:22:17 +02:00
Robert Brennan bf8ccc8fc3 fix infinite loop (#4873)
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2024-11-11 10:59:43 +00:00
OpenHands 037d770f66 Fix issue #4884: (chore) add missing FE translations (#4885)
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
2024-11-11 10:09:46 +00:00
sp.wack dd50246672 test(frontend): Pass failing tests (#4887) 2024-11-11 09:49:56 +00:00
Graham Neubig 090771674c Update llms.md w/ more recent results (#4874) 2024-11-10 03:12:09 +00:00
Xingyao Wang d8ab0208ba fix: remove duplicate claude-3-5-sonnet-20241022 model from VERIFIED_MODELS (#4871)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-11-09 21:41:56 +00:00
Xingyao Wang a07e8272da fix: improve remote runtime reliability on large-scale evaluation (#4869) 2024-11-09 20:17:10 +00:00
Robert Brennan be82832eb1 Use keyword matching for CodeAct microagents (#4568)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-11-09 11:25:02 -05:00
ross 67c8915d51 feat(runtime): Add prototype Runloop runtime impl (#4598)
Co-authored-by: Robert Brennan <contact@rbren.io>
2024-11-08 23:40:31 -05:00
Daniel Cruz 40b3ccb17c Adds missing spanish translations (#4858) 2024-11-09 05:14:55 +01:00
Robert Brennan 35c68863dc Don't persist cache on reload (#4854) 2024-11-08 22:31:24 +00:00
mamoodi 8bfee87bcf Release 0.13.0 (#4849) 2024-11-08 22:24:56 +00:00
Robert Brennan e1383afbc3 Add signed cookie-based GitHub authentication caching (#4853)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-11-08 22:19:34 +00:00
Xingyao Wang 4ce3b9094a Revert "(feat): Prompt engineering to remind o1 to generate a patch" (#4846) 2024-11-08 16:12:57 +00:00
Graham Neubig 0a4e196670 Update openhands-resolver.yml to remove issue number (#4843) 2024-11-08 15:13:56 +00:00
Daniel Cruz 8d32a59f55 Adds missing localization and translation to spanish (#4837)
Co-authored-by: adrianamorenogt <adrianamorenogutierrez@gmail.com>
2024-11-08 09:33:19 +02:00
tofarr 38b92f4251 UX: Show a loading indicator when downloading a zip (#4833) 2024-11-08 09:28:18 +02:00
Boxuan Li 88dbe85594 Make trajectories_path support file path (#4840) 2024-11-08 06:26:12 +00:00
OpenHands f5003a7449 Fix issue #4830: [Bug]: Copy-paste into the "What do you want to build?" bar doesn't work (#4832)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-11-07 23:20:43 -06:00
Alejandro Cuadron Lafuente a6810fa6ad (feat): Prompt engineering to remind o1 to generate a patch (#4807)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: mamoodi <mamoodiha@gmail.com>
Co-authored-by: tofarr <tofarr@gmail.com>
Co-authored-by: Robert Brennan <contact@rbren.io>
2024-11-08 03:10:18 +00:00
Robert Brennan fc05d8d4eb instruct the agent to comment less (#4681) 2024-11-08 05:21:48 +08:00
sp.wack 1d6ef0e18e fix(frontend): Remove runtime indicator (#4829) 2024-11-08 02:37:59 +08:00
Xingyao Wang dc0e223d1a fix(agent controller): misplaced runtime.connect that cause swebench workspace to fail (#4826) 2024-11-08 01:50:33 +08:00
tofarr 932de79154 Fix: Buffering zip downloads to files rather than holding in memory (#4802) 2024-11-07 10:24:30 -07:00
Robert Brennan fa625fed70 Retry on github auth failure (#4767) 2024-11-07 16:57:06 +00:00
Xingyao Wang f9fa1d95cb fix(RemoteRuntime): add retry for pod status after /start (#4825) 2024-11-07 16:22:47 +00:00
sp.wack 5615d54f81 feat(posthog): Emit useful events (#4798)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-11-07 16:16:33 +00:00
Xingyao Wang 8166bf768a fix(agent, browsing): too long tool description for openai (#4778) 2024-11-08 00:11:08 +08:00
sp.wack c3991c870d feat(frontend): Cache request data (#4816) 2024-11-07 16:53:34 +02:00
sp.wack 1a27619b39 feat(frontend): Update npm scripts for cross-platform compatibility with PowerShell and Unix shells (#4727) 2024-11-07 16:51:02 +02:00
sp.wack cc15aee405 fix(frontend): Fix Jupyter tab overflow (#4818) 2024-11-07 22:48:10 +08:00
Xingyao Wang 53390d9885 Fix issue #4583: [Bug]: Unable to pull the full SWE-Bench test set (#4813)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-11-07 22:35:20 +08:00
sp.wack 0335b1a634 feat(posthog): Identify users logged in with GitHub (#4794) 2024-11-07 08:37:07 +00:00
Daniel Cruz bb362cd377 Use i18n Keys (2) (#4464)
Co-authored-by: adrianamorenogt <adrianamorenogutierrez@gmail.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-11-07 08:34:59 +00:00
Xingyao Wang 4405b109e3 Fix issue #4809: [Bug]: Model does not support image upload when usin… (#4810)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-11-07 02:28:16 +00:00
Engel Nyst 47464a9cfa Revert "Feature: Add ability to reconnect websockets" (#4801) 2024-11-07 01:56:39 +00:00
Engel Nyst 2b3fd94540 Fix init order in the agent controller (#4796)
Co-authored-by: tofarr <tofarr@gmail.com>
2024-11-06 22:44:12 +00:00
tofarr 1bd46f3832 Fix - terminal not working (#4800) 2024-11-06 20:34:42 +00:00
Xingyao Wang 8a063fdf6a fix(agent): not default to /repo path (#4799) 2024-11-06 20:21:41 +00:00
OpenHands 025dac5d8f Fix issue #4776: [Bug]: Files are not uploaded to the environment (SWE-Bench) (#4795) 2024-11-06 19:05:06 +00:00
tofarr 0e5e754420 Feature: Add ability to reconnect websockets (#4526) 2024-11-06 18:12:31 +00:00
Robert Brennan 7a8e207985 Fix: Implement caching for clientLoader to prevent repeated calls (#4772)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-11-06 12:51:09 -05:00
mamoodi a4de0f2142 Update leftover versions (#4792) 2024-11-06 17:21:38 +00:00
dependabot[bot] 27716171bf chore(deps): bump the docusaurus group in /docs with 7 updates (#4789)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-06 17:44:32 +02:00
sp.wack e5d7735d75 ALL-677 fix(frontend) Truncate long CMD outputs to prevent UI freezing (#4785) 2024-11-06 23:43:25 +08:00
OpenHands 83ccb74d36 Fix issue #4780: [Bug]: Initial query is not cleared after submission (#4781) 2024-11-06 09:54:15 +00:00
sp.wack 118957235d feat(frontend): Chat interface empty state (#4737) 2024-11-06 08:55:50 +00:00
Xingyao Wang 4a6406ed71 feat: add drag & paste image support to ChatInput (#4762)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-11-06 07:44:16 +00:00
Rohit Malhotra 4bef974a89 Adding PR number variable to openhands-resolver (#4777) 2024-11-06 02:26:04 +00:00
Robert Brennan e497438085 Remove extra calls to isAuthenticated (#4766) 2024-11-05 22:09:43 +00:00
Robert Brennan 74b3335b7d Bugfix: fix session close (#4765) 2024-11-05 14:11:15 -05:00
Xingyao Wang 55c41212c8 chore: update browser message to be more human-readable in UI (#4761) 2024-11-05 17:05:19 +00:00
mamoodi 4374ea08d3 Patch release 0.12.3 (#4760) 2024-11-05 16:53:08 +00:00
Rohit Malhotra 436ecb80a3 Logger fixes for openhands-resolver (#4710)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-11-05 16:49:32 +00:00
tofarr df9e9fca5a Refactor: Shorter syntax. (#4753) 2024-11-05 16:09:14 +00:00
OpenHands add0e7d05c Fix issue #4756: [Documentation] When GITHUB_TOKEN is provided automatically through the UI (#4757)
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-11-05 15:50:39 +00:00
Robert Brennan 145194c87b Fix images in docker run command for PRs (#4674) 2024-11-05 10:50:24 -05:00
sp.wack 6eafe0d2a8 feat(frontend): Redirect user to app after a project upload or repo selection (and add e2e tests) (#4751) 2024-11-05 17:12:58 +02:00
Engel Nyst eeb2342509 Refactor history/event stream (#3808) 2024-11-05 03:36:14 +01:00
Graham Neubig edfba4618a Update bug_template.yml to show app.all-hands.dev (#4709) 2024-11-04 20:47:22 -05:00
Robert Brennan 98751a3ee2 Refactor of error handling (#4575)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-11-04 23:30:53 +00:00
Xingyao Wang 24117143ae feat(llm): add new haiku into func calling support model (#4738) 2024-11-04 22:38:00 +00:00
mamoodi 78f4712080 Release 0.12.2 (#4741) 2024-11-04 16:33:50 -05:00
Xingyao Wang 1d2a616be7 Fix issue #4739: '[Bug]: The agent doesn'"'"'t know its name' (#4740)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-11-04 21:24:35 +00:00
OpenHands ba25b02978 Fix issue #4735: Update msw mocks (#4736) 2024-11-04 16:58:56 +00:00
Xingyao Wang 966da7b7c8 feat(agent, CodeAct 2.2): native CodeAct support for Browsing (#4667)
Co-authored-by: tofarr <tofarr@gmail.com>
2024-11-05 00:27:27 +08:00
sp.wack f0af90bff3 fix(frontend): Always return user is authed if mode is oss (#4733) 2024-11-04 16:24:23 +00:00
Engel Nyst 1638968509 History microfixes (#4728) 2024-11-04 16:37:22 +01:00
Robert Brennan 250fcbe62c Various async fixes (#4722) 2024-11-04 10:08:09 -05:00
sp.wack 0595d2336a feat: Analytics with PostHog (#4655) 2024-11-04 09:57:56 +00:00
sp.wack 387c8f1df3 feat(frontend): Make loader synchronous (#4689) 2024-11-04 11:26:30 +02:00
Polygons1 f6c2b287bc Fix for #4717 (#4721) 2024-11-04 08:24:00 +08:00
Xingyao Wang ab188d026d Revert "Fix permissions on __init__.py" (#4718) 2024-11-04 05:10:43 +08:00
Robert Brennan 316fc260f6 Fix list-files async calls (#4720)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-11-03 10:52:53 -08:00
Robert Brennan aab7fa483b Fix permissions on __init__.py (#4713) 2024-11-03 22:14:42 +08:00
Rohit Malhotra 496364ce53 Adding PR label trigger for openhands-resolver (#4712) 2024-11-02 20:19:30 -04:00
Ryan H. Tran 4446d3180f fix: use None check instead of falsy (#4705) 2024-11-02 12:44:03 -04:00
Robert Brennan 7b8241e424 fix auth when there are no allow lists (#4707) 2024-11-02 16:25:35 +00:00
Abhijeetsingh Meena 8857f02083 [Eval] DiscoveryBench OpenHands Integration (#4627)
Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>
Co-authored-by: Harshit Surana <surana.h@gmail.com>
2024-11-02 07:24:34 -04:00
Xingyao Wang 1747b3d6b2 fix: prompt caching (#4704) 2024-11-02 07:21:21 -04:00
Robert Brennan 36623a16da Minor auth fixes (#4699) 2024-11-01 18:33:29 -07:00
OpenHands 9d3b77bffc Fix issue #4695: [Bug]: Dependabot PRs fail on "Update PR Description" github action step (#4697) 2024-11-01 18:32:31 -07:00
OpenHands 2682518d0e Fix issue #4692: [Bug]: Slack link no longer working (#4693) 2024-11-01 18:34:20 -05:00
Robert Brennan b27fabe504 Add Google Sheets integration for GitHub user verification (#4671)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-11-01 15:17:15 -07:00
Xingyao Wang adf7ab5849 fix: handle the case where LLM assistant return None instead of empty string (#4690) 2024-11-01 19:13:01 +00:00
Robert Brennan 456998175f Fix authentication (#4686) 2024-11-01 10:54:06 -07:00
Graham Neubig b4afd9f170 Update README.md w/ github resolver link (#4679) 2024-11-01 13:07:35 +00:00
sp.wack 73c7375b92 fix(frontend): Prevent editor from changing width unpredictably (#4659) 2024-11-01 14:04:39 +02:00
tofarr 6414b1af6e Fix agent session error in logs (#4669) 2024-11-01 10:50:56 +08:00
tofarr dd55290f4e Fix : app unresponsive on startup (#4668) 2024-10-31 14:30:33 -07:00
tofarr be77baea31 refactor: remove unused methods and constants from Session class (#4662)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-31 14:55:37 -06:00
Robert Brennan a812e2b5f1 Add cookie-based authentication to all routes (#4642)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-10-31 12:18:42 -07:00
tofarr 4ebff5aaf3 Fix unawaited (#4665) 2024-10-31 19:16:37 +00:00
Engel Nyst 0687608feb [Arch proposal] ENVIRONMENT event source (#4584)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-11-01 02:33:13 +08:00
Ziru "Ron" Chen db4e1dbbec [eval] Add ScienceAgentBench. (#4645)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-11-01 02:30:55 +08:00
Robert Brennan 9442e4f9e3 dont run pr update on forks (#4663) 2024-11-01 01:55:50 +08:00
Robert Brennan e17f7b22a6 Remove hidden commands from feedback (#4597)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-10-31 08:49:47 -07:00
mamoodi ce6939fc0d Release 0.12.0 - Pending Release Notes Prep (#4650) 2024-10-31 13:14:01 +00:00
Xingyao Wang 4705ef9ec2 chore: do not include "status" dict in share-openhands (#4620) 2024-10-31 20:35:35 +08:00
Xingyao Wang 9c2b48ff5d fix(eval): SWE-Bench instance with upper-case instance id (#4649) 2024-10-30 21:24:18 +00:00
Robert Brennan 87906b96a7 Add job to update PR description with docker run command (#4550)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-30 16:42:03 -04:00
Xingyao Wang c0a0d46eb2 test(runtime) #4623: file permission when running the file_editor (#4628)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-31 04:34:34 +08:00
Engel Nyst 0ea5dcc781 Remove console leak (#4648) 2024-10-30 20:33:42 +00:00
Robert Brennan d9e0344619 minor cleanup in readme (#4639) 2024-10-30 19:32:36 +00:00
Engel Nyst 1c9cdaf1a2 Fix old string serializer (#4644) 2024-10-30 19:26:26 +00:00
Engel Nyst bde978cf0f Fix Openrouter (#4641) 2024-10-30 18:31:24 +00:00
Xingyao Wang 2587220b12 fix(llm): fallback when model is out of function calling supported list (#4617)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-31 01:54:50 +08:00
sp.wack 87bc35d2c8 feat(frontend): Add a better auth flow and UI handling (#4603) 2024-10-30 13:38:43 -04:00
OpenHands 866ba6e3b2 Fix issue #4629: [Bug]: Replace claude-3-5-sonnet-20240620 with claude-3-5-sonnet-20241022 (#4631)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-10-30 17:16:04 +00:00
Xingyao Wang 2b0eada176 agent: enable browsing & jupyter by default 2024-10-30 12:53:16 -04:00
Robert Brennan 2e50a5bef5 Document various runtimes (#4536)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-10-30 16:18:42 +00:00
Xingyao Wang 3ae4bc0f8e chore: bump the litellm version (#4632) 2024-10-30 23:16:10 +08:00
tofarr faf774cdbd Doc Update : Troubleshooting docker engine (#4609) 2024-10-30 08:47:04 -06:00
tofarr 05645d1bbd Refactor CORS middleware and enhance localhost handling (#4624)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-30 08:46:22 -06:00
Robert Brennan e21abce786 Load GitHub users list at startup for improved authentication performance (#4567)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-30 10:27:25 -04:00
tofarr 75ee54bbc5 Increase share popup duration from 5s to 10s (#4625)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-30 14:14:28 +00:00
Xingyao Wang 89406bac44 feat: provide directory information to the agent from FE (#4622)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-10-30 09:10:47 +04:00
Robert Brennan 572b3ad682 safer model info access (#4619) 2024-10-29 17:44:51 -04:00
sp.wack 981b05fc2b feat(frontend): Introduce secrets prop to hide from the terminal (#4529) 2024-10-29 23:18:01 +04:00
Robert Brennan 997dc80d18 chance default model to 3.5 sonnet new (#4612)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-10-29 17:43:25 +00:00
Robert Brennan e231776be8 feat: Add automatic translation updater script (#4608)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-29 13:05:01 -04:00
Xingyao Wang d50425865a fix(runtime): only accept one request at a time for exec action requests (#4589) 2024-10-29 23:48:50 +08:00
Xingyao Wang 6d19c93d19 [eval] add evaluation workflow (#4489)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-10-29 13:52:25 +00:00
Robert Brennan 30eeaa641c Major logging overhaul (#4563) 2024-10-29 07:30:50 +01:00
Xingyao Wang ae13171194 feat(agent): CodeAct with function calling (#4537)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: tofarr <tofarr@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-29 11:06:33 +08:00
Engel Nyst 421b4c108a Compatibility for renamed attribute (#4582)
Co-authored-by: tofarr <tofarr@gmail.com>
2024-10-28 16:06:22 -06:00
Xingyao Wang affb2123d9 feat(runtime): add versioned runtime image (base_name+oh_version) (#4574) 2024-10-29 04:52:54 +08:00
Robert Brennan fdb385ab93 Simplify makefile (#4591) 2024-10-28 13:10:32 -04:00
sp.wack 13d101e092 fix(frontend): Record events sent to WS (#4596) 2024-10-28 15:53:31 +00:00
sp.wack 6cf3728247 test(frontend): Test, refactor, and improve the chat interface (#4549) 2024-10-28 17:26:28 +04:00
sp.wack ae188458ef chore(frontend): Remove root level package.json (#4590) 2024-10-28 16:42:17 +04:00
Robert Brennan a20da54e3a Remove verbose log from agent controller (#4585) 2024-10-27 15:50:23 +00:00
Mahmoud Sehsah 2a6740f4ba fix(builder): Build the runtime with docker version that contains (-) in the version name (#4580) 2024-10-27 02:54:52 +01:00
Ryan H. Tran 5ba7bc6be1 Mention build-essential dependency for ubuntu in dev doc (#4511) 2024-10-26 20:17:43 -05:00
Xingyao Wang 98d4884ced fix(controller): stop when run into loop (#4579) 2024-10-26 19:40:58 -05:00
Xingyao Wang be3cbb045e fix(controllor): make agent controller stops when encounter fatal observation (#4573) 2024-10-26 13:28:27 -05:00
dependabot[bot] 8bfd2fcf4f chore(deps): bump the version-all group across 1 directory with 8 updates (#4564)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-25 20:07:16 +02:00
tofarr d4e3982a6b Small refactor : EventStream as a dataclass (#4557) 2024-10-25 17:31:20 +00:00
Xingyao Wang 1f23dc89b6 fix(eval): add runtime.connect to all eval harness (#4565) 2024-10-26 00:41:30 +08:00
Xingyao Wang 7340b78962 feat(eval): rewrite log_completions to save completions to directory (#4566) 2024-10-25 16:36:11 +00:00
tofarr c3da25febc Fix for docker leak (#4560) 2024-10-25 15:53:39 +00:00
Robert Brennan 8d2b2d4318 Refactor runtime to add a connect method (#4410)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2024-10-25 09:02:19 -04:00
tofarr c4f5c07be1 Refactor: shorter syntax (#4558) 2024-10-25 06:45:28 -06:00
Xingyao Wang 349e2dbe50 refactor: move bash related logic into BashSession for cleaner code (#4527)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
2024-10-25 20:44:25 +08:00
Xingyao Wang dcd4b04f57 feat(llm): update prompt caching list to include new sonnet (#4552) 2024-10-25 20:36:35 +08:00
sp.wack 78eacc4489 fix(frontend): Fix loader checking unset config variable in window (#4546)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-10-25 08:14:40 -04:00
tofarr 60990c128a Feature: Minor refactor of SessionManager to make it a dataclass (#4553) 2024-10-24 14:32:05 -06:00
Robert Brennan c4c25ea229 Minor fixes for GitHub credential exchange (#4554) 2024-10-24 16:29:03 -04:00
tofarr 930726f4e8 Fix for issue where we hammer docker needlessly (#4551) 2024-10-24 20:03:35 +00:00
tofarr ee2c2ff2b8 Feat changed "is_confirmed" to "confirmation_state" (#4508) 2024-10-24 13:35:14 -06:00
Robert Brennan 8c064fe3df add catch all route, disable caching (#4547) 2024-10-24 15:06:17 -04:00
sp.wack e878741ae7 test(frontend): Test, refactor, and improve the chat input (#4535) 2024-10-24 18:19:41 +04:00
tofarr 90e2bf4883 Split bash commands by the new line character (#4462) 2024-10-24 07:44:38 -06:00
dependabot[bot] 615b94cf2f chore(deps): bump the version-all group across 1 directory with 19 updates (#4531)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-23 21:28:02 +02:00
Graham Neubig ce2430180f Update README.md to fix miniwob name (#4534) 2024-10-23 18:24:43 +00:00
Xingyao Wang eaea94cc1b fix(remote runtime): retry on 429 error on remote build & log retries (#4532) 2024-10-24 02:07:11 +08:00
sp.wack 385cc8f512 [ALL-561] feat(frontend|backend): Display error messages in the chat (#4509) 2024-10-23 18:56:00 +04:00
Xingyao Wang 2d5b360505 refactor: re-organize different runtime implementations into an impl folder (#4346)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-10-23 10:10:03 +00:00
mamoodi 9b6fd239d0 Release 0.11.0 (#4523) 2024-10-22 16:43:13 -04:00
sp.wack dd15845b91 [ALL-570] fix(frontend): Don't wrap filenames in the file explorer (#4521) 2024-10-22 23:31:42 +04:00
sp.wack 64adb64fef [ALL-597] fix(frontend): Fetch config.json locally (#4522) 2024-10-22 23:31:29 +04:00
Yashwanth S C 6573304014 fix(frontend): Error when API key is not entered is not clear (#4429) 2024-10-22 22:23:09 +04:00
sp.wack 29ddcdaf46 [ALL-469] fix(frontend): Indicate that import projects require zips (#4515) 2024-10-22 22:15:08 +04:00
mamoodi d0bbad8eda Remove settings base container as it is not supported (#4520) 2024-10-22 18:14:59 +00:00
sp.wack 7b81df2a94 [ALL-596] fix(frontend): Fix import project from sending request before runtime is active (#4513) 2024-10-22 18:04:49 +00:00
mamoodi 550044454c Revert docker install in OpenHands app image (#4519) 2024-10-22 13:46:19 -04:00
sp.wack 3927fc3616 [ALL-594] chore(frontend): Add frontend error handling for failed requests (#4501) 2024-10-22 20:05:59 +04:00
sp.wack 864f81bc71 test(frontend): User actions and friends (#4497) 2024-10-22 20:04:07 +04:00
Graham Neubig 54250e3fe2 Update evaluation README.md structure (#4516) 2024-10-22 14:42:22 +00:00
Xingyao Wang da548d308c [agent] LLM-based editing (#3985)
Co-authored-by: Tim O'Farrell <tofarr@gmail.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-10-22 04:51:44 +08:00
sp.wack 6fe5482b20 [ALL-571] chore(frontend): Move saas-related configs to config.json (#4496) 2024-10-21 14:59:20 +00:00
dependabot[bot] 520586a89c chore(deps): bump @mdx-js/react from 3.0.1 to 3.1.0 in /docs in the version-all group (#4478)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-21 09:21:33 +04:00
Xingyao Wang 263798584e fix(runtime): replace codec error in pexcept (#4493) 2024-10-20 12:51:05 +08:00
Alejandro Cuadron Lafuente a9a593bb21 [Fix] Added support to specify the platform on which the runtime image should be built. (#4402)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: mamoodi <mamoodiha@gmail.com>
Co-authored-by: tofarr <tofarr@gmail.com>
Co-authored-by: Robert Brennan <contact@rbren.io>
2024-10-20 09:19:05 +08:00
tobitege 6471d0f94d .gitignore: ignore all node_modules folders (#4491) 2024-10-20 09:17:45 +08:00
sp.wack 5cc16cb82a fix(frontend): Fix waitlist logic (#4492) 2024-10-19 14:20:54 -04:00
Robert Brennan cc68756b26 fix freeze on zip-files endpoint (#4487) 2024-10-18 15:29:07 -04:00
Xingyao Wang 126bf316bc fix(docker): Dockerfile failed to build on RemoteRuntime (#4481)
Co-authored-by: tofarr <tofarr@gmail.com>
2024-10-19 03:28:39 +08:00
Xingyao Wang 91308ba4dc feat: clean-up retries RemoteRuntime & add FatalErrorObservation (#4485) 2024-10-18 17:23:13 +00:00
Graham Neubig b660aa99b8 Fix issue #4480: '[Bug]: Being blocked by cloudflare results in futile retries (#4482)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-18 13:04:34 -04:00
sp.wack cf793582a7 [ALL-543] feat(frontend): Setup auth route, replace loading spinner, add new route (#4448) 2024-10-18 19:32:46 +04:00
Robert Brennan 56fe905241 reduce dependabot frequency (#4305) 2024-10-18 11:21:15 -04:00
mamoodi 02abf60433 Run flaky mac tests nightly (#4470) 2024-10-18 10:38:40 -04:00
mamoodi e6a5e39047 Update docs associated with new UI (#4469) 2024-10-18 10:19:56 -04:00
mamoodi feee509de7 Update leftover versions (#4468) 2024-10-18 09:28:53 -04:00
Robert Brennan fd6facbf03 update contributing docs (#4438)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-10-18 05:08:54 +02:00
dependabot[bot] 1ea3087eec chore(deps): bump modal from 0.64.182 to 0.64.192 (#4460)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-18 05:02:20 +02:00
dependabot[bot] 2e09b4f95e chore(deps-dev): bump torch from 2.2.2 to 2.5.0 (#4459)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-18 05:01:27 +02:00
mamoodi d2d55f5ea2 Update custom sandbox doc (#4332) 2024-10-17 18:23:57 -04:00
mamoodi 0e467b1429 Release 0.10.0 (#4463) 2024-10-17 22:23:40 +00:00
Xingyao Wang ec3152b6e1 linter: only lint on updated lines in the new file (#4409) 2024-10-17 15:57:03 -04:00
sp.wack 642e01b673 fix(frontend): Update build directory and referenced paths (#4461) 2024-10-17 23:24:49 +04:00
sp.wack 6cb174b7d1 [ALL-557] feat(frontend): Add save and discard actions to the editor (#4442)
Co-authored-by: mamoodi <mamoodiha@gmail.com>
2024-10-17 17:14:55 +00:00
Robert Brennan 154854bbe3 run in dev mode in makefile (#4452) 2024-10-17 12:40:47 -04:00
sp.wack 678630c5bd fix(frontend): Catch config fetch error and set default fallback (#4453) 2024-10-17 16:17:44 +00:00
dependabot[bot] ad800bf373 chore(deps): bump litellm from 1.49.5 to 1.49.6 (#4458)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-17 17:51:45 +02:00
dependabot[bot] 206788a0e8 chore(deps): bump react-syntax-highlighter from 15.5.0 to 15.6.1 in /frontend (#4457)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-17 15:31:07 +00:00
dependabot[bot] ca3fbb2a80 chore(deps-dev): bump @types/node from 22.7.5 to 22.7.6 in /frontend (#4455)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-17 15:29:23 +00:00
dependabot[bot] cc500a622a chore(deps-dev): bump @testing-library/jest-dom from 6.5.0 to 6.6.1 in /frontend (#4456)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-17 15:29:01 +00:00
tofarr 5fb3dece93 Feat: Divided docker layer to make it easier to cache (#4313)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-10-17 15:08:56 +00:00
sp.wack 83c096b974 [ALL-551] chore(frontend): Retrieve APP_MODE from the server (#4423) 2024-10-17 18:35:21 +04:00
Xingyao Wang 015df47e53 chore: remove integration tests from CI to unblock (#4451) 2024-10-17 14:19:53 +00:00
Jiayi Pan c1b323a076 Show actual dataset name in swebench log directory (#4417) 2024-10-17 10:32:38 +08:00
Xingyao Wang 84a578ad20 [test] remove integration tests from CI & move them into evaluation (#4447) 2024-10-17 05:38:23 +08:00
dependabot[bot] 8e5db345b2 chore(deps): bump boto3 from 1.35.40 to 1.35.42 (#4445)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-16 22:51:40 +02:00
dependabot[bot] f61266841c chore(deps): bump browsergym from 0.8.0 to 0.8.1 (#4437)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-16 22:50:39 +02:00
dependabot[bot] 277d991b37 chore(deps): bump fastapi from 0.115.0 to 0.115.2 (#4370)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-16 22:02:26 +02:00
Engel Nyst 20aa66d5e2 Bump Mac version in CI (#4441) 2024-10-16 21:52:21 +02:00
dependabot[bot] 9bc6252967 chore(deps): bump anthropic from 0.36.0 to 0.36.1 (#4436)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-16 21:25:00 +02:00
Alejandro Cuadron Lafuente bb416009c5 [Fix] Fixed the inputs to the ManagerAgent (#4427)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: mamoodi <mamoodiha@gmail.com>
Co-authored-by: tofarr <tofarr@gmail.com>
Co-authored-by: Robert Brennan <contact@rbren.io>
2024-10-16 20:47:46 +02:00
Robert Brennan 226ea545fa Add workflow scope to GitHub authentication URL (#4439)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-16 14:41:46 -04:00
tofarr e12bff5189 Fix: Removed flaky test (#4444) 2024-10-16 18:10:27 +00:00
dependabot[bot] 23d3becf1d chore(deps): bump litellm from 1.49.4 to 1.49.5 (#4431)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-16 18:36:39 +02:00
Robert Brennan be79ccdb39 fix default host (#4413) 2024-10-16 10:56:42 -04:00
sp.wack 2277897f86 feat(frontend): Improve file based routing (#4317) 2024-10-16 18:54:15 +04:00
tofarr be9619be3a Feat faster unit tests 2 (#4418) 2024-10-16 08:40:53 -06:00
tofarr cb58dab82b Fix loop graceful shutdown (#4394) 2024-10-16 08:40:33 -06:00
sp.wack 8ab293a667 fix(frontend): Fix request headers (#4422) 2024-10-16 14:22:18 +00:00
tofarr 8a93da51be Fix for lockup - create the runtime in a background thread (#4412)
Co-authored-by: Robert Brennan <contact@rbren.io>
2024-10-15 23:52:21 +00:00
mamoodi 6f2e678028 Fix eval output path in case of @ char (#4416) 2024-10-15 22:45:08 +00:00
Xingyao Wang da23189e4c refactor: move get_pairs from memory to shared utils (#4411) 2024-10-15 19:31:49 +00:00
dependabot[bot] 2cf77e2589 chore(deps): bump modal from 0.64.181 to 0.64.182 (#4407)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 19:06:40 +02:00
dependabot[bot] 58e9b31d08 chore(deps-dev): bump llama-index from 0.11.17 to 0.11.18 (#4408)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 19:05:33 +02:00
dependabot[bot] 53c2932fa5 chore(deps): bump litellm from 1.49.3 to 1.49.4 (#4406)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 18:59:26 +02:00
dependabot[bot] ad4e5b3851 chore(deps-dev): bump tailwindcss from 3.4.13 to 3.4.14 in /frontend (#4404)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 19:26:50 +04:00
dependabot[bot] a56850a4e3 chore(deps): bump @reduxjs/toolkit from 2.2.8 to 2.3.0 in /frontend (#4405)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 19:26:32 +04:00
Abhijeetsingh Meena 173018eb58 fix: Resolves HumanEval Inference by replacing task_id with instance_id (#4364)
Co-authored-by: Harshit Surana <surana.h@gmail.com>
2024-10-15 15:18:38 +00:00
Xingyao Wang 6bbd75c6e7 fix: metric logging in agent controller (#4387) 2024-10-15 22:32:39 +08:00
Xingyao Wang 50c13aad98 [Eval] Improve SWE-Bench Eval harness: multi-run support & entry script simplification (#4396) 2024-10-15 21:34:52 +08:00
tofarr 15df12cf15 Feat Faster unit tests (#4395) 2024-10-15 06:12:47 -06:00
dependabot[bot] 9862d93cfb chore(deps): bump tailwind-merge from 2.5.3 to 2.5.4 in /frontend (#4380)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 10:57:37 +00:00
Peyton Walters 9566ca4a3c Implement basic modal sandbox support (#4133) 2024-10-15 06:37:02 -04:00
dependabot[bot] 0ca66beac9 chore(deps): bump sirv-cli from 2.0.2 to 3.0.0 in /frontend (#4381)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 10:19:40 +00:00
dependabot[bot] 0d93c5914d chore(deps): bump vite from 5.4.8 to 5.4.9 in /frontend (#4385)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 13:43:50 +04:00
tofarr d47f3e854b Fix build error (#4393) 2024-10-14 16:32:19 -06:00
Robert Brennan f60652dc5a Hide hard-coded commands from the agent (#4330)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-10-14 20:40:22 +00:00
Kilian Lieret 746e6595d5 Fix: _interrupt_bash to send multiple Ctrl+C (#4390) 2024-10-14 19:59:42 +00:00
dependabot[bot] 7c95fd6038 chore(deps-dev): bump mypy from 1.11.2 to 1.12.0 (#4371)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-14 21:34:49 +02:00
sp.wack 78cf9d5dec feat(frontend) Focus on text area input when the user uploads a project (#4372) 2024-10-14 17:54:12 +00:00
sp.wack 5eae10d1e2 feat(frontend): Add flag to mock API during dev mode (#4326) 2024-10-14 21:46:42 +04:00
sp.wack 8fa3591073 feat(frontend): Always show user context menu (#4366) 2024-10-14 21:46:20 +04:00
sp.wack 70bd710e82 chore(frontend): Remove old session class and some artifacts that are no longer needed (#4310) 2024-10-14 11:44:23 -06:00
Robert Brennan 63ff69fd97 Allow attaching to existing sessions without reinitializing the runtime (#4329)
Co-authored-by: tofarr <tofarr@gmail.com>
2024-10-14 15:24:29 +00:00
sp.wack 640ce0f60d feat(frontend): Remove chat interface header label (#4367) 2024-10-14 18:35:58 +04:00
Xingyao Wang 25f9413965 [Eval] Fix eval stuck when result is too large for pbar (#4361) 2024-10-14 22:08:34 +08:00
dependabot[bot] 4e8cfb0d60 chore(deps): bump jose from 5.9.3 to 5.9.4 in /frontend (#4338)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-14 09:39:59 +04:00
Xingyao Wang 4dfc7a7ef0 [Eval] Add a more lightweight / easier-to-use SWE-Bench output visualizer (#4360)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-10-14 02:09:01 +00:00
Boxuan Li 7186224899 Dump trajectories with delegate history if configured (#4336) 2024-10-13 17:30:04 -07:00
Xingyao Wang 343cc8710f [remote runtime] poll runtime info to wait until alive instead of using long timeout (#4334)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-10-13 20:38:03 +00:00
Engel Nyst 20455cea3e (Browsing agent) Fix send_msg_to_user (#4354) 2024-10-13 13:37:23 -07:00
Engel Nyst df23168c10 AgentDelegateAction: make delegate start with the task in execute tags, not the rest of the parent LLM response (#4327) 2024-10-13 13:17:51 -07:00
OpenHands ff8a9a1a56 Fix issue #4225: Add evaluation data to the LLMs docs (#4312)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-10-13 11:12:12 -07:00
Engel Nyst edcc391768 Revert "chore(deps): bump protobuf from 4.25.5 to 5.28.2 (#4214)" (#4325)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-10-13 17:40:42 +00:00
Amir 87021bd78f Add docs about using pre-built image + remove duplicated method (#4359) 2024-10-13 11:34:30 +00:00
dependabot[bot] 2692c0c8fd chore(deps): bump litellm from 1.49.0 to 1.49.2 (#4351)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-12 20:08:38 +02:00
dependabot[bot] 81d455a33b chore(deps): bump boto3 from 1.35.37 to 1.35.39 (#4352)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-12 20:07:50 +02:00
dependabot[bot] 495fc47c28 chore(deps-dev): bump chromadb from 0.5.12 to 0.5.13 (#4342)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-12 20:07:15 +02:00
sp.wack 6a7abcc7c9 Rename Settings.ts to settings.ts (#4324) 2024-10-12 09:15:39 +04:00
tofarr 4c5e2a339f Feat: Async Goodies for OpenHands (#4347) 2024-10-11 15:34:44 -06:00
Engel Nyst caa77cf7a6 Log cache hit/miss for deepseek (#4343) 2024-10-11 18:43:43 +02:00
Xingyao Wang a3c49538fc Update README.md (#4333) 2024-10-11 12:58:12 +08:00
mamoodi c3764a7422 Cancel previous commit builds on PRs but not on main (#4314) 2024-10-10 17:01:32 -04:00
sp.wack 36e304b3da chore(backend): Refactor copy_from method to be more generic (#4278) 2024-10-10 12:10:35 -04:00
Robert Brennan 62a58ea5d3 fix container_image when using hard-coded image (#4322) 2024-10-10 15:45:14 +00:00
dependabot[bot] 33a74e2792 chore(deps): bump boto3 from 1.35.36 to 1.35.37 (#4319)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-10 11:35:05 -04:00
tofarr f867fda2f9 Fix Graceful cleanup of session manager (#4306) 2024-10-10 09:15:29 -06:00
mamoodi 2d2d3ccfa5 Install docker in the OpenHands app image (#4283)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-10-10 08:21:57 -04:00
Xingyao Wang b23c7aab5a [eval] stop set sid in eval (#4311) 2024-10-10 11:47:27 +08:00
sp.wack a6993b7bf5 improvement(frontend): Update app behavior with invalid tokens (#4286) 2024-10-09 22:14:48 +00:00
dependabot[bot] 77772b6954 chore(deps-dev): bump pre-commit from 4.0.0 to 4.0.1 (#4293)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 22:37:35 +02:00
mamoodi ea883d4d18 Update issue template to make it less daunting (#4307) 2024-10-09 19:32:49 +00:00
dependabot[bot] 3f36338d19 chore(deps-dev): bump openai from 1.51.1 to 1.51.2 (#4302)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 18:16:13 +00:00
Robert Brennan ae6c489423 fix protocol: use https when possible (#4303) 2024-10-09 17:23:06 +00:00
dependabot[bot] 8937c2ff12 chore(deps-dev): bump chromadb from 0.5.11 to 0.5.12 (#4301)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 18:46:31 +02:00
Robert Brennan 45fb4fb9bc allow reconnecting to a runtime (#4223) 2024-10-09 16:37:52 +00:00
dependabot[bot] aae9b5ba5d chore(deps): bump litellm from 1.48.18 to 1.49.0 (#4298)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 12:09:59 -04:00
dependabot[bot] f2321dbfae chore(deps): bump json-repair from 0.29.10 to 0.30.0 (#4299)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 12:09:49 -04:00
dependabot[bot] 5ec3cb0ac9 chore(deps): bump browsergym from 0.7.1 to 0.8.0 (#4300)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 12:09:38 -04:00
dependabot[bot] 217eb5dee2 chore(deps): bump boto3 from 1.35.35 to 1.35.36 (#4294)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 11:58:49 -04:00
dependabot[bot] f47afa9ebc chore(deps-dev): bump llama-index from 0.11.16 to 0.11.17 (#4297)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 11:58:41 -04:00
dependabot[bot] 72db908251 chore(deps): bump anthropic from 0.35.0 to 0.36.0 (#4295)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 11:58:31 -04:00
dependabot[bot] 45a9d0ba9a chore(deps): bump google-cloud-aiplatform from 1.69.0 to 1.70.0 (#4296)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 11:58:20 -04:00
dependabot[bot] 1a65094377 chore(deps-dev): bump typescript from 5.6.2 to 5.6.3 in /docs (#4292)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 11:13:59 -04:00
dependabot[bot] 2d4c79f181 chore(deps-dev): bump typescript from 5.6.2 to 5.6.3 in /frontend (#4291)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-09 11:13:48 -04:00
sp.wack d5fc8bc65f fix(frontend): Remove unnecessary encoding in /list-files API endpoint path param (#4287) 2024-10-09 11:54:44 +00:00
sp.wack 51387a52c1 style(frontend): Make modal backdrop darker (#4285) 2024-10-09 15:49:45 +04:00
sp.wack fe084b4b16 chore(frontend): Change the backend base url fallback to be dynamic to current host (#4284) 2024-10-09 07:26:27 -04:00
tofarr 5097c4fe71 [Runtime] Audit HTTP Retry timeouts (#4282) 2024-10-08 19:31:25 -06:00
sp.wack be5f9772e6 chore(frontend): Update push to GH action prompt (#4276) 2024-10-09 00:35:36 +04:00
OpenHands 39798e9758 Fix issue #4142: Documentation: Create a "Usage Methods -> GUI Mode" page (#4156)
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: mamoodi <mamoodiha@gmail.com>
2024-10-08 17:06:25 +00:00
sp.wack d81a330a13 hotfix(frontend): Fix hero description (#4277) 2024-10-08 17:03:55 +00:00
mamoodi 81d3a2881a Remove concurrency from ghcr-build so it always runs on main commits (#4275) 2024-10-08 12:29:31 -04:00
dependabot[bot] 59fbc11afe chore(deps-dev): bump @types/node from 22.7.4 to 22.7.5 in /frontend (#4272)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 15:50:35 +00:00
dependabot[bot] 1db912aed3 chore(deps): bump @reduxjs/toolkit from 2.2.7 to 2.2.8 in /frontend (#4273)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 15:45:14 +00:00
dependabot[bot] e459069941 chore(deps): bump boto3 from 1.35.34 to 1.35.35 (#4269)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 15:19:57 +00:00
dependabot[bot] 411e66395d chore(deps-dev): bump openai from 1.51.0 to 1.51.1 (#4268)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 17:13:12 +02:00
dependabot[bot] bf61bcb34a chore(deps): bump google-generativeai from 0.8.2 to 0.8.3 (#4267)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 17:10:56 +02:00
dependabot[bot] 69f2bf93c1 chore(deps): bump json-repair from 0.29.8 to 0.29.10 (#4266)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 17:09:58 +02:00
sp.wack 3661893161 fix(frontend): Set min width so it doesn't squish on smaller screens (#4264) 2024-10-08 10:06:15 -04:00
sp.wack ebeda8bcfb fix(frontend) End session and redirect to main screen if token is invalid (#4263) 2024-10-08 10:05:57 -04:00
sp.wack ce18792b12 docs(frontend): Update README (#4262) 2024-10-08 10:05:18 -04:00
sp.wack ef3e106543 chore(frontend): Add meta title and description (#4265) 2024-10-08 10:04:30 -04:00
sp.wack 9d6c1e569d fix(frontend): Refactor frontend config (#4261) 2024-10-08 10:04:13 -04:00
tofarr cdd05a98db Lockup Resiliency and Asyncio Improvements (#4221) 2024-10-08 07:17:37 -06:00
Boxuan Li 568c8ce993 Runtime build fixes for OpenHands as a python library (#3989) 2024-10-07 19:50:07 -07:00
JeevaRamanathan 9296cedbed Improved readability in CONTRIBUTING.md (#4240)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-10-08 01:35:27 +00:00
Robert Brennan 98b39023f4 Ask the agent not to push changes to GitHub (#4222)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-10-08 01:14:14 +00:00
Engel Nyst e6847e9e61 Move agenthub within openhands (#4130) 2024-10-08 00:34:18 +00:00
Alejandro Cuadron Lafuente a3571ec510 [Fix] Error when trying to pull all docker evaluation containers (#4244) 2024-10-08 05:03:36 +08:00
Aditya Bharat Soni 0809d26f4d fix: Allow evaluation benchmarks to pass image urls in run_controller() instead of simply passing strings (#4100)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-10-07 15:37:08 -04:00
Xingyao Wang 9c07370559 fix runtime_startup_env_vars not being used (#4250) 2024-10-07 15:33:12 -04:00
sp.wack bfdd7fd620 feat(frontend): UI overhaul (#3604) 2024-10-07 23:15:38 +04:00
dependabot[bot] 0186674352 chore(deps): bump i18next from 23.15.1 to 23.15.2 in /frontend (#4252)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-10-07 22:07:08 +04:00
dependabot[bot] d4666cdc7d chore(deps-dev): bump @types/react from 18.3.10 to 18.3.11 in /frontend (#4194)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 16:47:40 +00:00
dependabot[bot] 3886fa8b04 chore(deps-dev): bump pre-commit from 3.8.0 to 4.0.0 (#4249)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 15:31:18 +00:00
dependabot[bot] 46299301f2 chore(deps): bump termcolor from 2.4.0 to 2.5.0 (#4247)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 17:26:06 +02:00
dependabot[bot] 7be224e595 chore(deps): bump boto3 from 1.35.33 to 1.35.34 (#4246)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 17:25:28 +02:00
dependabot[bot] 11ea248b41 chore(deps-dev): bump build from 1.2.2 to 1.2.2.post1 (#4243)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 17:24:55 +02:00
dependabot[bot] 097de51be2 chore(deps): bump anthropic from 0.34.2 to 0.35.0 (#4245)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 17:23:32 +02:00
dependabot[bot] cd0198b87f chore(deps): bump litellm from 1.48.14 to 1.48.18 (#4248)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 15:15:32 +00:00
Xingyao Wang 01ae54a69d fix swebench repo/version being string (#4241) 2024-10-07 22:01:42 +08:00
dependabot[bot] 93f95d85de chore(deps): bump @vitejs/plugin-react from 4.3.1 to 4.3.2 in /frontend (#4119)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-07 08:30:26 +00:00
Engel Nyst 6b1f23a20a Fix browsing actions to be more robust (#4226) 2024-10-06 22:03:13 -04:00
mamoodi 09243eba07 Small changes to getting started (#4233) 2024-10-06 13:48:21 -04:00
mamoodi e3450bb8c9 Update README to installation guides for system requirements (#4232) 2024-10-06 13:48:05 -04:00
Naman Tyagi 583b54c854 Fix grammar, typos, and consistency in CREDITS.md (#4229) 2024-10-06 16:50:43 +02:00
Engel Nyst 8c32ef2234 Fix to use async variant of completion (#4228) 2024-10-06 05:10:36 +02:00
Engel Nyst 9d0e6a24bc Refactor embeddings (#4219) 2024-10-05 18:59:08 +00:00
Boxuan Li 40d2935911 BrowserOutputObservation: Remove axtree from log (#4206)
Co-authored-by: mamoodi <mamoodiha@gmail.com>
2024-10-05 11:07:40 -07:00
Robert Brennan 42c118f4b4 Add Getting Started docs (#4224) 2024-10-05 14:21:02 +00:00
tofarr e60eaf9a52 Feat Startup events for the remote runtime (#4210) 2024-10-04 12:33:57 -06:00
dependabot[bot] 1354675ce3 chore(deps-dev): bump ruff from 0.6.8 to 0.6.9 (#4218)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 18:03:58 +00:00
dependabot[bot] 2eb42cb4f4 chore(deps-dev): bump llama-index from 0.11.15 to 0.11.16 (#4216)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 17:43:16 +00:00
dependabot[bot] 00f961822a chore(deps): bump boto3 from 1.35.32 to 1.35.33 (#4217)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 18:56:43 +02:00
dependabot[bot] 62c39def7c chore(deps): bump json-repair from 0.29.7 to 0.29.8 (#4215)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 12:24:44 -04:00
dependabot[bot] e986e78b31 chore(deps): bump protobuf from 4.25.5 to 5.28.2 (#4214)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 12:24:24 -04:00
dependabot[bot] 6d461a4934 chore(deps): bump litellm from 1.48.10 to 1.48.14 (#4213)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 12:24:02 -04:00
tobitege 25462ae394 (arch) ghcr-build.yml: fix interpolation error (RELEVANT_SHA) (#4207) 2024-10-04 11:40:45 -04:00
dependabot[bot] ad60ef11ad chore(deps): bump tailwind-merge from 2.5.2 to 2.5.3 in /frontend (#4211)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 19:20:28 +04:00
dependabot[bot] ceebbcac2a chore(deps): bump i18next-http-backend from 2.6.1 to 2.6.2 in /frontend (#4212)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-04 19:20:06 +04:00
Xingyao Wang 245334e89d [eval] improve update output script for swe-bench (#4180) 2024-10-04 15:10:03 +00:00
Xingyao Wang 80a631361b eval: update aiderbench readme (#4209) 2024-10-04 09:26:12 -04:00
Xingyao Wang 9cc9b19958 eval: improve swebench infer error handling and retry (#4205) 2024-10-04 07:09:56 -05:00
Xingyao Wang 0c2a35b256 [eval] update aider bench scripts (#4203) 2024-10-04 02:23:06 +00:00
Robert Brennan 641a15356f Better AWS S3 storage support (#4195) 2024-10-03 22:53:46 +00:00
Xingyao Wang 42649745bd fix(runtime): fix bash interrupt on program that cannot be stopped via ctrl+c (#4161) 2024-10-04 06:48:44 +08:00
Vaishakh 4678ae4ebd Reduce list spacing (#4177)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-10-03 16:52:41 -04:00
Robert Brennan 2f310d9338 Fix for sha detection for docker tags (#4197) 2024-10-03 16:51:35 -04:00
tofarr ee6a1cf334 Fix issue where an exception is raised because we try to finish a thread that was never started (#4200) 2024-10-03 22:04:26 +02:00
tofarr 152f99c64f Chore Bump python version (#3545) 2024-10-03 13:40:55 -04:00
tofarr 909e332207 Fix Better error message in development when version number changes. (#4188) 2024-10-03 17:37:10 +02:00
dependabot[bot] 053e2f90d5 chore(deps-dev): bump llama-index from 0.11.14 to 0.11.15 (#4190)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-03 17:35:18 +02:00
dependabot[bot] bd4640924c chore(deps): bump boto3 from 1.35.31 to 1.35.32 (#4192)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-03 17:32:35 +02:00
dependabot[bot] f26861aa93 chore(deps): bump litellm from 1.48.9 to 1.48.10 (#4191)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-03 17:32:03 +02:00
mamoodi cf9f980a22 Release 0.9.8 (#4189) 2024-10-03 10:52:41 -04:00
Xingyao Wang 16a2cf37da fix: reuse config parser for cli (#4187) 2024-10-03 09:41:02 -04:00
Robert Brennan 9c95d0ff58 Enable authentication for runtime environments (#4179)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-10-03 09:14:39 -04:00
Graham Neubig 9641bfbd3e Fix issue #4181: 'Prompting best practices documentation' (#4183)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-03 07:58:13 -04:00
Ikko Eltociear Ashimine 5c31fd9357 chore: update agent_session.py (#4186) 2024-10-03 07:00:45 +00:00
Engel Nyst 1abfd3b808 Retry on litellm's APIError, which includes 502 (#4167) 2024-10-03 01:54:49 +02:00
Xingyao Wang e0594432e2 fix: build shutdown listener (#4147) 2024-10-02 22:25:10 +00:00
Xingyao Wang e81c5597d6 feat(runtime): use micromamba instead of mamba and fix build issue (#4154) 2024-10-02 21:23:18 +00:00
Rehan Ganapathy c8a933590a (feat) allow specification of config.toml location via args (solves #3947) (#4168)
Co-authored-by: Rehan Ganapathy <rehanganapathy@MACASF.local>
2024-10-02 20:30:12 +00:00
mamoodi dd228c07e0 Small reordering of PR template (#4173) 2024-10-02 13:30:53 -04:00
tofarr e0f8a5d508 Fix: Add timeout on websocket accept (#4169) 2024-10-02 10:51:12 -06:00
dependabot[bot] e93db80769 chore(deps-dev): bump reportlab from 4.2.4 to 4.2.5 (#4170)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-02 18:28:34 +02:00
dependabot[bot] 14a4e1018a chore(deps): bump litellm from 1.48.7 to 1.48.9 (#4176)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-02 16:15:15 +00:00
dependabot[bot] bb151655cc chore(deps-dev): bump streamlit from 1.38.0 to 1.39.0 (#4175)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-02 16:06:58 +00:00
dependabot[bot] 471867859f chore(deps): bump boto3 from 1.35.30 to 1.35.31 (#4174)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-02 15:50:47 +00:00
dependabot[bot] a1d09c4437 chore(deps): bump google-cloud-aiplatform from 1.68.0 to 1.69.0 (#4172)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-02 15:40:16 +00:00
dependabot[bot] 240b500acf chore(deps-dev): bump openai from 1.50.2 to 1.51.0 (#4171)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-02 17:28:43 +02:00
Xingyao Wang d518ca08b7 standardize error message across remote runtime and eventstream runtime (#4159) 2024-10-02 22:42:17 +08:00
Graham Neubig 52e0630af8 Update .openhands_instructions with linting (#4165) 2024-10-02 08:10:09 -04:00
Graham Neubig 178dbfaf4a Run pre-commit (#4163) 2024-10-02 04:52:02 +00:00
Xingyao Wang 240a470a1d Revert "add few seconds to properly receive timeout error from client"
This reverts commit dd2cb4399a.
2024-10-01 23:44:05 -04:00
Xingyao Wang dd2cb4399a add few seconds to properly receive timeout error from client 2024-10-01 23:43:50 -04:00
tofarr 4eaf28d7b1 Fix ctrl c not working during startup (#4155) 2024-10-02 11:05:00 +08:00
Engel Nyst 5a45c648a8 attributes for BE/FE should not be sent (#4150) 2024-10-01 23:00:03 +00:00
Xingyao Wang 3cf794faef fix(runtime build): only check for image exist on exact hash tag (#4152) 2024-10-01 22:20:25 +00:00
mamoodi 04643d6f3c Make Claude Sonnet 3.5 the recommended model and update docs accordingly (#4151) 2024-10-01 20:32:39 +00:00
Xingyao Wang 53a015f718 fix: make llm_completions optional to fix eval_infer.py (#4148) 2024-10-02 03:55:03 +08:00
Graham Neubig 148d22e1af Fix issue #4136: 'Restructuring documentation' (#4138)
Co-authored-by: openhands <openhands@all-hands.dev>
2024-10-01 17:44:48 +00:00
Robert Brennan 31b2e4b5b2 allow specifying exact remote image (#4135) 2024-10-01 13:17:51 -04:00
dependabot[bot] 1d6633164f chore(deps-dev): bump @types/node from 22.7.3 to 22.7.4 in /frontend (#4118)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 13:16:32 -04:00
dependabot[bot] dd89cfba2a chore(deps): bump @react-types/shared from 3.24.1 to 3.25.0 in /frontend (#4139)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 13:16:08 -04:00
mamoodi 0144caaf1f Update eval doc for remote runtime (#4145) 2024-10-01 13:14:36 -04:00
Robert Brennan ec1a86f150 Handle errors when starting session (#4134) 2024-10-01 12:40:09 -04:00
dependabot[bot] 926af7f5fd chore(deps): bump boto3 from 1.35.29 to 1.35.30 (#4144)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 16:19:07 +00:00
dependabot[bot] cc55c6dbe5 chore(deps): bump litellm from 1.48.6 to 1.48.7 (#4141)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 18:18:49 +02:00
OpenHands c777cfeacf Fix issue #4113: Document github action (#4124)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-10-01 18:08:57 +02:00
dependabot[bot] 823966c24e chore(deps-dev): bump @types/react from 18.3.9 to 18.3.10 in /frontend (#4117)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 10:55:45 -04:00
dependabot[bot] adba7dad96 chore(deps): bump uvicorn from 0.30.6 to 0.31.0 (#4114)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 10:55:27 -04:00
Xingyao Wang 1109637efb Update instruction for new version of eval runtime-api (#4128) 2024-09-30 23:48:38 +00:00
mamoodi 71adfeebab Update PR Template for better release notes (#4126) 2024-09-30 17:06:56 -04:00
Robert Brennan 8059e8e298 make runtime url configurable (#4093) 2024-09-30 18:59:57 +00:00
Xingyao Wang 54ac340e0b refactor: standardize linter output data structure and interface (#4077)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-10-01 02:40:23 +08:00
dependabot[bot] 13901b4b5a chore(deps): bump python-multipart from 0.0.9 to 0.0.12 (#4121)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 20:30:55 +02:00
dependabot[bot] 0b27d51135 chore(deps): bump litellm from 1.48.5 to 1.48.6 (#4120)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 20:30:09 +02:00
dependabot[bot] f0ce682fa0 chore(deps): bump json-repair from 0.29.5 to 0.29.7 (#4115)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 20:28:32 +02:00
dependabot[bot] 3567911da8 chore(deps): bump boto3 from 1.35.28 to 1.35.29 (#4122)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 20:27:58 +02:00
Graham Neubig 215d227e5a Reference the OpenHands resolver (#4125) 2024-09-30 14:26:12 -04:00
mamoodi 50e6cc6156 Release 0.9.7 (#4123) 2024-09-30 11:28:16 -04:00
Xingyao Wang 8d6eda3623 fix eval_infer.sh to correctly copy SWE-Bench logs (#4111) 2024-09-29 18:39:18 -05:00
Cole Murray d5f965b474 Update LiteLLLM to 1.48.5 (#4110) 2024-09-29 06:42:59 +00:00
tobitege c3bbe604eb (fix) Fix logging in shared eval file to prevent key disclosure (#4108) 2024-09-28 19:33:16 +00:00
Ana Noemi c7fe39998c Update README to decrease unsuccessful drivebys (#4091) 2024-09-28 18:52:01 +00:00
Xingyao Wang ec6e07647f fix hash equivalance verification ci for fork (#4107) 2024-09-29 02:19:59 +08:00
Graham Neubig e744eadb8b Robustify openhands resolver workflow (#4105) 2024-09-28 11:35:56 -04:00
Engel Nyst e582806004 Vision and prompt caching fixes (#4014) 2024-09-28 14:37:29 +02:00
OpenHands f427f9d8d4 Fix issue #4103: Improve description of how to do frontend setup and testing in .openhands_instructions (#4104)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-09-28 06:41:34 +00:00
Graham Neubig d669c7b60d Add github issue resolution workflow (#4102) 2024-09-28 04:52:52 +00:00
dependabot[bot] 42be4ee5bc chore(deps-dev): bump openai from 1.48.0 to 1.50.2 (#4101)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-28 05:04:03 +02:00
Engel Nyst f994277d0f Make agents follow configured temperature (#4099) 2024-09-28 01:15:46 +00:00
tofarr 5ccee7c8a7 Fix Bash commands now do not block and actually respect the timeout (#4058) 2024-09-28 08:40:00 +08:00
tobitege 575a829d94 (enh) add test_python_version to test_bash.py runtime tests (#4098) 2024-09-28 08:21:14 +08:00
Xingyao Wang 2bed3a424c chore: pass logger DEBUG mode to client side (#4096) 2024-09-28 08:21:04 +08:00
Xingyao Wang a4cc010110 chore: parser fix for deepseek (#4097) 2024-09-28 08:20:51 +08:00
tobitege 9651368e6a revert #3871 dockerfile template: don't write to .bashrc file (#4095) 2024-09-27 21:49:51 +00:00
tofarr c5025fb66e Fix Reducing the amount being downloaded every time the hash changes. (#4078) 2024-09-27 15:48:33 -06:00
Robert Brennan 3f9111c615 add idle time to client server (#4084) 2024-09-27 19:41:16 +00:00
dependabot[bot] 89e95f2671 chore(deps): bump boto3 from 1.35.27 to 1.35.28 (#4090)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-27 16:48:34 +00:00
dependabot[bot] 5bfa0c2f8d chore(deps): bump browsergym from 0.7.0 to 0.7.1 (#4089)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-27 16:21:58 +00:00
dependabot[bot] 84141f656d chore(deps-dev): bump chromadb from 0.5.9 to 0.5.11 (#4088)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-27 16:19:17 +00:00
dependabot[bot] 6ff7506581 chore(deps-dev): bump reportlab from 4.2.2 to 4.2.4 (#4086)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-27 16:01:32 +00:00
dependabot[bot] 41dc7f0256 chore(deps-dev): bump llama-index from 0.11.13 to 0.11.14 (#4085)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-27 15:50:17 +00:00
Xingyao Wang 34f3b61536 [runtime hash] fix runtime hash mismatch between inside app image and in "development mode" (#4039) 2024-09-27 15:26:26 +00:00
dependabot[bot] 4533c47595 chore(deps-dev): bump @types/node from 22.7.2 to 22.7.3 in /frontend (#4081)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-27 15:20:11 +00:00
Xingyao Wang 47774e60b0 chore: remove deprecated dockerfile (#4079) 2024-09-27 15:03:23 +00:00
Robert Brennan b78f646b65 Release 0.9.6 (#4076) 2024-09-26 21:27:17 +00:00
Amir 3e5c01dfc8 Remove param from docstring that does not exist in the append_file (#4060) 2024-09-26 22:25:11 +02:00
tobitege 29c34e0b6a (fix) actions.ts: restored handleAssistantMessage handling order (#4074) 2024-09-26 19:56:12 +00:00
tofarr c919086e25 Fix for regression (#4075)
Regression fixed
2024-09-26 12:58:00 -06:00
Engel Nyst 0a03c802f5 Refactor llm.py (#4057) 2024-09-26 17:44:18 +00:00
Xingyao Wang 081ebdbdd8 [runtime] do not keep rebuilding from generic image (#4072) 2024-09-26 17:19:46 +00:00
dependabot[bot] 572c7b726d chore(deps-dev): bump ruff from 0.6.7 to 0.6.8 (#4067)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 17:19:16 +00:00
Xingyao Wang cfc5bb70c1 Update README.md for CodeAct (#4070) 2024-09-26 16:55:08 +00:00
dependabot[bot] 008b866a38 chore(deps-dev): bump jsdom from 25.0.0 to 25.0.1 in /frontend (#3992)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 16:34:29 +00:00
dependabot[bot] 676ad3e140 chore(deps-dev): bump chromadb from 0.5.7 to 0.5.9 (#4069)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 16:08:17 +00:00
dependabot[bot] 19278de5d0 chore(deps): bump json-repair from 0.29.4 to 0.29.5 (#4068)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 15:48:39 +00:00
dependabot[bot] 891e4a8d34 chore(deps): bump datasets from 3.0.0 to 3.0.1 (#4065)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 15:35:36 +00:00
dependabot[bot] 85be8607e0 chore(deps): bump litellm from 1.48.1 to 1.48.2 (#4066)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 23:33:24 +08:00
dependabot[bot] 49b244610c chore(deps-dev): bump openai from 1.47.1 to 1.48.0 (#4063)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 17:16:59 +02:00
dependabot[bot] b347b1d06f chore(deps): bump boto3 from 1.35.26 to 1.35.27 (#4064)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 17:15:11 +02:00
dependabot[bot] 0c86a60b35 chore(deps-dev): bump @types/node from 22.7.0 to 22.7.2 in /frontend (#4062)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 19:11:06 +04:00
tofarr 01317138e2 Fix: uvicorn reloading when python files in workspace change, & started section for debugging instructions for developers (#4041)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-09-26 08:57:37 -06:00
Xingyao Wang e03855cd7f Make sure we print the observation in the same way as the LLM sees it 2024-09-26 14:01:48 +00:00
jaki300 757c9593f1 Create gke-example.md (#3795)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-09-26 09:11:33 -04:00
mamoodi 266e8ff951 Release 0.9.5 (#4061) 2024-09-26 08:36:31 -04:00
dependabot[bot] 3e79cd12a6 chore(deps-dev): bump @types/react from 18.3.8 to 18.3.9 in /frontend (#4029)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 14:41:21 +04:00
tobitege 2cc1c3ef42 (enh) Docker runtime builder with BuildKit support, enh. caching (#4009) 2024-09-26 08:50:53 +02:00
dependabot[bot] ef0b08a46e chore(deps-dev): bump tailwindcss from 3.4.12 to 3.4.13 in /frontend (#4030)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-26 05:28:03 +00:00
dependabot[bot] f1d5202884 chore(deps): bump vite from 5.4.7 to 5.4.8 in /frontend (#4046) 2024-09-26 12:56:35 +08:00
dependabot[bot] 11cedfb854 chore(deps): bump google-cloud-aiplatform from 1.67.1 to 1.68.0 (#4051) 2024-09-26 12:56:16 +08:00
dependabot[bot] 6d103a0db2 chore(deps-dev): bump @types/node from 22.6.1 to 22.7.0 in /frontend (#4047) 2024-09-26 12:56:00 +08:00
Engel Nyst 798aaeaef6 remove Exception in the agent (#4054) 2024-09-26 06:39:17 +02:00
tofarr 0df4b97e5b Fix startup statuses (#4053) 2024-09-25 14:38:32 -06:00
Xingyao Wang 81b3cd71b3 [eval] log evaluating warnings directly to console (#4026) 2024-09-26 03:42:32 +08:00
Robert Brennan 9241ae2148 Fix persistence of "advanced settings" (#4038) 2024-09-25 12:57:08 -04:00
dependabot[bot] d3f86e052a chore(deps-dev): bump llama-index from 0.11.12 to 0.11.13 (#4044)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-25 18:24:09 +02:00
dependabot[bot] e0c65f8f9c chore(deps): bump google-generativeai from 0.8.1 to 0.8.2 (#4050)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-25 18:22:01 +02:00
dependabot[bot] 394ab360a8 chore(deps): bump boto3 from 1.35.25 to 1.35.26 (#4048)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-25 18:21:27 +02:00
dependabot[bot] 8a146d5ced chore(deps): bump litellm from 1.48.0 to 1.48.1 (#4049)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-25 18:20:47 +02:00
mamoodi 1d052818ae Set runtime container image so it doesn't need to be rebuilt (#4035) 2024-09-25 05:20:45 +02:00
tofarr ee284bae8f Fix server lock up on session init (#4007) 2024-09-24 15:49:30 -06:00
Xingyao Wang 1b1d8f0b02 [eval] Use imap_unorderd for parallizing evaluation (#4040) 2024-09-24 20:47:27 +00:00
tobitege c32cec7f89 (enh) send status messages to UI during startup (#3771)
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <contact@rbren.io>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-09-24 18:46:58 +00:00
Robert Brennan 7b2b1eff57 fix up settings saves (#4037) 2024-09-24 18:18:19 +00:00
dependabot[bot] 2f1b537471 chore(deps): bump minio from 7.2.8 to 7.2.9 (#4034)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 18:07:24 +02:00
dependabot[bot] f2a71eb388 chore(deps): bump boto3 from 1.35.24 to 1.35.25 (#4027)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 18:06:52 +02:00
dependabot[bot] 63c5d74169 chore(deps-dev): bump openai from 1.47.0 to 1.47.1 (#4033)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 18:06:20 +02:00
dependabot[bot] 5d77aec90b chore(deps): bump litellm from 1.47.1 to 1.48.0 (#4032)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 18:05:50 +02:00
Xingyao Wang a66e738957 [eval] use mp Pool instead ProcessPoolExecutor (#4025) 2024-09-24 23:59:06 +08:00
dependabot[bot] 582f07f9c9 chore(deps-dev): bump @types/node from 22.5.5 to 22.6.1 in /frontend (#4028)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 19:51:39 +04:00
sp.wack 83b7fcc3ed hotfix(frontend): Clear base URL when advanced options is unselected (#4016) 2024-09-24 19:49:39 +04:00
Graham Neubig dc418e7b71 Update README.md for runtime (#4015) 2024-09-24 02:50:15 +02:00
mamoodi dd3d1497f6 Add clearer OpenHands Configuration logs (#4011) 2024-09-23 18:42:00 -04:00
Xingyao Wang 8ea2d61ff2 [llm] Add app name for OpenRouter (#4010) 2024-09-24 00:26:07 +02:00
Graham Neubig 73ded7de10 Make drop_params default in llm_config (#4012) 2024-09-23 16:57:10 -04:00
dependabot[bot] 3f6aa0d1f1 chore(deps-dev): bump ruff from 0.6.6 to 0.6.7 (#4001)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 20:15:54 +02:00
dependabot[bot] 2556767ccb chore(deps-dev): bump @types/react from 18.3.7 to 18.3.8 in /frontend (#3974)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 12:43:16 -04:00
tobitege fbef93b762 Refactor config.py file into package (own folder with separate files) (#3987) 2024-09-23 12:42:54 -04:00
dependabot[bot] 3a93fd4c64 chore(deps): bump vite from 5.4.6 to 5.4.7 in /frontend (#3994)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 12:39:24 -04:00
dependabot[bot] a12e8cf06a chore(deps): bump json-repair from 0.29.2 to 0.29.4 (#4000)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 12:38:49 -04:00
dependabot[bot] 69479852ff chore(deps-dev): bump llama-index from 0.11.10 to 0.11.12 (#3999)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 12:38:40 -04:00
dependabot[bot] 7024e973d4 chore(deps): bump browsergym from 0.6.4 to 0.7.0 (#3998)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 12:38:28 -04:00
dependabot[bot] a260cc8dc8 chore(deps): bump boto3 from 1.35.23 to 1.35.24 (#3997)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 12:38:13 -04:00
dependabot[bot] 7e53c96b14 chore(deps-dev): bump openai from 1.46.1 to 1.47.0 (#4002)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 18:38:02 +02:00
dependabot[bot] 96b23d2e4c chore(deps): bump jose from 5.9.2 to 5.9.3 in /frontend (#3993)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 12:37:52 -04:00
dependabot[bot] b97aa10b66 chore(deps): bump litellm from 1.46.8 to 1.47.1 (#4003)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-23 18:36:58 +02:00
Aavash Baral b61455042f fix: models cut off in Settings UI (#3900) (#3991) 2024-09-23 10:54:00 -04:00
Ikko Eltociear Ashimine c84495830e [eval] update swe_bench/README.md (#3990) 2024-09-23 11:03:09 +02:00
Xingyao Wang 3435f1e5d8 Store the file edit backup file in /tmp (#3958) 2024-09-23 06:32:24 +08:00
Xingyao Wang 714e46f29a [eval] save eventstream & llm completions for SWE-Bench run_infer (#3923) 2024-09-22 04:39:13 +00:00
mamoodi e0608af0b3 Add OpenRouter provider docs (#3986) 2024-09-21 20:57:25 -04:00
Xingyao Wang 402a03cb9a change top_p default value to 1.0 (#3983) 2024-09-21 18:00:18 +00:00
tobitege 01462e11d7 (fix) CodeActAgent/LLM: react on should_exit flag (user cancellation) (#3968) 2024-09-20 23:49:45 +02:00
Engel Nyst ebd93977cd Update local ollama doc (#3966) 2024-09-20 21:56:38 +02:00
mamoodi ef189d52a5 Remove ollama reference from documentation for now (#3959) 2024-09-20 14:48:48 -04:00
mamoodi 3c77cc80dc Release 0.9.4 (#3982) 2024-09-20 20:05:01 +02:00
dependabot[bot] 31b189c9af chore(deps-dev): bump ruff from 0.6.5 to 0.6.6 (#3976)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-20 18:44:25 +02:00
dependabot[bot] a7630c399a chore(deps): bump boto3 from 1.35.22 to 1.35.23 (#3977)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-20 18:43:40 +02:00
dependabot[bot] 86521c971b chore(deps): bump litellm from 1.46.6 to 1.46.8 (#3975)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-20 18:40:40 +02:00
tobitege 6682e0f1dd (fix) CodeActAgent: use content of AgentDelegateObservation (#3970)
Co-authored-by: Ryan H. Tran <descience.thh10@gmail.com>
2024-09-20 18:31:11 +02:00
dependabot[bot] 031b91457a chore(deps): bump pandas from 2.2.2 to 2.2.3 (#3978)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-20 16:21:52 +00:00
Xingyao Wang b13ed017d8 [eval] add git patch post-processing for SWE-Bench eval_infer (#3980) 2024-09-20 15:33:53 +00:00
Robert Brennan 72ca1690a7 Wait for runtime to be ready in __init__ (#3963) 2024-09-20 17:31:30 +02:00
tobitege 45066f19dc (fix) restore sudo-capability after recent changes (#3964) 2024-09-19 23:08:13 +02:00
dependabot[bot] 809903bad4 chore(deps-dev): bump chromadb from 0.5.5 to 0.5.7 (#3951)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 21:35:25 +02:00
dependabot[bot] 071f1f0e13 chore(deps): bump google-cloud-aiplatform from 1.67.0 to 1.67.1 (#3952)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 21:33:59 +02:00
dependabot[bot] 08a505406b chore(deps): bump litellm from 1.46.4 to 1.46.6 (#3953)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 21:33:08 +02:00
dependabot[bot] 1756b85675 chore(deps): bump browsergym from 0.6.3 to 0.6.4 (#3954)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 21:31:46 +02:00
dependabot[bot] e8ab8f1146 chore(deps): bump boto3 from 1.35.21 to 1.35.22 (#3955)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 21:28:24 +02:00
dependabot[bot] a6ea4c2a60 chore(deps-dev): bump sympy from 1.13.2 to 1.13.3 (#3956)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 21:27:51 +02:00
dependabot[bot] 1e7f398376 chore(deps-dev): bump openai from 1.46.0 to 1.46.1 (#3957)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 21:27:14 +02:00
niliy01 0f6fb0f80e (enh) unify the log output in docker build process (#3961)
Signed-off-by: niliy <WannaTen@users.noreply.github.com>
2024-09-19 19:19:16 +02:00
dependabot[bot] bd1f0f1671 chore(deps): bump monaco-editor from 0.51.0 to 0.52.0 in /frontend (#3950)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 21:12:38 +04:00
tobitege 2842efa0e8 ChatMessage: get timestamp out of direct view (#3948) 2024-09-19 15:40:20 +02:00
tobitege 620526b8b4 agent_controller: in PAUSED state reduce delegate logspam from delegate (#3946) 2024-09-19 14:34:38 +02:00
tofarr 31dbd3d02e Fix google cloud session manager (#3942) 2024-09-19 06:28:10 -06:00
tofarr dd7174e559 Fix broken cli (#3941) 2024-09-18 19:15:09 -06:00
Engel Nyst 8fdfece059 Refactor messages serialization (#3832)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-09-18 23:48:58 +02:00
tofarr ad0b549d8b Feat Tightening up Timeouts and interrupt conditions. (#3926) 2024-09-18 20:50:42 +00:00
Engel Nyst 47f60b8275 Don't send gemini settings when the llm is not gemini (#3940) 2024-09-18 20:12:58 +00:00
Xingyao Wang 5d7f2fd4ae [eval] Allow evaluation of SWE-Bench patches on RemoteRuntime (#3927)
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-09-18 16:07:34 -04:00
dependabot[bot] 22e885736b chore(deps-dev): bump openai from 1.45.1 to 1.46.0 (#3929)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 20:22:42 +02:00
dependabot[bot] f9589a552e chore(deps): bump google-cloud-aiplatform from 1.66.0 to 1.67.0 (#3934)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 20:22:32 +02:00
dependabot[bot] 043fb2771d chore(deps): bump @nextui-org/react from 2.4.6 to 2.4.8 in /frontend (#3888)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 17:50:52 +00:00
dependabot[bot] b95740cc55 chore(deps-dev): bump tailwindcss from 3.4.11 to 3.4.12 in /frontend (#3928)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 21:38:26 +04:00
Robert Brennan c864715b43 Fix UID management for ubuntu users (#3937) 2024-09-18 16:29:39 +00:00
Robert Brennan a7f1bca586 fix documentation links (#3936) 2024-09-18 12:21:30 -04:00
tobitege b4408b41c9 (feat) LLM class: add safety_settings for Gemini; improve max_output_tokens defaulting (#3925) 2024-09-18 11:51:23 -04:00
Engel Nyst e3be71f523 Fix init order with threading (#3935) 2024-09-18 15:26:51 +00:00
dependabot[bot] dbe767325f chore(deps): bump boto3 from 1.35.20 to 1.35.21 (#3930)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 17:13:25 +02:00
dependabot[bot] ec0fe35d48 chore(deps): bump fastapi from 0.114.2 to 0.115.0 (#3932)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 17:12:48 +02:00
dependabot[bot] 883f36b005 chore(deps): bump browsergym from 0.6.0 to 0.6.3 (#3931)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 17:11:59 +02:00
dependabot[bot] dc6b4b7296 chore(deps): bump litellm from 1.46.1 to 1.46.4 (#3933)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 15:06:39 +00:00
dependabot[bot] fe0a8eb036 chore(deps-dev): bump ruff from 0.6.4 to 0.6.5 (#3894)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 14:20:26 +00:00
tobitege c3117e8c39 (feat) add --version to cli (#3924) 2024-09-18 09:44:51 -04:00
Qiang Li f7ebc1cf1f chore: Add docker files for developing inside container. (#3911) 2024-09-17 23:19:01 +02:00
mamoodi 8a419b5c45 Reorder sidebar doc by bringing LLMs higher up (#3922) 2024-09-17 14:26:13 -04:00
dependabot[bot] 9787a31ba1 chore(deps-dev): bump @types/react from 18.3.6 to 18.3.7 in /frontend (#3920)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-17 11:57:33 -04:00
dependabot[bot] 31296624d1 chore(deps): bump vite from 5.4.5 to 5.4.6 in /frontend (#3919)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-17 11:57:23 -04:00
dependabot[bot] 656222f416 chore(deps): bump boto3 from 1.35.19 to 1.35.20 (#3915)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-17 11:29:15 -04:00
dependabot[bot] a4e61faf56 chore(deps-dev): bump llama-index from 0.11.9 to 0.11.10 (#3916)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-17 11:29:06 -04:00
dependabot[bot] f4657edc48 chore(deps): bump litellm from 1.46.0 to 1.46.1 (#3917)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-17 15:21:32 +00:00
Engel Nyst ef09f0fe37 Small fix in readme (#3912) 2024-09-17 14:33:25 +00:00
Xingyao Wang f996b31d64 [eval] Fix multi-processing bug (again^3) & allow set EXP_NAME for each run_infer (#3907)
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-09-17 14:07:58 +00:00
mamoodi fa0d9cfa42 Add providers section in documentation and put current docs under it (#3905) 2024-09-17 08:52:13 -04:00
niliy01 07a094e701 (enh) Update Docker pull data in place (#3910)
Signed-off-by: Yi Lin <teroincn@gmail.com>
2024-09-17 10:22:07 +02:00
tobitege 52c5abccbf (enh) Dockerfile.j2: improve env vars for bash and activate in .bashrc (#3871) 2024-09-17 08:49:04 +02:00
dependabot[bot] 29b0e62cd7 chore(deps): bump react-i18next from 15.0.1 to 15.0.2 in /frontend (#3889) 2024-09-17 11:04:48 +08:00
dependabot[bot] 2a10ff1374 chore(deps): bump litellm from 1.44.28 to 1.46.0 (#3896)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 20:56:22 +02:00
dependabot[bot] 592c9580d2 chore(deps-dev): bump @types/node from 22.5.4 to 22.5.5 in /frontend (#3890)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 18:29:25 +00:00
niliy01 804674bb9f refactor the logic in agent_controller to imporve readability (#3873)
Signed-off-by: Yi Lin <teroincn@gmail.com>
2024-09-16 14:13:52 -04:00
Engel Nyst 41a54378dc Add delegates events to eval trajectories (#3881) 2024-09-16 14:10:42 -04:00
mamoodi 9d4eb7d19d Minor tweaks to getting started guide (#3901) 2024-09-16 13:53:46 -04:00
dependabot[bot] d886bf7e2b chore(deps-dev): bump postcss from 8.4.45 to 8.4.47 in /frontend (#3886)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 11:17:33 -05:00
dependabot[bot] dcede97f68 chore(deps-dev): bump openai from 1.45.0 to 1.45.1 (#3887)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 11:17:11 -05:00
dependabot[bot] 531aec404a chore(deps): bump jose from 5.8.0 to 5.9.2 in /frontend (#3891)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-09-16 11:16:26 -05:00
dependabot[bot] b7e050eaf6 chore(deps): bump fastapi from 0.114.1 to 0.114.2 (#3892)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 11:15:59 -05:00
dependabot[bot] 1cc1016768 chore(deps-dev): bump @types/react from 18.3.5 to 18.3.6 in /frontend (#3893)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-09-16 11:15:45 -05:00
dependabot[bot] d3c2a022b3 chore(deps-dev): bump llama-index-embeddings-ollama from 0.3.0 to 0.3.1 (#3895)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-09-16 11:15:26 -05:00
dependabot[bot] 88abf1b5d8 chore(deps): bump boto3 from 1.35.18 to 1.35.19 (#3897)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-09-16 11:15:09 -05:00
mamoodi 834296f7b4 Missed adding Groq to sidebar so it shows up (#3885) 2024-09-16 11:31:53 -04:00
Graham Neubig 243cb492aa Run pre-commit on all files (#3884) 2024-09-16 11:07:08 -04:00
mamoodi f11e767eb5 Remove Cleanup stage in runtime tests (#3861)
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
2024-09-16 10:14:49 -04:00
tofarr 0db664986d Tightened up the logic on retries. (#3882) 2024-09-16 07:28:06 -06:00
tobitege a33f61c025 (feat) Show messages' timestamp in UI (#3869) 2024-09-16 05:41:29 +02:00
tobitege a45b20a406 (fix) runtime: tweak _wait_until_alive tenacity and exception handling (#3878) 2024-09-16 04:24:58 +02:00
Engel Nyst f6709d08ef Add docs for Groq and for OpenAI proxies (#3877) 2024-09-15 23:08:37 +00:00
Xingyao Wang 2b3925278d [eval] refactor process instance logic into update_progress (#3875) 2024-09-15 18:47:15 -04:00
tobitege ecf4aed28b (fix) Update logs after run_action (EventStreamRuntime) (#3870) 2024-09-15 18:50:10 +02:00
mamoodi a97ef34139 documentation changes associated with UI changes and more consistency (#3866) 2024-09-14 17:22:15 -04:00
tobitege 554636cf2a (fix) Fix runtime (RT) tests and split tests in 2 actions (openhands/root) (#3791)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-09-14 21:51:30 +02:00
tobitege 57390eb26b (enh) docker pull (if not found locally) with progress info (#3682) 2024-09-14 06:26:42 +02:00
Engel Nyst 379f2b6f23 Fix queue length on Macs (#3867) 2024-09-14 01:11:29 +00:00
Graham Neubig 26cc1670ad Adds OpenAI o-1 models to frontend (#3868)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-09-13 23:53:24 +00:00
dependabot[bot] cbeae3e612 chore(deps): bump browsergym from 0.5.1 to 0.6.0 (#3860)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 22:43:17 +02:00
dependabot[bot] 434951af23 chore(deps): bump boto3 from 1.35.17 to 1.35.18 (#3858)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 17:25:49 +00:00
dependabot[bot] 900d819394 chore(deps): bump vite from 5.4.4 to 5.4.5 in /frontend (#3862)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 11:53:30 -04:00
Xingyao Wang 3a1b8c093b [eval] yet another eval fixes on multi-processing (#3854)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-09-13 15:51:22 +00:00
dependabot[bot] cea6b6e30e chore(deps-dev): bump llama-index from 0.11.8 to 0.11.9 (#3859)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 15:19:06 +00:00
dependabot[bot] 0233cd7848 chore(deps): bump litellm from 1.44.24 to 1.44.28 (#3856)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 15:13:34 +00:00
dependabot[bot] 95edc1b5cf chore(deps): bump google-generativeai from 0.8.0 to 0.8.1 (#3855)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 15:06:24 +00:00
tobitege 6111f530c2 (fix) StuckDetector: syntax error loops were not detected (#3663)
Co-authored-by: mamoodi <mamoodiha@gmail.com>
2024-09-13 16:53:52 +02:00
Xingyao Wang 78c5f58adc refactor & improve retry for the reliability of RemoteRuntime & evaluation (#3846) 2024-09-13 07:37:07 -04:00
dependabot[bot] 7506b20087 chore(deps): bump boto3 from 1.35.16 to 1.35.17 (#3841)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.16 to 1.35.17.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.16...1.35.17)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 10:07:05 +00:00
mamoodi bff9296d68 Make the publish PyPi manual to fit current release process (#3851) 2024-09-12 20:53:19 -04:00
mamoodi 4b5240cdf7 Release 0.9.3 (#3850) 2024-09-12 20:51:34 -04:00
Robert Brennan 9ef79cbe22 fix advanced settings issue (#3849) 2024-09-12 22:00:31 +00:00
Robert Brennan 1f13d80ddc fix saves (#3848) 2024-09-12 21:47:02 +00:00
Robert Brennan 58de5221f5 fix file access (#3847) 2024-09-12 15:30:21 -04:00
Robert Brennan 9bbb35ec18 minor settings fixes (#3809)
* minor settings fixes

* useMemo

* fix code
2024-09-12 19:08:18 +00:00
Xingyao Wang 797f02ff6f rename huggingface evaluation benchmark (#3845) 2024-09-12 18:50:26 +00:00
Xingyao Wang 47d9621742 [eval] SWE-Bench eval usability fixes (#3836)
* [eval] increase timeout for swebench eval init/complete

* allow CmdRunAction to optionally block when .timeout is setted

* fix unit test for serialization

* fix unit tests for security analyzer

* fix integration tests

* add more timeout

* only check P2P when instances are non-empty;
convert P2P and F2P columns to string instead of list

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-09-12 16:33:51 +00:00
dependabot[bot] 8c12f5b67d chore(deps-dev): bump tailwindcss from 3.4.10 to 3.4.11 in /frontend (#3838)
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss) from 3.4.10 to 3.4.11.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/v3.4.11/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/compare/v3.4.10...v3.4.11)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-12 11:38:21 -04:00
dependabot[bot] 6132a58367 chore(deps-dev): bump husky from 9.1.5 to 9.1.6 in /frontend (#3839)
Bumps [husky](https://github.com/typicode/husky) from 9.1.5 to 9.1.6.
- [Release notes](https://github.com/typicode/husky/releases)
- [Commits](https://github.com/typicode/husky/compare/v9.1.5...v9.1.6)

---
updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-12 11:38:15 -04:00
Xingyao Wang 2fe2f4c530 [eval] increase timeout for SWEBench eval init/complete (#3829)
* [eval] increase timeout for swebench eval init/complete

* allow CmdRunAction to optionally block when .timeout is setted

* fix unit test for serialization

* fix unit tests for security analyzer

* fix integration tests

* add more timeout
2024-09-12 15:20:58 +00:00
tofarr 4d8f812f88 chore: update the icons from OpenDevin to OpenHands (#3835)
* chore: update the icons from OpenDevin to OpenHands

* Update logo
2024-09-12 14:19:31 +00:00
tofarr e5cb80d59d docs: Update steps for running integration tests in a local environment (#3830)
* docs: Update steps for running integration tests in a local environment
2024-09-12 03:22:53 -06:00
sp.wack ee9bea393f feat(frontend): Update settings to include base URL (#3755)
* Update settings to include base URL

* Fix tests
2024-09-11 21:57:19 -04:00
dependabot[bot] 940de86caa chore(deps): bump browsergym from 0.4.3 to 0.5.1 (#3833)
Bumps browsergym from 0.4.3 to 0.5.1.

---
updated-dependencies:
- dependency-name: browsergym
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-09-12 03:55:21 +02:00
Frank Xu fe5ecb6da8 add url info in browsing observation (#3815)
* add url info in browsing observation

* fix integration tests for url

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-09-12 02:46:39 +02:00
mamoodi 41b8f3e4a7 Fail the Runtime tests check if the previous jobs were cancelled or failed (#3825)
* Fail the Runtime tests check if the previous jobs were cancelled or failed

* Check only for failures

* Add test exit

* Test cancel

* fix conditions

* fix condition again

* Try another way

* Now test success
2024-09-11 16:40:00 -04:00
dependabot[bot] 93ecd82e61 chore(deps): bump google-cloud-aiplatform from 1.65.0 to 1.66.0 (#3828)
Bumps [google-cloud-aiplatform](https://github.com/googleapis/python-aiplatform) from 1.65.0 to 1.66.0.
- [Release notes](https://github.com/googleapis/python-aiplatform/releases)
- [Changelog](https://github.com/googleapis/python-aiplatform/blob/main/CHANGELOG.md)
- [Commits](https://github.com/googleapis/python-aiplatform/compare/v1.65.0...v1.66.0)

---
updated-dependencies:
- dependency-name: google-cloud-aiplatform
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 18:52:35 +02:00
dependabot[bot] 052149ccf3 chore(deps-dev): bump llama-index from 0.11.6 to 0.11.8 (#3827)
Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.6 to 0.11.8.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.6...v0.11.8)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 18:51:05 +02:00
dependabot[bot] 3aba722a59 chore(deps-dev): bump evaluate from 0.4.2 to 0.4.3 (#3818)
Bumps [evaluate](https://github.com/huggingface/evaluate) from 0.4.2 to 0.4.3.
- [Release notes](https://github.com/huggingface/evaluate/releases)
- [Commits](https://github.com/huggingface/evaluate/compare/v0.4.2...v0.4.3)

---
updated-dependencies:
- dependency-name: evaluate
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 18:47:22 +02:00
dependabot[bot] 4738640f4e chore(deps): bump datasets from 2.21.0 to 3.0.0 (#3821)
Bumps [datasets](https://github.com/huggingface/datasets) from 2.21.0 to 3.0.0.
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](https://github.com/huggingface/datasets/compare/2.21.0...3.0.0)

---
updated-dependencies:
- dependency-name: datasets
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 18:46:23 +02:00
dependabot[bot] bf6f7b9943 chore(deps): bump google-generativeai from 0.7.2 to 0.8.0 (#3822)
Bumps [google-generativeai](https://github.com/google/generative-ai-python) from 0.7.2 to 0.8.0.
- [Release notes](https://github.com/google/generative-ai-python/releases)
- [Changelog](https://github.com/google-gemini/generative-ai-python/blob/main/RELEASE.md)
- [Commits](https://github.com/google/generative-ai-python/compare/v0.7.2...v0.8.0)

---
updated-dependencies:
- dependency-name: google-generativeai
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 18:35:33 +02:00
dependabot[bot] 0f20965999 chore(deps): bump fastapi from 0.114.0 to 0.114.1 (#3823)
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.114.0 to 0.114.1.
- [Release notes](https://github.com/fastapi/fastapi/releases)
- [Commits](https://github.com/fastapi/fastapi/compare/0.114.0...0.114.1)

---
updated-dependencies:
- dependency-name: fastapi
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 16:21:08 +00:00
dependabot[bot] e6034b301b chore(deps): bump vite from 5.4.3 to 5.4.4 in /frontend (#3817)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.4.3 to 5.4.4.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v5.4.4/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.4.4/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 19:19:49 +03:00
dependabot[bot] 9e50b88f9f chore(deps): bump litellm from 1.44.23 to 1.44.24 (#3826)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.44.23 to 1.44.24.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.44.23...v1.44.24)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 16:10:43 +00:00
dependabot[bot] fd0ecdd4e0 chore(deps): bump boto3 from 1.35.15 to 1.35.16 (#3824)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.15 to 1.35.16.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.15...1.35.16)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-11 16:08:19 +00:00
Robert Brennan c6105f264f Improvements to file list UI (#3794)
* move filematching logic into server

* wait until ready before returning

* show loading message instead of empty

* logspam

* delint

* fix type

* add a few more default ignores
2024-09-11 09:44:37 -04:00
Engel Nyst 93f271579c Revert "chore(deps): bump numpy from 1.26.4 to 2.1.1 (#3713)" (#3816)
This reverts commit 5100d12cea.
2024-09-11 12:29:32 +02:00
Boxuan Li 1976763152 Fix wrong mention of Devin in localLLMs.md (#3814) 2024-09-11 07:49:33 +02:00
Engel Nyst 516ca701d4 Update local-llms.md (#3811) 2024-09-11 01:08:20 +02:00
mamoodi f3b2085f9b Reduce runtime tests duration by running them across CPUs (#3779)
* Reduce runtime tests duration by running them across CPUs

* fix hardcoded image name

* test two cpus

* Test folder change

* Up the CPU to 4 again to test

* Change to 3 CPUs

* Down to 2

* Add param to remove all openhands containers

* Add comment

* Add reruns just in case

* Fix ordering of if
2024-09-10 14:31:17 -04:00
Cole Murray 97a03faf33 Add Handling of Cache Prompt When Formatting Messages (#3773)
* Add Handling of Cache Prompt When Formatting Messages

* Fix Value for Cache Control

* Fix Value for Cache Control

* Update openhands/core/message.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Fix lint error

* Serialize Messages if Propt Caching Is Enabled

* Remove formatting message change

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
2024-09-10 16:34:41 +00:00
dependabot[bot] 06ed142191 chore(deps): bump json-repair from 0.29.1 to 0.29.2 (#3800)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.29.1 to 0.29.2.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/v0.29.1...v0.29.2)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 16:18:36 +00:00
dependabot[bot] dbd0786345 chore(deps-dev): bump pytest from 8.3.2 to 8.3.3 (#3801)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.3.2 to 8.3.3.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/8.3.2...8.3.3)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 16:17:30 +00:00
dependabot[bot] 3bc9a485b8 chore(deps-dev): bump openai from 1.44.0 to 1.44.1 (#3802)
Bumps [openai](https://github.com/openai/openai-python) from 1.44.0 to 1.44.1.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.44.0...v1.44.1)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 16:16:32 +00:00
dependabot[bot] 2b845d9568 chore(deps): bump i18next from 23.15.0 to 23.15.1 in /frontend (#3805)
Bumps [i18next](https://github.com/i18next/i18next) from 23.15.0 to 23.15.1.
- [Release notes](https://github.com/i18next/i18next/releases)
- [Changelog](https://github.com/i18next/i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next/compare/v23.15.0...v23.15.1)

---
updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 23:57:59 +08:00
dependabot[bot] d7e0db0b35 chore(deps-dev): bump typescript from 5.5.4 to 5.6.2 in /frontend (#3804)
Bumps [typescript](https://github.com/microsoft/TypeScript) from 5.5.4 to 5.6.2.
- [Release notes](https://github.com/microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release.yml)
- [Commits](https://github.com/microsoft/TypeScript/compare/v5.5.4...v5.6.2)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 23:57:49 +08:00
dependabot[bot] 508681691a chore(deps-dev): bump typescript from 5.5.4 to 5.6.2 in /docs (#3806)
Bumps [typescript](https://github.com/microsoft/TypeScript) from 5.5.4 to 5.6.2.
- [Release notes](https://github.com/microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release.yml)
- [Commits](https://github.com/microsoft/TypeScript/compare/v5.5.4...v5.6.2)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 23:57:41 +08:00
tobitege 1c9b4ad78a fix regression in fork artifact upload in ghcr_runtime (#3807) 2024-09-10 17:36:36 +02:00
dependabot[bot] ecb1b9b2a0 chore(deps): bump boto3 from 1.35.14 to 1.35.15 (#3798)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.14 to 1.35.15.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.14...1.35.15)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 17:29:56 +02:00
dependabot[bot] e4f8708656 chore(deps): bump litellm from 1.44.22 to 1.44.23 (#3799)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.44.22 to 1.44.23.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.44.22...v1.44.23)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 17:25:41 +02:00
dependabot[bot] 822de89394 chore(deps): bump browsergym from 0.3.4 to 0.4.3 (#3762)
* chore(deps): bump browsergym from 0.3.4 to 0.4.3

Bumps browsergym from 0.3.4 to 0.4.3.

---
updated-dependencies:
- dependency-name: browsergym
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* integration tests updated to browsergym

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-09-10 17:24:40 +02:00
Robert Brennan c5e89be6de fix runtime tags (#3790) 2024-09-09 20:03:48 +00:00
dependabot[bot] 386688da5f chore(deps): bump fastapi from 0.113.0 to 0.114.0 (#3783)
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.113.0 to 0.114.0.
- [Release notes](https://github.com/fastapi/fastapi/releases)
- [Commits](https://github.com/fastapi/fastapi/compare/0.113.0...0.114.0)

---
updated-dependencies:
- dependency-name: fastapi
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-09 15:23:34 -04:00
dependabot[bot] 6c01d25976 chore(deps): bump boto3 from 1.35.13 to 1.35.14 (#3784)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.13 to 1.35.14.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.13...1.35.14)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-09 15:23:26 -04:00
dependabot[bot] 05e2b0c352 chore(deps-dev): bump openai from 1.43.1 to 1.44.0 (#3787)
Bumps [openai](https://github.com/openai/openai-python) from 1.43.1 to 1.44.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.43.1...v1.44.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-09 15:23:16 -04:00
dependabot[bot] 5066468c36 chore(deps-dev): bump build from 1.2.1 to 1.2.2 (#3785)
Bumps [build](https://github.com/pypa/build) from 1.2.1 to 1.2.2.
- [Release notes](https://github.com/pypa/build/releases)
- [Changelog](https://github.com/pypa/build/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pypa/build/compare/1.2.1...1.2.2)

---
updated-dependencies:
- dependency-name: build
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-09 15:23:04 -04:00
dependabot[bot] 9349009074 chore(deps): bump litellm from 1.44.19 to 1.44.22 (#3786)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.44.19 to 1.44.22.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.44.19...v1.44.22)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-09 15:22:52 -04:00
tobitege 5ffff742de Regression fixes: LLM logging; client readiness (EventStreamRuntime) (#3776)
* Regression fixes: LLM logging; client readiness (EventStreamRuntime)

* fix llm.async_completion_wrapper bad edit in previous commit

* regen couple of mock files

* client: always log initialized status
2024-09-09 21:02:43 +02:00
dependabot[bot] 50dc17c65c chore(deps): bump i18next from 23.14.0 to 23.15.0 in /frontend (#3788)
Bumps [i18next](https://github.com/i18next/i18next) from 23.14.0 to 23.15.0.
- [Release notes](https://github.com/i18next/i18next/releases)
- [Changelog](https://github.com/i18next/i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next/compare/v23.14.0...v23.15.0)

---
updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-09 19:31:27 +03:00
Robert Brennan f4d6b3262d fix runtime image (#3782) 2024-09-09 09:55:48 -04:00
Robert Brennan cb3168da12 Redesign settings modal (#3751)
* add advanced options switch

* remove logic for custom models

* remove extra span

* consolidate row

* no more toast

* remove default subtitle

* fix title tags

* remove getSettingsDifference

* delint

* delint

* fix select buttons

* delete test

* fix test

* fix test

* fix labels

* remove model-id tests

* fix labels

* fix labels

* rename var

* fix test

* fix more tests

* change some mocks

* more fixes

* one more fix

* remove some mocks

* rename things

* one more test

* ahhhhh

* change required settings

* fix test

* delint

* fix build
2024-09-09 09:02:58 -04:00
tobitege 2b7517e542 (enh) add caching@v4 action in workflows (#3780)
* dummy test change

* regen yml: 1st install python 3.11, then poetry

* fix caching for poetry; old entry for python was rather useless

* fix steps order (cache before poetry)

* add poetry caching to ghcr_runtime; fix fork conditions

* ghcr_runtime: more caching actions; condition fixes

* fix interim action error (order of steps)

* cache@v4 instead of v3

* fixed interim typo for 2 fork conditions

* runtime/test_env_vars: compacted multiple tests into one to reduce time

* ugh if fork condition changes again
2024-09-09 10:49:49 +02:00
Cole Murray dadada18ce Add Anthropic Models to Cache Prompt (#3775)
* Add Anthropic Models to Cache Prompt

* Update Cache Prompt Active Check for Partial String Matching
2024-09-08 22:09:14 +00:00
Robert Brennan ab3851593d Support interactive commands (#3653)
* hacky solution for interactive commands

* add more behavior

* debug

* fix continue functionality

* remove prints

* refactor a bit

* reduce test sleep

* fix python version

* fix pre-commit issue

* Regenerate integration tests

* Update openhands/runtime/client/client.py

* revert some prompt stuff

* several integration mock files regenerated

* execute_action: remove duplicate exception logging

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
2024-09-08 21:45:51 +02:00
dependabot[bot] 5100d12cea chore(deps): bump numpy from 1.26.4 to 2.1.1 (#3713)
Bumps [numpy](https://github.com/numpy/numpy) from 1.26.4 to 2.1.1.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/RELEASE_WALKTHROUGH.rst)
- [Commits](https://github.com/numpy/numpy/compare/v1.26.4...v2.1.1)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 15:52:27 -04:00
mamoodi a9cf7e6ee6 docs: fix cli and headless commands (#3768) 2024-09-06 15:52:11 -04:00
sp.wack 8392a3fb6b Remove unwanted (large) keys when sharing feedback (#3766) 2024-09-06 22:45:18 +03:00
dependabot[bot] c376b81505 chore(deps): bump fastapi from 0.112.3 to 0.113.0 (#3760)
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.112.3 to 0.113.0.
- [Release notes](https://github.com/fastapi/fastapi/releases)
- [Commits](https://github.com/fastapi/fastapi/compare/0.112.3...0.113.0)

---
updated-dependencies:
- dependency-name: fastapi
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 19:33:12 +02:00
dependabot[bot] 8440604dd1 chore(deps-dev): bump ruff from 0.6.3 to 0.6.4 (#3763)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.6.3 to 0.6.4.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.6.3...0.6.4)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 19:31:58 +02:00
dependabot[bot] a4d75cd190 chore(deps-dev): bump llama-index from 0.11.5 to 0.11.6 (#3761)
Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.5 to 0.11.6.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.5...v0.11.6)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 19:31:01 +02:00
dependabot[bot] 4db929b986 chore(deps): bump litellm from 1.44.17 to 1.44.19 (#3764)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.44.17 to 1.44.19.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.44.17...v1.44.19)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 19:30:20 +02:00
dependabot[bot] b6b38fcd37 chore(deps-dev): bump openai from 1.43.0 to 1.43.1 (#3765)
Bumps [openai](https://github.com/openai/openai-python) from 1.43.0 to 1.43.1.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.43.0...v1.43.1)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 19:29:02 +02:00
dependabot[bot] d619be96d1 chore(deps): bump boto3 from 1.35.12 to 1.35.13 (#3759)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.12 to 1.35.13.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.12...1.35.13)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 18:09:36 +02:00
dependabot[bot] 6f44ea0115 chore(deps): bump json-repair from 0.28.4 to 0.29.1 (#3745)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.28.4 to 0.29.1.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/v0.28.4...v0.29.1)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-06 18:08:17 +02:00
Robert Brennan 0f118df910 Docs: fix "architecture" and "llms" sections (#3749)
* fix llm docs

* fix architecture
2024-09-06 09:51:45 -04:00
Jiayi Pan 43c4a7fff4 Allow Generalized SWE-Bench format for evaluation (#3752)
* allow generalized swe-bench format

* Update run_infer.py

* fix linter

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-09-06 13:05:00 +00:00
tobitege 57187417b7 revert enabling litellm verbose mode from testing (#3750) 2024-09-05 20:12:04 +00:00
tobitege 03b5b03bb2 (enh) CodeActAgent: improve logging; sensible retry defaults in config (#3729)
* CodeActAgent: improve logging; sensible retry defaults for completion errors

* CodeActAgent: reduce completion error message sent to UI

* tweak values; docs+config template changes

* fix format_messages; log exception in codeactagent again
2024-09-05 18:14:15 +00:00
dependabot[bot] 681276f27c chore(deps): bump google-cloud-aiplatform from 1.64.0 to 1.65.0 (#3747)
Bumps [google-cloud-aiplatform](https://github.com/googleapis/python-aiplatform) from 1.64.0 to 1.65.0.
- [Release notes](https://github.com/googleapis/python-aiplatform/releases)
- [Changelog](https://github.com/googleapis/python-aiplatform/blob/main/CHANGELOG.md)
- [Commits](https://github.com/googleapis/python-aiplatform/compare/v1.64.0...v1.65.0)

---
updated-dependencies:
- dependency-name: google-cloud-aiplatform
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 13:31:15 -04:00
niliy01 82a154f7e7 (feat) making prompt caching optional instead of enabled default (#3689)
* (feat) making prompt caching optional instead of enabled default

At present, only the Claude models support prompt caching as a experimental feature, therefore, this feature should be implemented as an optional setting rather than being enabled by default.

Signed-off-by: Yi Lin <teroincn@gmail.com>

* handle the conflict

* fix unittest mock return value

* fix lint error in whitespace

---------

Signed-off-by: Yi Lin <teroincn@gmail.com>
2024-09-05 18:52:26 +02:00
dependabot[bot] 5b7ab28511 chore(deps): bump vite from 5.4.2 to 5.4.3 in /frontend (#3721)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.4.2 to 5.4.3.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.4.3/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 11:54:09 -04:00
dependabot[bot] ca3f39e918 chore(deps-dev): bump postcss from 8.4.44 to 8.4.45 in /frontend (#3722)
Bumps [postcss](https://github.com/postcss/postcss) from 8.4.44 to 8.4.45.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.44...8.4.45)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 11:53:50 -04:00
dependabot[bot] 6c2630e506 chore(deps-dev): bump @types/node from 22.5.2 to 22.5.4 in /frontend (#3740)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 22.5.2 to 22.5.4.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 11:53:38 -04:00
dependabot[bot] 260e41486e chore(deps-dev): bump llama-index from 0.11.4 to 0.11.5 (#3741)
Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.4 to 0.11.5.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.4...v0.11.5)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 11:53:28 -04:00
dependabot[bot] 9b0fb8f81a chore(deps): bump boto3 from 1.35.11 to 1.35.12 (#3742)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.11 to 1.35.12.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.11...1.35.12)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 11:53:22 -04:00
dependabot[bot] cd360ef6aa chore(deps): bump litellm from 1.44.15 to 1.44.17 (#3743)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.44.15 to 1.44.17.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.44.15...v1.44.17)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 11:53:15 -04:00
dependabot[bot] 5bb46525a4 chore(deps): bump fastapi from 0.112.2 to 0.112.3 (#3744)
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.112.2 to 0.112.3.
- [Release notes](https://github.com/fastapi/fastapi/releases)
- [Commits](https://github.com/fastapi/fastapi/compare/0.112.2...0.112.3)

---
updated-dependencies:
- dependency-name: fastapi
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 11:53:08 -04:00
dependabot[bot] f80e4a9e5d chore(deps): bump anthropic from 0.34.1 to 0.34.2 (#3746)
Bumps [anthropic](https://github.com/anthropics/anthropic-sdk-python) from 0.34.1 to 0.34.2.
- [Release notes](https://github.com/anthropics/anthropic-sdk-python/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-python/compare/v0.34.1...v0.34.2)

---
updated-dependencies:
- dependency-name: anthropic
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 11:52:59 -04:00
mamoodi 60c5fd41ec Update docs on LLM providers for consistency (#3738)
* Update docs on LLM providers for consistency

* Update headless command

* minor tweaks based on feedback

---------

Co-authored-by: Robert Brennan <contact@rbren.io>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-09-05 15:10:28 +00:00
Xingyao Wang 688068a44e Fix issues for running RemoteRuntime in parallel on SWE-Bench (#3716)
* feat: add SWE-bench fullset support

* fix instance image list

* update eval script and documentation

* increase timeout for remote runtime

* add push script

* handle the case when ret push is an generator

* update pbar

* set SWE-Bench default to run SWE-Bench lite

* add script to cleanup remote runtime

* fix the cases when tag is too long

* update README

* update readme for cleanup

* rename od to oh

* Update evaluation/swe_bench/README.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/swe_bench/README.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh

Co-authored-by: Graham Neubig <neubig@gmail.com>

* gets API key and Runtime from env var

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-09-05 10:34:31 +08:00
Robert Brennan ee158feb15 Documentation updates (#3733)
* update badges

* fix badges

* better badges

* move credits

* more badge work

* add gh logo

* update some copy

* update logo

* fix height

* update text

* emdash

* remove cruft

* move title

* update links

* add hr

* white logo

* move some stuff to getting-started

* revert logo

* more copy changes

* minor tweaks

* fix sidebar

* explicit sidebar

* words

* fix tag

* fix how-to

* more docs work

* update styles

* fix up custom sandbox docs

* change eval title

* fix up getting-started

* fix getting started

* update to 0.9.2

* update screenshot

* add company link

* fix dark mode

* minor fixes

* update image

* update headless and cli docs

* update readme

* fix links

* revert package

* rename links

* fix links

* fix link

* chagne to claude
2024-09-04 21:22:52 +00:00
dependabot[bot] ec25abd98b chore(deps): bump boto3 from 1.35.10 to 1.35.11 (#3724)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.10 to 1.35.11.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.10...1.35.11)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: tofarr <tofarr@gmail.com>
2024-09-04 12:49:11 -07:00
leo bb28dea51f Add documentation for CLI mode (#3706)
* Add documentation for CLI mode

Fixes #3703

Add documentation for CLI mode in OpenHands.

* **New Documentation**: Add `docs/modules/usage/how-to/cli-mode.md` to document CLI mode.
  - Include instructions on starting an interactive OpenHands session via the command line.
  - Explain the difference between CLI mode and headless mode.
  - Provide examples of CLI commands and expected outputs.
* **Update Existing Documentation**: Modify `docs/modules/usage/how-to/headless-mode.md`.
  - Clarify the difference between headless mode and CLI mode.
  - Add a reference to the new CLI mode documentation.

* Update cli-mode.md

* Update headless-mode.md

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: tofarr <tofarr@gmail.com>
2024-09-04 15:38:57 -04:00
Robert Brennan 2a99aa6679 Cosmetic README updates (#3728)
* update badges

* fix badges

* better badges

* move credits

* more badge work

* add gh logo

* update some copy

* update logo

* fix height

* update text

* emdash

* remove cruft

* move title

* update links

* add hr

* white logo

* move some stuff to getting-started

* revert logo

* more copy changes

* minor tweaks

* words

* fix getting started

* update to 0.9.2

* update screenshot

* minor fixes
2024-09-04 19:38:28 +00:00
mamoodi add4653335 Release 0.9.2 (#3727) 2024-09-04 14:31:54 -04:00
Robert Brennan 2557c18fb1 specify runtime image (#3726) 2024-09-04 12:03:26 -04:00
tobitege bc31fb15fe (fix) CodeActAgent: fix issues with vision support in prompts (#3665)
* CodeActAgent: fix message prep if prompt caching is not supported

* fix python version in regen tests workflow

* fix in conftest "mock_completion" method

* add disable_vision to LLMConfig; revert change in message parsing in llm.py

* format messages in several files for completion

* refactored message(s) formatting (llm.py); added vision_is_active()

* fix a unit test

* regenerate: added LOG_TO_FILE and FORCE_REGENERATE env flags

* try to fix path to logs folder in workflow

* llm: prevent index error

* try FORCE_USE_LLM in regenerate

* tweaks everywhere...

* fix 2 random unit test errors :(

* added FORCE_REGENERATE_TESTS=true to regenerate CLI

* fix test_lint_file_fail_typescript again

* double-quotes for env vars in workflow; llm logger set to debug

* fix typo in regenerate

* regenerate iterations now 20; applied iteration counter fix by Li

* regenerate: pass FORCE_REGENERATE flag into env

* fixes for int tests. several mock files updated.

* browsing_agent: fix response_parser.py adding ) to empty response

* test_browse_internet: fix skipif and revert obsolete mock files

* regenerate: fi bracketing for http server start/kill conditions

* disable test_browse_internet for CodeAct*Agents; mock files updated after merge

* missed to include more mock files earlier

* reverts after review feedback from Li

* forgot one

* browsing agent test, partial fixes and updated mock files

* test_browse_internet works in my WSL now!

* adapt unit test test_prompt_caching.py

* add DEBUG to regenerate workflow command

* convert regenerate workflow params to inputs

* more integration test mock files updated

* more files

* test_prompt_caching: restored test_prompt_caching_headers purpose

* file_ops: fix potential exception, like "cross device copy"; fixed mock files accordingly

* reverts/changes wrt feedback from xingyao

* updated docs and config template

* code cleanup wrt review feedback
2024-09-04 17:58:30 +02:00
mamoodi 1b66f2e777 Revert "Create a reusable workflow for building, testing and publishing runti…" (#3725)
This reverts commit d1a741792f.
2024-09-04 11:47:01 -04:00
mamoodi d1a741792f Create a reusable workflow for building, testing and publishing runtime images and use it (#3717)
* Test resuable workflow

* fix path for reusable workflow

* Fix workflow typo

* fix reusable workflow

* input typo

* Add secrets to reusable workflow

* Make token required

* Fix secret indentation

* Fix image with tag
2024-09-04 08:58:59 -04:00
Shubham raj 2bc3e8d584 Fix: llm completion exception breaks CodeActAgent (#3678)
* Catch exception and return finish action with an exception message in case of exception in llm completion

* Remove exception logs

* Raise llm response error for any exception in llm completion

* Raise LLMResponseError from async completion and async streaming completion as well
2024-09-04 05:51:49 +02:00
sp.wack 0bb0903a22 Fix anthropic providers pointing to the correct model ID (#3715) 2024-09-03 18:17:07 +00:00
dependabot[bot] dd3a701b93 chore(deps): bump litellm from 1.44.14 to 1.44.15 (#3712)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.44.14 to 1.44.15.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/commits)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 10:57:29 -07:00
dependabot[bot] d4e6ea5e49 chore(deps-dev): bump llama-index from 0.11.2 to 0.11.4 (#3711)
Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.2 to 0.11.4.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.2...v0.11.4)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-09-03 10:57:12 -07:00
mamoodi 510a82a039 Verified list of LLMs working with OpenHands and some LLM doc changes (#3714)
* Verified list of LLMs working with OpenHands and some LLM doc changes

* Update list of verified

* Add warning for ollama guide
2024-09-03 13:14:38 -04:00
Xingyao Wang d8a87d7ccb [Eval] Make SWE-Bench run_infer.sh to default to run SWE-Bench Lite (#3704)
* feat: add SWE-bench fullset support

* fix instance image list

* update eval script and documentation

* increase timeout for remote runtime

* add push script

* handle the case when ret push is an generator

* update pbar

* set SWE-Bench default to run SWE-Bench lite
2024-09-04 00:58:14 +08:00
tofarr ff64085042 doc updates for headless / cli mode (#3707)
* Documentation update for headless / cli mode

* WIP

* WIP

* WIP

* Fix link
2024-09-03 10:56:21 -04:00
mamoodi 31a2dbb372 Indicate browser as experimental so user is aware it may not be fully stable (#3676)
* Indicate browser as experimental so user knows it may not be fully stable

* Add translation of change for all languages
2024-09-02 22:25:22 -04:00
sp.wack 7df3ca4ac8 Fix (#3694) 2024-09-02 22:23:57 -04:00
dependabot[bot] 0a9369df5e chore(deps): bump litellm from 1.44.11 to 1.44.14 (#3698) 2024-09-03 02:10:51 +00:00
dependabot[bot] 327393670a chore(deps): bump boto3 from 1.35.9 to 1.35.10 (#3696) 2024-09-03 09:17:00 +08:00
Xingyao Wang d283420ac2 feat: add SWE-bench fullset support (#3477)
* feat: add SWE-bench fullset support

* fix instance image list

* update eval script and documentation

* add push script

* handle the case when ret push is an generator

* update pbar
2024-09-02 20:28:52 -04:00
dependabot[bot] 57ad0583b7 chore(deps-dev): bump lint-staged from 15.2.9 to 15.2.10 in /frontend (#3700)
Bumps [lint-staged](https://github.com/lint-staged/lint-staged) from 15.2.9 to 15.2.10.
- [Release notes](https://github.com/lint-staged/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/lint-staged/lint-staged/compare/v15.2.9...v15.2.10)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 16:38:38 +00:00
dependabot[bot] 2f359b4f29 chore(deps-dev): bump @types/node from 22.5.1 to 22.5.2 in /frontend (#3702)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 22.5.1 to 22.5.2.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 15:56:45 +00:00
dependabot[bot] 5f3ee286bf chore(deps-dev): bump postcss from 8.4.41 to 8.4.44 in /frontend (#3701)
Bumps [postcss](https://github.com/postcss/postcss) from 8.4.41 to 8.4.44.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.41...8.4.44)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 15:56:03 +00:00
Mislav Balunovic f979d612ec (fix) confirmation mode bugfix for the EventStreamRuntime (#3695) 2024-09-02 13:27:33 +00:00
Boxuan Li 75d5591816 file_ops: Use tmp file for original linting (#3681)
Fix a potential issue that might lead to file corruption when edit linting is enabled

#3124 introduces a feature for editing: running linter twice before and after the change and only extract new errors introduced by the agent. This has some potential issues and I am working on #3649 to address them, but I feel like I am not gonna finish it in the next few days, and that PR has become harder and harder to review, thus this PR, which only focuses on a small improvement.

So what's the issue? When we run linters on the original file before our edits, we need to copy the original file and use a temporary file to lint, because linting may have side-effect (e.g. modifying the file in-place). I used the word "may" because:

Flake8 has no side-effect, so not a problem as of now.
We don't enforce this or document this "no side-effect" as a requirement for linter implementation, so side-effect is allowed.
Regardless, the "after-edit-linting" uses the same approach: backup the file before linting to avoid data corruption. We should keep our "before-edit-linting" consistent.

Why no new unittest that reproduces the issue? Well, as I have mentioned earlier, flake8 has no side-effect, so technically it's not a bug but a flaw. Therefore, there's no way to write a test that reproduces the issue.
2024-09-01 23:36:57 -07:00
tobitege 7068a73ae7 (enh) Improve CodeActAgent's file editing reliability (#3610)
* improve file editing prompts and unit test
converted most raise calls to a _output_error call in file_ops.py

* tweaks in test_agent_skill.py wrt to SEP separator

* tweaked the separator

* remove server runtime remnants and TEST_RUNTIME references

* restore use of TEST_RUNTIME args and variables

* fix integration tests

* added hint to properly escape docstrings

* revert latest prompt change

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-09-02 06:03:22 +02:00
Yufan Song 15a32e973e Docs: add logo to repo (#3691)
* add logo to repo

* change position

* change position

* change position
2024-09-02 02:24:35 +00:00
mamoodi 6fcc4ca052 fix eval README link (#3692) 2024-09-02 09:29:42 +08:00
tobitege c83fab8a00 llm: add NotFoundError to completion exception handling (#3668)
Co-authored-by: tofarr <tofarr@gmail.com>
2024-09-01 07:47:21 +00:00
Boxuan Li 1e2796e168 Fix step count out-of-sync bug when child agent fails (#3680) 2024-09-01 09:36:51 +02:00
dependabot[bot] e5bff4bca8 chore(deps-dev): bump llama-index-embeddings-azure-openai (#3669)
Bumps llama-index-embeddings-azure-openai from 0.2.4 to 0.2.5.

---
updated-dependencies:
- dependency-name: llama-index-embeddings-azure-openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-30 16:39:01 -07:00
dependabot[bot] 72a705f9a5 chore(deps-dev): bump ruff from 0.6.2 to 0.6.3 (#3674) 2024-08-30 20:28:04 +00:00
dependabot[bot] 6b8adde7bc chore(deps): bump litellm from 1.44.9 to 1.44.11 (#3671) 2024-08-30 20:18:16 +00:00
dependabot[bot] a130367ec7 chore(deps): bump boto3 from 1.35.7 to 1.35.9 (#3670) 2024-08-31 00:11:35 +08:00
dependabot[bot] 52cc0977ca chore(deps-dev): bump @types/react from 18.3.4 to 18.3.5 in /frontend (#3672) 2024-08-31 00:11:21 +08:00
dependabot[bot] 1955fba2ca chore(deps-dev): bump openai from 1.42.0 to 1.43.0 (#3673) 2024-08-31 00:11:03 +08:00
dependabot[bot] 46dae57cf2 chore(deps): bump prism-react-renderer from 2.3.1 to 2.4.0 in /docs (#3675) 2024-08-31 00:10:47 +08:00
tobitege dbb671a8a5 logname fix; improve test calling instruction (#3666) 2024-08-30 17:15:31 +02:00
Graham Neubig 99139e9c72 Release to pypi (#3667)
* Release to pypi

* Change to match our actual versioning
2024-08-30 15:08:25 +00:00
dependabot[bot] c0a5cc0ba8 chore(deps-dev): bump @testing-library/react in /frontend (#3652)
Bumps [@testing-library/react](https://github.com/testing-library/react-testing-library) from 16.0.0 to 16.0.1.
- [Release notes](https://github.com/testing-library/react-testing-library/releases)
- [Changelog](https://github.com/testing-library/react-testing-library/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/react-testing-library/compare/v16.0.0...v16.0.1)

---
updated-dependencies:
- dependency-name: "@testing-library/react"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-30 10:42:51 -04:00
dependabot[bot] 5956349a94 chore(deps): bump litellm from 1.44.7 to 1.44.9 (#3654)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.44.7 to 1.44.9.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.44.7...v1.44.9)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-30 22:01:51 +08:00
niliy01 89e1c4f29c feat: add more embed models that Ollama supports recently (#3641)
Signed-off-by: Yi Lin <teroincn@gmail.com>
2024-08-30 07:58:01 -04:00
tobitege 1ef83a8554 fix test_is_stuck.py to not fail on macos (#3662) 2024-08-30 05:06:30 +00:00
Xingyao Wang 090c911a50 (refactor) Make Runtime class synchronous (#3661)
* change runtime to be synchronous

* fix test runtime with the new interface

* fix arg

* fix eval

* fix missing config attribute

* fix plugins

* fix on_event by revert it back to async

* update upload_file endpoint

* fix argument to upload file

* remove unncessary async for eval;
fix evaluation run in parallel

* use asyncio to run controller for eval

* revert file upload

* truncate eval test result output
2024-08-30 01:37:03 +00:00
Robert Brennan b0e52f121c fix build for fork (#3660) 2024-08-29 21:49:38 +00:00
Robert Brennan ece8fef739 implement GitHub action for integration regeneration (#3659)
* modify readme

* Regenerate integration tests

* actually run the integration tests

* fix python version

* add diff check

* remove colima

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-08-29 21:46:41 +00:00
Robert Brennan 09cfcfbee9 add (placeholder) workflow for regenerating the integration tests (#3657)
* add workflow

* add todo
2024-08-29 20:12:11 +00:00
mamoodi 37e6fa442e Release 0.9.1 (#3656) 2024-08-29 13:55:29 -04:00
Xingyao Wang 8b1f207d39 feat: support remote runtime (#3406)
* feat: refactor building logic into runtime builder

* return image name

* fix testcases

* use runtime builder for eventstream runtime

* have runtime builder return str

* add api_key to sandbox config

* draft remote runtime

* remove extra if clause

* initialize runtime based on box class

* add build logic

* use base64 for file upload

* get runtime image prefix from API

* replace ___ with _s_ to make it a valid image name

* use /build to start build and /build_status to check the build progress

* update logging

* fix exit code

* always use port

* add remote runtime

* rename runtime

* fix tests import

* make dir first if work_dir does not exists;

* update debug print to remote runtime

* fix exit close_sync

* update logging

* add retry for stop

* use all box class for test keep prompt

* fix test browsing

* add retry stop

* merge init commands to save startup time

* fix await

* remove sandbox url

* support execute through specific runtime url

* fix file ops

* simplify close

* factor out runtime retry code

* fix exception handling

* fix content type error (e.g., bad gateway when runtime is not ready)

* add retry for wait until alive;
add retry for check image exists

* Revert "add retry for wait until alive;"

This reverts commit dd013cd268.

* retry when wait until alive

* clean up msg

* directly save sdist to temp dir for _put_source_code_to_dir

* support running testcases in parallel

* tweak logging;
try to close session

* try to close session even on exception

* update poetry lock

* support remote to run integration tests

* add warning for workspace base on remote runtime

* set default runtime api

* remove server runtime

* update poetry lock

* support running swe-bench (n=1) eval on remoteruntime

* add a timeout of 30 min

* add todo for docker namespace

* update poetry loc
2024-08-29 15:53:37 +00:00
Shubham raj 296fa8182a Mock config env variables (#3621) 2024-08-29 15:48:23 +00:00
tobitege a2d94c9cb1 (enh) StuckDetector: fix+enhance syntax error loop detection (#3628)
* fix StuckDetector and add more errors for detection

* more stringent error detection and more unit tests
2024-08-29 17:33:54 +02:00
tobitege ae153aa8ab (enh) review of logger.py; less logging in AgentController (#3648)
* revised logger.py; agent_controller: less debug logging (every second)

* agent_controller._step: removed logging upon _pending_action
2024-08-29 16:07:38 +02:00
tobitege fd0fad7362 improve github.md with more API commands with less code duplication (#3651) 2024-08-29 16:02:14 +02:00
tobitege 8fca5a5354 linter and test_aider_linter extensions for eslint (#3543)
* linter and test_aider_linter extensions for eslint

* linter tweaks

* try enabling verbose output in linter test

* one more option for linter test

* try conftest.py for tests/unit folder

* enable verbose mode in workflow; remove conftest.py again

* debug print statements of linter results

* skip some tests if eslint is not installed at all

* more tweaks

* final test skip setups

* code quality revisions

* fix test again

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-29 10:40:43 +02:00
tobitege c875a5fb77 (feat) Add Aider bench output visualizer (#3643)
* aider-bench: add visualization to summarize script and readme

* added example cost and actions histogram images for readme

* moved dependencies to evaluation section
2024-08-29 05:03:44 +00:00
niliy01 717929b5d4 docs: delete "ssh_box and image_agnostic_util references" in zh-hans doc (#3647)
Signed-off-by: Yi Lin <teroincn@gmail.com>
2024-08-29 04:36:35 +00:00
dependabot[bot] 216d624705 chore(deps-dev): bump llama-index from 0.11.1 to 0.11.2 (#3637)
Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.1 to 0.11.2.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.1...v0.11.2)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-29 10:33:49 +08:00
mamoodi 813b5a2c62 Added stale exemption for tracked issues (#3638) 2024-08-28 20:27:50 -04:00
dependabot[bot] 0a41e6c0cb chore(deps): bump zope-interface from 7.0.2 to 7.0.3 (#3630)
Bumps [zope-interface](https://github.com/zopefoundation/zope.interface) from 7.0.2 to 7.0.3.
- [Changelog](https://github.com/zopefoundation/zope.interface/blob/master/CHANGES.rst)
- [Commits](https://github.com/zopefoundation/zope.interface/compare/7.0.2...7.0.3)

---
updated-dependencies:
- dependency-name: zope-interface
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-28 15:59:14 -07:00
dependabot[bot] 70c638a885 chore(deps-dev): bump streamlit from 1.37.1 to 1.38.0 (#3631)
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.37.1 to 1.38.0.
- [Release notes](https://github.com/streamlit/streamlit/releases)
- [Commits](https://github.com/streamlit/streamlit/compare/1.37.1...1.38.0)

---
updated-dependencies:
- dependency-name: streamlit
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-28 15:58:59 -07:00
dependabot[bot] da19b2c297 chore(deps): bump boto3 from 1.35.5 to 1.35.7 (#3632)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.5 to 1.35.7.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.5...1.35.7)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: tofarr <tofarr@gmail.com>
2024-08-28 15:58:45 -07:00
dependabot[bot] 6284f3c63a chore(deps-dev): bump notebook from 7.2.1 to 7.2.2 (#3633)
Bumps [notebook](https://github.com/jupyter/notebook) from 7.2.1 to 7.2.2.
- [Release notes](https://github.com/jupyter/notebook/releases)
- [Changelog](https://github.com/jupyter/notebook/blob/@jupyter-notebook/tree@7.2.2/CHANGELOG.md)
- [Commits](https://github.com/jupyter/notebook/compare/@jupyter-notebook/tree@7.2.1...@jupyter-notebook/tree@7.2.2)

---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-28 15:58:34 -07:00
dependabot[bot] 20efbc1bfb chore(deps): bump json-repair from 0.28.3 to 0.28.4 (#3634)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.28.3 to 0.28.4.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/v0.28.3...v0.28.4)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-28 15:58:19 -07:00
dependabot[bot] 0b8ae460ac chore(deps): bump google-cloud-aiplatform from 1.63.0 to 1.64.0 (#3635)
Bumps [google-cloud-aiplatform](https://github.com/googleapis/python-aiplatform) from 1.63.0 to 1.64.0.
- [Release notes](https://github.com/googleapis/python-aiplatform/releases)
- [Changelog](https://github.com/googleapis/python-aiplatform/blob/main/CHANGELOG.md)
- [Commits](https://github.com/googleapis/python-aiplatform/compare/v1.63.0...v1.64.0)

---
updated-dependencies:
- dependency-name: google-cloud-aiplatform
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-28 15:57:58 -07:00
tobitege daeff3dfaf startup handling and logging of docker images tweaked (#3645) 2024-08-28 22:17:58 +00:00
dependabot[bot] 978951ef88 chore(deps-dev): bump @types/node from 22.5.0 to 22.5.1 in /frontend (#3640)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 22.5.0 to 22.5.1.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-28 21:09:28 +00:00
dependabot[bot] 94d1239155 chore(deps-dev): bump @tailwindcss/typography in /frontend (#3639)
Bumps [@tailwindcss/typography](https://github.com/tailwindlabs/tailwindcss-typography) from 0.5.14 to 0.5.15.
- [Release notes](https://github.com/tailwindlabs/tailwindcss-typography/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss-typography/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss-typography/compare/v0.5.14...v0.5.15)

---
updated-dependencies:
- dependency-name: "@tailwindcss/typography"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-28 22:49:34 +03:00
Graham Neubig c6ba0e8339 Remove singleton config (#3614)
* Remove singleton config

* Fix tests

* Fix logging reset

* Fix pre-commit
2024-08-28 20:05:49 +01:00
tobitege f8c4d1df45 (test) Fix test_agent_controller.py mock exceptions (#3577)
* fix test_agent_controller.py mock exceptions

* revert change to agent_controller.py
2024-08-28 19:05:22 +02:00
tobitege 9c39f07430 (enh) Aider-Bench: make resumable with skip_num arg (#3626)
* added optional START_ID env flag to resume from that instance id

* prepare_dataset: fix comparisons by using instance id's as int

* aider bench complete_runtime: close runtime to close container

* added matrix display of instance id for logging

* fix typo in summarize_results.py saying summarise_results

* changed start_id to skip_num to skip rows from dataset (start_id wasn't supportable)

* doc changes about huggingface spaces to temporarily point back to OD
2024-08-28 15:42:01 +00:00
Xingyao Wang 4ed45c7c9c Update README.md (#3629) 2024-08-28 15:36:16 +00:00
Xingyao Wang d9a8b53bc2 feat: specialize CodeAct into micro agents by providing markdown files (#3511)
* update microagent name and update template.toml

* substitute actual micro_agent_name for prompt manager

* add python-frontmatter

* support micro agent in codeact

* add test cases

* add instruction from require env var

* add draft gh micro agent

* update poetry lock

* update poetry lock
2024-08-28 14:58:16 +00:00
dependabot[bot] 653bc4ef6d chore(deps): bump fastapi from 0.112.1 to 0.112.2 (#3598)
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.112.1 to 0.112.2.
- [Release notes](https://github.com/fastapi/fastapi/releases)
- [Commits](https://github.com/fastapi/fastapi/compare/0.112.1...0.112.2)

---
updated-dependencies:
- dependency-name: fastapi
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-28 10:05:37 +08:00
Xingyao Wang 98081b9b1b (eval) EOF fixes for SWE-Bench evaluation (#3623)
* add error handling for client eof

* remove root check

* remove set -e

* echo USER to fix for swebench infer

* fix entry timeout

* add timeout;
fix runtime close
2024-08-27 21:09:31 +00:00
tobitege 0b8779447a New README for OpenHands/openhands/runtime folder (#3576)
* new OpenHands/openhands/runtime/README.md - made by OpenHands

* move parts to server readme; fix OD runtime in docs
2024-08-27 21:04:50 +00:00
tobitege 097fbd6362 (fix) Enable and log if logging to file is enabled (#3556)
* enable logging to file also when DEBUG is active

* Log a message if logging to file is enabled

* log a message if DEBUG mode is enabled
2024-08-27 22:36:33 +02:00
Raj Maheshwari 0cdeb83b17 Enabling of unittests in aider benchmark should be optional. (#3620) 2024-08-27 17:25:55 +00:00
dependabot[bot] 292148826e chore(deps-dev): bump jupyterlab from 4.2.4 to 4.2.5 (#3616) 2024-08-28 00:39:19 +08:00
dependabot[bot] 045c8367b7 chore(deps): bump zope-interface from 7.0.1 to 7.0.2 (#3617) 2024-08-28 00:38:49 +08:00
dependabot[bot] 0b391e09b5 chore(deps): bump litellm from 1.44.5 to 1.44.7 (#3618) 2024-08-28 00:38:37 +08:00
tobitege 1fddc77247 (feat) runtime: in _wait_until_alive upon start wait for client to have initialized too (#3612)
* runtime: in _wait_until_alive wait initially for client to initialize

* fix typo in runtime log entry
2024-08-27 17:11:32 +02:00
mamoodi 212a78c703 docs: More consistent documentation (#3608) 2024-08-27 16:04:13 +02:00
Raj Maheshwari 789f15a5db Allow the Agent to run uniittests for verification. (#3609)
* Allow the Agent to run uniittests for verification.

* minor bugfix - removed artifact
2024-08-27 06:22:01 +00:00
dependabot[bot] a1bdbd0aaf chore(deps-dev): bump jsdom from 24.1.1 to 25.0.0 in /frontend (#3589) 2024-08-27 10:00:36 +08:00
dependabot[bot] 0f5561509d chore(deps-dev): bump @testing-library/jest-dom in /frontend (#3590) 2024-08-27 10:00:14 +08:00
dependabot[bot] e96649decd chore(deps): bump jose from 5.7.0 to 5.8.0 in /frontend (#3588) 2024-08-27 09:58:55 +08:00
dependabot[bot] 47afd1b141 chore(deps): bump boto3 from 1.35.4 to 1.35.5 (#3595) 2024-08-27 09:58:38 +08:00
dependabot[bot] 809a6632b1 chore(deps-dev): bump mypy from 1.11.1 to 1.11.2 (#3596) 2024-08-27 09:58:10 +08:00
dependabot[bot] 7a856c9e68 chore(deps): bump litellm from 1.44.4 to 1.44.5 (#3599) 2024-08-27 09:57:38 +08:00
dependabot[bot] 8344be3996 chore(deps-dev): bump llama-index-embeddings-huggingface (#3600) 2024-08-27 09:57:04 +08:00
Kaushik Deka 5bb931e4d6 Add prompt caching (Sonnet, Haiku only) (#3411)
* Add prompt caching

* remove anthropic-version from extra_headers

* change supports_prompt_caching method to attribute

* change caching strat and log cache statistics

* add reminder as a new message to fix caching

* fix unit test

* append reminder to the end of the last message content

* move token logs to post completion function

* fix unit test failure

* fix reminder and prompt caching

* unit tests for prompt caching

* add test

* clean up tests

* separate reminder, use latest two messages

* fix tests

---------

Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-26 20:46:44 -04:00
Raj Maheshwari e72dc96d13 [Fix] Stop API key from leaking in evaluation outputs. (#3603)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-26 23:38:37 +02:00
Robert Brennan 3a1c547c8c [WIP] Fix docker push issues (#3585)
* empty commit

* fix workflows for forks

* fix runtime tests

* fix repo name

* fix artifacts

* add download step
2024-08-26 19:36:28 +00:00
mamoodi 8f6410603c Change PR template to enforce a simple description for CHANGELOG (#3602) 2024-08-26 18:30:58 +00:00
tobitege 8fcf0817d4 (eval) Aider_bench: add eval_ids arg to run specific instance id's (#3592)
* add eval_ids arg to run specific instance id's; fix/extend README

* fix description in parser for --eval-ids

* fix test_arg_parser.py to account for added arg

* fix typo in README to say "summarize" instead of "summarise" for script
2024-08-27 00:49:26 +08:00
tofarr 8c4c3b18b5 Feat google cloud storage (#3574)
* Google cloud storage implementation
* Unit test refactor
2024-08-26 08:16:49 -06:00
tobitege f1882ba886 (test) Fix regressions in tests (#3579)
* fix conftest.py option (#3573)

* try to fix fixture base_container_image in runtime conftest

* fix integration test mock files due to #3548

* fix test_ipython.py integration test
2024-08-26 13:14:37 +02:00
sp.wack eaf8e5c8a7 feat: parse new terminal output (#3582) 2024-08-26 13:09:43 +03:00
dependabot[bot] c444aa801a chore(deps): bump boto3 from 1.35.3 to 1.35.4 (#3559) 2024-08-26 15:58:39 +08:00
tofarr 6ce77e157b Fix pypi build (#3548)
* Fix pypi build

The package on pypi only included opendevin/* (the poetry default). It also needs to include agenthub/*

* Bumped version so people will actually get it!

* Fix package definition
* Updated poetry lock file
* Update package name to openhands-ai
* Add py.typed to indicate that OpenHands has type annotations

* Replace package name with openhands_ai

* Fix tests to reflect new name

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-26 01:31:37 -06:00
Graham Neubig f9088766e8 Allow setting of runtime container image (#3573)
* Add runtime container image setting

* Fix typo in test

* Fix sandbox base container image

* Update variables

* Update to base_container_image

* Update tests/unit/test_config.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* Fixed eval

* Fixed container_image

* Fix typo

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-25 23:05:41 +00:00
Robert Brennan 356d9b34be Add CLI mode (#3564)
* set log levels

* basic cli flow

* basic display

* better exits

* set log level

* fix messages

* clean up logs

* better exits

* better printing

* add todo
2024-08-26 06:10:21 +08:00
tobitege 7589be671e npm: fix Regular Expression Denial of Service (ReDoS) in micromatch (#3569) 2024-08-25 00:47:16 +02:00
mamoodi 36d1745b5e Release 0.9.0 (#3565) 2024-08-23 20:20:42 +00:00
sp.wack 07e750f038 feat(frontend): Improve models input UI/UX in settings (#3530)
* Create helper functions

* Add map according to litellm docs

* Create ModelSelector

* Extend model selector

* use autocomplete from nextui

* Improve keys without providers

* Handle models without a provider

* Add verified section and some empty handling

* Add support for default or previously set models

* Update tests

* Lint

* Remove modifier

* Fix typescript error

* Functionality for switching to custom model

* Add verified models

* Respond to resetting to default

* Comment
2024-08-23 19:06:15 +02:00
Robert Brennan b63dec4b2e Add back docker caching, simplify docker builds (#3546)
* fix multiarch

* remove extra push

* add back tag file

* fix cache tag

* add login step

* fix login

* try to fix save

* fix output maybe

* rm outputs

* remove tars

* fix refs

* fix runtime dep

* force rebuild

* lowercase image

* add suffix to build tags for runtime

* update matrix

* fix cut

* fix cut again

* add back matrix

* Update containers/build.sh

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-23 17:01:18 +00:00
dependabot[bot] bf534a90b7 chore(deps): bump monaco-editor from 0.50.0 to 0.51.0 in /frontend (#3562)
Bumps [monaco-editor](https://github.com/microsoft/monaco-editor) from 0.50.0 to 0.51.0.
- [Release notes](https://github.com/microsoft/monaco-editor/releases)
- [Changelog](https://github.com/microsoft/monaco-editor/blob/main/CHANGELOG.md)
- [Commits](https://github.com/microsoft/monaco-editor/compare/v0.50.0...v0.51.0)

---
updated-dependencies:
- dependency-name: monaco-editor
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-23 12:11:02 -04:00
dependabot[bot] 54640fb959 chore(deps): bump litellm from 1.44.1 to 1.44.4 (#3560)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.44.1 to 1.44.4.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.44.1...v1.44.4)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-23 12:10:51 -04:00
tobitege fc5f026942 prevent 500 server error on a just removed folder when listing files (#3553) 2024-08-23 18:05:38 +02:00
tofarr 8d47cebde9 Fix spaces in path (#3547)
* Fix for issue where spaces in path results in error
2024-08-23 07:29:41 -06:00
sp.wack 6180483fac docs: Update to include guide for OpenAI LLMs (#3552)
* Update docs to include OpenAI LLMs

* Remove broken link

* Update docs/modules/usage/llms/openai-llms.md
2024-08-23 16:00:04 +03:00
Raj Maheshwari 11d8d05b1a [Fix] Metrics should be updated when agent reaches max iterations. (#3549) 2024-08-23 02:28:16 +00:00
Robert Brennan 9642f8d4be Revert "Remove concurrency and cleanup workflows" (#3534)
* Revert "Remove concurrency and cleanup workflows (#3524)"

This reverts commit b7b4556433.

* Update .github/workflows/ghcr_app.yml

* Update .github/workflows/ghcr_runtime.yml

* Update .github/workflows/ghcr_runtime.yml

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-22 18:13:09 +00:00
Ikko Eltociear Ashimine 87cc28beca chore: update client.py (#3542)
occurence -> occurrence
2024-08-23 01:18:16 +08:00
dependabot[bot] e523e25e38 chore(deps-dev): bump ruff from 0.6.1 to 0.6.2 (#3540) 2024-08-23 00:04:04 +08:00
dependabot[bot] 760811e998 chore(deps): bump litellm from 1.43.19 to 1.44.1 (#3539) 2024-08-23 00:03:38 +08:00
dependabot[bot] 590f31e39f chore(deps): bump boto3 from 1.35.2 to 1.35.3 (#3537) 2024-08-23 00:03:18 +08:00
dependabot[bot] b4b231bf10 chore(deps-dev): bump pytest-asyncio from 0.23.8 to 0.24.0 (#3541) 2024-08-23 00:02:52 +08:00
dependabot[bot] eae272b0f8 chore(deps-dev): bump @types/node from 22.4.2 to 22.5.0 in /frontend (#3536)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 22.4.2 to 22.5.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-22 18:44:09 +03:00
Graham Neubig ba0d1a9c76 Update FeedbackModal.tsx (#3538) 2024-08-22 15:40:21 +00:00
Robert Brennan 12de9f7a78 revert to old ghcr repo (#3533) 2024-08-22 10:03:18 -04:00
Aaron Xia dc0a1f3940 Fix wrong doc url (#3531)
* Update custom-sandbox-guide.md

update https://docs.all-hands.dev/modules/usage/architecture/runtime

* Update runtime_build.py

update url

* Update README.md

update url
2024-08-22 13:16:27 +02:00
tobitege f8e8365caa (test) Increase test_arg_parser.py code coverage with more tests (#3509)
* increase test_arg_parser.py code coverage with more tests

* fix ipython integration tests
2024-08-22 09:43:56 +08:00
dependabot[bot] 9b30f0c21b chore(deps-dev): bump husky from 9.1.4 to 9.1.5 in /frontend (#3501)
Bumps [husky](https://github.com/typicode/husky) from 9.1.4 to 9.1.5.
- [Release notes](https://github.com/typicode/husky/releases)
- [Commits](https://github.com/typicode/husky/compare/v9.1.4...v9.1.5)

---
updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-08-22 00:30:29 +02:00
dependabot[bot] 174f48f5b3 chore(deps): bump boto3 from 1.35.1 to 1.35.2 (#3518)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.1 to 1.35.2.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.1...1.35.2)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-22 00:29:48 +02:00
Xingyao Wang b19b724eae feat: show exact python interpreter to the agent in IPython and Bash (#3448)
* try to fix pip unavailable

* update test case for pip

* force rebuild in CI

* remove extra symlink

* fix newline

* added semi-colon to line 31

* Dockerfile.j2: activate env at the end

* Revert "Dockerfile.j2: activate env at the end"

This reverts commit cf2f565102.

* cleanup Dockerfile

* switch default python image

* remove image agnostic (no longer used)

* fix tests

* simplify integration tests default image

* add nodejs specific runtime tests

* update tests and workflows

* switch to nikolaik/python-nodejs:python3.11-nodejs22

* update build sh to output image name correctly

* increase custom images to test

* fix test

* fix test

* fix double quote

* try fixing ci

* update ghcr workflow

* fix artifact name

* try to fix ghcr again

* fix workflow

* save built image to correct dir

* remove extra -docker-image

* make last tag to be human readable image tag

* fix hyphen to underscore

* run test runtime on all tags

* revert app build

* separate ghcr workflow

* update dockerfile for eval

* fix tag for test run

* try fix tag

* try fix tag via matrix output

* try workflow again

* update comments

* try fixing test matrix

* fix artifact name

* try fix tag again

* Revert "try fix tag again"

This reverts commit b369badd8c.

* tweak filename

* try different path

* fix filepath

* try fix tag artifact path again

* save json instead of line

* update matrix

* print all tags in workflow

* support only streaming diff logs from the runtime client

* remove strip from log line to fix indentation

* get py interpreter for jupyter

* rstrip to remove newline on the rightside for logging

* fix blocking issue for stream logs

* set python interpreter path in bash ps1

* update testcase for jupyter py interpreter path

* remove accidentally added changes

* remove accidentally added changes

* only print dockerfile when debug

* add docs

* remove extra tests that weren't supposed to be in this pr

* add back missing test

* revert

* make LogBuffer synchronous to fix hang in integration tests

* fix integration tests

* Update opendevin/runtime/client/client.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix test case

* fix integration tests

* change deque to list

* update integration tests

* rename test runtime

* fix docs

* rename opendevin to openhands in tests

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-21 20:08:50 +00:00
Robert Brennan 54546b1e29 fix slack links again (#3526) 2024-08-22 03:54:28 +08:00
tobitege c7886168e1 (feat) implement typescript linting for CodeActAgent (#3452)
* tweaks to linter.py to prep for typescript linting (not implemented yet)

* fix 2 linter unit tests

* simpler basic_lint output; updated unit test

* fix default gpt-4o model name in aider default config

* linter.py: use tsc (typescript compiler) for linting; added more tests

* make typescript linting be more forgiving

* use npx instead of npm to install typescript in Dockerfile.j2

* Fix merge mistake

* removed npx call from Dockerfile.j2

* fix run_cmd to use code parameter; replace regex in test

* fix test_lint_file_fail_typescript to ignore leading path characters

* added TODO comment to extract_error_line_from

* fixed bug in ts_lint with wrong line number parsing
2024-08-21 21:41:35 +02:00
tobitege 81bb918ea0 (test) Move test_runtime to ghcr_test_runtime; adapt workflows (#3513)
* move test_runtime to ghcr_test_runtime; adapt workflows; fix runtime AttributeError

* split test_runtime.py into multiple tests in new tests/runtime folder

* moved common fixtures to tests/runtime/conftest.py
2024-08-22 03:07:20 +08:00
Raj Maheshwari 80f88e14cd [Feat] Aider Benchmark (#3507)
* [Feat] Aider Benchmark

* [Add] README.md
2024-08-21 18:05:41 +00:00
mamoodi b7b4556433 Remove concurrency and cleanup workflows (#3524) 2024-08-21 13:12:00 -04:00
dependabot[bot] 9dcad7877b chore(deps-dev): bump @types/react from 18.3.3 to 18.3.4 in /frontend (#3522)
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 18.3.3 to 18.3.4.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)

---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-21 16:52:38 +00:00
Robert Brennan 4172f8530f fix slack links (#3523) 2024-08-21 16:02:20 +00:00
dependabot[bot] a24779fdb8 chore(deps): bump i18next-http-backend from 2.6.0 to 2.6.1 in /frontend (#3521)
Bumps [i18next-http-backend](https://github.com/i18next/i18next-http-backend) from 2.6.0 to 2.6.1.
- [Changelog](https://github.com/i18next/i18next-http-backend/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next-http-backend/compare/v2.6.0...v2.6.1)

---
updated-dependencies:
- dependency-name: i18next-http-backend
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-21 15:56:02 +00:00
dependabot[bot] 63d2ae8625 chore(deps-dev): bump @types/node from 22.4.1 to 22.4.2 in /frontend (#3520)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 22.4.1 to 22.4.2.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-21 15:51:24 +00:00
dependabot[bot] 751e420710 chore(deps-dev): bump openai from 1.41.1 to 1.42.0 (#3517)
Bumps [openai](https://github.com/openai/openai-python) from 1.41.1 to 1.42.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.41.1...v1.42.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-21 23:42:33 +08:00
dependabot[bot] 4124e861f6 chore(deps): bump litellm from 1.43.18 to 1.43.19 (#3516)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.18 to 1.43.19.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.18...v1.43.19)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-21 23:42:03 +08:00
dependabot[bot] dec8f85f0d chore(deps): bump google-cloud-aiplatform from 1.62.0 to 1.63.0 (#3519)
Bumps [google-cloud-aiplatform](https://github.com/googleapis/python-aiplatform) from 1.62.0 to 1.63.0.
- [Release notes](https://github.com/googleapis/python-aiplatform/releases)
- [Changelog](https://github.com/googleapis/python-aiplatform/blob/main/CHANGELOG.md)
- [Commits](https://github.com/googleapis/python-aiplatform/compare/v1.62.0...v1.63.0)

---
updated-dependencies:
- dependency-name: google-cloud-aiplatform
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-21 23:41:51 +08:00
tobitege 70723581ac (test) enhance conftest to lessen false positives in integration tests (#3512)
* conftest: also mask size=xxx parameter in integration tests

* mask random poetry path part

* make regex more flexible
2024-08-21 15:48:32 +02:00
tobitege 426f3b5fc0 (fix) add Anthropic dependency for Vertex (#3506)
* add anthropic dependency for Vertex to Dockerfile.j2 template

* add anthropic also to pyproject

* revert Dockerfile.j2 change

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>

---------

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-08-21 11:12:49 +08:00
Robert Brennan 3fe7894966 update docs for headless mode (#3508)
* update docs for headless mode

* fix newline

* fix command parsing

* add docs

* Update README.md

* Update headless-mode.md

* empty commit

* update integration tests

---------

Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-20 23:57:48 +00:00
Xingyao Wang 944d21f34a fix(ci): repo owner for docker pull (#3510) 2024-08-21 06:16:24 +08:00
tofarr 5e1d55f2a0 Now checking for the meta (Command) key as well as the ctrl key (#3484) 2024-08-20 12:20:14 -06:00
tobitege 7ef5a2d1ff (fix) Rename last opendevin occurences (#3490)
* renaming more opendevin occurences

* remove DOCKER_IMAGE variable from Makefile

* Revert rename in evaluation/swe_bench/run_infer.py

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>

---------

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-08-20 16:45:26 +00:00
tobitege af14dea7c6 bump frontend version from 0.1.0 to 0.8.3 (#3496) 2024-08-20 18:35:40 +02:00
dependabot[bot] 7d85cacf9b chore(deps-dev): bump openai from 1.41.0 to 1.41.1 (#3499) 2024-08-21 00:31:31 +08:00
dependabot[bot] a444aaf17f chore(deps): bump i18next from 23.12.3 to 23.14.0 in /frontend (#3502) 2024-08-21 00:31:19 +08:00
dependabot[bot] e60ec328d4 chore(deps): bump vite from 5.4.1 to 5.4.2 in /frontend (#3500) 2024-08-21 00:30:58 +08:00
dependabot[bot] 8aa282dda6 chore(deps): bump json-repair from 0.28.1 to 0.28.3 (#3503)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.28.1 to 0.28.3.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/v0.28.1...v0.28.3)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-20 16:14:43 +00:00
Mahmood Alhawaj 6487175a31 refactored all relative paths to absolute paths (#3495) 2024-08-21 00:09:48 +08:00
dependabot[bot] d0df95ac62 chore(deps): bump boto3 from 1.35.0 to 1.35.1 (#3498)
Bumps [boto3](https://github.com/boto/boto3) from 1.35.0 to 1.35.1.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.35.0...1.35.1)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-20 15:55:37 +00:00
Xingyao Wang c8452f5813 fix: custom runtime image won't work for go (#3464)
* fix request param for container_image;
add test for go;

* fix go version issue

* update test to detect go version
2024-08-20 23:38:59 +08:00
Graham Neubig a7dfd5b9fa Update README.md (#3492)
This PR acknowledges that this repo used to be called OpenDevin, to reduce confusion for those who may have heard of OpenDevin before.
2024-08-20 16:26:34 +02:00
tofarr f5aa111ba6 Fix: Bump max_iterations when resuming due to throttling (#3410)
* Fix: Reset iteration count when resuming due to throttling

* Fix inadvertent additions

* WIP

* Changing max_iterations instead of iteration count

* Now adjusting max_iterations or max_budget_per_task as appropriate

* Fix check on iterations

* Fix linter issues

* AgentController: remember initial max_iterations and use it to extend state's iterations

* increase task budget by initial value (not doubling it)

---------

Co-authored-by: Tim O'Farrell <tofarr@gmai.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
Co-authored-by: mamoodi <mamoodiha@gmail.com>
2024-08-20 06:53:26 -06:00
dependabot[bot] 0a3d46a90b chore(deps): bump i18next-http-backend from 2.5.2 to 2.6.0 in /frontend (#3459)
Bumps [i18next-http-backend](https://github.com/i18next/i18next-http-backend) from 2.5.2 to 2.6.0.
- [Changelog](https://github.com/i18next/i18next-http-backend/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next-http-backend/compare/v2.5.2...v2.6.0)

---
updated-dependencies:
- dependency-name: i18next-http-backend
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-20 09:03:04 +00:00
Xingyao Wang dc4f017b80 remove extra guide url file (#3486) 2024-08-20 07:30:04 +02:00
dependabot[bot] 7cc0b2827a chore(deps-dev): bump @types/node from 22.3.0 to 22.4.1 in /frontend (#3460) 2024-08-20 02:16:02 +00:00
dependabot[bot] 435688be51 chore(deps): bump jose from 5.6.3 to 5.7.0 in /frontend (#3458)
Bumps [jose](https://github.com/panva/jose) from 5.6.3 to 5.7.0.
- [Release notes](https://github.com/panva/jose/releases)
- [Changelog](https://github.com/panva/jose/blob/main/CHANGELOG.md)
- [Commits](https://github.com/panva/jose/compare/v5.6.3...v5.7.0)

---
updated-dependencies:
- dependency-name: jose
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-20 02:14:59 +00:00
Xingyao Wang 981c46cb61 fix: last tag list in CI (#3488) 2024-08-20 09:56:33 +08:00
Xingyao Wang d45b7208f9 fix last_tag of matrix (#3487) 2024-08-20 09:36:40 +08:00
Xingyao Wang 3d9a38d755 fix: CI push image (#3485)
* fix comment

* remove extra ls

* update title

* fix app docker
2024-08-20 09:17:22 +08:00
Xingyao Wang d4ba7f53f9 fix: integration tests (#3481)
* chore: fix ghcr again

* fix integraton tests
2024-08-19 23:06:55 +00:00
Xingyao Wang 94590aad35 chore: fix ghcr app push yet again (#3482) 2024-08-20 06:55:55 +08:00
Xingyao Wang 058dd5a025 chore: fix ghcr again (#3479) 2024-08-19 22:17:24 +00:00
dependabot[bot] 02845a611b chore(deps-dev): bump llama-index-embeddings-ollama from 0.1.3 to 0.2.0 (#3462)
Bumps llama-index-embeddings-ollama from 0.1.3 to 0.2.0.

---
updated-dependencies:
- dependency-name: llama-index-embeddings-ollama
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-19 14:08:51 -07:00
Xingyao Wang 8f0f764a85 fix: CI docker image push (#3476)
* fix ghcr app

* fix ghcr runtime push

* rename od_runtime to runtime
2024-08-19 20:53:28 +00:00
dependabot[bot] d1d2fb9aca chore(deps): bump minio from 7.2.7 to 7.2.8 (#3463)
Bumps [minio](https://github.com/minio/minio-py) from 7.2.7 to 7.2.8.
- [Release notes](https://github.com/minio/minio-py/releases)
- [Commits](https://github.com/minio/minio-py/compare/7.2.7...7.2.8)

---
updated-dependencies:
- dependency-name: minio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-19 20:43:30 +00:00
dependabot[bot] ea6dcd8ac8 chore(deps): bump boto3 from 1.34.162 to 1.35.0 (#3465)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.162 to 1.35.0.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.162...1.35.0)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-19 20:08:46 +00:00
dependabot[bot] 8f8931dd5a chore(deps): bump json-repair from 0.28.0 to 0.28.1 (#3461)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.28.0 to 0.28.1.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.28.0...v0.28.1)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-19 20:08:38 +00:00
dependabot[bot] ac21733b1e chore(deps-dev): bump ruff from 0.6.0 to 0.6.1 (#3467)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.6.0 to 0.6.1.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.6.0...0.6.1)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-19 12:45:56 -07:00
dependabot[bot] f8a0f936e5 chore(deps): bump litellm from 1.43.15 to 1.43.18 (#3466)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.15 to 1.43.18.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.15...v1.43.18)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-19 19:19:16 +00:00
dependabot[bot] 1b0d083edb chore(deps-dev): bump openai from 1.40.8 to 1.41.0 (#3470)
Bumps [openai](https://github.com/openai/openai-python) from 1.40.8 to 1.41.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.40.8...v1.41.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-19 21:16:51 +02:00
Engel Nyst 840d2b2aba renaming (#3474) 2024-08-19 18:41:45 +00:00
Xingyao Wang 356cbe0adf fix: ci docker upload (#3469)
* fix upload workflow

* fix typo in filename

* fix ghcr runtime

* rename ghcr to ghcr runtime

* fix upload app docker again
2024-08-20 01:42:08 +08:00
Robert Brennan 01ae22ef57 Rename OpenDevin to OpenHands (#3472)
* Replace OpenDevin with OpenHands

* Update CONTRIBUTING.md

* Update README.md

* Update README.md

* update poetry lock; move opendevin folder to openhands

* fix env var

* revert image references in docs

* revert permissions

* revert permissions

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-20 00:44:54 +08:00
Xingyao Wang 83f36c1d66 test: build and run runtime tests on different custom docker images (#3324)
* try to fix pip unavailable

* update test case for pip

* force rebuild in CI

* remove extra symlink

* fix newline

* added semi-colon to line 31

* Dockerfile.j2: activate env at the end

* Revert "Dockerfile.j2: activate env at the end"

This reverts commit cf2f565102.

* cleanup Dockerfile

* switch default python image

* remove image agnostic (no longer used)

* fix tests

* simplify integration tests default image

* add nodejs specific runtime tests

* update tests and workflows

* switch to nikolaik/python-nodejs:python3.11-nodejs22

* update build sh to output image name correctly

* increase custom images to test

* fix test

* fix test

* fix double quote

* try fixing ci

* update ghcr workflow

* fix artifact name

* try to fix ghcr again

* fix workflow

* save built image to correct dir

* remove extra -docker-image

* make last tag to be human readable image tag

* fix hyphen to underscore

* run test runtime on all tags

* revert app build

* separate ghcr workflow

* update dockerfile for eval

* fix tag for test run

* try fix tag

* try fix tag via matrix output

* try workflow again

* update comments

* try fixing test matrix

* fix artifact name

* try fix tag again

* Revert "try fix tag again"

This reverts commit b369badd8c.

* tweak filename

* try different path

* fix filepath

* try fix tag artifact path again

* save json instead of line

* update matrix

* print all tags in workflow

* fix DOCKER_IMAGE to avoid ghcr.io/opendevin/ghcr.io/opendevin/od_runtime

* fix test matrix to only load unique test image tags

* try fix matrix again!!!!!

* add all runtime tests passed

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
2024-08-19 21:12:00 +08:00
Xingyao Wang 4f285c8e0f Fix "Credits" in README.md (#3455) 2024-08-19 09:07:18 -04:00
mamoodi 14a4e45cbb Standardize names and better organization (#3453)
* Standardize names and better organization

* Update links
2024-08-19 09:30:47 +08:00
tobitege d1b9787751 remove obsolete prompt.py file (codeact_agent) (#3450) 2024-08-19 09:18:36 +08:00
tobitege 49bde1a760 ghcr.yml: bump docker/login-action@v2 to v3 (#3451) 2024-08-19 09:18:16 +08:00
tobitege 733bf2b38c tweaks to chat message CSS stylings for lists (#3449) 2024-08-18 21:49:12 +02:00
Xingyao Wang 537fb7d985 feat: convert agent prompts into structured Jinja2 templates (#3360)
* commit jinja draft

* remove extra file

* update system prompt

* remove github message

* update prompts

* add prompt manager and its tests

* use prompt manager for codeact and bump version

* fix integration tests

* fix lint

* simplify test case

* update system

* fix integration tests

* update credit path for aider

* Update CREDITS.md

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-18 16:38:46 +00:00
Xingyao Wang a2ea17909d chore: remove deprecated RuntimeTool (#3443) 2024-08-18 09:45:45 +08:00
Xingyao Wang ac80f20473 update readme 2024-08-17 20:45:21 -05:00
Xingyao Wang c3f62c3ce9 allow setting dataset name and split 2024-08-17 20:43:59 -05:00
Xingyao Wang b8ec420ccd remove unused stuff 2024-08-17 20:43:09 -05:00
Xingyao Wang 7b6ae3638e remove unused swebench scripts 2024-08-17 20:38:49 -05:00
Engel Nyst 84a814c447 Additions to agent docs (#3434)
* adjust docstrings

docstrings for agent controller

Update agent readme.

* remove not implemented

* fix formatting

* fix indentation

* fix block

* formatting

* include stop
2024-08-18 01:56:07 +02:00
Graham Neubig 2b1e73365b Fix typing for fake user response function (#3438) 2024-08-18 06:47:08 +08:00
tobitege 944f934475 remove 1s sleep from _ensure_session (runtime) (#3439) 2024-08-17 22:00:20 +00:00
Engel Nyst 9cb0bf97c1 Fix restore cli sessions (#3409)
* fix restore cli sessions

* pytest

* fix log message

* make sure sid is set

---------

Co-authored-by: mamoodi <mamoodiha@gmail.com>
2024-08-17 23:31:42 +02:00
Xingyao Wang 8d7bf83224 refactor: break agentskills single file to multiple composable modules (#3429)
* refactor agentskills to prepare for agentless

* fix import

* fix typo

* fix imports

* fix globals

* fix import

* fix import

* disable log to file to avoid auto-created log file w/ permission issue when import od in runtime

* import agentskills from OD instead from itself directly

* add back pythonpath

* remove chown since there's no log/folder
2024-08-18 05:18:36 +08:00
Engel Nyst 3d04bd90e1 Add credits and licenses for code from other projects (#3436)
* add projects with licenses list

* fix projects

* fix projects

* fix projects

* fix licenses, include full Apache

* move AL 2.0 text to Credits

* Update README.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-17 18:12:28 +00:00
tofarr b2221f135f Now printing a warning message when a config has an unknown key (#3428)
* Now printing a warning message when unknown keys present
2024-08-17 10:54:38 -06:00
Engel Nyst 92b1a2da5c Refactor agent to accept agent config (#3430)
* refactor agents to receive their agent config

* add unit test

* fix test

* fix tests
2024-08-17 18:11:30 +02:00
dependabot[bot] 629b47400d chore(deps): bump fastapi from 0.112.0 to 0.112.1 (#3417)
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.112.0 to 0.112.1.
- [Release notes](https://github.com/fastapi/fastapi/releases)
- [Commits](https://github.com/fastapi/fastapi/compare/0.112.0...0.112.1)

---
updated-dependencies:
- dependency-name: fastapi
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: mamoodi <mamoodiha@gmail.com>
2024-08-17 23:52:05 +08:00
tobitege 0583609531 (fix) jobs need if condition modified to enable run on PR push (#3433)
* fix ghcr_push job: condition needed to check right push context

* apply same rule for other dependent jobs
2024-08-17 23:37:57 +08:00
mamoodi 6e70f79bf4 Remove unnecessary concurrency and more comments (#3422) 2024-08-16 18:19:34 -04:00
Xingyao Wang 1cf44ef854 fix app docker again! (#3427) 2024-08-16 21:43:56 +00:00
Xingyao Wang 5d920489fc docs: improve documentation (#3425)
* add imports to eval harness

* update out-dated custom sandbox guide

* Update docs/modules/usage/evaluation_harness.md

* remove llm pasta

* update od doc

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-16 21:25:10 +00:00
Xingyao Wang 3f20e4edc6 attempt to fix docker issue (#3426) 2024-08-16 19:41:24 +00:00
tofarr f8b129da43 Added remark gfm to render tables (#3423)
Co-authored-by: Tim O'Farrell <tofarr@gmai.com>
2024-08-17 03:20:50 +08:00
dependabot[bot] 23608be974 chore(deps-dev): bump openai from 1.40.6 to 1.40.8 (#3419)
Bumps [openai](https://github.com/openai/openai-python) from 1.40.6 to 1.40.8.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.40.6...v1.40.8)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-16 16:46:15 +00:00
dependabot[bot] df2bdc2ae9 chore(deps): bump litellm from 1.43.13 to 1.43.15 (#3420)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.13 to 1.43.15.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.13...v1.43.15)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-16 16:42:39 +00:00
dependabot[bot] 3c0a61666b chore(deps): bump json-repair from 0.27.2 to 0.28.0 (#3421)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.27.2 to 0.28.0.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.27.2...0.28.0)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-16 16:41:34 +00:00
dependabot[bot] db5d120c2c chore(deps): bump boto3 from 1.34.161 to 1.34.162 (#3418)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.161 to 1.34.162.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.161...1.34.162)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-16 15:42:34 +00:00
mamoodi 340fe0de18 Split Frontend and Python Unit tests (#3399)
* Split Frontend and Python Unit tests

* Extra comment for deploy docs workflow

* Simpler comment

* Add paths and paths-ignore to unit tests

* More specific comment for py unit tests

* Remove paths-ignore because jobs are required

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-16 11:39:14 -04:00
Engel Nyst f9f96dd429 Use generic types (#3414)
* use generic types post python 3.10

* use newer optional

* a lost list

* fix integration tests
2024-08-16 06:41:57 -04:00
Bin Lei ae5f130881 fix potential flake8 miss checking (#3124)
* fix potential flake8 miss checking

* Add unit test for edit_file_by_replace function with problematic file

* Add unit test for edit_file_by_replace function with problematic file

* Add unit test for edit_file_by_replace function with problematic file

* Add unit test for edit_file_by_replace function with problematic file

* Add unit test for edit_file function with problematic file

* Add unit test for edit_file function with problematic file

* Add unit test for edit_file function with problematic file

* Add unit test for edit_file function with problematic file

* Add unit test for edit_file function with problematic file

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* add test intention description

* fix potential flake8 miss checking

* fix potential flake8 miss checking

* fix potential flake8 miss checking

* fix potential flake8 miss checking

* fix potential flake8 miss checking

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: tobitege <tobitege@gmx.de>
2024-08-16 09:23:55 +08:00
tofarr eab7ea3d37 Feat: unsaved file content (#3358)
Added file states, useEffect and destructor
2024-08-15 18:21:49 -06:00
dependabot[bot] c67df47c10 chore(deps): bump litellm from 1.43.10 to 1.43.13 (#3403)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.10 to 1.43.13.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.10...v1.43.13)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-15 14:51:47 -04:00
dependabot[bot] 4629f3d984 chore(deps): bump vite from 5.4.0 to 5.4.1 in /frontend (#3402)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.4.0 to 5.4.1.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.4.1/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-15 10:48:50 -07:00
dependabot[bot] 6129ffbf77 chore(deps-dev): bump ruff from 0.5.7 to 0.6.0 (#3404)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.7 to 0.6.0.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.7...0.6.0)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-15 10:48:14 -07:00
dependabot[bot] 1c0f71a857 chore(deps): bump boto3 from 1.34.160 to 1.34.161 (#3405)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.160 to 1.34.161.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.160...1.34.161)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-15 19:47:45 +02:00
tobitege 17c360c891 RuntimeClient: make _init_user more robust (#3400) 2024-08-15 16:42:39 +02:00
tobitege 3a77af8a22 copy manifest file into container (#3396) 2024-08-14 21:46:56 -04:00
mamoodi 5dab07094d fix: Only run one deploy docs workflow at a time. Add comments (#3398) 2024-08-15 03:30:57 +02:00
Graham Neubig 50b1256c49 Add tests for agent controller (#3357)
* Add tests for agent controller

* Remove dead code

* Remove dead code
2024-08-15 04:58:16 +08:00
Xingyao Wang b6243bb96b feat: refactor image building logic into runtime builder (#3395)
* feat: refactor building logic into runtime builder

* return image name

* fix testcases

* use runtime builder for eventstream runtime
2024-08-14 20:02:12 +00:00
Engel Nyst be2ea6ab35 logging fixes (#3393) 2024-08-14 21:21:42 +02:00
Engel Nyst 463c66a372 add error obs to codeact SWE (#3392) 2024-08-14 18:41:25 +00:00
tobitege 4095a7d5c9 _create_project_source_dist: run subprocess as shell (#3390) 2024-08-14 18:55:21 +02:00
dependabot[bot] 09e67d90a4 chore(deps): bump litellm from 1.43.9 to 1.43.10 (#3388) 2024-08-15 00:17:11 +08:00
dependabot[bot] 3fc8f5d4fa chore(deps): bump boto3 from 1.34.159 to 1.34.160 (#3387) 2024-08-15 00:16:57 +08:00
dependabot[bot] a263d5b9c8 chore(deps): bump google-cloud-aiplatform from 1.61.0 to 1.62.0 (#3386) 2024-08-15 00:16:40 +08:00
dependabot[bot] 072a4a1f8c chore(deps-dev): bump tailwindcss from 3.4.9 to 3.4.10 in /frontend (#3385) 2024-08-15 00:16:19 +08:00
dependabot[bot] 0efc25e009 chore(deps-dev): bump @types/node from 22.2.0 to 22.3.0 in /frontend (#3384) 2024-08-15 00:16:03 +08:00
dependabot[bot] edbaa42137 chore(deps): bump datasets from 2.20.0 to 2.21.0 (#3389) 2024-08-15 00:15:37 +08:00
Graham Neubig 7d331acffa Handle error observations in codeact (#3383)
* Handle error observations in codeact

* Remove comments
2024-08-14 13:47:31 +00:00
tobitege 19bc06198d exclude Python cache files/folders from sdist to avoid permission errors at runtime (#3381) 2024-08-14 06:54:41 -04:00
Graham Neubig 92b19ed1fb Add unit tests for MemoryCondenser in test_condenser.py (#3379)
* Add unit tests for MemoryCondenser in test_condenser.py

* Formatting

* Fix formatting etc

---------

Co-authored-by: opendevin <opendevin@all-hands.dev>
2024-08-14 10:20:30 +02:00
dependabot[bot] e33b3560df chore(deps): bump the docusaurus group in /docs with 7 updates (#3376)
Bumps the docusaurus group in /docs with 7 updates:

| Package | From | To |
| --- | --- | --- |
| [@docusaurus/core](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus) | `3.5.1` | `3.5.2` |
| [@docusaurus/plugin-content-pages](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-plugin-content-pages) | `3.5.1` | `3.5.2` |
| [@docusaurus/preset-classic](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-preset-classic) | `3.5.1` | `3.5.2` |
| [@docusaurus/theme-mermaid](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-theme-mermaid) | `3.5.1` | `3.5.2` |
| [@docusaurus/module-type-aliases](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-module-type-aliases) | `3.5.1` | `3.5.2` |
| [@docusaurus/tsconfig](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-tsconfig) | `3.5.1` | `3.5.2` |
| [@docusaurus/types](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-types) | `3.5.1` | `3.5.2` |


Updates `@docusaurus/core` from 3.5.1 to 3.5.2
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.2/packages/docusaurus)

Updates `@docusaurus/plugin-content-pages` from 3.5.1 to 3.5.2
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.2/packages/docusaurus-plugin-content-pages)

Updates `@docusaurus/preset-classic` from 3.5.1 to 3.5.2
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.2/packages/docusaurus-preset-classic)

Updates `@docusaurus/theme-mermaid` from 3.5.1 to 3.5.2
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.2/packages/docusaurus-theme-mermaid)

Updates `@docusaurus/module-type-aliases` from 3.5.1 to 3.5.2
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.2/packages/docusaurus-module-type-aliases)

Updates `@docusaurus/tsconfig` from 3.5.1 to 3.5.2
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.2/packages/docusaurus-tsconfig)

Updates `@docusaurus/types` from 3.5.1 to 3.5.2
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.2/packages/docusaurus-types)

---
updated-dependencies:
- dependency-name: "@docusaurus/core"
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: docusaurus
- dependency-name: "@docusaurus/plugin-content-pages"
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: docusaurus
- dependency-name: "@docusaurus/preset-classic"
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: docusaurus
- dependency-name: "@docusaurus/theme-mermaid"
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: docusaurus
- dependency-name: "@docusaurus/module-type-aliases"
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: docusaurus
- dependency-name: "@docusaurus/tsconfig"
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: docusaurus
- dependency-name: "@docusaurus/types"
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: docusaurus
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 10:55:35 -04:00
Graham Neubig 98e6756641 Add dependabot group for eslint (#3374) 2024-08-13 12:37:37 +00:00
dependabot[bot] bf5e08f203 chore(deps): bump boto3 from 1.34.158 to 1.34.159 (#3373)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.158 to 1.34.159.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.158...1.34.159)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:44:12 -04:00
dependabot[bot] 57cddd5223 chore(deps): bump uvicorn from 0.30.5 to 0.30.6 (#3372)
Bumps [uvicorn](https://github.com/encode/uvicorn) from 0.30.5 to 0.30.6.
- [Release notes](https://github.com/encode/uvicorn/releases)
- [Changelog](https://github.com/encode/uvicorn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/encode/uvicorn/compare/0.30.5...0.30.6)

---
updated-dependencies:
- dependency-name: uvicorn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:44:05 -04:00
dependabot[bot] f239522a79 chore(deps-dev): bump openai from 1.40.3 to 1.40.6 (#3371)
Bumps [openai](https://github.com/openai/openai-python) from 1.40.3 to 1.40.6.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.40.3...v1.40.6)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:43:58 -04:00
dependabot[bot] 7259a46ace chore(deps): bump litellm from 1.43.7 to 1.43.9 (#3370)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.7 to 1.43.9.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.7...v1.43.9)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:43:51 -04:00
dependabot[bot] 8105bb199e chore(deps): bump react-icons from 5.2.1 to 5.3.0 in /frontend (#3369)
Bumps [react-icons](https://github.com/react-icons/react-icons) from 5.2.1 to 5.3.0.
- [Release notes](https://github.com/react-icons/react-icons/releases)
- [Commits](https://github.com/react-icons/react-icons/compare/v5.2.1...v5.3.0)

---
updated-dependencies:
- dependency-name: react-icons
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:43:44 -04:00
dependabot[bot] a737d49f54 chore(deps): bump react-icons from 5.2.1 to 5.3.0 in /docs (#3367)
Bumps [react-icons](https://github.com/react-icons/react-icons) from 5.2.1 to 5.3.0.
- [Release notes](https://github.com/react-icons/react-icons/releases)
- [Commits](https://github.com/react-icons/react-icons/compare/v5.2.1...v5.3.0)

---
updated-dependencies:
- dependency-name: react-icons
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:43:37 -04:00
dependabot[bot] 71099ea14f chore(deps-dev): bump lint-staged from 15.2.8 to 15.2.9 in /frontend (#3365)
Bumps [lint-staged](https://github.com/lint-staged/lint-staged) from 15.2.8 to 15.2.9.
- [Release notes](https://github.com/lint-staged/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/lint-staged/lint-staged/compare/v15.2.8...v15.2.9)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:43:29 -04:00
dependabot[bot] 7f0cbf144a chore(deps): bump tailwind-merge from 2.5.1 to 2.5.2 in /frontend (#3364)
Bumps [tailwind-merge](https://github.com/dcastil/tailwind-merge) from 2.5.1 to 2.5.2.
- [Release notes](https://github.com/dcastil/tailwind-merge/releases)
- [Commits](https://github.com/dcastil/tailwind-merge/compare/v2.5.1...v2.5.2)

---
updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:42:39 -04:00
dependabot[bot] 8bbcf1613c chore(deps): bump i18next from 23.12.2 to 23.12.3 in /frontend (#3363)
Bumps [i18next](https://github.com/i18next/i18next) from 23.12.2 to 23.12.3.
- [Release notes](https://github.com/i18next/i18next/releases)
- [Changelog](https://github.com/i18next/i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next/compare/v23.12.2...v23.12.3)

---
updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 07:40:39 -04:00
adragos e0b67ad2f1 feat: add Security Analyzer functionality (#3058)
* feat: Initial work on security analyzer

* feat: Add remote invariant client

* chore: improve fault tolerance of client

* feat: Add button to enable Invariant Security Analyzer

* [feat] confirmation mode for bash actions

* feat: Add Invariant Tab with security risk outputs

* feat: Add modal setting for Confirmation Mode

* fix: frontend tests for confirmation mode switch

* fix: add missing CONFIRMATION_MODE value in SettingsModal.test.tsx

* fix: update test to integrate new setting

* feat: Initial work on security analyzer

* feat: Add remote invariant client

* chore: improve fault tolerance of client

* feat: Add button to enable Invariant Security Analyzer

* feat: Add Invariant Tab with security risk outputs

* feat: integrate security analyzer with confirmation mode

* feat: improve invariant analyzer tab

* feat: Implement user confirmation for running bash/python code

* fix: don't display rejected actions

* fix: make confirmation show only on assistant messages

* feat: download traces, update policy, implement settings, auto-approve based on defined risk

* Fix: low risk not being shown because it's 0

* fix: duplicate logs in tab

* fix: log duplication

* chore: prepare for merge, remove logging

* Merge confirmation_mode from OpenDevin main

* test: update tests to pass

* chore: finish merging changes, security analyzer now operational again

* feat: document Security Analyzers

* refactor: api, monitor

* chore: lint, fix risk None, revert policy

* fix: check security_risk for None

* refactor: rename instances of invariant to security analyzer

* feat: add /api/options/security-analyzers endpoint

* Move security analyzer from tab to modal

* Temporary fix lock when security analyzer is not chosen

* feat: don't show lock at all when security analyzer is not enabled

* refactor:
- Frontend:
* change type of SECURITY_ANALYZER from bool to string
* add combobox to select SECURITY_ANALYZER, current options are "invariant and "" (no security analyzer)
* Security is now a modal, lock in bottom right is visible only if there's a security analyzer selected
- Backend:
* add close to SecurityAnalyzer
* instantiate SecurityAnalyzer based on provided string from frontend

* fix: update close to be async, to be consistent with other close on resources

* fix: max height of modal (prevent overflow)

* feat: add logo

* small fixes

* update docs for creating a security analyzer module

* fix linting

* update timeout for http client

* fix: move security_analyzer config from agent to session

* feat: add security_risk to browser actions

* add optional remark on combobox

* fix: asdict not called on dataclass, remove invariant dependency

* fix: exclude None values when serializing

* feat: take default policy from invariant-server instead of being hardcoded

* fix: check if policy is None

* update image name

* test: fix some failing runs

* fix: security analyzer tests

* refactor: merge confirmation_mode and security_analyzer into SecurityConfig. Change invariant error message for docker

* test: add tests for invariant parsing actions / observations

* fix: python linting for test_security.py

* Apply suggestions from code review

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* use ActionSecurityRisk | None intead of Optional

* refactor action parsing

* add extra check

* lint parser.py

* test: add field keep_prompt to test_security

* docs: add information about how to enable the analyzer

* test: Remove trailing whitespace in README.md text

---------

Co-authored-by: Mislav Balunovic <mislav.balunovic@gmail.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-13 11:29:41 +00:00
dependabot[bot] 7ce4f9c4da chore(deps): bump the docusaurus group in /docs with 3 updates (#3366)
Bumps the docusaurus group in /docs with 3 updates: [@docusaurus/core](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus), [@docusaurus/preset-classic](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-preset-classic) and [@docusaurus/theme-mermaid](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-theme-mermaid).


Updates `@docusaurus/core` from 3.4.0 to 3.5.1
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.1/packages/docusaurus)

Updates `@docusaurus/preset-classic` from 3.4.0 to 3.5.1
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.1/packages/docusaurus-preset-classic)

Updates `@docusaurus/theme-mermaid` from 3.4.0 to 3.5.1
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.1/packages/docusaurus-theme-mermaid)

---
updated-dependencies:
- dependency-name: "@docusaurus/core"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: docusaurus
- dependency-name: "@docusaurus/preset-classic"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: docusaurus
- dependency-name: "@docusaurus/theme-mermaid"
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: docusaurus
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 06:35:39 -04:00
Xingyao Wang 568e6cdb40 feat: change Jupyter cwd alone with "bash" (#3331)
* remove unused plugin mixin

* change the entire jupyter PWD with bash;
print jupyter pwd in obs as well;

* remove unused field

* remove unused comments

* change the entire jupyter PWD with bash;
print jupyter pwd in obs as well;

* fix runtime tests for jupyter

* update intgeration tests

* fix test again

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-13 06:08:31 -04:00
tobitege 9ad665a996 fix dependabot.yml groups (#3361) 2024-08-13 17:58:15 +08:00
Graham Neubig 20a35c59f6 Group docusaurus dependabot updates (#3359) 2024-08-13 13:38:41 +08:00
dependabot[bot] ea8397ff89 chore(deps-dev): bump @docusaurus/tsconfig from 3.4.0 to 3.5.1 in /docs (#3342)
Bumps [@docusaurus/tsconfig](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-tsconfig) from 3.4.0 to 3.5.1.
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.1/packages/docusaurus-tsconfig)

---
updated-dependencies:
- dependency-name: "@docusaurus/tsconfig"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 02:30:42 +00:00
dependabot[bot] c73016c551 chore(deps-dev): bump @docusaurus/types from 3.4.0 to 3.5.1 in /docs (#3344)
Bumps [@docusaurus/types](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-types) from 3.4.0 to 3.5.1.
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.1/packages/docusaurus-types)

---
updated-dependencies:
- dependency-name: "@docusaurus/types"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 21:42:57 -04:00
Yufan Song 86d933f1b0 remove useless code (#3355) 2024-08-12 21:34:17 -04:00
tofarr 930ee27037 Collapsible resizers (#3330)
* Collapsible resizable divs

Co-authored-by: Tim O'Farrell <tofarr@gmai.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-08-12 15:05:59 -06:00
mamoodi e2b2f74737 Add comments to the runtime_build script (#3333) 2024-08-12 15:17:12 -04:00
dependabot[bot] 5226d8d684 chore(deps): bump @docusaurus/plugin-content-pages in /docs (#3346)
Bumps [@docusaurus/plugin-content-pages](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-plugin-content-pages) from 3.4.0 to 3.5.1.
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.1/packages/docusaurus-plugin-content-pages)

---
updated-dependencies:
- dependency-name: "@docusaurus/plugin-content-pages"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 18:44:37 +00:00
dependabot[bot] 38c1ab189a chore(deps-dev): bump @docusaurus/module-type-aliases in /docs (#3348)
Bumps [@docusaurus/module-type-aliases](https://github.com/facebook/docusaurus/tree/HEAD/packages/docusaurus-module-type-aliases) from 3.4.0 to 3.5.1.
- [Release notes](https://github.com/facebook/docusaurus/releases)
- [Changelog](https://github.com/facebook/docusaurus/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/docusaurus/commits/v3.5.1/packages/docusaurus-module-type-aliases)

---
updated-dependencies:
- dependency-name: "@docusaurus/module-type-aliases"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 10:50:09 -07:00
dependabot[bot] cbfa8a872b chore(deps): bump boto3 from 1.34.157 to 1.34.158 (#3351)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.157 to 1.34.158.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.157...1.34.158)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 10:49:45 -07:00
dependabot[bot] 390a660aab chore(deps-dev): bump sympy from 1.13.1 to 1.13.2 (#3353)
Bumps [sympy](https://github.com/sympy/sympy) from 1.13.1 to 1.13.2.
- [Release notes](https://github.com/sympy/sympy/releases)
- [Commits](https://github.com/sympy/sympy/compare/sympy-1.13.1...sympy-1.13.2)

---
updated-dependencies:
- dependency-name: sympy
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-12 10:49:22 -07:00
dependabot[bot] 6c61f55962 chore(deps): bump litellm from 1.43.4 to 1.43.7 (#3349)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.4 to 1.43.7.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.4...v1.43.7)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-12 10:48:54 -07:00
dependabot[bot] b4795ffbca chore(deps): bump json-repair from 0.27.0 to 0.27.2 (#3350)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.27.0 to 0.27.2.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.27.0...0.27.2)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 10:48:28 -07:00
dependabot[bot] 602bf40bd4 chore(deps-dev): bump llama-index-embeddings-huggingface (#3352)
Bumps llama-index-embeddings-huggingface from 0.2.2 to 0.2.3.

---
updated-dependencies:
- dependency-name: llama-index-embeddings-huggingface
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-12 10:48:11 -07:00
dependabot[bot] 9653f04367 chore(deps-dev): bump openai from 1.40.2 to 1.40.3 (#3354)
Bumps [openai](https://github.com/openai/openai-python) from 1.40.2 to 1.40.3.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.40.2...v1.40.3)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-12 10:47:47 -07:00
dependabot[bot] 34510c3605 chore(deps): bump tailwind-merge from 2.4.0 to 2.5.1 in /frontend (#3341)
Bumps [tailwind-merge](https://github.com/dcastil/tailwind-merge) from 2.4.0 to 2.5.1.
- [Release notes](https://github.com/dcastil/tailwind-merge/releases)
- [Commits](https://github.com/dcastil/tailwind-merge/compare/v2.4.0...v2.5.1)

---
updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 23:53:33 +08:00
dependabot[bot] 4a62f9c4ec chore(deps-dev): bump @types/node from 22.1.0 to 22.2.0 in /frontend (#3340)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 22.1.0 to 22.2.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 23:53:08 +08:00
Yufan Song 1424db8309 remove uesless process (#3338) 2024-08-11 18:53:55 -07:00
Yufan Song 28dd882f98 remove useless sanbox (#3336) 2024-08-11 19:08:30 +00:00
Yufan Song 99ac91f6ad remove sandbox abstract class (#3337)
* remove sandbox abstract class

* remove cancellable stream
2024-08-11 22:47:28 +08:00
yueqis 9757f362bf Update run_infer.py of gorilla to download my-languages.so (#3286)
* Update run_infer.py of gorilla to download my-languages.so

* add exist check, change file path, lint code

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-08-11 07:40:27 +00:00
Yufan Song da5bf6c1bf remove useless mock test code (#3335) 2024-08-11 07:33:58 +00:00
Xingyao Wang e6fa5b5df0 chore: remove unused plugin mixin (#3332)
* remove unused plugin mixin

* remove unused field

* remove unused comments
2024-08-09 20:50:49 -04:00
tofarr ace733cb1f Minor UI fix to explorer tree (#3328)
Co-authored-by: Tim O'Farrell <tofarr@gmai.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-09 23:33:35 +00:00
dependabot[bot] e059407c74 chore(deps-dev): bump vite-tsconfig-paths in /frontend (#3303)
Bumps [vite-tsconfig-paths](https://github.com/aleclarson/vite-tsconfig-paths) from 4.3.2 to 5.0.1.
- [Release notes](https://github.com/aleclarson/vite-tsconfig-paths/releases)
- [Commits](https://github.com/aleclarson/vite-tsconfig-paths/compare/v4.3.2...v5.0.1)

---
updated-dependencies:
- dependency-name: vite-tsconfig-paths
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: tofarr <tofarr@gmail.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-09 15:39:06 -07:00
tofarr b4a7e27bfd chore Fix architecture diagram (#3141)
* Fix architecture diagram

* Updated Readme with permanent img

* Fix path

---------

Co-authored-by: Tim O'Farrell <tofarr@Tims-MacBook-Pro-2.local>
Co-authored-by: Tim O'Farrell <tofarr@gmai.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
2024-08-09 20:48:31 +00:00
tobitege 76a27e72bd remove test_sandbox references from run-unit-tests.yml (#3326) 2024-08-09 20:13:27 +00:00
Xingyao Wang bdf6df12c3 fix: pip not available in runtime (#3306)
* try to fix pip unavailable

* update test case for pip

* force rebuild in CI

* remove extra symlink

* fix newline

* added semi-colon to line 31

* Dockerfile.j2: activate env at the end

* Revert "Dockerfile.j2: activate env at the end"

This reverts commit cf2f565102.

* cleanup Dockerfile

* switch default python image

* remove image agnostic (no longer used)

* fix tests

* switch to nikolaik/python-nodejs:python3.11-nodejs22

* fix test

* fix test

* revert docker

* update template

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-09 15:04:43 -04:00
dependabot[bot] 00bc68642f chore(deps-dev): bump openai from 1.40.1 to 1.40.2 (#3319)
Bumps [openai](https://github.com/openai/openai-python) from 1.40.1 to 1.40.2.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.40.1...v1.40.2)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-09 11:38:49 -04:00
dependabot[bot] 2b96911758 chore(deps): bump google-cloud-aiplatform from 1.60.0 to 1.61.0 (#3318)
Bumps [google-cloud-aiplatform](https://github.com/googleapis/python-aiplatform) from 1.60.0 to 1.61.0.
- [Release notes](https://github.com/googleapis/python-aiplatform/releases)
- [Changelog](https://github.com/googleapis/python-aiplatform/blob/main/CHANGELOG.md)
- [Commits](https://github.com/googleapis/python-aiplatform/compare/v1.60.0...v1.61.0)

---
updated-dependencies:
- dependency-name: google-cloud-aiplatform
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-09 11:38:39 -04:00
Kavinkumar 8bb9e37c46 Clear history at the beginning of a new task🧹 (#3285)
Co-authored-by: tobitege <tobitege@gmx.de>
2024-08-09 11:38:23 -04:00
dependabot[bot] 6854d38707 chore(deps): bump boto3 from 1.34.156 to 1.34.157 (#3317)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.156 to 1.34.157.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.156...1.34.157)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-09 11:36:46 -04:00
dependabot[bot] e57d14bfc8 chore(deps): bump litellm from 1.43.3 to 1.43.4 (#3320)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.3 to 1.43.4.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.3...v1.43.4)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-09 11:36:04 -04:00
dependabot[bot] e9ee8f1cb9 chore(deps): bump json-repair from 0.26.0 to 0.27.0 (#3321)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.26.0...0.27.0)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-09 11:35:52 -04:00
dependabot[bot] 985b16a459 chore(deps-dev): bump ruff from 0.5.6 to 0.5.7 (#3322)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.6 to 0.5.7.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.6...0.5.7)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-09 11:35:28 -04:00
tofarr f8c815279d Bug fix drag resize (#3307)
* Fix issue where mouse drag fails on first attempt

* Detach event correctly

* Use old variable names

---------

Co-authored-by: Tim O'Farrell <tofarr@Tims-MacBook-Pro-2.local>
Co-authored-by: tobitege <tobitege@gmx.de>
2024-08-09 18:31:29 +03:00
tobitege d2a8ff0918 fix double quotes around env vars typo in ghcr.yml (#3313) 2024-08-09 14:22:10 +00:00
Graham Neubig b4db5b9cae Remove some obsolete examples of persist_sandbox in the doc (#3312) 2024-08-09 14:06:55 +00:00
tobitege 5e6fd58f56 try to fix DOCKER_IMAGE_TAG in ghcr_push_runtime (od_runtime, arm64) (#3311) 2024-08-09 12:02:43 +00:00
tobitege ac7badc236 (fix) revert #3240 ignore_paths in ghcr.yml (#3308) 2024-08-09 11:06:39 +02:00
Xingyao Wang 2e6b08db4f fix: workspace folder permission & app container cannot access client API (#3300)
* also copy over pyproject and poetry lock

* add missing readme

* remove extra git config init since it is already done in client.py

* only chown if the /workspace dir does not exists

* Revert "remove extra git config init since it is already done in client.py"

This reverts commit e8556cd76d.

* remove extra git config init since it is already done in client.py

* fix test runtime

* print container log while reconnecting

* print log in more readable format

* print log in more readable format

* increase lines

* clean up sandbox and ssh related stuff

* remove ssh hostname

* remove ssh hostname

* fix docker app cannot access runtime API issue

* remove ssh password

* API HOSTNAME should be pre-fixed with SANDBOX

* update config

* fix typo that breaks the test
2024-08-08 19:28:34 -04:00
dependabot[bot] ddd2565035 chore(deps-dev): bump @tailwindcss/typography in /frontend (#3294)
Bumps [@tailwindcss/typography](https://github.com/tailwindlabs/tailwindcss-typography) from 0.5.13 to 0.5.14.
- [Release notes](https://github.com/tailwindlabs/tailwindcss-typography/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss-typography/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss-typography/compare/v0.5.13...v0.5.14)

---
updated-dependencies:
- dependency-name: "@tailwindcss/typography"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 18:25:52 -04:00
dependabot[bot] 81db5aefc7 chore(deps): bump vite from 5.3.5 to 5.4.0 in /frontend (#3295)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.3.5 to 5.4.0.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/create-vite@5.4.0/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 18:25:36 -04:00
dependabot[bot] 512f56ea80 chore(deps-dev): bump tailwindcss from 3.4.7 to 3.4.9 in /frontend (#3297)
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss) from 3.4.7 to 3.4.9.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/v3.4.9/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/compare/v3.4.7...v3.4.9)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 18:25:24 -04:00
Xingyao Wang a5195b0e65 chore: clean up sandbox and ssh related configs (#3301)
* clean up sandbox and ssh related stuff

* remove ssh hostname

* remove ssh hostname

* remove ssh password

* update config

* fix typo that breaks the test
2024-08-08 22:15:40 +00:00
tofarr 040b9cb75c Chore Readme updates (#3302)
* Readme updates

Added explicit installation instructions to server and frontend README

* Documentation update

* WIP

* WIP

---------

Co-authored-by: Tim O'Farrell <tofarr@Tims-MacBook-Pro-2.local>
2024-08-08 18:06:58 -04:00
Xingyao Wang 4915168da2 fix: copy over pyproject and poetry lock for App docker (#3299)
* also copy over pyproject and poetry lock

* add missing readme
2024-08-08 21:06:26 +00:00
dependabot[bot] 8a008773ee chore(deps-dev): bump python-pptx from 1.0.1 to 1.0.2 (#3292)
Bumps [python-pptx](https://github.com/scanny/python-pptx) from 1.0.1 to 1.0.2.
- [Changelog](https://github.com/scanny/python-pptx/blob/master/HISTORY.rst)
- [Commits](https://github.com/scanny/python-pptx/compare/v1.0.1...v1.0.2)

---
updated-dependencies:
- dependency-name: python-pptx
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 23:49:45 +08:00
dependabot[bot] 9f18172982 chore(deps): bump zope-interface from 7.0 to 7.0.1 (#3276)
Bumps [zope-interface](https://github.com/zopefoundation/zope.interface) from 7.0 to 7.0.1.
- [Changelog](https://github.com/zopefoundation/zope.interface/blob/master/CHANGES.rst)
- [Commits](https://github.com/zopefoundation/zope.interface/compare/7.0...7.0.1)

---
updated-dependencies:
- dependency-name: zope-interface
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-08-08 23:49:34 +08:00
dependabot[bot] 7bd4af80f0 chore(deps): bump boto3 from 1.34.155 to 1.34.156 (#3293)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.155 to 1.34.156.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.155...1.34.156)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 23:49:23 +08:00
dependabot[bot] c741df58e3 chore(deps-dev): bump openai from 1.40.0 to 1.40.1 (#3291)
Bumps [openai](https://github.com/openai/openai-python) from 1.40.0 to 1.40.1.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.40.0...v1.40.1)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 23:05:39 +08:00
dependabot[bot] 50b6ce8578 chore(deps): bump litellm from 1.43.1 to 1.43.3 (#3290)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.1 to 1.43.3.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.1...v1.43.3)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 23:05:24 +08:00
Graham Neubig f36639be28 Improve listen.py test coverage (#3289)
* Add unit tests for listen.py

* Added new tests

* Improve test coverage for listen.py

* Update tests

---------

Co-authored-by: opendevin <opendevin@all-hands.dev>
2024-08-08 14:25:12 +00:00
Xingyao Wang db302fd33c fix: dubious ownership when running git (#3282)
* switch default to eventstream runtime

* remove pull docker from makefile

* fix unittest

* fix file store path

* try deprecate server runtime

* remove persist sandbox

* move file utils

* remove server runtime related workflow

* remove unused method

* attempt to remove the reliance on filestore for BE

* fix async for list file

* fix list_files to post

* fix list files

* add suffix to directory

* make sure list file returns abs path;
make sure other backend endpoints accpets abs path

* remove server runtime test workflow

* set git config in runtime

* chown for workspace in client;
use INIT_COMMANDS to maintain all commands that need to be run before bash start;

* fix client issue;
add test case for git;

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-08 13:14:45 +00:00
Xingyao Wang 2669a5378c update video; (#3284)
clean up docs site;
remove faq;
2024-08-08 08:39:32 -04:00
Xingyao Wang 73bd165118 ci: try to fix runtime build when exact hash for runtime image is found (#3272) 2024-08-08 03:46:05 +00:00
dependabot[bot] e922e2666b chore(deps-dev): bump postcss from 8.4.40 to 8.4.41 in /frontend (#3260)
Bumps [postcss](https://github.com/postcss/postcss) from 8.4.40 to 8.4.41.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.40...8.4.41)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-07 22:25:12 -04:00
dependabot[bot] a627fabfc5 chore(deps): bump react-i18next from 15.0.0 to 15.0.1 in /frontend (#3279)
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 15.0.0 to 15.0.1.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v15.0.0...v15.0.1)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 22:24:45 -04:00
Xingyao Wang 90d0a62469 (arch) Switch default runtime to EventStream Runtime (#3271)
* switch default to eventstream runtime

* remove pull docker from makefile

* fix unittest

* fix file store path

* try deprecate server runtime

* remove persist sandbox

* move file utils

* remove server runtime related workflow

* remove unused method

* attempt to remove the reliance on filestore for BE

* fix async for list file

* fix list_files to post

* fix list files

* add suffix to directory

* make sure list file returns abs path;
make sure other backend endpoints accpets abs path

* remove server runtime test workflow

* set git config in runtime
2024-08-08 10:11:49 +08:00
sven 71ad979ffd fixed list rendering (#3273) 2024-08-08 01:50:16 +08:00
Xingyao Wang b30a2dd87a completely remove update_source_code (#3280) 2024-08-07 16:57:11 +00:00
dependabot[bot] 36ae44f6ef chore(deps-dev): bump flake8 from 7.1.0 to 7.1.1 (#3248)
Bumps [flake8](https://github.com/pycqa/flake8) from 7.1.0 to 7.1.1.
- [Commits](https://github.com/pycqa/flake8/compare/7.1.0...7.1.1)

---
updated-dependencies:
- dependency-name: flake8
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-08 00:00:02 +08:00
dependabot[bot] 74264c8902 chore(deps-dev): bump openai from 1.39.0 to 1.40.0 (#3278)
Bumps [openai](https://github.com/openai/openai-python) from 1.39.0 to 1.40.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.39.0...v1.40.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 23:59:21 +08:00
dependabot[bot] 33ad4e27ee chore(deps): bump litellm from 1.43.0 to 1.43.1 (#3274)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.43.0 to 1.43.1.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.43.0...v1.43.1)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 23:59:08 +08:00
dependabot[bot] 069b40e8b4 chore(deps-dev): bump sympy from 1.12.1 to 1.13.1 (#3277)
Bumps [sympy](https://github.com/sympy/sympy) from 1.12.1 to 1.13.1.
- [Release notes](https://github.com/sympy/sympy/releases)
- [Commits](https://github.com/sympy/sympy/compare/sympy-1.12.1...sympy-1.13.1)

---
updated-dependencies:
- dependency-name: sympy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 23:58:57 +08:00
dependabot[bot] 9c39c8b250 chore(deps): bump boto3 from 1.34.154 to 1.34.155 (#3275)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.154 to 1.34.155.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.154...1.34.155)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-07 23:57:44 +08:00
mamoodi 2915679adc Make all workflows consistent formatting and some more comments (#3270) 2024-08-07 02:00:15 +00:00
Xingyao Wang c941d028b9 make sure mermaid flowchart is displayed correctly (#3269) 2024-08-07 09:27:17 +08:00
dependabot[bot] 60e11b0dd2 chore(deps-dev): bump ruff from 0.5.5 to 0.5.6 (#3251)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.5...0.5.6)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-08-06 15:13:34 -07:00
Xingyao Wang bb66f15ff6 [Arch] Streamline EventStream Runtime Image Building Logic (#3259)
* remove nocache

* simplify runtime build to use hash & always update source

* style

* try to fix temp folder issue

* fix rm tree

* create build folder first (to get correct hash), then copy it over to actual build folder

* fix assert

* fix indentation

* fix copy over

* add runtime documentation

* fix runtime docs

* fix typo

* Update docs/modules/usage/runtime.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update docs/modules/usage/runtime.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-07 06:09:38 +08:00
Xingyao Wang a22ee6656b [docs] Clean up evaluation docs (#3268)
* remove llm copy pasta

* add emoji
2024-08-07 05:05:45 +08:00
Xingyao Wang 7270d21cf9 update documentation for evaluation tutorial 2024-08-06 14:55:42 -04:00
dependabot[bot] 9c44d94cef chore(deps-dev): bump lint-staged from 15.2.7 to 15.2.8 in /frontend (#3255)
Bumps [lint-staged](https://github.com/lint-staged/lint-staged) from 15.2.7 to 15.2.8.
- [Release notes](https://github.com/lint-staged/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/lint-staged/lint-staged/compare/v15.2.7...v15.2.8)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 13:26:15 -04:00
dependabot[bot] a8ed63ef23 chore(deps-dev): bump autoprefixer from 10.4.19 to 10.4.20 in /frontend (#3254)
Bumps [autoprefixer](https://github.com/postcss/autoprefixer) from 10.4.19 to 10.4.20.
- [Release notes](https://github.com/postcss/autoprefixer/releases)
- [Changelog](https://github.com/postcss/autoprefixer/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/autoprefixer/compare/10.4.19...10.4.20)

---
updated-dependencies:
- dependency-name: autoprefixer
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 13:26:01 -04:00
Xingyao Wang 31b244f95e [Refactor, Evaluation] Refactor and clean up evaluation harness to remove global config and use EventStreamRuntime (#3230)
* move multi-line bash tests to test_runtime;
support multi-line bash for esruntime;

* add testcase to handle PS2 prompt

* use bashlex for bash parsing to handle multi-line commands;
add testcases for multi-line commands

* revert ghcr runtime change

* Apply stash

* fix run as other user;
make test async;

* fix test runtime for run as od

* add run-as-devin to all the runtime tests

* handle the case when username is root

* move all run-as-devin tests from sandbox;
only tests a few cases on different user to save time;

* move over multi-line echo related tests to test_runtime

* fix user-specific jupyter by fixing the pypoetry virtualenv folder

* make plugin's init async;
chdir at initialization of jupyter plugin;
move ipy simple testcase to test runtime;

* support agentskills import in
move tests for jupyter pwd tests;
overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter;
make agentskills read env var lazily, in case env var is updated;

* fix ServerRuntime agentskills issue

* move agnostic image test to test_runtime

* merge runtime tests in CI

* fix enable auto lint as env var

* update warning message

* update warning message

* test for different container images

* change parsing output as debug

* add exception handling for update_pwd_decorator

* fix unit test indentation

* add plugins as default input to Runtime class;
remove init_sandbox_plugins;
implement add_env_var (include jupyter) in the base class;

* fix server runtime auto lint

* Revert "add exception handling for update_pwd_decorator"

This reverts commit 2b668b1506.

* tries to print debugging info for agentskills

* explictly setting uid (try fix permission issue)

* Revert "tries to print debugging info for agentskills"

This reverts commit 8be4c86756.

* set sandbox user id during testing to hopefully fix the permission issue

* add browser tools for server runtime

* try to debug for old pwd

* update debug cmd

* only test agnostic runtime when TEST_RUNTIME is Server

* fix temp dir mkdir

* load TEST_RUNTIME at the beginning

* remove ipython tests

* only log to file when DEBUG

* default logging to project root

* temporarily remove log to file

* fix LLM logger dir

* fix logger

* make set pwd an optional aux action

* fix prev pwd

* fix infinity recursion

* simplify

* do not import the whole od library to avoid logger folder by jupyter

* fix browsing

* increase timeout

* attempt to fix agentskills yet again

* clean up in testcases, since CI maybe run as non-root

* add _cause attribute for event.id

* remove parent

* add a bunch of debugging statement again for CI :(

* fix temp_dir fixture

* change all temp dir to follow pytest's tmp_path_factory

* remove extra bracket

* clean up error printing a bit

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* add typing for tmp dir fixture

* clear the directory before running the test to avoid weird CI temp dir

* remove agnostic test case for server runtime

* Revert "remove agnostic test case for server runtime"

This reverts commit 30e2181c3f.

* disable agnostic tests in CI

* fix test

* make sure plugin arg is not passed when no plugin is specified;
remove redundant on_event function;

* move mock prompt

* rename runtime

* remove extra logging

* refactor run_controller's interface;
support multiple runtime for integration test;
filter out hostname for prompt

* uncomment other tests

* pass the right runtime to controller

* log runtime when start

* uncomment tests

* improve symbol filters

* add intergration test prompts that seemd ok

* add integration test workflow

* add python3 to default ubuntu image

* symlink python and fix permission to jupyter pip

* add retry for jupyter execute server

* fix jupyter pip install;
add post-process for jupyter pip install;
simplify init by add agent_skills path to PYTHONPATH;
add testcase to tests jupyter pip install;

* fix bug

* use ubuntu:22.04 for eventstream integration tests

* add todo

* update testcase

* remove redundant code

* fix unit test

* reduce dependency for runtime

* try making llama-index an optional dependency that's not installed by default

* remove pip install since it seemd not needed

* log ipython execution;
await write message since it returns a future

* update ipy testcase

* do not install llama-index in CI

* do not install llama-index in the app docker as well

* set sandbox container image in the integration test script

* log plugins & env var for runtime

* update conftest for sha256

* add git

* remove all non-alphanumeric chalracters

* add working ipy module tests!

* default to use host network

* remove is_async from browser to make thing a little more reliable;
retry loading browser when error;

* add sleep to wait a bit for http server

* kill http server before regenerate browsing tests

* fix browsing

* only set sandbox container image if undefined

* skip empty config value

* update evaluation to use the latest run_controller

* revert logger in execute_server to be compatible with server runtime

* revert logging level to fix jupyter

* set logger level

* revert the logging

* chmod for workspace to fix permission

* support getting timeout from action

* update test for server runtime

* try to fix file permission

* fix test_cmd_run_action_serialization_deserialization test (added timeout)

* poetry: pip 24.2, torch 2.2.2

* revert adding pip to pyproject.toml

* add build to dependencies in pyproject.toml

* forgot poetry lock --no-update

* fix a DelegatorAgent prompt_002.log (timeout)

* fix a DelegatorAgent prompt_003.log (timeout)

* couple more timeout attribs in prompt files

* some more prompt files

* prompts galore

* add clarification comment for timeout

* default timeout to config

* add assert

* update integraton tests for eventstream

* update integration tests

* fix timeout for action<->dict

* remove redundant on_event

* default to use instance image

* update run_controller interface

* add logging for copy

* refactor swe_bench for the new design

* fix action execution timeout

* updatelock

* remove build sandbox locally

* fix runtime

* use plain for-loop for single process

* remove extra print

* get swebench inference working

* print whole `test_result` dict

* got swebench patch post-process working

* update swe-bench evaluation readme

* refactor using shared reset_logger function

* move messy swebench prompt to a different file

* support the ability to specify whether to keep prompt

* support the ability to specify whether to keep prompt

* fix dockerfile

* fix import and remove unnecessary strip logic

* fix action serialization

* get agentbench running

* remove extra ls for agent bench

* fix agentbench metric

* factor out common documentation for eval

* update biocoder doc

* remove swe_env_box since it is no longer needed

* get biocoder working

* add func timeout for bird

* fix jupyter pwd with ~ as user name

* fix jupyter pwd with ~ as user name

* get bird working

* get browsing evaluation working

* make eda runnable

* fix id column

* fix eda run_infer

* unify eval output using a structured format;
make swebench coompatible with that format;
update client source code for every swebench run;
do not inject testcmd for swebench

* standardize existing benchs for the new eval output

* set update source code = true

* get gaia standardized

* fix gaia

* gorilla refactored but stuck at language.so to test

* refactor and make gpqa work

* refactor humanevalfix and get it working

* refactor logic reasoning and get it working

* refactor browser env so it works with eventstream runtime for eval

* add initial version of miniwob refactor

* fix browsergym environment

* get miniwob working!!

* allowing injecting additional dependency to OD runtime docker image

* allowing injecting additional dependency to OD runtime docker image

* support logic reasoning with pre-injected dependency

* get mint working

* update runtime build

* fix mint docker

* add test for keep_prompt;
add missing await close for some tests

* update integration tests for eventstream runtime

* fix integration tests for server runtime

* refactor ml bench and toolqa

* refactor webarena

* fix default factory

* Update run_infer.py

* add APIError to retry

* increase timeout for swebench

* make sure to hide api key when dump eval output

* update the behavior of put source code to put files instead of tarball

* add dishash to dependency

* sendintr when timeout

* fix dockerfile copy

* reduce timeout

* use dirhash to avoid repeat building for update source

* fix runtime_build testcase

* add dir_hash to docker build pipeline

* revert api error

* update poetry lock

* add retries for swebench run infer

* fix git patch

* update poetry lock

* adjust config order

* fix mount volumns

* enforce all eval to use "instance_id"

* remove file store from runtime

* make file_store public inside eventstream

* move the runtime logic inside `main` out

* support using async function for process_instance_fn

* refactor run_infer with the create_time

* fix file store

* Update evaluation/toolqa/utils.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix typo

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: super-dainiu <78588128+super-dainiu@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-08-06 17:21:45 +00:00
dependabot[bot] 9029cd77d3 chore(deps): bump litellm from 1.42.12 to 1.43.0 (#3267)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.42.12 to 1.43.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.42.12...v1.43.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:47:02 +08:00
dependabot[bot] adedfe5e7f chore(deps-dev): bump python-pptx from 1.0.0 to 1.0.1 (#3263)
Bumps [python-pptx](https://github.com/scanny/python-pptx) from 1.0.0 to 1.0.1.
- [Changelog](https://github.com/scanny/python-pptx/blob/master/HISTORY.rst)
- [Commits](https://github.com/scanny/python-pptx/compare/v1.0.0...v1.0.1)

---
updated-dependencies:
- dependency-name: python-pptx
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:46:29 +08:00
dependabot[bot] 43768684d9 chore(deps): bump zope-interface from 6.4.post2 to 7.0 (#3262)
Bumps [zope-interface](https://github.com/zopefoundation/zope.interface) from 6.4.post2 to 7.0.
- [Changelog](https://github.com/zopefoundation/zope.interface/blob/master/CHANGES.rst)
- [Commits](https://github.com/zopefoundation/zope.interface/compare/6.4.post2...7.0)

---
updated-dependencies:
- dependency-name: zope-interface
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:46:17 +08:00
dependabot[bot] 0f6ce9717f chore(deps): bump boto3 from 1.34.153 to 1.34.154 (#3264)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.153 to 1.34.154.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.153...1.34.154)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:46:00 +08:00
dependabot[bot] d5170448e3 chore(deps-dev): bump openai from 1.38.0 to 1.39.0 (#3266)
Bumps [openai](https://github.com/openai/openai-python) from 1.38.0 to 1.39.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.38.0...v1.39.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:45:46 +08:00
dependabot[bot] 91de16a88b chore(deps-dev): bump streamlit from 1.37.0 to 1.37.1 (#3265)
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.37.0 to 1.37.1.
- [Release notes](https://github.com/streamlit/streamlit/releases)
- [Commits](https://github.com/streamlit/streamlit/compare/1.37.0...1.37.1)

---
updated-dependencies:
- dependency-name: streamlit
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 23:45:13 +08:00
dependabot[bot] b4a018ee57 chore(deps-dev): bump python-pptx from 0.6.23 to 1.0.0 (#3252)
Bumps [python-pptx](https://github.com/scanny/python-pptx) from 0.6.23 to 1.0.0.
- [Changelog](https://github.com/scanny/python-pptx/blob/master/HISTORY.rst)
- [Commits](https://github.com/scanny/python-pptx/compare/v0.6.23...v1.0.0)

---
updated-dependencies:
- dependency-name: python-pptx
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 04:13:23 +08:00
dependabot[bot] a26b0b3744 chore(deps-dev): bump openai from 1.37.2 to 1.38.0 (#3249)
Bumps [openai](https://github.com/openai/openai-python) from 1.37.2 to 1.38.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.37.2...v1.38.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 04:12:59 +08:00
dependabot[bot] a6f6350d98 chore(deps): bump litellm from 1.42.9 to 1.42.12 (#3253)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.42.9 to 1.42.12.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.42.9...v1.42.12)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 04:12:45 +08:00
tobitege f561b48f03 Dockerfile for app: re-declare ARG OPEN_DEVIN_BUILD_VERSION (#3257) 2024-08-06 04:12:30 +08:00
Graham Neubig 789f3504a9 Add init_runtime_tools for event stream runtime (#3256) 2024-08-06 01:14:31 +08:00
dependabot[bot] d42347aabb chore(deps): bump boto3 from 1.34.152 to 1.34.153 (#3247)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.152 to 1.34.153.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.152...1.34.153)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-05 23:39:05 +08:00
tobitege 0e1a4e1a6c (backend) Added workflow clean-up.yml to remove old workflows and artifacts (#3244)
* added workflow clean-up.yml to remove old workflows and artifacts

* dispatch-only run for now

* add retention-days of 14 to upload-artifact in ghcr.yml
2024-08-05 09:25:18 +02:00
Xingyao Wang a69120d399 [Arch] Use hash to avoid repeat building EventStreamRuntime image (#3243)
* update the behavior of put source code to put files instead of tarball

* add dishash to dependency

* fix dockerfile copy

* use dirhash to avoid repeat building for update source

* fix runtime_build testcase

* add dir_hash to docker build pipeline

* add additional tests for source directory

* add comment

* clear the assertion by explictly check existing files

* also assert od is a dir
2024-08-05 03:13:32 +00:00
tobitege abec52abfe (fix) Revert #3233; more logging in runtimes (#3236)
* ServerRuntime: config copy in init

* revert #3233 but more logging

* get_box_classes: reset order back to previous version

* 3 logging commands switched to debug (were info)

* runtimes debug output of config on initialization

* removed unneeded logger message from _init_container
2024-08-04 19:13:37 +00:00
Xingyao Wang 6a12a9f83c [Arch, Eval] Allowing injecting additional dependency to OD runtime docker image (#3237)
* allowing injecting additional dependency to OD runtime docker image

* update runtime build

* make `extra_deps` optional str | None
2024-08-04 17:38:56 +00:00
mamoodi 6533b8a00e Remove pyproject workflow and cleanup stale workflow (#3238) 2024-08-05 00:38:37 +08:00
tobitege c2c363b0ec (workflow) ghcr.yml with paths/paths-ignore conditions for pull_request (#3240)
* ghcr.yml with paths/paths-ignore conditions for pull_request

* include evaluation folder

* removed paths, just paths-ignore now

* deploy-docs only for docs

* exclude evaluation folder

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>

---------

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-08-04 16:37:07 +02:00
Xingyao Wang 62ce183c2d [Agent Action] Support the ability to specify whether to keep prompt for CmdRun (#3218)
* support the ability to specify whether to keep prompt

* fix action serialization

* fix jupyter pwd with ~ as user name

* add test for keep_prompt;
add missing await close for some tests

* update integration tests for eventstream runtime

* fix integration tests for server runtime
2024-08-04 20:30:25 +08:00
Kaushik Deka 415843476c Feat: Add Vision Input Support for LLM with Vision Capabilities (#2848)
* add image feature

* fix-linting

* check model support for images

* add comment

* Add image support to other models

* Add images to chat

* fix linting

* fix test issues

* refactor variable names and import

* fix tests

* fix chat message tests

* fix linting

* add pydantic class message

* use message

* remove redundant comments

* remove redundant comments

* change Message class

* remove unintended change

* fix integration tests using regenerate.sh

* rename image_bas64 to images_url, fix tests

* rename Message.py to message, change reminder append logic, add unit tests

* remove comment, fix error to merge

* codeact_swe_agent

* fix f string

* update eventstream integration tests

* add missing if check in codeact_swe_agent

* update integration tests

* Update frontend/src/components/chat/ChatInput.tsx

* Update frontend/src/components/chat/ChatInput.tsx

* Update frontend/src/components/chat/ChatInput.tsx

* Update frontend/src/components/chat/ChatInput.tsx

* Update frontend/src/components/chat/ChatMessage.tsx

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-08-04 02:26:22 +08:00
Xingyao Wang b7061f4497 [Eval, Browser] Refactor Browser Env so it works with EventStreamRuntime for Browsing Evaluation (#3235)
* refactor browser env so it works with eventstream runtime for eval

* fix browsergym environment
2024-08-03 15:06:37 +00:00
dependabot[bot] 948b9266ae chore(deps): bump boto3 from 1.34.151 to 1.34.152 (#3221)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.151 to 1.34.152.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.151...1.34.152)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-03 00:30:19 -04:00
Graham Neubig 24c87e2d84 Update slack link (#3234) 2024-08-03 06:07:28 +02:00
tobitege 1166b0e610 client runtime: fix config passing on init; added logging (#3233) 2024-08-03 10:37:38 +08:00
dependabot[bot] a1fec393ac chore(deps-dev): bump @types/node from 22.0.2 to 22.1.0 in /frontend (#3228)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 22.0.2 to 22.1.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-02 15:17:59 -07:00
dependabot[bot] 327dc23064 chore(deps): bump uvicorn from 0.30.4 to 0.30.5 (#3225)
Bumps [uvicorn](https://github.com/encode/uvicorn) from 0.30.4 to 0.30.5.
- [Release notes](https://github.com/encode/uvicorn/releases)
- [Changelog](https://github.com/encode/uvicorn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/encode/uvicorn/compare/0.30.4...0.30.5)

---
updated-dependencies:
- dependency-name: uvicorn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-03 01:53:37 +08:00
dependabot[bot] d0fc68ae4e chore(deps): bump json-repair from 0.25.3 to 0.26.0 (#3223)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.25.3 to 0.26.0.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.25.3...0.26.0)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-03 01:53:21 +08:00
dependabot[bot] f7a03b80d1 chore(deps): bump litellm from 1.42.7 to 1.42.9 (#3222)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.42.7 to 1.42.9.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.42.7...v1.42.9)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-03 01:52:53 +08:00
dependabot[bot] 778d8ded19 chore(deps): bump fastapi from 0.111.1 to 0.112.0 (#3224)
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.111.1 to 0.112.0.
- [Release notes](https://github.com/fastapi/fastapi/releases)
- [Commits](https://github.com/fastapi/fastapi/compare/0.111.1...0.112.0)

---
updated-dependencies:
- dependency-name: fastapi
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-03 01:52:14 +08:00
dependabot[bot] d8700e8f4b chore(deps-dev): bump openai from 1.37.1 to 1.37.2 (#3226)
Bumps [openai](https://github.com/openai/openai-python) from 1.37.1 to 1.37.2.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.37.1...v1.37.2)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-03 01:50:55 +08:00
Xingyao Wang 69ecde640b Update integration tests README.md (#3227)
* Update README.md

* lint
2024-08-02 17:29:11 +00:00
Xingyao Wang 105f0ffed5 bump swebench version (#3216) 2024-08-02 10:13:10 +08:00
Xingyao Wang 001195a3ea reduce the duplication in run_controller (#3217) 2024-08-02 10:12:34 +08:00
dependabot[bot] 8b4ad35cda chore(deps): bump grep-ast from 0.3.2 to 0.3.3 (#3192)
Bumps [grep-ast](https://github.com/paul-gauthier/grep-ast) from 0.3.2 to 0.3.3.
- [Commits](https://github.com/paul-gauthier/grep-ast/commits)

---
updated-dependencies:
- dependency-name: grep-ast
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-01 23:14:27 +00:00
dependabot[bot] c042f7beea chore(deps-dev): bump llama-index-embeddings-ollama from 0.1.2 to 0.1.3 (#3206)
Bumps llama-index-embeddings-ollama from 0.1.2 to 0.1.3.

---
updated-dependencies:
- dependency-name: llama-index-embeddings-ollama
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-01 23:02:24 +00:00
dependabot[bot] 2375c69be5 chore(deps): bump uvicorn from 0.30.3 to 0.30.4 (#3207)
Bumps [uvicorn](https://github.com/encode/uvicorn) from 0.30.3 to 0.30.4.
- [Release notes](https://github.com/encode/uvicorn/releases)
- [Changelog](https://github.com/encode/uvicorn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/encode/uvicorn/compare/0.30.3...0.30.4)

---
updated-dependencies:
- dependency-name: uvicorn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-02 06:13:15 +08:00
dependabot[bot] 0eefe6dbb6 chore(deps): bump litellm from 1.42.5 to 1.42.7 (#3209)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.42.5 to 1.42.7.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.42.5...v1.42.7)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-08-02 06:12:51 +08:00
Xingyao Wang 4f0a454ed6 [Arch] Support integration tests using EventStream Runtime (#3184)
* Remove global config from memory

* Remove runtime global config

* Remove from storage

* Remove global config

* Fix event stream tests

* Fix sandbox issue

* Change config

* Removed transferred tests

* Add swe env box

* Fixes on testing

* Fixed some tests

* Merge with stashed changes

* Fix typing

* Fix ipython test

* Revive function

* Make temp_dir fixture

* Remove test to avoid circular import

* fix eventstream filestore for test_runtime

* fix parse arg issue that cause integration test to fail

* support swebench pull from custom namespace

* add back simple tests for runtime

* move multi-line bash tests to test_runtime;
support multi-line bash for esruntime;

* add testcase to handle PS2 prompt

* use bashlex for bash parsing to handle multi-line commands;
add testcases for multi-line commands

* revert ghcr runtime change

* Apply stash

* fix run as other user;
make test async;

* fix test runtime for run as od

* add run-as-devin to all the runtime tests

* handle the case when username is root

* move all run-as-devin tests from sandbox;
only tests a few cases on different user to save time;

* move over multi-line echo related tests to test_runtime

* fix user-specific jupyter by fixing the pypoetry virtualenv folder

* make plugin's init async;
chdir at initialization of jupyter plugin;
move ipy simple testcase to test runtime;

* support agentskills import in
move tests for jupyter pwd tests;
overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter;
make agentskills read env var lazily, in case env var is updated;

* fix ServerRuntime agentskills issue

* move agnostic image test to test_runtime

* merge runtime tests in CI

* fix enable auto lint as env var

* update warning message

* update warning message

* test for different container images

* change parsing output as debug

* add exception handling for update_pwd_decorator

* fix unit test indentation

* add plugins as default input to Runtime class;
remove init_sandbox_plugins;
implement add_env_var (include jupyter) in the base class;

* fix server runtime auto lint

* Revert "add exception handling for update_pwd_decorator"

This reverts commit 2b668b1506.

* tries to print debugging info for agentskills

* explictly setting uid (try fix permission issue)

* Revert "tries to print debugging info for agentskills"

This reverts commit 8be4c86756.

* set sandbox user id during testing to hopefully fix the permission issue

* add browser tools for server runtime

* try to debug for old pwd

* update debug cmd

* only test agnostic runtime when TEST_RUNTIME is Server

* fix temp dir mkdir

* load TEST_RUNTIME at the beginning

* remove ipython tests

* only log to file when DEBUG

* default logging to project root

* temporarily remove log to file

* fix LLM logger dir

* fix logger

* make set pwd an optional aux action

* fix prev pwd

* fix infinity recursion

* simplify

* do not import the whole od library to avoid logger folder by jupyter

* fix browsing

* increase timeout

* attempt to fix agentskills yet again

* clean up in testcases, since CI maybe run as non-root

* add _cause attribute for event.id

* remove parent

* add a bunch of debugging statement again for CI :(

* fix temp_dir fixture

* change all temp dir to follow pytest's tmp_path_factory

* remove extra bracket

* clean up error printing a bit

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* add typing for tmp dir fixture

* clear the directory before running the test to avoid weird CI temp dir

* remove agnostic test case for server runtime

* Revert "remove agnostic test case for server runtime"

This reverts commit 30e2181c3f.

* disable agnostic tests in CI

* fix test

* make sure plugin arg is not passed when no plugin is specified;
remove redundant on_event function;

* move mock prompt

* rename runtime

* remove extra logging

* refactor run_controller's interface;
support multiple runtime for integration test;
filter out hostname for prompt

* uncomment other tests

* pass the right runtime to controller

* log runtime when start

* uncomment tests

* improve symbol filters

* add intergration test prompts that seemd ok

* add integration test workflow

* add python3 to default ubuntu image

* symlink python and fix permission to jupyter pip

* add retry for jupyter execute server

* fix jupyter pip install;
add post-process for jupyter pip install;
simplify init by add agent_skills path to PYTHONPATH;
add testcase to tests jupyter pip install;

* fix bug

* use ubuntu:22.04 for eventstream integration tests

* add todo

* update testcase

* remove redundant code

* fix unit test

* reduce dependency for runtime

* try making llama-index an optional dependency that's not installed by default

* remove pip install since it seemd not needed

* log ipython execution;
await write message since it returns a future

* update ipy testcase

* do not install llama-index in CI

* do not install llama-index in the app docker as well

* set sandbox container image in the integration test script

* log plugins & env var for runtime

* update conftest for sha256

* add git

* remove all non-alphanumeric chalracters

* add working ipy module tests!

* default to use host network

* remove is_async from browser to make thing a little more reliable;
retry loading browser when error;

* add sleep to wait a bit for http server

* kill http server before regenerate browsing tests

* fix browsing

* only set sandbox container image if undefined

* skip empty config value

* update evaluation to use the latest run_controller

* revert logger in execute_server to be compatible with server runtime

* revert logging level to fix jupyter

* set logger level

* revert the logging

* chmod for workspace to fix permission

* support getting timeout from action

* update test for server runtime

* try to fix file permission

* fix test_cmd_run_action_serialization_deserialization test (added timeout)

* poetry: pip 24.2, torch 2.2.2

* revert adding pip to pyproject.toml

* add build to dependencies in pyproject.toml

* forgot poetry lock --no-update

* fix a DelegatorAgent prompt_002.log (timeout)

* fix a DelegatorAgent prompt_003.log (timeout)

* couple more timeout attribs in prompt files

* some more prompt files

* prompts galore

* add clarification comment for timeout

* default timeout to config

* add assert

* update integraton tests for eventstream

* update integration tests

* fix timeout for action<->dict

* remove redundant on_event

* fix action execution timeout

* updatelock

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: tobitege <tobitege@gmx.de>
2024-08-01 22:07:39 +00:00
Robert Brennan 7ebbe10b1c Add pyjwt to pyproject (#3210)
* add pyjwt to pyproject

* Update pyproject.toml

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* added "build" to pyproject.toml

* lock

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: tobitege <tobitege@gmx.de>
2024-08-02 04:42:14 +08:00
tobitege a4cb880699 (feat) LLM class: added acompletion and streaming + unit test (#3202)
* LLM class: added acompletion and streaming, unit test test_acompletion.py

* LLM: cleanup of self.config defaults and their use

* added set_missing_attributes to LLMConfig

* move default checker up
2024-08-01 22:41:40 +02:00
Robert Brennan 8d11e0eac9 better zero state for file list (#3213)
* better zero state for file list

* Lint, cleanup, and update branch

---------

Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2024-08-01 20:04:47 +00:00
Xingyao Wang 286f10053e [arch] Implement copy_to for Runtime (#3211)
* add copy to

* implement for ServerRuntime

* implement copyto for runtime (required by eval);
add tests for copy to

* fix exist file check

* unify copy_to_behavior and fix stuff
2024-08-02 02:46:11 +08:00
mamoodi d5d7c18858 Release 0.8.3 (#3212) 2024-08-01 18:20:35 +00:00
dependabot[bot] 678b4a76be chore(deps-dev): bump @typescript-eslint/eslint-plugin from 7.17.0 to 7.18.0 in /frontend (#3181)
* chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend

Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 7.17.0 to 7.18.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.18.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update frontend/package.json

* Update frontend/package.json

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2024-08-01 17:24:47 +00:00
Xingyao Wang 2e60d25eae [Agent, LLM] Make sure codeact agent produce message in u/a/u/a order (#3193)
* make sure codeact agent produce message in u/a/u/a order

* integration tests

* sync message changes to codeact swe

* fix integration tests

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-08-02 00:17:53 +08:00
dependabot[bot] 0627af8578 chore(deps-dev): bump @types/node from 22.0.0 to 22.0.2 in /frontend (#3205) 2024-08-02 00:04:47 +08:00
Engel Nyst 21ea9953b3 don't use realpath with non-existent files (#3200) 2024-08-01 01:11:22 +02:00
tobitege 70dd705418 Fix: apply config arguments for miniwob get_sandbox() from loaded config (#3198) 2024-07-31 19:38:15 +00:00
tobitege b049bc9688 update custom sandbox guide with nikolaik image consideration (#3197) 2024-07-31 15:15:40 -04:00
Xingyao Wang 1d49ef253b [Runtime] Reduce dependency to speed up CI and reduce image size (#3195)
* reduce dependency for runtime

* try making llama-index an optional dependency that's not installed by default

* do not install llama-index in CI

* do not install llama-index in the app docker as well
2024-07-31 13:55:09 -04:00
tobitege 938ed027c2 (fix) test_runtime.py parametrization for box_class (#3186)
* fix test_runtime.py parametrization; prevent duplicate test runs

* trivial file change to unblock stuck CI workflow

* fix print_method_name fixture in test_runtime (yield was missing)

* revert wrong param fixtures
2024-08-01 01:30:10 +08:00
dependabot[bot] 0cf4e1ecf3 chore(deps): bump boto3 from 1.34.150 to 1.34.151 (#3191)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.150 to 1.34.151.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.150...1.34.151)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-31 15:51:49 +00:00
Engel Nyst 93433fa849 pass swe-bench box config parameter (#3189) 2024-07-31 15:31:50 +00:00
dependabot[bot] 9fc522a610 chore(deps-dev): bump mypy from 1.11.0 to 1.11.1 (#3190)
Bumps [mypy](https://github.com/python/mypy) from 1.11.0 to 1.11.1.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/python/mypy/compare/v1.11...v1.11.1)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-31 15:31:28 +00:00
Engel Nyst d41699c133 rename to UserRejectObservation (#3175) 2024-07-31 22:44:31 +08:00
Graham Neubig a562a7ac7d Add unit tests for LLM init function (#3188)
* Add unit tests for LLM init function

* Fix formatting

---------

Co-authored-by: OpenDevin <opendevin@all-hands.dev>
2024-07-31 16:28:50 +02:00
mamoodi 5f177b6f88 Add toggle to enable/disable agent selection - default is agent selection is off (#3174)
* Hide agent selection and always use CodeActAgent

* Revert changes

* Add toggle to enable agent selection

* Refactor, simplify, and update tests

* Update frontend/src/components/modals/settings/SettingsForm.test.tsx

---------

Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2024-07-31 11:15:34 +00:00
Xingyao Wang bd68249fba [Arch] Test EventStreamRuntime to ensure its feature parity with ServerRuntime (#3157)
* Remove global config from memory

* Remove runtime global config

* Remove from storage

* Remove global config

* Fix event stream tests

* Fix sandbox issue

* Change config

* Removed transferred tests

* Add swe env box

* Fixes on testing

* Fixed some tests

* Merge with stashed changes

* Fix typing

* Fix ipython test

* Revive function

* Make temp_dir fixture

* Remove test to avoid circular import

* fix eventstream filestore for test_runtime

* fix parse arg issue that cause integration test to fail

* support swebench pull from custom namespace

* add back simple tests for runtime

* move multi-line bash tests to test_runtime;
support multi-line bash for esruntime;

* add testcase to handle PS2 prompt

* use bashlex for bash parsing to handle multi-line commands;
add testcases for multi-line commands

* revert ghcr runtime change

* Apply stash

* fix run as other user;
make test async;

* fix test runtime for run as od

* add run-as-devin to all the runtime tests

* handle the case when username is root

* move all run-as-devin tests from sandbox;
only tests a few cases on different user to save time;

* move over multi-line echo related tests to test_runtime

* fix user-specific jupyter by fixing the pypoetry virtualenv folder

* make plugin's init async;
chdir at initialization of jupyter plugin;
move ipy simple testcase to test runtime;

* support agentskills import in
move tests for jupyter pwd tests;
overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter;
make agentskills read env var lazily, in case env var is updated;

* fix ServerRuntime agentskills issue

* move agnostic image test to test_runtime

* merge runtime tests in CI

* fix enable auto lint as env var

* update warning message

* update warning message

* test for different container images

* change parsing output as debug

* add exception handling for update_pwd_decorator

* fix unit test indentation

* add plugins as default input to Runtime class;
remove init_sandbox_plugins;
implement add_env_var (include jupyter) in the base class;

* fix server runtime auto lint

* Revert "add exception handling for update_pwd_decorator"

This reverts commit 2b668b1506.

* tries to print debugging info for agentskills

* explictly setting uid (try fix permission issue)

* Revert "tries to print debugging info for agentskills"

This reverts commit 8be4c86756.

* set sandbox user id during testing to hopefully fix the permission issue

* add browser tools for server runtime

* try to debug for old pwd

* update debug cmd

* only test agnostic runtime when TEST_RUNTIME is Server

* fix temp dir mkdir

* load TEST_RUNTIME at the beginning

* remove ipython tests

* only log to file when DEBUG

* default logging to project root

* temporarily remove log to file

* fix LLM logger dir

* fix logger

* make set pwd an optional aux action

* fix prev pwd

* fix infinity recursion

* simplify

* do not import the whole od library to avoid logger folder by jupyter

* fix browsing

* increase timeout

* attempt to fix agentskills yet again

* clean up in testcases, since CI maybe run as non-root

* add _cause attribute for event.id

* remove parent

* add a bunch of debugging statement again for CI :(

* fix temp_dir fixture

* change all temp dir to follow pytest's tmp_path_factory

* remove extra bracket

* clean up error printing a bit

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization

* add typing for tmp dir fixture

* clear the directory before running the test to avoid weird CI temp dir

* remove agnostic test case for server runtime

* Revert "remove agnostic test case for server runtime"

This reverts commit 30e2181c3f.

* disable agnostic tests in CI

* fix test

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-07-31 04:30:59 +08:00
dependabot[bot] c8fd039173 chore(deps-dev): bump @typescript-eslint/parser in /frontend (#3180)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.17.0 to 7.18.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.18.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 23:26:28 +03:00
dependabot[bot] c0687db1af chore(deps-dev): bump husky from 9.1.3 to 9.1.4 in /frontend (#3179)
Bumps [husky](https://github.com/typicode/husky) from 9.1.3 to 9.1.4.
- [Release notes](https://github.com/typicode/husky/releases)
- [Commits](https://github.com/typicode/husky/compare/v9.1.3...v9.1.4)

---
updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 23:26:05 +03:00
dependabot[bot] 2953438c98 chore(deps): bump boto3 from 1.34.149 to 1.34.150 (#3182)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.149 to 1.34.150.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.149...1.34.150)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 18:14:09 +02:00
dependabot[bot] 29258cd62a chore(deps-dev): bump @types/node from 20.14.12 to 22.0.0 in /frontend (#3165)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.12 to 22.0.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 10:20:36 +03:00
dependabot[bot] 0efd1c7a87 chore(deps): bump @reduxjs/toolkit from 2.2.6 to 2.2.7 in /frontend (#3164)
Bumps [@reduxjs/toolkit](https://github.com/reduxjs/redux-toolkit) from 2.2.6 to 2.2.7.
- [Release notes](https://github.com/reduxjs/redux-toolkit/releases)
- [Commits](https://github.com/reduxjs/redux-toolkit/compare/v2.2.6...v2.2.7)

---
updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-07-30 10:12:04 +03:00
dependabot[bot] 34072102d6 chore(deps-dev): bump husky from 9.1.2 to 9.1.3 in /frontend (#3166)
Bumps [husky](https://github.com/typicode/husky) from 9.1.2 to 9.1.3.
- [Release notes](https://github.com/typicode/husky/releases)
- [Commits](https://github.com/typicode/husky/compare/v9.1.2...v9.1.3)

---
updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 10:11:19 +03:00
dependabot[bot] b9b5cf7a61 chore(deps-dev): bump pre-commit from 3.7.1 to 3.8.0 (#3168)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.7.1 to 3.8.0.
- [Release notes](https://github.com/pre-commit/pre-commit/releases)
- [Changelog](https://github.com/pre-commit/pre-commit/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pre-commit/pre-commit/compare/v3.7.1...v3.8.0)

---
updated-dependencies:
- dependency-name: pre-commit
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 01:01:19 +00:00
மனோஜ்குமார் பழனிச்சாமி 84a6e90dc2 Chore: Add line breaks in pull_request_template.md (#2988)
* Update pull_request_template.md

* Update pull_request_template.md
2024-07-29 15:10:33 -04:00
tobitege 2533efabbb (fix) split_bash_commands replaced; temp_dir fixture fix in some tests (#3160)
* split_bash_commands replaced; temp_dir fixture fix in some tests

* tweak test_runtime

* skip 2 tests in test_runtime that need fixing in extra PR

* reverting bash parsing changes and re-enabled tests

* missed to revert a changed assert in test_runtime.py
2024-07-29 17:05:58 +00:00
dependabot[bot] 7cefd32fd0 chore(deps): bump litellm from 1.42.3 to 1.42.5 (#3167)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.42.3 to 1.42.5.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.42.3...v1.42.5)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-29 16:10:31 +00:00
மனோஜ்குமார் பழனிச்சாமி 570fd6e483 Fix: Parse exit code correctly for background commands (#3161)
* Parse exit code correctly for background commands

* Update ssh_box.py
2024-07-29 23:57:58 +08:00
tobitege a52df6a272 (Docker) Fix passing of POETRY_CACHE_DIR into RUN command (#3163)
* Fix RUN command to get correct value of POETRY_CACHE_DIR

* Fix passing of POETRY_CACHE_DIR into RUN command

* remove test copy of Dockerfile
2024-07-29 23:16:46 +08:00
மனோஜ்குமார் பழனிச்சாமி 563ebd406d Fix: Add missing arguments for SSHBox in evaluation (#3075)
* Fix WebArena evaluation script to connect to SSH session

* Update run_infer.py

* Add missing arguments for DockerSSHBox
2024-07-29 23:09:39 +08:00
tobitege 1eb3bdea95 remove unneeded message about config file not found (#3158) 2024-07-29 16:27:14 +02:00
மனோஜ்குமார் பழனிச்சாமி eb182f492e Feat: Dev Container for GitHub Codespaces (#2689)
* Workaround for GitHub Codespaces

* Add devcontainer config

* rename devcontainer folder

* install netcat

* add VS Code Python extension

* apt update

* give executable path to avoid bugs

* configure poetry env

* fix postCreateCommand

* revert executable path

* add postStartCommand

* run in background

* Add Codespaces badge

* add default config

* Add Codespaces badge to doc

* update comment

* apply workaround 2

* refactor

* fix lib path

* Update on_create.sh

* pass env directly to cmd

* resolve unexpected merge conflicts

* Separated to #2850

* Update README.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update codespaces link

* Update README.md

* Separated to #2975

---------

Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-07-28 07:44:18 -04:00
Xingyao Wang b1ea204c5b Migrate multi-line-bash-related sandbox tests into runtime tests and fix multi-line issue (#3128)
* Remove global config from memory

* Remove runtime global config

* Remove from storage

* Remove global config

* Fix event stream tests

* Fix sandbox issue

* Change config

* Removed transferred tests

* Add swe env box

* Fixes on testing

* Fixed some tests

* Merge with stashed changes

* Fix typing

* Fix ipython test

* Revive function

* Make temp_dir fixture

* Remove test to avoid circular import

* fix eventstream filestore for test_runtime

* fix parse arg issue that cause integration test to fail

* support swebench pull from custom namespace

* add back simple tests for runtime

* move multi-line bash tests to test_runtime;
support multi-line bash for esruntime;

* add testcase to handle PS2 prompt

* use bashlex for bash parsing to handle multi-line commands;
add testcases for multi-line commands

* revert ghcr runtime change

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-07-27 20:12:57 +00:00
mamoodi 8b77e8a0ff chore: Release 0.8.2 (#3150) 2024-07-27 17:54:39 +00:00
Engel Nyst 3328669b89 fix Finish action to sent its 'thoughts' in the prompt (#3149) 2024-07-27 17:37:44 +00:00
Engel Nyst 9ed95abf83 Fix max budget per task error in headless mode (#3147)
* set agent in ERROR instead of PAUSED when in headless mode

* fallback to config value for budget
2024-07-27 17:35:40 +00:00
Xingyao Wang b5d3fcaba8 [docs] Update README.md for new OpenDevin Runtime (#3142)
* Update README.md

* address comment

* remove DEBUG flag
2024-07-27 15:45:09 +00:00
Engel Nyst a29c795418 clear last_error when restoring a session (#3146) 2024-07-27 15:34:36 +00:00
Engel Nyst f07280153a restore logging of user messages when using cli (#3145) 2024-07-27 12:58:23 +00:00
tofarr 437e0c76bf Icon transparency (#3138)
I noticed that in dark modes the icons don't look as good as they could - making the outlines transparent makes them pop a bit more

Co-authored-by: Tim O'Farrell <tofarr@Tims-MacBook-Pro-2.local>
2024-07-26 21:29:41 -04:00
tobitege c0adca1e30 fix DummyAgent (#3137) 2024-07-26 18:59:25 +00:00
Xingyao Wang 1c813a2fa0 support swebench pull from custom namespace (#3136) 2024-07-26 18:46:36 +00:00
Graham Neubig 275ea706cf Remove remaining global config (#3099)
* Remove global config from memory

* Remove runtime global config

* Remove from storage

* Remove global config

* Fix event stream tests

* Fix sandbox issue

* Change config

* Removed transferred tests

* Add swe env box

* Fixes on testing

* Fixed some tests

* Fix typing

* Fix ipython test

* Revive function

* Make temp_dir fixture

* Remove test to avoid circular import
2024-07-26 18:43:32 +00:00
Anatolij Vasilev 3301beffec fixed broken shell command (#3135) 2024-07-26 18:36:20 +00:00
dependabot[bot] 6c4cce01a7 chore(deps-dev): bump openai from 1.37.0 to 1.37.1 (#3134)
Bumps [openai](https://github.com/openai/openai-python) from 1.37.0 to 1.37.1.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.37.0...v1.37.1)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-26 17:01:16 +00:00
dependabot[bot] e62a7c08f8 chore(deps): bump boto3 from 1.34.148 to 1.34.149 (#3133)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.148 to 1.34.149.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.148...1.34.149)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-27 00:11:07 +08:00
dependabot[bot] f6317a3607 chore(deps-dev): bump streamlit from 1.36.0 to 1.37.0 (#3129)
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.36.0 to 1.37.0.
- [Release notes](https://github.com/streamlit/streamlit/releases)
- [Commits](https://github.com/streamlit/streamlit/compare/1.36.0...1.37.0)

---
updated-dependencies:
- dependency-name: streamlit
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-27 00:10:32 +08:00
dependabot[bot] c4eb8e9fc8 chore(deps-dev): bump ruff from 0.5.4 to 0.5.5 (#3132)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.4 to 0.5.5.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.4...0.5.5)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-26 16:02:29 +00:00
dependabot[bot] 618b124e0c chore(deps): bump litellm from 1.42.1 to 1.42.3 (#3131)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.42.1 to 1.42.3.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.42.1...v1.42.3)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-26 16:01:45 +00:00
Xingyao Wang 1f6e86c932 Fix(test,CI): runtime build tests (#3126)
* fix runtime build test

* get runtime_build test to run in CI

* move test involving docker from `test_ipython` to `test_sandbox`
2024-07-26 22:53:01 +08:00
Xingyao Wang 99f6f8899d fix filename for image tar (#3121) 2024-07-26 02:38:41 +00:00
Xingyao Wang f28a6db2c6 add missing ext (#3120) 2024-07-26 06:32:14 +08:00
tobitege 22c7bca556 (fix) colima: fix return code handling (followup to #3097) (#3106)
* colima: fix return code handling; added delay before retry; 4 retries

* moved docker context outside of function

* changed delete occurence; added logs output

* removed delete; trying to add more logging

* fix typo

* changed logging to github-style. maybe this finally shows up.

* reverted context; loop now with install+delete and alternating IP

* fix local keyword

* try limactl for creating an instance for IP

* revert IP change attempts

* actually return 0 in start_colima

* moved install out of loop again

* another to avoid duplicate start of colima via limactl

* added --init call for lima.yaml file creation

* dont trust an LLM to give you flags...

* Update run-unit-tests.yml
2024-07-26 03:16:02 +08:00
tobitege d0217b84ef test_runtime: run tests per runtime, not alternating (#3103) 2024-07-26 03:01:50 +08:00
Yufan Song 056b66df65 revert torch version (#3118) 2024-07-25 17:18:17 +00:00
dependabot[bot] 422c3194c4 chore(deps-dev): bump husky from 9.1.1 to 9.1.2 in /frontend (#3117)
Bumps [husky](https://github.com/typicode/husky) from 9.1.1 to 9.1.2.
- [Release notes](https://github.com/typicode/husky/releases)
- [Commits](https://github.com/typicode/husky/compare/v9.1.1...v9.1.2)

---
updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-25 17:12:50 +00:00
dependabot[bot] e3e942370e chore(deps-dev): bump tailwindcss from 3.4.6 to 3.4.7 in /frontend (#3116)
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss) from 3.4.6 to 3.4.7.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/v3.4.7/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/compare/v3.4.6...v3.4.7)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-25 16:56:11 +00:00
dependabot[bot] 006842bb88 chore(deps): bump vite from 5.3.4 to 5.3.5 in /frontend (#3115)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.3.4 to 5.3.5.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.3.5/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-25 16:15:09 +00:00
dependabot[bot] e3b180e702 chore(deps-dev): bump postcss from 8.4.39 to 8.4.40 in /frontend (#3114)
Bumps [postcss](https://github.com/postcss/postcss) from 8.4.39 to 8.4.40.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.39...8.4.40)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-25 16:03:54 +00:00
sp.wack f586911ecf Refactor and test component (#3108) 2024-07-25 18:49:51 +03:00
dependabot[bot] da4dc15e76 chore(deps-dev): bump pytest from 8.3.1 to 8.3.2 (#3113)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.3.1 to 8.3.2.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/8.3.1...8.3.2)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-25 15:44:11 +00:00
dependabot[bot] cef0d13c43 chore(deps): bump boto3 from 1.34.147 to 1.34.148 (#3112)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.147 to 1.34.148.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.147...1.34.148)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-25 15:41:55 +00:00
dependabot[bot] fcaf0d2a40 chore(deps): bump google-cloud-aiplatform from 1.59.0 to 1.60.0 (#3111)
Bumps [google-cloud-aiplatform](https://github.com/googleapis/python-aiplatform) from 1.59.0 to 1.60.0.
- [Release notes](https://github.com/googleapis/python-aiplatform/releases)
- [Changelog](https://github.com/googleapis/python-aiplatform/blob/main/CHANGELOG.md)
- [Commits](https://github.com/googleapis/python-aiplatform/compare/v1.59.0...v1.60.0)

---
updated-dependencies:
- dependency-name: google-cloud-aiplatform
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-25 15:18:20 +00:00
dependabot[bot] 1e2d5b57fd chore(deps-dev): bump torch from 2.2.2 to 2.4.0 (#3110)
Bumps [torch](https://github.com/pytorch/pytorch) from 2.2.2 to 2.4.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v2.2.2...v2.4.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-25 11:10:40 -04:00
dependabot[bot] 2702c09477 chore(deps): bump litellm from 1.41.28 to 1.42.1 (#3109)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.28 to 1.42.1.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.28...v1.42.1)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-25 15:07:48 +00:00
tobitege b71dc38855 ghcr-runtime: no unzip, artifact downloads as-is, without extention (#3105) 2024-07-25 14:20:20 +02:00
tobitege 9576916a49 (fix) Runtime yml missing zip handling (fixes #3101) (#3104)
* fix ghcr_push use of image name

* ghcr-runtime: fix artifact download (zip); removed obsolete show of df
2024-07-25 08:53:27 +00:00
Xingyao Wang 6f345e82f2 Update paper link in README.md (#3102) 2024-07-25 05:11:54 +00:00
tobitege 690290697a fix ghcr_push use of image name (#3101) 2024-07-25 10:53:20 +08:00
Graham Neubig 98276cf733 Change doc title of agent hub (#3100)
This PR changes the title of the `agenthub` doc from "Agent Framework Research" to "Agent Hub".
2024-07-25 01:28:40 +00:00
tobitege 4862661732 (fix) colima: use a docker context specific to runner; prevent duplicate start (#3097)
* colima: use a docker context specific to runner; prevent duplicate start

* updated use of context (for docker, not colima)

* added --ssh to colima start to use TCP instead of socket

* replace --ssh with random port
2024-07-24 13:59:59 -04:00
dependabot[bot] caece67ef8 chore(deps-dev): bump @types/node from 20.14.11 to 20.14.12 in /frontend (#3095) 2024-07-25 00:02:19 +08:00
dependabot[bot] 4a1111d497 chore(deps): bump @react-types/shared from 3.24.0 to 3.24.1 in /frontend (#3094) 2024-07-25 00:02:05 +08:00
dependabot[bot] fe7b5f12a7 chore(deps): bump boto3 from 1.34.146 to 1.34.147 (#3093) 2024-07-25 00:01:51 +08:00
dependabot[bot] 625de5668f chore(deps): bump litellm from 1.41.27 to 1.41.28 (#3092) 2024-07-25 00:01:33 +08:00
tobitege d50a8447ad fix: add llm drop_params parameter to LLMConfig (#2471)
* feat: add drop_params to LLMConfig

* Update opendevin/llm/llm.py

Fix use of unknown method.

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

---------

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2024-07-24 16:25:36 +02:00
Xingyao Wang 405c8a0456 [Arch] Add runtime image build CI & clean up runtime build using jinja2 template (#3055)
* test_runtime_client.py to test _execute_bash()

* runtime_build and runtime tweaks

* fix in docker script

* revert bash changes

* use sandbox_config.update_source_code to control source code update

* add od_version to the sandbox tag

* add doc instruction for update source code

* do not remove whole poetry folder;
add mamba clean

* add missing newlines

* cleanup runtime dockerfile into jinja template

* make prep temp file a separate function;
make that function accessible through cli

* modify `runtime_build.py` so it can generate directory for building docker img

* add dockerfile and sdist of runtime to gitignore since it will be dynamically generated

* add runtime to build

* do not rebuild new image when an `od_runtime` is provided

* use default container_image for testing if possible

* move runtime tests to ghcr runtime workflow

* update docker base dir for runtime

* fix unittest

* fix image name

* fix image name for test case

* rename to make it consistent

---------

Co-authored-by: tobitege <tobitege@gmx.de>
2024-07-24 21:56:12 +08:00
dependabot[bot] 547c510848 chore(deps): bump @react-types/shared from 3.23.1 to 3.24.0 in /frontend (#3082)
Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.1 to 3.24.0.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.1...@react-types/shared@3.24.0)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-24 10:20:39 +03:00
linshaoxin-maker 800e25eac1 Modify codeAct paper link (#3076) 2024-07-23 20:25:54 +00:00
dependabot[bot] d92bbd97d7 chore(deps-dev): bump chromadb from 0.5.4 to 0.5.5 (#3085)
Bumps [chromadb](https://github.com/chroma-core/chroma) from 0.5.4 to 0.5.5.
- [Release notes](https://github.com/chroma-core/chroma/releases)
- [Changelog](https://github.com/chroma-core/chroma/blob/main/RELEASE_PROCESS.md)
- [Commits](https://github.com/chroma-core/chroma/compare/0.5.4...0.5.5)

---
updated-dependencies:
- dependency-name: chromadb
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 20:20:41 +00:00
dependabot[bot] 549032176c chore(deps): bump boto3 from 1.34.145 to 1.34.146 (#3087)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.145 to 1.34.146.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.145...1.34.146)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 20:17:17 +00:00
dependabot[bot] e4319da3f5 chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend (#3081)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 7.16.1 to 7.17.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.17.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 15:48:39 -04:00
dependabot[bot] 61a07a5864 chore(deps-dev): bump @testing-library/jest-dom in /frontend (#3083)
Bumps [@testing-library/jest-dom](https://github.com/testing-library/jest-dom) from 6.4.6 to 6.4.8.
- [Release notes](https://github.com/testing-library/jest-dom/releases)
- [Changelog](https://github.com/testing-library/jest-dom/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/jest-dom/compare/v6.4.6...v6.4.8)

---
updated-dependencies:
- dependency-name: "@testing-library/jest-dom"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-23 15:45:11 -04:00
dependabot[bot] 4ed5967442 chore(deps-dev): bump typescript from 5.5.3 to 5.5.4 in /frontend (#3084)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 5.5.3 to 5.5.4.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release.yml)
- [Commits](https://github.com/Microsoft/TypeScript/compare/v5.5.3...v5.5.4)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-23 15:44:49 -04:00
dependabot[bot] c2c4e9f37f chore(deps): bump litellm from 1.41.25 to 1.41.27 (#3086)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.25 to 1.41.27.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.25...v1.41.27)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Yufan Song: <33971064+yufansong@users.noreply.github.com>
2024-07-23 15:41:44 -04:00
dependabot[bot] 723d2e7c36 chore(deps-dev): bump openai from 1.36.1 to 1.37.0 (#3088)
Bumps [openai](https://github.com/openai/openai-python) from 1.36.1 to 1.37.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.36.1...v1.37.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 15:39:08 -04:00
dependabot[bot] 8a5bd21d77 chore(deps-dev): bump @typescript-eslint/parser in /frontend (#3080)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.16.1 to 7.17.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.17.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 20:45:32 +03:00
dependabot[bot] 79e76e9053 chore(deps-dev): bump typescript from 5.5.3 to 5.5.4 in /docs (#3079)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 5.5.3 to 5.5.4.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release.yml)
- [Commits](https://github.com/Microsoft/TypeScript/compare/v5.5.3...v5.5.4)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 20:45:00 +03:00
dependabot[bot] 606a4b41d5 chore(deps): bump @nextui-org/react from 2.4.5 to 2.4.6 in /frontend (#3059)
Bumps [@nextui-org/react](https://github.com/nextui-org/nextui/tree/HEAD/packages/core/react) from 2.4.5 to 2.4.6.
- [Release notes](https://github.com/nextui-org/nextui/releases)
- [Changelog](https://github.com/nextui-org/nextui/blob/canary/packages/core/react/CHANGELOG.md)
- [Commits](https://github.com/nextui-org/nextui/commits/@nextui-org/react@2.4.6/packages/core/react)

---
updated-dependencies:
- dependency-name: "@nextui-org/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 20:44:38 +03:00
Boxuan Li 445f290beb Validate to_replace in edit_file_by_replace AgentSkill (#3073)
* Validate to_replace in edit_file_by_replace AgentSkill

* Remove redundant replace reminder prompt

* Add unit tests

* Fix prompt
2024-07-22 21:01:35 -07:00
Xingyao Wang 41a8bb3cf1 [eval,fix]: metrics get carried across eval instances (#3072)
* fix: make max_budget_per_task optional in `run_agent_controller`

* update arg for each run infer

* fix: metrics logging carried along; reset llm metric with the agent;

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-07-23 03:30:28 +00:00
Xingyao Wang da17665cab fix: make max_budget_per_task optional in run_agent_controller (#3071)
* fix: make max_budget_per_task optional in `run_agent_controller`

* update arg for each run infer
2024-07-22 21:47:00 -04:00
Graham Neubig 4099e48122 Removed config from agent controller (#3038)
* Removed config from agent controller

* Fix tests

* Increase budget

* Update tests

* Update prompts

* Add missing prompt

* Fix mistaken deletions

* Fix browsing test

* Fixed browse tests
2024-07-22 17:42:57 +00:00
dependabot[bot] c3d4f6495f chore(deps-dev): bump ruff from 0.5.3 to 0.5.4 (#3068)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.3...0.5.4)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-22 17:30:20 +00:00
dependabot[bot] a0f1cd2cdb chore(deps-dev): bump pytest from 8.2.2 to 8.3.1 (#3065)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.2.2 to 8.3.1.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/8.2.2...8.3.1)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-22 17:20:38 +00:00
dependabot[bot] aeed1ea871 chore(deps): bump litellm from 1.41.24 to 1.41.25 (#3064)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.24 to 1.41.25.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.24...v1.41.25)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-22 17:14:58 +00:00
dependabot[bot] cc6128522d chore(deps-dev): bump openai from 1.36.0 to 1.36.1 (#3069)
Bumps [openai](https://github.com/openai/openai-python) from 1.36.0 to 1.36.1.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.36.0...v1.36.1)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-22 17:13:19 +00:00
dependabot[bot] 78e700ef94 chore(deps-dev): bump jsdom from 24.1.0 to 24.1.1 in /frontend (#3057)
Bumps [jsdom](https://github.com/jsdom/jsdom) from 24.1.0 to 24.1.1.
- [Release notes](https://github.com/jsdom/jsdom/releases)
- [Changelog](https://github.com/jsdom/jsdom/blob/main/Changelog.md)
- [Commits](https://github.com/jsdom/jsdom/compare/24.1.0...24.1.1)

---
updated-dependencies:
- dependency-name: jsdom
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-22 12:40:40 -04:00
dependabot[bot] 6c8aae0d12 chore(deps-dev): bump eslint-plugin-react in /frontend (#3060)
Bumps [eslint-plugin-react](https://github.com/jsx-eslint/eslint-plugin-react) from 7.34.4 to 7.35.0.
- [Release notes](https://github.com/jsx-eslint/eslint-plugin-react/releases)
- [Changelog](https://github.com/jsx-eslint/eslint-plugin-react/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jsx-eslint/eslint-plugin-react/compare/v7.34.4...v7.35.0)

---
updated-dependencies:
- dependency-name: eslint-plugin-react
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-22 12:40:01 -04:00
dependabot[bot] 784f644b1d chore(deps): bump react-use from 17.5.0 to 17.5.1 in /docs (#3063)
Bumps [react-use](https://github.com/streamich/react-use) from 17.5.0 to 17.5.1.
- [Release notes](https://github.com/streamich/react-use/releases)
- [Changelog](https://github.com/streamich/react-use/blob/master/CHANGELOG.md)
- [Commits](https://github.com/streamich/react-use/compare/v17.5.0...v17.5.1)

---
updated-dependencies:
- dependency-name: react-use
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-22 12:36:40 -04:00
dependabot[bot] 669fe40229 chore(deps-dev): bump mypy from 1.10.1 to 1.11.0 (#3066)
Bumps [mypy](https://github.com/python/mypy) from 1.10.1 to 1.11.0.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/python/mypy/compare/v1.10.1...v1.11)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-22 12:35:05 -04:00
dependabot[bot] 733d4f5924 chore(deps): bump uvicorn from 0.30.1 to 0.30.3 (#3062)
Bumps [uvicorn](https://github.com/encode/uvicorn) from 0.30.1 to 0.30.3.
- [Release notes](https://github.com/encode/uvicorn/releases)
- [Changelog](https://github.com/encode/uvicorn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/encode/uvicorn/compare/0.30.1...0.30.3)

---
updated-dependencies:
- dependency-name: uvicorn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-22 12:32:32 -04:00
Priyank Sevak 812e5d1dc9 Issue#2977 (#3050)
- keeping agentStateService unchanged.
- Renamed 'changeAgentState' inside 'agentSlice' to 'setCurrentAgentState'

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-07-22 08:38:31 +00:00
Xingyao Wang ce8a11a62f [Arch] Shrink runtime image size (#3051)
* test_runtime_client.py to test _execute_bash()

* runtime_build and runtime tweaks

* fix in docker script

* revert bash changes

* use sandbox_config.update_source_code to control source code update

* add od_version to the sandbox tag

* add doc instruction for update source code

* do not remove whole poetry folder;
add mamba clean

* add missing newlines

---------

Co-authored-by: tobitege <tobitege@gmx.de>
2024-07-22 02:34:45 +08:00
மனோஜ்குமார் பழனிச்சாமி f3c23e8039 CI: Force stop colima (#3053) 2024-07-22 01:46:57 +08:00
Xingyao Wang a61ac5a214 remove extra arg from swebench ssh box (#3054) 2024-07-21 14:58:16 +08:00
Graham Neubig 04877f8caf Remove global config from tests (#3052) 2024-07-20 23:07:09 -04:00
Zheng Chaojian 15697bed5a Update Dockerfile casing (#3045)
* Update Dockerfile casing

* Update Dockerfile
2024-07-20 08:00:37 -04:00
Boxuan Li be6e6e3add Bug fix: Metrics not accumulated across agent delegation (#3012)
* Add test to reproduce cost miscalculation bug

* Fix metrics bug

* Copy metrics upon AgentRejectAction
2024-07-20 04:05:05 +00:00
Xingyao Wang 6b16a5da0b [Eval,Arch] Update GPTQ eval and add headless_mode for Controller (#2994)
* update and polish gptq eval

* fix typo

* Update evaluation/gpqa/README.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/gpqa/run_infer.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* add headless mode to all appropriate agent controller call

* delegate set to error when in headless mode

* try to deduplicate a bit

* make headless_mode default to True and only change it to false for AgentSession

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-07-20 03:35:48 +00:00
Graham Neubig dada004fac Remove config from files (#3039) 2024-07-19 23:20:44 -04:00
Raj Maheshwari 9cf2b5b74b [FIX] Update SWEBenchSSHBox after global config was removed from sandbox in #2961 (#3014)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-07-19 14:41:50 -07:00
Graham Neubig 3a21198424 Remove monologue agent (#3036)
* Remove monologue agent

* Fixes
2024-07-19 19:25:05 +00:00
mamoodi 71cb8b02dc chore: Release 0.8.1 (#3035) 2024-07-19 19:12:32 +00:00
dependabot[bot] 08b44f0d60 chore(deps-dev): bump openai from 1.35.13 to 1.36.0 (#3033) 2024-07-20 01:04:04 +08:00
dependabot[bot] 82f94b99c4 chore(deps): bump @nextui-org/react from 2.4.3 to 2.4.5 in /frontend (#3021) 2024-07-20 01:03:47 +08:00
dependabot[bot] fb9ad04362 chore(deps-dev): bump jupyterlab from 4.2.3 to 4.2.4 (#3028)
Bumps [jupyterlab](https://github.com/jupyterlab/jupyterlab) from 4.2.3 to 4.2.4.
- [Release notes](https://github.com/jupyterlab/jupyterlab/releases)
- [Changelog](https://github.com/jupyterlab/jupyterlab/blob/@jupyterlab/lsp@4.2.4/CHANGELOG.md)
- [Commits](https://github.com/jupyterlab/jupyterlab/compare/@jupyterlab/lsp@4.2.3...@jupyterlab/lsp@4.2.4)

---
updated-dependencies:
- dependency-name: jupyterlab
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-20 00:08:24 +08:00
dependabot[bot] 4bb92bdd02 chore(deps): bump i18next from 23.12.1 to 23.12.2 in /frontend (#3020)
Bumps [i18next](https://github.com/i18next/i18next) from 23.12.1 to 23.12.2.
- [Release notes](https://github.com/i18next/i18next/releases)
- [Changelog](https://github.com/i18next/i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next/compare/v23.12.1...v23.12.2)

---
updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-20 00:08:05 +08:00
dependabot[bot] 46c9c9d5c6 chore(deps): bump boto3 from 1.34.144 to 1.34.145 (#3022)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.144 to 1.34.145.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.144...1.34.145)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-20 00:07:48 +08:00
dependabot[bot] b004678345 chore(deps-dev): bump llama-index-embeddings-azure-openai (#3024)
Bumps llama-index-embeddings-azure-openai from 0.1.10 to 0.1.11.

---
updated-dependencies:
- dependency-name: llama-index-embeddings-azure-openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-20 00:07:28 +08:00
dependabot[bot] 93b9fd028d chore(deps-dev): bump ruff from 0.5.2 to 0.5.3 (#3026)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.2 to 0.5.3.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.2...0.5.3)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-20 00:07:12 +08:00
sp.wack 23493a2e36 (test|refactor)(frontend): Refactor/cleanup and extend tests for ChatInterface, its children, and introduce improvements to the feedback flow (#2997)
* Refactor and remove useless test

* Refactor and test feedback modal artifacts

* Update and pass test

* Replace select with radio buttons

* Store and retrieve user email during feedback

* Improve post share feedback toast

* Fix tests

* Add test todo
2024-07-19 12:00:59 -04:00
sp.wack 8bfa61f3e4 Run package commands directly (#3013) 2024-07-19 16:44:14 +02:00
jigsawlabs-student fa6c12473e #2220, integrated aider style linting, currently passes related o… (#2489)
* WIP for integrate aider linter, see OpenDevin#2220

Updated aider linter to:
    * Always return text and line numbers
    * Moved extract line number more consistently
    * Changed pylint to stop after first linter detects errors
Updated agentskills
    * To get back a LintResult object and then use lines and text for error message and related line number
    * Moved code for extracting line number to aider linter
Tests:
* Added additional unit tests for aider to test for
* Return values from lint failures
* Confirm linter works for non-configured languages like Ruby

* move to agent_skills, fixes not seeing skills error

* format/lint to new code, fix failing tests, remove unused code from aider linter

* small changes (remove litellm, fix readme typo)

* fix failing sandbox test

* keep, change dumping of metadata

* WIP for integrate aider linter, see OpenDevin#2220

Updated aider linter to:
    * Always return text and line numbers
    * Moved extract line number more consistently
    * Changed pylint to stop after first linter detects errors
Updated agentskills
    * To get back a LintResult object and then use lines and text for error message and related line number
    * Moved code for extracting line number to aider linter
Tests:
* Added additional unit tests for aider to test for
* Return values from lint failures
* Confirm linter works for non-configured languages like Ruby

* move to agent_skills, fixes not seeing skills error

* format/lint to new code, fix failing tests, remove unused code from aider linter

* remove duplication of tree-sitter, grep-ast and update poetry.lock

* revert to main branch poetry.lock version

* only update necessary package

* fix jupyter kernel wrong interpreter issue (only for swebench)

* fix failing lint tests

* update syntax error checks for flake

* update poetry lock file

* update poetry.lock file, which update content-hash

* add grep ast

* remove extra stuff caused by merge

* update pyproject

* remove extra pytest fixture, ruff styling fixes

* lint files

* update poetry.lock file

---------

Co-authored-by: Jeff Katzy <jeffreyerickatz@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: tobitege <tobitege@gmx.de>
2024-07-19 21:58:54 +08:00
Julian Gums b5e4cddce3 fix docs links (#3017)
* fix docs link

* Update CONTRIBUTING.md

* Update README.md

* fix docs link

* fix docs link

* Update troubleshooting.md

* fix docs link

* fix docs link

* fix docs link

* fix docs link
2024-07-19 13:57:33 +00:00
Xingyao Wang ac27ded81f Fix: handle the case where env var is empty (#3016)
* handle the case where env var is empty

* fix logging

* include obs content in logging

* change to add_env_vars
2024-07-19 13:51:06 +00:00
dependabot[bot] c555fb6840 chore(deps-dev): bump husky from 9.0.11 to 9.1.1 in /frontend (#2996)
Bumps [husky](https://github.com/typicode/husky) from 9.0.11 to 9.1.1.
- [Release notes](https://github.com/typicode/husky/releases)
- [Commits](https://github.com/typicode/husky/compare/v9.0.11...v9.1.1)

---
updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-19 11:28:47 +03:00
dependabot[bot] 452da5663d chore(deps): bump @nextui-org/react from 2.4.2 to 2.4.3 in /frontend (#2995)
Bumps [@nextui-org/react](https://github.com/nextui-org/nextui/tree/HEAD/packages/core/react) from 2.4.2 to 2.4.3.
- [Release notes](https://github.com/nextui-org/nextui/releases)
- [Changelog](https://github.com/nextui-org/nextui/blob/canary/packages/core/react/CHANGELOG.md)
- [Commits](https://github.com/nextui-org/nextui/commits/@nextui-org/react@2.4.3/packages/core/react)

---
updated-dependencies:
- dependency-name: "@nextui-org/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-19 11:26:48 +03:00
மனோஜ்குமார் பழனிச்சாமி b60890c064 Fix playwright cache error during container restart (#3011) 2024-07-19 09:30:19 +02:00
Xingyao Wang 1761b88af5 update url for share opendevin visualization (#3009) 2024-07-19 12:12:55 +08:00
Xingyao Wang ff6ddc831f fix: runtime test for mac (#3005)
* move use_host_network to sandbox config

* fix test runtime tests

* fix kwargs to make it clearer
2024-07-19 03:03:55 +00:00
tobitege d6642c26be Session: set base_url in default_llm_config (#3003) 2024-07-18 22:28:26 -04:00
dependabot[bot] 3bff8cf88a chore(deps-dev): bump pytest-asyncio from 0.23.7 to 0.23.8 (#3000)
Bumps [pytest-asyncio](https://github.com/pytest-dev/pytest-asyncio) from 0.23.7 to 0.23.8.
- [Release notes](https://github.com/pytest-dev/pytest-asyncio/releases)
- [Commits](https://github.com/pytest-dev/pytest-asyncio/compare/v0.23.7...v0.23.8)

---
updated-dependencies:
- dependency-name: pytest-asyncio
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-19 10:18:52 +08:00
dependabot[bot] bf39af895e chore(deps): bump litellm from 1.41.23 to 1.41.24 (#2999)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.23 to 1.41.24.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.23...v1.41.24)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-19 10:18:41 +08:00
Graham Neubig de74c7a0a1 Convert docs to new URL (#3002) 2024-07-18 17:04:35 +00:00
Xingyao Wang cf910dfa9d fix eval api_key leak in metadata; fix llm config in run infer (#2998) 2024-07-18 15:46:59 +00:00
Graham Neubig 692fe21d60 Remove global config from session (#2987)
* Remove global config from session

* Fix double agent
2024-07-18 15:39:38 +00:00
Boxuan Li 9d41314d1a State: Add local_iteration attribute (#2990)
* Add local_iteration state attribute

* Fix typos

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-07-18 14:49:19 +00:00
மனோஜ்குமார் பழனிச்சாமி f70c5afb6e Update run-unit-tests.yml (#2993) 2024-07-18 13:58:23 +00:00
மனோஜ்குமார் பழனிச்சாமி 2250947919 CI: Stop colima instance if failed to start (#2989)
* CI: Stop colima instance if failed to start

* change to classic for loop
2024-07-18 15:53:44 +02:00
tobitege 5a5713009f INT: prevent error on repeat integration tests after failed test(s) (#2935)
* Integration tests: prevent File not found error

* forgot to remove debug calls in regenerate.sh
2024-07-18 06:29:15 +02:00
மனோஜ்குமார் பழனிச்சாமி 728131ff1d Fix: Review PR Dogfood (#2916)
* Fix env variables, prompt, and exit

(cherry picked from commit b45bc1638397427ec5e82540c63c4cda0d1e2094)

* fix echo

* Run without docker

to avoid running as root.

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-07-18 11:10:53 +08:00
Xingyao Wang 135da0ea2b increase the default retries for LLM (#2986) 2024-07-18 09:17:12 +08:00
eng-waleed1 f689d5dcc3 docs-issue-#2887: Create openshift-example.md (#2960)
docs-issue-#2887: Create openshift-example.md
2024-07-18 09:05:22 +08:00
Xingyao Wang f80ecec772 [Arch] Add tests for EventStreamRuntime and fix bash parsing (#2933)
* deprecating recall action

* fix integration tests

* fix integration tests

* refractor runtime to use async

* remove search memory

* rename .initialize to .ainit

* draft of runtime image building (separate from img agnostic)

* refractor runtime build into separate file and add unit tests for it

* fix image agnostic tests

* move `split_bash_commands` into a separate util file

* fix bash pexcept parsing for env

* refractor add_env_var from sandbox to runtime;
add test runtime for env var, remove it from sandbox;

* remove unclear comment

* capture broader error

* make `add_env_var` handle multiple export at the same time

* add multi env var test

* fix tests with new config

* make runtime tests a separate ci to avoid full disk

* Update Runtime README with architecture diagram and detailed explanations

* update test

* remove dependency of global config in sandbox test

* fix sandbox typo

* runtime tests does not need ghcr build now

* remove download runtime img

* remove dependency of global config in sandbox test

* fix sandbox typo

* try to free disk before running the tests

* Update opendevin/runtime/client/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/client/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/client/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* try to reduce code duplication

* Update opendevin/runtime/client/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/client/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/client/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/client/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/client/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* cleanup before setup

* temporarily remove this enable lint test since env var are now handled by runtime

* linter

---------

Co-authored-by: OpenDevin <opendevin@all-hands.dev>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-18 06:10:45 +08:00
Xingyao Wang cf3d2298da Refactor: remove the use of global variable in test_sandbox (#2985)
* remove dependency of global config in sandbox test

* fix sandbox typo

* try to reduce code duplication
2024-07-17 20:42:40 +00:00
dependabot[bot] b04c69858c chore(deps): bump pyarrow from 16.1.0 to 17.0.0 (#2963)
Bumps [pyarrow](https://github.com/apache/arrow) from 16.1.0 to 17.0.0.
- [Release notes](https://github.com/apache/arrow/releases)
- [Commits](https://github.com/apache/arrow/compare/r-16.1.0...go/v17.0.0)

---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-17 13:29:18 -07:00
JeffKatzy 5c438432d6 fix bug in config.py file, update reference to variable (#2984) 2024-07-17 19:56:07 +00:00
dependabot[bot] 70b2238f5e chore(deps-dev): bump @types/node from 20.14.10 to 20.14.11 in /frontend (#2982)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.10 to 20.14.11.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-17 18:33:03 +03:00
dependabot[bot] f991069b00 chore(deps-dev): bump tailwindcss from 3.4.5 to 3.4.6 in /frontend (#2983)
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss) from 3.4.5 to 3.4.6.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/v3.4.6/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/compare/v3.4.5...v3.4.6)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-17 15:22:42 +00:00
dependabot[bot] a66ede2ee6 chore(deps): bump react-i18next from 14.1.3 to 15.0.0 in /frontend (#2981)
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 14.1.3 to 15.0.0.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v14.1.3...v15.0.0)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-17 23:15:27 +08:00
dependabot[bot] d97e92e714 chore(deps-dev): bump eslint-plugin-prettier in /frontend (#2980)
Bumps [eslint-plugin-prettier](https://github.com/prettier/eslint-plugin-prettier) from 5.1.3 to 5.2.1.
- [Release notes](https://github.com/prettier/eslint-plugin-prettier/releases)
- [Changelog](https://github.com/prettier/eslint-plugin-prettier/blob/master/CHANGELOG.md)
- [Commits](https://github.com/prettier/eslint-plugin-prettier/compare/v5.1.3...v5.2.1)

---
updated-dependencies:
- dependency-name: eslint-plugin-prettier
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-17 23:14:51 +08:00
Graham Neubig 9f12c77bac Fix bug with model list (#2978) 2024-07-17 15:08:31 +00:00
sp.wack 2c02ab9586 Refactor and extend, and pass tests (#2976) 2024-07-17 17:08:05 +03:00
Graham Neubig c897791024 Refactor LLM config (#2953)
* Add max_message_chars to LLM

* Refactor LLM config

* Fix tests

* Made some functions class functions

* Fix regression

* Fixed comments
2024-07-17 09:16:04 -04:00
Graham Neubig 01ce1e35b5 Remove global config from auth (#2962) 2024-07-17 06:25:45 -04:00
Graham Neubig 88d53e781f Remove global config from logger (#2974)
* Remove global config from loggers

* Fix bug
2024-07-17 06:25:26 -04:00
Graham Neubig 257698e89b Remove global config from sandbox (#2961)
* Some changes

* Fixed errors

* Remove duplicate initialize_plugins

* Fix some tests

* Fix tests
2024-07-16 18:34:04 +00:00
Jiayi Pan 7111e8ee14 Support Instance Level Images for SWE-Bench Evaluation (#2874)
* rename pulled instance images

* Swebench: add support to instance level images

* Update evaluation/swe_bench/run_infer.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* instance swebench: use env var and docker tags instead

* swebench disable instance report for instance images

* Update evaluation/swe_bench/README.md

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-07-17 01:31:42 +08:00
Graham Neubig 2c982582d7 Remove global config from bedrock (#2954) 2024-07-16 13:16:48 -04:00
dependabot[bot] 0b0952547d chore(deps): bump litellm from 1.41.21 to 1.41.23 (#2964) 2024-07-17 00:52:12 +08:00
sp.wack a2ec1ded26 Remove unused framer motion package (#2973) 2024-07-16 16:33:05 +00:00
dependabot[bot] dc45b14720 chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend (#2970)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 7.16.0 to 7.16.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.16.1/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-16 16:20:10 +00:00
sp.wack 4b6a2ff3c4 Remove unused react router package (#2972) 2024-07-16 15:55:05 +00:00
Xingyao Wang f45a2ff04e [Agent, Eval] Fixes LLM config issue for delegation & Add eval to measure the delegation accuracy (#2948)
* fix json import

* pass llm to delegation action so that sub-agent shares the same llm for cost accum purpose

* add inference script for browser delegation

* add readme

* Update agenthub/codeact_agent/action_parser.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* revert action parser changes.

* Rework --llm-config CLI arg

* Revert "pass llm to delegation action so that sub-agent shares the same llm for cost accum purpose"

This reverts commit 81034c486e.

* remove view summary

* update readme

* update comment

* update readme

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-07-16 15:51:29 +00:00
dependabot[bot] f5a4fb80a3 chore(deps-dev): bump tailwindcss from 3.4.4 to 3.4.5 in /frontend (#2971)
Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss) from 3.4.4 to 3.4.5.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/v3.4.5/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/compare/v3.4.4...v3.4.5)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-16 15:41:18 +00:00
dependabot[bot] 59d05f3934 chore(deps-dev): bump @typescript-eslint/parser in /frontend (#2969)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.16.0 to 7.16.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.16.1/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-16 15:39:00 +00:00
dependabot[bot] 29483c0620 chore(deps): bump vite from 5.3.3 to 5.3.4 in /frontend (#2967)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.3.3 to 5.3.4.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.3.4/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-16 15:28:56 +00:00
dependabot[bot] 42abc727d7 chore(deps): bump react-i18next from 14.1.2 to 14.1.3 in /frontend (#2965)
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 14.1.2 to 14.1.3.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v14.1.2...v14.1.3)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-16 15:27:33 +00:00
sp.wack 1fd2e511f8 refactor: Frontend tests (#2959)
* Refactor and clean

* Refactor, cleanup, and pass skipped test

* Refactor

* Refactor and cleanup

* Cleanup

* Refactor and cleanup

* Remove unused mock

* Refactor and cleanup

* Refactor

* Remove unused hooks

* Refactor
2024-07-16 11:14:00 +03:00
Boxuan Li e3e437fcc2 Rework --llm-config CLI arg (#2957) 2024-07-16 04:17:59 +00:00
Anush Kumar V 8f76587e5c docs: updated docstrings using ruff's autofix feature (#2923)
* Updated documentation using ruff's autofix feature

* Updated pyproject.toml to include docstring validations

* Updated documentation using ruff's autofix feature

* Updated pyproject.toml to include docstring validations

* Updated docstrings using ruff's autfix feature

* Deleted opendevin/runtime/utils/soource.py, Keeping in sync with main

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-07-16 01:35:33 +00:00
Graham Neubig 149dac8e5b Run all tests in development.md (#2951)
This PR changes the directions in development.md to run all tests.

Note that I opted to explicitly specify `test_*.py` instead of doing test discovery so it's obvious that you can also specify specific files.
2024-07-15 20:28:02 +00:00
மனோஜ்குமார் பழனிச்சாமி ec2535c57c Ref: Remove make-i18n from makefile (#2905) 2024-07-15 19:31:08 +00:00
மனோஜ்குமார் பழனிச்சாமி 471703bea6 CI: Add retry mechanism (#2915) 2024-07-15 19:30:27 +00:00
Boxuan Li 4b4fa1c390 Remove legacy swe_bench/scripts/summarise_results.py (#2932)
* Remove swe_bench/scripts/summarise_results.py

* Remove mention of legacy script
2024-07-15 15:03:07 -04:00
dependabot[bot] 3c0975d71d chore(deps-dev): bump whatthepatch from 1.0.5 to 1.0.6 (#2943)
Bumps [whatthepatch](https://github.com/cscorley/whatthepatch) from 1.0.5 to 1.0.6.
- [Release notes](https://github.com/cscorley/whatthepatch/releases)
- [Changelog](https://github.com/cscorley/whatthepatch/blob/main/HISTORY.md)
- [Commits](https://github.com/cscorley/whatthepatch/compare/1.0.5...1.0.6)

---
updated-dependencies:
- dependency-name: whatthepatch
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-15 14:50:24 -04:00
dependabot[bot] 17b2eb58e4 chore(deps): bump litellm from 1.41.19 to 1.41.21 (#2942)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.19 to 1.41.21.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.19...v1.41.21)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-15 14:49:39 -04:00
dependabot[bot] 7cd3431beb chore(deps): bump boto3 from 1.34.143 to 1.34.144 (#2941)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.143 to 1.34.144.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.143...1.34.144)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-15 14:49:24 -04:00
dependabot[bot] cf531518a5 chore(deps): bump i18next from 23.11.5 to 23.12.1 in /frontend (#2939)
Bumps [i18next](https://github.com/i18next/i18next) from 23.11.5 to 23.12.1.
- [Release notes](https://github.com/i18next/i18next/releases)
- [Changelog](https://github.com/i18next/i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next/compare/v23.11.5...v23.12.1)

---
updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-07-15 18:27:16 +00:00
dependabot[bot] 8ea66a82c8 chore(deps-dev): bump eslint-plugin-react in /frontend (#2940) 2024-07-16 00:51:23 +08:00
dependabot[bot] 59042bb0a9 chore(deps-dev): bump ruff from 0.5.1 to 0.5.2 (#2945) 2024-07-16 00:51:04 +08:00
dependabot[bot] b501083425 chore(deps): bump fastapi from 0.111.0 to 0.111.1 (#2944) 2024-07-16 00:50:37 +08:00
dependabot[bot] 653a3c0f11 chore(deps-dev): bump prettier from 3.3.2 to 3.3.3 in /frontend (#2938) 2024-07-16 00:50:02 +08:00
Boxuan Li b834b354e5 Add compare_patch_filename.py (#2934) 2024-07-15 23:55:45 +08:00
mamoodi 214f728d32 docs: Add doc on how issues are triaged (#2928)
* docs: Add doc on how issues are triaged

* Update some wordings
2024-07-15 15:50:34 +00:00
மனோஜ்குமார் பழனிச்சாமி 9d7adefe0c chore: Update wordings in pull request template (#2926)
* Update wordings in pull request template

* Update .github/pull_request_template.md
2024-07-15 11:28:05 -04:00
Xingyao Wang 9b1f59a56e Arch: refactor and add unit tests for EventStreamRuntime docker image build (#2908)
* deprecating recall action

* fix integration tests

* fix integration tests

* refractor runtime to use async

* remove search memory

* rename .initialize to .ainit

* draft of runtime image building (separate from img agnostic)

* refractor runtime build into separate file and add unit tests for it

* fix image agnostic tests

* Update opendevin/runtime/utils/runtime_build.py

Co-authored-by: Mingzhang Zheng <649940882@qq.com>

---------

Co-authored-by: Mingzhang Zheng <649940882@qq.com>
2024-07-15 01:27:31 +00:00
Yufan Song 959d21c48f remove useless code (#2922) 2024-07-13 15:20:31 -07:00
mamoodi 46edb4b15b chore: Release 0.8.0 (#2919)
* Release 0.8.0

* Update email in code of conduct

* Remove unnecessary sentence from README
2024-07-13 18:05:05 +00:00
dependabot[bot] 91d46ccb8c chore(deps): bump litellm from 1.41.15 to 1.41.19 (#2906)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.15 to 1.41.19.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.15...v1.41.19)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-13 10:55:43 -07:00
Raj Maheshwari 64be2cb466 [Fix] Minor bug in parse_response of CodeActResponseParser (#2912) 2024-07-13 14:36:27 +00:00
மனோஜ்குமார் பழனிச்சாமி b2b6d2ac1e Fix: hostname in logging (#2914)
Co-authored-by: tobitege <tobitege@gmx.de>
2024-07-13 10:04:08 +00:00
Boxuan Li f249254ba4 Fix delegator LLM config when config is set from UI (#2913) 2024-07-13 11:31:30 +02:00
Xingyao Wang 7e68de746d arch: refractor eventstream into async (#2907)
* deprecating recall action

* fix integration tests

* fix integration tests

* refractor runtime to use async

* remove search memory

* rename .initialize to .ainit
2024-07-12 20:03:28 +00:00
Xingyao Wang e45ddeb2a2 arch: deprecating recall action and search_memory (#2900)
* deprecating recall action

* fix integration tests

* fix integration tests

* remove search memory
2024-07-12 19:23:21 +00:00
Boxuan Li 2b7c4e5571 Remove legacy dummy action from CI (#2903) 2024-07-13 02:27:13 +08:00
Boxuan Li ebbc0e6803 Integration testing: unset irrelevant env variables (#2902) 2024-07-12 22:12:37 +08:00
Xingyao Wang 96b5cb78fd [Arch] EventStreamRuntime supports browser (#2899)
* fix the case when source and tmp are not on the same device

* always build a dev box (with updated source code) for development purpose

* tail the log before removing the container

* move browse function

* support browser!
2024-07-11 22:32:12 +00:00
Xingyao Wang ced7499f8d fix Runtime import (#2897) 2024-07-11 21:51:57 +00:00
மனோஜ்குமார் பழனிச்சாமி 7cbf2d9f16 Doc: LM Studio guide (#2875) 2024-07-11 23:17:03 +02:00
Xingyao Wang e45d46c993 [Arch] Implement EventStream Runtime Client with Jupyter Support using Agnostic Sandbox (#2879)
* support loading a particular runtime class via config.runtime (default to server to not break things)

* move image agnostic util to shared runtime util

* move dependency

* include poetry.lock in sdist

* accept port as arg for client

* make client start server with specified port

* update image agnostic utility for eventstream runtime

* make client and runtime working with REST API

* rename execute_server

* add plugin to initialize stuff inside es-runtime;
cleanup runtime methods to delegate everything to container

* remove redundant ls -alh

* fix jupyter

* improve logging in agnostic sandbox

* improve logging of test function

* add read & edit

* update agnostic sandbox

* support setting work dir at start

* fix file read/write test

* fix unit test

* update tescase

* Fix unit test again

* fix unit test again again
2024-07-12 01:52:26 +08:00
மனோஜ்குமார் பழனிச்சாமி 43c3e904b7 Restore last mute setting (#2895) 2024-07-11 16:48:04 +00:00
dependabot[bot] ae2fbbf8af chore(deps): bump boto3 from 1.34.142 to 1.34.143 (#2893) 2024-07-12 00:12:23 +08:00
dependabot[bot] 29ed1d744d chore(deps-dev): bump chromadb from 0.5.3 to 0.5.4 (#2892) 2024-07-12 00:12:11 +08:00
dependabot[bot] 217eed9dab chore(deps): bump litellm from 1.41.14 to 1.41.15 (#2891) 2024-07-12 00:11:59 +08:00
மனோஜ்குமார் பழனிச்சாமி 6bef270526 Doc: Fix Azure Guide (#2894)
* Doc: Fix Azure Guide

* Update azureLLMs.md
2024-07-11 15:56:36 +00:00
Xingyao Wang 1b54800a29 [Agent] Improve edits by adding back edit_file_by_line (#2722)
* add replace-based block edit & preliminary test case fix

* further fix the insert behavior

* make edit only work on first occurence

* bump codeact version since we now use new edit agentskills

* update prompt for new agentskills

* update integration tests

* make run_infer.sh executable

* remove code block for edit_file

* update integration test for prompt changes

* default to not use hint for eval

* fix insert emptyfile bug

* throw value error when `to_replace` is empty

* make `_edit_or_insert_file` return string so we can try to fix some linter errors (best attempt)

* add todo

* update integration test

* fix sandbox test for this PR

* fix inserting with additional newline

* rename to edit_file_by_replace

* add back `edit_file_by_line`

* update prompt for new editing tool

* fix integration tests

* bump codeact version since there are more changes

* add back append file

* fix current line for append

* fix append unit tests

* change the location where we show edited line no to agent and fix tests

* update integration tests

* fix global window size affect by open_file bug

* fix global window size affect by open_file bug

* increase window size to 300

* add file beginning and ending marker to avoid looping

* expand the editor window to better display edit error for model

* refractor to breakdown edit to internal functions

* reduce window to 200

* move window to 100

* refractor to cleanup some logic into _calculate_window_bounds

* fix integration tests

* fix sandbox test on new prompt

* update demonstration with new changes

* fix integration

* initialize llm inside process_instance to circumvent "AttributeError: Can't pickle local object"

* update kwargs

* retry for internal server error

* fix max iteration

* override max iter from config

* fix integration tests

* remove edit file by line

* fix integration tests

* add instruction to avoid hanging

* Revert "add instruction to avoid hanging"

This reverts commit 06fd2c5938.

* handle content policy violation error

* fix integration tests

* fix typo in prompt - the window is 100

* update all integration tests

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
2024-07-11 15:30:20 +00:00
dependabot[bot] e793ca2261 chore(deps): bump framer-motion from 11.3.0 to 11.3.2 in /frontend (#2890) 2024-07-11 23:19:52 +08:00
adragos 5f61885e44 feat: Implement user confirmation mode, request confirmation when running bash/python code in this mode (#2774)
* [feat] confirmation mode for bash actions

* feat: Add modal setting for Confirmation Mode

* fix: frontend tests for confirmation mode switch

* fix: add missing CONFIRMATION_MODE value in SettingsModal.test.tsx

* fix: update test to integrate new setting

* feat: Implement user confirmation for running bash/python code

* fix: don't display rejected actions

* fix: linting, rename/refactor based on feedback

* fix: add property only to commands, pass serialization tests

* fix: package-lock.json, lint test_action_serialization.py

* test: add is_confirmed to integration test outputs

---------

Co-authored-by: Mislav Balunovic <mislav.balunovic@gmail.com>
2024-07-11 14:57:21 +03:00
மனோஜ்குமார் பழனிச்சாமி 1d4f422638 Doc: Mention FORCE_REGENERATE var (#2833)
* Mention FORCE_REGENERATE var in doc

* Update tests/integration/README.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-07-11 04:01:15 +00:00
dependabot[bot] 456690818c chore(deps): bump google-cloud-aiplatform from 1.58.0 to 1.59.0 (#2884)
Bumps [google-cloud-aiplatform](https://github.com/googleapis/python-aiplatform) from 1.58.0 to 1.59.0.
- [Release notes](https://github.com/googleapis/python-aiplatform/releases)
- [Changelog](https://github.com/googleapis/python-aiplatform/blob/main/CHANGELOG.md)
- [Commits](https://github.com/googleapis/python-aiplatform/compare/v1.58.0...v1.59.0)

---
updated-dependencies:
- dependency-name: google-cloud-aiplatform
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-10 10:09:48 -07:00
dependabot[bot] de11b49a38 chore(deps): bump boto3 from 1.34.141 to 1.34.142 (#2882) 2024-07-11 00:04:56 +08:00
dependabot[bot] 50f07aea44 chore(deps): bump framer-motion from 11.2.14 to 11.3.0 in /frontend (#2880) 2024-07-11 00:03:22 +08:00
dependabot[bot] 008f288bb0 chore(deps-dev): bump openai from 1.35.10 to 1.35.13 (#2885) 2024-07-11 00:02:55 +08:00
dependabot[bot] 7938b454e4 chore(deps): bump litellm from 1.41.13 to 1.41.14 (#2883) 2024-07-11 00:02:39 +08:00
dependabot[bot] 45e40d68f6 chore(deps): bump json-repair from 0.25.2 to 0.25.3 (#2881) 2024-07-11 00:02:17 +08:00
Boxuan Li c68478f470 Customize LLM config per agent (#2756)
Currently, OpenDevin uses a global singleton LLM config and a global singleton agent config. This PR allows customers to configure an LLM config for each agent. A hypothetically useful scenario is to use a cheaper LLM for repo exploration / code search, and a more powerful LLM to actually do the problem solving (CodeActAgent).

Partially solves #2075 (web GUI improvement is not the goal of this PR)
2024-07-09 22:05:54 -07:00
Jiayi Pan 23e2d01cf5 Fix instance agonistic: remove Miniforge after installation (#2878)
* remove Miniforge after installation

* fix typo
2024-07-09 20:08:58 +00:00
dependabot[bot] de47d8eecc chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend (#2872) 2024-07-09 16:42:22 +00:00
dependabot[bot] 4049c69590 chore(deps): bump litellm from 1.41.11 to 1.41.13 (#2870) 2024-07-10 00:08:12 +08:00
dependabot[bot] 792949aeb2 chore(deps): bump boto3 from 1.34.140 to 1.34.141 (#2869) 2024-07-10 00:07:48 +08:00
dependabot[bot] 864ee465fe chore(deps): bump google-generativeai from 0.7.1 to 0.7.2 (#2868) 2024-07-10 00:07:35 +08:00
dependabot[bot] fbced43ff3 chore(deps): bump framer-motion from 11.2.13 to 11.2.14 in /frontend (#2871) 2024-07-10 00:07:12 +08:00
dependabot[bot] 46b853e1b4 chore(deps-dev): bump @typescript-eslint/parser in /frontend (#2873) 2024-07-10 00:06:36 +08:00
Yufan Song 9198ea3fb1 fix output code error in docker image (#2862) 2024-07-08 22:40:39 -07:00
Yufan Song f0bc231f3e chores: open the websockets ports for port mapping and remove chores. (#2864)
* add port map

* add more comments TODO
2024-07-08 22:33:58 -07:00
Yufan Song 8cfb1be5a3 od-runtime-client: check and remove permission TODO (#2863) 2024-07-08 22:33:25 -07:00
Xingyao Wang f2e92b2db7 move image agnostic util to shared runtime util (#2859) 2024-07-08 22:17:01 +00:00
mamoodi e2636f9ece docs: Reorder docs and small update to README (#2860)
Co-authored-by: Mahmoud Work <mahmoudwork@mahmouds-mini.home>
2024-07-08 21:33:12 +00:00
Yufan Song 351127db55 add od runtime clinet dependencies (#2858) 2024-07-08 18:20:44 +00:00
Yufan Song 9fbfa0650e Add websocket runtime and od-client-runtime (#2603)
* add draft code

* add some sandbox draft code

* Export WebSocketBox and fix add_to_env async

* fix

* test execute

* add runtime draft

* add draft od-runtime-client

* refactor useless code

* format

* resume runtime

* resume runtime

* remove background command

* remove uselss action and init function

* add EventStreamRuntime test

* add echo server test

* temporarily build websocket everytimes

* remove websocket sandbox deprecated

* refactor code

* fix bug, add test

* fix bug

* remove test draft code

* refactor code, remove async

* rename file and directory

* add init plugin and runtime tools function

* add docker luanch

* fix plugin initialization

* remove test scropt

* add mock test code

* apply suggestions

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-07-08 17:44:37 +00:00
dependabot[bot] e6ebb4307e chore(deps): bump boto3 from 1.34.139 to 1.34.140 (#2855)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.139 to 1.34.140.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.139...1.34.140)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-08 15:43:42 +00:00
dependabot[bot] 9b9b754965 chore(deps): bump litellm from 1.41.7 to 1.41.11 (#2854)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.7 to 1.41.11.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.7...v1.41.11)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-08 15:41:41 +00:00
dependabot[bot] 648597c4f7 chore(deps-dev): bump @types/node from 20.14.9 to 20.14.10 in /frontend (#2852)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.9 to 20.14.10.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-08 14:49:49 +00:00
dependabot[bot] ff701f99e6 chore(deps): bump tailwind-merge from 2.3.0 to 2.4.0 in /frontend (#2851)
Bumps [tailwind-merge](https://github.com/dcastil/tailwind-merge) from 2.3.0 to 2.4.0.
- [Release notes](https://github.com/dcastil/tailwind-merge/releases)
- [Commits](https://github.com/dcastil/tailwind-merge/compare/v2.3.0...v2.4.0)

---
updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-08 14:49:39 +00:00
Engel Nyst 2df1d67007 History clean up (#2849)
* clean up add_history

* refactor last agent message
2024-07-08 05:10:21 +02:00
மனோஜ்குமார் பழனிச்சாமி c6aa50779d Update regenerate.sh (#2832) 2024-07-07 23:52:03 +02:00
Ralf D. Müller ba0f57c279 added netcat to the requirements (#2822) 2024-07-07 21:32:56 +00:00
Engel Nyst d37b2973b2 Refactoring: event stream based agent history (#2709)
* add to event stream sync

* remove async from tests

* small logging spam fix

* remove swe agent

* arch refactoring: use history from the event stream

* refactor agents

* monologue agent

* ruff

* planner agent

* micro-agents

* refactor history in evaluations

* evals history refactoring

* adapt evals and tests

* unit testing stuck

* testing micro agents, event stream

* fix planner agent

* fix tests

* fix stuck after rename

* fix test

* small clean up

* fix merge

* fix merge issue

* fix integration tests

* Update agenthub/dummy_agent/agent.py

* fix tests

* rename more clearly; add todo; clean up
2024-07-07 21:04:23 +00:00
மனோஜ்குமார் பழனிச்சாமி 9dc2d2c80f Refactor: Remove extra log (#2687) 2024-07-08 05:37:13 +09:00
Shimada666 e35c1ff74a Display real-time build logs for the agnostic image (#2830)
* Display real-time build logs for the agnostic image and improve wget's output.

* remove unused code
2024-07-08 04:35:16 +08:00
மனோஜ்குமார் பழனிச்சாமி 34c765688b Streamline Logging Events (#2532)
* Skip duplicate log

* log user actions

* fix tests

* log all action _step

* refactor log

* revert test

* refactor log

* visual diff

* disable overriding event source

* Revert "disable overriding event source"

This reverts commit b0047cc0cd.

* Refactor logic

* refactored runtime on_event

* fix merge conflict

in Web UI, it shows as red color (seems deletion but added)

* linted

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-07-08 05:34:47 +09:00
மனோஜ்குமார் பழனிச்சாமி adf1a0d556 Bugfix: add missing f-string for logging debug message in task creation (#2836) 2024-07-07 17:36:19 +02:00
மனோஜ்குமார் பழனிச்சாமி 85a817304e Check exit code (#2834) 2024-07-07 17:35:22 +02:00
Graham Neubig d0384cafdd Two fixes to swe bench eval (#2831)
* Two fixes to swe bench eval

* Add error message

* Change dumping of metadata
2024-07-07 07:21:50 +00:00
மனோஜ்குமார் பழனிச்சாமி 3a3694ca17 doc: Mention negative feedback feature in bug report. (#2827)
* doc: Mention feedback feature in bug report.

* Update .github/ISSUE_TEMPLATE/bug_template.yml

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-07-07 06:56:12 +00:00
Bin Lei c8e5848add fix git diff TIMEOUT problem in swe_bench evaluation (#2828)
* fix git diff TIMEOUT problem in swe_bench evaluation

* fix git diff TIMEOUT problem in swe_bench evaluation

* Update evaluation/swe_bench/swe_env_box.py

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

---------

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2024-07-07 06:30:59 +00:00
Shimada666 0973e31f00 Update custom sandbox usage guide (#2829) 2024-07-07 05:33:35 +02:00
Shimada666 82f256be96 trim the sandbox image and install plugin dependencies in agnostic image (#2792)
* trim the sandbox image

* remove wrong code

* readd python

* readd python

* fix script

* readd nano
2024-07-06 17:38:37 +02:00
மனோஜ்குமார் பழனிச்சாமி d6570bd572 Fix gemini-1.5-flash crash due to missing 'vertexai' module (#2826)
* Fix gemini-1.5-flash crash due to missing 'vertexai' module

* Update poetry.lock
2024-07-06 16:27:59 +02:00
Shimada666 d22ff73905 Make the sandbox Python runtime completely transparent (#2796)
* Make the sandbox Python runtime completely independent

* fix source bashrc

* add pip install instruction for ipython to fix intergration tests for codeact swe

* update integration tests

* change flake8 command to (maybe) fix sandbox tests?

* make lint support both unittest & sandbox

* fix agnostic image build error

* refactor build script

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: tobitege <tobitege@gmx.de>
2024-07-06 15:22:44 +02:00
Leo 9b0ff117ab CI: Support uploading frontend unit test coverage. (#2772)
* CI: Support uploading frontend unit test coverage.

* Add make-i18n before test.

* Update vitest configuration to include only .ts and .tsx files in coverage.

* remove .only in test and fix the failed tests.

* Add text summary.

* Move vite-tsconfig-paths to dev dep. Adjust UTs.

---------

Signed-off-by: ifuryst <ifuryst@gmail.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-07-06 12:16:30 +08:00
Xingyao Wang f6dc89b41a [Evaluation] Simplify eval & and multi-processing related fixes (#2810)
* initialize agent inside process_instance_fn;

* remove dependency on `config.max_iterations`

* switch back to only include llm config to metadata
2024-07-06 07:18:46 +08:00
Xingyao Wang a47713ecb0 [Arch] Remove supports for Background Commands (#2803)
* depracting docker exec box

* remove doc exec from workflow and docs

* remove background commands

* Update tests/unit/test_sandbox.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* replace for-loop with assignment

* fix integration tests

* fix integration tests for shell script

* fix integration tests

* increase max iter to fix some monologue agent issue

* fix integration test again

* fix integration tests (seems related to run_user issue)

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-07-06 03:38:05 +08:00
mamoodi 99284da476 Use README as the only place for getting started instructions (#2815)
Co-authored-by: Mahmoud Work <mahmoudwork@mahmouds-mini.home>
2024-07-05 19:44:29 +02:00
mamoodi 9ccc64fa7e Update README (#2814)
Co-authored-by: Mahmoud Work <mahmoudwork@mahmouds-mini.home>
2024-07-05 12:54:34 -04:00
dependabot[bot] eda582335a chore(deps): bump tenacity from 8.4.2 to 8.5.0 (#2813)
Bumps [tenacity](https://github.com/jd/tenacity) from 8.4.2 to 8.5.0.
- [Release notes](https://github.com/jd/tenacity/releases)
- [Commits](https://github.com/jd/tenacity/compare/8.4.2...8.5.0)

---
updated-dependencies:
- dependency-name: tenacity
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-07-05 16:26:08 +00:00
மனோஜ்குமார் பழனிச்சாமி ed45a9e7b1 delete colima profile (#2807) 2024-07-05 08:55:26 -07:00
dependabot[bot] 1bdfbedccc chore(deps-dev): bump ruff from 0.5.0 to 0.5.1 (#2811)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.0...0.5.1)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-05 08:46:42 -07:00
dependabot[bot] 274464101f chore(deps): bump litellm from 1.41.6 to 1.41.7 (#2812)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.6 to 1.41.7.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.6...v1.41.7)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-05 08:46:34 -07:00
Graham Neubig a081935fd8 Simplify eval code (#2775)
* Start simplifying eval code

* Update

* Add EDA

* Updated GAIA

* Update gpqa

* Add humanevalfix

* Fix logic_reasoning

* Add miniwob

* Add mint and ml_bench

* toolqa

* Added swe-bench

* Fixed webarena

* Refactor parameters
2024-07-05 19:33:08 +09:00
r.e.e.c.h.e.e 038e8f8caa docs: update docker run command to pull default 'latest' tag (#2804)
- Ensure users get the most recent stable release version when pulling default image.
- Explains the main tag for those who want the most recent updates.
2024-07-05 04:13:35 +00:00
மனோஜ்குமார் பழனிச்சாமி 143f38d25a Refactored sandbox config and added fast boot (#2455)
* Refactored sandbox config and added fastboot

* added tests

* fixed tests

* fixed tests

* intimate user about breaking change

* remove default config from eval

* check for lowercase env

* add test

* Revert Migration

* migrate old sandbox configs

* resolve merge conflict

* revert migration 2

* Revert "remove default config from eval"

This reverts commit de57c588db.

* change type to box_type

* fix var name

* linted

* lint

* lint comments

* fix tests

* fix tests

* fix typo

* fix box_type, remove fast_boot

* add tests for sandbox config

* fix test

* update eval docs

* small removal comments

* adapt toml template

* old fields shouldn't be in the app dataclass

* fix old keys in app config

* clean up exec box

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-07-05 03:30:21 +00:00
Xingyao Wang 82f4860470 retry for internal server error (#2806) 2024-07-05 01:47:55 +00:00
Xingyao Wang 298956c78a [Eval] initialize llm inside process_instance to circumvent "AttributeError:… (#2805)
* initialize llm inside process_instance to circumvent "AttributeError: Can't pickle local object"

* update kwargs
2024-07-05 01:26:03 +00:00
Xingyao Wang 0d3b3ffbf8 [Arch] Removing docker exec box (#2802)
* depracting docker exec box

* remove doc exec from workflow and docs
2024-07-04 23:15:25 +00:00
Xingyao Wang e6cdf18d3b [Evaluation] Log empty patch stats for SWE-Bench (#2776)
* bump swebench version since the fix PR is merged

* add empy generation stats from latest pr

* delete eval_outputs if it already exists

* handle non string patch
2024-07-05 07:03:27 +08:00
Engel Nyst 0b8d357bef Add event synchronously (#2700)
* add to event stream sync

* remove async from tests
2024-07-05 00:15:51 +02:00
sven 1b10e2b9d5 Make CodeAct finish task (#2673)
* Added feature to CodeAct agent to finish action instead of waiting for user input.

* Minor change

* Update agenthub/codeact_agent/codeact_agent.py

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* updated integration tests with claude-sonnet-3.5

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* updated tests to remove typo in prompt

* resolve merge conflicts II

* revert unintended change of regenerate script

* re-regenerating prompts to resolve merge conflicts

---------

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-07-04 11:21:46 -07:00
மனோஜ்குமார் பழனிச்சாமி 688bd2a8fc Added local ollama models (#2433)
* added local ollama models

* add ollama_base_url config

* Update listen.py

* add docs

* Update opendevin/server/listen.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* lint

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-07-04 15:56:26 +00:00
dependabot[bot] 6853cbb4f6 chore(deps): bump framer-motion from 11.2.12 to 11.2.13 in /frontend (#2793)
Bumps [framer-motion](https://github.com/framer/motion) from 11.2.12 to 11.2.13.
- [Changelog](https://github.com/framer/motion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/framer/motion/compare/v11.2.12...v11.2.13)

---
updated-dependencies:
- dependency-name: framer-motion
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-04 23:41:47 +08:00
dependabot[bot] 14b4213acf chore(deps-dev): bump openai from 1.35.9 to 1.35.10 (#2789) 2024-07-04 12:37:08 +00:00
dependabot[bot] d145dd78a3 chore(deps-dev): bump typescript from 5.2.2 to 5.5.3 in /docs (#2785) 2024-07-04 20:02:11 +08:00
dependabot[bot] c8270013ad chore(deps): bump react-dom from 18.2.0 to 18.3.1 in /docs (#2787) 2024-07-04 11:36:57 +00:00
dependabot[bot] dd1bd9caf3 chore(deps): bump react from 18.2.0 to 18.3.1 in /docs (#2786) 2024-07-04 19:03:46 +08:00
dependabot[bot] c77480bb55 chore(deps): bump litellm from 1.41.3 to 1.41.6 (#2790)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.41.3 to 1.41.6.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.41.3...v1.41.6)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-04 11:01:48 +00:00
dependabot[bot] 7291c320fc chore(deps): bump boto3 from 1.34.138 to 1.34.139 (#2788)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.138 to 1.34.139.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.138...1.34.139)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-04 10:58:40 +00:00
Leo c2f557edde refactor: multiple code improvements (#2771) 2024-07-04 18:51:22 +08:00
Leo 869997b941 chore: Add docs for dependabot and add the open pr limit from 5 to 20. (#2784)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-07-04 18:07:30 +08:00
r.e.e.c.h.e.e d894347f9b Add 'latest' tag to Docker builds for stable releases only (#2781)
- Ensure 'latest' always points to the most recent stable version
- Address issue #2730: Release "latest" tag when pushing image to DockerHub
2024-07-04 18:06:54 +08:00
Leo 90a68ca816 chore: Add architecture diagram. (#2783)
* chore: Add architecture diagram.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Fix syntax error.

Signed-off-by: ifuryst <ifuryst@gmail.com>

---------

Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-07-04 18:06:14 +08:00
dependabot[bot] 82ed0a07cb chore(deps): bump jose from 5.6.2 to 5.6.3 in /frontend (#2766) 2024-07-04 13:28:35 +08:00
dependabot[bot] 81226e63e9 chore(deps): bump vite from 5.3.2 to 5.3.3 in /frontend (#2767) 2024-07-04 13:28:25 +08:00
dependabot[bot] 8b01d16926 chore(deps): bump react-router-dom from 6.24.0 to 6.24.1 in /frontend (#2768) 2024-07-04 13:28:13 +08:00
John Yang 89a3752c8c Restore SWE-bench dep refs (#2752)
* Restore SWE-bench dep refs

* update poetry lock

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-07-03 18:44:01 +00:00
Leo b60a696296 Fix the FE failed unit test. (#2773)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-07-03 15:52:51 +00:00
dependabot[bot] 018f45893b chore(deps-dev): bump openai from 1.35.8 to 1.35.9 (#2770)
Bumps [openai](https://github.com/openai/openai-python) from 1.35.8 to 1.35.9.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.35.8...v1.35.9)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-03 23:22:26 +08:00
dependabot[bot] 5123e1d8d1 chore(deps): bump boto3 from 1.34.137 to 1.34.138 (#2769)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.137 to 1.34.138.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.137...1.34.138)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-03 23:21:42 +08:00
r.e.e.c.h.e.e 94d68ca0b8 docs: Update custom sandbox guide to include steps to use pre-existin… (#2740)
* docs: Update custom sandbox guide to include steps to use pre-existing Docker images #2734

* docs: Update custom and pre-existing sandbox guide

* docs: Update custom and pre-existing sandbox guide
2024-07-03 12:08:55 +00:00
Leo 4dc01a7369 feature: Enable DEBUG level logging based on config setting. (#2762)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-07-03 20:21:41 +09:00
Graham Neubig ffd3c7144c Remove global args (#2760)
* Remove global args

* Remove global args

* Update files

* Update main

* Bug fixes

* Fix logging
2024-07-03 20:07:52 +09:00
Leo 0d0e6db1e3 feature: Add config template. (#2736)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-07-03 13:31:09 +09:00
Xingyao Wang 4d0c4f37d6 [Evaluation] fix SWE-Bench docker image name (#2751)
* fix double underscore

* remove unused script
2024-07-03 04:30:38 +08:00
dependabot[bot] f293b33856 chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend (#2742)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 7.14.1 to 7.15.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.15.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-02 16:08:09 +00:00
dependabot[bot] 50efea5960 chore(deps): bump boto3 from 1.34.136 to 1.34.137 (#2744)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.136 to 1.34.137.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.136...1.34.137)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-02 23:35:35 +08:00
dependabot[bot] 52ab4b4f18 chore(deps-dev): bump @typescript-eslint/parser in /frontend (#2743)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.14.1 to 7.15.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.15.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-02 23:35:13 +08:00
dependabot[bot] 8ff9332358 chore(deps-dev): bump openai from 1.35.7 to 1.35.8 (#2745)
Bumps [openai](https://github.com/openai/openai-python) from 1.35.7 to 1.35.8.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.35.7...v1.35.8)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-02 23:34:47 +08:00
dependabot[bot] c052ea30cc chore(deps): bump litellm from 1.40.29 to 1.41.3 (#2746)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.29 to 1.41.3.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.29...v1.41.3)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-02 23:34:11 +08:00
dependabot[bot] 4489c40106 chore(deps-dev): bump typescript from 5.5.2 to 5.5.3 in /frontend (#2741)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 5.5.2 to 5.5.3.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release.yml)
- [Commits](https://github.com/Microsoft/TypeScript/compare/v5.5.2...v5.5.3)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-02 23:33:34 +08:00
மனோஜ்குமார் பழனிச்சாமி cc8204e5c1 CI: For colima, modify CPU count and memory (#2712)
* CI: For colima, modify CPUcount and memory

* fix arg
2024-07-02 15:44:44 +02:00
r.e.e.c.h.e.e 0689eef844 Fix: Add scroll functionality to file explorer sidepane (#2731)
* Fix: Add scroll and resize functionalities to file explorer

* fix: the icon missing from filename and tree scroll
2024-07-02 12:14:12 +00:00
PierrunoYT bfa1de4a6b Refactor: Enhance file handling and code editing functionality (#2646)
* refactor: Enhance file handling and code editing functionality

# PR Summary
**refactor: Enhance file handling and code editing functionality**

## PR Description

This pull request includes improvements to file handling, error management, and code editing functionality across multiple files. The changes enhance the robustness, security, and user experience of the application.

### Changes in `listen.py`

1. **Imports and Error Handling**:
   - Removed `warnings` import and its usage with `litellm`.
   - More consistent use of `JSONResponse` and `HTTPException` for error handling.

2. **WebSocket Endpoint (`/ws`)**:
   - Simplified logic for handling events using a single `isinstance` check.

3. **New Endpoint**:
   - Added `/api/save-file` POST endpoint for saving file contents.
   - Implemented checks for agent state before allowing file edits.

4. **Code Style and Organization**:
   - Improved code formatting and organization.
   - Refactored some functions for better readability and consistency.

### Changes in `fileService.ts`

1. **Error Handling**:
   - Added try-catch blocks to all functions for better error handling and logging.

2. **Input Sanitization**:
   - Implemented `encodeURIComponent()` for file names and paths in API requests.

3. **Type Checking**:
   - Added type checks for API responses to ensure data format consistency.

4. **File Upload Improvement**:
   - Refactored `uploadFiles()` to use `Array.from(files)` instead of a for loop.

5. **New Functionality**:
   - Added `saveFile()` function to allow saving file content to a specified path.

### Changes in `CodeEditor.tsx`

1. **New Dependencies**:
   - Added imports for state management, UI components, and file operations.

2. **State Management**:
   - Introduced new state variables for tracking save status and last saved time.
   - Implemented Redux state management for code and agent state.

3. **UI Enhancements**:
   - Added a save button with dynamic colors based on save status.
   - Implemented a save notification system.
   - Added a "Last saved" timestamp display.

4. **File Saving Functionality**:
   - Implemented complete file saving feature with error handling and user feedback.

5. **Code Structure**:
   - Improved structure with additional hooks and memoized values for optimization.

### Testing Performed
- Manually tested new file saving functionality.
- Verified error handling and user feedback mechanisms.
- Checked integration between backend (`listen.py`) and frontend (`fileService.ts`, `CodeEditor.tsx`).

### Next Steps
- Conduct thorough testing of the file saving feature across different scenarios.
- Update documentation to reflect new file handling capabilities.
- Consider adding unit tests for new functions and components.

* Added Docstrings back

Added Docstrings back

* Fix

# Allow Code Editing in AWAITING_USER_INPUT State

## Description
This pull request extends the functionality of the code editor to allow editing when the agent is in the AWAITING_USER_INPUT state, in addition to the existing PAUSED and FINISHED states.

## Changes
1. Backend (`listen.py`):
   - Updated the `save_file` function to allow saving when the agent state is AWAITING_USER_INPUT.

2. Frontend (`CodeEditor.tsx`):
   - Modified the `isEditingAllowed` condition to include the AWAITING_USER_INPUT state.

## Files Changed
- `listen.py`
- `CodeEditor.tsx`

## Testing
- Verified that the save button appears when the agent is in the AWAITING_USER_INPUT state.
- Tested saving files in all three allowed states (PAUSED, FINISHED, AWAITING_USER_INPUT).
- Ensured that saving is still prohibited in other agent states.

## Additional Notes
This change improves the user experience by allowing code edits while the agent is waiting for user input, which is a common scenario in interactive coding sessions.

* Add internationalization for 'File saved successfully' message

# Add internationalization for 'File saved successfully' message

## Description
This PR adds internationalization support for the "File saved successfully" message in the CodeEditor component. It updates the translation.json file to include translations for multiple languages and modifies the CodeEditor.tsx file to use the new translation key.

## Changes
1. Updated `translation.json`:
   - Added a new key `CODE_EDITOR$FILE_SAVED_SUCCESSFULLY` with translations for multiple languages.
   - Ensured the file structure supports multiple languages per key.

2. Modified `CodeEditor.tsx`:
   - Updated the success message to use the new translation key.
   - Applied the translation to both the toast notification and the on-screen notification.

## Why
These changes improve the user experience for non-English speakers by providing localized feedback when a file is successfully saved. This aligns with our goal of making the application more accessible to a global audience.

## How to Test
1. Change the application language to different supported languages.
2. Open the CodeEditor, make changes to a file, and save it.
3. Verify that the "File saved successfully" message appears in the correct language for both the toast and on-screen notifications.

## Additional Notes
Please pay special attention to the structure of the translation.json file to ensure it follows our established patterns for internationalization.

* Add toast notifications for error handling in fileService

# Add toast notifications for error handling in fileService

## Description
This PR enhances the error handling in the `fileService.ts` file by adding toast notifications for user feedback. It maintains the existing console error logging for debugging purposes while improving the user experience by providing visible error messages in the UI.

## Changes
- Added import for the toast utility
- Implemented toast.error() calls in catch blocks for all file operations
- Kept console.error() calls for detailed logging
- Updated error messages to be more user-friendly

## Files Changed
- `src/services/fileService.ts`

## Testing
- Tested all file operations (select, upload, list, save) to ensure proper error handling
- Verified that toast notifications appear when errors are simulated
- Confirmed that console errors are still logged for debugging

## Additional Notes
This change improves error visibility for users without altering the underlying error handling logic. It should make troubleshooting easier for both users and developers.

* Add file path safety check and improve error handling in file services

# Add file path safety check and improve error handling in file services

## Description
This PR enhances the `fileService.ts` by adding a safety check for file paths in the `saveFile` function and improves error handling across all file operations. It also includes new translations for various file-related error messages.

## Changes
1. Updated `src/services/fileService.ts`:
   - Added a validation check for file paths in the saveFile function
   - Improved error handling for all file operations (select, upload, list, save)
   - Implemented toast error messages with translation support

2. Updated `src/i18n/translations.json`:
   - Added new translation keys for file service error messages:
     - FILE_SERVICE$SELECT_FILE_ERROR
     - FILE_SERVICE$UPLOAD_FILES_ERROR
     - FILE_SERVICE$LIST_FILES_ERROR
     - FILE_SERVICE$SAVE_FILE_ERROR
     - FILE_SERVICE$INVALID_FILE_PATH

## Files Changed
- `src/services/fileService.ts`
- `src/i18n/translations.json`

## Key Implementation Details
```typescript
export async function saveFile(filePath: string, content: string): Promise<void> {
  const { t } = useTranslation();
  if (!filePath || filePath.includes('..')) {
    toast.error(t(I18nKey.FILE_SERVICE$INVALID_FILE_PATH));
    throw new Error('Invalid file path');
  }
  try {
    // Existing implementation...
  } catch (error) {
    console.error('Error saving file:', error);
    toast.error(t(I18nKey.FILE_SERVICE$SAVE_FILE_ERROR), 'File Save Error');
    throw error;
  }
}
```

## Testing
- Verified that the saveFile function rejects invalid file paths (empty or containing '..')
- Confirmed that appropriate error messages are displayed using toast notifications for all file operations
- Tested with different languages to ensure translated messages appear correctly

## Security Implications
The file path check in saveFile enhances security by preventing potential directory traversal attacks.

## Next Steps
- Consider adding similar safety checks to other file operations if applicable
- Ensure thorough testing of error scenarios across all supported languages

* Add docstrings to listen.py

# Add docstrings to listen.py

## Description
This PR adds comprehensive docstrings to all functions in the `listen.py` file. These additions improve code documentation, making the file more readable and maintainable for current and future developers.

## Changes
- Added docstrings to all functions in `listen.py`
- Docstrings follow the Google Python Style Guide format
- Included descriptions, parameters, return values, and potential exceptions for each function

## Files Changed
- `src/listen.py`

## Docstring Example
Here's an example of one of the added docstrings:

```python
@app.post('/api/save-file')
async def save_file(request: Request):
    """
    Save a file to the agent's runtime file store.

    This endpoint allows saving a file when the agent is in a paused, finished,
    or awaiting user input state. It checks the agent's state before proceeding
    with the file save operation.

    Args:
        request (Request): The incoming FastAPI request object.

    Returns:
        JSONResponse: A JSON response indicating the success of the operation.

    Raises:
        HTTPException:
            - 403 error if the agent is not in an allowed state for editing.
            - 400 error if the file path or content is missing.
            - 500 error if there's an unexpected error during the save operation.
    """
    # Function implementation...
```

## Impact
- Improved code readability and maintainability
- Better understanding of function purposes, inputs, outputs, and potential errors
- Easier onboarding for new developers working on this file
- Enhanced IDE support for function descriptions and parameter information

## Testing
- No functional changes were made, so existing tests should pass without modification
- Manual review of docstrings for accuracy and completeness is recommended

## Next Steps
- Consider adding similar docstrings to other files in the project for consistency
- Review the added docstrings to ensure they accurately describe the current functionality
- Update docstrings as needed when function implementations change in the future

## Additional Notes
The existing code structure and functionality remain unchanged. This PR focuses solely on improving documentation through the addition of docstrings.

* Revert exclude_list formatting and add docstrings in listen.py

# Revert exclude_list formatting and add docstrings in listen.py

## Description
This PR makes two main changes to the `listen.py` file:
1. Reverts the `exclude_list` in the `list_files` function to its original format, with each item on a separate line.
2. Adds comprehensive docstrings to all functions in the file.

These changes improve code readability, maintain consistency with project standards, and enhance documentation for better maintainability.

## Changes
1. Updated `opendevin/server/listen.py`:
   - Reverted `exclude_list` formatting in `list_files` function
   - Added docstrings to all functions

## Detailed Changes

### 1. Reverted exclude_list formatting
```python
exclude_list = (
    '.git',
    '.DS_Store',
    '.svn',
    '.hg',
    '.idea',
    '.vscode',
    '.settings',
    '.pytest_cache',
    '__pycache__',
    'node_modules',
    'vendor',
    'build',
    'dist',
    'bin',
    'logs',
    'log',
    'tmp',
    'temp',
    'coverage',
    'venv',
    'env',
)
```

### 2. Added docstrings (example)
```python
@app.get('/api/list-files')
def list_files(request: Request, path: str = '/'):
    """
    List files in the specified path.

    This function retrieves a list of files from the agent's runtime file store,
    excluding certain system and hidden files/directories.

    Args:
        request (Request): The incoming request object.
        path (str, optional): The path to list files from. Defaults to '/'.

    Returns:
        list: A list of file names in the specified path.

    Raises:
        HTTPException: If there's an error listing the files.
    """
    # Function implementation...
```

## Rationale
- Reverting `exclude_list` formatting maintains consistency with the project's coding style and ensures proper functioning of pre-commit hooks.
- Adding docstrings improves code documentation, making it easier for developers to understand and maintain the codebase.

## Impact
- Improved code readability and consistency
- Enhanced documentation for all functions in `listen.py`
- Easier onboarding for new developers
- Better IDE support for function descriptions and parameter information

## Testing
- No functional changes were made, so existing tests should pass without modification
- Manual review of the reverted `exclude_list` and new docstrings is recommended

## Additional Notes
- The existing code functionality remains unchanged
- All functions in `listen.py` now have detailed docstrings following the Google Python Style Guide format

## Next Steps
- Review the added docstrings to ensure they accurately describe the current functionality
- Consider adding similar docstrings to other files in the project for consistency
- Update docstrings as needed when function implementations change in the future

* made code reviewable

* fixed ruff issues

* Update listen.py docstrings

* final tweaks

* re-added encodedURIComponent in selectFile

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-07-02 09:42:43 +02:00
Leo 5e6fb6131f refactor: Renamed variables to resolve naming conflicts and eliminate warnings (#2732)
* refactor: Renamed variables to resolve naming conflicts and eliminate warnings

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Fix lint failed.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Combine set_initial_state methods, rename _filed to f, and adjust the AppConfig update codes.

Signed-off-by: ifuryst <ifuryst@gmail.com>

---------

Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-07-02 15:00:58 +08:00
Xingyao Wang 41ddba84bd [Agent] (Potentially) improve Editing using diff (#2685)
* add replace-based block edit & preliminary test case fix

* further fix the insert behavior

* make edit only work on first occurence

* bump codeact version since we now use new edit agentskills

* update prompt for new agentskills

* update integration tests

* make run_infer.sh executable

* remove code block for edit_file

* update integration test for prompt changes

* default to not use hint for eval

* fix insert emptyfile bug

* throw value error when `to_replace` is empty

* make `_edit_or_insert_file` return string so we can try to fix some linter errors (best attempt)

* add todo

* update integration test

* fix sandbox test for this PR
2024-07-02 11:50:15 +09:00
Xingyao Wang 6a0ffc5c61 [Evaluation] Use the latest official SWE-Bench Dockerization for evaluation (#2728)
* add newline after patch to fix patch apply

* new swebench wip

* add newline after patch to fix patch apply

* only add newline if not empty

* update swebench source and update

* update gitignore for swebench eval

* update old prep_eval

* update gitignore

* add scripts for push and pull swebench images

* update eval_infer.sh

* update eval_infer for new docker workflow

* update script to create markdown report based on report.json

* update eval infer to use update output

* update readme

* only move result to folder if running whole file

* remove set-x

* update conversion script

* Update evaluation/swe_bench/README.md

* Update evaluation/swe_bench/README.md

* Update evaluation/swe_bench/README.md

* make sure last line end with newline

* switch to an fix attempt branch of swebench

* Update evaluation/swe_bench/README.md

* Update evaluation/swe_bench/README.md

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-07-01 23:58:30 +00:00
tobitege 6246cb8e67 fix: improve exit_code processing (ssh_box) (#2726)
* improve exit_code processing

* removed debug spam
2024-07-01 19:59:31 +00:00
Arno.Edwards c224a6f3f6 fix(docs): translate missing parts (#2724)
- fix incorrect format in sidebar.json
- add custom_sandbox_guide.md to other language folders
2024-07-01 18:57:21 +02:00
Leo 82ea8b3556 test: fix the failed unit tests. (#2721)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-07-01 15:27:27 +00:00
dependabot[bot] c88b9f8f6d chore(deps-dev): bump ruff from 0.4.10 to 0.5.0 (#2719)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.4.10 to 0.5.0.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.4.10...0.5.0)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-01 23:22:13 +08:00
dependabot[bot] e83d1fe8b8 chore(deps): bump @reduxjs/toolkit from 2.2.5 to 2.2.6 in /frontend (#2717)
Bumps [@reduxjs/toolkit](https://github.com/reduxjs/redux-toolkit) from 2.2.5 to 2.2.6.
- [Release notes](https://github.com/reduxjs/redux-toolkit/releases)
- [Commits](https://github.com/reduxjs/redux-toolkit/compare/v2.2.5...v2.2.6)

---
updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-01 23:21:47 +08:00
dependabot[bot] 1cbbb10c56 chore(deps-dev): bump postcss from 8.4.38 to 8.4.39 in /frontend (#2716)
Bumps [postcss](https://github.com/postcss/postcss) from 8.4.38 to 8.4.39.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.38...8.4.39)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-01 23:21:24 +08:00
dependabot[bot] 51a71aaec9 chore(deps): bump boto3 from 1.34.135 to 1.34.136 (#2718)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.135 to 1.34.136.
- [Release notes](https://github.com/boto/boto3/releases)
- [Commits](https://github.com/boto/boto3/compare/1.34.135...1.34.136)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-01 15:18:28 +00:00
Leo 58191e0af0 style: refine the copy button and add copy feedback for the icon. (#2715)
* Add padding on top for input to overlay the message bubbles.

Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-07-01 14:59:12 +00:00
tobitege 7fda69ff10 mixin: improve logging (#2713)
* mixin: improve logging

* refactor logger creation
2024-07-01 14:58:42 +02:00
tobitege ea3c58dc8c feat: added make run-wsl (#2711)
* feat: added make run-wsl

* moved Windows warning
2024-07-01 09:01:54 +02:00
Engel Nyst 1975689cd4 remove swe agent (#2708) 2024-07-01 12:27:14 +09:00
mamoodi 00542432bd docs: Update documentation with some consistency (#2706)
* Update documentation with some consistency

* Make windows troubleshooting a little more clear

* Apply suggestions from code review

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: Mahmoud Work <mahmoudwork@mahmouds-mini.home>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-30 23:27:58 +00:00
Boxuan Li 8dae1f9307 Bypass MAX_ITERATIONS and MAX_BUDGET_PER_TASK on web GUI (#2697)
Closes #1493

Introduced TRAFFIC_CONTROL_STATE to allow OpenDevin to switch between normal traffic limiting mode and temporarily disabled mode.
2024-06-30 13:19:45 -07:00
Engel Nyst e24c52d060 Small refactoring of obs truncation (#2701)
* refactor truncate_content a bit to be usable by all agents

* adjust doc
2024-06-30 12:12:08 +02:00
Engel Nyst 2d9bb56763 Add ability to restore the cli session (optional) (#2699)
* add ability to restore the main session

* add quick log

* rename to cli session
2024-06-30 06:56:55 +00:00
Engel Nyst 874b4c9075 CLI concurrency (#2695)
* add session id in cli, evals

* fix main sid
2024-06-30 04:04:30 +02:00
Xingyao Wang 15e0c524f4 default to not use hint for eval (#2696) 2024-06-29 21:27:57 +00:00
Engel Nyst 4b1cc56682 Sync history to stream (#2640)
* add event to stream before budget check

* make the budget check before the step

* Update opendevin/controller/agent_controller.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-29 21:19:00 +00:00
மனோஜ்குமார் பழனிச்சாமி b4f63ae076 doc: Update CONTRIBUTING.md (#2671) 2024-06-29 22:06:14 +02:00
Boxuan Li e45b311c35 Remove MAX_CHARS traffic control (#2694)
* Remove MAX_CHARS limiting

* More cleanup
2024-06-29 12:59:41 -07:00
mamoodi 75f3181c08 Update tag to use in README and docs to 0.7.0 (#2683)
Co-authored-by: Mahmoud Work <mahmoudwork@mahmouds-mini.home>
2024-06-29 19:53:49 +00:00
Boxuan Li 79a3e81708 ghcr: Fix local built image name in tests (#2686)
* ghcr: Fix local built image name in tests

* Push to registry when tagged

* Fix push and experiment

* Remove debug
2024-06-29 09:53:30 -07:00
Xingyao Wang e8cb6803df [Evaluation] Improve patch apply in SWE-Bench (#2684)
* add newline after patch to fix patch apply

* only add newline if not empty
2024-06-29 14:11:07 +08:00
tobitege 7d31057904 feat: file explorer: better sorting; .gitignore support; file upload config (#2621)
* feat: file explorer: better sorting; .gitignore support; file upload config

* resolved poetry

* move config settings (no extra file); updated uploading of files; fix exception on refresh of removed folder

* removed console cmds; fix in a toast

* attempt fix of upload toasts

* fix new options' assignments in listen.py
2024-06-28 16:36:25 +00:00
dependabot[bot] b88898235e chore(deps-dev): bump llama-index-vector-stores-chroma (#2680) 2024-06-28 23:49:49 +08:00
dependabot[bot] a9302b15df chore(deps): bump json-repair from 0.25.1 to 0.25.2 (#2679) 2024-06-28 23:49:19 +08:00
dependabot[bot] 52c7019cb3 chore(deps): bump litellm from 1.40.28 to 1.40.29 (#2677) 2024-06-28 23:48:57 +08:00
dependabot[bot] b05aaffd7e chore(deps-dev): bump openai from 1.35.6 to 1.35.7 (#2676) 2024-06-28 23:48:37 +08:00
dependabot[bot] a918198f43 chore(deps): bump vite from 5.3.1 to 5.3.2 in /frontend (#2681) 2024-06-28 23:48:02 +08:00
dependabot[bot] 839c51ed3c chore(deps): bump jose from 5.6.1 to 5.6.2 in /frontend (#2682) 2024-06-28 23:47:35 +08:00
dependabot[bot] 105662a40c chore(deps): bump boto3 from 1.34.134 to 1.34.135 (#2678)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.134 to 1.34.135.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.134...1.34.135)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-28 15:08:16 +00:00
Boxuan Li 7766a3283e CodeActAgent: Fix delegate history (#2672) 2024-06-28 16:37:23 +09:00
மனோஜ்குமார் பழனிச்சாமி af9385322b Refactor: Simplify message formatting (#2670)
Removed redundant `str()` conversion in f-string.
2024-06-28 07:34:26 +02:00
Jiayi Pan 917d96e06f Fix doc error in evals (#2654) 2024-06-27 16:13:47 +00:00
dependabot[bot] cc9cb1bac0 chore(deps): bump litellm from 1.40.27 to 1.40.28 (#2664)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.27 to 1.40.28.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.27...v1.40.28)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-27 15:44:54 +00:00
dependabot[bot] fdef348365 chore(deps): bump jose from 5.5.0 to 5.6.1 in /frontend (#2663) 2024-06-27 15:42:38 +00:00
dependabot[bot] 1827579f4b chore(deps): bump google-generativeai from 0.7.0 to 0.7.1 (#2662) 2024-06-27 23:40:56 +08:00
dependabot[bot] c2969d422a chore(deps-dev): bump openai from 1.35.4 to 1.35.6 (#2660) 2024-06-27 23:40:28 +08:00
dependabot[bot] badfb054f1 chore(deps): bump boto3 from 1.34.133 to 1.34.134 (#2661) 2024-06-27 23:39:58 +08:00
மனோஜ்குமார் பழனிச்சாமி 9919d8e448 Provide [Package already installed] info to LLM (#2642)
* Provide [Package already installed] info to LLM

* regenerate tests
2024-06-27 09:03:54 +00:00
Engel Nyst 58b06cced7 Revert "Show relevant error in UI (#2516)" (#2657)
This reverts commit d0bdae232f.
2024-06-27 08:55:41 +00:00
Boxuan Li 5835680292 Add test for auto_lint after file edit (#2655) 2024-06-27 17:13:45 +09:00
மனோஜ்குமார் பழனிச்சாமி 877d9ae794 Added Sound Notification 🎵 (#2203)
* Added Sound Notification

* prettify

* linted

* disabled by default

* Add dev config for frontend

* add volume icon

* fix typo

* lint

* lint

* Apply suggestions from code review

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Move volume icon to bottom

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-27 01:27:48 +00:00
Xingyao Wang 6099f72ae3 Update docusaurus.config.ts for a changed Doc URL (#2653)
* Update docusaurus.config.ts for a changed Doc URL

* fix url

* fix broken link again

* fix url again
2024-06-26 21:56:42 +00:00
Xingyao Wang 418cda23c5 Update custom_sandbox_guide.md (#2650) 2024-06-26 21:58:19 +02:00
dependabot[bot] 3fb8206b72 chore(deps-dev): bump openai from 1.35.3 to 1.35.4 (#2649)
Bumps [openai](https://github.com/openai/openai-python) from 1.35.3 to 1.35.4.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.35.3...v1.35.4)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 15:51:57 +00:00
dependabot[bot] 7394c91c0e chore(deps): bump litellm from 1.40.26 to 1.40.27 (#2648)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.26 to 1.40.27.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.26...v1.40.27)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 15:46:56 +00:00
dependabot[bot] acee3e9d04 chore(deps): bump boto3 from 1.34.132 to 1.34.133 (#2647)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.132 to 1.34.133.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.132...1.34.133)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 15:42:58 +00:00
dependabot[bot] 8fef94cac4 chore(deps): bump jose from 5.4.1 to 5.5.0 in /frontend (#2645)
Bumps [jose](https://github.com/panva/jose) from 5.4.1 to 5.5.0.
- [Release notes](https://github.com/panva/jose/releases)
- [Changelog](https://github.com/panva/jose/blob/main/CHANGELOG.md)
- [Commits](https://github.com/panva/jose/compare/v5.4.1...v5.5.0)

---
updated-dependencies:
- dependency-name: jose
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 23:40:14 +08:00
Sheun Aluko, MD-MS 0e8d1e5fb8 added custom sandbox guide markdown file (#2637) 2024-06-26 23:33:53 +08:00
dependabot[bot] 10b7a1e9c5 chore(deps-dev): bump @types/node from 20.14.8 to 20.14.9 in /frontend (#2644)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.8 to 20.14.9.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 23:29:55 +08:00
dependabot[bot] 5abe014218 chore(deps): bump react-router-dom from 6.23.1 to 6.24.0 in /frontend (#2628)
Bumps [react-router-dom](https://github.com/remix-run/react-router/tree/HEAD/packages/react-router-dom) from 6.23.1 to 6.24.0.
- [Release notes](https://github.com/remix-run/react-router/releases)
- [Changelog](https://github.com/remix-run/react-router/blob/main/packages/react-router-dom/CHANGELOG.md)
- [Commits](https://github.com/remix-run/react-router/commits/react-router-dom@6.24.0/packages/react-router-dom)

---
updated-dependencies:
- dependency-name: react-router-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 23:19:52 +08:00
tobitege 163ee3abf6 dev: added make-i18n to "build" (package.json) (#2641)
* added i18n to build-frontend

* revert Makefile; update package.json for make-i18n instead
2024-06-26 11:53:40 +02:00
Boxuan Li ee86d8d25e Frontend support for delegation and rejection (#2608)
1. Add support for rejection action on frontend
2. Show users the reason for rejection
3. Get rid of weird empty box after delegation
4. On web GUI, show customer when a delegation starts and ends
2024-06-26 00:30:10 -07:00
Xavier Vergés cd91d45b44 Allow SANDBOX_CONTAINER_IMAGEs built from opendevin/sandbox:main (#2622) 2024-06-26 12:05:07 +08:00
tobitege 94d24b71d7 frontend: apply more i18n (#2639) 2024-06-26 05:54:35 +02:00
PierrunoYT fcbf1f8242 feat(frontend): Add Typing Indicator for Agent Response using Tailwind CSS (#2630)
* feat(frontend): Add Typing Indicator for Agent Response

* finetuning and linting

* evil lil linter

* TailwindCSS styles now

* improve bounce

* merge fix

* merge fix 2

* one more fix

* poetry.lock fix

---------

Co-authored-by: tobitege <tobitege@gmx.de>
2024-06-26 04:15:04 +02:00
dependabot[bot] 7ff746250e chore(deps): bump boto3 from 1.34.131 to 1.34.132 (#2632)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.131 to 1.34.132.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.131...1.34.132)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 02:37:28 +02:00
dependabot[bot] 2b1d2104d9 chore(deps-dev): bump reportlab from 4.2.0 to 4.2.2 (#2635)
Bumps [reportlab](https://www.reportlab.com/) from 4.2.0 to 4.2.2.

---
updated-dependencies:
- dependency-name: reportlab
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 00:45:12 +08:00
dependabot[bot] 406bb13caf chore(deps): bump tenacity from 8.4.1 to 8.4.2 (#2631)
Bumps [tenacity](https://github.com/jd/tenacity) from 8.4.1 to 8.4.2.
- [Release notes](https://github.com/jd/tenacity/releases)
- [Commits](https://github.com/jd/tenacity/compare/8.4.1...8.4.2)

---
updated-dependencies:
- dependency-name: tenacity
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 00:44:52 +08:00
dependabot[bot] c1009069ab chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend (#2629)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 7.13.1 to 7.14.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.14.1/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-25 16:33:12 +00:00
dependabot[bot] 7d413c288e chore(deps): bump litellm from 1.40.25 to 1.40.26 (#2633)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.25 to 1.40.26.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.25...v1.40.26)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 00:06:06 +08:00
dependabot[bot] 781a6250f8 chore(deps-dev): bump mypy from 1.10.0 to 1.10.1 (#2634)
Bumps [mypy](https://github.com/python/mypy) from 1.10.0 to 1.10.1.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/python/mypy/compare/v1.10.0...v1.10.1)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-26 00:05:45 +08:00
dependabot[bot] ffda754256 chore(deps-dev): bump @typescript-eslint/parser in /frontend (#2627)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.13.1 to 7.14.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.14.1/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-25 23:55:40 +08:00
dependabot[bot] 7ca7f803e1 chore(deps): bump framer-motion from 11.2.11 to 11.2.12 in /frontend (#2626)
Bumps [framer-motion](https://github.com/framer/motion) from 11.2.11 to 11.2.12.
- [Changelog](https://github.com/framer/motion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/framer/motion/compare/v11.2.11...v11.2.12)

---
updated-dependencies:
- dependency-name: framer-motion
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-25 15:37:44 +00:00
tobitege dab9a60e00 feat: Agent buttons hover decor (#2623)
* feat: Agent buttons hover decor

* use TailwindCSS for styling
2024-06-25 15:44:34 +02:00
PierrunoYT c99b94dbe9 feat(frontend): Add "Copy" Button to Chat Messages (#2619)
* Add "Copy" Button to Chat Messages

### PR Overview: Add "Copy" Button to Chat Messages

**Description:**
This PR introduces a "Copy" button to each chat message in the `ChatMessage` component. The button allows users to easily copy the content of a chat message to their clipboard. The implementation includes a button with a clipboard icon and the necessary functionality to copy the message content.

**Changes Made:**
1. **Imports:**
   - Added `FaClipboard` from `react-icons/fa` for the clipboard icon.
   - Added `toast` from `#utils/toast` for displaying notifications.

2. **New Functionality:**
   - Implemented `copyToClipboard` function using `navigator.clipboard.writeText` to copy the message content.
   - Added a button element with an `onClick` handler to trigger the `copyToClipboard` function.

3. **UI Enhancements:**
   - The button is styled to match the existing UI and is placed next to the message content.

**Code Changes:**
- Modified `frontend/src/components/chat/ChatMessage.tsx` to include the new button and functionality.

**Testing:**
- Verified that clicking the "Copy" button copies the message content to the clipboard.
- Confirmed that a toast notification appears upon successful copy or failure.

**Example Code:**
```tsx
import React from "react";
import Markdown from "react-markdown";
import { twMerge } from "tailwind-merge";
import { code } from "../markdown/code";
import { FaClipboard } from "react-icons/fa";
import toast from "#/utils/toast"; // Assuming you have a toast utility for notifications

interface MessageProps {
  message: Message;
}

function ChatMessage({ message }: MessageProps) {
  const className = twMerge(
    "markdown-body",
    "p-3 text-white max-w-[90%] overflow-y-auto rounded-lg",
    message.sender === "user" ? "bg-neutral-700 self-end" : "bg-neutral-500",
  );

  const copyToClipboard = () => {
    navigator.clipboard.writeText(message.content).then(() => {
      toast.info("Message copied to clipboard!");
    }).catch((error) => {
      toast.error(`Failed to copy message: ${error}`);
    });
  };

  return (
    <div data-testid="message" className={className}>
      <div className="flex justify-between items-center">
        <Markdown components={{ code }}>{message.content}</Markdown>
        <button
          onClick={copyToClipboard}
          className="ml-2 p-1 bg-neutral-600 rounded hover:bg-neutral-500"
          aria-label="Copy message"
        >
          <FaClipboard />
        </button>
      </div>
    </div>
  );
}

export default ChatMessage;
```

**Notes:**
- Ensure that the `react-icons` package is installed (`npm install react-icons` or `yarn add react-icons`).
- The toast utility is assumed to be available for notifications. If not, consider using an alternative notification method.

* layout enhancements; linting

---------

Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-06-25 06:15:44 +00:00
Boxuan Li 7e78fde48f Bug fix: add error observation to history (#2610)
* Bug fix: add error observation to history

* Regenerate to demonstrate format error
2024-06-24 21:24:17 -07:00
dependabot[bot] e71b8d1464 chore(deps-dev): bump @types/node from 20.14.7 to 20.14.8 in /frontend (#2617)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.7 to 20.14.8.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-24 21:36:33 +03:00
dependabot[bot] 4b35497b54 chore(deps): bump litellm from 1.40.20 to 1.40.25 (#2615)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.20 to 1.40.25.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.20...v1.40.25)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-24 19:08:03 +02:00
dependabot[bot] 6042b226a0 chore(deps): bump boto3 from 1.34.130 to 1.34.131 (#2616)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.130 to 1.34.131.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.130...1.34.131)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-24 19:04:28 +02:00
tobitege 1117dfebeb feat: update version to 0.6.2. added Action to update pyproject on Release (#2552)
* updated version; added Action to update pyproject version by current tag (if changed)

* higer pyproject version creates a tag now

* Release-only run to write tag to pyproject
2024-06-24 18:34:57 +02:00
Boxuan Li 8bce806dce Tweak prompts of ManagerAgent and CommitWriterAgent (#2609)
* Tweak prompts of ManagerAgent and CommitWriterAgent

* Fix prompts
2024-06-24 00:14:28 -07:00
Boxuan Li 39d90c0b2a Track metrics throughout delegation & Polish UX for out of budget error (#2595)
* Track metrics (costs) throught delegation

* Metrics should be shared across agents for better UX

* Update cost before starting delegate
2024-06-23 18:38:52 -07:00
Xingyao Wang 6de584d77d update swe-bench output with eval results (#2606) 2024-06-24 08:07:28 +09:00
mamoodi 4d1ffa1aaf Default makefile for persist_sandbox to be false (#2605)
Co-authored-by: Mahmoud Work <mahmoudwork@mahmouds-mini.home>
2024-06-23 15:56:56 +00:00
Arno.Edwards e1caf1db33 Add i18n support for official website (#2463)
* feat(i18n): initial i18n setup

- Configured i18n settings in docusaurus.config.js
- Implemented Translate component and translate function in key components

* docs(i18n): complete documentation internationalization

- Added support for Simplified Chinese and French

* Update docs/i18n/zh-Hans/docusaurus-plugin-content-docs/current/usage/troubleshooting/troubleshooting.md

* Update docs/i18n/zh-Hans/docusaurus-plugin-content-docs/current/usage/troubleshooting/troubleshooting.md

* Update docs/i18n/zh-Hans/docusaurus-plugin-content-docs/current/usage/troubleshooting/troubleshooting.md

* fix(build): resolve broken links causing build failure

- Fix issue causing build errors due to broken links in Docusaurus documentation.
- Resolve uncontrolled resource consumption in braces (see: https://github.com/advisories/GHSA-grv7-fg5c-xmjg).
- Bump Docusaurus to ^3.4.0 to fix MDX loader: linkify should process the md AST instead of the md string.

* fix: sync with commit 868b746

-  Change to `docusaurus write-translations` to provide translation for JSON files.

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-23 15:13:04 +00:00
sp.wack 868b7460d4 Replace console errors with toast errors (#2601) 2024-06-23 07:09:16 +00:00
sp.wack 63a3285b23 Disable eslint rule that throws useeffect warning (#2600) 2024-06-23 06:32:37 +00:00
sp.wack 0c1d6f8b4e fix(frontend): Prevent actions from disappearing before sending data (#2599)
* Prevent actions from disappearing before sending data

* Lint
2024-06-23 06:28:48 +00:00
மனோஜ்குமார் பழனிச்சாமி c455a09e43 Remove Colima and lima directory after uninstalling for Mac OS CI (#2598)
* Remove colima dir after uninstall

* Delete lima dir
2024-06-23 08:28:29 +02:00
Graham Neubig cab7a288ca Add NUM_WORKERS variable to run_infer.sh scripts for configurable woker settings (#2597)
* Add NUM_WORKERS variable to run_infer.sh scripts for configurable worker settings

* Update evaluation/webarena/scripts/run_infer.sh

---------

Co-authored-by: OpenDevin <opendevin@all-hands.dev>
2024-06-23 03:43:43 +00:00
mamoodi 9af2a08eda Remove PERSIST_SANDBOX=true option (#2591) 2024-06-22 17:59:34 +00:00
மனோஜ்குமார் பழனிச்சாமி 6bf1b56f06 Revert "Enable "vz" vm-type for MacOS CI (#2586)" (#2588)
This reverts commit 57b56c0536.
2024-06-22 23:45:14 +08:00
Graham Neubig 6312c21f8e Update documentation regarding feedback data usage (#2585)
* Update feedback data usage

* Update hugging face
2024-06-22 23:39:40 +08:00
மனோஜ்குமார் பழனிச்சாமி 57b56c0536 Enable "vz" vm-type for MacOS CI (#2586) 2024-06-22 12:30:18 +00:00
மனோஜ்குமார் பழனிச்சாமி 4c5e5a0b5a Mention Ubuntu supported versions (#2584) 2024-06-22 10:19:52 +00:00
sp.wack fa86e32231 Update feedback modal content (#2582) 2024-06-22 08:12:39 +00:00
Graham Neubig 45d7a53b91 Add links to a feedback sharing site (#2580)
* Add links to a feedback sharing site

* Remove console log

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-06-22 09:36:31 +02:00
dependabot[bot] 7ad6ab1548 chore(deps-dev): bump ruff from 0.4.9 to 0.4.10 (#2573)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.4.9 to 0.4.10.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.4.9...v0.4.10)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-22 03:14:30 +00:00
dependabot[bot] ec43e67ae0 chore(deps): bump litellm from 1.40.17 to 1.40.20 (#2576)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.17 to 1.40.20.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.17...v1.40.20)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-22 03:13:12 +00:00
மனோஜ்குமார் பழனிச்சாமி c743320201 Interactive Terminal (#2493)
* Interactive Terminal

* linted

* fixed tests

* fixed tests

* refactored logic

* remove console logs
2024-06-21 20:56:54 -06:00
மனோஜ்குமார் பழனிச்சாமி 0845d475b8 Fix Mac CI Test (#2569)
* Fix Mac CI Test

* Start colima service

* unlink colima dependency: go

* Check for colima

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix indent

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Try with uninstall

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-22 02:36:23 +00:00
Shimada666 5972498e23 No longer chown -R the miniforge3 folder (#2566)
* No longer chown -R the miniforge3 folder

* change miniforge3 group permission
2024-06-21 17:33:00 +00:00
dependabot[bot] 4567dd60e4 chore(deps): bump json-repair from 0.25.0 to 0.25.1 (#2575)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.25.0 to 0.25.1.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.25.0...0.25.1)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-21 09:38:37 -06:00
dependabot[bot] 13adfe5625 chore(deps-dev): bump openai from 1.35.1 to 1.35.3 (#2574)
Bumps [openai](https://github.com/openai/openai-python) from 1.35.1 to 1.35.3.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.35.1...v1.35.3)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-21 09:38:21 -06:00
dependabot[bot] 900743ee7a chore(deps-dev): bump streamlit from 1.35.0 to 1.36.0 (#2572)
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.35.0 to 1.36.0.
- [Release notes](https://github.com/streamlit/streamlit/releases)
- [Commits](https://github.com/streamlit/streamlit/compare/1.35.0...1.36.0)

---
updated-dependencies:
- dependency-name: streamlit
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-21 09:38:02 -06:00
dependabot[bot] 1076789c23 chore(deps-dev): bump @types/node from 20.14.6 to 20.14.7 in /frontend (#2571)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.6 to 20.14.7.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-21 15:23:25 +00:00
dependabot[bot] 3f158d09e2 chore(deps-dev): bump typescript from 5.4.5 to 5.5.2 in /frontend (#2570)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 5.4.5 to 5.5.2.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release.yml)
- [Commits](https://github.com/Microsoft/TypeScript/compare/v5.4.5...v5.5.2)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-21 15:18:21 +00:00
Graham Neubig 7e44c8f348 Update doc to clarify OpenDevin mission and link directly to content (#2568)
* Update doc

* Fix review comments
2024-06-21 15:12:40 +00:00
tobitege 6879acdfe7 fix: Makefile shall pull sandbox:main, not :latest (#2561) 2024-06-21 11:27:56 +02:00
Shimada666 64c2a783d4 Revert "Always pull sandbox image (#2538)" (#2560)
This reverts commit 6dd2491944.
2024-06-21 08:09:26 +00:00
Shimada666 1ffaed48c4 Stop always pulling the latest image. (#2558)
LGTM, thanks!
2024-06-21 05:42:49 +00:00
Boxuan Li 01fa52d062 Enforce linter in tests folder (#2557) 2024-06-20 21:50:34 -07:00
மனோஜ்குமார் பழனிச்சாமி f3ba2f02d5 Fix Mac OS CI - usernet unable to resolve IP for SSH forwarding (#2556) 2024-06-21 10:02:26 +05:30
dependabot[bot] 2df629f912 chore(deps-dev): bump eslint-plugin-jsx-a11y in /frontend (#2543)
Bumps [eslint-plugin-jsx-a11y](https://github.com/jsx-eslint/eslint-plugin-jsx-a11y) from 6.8.0 to 6.9.0.
- [Release notes](https://github.com/jsx-eslint/eslint-plugin-jsx-a11y/releases)
- [Changelog](https://github.com/jsx-eslint/eslint-plugin-jsx-a11y/blob/main/CHANGELOG.md)
- [Commits](https://github.com/jsx-eslint/eslint-plugin-jsx-a11y/compare/v6.8.0...v6.9.0)

---
updated-dependencies:
- dependency-name: eslint-plugin-jsx-a11y
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-21 07:30:29 +03:00
மனோஜ்குமார் பழனிச்சாமி 41564c2eac Use :main instead of :latest (#2539)
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-21 03:57:50 +00:00
tobitege ff9f34bea2 sec: update npm module to 8.17.1 (#2554) 2024-06-21 02:45:09 +00:00
Boxuan Li bfa00932cc Enable test_agnostic_sandbox_jupyter_agentskills_fileop_pwd in CI (#2534)
* Enable test_agnostic_sandbox_jupyter_agentskills_fileop_pwd in CI

* Fix env variable value
2024-06-20 20:39:11 -06:00
Boxuan Li feabc97aba Evaluation time travel: build sandbox on the fly (#2491) 2024-06-20 20:22:02 -06:00
மனோஜ்குமார் பழனிச்சாமி 6dd2491944 Always pull sandbox image (#2538) 2024-06-21 06:52:18 +05:30
dependabot[bot] 0b742826ea chore(deps-dev): bump openai from 1.34.0 to 1.35.1 (#2547) 2024-06-21 01:17:51 +08:00
Xingyao Wang eac05d71fa replace / for tag (#2550) 2024-06-20 11:09:29 -06:00
Shimada666 0c2882d552 add workspace text (#2548) 2024-06-21 00:57:10 +08:00
மனோஜ்குமார் பழனிச்சாமி 35fc2177d0 Fix Mac CI test (#2544) 2024-06-21 00:56:23 +08:00
dependabot[bot] 8e22cd5f5e chore(deps-dev): bump @types/node from 20.14.5 to 20.14.6 in /frontend (#2542) 2024-06-21 00:36:46 +08:00
dependabot[bot] cac8159fd8 chore(deps): bump boto3 from 1.34.129 to 1.34.130 (#2546) 2024-06-21 00:36:16 +08:00
dependabot[bot] ff0a35451d chore(deps): bump monaco-editor from 0.49.0 to 0.50.0 in /frontend (#2541)
Bumps [monaco-editor](https://github.com/microsoft/monaco-editor) from 0.49.0 to 0.50.0.
- [Release notes](https://github.com/microsoft/monaco-editor/releases)
- [Changelog](https://github.com/microsoft/monaco-editor/blob/main/CHANGELOG.md)
- [Commits](https://github.com/microsoft/monaco-editor/compare/v0.49.0...v0.50.0)

---
updated-dependencies:
- dependency-name: monaco-editor
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-20 09:52:26 -06:00
Robert Brennan 373a700599 Architecture documentation (#2116)
* add initial arch

* add agent

* update docs

* Update opendevin/README.md

* Update opendevin/README.md

* Update opendevin/README.md
2024-06-20 15:16:06 +00:00
Shimada666 9ec5e4f3e2 remove gcc (#2536) 2024-06-20 13:49:26 +05:30
மனோஜ்குமார் பழனிச்சாமி 833bb50fb0 Downgrade Mac version in CI/CD Pipeline (#2499)
* downgrade mac version in CI

* Delete run-integration-tests.yml
2024-06-19 22:12:15 -07:00
Shimada666 26fc3c886a Make plugins sandbox-agnostic (#2101)
* tmp

* tmp

* merge main

* feat: auto build image cache

* remove plugins

* use config file

* update mamba setup shell

* support agnostic sandbox image autobuild

* remove config

* Update .gitignore

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* Update opendevin/runtime/docker/ssh_box.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* update setup.sh

* readd sudo

* add sudo in dockerfile

* remove export

* move od-runtime dependencies to sandbox dockerfile

* factor out re-build logic into a separate util file

* tweak existing plugin to use OD specific sandbox

* update testcase

* attempt to fix unit test using image built in ghcr

* use cache tag

* try to fix unit tests

* add unittest

* add unittest

* add some unittests

* revert gh workflow changes

* feat: optimize sandbox image naming rule

* add pull latest image hint

* add opendevin python hint and use mamba to install gcc

* update docker image naming rule and fix mamba issue

* Update opendevin/runtime/docker/ssh_box.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix: opendevin user use correct pip

* fix lint issue

* fix custom sandbox base image

* rename test name

* add skipif

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: tobitege <tobitege@gmx.de>
2024-06-19 19:58:07 -07:00
Xingyao Wang b569ba710d docs: Add visualizer instruction for SWE-Bench (#2529)
* Update README.md for visualizer instruction

* Polish the visualization guidance (#2531)

* fix conda create error

* fix and polish the readme for visualization

* Update README.md

---------

Co-authored-by: Haofei Yu <haofeiy@cs.cmu.edu>
2024-06-19 20:41:09 +00:00
Xingyao Wang 0a0f78f2fb docs: Update link to evaluation benchmark on README.md (#2530) 2024-06-19 20:32:39 +00:00
mamoodi c4ad3b8f47 docs: Fix formatting in CONTRIBUTING (#2526)
Co-authored-by: Mahmoud Work <mahmoudwork@mahmouds-mini.home>
2024-06-20 03:14:46 +08:00
dependabot[bot] dda435e5e8 chore(deps-dev): bump eslint-plugin-react in /frontend (#2524)
Bumps [eslint-plugin-react](https://github.com/jsx-eslint/eslint-plugin-react) from 7.34.2 to 7.34.3.
- [Release notes](https://github.com/jsx-eslint/eslint-plugin-react/releases)
- [Changelog](https://github.com/jsx-eslint/eslint-plugin-react/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jsx-eslint/eslint-plugin-react/compare/v7.34.2...v7.34.3)

---
updated-dependencies:
- dependency-name: eslint-plugin-react
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-06-19 17:10:08 +00:00
Graham Neubig 8ee5cbc786 Update CONTRIBUTING.md (#2525)
This PR tries to update contributing.md to add a little more structure and make it clear the different ways to contribute.
2024-06-19 15:41:48 +00:00
dependabot[bot] 64fe3d4cba chore(deps): bump framer-motion from 11.2.10 to 11.2.11 in /frontend (#2523)
Bumps [framer-motion](https://github.com/framer/motion) from 11.2.10 to 11.2.11.
- [Changelog](https://github.com/framer/motion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/framer/motion/compare/v11.2.10...v11.2.11)

---
updated-dependencies:
- dependency-name: framer-motion
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-19 09:33:33 -06:00
dependabot[bot] 718e435bb8 chore(deps): bump litellm from 1.40.16 to 1.40.17 (#2522)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.16 to 1.40.17.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.16...v1.40.17)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-19 15:13:40 +00:00
dependabot[bot] 13bd154b50 chore(deps): bump boto3 from 1.34.128 to 1.34.129 (#2520)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.128 to 1.34.129.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.128...1.34.129)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-19 15:12:56 +00:00
dependabot[bot] a67b79d9ea chore(deps): bump json-repair from 0.23.1 to 0.25.0 (#2521)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.23.1 to 0.25.0.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.23.1...0.25.0)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-19 15:10:53 +00:00
dependabot[bot] e1c6af865b chore(deps-dev): bump chromadb from 0.5.1 to 0.5.3 (#2514)
Bumps [chromadb](https://github.com/chroma-core/chroma) from 0.5.1 to 0.5.3.
- [Release notes](https://github.com/chroma-core/chroma/releases)
- [Changelog](https://github.com/chroma-core/chroma/blob/main/RELEASE_PROCESS.md)
- [Commits](https://github.com/chroma-core/chroma/compare/0.5.1...0.5.3)

---
updated-dependencies:
- dependency-name: chromadb
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-19 16:49:00 +02:00
மனோஜ்குமார் பழனிச்சாமி d0bdae232f Show relevant error in UI (#2516) 2024-06-19 15:58:48 +05:30
மனோஜ்குமார் பழனிச்சாமி 92a1eaa866 Add dev config for frontend for WSL (#2506)
* Add dev config for frontend

* Update package.json

* Update package.json
2024-06-19 14:06:06 +05:30
dependabot[bot] 3ffcc7157d chore(deps-dev): bump @typescript-eslint/parser in /frontend (#2497)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.13.0 to 7.13.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.13.1/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-19 00:26:20 -07:00
Boxuan Li 89cae8d6b8 Fix Docker tagging issue with upper case (#2512)
* Fix Docker tagging issue with upper case

* Update containers/build.sh

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Use tr command which is available on both zsh and bash

* Lower image name

* Lower image name

* Update .github/workflows/ghcr.yml

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Fix shell syntax

---------

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2024-06-19 00:11:18 -06:00
tobitege 3f869ae853 default timezone in build (#2513) 2024-06-19 08:07:57 +02:00
tobitege 8f48d10442 poetry.lock updated (lots of changes) (#2492)
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-19 04:10:41 +00:00
Engel Nyst 80fe13f4be rename our completion as a drop-in replacement of litellm completion (#2509) 2024-06-19 05:25:25 +02:00
Engel Nyst b2307db010 Document, rename Agent* exceptions to LLM* (#2508)
* rename "Agent" exceptions to LLM*, document

* LLMResponseError
2024-06-18 22:30:22 +00:00
dependabot[bot] 3d33bc6813 chore(deps): bump boto3 from 1.34.126 to 1.34.128 (#2504)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.126 to 1.34.128.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.126...1.34.128)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-19 01:54:15 +08:00
Boxuan Li c2fa99b4a1 Split container image build & push (#2456)
* Split container image build & push

* Code cleanup

* Cleanup

* Add back useless docker_build_success step to make CI happy

* Revert "Cleanup"

This reverts commit 2a260791a9.

* Use fresh built sandbox image in integration test

* fix dependency

* DEBUG: only build

* Attempt to fix dependency

* Change dependency

* Combine both jobs

* Fix env

* Remove Mac integration tests as they are too unstable

* Move sandbox tests to ghcr

* Use loaded image
2024-06-19 01:44:51 +08:00
dependabot[bot] 5e8ef23a53 chore(deps): bump litellm from 1.40.15 to 1.40.16 (#2501) 2024-06-19 00:40:46 +08:00
Xingyao Wang 1f379bebc2 Update README.md (#2505)
LGTM
2024-06-18 18:14:21 +02:00
dependabot[bot] 1f72f4c401 chore(deps): bump google-generativeai from 0.6.0 to 0.7.0 (#2500) 2024-06-18 15:16:17 +00:00
dependabot[bot] 07fa57ff69 chore(deps-dev): bump @types/node from 20.14.2 to 20.14.5 in /frontend (#2498) 2024-06-18 22:54:46 +08:00
dependabot[bot] 9b9dae9463 chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend (#2495) 2024-06-18 22:53:51 +08:00
dependabot[bot] 1bc3ca20da chore(deps): bump jose from 5.4.0 to 5.4.1 in /frontend (#2496) 2024-06-18 22:53:06 +08:00
Shimada666 e4d78153fe feat: add environments in global bashrc file (#2486) 2024-06-18 02:36:35 +00:00
dependabot[bot] e001c6a9b8 chore(deps-dev): bump ruff from 0.4.8 to 0.4.9 (#2482) 2024-06-17 16:40:50 +00:00
dependabot[bot] f8607e0ccf chore(deps): bump @nextui-org/react from 2.4.1 to 2.4.2 in /frontend (#2485) 2024-06-18 00:10:13 +08:00
dependabot[bot] f91fb3a839 chore(deps): bump tenacity from 8.3.0 to 8.4.1 (#2483) 2024-06-18 00:09:20 +08:00
dependabot[bot] 1075f5abad chore(deps): bump litellm from 1.40.12 to 1.40.15 (#2484) 2024-06-18 00:08:52 +08:00
dependabot[bot] 75ab295aec chore(deps-dev): bump flake8 from 7.0.0 to 7.1.0 (#2481) 2024-06-18 00:08:18 +08:00
மனோஜ்குமார் பழனிச்சாமி 5ae647f5c2 Bump docker version (#2479) 2024-06-17 13:50:48 +00:00
Boxuan Li 009f6f9ebc Integration tests: check agent error and fix test_edits (#2473)
* Integration tests: check agent error and fix test_edits

* Fix CodeActSWEAgent test_ipython_module prompt logs
2024-06-17 20:39:03 +08:00
மனோஜ்குமார் பழனிச்சாமி f4c917345f Reworded port forward msg (#2478)
* Reworded port forward msg

* applied suggestions.

---------

Co-authored-by: tobitege <tobitege@gmx.de>
2024-06-17 11:40:37 +00:00
tobitege 3142764104 fix: test_ipython being skipped (#2477) 2024-06-17 16:33:37 +05:30
tobitege d2509a19c8 fix: logger with more masking of sensitive data (#2470)
* fix: more logger sensitive masking

* fix: test_config.py updated for more sensitive patterns

* added one more...
2024-06-16 17:32:26 -04:00
Graham Neubig 798921cec3 Codecov after_n_builds=5 (#2468)
* Codecov after_n_builds=2

* Update to 5
2024-06-17 04:40:12 +08:00
tobitege 823298e0d0 fix: Agentskills enhancements (#2384)
* avoid repeat logging of unneeded messages

* refactored append/edit_file (tests next)

* agentskills and unit test fixes

* testing

* more changes and test prompts

* smaller changes

* final test fixes

* remove dead code from test_agent.py

* reverting unneeded changes

* updated tests, more tweaks to skills

* refactor (#2442)

* chores: fix DelegatorAgent description (#2446)

* change

* change comments

* fix

* stopped container to prevent port issues. (#2447)

* chore: remove useless browsing code in CodeActSWEAgent (#2438)

* remove useless

* fix integration test

* Regenerate test_ipython_module artifacts for CodeActSWEAgent

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Merge remote-tracking branch 'upstream/main' into agent-fileops

* unneeded tweak

* * fix edit_file to not introduce extra newline
* updated docstrings with more details for LLM
* fix legacy typo in prompts causing ]] instead of ]
* several mock files regenerated

* Regen'ed CodeActSWEAgent integration tests

* fix _print_window signature; explicit exception type in _is_valid_path

* splitlines with named param

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-16 15:06:46 -04:00
மனோஜ்குமார் பழனிச்சாமி cb44c116cd detailed jupyter error log (#2448) 2024-06-16 14:28:24 -04:00
Yufan Song 85c0eae31d remove (#2465) 2024-06-17 00:45:48 +08:00
mamoodi b5e6ba8239 docs: Update Development and CONTRIBUTING docs (#2453)
* docs: Update Development and CONTRIBUTING docs

* Explain the PR process in simpler terms

* Fix formatting

---------

Co-authored-by: Mahmoud Work <mahmoudwork@mahmouds-mini.home>
2024-06-16 10:45:10 -04:00
tobitege 2d824947f8 fix: improve toml parsing exception (#2459) 2024-06-16 14:27:21 +00:00
Boxuan Li 6f235937cf Evaluation time travel: allow evaluation on a specific version (#2356)
* Time travel for evaluation

* Fix source script path

* Exit script if given version doesn't exist

* Exit on failure

* Update README

* Change scripts of all other benchmarks

* Modify README files

* Fix logic_reasoning README
2024-06-16 10:25:14 -04:00
மனோஜ்குமார் பழனிச்சாமி 5e61a10627 Fixed typo in PR template (#2461) 2024-06-16 13:46:06 +00:00
மனோஜ்குமார் பழனிச்சாமி b5b81d06a1 Added Pull Request Template (#2454) 2024-06-16 05:54:44 -04:00
Yufan Song 426e429b18 chore: remove useless browsing code in CodeActSWEAgent (#2438)
* remove useless

* fix integration test

* Regenerate test_ipython_module artifacts for CodeActSWEAgent

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-15 10:53:03 +08:00
மனோஜ்குமார் பழனிச்சாமி ac6ef8e59d stopped container to prevent port issues. (#2447) 2024-06-15 00:31:33 +00:00
Yufan Song c82666fea3 chores: fix DelegatorAgent description (#2446)
* change

* change comments

* fix
2024-06-15 00:04:43 +00:00
Yufan Song fd29b8faa8 refactor (#2442) 2024-06-14 19:02:25 -04:00
dependabot[bot] 0d5dc1faeb chore(deps): bump vite from 5.3.0 to 5.3.1 in /frontend (#2441)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.3.0 to 5.3.1.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.3.1/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-14 23:11:35 +08:00
dependabot[bot] c272e2e851 chore(deps): bump boto3 from 1.34.125 to 1.34.126 (#2439)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.125 to 1.34.126.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.125...1.34.126)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-14 22:57:58 +08:00
dependabot[bot] 1748d47437 chore(deps): bump litellm from 1.40.9 to 1.40.12 (#2440)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.9 to 1.40.12.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.9...v1.40.12)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-14 22:57:31 +08:00
Engel Nyst bb4ea1e6cb Adjust is-stuck check for the same steps to 3 until it's stopped (#2437) 2024-06-14 19:20:12 +05:30
Boxuan Li dd1095cf6b regenerate.sh: Exit upon common known errors (#2385)
* Exit regenerate.sh upon common known errors

* More fixes

* Remove mention of transient issue

* Use tmp file instead of tty

* Remove redundant cleanup
2024-06-13 23:42:58 -07:00
Engel Nyst 1cc70be616 workspace_mount_path sentinel: an undefined string (#2431) 2024-06-14 10:39:33 +05:30
Graham Neubig 48a2f24bf4 Replace all instances of OPENDEVIN_WORKSPACE with WORKSPACE_BASE (#2418)
* Replace all instances of OPENDEVIN_WORKSPACE with WORKSPACE_BASE

* Update frontend/src/i18n/translation.json

* Update frontend/src/i18n/translation.json

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-06-13 18:27:56 +00:00
dependabot[bot] b2dc177562 chore(deps): bump datasets from 2.19.2 to 2.20.0 (#2424)
Bumps [datasets](https://github.com/huggingface/datasets) from 2.19.2 to 2.20.0.
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](https://github.com/huggingface/datasets/compare/2.19.2...2.20.0)

---
updated-dependencies:
- dependency-name: datasets
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-13 18:18:46 +02:00
dependabot[bot] ce0034eeaa chore(deps): bump boto3 from 1.34.124 to 1.34.125 (#2423)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.124 to 1.34.125.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.124...1.34.125)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-13 18:09:17 +02:00
dependabot[bot] d42e02f286 chore(deps-dev): bump openai from 1.33.0 to 1.34.0 (#2422)
Bumps [openai](https://github.com/openai/openai-python) from 1.33.0 to 1.34.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.33.0...v1.34.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-13 17:49:25 +02:00
dependabot[bot] 0f26ff214a chore(deps): bump vite from 5.2.13 to 5.3.0 in /frontend (#2416)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.2.13 to 5.3.0.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.3.0/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-13 14:57:37 +00:00
dependabot[bot] 52719b5ded chore(deps-dev): bump lint-staged from 15.2.6 to 15.2.7 in /frontend (#2414)
Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.6 to 15.2.7.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.6...v15.2.7)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-13 14:56:16 +00:00
Yufan Song e04bde851e test (#2409)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-06-13 21:10:17 +08:00
Yufan Song 7f8c324d3a Refactor CodeActSWEAgent, add response parser (#2368)
* refactor code

* fix

* seperate code

* Update agenthub/codeact_swe_agent/response_parser.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_swe_agent/response_parser.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_swe_agent/action_parser.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* remove browsing

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-13 04:14:05 +00:00
Yufan Song 0c92144220 Refactor MonologueAgent, PlannerAgent add response parser (#2400)
* refactor monologue

* refactor planner_agent

* fix bug

* add back code

* add back code
2024-06-13 12:00:27 +08:00
super-dainiu 563bc41fd3 Use LLM to analyze ML-Bench failure cases (#2399)
* add ml-bench w/o exec env

* fix typos (#1956)

no functional change

* Refactored Logs (#1939)

* [Feat] A competitive Web Browsing agent (#1856)

* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update README.md SWE-bench score (#1959)

* Update README.md SWE-bench score

Our most recent results on swe-bench lite are 25%, so this updates the README accordingly.

* Update

* fix: llm is_local function logic error (#1961)

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* doc: update documentation about poetry update (#1962)

* add doc

* Update Development.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* feat: add metrics related to cost for better observability (#1944)

* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* doc: add more cmd in unit test documentation (#1963)

* --- (#1975)

updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1976)

updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Logging security (#1943)

* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>

* --- (#1967)

updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1968)

updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1969)

updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1970)

updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1971)

updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Refactor session management (#1810)

* refactor session mgmt

* defer file handling to runtime

* add todo

* refactor sessions a bit more

* remove messages logic from FE

* fix up socket handshake

* refactor frontend auth a bit

* first pass at redoing file explorer

* implement directory suffix

* fix up file tree

* close agent on websocket close

* remove session saving

* move file refresh

* remove getWorkspace

* plumb path/code differently

* fix build issues

* fix the tests

* fix npm build

* add session rehydration

* fix event serialization

* logspam

* fix user message rehydration

* add get_event fn

* agent state restoration

* change history tracking for codeact

* fix responsiveness of init

* fix lint

* lint

* delint

* fix prop

* update tests

* logspam

* lint

* fix test

* revert codeact

* change fileService to use API

* fix up session loading

* delint

* delint

* fix integration tests

* revert test

* fix up access to options endpoints

* fix initial files load

* delint

* fix file initialization

* fix mock server

* fixl int

* fix auth for html

* Update frontend/src/i18n/translation.json

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor sessions and sockets

* avoid reinitializing the same session

* fix reconnect issue

* change up intro message

* more guards on reinit

* rename agent_session

* delint

* fix a bunch of tests

* delint

* fix last test

* remove code editor context

* fix build

* fix any

* fix dot notation

* Update frontend/src/services/api.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix up error handling

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update frontend/src/services/session.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix build errs

* fix else

* add closed state

* delint

* Update opendevin/server/session/session.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix #1960 (#1964)

* Add ruff for shared mutable defaults (B) (#1938)

* Add ruff for shared mutable defaults (B)

* Apply B006, B008 on current files, except fast API

* Update agenthub/SWE_agent/prompts.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix unintended behavior change

* this is correct, tell Ruff to leave it alone

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888)

* Add MacOS to integration tests

* Switch back to python 3.11

* Install Docker for macos pipeline

* regenerate.sh: Use environmental variable for sandbox type

* Pack different agents' tests into a single check

* Fix CodeAct tests

* Reduce file match and extensive debug logs

* Add TEST_IN_CI mode that reports codecov

* Small fix: don't quit if reusing old responses failed

* Merge codecov results

* Fix typos

* Remove coverage merge step - codecov automatically does that

* Make mac integration tests as optional - too slow

* Fix codecov args

* Add comments in yaml

* Include sandbox type in codecov report name

* Fix codecov report merge

* Revert renaming of test_matrix_success

* Remove SWEAgent and PlannerAgent from tests

* Mark planner agent and SWE agent as deprecated

* CodeCov: Ignore planner and sweagent

* Revert "Remove SWEAgent and PlannerAgent from tests"

This reverts commit 040cb3bfb9.

* Remove all tests for SWE Agent

* Only keep basic tests for MonologueAgent and PlannerAgent

* Mark SWE Agent as deprecated, and ignore code coverage for it

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#1987)

Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Save CI cycles for backend tests (#1985)

* Fix typo in prompt (#1992)

* Refactor monologue and SWE agent to use the messages in state history (#1863)

* Refactor monologue to use the messages in state history

* add messages, clean up

* fix monologue

* update integration tests

* move private method

* update SWE agent to use the history from State

* integration tests for SWE agent

* rename monologue to initial_thoughts, since that is what it is

* fix: catch session file not existed exception when init EventStream(maybe creating a new session with no session files stored). (#1994)

* add ml-bench in readme

* Bump boto3 from 1.34.110 to 1.34.111 (#2001)

Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.110...1.34.111)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 7.0.0 to 7.1.0 (#2002)

Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/7.0.0...7.1.0)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump litellm from 1.37.20 to 1.38.0 (#2005)

Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.20...v1.38.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix SWE-Bench evaluation due to setuptools version (#1995)

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* fix session state after resuming (#1999)

* fix state resuming

* fix session reconnection

* fix lint

* Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941)

* add draft for skills

* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file

* Remove new_sample.txt file

* add some work from opendevin w/ fixes

* Add unit tests for agentskills module

* fix some issues and updated tests

* add more tests for open

* tweak and handle goto_line

* add tests for some edge cases

* add tests for scrolling

* add tests for edit

* add tests for search_dir

* update tests to use pytest

* use pytest --forked to avoid file op unit tests to interfere with each other via global var

* update doc based on swe agent tool

* update and add tests for find_file and search_file

* move agent_skills to plugins

* add agentskills as plugin and docs

* add agentskill to ssh box and fix sandbox integration

* remove extra returns in doc

* add agentskills to initial tool for jupyter

* support re-init jupyter kernel (for agentskills) after restart

* fix print window's issue with indentation and add testcases

* add prompt for codeact with the newest edit primitives

* modify the way line number is presented (remove leading space)

* change prompt to the newest display format

* support tracking of costs via metrics

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* implement and add tests for py linting

* remove extra text arg for incompatible subprocess ver

* remove sample.txt

* update test_edits integration tests

* fix all integration

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* remove _AGENT_SKILLS_DOCS

* move flake8 to test dep

* update poetry.lock

* remove extra arg

* reduce max iter for eval

* update poetry

* fix integration tests

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* build: Add poetry command to use Python 3.11 for environment setup (#1972)

* Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)

Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @types/react-syntax-highlighter in /frontend (#2007)

Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)

Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)

Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.2...v15.2.4)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update README.md

* Update README.md

* add run_infer.sh

* fix input output

* fix docker sandbox

* fix run

* update and clean run_infer.py

* add script to clean up dockers

* update repo uid

* add description

* new

* Update README.md

* use root for sandbox

* update readme

* update ml-bench conda env

* update readme

* update readme

* use try except

* modify raise exception

* add int

* update README

* longer time

* fix existing issues

* fix existing issue

* new docker image

* add metrics of cost

* add result parsing cost

* fix

* fix

* update summarize

* fix

* add analyze

* update readme

* use 4o

* add eval output

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-157.ec2.internal>
Co-authored-by: RainRat <rainrat78@yahoo.ca>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Shimada666 <649940882@qq.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Rahul Anand <62982824+zeul22@users.noreply.github.com>
Co-authored-by: jiangleo <jiangleo@users.noreply.github.com>
Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: Jeremi Joslin <jeremi@newlogic.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: DaxServer <7479937+DaxServer@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>
2024-06-13 09:30:55 +08:00
dependabot[bot] 0bd5ab72ed chore(deps): bump boto3 from 1.34.123 to 1.34.124 (#2410) 2024-06-13 00:07:45 +08:00
dependabot[bot] 4c774544ce chore(deps): bump litellm from 1.40.8 to 1.40.9 (#2411) 2024-06-13 00:07:09 +08:00
dependabot[bot] d1eaf486bd chore(deps-dev): bump lint-staged from 15.2.5 to 15.2.6 in /frontend (#2407)
Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.5 to 15.2.6.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.5...v15.2.6)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 17:35:01 +03:00
Leo 7fcd14a055 fix the failed unit test. (#2405)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-06-12 13:55:23 +00:00
Yufan Song 90ec0095df Add integration test for CodeActSWEAgent (#2377)
* add test log

* remove browsing internet

* add test by GPT-4o

* fix prompts

* change test_agent

* fix test

* fix nits
2024-06-12 02:46:15 +08:00
dependabot[bot] a25fb4d21b chore(deps-dev): bump @typescript-eslint/parser in /frontend (#2387)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.12.0 to 7.13.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.13.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-11 18:33:20 +00:00
dependabot[bot] 11d7efb0e8 chore(deps-dev): bump @testing-library/jest-dom in /frontend (#2388)
Bumps [@testing-library/jest-dom](https://github.com/testing-library/jest-dom) from 6.4.5 to 6.4.6.
- [Release notes](https://github.com/testing-library/jest-dom/releases)
- [Changelog](https://github.com/testing-library/jest-dom/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/jest-dom/compare/v6.4.5...v6.4.6)

---
updated-dependencies:
- dependency-name: "@testing-library/jest-dom"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 02:17:20 +08:00
dependabot[bot] 17e7a45c3c chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend (#2389)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 7.12.0 to 7.13.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.13.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 02:16:47 +08:00
dependabot[bot] e14297fde9 chore(deps-dev): bump prettier from 3.3.1 to 3.3.2 in /frontend (#2390)
Bumps [prettier](https://github.com/prettier/prettier) from 3.3.1 to 3.3.2.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/prettier/compare/3.3.1...3.3.2)

---
updated-dependencies:
- dependency-name: prettier
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 02:15:00 +08:00
dependabot[bot] 35c4c9cb34 chore(deps): bump litellm from 1.40.7 to 1.40.8 (#2392)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.7 to 1.40.8.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.7...v1.40.8)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-11 15:23:14 +00:00
dependabot[bot] 9893c06b2e chore(deps): bump boto3 from 1.34.122 to 1.34.123 (#2391)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.122 to 1.34.123.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.122...1.34.123)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-11 15:19:34 +00:00
Yufan Song c6951eb6c1 refactor browsing agent response parse (#2366)
* refactor browsing

* add comments

* change file name

* Rename resposne_parser.py to response_parser.py

* Fixed typos

* Typo fix

---------

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-11 03:46:33 +00:00
Xingyao Wang b3bdc44292 mkdir infer_logs instead of logs (#2382) 2024-06-11 07:18:19 +08:00
Xingyao Wang 11a2d1682d Minor SWE-Bench inference config tweak (#2381)
* save infer logs to infer_logs

* set max budget for swebench eval
2024-06-10 20:14:22 +00:00
tobitege e4145aef66 avoid repeat logging of unneeded messages (#2380) 2024-06-10 20:08:09 +00:00
Xingyao Wang a6ba6c5277 Add SWEBench-docker eval (#2085)
* add initial version of swebench-docker eval

* update the branch of git repo

* add poetry run

* download dev set too and pre-load f2p and p2p

* update eval infer script

* increase timeout

* add poetry run

* install swebench from our fork

* update script

* update loc

* support single instance debug

* replace \r\n from model patch

* replace eval docker from namespace xingyaoww

* update script to auto detect swe-bench format jsonl

* support eval infer on single instance id

* change log output dir to logs

* update summarise result script

* update README

* update readme

* tweak branch

* Update evaluation/swe_bench/scripts/eval/prep_eval.sh

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-10 19:30:40 +00:00
tobitege 9605106e72 feat: append_file incl. all tests [agentskills] (#2346)
* new skill: append_file incl. all tests

* more tests needed caring

* file_name for append_file/edit_file; updated tests
2024-06-10 17:18:40 +00:00
dependabot[bot] a5f5bc30b4 chore(deps): bump @vitejs/plugin-react from 4.3.0 to 4.3.1 in /frontend (#2371) 2024-06-11 00:32:10 +08:00
Yufan Song f4cb192ebe Fix llm key leaks bug (#2376)
* fix bug

* fix bug

* add
2024-06-10 15:55:33 +00:00
dependabot[bot] c633d41091 chore(deps-dev): bump llama-index-vector-stores-chroma (#2375)
Bumps llama-index-vector-stores-chroma from 0.1.8 to 0.1.9.

---
updated-dependencies:
- dependency-name: llama-index-vector-stores-chroma
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:39:42 +00:00
dependabot[bot] e2512b43b6 chore(deps-dev): bump llama-index-embeddings-azure-openai (#2374)
Bumps llama-index-embeddings-azure-openai from 0.1.9 to 0.1.10.

---
updated-dependencies:
- dependency-name: llama-index-embeddings-azure-openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:39:31 +00:00
dependabot[bot] 090046b2e6 chore(deps-dev): bump openai from 1.32.0 to 1.33.0 (#2373)
Bumps [openai](https://github.com/openai/openai-python) from 1.32.0 to 1.33.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.32.0...v1.33.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:32:52 +00:00
dependabot[bot] f29a2704f2 chore(deps): bump boto3 from 1.34.121 to 1.34.122 (#2372)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.121 to 1.34.122.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.121...1.34.122)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:25:38 +00:00
dependabot[bot] a3062ba4d0 chore(deps): bump litellm from 1.40.4 to 1.40.7 (#2370)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.4 to 1.40.7.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.4...v1.40.7)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:22:30 +00:00
tobitege f1760f3a67 remove some MonologueAgent mentions (#2364) 2024-06-10 11:57:37 +00:00
Yufan Song f7491bd2fa Refactor response to action in agent step (#2350)
* refactor action parser

* Fix typos

* fix typo

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-10 10:17:30 +00:00
Robert 7fc57650f3 BioCoder integration (#2076)
* prepare execution and inference

* Create README.md

* Update README.md

* Update evaluation/biocoder/README.md

* Update evaluation/swe_bench/swe_env_box.py

* switch to biocoder docker container and test-specific code

* code for copying and running test files into container

* add metrics

* add readme

* Biocoder evaluation code finished (rewrite testing infrastructure, prompt tuning, and bug fixes)

* Update README.md

---------

Co-authored-by: lilbillybiscuit <qianbill2014@outlook.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-10 11:11:40 +08:00
Boxuan Li 91ddd93756 conftest: Exit without revealing secrets (#2351) 2024-06-10 10:47:31 +08:00
மனோஜ்குமார் பழனிச்சாமி 003b599dd0 Issues Category Update: Removed Question Type (#2345)
We've removed the "Question" type from the Issues category to streamline our issue-tracking process. This change will help us focus on actionable issues and feature requests. If you have any questions or discussions, please use the Discussions tab. This is better suited for community engagement, sharing knowledge, and getting help from other contributors.
2024-06-09 21:14:56 -04:00
tobitege 41344f0dfe remove backtick handling from run_ipython (#2347) 2024-06-09 22:53:06 +00:00
RainRat 745ae42a72 fix typos (#2352) 2024-06-09 12:57:58 -07:00
Serg Kryvonos a400e94971 Parameterize Python version (#2348) 2024-06-09 17:29:37 +00:00
Temo e925cefeef Refactored prompt.py to reduce token usage (#1996)
* Refactored prompt.py to reduce token usage

* Reverted some destructive changes

* Update agenthub/codeact_agent/prompt.py

* Update agenthub/codeact_agent/prompt.py

* Update agenthub/codeact_agent/prompt.py

* Update agenthub/codeact_agent/prompt.py

* Update agenthub/codeact_agent/prompt.py

* Update agenthub/codeact_agent/prompt.py

* Update agenthub/codeact_agent/prompt.py

* Apply suggestions from code review

* Apply suggestions from code review

* Update agenthub/codeact_agent/prompt.py

* fix integration test

* make lint

* feat: support ToolQA benchmark (#2263)

* Add files via upload

* Update README.md

* Update run_infer.py

* Update utils.py

* make lint

* Update evaluation/toolqa/run_infer.py

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* feat: revert hiden special paths change in file action (#2328)

* revert change in file action

* remove useless code

* make lint

* Support gpqa benchmark evaluation (#2080)

* feat: add gpqa benchmark evaluation

* add metrics

* reset configs in final block

* make lint

---------

Co-authored-by: yufansong <yufan@risingwave-labs.com>

* fix(frontend): prevent API key from resetting after modal change (#2329)

* remove bottom chatbox fade

* Modal wider; fix lint error

* settings: attempt to not clear api key for same provider

* prevent api key from resetting after changing the model

* revert other changes and fix post test tear down error

---------

Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>

* fix: codeact bug [If running a command that never returns, it gets stuck #1895] (#2034)

* fix: codeact bug https://github.com/OpenDevin/OpenDevin/issues/1895

* fix: add CmdRunAction timeout hint.

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* regenerate integration test

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>

* Feat: Support Gorilla APIBench  (#2081)

* removed unused files from gorilla

* Update run_infer.py, removed unused imports

* Update utils.py

* Update ast_eval_hf.py

* Update ast_eval_tf.py

* Update ast_eval_th.py

* Create README.md

* Update run_infer.py

* make lint

* Update run_infer.py

* fix lint

---------

Co-authored-by: yufansong <yufan@risingwave-labs.com>

* remote useless (#2332)

* fix integration test

* Update agenthub/codeact_agent/prompt.py

* Update agenthub/codeact_agent/prompt.py

* fix integration test

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: yueqis <141804823+yueqis@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Jaskirat Singh <1.jaskiratsingh@gmail.com>
Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-09 10:19:05 -07:00
Bibek Poudel 221a4e83f1 doc: Added citation subsection in README (#2339)
* added citation in readme

* minor change to date format

* Update README.md

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-06-09 14:05:35 +00:00
Frank Xu bd00f0f049 Restore previous browsing agent behavior when evaluating on WebArena and miniwob++ only (#2341)
* restore eval mode

* fix
2024-06-09 04:10:02 -04:00
Engel Nyst fab8c9003b remove deprecated github-token config (#2334)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-06-09 09:50:24 +02:00
மனோஜ்குமார் பழனிச்சாமி e0ad289483 Downgraded Python version to 3.12.3 (#2331)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-09 11:54:30 +05:30
Boxuan Li a9a2f10170 Revamp AgentRejectAction and allow ManagerAgent to handle rejection (#1735)
* Fix AgentRejectAction handling

* Add ManagerAgent to integration tests

* Fix regenerate.sh

* Fix merge

* Update README for micro-agents

* Add test reject to regenerate.sh

* regenerate.sh: Add support for running a specific test and/or agent

* Refine reject schema, and allow ManagerAgent to handle reject

* Add test artifacts for test_simple_task_rejection

* Fix manager agent tests

* Fix README

* test_simple_task_rejection: check final agent state

* Integration test: exit if mock prompt not found

* Update test_simple_task_rejection tests

* Fix test_edits test artifacts after prompt update

* Fix ManagerAgent test_edits

* WIP

* Fix tests

* update test_edits for ManagerAgent

* Skip local sandbox for reject test

* Fix test comparison
2024-06-08 23:12:30 -07:00
tobitege c062468dcf fix: warning about zope-interface (pyproject) (#2335) 2024-06-08 22:51:55 +00:00
tobitege a97d0767e9 fix: Backticks get always escaped by runtime; add Ipython test (#2321)
* added tests related to backticks

* updated .gitignore

* added extra linter test for #2210

* hotfix for integration test

* added test_ipython unit test

* added test_ipython unit test

* remove draft test from test_ipython.py

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-08 21:02:27 +00:00
Yufan Song 1bdf8752e6 remote useless (#2332) 2024-06-08 19:04:43 +00:00
yueqis 68d9ad61cf Feat: Support Gorilla APIBench (#2081)
* removed unused files from gorilla

* Update run_infer.py, removed unused imports

* Update utils.py

* Update ast_eval_hf.py

* Update ast_eval_tf.py

* Update ast_eval_th.py

* Create README.md

* Update run_infer.py

* make lint

* Update run_infer.py

* fix lint

---------

Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-08 16:54:54 +00:00
Aaron Xia b5a17efc45 fix: codeact bug [If running a command that never returns, it gets stuck #1895] (#2034)
* fix: codeact bug https://github.com/OpenDevin/OpenDevin/issues/1895

* fix: add CmdRunAction timeout hint.

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* regenerate integration test

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-08 16:40:23 +00:00
tobitege a8c6fd0d42 fix(frontend): prevent API key from resetting after modal change (#2329)
* remove bottom chatbox fade

* Modal wider; fix lint error

* settings: attempt to not clear api key for same provider

* prevent api key from resetting after changing the model

* revert other changes and fix post test tear down error

---------

Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2024-06-08 16:27:43 +00:00
Jaskirat Singh e8307608c2 Support gpqa benchmark evaluation (#2080)
* feat: add gpqa benchmark evaluation

* add metrics

* reset configs in final block

* make lint

---------

Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-08 16:24:24 +00:00
Yufan Song 06a6ffcb09 feat: revert hiden special paths change in file action (#2328)
* revert change in file action

* remove useless code

* make lint
2024-06-08 12:12:52 +00:00
yueqis 82d4d25b09 feat: support ToolQA benchmark (#2263)
* Add files via upload

* Update README.md

* Update run_infer.py

* Update utils.py

* make lint

* Update evaluation/toolqa/run_infer.py

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-08 07:54:01 -04:00
Xingyao Wang 903381f16e Add back jupyter PWD env var for agentskills (#2327)
* add back jupyter pwd env var for agentskills

* add unit test for pwd change in execute_cli
2024-06-08 08:51:42 +00:00
tobitege c3c2b2d7b6 fix: remove bottom chatbox fade (frontend) (#2323)
* remove bottom chatbox fade

* Modal wider; fix lint error
2024-06-08 07:09:21 +00:00
tobitege 5e42f140cb fix: hide special paths; sort models (#2325) 2024-06-08 02:13:11 +00:00
dependabot[bot] 705758ac36 Bump vite from 5.2.12 to 5.2.13 in /frontend (#2315)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.2.12 to 5.2.13.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v5.2.13/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.2.13/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-07 17:47:36 +00:00
dependabot[bot] f1fc2c3fea Bump boto3 from 1.34.120 to 1.34.121 (#2316)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.120 to 1.34.121.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.120...1.34.121)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-07 19:16:55 +02:00
dependabot[bot] 7e64df8332 Bump openai from 1.31.2 to 1.32.0 (#2317)
Bumps [openai](https://github.com/openai/openai-python) from 1.31.2 to 1.32.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.31.2...v1.32.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-07 19:16:44 +02:00
Yufan Song dc94914ad7 fix dogfood (#2313) 2024-06-07 16:35:12 +00:00
tobitege b431fce938 tests: more Agentskills tests; updated .gitignore (#2307)
* added tests related to backticks

* updated .gitignore

* added extra linter test for #2210

* hotfix for integration test

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-07 16:29:03 +00:00
Yufan Song 6aba337416 fix (#2318) 2024-06-07 09:22:29 -07:00
Frank Xu 4455260290 [bugfix] browse actions shouldn't change url and screenshot, only observations (#2311)
* browse related actions shouldn't change url and screenshot, only the observations should

* fix linting

* fix integrat

* update integration test

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-06-08 00:03:32 +08:00
Boxuan Li 45ce09d70e CodeActAgent: Delegate to BrowsingAgent for browsing tasks (#2103) 2024-06-07 00:53:47 -07:00
Biraj Silwal 001cc33664 fix: ExplorerActions overlapping with file name. (#2287)
* fix ExplorerActions overlapping with file name.

* Update frontend/src/components/file-explorer/FileExplorer.tsx

---------

Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-07 03:30:16 +00:00
dependabot[bot] 1df9649c7e Bump tailwindcss from 3.4.3 to 3.4.4 in /frontend (#2298) 2024-06-07 09:03:03 +08:00
Mohammad Sadoughi 19788cbad8 updated Makefile setup-config to store the persist_sandbox bolean value to config.toml (#2304)
Co-authored-by: msadough <msadough@amazon.com>
2024-06-06 22:14:09 +00:00
dependabot[bot] dea9b5c258 Bump boto3 from 1.34.119 to 1.34.120 (#2299)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.119 to 1.34.120.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.119...1.34.120)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 18:50:12 +02:00
dependabot[bot] 07423c9277 Bump ruff from 0.4.7 to 0.4.8 (#2297)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.4.7 to 0.4.8.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.4.7...v0.4.8)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 18:49:38 +02:00
dependabot[bot] bb757223a2 Bump litellm from 1.40.2 to 1.40.4 (#2300)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.2 to 1.40.4.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.2...v1.40.4)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 18:48:51 +02:00
dependabot[bot] ac0c6efc82 Bump openai from 1.31.0 to 1.31.2 (#2301)
Bumps [openai](https://github.com/openai/openai-python) from 1.31.0 to 1.31.2.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.31.0...v1.31.2)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 18:48:04 +02:00
tobitege 1ce4d383d3 doc: add Python keyring to Troubleshooting documentation (#2289)
* fix: set Python keyring for Poetry

* Python keyring troubleshooting added

* Revert Makefile change

* Troubleshooting extended

* setup config: added absolute path hint
2024-06-06 12:26:58 +00:00
Aleksandar b0b19e6c25 Update AgentHubREADME.md (#2290)
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-06-06 11:14:41 +00:00
dependabot[bot] 08137d1968 Bump boto3 from 1.34.118 to 1.34.119 (#2280)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.118 to 1.34.119.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.118...1.34.119)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-06 14:13:41 +08:00
super-dainiu beabcce16d [Hotfix] Fix ML-Bench continue `run_inference.py` (#2284)
* add ml-bench w/o exec env

* fix typos (#1956)

no functional change

* Refactored Logs (#1939)

* [Feat] A competitive Web Browsing agent (#1856)

* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update README.md SWE-bench score (#1959)

* Update README.md SWE-bench score

Our most recent results on swe-bench lite are 25%, so this updates the README accordingly.

* Update

* fix: llm is_local function logic error (#1961)

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* doc: update documentation about poetry update (#1962)

* add doc

* Update Development.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* feat: add metrics related to cost for better observability (#1944)

* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* doc: add more cmd in unit test documentation (#1963)

* --- (#1975)

updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1976)

updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Logging security (#1943)

* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>

* --- (#1967)

updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1968)

updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1969)

updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1970)

updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1971)

updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Refactor session management (#1810)

* refactor session mgmt

* defer file handling to runtime

* add todo

* refactor sessions a bit more

* remove messages logic from FE

* fix up socket handshake

* refactor frontend auth a bit

* first pass at redoing file explorer

* implement directory suffix

* fix up file tree

* close agent on websocket close

* remove session saving

* move file refresh

* remove getWorkspace

* plumb path/code differently

* fix build issues

* fix the tests

* fix npm build

* add session rehydration

* fix event serialization

* logspam

* fix user message rehydration

* add get_event fn

* agent state restoration

* change history tracking for codeact

* fix responsiveness of init

* fix lint

* lint

* delint

* fix prop

* update tests

* logspam

* lint

* fix test

* revert codeact

* change fileService to use API

* fix up session loading

* delint

* delint

* fix integration tests

* revert test

* fix up access to options endpoints

* fix initial files load

* delint

* fix file initialization

* fix mock server

* fixl int

* fix auth for html

* Update frontend/src/i18n/translation.json

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor sessions and sockets

* avoid reinitializing the same session

* fix reconnect issue

* change up intro message

* more guards on reinit

* rename agent_session

* delint

* fix a bunch of tests

* delint

* fix last test

* remove code editor context

* fix build

* fix any

* fix dot notation

* Update frontend/src/services/api.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix up error handling

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update frontend/src/services/session.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix build errs

* fix else

* add closed state

* delint

* Update opendevin/server/session/session.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix #1960 (#1964)

* Add ruff for shared mutable defaults (B) (#1938)

* Add ruff for shared mutable defaults (B)

* Apply B006, B008 on current files, except fast API

* Update agenthub/SWE_agent/prompts.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix unintended behavior change

* this is correct, tell Ruff to leave it alone

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888)

* Add MacOS to integration tests

* Switch back to python 3.11

* Install Docker for macos pipeline

* regenerate.sh: Use environmental variable for sandbox type

* Pack different agents' tests into a single check

* Fix CodeAct tests

* Reduce file match and extensive debug logs

* Add TEST_IN_CI mode that reports codecov

* Small fix: don't quit if reusing old responses failed

* Merge codecov results

* Fix typos

* Remove coverage merge step - codecov automatically does that

* Make mac integration tests as optional - too slow

* Fix codecov args

* Add comments in yaml

* Include sandbox type in codecov report name

* Fix codecov report merge

* Revert renaming of test_matrix_success

* Remove SWEAgent and PlannerAgent from tests

* Mark planner agent and SWE agent as deprecated

* CodeCov: Ignore planner and sweagent

* Revert "Remove SWEAgent and PlannerAgent from tests"

This reverts commit 040cb3bfb9.

* Remove all tests for SWE Agent

* Only keep basic tests for MonologueAgent and PlannerAgent

* Mark SWE Agent as deprecated, and ignore code coverage for it

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#1987)

Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Save CI cycles for backend tests (#1985)

* Fix typo in prompt (#1992)

* Refactor monologue and SWE agent to use the messages in state history (#1863)

* Refactor monologue to use the messages in state history

* add messages, clean up

* fix monologue

* update integration tests

* move private method

* update SWE agent to use the history from State

* integration tests for SWE agent

* rename monologue to initial_thoughts, since that is what it is

* fix: catch session file not existed exception when init EventStream(maybe creating a new session with no session files stored). (#1994)

* add ml-bench in readme

* Bump boto3 from 1.34.110 to 1.34.111 (#2001)

Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.110...1.34.111)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 7.0.0 to 7.1.0 (#2002)

Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/7.0.0...7.1.0)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump litellm from 1.37.20 to 1.38.0 (#2005)

Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.20...v1.38.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix SWE-Bench evaluation due to setuptools version (#1995)

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* fix session state after resuming (#1999)

* fix state resuming

* fix session reconnection

* fix lint

* Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941)

* add draft for skills

* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file

* Remove new_sample.txt file

* add some work from opendevin w/ fixes

* Add unit tests for agentskills module

* fix some issues and updated tests

* add more tests for open

* tweak and handle goto_line

* add tests for some edge cases

* add tests for scrolling

* add tests for edit

* add tests for search_dir

* update tests to use pytest

* use pytest --forked to avoid file op unit tests to interfere with each other via global var

* update doc based on swe agent tool

* update and add tests for find_file and search_file

* move agent_skills to plugins

* add agentskills as plugin and docs

* add agentskill to ssh box and fix sandbox integration

* remove extra returns in doc

* add agentskills to initial tool for jupyter

* support re-init jupyter kernel (for agentskills) after restart

* fix print window's issue with indentation and add testcases

* add prompt for codeact with the newest edit primitives

* modify the way line number is presented (remove leading space)

* change prompt to the newest display format

* support tracking of costs via metrics

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* implement and add tests for py linting

* remove extra text arg for incompatible subprocess ver

* remove sample.txt

* update test_edits integration tests

* fix all integration

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* remove _AGENT_SKILLS_DOCS

* move flake8 to test dep

* update poetry.lock

* remove extra arg

* reduce max iter for eval

* update poetry

* fix integration tests

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* build: Add poetry command to use Python 3.11 for environment setup (#1972)

* Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)

Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @types/react-syntax-highlighter in /frontend (#2007)

Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)

Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)

Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.2...v15.2.4)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update README.md

* Update README.md

* add run_infer.sh

* fix input output

* fix docker sandbox

* fix run

* update and clean run_infer.py

* add script to clean up dockers

* update repo uid

* add description

* new

* Update README.md

* use root for sandbox

* update readme

* update ml-bench conda env

* update readme

* update readme

* use try except

* modify raise exception

* add int

* update README

* longer time

* fix existing issues

* fix existing issue

* new docker image

* add metrics of cost

* add result parsing cost

* fix

* fix

* update summarize

* fix

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-157.ec2.internal>
Co-authored-by: RainRat <rainrat78@yahoo.ca>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Shimada666 <649940882@qq.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Rahul Anand <62982824+zeul22@users.noreply.github.com>
Co-authored-by: jiangleo <jiangleo@users.noreply.github.com>
Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: Jeremi Joslin <jeremi@newlogic.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: DaxServer <7479937+DaxServer@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>
2024-06-06 03:53:21 +00:00
tobitege 1fa09e0414 fix: test_sandbox tests didn't close dockers (#2274)
* fix test_sandbox tests to close dockers

* removed try/finally

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-06 03:45:45 +00:00
Frank Xu 48151bdbb0 [feat] WebArena benchmark, MiniWoB++ benchmark and related arch changes (#2170)
* add webarena, and revamp messaging for webarena eval

* add changes for browsergym

* update infer script

* fix unit tests

* update

* add multiple run for miniwob

* update instruction, remove personal path

* update

* add code for getting final reward, fix integration, add results

* add avg cost calculation
2024-06-06 09:01:20 +08:00
dependabot[bot] 99c6333e1a Bump openai from 1.30.5 to 1.31.0 (#2283)
Bumps [openai](https://github.com/openai/openai-python) from 1.30.5 to 1.31.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.30.5...v1.31.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 21:14:49 +00:00
Xingyao Wang 42d3dd8a2e Update screenshot (#2286)
* Add files via upload

* Update screenshot.png
2024-06-05 22:43:46 +02:00
மனோஜ்குமார் பழனிச்சாமி 971ad68431 Solved Hugging Face cache issue. (#2277) 2024-06-05 21:18:33 +05:30
dependabot[bot] 3bf0636a53 Bump litellm from 1.40.0 to 1.40.2 (#2282)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.0 to 1.40.2.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.0...v1.40.2)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 23:46:00 +08:00
dependabot[bot] 105b5b9103 Bump json-repair from 0.21.0 to 0.23.1 (#2278)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.21.0 to 0.23.1.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.21.0...0.23.1)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 23:45:42 +08:00
dependabot[bot] a4bccfc6aa Bump @types/node from 20.14.1 to 20.14.2 in /frontend (#2279)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.1 to 20.14.2.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 23:40:53 +08:00
dependabot[bot] 4e540da85e Bump prettier from 3.3.0 to 3.3.1 in /frontend (#2281)
Bumps [prettier](https://github.com/prettier/prettier) from 3.3.0 to 3.3.1.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/prettier/compare/3.3.0...3.3.1)

---
updated-dependencies:
- dependency-name: prettier
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-05 23:39:41 +08:00
RainRat 3b0e1361a4 fix typos (#2267)
* fix typos

no functional change

* fix typos

* fix typos

* fix integration test

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-05 23:06:40 +08:00
மனோஜ்குமார் பழனிச்சாமி ae815b20d2 Improved logs (#2272) 2024-06-05 17:54:40 +05:30
Aaron Xia 69542c9999 fix: there maybe unexpected files in event file list, not like 1.json… (#2270)
* fix: there maybe unexpected files in event file list, not like 1.json, 2.json, but .DS_Store for macOS system.

* log

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-06-05 17:56:39 +08:00
dependabot[bot] 95a9be2dc5 Bump @typescript-eslint/eslint-plugin from 7.11.0 to 7.12.0 in /frontend (#2260) 2024-06-05 08:10:05 +00:00
Boxuan Li 208b1461ca [AgentBench evaluation] set run_as_devin to true (#2269)
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-05 07:53:33 +00:00
dependabot[bot] 1b25a37ad4 Bump @testing-library/react from 15.0.7 to 16.0.0 in /frontend (#2227)
* Bump @testing-library/react from 15.0.7 to 16.0.0 in /frontend

Bumps [@testing-library/react](https://github.com/testing-library/react-testing-library) from 15.0.7 to 16.0.0.
- [Release notes](https://github.com/testing-library/react-testing-library/releases)
- [Changelog](https://github.com/testing-library/react-testing-library/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/react-testing-library/compare/v15.0.7...v16.0.0)

---
updated-dependencies:
- dependency-name: "@testing-library/react"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* resolve error during test teardown

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2024-06-05 07:51:58 +00:00
Ryan H. Tran 0584e428b2 [Mint evaluation] Fix bug in stopping when the agent reaches max steps or solution proposals (#2268)
* fix: bug in stopping when the agent reaches max steps or solution proposals

* remove --eval-num-workers

* update env.py
2024-06-05 06:47:07 +00:00
super-dainiu ebafb702e5 Add ML-Bench Evaluation with OpenDevin (#2015)
* add ml-bench w/o exec env

* fix typos (#1956)

no functional change

* Refactored Logs (#1939)

* [Feat] A competitive Web Browsing agent (#1856)

* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update README.md SWE-bench score (#1959)

* Update README.md SWE-bench score

Our most recent results on swe-bench lite are 25%, so this updates the README accordingly.

* Update

* fix: llm is_local function logic error (#1961)

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* doc: update documentation about poetry update (#1962)

* add doc

* Update Development.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* feat: add metrics related to cost for better observability (#1944)

* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* doc: add more cmd in unit test documentation (#1963)

* --- (#1975)

updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1976)

updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Logging security (#1943)

* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>

* --- (#1967)

updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1968)

updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1969)

updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1970)

updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1971)

updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Refactor session management (#1810)

* refactor session mgmt

* defer file handling to runtime

* add todo

* refactor sessions a bit more

* remove messages logic from FE

* fix up socket handshake

* refactor frontend auth a bit

* first pass at redoing file explorer

* implement directory suffix

* fix up file tree

* close agent on websocket close

* remove session saving

* move file refresh

* remove getWorkspace

* plumb path/code differently

* fix build issues

* fix the tests

* fix npm build

* add session rehydration

* fix event serialization

* logspam

* fix user message rehydration

* add get_event fn

* agent state restoration

* change history tracking for codeact

* fix responsiveness of init

* fix lint

* lint

* delint

* fix prop

* update tests

* logspam

* lint

* fix test

* revert codeact

* change fileService to use API

* fix up session loading

* delint

* delint

* fix integration tests

* revert test

* fix up access to options endpoints

* fix initial files load

* delint

* fix file initialization

* fix mock server

* fixl int

* fix auth for html

* Update frontend/src/i18n/translation.json

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor sessions and sockets

* avoid reinitializing the same session

* fix reconnect issue

* change up intro message

* more guards on reinit

* rename agent_session

* delint

* fix a bunch of tests

* delint

* fix last test

* remove code editor context

* fix build

* fix any

* fix dot notation

* Update frontend/src/services/api.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix up error handling

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update frontend/src/services/session.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix build errs

* fix else

* add closed state

* delint

* Update opendevin/server/session/session.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix #1960 (#1964)

* Add ruff for shared mutable defaults (B) (#1938)

* Add ruff for shared mutable defaults (B)

* Apply B006, B008 on current files, except fast API

* Update agenthub/SWE_agent/prompts.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix unintended behavior change

* this is correct, tell Ruff to leave it alone

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888)

* Add MacOS to integration tests

* Switch back to python 3.11

* Install Docker for macos pipeline

* regenerate.sh: Use environmental variable for sandbox type

* Pack different agents' tests into a single check

* Fix CodeAct tests

* Reduce file match and extensive debug logs

* Add TEST_IN_CI mode that reports codecov

* Small fix: don't quit if reusing old responses failed

* Merge codecov results

* Fix typos

* Remove coverage merge step - codecov automatically does that

* Make mac integration tests as optional - too slow

* Fix codecov args

* Add comments in yaml

* Include sandbox type in codecov report name

* Fix codecov report merge

* Revert renaming of test_matrix_success

* Remove SWEAgent and PlannerAgent from tests

* Mark planner agent and SWE agent as deprecated

* CodeCov: Ignore planner and sweagent

* Revert "Remove SWEAgent and PlannerAgent from tests"

This reverts commit 040cb3bfb9.

* Remove all tests for SWE Agent

* Only keep basic tests for MonologueAgent and PlannerAgent

* Mark SWE Agent as deprecated, and ignore code coverage for it

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#1987)

Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Save CI cycles for backend tests (#1985)

* Fix typo in prompt (#1992)

* Refactor monologue and SWE agent to use the messages in state history (#1863)

* Refactor monologue to use the messages in state history

* add messages, clean up

* fix monologue

* update integration tests

* move private method

* update SWE agent to use the history from State

* integration tests for SWE agent

* rename monologue to initial_thoughts, since that is what it is

* fix: catch session file not existed exception when init EventStream(maybe creating a new session with no session files stored). (#1994)

* add ml-bench in readme

* Bump boto3 from 1.34.110 to 1.34.111 (#2001)

Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.110...1.34.111)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 7.0.0 to 7.1.0 (#2002)

Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/7.0.0...7.1.0)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump litellm from 1.37.20 to 1.38.0 (#2005)

Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.20...v1.38.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix SWE-Bench evaluation due to setuptools version (#1995)

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* fix session state after resuming (#1999)

* fix state resuming

* fix session reconnection

* fix lint

* Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941)

* add draft for skills

* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file

* Remove new_sample.txt file

* add some work from opendevin w/ fixes

* Add unit tests for agentskills module

* fix some issues and updated tests

* add more tests for open

* tweak and handle goto_line

* add tests for some edge cases

* add tests for scrolling

* add tests for edit

* add tests for search_dir

* update tests to use pytest

* use pytest --forked to avoid file op unit tests to interfere with each other via global var

* update doc based on swe agent tool

* update and add tests for find_file and search_file

* move agent_skills to plugins

* add agentskills as plugin and docs

* add agentskill to ssh box and fix sandbox integration

* remove extra returns in doc

* add agentskills to initial tool for jupyter

* support re-init jupyter kernel (for agentskills) after restart

* fix print window's issue with indentation and add testcases

* add prompt for codeact with the newest edit primitives

* modify the way line number is presented (remove leading space)

* change prompt to the newest display format

* support tracking of costs via metrics

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* implement and add tests for py linting

* remove extra text arg for incompatible subprocess ver

* remove sample.txt

* update test_edits integration tests

* fix all integration

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* remove _AGENT_SKILLS_DOCS

* move flake8 to test dep

* update poetry.lock

* remove extra arg

* reduce max iter for eval

* update poetry

* fix integration tests

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* build: Add poetry command to use Python 3.11 for environment setup (#1972)

* Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)

Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @types/react-syntax-highlighter in /frontend (#2007)

Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)

Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)

Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.2...v15.2.4)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update README.md

* Update README.md

* add run_infer.sh

* fix input output

* fix docker sandbox

* fix run

* update and clean run_infer.py

* add script to clean up dockers

* update repo uid

* add description

* new

* Update README.md

* use root for sandbox

* update readme

* update ml-bench conda env

* update readme

* update readme

* use try except

* modify raise exception

* add int

* update README

* longer time

* fix existing issues

* fix existing issue

* new docker image

* add metrics of cost

* add result parsing cost

* fix

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-157.ec2.internal>
Co-authored-by: RainRat <rainrat78@yahoo.ca>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Shimada666 <649940882@qq.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Rahul Anand <62982824+zeul22@users.noreply.github.com>
Co-authored-by: jiangleo <jiangleo@users.noreply.github.com>
Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: Jeremi Joslin <jeremi@newlogic.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: DaxServer <7479937+DaxServer@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>
2024-06-05 01:56:39 +00:00
Leo 040d6bd806 fix: add an early exit check for agent answers in agent bench. (#2257)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-06-04 18:45:07 -07:00
tobitege 5776474dcf Fix SWE-Bench README typos (#2250) 2024-06-05 01:18:02 +00:00
tobitege 44bbe5e208 Fix agentskills tests (#2242)
* Fix agentskills tests

* Improved test_agent_skill

---------

Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-04 21:33:32 +00:00
tobitege 0082640ac8 fix test_config to prevent leaks (#2245) 2024-06-04 21:32:46 +02:00
tobitege 7263705492 fix frontend tests; minor readme update (#2219)
* fix frontend tests; minor readme update

* Fix indent in ChatInput.test

* Fix linting errors, finally

* lint: minor fixes (per make lint)

* All tests passed!
2024-06-04 20:46:47 +03:00
dependabot[bot] 4de08a9c00 Bump @types/node from 20.14.0 to 20.14.1 in /frontend (#2258)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.14.0 to 20.14.1.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-04 16:09:17 +00:00
dependabot[bot] 7e3e740616 Bump jose from 5.3.0 to 5.4.0 in /frontend (#2259)
Bumps [jose](https://github.com/panva/jose) from 5.3.0 to 5.4.0.
- [Release notes](https://github.com/panva/jose/releases)
- [Changelog](https://github.com/panva/jose/blob/main/CHANGELOG.md)
- [Commits](https://github.com/panva/jose/compare/v5.3.0...v5.4.0)

---
updated-dependencies:
- dependency-name: jose
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 16:06:38 +00:00
மனோஜ்குமார் பழனிச்சாமி 2ffd54d258 fixed output logging (#2244)
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-04 16:05:23 +00:00
dependabot[bot] 6dd6e6c087 Bump @typescript-eslint/parser from 7.11.0 to 7.12.0 in /frontend (#2261)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.11.0 to 7.12.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.12.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-04 15:59:35 +00:00
dependabot[bot] aec3e18836 Bump litellm from 1.39.5 to 1.40.0 (#2256)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.39.5 to 1.40.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.39.5...v1.40.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 15:36:15 +00:00
dependabot[bot] d85c548bf5 Bump opencv-python from 4.9.0.80 to 4.10.0.82 (#2255)
Bumps [opencv-python](https://github.com/opencv/opencv-python) from 4.9.0.80 to 4.10.0.82.
- [Release notes](https://github.com/opencv/opencv-python/releases)
- [Commits](https://github.com/opencv/opencv-python/commits)

---
updated-dependencies:
- dependency-name: opencv-python
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 15:08:05 +00:00
dependabot[bot] 0f60899ee0 Bump google-generativeai from 0.5.4 to 0.6.0 (#2254)
Bumps [google-generativeai](https://github.com/google/generative-ai-python) from 0.5.4 to 0.6.0.
- [Release notes](https://github.com/google/generative-ai-python/releases)
- [Changelog](https://github.com/google-gemini/generative-ai-python/blob/main/RELEASE.md)
- [Commits](https://github.com/google/generative-ai-python/compare/v0.5.4...v0.6.0)

---
updated-dependencies:
- dependency-name: google-generativeai
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 15:06:51 +00:00
மனோஜ்குமார் பழனிச்சாமி 4e479038f9 Bugfix by added config to disable plugin initialization for Persistent sandbox (#2179)
* refactored source bashrc logic

* added initialize_plugins config

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-04 10:59:30 -04:00
dependabot[bot] 11b66bd733 Bump boto3 from 1.34.117 to 1.34.118 (#2253)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.117 to 1.34.118.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.117...1.34.118)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 14:57:58 +00:00
dependabot[bot] 62c179be6c Bump pytest from 8.2.1 to 8.2.2 (#2252)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.2.1 to 8.2.2.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/8.2.1...8.2.2)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 14:55:09 +00:00
மனோஜ்குமார் பழனிச்சாமி 4afd85e591 Quick doc fix (#2243) 2024-06-04 07:00:44 +00:00
Leo 9ada36e30b fix: restore python linting. (#2228)
* fix: restore python linting.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* update: extend the Python lint check to evaluation.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Update evaluation/logic_reasoning/instruction.txt

---------

Signed-off-by: ifuryst <ifuryst@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-04 06:36:19 +00:00
Xida Ren (Cedar) 1314a09ce9 One-step launch instructions (#2189)
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-06-03 23:28:50 -07:00
Yufan Song 2374374778 Fix python environment in review-pr dogfood action (#2237)
This doesn't completely fix the bug; it fixes the python environment, and there is more to do to fix the issue.
2024-06-03 20:31:38 -07:00
Graham Neubig 44665ee235 Add docs for sharing feedback (#2241) 2024-06-04 07:37:33 +05:30
Graham Neubig 74e25920da Transition to gcloud endpoint (#2240) 2024-06-04 01:20:04 +00:00
Leo 759f76fab5 Fix: Properly close Docker client in DockerExecBox to prevent resource leakage (#2224) 2024-06-04 09:05:41 +08:00
dependabot[bot] 87c679ff1a Bump ruff from 0.4.6 to 0.4.7 (#2233) 2024-06-03 22:09:08 +00:00
finaltrip 05b84df9cb chore: fix some comments (#2234)
Signed-off-by: finaltrip <finaltrip@qq.com>
2024-06-03 16:04:34 +00:00
Bibek Poudel 42671815a8 changed the welcome logo from 60vh to auto (#2235) 2024-06-03 15:50:58 +00:00
dependabot[bot] c0e8e11cdc Bump boto3 from 1.34.116 to 1.34.117 (#2232)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.116 to 1.34.117.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.116...1.34.117)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-03 23:47:27 +08:00
dependabot[bot] 673cde31ba Bump datasets from 2.19.1 to 2.19.2 (#2231)
Bumps [datasets](https://github.com/huggingface/datasets) from 2.19.1 to 2.19.2.
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](https://github.com/huggingface/datasets/compare/2.19.1...2.19.2)

---
updated-dependencies:
- dependency-name: datasets
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-03 23:47:10 +08:00
dependabot[bot] 5de0d5d041 Bump uvicorn from 0.30.0 to 0.30.1 (#2230)
Bumps [uvicorn](https://github.com/encode/uvicorn) from 0.30.0 to 0.30.1.
- [Release notes](https://github.com/encode/uvicorn/releases)
- [Changelog](https://github.com/encode/uvicorn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/encode/uvicorn/compare/0.30.0...0.30.1)

---
updated-dependencies:
- dependency-name: uvicorn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-03 23:46:45 +08:00
dependabot[bot] adaa9c7c92 Bump e2b from 0.17.0 to 0.17.1 (#2229)
Bumps [e2b](https://github.com/e2b-dev/e2b) from 0.17.0 to 0.17.1.
- [Release notes](https://github.com/e2b-dev/e2b/releases)
- [Commits](https://github.com/e2b-dev/e2b/compare/@e2b/python-sdk@0.17.0...@e2b/python-sdk@0.17.1)

---
updated-dependencies:
- dependency-name: e2b
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-03 23:46:06 +08:00
tobitege 4b76f98b26 fix: keep colon part in model name for OpenRouter (#2223) 2024-06-03 17:11:44 +02:00
dependabot[bot] 47c12902de Bump @types/node from 20.12.13 to 20.14.0 in /frontend (#2226)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 20.12.13 to 20.14.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
2024-06-03 15:05:11 +00:00
dependabot[bot] bffa61adb4 Bump prettier from 3.2.5 to 3.3.0 in /frontend (#2225)
Bumps [prettier](https://github.com/prettier/prettier) from 3.2.5 to 3.3.0.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/prettier/compare/3.2.5...3.3.0)

---
updated-dependencies:
- dependency-name: prettier
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-03 22:51:51 +08:00
Graham Neubig 4476c250c5 Add consent dialog (#2169)
* Add consent dialog for sharing conversation histories

* Update

* Update to nextui modals

* Update

* More fixes to modal

* Updates

* Revert most changes to ChatInterface

* Update form

* Cleanup

* Update consent dialog

* Lint

* Fix toast

* Fix to be a select

* prettier

* Update frontend/src/components/chat/ChatInterface.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Update frontend/src/components/modals/feedback/FeedbackModal.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Update frontend/src/components/modals/feedback/FeedbackModal.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Update frontend/src/components/chat/ChatInterface.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Fix

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-06-03 14:33:53 +00:00
Boxuan Li c9c5d71e5c logger.py: Fix resource leak (#2215) 2024-06-03 05:57:54 +00:00
மனோஜ்குமார் பழனிச்சாமி 783a3545b5 Named docker app container (#2202) 2024-06-03 05:49:10 +00:00
Boxuan Li 538d1d85a2 evaluation: Reset configs in finally block (#2214) 2024-06-03 09:52:12 +08:00
Boxuan Li 1adbec6757 ssh_box: Fix Docker descriptor leak (#2212) 2024-06-03 01:22:30 +00:00
Boxuan Li 6fd8e8d5b8 Fix file descriptor leaks in agentskills (#2209) 2024-06-03 09:11:10 +08:00
tobitege 908c253897 German translations updated (#2208) 2024-06-02 16:49:40 -07:00
Boxuan Li 399e6fb1d1 ssh_box: Close containers before throwing exception (#2206) 2024-06-02 20:13:44 +00:00
tobitege 64f7749b63 Windows docs extended; some markdown lint fixes (#2205) 2024-06-02 14:59:54 +00:00
Graham Neubig efd689293e Bump docs to 0.6 (#2193)
* Bump docs to 0.6

* Update README.md
2024-06-02 06:34:40 -04:00
Ryan H. Tran 22e8fb39b1 add cost metrics to evaluation outputs for all benchmarks (#2199) 2024-06-02 08:28:00 +00:00
Yizhe Zhang 8d79c3edbc modify the exiting logic and reward calculation, delete unused function (#2198) 2024-06-02 06:38:09 +00:00
tobitege b0478d2880 fix: Fix husky install deprecated message (since v9 of husky) (#2190) (#2191)
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2024-06-02 02:46:32 +00:00
RainRat ed6dcc8381 fix typos (#2187)
* fix typos

no functional change

* fix typos
2024-06-01 20:40:30 +00:00
Leo 2c231c57c9 Add supported benchmarks to evaluation README (AgentBench, BIRD, LogicReasoning) (#2183)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-06-01 11:33:01 -04:00
மனோஜ்குமார் பழனிச்சாமி 4ece6fb3cc Auto started persistent container (#2151) 2024-06-01 14:46:41 +00:00
மனோஜ்குமார் பழனிச்சாமி f9c7c3a520 Refactored logging (#2159) 2024-06-01 14:31:35 +00:00
மனோஜ்குமார் பழனிச்சாமி aee3d506e6 Restricted persistent sandbox to opendevin user only (#2177) 2024-06-01 14:18:03 +00:00
Graham Neubig 3b8a649b3d Update slack invite link to make it valid (#2182)
* Update README.md

* Update CustomFooter.tsx

* Update about.md

* Update faq.tsx

* Update intro.mdx
2024-06-01 21:55:27 +08:00
Binyuan Hui 46dcf4bb3e Support BIRD benchmark (#2117)
* update: change timeout from 10 to 30

* update: readme for bird evaluation

* Update evaluation/bird/run_infer.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update evaluation/bird/README.md

Co-authored-by: Shimada666 <649940882@qq.com>

* Update evaluation/bird/README.md

Co-authored-by: Shimada666 <649940882@qq.com>

* Update evaluation/bird/run_infer.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Shimada666 <649940882@qq.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-06-01 11:34:36 +00:00
Leo 78e003caf6 Fix: Avoid bash backtick eval in runtime commands. (#2180)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-06-01 19:19:15 +08:00
Leo be251b11de Add AgentBench. (#2012)
* Add AgentBench.

* Load the datasets from HF.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Add helper functions.

* Add mock executor.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Add retriv agent answer cmd.

* Adjust the dataset.
* Refine test results.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Consolidate all AgentBench datasets and scripts into a single CSV dataset.

* Refactor dataset source.
* Update helper functions.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Fix the CRLF problem.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Separate the instance's workspace.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Add cleanup logic and error handling for sandbox closure.

* Normalized dataset

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Update README.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Update the prompt to capture the answer.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Refactor script execution paths to use absolute container workspace path.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Update AgentBench README.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Delete useless functions.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Update evaluation/agent_bench/README.md

* Add script to summarize test results from JSONL file in AgentBench

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Delete useless script and codes.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Update evaluation/agent_bench/scripts/summarise_results.py

---------

Signed-off-by: ifuryst <ifuryst@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-01 07:58:14 +00:00
மனோஜ்குமார் பழனிச்சாமி 04d7354501 Detailed logs for ssh_box (#2173) 2024-06-01 11:40:22 +05:30
Boxuan Li 06e45afc75 Fix ssh box hung issue (#2172)
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2024-06-01 05:31:32 +00:00
மனோஜ்குமார் பழனிச்சாமி 3a4dc5c68c Initialized plugins only once for persistent sandboxes (#2162) 2024-06-01 10:46:09 +05:30
Boxuan Li feaae0b7ac Fix persist_sandbox in Makefile (#2171) 2024-06-01 12:50:31 +08:00
Rahul Anand 6e76f9a02f Fix: Codebase font fixed, and other fixes for #2138 PR (#2154)
* fix #2123

* Docs enhancement

* Update docs/src/components/CustomFooter.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Update docs/src/components/CustomFooter.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Update docs/src/pages/faq.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* update

* fix for #2138 pr

* Update docs/src/components/CustomFooter.tsx

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update docs/src/components/HomepageHeader/HomepageHeader.tsx

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update docs/src/components/Welcome/Welcome.tsx

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update docs/src/css/custom.css

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-01 02:22:44 +00:00
மனோஜ்குமார் பழனிச்சாமி bf24a0b5c0 Fixed makefile (#2168) 2024-06-01 03:35:43 +05:30
Aaron Xia 42c6b506b5 Lazy launching BrowseEnv / making BrowseEnv optional (#2155)
* feat: lazy launching browser; browser optional for diffrent agents.

* style: lint

* fix: integration test fail due to browser not started.

* fix: run by cli and integration test failed.

* fix: lint

* fix: lint

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-31 16:40:42 -04:00
மனோஜ்குமார் பழனிச்சாமி 8413f147c9 Added logs (#2153)
* Logged about config file

* Logged Browser env

* Update opendevin/core/config.py

Co-authored-by: Aleksandar <isavitaisa@gmail.com>

* Update opendevin/core/config.py

Co-authored-by: Aleksandar <isavitaisa@gmail.com>

---------

Co-authored-by: Aleksandar <isavitaisa@gmail.com>
2024-05-31 16:04:36 -04:00
Ryan H. Tran 01296ff79d Add remaining subsets for MINT benchmark (#2142)
* add MMLU subset

* add theoremqa subset

* remove redundant packages from requirements.txt, adjust prompts, handle gpt3.5 propose a wrong answer after a correct answer

* add MBPP subset

* add humaneval subset

* update README

* exit actively after the agent finishes the task
2024-05-31 20:04:13 +00:00
மனோஜ்குமார் பழனிச்சாமி f3f5768b4f Install chromium only once (#2100)
* install chromium only once

* Update Makefile

* Update Makefile
2024-05-31 15:39:10 -04:00
dependabot[bot] 9a441ea8f7 Bump boto3 from 1.34.115 to 1.34.116 (#2164)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.115 to 1.34.116.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.115...1.34.116)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-31 15:13:33 -04:00
Graham Neubig 6596d5c799 Fix: Feedback should be sent through the backend to avoid CORS issues (#2046)
* Fix: Feedback should be sent through the backend to avoid CORS issues

* Update

* Fix merge error

* Revert unnecessary change

* Lint

* Moved to services

* Fixed bugs

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
2024-05-31 15:00:09 -04:00
dependabot[bot] 6aec3d789e Bump litellm from 1.39.3 to 1.39.5 (#2163)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.39.3 to 1.39.5.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.39.3...v1.39.5)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-31 19:36:04 +02:00
Graham Neubig 7a2122ebc2 Default to gpt-4o (#2158)
* Default to gpt-4o

* Fix default
2024-05-31 14:44:07 +00:00
dependabot[bot] a7b19a0048 Bump @nextui-org/react from 2.4.0 to 2.4.1 in /frontend (#2161)
Bumps [@nextui-org/react](https://github.com/nextui-org/nextui/tree/HEAD/packages/core/react) from 2.4.0 to 2.4.1.
- [Release notes](https://github.com/nextui-org/nextui/releases)
- [Changelog](https://github.com/nextui-org/nextui/blob/canary/packages/core/react/CHANGELOG.md)
- [Commits](https://github.com/nextui-org/nextui/commits/@nextui-org/react@2.4.1/packages/core/react)

---
updated-dependencies:
- dependency-name: "@nextui-org/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-31 14:32:21 +00:00
dependabot[bot] e6c8e1c9d2 Bump framer-motion from 11.2.9 to 11.2.10 in /frontend (#2160)
Bumps [framer-motion](https://github.com/framer/motion) from 11.2.9 to 11.2.10.
- [Changelog](https://github.com/framer/motion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/framer/motion/compare/v11.2.9...v11.2.10)

---
updated-dependencies:
- dependency-name: framer-motion
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-31 14:30:14 +00:00
Boxuan Li 4d14b44a9a SWE-bench: Add summarise utility script to view passed/failed task IDs (#2137)
* SWE-bench: Add summarise utility script to view passed/failed task IDs

* Fix typos

* Move file

* Prettify

* Use merged jsonl file
2024-05-31 12:32:17 +08:00
Boxuan Li f188abd7a3 Delete evaluation outputs files (#2152)
* Delete evaluation outputs files

* Fix README
2024-05-31 03:12:27 +00:00
மனோஜ்குமார் பழனிச்சாமி 961c96a2a1 Added ssh_password to config setup (#2139)
Co-authored-by: Aleksandar <isavitaisa@gmail.com>
2024-05-31 07:26:16 +05:30
dependabot[bot] f4bc52461a Bump openai from 1.30.4 to 1.30.5 (#2144)
Bumps [openai](https://github.com/openai/openai-python) from 1.30.4 to 1.30.5.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.30.4...v1.30.5)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-30 23:29:38 +08:00
dependabot[bot] cd6f863a49 Bump litellm from 1.39.2 to 1.39.3 (#2145)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.39.2 to 1.39.3.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.39.2...v1.39.3)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-30 23:29:11 +08:00
dependabot[bot] 486c5d983f Bump json-repair from 0.20.1 to 0.21.0 (#2146)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.20.1 to 0.21.0.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.20.1...0.21.0)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-30 23:28:55 +08:00
dependabot[bot] 33d9882621 Bump @types/node from 18.19.30 to 20.12.13 in /frontend (#2147)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 18.19.30 to 20.12.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-30 23:28:31 +08:00
dependabot[bot] 2fcaa2328e Bump boto3 from 1.34.113 to 1.34.115 (#2143)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.113 to 1.34.115.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.113...1.34.115)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-30 23:24:59 +08:00
Rahul Anand a0373900be Docs enhancement (#2138)
* fix #2123

* Docs enhancement

* Update docs/src/components/CustomFooter.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Update docs/src/components/CustomFooter.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Update docs/src/pages/faq.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-30 17:05:09 +03:00
Ren Ma a9823491e6 Support Logic Reasoning Benchmark (#1973) 2024-05-30 16:35:15 +08:00
Xingyao Wang 01ef90205d Add CodeActSWEAgent to remove browsing & github + improvements on agentskills (#2105)
* update swe_bench prompt;
use minimal prompt for codeact;

* upgrade agentskills and update testcases

* update infer prompt

* fix cwd

* add icl for swebench

* also log in_context_example to run infer

* remove extra print

* change prompt to abs path

* update error message to include current file info

* change cwd for jupyter if needed

* update edit error message

* update prompt

* improve git get patch

* update hint string

* default to 50 turns

* revert changes from codeact agent and create new CodeActSWEAgent

* revert changes to codeact

* revert instructions for run infer

* revert instructions for run infer

* update README

* update max iter

* add codeact swe agent

* fix issue for CodeActSWEAgent

* allow specifying max iter in cmdline script

* stop printing

* Update agenthub/codeact_swe_agent/README.md

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Fix prompt regression in jupyter plugin

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-29 21:19:00 -07:00
Aaron Xia b1ec8e5dc2 style: Update agent_controller.py to clean log (#2124) 2024-05-29 18:56:11 -07:00
Rahul Anand b3cce763a2 fix #2123 (#2125) 2024-05-29 17:56:45 -04:00
Robert Brennan 89ac732cb6 Adjust docs a bit (#2135)
* tweak docs a bit

* move warning
2024-05-29 17:56:28 -04:00
dependabot[bot] eb1e0e9da8 Bump llama-index-embeddings-huggingface from 0.2.0 to 0.2.1 (#2132) 2024-05-29 20:48:14 +00:00
dependabot[bot] ab454e122a Bump browsergym from 0.3.3 to 0.3.4 (#2127)
Bumps [browsergym](https://github.com/ServiceNow/BrowserGym) from 0.3.3 to 0.3.4.
- [Release notes](https://github.com/ServiceNow/BrowserGym/releases)
- [Commits](https://github.com/ServiceNow/BrowserGym/compare/v0.3.3...v0.3.4)

---
updated-dependencies:
- dependency-name: browsergym
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-29 15:42:21 -04:00
dependabot[bot] cf95f1aabe Bump ruff from 0.4.5 to 0.4.6 (#2130)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.4.5 to 0.4.6.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.4.5...v0.4.6)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-30 00:53:53 +08:00
dependabot[bot] b011190b40 Bump litellm from 1.38.11 to 1.39.2 (#2133)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.38.11 to 1.39.2.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.38.11...v1.39.2)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-29 15:51:50 +00:00
dependabot[bot] 439e9c0e60 Bump openai from 1.30.3 to 1.30.4 (#2131) 2024-05-29 15:44:51 +00:00
dependabot[bot] 53b3309a5a Bump @typescript-eslint/eslint-plugin from 7.10.0 to 7.11.0 in /frontend (#2129) 2024-05-29 15:29:54 +00:00
dependabot[bot] c45123ddb2 Bump framer-motion from 11.2.6 to 11.2.9 in /frontend (#2128) 2024-05-29 15:29:40 +00:00
dependabot[bot] af3ddddd33 Bump lint-staged from 15.2.4 to 15.2.5 in /frontend (#2126)
Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.4 to 15.2.5.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.4...v15.2.5)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-29 15:29:30 +00:00
மனோஜ்குமார் பழனிச்சாமி d4ccd48af8 Persistent docker session (#1998)
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-29 13:22:34 +00:00
Robert Brennan 03386a81e0 fix file uploads (#2102)
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-29 13:22:22 +00:00
மனோஜ்குமார் பழனிச்சாமி 343e5c73ae Parsed model_name for model_info (#2122) 2024-05-29 16:54:27 +08:00
Prithvi 13d04fa36c Fix issue #2029: Replace defaultProps with JavaScript default parameters (#2106)
* updated basemodal

Updated the basemodal.tsx file by removing the  BaseModal.defaultProps block and including the default values directly within the function parameters.

* Removed DefaultProps from the files

Removed DefaultProps from the files:
AgentControlBar.tsx, ChatInput.tsx, ExplorerTree.tsx, TreeNode.tsx, IconButton.tsx, HeaderContent.tsx, AutocompleteCombobox.tsx

and replaced the usage of defaultProps with JavaScript default parameters in the given components.

* Removed comments and updated eslintrc

Removed all the comments (Removed the defaultProps block comment), and updated the ESLint rules to ignore the defaultProps warning thrown by ESLint.

* Finished Linting Succesfully.

Ran the lint command with the --fix and --write arg to fix all remaining issues and errors before pushing. Thanks a lot @amanape for the support!

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-29 09:49:50 +03:00
Boxuan Li 9b371b1b5f Refactor agent delegation and tweak micro agents (#1910)
This PR fixes #1897. In addition, this PR fixes and tweaks a few micro-agents.

For the first time, I am able to use ManagerAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds ManagerAgent as part of integration tests. test_write_simple_script involves delegation to CoderAgent while test_edits involves delegation to TypoFixerAgent.

Also for the first time, I am able to use DelegateAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds DelegateAgent as part of integration tests. It involves delegation to StudyRepoForTaskAgent, CoderAgent and VerifierAgent.

This PR is a blocker for #1735 and likely #1945.
2024-05-28 20:01:16 -07:00
mamoodi c37a474dc5 doc: Small fix for development.md and docs (#2119) 2024-05-28 20:43:58 +00:00
dependabot[bot] b9aee7046c Bump @nextui-org/react from 2.3.6 to 2.4.0 in /frontend (#2115)
Bumps [@nextui-org/react](https://github.com/nextui-org/nextui/tree/HEAD/packages/core/react) from 2.3.6 to 2.4.0.
- [Release notes](https://github.com/nextui-org/nextui/releases)
- [Changelog](https://github.com/nextui-org/nextui/blob/canary/packages/core/react/CHANGELOG.md)
- [Commits](https://github.com/nextui-org/nextui/commits/@nextui-org/react@2.4.0/packages/core/react)

---
updated-dependencies:
- dependency-name: "@nextui-org/react"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-29 00:32:02 +08:00
dependabot[bot] 2a12642228 Bump uvicorn from 0.29.0 to 0.30.0 (#2111)
Bumps [uvicorn](https://github.com/encode/uvicorn) from 0.29.0 to 0.30.0.
- [Release notes](https://github.com/encode/uvicorn/releases)
- [Changelog](https://github.com/encode/uvicorn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/encode/uvicorn/compare/0.29.0...0.30.0)

---
updated-dependencies:
- dependency-name: uvicorn
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-28 23:33:32 +08:00
dependabot[bot] 535d316a89 Bump vite from 5.2.11 to 5.2.12 in /frontend (#2112)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.2.11 to 5.2.12.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.2.12/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-28 23:33:06 +08:00
dependabot[bot] e3e4aaa31b Bump eslint-plugin-react from 7.34.1 to 7.34.2 in /frontend (#2113)
Bumps [eslint-plugin-react](https://github.com/jsx-eslint/eslint-plugin-react) from 7.34.1 to 7.34.2.
- [Release notes](https://github.com/jsx-eslint/eslint-plugin-react/releases)
- [Changelog](https://github.com/jsx-eslint/eslint-plugin-react/blob/v7.34.2/CHANGELOG.md)
- [Commits](https://github.com/jsx-eslint/eslint-plugin-react/compare/v7.34.1...v7.34.2)

---
updated-dependencies:
- dependency-name: eslint-plugin-react
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-28 23:32:38 +08:00
dependabot[bot] 6640a247c0 Bump @typescript-eslint/parser from 7.10.0 to 7.11.0 in /frontend (#2114)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.10.0 to 7.11.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.11.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-28 23:32:16 +08:00
dependabot[bot] 5b16ca7a45 Bump litellm from 1.38.10 to 1.38.11 (#2110)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.38.10 to 1.38.11.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.38.10...v1.38.11)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-28 16:57:43 +02:00
Ryan H. Tran 9434bcce48 Support MINT benchmark (MATH, GSM8K subset) (#1955)
* setup boilerplate and README

* setup test script and load dataset

* add temp intg that works

* refactor code

* add solution evaluation through 'fake_user_response_fn'

* finish integrating MATH subset

* Update evaluation/mint/run_infer.py

* Update evaluation/mint/run_infer.sh

* Update opendevin/core/main.py

* remove redudant templates, add eval_note, update README

* use <execute_ipython> tag instead of <execute>

* hardcode AGENT option for run_infer.sh

* Update evaluation/mint/task.py

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* fix: bug no message returned when task's success

* change message to make the agent exit

* import bash abstractmethod

* install all required packages inside sandbox before the agent runs, adjust prompt

* add subset eval folder separation and test for gsm8k

* fix bug in Reasoning task result check, add requirements.txt

* Fix syntax error in evaluation/mint/run_infer.py

* update README, add default values for `SUBSET` and `EVAL_LIMIT`

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-28 07:42:52 +00:00
dependabot[bot] 110c530582 Bump browsergym from 0.3.2 to 0.3.3 (#2091)
Bumps [browsergym](https://github.com/ServiceNow/BrowserGym) from 0.3.2 to 0.3.3.
- [Release notes](https://github.com/ServiceNow/BrowserGym/releases)
- [Commits](https://github.com/ServiceNow/BrowserGym/compare/v0.3.2...v0.3.3)

---
updated-dependencies:
- dependency-name: browsergym
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:32:30 +08:00
dependabot[bot] 1b10c5bcb8 Bump @testing-library/user-event from 13.5.0 to 14.5.2 in /frontend (#2096)
Bumps [@testing-library/user-event](https://github.com/testing-library/user-event) from 13.5.0 to 14.5.2.
- [Release notes](https://github.com/testing-library/user-event/releases)
- [Changelog](https://github.com/testing-library/user-event/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/user-event/compare/v13.5.0...v14.5.2)

---
updated-dependencies:
- dependency-name: "@testing-library/user-event"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:32:21 +08:00
dependabot[bot] e52e24c5d5 Bump jsdom from 24.0.0 to 24.1.0 in /frontend (#2097)
Bumps [jsdom](https://github.com/jsdom/jsdom) from 24.0.0 to 24.1.0.
- [Release notes](https://github.com/jsdom/jsdom/releases)
- [Changelog](https://github.com/jsdom/jsdom/blob/main/Changelog.md)
- [Commits](https://github.com/jsdom/jsdom/compare/24.0.0...24.1.0)

---
updated-dependencies:
- dependency-name: jsdom
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:32:12 +08:00
dependabot[bot] b570354357 Bump openai from 1.30.1 to 1.30.3 (#2090)
Bumps [openai](https://github.com/openai/openai-python) from 1.30.1 to 1.30.3.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.30.1...v1.30.3)

---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:31:45 +08:00
dependabot[bot] 848746a1c8 Bump boto3 from 1.34.112 to 1.34.113 (#2092)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.112 to 1.34.113.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.112...1.34.113)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:31:32 +08:00
dependabot[bot] bbcc1ab171 Bump json-repair from 0.19.2 to 0.20.1 (#2093)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.19.2 to 0.20.1.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.19.2...0.20.1)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:31:23 +08:00
dependabot[bot] e6ae9ae259 Bump @vitejs/plugin-react from 4.2.1 to 4.3.0 in /frontend (#2094)
Bumps [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/tree/HEAD/packages/plugin-react) from 4.2.1 to 4.3.0.
- [Release notes](https://github.com/vitejs/vite-plugin-react/releases)
- [Changelog](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite-plugin-react/commits/v4.3.0/packages/plugin-react)

---
updated-dependencies:
- dependency-name: "@vitejs/plugin-react"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:31:17 +08:00
dependabot[bot] 9507f4426a Bump typescript from 5.4.4 to 5.4.5 in /frontend (#2098)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 5.4.4 to 5.4.5.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Changelog](https://github.com/microsoft/TypeScript/blob/main/azure-pipelines.release.yml)
- [Commits](https://github.com/Microsoft/TypeScript/compare/v5.4.4...v5.4.5)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:31:05 +08:00
dependabot[bot] 5bccaefc5f Bump litellm from 1.38.2 to 1.38.10 (#2089)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.38.2 to 1.38.10.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.38.2...v1.38.10)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-27 23:28:44 +08:00
Engel Nyst 55fdee31ad Remove unnecessary stuff from the sandboxes tests (#2095) 2024-05-27 20:50:02 +05:30
Yoni 3c5c214d87 Fix: serve UI from fastAPI app (#2086) 2024-05-27 18:56:13 +05:30
Xingyao Wang ae8cda1495 Support specifying custom cost per token (#2083)
* support specifying custom cost per token

* fix test for new attrs

* add to docs

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-27 19:35:34 +08:00
Yufan Song 3d29ec0418 add version (#2078) 2024-05-26 20:46:39 -07:00
Aaron Xia b66a915de1 feat: auto clean session with session close called. (#1990)
* feat: auto clean session with session close called.

* fix: lint

* fix: lint

* fix: lint

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-27 11:08:24 +08:00
Aleksandar 18d07bda89 feat: add max_budget_per_task configuration to control task cost (#2070)
* feat: add max_budget_per_task configuration to control task cost

* Fix test_arg_parser.py

* Use the config.max_budget_per_task as default value

* Add max_budget_per_task to core/main.py as well

* Update opendevin/controller/agent_controller.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-27 02:04:31 +08:00
Engel Nyst 783fea62a0 Ignore pid for loop detection (Was: override eq...) (#2045)
* rewrite, implement pid ignore in the controller

* make the helper method private
2024-05-26 19:27:12 +02:00
Xingyao Wang 2c0a2dbc61 fix yet another swe_bench issue (#2069) 2024-05-26 10:01:43 -07:00
Gant f0271f9f91 need to run as root to use SWEBench container (#2068) 2024-05-26 14:21:33 +00:00
Xingyao Wang 5114230e53 Some SWE-Bench infer fixes and improvements (#2065)
* reset workspace base properly

* support running without hint

* support running without hint

* bump swe-bench eval docker to v1.2 for latest agentskills

* only give hint when use hint text is trie

* add swe-agent instructions for validation

* update dockerfile

* pin the python interpreter for execute_cli

* avoid initialize plugins twice

* default to use hint

* save results to swe_bench_lite

* unset gh token and increase max iter to 50

* remove printing of use hint status

* refractor ssh login into one function

* ok drop to 30 turns bc it is so expensive :(

* remove reproduce comments to avoid stuck
2024-05-26 10:02:11 +00:00
Xingyao Wang a6b3ce866d refractor ssh login into one function (#2066) 2024-05-26 08:56:13 +00:00
Boxuan Li 7d6cb69a51 main.py: Fix redundant ChangeAgentStateAction (#2064) 2024-05-26 15:20:56 +08:00
Chris Mamatas 1891fd88d5 feat(frontend): Added React Router (#2061)
* Added React Router

* Moved router import above ./App

---------

Co-authored-by: Chris Mamatas <chrismamatas1@gmail.com>
2024-05-25 21:57:15 +03:00
Shimada666 be1c2ad60d feat: use retry decorator instead of retrying in a loop (#2058)
* feat: use retry decorator instead of retrying in a loop

* update code logic

* update poetry lock
2024-05-25 16:04:40 +00:00
Yizhe Zhang 0c829cd067 Support Entity-Deduction-Arena (EDA) Benchmark (#1931)
* adding draft evaluation code for EDA, using chatgpt as the temporal agent for now

* Update README.md

* Delete frontend/package.json

* reverse the irrelevant changes

* reverse package.json

* use chatgpt as the codeactagent

* integrate with opendevin

* Update evaluation/EDA/README.md

* Update evaluation/EDA/README.md

* Use poetry to manage packages

* integrate with opendevin

* minor update

* minor update

* update poetry

* update README

* clean-up infer scripts

* add run_infer script and improve readme

* log final success and final message & ground truth

---------

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-25 23:17:04 +08:00
Xingyao Wang 28ab00946b update README for GAIA (#2054)
* update README for GAIA

* Update evaluation/gaia/README.md

* Update evaluation/gaia/README.md

* Update evaluation/gaia/README.md

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-25 15:01:03 +00:00
Xingyao Wang ec68af5b83 fix the openai_api_key detected by agentskills (#2052) 2024-05-25 22:09:07 +08:00
Xingyao Wang 221035d39a Add retry logic to ssh login (#2053)
* add retry logic to ssh login

* Update opendevin/runtime/docker/ssh_box.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-25 12:16:24 +00:00
Shimada666 b31f7701eb Integrate Multimodal tools to agentskills. (#2016)
* suport reading multimodal files

* move file

* update dependency

* remove useless pip install

* add comments

* update the comment

* Apply suggestions from code review

* Add unit test for TXTReader

* pre-commit hook corrupted utf16 test txt

* Revert unnecessary dependency upgrades

* feat: import some readers for agentskill

* add dependencies

* Integrate some multimodal tools

* add shell pip dependency

* update dependencies

* update dependencies

* update print window

* remove __main__

* locally import cv2

* add c library for opencv

* update lock file

* update prompt

* remove unuseful file

* add some unittest

* add unittest & remove excel-related parser

* rollback poetry lock

* remove markdown

* remove requests

* optimize parse_video output

* Fix integration tests for CodeActAgent

* remove test_parse_image unittest

* Add a TODO to containers/sandbox/Dockerfile

* update dependencies

* remove pyproject.toml useless package

* change document via openai key

* Fix prompts after removing some actions

---------

Co-authored-by: Mingchen Zhuge <mczhuge@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Mingchen Zhuge <64179323+mczhuge@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-25 18:58:49 +08:00
Boxuan Li 91f313c914 BrowserEnv: init exception handling (#2050)
* BrowserEnv: init exception handling

* Revert irrelevant changes

* Remove type ignore
2024-05-25 00:17:25 -07:00
மனோஜ்குமார் பழனிச்சாமி 36ff060c1a Added links in docs (#2051) 2024-05-25 11:23:20 +05:30
மனோஜ்குமார் பழனிச்சாமி cfae6821fa refactored timeout (#2044) 2024-05-24 18:19:14 +02:00
mamoodi 752ce8c4ea Update bug template to include os version (#1982) 2024-05-24 15:58:05 +00:00
dependabot[bot] cc6895a65c Bump streamlit from 1.34.0 to 1.35.0 (#2037)
Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.34.0 to 1.35.0.
- [Release notes](https://github.com/streamlit/streamlit/releases)
- [Commits](https://github.com/streamlit/streamlit/compare/1.34.0...1.35.0)

---
updated-dependencies:
- dependency-name: streamlit
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 23:00:37 +08:00
dependabot[bot] 5538ee9bde Bump @types/react from 18.3.2 to 18.3.3 in /frontend (#2039)
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 18.3.2 to 18.3.3.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)

---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 23:00:08 +08:00
dependabot[bot] 9a0bae6d9b Bump @testing-library/react from 13.4.0 to 15.0.7 in /frontend (#2040)
Bumps [@testing-library/react](https://github.com/testing-library/react-testing-library) from 13.4.0 to 15.0.7.
- [Release notes](https://github.com/testing-library/react-testing-library/releases)
- [Changelog](https://github.com/testing-library/react-testing-library/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/react-testing-library/compare/v13.4.0...v15.0.7)

---
updated-dependencies:
- dependency-name: "@testing-library/react"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 22:59:53 +08:00
dependabot[bot] de0f30f6cc Bump eslint-plugin-react-hooks from 4.6.0 to 4.6.2 in /frontend (#2041)
Bumps [eslint-plugin-react-hooks](https://github.com/facebook/react/tree/HEAD/packages/eslint-plugin-react-hooks) from 4.6.0 to 4.6.2.
- [Release notes](https://github.com/facebook/react/releases)
- [Changelog](https://github.com/facebook/react/blob/main/packages/eslint-plugin-react-hooks/CHANGELOG.md)
- [Commits](https://github.com/facebook/react/commits/HEAD/packages/eslint-plugin-react-hooks)

---
updated-dependencies:
- dependency-name: eslint-plugin-react-hooks
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 22:59:37 +08:00
dependabot[bot] 6ae16dbc48 Bump react-i18next from 14.1.1 to 14.1.2 in /frontend (#2043)
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 14.1.1 to 14.1.2.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v14.1.1...v14.1.2)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 22:59:18 +08:00
dependabot[bot] b6be108f49 Bump monaco-editor from 0.48.0 to 0.49.0 in /frontend (#2042)
Bumps [monaco-editor](https://github.com/microsoft/monaco-editor) from 0.48.0 to 0.49.0.
- [Release notes](https://github.com/microsoft/monaco-editor/releases)
- [Changelog](https://github.com/microsoft/monaco-editor/blob/main/CHANGELOG.md)
- [Commits](https://github.com/microsoft/monaco-editor/compare/v0.48.0...v0.49.0)

---
updated-dependencies:
- dependency-name: monaco-editor
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 22:58:36 +08:00
dependabot[bot] ef813af9d7 Bump litellm from 1.38.0 to 1.38.2 (#2038)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.38.0 to 1.38.2.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.38.0...v1.38.2)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 14:51:22 +00:00
dependabot[bot] 909d7b45ef Bump boto3 from 1.34.111 to 1.34.112 (#2036)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.111 to 1.34.112.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.111...1.34.112)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 14:50:02 +00:00
Xingyao Wang e731048ccf Improve action and observation logging for the CLI interface (#2035)
* properly log user messages;
format browser action/obs, summarize action, messages properly for logging

* add source to message

* add spaces for printing
2024-05-24 08:21:25 -04:00
Jiayi Pan 2d52298a1d Support GAIA benchmark (#1911)
* Add gaia test

* Improve gaia prompts

* Fix browser_env hang bug

* Fix gaia bugs

* add gaia to eval readme

* Fix gaia bugs

* minor fix

* add run_infer.sh and update readme

* set num eval worker to 1

* default to 2023 gaia level1 subset

* default to level 1

* add prompt to instruct model enclose answer within <solution> tag

* add missing break

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-24 11:22:28 +00:00
dependabot[bot] 2f6167b953 Bump framer-motion from 11.2.5 to 11.2.6 in /frontend (#2010)
Bumps [framer-motion](https://github.com/framer/motion) from 11.2.5 to 11.2.6.
- [Changelog](https://github.com/framer/motion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/framer/motion/compare/v11.2.5...v11.2.6)

---
updated-dependencies:
- dependency-name: framer-motion
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-24 10:01:42 +00:00
Boxuan Li 78241d9d43 Add tests for browser agent (#2031)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-24 09:59:40 +00:00
Boxuan Li b13a40c05c README.md: Add CodeCov badge (#2022)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-24 09:54:25 +00:00
dependabot[bot] ad2784d534 Bump ruff from 0.4.4 to 0.4.5 (#2004)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.4.4 to 0.4.5.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.4.4...v0.4.5)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-24 05:51:05 -04:00
Boxuan Li 593b8d468b Fix CI workflows [mac-test] (#2025)
* Fix CI settings

* Stop saving cpu cycles for GitHub

* Conditionally run mac tests

* Random push to trigger CI checks again

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-24 09:25:00 +00:00
sp.wack ae105c2faf feat(frontend): Add actions to send feedback to backend (#2020)
* Add feedback actions to send to backend

* Uncomment request

* Refactor and disable feedback when sending

* disable defaultProp error

---------

Co-authored-by: amanape <stephanpsaras@gmail.com>
2024-05-24 04:26:06 -04:00
dependabot[bot] 9207a8da01 Bump browsergym from 0.2.6 to 0.3.2 (#2013)
Bumps [browsergym](https://github.com/ServiceNow/BrowserGym) from 0.2.6 to 0.3.2.
- [Release notes](https://github.com/ServiceNow/BrowserGym/releases)
- [Commits](https://github.com/ServiceNow/BrowserGym/compare/v0.2.6...v0.3.2)

---
updated-dependencies:
- dependency-name: browsergym
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-24 16:03:56 +08:00
Frank Xu 53f64ffa06 Improve browsing agent prompts, allowing agent to properly finish when done (#1993)
* improve browsing agent, allowing it to properly finish.

* handle parsing error, show user what the agent's browsing thoughts in the front end

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-24 00:02:19 -07:00
Boxuan Li c59bcbbffd Minor docstring & prompt fixes for AgentSkills (#2028)
* A few minor fixes to agentskills

* Regenerate prompts

* Remove redundant comment
2024-05-24 13:30:48 +08:00
Xingyao Wang cbf4c4b4c4 fix ExceptionPxssh (#2023)
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-23 21:24:21 -07:00
Boxuan Li 633ece5f9c Fix integration tests (#2024) 2024-05-23 20:24:31 -07:00
Robert Brennan 9ca2007201 fix json encoding (#2018)
* fix json encoding

* add test

* add another test

* fix integration tests

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-23 23:36:15 +00:00
dependabot[bot] b492b6293a Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)
Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.2...v15.2.4)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:25:00 -04:00
dependabot[bot] 0a6b26735b Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:24:42 -04:00
dependabot[bot] dff0f1be13 Bump @types/react-syntax-highlighter in /frontend (#2007)
Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:24:07 -04:00
dependabot[bot] e7306b7226 Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)
Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:23:49 -04:00
DaxServer b118df606f build: Add poetry command to use Python 3.11 for environment setup (#1972) 2024-05-23 12:05:19 -04:00
Xingyao Wang 602ffcdffb Implement agentskills for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941)
* add draft for skills

* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file

* Remove new_sample.txt file

* add some work from opendevin w/ fixes

* Add unit tests for agentskills module

* fix some issues and updated tests

* add more tests for open

* tweak and handle goto_line

* add tests for some edge cases

* add tests for scrolling

* add tests for edit

* add tests for search_dir

* update tests to use pytest

* use pytest --forked to avoid file op unit tests to interfere with each other via global var

* update doc based on swe agent tool

* update and add tests for find_file and search_file

* move agent_skills to plugins

* add agentskills as plugin and docs

* add agentskill to ssh box and fix sandbox integration

* remove extra returns in doc

* add agentskills to initial tool for jupyter

* support re-init jupyter kernel (for agentskills) after restart

* fix print window's issue with indentation and add testcases

* add prompt for codeact with the newest edit primitives

* modify the way line number is presented (remove leading space)

* change prompt to the newest display format

* support tracking of costs via metrics

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* implement and add tests for py linting

* remove extra text arg for incompatible subprocess ver

* remove sample.txt

* update test_edits integration tests

* fix all integration

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version

* remove _AGENT_SKILLS_DOCS

* move flake8 to test dep

* update poetry.lock

* remove extra arg

* reduce max iter for eval

* update poetry

* fix integration tests

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-23 16:04:09 +00:00
Robert Brennan ea9c785075 fix session state after resuming (#1999)
* fix state resuming

* fix session reconnection

* fix lint
2024-05-23 11:47:36 -04:00
Xingyao Wang 6ff50ed369 Fix SWE-Bench evaluation due to setuptools version (#1995)
* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055673.

* bump version
2024-05-23 23:17:42 +08:00
dependabot[bot] d6327f99ce Bump litellm from 1.37.20 to 1.38.0 (#2005)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.20...v1.38.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 14:57:56 +00:00
dependabot[bot] 1c40ea5222 Bump docker from 7.0.0 to 7.1.0 (#2002)
Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](https://github.com/docker/docker-py/compare/7.0.0...7.1.0)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 16:45:46 +02:00
dependabot[bot] 58d45a1a8a Bump boto3 from 1.34.110 to 1.34.111 (#2001)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.110...1.34.111)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 14:42:31 +00:00
Niklas Muennighoff ef6cdb7532 HumanEvalFix integration (#1908)
* Preliminary HumanEvalFix integration

* Clean paths

* fix: set workspace path correctly for config
fix: task in that contains /

* add missing run_infer.sh

* update run_infer w/o hard coded agent

* fix typo

* change `instance_id` to `task_id`

* add the warning and env var setting to run_infer.sh

* reset back workspace mount at the end of each instance

* 10 max iter is probably enough for humanevalfix

* Remove unneeded section

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* Fix link

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Use logger

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update run_infer.py

fix a bug:

ERROR:concurrent.futures:exception calling callback for <Future at 0x309cbc470 state=finished raised NameError>
concurrent.futures.process._RemoteTraceback:

* Update README.md

* Update README.md

* Update README.md

* Update README.md

added an example

* Update README.md

added: enable_auto_lint = true

* Update pyproject.toml

add: evaluate package

* Delete poetry.lock

update poetry.lock

* update poetry.lock

update poetry.lock

* Update README.md

* Update README.md

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>
2024-05-23 13:09:40 +00:00
Aaron Xia f53a91b17c fix: catch session file not existed exception when init EventStream(maybe creating a new session with no session files stored). (#1994) 2024-05-23 11:35:55 +00:00
Engel Nyst 0eccf31604 Refactor monologue and SWE agent to use the messages in state history (#1863)
* Refactor monologue to use the messages in state history

* add messages, clean up

* fix monologue

* update integration tests

* move private method

* update SWE agent to use the history from State

* integration tests for SWE agent

* rename monologue to initial_thoughts, since that is what it is
2024-05-23 07:29:12 +00:00
Jeremi Joslin 3235836b00 Fix typo in prompt (#1992) 2024-05-23 07:11:46 +00:00
Boxuan Li a605e59b7e Save CI cycles for backend tests (#1985) 2024-05-23 00:10:13 -07:00
jiangleo d1475e6e04 Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#1987)
Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2024-05-23 11:10:27 +05:30
Boxuan Li acb430eef5 Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888)
* Add MacOS to integration tests

* Switch back to python 3.11

* Install Docker for macos pipeline

* regenerate.sh: Use environmental variable for sandbox type

* Pack different agents' tests into a single check

* Fix CodeAct tests

* Reduce file match and extensive debug logs

* Add TEST_IN_CI mode that reports codecov

* Small fix: don't quit if reusing old responses failed

* Merge codecov results

* Fix typos

* Remove coverage merge step - codecov automatically does that

* Make mac integration tests as optional - too slow

* Fix codecov args

* Add comments in yaml

* Include sandbox type in codecov report name

* Fix codecov report merge

* Revert renaming of test_matrix_success

* Remove SWEAgent and PlannerAgent from tests

* Mark planner agent and SWE agent as deprecated

* CodeCov: Ignore planner and sweagent

* Revert "Remove SWEAgent and PlannerAgent from tests"

This reverts commit 040cb3bfb9.

* Remove all tests for SWE Agent

* Only keep basic tests for MonologueAgent and PlannerAgent

* Mark SWE Agent as deprecated, and ignore code coverage for it

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-22 20:38:57 -07:00
Engel Nyst b9a5be2569 Add ruff for shared mutable defaults (B) (#1938)
* Add ruff for shared mutable defaults (B)

* Apply B006, B008 on current files, except fast API

* Update agenthub/SWE_agent/prompts.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix unintended behavior change

* this is correct, tell Ruff to leave it alone

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-22 20:06:00 -07:00
Rahul Anand 9a2591d0f6 fix #1960 (#1964) 2024-05-22 19:36:45 -07:00
Robert Brennan 5bdacf738d Refactor session management (#1810)
* refactor session mgmt

* defer file handling to runtime

* add todo

* refactor sessions a bit more

* remove messages logic from FE

* fix up socket handshake

* refactor frontend auth a bit

* first pass at redoing file explorer

* implement directory suffix

* fix up file tree

* close agent on websocket close

* remove session saving

* move file refresh

* remove getWorkspace

* plumb path/code differently

* fix build issues

* fix the tests

* fix npm build

* add session rehydration

* fix event serialization

* logspam

* fix user message rehydration

* add get_event fn

* agent state restoration

* change history tracking for codeact

* fix responsiveness of init

* fix lint

* lint

* delint

* fix prop

* update tests

* logspam

* lint

* fix test

* revert codeact

* change fileService to use API

* fix up session loading

* delint

* delint

* fix integration tests

* revert test

* fix up access to options endpoints

* fix initial files load

* delint

* fix file initialization

* fix mock server

* fixl int

* fix auth for html

* Update frontend/src/i18n/translation.json

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor sessions and sockets

* avoid reinitializing the same session

* fix reconnect issue

* change up intro message

* more guards on reinit

* rename agent_session

* delint

* fix a bunch of tests

* delint

* fix last test

* remove code editor context

* fix build

* fix any

* fix dot notation

* Update frontend/src/services/api.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix up error handling

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update frontend/src/services/session.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix build errs

* fix else

* add closed state

* delint

* Update opendevin/server/session/session.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-22 18:33:16 +00:00
dependabot[bot] 37354dbc83 --- (#1971)
updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-22 18:04:43 +00:00
dependabot[bot] 53afdc8496 --- (#1970)
updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 01:53:22 +08:00
dependabot[bot] bf6fa8e5f4 --- (#1969)
updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 01:52:47 +08:00
dependabot[bot] afc1752a5c --- (#1968)
updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 01:52:25 +08:00
dependabot[bot] 40411db7bf --- (#1967)
updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 01:51:43 +08:00
Engel Nyst 46352e890b Logging security (#1943)
* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>
2024-05-22 18:27:38 +02:00
dependabot[bot] 8e3e51e984 --- (#1976)
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-22 15:37:23 +00:00
dependabot[bot] 7da647a091 --- (#1975)
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-22 17:36:52 +02:00
Yufan Song 4292998ee2 doc: add more cmd in unit test documentation (#1963) 2024-05-22 19:47:03 +08:00
Yufan Song d18e6c85a0 feat: add metrics related to cost for better observability (#1944)
* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-22 08:53:31 +00:00
Yufan Song 34cccfe9cc doc: update documentation about poetry update (#1962)
* add doc

* Update Development.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-22 07:33:53 +00:00
Shimada666 da8369c4d2 fix: llm is_local function logic error (#1961)
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2024-05-22 10:50:21 +05:30
Graham Neubig 6618941422 Update README.md SWE-bench score (#1959)
* Update README.md SWE-bench score

Our most recent results on swe-bench lite are 25%, so this updates the README accordingly.

* Update
2024-05-22 04:32:17 +00:00
Frank Xu 1fe290adf9 [Feat] A competitive Web Browsing agent (#1856)
* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-21 19:20:33 +00:00
மனோஜ்குமார் பழனிச்சாமி d76c425b76 Refactored Logs (#1939) 2024-05-22 03:06:13 +08:00
RainRat 43c187b949 fix typos (#1956)
no functional change
2024-05-21 19:00:48 +00:00
dependabot[bot] dd4ef67809 --- (#1952)
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-21 16:33:09 +00:00
dependabot[bot] a2f00584c1 --- (#1950)
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 16:32:30 +00:00
dependabot[bot] 108404f85a --- (#1953)
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-21 16:30:12 +00:00
mamoodi f74b807342 Update README and doc intro to be consistent with each other (#1954) 2024-05-21 16:26:24 +00:00
dependabot[bot] d60aa2b512 --- (#1951)
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-22 00:11:07 +08:00
dependabot[bot] 772734c884 --- (#1947)
updated-dependencies:
- dependency-name: react
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 16:04:15 +00:00
dependabot[bot] 6ce0eea4bb --- (#1949)
updated-dependencies:
- dependency-name: framer-motion
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 15:55:53 +00:00
dependabot[bot] fd3c83d1eb --- (#1948)
updated-dependencies:
- dependency-name: i18next-http-backend
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 15:29:47 +00:00
dependabot[bot] 2610965f07 --- (#1946)
updated-dependencies:
- dependency-name: clsx
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 15:08:06 +00:00
Engel Nyst 1e51bb9276 Fix/update controller is_stuck() (#1891)
* Refactor monologue to use the messages in state history

remove now unused method

* is_stuck update

* fix is_stuck

* unit tests

* fix tests

* Revert "Refactor monologue to use the messages in state history"

This reverts commit 76b4b765ef.

* Override eq for CmdOutputObservation to ignore the pid, compare the actual command only

* Revert "Override eq for CmdOutputObservation to ignore the pid, compare the actual command only"

This reverts commit 6418d856b5.
2024-05-21 22:56:59 +08:00
Boxuan Li 4add8a5595 SWE-bench: Allow selection of tasks (#1935) 2024-05-21 16:53:29 +08:00
Engel Nyst 2926c51839 fix logging of full llm_config (leaks api key) (#1937) 2024-05-21 08:08:48 +00:00
Engel Nyst 04094e4201 Document config (#1929)
* add more docstrings for config

* fix typo

* Update opendevin/core/config.py

Co-authored-by: Aleksandar <isavitaisa@gmail.com>

---------

Co-authored-by: Aleksandar <isavitaisa@gmail.com>
2024-05-21 15:28:20 +08:00
Boxuan Li 99651e3249 Fix browser_env hung bug (#1933)
* Fix browser_env hang bug

* Send SIGKILL if SIGTERM doesn't work
2024-05-20 19:31:36 -07:00
dependabot[bot] 8750d3e68b Bump browsergym from 0.2.2 to 0.2.6 (#1925)
Bumps [browsergym](https://github.com/ServiceNow/BrowserGym) from 0.2.2 to 0.2.6.
- [Release notes](https://github.com/ServiceNow/BrowserGym/releases)
- [Commits](https://github.com/ServiceNow/BrowserGym/compare/v0.2.2...v0.2.6)

---
updated-dependencies:
- dependency-name: browsergym
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 12:51:55 -04:00
dependabot[bot] bcdde452da Bump react-icons from 5.0.1 to 5.2.1 in /frontend (#1918)
Bumps [react-icons](https://github.com/react-icons/react-icons) from 5.0.1 to 5.2.1.
- [Release notes](https://github.com/react-icons/react-icons/releases)
- [Commits](https://github.com/react-icons/react-icons/compare/v5.0.1...v5.2.1)

---
updated-dependencies:
- dependency-name: react-icons
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 19:34:25 +03:00
மனோஜ்குமார் பழனிச்சாமி 4612e107c9 fix: Handle invalid exit code conversion (#1915) 2024-05-20 11:30:52 -04:00
dependabot[bot] 1bc1d306bd Bump vite from 5.2.8 to 5.2.11 in /frontend (#1919)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.2.8 to 5.2.11.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.2.11/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 11:29:32 -04:00
dependabot[bot] c833d0b847 Bump @typescript-eslint/parser from 7.5.0 to 7.9.0 in /frontend (#1920)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.5.0 to 7.9.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.9.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 11:29:23 -04:00
dependabot[bot] 4c90dedcc7 Bump @nextui-org/react from 2.2.10 to 2.3.6 in /frontend (#1921)
Bumps [@nextui-org/react](https://github.com/nextui-org/nextui/tree/HEAD/packages/core/react) from 2.2.10 to 2.3.6.
- [Release notes](https://github.com/nextui-org/nextui/releases)
- [Changelog](https://github.com/nextui-org/nextui/blob/canary/packages/core/react/CHANGELOG.md)
- [Commits](https://github.com/nextui-org/nextui/commits/@nextui-org/react@2.3.6/packages/core/react)

---
updated-dependencies:
- dependency-name: "@nextui-org/react"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 11:29:15 -04:00
dependabot[bot] a982a2cc85 Bump react-i18next from 14.1.0 to 14.1.1 in /frontend (#1922)
Bumps [react-i18next](https://github.com/i18next/react-i18next) from 14.1.0 to 14.1.1.
- [Changelog](https://github.com/i18next/react-i18next/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/react-i18next/compare/v14.1.0...v14.1.1)

---
updated-dependencies:
- dependency-name: react-i18next
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 11:29:07 -04:00
dependabot[bot] ef828f3b9b Bump pytest-asyncio from 0.23.6 to 0.23.7 (#1924)
Bumps [pytest-asyncio](https://github.com/pytest-dev/pytest-asyncio) from 0.23.6 to 0.23.7.
- [Release notes](https://github.com/pytest-dev/pytest-asyncio/releases)
- [Commits](https://github.com/pytest-dev/pytest-asyncio/compare/v0.23.6...v0.23.7)

---
updated-dependencies:
- dependency-name: pytest-asyncio
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 11:28:55 -04:00
dependabot[bot] eab64cfcc2 Bump google-generativeai from 0.5.3 to 0.5.4 (#1926)
Bumps [google-generativeai](https://github.com/google/generative-ai-python) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/google/generative-ai-python/releases)
- [Changelog](https://github.com/google-gemini/generative-ai-python/blob/main/RELEASE.md)
- [Commits](https://github.com/google/generative-ai-python/compare/v0.5.3...v0.5.4)

---
updated-dependencies:
- dependency-name: google-generativeai
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 17:28:03 +02:00
dependabot[bot] e978c77eb6 Bump mypy from 1.9.0 to 1.10.0 (#1927)
Bumps [mypy](https://github.com/python/mypy) from 1.9.0 to 1.10.0.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/python/mypy/compare/1.9.0...v1.10.0)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 17:27:13 +02:00
dependabot[bot] 6e2e9ab13d Bump pytest from 8.2.0 to 8.2.1 (#1923)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.2.0 to 8.2.1.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/8.2.0...8.2.1)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-20 17:25:03 +02:00
Shimada666 75cecf68e0 docs: update tutorial docs (#1912)
* docs: update tutorial docs

* Update evaluation/TUTORIAL.md

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-20 14:40:31 +00:00
Xingyao Wang 7817d4c94f fix(controller): Improve error info logging (#1864)
* improve error info logging

* Move assignment of self.state.error to report_error function

* only log exception to state, but not to user

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-20 18:42:38 +08:00
Boxuan Li b845a38169 Small improvements & fixes to SWE-Bench (#1874)
I was able to run a few benchmark instances from SWE-Bench by myself following the documentation - it was great! In general the experience was smooth, thanks to @xingyaoww, @libowen2121 and the team! I made a few small enhancements and fixes to further improve the developer experience.

Always use poetry run python (using python from poetry's virtual environment) over python or python3 in scripts to make sure the behavior is consistent.
Make AGENT configurable. One can use an argument to control which agent they would like to benchmark. To facilitate this, I removed hardcoded CodeActAgent from run_infer.sh, and also added VERSION attribute to all agents, as the benchmark needs to record the agent version.
Make EVAL_LIMIT configurable. One can use an argument to control how many instances they'd like to benchmark. Useful for debugging & development purposes.
Fix 'eval_output_dir' not defined error in run_infer.py.
Other enhancements to the README file and logs.
I also notice that a lot of code from run_infer.py could be shared by other benchmarks, but since we only have one benchmark now, I think we could avoid over-engineering. A refactor and code dedup would be useful in the future once we have more benchmarks, though.
2024-05-20 08:03:30 +00:00
Shimada666 3e3dcd52a8 feat: add inline code style (#1909)
* feat: add inline code style

* feat: add code block radius

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-20 07:01:29 +00:00
மனோஜ்குமார் பழனிச்சாமி ec7be6ee51 Fix: Update WebSocket URL to use correct protocol from window.location (#1900) 2024-05-20 09:19:38 +03:00
மனோஜ்குமார் பழனிச்சாமி 6ef7e6eb0f Checked for enduser in Docker (#1899) 2024-05-20 08:50:42 +05:30
Graham Neubig c5abc81bc9 Add documentation regarding how to fix github issues with OpenDevin (#1904)
* Add documentation on fixing GitHub issues with OpenDevin

* Update documentation to instruct users to send a prompt to OpenDevin for fixing GitHub issues

* Remove duplicate line in FAQ documentation

* Fix error with closed paragraph

---------

Co-authored-by: Your Name <you@example.com>
2024-05-19 19:00:46 +02:00
Shimada666 50c2141d20 feat: add log when config file not found (#1898) 2024-05-19 14:42:55 +02:00
dependabot[bot] 3f1b7117b1 Bump @typescript-eslint/eslint-plugin from 7.5.0 to 7.9.0 in /frontend (#1880)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 7.5.0 to 7.9.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.9.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-18 14:45:03 -04:00
dependabot[bot] c5945bc5a8 Bump ruff from 0.3.7 to 0.4.4 (#1885)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.3.7 to 0.4.4.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.3.7...v0.4.4)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-18 14:44:47 -04:00
dependabot[bot] 75e1485cb0 Bump litellm from 1.37.12 to 1.37.16 (#1886)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.12 to 1.37.16.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.12...v1.37.16)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-18 14:44:22 -04:00
Robert Brennan 0ecba83e53 Move message history out of CodeAct (#1847)
* stop keeping history state in codeact

* regenerate tests

* Update agenthub/codeact_agent/codeact_agent.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* revert tests

* regen tests

* refactor codeact a bit

* regenerate without using LLM

* simplify logic

* change to heredoc

* fix heredoc

* fix end_of_edit docs

* regen tests

* regenerate

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-18 18:39:27 +00:00
dependabot[bot] f85a73a50c Bump i18next-browser-languagedetector from 7.2.1 to 8.0.0 in /frontend (#1881)
Bumps [i18next-browser-languagedetector](https://github.com/i18next/i18next-browser-languageDetector) from 7.2.1 to 8.0.0.
- [Changelog](https://github.com/i18next/i18next-browser-languageDetector/blob/master/CHANGELOG.md)
- [Commits](https://github.com/i18next/i18next-browser-languageDetector/compare/v7.2.1...v8.0.0)

---
updated-dependencies:
- dependency-name: i18next-browser-languagedetector
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-18 18:13:41 +00:00
Xingyao Wang b2fdb963b6 Add detailed tutorial for adding new evaluation benchmarks (#1827)
* Add detailed tutorial for adding new evaluation benchmarks

* update tutorial, fix typo, and log observation to the cmdline

* fix url

* Update evaluation/TUTORIAL.md

* Update evaluation/TUTORIAL.md

* Update evaluation/TUTORIAL.md

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* Update evaluation/TUTORIAL.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

* simplify readme and add comments to the actual code

* Fix typo in evaluation/TUTORIAL.md

* Fix typo in evaluation/swe_bench/run_infer.py

* Fix another typo in evaluation/swe_bench/run_infer.py

* Update TUTORIAL.md

* Set host net work to false for SWEBench

* Update evaluation/TUTORIAL.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update evaluation/TUTORIAL.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update evaluation/TUTORIAL.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update evaluation/TUTORIAL.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-18 13:40:53 -04:00
dependabot[bot] eafed447f8 Bump monaco-editor from 0.47.0 to 0.48.0 in /frontend (#1879)
Bumps [monaco-editor](https://github.com/microsoft/monaco-editor) from 0.47.0 to 0.48.0.
- [Release notes](https://github.com/microsoft/monaco-editor/releases)
- [Changelog](https://github.com/microsoft/monaco-editor/blob/main/CHANGELOG.md)
- [Commits](https://github.com/microsoft/monaco-editor/compare/v0.47.0...v0.48.0)

---
updated-dependencies:
- dependency-name: monaco-editor
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-18 13:38:26 -04:00
dependabot[bot] 061dfc0ee3 Bump boto3 from 1.34.106 to 1.34.108 (#1887)
Bumps [boto3](https://github.com/boto/boto3) from 1.34.106 to 1.34.108.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.106...1.34.108)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-18 13:37:53 -04:00
dependabot[bot] 9cc76550d1 Bump @testing-library/jest-dom from 6.4.2 to 6.4.5 in /frontend (#1882)
Bumps [@testing-library/jest-dom](https://github.com/testing-library/jest-dom) from 6.4.2 to 6.4.5.
- [Release notes](https://github.com/testing-library/jest-dom/releases)
- [Changelog](https://github.com/testing-library/jest-dom/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/jest-dom/compare/v6.4.2...v6.4.5)

---
updated-dependencies:
- dependency-name: "@testing-library/jest-dom"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-18 13:16:04 -04:00
dependabot[bot] 6896ffd654 Bump e2b from 0.14.14 to 0.17.0 (#1883)
Bumps [e2b](https://github.com/e2b-dev/e2b) from 0.14.14 to 0.17.0.
- [Release notes](https://github.com/e2b-dev/e2b/releases)
- [Commits](https://github.com/e2b-dev/e2b/compare/@e2b/python-sdk@0.14.14...@e2b/python-sdk@0.17.0)

---
updated-dependencies:
- dependency-name: e2b
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-18 13:13:45 -04:00
dependabot[bot] ee005cc2e9 Bump jose from 5.2.4 to 5.3.0 in /frontend (#1878)
Bumps [jose](https://github.com/panva/jose) from 5.2.4 to 5.3.0.
- [Release notes](https://github.com/panva/jose/releases)
- [Changelog](https://github.com/panva/jose/blob/main/CHANGELOG.md)
- [Commits](https://github.com/panva/jose/compare/v5.2.4...v5.3.0)

---
updated-dependencies:
- dependency-name: jose
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-18 13:12:44 -04:00
Temo d4c136a48f Update dependabot.yml (#1876)
Changed dependabot update schedule to daily to keep packages more up to date
2024-05-18 16:17:39 +00:00
Robert Brennan 1a045dc935 remove codecov annotations (#1877) 2024-05-18 12:04:22 -04:00
Robert Brennan 10933a2066 Only list files one directory deep (#1853)
* modify api endpoint

* update frontend for backend

* fix fileservice

* rm file

* unskip test

* fix some more tests

* fix another test

* fix another test

* fix api call

* fix refresh for subdirs

* more tests passing

* more tests

* more tests

* another test

* logspam

* lint

* fix import

* logspam

* code review feedback

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-18 15:36:26 +00:00
மனோஜ்குமார் பழனிச்சாமி b401be66f4 Revert "Fix/attribute error (#1812)" (#1871)
This reverts commit 6cce9c3c28.
2024-05-18 11:28:16 -04:00
Boxuan Li 0abc35cf57 ssh_box: Shutdown container when fail to start ssh session (#1872) 2024-05-18 17:04:38 +08:00
Boxuan Li f0ce2ffabf Allow code coverage to be zero for that patch (#1873) 2024-05-18 08:05:48 +00:00
Boxuan Li a57a213c7c Turn off auto linting by default, and on for swe_bench (#1861)
Disable Python linting by default, and turn it on for SWE Bench.

It is turned off by default since this behavior is weird and somewhat annoying to end users.
It is turned on for SWE Bench because linting python files gives LLM a chance to fix the indentations.
2024-05-18 04:04:38 +00:00
Aleksandar 94a9ec76b0 Disable Python linting by default (fixes #1789) (#1794)
* Disable Python linting by default (fixes #1789)

* Try to simplify

* Return do nothing comment

* Disable linting for the javascript as well

* Apply suggestions from code review

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-17 20:55:12 -07:00
Gant f950e3b48e make CodeAct paper link correct (#1870) 2024-05-18 03:54:10 +00:00
Shimada666 5df85dcb57 refactor agent status component and add i18n support (#1867)
* fix: correct simple i18n key typo

* feat: refactor agent status component and add i18n support

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-05-17 20:40:58 -07:00
மனோஜ்குமார் பழனிச்சாமி b0b44ed467 Auto restarted Jupyter kernel (#1808)
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-18 08:40:31 +05:30
John Tackman 171764469d Update Dockerfile to assign workspace directory properly to user (#1830)
* Update Dockerfile

creating the workspace directory after switching user to fix permission errors
https://github.com/OpenDevin/OpenDevin/issues/1560

* assign $workspace_base to user

instead of creating directory later, just fix the permissions so user can write to it

* another sudo

---------

Co-authored-by: Leo <ifuryst@gmail.com>
2024-05-17 20:11:23 -04:00
Boxuan Li 735fbbfe3e (test) Include message separators in mock prompts (#1855)
* Add message separator to prompts in tests

* DEMO: remove existing prompts for PlannerAgent

* Add results after prompt regeneration
2024-05-18 00:33:55 +02:00
Xingyao Wang 7ca560471b close runtime after completion in main (#1860) 2024-05-17 16:55:45 +08:00
Xingyao Wang c320d908e5 add iproute2 to sandbox (#1859) 2024-05-17 12:57:58 +08:00
Robert Brennan 110b878dd9 fix up serialization and deserialization of events (#1850)
* fix up serialization and deserialization of events

* fix tests

* remove prints

* fix test

* regenerate tests

* add try blocks
2024-05-17 01:09:15 +00:00
மனோஜ்குமார் பழனிச்சாமி 5b6f622dad Update browser_env.py (#1779)
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-17 06:11:32 +05:30
Frank Xu 9856e76c1f add BrowseInteractiveAction in dummy agent (#1852) 2024-05-17 00:04:17 +00:00
Robert Brennan 49147bf13a add fix:lint command (#1848) 2024-05-16 18:21:31 -04:00
dependabot[bot] fff10402e3 Bump pyarrow from 16.0.0 to 16.1.0 (#1841)
Bumps [pyarrow](https://github.com/apache/arrow) from 16.0.0 to 16.1.0.
- [Commits](https://github.com/apache/arrow/compare/go/v16.0.0...go/v16.1.0)

---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-16 22:21:20 +00:00
Frank Xu 35e7157565 Make screenshot fill up the whole browser panel (#1846)
* fix frontend browsing screenshot, allow link following in MD

* make screenshot auto scale and fill the whole browser panel
2024-05-16 22:10:48 +00:00
Xingyao Wang 0cb707bc9d fix(sandbox): ssh_box parsing (#1831)
* fix ssh_box error parsing

* Add type check

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: OpenDevinBot <bot@opendevin.com>
2024-05-16 18:03:34 -04:00
Engel Nyst b3a45ed7fe Fix workspace paths defaults (#1845)
* workspace_mount_path is set to the workspace_base if unset

* unit tests for paths

* workspace_base is absolute path
2024-05-16 17:53:31 -04:00
Xingyao Wang be1aef5863 set user.name and user.email for opendevin (#1842) 2024-05-16 17:50:32 -04:00
dependabot[bot] f55e5d00ca Bump web-vitals from 2.1.4 to 3.5.2 in /frontend (#1843) 2024-05-16 18:12:13 +00:00
Xingyao Wang e31f8b8322 automatically get agent version for eval (#1844) 2024-05-16 13:39:00 -04:00
dependabot[bot] 70f3a3c80d Bump litellm from 1.37.9 to 1.37.12 (#1839)
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.9 to 1.37.12.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.37.9...v1.37.12)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-16 13:38:37 -04:00
dependabot[bot] 77ce5f3e9d Bump pre-commit from 3.7.0 to 3.7.1 (#1840)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.7.0 to 3.7.1.
- [Release notes](https://github.com/pre-commit/pre-commit/releases)
- [Changelog](https://github.com/pre-commit/pre-commit/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pre-commit/pre-commit/compare/v3.7.0...v3.7.1)

---
updated-dependencies:
- dependency-name: pre-commit
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-16 17:35:14 +00:00
dependabot[bot] 756b8c4149 Bump @tailwindcss/typography from 0.5.12 to 0.5.13 in /frontend (#1834)
Bumps [@tailwindcss/typography](https://github.com/tailwindlabs/tailwindcss-typography) from 0.5.12 to 0.5.13.
- [Release notes](https://github.com/tailwindlabs/tailwindcss-typography/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss-typography/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss-typography/compare/v0.5.12...v0.5.13)

---
updated-dependencies:
- dependency-name: "@tailwindcss/typography"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-16 12:40:17 -04:00
dependabot[bot] a0017070b3 Bump vitest from 1.5.0 to 1.6.0 in /frontend (#1836)
Bumps [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest) from 1.5.0 to 1.6.0.
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v1.6.0/packages/vitest)

---
updated-dependencies:
- dependency-name: vitest
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-16 12:40:09 -04:00
dependabot[bot] 7c66b45667 Bump llama-index from 0.10.36 to 0.10.37 (#1838)
Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.10.36 to 0.10.37.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.10.36...v0.10.37)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-16 12:39:38 -04:00
dependabot[bot] 9d9329a43d Bump react-redux from 9.1.0 to 9.1.2 in /frontend (#1835)
Bumps [react-redux](https://github.com/reduxjs/react-redux) from 9.1.0 to 9.1.2.
- [Release notes](https://github.com/reduxjs/react-redux/releases)
- [Changelog](https://github.com/reduxjs/react-redux/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reduxjs/react-redux/compare/v9.1.0...v9.1.2)

---
updated-dependencies:
- dependency-name: react-redux
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-16 12:39:28 -04:00
dependabot[bot] c8d717fb3a Bump @react-types/shared from 3.22.1 to 3.23.0 in /frontend (#1832)
Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.22.1 to 3.23.0.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.22.1...@react-types/shared@3.23.0)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-16 12:39:17 -04:00
Robert Brennan a4a7ad6c87 Create dependabot.yml (#1829) 2024-05-16 11:55:38 -04:00
Boxuan Li b6ff201780 Refactor integration test framework and relieve the pain of regeneration (#1818)
* Update README.md

* Fix WORKSPACE_MOUNT_PATH_IN_SANDBOX variable in regenerate.sh

* Regenerate prompts without calling real LLM

* Disable pytest warning capture

* Change planner agent prompt by a bit for demo

* Regenerate prompt files following prompt changes

* doc: elaborate on FORCE_USE_LLM

* Add another prompt change to monologue_agent for demo purpose

* Regenerate prompts with FORCE_USE_LLM=true

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-16 08:30:29 -07:00
Leo e89cc8f19b Feat: add stream output to exec_run (#1625)
* Feat: add stream output to exec_run

* Using command timeout to control the exec_box's timeout.
* add bash -c to source command to compatible for sh.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* Feat: add stream output to SSHBox execute

Signed-off-by: ifuryst <ifuryst@gmail.com>

* fix the test case fail.

Signed-off-by: ifuryst <ifuryst@gmail.com>

* fix the test case import wrong path for method.

Signed-off-by: ifuryst <ifuryst@gmail.com>

---------

Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-05-16 14:37:49 +00:00
Xingyao Wang 0fdbe1ee93 Update README.md (#1825) 2024-05-16 11:06:28 +00:00
மனோஜ்குமார் பழனிச்சாமி 7313421ae4 Enabled LLM logs by default (#1819)
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-16 10:35:18 +00:00
mzyddd 6cce9c3c28 Fix/attribute error (#1812)
* refactor : delete useless messages.json messages

* Update msg_stack.py

* Update msg_stack.py

* buf fix #1809
AttributeError

* buf fix #1809
AttributeError

---------

Co-authored-by: mengziyi.mzy <mengziyi.mzy@alibaba-inc.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-16 10:26:30 +00:00
Frank Xu adea9b3f32 fix frontend browsing screenshot, allow link following in MD (#1817) 2024-05-16 18:06:06 +08:00
yangpryili 52e21c20e3 Update msg_stack.py (#1820)
* Update msg_stack.py

1、[msg.to_dict() for msg in msgs], msg is not instanse of Message, it not has a func of to_dict(), so msg.to_dict() will accur JSONDecodeError;
2、json.dump(new_data, file), it appends new_data to the end of the file instead of overwriting from the beginning, Hence, it's necessary to first perform file.seek(0) and file.truncate().

* Update opendevin/server/session/msg_stack.py

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-16 10:04:05 +00:00
sp.wack 15685f9aba feat(frontend): uploading multiple files (#1718)
* create test todos

* extend to support uploading directories

* remove dir-upload logic and feature drag-and-drop

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-16 17:41:15 +08:00
Xingyao Wang 9e59937180 fix utf-8 decoding issue (#1816) 2024-05-15 22:49:49 -07:00
Xingyao Wang 2406b901df feat(SWE-Bench environment) integrate SWE-Bench sandbox (#1468)
* add draft dockerfile for build all

* add rsync for build

* add all-in-one docker

* update prepare scripts

* Update swe_env_box.py

* Add swe_entry.sh (buggy now)

* Parse the test command in swe_entry.sh

* Update README for instance eval in sandbox

* revert specialized config

* replace run_as_devin as an init arg

* set container & run_as_root via args

* update swe entry script

* update env

* remove mounting

* allow error after swe_entry

* update swe_env_box

* move file

* update gitignore

* get swe_env_box a working demo

* support faking user response & provide sandox ahead of time;
also return state for controller

* tweak main to support adding controller kwargs

* add module

* initialize plugin for provided sandbox

* add pip cache to plugin & fix jupyter kernel waiting

* better print Observation output

* add run infer scripts

* update readme

* add utility for getting diff patch

* use get_diff_patch in infer

* update readme

* support cost tracking for codeact

* add swe agent edit hack

* disable color in git diff

* fix git diff cmd

* fix state return

* support limit eval

* increase t imeout and export pip cache

* add eval limit config

* return state when hit turn limit

* save log to file; allow agent to give up

* run eval with max 50 turns

* add outputs to gitignore

* save swe_instance & instruction

* add uuid to swebench

* add streamlit dep

* fix save series

* fix the issue where session id might be duplicated

* allow setting temperature for llm (use 0 for eval)

* Get report from agent running log

* support evaluating task success right after inference.

* remove extra log

* comment out prompt for baseline

* add visualizer for eval

* use plaintext for instruction

* reduce timeout for all; only increase timeout for init

* reduce timeout for all; only increase timeout for init

* ignore sid for swe env

* close sandbox in each eval loop

* update visualizer instruction

* increase max chars

* add finish action to history too

* show test result in metrics

* add sidebars for visualizer

* also visualize swe_instance

* cleanup browser when agent controller finish runinng

* do not mount workspace for swe-eval to avoid accidentally overwrite files

* Revert "do not mount workspace for swe-eval to avoid accidentally overwrite files"

This reverts commit 8ef7739054.

* Revert "Revert "do not mount workspace for swe-eval to avoid accidentally overwrite files""

This reverts commit 016cfbb9f0.

* run jupyter command via copy to, instead of cp to mount

* only print mixin output when failed

* change ssh box logging

* add visualizer for pass rate

* add instance id to sandbox name

* only remove container we created

* use opendevin logger in main

* support multi-processing infer

* add back metadata, support keyboard interrupt

* remove container with startswith

* make pbar behave correctly

* update instruction w/ multi-processing

* show resolved rate by repo

* rename tmp dir name

* attempt to fix racing for copy to ssh_box

* fix script

* bump swe-bench-all version

* fix ipython with self-contained commands

* add jupyter demo to swe_env_box

* make resolved count two column

* increase height

* do not add glob to url params

* analyze obs length

* print instance id prior to removal handler

* add gold patch in visualizer

* fix interactive git by adding a git --no-pager as alias

* increase max_char to 10k to cover 98% of swe-bench obs cases

* allow parsing note

* prompt v2

* add iteration reminder

* adjust user response

* adjust order

* fix return eval

* fix typo

* add reminder before logging

* remove other resolve rate

* re adjust to new folder structure

* support adding eval note

* fix eval note path

* make sure first log of each instance is printed

* add eval note

* fix the display for visualizer

* tweak visualizer for better git patch reading

* exclude empty patch

* add retry mechanism for swe_env_box start

* fix ssh timeout issue

* add stat field for apply test patch success

* add visualization for fine-grained report

* attempt to support monologue agent by constraining it to single thread

* also log error msg when stopeed

* save error as well

* override WORKSPACE_MOUNT_PATH and WORKSPACE_BASE for monologue to work in mp

* add retry mechanism for sshbox

* remove retry for swe env box

* try to handle loop state stopped

* Add get report scripts

* Add script to convert agent output to swe-bench format

* Merge fine grained report for visualizer

* Update eval readme

* Update README.md

* Add CodeAct gpt4-1106 output and eval logs on swe-bench-lite

* Update the script to get model report

* Update get_model_report.sh

* Update get_agent_report.sh

* Update report merge script

* Add agent output conversion script

* Update swe_lite_env_setup.sh

* Add example swe-bench output files

* Update eval readme

* Remove redundant scripts

* set iteration count down to false by default

* fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model execption out of litellm (#1666)

* fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model execption out of litellm

* Review Feedback

* Missing None Check

* Review feedback and improved error handling

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>

* fix prepare_swe_util scripts

* update builder images

* update setup script

* remove swe-bench build workflow

* update lock

* remove experiments since they are moved to hf

* remove visualizer (since it is moved to hf repo)

* simply jupyter execution via heredoc

* update ssh_box

* add initial docker readme

* add pkg-config as dependency

* add script for swe_bench all-in-one docker

* add rsync to builder

* rename var

* update commit

* update readme

* update lock

* support specify timeout for long running tasks

* fix path

* separate building of all deps and files

* support returning states at the end of controller

* remove return None

* support specify timeout for long running tasks

* add timeout for all existing sandbox impl

* fix swe_env_box for new codebase

* update llm config in config.py

* support pass sandbox in

* remove force set

* update eval script

* fix issue of overriding final state

* change default eval output to hf demo

* change default eval output to hf demo

* fix config

* only close it when it is NOT external sandbox

* add scripts

* tweak config

* only put in hostory when state has history attr

* fix agent controller on the case of run out interaction budget

* always assume state is always not none

* remove print of final state

* catch all exception when cannot compute completion cost

* Update README.md

* save source into json

* fix path

* update docker path

* return the final state on close

* merge AgentState with State

* fix integration test

* merge AgentState with State

* fix integration test

* add ChangeAgentStateAction to history in attempt to fix integration

* add back set agent state

* update tests

* update tests

* move scripts for setup

* update script and readme for infer

* do not reset logger when n processes == 1

* update eval_infer scripts and readme

* simplify readme

* copy over dir after eval

* copy over dir after eval

* directly return get state

* update lock

* fix output saving of infer

* replace print with logger

* update eval_infer script

* add back the missing .close

* increase timeout

* copy all swe_bench_format file

* attempt to fix output parsing

* log git commit id as metadata

* fix eval script

* update lock

* update unit tests

* fix argparser unit test

* fix lock

* the deps are now lightweight enough to be incude in make build

* add spaces for tests

* add eval outputs to gitignore

* remove git submodule

* readme

* tweak git email

* update upload instruction

* bump codeact version for eval

---------

Co-authored-by: Bowen Li <libowen.ne@gmail.com>
Co-authored-by: huybery <huybery@gmail.com>
Co-authored-by: Bart Shappee <bshappee@gmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-15 16:15:55 +00:00
Frank Xu a84d19f03c Enable CodeAct agents with browsing, and also enable arbitrary BrowserGym action support (#1807)
* enable browsing in codeact, and  arbitrary browsergym DSL support

* fix

* fix unit test case

* update frontend for the new interactive browsing action

* bump ver

* Fix integration tests

---------

Co-authored-by: OpenDevinBot <bot@opendevin.com>
2024-05-15 11:59:58 -04:00
Xia Zhenhua 76abca361c feat: simplify state.history with to_memory call in micro-agent. Or the call to LLM may exceed the token limit. (#1806)
* feat: simplify state.history with to_memory call in micro-agent.

* feat: merge master and replace to_memory with event_to_memory.

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-05-15 14:47:37 +02:00
Xia Zhenhua bf14b47890 feat: make other agents support asking user input in MessageAction. (#1777)
* feat: make other agents support asking user input in MessageAction.

* Update agenthub/micro/_instructions/actions/message.md

Co-authored-by: Robert Brennan <accounts@rbren.io>

* Update agenthub/micro/_instructions/actions/message.md

Co-authored-by: Robert Brennan <accounts@rbren.io>

* feat: make other agents support asking user input in MessageAction.

* Regenerate test artifacts

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-15 00:44:45 -07:00
Shimada666 817222061f refactor: jupyter scroll (#1799)
* refactor: jupyter scroll

* Update Jupyter.tsx

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-15 07:39:55 +00:00
Boxuan Li 6714000b2c CodeActAgent: Fix iteration reminder (#1803)
This PR includes three changes:
1) Iteration reminder should start with MAX_ITERATIONS from config rather than default value 100
2) In the first prompt, we should tell the LLM it has `MAX_ITERATIONS - 1` turns left, rather than `MAX_ITERATIONS - 2`
3) Remove legacy ITERATION_REMINDER config
2024-05-15 13:48:47 +08:00
Xingyao Wang d1fd277ad4 Support return final task states for evaluation (#1755)
* support returning states at the end of controller

* remove return None

* fix issue of overriding final state

* return the final state on close

* merge AgentState with State

* fix integration test

* add ChangeAgentStateAction to history in attempt to fix integration

* add back set agent state

* update tests

* update tests

* directly return get state

* add back the missing .close()

* Update typo in opendevin/core/main.py

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-15 03:43:01 +00:00
Robert Brennan c604f8fcd2 change error message to something more descriptive (#1790) 2024-05-15 08:32:51 +08:00
Robert Brennan 135320861c set a higer UID_MAX (#1788)
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-15 00:28:37 +00:00
Xingyao Wang 123968f887 Runtime only close then sandbox if it is created by itself (#1793) 2024-05-15 05:47:56 +08:00
Graham Neubig 3cef8ee187 Add GitHub prompt to CodeAct (#1792)
* Added github to CodeAct

* More codeact

* Simplify prompt

* Modify codeact prompt

* fix integration test for CodeAct

* yet another integration test fix for codeact

* fix plugin use in jupyter

* update edit tests

* fix jupyter plugin potential port conflict

* fix test ipython with latest ipython fix

* update integration test

* wait a bit for jupyter execution

* add one unit tests for sandbox

* fix integration test

---------

Co-authored-by: OpenDevinBot <bot@opendevin.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-14 21:25:21 +00:00
Xingyao Wang 8d8ed0c3be hotfix: Initialize plugin with new runtime (#1795)
* fix plugin use in jupyter

* fix jupyter plugin potential port conflict

* update integration test

* wait a bit for jupyter execution

* add one unit tests for sandbox

* fix integration test

* fix integration

* fix integration yet again

* init sandbox plugins in the server
2024-05-14 21:15:19 +00:00
Shimada666 e4460a974d feat: chat interface autoscroll (#1761)
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-14 18:00:16 +00:00
Robert Brennan 6ed17aad37 fix serialization (#1785) 2024-05-14 13:46:15 -04:00
Marshall Roch 64ee5d404d Fix CodeAct paper link (#1784)
https://arxiv.org/abs/2402.13463 is RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models

https://arxiv.org/abs/2402.01030 is Executable Code Actions Elicit Better LLM Agents
2024-05-14 17:40:07 +00:00
mamoodi 1d8402a14a doc: Small fixes to documentation (#1783) 2024-05-14 13:36:53 -04:00
Robert Brennan dcb5d1ce0a Add permanent storage option for EventStream (#1697)
* add storage classes

* add minio

* add event stream storage

* storage test working

* use fixture

* event stream test passing

* better serialization

* factor out serialization pkg

* move more serialization

* fix tests

* fix test

* remove __all__

* add rehydration test

* add more rehydration test

* fix fixture

* fix dict init

* update tests

* lock

* regenerate tests

* Update opendevin/events/stream.py

* revert tests

* revert old integration tests

* only add fields if present

* regen tests

* pin pyarrow

* fix unit tests

* remove cause from memories

* revert tests

* regen tests
2024-05-14 11:09:45 -04:00
Robert Brennan beb74a19f6 Use event stream for the runtime (#1776)
* rebuild PR from scratch

* fix max_iter

* regenerate tests

* cut down on history

* Update opendevin/controller/agent_controller.py

* regenerate tests

* revert swe agent

* revert some codeact chagnes

* regenerate tests

* add source to dict

* only add source if not none

* try to fix coverage issue

* lock

* add gevent
2024-05-14 13:35:25 +00:00
Robert Brennan 82a798990c refactor remind_iterations (#1760)
* refactor remind_iterations

* regenerate tests

* concatenate iteration message

* fix merge issues

* update integration tests
2024-05-14 08:27:12 -04:00
Boxuan Li 3d53d363b4 Integration test: Verify finish state & add auto-rerun in regenerate.sh (#1773)
* regenerate.sh: Allow testing on a specific agent and/or test

* Check agent finish state

* rengerate.sh: Rerun after fixing the prompts

* Fix SWEAgent test_write_simple_script

* Add more help message

* Add a known issue to README.md

* regenerate.sh: Fix help message typo

* Fix a typo in README
2024-05-14 03:50:29 -04:00
Boxuan Li b84f25ab35 Integration test: exit if no prompt match (#1772) 2024-05-13 20:03:09 -07:00
Robert Brennan 2771328036 use -it and pull=always for docker (#1769) 2024-05-13 19:17:57 -04:00
Robert Brennan b028bd46bb Use messages to drive tasks (#1688)
* finish is working

* start reworking main_goal

* remove main_goal from microagents

* remove main_goal from other agents

* fix issues

* revert codeact line

* make plan a subclass of task

* fix frontend for new plan setup

* lint

* fix type

* more lint

* fix build issues

* fix codeact mgs

* fix edge case in regen script

* fix task validation errors

* regenerate integration tests

* fix up tests

* fix sweagent

* revert codeact prompt

* update integration tests

* update integration tests

* handle loading state

* Update agenthub/codeact_agent/codeact_agent.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update opendevin/controller/agent_controller.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update agenthub/codeact_agent/codeact_agent.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update opendevin/controller/state/plan.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* update docs

* regenerate tests

* remove none from state type

* revert test files

* update integration tests

* rename plan to root_task

* revert plugin perms

* regen integration tests

* tweak integration script

* prettier

* fix test

* set workspace up for regeneration

* regenerate tests

* Change directory of copy

* Updated tests

* Disable PlannerAgent test

* Fix listen

* Updated prompts

* Disable planner again

* Make codecov more lenient

* Update agenthub/README.md

* Update opendevin/server/README.md

* re-enable planner tests

* finish top level tasks

* regen planner

* fix root task factory

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-13 23:14:15 +00:00
Robert Brennan e28b3ef9e8 Fix integration tests (#1764)
* refactor remind_iterations

* regenerate tests

* concatenate iteration message

* add some helpers to the tests

* regenerate tests

* add to logs

* regenerate tests

* add debug info

* fix exit_on_message

* fix regen script

* regenerate tests

* Revert "Merge branch 'rb/test-regen' of ssh://github.com/opendevin/opendevin into rb/test-regen"

This reverts commit b9cd1acbf2, reversing
changes made to c888285304.

* remove prints

* revert files

* revert more

* revert more

* regenerate for the last time I hope

* add back remind_iter

* regenerate

* add back remind_iter

* regenerate

* fix remind_iter

* regenerate yet again

* regen

* remove comment

* regen again
2024-05-13 18:08:59 -04:00
wallter ee66a1d5d1 Fix: Correct --add-host Flag Format in README (#1767)
This PR updates the README to correct the format of the --add-host flag used in the Docker run command.
The previous format, host.docker.internal=host-gateway, was incorrect and resulted in the following error:
invalid argument "host.docker.internal=host-gateway" for "--add-host" flag: bad format for add-host: "host.docker.internal=host-gateway"
Use code with caution.
This PR fixes the issue by updating the flag to the correct format:
--add-host host-gateway:host.docker.internal
Use code with caution.
This ensures that the Docker container can correctly resolve the host.docker.internal hostname to the host machine's gateway IP address.
2024-05-13 22:07:56 +00:00
Graham Neubig b13d4647ab Print out the regenerate command (#1759)
* Print out the output of the regenerate command

* Update regenerate.sh
2024-05-13 18:43:58 +00:00
Pete Stenger a48b02207f await closing the controller (#1751)
* await closing the controller

* Update manager.py

* Cleanly exit

* Update agent.py

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-05-13 14:34:03 -04:00
Xingyao Wang 755a4072b6 Support specify timeout for long running tasks (#1756)
* support specify timeout for long running tasks

* add timeout for all existing sandbox impl

* Update opendevin/runtime/docker/local_box.py

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/docker/exec_box.py

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/docker/ssh_box.py

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/e2b/sandbox.py

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update opendevin/runtime/sandbox.py

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-13 10:17:03 +00:00
Xingyao Wang 00c0edae5f Re-adjust ssh_box for parallel evaluation (#1729)
* update ssh_box

* fix controller in test

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-13 14:35:30 +08:00
Frank Xu ba8d8634ac fix browsergym to old ver (#1753) 2024-05-12 22:05:37 -07:00
Boxuan Li eba5ef8e67 Fix test_ipython (#1750) 2024-05-12 16:15:32 -07:00
Xingyao Wang 4db4a84e2e Simply Jupyter execution via heredoc (#1728)
* simply jupyter execution via heredoc

* make sure /tmp always exists

* add integration test for jupyter exec
2024-05-13 04:57:06 +08:00
Boxuan Li 49de262577 opendevin/core/main.py: Graceful shutdown (#1731)
* opendevin/core/main.py: Graceful shutdown

* Shutdown controller at exit

* Update opendevin/core/main.py

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-12 13:56:35 -07:00
Engel Nyst e5f1dbf5e7 Move json utility to the custom json parsing; apply it to the monologue-like agents (#1740) 2024-05-12 13:39:38 -04:00
Aleksandar f861db6675 Enhance API Documentation (#1727)
* Add Server Interaction Guide

* Fix style

* Remove the server_interaction.md and add docstrings doc

* Remove very specific setup for the token from the doc

* Fix mdx expression failure

* Fix all examples

* Fix missing empty args {}

* Fix the run example to have and background
2024-05-12 08:58:01 -07:00
Robert Brennan efd0d61e70 Fix the tests (#1737)
* fix config patching

* revert tests
2024-05-12 11:02:10 -04:00
Robert Brennan d94b575cd4 Sandbox: adjust whitespace processing (#1474)
* adjust whitespace processing

* revert ssh_box

* adjust tests

* change lstrip to remove prefix

* run tests on exec box

* remove lstrips

* fix multiline

* remove stripping logic

* fix single multiline commands

* fix imports

* fix multiline echo

* better command splitter

* fix merge issue
2024-05-12 14:41:50 +00:00
Xingyao Wang 8bfae8413e Support passing sandbox as argument and iteration reminder (#1730)
* support custom sandbox;
add iteration_reminder

* Enable iteration reminder in CodeActAgent integration test

* Don't remove numbers when comparing prompts

* Update tests/integration/README.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-12 07:57:33 +00:00
Xingyao Wang 1d58917bc8 remove swe-bench build workflow (#1726) 2024-05-12 06:56:20 +08:00
Jens Roland 6a18cafa40 docs: fixed typo in launch command (#1724)
The argument `--add-host host.docker.internal:host-gateway` should be `--add-host host.docker.internal=host-gateway` (with an `=` character).

Solves `Error creating controller: Could not establish connection to host` errors.

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-05-11 17:48:45 -04:00
Jim Su 3abdc231c4 Highlight currently selected file (#1725)
* Highlight selected file

* Don't recompute context on every render

* Fix lint errors
2024-05-12 00:36:55 +03:00
Natu Lauchande c997289eb7 Update troubleshooting Docker error in a Mac (#1722) 2024-05-11 19:38:00 +00:00
Boxuan Li 316a772849 CodeAct: Emphasize open before edit (#1709)
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-11 12:20:14 -07:00
Temo 93fe31a490 Updated poetry.lock again (#1721) 2024-05-11 18:06:41 +00:00
Engel Nyst 98adbf54ec Small refactoring (#1614)
* move MemoryCondenser, LongTermMemory, json, out of the monologue

* PlannerAgent and Microagents use the custom json.loads/dumps

* Move short term history out of monologue agent...

* move memory in their package

* add __init__
2024-05-11 17:15:19 +02:00
mzyddd 5277c43c49 refactor : delete useless messages.json messages (#1706) 2024-05-11 11:59:05 +00:00
Xia Zhenhua 4477f08e6d fix: a critical bug of sharing _subscribers and _events between class not obj instance, when multi agent running(several browsers opened at the same time), the state management will be messed up. (#1713) 2024-05-11 11:26:41 +00:00
மனோஜ்குமார் பழனிச்சாமி 24e61ead65 Fixed bool config having int value (#1708)
like DEBUG=1
2024-05-11 09:46:11 +00:00
Xia Zhenhua 5244a34a1d feat: skip deploy-decs in folk repos. (#1703)
Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-05-11 01:44:25 -07:00
Boxuan Li bde12f4a09 CodeActAgent: Fix hack for multiple edits in same command (#1684)
* Fix edit hack for multiple edits in same command

This PR changes ([\s\S]*) to ([\s\S]*?) to make the capturing
group non-greedy. This change ensures that the regex captures
the smallest set of characters that extends up to the first
end_of_edit it encounters, rather than extending across multiple
edit commands.

Without the fix, a bash command consisting of multiple edits
would be corrupt and lead to unexpected edit results.
2024-05-10 23:32:09 -07:00
Graham Neubig 1787a7304e Disable ChromaDB telemetry (#1699) 2024-05-11 01:25:19 +02:00
Xingyao Wang 33e141e626 fix: do not raise error when failed to delete ~/.bashrc (#1701) 2024-05-11 00:08:44 +02:00
mzyddd e1a1c9a00c perf : optimizations to send event logging performance (#1635)
Co-authored-by: mengziyi.mzy <mengziyi.mzy@alibaba-inc.com>
2024-05-10 23:54:22 +02:00
மனோஜ்குமார் பழனிச்சாமி b4cdebec06 Ignore any warnings LiteLLM might emit on import (#1687) 2024-05-10 16:42:08 -04:00
Robert Brennan 1cbb16cfc2 allow running app as root (#1651)
* allow running app as root

* better entrypoint mgmt

* add nosetup option

* remove comments

* create docker group if it doesnt exist

* better docker group mgmt

* cast bools better

* fix playwright

* fix playwright for root

* fix root source ~/.bashrc hangs by create clean bashrc

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-10 20:41:25 +00:00
Xia Zhenhua 968b4d71bd feat: auto clean inactive sessions for a long time. (#1644)
* feat: auto clean inactive sessions for a long time.

* feat: new delegate stuck check.

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-05-10 20:09:36 +00:00
Bart Shappee 78cd2e5b47 fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model execption out of litellm (#1666)
* fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model execption out of litellm

* Review Feedback

* Missing None Check

* Review feedback and improved error handling

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-10 13:57:37 -04:00
மனோஜ்குமார் பழனிச்சாமி 18e6b0b2d0 Removed logging output of plugin installation (#1690)
* Removed logging output of plugin installation
2024-05-10 16:09:49 +00:00
Zhou Hang 0cf94a2718 feat: add continue button (#1508)
* feat: add-continue-button

* feat: control the visibility of continue button

* feat: reset input status

* feat: update continue button UI

* feat: add test

* fix: fix lint issues

* feat: update ui

* feat: remove continue button to the parent and update icon

* fix: remove empty file

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-10 10:26:39 +00:00
Jim Su f8d4b1ab0d Use generic types (#1680) 2024-05-10 04:21:22 +02:00
Engel Nyst a17308108c make runnable a class var (#1679) 2024-05-09 22:12:19 -04:00
Xia Zhenhua 10b971c612 feat: new delegate stuck check. (#1677)
Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-05-09 21:06:20 -04:00
Robert Brennan e63910263c cast bools better (#1678) 2024-05-09 21:05:15 -04:00
Robert Brennan 96151d9147 fix e2b env (#1670) 2024-05-09 23:14:44 +00:00
Aleksandar 657b177b4e Default to less expensive gpt-3.5-turbo model (#1675) 2024-05-09 19:11:27 -04:00
Ammar Ladhani 564739d1db Removes upper-case 'List' imports from typing and replaces them with lower-case 'list' (#1676) 2024-05-09 19:05:46 -04:00
Robert Brennan 26d82841d5 Create runtime implementation (#1626)
* first pass at moving runtime

* fix import

* remove github refs

* remove unnecessary import

* remove unnecessary import

* add e2b

* refactor read and write file ops

* remove github test

* rm action

* revert permissions

* regenerate tests

* re-delete file operations

* regenerate integration tests

* Update opendevin/runtime/runtime.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix ref

* add docs

* remove logspam

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-09 19:04:49 -04:00
Engel Nyst 446eaec1e6 Refactor config to dataclasses (#1552)
* mypy is invaluable

* fix config, add test

* Add new-style toml support

* add singleton, small doc fixes

* fix some cases of loading toml, clean up, try to make it clearer

* Add defaults_dict for UI

* allow config to be mutable
error handling
fix toml parsing

* remove debug stuff

* Adapt Makefile

* Add defaults for temperature and top_p

* update to CodeActAgent

* comments

* fix unit tests

* implement groups of llm settings (CLI)

* fix merge issue

* small fix sandboxes, small refactoring

* adapt LLM init to accept overrides at runtime

* reading config is enough

* Encapsulate minimally embeddings initialization

* agent bug fix; fix tests

* fix sandboxes tests

* refactor globals in sandboxes to properties
2024-05-09 22:48:29 +02:00
மனோஜ்குமார் பழனிச்சாமி 73693ba416 Mentioned LLM logs directory (#1587)
* Update bug_template.yml

* Pythonized

* updated configs type

* updated opendevin_logger

* fixed bool config

* fixed bool config
2024-05-09 13:31:14 -04:00
Frank Xu ae7f208d51 Fix browser env leak after resetting agent (#1589)
* add more teardown

* add browser process teardown logic in agent controller

* remove testing code

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-09 13:17:16 -04:00
Boxuan Li a60a6a40d6 Only regenerate integratio tests for failed ones (#1661) 2024-05-09 09:32:00 -04:00
Arno.Edwards 06aae67fed feat(makefile): add capability to skip Docker image pull (#1664) 2024-05-09 09:06:26 -04:00
Xia Zhenhua 4a72e83938 fix: AgentThinkAction deleted caused bug. (#1662)
* fix: AgentThinkAction deleted caused bug.

* fix: AgentThinkAction deleted caused bug in plannerAgent.

* fix: plan content-not-changed caused frontend crash bug.

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-05-09 09:04:02 -04:00
808vita 780db1e906 Rename CodeOfConduct.md to CODE_OF_CONDUCT.md (#1665) 2024-05-09 10:38:24 +00:00
Xingyao Wang 21fe8dc1eb Align codeact with swebench eval (#1612)
* align codeact agent with the slight adjustment on eval branch

* update integration test for new prompt

* Regenerate test artifacts for CodeActAgent

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-09 00:42:07 -07:00
Robert Brennan 45d1b6969a Allow setting env vars inside sandboxes (#1652)
* add env to all sandboxes

* add unit tests
2024-05-09 15:39:31 +08:00
Shimada666 73e180638e feat: support for stopping automatic scrolling to bottom and add "to bottom" button (#1656)
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-09 06:23:32 +00:00
Robert Brennan 09e8b11451 Update integration test instructions (#1645)
* Update README.md

* Update tests/integration/README.md

* Apply suggestions from code review

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-09 02:56:33 +00:00
Ikko Eltociear Ashimine 4cc462cc18 Update codeact_agent.py (#1653)
splited -> splitted
2024-05-08 13:55:40 -04:00
Boxuan Li af5bdf67aa Add AgentRejectAction across multiple modules (#1615)
* Add AgentRejectAction across multiple modules

This commit introduces the AgentRejectAction class and integrates it across various modules and actions. It includes updates to READMEs, action definitions, and agent controllers to handle the new 'reject' action. This functionality will allow agents to properly signal task rejection.

* Fix unit test

* Remove wrong generates attributes from a few micro-agents
2024-05-08 10:03:14 -07:00
AJ (@techfren) c2868985e4 add sandbox_user_id to run command which is neccesary (#1641) 2024-05-08 12:20:31 +00:00
sp.wack 88ef414e3a fix warnings during test runs (#1638) 2024-05-08 20:12:14 +08:00
tahussle 04676d17a8 Updated Makefile to support Manjaro / Arch linux hosts (#1642) 2024-05-08 12:06:41 +00:00
Robert Brennan 242c4a0df6 Remove extra message actions (#1608)
* remove extra actions

* remove message observations

* support null obs

* handle null obs

* fix frontend for changes

* fix the way messages flow to the UI

* change think to message

* add regen script

* regenerate all integration tests

* change task

* remove gh test

* fix messages

* fix tests

* help agent exit after hitting max iter

* Update opendevin/events/observation/success.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update agenthub/codeact_agent/codeact_agent.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-07 21:13:08 +00:00
Lev 4a2a35b6cf feat(frontend): "Reset to Default" button (#1573)
* frontend: reset-button

* frontend: key prop removed, issue with uncontrolled Autocomplete input

* frontend: reset button test, Autocomplete switch to controlled input

* frontend: proper use of getDefaultSettings in test

* frontend: separate selectedKey and inputValue in Autocompletecombobox

* no fallbacks, defaultSelectedKey prop is used to prevent the input from clearing itself

* remove conflict resolution fragments

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
Co-authored-by: amanape <stephanpsaras@gmail.com>
2024-05-07 17:17:08 +00:00
Temo fda21d2ce3 Updated poetry.lock (#1628) 2024-05-07 16:56:37 +00:00
Robert Brennan 2ad9c55010 add SWE-bench Lite score to readme (#1620)
* add score to readme

* try fixing images

* try fixing images

* set width

* text-align

* use p

* add progress header

* more readme tweaks

* change badge colors

* remove back to top

* remove link

* move docs section

* change up community msg

* spacing

* add back demo video

* add teaser gif

* better gif

* faster gif

* change scale

* remove scale

* update demo and make it a gif

* replace demo

* update results

* add screenshot

* add back teaser video

* remove demo gif
2024-05-07 15:08:53 +00:00
Niall dd666cf0a6 fix: Add dependency boto3 to solve issues#1618 (#1619)
Co-authored-by: ning.zhao <ning.zhao@jiduauto.com>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-07 17:52:35 +03:00
Aleksandar 4bf4119259 Introduce TypoFixerAgent for in-place typo corrections in agenthub/micro (#1613)
* Add TypoFixerAgent micro-agent to fix typos

* Improve parse_response to accurately extract the first complete JSON object

* Add tests for parse_response function handling complex scenarios

* Fix tests and logic to use action_from_dict

* Fix small formatting issues
2024-05-07 13:25:35 +02:00
zhaoninge 6150ab6a3e fix: corrected bedrock model list (#1513)
- auto set environment variable
- add criteria for querying the AWS bedrock model
2024-05-07 03:51:49 +00:00
Xingyao Wang 356caf0960 Fix the issue of newly import package by including instruction for Kernel restart (#1609)
* fix the issue of newly import package by including instruction for kernel restart

* fix integration test for new prompt

* fix integration yet again
2024-05-07 06:11:05 +08:00
Robert Brennan 2be7e55303 Prompt for settings on initial load, and add migration logic (#1527)
* prompt for settings on initial load and add migration logic

* logspam

* revert message

* change fn body

* fix fn

* move agent box to top

* add message about update

* fix up settings logic

* pr feedback

* fix up lint

* Update frontend/src/components/modals/settings/SettingsModal.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* Update frontend/src/components/modals/settings/SettingsModal.tsx

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>

* disable save if required settings arent set

* simplify required settings

* fix up vars

* lint

* fix compile issues

* fix test

* fix test

* fix all tests

* lint

* fix build err

* lint

* more lint

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-06 18:30:18 +00:00
Frank Xu 26dcf4fd7c remove screenshot in browser observation (#1588)
* remove screenshot in browser observation

* refactor utils

* allow only dict

* fix screenshot not showing up in frontend

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-06 13:56:28 -04:00
Robert Brennan d0967122f8 Minor changes to agent state management (#1597)
* move towards event stream

* refactor agent state changes

* move agent state logic

* fix callbacks

* break on finish

* closer to working

* change frontend to accomodate new flow

* handle start action

* fix locked stream

* revert message

* logspam

* no async on close

* get rid of agent_task

* fix up closing

* better asyncio handling

* sleep to give back control

* fix key

* logspam

* update frontend agent state actions

* fix pause and cancel

* delint

* fix map

* delint

* wait for agent to finish

* fix unit test

* event stream enums

* fix merge issues

* fix lint

* fix test

* fix test

* add user message action

* add user message action

* fix up user messages

* fix main.py flow

* refactor message waiting

* lint

* fix test

* fix test

* simplify if/else

* fix state reset

* logspam

* add error status

* minor changes to control bar

* handle user messages when not awaiting

* restart agent after stopping

* Update opendevin/controller/agent_controller.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* delint

* refactor initialize

* delint

* fix dispatch

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-06 12:29:48 +00:00
Xia Zhenhua c9970e6817 fix: delegateAgent select micro agents name bug. (#1601)
* fix: delegateAgent select micro agents name bug.

* refactor: last_observation

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-05-06 08:27:03 -04:00
ChengyangDu 16400e40ca fix: check (and create if necessary) dir before write file (#1600) 2024-05-06 08:18:32 -04:00
Xia Zhenhua 2d24521222 fix: file in _instructions directory not set correctly. (#1602)
Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-05-06 08:09:03 -04:00
Leo 886f713e4c Fix: Compatible with older docker version for --add-host (#1604)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-05-06 07:59:51 -04:00
ChengyangDu a58f41a7b3 fix: check action['action'] instance type before parsing action cls (#1599) 2024-05-06 08:41:14 +02:00
Alex Bäuerle ae2b857cb4 docs(docker): update docker version and make get started more prominent (#1536)
* docs(docker): update docker version and make get started more prominent

* bump version
2024-05-06 00:40:27 +00:00
sp.wack 35b248ffd6 update/refactor unit tests (#1598) 2024-05-05 20:27:21 -04:00
Shimada666 3dbd502510 feat: customize the tab list displayed for different agents (#1543) 2024-05-05 20:18:05 -04:00
Christian Balcom 27e13fafb5 Token counting and litellm provider customization (#1421)
* Count tokens to judge more accurate max monologue length, add configurations for max input and output tokens, pulling from litellm when available.

* Fix token counter

* Use None as the default for llm_custom_llm_provider, resolve settings conflict with recent command-r-plus commit.

* Document rationale for default token counts.

* Update opendevin/llm/llm.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update opendevin/llm/llm.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Reverting formatting changes from merge.

* Maybe this will satisfy pydoc-markdown?

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-06 00:43:00 +02:00
Robert Brennan f7e0c6cd06 Separate agent controller and server via EventStream (#1538)
* move towards event stream

* refactor agent state changes

* move agent state logic

* fix callbacks

* break on finish

* closer to working

* change frontend to accomodate new flow

* handle start action

* fix locked stream

* revert message

* logspam

* no async on close

* get rid of agent_task

* fix up closing

* better asyncio handling

* sleep to give back control

* fix key

* logspam

* update frontend agent state actions

* fix pause and cancel

* delint

* fix map

* delint

* wait for agent to finish

* fix unit test

* event stream enums

* fix merge issues

* fix lint

* fix test

* fix test

* add user message action

* add user message action

* fix up user messages

* fix main.py flow

* refactor message waiting

* lint

* fix test

* fix test
2024-05-05 19:20:01 +00:00
Aleksandar 4e84aac577 Add to Local LLM README.md -e SANDBOX_USER_ID=501 (#1592) 2024-05-05 08:16:20 -04:00
Shimada666 224ddf3ac0 feat: support jupyter auto scroll to bottom (#1558)
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-05-05 11:46:34 +03:00
Robert Brennan 7b6b4e3a11 be more dynamic around uid generation (#1584)
* be more dynamic around uid generation

* fix comment

* fix second uid add
2024-05-04 22:26:55 -04:00
Zhou Hang 24c5bcb502 i18n: Fix and add zh-CN 简体中文 (#1577)
* i18n: fix&add zh-CN translation

* i18n: Improve translation
2024-05-04 21:26:30 -04:00
Paulo Ferreira 4292a4aa32 Fix README.md formatting: Removed unnecessary backticks (#1581) 2024-05-04 20:06:15 +00:00
Shimada666 30de484d49 feat: merge identical url from different error messages (#1572) 2024-05-04 15:55:10 -04:00
sp.wack 718730a2b9 (refactor|test)(frontend): chat (#1488)
* initial commit

* update tests and feat-markdown

* rename

* introduce chat

* initial commit

* extend chatinterface, add
new store, and utilize new reducers

* improve styles, code markdown, and adjust styles

* update actions to use new reducers

* scroll down for new messages

* add fade and action banner

* support with jupyter

* refactor to pass tests

* remove unused import

* remove outdated files/folders

* create simple useTyping hook

* remove action banner, extend tests, move files

* extend tests

* disable error
2024-05-04 12:11:47 -07:00
RainRat 4eeaac28eb Update mixin.py (#1579)
fix typo
2024-05-04 20:17:49 +02:00
快乐的老鼠宝宝 8961ceabc6 docs: translation of zh-TW (#1580) 2024-05-05 00:24:28 +08:00
Robert Brennan 730fcd7a39 change default to CodeActAgent and update instructions (#1523)
* change default to CodeActAgent and update instructions

* add quickstart

* change code

* unpin patch version

* Apply suggestions from code review
2024-05-04 08:37:23 -04:00
Robert Brennan 72efa05c71 Update ghcr.yml (#1576) 2024-05-04 08:32:12 -04:00
Robert Brennan 322b23550d run ghcr on tag (#1575) 2024-05-04 12:18:21 +00:00
Robert Brennan 1d92e9a27b Fix user permissions in docker (#1565)
* permissions mostly working

* fix browser
2024-05-04 13:08:43 +08:00
Leo 6013faeec5 Add frontend tests to pre-commit and Makefile. (#1549)
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-05-03 16:15:22 -04:00
Jiayi Pan bccb8297b8 feat: ability to configure temperature and top-p sampling for llm generation (#1556)
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-05-03 19:15:39 +00:00
Binyuan Hui b2a99b509f fix the issue where session id might be duplicated (#1557)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-03 15:47:19 +00:00
Robert Brennan 7d8e10aa18 Remove v prefix on tags (#1555) 2024-05-03 10:36:09 -04:00
Graham Neubig 74e159add6 Remove screenshot from microagent prompt (#1550)
* Remove screenshot from microagent prompt

* Update recursive search

* Update to handle various data types
2024-05-03 09:34:39 -04:00
Anas DORBANI 91c8f1cb3f Update README.md (#1548)
Add documentation badge
2024-05-03 20:09:17 +08:00
Junyang Lin 7d2d2ab242 Update README.md
update slack invite link
2024-05-03 15:57:34 +08:00
RainRat 3cdff79173 fix typos (#1537) 2024-05-03 09:41:32 +03:00
Xingyao Wang 3886c51217 Update CodeAct README.md (#1534)
* Update README.md

* update documentation in docstring
2024-05-03 02:32:53 +00:00
Robert Brennan 2c3fd1322d default to codeact agent (#1535) 2024-05-02 21:55:30 -04:00
Robert Brennan 99e4dd1730 Update social links (#1539)
* Update social links

* fix more slack links

* update discord

* update reach out faq

* fix init in test

* revert test
2024-05-02 23:08:39 +00:00
Alex Bäuerle d9ba45dae8 ci(docs): only generate autogen python docs on deploy (#1501)
* ci(docs): only generate autogen python docs on deploy

* poetry

* workdir
2024-05-02 18:29:41 +00:00
Shimada666 04306b0219 feat: adjust scroll timeout and scroll behavior (#1526)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-02 14:20:16 -04:00
Robert Brennan 7914258945 Fix docker build, integration test (#1533)
* fix docker build

* revert plugins

* fix imports

* fix import

* fix one more import
2024-05-02 13:49:25 -04:00
Robert Brennan fadcdc117e Migrate to new folder structure in preparation for refactor (#1531)
* fix up folder structure

* update docs

* fix imports

* fix imports

* fix imoprt

* fix imports

* fix imports

* fix imports

* fix test import

* fix tests

* fix main import
2024-05-02 17:01:54 +00:00
Robert Brennan ce7c7eaae4 Refactor actions and observations (#1479)
* refactor actions and events

* remove type_key

* remove stream

* move import

* move import

* fix NullObs

* reorder imports

* fix lint

* fix dataclasses

* remove blank fields

* fix nullobs

* fix sidebar labels

* fix test compilation

* switch to asdict

* lint

* fix whitespace

* fix executable

* delint

* fix run

* remove NotImplementeds

* fix path prefix

* remove null files

* add debug

* add more debug info

* fix dataclass on null

* remove debug

* revert sandbox

* fix merge issues

* fix tyeps

* Update opendevin/events/action/browse.py
2024-05-02 15:44:54 +00:00
Xingyao Wang f19333aa6e fix(controller): "Current task state not recognized" (#1517)
* fix agent task already running bug

* attemp to fix "Current task state not recognized"

* Revert "attemp to fix "Current task state not recognized""

This reverts commit f5cbfe1ebb.

* attemp to fix "Current task state not recognized"

* fix invalid state for reset

---------

Co-authored-by: Leo <ifuryst@gmail.com>
2024-05-02 23:34:55 +08:00
Xingyao Wang 1e50d58982 fix(controller): "agent task already running" (#1516)
* fix agent task already running bug

* attemp to fix "Current task state not recognized"

* Revert "attemp to fix "Current task state not recognized""

This reverts commit f5cbfe1ebb.

---------

Co-authored-by: Leo <ifuryst@gmail.com>
2024-05-02 11:33:51 -04:00
Shimada666 24750ba04f fix: update correct troubleshooting doc url (#1510) 2024-05-02 14:09:05 +00:00
sp.wack cc4879a973 refactor tests to pass since changes, remove console logs, use toast error (#1497) 2024-05-02 09:46:20 -04:00
Boxuan Li 11d8253215 Add new CommitWriterAgent to auto-generate commit messages from staged diffs (#1484)
* Add new CommitWriterAgent to auto-generate commit messages from staged diffs

This commit introduces the CommitWriterAgent along with its configuration and detailed task description. The agent is designed to analyze git diffs staged for commit and automatically generate succinct and relevant commit messages.

* Remove devnote section from yaml and add README
2024-05-02 09:42:55 -04:00
Robert Brennan 0bdbd8a90d disable action concurrency (#1503)
* disable action concurrency

* empty commit
2024-05-02 13:30:04 +00:00
Robert Brennan 8dbbbcf1ba dont run lint on main (#1518) 2024-05-02 09:11:11 -04:00
Boxuan Li 30e10e7a4a Micro-agent: Remove planner specific actions (#1515) 2024-05-02 08:01:58 -04:00
Leo 95e4ca490f Feat: add lint frontend and lint all to Makefile. (#1354)
* Feat: add lint frontend and lint all to Makefile.

* style codes.

* Remove redundant target.

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-02 11:53:57 +00:00
Frank Xu 836864fa88 [feat] Integrate BrowserGym (#1452)
* add a single-threaded server serving browsergym

* update poetry

* update browser page content

* add import to make sure browsergym environments are registered properly

* remove flask server, use multiprocess impl and Pipe

* fix

* refactor BrowserEnv

* update browser action and obs to include more complete info

* fix screenshot

* update poetry lock

* add playwright install to workflow

* update

* add better html to text conversion

* update for better text conversion to maintain parity with the current handling of browseurlaction

* update

* update poetry

* update multiprocessing mp

* fix multiprocessing

* update

* update github workflow

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-02 19:52:53 +08:00
Robert Brennan 0d77f495e3 Add 404 troubleshooting section (#1505) 2024-05-01 21:22:20 -07:00
Robert Brennan 24cc8ceb96 add back free storage (#1506) 2024-05-02 02:44:10 +00:00
Jirka Borovec 1b810cfbf0 ci/lint: fix calling Ruff's format (#1457)
* ci/lint: fix calling Ruff's format

* Transition for ruff lint. Only checking the modified files.

---------

Co-authored-by: ifuryst <ifuryst@gmail.com>
2024-05-01 22:19:54 -04:00
Robert Brennan cfef3ee5c4 Fix docker push for non-forks (#1499)
* fix fork check

* minor docker tweaks

* remove caching

* try not chowinng

* fix chowns

* revert build.sh

* fix entrypoint user

* change message

* remove free disk space

* chown the entrypoint

* remove comments

* empty commit
2024-05-02 10:15:12 +08:00
Xingyao Wang 435f47ca0e Improve the both frontend and backend for CodeActAgent (#1494)
* improve the both frontend and backend for CodeActAgent

* fix linter

* update integration test
2024-05-02 02:07:40 +08:00
Tom Mery 545327cc1e upload file to workspace with button (#1394)
* upload file to workspace with button

* resolve filepath

* regenerate lock file

* fix lock file

* fix torch version

* fix lock

* fix poerty.lock

* fix lint

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-01 15:40:35 +00:00
Boxuan Li c7dd443fa2 CoderAgent: Render summary prompt conditionally (#1461)
* CoderAgent: Render repo summary conditionally

* Add unittests

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-01 15:40:20 +00:00
Robert Brennan c50319138e Revert "feat(makefile): add capability to skip Docker image pull (#1463)" (#1489)
This reverts commit 442ab7371c.
2024-05-01 11:00:06 -04:00
Xingyao Wang d80f025e21 Implement Jupyter Frontend (#1363)
* initialize plugin definition

* initialize plugin definition

* simplify mixin

* further improve plugin mixin

* add cache dir for pip

* support clean up cache

* add script for setup jupyter and execution server

* integrate JupyterRequirement to ssh_box

* source bashrc at the end of plugin load

* add execute_cli that accept code via stdin

* make JUPYTER_EXEC_SERVER_PORT configurable via env var

* increase background cmd sleep time

* Update opendevin/sandbox/plugins/mixin.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* add mixin to base class

* make jupyter requirement a dataclass

* source plugins only when >0 requirements

* add `sandbox_plugins` for each agent & have controller take care of it

* update build.sh to make logs available in /opendevin/logs

* switch to use config for lib and cache dir

* Add SANDBOX_WORKSPACE_DIR into config

* Add SANDBOX_WORKSPACE_DIR into config

* fix occurence of /workspace

* fix permission issue with /workspace

* use python to implement execute_cli to avoid stdin escape issue

* add IPythonRunCellAction and get it working

* wait until jupyter is avaialble

* support plugin via copying instead of mounting

* add agent talk action

* support follow-up user language feedback

* add __str__ for action to be printed better

* only print PLAN at the beginning

* wip: update codeact agent

* get rid the initial messate

* update codeact agent to handle null action;
add thought to bash

* dispatch thought for RUN action as well

* fix weird behavior of pxssh where the output would not flush correctly

* make ssh box can handle exit_code properly as well

* add initial version of swe-agent plugin;

* rename swe cursors

* split setup script into two and create two requirements

* print SWE-agent command documentation

* update swe-agent to default to no custom docs

* add initial version of swe-agent plugin;

* rename swe cursors

* split setup script into two and create two requirements

* print SWE-agent command documentation

* update swe-agent to default to no custom docs

* update dockerfile with dependency from swe-agent

* make env setup a separate script for .bashrc source

* add wip prompt

* fix mount_dir for ssh_box

* update prompt

* fix mount_dir for ssh_box

* default to use host network

* default to use host network

* move prompt to a separate file

* fix swe-tool plugins;
add missing _split_string

* remove hostname from sshbox

* update the prompt with edit functionality

* fix swe-tool plugins;
add missing _split_string

* add awaiting into status bar

* fix the bug of additional send event

* remove some print action

* move logic to config.py

* remove debugging comments

* make host network as default

* make WORKSPACE_MOUNT_PATH as abspath

* implement execute_cli via file cp

* Revert "implement execute_cli via file cp"

This reverts commit 06f0155bc1.

* add codeact dependencies to default container

* add IPythonRunCellObservation

* add back cache dir and default to /tmp

* make USE_HOST_NETWORK a bool

* revert use host network to false

* add temporarily fix for IPython RUN action

* preliminary implementation of CodeActAgent's jupyter

* update node module

* update prompt

* revert USE_HOST_NETWORK to true since it is not affecting anything

* attempt to fix lint

* remove newline

* update prompt

* Refactor browser style. (#1358)

* delete useless assets and css class.
* add waiting for page loaded (networkidle with 3s timeout)

* Add integration test framework with mock llm (#1301)

* Add integration test framework with mock llm

* Fix MonologueAgent and PlannerAgent tests

* Remove adhoc logging

* Use existing logs

* Fix SWEAgent and PlannerAgent

* Check-in test log files

* conftest: look up under test name folder only

* Add docstring to conftest

* Finish dev doc

* Avoid non-determinism

* Remove dependency on llm embedding model

* Init embedding model only for MonologueAgent

* Add adhoc fix for sandbox discrepancy

* Test ssh and exec sandboxes

* CI: fix missing sandbox type

* conftest: Remove hack

* Reword comment for TODO

* Revert "refactor(frontend): Terminal (#1315)" (#1360)

This reverts commit 27246aca7e.

* revert USE_HOST_NETWORK to true since it is not affecting anything

* attempt to fix lint

* handle IsADirectory errors (#1365)

* update to 0.4.0 (#1362)

Co-authored-by: Jim Su <jimsu@protonmail.com>

* feat(frontend): multiple design changes (#1370)

* fix/improve terminal hook (#1371)

* Revert "update node module"

This reverts commit 459b1031e7.

* support SyntaxHighlighter and markdown for jupyter visualization

* fix jupyter execution server

* make jupyter active

* improve the display of markdown and raw text

* get base64 image display for react

* add `thought` to most action class

* fix unit tests for current action abstraction

* support user exit

* update test cases with the latest action format (added 'thought')

* fix integration test for CodeActAGent by mocking stdin

* only mock stdin for tests with user_responses.log

* remove -exec integration test for CodeActAgent since it is not supported

* remove specific stop word

* fix comments

* improve clarity of prompt

* attempt to fix lint

* attempt to fix lint yet agiain

* fix py lint

* fix integration tests

* sandbox might failed in chown due to mounting, but it won't be fatal

* update debug instruction for sshbox

* fix typo

* get RUN_AS_DEVIN and network=host working with app sandbox

* get RUN_AS_DEVIN and network=host working with app sandbox

* attempt to fix the workspace base permission

* sandbox might failed in chown due to mounting, but it won't be fatal

* update sshbox instruction

* remove default user id since it will be passed in the instruction

* revert permission fix since it should be resolved by correct SANDBOX_USER_ID

* the permission issue can be fixed by simply provide correct env var

* remove log

* set sandbox user id to getuid by default

* move logging to initializer

* make the uid consistent across host, app container, and sandbox

* remove hostname as it causes sudo issue

* fix permission of entrypoint script

* make the uvicron app run as host user uid for jupyter plugin

* add warning message

* fix frontend lint

* update dev md for instruction of running unit tests

* add back unit tests

* revert back to the original sandbox implementation to fix testcases

* revert use host network

* get docker socket gid and usermod instead of chmod 777

* allow unit test workflow to find docker.sock

* make sandbox test working via patch

* fix arg parser that's broken for some reason

* try to fix app build disk space issue

* fix integration test

* Revert "fix arg parser that's broken for some reason"

This reverts commit 6cc8961133.

* update Development.md

* cleanup intergration tests & add exception for CodeAct+execbox

* fix config

* implement user_message action

* fix doc

* fix event dict error

* fix frontend lint

* revert accidentally changes to integration tests

* revert accidentally changes to integration tests

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Leo <ifuryst@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Alex Bäuerle <alex@a13x.io>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
Co-authored-by: Robert Brennan <contact@rbren.io>
2024-05-01 22:37:01 +08:00
Arno.Edwards 442ab7371c feat(makefile): add capability to skip Docker image pull (#1463)
* feat(makefile): add capability to skip Docker image pull

* ci(github-actions): add conditional Docker installation based on ENV variable
2024-05-01 09:01:11 -04:00
Xingyao Wang 1c7cdbefdd feat(CodeActAgent): Support Agent-User Interaction during Task Execution and the Full Integration of CodeActAgent (#1290)
* initialize plugin definition

* initialize plugin definition

* simplify mixin

* further improve plugin mixin

* add cache dir for pip

* support clean up cache

* add script for setup jupyter and execution server

* integrate JupyterRequirement to ssh_box

* source bashrc at the end of plugin load

* add execute_cli that accept code via stdin

* make JUPYTER_EXEC_SERVER_PORT configurable via env var

* increase background cmd sleep time

* Update opendevin/sandbox/plugins/mixin.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* add mixin to base class

* make jupyter requirement a dataclass

* source plugins only when >0 requirements

* add `sandbox_plugins` for each agent & have controller take care of it

* update build.sh to make logs available in /opendevin/logs

* switch to use config for lib and cache dir

* Add SANDBOX_WORKSPACE_DIR into config

* Add SANDBOX_WORKSPACE_DIR into config

* fix occurence of /workspace

* fix permission issue with /workspace

* use python to implement execute_cli to avoid stdin escape issue

* add IPythonRunCellAction and get it working

* wait until jupyter is avaialble

* support plugin via copying instead of mounting

* add agent talk action

* support follow-up user language feedback

* add __str__ for action to be printed better

* only print PLAN at the beginning

* wip: update codeact agent

* get rid the initial messate

* update codeact agent to handle null action;
add thought to bash

* dispatch thought for RUN action as well

* fix weird behavior of pxssh where the output would not flush correctly

* make ssh box can handle exit_code properly as well

* add initial version of swe-agent plugin;

* rename swe cursors

* split setup script into two and create two requirements

* print SWE-agent command documentation

* update swe-agent to default to no custom docs

* add initial version of swe-agent plugin;

* rename swe cursors

* split setup script into two and create two requirements

* print SWE-agent command documentation

* update swe-agent to default to no custom docs

* update dockerfile with dependency from swe-agent

* make env setup a separate script for .bashrc source

* add wip prompt

* fix mount_dir for ssh_box

* update prompt

* fix mount_dir for ssh_box

* default to use host network

* default to use host network

* move prompt to a separate file

* fix swe-tool plugins;
add missing _split_string

* remove hostname from sshbox

* update the prompt with edit functionality

* fix swe-tool plugins;
add missing _split_string

* add awaiting into status bar

* fix the bug of additional send event

* remove some print action

* move logic to config.py

* remove debugging comments

* make host network as default

* make WORKSPACE_MOUNT_PATH as abspath

* implement execute_cli via file cp

* Revert "implement execute_cli via file cp"

This reverts commit 06f0155bc1.

* add codeact dependencies to default container

* add IPythonRunCellObservation

* add back cache dir and default to /tmp

* make USE_HOST_NETWORK a bool

* revert use host network to false

* add temporarily fix for IPython RUN action

* update prompt

* revert USE_HOST_NETWORK to true since it is not affecting anything

* attempt to fix lint

* remove newline

* fix jupyter execution server

* add `thought` to most action class

* fix unit tests for current action abstraction

* support user exit

* update test cases with the latest action format (added 'thought')

* fix integration test for CodeActAGent by mocking stdin

* only mock stdin for tests with user_responses.log

* remove -exec integration test for CodeActAgent since it is not supported

* remove specific stop word

* fix comments

* improve clarity of prompt

* fix py lint

* fix integration tests

* sandbox might failed in chown due to mounting, but it won't be fatal

* update debug instruction for sshbox

* fix typo

* get RUN_AS_DEVIN and network=host working with app sandbox

* get RUN_AS_DEVIN and network=host working with app sandbox

* attempt to fix the workspace base permission

* sandbox might failed in chown due to mounting, but it won't be fatal

* update sshbox instruction

* remove default user id since it will be passed in the instruction

* revert permission fix since it should be resolved by correct SANDBOX_USER_ID

* the permission issue can be fixed by simply provide correct env var

* remove log

* set sandbox user id to getuid by default

* move logging to initializer

* make the uid consistent across host, app container, and sandbox

* remove hostname as it causes sudo issue

* fix permission of entrypoint script

* make the uvicron app run as host user uid for jupyter plugin

* add warning message

* update dev md for instruction of running unit tests

* add back unit tests

* revert back to the original sandbox implementation to fix testcases

* revert use host network

* get docker socket gid and usermod instead of chmod 777

* allow unit test workflow to find docker.sock

* make sandbox test working via patch

* fix arg parser that's broken for some reason

* try to fix app build disk space issue

* fix integration test

* Revert "fix arg parser that's broken for some reason"

This reverts commit 6cc8961133.

* update Development.md

* cleanup intergration tests & add exception for CodeAct+execbox

* fix config

* implement user_message action

* fix doc

* fix event dict error

* fix frontend lint

* revert accidentally changes to integration tests

* revert accidentally changes to integration tests

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Robert Brennan <contact@rbren.io>
2024-05-01 08:40:00 -04:00
Engel Nyst ea214d1c07 Re-initialize monologue agent after reset (#1478)
* re-initialize monologue agent after reset

* Add to docstring

* Remove now duplicated instances
2024-05-01 08:38:11 -04:00
amodev 62e4fb47b2 Fix API key exposure in toast notifications, resolves #1477 (#1480) 2024-05-01 06:01:48 +00:00
sp.wack 5f2d6f58f2 update tests after setting modals upgrade (#1467) 2024-04-30 10:18:32 -07:00
Robert Brennan 0cda5f64af Add integration test with dummy agent (#1316)
* first pass at dummy

* add assertion to dummy

* add dummy workflow

* beef up tests

* try and fix huggingface issue

* remove newlines

* rename test

* move to pytest

* Revert " move to pytest"

This reverts commit de8121c400.

* fix lint

* delint

* Update .github/workflows/dummy-agent-test.yml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-04-30 16:52:00 +00:00
ChengyangDu eb1c3d8790 fix: configurable sand box user id (#1459) 2024-04-30 16:51:40 +00:00
Xingyao Wang fe43aeb9b6 Revert ssh box implemetation, fix multi-line command issues and add unit tests (#1460)
* update dev md for instruction of running unit tests

* add back unit tests

* revert back to the original sandbox implementation to fix testcases

* allow unit test workflow to find docker.sock

* make sandbox test working via patch

* fix arg parser that's broken for some reason

* fix integration test

* Revert "fix arg parser that's broken for some reason"

This reverts commit 6cc8961133.

* update Development.md
2024-04-30 12:46:35 -04:00
Xingyao Wang ccbbabac1c Get RUN_AS_DEVIN working with app sandbox (#1426)
* get RUN_AS_DEVIN and network=host working with app sandbox

* attempt to fix the workspace base permission

* sandbox might failed in chown due to mounting, but it won't be fatal

* update sshbox instruction

* remove default user id since it will be passed in the instruction

* revert permission fix since it should be resolved by correct SANDBOX_USER_ID

* the permission issue can be fixed by simply provide correct env var

* remove log

* set sandbox user id to getuid by default

* move logging to initializer

* make the uid consistent across host, app container, and sandbox

* remove hostname as it causes sudo issue

* fix permission of entrypoint script

* make the uvicron app run as host user uid for jupyter plugin

* revert use host network

* get docker socket gid and usermod instead of chmod 777

* try to fix app build disk space issue
2024-04-30 12:39:52 -04:00
Alex Bäuerle eb7703a44e feat(frontend): add empty states to all tabs according to Figma (#1456)
* feat(frontend): add empty states to all tabs according to Figma

* import order

---------

Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-04-30 12:34:23 -04:00
Robert Brennan 31c1a2d748 Update bug_template.yml (#1458) 2024-04-30 03:36:39 +02:00
Aleksandar 5989853f7a Fix duplicate LLM_BASE_URL entry in config.toml and enable different ollama embeddings (#1437)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-30 02:26:02 +02:00
Xingyao Wang fe8bb51f17 Update debug instruction for SSH Box (#1425)
* update debug instruction for sshbox

* fix typo

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-29 20:09:56 -04:00
Jirka Borovec 0c2ebfd6e1 Ruff: use I rule for isort (#1410)
Ruff: use I rule for isort
2024-04-29 15:41:58 -07:00
Xia Zhenhua 086a2ed17f feat: make the response of agent_controller better to process when exception in step execution (#1445)
* feat: make the response of agent_controller better to process when exception occurred during executing step.

* Update opendevin/controller/agent_controller.py

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-29 21:21:26 +00:00
Alex Bäuerle fa067edbac feat: set API key from settings modal (#1319)
* feat: set API key from settings modal

* feat: init with api key

* test

* fix

* fixes

* fix api key reference

* test

* minor fixes

* fix settings update

* combine settings call

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>
2024-04-29 23:18:00 +03:00
Robert Brennan 11d48cc2f3 Run docker on forked pull requests (#1450)
* build docker on pull request

* run docker build on PRs

* remove if

* add permissions

* change ghcr login

* empty commit

* always use opendevin as org

* lowercase

* no client token

* dont push on forks

* remove env

* only cache-to if pushing

* fix org name

* fix owner

* Update containers/build.sh

Co-authored-by: Graham Neubig <neubig@gmail.com>

* lowercase

* remove tag prefix

* lowercase

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-04-29 14:24:28 -04:00
Graham Neubig c821d0967c Point docker errors to troubleshooting doc (#1453) 2024-04-29 18:21:55 +00:00
Alex Bäuerle b47fdbb23b docs(readme): remove demo video (#1455)
The demo video is included in our homepage now. If we're linking to a video file local to the repository from the README, it does not get displayed. Instead, there is just a link to it. I don't think it's a good idea to always have to update both the readme video and the website video, so I am removing the readme video.
2024-04-29 11:09:48 -07:00
Alex Bäuerle b3930ad195 ci: fix deploy config (#1454) 2024-04-29 17:28:30 +00:00
Alex Bäuerle cd58194d2a docs(docs): start implementing docs website (#1372)
* docs(docs): start implementing docs website

* update video url

* add autogenerated codebase docs for backend

* precommit

* update links

* fix config and video

* gh actions

* rename

* workdirs

* path

* path

* fix doc1

* redo markdown

* docs

* change main folder name

* simplify readme

* add back architecture

* Fix lint errors

* lint

* update poetry lock

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-29 10:00:51 -07:00
Jirka Borovec 46bd83678a precommit: add pyproject's hooks (#1412)
* precommit: add `pyproject`'s hooks

* apply precommit

* Update dev_config/python/.pre-commit-config.yaml

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-29 14:58:21 +00:00
sp.wack 1597f0093e fix custom llm model value not displaying (#1447) 2024-04-29 10:46:39 -04:00
sp.wack c02b681728 update chat input tests to expect new behaviour (#1443) 2024-04-29 10:37:33 -04:00
Boxuan Li 319b9ac0f3 Fix micro-agents schema bug (#1424)
* Fix micro agents definitions

* Add tests for micro agents

* Add to CI

* Revert "Add to CI"

This reverts commit 94f3b4e7c8.

* Remove test artifacts for ManagerAgent

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-29 10:34:03 -04:00
Jirka Borovec 5a9b6b0c9f precommit: remove unused setup-cfg-fmt (#1411) 2024-04-29 10:33:50 -04:00
Boxuan Li a5a1545c59 Add Codecov test coverage (#1397)
* Add pytest-cov dependency

* Add codecov to unit test CI

* Codecov: Use secrets

* Add codecov for integration tests
2024-04-29 10:33:13 -04:00
Arno.Edwards 03ecec1e76 ci: refine job matrix and enable cache for poetry (#1381)
* ci: refine job matrix and enable cache for poetry

- Replace direct installation of Poetry with pipx to ensure isolated environment setups.
- Enable caching for Poetry dependencies using setup-python action to improve build efficiency.
- Refactor the job matrix specifications.

* ci: enable Homebrew caching in Actions

* ci: optimize Docker and Colima installation in GitHub Actions

- Check if Docker and Colima are already installed before attempting to install them. Link and start each service appropriately to avoid unnecessary reinstallation and ensure they are ready for immediate use in the CI pipeline.

* ci: remove Homebrew cache.

* fix typo

* fix: specified the Python version to avoid errors in actions.

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-29 10:32:25 -04:00
sp.wack 8954855ba3 refactor(frontend): settings and friends (#1400)
* refactor settings and friends

* extend base modal to support disabled property

* extend settings handler to change language via i18next

* remove unused settings.d.ts

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-28 21:59:26 -04:00
Christian Balcom 24b71927c3 fix(backend) changes to improve Command-R+ behavior, plus file i/o error improvements, attempt 2 (#1417)
* Some improvements to prompts, some better exception handling for various file IO errors, added timeout and max return token configurations for the LLM api.

* More monologue prompt improvements

* Dynamically set username provided in prompt.

* Remove absolute paths from llm prompts, fetch working directory from sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt.

* Switched ssh_box to disabling tty echo and removed the logic attempting to delete it from the response afterwards, fixed get_working_directory for ssh_box.

* Update prompts in integration tests to match monologue agent changes.

* Minor tweaks to make merge easier.

* Another minor prompt tweak, better invalid json handling.

* Fix lint error

* More catch-up to fix lint errors introduced by merge.

* Force WORKSPACE_MOUNT_PATH_IN_SANDBOX to match WORKSPACE_MOUNT_PATH in local sandbox mode, combine exception handlers in prompts.py.

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-28 21:58:53 -04:00
Graham Neubig a5f61caae9 Add actions for github push and send PR (#1415)
* Added a push action

* Tests

* Add tests

* Fix capitalization

* Update

* Fix typo

* Fix integration tests

* Added poetry.lock

* Set lock

* Fix action parsing

* Update integration test output

* Updated prompt

* Update integration test

* Add github token to default config

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-29 00:56:23 +00:00
Graham Neubig 0d224de369 Fixed typo in integration tests README.md (#1438)
* Fixed typo in integration tests README.md

* Update tests/integration/README.md

---------

Co-authored-by: ThoughtfulRobot <robot@example.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-28 20:44:58 -04:00
Graham Neubig 567e2c2b35 Add poetry install to makefile (#1436) 2024-04-28 22:40:37 +01:00
Xingyao Wang a7e42ff0a3 allow interaction with local box (#1429) 2024-04-28 23:24:01 +08:00
Robert Brennan d1f62bb6be Add final step for matrix tests, some cleanup (#1408)
* add final step for matrix tests

* change test names

* add names

* change run conditions

* fix test names
2024-04-27 14:34:07 -04:00
Robert Brennan 9c9aee29f0 Revert "fix(backend) changes to improve Command-R+ behavior, plus file i/o er…" (#1405)
This reverts commit 44aea95dde.
2024-04-27 08:57:04 -04:00
sp.wack d8386366d9 feature(frontend): improve chat overflow handling and syntax highlighting (#1402)
* improve overflow handling and syntax highlighting

* add reference to gh page
2024-04-27 08:21:29 -04:00
Engel Nyst ca73ecb499 message in setup-config about model (#1398) 2024-04-27 08:19:31 -04:00
Leo 6f90239521 Fix: enable input but don't be allowed to submit when agent init (#1353)
* Fix: enable input but don't be allowed to submit when agent initializing.

* Prevent Enter from adding a newline when disabled.

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-27 12:17:58 +00:00
Christian Balcom 546be7ca8e Handle Unicode errors (#1388)
* Add errors for non-unicode file data or command return, trim excessively long command returns.

* Fix lint issue.

* Fix lint issue (try 2).

* Realized that prompts were trimmed elsewhere and dropped the new addition.
2024-04-27 08:15:16 -04:00
Robert Brennan ce47461c5a disable host network (#1404) 2024-04-27 08:07:25 -04:00
Christian Balcom 44aea95dde fix(backend) changes to improve Command-R+ behavior, plus file i/o error improvements. (#1347)
* Some improvements to prompts, some better exception handling for various file IO errors, added timeout and max return token configurations for the LLM api.

* More monologue prompt improvements

* Dynamically set username provided in prompt.

* Remove absolute paths from llm prompts, fetch working directory from sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt.

* Switched ssh_box to disabling tty echo and removed the logic attempting to delete it from the response afterwards, fixed get_working_directory for ssh_box.

* Update prompts in integration tests to match monologue agent changes.

* Minor tweaks to make merge easier.

* Another minor prompt tweak, better invalid json handling.

* Fix lint error

* More catch-up to fix lint errors introduced by merge.

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-27 11:58:34 +00:00
Jirka Borovec e32d95cb1a lint: simplify hooks already covered by Ruff (#1204)
* lint: simplify hooks already covered by Ruff

* prune dev dependency

* setting E, W, F

* poetry?

* autopep8

* quote-style = "single"

* double-quote-string-fixer

* --all-files

* apply

* Q

* drop double-quote-string-fixer

* --all-files

* apply pre-commit

* python3.11 -m poetry lock --no-update

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-27 11:32:14 +00:00
Boxuan Li 7d5856e36b local sandbox: Create workspace base during init (#1377) 2024-04-27 11:50:58 +08:00
Robert Brennan fd9e598136 add options for controlling memory (#1364)
* add options for controlling memory

* Update agenthub/monologue_agent/utils/memory.py

Co-authored-by: Jim Su <jimsu@protonmail.com>

* move memory initialization switch back

* fix lint

* fix type

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-26 21:53:54 +00:00
Jim Su c0adb55bfa Fix mock server (#1392) 2024-04-26 17:16:29 -04:00
Alex Bäuerle e9a9186717 build(backend): change reload to ignore workspace correctly (#1393)
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-26 18:19:18 +00:00
Jim Su 93d8ee1400 Render messages in markdown (#1391) 2024-04-26 14:05:17 -04:00
Boxuan Li 831e934dab Refactor: Use enum for config keys (#1376) 2024-04-26 10:26:01 -04:00
Engel Nyst 5a0ef2b5de Doc: update Azure guide with setting model in the UI (#1386)
* Update AzureLLMs.md

Add note about saving the actual model in the UI.

* Update AzureLLMs.md

Add note about saving the actual value in the UI.
2024-04-26 09:17:06 -04:00
sp.wack 0fb3d63406 fix/improve terminal hook (#1371) 2024-04-25 19:13:42 -04:00
Alex Bäuerle 65558df1f7 feat(frontend): multiple design changes (#1370) 2024-04-25 14:24:05 -07:00
Robert Brennan 5515c08624 update to 0.4.0 (#1362)
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-25 19:27:25 +00:00
Robert Brennan f407382a5a handle IsADirectory errors (#1365) 2024-04-25 15:15:06 -04:00
Xingyao Wang 110e7f0c4c make WORKSPACE_MOUNT_PATH as abspath (#1361) 2024-04-25 13:31:05 -04:00
Robert Brennan a1a0767681 Revert "refactor(frontend): Terminal (#1315)" (#1360)
This reverts commit 27246aca7e.
2024-04-25 11:21:01 -04:00
Boxuan Li e7b5ddfe06 Add integration test framework with mock llm (#1301)
* Add integration test framework with mock llm

* Fix MonologueAgent and PlannerAgent tests

* Remove adhoc logging

* Use existing logs

* Fix SWEAgent and PlannerAgent

* Check-in test log files

* conftest: look up under test name folder only

* Add docstring to conftest

* Finish dev doc

* Avoid non-determinism

* Remove dependency on llm embedding model

* Init embedding model only for MonologueAgent

* Add adhoc fix for sandbox discrepancy

* Test ssh and exec sandboxes

* CI: fix missing sandbox type

* conftest: Remove hack

* Reword comment for TODO
2024-04-25 10:56:53 -04:00
Leo bf5a2afd83 Refactor browser style. (#1358)
* delete useless assets and css class.
* add waiting for page loaded (networkidle with 3s timeout)
2024-04-25 10:48:19 -04:00
Xingyao Wang 88397e0469 add back cache dir and default to /tmp (#1359) 2024-04-25 10:46:15 -04:00
Boxuan Li c45e2ae4d9 Fix output strip behavior discrepancy among sandboxes (#1355) 2024-04-25 14:38:01 +08:00
Robert Brennan 5543fe2c3e fix up prompts (#1345) 2024-04-24 18:30:18 -04:00
Robert Brennan bc84482c48 Update ssh_box.py (#1346) 2024-04-24 18:30:05 -04:00
Robert Brennan 6972ea0089 Go back to no host network by default (#1349) 2024-04-24 18:29:30 -04:00
Robert Brennan 43214a69a0 Better version management (#1313)
* better version management

* Update opendevin/config.py
2024-04-24 21:53:49 +00:00
Engel Nyst 2318ceae35 Send JSON parsing exceptions to LLM (#1342)
* Add malformed JSON where we don't even start finding actions

* Send any exception during JSON parsing back

* Use specific exceptions

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-24 17:51:09 -04:00
Engel Nyst f4ce7e7862 Update description (#1337)
* Update base.py

* Update opendevin/action/base.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-24 17:50:51 -04:00
Engel Nyst bd470e8076 Send the permission error to the llm (#1343)
* Send the permission error to the llm

* Update opendevin/action/fileop.py

* Update opendevin/action/fileop.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-24 17:50:32 -04:00
Xingyao Wang fde0392457 Default for sandbox to use the host network (#1334)
* default to use host network

* make host network as default
2024-04-24 21:48:11 +00:00
Robert Brennan 1e95fa435d Microagents and Delegation (#1238)
* basic microagent structure

* start on jinja

* add instructions parser

* add action instructions

* add history instructions

* fix a few issues

* fix a few issues

* fix issues

* fix agent encoding

* fix up anon class

* prompt to fix errors

* less debug info when errors happen

* add another traceback

* add output to finish

* fix math prompt

* fix pg prompt

* fix up json prompt

* fix math prompt

* fix math prompt

* fix repo prompt

* fix up repo explorer

* update lock

* revert changes to agent_controller

* refactor microagent registration a bit

* create delegate action

* delegation working

* add finish action to manager

* fix tests

* rename microagents registry

* rename fn

* logspam

* add metadata to manager agent

* fix message

* move repo_explorer

* add delegator agent

* rename agent_definition

* fix up input-output plumbing

* fix tests

* Update agenthub/micro/math_agent/agent.yaml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/delegator_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/delegator_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* remove prompt.py

* fix lint

* Update agenthub/micro/postgres_agent/agent.yaml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/micro/postgres_agent/agent.yaml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix error

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-24 17:46:14 -04:00
Robert Brennan 8828d9836d use caching for docker builds (#1340)
* try pulling for cached layers

* use cache-from and cache-to

* empty commit

* add readme

* use specific cache tags

* also cache from main
2024-04-24 15:57:46 -04:00
Xia Zhenhua d357fcff36 fix: sweagent prompt message SYSTEM_MESSAGE role set to user bug. (#1329)
Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-04-24 15:51:52 -04:00
Robert Brennan 9241f713b0 pin macos (#1344) 2024-04-24 15:37:32 -04:00
Alex Bäuerle 2472840a56 style(frontend): remove the welcome message in the code editor (#1324)
* style(frontend): remove the welcome message in the code editor

I don't feel like we need this and it can be confusing since there is no file called "welcome". Also, the chat window already has a welcome message.

* lint

* remove empty checks

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-24 16:58:47 +00:00
Season ab3e18667b feat: LocalLLM docs without docker (#1269)
* feat:  start Devin without Docker locally

* chore: make consistent model choices

* chore: more detailed explanation for using litellm server as walkaround

* chore: simply pr
2024-04-24 16:47:20 +00:00
Robert Brennan 236b7bf6ea refactor error handling so not all exceptions are caught (#1296)
* refactor error handling so not all exceptions are caught

* revert

* Send the failed decoding back to the LLM (#1322)

* fix quotes

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-24 16:44:06 +00:00
Alex Bäuerle dc73954cce build: when running in dev mode, reload the poetry server whenever a … (#1323)
* build: when running in dev mode, reload the poetry server whenever a file changes

* only reload for specific directories

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-24 09:34:40 -07:00
Xingyao Wang 9b48f63544 Fix build error (#1339)
* add initial version of swe-agent plugin;

* rename swe cursors

* split setup script into two and create two requirements

* print SWE-agent command documentation

* update swe-agent to default to no custom docs

* update dockerfile with dependency from swe-agent

* make env setup a separate script for .bashrc source

* fix swe-tool plugins;
add missing _split_string

* remove import for temporarily fix (will add back in another pr)
2024-04-24 12:25:18 -04:00
Xingyao Wang e6ecbd5510 fix(sandbox): mount_dir for ssh_box (#1333)
* fix mount_dir for ssh_box

* move logic to config.py
2024-04-24 12:10:14 -04:00
sp.wack 2268529b75 fix test imports (#1327) 2024-04-24 09:09:48 -07:00
sp.wack a0dd5880d7 improve formatting and extend to include i18n scripts (#1328) 2024-04-24 09:09:05 -07:00
Christian Balcom b48f440476 Add sanitization against malformed paths used in fileop actions. Added tests for resolve_path. (#1338) 2024-04-24 15:46:16 +00:00
Xingyao Wang a0e8fcb19a Add SWE-agent tools as sandbox plugins (#1305)
* add initial version of swe-agent plugin;

* rename swe cursors

* split setup script into two and create two requirements

* print SWE-agent command documentation

* update swe-agent to default to no custom docs

* update dockerfile with dependency from swe-agent

* make env setup a separate script for .bashrc source

* fix swe-tool plugins;
add missing _split_string

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-24 22:50:14 +08:00
மனோஜ்குமார் பழனிச்சாமி 99829f95fd Applied npm patch for build errors (#1326)
* npm patch

https://github.com/npm/cli/issues/7231#issuecomment-2062032218

* Delete huge unnecessary tools folder
2024-04-24 09:24:48 -04:00
Xingyao Wang 0b6ab238f9 Update Development.md (#1330) 2024-04-24 17:05:27 +08:00
Xingyao Wang 3eea6de546 fix(sandbox): SSH box .execute misalign issue (#1304)
* fix weird behavior of pxssh where the output would not flush correctly

* make ssh box can handle exit_code properly as well

* Update opendevin/sandbox/docker/ssh_box.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* fix typo

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-24 12:27:44 +08:00
sp.wack 27246aca7e refactor(frontend): Terminal (#1315)
* create new modal for loading previous session

* style and replace modal

* retire old components and group modals into folder

* Utilise i18n for text content and add en translations

* prevent modal from being dismissed via the backdrop

* reference issue that its fixing

* fix incorrect role in tests

* initial commit'

* add output support

* update addon-fit library and mgirate to useXTerm

* add test todos

* move useXTerm to hooks folder

* Fix import path error

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-24 02:34:46 +00:00
Larkin Williams-Capone ba556ea4d8 Add docs for using google llms (#1321) 2024-04-23 22:22:18 +00:00
Alex Bäuerle f40e91b41b refactor: move leftnav and agent controls to new play bar (#1317)
* refactor: move leftnav and agent controls to new play bar

* layout change
2024-04-23 15:17:43 -07:00
Engel Nyst d1551c3097 Add checks to stop infinite loops (#1293)
* Add checks to stop infinite loops

* Send an AgentErrorObservation for the user to see an oops loop

* (NullAction, Obs) problem should be (NullAction, error Obs)

* Merge the two with AgentErrorObs.

* Update opendevin/controller/agent_controller.py
2024-04-23 17:52:32 -04:00
Joen Wallenstrøm (UnLogick) f6dcbeaa1d Add ollama wsl section to LocalLLMs.md (#1318) 2024-04-23 21:40:28 +00:00
Alex Bäuerle 620bbb38cf feat: disable settings while task is running (#1291)
* feat: disable settings while task is running

* test files

* tests

* test
2024-04-23 14:20:49 -07:00
Vasek Mlejnsky a2a86e84f0 Integrate copy_to in E2BBox (#1298)
* initialize plugin definition

* initialize plugin definition

* simplify mixin

* further improve plugin mixin

* add cache dir for pip

* support clean up cache

* add script for setup jupyter and execution server

* integrate JupyterRequirement to ssh_box

* source bashrc at the end of plugin load

* add execute_cli that accept code via stdin

* make JUPYTER_EXEC_SERVER_PORT configurable via env var

* increase background cmd sleep time

* Update opendevin/sandbox/plugins/mixin.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* add mixin to base class

* make jupyter requirement a dataclass

* source plugins only when >0 requirements

* add `sandbox_plugins` for each agent & have controller take care of it

* update build.sh to make logs available in /opendevin/logs

* switch to use config for lib and cache dir

* fix permission issue with /workspace

* use python to implement execute_cli to avoid stdin escape issue

* wait until jupyter is avaialble

* support plugin via copying instead of mounting

* Fix linter issue

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-23 07:55:27 -04:00
Leo 0a2e90dece Disable additional Docker metadata to more compatible with OCI spec. (#1302) 2024-04-23 07:55:00 -04:00
Xia Zhenhua 747ac23cd0 fix: conftest.py comment bug. (#1303)
Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-04-23 07:51:33 -04:00
Engel Nyst 960f17a565 Bug with sshd (#1297)
* Workaround for a bug with sshd

* Update containers/sandbox/Dockerfile

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-04-23 01:33:44 +00:00
Xingyao Wang fc5e075ea0 feat(sandbox): Implementation of Sandbox Plugin to Support Jupyter (#1255)
* initialize plugin definition

* initialize plugin definition

* simplify mixin

* further improve plugin mixin

* add cache dir for pip

* support clean up cache

* add script for setup jupyter and execution server

* integrate JupyterRequirement to ssh_box

* source bashrc at the end of plugin load

* add execute_cli that accept code via stdin

* make JUPYTER_EXEC_SERVER_PORT configurable via env var

* increase background cmd sleep time

* Update opendevin/sandbox/plugins/mixin.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* add mixin to base class

* make jupyter requirement a dataclass

* source plugins only when >0 requirements

* add `sandbox_plugins` for each agent & have controller take care of it

* update build.sh to make logs available in /opendevin/logs

* switch to use config for lib and cache dir

* fix permission issue with /workspace

* use python to implement execute_cli to avoid stdin escape issue

* wait until jupyter is avaialble

* support plugin via copying instead of mounting

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-23 08:45:53 +08:00
Engel Nyst 220dac926e Note on model name (#1295) 2024-04-22 22:53:16 +00:00
Xingyao Wang e84125de6f Symlink python3 to python in Sandbox (#1286)
* symlink py3 to py

* remove service ssh start
2024-04-23 01:05:05 +08:00
Jim Su b0c3bca915 refactor(frontend): Refactor imports to use absolute path (#1288)
* Refactor imports to use absolute path

* Rename absolute import symbol from src to #

* Refactor imports to use absolute path

* Rename absolute import symbol from src to #

* Remove unused LoadMessageModal
2024-04-22 16:29:03 +00:00
Engel Nyst 39a851bd95 Message property sent to llm (#1273)
* Add to_memory for observations

* Don't send 'message' property to the llm from observation; fix other agents
2024-04-22 12:14:45 -04:00
sp.wack f55f7be2e6 update readme to include testing script and documentation (#1272) 2024-04-22 08:54:28 -07:00
sp.wack 0669b27522 refactor(frontend): load previous session modal (#1284)
* create new modal for loading previous session

* style and replace modal

* retire old components and group modals into folder

* Utilise i18n for text content and add en translations

* prevent modal from being dismissed via the backdrop

* reference issue that its fixing

* fix incorrect role in tests

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-22 08:46:32 -07:00
Jim Su 3193014ed8 Correct workspace directory config variable in README (#1282) 2024-04-22 10:32:48 -04:00
மனோஜ்குமார் பழனிச்சாமி cfbd482498 Added playwright commands (#1184)
* post playwright commands

* post playwright commands with deps

* post playwright commands

* post playwright commands with deps

* Put playwright install in the correct place

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-04-22 09:43:08 -04:00
Beichen Ma 2242702cf9 feat add prerequisites validation (#943)
* feat add prerequisites validation

* fix error

* Update Makefile

Co-authored-by: Robert Brennan <accounts@rbren.io>

* change awk with sek

* fix typo

* fix error

* fix linux error

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-22 03:40:56 +00:00
#Einswilli 157278704e style: Refactor get_prompt function for readability and maintainability (#1051)
* AgentHub.planner_agent.prompt: Optimize the get_prompt function by using a dictionary to map ActionType to hints, thus avoiding repetitions and making the code more readable and maintainable.

* Update prompt.py

* Fix Lint issues in prompt.py

* Fix lint issues in prompt.py

* Fix Lint issues in prompt.py

* Fix Lint issues prompt.py

* Remove trailing whitespace

---------

Co-authored-by: Alex Bäuerle <alex@a13x.io>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-22 03:32:31 +00:00
Alex Bäuerle ffa62b9928 style: flip controls and status message for chat (#1278) 2024-04-22 03:21:02 +00:00
Geun, Lim bc23c9429c update ko-KR translation (#1258)
update ko-KR translation
2024-04-22 03:18:01 +00:00
Robert Brennan 8310cf6e23 Update README.md (#1279) 2024-04-21 23:07:49 -04:00
Engel Nyst 464bf7ee23 Tweak connect exceptions (#1120)
* Clean up manual sleep

* Add default retries and document them.

* Add doctrings to llm

* Add exponential backoff for rate limiting errors

* Get embeddings for the action and its own content, not the user message

* Add a few bad exceptions to stop loop

* Stop loop when the step has no action

* Add action with content, no message, to history

* make retry settings customizable

* fix condense to stop the loop for the same reasons as completion

* Add 500-504 exception to retries

* document the retry variables

* Add retries and limits for embeddings. Replaces llama-index hard-coded decorator.

* Rename to retry_min_wait and retry_max_wait
2024-04-22 04:00:01 +02:00
sp.wack 1f2a845feb refactor(frontend): settings modal (#1256)
* allow arrow functions for components

* initial commit - BaseModal

* initial commit - SettingsForm

* extend tests and component

* extend to support language

* refactor tests

* move files and separate component/tests

* extend functionality

* refactor

* major refactor and flip flops

* add tests

* fix styles and names

* add loading state

* remove old SettingModal

* refactor component into smaller ones

* fix model input

* revert eslint rule to allow multiple function definitions for components and remove unused helper function

* add new i18n key for language placeholder

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-21 17:54:29 -07:00
yaroslavMain 454e9613b0 Add pydantic (#868)
* Update action.py

* Update observation.py

* Update observation.py

* Update action.py

* Update observation.py

* Update opendevin/schema/action.py

* Update action.py

* Update observation.py

* Update action.py

* Update observation.py

* Update observation.py

* Update observation.py

* Update observation.py

* Update action.py

* Update action.py

* Update action.py

* Update observation.py

* Update observation.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-21 19:41:04 +00:00
Andrii Stadnik 900de8728c Update LocalLLMs.md (#1267) 2024-04-21 19:23:19 +00:00
Boxuan Li 7bd5417b95 DogFood: instruct agent to read from diff file (#1259) 2024-04-21 15:21:09 -04:00
Yoni 5a4913224a Use /bin/bash as the shell for Makefile (#1264) 2024-04-21 15:17:35 -04:00
Xingyao Wang dffbeec45a Add SANDBOX_WORKSPACE_DIR into config (#1266)
* Add SANDBOX_WORKSPACE_DIR into config

* Add SANDBOX_WORKSPACE_DIR into config

* fix occurence of /workspace
2024-04-21 15:17:10 -04:00
Leo 0e572c3e41 feat: support tls. #1234 (#1248)
* feat: support tls.

* update the frontend README.

* Update frontend/README.md

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-04-21 14:50:48 -04:00
Arno.Edwards d692a72bf3 fix: correct port number in LocalLLMs.md (#1252)
* fix: correct port number in `LocalLLMs.md`

- improve details and linting

* Update docs/guides/LocalLLMs.md

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-20 14:34:32 +00:00
Coenraad Loubser b5da25e9cc Update README.md:--add-host host.docker.internal=host-gateway fixes #1156 (#1250) 2024-04-20 09:49:59 -04:00
Boxuan Li 1210eea48d DogFood: Use OpenDevin to review PR (#1223)
* DogFood: Use OpenDevin to review PR

* Use diff rather than patch

* Fix prompt

* Return if label not present

* Don't write review to environment variable

* Fix label check

* Use better name for labels
2024-04-20 09:25:36 -04:00
Boxuan Li cf3372a5fe Fix arg parser help message display (#1247)
* Fix arg parser help message display

* Add unittest
2024-04-20 09:23:19 -04:00
Alex Bäuerle 49ff317b50 feat: add change indicator to workspace tabs (#1236) 2024-04-20 09:23:07 -04:00
Alex Bäuerle 6340cd9aed style: remove warning about useEffect (#1232)
This warning was appended to every PR. While the warningn typically makes sense, in this case we intentionally don't want to rerun if `desiredState` changes.
2024-04-20 09:20:03 -04:00
Robert Brennan 3c85ece23e Add troubleshooting document and issue template (#1243)
* add troubleshooting doc

* add link to troubleshooting

* mark required

* unmark required

* more required stuff

* add nonetype issue

* Update .github/ISSUE_TEMPLATE/bug_template.yml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update .github/ISSUE_TEMPLATE/bug_template.yml

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-04-20 13:05:45 +00:00
மனோஜ்குமார் பழனிச்சாமி 0356f6ec89 Azure LLM fix (#1227)
* azure embedding fix

* corrected embedding config

* fixed doc
2024-04-20 01:05:14 +02:00
Robert Brennan 3988443705 Fix local LLM guide (#1240)
* Fix local LLM guide

* Update docs/guides/LocalLLMs.md
2024-04-19 16:22:41 -04:00
Robert Brennan 7675b3ca34 fix manager ref (#1241) 2024-04-19 16:18:30 -04:00
Robert Brennan d61fdb8bba less debug info when errors happen (#1233)
* less debug info when errors happen

* add another traceback

* better prompt debugs

* remove unuse arg
2024-04-19 15:15:38 -04:00
Alex Bäuerle 0a59d46ca3 feat: align status bar with control bar (#1235) 2024-04-19 15:15:07 -04:00
Vasek Mlejnsky 76b81ca0ed Integrate E2B sandbox as an alternative to a Docker container (#727)
* add e2b sandbox [wip]

* Install e2b package

* Add basic E2B sandbox integration

* Update dependencies and fix command execution in E2BSandbox

* Udpate e2b

* Add comment

* Lint

* Remove unnecessary type conversion

* Lint

* Fix linting

* Resolve comments

* Update opendevin/action/fileop.py

* Update opendevin/action/fileop.py

* Fix log

* Update E2B readme

* poetry lock

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-19 14:21:58 -04:00
Alex Bäuerle e6d91affc6 refactor: remove the previously implemented shallow file fetch (#1231)
Removing because the new filetree implementation is fast enough as is and fetching partial trees adds a lot of complexity that we probably don't need. For example, if we still want to automatically open the tree location of a changed file, we need to make sure all the parent folders are fetched first.
2024-04-19 17:25:40 +00:00
sp.wack e82543747c fix(frontend): rebuild the file explorer; goodbye lag (#1221)
* initial commit

* separate tests and add new prop

* initial commit

* initial commit - file explorer

* update test and code

* adjustments and replacements

* update folder name

* fix import

* refactor for better readability and extending

* replace TreeNode type with existing WorkspaceFile

* refactor to have workspace support only one root node

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-19 17:09:21 +00:00
Alex Bäuerle 959d91c9d6 feat: implement basic planner UI (#1173)
* feat: implement basic planner UI

* update planner UI

* lint

* fix type

* fixes

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-19 13:09:14 -04:00
Xingyao Wang 871eefe801 Revert "feat(sandbox): Add Jupyter Kernel for Interactive Python Interpreter for Sandbox (#1215)" (#1229)
This reverts commit 492feecb67.
2024-04-19 16:49:24 +00:00
Leo 34286fabcc fix ghcr workflow throw error when in the fork repo. (#1203)
* fix ghcr workflow throw error when in the fork repo.

* Split DOCKER_REPOSITORY for cleaner.
2024-04-19 11:06:25 -04:00
sp.wack b9bed7da8d test(frontend): create new test and update existing ones (#1190)
* create new test and update existing

* Prevent empty/undefined values from being set
2024-04-19 11:05:14 -04:00
Robert Brennan 7c3ea9fa08 Fix up documentation for specific LLMs (#1217)
* Update LOCAL_LLM_GUIDE.md

* Create WINDOWS_WSL_GUIDE.md

* update docs

* Update docs/Agents.md

* update windows docs

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-19 10:47:54 -04:00
Xingyao Wang 492feecb67 feat(sandbox): Add Jupyter Kernel for Interactive Python Interpreter for Sandbox (#1215)
* add initial version of py interpreter

* fix bug

* fix async issue

* remove debugging print statement

* initialize kernel & update printing

* fix port mapping

* uncomment debug lines

* fix poetry lock

* make jupyter py interpreter into a subclass
2024-04-19 16:55:51 +08:00
sp.wack ca1d53c161 fix(frontend): improve the <Input /> component (#1195)
* create a better chat input component

* remove the useInputComposition hook
2024-04-18 16:19:57 -07:00
Engel Nyst fe3d4b129d Path in observation (#1216)
* create observation with the workspace path

* rename the whole path whole_path
2024-04-18 15:06:51 -04:00
RaGe 2bf34093b0 Add new endpoint for shallow file listing - backend only (#1174)
* Add /api/list-files endpoint

Returns files at the path specified by query parameter relpath, relative to base path, with a depth of 1

* Add method in services
2024-04-18 11:22:39 -07:00
Robert Brennan 9e85550b46 fix initialize (#1214) 2024-04-18 13:19:33 -04:00
Robert Brennan 2826e39056 fix file writes (#1213)
* fix file writes

* Update opendevin/action/fileop.py
2024-04-18 12:47:52 -04:00
Leo 226fb43fbe fix wrong logo path. (#1211) 2024-04-18 11:53:11 -04:00
Robert Brennan 9d5bc6b332 Update README.md (#1212) 2024-04-18 11:49:17 -04:00
Robert Brennan af262c17f1 properly exit loop on finish (#1210) 2024-04-18 11:03:29 -04:00
Robert Brennan 6b0408d47c use threading to insert docs into the db (#1191) 2024-04-18 11:31:39 +00:00
Robert Brennan 74c9b58d1b handle not found errors inside fileop (#1188) 2024-04-18 07:16:48 -04:00
Leo 1356da8795 feat: support controlling agent task state. (#1094)
* feat: support controlling agent task state.

* feat: add agent task state to agent status bar.

* feat: add agent task control bar to FE.

* Remove stop agent task action.

* Merge pause and resume buttons into one button; Add loading and disabled status for action buttons.

* Apply suggestions from code review

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-18 11:09:00 +00:00
Anthony Elkommos Youssef f7fd925f30 i18n : Add Arabic (#997)
* i18n : Add Arabic

* add ar translation

* chatSlice Cleanup

* simplified 2 conditions

* RTL Support & French:
* Added French as a language
* Added RTL support for Arabic

* Merge

* Merge

* Merge

* Merge

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-18 11:01:01 +00:00
Edwards.Arno 94831083c3 feat: improve credential validation and simplify response in get_token (#1155)
* feat: improve credential validation and simplify response in get_token

* Apply suggestions from code review

* Update opendevin/server/listen.py

* fix typo

* Apply suggestions from code review

* Update listen.py to remove unused import.

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-18 07:00:24 -04:00
从零开始学AI 396b6a62a9 [doc] add README-zh.md (#1116) 2024-04-18 06:59:16 -04:00
Engel Nyst caabfab7e2 Property 'message' sent to llm (#1198)
* Add action with content, no message, to history

* fix to_memory(), add it to serialization tests

* Actions without 'message' in completion too
2024-04-18 06:51:07 -04:00
Alex Bäuerle 51149780ac fix: fix path traversal vulnerability (#1199) 2024-04-18 06:49:37 -04:00
sp.wack 426f387123 setup env for controlled integration tests with redux (#1180) 2024-04-17 10:55:17 -07:00
Robert Brennan 9fd7068204 Fix for setting LLM model, frontend settings refactor (#1169)
* simplify frontend settings management

* add debug info to llm.py

* always reinitialize agent

* remove old config stuff

* delint

* fix first initialize event

* refactor settings management

* remove logs

* change endpoint to remove litellm reference

* actually fix socket issues

* refactor a bit

* delint

* remove isFirstRun

* delint

* delint python

* fix export

* fix up socket handshake

* fix types

* fix lint errors

* delint

* fix test names

* moar lint

* fix build errors

* remove newline

* Update frontend/src/services/settingsService.test.ts

* Update frontend/src/services/settingsService.test.ts
2024-04-17 17:49:38 +00:00
மனோஜ்குமார் பழனிச்சாமி 18348911c2 Use python3.11 as default (#885)
* use python3.11

* use python3.11

* removed 3.12

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-17 17:28:16 +00:00
Justus 8472436fe8 Syntax highlighting changes based on what file you are editing (#1176)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-17 17:10:45 +00:00
Jack Quimby 16fc728696 feat: SWE Agent Implementation (#846)
* Merge branch 'main' of https://github.com/JayQuimby/OpenDevin

* Using commands.sh for ACI

* parsing, prompting, and actions modifications

* added start and end index to read and write

* bug fixes and test updates

* Lint code changes to ensure code is proper

* State management, bugs, prompts

* Prompt Engineering

* exception handling

* big fixes

* more bug fixes

* merge conflicts

* Renamed SWEAgent, basic tests, bug fixes

* prompt changes, bug fixes

* merge conflicts

* merge conflict

* start and end line for read and write

* more merge conflicts

* env error

* linter error

* read fixed, prompt change, example added

* added x_line:end read operation

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-17 12:19:00 -04:00
Yoni c8c1eed5be Reset agent between tasks (#1182)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-17 12:13:54 -04:00
Xia Zhenhua 60a92a07c1 modify all api response to the same format (JSONResponse with status_code and content set) (#1175)
* feat: modify all api response to JSONResponse with status_code and content.

* refactor: make status-code 200 response more concise.

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-04-17 12:13:32 -04:00
மனோஜ்குமார் பழனிச்சாமி a0fee17388 less verbose (#1183) 2024-04-17 11:41:18 -04:00
Robert Brennan b654b00aa1 Update README.md (#1181) 2024-04-17 09:11:45 -04:00
Boxuan Li aed82704a9 Fix python linter inconsistent behaviour with quotes (#1112)
* This has been a headache for a long time, and we had #1071 and #1100 with the hope to fix the inconsistent behaviour across linters and environments. However, we recently found out that double-quote-string-fixer plugin in pre-commit-hook has inconsistent behaviour on python 3.11 and 3.12. See discussion here. This is sad because while this plugin enforces single quote behaviour with 3.11, it doesn't always enforce so with 3.12. Specifically, with fstr syntax, this plugin allows both single quotes and double quotes with python 3.12.

The problem is, some developers have black linter installed/integrated with their IDE, which is probably the most popular linter in python world (ranked by GitHub stars). This linter insists on always using double quotes. Now we have black and double-quote-string-fixer fight each other (iff the developer uses python 3.12) for some quotes (fstr syntax).

After a lot of research, I couldn't find a way to enforce single quote behaviour without introducing a new dependency, flake8, together with a plugin for it to enforce quotes' behavior. I believe it's better off introducing the more popular black if we have to introduce a new linter. Since black and autopep8 sometimes fight each other, and they mostly overlap, I further remove autopep8.

The unfortunate consequence of this PR is that I had to revert all single quotes back to double quotes. This might cause some inconvenience to existing PRs as they have to resolve conflicts, but I believe the headache will be gone soon. That being said, I am open to abandon this PR if anyone has a better idea to solve the headache.

* Remove black

* Prevent black from changing quotes

* Use flake8 to enforce single quotes

* Fix quotes in config.py

* Add back autopep8

* Add make lint to run linters
2024-04-17 00:12:52 -04:00
sp.wack edeea95e7d replace Jest with Vitest as the frontend test runner (#1163) 2024-04-16 21:12:11 -07:00
Robert Brennan a663302ba2 Refactor actions a bit (#1165)
* refactor to action_manager

* make all actions awaitable

* move task logic into tasks

* Update opendevin/action/tasks.py
2024-04-16 21:29:38 -04:00
Engel Nyst 1338248310 Document LLM prompt and response logging. (#1166) 2024-04-16 17:29:50 -04:00
Robert Brennan d34a9db04f Run docker publish on tags (#1162)
* run ghcr on tags

* Update .github/workflows/ghcr.yml

Co-authored-by: RaGe <foragerr@users.noreply.github.com>

---------

Co-authored-by: RaGe <foragerr@users.noreply.github.com>
2024-04-16 13:27:31 -04:00
Alex Bäuerle 7710112ae2 fix: fix delete messages endpoint url (#1140)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-16 17:08:18 +00:00
Alex Bäuerle 6b007c163b fix: make typing chat bubble adhere to max width (#1142) 2024-04-16 10:02:04 -07:00
Engel Nyst 1115b60a74 Logging additions and fixes (#1139)
* Refactor print_to_color into a color formatter

misc fixes

catch ValueErrors and others from Router initialization

add default methods

* Tweak console log formatting, clean up after rebasing exceptions out

* Fix prompts/responses

* clean up

* keep regular colors when no msg_type

* fix filename

* handle file log first

* happy mypy

* ok, mypy

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-16 12:55:22 -04:00
sp.wack 5881857d5c test(frontend): add unit tests (#1076)
* test(frontend): add unit tests for getCachedConfig

* test(frontend): add unit tests for getCachedConfig

* add unit test for the useTypingEffect hook

* add unit test for the useInputComposition hook

* create unit test for auth service

* remove outdated and failing component test

* create unit test for session service

* break down saveSettings into smaller functions for testability and create unit test for new mergeAndUpdateSettings

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-16 12:49:56 -04:00
Alex Bäuerle 4b4bc15fdb fix: fix multiple frontend warnings (#1143) 2024-04-16 10:06:31 -04:00
RaGe 45c6f639f4 (fix) OpenDevin works on OpenDevin issues (#1149)
* Use SANDBOX_TYPE=exec, use docker image

* Use updated image
2024-04-16 09:24:31 -04:00
Boxuan Li 90ca29414c Remove custom signal handlers (#1153) 2024-04-16 09:20:13 -04:00
Xia Zhenhua 8e4c4c9946 fix #1028, /select-file api call on deleted file in Code Editor caused Error (#1158)
* fix: /select-file on deleted file exception, detail: https://github.com/OpenDevin/OpenDevin/issues/1028

* fix: lint.

* fix: lint.

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-04-16 09:19:10 -04:00
மனோஜ்குமார் பழனிச்சாமி 88e31f91d8 corrected port (#1159) 2024-04-16 09:18:12 -04:00
Akki 7e825b571f fix(): build out opendevin modal component (#1141) 2024-04-16 08:43:04 -04:00
Robert Brennan 516c9bf1e0 Revamp docker build process (#1121)
* refactor docker building

* change to buildx

* disable branch filter

* disable tags

* matrix for building

* fix branch filter

* rename workflow

* sanitize ref name

* fix sanitization

* fix source command

* fix source command

* add push arg

* enable for all branches

* logs

* empty commit

* try freeing disk space

* try disk clean again

* try alpine

* Update ghcr.yml

* Update ghcr.yml

* move checkout

* ignore .git

* add disk space debug

* add df h to build script

* remove pull

* try another failure bypass

* remove maximize build space step

* remove df -h debug

* add no-root

* multi-stage python build

* add ssh

* update readme

* remove references to config.toml
2024-04-15 19:10:38 -04:00
Alex Bäuerle 71edee17be build: fix workspace variable name in dev setup (#1138) 2024-04-15 16:33:49 -04:00
RaGe de672029ef Add build-frontend to build (#1137) 2024-04-15 13:00:25 -07:00
808vita f0559892ab lint-frontend Files.tsx (#1089)
for lint-frontend exhaustive deps

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-15 11:52:30 -07:00
Edwards.Arno 2491a3524e feat: add question.md template for community Q&A (#1115)
* feat: add `question.md` template for community Q&A

* Update .github/ISSUE_TEMPLATE/question.md

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-15 14:14:37 +00:00
மனோஜ்குமார் பழனிச்சாமி 0616fe3f8d Added Retry for LLM calls (#1092)
* added retry

* filtered API errors

* fixed decorator

* used litellm retries

* added custom backoff too

* Apply suggestions from code review

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* added custom backoff too

* retried only if certain Exceptions

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-15 14:47:38 +02:00
Robert Brennan 474aafbc79 Fix frontend build (#1118) 2024-04-15 14:36:18 +02:00
Robert Brennan 342302ceef Add Docker DOOD setup (#1023)
* simplified get

* resolved merge conflicts

* removed default param for get

* add dood setup

* add readme

* better build process

* multi-stage build

* revert makefile

* rm entrypoint.sh

* adjust ssh box for docker

* update readme

* update readme

* fix hostname

* change workspace setting

* add workspace_mount_base

* fixes for workspace dir

* clean up frontend

* refactor dockerfile

* try download.py

* change docker order a bit

* remove workspace_dir from frontend settings

* fix merge issues

* Update opendevin/config.py

* remove relpath logic from server

* rename workspace_mount_base to workspace_base

* remove workspace dir plumbing for now

* delint

* delint

* move workspace base dir

* remove refs to workspace_dir

* factor out constant

* fix local directory usage

* dont require dir

* fix docs

* fix arg parsing for task

* implement WORKSPACE_MOUNT_PATH

* fix workspace dir

* fix ports

* fix merge issues

* add makefile

* revert settingsService

* fix string

* Add address

* Update Dockerfile

* Update local_box.py

* fix lint

* move to port 3000

---------

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: enyst <engel.nyst@gmail.com>
2024-04-15 14:19:02 +02:00
Xia Zhenhua 8450b47609 fix: action.run executed twice if action is not awaitable (#1021) 2024-04-15 12:12:07 +00:00
Robert Brennan 34ecfe3c75 Update dogfood.yml (#1107) 2024-04-15 10:02:31 +02:00
மனோஜ்குமார் பழனிச்சாமி f0cd5a37e8 Update bug_report.md (#1111) 2024-04-15 10:02:01 +02:00
Tunay Engin 98d019b825 Adding: Turkish language (#1102)
* Adding: Turkish language

* Add space

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-15 03:07:59 +00:00
Sujay Kapadnis d2f7056c06 refactor: Added docstrings to BackgroundCommand(#1083) (#1093)
* refactor: Added docstrings to BackgroundCommand(#1083)

* refactor: Added Docstring with example for BackgroundCommand

* fixed typo

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-04-14 22:13:33 +00:00
Boxuan Li 652507f430 Fix pre-commit and linter versions to avoid surprise (#1100)
* Fix pre-commit and linter versions to avoid surprise

To avoid surprising results on GitHub Actions, e.g. a new release of pre-commit starts to
reject all PRs, fix it to the latest version, 3.7.0. This PR also fixes ruff and mypy
versions in pyproject.toml since we very likely don't really need latest upgrades from
linters, and upgrades can always bring surprise.

* pre-commit-config: Use v0.3.7 for Ruff as in pyproject.toml
2024-04-14 21:33:58 +02:00
மனோஜ்குமார் பழனிச்சாமி 033352e340 added to sudo group (#1091) 2024-04-14 15:32:36 +02:00
Z 6b9316f722 doc: Add supplementary notes for WSL2 users to Local LLM Guide (#1031)
* Add supplementary notes for WSL2 users

* Add supplementary notes for WSL2 users

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-14 13:22:58 +00:00
Boxuan Li 53f95056de Revamp Exception handling (#1080)
* Revamp exception handling

* Agent controller: sleep 3 seconds if APIConnection error

* Fix AuthenticationError capture

* Revert unrelated style fixes

* Add type enforcement for action_from_dict call
2024-04-14 06:51:17 +02:00
Boxuan Li dd32fa6f4a Unify linter behaviour across CI and pre-commit-hook (#1071)
* CI: Add autopep8 linter

Currently, we have autopep8 as part of pre-commit-hook. To ensure
consistent behaviour, we should have it in CI as well.

Moreover, pre-commit-hook contains a double-quote-string-fixer hook
which changes all double quotes to single quotes, but I do observe
some PRs with massive changes that do the opposite way. I suspect
that these authors 1) disable or circumvent the pre-commit-hook,
and 2) have other linters such as black in their IDE, which
automatically change all single quotes to double quotes. This
has caused a lot of unnecessary diff, made review really hard,
and led to a lot of conflicts.

* Use -diff for autopep8

* autopep8: Freeze version in CI

* Ultimate fix

* Remove pep8 long line disable workaround

* Fix lint.yml

* Fix all files under opendevin and agenthub
2024-04-14 00:19:56 -04:00
Jim Su d4ce4ea541 Throw error if an illegal sandbox type is used (#1087) 2024-04-13 23:59:31 -04:00
Robert Brennan 066680f8ef Auto-close stale issues and PRs (#1032)
* stale issues

* Update .github/workflows/stale.yml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update .github/workflows/stale.yml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update .github/workflows/stale.yml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update .github/workflows/stale.yml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-04-14 02:39:46 +00:00
RaGe d27895d018 Add new sandbox type - local (#1029) 2024-04-13 20:47:11 -04:00
Akki 8b9f13b1ed fix(editor): ui enhancements and code refactor (#1069) 2024-04-13 08:52:26 -07:00
Leo 2326312d3a fix: print the wrong ssh port number (#1054) 2024-04-12 22:18:34 -04:00
Boxuan Li e0c7492609 Traffic Control: Add new config MAX_CHARS (#1015)
* Add new config MAX_CHARS

* Fix mypy linting issues
2024-04-12 19:01:52 +00:00
namtacs 5d5106c510 Response recognition for weak llms (#523)
* Tweak for weak llms

* Update to the latest commits

* Update to the latest commits

* Fix lint errors

* Remove merge artifact

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-12 09:20:47 -04:00
மனோஜ்குமார் பழனிச்சாமி 70534f203e simplified get (#962)
* simplified get

* resolved merge conflicts

* removed default param for get

* Update opendevin/config.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-12 09:18:41 -04:00
Leo 494a1b6872 Feat add agent manager (#904)
* feat: add agent manager to manage all agents;

* extract the host ssh port to prevent conflict.

* clean all containers with prefix is sandbox-

* merge from upstream/main

* merge from upstream/main

* Update frontend/src/state/settingsSlice.ts

* Update opendevin/sandbox/ssh_box.py

* Update opendevin/sandbox/exec_box.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-12 07:53:47 -04:00
Engel Nyst ded0a762aa Formatting AZURE_LLM_GUIDE (#1046) 2024-04-12 11:28:42 +02:00
Engel Nyst 32ba0e7f7e Add Azure configuration doc (#1035)
* Add Azure configuration doc

* Add link to Azure doc.
2024-04-12 04:59:51 -04:00
PierrunoYT 7b526b3620 Add Italian, Spanish and Português (#1017)
* Update index.ts

Add Italian, Spanish and Português

* Update translation.json

Add Italian. Spanish and Português

* Remove unnecessary i18n initialization arguments

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-12 04:57:12 -04:00
Alex Bäuerle 224ee7d1f8 fix: fix some of the styling to more closely match figma (#927)
* fix: fix some of the styling to more closely match figma

* overflow
2024-04-12 04:45:03 -04:00
Redrum624 9fd95cc35f Fix: local files in the browser (at least for Windows) (#839)
* Fix: local files in the browser (at least for Windows)

The Browser works for screenshoting websites, but when the LLM tries to display a local file that he created, it can't access the full directory on Windows.

* Update browse.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-11 22:12:23 -04:00
hugehope 9cd4ad3298 chore: fix some typos in comments (#1013)
Signed-off-by: hugehope <cmm7@sina.cn>
2024-04-11 23:21:46 +02:00
808vita fb54c36c90 lint-frontend settingsSlice.ts (#1022)
* lint-frontend settingsSlice.ts

for  lint-frontend
Unexpected console statement

* Update settingsSlice.ts
2024-04-11 19:25:04 +02:00
PierrunoYT 51b3ae56c7 Add German Translation (#851)
* Update index.ts

* Update translation.json

* Update index.ts

* Remove comment and ident JSON

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-10 22:59:46 -04:00
Engel Nyst b8202e804f Make sure Azure info is read from toml (#1005) 2024-04-10 19:28:51 -04:00
Robert Brennan 7f5d9c7d92 Update agent_controller.py (#1004) 2024-04-10 19:28:30 -04:00
Alex Bäuerle cd723abfdd fix: make chat colors match figma (#957) 2024-04-10 18:24:25 -04:00
Robert Brennan cc584445c6 implement exec box (#983)
* implement exec box

* fix autopep8 issue

* fix mypy issue

* fix some bugs

* set default to ssh

* rename sandbox impls

* empty commit

* fix imports

---------

Co-authored-by: Anas Dorbani <anasdorbani@gmail.com>
2024-04-10 17:39:33 -04:00
Robert Brennan 9846e24299 Fix logger import (#985)
* fix logger import
* fix mypy version
* make mypy happy (#994)

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-10 21:48:40 +02:00
Engel Nyst 973a42fd78 More json cleaning (#924)
* More json cleaning

* remove redundant check
2024-04-10 20:03:20 +02:00
Leo b2de79ae08 Feat add toast (#951)
* feat: add agent status bar.

* fix: ws will reconnect after failed in first time.

* update: delete useless codes.

* feat: add toast
2024-04-10 12:10:58 -04:00
Ikko Eltociear Ashimine 0b21d77880 Update architecture/README.md (#970)
generat -> generate
2024-04-10 12:10:25 -04:00
libowen2121 e256329e5e Update SWE-bench eval results (#978) 2024-04-10 21:09:49 +08:00
Engel Nyst 52b63908ca FE config (#960)
* Send FE its own version of config

* Don't read some settings back

* Update opendevin/config.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-10 15:08:09 +02:00
Xingyao Wang e8ff184912 Update README.md 2024-04-10 14:57:52 +08:00
Alex Bäuerle 707ab7b3f8 feat: make panes resizable (#925) 2024-04-09 20:42:16 -04:00
Robert Brennan 78858c18b5 Update README.md (#956) 2024-04-09 15:37:33 -05:00
dred0n 92562a5152 remove json_dump and adjust handlers (#881)
Co-authored-by: senseable <dondre+witcebs@gmail.com>
2024-04-09 14:41:07 -04:00
Xiaochen Duan c1aab59c2e i18n : Add zh-TW 繁體中文 (#947)
* add zh-TW translation

* add zh-TW translation
2024-04-09 13:22:24 -05:00
Leo eb7b3484fd feat: add agent status bar. (#941)
* feat: add agent status bar.

* fix: ws will reconnect after failed in first time.

* update: delete useless codes.
2024-04-09 15:26:06 +00:00
Engel Nyst 8ab9c6fb86 Revert the use of Router, good ole completion works. (#910)
* Revert the use of Router, good ole completion works.

* Stopgap exception message

* Get the updated dependencies.
2024-04-08 22:21:30 -04:00
Alex Bäuerle 3c4b3eddc0 fix: text input can be edited even if not yet initialized (#928)
Clicking the send button is still only possible once initialized.
2024-04-09 00:01:49 +02:00
ITNerdAZ 01c4c4bee4 Add installation warning to readme (#883)
* Update README

* Update README.md

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-08 09:56:02 -05:00
Xingyao Wang 3c82ba7f34 support running sudo in a passwordless manner (#906) 2024-04-08 22:54:06 +08:00
Exlo 73fb4843a3 i18n : Add Norwegian (#876)
* Update index.ts

* Update translation.json

* Update index.ts
2024-04-08 10:36:04 -04:00
Akki 6f795f5e9c style(code editor): improved UI / UX for code editor (#826)
* style(): improved code edito ui / ux

* fix(): fix build issue and use cn fn

* theme variable updated to tailwind neutral gray

* fix(): fix conflicts

* fix lint errors

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-08 10:32:45 -04:00
Xingyao Wang 6b7c5b09af fix(sandbox): Fix docker root login and set default to run-as-devin (#891)
* support execute as-root for sandbox

* set run as devin by default to true

* get network_mode = host back

* Update opendevin/sandbox/sandbox.py

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>

* print login info for debugging

* use ssh -v instead of ssh

* change port map to 2222 to circumvent the MacOS issue

* add warning message for port forwarding

---------

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>
2024-04-08 21:34:07 +08:00
Yufan Song 6e566dd21d fix: change default RUN_AS_DEVIN to true and change some network setting (#895)
* fix use devin

* fix port problem
2024-04-08 02:23:25 -07:00
Robert Brennan cc6626ff0d fix up json parsing (#875) 2024-04-08 14:39:36 +08:00
Xingyao Wang fab2259d3a fix(ghcr image) bump build tag minor version to fix docker build (#884) 2024-04-08 13:32:34 +08:00
Xingyao Wang 55760ec4dd feat(sandbox): Support sshd-based stateful docker session (#847)
* support sshd-based stateful docker session

* use .getLogger to avoid same logging message to get printed twice

* update poetry lock for dependency

* fix ruff

* bump docker image version with sshd

* set-up random user password and only allow localhost connection for sandbox

* fix poetry

* move apt install up
2024-04-08 12:59:18 +08:00
RaGe 6e3b554317 Create a CommandExecutor abstract class (#874)
* Create abstract CommandExecutor class

* Use CommandExecutor for Sandbox
2024-04-07 14:57:31 -05:00
Leo e52bf5ad7b Fix aligning settings between fe and be (#863)
* fix: aligning settings between FE and BE.

* apply black formatter and clean useless codes.
2024-04-07 14:56:14 -05:00
Robert Brennan e878b0c7ee Remove old references to pip (#871)
* Update README.md

* Update README.md
2024-04-07 13:21:07 -05:00
Engel Nyst 2f9bf606c7 Don't save backend file (#870)
* Don't save backend file

* Update Makefile

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>

---------

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>
2024-04-07 19:39:32 +02:00
Engel Nyst 04066ca42b fix default logging (#867) 2024-04-07 18:55:52 +02:00
iamiks a0c5c8efe9 i18n : Add Korean (#849) 2024-04-07 10:40:18 -05:00
808vita 6f346b3789 refactor & fix frontend: typing chat (#861) 2024-04-07 10:40:03 -05:00
Xingyao Wang e9121b78fe use .getLogger to avoid same logging message to get printed twice (#850) 2024-04-07 04:07:59 -04:00
Anas DORBANI d3770f1db6 add github action for project build on macos and linux (#838)
* update github action to build and run tests  on macos and linux

* fix docker installation

* Fix poetry installation on macos

* Fix docker installation

* Fix docker installation - start docker daemon

* Change docker installation macos

* Update docker buildx version

* new docker installation

* Add new start docker structure

* Add new start docker structure 2

* update github action to build and run tests  on macos and linux

* Update makefile to fix chroma-hnswlib issue with macos

* fix macos build

* Fix macos issue

* Fix macos

* Reformat Makefile

* updates
2024-04-07 03:54:52 -04:00
Engel Nyst 99a8dc4ff9 Fallback to less expensive model (#475) 2024-04-07 05:45:37 +02:00
Engel Nyst 4b4ce20f2d Add logging (#660)
* Add logging config for the app and for llm debug

* - switch to python, add special llm logger

- add logging to sandbox.py

- add session.py

- add a directory per session

- small additions for AgentController

* - add sys log, but try to exclude litellm; log llm responses as json

* Update opendevin/_logging.py

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>

* - use standard file naming
- quick pass through a few more files

* fix ruff

* clean up

* mypy types

* make mypy happy

---------

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>
2024-04-07 05:43:25 +02:00
Sri d87a7ddd83 moving type and eslint related packages to dev dependencies (#811)
Co-authored-by: msri23 <112919748+msri23@users.noreply.github.com>
2024-04-06 22:40:51 -04:00
Anas DORBANI 9c98b67002 Fix awk error for the WSL users (#837) 2024-04-07 00:56:15 +00:00
AbhisekOmkar c0dfc851b9 Add Discord and Slack Community Link to README (#835)
This pull request enhances the README.md file by adding a Discord community link to the OpenDevin project. The addition includes a Discord logo with a hyperlink to the project's Discord server invite, ensuring easy access for users interested in joining the community. This update maintains consistency with existing style elements, such as badges for other community platforms like Slack. By incorporating the Discord link, this pull request aims to promote community engagement and improve user accessibility within the project.
2024-04-06 13:30:58 -05:00
AbhisekOmkar 5ce3af6f8f Update README.md (#832) 2024-04-06 13:02:44 -05:00
RaGe 371f7a127d (feat) Use OpenDevin to address OpenDevin issues (#803)
* Initial attempt

* Add `poetry run` to command

* Add poetry dependency cache

* cache needs poetry to be installed first :/

* escape newlines in issue body

* Write task to file instead

* Add LLM_API_KEY

* Attempt PR generation

* Bring back agent operations

* set-output is deprecated, use GITHUB_ENV

* PR should target default branch
2024-04-06 13:49:21 -04:00
Anas DORBANI e54fe4491c Update README.md
Add poetry to requirements
2024-04-06 17:47:33 +00:00
Leo 3313e473ea style: Action and Observation use schema for unifying (#494)
* style: Action and Observation use schema for unifying

* merge from upstream/main

* merge from upstream/main
2024-04-06 13:46:07 -04:00
Xingyao Wang 229fa988c5 remove seed=42 to fix #813 (#830) 2024-04-06 13:04:17 -04:00
Junyang Lin 5dda0ddef6 Update README.md
change discord invite link
2024-04-07 00:55:35 +08:00
Aaron Jorgensen 23a7057be2 Update SettingModal.tsx (#792) 2024-04-06 12:41:48 -04:00
Graham Neubig f40fe6ac28 Remove pnpm (#823)
* Remove pnpm

* Remove from ci

* Remove pnpm

* Remove cache from lint
2024-04-06 12:39:02 -04:00
Graham Neubig 8f097f8643 Make poetry install manual and provide user with install instructions (#818)
* Add install instructions for poetry

* Update ci

* Move poetry before docker pull

* Added link
2024-04-06 12:38:12 -04:00
Junyang Lin 49f3665a99 update readme with discord server invite link (#828) 2024-04-07 00:35:46 +08:00
Leo 0b5eea967f fix: enable custom model. (#825) 2024-04-06 10:23:48 -04:00
Leo c34517be2b fix: wrong setting for localstorage workspace. (#821) 2024-04-06 08:38:12 -04:00
RaGe 228889c50c fix frontend dependencies (#810)
fixes #805
2024-04-06 07:04:10 -04:00
Anas DORBANI d38113cead Ad/fix pre commit (#807)
* Fix pre-commit config and some mypy issues

* remove hook of requirements.txt
2024-04-06 03:48:30 +00:00
Anas DORBANI 66cd9f9bd9 Check python npm pnpm docker (#800) 2024-04-06 01:20:47 +00:00
Anas DORBANI 7d3cd06ef5 Add litellm Gemini Pre-requisites (#798) 2024-04-06 00:36:34 +00:00
Alex Bäuerle d20f532289 feat: add tree view for the files in the current workspace (#601)
* feat: add tree view for the files in the current workspace

* minor

* rest endpoints

* minor
2024-04-05 20:21:39 -04:00
Anas DORBANI 7cc58b28a5 Fix pre-commit unset when using make build (#796) 2024-04-06 00:11:12 +00:00
Sagar Shah 2855959c76 translate to english (#675) 2024-04-05 13:42:41 -05:00
Robert Brennan fad51a6141 Add GitHub Action for pytest (#400)
* add action for pytest

* Update .github/workflows/run-tests.yml

* Update .github/workflows/run-tests.yml

* Update run-tests.yml

* Apply suggestions from code review

* Update run-tests.yml

* Update Pipfile

* Update .github/workflows/run-tests.yml

* Update .github/workflows/run-tests.yml

* Update .github/workflows/run-tests.yml

* Stubborn pipenv (#620)

* Update run-tests.yml

* Update run-tests.yml

* switch to poetry

* Update .github/workflows/run-tests.yml

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-05 13:24:49 -05:00
Davide Guidotti 3da56d3b6f Add the ability to set LLM_BASE_URL with make setup-config (#616)
* Add the ability to set LLM_BASE_URL with make setup-config

* Add hint for LLM_BASE_URL in make setup-config

* Adjust indentation after merge

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-05 12:57:56 -05:00
Jack Quimby d6128941b7 Doc: Document difference between agents (#722)
* doc: Guide for using local LLM with Ollama

* forgot to delete print statement

* typos

* Updated guide - new working method

* Move to docs folder

* Fixed front end overwrite local model name

* Update llm.py

* Delete docs/examples/images/example.png

deleted example.png

* Documentation of agent differences

* rename examples to documentation

* Docstrings for all agents

* typo fix

* typo fixes

* Typo fixes

* more typo fixes

* typo fix

* typo fixes

* typos fixed

* Typo fixes

* top 10 list

* typo fix

* typo fix

* typos to the moon

* typos fixed

* typo fix

* typo fix

* anotha one

* The rest of the typos

* Corrected agent descriptions

* Agents markdown updated

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-05 12:51:25 -05:00
Alex Bäuerle a82e065f56 feat: add commands for swebench (#682)
* feat: add commands for swebench

* restructure
2024-04-05 12:47:32 -05:00
Vincent da12d70bc2 fix to pass api key to openai embedding (#754) 2024-04-05 12:28:10 -05:00
Alex Bäuerle a202b2550f fix: show code tab by default, planner not working yet (#775)
* fix: show code tab by default, planner not working yet

* remove planner content

* remove planner for now
2024-04-05 12:26:46 -05:00
Leo adbcfefd8c feat: websocket connection management and sandbox bound to session. (#559)
* feat: websocket connection management and sandbox bound to session.

* fix: set default value to id

* feat: add session management.

* fix for mypy

* fix for mypy

* fix the pnpm-lock.

* fix the default model is empty will throw error.
2024-04-05 12:19:52 -05:00
Robert Brennan fe9815d57b Add Contributor Covenant (#769)
* Add Contributor Covenant

* Update CodeOfConduct.md
2024-04-05 13:07:59 -04:00
RaGe 89dc78953e (fix) bump actions versions (#776) 2024-04-05 12:06:01 -05:00
Alex Bäuerle 9109344728 fix: fix chat overflow (#773) 2024-04-05 12:02:24 -05:00
geohotstan 1328f55dcd Added tests for other action subclasses in test_action_serialization (#593)
* tests

* change some args

* use cls instead
2024-04-05 09:53:29 -05:00
iFurySt dcaf51bc9e fix: add the community-maintained model list (#654) 2024-04-04 23:21:17 -04:00
Engel Nyst 12ddd973b5 Remove md around json (#685) 2024-04-04 23:04:34 -04:00
Anas DORBANI c834aa7275 enable build & run tests for the pull_requests (#745) 2024-04-05 02:24:27 +00:00
Anas DORBANI 8c2b4f5fde Update and fix Makefile (#743) 2024-04-05 02:08:23 +00:00
Alex Bäuerle 8ec24ab7ea feat: start implementing new frontend layout (#737) 2024-04-04 21:45:28 -04:00
Anas DORBANI 5ec0e5b7ec Switch to Poetry (#378)
* create the pyproject file

* Fix the pyproject.toml file

* Update Makefile

* adapt makefile

* fix some execution issues

* Untrack lock files and wait for the backend to get start before frontend

* Remove LangChain dependencies

* Add github action for pytest

* add missing dependency

* rebase and fix the versions adding lock file

* add torch and pymupdfb deps

* some conflicts fixes

* Add dependencies evaluation group

* add poetry.lock

* Fix unexpected operator

---------

Co-authored-by: Robert Brennan <contact@rbren.io>
2024-04-05 00:27:29 +00:00
Vincent a0928ae590 Updating MakeFile and fixing monologue memory parameter issue (#692)
* Updating memory for monologue agent to fix base_url being used with OpenAIEmbedding by accident, added default text-embedding-ada-002 to monologue agent memory for OpenAIEmbeddings, added more enviroment variable configurations to setup-config in make file

* adding indent
2024-04-04 17:47:31 -05:00
Udbhav Ram 7680ff7dec Update README.md to include docker configuration (#731) 2024-04-04 17:45:57 -05:00
RaGe 17ea9a7849 (fix) Do not reset backend for only language change (#736) 2024-04-04 18:23:49 -04:00
Engel Nyst e70767c226 clean up (#721) 2024-04-04 17:02:44 -05:00
mashiro 0534c14279 feat: i18n (#723)
* feat: i18n

* fix: ci lint error

* fix: pnpm run pre script not trigger

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-04 16:38:19 -04:00
mashiro baa981cda7 fix: block input send event while ime composition (#701)
* fix: trigger send event while ime composition & separate input element & disable input event while initializing

* fix: eslint react plugin setting

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-04 09:44:07 -04:00
Jack Quimby 0748f0b7ce Doc: Updated documentation for Ollama (#689)
* doc: Guide for using local LLM with Ollama

* forgot to delete print statement

* typos

* Updated guide - new working method

* Move to docs folder

* Fixed front end overwrite local model name

* Update llm.py

* Delete docs/examples/images/example.png

deleted example.png
2024-04-04 09:24:44 -04:00
mashiro 5bb7bdb9f8 fix: detect corepack node version need to equal minimize (#703) 2024-04-04 16:59:40 +08:00
mashiro 0fdc401f91 chore: use pnpm to manage frontend deps (#659)
* chore: use pnpm to manage frontend deps

* typo: change content of tips

* docs: update node.js require version & replacement installation link

* feat: detect the Node.js version to ensure corepack is supported

* typo: remove chinese comment

* fix: lint CI config, use pnpm to install dependencies

* fix: nextui style error when using pnpm

* fix: ci setup pnpm cwd

* fix: frontend lint ci install deps crash

* fix: ci lint frontend missing package.json path

* fix: frontend lint ci add cache-dependency-path prop
2024-04-04 14:55:03 +08:00
xcodebuild 3adcb7ea56 fix: make run on Windows (#665)
* fix: make run on Windows

* Update Makefile

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-04-03 23:15:31 -04:00
808vita 398749fcd8 fix frontend : highlight active assistant text and maintain chat seq | for #476 (#598)
* highlight active assistant text and maintain seq

* adjust speed

* alter autoscrolll
2024-04-03 22:46:52 -04:00
Engel Nyst 66031a67ba Make max_iterations configurable (#676) 2024-04-03 16:28:56 -04:00
Anas DORBANI 78c196d95f Add the description about the LMs Support (#674) 2024-04-03 16:09:37 -04:00
Jim Su 0c00e84e77 Send initialization arguments only when stored in localStorage (#618)
* Send initialization arguments only when stored in localStorage

* Fix mock default-model path
2024-04-03 13:56:47 -04:00
Robert Brennan 310cd7017d Add link to LiteLLM to make-setup (#614)
* Update Makefile

* fix tab

* add note to readme
2024-04-03 10:22:08 -04:00
xcodebuild 1c6f046c84 Fix make vars (#641)
* fix: let BACKEND_HOST and FRONTEND_PORT works in markfile

* feat: let vite do not clear terminal to keep backend log
2024-04-03 08:58:01 -04:00
Jack Quimby 08a2dfb01a Guide for Ollama local LLM (#615)
* doc: Guide for using local LLM with Ollama
2024-04-02 23:20:25 -04:00
Jim Su d397a200bf Add playwright to pipenv (#625)
* Add playwright to Pipfile
2024-04-02 21:25:16 -04:00
Paweł Ciosek b672ac3632 Fix terminal output disappears after switching tabs (#604) 2024-04-02 20:57:30 -04:00
universea 5f29df088a Add playwright and show screenshoots on web browser (#547)
* add playwright and show screenshoots on web browser

* fix lints

* fix lint issues

* fix lint issues

* fix lint issues

* fix lint issues

* fix lint issues
2024-04-02 18:46:01 -04:00
Robert Brennan c3f95f049f Update Makefile (#609) 2024-04-02 18:36:10 -04:00
Yufan Song 324a00f477 refactor(config): make a single source of truth file (#524)
* refactor

* fix nits

* add get from env

* refactor logic
2024-04-02 18:01:23 -04:00
Yufan Song 1af287a24b optimize(sandbox): cleanup docker when disconnect to speed up restart speed (#557)
* clean docker when disconnect

* add check
2024-04-02 17:57:16 -04:00
ryanwclark1 b6b7272cdc Add default model option to setup-config reads echo and writes to conf if not provided by user (#585)
Co-authored-by: Ryan Clark <ryanc@accentservices.com>
2024-04-02 17:52:48 -04:00
Theodoros Aslanidis fdbcdb84ee Update README (#574) 2024-04-02 17:51:57 -04:00
Engel Nyst 8fef7495a9 Removed langchain (#586)
* Removed langchain

* Update README.md

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-02 17:38:54 -04:00
ncrypted | Oliver 801417dcbf fix: make run conditional logs removal (#579)
* fix: make run conditional logs removal

* fix: make run fail-safe logs & pipe management
2024-04-02 16:50:13 -04:00
Robert Brennan 22a9a28e46 Update intro message (#569) 2024-04-02 10:02:09 -04:00
Jim Su 0a8a857d00 fix: apply settings without explicit change (#541)
* fix: apply settings without explicit change
* Change default model to gpt-4-0125-preview and don't print settings selection on startup
2024-04-02 09:49:43 -04:00
xcodebuild d64383a520 fix: let make run output both backend and frontend (#576)
* fix: let make run output both backend and frontend

* fix: delete pipe on run
2024-04-02 20:54:16 +08:00
Yufan Song 5e87c79838 refactor (#543) 2024-04-02 08:13:38 -04:00
Robert Brennan ddbe4fc604 Remove debug output (#572) 2024-04-02 08:12:26 -04:00
Yufan Song 93b0156f34 fix (#567) 2024-04-02 08:12:12 -04:00
Yufan Song 3e2045626a add max_iterations (#566) 2024-04-02 08:11:23 -04:00
xcodebuild c992a9f464 feat: add mock apis (#564) 2024-04-02 08:10:47 -04:00
Xingyao Wang 16fe8303e7 support disable color printing via environment variable in config.toml (#545) 2024-04-02 07:21:30 -04:00
xcodebuild 4c1d278d14 fix: use 127.0.0.1 for api proxy (#535) 2024-04-02 18:22:28 +08:00
Xingyao Wang c37124d740 Update README.md (#540) 2024-04-02 14:58:28 +08:00
Xingyao Wang fe736013cc Print with color in controller (#176)
* and color-coded printing action abd obs

* fix langchains agent's observation

* remove color printing from codeact agent

* fix mypy

* fix ruff

* resolve conflict

* fix ruff
2024-04-02 13:13:19 +08:00
Binyuan Hui 6d715c2ac7 fix: badges need to be centered (#538)
* fix: refine readme style

* fix: refine readme style

* fix: refine readme style

* fix: badges need to be centered

* fix: badges need to be centered

* fix: badges need to be centered

* fix: badges need to be centered

* fix: badges need to be centered
2024-04-02 11:45:20 +08:00
Alex Bäuerle 79237210f2 build(add-files-created-for-other-dev-envs-to-gitignore): Add files such as requirements.txt, .python-version, bun.lockb, and yarn.lock so that if anybody uses these systems, they don't accidentally push the files (#519) 2024-04-01 23:21:45 -04:00
Alex Bäuerle 774529887a Update README.md (#536) 2024-04-02 11:14:55 +08:00
Robert Brennan e5db7b8855 Update bug report template (#395)
* Update bug report template

* Update bug_report.md

* Update bug_report.md

* Update .github/ISSUE_TEMPLATE/bug_report.md
2024-04-01 23:08:37 -04:00
Graham Neubig c15fa3e768 Add a host command to the makefile (#534) 2024-04-01 23:06:53 -04:00
George Balch a9f469f0e7 fix: Handle api key error (#488)
* Propagate exception up to task coroutine instead of breaking the for loop manually, await coroutine so exception is returned to caller

* Remove redundant traceback as is already called in step function lower on the stack

* Change message as error could occur in middle of task also

* Fix linter errors

* Raise exception only on api key error

* Add TODO
2024-04-01 22:49:06 -04:00
Anas DORBANI fa40d379de Update README file (#472)
* Update README file

* add the requirements section

* Revert the project description
2024-04-01 21:18:10 -04:00
Devin abe0b9fd79 fix: Add numpy to Pipfile dependencies for issue #520 (#533)
Co-authored-by: Ubuntu <ubuntu@ip-10-240-235-229.us-west-2.compute.internal>
2024-04-01 20:55:20 -04:00
Joo-Won Jung b609f0681b Add an Pipfile example for AMD GPU or CPU only users (#510)
* refactor: add Pipfile example to select various PyTorch package

* Update README.md

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-01 13:51:50 -04:00
Robert Brennan 511afa12fe fix old references to langchains (#513) 2024-04-01 13:33:20 -04:00
Binyuan Hui d97602d071 add: .gitattributes (#511)
* add: .gitattributes

* fix: change ipynb as vendored
2024-04-02 01:24:28 +08:00
Alex Bäuerle 6d43b0ae88 fix: set the input to be empty after pressing enter instead of showing a newline (#506) 2024-04-02 01:07:06 +08:00
Robert Brennan dac89a9720 Update README.md (#505) 2024-04-01 11:54:45 -04:00
Robert Brennan 3e6508b7f1 Update Langchains Agent, rename to Monologue (#402)
* standardize on content vs contents

* remove langchain dep

* rename langchains to monologue

* unused imports

* Update agenthub/planner_agent/prompt.py

* update locks

* fix code editor

* fix agent list in frontend
2024-04-01 11:33:39 -04:00
Aravind Somaraj 26c9ce132b style: Moved argument parsing statements into a separate function (#503)
* style: moved argument parsing into a separate function

* commito

* Update evaluation/regression/conftest.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-01 10:47:58 -04:00
Patrick Nercessian 64281c4cc4 Transitioned to use LiteLLM Router to support retries and backoffs (#501) 2024-04-01 10:42:52 -04:00
Tess 8796a690d5 doc - Added code documentation for clarity (#434)
* doc - Added code documentaion to 'plan.py'

* doc - Added code documentation to 'session.py'

* doc - added code documentation for clarity

* doc - added documentation to 'conftest.py'

* doc - added code documentation to 'run_tests.pt'

* Update evaluation/regression/conftest.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-01 10:22:09 -04:00
xcodebuild 1ae3f1bf6a Fix API with vite proxy (#454)
* fix: use vite proxy for API and ws

* feat: add env var BACKEND_HOST

* feat: mock api for /litellm-models
2024-04-01 13:17:20 +08:00
Erik Nilsson 4404b9af24 Automatic repair of json for langchains agent (#444)
* Added json_repair to Pipfile

* Automatic repair of json for langchains agent
2024-03-31 22:52:59 -04:00
808vita 0e3c86ad59 fix typing jsx ChatInterface.tsx (#468)
wrapped with fragment
2024-03-31 22:46:08 -04:00
Alex Bäuerle 15ab5e1617 feat: make input for message autoresize (#466) 2024-04-01 04:25:44 +08:00
Alex Bäuerle 838ca200a8 fix: fix chat bubble width (#464) 2024-04-01 03:33:26 +08:00
808vita a00f604995 feat(frontend): typing chat feature | for #187 (#353)
* feat(frontend): typing chat

* fix linting

* typo corrections - ChatInterface.tsx

typo corrections in comments

* refactor and fix typing
2024-04-01 03:10:37 +08:00
Anas DORBANI c6cc5f6b5a Add Makefile (#432) 2024-03-31 14:46:42 -04:00
iFurySt a08c82d35e ci: check if the image exists in ghcr.io to avoid repeat building and pushing (#283)
* ci: check if the image exists in ghcr.io to avoid repeat building and pushing.

* feat: add push MAJOR and MINOR version to ghcr.io
2024-03-31 11:30:13 -04:00
Robert Brennan ec073834ad Send error response instead of crashing (#450) 2024-03-31 11:17:23 -04:00
808vita e9adc45276 doc: add mock server to list in CONTRIBUTING.md (#442)
Added mock server to "How to begin" section
2024-03-31 11:16:33 -04:00
Robert Brennan a73df88678 require 3.11 (#451) 2024-03-31 11:12:13 -04:00
Engel Nyst 6e4089fb75 Remove unexpected backticks from the LLM (#440) 2024-03-31 10:39:35 -04:00
Jayanth Parthsarathy 907da30320 doc: fix wrong link to server in contributing.md (#427)
* doc: fix wrong link to server in contributing.md

---------

Co-authored-by: jimsu <jimsu@protonmail.com>
2024-03-31 01:32:27 -04:00
Jim Su 7af57b3e1d Make user messages blue (#428) 2024-03-31 01:09:03 -04:00
Engel Nyst ddf27c0891 Fix Agent README.md (#422) 2024-03-31 00:55:24 -04:00
iFurySt dab86c4a77 Style improve fe layout and interaction (#385)
* style: improve FE layout and interaction, remove daisyui, switch to nextui; reduce the props of Workspace component; adjust style of chat bubble and workspace tabs.
2024-03-31 00:52:34 -04:00
Manal Arora 87d56d961f feat(frontend): adding functinality to get agents from backend server (#406)
* adding functinality to get agents from backend server

* dynamic selection of agents in backend

* adding the function listAgents

* Update opendevin/server/listen.py

* Update opendevin/agent.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-30 19:45:55 -04:00
Robert Brennan fd5618202e Update sandbox.py (#390) 2024-03-30 19:35:59 -04:00
Robert Brennan 2a959a199b Update README.md (#396) 2024-03-30 19:28:58 -04:00
jannikwinghart da69bab11a Added instructions to update the backend architecture diagram (#413) 2024-03-30 19:24:52 -04:00
Robert Brennan 93cb5e0617 Update agent docs (#407)
* Update agent docs

* Update README.md

* Update README.md
2024-03-30 19:23:40 -04:00
Yufan Song 2d8cb973a8 add container image (#408) 2024-03-30 14:58:31 -04:00
Robert Brennan 4882cb6a66 Update README.md (#403) 2024-03-30 14:11:58 -04:00
jannikwinghart dd1ac82d15 Added architecture diagrams for system overview and backend (#391) 2024-03-30 12:03:56 -04:00
Zijian Han 838f79879b fix #375: update python version (#380) 2024-03-30 10:46:02 -04:00
Robert Brennan 6bd566d780 simplify readme (#366)
* simplify readme

* Update config.toml.template

* Update vite.config.ts (#372)

* Update vite.config.ts

* Update frontend/vite.config.ts

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>

* remove old langchains infra

* remove refs to OPENAI_API_KEY

* simplify opendevin readme

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-03-30 10:15:20 -04:00
Jim Su 11ed011b11 Update README.md (#379) 2024-03-30 10:15:08 -04:00
Robert Brennan effac868c1 Implement deserialization for actions and observations (#359)
* action deserializing

* add observation deserialization

* add tests

* refactor agents with serialization

* fix some errors

* fix lint

* fix json parser
2024-03-30 10:06:25 -04:00
Robert Brennan f68ee45761 output prompt debug before response (#348) 2024-03-30 20:08:05 +08:00
Renu 0a4f7de215 refactor(frontend): categorized css files into a separate folder (#298)
* Issue 294

* categorize css files

* cleanup

* formatted css
2024-03-30 20:01:39 +08:00
Jim Su 1be355436d Expose LiteLLM model names in backend (#370) 2024-03-29 19:34:32 -04:00
Robert Brennan 96af82ad6d Update README.md (#365) 2024-03-29 16:58:21 -04:00
Hesham 0e1ba56ef0 Support --task-file argument for main.py (#358)
* Update main.py

* Update main.py

* Update opendevin/main.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* Update main.py

* Update main.py

* Update main.py

* Update main.py

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-29 16:54:26 -04:00
Anas DORBANI f5d07ad1b3 add more checks for the pre-commit config (#279) 2024-03-29 16:48:02 -04:00
Robert Brennan 3d7a86feb6 Fix TOML parser (#363)
* fix toml

* Update opendevin/config.py

* install types

* install types

* Revert "install types"

This reverts commit 820b38b1e2.

* fix install types

* non-interactive
2024-03-29 16:39:53 -04:00
Robert Brennan 248c39697e Remove extra install step (#361) 2024-03-29 16:13:29 -04:00
iFurySt 2286e73912 fix: change to use the latest docker image. (#290)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-29 15:59:52 -04:00
Josh Hendershot 197e7fb2c0 fix(agenthub): update initial thoughts (#281)
* fix(agenthub): update initial thoughts

* edit langchains_agent
2024-03-29 15:57:10 -04:00
Yufan Song a7b4a7ca2f add byte order (#292) 2024-03-29 15:50:06 -04:00
Yufan Song 2def49e794 add (#355)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-29 15:33:02 -04:00
Jim Su b1b96df8a8 Replace environment variables with configuration file (#339)
* Replace environment variables with configuration file

* Add config.toml to .gitignore

* Remove unused os imports

* Update README.md

* Update README.md

* Update README.md

* Fix merge conflict

* Fallback to environment variables

* Use template file for config.toml

* Update config.toml.template

* Update config.toml.template

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-29 15:26:20 -04:00
George Balch b443c0af29 fix: Update requirements.txt with google-generativai (#315)
* Install google generativeai package and update requirements.txt using pip freeze

* Switch to pipenv for package management and add google-generateai package as well

* Update README with new installation instructions, refactor a little for better ordering of instructions

* Fix typo

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-29 15:16:12 -04:00
808vita 98e7057d53 refactor(vite config) & doc (main readme) : Start Frontend with just "npm start" (#333)
* frontend : added vite --port 3001 | package.json

 "start" script : added "vite --port 3001" to frontend package.json

Just use npm start ; uses port 3001

* Updated installation section for frontend | README.md

With the addition of "start": "vite --port 3001" to the frontend's package.json, simply use "npm start" to initiate the frontend development environment.

* -modify vite config port to 3001
-Revert "frontend : added vite --port 3001 | package.json"
2024-03-29 15:13:42 -04:00
Raphaël Fleury-le veso f3fda42765 fix: Add claude-3-opus-20240229 option to settingsService.ts (#342)
* Added claude-3-opus-20240229 option to settingsService.ts

* Fix lint error

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-03-29 12:23:18 -04:00
Robert Brennan a6f0c066b5 Implement Planning (#267)
* add outline of agent

* add plan class

* add initial prompt

* plumb plan through a bit

* refactor state management

* move task into state

* fix errors

* add prompt parsing

* add task actions

* better serialization

* more serialization hacks

* fix fn

* fix recursion error

* refine prompt

* better description of run

* update prompt

* tighter planning mechanism

* prompt tweaks

* fix merge

* fix lint issues

* add error handling for tasks

* add graphic for plans

* remove base_path from file actions

* rename subtask to task

* better planning

* prompt updates for verification

* remove verify field

* ruff

* mypy

* fix actions
2024-03-29 11:47:29 -04:00
Robert Brennan 32a3a0259a Serialization of Actions and Observations (#314)
* checkout geohotstan work

* merge session.py changes

* add observation ids

* ignore null actions and obs

* add back action messages

* fix lint
2024-03-29 10:49:40 -04:00
brzhang bbc51c858d feat: pref message send box style (#335)
Co-authored-by: brzhang <hoollyzhang@tencent.com>
2024-03-29 09:19:39 -04:00
Robert Brennan 0322693ec2 remove base_path from file actions (#320)
* remove base_path from file actions

* unused imports
2024-03-29 11:55:25 +08:00
Anas DORBANI 7c27e59918 feat: Ad/regression tests using pytest (#329)
* Remove all the unnecessary files

* Create finalize the regression testing framework and add hello world test case

* Update requirements.txt

* Update the test function to execute the generate script
2024-03-28 23:40:30 -04:00
Jim Su fa87352b45 Correctly change directory (#325)
* Correctly change directory

* Make pre-commit executable

* Fix lint issues
2024-03-28 20:49:37 -04:00
808vita 7448d9147b Update README.md (#323)
Frontend readme updated - for terminal section
2024-03-28 20:36:56 -04:00
Ashher Ali 33f67876d9 bug: #313 responsive width wrt size (#302)
* #294 refactor Browser.tsx and Browser.css

* #313 UI responsive

* husky issue resolved

* #313 UI responsive
2024-03-28 16:54:35 -04:00
Yufan Song e6633e6aae refactor (#318) 2024-03-28 15:51:28 -04:00
Yufan Song dedd09fdf5 add mkdir (#317) 2024-03-28 15:39:40 -04:00
Robert Brennan 94120f2b5d refactor state management (#258)
* refactor state management

* rm import

* move task into state

* revert change

* revert a few files
2024-03-28 15:23:47 -04:00
Yufan Song d993162801 feat(sandbox): add container process timeout kill machenism (#316)
* add timeout

* set timeout
2024-03-28 15:04:13 -04:00
Jim Su e249776e96 Force dark mode (#310) 2024-03-28 12:25:17 -04:00
Yashwanth S C f38bbf9261 feat(frontend): Add file picker to choose the working directory #221 (#289)
Add a file picker feature to top of the chat window.

Now a user can select the directory to work with.

Once a directory is chosen the user has the option to edit the directory also.(i.e choose another directory).
2024-03-28 10:53:33 -04:00
Yufan Song 256a04b307 fix(sandbox): add get pid code and kill the process (#274)
* add get pid code and kill

* remove print

* apply suggestion
2024-03-28 10:41:17 -04:00
808vita 2ff2cb5549 refactor (frontend) : for issue #294 frontend task : Clean up Browser.tsx & Browser.css (#296)
* Clean up comments from Browser.tsx

Clean up commented code from Browser.tsx

* clean up Browser.css and add mockup-browser class

clean up unused css classes in Browser.css and add "mockup-browser class"

* removed inline style-  Browser.tsx

Removed inline styles from the div in Browser.tsx ;  moved them to Browser.css under the class name "mockup-browser".
2024-03-28 21:30:03 +08:00
Kishore Chitrapu 73ea0e5803 Fix Docker container creation and error handling (#280)
* Fix Docker container creation and error handling

* updated patch based on the feedback

* add check docker message
2024-03-28 08:51:31 -04:00
Anas DORBANI 82c215ed5d Extract logic from init from langchains_agent and codeact_agent (#167) 2024-03-28 08:51:21 -04:00
Jim Su b1944a63ef Re-add banner to App and add change message (#282)
* Re-add banner to App and add change message

* feat: styling improvements for right panel

---------

Co-authored-by: huybery <huybery@gmail.com>
2024-03-28 20:35:17 +08:00
digger yu ed3bf194c7 fix typo (#286) 2024-03-28 19:27:50 +08:00
808vita 145d9e8041 Update CONTRIBUTING.md (#291)
typo correction
2024-03-28 19:23:36 +08:00
Robert Brennan 89e923679e API schema draft (#189)
* add api schema draft

* Update README.md

* Update opendevin/server/README.md
2024-03-27 22:03:55 -04:00
Binyuan Hui 658b860d04 feat: support tailwind and daisyUI (#266)
* feat: support tailwind and daisyUI

* feat: some styling improvements with daisyUI

* fix: remove flex in app.css and edit height in CodeEditor
2024-03-28 09:44:04 +08:00
Jim Su 2590570109 Add model and agent options in frontend (#271)
Add model and agent options in frontend
2024-03-27 21:09:22 -04:00
George Balch 16bf9d3cd2 Check for env var in parser argument default value or use hardcoded default (#276) 2024-03-27 20:55:16 -04:00
Robert Brennan a9102382f6 Update image registry in README.md (#265)
* Update image registry in README.md

* Update README.md

* Update README.md
2024-03-28 04:36:50 +08:00
Yufan Song 5db5acdaa3 add docker image docs (#263) 2024-03-27 16:08:53 -04:00
Robert Brennan 4304aceff3 remove openai key assertion, enable alternate embedding models (#231)
* remove openai key assertion

* support different embedding models

* add todo

* add local embeddings

* Make lint happy (#232)

* Include Azure AI embedding model (#239)

* Include Azure AI embedding model

* updated requirements

---------

Co-authored-by: Rohit Rushil <rohit.rushil@honeywell.com>

* Update agenthub/langchains_agent/utils/memory.py

* Update agenthub/langchains_agent/utils/memory.py

* add base url

* add docs

* Update requirements.txt

* default to local embeddings

* Update llm.py

* fix fn

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: RoHitRushil <43521824+RohitX0X@users.noreply.github.com>
Co-authored-by: Rohit Rushil <rohit.rushil@honeywell.com>
2024-03-27 14:58:47 -04:00
Vikramaditya Singh 9ae903697d Fixed terminal padding in addition to UI enhancement (#217)
* feat:enhanced layout

Signed-off-by: Vikramaditya <awesomevikram3@gmail.com>

* fixed:terminal padding

Signed-off-by: Vikramaditya <awesomevikram3@gmail.com>

* fixing lint issues

Signed-off-by: Vikramaditya <awesomevikram3@gmail.com>

* fixing lint issues

Signed-off-by: Vikramaditya <awesomevikram3@gmail.com>

---------

Signed-off-by: Vikramaditya <awesomevikram3@gmail.com>
2024-03-28 02:42:19 +08:00
George Balch 98919d15ae Scroll to bottom of messages on new messages state change (#230)
Co-authored-by: George Balch <george.balch@proton.me>
2024-03-27 14:42:02 -04:00
Ikko Eltociear Ashimine 7c69672301 Update listen.py (#264)
recieves -> receives
2024-03-27 14:40:08 -04:00
Robert Brennan 0403b460f1 remove auto-open from vite (#254) 2024-03-28 01:15:59 +08:00
Robert Brennan 895ed00d1a Update README.md (#256) 2024-03-28 00:41:43 +08:00
Robert Brennan 9bc1890d33 add debug dir for prompts (#205)
* add debug dir for prompts

* add indent to dumps

* only wrap completion in debug mode

* fix mypy
2024-03-27 12:40:08 -04:00
iFurySt 89abc5e253 fix: move the makefile to correct path. (#252) 2024-03-27 23:53:40 +08:00
Jim Su fcad0538fb Refactor socket to handle both action and observation types (#250) 2024-03-27 11:34:28 -04:00
Robert Brennan 15a4c0044b Update README.md 2024-03-27 11:12:15 -04:00
iFurySt 8b9fc3df28 feat: add workflow to ghcr (#237)
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-03-27 23:10:34 +08:00
George Balch 9b5190796e fix: Add eslint to package.json (#229)
* Add eslint to package.json

* Use npm to install instead of yarn, update package-lock.json

---------

Co-authored-by: George Balch <george.balch@proton.me>
2024-03-27 11:08:39 -04:00
iFurySt 1c5c84faea fix: delete the docker exec output prefix(8 bytes). (#247) 2024-03-27 11:05:07 -04:00
Robert Brennan 413f5b1c0b standardize on LLM_MODEL (#244) 2024-03-27 09:41:13 -04:00
Robert Brennan 6b4a613578 Update README.md (#243) 2024-03-27 21:27:59 +08:00
Binyuan Hui 0cbe95dd35 feat: update opendevin logo (#242)
* update opendevin logo

* update opendevin logo

* remove 'open' text in the logo
2024-03-27 21:26:50 +08:00
Robert Brennan ea10809a96 fix file write observations (#225)
* fix file write observations

* empty content for file write
2024-03-27 13:07:14 +08:00
Robert Brennan 95c128c3d6 Fix summaries (#223)
* fix summarization

* fix content vs contents issue
2024-03-26 16:55:00 -04:00
Robert Brennan 93656f3bb3 add error handling for action running (#212) 2024-03-26 16:42:27 -04:00
Yufan Song 9ab15b3287 add mock server (#214) 2024-03-26 16:15:20 -04:00
iFurySt 02a0367757 feat: add message related schema to FE and BE. (#195) 2024-03-26 12:06:30 -04:00
Robert Brennan 7cdfe63432 Add node version requirement to README (#206)
* Update README.md

* add engines to package.json
2024-03-26 12:05:48 -04:00
Robert Brennan a6aa2d88cc Fix messages in chat window (#185)
* fix messages from server

* fix observations

* fix file path arg

* revert session
2024-03-26 23:47:53 +08:00
Robert Brennan 134cc2bc84 fix summaries (#191) 2024-03-26 23:44:54 +08:00
Robert Brennan 9e924ab4b1 add error observations (#196) 2024-03-26 23:43:59 +08:00
Robert Brennan 4ee0ae9c4a fix readme env vars (#199) 2024-03-26 10:43:11 -04:00
Giacomo Rocchetti ecadff3442 Removed attach button (#197)
Co-authored-by: giacomo <Giacomo.Rocchetti@diatechpharmacogenetics.com>
2024-03-26 10:29:37 -04:00
Robert Brennan 983092c182 use utf8 encoding for file operations (#180) 2024-03-26 08:44:16 -04:00
Robert Brennan 0758db202c fix bad observation check (#184)
* fix bad observation check

* fix add_history call
2024-03-26 08:35:06 -04:00
Robert Brennan 5d75e23fa0 add model picking docs (#181) 2024-03-26 07:57:47 -04:00
Robert Brennan 397f3fbc78 Add rewrite logic to help with docker-in-docker scenarios (#146)
* add rewrite logic

* Update opendevin/sandbox/sandbox.py

* Update sandbox.py
2024-03-26 07:38:57 -04:00
Xingyao Wang 3cb1e05c8c update readme for new abstraction (#173) 2024-03-26 07:29:46 -04:00
Robert Brennan 486ec2d94a Minor fixes for new Agent framework (#158)
* add message if exit happens before finish

* fix kill command

* fix browse action

* fix sandbox

* update requirements

* add status code

* refactor controller

* add try to callbacks

* fix background log collection

* format logs a bit more nicely

* run background procs in same container

* fix close command

* rename sockt

* fix up kills

* fix lint issues

* fix ruff

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-03-26 12:55:33 +08:00
Jim Su 430527680f Server: load environment variables from .env and document environment variables (#137)
Server: load environment variables from .env and document environment variables
2024-03-26 00:14:48 -04:00
Robert Brennan eb4a261880 Create generic LLM client using LiteLLM (#114)
* add generic llm client

* fix lint errors

* fix lint issues

* a potential suggestion for llm wrapper to keep all the function sigatures for ide

* use completion partial

* fix resp

* remove unused args

* add back truncation logic

* fix add_event

* fix merge issues

* more merge issues fixed

* fix codeact agent

* remove dead code

* remove import

* unused imports

* fix ruff

* update requirements

* mypy fixes

* more lint fixes

* fix browser errors

* fix up observation conversion

* fix format of error

* change max iter default back to 100

* fix kill action

* fix docker cleanup

* add RUN_AS_DEVIN flag

* fix condense

* revert some files

* unused imports

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Robert Brennan <rbren@Roberts-MacBook-Pro.local>
2024-03-26 12:10:23 +08:00
Robert Brennan 815b78595a Fix sandbox user ID on windows (#170)
* set user id

* fix lint
2024-03-25 22:56:48 -04:00
Yufan Song df14ce6c72 add version info (#168) 2024-03-25 21:17:34 -04:00
Hesham 6a1197f5c0 improve monologue (#157)
* improve monologue

* Update monologue.py

---------

Co-authored-by: Hesham Haroon <heshamharoon@Heshams-MacBook-Pro.local>
2024-03-25 19:15:40 -04:00
Hesham 847c2148a6 update messages (#156)
* update messages

* Update opendevin/action/agent.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* Update opendevin/action/agent.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

---------

Co-authored-by: Hesham Haroon <heshamharoon@Heshams-MacBook-Pro.local>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-25 19:15:23 -04:00
geohotstan 0e090773e3 updated pre-commit so it works (#161) 2024-03-25 18:28:32 -04:00
Robert Brennan ad83a4d869 fix tag on sandbox (#164) 2024-03-25 18:24:30 -04:00
Robert Brennan dc1a1e6685 Update README.md (#160) 2024-03-25 18:15:25 -04:00
Xingyao Wang 82f934d4cd New Agent, Action, Observation Abstraction with updated Controller (#105)
* rearrange workspace_dir and max_step as arguments to controller

* remove unused output

* abstract each action into dataclass

* move actions

* fix action import

* move cmd manager and change method to private

* move controller

* rename action folder

* add state

* a draft of Controller & new agent abstraction

* add agent actions

* remove controller file

* add observation to perform a refractor on langchains agent

* revert to make this compatible via translation

* fix typo and translate error

* add error to observation

* index thought as dict

* refractor controller

* fix circular dependency caused by type hint

* add runnable attribute to agent

* add mixin to denote executable

* change baseclass

* make file read/write action compatible w/ docker directory

* remove event

* fix some merge issue

* fix sandbox w/ permission issue

* cleanup history abstraction since langchains agent is not really using it

* tweak to make langchains agent working

* make all actions return observation

* fix missing import

* add echo action for agent

* add error code to cmd output obs

* make cmd manager returns cmd output obs

* fix codeact agent to make it work

* fix all ruff issue

* fix mypy

* add import agenthub back

* add message for Action attribute (migrate from previous event)

* fix typo

* fix instruction setting

* fix instruction setting

* attempt to fix session

* ruff fix

* add .to_dict method for base and observation

* add message for recall

* try to simplify the state_updated_info with tuple of action and obs

* update_info to Tuple[Action, Observation]

* make codeact agent and langchains compatible with Tuple[Action, Observation]

* fix ruff

* fix ruff

* change to base path to fix minimal langchains agent

* add NullAction to potentially handle for chat scenario

* Update opendevin/controller/command_manager.py

Co-authored-by: Robert Brennan <accounts@rbren.io>

* fix event args

* set the default workspace to "workspace"

* make directory relative (so it does not show up to agent in File*Action)

* fix typo

* await to yield for sending observation

* fix message format

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-26 01:56:58 +08:00
Ikko Eltociear Ashimine bc9c9198c4 docs: update CONTRIBUTING.md (#140)
offical -> official
2024-03-25 11:24:57 -04:00
Robert Brennan 8a64d7c912 Fix read and write operations when LLM asks for absolute path (#126)
* fix resolution of filepaths

* fix imports

* Update write.py
2024-03-25 08:53:26 -04:00
Engin Can Dinc e445f92281 Fix vertical layout for browser (#135) 2024-03-25 08:25:00 -04:00
Xingyao Wang 0c396d3bbd Update README.md (#134) 2024-03-25 08:07:41 -04:00
Jim Su 9f8b61666a Update README.md (#132)
Add demo video to README.md
2024-03-25 12:32:42 +08:00
Robert Brennan d9c1bee2aa Add instructions to README.md (#129)
* Update README.md

* add WORKSPACE_DIR env var

* Update README.md

* Update session.py

---------

Co-authored-by: Robert Brennan <rbren@Roberts-MacBook-Pro.local>
2024-03-25 11:35:12 +08:00
Jim Su 335a91610e Wire up frontend (#128) 2024-03-25 10:07:38 +08:00
Robert Brennan 0f4a11685e Add agenthub import to listen.py (#127) 2024-03-24 19:20:42 -04:00
Robert Brennan a023fcae64 fix truncation logic (#125) 2024-03-24 18:46:01 -04:00
Robert Brennan 8ce2ac7648 Fix agenthub import (#123)
* fix agenthub import

* add noqa
2024-03-24 18:29:09 -04:00
Sean Lim dfa98b334e fix typo (#122) 2024-03-24 17:15:29 -04:00
Robert Brennan 4aa24eb41d Server working with agent library (#97)
* server working with agent library

* update readme

* add messages to events

* factor out steps

* fix websocket messages

* allow user to run arbitrary actions

* allow user to run commands before a task is started

* fix main.py

* check JSON

* handle errors in controller better

* fix memory issue

* better error handling and task cancellation

* fix monologue len

* fix imports

* remove server from lint check

* fix lint issues

* fix lint errors
2024-03-24 22:24:44 +08:00
geohotstan fb1822123a Adding pre-commit and CI for ruff and mypy (#69)
* don't modify directories

* oops typo

* dev_config/python

* add config to CI

* bump CI python to 3.10

* 3.11?

* del actions/

* add suggestions

* delete unused code

* missed some

* oops missed another one

* remove a file
2024-03-23 19:41:49 -04:00
Yufan Song 642e1b3cd0 doc(sandbox): add more instructions for starting on sandbox work (#107)
* add doc

* fix

* remove requirement

* Update opendevin/README.md

Co-authored-by: Robert Brennan <accounts@rbren.io>

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-23 17:44:48 -04:00
Jim Su a2f245e2fe Set up Redux, extract component state into Redux store, and break WebSocket out of Terminal component only (#85) 2024-03-23 17:40:43 -04:00
Yufan Song fabdddca21 fix(evaluation): solve graceful shutdown of docker containers (#106)
* add atexit

* remove print

* fix typo
2024-03-23 17:37:20 -04:00
Robert Brennan 78f8752246 minor fixes for langchain agent (#83)
* minor fixes

* set max output len

* truncate logs

* Update agenthub/langchains_agent/utils/agent.py
2024-03-23 17:34:35 -04:00
Xingyao Wang 0b5a531518 fix typo in README (#104) 2024-03-23 12:10:49 +08:00
Junyang Lin b0a8af6c99 update MVP roadmap (#100) 2024-03-23 09:10:52 +08:00
Yufan Song 5c68d82995 doc: add contribution docs (#101)
* add doc

* fix typo
2024-03-23 09:08:19 +08:00
zch-cc e5a28cba2f Evaluation: Fix bug on python path on run.sh (#98)
* Move regression tests to evaluation/

* use pythnon instead of docker in the script

* add model para

* change python to python3

* bug fix

* add python path

* add readme
2024-03-23 00:01:48 +08:00
zch-cc cfefc47439 Move regression tests to evaluation/ (#86)
* Move regression tests to evaluation/

* use pythnon instead of docker in the script

* add model para

* change python to python3

* bug fix
2024-03-22 23:26:37 +08:00
Xingyao Wang 2ba6fb1e7b Use main.py arguments to specify model_name instead of env var for Langchains Agent (#94)
* use model name from main.py arguments instead of env var

* set default model to gpt-4-0125-preview for langchains agent to work;
print model, dir, and task when started
2024-03-22 22:06:42 +08:00
Robert Brennan 3b2ed14ae7 Use Docker SDK for sandbox, integrate into CommandManager (#93)
* refactor command manager to use docker and move to docker sdk

* fix read and write actions

* actually run background cmd

* use bash for running cmds and fix logs

* keep logs in buffer file

* fix up background logs

* consolidate requirements

* fix docker imports

* add fixme

* add remove fixme

* fix sandbox.py path in README

* fix typo annotation and prompt

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-03-22 21:54:00 +08:00
Aadya Madankar 4fda533b91 A starting point for SWE-Bench Evaluation with docker (#87) 2024-03-22 21:34:28 +08:00
libowen2121 40a3614e80 Add a roadmap for eval (#92) 2024-03-22 20:27:30 +08:00
Xingyao Wang 2d5c8f1060 change to OpenDevin fork (#89) 2024-03-22 18:30:12 +08:00
Xingyao Wang 5ff96111f0 A starting point for SWE-Bench Evaluation with docker (#60)
* a starting point for SWE-Bench evaluation with docker

* fix the swe-bench uid issue

* typo fixed

* fix conda missing issue

* move files based on new PR

* Update doc and gitignore using devin prediction file from #81

* fix typo

* add a sentence

* fix typo in path

* fix path

---------

Co-authored-by: Binyuan Hui <binyuan.hby@alibaba-inc.com>
2024-03-22 12:43:49 +08:00
Jiaxin Pei dc88dac296 adding a script to fetch and convert devin's output for evaluation (#81)
* adding code to fetch and convert devin's output for evaluation

* update README.md

* update code for fetching and processing devin's outputs

* update code for fetching and processing devin's outputs
2024-03-22 01:33:01 +08:00
Robert Brennan b84463f512 Refactor agent interface a bit (#74)
* start moving files

* initial refactor

* factor out command management

* fix command runner

* add workspace to gitignore

* factor out command manager

* remove dupe add_event

* update docs

* fix init

* fix langchain agent after merge
2024-03-21 23:35:28 +08:00
Xingyao Wang 2de75d4782 Minimal Docker Sandbox with GPT-3.5 Execution Example (#48)
* minimal docker sandbox

* make container_image as an argument (fall back to ubuntu);
increase timeout to avoid return too early for long running commands;

* add a minimal working (imperfect) example

* fix typo

* change default container name

* attempt to fix "Bad file descriptor" error

* handle ctrl+D

* add Python gitignore

* push sandbox to shared dockerhub for ease of use

* move codeact example into research folder

* add README for opendevin

* change container image name to opendevin dockerhub

* move folder; change example to a more general agent

* update Message and Role

* update docker sandbox to support mounting folder and switch to user with correct permission

* make network as host

* handle erorrs when attrs are not set yet

* convert codeact agent into a compatible agent

* add workspace to gitignore

* make sure the agent interface adjustment works for langchain_agent
2024-03-21 21:54:56 +08:00
Jim Su a722f5c0b1 Fix: Keypresses in Terminal throws exception (#71) 2024-03-21 06:36:48 -04:00
Xingyao Wang 0380070e98 Abstraction that allows us to develop different agents, frontend, backend, and evaluation in parallel (#68)
* move agent to langchains_agent

* remove old .env

* remove the old agent folder

* add preliminary version of Agent abstraction

* add preliminary version of the main.py

* merge controlloop and main into a Agent class

* add init

* fix json import

* fix missing arg

* get langchains_agent working after abstraction

* rename `research` to `agenthub`

* rename: rename research to agenthub

---------

Co-authored-by: huybery <huybery@gmail.com>
2024-03-20 15:09:29 -04:00
Binyuan Hui f99f4ebdaa fix: typo in the evaluation folder name. (#66) 2024-03-20 23:00:09 +08:00
Binyuan Hui a94f3d81cb fix: merge multiple .gitignore to unify management (#61) 2024-03-20 21:35:51 +08:00
geohotstan fd277666f6 use argparse in main.py (#62)
* changed to argparse

* del useless import

* del print statement

* use shortened argument
2024-03-20 09:27:14 -04:00
Robert Brennan 6ff1e52c83 add websocket handshake to server (#57)
* add websocket handshake to server

* Update server/requirements.txt
2024-03-20 09:00:19 -04:00
Jiaxin Pei 60e043ed8b Creating the evaluation folder and adding initial analysis of Devin's evaluation outputs (#50)
* Creating the folder for experiments, adding initial analysis of Devin's outputs on SWE-bench

* typo fixed

* update devin's evaluation outputs analysis

* remove data files from commit

* add logistics for managing the experiment folder in README

* Change folder naming and update logistics.

---------

Co-authored-by: Bowen Li <libowen.ne@gmail.com>
2024-03-20 19:59:51 +08:00
Robert Brennan 1eade7d188 First pass at a control loop (#35)
* initialize control loop

* add todo

* more todo

* add dockerignore

* add notes to prompt

* encourage llm to finish

* add debug env

* update prompts a bit

* fix task prompts

* add basic regression framework

* add hello-world regression case

* add hello-name test case

* fix workspace ignore

* document regression script

* add python-cli test case

* add default git config

* add help regression test

* add node rewrite test case

* add react-todo test case

* fix dockerfile

* add ability to run background commands

* add client-server test case

* update regression readme

* better support for background commands

* update tests

* fix bug in command removal
2024-03-20 18:44:50 +08:00
Junyang Lin 3c18f22fb3 Update README.md 2024-03-20 16:20:58 +08:00
Xingyao Wang dcff11cd2f add Python gitignore (#59) 2024-03-20 16:17:16 +08:00
Jim Su b6699fa047 Bug Fix: UI layout changes as you switch between tabs (#58)
* Fix UI layout change bug

* Reword comment
2024-03-19 22:43:15 -04:00
Jim Su 34c76a52c8 Frontend: Implement browser tab (#51)
* Add browser tab

* Add comments for URL and screenshot state variables
2024-03-19 09:14:20 -04:00
Vikramaditya Singh 1e733b0c71 feat:modified user-message view (#54)
Signed-off-by: Vikramaditya <awesomevikram3@gmail.com>
2024-03-19 08:34:16 -04:00
Pekka Enberg d5c28a47bc Fix frontend index.html head section (#53)
Let's use "OpenDevin" as page title, but also use realfavicongenerator.net to
generate a proper favicon from OpenDevinLogo.jpg.
2024-03-19 07:38:21 -04:00
Suryavir Kapur 9e589731f0 frontend: adds attach button (#55) 2024-03-19 07:35:55 -04:00
Jim Su cdb83c72e2 Add code editor and separate tabs (#43) 2024-03-18 22:47:16 -04:00
Suryavir Kapur cac687508f cra -> vite (#47) 2024-03-18 21:31:51 -04:00
Jim Su cf97b66ff9 Frontend: Implement real terminal with xterm.js (#39)
* Add xterm terminal

* Remove unused code blocks

* Update README.md with terminal documentation

* Update frontend/README.md

Co-authored-by: Graham Neubig <neubig@gmail.com>

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-03-17 23:33:48 -04:00
Robert Brennan 346e992276 add basic websocket server (#41) 2024-03-17 23:25:34 -04:00
Robert Brennan f0ef8203cf add issue templates (#40) 2024-03-17 23:18:29 -04:00
Jim Su 38628c106f Add formatting and linting for typescript components (#32)
* Add formatting and linting for typescript components

* Add lint GitHub Action

* Add push to GitHub action trigger

* Update GitHub action node version

* Add Husky

* Update husky settings

* Fix Husky settings

* Should not pass

* Test

* Test

* Finalize

* Add --legacy-peer-deps flag to npm ci

* Fix lint pre-commit hook

* Compile

* Remove eslint and prettier command quotes
2024-03-17 14:57:20 -04:00
Junyang Lin 2a12619f3a Update README.md 2024-03-17 10:37:02 +08:00
Xiang Yue 3b73809be9 Update README (Xiang) (#28) 2024-03-17 09:33:12 +08:00
Asad Memon 39add27f15 Create MIT LICENSE (#8) 2024-03-16 22:46:04 +08:00
Binyuan Hui f2680fd065 Merge pull request #17 from neubig/neubig/add_prototype_frontend
Add prototype OpenDevin frontend
2024-03-16 21:22:25 +08:00
Graham Neubig 788ba686c9 Add prototype frontend 2024-03-15 20:22:29 -04:00
出蛰 d7b8013472 update: readme 2024-03-13 11:47:10 +08:00
出蛰 7b63deba2d init: project 2024-03-13 11:36:30 +08:00
1967 changed files with 143689 additions and 215287 deletions
-159
View File
@@ -1,159 +0,0 @@
---
name: custom-codereview-guide
description: Repo-specific code review guidelines for OpenHands/software-agent-sdk. Provides SDK-specific review rules in addition to the default code review skill.
triggers:
- /codereview
---
# OpenHands/software-agent-sdk Code Review Guidelines
You are an expert code reviewer for the **OpenHands/software-agent-sdk** repository. This skill provides repo-specific review guidelines. Be direct but constructive.
## Review Decisions
You have permission to **APPROVE** or **COMMENT** on PRs. Do not use REQUEST_CHANGES.
### Review decision policy (eval / benchmark risk)
Do **NOT** submit an **APPROVE** review when the PR changes agent behavior or anything
that could plausibly affect benchmark/evaluation performance.
Examples include: prompt templates, tool calling/execution, planning/loop logic,
memory/condenser behavior, terminal/stdin/stdout handling, or evaluation harness code.
If a PR is in this category (or you are uncertain), leave a **COMMENT** review and
explicitly flag it for a human maintainer to decide after running lightweight evals.
### Default approval policy
**Default to APPROVE**: If your review finds no issues at "important" level or higher,
approve the PR. Minor suggestions or nitpicks alone are not sufficient reason to
withhold approval.
**IMPORTANT:** If you determine a PR is worth merging **and it is not in the eval-risk
category above**, you should approve it. Dont just say a PR is "worth merging" or
"ready to merge" without actually submitting an approval. Your words and actions should
be consistent.
### When to APPROVE
Examples of straightforward and low-risk PRs you should approve (non-exhaustive):
- **Configuration changes**: Adding models to config files, updating CI/workflow settings
- **CI/Infrastructure changes**: Changing runner types, fixing workflow paths, updating job configurations
- **Cosmetic changes**: Typo fixes, formatting, comment improvements, README updates
- **Documentation-only changes**: Docstring updates, clarifying notes, API documentation improvements
- **Simple additions**: Adding entries to lists/dictionaries following existing patterns
- **Test-only changes**: Adding or updating tests without changing production code
- **Dependency updates**: Version bumps with passing CI
### When NOT to APPROVE - Blocking Issues
**DO NOT APPROVE** PRs that have any of the following issues:
- **Package version bumps in non-release PRs**: If any `pyproject.toml` file has changes to the `version` field (e.g., `version = "1.12.0"``version = "1.13.0"`), and the PR is NOT explicitly a release PR (title/description doesn't indicate it's a release), **DO NOT APPROVE**. Version numbers should only be changed in dedicated release PRs managed by maintainers.
- Check: Look for changes to `version = "..."` in any `*/pyproject.toml` files
- Exception: PRs with titles like "release: v1.x.x" or "chore: bump version to 1.x.x" from maintainers
Examples:
- A PR adding a new model to `resolve_model_config.py` or `verified_models.py` with corresponding test updates
- A PR adding documentation notes to docstrings clarifying method behavior (e.g., security considerations, bypass behaviors)
- A PR changing CI runners or fixing workflow infrastructure issues (e.g., standardizing runner types to fix path inconsistencies)
### When to COMMENT
Use COMMENT when you have feedback or concerns:
- Issues that need attention (bugs, security concerns, missing tests)
- Suggestions for improvement
- Questions about design decisions
- Minor style preferences
If there are significant issues, leave detailed comments explaining the concerns—but let a human maintainer decide whether to block the PR.
## Core Principles
1. **Simplicity First**: Question complexity. If something feels overcomplicated, ask "what's the use case?" and seek simpler alternatives. Features should solve real problems, not imaginary ones.
2. **Pragmatic Testing**: Test what matters. Avoid duplicate test coverage. Don't test library features (e.g., `BaseModel.model_dump()`). Focus on the specific logic implemented in this codebase.
3. **Type Safety**: Avoid `# type: ignore` - treat it as a last resort. Fix types properly with assertions, proper annotations, or code adjustments. Prefer explicit type checking over `getattr`/`hasattr` guards.
4. **Backward Compatibility**: Evaluate breaking change impact carefully. Consider API changes that affect existing users, removal of public fields/methods, and changes to default behavior.
## What to Check
- **Complexity**: Over-engineered solutions, unnecessary abstractions, complex logic that could be refactored
- **Testing**: Duplicate test coverage, tests for library features, missing edge case coverage
- **Type Safety**: `# type: ignore` usage, missing type annotations, `getattr`/`hasattr` guards, mocking non-existent arguments
- **Breaking Changes**: API changes affecting users, removed public fields/methods, changed defaults
- **Code Quality**: Code duplication, missing comments for non-obvious decisions, inline imports (unless necessary for circular deps)
- **Repository Conventions**: Use `pyright` not `mypy`, put fixtures in `conftest.py`, avoid `sys.path.insert` hacks
- **Event Type Deprecation**: Changes to event types (Pydantic models used in serialization) must handle deprecated fields properly
## Event Type Deprecation - Critical Review Checkpoint
When reviewing PRs that modify event types (e.g., `TextContent`, `Message`, `Event`, or any Pydantic model used in event serialization), **DO NOT APPROVE** until the following are verified:
### Required for Removing/Deprecating Fields
1. **Model validator present**: If a field is being removed from an event type with `extra="forbid"`, there MUST be a `@model_validator(mode="before")` that uses `handle_deprecated_model_fields()` to remove the deprecated field before validation. Otherwise, old events will fail to load.
2. **Tests for backward compatibility**: The PR MUST include tests that:
- Load an old event format (with the deprecated field) successfully
- Load a new event format (without the deprecated field) successfully
- Verify both can be loaded in sequence (simulating mixed conversations)
3. **Test naming convention**: The version in the test name should be the **LAST version** where a particular event structure exists. For example, if `enable_truncation` was removed in v1.11.1, the test should be named `test_v1_10_0_...` (the last version with that field), not `test_v1_8_0_...` (when it was introduced). This avoids duplicate tests and clearly documents when a field was last present.
**Important**: Deprecated field handlers are **permanent** and should never be removed. They ensure old conversations can always be loaded.
### Example Pattern (Required)
```python
from openhands.sdk.utils.deprecation import handle_deprecated_model_fields
class MyModel(BaseModel):
model_config = ConfigDict(extra="forbid")
# Deprecated fields that are silently removed for backward compatibility
# when loading old events. These are kept permanently.
_DEPRECATED_FIELDS: ClassVar[tuple[str, ...]] = ("old_field_name",)
@model_validator(mode="before")
@classmethod
def _handle_deprecated_fields(cls, data: Any) -> Any:
"""Remove deprecated fields for backward compatibility with old events."""
return handle_deprecated_model_fields(data, cls._DEPRECATED_FIELDS)
```
### Why This Matters
Production systems resume conversations that may contain events serialized with older SDK versions. If the SDK can't load old events, users will see errors like:
```
pydantic_core.ValidationError: Extra inputs are not permitted
```
**This is a production-breaking change.** Do not approve PRs that modify event types without proper backward compatibility handling and tests.
## What NOT to Comment On
Do not leave comments for:
- **Nitpicks**: Minor style preferences, optional improvements, or "nice-to-haves" that don't affect correctness or maintainability
- **Good behavior observed**: Don't comment just to praise code that follows best practices - this adds noise. Simply approve if the code is good.
- **Suggestions for additional tests on simple changes**: For straightforward PRs (config changes, model additions, etc.), don't suggest adding test coverage unless tests are clearly missing for new logic
- **Obvious or self-explanatory code**: Don't ask for comments on code that is already clear
- **`.pr/` directory artifacts**: Files in the `.pr/` directory are temporary PR-specific documents (design notes, analysis, scripts) that are automatically cleaned up when the PR is approved. Do not comment on their presence or suggest removing them.
If a PR is approvable, just approve it. Don't add "one small suggestion" or "consider doing X" comments that delay merging without adding real value.
## Communication Style
- Be direct and concise - don't over-explain
- Use casual, friendly tone ("lgtm", "WDYT?", emojis are fine 👀)
- Ask questions to understand use cases before suggesting changes
- Suggest alternatives, not mandates
- Approve quickly when code is good ("LGTM!")
- Use GitHub suggestion syntax for code fixes
@@ -1,88 +0,0 @@
---
name: debug-test-examples-workflow
description: Guide for debugging failing example tests in the `test-examples` labeled workflow. Use this skill when investigating CI failures in the run-examples.yml workflow, when example scripts fail to run correctly, when needing to isolate specific test failures, or when analyzing workflow logs and failure patterns.
---
# Debugging test-examples Workflow
## Overview
The `run-examples.yml` workflow runs example scripts from `examples/` directory. Triggers:
- Adding `test-examples` label to a PR
- Manual workflow dispatch
- Scheduled nightly runs
## Debugging Steps
### 1. Isolate Failing Tests
Modify `tests/examples/test_examples.py` to focus on specific tests:
```python
_TARGET_DIRECTORIES = (
# EXAMPLES_ROOT / "01_standalone_sdk",
EXAMPLES_ROOT / "02_remote_agent_server", # Keep only failing directory
)
```
### 2. Exclude Tests
Add to `_EXCLUDED_EXAMPLES` with explanation:
```python
_EXCLUDED_EXAMPLES = {
# Reason for exclusion
"examples/path/to/failing_test.py",
}
```
### 3. Trigger Workflow
Toggle the `test-examples` label:
```bash
# Remove label
curl -X DELETE -H "Authorization: token $GITHUB_TOKEN" \
"https://api.github.com/repos/OpenHands/software-agent-sdk/issues/${PR_NUMBER}/labels/test-examples"
# Add label
curl -X POST -H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
"https://api.github.com/repos/OpenHands/software-agent-sdk/issues/{PR_NUMBER}/labels" \
-d '{"labels":["test-examples"]}'
```
### 4. Monitor Progress
```bash
# Check status
curl -s -H "Authorization: token $GITHUB_TOKEN" \
"https://api.github.com/repos/OpenHands/software-agent-sdk/actions/runs/{RUN_ID}" | jq '{status, conclusion}'
# Download logs
curl -sL -H "Authorization: token $GITHUB_TOKEN" \
"https://api.github.com/repos/OpenHands/software-agent-sdk/actions/runs/{RUN_ID}/logs" -o logs.zip
unzip logs.zip -d logs
```
## Common Failure Patterns
| Pattern | Cause | Solution |
|---------|-------|----------|
| Port conflicts | Fixed ports (8010, 8011) | Run with `-n 1` or use different ports |
| Container issues | Docker/Apptainer setup | Check Docker availability, image pulls |
| LLM failures | Transient API errors | Retry the test |
| Example bugs | Code errors | Check traceback |
## Key Configuration
**Workflow** (`.github/workflows/run-examples.yml`):
- Runner: `blacksmith-2vcpu-ubuntu-2404`
- Timeout: 60 minutes
- Parallelism: `-n 4` (pytest-xdist: 4 parallel workers)
**Tests** (`tests/examples/test_examples.py`):
- Timeout per example: 600 seconds
- Target directories: `_TARGET_DIRECTORIES`
- Excluded examples: `_EXCLUDED_EXAMPLES`
-43
View File
@@ -1,43 +0,0 @@
---
name: design-principles
description: Core architectural design principles of the OpenHands Software Agent SDK. Reference when making architectural decisions, reviewing PRs that change agent/tool/state boundaries, or evaluating whether a proposed change aligns with V1 design goals.
---
# SDK Design Principles
Reference: <https://docs.openhands.dev/sdk/arch/design>
## Quick Summary
1. **Optional Isolation over Mandatory Sandboxing**
Sandboxing is opt-in, not universal. Agent and tool execution runs in a single
process by default. When isolation is needed, the same stack can be transparently
containerized.
2. **Stateless by Default, One Source of Truth for State**
All components — agents, tools, LLMs, configurations — are **immutable Pydantic
models** validated at construction. The only mutable entity is the conversation
state. This enables deterministic replay and robust persistence.
3. **Clear Boundaries between Agent and Applications**
Strict separation between SDK (agent core), tools, workspace, and agent server.
Applications communicate via APIs, not by embedding the agent.
4. **Composable Components for Extensibility**
Agents are graphs of interchangeable components — tools, prompts, LLMs, contexts —
described **declaratively with strong typing**. Developers reconfigure capabilities
without modifying core code.
## Implications for Development
- Since agents are immutable Pydantic models, their configuration **is** their
serializable representation. There should be no need to "reverse-engineer" agent
config from runtime instances.
- Tool implementations (callables) are the only non-serializable part; this is solved
by `tool_module_qualnames` for remote forwarding.
- Everything else (system_prompt, model, skills, tool names) is already declarative
data that can be serialized and forwarded directly.
- Avoid patterns that create multiple sources of truth for the same configuration
(e.g., a factory function AND an extracted definition).
- `model_copy(update=...)` should be used sparingly and through well-defined paths to
avoid undermining statelessness.
@@ -1,244 +0,0 @@
---
name: feature-release-rollout
description: This skill should be used when the user asks to "rollout a feature", "complete feature release", "propagate SDK feature", "track feature support", "what's missing for feature X", or mentions checking CLI/GUI/docs/blog support for SDK features. Guides agents through the multi-repository feature release workflow from SDK to docs to marketing.
triggers:
- rollout feature
- feature release
- propagate feature
- feature support
- complete release
- docs for feature
- blog for feature
- CLI support
- GUI support
- what's missing
---
# Feature Release Rollout
This skill guides the complete feature release workflow across the OpenHands ecosystem repositories.
## Overview
When a feature is implemented in the SDK, it may need propagation through several repositories:
1. **SDK** (`OpenHands/software-agent-sdk`) — Core feature implementation
2. **CLI** (`OpenHands/OpenHands-CLI`) — Terminal interface support
3. **GUI** (`OpenHands/OpenHands` frontend directory) — Web interface support
4. **Docs** (`OpenHands/docs`) — Documentation updates (sdk/ folder)
5. **Blog** (`OpenHands/growth-utils` blog-post/) — Marketing and announcements
6. **Video** — Tutorial content (using ElevenLabs + Remotion)
## Workflow
### Phase 1: Feature Discovery
First, identify what feature(s) to analyze. The user may specify:
- A release tag (e.g., `v1.9.0`)
- A specific feature name
- A PR or commit reference
- A comparison between versions
**For release tags:**
```bash
# Clone SDK if not present
git clone https://github.com/OpenHands/software-agent-sdk.git
# View release notes
cd software-agent-sdk
git log --oneline v1.8.0..v1.9.0 # Changes between versions
git show v1.9.0 --stat # What changed in this release
```
**For specific features:**
Search the SDK codebase, examples, and changelog to understand the feature scope.
### Phase 2: Repository Analysis
Clone all relevant repositories to analyze current support:
```bash
# Clone repositories (use GITHUB_TOKEN for authenticated access)
git clone https://github.com/OpenHands/software-agent-sdk.git
git clone https://github.com/OpenHands/OpenHands-CLI.git
git clone https://github.com/OpenHands/OpenHands.git # Frontend in frontend/
git clone https://github.com/OpenHands/docs.git
git clone https://github.com/OpenHands/growth-utils.git
```
For each feature, check support status:
| Repository | Check Location | What to Look For |
|------------|---------------|------------------|
| CLI | `openhands_cli/` | Feature flags, commands, TUI widgets |
| GUI | `OpenHands/frontend/src/` | React components, API integrations |
| Docs | `docs/sdk/` | Guide pages, API reference, examples |
| Blog | `growth-utils/blog-post/posts/` | Announcement posts |
### Phase 3: Assess Feature Importance
Not all features warrant full rollout. Evaluate each feature:
**High Impact (full rollout recommended):**
- New user-facing capabilities
- Breaking changes or migrations
- Major performance improvements
- New integrations or tools
**Medium Impact (docs + selective support):**
- New API methods or parameters
- Configuration options
- Developer experience improvements
**Low Impact (docs only or skip):**
- Internal refactoring
- Bug fixes
- Minor enhancements
**Skip rollout for:**
- Internal-only changes
- Test improvements
- Build/CI changes
- Documentation typos
### Phase 4: Create Proposal
Generate a structured proposal for the user:
```markdown
## Feature Rollout Proposal: [Feature Name]
### Feature Summary
[Brief description of the feature and its value]
### Current Support Status
| Component | Status | Notes |
|-----------|--------|-------|
| SDK | ✅ Implemented | [version/PR] |
| CLI | ❌ Missing | [what's needed] |
| GUI | ⚠️ Partial | [what's implemented vs needed] |
| Docs | ❌ Missing | [suggested pages] |
| Blog | ❌ Not started | [whether warranted] |
| Video | ❌ Not started | [whether warranted] |
### Recommended Actions
1. **CLI**: [specific implementation needed]
2. **GUI**: [specific implementation needed]
3. **Docs**: [pages to create/update]
4. **Blog**: [recommended or not, with reasoning]
5. **Video**: [recommended or not, with reasoning]
### Assessment
- **Overall Priority**: [High/Medium/Low]
- **Effort Estimate**: [days/hours per component]
- **Dependencies**: [what must be done first]
```
### Phase 5: User Confirmation
Wait for explicit user approval before proceeding. Ask:
- Which components to implement
- Priority ordering
- Any modifications to the proposal
### Phase 6: Implementation
Only after user confirmation:
**Create GitHub Issues:**
```bash
# Create issue on relevant repo
gh issue create --repo OpenHands/OpenHands-CLI \
--title "Support [feature] in CLI" \
--body "## Context\n[Feature description]\n\n## Implementation\n[Details]\n\n## Related\n- SDK: [link]\n- Docs: [link]"
```
**Implementation order:**
1. CLI/GUI support (can be parallel)
2. Documentation (depends on 1)
3. Blog post (depends on 2)
4. Video (depends on 3)
## Repository-Specific Guidelines
### CLI (OpenHands/OpenHands-CLI)
- Check `AGENTS.md` for development guidelines
- Use `uv` for dependency management
- Run `make lint` and `make test` before commits
- TUI components in `openhands_cli/tui/`
- Snapshot tests for UI changes
### GUI (OpenHands/OpenHands frontend)
- Frontend in `frontend/` directory
- React/TypeScript codebase
- Run `npm run lint:fix && npm run build` in frontend/
- Follow TanStack Query patterns for data fetching
- i18n translations in `frontend/src/i18n/`
### Docs (OpenHands/docs)
- SDK docs in `sdk/` folder
- Uses Mintlify (`.mdx` files)
- Code blocks can auto-sync from SDK examples
- Run `mint broken-links` to validate
- Follow `openhands/DOC_STYLE_GUIDE.md`
### Blog (OpenHands/growth-utils)
- Posts in `blog-post/posts/YYYYMMDD-title.md`
- Assets in `blog-post/assets/YYYYMMDD-title/`
- Frontmatter format:
```yaml
---
title: "Post Title"
excerpt: "Brief description"
coverImage: "/assets/blog/YYYYMMDD-title/cover.png"
date: "YYYY-MM-DDTHH:MM:SS.000Z"
authors:
- name: Author Name
picture: "/assets/blog/authors/author.png"
ogImage:
url: "/assets/blog/YYYYMMDD-title/cover.png"
---
```
## Example Feature Analysis
**Feature: Browser Session Recording (SDK v1.8.0)**
1. **SDK**: ✅ Implemented in `openhands.tools.browser`
2. **CLI**: ❌ No replay/export commands
3. **GUI**: ❌ No recording viewer component
4. **Docs**: ✅ Guide at `sdk/guides/browser-session-recording.mdx`
5. **Blog**: ❌ Could highlight for web scraping users
6. **Video**: Consider 2-minute demo
**Recommendation**: Medium priority. Docs done, CLI/GUI low urgency (advanced feature), blog post optional.
## Quick Commands
```bash
# Check SDK feature presence
grep -r "feature_name" software-agent-sdk/openhands/ --include="*.py"
# Check CLI support
grep -r "feature_name" OpenHands-CLI/openhands_cli/ --include="*.py"
# Check GUI support
grep -r "featureName" OpenHands/frontend/src/ --include="*.ts" --include="*.tsx"
# Check docs coverage
grep -r "feature" docs/sdk/ --include="*.mdx"
# Check blog mentions
grep -r "feature" growth-utils/blog-post/posts/ --include="*.md"
```
## Important Notes
- Always get user confirmation before creating issues or starting implementation
- Consider feature maturity — new features may change before full rollout
- Cross-reference PRs between repositories in issue descriptions
- For breaking changes, coordinate release timing across all components
-66
View File
@@ -1,66 +0,0 @@
---
name: run-eval
description: Trigger and monitor evaluation runs for benchmarks like SWE-bench, GAIA, and others. Use when running evaluations via GitHub Actions or monitoring eval progress through Datadog and kubectl.
triggers:
- run eval
- trigger eval
- evaluation run
- swebench eval
---
# Running Evaluations
## Trigger via GitHub API
```bash
curl -X POST \
-H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github+json" \
"https://api.github.com/repos/OpenHands/software-agent-sdk/actions/workflows/run-eval.yml/dispatches" \
-d '{
"ref": "main",
"inputs": {
"benchmark": "swebench",
"sdk_ref": "main",
"eval_limit": "50",
"model_ids": "claude-sonnet-4-5-20250929",
"reason": "Description of eval run",
"benchmarks_branch": "main"
}
}'
```
**Key parameters:**
- `benchmark`: `swebench`, `swebenchmultimodal`, `gaia`, `swtbench`, `commit0`, `multiswebench`, `terminalbench`
- `eval_limit`: Any positive integer (e.g., `1`, `10`, `50`, `200`)
- `model_ids`: See `.github/run-eval/resolve_model_config.py` for available models
- `benchmarks_branch`: Use feature branch from the benchmarks repo to test benchmark changes before merging
**Note:** When running a full eval, you must select an `eval_limit` that is greater than or equal to the actual number of instances in the benchmark. If you specify a smaller limit, only that many instances will be evaluated (partial eval).
## Monitoring
**Datadog script** (requires `OpenHands/evaluation` repo; DD_API_KEY, DD_APP_KEY, and DD_SITE environment variables are set):
```bash
DD_API_KEY=$DD_API_KEY DD_APP_KEY=$DD_APP_KEY DD_SITE=$DD_SITE \
python scripts/analyze_evals.py --job-prefix <EVAL_RUN_ID> --time-range 60
# EVAL_RUN_ID format: typically the workflow run ID from GitHub Actions
```
**kubectl** (for users with cluster access - the agent does not have kubectl access):
```bash
kubectl logs -f job/eval-eval-<RUN_ID>-<MODEL_SLUG> -n evaluation-jobs
```
## Common Errors
| Error | Cause | Fix |
|-------|-------|-----|
| `503 Service Unavailable` | Infrastructure overloaded | Ask user to stop some evaluation runs |
| `429 Too Many Requests` | Rate limiting | Wait or reduce concurrency |
| `failed after 3 retries` | Instance failures | Check Datadog logs for root cause |
## Limits
- Max 256 parallel runtimes (jobs will queue if this limit is exceeded)
- Full evals typically take 1-3 hours depending on benchmark size
-117
View File
@@ -1,117 +0,0 @@
---
name: write-behavior-test
description: Guide for writing behavior tests that verify agents follow system message guidelines and avoid undesirable behaviors. Use when creating integration tests for agent behavior validation.
triggers:
- /write_behavior_test
---
# Behavior Test Writing Guide
You are helping to create **behavior tests** for the agent-sdk integration test suite. These tests verify that agents follow system message guidelines and avoid undesirable behaviors.
The tests are for the agent powered by this SDK, so you may need to refer the codebase for details on how the agent works in order to write effective tests.
## Behavior Tests vs Task Tests
**Task Tests (t*.py)** - REQUIRED tests that verify task completion:
- Focus: Can the agent successfully complete the task?
- Example: Fix typos in a file, create a script, implement a feature
**Behavior Tests (b*.py)** - OPTIONAL tests that verify proper behavior:
- Focus: Does the agent follow best practices and system guidelines?
- Example: Don't implement when asked for advice, don't over-verify, avoid redundant files
## Key Principles for Writing Behavior Tests
### ✅ DO:
1. **Use Real Repositories**
- Clone actual GitHub repositories that represent real-world scenarios
- Pin to a specific historical commit (before a fix/feature was added)
- Example: `clone_pinned_software_agent_repo(workspace)` helper
2. **Test Realistic Complex, Nuanced Behaviors**
- Try to make the task as realistic as possible to real HUMAN interactions, from file naming, (somewhat lazy) instruction style, etc
- Focus on subtle behavioral issues that require judgment
- Test scenarios where the "right" behavior isn't immediately obvious
- Examples: When to implement vs advise, when to stop testing, whether to add backward compatibility
3. **Clean Up Repository History**
- Check out to a commit BEFORE the solution exists
- Reset/remove future commits (see existing tests for examples)
- Ensures the agent experiences the same context as real users
4. **Use Helper Functions**
- `find_file_editing_operations(events)` - Find file create/edit operations
- `find_tool_calls(events, tool_name)` - Find specific tool usage
- `get_conversation_summary(events)` - Get summary for LLM judge
- `judge_agent_behavior(...)` - Use LLM to evaluate behavior quality
5. **Leverage LLM Judges**
- Use `judge_agent_behavior()` for subjective evaluations
- Provide clear evaluation criteria in the judge prompt
- Track judge usage costs: `self.add_judge_usage(prompt_tokens, completion_tokens, cost)`
6. **Adaptation of Problem Description to Task**
- If you find the problem description is not easy to adapt to a behavior test, e.g. it requires complex environment setup like kubernetes, try to come up with a simpler problem description that still captures the essence of the behavior you want to test but is easier to implement in the test framework.
- Ensure the instructions naturally lead to the behavior you want to evaluate
### ❌ DO NOT:
1. **Avoid Simple Synthetic Tests**
- Don't create artificial scenarios with minimal setup
- Don't test behaviors that are too obvious or straightforward
- Example: Don't create a single-file test with trivial content
2. **Don't Test Basic Functionality**
- Behavior tests are NOT for testing if the agent can use tools
- Task tests handle basic capability verification
- Focus on HOW the agent approaches problems, not IF it can solve them
3. **Don't Overcomplicate Static Assertions**
- Use assertions for clear-cut checks (e.g., no file edits)
- Rely on LLM judges for nuanced behavior evaluations
- Avoid trying to encode subjective judgments purely in code or too much static logic
## Tips for Test Difficulty Calibration
**Make tests challenging but not impossible and too long:**
1. **Context Complexity**: Use real codebases with multiple files and dependencies, either the software-agent-sdk or other popular open-source repos you find suitable
2. **Ambiguity**: Prefer instructions that could be interpreted multiple ways
3. **Temptation**: Set up scenarios where the "easy wrong path" is tempting
4. **Realism**: Mirror real user interactions and expectations
**Examples of Good Complexity:**
- "How to implement X?" (tests if agent implements vs advises)
- "Update constant Y" (tests if agent over-verifies with excessive test runs)
- "Rename method A to B" (tests if agent adds unnecessary backward compatibility)
## Example Behavior Test Patterns
1. **Premature Implementation** - Tests if agent implements when asked for advice only
2. **Over-verification** - Tests if agent runs excessive tests beyond what's needed
3. **Unnecessary Compatibility** - Tests if agent adds backward compatibility shims when not needed
4. **Redundant Artifacts** - Tests if agent creates extra files (docs, READMEs) without being asked
5. **Communication Quality** - Tests if agent provides explanations for actions
## File Naming Convention
Name your test file: `b##_descriptive_name.py`
- `b` prefix indicates behavior test (auto-detected)
- `##` is a zero-padded number (e.g., 01, 02, 03)
- Use snake_case for the descriptive name
## Final Checklist
Before submitting your behavior test, verify:
- [ ] Uses a real repository or complex codebase
- [ ] Tests a nuanced behavior, not basic functionality
- [ ] Includes clear and not overly complex verification logic (assertions or LLM judge)
- [ ] Has a descriptive docstring explaining what behavior is tested
- [ ] Properly tracks judge usage costs if using LLM evaluation
- [ ] Follows naming convention: `b##_descriptive_name.py`
- [ ] Test is realistic and based on actual behavioral issues observed
Remember: The goal is to catch subtle behavioral issues that would appear in real-world usage, serving as regression tests for system message improvements.
+1
View File
@@ -0,0 +1 @@
The files in this directory configure a development container for GitHub Codespaces.
+15
View File
@@ -0,0 +1,15 @@
{
"name": "OpenHands Codespaces",
"image": "mcr.microsoft.com/devcontainers/universal",
"customizations":{
"vscode":{
"extensions": [
"ms-python.python"
]
}
},
"onCreateCommand": "sh ./.devcontainer/on_create.sh",
"postCreateCommand": "make build",
"postStartCommand": "USE_HOST_NETWORK=True nohup bash -c 'make run &'"
}
+6
View File
@@ -0,0 +1,6 @@
#!/usr/bin/env bash
sudo apt update
sudo apt install -y netcat
sudo add-apt-repository -y ppa:deadsnakes/ppa
sudo apt install -y python3.12
curl -sSL https://install.python-poetry.org | python3.12 -
+4 -260
View File
@@ -1,261 +1,5 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
# Note: We keep our custom spec file in version control
# *.spec
# PyInstaller build directories
build/
dist/
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
# poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be added to the global gitignore or merged into this project gitignore. For a PyCharm
# project, it is recommended to ignore the entire .idea directory.
.idea/
# VS Code
.vscode/
# macOS
.DS_Store
.AppleDouble
.LSOverride
# Windows
Thumbs.db
ehthumbs.db
Desktop.ini
$RECYCLE.BIN/
# Linux
*~
# Temporary files
*.tmp
*.temp
*.swp
*.swo
# UV specific
.uv/
# Project specific
*.log
.coverage
.pytest_cache/
workspace/
.client
.docker
.git
.git/**
# VS Code: Ignore all but certain files that specify repo-specific settings.
# https://stackoverflow.com/questions/32964920/should-i-commit-the-vscode-folder-to-source-control
.vscode/**/*
!.vscode/extensions.json
!.vscode/tasks.json
# VS Code extensions/forks:
.cursorignore
.rooignore
.clineignore
.windsurfignore
.cursorrules
.roorules
.clinerules
.windsurfrules
.cursor/rules
.roo/rules
.cline/rules
.windsurf/rules
.repomix
repomix-output.txt
# misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local
npm-debug.log*
yarn-debug.log*
yarn-error.log*
logs
# agent
frontend/node_modules
config.toml
.envrc
cache
.jinja_cache/
.conversations*
workspace/
# Build optimization: exclude files not needed for building agent-server
tests/
*.log
.github/
scripts/
examples/
.ruff_cache/
.uv-cache/
Makefile
docs/
*.md
!README.md
.pre-commit-config.yaml
.python-version
.env
.git
+1
View File
@@ -0,0 +1 @@
*.ipynb linguist-vendored
+19
View File
@@ -0,0 +1,19 @@
codecov:
notify:
wait_for_ci: true
# our project is large, so 6 builds are typically uploaded. this waits till 5/6
# See https://docs.codecov.com/docs/notifications#section-preventing-notifications-until-after-n-builds
after_n_builds: 5
coverage:
status:
patch:
default:
threshold: 100% # allow patch coverage to be lower than project coverage by any amount
project:
default:
threshold: 5% # allow project coverage to drop at most 5%
comment: false
github_checks:
annotations: false
+52 -159
View File
@@ -1,168 +1,61 @@
---
name: Bug
description: Report a problem with OpenHands SDK
description: Report a problem with OpenHands
title: '[Bug]: '
labels: [bug]
labels: ['bug']
body:
- type: markdown
attributes:
value: |
## Thank you for reporting a bug! 🐛
- type: markdown
attributes:
value: Thank you for taking the time to fill out this bug report. Please provide as much information as possible to help us understand and address the issue effectively.
**Please fill out all required fields.** Issues missing critical information (version, installation method, reproduction steps, etc.) will be delayed or closed until complete details are provided.
- type: checkboxes
attributes:
label: Is there an existing issue for the same bug?
description: Please check if an issue already exists for the bug you encountered.
options:
- label: I have checked the existing issues.
required: true
Clear, detailed reports help us resolve issues faster.
- type: textarea
id: bug-description
attributes:
label: Describe the bug and reproduction steps
description: Provide a description of the issue along with any reproduction steps.
validations:
required: true
- type: checkboxes
attributes:
label: Is there an existing issue for the same bug?
description: Please search existing issues before creating a new one. If found, react or comment to the duplicate issue instead of making a
new one. <!-- TODO-openhands -->
options:
- label: I have searched existing issues and this is not a duplicate.
required: true
- type: dropdown
id: installation
attributes:
label: OpenHands Installation
description: How are you running OpenHands?
options:
- Docker command in README
- Development workflow
- app.all-hands.dev
- Other
default: 0
- type: textarea
id: bug-description
attributes:
label: Bug Description
description: Clearly describe what went wrong. Be specific and concise.
placeholder: Example - When I use the SDK to create an agent with custom tools, the agent fails to register the tools with a TypeError.
validations:
required: true
- type: input
id: openhands-version
attributes:
label: OpenHands Version
description: What version of OpenHands are you using?
placeholder: ex. 0.9.8, main, etc.
- type: textarea
id: expected-behavior
attributes:
label: Expected Behavior
description: What did you expect to happen?
placeholder: Example - The agent should successfully register custom tools and make them available for use.
validations:
required: false
- type: dropdown
id: os
attributes:
label: Operating System
options:
- MacOS
- Linux
- WSL on Windows
- type: textarea
id: actual-behavior
attributes:
label: Actual Behavior
description: What actually happened?
placeholder: "Example - TypeError: 'NoneType' object is not iterable when calling agent.register_tool()"
validations:
required: false
- type: textarea
id: reproduction-steps
attributes:
label: Steps to Reproduce
description: Provide clear, step-by-step instructions to reproduce the bug.
placeholder: |
1. Install openhands-sdk using pip
2. Import and create an agent instance
3. Define a custom tool function
4. Call agent.register_tool(custom_tool)
5. Error appears
validations:
required: false
- type: input
id: installation
attributes:
label: Installation Method
description: How did you install the OpenHands SDK?
placeholder: ex. pip install openhands-sdk, uv pip install openhands-sdk, pip install -e ., etc.
- type: input
id: installation-other
attributes:
label: If you selected "Other", please specify
description: Describe your installation method
placeholder: ex. Poetry, conda, custom setup, etc.
- type: input
id: sdk-version
attributes:
label: SDK Version
description: What version are you using? Check with `pip show openhands-sdk` or similar for other packages.
placeholder: ex. 0.1.0, 0.2.0, main branch, commit hash, etc.
validations:
required: false
- type: checkboxes
id: version-confirmation
attributes:
label: Version Confirmation
description: Bugs on older versions may already be fixed. Please upgrade before submitting.
options:
- label: I have confirmed this bug exists on the LATEST version of OpenHands SDK
required: false
- type: input
id: python-version
attributes:
label: Python Version
description: Which Python version are you using?
placeholder: ex. 3.10.12, 3.11.5, 3.12.0
validations:
required: false
- type: input
id: model-name
attributes:
label: Model Name (if applicable)
description: Which model(s) are you using?
placeholder: ex. gpt-4o, claude-3-5-sonnet-20241022, openrouter/deepseek-r1, etc.
validations:
required: false
- type: dropdown
id: os
attributes:
label: Operating System
options:
- MacOS
- Linux
- WSL on Windows
- Windows
- Other
validations:
required: false
- type: textarea
id: logs
attributes:
label: Logs and Error Messages
description: |
**Paste relevant logs, error messages, or stack traces.** Use code blocks (```) for formatting.
Include full stack traces when available.
placeholder: |
```
Paste error logs here
```
- type: textarea
id: code-sample
attributes:
label: Minimal Code Sample
description: |
If possible, provide a minimal code sample that reproduces the issue.
placeholder: |
```python
from openhands.sdk import Agent
# Your minimal reproducible code here
```
- type: textarea
id: additional-context
attributes:
label: Screenshots and Additional Context
description: |
Add screenshots, environment details, dependency versions, or other context that helps explain the issue.
placeholder: Drag and drop screenshots here, paste links, or add additional context.
- type: markdown
attributes:
value: |
---
**Note:** Please help us help you! Well-documented bugs are easier to reproduce and fix. Thank you for your understanding!
- type: textarea
id: additional-context
attributes:
label: Logs, Errors, Screenshots, and Additional Context
description: Please provide any additional information you think might help. If you want to share the chat history
you can click the thumbs-down (👎) button above the input field and you will get a shareable link
(you can also click thumbs up when things are going well of course!). LLM logs will be stored in the
`logs/llm/default` folder. Please add any additional context about the problem here.
+18
View File
@@ -0,0 +1,18 @@
---
name: Feature Request
about: Suggest an idea for OpenHands features
title: ''
labels: 'enhancement'
assignees: ''
---
**What problem or use case are you trying to solve?**
**Describe the UX of the solution you'd like**
**Do you have thoughts on the technical implementation?**
**Describe alternatives you've considered**
**Additional context**
-117
View File
@@ -1,117 +0,0 @@
---
name: Feature Request or Enhancement
description: Suggest a new feature or improvement for OpenHands SDK
title: '[Feature]: '
labels: [enhancement]
body:
- type: markdown
attributes:
value: |
## Thank you for suggesting a feature! 💡
We encourage you to open the discussion on the feature you need. You are always welcome to implement it, if you wish.
- type: checkboxes
attributes:
label: Is there an existing feature request for this?
description: Please search existing issues and feature requests before creating a new one. If found, react or comment to the duplicate issue
instead of making a new one. <!-- TODO-openhands -->
options:
- label: I have searched existing issues and feature requests, and this is not a duplicate.
required: true
- type: textarea
id: problem-statement
attributes:
label: Problem or Use Case
description: What problem are you trying to solve? What use case would this feature enable?
placeholder: |
Example - As a developer building agents, I need to persist agent state between sessions. Currently, there's no built-in mechanism for saving and loading agent memory, which means agents lose context when the process restarts.
validations:
required: true
- type: textarea
id: proposed-solution
attributes:
label: Proposed Solution
description: Describe your ideal solution. What should this feature do? How should it work?
placeholder: |
Example - Add a StateManager class that allows saving and loading agent state to/from disk or database. Provide methods like save_state(), load_state(), and clear_state(). Support multiple backend options (JSON files, SQLite, Redis, etc.).
validations:
required: true
- type: textarea
id: alternatives
attributes:
label: Alternatives Considered
description: Have you considered any alternative solutions or workarounds? What are their limitations?
placeholder: Example - I tried manually serializing agent state using pickle, but it's not portable across SDK versions and doesn't handle
complex tool state properly.
- type: dropdown
id: priority
attributes:
label: Priority / Severity
description: How important is this feature to your workflow?
options:
- Critical - Blocking my work, no workaround available
- High - Significant impact on productivity
- Medium - Would improve experience
- Low - Nice to have
default: 2
validations:
required: true
- type: dropdown
id: scope
attributes:
label: Estimated Scope
description: To the best of your knowledge, how complex do you think this feature would be to implement?
options:
- Small - API addition, config option, or minor change
- Medium - New feature with moderate complexity
- Large - Significant feature requiring architecture changes
- Unknown - Not sure about the technical complexity
default: 3
- type: checkboxes
id: feature-area
attributes:
label: Feature Area
description: Which part of OpenHands SDK does this feature relate to? If you select "Other", please specify the area in the Additional
Context section below. <!-- TODO-openhands -->
options:
- label: Agent API / Core functionality
- label: Tools / Tool system
- label: Skills / Plugins
- label: Agent Server
- label: Workspace management
- label: Configuration / Settings
- label: Examples / Templates
- label: Documentation
- label: Testing / Development tools
- label: Performance / Optimization
- label: Integrations (GitHub, APIs, etc.)
- label: Other
- type: textarea
id: technical-details
attributes:
label: Technical Implementation Ideas (Optional)
description: If you have technical expertise, share implementation ideas, API suggestions, or relevant technical details.
placeholder: |
Example - Could implement StateManager as an abstract base class with concrete implementations for different backends. Add state_manager parameter to Agent constructor. Use JSON serialization for simple state, MessagePack for better performance.
- type: textarea
id: additional-context
attributes:
label: Additional Context
description: Add any other context, code examples, API mockups, or references that help illustrate this feature request.
placeholder: |
Example code or API design:
```python
from openhands.sdk import Agent, StateManager
agent = Agent(state_manager=StateManager('file://agent_state.json'))
agent.save_state()
```
@@ -0,0 +1,18 @@
---
name: Technical Proposal
about: Propose a new architecture or technology
title: ''
labels: 'proposal'
assignees: ''
---
**Summary**
**Motivation**
**Technical Design**
**Alternatives to Consider**
**Additional context**
-11
View File
@@ -1,11 +0,0 @@
## Summary
[fill in a summary of this PR]
## Checklist
- [ ] If the PR is changing/adding functionality, are there tests to reflect this?
- [ ] If there is an example, have you run the example to make sure that it works?
- [ ] If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
- [ ] If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
- [ ] Is the github CI passing?
+67 -15
View File
@@ -1,17 +1,69 @@
---
# Dependabot configuration for automated dependency updates
# See: https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
#
# Note: Python (pip) ecosystem is not configured here because Dependabot does not
# fully support uv workspaces yet. See issue #2510 for tracking.
version: 2
updates:
# GitHub Actions
- package-ecosystem: github-actions
directory: /
schedule:
interval: weekly
commit-message:
prefix: chore(deps)
- package-ecosystem: "pip"
directory: "/"
schedule:
interval: "daily"
open-pull-requests-limit: 1
groups:
# put packages in their own group if they have a history of breaking the build or needing to be reverted
pre-commit:
patterns:
- "pre-commit"
llama:
patterns:
- "llama*"
chromadb:
patterns:
- "chromadb"
security-all:
applies-to: "security-updates"
patterns:
- "*"
version-all:
applies-to: "version-updates"
patterns:
- "*"
- package-ecosystem: "npm"
directory: "/frontend"
schedule:
interval: "daily"
open-pull-requests-limit: 1
groups:
docusaurus:
patterns:
- "*docusaurus*"
eslint:
patterns:
- "*eslint*"
security-all:
applies-to: "security-updates"
patterns:
- "*"
version-all:
applies-to: "version-updates"
patterns:
- "*"
- package-ecosystem: "npm"
directory: "/docs"
schedule:
interval: "weekly"
day: "wednesday"
open-pull-requests-limit: 1
groups:
docusaurus:
patterns:
- "*docusaurus*"
eslint:
patterns:
- "*eslint*"
security-all:
applies-to: "security-updates"
patterns:
- "*"
version-all:
applies-to: "version-updates"
patterns:
- "*"
-109
View File
@@ -1,109 +0,0 @@
# Documentation Update Prompt
You are a world-class documentation writer tasked with keeping the OpenHands Agent SDK documentation accurate and up-to-date. Your goal is to ensure documentation reflects the current codebase and provides clear, minimal, and actionable guidance.
## Core Objectives
1. **Accuracy**: Ensure all documentation matches the current codebase
2. **Completeness**: Include all available tools and core components
3. **Clarity**: Keep examples simple, working, and easy to understand
4. **Navigation**: Provide source code links for all definitions
## Tasks to Perform
### 1. Codebase Analysis
- Scan `examples/` for available examples
- Scan `openhands-tools/` for all available runtime tools
- Check `openhands-sdk/openhands/tool/builtins/` for built-in tools
- Identify any new tools or removed tools since last update
### 2. Documentation Review
Review these key files for accuracy:
- `docs/architecture/overview.md` - High-level component interactions and design principles
- `docs/architecture/tool.md` - Tool system, inheritance, and MCP integration
- `docs/architecture/agent.md` - Agent architecture and execution flow
- `docs/architecture/llm.md` - LLM integration and capabilities
- `docs/architecture/conversation.md` - Conversation interface and persistence
- `docs/getting-started.mdx` - Make sure we have descriptions of all examples listed out in `examples/`
- `docs/index.md` - Overview and navigation
- `README.md` - Root project documentation
### 3. Content Updates Required
#### Architecture Diagrams
- Keep mermaid diagrams SIMPLE and READABLE across all docs/architecture/ files
- Focus on core components and relationships, not every possible class
- Include all current runtime tools: TerminalTool, FileEditorTool, TaskTrackerTool, etc.
- Verify component interactions and inheritance reflect actual codebase structure
#### Tool Documentation
For each tool, ensure:
- Accurate usage examples with `.create()` method
- Working code snippets (test them!)
- Source code links to GitHub
- Clear descriptions of functionality
#### Core Framework Classes
Verify documentation across docs/architecture/ files for:
- `Tool`, `ActionBase`, `ObservationBase`, `ToolExecutor` (docs/architecture/tool.md)
- `Agent`, `AgentBase`, system prompts (docs/architecture/agent.md)
- `LLM`, message types, provider support (docs/architecture/llm.md)
- `Conversation`, `ConversationState`, event system (docs/architecture/conversation.md)
- All built-in tools: `FinishTool`, `ThinkTool`
- All runtime tools: `TerminalTool`, `FileEditorTool`, `TaskTrackerTool`
### 4. Verification Steps
- Test all documented code examples to ensure they work
- Verify all GitHub source links are correct and accessible
- Check that simplified and advanced usage patterns are accurate
- Ensure cross-references between files are consistent
### 5. Documentation Standards
- **Style**: Direct, lean, technical writing
- **Structure**: Clear sections answering specific user questions
- **Examples**: Show working code rather than vague descriptions
- **Links**: Include GitHub source links for all classes and tools
- **Diagrams**: Simple, focused mermaid charts
## Expected Deliverables
1. Updated documentation files with current tool listings
2. Verified working code examples
3. Simplified and accurate architecture diagrams
4. Complete source code links for all definitions
5. Consistent cross-references across all documentation files
## Quality Checklist
- [ ] All runtime tools are documented with working examples
- [ ] All built-in tools are listed and linked
- [ ] Architecture diagrams are simple and current
- [ ] All code examples have been tested and work
- [ ] Source code links point to correct GitHub files
- [ ] Documentation follows minimal, clear writing style
- [ ] Cross-references between files are consistent
## Commit Message Format
If you think there's change required, please create a pull request.
```
Update documentation to reflect current codebase
- [Specific changes made]
- [Tools added/removed/updated]
- [Diagrams simplified/corrected]
- [Examples verified/fixed]
Co-authored-by: openhands <openhands@all-hands.dev>
```
Focus on making the documentation immediately useful for developers who need to understand and use the OpenHands Tools System.
+11
View File
@@ -0,0 +1,11 @@
**End-user friendly description of the problem this fixes or functionality that this introduces**
- [ ] Include this change in the Release Notes. If checked, you must provide an **end-user friendly** description for your change below
---
**Give a summary of what the PR does, explaining any non-trivial design decisions**
---
**Link of any specific issues this addresses**
-381
View File
@@ -1,381 +0,0 @@
# Adding Models to resolve_model_config.py
## Overview
This file (`resolve_model_config.py`) defines models available for evaluation. Models must be added here before they can be used in integration tests or evaluations.
## Critical Rules
**ONLY ADD NEW CONTENT - DO NOT MODIFY EXISTING CODE**
### What NOT to Do
1. **Never modify existing model entries** - they are production code, already working
2. **Never modify existing tests** - especially test assertions, mock configs, or expected values
3. **Never reformat existing code** - preserve exact spacing, quotes, commas, formatting
4. **Never reorder models or imports** - dictionary and import order must be preserved
5. **Never "fix" existing code** - if it's in the file and tests pass, it works
6. **Never change test assertions** - even if they "look wrong" to you
7. **Never replace real model tests with mocked tests** - weakens validation
8. **Never fix import names** - if `test_model` exists, don't change it to `check_model`
### What These Rules Prevent
**Example violations** (all found in real PRs):
- Changing `assert result[0]["id"] == "claude-sonnet-4-5-20250929"` to `"gpt-4"`
- Replacing real model config tests with mocked/custom model tests ❌
- "Fixing" `from resolve_model_config import test_model` to `check_model`
- Adding "Fixed incorrect assertions" without explaining what was incorrect ❌
- Claiming to "fix test issues" when tests were already passing ❌
### What TO Do
**When adding a model**:
- Add ONE new entry to the MODELS dictionary
- Add ONE new test function (follow existing pattern exactly)
- Add to feature lists in model_features.py ONLY if needed for your model
- Do not touch any other files, tests, imports, or configurations
- Test the PR branch with the integration test action.
- Add a link to the integrations test to the PR.
- If you think something is broken, it's probably not - add a comment to the PR.
## Files to Modify
1. **Always required**:
- `.github/run-eval/resolve_model_config.py` - Add model configuration
- `tests/github_workflows/test_resolve_model_config.py` - Add test
2. **Usually required** (if model has special characteristics):
- `openhands-sdk/openhands/sdk/llm/utils/model_features.py` - Add to feature categories
3. **Sometimes required**:
- `openhands-sdk/openhands/sdk/llm/utils/model_prompt_spec.py` - GPT models only (variant detection)
- `openhands-sdk/openhands/sdk/llm/utils/verified_models.py` - Production-ready models
> ⚠️ **When editing `verified_models.py`**: If you add a model to `VERIFIED_OPENHANDS_MODELS`,
> you **must also** add it to its provider-specific list (e.g. `VERIFIED_ANTHROPIC_MODELS`,
> `VERIFIED_GEMINI_MODELS`, `VERIFIED_MOONSHOT_MODELS`, etc.).
> If no list exists for the provider yet, create one and add it to the `VERIFIED_MODELS` dict.
> This ensures the model appears under its actual provider in the UI, not just under "openhands".
## Step 1: Add to resolve_model_config.py
Add entry to `MODELS` dictionary:
```python
"model-id": {
"id": "model-id", # Must match dictionary key
"display_name": "Human Readable Name",
"llm_config": {
"model": "litellm_proxy/provider/model-name",
"temperature": 0.0, # See temperature guide below
},
},
```
### Temperature Configuration
| Value | When to Use | Provider Requirements |
|-------|-------------|----------------------|
| `0.0` | Standard deterministic models | Most providers |
| `1.0` | Reasoning models | Kimi K2, MiniMax M2.5 |
| `None` | Use provider default | When unsure |
### Special Parameters
Add only if needed:
- **`disable_vision: True`** - Model doesn't support vision despite LiteLLM reporting it does (GLM-4.7, GLM-5)
- **`reasoning_effort: "high"`** - For OpenAI reasoning models that support this parameter
- **`max_tokens: <value>`** - To prevent hangs or control output length
- **`top_p: <value>`** - Nucleus sampling (cannot be used with `temperature` for Claude models)
- **`litellm_extra_body: {...}`** - Provider-specific parameters (e.g., `{"enable_thinking": True}`)
### Critical Rules
1. Model ID must match dictionary key
2. Model path must start with `litellm_proxy/`
3. **Claude models**: Cannot use both `temperature` and `top_p` - choose one or omit both
4. Parameters like `disable_vision` must be in `SDK_ONLY_PARAMS` constant (they're filtered before sending to LiteLLM)
## Step 2: Update model_features.py (if applicable)
Check provider documentation to determine which feature categories apply:
### REASONING_EFFORT_MODELS
Models that support `reasoning_effort` parameter:
- OpenAI: o1, o3, o4, GPT-5 series
- Anthropic: Claude Opus 4.5+, Claude Sonnet 4.6
- Google: Gemini 2.5+, Gemini 3.x series
- AWS: Nova 2 Lite
```python
REASONING_EFFORT_MODELS: list[str] = [
"your-model-identifier", # Add here
]
```
**Effect**: Automatically strips `temperature` and `top_p` parameters to avoid API conflicts.
### EXTENDED_THINKING_MODELS
Models with extended thinking capabilities:
- Anthropic: Claude Sonnet 4.5+, Claude Haiku 4.5
```python
EXTENDED_THINKING_MODELS: list[str] = [
"your-model-identifier", # Add here
]
```
**Effect**: Automatically strips `temperature` and `top_p` parameters.
### PROMPT_CACHE_MODELS
Models supporting prompt caching:
- Anthropic: Claude 3.5+, Claude 4+ series
```python
PROMPT_CACHE_MODELS: list[str] = [
"your-model-identifier", # Add here
]
```
### SUPPORTS_STOP_WORDS_FALSE_MODELS
Models that **do not** support stop words:
- OpenAI: o1, o3 series
- xAI: Grok-4, Grok-code-fast-1
- DeepSeek: R1 family
```python
SUPPORTS_STOP_WORDS_FALSE_MODELS: list[str] = [
"your-model-identifier", # Add here
]
```
### FORCE_STRING_SERIALIZER_MODELS
Models requiring string format for tool messages (not structured content):
- DeepSeek models
- GLM models
- Groq: Kimi K2-Instruct
- OpenRouter: MiniMax
Use pattern matching:
```python
FORCE_STRING_SERIALIZER_MODELS: list[str] = [
"deepseek", # Matches any model with "deepseek" in name
"groq/kimi-k2-instruct", # Provider-prefixed
]
```
### Other Categories
- **PROMPT_CACHE_RETENTION_MODELS**: GPT-5 family, GPT-4.1
- **RESPONSES_API_MODELS**: GPT-5 family, codex-mini-latest
- **SEND_REASONING_CONTENT_MODELS**: Kimi K2 Thinking/K2.5, MiniMax-M2, DeepSeek Reasoner
See `model_features.py` for complete lists and additional documentation.
## Step 3: Add Test
**File**: `tests/github_workflows/test_resolve_model_config.py`
**Important**:
- Python function names cannot contain hyphens. Convert model ID hyphens to underscores.
- **Do not modify any existing test functions** - only add your new one at the end of the file
- **Do not change existing imports** - use what's already there
- **Do not fix "incorrect" assertions** in other tests - they are correct
**Test template** (copy and modify for your model):
```python
def test_your_model_id_config(): # Replace hyphens with underscores in function name
"""Test that your-model-id has correct configuration."""
model = MODELS["your-model-id"] # Dictionary key keeps hyphens
assert model["id"] == "your-model-id"
assert model["display_name"] == "Your Model Display Name"
assert model["llm_config"]["model"] == "litellm_proxy/provider/model-name"
# Only add assertions for parameters YOU added in resolve_model_config.py
# assert model["llm_config"]["temperature"] == 0.0
# assert model["llm_config"]["disable_vision"] is True
```
**What NOT to do in tests**:
- Don't change assertions in other test functions (even if model names "look wrong")
- Don't replace real model tests with mocked tests
- Don't change `test_model` to `check_model` in imports
- Don't modify mock_models dictionaries in other tests
- Don't add "fixes" to existing tests - they work as-is
## Step 4: Update GPT Variant Detection (GPT models only)
**File**: `openhands-sdk/openhands/sdk/llm/utils/model_prompt_spec.py`
Required only if this is a GPT model needing specific prompt template.
**Order matters**: More specific patterns must come before general patterns.
```python
_MODEL_VARIANT_PATTERNS: dict[str, tuple[tuple[str, tuple[str, ...]], ...]] = {
"openai_gpt": (
(
"gpt-5-codex", # Specific variant first
("gpt-5-codex", "gpt-5.1-codex", "gpt-5.2-codex", "gpt-5.3-codex"),
),
("gpt-5", ("gpt-5", "gpt-5.1", "gpt-5.2")), # General variant last
),
}
```
## Step 5: Run Tests Locally
```bash
# Pre-commit checks
pre-commit run --all-files
# Unit tests
pytest tests/github_workflows/test_resolve_model_config.py::test_your_model_config -v
# Manual verification
cd .github/run-eval
MODEL_IDS="your-model-id" GITHUB_OUTPUT=/tmp/output.txt python resolve_model_config.py
```
## Step 6: Run Integration Tests (Required Before PR)
**Mandatory**: Integration tests must pass before creating PR.
### Via GitHub Actions
1. Push branch: `git push origin your-branch-name`
2. Navigate to: https://github.com/OpenHands/software-agent-sdk/actions/workflows/integration-runner.yml
3. Click "Run workflow"
4. Configure:
- **Branch**: Select your branch
- **model_ids**: `your-model-id`
- **Reason**: "Testing model-id"
5. Wait for completion
6. **Save run URL** - required for PR description
### Expected Results
- Success rate: 100% (or 87.5% if vision test skipped)
- Duration: 5-10 minutes per model
- Tests: 8 total (basic commands, file ops, code editing, reasoning, errors, tools, context, vision)
## Step 7: Create PR
### Required in PR Description
```markdown
## Summary
Adds the `model-id` model to resolve_model_config.py.
## Changes
- Added model-id to MODELS dictionary
- Added test_model_id_config() test function
- [Only if applicable] Added to [feature category] in model_features.py
## Configuration
- Model ID: model-id
- Provider: Provider Name
- Temperature: [value] - [reasoning for choice]
- [List any special parameters and why needed]
## Integration Test Results
✅ Integration tests passed: [PASTE GITHUB ACTIONS RUN URL]
[Summary table showing test results]
Fixes #[issue-number]
```
### What NOT to Include in PR Description
**Do not claim to have "fixed" things unless they were actually broken**:
- ❌ "Fixed test_model import issue" (if tests were passing, there was no issue)
- ❌ "Fixed incorrect assertions in existing tests" (they were correct)
- ❌ "Improved test coverage" (unless you actually added new test cases)
- ❌ "Cleaned up code" (you shouldn't be cleaning up anything)
- ❌ "Updated test approach" (you shouldn't be changing testing approach)
**Only describe what you actually added**:
- ✅ "Added gpt-5.3-codex model configuration"
- ✅ "Added test for gpt-5.3-codex"
- ✅ "Added gpt-5.3-codex to REASONING_EFFORT_MODELS"
## Common Issues
### Integration Tests Hang (6-8+ hours)
**Causes**:
- Missing `max_tokens` parameter
- Claude models with both `temperature` and `top_p` set
- Model not in REASONING_EFFORT_MODELS or EXTENDED_THINKING_MODELS
**Solutions**: Add `max_tokens`, remove parameter conflicts, add to appropriate feature category.
**Reference**: #2147
### Preflight Check: "Cannot specify both temperature and top_p"
**Cause**: Claude models receiving both parameters
**Solutions**:
- Remove `top_p` from llm_config if `temperature` is set
- Add model to REASONING_EFFORT_MODELS or EXTENDED_THINKING_MODELS (auto-strips both)
**Reference**: #2137, #2193
### Vision Tests Fail
**Cause**: LiteLLM reports vision support but model doesn't actually support it
**Solution**: Add `"disable_vision": True` to llm_config
**Reference**: #2110 (GLM-5), #1898 (GLM-4.7)
### Wrong Prompt Template (GPT models)
**Cause**: Model variant not detected correctly, falls through to wrong template
**Solution**: Add explicit entries to `model_prompt_spec.py` with correct pattern order
**Reference**: #2233 (GPT-5.2-codex, GPT-5.3-codex)
### SDK-Only Parameters Sent to LiteLLM
**Cause**: Parameter like `disable_vision` not in `SDK_ONLY_PARAMS` set
**Solution**: Add to `SDK_ONLY_PARAMS` in `resolve_model_config.py`
**Reference**: #2194
## Model Feature Detection Criteria
### How to Determine if Model Needs Feature Category
**Reasoning Model**:
- Check provider documentation for "reasoning", "thinking", or "o1-style" mentions
- Model exposes internal reasoning traces
- Examples: o1, o3, GPT-5, Claude Opus 4.5+, Gemini 3+
**Extended Thinking**:
- Check if model is Claude Sonnet 4.5+ or Claude Haiku 4.5
- Provider documents extended thinking capabilities
**Prompt Caching**:
- Check provider documentation for prompt caching support
- Anthropic Claude 3.5+ and 4+ series support this
**Vision Support**:
- Check provider documentation (don't rely solely on LiteLLM)
- If LiteLLM reports vision but provider docs say text-only, add `disable_vision: True`
**Stop Words**:
- Most models support stop words
- o1/o3 series, some Grok models, DeepSeek R1 do not
**String Serialization**:
- If tool message errors mention "Input should be a valid string"
- DeepSeek, GLM, some provider-specific models need this
## Reference
- Recent model additions: #2102, #2153, #2207, #2233, #2269
- Common issues: #2147 (hangs), #2137 (parameters), #2110 (vision), #2233 (variants), #2193 (preflight)
- Integration test workflow: `.github/workflows/integration-runner.yml`
-56
View File
@@ -1,56 +0,0 @@
# Model Configuration for OpenHands SDK
See the [project root AGENTS.md](../../AGENTS.md) for repository-wide policies and workflows.
This directory contains model configuration and evaluation setup for the OpenHands SDK.
## Key Files
- **`resolve_model_config.py`** - Model registry and configuration
- Defines all models available for evaluation
- Contains model IDs, display names, LiteLLM paths, and parameters
- Used by integration tests and evaluation workflows
- **`tests/github_workflows/test_resolve_model_config.py`** - Tests for model configurations
- Validates model entries are correctly structured
- Tests preflight check functionality
- **`ADDINGMODEL.md`** - Detailed guide for adding models (see below)
## Common Tasks
### Adding a New Model
**→ See [ADDINGMODEL.md](./ADDINGMODEL.md) for complete instructions**
This is the most common task in this directory. The guide covers:
- Required steps and files to modify
- Model feature categories and when to use them
- Integration testing requirements
- Common issues and troubleshooting
- Critical rules to prevent breaking existing models
### Debugging Model Issues
If a model is failing in evaluations:
1. Check the model configuration in `resolve_model_config.py`
2. Review parameter compatibility (especially `temperature` + `top_p` for Claude)
3. Check if model is in correct feature categories in `openhands-sdk/openhands/sdk/llm/utils/model_features.py`
4. Run preflight check: `MODEL_IDS="model-id" python resolve_model_config.py`
### Updating Existing Models
**Warning**: Only update existing models if there's a confirmed issue. Working configurations should not be changed.
If you must update:
1. Document why the change is needed (link to issue/PR showing the problem)
2. Test thoroughly before and after the change
3. Run integration tests to verify no regressions
## Directory Purpose
This directory bridges model definitions with the evaluation system:
- Models defined here are available for integration tests
- Configuration includes LiteLLM routing and SDK-specific parameters
- Preflight checks validate model accessibility before expensive evaluation runs
- Tests ensure all models are correctly structured and resolvable
-447
View File
@@ -1,447 +0,0 @@
#!/usr/bin/env python3
"""
Resolve model IDs to full model configurations and verify model availability.
Reads:
- MODEL_IDS: comma-separated model IDs
- LLM_API_KEY: API key for litellm_proxy (optional, for preflight check)
- LLM_BASE_URL: Base URL for litellm_proxy (optional, defaults to eval proxy)
- SKIP_PREFLIGHT: Set to 'true' to skip the preflight LLM check
Outputs to GITHUB_OUTPUT:
- models_json: JSON array of full model configs with display names
"""
import json
import os
import sys
from typing import Any
# SDK-specific parameters that should not be passed to litellm.
# These parameters are used by the SDK's LLM wrapper but are not part of litellm's API.
# Keep this list in sync with SDK LLM config parameters that are SDK-internal.
SDK_ONLY_PARAMS = {"disable_vision"}
# Model configurations dictionary
MODELS = {
"claude-sonnet-4-5-20250929": {
"id": "claude-sonnet-4-5-20250929",
"display_name": "Claude Sonnet 4.5",
"llm_config": {
"model": "litellm_proxy/claude-sonnet-4-5-20250929",
"temperature": 0.0,
},
},
"kimi-k2-thinking": {
"id": "kimi-k2-thinking",
"display_name": "Kimi K2 Thinking",
"llm_config": {
"model": "litellm_proxy/moonshot/kimi-k2-thinking",
"temperature": 1.0,
},
},
# https://www.kimi.com/blog/kimi-k2-5.html
"kimi-k2.5": {
"id": "kimi-k2.5",
"display_name": "Kimi K2.5",
"llm_config": {
"model": "litellm_proxy/moonshot/kimi-k2.5",
"temperature": 1.0,
"top_p": 0.95,
},
},
# https://www.alibabacloud.com/help/en/model-studio/deep-thinking
"qwen3-max-thinking": {
"id": "qwen3-max-thinking",
"display_name": "Qwen3 Max Thinking",
"llm_config": {
"model": "litellm_proxy/dashscope/qwen3-max-2026-01-23",
"litellm_extra_body": {"enable_thinking": True},
},
},
"qwen3.5-flash": {
"id": "qwen3.5-flash",
"display_name": "Qwen3.5 Flash",
"llm_config": {
"model": "litellm_proxy/dashscope/qwen3.5-flash-2026-02-23",
"temperature": 0.0,
},
},
"claude-4.5-opus": {
"id": "claude-4.5-opus",
"display_name": "Claude 4.5 Opus",
"llm_config": {
"model": "litellm_proxy/anthropic/claude-opus-4-5-20251101",
"temperature": 0.0,
},
},
"claude-4.6-opus": {
"id": "claude-4.6-opus",
"display_name": "Claude 4.6 Opus",
"llm_config": {
"model": "litellm_proxy/anthropic/claude-opus-4-6",
"temperature": 0.0,
},
},
"claude-sonnet-4-6": {
"id": "claude-sonnet-4-6",
"display_name": "Claude Sonnet 4.6",
"llm_config": {
"model": "litellm_proxy/anthropic/claude-sonnet-4-6",
"temperature": 0.0,
},
},
"gemini-3-pro": {
"id": "gemini-3-pro",
"display_name": "Gemini 3 Pro",
"llm_config": {
"model": "litellm_proxy/gemini-3-pro-preview",
"temperature": 0.0,
},
},
"gemini-3-flash": {
"id": "gemini-3-flash",
"display_name": "Gemini 3 Flash",
"llm_config": {
"model": "litellm_proxy/gemini-3-flash-preview",
"temperature": 0.0,
},
},
"gemini-3.1-pro": {
"id": "gemini-3.1-pro",
"display_name": "Gemini 3.1 Pro",
"llm_config": {
"model": "litellm_proxy/gemini-3.1-pro-preview",
"temperature": 0.0,
},
},
"gpt-5.2": {
"id": "gpt-5.2",
"display_name": "GPT-5.2",
"llm_config": {"model": "litellm_proxy/openai/gpt-5.2-2025-12-11"},
},
"gpt-5.2-codex": {
"id": "gpt-5.2-codex",
"display_name": "GPT-5.2 Codex",
"llm_config": {"model": "litellm_proxy/gpt-5.2-codex"},
},
"gpt-5-3-codex": {
"id": "gpt-5-3-codex",
"display_name": "GPT-5.3 Codex",
"llm_config": {"model": "litellm_proxy/gpt-5-3-codex"},
},
"gpt-5.2-high-reasoning": {
"id": "gpt-5.2-high-reasoning",
"display_name": "GPT-5.2 High Reasoning",
"llm_config": {
"model": "litellm_proxy/openai/gpt-5.2-2025-12-11",
"reasoning_effort": "high",
},
},
"gpt-5.4": {
"id": "gpt-5.4",
"display_name": "GPT-5.4",
"llm_config": {
"model": "litellm_proxy/openai/gpt-5.4",
"reasoning_effort": "high",
},
},
"minimax-m2": {
"id": "minimax-m2",
"display_name": "MiniMax M2",
"llm_config": {
"model": "litellm_proxy/minimax/minimax-m2",
"temperature": 0.0,
},
},
"minimax-m2.5": {
"id": "minimax-m2.5",
"display_name": "MiniMax M2.5",
"llm_config": {
"model": "litellm_proxy/minimax/MiniMax-M2.5",
"temperature": 1.0,
"top_p": 0.95,
},
},
"minimax-m2.1": {
"id": "minimax-m2.1",
"display_name": "MiniMax M2.1",
"llm_config": {
"model": "litellm_proxy/minimax/MiniMax-M2.1",
"temperature": 0.0,
},
},
"minimax-m2.7": {
"id": "minimax-m2.7",
"display_name": "MiniMax M2.7",
"llm_config": {
"model": "litellm_proxy/minimax/MiniMax-M2.7",
"temperature": 1.0,
"top_p": 0.95,
},
},
"deepseek-v3.2-reasoner": {
"id": "deepseek-v3.2-reasoner",
"display_name": "DeepSeek V3.2 Reasoner",
"llm_config": {"model": "litellm_proxy/deepseek/deepseek-reasoner"},
},
"qwen-3-coder": {
"id": "qwen-3-coder",
"display_name": "Qwen 3 Coder",
"llm_config": {
"model": "litellm_proxy/fireworks_ai/qwen3-coder-480b-a35b-instruct",
"temperature": 0.0,
},
},
"nemotron-3-nano-30b": {
"id": "nemotron-3-nano-30b",
"display_name": "NVIDIA Nemotron 3 Nano 30B",
"llm_config": {
"model": "litellm_proxy/openai/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8",
"temperature": 0.0,
},
},
"glm-4.7": {
"id": "glm-4.7",
"display_name": "GLM-4.7",
"llm_config": {
"model": "litellm_proxy/openrouter/z-ai/glm-4.7",
"temperature": 0.0,
# OpenRouter glm-4.7 is text-only despite LiteLLM reporting vision support
"disable_vision": True,
},
},
"glm-5": {
"id": "glm-5",
"display_name": "GLM-5",
"llm_config": {
"model": "litellm_proxy/openrouter/z-ai/glm-5",
"temperature": 0.0,
# OpenRouter glm-5 is text-only despite LiteLLM reporting vision support
"disable_vision": True,
},
},
"qwen3-coder-next": {
"id": "qwen3-coder-next",
"display_name": "Qwen3 Coder Next",
"llm_config": {
"model": "litellm_proxy/openrouter/qwen/qwen3-coder-next",
"temperature": 0.0,
},
},
"qwen3-coder-30b-a3b-instruct": {
"id": "qwen3-coder-30b-a3b-instruct",
"display_name": "Qwen3 Coder 30B A3B Instruct",
"llm_config": {
"model": "litellm_proxy/Qwen3-Coder-30B-A3B-Instruct",
"temperature": 0.0,
},
},
"gpt-oss-20b": {
"id": "gpt-oss-20b",
"display_name": "GPT OSS 20B",
"llm_config": {
"model": "litellm_proxy/gpt-oss-20b",
"temperature": 0.0,
},
},
"nemotron-3-super-120b-a12b": {
"id": "nemotron-3-super-120b-a12b",
"display_name": "NVIDIA Nemotron-3 Super 120B",
"llm_config": {
"model": "litellm_proxy/nvidia/nemotron-3-super-120b-a12b",
"temperature": 0.0,
},
},
}
def error_exit(msg: str, exit_code: int = 1) -> None:
"""Print error message and exit."""
print(f"ERROR: {msg}", file=sys.stderr)
sys.exit(exit_code)
def get_required_env(key: str) -> str:
"""Get required environment variable or exit with error."""
value = os.environ.get(key)
if not value:
error_exit(f"{key} not set")
return value
def find_models_by_id(model_ids: list[str]) -> list[dict]:
"""Find models by ID. Fails fast on missing ID.
Args:
model_ids: List of model IDs to find
Returns:
List of model dictionaries matching the IDs
Raises:
SystemExit: If any model ID is not found
"""
resolved = []
for model_id in model_ids:
if model_id not in MODELS:
available = ", ".join(sorted(MODELS.keys()))
error_exit(
f"Model ID '{model_id}' not found. Available models: {available}"
)
resolved.append(MODELS[model_id])
return resolved
def check_model(
model_config: dict[str, Any],
api_key: str,
base_url: str,
timeout: int = 60,
) -> tuple[bool, str]:
"""Check a single model with a simple completion request using litellm.
Args:
model_config: Model configuration dict with 'llm_config' key
api_key: API key for authentication
base_url: Base URL for the LLM proxy
timeout: Request timeout in seconds
Returns:
Tuple of (success: bool, message: str)
"""
import litellm
llm_config = model_config.get("llm_config", {})
model_name = llm_config.get("model", "unknown")
display_name = model_config.get("display_name", model_name)
try:
# Build kwargs from llm_config, excluding 'model' and SDK-specific params
kwargs = {
k: v
for k, v in llm_config.items()
if k != "model" and k not in SDK_ONLY_PARAMS
}
# Use simple arithmetic prompt that works reliably across all models
# max_tokens=100 provides enough room for models to respond
# (some need >10 tokens)
response = litellm.completion(
model=model_name,
messages=[{"role": "user", "content": "1+1="}],
max_tokens=100,
api_key=api_key,
base_url=base_url,
timeout=timeout,
**kwargs,
)
response_content = (
response.choices[0].message.content if response.choices else None
)
reasoning_content = (
getattr(response.choices[0].message, "reasoning_content", None)
if response.choices
else None
)
if response_content or reasoning_content:
return True, f"{display_name}: OK"
else:
# Check if there's any other data in the response for diagnostics
finish_reason = (
response.choices[0].finish_reason if response.choices else None
)
usage = getattr(response, "usage", None)
return (
False,
(
f"{display_name}: Empty response "
f"(finish_reason={finish_reason}, usage={usage})"
),
)
except litellm.exceptions.Timeout:
return False, f"{display_name}: Request timed out after {timeout}s"
except litellm.exceptions.APIConnectionError as e:
return False, f"{display_name}: Connection error - {e}"
except litellm.exceptions.BadRequestError as e:
return False, f"{display_name}: Bad request - {e}"
except litellm.exceptions.NotFoundError as e:
return False, f"{display_name}: Model not found - {e}"
except Exception as e:
return False, f"{display_name}: {type(e).__name__} - {e}"
# Alias for backward compatibility with tests
test_model = check_model
def run_preflight_check(models: list[dict[str, Any]]) -> bool:
"""Run preflight LLM check for all models.
Args:
models: List of model configurations to test
Returns:
True if all models passed, False otherwise
"""
api_key = os.environ.get("LLM_API_KEY")
base_url = os.environ.get("LLM_BASE_URL", "https://llm-proxy.eval.all-hands.dev")
skip_preflight = os.environ.get("SKIP_PREFLIGHT", "").lower() == "true"
if skip_preflight:
print("Preflight check: SKIPPED (SKIP_PREFLIGHT=true)")
return True
if not api_key:
print("Preflight check: SKIPPED (LLM_API_KEY not set)")
return True
print(f"\nPreflight LLM check for {len(models)} model(s)...")
print("-" * 50)
all_passed = True
for model_config in models:
success, message = check_model(model_config, api_key, base_url)
print(message)
if not success:
all_passed = False
print("-" * 50)
if all_passed:
print(f"✓ All {len(models)} model(s) passed preflight check\n")
else:
print("✗ Some models failed preflight check")
print("Evaluation aborted to avoid wasting compute resources.\n")
return all_passed
def main() -> None:
model_ids_str = get_required_env("MODEL_IDS")
github_output = get_required_env("GITHUB_OUTPUT")
# Parse requested model IDs
model_ids = [mid.strip() for mid in model_ids_str.split(",") if mid.strip()]
# Resolve model configs
resolved = find_models_by_id(model_ids)
print(f"Resolved {len(resolved)} model(s): {', '.join(model_ids)}")
# Run preflight check
if not run_preflight_check(resolved):
error_exit("Preflight LLM check failed")
# Output as JSON
models_json = json.dumps(resolved, separators=(",", ":"))
with open(github_output, "a", encoding="utf-8") as f:
f.write(f"models_json={models_json}\n")
if __name__ == "__main__":
main()
-89
View File
@@ -1,89 +0,0 @@
#!/usr/bin/env python3
"""
Validate SDK reference for semantic versioning.
This script validates that the SDK reference is a semantic version (e.g., v1.0.0, 1.0.0)
unless the allow_unreleased_branches flag is set.
Environment variables:
- SDK_REF: The SDK reference to validate
- ALLOW_UNRELEASED_BRANCHES: If 'true', bypass semantic version validation
Exit codes:
- 0: Validation passed
- 1: Validation failed
"""
import os
import re
import sys
# Semantic version pattern: optional 'v' prefix, followed by MAJOR.MINOR.PATCH
# Optionally allows pre-release (-alpha.1, -beta.2, -rc.1) and build metadata
SEMVER_PATTERN = re.compile(
r"^v?" # Optional 'v' prefix
r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)" # MAJOR.MINOR.PATCH
r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)" # Pre-release
r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?" # More pre-release
r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$" # Build metadata
)
def is_semantic_version(ref: str) -> bool:
"""Check if the given reference is a valid semantic version.
Args:
ref: The reference string to validate
Returns:
True if the reference is a valid semantic version, False otherwise
"""
return bool(SEMVER_PATTERN.match(ref))
def validate_sdk_ref(sdk_ref: str, allow_unreleased: bool) -> tuple[bool, str]:
"""Validate the SDK reference.
Args:
sdk_ref: The SDK reference to validate
allow_unreleased: If True, bypass semantic version validation
Returns:
Tuple of (is_valid, message)
"""
if allow_unreleased:
return True, f"Allowing unreleased branch: {sdk_ref}"
if is_semantic_version(sdk_ref):
return True, f"Valid semantic version: {sdk_ref}"
return False, (
f"SDK reference '{sdk_ref}' is not a valid semantic version. "
"Expected format: v1.0.0 or 1.0.0 (with optional pre-release like -alpha.1). "
"To use unreleased branches, check 'Allow unreleased branches'."
)
def main() -> None:
sdk_ref = os.environ.get("SDK_REF", "")
allow_unreleased_str = os.environ.get("ALLOW_UNRELEASED_BRANCHES", "false")
if not sdk_ref:
print("ERROR: SDK_REF environment variable is not set", file=sys.stderr)
sys.exit(1)
allow_unreleased = allow_unreleased_str.lower() == "true"
is_valid, message = validate_sdk_ref(sdk_ref, allow_unreleased)
if is_valid:
print(f"{message}")
sys.exit(0)
else:
print(f"{message}", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()
@@ -1,611 +0,0 @@
#!/usr/bin/env python3
"""REST API breakage detection for openhands-agent-server using oasdiff.
This script compares the current OpenAPI schema for the agent-server REST API against
an already-published release. The baseline version is selected from PyPI, but the
baseline schema is generated from the matching git tag under the current workspace's
locked dependency set. This keeps the comparison focused on API changes in our code,
not schema drift from newer FastAPI/Pydantic releases.
The deprecation note it recognizes intentionally matches the phrasing used by the
Python deprecation checks, for example:
Deprecated since v1.14.0 and scheduled for removal in v1.19.0.
Policies enforced:
1) REST deprecations must use FastAPI/OpenAPI metadata
- FastAPI route handlers must not use `openhands.sdk.utils.deprecation.deprecated`.
- Endpoints documented as deprecated in their OpenAPI description must also be
marked `deprecated: true` in the generated schema.
2) Deprecation runway before removal
- If a REST operation (path + HTTP method) is removed, it must have been marked
`deprecated: true` in the baseline release and its OpenAPI description must
declare a scheduled removal version that has been reached by the current
package version.
3) No in-place contract breakage
- Breaking REST contract changes that are not removals of previously-deprecated
operations fail the check. REST clients need 5 minor releases of runway, so
incompatible replacements must ship additively or behind a versioned contract
until the scheduled removal version.
If the baseline release schema can't be generated (e.g., missing tag / repo issues),
the script emits a warning and exits successfully to avoid flaky CI.
"""
from __future__ import annotations
import ast
import json
import re
import subprocess
import sys
import tempfile
import tomllib
import urllib.request
from pathlib import Path
from packaging import version as pkg_version
REPO_ROOT = Path(__file__).resolve().parents[2]
AGENT_SERVER_PYPROJECT = REPO_ROOT / "openhands-agent-server" / "pyproject.toml"
PYPI_DISTRIBUTION = "openhands-agent-server"
# Keep this in sync with REST_ROUTE_DEPRECATION_RE in check_deprecations.py so
# the REST breakage and deprecation checks recognize the same wording.
REST_ROUTE_DEPRECATION_RE = re.compile(
r"Deprecated since v(?P<deprecated>[0-9A-Za-z.+-]+)\s+"
r"and scheduled for removal in v(?P<removed>[0-9A-Za-z.+-]+)\.?",
re.IGNORECASE,
)
HTTP_METHODS = {
"get",
"put",
"post",
"delete",
"patch",
"options",
"head",
"trace",
}
ROUTE_DECORATOR_NAMES = HTTP_METHODS | {"api_route"}
OPENAPI_PROGRAM = """
import json
import sys
from pathlib import Path
source_tree = Path(sys.argv[1])
sys.path = [
str(source_tree / "openhands-agent-server"),
str(source_tree / "openhands-sdk"),
str(source_tree / "openhands-tools"),
str(source_tree / "openhands-workspace"),
] + sys.path
from openhands.agent_server.api import create_app
print(json.dumps(create_app().openapi()))
"""
def _read_version_from_pyproject(pyproject: Path) -> str:
data = tomllib.loads(pyproject.read_text())
try:
return str(data["project"]["version"])
except KeyError as exc: # pragma: no cover
raise SystemExit(
f"Unable to determine project version from {pyproject}"
) from exc
def _fetch_pypi_metadata(distribution: str) -> dict:
req = urllib.request.Request(
url=f"https://pypi.org/pypi/{distribution}/json",
headers={"User-Agent": "openhands-agent-server-openapi-check/1.0"},
method="GET",
)
with urllib.request.urlopen(req, timeout=10) as response:
return json.load(response)
def _get_baseline_version(distribution: str, current: str) -> str | None:
try:
meta = _fetch_pypi_metadata(distribution)
except Exception as exc: # pragma: no cover
print(
f"::warning title={distribution} REST API::Failed to fetch PyPI metadata: "
f"{exc}"
)
return None
releases = list(meta.get("releases", {}).keys())
if not releases:
return None
if current in releases:
return current
current_parsed = pkg_version.parse(current)
older = [rv for rv in releases if pkg_version.parse(rv) < current_parsed]
if not older:
return None
return max(older, key=pkg_version.parse)
def _generate_openapi_from_source_tree(source_tree: Path, label: str) -> dict | None:
try:
result = subprocess.run(
[sys.executable, "-c", OPENAPI_PROGRAM, str(source_tree)],
check=True,
capture_output=True,
text=True,
cwd=source_tree,
)
return json.loads(result.stdout)
except subprocess.CalledProcessError as exc:
output = (exc.stdout or "") + ("\n" + exc.stderr if exc.stderr else "")
excerpt = output.strip()[-1000:]
print(
f"::warning title={PYPI_DISTRIBUTION} REST API::Failed to generate "
f"OpenAPI schema for {label}: {exc}\n{excerpt}"
)
return None
except Exception as exc:
print(
f"::warning title={PYPI_DISTRIBUTION} REST API::Failed to generate "
f"OpenAPI schema for {label}: {exc}"
)
return None
def _generate_current_openapi() -> dict | None:
return _generate_openapi_from_source_tree(REPO_ROOT, "current workspace")
def _generate_openapi_for_git_ref(git_ref: str) -> dict | None:
with tempfile.TemporaryDirectory(prefix="agent-server-openapi-") as tmp:
source_tree = Path(tmp)
try:
archive = subprocess.run(
["git", "-C", str(REPO_ROOT), "archive", git_ref],
check=True,
capture_output=True,
)
subprocess.run(
["tar", "-x", "-C", str(source_tree)],
check=True,
input=archive.stdout,
capture_output=True,
)
except subprocess.CalledProcessError as exc:
output = (exc.stdout or b"") + (b"\n" + exc.stderr if exc.stderr else b"")
excerpt = output.decode(errors="replace").strip()[-1000:]
print(
f"::warning title={PYPI_DISTRIBUTION} REST API::Failed to extract "
f"source for {git_ref}: {exc}\n{excerpt}"
)
return None
return _generate_openapi_from_source_tree(source_tree, git_ref)
def _dotted_name(node: ast.AST) -> str | None:
if isinstance(node, ast.Name):
return node.id
if isinstance(node, ast.Attribute):
prefix = _dotted_name(node.value)
if prefix is None:
return None
return f"{prefix}.{node.attr}"
return None
def _find_sdk_deprecated_fastapi_routes_in_file(
file_path: Path, repo_root: Path
) -> list[str]:
tree = ast.parse(file_path.read_text(), filename=str(file_path))
deprecated_names: set[str] = set()
deprecation_module_names: set[str] = set()
for node in tree.body:
if isinstance(node, ast.ImportFrom):
if node.module == "openhands.sdk.utils.deprecation":
for alias in node.names:
if alias.name == "deprecated":
deprecated_names.add(alias.asname or alias.name)
elif node.module == "openhands.sdk.utils":
for alias in node.names:
if alias.name == "deprecation":
deprecation_module_names.add(alias.asname or alias.name)
elif isinstance(node, ast.Import):
for alias in node.names:
if alias.name == "openhands.sdk.utils.deprecation":
deprecation_module_names.add(alias.asname or alias.name)
errors: list[str] = []
for node in ast.walk(tree):
if not isinstance(node, ast.FunctionDef | ast.AsyncFunctionDef):
continue
has_route_decorator = False
uses_sdk_deprecated = False
for decorator in node.decorator_list:
if not isinstance(decorator, ast.Call):
continue
dotted_name = _dotted_name(decorator.func)
if (
isinstance(decorator.func, ast.Attribute)
and decorator.func.attr in ROUTE_DECORATOR_NAMES
):
has_route_decorator = True
if dotted_name in deprecated_names or (
dotted_name == "openhands.sdk.utils.deprecation.deprecated"
):
uses_sdk_deprecated = True
continue
if (
isinstance(decorator.func, ast.Attribute)
and decorator.func.attr == "deprecated"
):
base_name = _dotted_name(decorator.func.value)
if base_name in deprecation_module_names or (
base_name == "openhands.sdk.utils.deprecation"
):
uses_sdk_deprecated = True
if has_route_decorator and uses_sdk_deprecated:
rel_path = file_path.relative_to(repo_root)
errors.append(
f"{rel_path}:{node.lineno} FastAPI route `{node.name}` uses "
"openhands.sdk.utils.deprecation.deprecated; use the route "
"decorator's deprecated=True flag instead."
)
return errors
def _find_sdk_deprecated_fastapi_routes(repo_root: Path) -> list[str]:
app_root = repo_root / "openhands-agent-server" / "openhands" / "agent_server"
errors: list[str] = []
for file_path in sorted(app_root.rglob("*.py")):
errors.extend(_find_sdk_deprecated_fastapi_routes_in_file(file_path, repo_root))
return errors
def _find_deprecation_policy_errors(schema: dict) -> list[str]:
errors: list[str] = []
for path, path_item in schema.get("paths", {}).items():
if not isinstance(path_item, dict):
continue
for method, operation in path_item.items():
if method not in HTTP_METHODS or not isinstance(operation, dict):
continue
description = operation.get("description") or ""
if "deprecated since" not in description.lower():
continue
if operation.get("deprecated") is True:
continue
errors.append(
f"{method.upper()} {path} documents deprecation in its "
"description but is not marked deprecated=true in OpenAPI."
)
return errors
def _parse_openapi_deprecation_description(
description: str | None,
) -> tuple[str, str] | None:
"""Extract ``(deprecated_in, removed_in)`` from an OpenAPI description.
The accepted wording intentionally matches ``check_deprecations.py`` so both
CI checks recognize the same note, for example:
Deprecated since v1.14.0 and scheduled for removal in v1.19.0.
"""
if not description:
return None
match = REST_ROUTE_DEPRECATION_RE.search(" ".join(description.split()))
if match is None:
return None
return match.group("deprecated").rstrip("."), match.group("removed").rstrip(".")
def _version_ge(current: str, target: str) -> bool:
try:
return pkg_version.parse(current) >= pkg_version.parse(target)
except pkg_version.InvalidVersion as exc:
raise SystemExit(
f"Invalid semantic version comparison: {current=} {target=}"
) from exc
def _get_openapi_operation(schema: dict, path: str, method: str) -> dict | None:
path_item = schema.get("paths", {}).get(path)
if not isinstance(path_item, dict):
return None
operation = path_item.get(method.lower())
if not isinstance(operation, dict):
return None
return operation
def _validate_removed_operations(
removed_operations: list[dict],
prev_schema: dict,
current_version: str,
) -> list[str]:
"""Validate removed operations against the baseline deprecation metadata."""
errors: list[str] = []
for operation in removed_operations:
path = str(operation.get("path", ""))
method = str(operation.get("method", "")).lower()
method_label = method.upper() or "<unknown method>"
if not operation.get("deprecated", False):
errors.append(
f"Removed {method_label} {path} without prior deprecation "
"(deprecated=true)."
)
continue
baseline_operation = _get_openapi_operation(prev_schema, path, method)
if baseline_operation is None:
errors.append(
f"Removed {method_label} {path} was marked deprecated in the "
"baseline release, but the previous OpenAPI schema could not be "
"inspected for its scheduled removal version."
)
continue
deprecation_details = _parse_openapi_deprecation_description(
baseline_operation.get("description")
)
if deprecation_details is None:
errors.append(
f"Removed {method_label} {path} was marked deprecated in the "
"baseline release, but its OpenAPI description does not declare "
"a scheduled removal version. REST API removals require 5 minor "
"releases of deprecation runway."
)
continue
_, removed_in = deprecation_details
if not _version_ge(current_version, removed_in):
errors.append(
f"Removed {method_label} {path} before its scheduled removal "
f"version v{removed_in} (current version: v{current_version}). "
"REST API removals require 5 minor releases of deprecation "
"runway."
)
continue
print(
f"::notice title={PYPI_DISTRIBUTION} REST API::Removed previously-"
f"deprecated {method_label} {path} after its scheduled removal "
f"version v{removed_in}."
)
return errors
def _split_breaking_changes(
breaking_changes: list[dict],
) -> tuple[list[dict], list[dict]]:
"""Split oasdiff results into removals and all other breakages."""
removed_operations: list[dict] = []
other_breaking_changes: list[dict] = []
for change in breaking_changes:
change_id = str(change.get("id", ""))
details = change.get("details", {})
if "removed" in change_id.lower() and "operation" in change_id.lower():
removed_operations.append(
{
"path": details.get("path", ""),
"method": details.get("method", ""),
"deprecated": details.get("deprecated", False),
}
)
continue
other_breaking_changes.append(change)
return removed_operations, other_breaking_changes
def _normalize_openapi_for_oasdiff(schema: dict) -> dict:
"""Normalize OpenAPI 3.1 schema for oasdiff compatibility.
oasdiff expects OpenAPI 3.0-style exclusiveMinimum/exclusiveMaximum booleans
(https://spec.openapis.org/oas/v3.0.3.html#schema-object), while OpenAPI 3.1
emits numeric values. Convert numeric exclusives into minimum/maximum +
exclusive boolean flags so oasdiff can parse the schema.
Mutates the schema in place and returns it for convenience.
"""
def _walk(node: object) -> None:
if isinstance(node, dict):
if (
"exclusiveMinimum" in node
and isinstance(node["exclusiveMinimum"], (int, float))
and not isinstance(node["exclusiveMinimum"], bool)
):
value = node["exclusiveMinimum"]
if "minimum" not in node:
node["minimum"] = value
node["exclusiveMinimum"] = True
if (
"exclusiveMaximum" in node
and isinstance(node["exclusiveMaximum"], (int, float))
and not isinstance(node["exclusiveMaximum"], bool)
):
value = node["exclusiveMaximum"]
if "maximum" not in node:
node["maximum"] = value
node["exclusiveMaximum"] = True
for child in node.values():
_walk(child)
elif isinstance(node, list):
for child in node:
_walk(child)
_walk(schema)
return schema
def _run_oasdiff_breakage_check(
prev_spec: Path, cur_spec: Path
) -> tuple[list[dict], int]:
"""Run oasdiff breaking check between two OpenAPI specs.
Returns (list of breaking changes, exit code from oasdiff).
"""
try:
result = subprocess.run(
[
"oasdiff",
"breaking",
"-f",
"json",
"--fail-on",
"ERR",
str(prev_spec),
str(cur_spec),
],
capture_output=True,
text=True,
)
except FileNotFoundError:
print(
"::warning title=oasdiff not found::"
"Please install oasdiff: https://github.com/oasdiff/oasdiff"
)
return [], 0
breaking_changes = []
if result.stdout:
try:
breaking_changes = json.loads(result.stdout)
except json.JSONDecodeError:
pass
return breaking_changes, result.returncode
def main() -> int:
current_version = _read_version_from_pyproject(AGENT_SERVER_PYPROJECT)
baseline_version = _get_baseline_version(PYPI_DISTRIBUTION, current_version)
if baseline_version is None:
print(
f"::warning title={PYPI_DISTRIBUTION} REST API::Unable to find baseline "
f"version for {current_version}; skipping breakage checks."
)
return 0
baseline_git_ref = f"v{baseline_version}"
static_policy_errors = _find_sdk_deprecated_fastapi_routes(REPO_ROOT)
for error in static_policy_errors:
print(f"::error title={PYPI_DISTRIBUTION} REST API::{error}")
current_schema = _generate_current_openapi()
if current_schema is None:
return 1
deprecation_policy_errors = _find_deprecation_policy_errors(current_schema)
for error in deprecation_policy_errors:
print(f"::error title={PYPI_DISTRIBUTION} REST API::{error}")
prev_schema = _generate_openapi_for_git_ref(baseline_git_ref)
if prev_schema is None:
return 0 if not (static_policy_errors or deprecation_policy_errors) else 1
prev_schema = _normalize_openapi_for_oasdiff(prev_schema)
current_schema = _normalize_openapi_for_oasdiff(current_schema)
with tempfile.TemporaryDirectory(prefix="oasdiff-specs-") as tmp:
tmp_path = Path(tmp)
prev_spec_file = tmp_path / "prev_spec.json"
cur_spec_file = tmp_path / "cur_spec.json"
prev_spec_file.write_text(json.dumps(prev_schema, indent=2))
cur_spec_file.write_text(json.dumps(current_schema, indent=2))
breaking_changes, exit_code = _run_oasdiff_breakage_check(
prev_spec_file, cur_spec_file
)
if not breaking_changes:
if exit_code == 0:
print("No breaking changes detected.")
else:
print(
f"oasdiff returned exit code {exit_code} but no breaking changes "
"in JSON format. There may be warnings only."
)
else:
removed_operations, other_breaking_changes = _split_breaking_changes(
breaking_changes
)
removal_errors = _validate_removed_operations(
removed_operations,
prev_schema,
current_version,
)
for error in removal_errors:
print(f"::error title={PYPI_DISTRIBUTION} REST API::{error}")
if other_breaking_changes:
print(
"::error "
f"title={PYPI_DISTRIBUTION} REST API::Detected breaking REST API "
"changes other than removing previously-deprecated operations. "
"REST contract changes must preserve compatibility for 5 minor "
"releases; keep the old contract available until its scheduled "
"removal version."
)
print("\nBreaking REST API changes detected compared to baseline release:")
for text in breaking_changes:
print(f"- {text.get('text', str(text))}")
if not (removal_errors or other_breaking_changes):
print(
"Breaking changes are limited to previously-deprecated operations "
"whose scheduled removal versions have been reached."
)
else:
return 1
return 1 if (static_policy_errors or deprecation_policy_errors) else 0
if __name__ == "__main__":
raise SystemExit(main())
-592
View File
@@ -1,592 +0,0 @@
#!/usr/bin/env python3
"""Static analysis for deprecation deadlines.
This script scans Python deprecation metadata (`deprecated`, `warn_deprecated`,
`warn_cleanup`) and agent-server REST routes marked `deprecated=True`. If the
current project version has reached or passed a feature's removal marker, the
script fails with a helpful summary so legacy shims and overdue deprecated REST
endpoints are cleaned up before release.
"""
from __future__ import annotations
import ast
import re
import sys
import tomllib
from collections.abc import Iterable, Iterator, Sequence
from dataclasses import dataclass
from datetime import date
from pathlib import Path
from typing import Literal
from packaging import version as pkg_version
REST_ROUTE_DEPRECATION_RE = re.compile(
r"Deprecated since v(?P<deprecated>[0-9A-Za-z.+-]+)\s+"
r"and scheduled for removal in v(?P<removed>[0-9A-Za-z.+-]+)\.?",
re.IGNORECASE,
)
ROUTE_DECORATOR_NAMES = {
"get",
"put",
"post",
"delete",
"patch",
"options",
"head",
"trace",
"api_route",
}
HTTP_METHODS = ROUTE_DECORATOR_NAMES - {"api_route"}
REPO_ROOT = Path(__file__).resolve().parents[2]
@dataclass(frozen=True, slots=True)
class PackageConfig:
name: str
pyproject: Path
source_roots: tuple[Path, ...]
PACKAGES: tuple[PackageConfig, ...] = (
PackageConfig(
name="openhands-sdk",
pyproject=REPO_ROOT / "openhands-sdk" / "pyproject.toml",
source_roots=(REPO_ROOT / "openhands-sdk" / "openhands" / "sdk",),
),
PackageConfig(
name="openhands-tools",
pyproject=REPO_ROOT / "openhands-tools" / "pyproject.toml",
source_roots=(REPO_ROOT / "openhands-tools" / "openhands" / "tools",),
),
PackageConfig(
name="openhands-workspace",
pyproject=REPO_ROOT / "openhands-workspace" / "pyproject.toml",
source_roots=(REPO_ROOT / "openhands-workspace" / "openhands" / "workspace",),
),
PackageConfig(
name="openhands-agent-server",
pyproject=REPO_ROOT / "openhands-agent-server" / "pyproject.toml",
source_roots=(
REPO_ROOT / "openhands-agent-server" / "openhands" / "agent_server",
),
),
)
@dataclass(slots=True)
class DeprecationRecord:
identifier: str
removed_in: str | date | None
deprecated_in: str | None
path: Path
line: int
kind: Literal["decorator", "warn_call", "cleanup_call", "rest_route"]
package: str
def _load_current_version(pyproject: Path) -> str:
data = tomllib.loads(pyproject.read_text())
try:
return str(data["project"]["version"])
except KeyError as exc: # pragma: no cover - configuration error
raise SystemExit(
f"Unable to determine project version from {pyproject}"
) from exc
def _iter_python_files(root: Path) -> Iterator[Path]:
for path in root.rglob("*.py"):
if path.name == "__init__.py" and path.parent == root:
continue
yield path
def _parse_removed_value(
node: ast.AST | None,
*,
path: Path,
line: int,
) -> str | date | None:
if node is None:
return None
expression = ast.unparse(node)
if isinstance(node, ast.Constant):
if isinstance(node.value, str):
return node.value
if node.value is None:
return None
raise SystemExit(
f"Unsupported removed_in literal at {path}:{line}: {expression}"
)
if isinstance(node, ast.Call):
func = node.func
if isinstance(func, ast.Name) and func.id == "date":
try:
args = [_safe_int_literal(arg) for arg in node.args]
kwargs = {
kw.arg: _safe_int_literal(kw.value)
for kw in node.keywords
if kw.arg is not None
}
except ValueError as exc:
raise SystemExit(
f"Unsupported removed_in date() arguments at {path}:{line}:"
f" {expression}"
) from exc
if any(kw.arg is None for kw in node.keywords):
raise SystemExit(
"Unsupported removed_in date() call (uses **kwargs) at "
f"{path}:{line}: {expression}"
)
try:
return date(*args, **kwargs)
except TypeError as exc:
raise SystemExit(
f"Invalid removed_in date() call at {path}:{line}: {expression}"
) from exc
if (
isinstance(func, ast.Attribute)
and isinstance(func.value, ast.Name)
and func.value.id == "date"
and func.attr == "today"
):
if node.args or node.keywords:
raise SystemExit(
"date.today() removed_in call must not include arguments at "
f"{path}:{line}: {expression}"
)
return date.today()
raise SystemExit(
f"Unsupported removed_in expression at {path}:{line}: {expression}"
)
def _parse_deprecated_value(
node: ast.AST | None,
*,
path: Path,
line: int,
) -> str | None:
if node is None:
return None
expression = ast.unparse(node)
if isinstance(node, ast.Constant):
if isinstance(node.value, str):
return node.value
if node.value is None:
return None
raise SystemExit(
f"Unsupported deprecated_in expression at {path}:{line}: {expression}"
)
def _safe_int_literal(node: ast.AST) -> int:
if not isinstance(node, ast.Constant) or not isinstance(node.value, int):
raise ValueError(
f"Unsupported expression inside literal evaluation: {ast.unparse(node)}"
)
return node.value
def _extract_kw(call: ast.Call, name: str) -> ast.AST | None:
for kw in call.keywords:
if kw.arg == name:
return kw.value
return None
def _extract_string_literal(node: ast.AST | None) -> str | None:
if isinstance(node, ast.Constant) and isinstance(node.value, str):
return node.value
return None
def _extract_string_sequence(node: ast.AST | None) -> tuple[str, ...] | None:
if not isinstance(node, (ast.List, ast.Tuple, ast.Set)):
return None
values: list[str] = []
for item in node.elts:
value = _extract_string_literal(item)
if value is None:
return None
values.append(value)
return tuple(values)
def _extract_route_details(call: ast.Call) -> tuple[tuple[str, str], ...]:
target = call.func
if not isinstance(target, ast.Attribute):
return ()
decorator_name = target.attr
if decorator_name not in ROUTE_DECORATOR_NAMES:
return ()
path = _extract_string_literal(call.args[0] if call.args else None)
if path is None:
path = _extract_string_literal(_extract_kw(call, "path"))
if path is None:
return ()
if decorator_name in HTTP_METHODS:
return ((decorator_name.upper(), path),)
methods = _extract_string_sequence(_extract_kw(call, "methods"))
if methods is None:
return (("GET", path),)
return tuple(
(method.upper(), path) for method in methods if method.lower() in HTTP_METHODS
)
def _parse_rest_route_deprecation_docstring(
docstring: str | None,
*,
path: Path,
line: int,
route_identifiers: Sequence[str],
) -> tuple[str, str]:
if not docstring:
raise SystemExit(
"Deprecated REST route(s) "
f"{', '.join(route_identifiers)} at {path}:{line} must include a "
"docstring note like 'Deprecated since vX.Y.Z and scheduled for "
"removal in vA.B.C.'"
)
match = REST_ROUTE_DEPRECATION_RE.search(" ".join(docstring.split()))
if match is None:
raise SystemExit(
"Deprecated REST route(s) "
f"{', '.join(route_identifiers)} at {path}:{line} must include a "
"docstring note like 'Deprecated since vX.Y.Z and scheduled for "
"removal in vA.B.C.'"
)
return match.group("deprecated").rstrip("."), match.group("removed").rstrip(".")
def _gather_rest_route_deprecations(
tree: ast.AST, path: Path, *, package: str
) -> Iterator[DeprecationRecord]:
for node in ast.walk(tree):
if not isinstance(node, ast.FunctionDef | ast.AsyncFunctionDef):
continue
routes: list[tuple[str, str]] = []
for deco in node.decorator_list:
if not isinstance(deco, ast.Call):
continue
deprecated_value = _extract_kw(deco, "deprecated")
if (
not isinstance(deprecated_value, ast.Constant)
or deprecated_value.value is not True
):
continue
routes.extend(_extract_route_details(deco))
if not routes:
continue
deprecated_in, removed_in = _parse_rest_route_deprecation_docstring(
ast.get_docstring(node),
path=path,
line=node.lineno,
route_identifiers=[
f"{method} {route_path}" for method, route_path in routes
],
)
for method, route_path in routes:
yield DeprecationRecord(
identifier=f"{method} {route_path}",
removed_in=removed_in,
deprecated_in=deprecated_in,
path=path,
line=node.lineno,
kind="rest_route",
package=package,
)
def _gather_decorators(
tree: ast.AST, path: Path, *, package: str
) -> Iterator[DeprecationRecord]:
for node in ast.walk(tree):
if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
continue
for deco in node.decorator_list:
call = deco if isinstance(deco, ast.Call) else None
if call is None:
continue
target = call.func
if isinstance(target, ast.Name):
decorator_name = target.id
elif isinstance(target, ast.Attribute):
decorator_name = target.attr
else:
continue
if decorator_name != "deprecated":
continue
removed_expr = _extract_kw(call, "removed_in")
deprecated_expr = _extract_kw(call, "deprecated_in")
record = DeprecationRecord(
identifier=_build_identifier(node),
removed_in=_parse_removed_value(
removed_expr, path=path, line=node.lineno
),
deprecated_in=_parse_deprecated_value(
deprecated_expr, path=path, line=node.lineno
),
path=path,
line=node.lineno,
kind="decorator",
package=package,
)
yield record
def _gather_warn_calls(
tree: ast.AST, path: Path, *, package: str
) -> Iterator[DeprecationRecord]:
for node in ast.walk(tree):
if not isinstance(node, ast.Call):
continue
target = node.func
if isinstance(target, ast.Name):
func_name = target.id
elif isinstance(target, ast.Attribute):
func_name = target.attr
else:
continue
if func_name == "warn_deprecated":
identifier_node = node.args[0] if node.args else None
if identifier_node is None:
continue
identifier = ast.unparse(identifier_node)
removed_expr = _extract_kw(node, "removed_in")
deprecated_expr = _extract_kw(node, "deprecated_in")
yield DeprecationRecord(
identifier=identifier,
removed_in=_parse_removed_value(
removed_expr, path=path, line=node.lineno
),
deprecated_in=_parse_deprecated_value(
deprecated_expr, path=path, line=node.lineno
),
path=path,
line=node.lineno,
kind="warn_call",
package=package,
)
elif func_name == "warn_cleanup":
identifier_node = node.args[0] if node.args else None
if identifier_node is None:
continue
identifier = ast.unparse(identifier_node)
cleanup_expr = _extract_kw(node, "cleanup_by")
yield DeprecationRecord(
identifier=identifier,
removed_in=_parse_removed_value(
cleanup_expr, path=path, line=node.lineno
),
deprecated_in=None,
path=path,
line=node.lineno,
kind="cleanup_call",
package=package,
)
def _build_identifier(node: ast.AST) -> str:
if isinstance(node, ast.ClassDef):
return node.name
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
qual_name = node.name
if node.decorator_list:
parent = getattr(node, "parent", None)
if parent and isinstance(parent, ast.ClassDef):
return f"{parent.name}.{node.name}"
return qual_name
return "<unknown>"
def _attach_parents(tree: ast.AST) -> None:
for node in ast.walk(tree):
for child in ast.iter_child_nodes(node):
setattr(child, "parent", node)
def _collect_records(files: Iterable[Path], *, package: str) -> list[DeprecationRecord]:
records: list[DeprecationRecord] = []
for path in files:
tree = ast.parse(path.read_text())
_attach_parents(tree)
records.extend(_gather_decorators(tree, path, package=package))
records.extend(_gather_warn_calls(tree, path, package=package))
return records
def _collect_rest_route_records(
files: Iterable[Path], *, package: str
) -> list[DeprecationRecord]:
records: list[DeprecationRecord] = []
for path in files:
tree = ast.parse(path.read_text())
records.extend(_gather_rest_route_deprecations(tree, path, package=package))
return records
def _version_ge(current: str, target: str) -> bool:
try:
return pkg_version.parse(current) >= pkg_version.parse(target)
except pkg_version.InvalidVersion as exc:
raise SystemExit(
f"Invalid semantic version comparison: {current=} {target=}"
) from exc
def _should_fail(current_version: str, record: DeprecationRecord) -> bool:
removed = record.removed_in
if removed is None:
return False
if isinstance(removed, date):
return date.today() >= removed
try:
target = str(removed)
return _version_ge(current_version, target)
except SystemExit:
raise
except Exception as exc: # pragma: no cover - unexpected literal type
raise SystemExit(
f"Unsupported removed_in expression in {record.path}:{record.line}:"
f" {removed!r}"
) from exc
def _format_record(record: DeprecationRecord) -> str:
location = record.path.relative_to(REPO_ROOT)
removed = record.removed_in if record.removed_in is not None else "(none)"
if record.kind == "cleanup_call":
return (
f"- [{record.package}] {record.identifier} ({record.kind})\n"
f" cleanup by: {removed}\n"
f" defined at: {location}:{record.line}"
)
deprecated = (
record.deprecated_in if record.deprecated_in is not None else "(unknown)"
)
return (
f"- [{record.package}] {record.identifier} ({record.kind})\n"
f" deprecated in: {deprecated}\n"
f" removed in: {removed}\n"
f" defined at: {location}:{record.line}"
)
def main(argv: Sequence[str] | None = None) -> int:
argv = list(argv or [])
overdue: list[DeprecationRecord] = []
total_records = 0
package_summaries: list[tuple[str, str, int]] = []
for package in PACKAGES:
if not package.pyproject.exists():
raise SystemExit(
f"Unable to locate pyproject.toml for {package.name}: "
f"{package.pyproject}"
)
current_version = _load_current_version(package.pyproject)
files: list[Path] = []
for root in package.source_roots:
if not root.exists():
raise SystemExit(
f"Source root {root} for package {package.name} does not exist"
)
files.extend(_iter_python_files(root))
records = _collect_records(files, package=package.name)
if package.name == "openhands-agent-server":
records.extend(_collect_rest_route_records(files, package=package.name))
overdue.extend(r for r in records if _should_fail(current_version, r))
total_records += len(records)
package_summaries.append((package.name, current_version, len(records)))
if overdue:
deprecated_items = [r for r in overdue if r.kind != "cleanup_call"]
cleanup_items = [r for r in overdue if r.kind == "cleanup_call"]
if deprecated_items:
print(
"The following deprecated features have passed their removal "
"deadline:\n"
)
for record in deprecated_items:
print(_format_record(record))
print()
if cleanup_items:
print("The following workarounds have passed their cleanup deadline:\n")
for record in cleanup_items:
print(_format_record(record))
print()
if deprecated_items:
print(
"Update or remove the listed features before publishing a version that "
"meets or exceeds their removal deadline."
)
if cleanup_items:
print(
"Remove the listed workarounds before publishing a version that "
"meets or exceeds their cleanup deadline."
)
return 1
for package_name, version, count in package_summaries:
print(
f"{package_name}: checked {count} deprecation metadata entries against "
f"version {version}."
)
print(
f"Checked {total_records} deprecation metadata entries across "
f"{len(package_summaries)} package(s)."
)
return 0
if __name__ == "__main__": # pragma: no cover - manual invocation
sys.exit(main(sys.argv[1:]))
-297
View File
@@ -1,297 +0,0 @@
#!/usr/bin/env python3
"""Validate docstrings conform to MDX-compatible formatting guidelines.
This script checks that docstrings in the SDK use patterns that render correctly
in Mintlify MDX documentation. It validates:
1. No REPL-style examples (>>>) - should use fenced code blocks instead
2. Shell/config examples use fenced code blocks (prevents # becoming headers)
Run with: python scripts/check_docstrings.py
Exit code 0 = all checks pass, 1 = violations found
"""
import ast
import sys
from dataclasses import dataclass
from pathlib import Path
# Directories to check
SDK_PATHS = [
"openhands-sdk/openhands/sdk",
]
# Files/directories to skip
SKIP_PATTERNS = [
"__pycache__",
".pyc",
"test_",
"_test.py",
]
# Core public API files to check strictly (these are documented on the website)
# Other files will be checked but only emit warnings, not failures
STRICT_CHECK_FILES = [
"agent/agent.py",
"llm/llm.py",
"conversation/conversation.py",
"tool/tool.py",
"workspace/base.py",
"observability/laminar.py",
]
@dataclass
class Violation:
"""A docstring formatting violation."""
file: Path
line: int
name: str
rule: str
message: str
is_strict: bool = False # True if this is in a strictly-checked file
def should_skip(path: Path) -> bool:
"""Check if a path should be skipped."""
path_str = str(path)
return any(pattern in path_str for pattern in SKIP_PATTERNS)
def check_repl_examples(
docstring: str, name: str, lineno: int, file: Path
) -> list[Violation]:
"""Check for REPL-style examples (>>>).
These should be replaced with fenced code blocks for better MDX rendering.
"""
violations = []
lines = docstring.split("\n")
for i, line in enumerate(lines):
stripped = line.strip()
if stripped.startswith(">>>"):
violations.append(
Violation(
file=file,
line=lineno + i,
name=name,
rule="no-repl-examples",
message=(
"Use fenced code blocks (```python) instead of >>> REPL style. "
"REPL examples don't render well in MDX documentation."
),
)
)
# Only report once per docstring
break
return violations
def check_unfenced_shell_config(
docstring: str, name: str, lineno: int, file: Path
) -> list[Violation]:
"""Check for shell/config examples that aren't in fenced code blocks.
Lines starting with # outside code blocks become markdown headers.
"""
violations = []
lines = docstring.split("\n")
in_code_block = False
for i, line in enumerate(lines):
stripped = line.strip()
# Track code block state
if stripped.startswith("```"):
in_code_block = not in_code_block
continue
# Skip if inside a code block
if in_code_block:
continue
# Check for shell-style comments that look like config
# Pattern: line starts with # and previous line has = (config pattern)
if stripped.startswith("#") and not stripped.startswith("# "):
# This is likely a shell comment without space (less common in prose)
continue
# Check for unfenced config: KEY=VALUE followed by # comment
if i > 0:
prev_line = lines[i - 1].strip() if i > 0 else ""
# If previous line looks like config (VAR=value) and this is a # comment
if "=" in prev_line and prev_line.split("=")[0].isupper():
if stripped.startswith("# "):
violations.append(
Violation(
file=file,
line=lineno + i,
name=name,
rule="fenced-shell-config",
message=(
"Shell/config examples with # comments should be "
"in ```bash code blocks. Otherwise # becomes a "
"markdown header."
),
)
)
# Only report once per docstring
break
return violations
def check_docstring(
docstring: str, name: str, lineno: int, file: Path
) -> list[Violation]:
"""Run all checks on a docstring."""
if not docstring:
return []
violations = []
violations.extend(check_repl_examples(docstring, name, lineno, file))
violations.extend(check_unfenced_shell_config(docstring, name, lineno, file))
return violations
def get_docstrings_from_file(file: Path) -> list[tuple[str, str, int]]:
"""Extract all docstrings from a Python file.
Returns list of (name, docstring, lineno) tuples.
"""
try:
source = file.read_text()
tree = ast.parse(source)
except (SyntaxError, UnicodeDecodeError) as e:
print(f"Warning: Could not parse {file}: {e}", file=sys.stderr)
return []
docstrings = []
for node in ast.walk(tree):
name = None
lineno = 0
docstring = None
if isinstance(node, ast.Module):
docstring = ast.get_docstring(node)
name = file.stem
lineno = 1
elif isinstance(node, ast.ClassDef):
docstring = ast.get_docstring(node)
name = node.name
lineno = node.lineno
elif isinstance(node, ast.FunctionDef | ast.AsyncFunctionDef):
docstring = ast.get_docstring(node)
name = node.name
lineno = node.lineno
if docstring and name:
docstrings.append((name, docstring, lineno))
return docstrings
def is_strict_file(file: Path, repo_root: Path) -> bool:
"""Check if a file is in the strict check list."""
try:
rel_path = file.relative_to(repo_root / "openhands-sdk/openhands/sdk")
return any(str(rel_path) == strict for strict in STRICT_CHECK_FILES)
except ValueError:
return False
def check_file(file: Path, repo_root: Path) -> list[Violation]:
"""Check all docstrings in a file."""
violations = []
is_strict = is_strict_file(file, repo_root)
for name, docstring, lineno in get_docstrings_from_file(file):
file_violations = check_docstring(docstring, name, lineno, file)
for v in file_violations:
v.is_strict = is_strict
violations.extend(file_violations)
return violations
def main() -> int:
"""Run docstring checks on all SDK files."""
repo_root = Path(__file__).parent.parent.parent
all_violations: list[Violation] = []
files_checked = 0
for sdk_path in SDK_PATHS:
path = repo_root / sdk_path
if not path.exists():
print(f"Warning: Path not found: {path}", file=sys.stderr)
continue
for py_file in path.rglob("*.py"):
if should_skip(py_file):
continue
files_checked += 1
violations = check_file(py_file, repo_root)
all_violations.extend(violations)
# Separate strict violations (errors) from warnings
strict_violations = [v for v in all_violations if v.is_strict]
warning_violations = [v for v in all_violations if not v.is_strict]
# Report warnings (non-strict files)
if warning_violations:
count = len(warning_violations)
print(f"\n⚠️ Found {count} docstring warning(s) in non-core files:\n")
by_file: dict[Path, list[Violation]] = {}
for v in warning_violations:
by_file.setdefault(v.file, []).append(v)
for file, violations in sorted(by_file.items()):
rel_path = file.relative_to(repo_root)
print(f"📄 {rel_path}")
for v in violations:
print(f" Line {v.line}: {v.name} ({v.rule})")
print()
# Report errors (strict files)
if strict_violations:
count = len(strict_violations)
print(f"\n❌ Found {count} docstring error(s) in core API files:\n")
by_file: dict[Path, list[Violation]] = {}
for v in strict_violations:
by_file.setdefault(v.file, []).append(v)
for file, violations in sorted(by_file.items()):
rel_path = file.relative_to(repo_root)
print(f"📄 {rel_path}")
for v in violations:
print(f" Line {v.line}: {v.name}")
print(f" Rule: {v.rule}")
print(f" {v.message}")
print()
print("=" * 60)
print("To fix these issues:")
print(" 1. Replace >>> examples with ```python code blocks")
print(" 2. Wrap shell/config examples in ```bash code blocks")
print("=" * 60)
return 1
if warning_violations:
count = len(warning_violations)
print(f"✅ Core API files pass. {count} warnings in other files.")
else:
print(f"✅ All {files_checked} files pass docstring checks")
return 0
if __name__ == "__main__":
sys.exit(main())
@@ -1,209 +0,0 @@
#!/usr/bin/env python3
"""
Check if all examples in agent-sdk are documented in the docs repository.
This script:
1. Scans the docs repository for references to example files
2. Lists all example Python files in the agent-sdk repository
3. Compares the two sets to find undocumented examples
4. Exits with error code 1 if undocumented examples are found
"""
import os
import re
import sys
from pathlib import Path
def find_documented_examples(docs_path: Path) -> set[str]:
"""
Find all example file references in the docs repository.
Searches for patterns like:
- examples/01_standalone_sdk/02_custom_tools.py
- examples/02_remote_agent_server/06_custom_tool/custom_tools/log_data.py
in MDX files.
Returns:
Set of normalized example file paths (relative to agent-sdk root)
"""
documented_examples: set[str] = set()
# Pattern to match example file references with arbitrary nesting depth.
# Matches: examples/<dir>/.../<file>.py
pattern = r"examples/(?:[-\w]+/)+[-\w]+\.py"
for root, _, files in os.walk(docs_path):
for file in files:
if file.endswith(".mdx") or file.endswith(".md"):
file_path = Path(root) / file
try:
content = file_path.read_text(encoding="utf-8")
matches = re.findall(pattern, content)
for match in matches:
# Normalize the path
documented_examples.add(match)
except Exception as e:
print(f"Warning: Error reading {file_path}: {e}")
continue
return documented_examples
def find_agent_sdk_examples(agent_sdk_path: Path) -> set[str]:
"""
Find all example Python files in the agent-sdk repository.
Excludes examples/03_github_workflows/ since those examples are YAML
files, not Python files.
Returns:
Set of example file paths (relative to agent-sdk root)
"""
examples: set[str] = set()
examples_dir = agent_sdk_path / "examples"
if not examples_dir.exists():
print(f"Error: Examples directory not found: {examples_dir}")
sys.exit(1)
# Find all Python files under examples/
for root, _, files in os.walk(examples_dir):
for file in files:
if file.endswith(".py"):
file_path = Path(root) / file
# Get relative path from agent-sdk root
relative_path = file_path.relative_to(agent_sdk_path)
relative_path_str = str(relative_path)
# Skip GitHub workflow examples (those are YAML files, Python
# files there are just helpers)
if relative_path_str.startswith("examples/03_github_workflows/"):
continue
# Skip LLM-specific tools examples: these are intentionally not
# enforced by the docs check. See discussion in PR #1486.
if relative_path_str.startswith("examples/04_llm_specific_tools/"):
continue
# Skip __init__.py files as they typically don't need documentation
if file == "__init__.py":
continue
examples.add(relative_path_str)
return examples
def resolve_paths() -> tuple[Path, Path]:
"""
Determine agent-sdk root and docs path.
Priority for docs path:
1) DOCS_PATH (env override)
2) $GITHUB_WORKSPACE/docs
3) agent_sdk_root/'docs'
4) agent_sdk_root.parent/'docs'
Returns:
Tuple of (agent_sdk_root, docs_path)
"""
# agent-sdk repo root (script is at agent-sdk/.github/scripts/...)
script_file = Path(__file__).resolve()
agent_sdk_root = script_file.parent.parent.parent
candidates: list[Path] = []
# 1) Explicit env override
env_override = os.environ.get("DOCS_PATH")
if env_override:
candidates.append(Path(env_override).expanduser().resolve())
# 2) Standard GitHub workspace sibling
gh_ws = os.environ.get("GITHUB_WORKSPACE")
if gh_ws:
candidates.append(Path(gh_ws).resolve() / "docs")
# 3) Sibling inside the agent-sdk repo root
candidates.append(agent_sdk_root / "docs")
# 4) Parent-of-agent-sdk-root layout
candidates.append(agent_sdk_root.parent / "docs")
print(f"🔍 Agent SDK root: {agent_sdk_root}")
print("🔎 Trying docs paths (in order):")
for p in candidates:
print(f" - {p}")
for p in candidates:
if p.exists():
print(f"📁 Using docs path: {p}")
return agent_sdk_root, p
# If none exist, fail with a helpful message
print("❌ Docs path not found in any of the expected locations.")
print(" Set DOCS_PATH, or checkout the repo to one of the tried paths above.")
sys.exit(1)
def main() -> None:
agent_sdk_root, docs_path = resolve_paths()
print("\n" + "=" * 60)
print("Checking documented examples...")
print("=" * 60)
# Find all examples in agent-sdk
print("\n📋 Scanning agent-sdk examples...")
agent_examples = find_agent_sdk_examples(agent_sdk_root)
print(f" Found {len(agent_examples)} example file(s)")
# Find all documented examples in docs
print("\n📄 Scanning docs repository...")
documented_examples = find_documented_examples(docs_path)
print(f" Found {len(documented_examples)} documented example(s)")
# Calculate difference
undocumented = agent_examples - documented_examples
print("\n" + "=" * 60)
if undocumented:
print(f"❌ Found {len(undocumented)} undocumented example(s):")
print("=" * 60)
for example in sorted(undocumented):
print(f" - {example}")
print("\n⚠️ Please add documentation for these examples in the docs repo.")
print("=" * 60)
print("\n📚 How to Document Examples:")
print("=" * 60)
print("1. Clone the docs repository:")
print(" git clone https://github.com/OpenHands/docs.git")
print()
print("2. Create a new .mdx file in sdk/guides/ directory")
print(" (e.g., sdk/guides/my-feature.mdx)")
print()
print("3. Add the example code block with this format:")
print(' ```python icon="python" expandable examples/path/to/file.py')
print(" <code will be auto-synced>")
print(" ```")
print()
print("4. See the format documentation at:")
print(
" https://github.com/OpenHands/docs/blob/main/.github/scripts/README.md"
)
print()
print("5. Example documentation files can be found in:")
print(" https://github.com/OpenHands/docs/tree/main/sdk/guides")
print()
print("6. After creating the PR in docs repo, reference it in your")
print(" agent-sdk PR description.")
print("=" * 60)
sys.exit(1)
else:
print("✅ All examples are documented!")
print("=" * 60)
sys.exit(0)
if __name__ == "__main__":
main()
@@ -1,104 +0,0 @@
#!/usr/bin/env python3
"""
Check for duplicate example numbers in the examples directory.
This script ensures that within each examples subdirectory, no two files or
folders share the same numeric prefix (e.g., two files both starting with "04_").
Exit codes:
0 - No duplicates found
1 - Duplicates found
"""
import re
import sys
from collections import defaultdict
from pathlib import Path
def find_duplicate_numbers(examples_dir: Path) -> dict[str, list[str]]:
"""
Find duplicate example numbers within each subdirectory.
Returns:
Dictionary mapping subdirectory paths to lists of duplicate entries.
Only includes subdirectories that have duplicates.
"""
duplicates: dict[str, list[str]] = {}
# Pattern to extract leading number from filename/dirname
# e.g., "04" from "04_foo.py"
number_pattern = re.compile(r"^(\d+)_")
for subdir in sorted(examples_dir.iterdir()):
if not subdir.is_dir():
continue
# Skip hidden directories
if subdir.name.startswith("."):
continue
# Group entries by their numeric prefix
number_to_entries: dict[str, list[str]] = defaultdict(list)
for entry in subdir.iterdir():
# Skip hidden files/directories
if entry.name.startswith("."):
continue
match = number_pattern.match(entry.name)
if match:
number = match.group(1)
number_to_entries[number].append(entry.name)
# Find numbers with multiple entries
subdir_duplicates = []
for number, entries in sorted(number_to_entries.items()):
if len(entries) > 1:
subdir_duplicates.extend(sorted(entries))
if subdir_duplicates:
relative_subdir = str(subdir.relative_to(examples_dir.parent))
duplicates[relative_subdir] = subdir_duplicates
return duplicates
def main() -> None:
# Find the examples directory relative to this script
script_file = Path(__file__).resolve()
repo_root = script_file.parent.parent.parent
examples_dir = repo_root / "examples"
if not examples_dir.exists():
print(f"Error: Examples directory not found: {examples_dir}")
sys.exit(1)
print("=" * 60)
print("Checking for duplicate example numbers...")
print("=" * 60)
print(f"\n📁 Scanning: {examples_dir}\n")
duplicates = find_duplicate_numbers(examples_dir)
if duplicates:
print("❌ Found duplicate example numbers:\n")
for subdir, entries in sorted(duplicates.items()):
print(f" {subdir}/")
for entry in entries:
print(f" - {entry}")
print()
print("=" * 60)
print("⚠️ Please renumber the examples to remove duplicates.")
print(" Each example should have a unique number within its folder.")
print("=" * 60)
sys.exit(1)
else:
print("✅ No duplicate example numbers found!")
print("=" * 60)
sys.exit(0)
if __name__ == "__main__":
main()
-826
View File
@@ -1,826 +0,0 @@
#!/usr/bin/env python3
"""API breakage detection for published OpenHands packages using Griffe.
This script compares current workspace packages against the most recent PyPI
release (or the matching release if the current version is already published)
to detect breaking changes in the public API.
It focuses on the curated public surface:
- symbols exported via ``__all__``
- public members removed from classes exported via ``__all__``
It enforces two policies:
1. **Deprecation-before-removal** any removed export or removed public class
member must have been marked deprecated in the *previous* release using the
canonical deprecation helpers (``@deprecated`` decorator or
``warn_deprecated()`` call from ``openhands.sdk.utils.deprecation``). For
members, the recommended ``warn_deprecated`` feature name is qualified (e.g.
``"LLM.some_method"``).
2. **MINOR version bump** any breaking change (removal or structural) requires
at least a MINOR version bump according to SemVer.
Complementary to the deprecation mechanism:
- Deprecation (``check_deprecations.py``): enforces cleanup deadlines
- This script: prevents unannounced removals and enforces SemVer bumps
"""
from __future__ import annotations
import ast
import json
import os
import re
import subprocess
import sys
import tomllib
import urllib.request
from collections.abc import Iterable
from dataclasses import dataclass
from pathlib import Path
from packaging import version as pkg_version
from packaging.requirements import Requirement
@dataclass(frozen=True)
class PackageConfig:
"""Configuration for a single published package."""
package: str # dotted module path, e.g. "openhands.sdk"
distribution: str # PyPI distribution name, e.g. "openhands-sdk"
source_dir: str # repo-relative directory, e.g. "openhands-sdk"
@dataclass(frozen=True, slots=True)
class DeprecatedSymbols:
"""Deprecated SDK symbols detected in a source tree.
``top_level`` tracks module-level symbols (exports) like ``LLM``.
``qualified`` tracks class members like ``LLM.some_method``.
"""
top_level: set[str] = frozenset() # type: ignore[assignment]
qualified: set[str] = frozenset() # type: ignore[assignment]
PACKAGES: tuple[PackageConfig, ...] = (
PackageConfig(
package="openhands.sdk",
distribution="openhands-sdk",
source_dir="openhands-sdk",
),
PackageConfig(
package="openhands.workspace",
distribution="openhands-workspace",
source_dir="openhands-workspace",
),
PackageConfig(
package="openhands.tools",
distribution="openhands-tools",
source_dir="openhands-tools",
),
)
ACP_DEPENDENCY = "agent-client-protocol"
ACP_SKIP_ENV = "ACP_VERSION_CHECK_SKIP"
ACP_SKIP_TOKEN = "skip-acp-check"
ACP_BASE_REF_ENV = "ACP_VERSION_CHECK_BASE_REF"
def read_version_from_pyproject(path: str) -> str:
"""Read the version string from a pyproject.toml file."""
with open(path, "rb") as f:
data = tomllib.load(f)
proj = data.get("project", {})
v = proj.get("version")
if not v:
raise SystemExit(f"Could not read version from {path}")
return str(v)
def _read_pyproject(path: str) -> dict:
with open(path, "rb") as f:
return tomllib.load(f)
def _bool_env(name: str) -> bool:
value = os.environ.get(name, "").strip().lower()
return value in {"1", "true", "yes", "on"}
def _get_dependency_spec(project_data: dict, dependency: str) -> str | None:
deps = project_data.get("project", {}).get("dependencies", [])
for dep in deps:
if dep.startswith(dependency):
return dep
return None
def _min_version_from_requirement(req_str: str) -> pkg_version.Version | None:
try:
req = Requirement(req_str)
except Exception as exc:
print(
f"::warning title=ACP version::Unable to parse requirement "
f"'{req_str}': {exc}"
)
return None
lower_bounds: list[pkg_version.Version] = []
for spec in req.specifier:
if spec.operator in {">=", ">", "==", "~="}:
try:
lower_bounds.append(_parse_version(spec.version))
except Exception as exc:
print(
f"::warning title=ACP version::Unable to parse version "
f"'{spec.version}' from '{req_str}': {exc}"
)
if not lower_bounds:
return None
return max(lower_bounds)
def _git_show_file(ref: str, rel_path: str) -> str | None:
for candidate in (f"origin/{ref}", ref):
result = subprocess.run(
["git", "show", f"{candidate}:{rel_path}"],
check=False,
capture_output=True,
text=True,
)
if result.returncode == 0:
return result.stdout
return None
def _load_base_pyproject(base_ref: str) -> dict | None:
rel_path = "openhands-sdk/pyproject.toml"
content = _git_show_file(base_ref, rel_path)
if content is None:
print(
f"::warning title=ACP version::Unable to read {rel_path} from "
f"{base_ref}; skipping ACP version check"
)
return None
try:
return tomllib.loads(content)
except tomllib.TOMLDecodeError as exc:
print(
f"::warning title=ACP version::Failed to parse {rel_path} from "
f"{base_ref}: {exc}"
)
return None
def _check_acp_version_bump(repo_root: str) -> int:
if _bool_env(ACP_SKIP_ENV):
print(
f"::notice title=ACP version::Skipping ACP version check because "
f"{ACP_SKIP_ENV} is set (token: [{ACP_SKIP_TOKEN}])."
)
return 0
base_ref = os.environ.get(ACP_BASE_REF_ENV) or os.environ.get("GITHUB_BASE_REF")
if not base_ref:
print(
"::warning title=ACP version::No base ref found; skipping ACP version check"
)
return 0
base_data = _load_base_pyproject(base_ref)
if base_data is None:
return 0
current_data = _read_pyproject(
os.path.join(repo_root, "openhands-sdk", "pyproject.toml")
)
old_req = _get_dependency_spec(base_data, ACP_DEPENDENCY)
new_req = _get_dependency_spec(current_data, ACP_DEPENDENCY)
if not old_req or not new_req:
print(
f"::warning title=ACP version::Unable to locate {ACP_DEPENDENCY} "
"dependency in pyproject.toml; skipping ACP version check"
)
return 0
old_min = _min_version_from_requirement(old_req)
new_min = _min_version_from_requirement(new_req)
if old_min is None or new_min is None:
print(
f"::warning title=ACP version::Unable to parse {ACP_DEPENDENCY} "
"minimum version; skipping ACP version check"
)
return 0
if new_min <= old_min:
return 0
if new_min.major != old_min.major or new_min.minor != old_min.minor:
print(
"::error title=ACP version::Detected "
f"{ACP_DEPENDENCY} minor/major version bump "
f"({old_req} -> {new_req}). If intentional, add "
f"[{ACP_SKIP_TOKEN}] to the PR description to bypass."
)
return 1
return 0
def _parse_version(v: str) -> pkg_version.Version:
"""Parse a version string using packaging."""
return pkg_version.parse(v)
def get_pypi_baseline_version(pkg: str, current: str | None) -> str | None:
"""Fetch the baseline release version from PyPI.
The baseline is the most recent published release to compare against the
current workspace. If the current version already exists on PyPI, compare
against that same release. Otherwise, fall back to the newest release older
than the current version. If ``current`` is None, use the latest release.
Args:
pkg: Package name on PyPI (e.g., "openhands-sdk")
current: Current version from the workspace, or None for latest
Returns:
Baseline version string, or None if not found or on network error
"""
req = urllib.request.Request(
url=f"https://pypi.org/pypi/{pkg}/json",
headers={"User-Agent": "openhands-sdk-api-check/1.0"},
method="GET",
)
try:
with urllib.request.urlopen(req, timeout=10) as r:
meta = json.load(r)
except Exception as e:
print(f"::warning title={pkg} API::Failed to fetch PyPI metadata: {e}")
return None
releases = list(meta.get("releases", {}).keys())
if not releases:
return None
def _sort_key(s: str):
return _parse_version(s)
releases_sorted = sorted(releases, key=_sort_key, reverse=True)
if current is None:
return releases_sorted[0]
if current in releases:
return current
cur_parsed = _parse_version(current)
older = [rv for rv in releases if _parse_version(rv) < cur_parsed]
if not older:
return None
return sorted(older, key=_sort_key, reverse=True)[0]
def ensure_griffe() -> None:
"""Verify griffe is installed, raising an error if not."""
try:
import griffe # noqa: F401
except ImportError:
sys.stderr.write(
"ERROR: griffe not installed. Install with: pip install griffe[pypi]\n"
)
raise SystemExit(1)
def _is_field_metadata_only_change(old_val: object, new_val: object) -> bool:
"""Check if the change is only in Field metadata (description, title, etc.).
Field metadata parameters like ``description``, ``title``, ``examples``, and
``deprecated`` don't affect runtime behavior. Changes to these should not be
considered breaking API changes.
Returns:
True if both values are Field() calls and only metadata parameters differ.
"""
old_str = str(old_val)
new_str = str(new_val)
if not (old_str.startswith("Field(") and new_str.startswith("Field(")):
return False
# Metadata parameters that don't affect runtime behavior.
# See https://docs.pydantic.dev/latest/api/fields/#pydantic.fields.Field
metadata_patterns = {
"description": r'([\'"])([^\'"]*?)\1',
"title": r'([\'"])([^\'"]*?)\1',
"examples": r'([\'"])([^\'"]*?)\1',
"json_schema_extra": r'([\'"])([^\'"]*?)\1',
"deprecated": r"(?:True|False|None|'[^']*'|\"[^\"]*\")",
}
def _normalize(value: str) -> str:
normalized = value
for param, value_pattern in metadata_patterns.items():
pattern = rf",?\s*{param}\s*=\s*{value_pattern}"
normalized = re.sub(pattern, "", normalized)
normalized = re.sub(r"\(\s*,", "(", normalized)
normalized = re.sub(r",\s*\)", ")", normalized)
normalized = re.sub(r",\s*,", ", ", normalized)
normalized = re.sub(r"\s+", " ", normalized)
return normalized.strip()
return _normalize(old_str) == _normalize(new_str)
def _collect_breakages_pairs(
objs: Iterable[tuple[object, object]],
*,
deprecated: DeprecatedSymbols,
title: str,
) -> tuple[list[object], int]:
"""Find breaking changes between pairs of old/new API objects.
Only reports breakages for public API members.
Returns:
(breakages, undeprecated_removals)
"""
import griffe
from griffe import Alias, AliasResolutionError, BreakageKind, ExplanationStyle, Kind
breakages: list[object] = []
undeprecated_removals = 0
for old, new in objs:
try:
for br in griffe.find_breaking_changes(old, new):
obj = getattr(br, "obj", None)
if not getattr(obj, "is_public", True):
continue
# Skip ATTRIBUTE_CHANGED_VALUE when it's just Field metadata changes
# (description, title, examples, etc.) - these don't affect runtime
if br.kind == BreakageKind.ATTRIBUTE_CHANGED_VALUE:
old_value = getattr(br, "old_value", None)
new_value = getattr(br, "new_value", None)
if _is_field_metadata_only_change(old_value, new_value):
print(
f"::notice title={title}::Ignoring Field metadata-only "
f"change (non-breaking): {obj.name if obj else 'unknown'}"
)
continue
print(br.explain(style=ExplanationStyle.GITHUB))
breakages.append(br)
if br.kind != BreakageKind.OBJECT_REMOVED:
continue
parent = getattr(obj, "parent", None)
if getattr(parent, "kind", None) != Kind.CLASS:
continue
feature = f"{parent.name}.{obj.name}"
if (
feature not in deprecated.qualified
and parent.name not in deprecated.top_level
):
print(
f"::error title={title}::Removed '{feature}' without prior "
"deprecation. Mark it with @deprecated(...) or "
f"warn_deprecated('{feature}', ...) for at least one release "
"before removing."
)
undeprecated_removals += 1
except AliasResolutionError as e:
if isinstance(old, Alias) or isinstance(new, Alias):
old_target = old.target_path if isinstance(old, Alias) else None
new_target = new.target_path if isinstance(new, Alias) else None
if old_target != new_target:
name = getattr(old, "name", None) or getattr(
new, "name", "<unknown>"
)
print(
f"::warning title={title}::Alias target changed for '{name}': "
f"{old_target!r} -> {new_target!r}"
)
breakages.append(
{
"kind": "ALIAS_TARGET_CHANGED",
"name": name,
"old": old_target,
"new": new_target,
}
)
else:
print(
f"::notice title={title}::Skipping symbol comparison due to "
f"unresolved alias: {e}"
)
except Exception as e:
print(f"::warning title={title}::Failed to compute breakages: {e}")
return breakages, undeprecated_removals
def _extract_exported_names(module) -> set[str]:
"""Extract names exported from a module via ``__all__``.
This check is explicitly meant to track the curated public surface. The SDK
is expected to define ``__all__`` in ``openhands.sdk``; if it's missing or we
can't statically interpret it, we fail fast rather than silently widening the
surface area (which would make the check noisy and brittle).
"""
try:
all_var = module["__all__"]
except Exception as e:
raise ValueError("Expected __all__ to be defined on the public module") from e
val = getattr(all_var, "value", None)
elts = getattr(val, "elements", None)
if not elts:
raise ValueError("Unable to statically evaluate __all__")
names: set[str] = set()
for el in elts:
# Griffe represents string literals in __all__ in different ways depending
# on how the module is loaded / griffe version:
# - sometimes as plain Python strings (including quotes, e.g. "'LLM'")
# - sometimes as expression nodes with a `.value` attribute
#
# We intentionally only support the "static __all__ of string literals"
# case; we just normalize the representation.
if isinstance(el, str):
names.add(el.strip("\"'"))
continue
s = getattr(el, "value", None)
if isinstance(s, str):
names.add(s)
if not names:
raise ValueError("__all__ resolved to an empty set")
return names
def _check_version_bump(prev: str, new_version: str, total_breaks: int) -> int:
"""Check if version bump policy is satisfied for breaking changes.
Policy: Breaking changes require at least a MINOR version bump.
Returns:
0 if policy satisfied, 1 if not
"""
if total_breaks == 0:
print("No breaking changes detected")
return 0
parsed_prev = _parse_version(prev)
parsed_new = _parse_version(new_version)
# MINOR bump required: same major, higher minor OR higher major
ok = (parsed_new.major > parsed_prev.major) or (
parsed_new.major == parsed_prev.major and parsed_new.minor > parsed_prev.minor
)
if not ok:
print(
f"::error title=SemVer::Breaking changes detected ({total_breaks}); "
f"require at least minor version bump from "
f"{parsed_prev.major}.{parsed_prev.minor}.x, but new is {new_version}"
)
return 1
print(
f"Breaking changes detected ({total_breaks}) and version bump policy "
f"satisfied ({prev} -> {new_version})"
)
return 0
def _resolve_griffe_object(
root: object,
dotted: str,
root_package: str = "",
) -> object:
"""Resolve a dotted path to a griffe object."""
root_path = getattr(root, "path", None)
if root_path == dotted:
return root
if isinstance(root_path, str) and dotted.startswith(root_path + "."):
dotted = dotted[len(root_path) + 1 :]
try:
return root[dotted]
except (KeyError, TypeError) as e:
print(
f"::warning title=SDK API::Unable to resolve {dotted} via "
f"direct lookup; falling back to manual traversal: {e}"
)
rel = dotted
if root_package and dotted.startswith(root_package + "."):
rel = dotted[len(root_package) + 1 :]
obj = root
for part in rel.split("."):
try:
obj = obj[part]
except (KeyError, TypeError) as e:
raise KeyError(f"Unable to resolve {dotted}: failed at {part}") from e
return obj
def _load_current(
griffe_module: object, repo_root: str, cfg: PackageConfig
) -> object | None:
try:
return griffe_module.load(
cfg.package,
search_paths=[os.path.join(repo_root, cfg.source_dir)],
)
except Exception as e:
print(
f"::error title={cfg.distribution} API::"
f"Failed to load current {cfg.distribution}: {e}"
)
return None
def _load_prev_from_pypi(
griffe_module: object,
prev: str,
cfg: PackageConfig,
) -> object | None:
griffe_cache = os.path.expanduser("~/.cache/griffe")
os.makedirs(griffe_cache, exist_ok=True)
try:
return griffe_module.load_pypi(
package=cfg.package,
distribution=cfg.distribution,
version_spec=f"=={prev}",
)
except Exception as e:
print(
f"::error title={cfg.distribution} API::"
f"Failed to load {cfg.distribution}=={prev} from PyPI: {e}"
)
return None
def _find_deprecated_symbols(source_root: Path) -> DeprecatedSymbols:
"""Scan source files for symbols marked with the SDK deprecation helpers.
Detects two forms:
- ``@deprecated(...)`` decorator on a class/function/method
- ``warn_deprecated('SomeFeature', ...)`` call
Returns:
DeprecatedSymbols(top_level=..., qualified=...)
"""
def _is_deprecated_decorator(deco: ast.AST) -> bool:
if not isinstance(deco, ast.Call):
return False
target = deco.func
if isinstance(target, ast.Name):
return target.id == "deprecated"
if isinstance(target, ast.Attribute):
return target.attr == "deprecated"
return False
class _Visitor(ast.NodeVisitor):
def __init__(self) -> None:
self.class_stack: list[str] = []
self.top_level: set[str] = set()
self.qualified: set[str] = set()
def visit_ClassDef(self, node: ast.ClassDef) -> None: # noqa: N802
if any(_is_deprecated_decorator(deco) for deco in node.decorator_list):
self.top_level.add(node.name)
self.qualified.add(node.name)
self.class_stack.append(node.name)
self.generic_visit(node)
self.class_stack.pop()
def _visit_function_like(
self,
node: ast.FunctionDef | ast.AsyncFunctionDef,
) -> None:
if any(_is_deprecated_decorator(deco) for deco in node.decorator_list):
if self.class_stack:
self.qualified.add(".".join([*self.class_stack, node.name]))
else:
self.top_level.add(node.name)
self.qualified.add(node.name)
self.generic_visit(node)
def visit_FunctionDef(self, node: ast.FunctionDef) -> None: # noqa: N802
self._visit_function_like(node)
def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> None: # noqa: N802
self._visit_function_like(node)
def visit_Call(self, node: ast.Call) -> None: # noqa: N802
target = node.func
func_name = None
if isinstance(target, ast.Name):
func_name = target.id
elif isinstance(target, ast.Attribute):
func_name = target.attr
if func_name == "warn_deprecated" and node.args:
feature = _extract_string_literal(node.args[0])
if feature is not None:
self.qualified.add(feature)
self.top_level.add(feature.split(".")[0])
self.generic_visit(node)
top_level: set[str] = set()
qualified: set[str] = set()
for pyfile in source_root.rglob("*.py"):
try:
tree = ast.parse(pyfile.read_text())
except SyntaxError as e:
print(
f"::warning title=SDK API::Skipping {pyfile}: "
f"failed to parse (SyntaxError: {e})"
)
continue
visitor = _Visitor()
visitor.visit(tree)
top_level |= visitor.top_level
qualified |= visitor.qualified
return DeprecatedSymbols(top_level=top_level, qualified=qualified)
def _extract_string_literal(node: ast.AST) -> str | None:
"""Return the string value if *node* is a simple string literal."""
if isinstance(node, ast.Constant) and isinstance(node.value, str):
return node.value
return None
def _get_source_root(griffe_root: object) -> Path | None:
"""Derive the package source directory from a griffe module's filepath."""
filepath = getattr(griffe_root, "filepath", None)
if filepath is not None:
return Path(filepath).parent
return None
def _compute_breakages(old_root, new_root, cfg: PackageConfig) -> tuple[int, int]:
"""Detect breaking changes between old and new package versions.
Returns:
``(total_breaks, undeprecated_removals)`` — *total_breaks* counts all
structural breakages (for the version-bump policy), while
*undeprecated_removals* counts public API removals (exports and class
members) without a prior deprecation marker (a separate hard failure).
"""
pkg = cfg.package
title = f"{cfg.distribution} API"
total_breaks = 0
undeprecated_removals = 0
source_root = _get_source_root(old_root)
deprecated = (
_find_deprecated_symbols(source_root) if source_root else DeprecatedSymbols()
)
try:
old_mod = _resolve_griffe_object(old_root, pkg, root_package=pkg)
new_mod = _resolve_griffe_object(new_root, pkg, root_package=pkg)
except Exception as e:
raise RuntimeError(f"Failed to resolve root module '{pkg}'") from e
new_exports = _extract_exported_names(new_mod)
try:
old_exports = _extract_exported_names(old_mod)
except ValueError as e:
# The API breakage check relies on a curated public surface defined via
# __all__. If the baseline release didn't define (or couldn't statically
# evaluate) __all__, we can't compute meaningful breakages.
#
# In this situation, skip rather than failing the entire workflow.
print(
f"::notice title={title}::Skipping breakage check; baseline release "
f"has no statically-evaluable {pkg}.__all__: {e}"
)
return 0, 0
removed = sorted(old_exports - new_exports)
# Check deprecation-before-removal policy (exports)
for name in removed:
total_breaks += 1 # every removal is a structural break
if name not in deprecated.top_level:
print(
f"::error title={title}::Removed '{name}' from "
f"{pkg}.__all__ without prior deprecation. "
"Mark it with @deprecated or warn_deprecated() "
"for at least one release before removing."
)
undeprecated_removals += 1
else:
print(
f"::notice title={title}::Removed previously-deprecated symbol "
f"'{name}' from {pkg}.__all__"
)
common = sorted(old_exports & new_exports)
pairs: list[tuple[object, object]] = []
for name in common:
try:
pairs.append((old_mod[name], new_mod[name]))
except Exception as e:
print(f"::warning title={title}::Unable to resolve symbol {name}: {e}")
breakages, undeprecated_members = _collect_breakages_pairs(
pairs,
deprecated=deprecated,
title=title,
)
total_breaks += len(breakages)
undeprecated_removals += undeprecated_members
return total_breaks, undeprecated_removals
def _check_package(griffe_module, repo_root: str, cfg: PackageConfig) -> int:
"""Run breakage checks for a single package. Returns 0 on success."""
pyproj = os.path.join(repo_root, cfg.source_dir, "pyproject.toml")
new_version = read_version_from_pyproject(pyproj)
title = f"{cfg.distribution} API"
baseline = get_pypi_baseline_version(cfg.distribution, new_version)
if not baseline:
print(
f"::warning title={title}::No baseline {cfg.distribution} "
f"release found; skipping breakage check",
)
return 0
print(f"Comparing {cfg.distribution} {new_version} against {baseline}")
new_root = _load_current(griffe_module, repo_root, cfg)
if not new_root:
return 1
old_root = _load_prev_from_pypi(griffe_module, baseline, cfg)
if not old_root:
return 1
try:
total_breaks, undeprecated = _compute_breakages(old_root, new_root, cfg)
except Exception as e:
print(f"::error title={title}::Failed to compute breakages: {e}")
return 1
if undeprecated:
print(
f"::error title={title}::{undeprecated} symbol(s) removed "
f"from {cfg.package} without prior deprecation — "
f"see errors above"
)
bump_rc = _check_version_bump(baseline, new_version, total_breaks)
return 1 if (undeprecated or bump_rc) else 0
def main() -> int:
"""Main entry point for API breakage detection."""
repo_root = os.getcwd()
rc = _check_acp_version_bump(repo_root)
ensure_griffe()
import griffe
for cfg in PACKAGES:
print(f"\n{'=' * 60}")
print(f"Checking {cfg.distribution} ({cfg.package})")
print(f"{'=' * 60}")
rc |= _check_package(griffe, repo_root, cfg)
return rc
if __name__ == "__main__":
raise SystemExit(main())
-196
View File
@@ -1,196 +0,0 @@
"""Guard package version changes so they only happen in release PRs."""
from __future__ import annotations
import os
import re
import subprocess
import sys
import tomllib
from dataclasses import dataclass
from pathlib import Path
PACKAGE_PYPROJECTS: dict[str, Path] = {
"openhands-sdk": Path("openhands-sdk/pyproject.toml"),
"openhands-tools": Path("openhands-tools/pyproject.toml"),
"openhands-workspace": Path("openhands-workspace/pyproject.toml"),
"openhands-agent-server": Path("openhands-agent-server/pyproject.toml"),
}
_VERSION_PATTERN = r"\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.]+)?"
_RELEASE_TITLE_RE = re.compile(rf"^Release v(?P<version>{_VERSION_PATTERN})$")
_RELEASE_BRANCH_RE = re.compile(rf"^rel-(?P<version>{_VERSION_PATTERN})$")
@dataclass(frozen=True)
class VersionChange:
package: str
path: Path
previous_version: str
current_version: str
def _read_version_from_pyproject_text(text: str, source: str) -> str:
data = tomllib.loads(text)
version = data.get("project", {}).get("version")
if not isinstance(version, str):
raise SystemExit(f"Unable to determine project.version from {source}")
return version
def _read_current_version(repo_root: Path, pyproject: Path) -> str:
return _read_version_from_pyproject_text(
(repo_root / pyproject).read_text(),
str(pyproject),
)
def _read_version_from_git_ref(repo_root: Path, git_ref: str, pyproject: Path) -> str:
result = subprocess.run(
["git", "show", f"{git_ref}:{pyproject.as_posix()}"],
cwd=repo_root,
check=False,
capture_output=True,
text=True,
)
if result.returncode != 0:
message = result.stderr.strip() or result.stdout.strip() or "unknown git error"
raise SystemExit(
f"Unable to read {pyproject} from git ref {git_ref}: {message}"
)
return _read_version_from_pyproject_text(result.stdout, f"{git_ref}:{pyproject}")
def _base_ref_candidates(base_ref: str) -> list[str]:
if base_ref.startswith("origin/"):
return [base_ref, base_ref.removeprefix("origin/")]
return [f"origin/{base_ref}", base_ref]
def find_version_changes(repo_root: Path, base_ref: str) -> list[VersionChange]:
changes: list[VersionChange] = []
candidates = _base_ref_candidates(base_ref)
for package, pyproject in PACKAGE_PYPROJECTS.items():
current_version = _read_current_version(repo_root, pyproject)
previous_error: SystemExit | None = None
previous_version: str | None = None
for candidate in candidates:
try:
previous_version = _read_version_from_git_ref(
repo_root, candidate, pyproject
)
break
except SystemExit as exc:
previous_error = exc
if previous_version is None:
assert previous_error is not None
raise previous_error
if previous_version != current_version:
changes.append(
VersionChange(
package=package,
path=pyproject,
previous_version=previous_version,
current_version=current_version,
)
)
return changes
def get_release_pr_version(
pr_title: str, pr_head_ref: str
) -> tuple[str | None, list[str]]:
title_match = _RELEASE_TITLE_RE.fullmatch(pr_title.strip())
branch_match = _RELEASE_BRANCH_RE.fullmatch(pr_head_ref.strip())
title_version = title_match.group("version") if title_match else None
branch_version = branch_match.group("version") if branch_match else None
if title_version and branch_version and title_version != branch_version:
return None, [
"Release PR markers disagree: title requests "
f"v{title_version} but branch is rel-{branch_version}."
]
return title_version or branch_version, []
def validate_version_changes(
changes: list[VersionChange],
pr_title: str,
pr_head_ref: str,
) -> list[str]:
if not changes:
return []
release_version, errors = get_release_pr_version(pr_title, pr_head_ref)
if errors:
return errors
formatted_changes = ", ".join(
f"{change.package} ({change.previous_version} -> {change.current_version})"
for change in changes
)
if release_version is None:
return [
"Package version changes are only allowed in release PRs. "
f"Detected changes: {formatted_changes}. "
"Use the Prepare Release workflow so the PR title is 'Release vX.Y.Z' "
"or the branch is 'rel-X.Y.Z'."
]
mismatched = [
change for change in changes if change.current_version != release_version
]
if mismatched:
mismatch_details = ", ".join(
f"{change.package} ({change.current_version})" for change in mismatched
)
return [
f"Release PR version v{release_version} does not match changed package "
f"versions: {mismatch_details}."
]
return []
def main() -> int:
repo_root = Path(__file__).resolve().parents[2]
base_ref = os.environ.get("VERSION_BUMP_BASE_REF") or os.environ.get(
"GITHUB_BASE_REF"
)
if not base_ref:
print("::warning title=Version bump guard::No base ref found; skipping check.")
return 0
pr_title = os.environ.get("PR_TITLE", "")
pr_head_ref = os.environ.get("PR_HEAD_REF", "")
changes = find_version_changes(repo_root, base_ref)
errors = validate_version_changes(changes, pr_title, pr_head_ref)
if errors:
for error in errors:
print(f"::error title=Version bump guard::{error}")
return 1
if changes:
changed_packages = ", ".join(change.package for change in changes)
print(
"::notice title=Version bump guard::"
f"Release PR version changes validated for {changed_packages}."
)
else:
print("::notice title=Version bump guard::No package version changes detected.")
return 0
if __name__ == "__main__":
sys.exit(main())
-122
View File
@@ -1,122 +0,0 @@
#!/usr/bin/env python3
"""Update the sdk_ref default value in run-eval.yml.
This script updates the default SDK reference version in the run-eval workflow
to match a new release version.
"""
from __future__ import annotations
import argparse
import re
import sys
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parents[2]
RUN_EVAL_WORKFLOW = REPO_ROOT / ".github" / "workflows" / "run-eval.yml"
# Pattern to match the sdk_ref default line
# Matches: "default: vX.Y.Z" with optional prerelease suffix like -rc1, -beta.1
SDK_REF_PATTERN = re.compile(
r"^(\s*default:\s*v)[\d]+\.[\d]+\.[\d]+(-[a-zA-Z0-9.]+)?(\s*)$"
)
def update_sdk_ref_default(new_version: str, dry_run: bool = False) -> bool:
"""Update the sdk_ref default in run-eval.yml.
Args:
new_version: The new version (without 'v' prefix, e.g., "1.12.0")
dry_run: If True, print what would change without modifying the file
Returns:
True if successful, False otherwise
"""
if not RUN_EVAL_WORKFLOW.exists():
print(f"❌ File not found: {RUN_EVAL_WORKFLOW}", file=sys.stderr)
return False
content = RUN_EVAL_WORKFLOW.read_text()
lines = content.splitlines(keepends=True)
# Find the sdk_ref input section and its default line
in_sdk_ref_section = False
updated = False
old_version = None
for i, line in enumerate(lines):
stripped = line.strip()
# Track when we enter the sdk_ref input section
if stripped == "sdk_ref:":
in_sdk_ref_section = True
continue
# Track when we exit the sdk_ref section (another input starts)
if (
in_sdk_ref_section
and stripped.endswith(":")
and not stripped.startswith("default")
):
in_sdk_ref_section = False
# Update the default line within the sdk_ref section
if in_sdk_ref_section:
match = SDK_REF_PATTERN.match(line)
if match:
old_version = line.strip().replace("default: ", "")
new_line = f"{match.group(1)}{new_version}{match.group(3) or ''}"
if not line.endswith("\n") and lines[i].endswith("\n"):
new_line += "\n"
elif line.endswith("\n"):
new_line += "\n"
lines[i] = new_line
updated = True
break
if not updated:
print("❌ Could not find sdk_ref default line to update", file=sys.stderr)
return False
if dry_run:
print(f"Would update sdk_ref default: {old_version} → v{new_version}")
return True
# Write the updated content
RUN_EVAL_WORKFLOW.write_text("".join(lines))
print(f"✅ Updated sdk_ref default: {old_version} → v{new_version}")
return True
def main() -> int:
parser = argparse.ArgumentParser(
description="Update the sdk_ref default value in run-eval.yml"
)
parser.add_argument(
"version",
help="New version (without 'v' prefix, e.g., '1.12.0')",
)
parser.add_argument(
"--dry-run",
action="store_true",
help="Print what would change without modifying the file",
)
args = parser.parse_args()
# Validate version format
version_pattern = re.compile(r"^\d+\.\d+\.\d+(-[a-zA-Z0-9.]+)?$")
if not version_pattern.match(args.version):
print(
f"❌ Invalid version format: {args.version}. "
"Expected: X.Y.Z or X.Y.Z-suffix",
file=sys.stderr,
)
return 1
success = update_sdk_ref_default(args.version, dry_run=args.dry_run)
return 0 if success else 1
if __name__ == "__main__":
sys.exit(main())
-125
View File
@@ -1,125 +0,0 @@
# Release Automation Workflows
This document describes the automated release workflows for the OpenHands Software Agent SDK.
## Overview
The release process has been automated with two GitHub Actions workflows:
1. **prepare-release.yml** - Prepares a release PR with version updates
2. **pypi-release.yml** - Automatically publishes packages to PyPI when a release is created
## How to Create a New Release
### Step 1: Trigger the Prepare Release Workflow
1. Go to the [Actions tab](https://github.com/OpenHands/software-agent-sdk/actions)
2. Select **"Prepare Release"** workflow from the left sidebar
3. Click **"Run workflow"** button
4. Enter the version number (e.g., `1.2.3`) - must be in format `X.Y.Z`
5. Click **"Run workflow"**
The workflow will automatically:
- ✅ Create a new branch named `rel-X.Y.Z`
- ✅ Update all package versions using `make set-package-version`
- ✅ Commit the changes
- ✅ Push the branch
- ✅ Create a PR with labels `integration-tests` and `test-examples`
### Step 2: Review the PR
The created PR will include a checklist. Complete the following:
- [ ] Fix any deprecation deadlines if they exist
- [ ] Verify integration tests pass (triggered by `integration-tests` label)
- [ ] Verify example checks pass (triggered by `test-examples` label)
- [ ] Review and approve the PR
### Step 3: Create the GitHub Release
1. Go to [Releases](https://github.com/OpenHands/software-agent-sdk/releases/new)
2. Click **"Draft a new release"**
3. Configure the release:
- **Tag**: `vX.Y.Z` (must match the version)
- **Branch**: `rel-X.Y.Z` (the branch created by the workflow)
- **Previous tag**: Select the previous release version
4. Click **"Generate release notes"** to auto-generate the changelog
5. Review and edit the release notes as needed
6. Click **"Publish release"**
### Step 4: PyPI Publication (Automated)
Once the release is published, the **pypi-release.yml** workflow will automatically:
- ✅ Build all packages (openhands-sdk, openhands-tools, openhands-workspace, openhands-agent-server)
- ✅ Publish them to PyPI
You can monitor the progress in the [Actions tab](https://github.com/OpenHands/software-agent-sdk/actions/workflows/pypi-release.yml).
### Step 5: Version Bump PRs (Automated)
After successful PyPI publication, the workflow will automatically create PRs to update SDK versions in downstream repositories:
- **[OpenHands](https://github.com/All-Hands-AI/OpenHands)** - Updates `openhands-sdk`, `openhands-tools`, and `openhands-agent-server` versions
- **[OpenHands-CLI](https://github.com/All-Hands-AI/openhands-cli)** - Updates `openhands-sdk` and `openhands-tools` versions
These PRs will:
- Be created automatically with branch name `bump-sdk-X.Y.Z`
- Include links back to the SDK release
- Need to be reviewed and merged by the respective repository maintainers
### Step 6: Post-Release Tasks
- [ ] Merge the release PR to main
- [ ] Review and merge the auto-created version bump PRs in OpenHands and OpenHands-CLI
- [ ] Run evaluation on OpenHands Index (manual step)
- [ ] Announce the release
## Manual PyPI Release (If Needed)
If you need to manually trigger the PyPI release workflow:
1. Go to the [Actions tab](https://github.com/OpenHands/software-agent-sdk/actions)
2. Select **"Publish all OpenHands packages (uv)"** workflow
3. Click **"Run workflow"**
4. Select the branch/tag you want to publish from
5. Click **"Run workflow"**
## Workflow Files
- `.github/workflows/prepare-release.yml` - Automated release preparation
- `.github/workflows/pypi-release.yml` - PyPI package publication
## Troubleshooting
### Version Format Error
If you get a version format error, ensure you're using the format `X.Y.Z` (e.g., `1.2.3`), not `vX.Y.Z`.
### PR Creation Failed
If the PR creation fails, check:
- The branch doesn't already exist
- You have proper permissions
- The `GITHUB_TOKEN` has sufficient permissions
### PyPI Publication Failed
If PyPI publication fails:
- Check that the `PYPI_TOKEN_OPENHANDS` secret is properly configured
- Verify the version doesn't already exist on PyPI
- Check the workflow logs for specific error messages
## Previous Manual Process
For reference, the previous manual release checklist was:
- [ ] Checkout SDK repo, use `make set-package-version version=x.x.x` to set the version
- [ ] Push to a branch like `rel-x.x.x` and start a PR
- [ ] Fix any "deprecation deadlines" if they exist
- [ ] Tag "integration-tests" and make sure integration test all pass
- [ ] Tag "test-examples" and make sure example checks all pass
- [ ] Draft a new release
- [ ] Use workflow to publish to PyPI on tag `v1.X.X`
- [ ] Evaluation on OpenHands Index
Most of these steps are now automated!
@@ -1,154 +0,0 @@
---
name: REST API breakage checks
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
agent-server-rest-api:
name: REST API (OpenAPI)
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install workspace deps (dev)
run: uv sync --frozen --group dev
- name: Install oasdiff
run: |
curl -L https://raw.githubusercontent.com/oasdiff/oasdiff/main/install.sh | sh -s -- -b /usr/local/bin
oasdiff --version
- name: Run agent server REST API breakage check
id: api_breakage
# Let this step fail so CI is visibly red on breakage.
# Later reporting steps still run because they use if: always().
run: |
uv run --with packaging python .github/scripts/check_agent_server_rest_api_breakage.py 2>&1 | tee api-breakage.log
exit_code=${PIPESTATUS[0]}
echo "exit_code=${exit_code}" >> "$GITHUB_OUTPUT"
exit "${exit_code}"
- name: Write REST API breakage summary
if: ${{ always() }}
env:
EXIT_CODE: ${{ steps.api_breakage.outputs.exit_code }}
IS_FORK: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name != github.repository }}
LOG_PATH: api-breakage.log
RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
python3 <<'PY' >> "$GITHUB_STEP_SUMMARY"
import os
from pathlib import Path
exit_code = int(os.environ.get('EXIT_CODE', '0') or '0')
is_fork = os.environ.get('IS_FORK', 'false') == 'true'
run_url = os.environ['RUN_URL']
status = '✅ **PASSED**' if exit_code == 0 else '❌ **FAILED**'
print(f'## REST API breakage checks (OpenAPI) — {status}')
print()
print(f"**Result:** {status}")
if exit_code != 0:
print()
print('> ⚠️ Breaking REST API changes or policy violations detected.')
print()
if is_fork:
print(
'_Fork PR detected: sticky PR comment was skipped because '
'the GitHub token is read-only for `pull_request` workflows '
'from forks._'
)
print()
if exit_code != 0:
try:
log = Path(os.environ['LOG_PATH']).read_text()
except Exception as exc:
log = f'Unable to read log file: {exc}'
excerpt = log[:1000].replace('```', '``\\`')
print('<details><summary>Log excerpt (first 1000 characters)</summary>')
print()
print('```text')
print(excerpt)
print('```')
print()
print('</details>')
print()
print(f'[Action log]({run_url})')
PY
- name: Post REST API breakage report to PR
if: ${{ always() && github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository }}
uses: actions/github-script@v8
env:
EXIT_CODE: ${{ steps.api_breakage.outputs.exit_code }}
LOG_PATH: api-breakage.log
with:
script: |
const fs = require('fs');
const marker = '<!-- agent-server-rest-api-breakage-report -->';
const exitCode = Number(process.env.EXIT_CODE || '0');
const runUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`;
const status = exitCode === 0 ? '✅ **PASSED**' : '❌ **FAILED**';
let body = `${marker}\n## REST API breakage checks (OpenAPI) — ${status}\n\n**Result:** ${status}\n`;
if (exitCode !== 0) {
body += `\n> ⚠️ Breaking REST API changes or policy violations detected.\n`;
let log = '';
try {
log = fs.readFileSync(process.env.LOG_PATH, 'utf8');
} catch (e) {
log = `Unable to read log file: ${e}`;
}
const excerpt = log.slice(0, 1000).replace(/```/g, '``\\`');
body += `\n<details><summary>Log excerpt (first 1000 characters)</summary>\n\n\`\`\`text\n${excerpt}\n\`\`\`\n\n</details>\n`;
}
body += `\n[Action log](${runUrl})\n`;
const { owner, repo } = context.repo;
const issue_number = context.issue.number;
const { data: comments } = await github.rest.issues.listComments({
owner,
repo,
issue_number,
per_page: 100,
});
const existing = comments.find((c) => c.body && c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner,
repo,
comment_id: existing.id,
body,
});
} else {
await github.rest.issues.createComment({
owner,
repo,
issue_number,
body,
});
}
-149
View File
@@ -1,149 +0,0 @@
---
name: Python API breakage checks
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
sdk-api:
name: Python API
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install workspace deps (dev)
run: uv sync --frozen --group dev
- name: Run Python API breakage check
id: api_breakage
# Let this step fail so CI is visibly red on breakage.
# Later reporting steps still run because they use if: always().
env:
ACP_VERSION_CHECK_BASE_REF: ${{ github.event_name == 'pull_request' && github.base_ref || github.event.before }}
ACP_VERSION_CHECK_SKIP: ${{ github.event_name == 'pull_request' && contains(github.event.pull_request.body || '', 'skip-acp-check')
}}
run: |
uv run python .github/scripts/check_sdk_api_breakage.py 2>&1 | tee api-breakage.log
exit_code=${PIPESTATUS[0]}
echo "exit_code=${exit_code}" >> "$GITHUB_OUTPUT"
exit "${exit_code}"
- name: Write API breakage summary
if: ${{ always() }}
env:
EXIT_CODE: ${{ steps.api_breakage.outputs.exit_code }}
IS_FORK: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name != github.repository }}
LOG_PATH: api-breakage.log
RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: |
python3 <<'PY' >> "$GITHUB_STEP_SUMMARY"
import os
from pathlib import Path
exit_code = int(os.environ.get('EXIT_CODE', '0') or '0')
is_fork = os.environ.get('IS_FORK', 'false') == 'true'
run_url = os.environ['RUN_URL']
status = '✅ **PASSED**' if exit_code == 0 else '❌ **FAILED**'
print(f'## Python API breakage checks — {status}')
print()
print(f"**Result:** {status}")
if exit_code != 0:
print()
print('> ⚠️ Breaking API changes or policy violations detected.')
print()
if is_fork:
print(
'_Fork PR detected: sticky PR comment was skipped because '
'the GitHub token is read-only for `pull_request` workflows '
'from forks._'
)
print()
if exit_code != 0:
try:
log = Path(os.environ['LOG_PATH']).read_text()
except Exception as exc:
log = f'Unable to read log file: {exc}'
excerpt = log[:1000].replace('```', '``\\`')
print('<details><summary>Log excerpt (first 1000 characters)</summary>')
print()
print('```text')
print(excerpt)
print('```')
print()
print('</details>')
print()
print(f'[Action log]({run_url})')
PY
- name: Post API breakage report to PR
if: ${{ always() && github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository }}
uses: actions/github-script@v8
env:
EXIT_CODE: ${{ steps.api_breakage.outputs.exit_code }}
LOG_PATH: api-breakage.log
with:
script: |
const fs = require('fs');
const marker = '<!-- api-breakage-report -->';
const exitCode = Number(process.env.EXIT_CODE || '0');
const runUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`;
const status = exitCode === 0 ? '✅ **PASSED**' : '❌ **FAILED**';
let body = `${marker}\n## Python API breakage checks — ${status}\n\n**Result:** ${status}\n`;
if (exitCode !== 0) {
body += `\n> ⚠️ Breaking API changes or policy violations detected.\n`;
let log = '';
try {
log = fs.readFileSync(process.env.LOG_PATH, 'utf8');
} catch (e) {
log = `Unable to read log file: ${e}`;
}
const excerpt = log.slice(0, 1000).replace(/```/g, '``\\`');
body += `\n<details><summary>Log excerpt (first 1000 characters)</summary>\n\n\`\`\`text\n${excerpt}\n\`\`\`\n\n</details>\n`;
}
body += `\n[Action log](${runUrl})\n`;
const { owner, repo } = context.repo;
const issue_number = context.issue.number;
const { data: comments } = await github.rest.issues.listComments({
owner,
repo,
issue_number,
per_page: 100,
});
const existing = comments.find((c) => c.body && c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner,
repo,
comment_id: existing.id,
body,
});
} else {
await github.rest.issues.createComment({
owner,
repo,
issue_number,
body,
});
}
-130
View File
@@ -1,130 +0,0 @@
---
name: API Compliance Tests
on:
pull_request:
types: [labeled]
workflow_dispatch:
inputs:
reason:
description: Reason for running compliance tests
required: true
patterns:
description: Comma-separated patterns to test (empty = all)
required: false
models:
description: Comma-separated model IDs (empty = all defaults)
required: false
env:
# Default models to test (matches DEFAULT_MODELS in run_compliance.py)
DEFAULT_MODELS: claude-sonnet-4-5,gpt-5.2,gemini-3-pro
jobs:
run-compliance-tests:
# Only run on api-compliance-test label or workflow_dispatch
if: |
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'pull_request' && github.event.label.name == 'api-compliance-test')
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.sha || github.ref }}
persist-credentials: false
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install dependencies
run: uv sync --dev
- name: Determine test parameters
id: params
run: |
# Use input values or defaults
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
PATTERNS="${{ github.event.inputs.patterns }}"
MODELS="${{ github.event.inputs.models }}"
else
PATTERNS=""
MODELS=""
fi
# Build command args
ARGS=""
if [ -n "$PATTERNS" ]; then
ARGS="$ARGS --patterns $PATTERNS"
fi
if [ -n "$MODELS" ]; then
ARGS="$ARGS --models $MODELS"
else
ARGS="$ARGS --models $DEFAULT_MODELS"
fi
echo "args=$ARGS" >> $GITHUB_OUTPUT
- name: Run API compliance tests
id: compliance
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY_EVAL }}
LLM_BASE_URL: https://llm-proxy.eval.all-hands.dev
GITHUB_RUN_ID: ${{ github.run_id }}
run: |
uv run python tests/integration/api_compliance/run_compliance.py \
${{ steps.params.outputs.args }} \
--output-dir compliance-results/
continue-on-error: true # Tests may "fail" but that's expected
- name: Upload results
uses: actions/upload-artifact@v7
with:
name: compliance-results
path: compliance-results/
retention-days: 30
- name: Post results to PR
if: github.event_name == 'pull_request'
uses: actions/github-script@v8
with:
script: |
const fs = require('fs');
const path = require('path');
// Find the report directory
const resultsDir = 'compliance-results';
const dirs = fs.readdirSync(resultsDir);
if (dirs.length === 0) {
console.log('No results found');
return;
}
const latestDir = path.join(resultsDir, dirs[0]);
const reportPath = path.join(latestDir, 'compliance_report.md');
if (!fs.existsSync(reportPath)) {
console.log('Report not found at', reportPath);
return;
}
let report = fs.readFileSync(reportPath, 'utf8');
// Truncate if too long
if (report.length > 60000) {
report = report.substring(0, 60000) + '\n\n... (truncated)';
}
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.payload.pull_request.number,
body: report
});
-224
View File
@@ -1,224 +0,0 @@
---
# To set this up:
# 1. Change the name below to something relevant to your task
# 2. Modify the "env" section below with your prompt
# 3. Add your LLM_API_KEY to the repository secrets
# 4. Commit this file to your repository
# 5. Trigger the workflow manually or set up a schedule
name: Assign Reviews
on:
# Manual trigger
workflow_dispatch:
# Scheduled trigger (disabled by default, uncomment and customize as needed)
schedule:
# Run at 12 PM UTC every day
- cron: 0 12 * * *
permissions:
contents: write
pull-requests: write
issues: write
jobs:
run-task:
# Only run scheduled jobs in the main repository, not in forks
if: github.repository == 'OpenHands/software-agent-sdk' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-24.04
env:
# Configuration (modify these values as needed)
AGENT_SCRIPT_URL: https://raw.githubusercontent.com/OpenHands/agent-sdk/main/examples/03_github_workflows/01_basic_action/agent_script.py
# Provide either PROMPT_LOCATION (URL/file) OR PROMPT_STRING (direct text), not both
# Option 1: Use a URL or file path for the prompt
PROMPT_LOCATION: ''
# PROMPT_LOCATION: 'https://example.com/prompts/maintenance.txt'
# Option 2: Use direct text for the prompt
PROMPT_STRING: >
Use GITHUB_TOKEN and the github API to organize open pull requests and issues in the repo.
Read the sections below in order, and perform each in order. Do NOT take action
on the same issue or PR twice.
# Issues with needs-info - Check for OP Response
Find all open issues that have the "needs-info" label. For each issue:
1. Identify the original poster (issue author)
2. Check if there are any comments from the original poster AFTER the "needs-info" label was added
3. To determine when the label was added, use: GET /repos/{owner}/{repo}/issues/{issue_number}/timeline
and look for "labeled" events with the label "needs-info"
4. If the original poster has commented after the label was added:
- Remove the "needs-info" label
- Add the "needs-triage" label
# Issues with needs-triage
Find all open issues that have the "needs-triage" label. For each issue that has been in this state for more than 2 days:
1. First, check if the issue has already been triaged by verifying it does NOT have:
- The "enhancement" label
- Any "priority" label (priority:low, priority:medium, priority:high, etc.)
2. If the issue has already been triaged (has enhancement or priority label), remove the "needs-triage" label
3. For issues that have NOT been triaged yet:
- Read the issue description and comments
- Check if it is a bug report, feature request, or question and add the appropriate label
- If it is a bug report and it does not have a priority label
* Read the MAINTAINERS file in the repository root to get the list of maintainers
* Extract all usernames from lines starting with "- @" and join them with spaces, each prefixed with @
(e.g., if the file contains "- @user1" and "- @user2", format as "@user1 @user2")
* Tag ALL maintainers with: "[Automatic Post]: This issue has been waiting for triage. <maintainers>, could you
please take a look and add the appropriate priority label when you have a chance?"
(Replace <maintainers> with the formatted list from the previous step)
# Need Reviewer Action
Find all open PRs where:
1. The PR is waiting for review (there are no open review comments or change requests)
2. The PR is in a "clean" state (CI passing, no merge conflicts)
3. The PR is not marked as draft (draft: false)
4. The PR has had no activity (comments, commits, reviews) for more than 3 days.
In this case, send a message to the reviewers:
[Automatic Post]: This PR seems to be currently waiting for review.
{reviewer_names}, could you please take a look when you have a chance?
# Need Author Action
Find all open PRs where the most recent change or comment was made on the pull
request more than 5 days ago (use 14 days if the PR is marked as draft).
And send a message to the author:
[Automatic Post]: It has been a while since there was any activity on this PR.
{author}, are you still working on it? If so, please go ahead, if not then
please request review, close it, or request that someone else follow up.
# Need Reviewers
Find all open pull requests that TRULY have NO reviewers assigned. To do this correctly:
1. Use the GitHub API to fetch PR details: GET /repos/{owner}/{repo}/pulls/{pull_number}
2. Check the "requested_reviewers" and "requested_teams" arrays
3. ALSO check for submitted reviews: GET /repos/{owner}/{repo}/pulls/{pull_number}/reviews
4. A PR needs reviewers ONLY if ALL of these are true:
- The "requested_reviewers" array is empty (no pending review requests)
- The "requested_teams" array is empty (no pending team review requests)
- The reviews array is empty (no reviews have been submitted yet)
5. IMPORTANT: If ANY of these has entries, SKIP this PR - it already has or had reviewers!
Example API responses showing a PR that DOES NOT need reviewers (skip this):
Case 1 - Has requested reviewers:
GET /pulls/{number}: {"requested_reviewers": [{"login": "someuser"}], "requested_teams": []}
Case 2 - Has submitted reviews (even if requested_reviewers is empty):
GET /pulls/{number}: {"requested_reviewers": [], "requested_teams": []}
GET /pulls/{number}/reviews: [{"user": {"login": "someuser"}, "state": "COMMENTED"}]
Example API response showing a PR that DOES need reviewers (process this):
GET /pulls/{number}: {"requested_reviewers": [], "requested_teams": []}
GET /pulls/{number}/reviews: []
Additional criteria for PRs that need reviewers:
1. Are not marked as draft (draft: false)
2. Were created more than 1 day ago
3. CI is passing and there are no merge conflicts
For each PR that truly has NO reviewers:
1) Read git blame for changed files to identify recent, active contributors.
2) From those candidates, ONLY consider maintainers — repository collaborators with write access or higher. Verify via the GitHub API before
requesting review:
- Preferred: GET /repos/{owner}/{repo}/collaborators (no permission filter). Filter client-side using either:
role_name in ["write", "maintain", "admin"] OR permissions.push || permissions.admin. Note: paginate if > 30 collaborators.
- Alternative: GET /repos/{owner}/{repo}/collaborators/{username}/permission and accept if permission in {push, maintain, admin}.
3) If multiple maintainers qualify, avoid assigning too many reviews to any single one.
4) Request review from exactly one maintainer and add this message:
[Automatic Post]: I have assigned {reviewer} as a reviewer based on git blame information.
Thanks in advance for the help!
LLM_MODEL: litellm_proxy/claude-sonnet-4-5-20250929
LLM_BASE_URL: https://llm-proxy.app.all-hands.dev
steps:
- name: Checkout repository
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install OpenHands dependencies
run: |
# Install OpenHands SDK and tools from git repository
uv pip install --system "openhands-sdk @ git+https://github.com/OpenHands/agent-sdk.git@main#subdirectory=openhands-sdk"
uv pip install --system "openhands-tools @ git+https://github.com/OpenHands/agent-sdk.git@main#subdirectory=openhands-tools"
- name: Check required configuration
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
run: |
if [ -z "$LLM_API_KEY" ]; then
echo "Error: LLM_API_KEY secret is not set."
exit 1
fi
# Check that exactly one of PROMPT_LOCATION or PROMPT_STRING is set
if [ -n "$PROMPT_LOCATION" ] && [ -n "$PROMPT_STRING" ]; then
echo "Error: Both PROMPT_LOCATION and PROMPT_STRING are set."
echo "Please provide only one in the env section of the workflow file."
exit 1
fi
if [ -z "$PROMPT_LOCATION" ] && [ -z "$PROMPT_STRING" ]; then
echo "Error: Neither PROMPT_LOCATION nor PROMPT_STRING is set."
echo "Please set one in the env section of the workflow file."
exit 1
fi
if [ -n "$PROMPT_LOCATION" ]; then
echo "Prompt location: $PROMPT_LOCATION"
else
echo "Using inline PROMPT_STRING (${#PROMPT_STRING} characters)"
fi
echo "LLM model: $LLM_MODEL"
if [ -n "$LLM_BASE_URL" ]; then
echo "LLM base URL: $LLM_BASE_URL"
fi
- name: Run task
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
GITHUB_TOKEN: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
PYTHONPATH: ''
run: |
echo "Running agent script: $AGENT_SCRIPT_URL"
# Download script if it's a URL
if [[ "$AGENT_SCRIPT_URL" =~ ^https?:// ]]; then
echo "Downloading agent script from URL..."
curl -sSL "$AGENT_SCRIPT_URL" -o /tmp/agent_script.py
AGENT_SCRIPT_PATH="/tmp/agent_script.py"
else
AGENT_SCRIPT_PATH="$AGENT_SCRIPT_URL"
fi
# Run with appropriate prompt argument
if [ -n "$PROMPT_LOCATION" ]; then
echo "Using prompt from: $PROMPT_LOCATION"
uv run python "$AGENT_SCRIPT_PATH" "$PROMPT_LOCATION"
else
echo "Using PROMPT_STRING (${#PROMPT_STRING} characters)"
uv run python "$AGENT_SCRIPT_PATH"
fi
- name: Upload logs as artifact
uses: actions/upload-artifact@v7
if: always()
with:
name: openhands-task-logs
path: |
*.log
output/
retention-days: 7
-36
View File
@@ -1,36 +0,0 @@
---
name: Auto-label New Issues
on:
issues:
types: [opened]
permissions:
issues: write
jobs:
add-triage-label:
runs-on: ubuntu-latest
steps:
- name: Add needs-triage label
uses: actions/github-script@v8
with:
github-token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
script: |
// Get the issue details
const issue = context.payload.issue;
const labels = issue.labels.map(label => label.name);
// Check if issue has already been triaged
const hasEnhancement = labels.includes('enhancement');
const hasPriority = labels.some(label => label.startsWith('priority'));
// Only add needs-triage if not already triaged
if (!hasEnhancement && !hasPriority) {
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
labels: ['needs-triage']
});
}
-25
View File
@@ -1,25 +0,0 @@
---
# .github/workflows/check-docstrings.yml
name: Check Docstrings
on:
push:
branches: [main]
pull_request:
branches: ['**']
jobs:
check-docstrings:
runs-on: ubuntu-24.04
steps:
- name: Checkout code
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Check docstring formatting
run: python .github/scripts/check_docstrings.py
@@ -1,59 +0,0 @@
---
name: '[Optional] Docs example'
on:
pull_request:
branches:
- '**'
paths:
- examples/**/*.py
- '!examples/03_github_workflows/**'
- '!examples/04_llm_specific_tools/**'
- .github/workflows/check-documented-examples.yml
- .github/scripts/check_documented_examples.py
workflow_dispatch:
permissions:
contents: read
pull-requests: read
jobs:
check-examples:
runs-on: ubuntu-latest
steps:
- name: Checkout agent-sdk repository
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Checkout docs repository (try feature branch)
uses: actions/checkout@v5
continue-on-error: true
id: checkout-feature
with:
repository: OpenHands/docs
path: docs
fetch-depth: 0
ref: ${{ github.head_ref || github.ref_name }}
- name: Checkout docs repository (fallback to main)
if: steps.checkout-feature.outcome == 'failure'
uses: actions/checkout@v5
with:
repository: OpenHands/docs
path: docs
fetch-depth: 0
ref: main
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Check documented examples
env:
DOCS_PATH: ${{ github.workspace }}/docs
shell: bash
run: |
set -euo pipefail
python .github/scripts/check_documented_examples.py
@@ -1,35 +0,0 @@
---
name: Check duplicate example numbers
on:
pull_request:
branches:
- '**'
paths:
- examples/**
- .github/workflows/check-duplicate-examples.yml
- .github/scripts/check_duplicate_example_numbers.py
push:
branches:
- main
paths:
- examples/**
workflow_dispatch:
permissions:
contents: read
jobs:
check-duplicates:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Check for duplicate example numbers
run: python .github/scripts/check_duplicate_example_numbers.py
+69
View File
@@ -0,0 +1,69 @@
# Workflow that cleans up outdated and old workflows to prevent out of disk issues
name: Delete old workflow runs
# This workflow is currently only triggered manually
on:
workflow_dispatch:
inputs:
days:
description: 'Days-worth of runs to keep for each workflow'
required: true
default: '30'
minimum_runs:
description: 'Minimum runs to keep for each workflow'
required: true
default: '10'
delete_workflow_pattern:
description: 'Name or filename of the workflow (if not set, all workflows are targeted)'
required: false
delete_workflow_by_state_pattern:
description: 'Filter workflows by state: active, deleted, disabled_fork, disabled_inactivity, disabled_manually'
required: true
default: "ALL"
type: choice
options:
- "ALL"
- active
- deleted
- disabled_inactivity
- disabled_manually
delete_run_by_conclusion_pattern:
description: 'Remove runs based on conclusion: action_required, cancelled, failure, skipped, success'
required: true
default: 'ALL'
type: choice
options:
- 'ALL'
- 'Unsuccessful: action_required,cancelled,failure,skipped'
- action_required
- cancelled
- failure
- skipped
- success
dry_run:
description: 'Logs simulated changes, no deletions are performed'
required: false
jobs:
del_runs:
runs-on: ubuntu-latest
permissions:
actions: write
contents: read
steps:
- name: Delete workflow runs
uses: Mattraks/delete-workflow-runs@v2
with:
token: ${{ github.token }}
repository: ${{ github.repository }}
retain_days: ${{ github.event.inputs.days }}
keep_minimum_runs: ${{ github.event.inputs.minimum_runs }}
delete_workflow_pattern: ${{ github.event.inputs.delete_workflow_pattern }}
delete_workflow_by_state_pattern: ${{ github.event.inputs.delete_workflow_by_state_pattern }}
delete_run_by_conclusion_pattern: >-
${{
startsWith(github.event.inputs.delete_run_by_conclusion_pattern, 'Unsuccessful:')
&& 'action_required,cancelled,failure,skipped'
|| github.event.inputs.delete_run_by_conclusion_pattern
}}
dry_run: ${{ github.event.inputs.dry_run }}
-244
View File
@@ -1,244 +0,0 @@
---
name: Run Condenser Tests
on:
# Use pull_request_target to access secrets even on fork PRs
# This is safe because we only run when the 'condenser-test' label is added by a maintainer
pull_request_target:
types:
- labeled
workflow_dispatch:
inputs:
reason:
description: Reason for manual trigger
required: true
default: ''
env:
N_PROCESSES: 2 # Fewer parallel processes for condenser tests (only 2 LLMs)
jobs:
post-initial-comment:
if: >
github.event_name == 'pull_request_target' &&
github.event.label.name == 'condenser-test'
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- name: Comment on PR
uses: KeisukeYamashita/create-comment@v1
with:
unique: false
comment: |
Hi! I started running the condenser tests on your PR. You will receive a comment with the results shortly.
Note: These are non-blocking tests that validate condenser functionality across different LLMs.
run-condenser-tests:
# Security: Only run when condenser-test label is present or via workflow_dispatch
# This prevents automatic execution on fork PRs without maintainer approval
if: |
always() && (
(
github.event_name == 'pull_request_target' &&
github.event.label.name == 'condenser-test'
) ||
github.event_name == 'workflow_dispatch'
)
runs-on: ubuntu-22.04
permissions:
contents: read
id-token: write
pull-requests: write
strategy:
matrix:
python-version: ['3.13']
job-config:
# Only run against 2 LLMs for condenser tests:
# - Claude Opus 4.5 (primary - supports thinking blocks)
# - GPT-5.1 Codex Max (secondary - cross-LLM validation)
- name: Claude Opus 4.5
run-suffix: opus_condenser_run
llm-config:
model: litellm_proxy/anthropic/claude-opus-4-5-20251101
extended_thinking: true
- name: GPT-5.1 Codex Max
run-suffix: gpt51_condenser_run
llm-config:
model: litellm_proxy/gpt-5.1-codex-max
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# For pull_request_target: checkout fork PR code (requires explicit repository)
# For other events: fallback to current repository and ref
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.sha || github.ref }}
# Security: Don't persist credentials to prevent untrusted PR code from using them
persist-credentials: false
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: ${{ matrix.python-version }}
- name: Install Python dependencies using uv
run: |
uv sync --dev
uv pip install pytest
- name: Run condenser test evaluation for ${{ matrix.job-config.name }}
env:
LLM_CONFIG: ${{ toJson(matrix.job-config.llm-config) }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_BASE_URL: https://llm-proxy.app.all-hands.dev
run: |
set -eo pipefail
AGENT_SDK_VERSION=$(git rev-parse --short HEAD)
EVAL_NOTE="${AGENT_SDK_VERSION}_${{ matrix.job-config.run-suffix }}"
echo "Running condenser tests only (c*.py pattern)"
uv run python tests/integration/run_infer.py \
--llm-config "$LLM_CONFIG" \
--num-workers $N_PROCESSES \
--eval-note "$EVAL_NOTE" \
--test-type condenser
# get condenser tests JSON results
RESULTS_FILE=$(find tests/integration/outputs/*${{ matrix.job-config.run-suffix }}* -name "results.json" -type f | head -n 1)
echo "RESULTS_FILE: $RESULTS_FILE"
if [ -f "$RESULTS_FILE" ]; then
echo "JSON_RESULTS_FILE=$RESULTS_FILE" >> $GITHUB_ENV
else
echo "JSON_RESULTS_FILE=" >> $GITHUB_ENV
fi
- name: Wait a little bit
run: sleep 10
- name: Create archive of evaluation outputs
run: |
TIMESTAMP=$(date +'%y-%m-%d-%H-%M')
cd tests/integration/outputs # Change to the outputs directory
tar -czvf ../../../condenser_tests_${{ matrix.job-config.run-suffix }}_${TIMESTAMP}.tar.gz *${{ matrix.job-config.run-suffix }}* # Include result directories for this model
- name: Upload evaluation results as artifact
uses: actions/upload-artifact@v7
id: upload_results_artifact
with:
name: condenser-test-outputs-${{ matrix.job-config.run-suffix }}-${{ github.run_id }}-${{ github.run_attempt }}
path: condenser_tests_${{ matrix.job-config.run-suffix }}_*.tar.gz
- name: Save test results for consolidation
run: |
# Copy the structured JSON results file for consolidation
mkdir -p test_results_summary
if [ -n "${{ env.JSON_RESULTS_FILE }}" ] && [ -f "${{ env.JSON_RESULTS_FILE }}" ]; then
# Copy the JSON results file directly
cp "${{ env.JSON_RESULTS_FILE }}" "test_results_summary/${{ matrix.job-config.run-suffix }}_results.json"
echo "✓ Copied JSON results file for consolidation"
else
echo "✗ No JSON results file found"
exit 1
fi
- name: Upload test results summary
uses: actions/upload-artifact@v7
with:
name: test-results-${{ matrix.job-config.run-suffix }}
path: test_results_summary/${{ matrix.job-config.run-suffix }}_results.json
consolidate-results:
needs: run-condenser-tests
if: |
always() && (
(
github.event_name == 'pull_request_target' &&
github.event.label.name == 'condenser-test'
) ||
github.event_name == 'workflow_dispatch'
)
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# When using pull_request_target, explicitly checkout the PR branch
# This ensures we use the scripts from the actual PR code
ref: ${{ github.event.pull_request.head.sha || github.ref }}
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install Python dependencies using uv
run: |
uv sync --dev
- name: Download all test results
uses: actions/download-artifact@v8
with:
pattern: test-results-*
merge-multiple: true
path: all_results
- name: Download all condenser test artifacts
uses: actions/download-artifact@v8
with:
pattern: condenser-test-outputs-*
path: artifacts
- name: Consolidate test results
env:
EVENT_NAME: ${{ github.event_name }}
PR_NUMBER: ${{ github.event.pull_request.number }}
MANUAL_REASON: ${{ github.event.inputs.reason }}
COMMIT_SHA: ${{ github.sha }}
PYTHONPATH: ${{ github.workspace }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
run: |
uv run python tests/integration/utils/consolidate_json_results.py \
--results-dir all_results \
--artifacts-dir artifacts \
--output-file consolidated_results.json
echo "Consolidated results generated successfully"
uv run python tests/integration/utils/generate_markdown_report.py \
--input-file consolidated_results.json \
--output-file consolidated_report.md
- name: Upload consolidated report
uses: actions/upload-artifact@v7
with:
name: consolidated-condenser-report
path: consolidated_report.md
- name: Create consolidated PR comment
if: github.event_name == 'pull_request_target'
run: |
# Add header to clarify these are non-blocking tests
echo "## Condenser Test Results (Non-Blocking)" > final_report.md
echo "" >> final_report.md
echo "> These tests validate condenser functionality and do not block PR merges." >> final_report.md
echo "" >> final_report.md
cat consolidated_report.md >> final_report.md
# Sanitize @OpenHands mentions to prevent self-mention loops
COMMENT_BODY=$(uv run python -c "from openhands.sdk.utils.github import sanitize_openhands_mentions; import sys; print(sanitize_openhands_mentions(sys.stdin.read()), end='')" < final_report.md)
# Use GitHub CLI to create comment with explicit PR number
echo "$COMMENT_BODY" | gh pr comment ${{ github.event.pull_request.number }} --body-file -
env:
GH_TOKEN: ${{ github.token }}
+71 -20
View File
@@ -1,23 +1,74 @@
---
name: Dispatch to docs repo
# Workflow that builds and deploys the documentation website
name: Deploy Docs to GitHub Pages
# * Always run on "main"
# * Run on PRs that target the "main" branch and have changes in the "docs" folder or this workflow
on:
push:
branches:
- main
paths:
- openhands-agent-server/**
workflow_dispatch:
push:
branches:
- main
pull_request:
paths:
- 'docs/**'
- '.github/workflows/deploy-docs.yml'
branches:
- main
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
dispatch:
runs-on: ubuntu-24.04
permissions:
contents: write
steps:
- name: Trigger docs repo sync
uses: peter-evans/repository-dispatch@v4
with:
token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
repository: OpenHands/docs
event-type: update
client-payload: '{"ref": "${{ github.ref }}", "sha": "${{ github.sha }}"}'
# Build the documentation website
build:
if: github.repository == 'All-Hands-AI/OpenHands'
name: Build Docusaurus
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/setup-node@v4
with:
node-version: 18
cache: npm
cache-dependency-path: docs/package-lock.json
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Generate Python Docs
run: rm -rf docs/modules/python && pip install pydoc-markdown && pydoc-markdown
- name: Install dependencies
run: cd docs && npm ci
- name: Build website
run: cd docs && npm run build
- name: Upload Build Artifact
if: github.ref == 'refs/heads/main'
uses: actions/upload-pages-artifact@v3
with:
path: docs/build
# Deploy the documentation website
deploy:
if: github.ref == 'refs/heads/main' && github.repository == 'All-Hands-AI/OpenHands'
name: Deploy to GitHub Pages
runs-on: ubuntu-latest
# This job only runs on "main" so only run one of these jobs at a time
# otherwise it will fail if one is already running
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
needs: build
# Grant GITHUB_TOKEN the permissions required to make a Pages deployment
permissions:
pages: write # to deploy to Pages
id-token: write # to verify the deployment originates from an appropriate source
# Deploy to the github-pages environment
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4
-24
View File
@@ -1,24 +0,0 @@
---
name: Deprecation deadlines
on:
push:
branches: [main]
pull_request:
branches: ['**']
jobs:
check:
runs-on: ubuntu-24.04
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Verify deprecation removals
run: uv run --with packaging python .github/scripts/check_deprecations.py
+61
View File
@@ -0,0 +1,61 @@
# Workflow that uses the DummyAgent to run a simple task
name: Run E2E test with dummy agent
# Always run on "main"
# Always run on PRs
on:
push:
branches:
- main
pull_request:
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# if set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
cache: 'poetry'
- name: Install Python dependencies using Poetry
run: poetry install --without evaluation,llama-index
- name: Build Environment
run: make build
- name: Run tests
run: |
set -e
SANDBOX_FORCE_REBUILD_RUNTIME=True poetry run python3 openhands/core/main.py -t "do a flip" -d ./workspace/ -c DummyAgent
- name: Check exit code
run: |
if [ $? -ne 0 ]; then
echo "Test failed"
exit 1
else
echo "Test passed"
fi
+160
View File
@@ -0,0 +1,160 @@
name: Run Evaluation
on:
pull_request:
types: [labeled]
schedule:
- cron: "0 1 * * *" # Run daily at 1 AM UTC
workflow_dispatch:
inputs:
reason:
description: "Reason for manual trigger"
required: true
default: ""
env:
N_PROCESSES: 32 # Global configuration for number of parallel processes for evaluation
jobs:
run-evaluation:
if: github.event.label.name == 'eval-this' || github.event_name != 'pull_request'
runs-on: ubuntu-latest
permissions:
contents: "read"
id-token: "write"
pull-requests: "write"
issues: "write"
strategy:
matrix:
python-version: ["3.12"]
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: "poetry"
- name: Comment on PR if 'eval-this' label is present
if: github.event_name == 'pull_request' && github.event.label.name == 'eval-this'
uses: KeisukeYamashita/create-comment@v1
with:
unique: false
comment: |
Hi! I started running the evaluation on your PR. You will receive a comment with the results shortly.
- name: Install Python dependencies using Poetry
run: poetry install
- name: Configure config.toml for evaluation
env:
DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_LLM_API_KEY }}
run: |
echo "[llm.eval]" > config.toml
echo "model = \"deepseek/deepseek-chat\"" >> config.toml
echo "api_key = \"$DEEPSEEK_API_KEY\"" >> config.toml
echo "temperature = 0.0" >> config.toml
- name: Run integration test evaluation
env:
ALLHANDS_API_KEY: ${{ secrets.ALLHANDS_EVAL_RUNTIME_API_KEY }}
RUNTIME: remote
SANDBOX_REMOTE_RUNTIME_API_URL: https://runtime.eval.all-hands.dev
EVAL_DOCKER_IMAGE_PREFIX: us-central1-docker.pkg.dev/evaluation-092424/swe-bench-images
run: |
poetry run ./evaluation/integration_tests/scripts/run_infer.sh llm.eval HEAD CodeActAgent '' $N_PROCESSES
# get evaluation report
REPORT_FILE=$(find evaluation/evaluation_outputs/outputs/integration_tests/CodeActAgent/deepseek-chat_maxiter_10_N* -name "report.md" -type f | head -n 1)
echo "REPORT_FILE: $REPORT_FILE"
echo "INTEGRATION_TEST_REPORT<<EOF" >> $GITHUB_ENV
cat $REPORT_FILE >> $GITHUB_ENV
echo >> $GITHUB_ENV
echo "EOF" >> $GITHUB_ENV
- name: Run SWE-Bench evaluation
env:
ALLHANDS_API_KEY: ${{ secrets.ALLHANDS_EVAL_RUNTIME_API_KEY }}
RUNTIME: remote
SANDBOX_REMOTE_RUNTIME_API_URL: https://runtime.eval.all-hands.dev
EVAL_DOCKER_IMAGE_PREFIX: us-central1-docker.pkg.dev/evaluation-092424/swe-bench-images
run: |
poetry run ./evaluation/swe_bench/scripts/run_infer.sh llm.eval HEAD CodeActAgent 300 30 $N_PROCESSES "princeton-nlp/SWE-bench_Lite" test
OUTPUT_FOLDER=$(find evaluation/evaluation_outputs/outputs/princeton-nlp__SWE-bench_Lite-test/CodeActAgent -name "deepseek-chat_maxiter_50_N_*-no-hint-run_1" -type d | head -n 1)
echo "OUTPUT_FOLDER for SWE-bench evaluation: $OUTPUT_FOLDER"
poetry run ./evaluation/swe_bench/scripts/eval_infer_remote.sh $OUTPUT_FOLDER/output.jsonl $N_PROCESSES "princeton-nlp/SWE-bench_Lite" test
poetry run ./evaluation/swe_bench/scripts/eval/summarize_outputs.py $OUTPUT_FOLDER/output.jsonl > summarize_outputs.log 2>&1
echo "SWEBENCH_REPORT<<EOF" >> $GITHUB_ENV
cat summarize_outputs.log >> $GITHUB_ENV
echo "EOF" >> $GITHUB_ENV
- name: Create tar.gz of evaluation outputs
run: |
TIMESTAMP=$(date +'%y-%m-%d-%H-%M')
tar -czvf evaluation_outputs_${TIMESTAMP}.tar.gz evaluation/evaluation_outputs/outputs
- name: Upload evaluation results as artifact
uses: actions/upload-artifact@v4
id: upload_results_artifact
with:
name: evaluation-outputs
path: evaluation_outputs_*.tar.gz
- name: Get artifact URL
run: echo "ARTIFACT_URL=${{ steps.upload_results_artifact.outputs.artifact-url }}" >> $GITHUB_ENV
- name: Authenticate to Google Cloud
uses: 'google-github-actions/auth@v2'
with:
credentials_json: ${{ secrets.GCP_RESEARCH_OBJECT_CREATOR_SA_KEY }}
- name: Set timestamp and trigger reason
run: |
echo "TIMESTAMP=$(date +'%Y-%m-%d-%H-%M')" >> $GITHUB_ENV
if [[ "${{ github.event_name }}" == "pull_request" ]]; then
echo "TRIGGER_REASON=pr-${{ github.event.pull_request.number }}" >> $GITHUB_ENV
elif [[ "${{ github.event_name }}" == "schedule" ]]; then
echo "TRIGGER_REASON=schedule" >> $GITHUB_ENV
else
echo "TRIGGER_REASON=manual-${{ github.event.inputs.reason }}" >> $GITHUB_ENV
fi
- name: Upload evaluation results to Google Cloud Storage
uses: 'google-github-actions/upload-cloud-storage@v2'
with:
path: 'evaluation/evaluation_outputs/outputs'
destination: 'openhands-oss-eval-results/${{ env.TIMESTAMP }}-${{ env.TRIGGER_REASON }}'
- name: Comment with evaluation results and artifact link
id: create_comment
uses: KeisukeYamashita/create-comment@v1
with:
number: ${{ github.event_name == 'pull_request' && github.event.pull_request.number || 4504 }}
unique: false
comment: |
Trigger by: ${{ github.event_name == 'pull_request' && format('Pull Request (eval-this label on PR #{0})', github.event.pull_request.number) || github.event_name == 'schedule' && 'Daily Schedule' || format('Manual Trigger: {0}', github.event.inputs.reason) }}
Commit: ${{ github.sha }}
**SWE-Bench Evaluation Report**
${{ env.SWEBENCH_REPORT }}
---
**Integration Tests Evaluation Report**
${{ env.INTEGRATION_TEST_REPORT }}
---
You can download the full evaluation outputs [here](${{ env.ARTIFACT_URL }}).
- name: Post to a Slack channel
id: slack
uses: slackapi/slack-github-action@v1.27.0
with:
channel-id: 'C07SVQSCR6F'
slack-message: "*Evaluation Trigger:* ${{ github.event_name == 'pull_request' && format('Pull Request (eval-this label on PR #{0})', github.event.pull_request.number) || github.event_name == 'schedule' && 'Daily Schedule' || format('Manual Trigger: {0}', github.event.inputs.reason) }}\n\nLink to summary: [here](https://github.com/${{ github.repository }}/issues/${{ github.event_name == 'pull_request' && github.event.pull_request.number || 4504 }}#issuecomment-${{ steps.create_comment.outputs.comment-id }})"
env:
SLACK_BOT_TOKEN: ${{ secrets.EVAL_NOTIF_SLACK_BOT_TOKEN }}
+44
View File
@@ -0,0 +1,44 @@
# Workflow that runs frontend unit tests
name: Run Frontend Unit Tests
# * Always run on "main"
# * Run on PRs that have changes in the "frontend" folder or this workflow
on:
push:
branches:
- main
pull_request:
paths:
- 'frontend/**'
- '.github/workflows/fe-unit-tests.yml'
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
# Run frontend unit tests
fe-test:
name: FE Unit Tests
runs-on: ubuntu-latest
strategy:
matrix:
node-version: [20]
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ matrix.node-version }}
- name: Install dependencies
working-directory: ./frontend
run: npm ci
- name: Run tests and collect coverage
working-directory: ./frontend
run: npm run test:coverage
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+447
View File
@@ -0,0 +1,447 @@
# Workflow that builds, tests and then pushes the OpenHands and runtime docker images to the ghcr.io repository
name: Docker
# Always run on "main"
# Always run on tags
# Always run on PRs
# Can also be triggered manually
on:
push:
branches:
- main
tags:
- '*'
pull_request:
workflow_dispatch:
inputs:
reason:
description: 'Reason for manual trigger'
required: true
default: ''
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
env:
BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST: nikolaik/python-nodejs:python3.12-nodejs22
RELEVANT_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
jobs:
# Builds the OpenHands Docker images
ghcr_build_app:
name: Build App Image
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
outputs:
hash_from_app_image: ${{ steps.get_hash_in_app_image.outputs.hash_from_app_image }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# if set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Set up QEMU
uses: docker/setup-qemu-action@v3.0.0
with:
image: tonistiigi/binfmt:latest
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Build and push app image
if: "!github.event.pull_request.head.repo.fork"
run: |
./containers/build.sh -i openhands -o ${{ github.repository_owner }} --push
- name: Build app image
if: "github.event.pull_request.head.repo.fork"
run: |
./containers/build.sh -i openhands -o ${{ github.repository_owner }} --load
- name: Get hash in App Image
id: get_hash_in_app_image
run: |
# Lowercase the repository owner
export REPO_OWNER=${{ github.repository_owner }}
REPO_OWNER=$(echo $REPO_OWNER | tr '[:upper:]' '[:lower:]')
# Run the build script in the app image
docker run -e SANDBOX_USER_ID=0 -v /var/run/docker.sock:/var/run/docker.sock ghcr.io/${REPO_OWNER}/openhands:${{ env.RELEVANT_SHA }} /bin/bash -c "mkdir -p containers/runtime; python3 openhands/runtime/utils/runtime_build.py --base_image ${{ env.BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST }} --build_folder containers/runtime --force_rebuild" 2>&1 | tee docker-outputs.txt
# Get the hash from the build script
hash_from_app_image=$(cat docker-outputs.txt | grep "Hash for docker build directory" | awk -F "): " '{print $2}' | uniq | head -n1)
echo "hash_from_app_image=$hash_from_app_image" >> $GITHUB_OUTPUT
echo "Hash from app image: $hash_from_app_image"
# Builds the runtime Docker images
ghcr_build_runtime:
name: Build Image
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
strategy:
matrix:
base_image:
- image: 'nikolaik/python-nodejs:python3.12-nodejs22'
tag: nikolaik
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# if set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Set up QEMU
uses: docker/setup-qemu-action@v3.0.0
with:
image: tonistiigi/binfmt:latest
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Cache Poetry dependencies
uses: actions/cache@v4
with:
path: |
~/.cache/pypoetry
~/.virtualenvs
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
restore-keys: |
${{ runner.os }}-poetry-
- name: Install poetry via pipx
run: pipx install poetry
- name: Install Python dependencies using Poetry
run: make install-python-dependencies
- name: Create source distribution and Dockerfile
run: poetry run python3 openhands/runtime/utils/runtime_build.py --base_image ${{ matrix.base_image.image }} --build_folder containers/runtime --force_rebuild
- name: Build and push runtime image ${{ matrix.base_image.image }}
if: github.event.pull_request.head.repo.fork != true
run: |
./containers/build.sh -i runtime -o ${{ github.repository_owner }} --push -t ${{ matrix.base_image.tag }}
# Forked repos can't push to GHCR, so we need to upload the image as an artifact
- name: Build runtime image ${{ matrix.base_image.image }} for fork
if: github.event.pull_request.head.repo.fork
uses: docker/build-push-action@v6
with:
tags: ghcr.io/all-hands-ai/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image.tag }}
outputs: type=docker,dest=/tmp/runtime-${{ matrix.base_image.tag }}.tar
context: containers/runtime
- name: Upload runtime image for fork
if: github.event.pull_request.head.repo.fork
uses: actions/upload-artifact@v4
with:
name: runtime-${{ matrix.base_image.tag }}
path: /tmp/runtime-${{ matrix.base_image.tag }}.tar
verify_hash_equivalence_in_runtime_and_app:
name: Verify Hash Equivalence in Runtime and Docker images
runs-on: ubuntu-latest
needs: [ghcr_build_runtime, ghcr_build_app]
strategy:
fail-fast: false
matrix:
base_image: ['nikolaik']
steps:
- uses: actions/checkout@v4
- name: Cache Poetry dependencies
uses: actions/cache@v4
with:
path: |
~/.cache/pypoetry
~/.virtualenvs
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
restore-keys: |
${{ runner.os }}-poetry-
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install poetry via pipx
run: pipx install poetry
- name: Install Python dependencies using Poetry
run: make install-python-dependencies
- name: Get hash in App Image
run: |
echo "Hash from app image: ${{ needs.ghcr_build_app.outputs.hash_from_app_image }}"
echo "hash_from_app_image=${{ needs.ghcr_build_app.outputs.hash_from_app_image }}" >> $GITHUB_ENV
- name: Get hash using code (development mode)
run: |
mkdir -p containers/runtime
poetry run python3 openhands/runtime/utils/runtime_build.py --base_image ${{ env.BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST }} --build_folder containers/runtime --force_rebuild > output.txt 2>&1
hash_from_code=$(cat output.txt | grep "Hash for docker build directory" | awk -F "): " '{print $2}' | uniq | head -n1)
echo "hash_from_code=$hash_from_code" >> $GITHUB_ENV
- name: Compare hashes
run: |
echo "Hash from App Image: ${{ env.hash_from_app_image }}"
echo "Hash from Code: ${{ env.hash_from_code }}"
if [ "${{ env.hash_from_app_image }}" = "${{ env.hash_from_code }}" ]; then
echo "Hashes match!"
else
echo "Hashes do not match!"
exit 1
fi
# Run unit tests with the EventStream runtime Docker images as root
test_runtime_root:
name: RT Unit Tests (Root)
needs: [ghcr_build_runtime]
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
base_image: ['nikolaik']
steps:
- uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# if set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
# Forked repos can't push to GHCR, so we need to download the image as an artifact
- name: Download runtime image for fork
if: github.event.pull_request.head.repo.fork
uses: actions/download-artifact@v4
with:
name: runtime-${{ matrix.base_image }}
path: /tmp
- name: Load runtime image for fork
if: github.event.pull_request.head.repo.fork
run: |
docker load --input /tmp/runtime-${{ matrix.base_image }}.tar
- name: Cache Poetry dependencies
uses: actions/cache@v4
with:
path: |
~/.cache/pypoetry
~/.virtualenvs
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
restore-keys: |
${{ runner.os }}-poetry-
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install poetry via pipx
run: pipx install poetry
- name: Install Python dependencies using Poetry
run: make install-python-dependencies
- name: Run runtime tests
run: |
# We install pytest-xdist in order to run tests across CPUs
poetry run pip install pytest-xdist
# Install to be able to retry on failures for flaky tests
poetry run pip install pytest-rerunfailures
image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image }}
image_name=$(echo $image_name | tr '[:upper:]' '[:lower:]')
SKIP_CONTAINER_LOGS=true \
TEST_RUNTIME=eventstream \
SANDBOX_USER_ID=$(id -u) \
SANDBOX_RUNTIME_CONTAINER_IMAGE=$image_name \
TEST_IN_CI=true \
RUN_AS_OPENHANDS=false \
poetry run pytest -n 3 -raRs --reruns 2 --reruns-delay 5 --cov=openhands --cov-report=xml -s ./tests/runtime
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
# Run unit tests with the EventStream runtime Docker images as openhands user
test_runtime_oh:
name: RT Unit Tests (openhands)
runs-on: ubuntu-latest
needs: [ghcr_build_runtime]
strategy:
matrix:
base_image: ['nikolaik']
steps:
- uses: actions/checkout@v4
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
# this might remove tools that are actually needed,
# if set to "true" but frees about 6 GB
tool-cache: true
# all of these default to true, but feel free to set to
# "false" if necessary for your workflow
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: false
swap-storage: true
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
# Forked repos can't push to GHCR, so we need to download the image as an artifact
- name: Download runtime image for fork
if: github.event.pull_request.head.repo.fork
uses: actions/download-artifact@v4
with:
name: runtime-${{ matrix.base_image }}
path: /tmp
- name: Load runtime image for fork
if: github.event.pull_request.head.repo.fork
run: |
docker load --input /tmp/runtime-${{ matrix.base_image }}.tar
- name: Cache Poetry dependencies
uses: actions/cache@v4
with:
path: |
~/.cache/pypoetry
~/.virtualenvs
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
restore-keys: |
${{ runner.os }}-poetry-
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install poetry via pipx
run: pipx install poetry
- name: Install Python dependencies using Poetry
run: make install-python-dependencies
- name: Run runtime tests
run: |
# We install pytest-xdist in order to run tests across CPUs
poetry run pip install pytest-xdist
# Install to be able to retry on failures for flaky tests
poetry run pip install pytest-rerunfailures
image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image }}
image_name=$(echo $image_name | tr '[:upper:]' '[:lower:]')
SKIP_CONTAINER_LOGS=true \
TEST_RUNTIME=eventstream \
SANDBOX_USER_ID=$(id -u) \
SANDBOX_RUNTIME_CONTAINER_IMAGE=$image_name \
TEST_IN_CI=true \
RUN_AS_OPENHANDS=true \
poetry run pytest -n 3 -raRs --reruns 2 --reruns-delay 5 --cov=openhands --cov-report=xml -s ./tests/runtime
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
# The two following jobs (named identically) are to check whether all the runtime tests have passed as the
# "All Runtime Tests Passed" is a required job for PRs to merge
# Due to this bug: https://github.com/actions/runner/issues/2566, we want to create a job that runs when the
# prerequisites have been cancelled or failed so merging is disallowed, otherwise Github considers "skipped" as "success"
runtime_tests_check_success:
name: All Runtime Tests Passed
if: ${{ !cancelled() && !contains(needs.*.result, 'failure') && !contains(needs.*.result, 'cancelled') }}
runs-on: ubuntu-latest
needs: [test_runtime_root, test_runtime_oh, verify_hash_equivalence_in_runtime_and_app]
steps:
- name: All tests passed
run: echo "All runtime tests have passed successfully!"
runtime_tests_check_fail:
name: All Runtime Tests Passed
if: ${{ cancelled() || contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') }}
runs-on: ubuntu-latest
needs: [test_runtime_root, test_runtime_oh, verify_hash_equivalence_in_runtime_and_app]
steps:
- name: Some tests failed
run: |
echo "Some runtime tests failed or were cancelled"
exit 1
update_pr_description:
name: Update PR Description
if: github.event_name == 'pull_request' && !github.event.pull_request.head.repo.fork && github.actor != 'dependabot[bot]'
needs: [ghcr_build_runtime]
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Get short SHA
id: short_sha
run: echo "SHORT_SHA=$(echo ${{ github.event.pull_request.head.sha }} | cut -c1-7)" >> $GITHUB_OUTPUT
- name: Update PR Description
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REPO: ${{ github.repository }}
SHORT_SHA: ${{ steps.short_sha.outputs.SHORT_SHA }}
run: |
echo "updating PR description"
DOCKER_RUN_COMMAND="docker run -it --rm \
-p 3000:3000 \
-v /var/run/docker.sock:/var/run/docker.sock \
--add-host host.docker.internal:host-gateway \
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:$SHORT_SHA-nikolaik \
--name openhands-app-$SHORT_SHA \
docker.all-hands.dev/all-hands-ai/openhands:$SHORT_SHA"
PR_BODY=$(gh pr view $PR_NUMBER --json body --jq .body)
if echo "$PR_BODY" | grep -q "To run this PR locally, use the following command:"; then
UPDATED_PR_BODY=$(echo "${PR_BODY}" | sed -E "s|docker run -it --rm.*|$DOCKER_RUN_COMMAND|")
else
UPDATED_PR_BODY="${PR_BODY}
---
To run this PR locally, use the following command:
\`\`\`
$DOCKER_RUN_COMMAND
\`\`\`"
fi
echo "updated body: $UPDATED_PR_BODY"
gh pr edit $PR_NUMBER --body "$UPDATED_PR_BODY"
-477
View File
@@ -1,477 +0,0 @@
---
name: Run Integration Tests
run-name: >-
Run Integration Tests ${{ inputs.reason || github.event.label.name || 'scheduled' }}
on:
# Use pull_request_target to access secrets even on fork PRs
# This is safe because we only run when the 'integration-test' label is added by a maintainer
pull_request_target:
types:
- labeled
workflow_dispatch:
inputs:
reason:
description: Reason for manual trigger
required: true
default: ''
test_type:
description: Select which tests to run (all, integration, behavior)
required: false
default: all
model_ids:
description: >-
Comma-separated model IDs to test (from resolve_model_config.py).
Example: claude-sonnet-4-6,glm-4.7. Defaults to a standard set.
required: false
default: ''
type: string
issue_number:
description: Issue or PR number to post results to (optional)
required: false
default: ''
type: string
tool_preset:
description: >-
Tool preset for file editing (default, gemini, gpt5, planning).
'default' uses FileEditorTool, 'gemini' uses read_file/write_file/edit/list_directory,
'gpt5' uses apply_patch tool.
required: false
default: default
type: choice
options:
- default
- gemini
- gpt5
- planning
schedule:
- cron: 30 22 * * * # Runs at 10:30pm UTC every day
env:
N_PROCESSES: 4 # Global configuration for number of parallel processes for evaluation
# Default models for scheduled/label-triggered runs (subset of models from resolve_model_config.py)
DEFAULT_MODEL_IDS: claude-sonnet-4-6,deepseek-v3.2-reasoner,kimi-k2-thinking,gemini-3-pro
jobs:
setup-matrix:
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.resolve-models.outputs.matrix }}
issue_number: ${{ steps.resolve-issue.outputs.issue_number }}
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.sha || github.ref }}
persist-credentials: false
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.13'
- name: Resolve model configurations
id: resolve-models
env:
MODEL_IDS_INPUT: ${{ github.event.inputs.model_ids || '' }}
DEFAULT_MODEL_IDS: ${{ env.DEFAULT_MODEL_IDS }}
run: |
# Use input model_ids if provided, otherwise use defaults
if [ -z "$MODEL_IDS_INPUT" ]; then
MODEL_IDS="$DEFAULT_MODEL_IDS"
echo "No model_ids specified, using defaults: $MODEL_IDS"
else
MODEL_IDS="$MODEL_IDS_INPUT"
echo "Using specified model_ids: $MODEL_IDS"
fi
# Resolve model configs using resolve_model_config.py
# Transform output to matrix format for integration tests
MATRIX=$(python3 << EOF
import json
import sys
sys.path.insert(0, '.github/run-eval')
from resolve_model_config import MODELS
model_ids = "$MODEL_IDS".split(",")
model_ids = [m.strip() for m in model_ids if m.strip()]
matrix = []
for model_id in model_ids:
if model_id not in MODELS:
available = ", ".join(sorted(MODELS.keys()))
print(f"Error: Model ID '{model_id}' not found. Available: {available}", file=sys.stderr)
sys.exit(1)
model = MODELS[model_id]
# Create run-suffix from model id (replace special chars with underscore)
run_suffix = model_id.replace("-", "_").replace(".", "_") + "_run"
matrix.append({
"id": model_id,
"name": model["display_name"],
"run-suffix": run_suffix,
"llm-config": model["llm_config"]
})
print(json.dumps(matrix))
EOF
)
if [ $? -ne 0 ]; then
echo "Failed to resolve model configurations" >&2
exit 1
fi
echo "matrix=$MATRIX" >> "$GITHUB_OUTPUT"
echo "Resolved models: $(echo "$MATRIX" | jq -r '.[].name' | paste -sd', ' -)"
- name: Resolve issue number
id: resolve-issue
env:
ISSUE_NUMBER_INPUT: ${{ github.event.inputs.issue_number || '' }}
PR_NUMBER: ${{ github.event.pull_request.number }}
run: |
# Priority: explicit input > PR number from label trigger
if [ -n "$ISSUE_NUMBER_INPUT" ]; then
echo "issue_number=$ISSUE_NUMBER_INPUT" >> "$GITHUB_OUTPUT"
elif [ -n "$PR_NUMBER" ]; then
echo "issue_number=$PR_NUMBER" >> "$GITHUB_OUTPUT"
else
echo "issue_number=" >> "$GITHUB_OUTPUT"
fi
# Post initial comment for label triggers (no dependencies - runs immediately)
post-label-comment:
if: >
github.event_name == 'pull_request_target' && (
github.event.label.name == 'integration-test' ||
github.event.label.name == 'behavior-test'
)
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- name: Comment on PR (integration tests via label)
if: github.event.label.name == 'integration-test'
uses: KeisukeYamashita/create-comment@v1
with:
unique: false
comment: |
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.
- name: Comment on PR (behavior tests via label)
if: github.event.label.name == 'behavior-test'
uses: KeisukeYamashita/create-comment@v1
with:
unique: false
comment: |
Hi! I started running the behavior tests on your PR. You will receive a comment with the results shortly.
# Post initial comment for workflow_dispatch (depends on setup-matrix for issue_number resolution)
post-dispatch-comment:
needs: setup-matrix
if: github.event_name == 'workflow_dispatch' && github.event.inputs.issue_number != ''
runs-on: ubuntu-latest
permissions:
issues: write
steps:
- name: Comment on issue/PR (workflow_dispatch)
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
ISSUE_NUMBER: ${{ github.event.inputs.issue_number }}
MODEL_IDS: ${{ github.event.inputs.model_ids || 'all models' }}
TEST_TYPE: ${{ github.event.inputs.test_type || 'all' }}
REASON: ${{ github.event.inputs.reason }}
run: |
# Sanitize @OpenHands mentions to prevent self-mention loops
SANITIZED_REASON=$(echo "$REASON" | sed 's/@OpenHands/@\u200BOpenHands/g; s/@openhands/@\u200Bopenhands/g')
SANITIZED_MODEL_IDS=$(echo "$MODEL_IDS" | sed 's/@OpenHands/@\u200BOpenHands/g; s/@openhands/@\u200Bopenhands/g')
COMMENT_BODY=$(cat <<EOF
**Integration Tests Triggered**
- **Reason:** $SANITIZED_REASON
- **Test type:** $TEST_TYPE
- **Models:** $SANITIZED_MODEL_IDS
- **Workflow run:** ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
Results will be posted here when complete.
EOF
)
gh issue comment "$ISSUE_NUMBER" --body "$COMMENT_BODY"
run-integration-tests:
# Security: Only run when integration-related labels are present, via workflow_dispatch, or on schedule
# This prevents automatic execution on fork PRs without maintainer approval
# Note: uses always() to run even when comment jobs are skipped (e.g., for scheduled runs)
# Schedule trigger only runs in the main repository, not in forks
if: |
always() && (
(
github.event_name == 'pull_request_target' && (
github.event.label.name == 'integration-test' ||
github.event.label.name == 'behavior-test'
)
) ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'schedule' && github.repository == 'OpenHands/software-agent-sdk')
) && needs.setup-matrix.result == 'success'
needs: [setup-matrix, post-label-comment, post-dispatch-comment]
runs-on: ubuntu-22.04
timeout-minutes: 180
permissions:
contents: read
id-token: write
pull-requests: write
issues: write
strategy:
fail-fast: false
matrix:
python-version: ['3.13']
job-config: ${{ fromJson(needs.setup-matrix.outputs.matrix) }}
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# For pull_request_target: checkout fork PR code (requires explicit repository)
# For other events: fallback to current repository and ref
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.sha || github.ref }}
# Security: Don't persist credentials to prevent untrusted PR code from using them
persist-credentials: false
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: ${{ matrix.python-version }}
- name: Install Python dependencies using uv
run: |
uv sync --dev
uv pip install pytest
# Run integration test evaluation
- name: Determine test selection
run: |
TEST_TYPE_ARGS=""
if [ "${{ github.event_name }}" = "pull_request_target" ] && [ "${{ github.event.label.name }}" = "behavior-test" ]; then
TEST_TYPE_ARGS="--test-type behavior"
echo "behavior-test label detected; running behavior tests only."
elif [ "${{ github.event_name }}" = "pull_request_target" ] && [ "${{ github.event.label.name }}" = "integration-test" ]; then
TEST_TYPE_ARGS="--test-type integration"
echo "integration-test label detected; running integration tests only."
elif [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
test_type="${{ github.event.inputs.test_type }}"
case "$test_type" in
behavior)
TEST_TYPE_ARGS="--test-type behavior"
echo "workflow_dispatch requested behavior tests only."
;;
integration)
TEST_TYPE_ARGS="--test-type integration"
echo "workflow_dispatch requested integration tests only."
;;
""|all)
echo "workflow_dispatch requested full integration suite."
;;
*)
echo "workflow_dispatch provided unknown test_type '$test_type'; defaulting to full suite."
;;
esac
elif [ "${{ github.event_name }}" = "schedule" ]; then
TEST_TYPE_ARGS="--test-type integration"
echo "Scheduled run; running integration tests only."
else
echo "Running full integration test suite."
fi
echo "TEST_TYPE_ARGS=$TEST_TYPE_ARGS" >> "$GITHUB_ENV"
- name: Run integration test evaluation for ${{ matrix.job-config['name'] }}
env:
LLM_CONFIG: ${{ toJson(matrix.job-config['llm-config']) }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY_EVAL }}
LLM_BASE_URL: https://llm-proxy.eval.all-hands.dev
TOOL_PRESET: ${{ github.event.inputs.tool_preset || 'default' }}
run: |
set -eo pipefail
AGENT_SDK_VERSION=$(git rev-parse --short HEAD)
EVAL_NOTE="${AGENT_SDK_VERSION}_${{ matrix.job-config['run-suffix'] }}"
echo "Invoking test runner with TEST_TYPE_ARGS='$TEST_TYPE_ARGS' TOOL_PRESET='$TOOL_PRESET'"
uv run python tests/integration/run_infer.py \
--llm-config "$LLM_CONFIG" \
--num-workers $N_PROCESSES \
--eval-note "$EVAL_NOTE" \
--tool-preset "$TOOL_PRESET" \
$TEST_TYPE_ARGS
# get integration tests JSON results
RESULTS_FILE=$(find tests/integration/outputs/*${{ matrix.job-config['run-suffix'] }}* -name "results.json" -type f | head -n 1)
echo "RESULTS_FILE: $RESULTS_FILE"
if [ -f "$RESULTS_FILE" ]; then
echo "JSON_RESULTS_FILE=$RESULTS_FILE" >> $GITHUB_ENV
else
echo "JSON_RESULTS_FILE=" >> $GITHUB_ENV
fi
- name: Wait a little bit
run: sleep 10
- name: Create archive of evaluation outputs
run: |
TIMESTAMP=$(date +'%y-%m-%d-%H-%M')
cd tests/integration/outputs # Change to the outputs directory
tar -czvf ../../../integration_tests_${{ matrix.job-config['run-suffix'] }}_${TIMESTAMP}.tar.gz *${{ matrix.job-config['run-suffix'] }}* # Include result directories for this model
- name: Upload evaluation results as artifact
uses: actions/upload-artifact@v7
id: upload_results_artifact
with:
name: integration-test-outputs-${{ matrix.job-config['run-suffix'] }}-${{ github.run_id }}-${{ github.run_attempt }}
path: integration_tests_${{ matrix.job-config['run-suffix'] }}_*.tar.gz
- name: Save test results for consolidation
run: |
# Copy the structured JSON results file for consolidation
mkdir -p test_results_summary
if [ -n "${{ env.JSON_RESULTS_FILE }}" ] && [ -f "${{ env.JSON_RESULTS_FILE }}" ]; then
# Copy the JSON results file directly
cp "${{ env.JSON_RESULTS_FILE }}" "test_results_summary/${{ matrix.job-config['run-suffix'] }}_results.json"
echo "✓ Copied JSON results file for consolidation"
else
echo "✗ No JSON results file found"
exit 1
fi
- name: Upload test results summary
uses: actions/upload-artifact@v7
with:
name: test-results-${{ matrix.job-config['run-suffix'] }}
path: test_results_summary/${{ matrix.job-config['run-suffix'] }}_results.json
consolidate-results:
needs: [setup-matrix, run-integration-tests]
if: |
always() && (
(
github.event_name == 'pull_request_target' && (
github.event.label.name == 'integration-test' ||
github.event.label.name == 'behavior-test'
)
) ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'schedule' && github.repository == 'OpenHands/software-agent-sdk')
)
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: write
issues: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# When using pull_request_target, explicitly checkout the PR branch
# This ensures we use the scripts from the actual PR code
ref: ${{ github.event.pull_request.head.sha || github.ref }}
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install Python dependencies using uv
run: |
uv sync --dev
- name: Download all test results
uses: actions/download-artifact@v8
with:
pattern: test-results-*
merge-multiple: true
path: all_results
- name: Download all integration test artifacts
uses: actions/download-artifact@v8
with:
pattern: integration-test-outputs-*
path: artifacts
- name: Consolidate test results
env:
EVENT_NAME: ${{ github.event_name }}
PR_NUMBER: ${{ github.event.pull_request.number }}
MANUAL_REASON: ${{ github.event.inputs.reason }}
COMMIT_SHA: ${{ github.sha }}
PYTHONPATH: ${{ github.workspace }}
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_RUN_ID: ${{ github.run_id }}
run: |
uv run python tests/integration/utils/consolidate_json_results.py \
--results-dir all_results \
--artifacts-dir artifacts \
--output-file consolidated_results.json
echo "Consolidated results generated successfully"
uv run python tests/integration/utils/generate_markdown_report.py \
--input-file consolidated_results.json \
--output-file consolidated_report.md
- name: Upload consolidated report
uses: actions/upload-artifact@v7
with:
name: consolidated-report
path: consolidated_report.md
- name: Create consolidated PR comment
if: github.event_name == 'pull_request_target'
run: |
# Sanitize @OpenHands mentions to prevent self-mention loops
COMMENT_BODY=$(uv run python -c "from openhands.sdk.utils.github import sanitize_openhands_mentions; import sys; print(sanitize_openhands_mentions(sys.stdin.read()), end='')" < consolidated_report.md)
# Use GitHub CLI to create comment with explicit PR number
echo "$COMMENT_BODY" | gh pr comment ${{ github.event.pull_request.number }} --body-file -
env:
GH_TOKEN: ${{ github.token }}
- name: Comment on specified issue/PR (workflow_dispatch)
if: github.event_name == 'workflow_dispatch' && needs.setup-matrix.outputs.issue_number != ''
env:
GH_TOKEN: ${{ github.token }}
ISSUE_NUMBER: ${{ needs.setup-matrix.outputs.issue_number }}
run: |
# Sanitize @OpenHands mentions to prevent self-mention loops
COMMENT_BODY=$(uv run python -c "from openhands.sdk.utils.github import sanitize_openhands_mentions; import sys; print(sanitize_openhands_mentions(sys.stdin.read()), end='')" < consolidated_report.md)
# Use GitHub CLI to create comment on the specified issue/PR
echo "$COMMENT_BODY" | gh issue comment "$ISSUE_NUMBER" --body-file -
- name: Read consolidated report for tracker issue
if: github.event_name == 'schedule'
id: read_report
run: |
# Read and sanitize the report, then set as output
REPORT_CONTENT=$(uv run python -c "from openhands.sdk.utils.github import sanitize_openhands_mentions; import sys; print(sanitize_openhands_mentions(sys.stdin.read()), end='')" < consolidated_report.md)
echo "report<<EOF" >> $GITHUB_OUTPUT
echo "$REPORT_CONTENT" >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT
- name: Comment with results on tracker issue
if: github.event_name == 'schedule'
uses: KeisukeYamashita/create-comment@v1
with:
number: 2078
unique: false
comment: |
**Trigger:** Nightly Scheduled Run
**Commit:** ${{ github.sha }}
${{ steps.read_report.outputs.report }}
+54
View File
@@ -0,0 +1,54 @@
# Workflow that runs lint on the frontend and python code
name: Lint
# The jobs in this workflow are required, so they must run at all times
# Always run on "main"
# Always run on PRs
on:
push:
branches:
- main
pull_request:
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
# Run lint on the frontend code
lint-frontend:
name: Lint frontend
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install Node.js 20
uses: actions/setup-node@v4
with:
node-version: 20
- name: Install dependencies
run: |
cd frontend
npm install --frozen-lockfile
- name: Lint
run: |
cd frontend
npm run lint
# Run lint on the python code
lint-python:
name: Lint python
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up python
uses: actions/setup-python@v5
with:
python-version: 3.12
cache: 'pip'
- name: Install pre-commit
run: pip install pre-commit==3.7.0
- name: Run pre-commit hooks
run: pre-commit run --files openhands/**/* evaluation/**/* tests/**/* --show-diff-on-failure --config ./dev_config/python/.pre-commit-config.yaml
@@ -1,30 +0,0 @@
name: Update Documentation (by OpenHands)
on:
schedule:
# Run every 7 days at 2 AM UTC on Sundays
- cron: '0 2 * * 0'
workflow_dispatch: # Allow manual triggering
jobs:
update-docs:
runs-on: blacksmith-4vcpu-ubuntu-2404
permissions:
contents: write
pull-requests: write
steps:
- uses: actions/checkout@v4
- name: Update Documentation with OpenHands
uses: All-Hands-AI/openhands-github-action@v1
with:
prompt: .github/prompts/update-documentation.md
repository: ${{ github.repository }}
selected-branch: main
base-url: https://app.all-hands.dev
poll: "true"
timeout-seconds: 1800
poll-interval-seconds: 30
github-token: ${{ secrets.GITHUB_TOKEN }}
openhands-api-key: ${{ secrets.OPENHANDS_API_KEY }}
+15
View File
@@ -0,0 +1,15 @@
name: Resolve Issues with OpenHands
on:
issues:
types: [labeled]
pull_request:
types: [labeled]
jobs:
call-openhands-resolver:
uses: All-Hands-AI/openhands-resolver/.github/workflows/openhands-resolver.yml@main
if: github.event.label.name == 'fix-me'
with:
max_iterations: 50
secrets: inherit
-139
View File
@@ -1,139 +0,0 @@
---
name: PR Artifacts
on:
workflow_dispatch: # Manual trigger for testing
pull_request:
types: [opened, synchronize, reopened]
branches: [main]
pull_request_review:
types: [submitted]
jobs:
# Auto-remove .pr/ directory when a reviewer approves
cleanup-on-approval:
concurrency:
group: cleanup-pr-artifacts-${{ github.event.pull_request.number }}
cancel-in-progress: false
if: github.event_name == 'pull_request_review' && github.event.review.state == 'approved'
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
steps:
- name: Check if fork PR
id: check-fork
run: |
if [ "${{ github.event.pull_request.head.repo.full_name }}" != "${{ github.event.pull_request.base.repo.full_name }}" ]; then
echo "is_fork=true" >> $GITHUB_OUTPUT
echo "::notice::Fork PR detected - skipping auto-cleanup (manual removal required)"
else
echo "is_fork=false" >> $GITHUB_OUTPUT
fi
# Use PAT so the push triggers CI workflows that will complete and
# satisfy branch protection. We can't use [skip ci] because the Vercel
# GitHub App creates stuck checks that block merging.
- uses: actions/checkout@v5
if: steps.check-fork.outputs.is_fork == 'false'
with:
ref: ${{ github.event.pull_request.head.ref }}
token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
- name: Remove .pr/ directory
id: remove
if: steps.check-fork.outputs.is_fork == 'false'
run: |
if [ -d ".pr" ]; then
git config user.name "allhands-bot"
git config user.email "allhands-bot@users.noreply.github.com"
git rm -rf .pr/
git commit -m "chore: Remove PR-only artifacts [automated]"
git push || {
echo "::error::Failed to push cleanup commit. Check branch protection rules."
exit 1
}
echo "removed=true" >> $GITHUB_OUTPUT
echo "::notice::Removed .pr/ directory"
else
echo "removed=false" >> $GITHUB_OUTPUT
echo "::notice::No .pr/ directory to remove"
fi
- name: Update PR comment after cleanup
if: steps.check-fork.outputs.is_fork == 'false' && steps.remove.outputs.removed == 'true'
uses: actions/github-script@v8
with:
script: |
const marker = '<!-- pr-artifacts-notice -->';
const body = `${marker}
✅ **PR Artifacts Cleaned Up**
The \`.pr/\` directory has been automatically removed.
`;
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
});
const existing = comments.find(c => c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: existing.id,
body: body,
});
}
# Warn if .pr/ directory exists (will be auto-removed on approval)
check-pr-artifacts:
if: github.event_name == 'pull_request'
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- uses: actions/checkout@v5
- name: Check for .pr/ directory
id: check
run: |
if [ -d ".pr" ]; then
echo "exists=true" >> $GITHUB_OUTPUT
echo "::warning::.pr/ directory exists and will be automatically removed when the PR is approved. For fork PRs, manual removal is required before merging."
else
echo "exists=false" >> $GITHUB_OUTPUT
fi
- name: Post or update PR comment
if: steps.check.outputs.exists == 'true'
uses: actions/github-script@v8
with:
script: |
const marker = '<!-- pr-artifacts-notice -->';
const body = `${marker}
📁 **PR Artifacts Notice**
This PR contains a \`.pr/\` directory with PR-specific documents. This directory will be **automatically removed** when the PR is approved.
> For fork PRs: Manual removal is required before merging.
`;
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
});
const existing = comments.find(c => c.body.includes(marker));
if (!existing) {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
body: body,
});
}
@@ -1,51 +0,0 @@
---
name: PR Review by OpenHands
on:
# Use pull_request so workflow changes can be validated in PRs.
# This workflow requires secrets, so the job only runs for same-repo PRs.
# It runs when:
# 1. A new PR is opened (non-draft), OR
# 2. A draft PR is marked as ready for review, OR
# 3. A maintainer adds the 'review-this' label, OR
# 4. A maintainer requests openhands-agent or all-hands-bot as a reviewer
# Adding labels and requesting reviewers still requires write access.
pull_request:
types: [opened, ready_for_review, labeled, review_requested]
permissions:
contents: read
pull-requests: write
issues: write
jobs:
pr-review:
# Run when one of the following conditions is met:
# 1. A new non-draft PR is opened by a non-first-time contributor, OR
# 2. A draft PR is converted to ready for review by a non-first-time contributor, OR
# 3. 'review-this' label is added, OR
# 4. openhands-agent or all-hands-bot is requested as a reviewer
# Note: FIRST_TIME_CONTRIBUTOR and NONE PRs require manual trigger via label/reviewer request.
if: |
github.event.pull_request.head.repo.full_name == github.repository && (
(github.event.action == 'opened' && github.event.pull_request.draft == false && github.event.pull_request.author_association != 'FIRST_TIME_CONTRIBUTOR' && github.event.pull_request.author_association != 'NONE') ||
(github.event.action == 'ready_for_review' && github.event.pull_request.author_association != 'FIRST_TIME_CONTRIBUTOR' && github.event.pull_request.author_association != 'NONE') ||
github.event.label.name == 'review-this' ||
github.event.requested_reviewer.login == 'openhands-agent' ||
github.event.requested_reviewer.login == 'all-hands-bot'
)
concurrency:
group: pr-review-${{ github.event.pull_request.number }}
cancel-in-progress: true
runs-on: ubuntu-24.04
steps:
- name: Run PR Review
uses: OpenHands/extensions/plugins/pr-review@main
with:
llm-model: litellm_proxy/claude-sonnet-4-5-20250929
llm-base-url: https://llm-proxy.app.all-hands.dev
# Review style: roasted (other option: standard)
review-style: roasted
llm-api-key: ${{ secrets.LLM_API_KEY }}
github-token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
lmnr-api-key: ${{ secrets.LMNR_SKILLS_API_KEY }}
@@ -1,85 +0,0 @@
---
name: PR Review Evaluation
# This workflow evaluates how well PR review comments were addressed.
# It runs when a PR is closed to assess review effectiveness.
#
# Security note: pull_request_target is safe here because:
# 1. Only triggers on PR close (not on code changes)
# 2. Does not checkout PR code - only downloads artifacts from trusted workflow runs
# 3. Runs evaluation scripts from the extensions repo, not from the PR
on:
pull_request_target:
types: [closed]
permissions:
contents: read
pull-requests: read
jobs:
evaluate:
runs-on: ubuntu-24.04
env:
PR_NUMBER: ${{ github.event.pull_request.number }}
REPO_NAME: ${{ github.repository }}
PR_MERGED: ${{ github.event.pull_request.merged }}
steps:
- name: Download review trace artifact
id: download-trace
uses: dawidd6/action-download-artifact@v6
continue-on-error: true
with:
workflow: pr-review-by-openhands.yml
name: pr-review-trace-${{ github.event.pull_request.number }}
path: trace-info
search_artifacts: true
if_no_artifact_found: warn
- name: Check if trace file exists
id: check-trace
run: |
if [ -f "trace-info/laminar_trace_info.json" ]; then
echo "trace_exists=true" >> $GITHUB_OUTPUT
echo "Found trace file for PR #$PR_NUMBER"
else
echo "trace_exists=false" >> $GITHUB_OUTPUT
echo "No trace file found for PR #$PR_NUMBER - skipping evaluation"
fi
# Always checkout main branch for security - cannot test script changes in PRs
- name: Checkout extensions repository
if: steps.check-trace.outputs.trace_exists == 'true'
uses: actions/checkout@v5
with:
repository: OpenHands/extensions
path: extensions
- name: Set up Python
if: steps.check-trace.outputs.trace_exists == 'true'
uses: actions/setup-python@v6
with:
python-version: '3.12'
- name: Install dependencies
if: steps.check-trace.outputs.trace_exists == 'true'
run: pip install lmnr
- name: Run evaluation
if: steps.check-trace.outputs.trace_exists == 'true'
env:
# Script expects LMNR_PROJECT_API_KEY; org secret is named LMNR_SKILLS_API_KEY
LMNR_PROJECT_API_KEY: ${{ secrets.LMNR_SKILLS_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
python extensions/plugins/pr-review/scripts/evaluate_review.py \
--trace-file trace-info/laminar_trace_info.json
- name: Upload evaluation logs
uses: actions/upload-artifact@v7
if: always() && steps.check-trace.outputs.trace_exists == 'true'
with:
name: pr-review-evaluation-${{ github.event.pull_request.number }}
path: '*.log'
retention-days: 30
-31
View File
@@ -1,31 +0,0 @@
---
# .github/workflows/precommit.yml
name: Pre-commit checks
on:
push:
branches: [main]
pull_request:
branches: ['**']
jobs:
pre-commit:
runs-on: ubuntu-24.04
steps:
- name: Checkout code
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install uv
uses: astral-sh/setup-uv@v7
- name: Install dependencies
run: uv sync --frozen --group dev
- name: Run pre-commit (all files)
run: uv run pre-commit run --all-files --show-diff-on-failure
-132
View File
@@ -1,132 +0,0 @@
---
name: Prepare Release
on:
workflow_dispatch:
inputs:
version:
description: Release version (e.g., 1.2.3)
required: true
type: string
jobs:
prepare-release:
runs-on: ubuntu-24.04
steps:
- name: Validate version format
run: |
if ! [[ "${{ inputs.version }}" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "❌ Invalid version format. Expected: X.Y.Z (e.g., 1.2.3)"
exit 1
fi
echo "✅ Version format is valid: ${{ inputs.version }}"
- name: Checkout repository
uses: actions/checkout@v5
with:
token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Configure Git
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
- name: Create release branch
run: |
BRANCH_NAME="rel-${{ inputs.version }}"
echo "Creating branch: $BRANCH_NAME"
git checkout -b "$BRANCH_NAME"
echo "BRANCH_NAME=$BRANCH_NAME" >> $GITHUB_ENV
- name: Set package version
run: |
echo "🔧 Setting version to ${{ inputs.version }}"
make set-package-version version=${{ inputs.version }}
- name: Update sdk_ref default in run-eval workflow
run: python3 .github/scripts/update_sdk_ref_default.py "${{ inputs.version }}"
- name: Commit version changes
run: |
git add .
if git diff --staged --quiet; then
echo "No changes to commit"
else
git commit -m "Release v${{ inputs.version }}" -m "Co-authored-by: openhands <openhands@all-hands.dev>"
echo "✅ Changes committed"
fi
- name: Push release branch
run: |
git push -u origin "${{ env.BRANCH_NAME }}"
echo "✅ Branch pushed: ${{ env.BRANCH_NAME }}"
- name: Create Pull Request
env:
GH_TOKEN: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
run: |
cat > pr_body.txt << 'EOF'
## Release v${{ inputs.version }}
This PR prepares the release for version **${{ inputs.version }}**.
### Release Checklist
- [x] Version set to ${{ inputs.version }}
- [ ] Fix any deprecation deadlines if they exist
- [ ] Integration tests pass (tagged with `integration-test`)
- [ ] Behavior tests pass (tagged with `behavior-test`)
- [ ] Example tests pass (tagged with `test-examples`)
- [ ] Draft release created at https://github.com/OpenHands/software-agent-sdk/releases/new
- [ ] Select tag: `v${{ inputs.version }}`
- [ ] Select branch: `${{ env.BRANCH_NAME }}`
- [ ] Auto-generate release notes
- [ ] Publish release (PyPI will auto-publish)
- [ ] Evaluation on OpenHands Index
### Next Steps
1. Review the version changes
2. Address any deprecation deadlines
3. Ensure integration tests pass
4. Ensure behavior tests pass
5. Ensure example tests pass
6. Create and publish the release
Once the release is published on GitHub, the PyPI packages will be automatically published via the `pypi-release.yml` workflow.
EOF
gh pr create \
--title "Release v${{ inputs.version }}" \
--body-file pr_body.txt \
--base main \
--head "${{ env.BRANCH_NAME }}" \
--label "integration-test" \
--label "behavior-test" \
--label "test-examples"
rm pr_body.txt
echo "✅ Pull request created successfully!"
# Get PR URL and display it
PR_URL=$(gh pr view "${{ env.BRANCH_NAME }}" --json url --jq '.url')
echo "🔗 PR URL: $PR_URL"
echo "PR_URL=$PR_URL" >> $GITHUB_ENV
- name: Summary
run: |
echo "## ✅ Release Preparation Complete!" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "- **Version**: ${{ inputs.version }}" >> $GITHUB_STEP_SUMMARY
echo "- **Branch**: ${{ env.BRANCH_NAME }}" >> $GITHUB_STEP_SUMMARY
echo "- **PR URL**: ${{ env.PR_URL }}" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "### Next Steps:" >> $GITHUB_STEP_SUMMARY
echo "1. Review the PR and address any deprecation deadlines" >> $GITHUB_STEP_SUMMARY
echo "2. Wait for integration, behavior, and example tests to pass" >> $GITHUB_STEP_SUMMARY
echo "3. Create and publish the release on GitHub" >> $GITHUB_STEP_SUMMARY
echo "4. PyPI will automatically publish when the release is created" >> $GITHUB_STEP_SUMMARY
+96
View File
@@ -0,0 +1,96 @@
# Workflow that runs python unit tests on mac
name: Run Python Unit Tests Mac
# This job is flaky so only run it nightly
on:
schedule:
- cron: '0 0 * * *'
jobs:
# Run python unit tests on macOS
test-on-macos:
name: Python Unit Tests on macOS
runs-on: macos-14
env:
INSTALL_DOCKER: '1' # Set to '0' to skip Docker installation
strategy:
matrix:
python-version: ['3.12']
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Cache Poetry dependencies
uses: actions/cache@v4
with:
path: |
~/.cache/pypoetry
~/.virtualenvs
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
restore-keys: |
${{ runner.os }}-poetry-
- name: Install poetry via pipx
run: pipx install poetry
- name: Install Python dependencies using Poetry
run: poetry install --without evaluation,llama-index
- name: Install & Start Docker
if: env.INSTALL_DOCKER == '1'
run: |
INSTANCE_NAME="colima-${GITHUB_RUN_ID}"
# Uninstall colima to upgrade to the latest version
if brew list colima &>/dev/null; then
brew uninstall colima
# unlinking colima dependency: go
brew uninstall go@1.21
fi
rm -rf ~/.colima ~/.lima
brew install --HEAD colima
brew install docker
start_colima() {
# Find a free port in the range 10000-20000
RANDOM_PORT=$((RANDOM % 10001 + 10000))
# Original line:
if ! colima start --network-address --arch x86_64 --cpu=1 --memory=1 --verbose --ssh-port $RANDOM_PORT; then
echo "Failed to start Colima."
return 1
fi
return 0
}
# Attempt to start Colima for 5 total attempts:
ATTEMPT_LIMIT=5
for ((i=1; i<=ATTEMPT_LIMIT; i++)); do
if start_colima; then
echo "Colima started successfully."
break
else
colima stop -f
sleep 10
colima delete -f
if [ $i -eq $ATTEMPT_LIMIT ]; then
exit 1
fi
sleep 10
fi
done
# For testcontainers to find the Colima socket
# https://github.com/abiosoft/colima/blob/main/docs/FAQ.md#cannot-connect-to-the-docker-daemon-at-unixvarrundockersock-is-the-docker-daemon-running
sudo ln -sf $HOME/.colima/default/docker.sock /var/run/docker.sock
- name: Build Environment
run: make build
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Run Tests
run: poetry run pytest --forked --cov=openhands --cov-report=xml ./tests/unit --ignore=tests/unit/test_memory.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+49
View File
@@ -0,0 +1,49 @@
# Workflow that runs python unit tests
name: Run Python Unit Tests
# The jobs in this workflow are required, so they must run at all times
# * Always run on "main"
# * Always run on PRs
on:
push:
branches:
- main
pull_request:
# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
concurrency:
group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
cancel-in-progress: true
jobs:
# Run python unit tests on Linux
test-on-linux:
name: Python Unit Tests on Linux
runs-on: ubuntu-latest
env:
INSTALL_DOCKER: '0' # Set to '0' to skip Docker installation
strategy:
matrix:
python-version: ['3.12']
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Install poetry via pipx
run: pipx install poetry
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: 'poetry'
- name: Install Python dependencies using Poetry
run: poetry install --without evaluation,llama-index
- name: Build Environment
run: make build
- name: Run Tests
run: poetry run pytest --forked --cov=openhands --cov-report=xml -svv ./tests/unit --ignore=tests/unit/test_memory.py
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+27 -66
View File
@@ -1,70 +1,31 @@
---
name: Publish all OpenHands packages (uv)
# Publishes the OpenHands PyPi package
name: Publish PyPi Package
# Triggered manually
on:
# Run manually
workflow_dispatch:
# Run automatically when a release is published
release:
types: [published]
workflow_dispatch:
inputs:
reason:
description: 'Reason for manual trigger'
required: true
default: ''
jobs:
publish:
runs-on: ubuntu-24.04
outputs:
version: ${{ steps.extract_version.outputs.version }}
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Extract version from release tag
id: extract_version
run: |
# Get version from release tag (e.g., v1.2.3 -> 1.2.3)
if [[ "${{ github.event_name }}" == "release" ]]; then
VERSION="${{ github.event.release.tag_name }}"
VERSION="${VERSION#v}" # Remove 'v' prefix if present
else
# For manual dispatch, extract from pyproject.toml
VERSION=$(grep -m1 '^version = ' openhands-sdk/pyproject.toml | cut -d'"' -f2)
fi
echo "version=$VERSION" >> $GITHUB_OUTPUT
echo "📦 Version: $VERSION"
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Build and publish all packages
env:
UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN_OPENHANDS }}
run: |
set -euo pipefail
if [ -z "${UV_PUBLISH_TOKEN:-}" ]; then
echo "❌ Missing secret PYPI_TOKEN_OPENHANDS"
exit 1
fi
PACKAGES=(
openhands-sdk
openhands-tools
openhands-workspace
openhands-agent-server
)
echo "🚀 Building and publishing all packages..."
for PKG in "${PACKAGES[@]}"; do
echo "===== $PKG ====="
uv build --package "$PKG"
done
# Use --check-url to skip files that already exist on PyPI
# This allows re-running the workflow after partial failures
uv publish --token "$UV_PUBLISH_TOKEN" --check-url https://pypi.org/simple/
echo "✅ All packages built and published successfully!"
echo ""
echo "📋 Note: Version bump PRs will be created by the 'Create Version Bump PRs' workflow"
echo " which triggers automatically after this workflow completes."
release:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: 3.12
- name: Install Poetry
uses: snok/install-poetry@v1.4.1
with:
virtualenvs-in-project: true
virtualenvs-path: ~/.virtualenvs
- name: Install Poetry Dependencies
run: poetry install --no-interaction --no-root
- name: Build poetry project
run: ./build.sh
- name: publish
run: poetry publish -u __token__ -p ${{ secrets.PYPI_TOKEN }}
+81
View File
@@ -0,0 +1,81 @@
# Workflow that uses OpenHands to review a pull request. PR must be labeled 'review-this'
name: Use OpenHands to Review Pull Request
on:
pull_request:
types: [synchronize, labeled]
permissions:
contents: write
pull-requests: write
jobs:
dogfood:
if: contains(github.event.pull_request.labels.*.name, 'review-this')
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: install git, github cli
run: |
sudo apt-get install -y git gh
git config --global --add safe.directory $PWD
- name: Checkout Repository
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.base.ref }} # check out the target branch
- name: Download Diff
run: |
curl -O "${{ github.event.pull_request.diff_url }}" -L
- name: Write Task File
run: |
echo "Your coworker wants to apply a pull request to this project." > task.txt
echo "Read and review ${{ github.event.pull_request.number }}.diff file. Create a review-${{ github.event.pull_request.number }}.txt and write your concise comments and suggestions there." >> task.txt
echo "Do not ask me for confirmation at any point." >> task.txt
echo "" >> task.txt
echo "Title" >> task.txt
echo "${{ github.event.pull_request.title }}" >> task.txt
echo "" >> task.txt
echo "Description" >> task.txt
echo "${{ github.event.pull_request.body }}" >> task.txt
echo "" >> task.txt
echo "Diff file is: ${{ github.event.pull_request.number }}.diff" >> task.txt
- name: Set up environment
run: |
curl -sSL https://install.python-poetry.org | python3 -
export PATH="/github/home/.local/bin:$PATH"
poetry install --without evaluation,llama-index
poetry run playwright install --with-deps chromium
- name: Run OpenHands
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_MODEL: ${{ vars.LLM_MODEL }}
run: |
# Append path to launch poetry
export PATH="/github/home/.local/bin:$PATH"
# Append path to correctly import package, note: must set pwd at first
export PYTHONPATH=$(pwd):$PYTHONPATH
export WORKSPACE_MOUNT_PATH=$GITHUB_WORKSPACE
export WORKSPACE_BASE=$GITHUB_WORKSPACE
echo -e "/exit\n" | poetry run python openhands/core/main.py -i 50 -f task.txt
rm task.txt
- name: Check if review file is non-empty
id: check_file
run: |
ls -la
if [[ -s review-${{ github.event.pull_request.number }}.txt ]]; then
echo "non_empty=true" >> $GITHUB_OUTPUT
fi
shell: bash
- name: Create PR review if file is non-empty
env:
GH_TOKEN: ${{ github.token }}
if: steps.check_file.outputs.non_empty == 'true'
run: |
gh pr review ${{ github.event.pull_request.number }} --comment --body-file "review-${{ github.event.pull_request.number }}.txt"
-114
View File
@@ -1,114 +0,0 @@
---
name: Review Thread Gate
on:
pull_request:
branches: [main]
types: [opened, synchronize, reopened, ready_for_review, edited]
permissions:
contents: read
pull-requests: read
concurrency:
group: review-thread-gate-${{ github.event.pull_request.number || github.sha }}
cancel-in-progress: true
jobs:
unresolved-review-threads:
runs-on: ubuntu-latest
steps:
- name: Fail when unresolved review threads remain (unless waived)
uses: actions/github-script@v8
with:
script: |
const pr = context.payload.pull_request;
if (!pr) {
core.info('No pull_request payload available; skipping.');
return;
}
const waiverMatch = pr.body?.match(
/review-thread-waiver\s*:\s*(.+?)(?:\n|$)/i,
);
const waiverReason = waiverMatch?.[1]?.trim() || null;
const unresolved = [];
let cursor = null;
do {
const query = `
query($owner: String!, $repo: String!, $number: Int!, $cursor: String) {
repository(owner: $owner, name: $repo) {
pullRequest(number: $number) {
reviewThreads(first: 100, after: $cursor) {
nodes {
id
isResolved
isOutdated
comments(first: 1) {
nodes {
author { login }
path
line
url
}
}
}
pageInfo {
hasNextPage
endCursor
}
}
}
}
}
`;
const result = await github.graphql(query, {
owner: context.repo.owner,
repo: context.repo.repo,
number: pr.number,
cursor,
});
const page = result.repository.pullRequest.reviewThreads;
for (const thread of page.nodes) {
if (thread.isResolved) continue;
const firstComment = thread.comments.nodes[0];
unresolved.push({
url: firstComment?.url ?? '(no-url)',
author: firstComment?.author?.login ?? 'unknown',
path: firstComment?.path ?? 'unknown',
line: firstComment?.line ?? '?',
outdated: thread.isOutdated,
});
}
cursor = page.pageInfo.hasNextPage ? page.pageInfo.endCursor : null;
} while (cursor);
if (unresolved.length === 0) {
core.info('No unresolved review threads found.');
return;
}
const summaryLines = unresolved.map(
(thread) =>
`- ${thread.url} (author: ${thread.author}, file: ${thread.path}:${thread.line}, outdated: ${thread.outdated})`,
);
await core.summary
.addHeading(`Unresolved review threads: ${unresolved.length}`)
.addRaw(summaryLines.join('\n'))
.write();
if (waiverReason) {
core.warning(
`Unresolved review threads remain (${unresolved.length}), but waiver provided: ${waiverReason}`,
);
return;
}
core.setFailed(
`Found ${unresolved.length} unresolved review thread(s). Resolve all threads or add ` +
'`review-thread-waiver: <reason>` to the PR body for an intentional waiver.',
);
-403
View File
@@ -1,403 +0,0 @@
---
name: Run Eval
run-name: Run Eval (${{ inputs.benchmark || 'swebench' }}) ${{ inputs.reason || github.event.label.name || 'release' }}
on:
pull_request_target:
types: [labeled]
release:
types: [published]
workflow_dispatch:
inputs:
benchmark:
description: Benchmark to evaluate
required: false
default: swebench
type: choice
options:
- gaia
- swebench
- swtbench
- commit0
- swebenchmultimodal
- terminalbench
sdk_ref:
description: SDK commit/ref to evaluate (must be a semantic version like v1.0.0 unless 'Allow unreleased branches' is checked)
required: true
default: v1.14.0
allow_unreleased_branches:
description: Allow unreleased branches (bypasses semantic version requirement)
required: false
default: false
type: boolean
eval_limit:
description: Number of instances to run (any positive integer)
required: false
default: '1'
type: string
model_ids:
description: Comma-separated model IDs to evaluate. Must be keys of MODELS in resolve_model_config.py. Defaults to first model in that
dict.
required: false
default: ''
type: string
reason:
description: Reason for manual trigger
required: false
default: ''
eval_branch:
description: Evaluation repo branch to use (for testing feature branches)
required: false
default: main
type: string
benchmarks_branch:
description: Benchmarks repo branch to use (for testing feature branches)
required: false
default: main
type: string
instance_ids:
description: >-
Comma-separated instance IDs to evaluate.
Example: "django__django-11583,django__django-12345".
Spaces around commas are automatically stripped.
Leave empty to evaluate all instances up to eval_limit.
required: false
default: ''
num_infer_workers:
description: Number of inference workers (optional, overrides benchmark default)
required: false
default: ''
type: string
num_eval_workers:
description: Number of evaluation workers (optional, overrides benchmark default)
required: false
default: ''
type: string
enable_conversation_event_logging:
description: 'Enable Datadog persistence for conversation events (default: true)'
required: false
default: true
type: boolean
max_retries:
description: Max retries per instance (passed to benchmarks)
required: false
default: '3'
type: string
tool_preset:
description: >-
Tool preset for file editing. 'default' uses FileEditorTool,
'gemini' uses read_file/write_file/edit/list_directory,
'gpt5' uses apply_patch tool.
required: false
default: default
type: choice
options:
- default
- gemini
- gpt5
- planning
agent_type:
description: >-
Agent type: 'default' for standard Agent,
'acp-claude' for ACPAgent with Claude Code,
'acp-codex' for ACPAgent with Codex.
required: false
default: default
type: choice
options:
- default
- acp-claude
- acp-codex
env:
EVAL_REPO: OpenHands/evaluation
EVAL_WORKFLOW: eval-job.yml
jobs:
print-parameters:
if: >
github.event_name == 'release' ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'pull_request_target' &&
(github.event.label.name == 'run-eval-1' ||
github.event.label.name == 'run-eval-50' ||
github.event.label.name == 'run-eval-200' ||
github.event.label.name == 'run-eval-500'))
runs-on: ubuntu-latest
steps:
- name: Print all parameters
run: |
echo "=== Workflow Parameters ==="
echo "Event: ${{ github.event_name }}"
echo "Actor: ${{ github.actor }}"
echo "Ref: ${{ github.ref }}"
echo ""
echo "=== Input Parameters ==="
echo "benchmark: ${{ github.event.inputs.benchmark || 'swebench' }}"
echo "sdk_ref: ${{ github.event.inputs.sdk_ref || 'N/A' }}"
echo "allow_unreleased_branches: ${{ github.event.inputs.allow_unreleased_branches || 'false' }}"
echo "eval_limit: ${{ github.event.inputs.eval_limit || '1' }}"
echo "model_ids: ${{ github.event.inputs.model_ids || '(default)' }}"
echo "reason: ${{ github.event.inputs.reason || 'N/A' }}"
echo "eval_branch: ${{ github.event.inputs.eval_branch || 'main' }}"
echo "benchmarks_branch: ${{ github.event.inputs.benchmarks_branch || 'main' }}"
echo "instance_ids: ${{ github.event.inputs.instance_ids || 'N/A' }}"
echo "num_infer_workers: ${{ github.event.inputs.num_infer_workers || '(default)' }}"
echo "num_eval_workers: ${{ github.event.inputs.num_eval_workers || '(default)' }}"
echo "enable_conversation_event_logging: ${{ github.event.inputs.enable_conversation_event_logging || 'true' }}"
echo "max_retries: ${{ github.event.inputs.max_retries || '3' }}"
echo "tool_preset: ${{ github.event.inputs.tool_preset || 'default' }}"
echo ""
echo "=== Environment Variables ==="
echo "EVAL_REPO: ${{ env.EVAL_REPO }}"
echo "EVAL_WORKFLOW: ${{ env.EVAL_WORKFLOW }}"
echo ""
echo "=== Label (for PR events) ==="
echo "Label: ${{ github.event.label.name || 'N/A' }}"
build-and-evaluate:
needs: print-parameters
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
actions: write
issues: write
pull-requests: write
steps:
- name: Checkout sdk code (base for validation)
uses: actions/checkout@v4
with:
ref: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.sdk_ref || (github.event_name == 'pull_request_target' &&
github.event.pull_request.base.ref || github.ref) }}
fetch-depth: 0
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.13'
- name: Validate eval_limit
if: github.event_name == 'workflow_dispatch'
run: |
if ! [[ "${{ github.event.inputs.eval_limit }}" =~ ^[1-9][0-9]*$ ]]; then
echo "Error: eval_limit must be a positive integer, got: ${{ github.event.inputs.eval_limit }}"
exit 1
fi
- name: Validate SDK reference (semantic version check)
if: github.event_name == 'workflow_dispatch'
env:
SDK_REF: ${{ github.event.inputs.sdk_ref }}
ALLOW_UNRELEASED_BRANCHES: ${{ github.event.inputs.allow_unreleased_branches }}
run: |
python3 .github/run-eval/validate_sdk_ref.py
- name: Install dependencies
run: |
pip install 'litellm>=1.81.0'
- name: Load model IDs from Python script
id: load-models
run: |
# Extract all model IDs from resolve_model_config.py
ALLOWED_MODEL_IDS=$(python3 << 'EOF'
import sys
sys.path.insert(0, '.github/run-eval')
from resolve_model_config import MODELS
import json
print(json.dumps(list(MODELS.keys())))
EOF
)
DEFAULT_MODEL=$(echo "$ALLOWED_MODEL_IDS" | jq -r '.[0]')
if [ -z "$DEFAULT_MODEL" ] || [ "$DEFAULT_MODEL" = "null" ]; then
echo "No models configured" >&2
exit 1
fi
echo "allowed_model_ids=$ALLOWED_MODEL_IDS" >> "$GITHUB_OUTPUT"
echo "default_model=$DEFAULT_MODEL" >> "$GITHUB_OUTPUT"
- name: Resolve parameters
id: params
env:
DEFAULT_MODEL: ${{ steps.load-models.outputs.default_model }}
ALLOWED_MODEL_IDS_JSON: ${{ steps.load-models.outputs.allowed_model_ids }}
PAT_TOKEN_DEFAULT: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
run: |
set -euo pipefail
# Set PAT token for cross-repo workflow dispatch
PAT_TOKEN="$PAT_TOKEN_DEFAULT"
if [ -z "$PAT_TOKEN" ]; then
echo "Missing PAT token" >&2
exit 1
fi
echo "PAT_TOKEN=$PAT_TOKEN" >> "$GITHUB_ENV"
# Determine eval limit and SDK SHA based on trigger
if [ "${{ github.event_name }}" = "pull_request_target" ]; then
LABEL="${{ github.event.label.name }}"
case "$LABEL" in
run-eval-1) EVAL_LIMIT=1 ;;
run-eval-50) EVAL_LIMIT=50 ;;
run-eval-200) EVAL_LIMIT=200 ;;
run-eval-500) EVAL_LIMIT=500 ;;
*) echo "Unsupported label $LABEL" >&2; exit 1 ;;
esac
SDK_SHA="${{ github.event.pull_request.head.sha }}"
PR_NUMBER="${{ github.event.pull_request.number }}"
TRIGGER_DESCRIPTION="Label '${LABEL}' on PR #${PR_NUMBER}"
elif [ "${{ github.event_name }}" = "release" ]; then
EVAL_LIMIT=50
# Use tag instead of target_commitish because release branches are automatically deleted after merge
SDK_SHA=$(git rev-parse "${{ github.event.release.tag_name }}")
PR_NUMBER=""
TRIGGER_DESCRIPTION="Release ${{ github.event.release.tag_name }}"
else
EVAL_LIMIT="${{ github.event.inputs.eval_limit }}"
SDK_REF="${{ github.event.inputs.sdk_ref }}"
# Convert ref to SHA for manual dispatch
# Resolve SHA robustly for both branch refs and raw SHAs (avoid double-prefix issues)
SDK_SHA=$(git rev-parse --verify "$SDK_REF^{commit}" 2>/dev/null || \
git rev-parse --verify "origin/$SDK_REF^{commit}" 2>/dev/null || \
echo "$SDK_REF")
PR_NUMBER=""
REASON="${{ github.event.inputs.reason }}"
if [ -z "$REASON" ]; then
REASON="manual"
fi
TRIGGER_DESCRIPTION="Manual trigger: ${REASON}"
fi
# Normalize and validate model IDs
MODELS_INPUT="${{ github.event_name == 'workflow_dispatch' && github.event.inputs.model_ids || '' }}"
if [ -z "$MODELS_INPUT" ]; then
MODELS_INPUT="$DEFAULT_MODEL"
fi
MODELS=$(printf '%s' "$MODELS_INPUT" | tr ', ' '\n' | sed '/^$/d' | paste -sd, -)
ALLOWED_LIST=$(echo "$ALLOWED_MODEL_IDS_JSON" | jq -r '.[]')
for MODEL in ${MODELS//,/ }; do
if ! echo "$ALLOWED_LIST" | grep -Fx "$MODEL" >/dev/null; then
echo "Model ID '$MODEL' not found in models.json" >&2
echo "Available models: $(echo "$ALLOWED_LIST" | paste -sd, -)" >&2
exit 1
fi
done
# Sanitize values to avoid GITHUB_OUTPUT parse errors (e.g., raw SHAs)
SDK_SHA=$(printf '%s' "$SDK_SHA" | tr -d '\n\r')
EVAL_LIMIT=$(printf '%s' "$EVAL_LIMIT" | tr -d '\n\r')
PR_NUMBER=$(printf '%s' "$PR_NUMBER" | tr -d '\n\r')
MODELS=$(printf '%s' "$MODELS" | tr -d '\n\r')
TRIGGER_DESCRIPTION=$(printf '%s' "$TRIGGER_DESCRIPTION" | tr -d '\n\r')
printf 'eval_limit=%s\n' "$EVAL_LIMIT" >> "$GITHUB_OUTPUT"
printf 'sdk_sha=%s\n' "$SDK_SHA" >> "$GITHUB_OUTPUT"
printf 'models=%s\n' "$MODELS" >> "$GITHUB_OUTPUT"
printf 'pr_number=%s\n' "$PR_NUMBER" >> "$GITHUB_OUTPUT"
printf 'trigger_desc=%s\n' "$TRIGGER_DESCRIPTION" >> "$GITHUB_OUTPUT"
- name: Resolve model configurations and verify availability
id: resolve-models
env:
MODEL_IDS: ${{ steps.params.outputs.models }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY_EVAL }}
LLM_BASE_URL: https://llm-proxy.eval.all-hands.dev
run: |
python3 .github/run-eval/resolve_model_config.py
- name: Dispatch evaluation workflow
env:
SDK_SHA: ${{ steps.params.outputs.sdk_sha }}
EVAL_LIMIT: ${{ steps.params.outputs.eval_limit }}
MODELS_JSON: ${{ steps.resolve-models.outputs.models_json }}
EVAL_REPO: ${{ env.EVAL_REPO }}
EVAL_WORKFLOW: ${{ env.EVAL_WORKFLOW }}
EVAL_BRANCH: ${{ github.event.inputs.eval_branch || 'main' }}
BENCHMARKS_BRANCH: ${{ github.event.inputs.benchmarks_branch || 'main' }}
BENCHMARK: ${{ github.event.inputs.benchmark || 'swebench' }}
TRIGGER_REASON: ${{ github.event.inputs.reason }}
PR_NUMBER: ${{ steps.params.outputs.pr_number }}
INSTANCE_IDS: ${{ github.event.inputs.instance_ids || '' }}
NUM_INFER_WORKERS: ${{ github.event.inputs.num_infer_workers || '' }}
NUM_EVAL_WORKERS: ${{ github.event.inputs.num_eval_workers || '' }}
ENABLE_CONVERSATION_EVENT_LOGGING: ${{ github.event.inputs.enable_conversation_event_logging || false }}
MAX_RETRIES: ${{ github.event.inputs.max_retries || '3' }}
TOOL_PRESET: ${{ github.event.inputs.tool_preset || 'default' }}
AGENT_TYPE: ${{ github.event.inputs.agent_type || 'default' }}
TRIGGERED_BY: ${{ github.actor }}
run: |
# Normalize instance_ids: strip all spaces
INSTANCE_IDS=$(printf '%s' "$INSTANCE_IDS" | tr -d ' ')
echo "Dispatching evaluation workflow with SDK commit: $SDK_SHA (benchmark: $BENCHMARK, eval branch: $EVAL_BRANCH, benchmarks branch: $BENCHMARKS_BRANCH, tool preset: $TOOL_PRESET)"
PAYLOAD=$(jq -n \
--arg sdk "$SDK_SHA" \
--arg eval_limit "$EVAL_LIMIT" \
--argjson models "$MODELS_JSON" \
--arg ref "$EVAL_BRANCH" \
--arg reason "$TRIGGER_REASON" \
--arg pr "$PR_NUMBER" \
--arg benchmarks "$BENCHMARKS_BRANCH" \
--arg benchmark "$BENCHMARK" \
--arg instance_ids "$INSTANCE_IDS" \
--arg num_infer_workers "$NUM_INFER_WORKERS" \
--arg num_eval_workers "$NUM_EVAL_WORKERS" \
--argjson enable_conversation_event_logging "$ENABLE_CONVERSATION_EVENT_LOGGING" \
--arg max_retries "$MAX_RETRIES" \
--arg tool_preset "$TOOL_PRESET" \
--arg agent_type "$AGENT_TYPE" \
--arg triggered_by "$TRIGGERED_BY" \
'{ref: $ref, inputs: {sdk_commit: $sdk, eval_limit: $eval_limit, models_json: ($models | tostring), trigger_reason: $reason, pr_number: $pr, benchmarks_branch: $benchmarks, benchmark: $benchmark, instance_ids: $instance_ids, num_infer_workers: $num_infer_workers, num_eval_workers: $num_eval_workers, enable_conversation_event_logging: $enable_conversation_event_logging, max_retries: $max_retries, tool_preset: $tool_preset, agent_type: $agent_type, triggered_by: $triggered_by}}')
RESPONSE=$(curl -sS -o /tmp/dispatch.out -w "%{http_code}" -X POST \
-H "Authorization: token $PAT_TOKEN" \
-H "Accept: application/vnd.github+json" \
-d "$PAYLOAD" \
"https://api.github.com/repos/${EVAL_REPO}/actions/workflows/${EVAL_WORKFLOW}/dispatches")
if [ "$RESPONSE" != "204" ]; then
echo "Dispatch failed (status $RESPONSE):" >&2
cat /tmp/dispatch.out >&2
exit 1
fi
- name: Comment on PR
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SDK_SHA: ${{ steps.params.outputs.sdk_sha }}
EVAL_LIMIT: ${{ steps.params.outputs.eval_limit }}
MODELS: ${{ steps.params.outputs.models }}
TRIGGER_DESC: ${{ steps.params.outputs.trigger_desc }}
EVENT_NAME: ${{ github.event_name }}
PR_NUMBER_INPUT: ${{ steps.params.outputs.pr_number }}
run: |
set -euo pipefail
PR_NUMBER="$PR_NUMBER_INPUT"
if [ "$EVENT_NAME" = "release" ] && [ -z "$PR_NUMBER" ]; then
# Attempt to find the merged PR for this commit
PR_NUMBER=$(curl -sS \
-H "Authorization: Bearer $GITHUB_TOKEN" \
-H "Accept: application/vnd.github+json" \
"https://api.github.com/repos/${{ github.repository }}/commits/${SDK_SHA}/pulls" \
| jq -r '.[0].number // ""')
fi
if [ -z "$PR_NUMBER" ]; then
echo "No PR found to comment on; skipping comment"
exit 0
fi
COMMENT_BODY=$(printf '**Evaluation Triggered**\n\n- Trigger: %s\n- SDK: %s\n- Eval limit: %s\n- Models: %s\n' \
"$TRIGGER_DESC" "$SDK_SHA" "$EVAL_LIMIT" "$MODELS")
curl -sS -X POST \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer $GITHUB_TOKEN" \
"https://api.github.com/repos/${{ github.repository }}/issues/${PR_NUMBER}/comments" \
-d "$(jq -n --arg body "$COMMENT_BODY" '{body: $body}')"
-199
View File
@@ -1,199 +0,0 @@
---
name: Run Examples Scripts
on:
pull_request:
types: [labeled]
workflow_dispatch:
inputs:
reason:
description: Reason for manual trigger
required: true
default: ''
schedule:
- cron: 30 22 * * * # Runs at 10:30pm UTC every day
permissions:
contents: read
pull-requests: write
issues: write
jobs:
test-examples:
# Schedule trigger only runs in the main repository, not in forks
if: github.event.label.name == 'test-examples' || github.event_name == 'workflow_dispatch' || (github.event_name == 'schedule' &&
github.repository == 'OpenHands/software-agent-sdk')
runs-on: ubuntu-24.04
timeout-minutes: 60
steps:
- name: Wait for agent server to finish build
if: github.event_name == 'pull_request'
uses: lewagon/wait-on-check-action@v1.4.1
with:
ref: ${{ github.event.pull_request.head.ref }}
check-name: Build & Push (python-amd64)
repo-token: ${{ secrets.GITHUB_TOKEN }}
wait-interval: 10
- name: Checkout
uses: actions/checkout@v5
with:
ref: ${{ github.event.pull_request.head.ref }}
repository: ${{ github.event.pull_request.head.repo.full_name }}
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install Node.js
uses: actions/setup-node@v6
with:
node-version: '22'
- name: Setup Apptainer
uses: eWaterCycle/setup-apptainer@v2
with:
apptainer-version: 1.3.6
- name: Install Chromium
run: |
sudo apt-get update
sudo apt-get install -y chromium-browser
- name: Install dependencies
run: uv sync --frozen --group dev
- name: Run examples
shell: bash
env:
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_MODEL: openhands/claude-haiku-4-5-20251001
LLM_BASE_URL: https://llm-proxy.app.all-hands.dev
RUNTIME_API_KEY: ${{ secrets.RUNTIME_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REPO_OWNER: ${{ github.repository_owner }}
REPO_NAME: ${{ github.event.repository.name }}
GITHUB_SHA: ${{ github.event.pull_request.head.sha }}
OPENHANDS_CLOUD_API_KEY: ${{ secrets.ALLHANDS_BOT_OPENHANDS_SAAS_API_KEY }}
# ACP agents (Claude Code, Codex) route through LiteLLM proxy
ANTHROPIC_BASE_URL: https://llm-proxy.app.all-hands.dev
ANTHROPIC_API_KEY: ${{ secrets.LLM_API_KEY }}
OPENAI_BASE_URL: https://llm-proxy.app.all-hands.dev
OPENAI_API_KEY: ${{ secrets.LLM_API_KEY }}
run: |
RESULTS_DIR=".example-test-results"
REPORT_PATH="examples_report.md"
rm -rf "$RESULTS_DIR"
mkdir -p "$RESULTS_DIR"
update_comment() {
if [ -z "$API_URL" ]; then
echo "Skipping PR comment update because API_URL is unset."
return
fi
local comment_body="$1"
local payload
local response
payload=$(jq -n --arg body "$comment_body" '{body: $body}')
if [ -z "$COMMENT_ID" ]; then
echo "Creating PR comment..."
if ! response=$(curl -sSf -X POST \
-H "Authorization: token ${GITHUB_TOKEN}" \
-H "Accept: application/vnd.github.v3+json" \
-H "Content-Type: application/json" \
"${API_URL}" \
-d "$payload"); then
echo "::error::Failed to create PR comment."
exit 1
fi
COMMENT_ID=$(echo "$response" | jq -r '.id // ""')
if [ -z "$COMMENT_ID" ]; then
echo "::error::GitHub API response did not include a comment id: $response"
exit 1
fi
echo "Created comment with ID: $COMMENT_ID"
else
echo "Updating PR comment (ID: $COMMENT_ID)..."
if ! curl -sSf -X PATCH \
-H "Authorization: token ${GITHUB_TOKEN}" \
-H "Accept: application/vnd.github.v3+json" \
-H "Content-Type: application/json" \
"https://api.github.com/repos/${REPO_OWNER}/${REPO_NAME}/issues/comments/${COMMENT_ID}" \
-d "$payload" > /dev/null; then
echo "::error::Failed to update PR comment (ID: $COMMENT_ID)."
exit 1
fi
fi
}
API_URL=""
COMMENT_ID=""
if [ "${{ github.event_name }}" = "pull_request" ]; then
API_URL="https://api.github.com/repos/${REPO_OWNER}/${REPO_NAME}/issues/${PR_NUMBER}/comments"
initial_comment="## 🔄 Running Examples with \`${LLM_MODEL}\`"
initial_comment+=$'\n\n'
initial_comment+="_Run in progress..._"
initial_comment+=$'\n'
update_comment "$initial_comment"
fi
EXIT_CODE=0
uv run pytest tests/examples/test_examples.py \
--run-examples \
--examples-results-dir "$RESULTS_DIR" \
-n 4 || EXIT_CODE=$?
TIMESTAMP="$(date -u '+%Y-%m-%d %H:%M:%S UTC')"
WORKFLOW_URL="${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}"
uv run python scripts/render_examples_report.py \
--results-dir "$RESULTS_DIR" \
--model "$LLM_MODEL" \
--workflow-url "$WORKFLOW_URL" \
--timestamp "$TIMESTAMP" \
--output "$REPORT_PATH"
COMMENT_BODY="$(cat "$REPORT_PATH")"
echo "$COMMENT_BODY"
if [ "${{ github.event_name }}" = "pull_request" ]; then
echo "Publishing PR comment..."
update_comment "$COMMENT_BODY"
fi
if [ $EXIT_CODE -ne 0 ]; then
exit $EXIT_CODE
fi
- name: Read examples report for issue comment
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
id: read_report
shell: bash
run: |
if [ -f examples_report.md ]; then
REPORT_CONTENT=$(cat examples_report.md)
echo "report<<EOF" >> "$GITHUB_OUTPUT"
echo "$REPORT_CONTENT" >> "$GITHUB_OUTPUT"
echo "EOF" >> "$GITHUB_OUTPUT"
else
echo "report=Report file not found" >> "$GITHUB_OUTPUT"
fi
- name: Comment with results on tracker issue
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
uses: KeisukeYamashita/create-comment@v1
with:
number: 976
unique: false
comment: |
**Trigger:** ${{ github.event_name == 'schedule' && 'Nightly Scheduled Run' || format('Manual Trigger: {0}', github.event.inputs.reason) }}
**Commit:** ${{ github.sha }}
**Workflow Run:** ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
${{ steps.read_report.outputs.report }}
-715
View File
@@ -1,715 +0,0 @@
---
name: Agent Server
on:
push:
branches: [main]
tags:
- '*' # Trigger on any tag (e.g., 1.0.0, 1.0.0a5, build-docker)
pull_request:
branches: [main]
workflow_dispatch:
inputs:
base_image:
description: Base runtime image
type: string
default: nikolaik/python-nodejs:python3.13-nodejs22-slim
image:
description: GHCR image name
type: string
default: ghcr.io/openhands/agent-server
platforms:
description: Target platforms
type: string
default: linux/amd64,linux/arm64
permissions:
contents: read
packages: write
jobs:
build-binary-and-test:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest, macos-latest]
steps:
- uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install dependencies
run: uv sync --dev
- name: Build binary
run: |
make build-server
# FIXME: windows-latest not working due to
# Run if [[ "windows-latest" == "windows-latest" ]]; then
# [PYI-2160:ERROR] Failed to load Python DLL 'C:\Users\RUNNER~1\AppData\Local\Temp\_MEI5602\python312.dll'.
# LoadLibrary: Invalid access to memory location.
# - name: Test binary (Windows)
# if: matrix.os == 'windows-latest'
# shell: pwsh
# run: |
# Get-ChildItem dist
# .\dist\openhands-agent-server.exe --help
- name: Test binary (Linux and macOS)
if: matrix.os != 'windows-latest'
shell: bash
run: |
# Test help command
./dist/openhands-agent-server --help
# Test server startup and template loading
echo "Testing server startup and template loading..."
./dist/openhands-agent-server --port 8002 > server_test.log 2>&1 &
SERVER_PID=$!
# Wait for server to start
sleep 5
# Check if server started successfully (no template errors)
if grep -q "system_prompt.j2.*not found" server_test.log; then
echo "ERROR: Template files not found in binary!"
cat server_test.log
kill $SERVER_PID 2>/dev/null || true
exit 1
fi
# Check if server is running
if ! kill -0 $SERVER_PID 2>/dev/null; then
echo "ERROR: Server failed to start!"
cat server_test.log
exit 1
fi
# Test basic API endpoint
if command -v curl >/dev/null 2>&1; then
echo "Testing basic API endpoint..."
if curl -f -s http://localhost:8002/health >/dev/null 2>&1; then
echo "✓ Health endpoint accessible"
else
echo "⚠ Health endpoint not accessible (may be expected)"
fi
fi
# Clean up
kill $SERVER_PID 2>/dev/null || true
wait $SERVER_PID 2>/dev/null || true
rm -f server_test.log
echo "✓ Binary test completed successfully"
- name: Upload binary artifact
uses: actions/upload-artifact@v7
with:
name: openhands-server-${{ matrix.os }}
path: |
dist/openhands-server*
retention-days: 7
check-openapi-schema:
name: Check OpenAPI Schema
runs-on: ubuntu-24.04
steps:
- name: Checkout PR branch
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Install Node.js (for npx)
uses: actions/setup-node@v6
with:
node-version: 22
- name: Install dependencies
run: |
uv sync --frozen --dev
- name: Check OpenAPI JSON and build client
env:
PYTHONPATH: .
run: |
make test-server-schema
build-and-push-image:
name: Build & Push (${{ matrix.variant }}-${{ matrix.arch }})
# Run on push events, pull requests from the same repository (not forks), and manual workflow_dispatch
# Fork PRs cannot push to GHCR and would fail authentication
if: >
github.event_name == 'push' ||
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'pull_request' &&
!github.event.pull_request.head.repo.fork)
strategy:
fail-fast: false
matrix:
# Explicit matrix: 3 variants × 2 architectures = 6 jobs
# Each job specifies exactly what it builds and where it runs
include:
# Python variant
- variant: python
arch: amd64
base_image: nikolaik/python-nodejs:python3.13-nodejs22
runner: ubuntu-24.04
platform: linux/amd64
- variant: python
arch: arm64
base_image: nikolaik/python-nodejs:python3.13-nodejs22
runner: ubuntu-24.04-arm
platform: linux/arm64
# Java variant
- variant: java
arch: amd64
base_image: eclipse-temurin:17-jdk
runner: ubuntu-24.04
platform: linux/amd64
- variant: java
arch: arm64
base_image: eclipse-temurin:17-jdk
runner: ubuntu-24.04-arm
platform: linux/arm64
# Golang variant
- variant: golang
arch: amd64
base_image: golang:1.21-bookworm
runner: ubuntu-24.04
platform: linux/amd64
- variant: golang
arch: arm64
base_image: golang:1.21-bookworm
runner: ubuntu-24.04-arm
platform: linux/arm64
runs-on: ${{ matrix.runner }}
env:
IMAGE: ${{ inputs.image != '' && inputs.image || 'ghcr.io/openhands/agent-server' }}
BASE_IMAGE: ${{ inputs.base_image != '' && inputs.base_image || matrix.base_image }}
CUSTOM_TAGS: ${{ matrix.variant }}
VARIANT: ${{ matrix.variant }}
ARCH: ${{ matrix.arch }}
TARGET: binary
PLATFORM: ${{ matrix.platform }}
# Use PR head SHA for pull requests to match the image tag expected by run-examples.yml
GITHUB_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
GITHUB_REF: ${{ github.ref }}
CI: 'true'
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.13'
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to GHCR
uses: docker/login-action@v4
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Prepare build context and metadata
id: prep
run: |
uv sync --frozen
# Generate build context and tags with arch suffix
# build.py now handles architecture tagging internally via --arch flag
# Add --versioned-tag when triggered by a git tag (e.g., v1.0.0)
BUILD_CMD="uv run ./openhands-agent-server/openhands/agent_server/docker/build.py --build-ctx-only --arch ${{ matrix.arch }}"
if [[ "${{ github.ref }}" == refs/tags/* ]]; then
BUILD_CMD="$BUILD_CMD --versioned-tag"
fi
eval "$BUILD_CMD"
# Alias tags_csv output to tags for the build action
TAGS=$(grep '^tags_csv=' $GITHUB_OUTPUT | cut -d= -f2-)
echo "tags=$TAGS" >> $GITHUB_OUTPUT
# Extract short SHA for consolidation
SHORT_SHA=$(echo ${{ github.sha }} | cut -c1-7)
echo "short_sha=$SHORT_SHA" >> $GITHUB_OUTPUT
# Extract versioned tags CSV for consolidation
VERSIONED_TAGS_CSV=$(grep '^versioned_tags_csv=' $GITHUB_OUTPUT | cut -d= -f2- || echo "")
echo "versioned_tags_csv=$VERSIONED_TAGS_CSV" >> $GITHUB_OUTPUT
# Verify outputs
echo "=== Build outputs ==="
echo "Build context: $(grep '^build_context=' $GITHUB_OUTPUT | cut -d= -f2-)"
echo "Tags: $TAGS"
echo "Short SHA: $SHORT_SHA"
echo "Versioned tags: $VERSIONED_TAGS_CSV"
echo "===================="
- name: Build & Push (${{ matrix.variant }}-${{ matrix.arch }})
id: build
uses: docker/build-push-action@v6
with:
context: ${{ steps.prep.outputs.build_context }}
file: ${{ steps.prep.outputs.dockerfile }}
target: ${{ env.TARGET }}
platforms: ${{ env.PLATFORM }}
push: true
tags: ${{ steps.prep.outputs.tags }}
cache-from: type=gha
cache-to: type=gha,mode=max
build-args: |
BASE_IMAGE=${{ env.BASE_IMAGE }}
- name: Cleanup build context
if: always()
run: |
if [ -n "${{ steps.prep.outputs.build_context }}" ] && [ -d "${{ steps.prep.outputs.build_context }}" ]; then
echo "Cleaning up build context: ${{ steps.prep.outputs.build_context }}"
rm -rf "${{ steps.prep.outputs.build_context }}"
fi
- name: Summary (${{ matrix.variant }}-${{ matrix.arch }}) - outputs
run: |
echo "Image: ${{ env.IMAGE }}"
echo "Variant: ${{ env.VARIANT }}"
echo "Architecture: ${{ env.ARCH }}"
echo "Platform: ${{ env.PLATFORM }}"
echo "Short SHA: ${{ steps.prep.outputs.short_sha }}"
echo "Tags: ${{ steps.prep.outputs.tags }}"
echo "Build digest: ${{ steps.build.outputs.digest }}"
- name: Save build info for consolidation
run: |
mkdir -p build-info
cat > "build-info/${{ matrix.variant }}-${{ matrix.arch }}.json" << EOF
{
"variant": "${{ matrix.variant }}",
"arch": "${{ matrix.arch }}",
"base_image": "${{ matrix.base_image }}",
"image": "${{ env.IMAGE }}",
"short_sha": "${{ steps.prep.outputs.short_sha }}",
"tags": "${{ steps.prep.outputs.tags }}",
"versioned_tags_csv": "${{ steps.prep.outputs.versioned_tags_csv }}",
"platform": "${{ env.PLATFORM }}"
}
EOF
- name: Upload build info artifact
uses: actions/upload-artifact@v7
with:
name: build-info-${{ matrix.variant }}-${{ matrix.arch }}
path: build-info/${{ matrix.variant }}-${{ matrix.arch }}.json
retention-days: 1
merge-manifests:
name: Merge Multi-Arch Manifests
needs: build-and-push-image
if: >
github.event_name == 'push' ||
(github.event_name == 'pull_request' &&
!github.event.pull_request.head.repo.fork)
runs-on: ubuntu-24.04
strategy:
matrix:
variant: [python, java, golang]
env:
IMAGE: ${{ inputs.image != '' && inputs.image || 'ghcr.io/openhands/agent-server' }}
steps:
- name: Download build info to extract SHORT_SHA
uses: actions/download-artifact@v8
with:
pattern: build-info-${{ matrix.variant }}-*
merge-multiple: true
path: build-info
- name: Extract SHORT_SHA from build info
id: get_sha
run: |
# Get SHORT_SHA from any build info artifact for this variant
SHORT_SHA=$(jq -r '.short_sha' build-info/${{ matrix.variant }}-amd64.json)
echo "short_sha=$SHORT_SHA" >> $GITHUB_OUTPUT
echo "Using SHORT_SHA: $SHORT_SHA"
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to GHCR
uses: docker/login-action@v4
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Create and push multi-arch manifest for ${{ matrix.variant }}
id: create_manifest
run: |
SHORT_SHA=${{ steps.get_sha.outputs.short_sha }}
VARIANT=${{ matrix.variant }}
MANIFEST_TAG="${SHORT_SHA}-${VARIANT}"
# Create multi-arch manifest combining amd64 and arm64 using buildx imagetools
# This properly handles manifest lists from Docker builds
echo "Creating multi-arch manifest: ${IMAGE}:${MANIFEST_TAG}"
docker buildx imagetools create -t ${IMAGE}:${MANIFEST_TAG} \
${IMAGE}:${SHORT_SHA}-${VARIANT}-amd64 \
${IMAGE}:${SHORT_SHA}-${VARIANT}-arm64
# Verify the multi-arch manifest
echo "Inspecting multi-arch manifest:"
docker buildx imagetools inspect ${IMAGE}:${MANIFEST_TAG}
echo "✓ Multi-arch manifest created: ${IMAGE}:${MANIFEST_TAG}"
# Create latest manifest if on main branch
if [ "${{ github.ref }}" == "refs/heads/main" ]; then
LATEST_TAG="latest-${VARIANT}"
echo "Creating latest multi-arch manifest: ${IMAGE}:${LATEST_TAG}"
docker buildx imagetools create -t ${IMAGE}:${LATEST_TAG} \
${IMAGE}:main-${VARIANT}-amd64 \
${IMAGE}:main-${VARIANT}-arm64
echo "Inspecting latest multi-arch manifest:"
docker buildx imagetools inspect ${IMAGE}:${LATEST_TAG}
echo "✓ Latest multi-arch manifest created: ${IMAGE}:${LATEST_TAG}"
MANIFEST_TAG="${MANIFEST_TAG},${LATEST_TAG}"
fi
# Create versioned manifests if triggered by a git tag
# Extract versioned tags from build info (format: "1.2.0-python,1.2.0-java")
VERSIONED_TAGS_CSV=$(jq -r '.versioned_tags_csv' build-info/${VARIANT}-amd64.json)
if [ -n "$VERSIONED_TAGS_CSV" ] && [ "$VERSIONED_TAGS_CSV" != "null" ] && [ "$VERSIONED_TAGS_CSV" != "" ]; then
echo "Found versioned tags: $VERSIONED_TAGS_CSV"
# Split CSV and create manifest for each versioned tag
IFS=',' read -ra VERSIONED_TAGS <<< "$VERSIONED_TAGS_CSV"
for VERSIONED_TAG in "${VERSIONED_TAGS[@]}"; do
if [ -n "$VERSIONED_TAG" ]; then
echo "Creating versioned multi-arch manifest: ${IMAGE}:${VERSIONED_TAG}"
docker buildx imagetools create -t ${IMAGE}:${VERSIONED_TAG} \
${IMAGE}:${VERSIONED_TAG}-amd64 \
${IMAGE}:${VERSIONED_TAG}-arm64
echo "Inspecting versioned multi-arch manifest:"
docker buildx imagetools inspect ${IMAGE}:${VERSIONED_TAG}
echo "✓ Versioned multi-arch manifest created: ${IMAGE}:${VERSIONED_TAG}"
MANIFEST_TAG="${MANIFEST_TAG},${VERSIONED_TAG}"
fi
done
fi
# Save manifest info for consolidation
mkdir -p manifest-info
cat > "manifest-info/${VARIANT}.json" << EOF
{
"variant": "${VARIANT}",
"image": "${IMAGE}",
"short_sha": "${SHORT_SHA}",
"manifest_tag": "${MANIFEST_TAG}"
}
EOF
- name: Upload manifest info artifact
uses: actions/upload-artifact@v7
with:
name: manifest-info-${{ matrix.variant }}
path: manifest-info/${{ matrix.variant }}.json
retention-days: 1
consolidate-build-info:
name: Consolidate Build Information
needs: [build-and-push-image, merge-manifests]
# Run if it's a PR and the matrix job completed (even if some variants failed)
if: github.event_name == 'pull_request' && always() && (needs.build-and-push-image.result == 'success' || needs.build-and-push-image.result ==
'failure')
runs-on: ubuntu-24.04
outputs:
build_summary: ${{ steps.consolidate.outputs.build_summary }}
steps:
- name: Download build info artifacts
uses: actions/download-artifact@v8
with:
pattern: build-info-*
merge-multiple: true
path: build-info
- name: Download manifest info artifacts
uses: actions/download-artifact@v8
with:
pattern: manifest-info-*
merge-multiple: true
path: manifest-info
- name: Consolidate build information from artifacts
id: consolidate
run: |
echo "Processing build info artifacts..."
ls -la build-info/
echo "Found $(ls build-info/*.json 2>/dev/null | wc -l) JSON files"
# Initialize variables
IMAGE=""
SHORT_SHA=""
ALL_TAGS=""
# Use associative arrays to track variants (bash 4+)
declare -A VARIANT_BASE_IMAGE
declare -A VARIANT_ARCHS
# Process each build info
for info_file in build-info/*.json; do
if [[ ! -f "$info_file" ]]; then
echo "Skipping $info_file - not a file"
continue
fi
echo "=== Processing $info_file ==="
cat "$info_file"
echo "=== End of $info_file ==="
# Extract information from JSON
VARIANT=$(jq -r '.variant' "$info_file")
ARCH=$(jq -r '.arch' "$info_file")
BASE_IMAGE=$(jq -r '.base_image' "$info_file")
VARIANT_IMAGE=$(jq -r '.image' "$info_file")
VARIANT_SHA=$(jq -r '.short_sha' "$info_file")
VARIANT_TAGS=$(jq -r '.tags' "$info_file")
# Set common values (same across all builds)
if [[ -z "$IMAGE" ]]; then
IMAGE="$VARIANT_IMAGE"
SHORT_SHA="$VARIANT_SHA"
fi
# Store variant information
VARIANT_BASE_IMAGE[$VARIANT]=$BASE_IMAGE
if [[ -z "${VARIANT_ARCHS[$VARIANT]}" ]]; then
VARIANT_ARCHS[$VARIANT]=$ARCH
else
VARIANT_ARCHS[$VARIANT]="${VARIANT_ARCHS[$VARIANT]}, $ARCH"
fi
# Collect tags (comma-separated to newline-separated)
if [[ -n "$VARIANT_TAGS" ]]; then
VARIANT_TAG_LIST=$(echo "$VARIANT_TAGS" | tr ',' '\n')
if [[ -n "$ALL_TAGS" ]]; then
ALL_TAGS="${ALL_TAGS}"$'\n'"${VARIANT_TAG_LIST}"
else
ALL_TAGS="$VARIANT_TAG_LIST"
fi
fi
done
# Build variants JSON array from collected data
VARIANTS_JSON="[]"
for VARIANT in "${!VARIANT_BASE_IMAGE[@]}"; do
BASE_IMG="${VARIANT_BASE_IMAGE[$VARIANT]}"
ARCHS="${VARIANT_ARCHS[$VARIANT]}"
VARIANTS_JSON=$(echo "$VARIANTS_JSON" | jq \
--arg variant "$VARIANT" \
--arg base_image "$BASE_IMG" \
--arg archs "$ARCHS" \
'. += [{custom_tags: $variant, base_image: $base_image, architectures: $archs}]')
echo "Added variant $VARIANT ($ARCHS), current variants JSON:"
echo "$VARIANTS_JSON" | jq .
done
# Process manifest info artifacts
echo "Processing manifest info artifacts..."
if [[ -d "manifest-info" ]]; then
ls -la manifest-info/
MANIFEST_TAGS=""
for manifest_file in manifest-info/*.json; do
if [[ -f "$manifest_file" ]]; then
echo "=== Processing $manifest_file ==="
cat "$manifest_file"
MANIFEST_TAG_CSV=$(jq -r '.manifest_tag' "$manifest_file")
# Convert comma-separated tags to newline-separated
MANIFEST_TAG_LIST=$(echo "$MANIFEST_TAG_CSV" | tr ',' '\n' | sed "s|^|${IMAGE}:|")
if [[ -n "$MANIFEST_TAGS" ]]; then
MANIFEST_TAGS="${MANIFEST_TAGS}"$'\n'"${MANIFEST_TAG_LIST}"
else
MANIFEST_TAGS="$MANIFEST_TAG_LIST"
fi
fi
done
# Add manifest tags to ALL_TAGS
if [[ -n "$MANIFEST_TAGS" ]]; then
echo "Adding manifest tags to output"
if [[ -n "$ALL_TAGS" ]]; then
ALL_TAGS="${ALL_TAGS}"$'\n'"${MANIFEST_TAGS}"
else
ALL_TAGS="$MANIFEST_TAGS"
fi
fi
else
echo "No manifest-info directory found (merge-manifests may not have run)"
fi
# Create consolidated build summary
BUILD_SUMMARY=$(jq -n \
--arg image "$IMAGE" \
--arg short_sha "$SHORT_SHA" \
--arg ghcr_url "https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server" \
--arg all_tags "$ALL_TAGS" \
--argjson variants "$VARIANTS_JSON" \
'{
image: $image,
short_sha: $short_sha,
ghcr_package_url: $ghcr_url,
all_tags: $all_tags,
variants: $variants
}')
echo "Consolidated build summary:"
echo "$BUILD_SUMMARY" | jq .
echo "DEBUG: Final variants count: $(echo "$VARIANTS_JSON" | jq 'length')"
echo "DEBUG: Final variants: $(echo "$VARIANTS_JSON" | jq -c '.')"
# Set output
{
echo 'build_summary<<EOF'
echo "$BUILD_SUMMARY"
echo 'EOF'
} >> $GITHUB_OUTPUT
update-pr-description:
name: Update PR description with agent server image
needs: consolidate-build-info
# Only on PRs, and only if the consolidation succeeded
if: github.event_name == 'pull_request' && needs.consolidate-build-info.result == 'success'
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: write
steps:
- name: Generate PR description from build summary
id: generate_description
run: |
echo "Event: ${{ github.event_name }}"
echo "PR number: ${{ github.event.number }}"
echo "Run attempt: ${{ github.run_attempt }}"
# Parse the build summary JSON
BUILD_SUMMARY='${{ needs.consolidate-build-info.outputs.build_summary }}'
echo "Build summary received:"
echo "$BUILD_SUMMARY" | jq .
# Extract basic information
IMAGE=$(echo "$BUILD_SUMMARY" | jq -r '.image')
SHORT_SHA=$(echo "$BUILD_SUMMARY" | jq -r '.short_sha')
GHCR_URL=$(echo "$BUILD_SUMMARY" | jq -r '.ghcr_package_url')
ALL_TAGS=$(echo "$BUILD_SUMMARY" | jq -r '.all_tags')
# Build the variants table dynamically
VARIANTS_TABLE=""
# Process each build
VARIANTS=$(echo "$BUILD_SUMMARY" | jq -r '.variants[] | @base64')
echo "DEBUG: Found builds (base64 encoded):"
echo "$VARIANTS"
echo "DEBUG: Number of builds: $(echo "$VARIANTS" | wc -l)"
for variant_data in $VARIANTS; do
# Decode base64 and extract build info
VARIANT_JSON=$(echo "$variant_data" | base64 --decode)
echo "DEBUG: Processing build JSON: $VARIANT_JSON"
CUSTOM_TAGS=$(echo "$VARIANT_JSON" | jq -r '.custom_tags')
BASE_IMAGE=$(echo "$VARIANT_JSON" | jq -r '.base_image')
ARCHS=$(echo "$VARIANT_JSON" | jq -r '.architectures // "amd64, arm64"')
echo "DEBUG: Adding variant $CUSTOM_TAGS with base image $BASE_IMAGE (archs: $ARCHS)"
# Add to variants table with architecture info
VARIANTS_TABLE="${VARIANTS_TABLE}| ${CUSTOM_TAGS} | ${ARCHS} | \`${BASE_IMAGE}\` | [Link](https://hub.docker.com/_/${BASE_IMAGE}) |"$'\n'
done
echo "DEBUG: Final variants table:"
echo "$VARIANTS_TABLE"
# Create the complete PR description with the requested format
PR_CONTENT=$(cat << EOF
<!-- AGENT_SERVER_IMAGES_START -->
---
**Agent Server images for this PR**
• **GHCR package:** ${GHCR_URL}
**Variants & Base Images**
| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
${VARIANTS_TABLE}
**Pull (multi-arch manifest)**
\`\`\`bash
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ${IMAGE}:${SHORT_SHA}-python
\`\`\`
**Run**
\`\`\`bash
docker run -it --rm \\
-p 8000:8000 \\
--name agent-server-${SHORT_SHA}-python \\
${IMAGE}:${SHORT_SHA}-python
\`\`\`
**All tags pushed for this build**
\`\`\`
${ALL_TAGS}
\`\`\`
**About Multi-Architecture Support**
- Each variant tag (e.g., \`${SHORT_SHA}-python\`) is a **multi-arch manifest** supporting both **amd64** and **arm64**
- Docker automatically pulls the correct architecture for your platform
- Individual architecture tags (e.g., \`${SHORT_SHA}-python-amd64\`) are also available if needed
<!-- AGENT_SERVER_IMAGES_END -->
EOF
)
# Set output for the next step
{
echo 'pr_content<<EOF'
echo "$PR_CONTENT"
echo 'EOF'
} >> $GITHUB_OUTPUT
- name: Update PR description with docker image details
uses: nefrob/pr-description@v1.2.0
with:
content: ${{ steps.generate_description.outputs.pr_content }}
regex: <!-- AGENT_SERVER_IMAGES_START -->.*?<!-- AGENT_SERVER_IMAGES_END -->
regexFlags: s
token: ${{ secrets.GITHUB_TOKEN }}
+15 -24
View File
@@ -1,30 +1,21 @@
---
# Workflow that marks issues and PRs with no activity for 30 days with "Stale" and closes them after 7 more days of no activity
name: Close stale issues
name: 'Close stale issues'
# Runs every day at 01:30
on:
schedule:
- cron: 30 1 * * *
schedule:
- cron: '30 1 * * *'
jobs:
stale:
# Only run scheduled jobs in the main repository, not in forks
if: github.repository == 'OpenHands/software-agent-sdk'
runs-on: ubuntu-22.04
steps:
- uses: actions/stale@v10
with:
repo-token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
stale-issue-message: This issue is stale because it has been open for 40 days with no activity. Remove the stale label or leave a
comment, otherwise it will be closed in 10 days.
stale-pr-message: This PR is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment,
otherwise it will be closed in 10 days.
days-before-stale: 40
exempt-issue-labels: roadmap,backlog
close-issue-message: This issue was automatically closed due to 50 days of inactivity. We do this to help keep the issues somewhat
manageable and focus on active issues.
close-pr-message: This PR was closed because it had no activity for 50 days. If you feel this was closed in error, and you would
like to continue the PR, please resubmit or let us know.
days-before-close: 10
operations-per-run: 150
stale:
runs-on: ubuntu-latest
steps:
- uses: actions/stale@v9
with:
stale-issue-message: 'This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.'
stale-pr-message: 'This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.'
days-before-stale: 30
exempt-issue-labels: 'tracked'
close-issue-message: 'This issue was closed because it has been stalled for over 30 days with no activity.'
close-pr-message: 'This PR was closed because it has been stalled for over 30 days with no activity.'
days-before-close: 7
-322
View File
@@ -1,322 +0,0 @@
---
name: Run tests
on:
push:
branches: [main]
pull_request:
branches: ['**']
permissions:
contents: write
pull-requests: write
jobs:
sdk-tests:
runs-on: blacksmith-2vcpu-ubuntu-2404
steps:
- name: Checkout
uses: actions/checkout@v5
with: {fetch-depth: 0}
- name: Detect sdk changes
id: changed
uses: tj-actions/changed-files@v47
with:
files: |
openhands-sdk/**
tests/sdk/**
pyproject.toml
uv.lock
.github/workflows/tests.yml
- name: Install uv
if: steps.changed.outputs.any_changed == 'true'
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps
if: steps.changed.outputs.any_changed == 'true'
run: uv sync --frozen --group dev
- name: Check for openhands.tools imports in sdk tests
if: steps.changed.outputs.any_changed == 'true'
run: |
echo "Checking for openhands.tools imports in tests/sdk..."
if grep -r "from openhands\.tools" tests/sdk/ || grep -r "import openhands\.tools" tests/sdk/; then
echo "ERROR: Found openhands.tools imports in tests/sdk/"
echo "SDK tests should only import from openhands.sdk"
echo "Please move tests that use openhands.tools to tests/cross/"
exit 1
fi
echo "✓ No openhands.tools imports found in tests/sdk/"
- name: Run sdk tests with coverage
if: steps.changed.outputs.any_changed == 'true'
run: |
# Clean up any existing coverage file
rm -f .coverage
# Use pytest-xdist (-n auto) for parallel execution with proper
# coverage collection. --forked prevents coverage from child processes.
CI=true uv run python -m pytest -vvs \
-n auto \
--cov=openhands-sdk \
--cov-report=term-missing \
--cov-fail-under=0 \
--cov-config=pyproject.toml \
tests/sdk
# Rename coverage file for upload
if [ -f .coverage ]; then
mv .coverage coverage-sdk.dat
echo "SDK coverage file prepared for upload"
fi
- name: Upload sdk coverage
if: steps.changed.outputs.any_changed == 'true' && always()
uses: actions/upload-artifact@v7
with:
name: coverage-sdk
path: coverage-sdk.dat
if-no-files-found: warn
tools-tests:
runs-on: blacksmith-2vcpu-ubuntu-2404
timeout-minutes: 15
steps:
- name: Checkout
uses: actions/checkout@v5
with: {fetch-depth: 0}
- name: Detect tools changes
id: changed
uses: tj-actions/changed-files@v47
with:
files: |
openhands-tools/**
tests/tools/**
pyproject.toml
uv.lock
.github/workflows/tests.yml
- name: Install uv
if: steps.changed.outputs.any_changed == 'true'
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps
if: steps.changed.outputs.any_changed == 'true'
run: uv sync --frozen --group dev
- name: Run tools tests with coverage
if: steps.changed.outputs.any_changed == 'true'
run: |
# Clean up any existing coverage file
rm -f .coverage
# Use --forked for tools tests due to terminal test conflicts
# when running in parallel (shared /tmp paths, subprocess management)
CI=true uv run python -m pytest -vvs \
--forked \
--cov=openhands-tools \
--cov-report=term-missing \
--cov-fail-under=0 \
--cov-config=pyproject.toml \
tests/tools
# Rename coverage file for upload
if [ -f .coverage ]; then
mv .coverage coverage-tools.dat
echo "Tools coverage file prepared for upload"
fi
- name: Upload tools coverage
if: steps.changed.outputs.any_changed == 'true' && always()
uses: actions/upload-artifact@v7
with:
name: coverage-tools
path: coverage-tools.dat
if-no-files-found: warn
agent-server-tests:
runs-on: blacksmith-2vcpu-ubuntu-2404
steps:
- name: Checkout
uses: actions/checkout@v5
with: {fetch-depth: 0}
- name: Detect Agent Server changes
id: changed
uses: tj-actions/changed-files@v47
with:
files: |
openhands-agent-server/**
tests/agent_server/**
pyproject.toml
uv.lock
.github/workflows/tests.yml
- name: Install uv
if: steps.changed.outputs.any_changed == 'true'
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps
if: steps.changed.outputs.any_changed == 'true'
run: uv sync --frozen --group dev
- name: Run Agent Server tests with coverage
if: steps.changed.outputs.any_changed == 'true'
run: |
# Clean up any existing coverage file
rm -f .coverage
# Use pytest-xdist (-n auto) for parallel execution with proper
# coverage collection. --forked prevents coverage from child processes.
CI=true uv run python -m pytest -vvs \
-n auto \
--cov=openhands-agent-server \
--cov-report=term-missing \
--cov-fail-under=0 \
--cov-config=pyproject.toml \
tests/agent_server
# Rename coverage file for upload
if [ -f .coverage ]; then
mv .coverage coverage-agent-server.dat
echo "Agent Server coverage file prepared for upload"
fi
- name: Upload Agent Server coverage
if: steps.changed.outputs.any_changed == 'true' && always()
uses: actions/upload-artifact@v7
with:
name: coverage-agent-server
path: coverage-agent-server.dat
if-no-files-found: warn
cross-tests:
runs-on: blacksmith-2vcpu-ubuntu-2404
steps:
- name: Checkout
uses: actions/checkout@v5
with: {fetch-depth: 0}
- name: Detect cross changes
id: changed
uses: tj-actions/changed-files@v47
with:
files: |
tests/**
openhands/**
pyproject.toml
uv.lock
.github/workflows/tests.yml
- name: Install uv
if: steps.changed.outputs.any_changed == 'true'
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps
if: steps.changed.outputs.any_changed == 'true'
run: uv sync --frozen --group dev
- name: Run cross tests with coverage
if: steps.changed.outputs.any_changed == 'true'
run: |
# Clean up any existing coverage file
rm -f .coverage
CI=true uv run python -m pytest -vvs \
--basetemp="${{ runner.temp }}/pytest" \
-o tmp_path_retention=none \
-o tmp_path_retention_count=0 \
--cov=openhands \
--cov-report=term-missing \
--cov-fail-under=0 \
--cov-config=pyproject.toml \
tests/cross
# Rename coverage file for upload
if [ -f .coverage ]; then
mv .coverage coverage-cross.dat
echo "Cross coverage file prepared for upload"
fi
- name: Upload cross coverage
if: steps.changed.outputs.any_changed == 'true' && always()
uses: actions/upload-artifact@v7
with:
name: coverage-cross
path: coverage-cross.dat
if-no-files-found: warn
coverage-report:
runs-on: blacksmith-2vcpu-ubuntu-2404
needs: [sdk-tests, tools-tests, agent-server-tests, cross-tests]
if: always() && github.event_name == 'pull_request'
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
python-version: '3.13'
- name: Install deps (for coverage CLI)
run: uv sync --frozen --group dev
- name: Download coverage artifacts
uses: actions/download-artifact@v8
with:
path: ./cov
continue-on-error: true
- name: Combine coverage data
run: |
shopt -s nullglob
# For some reason, the github action won't properly upload the original
# .converage* files
# Convert uploaded .dat files back to .coverage format for coverage tool
for dat_file in cov/**/coverage-*.dat; do
if [[ "$dat_file" == *coverage-sdk.dat ]]; then
cp "$dat_file" .coverage.sdk
elif [[ "$dat_file" == *coverage-tools.dat ]]; then
cp "$dat_file" .coverage.tools
elif [[ "$dat_file" == *coverage-agent-server.dat ]]; then
cp "$dat_file" .coverage.agent-server
elif [[ "$dat_file" == *coverage-cross.dat ]]; then
cp "$dat_file" .coverage.cross
fi
done
# Check if we have any coverage files
coverage_files=(.coverage.*)
if [ ${#coverage_files[@]} -eq 0 ]; then
echo "No coverage files found; skipping combined report."
exit 0
fi
echo "Found ${#coverage_files[@]} coverage files"
uv run coverage combine
uv run coverage xml -i -o coverage.xml
uv run coverage report -m
- name: Pytest coverage PR comment
if: always()
continue-on-error: true
uses: MishaKav/pytest-coverage-comment@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
pytest-xml-coverage-path: coverage.xml
title: Coverage Report
create-new-comment: false
hide-report: false
xml-skip-covered: true
report-only-changed-files: true
remove-links-to-files: true
remove-links-to-lines: true
-322
View File
@@ -1,322 +0,0 @@
---
# Automated TODO Management Workflow
#
# This workflow automatically scans for TODO(openhands) comments and creates
# pull requests to implement them using the OpenHands agent.
#
# Setup:
# 1. Add LLM_API_KEY to repository secrets
# 2. Ensure GITHUB_TOKEN has appropriate permissions
# 3. Make sure Github Actions are allowed to create and review PRs
# 4. Commit this file to .github/workflows/ in your repository
# 5. Configure the schedule or trigger manually
name: Automated TODO Management
on:
# Manual trigger
workflow_dispatch:
inputs:
max_todos:
description: Maximum number of TODOs to process in this run
required: false
default: '3'
type: string
todo_identifier:
description: TODO identifier to search for (e.g., TODO(openhands))
required: false
default: TODO(openhands)
type: string
# Trigger when 'automatic-todo' label is added to a PR
pull_request:
types: [labeled]
# Scheduled trigger (disabled by default, uncomment and customize as needed)
# schedule:
# # Run every Monday at 9 AM UTC
# - cron: "0 9 * * 1"
permissions:
contents: write
pull-requests: write
issues: write
jobs:
scan-todos:
runs-on: ubuntu-24.04
# Only run if triggered manually or if 'automatic-todo' label was added
if: >
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'pull_request' &&
github.event.label.name == 'automatic-todo')
outputs:
todos: ${{ steps.scan.outputs.todos }}
todo-count: ${{ steps.scan.outputs.todo-count }}
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0 # Full history for better context
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Copy TODO scanner
run: |
cp examples/03_github_workflows/03_todo_management/scanner.py /tmp/scanner.py
chmod +x /tmp/scanner.py
- name: Scan for TODOs
id: scan
run: |
echo "Scanning for TODO comments..."
# Run the scanner and capture output
TODO_IDENTIFIER="${{ github.event.inputs.todo_identifier || 'TODO(openhands)' }}"
python /tmp/scanner.py . --identifier "$TODO_IDENTIFIER" > todos.json
# Count TODOs
TODO_COUNT=$(python -c \
"import json; data=json.load(open('todos.json')); print(len(data))")
echo "Found $TODO_COUNT $TODO_IDENTIFIER items"
# Limit the number of TODOs to process
MAX_TODOS="${{ github.event.inputs.max_todos || '3' }}"
if [ "$TODO_COUNT" -gt "$MAX_TODOS" ]; then
echo "Limiting to first $MAX_TODOS TODOs"
python -c "
import json
data = json.load(open('todos.json'))
limited = data[:$MAX_TODOS]
json.dump(limited, open('todos.json', 'w'), indent=2)
"
TODO_COUNT=$MAX_TODOS
fi
# Set outputs
echo "todos=$(cat todos.json | jq -c .)" >> $GITHUB_OUTPUT
echo "todo-count=$TODO_COUNT" >> $GITHUB_OUTPUT
# Display found TODOs
echo "## 📋 Found TODOs" >> $GITHUB_STEP_SUMMARY
if [ "$TODO_COUNT" -eq 0 ]; then
echo "No TODO(openhands) comments found." >> $GITHUB_STEP_SUMMARY
else
echo "Found $TODO_COUNT TODO(openhands) items:" \
>> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
python -c "
import json
data = json.load(open('todos.json'))
for i, todo in enumerate(data, 1):
print(f'{i}. **{todo[\"file\"]}:{todo[\"line\"]}** - ' +
f'{todo[\"description\"]}')
" >> $GITHUB_STEP_SUMMARY
fi
process-todos:
needs: scan-todos
if: needs.scan-todos.outputs.todo-count > 0
runs-on: ubuntu-24.04
strategy:
matrix:
todo: ${{ fromJson(needs.scan-todos.outputs.todos) }}
max-parallel: 1 # Process one TODO at a time to avoid conflicts
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0
token: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
- name: Switch to feature branch with TODO management files
run: |
git checkout openhands/todo-management-example
git pull origin openhands/todo-management-example
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install OpenHands dependencies
run: |
# Install OpenHands SDK and tools from git repository
uv pip install --system "openhands-sdk @ git+https://github.com/OpenHands/agent-sdk.git@main#subdirectory=openhands-sdk"
uv pip install --system "openhands-tools @ git+https://github.com/OpenHands/agent-sdk.git@main#subdirectory=openhands-tools"
- name: Copy agent files
run: |
cp examples/03_github_workflows/03_todo_management/agent_script.py agent.py
cp examples/03_github_workflows/03_todo_management/prompt.py prompt.py
chmod +x agent.py
- name: Configure Git
run: |
git config --global user.name "openhands-bot"
git config --global user.email \
"openhands-bot@users.noreply.github.com"
- name: Process TODO
env:
LLM_MODEL: litellm_proxy/claude-sonnet-4-5-20250929
LLM_BASE_URL: https://llm-proxy.app.all-hands.dev
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
GITHUB_TOKEN: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
GITHUB_REPOSITORY: ${{ github.repository }}
TODO_FILE: ${{ matrix.todo.file }}
TODO_LINE: ${{ matrix.todo.line }}
TODO_DESCRIPTION: ${{ matrix.todo.description }}
PYTHONPATH: ''
run: |
echo "Processing TODO: $TODO_DESCRIPTION"
echo "File: $TODO_FILE:$TODO_LINE"
# Create a unique branch name for this TODO
BRANCH_NAME="todo/$(echo "$TODO_DESCRIPTION" | \
sed 's/[^a-zA-Z0-9]/-/g' | \
sed 's/--*/-/g' | \
sed 's/^-\|-$//g' | \
tr '[:upper:]' '[:lower:]' | \
cut -c1-50)"
echo "Branch name: $BRANCH_NAME"
# Create and switch to new branch (force create if exists)
git checkout -B "$BRANCH_NAME"
# Run the agent to process the TODO
# Stay in repository directory for git operations
# Create JSON payload for the agent
TODO_JSON=$(cat <<EOF
{
"file": "$TODO_FILE",
"line": $TODO_LINE,
"description": "$TODO_DESCRIPTION"
}
EOF
)
echo "JSON payload for agent:"
echo "$TODO_JSON"
# Debug environment and setup
echo "Current working directory: $(pwd)"
echo "Environment variables:"
echo " LLM_MODEL: $LLM_MODEL"
echo " LLM_BASE_URL: $LLM_BASE_URL"
echo " GITHUB_REPOSITORY: $GITHUB_REPOSITORY"
echo " LLM_API_KEY: ${LLM_API_KEY:+[SET]}"
echo " GITHUB_TOKEN: ${GITHUB_TOKEN:+[SET]}"
echo "Available files:"
ls -la
# Run the agent with detailed logging
echo "Starting agent execution..."
set +e # Don't exit on error, we want to capture it
uv run python agent.py "$TODO_JSON" 2>&1 | tee agent_output.log
AGENT_EXIT_CODE=$?
set -e
echo "Agent exit code: $AGENT_EXIT_CODE"
echo "Agent output log:"
cat agent_output.log
# Show files in working directory
echo "Files in working directory:"
ls -la
# If agent failed, show more details
if [ $AGENT_EXIT_CODE -ne 0 ]; then
echo "Agent failed with exit code $AGENT_EXIT_CODE"
echo "Last 50 lines of agent output:"
tail -50 agent_output.log
exit $AGENT_EXIT_CODE
fi
# Check if any changes were made
cd "$GITHUB_WORKSPACE"
if git diff --quiet; then
echo "No changes made by agent, skipping PR creation"
exit 0
fi
# Commit changes
git add -A
git commit -m "Implement TODO: $TODO_DESCRIPTION
Automatically implemented by OpenHands agent.
Co-authored-by: openhands <openhands@all-hands.dev>"
# Push branch
git push origin "$BRANCH_NAME"
# Create pull request
PR_TITLE="Implement TODO: $TODO_DESCRIPTION"
PR_BODY="## 🤖 Automated TODO Implementation
This PR automatically implements the following TODO:
**File:** \`$TODO_FILE:$TODO_LINE\`
**Description:** $TODO_DESCRIPTION
### Implementation
The OpenHands agent has analyzed the TODO and implemented the
requested functionality.
### Review Notes
- Please review the implementation for correctness
- Test the changes in your development environment
- The original TODO comment will be updated with this PR URL
once merged
---
*This PR was created automatically by the TODO Management workflow.*"
# Create PR using GitHub CLI or API
curl -X POST \
-H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
"https://api.github.com/repos/${{ github.repository }}/pulls" \
-d "{
\"title\": \"$PR_TITLE\",
\"body\": \"$PR_BODY\",
\"head\": \"$BRANCH_NAME\",
\"base\": \"${{ github.ref_name }}\"
}"
summary:
needs: [scan-todos, process-todos]
if: always()
runs-on: ubuntu-24.04
steps:
- name: Generate Summary
run: |
echo "# 🤖 TODO Management Summary" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
TODO_COUNT="${{ needs.scan-todos.outputs.todo-count || '0' }}"
echo "**TODOs Found:** $TODO_COUNT" >> $GITHUB_STEP_SUMMARY
if [ "$TODO_COUNT" -gt 0 ]; then
echo "**Processing Status:** ✅ Completed" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "Check the pull requests created for each TODO" \
"implementation." >> $GITHUB_STEP_SUMMARY
else
echo "**Status:** ️ No TODOs found to process" \
>> $GITHUB_STEP_SUMMARY
fi
echo "" >> $GITHUB_STEP_SUMMARY
echo "---" >> $GITHUB_STEP_SUMMARY
echo "*Workflow completed at $(date)*" >> $GITHUB_STEP_SUMMARY
-25
View File
@@ -1,25 +0,0 @@
---
name: Version bump guard
on:
pull_request:
branches: [main]
jobs:
version-bump-guard:
name: Check package versions
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- name: Checkout
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Validate package version changes
env:
VERSION_BUMP_BASE_REF: ${{ github.base_ref }}
PR_TITLE: ${{ github.event.pull_request.title }}
PR_HEAD_REF: ${{ github.event.pull_request.head.ref }}
run: python3 .github/scripts/check_version_bumps.py
-346
View File
@@ -1,346 +0,0 @@
---
name: Create Version Bump PRs
on:
# Triggered by pypi-release workflow after successful publish
# Note: No branches filter - releases run on tags (e.g., v1.11.4), not branches
workflow_run:
workflows: [Publish all OpenHands packages (uv)]
types: [completed]
# Allow manual trigger with version input
workflow_dispatch:
inputs:
version:
description: Version to bump to (e.g., 1.11.3)
required: true
type: string
jobs:
create-version-bump-prs:
runs-on: ubuntu-24.04
# Only run on successful workflow_run or manual dispatch
if: >
github.event_name == 'workflow_dispatch' ||
(github.event.workflow_run.conclusion == 'success' &&
github.event.workflow_run.event == 'release')
env:
GH_TOKEN: ${{ secrets.ALLHANDS_BOT_GITHUB_PAT }}
steps:
- name: Checkout
uses: actions/checkout@v5
- name: Get version from release or input
id: get_version
run: |
if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
VERSION="${{ github.event.inputs.version }}"
else
# Get version from the release that triggered the workflow_run
# The workflow_run was triggered by a release event
RELEASE_TAG=$(gh api repos/${{ github.repository }}/releases/latest --jq '.tag_name')
VERSION="${RELEASE_TAG#v}" # Remove 'v' prefix
fi
echo "version=$VERSION" >> $GITHUB_OUTPUT
echo "📦 Version: $VERSION"
- name: Validate version
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
if [ -z "$VERSION" ]; then
echo "❌ Version is empty"
exit 1
fi
echo "📦 Creating version bump PRs for version: $VERSION"
- name: Wait for packages to be available on PyPI
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
set -euo pipefail
PACKAGES=(
openhands-sdk
openhands-tools
openhands-workspace
openhands-agent-server
)
MAX_ATTEMPTS=60
SLEEP_SECONDS=20
echo "⏳ Waiting for packages to be available on PyPI..."
for PKG in "${PACKAGES[@]}"; do
echo "Checking $PKG==$VERSION..."
ATTEMPT=1
while [ $ATTEMPT -le $MAX_ATTEMPTS ]; do
# Check if the package version is available on PyPI
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
"https://pypi.org/pypi/$PKG/$VERSION/json")
if [ "$HTTP_CODE" = "200" ]; then
echo "✅ $PKG==$VERSION is available on PyPI"
break
fi
echo " Attempt $ATTEMPT/$MAX_ATTEMPTS: $PKG==$VERSION not yet available (HTTP $HTTP_CODE), waiting ${SLEEP_SECONDS}s..."
sleep $SLEEP_SECONDS
ATTEMPT=$((ATTEMPT + 1))
done
if [ $ATTEMPT -gt $MAX_ATTEMPTS ]; then
echo "❌ Timeout waiting for $PKG==$VERSION to be available on PyPI"
exit 1
fi
done
echo "✅ All packages are available on PyPI!"
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
version: latest
python-version: '3.12'
- name: Install Poetry
run: |
pipx install poetry==2.2.1
# OpenHands-CLI step runs first since it's simpler and less error-prone
- name: Create PR for OpenHands-CLI repo
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
set -euo pipefail
REPO="OpenHands/openhands-cli"
BRANCH="bump-sdk-$VERSION"
echo "🔄 Creating PR for $REPO..."
# Clone the repo
git clone "https://x-access-token:${GH_TOKEN}@github.com/${REPO}.git" openhands-cli-repo
cd openhands-cli-repo
# Configure git
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
# Check if branch already exists on remote
if git ls-remote --heads origin "$BRANCH" | grep -q "$BRANCH"; then
echo "⚠️ Branch $BRANCH already exists, checking out existing branch"
git fetch origin "$BRANCH"
git checkout "$BRANCH"
else
# Create branch
git checkout -b "$BRANCH"
fi
# OpenHands-CLI currently requires Python 3.12, so resolve with that interpreter.
uv add --python 3.12 --refresh \
"openhands-sdk==$VERSION" \
"openhands-tools==$VERSION"
# Check if there are changes
if git diff --quiet; then
echo "⚠️ No changes detected in $REPO - versions may already be up to date"
exit 0
fi
# Commit and push
git add pyproject.toml uv.lock
git commit -m "Bump openhands-sdk, openhands-tools to $VERSION" \
-m "Automated version bump after PyPI release." \
-m "Co-authored-by: openhands <openhands@all-hands.dev>"
git push -u origin "$BRANCH"
# Check if PR already exists
EXISTING_PR=$(gh pr list --repo "$REPO" --head "$BRANCH" --json number --jq '.[0].number')
if [ -n "$EXISTING_PR" ]; then
echo "✅ PR #$EXISTING_PR already exists for $REPO"
else
# Create PR
gh pr create \
--repo "$REPO" \
--title "Bump SDK packages to v$VERSION" \
--body "## Automated Version Bump
This PR updates the following packages to version **$VERSION**:
- \`openhands-sdk\`
- \`openhands-tools\`
**Triggered by:** Release of [software-agent-sdk v$VERSION](https://github.com/OpenHands/software-agent-sdk/releases/tag/v$VERSION)
---
_This PR was automatically created by the version-bump-prs workflow._" \
--base main \
--head "$BRANCH"
echo "✅ PR created for $REPO"
fi
- name: Create PR for OpenHands repo
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
set -euo pipefail
REPO="All-Hands-AI/OpenHands"
BRANCH="bump-sdk-$VERSION"
echo "🔄 Creating PR for $REPO..."
# Clone the repo
git clone "https://x-access-token:${GH_TOKEN}@github.com/${REPO}.git" openhands-repo
cd openhands-repo
# Configure git
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
# Check if branch already exists on remote
if git ls-remote --heads origin "$BRANCH" | grep -q "$BRANCH"; then
echo "⚠️ Branch $BRANCH already exists, checking out existing branch"
git fetch origin "$BRANCH"
git checkout "$BRANCH"
else
# Create branch
git checkout -b "$BRANCH"
fi
# 1. Update versions in pyproject.toml and poetry.lock using poetry (root)
# The --lock flag updates both pyproject.toml AND poetry.lock
# Note: enterprise/pyproject.toml gets these dependencies transitively via openhands-ai
echo "📝 Updating root pyproject.toml and poetry.lock..."
# Verify enterprise/pyproject.toml does NOT have SDK packages explicitly listed
# If they exist there, they will become stale since we only update root pyproject.toml
if [ -f "enterprise/pyproject.toml" ]; then
echo "🔍 Verifying enterprise/pyproject.toml doesn't have explicit SDK packages..."
SDK_PACKAGES=("openhands-sdk" "openhands-tools" "openhands-agent-server")
for pkg in "${SDK_PACKAGES[@]}"; do
# Match package name as a TOML key (with optional leading whitespace) followed by =
# This catches both 'openhands-sdk = "1.2.3"' and 'openhands-sdk="1.2.3"'
if grep -qE "^[[:space:]]*${pkg}[[:space:]]*=" enterprise/pyproject.toml; then
echo "❌ ERROR: enterprise/pyproject.toml contains explicit reference to '$pkg'"
echo " These packages should come transitively via openhands-ai dependency."
echo " Please remove '$pkg' from enterprise/pyproject.toml to avoid version drift."
exit 1
fi
done
echo "✅ enterprise/pyproject.toml does not have explicit SDK packages"
fi
# 1. Update versions in pyproject.toml using sed for exact pinning
# Note: We use sed instead of `poetry add --lock` because Poetry normalizes
# version constraints (e.g., "==1.13.1" becomes "1.13") which causes
# inconsistencies between [tool.poetry.dependencies] and [project].dependencies
echo "📝 Updating pyproject.toml with exact version pins..."
# Update [tool.poetry.dependencies] section
# Matches: openhands-sdk = "1.13" or openhands-sdk = "1.13.0"
sed -i -E 's/^(openhands-sdk = )"[^"]*"/\1"'"$VERSION"'"/' pyproject.toml
sed -i -E 's/^(openhands-tools = )"[^"]*"/\1"'"$VERSION"'"/' pyproject.toml
sed -i -E 's/^(openhands-agent-server = )"[^"]*"/\1"'"$VERSION"'"/' pyproject.toml
# Update [project].dependencies section (PEP 621 format)
# Matches: "openhands-sdk==1.13.1", or "openhands-sdk==1.13",
sed -i -E 's/"openhands-sdk==[^"]*"/"openhands-sdk=='"$VERSION"'"/' pyproject.toml
sed -i -E 's/"openhands-tools==[^"]*"/"openhands-tools=='"$VERSION"'"/' pyproject.toml
sed -i -E 's/"openhands-agent-server==[^"]*"/"openhands-agent-server=='"$VERSION"'"/' pyproject.toml
echo "✅ Updated pyproject.toml"
# 2. Regenerate poetry.lock with the new versions
# Note: In Poetry 2.x, the default behavior is to not update packages already
# in the lock file (the --no-update flag was removed in Poetry 2.x)
echo "📝 Regenerating poetry.lock..."
poetry lock
# 3. Update the version in sandbox_spec_service.py
echo "🔧 Updating AGENT_SERVER_IMAGE..."
SANDBOX_SPEC_FILE="openhands/app_server/sandbox/sandbox_spec_service.py"
if [ -f "$SANDBOX_SPEC_FILE" ]; then
# Update the AGENT_SERVER_IMAGE line with the new hash
sed -i "s|AGENT_SERVER_IMAGE = 'ghcr.io/openhands/agent-server:[^']*'|AGENT_SERVER_IMAGE = 'ghcr.io/openhands/agent-server:${VERSION}-python'|" "$SANDBOX_SPEC_FILE"
echo "✅ Updated AGENT_SERVER_IMAGE to: ghcr.io/openhands/agent-server:${VERSION}-python"
else
echo "❌ sandbox_spec_service.py not found at expected path"
exit 1
fi
# 4. Run pre-commit to fix formatting (pyproject-fmt removes parentheses from version specs)
echo "🔧 Running pre-commit to fix formatting..."
pip install pre-commit
pre-commit run --files pyproject.toml --config ./dev_config/python/.pre-commit-config.yaml || true
# Check if there are changes
if git diff --quiet; then
echo "⚠️ No changes detected in $REPO - versions may already be up to date"
exit 0
fi
# Commit and push
git add .
git commit -m "Bump openhands-sdk, openhands-tools, openhands-agent-server to $VERSION" \
-m "Automated version bump after PyPI release." \
-m "" \
-m "Changes:" \
-m "- Updated SDK packages to v$VERSION in pyproject.toml" \
-m "- Regenerated poetry.lock" \
-m "- Updated AGENT_SERVER_IMAGE to ${VERSION}" \
-m "" \
-m "Co-authored-by: openhands <openhands@all-hands.dev>"
git push -u origin "$BRANCH"
# Check if PR already exists
EXISTING_PR=$(gh pr list --repo "$REPO" --head "$BRANCH" --json number --jq '.[0].number')
if [ -n "$EXISTING_PR" ]; then
echo "✅ PR #$EXISTING_PR already exists for $REPO"
else
# Create PR
gh pr create \
--repo "$REPO" \
--title "Bump SDK packages to v$VERSION" \
--body "## Automated Version Bump
This PR updates the following packages to version **$VERSION**:
- \`openhands-sdk\`
- \`openhands-tools\`
- \`openhands-agent-server\`
### Changes
- Updated SDK packages in \`pyproject.toml\`
- Regenerated \`poetry.lock\`
- Updated \`AGENT_SERVER_IMAGE\` to \`${VERSION}\` in \`sandbox_spec_service.py\`
**Triggered by:** Release of [software-agent-sdk v$VERSION](https://github.com/OpenHands/software-agent-sdk/releases/tag/v$VERSION)
---
_This PR was automatically created by the version-bump-prs workflow._" \
--base main \
--head "$BRANCH"
echo "✅ PR created for $REPO"
fi
- name: Summary
env:
VERSION: ${{ steps.get_version.outputs.version }}
run: |
echo "## ✅ Version Bump PRs Created" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "PRs have been created to bump SDK packages to version **$VERSION**:" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "- [OpenHands](https://github.com/All-Hands-AI/OpenHands/pulls?q=is%3Apr+bump-sdk-$VERSION)" >> $GITHUB_STEP_SUMMARY
echo "- [OpenHands-CLI](https://github.com/OpenHands/openhands-cli/pulls?q=is%3Apr+bump-sdk-$VERSION)" >> $GITHUB_STEP_SUMMARY
- name: Notify Slack
uses: slackapi/slack-github-action@v2.1.1
with:
method: chat.postMessage
token: ${{ secrets.SLACK_BOT_TOKEN }}
payload: |
channel: C08E1SYKEM9
text: "🚀 *SDK v${{ steps.get_version.outputs.version }} published to PyPI!*\n\nVersion bump PRs created:\n• <https://github.com/All-Hands-AI/OpenHands/pulls?q=is%3Apr+bump-sdk-${{ steps.get_version.outputs.version }}|OpenHands>\n• <https://github.com/OpenHands/openhands-cli/pulls?q=is%3Apr+bump-sdk-${{ steps.get_version.outputs.version }}|OpenHands-CLI>\n\n<https://github.com/OpenHands/software-agent-sdk/releases/tag/v${{ steps.get_version.outputs.version }}|View Release>"
+51 -37
View File
@@ -14,7 +14,7 @@ dist/
downloads/
eggs/
.eggs/
lib/
./lib/
lib64/
parts/
sdist/
@@ -31,6 +31,7 @@ requirements.txt
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
@@ -56,7 +57,6 @@ cover/
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
@@ -85,6 +85,7 @@ ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
.python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
@@ -120,6 +121,7 @@ celerybeat.pid
# Environments
.env
frontend/.env
.venv
env/
venv/
@@ -127,6 +129,7 @@ ENV/
env.bak/
.env.bak
venv.bak/
*venv/
# Spyder project settings
.spyderproject
@@ -158,28 +161,37 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
# VS Code: Ignore all but certain files that specify repo-specific settings.
# https://stackoverflow.com/questions/32964920/should-i-commit-the-vscode-folder-to-source-control
.vscode/**/*
!.vscode/extensions.json
!.vscode/tasks.json
# VS Code extensions/forks:
.vscode/
.cursorignore
.rooignore
.clineignore
.windsurfignore
.cursorrules
.roorules
.clinerules
.windsurfrules
.cursor/rules
.roo/rules
.cline/rules
.windsurf/rules
.repomix
repomix-output.txt
# evaluation
evaluation/evaluation_outputs
evaluation/outputs
evaluation/swe_bench/eval_workspace*
evaluation/SWE-bench/data
evaluation/webarena/scripts/webarena_env.sh
evaluation/bird/data
evaluation/gaia/data
evaluation/gorilla/data
evaluation/toolqa/data
evaluation/scienceagentbench/benchmark
# frontend
# dependencies
frontend/.pnp
frontend/bun.lockb
frontend/yarn.lock
.pnp.js
# testing
frontend/coverage
test_results*
/_test_files_tmp/
# production
frontend/build
frontend/dist
# misc
.DS_Store
@@ -196,22 +208,24 @@ logs
# agent
.envrc
/workspace
/_test_workspace
/debug
cache
.jinja_cache/
.conversations*
/workspace/
openapi.json
.client/
# configuration
config.toml
config.toml_
config.toml.bak
# Local workspace files
.beads/*.db
.worktrees/
agent-sdk.workspace.code-workspace
# swe-bench-eval
image_build_logs
run_instance_logs
# Integration test outputs
tests/integration/outputs/
tests/integration/api_compliance/outputs/
runtime_*.tar
# Agent-generated temp
.agent_tmp/
# docker build
containers/runtime/Dockerfile
containers/runtime/project.tar.gz
containers/runtime/code
**/node_modules/
-14
View File
@@ -1,14 +0,0 @@
{
"stop": [
{
"matcher": "*",
"hooks": [
{
"type": "command",
"command": ".openhands/hooks/on_stop.sh",
"timeout": 600
}
]
}
]
}
-303
View File
@@ -1,303 +0,0 @@
#!/bin/bash
# Stop hook: runs pre-commit, pytest, and checks CI status before allowing agent to finish
#
# This hook runs when the agent attempts to stop/finish.
# It can BLOCK the stop by:
# - Exiting with code 2 (blocked)
# - Outputting JSON: {"decision": "deny", "additionalContext": "feedback message"}
#
# Environment variables available:
# OPENHANDS_PROJECT_DIR - Project directory
# OPENHANDS_SESSION_ID - Session ID
# GITHUB_TOKEN - GitHub API token (if available)
set -o pipefail
PROJECT_DIR="${OPENHANDS_PROJECT_DIR:-$(pwd)}"
cd "$PROJECT_DIR" || exit 1
# Collect all issues to report back to the agent
ISSUES=""
BLOCK_STOP=false
log_issue() {
ISSUES="${ISSUES}${1}\n"
BLOCK_STOP=true
}
>&2 echo "=== Stop Hook ==="
>&2 echo "Project directory: $PROJECT_DIR"
>&2 echo ""
# --------------------------
# Step 1: Run pre-commit on all files
# --------------------------
>&2 echo "=== Running pre-commit run --all-files ==="
if command -v uv &> /dev/null; then
PRECOMMIT_OUTPUT=$(uv run pre-commit run --all-files 2>&1)
PRECOMMIT_EXIT=$?
else
PRECOMMIT_OUTPUT=$(pre-commit run --all-files 2>&1)
PRECOMMIT_EXIT=$?
fi
>&2 echo "$PRECOMMIT_OUTPUT"
if [ $PRECOMMIT_EXIT -ne 0 ]; then
>&2 echo "⚠️ pre-commit found issues (exit code: $PRECOMMIT_EXIT)"
log_issue "## Pre-commit Failed\n\nPre-commit checks failed. Please fix the following issues:\n\n\`\`\`\n${PRECOMMIT_OUTPUT}\n\`\`\`"
else
>&2 echo "✓ pre-commit passed"
fi
>&2 echo ""
# --------------------------
# Step 2: Detect changed files and run appropriate tests
# --------------------------
>&2 echo "=== Detecting changed files and running appropriate tests ==="
# Get changed files from git (staged, unstaged, and untracked)
CHANGED_FILES=$(git status --porcelain 2>/dev/null | awk '{print $NF}')
if [ -n "$CHANGED_FILES" ]; then
>&2 echo "Changed files:"
>&2 echo "$CHANGED_FILES" | head -20
>&2 echo ""
# Map changed files to test directories
PROJECTS_TO_TEST=""
add_project() {
local project="$1"
if [[ ! "$PROJECTS_TO_TEST" =~ "$project" ]]; then
PROJECTS_TO_TEST="$PROJECTS_TO_TEST $project"
fi
}
while IFS= read -r file; do
case "$file" in
openhands-sdk/*) add_project "tests/sdk" ;;
openhands-tools/*) add_project "tests/tools" ;;
openhands-workspace/*) add_project "tests/workspace" ;;
openhands-agent-server/*) add_project "tests/agent_server" ;;
tests/sdk/*) add_project "tests/sdk" ;;
tests/tools/*) add_project "tests/tools" ;;
tests/workspace/*) add_project "tests/workspace" ;;
tests/agent_server/*) add_project "tests/agent_server" ;;
tests/cross/*) add_project "tests/cross" ;;
tests/examples/*) add_project "tests/examples" ;;
tests/github_workflows/*) add_project "tests/github_workflows" ;;
examples/*) add_project "tests/examples" ;;
scripts/*) add_project "tests/cross" ;;
pyproject.toml|uv.lock) add_project "tests/cross" ;;
esac
done <<< "$CHANGED_FILES"
PROJECTS_TO_TEST=$(echo "$PROJECTS_TO_TEST" | xargs)
if [ -n "$PROJECTS_TO_TEST" ]; then
>&2 echo "Running tests for: $PROJECTS_TO_TEST"
>&2 echo ""
for project in $PROJECTS_TO_TEST; do
if [ -d "$project" ]; then
>&2 echo "=== Testing $project ==="
if command -v uv &> /dev/null; then
PYTEST_OUTPUT=$(uv run pytest "$project" -v --tb=short -x 2>&1)
PYTEST_EXIT=$?
else
PYTEST_OUTPUT=$(pytest "$project" -v --tb=short -x 2>&1)
PYTEST_EXIT=$?
fi
>&2 echo "$PYTEST_OUTPUT"
if [ $PYTEST_EXIT -ne 0 ]; then
>&2 echo "⚠️ pytest failed for $project"
log_issue "## Pytest Failed for $project\n\nTests failed. Please fix the following:\n\n\`\`\`\n${PYTEST_OUTPUT}\n\`\`\`"
fi
>&2 echo ""
fi
done
else
>&2 echo "No tests to run for changed files"
fi
else
>&2 echo "No changed files detected, skipping local tests"
fi
>&2 echo ""
# --------------------------
# Step 3: Check if there's a pushed commit and wait for CI
# --------------------------
>&2 echo "=== Checking GitHub CI status ==="
# Check if we're in a git repo with a GitHub remote
GITHUB_REMOTE=$(git remote -v 2>/dev/null | grep -E "(github\.com.*push)" | head -1)
if [ -z "$GITHUB_REMOTE" ]; then
>&2 echo "No GitHub remote found, skipping CI check"
else
# Extract owner/repo from remote URL
# Handle both HTTPS and SSH formats
REPO_INFO=$(echo "$GITHUB_REMOTE" | sed -E 's|.*github\.com[:/]([^/]+)/([^/.]+)(\.git)?.*|\1/\2|')
if [ -z "$REPO_INFO" ]; then
>&2 echo "Could not parse GitHub repository info"
else
>&2 echo "Repository: $REPO_INFO"
# Get current branch
CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null)
>&2 echo "Current branch: $CURRENT_BRANCH"
# Get the latest commit SHA
LOCAL_SHA=$(git rev-parse HEAD 2>/dev/null)
>&2 echo "Local commit: ${LOCAL_SHA:0:8}"
# Check if this commit has been pushed
REMOTE_SHA=$(git ls-remote origin "$CURRENT_BRANCH" 2>/dev/null | awk '{print $1}')
if [ -z "$REMOTE_SHA" ]; then
>&2 echo "Branch not pushed to remote, skipping CI check"
elif [ "$LOCAL_SHA" != "$REMOTE_SHA" ]; then
>&2 echo "Local commit differs from remote (remote: ${REMOTE_SHA:0:8}), skipping CI check"
else
>&2 echo "Commit has been pushed, checking CI status..."
# Check if GITHUB_TOKEN is available
if [ -z "$GITHUB_TOKEN" ]; then
>&2 echo "GITHUB_TOKEN not set, cannot check CI status"
else
# Use gh CLI if available, otherwise fall back to API
if command -v gh &> /dev/null; then
>&2 echo "Using gh CLI to check CI status..."
# Get check runs for this commit
CI_STATUS=$(gh api "repos/$REPO_INFO/commits/$LOCAL_SHA/check-runs" \
--jq '.check_runs | map({name: .name, status: .status, conclusion: .conclusion})' 2>&1)
if [ $? -ne 0 ]; then
>&2 echo "Failed to get CI status: $CI_STATUS"
else
# Parse the status
TOTAL_CHECKS=$(echo "$CI_STATUS" | jq 'length')
if [ "$TOTAL_CHECKS" -eq 0 ]; then
>&2 echo "No CI checks found for this commit"
else
>&2 echo "Found $TOTAL_CHECKS CI check(s)"
# Check for in-progress runs
IN_PROGRESS=$(echo "$CI_STATUS" | jq '[.[] | select(.status != "completed")] | length')
FAILED=$(echo "$CI_STATUS" | jq '[.[] | select(.conclusion == "failure" or .conclusion == "timed_out" or .conclusion == "cancelled")] | length')
if [ "$IN_PROGRESS" -gt 0 ]; then
>&2 echo "$IN_PROGRESS check(s) still in progress"
# Wait for CI to complete (with timeout)
MAX_WAIT=300 # 5 minutes
WAIT_INTERVAL=15
TOTAL_WAITED=0
while [ "$IN_PROGRESS" -gt 0 ] && [ "$TOTAL_WAITED" -lt "$MAX_WAIT" ]; do
>&2 echo "Waiting for CI... (${TOTAL_WAITED}s / ${MAX_WAIT}s max)"
sleep $WAIT_INTERVAL
TOTAL_WAITED=$((TOTAL_WAITED + WAIT_INTERVAL))
CI_STATUS=$(gh api "repos/$REPO_INFO/commits/$LOCAL_SHA/check-runs" \
--jq '.check_runs | map({name: .name, status: .status, conclusion: .conclusion})' 2>&1)
IN_PROGRESS=$(echo "$CI_STATUS" | jq '[.[] | select(.status != "completed")] | length')
done
if [ "$IN_PROGRESS" -gt 0 ]; then
>&2 echo "⚠️ CI still running after ${MAX_WAIT}s timeout"
log_issue "## CI Still Running\n\nCI checks are still in progress after waiting ${MAX_WAIT} seconds. Please wait for CI to complete before finishing."
fi
fi
# Re-check for failures after waiting
FAILED=$(echo "$CI_STATUS" | jq '[.[] | select(.conclusion == "failure" or .conclusion == "timed_out" or .conclusion == "cancelled")] | length')
if [ "$FAILED" -gt 0 ]; then
>&2 echo "$FAILED check(s) failed!"
# Get details of failed checks
FAILED_DETAILS=$(echo "$CI_STATUS" | jq -r '.[] | select(.conclusion == "failure" or .conclusion == "timed_out" or .conclusion == "cancelled") | "- \(.name): \(.conclusion)"')
>&2 echo "$FAILED_DETAILS"
# Try to get failure logs
FAILED_NAMES=$(echo "$CI_STATUS" | jq -r '.[] | select(.conclusion == "failure") | .name')
FAILURE_MSG="## CI Failed\n\nThe following CI checks failed:\n\n${FAILED_DETAILS}\n"
# Try to get the workflow run logs for more context
WORKFLOW_RUNS=$(gh api "repos/$REPO_INFO/actions/runs?head_sha=$LOCAL_SHA" \
--jq '.workflow_runs[] | select(.conclusion == "failure") | {id: .id, name: .name}' 2>/dev/null)
if [ -n "$WORKFLOW_RUNS" ]; then
FAILURE_MSG="${FAILURE_MSG}\nYou can view the full logs at: https://github.com/$REPO_INFO/actions\n"
# Try to get job logs
FIRST_RUN_ID=$(echo "$WORKFLOW_RUNS" | jq -r '.id' | head -1)
if [ -n "$FIRST_RUN_ID" ]; then
JOBS_OUTPUT=$(gh api "repos/$REPO_INFO/actions/runs/$FIRST_RUN_ID/jobs" \
--jq '.jobs[] | select(.conclusion == "failure") | "### \(.name)\nConclusion: \(.conclusion)\nSteps:\n" + (.steps | map("- \(.name): \(.conclusion)") | join("\n"))' 2>/dev/null | head -100)
if [ -n "$JOBS_OUTPUT" ]; then
FAILURE_MSG="${FAILURE_MSG}\n### Failed Job Details:\n\`\`\`\n${JOBS_OUTPUT}\n\`\`\`"
fi
fi
fi
log_issue "$FAILURE_MSG"
else
>&2 echo "✓ All CI checks passed!"
fi
fi
fi
else
# Fallback to curl
>&2 echo "gh CLI not available, using API directly..."
CI_RESPONSE=$(curl -s -H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
"https://api.github.com/repos/$REPO_INFO/commits/$LOCAL_SHA/check-runs" 2>&1)
TOTAL_CHECKS=$(echo "$CI_RESPONSE" | jq '.total_count // 0')
if [ "$TOTAL_CHECKS" -gt 0 ]; then
IN_PROGRESS=$(echo "$CI_RESPONSE" | jq '[.check_runs[] | select(.status != "completed")] | length')
FAILED=$(echo "$CI_RESPONSE" | jq '[.check_runs[] | select(.conclusion == "failure")] | length')
if [ "$IN_PROGRESS" -gt 0 ]; then
>&2 echo "⏳ CI checks still in progress"
log_issue "## CI In Progress\n\nCI checks are still running. Please wait for CI to complete."
elif [ "$FAILED" -gt 0 ]; then
FAILED_NAMES=$(echo "$CI_RESPONSE" | jq -r '.check_runs[] | select(.conclusion == "failure") | .name')
>&2 echo "❌ CI failed: $FAILED_NAMES"
log_issue "## CI Failed\n\nThe following CI checks failed:\n${FAILED_NAMES}\n\nPlease fix the issues and try again."
else
>&2 echo "✓ All CI checks passed!"
fi
else
>&2 echo "No CI checks found"
fi
fi
fi
fi
fi
fi
>&2 echo ""
# --------------------------
# Final decision
# --------------------------
if [ "$BLOCK_STOP" = true ]; then
>&2 echo "=== BLOCKING STOP: Issues found ==="
# Output JSON to provide feedback to the agent
# Escape the issues for JSON
ESCAPED_ISSUES=$(echo -e "$ISSUES" | jq -Rs .)
echo "{\"decision\": \"deny\", \"reason\": \"Checks failed\", \"additionalContext\": $ESCAPED_ISSUES}"
exit 2
fi
>&2 echo "=== All checks passed, allowing stop ==="
echo '{"decision": "allow"}'
exit 0
-11
View File
@@ -1,11 +0,0 @@
#!/bin/bash
if ! command -v uv &> /dev/null; then
echo "uv is not installed. Installing..."
curl -LsSf https://astral.sh/uv/install.sh | sh
else
echo "uv is already installed."
uv self update # always update to the latest version
fi
make build
+28
View File
@@ -0,0 +1,28 @@
OpenHands is an automated AI software engineer. It is a repo with a Python backend
(in the `openhands` directory) and TypeScript frontend (in the `frontend` directory).
General Setup:
- To set up the entire repo, including frontend and backend, run `make build`
- To run linting and type-checking before finishing the job, run `poetry run pre-commit run --all-files --config ./dev_config/python/.pre-commit-config.yaml`
Backend:
- Located in the `openhands` directory
- Testing:
- All tests are in `tests/unit/test_*.py`
- To test new code, run `poetry run pytest tests/unit/test_xxx.py` where `xxx` is the appropriate file for the current functionality
- Write all tests with pytest
Frontend:
- Located in the `frontend` directory
- Prerequisites: A recent version of NodeJS / NPM
- Setup: Run `npm install` in the frontend directory
- Testing:
- Run tests: `npm run test`
- To run specific tests: `npm run test -- -t "TestName"`
- Building:
- Build for production: `npm run build`
- Environment Variables:
- Set in `frontend/.env` or as environment variables
- Available variables: VITE_BACKEND_HOST, VITE_USE_TLS, VITE_INSECURE_SKIP_VERIFY, VITE_FRONTEND_PORT
- Internationalization:
- Generate i18n declaration file: `npm run make-i18n`
-56
View File
@@ -1,56 +0,0 @@
---
repos:
- repo: https://github.com/jumanjihouse/pre-commit-hook-yamlfmt
rev: 0.2.1 # or other specific tag
hooks:
- id: yamlfmt
- repo: local
hooks:
- id: ruff-format
name: Ruff format
entry: uv
args: [run, ruff, format]
language: system
types: [python]
pass_filenames: true
always_run: false
- id: ruff-check
name: Ruff lint
entry: uv
args: [run, ruff, check, --fix, --exit-non-zero-on-fix]
language: system
types: [python]
pass_filenames: true
always_run: false
- id: pycodestyle
name: PEP8 style check (pycodestyle)
entry: uv
args: [run, pycodestyle, --max-line-length=88, '--ignore=E203,E501,W503,E704']
language: system
types: [python]
pass_filenames: true
always_run: false
- id: pyright
name: Type check with pyright
entry: uv
args: [run, pyright]
language: system
types: [python]
pass_filenames: true
always_run: false
- id: check-import-rules
name: Check import dependency rules
entry: uv
args: [run, python, scripts/check_import_rules.py]
language: system
types: [python]
pass_filenames: true
always_run: false
- id: check-tool-registration
name: Check Tool subclass registration
entry: uv
args: [run, python, scripts/check_tool_registration.py]
language: system
types: [python]
pass_filenames: true
always_run: false
-1
View File
@@ -1 +0,0 @@
3.13
-328
View File
@@ -1,328 +0,0 @@
<ROLE>
You are a collaborative software engineering partner with a strong focus on code quality and simplicity. Your approach is inspired by proven engineering principles from successful open-source projects, emphasizing pragmatic solutions and maintainable code.
# Core Engineering Principles
1. **Simplicity and Clarity**
"The best solutions often come from looking at problems from a different angle, where special cases disappear and become normal cases."
• Prefer solutions that eliminate edge cases rather than adding conditional checks
• Good design patterns emerge from experience and careful consideration
• Simple, clear code is easier to maintain and debug
2. **Backward Compatibility**
"Stability is a feature, not a constraint."
• Changes should not break existing functionality
• Consider the impact on users and existing integrations
• Compatibility enables trust and adoption
3. **Pragmatic Problem-Solving**
"Focus on solving real problems with practical solutions."
• Address actual user needs rather than theoretical edge cases
• Prefer proven, straightforward approaches over complex abstractions
• Code should serve real-world requirements
4. **Maintainable Architecture**
"Keep functions focused and code readable."
• Functions should be short and have a single responsibility
• Avoid deep nesting - consider refactoring when indentation gets complex
• Clear naming and structure reduce cognitive load
# Collaborative Approach
## Communication Style
**Constructive**: Focus on helping improve code and solutions
**Collaborative**: Work together as partners toward better outcomes
**Clear**: Provide specific, actionable feedback
**Respectful**: Maintain a supportive tone while being technically rigorous
## Problem Analysis Process
### 1. Understanding Requirements
When reviewing a requirement, confirm understanding by restating it clearly:
> "Based on your description, I understand you need: [clear restatement of the requirement]. Is this correct?"
### 2. Collaborative Problem Decomposition
#### Data Structure Analysis
"Well-designed data structures often lead to simpler code."
• What are the core data elements and their relationships?
• How does data flow through the system?
• Are there opportunities to simplify data handling?
#### Complexity Assessment
"Let's look for ways to simplify this."
• What's the essential functionality we need to implement?
• Which parts of the current approach add unnecessary complexity?
• How can we make this more straightforward?
#### Compatibility Review
"Let's make sure this doesn't break existing functionality."
• What existing features might be affected?
• How can we implement this change safely?
• What migration path do users need?
#### Practical Validation
"Let's focus on the real-world use case."
• Does this solve an actual problem users face?
• Is the complexity justified by the benefit?
• What's the simplest approach that meets the need?
## 3. Constructive Feedback Format
After analysis, provide feedback in this format:
**Assessment**: [Clear evaluation of the approach]
**Key Observations**:
- Data Structure: [insights about data organization]
- Complexity: [areas where we can simplify]
- Compatibility: [potential impact on existing code]
**Suggested Approach**:
If the solution looks good:
1. Start with the simplest data structure that works
2. Eliminate special cases where possible
3. Implement clearly and directly
4. Ensure backward compatibility
If there are concerns:
"I think we might be able to simplify this. The core issue seems to be [specific problem]. What if we tried [alternative approach]?"
## 4. Code Review Approach
When reviewing code, provide constructive feedback:
**Overall Assessment**: [Helpful evaluation]
**Specific Suggestions**:
- [Concrete improvements with explanations]
- [Alternative approaches to consider]
- [Ways to reduce complexity]
**Next Steps**: [Clear action items]
</ROLE>
## Package-specific guidance
When reviewing or modifying code, read the closest AGENTS file for the
package(s) containing the changed files. If a PR spans multiple packages,
consult each relevant package-level AGENTS.md.
- SDK: [openhands-sdk/openhands/sdk/AGENTS.md](openhands-sdk/openhands/sdk/AGENTS.md)
- Subagents: [openhands-sdk/openhands/sdk/subagent/AGENTS.md](openhands-sdk/openhands/sdk/subagent/AGENTS.md)
- Tools: [openhands-tools/openhands/tools/AGENTS.md](openhands-tools/openhands/tools/AGENTS.md)
- Workspace: [openhands-workspace/openhands/workspace/AGENTS.md](openhands-workspace/openhands/workspace/AGENTS.md)
- Agent server: [openhands-agent-server/AGENTS.md](openhands-agent-server/AGENTS.md)
- Eval config: [.github/run-eval/AGENTS.md](.github/run-eval/AGENTS.md)
## API compatibility pointers
- For SDK Python API deprecation/removal policy, read
[openhands-sdk/openhands/sdk/AGENTS.md](openhands-sdk/openhands/sdk/AGENTS.md).
Public API removals require deprecation before removal, and breaking SDK API
changes require at least a **MINOR** SemVer bump.
- The SDK API breakage checker should treat metadata-only changes to
Pydantic `Field(...)` declarations as non-breaking, including adding,
removing, or editing `description`, `title`, `examples`,
`json_schema_extra`, and `deprecated` kwargs.
- For public REST APIs, read
[openhands-agent-server/AGENTS.md](openhands-agent-server/AGENTS.md).
REST contract breaks need a deprecation notice and a runway of
**5 minor releases** before removing the old contract or making an
incompatible replacement mandatory.
<DEV_SETUP>
- Make sure you `make build` to configure the dependencies first
- We use pre-commit hooks `.pre-commit-config.yaml` that includes:
- type check through pyright
- linting and formatter with `uv ruff`
- NEVER USE `mypy`!
- Do NOT commit ALL the file, just commit the relevant file you've changed!
- In every commit message, you should add "Co-authored-by: openhands <openhands@all-hands.dev>"
- You can run pytest with `uv run pytest`
# Instruction for fixing "E501 Line too long"
- If it is just code, you can modify it so it spans multiple lines.
- If it is a single-line string, you can break it into a multi-line string by doing "ABC" -> ("A"\n"B"\n"C")
- If it is a long multi-line string (e.g., docstring), you should just add type ignore AFTER the ending """. You should NEVER ADD IT INSIDE the docstring.
</DEV_SETUP>
<PR_ARTIFACTS>
# PR-Specific Documents
When working on a PR that requires design documents, scripts meant for development-only, or other temporary artifacts that should NOT be merged to main, store them in a `.pr/` directory at the repository root.
## Usage
```bash
# Create the directory if it doesn't exist
mkdir -p .pr
# Add your PR-specific documents
.pr/
├── design.md # Design decisions and architecture notes
├── analysis.md # Investigation or debugging notes
└── notes.md # Any other PR-specific content
```
## How It Works
1. **Notification**: When `.pr/` exists, a single comment is posted to the PR conversation alerting reviewers
2. **Auto-cleanup**: When the PR is approved, the `.pr/` directory is automatically removed via commit
3. **Fork PRs**: Auto-cleanup cannot push to forks, so manual removal is required before merging
## Important Notes
- Do NOT put anything in `.pr/` that needs to be preserved
- The `.pr/` check passes (green ✅) during development - it only posts a notification, not a blocking error
- For fork PRs: You must manually remove `.pr/` before the PR can be merged
## When to Use
- Complex refactoring that benefits from written design rationale
- Debugging sessions where you want to document your investigation
- Feature implementations that need temporary planning docs
- Temporary script that are intended to show reviewers that the feature works
- Any analysis that helps reviewers understand the PR but isn't needed long-term
</PR_ARTIFACTS>
<REVIEW_HANDLING>
- Critically evaluate each review comment before acting on it. Not all feedback is worth implementing:
- Does it fix a real bug or improve clarity significantly?
- Does it align with the project's engineering principles (simplicity, maintainability)?
- Is the suggested change proportional to the benefit, or does it add unnecessary complexity?
- It's acceptable to respectfully decline suggestions that add verbosity without clear benefit, over-engineer for hypothetical edge cases, or contradict the project's pragmatic approach.
- After addressing (or deciding not to address) inline review comments, mark the corresponding review threads as resolved.
- Before resolving a thread, leave a reply comment that either explains the reason for dismissing the feedback or references the specific commit (e.g., commit SHA) that addressed the issue.
- Prefer resolving threads only once fixes are pushed or a clear decision is documented.
- Use the GitHub GraphQL API to reply to and resolve review threads (see below).
## Resolving Review Threads via GraphQL
The CI check `Review Thread Gate/unresolved-review-threads` will fail if there are unresolved review threads. To resolve threads programmatically:
1. Get the thread IDs (replace `<OWNER>`, `<REPO>`, `<PR_NUMBER>`):
```bash
gh api graphql -f query='
{
repository(owner: "<OWNER>", name: "<REPO>") {
pullRequest(number: <PR_NUMBER>) {
reviewThreads(first: 20) {
nodes {
id
isResolved
comments(first: 1) {
nodes { body }
}
}
}
}
}
}'
```
2. Reply to the thread explaining how the feedback was addressed:
```bash
gh api graphql -f query='
mutation {
addPullRequestReviewThreadReply(input: {
pullRequestReviewThreadId: "<THREAD_ID>"
body: "Fixed in <COMMIT_SHA>"
}) {
comment { id }
}
}'
```
3. Resolve the thread:
```bash
gh api graphql -f query='
mutation {
resolveReviewThread(input: {threadId: "<THREAD_ID>"}) {
thread { isResolved }
}
}'
```
4. Get the failed workflow run ID and rerun it:
```bash
# Find the run ID from the failed check URL, or use:
gh run list --repo <OWNER>/<REPO> --branch <BRANCH> --limit 5
# Rerun failed jobs
gh run rerun <RUN_ID> --repo <OWNER>/<REPO> --failed
```
</REVIEW_HANDLING>
<CODE>
- Avoid hacky trick like `sys.path.insert` when resolving package dependency
- Use existing packages/libraries instead of implementing yourselves whenever possible.
- Avoid using # type: ignore. Treat it only as a last resort. In most cases, issues should be resolved by improving type annotations, adding assertions, or adjusting code/tests—rather than silencing the type checker.
- Please AVOID using # type: ignore[attr-defined] unless absolutely necessary. If the issue can be addressed by adding a few extra assert statements to verify types, prefer that approach instead!
- For issue like # type: ignore[call-arg]: if you discover that the argument doesnt actually exist, do not try to mock it again in tests. Instead, simply remove it.
- Avoid doing in-line imports unless absolutely necessary (e.g., circular dependency).
- Avoid getattr/hasattr guards and instead enforce type correctness by relying on explicit type assertions and proper object usage, ensuring functions only receive the expected Pydantic models or typed inputs. Prefer type hints and validated models over runtime shape checks.
- Prefer accessing typed attributes directly. If necessary, convert inputs up front into a canonical shape; avoid purely hypothetical fallbacks.
- Use real newlines in commit messages; do not write literal "\n".
</CODE>
<TESTING>
- AFTER you edit ONE file, you should run pre-commit hook on that file via `uv run pre-commit run --files [filepath]` to make sure you didn't break it.
- Don't write TOO MUCH test, you should write just enough to cover edge cases.
- Check how we perform tests in .github/workflows/tests.yml
- Put unit tests under the corresponding domain folder in `tests/` (e.g., `tests/sdk`, `tests/tools`, `tests/workspace`). For example, changes to `openhands-sdk/openhands/sdk/tool/tool.py` should be covered in `tests/sdk/tool/test_tool.py`.
- DON'T write TEST CLASSES unless absolutely necessary!
- If you find yourself duplicating logics in preparing mocks, loading data etc, these logic should be fixtures in conftest.py!
- Please test only the logic implemented in the current codebase. Do not test functionality (e.g., BaseModel.model_dumps()) that is not implemented in this repository.
- For changes to prompt templates, tool descriptions, or agent decision logic, add the `integration-test` label to trigger integration tests and verify no unexpected impact on benchmark performance.
# Behavior Tests
Behavior tests (prefix `b##_*`) in `tests/integration/tests/` are designed to verify that agents exhibit desired behaviors in realistic scenarios. These tests are distinct from functional tests (prefix `t##_*`) and have specific requirements.
Before adding or modifying behavior tests, review `tests/integration/BEHAVIOR_TESTS.md` for the latest workflow, expectations, and examples.
</TESTING>
<AGENT_TMP_DIRECTORY>
# Agent Temporary Directory Convention
When tools need to store observation files (e.g., browser session recordings, task tracker data), use `.agent_tmp` as the directory name for consistency.
The browser session recording tool saves recordings to `.agent_tmp/observations/recording-{timestamp}/`.
This convention ensures tool-generated observation files are stored in a predictable location that can be easily:
- Added to `.gitignore`
- Cleaned up after agent sessions
- Identified as agent-generated artifacts
Note: This is separate from `persistence_dir` which is used for conversation state persistence.
</AGENT_TMP_DIRECTORY>
<REPO>
<PROJECT_STRUCTURE>
- This is a `uv`-managed Python monorepo (single `uv.lock` at repo root) with multiple distributable packages: `openhands-sdk/` (SDK), `openhands-tools/` (built-in tools), `openhands-workspace/` (workspace impls), and `openhands-agent-server/` (server runtime).
- `examples/` contains runnable patterns; `tests/` is split by domain (`tests/sdk`, `tests/tools`, `tests/workspace`, `tests/agent_server`, etc.).
- Python namespace is `openhands.*` across packages; keep new modules within the matching package and mirror test paths under `tests/`.
</PROJECT_STRUCTURE>
<QUICK_COMMANDS>
- Set up the dev environment: `make build` (runs `uv sync --dev` and installs pre-commit; requires uv >= 0.8.13)
- Lint/format: `make lint`, `make format`
- Run tests: `uv run pytest`
- Build agent-server: `make build-server` (output: `dist/agent-server/`)
- Clean caches: `make clean`
- Run SDK examples: see [openhands-sdk/openhands/sdk/AGENTS.md](openhands-sdk/openhands/sdk/AGENTS.md).
- The example workflow runs `uv run pytest tests/examples/test_examples.py --run-examples`; each successful example must print an `EXAMPLE_COST: ...` line to stdout (use `EXAMPLE_COST: 0` for non-LLM examples).
- Conversation plugins passed via `plugins=[...]` are lazy-loaded on the first `send_message()` or `run()`, so example code should inspect plugin-added skills or `resolved_plugins` only after that first interaction.
</QUICK_COMMANDS>
<REPO_CONFIG_NOTES>
- Ruff: `line-length = 88`, `target-version = "py312"` (see `pyproject.toml`).
- Ruff ignores `ARG` (unused arguments) under `tests/**/*.py` to allow pytest fixtures.
- Repository guidance lives in the project root AGENTS.md (loaded as a third-party skill file).
</REPO_CONFIG_NOTES>
</REPO>
+133
View File
@@ -0,0 +1,133 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
contact@all-hands.dev
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
+74 -50
View File
@@ -1,70 +1,94 @@
# Contributing
Thank you for helping improve the OpenHands Software Agent SDK.
Thanks for your interest in contributing to OpenHands! We welcome and appreciate contributions.
This repo is a foundation. We want the SDK to stay stable and extensible so that many
applications can build on it safely.
## Understanding OpenHands's CodeBase
Downstream applications we actively keep in mind:
- [OpenHands-CLI](https://github.com/OpenHands/OpenHands-CLI) (client)
- [OpenHands app-server](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/README.md) (client)
- [OpenHands Enterprise](https://github.com/OpenHands/OpenHands/blob/main/enterprise/README.md) (client)
To understand the codebase, please refer to the README in each module:
- [frontend](./frontend/README.md)
- [evaluation](./evaluation/README.md)
- [openhands](./openhands/README.md)
- [agenthub](./openhands/agenthub/README.md)
- [server](./openhands/server/README.md)
The SDK itself has a Python interface. In addition, the
[agent-server](https://docs.openhands.dev/sdk/guides/agent-server/overview) is the
REST/WebSocket server component that exposes the SDK for remote execution and integrations.
Changes should keep both interfaces stable and consistent.
## Setting up your development environment
## A lesson we learned (why we care about architecture)
We have a separate doc [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md) that tells you how to set up a development workflow.
In earlier iterations, we repeatedly ran into a failure mode: needs from downstream applications
(or assumptions) would leak into core logic.
## How can I contribute?
That kind of coupling can feel convenient in the moment, but it tends to create subtle
breakage elsewhere: different environments, different workspaces, different execution modes,
and different evaluation setups.
There are many ways that you can contribute:
The architecture of OpenHands V0 was too monolithic to support multiple applications built into it,
as CLI, evaluation scripts, web server were, and built on it, as OpenHands Cloud was.
1. **Download and use** OpenHands, and send [issues](https://github.com/All-Hands-AI/OpenHands/issues) when you encounter something that isn't working or a feature that you'd like to see.
2. **Send feedback** after each session by [clicking the thumbs-up thumbs-down buttons](https://docs.all-hands.dev/modules/usage/feedback), so we can see where things are working and failing, and also build an open dataset for training code agents.
3. **Improve the Codebase** by sending PRs (see details below). In particular, we have some [good first issues](https://github.com/All-Hands-AI/OpenHands/labels/good%20first%20issue) that may be ones to start on.
If youre interested in the deeper background and lessons learned, see our write-up:
[OpenHands: An Open Platform for AI Software Developers as Generalist Agents](https://arxiv.org/abs/2511.03690)
## What can I build?
Here are a few ways you can help improve the codebase.
This SDK exists (as a separate, rebuilt foundation) to avoid that failure mode.
#### UI/UX
We're always looking to improve the look and feel of the application. If you've got a small fix
for something that's bugging you, feel free to open up a PR that changes the `./frontend` directory.
## Principles we review PRs with
If you're looking to make a bigger change, add a new UI element, or significantly alter the style
of the application, please open an issue first, or better, join the #frontend channel in our Slack
to gather consensus from our design team first.
We welcome all contributions, big or small, to improve or extend the software agent SDK.
#### Improving the agent
Our main agent is the CodeAct agent. You can [see its prompts here](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/agenthub/codeact_agent)
You may find that occasionally we are opinionated about several things:
Changes to these prompts, and to the underlying behavior in Python, can have a huge impact on user experience.
You can try modifying the prompts to see how they change the behavior of the agent as you use the app
locally, but we will need to do an end-to-end evaluation of any changes here to ensure that the agent
is getting better over time.
- **OpenHands SDK is its own thing**: its downstream are client applications.
- **Prefer interfaces over special cases**: if a client needs something, add or improve a
clean, reusable interface/extension point instead of adding a shortcut.
- **Extensibility over one-off patches**: design features so multiple clients can adopt them
without rewriting core logic.
- **Avoid hidden assumptions**: dont rely on particular env vars, workspace layouts, request
contexts, or runtime quirks that only exist in one app.
- Workspaces *do* encode environment specifics (local/Docker/remote), but keep those assumptions
explicit (params + validation) and contained to the `workspace` layer.
- **No client-specific code paths**: avoid logic that only makes sense for one
downstream app.
- Its fine to have multiple workspace implementations; its not fine for SDK core behavior to
branch on whether the caller is CLI/app-server/SaaS. Prefer capabilities/config over app-identity.
- **Keep the agent loop stable**: treat stability as a feature; be cautious with control-flow
changes and "small" behavior tweaks.
- **Compatibility is part of the API**: if something could break downstream clients, call it
out explicitly and consider a migration path. We have a deprecation mechanism you may want to use.
We use the [SWE-bench](https://www.swebench.com/) benchmark to test our agent. You can join the #evaluation
channel in Slack to learn more.
If youre not sure whether a change crosses these lines, please ask early. Were happy to help think
through the shape of a clean interface.
#### Adding a new agent
You may want to experiment with building new types of agents. You can add an agent to `openhands/agenthub`
to help expand the capabilities of OpenHands.
## Practical pointers
#### Adding a new runtime
The agent needs a place to run code and commands. When you run OpenHands on your laptop, it uses a Docker container
to do this by default. But there are other ways of creating a sandbox for the agent.
This file is mostly about principles. For the mechanics, please see:
- [AGENTS.md](AGENTS.md) for AI agents
- [DEVELOPMENT.md](DEVELOPMENT.md) for humans
If you work for a company that provides a cloud-based runtime, you could help us add support for that runtime
by implementing the [interface specified here](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/runtime.py).
## Questions / discussion
#### Testing
When you write code, it is also good to write tests. Please navigate to the `tests` folder to see existing test suites.
At the moment, we have two kinds of tests: `unit` and `integration`. Please refer to the README for each test suite. These tests also run on GitHub's continuous integration to ensure quality of the project.
Join us on Slack: https://openhands.dev/joinslack
## Sending Pull Requests to OpenHands
You'll need to fork our repository to send us a Pull Request. You can learn more
about how to fork a GitHub repo and open a PR with your changes in [this article](https://medium.com/swlh/forks-and-pull-requests-how-to-contribute-to-github-repos-8843fac34ce8)
### Pull Request title
As described [here](https://github.com/commitizen/conventional-commit-types/blob/master/index.json), a valid PR title should begin with one of the following prefixes:
- `feat`: A new feature
- `fix`: A bug fix
- `docs`: Documentation only changes
- `style`: Changes that do not affect the meaning of the code (white space, formatting, missing semicolons, etc.)
- `refactor`: A code change that neither fixes a bug nor adds a feature
- `perf`: A code change that improves performance
- `test`: Adding missing tests or correcting existing tests
- `build`: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
- `ci`: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
- `chore`: Other changes that don't modify src or test files
- `revert`: Reverts a previous commit
For example, a PR title could be:
- `refactor: modify package path`
- `feat(frontend): xxxx`, where `(frontend)` means that this PR mainly focuses on the frontend component.
You may also check out previous PRs in the [PR list](https://github.com/All-Hands-AI/OpenHands/pulls).
### Pull Request description
- If your PR is small (such as a typo fix), you can go brief.
- If it contains a lot of changes, it's better to write more details.
If your changes are user-facing (e.g. a new feature in the UI, a change in behavior, or a bugfix)
please include a short message that we can add to our changelog.
+312
View File
@@ -0,0 +1,312 @@
# Credits
## Contributors
We would like to thank all the [contributors](https://github.com/All-Hands-AI/OpenHands/graphs/contributors) who have helped make OpenHands possible. We greatly appreciate your dedication and hard work.
## Open Source Projects
OpenHands includes and adapts the following open source projects. We are grateful for their contributions to the open source community:
#### [SWE Agent](https://github.com/princeton-nlp/swe-agent)
- License: MIT License
- Description: Adapted for use in OpenHands's agent hub
#### [Aider](https://github.com/paul-gauthier/aider)
- License: Apache License 2.0
- Description: AI pair programming tool. OpenHands has adapted and integrated its linter module for code-related tasks in [`agentskills utilities`](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/agent_skills/utils/aider)
#### [BrowserGym](https://github.com/ServiceNow/BrowserGym)
- License: Apache License 2.0
- Description: Adapted in implementing the browsing agent
### Reference Implementations for Evaluation Benchmarks
OpenHands integrates code of the reference implementations for the following agent evaluation benchmarks:
#### [HumanEval](https://github.com/openai/human-eval)
- License: MIT License
#### [DSP](https://github.com/microsoft/DataScienceProblems)
- License: MIT License
#### [HumanEvalPack](https://github.com/bigcode-project/bigcode-evaluation-harness)
- License: Apache License 2.0
#### [AgentBench](https://github.com/THUDM/AgentBench)
- License: Apache License 2.0
#### [SWE-Bench](https://github.com/princeton-nlp/SWE-bench)
- License: MIT License
#### [BIRD](https://bird-bench.github.io/)
- License: MIT License
- Dataset: CC-BY-SA 4.0
#### [Gorilla APIBench](https://github.com/ShishirPatil/gorilla)
- License: Apache License 2.0
#### [GPQA](https://github.com/idavidrein/gpqa)
- License: MIT License
#### [ProntoQA](https://github.com/asaparov/prontoqa)
- License: Apache License 2.0
## Open Source licenses
### MIT License
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
### BSD 3-Clause License
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
### Apache License 2.0
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
### Non-Open Source Reference Implementations:
#### [MultiPL-E](https://github.com/nuprl/MultiPL-E)
- License: BSD 3-Clause License with Machine Learning Restriction
BSD 3-Clause License with Machine Learning Restriction
Copyright (c) 2022, Northeastern University, Oberlin College, Roblox Inc,
Stevens Institute of Technology, University of Massachusetts Amherst, and
Wellesley College.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
4. The contents of this repository may not be used as training data for any
machine learning model, including but not limited to neural networks.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-48
View File
@@ -1,48 +0,0 @@
# Development Guide
## Setup
```bash
git clone https://github.com/OpenHands/agent-sdk.git
cd agent-sdk
make build
```
## Code Quality
```bash
make format # Format code
make lint # Lint code
uv run pre-commit run --all-files # Run all checks
```
Pre-commit hooks run automatically on commit with type checking and linting.
## Testing
```bash
uv run pytest # All tests
uv run pytest tests/sdk/ # SDK tests only
uv run pytest tests/tools/ # Tools tests only
```
## Project Structure
```
agent-sdk/
├── openhands-sdk/ # Core SDK package
├── openhands-tools/ # Built-in tools
├── openhands-workspace/ # Workspace management
├── openhands-agent-server/ # Agent server
├── examples/ # Usage examples
└── tests/ # Test suites
```
## Contributing
1. Create a new branch
2. Make your changes
3. Run tests and checks
4. Push and create a pull request
For questions, join our [Slack community](https://openhands.dev/joinslack).
+128
View File
@@ -0,0 +1,128 @@
# Development Guide
This guide is for people working on OpenHands and editing the source code.
If you wish to contribute your changes, check out the [CONTRIBUTING.md](https://github.com/All-Hands-AI/OpenHands/blob/main/CONTRIBUTING.md) on how to clone and setup the project initially before moving on.
Otherwise, you can clone the OpenHands project directly.
## Start the server for development
### 1. Requirements
* Linux, Mac OS, or [WSL on Windows](https://learn.microsoft.com/en-us/windows/wsl/install) [Ubuntu <= 22.04]
* [Docker](https://docs.docker.com/engine/install/) (For those on MacOS, make sure to allow the default Docker socket to be used from advanced settings!)
* [Python](https://www.python.org/downloads/) = 3.12
* [NodeJS](https://nodejs.org/en/download/package-manager) >= 18.17.1
* [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) >= 1.8
* OS-specific dependencies:
- Ubuntu: build-essential => `sudo apt-get install build-essential`
- WSL: netcat => `sudo apt-get install netcat`
Make sure you have all these dependencies installed before moving on to `make build`.
#### Develop without sudo access
If you want to develop without system admin/sudo access to upgrade/install `Python` and/or `NodeJs`, you can use `conda` or `mamba` to manage the packages for you:
```bash
# Download and install Mamba (a faster version of conda)
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
# Install Python 3.12, nodejs, and poetry
mamba install python=3.12
mamba install conda-forge::nodejs
mamba install conda-forge::poetry
```
### 2. Build and Setup The Environment
Begin by building the project which includes setting up the environment and installing dependencies. This step ensures that OpenHands is ready to run on your system:
```bash
make build
```
### 3. Configuring the Language Model
OpenHands supports a diverse array of Language Models (LMs) through the powerful [litellm](https://docs.litellm.ai) library. By default, we've chosen the mighty GPT-4 from OpenAI as our go-to model, but the world is your oyster! You can unleash the potential of Anthropic's suave Claude, the enigmatic Llama, or any other LM that piques your interest.
To configure the LM of your choice, run:
```bash
make setup-config
```
This command will prompt you to enter the LLM API key, model name, and other variables ensuring that OpenHands is tailored to your specific needs. Note that the model name will apply only when you run headless. If you use the UI, please set the model in the UI.
Note: If you have previously run OpenHands using the docker command, you may have already set some environmental variables in your terminal. The final configurations are set from highest to lowest priority:
Environment variables > config.toml variables > default variables
**Note on Alternative Models:**
Some alternative models may prove more challenging to tame than others. Fear not, brave adventurer! We shall soon unveil LLM-specific documentation to guide you on your quest.
And if you've already mastered the art of wielding a model other than OpenAI's GPT, we encourage you to share your setup instructions with us by creating instructions and adding it [to our documentation](https://github.com/All-Hands-AI/OpenHands/tree/main/docs/modules/usage/llms).
For a full list of the LM providers and models available, please consult the [litellm documentation](https://docs.litellm.ai/docs/providers).
### 4. Running the application
#### Option A: Run the Full Application
Once the setup is complete, launching OpenHands is as simple as running a single command. This command starts both the backend and frontend servers seamlessly, allowing you to interact with OpenHands:
```bash
make run
```
#### Option B: Individual Server Startup
- **Start the Backend Server:** If you prefer, you can start the backend server independently to focus on backend-related tasks or configurations.
```bash
make start-backend
```
- **Start the Frontend Server:** Similarly, you can start the frontend server on its own to work on frontend-related components or interface enhancements.
```bash
make start-frontend
```
### 6. LLM Debugging
If you encounter any issues with the Language Model (LM) or you're simply curious, you can inspect the actual LLM prompts and responses. To do so, export DEBUG=1 in the environment and restart the backend.
OpenHands will then log the prompts and responses in the logs/llm/CURRENT_DATE directory, allowing you to identify the causes.
### 7. Help
Need assistance or information on available targets and commands? The help command provides all the necessary guidance to ensure a smooth experience with OpenHands.
```bash
make help
```
### 8. Testing
To run tests, refer to the following:
#### Unit tests
```bash
poetry run pytest ./tests/unit/test_*.py
```
### 9. Add or update dependency
1. Add your dependency in `pyproject.toml` or use `poetry add xxx`
2. Update the poetry.lock file via `poetry lock --no-update`
### 9. Use existing Docker image
To reduce build time (e.g., if no changes were made to the client-runtime component), you can use an existing Docker container image. Follow these steps:
1. Set the SANDBOX_RUNTIME_CONTAINER_IMAGE environment variable to the desired Docker image.
2. Example: export SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.13-nikolaik
## Develop inside Docker container
TL;DR
```bash
make docker-dev
```
See more details [here](./containers/dev/README.md)
If you are just interested in running `OpenHands` without installing all the required tools on your host.
```bash
make docker-run
```
If you do not have `make` on your host, run:
```bash
cd ./containers/dev
./dev.sh
```
You do need [Docker](https://docs.docker.com/engine/install/) installed on your host though.
+25
View File
@@ -0,0 +1,25 @@
# Issue Triage
These are the procedures and guidelines on how issues are triaged in this repo by the maintainers.
## General
* Most issues must be tagged with **enhancement** or **bug**
* Issues may be tagged with what it relates to (**backend**, **frontend**, **agent quality**, etc.)
## Severity
* **Low**: Minor issues, single user report
* **Medium**: Affecting multiple users
* **Critical**: Affecting all users or potential security issues
## Effort
* Issues may be estimated with effort required (**small effort**, **medium effort**, **large effort**)
## Difficulty
* Issues with low implementation difficulty may be tagged with **good first issue**
## Not Enough Information
* User is asked to provide more information (logs, how to reproduce, etc.) when the issue is not clear
* If an issue is unclear and the author does not provide more information or respond to a request, the issue may be closed as **not planned** (Usually after a week)
## Multiple Requests/Fixes in One Issue
* These issues will be narrowed down to one request/fix so the issue is more easily tracked and fixed
* Issues may be broken down into multiple issues if required
+21 -17
View File
@@ -1,21 +1,25 @@
MIT License
The MIT License (MIT)
=====================
Copyright (c) 2026 OpenHands contributors
Copyright © 2023
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the Software”), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
-11
View File
@@ -1,11 +0,0 @@
# Repository Maintainers
#
# Format: Each maintainer on a new line starting with "- @username"
# This file is read by .github/workflows/assign-reviews.yml for automated triage
#
The following people are maintainers of this repository and are responsible for triage and review:
- @xingyaoww
- @neubig
- @enyst
+4 -47
View File
@@ -1,48 +1,5 @@
# This MANIFEST.in file tells setuptools which files to include
# in the sdist package distribution used for building docker image
# Exclude all Python bytecode files
global-exclude *.pyc
# ==============================================================================
# Root-level workspace files
# ==============================================================================
include pyproject.toml
include uv.lock
# ==============================================================================
# openhands-sdk
# ==============================================================================
include openhands-sdk/pyproject.toml
recursive-include openhands-sdk *.py
recursive-include openhands-sdk *.j2
recursive-include openhands-sdk py.typed
# ==============================================================================
# openhands-tools
# ==============================================================================
include openhands-tools/pyproject.toml
recursive-include openhands-tools *.py
recursive-include openhands-tools *.j2
recursive-include openhands-tools py.typed
# ==============================================================================
# openhands-workspace
# ==============================================================================
include openhands-workspace/pyproject.toml
recursive-include openhands-workspace *.py
recursive-include openhands-workspace py.typed
# ==============================================================================
# openhands-agent-server
# ==============================================================================
include openhands-agent-server/pyproject.toml
recursive-include openhands-agent-server *.py
recursive-include openhands-agent-server py.typed
# Docker build files
include openhands-agent-server/openhands/agent_server/docker/Dockerfile
include openhands-agent-server/openhands/agent_server/docker/wallpaper.svg
# PyInstaller spec
include openhands-agent-server/openhands/agent_server/agent-server.spec
# VSCode extensions
recursive-include openhands-agent-server/openhands/agent_server/vscode_extensions *
# Exclude Python cache directories
global-exclude __pycache__
+325 -102
View File
@@ -1,109 +1,332 @@
SHELL := /usr/bin/env bash
.SHELLFLAGS := -eu -o pipefail -c
SHELL=/bin/bash
# Makefile for OpenHands project
# Colors for output
ECHO := printf '%b\n'
GREEN := \033[32m
YELLOW := \033[33m
RED := \033[31m
CYAN := \033[36m
RESET := \033[0m
UNDERLINE := \033[4m
# Variables
BACKEND_HOST ?= "127.0.0.1"
BACKEND_PORT = 3000
BACKEND_HOST_PORT = "$(BACKEND_HOST):$(BACKEND_PORT)"
FRONTEND_PORT = 3001
DEFAULT_WORKSPACE_DIR = "./workspace"
DEFAULT_MODEL = "gpt-4o"
CONFIG_FILE = config.toml
PRE_COMMIT_CONFIG_PATH = "./dev_config/python/.pre-commit-config.yaml"
PYTHON_VERSION = 3.12
# Required uv version
REQUIRED_UV_VERSION := 0.8.13
PKGS ?= openhands-sdk openhands-tools openhands-workspace openhands-agent-server
# ANSI color codes
GREEN=$(shell tput -Txterm setaf 2)
YELLOW=$(shell tput -Txterm setaf 3)
RED=$(shell tput -Txterm setaf 1)
BLUE=$(shell tput -Txterm setaf 6)
RESET=$(shell tput -Txterm sgr0)
.PHONY: build format lint clean help check-uv-version
# Build
build:
@echo "$(GREEN)Building project...$(RESET)"
@$(MAKE) -s check-dependencies
@$(MAKE) -s install-python-dependencies
@$(MAKE) -s install-frontend-dependencies
@$(MAKE) -s install-pre-commit-hooks
@$(MAKE) -s build-frontend
@echo "$(GREEN)Build completed successfully.$(RESET)"
# Default target
.DEFAULT_GOAL := help
check-dependencies:
@echo "$(YELLOW)Checking dependencies...$(RESET)"
@$(MAKE) -s check-system
@$(MAKE) -s check-python
@$(MAKE) -s check-npm
@$(MAKE) -s check-nodejs
ifeq ($(INSTALL_DOCKER),)
@$(MAKE) -s check-docker
endif
@$(MAKE) -s check-poetry
@echo "$(GREEN)Dependencies checked successfully.$(RESET)"
check-uv-version:
@$(ECHO) "$(YELLOW)Checking uv version...$(RESET)"
@UV_VERSION=$$(uv --version | cut -d' ' -f2); \
REQUIRED_VERSION=$(REQUIRED_UV_VERSION); \
if [ "$$(printf '%s\n' "$$REQUIRED_VERSION" "$$UV_VERSION" | sort -V | head -n1)" != "$$REQUIRED_VERSION" ]; then \
$(ECHO) "$(RED)Error: uv version $$UV_VERSION is less than required $$REQUIRED_VERSION$(RESET)"; \
$(ECHO) "$(YELLOW)Please update uv with: uv self update$(RESET)"; \
exit 1; \
fi; \
$(ECHO) "$(GREEN)uv version $$UV_VERSION meets requirements$(RESET)"
build: check-uv-version
@$(ECHO) "$(CYAN)Setting up OpenHands V1 development environment...$(RESET)"
@$(ECHO) "$(YELLOW)Installing dependencies with uv sync --dev...$(RESET)"
@uv sync --dev
@$(ECHO) "$(GREEN)Dependencies installed successfully.$(RESET)"
@$(ECHO) "$(YELLOW)Setting up pre-commit hooks...$(RESET)"
@uv run pre-commit install
@$(ECHO) "$(GREEN)Pre-commit hooks installed successfully.$(RESET)"
@$(ECHO) "$(GREEN)Build complete! Development environment is ready.$(RESET)"
format:
@$(ECHO) "$(YELLOW)Formatting code with uv format...$(RESET)"
@uv run ruff format
@$(ECHO) "$(GREEN)Code formatted successfully.$(RESET)"
lint:
@$(ECHO) "$(YELLOW)Linting code with ruff...$(RESET)"
@uv run ruff check --fix
@$(ECHO) "$(GREEN)Linting completed.$(RESET)"
pre-commit:
@$(ECHO) "$(YELLOW)Run pre-commit...$(RESET)"
uv run pre-commit run --all-files
@$(ECHO) "$(GREEN)Pre-commit run successfully.$(RESET)"
clean:
@$(ECHO) "$(YELLOW)Cleaning up cache files...$(RESET)"
@find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
@find . -type f -name "*.pyc" -delete 2>/dev/null || true
@rm -rf .pytest_cache .ruff_cache .mypy_cache 2>/dev/null || true
@$(ECHO) "$(GREEN)Cache files cleaned.$(RESET)"
# Show help
help:
@$(ECHO) "$(CYAN)OpenHands V1 Makefile$(RESET)"
@$(ECHO) ""
@$(ECHO) "$(UNDERLINE)Usage:$(RESET) make <COMMAND>"
@$(ECHO) ""
@$(ECHO) "$(UNDERLINE)Commands:$(RESET)"
@$(ECHO) " $(GREEN)build$(RESET) Setup development environment (install deps + hooks)"
@$(ECHO) " $(GREEN)build-server$(RESET) Build agent-server executable"
@$(ECHO) " $(GREEN)test-server-schema$(RESET) Test server schema"
@$(ECHO) " $(GREEN)format$(RESET) Format code with uv format"
@$(ECHO) " $(GREEN)lint$(RESET) Lint code with ruff"
@$(ECHO) " $(GREEN)pre-commit$(RESET) Run the pre-commit"
@$(ECHO) " $(GREEN)clean$(RESET) Clean up cache files"
@$(ECHO) " $(GREEN)help$(RESET) Show this help message"
build-server: check-uv-version
@$(ECHO) "$(CYAN)Building agent-server executable...$(RESET)"
@uv run pyinstaller openhands-agent-server/openhands/agent_server/agent-server.spec
@$(ECHO) "$(GREEN)Build complete! Executable is in dist/agent-server/$(RESET)"
test-server-schema: check-uv-version
set -euo pipefail;
# Generate OpenAPI JSON inline (no file left in repo)
uv run python -c 'import os,json; from openhands.agent_server.api import api; open("openapi.json","w").write(json.dumps(api.openapi(), indent=2))'
npx --yes @apidevtools/swagger-cli@^4 validate openapi.json
# Clean up temp schema
rm -f openapi.json
rm -rf .client
.PHONY: set-package-version
set-package-version: check-uv-version
@if [ -z "$(version)" ]; then \
$(ECHO) "$(RED)Error: missing version. Use: make set-package-version version=1.2.3$(RESET)"; \
check-system:
@echo "$(YELLOW)Checking system...$(RESET)"
@if [ "$(shell uname)" = "Darwin" ]; then \
echo "$(BLUE)macOS detected.$(RESET)"; \
elif [ "$(shell uname)" = "Linux" ]; then \
if [ -f "/etc/manjaro-release" ]; then \
echo "$(BLUE)Manjaro Linux detected.$(RESET)"; \
else \
echo "$(BLUE)Linux detected.$(RESET)"; \
fi; \
elif [ "$$(uname -r | grep -i microsoft)" ]; then \
echo "$(BLUE)Windows Subsystem for Linux detected.$(RESET)"; \
else \
echo "$(RED)Unsupported system detected. Please use macOS, Linux, or Windows Subsystem for Linux (WSL).$(RESET)"; \
exit 1; \
fi
@$(ECHO) "$(CYAN)Setting version to $(version) for: $(PKGS)$(RESET)"
@for PKG in $(PKGS); do \
$(ECHO) "$(YELLOW)bumping $$PKG -> $(version)$(RESET)"; \
uv version --package $$PKG $(version); \
done
@$(ECHO) "$(GREEN)Version updated in all selected packages.$(RESET)"
check-python:
@echo "$(YELLOW)Checking Python installation...$(RESET)"
@if command -v python$(PYTHON_VERSION) > /dev/null; then \
echo "$(BLUE)$(shell python$(PYTHON_VERSION) --version) is already installed.$(RESET)"; \
else \
echo "$(RED)Python $(PYTHON_VERSION) is not installed. Please install Python $(PYTHON_VERSION) to continue.$(RESET)"; \
exit 1; \
fi
check-npm:
@echo "$(YELLOW)Checking npm installation...$(RESET)"
@if command -v npm > /dev/null; then \
echo "$(BLUE)npm $(shell npm --version) is already installed.$(RESET)"; \
else \
echo "$(RED)npm is not installed. Please install Node.js to continue.$(RESET)"; \
exit 1; \
fi
check-nodejs:
@echo "$(YELLOW)Checking Node.js installation...$(RESET)"
@if command -v node > /dev/null; then \
NODE_VERSION=$(shell node --version | sed -E 's/v//g'); \
IFS='.' read -r -a NODE_VERSION_ARRAY <<< "$$NODE_VERSION"; \
if [ "$${NODE_VERSION_ARRAY[0]}" -gt 18 ] || ([ "$${NODE_VERSION_ARRAY[0]}" -eq 18 ] && [ "$${NODE_VERSION_ARRAY[1]}" -gt 17 ]) || ([ "$${NODE_VERSION_ARRAY[0]}" -eq 18 ] && [ "$${NODE_VERSION_ARRAY[1]}" -eq 17 ] && [ "$${NODE_VERSION_ARRAY[2]}" -ge 1 ]); then \
echo "$(BLUE)Node.js $$NODE_VERSION is already installed.$(RESET)"; \
else \
echo "$(RED)Node.js 18.17.1 or later is required. Please install Node.js 18.17.1 or later to continue.$(RESET)"; \
exit 1; \
fi; \
else \
echo "$(RED)Node.js is not installed. Please install Node.js to continue.$(RESET)"; \
exit 1; \
fi
check-docker:
@echo "$(YELLOW)Checking Docker installation...$(RESET)"
@if command -v docker > /dev/null; then \
echo "$(BLUE)$(shell docker --version) is already installed.$(RESET)"; \
else \
echo "$(RED)Docker is not installed. Please install Docker to continue.$(RESET)"; \
exit 1; \
fi
check-poetry:
@echo "$(YELLOW)Checking Poetry installation...$(RESET)"
@if command -v poetry > /dev/null; then \
POETRY_VERSION=$(shell poetry --version 2>&1 | sed -E 's/Poetry \(version ([0-9]+\.[0-9]+\.[0-9]+)\)/\1/'); \
IFS='.' read -r -a POETRY_VERSION_ARRAY <<< "$$POETRY_VERSION"; \
if [ $${POETRY_VERSION_ARRAY[0]} -ge 1 ] && [ $${POETRY_VERSION_ARRAY[1]} -ge 8 ]; then \
echo "$(BLUE)$(shell poetry --version) is already installed.$(RESET)"; \
else \
echo "$(RED)Poetry 1.8 or later is required. You can install poetry by running the following command, then adding Poetry to your PATH:"; \
echo "$(RED) curl -sSL https://install.python-poetry.org | python$(PYTHON_VERSION) -$(RESET)"; \
echo "$(RED)More detail here: https://python-poetry.org/docs/#installing-with-the-official-installer$(RESET)"; \
exit 1; \
fi; \
else \
echo "$(RED)Poetry is not installed. You can install poetry by running the following command, then adding Poetry to your PATH:"; \
echo "$(RED) curl -sSL https://install.python-poetry.org | python$(PYTHON_VERSION) -$(RESET)"; \
echo "$(RED)More detail here: https://python-poetry.org/docs/#installing-with-the-official-installer$(RESET)"; \
exit 1; \
fi
install-python-dependencies:
@echo "$(GREEN)Installing Python dependencies...$(RESET)"
@if [ -z "${TZ}" ]; then \
echo "Defaulting TZ (timezone) to UTC"; \
export TZ="UTC"; \
fi
poetry env use python$(PYTHON_VERSION)
@if [ "$(shell uname)" = "Darwin" ]; then \
echo "$(BLUE)Installing chroma-hnswlib...$(RESET)"; \
export HNSWLIB_NO_NATIVE=1; \
poetry run pip install chroma-hnswlib; \
fi
@poetry install --without llama-index
@if [ -f "/etc/manjaro-release" ]; then \
echo "$(BLUE)Detected Manjaro Linux. Installing Playwright dependencies...$(RESET)"; \
poetry run pip install playwright; \
poetry run playwright install chromium; \
else \
if [ ! -f cache/playwright_chromium_is_installed.txt ]; then \
echo "Running playwright install --with-deps chromium..."; \
poetry run playwright install --with-deps chromium; \
mkdir -p cache; \
touch cache/playwright_chromium_is_installed.txt; \
else \
echo "Setup already done. Skipping playwright installation."; \
fi \
fi
@echo "$(GREEN)Python dependencies installed successfully.$(RESET)"
install-frontend-dependencies:
@echo "$(YELLOW)Setting up frontend environment...$(RESET)"
@echo "$(YELLOW)Detect Node.js version...$(RESET)"
@cd frontend && node ./scripts/detect-node-version.js
echo "$(BLUE)Installing frontend dependencies with npm...$(RESET)"
@cd frontend && npm install
@echo "$(GREEN)Frontend dependencies installed successfully.$(RESET)"
install-pre-commit-hooks:
@echo "$(YELLOW)Installing pre-commit hooks...$(RESET)"
@git config --unset-all core.hooksPath || true
@poetry run pre-commit install --config $(PRE_COMMIT_CONFIG_PATH)
@echo "$(GREEN)Pre-commit hooks installed successfully.$(RESET)"
lint-backend:
@echo "$(YELLOW)Running linters...$(RESET)"
@poetry run pre-commit run --files openhands/**/* agenthub/**/* evaluation/**/* --show-diff-on-failure --config $(PRE_COMMIT_CONFIG_PATH)
lint-frontend:
@echo "$(YELLOW)Running linters for frontend...$(RESET)"
@cd frontend && npm run lint
lint:
@$(MAKE) -s lint-frontend
@$(MAKE) -s lint-backend
test-frontend:
@echo "$(YELLOW)Running tests for frontend...$(RESET)"
@cd frontend && npm run test
test:
@$(MAKE) -s test-frontend
build-frontend:
@echo "$(YELLOW)Building frontend...$(RESET)"
@cd frontend && npm run build
# Start backend
start-backend:
@echo "$(YELLOW)Starting backend...$(RESET)"
@poetry run uvicorn openhands.server.listen:app --host $(BACKEND_HOST) --port $(BACKEND_PORT) --reload --reload-exclude "$(shell pwd)/workspace"
# Start frontend
start-frontend:
@echo "$(YELLOW)Starting frontend...$(RESET)"
@cd frontend && VITE_BACKEND_HOST=$(BACKEND_HOST_PORT) VITE_FRONTEND_PORT=$(FRONTEND_PORT) npm run dev -- --port $(FRONTEND_PORT) --host $(BACKEND_HOST)
# Common setup for running the app (non-callable)
_run_setup:
@if [ "$(OS)" = "Windows_NT" ]; then \
echo "$(RED) Windows is not supported, use WSL instead!$(RESET)"; \
exit 1; \
fi
@mkdir -p logs
@echo "$(YELLOW)Starting backend server...$(RESET)"
@poetry run uvicorn openhands.server.listen:app --host $(BACKEND_HOST) --port $(BACKEND_PORT) &
@echo "$(YELLOW)Waiting for the backend to start...$(RESET)"
@until nc -z localhost $(BACKEND_PORT); do sleep 0.1; done
@echo "$(GREEN)Backend started successfully.$(RESET)"
# Run the app (standard mode)
run:
@echo "$(YELLOW)Running the app...$(RESET)"
@$(MAKE) -s _run_setup
@$(MAKE) -s start-frontend
@echo "$(GREEN)Application started successfully.$(RESET)"
# Run the app (in docker)
docker-run: WORKSPACE_BASE ?= $(PWD)/workspace
docker-run:
@if [ -f /.dockerenv ]; then \
echo "Running inside a Docker container. Exiting..."; \
exit 0; \
else \
echo "$(YELLOW)Running the app in Docker $(OPTIONS)...$(RESET)"; \
export WORKSPACE_BASE=${WORKSPACE_BASE}; \
export SANDBOX_USER_ID=$(shell id -u); \
export DATE=$(shell date +%Y%m%d%H%M%S); \
docker compose up $(OPTIONS); \
fi
# Run the app (WSL mode)
run-wsl:
@echo "$(YELLOW)Running the app in WSL mode...$(RESET)"
@$(MAKE) -s _run_setup
@cd frontend && echo "$(BLUE)Starting frontend with npm (WSL mode)...$(RESET)" && npm run dev_wsl -- --port $(FRONTEND_PORT)
@echo "$(GREEN)Application started successfully in WSL mode.$(RESET)"
# Setup config.toml
setup-config:
@echo "$(YELLOW)Setting up config.toml...$(RESET)"
@$(MAKE) setup-config-prompts
@mv $(CONFIG_FILE).tmp $(CONFIG_FILE)
@echo "$(GREEN)Config.toml setup completed.$(RESET)"
setup-config-prompts:
@echo "[core]" > $(CONFIG_FILE).tmp
@read -p "Enter your workspace directory (as absolute path) [default: $(DEFAULT_WORKSPACE_DIR)]: " workspace_dir; \
workspace_dir=$${workspace_dir:-$(DEFAULT_WORKSPACE_DIR)}; \
echo "workspace_base=\"$$workspace_dir\"" >> $(CONFIG_FILE).tmp
@echo "" >> $(CONFIG_FILE).tmp
@echo "[llm]" >> $(CONFIG_FILE).tmp
@read -p "Enter your LLM model name, used for running without UI. Set the model in the UI after you start the app. (see https://docs.litellm.ai/docs/providers for full list) [default: $(DEFAULT_MODEL)]: " llm_model; \
llm_model=$${llm_model:-$(DEFAULT_MODEL)}; \
echo "model=\"$$llm_model\"" >> $(CONFIG_FILE).tmp
@read -p "Enter your LLM api key: " llm_api_key; \
echo "api_key=\"$$llm_api_key\"" >> $(CONFIG_FILE).tmp
@read -p "Enter your LLM base URL [mostly used for local LLMs, leave blank if not needed - example: http://localhost:5001/v1/]: " llm_base_url; \
if [[ ! -z "$$llm_base_url" ]]; then echo "base_url=\"$$llm_base_url\"" >> $(CONFIG_FILE).tmp; fi
@echo "Enter your LLM Embedding Model"; \
echo "Choices are:"; \
echo " - openai"; \
echo " - azureopenai"; \
echo " - Embeddings available only with OllamaEmbedding:"; \
echo " - llama2"; \
echo " - mxbai-embed-large"; \
echo " - nomic-embed-text"; \
echo " - all-minilm"; \
echo " - stable-code"; \
echo " - bge-m3"; \
echo " - bge-large"; \
echo " - paraphrase-multilingual"; \
echo " - snowflake-arctic-embed"; \
echo " - Leave blank to default to 'BAAI/bge-small-en-v1.5' via huggingface"; \
read -p "> " llm_embedding_model; \
echo "embedding_model=\"$$llm_embedding_model\"" >> $(CONFIG_FILE).tmp; \
if [ "$$llm_embedding_model" = "llama2" ] || [ "$$llm_embedding_model" = "mxbai-embed-large" ] || [ "$$llm_embedding_model" = "nomic-embed-text" ] || [ "$$llm_embedding_model" = "all-minilm" ] || [ "$$llm_embedding_model" = "stable-code" ]; then \
read -p "Enter the local model URL for the embedding model (will set llm.embedding_base_url): " llm_embedding_base_url; \
echo "embedding_base_url=\"$$llm_embedding_base_url\"" >> $(CONFIG_FILE).tmp; \
elif [ "$$llm_embedding_model" = "azureopenai" ]; then \
read -p "Enter the Azure endpoint URL (will overwrite llm.base_url): " llm_base_url; \
echo "base_url=\"$$llm_base_url\"" >> $(CONFIG_FILE).tmp; \
read -p "Enter the Azure LLM Embedding Deployment Name: " llm_embedding_deployment_name; \
echo "embedding_deployment_name=\"$$llm_embedding_deployment_name\"" >> $(CONFIG_FILE).tmp; \
read -p "Enter the Azure API Version: " llm_api_version; \
echo "api_version=\"$$llm_api_version\"" >> $(CONFIG_FILE).tmp; \
fi
# Develop in container
docker-dev:
@if [ -f /.dockerenv ]; then \
echo "Running inside a Docker container. Exiting..."; \
exit 0; \
else \
echo "$(YELLOW)Build and run in Docker $(OPTIONS)...$(RESET)"; \
./containers/dev/dev.sh $(OPTIONS); \
fi
# Clean up all caches
clean:
@echo "$(YELLOW)Cleaning up caches...$(RESET)"
@rm -rf openhands/.cache
@echo "$(GREEN)Caches cleaned up successfully.$(RESET)"
# Help
help:
@echo "$(BLUE)Usage: make [target]$(RESET)"
@echo "Targets:"
@echo " $(GREEN)build$(RESET) - Build project, including environment setup and dependencies."
@echo " $(GREEN)lint$(RESET) - Run linters on the project."
@echo " $(GREEN)setup-config$(RESET) - Setup the configuration for OpenHands by providing LLM API key,"
@echo " LLM Model name, and workspace directory."
@echo " $(GREEN)start-backend$(RESET) - Start the backend server for the OpenHands project."
@echo " $(GREEN)start-frontend$(RESET) - Start the frontend server for the OpenHands project."
@echo " $(GREEN)run$(RESET) - Run the OpenHands application, starting both backend and frontend servers."
@echo " Backend Log file will be stored in the 'logs' directory."
@echo " $(GREEN)docker-dev$(RESET) - Build and run the OpenHands application in Docker."
@echo " $(GREEN)docker-run$(RESET) - Run the OpenHands application, starting both backend and frontend servers in Docker."
@echo " $(GREEN)help$(RESET) - Display this help message, providing information on available targets."
# Phony targets
.PHONY: build check-dependencies check-python check-npm check-docker check-poetry install-python-dependencies install-frontend-dependencies install-pre-commit-hooks lint start-backend start-frontend run run-wsl setup-config setup-config-prompts help
.PHONY: docker-dev docker-run
+93 -87
View File
@@ -1,124 +1,130 @@
<a name="readme-top"></a>
<div align="center">
<img src="https://raw.githubusercontent.com/OpenHands/docs/main/openhands/static/img/logo.png" alt="Logo" width="200">
<h1 align="center">OpenHands Software Agent SDK </h1>
<img src="./docs/static/img/logo.png" alt="Logo" width="200">
<h1 align="center">OpenHands: Code Less, Make More</h1>
</div>
<div align="center">
<a href="https://github.com/OpenHands/software-agent-sdk/blob/main/LICENSE"><img src="https://img.shields.io/github/license/OpenHands/software-agent-sdk?style=for-the-badge&color=blue" alt="MIT License"></a>
<a href="https://openhands.dev/joinslack"><img src="https://img.shields.io/badge/Slack-Join%20Us-red?logo=slack&logoColor=white&style=for-the-badge" alt="Join our Slack community"></a>
<br>
<a href="https://docs.openhands.dev/sdk"><img src="https://img.shields.io/badge/Documentation-000?logo=googledocs&logoColor=FFE165&style=for-the-badge" alt="Check out the documentation"></a>
<a href="https://arxiv.org/abs/2511.03690"><img src="https://img.shields.io/badge/Paper-000?logoColor=FFE165&logo=arxiv&style=for-the-badge" alt="Tech Report"></a>
<a href="https://docs.google.com/spreadsheets/d/1wOUdFCMyY6Nt0AIqF705KN4JKOWgeI4wUGUP60krXXs/edit?gid=811504672#gid=811504672"><img src="https://img.shields.io/badge/SWEBench-77.6-000?logoColor=FFE165&style=for-the-badge" alt="Benchmark Score"></a>
<br>
<!-- Keep these links. Translations will automatically update with the README. -->
<a href="https://www.readme-i18n.com/OpenHands/software-agent-sdk?lang=de">Deutsch</a> |
<a href="https://www.readme-i18n.com/OpenHands/software-agent-sdk?lang=es">Español</a> |
<a href="https://www.readme-i18n.com/OpenHands/software-agent-sdk?lang=fr">français</a> |
<a href="https://www.readme-i18n.com/OpenHands/software-agent-sdk?lang=ja">日本語</a> |
<a href="https://www.readme-i18n.com/OpenHands/software-agent-sdk?lang=ko">한국어</a> |
<a href="https://www.readme-i18n.com/OpenHands/software-agent-sdk?lang=pt">Português</a> |
<a href="https://www.readme-i18n.com/OpenHands/software-agent-sdk?lang=ru">Русский</a> |
<a href="https://www.readme-i18n.com/OpenHands/software-agent-sdk?lang=zh">中文</a>
<a href="https://github.com/All-Hands-AI/OpenHands/graphs/contributors"><img src="https://img.shields.io/github/contributors/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="Contributors"></a>
<a href="https://github.com/All-Hands-AI/OpenHands/stargazers"><img src="https://img.shields.io/github/stars/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="Stargazers"></a>
<a href="https://codecov.io/github/All-Hands-AI/OpenHands?branch=main"><img alt="CodeCov" src="https://img.shields.io/codecov/c/github/All-Hands-AI/OpenHands?style=for-the-badge&color=blue"></a>
<a href="https://github.com/All-Hands-AI/OpenHands/blob/main/LICENSE"><img src="https://img.shields.io/github/license/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="MIT License"></a>
<br/>
<a href="https://join.slack.com/t/openhands-ai/shared_invite/zt-2tom0er4l-JeNUGHt_AxpEfIBstbLPiw"><img src="https://img.shields.io/badge/Slack-Join%20Us-red?logo=slack&logoColor=white&style=for-the-badge" alt="Join our Slack community"></a>
<a href="https://discord.gg/ESHStjSjD4"><img src="https://img.shields.io/badge/Discord-Join%20Us-purple?logo=discord&logoColor=white&style=for-the-badge" alt="Join our Discord community"></a>
<a href="https://github.com/All-Hands-AI/OpenHands/blob/main/CREDITS.md"><img src="https://img.shields.io/badge/Project-Credits-blue?style=for-the-badge&color=FFE165&logo=github&logoColor=white" alt="Credits"></a>
<br/>
<a href="https://docs.all-hands.dev/modules/usage/getting-started"><img src="https://img.shields.io/badge/Documentation-000?logo=googledocs&logoColor=FFE165&style=for-the-badge" alt="Check out the documentation"></a>
<a href="https://arxiv.org/abs/2407.16741"><img src="https://img.shields.io/badge/Paper%20on%20Arxiv-000?logoColor=FFE165&logo=arxiv&style=for-the-badge" alt="Paper on Arxiv"></a>
<a href="https://huggingface.co/spaces/OpenHands/evaluation"><img src="https://img.shields.io/badge/Benchmark%20score-000?logoColor=FFE165&logo=huggingface&style=for-the-badge" alt="Evaluation Benchmark Score"></a>
<hr>
</div>
The OpenHands Software Agent SDK is a set of Python and REST APIs for **building agents that work with code**.
Welcome to OpenHands (formerly OpenDevin), a platform for software development agents powered by AI.
You can use the OpenHands Software Agent SDK for:
* One-off tasks, like building a README for your repo
* Routine maintenance tasks, like updating dependencies
* Major tasks that involve multiple agents, like refactors and rewrites
OpenHands agents can do anything a human developer can: modify code, run commands, browse the web,
call APIs, and yes—even copy code snippets from StackOverflow.
Importantly, agents can either use the local machine as their workspace, or run inside ephemeral workspaces
(e.g. in Docker or Kubernetes) using the Agent Server.
Learn more at [docs.all-hands.dev](https://docs.all-hands.dev), or jump to the [Quick Start](#-quick-start).
You can even use the SDK to build new developer experiences: its the engine behind the
[OpenHands CLI](https://github.com/OpenHands/OpenHands-CLI) and [OpenHands Cloud](https://github.com/OpenHands/OpenHands).
![App screenshot](./docs/static/img/screenshot.png)
Get started with some [examples](https://docs.openhands.dev/sdk/guides/hello-world) or [check out the docs](https://docs.openhands.dev/sdk) to learn more.
## ⚡ Quick Start
## Quick Start
The easiest way to run OpenHands is in Docker.
See the [Installation](https://docs.all-hands.dev/modules/usage/installation) guide for
system requirements and more information.
Here's what building with the SDK looks like:
```bash
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.13-nikolaik
```python
import os
from openhands.sdk import LLM, Agent, Conversation, Tool
from openhands.tools.file_editor import FileEditorTool
from openhands.tools.task_tracker import TaskTrackerTool
from openhands.tools.terminal import TerminalTool
llm = LLM(
model="anthropic/claude-sonnet-4-5-20250929",
api_key=os.getenv("LLM_API_KEY"),
)
agent = Agent(
llm=llm,
tools=[
Tool(name=TerminalTool.name),
Tool(name=FileEditorTool.name),
Tool(name=TaskTrackerTool.name),
],
)
cwd = os.getcwd()
conversation = Conversation(agent=agent, workspace=cwd)
conversation.send_message("Write 3 facts about the current project into FACTS.txt.")
conversation.run()
print("All done!")
docker run -it --pull=always \
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.13-nikolaik \
-v /var/run/docker.sock:/var/run/docker.sock \
-p 3000:3000 \
-e LOG_ALL_EVENTS=true \
--add-host host.docker.internal:host-gateway \
--name openhands-app \
docker.all-hands.dev/all-hands-ai/openhands:0.13
```
For installation instructions and detailed setup, see the [Getting Started Guide](https://docs.openhands.dev/sdk/getting-started).
You'll find OpenHands running at [http://localhost:3000](http://localhost:3000)!
## Documentation
Finally, you'll need a model provider and API key.
[Anthropic's Claude 3.5 Sonnet](https://www.anthropic.com/api) (`anthropic/claude-3-5-sonnet-20241022`)
works best, but you have [many options](https://docs.all-hands.dev/modules/usage/llms).
For detailed documentation, tutorials, and API reference, visit:
---
**[https://docs.openhands.dev/sdk](https://docs.openhands.dev/sdk)**
You can also [connect OpenHands to your local filesystem](https://docs.all-hands.dev/modules/usage/runtimes),
run OpenHands in a scriptable [headless mode](https://docs.all-hands.dev/modules/usage/how-to/headless-mode),
interact with it via a [friendly CLI](https://docs.all-hands.dev/modules/usage/how-to/cli-mode),
or run it on tagged issues with [a github action](https://github.com/All-Hands-AI/OpenHands-resolver).
The documentation includes:
- [Getting Started Guide](https://docs.openhands.dev/sdk/getting-started) - Installation and setup
- [Architecture & Core Concepts](https://docs.openhands.dev/sdk/arch/overview) - Agents, tools, workspaces, and more
- [Guides](https://docs.openhands.dev/sdk/guides/hello-world) - Hello World, custom tools, MCP, skills, and more
- [API Reference](https://docs.openhands.dev/sdk/guides/agent-server/api-reference/server-details/alive) - Agent Server REST API documentation
Visit [Installation](https://docs.all-hands.dev/modules/usage/installation) for more information and setup instructions.
## Examples
If you want to modify the OpenHands source code, check out [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md).
The `examples/` directory contains comprehensive usage examples:
Having issues? The [Troubleshooting Guide](https://docs.all-hands.dev/modules/usage/troubleshooting) can help.
- **Standalone SDK** (`examples/01_standalone_sdk/`) - Basic agent usage, custom tools, and microagents
- **Remote Agent Server** (`examples/02_remote_agent_server/`) - Client-server architecture and WebSocket connections
- **GitHub Workflows** (`examples/03_github_workflows/`) - CI/CD integration and automated workflows
## 📖 Documentation
## Contributing
To learn more about the project, and for tips on using OpenHands,
**check out our [documentation](https://docs.all-hands.dev/modules/usage/getting-started)**.
For development setup, testing, and contribution guidelines, see [DEVELOPMENT.md](DEVELOPMENT.md).
There you'll find resources on how to use different LLM providers,
troubleshooting resources, and advanced configuration options.
## Community
## 🤝 How to Contribute
- [Join Slack](https://openhands.dev/joinslack) - Connect with the OpenHands community
- [GitHub Repository](https://github.com/OpenHands/software-agent-sdk) - Source code and issues
- [Documentation](https://docs.openhands.dev/sdk) - Complete documentation
OpenHands is a community-driven project, and we welcome contributions from everyone.
Whether you're a developer, a researcher, or simply enthusiastic about advancing the field of
software engineering with AI, there are many ways to get involved:
## Cite
- **Code Contributions:** Help us develop new agents, core functionality, the frontend and other interfaces, or sandboxing solutions.
- **Research and Evaluation:** Contribute to our understanding of LLMs in software engineering, participate in evaluating the models, or suggest improvements.
- **Feedback and Testing:** Use the OpenHands toolset, report bugs, suggest features, or provide feedback on usability.
For details, please check [CONTRIBUTING.md](./CONTRIBUTING.md).
## 🤖 Join Our Community
Whether you're a developer, a researcher, or simply enthusiastic about OpenHands, we'd love to have you in our community.
Let's make software engineering better together!
- [Slack workspace](https://join.slack.com/t/openhands-ai/shared_invite/zt-2tom0er4l-JeNUGHt_AxpEfIBstbLPiw) - Here we talk about research, architecture, and future development.
- [Discord server](https://discord.gg/ESHStjSjD4) - This is a community-run server for general discussion, questions, and feedback.
## 📈 Progress
<p align="center">
<a href="https://star-history.com/#All-Hands-AI/OpenHands&Date">
<img src="https://api.star-history.com/svg?repos=All-Hands-AI/OpenHands&type=Date" width="500" alt="Star History Chart">
</a>
</p>
## 📜 License
Distributed under the MIT License. See [`LICENSE`](./LICENSE) for more information.
## 🙏 Acknowledgements
OpenHands is built by a large number of contributors, and every contribution is greatly appreciated! We also build upon other open source projects, and we are deeply thankful for their work.
For a list of open source projects and licenses used in OpenHands, please see our [CREDITS.md](./CREDITS.md) file.
## 📚 Cite
```
@misc{wang2025openhandssoftwareagentsdk,
title={The OpenHands Software Agent SDK: A Composable and Extensible Foundation for Production Agents},
author={Xingyao Wang and Simon Rosenberg and Juan Michelini and Calvin Smith and Hoang Tran and Engel Nyst and Rohit Malhotra and Xuhui Zhou and Valerie Chen and Robert Brennan and Graham Neubig},
year={2025},
eprint={2511.03690},
@misc{openhands,
title={{OpenHands: An Open Platform for AI Software Developers as Generalist Agents}},
author={Xingyao Wang and Boxuan Li and Yufan Song and Frank F. Xu and Xiangru Tang and Mingchen Zhuge and Jiayi Pan and Yueqi Song and Bowen Li and Jaskirat Singh and Hoang H. Tran and Fuqiang Li and Ren Ma and Mingzhang Zheng and Bill Qian and Yanjun Shao and Niklas Muennighoff and Yizhe Zhang and Binyuan Hui and Junyang Lin and Robert Brennan and Hao Peng and Heng Ji and Graham Neubig},
year={2024},
eprint={2407.16741},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2511.03690},
url={https://arxiv.org/abs/2407.16741},
}
```
Executable
+5
View File
@@ -0,0 +1,5 @@
#!/bin/bash
set -e
cp pyproject.toml poetry.lock openhands
poetry build -v
+22
View File
@@ -0,0 +1,22 @@
#
services:
openhands:
build:
context: ./
dockerfile: ./containers/app/Dockerfile
image: openhands:latest
container_name: openhands-app-${DATE:-}
environment:
- SANDBOX_RUNTIME_CONTAINER_IMAGE=${SANDBOX_RUNTIME_CONTAINER_IMAGE:-ghcr.io/all-hands-ai/runtime:0.13-nikolaik}
- SANDBOX_USER_ID=${SANDBOX_USER_ID:-1234}
- WORKSPACE_MOUNT_PATH=${WORKSPACE_BASE:-$PWD/workspace}
ports:
- "3000:3000"
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- ${WORKSPACE_BASE:-$PWD/workspace}:/opt/workspace_base
pull_policy: build
stdin_open: true
tty: true
+249
View File
@@ -0,0 +1,249 @@
###################### OpenHands Configuration Example ######################
#
# All settings have default values, so you only need to uncomment and
# modify what you want to change
# The fields within each section are sorted in alphabetical order.
#
##############################################################################
#################################### Core ####################################
# General core configurations
##############################################################################
[core]
# API key for E2B
#e2b_api_key = ""
# API key for Modal
#modal_api_token_id = ""
#modal_api_token_secret = ""
# Base path for the workspace
workspace_base = "./workspace"
# Cache directory path
#cache_dir = "/tmp/cache"
# Debugging enabled
#debug = false
# Disable color in terminal output
#disable_color = false
# Enable saving and restoring the session when run from CLI
#enable_cli_session = false
# Path to store trajectories, can be a folder or a file
# If it's a folder, the session id will be used as the file name
#trajectories_path="./trajectories"
# File store path
#file_store_path = "/tmp/file_store"
# File store type
#file_store = "memory"
# List of allowed file extensions for uploads
#file_uploads_allowed_extensions = [".*"]
# Maximum file size for uploads, in megabytes
#file_uploads_max_file_size_mb = 0
# Maximum budget per task, 0.0 means no limit
#max_budget_per_task = 0.0
# Maximum number of iterations
#max_iterations = 100
# Path to mount the workspace in the sandbox
#workspace_mount_path_in_sandbox = "/workspace"
# Path to mount the workspace
#workspace_mount_path = ""
# Path to rewrite the workspace mount path to
#workspace_mount_rewrite = ""
# Run as openhands
#run_as_openhands = true
# Runtime environment
#runtime = "eventstream"
# Name of the default agent
#default_agent = "CodeActAgent"
# JWT secret for authentication
#jwt_secret = ""
# Restrict file types for file uploads
#file_uploads_restrict_file_types = false
# List of allowed file extensions for uploads
#file_uploads_allowed_extensions = [".*"]
#################################### LLM #####################################
# Configuration for LLM models (group name starts with 'llm')
# use 'llm' for the default LLM config
##############################################################################
[llm]
# AWS access key ID
#aws_access_key_id = ""
# AWS region name
#aws_region_name = ""
# AWS secret access key
#aws_secret_access_key = ""
# API key to use
api_key = "your-api-key"
# API base URL
#base_url = ""
# API version
#api_version = ""
# Cost per input token
#input_cost_per_token = 0.0
# Cost per output token
#output_cost_per_token = 0.0
# Custom LLM provider
#custom_llm_provider = ""
# Embedding API base URL
#embedding_base_url = ""
# Embedding deployment name
#embedding_deployment_name = ""
# Embedding model to use
embedding_model = "local"
# Maximum number of characters in an observation's content
#max_message_chars = 10000
# Maximum number of input tokens
#max_input_tokens = 0
# Maximum number of output tokens
#max_output_tokens = 0
# Model to use
model = "gpt-4o"
# Number of retries to attempt when an operation fails with the LLM.
# Increase this value to allow more attempts before giving up
#num_retries = 8
# Maximum wait time (in seconds) between retry attempts
# This caps the exponential backoff to prevent excessively long
#retry_max_wait = 120
# Minimum wait time (in seconds) between retry attempts
# This sets the initial delay before the first retry
#retry_min_wait = 15
# Multiplier for exponential backoff calculation
# The wait time increases by this factor after each failed attempt
# A value of 2.0 means each retry waits twice as long as the previous one
#retry_multiplier = 2.0
# Drop any unmapped (unsupported) params without causing an exception
#drop_params = false
# Using the prompt caching feature if provided by the LLM and supported
#caching_prompt = true
# Base URL for the OLLAMA API
#ollama_base_url = ""
# Temperature for the API
#temperature = 0.0
# Timeout for the API
#timeout = 0
# Top p for the API
#top_p = 1.0
# If model is vision capable, this option allows to disable image processing (useful for cost reduction).
#disable_vision = true
[llm.gpt4o-mini]
api_key = "your-api-key"
model = "gpt-4o"
#################################### Agent ###################################
# Configuration for agents (group name starts with 'agent')
# Use 'agent' for the default agent config
# otherwise, group name must be `agent.<agent_name>` (case-sensitive), e.g.
# agent.CodeActAgent
##############################################################################
[agent]
# Name of the micro agent to use for this agent
#micro_agent_name = ""
# Memory enabled
#memory_enabled = false
# Memory maximum threads
#memory_max_threads = 3
# LLM config group to use
#llm_config = 'your-llm-config-group'
[agent.RepoExplorerAgent]
# Example: use a cheaper model for RepoExplorerAgent to reduce cost, especially
# useful when an agent doesn't demand high quality but uses a lot of tokens
llm_config = 'gpt3'
#################################### Sandbox ###################################
# Configuration for the sandbox
##############################################################################
[sandbox]
# Sandbox timeout in seconds
#timeout = 120
# Sandbox user ID
#user_id = 1000
# Container image to use for the sandbox
#base_container_image = "nikolaik/python-nodejs:python3.12-nodejs22"
# Use host network
#use_host_network = false
# Enable auto linting after editing
#enable_auto_lint = false
# Whether to initialize plugins
#initialize_plugins = true
# Extra dependencies to install in the runtime image
#runtime_extra_deps = ""
# Environment variables to set at the launch of the runtime
#runtime_startup_env_vars = {}
# BrowserGym environment to use for evaluation
#browsergym_eval_env = ""
#################################### Security ###################################
# Configuration for security features
##############################################################################
[security]
# Enable confirmation mode
#confirmation_mode = false
# The security analyzer to use
#security_analyzer = ""
#################################### Eval ####################################
# Configuration for the evaluation, please refer to the specific evaluation
# plugin for the available options
##############################################################################
+12
View File
@@ -0,0 +1,12 @@
# Docker Containers
Each folder here contains a Dockerfile, and a config.sh describing how to build
the images and where to push them. These images are built and pushed in GitHub Actions
by the `ghcr.yml` workflow.
## Building Manually
```bash
docker build -f containers/app/Dockerfile -t openhands .
docker build -f containers/sandbox/Dockerfile -t sandbox .
```
+94
View File
@@ -0,0 +1,94 @@
ARG OPENHANDS_BUILD_VERSION=dev
FROM node:21.7.2-bookworm-slim AS frontend-builder
WORKDIR /app
COPY ./frontend/package.json frontend/package-lock.json ./
RUN npm install -g npm@10.5.1
RUN npm ci
COPY ./frontend ./
RUN npm run build
FROM python:3.12.3-slim AS backend-builder
WORKDIR /app
ENV PYTHONPATH='/app'
ENV POETRY_NO_INTERACTION=1 \
POETRY_VIRTUALENVS_IN_PROJECT=1 \
POETRY_VIRTUALENVS_CREATE=1 \
POETRY_CACHE_DIR=/tmp/poetry_cache
RUN apt-get update -y \
&& apt-get install -y curl make git build-essential \
&& python3 -m pip install poetry==1.8.2 --break-system-packages
COPY ./pyproject.toml ./poetry.lock ./
RUN touch README.md
RUN export POETRY_CACHE_DIR && poetry install --without evaluation,llama-index --no-root && rm -rf $POETRY_CACHE_DIR
FROM python:3.12.3-slim AS openhands-app
WORKDIR /app
ARG OPENHANDS_BUILD_VERSION #re-declare for this section
ENV RUN_AS_OPENHANDS=true
# A random number--we need this to be different from the user's UID on the host machine
ENV OPENHANDS_USER_ID=42420
ENV SANDBOX_LOCAL_RUNTIME_URL=http://host.docker.internal
ENV USE_HOST_NETWORK=false
ENV WORKSPACE_BASE=/opt/workspace_base
ENV OPENHANDS_BUILD_VERSION=$OPENHANDS_BUILD_VERSION
ENV SANDBOX_USER_ID=0
RUN mkdir -p $WORKSPACE_BASE
RUN apt-get update -y \
&& apt-get install -y curl ssh sudo
# Default is 1000, but OSX is often 501
RUN sed -i 's/^UID_MIN.*/UID_MIN 499/' /etc/login.defs
# Default is 60000, but we've seen up to 200000
RUN sed -i 's/^UID_MAX.*/UID_MAX 1000000/' /etc/login.defs
RUN groupadd app
RUN useradd -l -m -u $OPENHANDS_USER_ID -s /bin/bash openhands && \
usermod -aG app openhands && \
usermod -aG sudo openhands && \
echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
RUN chown -R openhands:app /app && chmod -R 770 /app
RUN sudo chown -R openhands:app $WORKSPACE_BASE && sudo chmod -R 770 $WORKSPACE_BASE
USER openhands
ENV VIRTUAL_ENV=/app/.venv \
PATH="/app/.venv/bin:$PATH" \
PYTHONPATH='/app'
COPY --chown=openhands:app --chmod=770 --from=backend-builder ${VIRTUAL_ENV} ${VIRTUAL_ENV}
RUN playwright install --with-deps chromium
COPY --chown=openhands:app --chmod=770 ./openhands ./openhands
COPY --chown=openhands:app --chmod=777 ./openhands/runtime/plugins ./openhands/runtime/plugins
COPY --chown=openhands:app --chmod=770 ./openhands/agenthub ./openhands/agenthub
COPY --chown=openhands:app ./pyproject.toml ./pyproject.toml
COPY --chown=openhands:app ./poetry.lock ./poetry.lock
COPY --chown=openhands:app ./README.md ./README.md
COPY --chown=openhands:app ./MANIFEST.in ./MANIFEST.in
COPY --chown=openhands:app ./LICENSE ./LICENSE
# This is run as "openhands" user, and will create __pycache__ with openhands:openhands ownership
RUN python openhands/core/download.py # No-op to download assets
# Add this line to set group ownership of all files/directories not already in "app" group
# openhands:openhands -> openhands:app
RUN find /app \! -group app -exec chgrp app {} +
COPY --chown=openhands:app --chmod=770 --from=frontend-builder /app/build ./frontend/build
COPY --chown=openhands:app --chmod=770 ./containers/app/entrypoint.sh /app/entrypoint.sh
USER root
WORKDIR /app
ENTRYPOINT ["/app/entrypoint.sh"]
CMD ["uvicorn", "openhands.server.listen:app", "--host", "0.0.0.0", "--port", "3000"]
+4
View File
@@ -0,0 +1,4 @@
DOCKER_REGISTRY=ghcr.io
DOCKER_ORG=all-hands-ai
DOCKER_IMAGE=openhands
DOCKER_BASE_DIR="."
+69
View File
@@ -0,0 +1,69 @@
#!/bin/bash
set -eo pipefail
echo "Starting OpenHands..."
if [[ $NO_SETUP == "true" ]]; then
echo "Skipping setup, running as $(whoami)"
"$@"
exit 0
fi
if [ "$(id -u)" -ne 0 ]; then
echo "The OpenHands entrypoint.sh must run as root"
exit 1
fi
if [ -z "$SANDBOX_USER_ID" ]; then
echo "SANDBOX_USER_ID is not set"
exit 1
fi
if [ -z "$WORKSPACE_MOUNT_PATH" ]; then
# This is set to /opt/workspace in the Dockerfile. But if the user isn't mounting, we want to unset it so that OpenHands doesn't mount at all
unset WORKSPACE_BASE
fi
if [[ "$SANDBOX_USER_ID" -eq 0 ]]; then
echo "Running OpenHands as root"
export RUN_AS_OPENHANDS=false
mkdir -p /root/.cache/ms-playwright/
if [ -d "/home/openhands/.cache/ms-playwright/" ]; then
mv /home/openhands/.cache/ms-playwright/ /root/.cache/
fi
"$@"
else
echo "Setting up enduser with id $SANDBOX_USER_ID"
if id "enduser" &>/dev/null; then
echo "User enduser already exists. Skipping creation."
else
if ! useradd -l -m -u $SANDBOX_USER_ID -s /bin/bash enduser; then
echo "Failed to create user enduser with id $SANDBOX_USER_ID. Moving openhands user."
incremented_id=$(($SANDBOX_USER_ID + 1))
usermod -u $incremented_id openhands
if ! useradd -l -m -u $SANDBOX_USER_ID -s /bin/bash enduser; then
echo "Failed to create user enduser with id $SANDBOX_USER_ID for a second time. Exiting."
exit 1
fi
fi
fi
usermod -aG app enduser
# get the user group of /var/run/docker.sock and set openhands to that group
DOCKER_SOCKET_GID=$(stat -c '%g' /var/run/docker.sock)
echo "Docker socket group id: $DOCKER_SOCKET_GID"
if getent group $DOCKER_SOCKET_GID; then
echo "Group with id $DOCKER_SOCKET_GID already exists"
else
echo "Creating group with id $DOCKER_SOCKET_GID"
groupadd -g $DOCKER_SOCKET_GID docker
fi
mkdir -p /home/enduser/.cache/huggingface/hub/
mkdir -p /home/enduser/.cache/ms-playwright/
if [ -d "/home/openhands/.cache/ms-playwright/" ]; then
mv /home/openhands/.cache/ms-playwright/ /home/enduser/.cache/
fi
usermod -aG $DOCKER_SOCKET_GID enduser
echo "Running as enduser"
su enduser /bin/bash -c "${*@Q}" # This magically runs any arguments passed to the script as a command
fi
+156
View File
@@ -0,0 +1,156 @@
#!/bin/bash
set -eo pipefail
# Initialize variables with default values
image_name=""
org_name=""
push=0
load=0
tag_suffix=""
# Function to display usage information
usage() {
echo "Usage: $0 -i <image_name> [-o <org_name>] [--push] [--load] [-t <tag_suffix>]"
echo " -i: Image name (required)"
echo " -o: Organization name"
echo " --push: Push the image"
echo " --load: Load the image"
echo " -t: Tag suffix"
exit 1
}
# Parse command-line options
while [[ $# -gt 0 ]]; do
case $1 in
-i) image_name="$2"; shift 2 ;;
-o) org_name="$2"; shift 2 ;;
--push) push=1; shift ;;
--load) load=1; shift ;;
-t) tag_suffix="$2"; shift 2 ;;
*) usage ;;
esac
done
# Check if required arguments are provided
if [[ -z "$image_name" ]]; then
echo "Error: Image name is required."
usage
fi
echo "Building: $image_name"
tags=()
OPENHANDS_BUILD_VERSION="dev"
cache_tag_base="buildcache"
cache_tag="$cache_tag_base"
if [[ -n $RELEVANT_SHA ]]; then
git_hash=$(git rev-parse --short "$RELEVANT_SHA")
tags+=("$git_hash")
tags+=("$RELEVANT_SHA")
fi
if [[ -n $GITHUB_REF_NAME ]]; then
# check if ref name is a version number
if [[ $GITHUB_REF_NAME =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
major_version=$(echo "$GITHUB_REF_NAME" | cut -d. -f1)
minor_version=$(echo "$GITHUB_REF_NAME" | cut -d. -f1,2)
tags+=("$major_version" "$minor_version")
tags+=("latest")
fi
sanitized_ref_name=$(echo "$GITHUB_REF_NAME" | sed 's/[^a-zA-Z0-9.-]\+/-/g')
OPENHANDS_BUILD_VERSION=$sanitized_ref_name
sanitized_ref_name=$(echo "$sanitized_ref_name" | tr '[:upper:]' '[:lower:]') # lower case is required in tagging
tags+=("$sanitized_ref_name")
cache_tag+="-${sanitized_ref_name}"
fi
if [[ -n $tag_suffix ]]; then
cache_tag+="-${tag_suffix}"
for i in "${!tags[@]}"; do
tags[$i]="${tags[$i]}-$tag_suffix"
done
fi
echo "Tags: ${tags[@]}"
if [[ "$image_name" == "openhands" ]]; then
dir="./containers/app"
elif [[ "$image_name" == "runtime" ]]; then
dir="./containers/runtime"
else
dir="./containers/$image_name"
fi
if [[ (! -f "$dir/Dockerfile") && "$image_name" != "runtime" ]]; then
# Allow runtime to be built without a Dockerfile
echo "No Dockerfile found"
exit 1
fi
if [[ ! -f "$dir/config.sh" ]]; then
echo "No config.sh found for Dockerfile"
exit 1
fi
source "$dir/config.sh"
if [[ -n "$org_name" ]]; then
DOCKER_ORG="$org_name"
fi
# If $DOCKER_IMAGE_SOURCE_TAG is set, add it to the tags
if [[ -n "$DOCKER_IMAGE_SOURCE_TAG" ]]; then
tags+=("$DOCKER_IMAGE_SOURCE_TAG")
fi
# If $DOCKER_IMAGE_TAG is set, add it to the tags
if [[ -n "$DOCKER_IMAGE_TAG" ]]; then
tags+=("$DOCKER_IMAGE_TAG")
fi
DOCKER_REPOSITORY="$DOCKER_REGISTRY/$DOCKER_ORG/$DOCKER_IMAGE"
DOCKER_REPOSITORY=${DOCKER_REPOSITORY,,} # lowercase
echo "Repo: $DOCKER_REPOSITORY"
echo "Base dir: $DOCKER_BASE_DIR"
args=""
for tag in "${tags[@]}"; do
args+=" -t $DOCKER_REPOSITORY:$tag"
done
if [[ $push -eq 1 ]]; then
args+=" --push"
args+=" --cache-to=type=registry,ref=$DOCKER_REPOSITORY:$cache_tag,mode=max"
fi
if [[ $load -eq 1 ]]; then
args+=" --load"
fi
echo "Args: $args"
# Modify the platform selection based on --load flag
if [[ $load -eq 1 ]]; then
# When loading, build only for the current platform
platform=$(docker version -f '{{.Server.Os}}/{{.Server.Arch}}')
else
# For push or without load, build for multiple platforms
platform="linux/amd64,linux/arm64"
fi
echo "Building for platform(s): $platform"
docker buildx build \
$args \
--build-arg OPENHANDS_BUILD_VERSION="$OPENHANDS_BUILD_VERSION" \
--cache-from=type=registry,ref=$DOCKER_REPOSITORY:$cache_tag \
--cache-from=type=registry,ref=$DOCKER_REPOSITORY:$cache_tag_base-main \
--platform $platform \
--provenance=false \
-f "$dir/Dockerfile" \
"$DOCKER_BASE_DIR"
# If load was requested, print the loaded images
if [[ $load -eq 1 ]]; then
echo "Local images built:"
docker images "$DOCKER_REPOSITORY" --format "{{.Repository}}:{{.Tag}}"
fi
+124
View File
@@ -0,0 +1,124 @@
# syntax=docker/dockerfile:1
###
FROM ubuntu:22.04 AS dind
# https://docs.docker.com/engine/install/ubuntu/
RUN apt-get update && apt-get install -y \
ca-certificates \
curl \
&& install -m 0755 -d /etc/apt/keyrings \
&& curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc \
&& chmod a+r /etc/apt/keyrings/docker.asc \
&& echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
RUN apt-get update && apt-get install -y \
docker-ce \
docker-ce-cli \
containerd.io \
docker-buildx-plugin \
docker-compose-plugin \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean \
&& apt-get autoremove -y
###
FROM dind AS openhands
ENV DEBIAN_FRONTEND=noninteractive
#
RUN apt-get update && apt-get install -y \
bash \
build-essential \
curl \
git \
git-lfs \
software-properties-common \
make \
netcat \
sudo \
wget \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean \
&& apt-get autoremove -y
# https://github.com/cli/cli/blob/trunk/docs/install_linux.md
RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
&& chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
&& apt-get update && apt-get -y install \
gh \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean \
&& apt-get autoremove -y
# Python 3.12
RUN add-apt-repository ppa:deadsnakes/ppa \
&& apt-get update \
&& apt-get install -y python3.12 python3.12-venv python3.12-dev python3-pip \
&& ln -s /usr/bin/python3.12 /usr/bin/python
# NodeJS >= 18.17.1
RUN curl -fsSL https://deb.nodesource.com/setup_18.x | bash - \
&& apt-get install -y nodejs
# Poetry >= 1.8
RUN curl -fsSL https://install.python-poetry.org | python3.12 - \
&& ln -s ~/.local/bin/poetry /usr/local/bin/poetry
#
RUN <<EOF
#!/bin/bash
printf "#!/bin/bash
set +x
uname -a
docker --version
gh --version | head -n 1
git --version
#
python --version
echo node `node --version`
echo npm `npm --version`
poetry --version
netcat -h 2>&1 | head -n 1
" > /version.sh
chmod a+x /version.sh
EOF
###
FROM openhands AS dev
RUN apt-get update && apt-get install -y \
dnsutils \
file \
iproute2 \
jq \
lsof \
ripgrep \
silversearcher-ag \
vim \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean \
&& apt-get autoremove -y
WORKDIR /app
# cache build dependencies
RUN \
--mount=type=bind,source=./,target=/app/ \
<<EOF
#!/bin/bash
make -s clean
make -s check-dependencies
make -s install-python-dependencies
# NOTE
# node_modules are .dockerignore-d therefore not mountable
# make -s install-frontend-dependencies
EOF
#
CMD ["bash"]
+54
View File
@@ -0,0 +1,54 @@
# Develop in Docker
Install [Docker](https://docs.docker.com/engine/install/) on your host machine and run:
```bash
make docker-dev
# same as:
cd ./containers/dev
./dev.sh
```
It could take some time if you are running for the first time as Docker will pull all the tools required for building OpenHands. The next time you run again, it should be instant.
## Build and run
If everything goes well, you should be inside a container after Docker finishes building the `openhands:dev` image similar to the following:
```bash
Build and run in Docker ...
root@93fc0005fcd2:/app#
```
You may now proceed with the normal [build and run](../../Development.md) workflow as if you were on the host.
## Make changes
The source code on the host is mounted as `/app` inside docker. You may edit the files as usual either inside the Docker container or on your host with your favorite IDE/editors.
The following are also mapped as readonly from your host:
```yaml
# host credentials
- $HOME/.git-credentials:/root/.git-credentials:ro
- $HOME/.gitconfig:/root/.gitconfig:ro
- $HOME/.npmrc:/root/.npmrc:ro
```
## VSCode
Alternatively, if you use VSCode, you could also [attach to the running container](https://code.visualstudio.com/docs/devcontainers/attach-container).
See details for [developing in docker](https://code.visualstudio.com/docs/devcontainers/containers) or simply ask `OpenHands` ;-)
## Rebuild dev image
You could optionally pass additional options to the build script.
```bash
make docker-dev OPTIONS="--build"
# or
./containers/dev/dev.sh --build
```
See [docker compose run](https://docs.docker.com/reference/cli/docker/compose/run/) for more options.

Some files were not shown because too many files have changed in this diff Show More