17 Commits

Author SHA1 Message Date
Aditya Bharat Soni
aebb583779 Support for VisualWebArena evaluation in OpenHands (#4773)
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2025-01-23 20:18:30 +00:00
Xingyao Wang
899c1f8360 fix(bash): also show timeout reminder when no_change_timeout is triggered (#6318)
Co-authored-by: Robert Brennan <accounts@rbren.io>
2025-01-18 03:31:23 +08:00
Engel Nyst
b9a70c8d5c Delegation fixes (#6165)
2025-01-15 03:24:39 +00:00
Xingyao Wang
ec70af9412 refactor: Replace pexpect with libtmux in BashSession (#4881)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2025-01-04 05:22:13 +08:00
Robert Brennan
0e4e1b3316 Factor out ActionExecutionClient (#5796)
2024-12-30 15:32:13 +00:00
Xingyao Wang
9908e1b285 [Evaluation]: Log openhands version in eval output folder, instead of agent version (#5394)
2024-12-04 03:33:43 +00:00
Engel Nyst
ea994b6209 More integration tests info (#5319)
2024-11-29 16:39:03 +01:00
Xingyao Wang
4d3b035e00 feat(agent): add BrowseURLAction to CodeAct (produce markdown from URL) (#5285)
2024-11-27 21:55:57 +00:00
OpenHands
f0ca2239f3 Fix issue #5076: Integration test github action (#5077)
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-11-27 21:31:48 +01:00
Engel Nyst
eeb2342509 Refactor history/event stream (#3808)
2024-11-05 03:36:14 +01:00
Xingyao Wang
966da7b7c8 feat(agent, CodeAct 2.2): native CodeAct support for Browsing (#4667)
Co-authored-by: tofarr <tofarr@gmail.com>
2024-11-05 00:27:27 +08:00
Xingyao Wang
6d19c93d19 [eval] add evaluation workflow (#4489)
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-10-29 13:52:25 +00:00
Xingyao Wang
ae13171194 feat(agent): CodeAct with function calling (#4537)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: tofarr <tofarr@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-29 11:06:33 +08:00
Xingyao Wang
1f23dc89b6 fix(eval): add runtime.connect to all eval harness (#4565)
2024-10-26 00:41:30 +08:00
Xingyao Wang
7340b78962 feat(eval): rewrite log_completions to save completions to directory (#4566)
2024-10-25 16:36:11 +00:00
Xingyao Wang
2d5b360505 refactor: re-organize different runtime implementations into an impl folder (#4346)
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-10-23 10:10:03 +00:00
Xingyao Wang
84a578ad20 [test] remove integration tests from CI & move them into evaluation (#4447)
2024-10-17 05:38:23 +08:00