OpenHands

mirror of https://github.com/All-Hands-AI/OpenHands.git synced 2026-04-29 03:00:45 -04:00

Author	SHA1	Message	Date
Xingyao Wang	3435f1e5d8	Store the file edit backup file in `/tmp` (#3958)	2024-09-23 06:32:24 +08:00
Xingyao Wang	714e46f29a	[eval] save eventstream & llm completions for SWE-Bench run_infer (#3923)	2024-09-22 04:39:13 +00:00
Xingyao Wang	402a03cb9a	change top_p default value to 1.0 (#3983)	2024-09-21 18:00:18 +00:00
tobitege	01462e11d7	(fix) CodeActAgent/LLM: react on should_exit flag (user cancellation) (#3968)	2024-09-20 23:49:45 +02:00
Robert Brennan	72ca1690a7	Wait for runtime to be ready in __init__ (#3963)	2024-09-20 17:31:30 +02:00
tobitege	45066f19dc	(fix) restore sudo-capability after recent changes (#3964)	2024-09-19 23:08:13 +02:00
niliy01	0f6fb0f80e	(enh) unify the log output in docker build process (#3961) Signed-off-by: niliy <WannaTen@users.noreply.github.com>	2024-09-19 19:19:16 +02:00
tobitege	620526b8b4	agent_controller: in PAUSED state reduce delegate logspam from delegate (#3946)	2024-09-19 14:34:38 +02:00
tofarr	31dbd3d02e	Fix google cloud session manager (#3942)	2024-09-19 06:28:10 -06:00
tofarr	dd7174e559	Fix broken cli (#3941)	2024-09-18 19:15:09 -06:00
Engel Nyst	8fdfece059	Refactor messages serialization (#3832) Co-authored-by: Robert Brennan <accounts@rbren.io>	2024-09-18 23:48:58 +02:00
tofarr	ad0b549d8b	Feat Tightening up Timeouts and interrupt conditions. (#3926)	2024-09-18 20:50:42 +00:00
Engel Nyst	47f60b8275	Don't send gemini settings when the llm is not gemini (#3940)	2024-09-18 20:12:58 +00:00
Xingyao Wang	5d7f2fd4ae	[eval] Allow evaluation of SWE-Bench patches on `RemoteRuntime` (#3927) Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-09-18 16:07:34 -04:00
Robert Brennan	c864715b43	Fix UID management for ubuntu users (#3937)	2024-09-18 16:29:39 +00:00
tobitege	b4408b41c9	(feat) LLM class: add safety_settings for Gemini; improve max_output_tokens defaulting (#3925)	2024-09-18 11:51:23 -04:00
Engel Nyst	e3be71f523	Fix init order with threading (#3935)	2024-09-18 15:26:51 +00:00
tobitege	c3117e8c39	(feat) add --version to cli (#3924)	2024-09-18 09:44:51 -04:00
niliy01	07a094e701	(enh) Update Docker pull data in place (#3910) Signed-off-by: Yi Lin <teroincn@gmail.com>	2024-09-17 10:22:07 +02:00
tobitege	52c5abccbf	(enh) Dockerfile.j2: improve env vars for bash and activate in .bashrc (#3871)	2024-09-17 08:49:04 +02:00
niliy01	804674bb9f	refactor the logic in agent_controller to imporve readability (#3873) Signed-off-by: Yi Lin <teroincn@gmail.com>	2024-09-16 14:13:52 -04:00
Engel Nyst	41a54378dc	Add delegates events to eval trajectories (#3881)	2024-09-16 14:10:42 -04:00
tofarr	0db664986d	Tightened up the logic on retries. (#3882)	2024-09-16 07:28:06 -06:00
tobitege	a33f61c025	(feat) Show messages' timestamp in UI (#3869)	2024-09-16 05:41:29 +02:00
tobitege	a45b20a406	(fix) runtime: tweak _wait_until_alive tenacity and exception handling (#3878)	2024-09-16 04:24:58 +02:00
tobitege	ecf4aed28b	(fix) Update logs after run_action (EventStreamRuntime) (#3870)	2024-09-15 18:50:10 +02:00
tobitege	554636cf2a	(fix) Fix runtime (RT) tests and split tests in 2 actions (openhands/root) (#3791) Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-09-14 21:51:30 +02:00
tobitege	57390eb26b	(enh) docker pull (if not found locally) with progress info (#3682)	2024-09-14 06:26:42 +02:00
tobitege	6111f530c2	(fix) StuckDetector: syntax error loops were not detected (#3663) Co-authored-by: mamoodi <mamoodiha@gmail.com>	2024-09-13 16:53:52 +02:00
Xingyao Wang	78c5f58adc	refactor & improve retry for the reliability of `RemoteRuntime` & evaluation (#3846)	2024-09-13 07:37:07 -04:00
Robert Brennan	1f13d80ddc	fix saves (#3848)	2024-09-12 21:47:02 +00:00
Robert Brennan	58de5221f5	fix file access (#3847)	2024-09-12 15:30:21 -04:00
Xingyao Wang	2fe2f4c530	[eval] increase timeout for SWEBench eval init/complete (#3829 ) * [eval] increase timeout for swebench eval init/complete * allow CmdRunAction to optionally block when .timeout is setted * fix unit test for serialization * fix unit tests for security analyzer * fix integration tests * add more timeout	2024-09-12 15:20:58 +00:00
Robert Brennan	c6105f264f	Improvements to file list UI (#3794 ) * move filematching logic into server * wait until ready before returning * show loading message instead of empty * logspam * delint * fix type * add a few more default ignores	2024-09-11 09:44:37 -04:00
mamoodi	f3b2085f9b	Reduce runtime tests duration by running them across CPUs (#3779 ) * Reduce runtime tests duration by running them across CPUs * fix hardcoded image name * test two cpus * Test folder change * Up the CPU to 4 again to test * Change to 3 CPUs * Down to 2 * Add param to remove all openhands containers * Add comment * Add reruns just in case * Fix ordering of if	2024-09-10 14:31:17 -04:00
Cole Murray	97a03faf33	Add Handling of Cache Prompt When Formatting Messages (#3773 ) * Add Handling of Cache Prompt When Formatting Messages * Fix Value for Cache Control * Fix Value for Cache Control * Update openhands/core/message.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * Fix lint error * Serialize Messages if Propt Caching Is Enabled * Remove formatting message change --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>	2024-09-10 16:34:41 +00:00
tobitege	5ffff742de	Regression fixes: LLM logging; client readiness (EventStreamRuntime) (#3776 ) * Regression fixes: LLM logging; client readiness (EventStreamRuntime) * fix llm.async_completion_wrapper bad edit in previous commit * regen couple of mock files * client: always log initialized status	2024-09-09 21:02:43 +02:00
tobitege	2b7517e542	(enh) add caching@v4 action in workflows (#3780 ) * dummy test change * regen yml: 1st install python 3.11, then poetry * fix caching for poetry; old entry for python was rather useless * fix steps order (cache before poetry) * add poetry caching to ghcr_runtime; fix fork conditions * ghcr_runtime: more caching actions; condition fixes * fix interim action error (order of steps) * cache@v4 instead of v3 * fixed interim typo for 2 fork conditions * runtime/test_env_vars: compacted multiple tests into one to reduce time * ugh if fork condition changes again	2024-09-09 10:49:49 +02:00
Cole Murray	dadada18ce	Add Anthropic Models to Cache Prompt (#3775 ) * Add Anthropic Models to Cache Prompt * Update Cache Prompt Active Check for Partial String Matching	2024-09-08 22:09:14 +00:00
Robert Brennan	ab3851593d	Support interactive commands (#3653 ) * hacky solution for interactive commands * add more behavior * debug * fix continue functionality * remove prints * refactor a bit * reduce test sleep * fix python version * fix pre-commit issue * Regenerate integration tests * Update openhands/runtime/client/client.py * revert some prompt stuff * several integration mock files regenerated * execute_action: remove duplicate exception logging --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: tobitege <10787084+tobitege@users.noreply.github.com>	2024-09-08 21:45:51 +02:00
tobitege	57187417b7	revert enabling litellm verbose mode from testing (#3750 )	2024-09-05 20:12:04 +00:00
tobitege	03b5b03bb2	(enh) CodeActAgent: improve logging; sensible retry defaults in config (#3729 ) * CodeActAgent: improve logging; sensible retry defaults for completion errors * CodeActAgent: reduce completion error message sent to UI * tweak values; docs+config template changes * fix format_messages; log exception in codeactagent again	2024-09-05 18:14:15 +00:00
niliy01	82a154f7e7	(feat) making prompt caching optional instead of enabled default (#3689 ) * (feat) making prompt caching optional instead of enabled default At present, only the Claude models support prompt caching as a experimental feature, therefore, this feature should be implemented as an optional setting rather than being enabled by default. Signed-off-by: Yi Lin <teroincn@gmail.com> * handle the conflict * fix unittest mock return value * fix lint error in whitespace --------- Signed-off-by: Yi Lin <teroincn@gmail.com>	2024-09-05 18:52:26 +02:00
Xingyao Wang	688068a44e	Fix issues for running `RemoteRuntime` in parallel on SWE-Bench (#3716 ) * feat: add SWE-bench fullset support * fix instance image list * update eval script and documentation * increase timeout for remote runtime * add push script * handle the case when ret push is an generator * update pbar * set SWE-Bench default to run SWE-Bench lite * add script to cleanup remote runtime * fix the cases when tag is too long * update README * update readme for cleanup * rename od to oh * Update evaluation/swe_bench/README.md Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/swe_bench/README.md Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/swe_bench/scripts/cleanup_remote_runtime.sh Co-authored-by: Graham Neubig <neubig@gmail.com> * gets API key and Runtime from env var --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-09-05 10:34:31 +08:00
tobitege	bc31fb15fe	(fix) CodeActAgent: fix issues with vision support in prompts (#3665) * CodeActAgent: fix message prep if prompt caching is not supported * fix python version in regen tests workflow * fix in conftest "mock_completion" method * add disable_vision to LLMConfig; revert change in message parsing in llm.py * format messages in several files for completion * refactored message(s) formatting (llm.py); added vision_is_active() * fix a unit test * regenerate: added LOG_TO_FILE and FORCE_REGENERATE env flags * try to fix path to logs folder in workflow * llm: prevent index error * try FORCE_USE_LLM in regenerate * tweaks everywhere... * fix 2 random unit test errors :( * added FORCE_REGENERATE_TESTS=true to regenerate CLI * fix test_lint_file_fail_typescript again * double-quotes for env vars in workflow; llm logger set to debug * fix typo in regenerate * regenerate iterations now 20; applied iteration counter fix by Li * regenerate: pass FORCE_REGENERATE flag into env * fixes for int tests. several mock files updated. * browsing_agent: fix response_parser.py adding ) to empty response * test_browse_internet: fix skipif and revert obsolete mock files * regenerate: fi bracketing for http server start/kill conditions * disable test_browse_internet for CodeActAgents; mock files updated after merge missed to include more mock files earlier * reverts after review feedback from Li * forgot one * browsing agent test, partial fixes and updated mock files * test_browse_internet works in my WSL now! * adapt unit test test_prompt_caching.py * add DEBUG to regenerate workflow command * convert regenerate workflow params to inputs * more integration test mock files updated * more files * test_prompt_caching: restored test_prompt_caching_headers purpose * file_ops: fix potential exception, like "cross device copy"; fixed mock files accordingly * reverts/changes wrt feedback from xingyao * updated docs and config template * code cleanup wrt review feedback	2024-09-04 17:58:30 +02:00
Shubham raj	2bc3e8d584	Fix: llm completion exception breaks CodeActAgent (#3678 ) * Catch exception and return finish action with an exception message in case of exception in llm completion * Remove exception logs * Raise llm response error for any exception in llm completion * Raise LLMResponseError from async completion and async streaming completion as well	2024-09-04 05:51:49 +02:00
Xingyao Wang	d8a87d7ccb	[Eval] Make SWE-Bench run_infer.sh to default to run SWE-Bench Lite (#3704 ) * feat: add SWE-bench fullset support * fix instance image list * update eval script and documentation * increase timeout for remote runtime * add push script * handle the case when ret push is an generator * update pbar * set SWE-Bench default to run SWE-Bench lite	2024-09-04 00:58:14 +08:00
Mislav Balunovic	f979d612ec	(fix) confirmation mode bugfix for the EventStreamRuntime (#3695 )	2024-09-02 13:27:33 +00:00
Boxuan Li	75d5591816	file_ops: Use tmp file for original linting (#3681 ) Fix a potential issue that might lead to file corruption when edit linting is enabled #3124 introduces a feature for editing: running linter twice before and after the change and only extract new errors introduced by the agent. This has some potential issues and I am working on #3649 to address them, but I feel like I am not gonna finish it in the next few days, and that PR has become harder and harder to review, thus this PR, which only focuses on a small improvement. So what's the issue? When we run linters on the original file before our edits, we need to copy the original file and use a temporary file to lint, because linting may have side-effect (e.g. modifying the file in-place). I used the word "may" because: Flake8 has no side-effect, so not a problem as of now. We don't enforce this or document this "no side-effect" as a requirement for linter implementation, so side-effect is allowed. Regardless, the "after-edit-linting" uses the same approach: backup the file before linting to avoid data corruption. We should keep our "before-edit-linting" consistent. Why no new unittest that reproduces the issue? Well, as I have mentioned earlier, flake8 has no side-effect, so technically it's not a bug but a flaw. Therefore, there's no way to write a test that reproduces the issue.	2024-09-01 23:36:57 -07:00
tobitege	7068a73ae7	(enh) Improve CodeActAgent's file editing reliability (#3610 ) * improve file editing prompts and unit test converted most raise calls to a _output_error call in file_ops.py * tweaks in test_agent_skill.py wrt to SEP separator * tweaked the separator * remove server runtime remnants and TEST_RUNTIME references * restore use of TEST_RUNTIME args and variables * fix integration tests * added hint to properly escape docstrings * revert latest prompt change --------- Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-09-02 06:03:22 +02:00

86 Commits