OpenHands

mirror of https://github.com/All-Hands-AI/OpenHands.git synced 2026-04-29 03:00:45 -04:00

Author	SHA1	Message	Date
Robert Brennan	01ae22ef57	Rename OpenDevin to OpenHands (#3472 ) * Replace OpenDevin with OpenHands * Update CONTRIBUTING.md * Update README.md * Update README.md * update poetry lock; move opendevin folder to openhands * fix env var * revert image references in docs * revert permissions * revert permissions --------- Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-08-20 00:44:54 +08:00
Graham Neubig	2b1e73365b	Fix typing for fake user response function (#3438 )	2024-08-18 06:47:08 +08:00
Engel Nyst	9cb0bf97c1	Fix restore cli sessions (#3409 ) * fix restore cli sessions * pytest * fix log message * make sure sid is set --------- Co-authored-by: mamoodi <mamoodiha@gmail.com>	2024-08-17 23:31:42 +02:00
Xingyao Wang	8d7bf83224	refactor: break agentskills single file to multiple composable modules (#3429 ) * refactor agentskills to prepare for agentless * fix import * fix typo * fix imports * fix globals * fix import * fix import * disable log to file to avoid auto-created log file w/ permission issue when import od in runtime * import agentskills from OD instead from itself directly * add back pythonpath * remove chown since there's no log/folder	2024-08-18 05:18:36 +08:00
tofarr	b2221f135f	Now printing a warning message when a config has an unknown key (#3428 ) * Now printing a warning message when unknown keys present	2024-08-17 10:54:38 -06:00
Engel Nyst	92b1a2da5c	Refactor agent to accept agent config (#3430 ) * refactor agents to receive their agent config * add unit test * fix test * fix tests	2024-08-17 18:11:30 +02:00
Engel Nyst	be2ea6ab35	logging fixes (#3393 )	2024-08-14 21:21:42 +02:00
adragos	e0b67ad2f1	feat: add Security Analyzer functionality (#3058 ) * feat: Initial work on security analyzer * feat: Add remote invariant client * chore: improve fault tolerance of client * feat: Add button to enable Invariant Security Analyzer * [feat] confirmation mode for bash actions * feat: Add Invariant Tab with security risk outputs * feat: Add modal setting for Confirmation Mode * fix: frontend tests for confirmation mode switch * fix: add missing CONFIRMATION_MODE value in SettingsModal.test.tsx * fix: update test to integrate new setting * feat: Initial work on security analyzer * feat: Add remote invariant client * chore: improve fault tolerance of client * feat: Add button to enable Invariant Security Analyzer * feat: Add Invariant Tab with security risk outputs * feat: integrate security analyzer with confirmation mode * feat: improve invariant analyzer tab * feat: Implement user confirmation for running bash/python code * fix: don't display rejected actions * fix: make confirmation show only on assistant messages * feat: download traces, update policy, implement settings, auto-approve based on defined risk * Fix: low risk not being shown because it's 0 * fix: duplicate logs in tab * fix: log duplication * chore: prepare for merge, remove logging * Merge confirmation_mode from OpenDevin main * test: update tests to pass * chore: finish merging changes, security analyzer now operational again * feat: document Security Analyzers * refactor: api, monitor * chore: lint, fix risk None, revert policy * fix: check security_risk for None * refactor: rename instances of invariant to security analyzer * feat: add /api/options/security-analyzers endpoint * Move security analyzer from tab to modal * Temporary fix lock when security analyzer is not chosen * feat: don't show lock at all when security analyzer is not enabled * refactor: - Frontend: * change type of SECURITY_ANALYZER from bool to string * add combobox to select SECURITY_ANALYZER, current options are "invariant and "" (no security analyzer) * Security is now a modal, lock in bottom right is visible only if there's a security analyzer selected - Backend: * add close to SecurityAnalyzer * instantiate SecurityAnalyzer based on provided string from frontend * fix: update close to be async, to be consistent with other close on resources * fix: max height of modal (prevent overflow) * feat: add logo * small fixes * update docs for creating a security analyzer module * fix linting * update timeout for http client * fix: move security_analyzer config from agent to session * feat: add security_risk to browser actions * add optional remark on combobox * fix: asdict not called on dataclass, remove invariant dependency * fix: exclude None values when serializing * feat: take default policy from invariant-server instead of being hardcoded * fix: check if policy is None * update image name * test: fix some failing runs * fix: security analyzer tests * refactor: merge confirmation_mode and security_analyzer into SecurityConfig. Change invariant error message for docker * test: add tests for invariant parsing actions / observations * fix: python linting for test_security.py * Apply suggestions from code review Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * use ActionSecurityRisk \| None intead of Optional * refactor action parsing * add extra check * lint parser.py * test: add field keep_prompt to test_security * docs: add information about how to enable the analyzer * test: Remove trailing whitespace in README.md text --------- Co-authored-by: Mislav Balunovic <mislav.balunovic@gmail.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-08-13 11:29:41 +00:00
Yufan Song	99ac91f6ad	remove sandbox abstract class (#3337 ) * remove sandbox abstract class * remove cancellable stream	2024-08-11 22:47:28 +08:00
Xingyao Wang	bdf6df12c3	fix: pip not available in runtime (#3306 ) * try to fix pip unavailable * update test case for pip * force rebuild in CI * remove extra symlink * fix newline * added semi-colon to line 31 * Dockerfile.j2: activate env at the end * Revert "Dockerfile.j2: activate env at the end" This reverts commit `cf2f565102`. * cleanup Dockerfile * switch default python image * remove image agnostic (no longer used) * fix tests * switch to nikolaik/python-nodejs:python3.11-nodejs22 * fix test * fix test * revert docker * update template --------- Co-authored-by: tobitege <tobitege@gmx.de> Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-09 15:04:43 -04:00
Xingyao Wang	2e6b08db4f	fix: workspace folder permission & app container cannot access client API (#3300 ) * also copy over pyproject and poetry lock * add missing readme * remove extra git config init since it is already done in client.py * only chown if the /workspace dir does not exists * Revert "remove extra git config init since it is already done in client.py" This reverts commit `e8556cd76d`. * remove extra git config init since it is already done in client.py * fix test runtime * print container log while reconnecting * print log in more readable format * print log in more readable format * increase lines * clean up sandbox and ssh related stuff * remove ssh hostname * remove ssh hostname * fix docker app cannot access runtime API issue * remove ssh password * API HOSTNAME should be pre-fixed with SANDBOX * update config * fix typo that breaks the test	2024-08-08 19:28:34 -04:00
Xingyao Wang	a5195b0e65	chore: clean up sandbox and ssh related configs (#3301 ) * clean up sandbox and ssh related stuff * remove ssh hostname * remove ssh hostname * remove ssh password * update config * fix typo that breaks the test	2024-08-08 22:15:40 +00:00
Xingyao Wang	90d0a62469	(arch) Switch default runtime to EventStream Runtime (#3271 ) * switch default to eventstream runtime * remove pull docker from makefile * fix unittest * fix file store path * try deprecate server runtime * remove persist sandbox * move file utils * remove server runtime related workflow * remove unused method * attempt to remove the reliance on filestore for BE * fix async for list file * fix list_files to post * fix list files * add suffix to directory * make sure list file returns abs path; make sure other backend endpoints accpets abs path * remove server runtime test workflow * set git config in runtime	2024-08-08 10:11:49 +08:00
Xingyao Wang	bb66f15ff6	[Arch] Streamline EventStream Runtime Image Building Logic (#3259 ) * remove nocache * simplify runtime build to use hash & always update source * style * try to fix temp folder issue * fix rm tree * create build folder first (to get correct hash), then copy it over to actual build folder * fix assert * fix indentation * fix copy over * add runtime documentation * fix runtime docs * fix typo * Update docs/modules/usage/runtime.md Co-authored-by: Graham Neubig <neubig@gmail.com> * Update docs/modules/usage/runtime.md Co-authored-by: Graham Neubig <neubig@gmail.com> --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-07 06:09:38 +08:00
Xingyao Wang	31b244f95e	[Refactor, Evaluation] Refactor and clean up evaluation harness to remove global config and use EventStreamRuntime (#3230 ) * move multi-line bash tests to test_runtime; support multi-line bash for esruntime; * add testcase to handle PS2 prompt * use bashlex for bash parsing to handle multi-line commands; add testcases for multi-line commands * revert ghcr runtime change * Apply stash * fix run as other user; make test async; * fix test runtime for run as od * add run-as-devin to all the runtime tests * handle the case when username is root * move all run-as-devin tests from sandbox; only tests a few cases on different user to save time; * move over multi-line echo related tests to test_runtime * fix user-specific jupyter by fixing the pypoetry virtualenv folder * make plugin's init async; chdir at initialization of jupyter plugin; move ipy simple testcase to test runtime; * support agentskills import in move tests for jupyter pwd tests; overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter; make agentskills read env var lazily, in case env var is updated; * fix ServerRuntime agentskills issue * move agnostic image test to test_runtime * merge runtime tests in CI * fix enable auto lint as env var * update warning message * update warning message * test for different container images * change parsing output as debug * add exception handling for update_pwd_decorator * fix unit test indentation * add plugins as default input to Runtime class; remove init_sandbox_plugins; implement add_env_var (include jupyter) in the base class; * fix server runtime auto lint * Revert "add exception handling for update_pwd_decorator" This reverts commit `2b668b1506`. * tries to print debugging info for agentskills * explictly setting uid (try fix permission issue) * Revert "tries to print debugging info for agentskills" This reverts commit `8be4c86756`. * set sandbox user id during testing to hopefully fix the permission issue * add browser tools for server runtime * try to debug for old pwd * update debug cmd * only test agnostic runtime when TEST_RUNTIME is Server * fix temp dir mkdir * load TEST_RUNTIME at the beginning * remove ipython tests * only log to file when DEBUG * default logging to project root * temporarily remove log to file * fix LLM logger dir * fix logger * make set pwd an optional aux action * fix prev pwd * fix infinity recursion * simplify * do not import the whole od library to avoid logger folder by jupyter * fix browsing * increase timeout * attempt to fix agentskills yet again * clean up in testcases, since CI maybe run as non-root * add _cause attribute for event.id * remove parent * add a bunch of debugging statement again for CI :( * fix temp_dir fixture * change all temp dir to follow pytest's tmp_path_factory * remove extra bracket * clean up error printing a bit * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * add typing for tmp dir fixture * clear the directory before running the test to avoid weird CI temp dir * remove agnostic test case for server runtime * Revert "remove agnostic test case for server runtime" This reverts commit `30e2181c3f`. * disable agnostic tests in CI * fix test * make sure plugin arg is not passed when no plugin is specified; remove redundant on_event function; * move mock prompt * rename runtime * remove extra logging * refactor run_controller's interface; support multiple runtime for integration test; filter out hostname for prompt * uncomment other tests * pass the right runtime to controller * log runtime when start * uncomment tests * improve symbol filters * add intergration test prompts that seemd ok * add integration test workflow * add python3 to default ubuntu image * symlink python and fix permission to jupyter pip * add retry for jupyter execute server * fix jupyter pip install; add post-process for jupyter pip install; simplify init by add agent_skills path to PYTHONPATH; add testcase to tests jupyter pip install; * fix bug * use ubuntu:22.04 for eventstream integration tests * add todo * update testcase * remove redundant code * fix unit test * reduce dependency for runtime * try making llama-index an optional dependency that's not installed by default * remove pip install since it seemd not needed * log ipython execution; await write message since it returns a future * update ipy testcase * do not install llama-index in CI * do not install llama-index in the app docker as well * set sandbox container image in the integration test script * log plugins & env var for runtime * update conftest for sha256 * add git * remove all non-alphanumeric chalracters * add working ipy module tests! * default to use host network * remove is_async from browser to make thing a little more reliable; retry loading browser when error; * add sleep to wait a bit for http server * kill http server before regenerate browsing tests * fix browsing * only set sandbox container image if undefined * skip empty config value * update evaluation to use the latest run_controller * revert logger in execute_server to be compatible with server runtime * revert logging level to fix jupyter * set logger level * revert the logging * chmod for workspace to fix permission * support getting timeout from action * update test for server runtime * try to fix file permission * fix test_cmd_run_action_serialization_deserialization test (added timeout) * poetry: pip 24.2, torch 2.2.2 * revert adding pip to pyproject.toml * add build to dependencies in pyproject.toml * forgot poetry lock --no-update * fix a DelegatorAgent prompt_002.log (timeout) * fix a DelegatorAgent prompt_003.log (timeout) * couple more timeout attribs in prompt files * some more prompt files * prompts galore * add clarification comment for timeout * default timeout to config * add assert * update integraton tests for eventstream * update integration tests * fix timeout for action<->dict * remove redundant on_event * default to use instance image * update run_controller interface * add logging for copy * refactor swe_bench for the new design * fix action execution timeout * updatelock * remove build sandbox locally * fix runtime * use plain for-loop for single process * remove extra print * get swebench inference working * print whole `test_result` dict * got swebench patch post-process working * update swe-bench evaluation readme * refactor using shared reset_logger function * move messy swebench prompt to a different file * support the ability to specify whether to keep prompt * support the ability to specify whether to keep prompt * fix dockerfile * fix import and remove unnecessary strip logic * fix action serialization * get agentbench running * remove extra ls for agent bench * fix agentbench metric * factor out common documentation for eval * update biocoder doc * remove swe_env_box since it is no longer needed * get biocoder working * add func timeout for bird * fix jupyter pwd with ~ as user name * fix jupyter pwd with ~ as user name * get bird working * get browsing evaluation working * make eda runnable * fix id column * fix eda run_infer * unify eval output using a structured format; make swebench coompatible with that format; update client source code for every swebench run; do not inject testcmd for swebench * standardize existing benchs for the new eval output * set update source code = true * get gaia standardized * fix gaia * gorilla refactored but stuck at language.so to test * refactor and make gpqa work * refactor humanevalfix and get it working * refactor logic reasoning and get it working * refactor browser env so it works with eventstream runtime for eval * add initial version of miniwob refactor * fix browsergym environment * get miniwob working!! * allowing injecting additional dependency to OD runtime docker image * allowing injecting additional dependency to OD runtime docker image * support logic reasoning with pre-injected dependency * get mint working * update runtime build * fix mint docker * add test for keep_prompt; add missing await close for some tests * update integration tests for eventstream runtime * fix integration tests for server runtime * refactor ml bench and toolqa * refactor webarena * fix default factory * Update run_infer.py * add APIError to retry * increase timeout for swebench * make sure to hide api key when dump eval output * update the behavior of put source code to put files instead of tarball * add dishash to dependency * sendintr when timeout * fix dockerfile copy * reduce timeout * use dirhash to avoid repeat building for update source * fix runtime_build testcase * add dir_hash to docker build pipeline * revert api error * update poetry lock * add retries for swebench run infer * fix git patch * update poetry lock * adjust config order * fix mount volumns * enforce all eval to use "instance_id" * remove file store from runtime * make file_store public inside eventstream * move the runtime logic inside `main` out * support using async function for process_instance_fn * refactor run_infer with the create_time * fix file store * Update evaluation/toolqa/utils.py Co-authored-by: Graham Neubig <neubig@gmail.com> * fix typo --------- Co-authored-by: tobitege <tobitege@gmx.de> Co-authored-by: super-dainiu <78588128+super-dainiu@users.noreply.github.com> Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-08-06 17:21:45 +00:00
Xingyao Wang	6a12a9f83c	[Arch, Eval] Allowing injecting additional dependency to OD runtime docker image (#3237 ) * allowing injecting additional dependency to OD runtime docker image * update runtime build * make `extra_deps` optional str \| None	2024-08-04 17:38:56 +00:00
Kaushik Deka	415843476c	Feat: Add Vision Input Support for LLM with Vision Capabilities (#2848 ) * add image feature * fix-linting * check model support for images * add comment * Add image support to other models * Add images to chat * fix linting * fix test issues * refactor variable names and import * fix tests * fix chat message tests * fix linting * add pydantic class message * use message * remove redundant comments * remove redundant comments * change Message class * remove unintended change * fix integration tests using regenerate.sh * rename image_bas64 to images_url, fix tests * rename Message.py to message, change reminder append logic, add unit tests * remove comment, fix error to merge * codeact_swe_agent * fix f string * update eventstream integration tests * add missing if check in codeact_swe_agent * update integration tests * Update frontend/src/components/chat/ChatInput.tsx * Update frontend/src/components/chat/ChatInput.tsx * Update frontend/src/components/chat/ChatInput.tsx * Update frontend/src/components/chat/ChatInput.tsx * Update frontend/src/components/chat/ChatMessage.tsx --------- Co-authored-by: tobitege <tobitege@gmx.de> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com>	2024-08-04 02:26:22 +08:00
Xingyao Wang	b7061f4497	[Eval, Browser] Refactor Browser Env so it works with `EventStreamRuntime` for Browsing Evaluation (#3235 ) * refactor browser env so it works with eventstream runtime for eval * fix browsergym environment	2024-08-03 15:06:37 +00:00
Xingyao Wang	001195a3ea	reduce the duplication in run_controller (#3217 )	2024-08-02 10:12:34 +08:00
Xingyao Wang	4f0a454ed6	[Arch] Support integration tests using EventStream Runtime (#3184 ) * Remove global config from memory * Remove runtime global config * Remove from storage * Remove global config * Fix event stream tests * Fix sandbox issue * Change config * Removed transferred tests * Add swe env box * Fixes on testing * Fixed some tests * Merge with stashed changes * Fix typing * Fix ipython test * Revive function * Make temp_dir fixture * Remove test to avoid circular import * fix eventstream filestore for test_runtime * fix parse arg issue that cause integration test to fail * support swebench pull from custom namespace * add back simple tests for runtime * move multi-line bash tests to test_runtime; support multi-line bash for esruntime; * add testcase to handle PS2 prompt * use bashlex for bash parsing to handle multi-line commands; add testcases for multi-line commands * revert ghcr runtime change * Apply stash * fix run as other user; make test async; * fix test runtime for run as od * add run-as-devin to all the runtime tests * handle the case when username is root * move all run-as-devin tests from sandbox; only tests a few cases on different user to save time; * move over multi-line echo related tests to test_runtime * fix user-specific jupyter by fixing the pypoetry virtualenv folder * make plugin's init async; chdir at initialization of jupyter plugin; move ipy simple testcase to test runtime; * support agentskills import in move tests for jupyter pwd tests; overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter; make agentskills read env var lazily, in case env var is updated; * fix ServerRuntime agentskills issue * move agnostic image test to test_runtime * merge runtime tests in CI * fix enable auto lint as env var * update warning message * update warning message * test for different container images * change parsing output as debug * add exception handling for update_pwd_decorator * fix unit test indentation * add plugins as default input to Runtime class; remove init_sandbox_plugins; implement add_env_var (include jupyter) in the base class; * fix server runtime auto lint * Revert "add exception handling for update_pwd_decorator" This reverts commit `2b668b1506`. * tries to print debugging info for agentskills * explictly setting uid (try fix permission issue) * Revert "tries to print debugging info for agentskills" This reverts commit `8be4c86756`. * set sandbox user id during testing to hopefully fix the permission issue * add browser tools for server runtime * try to debug for old pwd * update debug cmd * only test agnostic runtime when TEST_RUNTIME is Server * fix temp dir mkdir * load TEST_RUNTIME at the beginning * remove ipython tests * only log to file when DEBUG * default logging to project root * temporarily remove log to file * fix LLM logger dir * fix logger * make set pwd an optional aux action * fix prev pwd * fix infinity recursion * simplify * do not import the whole od library to avoid logger folder by jupyter * fix browsing * increase timeout * attempt to fix agentskills yet again * clean up in testcases, since CI maybe run as non-root * add _cause attribute for event.id * remove parent * add a bunch of debugging statement again for CI :( * fix temp_dir fixture * change all temp dir to follow pytest's tmp_path_factory * remove extra bracket * clean up error printing a bit * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * add typing for tmp dir fixture * clear the directory before running the test to avoid weird CI temp dir * remove agnostic test case for server runtime * Revert "remove agnostic test case for server runtime" This reverts commit `30e2181c3f`. * disable agnostic tests in CI * fix test * make sure plugin arg is not passed when no plugin is specified; remove redundant on_event function; * move mock prompt * rename runtime * remove extra logging * refactor run_controller's interface; support multiple runtime for integration test; filter out hostname for prompt * uncomment other tests * pass the right runtime to controller * log runtime when start * uncomment tests * improve symbol filters * add intergration test prompts that seemd ok * add integration test workflow * add python3 to default ubuntu image * symlink python and fix permission to jupyter pip * add retry for jupyter execute server * fix jupyter pip install; add post-process for jupyter pip install; simplify init by add agent_skills path to PYTHONPATH; add testcase to tests jupyter pip install; * fix bug * use ubuntu:22.04 for eventstream integration tests * add todo * update testcase * remove redundant code * fix unit test * reduce dependency for runtime * try making llama-index an optional dependency that's not installed by default * remove pip install since it seemd not needed * log ipython execution; await write message since it returns a future * update ipy testcase * do not install llama-index in CI * do not install llama-index in the app docker as well * set sandbox container image in the integration test script * log plugins & env var for runtime * update conftest for sha256 * add git * remove all non-alphanumeric chalracters * add working ipy module tests! * default to use host network * remove is_async from browser to make thing a little more reliable; retry loading browser when error; * add sleep to wait a bit for http server * kill http server before regenerate browsing tests * fix browsing * only set sandbox container image if undefined * skip empty config value * update evaluation to use the latest run_controller * revert logger in execute_server to be compatible with server runtime * revert logging level to fix jupyter * set logger level * revert the logging * chmod for workspace to fix permission * support getting timeout from action * update test for server runtime * try to fix file permission * fix test_cmd_run_action_serialization_deserialization test (added timeout) * poetry: pip 24.2, torch 2.2.2 * revert adding pip to pyproject.toml * add build to dependencies in pyproject.toml * forgot poetry lock --no-update * fix a DelegatorAgent prompt_002.log (timeout) * fix a DelegatorAgent prompt_003.log (timeout) * couple more timeout attribs in prompt files * some more prompt files * prompts galore * add clarification comment for timeout * default timeout to config * add assert * update integraton tests for eventstream * update integration tests * fix timeout for action<->dict * remove redundant on_event * fix action execution timeout * updatelock --------- Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: tobitege <tobitege@gmx.de>	2024-08-01 22:07:39 +00:00
tobitege	a4cb880699	(feat) LLM class: added acompletion and streaming + unit test (#3202 ) * LLM class: added acompletion and streaming, unit test test_acompletion.py * LLM: cleanup of self.config defaults and their use * added set_missing_attributes to LLMConfig * move default checker up	2024-08-01 22:41:40 +02:00
Xingyao Wang	bd68249fba	[Arch] Test `EventStreamRuntime` to ensure its feature parity with `ServerRuntime` (#3157 ) * Remove global config from memory * Remove runtime global config * Remove from storage * Remove global config * Fix event stream tests * Fix sandbox issue * Change config * Removed transferred tests * Add swe env box * Fixes on testing * Fixed some tests * Merge with stashed changes * Fix typing * Fix ipython test * Revive function * Make temp_dir fixture * Remove test to avoid circular import * fix eventstream filestore for test_runtime * fix parse arg issue that cause integration test to fail * support swebench pull from custom namespace * add back simple tests for runtime * move multi-line bash tests to test_runtime; support multi-line bash for esruntime; * add testcase to handle PS2 prompt * use bashlex for bash parsing to handle multi-line commands; add testcases for multi-line commands * revert ghcr runtime change * Apply stash * fix run as other user; make test async; * fix test runtime for run as od * add run-as-devin to all the runtime tests * handle the case when username is root * move all run-as-devin tests from sandbox; only tests a few cases on different user to save time; * move over multi-line echo related tests to test_runtime * fix user-specific jupyter by fixing the pypoetry virtualenv folder * make plugin's init async; chdir at initialization of jupyter plugin; move ipy simple testcase to test runtime; * support agentskills import in move tests for jupyter pwd tests; overload `add_env_vars` for EventStreamRuntime to update env var also in Jupyter; make agentskills read env var lazily, in case env var is updated; * fix ServerRuntime agentskills issue * move agnostic image test to test_runtime * merge runtime tests in CI * fix enable auto lint as env var * update warning message * update warning message * test for different container images * change parsing output as debug * add exception handling for update_pwd_decorator * fix unit test indentation * add plugins as default input to Runtime class; remove init_sandbox_plugins; implement add_env_var (include jupyter) in the base class; * fix server runtime auto lint * Revert "add exception handling for update_pwd_decorator" This reverts commit `2b668b1506`. * tries to print debugging info for agentskills * explictly setting uid (try fix permission issue) * Revert "tries to print debugging info for agentskills" This reverts commit `8be4c86756`. * set sandbox user id during testing to hopefully fix the permission issue * add browser tools for server runtime * try to debug for old pwd * update debug cmd * only test agnostic runtime when TEST_RUNTIME is Server * fix temp dir mkdir * load TEST_RUNTIME at the beginning * remove ipython tests * only log to file when DEBUG * default logging to project root * temporarily remove log to file * fix LLM logger dir * fix logger * make set pwd an optional aux action * fix prev pwd * fix infinity recursion * simplify * do not import the whole od library to avoid logger folder by jupyter * fix browsing * increase timeout * attempt to fix agentskills yet again * clean up in testcases, since CI maybe run as non-root * add _cause attribute for event.id * remove parent * add a bunch of debugging statement again for CI :( * fix temp_dir fixture * change all temp dir to follow pytest's tmp_path_factory * remove extra bracket * clean up error printing a bit * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * jupyter chdir to self.config.workspace_mount_path_in_sandbox on initialization * add typing for tmp dir fixture * clear the directory before running the test to avoid weird CI temp dir * remove agnostic test case for server runtime * Revert "remove agnostic test case for server runtime" This reverts commit `30e2181c3f`. * disable agnostic tests in CI * fix test --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-07-31 04:30:59 +08:00
tobitege	1eb3bdea95	remove unneeded message about config file not found (#3158 )	2024-07-29 16:27:14 +02:00
Xingyao Wang	b1ea204c5b	Migrate multi-line-bash-related sandbox tests into runtime tests and fix multi-line issue (#3128 ) * Remove global config from memory * Remove runtime global config * Remove from storage * Remove global config * Fix event stream tests * Fix sandbox issue * Change config * Removed transferred tests * Add swe env box * Fixes on testing * Fixed some tests * Merge with stashed changes * Fix typing * Fix ipython test * Revive function * Make temp_dir fixture * Remove test to avoid circular import * fix eventstream filestore for test_runtime * fix parse arg issue that cause integration test to fail * support swebench pull from custom namespace * add back simple tests for runtime * move multi-line bash tests to test_runtime; support multi-line bash for esruntime; * add testcase to handle PS2 prompt * use bashlex for bash parsing to handle multi-line commands; add testcases for multi-line commands * revert ghcr runtime change --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-07-27 20:12:57 +00:00
Engel Nyst	9ed95abf83	Fix max budget per task error in headless mode (#3147 ) * set agent in ERROR instead of PAUSED when in headless mode * fallback to config value for budget	2024-07-27 17:35:40 +00:00
Graham Neubig	275ea706cf	Remove remaining global config (#3099 ) * Remove global config from memory * Remove runtime global config * Remove from storage * Remove global config * Fix event stream tests * Fix sandbox issue * Change config * Removed transferred tests * Add swe env box * Fixes on testing * Fixed some tests * Fix typing * Fix ipython test * Revive function * Make temp_dir fixture * Remove test to avoid circular import	2024-07-26 18:43:32 +00:00
tobitege	d50a8447ad	fix: add llm `drop_params` parameter to LLMConfig (#2471 ) * feat: add drop_params to LLMConfig * Update opendevin/llm/llm.py Fix use of unknown method. Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com> --------- Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>	2024-07-24 16:25:36 +02:00
Xingyao Wang	da17665cab	fix: make max_budget_per_task optional in `run_agent_controller` (#3071 ) * fix: make max_budget_per_task optional in `run_agent_controller` * update arg for each run infer	2024-07-22 21:47:00 -04:00
Graham Neubig	4099e48122	Removed config from agent controller (#3038 ) * Removed config from agent controller * Fix tests * Increase budget * Update tests * Update prompts * Add missing prompt * Fix mistaken deletions * Fix browsing test * Fixed browse tests	2024-07-22 17:42:57 +00:00
Xingyao Wang	ce8a11a62f	[Arch] Shrink runtime image size (#3051 ) * test_runtime_client.py to test _execute_bash() * runtime_build and runtime tweaks * fix in docker script * revert bash changes * use sandbox_config.update_source_code to control source code update * add od_version to the sandbox tag * add doc instruction for update source code * do not remove whole poetry folder; add mamba clean * add missing newlines --------- Co-authored-by: tobitege <tobitege@gmx.de>	2024-07-22 02:34:45 +08:00
Boxuan Li	be6e6e3add	Bug fix: Metrics not accumulated across agent delegation (#3012 ) * Add test to reproduce cost miscalculation bug * Fix metrics bug * Copy metrics upon AgentRejectAction	2024-07-20 04:05:05 +00:00
Xingyao Wang	6b16a5da0b	[Eval,Arch] Update GPTQ eval and add `headless_mode` for Controller (#2994 ) * update and polish gptq eval * fix typo * Update evaluation/gpqa/README.md Co-authored-by: Graham Neubig <neubig@gmail.com> * Update evaluation/gpqa/run_infer.py Co-authored-by: Graham Neubig <neubig@gmail.com> * add headless mode to all appropriate agent controller call * delegate set to error when in headless mode * try to deduplicate a bit * make headless_mode default to True and only change it to false for AgentSession --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-07-20 03:35:48 +00:00
Julian Gums	b5e4cddce3	fix docs links (#3017 ) * fix docs link * Update CONTRIBUTING.md * Update README.md * fix docs link * fix docs link * Update troubleshooting.md * fix docs link * fix docs link * fix docs link * fix docs link	2024-07-19 13:57:33 +00:00
Xingyao Wang	ff6ddc831f	fix: runtime test for mac (#3005 ) * move use_host_network to sandbox config * fix test runtime tests * fix kwargs to make it clearer	2024-07-19 03:03:55 +00:00
Graham Neubig	de74c7a0a1	Convert docs to new URL (#3002 )	2024-07-18 17:04:35 +00:00
Xingyao Wang	cf910dfa9d	fix eval api_key leak in metadata; fix llm config in run infer (#2998 )	2024-07-18 15:46:59 +00:00
Xingyao Wang	135da0ea2b	increase the default retries for LLM (#2986 )	2024-07-18 09:17:12 +08:00
Xingyao Wang	f80ecec772	[Arch] Add tests for `EventStreamRuntime` and fix bash parsing (#2933 ) * deprecating recall action * fix integration tests * fix integration tests * refractor runtime to use async * remove search memory * rename .initialize to .ainit * draft of runtime image building (separate from img agnostic) * refractor runtime build into separate file and add unit tests for it * fix image agnostic tests * move `split_bash_commands` into a separate util file * fix bash pexcept parsing for env * refractor add_env_var from sandbox to runtime; add test runtime for env var, remove it from sandbox; * remove unclear comment * capture broader error * make `add_env_var` handle multiple export at the same time * add multi env var test * fix tests with new config * make runtime tests a separate ci to avoid full disk * Update Runtime README with architecture diagram and detailed explanations * update test * remove dependency of global config in sandbox test * fix sandbox typo * runtime tests does not need ghcr build now * remove download runtime img * remove dependency of global config in sandbox test * fix sandbox typo * try to free disk before running the tests * Update opendevin/runtime/client/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * Update opendevin/runtime/client/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * Update opendevin/runtime/client/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * try to reduce code duplication * Update opendevin/runtime/client/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * Update opendevin/runtime/client/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * Update opendevin/runtime/client/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * Update opendevin/runtime/client/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * Update opendevin/runtime/client/README.md Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> * cleanup before setup * temporarily remove this enable lint test since env var are now handled by runtime * linter --------- Co-authored-by: OpenDevin <opendevin@all-hands.dev> Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>	2024-07-18 06:10:45 +08:00
JeffKatzy	5c438432d6	fix bug in config.py file, update reference to variable (#2984 )	2024-07-17 19:56:07 +00:00
Graham Neubig	c897791024	Refactor LLM config (#2953 ) * Add max_message_chars to LLM * Refactor LLM config * Fix tests * Made some functions class functions * Fix regression * Fixed comments	2024-07-17 09:16:04 -04:00
Graham Neubig	88d53e781f	Remove global config from logger (#2974 ) * Remove global config from loggers * Fix bug	2024-07-17 06:25:26 -04:00
Graham Neubig	257698e89b	Remove global config from sandbox (#2961 ) * Some changes * Fixed errors * Remove duplicate initialize_plugins * Fix some tests * Fix tests	2024-07-16 18:34:04 +00:00
Boxuan Li	e3e437fcc2	Rework --llm-config CLI arg (#2957 )	2024-07-16 04:17:59 +00:00
Anush Kumar V	8f76587e5c	docs: updated docstrings using ruff's autofix feature (#2923 ) * Updated documentation using ruff's autofix feature * Updated pyproject.toml to include docstring validations * Updated documentation using ruff's autofix feature * Updated pyproject.toml to include docstring validations * Updated docstrings using ruff's autfix feature * Deleted opendevin/runtime/utils/soource.py, Keeping in sync with main --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-07-16 01:35:33 +00:00
Xingyao Wang	7e68de746d	arch: refractor eventstream into async (#2907 ) * deprecating recall action * fix integration tests * fix integration tests * refractor runtime to use async * remove search memory * rename .initialize to .ainit	2024-07-12 20:03:28 +00:00
Xingyao Wang	e45ddeb2a2	arch: deprecating recall action and `search_memory` (#2900 ) * deprecating recall action * fix integration tests * fix integration tests * remove search memory	2024-07-12 19:23:21 +00:00
Xingyao Wang	e45d46c993	[Arch] Implement EventStream Runtime Client with Jupyter Support using Agnostic Sandbox (#2879 ) * support loading a particular runtime class via config.runtime (default to server to not break things) * move image agnostic util to shared runtime util * move dependency * include poetry.lock in sdist * accept port as arg for client * make client start server with specified port * update image agnostic utility for eventstream runtime * make client and runtime working with REST API * rename execute_server * add plugin to initialize stuff inside es-runtime; cleanup runtime methods to delegate everything to container * remove redundant ls -alh * fix jupyter * improve logging in agnostic sandbox * improve logging of test function * add read & edit * update agnostic sandbox * support setting work dir at start * fix file read/write test * fix unit test * update tescase * Fix unit test again * fix unit test again again	2024-07-12 01:52:26 +08:00
adragos	5f61885e44	feat: Implement user confirmation mode, request confirmation when running bash/python code in this mode (#2774 ) * [feat] confirmation mode for bash actions * feat: Add modal setting for Confirmation Mode * fix: frontend tests for confirmation mode switch * fix: add missing CONFIRMATION_MODE value in SettingsModal.test.tsx * fix: update test to integrate new setting * feat: Implement user confirmation for running bash/python code * fix: don't display rejected actions * fix: linting, rename/refactor based on feedback * fix: add property only to commands, pass serialization tests * fix: package-lock.json, lint test_action_serialization.py * test: add is_confirmed to integration test outputs --------- Co-authored-by: Mislav Balunovic <mislav.balunovic@gmail.com>	2024-07-11 14:57:21 +03:00
Boxuan Li	c68478f470	Customize LLM config per agent (#2756 ) Currently, OpenDevin uses a global singleton LLM config and a global singleton agent config. This PR allows customers to configure an LLM config for each agent. A hypothetically useful scenario is to use a cheaper LLM for repo exploration / code search, and a more powerful LLM to actually do the problem solving (CodeActAgent). Partially solves #2075 (web GUI improvement is not the goal of this PR)	2024-07-09 22:05:54 -07:00
Engel Nyst	d37b2973b2	Refactoring: event stream based agent history (#2709 ) * add to event stream sync * remove async from tests * small logging spam fix * remove swe agent * arch refactoring: use history from the event stream * refactor agents * monologue agent * ruff * planner agent * micro-agents * refactor history in evaluations * evals history refactoring * adapt evals and tests * unit testing stuck * testing micro agents, event stream * fix planner agent * fix tests * fix stuck after rename * fix test * small clean up * fix merge * fix merge issue * fix integration tests * Update agenthub/dummy_agent/agent.py * fix tests * rename more clearly; add todo; clean up	2024-07-07 21:04:23 +00:00

1 2 3

130 Commits