OpenHands

mirror of https://github.com/All-Hands-AI/OpenHands.git synced 2026-04-29 03:00:45 -04:00

Author	SHA1	Message	Date
Xingyao Wang	602ffcdffb	Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941 ) * add draft for skills * Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file * Remove new_sample.txt file * add some work from opendevin w/ fixes * Add unit tests for agentskills module * fix some issues and updated tests * add more tests for open * tweak and handle goto_line * add tests for some edge cases * add tests for scrolling * add tests for edit * add tests for search_dir * update tests to use pytest * use pytest --forked to avoid file op unit tests to interfere with each other via global var * update doc based on swe agent tool * update and add tests for find_file and search_file * move agent_skills to plugins * add agentskills as plugin and docs * add agentskill to ssh box and fix sandbox integration * remove extra returns in doc * add agentskills to initial tool for jupyter * support re-init jupyter kernel (for agentskills) after restart * fix print window's issue with indentation and add testcases * add prompt for codeact with the newest edit primitives * modify the way line number is presented (remove leading space) * change prompt to the newest display format * support tracking of costs via metrics * Update opendevin/runtime/plugins/agent_skills/README.md * Update opendevin/runtime/plugins/agent_skills/README.md * implement and add tests for py linting * remove extra text arg for incompatible subprocess ver * remove sample.txt * update test_edits integration tests * fix all integration * Update opendevin/runtime/plugins/agent_skills/README.md * Update opendevin/runtime/plugins/agent_skills/README.md * Update opendevin/runtime/plugins/agent_skills/README.md * Update agenthub/codeact_agent/prompt.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * Update agenthub/codeact_agent/prompt.py Co-authored-by: Boxuan Li 
<liboxuan@connect.hku.hk> * Update agenthub/codeact_agent/prompt.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * Update opendevin/runtime/plugins/agent_skills/agentskills.py Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * correctly setup plugins for swebench eval * bump swe-bench version and add logging * correctly setup plugins for swebench eval * bump swe-bench version and add logging * Revert "correctly setup plugins for swebench eval" This reverts commit `2bd1055673`. * bump version * remove _AGENT_SKILLS_DOCS * move flake8 to test dep * update poetry.lock * remove extra arg * reduce max iter for eval * update poetry * fix integration tests --------- Co-authored-by: OpenDevin <opendevin@opendevin.ai> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-23 16:04:09 +00:00
Engel Nyst	0eccf31604	Refactor monologue and SWE agent to use the messages in state history (#1863 ) * Refactor monologue to use the messages in state history * add messages, clean up * fix monologue * update integration tests * move private method * update SWE agent to use the history from State * integration tests for SWE agent * rename monologue to initial_thoughts, since that is what it is	2024-05-23 07:29:12 +00:00
Boxuan Li	acb430eef5	Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888 ) * Add MacOS to integration tests * Switch back to python 3.11 * Install Docker for macos pipeline * regenerate.sh: Use environmental variable for sandbox type * Pack different agents' tests into a single check * Fix CodeAct tests * Reduce file match and extensive debug logs * Add TEST_IN_CI mode that reports codecov * Small fix: don't quit if reusing old responses failed * Merge codecov results * Fix typos * Remove coverage merge step - codecov automatically does that * Make mac integration tests as optional - too slow * Fix codecov args * Add comments in yaml * Include sandbox type in codecov report name * Fix codecov report merge * Revert renaming of test_matrix_success * Remove SWEAgent and PlannerAgent from tests * Mark planner agent and SWE agent as deprecated * CodeCov: Ignore planner and sweagent * Revert "Remove SWEAgent and PlannerAgent from tests" This reverts commit `040cb3bfb9`. * Remove all tests for SWE Agent * Only keep basic tests for MonologueAgent and PlannerAgent * Mark SWE Agent as deprecated, and ignore code coverage for it --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-05-22 20:38:57 -07:00
Engel Nyst	46352e890b	Logging security (#1943 ) * update .gitignore * Rename the confusing 'INFO' style to 'DETAIL' * override str and repr * feat: api_key desensitize * feat: add SensitiveDataFilter in file handler * tweak regex, add tests * more tweaks, include other attrs * add env vars, those with equivalent config * fix tests * tests are invaluable --------- Co-authored-by: Shimada666 <649940882@qq.com>	2024-05-22 18:27:38 +02:00
Robert Brennan	0ecba83e53	Move message history out of CodeAct (#1847 ) * stop keeping history state in codeact * regenerate tests * Update agenthub/codeact_agent/codeact_agent.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * revert tests * regen tests * refactor codeact a bit * regenerate without using LLM * simplify logic * change to heredoc * fix heredoc * fix end_of_edit docs * regen tests * regenerate --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-18 18:39:27 +00:00
மனோஜ்குமார் பழனிச்சாமி	b0b44ed467	Auto restarted Jupyter kernel (#1808 ) Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-18 08:40:31 +05:30
Boxuan Li	735fbbfe3e	(test) Include message separators in mock prompts (#1855 ) * Add message separator to prompts in tests * DEMO: remove existing prompts for PlannerAgent * Add results after prompt regeneration	2024-05-18 00:33:55 +02:00
Robert Brennan	110b878dd9	fix up serialization and deserialization of events (#1850 ) * fix up serialization and deserialization of events * fix tests * remove prints * fix test * regenerate tests * add try blocks	2024-05-17 01:09:15 +00:00
Boxuan Li	b6ff201780	Refactor integration test framework and relieve the pain of regeneration (#1818 ) * Update README.md * Fix WORKSPACE_MOUNT_PATH_IN_SANDBOX variable in regenerate.sh * Regenerate prompts without calling real LLM * Disable pytest warning capture * Change planner agent prompt by a bit for demo * Regenerate prompt files following prompt changes * doc: elaborate on FORCE_USE_LLM * Add another prompt change to monologue_agent for demo purpose * Regenerate prompts with FORCE_USE_LLM=true --------- Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>	2024-05-16 08:30:29 -07:00
Frank Xu	a84d19f03c	Enable CodeAct agents with browsing, and also enable arbitrary BrowserGym action support (#1807 ) * enable browsing in codeact, and arbitrary browsergym DSL support * fix * fix unit test case * update frontend for the new interactive browsing action * bump ver * Fix integration tests --------- Co-authored-by: OpenDevinBot <bot@opendevin.com>	2024-05-15 11:59:58 -04:00
Xia Zhenhua	bf14b47890	feat: make other agents support asking user input in MessageAction. (#1777 ) * feat: make other agents support asking user input in MessageAction. * Update agenthub/micro/_instructions/actions/message.md Co-authored-by: Robert Brennan <accounts@rbren.io> * Update agenthub/micro/_instructions/actions/message.md Co-authored-by: Robert Brennan <accounts@rbren.io> * feat: make other agents support asking user input in MessageAction. * Regenerate test artifacts --------- Co-authored-by: aaren.xzh <aaren.xzh@antfin.com> Co-authored-by: Robert Brennan <accounts@rbren.io> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-15 00:44:45 -07:00
Boxuan Li	6714000b2c	CodeActAgent: Fix iteration reminder (#1803 ) This PR includes three changes: 1) Iteration reminder should start with MAX_ITERATIONS from config rather than default value 100 2) In the first prompt, we should tell the LLM it has `MAX_ITERATIONS - 1` turns left, rather than `MAX_ITERATIONS - 2` 3) Remove legacy ITERATION_REMINDER config	2024-05-15 13:48:47 +08:00
Xingyao Wang	d1fd277ad4	Support return final task states for evaluation (#1755 ) * support returning states at the end of controller * remove return None * fix issue of overriding final state * return the final state on close * merge AgentState with State * fix integration test * add ChangeAgentStateAction to history in attempt to fix integration * add back set agent state * update tests * update tests * directly return get state * add back the missing .close() * Update typo in opendevin/core/main.py --------- Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-15 03:43:01 +00:00
Graham Neubig	3cef8ee187	Add GitHub prompt to CodeAct (#1792 ) * Added github to CodeAct * More codeact * Simplify prompt * Modify codeact prompt * fix integration test for CodeAct * yet another integration test fix for codeact * fix plugin use in jupyter * update edit tests * fix jupyter plugin potential port conflict * fix test ipython with latest ipython fix * update integration test * wait a bit for jupyter execution * add one unit tests for sandbox * fix integration test --------- Co-authored-by: OpenDevinBot <bot@opendevin.com> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>	2024-05-14 21:25:21 +00:00
Xingyao Wang	8d8ed0c3be	hotfix: Initialize plugin with new runtime (#1795 ) * fix plugin use in jupyter * fix jupyter plugin potential port conflict * update integration test * wait a bit for jupyter execution * add one unit tests for sandbox * fix integration test * fix integration * fix integration yet again * init sandbox plugins in the server	2024-05-14 21:15:19 +00:00
Robert Brennan	dcb5d1ce0a	Add permanent storage option for EventStream (#1697 ) * add storage classes * add minio * add event stream storage * storage test working * use fixture * event stream test passing * better serialization * factor out serialization pkg * move more serialization * fix tests * fix test * remove __all__ * add rehydration test * add more rehydration test * fix fixture * fix dict init * update tests * lock * regenerate tests * Update opendevin/events/stream.py * revert tests * revert old integration tests * only add fields if present * regen tests * pin pyarrow * fix unit tests * remove cause from memories * revert tests * regen tests	2024-05-14 11:09:45 -04:00
Robert Brennan	beb74a19f6	Use event stream for the runtime (#1776 ) * rebuild PR from scratch * fix max_iter * regenerate tests * cut down on history * Update opendevin/controller/agent_controller.py * regenerate tests * revert swe agent * revert some codeact chagnes * regenerate tests * add source to dict * only add source if not none * try to fix coverage issue * lock * add gevent	2024-05-14 13:35:25 +00:00
Robert Brennan	82a798990c	refactor remind_iterations (#1760 ) * refactor remind_iterations * regenerate tests * concatenate iteration message * fix merge issues * update integration tests	2024-05-14 08:27:12 -04:00
Boxuan Li	3d53d363b4	Integration test: Verify finish state & add auto-rerun in regenerate.sh (#1773 ) * regenerate.sh: Allow testing on a specific agent and/or test * Check agent finish state * rengerate.sh: Rerun after fixing the prompts * Fix SWEAgent test_write_simple_script * Add more help message * Add a known issue to README.md * regenerate.sh: Fix help message typo * Fix a typo in README	2024-05-14 03:50:29 -04:00
Boxuan Li	b84f25ab35	Integration test: exit if no prompt match (#1772 )	2024-05-13 20:03:09 -07:00
Robert Brennan	b028bd46bb	Use messages to drive tasks (#1688 ) * finish is working * start reworking main_goal * remove main_goal from microagents * remove main_goal from other agents * fix issues * revert codeact line * make plan a subclass of task * fix frontend for new plan setup * lint * fix type * more lint * fix build issues * fix codeact mgs * fix edge case in regen script * fix task validation errors * regenerate integration tests * fix up tests * fix sweagent * revert codeact prompt * update integration tests * update integration tests * handle loading state * Update agenthub/codeact_agent/codeact_agent.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * Update opendevin/controller/agent_controller.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * Update agenthub/codeact_agent/codeact_agent.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * Update opendevin/controller/state/plan.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * update docs * regenerate tests * remove none from state type * revert test files * update integration tests * rename plan to root_task * revert plugin perms * regen integration tests * tweak integration script * prettier * fix test * set workspace up for regeneration * regenerate tests * Change directory of copy * Updated tests * Disable PlannerAgent test * Fix listen * Updated prompts * Disable planner again * Make codecov more lenient * Update agenthub/README.md * Update opendevin/server/README.md * re-enable planner tests * finish top level tasks * regen planner * fix root task factory --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-13 23:14:15 +00:00
Robert Brennan	e28b3ef9e8	Fix integration tests (#1764 ) * refactor remind_iterations * regenerate tests * concatenate iteration message * add some helpers to the tests * regenerate tests * add to logs * regenerate tests * add debug info * fix exit_on_message * fix regen script * regenerate tests * Revert "Merge branch 'rb/test-regen' of ssh://github.com/opendevin/opendevin into rb/test-regen" This reverts commit `b9cd1acbf2`, reversing changes made to `c888285304`. * remove prints * revert files * revert more * revert more * regenerate for the last time I hope * add back remind_iter * regenerate * add back remind_iter * regenerate * fix remind_iter * regenerate yet again * regen * remove comment * regen again	2024-05-13 18:08:59 -04:00
Graham Neubig	b13d4647ab	Print out the regenerate command (#1759 ) * Print out the output of the regenerate command * Update regenerate.sh	2024-05-13 18:43:58 +00:00
Boxuan Li	eba5ef8e67	Fix test_ipython (#1750 )	2024-05-12 16:15:32 -07:00
Xingyao Wang	4db4a84e2e	Simply Jupyter execution via heredoc (#1728 ) * simply jupyter execution via heredoc * make sure /tmp always exists * add integration test for jupyter exec	2024-05-13 04:57:06 +08:00
Boxuan Li	49de262577	opendevin/core/main.py: Graceful shutdown (#1731 ) * opendevin/core/main.py: Graceful shutdown * Shutdown controller at exit * Update opendevin/core/main.py --------- Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-05-12 13:56:35 -07:00
Xingyao Wang	8bfae8413e	Support passing sandbox as argument and iteration reminder (#1730 ) * support custom sandbox; add iteration_reminder * Enable iteration reminder in CodeActAgent integration test * Don't remove numbers when comparing prompts * Update tests/integration/README.md --------- Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-12 07:57:33 +00:00
Boxuan Li	316a772849	CodeAct: Emphasize open before edit (#1709 ) Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>	2024-05-11 12:20:14 -07:00
Boxuan Li	bde12f4a09	CodeActAgent: Fix hack for multiple edits in same command (#1684 ) * Fix edit hack for multiple edits in same command This PR changes ([\s\S]*) to ([\s\S]*?) to make the capturing group non-greedy. This change ensures that the regex captures the smallest set of characters that extends up to the first end_of_edit it encounters, rather than extending across multiple edit commands. Without the fix, a bash command consisting of multiple edits would be corrupt and lead to unexpected edit results.	2024-05-10 23:32:09 -07:00
Robert Brennan	26d82841d5	Create runtime implementation (#1626 ) * first pass at moving runtime * fix import * remove github refs * remove unnecessary import * remove unnecessary import * add e2b * refactor read and write file ops * remove github test * rm action * revert permissions * regenerate tests * re-delete file operations * regenerate integration tests * Update opendevin/runtime/runtime.py Co-authored-by: Graham Neubig <neubig@gmail.com> * fix ref * add docs * remove logspam --------- Co-authored-by: Graham Neubig <neubig@gmail.com>	2024-05-09 19:04:49 -04:00
Engel Nyst	446eaec1e6	Refactor config to dataclasses (#1552 ) * mypy is invaluable * fix config, add test * Add new-style toml support * add singleton, small doc fixes * fix some cases of loading toml, clean up, try to make it clearer * Add defaults_dict for UI * allow config to be mutable error handling fix toml parsing * remove debug stuff * Adapt Makefile * Add defaults for temperature and top_p * update to CodeActAgent * comments * fix unit tests * implement groups of llm settings (CLI) * fix merge issue * small fix sandboxes, small refactoring * adapt LLM init to accept overrides at runtime * reading config is enough * Encapsulate minimally embeddings initialization * agent bug fix; fix tests * fix sandboxes tests * refactor globals in sandboxes to properties	2024-05-09 22:48:29 +02:00
Boxuan Li	a60a6a40d6	Only regenerate integratio tests for failed ones (#1661 )	2024-05-09 09:32:00 -04:00
Xingyao Wang	21fe8dc1eb	Align codeact with swebench eval (#1612 ) * align codeact agent with the slight adjustment on eval branch * update integration test for new prompt * Regenerate test artifacts for CodeActAgent --------- Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-09 00:42:07 -07:00
Robert Brennan	09e8b11451	Update integration test instructions (#1645 ) * Update README.md * Update tests/integration/README.md * Apply suggestions from code review --------- Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>	2024-05-09 02:56:33 +00:00
Robert Brennan	242c4a0df6	Remove extra message actions (#1608 ) * remove extra actions * remove message observations * support null obs * handle null obs * fix frontend for changes * fix the way messages flow to the UI * change think to message * add regen script * regenerate all integration tests * change task * remove gh test * fix messages * fix tests * help agent exit after hitting max iter * Update opendevin/events/observation/success.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * Update agenthub/codeact_agent/codeact_agent.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-05-07 21:13:08 +00:00
Xingyao Wang	356caf0960	Fix the issue of newly import package by including instruction for Kernel restart (#1609 ) * fix the issue of newly import package by including instruction for kernel restart * fix integration test for new prompt * fix integration yet again	2024-05-07 06:11:05 +08:00
Robert Brennan	fadcdc117e	Migrate to new folder structure in preparation for refactor (#1531 ) * fix up folder structure * update docs * fix imports * fix imports * fix imoprt * fix imports * fix imports * fix imports * fix test import * fix tests * fix main import	2024-05-02 17:01:54 +00:00
Xingyao Wang	435f47ca0e	Improve the both frontend and backend for CodeActAgent (#1494 ) * improve the both frontend and backend for CodeActAgent * fix linter * update integration test	2024-05-02 02:07:40 +08:00
Xingyao Wang	1c7cdbefdd	feat(CodeActAgent): Support Agent-User Interaction during Task Execution and the Full Integration of CodeActAgent (#1290 ) * initialize plugin definition * initialize plugin definition * simplify mixin * further improve plugin mixin * add cache dir for pip * support clean up cache * add script for setup jupyter and execution server * integrate JupyterRequirement to ssh_box * source bashrc at the end of plugin load * add execute_cli that accept code via stdin * make JUPYTER_EXEC_SERVER_PORT configurable via env var * increase background cmd sleep time * Update opendevin/sandbox/plugins/mixin.py Co-authored-by: Robert Brennan <accounts@rbren.io> * add mixin to base class * make jupyter requirement a dataclass * source plugins only when >0 requirements * add `sandbox_plugins` for each agent & have controller take care of it * update build.sh to make logs available in /opendevin/logs * switch to use config for lib and cache dir * Add SANDBOX_WORKSPACE_DIR into config * Add SANDBOX_WORKSPACE_DIR into config * fix occurence of /workspace * fix permission issue with /workspace * use python to implement execute_cli to avoid stdin escape issue * add IPythonRunCellAction and get it working * wait until jupyter is avaialble * support plugin via copying instead of mounting * add agent talk action * support follow-up user language feedback * add __str__ for action to be printed better * only print PLAN at the beginning * wip: update codeact agent * get rid the initial messate * update codeact agent to handle null action; add thought to bash * dispatch thought for RUN action as well * fix weird behavior of pxssh where the output would not flush correctly * make ssh box can handle exit_code properly as well * add initial version of swe-agent plugin; * rename swe cursors * split setup script into two and create two requirements * print SWE-agent command documentation * update swe-agent to default to no custom docs * add initial version of swe-agent plugin; 
* rename swe cursors * split setup script into two and create two requirements * print SWE-agent command documentation * update swe-agent to default to no custom docs * update dockerfile with dependency from swe-agent * make env setup a separate script for .bashrc source * add wip prompt * fix mount_dir for ssh_box * update prompt * fix mount_dir for ssh_box * default to use host network * default to use host network * move prompt to a separate file * fix swe-tool plugins; add missing _split_string * remove hostname from sshbox * update the prompt with edit functionality * fix swe-tool plugins; add missing _split_string * add awaiting into status bar * fix the bug of additional send event * remove some print action * move logic to config.py * remove debugging comments * make host network as default * make WORKSPACE_MOUNT_PATH as abspath * implement execute_cli via file cp * Revert "implement execute_cli via file cp" This reverts commit `06f0155bc1`. * add codeact dependencies to default container * add IPythonRunCellObservation * add back cache dir and default to /tmp * make USE_HOST_NETWORK a bool * revert use host network to false * add temporarily fix for IPython RUN action * update prompt * revert USE_HOST_NETWORK to true since it is not affecting anything * attempt to fix lint * remove newline * fix jupyter execution server * add `thought` to most action class * fix unit tests for current action abstraction * support user exit * update test cases with the latest action format (added 'thought') * fix integration test for CodeActAGent by mocking stdin * only mock stdin for tests with user_responses.log * remove -exec integration test for CodeActAgent since it is not supported * remove specific stop word * fix comments * improve clarity of prompt * fix py lint * fix integration tests * sandbox might failed in chown due to mounting, but it won't be fatal * update debug instruction for sshbox * fix typo * get RUN_AS_DEVIN and network=host working with app sandbox * 
get RUN_AS_DEVIN and network=host working with app sandbox * attempt to fix the workspace base permission * sandbox might failed in chown due to mounting, but it won't be fatal * update sshbox instruction * remove default user id since it will be passed in the instruction * revert permission fix since it should be resolved by correct SANDBOX_USER_ID * the permission issue can be fixed by simply provide correct env var * remove log * set sandbox user id to getuid by default * move logging to initializer * make the uid consistent across host, app container, and sandbox * remove hostname as it causes sudo issue * fix permission of entrypoint script * make the uvicron app run as host user uid for jupyter plugin * add warning message * update dev md for instruction of running unit tests * add back unit tests * revert back to the original sandbox implementation to fix testcases * revert use host network * get docker socket gid and usermod instead of chmod 777 * allow unit test workflow to find docker.sock * make sandbox test working via patch * fix arg parser that's broken for some reason * try to fix app build disk space issue * fix integration test * Revert "fix arg parser that's broken for some reason" This reverts commit `6cc8961133`. * update Development.md * cleanup intergration tests & add exception for CodeAct+execbox * fix config * implement user_message action * fix doc * fix event dict error * fix frontend lint * revert accidentally changes to integration tests * revert accidentally changes to integration tests --------- Co-authored-by: Robert Brennan <accounts@rbren.io> Co-authored-by: Robert Brennan <contact@rbren.io>	2024-05-01 08:40:00 -04:00
Jirka Borovec	0c2ebfd6e1	Ruff: use I rule for isort (#1410 ) Ruff: use I rule for isort	2024-04-29 15:41:58 -07:00
Christian Balcom	24b71927c3	fix(backend) changes to improve Command-R+ behavior, plus file i/o error improvements, attempt 2 (#1417 ) * Some improvements to prompts, some better exception handling for various file IO errors, added timeout and max return token configurations for the LLM api. * More monologue prompt improvements * Dynamically set username provided in prompt. * Remove absolute paths from llm prompts, fetch working directory from sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt. * Switched ssh_box to disabling tty echo and removed the logic attempting to delete it from the response afterwards, fixed get_working_directory for ssh_box. * Update prompts in integration tests to match monologue agent changes. * Minor tweaks to make merge easier. * Another minor prompt tweak, better invalid json handling. * Fix lint error * More catch-up to fix lint errors introduced by merge. * Force WORKSPACE_MOUNT_PATH_IN_SANDBOX to match WORKSPACE_MOUNT_PATH in local sandbox mode, combine exception handlers in prompts.py. --------- Co-authored-by: Jim Su <jimsu@protonmail.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-04-28 21:58:53 -04:00
Graham Neubig	a5f61caae9	Add actions for github push and send PR (#1415 ) * Added a push action * Tests * Add tests * Fix capitalization * Update * Fix typo * Fix integration tests * Added poetry.lock * Set lock * Fix action parsing * Update integration test output * Updated prompt * Update integration test * Add github token to default config --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-04-29 00:56:23 +00:00
Graham Neubig	0d224de369	Fixed typo in integration tests README.md (#1438 ) * Fixed typo in integration tests README.md * Update tests/integration/README.md --------- Co-authored-by: ThoughtfulRobot <robot@example.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2024-04-28 20:44:58 -04:00
Robert Brennan	9c9aee29f0	Revert "fix(backend) changes to improve Command-R+ behavior, plus file i/o er…" (#1405 ) This reverts commit `44aea95dde`.	2024-04-27 08:57:04 -04:00
Christian Balcom	44aea95dde	fix(backend) changes to improve Command-R+ behavior, plus file i/o error improvements. (#1347 ) * Some improvements to prompts, some better exception handling for various file IO errors, added timeout and max return token configurations for the LLM api. * More monologue prompt improvements * Dynamically set username provided in prompt. * Remove absolute paths from llm prompts, fetch working directory from sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt. * Switched ssh_box to disabling tty echo and removed the logic attempting to delete it from the response afterwards, fixed get_working_directory for ssh_box. * Update prompts in integration tests to match monologue agent changes. * Minor tweaks to make merge easier. * Another minor prompt tweak, better invalid json handling. * Fix lint error * More catch-up to fix lint errors introduced by merge. --------- Co-authored-by: Jim Su <jimsu@protonmail.com> Co-authored-by: Robert Brennan <accounts@rbren.io>	2024-04-27 11:58:34 +00:00
Boxuan Li	e7b5ddfe06	Add integration test framework with mock llm (#1301 ) * Add integration test framework with mock llm * Fix MonologueAgent and PlannerAgent tests * Remove adhoc logging * Use existing logs * Fix SWEAgent and PlannerAgent * Check-in test log files * conftest: look up under test name folder only * Add docstring to conftest * Finish dev doc * Avoid non-determinism * Remove dependency on llm embedding model * Init embedding model only for MonologueAgent * Add adhoc fix for sandbox discrepancy * Test ssh and exec sandboxes * CI: fix missing sandbox type * conftest: Remove hack * Reword comment for TODO	2024-04-25 10:56:53 -04:00

46 Commits
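
The greedy versus non-greedy behavior described in commit bde12f4a09 can be illustrated with a minimal Python sketch. The `edit N:M` header, the sample command text, and the helper name `edit_bodies` are assumptions made for illustration; only the `end_of_edit` terminator and the two capture groups come from the commit message, and the pattern used in the repository may differ.

```python
import re

# Hypothetical bash command containing two edits. The "edit N:M" header is an
# assumed syntax; "end_of_edit" is the terminator named in the commit message.
COMMAND = (
    "edit 1:1\nfirst replacement\nend_of_edit\n"
    "edit 5:5\nsecond replacement\nend_of_edit\n"
)

# Greedy body group: ([\s\S]*) runs to the LAST end_of_edit in the string.
GREEDY = re.compile(r"edit (\d+):(\d+)\n([\s\S]*)end_of_edit")
# Non-greedy body group: ([\s\S]*?) stops at the FIRST end_of_edit it reaches.
NON_GREEDY = re.compile(r"edit (\d+):(\d+)\n([\s\S]*?)end_of_edit")


def edit_bodies(pattern: re.Pattern, text: str) -> list[str]:
    """Return the captured body of every edit command the pattern finds."""
    return [match.group(3) for match in pattern.finditer(text)]


greedy_bodies = edit_bodies(GREEDY, COMMAND)
lazy_bodies = edit_bodies(NON_GREEDY, COMMAND)

print(len(greedy_bodies))  # 1: both edits were swallowed into a single match
print(len(lazy_bodies))    # 2: each edit is matched separately
```

With the greedy group, the single captured body contains the first `end_of_edit` and the whole second edit command, which is the corruption the commit message describes. The non-greedy group yields one clean body per edit.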