17 Commits

Author SHA1 Message Date
Boxuan Li
9b371b1b5f Refactor agent delegation and tweak micro agents (#1910)
This PR fixes #1897. In addition, this PR fixes and tweaks a few micro-agents.

For the first time, I am able to use ManagerAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds ManagerAgent as part of integration tests. test_write_simple_script involves delegation to CoderAgent while test_edits involves delegation to TypoFixerAgent.

Also for the first time, I am able to use DelegateAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds DelegateAgent as part of integration tests. It involves delegation to StudyRepoForTaskAgent, CoderAgent and VerifierAgent.

This PR is a blocker for #1735 and likely #1945.
2024-05-28 20:01:16 -07:00
Yufan Song
d18e6c85a0 feat: add metrics related to cost for better observability (#1944)
* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-22 08:53:31 +00:00
Boxuan Li
b845a38169 Small improvements & fixes to SWE-Bench (#1874)
I was able to run a few benchmark instances from SWE-Bench by myself following the documentation - it was great! In general the experience was smooth, thanks to @xingyaoww, @libowen2121 and the team! I made a few small enhancements and fixes to further improve the developer experience.

Always use poetry run python (using python from poetry's virtual environment) over python or python3 in scripts to make sure the behavior is consistent.
Make AGENT configurable. One can use an argument to control which agent they would like to benchmark. To facilitate this, I removed hardcoded CodeActAgent from run_infer.sh, and also added VERSION attribute to all agents, as the benchmark needs to record the agent version.
Make EVAL_LIMIT configurable. One can use an argument to control how many instances they'd like to benchmark. Useful for debugging & development purposes.
Fix 'eval_output_dir' not defined error in run_infer.py.
Other enhancements to the README file and logs.
I also notice that a lot of code from run_infer.py could be shared by other benchmarks, but since we only have one benchmark now, I think we could avoid over-engineering. A refactor and code dedup would be useful in the future once we have more benchmarks, though.
2024-05-20 08:03:30 +00:00
Xia Zhenhua
76abca361c feat: simplify state.history with to_memory call in micro-agent. Or the call to LLM may exceed the token limit. (#1806)
* feat: simplify state.history with to_memory call in micro-agent.

* feat: merge master and replace to_memory with event_to_memory.

---------

Co-authored-by: aaren.xzh <aaren.xzh@antfin.com>
2024-05-15 14:47:37 +02:00
Robert Brennan
dcb5d1ce0a Add permanent storage option for EventStream (#1697)
* add storage classes

* add minio

* add event stream storage

* storage test working

* use fixture

* event stream test passing

* better serialization

* factor out serialization pkg

* move more serialization

* fix tests

* fix test

* remove __all__

* add rehydration test

* add more rehydration test

* fix fixture

* fix dict init

* update tests

* lock

* regenerate tests

* Update opendevin/events/stream.py

* revert tests

* revert old integration tests

* only add fields if present

* regen tests

* pin pyarrow

* fix unit tests

* remove cause from memories

* revert tests

* regen tests
2024-05-14 11:09:45 -04:00
Robert Brennan
b028bd46bb Use messages to drive tasks (#1688)
* finish is working

* start reworking main_goal

* remove main_goal from microagents

* remove main_goal from other agents

* fix issues

* revert codeact line

* make plan a subclass of task

* fix frontend for new plan setup

* lint

* fix type

* more lint

* fix build issues

* fix codeact mgs

* fix edge case in regen script

* fix task validation errors

* regenerate integration tests

* fix up tests

* fix sweagent

* revert codeact prompt

* update integration tests

* update integration tests

* handle loading state

* Update agenthub/codeact_agent/codeact_agent.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update opendevin/controller/agent_controller.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update agenthub/codeact_agent/codeact_agent.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update opendevin/controller/state/plan.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* update docs

* regenerate tests

* remove none from state type

* revert test files

* update integration tests

* rename plan to root_task

* revert plugin perms

* regen integration tests

* tweak integration script

* prettier

* fix test

* set workspace up for regeneration

* regenerate tests

* Change directory of copy

* Updated tests

* Disable PlannerAgent test

* Fix listen

* Updated prompts

* Disable planner again

* Make codecov more lenient

* Update agenthub/README.md

* Update opendevin/server/README.md

* re-enable planner tests

* finish top level tasks

* regen planner

* fix root task factory

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-13 23:14:15 +00:00
Engel Nyst
e5f1dbf5e7 Move json utility to the custom json parsing; apply it to the monologue-like agents (#1740) 2024-05-12 13:39:38 -04:00
Engel Nyst
98adbf54ec Small refactoring (#1614)
* move MemoryCondenser, LongTermMemory, json, out of the monologue

* PlannerAgent and Microagents use the custom json.loads/dumps

* Move short term history out of monologue agent...

* move memory in their package

* add __init__
2024-05-11 17:15:19 +02:00
Jim Su
f8d4b1ab0d Use generic types (#1680) 2024-05-10 04:21:22 +02:00
Aleksandar
4bf4119259 Introduce TypoFixerAgent for in-place typo corrections in agenthub/micro (#1613)
* Add TypoFixerAgent micro-agent to fix typos

* Improve parse_response to accurately extract the first complete JSON object

* Add tests for parse_response function handling complex scenarios

* Fix tests and logic to use action_from_dict

* Fix small formatting issues
2024-05-07 13:25:35 +02:00
Frank Xu
26dcf4fd7c remove screenshot in browser observation (#1588)
* remove screenshot in browser observation

* refactor utils

* allow only dict

* fix screenshot not showing up in frontend

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-06 13:56:28 -04:00
Graham Neubig
74e159add6 Remove screenshot from microagent prompt (#1550)
* Remove screenshot from microagent prompt

* Update recursive search

* Update to handle various data types
2024-05-03 09:34:39 -04:00
Robert Brennan
fadcdc117e Migrate to new folder structure in preparation for refactor (#1531)
* fix up folder structure

* update docs

* fix imports

* fix imports

* fix imoprt

* fix imports

* fix imports

* fix imports

* fix test import

* fix tests

* fix main import
2024-05-02 17:01:54 +00:00
Robert Brennan
ce7c7eaae4 Refactor actions and observations (#1479)
* refactor actions and events

* remove type_key

* remove stream

* move import

* move import

* fix NullObs

* reorder imports

* fix lint

* fix dataclasses

* remove blank fields

* fix nullobs

* fix sidebar labels

* fix test compilation

* switch to asdict

* lint

* fix whitespace

* fix executable

* delint

* fix run

* remove NotImplementeds

* fix path prefix

* remove null files

* add debug

* add more debug info

* fix dataclass on null

* remove debug

* revert sandbox

* fix merge issues

* fix tyeps

* Update opendevin/events/action/browse.py
2024-05-02 15:44:54 +00:00
Jirka Borovec
0c2ebfd6e1 Ruff: use I rule for isort (#1410)
Ruff: use I rule for isort
2024-04-29 15:41:58 -07:00
Boxuan Li
319b9ac0f3 Fix micro-agents schema bug (#1424)
* Fix micro agents definitions

* Add tests for micro agents

* Add to CI

* Revert "Add to CI"

This reverts commit 94f3b4e7c8.

* Remove test artifacts for ManagerAgent

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-29 10:34:03 -04:00
Robert Brennan
1e95fa435d Microagents and Delegation (#1238)
* basic microagent structure

* start on jinja

* add instructions parser

* add action instructions

* add history instructions

* fix a few issues

* fix a few issues

* fix issues

* fix agent encoding

* fix up anon class

* prompt to fix errors

* less debug info when errors happen

* add another traceback

* add output to finish

* fix math prompt

* fix pg prompt

* fix up json prompt

* fix math prompt

* fix math prompt

* fix repo prompt

* fix up repo explorer

* update lock

* revert changes to agent_controller

* refactor microagent registration a bit

* create delegate action

* delegation working

* add finish action to manager

* fix tests

* rename microagents registry

* rename fn

* logspam

* add metadata to manager agent

* fix message

* move repo_explorer

* add delegator agent

* rename agent_definition

* fix up input-output plumbing

* fix tests

* Update agenthub/micro/math_agent/agent.yaml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/delegator_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/delegator_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* remove prompt.py

* fix lint

* Update agenthub/micro/postgres_agent/agent.yaml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/micro/postgres_agent/agent.yaml

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix error

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-24 17:46:14 -04:00