* add event to stream before budget check
* make the budget check before the step
* Update opendevin/controller/agent_controller.py
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
1. Add support for rejection action on frontend
2. Show users the reason for rejection
3. Get rid of weird empty box after delegation
4. On web GUI, show users when a delegation starts and ends
* Fix AgentRejectAction handling
* Add ManagerAgent to integration tests
* Fix regenerate.sh
* Fix merge
* Update README for micro-agents
* Add test reject to regenerate.sh
* regenerate.sh: Add support for running a specific test and/or agent
* Refine reject schema, and allow ManagerAgent to handle reject
* Add test artifacts for test_simple_task_rejection
* Fix manager agent tests
* Fix README
* test_simple_task_rejection: check final agent state
* Integration test: exit if mock prompt not found
* Update test_simple_task_rejection tests
* Fix test_edits test artifacts after prompt update
* Fix ManagerAgent test_edits
* WIP
* Fix tests
* update test_edits for ManagerAgent
* Skip local sandbox for reject test
* Fix test comparison
This PR fixes #1897. In addition, this PR fixes and tweaks a few micro-agents.
For the first time, I am able to use ManagerAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds ManagerAgent as part of integration tests. test_write_simple_script involves delegation to CoderAgent while test_edits involves delegation to TypoFixerAgent.
Also for the first time, I am able to use DelegateAgent to complete test_write_simple_script and test_edits tasks in integration tests, so this PR also adds DelegateAgent as part of integration tests. It involves delegation to StudyRepoForTaskAgent, CoderAgent and VerifierAgent.
This PR is a blocker for #1735 and likely #1945.
* feat: add max_budget_per_task configuration to control task cost
* Fix test_arg_parser.py
* Use the config.max_budget_per_task as default value
* Add max_budget_per_task to core/main.py as well
* Update opendevin/controller/agent_controller.py
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
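
For illustration only, a minimal sketch of how a per-task budget limit such as `max_budget_per_task` might gate an agent loop. Only the option name comes from this change. The `BudgetedLoop` class, its fields, and the ordering (record the event, check the budget, then run the step) are assumptions made for this example, not the actual code in opendevin/controller/agent_controller.py.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BudgetedLoop:
    """Hypothetical stand-in for a controller that tracks task cost."""

    # None means "no budget limit" (assumption for this sketch).
    max_budget_per_task: Optional[float] = None
    accumulated_cost: float = 0.0
    events: List[str] = field(default_factory=list)

    def budget_exceeded(self) -> bool:
        """Return True once the accumulated cost is above the configured limit."""
        return (
            self.max_budget_per_task is not None
            and self.accumulated_cost > self.max_budget_per_task
        )

    def run_step(self, event: str, step_cost: float) -> bool:
        """Record the event, then check the budget before doing the step.

        Returns False (and does no work) if the budget is already exceeded.
        """
        self.events.append(event)  # the event is recorded before the check
        if self.budget_exceeded():
            return False
        self.accumulated_cost += step_cost  # the "step" itself
        return True


loop = BudgetedLoop(max_budget_per_task=1.0)
results = [loop.run_step(f"event-{i}", 0.75) for i in range(3)]
```

With a limit of 1.0 and a cost of 0.75 per step, the first two steps run and the third is refused, because the check happens before the step rather than after it.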
* properly log user messages;
format browser action/obs, summarize action, messages properly for logging
* add source to message
* add spaces for printing
* Refactor monologue to use the messages in state history
* add messages, clean up
* fix monologue
* update integration tests
* move private method
* update SWE agent to use the history from State
* integration tests for SWE agent
* rename monologue to initial_thoughts, since that is what it is
* Refactor monologue to use the messages in state history
remove now unused method
* is_stuck update
* fix is_stuck
* unit tests
* fix tests
* Revert "Refactor monologue to use the messages in state history"
This reverts commit 76b4b765ef.
* Override eq for CmdOutputObservation to ignore the pid, compare the actual command only
* Revert "Override eq for CmdOutputObservation to ignore the pid, compare the actual command only"
This reverts commit 6418d856b5.
* improve error info logging
* Move assignment of self.state.error to report_error function
* only log exception to state, but not to user
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
This PR includes three changes:
1) Iteration reminder should start with MAX_ITERATIONS from config rather than the default value of 100
2) In the first prompt, we should tell the LLM it has `MAX_ITERATIONS - 1` turns left, rather than `MAX_ITERATIONS - 2`
3) Remove legacy ITERATION_REMINDER config
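
To make the off-by-one in point 2 concrete, here is a rough sketch of the arithmetic. The function names, the 0-based `iteration` counter, and the reminder wording are hypothetical and chosen for this example. Only the rule "the first prompt reports `MAX_ITERATIONS - 1` turns left" comes from the description above.

```python
def turns_left(max_iterations: int, iteration: int) -> int:
    """Turns remaining after the current one.

    `iteration` is assumed to be 0-based, so the first prompt
    (iteration 0) reports max_iterations - 1 turns left.
    """
    return max(max_iterations - iteration - 1, 0)


def iteration_reminder(max_iterations: int, iteration: int) -> str:
    """Build a reminder string (hypothetical wording) for the LLM prompt."""
    remaining = turns_left(max_iterations, iteration)
    return f"You have {remaining} turns left to complete the task."


# max_iterations would come from the config, not a hard-coded default.
first_prompt = iteration_reminder(max_iterations=30, iteration=0)
```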
* support returning states at the end of controller
* remove return None
* fix issue of overriding final state
* return the final state on close
* merge AgentState with State
* fix integration test
* add ChangeAgentStateAction to history in attempt to fix integration
* add back set agent state
* update tests
* update tests
* directly return get state
* add back the missing .close()
* Update typo in opendevin/core/main.py
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* mypy is invaluable
* fix config, add test
* Add new-style toml support
* add singleton, small doc fixes
* fix some cases of loading toml, clean up, try to make it clearer
* Add defaults_dict for UI
* allow config to be mutable
error handling
fix toml parsing
* remove debug stuff
* Adapt Makefile
* Add defaults for temperature and top_p
* update to CodeActAgent
* comments
* fix unit tests
* implement groups of llm settings (CLI)
* fix merge issue
* small fix sandboxes, small refactoring
* adapt LLM init to accept overrides at runtime
* reading config is enough
* Encapsulate minimally embeddings initialization
* agent bug fix; fix tests
* fix sandboxes tests
* refactor globals in sandboxes to properties
* Add AgentRejectAction across multiple modules
This commit introduces the AgentRejectAction class and integrates it across various modules and actions. It includes updates to READMEs, action definitions, and agent controllers to handle the new 'reject' action. This functionality will allow agents to properly signal task rejection.
* Fix unit test
* Remove wrong generates attributes from a few micro-agents
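
As a rough illustration of the 'reject' action described above, the sketch below shows one way such an action could be modeled. The field names (`outputs`, `thought`), the `"reason"` key, and the `message` property are assumptions for this example and are not taken from the actual action definitions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AgentRejectAction:
    """Hypothetical sketch of an action an agent emits to reject a task."""

    # Free-form outputs. A "reason" key is assumed here so that a UI
    # could show users why the task was rejected.
    outputs: Dict[str, str] = field(default_factory=dict)
    thought: str = ""
    action: str = "reject"

    @property
    def message(self) -> str:
        """Human-readable summary, including the reason when one is given."""
        reason = self.outputs.get("reason")
        if reason:
            return f"Task is rejected by the agent. Reason: {reason}"
        return "Task is rejected by the agent."


rejection = AgentRejectAction(outputs={"reason": "task is out of scope"})
```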
* remove extra actions
* remove message observations
* support null obs
* handle null obs
* fix frontend for changes
* fix the way messages flow to the UI
* change think to message
* add regen script
* regenerate all integration tests
* change task
* remove gh test
* fix messages
* fix tests
* help agent exit after hitting max iter
* Update opendevin/events/observation/success.py
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* Update agenthub/codeact_agent/codeact_agent.py
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* move towards event stream
* refactor agent state changes
* move agent state logic
* fix callbacks
* break on finish
* closer to working
* change frontend to accommodate new flow
* handle start action
* fix locked stream
* revert message
* logspam
* no async on close
* get rid of agent_task
* fix up closing
* better asyncio handling
* sleep to give back control
* fix key
* logspam
* update frontend agent state actions
* fix pause and cancel
* delint
* fix map
* delint
* wait for agent to finish
* fix unit test
* event stream enums
* fix merge issues
* fix lint
* fix test
* fix test
* add user message action
* add user message action
* fix up user messages
* fix main.py flow
* refactor message waiting
* lint
* fix test
* fix test
* simplify if/else
* fix state reset
* logspam
* add error status
* minor changes to control bar
* handle user messages when not awaiting
* restart agent after stopping
* Update opendevin/controller/agent_controller.py
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* delint
* refactor initialize
* delint
* fix dispatch
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* move towards event stream
* refactor agent state changes
* move agent state logic
* fix callbacks
* break on finish
* closer to working
* change frontend to accommodate new flow
* handle start action
* fix locked stream
* revert message
* logspam
* no async on close
* get rid of agent_task
* fix up closing
* better asyncio handling
* sleep to give back control
* fix key
* logspam
* update frontend agent state actions
* fix pause and cancel
* delint
* fix map
* delint
* wait for agent to finish
* fix unit test
* event stream enums
* fix merge issues
* fix lint
* fix test
* fix test
* add user message action
* add user message action
* fix up user messages
* fix main.py flow
* refactor message waiting
* lint
* fix test
* fix test
* Feat: add lint frontend and lint all to Makefile.
* style code.
* Remove redundant target.
---------
Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
* add a single-threaded server serving browsergym
* update poetry
* update browser page content
* add import to make sure browsergym environments are registered properly
* remove flask server, use multiprocess impl and Pipe
* fix
* refactor BrowserEnv
* update browser action and obs to include more complete info
* fix screenshot
* update poetry lock
* add playwright install to workflow
* update
* add better html to text conversion
* update for better text conversion to maintain parity with the current handling of browseurlaction
* update
* update poetry
* update multiprocessing mp
* fix multiprocessing
* update
* update github workflow
---------
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>