41 Commits

Author SHA1 Message Date
tobitege
4b76f98b26 fix: keep colon part in model name for OpenRouter (#2223) 2024-06-03 17:11:44 +02:00
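A minimal sketch of the kind of model-name handling this fix describes, assuming OpenRouter-style names such as `openrouter/anthropic/claude-3-opus:beta`; the function and example are illustrative, not the project's actual code.

```python
def bare_model_name(model: str) -> str:
    """Strip the provider prefix but keep any ':variant' suffix.

    e.g. 'openrouter/anthropic/claude-3-opus:beta' -> 'anthropic/claude-3-opus:beta'
    """
    # Splitting only on '/' leaves the colon part (':beta', ':free', ...) intact,
    # which OpenRouter uses to select a model variant.
    if model.startswith('openrouter/'):
        return model.split('/', 1)[1]
    return model


assert bare_model_name('openrouter/anthropic/claude-3-opus:beta') == 'anthropic/claude-3-opus:beta'
```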
மனோஜ்குமார் பழனிச்சாமி
343e5c73ae Parsed model_name for model_info (#2122) 2024-05-29 16:54:27 +08:00
Xingyao Wang
ae8cda1495 Support specifying custom cost per token (#2083)
* support specifying custom cost per token

* fix test for new attrs

* add to docs

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-05-27 19:35:34 +08:00
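A hedged sketch of what per-token cost overrides like those described in #2083 could look like; the attribute and function names here are hypothetical, not the exact config schema.

```python
from dataclasses import dataclass


@dataclass
class LLMConfig:
    # Hypothetical fields: user-supplied prices for models litellm has no pricing for.
    input_cost_per_token: float | None = None
    output_cost_per_token: float | None = None


def custom_completion_cost(config: LLMConfig, prompt_tokens: int, completion_tokens: int) -> float | None:
    """Return the cost from user-supplied per-token prices, or None if not configured."""
    if config.input_cost_per_token is None or config.output_cost_per_token is None:
        return None
    return (prompt_tokens * config.input_cost_per_token
            + completion_tokens * config.output_cost_per_token)
```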
Engel Nyst
46352e890b Logging security (#1943)
* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>
2024-05-22 18:27:38 +02:00
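A minimal sketch in the spirit of the SensitiveDataFilter mentioned above, assuming secrets appear as `key=value` pairs or env-style names in log messages; the regex, class, and file name are illustrative.

```python
import logging
import re


class SensitiveDataFilter(logging.Filter):
    """Redact values of sensitive-looking keys before a record is written."""

    # Illustrative pattern: api_key=..., LLM_API_KEY: ..., token=..., etc.
    SENSITIVE = re.compile(
        r'(?i)\b(api_key|llm_api_key|token|secret|password)\b\s*[=:]\s*\S+'
    )

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = self.SENSITIVE.sub(r'\1=******', str(record.msg))
        return True  # keep the record, just with the secret value masked


# Attach the filter to the file handler so secrets never reach the log file.
handler = logging.FileHandler('opendevin.log')
handler.addFilter(SensitiveDataFilter())
```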
Yufan Song
d18e6c85a0 feat: add metrics related to cost for better observability (#1944)
* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-05-22 08:53:31 +00:00
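A sketch, under assumed names, of the cost accounting #1944 describes: a small metrics object owned by the LLM that accumulates per-call costs and can be reported alongside the final state.

```python
class Metrics:
    """Accumulate per-completion costs for observability."""

    def __init__(self) -> None:
        self.accumulated_cost: float = 0.0
        self.costs: list[float] = []

    def add_cost(self, value: float) -> None:
        if value < 0:
            raise ValueError('Cost must be non-negative.')
        self.accumulated_cost += value
        self.costs.append(value)

    def get(self) -> dict:
        return {'accumulated_cost': self.accumulated_cost, 'costs': self.costs}
```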
Shimada666
da8369c4d2 fix: llm is_local function logic error (#1961)
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2024-05-22 10:50:21 +05:30
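The fix above concerns a helper that decides whether the configured LLM is served locally; a hedged sketch of such a check (names and heuristics assumed) might look like this.

```python
def is_local(base_url: str | None, model: str | None) -> bool:
    """Heuristic: treat the LLM as local if it points at localhost or uses an ollama/ model."""
    if base_url:
        for host in ('localhost', '127.0.0.1', '0.0.0.0'):
            if host in base_url:
                return True
    if model and model.startswith('ollama'):
        return True
    return False
```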
Boxuan Li
735fbbfe3e (test) Include message separators in mock prompts (#1855)
* Add message separator to prompts in tests

* DEMO: remove existing prompts for PlannerAgent

* Add results after prompt regeneration
2024-05-18 00:33:55 +02:00
Xingyao Wang
2406b901df feat(SWE-Bench environment) integrate SWE-Bench sandbox (#1468)
* add draft dockerfile for build all

* add rsync for build

* add all-in-one docker

* update prepare scripts

* Update swe_env_box.py

* Add swe_entry.sh (buggy now)

* Parse the test command in swe_entry.sh

* Update README for instance eval in sandbox

* revert specialized config

* replace run_as_devin as an init arg

* set container & run_as_root via args

* update swe entry script

* update env

* remove mounting

* allow error after swe_entry

* update swe_env_box

* move file

* update gitignore

* get swe_env_box a working demo

* support faking user response & provide sandbox ahead of time;
also return state for controller

* tweak main to support adding controller kwargs

* add module

* initialize plugin for provided sandbox

* add pip cache to plugin & fix jupyter kernel waiting

* better print Observation output

* add run infer scripts

* update readme

* add utility for getting diff patch

* use get_diff_patch in infer

* update readme

* support cost tracking for codeact

* add swe agent edit hack

* disable color in git diff

* fix git diff cmd

* fix state return

* support limit eval

* increase timeout and export pip cache

* add eval limit config

* return state when hit turn limit

* save log to file; allow agent to give up

* run eval with max 50 turns

* add outputs to gitignore

* save swe_instance & instruction

* add uuid to swebench

* add streamlit dep

* fix save series

* fix the issue where session id might be duplicated

* allow setting temperature for llm (use 0 for eval)

* Get report from agent running log

* support evaluating task success right after inference.

* remove extra log

* comment out prompt for baseline

* add visualizer for eval

* use plaintext for instruction

* reduce timeout for all; only increase timeout for init

* reduce timeout for all; only increase timeout for init

* ignore sid for swe env

* close sandbox in each eval loop

* update visualizer instruction

* increase max chars

* add finish action to history too

* show test result in metrics

* add sidebars for visualizer

* also visualize swe_instance

* clean up browser when agent controller finishes running

* do not mount workspace for swe-eval to avoid accidentally overwriting files

* Revert "do not mount workspace for swe-eval to avoid accidentally overwrite files"

This reverts commit 8ef7739054.

* Revert "Revert "do not mount workspace for swe-eval to avoid accidentally overwrite files""

This reverts commit 016cfbb9f0.

* run jupyter command via copy to, instead of cp to mount

* only print mixin output when failed

* change ssh box logging

* add visualizer for pass rate

* add instance id to sandbox name

* only remove container we created

* use opendevin logger in main

* support multi-processing infer

* add back metadata, support keyboard interrupt

* remove container with startswith

* make pbar behave correctly

* update instruction w/ multi-processing

* show resolved rate by repo

* rename tmp dir name

* attempt to fix racing for copy to ssh_box

* fix script

* bump swe-bench-all version

* fix ipython with self-contained commands

* add jupyter demo to swe_env_box

* make resolved count two column

* increase height

* do not add glob to url params

* analyze obs length

* print instance id prior to removal handler

* add gold patch in visualizer

* fix interactive git by adding a git --no-pager as alias

* increase max_char to 10k to cover 98% of swe-bench obs cases

* allow parsing note

* prompt v2

* add iteration reminder

* adjust user response

* adjust order

* fix return eval

* fix typo

* add reminder before logging

* remove other resolve rate

* readjust to the new folder structure

* support adding eval note

* fix eval note path

* make sure first log of each instance is printed

* add eval note

* fix the display for visualizer

* tweak visualizer for better git patch reading

* exclude empty patch

* add retry mechanism for swe_env_box start

* fix ssh timeout issue

* add stat field for apply test patch success

* add visualization for fine-grained report

* attempt to support monologue agent by constraining it to single thread

* also log error msg when stopped

* save error as well

* override WORKSPACE_MOUNT_PATH and WORKSPACE_BASE for monologue to work in mp

* add retry mechanism for sshbox

* remove retry for swe env box

* try to handle loop state stopped

* Add get report scripts

* Add script to convert agent output to swe-bench format

* Merge fine grained report for visualizer

* Update eval readme

* Update README.md

* Add CodeAct gpt4-1106 output and eval logs on swe-bench-lite

* Update the script to get model report

* Update get_model_report.sh

* Update get_agent_report.sh

* Update report merge script

* Add agent output conversion script

* Update swe_lite_env_setup.sh

* Add example swe-bench output files

* Update eval readme

* Remove redundant scripts

* set iteration count down to false by default

* fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model exception out of litellm (#1666)

* fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model exception out of litellm

* Review Feedback

* Missing None Check

* Review feedback and improved error handling

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>

* fix prepare_swe_util scripts

* update builder images

* update setup script

* remove swe-bench build workflow

* update lock

* remove experiments since they are moved to hf

* remove visualizer (since it is moved to hf repo)

* simplify jupyter execution via heredoc

* update ssh_box

* add initial docker readme

* add pkg-config as dependency

* add script for swe_bench all-in-one docker

* add rsync to builder

* rename var

* update commit

* update readme

* update lock

* support specifying timeout for long-running tasks

* fix path

* separate building of all deps and files

* support returning states at the end of controller

* remove return None

* support specifying timeout for long-running tasks

* add timeout for all existing sandbox impl

* fix swe_env_box for new codebase

* update llm config in config.py

* support passing sandbox in

* remove force set

* update eval script

* fix issue of overriding final state

* change default eval output to hf demo

* change default eval output to hf demo

* fix config

* only close it when it is NOT an external sandbox

* add scripts

* tweak config

* only put in history when state has history attr

* fix agent controller for the case of running out of interaction budget

* always assume state is not None

* remove print of final state

* catch all exception when cannot compute completion cost

* Update README.md

* save source into json

* fix path

* update docker path

* return the final state on close

* merge AgentState with State

* fix integration test

* merge AgentState with State

* fix integration test

* add ChangeAgentStateAction to history in attempt to fix integration

* add back set agent state

* update tests

* update tests

* move scripts for setup

* update script and readme for infer

* do not reset logger when n processes == 1

* update eval_infer scripts and readme

* simplify readme

* copy over dir after eval

* copy over dir after eval

* directly return get state

* update lock

* fix output saving of infer

* replace print with logger

* update eval_infer script

* add back the missing .close

* increase timeout

* copy all swe_bench_format file

* attempt to fix output parsing

* log git commit id as metadata

* fix eval script

* update lock

* update unit tests

* fix argparser unit test

* fix lock

* the deps are now lightweight enough to be included in make build

* add spaces for tests

* add eval outputs to gitignore

* remove git submodule

* readme

* tweak git email

* update upload instruction

* bump codeact version for eval

---------

Co-authored-by: Bowen Li <libowen.ne@gmail.com>
Co-authored-by: huybery <huybery@gmail.com>
Co-authored-by: Bart Shappee <bshappee@gmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-15 16:15:55 +00:00
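One bullet in the entry above ("simplify jupyter execution via heredoc") refers to sending a code cell to a Jupyter-backed executor through a single shell heredoc; a rough sketch of building such a command follows, where `execute_cli` is a hypothetical helper assumed to exist inside the sandbox.

```python
def jupyter_heredoc_command(code: str) -> str:
    """Wrap a code cell in a heredoc so it can be piped to a Jupyter execute helper
    as one self-contained shell command (no temp files, no extra mounts)."""
    # The quoted delimiter ('EOL') prevents the shell from expanding $vars inside the cell.
    return (
        "cat <<'EOL' | execute_cli\n"  # execute_cli: hypothetical in-sandbox helper
        f"{code}\n"
        "EOL"
    )


print(jupyter_heredoc_command("print('hello from jupyter')"))
```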
மனோஜ்குமார் பழனிச்சாமி
b4cdebec06 Ignore any warnings LiteLLM might emit on import (#1687) 2024-05-10 16:42:08 -04:00
Bart Shappee
78cd2e5b47 fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model exception out of litellm (#1666)
* fix: Issue where CodeAct agent was trying to log cost on local llm and throwing Undefined Model exception out of litellm

* Review Feedback

* Missing None Check

* Review feedback and improved error handling

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-10 13:57:37 -04:00
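A hedged sketch of the guard this fix describes: wrapping litellm's cost lookup so an unknown (typically local) model no longer raises out of the agent loop. Only `litellm.completion_cost` is a real API call here; the surrounding names are illustrative.

```python
import litellm


def try_completion_cost(response) -> float:
    """Return the cost of a completion, or 0.0 when litellm has no pricing for the model
    (common for local models, where it raises instead of returning a value)."""
    try:
        return litellm.completion_cost(completion_response=response)
    except Exception:
        # An unmapped model is not an error worth crashing the agent over.
        return 0.0
```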
Engel Nyst
446eaec1e6 Refactor config to dataclasses (#1552)
* mypy is invaluable

* fix config, add test

* Add new-style toml support

* add singleton, small doc fixes

* fix some cases of loading toml, clean up, try to make it clearer

* Add defaults_dict for UI

* allow config to be mutable
error handling
fix toml parsing

* remove debug stuff

* Adapt Makefile

* Add defaults for temperature and top_p

* update to CodeActAgent

* comments

* fix unit tests

* implement groups of llm settings (CLI)

* fix merge issue

* small fix sandboxes, small refactoring

* adapt LLM init to accept overrides at runtime

* reading config is enough

* Encapsulate minimally embeddings initialization

* agent bug fix; fix tests

* fix sandboxes tests

* refactor globals in sandboxes to properties
2024-05-09 22:48:29 +02:00
Xingyao Wang
21fe8dc1eb Align codeact with swebench eval (#1612)
* align codeact agent with the slight adjustment on eval branch

* update integration test for new prompt

* Regenerate test artifacts for CodeActAgent

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-09 00:42:07 -07:00
zhaoninge
6150ab6a3e fix: corrected bedrock model list (#1513)
- auto set environment variable
- add criteria for querying the AWS bedrock model
2024-05-07 03:51:49 +00:00
Christian Balcom
27e13fafb5 Token counting and litellm provider customization (#1421)
* Count tokens to judge max monologue length more accurately; add configurations for max input and output tokens, pulling from litellm when available.

* Fix token counter

* Use None as the default for llm_custom_llm_provider, resolve settings conflict with recent command-r-plus commit.

* Document rationale for default token counts.

* Update opendevin/llm/llm.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Update opendevin/llm/llm.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Reverting formatting changes from merge.

* Maybe this will satisfy pydoc-markdown?

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-05-06 00:43:00 +02:00
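A sketch of the token-based length check this commit describes, using litellm's real `token_counter` helper; the limit constant stands in for a configured max-input-tokens value and is an assumption.

```python
import litellm

MAX_INPUT_TOKENS = 4096  # assumed configurable limit, in the spirit of the commit's new settings


def fits_in_context(model: str, messages: list[dict]) -> bool:
    """Count tokens with litellm's tokenizer-aware counter instead of guessing by characters."""
    used = litellm.token_counter(model=model, messages=messages)
    return used <= MAX_INPUT_TOKENS
```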
Jiayi Pan
bccb8297b8 feat: ability to configure temperature and top-p sampling for llm generation (#1556)
Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-05-03 19:15:39 +00:00
Robert Brennan
fadcdc117e Migrate to new folder structure in preparation for refactor (#1531)
* fix up folder structure

* update docs

* fix imports

* fix imports

* fix import

* fix imports

* fix imports

* fix imports

* fix test import

* fix tests

* fix main import
2024-05-02 17:01:54 +00:00
Alex Bäuerle
cd58194d2a docs(docs): start implementing docs website (#1372)
* docs(docs): start implementing docs website

* update video url

* add autogenerated codebase docs for backend

* precommit

* update links

* fix config and video

* gh actions

* rename

* workdirs

* path

* path

* fix doc1

* redo markdown

* docs

* change main folder name

* simplify readme

* add back architecture

* Fix lint errors

* lint

* update poetry lock

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
2024-04-29 10:00:51 -07:00
Christian Balcom
24b71927c3 fix(backend) changes to improve Command-R+ behavior, plus file i/o error improvements, attempt 2 (#1417)
* Some improvements to prompts, some better exception handling for various file IO errors, added timeout and max return token configurations for the LLM api.

* More monologue prompt improvements

* Dynamically set username provided in prompt.

* Remove absolute paths from llm prompts, fetch working directory from sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt.

* Switched ssh_box to disabling tty echo and removed the logic attempting to delete it from the response afterwards, fixed get_working_directory for ssh_box.

* Update prompts in integration tests to match monologue agent changes.

* Minor tweaks to make merge easier.

* Another minor prompt tweak, better invalid json handling.

* Fix lint error

* More catch-up to fix lint errors introduced by merge.

* Force WORKSPACE_MOUNT_PATH_IN_SANDBOX to match WORKSPACE_MOUNT_PATH in local sandbox mode, combine exception handlers in prompts.py.

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-28 21:58:53 -04:00
Robert Brennan
9c9aee29f0 Revert "fix(backend) changes to improve Command-R+ behavior, plus file i/o er…" (#1405)
This reverts commit 44aea95dde.
2024-04-27 08:57:04 -04:00
Christian Balcom
44aea95dde fix(backend) changes to improve Command-R+ behavior, plus file i/o error improvements. (#1347)
* Some improvements to prompts, some better exception handling for various file IO errors, added timeout and max return token configurations for the LLM api.

* More monologue prompt improvements

* Dynamically set username provided in prompt.

* Remove absolute paths from llm prompts, fetch working directory from sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt.

* Switched ssh_box to disabling tty echo and removed the logic attempting to delete it from the response afterwards, fixed get_working_directory for ssh_box.

* Update prompts in integration tests to match monologue agent changes.

* Minor tweaks to make merge easier.

* Another minor prompt tweak, better invalid json handling.

* Fix lint error

* More catch-up to fix lint errors introduced by merge.

---------

Co-authored-by: Jim Su <jimsu@protonmail.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-27 11:58:34 +00:00
Boxuan Li
831e934dab Refactor: Use enum for config keys (#1376) 2024-04-26 10:26:01 -04:00
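A small sketch of the enum-for-config-keys idea; the member names are examples, not the full set used by the project.

```python
from enum import Enum


class ConfigType(str, Enum):
    """Typed keys replace bare strings so a typo fails loudly instead of silently."""
    LLM_MODEL = 'LLM_MODEL'
    LLM_API_KEY = 'LLM_API_KEY'
    WORKSPACE_BASE = 'WORKSPACE_BASE'


settings = {ConfigType.LLM_MODEL: 'gpt-4o'}
print(settings[ConfigType.LLM_MODEL])
```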
Boxuan Li
e7b5ddfe06 Add integration test framework with mock llm (#1301)
* Add integration test framework with mock llm

* Fix MonologueAgent and PlannerAgent tests

* Remove adhoc logging

* Use existing logs

* Fix SWEAgent and PlannerAgent

* Check-in test log files

* conftest: look up under test name folder only

* Add docstring to conftest

* Finish dev doc

* Avoid non-determinism

* Remove dependency on llm embedding model

* Init embedding model only for MonologueAgent

* Add adhoc fix for sandbox discrepancy

* Test ssh and exec sandboxes

* CI: fix missing sandbox type

* conftest: Remove hack

* Reword comment for TODO
2024-04-25 10:56:53 -04:00
Engel Nyst
464bf7ee23 Tweak connect exceptions (#1120)
* Clean up manual sleep

* Add default retries and document them.

* Add doctrings to llm

* Add exponential backoff for rate limiting errors

* Get embeddings for the action and its own content, not the user message

* Add a few bad exceptions to stop loop

* Stop loop when the step has no action

* Add action with content, no message, to history

* make retry settings customizable

* fix condense to stop the loop for the same reasons as completion

* Add 500-504 exception to retries

* document the retry variables

* Add retries and limits for embeddings. Replaces llama-index hard-coded decorator.

* Rename to retry_min_wait and retry_max_wait
2024-04-22 04:00:01 +02:00
மனோஜ்குமார் பழனிச்சாமி
0356f6ec89 Azure LLM fix (#1227)
* azure embedding fix

* corrected embedding config

* fixed doc
2024-04-20 01:05:14 +02:00
Robert Brennan
d61fdb8bba less debug info when errors happen (#1233)
* less debug info when errors happen

* add another traceback

* better prompt debugs

* remove unuse arg
2024-04-19 15:15:38 -04:00
Robert Brennan
9fd7068204 Fix for setting LLM model, frontend settings refactor (#1169)
* simplify frontend settings management

* add debug info to llm.py

* always reinitialize agent

* remove old config stuff

* delint

* fix first initialize event

* refactor settings management

* remove logs

* change endpoint to remove litellm reference

* actually fix socket issues

* refactor a bit

* delint

* remove isFirstRun

* delint

* delint python

* fix export

* fix up socket handshake

* fix types

* fix lint errors

* delint

* fix test names

* moar lint

* fix build errors

* remove newline

* Update frontend/src/services/settingsService.test.ts

* Update frontend/src/services/settingsService.test.ts
2024-04-17 17:49:38 +00:00
Engel Nyst
1115b60a74 Logging additions and fixes (#1139)
* Refactor print_to_color into a color formatter

misc fixes

catch ValueErrors and others from Router initialization

add default methods

* Tweak console log formatting, clean up after rebasing exceptions out

* Fix prompts/responses

* clean up

* keep regular colors when no msg_type

* fix filename

* handle file log first

* happy mypy

* ok, mypy

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-16 12:55:22 -04:00
மனோஜ்குமார் பழனிச்சாமி
0616fe3f8d Added Retry for LLM calls (#1092)
* added retry

* filtered API errors

* fixed decorator

* used litellm retries

* added custom backoff too

* Apply suggestions from code review

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* added custom backoff too

* retried only if certain Exceptions

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-04-15 14:47:38 +02:00
Robert Brennan
9846e24299 Fix logger import (#985)
* fix logger import
* fix mypy version
* make mypy happy (#994)

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-04-10 21:48:40 +02:00
Engel Nyst
8ab9c6fb86 Revert the use of Router, good ole completion works. (#910)
* Revert the use of Router, good ole completion works.

* Stopgap exception message

* Get the updated dependencies.
2024-04-08 22:21:30 -04:00
Engel Nyst
4b4ce20f2d Add logging (#660)
* Add logging config for the app and for llm debug

* - switch to python, add special llm logger

- add logging to sandbox.py

- add session.py

- add a directory per session

- small additions for AgentController

* - add sys log, but try to exclude litellm; log llm responses as json

* Update opendevin/_logging.py

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>

* - use standard file naming
- quick pass through a few more files

* fix ruff

* clean up

* mypy types

* make mypy happy

---------

Co-authored-by: Anas DORBANI <95044293+dorbanianas@users.noreply.github.com>
2024-04-07 05:43:25 +02:00
Engel Nyst
e70767c226 clean up (#721) 2024-04-04 17:02:44 -05:00
Jack Quimby
0748f0b7ce Doc: Updated documentation for Ollama (#689)
* doc: Guide for using local LLM with Ollama

* forgot to delete print statement

* typos

* Updated guide - new working method

* Move to docs folder

* Fixed front end overwrite local model name

* Update llm.py

* Delete docs/examples/images/example.png

deleted example.png
2024-04-04 09:24:44 -04:00
Jack Quimby
08a2dfb01a Guide for Ollama local LLM (#615)
* doc: Guide for using local LLM with Ollama
2024-04-02 23:20:25 -04:00
Yufan Song
324a00f477 refactor(config): make a single source of truth file (#524)
* refactor

* fix nits

* add get from env

* refactor logic
2024-04-02 18:01:23 -04:00
Patrick Nercessian
64281c4cc4 Transitioned to use LiteLLM Router to support retries and backoffs (#501) 2024-04-01 10:42:52 -04:00
Robert Brennan
f68ee45761 output prompt debug before response (#348) 2024-03-30 20:08:05 +08:00
Jim Su
b1b96df8a8 Replace environment variables with configuration file (#339)
* Replace environment variables with configuration file

* Add config.toml to .gitignore

* Remove unused os imports

* Update README.md

* Update README.md

* Update README.md

* Fix merge conflict

* Fallback to environment variables

* Use template file for config.toml

* Update config.toml.template

* Update config.toml.template

---------

Co-authored-by: Robert Brennan <accounts@rbren.io>
2024-03-29 15:26:20 -04:00
Robert Brennan
4304aceff3 remove openai key assertion, enable alternate embedding models (#231)
* remove openai key assertion

* support different embedding models

* add todo

* add local embeddings

* Make lint happy (#232)

* Include Azure AI embedding model (#239)

* Include Azure AI embedding model

* updated requirements

---------

Co-authored-by: Rohit Rushil <rohit.rushil@honeywell.com>

* Update agenthub/langchains_agent/utils/memory.py

* Update agenthub/langchains_agent/utils/memory.py

* add base url

* add docs

* Update requirements.txt

* default to local embeddings

* Update llm.py

* fix fn

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: RoHitRushil <43521824+RohitX0X@users.noreply.github.com>
Co-authored-by: Rohit Rushil <rohit.rushil@honeywell.com>
2024-03-27 14:58:47 -04:00
Robert Brennan
9bc1890d33 add debug dir for prompts (#205)
* add debug dir for prompts

* add indent to dumps

* only wrap completion in debug mode

* fix mypy
2024-03-27 12:40:08 -04:00
Robert Brennan
eb4a261880 Create generic LLM client using LiteLLM (#114)
* add generic llm client

* fix lint errors

* fix lint issues

* a potential suggestion for llm wrapper to keep all the function signatures for ide

* use completion partial

* fix resp

* remove unused args

* add back truncation logic

* fix add_event

* fix merge issues

* more merge issues fixed

* fix codeact agent

* remove dead code

* remove import

* unused imports

* fix ruff

* update requirements

* mypy fixes

* more lint fixes

* fix browser errors

* fix up observation conversion

* fix format of error

* change max iter default back to 100

* fix kill action

* fix docker cleanup

* add RUN_AS_DEVIN flag

* fix condense

* revert some files

* unused imports

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Robert Brennan <rbren@Roberts-MacBook-Pro.local>
2024-03-26 12:10:23 +08:00