dependabot[bot]
0bd5ab72ed
chore(deps): bump boto3 from 1.34.123 to 1.34.124 ( #2410 )
2024-06-13 00:07:45 +08:00
dependabot[bot]
4c774544ce
chore(deps): bump litellm from 1.40.8 to 1.40.9 ( #2411 )
2024-06-13 00:07:09 +08:00
dependabot[bot]
d1eaf486bd
chore(deps-dev): bump lint-staged from 15.2.5 to 15.2.6 in /frontend ( #2407 )
...
Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.5 to 15.2.6.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](https://github.com/okonet/lint-staged/compare/v15.2.5...v15.2.6)
---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 17:35:01 +03:00
Leo
7fcd14a055
fix the failed unit test. ( #2405 )
...
Signed-off-by: ifuryst <ifuryst@gmail.com>
2024-06-12 13:55:23 +00:00
Yufan Song
90ec0095df
Add integration test for CodeActSWEAgent ( #2377 )
...
* add test log
* remove browsing internet
* add test by GPT-4o
* fix prompts
* change test_agent
* fix test
* fix nits
2024-06-12 02:46:15 +08:00
dependabot[bot]
a25fb4d21b
chore(deps-dev): bump @typescript-eslint/parser in /frontend ( #2387 )
...
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.12.0 to 7.13.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.13.0/packages/parser)
---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-11 18:33:20 +00:00
dependabot[bot]
11d7efb0e8
chore(deps-dev): bump @testing-library/jest-dom in /frontend ( #2388 )
...
Bumps [@testing-library/jest-dom](https://github.com/testing-library/jest-dom) from 6.4.5 to 6.4.6.
- [Release notes](https://github.com/testing-library/jest-dom/releases)
- [Changelog](https://github.com/testing-library/jest-dom/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/jest-dom/compare/v6.4.5...v6.4.6)
---
updated-dependencies:
- dependency-name: "@testing-library/jest-dom"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 02:17:20 +08:00
dependabot[bot]
17e7a45c3c
chore(deps-dev): bump @typescript-eslint/eslint-plugin in /frontend ( #2389 )
...
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 7.12.0 to 7.13.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.13.0/packages/eslint-plugin)
---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 02:16:47 +08:00
dependabot[bot]
e14297fde9
chore(deps-dev): bump prettier from 3.3.1 to 3.3.2 in /frontend ( #2390 )
...
Bumps [prettier](https://github.com/prettier/prettier) from 3.3.1 to 3.3.2.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/prettier/compare/3.3.1...3.3.2)
---
updated-dependencies:
- dependency-name: prettier
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-12 02:15:00 +08:00
dependabot[bot]
35c4c9cb34
chore(deps): bump litellm from 1.40.7 to 1.40.8 ( #2392 )
...
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.7 to 1.40.8.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.7...v1.40.8)
---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-11 15:23:14 +00:00
dependabot[bot]
9893c06b2e
chore(deps): bump boto3 from 1.34.122 to 1.34.123 ( #2391 )
...
Bumps [boto3](https://github.com/boto/boto3) from 1.34.122 to 1.34.123.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.122...1.34.123)
---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-11 15:19:34 +00:00
Yufan Song
c6951eb6c1
refactor browsing agent response parse ( #2366 )
...
* refactor browsing
* add comments
* change file name
* Rename resposne_parser.py to response_parser.py
* Fixed typos
* Typo fix
---------
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-11 03:46:33 +00:00
Xingyao Wang
b3bdc44292
mkdir infer_logs instead of logs ( #2382 )
2024-06-11 07:18:19 +08:00
Xingyao Wang
11a2d1682d
Minor SWE-Bench inference config tweak ( #2381 )
...
* save infer logs to infer_logs
* set max budget for swebench eval
2024-06-10 20:14:22 +00:00
tobitege
e4145aef66
avoid repeat logging of unneeded messages ( #2380 )
2024-06-10 20:08:09 +00:00
Xingyao Wang
a6ba6c5277
Add SWEBench-docker eval ( #2085 )
...
* add initial version of swebench-docker eval
* update the branch of git repo
* add poetry run
* download dev set too and pre-load f2p and p2p
* update eval infer script
* increase timeout
* add poetry run
* install swebench from our fork
* update script
* update loc
* support single instance debug
* replace \r\n from model patch
* replace eval docker from namespace xingyaoww
* update script to auto detect swe-bench format jsonl
* support eval infer on single instance id
* change log output dir to logs
* update summarise result script
* update README
* update readme
* tweak branch
* Update evaluation/swe_bench/scripts/eval/prep_eval.sh
Co-authored-by: Graham Neubig <neubig@gmail.com>
---------
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-10 19:30:40 +00:00
tobitege
9605106e72
feat: append_file incl. all tests [agentskills] ( #2346 )
...
* new skill: append_file incl. all tests
* more tests needed caring
* file_name for append_file/edit_file; updated tests
2024-06-10 17:18:40 +00:00
dependabot[bot]
a5f5bc30b4
chore(deps): bump @vitejs/plugin-react from 4.3.0 to 4.3.1 in /frontend ( #2371 )
2024-06-11 00:32:10 +08:00
Yufan Song
f4cb192ebe
Fix llm key leaks bug ( #2376 )
...
* fix bug
* fix bug
* add
2024-06-10 15:55:33 +00:00
dependabot[bot]
c633d41091
chore(deps-dev): bump llama-index-vector-stores-chroma ( #2375 )
...
Bumps llama-index-vector-stores-chroma from 0.1.8 to 0.1.9.
---
updated-dependencies:
- dependency-name: llama-index-vector-stores-chroma
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:39:42 +00:00
dependabot[bot]
e2512b43b6
chore(deps-dev): bump llama-index-embeddings-azure-openai ( #2374 )
...
Bumps llama-index-embeddings-azure-openai from 0.1.9 to 0.1.10.
---
updated-dependencies:
- dependency-name: llama-index-embeddings-azure-openai
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:39:31 +00:00
dependabot[bot]
090046b2e6
chore(deps-dev): bump openai from 1.32.0 to 1.33.0 ( #2373 )
...
Bumps [openai](https://github.com/openai/openai-python) from 1.32.0 to 1.33.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/openai/openai-python/compare/v1.32.0...v1.33.0)
---
updated-dependencies:
- dependency-name: openai
  dependency-type: direct:development
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:32:52 +00:00
dependabot[bot]
f29a2704f2
chore(deps): bump boto3 from 1.34.121 to 1.34.122 ( #2372 )
...
Bumps [boto3](https://github.com/boto/boto3) from 1.34.121 to 1.34.122.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/boto/boto3/compare/1.34.121...1.34.122)
---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:25:38 +00:00
dependabot[bot]
a3062ba4d0
chore(deps): bump litellm from 1.40.4 to 1.40.7 ( #2370 )
...
Bumps [litellm](https://github.com/BerriAI/litellm) from 1.40.4 to 1.40.7.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](https://github.com/BerriAI/litellm/compare/v1.40.4...v1.40.7)
---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-10 15:22:30 +00:00
tobitege
f1760f3a67
remove some MonologueAgent mentions ( #2364 )
2024-06-10 11:57:37 +00:00
Yufan Song
f7491bd2fa
Refactor response to action in agent step ( #2350 )
...
* refactor action parser
* Fix typos
* fix typo
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-10 10:17:30 +00:00
Robert
7fc57650f3
BioCoder integration ( #2076 )
...
* prepare execution and inference
* Create README.md
* Update README.md
* Update evaluation/biocoder/README.md
* Update evaluation/swe_bench/swe_env_box.py
* switch to biocoder docker container and test-specific code
* code for copying and running test files into container
* add metrics
* add readme
* Biocoder evaluation code finished (rewrite testing infrastructure, prompt tuning, and bug fixes)
* Update README.md
---------
Co-authored-by: lilbillybiscuit <qianbill2014@outlook.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-10 11:11:40 +08:00
Boxuan Li
91ddd93756
conftest: Exit without revealing secrets ( #2351 )
2024-06-10 10:47:31 +08:00
மனோஜ்குமார் பழனிச்சாமி
003b599dd0
Issues Category Update: Removed Question Type ( #2345 )
...
We've removed the "Question" type from the Issues category to streamline our issue-tracking process. This change will help us focus on actionable issues and feature requests. If you have any questions or discussions, please use the Discussions tab. This is better suited for community engagement, sharing knowledge, and getting help from other contributors.
2024-06-09 21:14:56 -04:00
tobitege
41344f0dfe
remove backtick handling from run_ipython ( #2347 )
2024-06-09 22:53:06 +00:00
RainRat
745ae42a72
fix typos ( #2352 )
2024-06-09 12:57:58 -07:00
Serg Kryvonos
a400e94971
Parameterize Python version ( #2348 )
2024-06-09 17:29:37 +00:00
Temo
e925cefeef
Refactored prompt.py to reduce token usage ( #1996 )
...
* Refactored prompt.py to reduce token usage
* Reverted some destructive changes
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* Apply suggestions from code review
* Apply suggestions from code review
* Update agenthub/codeact_agent/prompt.py
* fix integration test
* make lint
* feat: support ToolQA benchmark (#2263 )
* Add files via upload
* Update README.md
* Update run_infer.py
* Update utils.py
* make lint
* Update evaluation/toolqa/run_infer.py
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* feat: revert hidden special paths change in file action (#2328)
* revert change in file action
* remove useless code
* make lint
* Support gpqa benchmark evaluation (#2080 )
* feat: add gpqa benchmark evaluation
* add metrics
* reset configs in final block
* make lint
---------
Co-authored-by: yufansong <yufan@risingwave-labs.com>
* fix(frontend): prevent API key from resetting after modal change (#2329 )
* remove bottom chatbox fade
* Modal wider; fix lint error
* settings: attempt to not clear api key for same provider
* prevent api key from resetting after changing the model
* revert other changes and fix post test tear down error
---------
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
* fix: codeact bug [If running a command that never returns, it gets stuck #1895 ] (#2034 )
* fix: codeact bug https://github.com/OpenDevin/OpenDevin/issues/1895
* fix: add CmdRunAction timeout hint.
* Update agenthub/codeact_agent/prompt.py
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* regenerate integration test
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
* Feat: Support Gorilla APIBench (#2081 )
* removed unused files from gorilla
* Update run_infer.py, removed unused imports
* Update utils.py
* Update ast_eval_hf.py
* Update ast_eval_tf.py
* Update ast_eval_th.py
* Create README.md
* Update run_infer.py
* make lint
* Update run_infer.py
* fix lint
---------
Co-authored-by: yufansong <yufan@risingwave-labs.com>
* remove useless (#2332)
* fix integration test
* Update agenthub/codeact_agent/prompt.py
* Update agenthub/codeact_agent/prompt.py
* fix integration test
---------
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: yueqis <141804823+yueqis@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Jaskirat Singh <1.jaskiratsingh@gmail.com>
Co-authored-by: tobitege <tobitege@gmx.de>
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-06-09 10:19:05 -07:00
Bibek Poudel
221a4e83f1
doc: Added citation subsection in README ( #2339 )
...
* added citation in readme
* minor change to date format
* Update README.md
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
---------
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-06-09 14:05:35 +00:00
Frank Xu
bd00f0f049
Restore previous browsing agent behavior when evaluating on WebArena and miniwob++ only ( #2341 )
...
* restore eval mode
* fix
2024-06-09 04:10:02 -04:00
Engel Nyst
fab8c9003b
remove deprecated github-token config ( #2334 )
...
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
2024-06-09 09:50:24 +02:00
மனோஜ்குமார் பழனிச்சாமி
e0ad289483
Downgraded Python version to 3.12.3 ( #2331 )
...
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-09 11:54:30 +05:30
Boxuan Li
a9a2f10170
Revamp AgentRejectAction and allow ManagerAgent to handle rejection ( #1735 )
...
* Fix AgentRejectAction handling
* Add ManagerAgent to integration tests
* Fix regenerate.sh
* Fix merge
* Update README for micro-agents
* Add test reject to regenerate.sh
* regenerate.sh: Add support for running a specific test and/or agent
* Refine reject schema, and allow ManagerAgent to handle reject
* Add test artifacts for test_simple_task_rejection
* Fix manager agent tests
* Fix README
* test_simple_task_rejection: check final agent state
* Integration test: exit if mock prompt not found
* Update test_simple_task_rejection tests
* Fix test_edits test artifacts after prompt update
* Fix ManagerAgent test_edits
* WIP
* Fix tests
* update test_edits for ManagerAgent
* Skip local sandbox for reject test
* Fix test comparison
2024-06-08 23:12:30 -07:00
tobitege
c062468dcf
fix: warning about zope-interface (pyproject) ( #2335 )
2024-06-08 22:51:55 +00:00
tobitege
a97d0767e9
fix: Backticks get always escaped by runtime; add Ipython test ( #2321 )
...
* added tests related to backticks
* updated .gitignore
* added extra linter test for #2210
* hotfix for integration test
* added test_ipython unit test
* added test_ipython unit test
* remove draft test from test_ipython.py
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-08 21:02:27 +00:00
Yufan Song
1bdf8752e6
remove useless ( #2332 )
2024-06-08 19:04:43 +00:00
yueqis
68d9ad61cf
Feat: Support Gorilla APIBench ( #2081 )
...
* removed unused files from gorilla
* Update run_infer.py, removed unused imports
* Update utils.py
* Update ast_eval_hf.py
* Update ast_eval_tf.py
* Update ast_eval_th.py
* Create README.md
* Update run_infer.py
* make lint
* Update run_infer.py
* fix lint
---------
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-08 16:54:54 +00:00
Aaron Xia
b5a17efc45
fix: codeact bug [If running a command that never returns, it gets stuck #1895 ] ( #2034 )
...
* fix: codeact bug https://github.com/OpenDevin/OpenDevin/issues/1895
* fix: add CmdRunAction timeout hint.
* Update agenthub/codeact_agent/prompt.py
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
* regenerate integration test
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-08 16:40:23 +00:00
tobitege
a8c6fd0d42
fix(frontend): prevent API key from resetting after modal change ( #2329 )
...
* remove bottom chatbox fade
* Modal wider; fix lint error
* settings: attempt to not clear api key for same provider
* prevent api key from resetting after changing the model
* revert other changes and fix post test tear down error
---------
Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
2024-06-08 16:27:43 +00:00
Jaskirat Singh
e8307608c2
Support gpqa benchmark evaluation ( #2080 )
...
* feat: add gpqa benchmark evaluation
* add metrics
* reset configs in final block
* make lint
---------
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-08 16:24:24 +00:00
Yufan Song
06a6ffcb09
feat: revert hidden special paths change in file action ( #2328 )
...
* revert change in file action
* remove useless code
* make lint
2024-06-08 12:12:52 +00:00
yueqis
82d4d25b09
feat: support ToolQA benchmark ( #2263 )
...
* Add files via upload
* Update README.md
* Update run_infer.py
* Update utils.py
* make lint
* Update evaluation/toolqa/run_infer.py
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-08 07:54:01 -04:00
Xingyao Wang
903381f16e
Add back jupyter PWD env var for agentskills ( #2327 )
...
* add back jupyter pwd env var for agentskills
* add unit test for pwd change in execute_cli
2024-06-08 08:51:42 +00:00
tobitege
c3c2b2d7b6
fix: remove bottom chatbox fade (frontend) ( #2323 )
...
* remove bottom chatbox fade
* Modal wider; fix lint error
2024-06-08 07:09:21 +00:00
tobitege
5e42f140cb
fix: hide special paths; sort models ( #2325 )
2024-06-08 02:13:11 +00:00