Leo
c2f557edde
refactor: multiple code improvements (#2771)
2024-07-04 18:51:22 +08:00
Engel Nyst
80fe13f4be
rename our completion as a drop-in replacement of litellm completion (#2509)
2024-06-19 05:25:25 +02:00
Yufan Song
fd29b8faa8
refactor (#2442)
2024-06-14 19:02:25 -04:00
Yufan Song
c6951eb6c1
refactor browsing agent response parse (#2366)
* refactor browsing
* add comments
* change file name
* Rename resposne_parser.py to response_parser.py
* Fixed typos
* Typo fix
---------
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-11 03:46:33 +00:00
Frank Xu
bd00f0f049
Restore previous browsing agent behavior when evaluating on WebArena and miniwob++ only (#2341)
* restore eval mode
* fix
2024-06-09 04:10:02 -04:00
tobitege
b431fce938
tests: more Agentskills tests; updated .gitignore (#2307)
* added tests related to backticks
* updated .gitignore
* added extra linter test for #2210
* hotfix for integration test
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
2024-06-07 16:29:03 +00:00
Boxuan Li
45ce09d70e
CodeActAgent: Delegate to BrowsingAgent for browsing tasks (#2103)
2024-06-07 00:53:47 -07:00
Frank Xu
48151bdbb0
[feat] WebArena benchmark, MiniWoB++ benchmark and related arch changes (#2170)
* add webarena, and revamp messaging for webarena eval
* add changes for browsergym
* update infer script
* fix unit tests
* update
* add multiple run for miniwob
* update instruction, remove personal path
* update
* add code for getting final reward, fix integration, add results
* add avg cost calculation
2024-06-06 09:01:20 +08:00
RainRat
3b0e1361a4
fix typos (#2267)
* fix typos
no functional change
* fix typos
* fix typos
* fix integration test
---------
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Leo <ifuryst@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
2024-06-05 23:06:40 +08:00
Leo
9ada36e30b
fix: restore python linting. (#2228)
* fix: restore python linting.
Signed-off-by: ifuryst <ifuryst@gmail.com>
* update: extend the Python lint check to evaluation.
Signed-off-by: ifuryst <ifuryst@gmail.com>
* Update evaluation/logic_reasoning/instruction.txt
---------
Signed-off-by: ifuryst <ifuryst@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-06-04 06:36:19 +00:00
RainRat
ed6dcc8381
fix typos (#2187)
* fix typos
no functional change
* fix typos
2024-06-01 20:40:30 +00:00
Aaron Xia
42c6b506b5
Lazy launching BrowseEnv / making BrowseEnv optional (#2155)
* feat: lazy launching browser; browser optional for different agents.
* style: lint
* fix: integration test fail due to browser not started.
* fix: run by cli and integration test failed.
* fix: lint
* fix: lint
---------
Co-authored-by: Graham Neubig <neubig@gmail.com>
2024-05-31 16:40:42 -04:00
Frank Xu
53f64ffa06
Improve browsing agent prompts, allowing agent to properly finish when done (#1993)
* improve browsing agent, allowing it to properly finish.
* handle parsing error, show the user the agent's browsing thoughts in the front end
---------
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
2024-05-24 00:02:19 -07:00
Frank Xu
1fe290adf9
[Feat] A competitive Web Browsing agent (#1856)
* initial attempt at a browsing only agent
* add browsing agent
* update
* implement agent
* update
* fix comments
* remove unnecessary things from memory extras
* update image processing
---------
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
2024-05-21 19:20:33 +00:00