OpenHands/evaluation/utils
Graham Neubig a081935fd8 Simplify eval code (#2775)
* Start simplifying eval code

* Update

* Add EDA

* Updated GAIA

* Update gpqa

* Add humanevalfix

* Fix logic_reasoning

* Add miniwob

* Add mint and ml_bench

* toolqa

* Added swe-bench

* Fixed webarena

* Refactor parameters
2024-07-05 19:33:08 +09:00