Graham Neubig
|
a081935fd8
|
Simplify eval code (#2775)
* Start simplifying eval code
* Update
* Add EDA
* Updated GAIA
* Update gpqa
* Add humanevalfix
* Fix logic_reasoning
* Add miniwob
* Add mint and ml_bench
* toolqa
* Added swe-bench
* Fixed webarena
* Refactor parameters
|
2024-07-05 19:33:08 +09:00 |
|