OpenHands

mirror of https://github.com/All-Hands-AI/OpenHands.git synced 2026-01-14 01:08:01 -05:00

Files

Ryan H. Tran 01296ff79d Add remaining subsets for MINT benchmark (#2142)

* add MMLU subset

* add theoremqa subset

* remove redundant packages from requirements.txt, adjust prompts, handle the case where gpt3.5 proposes a wrong answer after a correct answer

* add MBPP subset

* add humaneval subset

* update README

* exit actively after the agent finishes the task

2024-05-31 20:04:13 +00:00

humaneval

Add remaining subsets for MINT benchmark (#2142)

2024-05-31 20:04:13 +00:00

mbpp

Add remaining subsets for MINT benchmark (#2142)

2024-05-31 20:04:13 +00:00

reasoning

Add remaining subsets for MINT benchmark (#2142)

2024-05-31 20:04:13 +00:00