OpenHands/evaluation/mint/tasks/__init__.py at ea6dcd8ac8353b1fa2cd080323bc380f6aa70ba2 - OpenHands - AtHeartEngineering

mirror of https://github.com/All-Hands-AI/OpenHands.git synced 2026-04-29 03:00:45 -04:00

Ryan H. Tran 01296ff79d Add remaining subsets for MINT benchmark (#2142)

* add MMLU subset

* add theoremqa subset

* remove redundant packages from requirements.txt, adjust prompts, handle gpt3.5 proposing a wrong answer after a correct answer

* add MBPP subset

* add humaneval subset

* update README

* exit actively after the agent finishes the task

2024-05-31 20:04:13 +00:00

from .base import Task
from .codegen import HumanEvalTask, MBPPTask
from .reasoning import MultipleChoiceTask, ReasoningTask, TheoremqaTask

__all__ = [
    'Task',
    'MultipleChoiceTask',
    'ReasoningTask',
    'TheoremqaTask',
    'MBPPTask',
    'HumanEvalTask',
]
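
The `__all__` list above determines which names a wildcard import of the package exposes. The following is a minimal sketch of that behavior using a throwaway in-memory module. The module name `demo_tasks` and the `_Helper` class are hypothetical stand-ins for illustration and are not part of OpenHands.

```python
import sys
import types

# Hypothetical stand-in for a package like the one above; not OpenHands code.
demo = types.ModuleType('demo_tasks')
exec(
    "import os\n"
    "class Task: pass\n"
    "class MBPPTask(Task): pass\n"
    "class _Helper: pass\n"
    "__all__ = ['Task', 'MBPPTask']\n",
    demo.__dict__,
)
# Register the module so the import machinery can find it by name.
sys.modules['demo_tasks'] = demo

# Run a wildcard import into an isolated namespace and inspect the result.
namespace = {}
exec('from demo_tasks import *', namespace)
exported = sorted(name for name in namespace if not name.startswith('__'))
print(exported)  # ['MBPPTask', 'Task']
```

Only the names listed in `__all__` arrive in the importing namespace. The `os` module and `_Helper` stay private to `demo_tasks`, although they remain reachable through explicit attribute access such as `demo_tasks.os`.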