OpenDevin: Code Less, Make More

Check out the documentation

Welcome to OpenDevin, a platform for autonomous software engineers, powered by AI and LLMs.

OpenDevin agents collaborate with human developers to write code, fix bugs, and ship features.

App screenshot

Quick Start

The easiest way to run OpenDevin is with Docker. It works best with Docker 26.0.0, the most recent version at the time of writing.

# The directory you want OpenDevin to modify. MUST be an absolute path!
export WORKSPACE_BASE="$(pwd)/workspace"

# Quote the expansions so that paths containing spaces are passed intact.
docker run \
    -it \
    --pull=always \
    -e SANDBOX_USER_ID="$(id -u)" \
    -e WORKSPACE_MOUNT_PATH="$WORKSPACE_BASE" \
    -v "$WORKSPACE_BASE":/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    ghcr.io/opendevin/opendevin:0.5
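
Because the workspace path must be absolute, it can help to check it before starting the container. The following is a minimal sketch, not part of OpenDevin: the helper names `is_absolute_path` and `prepare_workspace` are hypothetical and introduced here only for illustration. It rejects relative paths and creates the directory up front, so it is owned by your user rather than created implicitly by Docker.

```shell
# Hypothetical helper (not part of OpenDevin):
# succeeds only if the argument starts with "/", i.e. is an absolute path.
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Hypothetical helper (not part of OpenDevin):
# validates the path, creates the directory if missing, and prints the path.
prepare_workspace() {
    if ! is_absolute_path "$1"; then
        echo "error: workspace must be an absolute path, got: $1" >&2
        return 1
    fi
    mkdir -p "$1" && printf '%s\n' "$1"
}
```

Under these assumptions, the export from the Quick Start could be written as `export WORKSPACE_BASE="$(prepare_workspace "$(pwd)/workspace")"`, followed by the same `docker run` command.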

🚀 Documentation

To learn more about the project, and for tips on using OpenDevin, check out our documentation.

There you'll find guides to using different LLM providers (such as Ollama and Anthropic's Claude), troubleshooting help, and advanced configuration options.

🤝 How to Contribute

OpenDevin is a community-driven project, and we welcome contributions from everyone. Whether you're a developer, a researcher, or simply enthusiastic about advancing the field of software engineering with AI, there are many ways to get involved:

  • Code Contributions: Help us develop new agents, core functionality, the frontend and other interfaces, or sandboxing solutions.
  • Research and Evaluation: Contribute to our understanding of LLMs in software engineering, participate in evaluating the models, or suggest improvements.
  • Feedback and Testing: Use the OpenDevin toolset, report bugs, suggest features, or provide feedback on usability.

For details, please check CONTRIBUTING.md.

🤖 Join Our Community

Whether you're a developer, a researcher, or simply enthusiastic about OpenDevin, we'd love to have you in our community. Let's make software engineering better together!

  • Slack workspace - Here we talk about research, architecture, and future development.
  • Discord server - This is a community-run server for general discussion, questions, and feedback.

📈 Progress

SWE-Bench Lite Score

Star History Chart

📜 License

Distributed under the MIT License. See LICENSE for more information.
