github/AutoGPT

mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-04-08 03:00:28 -04:00

README.md

Auto-GPT Benchmark

A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.

Scores:

Radar chart for each agent coming soon!

Detailed results

⚠️ These results are still changing. An official benchmark result will be published soon.

Interface

Task	Auto-GPT	gpt-engineer	mini-agi	smol-developer
Write File	❌	✅	tbd	✅
Read File	❌	❌	tbd	❌
Search File	❌	❌	tbd	❌

Code

Task	Auto-GPT	gpt-engineer	mini-agi	smol-developer
Debug Simple Typo With Guidance	❌	❌	tbd	❌
Debug Simple Typo Without Guidance	❌	❌	tbd	❌
Basic Code Generation	❌	✅	tbd	✅
Create Simple Web Server	❌	❌	tbd	❌

Memory

Task	Auto-GPT
Basic Memory	❌
Remember Multiple Ids	❌
Remember Multiple Ids With Noise	❌
Remember Multiple Phrases With Noise	❌
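
The result tables above can be tallied per agent with a few lines of Python. The following is only a sketch and not part of the benchmark itself: the `tally` helper and the `INTERFACE_TABLE` constant are hypothetical names, and the data is the Interface table copied as tab-separated text. It assumes that `tbd` cells should be left out of the scored total rather than counted as failures.

```python
from typing import Dict, Tuple

# Interface table from this README, copied as tab-separated text (assumption:
# columns are separated by a single tab, as in the rendered table).
INTERFACE_TABLE = (
    "Task\tAuto-GPT\tgpt-engineer\tmini-agi\tsmol-developer\n"
    "Write File\t❌\t✅\ttbd\t✅\n"
    "Read File\t❌\t❌\ttbd\t❌\n"
    "Search File\t❌\t❌\ttbd\t❌\n"
)


def tally(table: str) -> Dict[str, Tuple[int, int]]:
    """Return {agent: (passed, scored)} for one tab-separated result table.

    Cells other than the pass/fail marks (for example 'tbd') are skipped,
    so they count neither as passed nor as scored.
    """
    lines = [line for line in table.splitlines() if line.strip()]
    agents = lines[0].split("\t")[1:]  # first column is the task name
    counts = {agent: [0, 0] for agent in agents}
    for row in lines[1:]:
        cells = row.split("\t")[1:]
        for agent, cell in zip(agents, cells):
            mark = cell.strip()
            if mark == "✅":
                counts[agent][0] += 1
                counts[agent][1] += 1
            elif mark == "❌":
                counts[agent][1] += 1
    return {agent: (passed, scored) for agent, (passed, scored) in counts.items()}


if __name__ == "__main__":
    for agent, (passed, scored) in tally(INTERFACE_TABLE).items():
        print(f"{agent}: {passed}/{scored} passed")
```

The same helper can be applied to the Code and Memory tables by pasting them in the same tab-separated form.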
