github/AutoGPT

mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-04-30 03:00:41 -04:00


Auto-GPT-Bot 8ebe4085e4 BabyAGI-20230723223007

2023-07-23 22:30:09 +00:00

Always send to google drive (#185)

2023-07-23 14:00:57 -07:00

init agbenchmark

2023-06-18 11:14:54 -04:00

Integrate baby-agi (#168)

2023-07-21 11:15:42 -07:00

Integrate baby-agi (#168)

2023-07-21 11:15:42 -07:00

gpt-engineer-20230716225908

2023-07-16 22:59:08 +00:00

BabyAGI-20230723223007

2023-07-23 22:30:09 +00:00

.env.example

Dynamic home path for runs (#119)

2023-07-16 18:24:06 -07:00

.flake8

Add static linters ci (#45)

2023-07-02 16:14:49 -04:00

.gitignore

Push reports to google drive (#167)

2023-07-18 09:17:45 -07:00

.gitmodules

Integrate baby-agi (#168)

2023-07-21 11:15:42 -07:00

.python-version

Add static linters ci (#45)

2023-07-02 16:14:49 -04:00

json_to_base_64.py

Push reports to google drive (#167)

2023-07-18 09:17:45 -07:00

LICENSE

init agbenchmark

2023-06-18 11:14:54 -04:00

mypy.ini

Added --test, consolidate files, reports working (#83)

2023-07-10 19:25:19 -07:00

poetry.lock

Kill subprocesses when test ends (#172)

2023-07-20 15:41:59 -07:00

pyproject.toml

Release 0.0.2 (#186)

2023-07-23 14:03:21 -07:00

README.md

Update Auto-GPT score (#106)

2023-07-15 09:53:56 -07:00

send_to_googledrive.py

Make spreadsheet dynamic based on branch name (#181)

2023-07-23 12:05:45 -07:00

README.md

Auto-GPT Benchmark

A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.

Scores:

Radar chart for each agent coming soon!

Detailed results

⚠️ These results are constantly evolving at the moment. We will publish an official benchmark result very soon.

Interface

Task	Auto-GPT	gpt-engineer	mini-agi	smol-developer
Write File	❌	✅	tbd	✅
Read File	❌	❌	tbd	❌
Search File	❌	❌	tbd	❌

Code

Task	Auto-GPT	gpt-engineer	mini-agi	smol-developer
Debug Simple Typo With Guidance	❌	❌	tbd	❌
Debug Simple Typo Without Guidance	❌	❌	tbd	❌
Basic Code Generation	❌	✅	tbd	✅
Create Simple Web Server	❌	❌	tbd	❌

Memory

Task	Auto-GPT
Basic Memory	❌
Remember Multiple Ids	❌
Remember Multiple Ids With Noise	❌
Remember Multiple Phrases With Noise	❌
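
The marks in the tables above can be tallied per agent. The following is a minimal sketch, not part of the benchmark itself: the `AGENTS` and `ROWS` names and the `tally` helper are hypothetical, and the marks are transcribed by hand from the Interface, Code, and Memory tables as they stand here. The Memory rows list only Auto-GPT, so the other agents get no entry for those tasks.

```python
# Hypothetical helper for summarising the result tables; not part of agbenchmark.
AGENTS = ["Auto-GPT", "gpt-engineer", "mini-agi", "smol-developer"]

# Each row: task name followed by one mark per agent, in AGENTS order.
# Memory rows carry a single mark because only Auto-GPT is listed there.
ROWS = [
    ("Write File", "❌", "✅", "tbd", "✅"),
    ("Read File", "❌", "❌", "tbd", "❌"),
    ("Search File", "❌", "❌", "tbd", "❌"),
    ("Debug Simple Typo With Guidance", "❌", "❌", "tbd", "❌"),
    ("Debug Simple Typo Without Guidance", "❌", "❌", "tbd", "❌"),
    ("Basic Code Generation", "❌", "✅", "tbd", "✅"),
    ("Create Simple Web Server", "❌", "❌", "tbd", "❌"),
    ("Basic Memory", "❌"),
    ("Remember Multiple Ids", "❌"),
    ("Remember Multiple Ids With Noise", "❌"),
    ("Remember Multiple Phrases With Noise", "❌"),
]

# Map the symbols used in the tables to counter names.
MARK_NAMES = {"✅": "pass", "❌": "fail", "tbd": "tbd"}


def tally(rows, agents=AGENTS):
    """Count pass / fail / tbd marks for each agent."""
    summary = {agent: {"pass": 0, "fail": 0, "tbd": 0} for agent in agents}
    for _task, *marks in rows:
        # zip stops at the shorter sequence, so short rows only touch
        # the agents they actually list.
        for agent, mark in zip(agents, marks):
            summary[agent][MARK_NAMES[mark]] += 1
    return summary


if __name__ == "__main__":
    for agent, counts in tally(ROWS).items():
        print(f"{agent}: {counts['pass']} passed, "
              f"{counts['fail']} failed, {counts['tbd']} tbd")
```

Because the results are still changing, any such tally is only a snapshot of the tables at the time they were transcribed.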
