Display results per category (#104)

merwanehamadi
2023-07-14 18:45:24 -07:00
committed by GitHub
parent 66fc7ccb31
commit 8be2a0b2e1


@@ -8,69 +8,30 @@ Radio chart for each agent coming soon !
## Detailed results
:warning: These results are constantly evolving at the moment. We will publish an official benchmark result very soon.
### Auto-GPT
Interface
| Task | Results |
|--------------|---------------------|
| Write File | :white_check_mark: |
| Read File | :white_check_mark: |
| Search File | :x: |
| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
|--------------|--------------------|--------------------|----------|--------------------|
| Write File | :white_check_mark: | :white_check_mark: | tbd | :white_check_mark: |
| Read File | :white_check_mark: | :x: | tbd | :x: |
| Search File | :x: | :x: | tbd | :x: |
Code
| Task | Results |
|-----------------------------------|----------------------|
| Debug Simple Typo With Guidance | :x: |
| Debug Simple Typo Without Guidance| :x: |
| Basic Code Generation | :white_check_mark: |
| Create Simple Web Server | :x: |
| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
|------------------------------------|--------------------|--------------------|----------|--------------------|
| Debug Simple Typo With Guidance | :x: | :x: | tbd | :x: |
| Debug Simple Typo Without Guidance | :x: | :x: | tbd | :x: |
| Basic Code Generation | :white_check_mark: | :white_check_mark: | tbd | :white_check_mark: |
| Create Simple Web Server | :x: | :x: | tbd | :x: |
Memory
| Task | Results |
| Task | Auto-GPT |
|--------------------------------------------|--------------------|
| Basic Memory | :white_check_mark: |
| Remember Multiple Ids | :x: |
| Remember Multiple Ids With Noise | :x: |
| Remember Multiple Phrases With Noise | :x: |
### gpt-engineer
Interface
| Task | Results |
|-------------|--------------------|
| Write File | :white_check_mark: |
| Read File | :x: |
| Search File | :x: |
Code
| Task | Results |
|-----------------------------------|----------------------|
| Debug Simple Typo With Guidance | :x: |
| Debug Simple Typo Without Guidance| :x: |
| Basic Code Generation | :white_check_mark: |
| Create Simple Web Server | :x: |
### mini-agi
Coming Soon!
### smol-developer
Interface
| Task | Results |
|-------------|--------------------|
| Write File | :white_check_mark: |
| Read File | :x: |
| Search File | :x: |
Code
| Task | Results |
|-----------------------------------|----------------------|
| Debug Simple Typo With Guidance | :x: |
| Debug Simple Typo Without Guidance| :x: |
| Basic Code Generation | :white_check_mark: |
| Create Simple Web Server | :x: |
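
The regrouping in this commit (one table per category, one column per agent) amounts to a small pivot over per-agent results. The following is only an illustrative sketch, not the benchmark's actual reporting code: `RESULTS`, `ICONS`, and `category_table` are hypothetical names, and the values are copied from the Interface rows in this diff, with mini-agi left empty because its results are still "tbd".

```python
from typing import Dict, List, Optional

# Hypothetical per-agent results for the Interface category, copied from this diff.
# True = passed, False = failed; a missing task means "not run yet".
RESULTS: Dict[str, Dict[str, bool]] = {
    "Auto-GPT": {"Write File": True, "Read File": True, "Search File": False},
    "gpt-engineer": {"Write File": True, "Read File": False, "Search File": False},
    "mini-agi": {},  # no results published yet
    "smol-developer": {"Write File": True, "Read File": False, "Search File": False},
}

# Map a result (or its absence) to the cell text used in the README tables.
ICONS: Dict[Optional[bool], str] = {
    True: ":white_check_mark:",
    False: ":x:",
    None: "tbd",
}


def category_table(results: Dict[str, Dict[str, bool]], tasks: List[str]) -> str:
    """Render one Markdown table with a row per task and a column per agent."""
    agents = list(results)
    header = "| Task | " + " | ".join(agents) + " |"
    separator = "|" + "---|" * (len(agents) + 1)
    rows = [header, separator]
    for task in tasks:
        # dict.get returns None for tasks an agent has not run, which maps to "tbd".
        cells = [ICONS[results[agent].get(task)] for agent in agents]
        rows.append(f"| {task} | " + " | ".join(cells) + " |")
    return "\n".join(rows)


print(category_table(RESULTS, ["Write File", "Read File", "Search File"]))
```

Keeping the results keyed by agent and pivoting only at render time means a new agent adds a column to every category table without any change to the table code.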