Metadata-Version: 2.1
Name: agbenchmark
Version: 0.0.1
Summary: Benchmarking the performance of agents far and wide, regardless of how they are set up and how they work
License: MIT
Author: Silen Naihin
Author-email: silen.naihin@gmail.com
Requires-Python: >=3.10,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: click (>=8.1.3,<9.0.0)
Requires-Dist: openai (>=0.27.8,<0.28.0)
Requires-Dist: pexpect (>=4.8.0,<5.0.0)
Requires-Dist: psutil (>=5.9.5,<6.0.0)
Requires-Dist: pydantic (>=1.10.9,<2.0.0)
Requires-Dist: pytest (>=7.3.2,<8.0.0)
Requires-Dist: pytest-depends (>=1.0.1,<2.0.0)
Requires-Dist: python-dotenv (>=0.21.0,<0.22.0)
Requires-Dist: requests (>=2.31.0,<3.0.0)
Requires-Dist: types-requests (>=2.31.0.1,<3.0.0.0)
Description-Content-Type: text/markdown

# Auto-GPT Benchmark

A repo for benchmarking the performance of agents far and wide, regardless of how they are set up and how they work.

## Scores
Radar chart for each agent coming soon!

## Detailed results
:warning: These results are constantly evolving at the moment. We will publish an official benchmark result very soon.

### Interface

| Task         | Auto-GPT | gpt-engineer       | mini-agi | smol-developer     |
|--------------|----------|--------------------|----------|--------------------|
| Write File   | :x:      | :white_check_mark: | tbd      | :white_check_mark: |
| Read File    | :x:      | :x:                | tbd      | :x:                |
| Search File  | :x:      | :x:                | tbd      | :x:                |


### Code

| Task                               | Auto-GPT | gpt-engineer       | mini-agi | smol-developer     |
|------------------------------------|----------|--------------------|----------|--------------------|
| Debug Simple Typo With Guidance    | :x:      | :x:                | tbd      | :x:                |
| Debug Simple Typo Without Guidance | :x:      | :x:                | tbd      | :x:                |
| Basic Code Generation              | :x:      | :white_check_mark: | tbd      | :white_check_mark: |
| Create Simple Web Server           | :x:      | :x:                | tbd      | :x:                |


### Memory

| Task                                       | Auto-GPT |
|--------------------------------------------|----------|
| Basic Memory                               | :x:      |
| Remember Multiple Ids                      | :x:      |
| Remember Multiple Ids With Noise           | :x:      |
| Remember Multiple Phrases With Noise       | :x:      |

