AutoGPT/benchmark/agbenchmark/challenges
abilities Add more data challenges (#5390) 2023-09-28 19:30:08 -07:00
alignment Add more data challenges (#5390) 2023-09-28 19:30:08 -07:00
library Set up unified pre-commit + CI w/ linting + type checking & FIX EVERYTHING (#7171) 2024-05-28 05:04:21 +02:00
verticals Set up unified pre-commit + CI w/ linting + type checking & FIX EVERYTHING (#7171) 2024-05-28 05:04:21 +02:00
CHALLENGE.md fix typos (#7123) 2024-05-31 11:16:23 +02:00
README.md Benchmark changes 2023-09-12 12:13:39 -07:00
__init__.py feat(benchmark): JungleGym WebArena (#6691) 2024-01-19 20:34:04 +01:00
base.py Set up unified pre-commit + CI w/ linting + type checking & FIX EVERYTHING (#7171) 2024-05-28 05:04:21 +02:00
builtin.py fix(benchmark): Improve output and debug logging of pytest evals 2024-05-30 17:16:17 +02:00
optional_categories.json Benchmark changes 2023-09-12 12:13:39 -07:00
webarena.py Set up unified pre-commit + CI w/ linting + type checking & FIX EVERYTHING (#7171) 2024-05-28 05:04:21 +02:00
webarena_selection.json fix(benchmark): Mock mode, python evals, `--attempts` flag, challenge definitions 2024-02-14 01:05:34 +01:00

README.md

This is the official challenge library for https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks

The goal of this repo is to make it easy to create challenges for test-driven development with the Auto-GPT-Benchmarks package. It is essentially a library for crafting challenges using a DSL (JSON files in this case).
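To make the JSON DSL idea concrete, here is a minimal sketch of what a challenge definition could look like and how it round-trips through JSON. The field names (`name`, `category`, `task`, `dependencies`, `ground`) and the `missing_fields` helper are assumptions made for illustration; CHALLENGE.md is the place to check the fields the benchmark actually expects.

```python
import json

# Hypothetical challenge definition. The field names below are illustrative
# assumptions, not the confirmed schema; see CHALLENGE.md for the real one.
challenge = {
    "name": "TestWriteFile",
    "category": ["interface"],
    "task": "Write the word 'Washington' to a .txt file",
    "dependencies": [],
    "ground": {
        "answer": "The word 'Washington', written to a .txt file",
        "should_contain": ["Washington"],
        "files": [".txt"],
    },
}

# Assumed set of top-level keys a challenge needs in this sketch.
REQUIRED = ("name", "category", "task", "dependencies", "ground")


def missing_fields(spec, required=REQUIRED):
    """Return the required top-level keys that are absent from spec."""
    return [key for key in required if key not in spec]


# Serialize to JSON (what would be stored on disk) and load it back.
serialized = json.dumps(challenge, indent=2)
roundtrip = json.loads(serialized)

print(serialized)
print("missing fields:", missing_fields(roundtrip))
```

Keeping challenges as plain data like this means a new challenge can be added without writing any test code; only the definition file changes.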

This is the up-to-date dependency graph: https://sapphire-denys-23.tiiny.site/

How to use

Make sure the package is installed: `pip install agbenchmark`.

If you only want to use the default challenges, you don't need this repo. Install the package and the default challenges are available.

To add new challenges as you develop, add this repo as a submodule to your project/agbenchmark folder. Any new challenges you add within the submodule will get registered automatically.
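As a rough illustration of what automatic registration could involve, the sketch below walks a folder tree and collects every JSON file that looks like a challenge. This is not the package's actual discovery code: the `discover_challenges` function, the `my_challenges/write_file/data.json` layout, and the rule "a challenge is a JSON object with `name` and `task`" are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def discover_challenges(root: Path) -> dict[str, Path]:
    """Map challenge names to the JSON files that define them.

    Assumption for this sketch: any JSON object with both a "name" and a
    "task" key counts as a challenge; other JSON files are skipped.
    """
    found = {}
    for path in sorted(root.rglob("*.json")):
        try:
            spec = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            continue  # ignore files that are not valid JSON
        if isinstance(spec, dict) and "name" in spec and "task" in spec:
            found[spec["name"]] = path
    return found


# Demo on a throwaway directory standing in for the submodule folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    demo = root / "my_challenges" / "write_file"  # hypothetical layout
    demo.mkdir(parents=True)
    (demo / "data.json").write_text(
        json.dumps({"name": "TestWriteFile", "task": "Write a file"}),
        encoding="utf-8",
    )
    # A JSON file that is not a challenge (contents are made up) is skipped.
    (root / "optional_categories.json").write_text(
        json.dumps({"optional_categories": []}), encoding="utf-8"
    )
    registry = discover_challenges(root)
    names = sorted(registry)
    print(names)
```

With discovery of this kind, dropping a new definition file anywhere under the submodule is enough for it to be picked up on the next run, which matches the workflow described above.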