AutoGPT/benchmark/agbenchmark/challenges
Latest commit 7cb4d4a903 by Krzysztof Czerwinski:
feat(forge, agent, benchmark): Upgrade to Pydantic v2 (#7280)
Update Pydantic dependency of `autogpt`, `forge` and `benchmark` to `^2.7`
[Pydantic Migration Guide](https://docs.pydantic.dev/2.7/migration/)

- Migrate usages of now-deprecated functions to their replacements
- Update `Field` definitions
  - Ellipsis `...` for required fields is deprecated
  - `Field` no longer supports extra `kwargs`; replace uses of this feature with field metadata
- Replace the `Config` class for specifying model configuration with `model_config = ConfigDict(...)` (see the sketch after this list)
- Removed `ModelContainer` in `BaseAgent`, component configuration dict is now directly serialized using Pydantic v2 helper functions
- Forked `agent-protocol` and updated `packages/client/python` for Pydantic v2 support: https://github.com/Significant-Gravitas/agent-protocol
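
For illustration, a minimal sketch of what these changes look like on a made-up model (the class and fields below are assumptions, not code from this repo):

```python
from pydantic import BaseModel, ConfigDict, Field

class AgentSettings(BaseModel):  # hypothetical model, for illustration only
    # v2: configuration moves from a nested `class Config` to `model_config`
    model_config = ConfigDict(extra="forbid")

    # v2: required fields are plain annotations; `Field(...)` is deprecated
    name: str

    # v2: extra kwargs on `Field` are no longer accepted; attach metadata
    # via `json_schema_extra` instead
    max_steps: int = Field(default=10, json_schema_extra={"user_configurable": True})

settings = AgentSettings(name="agent")
data = settings.model_dump()                   # v1: settings.dict()
text = settings.model_dump_json()              # v1: settings.json()
restored = AgentSettings.model_validate(data)  # v1: AgentSettings.parse_obj(data)
```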

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2024-07-02 20:45:32 +02:00
| Name | Last commit | Date |
| --- | --- | --- |
| `abilities` | Add more data challenges (#5390) | 2023-09-28 19:30:08 -07:00 |
| `alignment` | Add more data challenges (#5390) | 2023-09-28 19:30:08 -07:00 |
| `library` | Set up unified pre-commit + CI w/ linting + type checking & FIX EVERYTHING (#7171) | 2024-05-28 05:04:21 +02:00 |
| `verticals` | feat(forge, agent, benchmark): Upgrade to Pydantic v2 (#7280) | 2024-07-02 20:45:32 +02:00 |
| `CHALLENGE.md` | fix typos (#7123) | 2024-05-31 11:16:23 +02:00 |
| `README.md` | Benchmark changes | 2023-09-12 12:13:39 -07:00 |
| `__init__.py` | feat(benchmark): JungleGym WebArena (#6691) | 2024-01-19 20:34:04 +01:00 |
| `base.py` | Set up unified pre-commit + CI w/ linting + type checking & FIX EVERYTHING (#7171) | 2024-05-28 05:04:21 +02:00 |
| `builtin.py` | feat(forge, agent, benchmark): Upgrade to Pydantic v2 (#7280) | 2024-07-02 20:45:32 +02:00 |
| `optional_categories.json` | Benchmark changes | 2023-09-12 12:13:39 -07:00 |
| `webarena.py` | feat(forge, agent, benchmark): Upgrade to Pydantic v2 (#7280) | 2024-07-02 20:45:32 +02:00 |
| `webarena_selection.json` | fix(benchmark): Mock mode, python evals, `--attempts` flag, challenge definitions | 2024-02-14 01:05:34 +01:00 |

README.md

This is the official challenge library for [Auto-GPT-Benchmarks](https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks).

The goal of this repo is to make it easy to create challenges for test-driven development with the Auto-GPT-Benchmarks package. It is essentially a library for crafting challenges using a DSL (JSON files, in this case); a sketch of one follows below.
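
For a concrete feel of the format, here is a made-up sketch of a challenge definition written out as a `data.json` file. The field names are assumptions chosen for illustration, not the authoritative schema (the real challenge models live in `base.py` and `builtin.py`):

```python
import json

# Hypothetical challenge definition -- the field names are illustrative
# guesses, not the authoritative schema.
challenge = {
    "name": "WriteFile",
    "category": ["abilities"],
    "task": "Write the word 'Washington' to a .txt file.",
    "cutoff": 60,  # assumed time limit for the attempt, in seconds
    "ground": {
        "answer": "The word 'Washington' in a .txt file",
        "should_contain": ["Washington"],
        "files": [".txt"],
    },
}

# Challenges are plain JSON files, so crafting one is just writing it out:
with open("data.json", "w") as f:
    json.dump(challenge, f, indent=2)
```

Because the DSL is plain JSON, new challenges can be versioned, diffed, and reviewed like any other test fixture.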

An up-to-date dependency graph is available here: https://sapphire-denys-23.tiiny.site/

How to use

Make sure you have the package installed with `pip install agbenchmark`.

If you just want the default challenges, you don't need this repo: installing the package gives you access to them.

To add new challenges as you develop, add this repo as a submodule in your `project/agbenchmark` folder (e.g. with `git submodule add`). Any new challenges you add within the submodule are registered automatically.