AutoGPT/docs/content/forge/components/built-in-components.md

220 lines
11 KiB
Markdown

# Built-in Components
This page lists all [🧩 Components](./components.md) and [⚙️ Protocols](./protocols.md) they implement that are natively provided. They are used by the AutoGPT agent.
Some components have additional configuration options listed in the table, see [Component configuration](./components.md/#component-configuration) to learn more.
!!! note
If a configuration field uses environment variable, it still can be passed using configuration model. ### Value from the configuration takes precedence over env var! Env var will be only applied if value in the configuration is not set.
## `SystemComponent`
Essential component to allow an agent to finish.
### DirectiveProvider
- Constraints about API budget
### MessageProvider
- Current time and date
- Remaining API budget and warnings if budget is low
### CommandProvider
- `finish` used when task is completed
## `UserInteractionComponent`
Adds ability to interact with user in CLI.
### CommandProvider
- `ask_user` used to ask user for input
## `FileManagerComponent`
Adds ability to read and write persistent files to local storage, Google Cloud Storage or Amazon's S3.
Necessary for saving and loading agent's state (preserving session).
### `FileManagerConfiguration`
| Config variable | Details | Type | Default |
| ---------------- | -------------------------------------- | ----- | ---------------------------------- |
| `storage_path` | Path to agent files, e.g. state | `str` | `agents/{agent_id}/`[^1] |
| `workspace_path` | Path to files that agent has access to | `str` | `agents/{agent_id}/workspace/`[^1] |
[^1] This option is set dynamically during component construction as opposed to by default inside the configuration model, `{agent_id}` is replaced with the agent's unique identifier.
### DirectiveProvider
- Resource information that it's possible to read and write files
### CommandProvider
- `read_file` used to read file
- `write_file` used to write file
- `list_folder` lists all files in a folder
## `CodeExecutorComponent`
Lets the agent execute non-interactive Shell commands and Python code. Python execution works only if Docker is available.
### `CodeExecutorConfiguration`
| Config variable | Details | Type | Default |
| ------------------------ | ---------------------------------------------------- | --------------------------- | ----------------- |
| `execute_local_commands` | Enable shell command execution | `bool` | `False` |
| `shell_command_control` | Controls which list is used | `"allowlist" \| "denylist"` | `"allowlist"` |
| `shell_allowlist` | List of allowed shell commands | `List[str]` | `[]` |
| `shell_denylist` | List of prohibited shell commands | `List[str]` | `[]` |
| `docker_container_name` | Name of the Docker container used for code execution | `str` | `"agent_sandbox"` |
All shell command configurations are expected to be for convience only. This component is not secure and should not be used in production environments. It is recommended to use more appropriate sandboxing.
### CommandProvider
- `execute_shell` execute shell command
- `execute_shell_popen` execute shell command with popen
- `execute_python_code` execute Python code
- `execute_python_file` execute Python file
## `ActionHistoryComponent`
Keeps track of agent's actions and their outcomes. Provides their summary to the prompt.
### `ActionHistoryConfiguration`
| Config variable | Details | Type | Default |
| ---------------------- | ------------------------------------------------------- | ----------- | ------------------ |
| `llm_name` | Name of the llm model used to compress the history | `ModelName` | `"gpt-3.5-turbo"` |
| `max_tokens` | Maximum number of tokens to use for the history summary | `int` | `1024` |
| `spacy_language_model` | Language model used for summary chunking using spacy | `str` | `"en_core_web_sm"` |
| `full_message_count` | Number of cycles to include unsummarized in the prompt | `int` | `4` |
### MessageProvider
- Agent's progress summary
### AfterParse
- Register agent's action
### ExecutionFailure
- Rewinds the agent's action, so it isn't saved
### AfterExecute
- Saves the agent's action result in the history
## `GitOperationsComponent`
Adds ability to iteract with git repositories and GitHub.
### `GitOperationsConfiguration`
| Config variable | Details | Type | Default |
| ----------------- | ----------------------------------------- | ----- | ------- |
| `github_username` | GitHub username, *ENV:* `GITHUB_USERNAME` | `str` | `None` |
| `github_api_key` | GitHub API key, *ENV:* `GITHUB_API_KEY` | `str` | `None` |
### CommandProvider
- `clone_repository` used to clone a git repository
## `ImageGeneratorComponent`
Adds ability to generate images using various providers.
### Hugging Face
To use text-to-image models from Hugging Face, you need a Hugging Face API token.
Link to the appropriate settings page: [Hugging Face > Settings > Tokens](https://huggingface.co/settings/tokens)
### Stable Diffusion WebUI
It is possible to use your own self-hosted Stable Diffusion WebUI with AutoGPT. ### Make sure you are running WebUI with `--api` enabled.
### `ImageGeneratorConfiguration`
| Config variable | Details | Type | Default |
| ------------------------- | ------------------------------------------------------------- | --------------------------------------- | --------------------------------- |
| `image_provider` | Image generation provider | `"dalle" \| "huggingface" \| "sdwebui"` | `"dalle"` |
| `huggingface_image_model` | Hugging Face image model, see [available models] | `str` | `"CompVis/stable-diffusion-v1-4"` |
| `huggingface_api_token` | Hugging Face API token, *ENV:* `HUGGINGFACE_API_TOKEN` | `str` | `None` |
| `sd_webui_url` | URL to self-hosted Stable Diffusion WebUI | `str` | `"http://localhost:7860"` |
| `sd_webui_auth` | Basic auth for Stable Diffusion WebUI, *ENV:* `SD_WEBUI_AUTH` | `str` of format `{username}:{password}` | `None` |
[available models]: https://huggingface.co/models?pipeline_tag=text-to-image
### CommandProvider
- `generate_image` used to generate an image given a prompt
## `WebSearchComponent`
Allows agent to search the web. Google credentials aren't required for DuckDuckGo. [Instructions how to set up Google API key](../../classic/configuration/search.md)
### `WebSearchConfiguration`
| Config variable | Details | Type | Default |
| -------------------------------- | ----------------------------------------------------------------------- | --------------------------- | ------- |
| `google_api_key` | Google API key, *ENV:* `GOOGLE_API_KEY` | `str` | `None` |
| `google_custom_search_engine_id` | Google Custom Search Engine ID, *ENV:* `GOOGLE_CUSTOM_SEARCH_ENGINE_ID` | `str` | `None` |
| `duckduckgo_max_attempts` | Maximum number of attempts to search using DuckDuckGo | `int` | `3` |
| `duckduckgo_backend` | Backend to be used for DDG sdk | `"api" \| "html" \| "lite"` | `"api"` |
### DirectiveProvider
- Resource information that it's possible to search the web
### CommandProvider
- `search_web` used to search the web using DuckDuckGo
- `google` used to search the web using Google, requires API key
## `WebSeleniumComponent`
Allows agent to read websites using Selenium.
### `WebSeleniumConfiguration`
| Config variable | Details | Type | Default |
| ----------------------------- | ------------------------------------------- | --------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| `llm_name` | Name of the llm model used to read websites | `ModelName` | `"gpt-3.5-turbo"` |
| `web_browser` | Web browser used by Selenium | `"chrome" \| "firefox" \| "safari" \| "edge"` | `"chrome"` |
| `headless` | Run browser in headless mode | `bool` | `True` |
| `user_agent` | User agent used by the browser | `str` | `"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"` |
| `browse_spacy_language_model` | Spacy language model used for chunking text | `str` | `"en_core_web_sm"` |
| `selenium_proxy` | Http proxy to use with Selenium | `str` | `None` |
### DirectiveProvider
- Resource information that it's possible to read websites
### CommandProvider
- `read_website` used to read a specific url and look for specific topics or answer a question
## `ContextComponent`
Adds ability to keep up-to-date file and folder content in the prompt.
### MessageProvider
- Content of elements in the context
### CommandProvider
- `open_file` used to open a file into context
- `open_folder` used to open a folder into context
- `close_context_item` remove an item from the context
## `WatchdogComponent`
Watches if agent is looping and switches to smart mode if necessary.
### AfterParse
- Investigates what happened and switches to smart mode if necessary