refactor(agent, forge): Move library code from autogpt to forge (#7106)

Moved from `autogpt` to `forge`:
- `autogpt.config`          -> `forge.config`
- `autogpt.processing`      -> `forge.content_processing`
- `autogpt.file_storage`    -> `forge.file_storage`
- `autogpt.logs`            -> `forge.logging`
- `autogpt.speech`          -> `forge.speech`
- `autogpt.agents.(base|components|protocols)`  -> `forge.agent.*`
- `autogpt.command_decorator`                   -> `forge.command.decorator`
- `autogpt.models.(command|command_parameter)`  -> `forge.command.(command|parameter)`
- `autogpt.(commands|components|features)`      -> `forge.components`
- `autogpt.core.utils.json_utils`           -> `forge.json.parsing`
- `autogpt.prompts.utils`                   -> `forge.llm.prompting.utils`
- `autogpt.core.prompting.(base|schema|utils)`    -> `forge.llm.prompting.*`
- `autogpt.core.resource.model_providers`   -> `forge.llm.providers`
- `autogpt.llm.providers.openai` + `autogpt.core.resource.model_providers.utils`
                                            -> `forge.llm.providers.utils`
- `autogpt.models.action_history:Action*`   -> `forge.models.action`
- `autogpt.core.configuration.schema`       -> `forge.models.config`
- `autogpt.core.utils.json_schema`          -> `forge.models.json_schema`
- `autogpt.core.resource.schema`            -> `forge.models.providers`
- `autogpt.models.utils`                    -> `forge.models.utils`
- `forge.sdk.(errors|utils)` + `autogpt.utils.(exceptions|file_operations_utils|validators)`
                        -> `forge.utils.(exceptions|file_operations|url_validator)`
- `autogpt.utils.utils` -> `forge.utils.const` + `forge.utils.yaml_validator`

Moved within `forge`:
- forge/prompts/* -> forge/llm/prompting/*

The rest are mostly import updates, and some sporadic removals and necessary updates (for example to fix circular deps):
- Changed `CommandOutput = Any` to remove coupling with `ContextItem` (no longer needed)
- Removed unused `Singleton` class
- Reluctantly moved `speech` to forge due to coupling (tts needs to be changed into component)
- Moved `function_specs_from_commands` and `core/resource/model_providers` to `llm/providers` (resources were a `core` thing and are no longer relevant)
- Keep tests in `autogpt` to reduce changes in this PR
- Removed unused memory-related code from tests
- Removed duplicated classes: `FancyConsoleFormatter`, `BelowLevelFilter`
- `prompt_settings.yaml` is in both `autogpt` and `forge` because for some reason doesn't work when placed in just one dir (need to be taken care of)
- Removed `config` param from `clean_input`, it wasn't used and caused circular dependency
- Renamed `BaseAgentActionProposal` to `ActionProposal`
- Updated `pyproject.toml` in `forge` and `autogpt`
- Moved `Action*` models from `forge/components/action_history/model.py` to `forge/models/action.py` as those are relevant to the entire agent and not just `EventHistoryComponent` + to reduce coupling
- Renamed `DEFAULT_ASK_COMMAND` to `ASK_COMMAND` and `DEFAULT_FINISH_COMMAND` to `FINISH_COMMAND`
- Renamed `AutoGptFormatter` to `ForgeFormatter` and moved to `forge`

Includes changes from PR https://github.com/Significant-Gravitas/AutoGPT/pull/7148
---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
This commit is contained in:
Krzysztof Czerwinski
2024-05-15 23:37:53 +01:00
committed by GitHub
parent 4f81246fd4
commit e8d7dfa386
188 changed files with 2069 additions and 1223 deletions

View File

@@ -0,0 +1 @@
../../../../autogpts/autogpt/autogpt/agents/README.md

View File

@@ -0,0 +1,115 @@
# Built-in Components
This page lists all [🧩 Components](./components.md) and [⚙️ Protocols](./protocols.md) they implement that are natively provided. They are used by the AutoGPT agent.
## `SystemComponent`
Essential component to allow an agent to finish.
**DirectiveProvider**
- Constraints about API budget
**MessageProvider**
- Current time and date
- Remaining API budget and warnings if budget is low
**CommandProvider**
- `finish` used when task is completed
## `UserInteractionComponent`
Adds ability to interact with user in CLI.
**CommandProvider**
- `ask_user` used to ask user for input
## `FileManagerComponent`
Adds ability to read and write persistent files to local storage, Google Cloud Storage or Amazon's S3.
Necessary for saving and loading agent's state (preserving session).
**DirectiveProvider**
- Resource information that it's possible to read and write files
**CommandProvider**
- `read_file` used to read file
- `write_file` used to write file
- `list_folder` lists all files in a folder
## `CodeExecutorComponent`
Lets the agent execute non-interactive Shell commands and Python code. Python execution works only if Docker is available.
**CommandProvider**
- `execute_shell` execute shell command
- `execute_shell_popen` execute shell command with popen
- `execute_python_code` execute Python code
- `execute_python_file` execute Python file
## `ActionHistoryComponent`
Keeps track of agent's actions and their outcomes. Provides their summary to the prompt.
**MessageProvider**
- Agent's progress summary
**AfterParse**
- Register agent's action
**ExecutionFailuer**
- Rewinds the agent's action, so it isn't saved
**AfterExecute**
- Saves the agent's action result in the history
## `GitOperationsComponent`
**CommandProvider**
- `clone_repository` used to clone a git repository
## `ImageGeneratorComponent`
Adds ability to generate images using various providers, see [Image Generation configuration](./../configuration/imagegen.md) to learn more.
**CommandProvider**
- `generate_image` used to generate an image given a prompt
## `WebSearchComponent`
Allows agent to search the web.
**DirectiveProvider**
- Resource information that it's possible to search the web
**CommandProvider**
- `search_web` used to search the web using DuckDuckGo
- `google` used to search the web using Google, requires API key
## `WebSeleniumComponent`
Allows agent to read websites using Selenium.
**DirectiveProvider**
- Resource information that it's possible to read websites
**CommandProvider**
- `read_website` used to read a specific url and look for specific topics or answer a question
## `ContextComponent`
Adds ability to keep up-to-date file and folder content in the prompt.
**MessageProvider**
- Content of elements in the context
**CommandProvider**
- `open_file` used to open a file into context
- `open_folder` used to open a folder into context
- `close_context_item` remove an item from the context
## `WatchdogComponent`
Watches if agent is looping and switches to smart mode if necessary.
**AfterParse**
- Investigates what happened and switches to smart mode if necessary

View File

@@ -0,0 +1,102 @@
# 🛠️ Commands
Commands are a way for the agent to do anything; e.g. interact with the user or APIs and use tools. They are provided by components that implement the `CommandProvider` [⚙️ Protocol](./protocols.md). Commands are functions that can be called by the agent, they can have parameters and return values that will be seen by the agent.
```py
class CommandProvider(Protocol):
def get_commands(self) -> Iterator[Command]:
...
```
## `command` decorator
The easiest and recommended way to provide a command is to use `command` decorator on a component method and then just yield it in `get_commands` as part of your provider. Each command needs a name, description and a parameter schema - `JSONSchema`. By default method name is used as a command name, and first part of docstring for the description (before first double newline) and schema can be provided in the decorator.
### Example usage of `command` decorator
```py
# Assuming this is inside some component class
@command(
parameters={
"a": JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The first number",
required=True,
),
"b": JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The second number",
required=True,
)})
def multiply(self, a: int, b: int) -> str:
"""
Multiplies two numbers.
Args:
a: First number
b: Second number
Returns:
Result of multiplication
"""
return str(a * b)
```
The agent will be able to call this command, named `multiply` with two arguments and will receive the result. The command description will be: `Multiplies two numbers.`
We can provide `names` and `description` in the decorator, the above command is equivalent to:
```py
@command(
names=["multiply"],
description="Multiplies two numbers.",
parameters={
"a": JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The first number",
required=True,
),
"b": JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The second number",
required=True,
)})
def multiply_command(self, a: int, b: int) -> str:
return str(a * b)
```
To provide the `multiply` command to the agent, we need to yield it in `get_commands`:
```py
def get_commands(self) -> Iterator[Command]:
yield self.multiply
```
## Creating `Command` directly
If you don't want to use the decorator, you can create a `Command` object directly.
```py
def multiply(self, a: int, b: int) -> str:
return str(a * b)
def get_commands(self) -> Iterator[Command]:
yield Command(
names=["multiply"],
description="Multiplies two numbers.",
method=self.multiply,
parameters=[
CommandParameter(name="a", spec=JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The first number",
required=True,
)),
CommandParameter(name="b", spec=JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The second number",
required=True,
)),
],
)
```

View File

@@ -0,0 +1 @@
../../../../autogpts/forge/forge/components/README.md

View File

@@ -0,0 +1,240 @@
# Creating Components
## The minimal component
Components can be used to implement various functionalities like providing messages to the prompt, executing code, or interacting with external services.
*Component* is a class that inherits from `AgentComponent` OR implements one or more *protocols*. Every *protocol* inherits `AgentComponent`, so your class automatically becomes a *component* once you inherit any *protocol*.
```py
class MyComponent(AgentComponent):
pass
```
This is already a valid component, but it doesn't do anything yet. To add some functionality to it, you need to implement one or more *protocols*.
Let's create a simple component that adds "Hello World!" message to the agent's prompt. To do this we need to implement `MessageProvider` *protocol* in our component. `MessageProvider` is an interface with `get_messages` method:
```py
# No longer need to inherit AgentComponent, because MessageProvider already does it
class HelloComponent(MessageProvider):
def get_messages(self) -> Iterator[ChatMessage]:
yield ChatMessage.user("Hello World!")
```
Now we can add our component to an existing agent or create a new Agent class and add it there:
```py
class MyAgent(Agent):
self.hello_component = HelloComponent()
```
`get_messages` will called by the agent each time it needs to build a new prompt and the yielded messages will be added accordingly.
## Passing data to and between components
Since components are regular classes you can pass data (including other components) to them via the `__init__` method.
For example we can pass a config object and then retrieve an API key from it when needed:
```py
class ConfigurableComponent(MessageProvider):
def __init__(self, config: Config):
self.config = config
def get_messages(self) -> Iterator[ChatMessage]:
if self.config.openai_credentials.api_key:
yield ChatMessage.system("API key found!")
else:
yield ChatMessage.system("API key not found!")
```
!!! note
Component-specific configuration handling isn't implemented yet.
## Providing commands
To extend what an agent can do, you need to provide commands using `CommandProvider` protocol. For example to allow agent to multiply two numbers, you can create a component like this:
```py
class MultiplicatorComponent(CommandProvider):
def get_commands(self) -> Iterator[Command]:
# Yield the command so the agent can use it
yield self.multiply
@command(
parameters={
"a": JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The first number",
required=True,
),
"b": JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The second number",
required=True,
)})
def multiply(self, a: int, b: int) -> str:
"""
Multiplies two numbers.
Args:
a: First number
b: Second number
Returns:
Result of multiplication
"""
return str(a * b)
```
To learn more about commands see [🛠️ Commands](./commands.md).
## Prompt structure
After components provided all necessary data, the agent needs to build the final prompt that will be send to a llm.
Currently, `PromptStrategy` (*not* a protocol) is responsible for building the final prompt.
If you want to change the way the prompt is built, you need to create a new `PromptStrategy` class, and then call relavant methods in your agent class.
You can have a look at the default strategy used by the AutoGPT Agent: [OneShotAgentPromptStrategy](https://github.com/Significant-Gravitas/AutoGPT/tree/master/autogpts/autogpt/autogpt/agents/prompt_strategies/one_shot.py), and how it's used in the [Agent](https://github.com/Significant-Gravitas/AutoGPT/tree/master/autogpts/autogpt/autogpt/agents/agent.py) (search for `self.prompt_strategy`).
## Example `UserInteractionComponent`
Let's create a slighlty simplified version of the component that is used by the built-in agent.
It gives an ability for the agent to ask user for input in the terminal.
1. Create a class for the component that inherits from `CommandProvider`.
```py
class MyUserInteractionComponent(CommandProvider):
"""Provides commands to interact with the user."""
pass
```
2. Implement command method that will ask user for input and return it.
```py
def ask_user(self, question: str) -> str:
"""If you need more details or information regarding the given goals,
you can ask the user for input."""
print(f"\nQ: {question}")
resp = input("A:")
return f"The user's answer: '{resp}'"
```
3. The command needs to be decorated with `@command`.
```py
@command(
parameters={
"question": JSONSchema(
type=JSONSchema.Type.STRING,
description="The question or prompt to the user",
required=True,
)
},
)
def ask_user(self, question: str) -> str:
"""If you need more details or information regarding the given goals,
you can ask the user for input."""
print(f"\nQ: {question}")
resp = input("A:")
return f"The user's answer: '{resp}'"
```
4. We need to implement `CommandProvider`'s `get_commands` method to yield the command.
```py
def get_commands(self) -> Iterator[Command]:
yield self.ask_user
```
5. Since agent isn't always running in the terminal or interactive mode, we need to disable this component by setting `self._enabled` when it's not possible to ask for user input.
```py
def __init__(self, config: Config):
self.config = config
self._enabled = not config.noninteractive_mode
```
The final component should look like this:
```py
# 1.
class MyUserInteractionComponent(CommandProvider):
"""Provides commands to interact with the user."""
# We pass config to check if we're in noninteractive mode
def __init__(self, config: Config):
self.config = config
# 5.
self._enabled = not config.noninteractive_mode
# 4.
def get_commands(self) -> Iterator[Command]:
# Yielding the command so the agent can use it
# This won't be yielded if the component is disabled
yield self.ask_user
# 3.
@command(
# We need to provide a schema for ALL the command parameters
parameters={
"question": JSONSchema(
type=JSONSchema.Type.STRING,
description="The question or prompt to the user",
required=True,
)
},
)
# 2.
# Command name will be its method name and description will be its docstring
def ask_user(self, question: str) -> str:
"""If you need more details or information regarding the given goals,
you can ask the user for input."""
print(f"\nQ: {question}")
resp = input("A:")
return f"The user's answer: '{resp}'"
```
Now if we want to use our user interaction *instead of* the default one we need to somehow remove the default one (if our agent inherits from `Agent` the default one is inherited) and add our own. We can simply override the `user_interaction` in `__init__` method:
```py
class MyAgent(Agent):
def __init__(
self,
settings: AgentSettings,
llm_provider: ChatModelProvider,
file_storage: FileStorage,
legacy_config: Config,
):
# Call the parent constructor to bring in the default components
super().__init__(settings, llm_provider, file_storage, legacy_config)
# Disable the default user interaction component by overriding it
self.user_interaction = MyUserInteractionComponent()
```
Alternatively we can disable the default component by setting it to `None`:
```py
class MyAgent(Agent):
def __init__(
self,
settings: AgentSettings,
llm_provider: ChatModelProvider,
file_storage: FileStorage,
legacy_config: Config,
):
# Call the parent constructor to bring in the default components
super().__init__(settings, llm_provider, file_storage, legacy_config)
# Disable the default user interaction component
self.user_interaction = None
# Add our own component
self.my_user_interaction = MyUserInteractionComponent(legacy_config)
```
## Learn more
The best place to see more examples is to look at the built-in components in the [autogpt/components](https://github.com/Significant-Gravitas/AutoGPT/tree/master/autogpts/autogpt/autogpt/components/) and [autogpt/commands](https://github.com/Significant-Gravitas/AutoGPT/tree/master/autogpts/autogpt/autogpt/commands/) directories.
Guide on how to extend the built-in agent and build your own: [🤖 Agents](./agents.md)
Order of some components matters, see [🧩 Components](./components.md) to learn more about components and how they can be customized.
To see built-in protocols with accompanying examples visit [⚙️ Protocols](./protocols.md).

View File

@@ -0,0 +1,17 @@
# Component Agents
This guide explains the component-based architecture of AutoGPT agents. It's a new way of building agents that is more flexible and easier to extend. Components replace some agent's logic and plugins with a more modular and composable system.
Agent is composed of *components*, and each *component* implements a range of *protocols* (interfaces), each one providing a specific functionality, e.g. additional commands or messages. Each *protocol* is handled in a specific order, defined by the agent. This allows for a clear separation of concerns and a more modular design.
This system is simple, flexible, requires basically no configuration, and doesn't hide any data - anything can still be passed or accessed directly from or between components.
### Definitions & Guides
See [Creating Components](./creating-components.md) to get started! Or you can explore the following topics in detail:
- [🧩 Component](./components.md): a class that implements one or more *protocols*. It can be added to an agent to provide additional functionality. See what's already provided in [Built-in Components](./built-in-components.md).
- [⚙️ Protocol](./protocols.md): an interface that defines a set of methods that a component must implement. Protocols are used to group related functionality.
- [🛠️ Command](./commands.md): enable *agent* to interact with user and tools.
- [🤖 Agent](./agents.md): a class that is composed of components. It's responsible for executing pipelines and managing the components.
- **Pipeline**: a sequence of method calls on components. Pipelines are used to execute a series of actions in a specific order. As of now there's no formal class for a pipeline, it's just a sequence of method calls on components. There are two default pipelines implemented in the default agent: `propose_action` and `execute`. See [🤖 Agent](./agents.md) to learn more.

View File

@@ -0,0 +1,165 @@
# ⚙️ Protocols
Protocols are *interfaces* implemented by [Components](./components.md) used to group related functionality. Each protocol needs to be handled explicitly by the agent at some point of the execution. We provide a comprehensive list of built-in protocols that are already handled in the built-in `Agent`, so when you inherit from the base agent all built-in protocols will work!
**Protocols are listed in the order of the default execution.**
## Order-independent protocols
Components implementing exclusively order-independent protocols can added in any order, including in-between ordered protocols.
### `DirectiveProvider`
Yields constraints, resources and best practices for the agent. This has no direct impact on other protocols; is purely informational and will be passed to a llm when the prompt is built.
```py
class DirectiveProvider(AgentComponent):
def get_constraints(self) -> Iterator[str]:
return iter([])
def get_resources(self) -> Iterator[str]:
return iter([])
def get_best_practices(self) -> Iterator[str]:
return iter([])
```
**Example** A web-search component can provide a resource information. Keep in mind that this actually doesn't allow the agent to access the internet. To do this a relevant `Command` needs to be provided.
```py
class WebSearchComponent(DirectiveProvider):
def get_resources(self) -> Iterator[str]:
yield "Internet access for searches and information gathering."
# We can skip "get_constraints" and "get_best_practices" if they aren't needed
```
### `CommandProvider`
Provides a command that can be executed by the agent.
```py
class CommandProvider(AgentComponent):
def get_commands(self) -> Iterator[Command]:
...
```
The easiest way to provide a command is to use `command` decorator on a component method and then yield the method. Each command needs a name, description and a parameter schema using `JSONSchema`. By default method name is used as a command name, and first part of docstring for the description (before `Args:` or `Returns:`) and schema can be provided in the decorator.
**Example** Calculator component that can perform multiplication. Agent is able to call this command if it's relevant to a current task and will see the returned result.
```py
from forge.agent import CommandProvider, Component
from forge.command import command
from forge.models.json_schema import JSONSchema
class CalculatorComponent(CommandProvider):
get_commands(self) -> Iterator[Command]:
yield self.multiply
@command(parameters={
"a": JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The first number",
required=True,
),
"b": JSONSchema(
type=JSONSchema.Type.INTEGER,
description="The second number",
required=True,
)})
def multiply(self, a: int, b: int) -> str:
"""
Multiplies two numbers.
Args:
a: First number
b: Second number
Returns:
Result of multiplication
"""
return str(a * b)
```
The agent will be able to call this command, named `multiply` with two arguments and will receive the result. The command description will be: `Multiplies two numbers.`
To learn more about commands see [🛠️ Commands](./commands.md).
## Order-dependent protocols
The order of components implementing order-dependent protocols is important.
Some components may depend on the results of components before them.
### `MessageProvider`
Yields messages that will be added to the agent's prompt. You can use either `ChatMessage.user()`: this will interpreted as a user-sent message or `ChatMessage.system()`: that will be more important.
```py
class MessageProvider(AgentComponent):
def get_messages(self) -> Iterator[ChatMessage]:
...
```
**Example** Component that provides a message to the agent's prompt.
```py
class HelloComponent(MessageProvider):
def get_messages(self) -> Iterator[ChatMessage]:
yield ChatMessage.user("Hello World!")
```
### `AfterParse`
Protocol called after the response is parsed.
```py
class AfterParse(AgentComponent):
def after_parse(self, response: ThoughtProcessOutput) -> None:
...
```
**Example** Component that logs the response after it's parsed.
```py
class LoggerComponent(AfterParse):
def after_parse(self, response: ThoughtProcessOutput) -> None:
logger.info(f"Response: {response}")
```
### `ExecutionFailure`
Protocol called when the execution of the command fails.
```py
class ExecutionFailure(AgentComponent):
@abstractmethod
def execution_failure(self, error: Exception) -> None:
...
```
**Example** Component that logs the error when the command fails.
```py
class LoggerComponent(ExecutionFailure):
def execution_failure(self, error: Exception) -> None:
logger.error(f"Command execution failed: {error}")
```
### `AfterExecute`
Protocol called after the command is successfully executed by the agent.
```py
class AfterExecute(AgentComponent):
def after_execute(self, result: ActionResult) -> None:
...
```
**Example** Component that logs the result after the command is executed.
```py
class LoggerComponent(AfterExecute):
def after_execute(self, result: ActionResult) -> None:
logger.info(f"Result: {result}")
```