Multi-agent Conversation Framework
AutoGen offers a unified multi-agent conversation framework as a high-level abstraction for using foundation models. It features capable, customizable and conversable agents that integrate LLMs, tools and humans via automated agent chat. By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code.
This framework simplifies the orchestration, automation and optimization of a complex LLM workflow. It maximizes the performance of LLMs and compensates for their weaknesses. It enables building next-gen LLM applications based on multi-agent conversations with minimal effort.
Conversable Agents
We have designed a generic ResponsiveAgent class for agents that can converse with each other by exchanging messages to jointly finish a task. An agent can communicate with other agents and perform actions. Different agents can differ in what actions they perform after receiving messages. Two representative subclasses are AssistantAgent and UserProxyAgent.
- AssistantAgent. Designed to act as an assistant by responding to user requests. It can write Python code (in a Python coding block) for a user to execute when it receives a message, typically a description of a task to be solved. Under the hood, the Python code is written by an LLM (e.g., GPT-4). It can also receive the execution results and suggest corrected code. Its behavior can be altered by passing a new system message. The LLM inference configuration can be set via llm_config.
- UserProxyAgent. Serves as a proxy for the human user. Upon receiving a message, the UserProxyAgent will either solicit the human user's input or prepare an automatically generated reply. The chosen action depends on the settings of human_input_mode and max_consecutive_auto_reply when the UserProxyAgent instance is constructed, and on whether human user input is available. By default, the automatically generated reply is crafted based on automatic code execution. The UserProxyAgent triggers code execution automatically when it detects an executable code block in the received message and no human user input is provided. Code execution can be disabled by setting code_execution_config to False. LLM-based response is disabled by default. It can be enabled by setting llm_config to a dict corresponding to the inference configuration. When llm_config is set to a dict, UserProxyAgent can generate replies using an LLM when code execution is not performed.
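To make the "detects an executable code block" condition concrete, here is a minimal sketch of how a received message might be scanned for fenced code blocks. The helper name find_code_blocks and the regular expression are assumptions made for illustration; they are not AutoGen's own implementation.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly so this example nests cleanly

# Pattern: opening fence, optional language tag, newline, body, closing fence.
CODE_BLOCK_RE = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)


def find_code_blocks(message):
    """Return a list of (language, code) pairs found in a chat message."""
    return [(lang or "unknown", code) for lang, code in CODE_BLOCK_RE.findall(message)]


reply = "Here is the code:\n" + FENCE + "python\nprint(1 + 1)\n" + FENCE + "\nRun it."
blocks = find_code_blocks(reply)
print(blocks)
```

An empty result would correspond to the case where no code execution is performed and the agent has to fall back on another kind of reply.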
The auto-reply capability of ResponsiveAgent allows for more autonomous multi-agent communication while retaining the possibility of human intervention.
One can also easily extend it by registering auto_reply functions with the register_auto_reply() method.
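The idea of registering reply functions can be sketched with a toy agent. Everything below is hypothetical: ToyAgent is not ResponsiveAgent, and its register_auto_reply() takes only a function, whereas the real method's signature may differ. The sketch assumes each reply function returns a (final, reply) pair and that newer registrations are tried first.

```python
class ToyAgent:
    """Hypothetical stand-in for an agent; not AutoGen's ResponsiveAgent."""

    def __init__(self, name):
        self.name = name
        self._reply_funcs = []

    def register_auto_reply(self, reply_func):
        # Assumption of this sketch: newer registrations take priority.
        self._reply_funcs.insert(0, reply_func)

    def generate_reply(self, message):
        # Try each function in priority order until one gives a final reply.
        for reply_func in self._reply_funcs:
            final, reply = reply_func(self, message)
            if final:
                return reply
        return None  # no registered function produced a reply


def echo_reply(agent, message):
    """Always answers, so it works as a catch-all."""
    return True, f"{agent.name} heard: {message}"


def greet_only(agent, message):
    """Answers greetings and declines (final=False) everything else."""
    if message.lower().startswith("hello"):
        return True, "Hello there!"
    return False, None


agent = ToyAgent("toy")
agent.register_auto_reply(echo_reply)
agent.register_auto_reply(greet_only)
first = agent.generate_reply("hello world")
second = agent.generate_reply("what time is it?")
print(first)
print(second)
```

Letting a function decline is what allows several reply strategies to coexist on one agent.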
Multi-agent Conversations
Basic Example
Example usage of the agents to solve a task with code:
from autogen import AssistantAgent, UserProxyAgent
# create an AssistantAgent instance named "assistant"
assistant = AssistantAgent(name="assistant")
# create a UserProxyAgent instance named "user_proxy"
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",  # in this mode, the agent will never solicit human input but always auto reply
)
# the assistant receives a message from the user, which contains the task description
user_proxy.initiate_chat(
    assistant,
    message="""What date is today? Which big tech stock has the largest year-to-date gain this year? How much is the gain?""",
)
In the example above, we create an AssistantAgent named "assistant" to serve as the assistant and a UserProxyAgent named "user_proxy" to serve as a proxy for the human user.
1. The assistant receives a message from the user_proxy, which contains the task description.
2. The assistant then tries to write Python code to solve the task and sends the response to the user_proxy.
3. Once the user_proxy receives a response from the assistant, it tries to reply by either soliciting human input or preparing an automatically generated reply. In this specific example, since human_input_mode is set to "NEVER", the user_proxy will not solicit human input but send an automatically generated reply (auto-reply). More specifically, the user_proxy executes the code and uses the result as the auto-reply.
4. The assistant then generates a further response for the user_proxy. The user_proxy can then decide whether to terminate the conversation. If not, steps 3 and 4 are repeated.
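The steps above can be sketched as a plain loop, without an LLM or the AutoGen package. The names toy_assistant, toy_user_proxy and run_chat are hypothetical, the assistant is scripted, and the use of a reply ending in "TERMINATE" as the termination signal is an assumption of this sketch.

```python
import contextlib
import io

FENCE = "`" * 3  # three backticks, built indirectly so this example nests cleanly


def toy_assistant(history):
    """Scripted stand-in for an LLM: proposes code once, then ends the chat."""
    if len(history) == 1:
        return "Try this:\n" + FENCE + "python\nprint(6 * 7)\n" + FENCE
    return "The answer is 42. TERMINATE"


def toy_user_proxy(reply):
    """Run the first Python block in the reply and report what it printed."""
    code = reply.split(FENCE + "python\n")[1].split(FENCE)[0]
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})  # acceptable here only because the code is our own script
    return "exitcode: 0\noutput: " + buffer.getvalue().strip()


def run_chat(task, max_consecutive_auto_reply=5):
    history = [("user_proxy", task)]  # step 1: the task goes to the assistant
    for _ in range(max_consecutive_auto_reply):
        reply = toy_assistant(history)  # steps 2 and 4: the assistant responds
        history.append(("assistant", reply))
        if reply.rstrip().endswith("TERMINATE"):
            break  # the user_proxy decides to end the conversation
        history.append(("user_proxy", toy_user_proxy(reply)))  # step 3: auto-reply
    return history


history = run_chat("What is 6 times 7?")
for speaker, text in history:
    print(f"{speaker}: {text}")
```

The loop bound plays the role of max_consecutive_auto_reply: even if the assistant never signals termination, the exchange ends after a fixed number of automatic replies.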
Please find a visual illustration of how UserProxyAgent and AssistantAgent collaboratively solve the above task below:

Human Input Mode
The human_input_mode parameter of UserProxyAgent controls the behavior of the agent when it receives a message. It can be set to "NEVER", "ALWAYS", or "TERMINATE".
- Under the mode human_input_mode="NEVER", the multi-turn conversation between the assistant and the user_proxy stops when the number of auto-replies reaches the upper limit specified by max_consecutive_auto_reply or the received message is a termination message according to is_termination_msg.
- When human_input_mode is set to "ALWAYS", the user proxy agent solicits human input every time a message is received; the conversation stops when the human input is "exit", or when the received message is a termination message and no human input is provided.
- When human_input_mode is set to "TERMINATE", the user proxy agent solicits human input only when a termination message is received or the number of auto-replies reaches max_consecutive_auto_reply.
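The three modes can be summarized as a small decision function. This is a sketch of the rules as described above, not AutoGen's code; the function name should_ask_human and its parameters are hypothetical.

```python
def should_ask_human(human_input_mode, is_termination, auto_reply_count,
                     max_consecutive_auto_reply):
    """Return True if a user proxy following the rules above would ask the human."""
    if human_input_mode == "ALWAYS":
        return True  # every received message triggers a prompt
    if human_input_mode == "TERMINATE":
        # Ask only on a termination message or once the auto-reply limit is hit.
        return is_termination or auto_reply_count >= max_consecutive_auto_reply
    if human_input_mode == "NEVER":
        return False  # the conversation simply stops under those conditions
    raise ValueError(f"unknown human_input_mode: {human_input_mode!r}")


for mode in ("NEVER", "ALWAYS", "TERMINATE"):
    asks = should_ask_human(mode, is_termination=True, auto_reply_count=2,
                            max_consecutive_auto_reply=10)
    print(f"{mode}: ask human on a termination message? {asks}")
```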
Function Calling
To leverage the function calling capability of OpenAI's Chat Completions API, one can register callable functions or class methods with UserProxyAgent; these correspond to the function descriptions passed to OpenAI's API.
Example usage of the agents to solve a task with the function calling feature:
from autogen import AssistantAgent, UserProxyAgent
# put the descriptions of functions in config to be passed to OpenAI's API
llm_config = {
    "model": "gpt-4-0613",
    "functions": [
        {
            "name": "python",
            "description": "run cell in ipython and return the execution result.",
            "parameters": {
                "type": "object",
                "properties": {
                    "cell": {
                        "type": "string",
                        "description": "Valid Python cell to execute.",
                    }
                },
                "required": ["cell"],
            },
        },
        {
            "name": "sh",
            "description": "run a shell script and return the execution result.",
            "parameters": {
                "type": "object",
                "properties": {
                    "script": {
                        "type": "string",
                        "description": "Valid shell script to execute.",
                    }
                },
                "required": ["script"],
            },
        },
    ],
}
# create an AssistantAgent instance named "assistant"
# (the inference configuration is passed through the llm_config argument)
chatbot = AssistantAgent("assistant", llm_config=llm_config)
# create a UserProxyAgent instance named "user_proxy"
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
)
# define functions according to the function description
from IPython import get_ipython


def exec_python(cell):
    # run the cell in the current IPython session and collect its result
    ipython = get_ipython()
    result = ipython.run_cell(cell)
    log = str(result.result)
    # append any error raised before or during execution
    if result.error_before_exec is not None:
        log += f"\n{result.error_before_exec}"
    if result.error_in_exec is not None:
        log += f"\n{result.error_in_exec}"
    return log


def exec_sh(script):
    # delegate shell execution to the user proxy's code executor
    return user_proxy.execute_code_blocks([("sh", script)])
# register the functions
user_proxy.register_function(
    function_map={
        "python": exec_python,
        "sh": exec_sh,
    }
)
# start the conversation
user_proxy.initiate_chat(
    chatbot,
    message="Draw two agents chatting with each other with an example dialog.",
)
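To show what a function map is for, here is a minimal sketch of dispatching a function call of the shape returned by OpenAI's Chat Completions API, where "arguments" is a JSON-encoded string. The helper dispatch_function_call, the word_count function and the error message formats are hypothetical and chosen for illustration; AutoGen's internal handling may differ.

```python
import json


def dispatch_function_call(function_call, function_map):
    """Look up the requested function, decode its arguments, and call it."""
    name = function_call["name"]
    func = function_map.get(name)
    if func is None:
        content = f"Error: function {name} not found."
    else:
        try:
            arguments = json.loads(function_call["arguments"])
            content = str(func(**arguments))
        except json.JSONDecodeError as exc:
            content = f"Error: arguments are not valid JSON: {exc}"
    # The result goes back to the model as a message with the "function" role.
    return {"role": "function", "name": name, "content": content}


def word_count(text):
    """Hypothetical tool used only for this illustration."""
    return len(text.split())


call = {"name": "word_count", "arguments": '{"text": "two agents chatting"}'}
result = dispatch_function_call(call, {"word_count": word_count})
print(result)
```

Returning errors as message content, rather than raising, gives the model a chance to correct the call on its next turn.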
Notebook Examples
Interested in trying it yourself? Please check the following notebook examples:
- Automated Task Solving with Code Generation, Execution & Debugging
- Auto Code Generation, Execution, Debugging and Human Feedback
- Solve Tasks Requiring Web Info
- Use Provided Tools as Functions
- Automated Task Solving with Coding & Planning Agents
- Automated Task Solving with GPT-4 + Multiple Human Users
- Automated Chess Game Playing & Chitchatting by GPT-4 Agents
- Automated Task Solving by Group Chat
- Automated Continual Learning from New Data
For Further Reading
Interested in the research that led to this package? Please check the following papers.
- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework. Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Shaokun Zhang, Erkang Zhu, Beibin Li, Li Jiang, Xiaoyun Zhang and Chi Wang. ArXiv 2023.