docs: Update user guide notebooks to enhance clarity and add structured output (#5224)

Resolves #5043
This commit is contained in:
Eric Zhu
2025-01-27 13:57:29 -08:00
committed by GitHub
parent 6359b6a7be
commit 2ceb9dcffe
3 changed files with 280 additions and 126 deletions

View File

@@ -35,9 +35,15 @@
"from autogen_agentchat.messages import TextMessage\n",
"from autogen_agentchat.ui import Console\n",
"from autogen_core import CancellationToken\n",
"from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
"\n",
"\n",
"from autogen_ext.models.openai import OpenAIChatCompletionClient"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Define a tool that searches the web for information.\n",
"async def web_search(query: str) -> str:\n",
" \"\"\"Find information on the web\"\"\"\n",
@@ -321,6 +327,82 @@
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Structured Output\n",
"\n",
    "Structured output allows models to return structured JSON text that conforms to a\n",
    "pre-defined schema provided by the application. Unlike JSON mode, the schema can be\n",
    "provided as a [Pydantic BaseModel](https://docs.pydantic.dev/latest/concepts/models/)\n",
    "class, which can also be used to validate the output.\n",
"\n",
"```{note}\n",
    "Structured output is only available when both the model and the model client\n",
    "support it.\n",
"Currently, the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`\n",
"and {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`\n",
"support structured output.\n",
"```\n",
"\n",
"Structured output is also useful for incorporating Chain-of-Thought\n",
"reasoning in the agent's responses.\n",
"See the example below for how to use structured output with the assistant agent."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"---------- user ----------\n",
"I am happy.\n",
"---------- assistant ----------\n",
"{\"thoughts\":\"The user explicitly states that they are happy.\",\"response\":\"happy\"}\n"
]
},
{
"data": {
"text/plain": [
"TaskResult(messages=[TextMessage(source='user', models_usage=None, content='I am happy.', type='TextMessage'), TextMessage(source='assistant', models_usage=RequestUsage(prompt_tokens=89, completion_tokens=18), content='{\"thoughts\":\"The user explicitly states that they are happy.\",\"response\":\"happy\"}', type='TextMessage')], stop_reason=None)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from typing import Literal\n",
"\n",
"from pydantic import BaseModel\n",
"\n",
"\n",
"# The response format for the agent as a Pydantic base model.\n",
"class AgentResponse(BaseModel):\n",
" thoughts: str\n",
" response: Literal[\"happy\", \"sad\", \"neutral\"]\n",
"\n",
"\n",
"# Create an agent that uses the OpenAI GPT-4o model with the custom response format.\n",
"model_client = OpenAIChatCompletionClient(\n",
" model=\"gpt-4o\",\n",
" response_format=AgentResponse, # type: ignore\n",
")\n",
"agent = AssistantAgent(\n",
" \"assistant\",\n",
" model_client=model_client,\n",
" system_message=\"Categorize the input as happy, sad, or neutral following the JSON format.\",\n",
")\n",
"\n",
"await Console(agent.run_stream(task=\"I am happy.\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},

View File

@@ -6,7 +6,10 @@
"source": [
"# Models\n",
"\n",
"In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for [model clients](../../core-user-guide/components/model-clients.ipynb) and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n",
"In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for model clients and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n",
"\n",
"This section provides a quick overview of available model clients.\n",
"For more details on how to use them directly, please refer to [Model Clients](../../core-user-guide/components/model-clients.ipynb) in the Core API documentation.\n",
"\n",
"```{note}\n",
"See {py:class}`~autogen_ext.models.cache.ChatCompletionCache` for a caching wrapper to use with the following clients.\n",

View File

@@ -14,17 +14,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Built-in Model Clients\n",
"\n",
"Currently there are three built-in model clients:\n",
"* {py:class}`~autogen_ext.models.OpenAIChatCompletionClient`\n",
"* {py:class}`~autogen_ext.models.AzureOpenAIChatCompletionClient`\n",
"* {py:class}`~autogen_ext.models.AzureAIChatCompletionClient`\n",
"* {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`\n",
"* {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`\n",
"* {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## OpenAI\n",
"\n",
"\n",
"### OpenAI\n",
"\n",
"To use the {py:class}`~autogen_ext.models.OpenAIChatCompletionClient`, you need to install the `openai` extra."
"To use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`, you need to install the `openai` extra."
]
},
{
@@ -68,7 +70,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You can call the {py:meth}`~autogen_ext.models.OpenAIChatCompletionClient.create` method to create a\n",
"You can call the {py:meth}`~autogen_ext.models.openai.BaseOpenAIChatCompletionClient.create` method to create a\n",
"chat completion request, and await for an {py:class}`~autogen_core.models.CreateResult` object in return."
]
},
@@ -118,12 +120,142 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### OpenAI-Compatible API\n",
"## Azure OpenAI\n",
"\n",
"You can use the {py:class}`~autogen_ext.models.OpenAIChatCompletionClient` to interact with OpenAI-compatible APIs such as Ollama and Gemini (beta).\n",
"To use the {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`, you need to provide\n",
    "the deployment ID, Azure Cognitive Services endpoint, API version, and model capabilities.\n",
"For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "shellscript"
}
},
"outputs": [],
"source": [
"# pip install \"autogen-ext[openai,azure]\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following code snippet shows how to use AAD authentication.\n",
"The identity used must be assigned the [**Cognitive Services OpenAI User**](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from autogen_ext.models.openai import AzureOpenAIChatCompletionClient\n",
"from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
"\n",
"#### Ollama\n",
"# Create the token provider\n",
"token_provider = get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\")\n",
"\n",
"az_model_client = AzureOpenAIChatCompletionClient(\n",
" azure_deployment=\"{your-azure-deployment}\",\n",
" model=\"{model-name, such as gpt-4o}\",\n",
" api_version=\"2024-06-01\",\n",
" azure_endpoint=\"https://{your-custom-endpoint}.openai.azure.com/\",\n",
" azure_ad_token_provider=token_provider, # Optional if you choose key-based authentication.\n",
" # api_key=\"sk-...\", # For key-based authentication.\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{note}\n",
    "See [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for how to use the Azure client directly and for more information.\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Azure AI Foundry\n",
"\n",
"[Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/) (previously known as Azure AI Studio) offers models hosted on Azure.\n",
"To use those models, you use the {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`.\n",
"\n",
"You need to install the `azure` extra to use this client."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "shellscript"
}
},
"outputs": [],
"source": [
"# pip install \"autogen-ext[openai,azure]\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is an example of using this client with the Phi-4 model from [GitHub Marketplace](https://github.com/marketplace/models)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False logprobs=None\n"
]
}
],
"source": [
"import os\n",
"\n",
"from autogen_core.models import UserMessage\n",
"from autogen_ext.models.azure import AzureAIChatCompletionClient\n",
"from azure.core.credentials import AzureKeyCredential\n",
"\n",
"client = AzureAIChatCompletionClient(\n",
" model=\"Phi-4\",\n",
" endpoint=\"https://models.inference.ai.azure.com\",\n",
" # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.\n",
" # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens\n",
" credential=AzureKeyCredential(os.environ[\"GITHUB_TOKEN\"]),\n",
" model_info={\n",
" \"json_output\": False,\n",
" \"function_calling\": False,\n",
" \"vision\": False,\n",
" \"family\": \"unknown\",\n",
" },\n",
")\n",
"\n",
"result = await client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
"print(result)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Ollama\n",
"\n",
"You can use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` to interact with OpenAI-compatible APIs such as Ollama and Gemini (beta).\n",
    "The example below shows how to use a local model running on an [Ollama](https://ollama.com) server."
]
},
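
The Ollama code cell itself is elided from this diff. A minimal configuration sketch for reference — the model name, endpoint, and `model_info` values below are assumptions for illustration, not part of this commit:

```python
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Point the OpenAI-compatible client at a local Ollama server.
# Assumes Ollama is serving its OpenAI-compatible API on the default port
# and that the named model has already been pulled.
ollama_client = OpenAIChatCompletionClient(
    model="llama3.2",  # Any model available in the local Ollama instance.
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint.
    api_key="placeholder",  # Ollama ignores the key, but the client requires one.
    model_info={
        "json_output": False,
        "function_calling": True,
        "vision": False,
        "family": "unknown",
    },
)
```

Since the model is not an OpenAI model, `model_info` must be supplied so the client knows which capabilities to assume.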
@@ -164,9 +296,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Gemini (beta)\n",
"## Gemini (beta)\n",
"\n",
"The below example shows how to use the Gemini model."
    "The example below shows how to use the Gemini model via the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`."
]
},
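
The Gemini code cell is also elided here. A configuration sketch under stated assumptions — the specific model name is illustrative, and a Google AI Studio API key is required:

```python
from autogen_ext.models.openai import OpenAIChatCompletionClient

# The client recognizes Gemini model names and applies suitable defaults;
# the model name here is an assumption for illustration.
gemini_client = OpenAIChatCompletionClient(
    model="gemini-1.5-flash-8b",
    api_key="GEMINI_API_KEY",  # Replace with a real key, e.g. from an environment variable.
)
```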
{
@@ -208,9 +340,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Streaming Response\n",
"## Streaming Response\n",
"\n",
"You can use the {py:meth}`~autogen_ext.models.OpenAIChatCompletionClient.create_streaming` method to create a\n",
"You can use the {py:meth}`~autogen_ext.models.openai.BaseOpenAIChatCompletionClient.create_stream` method to create a\n",
"chat completion request with streaming response."
]
},
@@ -366,133 +498,70 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Azure OpenAI\n",
"## Structured Output\n",
"\n",
"To use the {py:class}`~autogen_ext.models.AzureOpenAIChatCompletionClient`, you need to provide\n",
"the deployment id, Azure Cognitive Services endpoint, api version, and model capabilities.\n",
"For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"vscode": {
"languageId": "shellscript"
}
},
"outputs": [],
"source": [
"# pip install \"autogen-ext[openai,azure]\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following code snippet shows how to use AAD authentication.\n",
"The identity used must be assigned the [**Cognitive Services OpenAI User**](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from autogen_ext.models.openai import AzureOpenAIChatCompletionClient\n",
"from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
"Structured output can be enabled by setting the `response_format` field in\n",
"{py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` and {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient` to\n",
    "a [Pydantic BaseModel](https://docs.pydantic.dev/latest/concepts/models/) class.\n",
"\n",
"# Create the token provider\n",
"token_provider = get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\")\n",
"\n",
"az_model_client = AzureOpenAIChatCompletionClient(\n",
" azure_deployment=\"{your-azure-deployment}\",\n",
" model=\"{model-name, such as gpt-4o}\",\n",
" api_version=\"2024-06-01\",\n",
" azure_endpoint=\"https://{your-custom-endpoint}.openai.azure.com/\",\n",
" azure_ad_token_provider=token_provider, # Optional if you choose key-based authentication.\n",
" # api_key=\"sk-...\", # For key-based authentication.\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{note}\n",
"See [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for how to use the Azure client directly or for more info.\n",
    "Structured output is only available when both the model and the model client\n",
    "support it.\n",
"Currently, the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`\n",
"and {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`\n",
"support structured output.\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Azure AI Foundry\n",
"\n",
"[Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/) (previously known as Azure AI Studio) offers models hosted on Azure.\n",
"To use those models, you use the {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`.\n",
"\n",
"You need to install the `azure` extra to use this client."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "shellscript"
}
},
"outputs": [],
"source": [
"# pip install \"autogen-ext[openai,azure]\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is an example of using this client with the Phi-4 model from [GitHub Marketplace](https://github.com/marketplace/models)."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False logprobs=None\n"
"I'm glad to hear that you're feeling happy! It's such a great emotion that can brighten your whole day. Is there anything in particular that's bringing you joy today? 😊\n",
"happy\n"
]
}
],
"source": [
"import os\n",
"from typing import Literal\n",
"\n",
"from autogen_core.models import UserMessage\n",
"from autogen_ext.models.azure import AzureAIChatCompletionClient\n",
"from azure.core.credentials import AzureKeyCredential\n",
"from pydantic import BaseModel\n",
"\n",
"client = AzureAIChatCompletionClient(\n",
" model=\"Phi-4\",\n",
" endpoint=\"https://models.inference.ai.azure.com\",\n",
" # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.\n",
" # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens\n",
" credential=AzureKeyCredential(os.environ[\"GITHUB_TOKEN\"]),\n",
" model_info={\n",
" \"json_output\": False,\n",
" \"function_calling\": False,\n",
" \"vision\": False,\n",
" \"family\": \"unknown\",\n",
" },\n",
"\n",
"# The response format for the agent as a Pydantic base model.\n",
"class AgentResponse(BaseModel):\n",
" thoughts: str\n",
" response: Literal[\"happy\", \"sad\", \"neutral\"]\n",
"\n",
"\n",
"# Create an agent that uses the OpenAI GPT-4o model with the custom response format.\n",
"model_client = OpenAIChatCompletionClient(\n",
" model=\"gpt-4o\",\n",
" response_format=AgentResponse, # type: ignore\n",
")\n",
"\n",
"result = await client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
"print(result)"
"# Send a message list to the model and await the response.\n",
"messages = [\n",
" UserMessage(content=\"I am happy.\", source=\"user\"),\n",
"]\n",
"response = await model_client.create(messages=messages)\n",
"assert isinstance(response.content, str)\n",
"parsed_response = AgentResponse.model_validate_json(response.content)\n",
"print(parsed_response.thoughts)\n",
"print(parsed_response.response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "You can also use the `extra_create_args` parameter in the {py:meth}`~autogen_ext.models.openai.BaseOpenAIChatCompletionClient.create` method\n",
    "to set the `response_format` field, so that structured output can be configured for each request."
]
},
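
A sketch of that per-request usage — `model_client` and `messages` are assumed to be set up as in the earlier cells, and the validation step below runs on a hard-coded sample string for illustration:

```python
from typing import Literal

from pydantic import BaseModel


class AgentResponse(BaseModel):
    thoughts: str
    response: Literal["happy", "sad", "neutral"]


# Per-request structured output (assumes `model_client` and `messages` from
# earlier cells; shown commented out since it needs a live client):
# response = await model_client.create(
#     messages=messages,
#     extra_create_args={"response_format": AgentResponse},
# )
# The returned content is a JSON string conforming to the schema, so it can
# be validated with the same Pydantic model. A hard-coded sample:
raw = '{"thoughts": "The user sounds pleased.", "response": "happy"}'
parsed = AgentResponse.model_validate_json(raw)
print(parsed.response)  # → happy
```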
{