mirror of
https://github.com/microsoft/autogen.git
synced 2026-04-20 03:02:16 -04:00
Recreated doc for Local LLMs - LiteLLM and Ollama - native function calling in Ollama (#3197)
* Recreated documentation for Local LLMs - LiteLLM and Ollama
* Added Docker = False for code execution example

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
417
website/docs/topics/non-openai-models/local-litellm-ollama.ipynb
Normal file
@@ -0,0 +1,417 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# LiteLLM with Ollama\n",
|
||||
"[LiteLLM](https://litellm.ai/) is an open-source locally run proxy server that provides an\n",
|
||||
"OpenAI-compatible API. It interfaces with a large number of providers that do the inference.\n",
|
||||
"To handle the inference, a popular open-source inference engine is [Ollama](https://ollama.com/).\n",
|
||||
"\n",
|
||||
"As not all proxy servers support OpenAI's [Function Calling](https://platform.openai.com/docs/guides/function-calling) (usable with AutoGen),\n",
|
"LiteLLM together with Ollama enables this useful feature.\n",
|
||||
"\n",
|
||||
"Running this stack requires the installation of:\n",
|
||||
"\n",
|
||||
"1. AutoGen ([installation instructions](/docs/installation))\n",
|
||||
"2. LiteLLM\n",
|
||||
"3. Ollama\n",
|
||||
"\n",
|
"Note: We recommend using a virtual environment for your stack; see [this article](https://microsoft.github.io/autogen/docs/installation/#create-a-virtual-environment-optional) for guidance.\n",
|
||||
"\n",
|
||||
"## Installing LiteLLM\n",
|
||||
"\n",
|
||||
"Install LiteLLM with the proxy server functionality:\n",
|
||||
"\n",
|
||||
"```bash\n",
|
||||
"pip install 'litellm[proxy]'\n",
|
||||
"```\n",
|
||||
"\n",
|
"Note: If using Windows, run LiteLLM and Ollama within [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install).\n",
|
||||
"\n",
|
||||
"````mdx-code-block\n",
|
||||
":::tip\n",
|
||||
"For custom LiteLLM installation instructions, see their [GitHub repository](https://github.com/BerriAI/litellm).\n",
|
||||
":::\n",
|
||||
"````\n",
|
||||
"\n",
|
||||
"## Installing Ollama\n",
|
||||
"\n",
|
||||
"For Mac and Windows, [download Ollama](https://ollama.com/download).\n",
|
||||
"\n",
|
||||
"For Linux:\n",
|
||||
"\n",
|
||||
"```bash\n",
|
||||
"curl -fsSL https://ollama.com/install.sh | sh\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"## Downloading models\n",
|
||||
"\n",
|
"Ollama has a [library of models](https://ollama.com/library) to choose from.\n",
|
||||
"\n",
|
||||
"Before you can use a model, you need to download it (using the name of the model from the library):\n",
|
||||
"\n",
|
||||
"```bash\n",
|
||||
"ollama pull llama3:instruct\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"To view the models you have downloaded and can use:\n",
|
||||
"\n",
|
||||
"```bash\n",
|
||||
"ollama list\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"````mdx-code-block\n",
|
||||
":::tip\n",
|
"Ollama enables the use of GGUF model files, available readily on Hugging Face. See Ollama's [GitHub repository](https://github.com/ollama/ollama)\n",
|
||||
"for examples.\n",
|
||||
":::\n",
|
||||
"````\n",
|
||||
"\n",
|
||||
"## Running LiteLLM proxy server\n",
|
||||
"\n",
|
||||
"To run LiteLLM with the model you have downloaded, in your terminal:\n",
|
||||
"\n",
|
||||
"```bash\n",
|
||||
"litellm --model ollama/llama3:instruct\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"```` text\n",
|
||||
"INFO: Started server process [19040]\n",
|
||||
"INFO: Waiting for application startup.\n",
|
||||
"\n",
|
||||
"#------------------------------------------------------------#\n",
|
||||
"# #\n",
|
||||
"# 'This feature doesn't meet my needs because...' #\n",
|
||||
"# https://github.com/BerriAI/litellm/issues/new #\n",
|
||||
"# #\n",
|
||||
"#------------------------------------------------------------#\n",
|
||||
"\n",
|
||||
" Thank you for using LiteLLM! - Krrish & Ishaan\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"INFO: Application startup complete.\n",
|
||||
"INFO: Uvicorn running on http://0.0.0.0:4000 (Press CTRL+C to quit)\n",
|
||||
"````\n",
|
||||
"\n",
|
||||
"This will run the proxy server and it will be available at 'http://0.0.0.0:4000/'."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Using LiteLLM+Ollama with AutoGen\n",
|
||||
"\n",
|
||||
"Now that we have the URL for the LiteLLM proxy server, you can use it within AutoGen\n",
|
||||
"in the same way as OpenAI or cloud-based proxy servers.\n",
|
||||
"\n",
|
||||
"As you are running this proxy server locally, no API key is required. Additionally, as\n",
|
||||
"the model is being set when running the\n",
|
||||
"LiteLLM command, no model name needs to be configured in AutoGen. However, ```model```\n",
|
||||
"and ```api_key``` are mandatory fields for configurations within AutoGen so we put dummy\n",
|
||||
"values in them, as per the example below.\n",
|
||||
"\n",
|
||||
"An additional setting for the configuration is `price`, which can be used to set the pricing of tokens. As we're running it locally, we'll put our costs as zero. Using this setting will also avoid a prompt being shown when price can't be determined."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"\u001b[33magent\u001b[0m (to user):\n",
|
||||
"\n",
|
||||
"How can I help you today?\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"\u001b[33muser\u001b[0m (to agent):\n",
|
||||
"\n",
|
||||
"Why is the sky blue?\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"\u001b[31m\n",
|
||||
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
|
||||
"\u001b[33magent\u001b[0m (to user):\n",
|
||||
"\n",
|
||||
"A classic question!\n",
|
||||
"\n",
|
||||
"The sky appears blue because of a phenomenon called scattering. When sunlight enters Earth's atmosphere, it encounters tiny molecules of gases such as nitrogen (N2) and oxygen (O2). These molecules scatter the light in all directions, but they scatter shorter (blue) wavelengths more than longer (red) wavelengths.\n",
|
||||
"\n",
|
||||
"This is known as Rayleigh scattering, named after the British physicist Lord Rayleigh, who first described the phenomenon in the late 19th century. As a result of this scattering, the blue light is distributed throughout the atmosphere, giving the sky its blue appearance.\n",
|
||||
"\n",
|
||||
"Additionally, when sunlight passes through more dense atmospheric particles like water vapor, pollutants, and dust, it can also be scattered or absorbed, which affects the color we see. For example, during sunrise and sunset, the light has to travel longer distances through the atmosphere, which scatters the shorter wavelengths even more, making the sky appear more red.\n",
|
||||
"\n",
|
||||
"So, there you have it! The blue sky is a result of the combination of sunlight, atmospheric gases, and the scattering of light.\n",
|
||||
"\n",
|
||||
"How's that? Do you have any other questions or would you like to explore more topics?\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"\u001b[33muser\u001b[0m (to agent):\n",
|
||||
"\n",
|
||||
"Why is it sometimes red, then?\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"\u001b[31m\n",
|
||||
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
|
||||
"\u001b[33magent\u001b[0m (to user):\n",
|
||||
"\n",
|
||||
"Excellent follow-up question!\n",
|
||||
"\n",
|
||||
"As I mentioned earlier, the color we see in the sky can be affected by the amount and type of particles in the atmosphere. When the sunlight has to travel longer distances through the air, like during sunrise and sunset, it encounters more atmospheric particles that scatter the shorter blue wavelengths even more than the longer red wavelengths.\n",
|
||||
"\n",
|
||||
"This is known as Mie scattering, named after the German physicist Gustav Mie. The larger particles, such as water droplets, pollen, and dust, are responsible for this type of scattering. They scatter the shorter blue wavelengths more efficiently than the longer red wavelengths, which is why we often see more red or orange hues during these times.\n",
|
||||
"\n",
|
||||
"Additionally, during sunrise and sunset, the sun's rays have to travel through a thicker layer of atmosphere, which contains more particles like water vapor, pollutants, and aerosols. These particles can absorb or scatter certain wavelengths of light, making them appear redder or more orange.\n",
|
||||
"\n",
|
||||
"The combination of Mie scattering and absorption by atmospheric particles can create the warm, golden hues we often see during sunrise and sunset. It's a beautiful reminder that the color of our sky is not just a result of the sun itself but also the complex interactions between sunlight, atmosphere, and particles!\n",
|
||||
"\n",
|
||||
"Would you like to explore more about the Earth's atmosphere or perhaps learn about other fascinating topics?\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"<autogen.agentchat.conversable_agent.ConversableAgent object at 0x7fe35da88dd0>\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from autogen import ConversableAgent, UserProxyAgent\n",
|
||||
"\n",
|
||||
"local_llm_config = {\n",
|
||||
" \"config_list\": [\n",
|
||||
" {\n",
|
||||
" \"model\": \"NotRequired\", # Loaded with LiteLLM command\n",
|
||||
" \"api_key\": \"NotRequired\", # Not needed\n",
|
||||
" \"base_url\": \"http://0.0.0.0:4000\", # Your LiteLLM URL\n",
|
" \"price\": [0, 0], # Price per 1K tokens [prompt, response]; zero because the model runs locally\n",
|
||||
" }\n",
|
||||
" ],\n",
|
||||
" \"cache_seed\": None, # Turns off caching, useful for testing different models\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"# Create the agent that uses the LLM.\n",
|
||||
"assistant = ConversableAgent(\"agent\", llm_config=local_llm_config)\n",
|
||||
"\n",
|
||||
"# Create the agent that represents the user in the conversation.\n",
|
||||
"user_proxy = UserProxyAgent(\"user\", code_execution_config=False)\n",
|
||||
"\n",
|
||||
"# Let the assistant start the conversation. It will end when the user types exit.\n",
|
||||
"res = assistant.initiate_chat(user_proxy, message=\"How can I help you today?\")\n",
|
||||
"\n",
|
||||
"print(assistant)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Example with Function Calling\n",
|
||||
"Function calling (aka Tool calling) is a feature of OpenAI's API that AutoGen, LiteLLM, and Ollama support.\n",
|
||||
"\n",
|
"Below is an example of using function calling with LiteLLM and Ollama, based on this [currency conversion](https://github.com/microsoft/autogen/blob/501f8d22726e687c55052682c20c97ce62f018ac/notebook/agentchat_function_call_currency_calculator.ipynb) notebook.\n",
|
||||
"\n",
|
||||
"LiteLLM is loaded in the same way as the previous example and we'll continue to use Meta's Llama3 model as it is good at constructing the\n",
|
||||
"function calling message required.\n",
|
||||
"\n",
|
||||
"**Note:** LiteLLM version 1.41.27, or later, is required (to support function calling natively using Ollama).\n",
|
||||
"\n",
|
||||
"In your terminal:\n",
|
||||
"\n",
|
||||
"```bash\n",
|
||||
"litellm --model ollama/llama3\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"Then we run our program with function calling."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"/usr/local/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
||||
" from .autonotebook import tqdm as notebook_tqdm\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from typing import Literal\n",
|
||||
"\n",
|
||||
"from typing_extensions import Annotated\n",
|
||||
"\n",
|
||||
"import autogen\n",
|
||||
"\n",
|
||||
"local_llm_config = {\n",
|
||||
" \"config_list\": [\n",
|
||||
" {\n",
|
||||
" \"model\": \"NotRequired\", # Loaded with LiteLLM command\n",
|
||||
" \"api_key\": \"NotRequired\", # Not needed\n",
|
||||
" \"base_url\": \"http://0.0.0.0:4000\", # Your LiteLLM URL\n",
|
" \"price\": [0, 0], # Price per 1K tokens [prompt, response]; zero because the model runs locally\n",
|
||||
" }\n",
|
||||
" ],\n",
|
||||
" \"cache_seed\": None, # Turns off caching, useful for testing different models\n",
|
||||
"}"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
"# Create the agent; the system message restricts it to the provided functions\n",
"# and tells it to reply 'TERMINATE' once the function has been called\n",
|
||||
"chatbot = autogen.AssistantAgent(\n",
|
||||
" name=\"chatbot\",\n",
|
||||
" system_message=\"\"\"For currency exchange tasks,\n",
|
||||
" only use the functions you have been provided with.\n",
|
||||
" If the function has been called previously,\n",
|
||||
" return only the word 'TERMINATE'.\"\"\",\n",
|
||||
" llm_config=local_llm_config,\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"user_proxy = autogen.UserProxyAgent(\n",
|
||||
" name=\"user_proxy\",\n",
|
||||
" is_termination_msg=lambda x: x.get(\"content\", \"\") and \"TERMINATE\" in x.get(\"content\", \"\"),\n",
|
||||
" human_input_mode=\"NEVER\",\n",
|
||||
" max_consecutive_auto_reply=1,\n",
|
||||
" code_execution_config={\"work_dir\": \"code\", \"use_docker\": False},\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"CurrencySymbol = Literal[\"USD\", \"EUR\"]\n",
|
||||
"\n",
|
||||
"# Define our function that we expect to call\n",
|
||||
"\n",
|
||||
"\n",
|
"def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:\n",
"    if base_currency == quote_currency:\n",
"        return 1.0\n",
"    elif base_currency == \"USD\" and quote_currency == \"EUR\":\n",
"        return 1 / 1.1\n",
"    elif base_currency == \"EUR\" and quote_currency == \"USD\":\n",
"        return 1.1\n",
"    else:\n",
"        raise ValueError(f\"Unknown currencies {base_currency}, {quote_currency}\")\n",
"\n",
"\n",
"# Register the function with the agent\n",
"@user_proxy.register_for_execution()\n",
"@chatbot.register_for_llm(description=\"Currency exchange calculator.\")\n",
"def currency_calculator(\n",
"    base_amount: Annotated[float, \"Amount of currency in base_currency\"],\n",
"    base_currency: Annotated[CurrencySymbol, \"Base currency\"] = \"USD\",\n",
"    quote_currency: Annotated[CurrencySymbol, \"Quote currency\"] = \"EUR\",\n",
") -> str:\n",
"    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount\n",
"    return f\"{format(quote_amount, '.2f')} {quote_currency}\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
|
||||
"\n",
|
||||
"How much is 123.45 EUR in USD?\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
|
||||
"\n",
|
||||
"\u001b[32m***** Suggested tool call (call_d9584223-9af0-4526-ad09-856b03487fd5): currency_calculator *****\u001b[0m\n",
|
||||
"Arguments: \n",
|
||||
"{\"base_amount\": 123.45, \"base_currency\": \"EUR\", \"quote_currency\": \"USD\"}\n",
|
||||
"\u001b[32m************************************************************************************************\u001b[0m\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"\u001b[35m\n",
|
||||
">>>>>>>> EXECUTING FUNCTION currency_calculator...\u001b[0m\n",
|
||||
"\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
|
||||
"\n",
|
||||
"\u001b[33muser_proxy\u001b[0m (to chatbot):\n",
|
||||
"\n",
|
||||
"\u001b[32m***** Response from calling tool (call_d9584223-9af0-4526-ad09-856b03487fd5) *****\u001b[0m\n",
|
||||
"135.80 USD\n",
|
||||
"\u001b[32m**********************************************************************************\u001b[0m\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"\u001b[33mchatbot\u001b[0m (to user_proxy):\n",
|
||||
"\n",
|
||||
"\u001b[32m***** Suggested tool call (call_17b07b4d-629f-4314-8a04-97b1537fa486): currency_calculator *****\u001b[0m\n",
|
||||
"Arguments: \n",
|
||||
"{\"base_amount\": 123.45, \"base_currency\": \"EUR\", \"quote_currency\": \"USD\"}\n",
|
||||
"\u001b[32m************************************************************************************************\u001b[0m\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# start the conversation\n",
|
||||
"res = user_proxy.initiate_chat(\n",
|
||||
" chatbot,\n",
|
||||
" message=\"How much is 123.45 EUR in USD?\",\n",
|
||||
" summary_method=\"reflection_with_llm\",\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We can see that the currency conversion function was called with the correct values and a result was generated.\n",
|
||||
"\n",
|
||||
"````mdx-code-block\n",
|
||||
":::tip\n",
|
||||
"Once functions are included in the conversation it is possible, using LiteLLM and Ollama, that the model may continue to recommend tool calls (as shown above). This is an area of active development and a native Ollama client for AutoGen is planned for a future release.\n",
|
||||
":::\n",
|
||||
"````"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "autogen",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
@@ -1,329 +0,0 @@
|
||||
# LiteLLM with Ollama
|
||||
[LiteLLM](https://litellm.ai/) is an open-source locally run proxy server that provides an
|
||||
OpenAI-compatible API. It interfaces with a large number of providers that do the inference.
|
||||
To handle the inference, a popular open-source inference engine is [Ollama](https://ollama.com/).
|
||||
|
||||
As not all proxy servers support OpenAI's [Function Calling](https://platform.openai.com/docs/guides/function-calling) (usable with AutoGen),
|
||||
LiteLLM together with Ollama enables this useful feature.
|
||||
|
||||
Running this stack requires the installation of:
|
||||
1. AutoGen ([installation instructions](/docs/installation))
|
||||
2. LiteLLM
|
||||
3. Ollama
|
||||
|
||||
Note: We recommend using a virtual environment for your stack; see [this article](https://microsoft.github.io/autogen/docs/installation/#create-a-virtual-environment-optional) for guidance.
|
||||
|
||||
## Installing LiteLLM
|
||||
|
||||
Install LiteLLM with the proxy server functionality:
|
||||
|
||||
```bash
|
||||
pip install 'litellm[proxy]'
|
||||
```
|
||||
|
||||
Note: If using Windows, run LiteLLM and Ollama within [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install).
|
||||
|
||||
````mdx-code-block
|
||||
:::tip
|
||||
For custom LiteLLM installation instructions, see their [GitHub repository](https://github.com/BerriAI/litellm).
|
||||
:::
|
||||
````
|
||||
|
||||
## Installing Ollama
|
||||
|
||||
For Mac and Windows, [download Ollama](https://ollama.com/download).
|
||||
|
||||
For Linux:
|
||||
|
||||
```bash
|
||||
curl -fsSL https://ollama.com/install.sh | sh
|
||||
```
|
||||
|
||||
## Downloading models
|
||||
|
||||
Ollama has a [library of models](https://ollama.com/library) to choose from.
|
||||
|
||||
Before you can use a model, you need to download it (using the name of the model from the library):
|
||||
|
||||
```bash
|
||||
ollama pull llama2
|
||||
```
|
||||
|
||||
To view the models you have downloaded and can use:
|
||||
|
||||
```bash
|
||||
ollama list
|
||||
```
|
||||
|
||||
````mdx-code-block
|
||||
:::tip
|
||||
Ollama enables the use of GGUF model files, available readily on Hugging Face. See Ollama's [GitHub repository](https://github.com/ollama/ollama)
|
||||
for examples.
|
||||
:::
|
||||
````
|
||||
|
||||
## Running LiteLLM proxy server
|
||||
|
||||
To run LiteLLM with the model you have downloaded, in your terminal:
|
||||
|
||||
```bash
|
||||
litellm --model ollama_chat/llama2
|
||||
```
|
||||
|
||||
```` text
|
||||
INFO: Started server process [19040]
|
||||
INFO: Waiting for application startup.
|
||||
|
||||
#------------------------------------------------------------#
|
||||
# #
|
||||
# 'This feature doesn't meet my needs because...' #
|
||||
# https://github.com/BerriAI/litellm/issues/new #
|
||||
# #
|
||||
#------------------------------------------------------------#
|
||||
|
||||
Thank you for using LiteLLM! - Krrish & Ishaan
|
||||
|
||||
|
||||
|
||||
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
|
||||
|
||||
|
||||
INFO: Application startup complete.
|
||||
INFO: Uvicorn running on http://0.0.0.0:4000 (Press CTRL+C to quit)
|
||||
````
|
||||
|
||||
This will run the proxy server and it will be available at 'http://0.0.0.0:4000/'.
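
Note that `0.0.0.0` is the address the server binds to (all network interfaces) and is not necessarily an address a client can connect to; on some systems, connecting to it may fail. The following is a minimal sketch, assuming the proxy runs on the same machine as AutoGen, of deriving a client URL by swapping in the loopback address. The helper name `client_base_url` is hypothetical and is not part of LiteLLM or AutoGen.

```python
from urllib.parse import urlsplit, urlunsplit


def client_base_url(listen_url: str) -> str:
    """Return a URL a local client can use for a server bound to 0.0.0.0.

    Hypothetical helper for illustration: only the wildcard host is replaced,
    any other host is returned unchanged.
    """
    parts = urlsplit(listen_url)
    if parts.hostname != "0.0.0.0":
        return listen_url
    # Keep the port (if any) and swap the wildcard host for loopback.
    netloc = "127.0.0.1" if parts.port is None else f"127.0.0.1:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(client_base_url("http://0.0.0.0:4000"))  # http://127.0.0.1:4000
```

If the URL printed by LiteLLM already works from your client, this step is unnecessary.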
|
||||
|
||||
## Using LiteLLM+Ollama with AutoGen
|
||||
|
||||
Now that we have the URL for the LiteLLM proxy server, you can use it within AutoGen
|
||||
in the same way as OpenAI or cloud-based proxy servers.
|
||||
|
||||
As you are running this proxy server locally, no API key is required. Additionally, as
|
||||
the model is being set when running the
|
||||
LiteLLM command, no model name needs to be configured in AutoGen. However, ```model```
|
||||
and ```api_key``` are mandatory fields for configurations within AutoGen so we put dummy
|
||||
values in them, as per the example below.
|
||||
|
||||
```python
|
||||
from autogen import UserProxyAgent, ConversableAgent
|
||||
|
||||
local_llm_config={
|
||||
"config_list": [
|
||||
{
|
||||
"model": "NotRequired", # Loaded with LiteLLM command
|
||||
"api_key": "NotRequired", # Not needed
|
||||
"base_url": "http://0.0.0.0:4000" # Your LiteLLM URL
|
||||
}
|
||||
],
|
||||
"cache_seed": None # Turns off caching, useful for testing different models
|
||||
}
|
||||
|
||||
# Create the agent that uses the LLM.
|
||||
assistant = ConversableAgent("agent", llm_config=local_llm_config)
|
||||
|
||||
# Create the agent that represents the user in the conversation.
|
||||
user_proxy = UserProxyAgent("user", code_execution_config=False)
|
||||
|
||||
# Let the assistant start the conversation. It will end when the user types exit.
|
||||
assistant.initiate_chat(user_proxy, message="How can I help you today?")
|
||||
```
|
||||
|
||||
Output:
|
||||
|
||||
```` text
|
||||
agent (to user):
|
||||
|
||||
How can I help you today?
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Tell me, why is the sky blue?
|
||||
user (to agent):
|
||||
|
||||
Tell me, why is the sky blue?
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
>>>>>>>> USING AUTO REPLY...
|
||||
agent (to user):
|
||||
|
||||
Ah, a classic question! The sky appears blue because of a phenomenon called Rayleigh scattering. When sunlight enters Earth's atmosphere, it encounters tiny molecules of gases such as nitrogen and oxygen. These molecules scatter the light in all directions, but they scatter shorter (blue) wavelengths more than longer (red) wavelengths. This is why the sky looks blue during the daytime, especially in the direction of the sun.
|
||||
|
||||
But wait, there's more! The color of the sky can also change depending on the time of day and atmospheric conditions. For example, during sunrise and sunset, the sky can take on hues of red, orange, and pink due to the way that light interacts with the Earth's atmosphere. And if you're in a location with a lot of dust or pollution, the sky may appear more hazy or gray.
|
||||
|
||||
So there you have it! The sky is blue because of the way that light interacts with the Earth's atmosphere, and there are many factors that can affect its appearance. Is there anything else you'd like to know?
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: And what would it take to make it red?
|
||||
user (to agent):
|
||||
|
||||
And what would it take to make it red?
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
>>>>>>>> USING AUTO REPLY...
|
||||
agent (to user):
|
||||
|
||||
Ah, an interesting question! To make the sky appear red, we need to look at the scattering of light in a different way. When the sun is low on the horizon, the light it emits has to travel through a longer distance in the atmosphere to reach our eyes. As a result, much of the shorter wavelength blue light is scattered away, leaving mainly the longer wavelength red and orange light to reach our eyes. This is why the sky can take on hues of red, orange, and pink during sunrise and sunset.
|
||||
|
||||
However, if we were to somehow change the composition of the atmosphere or add some additional particles into the air, we could potentially make the sky appear red even when the sun is high in the sky. For example, if we were to add a lot of dust or smoke into the atmosphere, the sky might take on a reddish hue due to the scattering of light by these particles. Or, if we were to create a situation where the air was filled with a high concentration of certain gases, such as nitrogen oxides or sulfur compounds, the sky could potentially appear red or orange as a result of the way that these gases interact with light.
|
||||
|
||||
So there you have it! While the sky is typically blue during the daytime due to Rayleigh scattering, there are many other factors that can affect its appearance, and with the right conditions, we can even make the sky appear red! Is there anything else you'd like to know?
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: exit
|
||||
````
|
||||
|
||||
## Example with Function Calling
|
||||
Function calling (aka Tool calling) is a feature of OpenAI's API that AutoGen and LiteLLM support.
|
||||
|
||||
Below is an example of using function calling with LiteLLM and Ollama, based on this [currency conversion](https://github.com/microsoft/autogen/blob/501f8d22726e687c55052682c20c97ce62f018ac/notebook/agentchat_function_call_currency_calculator.ipynb) notebook.
|
||||
|
||||
LiteLLM is loaded in the same way as the previous example, however the DolphinCoder model is used as it is better at constructing the
|
||||
function calling message required.
|
||||
|
||||
In your terminal:
|
||||
|
||||
```bash
|
||||
litellm --model ollama_chat/dolphincoder
|
||||
```
|
||||
|
||||
|
||||
```python
|
||||
import autogen
|
||||
from typing import Literal
|
||||
from typing_extensions import Annotated
|
||||
|
||||
local_llm_config={
|
||||
"config_list": [
|
||||
{
|
||||
"model": "NotRequired", # Loaded with LiteLLM command
|
||||
"api_key": "NotRequired", # Not needed
|
||||
"base_url": "http://0.0.0.0:4000" # Your LiteLLM URL
|
||||
}
|
||||
],
|
||||
"cache_seed": None # Turns off caching, useful for testing different models
|
||||
}
|
||||
|
||||
# Create the agent and include examples of the function calling JSON in the prompt
|
||||
# to help guide the model
|
||||
chatbot = autogen.AssistantAgent(
|
||||
name="chatbot",
|
||||
system_message="""For currency exchange tasks,
|
||||
only use the functions you have been provided with.
|
||||
Output 'TERMINATE' when an answer has been provided.
|
||||
Do not include the function name or result in the JSON.
|
||||
Example of the return JSON is:
|
||||
{
|
||||
"parameter_1_name": 100.00,
|
||||
"parameter_2_name": "ABC",
|
||||
"parameter_3_name": "DEF",
|
||||
}.
|
||||
Another example of the return JSON is:
|
||||
{
|
||||
"parameter_1_name": "GHI",
|
||||
"parameter_2_name": "ABC",
|
||||
"parameter_3_name": "DEF",
|
||||
"parameter_4_name": 123.00,
|
||||
}. """,
|
||||
|
||||
llm_config=local_llm_config,
|
||||
)
|
||||
|
||||
user_proxy = autogen.UserProxyAgent(
|
||||
name="user_proxy",
|
||||
is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
|
||||
human_input_mode="NEVER",
|
||||
max_consecutive_auto_reply=1,
|
||||
)
|
||||
|
||||
|
||||
CurrencySymbol = Literal["USD", "EUR"]
|
||||
|
||||
# Define our function that we expect to call
|
||||
def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{format(quote_amount, '.2f')} {quote_currency}"
|
||||
|
||||
# start the conversation
|
||||
res = user_proxy.initiate_chat(
|
||||
chatbot,
|
||||
message="How much is 123.45 EUR in USD?",
|
||||
summary_method="reflection_with_llm",
|
||||
)
|
||||
```
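
The `is_termination_msg` predicate used above can be tried on its own, without AutoGen or a running proxy. The sketch below copies the same lambda and applies it to a few hand-written message dictionaries. These messages are illustrative assumptions, not captured AutoGen traffic; the one with `content` set to `None` stands in for a tool-call message that carries no text.

```python
# Same predicate as passed to UserProxyAgent above.
is_termination_msg = lambda x: x.get("content", "") and "TERMINATE" in x.get("content", "")

# Hand-written sample messages (assumed shapes, for illustration only).
messages = [
    {"content": "TERMINATE"},              # chatbot signals it is done
    {"content": "135.80 USD"},             # ordinary reply, no keyword
    {"content": None, "tool_calls": []},   # tool-call style message without text
    {},                                    # no "content" key at all
]

# The lambda may return None or "" rather than False, so coerce to bool.
results = [bool(is_termination_msg(m)) for m in messages]
print(results)  # [True, False, False, False]
```

The leading `x.get("content", "") and ...` guard is what keeps the `in` test from raising a `TypeError` when `content` is `None`.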
|
||||
|
||||
Output:
|
||||
|
||||
```` text
|
||||
user_proxy (to chatbot):
|
||||
|
||||
How much is 123.45 EUR in USD?
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
chatbot (to user_proxy):
|
||||
|
||||
***** Suggested tool Call (call_c93c4390-93d5-4a28-b40d-09fe74cc58da): currency_calculator *****
|
||||
Arguments:
|
||||
{
|
||||
"base_amount": 123.45,
|
||||
"base_currency": "EUR",
|
||||
"quote_currency": "USD"
|
||||
}
|
||||
|
||||
|
||||
************************************************************************************************
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
>>>>>>>> EXECUTING FUNCTION currency_calculator...
|
||||
user_proxy (to chatbot):
|
||||
|
||||
user_proxy (to chatbot):
|
||||
|
||||
***** Response from calling tool "call_c93c4390-93d5-4a28-b40d-09fe74cc58da" *****
|
||||
135.80 USD
|
||||
**********************************************************************************
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
chatbot (to user_proxy):
|
||||
|
||||
***** Suggested tool Call (call_d8fd94de-5286-4ef6-b1f6-72c826531ff9): currency_calculator *****
|
||||
Arguments:
|
||||
{
|
||||
"base_amount": 123.45,
|
||||
"base_currency": "EUR",
|
||||
"quote_currency": "USD"
|
||||
}
|
||||
|
||||
|
||||
************************************************************************************************
|
||||
````
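
To make the output above concrete, the sketch below shows roughly what executing a suggested tool call amounts to: the JSON arguments are parsed and passed to the registered Python function. It is a simplified illustration under stated assumptions, not AutoGen's actual implementation. The `registry` dictionary is hypothetical, and the functions are repeated without the AutoGen decorators so the block runs on its own.

```python
import json
from typing import Literal

CurrencySymbol = Literal["USD", "EUR"]


def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")


def currency_calculator(
    base_amount: float,
    base_currency: CurrencySymbol = "USD",
    quote_currency: CurrencySymbol = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{format(quote_amount, '.2f')} {quote_currency}"


# Hypothetical registry mapping tool names to callables (not an AutoGen API).
registry = {"currency_calculator": currency_calculator}

# Tool name and arguments as they appear in the suggested tool call above.
tool_name = "currency_calculator"
arguments = '{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}'

# Parse the JSON arguments and call the function with them as keyword arguments.
result = registry[tool_name](**json.loads(arguments))
print(result)  # 135.80 USD
```

The printed value matches the tool response shown in the output above.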
|
||||
|
||||
````mdx-code-block
|
||||
:::warning
|
||||
Not all open source/weight models are suitable for function calling and AutoGen continues to be
|
||||
developed to provide wider support for open source models.
|
||||
|
||||
The [#alt-models](https://discord.com/channels/1153072414184452236/1201369716057440287) channel
|
||||
on AutoGen's Discord is an active community discussing the use of open source/weight models
|
||||
with AutoGen.
|
||||
:::
|
||||
````
|
||||