add EcoAssistant blog (#647)
* add EcoAssistant blog
* add blog
* Rename blog.md to blog.mdx
* Rename blog.mdx to index.mdx
* Update index.mdx
* Update index.mdx
BIN  website/blog/2023-11-09-EcoAssistant/img/chat.png           (new file, 308 KiB, binary not shown)
BIN  website/blog/2023-11-09-EcoAssistant/img/results.png        (new file, 28 KiB, binary not shown)
BIN  website/blog/2023-11-09-EcoAssistant/img/system.png         (new file, 280 KiB, binary not shown)
BIN  website/blog/2023-11-09-EcoAssistant/img/template-demo.png  (new file, 60 KiB, binary not shown)
BIN  website/blog/2023-11-09-EcoAssistant/img/template.png       (new file, 328 KiB, binary not shown)
102  website/blog/2023-11-09-EcoAssistant/index.mdx  (new file)

@@ -0,0 +1,102 @@
---
title: EcoAssistant - Using LLM Assistants More Accurately and Affordably
authors: jieyuz2
tags: [LLM, RAG, cost-effectiveness]
---



**TL;DR:**
* Introducing **EcoAssistant**, a system designed to solve user queries more accurately and affordably.
* We show how to let the LLM assistant agent leverage external APIs to solve user queries.
* We show how to reduce the cost of using GPT models via the **Assistant Hierarchy**.
* We show how to leverage the idea of Retrieval-Augmented Generation (RAG) to improve the success rate via **Solution Demonstration**.

## EcoAssistant

In this blog, we introduce **EcoAssistant**, a system built upon AutoGen with the goal of solving user queries more accurately and affordably.

### Problem setup

Recently, users have been relying on conversational LLMs such as ChatGPT for a wide range of queries.
Reports indicate that 23% of ChatGPT user queries are for knowledge extraction purposes.
Many of these queries require knowledge that is not stored within any pre-trained large language model (LLM).
Such queries can only be answered by generating code that fetches the necessary information through external APIs.
In the table below, we show three types of user queries that we aim to address in this work.

| Dataset | API | Example query |
|-------------|----------|----------|
| Places| [Google Places](https://developers.google.com/maps/documentation/places/web-service/overview) | I’m looking for a 24-hour pharmacy in Montreal, can you find one for me? |
| Weather | [Weather API](https://www.weatherapi.com) | What is the current cloud coverage in Mumbai, India? |
| Stock | [Alpha Vantage Stock API](https://www.alphavantage.co/documentation/) | Can you give me the opening price of Microsoft for the month of January 2023? |
### Leveraging external APIs

To address these queries, we first build a **two-agent system** based on AutoGen.
The first agent is an **LLM assistant agent** (`AssistantAgent` in AutoGen) responsible for proposing and refining code, and
the second agent is a **code executor agent** (`UserProxyAgent` in AutoGen) that extracts the generated code, executes it, and forwards the output back to the LLM assistant agent.
A visualization of the two-agent system is shown below.

![system](img/system.png)

To instruct the assistant agent to leverage external APIs, we only need to add the API name/key dictionary at the beginning of the initial message.
The template is shown below, where the red part is the API information and the black part is the user query.

![template](img/template.png)

Importantly, we don't want to reveal our real API keys to the assistant agent, for safety reasons.
Therefore, we use a **fake API key** in place of the real API key in the initial message.
In particular, we generate a random token (e.g., `181dbb37`) for each API key and put the token in the initial message instead of the real key.
Then, when the code executor executes the code, the fake API key is automatically replaced by the real one.
|
||||
|
||||
|
||||
### Solution Demonstration
|
||||
In most practical scenarios, queries from users would appear sequentially over time.
|
||||
Our **EcoAssistant** leverages past success to help the LLM assistants address future queries via **Solution Demonstration**.
|
||||
Specifically, whenever a query is deemed successfully resolved by user feedback, we capture and store the query and the final generated code snippet.
|
||||
These query-code pairs are saved in a specialized vector database. When new queries appear, **EcoAssistant** retrieves the most similar query from the database, which is then appended with the associated code to the initial prompt for the new query, serving as a demonstration.
|
||||
The new template of initial message is shown below, where the blue part corresponds to the solution demonstration.
|
||||
|
||||


We found that this utilization of past successful query-code pairs improves the query resolution process, requiring fewer iterations and enhancing the system's performance.

### Assistant Hierarchy

LLMs usually differ in price and performance; for example, GPT-3.5-turbo is much cheaper than GPT-4 but also less accurate.
Thus, we propose the **Assistant Hierarchy** to reduce the cost of using LLMs.
The core idea is to use the cheaper LLMs first and resort to the more expensive ones only when necessary.
In this way, we reduce the reliance on expensive LLMs and thus reduce the cost.
In particular, given multiple LLMs, we initiate one assistant agent for each and start the conversation with the most cost-effective LLM assistant.
If the conversation between the current LLM assistant and the code executor concludes without successfully resolving the query, **EcoAssistant** restarts the conversation with the next, more expensive LLM assistant in the hierarchy.
We found that this strategy significantly reduces costs while still effectively addressing queries.
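
A minimal sketch of this escalation loop in AutoGen, assuming a caller-supplied `is_resolved` check that stands in for the user-feedback signal described above (all configuration details are illustrative):

```python
import autogen

MODEL_HIERARCHY = ["gpt-3.5-turbo", "gpt-4"]  # cheapest first

def solve_with_hierarchy(query: str, is_resolved) -> bool:
    for model in MODEL_HIERARCHY:
        assistant = autogen.AssistantAgent(
            name=f"assistant-{model}",
            # Fill in your API key here or load it via config_list_from_json.
            llm_config={"config_list": [{"model": model}]},
        )
        executor = autogen.UserProxyAgent(
            name="code_executor",
            human_input_mode="NEVER",
            code_execution_config={"work_dir": "coding", "use_docker": False},
        )
        executor.initiate_chat(assistant, message=query)
        # Stop at the first (cheapest) model that resolves the query;
        # otherwise escalate to the next, more expensive one.
        if is_resolved(executor.chat_messages[assistant]):
            return True
    return False
```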

### A Synergistic Effect

We found that the **Assistant Hierarchy** and **Solution Demonstration** of **EcoAssistant** have a synergistic effect.
Because the query-code database is shared by all LLM assistants, even without any specialized design,
a solution produced by a more powerful LLM assistant (e.g., GPT-4) can later be retrieved to guide a weaker LLM assistant (e.g., GPT-3.5-turbo).
This synergistic effect further improves the performance and reduces the cost of **EcoAssistant**.

### Experimental Results

We evaluate **EcoAssistant** on three datasets: Places, Weather, and Stock. Compared with a single GPT-4 assistant, **EcoAssistant** achieves a higher success rate at a lower cost, as shown in the figure below.
For more details about these and other experiments, please refer to our [paper](https://arxiv.org/abs/2310.03046).

![results](img/results.png)

## Further reading

Please refer to our [paper](https://arxiv.org/abs/2310.03046) and [codebase](https://github.com/JieyuZ2/EcoAssistant) for more details about **EcoAssistant**.

If you find this blog useful, please consider citing:

```bibtex
@article{zhang2023ecoassistant,
  title={EcoAssistant: Using LLM Assistant More Affordably and Accurately},
  author={Zhang, Jieyu and Krishna, Ranjay and Awadallah, Ahmed H and Wang, Chi},
  journal={arXiv preprint arXiv:2310.03046},
  year={2023}
}
```
@@ -39,3 +39,9 @@ beibinli:
  title: Senior Research Engineer at Microsoft
  url: https://github.com/beibinli
  image_url: https://github.com/beibinli.png

jieyuz2:
  name: Jieyu Zhang
  title: PhD student at University of Washington
  url: https://jieyuz2.github.io/
  image_url: https://github.com/jieyuz2.png