Mirror of https://github.com/acon96/home-llm.git, synced 2026-01-09 13:48:05 -05:00
use qwen3 instead of mistral as ollama example
@@ -291,15 +291,25 @@ def option_overrides(backend_type: str) -> dict[str, Any]:
             # no prompt formats with tool calling support, so just use legacy tool calling
             CONF_ENABLE_LEGACY_TOOL_CALLING: True
         },
+        "qwen3": {
+            CONF_PROMPT: DEFAULT_PROMPT_BASE,
+            CONF_TEMPERATURE: 0.6,
+            CONF_TOP_K: 20,
+            CONF_TOP_P: 0.95
+        },
         "mistral": {
             CONF_PROMPT: DEFAULT_PROMPT_BASE + ICL_NO_SYSTEM_PROMPT_EXTRAS,
             CONF_MIN_P: 0.1,
             CONF_TYPICAL_P: 0.9,
             # no prompt formats with tool calling support, so just use legacy tool calling
             CONF_ENABLE_LEGACY_TOOL_CALLING: True,
         },
         "mixtral": {
             CONF_PROMPT: DEFAULT_PROMPT_BASE + ICL_NO_SYSTEM_PROMPT_EXTRAS,
             CONF_MIN_P: 0.1,
             CONF_TYPICAL_P: 0.9,
             # no prompt formats with tool calling support, so just use legacy tool calling
             CONF_ENABLE_LEGACY_TOOL_CALLING: True,
         },
         "llama-3": {
             CONF_PROMPT: DEFAULT_PROMPT_BASE + ICL_EXTRAS,
@@ -309,6 +319,7 @@ def option_overrides(backend_type: str) -> dict[str, Any]:
         },
         "zephyr": {
             CONF_PROMPT: DEFAULT_PROMPT_BASE + ICL_EXTRAS,

         },
         "phi-3": {
             CONF_PROMPT: DEFAULT_PROMPT_BASE + ICL_EXTRAS,
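For context, the dictionary above appears to supply per-model option defaults (prompt template and sampling settings) that are layered over the integration's base options when a backend is configured. Below is a rough sketch of how such a lookup could work; the `DEFAULT_OPTIONS`, `MODEL_OVERRIDES`, and `options_for` names are illustrative only and do not come from the repository.

```python
from typing import Any

# NOTE: hypothetical sketch, not code from the repository.
# Base sampling defaults a config flow might start from.
DEFAULT_OPTIONS: dict[str, Any] = {
    "temperature": 0.1,
    "top_k": 40,
    "top_p": 1.0,
}

# Per-model overrides mirroring the shape of the dictionary in the diff above.
MODEL_OVERRIDES: dict[str, dict[str, Any]] = {
    "qwen3": {"temperature": 0.6, "top_k": 20, "top_p": 0.95},
    "mistral": {"min_p": 0.1, "typical_p": 0.9},
}


def options_for(model_name: str) -> dict[str, Any]:
    """Merge the defaults with the first override whose key appears in the model name."""
    options = dict(DEFAULT_OPTIONS)
    for key, overrides in MODEL_OVERRIDES.items():
        if key in model_name.lower():
            options.update(overrides)
            break
    return options


print(options_for("qwen3:8b"))  # picks up the qwen3 sampling settings
```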
@@ -9,7 +9,7 @@
 * [Step 1: Wheel Installation for llama-cpp-python](#step-1-wheel-installation-for-llama-cpp-python)
 * [Step 2: Model Selection](#step-2-model-selection)
 * [Step 3: Model Configuration](#step-3-model-configuration)
-* [Path 2: Using Mistral-Instruct-7B with Ollama Backend](#path-2-using-mistral-instruct-7b-with-ollama-backend)
+* [Path 2: Using Qwen3 with Ollama Backend](#path-2-using-qwen3-with-ollama-backend)
 * [Overview](#overview-1)
 * [Step 1: Downloading and serving the Model](#step-1-downloading-and-serving-the-model)
 * [Step 2: Connect to the Ollama API](#step-2-connect-to-the-ollama-api)
@@ -76,12 +76,22 @@ Once the desired API has been selected, scroll to the bottom and click `Submit`.

 The model will be loaded into memory and should now be available to select as a conversation agent!

-## Path 2: Using Mistral-Instruct-7B with Ollama Backend
+## Path 2: Using Qwen3 with Ollama Backend
 ### Overview
-For those who have access to a GPU, you can also use the Mistral-Instruct-7B model to power your conversation agent. This path requires a separate machine that has a GPU and has [Ollama](https://ollama.com/) already installed on it. This path utilizes in-context learning examples, to prompt the model to produce the output that we expect.
+For those who have access to a GPU, you can also use the Qwen3 model to power your conversation agent. This path requires a separate machine that has a GPU and has [Ollama](https://ollama.com/) already installed on it. This path utilizes in-context learning examples to prompt the model to produce the output that we expect.

 ### Step 1: Downloading and serving the Model
-Mistral can be easily set up and downloaded on the serving machine using the `ollama pull mistral` command.
+There are multiple size options for the Qwen3 series of models. Replace `8b` with the tag for your choice of model.
+
+| Parameter Count  | Estimated VRAM | Ollama Tag |
+| ---------------- | -------------- | ---------- |
+| 4 Billion        | 8-10 GB        | `4b`       |
+| 8 Billion        | 9-12 GB        | `8b`       |
+| 14 Billion       | 14-16 GB       | `14b`      |
+| 32 Billion       | 22+ GB         | `32b`      |
+| 30B (3B Active)  | 20+ GB         | `30b`      |
+
+Qwen3 can be easily set up and downloaded on the serving machine using the `ollama pull qwen3:8b` command.

 In order to access the model from another machine, we need to run the Ollama API server open to the local network. This can be achieved using the `OLLAMA_HOST=0.0.0.0:11434 ollama serve` command. **DO NOT RUN THIS COMMAND ON ANY PUBLICLY
 ACCESSIBLE SERVERS AS IT LISTENS ON ALL NETWORK INTERFACES**
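Once `ollama serve` is listening on the LAN, it can be worth confirming from another machine that the server is reachable and the model has been pulled before configuring Home Assistant. A small sketch using Ollama's `/api/tags` endpoint, which lists locally pulled models; the host address below is a placeholder for your serving machine:

```python
import requests  # third-party; `pip install requests` if needed

OLLAMA_URL = "http://192.168.1.50:11434"  # placeholder: LAN address of the Ollama machine

resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
resp.raise_for_status()

models = [m["name"] for m in resp.json().get("models", [])]
print("Models served:", models)

if not any(name.startswith("qwen3") for name in models):
    print("qwen3 not found; run `ollama pull qwen3:8b` on the serving machine first.")
```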
@@ -103,7 +113,7 @@ In order to access the model from another machine, we need to run the Ollama API
 ### Step 3: Model Selection & Configuration
 1. You must create the conversation agent based on the model you wish to use.
     Under the `Ollama at '<url>'` service that you just created, select `+ Add conversation agent`
-    - **Model Name**: Select the Mistral 7B model. This should automatically populated based on the model you already downloaded
+    - **Model Name**: Select `qwen3:8b` from the list.
 2. You can configure how the model is "prompted". See [here](./Model%20Prompting.md) for more information on how that works.

     For now, defaults for the model should have been populated. If you would like the model to be able to control devices then you must select the `Assist` API.
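As an optional sanity check before adding the conversation agent, you can send a single chat turn straight to the served model through Ollama's `/api/chat` endpoint; the host address and prompt below are placeholders:

```python
import requests  # third-party; `pip install requests` if needed

OLLAMA_URL = "http://192.168.1.50:11434"  # placeholder: LAN address of the Ollama machine

payload = {
    "model": "qwen3:8b",
    "messages": [{"role": "user", "content": "Turn on the kitchen light."}],
    "stream": False,  # ask for the whole reply in a single JSON response
}

resp = requests.post(f"{OLLAMA_URL}/api/chat", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```

If this returns a sensible reply, the Home Assistant side only needs the same host, port, and model name.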