mirror of https://github.com/acon96/home-llm.git
backends should all work now
@@ -7,9 +7,6 @@ There are multiple backends to choose for running the model that the Home Assist
| Option Name | Description | Suggested Value |
|-----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------|
| LLM API | The set of tools that are provided to the LLM. Use Assist for the built-in API. If you are using Home-LLM v1, v2, or v3, then select the dedicated API. | |
| System Prompt | [see here](./Model%20Prompting.md) | |
| Prompt Format | The prompt template used to build the model's context; see the Prompt Format section below | |
| Tool Format | The format of the tool definitions that are provided to the model: Full, Reduced, or Minimal | |
| Multi-Turn Tool Use | Enable this if the model you are using expects to receive the result of a tool call before responding to the user | |
| Maximum tokens to return in response | Limits the number of tokens that can be produced by each model response | 512 |
| Additional attributes to expose in the context | Extra attributes that will be exposed to the model via the `{{ devices }}` template variable | |
| Arguments allowed to be passed to service calls | Any arguments not listed here will be filtered out of service calls. Use this to restrict the model from modifying certain parts of your home. | |

@@ -65,7 +62,6 @@ For details about the sampling parameters, see here: https://github.com/oobaboog
| Option Name | Description | Suggested Value |
|----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------|
| Request Timeout | The maximum time in seconds that the integration will wait for a response from the remote server | 90 (higher if running on low resource hardware) |
| Use chat completions endpoint | If set, tells text-generation-webui to format the prompt instead of this extension. The Prompt Format set here will not apply if this is enabled | |
| Generation Preset/Character Name | The preset or character name to pass to the backend. If none is provided, then the settings currently selected in the UI will be applied | |
| Chat Mode | [see here](https://github.com/oobabooga/text-generation-webui/wiki/01-%E2%80%90-Chat-Tab#mode) | Instruct |
| Top K | Sampling parameter; see above link | 40 |

@@ -80,7 +76,6 @@ For details about the sampling parameters, see here: https://github.com/oobaboog
| Option Name | Description | Suggested Value |
|-------------------------------|------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------|
| Request Timeout | The maximum time in seconds that the integration will wait for a response from the remote server | 90 (higher if running on low resource hardware) |
| Keep Alive/Inactivity Timeout | The duration in minutes to keep the model loaded after each request. Set to a negative value to keep the model loaded forever | 30m |
| Use chat completions endpoint | If set, tells Ollama to format the prompt instead of this extension. The Prompt Format set here will not apply if this is enabled | |
| JSON Mode | Restricts the model to only output valid JSON objects. Enable this if you are using ICL and are getting invalid JSON responses. | True |
| Top K | Sampling parameter; see above link | 40 |
| Top P | Sampling parameter; see above link | 1.0 |

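For reference, most of these options correspond to standard fields in Ollama's `/api/chat` request. The payload below is only an illustrative sketch (the model name and message text are placeholders, and the exact request the integration builds may differ):

```json
{
  "model": "example-model:latest",
  "messages": [
    {"role": "system", "content": "<system prompt>"},
    {"role": "user", "content": "turn on the kitchen light"}
  ],
  "stream": false,
  "format": "json",
  "keep_alive": "30m",
  "options": {
    "num_predict": 512,
    "top_k": 40,
    "top_p": 1.0,
    "temperature": 0.1
  }
}
```

Here `format` corresponds to JSON Mode, `keep_alive` to the Keep Alive/Inactivity Timeout, and `options.num_predict` to the maximum tokens to return.
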
@@ -92,6 +87,5 @@ For details about the sampling parameters, see here: https://github.com/oobaboog
| Option Name | Description | Suggested Value |
|-------------------------------|----------------------------------------------------------------------------------------------------|-------------------------------------------------|
| Request Timeout | The maximum time in seconds that the integration will wait for a response from the remote server | 90 (higher if running on low resource hardware) |
| Use chat completions endpoint | Flag to use `/v1/chat/completions` as the remote endpoint instead of `/v1/completions` | Backend Dependent |
| Top P | Sampling parameter; see above link | 1.0 |
| Temperature | Sampling parameter; see above link | 0.1 |

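To illustrate what the endpoint flag changes, here is a rough sketch of the two request shapes used by OpenAI-compatible servers (the model name and message text are placeholders). With the flag disabled, the integration formats the prompt itself and posts it to `/v1/completions`:

```json
{
  "model": "example-model",
  "prompt": "<fully formatted prompt using the Prompt Format selected above>",
  "max_tokens": 512,
  "top_p": 1.0,
  "temperature": 0.1
}
```

With the flag enabled, the raw conversation is sent to `/v1/chat/completions` and the backend applies its own chat template:

```json
{
  "model": "example-model",
  "messages": [
    {"role": "system", "content": "<system prompt>"},
    {"role": "user", "content": "turn on the kitchen light"}
  ],
  "max_tokens": 512,
  "top_p": 1.0,
  "temperature": 0.1
}
```
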
@@ -133,19 +133,3 @@ Vous êtes « Al », un assistant IA utile qui contrôle les appareils d'une m
Eres 'Al', un útil asistente de IA que controla los dispositivos de una casa. Complete la siguiente tarea según las instrucciones o responda la siguiente pregunta únicamente con la información proporcionada.
```
-->

## Prompt Format
In addition to the system prompt, there is also a prompt "template" or prompt "format" that defines how text is passed to the model so that it matches the model's instruction fine-tuning. For best results, the prompt format selected here should match the format specified by the model's author.

Currently supported prompt formats are (a ChatML example is shown after the list):

1. ChatML
2. Vicuna
3. Alpaca
4. Mistral
5. Zephyr w/ eos token `<|endoftext|>`
6. Zephyr w/ eos token `</s>`
7. Zephyr w/ eos token `<|end|>`
8. Llama 3
9. Command-R
10. None (useful for foundation models)
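
As an illustration, a conversation rendered with the ChatML format looks roughly like the sketch below (the system prompt and user message are placeholders); the other formats wrap the same content in different control tokens:

```
<|im_start|>system
<system prompt><|im_end|>
<|im_start|>user
turn on the kitchen light<|im_end|>
<|im_start|>assistant
```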