mirror of https://github.com/acon96/home-llm.git (synced 2026-01-08 05:14:02 -05:00)

tweak readme structure

README.md | 211
@@ -1,12 +1,13 @@

# Home LLM

This project provides the required "glue" components to control your Home Assistant installation with a completely local Large Language Model acting as a personal assistant. The goal is to provide a drop-in solution to be used as a "conversation agent" component by Home Assistant.

## Quick Start

Please see the [Setup Guide](./docs/Setup.md) for more information on installation.

## Home Assistant Component

In order to integrate with Home Assistant, we provide a `custom_component` that exposes the locally running LLM as a "conversation agent".

This component can be interacted with in a few ways:

- using a chat interface so you can chat with it.
- integrating with Speech-to-Text and Text-to-Speech add-ons so you can just speak to it.

@@ -14,135 +15,8 @@ The component can either run the model directly as part of the Home Assistant so

When doing this, you can host the model yourself and point the component at the machine where the model is hosted, or you can run the model using text-generation-webui with the provided [custom Home Assistant add-on](./addon).

## Requirements

- A supported version of Home Assistant (at time of writing this is `2024.1.6`)
- [HACS](https://hacs.xyz/docs/setup/download/) (if you want to install it that way)
- SSH or Web Terminal access to your Home Assistant instance, if you want to use the built-in llama-cpp backend or perform a manual install

## 🏃 Getting Started

Installing and configuring Home LLM involves several steps:

1. 💾 Install the Home LLM component.
2. ⚙️ Choose and configure a backend.
3. 🗣️ Configure the voice assistant.

### 💾 🚕 Install Home LLM with HACS

> 🛑 ✋🏻 Requires HACS
>
> First make sure you have [HACS installed](https://hacs.xyz/docs/setup/download/)

Once you have HACS installed, this button will help you add the repository to HACS and open the download page:

1. [](https://my.home-assistant.io/redirect/hacs_repository/?category=Integration&repository=home-llm&owner=acon96)
2. Restart Home Assistant

A "LLaMA Conversation" device should show up in the `Settings > Devices and Services > [Devices]` tab now:

### 💾 🔨 Install Home LLM Manually

1. Ensure you have either the Samba, SSH, FTP, or another add-on installed that gives you access to the `config` folder.
2. If there is not already a `custom_components` folder, create one now.
3. Copy the `custom_components/llama_conversation` folder from this repo to `config/custom_components/llama_conversation` on your Home Assistant machine (see the sketch after this list).
4. Restart Home Assistant: `Developer Tools -> Services -> Run` : `homeassistant.restart`

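If you have shell access (for example via an SSH add-on), the copy in step 3 can be done from the terminal. A minimal sketch, assuming `git` is available in your terminal add-on and your configuration lives at the default `/config` path:

```console
# clone the repo somewhere temporary (the path is just an example)
git clone https://github.com/acon96/home-llm.git /tmp/home-llm

# create the custom_components folder if it does not exist yet
mkdir -p /config/custom_components

# copy the integration into place
cp -r /tmp/home-llm/custom_components/llama_conversation /config/custom_components/

# clean up the temporary clone
rm -rf /tmp/home-llm
```

With Samba, the equivalent is simply dragging the `llama_conversation` folder into `config/custom_components` from your desktop.
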
A "LLaMA Conversation" device should show up in the `Settings > Devices and Services > [Devices]` tab now:
|
||||

|
||||
|
||||
|
||||
### ⚙️ Configuration and Setup

Decide whether you want to run the model directly as part of Home Assistant or have it served by a separate API:

- ✖️ It will be served by an API: continue on.
- ✔️ It will run directly as part of Home Assistant: first follow the instructions below for [`llama-cpp-python`](#llama-cpp-python).

1. `Settings > Devices and Services`.
2. Click the `Add Integration` button in the bottom right of the screen.
3. Filter the list of "brand names" for "llama", and "LLaMA Conversation" should remain.
4. Choose and configure the backend. [More info 👇](#configure-backend)
    1. Using the built-in llama.cpp with a model from HuggingFace
    2. Using the built-in llama.cpp with an existing model file
    3. Using the text-generation-webui API
    4. Using a generic OpenAI-compatible API
    5. Using the Ollama API

### 🗣️ Configuring the component as a Conversation Agent

1. Navigate to `Settings` -> `Voice Assistants`
2. Select `+ Add Assistant`
3. Name the assistant whatever you want.
4. Select the conversation agent that we created previously.
5. If using STT or TTS, configure these now.
6. Return to the "Overview" dashboard and select the chat icon in the top left.
7. From here you can submit queries to the AI agent.

In order for any entities to be available to the agent, you must "expose" them first.

1. Navigate to "Settings" -> "Voice Assistants" -> "Expose" Tab
2. Select "+ Expose Entities" in the bottom right
3. Check any entities you would like to be exposed to the conversation agent.

> 🛑 ✋🏻 Security Warning
>
> Any devices that you select to be exposed to the model will be added as
> context and potentially have their state changed by the model.
>
> Only expose devices that you want the model modifying the state of.
>
> The model may occasionally hallucinate and issue commands to the wrong device!
>
> Use.At.Your.Own.Risk 💣

## Technical Details

### `llama-cpp-python`

This only applies if you don't want to spin up your own LLM API server and would rather have the model run inside Home Assistant as an abstracted-away implementation detail.

In order to run a model directly as part of your Home Assistant installation, you will need to install one of the pre-built wheels, because there are no existing musllinux wheels for the package. Compatible wheels for x86_64 and arm64 are provided in the [dist](./dist) folder. Copy the `*.whl` files to the `custom_components/llama_conversation/` folder; they will be installed while setting up the component.

Once this is done, the backend setup process for the Llama.cpp options will handle installing the appropriate `*.whl` file.

Obtain terminal access to the Home Assistant instance and create the prerequisite folder, then download the pre-bundled Python wheel files:

```console
mkdir -p /config/custom_components/llama_conversation
cd /config/custom_components/llama_conversation

# download the wheel that matches your hardware (aarch64 for RPi 4/5, x86_64 for Intel/AMD)
wget https://github.com/acon96/home-llm/raw/develop/dist/llama_cpp_python-0.2.38-cp311-cp311-musllinux_1_2_aarch64.whl
wget https://github.com/acon96/home-llm/raw/develop/dist/llama_cpp_python-0.2.38-cp311-cp311-musllinux_1_2_x86_64.whl
```

> ❔ 🤔 How to get terminal access?
>
> There are many ways, but for the sake of simplicity you can try one of these
> add-ons:
>
> - https://github.com/hassio-addons/repository?tab=readme-ov-file#-studio-code-server
> - https://github.com/hassio-addons/repository?tab=readme-ov-file#-advanced-ssh--web-terminal

### Constrained Grammar

When running the model locally with Llama.cpp, the component also constrains the model output using a GBNF grammar.
This forces the model to provide valid output no matter what, since its output is constrained to valid JSON every time.
This helps the model perform significantly better at lower quantization levels, where it would previously generate syntax errors.

See [output.gbnf](./custom_components/llama_conversation/output.gbnf) for the existing grammar.

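To give a feel for what GBNF looks like, here is a deliberately simplified, illustrative sketch that only accepts a single JSON service call of the shape shown later in this README. The rule names and the exact shape are assumptions for illustration; the project's real grammar is the `output.gbnf` file linked above.

```
# Illustrative only -- the real grammar ships as custom_components/llama_conversation/output.gbnf
# Accepts exactly one object like: { "service": "light.turn_on", "target_device": "light.kitchen" }
root  ::= "{ \"service\": \"" ident "\", \"target_device\": \"" ident "\" }"
ident ::= [a-z_]+ "." [a-z_0-9]+
```

Because every token the model emits must match one of these rules, malformed JSON simply cannot be produced.
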
## Model

The "Home" models are a fine-tuning of various small large language models (under 3B parameters). The model is able to control devices in the user's house as well as perform basic question answering. The fine-tuning dataset is a [custom synthetic dataset](./data) designed to teach the model function calling based on the device information in the context.

The latest models can be found on HuggingFace:
3B v2 (Based on Phi-2): https://huggingface.co/acon96/Home-3B-v2-GGUF
@@ -157,8 +31,6 @@ The latest models can be found on HuggingFace:

</details>

Make sure you have `llama-cpp-python>=0.2.29` in order to run these models.

The main difference between the two models (besides parameter count) is the training data. The 1B model is ONLY trained on the synthetic dataset provided in this project, while the 3B model is trained on a mixture of this synthetic dataset and the cleaned Stanford Alpaca dataset.

The model is quantized using Llama.cpp in order to enable running the model in super low resource environments that are common with Home Assistant installations, such as Raspberry Pis.
@@ -184,7 +56,7 @@ Output from the model will consist of a response that should be relayed back to

`````
<|im_start|>assistant
turning on the kitchen lights for you now
```homeassistant
{ "service": "light.turn_on", "target_device": "light.kitchen" }
```<|im_end|>
`````
@@ -253,74 +125,9 @@ python3 train.py \

</details>
<br/>

## Home Assistant Addon

In order to facilitate running the project entirely on the system where Home Assistant is installed, there is an experimental Home Assistant add-on that runs oobabooga/text-generation-webui, which you can connect to using the "remote" backend options. The add-on can be found in the [addon/](./addon/README.md) directory.

### Backend Configuration

When setting up the component, there are 5 different "backend" options to choose from:

a. Llama.cpp with a model from HuggingFace
b. Llama.cpp with a locally provided model
c. A remote instance of text-generation-webui
d. A generic OpenAI API compatible interface; *should* be compatible with LocalAI, LM Studio, and all other OpenAI-compatible backends
e. The Ollama API

See [docs/Backend Configuration.md](/docs/Backend%20Configuration.md) for more info.

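For the remote options, it can save time to confirm the backend is reachable from Home Assistant's network before configuring the component. For example, for the Ollama backend (option E), a minimal check from a terminal; the hostname `my-llm-server` is a placeholder and `11434` is Ollama's default API port:

```console
# lists the models the Ollama server has pulled; the model you plan to use must appear here
curl http://my-llm-server:11434/api/tags
```
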
#### Llama.cpp Backend with a model from HuggingFace

This is option A.

You need the following settings to configure the local backend with a model from HuggingFace:
1. Model Name: the name of the model in the form `repo/model-name`. The repo MUST contain a GGUF quantized model.
2. Model Quantization: the quantization level to download. Pick from the list. Higher quantizations use more RAM but have higher quality responses.

#### Llama.cpp Backend with a locally downloaded model

This is option B.

You need the following settings to configure the local backend with a locally downloaded model:
1. Model File Name: the file name where Home Assistant can access the model to load. Most likely a sub-path of `/config` or `/media`, or wherever you copied the model file to.

#### Remote Backends

This covers options C, D, and E.

You need the following settings in order to configure the "remote" backend:
1. Hostname: the host of the machine where the text-generation-webui API is hosted. If you are using the provided add-on, then the hostname is `local-text-generation-webui` or `f459db47-text-generation-webui` depending on how the add-on was installed.
2. Port: the port for accessing the text-generation-webui API. NOTE: this is not the same as the UI port. (Usually 5000)
3. Name of the Model: this name must EXACTLY match the name as it appears in `text-generation-webui`.

With the remote text-generation-webui backend, the component will validate that the selected model is available for use and will ensure it is loaded remotely. The generic OpenAI compatible version does NOT do any validation or model loading.

**Setting up with LocalAI**:
If you are an existing LocalAI user or would like to use LocalAI as your backend, please refer to [this](https://io.midori-ai.xyz/howtos/setup-with-ha/) website, which has instructions on how to set up LocalAI to work with Home-LLM, including automatic installation of the latest version of the Home-LLM model. The auto-installer (LocalAI Manager) will automatically download and set up LocalAI and/or the model of your choice and automatically create the necessary template files for the model to work with this integration.

### Running the text-generation-webui add-on

In order to facilitate running the project entirely on the system where Home Assistant is installed, there is an experimental Home Assistant add-on that runs oobabooga/text-generation-webui, which you can connect to using the "remote" backend option.

You can use this button to automatically download and build the add-on for `oobabooga/text-generation-webui`:

[](https://my.home-assistant.io/redirect/supervisor_addon/?addon=f459db47_text-generation-webui&repository_url=https%3A%2F%2Fgithub.com%2Facon96%2Fhome-llm)

If the automatic installation fails, then you can install the add-on manually using the following steps:

1. Ensure you have either the Samba, SSH, FTP, or another add-on installed that gives you access to the `addons` folder.
2. Copy the `addon` folder from this repo to `addons/text-generation-webui` on your Home Assistant machine.
3. Go to the "Add-ons" section in settings and then pick the "Add-on Store" from the bottom right corner.
4. Select the 3 dots in the top right, click "Check for Updates", and refresh the webpage.
5. There should now be a "Local Add-ons" section at the top of the "Add-on Store".
6. Install the `oobabooga-text-generation-webui` add-on. It will take ~15-20 minutes to build the image on a Raspberry Pi.
7. Copy any models you want to use to the `addon_configs/local_text-generation-webui/models` folder or download them using the UI.
8. Load up a model to use. NOTE: The timeout for ingress pages is only 60 seconds, so if the model takes longer than 60 seconds to load (very likely), the UI will appear to time out and you will need to check the add-on's logs to see when the model is fully loaded.

### Performance of running the model on a Raspberry Pi

The RPI4 4GB that I have was sitting right at 1.5 tokens/sec for prompt eval and 1.6 tokens/sec for token generation when running the `Q4_K_M` quant. I was reliably getting responses in 30-60 seconds after the initial prompt processing, which took almost 5 minutes. It depends significantly on the number of devices that have been exposed as well as how many states have changed since the last invocation, because llama.cpp caches KV values for identical prompt prefixes.

It is highly recommended to set up text-generation-webui on a separate machine that can take advantage of a GPU.

## Version History
| Version | Description |

docs/Performance.md | 36 (new file)

@@ -0,0 +1,36 @@

### Performance of running the model on a Raspberry Pi

The RPI4 4GB that I have was sitting right at 1.5 tokens/sec for prompt eval and 1.6 tokens/sec for token generation when running the `Q4_K_M` quant. I was reliably getting responses in 30-60 seconds after the initial prompt processing, which took almost 5 minutes. It depends significantly on the number of devices that have been exposed as well as how many states have changed since the last invocation, because llama.cpp caches KV values for identical prompt prefixes.

It is highly recommended to set up text-generation-webui on a separate machine that can take advantage of a GPU.

# Home 1B V2 GGUF Q4_K_M RPI5

christmas.txt
```
llama_print_timings: load time = 678.37 ms
llama_print_timings: sample time = 16.38 ms / 45 runs ( 0.36 ms per token, 2747.09 tokens per second)
llama_print_timings: prompt eval time = 31356.56 ms / 487 tokens ( 64.39 ms per token, 15.53 tokens per second)
llama_print_timings: eval time = 4868.37 ms / 44 runs ( 110.64 ms per token, 9.04 tokens per second)
llama_print_timings: total time = 36265.33 ms / 531 tokens
```

climate.txt
```
llama_print_timings: load time = 613.87 ms
llama_print_timings: sample time = 20.62 ms / 55 runs ( 0.37 ms per token, 2667.96 tokens per second)
llama_print_timings: prompt eval time = 27324.34 ms / 431 tokens ( 63.40 ms per token, 15.77 tokens per second)
llama_print_timings: eval time = 5780.72 ms / 54 runs ( 107.05 ms per token, 9.34 tokens per second)
llama_print_timings: total time = 33152.48 ms / 485 tokens
```

# Home 3B V2 GGUF Q4_K_M RPI5

climate.txt
```
llama_print_timings: load time = 1179.64 ms
llama_print_timings: sample time = 19.25 ms / 52 runs ( 0.37 ms per token, 2702.00 tokens per second)
llama_print_timings: prompt eval time = 52688.82 ms / 431 tokens ( 122.25 ms per token, 8.18 tokens per second)
llama_print_timings: eval time = 10206.12 ms / 51 runs ( 200.12 ms per token, 5.00 tokens per second)
llama_print_timings: total time = 62942.85 ms / 482 tokens
```

sonnet.txt
```
llama_print_timings: load time = 1076.44 ms
llama_print_timings: sample time = 1225.34 ms / 236 runs ( 5.19 ms per token, 192.60 tokens per second)
llama_print_timings: prompt eval time = 60754.40 ms / 490 tokens ( 123.99 ms per token, 8.07 tokens per second)
llama_print_timings: eval time = 44885.82 ms / 213 runs ( 210.73 ms per token, 4.75 tokens per second)
llama_print_timings: total time = 107127.16 ms / 703 tokens
```

docs/Setup.md | 158 (new file)

@@ -0,0 +1,158 @@

# Setup Instructions

## Home Assistant Component
### Requirements

- A supported version of Home Assistant: `2023.10.0` or newer
- SSH or Samba access to your Home Assistant instance

**Optional:**
- [HACS](https://hacs.xyz/docs/setup/download/) (if you want to install it that way)

### 💾 🚕 Install the Home Assistant Component with HACS

> 🛑 ✋🏻 Requires HACS
>
> First make sure you have [HACS installed](https://hacs.xyz/docs/setup/download/)

Once you have HACS installed, this button will help you add the repository to HACS and open the download page:

[](https://my.home-assistant.io/redirect/hacs_repository/?category=Integration&repository=home-llm&owner=acon96)

**Remember to restart Home Assistant after installing the component!**

A "LLaMA Conversation" device should show up in the `Settings > Devices and Services > [Devices]` tab now:

### 💾 🔨 Install the Home Assistant Component Manually

1. Ensure you have either the Samba, SSH, FTP, or another add-on installed that gives you access to the `config` folder.
2. If there is not already a `custom_components` folder, create one now.
3. Copy the `custom_components/llama_conversation` folder from this repo to `config/custom_components/llama_conversation` on your Home Assistant machine.
4. Restart Home Assistant: `Developer Tools -> Services -> Run` : `homeassistant.restart`

A "LLaMA Conversation" device should show up in the `Settings > Devices and Services > [Devices]` tab now:

### ⚙️ Configuration and Setup

You must set up at least one model by configuring the integration.

1. `Settings > Devices and Services`.
2. Click the `Add Integration` button in the bottom right of the screen.
3. Filter the list of "brand names" for "llama", and "LLaMA Conversation" should remain.
4. Choose and configure the backend. [More info 👇](#configure-backend)
    1. Using the built-in llama.cpp with a model from HuggingFace
    2. Using the built-in llama.cpp with an existing model file
    3. Using the text-generation-webui API
    4. Using a generic OpenAI-compatible API
    5. Using the Ollama API

### `llama-cpp-python` Wheel Installation

If you plan on running the model locally on the same hardware as your Home Assistant server, then the recommended way to run the model is to use Llama.cpp. Unfortunately, there are no pre-built wheels for this package for the musllinux runtime that Home Assistant Docker images use. To get around this, we provide compatible wheels for x86_64 and arm64 in the [dist](../dist) folder.

Download the `*.whl` file that matches your hardware, then copy it to the `custom_components/llama_conversation/` folder. It will be installed as a configuration step while setting up the Home Assistant component.

| Wheel | Platform | Home Assistant Version |
| --- | --- | --- |
| llama_cpp_python-{version}-cp311-cp311-musllinux_1_2_aarch64.whl | AARCH64 (RPi 4 and 5) | `2024.1.4` and older |
| llama_cpp_python-{version}-cp311-cp311-musllinux_1_2_x86_64.whl | x86_64 (Intel + AMD) | `2024.1.4` and older |
| llama_cpp_python-{version}-cp312-cp312-musllinux_1_2_aarch64.whl | AARCH64 (RPi 4 and 5) | `2024.2.0` and newer |
| llama_cpp_python-{version}-cp312-cp312-musllinux_1_2_x86_64.whl | x86_64 (Intel + AMD) | `2024.2.0` and newer |

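If you are unsure which row applies to you, a quick check from a terminal narrows it down. A minimal sketch, assuming you have a terminal add-on with shell access and that the `ha` CLI is available there (the Home Assistant version is also visible under `Settings > About` in the UI):

```console
# CPU architecture: "aarch64" means the aarch64 wheel, "x86_64" means the x86_64 wheel
uname -m

# Home Assistant Core version: 2024.1.4 and older -> cp311 wheel, 2024.2.0 and newer -> cp312 wheel
ha core info
```
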
### Constrained Grammar

When running the model locally with Llama.cpp, the component also constrains the model output using a GBNF grammar.
This forces the model to provide valid output no matter what, since its output is constrained to valid JSON every time.
This helps the model perform significantly better at lower quantization levels, where it would previously generate syntax errors.

See [output.gbnf](../custom_components/llama_conversation/output.gbnf) for the existing grammar.

### Backend Configuration

When setting up the component, there are 5 different "backend" options to choose from:

a. Llama.cpp with a model from HuggingFace
b. Llama.cpp with a locally provided model
c. A remote instance of text-generation-webui
d. A generic OpenAI API compatible interface; *should* be compatible with LocalAI, LM Studio, and all other OpenAI-compatible backends
e. The Ollama API

See [docs/Backend Configuration.md](/docs/Backend%20Configuration.md) for more info.

#### Llama.cpp Backend with a model from HuggingFace

This is option A.

You need the following settings to configure the local backend with a model from HuggingFace:
1. Model Name: the name of the model in the form `repo/model-name`. The repo MUST contain a GGUF quantized model.
2. Model Quantization: the quantization level to download. Pick from the list. Higher quantizations use more RAM but have higher quality responses.

#### Llama.cpp Backend with a locally downloaded model

This is option B.

You need the following settings to configure the local backend with a locally downloaded model:
1. Model File Name: the file name where Home Assistant can access the model to load. Most likely a sub-path of `/config` or `/media`, or wherever you copied the model file to.

#### Remote Backends

This covers options C, D, and E.

You need the following settings in order to configure the "remote" backend:
1. Hostname: the host of the machine where the text-generation-webui API is hosted. If you are using the provided add-on, then the hostname is `local-text-generation-webui` or `f459db47-text-generation-webui` depending on how the add-on was installed.
2. Port: the port for accessing the text-generation-webui API. NOTE: this is not the same as the UI port. (Usually 5000)
3. Name of the Model: this name must EXACTLY match the name as it appears in `text-generation-webui`.

With the remote text-generation-webui backend, the component will validate that the selected model is available for use and will ensure it is loaded remotely. The generic OpenAI compatible version does NOT do any validation or model loading.

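Before configuring a generic OpenAI-compatible backend (option D), it can help to confirm the server answers on its chat completions endpoint. A minimal sketch; the hostname `my-llm-server`, port `5000`, and model name `Home-3B-v2` are placeholders for whatever your backend actually exposes:

```console
curl http://my-llm-server:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Home-3B-v2",
        "messages": [{"role": "user", "content": "turn on the kitchen lights"}]
      }'
```

If this returns a JSON completion, the same hostname, port, and model name should work in the integration's remote backend settings.
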
**Setting up with LocalAI**:
If you are an existing LocalAI user or would like to use LocalAI as your backend, please refer to [this](https://io.midori-ai.xyz/howtos/setup-with-ha/) website, which has instructions on how to set up LocalAI to work with Home-LLM, including automatic installation of the latest version of the Home-LLM model. The auto-installer (LocalAI Manager) will automatically download and set up LocalAI and/or the model of your choice and automatically create the necessary template files for the model to work with this integration.

### 🗣️ Configuring the component as a Conversation Agent

1. Navigate to `Settings` -> `Voice Assistants`
2. Select `+ Add Assistant`
3. Name the assistant whatever you want.
4. Select the conversation agent that we created previously.
5. If using STT or TTS, configure these now.
6. Return to the "Overview" dashboard and select the chat icon in the top left.
7. From here you can submit queries to the AI agent.

In order for any entities to be available to the agent, you must "expose" them first.

1. Navigate to "Settings" -> "Voice Assistants" -> "Expose" Tab
2. Select "+ Expose Entities" in the bottom right
3. Check any entities you would like to be exposed to the conversation agent.

> 🛑 ✋🏻 Security Warning
>
> Any devices that you select to be exposed to the model will be added as
> context and potentially have their state changed by the model.
>
> Only expose devices that you want the model modifying the state of.
>
> The model may occasionally hallucinate and issue commands to the wrong device!
>
> Use.At.Your.Own.Risk 💣

## text-generation-webui add-on

You can use this button to automatically download and build the add-on for `oobabooga/text-generation-webui`:

[](https://my.home-assistant.io/redirect/supervisor_addon/?addon=f459db47_text-generation-webui&repository_url=https%3A%2F%2Fgithub.com%2Facon96%2Fhome-llm)

If the automatic installation fails, then you can install the add-on manually using the following steps:

1. Ensure you have either the Samba, SSH, FTP, or another add-on installed that gives you access to the `addons` folder.
2. Copy the `addon` folder from this repo to `addons/text-generation-webui` on your Home Assistant machine.
3. Go to the "Add-ons" section in settings and then pick the "Add-on Store" from the bottom right corner.
4. Select the 3 dots in the top right, click "Check for Updates", and refresh the webpage.
5. There should now be a "Local Add-ons" section at the top of the "Add-on Store".
6. Install the `oobabooga-text-generation-webui` add-on. It will take ~15-20 minutes to build the image on a Raspberry Pi.
7. Copy any models you want to use to the `addon_configs/local_text-generation-webui/models` folder or download them using the UI.
8. Load up a model to use. NOTE: The timeout for ingress pages is only 60 seconds, so if the model takes longer than 60 seconds to load (very likely), the UI will appear to time out and you will need to check the add-on's logs to see when the model is fully loaded.