Merge branch 'master' into develop

This commit is contained in:
Alex O'Connell
2023-12-28 17:16:30 -05:00
13 changed files with 237 additions and 121 deletions

2
.gitignore vendored
View File

@@ -6,4 +6,4 @@ config/
data/*.json
*.pyc
main.log
.venv

View File

@@ -1,10 +1,12 @@
# Home LLM
This project provides the required "glue" components to control your Home Assistant installation with a completely local Large Language Model acting as a personal assistant. The goal is to provide a drop-in solution to be used as a "conversation agent" component by Home Assistant.
## Model
The "Home" model is a fine tuning of the Phi model series from Microsoft. The model is able to control devices in the user's house as well as perform basic question and answering. The fine tuning dataset is a combination of the [Cleaned Stanford Alpaca Dataset](https://huggingface.co/datasets/yahma/alpaca-cleaned) as well as a [custom synthetic dataset](./data) designed to teach the model function calling based on the device information in the context.
The "Home" model is a fine tuning of the Phi-2 model from Microsoft. The model is able to control devices in the user's house as well as perform basic question and answering. The fine tuning dataset is a combination of the [Cleaned Stanford Alpaca Dataset](https://huggingface.co/datasets/yahma/alpaca-cleaned) as well as a [custom synthetic dataset](./data) designed to teach the model function calling based on the device information in the context.
The model can be found on HuggingFace: https://huggingface.co/acon96/Home-3B-v1-GGUF
The model is quantized using Llama.cpp in order to enable running the model in super low resource environments that are common with Home Assistant installations such as Raspberry Pis.
The model can be used as an "instruct" type model using the ChatML prompt format. The system prompt is used to provide information about the state of the Home Assistant installation including available devices and callable services.
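As a rough illustration of that layout, a single request might be rendered like this (the system prompt wording, service list, and device names below are made up for the example; the real template is generated by the Home Assistant component):
```
<|im_start|>system
You are a helpful AI assistant that controls the devices in this house.
Services: light.turn_on, light.turn_off, fan.turn_on, fan.turn_off, lock.lock, lock.unlock
Devices:
light.kitchen_light 'Kitchen Light' = off
fan.bedroom_fan 'Bedroom Fan' = on<|im_end|>
<|im_start|>user
turn on the kitchen light<|im_end|>
<|im_start|>assistant
```
The model then completes the assistant turn with a short acknowledgement and, when a device needs to change state, a fenced `homeassistant` code block listing the service calls to perform (see the parsing logic in the component code further down in this diff).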
@@ -39,6 +41,10 @@ Due to the mix of data used during fine tuning, the model is also capable of bas
<|im_start|>assistant If Mary is 7 years old, then you are 10 years old (7+3=10).<|im_end|><|endoftext|>
```
### Synthetic Dataset
The synthetic dataset is aimed at covering basic day-to-day operations in Home Assistant, such as turning devices on and off.
The supported entity types are: light, fan, cover, lock, media_player
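For example (the entity IDs are made up), a command like "turn off the bedroom fan and lock the front door" maps to a short acknowledgement plus one service call per line in `service(entity_id)` form inside the fenced `homeassistant` block described above:
```
fan.turn_off(fan.bedroom_fan)
lock.lock(lock.front_door)
```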
### Training
The model was trained as a LoRA on an RTX 3090 (24GB) using the following settings for the custom training script. The embedding weights were "saved" and trained normally along with the rank matrices in order to train the newly added tokens to the embeddings. The full model is merged together at the end.
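The actual training code is the repo's own custom script, so the snippet below is only a sketch of the same idea expressed with the Hugging Face `peft` library; the hyperparameters, module names, and token choices are assumptions, not the project's real settings.
```python
# Sketch only: LoRA fine-tuning where the resized embeddings (and LM head) are
# trained fully so the newly added ChatML tokens learn useful vectors, while the
# attention projections get low-rank adapters.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "microsoft/phi-2"  # assumed base model for the "Home" fine-tune
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Add the ChatML control tokens and grow the embedding table to match.
tokenizer.add_special_tokens({"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]})
model.resize_token_embeddings(len(tokenizer))

lora_config = LoraConfig(
    r=32,                          # LoRA rank (hypothetical value)
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],  # names depend on the architecture
    modules_to_save=["embed_tokens", "lm_head"],  # train these weights normally, not as adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```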
@@ -63,16 +69,67 @@ The provided `custom_modeling_phi.py` has Gradient Checkpointing implemented for
## Home Assistant Component
In order to integrate with Home Assistant, we provide a `custom_component` that exposes the locally running LLM as a "conversation agent" that can be interacted with through a chat interface, and that integrates with Speech-to-Text and Text-to-Speech add-ons so you can also talk to the model by speaking.
The component can either run the model directly as part of the Home Assistant software using llama-cpp-python, or you can run the [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) project to provide access to the LLM via an API interface. When doing this, you can host the model yourself and point the add-on at the machine where the model is hosted, or you can run the model with text-generation-webui via the provided [custom Home Assistant add-on](./addon).
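For the in-process path, the integration essentially wraps llama-cpp-python's completion API; a minimal sketch of that idea follows (the model path, context size, and sampling values here are placeholders, not the component's actual configuration):
```python
# Minimal sketch of running the GGUF model in-process with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="/config/models/home-3b-v1.Q4_K_M.gguf",  # placeholder path to the quantized model
    n_ctx=2048,
)

prompt = (
    "<|im_start|>system\n...devices and services rendered here...<|im_end|>\n"
    "<|im_start|>user\nturn on the kitchen light<|im_end|>\n"
    "<|im_start|>assistant\n"
)

output = llm(prompt, max_tokens=128, temperature=0.1, stop=["<|im_end|>"])
print(output["choices"][0]["text"])
```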
### Installing
1. Ensure you have the Samba, SSH, FTP, or another add-on installed that gives you access to the `config` folder.
2. If there is not already a `custom_components` folder, create one now.
3. Copy the `custom_components/llama_conversation` folder from this repo to `config/custom_components/llama_conversation` on your Home Assistant machine.
4. Restart Home Assistant using the "Developer Tools" tab -> Services -> Run `homeassistant.restart`
5. The "LLaMA Conversation" integration should show up in the "Devices" section now.
### Setting up
When setting up the component, there are 3 different "backend" options to choose from:
1. Llama.cpp with a model from HuggingFace
2. Llama.cpp with a locally provided model
3. A remote instance of text-generation-webui
**Setting up the Llama.cpp backend with a model from HuggingFace**:
TODO: need to build wheels for llama.cpp first
**Setting up the Llama.cpp backend with a locally downloaded model**:
TODO: need to build wheels for llama.cpp first
**Setting up the "remote" backend**:
You need the following settings in order to configure the "remote" backend:
1. Hostname: the host of the machine where text-generation-webui API is hosted. If you are using the provided add-on then the hostname is `local-text-generation-webui`
2. Port: the port for accessing the text-generation-webui API. NOTE: this is not the same as the UI port. (Usually 5000)
3. Name of the Model: This name must EXACTLY match the name as it appears in `text-generation-webui`
On creation, the component will validate that the model is available for use.
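For reference, the "remote" backend boils down to a POST against text-generation-webui's OpenAI-compatible completions endpoint (the same `requests.post` call that appears in the component code further down in this diff). Roughly, with the host, port, and model name standing in for whatever you configured:
```python
# Rough sketch of the request the "remote" backend sends; values are examples only.
import requests

host = "local-text-generation-webui"  # hostname of the machine (or add-on) running the API
port = 5000                           # API port, not the 7860 web UI port
params = {
    "model": "Home-3B-v1.q4_k_m.gguf",  # must exactly match the model name in text-generation-webui
    "prompt": "<|im_start|>user\nturn on the kitchen light<|im_end|>\n<|im_start|>assistant\n",
    "max_tokens": 128,
    "temperature": 0.1,
}

result = requests.post(f"http://{host}:{port}/v1/completions", json=params, timeout=90)
result.raise_for_status()
print(result.json()["choices"][0]["text"])
```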
### Configuring the component as a Conversation Agent
**NOTE: ANY DEVICES THAT YOU SELECT TO BE EXPOSED TO THE MODEL WILL BE ADDED AS CONTEXT AND POTENTIALLY HAVE THEIR STATE CHANGED BY THE MODEL. ONLY EXPOSE DEVICES THAT YOU ARE OK WITH THE MODEL MODIFYING THE STATE OF, EVEN IF IT IS NOT WHAT YOU REQUESTED. THE MODEL MAY OCCASIONALLY HALLUCINATE AND ISSUE COMMANDS TO THE WRONG DEVICE! USE AT YOUR OWN RISK.**
In order to utilize the conversation agent in Home Assistant:
1. Navigate to "Settings" -> "Voice Assistants"
2. Select "+ Add Assistant"
3. Name the assistant whatever you want.
4. Select the "Conversation Agent" that we created previously
5. If using STT or TTS, configure them now.
6. Return to the "Overview" dashboard and select the chat icon in the top left.
From here you can submit queries to the AI agent.
In order for any entities to be available to the agent, you must "expose" them first.
1. Navigate to "Settings" -> "Voice Assistants" -> "Expose" Tab
2. Select "+ Expose Entities" in the bottom right
3. Check any entities you would like to be exposed to the conversation agent.
### Running the text-generation-webui add-on
In order to facilitate running the project entirely on the system where Home Assistant is installed, there is an experimental Home Assistant add-on that runs oobabooga/text-generation-webui, which the component can then connect to using the "remote" backend option.
1. Ensure you have the Samba, SSH, FTP, or another add-on installed that gives you access to the `addons` folder.
2. Copy the `addon` folder from this repo to `addons/text-generation-webui` on your Home Assistant machine.
3. Go to the "Add-ons" section in settings and then pick the "Add-on Store" from the bottom right corner.
4. Select the 3 dots in the top right, click "Check for Updates", and refresh the webpage.
5. There should now be a "Local Add-ons" section at the top of the "Add-on Store"
6. Install the `oobabooga-text-generation-webui` add-on. It will take ~15-20 minutes to build the image on a Raspberry Pi.
7. Copy any models you want to use to the `addon_configs/local_text-generation-webui/models` folder or download them using the UI.
8. Load up a model to use. NOTE: The timeout for ingress pages is only 60 seconds, so if the model takes longer than that to load (very likely), the UI will appear to time out and you will need to check the add-on's logs to see when the model has finished loading.
### Performance of running the model on a Raspberry Pi
The RPi 4 (4GB) that I have was sitting right at 1.5 tokens/sec for prompt eval and 1.6 tokens/sec for token generation when running the `Q4_K_M` quant. I was reliably getting responses in 30-60 seconds after the initial prompt processing, which took almost 5 minutes. Response time depends significantly on the number of devices that have been exposed as well as how many states have changed since the last invocation, because llama.cpp caches KV values for identical prompt prefixes.
It is highly recommended to set up text-generation-webui on a separate machine that can take advantage of a GPU.

View File

@@ -18,4 +18,6 @@
[ ] RAG for getting info for setting up new devices
- set up vectordb
- ingest home assistant docs
- "context request" from above to initiate a RAG search
- "context request" from above to initiate a RAG search
[ ] make llama-cpp-python wheels for "llama-cpp-python>=0.2.24"
[ ] prime kv cache with current "state" so that requests are faster

View File

@@ -1,111 +1,51 @@
ARG BUILD_FROM=ghcr.io/hassio-addons/ubuntu-base:9.0.2
# hadolint ignore=DL3006
FROM ${BUILD_FROM}
# Environment variables
ENV \
CARGO_NET_GIT_FETCH_WITH_CLI=true \
DEBIAN_FRONTEND="noninteractive" \
HOME="/root" \
LANG="C.UTF-8" \
PIP_DISABLE_PIP_VERSION_CHECK=1 \
PIP_NO_CACHE_DIR=1 \
PIP_PREFER_BINARY=1 \
PS1="$(whoami)@$(hostname):$(pwd)$ " \
PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
S6_BEHAVIOUR_IF_STAGE2_FAILS=2 \
S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0 \
S6_CMD_WAIT_FOR_SERVICES=1 \
YARN_HTTP_TIMEOUT=1000000 \
TERM="xterm-256color"
# Set shell
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
# Install base system
# Install text-generation-webui
ARG BUILD_ARCH=amd64
ARG BASHIO_VERSION="v0.16.0"
ARG S6_OVERLAY_VERSION="3.1.5.0"
ARG TEMPIO_VERSION="2021.09.0"
ARG APP_DIR=/app
RUN \
apt-get update \
\
&& apt-get install -y --no-install-recommends \
ca-certificates=20230311ubuntu0.22.04.1 \
curl=7.81.0-1ubuntu1.14 \
jq=1.6-2.1ubuntu3 \
tzdata=2023c-0ubuntu0.22.04.2 \
xz-utils=5.2.5-2ubuntu1 \
\
&& S6_ARCH="${BUILD_ARCH}" \
&& if [ "${BUILD_ARCH}" = "i386" ]; then S6_ARCH="i686"; \
elif [ "${BUILD_ARCH}" = "amd64" ]; then S6_ARCH="x86_64"; \
elif [ "${BUILD_ARCH}" = "armv7" ]; then S6_ARCH="arm"; fi \
\
&& curl -L -s "https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-noarch.tar.xz" \
| tar -C / -Jxpf - \
\
&& curl -L -s "https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-${S6_ARCH}.tar.xz" \
| tar -C / -Jxpf - \
\
&& curl -L -s "https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-symlinks-noarch.tar.xz" \
| tar -C / -Jxpf - \
\
&& curl -L -s "https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-symlinks-arch.tar.xz" \
| tar -C / -Jxpf - \
\
&& mkdir -p /etc/fix-attrs.d \
&& mkdir -p /etc/services.d \
\
&& curl -J -L -o /tmp/bashio.tar.gz \
"https://github.com/hassio-addons/bashio/archive/${BASHIO_VERSION}.tar.gz" \
&& mkdir /tmp/bashio \
&& tar zxvf \
/tmp/bashio.tar.gz \
--strip 1 -C /tmp/bashio \
\
&& mv /tmp/bashio/lib /usr/lib/bashio \
&& ln -s /usr/lib/bashio/bashio /usr/bin/bashio \
\
&& curl -L -s -o /usr/bin/tempio \
"https://github.com/home-assistant/tempio/releases/download/${TEMPIO_VERSION}/tempio_${BUILD_ARCH}" \
&& chmod a+x /usr/bin/tempio \
ca-certificates \
curl \
git \
build-essential \
cmake \
python3.10 \
python3-dev \
python3-venv \
python3-pip \
\
&& git clone https://github.com/oobabooga/text-generation-webui.git ${APP_DIR}\
&& python3 -m pip install torch torchvision torchaudio py-cpuinfo==9.0.0 \
&& python3 -m pip install -r ${APP_DIR}/requirements_cpu_only_noavx2.txt -r ${APP_DIR}/extensions/openai/requirements.txt llama-cpp-python \
&& apt-get purge -y --auto-remove \
xz-utils \
git \
build-essential \
cmake \
python3-dev \
&& apt-get clean \
&& rm -fr \
/tmp/* \
/var/{cache,log}/* \
/var/lib/apt/lists/*
# Copy root filesystem for our image
COPY rootfs /
# Entrypoint & CMD
ENTRYPOINT [ "/init" ]
# Build arguments
ARG BUILD_DATE
ARG BUILD_REF
ARG BUILD_VERSION
ARG BUILD_REPOSITORY
# TODO: figure out what is broken with file permissions
USER root
# Labels
LABEL \
io.hass.name="oobabooga text-generation-webui for ${BUILD_ARCH}" \
@@ -113,9 +53,6 @@ LABEL \
io.hass.arch="${BUILD_ARCH}" \
io.hass.type="addon" \
io.hass.version=${BUILD_VERSION} \
io.hass.base.version=${BUILD_VERSION} \
io.hass.base.name="ubuntu" \
io.hass.base.image="hassioaddons/ubuntu-base" \
maintainer="github.com/acon96" \
org.opencontainers.image.title="oobabooga text-generation-webui for ${BUILD_ARCH}" \
org.opencontainers.image.description="Home Assistant Community Add-on: ${BUILD_ARCH} oobabooga text-generation-webui" \

View File

@@ -1,4 +1,4 @@
# text-generation-webui - Home Assistant Addon
NOTE: This is super experimental and may or may not work on a Raspberry Pi
This basically takes an existing Docker image and attempts to overlay the required files for Home Assistant to launch and recognize it as an addon.
Installs text-generation-webui into a docker container using CPU only mode (llama.cpp)

View File

@@ -1,3 +1,4 @@
---
build_from:
aarch64: ghcr.io/hassio-addons/ubuntu-base:9.0.2
amd64: ghcr.io/hassio-addons/ubuntu-base:9.0.2

View File

@@ -1,16 +1,26 @@
---
name: oobabooga-text-generation-webui
version: dev
slug: text-generation-webui
description: ""
url: ""
description: "A tool for running Large Language Models"
url: "https://github.com/oobabooga/text-generation-webui"
init: false
arch:
- amd64
- aarch64
ports:
7860/tcp: 7860 # ingress
5000/tcp: 5000 # api
ports_description:
7860/tcp: Web interface (Not required for Ingress)
5000/tcp: OpenAI compatible API Server
ingress: true
ingress_port: 7860
options: {}
schema:
log_level: list(trace|debug|info|notice|warning|error|fatal)?
models_directory: str?
map:
- media:rw
- share:rw
- addon_config:rw

View File

@@ -5,5 +5,29 @@
# ==============================================================================
bashio::log.info "Starting Text Generation Webui..."
APP_DIR="/app"
DEFAULT_MODELS_DIR="/config/models"
if bashio::config.has_value "models_directory" && ! bashio::config.is_empty "models_directory"; then
    MODELS_DIR=$(bashio::config 'models_directory')
    if ! bashio::fs.directory_exists "$MODELS_DIR"; then
        # warn with the path the user provided before falling back to the default
        bashio::log.warning "The provided models directory '$MODELS_DIR' does not exist! Defaulting to '$DEFAULT_MODELS_DIR'"
        MODELS_DIR=$DEFAULT_MODELS_DIR
        mkdir -p "$MODELS_DIR"
    else
        bashio::log.info "Using chosen storage for models: '$MODELS_DIR'"
    fi
else
    MODELS_DIR=$DEFAULT_MODELS_DIR
    mkdir -p "$MODELS_DIR"
    bashio::log.info "Using default local storage for models."
fi
# ensure we can access the folder
chmod 0777 $MODELS_DIR
export GRADIO_ROOT_PATH=$(bashio::addon.ingress_entry)
bashio::log.info "Serving app from $GRADIO_ROOT_PATH"
cd $APP_DIR
exec python3 server.py --listen --verbose --api --model-dir $MODELS_DIR

View File

@@ -8,7 +8,7 @@ from typing import Callable
import numpy.typing as npt
import numpy as np
# from llama_cpp import Llama
import requests
import re
import os
@@ -42,6 +42,7 @@ from .const import (
CONF_TEMPERATURE,
CONF_TOP_K,
CONF_TOP_P,
CONF_REQUEST_TIMEOUT,
CONF_BACKEND_TYPE,
CONF_DOWNLOADED_MODEL_FILE,
DEFAULT_MAX_TOKENS,
@@ -50,6 +51,7 @@ from .const import (
DEFAULT_TOP_K,
DEFAULT_TOP_P,
DEFAULT_BACKEND_TYPE,
DEFAULT_REQUEST_TIMEOUT,
BACKEND_TYPE_REMOTE,
DOMAIN,
)
@@ -112,6 +114,8 @@ class LLaMAAgent(conversation.AbstractConversationAgent):
if self.use_local_backend:
if not model_path:
raise Exception(f"Model was not found at '{model_path}'!")
raise NotImplementedError()
self.llm = Llama(
model_path=model_path,
@@ -200,6 +204,8 @@ class LLaMAAgent(conversation.AbstractConversationAgent):
prompt.append({"role": "assistant", "message": response})
self.history[conversation_id] = prompt
exposed_entities = list(self._async_get_exposed_entities()[0].keys())
to_say = response.strip().split("\n")[0]
pattern = re.compile(r"```homeassistant\n([\S\n]*)```")
for block in pattern.findall(response.strip()):
@@ -213,16 +219,21 @@ class LLaMAAgent(conversation.AbstractConversationAgent):
service = line.split("(")[0]
entity = line.split("(")[1][:-1]
domain, service = tuple(service.split("."))
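# e.g. "light.turn_on(light.kitchen_light)" -> domain="light", service="turn_on", entity="light.kitchen_light"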
# only acknowledge requests to exposed entities
if entity not in exposed_entities:
to_say += f"Can't find device '{entity}'"
else:
try:
await self.hass.services.async_call(
domain,
service,
service_data={ATTR_ENTITY_ID: entity},
blocking=True,
)
except Exception as err:
to_say += f"\nFailed to run: {line}"
_LOGGER.debug(f"err: {err}; {repr(err)}")
intent_response = intent.IntentResponse(language=user_input.language)
intent_response.async_set_speech(to_say)
@@ -235,8 +246,10 @@ class LLaMAAgent(conversation.AbstractConversationAgent):
generate_params["model"] = self.model_name
del generate_params["top_k"]
timeout = self.entry.options.get(CONF_REQUEST_TIMEOUT, DEFAULT_REQUEST_TIMEOUT)
result = requests.post(
f"{self.api_host}/v1/completions", json=generate_params, timeout=30
f"{self.api_host}/v1/completions", json=generate_params, timeout=timeout
)
result.raise_for_status()
except requests.RequestException as err:

View File

@@ -35,6 +35,7 @@ from .const import (
CONF_TEMPERATURE,
CONF_TOP_K,
CONF_TOP_P,
CONF_REQUEST_TIMEOUT,
CONF_BACKEND_TYPE,
CONF_BACKEND_TYPE_OPTIONS,
CONF_DOWNLOADED_MODEL_FILE,
@@ -48,6 +49,7 @@ from .const import (
DEFAULT_TEMPERATURE,
DEFAULT_TOP_K,
DEFAULT_TOP_P,
DEFAULT_REQUEST_TIMEOUT,
DEFAULT_BACKEND_TYPE,
BACKEND_TYPE_LLAMA_HF,
BACKEND_TYPE_LLAMA_EXISTING,
@@ -96,6 +98,7 @@ DEFAULT_OPTIONS = types.MappingProxyType(
CONF_TOP_K: DEFAULT_TOP_K,
CONF_TOP_P: DEFAULT_TOP_P,
CONF_TEMPERATURE: DEFAULT_TEMPERATURE,
CONF_REQUEST_TIMEOUT: DEFAULT_REQUEST_TIMEOUT,
}
)
@@ -374,18 +377,19 @@ class OptionsFlow(config_entries.OptionsFlow):
"""Manage the options."""
if user_input is not None:
return self.async_create_entry(title="LLaMA Conversation", data=user_input)
is_local_backend = self.config_entry.data[CONF_BACKEND_TYPE] != BACKEND_TYPE_REMOTE
schema = local_llama_config_option_schema(self.config_entry.options, is_local_backend)
return self.async_show_form(
step_id="init",
data_schema=vol.Schema(schema),
)
def local_llama_config_option_schema(options: MappingProxyType[str, Any], is_local_backend: bool) -> dict:
"""Return a schema for Local LLaMA completion options."""
if not options:
options = DEFAULT_OPTIONS
result = {
vol.Optional(
CONF_PROMPT,
description={"suggested_value": options[CONF_PROMPT]},
@@ -412,3 +416,12 @@ def local_llama_config_option_schema(options: MappingProxyType[str, Any]) -> dic
default=DEFAULT_TEMPERATURE,
): NumberSelector(NumberSelectorConfig(min=0, max=1, step=0.05)),
}
if not is_local_backend:
result[vol.Optional(
CONF_REQUEST_TIMEOUT,
description={"suggested_value": options[CONF_REQUEST_TIMEOUT]},
default=DEFAULT_REQUEST_TIMEOUT,
)] = int
return result

View File

@@ -16,12 +16,15 @@ CONF_TOP_P = "top_p"
DEFAULT_TOP_P = 1
CONF_TEMPERATURE = "temperature"
DEFAULT_TEMPERATURE = 0.1
CONF_REQUEST_TIMEOUT = "request_timeout"
DEFAULT_REQUEST_TIMEOUT = 90
CONF_BACKEND_TYPE = "model_backend"
BACKEND_TYPE_LLAMA_HF = "Llama.cpp (HuggingFace)"
BACKEND_TYPE_LLAMA_EXISTING = "Llama.cpp (existing model)"
BACKEND_TYPE_REMOTE = "text-generation-webui API"
DEFAULT_BACKEND_TYPE = BACKEND_TYPE_LLAMA_HF
# CONF_BACKEND_TYPE_OPTIONS = [ BACKEND_TYPE_LLAMA_HF, BACKEND_TYPE_LLAMA_EXISTING, BACKEND_TYPE_REMOTE ]
CONF_BACKEND_TYPE_OPTIONS = [ BACKEND_TYPE_REMOTE ]
CONF_DOWNLOADED_MODEL_QUANTIZATION = "downloaded_model_quantization"
CONF_DOWNLOADED_MODEL_QUANTIZATION_OPTIONS = ["Q8_0", "Q5_K_M", "Q4_K_M", "Q3_K_M"]
DEFAULT_DOWNLOADED_MODEL_QUANTIZATION = "Q5_K_M"

View File

@@ -10,7 +10,6 @@
"iot_class": "local_polling",
"requirements": [
"requests",
"huggingface-hub",
"llama-cpp-python>=0.2.24"
"huggingface-hub"
]
}

View File

@@ -0,0 +1,57 @@
{
"config": {
"error": {
"download_failed": "The download failed to complete!",
"failed_to_connect": "Failed to connect to the remote API. See the logs for more details.",
"missing_model_api": "The selected model is not provided by this API.",
"missing_model_file": "The provided file does not exist.",
"other_existing_local": "Another model is already loaded locally. Please unload it or configure a remote model.",
"unknown": "Unexpected error"
},
"progress": {
"download": "Please wait while the model is being downloaded from HuggingFace. This can take a few minutes."
},
"step": {
"local_model": {
"data": {
"downloaded_model_file": "Local file name",
"downloaded_model_quantization": "Downloaded model quantization",
"huggingface_model": "HuggingFace Model"
},
"description": "Please configure llama.cpp for the model",
"title": "Configure llama.cpp"
},
"remote_model": {
"data": {
"host": "API Hostname",
"huggingface_model": "Model Name",
"port": "API Port"
},
"description": "Provide the connection details for an instance of text-generation-webui that is hosting the model.",
"title": "Configure connection to remote API"
},
"user": {
"data": {
"download_model_from_hf": "Download model from HuggingFace",
"use_local_backend": "Use Llama.cpp"
},
"description": "Select the backend for running the model. Either Llama.cpp (locally) or text-generation-webui (remote).",
"title": "Select Backend"
}
}
},
"options": {
"step": {
"init": {
"data": {
"max_new_tokens": "Maximum tokens to return in response",
"prompt": "Prompt Template",
"temperature": "Temperature",
"top_k": "Top K",
"top_p": "Top P",
"request_timeout": "Remote Request Timeout (seconds)"
}
}
}
}
}