TODO
- proper tool calling support
- fix old GGUFs to support tool calling
- home assistant component text streaming support
- move llama-cpp build to forked repo + add support for multi backend builds (no more -noavx)
- new model based on qwen3 0.6b
- new model based on gemma3 270m
- support AI task API
- move llama.cpp to a separate process so its crashes can't take down Home Assistant
- optional sampling parameters in options panel (don't pass to backend if not set)
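A minimal sketch of the "don't pass to backend if not set" behavior; the option names here are illustrative, not the component's real schema:

```python
# Only forward sampling options the user actually set, so the backend's own
# defaults apply otherwise. Option keys below are placeholder names.
def build_sampler_kwargs(options: dict) -> dict:
    kwargs = {}
    for key in ("temperature", "top_k", "top_p", "min_p", "typical_p"):
        value = options.get(key)
        if value is not None:  # unset options are omitted entirely
            kwargs[key] = value
    return kwargs

# e.g. llm.create_completion(prompt, **build_sampler_kwargs(options))
```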
- support new LLM APIs
- rewrite how services are called
- handle no API selected
- rewrite prompts + service block formats
- implement new LLM API that has HassCallService so old models can still work (compat shim sketched after this list)
- update dataset so new models will work with the API
- make ICL examples into conversation turns
- translate ICL examples + make better ones
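Rough sketch of the HassCallService compat idea above, assuming the old models emit a service-call JSON blob with `service`/`target_device` fields (those field names are a guess at the old format, not the final schema):

```python
import json

# Old models emit a service-call JSON blob; a HassCallService-style handler on
# the new LLM API just forwards it to the service registry.
async def handle_legacy_tool_call(hass, raw: str) -> None:
    call = json.loads(raw)  # e.g. {"service": "light.turn_on", "target_device": "light.kitchen"}
    domain, service = call["service"].split(".", 1)
    await hass.services.async_call(
        domain,
        service,
        call.get("service_data", {}),
        target={"entity_id": call["target_device"]},
        blocking=True,
    )
```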
- areas/room support
- convert requests to aiohttp
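The conversion is mostly mechanical; a sketch with an illustrative endpoint (in the integration the session would come from `homeassistant.helpers.aiohttp_client.async_get_clientsession(hass)` rather than being created ad hoc):

```python
import aiohttp

# Before (blocking, not allowed in the HA event loop):
#   resp = requests.post(url, json=payload, timeout=30)
#   result = resp.json()
# After (non-blocking):
async def generate(session: aiohttp.ClientSession, url: str, payload: dict) -> dict:
    async with session.post(url, json=payload, timeout=aiohttp.ClientTimeout(total=30)) as resp:
        resp.raise_for_status()
        return await resp.json()
```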
- detection/mitigation of too many entities being exposed & blowing out the context length
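One possible detection sketch; the chars/4 token estimate and the 0.75 budget split are arbitrary placeholder numbers, not measured values:

```python
# Estimate the prompt budget consumed by exposed entities and truncate with a
# warning before blowing out the context window.
def check_exposed_entities(entity_lines: list[str], n_ctx: int) -> list[str]:
    budget = int(n_ctx * 0.75)  # leave room for ICL examples + the response
    kept, used = [], 0
    for line in entity_lines:
        cost = len(line) // 4 + 1  # crude token estimate
        if used + cost > budget:
            print(f"too many exposed entities; truncating at {len(kept)}")
            break
        kept.append(line)
        used += cost
    return kept
```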
- figure out DPO to improve response quality
- set up github actions to build wheels that are optimized for RPIs
- mixtral + prompting (no fine tuning)
- add in-context learning variables to the sys prompt template
- add new options to setup process for setting prompt style + picking fine-tuned/ICL
- prime kv cache with current "state" so that requests are faster
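Sketch of the priming idea using llama-cpp-python's state save/restore; exact usage may differ by version, and `render_prompt_prefix` is a hypothetical helper standing in for the real prompt builder:

```python
from llama_cpp import Llama

# Evaluate the static prefix (system prompt + current house state) once,
# snapshot the KV cache, and restore it per request so only the new user turn
# needs a prefill pass.
llm = Llama(model_path="home-llm.q4_k_m.gguf", n_ctx=2048)

prefix = render_prompt_prefix()  # hypothetical: system prompt + entity states
llm.eval(llm.tokenize(prefix.encode("utf-8")))
cached = llm.save_state()

def answer(user_turn: str) -> str:
    llm.load_state(cached)  # skip re-evaluating the long shared prefix
    out = llm.create_completion(prefix + user_turn, max_tokens=128)
    return out["choices"][0]["text"]
```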
- ChatML format (actually need to add special tokens)
- Vicuna dataset merge (yahma/alpaca-cleaned)
- Phi-2 fine tuning
- Quantize w/ llama.cpp
- Make custom component use llama.cpp + ChatML
- Continued synthetic dataset improvements (there are a bunch of TODOs in there)
- Licenses + Attributions
- Finish Readme/docs for initial release
- Function calling as JSON
- Fine tune Phi-1.5 version
- make llama-cpp-python wheels for "llama-cpp-python>=0.2.24"
- make a proper evaluation framework to run: not just loss, it should test accuracy on the function calling
- add more remote backends
- LocalAI (openai compatible)
- Ollama
- support chat completions API (might fix Ollama + adds support for text-generation-webui characters; payload sketch below)
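Roughly what the two request shapes look like on the wire (prompt/message contents are illustrative):

```python
# Text completions endpoint (current): one locally rendered prompt string
legacy = {
    "prompt": "<|im_start|>user\nturn on the kitchen light<|im_end|>\n<|im_start|>assistant\n",
    "stop": ["<|im_end|>"],
}

# Chat completions endpoint: the server applies its own template, which is
# what would let Ollama / text-generation-webui characters work unchanged
chat = {
    "messages": [
        {"role": "system", "content": "..."},
        {"role": "user", "content": "turn on the kitchen light"},
    ],
}
# POST legacy to /v1/completions, chat to /v1/chat/completions
```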
- more config options for prompt template (allow other than chatml)
- publish snapshot of dataset on HF
- use varied system prompts to add behaviors
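Dataset-generation sketch for the varied-system-prompt idea; the personas below are made-up examples:

```python
import random

# Attach a randomly chosen persona/behavior to each training example so the
# model learns to follow the system prompt instead of memorizing one preamble.
PERSONAS = [
    "You are a terse assistant. Answer in one sentence.",
    "You are a cheerful butler named Jeeves.",
    "Respond only with the tool call, no commentary.",
]

def make_example(base_system: str, user_turn: str, assistant_turn: str) -> dict:
    return {
        "messages": [
            {"role": "system", "content": base_system + " " + random.choice(PERSONAS)},
            {"role": "user", "content": user_turn},
            {"role": "assistant", "content": assistant_turn},
        ]
    }
```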
v0.4 TODO for release:
- re-order the settings on the options config flow page; the current order is very confusing
- split out entity functionality so we can support conversation + ai tasks
- fix ICL examples to match the new tool calling syntax config
- set up docker-compose for running all of the various backends
- config sub-entry implementation
- base work
- generic openai backend
- llamacpp backend
- ollama backend
- tailored_openai backend
- generic openai responses backend
- fix and re-upload all compatible old models (+ upload all original safetensors)
- config entry migration function
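The migration function follows the standard HA hook; a sketch, with the old/new field names as illustrative placeholders for whatever actually changes in the sub-entry layout:

```python
from homeassistant.config_entries import ConfigEntry
from homeassistant.core import HomeAssistant

# Called by HA when a stored config entry has an older version than the
# integration; bump the data shape in place and report success.
async def async_migrate_entry(hass: HomeAssistant, entry: ConfigEntry) -> bool:
    if entry.version == 1:
        data = {**entry.data}
        data["backend"] = data.pop("model_backend", "llamacpp")  # assumed rename
        hass.config_entries.async_update_entry(entry, data=data, version=2)
    return True
```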
- re-write setup guide
more complicated ideas
- "context requests"
- basically just let the model decide what RAG/extra context it wants (loop sketched after this list)
- the model predicts special tokens as the first few tokens of its output
- the requested content is added to the context after the request tokens and then generation continues
- needs more complicated training b/c multi-turn + there will be some weird masking going on for training the responses properly
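A sketch of that loop; the special token names, the fetchers, and `llm.generate` are all hypothetical placeholders (the real tokens would be added to the tokenizer during fine-tuning):

```python
def fetch_weather() -> str:
    return "weather: 21C, clear"          # stand-in for a real provider call

def fetch_calendar() -> str:
    return "calendar: dentist at 15:00"   # stand-in for a real provider call

REQUEST_TOKENS = {
    "<|req_weather|>": fetch_weather,
    "<|req_calendar|>": fetch_calendar,
}

def generate_with_context_requests(llm, prompt: str) -> str:
    # first pass: the model's opening tokens may be a context request
    head = llm.generate(prompt, max_tokens=8)
    for token, fetch in REQUEST_TOKENS.items():
        if token in head:
            # splice the requested content in after the request tokens,
            # then continue generating with the enriched context
            prompt = prompt + head + "\n" + fetch() + "\n"
            return llm.generate(prompt, max_tokens=256)
    return head + llm.generate(prompt + head, max_tokens=256)
```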
- integrate with llava for checking camera feeds in home assistant
- can check still frames to describe what is there
- for remote backends that support images, could also support this
- depends on context requests because we don't want to feed camera feeds into the context every time
- RAG for getting info for setting up new devices
- set up vectordb
- ingest home assistant docs
- "context request" from above to initiate a RAG search
- train the model to respond to house events (HA is calling these AI tasks)
- present the model with an event + a "prompt" from the user describing what it should do (e.g. "turn on the lights when I get home" means the model turns on the lights when your presence entity changes to home)
- basically lets you write automations in plain english
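Sketch of handling one such instruction; `run_model` and `execute_tool_calls` are hypothetical helpers, and the instruction/entity would really come from user config rather than constants:

```python
# Listen for events, feed the event + the stored user instruction to the
# model, and execute whatever tool calls come back.
INSTRUCTION = "turn on the lights when I get home"

async def on_state_changed(event) -> None:
    data = event.data
    if data["entity_id"] != "person.owner" or data["new_state"] is None:
        return
    prompt = (
        f"Instruction: {INSTRUCTION}\n"
        f"Event: {data['entity_id']} changed to {data['new_state'].state}\n"
        "Decide which services to call, if any."
    )
    await execute_tool_calls(await run_model(prompt))

# wired up with: hass.bus.async_listen("state_changed", on_state_changed)
```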