TODO

  • add examples of 'fixing' a failed tool call to the dataset
  • add proper 'refusals' to the dataset (e.g. tool/device not available, or device already in the desired state)
  • new models based on Qwen3 0.6B, 1.7B, and 4B
  • new model based on Gemma 3 270M
  • support the AI Task API
  • vision support for remote backends
  • vision support for local backend (llama.cpp + llava)
  • move llama.cpp to a separate process so its crashes can't take down the integration
  • optional sampling parameters in options panel (don't pass to backend if not set)
  • update dataset so new models will work with the Assist API
  • make ICL examples into conversation turns
  • translate ICL examples + make better ones
  • figure out DPO to improve response quality
  • proper tool calling support
  • fix old GGUFs to support tool calling
  • home assistant component text streaming support
  • move llama-cpp build to forked repo + add support for multi backend builds (no more -noavx)
  • support new LLM APIs
    • rewrite how services are called
    • handle no API selected
    • rewrite prompts + service block formats
    • implement new LLM API that has HassCallService so old models can still work
  • areas/room support
  • convert requests to aiohttp
  • detect/mitigate too many exposed entities blowing out the context length (see the token-budget sketch after this list)
  • setup github actions to build wheels that are optimized for RPIs
  • Mixtral + prompting (no fine tuning)
    • add in context learning variables to sys prompt template
    • add new options to setup process for setting prompt style + picking fine-tuned/ICL
  • prime the KV cache with the current "state" so that requests are faster
  • ChatML format (actually need to add special tokens)
  • Vicuna dataset merge (yahma/alpaca-cleaned)
  • Phi-2 fine tuning
  • Quantize w/ llama.cpp
  • Make custom component use llama.cpp + ChatML
  • Continued synthetic dataset improvements (there are a bunch of TODOs in there)
  • Licenses + Attributions
  • Finish Readme/docs for initial release
  • Function calling as JSON (see the parsing sketch after this list)
  • Fine tune Phi-1.5 version
  • make llama-cpp-python wheels for "llama-cpp-python>=0.2.24"
  • make a proper evaluation framework; it should test function calling accuracy, not just loss
  • add more remote backends
    • LocalAI (OpenAI-compatible)
    • Ollama
    • support the chat completions API (might fix Ollama + adds support for text-gen-ui characters; see the aiohttp sketch after this list)
  • more config options for prompt template (allow other than chatml)
  • publish snapshot of dataset on HF
  • use varied system prompts to add behaviors
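
A minimal sketch of the context-length mitigation above, assuming entity states are serialized one per line; `count_tokens` is a placeholder for whatever tokenizer the backend exposes, and the line format is illustrative rather than the component's actual prompt format:

```python
from typing import Callable, Iterable


def build_entity_block(
    entities: Iterable[dict],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> str:
    """Serialize exposed entities, dropping the tail once the token budget is hit."""
    lines: list[str] = []
    used = 0
    for entity in entities:
        # Illustrative line format, not the component's real one.
        line = f"{entity['entity_id']} '{entity['name']}' = {entity['state']}"
        cost = count_tokens(line)
        if used + cost > max_tokens:
            # Detection point: this is also where a warning/repair issue
            # could be raised instead of silently truncating.
            lines.append("(remaining entities omitted to fit the context window)")
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)
```

Called with a real tokenizer this covers both detection and mitigation; a crude stand-in like `lambda s: len(s) // 4` works for experimenting.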
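
For the "function calling as JSON" item, a rough sketch of pulling a JSON tool call out of raw model output; the `{"name": ..., "arguments": ...}` schema is an assumption for illustration, not the dataset's actual format:

```python
import json
from typing import Any, Optional


def parse_tool_call(model_output: str) -> Optional[dict[str, Any]]:
    """Extract the first JSON object from the model's response and sanity-check it.

    Returns None when there is no parseable call, so the caller can fall back
    to treating the output as plain chat text.
    """
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        call = json.loads(model_output[start : end + 1])
    except json.JSONDecodeError:
        return None
    # Assumed schema: {"name": "<service>", "arguments": {...}}
    if not isinstance(call, dict) or "name" not in call:
        return None
    call.setdefault("arguments", {})
    return call
```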
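
For the aiohttp conversion and the chat completions backend, a minimal sketch of a non-blocking call against an OpenAI-compatible endpoint; the base URL, model name, and timeout are placeholders:

```python
import aiohttp


async def chat_completion(
    session: aiohttp.ClientSession,
    base_url: str,
    model: str,
    messages: list[dict[str, str]],
) -> str:
    """POST to an OpenAI-compatible chat completions endpoint and return the reply text."""
    async with session.post(
        f"{base_url}/v1/chat/completions",
        json={"model": model, "messages": messages},
        timeout=aiohttp.ClientTimeout(total=90),
    ) as resp:
        resp.raise_for_status()
        data = await resp.json()
        return data["choices"][0]["message"]["content"]


# usage (endpoint and model name are assumptions):
# async with aiohttp.ClientSession() as session:
#     reply = await chat_completion(
#         session, "http://localhost:11434", "some-model",
#         [{"role": "user", "content": "turn off the kitchen light"}],
#     )
```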

v0.4 TODO for release:

  • re-order the settings on the options config flow page; the current order is confusing
  • split out entity functionality so we can support conversation + ai tasks
  • fix ICL examples to match the new tool calling syntax config
  • set up docker-compose for running all of the various backends
  • config sub-entry implementation
    • base work
    • generic openai backend
    • llamacpp backend
    • ollama backend
    • tailored_openai backend
    • generic openai responses backend
  • fix and re-upload all compatible old models (+ upload all original safetensors)
  • config entry migration function (see the migration sketch after this list)
  • re-write setup guide
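
A sketch of the migration function above using Home Assistant's standard `async_migrate_entry` hook; the version numbers and the specific data reshaping are made-up examples, not the real sub-entry migration:

```python
from homeassistant.config_entries import ConfigEntry
from homeassistant.core import HomeAssistant


async def async_migrate_entry(hass: HomeAssistant, entry: ConfigEntry) -> bool:
    """Migrate old config entries forward; HA calls this when entry.version is stale."""
    if entry.version > 2:
        # Entry was created by a newer version of the integration; refuse to load.
        return False
    if entry.version == 1:
        new_data = {**entry.data}
        # Hypothetical reshaping: nest the flat backend choice so it can
        # map onto a config sub-entry later.
        new_data["backend"] = {"type": new_data.pop("backend_type", "llamacpp")}
        hass.config_entries.async_update_entry(entry, data=new_data, version=2)
    return True
```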