mirror of
https://github.com/ParisNeo/ollama_proxy_server.git
synced 2026-01-12 15:48:24 -05:00
Homelab Extension #1
Originally created by @strhwste on 6/27/2025
Hi ParisNeo. Thanks for creating this :)
There is a good use case for this in a homelab environment. For example, I have a small NUC running all the always-on stuff, which is of course not powerful enough for LLM inference, maybe 1-3B in the best case. But there are two powerful tower PCs which are only running sometimes (turning them off is good because idle draw is at least 200 W). Most of the time they are used for not-too-crucial stuff, but sometimes they're used for gaming or rendering, and when they are, they shouldn't start inferencing.
So what I would love to implement is:
* Maybe do a CTX check first to see if the request can be handled by the smaller model?
* If the requested model is not available, always use the largest model available? Or, even if a smaller model is available, wait for a model that is comparable to the one requested?
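To make the idea a bit more concrete, here is a rough sketch of one way the selection could work. It is not based on the existing proxy code. It assumes "CTX check" means comparing the prompt's token count against each model's context window, and the `Model` class, `pick_model` function, and model names are all hypothetical:

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Model:
    """Hypothetical description of a model on a backend that is currently online."""
    name: str
    params_b: float  # parameter count in billions
    ctx_len: int     # context window in tokens


def pick_model(requested: str, prompt_tokens: int,
               available: Sequence[Model]) -> Optional[Model]:
    """Pick a model for a request, or return None to signal "wait"."""
    # CTX check (assumed meaning): only models whose context window
    # can hold the prompt are candidates.
    fitting = [m for m in available if m.ctx_len >= prompt_tokens]

    # Prefer the requested model when it is online and the prompt fits.
    for m in fitting:
        if m.name == requested:
            return m

    # Otherwise fall back to the largest fitting model.
    # None means nothing suitable is online, so the caller would queue the request.
    return max(fitting, key=lambda m: m.params_b, default=None)
```

The "wait for a comparable model" variant would replace the last line with a size threshold, returning `None` whenever the largest fitting model is much smaller than the requested one.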
Are you interested in those enhancements? If not I will just make a fork :)