Open-source load balancer and serving platform for self-hosting LLMs and VLMs at scale. An alternative to projects like llm-d and Docker Model Runner, but with fewer moving parts and simpler deployments, built around the ggml ecosystem. Runs on CPU and GPU.
Read the entire source before you build, unlike paid marketplaces that hide it behind a buy button.