Made a package to install llama.cpp server binaries

r/LocalLLaMA
Generative AI Open Source AI

So, just like the title says - made a package to install llama.cpp server binaries. gh: pypi: Some ctx: our app needs llama server as a local subprocess, not as a separate deployment, the app already talks to providers through an openai-compatible http api, so local inference should behave like the others + in some other cases, I don’t always need bindings, don’t need a whole frame, just want to start the server, point it at a model, and send reqs to it.