Made a package to install llama.cpp server binaries
r/LocalLLaMA
•
Generative AI
Open Source AI
So, just like the title says - made a package to install llama.cpp server binaries. gh: pypi: Some ctx: our app needs llama server as a local subprocess, not as a separate deployment, the app already talks to providers through an openai-compatible http api, so local inference should behave like the others + in some other cases, I don’t always need bindings, don’t need a whole frame, just want to start the server, point it at a model, and send reqs to it.