Local run for multi users: which software set?
r/LocalLLaMA
•
Generative AI
Open Source AI
AI Tools
Context: I am testing and running local LLM on Linux for some months, first with llama.cpp and now with vLLM for better concurrent capabilities. I use llama-swap in front of either vLLM or llama.cpp in order to have thinking and non-thinking variants exposed with all inference parameters adjusted according to the model requirements. My needs: now, I would like to make the LLM available to multiple (less than 10) users, outside from the local network: https access, web chat interface with either connection or api-key, API access with api-key.