llama.cpp server: How do the -np and -c flags interact?
r/LocalLLaMA
I've been using LM Studio for a few months. I want to try Hermes agents with Qwen 3.6 MoE, so I'm switching to llama.cpp, and I don't fully understand how the number of server slots (-np) and the context size (-c) interact. The context appears to be split equally across the server slots (so each client is allowed c / np context