What's this sub's general opinion on quantizing the KV cache?

r/LocalLLaMA

Assume I'm talking about Qwen3.6b-27b for coding. I hear a lot about quantizing the model, but almost no opinions on quantizing the KV cache for this model.

submitted by /u/misanthrophiccunt