qwen35: use post-norm hidden state for MTP by am17an · Pull Request #24025 · ggml-org/llama.cpp

r/LocalLLaMA
Generative AI Open Source AI

Faster MTP for Qwen, submitted by /u/jacek2023