Qwen 3.6 27B MTP speed on 3080 Ti (getting 4.5 t/s)

r/LocalLLaMA

Using LM Studio with a 3080 Ti (12 GB of VRAM) and 128 GB of DDR4. Model version: Qwen 3.6 27B MTP UD q4_k_xl. Is this my hardware limit? Is there any way to speed this up on the current hardware?

submitted by /u/yehiaserag
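
For a rough sense of whether 4.5 t/s is close to the ceiling, a back-of-envelope bandwidth estimate may help. The sketch below assumes token generation is memory-bandwidth bound and that every weight is read once per token (i.e. a dense model), with whatever does not fit in VRAM being read from system RAM. The function name and all the numbers in the example call are assumptions, not measurements: the post does not state the model file size, the DDR4 speed, or how much VRAM is left after context and overhead, so substitute your own values. It also ignores any speedup from MTP drafting and any compute limits, so treat the result as a loose upper bound at best.

```python
def estimate_decode_tps(model_gb, vram_gb, vram_overhead_gb, gpu_bw_gbs, cpu_bw_gbs):
    """Rough tokens/s ceiling for a dense model split between VRAM and system RAM.

    Hypothetical helper: assumes decode is purely memory-bandwidth bound and
    that each token requires reading all weights exactly once.
    """
    if model_gb <= 0 or gpu_bw_gbs <= 0 or cpu_bw_gbs <= 0:
        raise ValueError("model size and bandwidths must be positive")

    # VRAM left for weights after KV cache, buffers, and desktop use (assumed).
    usable_vram_gb = max(vram_gb - vram_overhead_gb, 0.0)

    # Weights that fit on the GPU, and the remainder that stays in system RAM.
    gpu_part_gb = min(model_gb, usable_vram_gb)
    cpu_part_gb = model_gb - gpu_part_gb

    # Time to stream each part once per generated token.
    seconds_per_token = gpu_part_gb / gpu_bw_gbs + cpu_part_gb / cpu_bw_gbs
    return 1.0 / seconds_per_token


if __name__ == "__main__":
    # All values below are placeholders to be replaced with your own:
    # model file size, VRAM overhead, and RAM bandwidth are assumed.
    tps = estimate_decode_tps(
        model_gb=17.0,         # assumed size of the quantized model file
        vram_gb=12.0,          # 3080 Ti VRAM, from the post
        vram_overhead_gb=2.0,  # assumed context/buffer overhead
        gpu_bw_gbs=912.0,      # 3080 Ti spec-sheet memory bandwidth
        cpu_bw_gbs=51.2,       # assumed dual-channel DDR4-3200 theoretical peak
    )
    print(f"estimated ceiling: {tps:.1f} t/s")
```

The point of the sketch is that the RAM-resident part dominates the per-token time: under these assumptions, the portion that spills out of VRAM is read at a small fraction of the GPU's bandwidth, so shrinking that portion (smaller quant, shorter context, more layers offloaded to the GPU) matters far more than anything else.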