qwen3.6-35b-a3b-mtp running on GTX 1060 6GB

r/LocalLLaMA
AI Hardware

I have this 10-year-old Dell T5810 workstation with 32GB DDR3(?) memory, an E5-2698 v3 (16 cores, 32 threads), and a GTX 1060 6GB that was used for mining back in the old days (it paid for itself many times over). I managed to get the model running with LM Studio on Windows(!).

My settings are:

Model: unsloth qwen3.6-35B-a3b-MTP-GGUF UD Q4_K_XL
Ctx length: 131072
GPU offload: 41
CPU threadpool size: 16
Max concurrent: 4
Number of experts: 8
Number of MoE layers offloaded to CPU: 41
MTP max draft: 3
KV quantization: both Q4_0

Speeds:

Prefill (16k): about 130-150 tps
Decode (4k): about 16 tps

Very usable for chat.
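For a rough sense of what those speeds mean in wall-clock time, here is a back-of-the-envelope sketch. It assumes "16k" and "4k" mean 16384 and 4096 tokens and that the rates stay constant for the whole run, which real runs probably won't do exactly. The helper name `seconds_for` is just made up for illustration.

```python
def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Time in seconds to process `tokens` at a steady rate."""
    return tokens / tokens_per_second


PREFILL_TOKENS = 16 * 1024  # assumption: "16k" means 16384 prompt tokens
DECODE_TOKENS = 4 * 1024    # assumption: "4k" means 4096 generated tokens

# Prefill was reported as a range (130-150 tps), so compute both ends.
prefill_slow = seconds_for(PREFILL_TOKENS, 130)
prefill_fast = seconds_for(PREFILL_TOKENS, 150)

# Decode was reported as a single figure (about 16 tps).
decode_time = seconds_for(DECODE_TOKENS, 16)

print(f"prefill: {prefill_fast:.0f}-{prefill_slow:.0f} s")
print(f"decode:  {decode_time:.0f} s")
```

So under those assumptions, a 16k prompt takes roughly two minutes to ingest and a 4k reply a bit over four minutes to generate. Shorter chat turns scale down proportionally.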