Get you some GPUs, it's not worth the hacks around lack of RAM
r/LocalLLaMA
AI Hardware
If you can, get you some GPUs. All the hacks around limited VRAM are not worth the pain and effort, even if it means getting P40s or MI50s. Get enough GPU to have everything in memory.

Qwen3.6-27B, the dense 27B model: Q8, f16 K/V cache, 128k context on 2 used 3090s. 1399 pp, 104 tg (prompt processing and token generation, in tokens/s).

submitted by /u/MotokoAGI
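For anyone wondering whether a setup like this would fit "everything in memory", here is a rough back-of-the-envelope sketch in Python. It is an estimate under stated assumptions, not the poster's method: it assumes "Q8" means a GGUF Q8_0-style quant at about 8.5 bits per weight, and the attention layer count, KV head count, and head dimension below are hypothetical placeholders, not the real Qwen3.6-27B config. Swap in the values from the model's actual config before trusting the result. It also ignores compute buffers and runtime overhead beyond a flat headroom factor.

```python
GIB = 1024 ** 3

def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8

def kv_cache_bytes(n_attn_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """K and V caches: one K and one V vector per attention layer, KV head, and token."""
    return 2 * n_attn_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

def fits(total_bytes, vram_bytes, headroom=0.9):
    """True if the estimate fits while leaving ~10% for buffers and overhead."""
    return total_bytes <= vram_bytes * headroom

# Assumption: Q8_0-style quant, ~8.5 bits per weight (int8 values plus per-block scales).
weights = weight_bytes(27e9, 8.5)

# HYPOTHETICAL architecture numbers, for illustration only.
kv = kv_cache_bytes(n_attn_layers=16, n_kv_heads=4, head_dim=256,
                    n_ctx=128 * 1024, bytes_per_elem=2)  # f16 = 2 bytes

total = weights + kv
two_3090s = 2 * 24 * GIB  # 24 GB per RTX 3090

print(f"weights:  {weights / GIB:.1f} GiB")
print(f"kv cache: {kv / GIB:.1f} GiB")
print(f"total:    {total / GIB:.1f} GiB")
print(f"fits on 2x3090: {fits(total, two_3090s)}")
print(f"fits on 1x3090: {fits(total, 24 * GIB)}")
```

The useful part is the shape of the arithmetic: weights scale with parameter count times bits per weight, and the K/V cache scales linearly with context length, so long contexts are where a second card starts to matter.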