NeUQI: Near-Optimal Uniform Quantization Parameter Initialization for Low-Bit LLMs

arXiv CS.LG

arXiv:2505.17595v4 Announce Type: replace

Large language models (LLMs) achieve impressive performance across domains but are difficult to deploy on consumer-grade GPUs or personal devices such as laptops because of their high memory consumption and inference costs. Post-