AI RESEARCH
NeUQI: Near-Optimal Uniform Quantization Parameter Initialization for Low-Bit LLMs
arXiv cs.LG
arXiv:2505.17595v4 Announce Type: replace
Large language models (LLMs) achieve impressive performance across domains but face significant challenges when deployed on consumer-grade GPUs or personal devices such as laptops, due to high memory consumption and inference costs. Post-