WaterSIC: Information-Theoretically (Near) Optimal Linear Layer Quantization

arXiv cs.LG

arXiv:2603.04956v2 Announce Type: replace

This paper considers the problem of converting a given dense linear layer to low precision. The tradeoff between compressed length and output discrepancy is analyzed from an information-theoretic (IT) perspective. It is shown that the popular GPTQ algorithm may have an arbitrarily large gap to the IT limit. To address this, a novel algorithm, termed "WaterSIC", is proposed and is shown to be within a rate gap of 0.255 bits of the IT limit, uniformly over all possible covariance matrices of input activations.
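To make the quantity being traded off concrete, here is a minimal NumPy sketch of the output discrepancy for a quantized linear layer. It assumes zero-mean input activations with covariance `Sigma` and measures the expected squared output error, tr((W - W_hat) Sigma (W - W_hat)^T). The quantizer is plain round-to-nearest, used only as a stand-in: it is neither GPTQ nor WaterSIC, and the function names, matrix sizes, and step sizes are illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np

def output_distortion(W, W_hat, Sigma):
    """Expected squared output error E||(W - W_hat) x||^2 for zero-mean x
    with covariance Sigma, i.e. trace(D @ Sigma @ D.T) with D = W - W_hat."""
    D = W - W_hat
    return float(np.trace(D @ Sigma @ D.T))

def round_to_nearest(W, step):
    """Uniform scalar quantizer (illustrative stand-in, not GPTQ or WaterSIC):
    snap every weight to the nearest multiple of `step`."""
    return step * np.round(W / step)

# Hypothetical toy layer and activation covariance.
rng = np.random.default_rng(0)
n_out, n_in = 8, 16
W = rng.standard_normal((n_out, n_in))
A = rng.standard_normal((n_in, n_in))
Sigma = A @ A.T / n_in  # symmetric positive semidefinite by construction

# A coarser grid needs fewer bits per weight but typically distorts the output more.
for step in (0.5, 0.25, 0.125):
    W_hat = round_to_nearest(W, step)
    print(f"step={step}: distortion={output_distortion(W, W_hat, Sigma):.4f}")
```

The distortion depends on the weights only through how the quantization error interacts with `Sigma`, which is why the abstract states its guarantee uniformly over input covariance matrices rather than for a single activation distribution.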