STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models

arXiv:2606.04945v1 Announce Type: new Diffusion large language models (DLLMs) have recently emerged as a promising alternative to autoregressive LLMs by generating text through iterative masked denoising with bidirectional context. However, their large model sizes and iterative denoising process