AI RESEARCH
From AR to Diffusion: Efficiently Adapting Large Language Models with Strictly Causal and Elastic Horizons
arXiv CS.AI
arXiv:2605.27387v1 Announce Type: cross. Diffusion models promise efficient parallel text generation but rely on bidirectional attention, creating a structural mismatch with pre-trained autoregressive (AR) models. This incompatibility precludes reusing robust AR priors, necessitating prohibitive pre-