AI RESEARCH

Looped Diffusion Language Models

arXiv CS.LG

arXiv:2605.26106v1 Announce Type: new Masked diffusion models (MDMs) have emerged as a promising alternative to autoregressive models for language modeling, yet the effective design of transformer architectures for MDMs remains underexplored. In this paper, we show that selectively looping the early-middle transformer layers significantly improves both