AI RESEARCH
Looped Diffusion Language Models
arXiv cs.LG
arXiv:2605.26106v1 Announce Type: new
Masked diffusion models (MDMs) have emerged as a promising alternative to autoregressive models for language modeling, yet the effective design of transformer architectures for MDMs remains underexplored. In this paper, we show that selectively looping the early-middle transformer layers significantly improves both