AI RESEARCH

Esoteric Language Models: A Family of Any-Order Diffusion LLMs

arXiv cs.LG

arXiv:2506.01928v4 Announce Type: replace-cross Diffusion-based language models offer a compelling alternative to autoregressive (AR) models by enabling parallel and controllable generation. Within this family, Masked Diffusion Models (MDMs) currently perform best but still underperform AR models in perplexity and lack key inference-time efficiency features, most notably KV caching. We