AI RESEARCH
Esoteric Language Models: A Family of Any-Order Diffusion LLMs
arXiv cs.LG
arXiv:2506.01928v4 Announce Type: replace-cross
Diffusion-based language models offer a compelling alternative to autoregressive (AR) models by enabling parallel and controllable generation. Within this family, Masked Diffusion Models (MDMs) currently perform best, but they still underperform AR models in perplexity and lack key inference-time efficiency features, most notably KV caching. We