Triplet-Block Diffusion RWKV
arXiv CS.CL
arXiv:2605.25969v1
Causal Transformer language models suffer from strictly sequential decoding and a quadratic per-step attention cost. Linear-time causal models and discrete diffusion models each address one of these weaknesses, but integrating them remains inherently inconsistent: diffusion requires bidirectional attention, while causal models are unidirectional.