AI RESEARCH

Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity

arXiv CS.CL

arXiv:2605.22476v1 Announce Type: cross

Entity tracking requires maintaining and updating latent states for entities and attributes over long sequences. Recent task-specific attention operators can compress deep Transformer stacks into a few layers by performing multi-hop state propagation within a single layer, but their dense evaluation remains expensive. We show that in this setting, learned attention is strongly structured: most of the attention mass concentrates in local block-diagonal neighborhoods, leaving only a light cross-block residue.
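
To make the block-diagonal claim concrete, the following is a minimal sketch of how one might measure the share of attention mass that falls inside block-diagonal neighborhoods. It is not the paper's method or code. The helper names (`block_diagonal_mass`, `toy_attention`), the block size, and the synthetic attention matrix are all assumptions chosen purely for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def block_diagonal_mass(attn, block_size):
    """Fraction of total attention mass where query i and key j share a block.

    `attn` is a square list-of-lists of non-negative weights.
    Blocks are contiguous index ranges of length `block_size` (an assumption).
    """
    n = len(attn)
    in_block = 0.0
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += attn[i][j]
            if i // block_size == j // block_size:
                in_block += attn[i][j]
    return in_block / total


def toy_attention(n, block_size, bonus, seed=0):
    """Synthetic row-stochastic attention: Gaussian logits plus an in-block bonus.

    This is hypothetical data, not attention learned by any model.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        logits = [
            rng.gauss(0.0, 1.0)
            + (bonus if i // block_size == j // block_size else 0.0)
            for j in range(n)
        ]
        rows.append(softmax(logits))
    return rows


attn = toy_attention(n=32, block_size=8, bonus=4.0)
mass = block_diagonal_mass(attn, block_size=8)
print(f"in-block attention mass: {mass:.3f}")
```

When the in-block fraction is close to 1, a sparse evaluation that computes the diagonal blocks exactly and treats the cross-block remainder cheaply discards little mass, which is the intuition behind exploiting this structure for subquadratic cost.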