Interdomain Attention: Beyond Token-Level Key-Value Memory

arXiv:2605.24330v1 Announce Type: new

Transformers and deep state space models (SSMs) sit at opposite ends of a basic design choice: attention routes each query through a growing key-value (KV) cache by content-based matching, at a cost quadratic in sequence length, while deep SSMs compress context into a fixed-size recurrent state that is not directly addressed by query-key matching.
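
The contrast above can be made concrete with a toy sketch. This is only an illustration under simplifying assumptions, not the method of the paper: the names `attention_step`, `ssm_step`, and `run`, the identity key/value projections, and the scalar `decay` are all hypothetical choices made for the example. It counts how many query-key scores attention computes as its cache grows, next to a recurrent state whose size never changes.

```python
import math

def attention_step(query, keys, values):
    """Softmax attention of one query over every cached key/value pair."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

def ssm_step(state, x, decay=0.9):
    """Toy diagonal linear recurrence (assumed form): h_t = decay * h_{t-1} + x_t."""
    return [decay * h + xi for h, xi in zip(state, x)]

def run(tokens):
    """Process tokens with both mechanisms and record their memory footprint."""
    keys, values = [], []              # KV cache: one entry per token seen
    state = [0.0] * len(tokens[0])     # recurrent state: fixed size
    cache_sizes, state_sizes, score_count = [], [], 0
    for x in tokens:
        # Identity projections (an assumption): the token is its own q, k, v.
        keys.append(x)
        values.append(x)
        attention_step(x, keys, values)
        score_count += len(keys)       # one score per cached key
        state = ssm_step(state, x)     # no query-key matching involved
        cache_sizes.append(len(keys))
        state_sizes.append(len(state))
    return cache_sizes, state_sizes, score_count

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
cache_sizes, state_sizes, score_count = run(tokens)
```

For `n` tokens the cache holds `n` entries and the total number of scores is `n * (n + 1) / 2`, which is the quadratic cost; the recurrent state stays at the model dimension throughout, but nothing in `ssm_step` lets a query select a particular past token.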