Tensor Cache: Eviction-conditioned Associative Memory for Transformers

arXiv:2605.22884v1 Announce Type: cross Autoregressive Transformer KV caches grow linearly with context length; sliding-window caching bounds memory but discards evicted tokens entirely, so relevant evidence outside the window becomes inaccessible. We