Prism: Spectral-Aware Block-Sparse Attention

arXiv:2602.08426v2 Announce Type: replace-cross Block-sparse attention is promising for accelerating long-context LLM pre-filling, yet identifying relevant blocks efficiently remains a bottleneck. Existing methods typically employ coarse-grained attention as a proxy for block importance estimation, but often resort to expensive token-level searching or scoring, resulting in significant selection overhead.
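
To make the "coarse-grained attention as a proxy" idea concrete, the following is a minimal sketch of generic block selection, not the method proposed in this paper. It assumes mean pooling over each block of queries and keys, a causal mask at block granularity, and a fixed top-k budget per query block; the function names `mean_pool_blocks` and `select_blocks` and the parameters `block_size` and `top_k` are hypothetical, chosen only for illustration.

```python
import math


def mean_pool_blocks(x, block_size):
    """Average consecutive rows of x (a list of vectors) into one vector per block."""
    pooled = []
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        dim = len(block[0])
        pooled.append([sum(row[d] for row in block) / len(block) for d in range(dim)])
    return pooled


def select_blocks(q, k, block_size, top_k):
    """For each query block, return the sorted indices of the top_k key blocks.

    Scores are scaled dot products between mean-pooled query and key blocks
    (an assumed proxy for block importance). Causality is assumed: query
    block i may only select key blocks 0..i.
    """
    q_blocks = mean_pool_blocks(q, block_size)
    k_blocks = mean_pool_blocks(k, block_size)
    scale = 1.0 / math.sqrt(len(q[0]))
    selected = []
    for i, qb in enumerate(q_blocks):
        # Coarse score of query block i against each visible key block j.
        scores = [
            (scale * sum(a * b for a, b in zip(qb, k_blocks[j])), j)
            for j in range(i + 1)
        ]
        # Keep the highest-scoring key blocks; softmax is monotonic, so it
        # is not needed to rank them.
        scores.sort(key=lambda s: s[0], reverse=True)
        selected.append(sorted(j for _, j in scores[:top_k]))
    return selected


# Toy input: 4 tokens of dimension 2, grouped into 2 blocks of 2 tokens.
q = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
k = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
block_choices = select_blocks(q, k, block_size=2, top_k=1)
```

In this sketch, scoring costs one dot product per pair of blocks rather than per pair of tokens, which is what makes the coarse proxy cheap. The token-level searching or scoring that the abstract describes as expensive would instead touch individual rows of `q` and `k`.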