AI RESEARCH

NestedKV: Nested Memory Routing for Long-Context KV Cache Compression

arXiv CS.CL

arXiv:2605.26678v1 Announce Type: new Long-context language models are limited by the memory footprint of the key-value (KV) cache. Existing