AI RESEARCH
ParisKV: Fast and Drift-Robust KV-Cache Retrieval for Long-Context LLMs
arXiv CS.LG
arXiv:2602.07721v3 Announce Type: replace
KV-cache retrieval is essential for long-context LLM inference, yet existing methods struggle with distribution drift and high latency at scale. We