AI RESEARCH
CompilerKV: Risk-Adaptive KV Compression via Offline Experience Compilation
arXiv cs.LG
arXiv:2602.08686v2 Announce Type: replace Prefill-only KV compression freezes a token subset at the end of prefill and decodes from it without further eviction. The retention decision is therefore irreversible, yet existing methods estimate the corrective signals it relies on (per-head reliability and prompt-level compression sensitivity) online from a single noisy prompt. We argue this is the wrong statistical unit: these signals exhibit far higher cross-prompt regularity than within-prompt signal-to-noise. We.