LongAttnComp: Cross-Family Context Compression for Long-Context Reasoning

arXiv:2606.01336v1 Announce Type: new As real-world applications increasingly require processing inputs of 100k+ tokens, the gap between context length and inference efficiency has become a critical bottleneck. Context compression offers a way to reduce prefill costs while preserving task accuracy. However, existing