Zipping the Thought: When and How Compressed Reasoning Data Works in LLM Post-Training

arXiv:2605.28008v1 Announce Type: new Large language models (LLMs) can now solve complex problems through long chain-of-thought (CoT) reasoning, but the trade-off between performance and token cost remains a central challenge. To address this issue, supervised fine-tuning (SFT) often uses compressed reasoning data, where CoT traces are shortened into compact forms. However, the effect of such compressed reasoning data on post-