How Much Thinking is Enough? Quantifying and Understanding Redundancy in LLM Reasoning

arXiv:2605.23926v1 Announce Type: new Reasoning-capable large language models solve hard problems by emitting long chains of thought, paying heavily in latency, GPU time, and energy. Casual inspection of their traces reveals extensive reformulation, verification, and circular self-reflection, yet how much of this deliberation is actually necessary has never been measured at scale or explained from first principles. This paper closes both gaps.