Accelerated Test-Time Scaling with Model-Free Speculative Sampling

arXiv:2506.04708v3 Announce Type: replace Language models have demonstrated remarkable capabilities in reasoning tasks through test-time scaling techniques like best-of-N sampling and tree search. However, these approaches often demand substantial computational resources, creating a critical trade-off between performance and efficiency. We