Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning

arXiv:2606.03965v1 Announce Type: new Large language models improve final-answer accuracy through extended chain-of-thought reasoning, but often spend tokens inefficiently and offer little inference-time control. Existing efficient reasoning methods control thinking length by shortening, early-stopping, or compressing traces, leaving how the model thinks implicit. In this paper, we propose Agentic Chain-of-Thought Steering (ACTS), which formulates reasoning steering as a Markov decision process where a controller agent adaptively steers a frozen reasoner during inference.