AI RESEARCH

Beyond a Single Direction: Chain-of-Thought Disrupts Simple Steering of Refusal

arXiv CS.AI

arXiv:2605.26772v1

Large reasoning models (LRMs) generate chain-of-thought (CoT) traces before producing final outputs,