AI RESEARCH
Beyond a Single Direction: Chain-of-Thought Disrupts Simple Steering of Refusal
arXiv cs.AI
arXiv:2605.26772v1 Announce Type: new
Large reasoning models (LRMs) generate chain-of-thought (CoT) traces before producing final outputs,