DecepChain: Inducing Deceptive Reasoning in Large Language Models

arXiv:2510.00319v2 Announce Type: replace-cross Large Language Models (LLMs) have been demonstrating strong reasoning capability through their chains of thought (CoT), which humans routinely use to judge answer quality. This reliance creates a powerful yet fragile basis for trust. In this work, we study an underexplored phenomenon: whether LLMs can generate incorrect yet coherent CoTs that look plausible and leave no obvious traces of manipulation, closely resembling the reasoning exhibited in benign scenarios. To investigate this, we.