AI RESEARCH

Do Models Know Why They Changed Their Mind? Interpretability and Faithfulness of Chain-of-Thought Under Knowledge Conflict

arXiv CS.AI

ArXi:2605.27773v1 Announce Type: cross When a language model sees a document contradicting its