AI RESEARCH

Towards Long-Horizon Interpretability: Efficient and Faithful Multi-Token Attribution for Reasoning LLMs

arXiv cs.LG

arXiv:2602.01914v2 Announce Type: replace

Token attribution methods provide intuitive explanations for language model outputs by identifying causally important input tokens.
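To make the idea of token attribution concrete, here is a minimal sketch of leave-one-out (occlusion) attribution, a generic baseline technique and not the method proposed in this paper. The names `leave_one_out_attribution` and `toy_score` are hypothetical, and `toy_score` is a stand-in for a real model's output score (for example, the log-probability of a target token).

```python
from typing import Callable, List


def leave_one_out_attribution(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
) -> List[float]:
    """Attribute a score to each input token by removing it and measuring the change.

    A positive value means the token raised the score; a negative value means it lowered it.
    """
    base = score_fn(tokens)
    return [
        base - score_fn(tokens[:i] + tokens[i + 1:])
        for i in range(len(tokens))
    ]


def toy_score(tokens: List[str]) -> float:
    # Hypothetical stand-in for a model output: a fixed weight per known word.
    weights = {"not": -2.0, "good": 1.5}
    return sum(weights.get(t, 0.0) for t in tokens)


tokens = ["the", "film", "was", "not", "good"]
attributions = leave_one_out_attribution(tokens, toy_score)

for tok, attr in zip(tokens, attributions):
    print(f"{tok:>5}: {attr:+.1f}")
```

This baseline needs one extra model evaluation per input token, so its cost grows with input length; that cost is one reason attribution over long inputs and outputs is treated as an efficiency problem.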