Faithfulness Evaluation for Decoder-only LLM Attributions with Controlled Retained Information

arXiv:2601.03089v2

Large Language Models (LLMs) are increasingly evaluated with input attribution methods, yet comparing such explanations remains challenging. Existing soft-perturbation faithfulness metrics, such as Soft-NC and Soft-NS, can conflate attribution quality with the number of words retained during perturbation: attribution methods with larger average scores may keep more words and, therefore, obtain inflated scores.
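
The confound can be illustrated with a toy model. The sketch below assumes a simplified soft perturbation in which each word is kept independently with probability equal to its attribution score in [0, 1]; it is not the Soft-NC or Soft-NS implementation, and all names and score values are hypothetical. Two attribution vectors rank the words identically, yet the one with the larger average score retains more words in expectation.

```python
import random

def expected_retained(scores):
    """Expected number of words kept when word i is kept with probability scores[i]."""
    return sum(scores)

def mean_sampled_retained(scores, trials, rng):
    """Monte Carlo estimate of the retained-word count under Bernoulli keeping."""
    total = 0
    for _ in range(trials):
        total += sum(1 for s in scores if rng.random() < s)
    return total / trials

def ranking(scores):
    """Word indices ordered from most to least important."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical attribution scores for the same four-word input.
# Both vectors order the words identically; only the average score differs.
low_mean = [0.9, 0.2, 0.1, 0.05]
high_mean = [0.95, 0.85, 0.8, 0.75]

rng = random.Random(0)
low_est = mean_sampled_retained(low_mean, 10_000, rng)
high_est = mean_sampled_retained(high_mean, 10_000, rng)

print("same ranking:", ranking(low_mean) == ranking(high_mean))
print("expected retained (low mean): ", expected_retained(low_mean), "sampled:", low_est)
print("expected retained (high mean):", expected_retained(high_mean), "sampled:", high_est)
```

Under this assumption, a metric that rewards predictions staying close to the original output would favor the higher-mean vector simply because more of the input survives perturbation, even though both vectors carry the same ranking information.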