Learn from A Rationalist: Distilling Intermediate Interpretable Rationales

arXiv:2601.22531v2

Deep neural networks (DNNs) are now pervasive, especially in high-stakes domains, so their interpretability has received increasing attention. Rationale extraction (RE) aims to provide an interpretable-by-design framework for DNNs through a select-predict architecture, in which two neural networks are trained jointly: one performs feature selection and the other makes the prediction from the selected features.
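
To make the select-predict idea concrete, the following is a minimal sketch of the forward pass only, not the paper's method. It assumes a toy setting in which each "network" is a single logistic unit per feature; the function names (`select`, `predict`, `select_predict`) and all parameter values are hypothetical. Joint training is omitted, since a hard mask like this one generally needs a gradient estimator or a continuous relaxation to be learned end to end.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def select(x, sel_w, sel_b):
    """Selector (hypothetical toy): return a binary mask over the features.

    A 1 means the feature is kept as part of the rationale.
    """
    return [
        1 if sigmoid(w * xi + b) > 0.5 else 0
        for xi, w, b in zip(x, sel_w, sel_b)
    ]


def predict(x, mask, pred_w, pred_b):
    """Predictor (hypothetical toy): logistic regression on the masked input."""
    z = pred_b + sum(w * xi * m for xi, m, w in zip(x, mask, pred_w))
    return sigmoid(z)


def select_predict(x, params):
    """Run the selector, then let the predictor see only the selected features."""
    mask = select(x, params["sel_w"], params["sel_b"])
    return mask, predict(x, mask, params["pred_w"], params["pred_b"])


# Illustrative parameters, chosen by hand rather than learned.
params = {
    "sel_w": [2.0, 0.0, -1.0],
    "sel_b": [0.0, -1.0, 0.0],
    "pred_w": [0.5, 0.3, -0.8],
    "pred_b": 0.1,
}

mask, prob = select_predict([1.0, 4.0, 2.0], params)
print(mask, prob)
```

Because the predictor only receives the masked input, the returned mask is a faithful account of which features could have influenced the output: changing a feature that the selector dropped leaves the prediction unchanged, as long as the selector still drops it.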