Towards Benchmarking the Utility of Explanations for Model Debugging

NAACL (TrustNLP) 2021 · Maximilian Idahl, Lijun Lyu, Ujwal Gadiraju, Avishek Anand

Post-hoc explanation methods are an important class of approaches for understanding the rationale behind a trained model's decisions. But how useful are they to an end user trying to accomplish a given task? In this vision paper, we argue for a benchmark that facilitates evaluating the utility of post-hoc explanation methods. As a first step, we enumerate desirable properties that such a benchmark should possess for the task of debugging text classifiers. We also highlight that such a benchmark makes it possible to assess not only the effectiveness of explanations but also their efficiency.