Word-Level Uncertainty Estimation for Black-Box Text Classifiers using RNNs

COLING 2020 · Jakob Smedegaard Andersen, Tom Sch{\"o}ner, Walid Maalej ·

Estimating uncertainties of Neural Network predictions paves the way towards more reliable and trustful text classifications. However, common uncertainty estimation approaches remain as black-boxes without explaining which features have led to the uncertainty of a prediction. This hinders users from understanding the cause of unreliable model behaviour. We introduce an approach to decompose and visualize the uncertainty of text classifiers at the level of words. Our approach builds on top of Recurrent Neural Networks and Bayesian modelling in order to provide detailed explanations of uncertainties, enabling a deeper reasoning about unreliable model behaviours. We conduct a preliminary experiment to check the impact and correctness of our approach. By explaining and investigating the predictive uncertainties of a sentiment analysis task, we argue that our approach is able to provide a more profound understanding of artificial decision making.

PDF Abstract