Word-Level Uncertainty Estimation for Black-Box Text Classifiers using RNNs

Estimating uncertainties of Neural Network predictions paves the way towards more reliable and trustful text classifications. However, common uncertainty estimation approaches remain as black-boxes without explaining which features have led to the uncertainty of a prediction. This hinders users from understanding the cause of unreliable model behaviour. We introduce an approach to decompose and visualize the uncertainty of text classifiers at the level of words. Our approach builds on top of Recurrent Neural Networks and Bayesian modelling in order to provide detailed explanations of uncertainties, enabling a deeper reasoning about unreliable model behaviours. We conduct a preliminary experiment to check the impact and correctness of our approach. By explaining and investigating the predictive uncertainties of a sentiment analysis task, we argue that our approach is able to provide a more profound understanding of artificial decision making.

PDF Abstract


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here