Noise-Contrastive Variational Information Bottleneck Networks

29 Sep 2021 · Jannik Schmitt, Stefan Roth

While deep neural networks for classification have shown impressive predictive performance, e.g., in image classification, they tend to be overconfident. We start from the observation that popular methods for reducing overconfidence by regularizing the distribution of outputs or intermediate variables achieve better calibration by sacrificing the separability of correct and incorrect predictions, another important facet of uncertainty estimation. To circumvent this, we propose a novel method that builds upon the distributional alignment of the variational information bottleneck and encourages assigning lower confidence to samples from the latent prior. Our experiments show that this simultaneously improves prediction accuracy and calibration compared to a multitude of output regularization methods, without impacting the uncertainty-based separability, in multiple classification settings, including under distributional shift.
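To make the described idea concrete, the following is a minimal sketch (not the authors' released code) of a variational information bottleneck classifier with an added term that penalizes confident predictions on samples drawn from the latent prior. The module sizes, the class name `VIBClassifier`, the helper `loss_fn`, and the weights `beta_kl` and `beta_nc` are illustrative assumptions, not values or an exact objective taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VIBClassifier(nn.Module):
    """Variational information bottleneck classifier (illustrative sketch)."""

    def __init__(self, in_dim=784, latent_dim=32, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.classifier = nn.Linear(latent_dim, num_classes)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z ~ N(mu, sigma^2)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.classifier(z), mu, logvar


def loss_fn(model, x, y, beta_kl=1e-3, beta_nc=1.0):
    logits, mu, logvar = model(x)
    ce = F.cross_entropy(logits, y)

    # Usual VIB rate term: KL between the posterior N(mu, sigma^2) and the
    # standard normal latent prior N(0, I).
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=1).mean()

    # Assumed form of the prior-confidence term: classify samples drawn
    # directly from the latent prior and push their predictive distribution
    # toward the uniform distribution, i.e. discourage confident predictions
    # on prior samples.
    z_prior = torch.randn(x.size(0), mu.size(1), device=x.device)
    prior_logits = model.classifier(z_prior)
    uniform = torch.full_like(prior_logits, 1.0 / prior_logits.size(1))
    nc = F.kl_div(F.log_softmax(prior_logits, dim=1), uniform, reduction="batchmean")

    return ce + beta_kl * kl + beta_nc * nc
```

The key design choice this sketch tries to convey is that, unlike output regularizers applied to all training inputs, the low-confidence constraint here acts only on latent-prior samples, leaving the predictive distribution on real data free to separate correct from incorrect predictions.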
