A Probabilistic Representation for Deep Learning: Delving into The Information Bottleneck Principle

NeurIPS 2021  ·  Xinjie Lan, Kenneth Barner

The Information Bottleneck (IB) principle has recently attracted great attention as a framework for explaining Deep Neural Networks (DNNs), and its key requirement is accurately estimating the mutual information between a hidden layer and the dataset. However, several unresolved limitations weaken the validity of the IB explanation for DNNs. To address these limitations and fully explain deep learning in an information-theoretic fashion, we propose a probabilistic representation for deep learning that estimates the mutual information more accurately than existing non-parametric models and also quantifies how the components of a hidden layer affect the mutual information. Leveraging this probabilistic representation, we take back-propagation training into account and derive two novel Markov chains to characterize the information flow in DNNs. We show that different hidden layers achieve different IB trade-offs depending on their architecture and position in the DNN, whereas a DNN as a whole satisfies the IB principle regardless of its architecture.
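The non-parametric estimators that the paper claims to improve upon are commonly binning-based: hidden activations are discretized and mutual information is computed from empirical bin frequencies. Below is a minimal sketch of such a baseline estimator, for illustration only; it is not the paper's proposed probabilistic representation, and the function name and bin count are arbitrary choices.

```python
import numpy as np

def mutual_information_binned(x, t, n_bins=30):
    """Binning-based estimate of I(X; T) between inputs X and hidden
    activations T. A common non-parametric baseline in the IB-for-DNNs
    literature; illustrative only, not the paper's estimator."""
    # Discretize activations into equal-width bins over their range.
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    t_binned = np.digitize(t, edges)
    # Treat each distinct binned activation row as one value of T,
    # and each distinct input row as one value of X.
    _, t_ids = np.unique(t_binned, axis=0, return_inverse=True)
    _, x_ids = np.unique(x, axis=0, return_inverse=True)

    def entropy(ids):
        # Empirical Shannon entropy (bits) of a discrete label array.
        p = np.bincount(ids) / len(ids)
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    # I(X; T) = H(X) + H(T) - H(X, T), estimated from joint labels.
    joint = x_ids * (t_ids.max() + 1) + t_ids
    _, joint_ids = np.unique(joint, return_inverse=True)
    return entropy(x_ids) + entropy(t_ids) - entropy(joint_ids)
```

Estimators of this kind are known to be sensitive to the bin count, which is one of the limitations the paper's probabilistic representation is meant to address.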
