On the Generalization of Neural Networks Trained with SGD: Information-Theoretical Bounds and Implications
Understanding the generalization behaviour of deep neural networks is a central theme of modern machine learning research. In this paper, we follow up on recent work by Neu (2021) and present new information-theoretic upper bounds on the generalization error of neural networks trained with SGD. Our bounds and accompanying experimental study provide new insights into the SGD training of neural networks. They also point to a new, simple regularization scheme which we show performs comparably to the current state of the art.