A Bayesian Perspective on Generalization and Stochastic Gradient Descent

ICLR 2018 · Samuel L. Smith, Quoc V. Le

We consider two questions at the heart of machine learning: how can we predict if a minimum will generalize to the test set, and why does stochastic gradient descent find minima that generalize well? Our work responds to Zhang et al. (2016), who showed deep neural networks can easily memorize randomly labeled training data, despite generalizing well on real labels of the same inputs. We show that the same phenomenon occurs in small linear models. These observations are explained by the Bayesian evidence, which penalizes sharp minima but is invariant to model parameterization. We also demonstrate that, when one holds the learning rate fixed, there is an optimum batch size which maximizes the test set accuracy. We propose that the noise introduced by small mini-batches drives the parameters towards minima whose evidence is large. Interpreting stochastic gradient descent as a stochastic differential equation, we identify the "noise scale" g = ε(N/B − 1) ≈ εN/B, where ε is the learning rate, N the training set size and B the batch size. Consequently the optimum batch size is proportional to both the learning rate and the size of the training set, B_opt ∝ εN. We verify these predictions empirically.
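
No implementation is linked here, so below is a minimal Python sketch of the noise-scale relation from the abstract, g = ε(N/B − 1) ≈ εN/B, and the resulting batch-size scaling B_opt ∝ εN. The function names and the numbers used (N = 50,000, ε = 0.1, B = 128) are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the "noise scale" stated in the abstract:
#     g = eps * (N / B - 1) ≈ eps * N / B
# where eps is the learning rate, N the training set size and B the batch size.
# Names and numbers here are illustrative, not from the authors' implementation.


def noise_scale(eps: float, N: int, B: int) -> float:
    """SGD noise scale g = eps * (N / B - 1)."""
    return eps * (N / B - 1)


def rescaled_batch_size(B: int, eps_old: float, eps_new: float) -> int:
    """Batch size that keeps g roughly fixed when the learning rate changes,
    following B_opt ∝ eps * N (N is unchanged, so B scales linearly with eps)."""
    return max(1, round(B * eps_new / eps_old))


if __name__ == "__main__":
    N = 50_000            # hypothetical training set size (e.g. CIFAR-10 scale)
    eps, B = 0.1, 128
    print(f"g = {noise_scale(eps, N, B):.1f}")             # ≈ 39.0

    # Doubling the learning rate: double the batch size to hold g fixed.
    B2 = rescaled_batch_size(B, eps_old=eps, eps_new=0.2)
    print(f"B = {B2}, g = {noise_scale(0.2, N, B2):.1f}")  # B = 256, g ≈ 38.9
```

The second print illustrates the paper's linear scaling prediction: doubling ε while doubling B leaves the noise scale, and hence the predicted test accuracy, approximately unchanged.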
