Deep Learning Through the Lens of Example Difficulty
1 code implementation • NeurIPS 2021 • Robert J. N. Baldock, Hartmut Maennel, Behnam Neyshabur
Existing work on understanding deep learning often employs measures that compress all data-dependent information into a few numbers.
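To make the contrast concrete, here is a minimal NumPy sketch of the point being made: an aggregate measure (a single number such as the mean loss) compresses away per-example structure that a per-example view retains. The synthetic losses below are illustrative placeholders, not data or a method from the paper.

```python
# Illustrative only: synthetic per-example losses, not the paper's data.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are per-example cross-entropy losses from a trained model:
# most examples are easy (low loss), a minority are hard (high loss).
per_example_loss = np.concatenate([
    rng.exponential(0.05, size=900),  # "easy" examples
    rng.exponential(2.0, size=100),   # "hard" examples
])

# Compressing to one number hides the easy/hard split entirely.
print("aggregate (mean) loss:", per_example_loss.mean())
print("fraction of hard examples (loss > 1):", (per_example_loss > 1).mean())
```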
What Do Neural Networks Learn When Trained With Random Labels?
no code implementations • NeurIPS 2020 • Hartmut Maennel, Ibrahim Alabdulmohsin, Ilya Tolstikhin, Robert J. N. Baldock, Olivier Bousquet, Sylvain Gelly, Daniel Keysers
We show how this alignment between the early layers' weights and the principal components of the data produces a positive transfer: networks pre-trained with random labels train faster downstream than networks trained from scratch, even after accounting for simple effects such as weight scaling.
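As a rough illustration of that comparison, the sketch below pre-trains a small MLP on randomly labelled synthetic data and then fine-tunes it on the true labels, against a freshly initialised baseline. The Gaussian-cluster data, architecture, and hyperparameters are hypothetical stand-ins, not the paper's experimental protocol, and whether the speed-up is visible at this toy scale depends on those choices.

```python
# Toy sketch: does pre-training on random labels speed up downstream training?
# Synthetic Gaussian-cluster data stands in for a real dataset; everything
# here is illustrative, not the paper's actual setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_data(n=2000, d=20, k=4):
    # k Gaussian clusters; the "true" label is the cluster identity.
    centers = torch.randn(k, d) * 3
    y = torch.randint(k, (n,))
    x = centers[y] + torch.randn(n, d)
    return x, y

def mlp(d=20, k=4):
    return nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, k))

def train(model, x, y, epochs, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    losses = []
    for _ in range(epochs):  # full-batch gradient descent, for simplicity
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses

x, y = make_data()
y_random = torch.randint(4, (len(y),))  # random labels for pre-training

pretrained = mlp()
train(pretrained, x, y_random, epochs=50)  # pre-train on random labels

scratch = mlp()  # fresh baseline
print("pre-trained :", train(pretrained, x, y, epochs=5))
print("from scratch:", train(scratch, x, y, epochs=5))
```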
Bayesian Neural Networks at Finite Temperature
1 code implementation • 8 Apr 2019 • Robert J. N. Baldock, Nicola Marzari
We recapitulate the Bayesian formulation of neural-network-based classifiers and show that, while sampling from the posterior does indeed lead to better generalisation than standard optimisation of the cost function, even better performance can in general be achieved by sampling finite-temperature ($T$) distributions derived from the posterior.
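As a minimal sketch of what sampling a finite-temperature distribution can look like, assume a tempered target $p_T(w) \propto \exp(-U(w)/T)$, where $U$ is the negative log-posterior; $T = 1$ then recovers the Bayesian posterior. The Metropolis sampler, the tiny logistic-regression "network", and all hyperparameters below are illustrative choices, not the paper's method.

```python
# Minimal sketch: Metropolis sampling from p_T(w) ∝ exp(-U(w)/T), where U is
# the negative log-posterior of a tiny logistic-regression model. T = 1 is
# the Bayesian posterior; other T give finite-temperature distributions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X @ np.array([2.0, -1.0]) > 0).astype(float)

def neg_log_posterior(w):
    logits = X @ w
    nll = np.sum(np.logaddexp(0, logits) - y * logits)  # logistic cross-entropy
    prior = 0.5 * np.sum(w ** 2)                        # standard Gaussian prior
    return nll + prior

def metropolis(T, steps=5000, step_size=0.1):
    w = np.zeros(2)
    u = neg_log_posterior(w)
    samples = []
    for _ in range(steps):
        w_new = w + step_size * rng.normal(size=2)
        u_new = neg_log_posterior(w_new)
        # Accept with probability min(1, exp(-(U_new - U)/T)): temperature
        # rescales the energy landscape, sharpening (T<1) or flattening (T>1).
        if u_new <= u or rng.random() < np.exp(-(u_new - u) / T):
            w, u = w_new, u_new
        samples.append(w.copy())
    return np.array(samples)

for T in (0.3, 1.0, 3.0):
    ws = metropolis(T)[1000:]  # drop burn-in
    print(f"T={T}: mean w = {ws.mean(axis=0)}, std = {ws.std(axis=0)}")
```

Lowering $T$ concentrates the samples near the cost minimum, while raising it spreads them out; the abstract's claim is that some $T \neq 1$ can generalise better than sampling the posterior itself.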