A Practical PAC-Bayes Generalisation Bound for Deep Learning

29 Sep 2021 · Diego Granziol, Mingtian Zhang, Nicholas Baskerville

Under a PAC-Bayesian framework, we derive an implementation-efficient, parameterisation-invariant metric measuring the difference between the true and empirical risk. We show that, for solutions of low training loss, this metric can be approximated at the same cost as a single step of SGD. We investigate the usefulness of this metric on pathological examples where traditional Hessian-based sharpness metrics increase yet generalisation also improves, and find good experimental agreement. As a consequence of our PAC-Bayesian framework and of theoretical arguments on the effect of sub-sampling the Hessian, we include a trace-of-the-Hessian term in our structural risk. We find that this term promotes generalisation in a variety of experiments using Wide Residual Networks on the CIFAR datasets.
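The trace-of-the-Hessian term can be estimated without ever forming the Hessian explicitly. The abstract does not specify the paper's exact estimator; the sketch below is a minimal PyTorch implementation assuming the standard Hutchinson stochastic trace estimator, tr(H) ≈ E[vᵀHv] for Rademacher probes v, where each probe costs one Hessian-vector product, i.e. roughly one extra backward pass, consistent with the "same cost as a single step of SGD" claim.

```python
import torch

def hutchinson_hessian_trace(loss, params, n_samples=1):
    """Estimate tr(H) of `loss` w.r.t. `params` via Hutchinson's method."""
    # First-order gradients, with the graph kept for Hessian-vector products.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    trace_est = 0.0
    for _ in range(n_samples):
        # Rademacher probes v (entries +/-1) satisfy E[v v^T] = I.
        vs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        # Hessian-vector product Hv via one extra backward pass.
        hvps = torch.autograd.grad(grads, params, grad_outputs=vs,
                                   retain_graph=True)
        # v^T H v is an unbiased estimate of tr(H).
        trace_est += sum((v * hv).sum() for v, hv in zip(vs, hvps))
    return trace_est / n_samples

# Hypothetical usage on a single mini-batch loss:
#   loss = criterion(model(x), y)
#   tr_h = hutchinson_hessian_trace(loss, list(model.parameters()), n_samples=4)
```

With a single probe the estimate is cheap but high-variance; averaging over a few probes (or mini-batches) trades compute for variance reduction.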
