Linearised Laplace Inference in Networks with Normalisation Layers and the Neural g-Prior

We show that for neural networks (NNs) with normalisation layers, e.g. batch norm, layer norm, or group norm, the Laplace model evidence does not approximate the volume of a posterior mode and is thus unsuitable for model selection. We instead propose to use the Laplace evidence of the linearised network, which is robust to the presence of these layers. We also identify heterogeneity in the scale of the Jacobian entries corresponding to different weights. We ameliorate this issue by extending the scale-invariant g-prior to NNs. We demonstrate these methods on toy regression and on image classification with a CNN.
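To make the two ingredients of the abstract concrete, below is a minimal, illustrative sketch (not the authors' code) of the linearised-Laplace log evidence for a small regression network, paired with a diagonal "neural g-prior" whose per-parameter precision is matched to the Jacobian scale, mirroring the classical g-prior's use of XᵀX. The tiny MLP, the diagonal prior approximation, and the hyperparameter `g` are assumptions chosen for illustration; in practice `params` would be trained (MAP) weights.

```python
# Minimal sketch, assuming a scalar-output regression net with Gaussian
# noise. Not the authors' implementation; prior/model choices are
# illustrative.
import jax
import jax.numpy as jnp


def model_fn(params, x):
    # Tiny one-hidden-layer MLP; params is a flat vector of 31 entries.
    W1 = params[:10].reshape(10, 1)
    b1 = params[10:20]
    W2 = params[20:30].reshape(1, 10)
    b2 = params[30]
    h = jnp.tanh(W1 @ x + b1)
    return (W2 @ h)[0] + b2


def lin_laplace_evidence(params, X, y, noise_var=0.1, g=1.0):
    # Per-example Jacobian of the network output w.r.t. the parameters,
    # evaluated at the (assumed MAP) weights: J has shape (N, P).
    J = jax.vmap(lambda x: jax.grad(model_fn)(params, x))(X)

    # Generalised Gauss-Newton curvature of the linearised model.
    ggn = (J.T @ J) / noise_var

    # "Neural g-prior" (illustrative diagonal form): per-parameter prior
    # precision diag(J^T J) / g, so parameters with large-scale Jacobian
    # entries get correspondingly tighter priors.
    prior_prec = jnp.diag(jnp.diag(J.T @ J) / g) + 1e-8 * jnp.eye(J.shape[1])

    # Posterior precision of the linearised (i.e. Bayesian linear) model.
    H = ggn + prior_prec

    preds = jax.vmap(lambda x: model_fn(params, x))(X)
    resid = y - preds
    log_lik = (-0.5 * jnp.sum(resid ** 2) / noise_var
               - 0.5 * len(y) * jnp.log(2 * jnp.pi * noise_var))
    log_prior = (-0.5 * params @ prior_prec @ params
                 + 0.5 * jnp.linalg.slogdet(prior_prec / (2 * jnp.pi))[1])
    # Laplace volume (Occam) term: -(1/2) log det(H / 2*pi).
    log_occam = -0.5 * jnp.linalg.slogdet(H / (2 * jnp.pi))[1]
    return log_lik + log_prior + log_occam


# Toy usage on synthetic 1-d regression data.
params = 0.1 * jax.random.normal(jax.random.PRNGKey(0), (31,))
X = jnp.linspace(-2.0, 2.0, 50).reshape(-1, 1)
y = jnp.sin(X[:, 0]) + 0.1 * jax.random.normal(jax.random.PRNGKey(1), (50,))
print(lin_laplace_evidence(params, X, y))
```

Because the linearised model is a Bayesian linear model in the Jacobian features, its evidence is exact given J, which is what makes it insensitive to the curvature pathologies that normalisation layers induce in the standard Laplace evidence.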
