We show that for neural networks (NNs) with normalisation layers, i.e., batch norm, layer norm, or group norm, the Laplace model evidence does not approximate the volume of a posterior mode and is thus unsuitable for model selection.
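For orientation, the standard Laplace approximation to the log model evidence can be written as follows. This is the textbook form, not a formula quoted from the paper; the symbols $\theta_*$ (the MAP estimate), $d$ (the number of parameters) and $H$ (the Hessian of the negative log joint at $\theta_*$) are notation assumed here.

```latex
\log p(\mathcal{D}) \approx \log p(\mathcal{D} \mid \theta_*) + \log p(\theta_*)
  + \frac{d}{2}\log 2\pi - \frac{1}{2}\log\det H,
\qquad
H = -\nabla_\theta^2 \log p(\mathcal{D}, \theta)\,\big|_{\theta = \theta_*}
```

The last two terms are the log-volume of the Gaussian fitted at the mode, which is the quantity the sentence above refers to.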
Grünwald and van Ommen (2017) show that Bayesian inference for linear regression can be inconsistent under model misspecification.
In particular, we implement subnetwork linearized Laplace as a simple, scalable Bayesian deep learning method: We first obtain a MAP estimate of all weights and then infer a full-covariance Gaussian posterior over a subnetwork using the linearized Laplace approximation.
In particular, we develop a practical and scalable Bayesian deep learning method that first trains a point estimate and then infers a full-covariance Gaussian posterior approximation over a subnetwork.
For this reason, we propose predictive complexity priors: a functional prior that is defined by comparing the model's predictions to those of a reference model.
In this review, we attempt to provide such a perspective by describing flows through the lens of probabilistic modeling and inference.
Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models.
To determine whether or not inputs reside in the typical set, we propose a statistically principled, easy-to-implement test using the empirical distribution of model likelihoods.
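A minimal sketch of such a test, assuming per-example log-likelihoods from the model are already available as arrays. The function names, the bootstrap calibration of the threshold, and the parameters `alpha` and `n_boot` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def fit_typicality_threshold(train_loglik, batch_size, alpha=0.01,
                             n_boot=2000, seed=0):
    """Estimate the model entropy and a rejection threshold (hypothetical helper).

    train_loglik: 1-D array of log p(x) for held-in data.
    batch_size:   number of inputs that will be tested together.
    alpha:        assumed false-rejection rate for in-distribution batches.
    """
    rng = np.random.default_rng(seed)
    train_loglik = np.asarray(train_loglik, dtype=float)
    # Empirical entropy estimate: the negative mean log-likelihood.
    entropy_est = -train_loglik.mean()
    # Bootstrap batches of the same size to see how far a typical
    # batch average strays from the entropy estimate.
    idx = rng.integers(0, train_loglik.size, size=(n_boot, batch_size))
    boot_gaps = np.abs(-train_loglik[idx].mean(axis=1) - entropy_est)
    threshold = np.quantile(boot_gaps, 1.0 - alpha)
    return entropy_est, threshold


def is_typical(batch_loglik, entropy_est, threshold):
    """Return True if the batch's average log-likelihood is near the entropy."""
    gap = abs(-np.mean(batch_loglik) - entropy_est)
    return bool(gap <= threshold)
```

Note that the test is two-sided: a batch whose likelihood is much higher than that of the training data is flagged just like one whose likelihood is much lower.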
We propose a neural hybrid model consisting of a linear model defined on a set of features computed by a deep, invertible transformation (i.e., a normalizing flow).
A neural network deployed in the wild may be asked to make predictions for inputs that were drawn from a different distribution than that of the training data.
We propose a novel framework for understanding multiplicative noise in neural networks, considering continuous distributions as well as Bernoulli noise (i.e., dropout).
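To make the two noise families concrete, here is a small sketch of multiplicative noise applied to a layer's activations. It is not the framework proposed in the paper; the function name, the `kind` switch, and the choice to match the Gaussian variance to that of rescaled dropout are assumptions made for illustration.

```python
import numpy as np


def multiplicative_noise(h, rng, kind="bernoulli", keep_prob=0.8):
    """Multiply activations h elementwise by random noise with mean 1."""
    h = np.asarray(h, dtype=float)
    if kind == "bernoulli":
        # Dropout with rescaling: noise is 0 or 1/keep_prob, so E[noise] = 1.
        noise = rng.binomial(1, keep_prob, size=h.shape) / keep_prob
    elif kind == "gaussian":
        # Continuous alternative with the same mean and variance:
        # Var[Bernoulli(p) / p] = (1 - p) / p.
        sigma = np.sqrt((1.0 - keep_prob) / keep_prob)
        noise = rng.normal(1.0, sigma, size=h.shape)
    else:
        raise ValueError(f"unknown noise kind: {kind!r}")
    return h * noise
```

Both variants leave the expected activation unchanged, so the layer can be used without noise at test time.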
We present a personalized recommender system that uses a neural network to recommend products such as eBooks, audiobooks, mobile apps, video, and music.
Corrupting the input and hidden layers of deep neural networks (DNNs) with multiplicative noise, often drawn from the Bernoulli distribution (or 'dropout'), provides regularization that has significantly contributed to deep learning's success.