Search Results for author: Alexander Immer

Found 22 papers, 12 papers with code

Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks Using the Marginal Likelihood

no code implementations · 25 Feb 2024 · Rayen Dhahri, Alexander Immer, Bertrand Charpentier, Stephan Günnemann, Vincent Fortuin

Neural network sparsification is a promising avenue to save computational time and memory costs, especially in an age where many successful AI models are becoming too large to naïvely deploy on consumer hardware.

Uncertainty in Graph Contrastive Learning with Bayesian Neural Networks

no code implementations · 30 Nov 2023 · Alexander Möllers, Alexander Immer, Elvin Isufi, Vincent Fortuin

Graph contrastive learning has shown great promise when labeled data is scarce, but large unlabeled datasets are available.

Contrastive Learning · Node Classification

Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures

no code implementations · NeurIPS 2023 · Runa Eschenhagen, Alexander Immer, Richard E. Turner, Frank Schneider, Philipp Hennig

In this work, we identify two different settings of linear weight-sharing layers which motivate two flavours of K-FAC -- $\textit{expand}$ and $\textit{reduce}$.
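For context, plain K-FAC approximates the curvature block of a linear layer by a Kronecker product of a small activation factor and a small gradient factor; by the mixed-product identity this is exact for a single sample. A minimal numpy sketch with made-up batch statistics (illustrative only; it shows vanilla K-FAC, not the paper's expand/reduce settings for weight sharing):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative single linear layer: activations a (layer inputs) and
# pre-activation gradients g for a batch of B samples.
B, d_in, d_out = 64, 3, 2
a = rng.normal(size=(B, d_in))
g = rng.normal(size=(B, d_out))

# Exact empirical Fisher block: average outer product of the per-sample
# weight gradients vec(g_i a_i^T) = a_i (kron) g_i.
per_sample = np.stack([np.kron(a[i], g[i]) for i in range(B)])
fisher_exact = per_sample.T @ per_sample / B

# K-FAC: factor the expectation as E[a a^T] (kron) E[g g^T].
A = a.T @ a / B
G = g.T @ g / B
fisher_kfac = np.kron(A, G)

# The two agree exactly for B = 1 and only approximately otherwise,
# since the expectation of a Kronecker product is factorized.
print(np.linalg.norm(fisher_exact - fisher_kfac))
```

Storing the factors `A` and `G` instead of the full `d_in*d_out` square matrix is what makes K-FAC scale to large layers.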

Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion

1 code implementation · 3 Oct 2023 · Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand

We answer this question in the affirmative by giving a particular construction of a multi-layer perceptron (MLP) with linear activations and batch normalization that provably has bounded gradients at any depth.

Hodge-Aware Contrastive Learning

no code implementations · 14 Sep 2023 · Alexander Möllers, Alexander Immer, Vincent Fortuin, Elvin Isufi

We leverage this decomposition to develop a contrastive self-supervised learning approach for processing simplicial data and generating embeddings that encapsulate specific spectral information. Specifically, we encode the pertinent data invariances through simplicial neural networks and devise augmentations that yield positive contrastive examples with suitable spectral properties for downstream tasks.

Contrastive Learning · Self-Supervised Learning

Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels

1 code implementation · 6 Jun 2023 · Alexander Immer, Tycho F. A. van der Ouderaa, Mark van der Wilk, Gunnar Rätsch, Bernhard Schölkopf

Recent work shows that Bayesian model selection with Laplace approximations makes it possible to optimize such hyperparameters just like standard neural network parameters, using gradients and only the training data.

Hyperparameter Optimization · Model Selection
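For linear-Gaussian models the Laplace approximation of the marginal likelihood is exact, which allows a self-contained sketch of the idea: tune a prior-precision hyperparameter by gradient ascent on the log marginal likelihood, computed from training data alone. All data, constants, and step sizes below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear regression data (illustrative).
n, d = 200, 5
Phi = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
beta = 25.0                                   # known noise precision
y = Phi @ w_true + rng.normal(scale=beta ** -0.5, size=n)

def log_evidence_and_grad(alpha):
    """Exact log marginal likelihood of Bayesian linear regression and its
    gradient w.r.t. the prior precision alpha. The MAP is re-solved for each
    alpha, so by the envelope theorem only explicit alpha terms contribute."""
    A = alpha * np.eye(d) + beta * Phi.T @ Phi   # posterior precision (Hessian at MAP)
    m = beta * np.linalg.solve(A, Phi.T @ y)     # MAP weights
    resid = y - Phi @ m
    logZ = (0.5 * d * np.log(alpha) + 0.5 * n * np.log(beta)
            - 0.5 * n * np.log(2 * np.pi)
            - 0.5 * beta * resid @ resid - 0.5 * alpha * m @ m
            - 0.5 * np.linalg.slogdet(A)[1])
    grad = 0.5 * d / alpha - 0.5 * m @ m - 0.5 * np.trace(np.linalg.inv(A))
    return logZ, grad

# Gradient ascent on log(alpha), treating the hyperparameter like an
# ordinary differentiable parameter, with no validation set in sight.
log_alpha = 0.0
for _ in range(200):
    logZ, grad = log_evidence_and_grad(np.exp(log_alpha))
    log_alpha += 0.1 * grad * np.exp(log_alpha)   # chain rule for the log-parametrization
print(np.exp(log_alpha), logZ)
```

For deep networks the exact evidence is intractable, which is where Laplace (and, in this paper, neural-tangent-kernel) approximations come in.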

Improving Neural Additive Models with Bayesian Principles

no code implementations · 26 May 2023 · Kouroche Bouchiat, Alexander Immer, Hugo Yèche, Gunnar Rätsch, Vincent Fortuin

Neural additive models (NAMs) enhance the transparency of deep neural networks by handling input features in separate additive sub-networks.

Additive Models · Bayesian Inference +1

Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization

1 code implementation · 17 Apr 2023 · Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Vincent Fortuin

The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.

Bayesian Optimization · Decision Making +2

On the Identifiability and Estimation of Causal Location-Scale Noise Models

1 code implementation · 13 Oct 2022 · Alexander Immer, Christoph Schultheiss, Julia E. Vogt, Bernhard Schölkopf, Peter Bühlmann, Alexander Marx

We study the class of location-scale or heteroscedastic noise models (LSNMs), in which the effect $Y$ can be written as a function of the cause $X$ and a noise source $N$ independent of $X$, which may be scaled by a positive function $g$ of the cause, i.e., $Y = f(X) + g(X)N$.

Causal Discovery · Causal Inference
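A minimal sketch of sampling from such a model, with illustrative choices of $f$ and $g$ that are not taken from the paper; the defining feature is that the residual noise scale varies with the cause:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (not from the paper) mechanism f and positive scale g.
def f(x):
    return np.sin(x) + 0.5 * x        # cause-effect mechanism

def g(x):
    return 0.1 + 0.5 * np.abs(x)      # positive, cause-dependent noise scale

# Sample from the LSNM: Y = f(X) + g(X) * N, with N independent of X.
x = rng.normal(size=1000)
n = rng.normal(size=1000)
y = f(x) + g(x) * n

# Heteroscedasticity: the residual spread is larger where |x| is large.
low = np.abs(x) < 0.5
high = np.abs(x) > 1.5
print(np.std(y[low] - f(x[low])), np.std(y[high] - f(x[high])))
```

This cause-dependent residual scale is exactly what breaks methods that assume additive noise with constant variance, and what the paper's identifiability analysis addresses.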

Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations

1 code implementation · 22 Feb 2022 · Alexander Immer, Tycho F. A. van der Ouderaa, Gunnar Rätsch, Vincent Fortuin, Mark van der Wilk

We develop a convenient gradient-based method for selecting the data augmentation without validation data during training of a deep neural network.

Data Augmentation · Gaussian Processes +1

Probing as Quantifying Inductive Bias

1 code implementation · ACL 2022 · Alexander Immer, Lucas Torroba Hennigen, Vincent Fortuin, Ryan Cotterell

Such performance improvements have motivated researchers to quantify and understand the linguistic information encoded in these representations.

Bayesian Inference · Inductive Bias

Pathologies in priors and inference for Bayesian transformers

no code implementations · NeurIPS Workshop ICBINB 2021 · Tristan Cinquin, Alexander Immer, Max Horn, Vincent Fortuin

In recent years, the transformer has established itself as a workhorse in many applications ranging from natural language processing to reinforcement learning.

Bayesian Inference · Variational Inference

Laplace Redux – Effortless Bayesian Deep Learning

3 code implementations · NeurIPS 2021 · Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, Philipp Hennig

Bayesian formulations of deep learning have been shown to have compelling theoretical properties and offer practical functional benefits, such as improved predictive uncertainty quantification and model selection.

Misconceptions · Model Selection +1

Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning

1 code implementation · 11 Apr 2021 · Alexander Immer, Matthias Bauer, Vincent Fortuin, Gunnar Rätsch, Mohammad Emtiyaz Khan

Marginal-likelihood-based model selection, though promising, is rarely used in deep learning due to estimation difficulties.

Image Classification · Model Selection +2

Improving predictions of Bayesian neural nets via local linearization

1 code implementation · 19 Aug 2020 · Alexander Immer, Maciej Korzepa, Matthias Bauer

The generalized Gauss-Newton (GGN) approximation is often used to make practical Bayesian deep learning approaches scalable by replacing a second-order derivative with a product of first-order derivatives.

Out-of-Distribution Detection
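A small numerical illustration of this replacement, using a toy two-parameter model (all choices below are assumptions for exposition): for squared loss the GGN is $J^\top J$, a product of first-order derivatives that is always positive semi-definite, while the full Hessian adds residual-weighted second derivatives of the model and can be indefinite.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny nonlinear model f(w, x) = w[1] * tanh(w[0] * x) with squared loss.
x = rng.normal(size=20)
y = rng.normal(size=20)

def f(w):
    return w[1] * np.tanh(w[0] * x)   # model outputs at all 20 points

def jacobian(w, eps=1e-6):
    """First-order derivatives df_i/dw_j by central differences."""
    J = np.zeros((x.size, w.size))
    for j in range(w.size):
        e = np.zeros(w.size); e[j] = eps
        J[:, j] = (f(w + e) - f(w - e)) / (2 * eps)
    return J

w = np.array([0.7, -1.3])
J = jacobian(w)

# GGN for squared loss: J^T J -- built purely from first-order derivatives.
ggn = J.T @ J

# Full Hessian of the loss, by finite differences of the gradient J^T r;
# it differs from the GGN by residual-weighted model curvature.
def grad(w):
    return jacobian(w).T @ (f(w) - y)

def hessian(w, eps=1e-5):
    H = np.zeros((w.size, w.size))
    for j in range(w.size):
        e = np.zeros(w.size); e[j] = eps
        H[:, j] = (grad(w + e) - grad(w - e)) / (2 * eps)
    return 0.5 * (H + H.T)

print(np.linalg.eigvalsh(ggn))        # always >= 0: the GGN is PSD
print(np.linalg.eigvalsh(hessian(w))) # may contain negative eigenvalues
```

Guaranteed positive semi-definiteness is what makes the GGN usable as a covariance in Laplace and Gaussian variational approximations.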

Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks

no code implementations · 21 Jul 2020 · Alexander Immer

Algorithms that combine the Gauss-Newton method with the Laplace and Gaussian variational approximation have recently led to state-of-the-art results in Bayesian deep learning.

Gaussian Processes

Continual Deep Learning by Functional Regularisation of Memorable Past

1 code implementation · NeurIPS 2020 · Pingbo Pan, Siddharth Swaroop, Alexander Immer, Runa Eschenhagen, Richard E. Turner, Mohammad Emtiyaz Khan

Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.

Variational Inference with Numerical Derivatives: variance reduction through coupling

1 code implementation · 17 Jun 2019 · Alexander Immer, Guillaume P. Dehaene

The Black Box Variational Inference algorithm (Ranganath et al., 2014) provides a universal method for variational inference, but exploiting special properties of the approximation family or of the target can improve the convergence speed significantly.

Variational Inference
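The benefit of coupling numerical derivatives can be seen in a toy sketch: estimating $\partial_\theta\,\mathbb{E}_{z \sim \mathcal{N}(\theta, 1)}[f(z)]$ by finite differences, where reusing the same random draws at both evaluation points (common random numbers, a simple stand-in for the paper's coupling idea) collapses the variance. Test function and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

f = np.tanh                 # any smooth test function
theta, h, n = 0.3, 1e-3, 2000

def fd_gradient(coupled):
    """Finite-difference estimate of d/dtheta E_{z~N(theta,1)}[f(z)],
    via the reparametrization z = theta + eps."""
    eps1 = rng.normal(size=n)
    eps2 = eps1 if coupled else rng.normal(size=n)  # couple or draw fresh noise
    return np.mean((f(theta + h + eps1) - f(theta + eps2)) / h)

coupled = [fd_gradient(True) for _ in range(100)]
uncoupled = [fd_gradient(False) for _ in range(100)]
print(np.var(coupled), np.var(uncoupled))  # coupling shrinks the variance dramatically
```

Uncoupled, the two Monte Carlo averages fluctuate independently and their difference is divided by the small step $h$, so the variance blows up like $1/h^2$; coupled, the per-sample difference is already $O(h)$ and the estimator behaves like an average of derivatives.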

Approximate Inference Turns Deep Networks into Gaussian Processes

1 code implementation · NeurIPS 2019 · Mohammad Emtiyaz Khan, Alexander Immer, Ehsan Abedi, Maciej Korzepa

Deep neural networks (DNNs) and Gaussian processes (GPs) are two powerful models with several theoretical connections between them, but the relationship between their training methods is not well understood.

Gaussian Processes
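One such connection in miniature: linearizing a network in its weights around $w_0$ and placing a Gaussian prior on the weights makes the outputs jointly Gaussian, i.e., a GP whose kernel is built from the network Jacobian. A toy numpy sketch (architecture and prior scale are illustrative assumptions, not the paper's construction):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer net; w packs both layers' weights.
x = np.linspace(-2, 2, 10)

def net(w):
    W1, W2 = w[:5], w[5:]
    return np.tanh(np.outer(x, W1)) @ W2   # outputs at the 10 inputs

def jacobian(w, eps=1e-6):
    """df_i/dw_j at the 10 inputs, by central differences."""
    J = np.zeros((x.size, w.size))
    for j in range(w.size):
        e = np.zeros(w.size); e[j] = eps
        J[:, j] = (net(w + e) - net(w - e)) / (2 * eps)
    return J

w0 = rng.normal(size=10)
J = jacobian(w0)

# Linearization f(w) ~ f(w0) + J (w - w0) with prior w ~ N(w0, s2 * I)
# gives Gaussian outputs: a GP with kernel s2 * J J^T.
s2 = 0.5
K = s2 * J @ J.T
sample = net(w0) + rng.multivariate_normal(np.zeros(x.size), K + 1e-8 * np.eye(x.size))
print(K.shape, sample.shape)
```

The Jacobian-based kernel $s^2 J J^\top$ is an empirical neural-tangent-kernel-style object; the paper develops the corresponding GP view of approximate-inference training methods.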
